High cpu or memory usage issues
jashaik edited this page Dec 9, 2021
·
9 revisions
- Identify which service is using the most CPU or memory. Go through the nginx access logs to identify the requests that take the longest. Each request is logged with a timing breakdown like this:
status=200; req_time=346924; rdbms_time=267; rdbms_count=6; authz_time=91; authz_count=3; depsolver_time=146; depsolver_count=1
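To rank requests by time taken, the `key=value` fields in a line like the one above can be parsed and sorted. The following is a minimal sketch, assuming every value is an integer and the fields are separated as in the sample; the helper names `parse_fields` and `slowest` are hypothetical and not part of any Chef tooling.

```python
import re

# Matches "name=value" pairs such as "req_time=346924"; values are assumed numeric.
FIELD_RE = re.compile(r"(\w+)=(\d+)")

def parse_fields(line):
    """Return the numeric key=value fields of one log line as a dict."""
    return {key: int(value) for key, value in FIELD_RE.findall(line)}

def slowest(lines, top=5):
    """Return the `top` parsed lines with the largest req_time, slowest first."""
    parsed = [parse_fields(line) for line in lines]
    parsed = [p for p in parsed if "req_time" in p]
    return sorted(parsed, key=lambda p: p["req_time"], reverse=True)[:top]

sample = ("status=200; req_time=346924; rdbms_time=267; rdbms_count=6; "
          "authz_time=91; authz_count=3; depsolver_time=146; depsolver_count=1")
fields = parse_fields(sample)

# Sum of the per-component timings (rdbms, authz, depsolver), excluding req_time.
accounted = sum(v for k, v in fields.items()
                if k.endswith("_time") and k != "req_time")
print(fields["req_time"], accounted)
```

Comparing `req_time` with the sum of the component timings gives a rough idea of whether the time is spent in one of the listed components or elsewhere.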
- Count the responses by status code in the access logs using the command below.
awk '{print $9}' /var/log/opscode/nginx/access.log | sort | uniq -c | sort -rn
Sample output:
424886 200
106221 404
2 499
- The count of requests per second over the life of the log:
cat access.log | awk '{print $4}' | uniq -c
Sample output: 280 rps
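The per-second counts can also be summarized into a peak and an average. This is a minimal sketch, assuming the timestamp is the fourth whitespace-separated field (the same `$4` the awk command uses) with one-second resolution; the function names are hypothetical. Unlike `uniq -c`, it totals all lines with the same timestamp even when they are not adjacent.

```python
from collections import Counter

def requests_per_second(lines):
    """Count lines per distinct timestamp (4th whitespace-separated field)."""
    counts = Counter()
    for line in lines:
        parts = line.split()
        if len(parts) >= 4:
            counts[parts[3]] += 1
    return counts

def summarize(counts):
    """Return (peak_timestamp, peak_count, mean_count), or None if empty.

    The mean is taken over seconds that saw at least one request.
    """
    if not counts:
        return None
    peak_ts, peak = max(counts.items(), key=lambda kv: kv[1])
    return peak_ts, peak, sum(counts.values()) / len(counts)

# Made-up lines for illustration only.
demo = [
    '10.0.0.1 - - [09/Dec/2021:10:00:01 +0000]  "GET /a HTTP/1.1" 200',
    '10.0.0.2 - - [09/Dec/2021:10:00:01 +0000]  "GET /b HTTP/1.1" 404',
    '10.0.0.1 - - [09/Dec/2021:10:00:02 +0000]  "GET /a HTTP/1.1" 200',
]
print(summarize(requests_per_second(demo)))
```

For a real log, pass the open file object instead of `demo`, for example `requests_per_second(open("access.log"))`.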
- Find the originating IP address in the access logs to identify the runs
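One way to find the busiest clients is to count requests per source address. This is a minimal sketch, assuming the client IP is the first whitespace-separated field of each access log line; `top_clients` is a hypothetical helper name.

```python
from collections import Counter

def top_clients(lines, n=10):
    """Return the n most frequent first fields as (address, count) pairs."""
    counts = Counter()
    for line in lines:
        parts = line.split()
        if parts:  # skip blank lines
            counts[parts[0]] += 1
    return counts.most_common(n)

# Made-up lines for illustration only.
demo_lines = [
    '10.0.0.1 - - [09/Dec/2021:10:00:01 +0000]  "GET /a HTTP/1.1" 200',
    '10.0.0.2 - - [09/Dec/2021:10:00:01 +0000]  "GET /b HTTP/1.1" 404',
    '10.0.0.1 - - [09/Dec/2021:10:00:02 +0000]  "GET /a HTTP/1.1" 200',
]
for address, count in top_clients(demo_lines, n=2):
    print(address, count)
```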
- For example, suppose depsolver is taking most of the response time.
- Use fprof in the erchef console to find which function takes the most time. First capture the arguments of a real call with redbug:
redbug:start("chef_wm_depsolver:make_json_list", [{print_file, "/tmp/redbug.out"}, {file_size, 150}, {msgs,1}]).
- This captures one execution of make_json_list and prints the function args to the file. Next, edit the file (redbug.out) to make it a valid Erlang term: remove the function call and leave behind only the argument of interest, the single long list of cookbook versions. After that:
{ok, [Content|_]} = file:consult("/tmp/redbug.out").
% Run fprof to profile the function in question. We'll use the argument data we just captured in redbug as the input.
fprof:apply(chef_wm_depsolver, make_json_list, [Content, "https://[2600:1f1c:f24:ad01:b300:cfe6:5f15:b905]", 1], [{file, "/tmp/fprof.trace"}]).
- This handy little escript converts the trace to callgrind format:
https://github.com/isacssouza/erlgrind
- Set up a chef-server and 4 load servers in the AWS console using the AMIs below.
chef-server-load-test-03122021 (ami-0e2ad9ec5256c7b4c)
Load generator backup: load-gen-backup-03122021 (ami-0e2ad9ec5256c7b4c)
- Upgrade chef-server to the required version by following https://docs.chef.io/server/upgrades/
- Create the users and the organization on the chef-server using the commands below.
chef-server-ctl org-create test1 test1 > test1_validator.pem
chef-server-ctl user-create testuser1 test test [email protected] password > /home/ubuntu/testuser1.pem
chef-server-ctl org-user-add -a test1 testuser1
- Copy the beam files to the chef-server under this path:
/opt/opscode/embedded/service/opscode-erchef/lib/patches
- Use the mp/working branch of chef-load (https://github.com/chef/chef-load/tree/mp/working)
- Copy all the user/client keys from the chef-server to the chef-load servers so they can generate load.
Copy the .pem files to your local machine first, and then to each load server:
scp -i ~/.ssh/aws-shared-chef-infra-server.pem [email protected]:/home/ubuntu/*.pem .
scp -i ~/.ssh/aws-shared-chef-infra-server.pem *.pem [email protected]:/home/ubuntu/
scp -i ~/.ssh/aws-shared-chef-infra-server.pem *.pem [email protected]:/home/ubuntu/
scp -i ~/.ssh/aws-shared-chef-infra-server.pem *.pem [email protected]:/home/ubuntu/
scp -i ~/.ssh/aws-shared-chef-infra-server.pem *.pem [email protected]:/home/ubuntu/
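With several load servers, the per-host scp commands above can be generated rather than typed one by one. This is a minimal sketch that only builds the argument lists and executes nothing. The host names are hypothetical placeholders, and the `ubuntu` user is an assumption based on the `/home/ubuntu/` destination.

```python
import os

def scp_commands(key_path, pem_files, hosts, user="ubuntu", dest="/home/ubuntu/"):
    """Build one scp argument list per load server; nothing is executed."""
    key = os.path.expanduser(key_path)  # expand "~" since no shell is involved
    return [["scp", "-i", key, *pem_files, f"{user}@{host}:{dest}"]
            for host in hosts]

# Hypothetical load server addresses; substitute the real ones.
hosts = ["load1.example.com", "load2.example.com"]
commands = scp_commands("~/.ssh/aws-shared-chef-infra-server.pem",
                        ["testuser1.pem"], hosts)
for cmd in commands:
    print(" ".join(cmd))
```

Each list can be passed to `subprocess.run(cmd, check=True)` once the host list has been checked.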
- Update chef-load.toml with the chef server URL and other details.
log_file = "chef-load.log"
chef_server_url = "https://[2600:1f1c:f24:ad01:b300:cfe6:5f15:b905]/organizations/test1/"
client_key = "./testuser1.pem"
client_name = "testuser1"
ohai_json_file = "node.json"
chef_environment = "_default"
num_nodes = 1750
interval = 15
num_actions = 0 # For data collector, which is disabled.
node_name_prefix = "load4"
node_replacement_rate = 0
run_lists = []
download_cookbooks = "always"
download_cookbooks_scale_factor = 0.01
sleep_duration = 0
node_save_frequency = 0.8
api_get_requests = [ ]
chef_version = "13.2.20"
chef_server_creates_client_key = false
enable_reporting = false
random_data = true
liveness_agent = false
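As a rough sanity check on the configured load, the average rate of simulated chef-client runs can be estimated from `num_nodes` and `interval`. This is a back-of-the-envelope sketch, assuming `interval` is the time in minutes between runs of a single node and that all 4 load servers use the same values; confirm both against the chef-load README before relying on the numbers.

```python
def runs_per_minute(num_nodes, interval_minutes):
    """Average simulated chef-client runs started per minute by one chef-load instance."""
    if interval_minutes <= 0:
        raise ValueError("interval must be positive")
    return num_nodes / interval_minutes

# Values taken from the chef-load.toml above.
per_server = runs_per_minute(1750, 15)
# Assumption: 4 load servers, each running the same configuration.
total = per_server * 4
print(round(per_server, 1), round(total, 1))
```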
- Start the load using the command below. For more information, read the chef-load README (https://github.com/chef/chef-load#readme).
Example:
./chef-load -c chef-load.toml -i 1 -a 0 -n 10 -p load1a -R .01 start