Merge pull request #5896 from janekmi/cflow-tech-preview
common: introduce cflow-based call stacks analysis utilities
janekmi authored Nov 7, 2023
2 parents c3f72fb + 06395b7 commit d53cf45
Showing 14 changed files with 2,781 additions and 6 deletions.
8 changes: 4 additions & 4 deletions .github/workflows/scan_bandit.yml
@@ -5,9 +5,9 @@ on:
workflow_call:

env:
-# Set path to the pmreorder tool. At the moment pmreorder is the only
-# Python-based tool released in the PMDK.
-SCAN_DIR: src/tools/pmreorder
+# Python-based tools.
+PMREORDER: src/tools/pmreorder/*.py
+CALL_STACKS_ANALYSIS: utils/call_stacks_analysis/*.py

jobs:
bandit:
@@ -21,4 +21,4 @@ jobs:
run: sudo apt-get -y install bandit

- name: Bandit scan
-run: bandit --version && bandit -r "$SCAN_DIR"
+run: bandit --version && bandit $PMREORDER $CALL_STACKS_ANALYSIS
71 changes: 71 additions & 0 deletions utils/call_stacks_analysis/README.md
@@ -0,0 +1,71 @@
# Call-stacks analysis utilities

> XXX This document requires more details.
1. Run `stack_usage_stats.sh` to generate the `src/stats/stack-usage-$build.txt` files.
2. Collect [cflow](https://savannah.gnu.org/git/?group=cflow) data.
3. Generate all possible call stacks given the data provided.

```sh
# -u, --stack-usage-stat-file
# -f, --cflow-output-file
# -i, --config-file
./utils/call_stacks_analysis/generate_call_stacks.py \
-u src/stats/stack-usage-nondebug.txt \
-f src/libpmem/cflow.txt \
-i utils/call_stacks_analysis/libpmem/config.json
```

If successful, it produces:

- `call_stacks_all.json` with call stacks ordered by stack consumption, largest first.
- `stack_usage.json` with the data extracted from the provided `src/stats/stack-usage-nondebug.txt`, limited to a single library according to the `filter` value in `config.json`.

**Note**: If too many functions would have to be added to the white list, it might be useful to ignore functions whose stack usage is at or below a certain value. See the `-t` option to set the desired threshold.
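
One way to picture step 3 is as a depth-first walk over the call graph. The sketch below is not the code of `generate_call_stacks.py`; the call graph `CALLS`, the usage map `USAGE` and the function `generate_call_stacks()` are hypothetical, and the handling of recursion is an assumption. Only the output shape (`stack` and `size`) follows `examples/call_stack.json`.

```python
from typing import Dict, List

# Hypothetical call graph and stack usage (in bytes); not taken from PMDK.
CALLS: Dict[str, List[str]] = {
    'api_a': ['helper_1', 'helper_2'],
    'helper_1': ['leaf'],
}
USAGE: Dict[str, int] = {'api_a': 100, 'helper_1': 50, 'helper_2': 10, 'leaf': 500}


def generate_call_stacks(calls, usage, api):
    """Enumerate every path from an API call down to a function with no callees."""
    call_stacks = []

    def walk(func, path):
        path = path + [func]
        # Assumption: ignore callees already on the path so recursion cannot loop.
        callees = [c for c in calls.get(func, []) if c not in path]
        if not callees:
            size = sum(usage.get(f, 0) for f in path)
            call_stacks.append({'stack': path, 'size': size})
            return
        for callee in callees:
            walk(callee, path)

    for func in api:
        walk(func, [])
    # Largest stack consumption first, as described for call_stacks_all.json.
    call_stacks.sort(key=lambda cs: cs['size'], reverse=True)
    return call_stacks


stacks = generate_call_stacks(CALLS, USAGE, ['api_a'])
for call_stack in stacks:
    print(call_stack['size'], ' -> '.join(call_stack['stack']))
```

The number of paths grows quickly with the size of the call graph, which is presumably why the configuration offers ways to prune it.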

4. (Optional) Break down a call stack's stack consumption per function. Use the `stack_usage.json` produced in the previous step, extract a single call stack (for example from `call_stacks_all.json`) and put it into a file (named `call_stack.json` below). See the examples directory for an example.

```sh
# -s, --stack-usage-file
# -c, --call-stack
./utils/call_stacks_analysis/stack_usage.py \
-s stack_usage.json \
-c call_stack.json
```

If successful, it prints a list of functions along with their stack consumption, e.g.:

```
208 pmem_map_file
0 pmem_map_fileU
80 pmem_map_register
64 util_range_register
240 util_ddax_region_find
8224 pmem2_get_type_from_stat
0 ERR
384 out_err
0 out_error
224 out_snprintf
```
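
The breakdown amounts to looking up each function of the call stack in the stack usage data. The following is a minimal sketch, not the code of `stack_usage.py`: it assumes `stack_usage.json` can be read as a plain mapping from function name to stack usage in bytes (the real file layout is not shown here), and `break_down()` is a hypothetical helper. The numbers are the ones from the example output above.

```python
# Assumed layout: function name -> stack usage in bytes.
USAGE = {
    'pmem_map_file': 208,
    'pmem_map_fileU': 0,
    'pmem_map_register': 80,
    'util_range_register': 64,
    'util_ddax_region_find': 240,
    'pmem2_get_type_from_stat': 8224,
    'ERR': 0,
    'out_err': 384,
    'out_error': 0,
    'out_snprintf': 224,
}

# Same shape as examples/call_stack.json.
CALL_STACK = {
    'stack': [
        'pmem_map_file', 'pmem_map_fileU', 'pmem_map_register',
        'util_range_register', 'util_ddax_region_find',
        'pmem2_get_type_from_stat', 'ERR', 'out_err', 'out_error',
        'out_snprintf',
    ],
    'size': 9424,
}


def break_down(call_stack, usage):
    """Return (stack usage, function name) pairs in call order."""
    # Assumption: functions missing from the usage data count as 0.
    return [(usage.get(func, 0), func) for func in call_stack['stack']]


rows = break_down(CALL_STACK, USAGE)
for size, func in rows:
    print(f'{size}\t{func}')
total = sum(size for size, _ in rows)
```

The per-function values add up to the `size` recorded for the call stack (9424), which makes it easy to spot the dominant contributor, here `pmem2_get_type_from_stat`.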

5. (Optional) List all API calls whose call stacks contain a given function. Use the `call_stacks_all.json` produced in step 3.

```sh
# -a, --all-call-stacks-file
# -f, --function-name
./utils/call_stacks_analysis/api_callers.py \
-a call_stacks_all.json \
-f pmem2_get_type_from_stat
```

If successful, it prints a list of the API calls that meet the condition, e.g.:

```
os_part_deep_common
pmem_map_file
util_fd_get_type
util_file_device_dax_alignment
util_file_pread
util_file_pwrite
util_unlink_flock
```
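
The lookup behind this step can be tried on in-memory data. This is a sketch under assumptions: `find_api_callers()` is a hypothetical helper, and the call stacks below use made-up function names and sizes rather than real PMDK data. It relies only on the first element of `stack` being the API call.

```python
def find_api_callers(call_stacks, function_name):
    """Return the sorted, de-duplicated API calls whose stacks mention function_name."""
    return sorted({
        call_stack['stack'][0]
        for call_stack in call_stacks
        if function_name in call_stack['stack']
    })


# Made-up call stacks in the same shape as the entries of call_stacks_all.json.
CALL_STACKS = [
    {'stack': ['api_b', 'helper', 'target'], 'size': 300},
    {'stack': ['api_a', 'target'], 'size': 200},
    {'stack': ['api_a', 'helper', 'target'], 'size': 250},
    {'stack': ['api_c', 'other'], 'size': 100},
]

callers = find_api_callers(CALL_STACKS, 'target')
for api in callers:
    print(api)
```

Using a set removes duplicates when several call stacks start at the same API call, and sorting keeps the output stable between runs.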
45 changes: 45 additions & 0 deletions utils/call_stacks_analysis/api_callers.py
@@ -0,0 +1,45 @@
#!/usr/bin/env python3

# SPDX-License-Identifier: BSD-3-Clause
# Copyright 2023, Intel Corporation

import argparse
import json

from typing import List, Dict, Any

# https://peps.python.org/pep-0589/ requires Python >= 3.8
# from typing import TypedDict

# List all API calls which start call stacks containing a particular function name.

PARSER = argparse.ArgumentParser()
PARSER.add_argument('-a', '--all-call-stacks-file', required=True)
PARSER.add_argument('-f', '--function-name', required=True)

# class CallStack(TypedDict): # for Python >= 3.8
#     stack: list[str]
#     size: int
CallStack = Dict[str, Any] # for Python < 3.8

def load_all_call_stacks(all_call_stacks_file: str) -> List[CallStack]:
    with open(all_call_stacks_file, 'r') as file:
        return json.load(file)

def main():
    args = PARSER.parse_args()
    call_stacks = load_all_call_stacks(args.all_call_stacks_file)
    apis = []
    # look up all call stacks in which the function of interest is mentioned
    for call_stack in call_stacks:
        if args.function_name in call_stack['stack']:
            # collect all API calls which start these call stacks
            apis.append(call_stack['stack'][0])
    # remove duplicates
    apis = list(set(apis))
    apis.sort()
    for api in apis:
        print(api)

if __name__ == '__main__':
    main()
15 changes: 15 additions & 0 deletions utils/call_stacks_analysis/examples/call_stack.json
@@ -0,0 +1,15 @@
{
    "stack": [
        "pmem_map_file",
        "pmem_map_fileU",
        "pmem_map_register",
        "util_range_register",
        "util_ddax_region_find",
        "pmem2_get_type_from_stat",
        "ERR",
        "out_err",
        "out_error",
        "out_snprintf"
    ],
    "size": 9424
}
19 changes: 19 additions & 0 deletions utils/call_stacks_analysis/examples/config.json
@@ -0,0 +1,19 @@
{
    "__comment": "only consider stack usage for functions listed in the src/(non)debug/<library_name>/ directory",
    "filter": "library_name",
    "api": [
        "api_call_1"
    ],
    "dead_end": [
        "not_called_function_1"
    ],
    "extra_calls": {
        "caller_1": [
            "callee_1",
            "callee_2"
        ]
    },
    "white_list": [
        "irrelevant_function_1"
    ]
}
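
A configuration file of this shape can be sanity-checked before it is passed to the generator. The sketch below is only an illustration: `check_config()` and `EXPECTED_TYPES` are hypothetical, and treating every key of the example as required is an assumption, since the rules enforced by `generate_call_stacks.py` are not shown here.

```python
import json

# Key names and value types as they appear in examples/config.json.
# Assumption: all of them are required.
EXPECTED_TYPES = {
    'filter': str,
    'api': list,
    'dead_end': list,
    'extra_calls': dict,
    'white_list': list,
}


def check_config(config):
    """Return a list of human-readable problems; an empty list means none found."""
    problems = []
    for key, expected in EXPECTED_TYPES.items():
        if key not in config:
            problems.append(f'missing key: {key}')
        elif not isinstance(config[key], expected):
            problems.append(f'{key}: expected {expected.__name__}')
    return problems


CONFIG = json.loads('''
{
    "filter": "library_name",
    "api": ["api_call_1"],
    "dead_end": ["not_called_function_1"],
    "extra_calls": {"caller_1": ["callee_1", "callee_2"]},
    "white_list": ["irrelevant_function_1"]
}
''')

problems = check_config(CONFIG)
print(problems if problems else 'config looks well-formed')
```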