Merge pull request #5896 from janekmi/cflow-tech-preview

common: introduce cflow-based call stacks analysis utilities
pmem · Nov 7, 2023 · d53cf45
2 parents c3f72fb + 06395b7
commit d53cf45

Showing 14 changed files with 2,781 additions and 6 deletions.
diff --git a/.github/workflows/scan_bandit.yml b/.github/workflows/scan_bandit.yml
@@ -5,9 +5,9 @@ on:
   workflow_call:
 
 env:
-  # Set path to the pmreorder tool. At the moment pmreorder is the only
-  # Python-based tool released in the PMDK.
-  SCAN_DIR: src/tools/pmreorder
+  # Python-based tools.
+  PMREORDER: src/tools/pmreorder/*.py
+  CALL_STACKS_ANALYSIS: utils/call_stacks_analysis/*.py
 
 jobs:
   bandit:
@@ -21,4 +21,4 @@ jobs:
         run: sudo apt-get -y install bandit
 
       - name: Bandit scan
-        run: bandit --version && bandit -r "$SCAN_DIR"
+        run: bandit --version && bandit $PMREORDER $CALL_STACKS_ANALYSIS
diff --git a/utils/call_stacks_analysis/README.md b/utils/call_stacks_analysis/README.md
@@ -0,0 +1,71 @@
+# Call-stacks analysis utilities
+
+> XXX This document requires more details.
+
+1. `stack_usage_stats.sh` generates the `src/stats/stack-usage-$build.txt` files.
+2. Collect [cflow](https://savannah.gnu.org/git/?group=cflow) data.
+3. Generate all possible call stacks given the data provided.
+
+```sh
+# -u, --stack-usage-stat-file
+# -f, --cflow-output-file
+# -i, --config-file
+./utils/call_stacks_analysis/generate_call_stacks.py \
+        -u src/stats/stack-usage-nondebug.txt \
+        -f src/libpmem/cflow.txt \
+        -i utils/call_stacks_analysis/libpmem/config.json
+```
+
+If successful, it produces:
+
+- `call_stacks_all.json` with call stacks sorted in descending order of stack consumption.
+- `stack_usage.json` with the data extracted from the provided `src/stats/stack-usage-nondebug.txt` but limited to a single library according to the `config.json` filter value.
+
+**Note**: If too many functions would have to be added to the white list, it might be useful to ignore functions whose stack usage is at or below a certain threshold. Please see the `-t` option to set the desired threshold.
+
+4. (Optional) Break down a call stack's stack consumption per function. Use the `stack_usage.json` produced in the previous step, extract a single call stack, and put it into a file (named `call_stack.json` below). Please see the `examples` directory for an example.
+
+```sh
+# -s, --stack-usage-file
+# -c, --call-stack
+./utils/call_stacks_analysis/stack_usage.py \
+        -s stack_usage.json \
+        -c call_stack.json
+```
+
+If successful, it prints a list of functions along with their stack consumption, e.g.
+
+```
+208     pmem_map_file
+0       pmem_map_fileU
+80      pmem_map_register
+64      util_range_register
+240     util_ddax_region_find
+8224    pmem2_get_type_from_stat
+0       ERR
+384     out_err
+0       out_error
+224     out_snprintf
+```
+
+5. (Optional) List all API calls whose call stacks contain a given function. Use the `call_stacks_all.json` produced in step 3.
+
+```sh
+# -a, --all-call-stacks-file
+# -f, --function-name
+./utils/call_stacks_analysis/api_callers.py \
+        -a call_stacks_all.json \
+        -f pmem2_get_type_from_stat
+```
+
+If successful, it prints a list of API calls that meet the condition, e.g.
+
+```
+os_part_deep_common
+pmem_map_file
+util_fd_get_type
+util_file_device_dax_alignment
+util_file_pread
+util_file_pwrite
+util_unlink_flock
+```
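The per-function figures listed for step 4 add up to the `size` recorded in `examples/call_stack.json` (9424). A minimal sketch of that breakdown follows. It assumes `stack_usage.json` can be read as a flat mapping from function name to stack usage in bytes; the actual layout produced by `generate_call_stacks.py` is not shown in this diff, and `break_down` is a hypothetical helper, not part of `stack_usage.py`.

```python
from typing import Dict, List, Tuple

# Assumed layout: function name -> stack usage in bytes. Values are copied
# from the example output in the README; the real stack_usage.json may differ.
STACK_USAGE: Dict[str, int] = {
    'pmem_map_file': 208,
    'pmem_map_register': 80,
    'util_range_register': 64,
    'util_ddax_region_find': 240,
    'pmem2_get_type_from_stat': 8224,
    'out_err': 384,
    'out_snprintf': 224,
}

# The stack from examples/call_stack.json.
CALL_STACK: List[str] = [
    'pmem_map_file', 'pmem_map_fileU', 'pmem_map_register',
    'util_range_register', 'util_ddax_region_find',
    'pmem2_get_type_from_stat', 'ERR', 'out_err', 'out_error',
    'out_snprintf',
]

def break_down(stack: List[str], usage: Dict[str, int]) -> List[Tuple[int, str]]:
    # Assumption: functions without a recorded stack usage cost 0 bytes.
    return [(usage.get(func, 0), func) for func in stack]

breakdown = break_down(CALL_STACK, STACK_USAGE)
total = sum(size for size, _ in breakdown)

for size, func in breakdown:
    print(f'{size}\t{func}')
print(f'total: {total}')
```

A single frame, `pmem2_get_type_from_stat`, accounts for 8224 of the 9424 bytes, which is the kind of outlier this breakdown is meant to expose.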
diff --git a/utils/call_stacks_analysis/api_callers.py b/utils/call_stacks_analysis/api_callers.py
@@ -0,0 +1,45 @@
+#!/usr/bin/env python3
+
+# SPDX-License-Identifier: BSD-3-Clause
+# Copyright 2023, Intel Corporation
+
+import argparse
+import json
+
+from typing import List, Dict, Any
+
+# https://peps.python.org/pep-0589/ requires Python >= 3.8
+# from typing import TypedDict
+
+# List all API calls which start call stacks containing a particular function name.
+
+PARSER = argparse.ArgumentParser()
+PARSER.add_argument('-a', '--all-call-stacks-file', required=True)
+PARSER.add_argument('-f', '--function-name', required=True)
+
+# class CallStack(TypedDict): # for Python >= 3.8
+#     stack: list[str]
+#     size: int
+CallStack = Dict[str, Any] # for Python < 3.8
+
+def load_all_call_stacks(all_call_stacks_file: str) -> List[CallStack]:
+        with open(all_call_stacks_file, 'r') as file:
+                return json.load(file)
+
+def main():
+        args = PARSER.parse_args()
+        call_stacks = load_all_call_stacks(args.all_call_stacks_file)
+        apis = []
+        # lookup all call stacks in which the function of interest is mentioned
+        for call_stack in call_stacks:
+                if args.function_name in call_stack['stack']:
+                        # collect all API calls which start these call stacks
+                        apis.append(call_stack['stack'][0])
+        # remove duplicates
+        apis = list(set(apis))
+        apis.sort()
+        for api in apis:
+                print(api)
+
+if __name__ == '__main__':
+        main()
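To see the filtering done in `main()` on a small input, the same logic can be sketched as a pure function over in-memory data. `api_callers` as a function, the shortened stacks, their sizes, and `hypothetical_api` are all illustrative assumptions; only the `stack`/`size` record shape is taken from the script and `examples/call_stack.json`.

```python
from typing import Any, Dict, List

CallStack = Dict[str, Any]

def api_callers(call_stacks: List[CallStack], function_name: str) -> List[str]:
    # The first element of each stack is the API call that starts it.
    # A set removes duplicates; sorted() gives a deterministic order.
    return sorted({
        call_stack['stack'][0]
        for call_stack in call_stacks
        if function_name in call_stack['stack']
    })

# Illustrative data only: shortened stacks with made-up sizes.
CALL_STACKS: List[CallStack] = [
    {'stack': ['pmem_map_file', 'util_ddax_region_find',
               'pmem2_get_type_from_stat'], 'size': 3},
    {'stack': ['util_file_pread', 'pmem2_get_type_from_stat'], 'size': 2},
    {'stack': ['pmem_map_file', 'pmem2_get_type_from_stat'], 'size': 2},
    {'stack': ['hypothetical_api', 'out_err'], 'size': 1},
]

print(api_callers(CALL_STACKS, 'pmem2_get_type_from_stat'))
```

Keeping the lookup separate from argument parsing and file I/O makes it straightforward to unit-test without a `call_stacks_all.json` on disk.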
diff --git a/utils/call_stacks_analysis/examples/call_stack.json b/utils/call_stacks_analysis/examples/call_stack.json
@@ -0,0 +1,15 @@
+{
+        "stack": [
+            "pmem_map_file",
+            "pmem_map_fileU",
+            "pmem_map_register",
+            "util_range_register",
+            "util_ddax_region_find",
+            "pmem2_get_type_from_stat",
+            "ERR",
+            "out_err",
+            "out_error",
+            "out_snprintf"
+        ],
+        "size": 9424
+    }
diff --git a/utils/call_stacks_analysis/examples/config.json b/utils/call_stacks_analysis/examples/config.json
@@ -0,0 +1,19 @@
+{
+    "__comment": "only consider stack usage for functions listed in the src/(non)debug/<library_name>/ directory",
+    "filter": "library_name",
+    "api": [
+        "api_call_1"
+    ],
+    "dead_end": [
+        "not_called_function_1"
+    ],
+    "extra_calls": {
+        "caller_1": [
+            "callee_1",
+            "callee_2"
+        ]
+    },
+    "white_list": [
+       "irrelevant_function_1"
+    ]
+}
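The example configuration uses five keys besides `__comment`. A minimal sketch of loading such a file and checking that each key has the shape shown in the example follows. How `generate_call_stacks.py` actually reads or validates its configuration is not shown here, so `check_config`, `is_str_list`, and the rules they apply are assumptions derived only from the example.

```python
import json
from typing import Any, Dict, List

# Same content as examples/config.json, minus the "__comment" key.
EXAMPLE = '''
{
    "filter": "library_name",
    "api": ["api_call_1"],
    "dead_end": ["not_called_function_1"],
    "extra_calls": {"caller_1": ["callee_1", "callee_2"]},
    "white_list": ["irrelevant_function_1"]
}
'''

def is_str_list(value: Any) -> bool:
    return isinstance(value, list) and all(isinstance(v, str) for v in value)

def check_config(config: Dict[str, Any]) -> List[str]:
    """Return a list of problems; an empty list means the shape matches."""
    problems = []
    if not isinstance(config.get('filter'), str):
        problems.append('filter: expected a string')
    for key in ('api', 'dead_end', 'white_list'):
        if not is_str_list(config.get(key)):
            problems.append(f'{key}: expected a list of strings')
    extra_calls = config.get('extra_calls')
    if not isinstance(extra_calls, dict) or \
            not all(is_str_list(callees) for callees in extra_calls.values()):
        problems.append('extra_calls: expected a mapping of caller to callees')
    return problems

print(check_config(json.loads(EXAMPLE)))    # checks the example's shape
print(check_config({'filter': 'libpmem'}))  # reports the missing keys
```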