We host a full web API, Vulmatch, that includes all objects created by cpe2stix.
A command line tool that turns NVD CPE records into STIX 2.1 Objects.
Having a standardised way to describe software becomes very useful when managing the tools you're using. That is where Common Platform Enumerations (CPEs) come in:
CPE is a structured naming scheme for information technology systems, software, and packages. Based upon the generic syntax for Uniform Resource Identifiers (URI), CPE includes a formal name format, a method for checking names against a system, and a description format for binding text and tests to a name.
We had a requirement to have an up-to-date copy of NVD CPEs in STIX 2.1 format.
The code in this repository turns CPEs into STIX 2.1 objects, and keeps them updated to match the official CPE dictionary:
- Downloads the current CPEs (that match a user's filters) from the NVD API
- Converts them to STIX 2.1 Objects
- Stores the STIX 2.1 Objects in the file store
- Creates STIX Bundles of generated objects for each update run
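As a rough illustration of the download step, the sketch below builds a request URL for the NVD CPE API 2.0 using a last-modified window and paging parameters. `build_cpe_query_url` is a hypothetical helper written for this example; it is not taken from the cpe2stix code base, and the way cpe2stix itself assembles requests may differ.

```python
from datetime import datetime, timezone
from urllib.parse import urlencode

# Public NVD CPE API 2.0 endpoint.
NVD_CPE_API = "https://services.nvd.nist.gov/rest/json/cpes/2.0"


def build_cpe_query_url(last_mod_start, last_mod_end, start_index=0, results_per_page=10000):
    """Return a query URL for CPEs modified inside the given window.

    Hypothetical helper for illustration only. Both datetimes are
    assumed to be timezone-aware so that isoformat() includes an offset.
    """
    params = {
        "lastModStartDate": last_mod_start.isoformat(),
        "lastModEndDate": last_mod_end.isoformat(),
        "startIndex": start_index,
        "resultsPerPage": results_per_page,
    }
    # urlencode percent-encodes the "+" and ":" characters in the timestamps.
    return f"{NVD_CPE_API}?{urlencode(params)}"


if __name__ == "__main__":
    start = datetime(2024, 1, 1, tzinfo=timezone.utc)
    end = datetime(2024, 1, 31, tzinfo=timezone.utc)
    print(build_cpe_query_url(start, end))
```

Paging through a large result set would then mean repeating the request with an increasing `startIndex` until all results have been fetched.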
# clone the latest code
git clone https://github.com/muchdogesec/cpe2stix
# create a venv
cd cpe2stix
python3 -m venv cpe2stix-venv
source cpe2stix-venv/bin/activate
# install requirements
pip3 install -r requirements.txt
You will also need to have redis installed on your machine. Instructions to do this are here.
If you're on Mac, like me, the easiest way to do this is;
brew install redis
cpe2stix has various settings that are defined in a .env file.
To create a template for the file:
cp .env.example .env
To see more information about how to set the variables, and what they do, read the .env.markdown file.
The script uses Redis and Celery jobs to download the data, so you must start these first.
Generally you want to run these in a separate terminal window, but still inside the cpe2stix-venv.
# navigate to the root of cpe2stix install
cd cpe2stix
# activate venv
source cpe2stix-venv/bin/activate
# restart redis
brew services restart redis
# start celery
celery -A cpe2stix.celery worker --loglevel=info --purge
If you continually run into issues, you can also use flower to monitor Celery workers for debugging. In a new terminal run;
celery -A cpe2stix.celery flower
This opens the flower application. You can also use Docker to run flower, as detailed here.
The script to get CPEs can now be executed (in the second terminal window) using;
python3 cpe2stix.py
It will also filter the data created using any values entered in the .env file on each run.
On each run, the old stix2_objects/cpe-bundle.json will be overwritten.
When the data conversion is complete you must kill the celery worker before running the script again. Failure to do so will lead to issues with the bundle IDs.
^C
worker: Hitting Ctrl+C again will terminate all running tasks!
worker: Warm shutdown (MainProcess)
Don't forget to restart the workers again, as follows;
# start celery
celery -A cpe2stix.celery worker --loglevel=info --purge
These are hardcoded and imported:
- Marking Definition: https://raw.githubusercontent.com/muchdogesec/stix4doge/main/objects/marking-definition/cpe2stix.json
- Extension Definition: https://raw.githubusercontent.com/muchdogesec/stix2extensions/refs/heads/main/extension-definitions/properties/software-cpe-properties.json
cpe2stix creates Software SCOs for CPEs as follows;
{
    "type": "software",
    "spec_version": "2.1",
    "id": "software--<GENERATED BY STIX2 LIBRARY>",
    "name": "<products.cpe.titles.title> (if multiple, where lang = en, else first result)",
    "cpe": "<products.cpe.cpeName>",
    "swid": "<products.cpe.cpeNameId>",
    "version": "<products.cpe.cpeName[version_section]>",
    "vendor": "<products.cpe.cpeName[vendor_section]>",
    "languages": [
        "<products.cpe.titles.lang>"
    ],
    "object_marking_refs": [
        "marking-definition--94868c89-83c2-464b-929b-a1a8aa3c8487",
        "<IMPORTED MARKING DEFINITION OBJECT>"
    ],
    "extensions": {
        "extension-definition--82cad0bb-0906-5885-95cc-cafe5ee0a500": {
            "extension_type": "toplevel-property-extension"
        }
    },
    "x_cpe_struct": {
        "cpe_version": "<CPE_VERSION>",
        "part": "<PART>",
        "vendor": "<VENDOR>",
        "product": "<PRODUCT>",
        "version": "<VERSION>",
        "update": "<UPDATE>",
        "edition": "<EDITION>",
        "language": "<LANGUAGE>",
        "sw_edition": "<SW_EDITION>",
        "target_sw": "<TARGET_SW>",
        "target_hw": "<TARGET_HW>",
        "other": "<OTHER>"
    }
}
Note, if the NVD API record contains the property products.cpe.deprecated then [DEPRECATED] is added to the name property.
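The mapping above can be sketched in plain Python, without the stix2 library, to show where each value comes from. This is an illustration under stated assumptions rather than the cpe2stix implementation: `parse_cpe_name`, `pick_title` and `cpe_to_software` are hypothetical names, the `deprecated` flag is treated as a boolean, and the [DEPRECATED] marker is assumed to be a prefix. The `id`, `object_marking_refs` and `extensions` properties are left out.

```python
import re

# Order of the components that follow "cpe:<version>:" in a CPE 2.3 formatted string.
CPE_FIELDS = [
    "part", "vendor", "product", "version", "update", "edition",
    "language", "sw_edition", "target_sw", "target_hw", "other",
]


def parse_cpe_name(cpe_name):
    """Split a CPE 2.3 formatted string into the x_cpe_struct fields."""
    # Split only on colons that are not escaped with a backslash.
    parts = re.split(r"(?<!\\):", cpe_name)
    if len(parts) != 13 or parts[0] != "cpe":
        raise ValueError(f"not a CPE 2.3 formatted string: {cpe_name!r}")
    return {"cpe_version": parts[1], **dict(zip(CPE_FIELDS, parts[2:]))}


def pick_title(titles):
    """Prefer the English title, otherwise fall back to the first one."""
    for title in titles:
        if title.get("lang") == "en":
            return title["title"]
    return titles[0]["title"]


def cpe_to_software(cpe):
    """Build a Software-like dict from one products[].cpe record."""
    struct = parse_cpe_name(cpe["cpeName"])
    name = pick_title(cpe["titles"])
    if cpe.get("deprecated"):
        # Assumption: the marker is prepended to the name.
        name = f"[DEPRECATED] {name}"
    return {
        "type": "software",
        "spec_version": "2.1",
        "name": name,
        "cpe": cpe["cpeName"],
        "swid": cpe["cpeNameId"],
        "version": struct["version"],
        "vendor": struct["vendor"],
        "languages": [title["lang"] for title in cpe["titles"]],
        "x_cpe_struct": struct,
    }
```

In the real tool the object is created through the stix2 library, which also generates the `id`.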
All objects will be packed into a bundle file in stix2_objects named cpe-bundle.json, which has the following structure.
{
    "type": "bundle",
    "id": "bundle--<UUIDV5 GENERATION LOGIC>",
    "objects": [
        "<ALL STIX JSON OBJECTS>"
    ]
}
To generate the id of the bundle, a UUIDv5 is generated using the namespace 5e6fc5ec-e507-52e7-8465-cf5ffc47138a and an md5 hash of all the sorted objects in the bundle.
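A minimal sketch of this id generation is shown below. The namespace comes from the text above; the sort key (each object's `id`) and the JSON serialisation that is hashed are assumptions made for illustration, so ids produced here will not necessarily match those produced by cpe2stix. `bundle_id` is a hypothetical function name.

```python
import hashlib
import json
import uuid

# Namespace stated in this README.
BUNDLE_NAMESPACE = uuid.UUID("5e6fc5ec-e507-52e7-8465-cf5ffc47138a")


def bundle_id(objects):
    """Derive a deterministic bundle id from the bundle's objects."""
    # Assumption: objects are ordered by their STIX id before hashing.
    ordered = sorted(objects, key=lambda obj: obj["id"])
    # Assumption: the md5 is taken over a canonical JSON dump of the objects.
    serialised = json.dumps(ordered, sort_keys=True).encode("utf-8")
    digest = hashlib.md5(serialised).hexdigest()
    return f"bundle--{uuid.uuid5(BUNDLE_NAMESPACE, digest)}"
```

Because the objects are sorted before hashing, the same set of objects yields the same bundle id regardless of the order in which they were generated.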
New CPEs are added weekly. Existing CPEs are also updated.
Therefore the script can be used to keep an up-to-date copy of objects.
Generally it is assumed the script will be used like so;
- on install, a user will create a backfill of all CPEs (almost 1.2 million at the time of writing, depending on the CPE_LAST_MODIFIED_EARLIEST/CPE_LAST_MODIFIED_LATEST dates used). Note, generally this job will be split into multiple parts, downloading one year of data at a time.
- said bundle(s) will be imported to some downstream tool (e.g. a threat intelligence platform)
- the user runs the script again, this time updating the CPE_LAST_MODIFIED_EARLIEST variable to match the last time the script was run (so that the updated bundle only captures new and updated objects)
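To illustrate the idea of splitting a backfill into one-year parts, here is a small sketch that turns an overall date range into calendar-year windows. `yearly_windows` is a hypothetical helper, not part of cpe2stix, and it works with plain `date` objects because the exact value format expected by the .env variables is not covered here.

```python
from datetime import date, timedelta


def yearly_windows(earliest, latest):
    """Split [earliest, latest] into consecutive windows that each stay within one calendar year."""
    windows = []
    start = earliest
    while start <= latest:
        # Each window ends on 31 December or on the overall end date, whichever comes first.
        end = min(date(start.year, 12, 31), latest)
        windows.append((start, end))
        start = end + timedelta(days=1)
    return windows


if __name__ == "__main__":
    for window_start, window_end in yearly_windows(date(2020, 3, 1), date(2022, 6, 30)):
        print(window_start.isoformat(), window_end.isoformat())
```

Each pair could then be used as the earliest and latest last-modified dates for one run of the script.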
The script will store the STIX objects created in the stix2_objects directory. All old objects will be purged with each run.
I STRONGLY recommend you use cxe2stix_helper to perform the backfill. cxe2stix_helper will handle the splitting of the bundle files into your desired time ranges.
- To generate STIX 2.1 Objects: stix2 Python Lib
- The STIX 2.1 specification: STIX 2.1 docs
- NVD CPE Overview
- NVD CVE API