cpe2stix

Before you begin

We host a full web API, Vulmatch, that includes all objects created by cpe2stix.

Overview

A command line tool that turns NVD CPE records into STIX 2.1 Objects.

Having a standardised way to describe software becomes very useful when managing the tools you're using. That is where Common Platform Enumerations (CPEs) come in:

CPE is a structured naming scheme for information technology systems, software, and packages. Based upon the generic syntax for Uniform Resource Identifiers (URI), CPE includes a formal name format, a method for checking names against a system, and a description format for binding text and tests to a name.

We had a requirement to have an up-to-date copy of NVD CPEs in STIX 2.1 format.

The code in this repository turns CPEs into STIX 2.1 objects, and keeps them updated to match the official CPE dictionary. It:

Downloads the current CPEs (that match a user's filters) from the NVD API
Converts them to STIX 2.1 Objects
Stores the STIX 2.1 Objects in the file store
Creates STIX Bundles of generated objects for each update run
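
As a rough illustration of the download step, the sketch below builds a request URL for the public NVD CPE API 2.0, filtered by last-modified date. The function name `build_cpe_query` and the default page size are hypothetical, and this is not necessarily how cpe2stix constructs its own requests.

```python
from datetime import datetime, timezone
from urllib.parse import urlencode

# Public NVD CPE API 2.0 endpoint.
NVD_CPE_API = "https://services.nvd.nist.gov/rest/json/cpes/2.0"


def build_cpe_query(start: datetime, end: datetime,
                    start_index: int = 0, results_per_page: int = 500) -> str:
    """Return a URL requesting CPEs last modified between start and end.

    Both datetimes are expected to be timezone-aware. The page size of
    500 is an arbitrary choice for this sketch.
    """
    params = {
        "lastModStartDate": start.isoformat(timespec="milliseconds"),
        "lastModEndDate": end.isoformat(timespec="milliseconds"),
        "startIndex": start_index,
        "resultsPerPage": results_per_page,
    }
    # urlencode percent-encodes the "+" and ":" in the timestamps.
    return f"{NVD_CPE_API}?{urlencode(params)}"


if __name__ == "__main__":
    print(build_cpe_query(
        datetime(2024, 1, 1, tzinfo=timezone.utc),
        datetime(2024, 2, 1, tzinfo=timezone.utc),
    ))
```

Paging through a large result set would mean repeating the request with an increasing `startIndex` until all results have been read.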

tl;dr

Watch the demo.

Install the script

# clone the latest code
git clone https://github.com/muchdogesec/cpe2stix
# create a venv
cd cpe2stix
python3 -m venv cpe2stix-venv
source cpe2stix-venv/bin/activate
# install requirements
pip3 install -r requirements.txt

You will also need to have Redis installed on your machine. Instructions to do this are here.

If you're on Mac, like me, the easiest way to do this is:

brew install redis

Configuration options

cpe2stix has various settings that are defined in an .env file.

To create a template for the file:

cp .env.example .env

To see more information about how to set the variables, and what they do, read the .env.markdown file.

Running the script

The script uses Redis and Celery jobs to download the data, so you must start these first.

Generally you want to run these in a separate terminal window, but still inside the cpe2stix-venv.

# navigate to the root of cpe2stix install
cd cpe2stix
# activate venv
source cpe2stix-venv/bin/activate
# restart redis
brew services restart redis
# start celery
celery -A cpe2stix.celery worker --loglevel=info --purge

If you continually run into issues, you can also use flower to monitor the Celery workers for debugging. In a new terminal run:

celery -A cpe2stix.celery flower

This starts the flower application. You can also use Docker to run flower, as detailed here.

The script to get CPEs can now be executed (in the second terminal window) using:

python3 cpe2stix.py

On each run, the data created is filtered using the values set in the .env file.

On each run, the old stix2_objects/cpe-bundle.json will be overwritten.

When the data conversion is complete you must kill the celery worker before running the script again. Failure to do so will lead to issues with the bundle IDs.

^C
worker: Hitting Ctrl+C again will terminate all running tasks!

worker: Warm shutdown (MainProcess)

Don't forget to restart the workers again, as follows:

# start celery
celery -A cpe2stix.celery worker --loglevel=info --purge

Mapping information

Marking Definition / Extension Definition

These are hardcoded and imported:

Marking Definition: https://raw.githubusercontent.com/muchdogesec/stix4doge/main/objects/marking-definition/cpe2stix.json
Extension Definition: https://raw.githubusercontent.com/muchdogesec/stix2extensions/refs/heads/main/extension-definitions/properties/software-cpe-properties.json

Software

cpe2stix creates Software SCOs for CPEs as follows:

{
    "type": "software",
    "spec_version": "2.1",
    "id": "software--<GENERATED BY STIX2 LIBRARY>",
    "name": "<products.cpe.titles.title> (if multiple, where lang = en, else first result)",
    "cpe": "<products.cpe.cpeName>",
    "swid": "<products.cpe.cpeNameId>",
    "version": "<products.cpe.cpeName[version_section]>",
    "vendor": "<products.cpe.cpeName[vendor_section]>",
    "languages": [
        "<products.cpe.titles.lang>"
    ],
    "object_marking_refs": [
        "marking-definition--94868c89-83c2-464b-929b-a1a8aa3c8487",
        "<IMPORTED MARKING DEFINITION OBJECT>"
    ],
    "extensions": {
        "extension-definition--82cad0bb-0906-5885-95cc-cafe5ee0a500": {
            "extension_type": "toplevel-property-extension"
        }
    },
    "x_cpe_struct": {
        "cpe_version": "<CPE_VERSION>",
        "part": "<PART>",
        "vendor": "<VENDOR>",
        "product": "<PRODUCT>",
        "version": "<VERSION>",
        "update": "<UPDATE>",
        "edition": "<EDITION>",
        "language": "<LANGUAGE>",
        "sw_edition": "<SW_EDITION>",
        "target_sw": "<TARGET_SW>",
        "target_hw": "<TARGET_HW>",
        "other": "<OTHER>"
    }
}
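
The x_cpe_struct properties correspond to the colon-separated components of a CPE 2.3 formatted string. The following is a minimal sketch of that split, assuming the cpeName is in CPE 2.3 format; the function name `parse_cpe_name` is hypothetical and this is not necessarily how cpe2stix implements it.

```python
import re

# Component names in the order they appear after the leading "cpe" prefix.
CPE_FIELDS = [
    "cpe_version", "part", "vendor", "product", "version", "update",
    "edition", "language", "sw_edition", "target_sw", "target_hw", "other",
]


def parse_cpe_name(cpe_name: str) -> dict:
    """Split a CPE 2.3 formatted string into its named components."""
    # Split on colons that are not escaped with a backslash. This simple
    # lookbehind does not handle an escaped backslash directly before a colon.
    parts = re.split(r"(?<!\\):", cpe_name)
    if len(parts) != 13 or parts[0] != "cpe":
        raise ValueError(f"not a CPE 2.3 formatted string: {cpe_name!r}")
    return dict(zip(CPE_FIELDS, parts[1:]))


if __name__ == "__main__":
    struct = parse_cpe_name(
        "cpe:2.3:a:microsoft:internet_explorer:8.0.6001:beta:*:*:*:*:*:*"
    )
    print(struct["vendor"], struct["product"], struct["version"])
```

The same split yields the vendor and version sections used for the top-level vendor and version properties.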

Note, if the NVD API record contains the property products.cpe.deprecated then [DEPRECATED] is added to the name property.
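
The name rules above (prefer the English title, else take the first, and flag deprecated records) can be sketched as follows. The function name `software_name` is hypothetical, and two details are assumptions made for illustration: that a truthy `deprecated` value is what triggers the flag, and that `[DEPRECATED]` is placed at the start of the name.

```python
def software_name(cpe: dict) -> str:
    """Pick the Software name from one products[].cpe record of the NVD API.

    Assumes the record has at least one entry in "titles".
    """
    titles = cpe["titles"]
    # Prefer the English title; otherwise fall back to the first title.
    chosen = next((t for t in titles if t.get("lang") == "en"), titles[0])
    name = chosen["title"]
    # ASSUMPTION: the flag is prepended; the README only says it is
    # "added to the name property".
    if cpe.get("deprecated"):
        name = f"[DEPRECATED] {name}"
    return name


if __name__ == "__main__":
    record = {
        "deprecated": False,
        "titles": [
            {"title": "Exemple Produit 1.0", "lang": "fr"},
            {"title": "Example Product 1.0", "lang": "en"},
        ],
    }
    print(software_name(record))
```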

Bundle

All objects will be packed into a bundle file in stix2_objects named cpe-bundle.json, which has the following structure.

{
    "type": "bundle",
    "id": "bundle--<UUIDV5 GENERATION LOGIC>",
    "objects": [
        "<ALL STIX JSON OBJECTS>"
    ]
}

To generate the id of the bundle, a UUIDv5 is generated using the namespace 5e6fc5ec-e507-52e7-8465-cf5ffc47138a and an md5 hash of all the sorted objects in the bundle.
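
A minimal sketch of this kind of deterministic id generation is shown below. The namespace comes from the description above; the sort key (object id) and the JSON serialisation fed into the md5 hash are assumptions, so ids produced here will not necessarily match those produced by cpe2stix. The function name `bundle_id` is hypothetical.

```python
import hashlib
import json
import uuid

# Namespace stated in the README.
NAMESPACE = uuid.UUID("5e6fc5ec-e507-52e7-8465-cf5ffc47138a")


def bundle_id(objects: list) -> str:
    """Derive a stable bundle id from the objects the bundle contains."""
    # ASSUMPTION: objects are sorted by their STIX id and serialised as
    # JSON with sorted keys before hashing.
    ordered = sorted(objects, key=lambda obj: obj["id"])
    serialised = json.dumps(ordered, sort_keys=True)
    digest = hashlib.md5(serialised.encode("utf-8")).hexdigest()
    return f"bundle--{uuid.uuid5(NAMESPACE, digest)}"


if __name__ == "__main__":
    objs = [
        {"type": "software", "id": "software--b", "name": "B"},
        {"type": "software", "id": "software--a", "name": "A"},
    ]
    # The same set of objects gives the same id regardless of input order.
    print(bundle_id(objs) == bundle_id(list(reversed(objs))))
```

Because the id depends only on the bundle contents, two runs that produce identical objects produce identical bundle ids.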

Updating STIX Objects

New CPEs are added weekly. Existing CPEs are also updated.

Therefore the script can be used to keep an up-to-date copy of objects.

Generally it is assumed the script will be used like so:

1. on install, a user will create a backfill of all CPEs (almost 1.2 million at the time of writing, depending on the CPE_LAST_MODIFIED_EARLIEST/CPE_LAST_MODIFIED_LATEST dates used)
- note, generally this job will be split into multiple parts, downloading one year of data at a time.
2. said bundle(s) will be imported into some downstream tool (e.g. a threat intelligence platform)
3. the user runs the script again, this time updating the CPE_LAST_MODIFIED_EARLIEST variable to match the last time the script was run (so that the updated bundle only captures new and updated objects)
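
To illustrate splitting a backfill into one-year parts, the sketch below cuts an earliest/latest range into windows that end on calendar-year boundaries. The function name `yearly_windows` is hypothetical, and how each window is then passed to the script (for example by setting CPE_LAST_MODIFIED_EARLIEST and CPE_LAST_MODIFIED_LATEST before each run) is left out.

```python
from datetime import datetime


def yearly_windows(earliest: datetime, latest: datetime) -> list:
    """Split [earliest, latest) into consecutive windows of at most one
    calendar year each."""
    windows = []
    start = earliest
    while start < latest:
        # The window ends at the next 1 January or at `latest`,
        # whichever comes first.
        next_year = datetime(start.year + 1, 1, 1, tzinfo=start.tzinfo)
        end = min(next_year, latest)
        windows.append((start, end))
        start = end
    return windows


if __name__ == "__main__":
    for begin, end in yearly_windows(datetime(2022, 6, 15), datetime(2024, 3, 1)):
        print(begin.date(), "->", end.date())
```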

The script will store the STIX objects created in the stix2_objects directory. All old objects will be purged with each run.

Recommendations for backfill

I STRONGLY recommend you use cxe2stix_helper to perform the backfill. cxe2stix_helper will handle the splitting of the bundle files into your desired time ranges.

Useful supporting tools

To generate STIX 2.1 Objects: stix2 Python Lib
The STIX 2.1 specification: STIX 2.1 docs
NVD CPE Overview
NVD CVE API

Support

Minimal support provided via the DOGESEC community.

License

Apache 2.0.
