Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

datastore: fix get query caching #1449

Merged
merged 1 commit into from
Dec 18, 2024

Conversation

crozzy
Copy link
Contributor

@crozzy crozzy commented Dec 9, 2024

This change replaces caching the DB pgx.Rows, which is an iterator, with the actual potential vulnerabilities. Previously, subsequent records manifesting in the same query were given a pgx.Rows object that had been exhausted.

@crozzy crozzy requested a review from a team as a code owner December 9, 2024 23:13
@crozzy crozzy requested review from hdonnay and removed request for a team December 9, 2024 23:13
@crozzy
Copy link
Contributor Author

crozzy commented Dec 9, 2024

Post-change:

    "186": {
      "id": "186",
      "name": "xz-libs",
      "version": "5.2.2-1.el7",
      "kind": "binary",
      "source": {
        "id": "185",
        "name": "xz",
        "version": "5.2.2-1.el7",
        "kind": "source"
      },
      "arch": "x86_64"
    },
...
    "1924": {
      "id": "1924",
      "name": "xz-libs",
      "version": "5.2.2-1.el7",
      "kind": "binary",
      "source": {
        "id": "185",
        "name": "xz",
        "version": "5.2.2-1.el7",
        "kind": "source"
      },
      "arch": "i686"
    },
...
    "2675687": {
      "id": "2675687",
      "updater": "rhel-vex",
      "name": "CVE-2022-1271",
      "description": "An arbitrary file write vulnerability was found in GNU gzip's zgrep utility. When zgrep is applied on the attacker's chosen file name (for example, a crafted file name), this can overwrite an attacker's content to an arbitrary attacker-selected file. This flaw occurs due to insufficient validation when processing filenames with two or more newlines where selected content and the target file names are embedded in crafted multi-line file names. This flaw allows a remote, low privileged attacker to force zgrep to write arbitrary files on the system.",
      "issued": "2022-04-07T00:00:00Z",
      "links": "https://access.redhat.com/security/cve/CVE-2022-1271 https://bugzilla.redhat.com/show_bug.cgi?id=2073310 https://www.cve.org/CVERecord?id=CVE-2022-1271 https://nvd.nist.gov/vuln/detail/CVE-2022-1271 https://security.access.redhat.com/data/csaf/v2/vex/2022/cve-2022-1271.json https://access.redhat.com/errata/RHSA-2022:5052",
      "severity": "CVSS:3.1/AV:N/AC:L/PR:L/UI:N/S:U/C:H/I:H/A:H",
      "normalized_severity": "High",
      "package": {
        "id": "",
        "name": "xz-libs",
        "version": "",
        "kind": "binary",
        "arch": "i686|ppc|ppc64|ppc64le|s390|s390x|amd64|x86_64"
      },
      "distribution": {
        "id": "",
        "did": "",
        "name": "",
        "version": "",
        "version_code_name": "",
        "version_id": "",
        "arch": "",
        "cpe": "",
        "pretty_name": ""
      },
      "repository": {
        "name": "cpe:2.3:o:redhat:enterprise_linux:7:*:server:*:*:*:*:*",
        "key": "rhel-cpe-repository",
        "cpe": "cpe:2.3:o:redhat:enterprise_linux:7:*:server:*:*:*:*:*"
      },
      "fixed_in_version": "0:5.2.2-2.el7_9",
      "arch_op": "pattern match"
    },
...
"package_vulnerabilities": {
    "186": [
      "2675687"
    ],
...
    "1924": [
      "2675687"
    ],
}

hdonnay
hdonnay previously approved these changes Dec 10, 2024
@crozzy
Copy link
Contributor Author

crozzy commented Dec 10, 2024

@hdonnay sorry just had to update

datastore/postgres/get.go Outdated Show resolved Hide resolved
datastore/postgres/get.go Show resolved Hide resolved
RTann
RTann previously approved these changes Dec 11, 2024
Copy link
Contributor

@RTann RTann left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems like rows does seem to close on its own. LGTM, though I would not be opposed to throwing in a defer rows.Close, if desired

@crozzy
Copy link
Contributor Author

crozzy commented Dec 12, 2024

Seems like rows does seem to close on its own. LGTM, though I would not be opposed to throwing in a defer rows.Close, if desired

Updated, could probably do with another once over

return fmt.Errorf("failed to scan vulnerability: %w", err)
}

if _, ok := vulnSet[rid]; !ok {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I realize this was here before but I'm wondering: is it possible we see the same rid twice in the rqs loop and they both have different queries and different sets of vulnerabilities? If so, then it seems like, in the end, we give them both the same set of vulns instead of separate. Is this expected? If so, when?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I think this is a dormant bug, it probably never comes up in practice because:

  1. With VEX the repository doesn't matter until Vulnerable() is called (so after the querying) so the queries per package are the same.
  2. Before that, in the OVAL files, each vulnerability referenced all the possible repos so I suppose you were bound to get a comparable set of Vulns for each package.ID anyway.

I think this probably deserves more discussion and a separate issue.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. Every other time a matcher queries on driver.RepositoryName, it's a mono-repository representing that ecosytem

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So I looked into it a bit, and I realize that's supposed to happen due to the nature of IndexRecords. As in, there is a record for every unique for every repository in every environment in every package. So, there should be an example where we see the same rid with different queries (for example two different potential repositories) and it's possible we get two different results. At least in this particular example that I'm thinking of (rpms), this was something which has popped up and was a bug, but I believe switching to VEX resolved it? Or, it's still a problem because we have to guess which repository is responsible for the found rpms (we try them all and see what happens)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, it's not an issue for VEX because the specific repo isn't part of the query anymore, it's checked after the vulns are returned. But, it's still a little bit of a bear trap for future stuff.

This change replaces caching the DB pgx.Rows, which is an iterator, with
the actual potential vulnerabilities. Previously, subsequent records
manifesting in the same query were given a pgx.Rows object that had been
exhausted.

Signed-off-by: crozzy <[email protected]>
@crozzy crozzy force-pushed the matching-queries-bug branch from 7c8bf6b to f2b7d47 Compare December 18, 2024 16:49
@crozzy
Copy link
Contributor Author

crozzy commented Dec 18, 2024

/fast-forward

@github-actions github-actions bot merged commit f2b7d47 into quay:main Dec 18, 2024
6 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

Successfully merging this pull request may close these issues.

3 participants