feature: align hallucinated package names with outputs #1076

Open: wants to merge 5 commits into base: main

@leondz (Collaborator) commented Jan 15, 2025

Previously, packagehallucination probes attached a single flat list of all hallucinated package names at detection time, which was hard to link back to individual outputs. This PR changes the report to a list of lists aligned with the outputs, with one inner list per output.

Example

Given the outputs ["import not_a_real_package", "import sys", "pass"],

after running garak.detectors.packagehallucination.PythonPypi() on them,
without PR:
attempt.notes["hallucinated_python_packages"] would be ["not_a_real_package"]
with PR:
attempt.notes["hallucinated_python_packages"] is [["not_a_real_package"], [None], []]
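To make the alignment concrete, here is a minimal standalone sketch that reproduces the shape of the example above. It is not garak's implementation: `KNOWN_PACKAGES`, `IMPORT_RE`, and `aligned_hallucinations` are hypothetical names, the known-package set is a tiny stand-in for the real index, and the `None` entry for a real import is inferred only from the example.

```python
import re

# Hypothetical stand-in for the real package index (assumption, not garak data).
KNOWN_PACKAGES = {"sys", "os", "re"}

# Captures the top-level module name of "import x" / "from x import y" lines.
IMPORT_RE = re.compile(
    r"^\s*(?:import|from)\s+([A-Za-z_][A-Za-z0-9_]*)", re.MULTILINE
)


def aligned_hallucinations(outputs, known=KNOWN_PACKAGES):
    """Return one inner list per output, in the same order as `outputs`."""
    per_output = []
    for text in outputs:
        found = []
        for name in IMPORT_RE.findall(text):
            # Unknown package: record its name. Known package: record None,
            # mirroring the [None] entry in the example above.
            found.append(name if name not in known else None)
        per_output.append(found)
    return per_output


outputs = ["import not_a_real_package", "import sys", "pass"]
notes = aligned_hallucinations(outputs)
```

Because `notes[i]` always corresponds to `outputs[i]`, a consumer can zip the two lists to see which output produced which hallucinated name.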

Verification

Run the alignment test:

  • python -m pytest -vvv tests/detectors/test_detectors_packagehallucination.py::test_result_alignment

@leondz leondz added the detectors work on code that inherits from or manages Detector label Jan 15, 2025
@leondz leondz requested a review from jmartin-tech January 15, 2025 09:35
Collaborator

What is this file used for?

Collaborator Author

It adds the Rust stdlib names to the crates.io entries used by the Rust package hallucination detector.
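A minimal sketch of that kind of merge, assuming a plain-text file with one name per line. The function names and file layout here are hypothetical and are not taken from the PR's code:

```python
from pathlib import Path


def load_names(path):
    """Read one name per line, ignoring blank lines and surrounding whitespace."""
    lines = Path(path).read_text(encoding="utf-8").splitlines()
    return {line.strip() for line in lines if line.strip()}


def known_rust_names(crate_names, std_entries_path):
    """Union of crates.io names and stdlib names (both treated as 'known')."""
    return set(crate_names) | load_names(std_entries_path)
```

With a set union, a name such as `std` counts as known even though it is not published on crates.io, so it would not be flagged as hallucinated.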

Collaborator

I see it now. The file might be better placed at data/packagehallucination/rust/std_entries.txt, or the names could be added to the Hugging Face dataset with dates of addition based on when each became supported in Rust. This will affect usage once #950 is ready to merge.

Collaborator Author

Yeah, I was on the fence about a dir with a single file, but consistency is good, and that PR may bring in more things. Will move it.
