Postprocessing is Noisy and Wasteful #65
Comments
A question for the creator, @kevthehermit: what is meant to take priority in the code, the blacklist or the "parse all" setting? I believe whichever check comes first in the code will make the difference. For example: Current with Blacklist and Parse All: Option A: Option B: And there are a few more possibilities.
I can see your point. Process ALL was just a lazy way to say a module should run on everything without having to specify a list of every rule. Store All should take priority over the blacklist. The idea of the blacklist was to help reduce false positives in data you wanted to keep. Store All should ignore all other filters. I have updated the workflow to post-process a blacklisted item only if Store All is True. If Store All is False, then no post-processing is performed. As for the log output
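One possible reading of the priority described in this comment, as a minimal sketch rather than the actual PasteHunter code. The function name and the `blacklisted` and `store_all` parameters are hypothetical, and the sketch assumes the "no post process" rule applies to blacklisted pastes only:

```python
def should_post_process(results, rule_list, blacklisted, store_all):
    """Decide whether a post module should run on a paste.

    Hypothetical helper: `blacklisted` and `store_all` stand in for
    whatever flags the real workflow uses.
    """
    # A blacklisted paste is only post-processed when Store All is on.
    if blacklisted and not store_all:
        return False
    # Otherwise fall back to the rule match, with "ALL" as a wildcard.
    return "ALL" in rule_list or any(rule in results for rule in rule_list)


# A blacklisted paste is skipped unless Store All is enabled.
print(should_post_process([], ["ALL"], blacklisted=True, store_all=False))  # False
print(should_post_process([], ["ALL"], blacklisted=True, store_all=True))   # True
```

Checking the blacklist first keeps the decision in one place, so a module configured with "ALL" never runs on a paste that is about to be discarded.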
If you turn on the entropy calculator, it fills the default logs with:
INFO:pastehunter.py:Running Post Module postprocess.post_entropy on
It also runs on blacklisted pastes, wasting CPU time.
Affected code:
To cut the logic off at the important point:
if any(i in results for i in post_values["rule_list"]) or "ALL" in post_values["rule_list"]:
This says either
any(i in results for i in post_values["rule_list"])
or "ALL" in post_values["rule_list"]
will cause a paste to be parsed. This means a post-processor such as the entropy calculator will be run on EVERY paste, blacklisted or not.
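To illustrate, here is a small self-contained reproduction of that condition. The wrapper function and the sample rule names (`blacklist`, `email_list`) are made up for the example; only the boolean expression comes from the code quoted above:

```python
def runs_post_module(results, rule_list):
    """Same expression as the quoted condition, wrapped for testing."""
    return any(i in results for i in rule_list) or "ALL" in rule_list


# Hypothetical module config that asks to run on everything.
post_values = {"rule_list": ["ALL"]}

# Hypothetical results for a paste that only matched a blacklist rule.
blacklisted_results = ["blacklist"]

# "ALL" short-circuits the check, so the blacklist state is never consulted.
print(runs_post_module(blacklisted_results, post_values["rule_list"]))  # True

# Without "ALL", a paste with no matching rule is skipped as expected.
print(runs_post_module([], ["email_list"]))  # False
```

Nothing in the expression looks at whether the paste was blacklisted, which is why the module runs on every paste when "ALL" is present.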