Postprocessing is Noisy and Wasteful #65

secbug · 2019-01-24T15:10:12Z

If you turn on the entropy calculator, it fills the default logs with:
INFO:pastehunter.py:Running Post Module postprocess.post_entropy on
It also runs on blacklisted pastes, wasting CPU time.

Affected code:

# If any of the blacklist rules appear then empty the result set
if conf['yara']['blacklist'] and 'blacklist' in results:
    results = []
    logger.info("Blacklisted {0} paste {1}".format(paste_data['pastesite'], paste_data['pasteid']))

# Post Process

# If post module is enabled and the paste has a matching rule.
post_results = paste_data
for post_process, post_values in conf["post_process"].items():
    if post_values["enabled"]:
        if any(i in results for i in post_values["rule_list"]) or "ALL" in post_values["rule_list"]:
            logger.debug("Running Post Module {0} on {1}".format(post_values["module"], paste_data["pasteid"]))
            post_module = importlib.import_module(post_values["module"])
            post_results = post_module.run(results,
                                            raw_paste_data,
                                            paste_data
                                            )

The important part of the logic is this condition:
if any(i in results for i in post_values["rule_list"]) or "ALL" in post_values["rule_list"]:

The condition is true if any matched rule in results appears in post_values["rule_list"], or if "ALL" is in post_values["rule_list"]. The second test never looks at results, so emptying results for a blacklisted paste does not stop it. This means a post_processor like "entropy calculator" will be run on EVERY paste, blacklisted or not.
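A minimal sketch of that condition in isolation, to show why an emptied result set does not block an "ALL" module. The function name and the rule name "b64_rule" are hypothetical and used only for illustration; only the boolean expression is taken from the code quoted above.

```python
def should_post_process(results, rule_list):
    """Same boolean expression as the quoted condition."""
    return any(i in results for i in rule_list) or "ALL" in rule_list


# The blacklist step replaces results with an empty list.
blacklisted_results = []

# A module configured with "ALL" still runs on the blacklisted paste.
print(should_post_process(blacklisted_results, ["ALL"]))

# A module tied to a specific (hypothetical) rule does not.
print(should_post_process(blacklisted_results, ["b64_rule"]))
```

The first call prints True and the second prints False, which is the behaviour described in this report.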


secbug · 2019-01-28T20:34:52Z

A question for the creator, @kevthehermit: what is meant to take priority in the code, the blacklist or the "parse all" setting? I believe whichever comes first in the code will make the difference. For example:

Current with Blacklist and Parse All:
Blacklist - result = []
Post_process_current - process only "all"
Parse_all setting - result = [non_empty]
print() everything

Option A:
Blacklist - result = []
Parse_all setting - result = [non_empty]
Post_process_current - process everything applicable and process all on all
print() everything

Option B:
Parse_all setting - result = [non_empty]
Blacklist - result = []
Post_process_current - process everything applicable and process non-blacklisted on all
print() non-blacklisted

And there are a few more possibilities.

kevthehermit · 2019-02-09T20:18:12Z

I can see your point. Process ALL was just a lazy way to say it should run on everything without having to specify a list of every rule.

Store All should take priority over the blacklist. The idea of the blacklist was to help with false-positive reduction in data you wanted to keep. Store All should ignore all other filters.

I have updated the workflow to only post process a blacklisted item if Store All is True. If Store All is False, a blacklisted item gets no post processing.
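The workflow described here could be sketched roughly as follows. This is not the actual change made in the repository. The function name, the `blacklisted` and `store_all` parameters, and the module names in the example config are assumptions for illustration. Only the `enabled` and `rule_list` keys and the "ALL" convention come from the code quoted earlier in the thread.

```python
def select_post_modules(results, blacklisted, store_all, post_conf):
    """Return the names of the post modules that would run on one paste.

    Hypothetical helper: a blacklisted paste is post processed only
    when store_all is True.
    """
    if blacklisted and not store_all:
        return []

    selected = []
    for name, values in post_conf.items():
        if not values["enabled"]:
            continue
        rule_list = values["rule_list"]
        if "ALL" in rule_list or any(i in results for i in rule_list):
            selected.append(name)
    return selected


# Hypothetical config with one "ALL" module and one rule-specific module.
post_conf = {
    "post_entropy": {"enabled": True, "rule_list": ["ALL"]},
    "post_b64": {"enabled": True, "rule_list": ["b64_rule"]},
}

print(select_post_modules([], blacklisted=True, store_all=False, post_conf=post_conf))
print(select_post_modules([], blacklisted=True, store_all=True, post_conf=post_conf))
```

With Store All off, the blacklisted paste selects no modules. With Store All on, only the "ALL" module is selected, because the blacklist has already emptied the rule matches.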

As for the log output:
All output modules generate an info log. I don't want to start setting logging per post process module, as it just becomes harder to manage.
I can set the default to disable entropy calculation. I use it to look for encrypted blobs or large chunks of base64 and binary data, so it may not be useful for others.
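For anyone who wants the entropy module on but the log quieter, one local workaround that needs no per-module settings in the project is a standard-library `logging.Filter` attached to your own handler. This is only a sketch: the logger name "example" and the filter class are hypothetical, and the message prefix is taken from the log line quoted at the top of this issue.

```python
import logging


class DropPostModuleNoise(logging.Filter):
    """Drop records whose message starts with the post-module banner."""

    def filter(self, record):
        return not record.getMessage().startswith("Running Post Module")


handler = logging.StreamHandler()
handler.addFilter(DropPostModuleNoise())

logger = logging.getLogger("example")  # hypothetical logger name
logger.addHandler(handler)
logger.setLevel(logging.INFO)

# Suppressed by the filter on the handler.
logger.info("Running Post Module postprocess.post_entropy on 123")
# Still emitted.
logger.info("Blacklisted pastebin.com paste 123")
```

The filter sits on the handler, so it only changes what that handler writes. The records are still created, which means this reduces log noise but not the CPU cost raised above.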

secbug changed the title ~~Turning on entropy calculator fills INFO log level with data~~ Postprocessing is Noisy and Wasteful Jan 28, 2019
