Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Flowbits unreliable across threads #71

Open
kshelton opened this issue Jun 23, 2016 · 8 comments
Open

Flowbits unreliable across threads #71

kshelton opened this issue Jun 23, 2016 · 8 comments

Comments

@kshelton
Copy link
Contributor

kshelton commented Jun 23, 2016

I am attempting to write a rule that detects DHCPACK in response to a DHCPREQUEST using flowbits. I am finding that the result is non-deterministic. Sometimes the flowbit from the REQUEST is set before the ACK is processed, and sometimes it is not.

Here are the rules:

alert syslog $HOME_NET any -> $HOME_NET any (msg: "[DHCPD] Lease obtained by client"; content: "DHCPACK"; classtype: asset-info; normalize; flowbits: isset, by_src, dhcpreq; program: dhcpd; sid: 10000001; rev:1;)
alert syslog $HOME_NET any -> $HOME_NET any (msg: "[DHCPD] Lease requested by client"; content: "DHCPREQUEST"; classtype: asset-info; normalize; flowbits: set, dhcpreq, 30; flowbits: noalert; program: dhcpd; sid: 10000000; rev:1;)

Here are sample log entries:

1.1.1.1|IGNORE1|IGNORE2|IGNORE3|dhcp|2016-06-23T08:27:29+00:00|2016-06-23T08:27:29+00:00|dhcpd| DHCPREQUEST for 192.168.255.4 from de:ad:be:ef:de:ad (SomeHost) via igb2
1.1.1.1|IGNORE1|IGNORE2|IGNORE3|dhcp|2016-06-23T08:27:29+00:00|2016-06-23T08:27:29+00:00|dhcpd| DHCPACK on 192.168.255.4 to de:ad:be:ef:de:ad (SomeHost) via igb2

Here are is a run where it fails to match the flow bit

[*] Loading GeoIP2 database. [/usr/share/GeoIP/Geo-Country.mmdb]
[*] Loading classifications.conf file. [/usr/local/etc/sagan-rules/classification.config]
[*] 48 classifications loaded
[*] Loading /usr/local/etc/sagan-rules/normalization.rulebase for normalization.
[*] Loading references.conf file. [/usr/local/etc/sagan-rules/reference.config]
[*] 6 references loaded.
[*] Loading gen-msg.map file. [/usr/local/etc/sagan-rules/gen-msg.map]
[*] 3 generators loaded.
[*] Loading protocol map file. [/usr/local/etc/sagan-rules/protocol.map]
[*] 12 protocols loaded [Message search: 3|Program search: 9]
[*] Loading /usr/local/etc/sagan-rules/internal.rules rule file
[*] Loading /usr/local/etc/sagan-rules/official.rules rule file
[*] Configuration file /usr/local/etc/sagan.conf loaded and 61 rules loaded.
[*] Out of 61 rules, 3 Flowbit(s) are in use.
[*] Sagan version 1.1.0 is firing up!
[*] Dropping privileges [UID: 65534 GID: 65534]
[*] ---------------------------------------------------------------------------
[*] Initializing shared memory objects.
[*] ---------------------------------------------------------------------------
[*] + Counters shared object (new).
[*] + Flowbit shared object (new).
[*] + Thresh_by_src shared object (new).
[*] + Thresh_by_dst shared object (new).
[*] + Thresh_by_dstport shared object (new).
[*] + Thresh_by_username shared object (new).
[*] + After_by_src shared object (new).
[*] + After_by_dst shared object (new).
[*] + After_by_dstport shared object (new).
[*] + After_by_username shared object (new).
[*]
[*]  ,-._,-.    -*> Sagan! <*-
[*]  \/)"(\/    Version 1.1.0
[*]   (_o_)     Champ Clark III & The Quadrant InfoSec Team [quadrantsec.com]
[*]   /   \/)   Copyright (C) 2009-2016 Quadrant Information Security, et al.
[*]  (|| ||)    Using PCRE version: 8.38 2015-11-23
[*]   oo-oo     Sagan is processing events.....
[*]
[*] Spawning 50 Processor Threads.
[*]
[*] Attempting to open syslog FILE (/tmp/fifo).
[*] Successfully opened FILE (/tmp/fifo) and processing events.....
[*] EOF reached. Waiting for threads to catch up....
[*]
[*]  ,-._,-.  -[ Sagan Version 1.1.0 - Engine Statistics ]-
[*]  \/)"(\/
[*]   (_o_)    Events processed         : 2
[*]   /   \/)  Signatures matched       : 0 (0.000%)
[D] [processors/sagan-engine.c, line 1669] **[Trigger]*********************************
[*]  (|| ||)   Alerts                   : 0 (0.000%)
[D] [processors/sagan-engine.c, line 1670] Program: dhcpd | Facility: IGNORE1 | Priority: IGNORE2 | Level: IGNORE3 | Tag: dhcp
[*]   oo-oo    After                    : 0 (0.000%)
[*]            Threshold                : 0 (0.000%)
[*]            Dropped                  : 0 (0.000%)
[*]            Thread Exhaustion        : 0 (0.000%)
[*]            GeoIP2 Hits:             : 0 (0.000%)
[*]            GeoIP2 Lookups:          : 0
[*]            GeoIP2 Misses            : 0
[D] [processors/sagan-engine.c, line 1671] Threshold flag: 0 | After flag: 0 | Flowbit Flag: 1 | Flowbit status: 0
[D] [processors/sagan-engine.c, line 1672] Triggering Message:  DHCPREQUEST for 192.168.255.4 from de:ad:be:ef:de:ad (SomeHost) via igb2
[D] [sagan-flowbit.c, line 391] Condition of flowbit returning FALSE. 1 0
[*]            Uptime                   : 0 days, 0 hours, 0 minutes, 0 seconds.
[D] [sagan-flowbit.c, line 684] [1] Created flowbit "" via "set" [192.168.255.4 -> 1.1.1.1],
[*]            Avg. events per/second   : 0
[*]
[*]           -[ Sagan Processor Statistics ]-
[*]
[*]            Dropped                  : 0 (0.000%)
[*]
[*]           -[ Sagan follow_flow Statistics ]-
[*]
[*]            Total                    : 0
[*]            Dropped                  : 0 (0.000%)
[*] -------------------------------------------------------------------------------
[*] Exiting.

Here is a run where it does match:

[*] Loading GeoIP2 database. [/usr/share/GeoIP/Geo-Country.mmdb]
[*] Loading classifications.conf file. [/usr/local/etc/sagan-rules/classification.config]
[*] 48 classifications loaded
[*] Loading /usr/local/etc/sagan-rules/normalization.rulebase for normalization.
[*] Loading references.conf file. [/usr/local/etc/sagan-rules/reference.config]
[*] 6 references loaded.
[*] Loading gen-msg.map file. [/usr/local/etc/sagan-rules/gen-msg.map]
[*] 3 generators loaded.
[*] Loading protocol map file. [/usr/local/etc/sagan-rules/protocol.map]
[*] 12 protocols loaded [Message search: 3|Program search: 9]
[*] Loading /usr/local/etc/sagan-rules/internal.rules rule file
[*] Loading /usr/local/etc/sagan-rules/official.rules rule file
[*] Configuration file /usr/local/etc/sagan.conf loaded and 61 rules loaded.
[*] Out of 61 rules, 3 Flowbit(s) are in use.
[*] Sagan version 1.1.0 is firing up!
[*] Dropping privileges [UID: 65534 GID: 65534]
[*] ---------------------------------------------------------------------------
[*] Initializing shared memory objects.
[*] ---------------------------------------------------------------------------
[*] + Counters shared object (new).
[*] + Flowbit shared object (new).
[*] + Thresh_by_src shared object (new).
[*] + Thresh_by_dst shared object (new).
[*] + Thresh_by_dstport shared object (new).
[*] + Thresh_by_username shared object (new).
[*] + After_by_src shared object (new).
[*] + After_by_dst shared object (new).
[*] + After_by_dstport shared object (new).
[*] + After_by_username shared object (new).
[*]
[*]  ,-._,-.    -*> Sagan! <*-
[*]  \/)"(\/    Version 1.1.0
[*]   (_o_)     Champ Clark III & The Quadrant InfoSec Team [quadrantsec.com]
[*]   /   \/)   Copyright (C) 2009-2016 Quadrant Information Security, et al.
[*]  (|| ||)    Using PCRE version: 8.38 2015-11-23
[*]   oo-oo     Sagan is processing events.....
[*]
[*] Spawning 50 Processor Threads.
[*]
[*] Attempting to open syslog FILE (/tmp/fifo).
[*] Successfully opened FILE (/tmp/fifo) and processing events.....
[D] [processors/sagan-engine.c, line 1669] **[Trigger]*********************************
[D] [processors/sagan-engine.c, line 1670] Program: dhcpd | Facility: IGNORE1 | Priority: IGNORE2 | Level: IGNORE3 | Tag: dhcp
[D] [processors/sagan-engine.c, line 1671] Threshold flag: 0 | After flag: 0 | Flowbit Flag: 1 | Flowbit status: 0
[D] [processors/sagan-engine.c, line 1672] Triggering Message:  DHCPREQUEST for 192.168.255.4 from de:ad:be:ef:de:ad (SomeHost) via igb2
[D] [sagan-flowbit.c, line 684] [1] Created flowbit "" via "set" [192.168.255.4 -> 1.1.1.1],
[*] EOF reached. Waiting for threads to catch up....
[*]
[*] Waiting on 1 threads....
[D] [sagan-flowbit.c, line 162] "isset" flowbit "dhcpreq" (direction: "by_src"). (192.168.255.4 -> any)
[D] [sagan-flowbit.c, line 381] Condition of flowbit returning TRUE. 1 1
[D] [processors/sagan-engine.c, line 1669] **[Trigger]*********************************
[D] [processors/sagan-engine.c, line 1670] Program: dhcpd | Facility: IGNORE1 | Priority: IGNORE2 | Level: IGNORE3 | Tag: dhcp
[D] [processors/sagan-engine.c, line 1671] Threshold flag: 0 | After flag: 0 | Flowbit Flag: 1 | Flowbit status: 1
[D] [processors/sagan-engine.c, line 1672] Triggering Message:  DHCPACK on 192.168.255.4 to de:ad:be:ef:de:ad (SomeHost) via igb2
[*]  ,-._,-.  -[ Sagan Version 1.1.0 - Engine Statistics ]-
[*]  \/)"(\/
[*]   (_o_)    Events processed         : 2
[*]   /   \/)  Signatures matched       : 2 (100.000%)
[*]  (|| ||)   Alerts                   : 1 (50.000%)
[*]   oo-oo    After                    : 0 (0.000%)
[*]            Threshold                : 0 (0.000%)
[*]            Dropped                  : 0 (0.000%)
[*]            Thread Exhaustion        : 0 (0.000%)
[*]            GeoIP2 Hits:             : 0 (0.000%)
[*]            GeoIP2 Lookups:          : 0
[*]            GeoIP2 Misses            : 0
[*]            Uptime                   : 0 days, 0 hours, 0 minutes, 2 seconds.
[*]            Avg. events per/second   : 1
[*]
[*]           -[ Sagan Processor Statistics ]-
[*]
[*]            Dropped                  : 0 (0.000%)
[*]
[*]           -[ Sagan follow_flow Statistics ]-
[*]
[*]            Total                    : 0
[*]            Dropped                  : 0 (0.000%)
[*] -------------------------------------------------------------------------------
[*] Exiting.
@kshelton
Copy link
Contributor Author

kshelton commented Jun 23, 2016

Looking at the code, I believe this is due to the fact that there is no guarantee on order for processing flowbits. I believe there needs to be a data structure that tracks current record timestamps and guarantees that an isset check cannot occur until there are no older records that have not matched a rule yet or, if there is a rule that has been matched and is still being processed, that rule has already set its flowbit(s) or cannot set a flowbit the current record wishes to check.

@beave
Copy link
Owner

beave commented Jun 23, 2016

Hello, Thank you for the bug report.

It appears (?) that you are attempting to use flowbits and read in a file.

"[*] EOF reached. Waiting for threads to catch up...."

I've never tested flowbits on a file. When using a FIFO, we are receiving the events "in order". That is, the input times ranges. Sagan is consuming the file all at once. For example:

In the FIFO world, I might have two events span several seconds. In the file world, then data is read in bulk. There is not "time" between events. I could certainly see threads in a "race" condition.

In reality, the "-F" (file in) option needs to be completely rethought out. The reading in of a file was added as a later after thought.

You point is certainly taken recording of time stamps for verification. This would likely event have some benefit when receiving data via FIFO.

There actually is a data structure which keeps track of flowbits:

See the _Sagan_IPC_Flowbit structure.

Even better, we have a little "hidden" utility we use for debugging. In the Sagan source tree, go under "tools". You'll see a program called "sagan-peek.c". Run "make" in that directory. This allows you to dump the contents of the mmap structures. It will show the values of the flowbits, It displays the "state" ("S"), "Flowbit name", "SRC IP", "DST IP", "Date added/modified" and the "expire time".

Rather than using a file, you might want to see if the same thing happens via a FIFO. I still think that adding time based verification is a good idea.

@kshelton
Copy link
Contributor Author

So, using a file was done to reproduce the issue. The same thing occurs if using a FIFO being written to by rsyslog.

@kshelton
Copy link
Contributor Author

kshelton commented Jun 23, 2016

Also of note, the timestamps in the records i listed are the actual timestamps from the events. Both occurred within the same second, which is why this is probably exhibiting in production as well. If you have enough time between events then the race becomes a non-issue, but for things like DHCP where the events could occur sub-second, it does present an issue.

@beave
Copy link
Owner

beave commented Jun 23, 2016

I can certainly see it being an issue via FIFO.

If the log is sent to Sagan within a very short amount of time, a race condition occurs. We might be able to do time comparisons but some code would need to change. Sagan would need to store the reception of the log at a more granular level (microsecond?).

Let me leave this open and ponder on it a bit.

@beave
Copy link
Owner

beave commented Aug 15, 2016

We might look into this in 1.1.2, but could also be further out. Need to investigate what it would take to make flowbits "atomic"ish.

@beave
Copy link
Owner

beave commented Feb 28, 2017

Just a quick update. We have Sagan time resolution at the 1000th of a second. We still need to do some more work to get this into xbits (formerly known as "flowbits"). This may help this issue. Thread sequence is still an issue.

@kshelton
Copy link
Contributor Author

kshelton commented Apr 5, 2017

Great news, thanks for following up on this.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants