-
Notifications
You must be signed in to change notification settings - Fork 80
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Support ISP database for ASN data and move parts to Java #111
Conversation
'dma_code', 'ip', 'latitude', | ||
'longitude', 'postal_code', 'region_name', | ||
'region_code', 'timezone', 'location'] | ||
config :fields, :validate => :array |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The docs still mention the same defaults, but the actual defaults have moved to Java enum class.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not listing a :default here makes the documentation generator (which will change soon with @ph's work) show that the setting has no default value. Could be confusing.
However, I think given @ph/@dede's efforts to change how plugins get documented, this is probably a fine change.
(Mostly noting this for posterity)
@@ -113,132 +96,44 @@ class LogStash::Filters::GeoIP < LogStash::Filters::Base | |||
# of the same geoip_type share the same cache. The last declared cache size will 'win'. The reason for this is that there would be no benefit | |||
# to having multiple caches for different instances at different points in the pipeline, that would just increase the | |||
# number of cache misses and waste memory. | |||
config :lru_cache_size, :validate => :number, :default => 1000 | |||
config :lru_cache_size, :validate => :number, :default => 1000, :deprecated => "This field has been deprecated in favor of cache_size." |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we were never using this, and it was lingering around
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just a little feedback on field naming.
As a user the field names anonymous_system_number and anonymous_system_organization seem unnecessary long. Imagine typing those into the filter bar in Kibana. Simply using 'asn' and 'as_org' would seem sufficient. Pretty much everyone who would care what about these values uses the term ASN. Afterall, the field 'isp' isn't named 'internet_service_provider', so why name the AS-related fields in long form?
That said, if the fields are named that way within the MaxMind database and the goal is to stick with that naming, I can understand that reasoning too.
import java.util.Locale; | ||
|
||
enum Fields { | ||
ANONYMOUS_SYSTEM_NUMBER("anonymous_system_number"), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In ASN, the A is autonomous, not anonymous
'dma_code', 'ip', 'latitude', | ||
'longitude', 'postal_code', 'region_name', | ||
'region_code', 'timezone', 'location'] | ||
config :fields, :validate => :array |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not listing a :default here makes the documentation generator (which will change soon with @ph's work) show that the setting has no default value. Could be confusing.
However, I think given @ph/@dede's efforts to change how plugins get documented, this is probably a fine change.
(Mostly noting this for posterity)
import java.net.UnknownHostException; | ||
import java.util.*; | ||
|
||
public class GeoIpFilter { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IP is capitalized, and writing it "Ip" feels wrong. I dont' think we have a firm style guide on this, but my practice has always been acronyms stay all caps.
This kind of mixing gets confusing when, for example, Java crypto calls it "SSL" but Netty calls it "Ssl" ...
Do we have a practice for this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yeah i was debating this. I do like acronyms as capitalized. I'll change it to GeoIP
|
||
private Set<Fields> createDesiredFields(List<String> fields) { | ||
Set<Fields> desiredFields = EnumSet.noneOf(Fields.class); | ||
if (fields == null) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
also worth checking if fields.isEmpty() ? Or would we reject an empty fields list at configuration-time?
} | ||
// only do event.set(@target) if the lookup result is not null | ||
if (event.getField(targetField) == null) { | ||
event.setField(targetField, Collections.emptyMap()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is this necessary? event.set("[foo][bar]", ...) should create any parent fields if it doesn't exist... I think?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
no it doesn't. I had to put this in, otherwise, tests fail. It was in the old ruby code as well.
This is because of the canceled events bug found in logstash-plugins/logstash-filter-date#90
@robacj suggestion accepted. I shortened the fields like you mentioned. @jordansissel updated after your comments. |
public class GeoIPFilter { | ||
private static Logger logger = LogManager.getLogger(); | ||
private static final String CITY_DB_TYPE = "GeoLite2-City"; | ||
private static final String COUNTRY_DB_TYPE = "GeoLite2-Country"; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The commercial databases have the type GeoIP2-City
and GeoIP2-Country
. It'd likely be worth supporting those too, especially given that they are mentioned in the recent blog post.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@oschwald good call. Updated.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Overall LGTM. <3
|
||
// if location is empty, there is no point populating geo data | ||
// and most likely all other fields are empty as well | ||
if (location.getLatitude() == null && location.getLongitude() == null) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since you're rewriting this anyway, please don't keep this mistake. There are GeoIP records with no lat-lon that this plugin should be able to handle.
Minor note, changelog is missing for this release. |
When can I use it? |
@zcola you can use it now by updating the plugin within your current logstash:
|
@hackery
logstash return
Use python to parse the return
Is my compiler script wrong? |
This adds support to GeoIP2 ISP database which includes the ASN data.
Also, while doing this, it made sense to me to just move most parts to Java since we were already using the java APIs and not the ruby libs. All existing tests pass.