Need more information extracted from GeoIP database not only lat or lng #107

Open
keefeleen opened this issue Mar 6, 2017 · 8 comments

@keefeleen

We need to extract province and city information from a MaxMind GeoIP2 database using the Logstash geoip filter.

But it seems that the default Logstash geoip plugin only provides "latitude" and "longitude" info.
We actually wrote our own plugin to extract this information, but we would strongly prefer that the official plugin be updated. Could a future version add more of the fields that can be extracted from GeoIP2 databases?

Thanks in advance; looking forward to your reply.

@wiibaa

wiibaa commented Mar 6, 2017

Strange report, as the default list of fields is already quite exhaustive and contains city_name: https://github.com/logstash-plugins/logstash-filter-geoip/blob/master/lib/logstash/filters/geoip.rb#L67
This is also configurable, so you can add the desired level of subdivision as documented here: http://dev.maxmind.com/geoip/geoip2/geoip2-city-country-csv-databases/

@keefeleen do you have a restricted fields setting in your configuration?

@keefeleen
Author

Thanks for your reply. I checked again and, just as you said, the default fields already contain the info we need. I then found the problem that caused the failure when we tried to extract city_name information.

The GeoIP filter uses the GeoIP2-java library. We found that before reading the database, the DatabaseReader class compares the database's "database_type" metadata with the data type name we request (e.g. "city"). If "database_type" doesn't contain that name, it throws an exception and doesn't return the IP data we want.

I think the related code is as follows in DatabaseReader.java:

    String databaseType = this.getMetadata().getDatabaseType();
    if (!databaseType.contains(type)) {
        String caller = Thread.currentThread().getStackTrace()[2]
                .getMethodName();
        throw new UnsupportedOperationException(
                "Invalid attempt to open a " + databaseType
                        + " database using the " + caller + " method");
    }

We would like to know the exact purpose of this check, and whether there is a way to avoid the failure when "database_type" does not contain "city" or "country" but we still want to read city or country information.

Thanks in advance.

@keefeleen
Author

@wiibaa hello, can you take a look at the question above?

Another team in our company built our own city-level GeoIP database, which follows MaxMind's standard, and provided it for us to use. But the "database_type" they defined doesn't contain "city", so we cannot use the Logstash GeoIP plugin with it.

Is there any way to make the Logstash GeoIP plugin compatible with the existing database, without asking them to rebuild it?

@wiibaa

wiibaa commented Mar 9, 2017

@keefeleen I'm sorry, but this is how the MaxMind DatabaseReader seems to work; Logstash is simply calling this method:

    @Override
    public CityResponse city(InetAddress ipAddress) throws IOException,
            GeoIp2Exception {
        return this.get(ipAddress, CityResponse.class, "City");
    }

So your custom database must define the proper type, otherwise the MaxMind lib cannot use it. This is a compatibility issue between your database and the MaxMind lib; Logstash cannot help much.
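
To make the effect of that check concrete, here is a minimal plain-Ruby sketch of the substring test quoted earlier in the thread. It is not the MaxMind library: `type_supported?` is a hypothetical helper, and "Acme-Geo-Level3" is a made-up custom type name. It assumes the comparison behaves like Java's `String.contains`, i.e. a case-sensitive substring match against the "City" argument passed by `city()`.

```ruby
# Illustration only: mimics the `databaseType.contains(type)` test with
# Ruby's String#include?, which is also a case-sensitive substring match.
def type_supported?(database_type, requested_type)
  database_type.include?(requested_type)
end

# "Acme-Geo-Level3" and "geoip2-city" are hypothetical custom type names.
["GeoIP2-City", "GeoLite2-City", "Acme-Geo-Level3", "geoip2-city"].each do |db_type|
  verdict = type_supported?(db_type, "City") ? "accepted" : "rejected"
  puts "#{db_type}: #{verdict}"
end
```

Under these assumptions, a custom database would pass the check as soon as its "database_type" string contains "City" somewhere, with that exact capitalization.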

@joewreschnig

joewreschnig commented Apr 6, 2017

The logstash plugin does treat lat-lon specially. Even if the database has the right entries for e.g. city or country, the record is thrown out by logstash if it doesn't have a lat-lon.

geoip.rb:

    if location.getLatitude().nil? && location.getLongitude().nil?
      return
    end

This causes problems even with the official MaxMind databases. The GeoIP2-City-Europe DB, for example, has continent/country codes but no location fields for places outside Europe.
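
A rough sketch of the effect, using plain Ruby structs in place of the Java objects the plugin really receives. `Location`, `Record`, and `geo_fields_with_early_return` are hypothetical names, and the hash keys are only illustrative:

```ruby
# Hypothetical stand-ins for the lookup result; not the plugin's real classes.
Location = Struct.new(:latitude, :longitude)
Record = Struct.new(:country_code, :continent_code, :location)

# Mirrors the quoted check: return nothing when both coordinates are missing,
# even though the country and continent are known.
def geo_fields_with_early_return(record)
  loc = record.location
  return nil if loc.latitude.nil? && loc.longitude.nil?

  {
    "country_code2" => record.country_code,
    "continent_code" => record.continent_code,
    "latitude" => loc.latitude,
    "longitude" => loc.longitude
  }
end

# A US address looked up in a Europe-only city database: country, no location.
us_record = Record.new("US", "NA", Location.new(nil, nil))
p geo_fields_with_early_return(us_record) # prints nil
```

The country and continent codes are present in the record but never reach the event.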

@wiibaa

wiibaa commented Apr 7, 2017

@joewreschnig so you mean this assumption is wrong?

  # if location is empty, there is no point populating geo data
  # and most likely all other fields are empty as well

Could you provide some examples? I cannot easily find a description of such cases in the MaxMind documentation.

@joewreschnig

joewreschnig commented Apr 7, 2017

Yes, I believe that assumption is wrong, even for official MaxMind DBs. For example, when I look up a US IP in the GeoIP2-City-Europe DB, I get only the country, no location:

  # mmdblookup -f /var/lib/GeoIP/GeoIP2-City-Europe.mmdb -i 8.8.8.8
  {
    "continent": 
      {
        "code": 
          "NA" <utf8_string>
        "geoname_id": 
          6255149 <uint32>
        "names": 
          {
            "de": 
              "Nordamerika" <utf8_string>
            "en": 
              "North America" <utf8_string>
            "es": 
              "Norteamérica" <utf8_string>
            "fr": 
              "Amérique du Nord" <utf8_string>
            "ja": 
              "北アメリカ" <utf8_string>
            "pt-BR": 
              "América do Norte" <utf8_string>
            "ru": 
              "Северная Америка" <utf8_string>
            "zh-CN": 
              "北美洲" <utf8_string>
          }
      }
    "country": 
      {
        "geoname_id": 
          6252001 <uint32>
        "iso_code": 
          "US" <utf8_string>
        "names": 
          {
            "de": 
              "USA" <utf8_string>
            "en": 
              "United States" <utf8_string>
            "es": 
              "Estados Unidos" <utf8_string>
            "fr": 
              "États-Unis" <utf8_string>
            "ja": 
              "アメリカ合衆国" <utf8_string>
            "pt-BR": 
              "Estados Unidos" <utf8_string>
            "ru": 
              "США" <utf8_string>
            "zh-CN": 
              "美国" <utf8_string>
          }
      }
    "registered_country": 
      {
        "geoname_id": 
          6252001 <uint32>
        "iso_code": 
          "US" <utf8_string>
        "names": 
          {
            "de": 
              "USA" <utf8_string>
            "en": 
              "United States" <utf8_string>
            "es": 
              "Estados Unidos" <utf8_string>
            "fr": 
              "États-Unis" <utf8_string>
            "ja": 
              "アメリカ合衆国" <utf8_string>
            "pt-BR": 
              "Estados Unidos" <utf8_string>
            "ru": 
              "США" <utf8_string>
            "zh-CN": 
              "美国" <utf8_string>
          }
      }
  }

If I remove the check (and put appropriate guards around the assignments) the plugin handles the data just fine - I get a continent and country.
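
Roughly, the guarded version looks like the sketch below. This is not the actual patch: `GeoLocation`, `GeoRecord`, and `geo_fields_with_guards` are hypothetical plain-Ruby stand-ins, and the hash keys are only illustrative. The idea is to set each field only when its value is present, instead of giving up on the whole record.

```ruby
# Hypothetical stand-ins for the lookup result; not the plugin's real classes.
GeoLocation = Struct.new(:latitude, :longitude)
GeoRecord = Struct.new(:country_code, :continent_code, :location)

# No early return: every assignment is guarded on its own value.
def geo_fields_with_guards(record)
  fields = {}
  fields["country_code2"] = record.country_code unless record.country_code.nil?
  fields["continent_code"] = record.continent_code unless record.continent_code.nil?

  loc = record.location
  # Coordinates are only useful as a pair, so require both.
  unless loc.nil? || loc.latitude.nil? || loc.longitude.nil?
    fields["latitude"] = loc.latitude
    fields["longitude"] = loc.longitude
  end
  fields
end

us_record = GeoRecord.new("US", "NA", GeoLocation.new(nil, nil))
p geo_fields_with_guards(us_record)
```

For the record above, the result contains only the country and continent entries; the coordinate keys are simply absent.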

@wiibaa

wiibaa commented Apr 7, 2017

@joewreschnig very interesting. It's true that historically the geoip filter was mainly used to retrieve the lat/lon for a map widget in Kibana, but that should not be the only use case.
