GeoIP processor

The geoip processor adds information about the geographical location of IP addresses, based on data from the Maxmind databases. This processor adds this information by default under the geoip field. The geoip processor can resolve both IPv4 and IPv6 addresses.

The ingest-geoip module ships by default with the GeoLite2 City, GeoLite2 Country and GeoLite2 ASN GeoIP2 databases from Maxmind, made available under the CCA-ShareAlike 4.0 license. For more details, see http://dev.maxmind.com/geoip/geoip2/geolite2/

The geoip processor can run with other city, country and ASN GeoIP2 databases from Maxmind. On Elasticsearch Service deployments, custom database files must be uploaded using a custom bundle. On self-managed deployments, custom database files must be copied into the ingest-geoip config directory located at $ES_CONFIG/ingest-geoip.

Custom database files must be stored uncompressed, and the filename must end in -City.mmdb, -Country.mmdb, or -ASN.mmdb to indicate the type of the database. The database_file processor option specifies the filename of the custom database to use for the processor.
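
As an illustration of the database_file option, the following sketch assumes that a MaxMind city database named GeoIP2-City.mmdb has already been copied into the ingest-geoip config directory (or uploaded in a custom bundle). The pipeline name geoip-custom and the filename are assumptions made for this example, not files or names the module ships with:

```console
# Hypothetical pipeline name; GeoIP2-City.mmdb is assumed to be present
# in $ES_CONFIG/ingest-geoip and is not bundled with the module
PUT _ingest/pipeline/geoip-custom
{
  "description" : "Add geoip info from a custom city database",
  "processors" : [
    {
      "geoip" : {
        "field" : "ip",
        "database_file" : "GeoIP2-City.mmdb"
      }
    }
  ]
}
```

The -City.mmdb suffix is what tells the processor to treat the file as a city database.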

Using the geoip Processor in a Pipeline

Table 21. geoip options

| Name | Required | Default | Description |
|------|----------|---------|-------------|
| field | yes | - | The field to get the IP address from for the geographical lookup. |
| target_field | no | geoip | The field that will hold the geographical information looked up from the Maxmind database. |
| database_file | no | GeoLite2-City.mmdb | The database filename referring to a database the module ships with (GeoLite2-City.mmdb, GeoLite2-Country.mmdb, or GeoLite2-ASN.mmdb) or a custom database in the ingest-geoip config directory. |
| properties | no | [continent_name, country_iso_code, country_name, region_iso_code, region_name, city_name, location] * | Controls what properties are added to the target_field based on the geoip lookup. |
| ignore_missing | no | false | If true and field does not exist, the processor quietly exits without modifying the document. |
| first_only | no | true | If true, only the first geoip data found is returned, even if field contains an array. |

*Depends on what is available in database_file:

  • If the GeoLite2 City database is used, then the following fields may be added under the target_field: ip, country_iso_code, country_name, continent_name, region_iso_code, region_name, city_name, timezone, latitude, longitude and location. The fields actually added depend on what has been found and which properties were configured in properties.
  • If the GeoLite2 Country database is used, then the following fields may be added under the target_field: ip, country_iso_code, country_name and continent_name. The fields actually added depend on what has been found and which properties were configured in properties.
  • If the GeoLite2 ASN database is used, then the following fields may be added under the target_field: ip, asn, organization_name and network. The fields actually added depend on what has been found and which properties were configured in properties.
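
To illustrate how properties and database_file work together, the sketch below defines two geoip processors in one pipeline: the first limits the default city lookup to three properties, and the second reads from the bundled ASN database. The pipeline name geoip-selected and the asn target field are names chosen for this example only:

```console
# Hypothetical pipeline name and target field, for illustration only
PUT _ingest/pipeline/geoip-selected
{
  "description" : "Add selected city properties and ASN info",
  "processors" : [
    {
      "geoip" : {
        "field" : "ip",
        "properties" : ["country_iso_code", "city_name", "location"]
      }
    },
    {
      "geoip" : {
        "field" : "ip",
        "target_field" : "asn",
        "database_file" : "GeoLite2-ASN.mmdb",
        "properties" : ["asn", "organization_name"]
      }
    }
  ]
}
```

As noted in the list above, a requested property is only added when the lookup finds a value for it.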

Here is an example that uses the default city database and adds the geographical information to the geoip field based on the ip field:

PUT _ingest/pipeline/geoip
{
  "description" : "Add geoip info",
  "processors" : [
    {
      "geoip" : {
        "field" : "ip"
      }
    }
  ]
}

PUT my-index-00001/_doc/my_id?pipeline=geoip
{
  "ip": "8.8.8.8"
}

GET my-index-00001/_doc/my_id

Which returns:

{
  "found": true,
  "_index": "my-index-00001",
  "_type": "_doc",
  "_id": "my_id",
  "_version": 1,
  "_seq_no": 55,
  "_primary_term": 1,
  "_source": {
    "ip": "8.8.8.8",
    "geoip": {
      "continent_name": "North America",
      "country_name": "United States",
      "country_iso_code": "US",
      "location": { "lat": 37.751, "lon": -97.822 }
    }
  }
}

Here is an example that uses the default country database and adds the geographical information to the geo field based on the ip field. Note that this database is included in the module. So this:

PUT _ingest/pipeline/geoip
{
  "description" : "Add geoip info",
  "processors" : [
    {
      "geoip" : {
        "field" : "ip",
        "target_field" : "geo",
        "database_file" : "GeoLite2-Country.mmdb"
      }
    }
  ]
}

PUT my-index-00001/_doc/my_id?pipeline=geoip
{
  "ip": "8.8.8.8"
}

GET my-index-00001/_doc/my_id

returns this:

{
  "found": true,
  "_index": "my-index-00001",
  "_type": "_doc",
  "_id": "my_id",
  "_version": 1,
  "_seq_no": 65,
  "_primary_term": 1,
  "_source": {
    "ip": "8.8.8.8",
    "geo": {
      "continent_name": "North America",
      "country_name": "United States",
      "country_iso_code": "US"
    }
  }
}

Geo information cannot be found in the database for every IP address. When this occurs, no target_field is inserted into the document.

Here is an example of how a document is indexed when information for "80.231.5.0" cannot be found:

PUT _ingest/pipeline/geoip
{
  "description" : "Add geoip info",
  "processors" : [
    {
      "geoip" : {
        "field" : "ip"
      }
    }
  ]
}

PUT my-index-00001/_doc/my_id?pipeline=geoip
{
  "ip": "80.231.5.0"
}

GET my-index-00001/_doc/my_id

Which returns:

{
  "_index" : "my-index-00001",
  "_type" : "_doc",
  "_id" : "my_id",
  "_version" : 1,
  "_seq_no" : 71,
  "_primary_term": 1,
  "found" : true,
  "_source" : {
    "ip" : "80.231.5.0"
  }
}
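
One way to check this behavior without indexing anything is the simulate pipeline API. The request below is a sketch, not a verified run: it defines the pipeline inline, sets ignore_missing and first_only explicitly, and submits three sample documents (an array of addresses, a document with no ip field, and the address from the example above). No response is shown because the result depends on the database in use.

```console
# Inline pipeline and sample documents are illustrative only
POST _ingest/pipeline/_simulate
{
  "pipeline" : {
    "processors" : [
      {
        "geoip" : {
          "field" : "ip",
          "ignore_missing" : true,
          "first_only" : false
        }
      }
    ]
  },
  "docs" : [
    { "_source" : { "ip" : ["8.8.8.8", "80.231.5.0"] } },
    { "_source" : { "message" : "no ip field here" } },
    { "_source" : { "ip" : "80.231.5.0" } }
  ]
}
```

According to the options table, the second document should pass through unmodified because ignore_missing is true, and the first document is looked up per array element because first_only is false.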

Recognizing Location as a Geopoint

Although this processor enriches your document with a location field containing the estimated latitude and longitude of the IP address, this field will not be indexed as a geo_point type in Elasticsearch without explicitly defining it as such in the mapping.

You can use the following mapping, shown here for an index named my_ip_locations; substitute the name of your own index:

PUT my_ip_locations
{
  "mappings": {
    "properties": {
      "geoip": {
        "properties": {
          "location": { "type": "geo_point" }
        }
      }
    }
  }
}
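
With such a mapping in place, geoip.location can be used in geo queries. As a sketch, assuming a pipeline named geoip that uses the default city database and the default geoip target field, the requests below index a document into my_ip_locations and then filter by distance from a point. The document ID, distance, and coordinates are illustrative values:

```console
# Index a document through the geoip pipeline (assumed to exist)
PUT my_ip_locations/_doc/1?refresh=true&pipeline=geoip
{
  "ip": "8.8.8.8"
}

# Illustrative query: documents whose resolved location lies within 100km of the point
GET my_ip_locations/_search
{
  "query": {
    "bool": {
      "filter": {
        "geo_distance": {
          "distance": "100km",
          "geoip.location": {
            "lat": 37.751,
            "lon": -97.822
          }
        }
      }
    }
  }
}
```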

Node Settings

The geoip processor supports the following setting:

ingest.geoip.cache_size
The maximum number of results that should be cached. Defaults to 1000.

Note that this is a node setting and applies to all geoip processors, i.e. there is one cache shared by all defined geoip processors.