Map custom regions with reverse geocoding

Maps comes with predefined regions that allow you to quickly visualize regions by metrics. Maps also offers the ability to map your own regions. You can use any region data you’d like, as long as your source data contains an identifier for the corresponding region.

But how can you map regions when your source data does not contain a region identifier? This is where reverse geocoding comes in. Reverse geocoding is the process of assigning a region identifier to a feature based on its location.
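
To make the idea concrete, here is a minimal sketch of reverse geocoding in plain Python. It is an illustration only, not how Elasticsearch implements geo matching: the region identifiers and rectangular rings are invented, and the point-in-polygon test is a basic ray-casting check that does not handle edge cases such as points lying exactly on a boundary.

```python
def point_in_polygon(lon, lat, ring):
    """Ray-casting test: is (lon, lat) inside the ring of (lon, lat) vertices?"""
    inside = False
    # Walk each edge (previous vertex -> current vertex) of the ring.
    for i in range(len(ring)):
        x1, y1 = ring[i - 1]
        x2, y2 = ring[i]
        # Only edges that straddle the point's latitude can be crossed
        # by a horizontal ray starting at the point.
        if (y1 > lat) != (y2 > lat):
            crossing_lon = x1 + (lat - y1) * (x2 - x1) / (y2 - y1)
            if lon < crossing_lon:
                inside = not inside
    return inside


def reverse_geocode(lon, lat, regions):
    """Return the identifier of the first region containing the point, or None."""
    for region in regions:
        if point_in_polygon(lon, lat, region["ring"]):
            return region["id"]
    return None


# Hypothetical regions: simple rectangles, not real CSA boundaries.
REGIONS = [
    {"id": "region-a", "ring": [(0, 0), (10, 0), (10, 10), (0, 10)]},
    {"id": "region-b", "ring": [(20, 0), (30, 0), (30, 10), (20, 10)]},
]

print(reverse_geocode(5, 5, REGIONS))    # point inside region-a
print(reverse_geocode(25, 5, REGIONS))   # point inside region-b
print(reverse_geocode(15, 5, REGIONS))   # point outside every region
```

A point that falls outside every region gets no identifier, which mirrors what happens later in this tutorial to web traffic that is not located in any CSA.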

In this tutorial, you’ll use reverse geocoding to visualize United States Census Bureau Combined Statistical Area (CSA) regions by web traffic.

You’ll learn to:

  • Upload custom regions.
  • Reverse geocode with the Elasticsearch enrich processor.
  • Create a map and visualize CSA regions by web traffic.

When you complete this tutorial, you’ll have a map that looks like this:

[Figure: CSA regions by web traffic]

Step 1: Index web traffic data

GeoIP is a common way of transforming an IP address to a longitude and latitude. GeoIP is roughly accurate at the city level globally and at the neighborhood level in selected countries. It’s not as good as an actual GPS location from your phone, but it’s much more precise than just a country, state, or province.

You’ll use the web logs sample data set that comes with Kibana for this tutorial. The web logs sample data set already contains longitude and latitude. If your own web log data does not contain longitude and latitude, use the GeoIP processor to transform an IP address into a geo_point field.

To install the web logs sample data set:

  1. On the home page, click Try sample data.
  2. On the Sample web logs card, click Add data.

Step 2: Index Combined Statistical Area (CSA) regions

The level of detail that GeoIP provides is very useful for driving decision-making. For example, say you want to spin up a marketing campaign based on the locations of your users, or show executive stakeholders which metro areas are experiencing an uptick in traffic.

That kind of scale in the United States is often captured with what the Census Bureau calls the Combined Statistical Area (CSA). A CSA roughly corresponds to how people intuitively think of the urban area they live in. It does not necessarily coincide with state or city boundaries.

CSAs generally share the same telecom providers and ad networks. New fast food franchises expand to a CSA rather than a particular city or municipality. Basically, people in the same CSA shop in the same IKEA.

To get the CSA boundary data:

  1. Download the Cartographic Boundary shapefile (.shp) from the Census Bureau’s website.
  2. To use the data in Kibana, convert it to GeoJSON format. Follow this helpful tutorial to use QGIS to convert the Cartographic Boundary shapefile to GeoJSON. Or, download a prebuilt GeoJSON version.
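
Before uploading, it can help to confirm that the converted file is a GeoJSON FeatureCollection and that every feature carries the GEOID and NAME properties used later in this tutorial. The following is a minimal sketch using only the Python standard library; the check_regions helper and the sample values are hypothetical and stand in for the real CSA file.

```python
import json


def check_regions(geojson, required=("GEOID", "NAME")):
    """Return a list of problems found in a GeoJSON FeatureCollection (empty if none)."""
    if geojson.get("type") != "FeatureCollection":
        return ["top-level type is not FeatureCollection"]
    problems = []
    for index, feature in enumerate(geojson.get("features", [])):
        properties = feature.get("properties") or {}
        for name in required:
            if not properties.get(name):
                problems.append(f"feature {index} is missing property {name}")
        if feature.get("geometry") is None:
            problems.append(f"feature {index} has no geometry")
    return problems


# Tiny stand-in for the converted file; the values are placeholders, not real CSA data.
sample = json.loads("""
{
  "type": "FeatureCollection",
  "features": [
    {
      "type": "Feature",
      "properties": {"GEOID": "000", "NAME": "Example region"},
      "geometry": {"type": "Polygon",
                   "coordinates": [[[0, 0], [1, 0], [1, 1], [0, 0]]]}
    },
    {
      "type": "Feature",
      "properties": {"NAME": "Region without an identifier"},
      "geometry": null
    }
  ]
}
""")

for problem in check_regions(sample):
    print(problem)
```

To check your own file, replace sample with the result of calling json.load on the opened GeoJSON file.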

Once you have your GeoJSON file:

  1. Open the main menu, and click Maps.
  2. Click Create map.
  3. Click Add layer.
  4. Click Upload GeoJSON.
  5. Use the file chooser to import the CSA GeoJSON file.
  6. Set index name to csa and click Import file.
  7. When importing is complete, click Add as document layer.
  8. Add Tooltip fields:

    1. Click + Add to open field select.
    2. Select NAME, GEOID, and AFFGEOID.
    3. Click Add.
  9. Click Save & close.

Looking at the map, you get a sense of what constitutes a metro area in the eyes of the Census Bureau.

[Figure: CSA regions]

Step 3: Reverse geocoding

To visualize CSA regions by web log traffic, the web log data must contain a CSA region identifier. You’ll use the Elasticsearch enrich processor to add CSA region identifiers to the web logs sample data set. You can skip this step if your source data already contains region identifiers.

  1. Open the main menu, then click Dev Tools.
  2. In Console, create a geo_match enrichment policy:

    PUT /_enrich/policy/csa_lookup
    {
      "geo_match": {
        "indices": "csa",
        "match_field": "coordinates",
        "enrich_fields": [ "GEOID", "NAME"]
      }
    }
  3. To initialize the policy, run:

    POST /_enrich/policy/csa_lookup/_execute
  4. To create an ingest pipeline, run:

    PUT _ingest/pipeline/lonlat-to-csa
    {
      "description": "Reverse geocode longitude-latitude to combined statistical area",
      "processors": [
        {
          "enrich": {
            "field": "geo.coordinates",
            "policy_name": "csa_lookup",
            "target_field": "csa",
            "ignore_missing": true,
            "ignore_failure": true,
            "description": "Lookup the csa identifier"
          }
        },
        {
          "remove": {
            "field": "csa.coordinates",
            "ignore_missing": true,
            "ignore_failure": true,
            "description": "Remove the shape field"
          }
        }
      ]
    }
  5. To update your existing data, run:

    POST kibana_sample_data_logs/_update_by_query?pipeline=lonlat-to-csa
  6. To run the pipeline on new documents at ingest, run:

    PUT kibana_sample_data_logs/_settings
    {
      "index": {
        "default_pipeline": "lonlat-to-csa"
      }
    }
  7. Open the main menu, and click Discover.
  8. Set the index pattern to kibana_sample_data_logs.
  9. Open the time filter, and set the time range to the last 30 days.
  10. Scan through the list of Available fields until you find the csa.GEOID field. You can also search for the field by name.
  11. Click the Add icon to toggle the field into the document table.
  12. Find the csa.NAME field and add it to your document table.

Your web log data now contains the csa.GEOID and csa.NAME fields from the matching CSA region. Web log traffic that is not located in a CSA region has no values for the csa.GEOID and csa.NAME fields.
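
The sketch below imitates, in plain Python, what the two processors in the lonlat-to-csa pipeline do to a single document. It is a simplified model rather than Elasticsearch's implementation: region shapes are reduced to bounding boxes, the sample values are invented, and it assumes that the enrich processor copies the policy's match field along with the enrich fields, which would explain why the pipeline then removes csa.coordinates.

```python
import copy

# Hypothetical region documents; "coordinates" stands in for the region shape
# and is simplified to a bounding box (min_lon, min_lat, max_lon, max_lat).
CSA_INDEX = [
    {"GEOID": "000", "NAME": "Example region", "AFFGEOID": "X000",
     "coordinates": (-10.0, -10.0, 10.0, 10.0)},
]


def enrich(doc, regions, enrich_fields=("GEOID", "NAME")):
    """Model of the enrich processor: copy the match field and the enrich
    fields of the first matching region into doc["csa"]."""
    point = doc.get("geo", {}).get("coordinates")
    if point is None:  # comparable to "ignore_missing": true
        return doc
    for region in regions:
        min_lon, min_lat, max_lon, max_lat = region["coordinates"]
        if min_lon <= point["lon"] <= max_lon and min_lat <= point["lat"] <= max_lat:
            doc["csa"] = {"coordinates": region["coordinates"]}
            doc["csa"].update({name: region[name] for name in enrich_fields})
            break
    return doc


def remove_shape(doc):
    """Model of the remove processor: drop the copied shape, keep the identifiers."""
    doc.get("csa", {}).pop("coordinates", None)
    return doc


def run_pipeline(doc, regions=CSA_INDEX):
    # Work on a copy so the caller's document is left untouched.
    return remove_shape(enrich(copy.deepcopy(doc), regions))


inside = {"geo": {"coordinates": {"lat": 1.0, "lon": 2.0}}}
outside = {"geo": {"coordinates": {"lat": 50.0, "lon": 50.0}}}

print(run_pipeline(inside))   # gains a csa object with GEOID and NAME
print(run_pipeline(outside))  # unchanged: no region matched
```

In this model, only GEOID and NAME survive under csa because they are the fields listed in enrich_fields; AFFGEOID is never copied, and the shape is removed again so that each web log document stays small.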

[Figure: Enriched web log data in Discover]

Step 4: Visualize Combined Statistical Area (CSA) regions by web traffic

Now that your web traffic data contains CSA region identifiers, you’ll visualize CSA regions by web traffic.

  1. Open the main menu, and click Maps.
  2. Click Create map.
  3. Click Add layer.
  4. Click Choropleth.
  5. For Boundaries source:

    1. Select Points, lines, and polygons from Elasticsearch.
    2. Set Index pattern to csa.
    3. Set Join field to GEOID.
  6. For Statistics source:

    1. Set Index pattern to kibana_sample_data_logs.
    2. Set Join field to csa.GEOID.keyword.
  7. Click Add layer.
  8. Scroll to Layer Style and set Label to Fixed.
  9. Click Save & close.
  10. Save the map.

    1. Give the map a title.
    2. Under Add to dashboard, select None.
    3. Click Save and add to library.

[Figure: CSA regions by web traffic]
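
Conceptually, the choropleth layer joins the two sources on the fields you selected: web log documents are counted per csa.GEOID value, and each count is attached to the boundary with the same GEOID. The following plain-Python sketch illustrates that join with invented documents. It is not the Maps implementation; Kibana and Elasticsearch do the equivalent work for you.

```python
from collections import Counter

# Invented boundary documents (the csa index) and enriched web log documents.
boundaries = [
    {"GEOID": "000", "NAME": "Example region A"},
    {"GEOID": "111", "NAME": "Example region B"},
]
web_logs = [
    {"csa": {"GEOID": "000"}},
    {"csa": {"GEOID": "000"}},
    {"csa": {"GEOID": "111"}},
    {},  # traffic outside every region has no csa field
]

# Statistics source: count documents per join field value (csa.GEOID).
counts = Counter(
    doc["csa"]["GEOID"] for doc in web_logs if "GEOID" in doc.get("csa", {})
)

# Boundaries source: attach the count to each region with a matching GEOID.
joined = [
    {**region, "web_traffic": counts.get(region["GEOID"], 0)}
    for region in boundaries
]

for region in joined:
    print(region["NAME"], region["web_traffic"])
```

The join only works when both sides hold the same identifier values, which is why the reverse geocoding step writes the region's GEOID into each web log document.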

Congratulations! You have completed the tutorial and have the recipe for visualizing custom regions. You can now try replicating this same analysis with your own data.