- Elasticsearch Guide: other versions:
- Getting Started
- Setup
- Breaking changes
- Breaking changes in 2.4
- Breaking changes in 2.3
- Breaking changes in 2.2
- Breaking changes in 2.1
- Breaking changes in 2.0
- Removed features
- Network changes
- Multiple
path.data
striping - Mapping changes
- CRUD and routing changes
- Query DSL changes
- Search changes
- Aggregation changes
- Parent/Child changes
- Scripting changes
- Index API changes
- Snapshot and Restore changes
- Plugin and packaging changes
- Setting changes
- Stats, info, and
cat
changes - Java API changes
- API Conventions
- Document APIs
- Search APIs
- Aggregations
- Metrics Aggregations
- Avg Aggregation
- Cardinality Aggregation
- Extended Stats Aggregation
- Geo Bounds Aggregation
- Geo Centroid Aggregation
- Max Aggregation
- Min Aggregation
- Percentiles Aggregation
- Percentile Ranks Aggregation
- Scripted Metric Aggregation
- Stats Aggregation
- Sum Aggregation
- Top hits Aggregation
- Value Count Aggregation
- Bucket Aggregations
- Children Aggregation
- Date Histogram Aggregation
- Date Range Aggregation
- Filter Aggregation
- Filters Aggregation
- Geo Distance Aggregation
- GeoHash grid Aggregation
- Global Aggregation
- Histogram Aggregation
- IPv4 Range Aggregation
- Missing Aggregation
- Nested Aggregation
- Range Aggregation
- Reverse nested Aggregation
- Sampler Aggregation
- Significant Terms Aggregation
- Terms Aggregation
- Pipeline Aggregations
- Avg Bucket Aggregation
- Derivative Aggregation
- Max Bucket Aggregation
- Min Bucket Aggregation
- Sum Bucket Aggregation
- Stats Bucket Aggregation
- Extended Stats Bucket Aggregation
- Percentiles Bucket Aggregation
- Moving Average Aggregation
- Cumulative Sum Aggregation
- Bucket Script Aggregation
- Bucket Selector Aggregation
- Serial Differencing Aggregation
- Caching heavy aggregations
- Returning only aggregation results
- Aggregation Metadata
- Metrics Aggregations
- Indices APIs
- Create Index
- Delete Index
- Get Index
- Indices Exists
- Open / Close Index API
- Put Mapping
- Get Mapping
- Get Field Mapping
- Types Exists
- Index Aliases
- Update Indices Settings
- Get Settings
- Analyze
- Index Templates
- Warmers
- Shadow replica indices
- Indices Stats
- Indices Segments
- Indices Recovery
- Indices Shard Stores
- Clear Cache
- Flush
- Refresh
- Force Merge
- Optimize
- Upgrade
- cat APIs
- Cluster APIs
- Query DSL
- Mapping
- Dots in field names
- Field datatypes
- Meta-Fields
- Mapping parameters
analyzer
boost
coerce
copy_to
doc_values
dynamic
enabled
fielddata
format
geohash
geohash_precision
geohash_prefix
ignore_above
ignore_malformed
include_in_all
index
index_options
lat_lon
fields
norms
null_value
position_increment_gap
precision_step
properties
search_analyzer
similarity
store
term_vector
- Dynamic Mapping
- Transform
- Analysis
- Analyzers
- Tokenizers
- Token Filters
- Standard Token Filter
- ASCII Folding Token Filter
- Length Token Filter
- Lowercase Token Filter
- Uppercase Token Filter
- NGram Token Filter
- Edge NGram Token Filter
- Porter Stem Token Filter
- Shingle Token Filter
- Stop Token Filter
- Word Delimiter Token Filter
- Stemmer Token Filter
- Stemmer Override Token Filter
- Keyword Marker Token Filter
- Keyword Repeat Token Filter
- KStem Token Filter
- Snowball Token Filter
- Phonetic Token Filter
- Synonym Token Filter
- Compound Word Token Filter
- Reverse Token Filter
- Elision Token Filter
- Truncate Token Filter
- Unique Token Filter
- Pattern Capture Token Filter
- Pattern Replace Token Filter
- Trim Token Filter
- Limit Token Count Token Filter
- Hunspell Token Filter
- Common Grams Token Filter
- Normalization Token Filter
- CJK Width Token Filter
- CJK Bigram Token Filter
- Delimited Payload Token Filter
- Keep Words Token Filter
- Keep Types Token Filter
- Classic Token Filter
- Apostrophe Token Filter
- Decimal Digit Token Filter
- Character Filters
- Modules
- Index Modules
- Testing
- Glossary of terms
- Release Notes
- 2.4.6 Release Notes
- 2.4.5 Release Notes
- 2.4.4 Release Notes
- 2.4.3 Release Notes
- 2.4.2 Release Notes
- 2.4.1 Release Notes
- 2.4.0 Release Notes
- 2.3.5 Release Notes
- 2.3.4 Release Notes
- 2.3.3 Release Notes
- 2.3.2 Release Notes
- 2.3.1 Release Notes
- 2.3.0 Release Notes
- 2.2.2 Release Notes
- 2.2.1 Release Notes
- 2.2.0 Release Notes
- 2.1.2 Release Notes
- 2.1.1 Release Notes
- 2.1.0 Release Notes
- 2.0.2 Release Notes
- 2.0.1 Release Notes
- 2.0.0 Release Notes
- 2.0.0-rc1 Release Notes
- 2.0.0-beta2 Release Notes
- 2.0.0-beta1 Release Notes
WARNING: Version 2.4 of Elasticsearch has passed its EOL date.
This documentation is no longer being maintained and may be removed. If you are running this version, we strongly advise you to upgrade. For the latest information, see the current release documentation.
Geo-Shape datatype
editGeo-Shape datatype
editThe geo_shape
datatype facilitates the indexing of and searching
with arbitrary geo shapes such as rectangles and polygons. It should be
used when either the data being indexed or the queries being executed
contain shapes other than just points.
You can query documents using this type using geo_shape Query.
Mapping Options
editThe geo_shape mapping maps geo_json geometry objects to the geo_shape type. To enable it, users must explicitly map fields to the geo_shape type.
Option | Description | Default |
---|---|---|
|
Name of the PrefixTree implementation to be used: |
|
|
This parameter may be used instead of |
|
|
Maximum number of layers to be used by the PrefixTree.
This can be used to control the precision of shape representations and
therefore how many terms are indexed. Defaults to the default value of
the chosen PrefixTree implementation. Since this parameter requires a
certain level of understanding of the underlying implementation, users
may use the |
|
|
The strategy parameter defines the approach for how to
represent shapes at indexing and search time. It also influences the
capabilities available so it is recommended to let Elasticsearch set
this parameter automatically. There are two strategies available:
|
|
|
Used as a hint to the PrefixTree about how
precise it should be. Defaults to 0.025 (2.5%) with 0.5 as the maximum
supported value. PERFORMANCE NOTE: This value will default to 0 if a |
|
|
Optionally define how to interpret vertex order for
polygons / multipolygons. This parameter defines one of two coordinate
system rules (Right-hand or Left-hand) each of which can be specified in three
different ways. 1. Right-hand rule: |
|
|
Setting this option to |
|
Prefix trees
editTo efficiently represent shapes in the index, Shapes are converted into a series of hashes representing grid squares (commonly referred to as "rasters") using implementations of a PrefixTree. The tree notion comes from the fact that the PrefixTree uses multiple grid layers, each with an increasing level of precision to represent the Earth. This can be thought of as increasing the level of detail of a map or image at higher zoom levels.
Multiple PrefixTree implementations are provided:
- GeohashPrefixTree - Uses geohashes for grid squares. Geohashes are base32 encoded strings of the bits of the latitude and longitude interleaved. So the longer the hash, the more precise it is. Each character added to the geohash represents another tree level and adds 5 bits of precision to the geohash. A geohash represents a rectangular area and has 32 sub rectangles. The maximum amount of levels in Elasticsearch is 24.
- QuadPrefixTree - Uses a quadtree for grid squares. Similar to geohash, quad trees interleave the bits of the latitude and longitude the resulting hash is a bit set. A tree level in a quad tree represents 2 bits in this bit set, one for each coordinate. The maximum amount of levels for the quad trees in Elasticsearch is 50.
Spatial strategies
editThe PrefixTree implementations rely on a SpatialStrategy for decomposing the provided Shape(s) into approximated grid squares. Each strategy answers the following:
- What type of Shapes can be indexed?
- What types of Query Operations and Shapes can be used?
- Does it support more than one Shape per field?
The following Strategy implementations (with corresponding capabilities) are provided:
Strategy | Supported Shapes | Supported Queries | Multiple Shapes |
---|---|---|---|
|
|
Yes |
|
|
|
Yes |
Accuracy
editGeo_shape does not provide 100% accuracy and depending on how it is configured it may return some false positives or false negatives for certain queries. To mitigate this, it is important to select an appropriate value for the tree_levels parameter and to adjust expectations accordingly. For example, a point may be near the border of a particular grid cell and may thus not match a query that only matches the cell right next to it — even though the shape is very close to the point.
Example
edit{ "properties": { "location": { "type": "geo_shape", "tree": "quadtree", "precision": "1m" } } }
This mapping maps the location field to the geo_shape type using the quad_tree implementation and a precision of 1m. Elasticsearch translates this into a tree_levels setting of 26.
Performance considerations
editElasticsearch uses the paths in the prefix tree as terms in the index and in queries. The higher the level is (and thus the precision), the more terms are generated. Of course, calculating the terms, keeping them in memory, and storing them on disk all have a price. Especially with higher tree levels, indices can become extremely large even with a modest amount of data. Additionally, the size of the features also matters. Big, complex polygons can take up a lot of space at higher tree levels. Which setting is right depends on the use case. Generally one trades off accuracy against index size and query performance.
The defaults in Elasticsearch for both implementations are a compromise between index size and a reasonable level of precision of 50m at the equator. This allows for indexing tens of millions of shapes without overly bloating the resulting index too much relative to the input size.
Input Structure
editThe GeoJSON format is used to represent shapes as input as follows:
GeoJSON Type | Elasticsearch Type | Description |
---|---|---|
|
|
A single geographic coordinate. |
|
|
An arbitrary line given two or more points. |
|
|
A closed polygon whose first and last point
must match, thus requiring |
|
|
An array of unconnected, but likely related points. |
|
|
An array of separate linestrings. |
|
|
An array of separate polygons. |
|
|
A GeoJSON shape similar to the
|
|
|
A bounding rectangle, or envelope, specified by specifying only the top left and bottom right points. |
|
|
A circle specified by a center point and radius with
units, which default to |
For all types, both the inner type
and coordinates
fields are
required.
In GeoJSON, and therefore Elasticsearch, the correct coordinate order is longitude, latitude (X, Y) within coordinate arrays. This differs from many Geospatial APIs (e.g., Google Maps) that generally use the colloquial latitude, longitude (Y, X).
A point is a single geographic coordinate, such as the location of a building or the current position given by a smartphone’s Geolocation API.
{ "location" : { "type" : "point", "coordinates" : [-77.03653, 38.897676] } }
A linestring
defined by an array of two or more positions. By
specifying only two points, the linestring
will represent a straight
line. Specifying more than two points creates an arbitrary path.
{ "location" : { "type" : "linestring", "coordinates" : [[-77.03653, 38.897676], [-77.009051, 38.889939]] } }
The above linestring
would draw a straight line starting at the White
House to the US Capitol Building.
A polygon is defined by a list of a list of points. The first and last points in each (outer) list must be the same (the polygon must be closed).
{ "location" : { "type" : "polygon", "coordinates" : [ [ [100.0, 0.0], [101.0, 0.0], [101.0, 1.0], [100.0, 1.0], [100.0, 0.0] ] ] } }
The first array represents the outer boundary of the polygon, the other arrays represent the interior shapes ("holes"):
{ "location" : { "type" : "polygon", "coordinates" : [ [ [100.0, 0.0], [101.0, 0.0], [101.0, 1.0], [100.0, 1.0], [100.0, 0.0] ], [ [100.2, 0.2], [100.8, 0.2], [100.8, 0.8], [100.2, 0.8], [100.2, 0.2] ] ] } }
IMPORTANT NOTE: GeoJSON does not mandate a specific order for vertices thus ambiguous polygons around the dateline and poles are possible. To alleviate ambiguity the Open Geospatial Consortium (OGC) Simple Feature Access specification defines the following vertex ordering:
- Outer Ring - Counterclockwise
- Inner Ring(s) / Holes - Clockwise
For polygons that do not cross the dateline, vertex order will not matter in Elasticsearch. For polygons that do cross the dateline, Elasticsearch requires vertex ordering to comply with the OGC specification. Otherwise, an unintended polygon may be created and unexpected query/filter results will be returned.
The following provides an example of an ambiguous polygon. Elasticsearch will apply OGC standards to eliminate ambiguity resulting in a polygon that crosses the dateline.
{ "location" : { "type" : "polygon", "coordinates" : [ [ [-177.0, 10.0], [176.0, 15.0], [172.0, 0.0], [176.0, -15.0], [-177.0, -10.0], [-177.0, 10.0] ], [ [178.2, 8.2], [-178.8, 8.2], [-180.8, -8.8], [178.2, 8.8] ] ] } }
An orientation
parameter can be defined when setting the geo_shape mapping (see Mapping Options). This will define vertex
order for the coordinate list on the mapped geo_shape field. It can also be overridden on each document. The following is an example for
overriding the orientation on a document:
{ "location" : { "type" : "polygon", "orientation" : "clockwise", "coordinates" : [ [ [-177.0, 10.0], [176.0, 15.0], [172.0, 0.0], [176.0, -15.0], [-177.0, -10.0], [-177.0, 10.0] ], [ [178.2, 8.2], [-178.8, 8.2], [-180.8, -8.8], [178.2, 8.8] ] ] } }
A list of geojson points.
{ "location" : { "type" : "multipoint", "coordinates" : [ [102.0, 2.0], [103.0, 2.0] ] } }
A list of geojson linestrings.
{ "location" : { "type" : "multilinestring", "coordinates" : [ [ [102.0, 2.0], [103.0, 2.0], [103.0, 3.0], [102.0, 3.0] ], [ [100.0, 0.0], [101.0, 0.0], [101.0, 1.0], [100.0, 1.0] ], [ [100.2, 0.2], [100.8, 0.2], [100.8, 0.8], [100.2, 0.8] ] ] } }
A list of geojson polygons.
{ "location" : { "type" : "multipolygon", "coordinates" : [ [ [[102.0, 2.0], [103.0, 2.0], [103.0, 3.0], [102.0, 3.0], [102.0, 2.0]] ], [ [[100.0, 0.0], [101.0, 0.0], [101.0, 1.0], [100.0, 1.0], [100.0, 0.0]], [[100.2, 0.2], [100.8, 0.2], [100.8, 0.8], [100.2, 0.8], [100.2, 0.2]] ] ] } }
A collection of geojson geometry objects.
{ "location" : { "type": "geometrycollection", "geometries": [ { "type": "point", "coordinates": [100.0, 0.0] }, { "type": "linestring", "coordinates": [ [101.0, 0.0], [102.0, 1.0] ] } ] } }
Envelope
editElasticsearch supports an envelope
type, which consists of coordinates
for upper left and lower right points of the shape to represent a
bounding rectangle:
{ "location" : { "type" : "envelope", "coordinates" : [ [-45.0, 45.0], [45.0, -45.0] ] } }
Circle
editElasticsearch supports a circle
type, which consists of a center
point with a radius:
{ "location" : { "type" : "circle", "coordinates" : [-45.0, 45.0], "radius" : "100m" } }
Note: The inner radius
field is required. If not specified, then
the units of the radius
will default to METERS
.
Sorting and Retrieving index Shapes
editDue to the complex input structure and index representation of shapes,
it is not currently possible to sort shapes or retrieve their fields
directly. The geo_shape value is only retrievable through the _source
field.
On this page