Suggesters

Suggests similar looking terms based on a provided text by using a suggester.
resp = client.search( index="my-index-000001", query={ "match": { "message": "tring out Elasticsearch" } }, suggest={ "my-suggestion": { "text": "tring out Elasticsearch", "term": { "field": "message" } } }, ) print(resp)
response = client.search( index: 'my-index-000001', body: { query: { match: { message: 'tring out Elasticsearch' } }, suggest: { "my-suggestion": { text: 'tring out Elasticsearch', term: { field: 'message' } } } } ) puts response
const response = await client.search({ index: "my-index-000001", query: { match: { message: "tring out Elasticsearch", }, }, suggest: { "my-suggestion": { text: "tring out Elasticsearch", term: { field: "message", }, }, }, }); console.log(response);
POST my-index-000001/_search { "query" : { "match": { "message": "tring out Elasticsearch" } }, "suggest" : { "my-suggestion" : { "text" : "tring out Elasticsearch", "term" : { "field" : "message" } } } }
Request

The suggest feature suggests similar looking terms based on a provided text by using a suggester. The suggest request part is defined alongside the query part in a _search request. If the query part is left out, only suggestions are returned.
Examples

Several suggestions can be specified per request. Each suggestion is identified with an arbitrary name. In the example below two suggestions are requested. Both the my-suggest-1 and my-suggest-2 suggestions use the term suggester, but have a different text.
resp = client.search( suggest={ "my-suggest-1": { "text": "tring out Elasticsearch", "term": { "field": "message" } }, "my-suggest-2": { "text": "kmichy", "term": { "field": "user.id" } } }, ) print(resp)
response = client.search( body: { suggest: { "my-suggest-1": { text: 'tring out Elasticsearch', term: { field: 'message' } }, "my-suggest-2": { text: 'kmichy', term: { field: 'user.id' } } } } ) puts response
const response = await client.search({ suggest: { "my-suggest-1": { text: "tring out Elasticsearch", term: { field: "message", }, }, "my-suggest-2": { text: "kmichy", term: { field: "user.id", }, }, }, }); console.log(response);
POST _search { "suggest": { "my-suggest-1" : { "text" : "tring out Elasticsearch", "term" : { "field" : "message" } }, "my-suggest-2" : { "text" : "kmichy", "term" : { "field" : "user.id" } } } }
The below suggest response example includes the suggestion responses for my-suggest-1 and my-suggest-2. Each suggestion part contains entries. Each entry is effectively a token from the suggest text and contains the suggestion entry text, the original start offset and length in the suggest text, and, if found, an arbitrary number of options.
{ "_shards": ... "hits": ... "took": 2, "timed_out": false, "suggest": { "my-suggest-1": [ { "text": "tring", "offset": 0, "length": 5, "options": [ {"text": "trying", "score": 0.8, "freq": 1 } ] }, { "text": "out", "offset": 6, "length": 3, "options": [] }, { "text": "elasticsearch", "offset": 10, "length": 13, "options": [] } ], "my-suggest-2": ... } }
Each options array contains an option object that includes the suggested text, its document frequency, and its score compared to the suggest entry text. The meaning of the score depends on the suggester used. The term suggester's score is based on the edit distance.
Global suggest text

To avoid repetition of the suggest text, it is possible to define a global text. In the example below the suggest text is defined globally and applies to the my-suggest-1 and my-suggest-2 suggestions.
$params = [ 'body' => [ 'suggest' => [ 'text' => 'tring out Elasticsearch', 'my-suggest-1' => [ 'term' => [ 'field' => 'message', ], ], 'my-suggest-2' => [ 'term' => [ 'field' => 'user', ], ], ], ], ]; $response = $client->search($params);
resp = client.search( suggest={ "text": "tring out Elasticsearch", "my-suggest-1": { "term": { "field": "message" } }, "my-suggest-2": { "term": { "field": "user" } } }, ) print(resp)
response = client.search( body: { suggest: { text: 'tring out Elasticsearch', "my-suggest-1": { term: { field: 'message' } }, "my-suggest-2": { term: { field: 'user' } } } } ) puts response
res, err := es.Search( es.Search.WithBody(strings.NewReader(`{ "suggest": { "text": "tring out Elasticsearch", "my-suggest-1": { "term": { "field": "message" } }, "my-suggest-2": { "term": { "field": "user" } } } }`)), es.Search.WithPretty(), ) fmt.Println(res, err)
const response = await client.search({ suggest: { text: "tring out Elasticsearch", "my-suggest-1": { term: { field: "message", }, }, "my-suggest-2": { term: { field: "user", }, }, }, }); console.log(response);
POST _search { "suggest": { "text" : "tring out Elasticsearch", "my-suggest-1" : { "term" : { "field" : "message" } }, "my-suggest-2" : { "term" : { "field" : "user" } } } }
In the above example, the suggest text can also be specified as a suggestion-specific option. The suggest text specified at the suggestion level overrides the suggest text at the global level.
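For example, the following sketch reuses the fields from the earlier examples: a global text is set, the first suggestion inherits it, and the second suggestion overrides it with its own text.

resp = client.search(
    suggest={
        "text": "tring out Elasticsearch",  # global suggest text
        "my-suggest-1": {
            "term": {"field": "message"}    # inherits the global text
        },
        "my-suggest-2": {
            "text": "kmichy",               # overrides the global text
            "term": {"field": "user.id"}
        }
    },
)
print(resp)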
Term suggester

The term suggester suggests terms based on edit distance. The provided suggest text is analyzed before terms are suggested. The suggested terms are provided per analyzed suggest text token. The term suggester doesn't take into account the query that is part of the request.
Common suggest options:

text
    The suggest text. The suggest text is a required option that needs to be set globally or per suggestion.

field
    The field to fetch the candidate suggestions from. This is a required option that either needs to be set globally or per suggestion.

analyzer
    The analyzer to analyse the suggest text with. Defaults to the search analyzer of the suggest field.

size
    The maximum corrections to be returned per suggest text token.

sort
    Defines how suggestions should be sorted per suggest text term. Two possible values:

    score: Sort by score first, then document frequency, and then the term itself.
    frequency: Sort by document frequency first, then similarity score, and then the term itself.

suggest_mode
    The suggest mode controls what suggestions are included or for what suggest text terms suggestions should be suggested. Three possible values can be specified:

    missing: Only provide suggestions for suggest text terms that are not in the index. This is the default.
    popular: Only suggest suggestions that occur in more docs than the original suggest text term.
    always: Suggest any matching suggestions based on terms in the suggest text.
Other term suggest options:

max_edits
    The maximum edit distance candidate suggestions can have in order to be considered as a suggestion. Can only be a value between 1 and 2. Any other value results in a bad request error being thrown. Defaults to 2.

prefix_length
    The number of minimal prefix characters that must match in order to be a candidate for suggestions. Defaults to 1. Increasing this number improves spellcheck performance. Usually misspellings don't occur in the beginning of terms.

min_word_length
    The minimum length a suggest text term must have in order to be included. Defaults to 4.

shard_size
    Sets the maximum number of suggestions to be retrieved from each individual shard. During the reduce phase only the top N suggestions are returned based on the size option. Setting this to a value higher than size can be useful in order to get a more accurate document frequency for spelling corrections at the cost of performance.

max_inspections
    A factor that is used to multiply with the shard_size in order to inspect more candidate spelling corrections on the shard level. Can improve accuracy at the cost of performance. Defaults to 5.

min_doc_freq
    The minimal threshold in number of documents a suggestion should appear in. This can be specified as an absolute number or as a relative percentage of number of documents. This can improve quality by only suggesting high frequency terms. Defaults to 0f and is not enabled. If a value higher than 1 is specified, then the number cannot be fractional. The shard level document frequencies are used for this option.

max_term_freq
    The maximum threshold in number of documents in which a suggest text token can exist in order to be included. Can be a relative percentage number (e.g., 0.4) or an absolute number to represent document frequencies. If a value higher than 1 is specified, then fractional can not be specified. Defaults to 0.01f. This can be used to exclude high frequency terms, which are usually spelled correctly, from being spellchecked. This also improves the spellcheck performance. The shard level document frequencies are used for this option.

string_distance
    Which string distance implementation to use for comparing how similar suggested terms are. Five possible values can be specified:

    internal: The default, based on damerau_levenshtein but highly optimized for comparing string distance for terms inside the index.
    damerau_levenshtein: String distance algorithm based on the Damerau-Levenshtein algorithm.
    levenshtein: String distance algorithm based on the Levenshtein edit distance algorithm.
    jaro_winkler: String distance algorithm based on the Jaro-Winkler algorithm.
    ngram: String distance algorithm based on character n-grams.
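To see several of these options together, here is a minimal sketch against the my-index-000001 example index from above; the specific option values are illustrative, not recommendations.

resp = client.search(
    index="my-index-000001",
    suggest={
        "my-suggestion": {
            "text": "tring out Elasticsearch",
            "term": {
                "field": "message",
                "suggest_mode": "popular",  # only suggest terms more frequent than the input term
                "sort": "frequency",        # order candidates by document frequency
                "max_edits": 1,             # tighter edit-distance budget than the default of 2
                "min_word_length": 3        # also consider shorter terms (default is 4)
            }
        }
    },
)
print(resp)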
Phrase Suggester

The term suggester provides a very convenient API to access word alternatives on a per token basis within a certain string distance. The API allows accessing each token in the stream individually while suggest-selection is left to the API consumer. Yet, often pre-selected suggestions are required in order to present to the end-user. The phrase suggester adds additional logic on top of the term suggester to select entire corrected phrases instead of individual tokens weighted based on ngram-language models. In practice this suggester will be able to make better decisions about which tokens to pick based on co-occurrence and frequencies.
API Example

In general the phrase suggester requires special mapping up front to work. The phrase suggester examples on this page need the following mapping to work. The reverse analyzer is used only in the last example.
resp = client.indices.create( index="test", settings={ "index": { "number_of_shards": 1, "analysis": { "analyzer": { "trigram": { "type": "custom", "tokenizer": "standard", "filter": [ "lowercase", "shingle" ] }, "reverse": { "type": "custom", "tokenizer": "standard", "filter": [ "lowercase", "reverse" ] } }, "filter": { "shingle": { "type": "shingle", "min_shingle_size": 2, "max_shingle_size": 3 } } } } }, mappings={ "properties": { "title": { "type": "text", "fields": { "trigram": { "type": "text", "analyzer": "trigram" }, "reverse": { "type": "text", "analyzer": "reverse" } } } } }, ) print(resp) resp1 = client.index( index="test", refresh=True, document={ "title": "noble warriors" }, ) print(resp1) resp2 = client.index( index="test", refresh=True, document={ "title": "nobel prize" }, ) print(resp2)
response = client.indices.create( index: 'test', body: { settings: { index: { number_of_shards: 1, analysis: { analyzer: { trigram: { type: 'custom', tokenizer: 'standard', filter: [ 'lowercase', 'shingle' ] }, reverse: { type: 'custom', tokenizer: 'standard', filter: [ 'lowercase', 'reverse' ] } }, filter: { shingle: { type: 'shingle', min_shingle_size: 2, max_shingle_size: 3 } } } } }, mappings: { properties: { title: { type: 'text', fields: { trigram: { type: 'text', analyzer: 'trigram' }, reverse: { type: 'text', analyzer: 'reverse' } } } } } } ) puts response response = client.index( index: 'test', refresh: true, body: { title: 'noble warriors' } ) puts response response = client.index( index: 'test', refresh: true, body: { title: 'nobel prize' } ) puts response
const response = await client.indices.create({ index: "test", settings: { index: { number_of_shards: 1, analysis: { analyzer: { trigram: { type: "custom", tokenizer: "standard", filter: ["lowercase", "shingle"], }, reverse: { type: "custom", tokenizer: "standard", filter: ["lowercase", "reverse"], }, }, filter: { shingle: { type: "shingle", min_shingle_size: 2, max_shingle_size: 3, }, }, }, }, }, mappings: { properties: { title: { type: "text", fields: { trigram: { type: "text", analyzer: "trigram", }, reverse: { type: "text", analyzer: "reverse", }, }, }, }, }, }); console.log(response); const response1 = await client.index({ index: "test", refresh: "true", document: { title: "noble warriors", }, }); console.log(response1); const response2 = await client.index({ index: "test", refresh: "true", document: { title: "nobel prize", }, }); console.log(response2);
PUT test { "settings": { "index": { "number_of_shards": 1, "analysis": { "analyzer": { "trigram": { "type": "custom", "tokenizer": "standard", "filter": ["lowercase","shingle"] }, "reverse": { "type": "custom", "tokenizer": "standard", "filter": ["lowercase","reverse"] } }, "filter": { "shingle": { "type": "shingle", "min_shingle_size": 2, "max_shingle_size": 3 } } } } }, "mappings": { "properties": { "title": { "type": "text", "fields": { "trigram": { "type": "text", "analyzer": "trigram" }, "reverse": { "type": "text", "analyzer": "reverse" } } } } } } POST test/_doc?refresh=true {"title": "noble warriors"} POST test/_doc?refresh=true {"title": "nobel prize"}
Once you have the analyzers and mappings set up you can use the phrase suggester in the same spot you'd use the term suggester:
resp = client.search( index="test", suggest={ "text": "noble prize", "simple_phrase": { "phrase": { "field": "title.trigram", "size": 1, "gram_size": 3, "direct_generator": [ { "field": "title.trigram", "suggest_mode": "always" } ], "highlight": { "pre_tag": "<em>", "post_tag": "</em>" } } } }, ) print(resp)
const response = await client.search({ index: "test", suggest: { text: "noble prize", simple_phrase: { phrase: { field: "title.trigram", size: 1, gram_size: 3, direct_generator: [ { field: "title.trigram", suggest_mode: "always", }, ], highlight: { pre_tag: "<em>", post_tag: "</em>", }, }, }, }, }); console.log(response);
POST test/_search { "suggest": { "text": "noble prize", "simple_phrase": { "phrase": { "field": "title.trigram", "size": 1, "gram_size": 3, "direct_generator": [ { "field": "title.trigram", "suggest_mode": "always" } ], "highlight": { "pre_tag": "<em>", "post_tag": "</em>" } } } } }
The response contains suggestions scored by the most likely spelling correction first. In this case we received the expected correction "nobel prize".
{ "_shards": ... "hits": ... "timed_out": false, "took": 3, "suggest": { "simple_phrase" : [ { "text" : "noble prize", "offset" : 0, "length" : 11, "options" : [ { "text" : "nobel prize", "highlighted": "<em>nobel</em> prize", "score" : 0.48614594 }] } ] } }
Basic Phrase suggest API parameters

field
    The name of the field used to do n-gram lookups for the language model; the suggester will use this field to gain statistics to score corrections. This field is mandatory.

gram_size
    Sets the max size of the n-grams (shingles) in the field. If the field doesn't contain n-grams (shingles), this should be omitted or set to 1. Note that Elasticsearch tries to detect the gram size based on the specified field. If the field uses a shingle filter, the gram_size is set to the max_shingle_size if not explicitly set.

real_word_error_likelihood
    The likelihood of a term being misspelled even if the term exists in the dictionary. The default is 0.95, meaning 5% of the real words are misspelled.

confidence
    The confidence level defines a factor applied to the input phrases score which is used as a threshold for other suggest candidates. Only candidates that score higher than the threshold will be included in the result. For instance a confidence level of 1.0 will only return suggestions that score higher than the input phrase. If set to 0.0 the top N candidates are returned. The default is 1.0.

max_errors
    The maximum percentage of the terms considered to be misspellings in order to form a correction. This method accepts a float value in the range [0..1) as a fraction of the actual query terms or a number >= 1 as an absolute number of query terms. The default is set to 1.0, meaning only corrections with at most one misspelled term are returned. Note that setting this too high can negatively impact performance; low values like 1 or 2 are recommended, otherwise the time spent in suggest calls might exceed the time spent in query execution.

separator
    The separator that is used to separate terms in the bigram field. If not set the whitespace character is used as a separator.

size
    The number of candidates that are generated for each individual query term. Low numbers like 3 or 5 typically produce good results. Raising this can bring up terms with higher edit distances.

analyzer
    Sets the analyzer to analyze the suggest text with. Defaults to the search analyzer of the suggest field passed via field.

shard_size
    Sets the maximum number of suggested terms to be retrieved from each individual shard. During the reduce phase, only the top N suggestions are returned based on the size option. Defaults to 5.

text
    Sets the text / query to provide suggestions for.

highlight
    Sets up suggestion highlighting. If not provided then no highlighted field is returned. If provided must contain exactly pre_tag and post_tag, which are wrapped around the changed tokens. If multiple tokens in a row are changed the entire phrase of changed tokens is wrapped rather than each token.

collate
    Checks each suggestion against the specified query to prune suggestions for which no matching docs exist in the index. The collate query for a suggestion is run only on the local shard from which the suggestion has been generated. The query must be specified and it can be templated. The current suggestion is automatically made available as the {{suggestion}} variable, which should be used in your query. You can still specify your own template params; the suggestion value will be added to the variables you specify. Additionally, you can specify a prune to control if all phrase suggestions will be returned: when set to true the suggestions will have an additional option collate_match, which will be true if matching documents for the phrase was found, false otherwise. The default value for prune is false.
resp = client.search( index="test", suggest={ "text": "noble prize", "simple_phrase": { "phrase": { "field": "title.trigram", "size": 1, "direct_generator": [ { "field": "title.trigram", "suggest_mode": "always", "min_word_length": 1 } ], "collate": { "query": { "source": { "match": { "{{field_name}}": "{{suggestion}}" } } }, "params": { "field_name": "title" }, "prune": True } } } }, ) print(resp)
const response = await client.search({ index: "test", suggest: { text: "noble prize", simple_phrase: { phrase: { field: "title.trigram", size: 1, direct_generator: [ { field: "title.trigram", suggest_mode: "always", min_word_length: 1, }, ], collate: { query: { source: { match: { "{{field_name}}": "{{suggestion}}", }, }, }, params: { field_name: "title", }, prune: true, }, }, }, }, }); console.log(response);
POST test/_search { "suggest": { "text" : "noble prize", "simple_phrase" : { "phrase" : { "field" : "title.trigram", "size" : 1, "direct_generator" : [ { "field" : "title.trigram", "suggest_mode" : "always", "min_word_length" : 1 } ], "collate": { "query": { "source" : { "match": { "{{field_name}}" : "{{suggestion}}" } } }, "params": {"field_name" : "title"}, "prune": true } } } } }
This query will be run once for every suggestion.
The {{suggestion}} variable will be replaced by the text of each suggestion.
An additional field_name parameter is specified in params and used by the match query.
All suggestions will be returned with an extra collate_match option indicating whether the generated phrase matched any document.
Smoothing Models

The phrase suggester supports multiple smoothing models to balance weight between infrequent grams (grams (shingles) that do not exist in the index) and frequent grams (that appear at least once in the index). The smoothing model can be selected by setting the smoothing parameter to one of the following options. Each smoothing model supports specific properties that can be configured.
stupid_backoff
    A simple backoff model that backs off to lower order n-gram models if the higher order count is 0 and discounts the lower order n-gram model by a constant factor. The default discount is 0.4. Stupid Backoff is the default model.

laplace
    A smoothing model that uses an additive smoothing where a constant (typically 1.0 or smaller) is added to all counts to balance weights. The default alpha is 0.5.

linear_interpolation
    A smoothing model that takes the weighted mean of the unigrams, bigrams, and trigrams based on user supplied weights (lambdas). Linear Interpolation doesn't have any default values. All parameters (trigram_lambda, bigram_lambda, unigram_lambda) must be supplied.
resp = client.search( index="test", suggest={ "text": "obel prize", "simple_phrase": { "phrase": { "field": "title.trigram", "size": 1, "smoothing": { "laplace": { "alpha": 0.7 } } } } }, ) print(resp)
const response = await client.search({ index: "test", suggest: { text: "obel prize", simple_phrase: { phrase: { field: "title.trigram", size: 1, smoothing: { laplace: { alpha: 0.7, }, }, }, }, }, }); console.log(response);
POST test/_search { "suggest": { "text" : "obel prize", "simple_phrase" : { "phrase" : { "field" : "title.trigram", "size" : 1, "smoothing" : { "laplace" : { "alpha" : 0.7 } } } } } }
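For comparison, a sketch of the same request with a linear_interpolation model instead of laplace; the lambda weights are illustrative and must all be supplied since this model has no defaults.

resp = client.search(
    index="test",
    suggest={
        "text": "obel prize",
        "simple_phrase": {
            "phrase": {
                "field": "title.trigram",
                "size": 1,
                "smoothing": {
                    "linear_interpolation": {
                        "trigram_lambda": 0.65,  # weight of the trigram model
                        "bigram_lambda": 0.25,   # weight of the bigram model
                        "unigram_lambda": 0.1    # weight of the unigram model
                    }
                }
            }
        }
    },
)
print(resp)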
Candidate Generators

The phrase suggester uses candidate generators to produce a list of possible terms per term in the given text. A single candidate generator is similar to a term suggester called for each individual term in the text. The output of the generators is subsequently scored in combination with the candidates from the other terms for suggestion candidates.

Currently only one type of candidate generator is supported, the direct_generator. The Phrase suggest API accepts a list of generators under the key direct_generator; each of the generators in the list is called per term in the original text.
Direct Generators

The direct generators support the following parameters:

field
    The field to fetch the candidate suggestions from. This is a required option that either needs to be set globally or per suggestion.

size
    The maximum corrections to be returned per suggest text token.

suggest_mode
    The suggest mode controls what suggestions are included on the suggestions generated on each shard. All values other than always can be thought of as an optimization to generate fewer suggestions to test on each shard and are not rechecked when combining the suggestions generated on each shard. Three possible values can be specified:

    missing: Only generate suggestions for terms that are not in the shard.
    popular: Only suggest terms that occur in more docs on the shard than the original term.
    always: Suggest any matching suggestions based on terms in the suggest text.

max_edits
    The maximum edit distance candidate suggestions can have in order to be considered as a suggestion. Can only be a value between 1 and 2. Any other value results in a bad request error being thrown. Defaults to 2.

prefix_length
    The number of minimal prefix characters that must match in order to be a candidate for suggestions. Defaults to 1. Increasing this number improves spellcheck performance. Usually misspellings don't occur in the beginning of terms.

min_word_length
    The minimum length a suggest text term must have in order to be included. Defaults to 4.

max_inspections
    A factor that is used to multiply with the shard_size in order to inspect more candidate spelling corrections on the shard level. Can improve accuracy at the cost of performance. Defaults to 5.

min_doc_freq
    The minimal threshold in number of documents a suggestion should appear in. This can be specified as an absolute number or as a relative percentage of number of documents. This can improve quality by only suggesting high frequency terms. Defaults to 0f and is not enabled. If a value higher than 1 is specified, then the number cannot be fractional. The shard level document frequencies are used for this option.

max_term_freq
    The maximum threshold in number of documents in which a suggest text token can exist in order to be included. Can be a relative percentage number (e.g., 0.4) or an absolute number to represent document frequencies. If a value higher than 1 is specified, then fractional can not be specified. Defaults to 0.01f. This can be used to exclude high frequency terms, which are usually spelled correctly, from being spellchecked. This also improves the spellcheck performance. The shard level document frequencies are used for this option.

pre_filter
    A filter (analyzer) that is applied to each of the tokens passed to this candidate generator. This filter is applied to the original token before candidates are generated.

post_filter
    A filter (analyzer) that is applied to each of the generated tokens before they are passed to the actual phrase scorer.
The following example shows a phrase suggest call with two generators: the first one is using a field containing ordinary indexed terms, and the second one uses a field that uses terms indexed with a reverse filter (tokens are indexed in reverse order). This is used to overcome the limitation of the direct generators to require a constant prefix to provide high-performance suggestions. The pre_filter and post_filter options accept ordinary analyzer names.
resp = client.search( index="test", suggest={ "text": "obel prize", "simple_phrase": { "phrase": { "field": "title.trigram", "size": 1, "direct_generator": [ { "field": "title.trigram", "suggest_mode": "always" }, { "field": "title.reverse", "suggest_mode": "always", "pre_filter": "reverse", "post_filter": "reverse" } ] } } }, ) print(resp)
const response = await client.search({ index: "test", suggest: { text: "obel prize", simple_phrase: { phrase: { field: "title.trigram", size: 1, direct_generator: [ { field: "title.trigram", suggest_mode: "always", }, { field: "title.reverse", suggest_mode: "always", pre_filter: "reverse", post_filter: "reverse", }, ], }, }, }, }); console.log(response);
POST test/_search { "suggest": { "text" : "obel prize", "simple_phrase" : { "phrase" : { "field" : "title.trigram", "size" : 1, "direct_generator" : [ { "field" : "title.trigram", "suggest_mode" : "always" }, { "field" : "title.reverse", "suggest_mode" : "always", "pre_filter" : "reverse", "post_filter" : "reverse" } ] } } } }
pre_filter and post_filter can also be used to inject synonyms after candidates are generated. For instance for the query captain usq we might generate a candidate usa for the term usq, which is a synonym for america. This allows us to present captain america to the user if this phrase scores high enough.
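A sketch of what that might look like, assuming a hypothetical analyzer named my_synonyms (one that would expand usa to america) has been defined in the index settings:

resp = client.search(
    index="test",
    suggest={
        "text": "captain usq",
        "simple_phrase": {
            "phrase": {
                "field": "title.trigram",
                "size": 1,
                "direct_generator": [
                    {
                        "field": "title.trigram",
                        "suggest_mode": "always",
                        "post_filter": "my_synonyms"  # hypothetical synonym analyzer applied to generated candidates
                    }
                ]
            }
        }
    },
)
print(resp)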
Completion Suggester

The completion suggester provides auto-complete/search-as-you-type functionality. This is a navigational feature to guide users to relevant results as they are typing, improving search precision. It is not meant for spell correction or did-you-mean functionality like the term or phrase suggesters.

Ideally, auto-complete functionality should be as fast as a user types to provide instant feedback relevant to what a user has already typed in. Hence, the completion suggester is optimized for speed. The suggester uses data structures that enable fast lookups, but are costly to build and are stored in-memory.
Mapping

To use the completion suggester, map the field from which you want to generate suggestions as type completion. This indexes the field values for fast completions.
resp = client.indices.create( index="music", mappings={ "properties": { "suggest": { "type": "completion" } } }, ) print(resp)
response = client.indices.create( index: 'music', body: { mappings: { properties: { suggest: { type: 'completion' } } } } ) puts response
const response = await client.indices.create({ index: "music", mappings: { properties: { suggest: { type: "completion", }, }, }, }); console.log(response);
PUT music { "mappings": { "properties": { "suggest": { "type": "completion" } } } }
Parameters for completion fields

The following parameters are accepted by completion fields:

analyzer
    The index analyzer to use, defaults to simple.

search_analyzer
    The search analyzer to use, defaults to the value of analyzer.

preserve_separators
    Preserves the separators, defaults to true.

preserve_position_increments
    Enables position increments, defaults to true.

max_input_length
    Limits the length of a single input, defaults to 50 UTF-16 code points. This limit is only used at index time to reduce the total number of characters per input string in order to prevent massive inputs from bloating the underlying datastructure. Most use cases won't be influenced by the default value since prefix completions seldom grow beyond prefixes longer than a handful of characters.
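As an illustration, a sketch of a completion field that tunes two of these parameters; the index name is hypothetical and the values are examples only.

resp = client.indices.create(
    index="music_custom",  # hypothetical index name
    mappings={
        "properties": {
            "suggest": {
                "type": "completion",
                "preserve_separators": False,  # a prefix like "foof" can now match "Foo Fighters"
                "max_input_length": 25         # cap each input at 25 UTF-16 code points
            }
        }
    },
)
print(resp)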
Indexing

You index suggestions like any other field. A suggestion is made of an input and an optional weight attribute. An input is the expected text to be matched by a suggestion query and the weight determines how the suggestions will be scored. Indexing a suggestion is as follows:
resp = client.index( index="music", id="1", refresh=True, document={ "suggest": { "input": [ "Nevermind", "Nirvana" ], "weight": 34 } }, ) print(resp)
response = client.index( index: 'music', id: 1, refresh: true, body: { suggest: { input: [ 'Nevermind', 'Nirvana' ], weight: 34 } } ) puts response
const response = await client.index({ index: "music", id: 1, refresh: "true", document: { suggest: { input: ["Nevermind", "Nirvana"], weight: 34, }, }, }); console.log(response);
PUT music/_doc/1?refresh { "suggest" : { "input": [ "Nevermind", "Nirvana" ], "weight" : 34 } }
The following parameters are supported:

input
    The input to store, this can be an array of strings or just a string. This field is mandatory. This value cannot contain the following UTF-16 control characters: \u0000 (null), \u001f (information separator one), \u001e (information separator two).

weight
    A positive integer or a string containing a positive integer, which defines a weight and allows you to rank your suggestions. This field is optional.
You can index multiple suggestions for a document as follows:
resp = client.index( index="music", id="1", refresh=True, document={ "suggest": [ { "input": "Nevermind", "weight": 10 }, { "input": "Nirvana", "weight": 3 } ] }, ) print(resp)
response = client.index( index: 'music', id: 1, refresh: true, body: { suggest: [ { input: 'Nevermind', weight: 10 }, { input: 'Nirvana', weight: 3 } ] } ) puts response
const response = await client.index({ index: "music", id: 1, refresh: "true", document: { suggest: [ { input: "Nevermind", weight: 10, }, { input: "Nirvana", weight: 3, }, ], }, }); console.log(response);
PUT music/_doc/1?refresh { "suggest": [ { "input": "Nevermind", "weight": 10 }, { "input": "Nirvana", "weight": 3 } ] }
You can use the following shorthand form. Note that you cannot specify a weight with suggestion(s) in the shorthand form.
resp = client.index( index="music", id="1", refresh=True, document={ "suggest": [ "Nevermind", "Nirvana" ] }, ) print(resp)
response = client.index( index: 'music', id: 1, refresh: true, body: { suggest: [ 'Nevermind', 'Nirvana' ] } ) puts response
const response = await client.index({ index: "music", id: 1, refresh: "true", document: { suggest: ["Nevermind", "Nirvana"], }, }); console.log(response);
PUT music/_doc/1?refresh { "suggest" : [ "Nevermind", "Nirvana" ] }
Querying

Suggesting works as usual, except that you have to specify the suggest type as completion. Suggestions are near real-time, which means new suggestions can be made visible by refresh and documents once deleted are never shown. This request:
resp = client.search( index="music", pretty=True, suggest={ "song-suggest": { "prefix": "nir", "completion": { "field": "suggest" } } }, ) print(resp)
response = client.search( index: 'music', pretty: true, body: { suggest: { "song-suggest": { prefix: 'nir', completion: { field: 'suggest' } } } } ) puts response
const response = await client.search({ index: "music", pretty: "true", suggest: { "song-suggest": { prefix: "nir", completion: { field: "suggest", }, }, }, }); console.log(response);
POST music/_search?pretty { "suggest": { "song-suggest": { "prefix": "nir", "completion": { "field": "suggest" } } } }
Prefix used to search for suggestions.
Type of suggestions.
Name of the field to search for suggestions in.
returns this response:
{ "_shards" : { "total" : 1, "successful" : 1, "skipped" : 0, "failed" : 0 }, "hits": ... "took": 2, "timed_out": false, "suggest": { "song-suggest" : [ { "text" : "nir", "offset" : 0, "length" : 3, "options" : [ { "text" : "Nirvana", "_index": "music", "_id": "1", "_score": 1.0, "_source": { "suggest": ["Nevermind", "Nirvana"] } } ] } ] } }
The _source metadata field must be enabled, which is the default behavior, to enable returning _source with suggestions.

The configured weight for a suggestion is returned as _score. The text field uses the input of your indexed suggestion. Suggestions return the full document _source by default. The size of the _source can impact performance due to disk fetch and network transport overhead. To save some network overhead, filter out unnecessary fields from the _source using source filtering to minimize _source size. Note that the _suggest endpoint doesn't support source filtering but using suggest on the _search endpoint does:
resp = client.search( index="music", source="suggest", suggest={ "song-suggest": { "prefix": "nir", "completion": { "field": "suggest", "size": 5 } } }, ) print(resp)
response = client.search( index: 'music', body: { _source: 'suggest', suggest: { "song-suggest": { prefix: 'nir', completion: { field: 'suggest', size: 5 } } } } ) puts response
const response = await client.search({ index: "music", _source: "suggest", suggest: { "song-suggest": { prefix: "nir", completion: { field: "suggest", size: 5, }, }, }, }); console.log(response);
POST music/_search { "_source": "suggest", "suggest": { "song-suggest": { "prefix": "nir", "completion": { "field": "suggest", "size": 5 } } } }
Filter the source to return only the suggest field.
Name of the field to search for suggestions in.
Number of suggestions to return.
Which should look like:
{ "took": 6, "timed_out": false, "_shards": { "total": 1, "successful": 1, "skipped": 0, "failed": 0 }, "hits": { "total": { "value": 0, "relation": "eq" }, "max_score": null, "hits": [] }, "suggest": { "song-suggest": [ { "text": "nir", "offset": 0, "length": 3, "options": [ { "text": "Nirvana", "_index": "music", "_id": "1", "_score": 1.0, "_source": { "suggest": [ "Nevermind", "Nirvana" ] } } ] } ] } }
The basic completion suggester query supports the following parameters:

field
    The name of the field on which to run the query (required).

size
    The number of suggestions to return (defaults to 5).

skip_duplicates
    Whether duplicate suggestions should be filtered out (defaults to false).
The completion suggester considers all documents in the index. See Context Suggester for an explanation of how to query a subset of documents instead.
When a completion query spans more than one shard, the suggest is executed in two phases, where the last phase fetches the relevant documents from the shards. Executing completion requests against a single shard is therefore more performant, because it avoids the document fetch overhead incurred when the suggest spans multiple shards. To get the best performance for completions, it is recommended to index completions into a single-shard index. If heap usage becomes too high due to shard size, it is still recommended to break the index into multiple shards instead of optimizing for completion performance.
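For example, a completion-oriented index can be pinned to a single primary shard at creation time; a sketch with a hypothetical index name:

resp = client.indices.create(
    index="music_single_shard",  # hypothetical index name
    settings={"index": {"number_of_shards": 1}},  # keep all completions on one shard
    mappings={
        "properties": {
            "suggest": {"type": "completion"}
        }
    },
)
print(resp)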
Skip duplicate suggestions

Queries can return duplicate suggestions coming from different documents. It is possible to modify this behavior by setting skip_duplicates to true. When set, this option filters out documents with duplicate suggestions from the result.
resp = client.search( index="music", pretty=True, suggest={ "song-suggest": { "prefix": "nor", "completion": { "field": "suggest", "skip_duplicates": True } } }, ) print(resp)
response = client.search( index: 'music', pretty: true, body: { suggest: { "song-suggest": { prefix: 'nor', completion: { field: 'suggest', skip_duplicates: true } } } } ) puts response
const response = await client.search({ index: "music", pretty: "true", suggest: { "song-suggest": { prefix: "nor", completion: { field: "suggest", skip_duplicates: true, }, }, }, }); console.log(response);
POST music/_search?pretty { "suggest": { "song-suggest": { "prefix": "nor", "completion": { "field": "suggest", "skip_duplicates": true } } } }
When set to true, this option can slow down search because more suggestions need to be visited to find the top N.
Fuzzy queries

The completion suggester also supports fuzzy queries; this means you can have a typo in your search and still get results back.
resp = client.search( index="music", pretty=True, suggest={ "song-suggest": { "prefix": "nor", "completion": { "field": "suggest", "fuzzy": { "fuzziness": 2 } } } }, ) print(resp)
response = client.search( index: 'music', pretty: true, body: { suggest: { "song-suggest": { prefix: 'nor', completion: { field: 'suggest', fuzzy: { fuzziness: 2 } } } } } ) puts response
const response = await client.search({ index: "music", pretty: "true", suggest: { "song-suggest": { prefix: "nor", completion: { field: "suggest", fuzzy: { fuzziness: 2, }, }, }, }, }); console.log(response);
POST music/_search?pretty { "suggest": { "song-suggest": { "prefix": "nor", "completion": { "field": "suggest", "fuzzy": { "fuzziness": 2 } } } } }
Suggestions that share the longest prefix with the query prefix will be scored higher.
The fuzzy query can take specific fuzzy parameters. The following parameters are supported:

fuzziness
    The fuzziness factor, defaults to AUTO. See Fuzziness for allowed settings.

transpositions
    If set to true, transpositions are counted as one change instead of two, defaults to true.

min_length
    Minimum length of the input before fuzzy suggestions are returned, defaults to 3.

prefix_length
    Minimum length of the input, which is not checked for fuzzy alternatives, defaults to 1.

unicode_aware
    If true, all measurements (like fuzzy edit distance, transpositions, and lengths) are measured in Unicode code points instead of in bytes. This is slightly slower than raw bytes, so it is set to false by default.
If you want to stick with the default values, but still use fuzzy, you can either use fuzzy: {} or fuzzy: true.
Regex queries

The completion suggester also supports regex queries, meaning you can express a prefix as a regular expression.
resp = client.search( index="music", pretty=True, suggest={ "song-suggest": { "regex": "n[ever|i]r", "completion": { "field": "suggest" } } }, ) print(resp)
response = client.search( index: 'music', pretty: true, body: { suggest: { "song-suggest": { regex: 'n[ever|i]r', completion: { field: 'suggest' } } } } ) puts response
const response = await client.search({ index: "music", pretty: "true", suggest: { "song-suggest": { regex: "n[ever|i]r", completion: { field: "suggest", }, }, }, }); console.log(response);
POST music/_search?pretty { "suggest": { "song-suggest": { "regex": "n[ever|i]r", "completion": { "field": "suggest" } } } }
The regex query can take specific regex parameters. The following parameters are supported:

flags
    Possible flags are ALL (default), ANYSTRING, COMPLEMENT, EMPTY, INTERSECTION, INTERVAL, or NONE. See the regexp query documentation for their meaning.

max_determinized_states
    Regular expressions are dangerous because it's easy to accidentally create an innocuous looking one that requires an exponential number of internal determinized automaton states (and corresponding RAM and CPU) for Lucene to execute. Lucene prevents these using the max_determinized_states setting (defaults to 10000). You can raise this limit to allow more complex regular expressions to execute.
Context Suggester

The completion suggester considers all documents in the index, but it is often desirable to serve suggestions filtered and/or boosted by some criteria. For example, you want to suggest song titles filtered by certain artists or you want to boost song titles based on their genre.

To achieve suggestion filtering and/or boosting, you can add context mappings while configuring a completion field. You can define multiple context mappings for a completion field. Every context mapping has a unique name and a type. There are two types: category and geo. Context mappings are configured under the contexts parameter in the field mapping.

It is mandatory to provide a context when indexing and querying a context enabled completion field.

The maximum allowed number of completion field context mappings is 10.
The following defines two indices, each with two context mappings for a completion field:
resp = client.indices.create( index="place", mappings={ "properties": { "suggest": { "type": "completion", "contexts": [ { "name": "place_type", "type": "category" }, { "name": "location", "type": "geo", "precision": 4 } ] } } }, ) print(resp) resp1 = client.indices.create( index="place_path_category", mappings={ "properties": { "suggest": { "type": "completion", "contexts": [ { "name": "place_type", "type": "category", "path": "cat" }, { "name": "location", "type": "geo", "precision": 4, "path": "loc" } ] }, "loc": { "type": "geo_point" } } }, ) print(resp1)
response = client.indices.create( index: 'place', body: { mappings: { properties: { suggest: { type: 'completion', contexts: [ { name: 'place_type', type: 'category' }, { name: 'location', type: 'geo', precision: 4 } ] } } } } ) puts response response = client.indices.create( index: 'place_path_category', body: { mappings: { properties: { suggest: { type: 'completion', contexts: [ { name: 'place_type', type: 'category', path: 'cat' }, { name: 'location', type: 'geo', precision: 4, path: 'loc' } ] }, loc: { type: 'geo_point' } } } } ) puts response
const response = await client.indices.create({ index: "place", mappings: { properties: { suggest: { type: "completion", contexts: [ { name: "place_type", type: "category", }, { name: "location", type: "geo", precision: 4, }, ], }, }, }, }); console.log(response); const response1 = await client.indices.create({ index: "place_path_category", mappings: { properties: { suggest: { type: "completion", contexts: [ { name: "place_type", type: "category", path: "cat", }, { name: "location", type: "geo", precision: 4, path: "loc", }, ], }, loc: { type: "geo_point", }, }, }, }); console.log(response1);
PUT place { "mappings": { "properties": { "suggest": { "type": "completion", "contexts": [ { "name": "place_type", "type": "category" }, { "name": "location", "type": "geo", "precision": 4 } ] } } } } PUT place_path_category { "mappings": { "properties": { "suggest": { "type": "completion", "contexts": [ { "name": "place_type", "type": "category", "path": "cat" }, { "name": "location", "type": "geo", "precision": 4, "path": "loc" } ] }, "loc": { "type": "geo_point" } } } }
In the place index, the first context mapping defines a category context named place_type, where the categories must be sent with the suggestions, and the second defines a geo context named location. In the place_path_category index, the category context place_type reads its categories from the cat field, and the geo context location reads its geo points from the loc field.
Adding context mappings increases the index size for the completion field. The completion index is entirely heap resident; you can monitor the completion field index size using the index stats API.
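For example, a minimal sketch of checking that size with the Python client (the place index and suggest field are the ones defined above):

# Assumes "client" is an elasticsearch.Elasticsearch instance,
# as in the surrounding examples.
# Fetch completion statistics for the "suggest" field of the "place" index.
resp = client.indices.stats(
    index="place",
    metric="completion",
    completion_fields="suggest",
)
# Per-field sizes are reported under the completion stats,
# e.g. resp["indices"]["place"]["total"]["completion"].
print(resp["indices"]["place"]["total"]["completion"])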
Category Context
The category context allows you to associate one or more categories with suggestions at index time. At query time, suggestions can be filtered and boosted by their associated categories.
The mappings are set up like the place_type fields above. If path is defined then the categories are read from that path in the document, otherwise they must be sent in the suggest field like this:
resp = client.index(
    index="place",
    id="1",
    document={
        "suggest": {
            "input": ["timmy's", "starbucks", "dunkin donuts"],
            "contexts": {"place_type": ["cafe", "food"]},
        }
    },
)
print(resp)
response = client.index( index: 'place', id: 1, body: { suggest: { input: [ "timmy's", 'starbucks', 'dunkin donuts' ], contexts: { place_type: [ 'cafe', 'food' ] } } } )
puts response
const response = await client.index({ index: "place", id: 1, document: { suggest: { input: ["timmy's", "starbucks", "dunkin donuts"], contexts: { place_type: ["cafe", "food"], }, }, }, }); console.log(response);
PUT place/_doc/1
{
  "suggest": {
    "input": [ "timmy's", "starbucks", "dunkin donuts" ],
    "contexts": {
      "place_type": [ "cafe", "food" ]
    }
  }
}
If the mapping had a path then the following index request would be enough to add the categories:
resp = client.index(
    index="place_path_category",
    id="1",
    document={
        "suggest": ["timmy's", "starbucks", "dunkin donuts"],
        "cat": ["cafe", "food"],
    },
)
print(resp)
response = client.index( index: 'place_path_category', id: 1, body: { suggest: [ "timmy's", 'starbucks', 'dunkin donuts' ], cat: [ 'cafe', 'food' ] } )
puts response
const response = await client.index({ index: "place_path_category", id: 1, document: { suggest: ["timmy's", "starbucks", "dunkin donuts"], cat: ["cafe", "food"], }, }); console.log(response);
PUT place_path_category/_doc/1
{
  "suggest": ["timmy's", "starbucks", "dunkin donuts"],
  "cat": ["cafe", "food"]
}
If the context mapping references another field and the categories are explicitly indexed, the suggestions are indexed with both sets of categories.
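For instance, a hedged sketch against the place_path_category index defined above (the document values are illustrative); here the suggestion would carry both the explicitly supplied category and the one read from the cat field:

# Assumes "client" is an elasticsearch.Elasticsearch instance,
# as in the surrounding examples.
resp = client.index(
    index="place_path_category",
    id="2",
    document={
        "suggest": {
            "input": ["timmy's"],
            # Explicit categories for the place_type context...
            "contexts": {"place_type": ["bakery"]},
        },
        # ...plus categories read from the mapped path "cat".
        "cat": ["cafe"],
    },
)
print(resp)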
Category Query
Suggestions can be filtered by one or more categories. The following request filters suggestions by multiple categories:
resp = client.search(
    index="place",
    pretty=True,
    suggest={
        "place_suggestion": {
            "prefix": "tim",
            "completion": {
                "field": "suggest",
                "size": 10,
                "contexts": {"place_type": ["cafe", "restaurants"]},
            },
        }
    },
)
print(resp)
response = client.search( index: 'place', pretty: true, body: { suggest: { place_suggestion: { prefix: 'tim', completion: { field: 'suggest', size: 10, contexts: { place_type: [ 'cafe', 'restaurants' ] } } } } } )
puts response
const response = await client.search({ index: "place", pretty: "true", suggest: { place_suggestion: { prefix: "tim", completion: { field: "suggest", size: 10, contexts: { place_type: ["cafe", "restaurants"], }, }, }, }, }); console.log(response);
POST place/_search?pretty
{
  "suggest": {
    "place_suggestion": {
      "prefix": "tim",
      "completion": {
        "field": "suggest",
        "size": 10,
        "contexts": {
          "place_type": [ "cafe", "restaurants" ]
        }
      }
    }
  }
}
If multiple categories or category contexts are set on the query they are merged as a disjunction. This means that suggestions match if they contain at least one of the provided context values.
Suggestions with certain categories can be boosted higher than others. The following filters suggestions by categories and additionally boosts suggestions associated with some categories:
resp = client.search(
    index="place",
    pretty=True,
    suggest={
        "place_suggestion": {
            "prefix": "tim",
            "completion": {
                "field": "suggest",
                "size": 10,
                "contexts": {
                    "place_type": [
                        {"context": "cafe"},
                        {"context": "restaurants", "boost": 2},
                    ]
                },
            },
        }
    },
)
print(resp)
response = client.search( index: 'place', pretty: true, body: { suggest: { place_suggestion: { prefix: 'tim', completion: { field: 'suggest', size: 10, contexts: { place_type: [ { context: 'cafe' }, { context: 'restaurants', boost: 2 } ] } } } } } )
puts response
const response = await client.search({ index: "place", pretty: "true", suggest: { place_suggestion: { prefix: "tim", completion: { field: "suggest", size: 10, contexts: { place_type: [ { context: "cafe", }, { context: "restaurants", boost: 2, }, ], }, }, }, }, }); console.log(response);
POST place/_search?pretty
{
  "suggest": {
    "place_suggestion": {
      "prefix": "tim",
      "completion": {
        "field": "suggest",
        "size": 10,
        "contexts": {
          "place_type": [
            { "context": "cafe" },
            { "context": "restaurants", "boost": 2 }
          ]
        }
      }
    }
  }
}
The context query filters suggestions associated with the categories cafe and restaurants, and boosts the suggestions associated with restaurants by a factor of 2.
In addition to accepting category values, a context query can be composed of multiple category context clauses. The following parameters are supported for a category context clause (an illustrative prefix query follows this list):

context: The value of the category to filter/boost on. This is mandatory.

boost: The factor by which the score of the suggestion should be boosted. The score is computed by multiplying the boost with the suggestion weight. Defaults to 1.

prefix: Whether the category value should be treated as a prefix or not. For example, if set to true, you can filter the categories type1, type2, and so on by specifying a category prefix of type. Defaults to false.
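As an illustration of the prefix parameter, here is a sketch (the caf value is illustrative) that matches any category starting with caf, such as cafe:

# Assumes "client" is an elasticsearch.Elasticsearch instance,
# as in the surrounding examples.
resp = client.search(
    index="place",
    suggest={
        "place_suggestion": {
            "prefix": "tim",
            "completion": {
                "field": "suggest",
                "contexts": {
                    # Treat "caf" as a category prefix, so "cafe",
                    # "cafeteria", etc. would all match.
                    "place_type": [{"context": "caf", "prefix": True}]
                },
            },
        }
    },
)
print(resp)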
If a suggestion entry matches multiple contexts the final score is computed as the maximum score produced by any matching contexts.
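For example, if a suggestion with weight 10 matches both the cafe context (boost 1) and the restaurants context (boost 2), its final score is max(10 × 1, 10 × 2) = 20.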
Geo location Context
A geo context allows you to associate one or more geo points or geohashes with suggestions at index time. At query time, suggestions can be filtered and boosted if they are within a certain distance of a specified geo location.
Internally, geo points are encoded as geohashes with the specified precision.
Geo Mapping
In addition to the path setting, geo context mapping accepts the following settings:

precision: This defines the precision of the geohash to be indexed and can be specified as a distance value (5m, 10km etc.), or as a raw geohash precision (1..12). Defaults to a raw geohash precision value of 6.

The index time precision setting sets the maximum geohash precision that can be used at query time.
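For instance, a sketch of a mapping that uses a distance value for precision (the index name place_coarse is hypothetical):

# Assumes "client" is an elasticsearch.Elasticsearch instance,
# as in the surrounding examples.
resp = client.indices.create(
    index="place_coarse",
    mappings={
        "properties": {
            "suggest": {
                "type": "completion",
                "contexts": [
                    {
                        "name": "location",
                        "type": "geo",
                        # A distance value is translated into the geohash
                        # precision level whose tiles are roughly that size.
                        "precision": "10km",
                    }
                ],
            }
        }
    },
)
print(resp)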
Indexing geo contexts
geo contexts can be explicitly set with suggestions or be indexed from a geo point field in the document via the path parameter, similar to category contexts. Associating multiple geo location contexts with a suggestion will index the suggestion for every geo location. The following indexes a suggestion with two geo location contexts:
resp = client.index(
    index="place",
    id="1",
    document={
        "suggest": {
            "input": "timmy's",
            "contexts": {
                "location": [
                    {"lat": 43.6624803, "lon": -79.3863353},
                    {"lat": 43.6624718, "lon": -79.3873227},
                ]
            },
        }
    },
)
print(resp)
response = client.index( index: 'place', id: 1, body: { suggest: { input: "timmy's", contexts: { location: [ { lat: 43.6624803, lon: -79.3863353 }, { lat: 43.6624718, lon: -79.3873227 } ] } } } )
puts response
const response = await client.index({ index: "place", id: 1, document: { suggest: { input: "timmy's", contexts: { location: [ { lat: 43.6624803, lon: -79.3863353, }, { lat: 43.6624718, lon: -79.3873227, }, ], }, }, }, }); console.log(response);
PUT place/_doc/1
{
  "suggest": {
    "input": "timmy's",
    "contexts": {
      "location": [
        { "lat": 43.6624803, "lon": -79.3863353 },
        { "lat": 43.6624718, "lon": -79.3873227 }
      ]
    }
  }
}
Geo location Query
Suggestions can be filtered and boosted with respect to how close they are to one or more geo points. The following filters suggestions that fall within the area represented by the encoded geohash of a geo point:
resp = client.search(
    index="place",
    suggest={
        "place_suggestion": {
            "prefix": "tim",
            "completion": {
                "field": "suggest",
                "size": 10,
                "contexts": {"location": {"lat": 43.662, "lon": -79.38}},
            },
        }
    },
)
print(resp)
response = client.search( index: 'place', body: { suggest: { place_suggestion: { prefix: 'tim', completion: { field: 'suggest', size: 10, contexts: { location: { lat: 43.662, lon: -79.38 } } } } } } )
puts response
const response = await client.search({ index: "place", suggest: { place_suggestion: { prefix: "tim", completion: { field: "suggest", size: 10, contexts: { location: { lat: 43.662, lon: -79.38, }, }, }, }, }, }); console.log(response);
POST place/_search
{
  "suggest": {
    "place_suggestion": {
      "prefix": "tim",
      "completion": {
        "field": "suggest",
        "size": 10,
        "contexts": {
          "location": {
            "lat": 43.662,
            "lon": -79.380
          }
        }
      }
    }
  }
}
When a location with a lower precision at query time is specified, all suggestions that fall within the area will be considered.
If multiple geo location contexts are set on the query they are merged as a disjunction. This means that suggestions match if they contain at least one of the provided context values.
Suggestions that are within an area represented by a geohash can also be boosted higher than others, as shown by the following:
resp = client.search(
    index="place",
    pretty=True,
    suggest={
        "place_suggestion": {
            "prefix": "tim",
            "completion": {
                "field": "suggest",
                "size": 10,
                "contexts": {
                    "location": [
                        {"lat": 43.6624803, "lon": -79.3863353, "precision": 2},
                        {
                            "context": {"lat": 43.6624803, "lon": -79.3863353},
                            "boost": 2,
                        },
                    ]
                },
            },
        }
    },
)
print(resp)
response = client.search( index: 'place', pretty: true, body: { suggest: { place_suggestion: { prefix: 'tim', completion: { field: 'suggest', size: 10, contexts: { location: [ { lat: 43.6624803, lon: -79.3863353, precision: 2 }, { context: { lat: 43.6624803, lon: -79.3863353 }, boost: 2 } ] } } } } } )
puts response
const response = await client.search({ index: "place", pretty: "true", suggest: { place_suggestion: { prefix: "tim", completion: { field: "suggest", size: 10, contexts: { location: [ { lat: 43.6624803, lon: -79.3863353, precision: 2, }, { context: { lat: 43.6624803, lon: -79.3863353, }, boost: 2, }, ], }, }, }, }, }); console.log(response);
POST place/_search?pretty
{
  "suggest": {
    "place_suggestion": {
      "prefix": "tim",
      "completion": {
        "field": "suggest",
        "size": 10,
        "contexts": {
          "location": [
            {
              "lat": 43.6624803,
              "lon": -79.3863353,
              "precision": 2
            },
            {
              "context": { "lat": 43.6624803, "lon": -79.3863353 },
              "boost": 2
            }
          ]
        }
      }
    }
  }
}
The context query filters for suggestions that fall under the geo location represented by a geohash of (43.662, -79.380) with a precision of 2, and boosts suggestions that fall under the geohash representation of (43.6624803, -79.3863353) with a default precision of 6 by a factor of 2.
If a suggestion entry matches multiple contexts the final score is computed as the maximum score produced by any matching contexts.
In addition to accepting context values, a context query can be composed of multiple context clauses. The following parameters are supported for a geo context clause:

context: A geo point object or a geohash string to filter or boost the suggestion by. This is mandatory.

boost: The factor by which the score of the suggestion should be boosted. The score is computed by multiplying the boost with the suggestion weight. Defaults to 1.

precision: The precision of the geohash to encode the query geo point. This can be specified as a distance value (5m, 10km etc.), or as a raw geohash precision (1..12). Defaults to the index time precision level.

neighbours: Accepts an array of precision values at which neighbouring geohashes should be taken into account. A precision value can be a distance value (5m, 10km etc.) or a raw geohash precision (1..12). Defaults to generating neighbours for the index time precision level.
The precision field does not result in a distance match. Specifying a distance value like 10km only results in a geohash precision value that represents tiles of that size. The precision will be used to encode the search geo point into a geohash tile for completion matching. A consequence of this is that points outside that tile, even if very close to the search point, will not be matched. Reducing the precision, or increasing the distance, can reduce the risk of this happening, but not entirely remove it.
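One way to soften that tile-edge effect is the neighbours parameter described above, which also considers adjacent geohash tiles. A sketch follows; the coordinates and precision values are illustrative:

# Assumes "client" is an elasticsearch.Elasticsearch instance,
# as in the surrounding examples.
resp = client.search(
    index="place",
    suggest={
        "place_suggestion": {
            "prefix": "tim",
            "completion": {
                "field": "suggest",
                "contexts": {
                    "location": [
                        {
                            "context": {"lat": 43.662, "lon": -79.38},
                            "precision": 4,
                            # Also match suggestions indexed in the geohash
                            # tiles adjacent to the query tile at precision 4.
                            "neighbours": [4],
                        }
                    ]
                },
            },
        }
    },
)
print(resp)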
Returning the type of the suggester
Sometimes you need to know the exact type of a suggester in order to parse its results. The typed_keys parameter can be used to change the suggester's name in the response so that it will be prefixed by its type.
Consider the following example with two suggesters, term and phrase:
resp = client.search(
    typed_keys=True,
    suggest={
        "text": "some test mssage",
        "my-first-suggester": {"term": {"field": "message"}},
        "my-second-suggester": {"phrase": {"field": "message"}},
    },
)
print(resp)
response = client.search( typed_keys: true, body: { suggest: { text: 'some test mssage', "my-first-suggester": { term: { field: 'message' } }, "my-second-suggester": { phrase: { field: 'message' } } } } )
puts response
const response = await client.search({ typed_keys: "true", suggest: { text: "some test mssage", "my-first-suggester": { term: { field: "message", }, }, "my-second-suggester": { phrase: { field: "message", }, }, }, }); console.log(response);
POST _search?typed_keys
{
  "suggest": {
    "text": "some test mssage",
    "my-first-suggester": {
      "term": {
        "field": "message"
      }
    },
    "my-second-suggester": {
      "phrase": {
        "field": "message"
      }
    }
  }
}
In the response, the suggester names will be changed to term#my-first-suggester and phrase#my-second-suggester respectively, reflecting the types of each suggestion:
{ "suggest": { "term#my-first-suggester": [ { "text": "some", "offset": 0, "length": 4, "options": [] }, { "text": "test", "offset": 5, "length": 4, "options": [] }, { "text": "mssage", "offset": 10, "length": 6, "options": [ { "text": "message", "score": 0.8333333, "freq": 4 } ] } ], "phrase#my-second-suggester": [ { "text": "some test mssage", "offset": 0, "length": 16, "options": [ { "text": "some test message", "score": 0.030227963 } ] } ] }, ... }