Aggregate Functions
Functions for computing a single result from a set of input values. Elasticsearch SQL supports aggregate functions only alongside grouping (implicit or explicit).
General Purpose
AVG
Synopsis: AVG(numeric_field)
Input: a numeric field. If this field contains only null values, the function returns null. Otherwise, the function ignores null values in this field.
Output: double numeric value
Description: Returns the average (arithmetic mean) of input values.

SELECT AVG(salary) AS avg FROM emp;

      avg
---------------
48248.55

SELECT AVG(salary / 12.0) AS avg FROM emp;

      avg
---------------
4020.7125
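The null handling this page describes for numeric aggregates (a field holding only null values yields null; otherwise null values are ignored) can be sketched in plain Python, with None standing in for SQL null. This only illustrates the documented semantics and is not how Elasticsearch evaluates the query; the helper name avg is hypothetical.

```python
def avg(values):
    """Arithmetic mean of the non-null values; None when every value is null."""
    present = [v for v in values if v is not None]  # null values are ignored
    if not present:
        return None  # a field containing only nulls yields null
    return sum(present) / len(present)


print(avg([30000, None, 60000]))  # 45000.0: the null does not count toward the divisor
print(avg([None, None]))          # None
```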
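COUNT
Synopsis: COUNT(expression)
Input: a field name, wildcard (*) or any numeric value. For COUNT(*) or COUNT(<literal>), all values are considered, including null or missing ones. For COUNT(<field_name>), null values are not considered.
Output: numeric value
Description: Returns the total number (count) of input values.

SELECT COUNT(*) AS count FROM emp;

     count
---------------
100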
COUNT(ALL)
Synopsis: COUNT(ALL field_name)
Input: a field name. If this field contains only null values, the function returns null. Otherwise, the function ignores null values in this field.
Output: numeric value
Description: Returns the total number (count) of all non-null input values. COUNT(<field_name>) and COUNT(ALL <field_name>) are equivalent.

SELECT COUNT(ALL last_name) AS count_all, COUNT(DISTINCT last_name) count_distinct FROM emp;

   count_all   |  count_distinct
---------------+------------------
100            |96

SELECT COUNT(ALL CASE WHEN languages IS NULL THEN -1 ELSE languages END) AS count_all,
       COUNT(DISTINCT CASE WHEN languages IS NULL THEN -1 ELSE languages END) count_distinct
FROM emp;

   count_all   | count_distinct
---------------+---------------
100            |6
COUNT(DISTINCT)
Synopsis: COUNT(DISTINCT field_name)
Input: a field name
Output: numeric value. If this field contains only null values, the function returns null. Otherwise, the function ignores null values in this field.
Description: Returns the total number of distinct non-null values in input values.

SELECT COUNT(DISTINCT hire_date) unique_hires, COUNT(hire_date) AS hires FROM emp;

  unique_hires  |     hires
----------------+---------------
99              |100

SELECT COUNT(DISTINCT DATE_TRUNC('YEAR', hire_date)) unique_hires, COUNT(DATE_TRUNC('YEAR', hire_date)) AS hires FROM emp;

 unique_hires  |     hires
---------------+---------------
14             |100
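As a rough illustration of how the three COUNT variants differ, the following plain-Python sketch counts over a small made-up data set, with None standing in for SQL null. The rows and helper names are hypothetical, and the case where every value is null is left out of the sketch.

```python
# Made-up rows for illustration; None stands in for SQL null.
rows = [
    {"last_name": "Smith"},
    {"last_name": None},
    {"last_name": "Smith"},
    {"last_name": "Jones"},
]


def count_star(rows):
    """COUNT(*): every row counts, including rows holding nulls."""
    return len(rows)


def count_all(rows, field):
    """COUNT(field) and COUNT(ALL field): only non-null values count."""
    return sum(1 for row in rows if row[field] is not None)


def count_distinct(rows, field):
    """COUNT(DISTINCT field): distinct non-null values."""
    return len({row[field] for row in rows if row[field] is not None})


print(count_star(rows), count_all(rows, "last_name"), count_distinct(rows, "last_name"))  # 4 3 2
```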
FIRST/FIRST_VALUE
Synopsis: FIRST(field_name[, ordering_field_name])
Input: field_name, the target field for the aggregation; ordering_field_name, an optional field used for ordering
Output: same type as the input
Description: Returns the first non-null value (if such exists) of the field_name input column sorted by the ordering_field_name column. If ordering_field_name is not provided, only the field_name column is used for the sorting. For example, given the table t:

a | b |
---|---|
100 | 1 |
200 | 1 |
1 | 2 |
2 | 2 |
10 | null |
20 | null |
null | null |

SELECT FIRST(a) FROM t

will result in:

FIRST(a) |
1 |

and

SELECT FIRST(a, b) FROM t

will result in:

FIRST(a, b) |
100 |

SELECT FIRST(first_name) FROM emp;

 FIRST(first_name)
--------------------
Alejandro

SELECT gender, FIRST(first_name) FROM emp GROUP BY gender ORDER BY gender;

   gender   | FIRST(first_name)
------------+--------------------
null        | Berni
F           | Alejandro
M           | Amabile

SELECT FIRST(first_name, birth_date) FROM emp;

 FIRST(first_name, birth_date)
--------------------------------
Remzi

SELECT gender, FIRST(first_name, birth_date) FROM emp GROUP BY gender ORDER BY gender;

    gender    | FIRST(first_name, birth_date)
--------------+--------------------------------
null          | Lillian
F             | Sumant
M             | Remzi

FIRST_VALUE is a name alias and can be used instead of FIRST, e.g.:

SELECT gender, FIRST_VALUE(first_name, birth_date) FROM emp GROUP BY gender ORDER BY gender;

    gender    | FIRST_VALUE(first_name, birth_date)
--------------+--------------------------------------
null          | Lillian
F             | Sumant
M             | Remzi

SELECT gender, FIRST_VALUE(SUBSTRING(first_name, 2, 6), birth_date) AS "first" FROM emp GROUP BY gender ORDER BY gender;

    gender     |     first
---------------+---------------
null           |illian
F              |umant
M              |emzi

FIRST cannot be used in a HAVING clause.
FIRST cannot be used with columns of type text unless the field is also saved as a keyword.
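The two results for table t above can be reproduced with a small plain-Python sketch. It is an illustration of the described ordering only, not Elasticsearch's implementation: the helper name first is hypothetical, and the handling of null ordering values (sorted last) and of ties on b (broken by a) are assumptions chosen to agree with the example.

```python
# Table t from the example above as (a, b) pairs; None stands in for SQL null.
t = [(100, 1), (200, 1), (1, 2), (2, 2), (10, None), (20, None), (None, None)]


def first(rows, ordered=False):
    """Sketch of FIRST(a) when ordered is False, FIRST(a, b) when True."""
    candidates = [(a, b) for a, b in rows if a is not None]  # skip null targets
    if not candidates:
        return None
    if ordered:
        # Assumption: null ordering values sort last; ties on b fall back to a.
        key = lambda r: (r[1] is None, r[1] if r[1] is not None else 0, r[0])
    else:
        key = lambda r: r[0]  # no ordering column: sort on the target itself
    return min(candidates, key=key)[0]


print(first(t))                # 1
print(first(t, ordered=True))  # 100
```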
LAST/LAST_VALUE
Synopsis: LAST(field_name[, ordering_field_name])
Input: field_name, the target field for the aggregation; ordering_field_name, an optional field used for ordering
Output: same type as the input
Description: The inverse of FIRST/FIRST_VALUE. Returns the last non-null value (if such exists) of the field_name input column sorted descending by the ordering_field_name column. If ordering_field_name is not provided, only the field_name column is used for the sorting. For example, given the table t:

a | b |
---|---|
10 | 1 |
20 | 1 |
1 | 2 |
2 | 2 |
100 | null |
200 | null |
null | null |

SELECT LAST(a) FROM t

will result in:

LAST(a) |
200 |

and

SELECT LAST(a, b) FROM t

will result in:

LAST(a, b) |
2 |

SELECT LAST(first_name) FROM emp;

 LAST(first_name)
-------------------
Zvonko

SELECT gender, LAST(first_name) FROM emp GROUP BY gender ORDER BY gender;

   gender   | LAST(first_name)
------------+-------------------
null        | Patricio
F           | Xinglin
M           | Zvonko

SELECT LAST(first_name, birth_date) FROM emp;

 LAST(first_name, birth_date)
-------------------------------
Hilari

SELECT gender, LAST(first_name, birth_date) FROM emp GROUP BY gender ORDER BY gender;

  gender   | LAST(first_name, birth_date)
-----------+-------------------------------
null       | Eberhardt
F          | Valdiodio
M          | Hilari

LAST_VALUE is a name alias and can be used instead of LAST, e.g.:

SELECT gender, LAST_VALUE(first_name, birth_date) FROM emp GROUP BY gender ORDER BY gender;

  gender   | LAST_VALUE(first_name, birth_date)
-----------+-------------------------------------
null       | Eberhardt
F          | Valdiodio
M          | Hilari

SELECT gender, LAST_VALUE(SUBSTRING(first_name, 3, 8), birth_date) AS "last" FROM emp GROUP BY gender ORDER BY gender;

    gender     |     last
---------------+---------------
null           |erhardt
F              |ldiodio
M              |lari

LAST cannot be used in a HAVING clause.
LAST cannot be used with columns of type text unless the field is also saved as a keyword.
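A matching plain-Python sketch for LAST reproduces the two results for the LAST table t above. As before, this is illustrative only: the helper name last is hypothetical, and ranking rows with a null ordering value lowest, with ties on b broken by a, is an assumption that happens to agree with the example.

```python
# Table t from the LAST example above as (a, b) pairs; None stands in for SQL null.
t = [(10, 1), (20, 1), (1, 2), (2, 2), (100, None), (200, None), (None, None)]


def last(rows, ordered=False):
    """Sketch of LAST(a) when ordered is False, LAST(a, b) when True."""
    candidates = [(a, b) for a, b in rows if a is not None]  # skip null targets
    if not candidates:
        return None
    if ordered:
        # Assumption: rows with a null ordering value rank lowest; ties on b fall back to a.
        key = lambda r: (r[1] is not None, r[1] if r[1] is not None else 0, r[0])
    else:
        key = lambda r: r[0]
    return max(candidates, key=key)[0]


print(last(t))                # 200
print(last(t, ordered=True))  # 2
```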
MAX
Synopsis: MAX(field_name)
Input: a numeric field. If this field contains only null values, the function returns null. Otherwise, the function ignores null values in this field.
Output: same type as the input
Description: Returns the maximum value across input values in the field field_name.

SELECT MAX(salary) AS max FROM emp;

      max
---------------
74999

SELECT MAX(ABS(salary / -12.0)) AS max FROM emp;

       max
-----------------
6249.916666666667

MAX on a field of type text or keyword is translated into LAST/LAST_VALUE and therefore cannot be used in a HAVING clause.
MIN
Synopsis: MIN(field_name)
Input: a numeric field. If this field contains only null values, the function returns null. Otherwise, the function ignores null values in this field.
Output: same type as the input
Description: Returns the minimum value across input values in the field field_name.

SELECT MIN(salary) AS min FROM emp;

      min
---------------
25324

MIN on a field of type text or keyword is translated into FIRST/FIRST_VALUE and therefore cannot be used in a HAVING clause.
SUM
Synopsis: SUM(field_name)
Input: a numeric field. If this field contains only null values, the function returns null. Otherwise, the function ignores null values in this field.
Output: bigint for integer input, double for floating points
Description: Returns the sum of input values in the field field_name.

SELECT SUM(salary) AS sum FROM emp;

      sum
---------------
4824855

SELECT ROUND(SUM(salary / 12.0), 1) AS sum FROM emp;

      sum
---------------
402071.3
Statistics
KURTOSIS
Synopsis: KURTOSIS(field_name)
Input: a numeric field. If this field contains only null values, the function returns null. Otherwise, the function ignores null values in this field.
Output: double numeric value
Description: Quantifies the shape of the distribution of input values in the field field_name.

SELECT MIN(salary) AS min, MAX(salary) AS max, KURTOSIS(salary) AS k FROM emp;

      min      |      max      |        k
---------------+---------------+------------------
25324          |74999          |2.0444718929142986

KURTOSIS cannot be used on top of scalar functions or operators, only directly on a field. For example, the following is not allowed and returns an error:

SELECT KURTOSIS(salary / 12.0), gender FROM emp GROUP BY gender
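To make "shape of the distribution" concrete, here is a plain-Python sketch of one common moment-based definition of kurtosis: the fourth central moment divided by the squared variance (not the "excess" variant that subtracts 3). The page does not state which estimator Elasticsearch uses, so treat the formula and the helper name kurtosis as assumptions.

```python
def kurtosis(values):
    """Fourth central moment over squared variance; None if undefined."""
    xs = [v for v in values if v is not None]  # null values are ignored
    if not xs:
        return None
    n = len(xs)
    mean = sum(xs) / n
    m2 = sum((x - mean) ** 2 for x in xs) / n  # population variance
    m4 = sum((x - mean) ** 4 for x in xs) / n  # fourth central moment
    if m2 == 0:
        return None  # all values identical: the ratio is undefined
    return m4 / m2 ** 2


print(kurtosis([1, 2, 3, 4, 5]))  # roughly 1.7 for evenly spaced values
```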
MAD
Synopsis: MAD(field_name)
Input: a numeric field. If this field contains only null values, the function returns null. Otherwise, the function ignores null values in this field.
Output: double numeric value
Description: Measures the variability of the input values in the field field_name.

SELECT MIN(salary) AS min, MAX(salary) AS max, AVG(salary) AS avg, MAD(salary) AS mad FROM emp;

      min      |      max      |      avg      |      mad
---------------+---------------+---------------+---------------
25324          |74999          |48248.55       |10096.5

SELECT MIN(salary / 12.0) AS min, MAX(salary / 12.0) AS max, AVG(salary/ 12.0) AS avg, MAD(salary / 12.0) AS mad FROM emp;

       min        |       max       |      avg      |       mad
------------------+-----------------+---------------+-----------------
2110.3333333333335|6249.916666666667|4020.7125      |841.3750000000002
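Assuming MAD here stands for median absolute deviation (the median of each value's absolute distance from the overall median), the measure can be sketched exactly in plain Python. The helper name mad is hypothetical, and Elasticsearch may compute the figure differently on large data sets. The last line checks that dividing the input by 12 divides the result by 12, which is the relation between the two MAD values in the examples above.

```python
import math
from statistics import median


def mad(values):
    """Median absolute deviation of the non-null values; None if there are none."""
    xs = [v for v in values if v is not None]
    if not xs:
        return None
    center = median(xs)
    return median(abs(x - center) for x in xs)


print(mad([1, 2, 3, 4, 100]))  # 1: the outlier barely moves the result

# Scaling the input scales the result, as in the two example queries above.
print(math.isclose(10096.5 / 12, 841.3750000000002))  # True
```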
PERCENTILE
Synopsis: PERCENTILE(field_name, numeric_exp[, method[, method_parameter]])
Input:
1. a numeric field. If this field contains only null values, the function returns null. Otherwise, the function ignores null values in this field.
2. a numeric expression (must be a constant and not based on a field). If null, the function returns null.
3. optional string literal for the percentile algorithm. Possible values: tdigest or hdr. Defaults to tdigest.
4. optional numeric literal that configures the percentile algorithm. Configures compression for tdigest or number_of_significant_value_digits for hdr. The default is the same as that of the backing algorithm.
Output: double numeric value
Description: Returns the nth percentile (represented by the numeric_exp parameter) of input values in the field field_name.

SELECT languages, PERCENTILE(salary, 95) AS "95th" FROM emp GROUP BY languages;

   languages   |      95th
---------------+-----------------
null           |74482.4
1              |71122.8
2              |70271.4
3              |71926.0
4              |69352.15
5              |56371.0

SELECT languages, PERCENTILE(salary / 12.0, 95) AS "95th" FROM emp GROUP BY languages;

   languages   |       95th
---------------+------------------
null           |6206.866666666667
1              |5926.9
2              |5855.949999999999
3              |5993.833333333333
4              |5779.345833333333
5              |4697.583333333333

SELECT
    languages,
    PERCENTILE(salary, 97.3, 'tdigest', 100.0) AS "97.3_TDigest",
    PERCENTILE(salary, 97.3, 'hdr', 3) AS "97.3_HDR"
FROM emp
GROUP BY languages;

   languages   |  97.3_TDigest   |   97.3_HDR
---------------+-----------------+---------------
null           |74720.036        |74992.0
1              |72316.132        |73712.0
2              |71792.436        |69936.0
3              |73326.23999999999|74992.0
4              |71753.281        |74608.0
5              |61176.16000000001|56368.0
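For intuition about what a percentile is, the sketch below computes an exact percentile in plain Python using linear interpolation between the two closest ranks. This is a swapped-in textbook method, not what Elasticsearch does: the examples above select the 'tdigest' and 'hdr' algorithms, so real results can differ from this calculation. The helper name percentile is hypothetical.

```python
def percentile(values, p):
    """Exact p-th percentile (0-100) with linear interpolation; None if no data."""
    xs = sorted(v for v in values if v is not None)  # null values are ignored
    if not xs:
        return None
    rank = (p / 100) * (len(xs) - 1)  # fractional position in the sorted list
    lo = int(rank)                    # rank is non-negative, so int() floors it
    hi = min(lo + 1, len(xs) - 1)
    frac = rank - lo
    return xs[lo] + (xs[hi] - xs[lo]) * frac


salaries = [10, 20, 30, 40, 50]  # made-up values
print(percentile(salaries, 50))   # 30.0, the median
print(percentile(salaries, 100))  # 50.0, the maximum
```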
PERCENTILE_RANK
Synopsis: PERCENTILE_RANK(field_name, numeric_exp[, method[, method_parameter]])
Input:
1. a numeric field. If this field contains only null values, the function returns null. Otherwise, the function ignores null values in this field.
2. a numeric expression (must be a constant and not based on a field). If null, the function returns null.
3. optional string literal for the percentile algorithm. Possible values: tdigest or hdr. Defaults to tdigest.
4. optional numeric literal that configures the percentile algorithm. Configures compression for tdigest or number_of_significant_value_digits for hdr. The default is the same as that of the backing algorithm.
Output: double numeric value
Description: Returns the nth percentile rank (represented by the numeric_exp parameter) of input values in the field field_name.

SELECT languages, PERCENTILE_RANK(salary, 65000) AS rank FROM emp GROUP BY languages;

   languages   |      rank
---------------+-----------------
null           |73.65766569962062
1              |73.7291625157734
2              |88.88005607010643
3              |79.43662623295829
4              |85.70446389643493
5              |96.79075152940749

SELECT languages, PERCENTILE_RANK(salary/12, 5000) AS rank FROM emp GROUP BY languages;

   languages   |       rank
---------------+------------------
null           |66.91240875912409
1              |66.70766707667076
2              |84.13266895048271
3              |61.052992625621684
4              |76.55646443990001
5              |94.00696864111498

SELECT
    languages,
    ROUND(PERCENTILE_RANK(salary, 65000, 'tdigest', 100.0), 2) AS "rank_TDigest",
    ROUND(PERCENTILE_RANK(salary, 65000, 'hdr', 3), 2) AS "rank_HDR"
FROM emp
GROUP BY languages;

   languages   | rank_TDigest  |   rank_HDR
---------------+---------------+---------------
null           |73.66          |80.0
1              |73.73          |73.33
2              |88.88          |89.47
3              |79.44          |76.47
4              |85.7           |83.33
5              |96.79          |95.24
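A percentile rank answers the reverse question of a percentile: what percentage of the values fall at or below a given threshold. The plain-Python sketch below uses that simple counting definition; it is an assumption for illustration, and the fractional results in the examples above suggest Elasticsearch's algorithms interpolate rather than count. The helper name percentile_rank and the data are made up.

```python
def percentile_rank(values, threshold):
    """Percentage of non-null values that are <= threshold; None if no data."""
    xs = [v for v in values if v is not None]
    if not xs:
        return None
    return 100.0 * sum(1 for x in xs if x <= threshold) / len(xs)


salaries = [25000, 40000, 65000, 70000]  # made-up values
print(percentile_rank(salaries, 65000))  # 75.0: three of the four values qualify
```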
SKEWNESS
Synopsis: SKEWNESS(field_name)
Input: a numeric field. If this field contains only null values, the function returns null. Otherwise, the function ignores null values in this field.
Output: double numeric value
Description: Quantifies the asymmetry of the distribution of input values in the field field_name.

SELECT MIN(salary) AS min, MAX(salary) AS max, SKEWNESS(salary) AS s FROM emp;

      min      |      max      |        s
---------------+---------------+------------------
25324          |74999          |0.2707722118423227

SKEWNESS cannot be used on top of scalar functions, only directly on a field. For example, the following is not allowed and returns an error:

SELECT SKEWNESS(ROUND(salary / 12.0, 2)), gender FROM emp GROUP BY gender
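As a rough illustration of what skewness measures, the plain-Python sketch below uses one common moment-based definition: the third central moment divided by the variance raised to the power 1.5. The page does not say which estimator Elasticsearch uses, so the formula and the helper name skewness are assumptions. A symmetric data set gives 0, and a long right tail gives a positive value.

```python
def skewness(values):
    """Third central moment over variance**1.5; None if undefined."""
    xs = [v for v in values if v is not None]  # null values are ignored
    if not xs:
        return None
    n = len(xs)
    mean = sum(xs) / n
    m2 = sum((x - mean) ** 2 for x in xs) / n  # population variance
    m3 = sum((x - mean) ** 3 for x in xs) / n  # third central moment
    if m2 == 0:
        return None  # all values identical: the ratio is undefined
    return m3 / m2 ** 1.5


print(skewness([1, 2, 3, 4, 5]))   # 0.0: symmetric around the mean
print(skewness([1, 2, 3, 4, 10]))  # positive: the tail extends to the right
```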
STDDEV_POP
Synopsis: STDDEV_POP(field_name)
Input: a numeric field. If this field contains only null values, the function returns null. Otherwise, the function ignores null values in this field.
Output: double numeric value
Description: Returns the population standard deviation of input values in the field field_name.

SELECT MIN(salary) AS min, MAX(salary) AS max, STDDEV_POP(salary) AS stddev FROM emp;

      min      |      max      |      stddev
---------------+---------------+------------------
25324          |74999          |13765.125502787832

SELECT MIN(salary / 12.0) AS min, MAX(salary / 12.0) AS max, STDDEV_POP(salary / 12.0) AS stddev FROM emp;

       min        |       max       |     stddev
------------------+-----------------+-----------------
2110.3333333333335|6249.916666666667|1147.093791898986
STDDEV_SAMP
Synopsis: STDDEV_SAMP(field_name)
Input: a numeric field. If this field contains only null values, the function returns null. Otherwise, the function ignores null values in this field.
Output: double numeric value
Description: Returns the sample standard deviation of input values in the field field_name.

SELECT MIN(salary) AS min, MAX(salary) AS max, STDDEV_SAMP(salary) AS stddev FROM emp;

      min      |      max      |      stddev
---------------+---------------+------------------
25324          |74999          |13834.471662090747

SELECT MIN(salary / 12.0) AS min, MAX(salary / 12.0) AS max, STDDEV_SAMP(salary / 12.0) AS stddev FROM emp;

       min        |       max       |     stddev
------------------+-----------------+-----------------
2110.3333333333335|6249.916666666667|1152.872638507562
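The difference between the population and sample standard deviation is only the divisor: n for the population form, n - 1 for the sample form. The sketch below shows this with Python's standard statistics module on made-up data, then checks that the same factor links the STDDEV_POP and STDDEV_SAMP results for salary shown above, assuming all 100 rows that COUNT(*) reports for emp carry a salary.

```python
import math
from statistics import pstdev, stdev

data = [2, 4, 4, 4, 5, 5, 7, 9]  # made-up values
n = len(data)

pop = pstdev(data)   # divides the summed squared deviations by n
samp = stdev(data)   # divides by n - 1 instead
print(pop)                                              # 2.0
print(math.isclose(samp, pop * math.sqrt(n / (n - 1))))  # True

# Same relation applied to the two salary results quoted above (n = 100 assumed).
doc_pop, doc_samp, rows = 13765.125502787832, 13834.471662090747, 100
print(math.isclose(doc_samp, doc_pop * math.sqrt(rows / (rows - 1)), rel_tol=1e-9))  # True
```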
SUM_OF_SQUARES
Synopsis: SUM_OF_SQUARES(field_name)
Input: a numeric field. If this field contains only null values, the function returns null. Otherwise, the function ignores null values in this field.
Output: double numeric value
Description: Returns the sum of squares of input values in the field field_name.

SELECT MIN(salary) AS min, MAX(salary) AS max, SUM_OF_SQUARES(salary) AS sumsq FROM emp;

      min      |      max      |     sumsq
---------------+---------------+----------------
25324          |74999          |2.51740125721E11

SELECT MIN(salary / 24.0) AS min, MAX(salary / 24.0) AS max, SUM_OF_SQUARES(salary / 24.0) AS sumsq FROM emp;

       min        |       max        |       sumsq
------------------+------------------+-------------------
1055.1666666666667|3124.9583333333335|4.370488293767361E8
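One reason a sum of squares is useful is that, together with the sum and the count, it determines the population variance: sumsq / n - mean². The plain-Python sketch below shows the aggregate itself on made-up data, then plugs in figures quoted on this page (SUM_OF_SQUARES(salary), SUM(salary) = 4824855, COUNT(*) = 100) and compares the result with the VAR_POP(salary) value 1.894786801075E8. The helper name sum_of_squares is hypothetical, and the check assumes every row has a salary.

```python
import math


def sum_of_squares(values):
    """Sum of the squared non-null values."""
    return sum(v * v for v in values if v is not None)


print(sum_of_squares([1, 2, None, 3]))  # 14

# Population variance recovered from the aggregates quoted on this page.
sumsq, total, n = 2.51740125721e11, 4824855, 100
var_pop = sumsq / n - (total / n) ** 2
print(math.isclose(var_pop, 1.894786801075e8, rel_tol=1e-9))  # True
```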
VAR_POP
Synopsis: VAR_POP(field_name)
Input: a numeric field. If this field contains only null values, the function returns null. Otherwise, the function ignores null values in this field.
Output: double numeric value
Description: Returns the population variance of input values in the field field_name.

SELECT MIN(salary) AS min, MAX(salary) AS max, VAR_POP(salary) AS varpop FROM emp;

      min      |      max      |     varpop
---------------+---------------+----------------
25324          |74999          |1.894786801075E8

SELECT MIN(salary / 24.0) AS min, MAX(salary / 24.0) AS max, VAR_POP(salary / 24.0) AS varpop FROM emp;

       min        |       max        |      varpop
------------------+------------------+------------------
1055.1666666666667|3124.9583333333335|328956.04185329855
VAR_SAMP
Synopsis: VAR_SAMP(field_name)
Input: a numeric field. If this field contains only null values, the function returns null. Otherwise, the function ignores null values in this field.
Output: double numeric value
Description: Returns the sample variance of input values in the field field_name.

SELECT MIN(salary) AS min, MAX(salary) AS max, VAR_SAMP(salary) AS varsamp FROM emp;

      min      |      max      |    varsamp
---------------+---------------+----------------
25324          |74999          |1.913926061691E8

SELECT MIN(salary / 24.0) AS min, MAX(salary / 24.0) AS max, VAR_SAMP(salary / 24.0) AS varsamp FROM emp;

       min        |       max        |    varsamp
------------------+------------------+----------------
1055.1666666666667|3124.9583333333335|332278.830154847
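Variance scales with the square of any constant factor applied to the input, which is why the salary / 24.0 examples above return values 24² = 576 times smaller than the plain salary examples. The sketch below demonstrates the rule with Python's standard statistics module on made-up data and then checks it against the VAR_POP and VAR_SAMP figures quoted above.

```python
import math
from statistics import pvariance, variance

data = [2, 4, 4, 4, 5, 5, 7, 9]       # made-up values
scaled = [x / 24.0 for x in data]

# Dividing every value by 24 divides both variances by 24 squared.
print(math.isclose(pvariance(scaled), pvariance(data) / 24.0 ** 2))  # True
print(math.isclose(variance(scaled), variance(data) / 24.0 ** 2))    # True

# Same rule applied to the results quoted on this page.
print(math.isclose(1.894786801075e8 / 576, 328956.04185329855, rel_tol=1e-9))  # True
print(math.isclose(1.913926061691e8 / 576, 332278.830154847, rel_tol=1e-9))    # True
```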