Creating text analysis plugins with the stable plugin API
editCreating text analysis plugins with the stable plugin API
editText analysis plugins provide Elasticsearch with custom Lucene analyzers, token filters, character filters, and tokenizers.
The stable plugin API
editText analysis plugins can be developed against the stable plugin API. This API consists of the following dependencies:
-
plugin-api
- an API used by plugin developers to implement custom Elasticsearch plugins. -
plugin-analysis-api
- an API used by plugin developers to implement analysis plugins and integrate them into Elasticsearch. -
lucene-analysis-common
- a dependency ofplugin-analysis-api
that contains core Lucene analysis interfaces likeTokenizer
,Analyzer
, andTokenStream
.
For new versions of Elasticsearch within the same major version, plugins built against this API do not need to be recompiled. Future versions of the API will be backwards compatible and plugins are binary compatible with future versions of Elasticsearch. In other words, once you have a working artifact, you can re-use it when you upgrade Elasticsearch to a new bugfix or minor version.
A text analysis plugin can implement four factory classes that are provided by the analysis plugin API.
-
AnalyzerFactory
to create a Lucene analyzer -
CharFilterFactory
to create a character character filter -
TokenFilterFactory
to create a Lucene token filter -
TokenizerFactory
to create a Lucene tokenizer
The key to implementing a stable plugin is the @NamedComponent
annotation.
Many of Elasticsearch’s components have names that are used in configurations. For
example, the keyword analyzer is referenced in configuration with the name
"keyword"
. Once your custom plugin is installed in your cluster, your named
components may be referenced by name in these configurations as well.
You can also create text analysis plugins as a classic plugin. However, classic plugins are pinned to a specific version of Elasticsearch. You need to recompile them when upgrading Elasticsearch. Because classic plugins are built against internal APIs that can change, upgrading to a new version may require code changes.
Stable plugin file structure
editStable plugins are ZIP files composed of JAR files and two metadata files:
-
stable-plugin-descriptor.properties
- a Java properties file that describes the plugin. Refer to The plugin descriptor file for stable plugins. -
named_components.json
- a JSON file mapping interfaces to key-value pairs of component names and implementation classes.
Note that only JAR files at the root of the plugin are added to the classpath for the plugin. If you need other resources, package them into a resources JAR.
Development process
editElastic provides a Grade plugin, elasticsearch.stable-esplugin
, that makes it
easier to develop and package stable plugins. The steps in this section assume
you use this plugin. However, you don’t need Gradle to create plugins.
The Elasticsearch Github repository contains
an example analysis plugin.
The example build.gradle
build script provides a good starting point for
developing your own plugin.
Prerequisites
editPlugins are written in Java, so you need to install a Java Development Kit (JDK). Install Gradle if you want to use Gradle.
Step by step
edit- Create a directory for your project.
-
Copy the example
build.gradle
build script to your project directory. Note that this build script uses theelasticsearch.stable-esplugin
gradle plugin to build your plugin. -
Edit the
build.gradle
build script:-
Add a definition for the
pluginApiVersion
and matchingluceneVersion
variables to the top of the file. You can find these versions in thebuild-tools-internal/version.properties
file in the Elasticsearch Github repository. -
Edit the
name
anddescription
in theesplugin
section of the build script. This will create the plugin descriptor file. If you’re not using theelasticsearch.stable-esplugin
gradle plugin, refer to The plugin descriptor file for stable plugins to create the file manually. - Add module information.
-
Ensure you have declared the following compile-time dependencies. These dependencies are compile-time only because Elasticsearch will provide these libraries at runtime.
-
org.elasticsearch.plugin:elasticsearch-plugin-api
-
org.elasticsearch.plugin:elasticsearch-plugin-analysis-api
-
org.apache.lucene:lucene-analysis-common
-
-
For unit testing, ensure these dependencies have also been added to the
build.gradle
script astestImplementation
dependencies.
-
Add a definition for the
-
Implement an interface from the analysis plugin API, annotating it with
NamedComponent
. Refer to Example text analysis plugin for an example. -
You should now be able to assemble a plugin ZIP file by running:
gradle bundlePlugin
The resulting plugin ZIP file is written to the
build/distributions
directory.
YAML REST tests
editThe Gradle elasticsearch.yaml-rest-test
plugin enables testing of your
plugin using the Elasticsearch yamlRestTest framework.
These tests use a YAML-formatted domain language to issue REST requests against
an internal Elasticsearch cluster that has your plugin installed, and to check the
results of those requests. The structure of a YAML REST test directory is as
follows:
-
A test suite class, defined under
src/yamlRestTest/java
. This class should extendESClientYamlSuiteTestCase
. -
The YAML tests themselves should be defined under
src/yamlRestTest/resources/test/
.