Common issues
This set of common symptoms and resolutions can help you diagnose unexpected behavior with Elastic Cloud Enterprise. You can also refer to the list of product Limitations and known problems.
Emergency token not spinning up the coordinator role
Symptom: You cannot access the API or the Cloud UI because all coordinators are lost, but more than half of the director hosts are still available. For example, if you have 5 directors, at least 3 directors must be available. If you have lost more than half of the directors, contact support. If all directors are lost, reinstall ECE.
Resolution: Use the emergency token provided during the installation of the genesis ECE nodes. You must explicitly specify the roles with the --roles parameter, for example --roles "coordinator,director,proxy". Otherwise, the host does not run any role.
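Because an empty or mistyped role list leaves the host without a working role, it can help to check the value before you pass it to --roles. The following is a minimal sketch, not part of the ECE tooling: validate_roles is a hypothetical helper name, and the four role names it accepts are an assumption about the roles your ECE version recognizes.

```shell
# Hypothetical helper: sanity-check a comma-separated role list
# before passing it to the installer's --roles parameter.
# The accepted role names are an assumption; adjust them if needed.
validate_roles() {
  roles="$1"
  if [ -z "$roles" ]; then
    echo "error: no roles given; the host would not run any role" >&2
    return 1
  fi
  old_ifs="$IFS"
  IFS=','
  for role in $roles; do
    case "$role" in
      allocator|coordinator|director|proxy) ;;
      *)
        IFS="$old_ifs"
        echo "error: unknown role '$role'" >&2
        return 1
        ;;
    esac
  done
  IFS="$old_ifs"
  return 0
}
```

If validate_roles "coordinator,director,proxy" succeeds, pass the same string to --roles.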
Allocator failures
Allocator failures result in alerts in the Cloud UI. You cannot repair an allocator in the Cloud UI, so you should first move all nodes to a new allocator. Vacate options for a selected allocator can help fine-tune the removal of nodes. After the allocator is vacated, investigate the cause of the failure separately, which could be a host machine failure, and replace any lost capacity.
Allocators not being used
Symptoms: You installed Elastic Cloud Enterprise on a new host and assigned it the allocator role from the command line with the --roles "allocator" parameter during installation, but new clusters are not being created on the allocator.
Resolution: The issue is caused by a token specified with the --roles-token 'TOKEN' parameter that does not have sufficient privileges to assign the role correctly. To resolve this issue, you might need to refresh the roles for the allocator.
- Follow the steps for assigning roles to hosts, but do not change any of the assigned roles. Select Update roles.
- Verify that the allocator is now being used when creating new Elasticsearch clusters or moving nodes off other allocators.
To avoid this issue on allocators you add in the future, generate a roles token that has permission to assign the allocator role. The token generated during the installation on the first host does not suffice.
Cloud UI login failures
Symptoms: When you attempt to log into the Cloud UI, the login process appears to hang and then fails.
Resolution: The administration console that supports the Cloud UI might be running out of Java heap space, causing login failures. This issue is expected to be fixed in a future release. As a workaround, you can manually increase the heap size.
In the current release, there is no direct way to change the Java heap size in the UI, so you need to increase the heap size as follows:
- For convenience, you can store the IP address of the host machine where the Cloud UI is running in the ADMIN_IP environment variable. Alternatively, replace $ADMIN_IP in the commands shown with the IP address.
- Create a file with your current configuration, here containerdata.json (requires that you have jq installed):
  curl -u admin http://$ADMIN_IP:12400/api/v1/platform/infrastructure/container-sets/admin-consoles/containers/admin-console > containerdata.json
  When prompted, enter the password you use to log into the Cloud UI.
- Filter the output file to a format that you can modify and push back into the system. If this step fails, do not push the file back into the system as this can prevent the admin console container from starting up.
  jq '{config: {env: .config.env}}' containerdata.json > containerdata_new.json
- Open the containerdata_new.json file in your favorite editor and locate this line:
  "ADMINCONSOLE_JAVA_OPTIONS=-Djute.maxbuffer=33554432 -Xmx256M -Xms256M",
- Increase the Java heap size by changing the values for Xmx and Xms to 1024 or 4096, depending on the size of your host machine. For example, change the values to -Xmx1024M -Xms1024M.
- Save the configuration and exit the editor.
- Apply the new configuration:
  curl -H 'Content-Type: application/json' -XPATCH -u admin http://$ADMIN_IP:12400/api/v1/platform/infrastructure/container-sets/admin-consoles/containers/admin-console -d @containerdata_new.json
- On the host machine where the Cloud UI administration console is running, recreate the Cloud UI:
  docker stop frc-admin-consoles-admin-console && docker rm -f frc-admin-consoles-admin-console
  If you prefer, you can use the HTTPS protocol on port 12443, but it currently supports only a self-signed certificate. Alternatively, you can perform the step from localhost.
- Log into the Cloud UI administration console to confirm that the issue is resolved.
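If you would rather not edit containerdata_new.json by hand, the heap change can be scripted. The sketch below is one possible approach, assuming the file contains the ADMINCONSOLE_JAVA_OPTIONS line in the form shown above; set_heap_size is a hypothetical helper name, not an ECE command. It writes the result to standard output and leaves the input file unchanged, so you can review the output before applying it.

```shell
# Hypothetical helper: print a copy of FILE with the -Xmx and -Xms
# values replaced by SIZE_MB megabytes. The input file is not modified.
# Usage: set_heap_size FILE SIZE_MB
set_heap_size() {
  file="$1"
  size_mb="$2"
  # Accept only a whole number of megabytes, for example 1024 or 4096.
  case "$size_mb" in
    ''|*[!0-9]*)
      echo "error: size must be a whole number of MB" >&2
      return 1
      ;;
  esac
  sed -E "s/-Xmx[0-9]+[MmGg]/-Xmx${size_mb}M/; s/-Xms[0-9]+[MmGg]/-Xms${size_mb}M/" "$file"
}
```

For example, set_heap_size containerdata_new.json 1024 > containerdata_1024.json produces a file (the output name is illustrative) that you can inspect and then use in the apply step.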
Cloud UI, Elasticsearch, and Kibana endpoint URLs inaccessible on AWS
Symptoms: When you attempt to log into the Cloud UI or when you attempt to connect to an Elasticsearch or Kibana endpoint URL, the connection eventually times out with an error. The error indicates that the host cannot be reached.
Resolution: On AWS, the default URLs provided might point to a private host IP address, which is not accessible externally. To resolve this issue, use a URL for the Cloud UI that is externally accessible and update your cluster endpoint to use a public IP address for Elasticsearch and Kibana.
This issue applies only to hosts running on AWS, where both public and private IP addresses are provided.
To check if you are affected and to resolve this issue:
- Compare the URL you are trying to reach to the host IP address information in the AWS EC2 Dashboard.
  For example, on an Elastic Cloud Enterprise installation, the following URLs might be provided by default:
  - Cloud UI: http://192.168.40.73:12400
  - Elasticsearch: https://e025c4xxxxxxxxxxxxx.192.168.40.73.ip.es.io:9243/
  - Kibana: https://1e2b57xxxxxxxxxxxxx.192.168.40.73.ip.es.io:9243/
  A quick check in the AWS EC2 Dashboard confirms that 192.168.40.73 is a private IP address, which is not accessible externally.
- To resolve this issue:
  - For the Cloud UI, use the public host name or public IP. In this example, the Cloud UI is accessible externally at ec2-54-162-168-86.compute-1.amazonaws.com:12400.
  - For Elasticsearch and Kibana, update your cluster endpoint to use the public IP address. In this example, you can use 54.162.168.86.
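As a quick first check before opening the EC2 Dashboard, you can test whether the IP address embedded in a URL falls in one of the private IPv4 ranges (10.0.0.0/8, 172.16.0.0/12, 192.168.0.0/16). The sketch below uses a hypothetical helper name, is_private_ipv4, and assumes its argument is already a well-formed dotted IPv4 address; it does not validate the format.

```shell
# Hypothetical helper: succeed if the given IPv4 address is in a
# private range. Assumes a well-formed dotted-quad address.
is_private_ipv4() {
  case "$1" in
    10.*|192.168.*) return 0 ;;
    172.1[6-9].*|172.2[0-9].*|172.3[01].*) return 0 ;;
    *) return 1 ;;
  esac
}

# Example with the addresses used on this page.
for ip in 192.168.40.73 54.162.168.86; do
  if is_private_ipv4 "$ip"; then
    echo "$ip is private"
  else
    echo "$ip is not in a private range"
  fi
done
```

An address that is reported as private cannot be reached from outside the AWS network, which matches the symptom described here.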
Upgrade failures
Symptoms: When upgrading Elastic Cloud Enterprise, the upgrade process indicates that a container upgrade has failed, followed by a rollback of all container upgrades and an error message that the upgrade process has failed. The information you get is similar to this output:
  ...
  - Runner [192.168.44.10]: container [proxies-proxy] status changed: [upgrade failed] at [2017-08-23T20:27:02.114Z]
  - Runner [192.168.44.10]: container [proxies-proxy] status changed: [rollback started] at [2017-08-23T20:27:07.134Z]
  ...
  - The upgrade of the {n} installation has failed, but we successfully rolled back changes.
  - Check that all services are running and that they are at the same version level with the docker ps command. Try the upgrade again.
  - The log files on each host might provide additional information about the cause of the failures. (path: HOST_STORAGE_PATH/logs/upgrader-logs).
  - You can find a backup of ZooKeeper's transaction log that was created before the upgrade in [/mnt/data/elastic/192.168.44.10/services/zookeeper/data/backup/20170823-202249/version-2].
  - Exiting upgrader
  - Removing upgrade container
Resolution: The Elastic Cloud Enterprise upgrade process is designed to be safe. If there is an issue with any part of the upgrade, the entire process is rolled back. In most cases, the rollback is automatic and the upgrade process can be reattempted after you fix the issue that caused the rollback.
The upgrade process can fail for several reasons, including:
- A host fails during the upgrade process, causing the frc-upgraders-monitor container to time out while it monitors the upgrade process.
- The ZooKeeper ensemble has trouble establishing a quorum after the upgrade, or the frc-upgraders-upgrader containers performing the upgrade on each host wait indefinitely for a ZooKeeper connection to report their upgrade status.
- An upgraded container does not keep running and the upgrade process determines that it is not viable.
To determine the root cause of an upgrade failure, the following logs are available, where HOST_STORAGE_PATH and RUNNER_ID are specific to your installation:
- HOST_STORAGE_PATH/logs/upgrader-logs/monitor.log
  Available on the host where you initiated the upgrade process. This log file can help you pinpoint the host where an upgrade issue occurred or where in the overall upgrade process a failure happened.
- HOST_STORAGE_PATH/logs/upgrader-logs/upgrader.log
  Available on every host that attempted the upgrade. This log file can tell you about the specific issues that caused the upgrade to fail on a host.
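To narrow down which host and container failed, you can filter the status messages. The sketch below extracts host and container pairs from lines in the format of the upgrade output shown earlier on this page. failed_upgrades is a hypothetical helper name, and whether monitor.log uses the same line format as the console output is an assumption you should verify against your own logs.

```shell
# Hypothetical helper: print "HOST CONTAINER" for every status line
# that reports [upgrade failed]. Reads the named files, or standard
# input if no file is given. Assumes one status message per line.
failed_upgrades() {
  sed -nE 's/.*Runner \[([^]]+)\]: container \[([^]]+)\] status changed: \[upgrade failed\].*/\1 \2/p' "$@"
}
```

For example, you could pipe saved upgrade output into failed_upgrades, or pass it the path to monitor.log if that file contains lines in this format.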
In rare cases, a manual rollback of the upgrade might be required. If you think this applies to you, refer to Getting help.