Troubleshoot Elastic Defend

This topic covers common troubleshooting issues when using Elastic Defend's endpoint management tools.

Endpoints

In some cases, an Unhealthy Elastic Agent status may be caused by a failure in the Elastic Defend integration policy. In this situation, the integration and any failing features are flagged on the agent details page in Fleet. Expand each section and subsection to display individual responses from the agent.

Tip

Integration policy response information is also available from the Endpoints page in the Elastic Security app (Assets → Endpoints, then click the link in the Policy status column).

Common causes of failure in the Elastic Defend integration policy include missing prerequisites or unexpected system configuration. Consult the following topics to resolve a specific error:

Tip

If the Elastic Defend integration policy is not the cause of the Unhealthy agent status, refer to Fleet troubleshooting for help with the Elastic Agent.

If you have an Unhealthy Elastic Agent status with the message "Disabled due to potential system deadlock", malware protection was disabled on the Elastic Defend integration policy because of errors while monitoring a Linux host.

You can resolve the issue by configuring the policy's advanced settings related to fanotify, a Linux feature that monitors file system events. By default, Elastic Defend works with fanotify to monitor specific file system types that Elastic has tested for compatibility, and ignores other unknown file system types.

If your network includes nonstandard, proprietary, or otherwise unrecognized Linux file systems that cause errors while being monitored, you can configure Elastic Defend to ignore those file systems. This allows Elastic Defend to resume monitoring and protecting the hosts on the integration policy.

Caution

Ignoring file systems can create gaps in your security coverage. Use additional security layers for any file systems ignored by Elastic Defend.

To resolve the potential system deadlock error:

  1. Go to Assets → Policies, then click a policy's name.

  2. Scroll to the bottom of the policy and click Show advanced settings.

  3. In the setting linux.advanced.fanotify.ignored_filesystems, enter a comma-separated list of file system names to ignore, as they appear in /proc/filesystems (for example: ext4,tmpfs). Refer to Find file system names for more on determining the file system names.

  4. Click Save.

    Once you save the policy, malware protection is re-enabled.
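Before entering a value in linux.advanced.fanotify.ignored_filesystems, it can help to confirm that each name really appears in /proc/filesystems. The following is a minimal sketch, not an Elastic-provided tool: build_ignored_list is a hypothetical helper name, and it relies on the file system name being the last whitespace-separated field on each line of that file.

```shell
# Hypothetical helper (not part of Elastic Defend): keep only the candidate
# names that appear in a /proc/filesystems-style file and join them with commas.
# Usage: build_ignored_list /proc/filesystems NAME...
build_ignored_list() {
  fs_table=$1
  shift
  joined=""
  for name in "$@"; do
    # Each line is "[nodev]<TAB>name", so the name is always the last field.
    if awk -v n="$name" '$NF == n { found = 1 } END { exit !found }' "$fs_table"; then
      joined="${joined:+$joined,}$name"
    else
      echo "warning: '$name' is not listed in $fs_table" >&2
    fi
  done
  printf '%s\n' "$joined"
}
```

For example, build_ignored_list /proc/filesystems ext4 tmpfs prints the names it found, comma-separated and ready to paste into the setting. The helper only checks spelling; deciding which file systems are safe to ignore is still a judgment call.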

If you encounter a “Required transform failed” notice on the Endpoints page, you can usually resolve the issue by restarting the transform. Refer to Transforming data for more information about transforms.

To restart a transform that’s not running:

  1. Go to Project settings → Management → Transforms.

  2. Enter endpoint.metadata in the search box to find the transforms for Elastic Defend.

  3. Click the Actions menu and do one of the following for each transform, depending on the value in the Status column:

    • stopped: Select Start to restart the transform.
    • failed: Select Stop to first stop the transform, and then select Start to restart it.

  4. On the confirmation message that displays, click Start to restart the transform.

  5. The transform’s status changes to started. If it doesn't change, refresh the page.
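The status-dependent choice in step 3 can also be written down as a small decision table. The sketch below only prints the Elasticsearch transform API requests that would correspond to each status; it sends nothing. plan_transform_restart is a hypothetical helper, the transform ID is whatever the search in step 2 returns, and the use of force=true to stop a failed transform is an assumption based on the stop transform API rather than anything stated on this page. Whether these APIs are reachable depends on your deployment and privileges.

```shell
# Hypothetical helper: print the transform API requests that correspond to the
# UI steps for a given status. It sends nothing; pass the lines to your own client.
# Usage: plan_transform_restart STATUS TRANSFORM_ID
plan_transform_restart() {
  status=$1
  id=$2
  case "$status" in
    stopped)
      echo "POST _transform/$id/_start"
      ;;
    failed)
      # A failed transform is stopped first, then started again.
      # force=true is an assumption about the stop API, not taken from this page.
      echo "POST _transform/$id/_stop?force=true"
      echo "POST _transform/$id/_start"
      ;;
    started)
      echo "# $id is already started; nothing to do"
      ;;
    *)
      echo "unknown status: $status" >&2
      return 1
      ;;
  esac
}
```

Printing the plan instead of executing it keeps the sketch safe to run anywhere and makes the stopped and failed paths easy to compare.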

After Elastic Agent installs Endpoint, Endpoint connects to Elastic Agent over a local relay connection to report its health status and receive policy updates and response action requests. If that connection cannot be established, the Elastic Defend integration puts Elastic Agent into an Unhealthy status, and Endpoint won't operate properly.

Identify if the issue is happening

You can identify if this issue is happening in the following ways:

  • Run Elastic Agent's status command:

    • sudo /opt/Elastic/Agent/elastic-agent status (Linux)
    • sudo /Library/Elastic/Agent/elastic-agent status (macOS)
    • "c:\Program Files\Elastic\Agent\elastic-agent.exe" status (Windows)

    If the status result for endpoint-security says that Endpoint has missed check-ins or localhost:6788 cannot be bound to, it might indicate this problem is occurring.

  • If the problem starts happening right after installing Endpoint, check the value of fleet.agent.id in the following file:

    • /opt/Elastic/Endpoint/elastic-endpoint.yaml (Linux)
    • /Library/Elastic/Endpoint/elastic-endpoint.yaml (macOS)
    • c:\Program Files\Elastic\Endpoint\elastic-endpoint.yaml (Windows)

    If the value of fleet.agent.id is 00000000-0000-0000-0000-000000000000, the problem is occurring.

    Note

    If this problem starts after Endpoint has already been installed and working properly, fleet.agent.id will already hold a non-zero value, so this check cannot confirm the problem.
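The fleet.agent.id check can be scripted. This is a minimal sketch: has_placeholder_agent_id is a hypothetical helper, and it searches for the all-zero ID anywhere in the file rather than parsing the YAML, so it makes no assumption about how the file is nested. Reading the file may require elevated privileges.

```shell
# Hypothetical helper: succeed (exit 0) when an elastic-endpoint.yaml-style file
# still contains the all-zero agent ID, and fail (exit 1) otherwise.
# Usage: has_placeholder_agent_id /opt/Elastic/Endpoint/elastic-endpoint.yaml
has_placeholder_agent_id() {
  # -F treats the ID as a literal string; -q suppresses output.
  grep -q -F '00000000-0000-0000-0000-000000000000' "$1"
}
```

A typical call would be has_placeholder_agent_id /opt/Elastic/Endpoint/elastic-endpoint.yaml && echo "agent ID not yet assigned". As the note above explains, a non-zero ID does not rule the problem out.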

Examine Endpoint logs

If you've confirmed that the issue is happening, you can look at Endpoint log messages to identify the cause:

  • "Failed to find connection to validate. Is Agent listening on 127.0.0.1:6788?" or "Failed to validate connection. Is Agent running as root/admin?" means that Endpoint is not able to create an initial connection to Elastic Agent over port 6788.
  • "Unable to make GRPC connection in deadline(60s). Fetching connection info again" means that Endpoint's original connection to Elastic Agent over port 6788 worked, but the connection over port 6789 is failing.
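When the logs are long, a small filter can map the messages listed above to the port they implicate. This is a sketch under stated assumptions: classify_connection_logs is a hypothetical helper, it reads log lines on standard input because this page does not say where the Endpoint logs are stored, and it matches only the message fragments quoted above.

```shell
# Hypothetical helper: read Endpoint log lines on stdin and print one hint per
# known connection message. The matched strings are the messages quoted above.
classify_connection_logs() {
  awk '
    /Failed to find connection to validate/ || /Failed to validate connection/ {
      initial = 1
    }
    /Unable to make GRPC connection in deadline/ {
      grpc = 1
    }
    END {
      if (initial) print "port 6788: initial connection to Elastic Agent failed"
      if (grpc)    print "port 6789: follow-up connection to Elastic Agent failed"
      if (!initial && !grpc) print "no known connection errors found"
    }
  '
}
```

Pipe a log file or a diagnostics excerpt into the function; each hint is printed at most once regardless of how often the message repeats.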

Resolve the issue

To debug and resolve the issue, follow these steps:

  1. Examine the Endpoint diagnostics file named analysis.txt, which contains information about what may cause this issue. Elastic Agent diagnostics automatically include Endpoint diagnostics.

  2. Make sure nothing else on your device is listening on ports 6788 or 6789 by running:

    • sudo netstat -anp --tcp (Linux)
    • sudo netstat -an -f inet (macOS)
    • netstat -an (Windows)

  3. Make sure localhost can be resolved to 127.0.0.1 by running:

    • ping -4 -c 1 localhost (Linux)
    • ping -c 1 localhost (macOS)
    • ping -4 localhost (Windows)
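The netstat output in step 2 can be long, so it is convenient to filter it down to listeners on the two relay ports. The sketch below is an illustration, not an Elastic tool: find_relay_port_listeners is a hypothetical helper that reads netstat output on standard input. It accepts both the address:port form printed on Linux and Windows and the address.port form printed on macOS.

```shell
# Hypothetical helper: read netstat output on stdin and print listening sockets
# on ports 6788 or 6789. Linux and Windows print "address:port"; macOS prints
# "address.port", so the pattern accepts either separator.
# Exits non-zero when no matching line is found.
find_relay_port_listeners() {
  # "LISTEN" also matches the "LISTENING" state shown on Windows.
  grep -E '[.:](6788|6789)[[:space:]].*LISTEN'
}
```

On Linux, for example, sudo netstat -anp --tcp | find_relay_port_listeners shows the owning process in the last column. Note that Elastic Agent itself is expected to listen on these ports while it is running; a listener owned by any other process is the conflict to look for.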

After deploying Elastic Defend, you might encounter warnings or errors in the endpoint's Policy status in Fleet if your mobile device management (MDM) is misconfigured or certain permissions for Elastic Endpoint aren't granted. The following sections explain issues that can cause warnings or failures in the endpoint's policy status.

Connect Kernel has failed

This means that the system extension or kernel extension was not approved. Consult the following topics for approving the system extension, either with MDM or without MDM:

You can validate that the system extension is loaded by running:

sudo systemextensionsctl list | grep co.elastic.systemextension

In the command output, the system extension should be marked as "active enabled".

Connect Kernel has failed and the system extension is loaded

If the system extension is loaded and the kernel connection still fails, this means that Full Disk Access was not granted. Elastic Endpoint requires Full Disk Access to subscribe to system events via Apple's Endpoint Security framework, which is one of the primary sources of eventing information used by Elastic Endpoint. Consult the following topics for granting Full Disk Access, either with MDM or without MDM:

You can validate that Full Disk Access is approved by running:

sudo /Library/Elastic/Endpoint/elastic-endpoint test install

If the command output doesn't contain a message about enabling Full Disk Access, the approval was successful.

Detect Network Events has failed

This means that the network extension content filtering was not approved. Consult the following topics for approving network content filtering, either with MDM or without MDM:

You can validate that network content filtering is approved by running:

sudo /Library/Elastic/Endpoint/elastic-endpoint test install

If the command output doesn't contain a message about approving network content filtering, the approval was successful.

Full Disk Access has a warning

This means that Full Disk Access was not granted for one or all Elastic Endpoint components. Consult the following topics for granting Full Disk Access, either with MDM or without MDM:

You can validate that Full Disk Access is approved by running:

sudo /Library/Elastic/Endpoint/elastic-endpoint test install

If the command output doesn't contain a message about enabling Full Disk Access, the approval was successful.
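Each of the test install checks above comes down to scanning the command output for a message about a missing approval. The sketch below automates that scan. It is only an illustration: missing_macos_approvals is a hypothetical helper, and the search phrases are assumptions taken from the wording on this page, not from the tool's actual output, so adjust them to match the messages you see.

```shell
# Hypothetical helper: read the output of "elastic-endpoint test install" on
# stdin and list which approvals it appears to mention. The search phrases are
# assumptions based on this page, not on the tool's real messages.
missing_macos_approvals() {
  awk '
    tolower($0) ~ /full disk access/ { fda = 1 }
    tolower($0) ~ /content filter/   { ncf = 1 }
    END {
      if (fda) print "Full Disk Access"
      if (ncf) print "network content filtering"
    }
  '
}
```

For example, sudo /Library/Elastic/Endpoint/elastic-endpoint test install 2>&1 | missing_macos_approvals prints nothing when neither phrase appears, which corresponds to the successful case described above.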
