Interpreting CI Failures
editInterpreting CI Failures
editKibana CI uses a Jenkins feature called "Pipelines" to automate testing of the code in pull requests and on tracked branches. Pipelines are defined within the repository via the Jenkinsfile
at the root of the project.
More information about Jenkins Pipelines can be found in the Jenkins book.
Github Checks
editWhen a test fails it will be reported to Github via Github Checks. We currently bucket tests into several categories which run in parallel to make CI faster. Groups like ciGroup{X}
get a single check in Github, and other tests like linting, or type checks, get their own checks.
Clicking the link next to the check in the conversation tab of a pull request will take you to the log output from that section of the tests. If that log output is truncated, or doesn’t clearly identify what happened, you can usually get more complete information by visiting Jenkins directly.
Viewing Job Executions in Jenkins
editTo view the results of a job execution in Jenkins, either click the link in the comment left by @elasticmachine
or search for the kibana-ci
check in the list at the bottom of the PR. This link will take you to the top-level page for the specific job execution that failed.
- Git Changes: the list of commits that were in this build which weren’t in the previous build. For Pull Requests this list is calculated by comparing against the most recent Pull Request which was tested, it is not limited to build for this specific Pull Request, so it’s not very useful.
- Test Results: A link to the test results screen, and shortcuts to the failed tests. Functional tests capture and store the log output from each specific test, and make it visible at these links. For other test runners only the error message is visible and log output must be tracked down in the Pipeline Steps.
- Google Cloud Storage (GCS) Upload Report: Link to the screen which lists out the artifacts uploaded to GCS during this job execution.
- Pipeline Steps:: A breakdown of the pipeline that was executed, along with individual log output for each step in the pipeline.
Viewing ciGroup/test Logs
editTo view the logs for a failed specific ciGroup, jest, type checkers, linters, etc., click on the Pipeline Steps link in from the Job page.
Scroll down the page until you find a failed step (1), and then look up a few lines for the Branch:
step to see which specific job this is. If this is the job you’re looking for click the little terminal icon next to the failed step (1) to view the logs for that specific step in the Pipeline.
Debugging Functional UI Test Failures
editThe logs in Pipeline Steps contain Info
level logging. To debug Functional UI tests it’s usually helpful to see the debug logging. You can go to the list of all tests including failures (1), or directly to the failures (2).
Looking at the failure, we first look at the Error and stack trace. In the example below, this test failed to find an element within the timeout;
Error: retry.try timeout: TimeoutError: Waiting for element to be located By(css selector, [data-test-subj="createSpace"])
We know the test file from the stack trace was on line 50 of test/accessibility/apps/spaces.ts
(this test and the stack trace context is kibana/x-pack/ so the file is https://github.com/elastic/kibana/blob/main/x-pack/test/accessibility/apps/spaces.ts#L50).
The function to click on the element was called from a page object method in test/functional/page_objects/space_selector_page.ts
https://github.com/elastic/kibana/blob/main/x-pack/test/functional/page_objects/space_selector_page.ts#L58
[00:03:36] │ debg --- retry.try error: Waiting for element to be located By(css selector, [data-test-subj="createSpace"]) [00:03:36] │ Wait timed out after 10020ms [00:03:36] │ info Taking screenshot "/dev/shm/workspace/parallel/24/kibana/x-pack/test/functional/screenshots/failure/Kibana spaces page meets a11y validations a11y test for click on create space page.png" [00:03:37] │ info Current URL is: http://localhost:61241/app/home#/ [00:03:37] │ info Saving page source to: /dev/shm/workspace/parallel/24/kibana/x-pack/test/functional/failure_debug/html/Kibana spaces page meets a11y validations a11y test for click on create space page.html [00:03:37] └- ✖ fail: Kibana spaces page meets a11y validations a11y test for click on create space page [00:03:37] │ Error: retry.try timeout: TimeoutError: Waiting for element to be located By(css selector, [data-test-subj="createSpace"]) [00:03:37] │ Wait timed out after 10020ms [00:03:37] │ at /dev/shm/workspace/parallel/24/kibana/node_modules/selenium-webdriver/lib/webdriver.js:842:17 [00:03:37] │ at runMicrotasks (<anonymous>) [00:03:37] │ at processTicksAndRejections (internal/process/task_queues.js:93:5) [00:03:37] │ at onFailure (/dev/shm/workspace/parallel/24/kibana/test/common/services/retry/retry_for_success.ts:17:9) [00:03:37] │ at retryForSuccess (/dev/shm/workspace/parallel/24/kibana/test/common/services/retry/retry_for_success.ts:57:13) [00:03:37] │ at Retry.try (/dev/shm/workspace/parallel/24/kibana/test/common/services/retry/retry.ts:32:14) [00:03:37] │ at Proxy.clickByCssSelector (/dev/shm/workspace/parallel/24/kibana/test/functional/services/common/find.ts:420:7) [00:03:37] │ at TestSubjects.click (/dev/shm/workspace/parallel/24/kibana/test/functional/services/common/test_subjects.ts:109:7) [00:03:37] │ at SpaceSelectorPage.clickCreateSpace (test/functional/page_objects/space_selector_page.ts:59:7) [00:03:37] │ at Context.<anonymous> (test/accessibility/apps/spaces.ts:50:7) [00:03:37] │ at Object.apply (/dev/shm/workspace/parallel/24/kibana/node_modules/@kbn/test/src/functional_test_runner/lib/mocha/wrap_function.js:73:16)
But we don’t know why the test didn’t find the element. It could be that its not on the right page, or that the element has changed.
Just above the ✖ fail:
line, there is a line info Taking screenshot ...
which tells us the name of the screenshot to look for in the Google Cloud Storage (GCS) Upload Report:
Clicking the [Download]
link for that png shows this image:
If we use a running Kibana instance and inspect elements, we find that the createSpace
data-test-subj attribute is on this button in the Spaces page in Stack Management:
We know the test was not on the correct page to find the element to click. We see in the debug log the repeated attempts to find the element. If we scroll to the start of those repeated attempts, we see that the first thing the test did was this attempt to click on the createSpace
element.
[00:01:30] └-> a11y test for manage spaces menu from top nav on Kibana home [00:01:30] └-> a11y test for manage spaces page [00:01:30] └-> a11y test for click on create space page [00:01:30] └-> "before each" hook: global before each for "a11y test for click on create space page" [00:01:30] │ debg TestSubjects.click(createSpace)
And we can confirm that looking at the test code.
So we need to backtrack further to find where the test opens the Spaces page. It turns out that the test before this one would have navigated to the proper page, but the test is skipped (marked it.skip
in a PR).
it.skip('a11y test for manage spaces page', async () => { await PageObjects.spaceSelector.clickManageSpaces();
Perhaps someone skipped the previous tests not realizing that the tests were not independent. A best practice would be for every test to be atomic and not depend on the results of any other test(s). But in UI testing, the setup takes time and we generally need to optimize for groups of tests within a describe block.