Testing

Creating packages

To create a distribution without running the tests, simply run the following:

./gradlew assemble

To create a platform-specific build including the x-pack modules, use the following depending on your operating system:

./gradlew :distribution:archives:linux-tar:assemble --parallel
./gradlew :distribution:archives:darwin-tar:assemble --parallel
./gradlew :distribution:archives:windows-zip:assemble --parallel

Running Elasticsearch from a checkout

In order to run Elasticsearch from source without building a package, you can run it using Gradle:

./gradlew run

Launching and debugging from an IDE

If you want to run Elasticsearch from your IDE, the ./gradlew run task supports a remote debugging option:

./gradlew run --debug-jvm

This will instruct all JVMs (including any that run cli tools such as creating the keyring or adding users) to suspend and initiate a debug connection on port incrementing from 5005. As such the IDE needs to be instructed to listen for connections on this port. Since we might run multiple JVMs as part of configuring and starting the cluster it’s recommended to configure the IDE to initiate multiple listening attempts. In case of IntelliJ, this option is called "Auto restart" and needs to be checked. In case of Eclipse, "Connection limit" setting needs to be configured with a greater value (ie 10 or more).

Distribution

By default a node is started with the zip distribution. In order to start with a different distribution use the -Drun.distribution argument.

To for example start the open source distribution:

./gradlew run -Drun.distribution=oss

License type

By default a node is started with the basic license type. In order to start with a different license type use the -Drun.license_type argument.

In order to start a node with a trial license execute the following command:

./gradlew run -Drun.license_type=trial

This enables security and other paid features and adds a superuser with the username: elastic-admin and password: elastic-password.

Other useful arguments

In order to start a node with a different max heap space add: -Dtests.heap.size=4G In order to disable assertions add: -Dtests.asserts=false In order to set an Elasticsearch setting, provide a setting with the following prefix: -Dtests.es.

Test case filtering.

You can run a single test, provided that you specify the Gradle project. See the documentation on simple name pattern filtering.

Run a single test case in the server project:

./gradlew :server:test --tests org.elasticsearch.package.ClassName

Run all tests in a package and its sub-packages:

./gradlew :server:test --tests 'org.elasticsearch.package.*'

Run all tests that are waiting for a bugfix (disabled by default)

./gradlew test -Dtests.filter=@awaitsfix

Seed and repetitions.

Run with a given seed (seed is a hex-encoded long).

./gradlew test -Dtests.seed=DEADBEEF

Repeats all tests of ClassName N times.

Every test repetition will have a different method seed (derived from a single random master seed).

./gradlew :server:test -Dtests.iters=N --tests org.elasticsearch.package.ClassName

Repeats all tests of ClassName N times.

Every test repetition will have exactly the same master (0xdead) and method-level (0xbeef) seed.

./gradlew :server:test -Dtests.iters=N -Dtests.seed=DEAD:BEEF --tests org.elasticsearch.package.ClassName

Repeats a given test N times

(note the filters - individual test repetitions are given suffixes, ie: testFoo[0], testFoo[1], etc… so using testmethod or tests.method ending in a glob is necessary to ensure iterations are run).

./gradlew :server:test -Dtests.iters=N --tests org.elasticsearch.package.ClassName.methodName

Repeats N times but skips any tests after the first failure or M initial failures.

./gradlew test -Dtests.iters=N -Dtests.failfast=true ...
./gradlew test -Dtests.iters=N -Dtests.maxfailures=M ...

Test groups.

Test groups can be enabled or disabled (true/false).

Default value provided below in [brackets].

./gradlew test -Dtests.awaitsfix=[false] - known issue (@AwaitsFix)

Load balancing and caches.

By default the tests run on multiple processes using all the available cores on all available CPUs. Not including hyper-threading. If you want to explicitly specify the number of JVMs you can do so on the command line:

./gradlew test -Dtests.jvms=8

Or in ~/.gradle/gradle.properties:

systemProp.tests.jvms=8

Its difficult to pick the "right" number here. Hypercores don’t count for CPU intensive tests and you should leave some slack for JVM-internal threads like the garbage collector. And you have to have enough RAM to handle each JVM.

Test compatibility.

It is possible to provide a version that allows to adapt the tests behaviour to older features or bugs that have been changed or fixed in the meantime.

./gradlew test -Dtests.compatibility=1.0.0

Miscellaneous.

Run all tests without stopping on errors (inspect log files).

./gradlew test -Dtests.haltonfailure=false

Run more verbose output (slave JVM parameters, etc.).

./gradlew test -verbose

Change the default suite timeout to 5 seconds for all tests (note the exclamation mark).

./gradlew test -Dtests.timeoutSuite=5000! ...

Change the logging level of ES (not Gradle)

./gradlew test -Dtests.es.logger.level=DEBUG

Print all the logging output from the test runs to the commandline even if tests are passing.

./gradlew test -Dtests.output=always

Configure the heap size.

./gradlew test -Dtests.heap.size=512m

Pass arbitrary jvm arguments.

# specify heap dump path
./gradlew test -Dtests.jvm.argline="-XX:HeapDumpPath=/path/to/heapdumps"
# enable gc logging
./gradlew test -Dtests.jvm.argline="-verbose:gc"
# enable security debugging
./gradlew test -Dtests.jvm.argline="-Djava.security.debug=access,failure"

Running verification tasks

To run all verification tasks, including static checks, unit tests, and integration tests:

./gradlew check

Note that this will also run the unit tests and precommit tasks first. If you want to just run the integration tests (because you are debugging them):

./gradlew integTest

If you want to just run the precommit checks:

./gradlew precommit

Some of these checks will require docker-compose installed for bringing up test fixtures. If it’s not present those checks will be skipped automatically. The host running Docker (or VM if you’re using Docker Desktop) needs 4GB of memory or some of the containers will fail to start. You can tell that you are short of memory if containers are exiting quickly after starting with code 137 (128 + 9, where 9 means SIGKILL).

Testing the REST layer

The available integration tests make use of the java API to communicate with the elasticsearch nodes, using the internal binary transport (port 9300 by default). The REST layer is tested through specific tests that are shared between all the elasticsearch official clients and consist of YAML files that describe the operations to be executed and the obtained results that need to be tested.

The YAML files support various operators defined in the rest-api-spec and adhere to the Elasticsearch REST API JSON specification

The REST tests are run automatically when executing the "./gradlew check" command. To run only the REST tests use the following command:

./gradlew :distribution:archives:integ-test-zip:integTest   \
  -Dtests.class="org.elasticsearch.test.rest.*Yaml*IT"

A specific test case can be run with

./gradlew :distribution:archives:integ-test-zip:integTest \
  -Dtests.class="org.elasticsearch.test.rest.*Yaml*IT" \
  -Dtests.method="test {p0=cat.shards/10_basic/Help}"

*Yaml*IT are the executable test classes that runs all the yaml suites available within the rest-api-spec folder.

The REST tests support all the options provided by the randomized runner, plus the following:

tests.rest[true|false]: determines whether the REST tests need to be run (default) or not.
tests.rest.suite: comma separated paths of the test suites to be run (by default loaded from /rest-api-spec/test). It is possible to run only a subset of the tests providing a sub-folder or even a single yaml file (the default /rest-api-spec/test prefix is optional when files are loaded from classpath) e.g. -Dtests.rest.suite=index,get,create/10_with_id
tests.rest.blacklist: comma separated globs that identify tests that are blacklisted and need to be skipped e.g. -Dtests.rest.blacklist=index//Index document,get/10_basic/

Testing packaging

The packaging tests use Vagrant virtual machines or cloud instances to verify that installing and running Elasticsearch distributions works correctly on supported operating systems. These tests should really only be run on ephemeral systems because they’re destructive; that is, these tests install and remove packages and freely modify system settings, so you will probably regret it if you execute them on your development machine.

When you run a packaging test, Gradle will set up the target VM and mount your repository directory in the VM. Once this is done, a Gradle task will issue a Vagrant command to run a nested Gradle task on the VM. This nested Gradle runs the actual "destructive" test classes.

Install Virtual Box and Vagrant.
(Optional) Install vagrant-cachier to squeeze a bit more performance out of the process:
```
vagrant plugin install vagrant-cachier
```
You can run all of the OS packaging tests with ./gradlew packagingTest. This task includes our legacy bats tests. To run only the OS tests that are written in Java, run .gradlew distroTest, will cause Gradle to build the tar, zip, and deb packages and all the plugins. It will then run the tests on every available system. This will take a very long time.

Fortunately, the various systems under test have their own Gradle tasks under qa/os. To find out what packaging combinations can be tested on a system, run the tasks task. For example:
```
./gradlew :qa:os:ubuntu-1804:tasks
```
If you want a quick test of the tarball and RPM packagings for Centos 7, you would run:
```
./gradlew :qa:os:centos-7:distroTest.default-rpm :qa:os:centos-7:distroTest.default-linux-archive
```

Note that if you interrupt Gradle in the middle of running these tasks, any boxes started will remain running and you’ll have to stop them manually with ./gradlew --stop or vagrant halt.

All the regular vagrant commands should just work so you can get a shell in a VM running trusty by running vagrant up ubuntu-1604 --provider virtualbox && vagrant ssh ubuntu-1604.

These are the linux flavors supported, all of which we provide images for

ubuntu-1604 aka xenial
ubuntu-1804 aka bionic beaver
debian-8 aka jessie
debian-9 aka stretch, the current debian stable distribution
centos-6
centos-7
rhel-8
fedora-28
fedora-29
oel-6 aka Oracle Enterprise Linux 6
oel-7 aka Oracle Enterprise Linux 7
sles-12
opensuse-42 aka Leap

We’re missing the following from the support matrix because there aren’t high quality boxes available in vagrant atlas:

sles-11

Testing packaging on Windows

The packaging tests also support Windows Server 2012R2 and Windows Server 2016. Unfortunately we’re not able to provide boxes for them in open source use because of licensing issues. Any Virtualbox image that has WinRM and Powershell enabled for remote users should work.

Specify the image IDs of the Windows boxes to gradle with the following project properties. They can be set in ~/.gradle/gradle.properties like

vagrant.windows-2012r2.id=my-image-id
vagrant.windows-2016.id=another-image-id

or passed on the command line like -Pvagrant.windows-2012r2.id=my-image-id -Pvagrant.windows-2016=another-image-id

These properties are required for Windows support in all gradle tasks that handle packaging tests. Either or both may be specified.

If you’re running vagrant commands outside of gradle, specify the Windows boxes with the environment variables

VAGRANT_WINDOWS_2012R2_BOX
VAGRANT_WINDOWS_2016_BOX

Testing VMs are disposable

It’s important to think of VMs like cattle. If they become lame you just shoot them and let vagrant reprovision them. Say you’ve hosed your precise VM:

vagrant ssh ubuntu-1604 -c 'sudo rm -rf /bin'; echo oops

All you’ve got to do to get another one is

vagrant destroy -f ubuntu-1604 && vagrant up ubuntu-1604 --provider virtualbox

The whole process takes a minute and a half on a modern laptop, two and a half without vagrant-cachier.

Its possible that some downloads will fail and it’ll be impossible to restart them. This is a bug in vagrant. See the instructions here for how to work around it: hashicorp/vagrant#4479

Some vagrant commands will work on all VMs at once:

vagrant halt
vagrant destroy -f

vagrant up would normally start all the VMs but we’ve prevented that because that’d consume a ton of ram.

Iterating on packaging tests

Because our packaging tests are capable of testing many combinations of OS (e.g., Windows, Linux, etc.), package type (e.g., zip file, RPM, etc.), Elasticsearch distribution type (e.g., default or OSS), and so forth, it’s faster to develop against smaller subsets of the tests. For example, to run tests for the default archive distribution on Fedora 28:

./gradlew :qa:os:fedora-28:distroTest.default-linux-archive

These test tasks can use the --tests, --info, and --debug parameters just like non-OS tests can. For example:

./gradlew :qa:os:fedora-28:distroTest.default-linux-archive \
  --tests "com.elasticsearch.packaging.test.ArchiveTests"

Testing backwards compatibility

Backwards compatibility tests exist to test upgrading from each supported version to the current version. To run them all use:

./gradlew bwcTest

A specific version can be tested as well. For example, to test bwc with version 5.3.2 run:

./gradlew v5.3.2#bwcTest

Tests are ran for versions that are not yet released but with which the current version will be compatible with. These are automatically checked out and built from source. See VersionCollection and distribution/bwc/build.gradle for more information.

When running ./gradlew check, minimal bwc checks are also run against compatible versions that are not yet released.

BWC Testing against a specific remote/branch

Sometimes a backward compatibility change spans two versions. A common case is a new functionality that needs a BWC bridge in an unreleased versioned of a release branch (for example, 5.x). To test the changes, you can instruct Gradle to build the BWC version from a another remote/branch combination instead of pulling the release branch from GitHub. You do so using the bwc.remote and bwc.refspec.BRANCH system properties:

./gradlew check -Dbwc.remote=${remote} -Dbwc.refspec.5.x=index_req_bwc_5.x

The branch needs to be available on the remote that the BWC makes of the repository you run the tests from. Using the remote is a handy trick to make sure that a branch is available and is up to date in the case of multiple runs.

Example:

Say you need to make a change to master and have a BWC layer in 5.x. You will need to: . Create a branch called index_req_change off your remote ${remote}. This will contain your change. . Create a branch called index_req_bwc_5.x off 5.x. This will contain your bwc layer. . Push both branches to your remote repository. . Run the tests with ./gradlew check -Dbwc.remote=${remote} -Dbwc.refspec.5.x=index_req_bwc_5.x.

Skip fetching latest

For some BWC testing scenarios, you want to use the local clone of the repository without fetching latest. For these use cases, you can set the system property tests.bwc.git_fetch_latest to false and the BWC builds will skip fetching the latest from the remote.

How to write good tests?

Base classes for test cases

There are multiple base classes for tests:

ESTestCase: The base class of all tests. It is typically extended directly by unit tests.
ESSingleNodeTestCase: This test case sets up a cluster that has a single node.
ESIntegTestCase: An integration test case that creates a cluster that might have multiple nodes.
ESRestTestCase: An integration tests that interacts with an external cluster via the REST API. For instance, YAML tests run via sub classes of ESRestTestCase.

Good practices

What kind of tests should I write?

Unit tests are the preferred way to test some functionality: most of the time they are simpler to understand, more likely to reproduce, and unlikely to be affected by changes that are unrelated to the piece of functionality that is being tested.

The reason why ESSingleNodeTestCase exists is that all our components used to be very hard to set up in isolation, which had led us to having a number of integration tests but close to no unit tests. ESSingleNodeTestCase is a workaround for this issue which provides an easy way to spin up a node and get access to components that are hard to instantiate like IndicesService. Whenever practical, you should prefer unit tests.

Many tests extend ESIntegTestCase, mostly because this is how most tests used to work in the early days of Elasticsearch. However the complexity of these tests tends to make them hard to debug. Whenever the functionality that is being tested isn’t intimately dependent on how Elasticsearch behaves as a cluster, it is recommended to write unit tests or REST tests instead.

In short, most new functionality should come with unit tests, and optionally REST tests to test integration.

Refactor code to make it easier to test

Unfortunately, a large part of our code base is still hard to unit test. Sometimes because some classes have lots of dependencies that make them hard to instantiate. Sometimes because API contracts make tests hard to write. Code refactors that make functionality easier to unit test are encouraged. If this sounds very abstract to you, you can have a look at this pull request for instance, which is a good example. It refactors IndicesRequestCache in such a way that: - it no longer depends on objects that are hard to instantiate such as IndexShard or SearchContext, - time-based eviction is applied on top of the cache rather than internally, which makes it easier to assert on what the cache is expected to contain at a given time.

Bad practices

Use randomized-testing for coverage

In general, randomization should be used for parameters that are not expected to affect the behavior of the functionality that is being tested. For instance the number of shards should not impact date_histogram aggregations, and the choice of the store type (niofs vs mmapfs) does not affect the results of a query. Such randomization helps improve confidence that we are not relying on implementation details of one component or specifics of some setup.

However it should not be used for coverage. For instance if you are testing a piece of functionality that enters different code paths depending on whether the index has 1 shards or 2+ shards, then we shouldn’t just test against an index with a random number of shards: there should be one test for the 1-shard case, and another test for the 2+ shards case.

Abuse randomization in multi-threaded tests

Multi-threaded tests are often not reproducible due to the fact that there is no guarantee on the order in which operations occur across threads. Adding randomization to the mix usually makes things worse and should be done with care.

Test coverage analysis

Generating test coverage reports for Elasticsearch is currently not possible through Gradle. However, it is possible to gain insight in code coverage using IntelliJ’s built-in coverage analysis tool that can measure coverage upon executing specific tests. Eclipse may also be able to do the same using the EclEmma plugin.

Test coverage reporting used to be possible with JaCoCo when Elasticsearch was using Maven as its build system. Since the switch to Gradle though, this is no longer possible, seeing as the code currently used to build Elasticsearch does not allow JaCoCo to recognize its tests. For more information on this, see the discussion in issue #28867.

Debugging remotely from an IDE

If you want to run Elasticsearch and be able to remotely attach the process for debugging purposes from your IDE, can start Elasticsearch using ES_JAVA_OPTS:

ES_JAVA_OPTS="-Xdebug -Xrunjdwp:server=y,transport=dt_socket,address=4000,suspend=y" ./bin/elasticsearch

Read your IDE documentation for how to attach a debugger to a JVM process.

Building with extra plugins

Additional plugins may be built alongside elasticsearch, where their dependency on elasticsearch will be substituted with the local elasticsearch build. To add your plugin, create a directory called elasticsearch-extra as a sibling of elasticsearch. Checkout your plugin underneath elasticsearch-extra and the build will automatically pick it up. You can verify the plugin is included as part of the build by checking the projects of the build.

./gradlew projects

Environment misc

There is a known issue with macOS localhost resolve strategy that can cause some integration tests to fail. This is because integration tests have timings for cluster formation, discovery, etc. that can be exceeded if name resolution takes a long time. To fix this, make sure you have your computer name (as returned by hostname) inside /etc/hosts, e.g.:

127.0.0.1       localhost ElasticMBP.local
255.255.255.255 broadcasthost
::1             localhost ElasticMBP.local`

Benchmarking

For changes that might affect the performance characteristics of Elasticsearch you should also run macrobenchmarks. We maintain a macrobenchmarking tool called Rally which you can use to measure the performance impact. It comes with a set of default benchmarks that we also run every night. To get started, please see Rally’s documentation.

Files

TESTING.asciidoc

Latest commit

History

TESTING.asciidoc

File metadata and controls

Testing

Creating packages

Running Elasticsearch from a checkout

Launching and debugging from an IDE

Distribution

License type

Other useful arguments

Test case filtering.

Seed and repetitions.

Repeats all tests of ClassName N times.

Repeats all tests of ClassName N times.

Repeats a given test N times

Test groups.

Load balancing and caches.

Test compatibility.

Miscellaneous.

Running verification tasks

Testing the REST layer

Testing packaging

Testing packaging on Windows

Testing VMs are disposable

Iterating on packaging tests

Testing backwards compatibility

BWC Testing against a specific remote/branch

Skip fetching latest

How to write good tests?

Base classes for test cases

Good practices

What kind of tests should I write?

Refactor code to make it easier to test

Bad practices

Use randomized-testing for coverage

Abuse randomization in multi-threaded tests

Test coverage analysis

Debugging remotely from an IDE

Building with extra plugins

Environment misc

Benchmarking