Concurrent API Tests

Stéphane Leblanc
10 min read · Jul 5, 2021



API testing is extremely valuable, but it also comes with some challenges. The goal of this post is to share techniques that can be used to overcome these challenges.

I remember being so impressed by the quality of the conversation between Kent Beck, David Heinemeier Hansson, and Martin Fowler during the TDD is Dead controversy. Back in 2014, unit testing was my favorite tool and DHH’s arguments did not change my mind. But last year, I stumbled on the keynote talk that had started the whole controversy and it convinced me to try testing at a coarser level of granularity.

I realized that data partitioning and concurrent tests are a great combo for API testing. I managed to build 220 reliable and maintainable tests. The test suite takes only 35 seconds to complete, although it makes 1189 HTTP requests to the system under test. These tests provide a high level of confidence since they verify that the system as a whole behaves as expected. Moreover, they depend only on the API of the system, which allows the implementation to change without breaking the tests. These tests can also be executed against multiple versions of the system; for example, running them against the last stable version and the latest version can prevent regressions when rewriting a module of a legacy system. Finally, these tests document the system as a black box. This form of documentation is always up to date and helps in understanding the system as a whole without getting lost in implementation details.

Data Partitioning

To be reliable, test cases must not have side effects on each other. Moreover, if the system under test is shared among the team, each team member must be able to execute the test cases without coordinating with the others. Thus, test cases must allow for multiple simultaneous executions.

To reach this level of isolation, each test case must set up a clean test fixture without assuming any pre-existing state in the system. Sharing data between test suite executions should be avoided, since it implies that the test suite assumes a pre-existing state in the system.

In theory, perfect isolation between test cases would require a brand new system for each test case. However, this is not very practical if you have hundreds of test cases and your system has many dependencies that are difficult to set up (ex: external dependencies).

Data partitioning is a much more practical solution. You have to find a logical data partition that prevents side effects between test case executions in the context of your system. For example, in a blog post system, many features can be verified within a single blog post. In that context, a new blog post can be created at the beginning of each test case and never be reused outside of the test case. The data would be partitioned by blog post id. A test suite may use many different data partitions (ex: by blog post, by author, by blog post category, etc.) as long as each test case is isolated from the others.

Consider three test cases, as sketched below. Test case 1 creates a new blog post and uses the id of this blog post to shield itself from side effects of test cases 2 and 3, and of subsequent executions of test case 1. Test case 2 uses the same strategy. Test case 3, on the other hand, creates a new author and uses the id of this author to isolate itself from other test case executions.
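
Here is a minimal sketch of the pattern. The helpers (createBlogPost, updateBlogPost, getBlogPost, createAuthor, getBlogPostsByAuthor) are hypothetical functions wrapping HTTP calls to the system under test; each test case only writes behind an id it created itself, so concurrent or repeated executions cannot interfere:

```js
const assert = require('assert');

it('updates the title of a blog post', async function () {
  const { id } = await createBlogPost(); // fresh partition for this test only
  await updateBlogPost(id, { title: 'New title' });
  const blogPost = await getBlogPost(id);
  assert.strictEqual(blogPost.title, 'New title');
});

it('lists the blog posts of an author', async function () {
  const author = await createAuthor(); // partitioned by author instead
  await createBlogPost({ authorId: author.id });
  const blogPosts = await getBlogPostsByAuthor(author.id);
  assert.strictEqual(blogPosts.length, 1); // only this test writes to this author
});
```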

During the execution of a test suite, sharing of data between test cases may be used, but with great caution. If some data is required by many test cases and this data is immutable, it may be reused between test cases. In the example above, test cases 1 and 2 require an author in order to create a blog post, but the attributes of the author are not meaningful for these test cases, so a default immutable author may be reused by both. If the attributes of the author are meaningful for a test case, or if they have to change during the execution of a test case, it's better to create a new author for this test to avoid side effects between tests.
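
A sketch of this cautious sharing, using the same hypothetical helpers: the default author is created once per suite execution, never mutated, and reused by every test case that does not care about author attributes.

```js
let defaultAuthor;

// Created once per suite execution and treated as immutable afterwards.
before(async function () {
  defaultAuthor = await createAuthor();
});

it('creates a blog post', async function () {
  const { id } = await createBlogPost({ authorId: defaultAuthor.id });
  const blogPost = await getBlogPost(id);
  assert.strictEqual(blogPost.authorId, defaultAuthor.id);
});
```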

But what if a feature cannot be isolated in a data partition? For example, what if our fictitious blog post system should be able to compute the total number of blog posts?

There are a few options (from least to most restrictive):

  1. Work without a data partition and use less precise assertions
    Ex: Create a blog post and assert that the total number of blog posts increases instead of asserting a specific total (see the sketch after this list).
  2. Create a data partition on something with potential business value
    Ex: GET /blog-post-count?keyword=
  3. Build your system on a multi-tenant data architecture with a shared schema
    Ex: GET /tenant/blog-post-count
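
A sketch of option 1, with a hypothetical getBlogPostCount helper: since other concurrently running test cases may also create blog posts, the assertion checks that the count grows rather than asserting an exact total.

```js
it('increments the total number of blog posts', async function () {
  const countBefore = await getBlogPostCount();
  await createBlogPost();
  const countAfter = await getBlogPostCount();
  // Other concurrent tests may also create blog posts, so only assert growth.
  assert.ok(countAfter > countBefore);
});
```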

Option 3 may be overkill, but it solves the issue once and for all, since it allows you to simply create a new tenant for each test case. Options 2 and 3 require adding features to the API that are not directly related to business needs, which is not ideal. On the other hand, it's completely acceptable to make tradeoffs to improve the testability of the system, since the evolution of the system, its reliability, and its maintainability depend largely on its testability. When a feature cannot easily be isolated in a data partition, you have to make a decision based on the costs and benefits in the context of your system.

Concurrent Tests

Fast test execution allows for executing tests more often. Each time the tests are executed, they provide valuable feedback that can be taken into account in the next step of the software development process. Data partitions prevent side effects between test cases. This allows for executing test cases concurrently to significantly reduce the time required to execute the test suite.

For example, if a test suite has 3 tests:

  • Test 1 (2 seconds)
  • Test 2 (5 seconds)
  • Test 3 (1 second)

If these test cases were run sequentially, the total testing time would be 8 seconds. By running them concurrently, it takes only 5 seconds (in perfect multitasking conditions, of course!). If 5 tests of 2 seconds each were added, the test suite would still complete in around 5 seconds. That's interesting!
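
The arithmetic is easy to verify with a minimal simulation: three fake "tests" run through Promise.all complete in roughly the duration of the slowest one.

```js
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

async function main() {
  const start = Date.now();
  // Simulate the 2 s, 5 s and 1 s tests running concurrently.
  await Promise.all([sleep(2000), sleep(5000), sleep(1000)]);
  console.log(`Elapsed: ${Date.now() - start} ms`); // ~5000 ms, not 8000 ms
}

main();
```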

What if 5000 tests of 2 seconds each were added to the test suite? The time required to complete a test suite is determined mainly by its slowest test case and by the ability of the system to handle a large number of concurrent requests. An overloaded system won't finish the test suite any faster; the overload will only add extra delay.

Although Node.js popularized the single-threaded event loop architecture for managing concurrency, test frameworks have not yet adopted this approach. Most of them (ex: Mocha and Jest) support concurrency through multithreading and do not offer concurrency at the level of individual test cases. Instead, they concurrently execute small groups of test cases that share a common setup/teardown.

Fortunately, Mocha has an extension (mocha.parallel) that handles concurrency with a single-threaded event loop architecture, which allows all test cases to run concurrently. Mocha with its extension mocha.parallel includes all the core features required to write fast and reliable API tests.
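
A minimal mocha.parallel sketch: every spec inside a parallel block is started immediately, so the block completes in roughly the time of its slowest spec.

```js
const parallel = require('mocha.parallel');

parallel('concurrent specs', function () {
  it('takes ~2 seconds', function (done) { setTimeout(done, 2000); });
  it('takes ~5 seconds', function (done) { setTimeout(done, 5000); });
  it('takes ~1 second', function (done) { setTimeout(done, 1000); });
}); // the whole block completes in ~5 seconds instead of ~8
```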

Flaky Tests

A flaky test case is a test case that fails from time to time but succeeds if you retry it.

Fast tests are important, but reliable tests are even more important. Unreliable tests are not very useful and fixing a flaky test can take a significant effort. One can even be discouraged and decide to delete a flaky test case instead of fixing it.

Don’t overload the system in order to reduce the test execution time by a few seconds. Concretely, the maximum number of concurrent tests has to be limited so that the system can easily handle the load (for example with mocha.parallel). After some threshold, the server becomes overloaded and the tests do not complete faster anyway; but sporadic errors start to appear because the system cannot handle the load (ex: ETIMEDOUT, ECONNRESET, ECONNREFUSED, etc.).

If possible, avoid sharing the system: hosting the API server and all its dependencies (ex: databases) locally avoids all sorts of problems.

However, if your system must be shared among the team (ex: a dependency cannot be hosted locally), the number of people who may run the tests at the same time has to be taken into account when setting the maximum number of concurrent tests. For example, if it is unlikely that more than 3 people would run the tests at the same time in your context and your system can easily handle 60 concurrent tests, set the maximum number of concurrent tests to 20. Reliable tests are more important than fast tests.

Some test cases must rely on the timing between HTTP requests. These test cases are likely to be flaky if the timing is not managed with care. Limiting the maximum number of concurrent tests is a good first step, since it reduces the variability in the time required to execute each test case. However, this is not enough. To prevent timing issues between HTTP requests, make all the delays of the system configurable and set reasonable delays in the test environment. The delays do not have to be identical in the test and production environments for the tests to be meaningful.

If a delay is too tight, the related tests will be flaky. Configure the delay with enough slack that the related tests are reliable. For example, if the delay has to be 500 milliseconds in the production environment, you may set it to 10 seconds in the test environment. Such a situation also provides insight into the reliability of the system in production: if a delay is too tight to enable reliable tests, one has to stop and wonder whether the clients of the API will be able to reliably respect this delay in the production environment, especially when the system is under heavy load.

If a delay is too long, the feature will not be tested. For example, if a blog post has to be archived after 6 months, it would not make sense to wait 6 months to test the feature. In the test environment, the delay could be configured to archive blog posts after 60 seconds. Again, it's important to give enough slack in the timing to prevent blog posts from being accidentally archived, because this could turn a test unrelated to the archiving feature into a flaky test. The benefits of configurable delays are not limited to API tests; manual exploratory testing, for example, also benefits from the reasonable delays of the test environment.
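
A sketch of such a configurable delay, using a hypothetical environment variable: production keeps the 6-month default while the test environment overrides it to 60 seconds.

```js
const SIX_MONTHS_MS = 1000 * 60 * 60 * 24 * 182;

// ARCHIVE_DELAY_MS is a hypothetical variable; the test environment would set
// it to 60000 (60 seconds) while production keeps the default.
const archiveDelayMs = Number(process.env.ARCHIVE_DELAY_MS || SIX_MONTHS_MS);

function shouldArchive(blogPost, now = Date.now()) {
  return now - blogPost.publishedAt >= archiveDelayMs;
}

module.exports = { shouldArchive };
```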

Finally, don’t try to test everything with concurrent API tests. If a test case is extremely time-sensitive or if a flaky test is difficult to fix, consider using another testing tool (ex: unit testing) for this test case. Don’t stay with flaky test cases in your test suite, because a flaky test suite cannot be trusted. Skip flaky tests until they are fixed, because they will cause more harm than good, but make sure to fix them sooner than later.

Maintainable Tests

A test fixture sets up the system in an interesting state (arrange) before acting on the system and asserting the expected result. Each test case will require a test fixture.

Ideally, the test fixture should arrange the system only by using its API. A direct dependency on the system’s internal components (ex: databases) will make the tests more brittle. Before adding a direct dependency between the test and the system’s internal components, consider adding a new endpoint on the API for the missing feature.

To arrange the system, a test fixture may send multiple HTTP requests. Resist the temptation to chain multiple test cases together in order to reuse a test fixture. Instead, build small utility functions and reuse them to create a specific test fixture for each test case. Maintainable test cases test only one specific thing. It’s not a problem if the test fixture takes a few seconds to complete since the test cases will be executed concurrently anyway.
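
A sketch of such utility functions, assuming a JSON API at a configurable base URL (the names and routes are assumptions, and the global fetch requires Node 18+):

```js
const BASE_URL = process.env.API_URL || 'http://localhost:3000';

// Generic helper: POST a JSON body and return the parsed JSON response.
async function post(path, body) {
  const response = await fetch(`${BASE_URL}${path}`, {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify(body),
  });
  if (!response.ok) throw new Error(`POST ${path} failed: ${response.status}`);
  return response.json();
}

const createAuthor = (author = {}) => post('/authors', author);
const createBlogPost = (blogPost = {}) => post('/blog-posts', blogPost);

// A fixture specific to one test case, assembled from the small helpers.
async function createPublishedBlogPost() {
  const author = await createAuthor();
  return createBlogPost({ authorId: author.id, status: 'published' });
}

module.exports = { createAuthor, createBlogPost, createPublishedBlogPost };
```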

A test case can require multiple HTTP requests and the payload of each of these requests may be complex to set up. If the complete payload is specified for each request, the test will soon become brittle since any change in the payloads will break many unrelated tests. Instead, use a default payload template for each request and specify only the part of the payload that is meaningful for the test case. This way, test cases are easier to read since only the part that matters is specified. Moreover, if a change in a payload is required, only the default payload template and the related tests have to be changed.
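
A sketch of a default payload template, refining the hypothetical createBlogPost helper from the previous sketch: each test overrides only the fields that matter to it, so a payload change touches one template instead of every test.

```js
const defaultBlogPost = () => ({
  title: 'Default title',
  body: 'Default body',
  tags: [],
  status: 'draft',
});

// Merge the template with the test-specific overrides before sending.
const createBlogPost = (overrides = {}) =>
  post('/blog-posts', { ...defaultBlogPost(), ...overrides });

// In a test case, only the meaningful part of the payload appears:
// await createBlogPost({ status: 'published' });
```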

mocha-concurrent-api-tests

Along with unit testing, API testing is now an essential tool in my testing toolbox. To make it easier to understand and apply the testing approach described in this blog post, I published an example project that demonstrates how concurrent API tests can be implemented with Mocha. The example project uses mocha-concurrent-api-tests, an npm package that provides the core functions required to implement concurrent API tests with Mocha. The example project can be used as a template for starting new test projects.

Special thanks to Morgan Martinet, Jean-François Marcoux and Daniel Brodeur for reviewing the post.
