AI Disclaimer
I’ve used an LLM to generate the code examples in this article; otherwise it would never have seen the light of day (because that’s how lazy I am). The goal of the examples is to give the reader an idea of what the tests I’m talking about look like (since our industry has completely broken test taxonomy for the foreseeable future); they are by no means meant to be accurate or production-ready.
I’ve also used an LLM to generate the article structure from my notes. However, after doing so, I’ve reviewed it carefully and made the necessary changes so it reflects my thoughts, opinion, and style. Any resemblance to any other articles published anywhere, if any, is not intentional.
It’s the first time I’m experimenting with LLM-assisted writing and my goal was to see what the experience and end result would be.
Feedback is always welcome.
Introduction
Effective and efficient testing is crucial to ensure quality, reliability, and smooth functionality in software applications as well as a healthy software delivery process. In this article, I outline best practices and methodologies for different testing types (or categories) based on my experience testing backend applications. I call this approach diff-based testing.
Diff-based testing advocates that, in order to avoid repetition (efficiency) and ensure high-quality test coverage (effectiveness), each category (or layer) of tests must focus on what can be tested only through that category. I call it diff-based testing because the main idea behind it is to have each layer of tests validate only the behavior that can’t be validated in the previous layer. It’s heavily influenced by the test pyramid paradigm; however, it’s more opinionated. Diff-based testing also attempts to minimize the time spent running all test suites by achieving as much of the test coverage as possible with the fastest test types.
For example, diff-based testing states that unit tests must focus on validating the core business domain logic, whereas integration testing must focus on validating only what can’t be covered by unit tests: HTTP requests and responses, database operations, and messaging, to name a few.
Following this approach will naturally lead to a test pyramid where most of the tests are implemented as unit tests, which are the fastest type of test to execute since they run in-memory without exercising any I/O operations.
Imagine we have a Calculator API. If we can validate that the sum operation behaves correctly using unit tests, why should we validate the same behavior again using integration or any other type of test? That would be a waste of time and energy, since we’d have to implement and maintain the same test logic in multiple places. It also means that if you’re building a library with no I/O operations whatsoever, all you need is unit tests and their extensions like mutation and property testing (and maybe performance testing, depending on your library’s use case).
But since this article is focused on testing backend applications like REST APIs, I’ll be covering the other categories of tests I believe are necessary to cover all of the functionality usually implemented by this type of application.
Without further ado, here we go.
Unit Testing
Unit testing focuses on validating the correctness of individual components of business logic without requiring us to run the entire application. These tests aim to ensure that each part of the code works as expected in isolation. In order to maintain precise and efficient coverage, which are the goals of diff-based testing, we must ensure core business domain logic is validated only through unit tests, as mentioned before.
Unit tests should be limited to public methods and functions, as this approach provides clear insight into how well each component performs independently. Focusing on public methods helps ensure that tests remain aligned with how the code is used in practice, promoting better maintainability. Private methods or functions will be tested indirectly when testing the public ones.
By concentrating on the external behavior rather than internal implementation details, developers can refactor code with confidence, knowing the tests will continue to validate core functionality without needing constant updates.
To ensure clarity and maintainability, each class or module being tested should have its corresponding test class or module. Comprehensive coverage, including both typical (happy paths) and edge cases, ensures that a wide range of potential issues are captured early. Testing edge cases is crucial to identify behavior that might break under less common scenarios, thereby strengthening the reliability of the component.
Dependencies should not be tested directly within unit tests; instead, they should be mocked or stubbed to maintain the focus on the specific logic of the component being tested. Incorporating unit tests as part of the same code repository ensures cohesion and enables seamless code management.
Finally, unit tests can lead to better design as we make an effort to ensure our code is testable and modularized.
Example:
// Unit test for the addition operation in a Calculator REST API
import static org.junit.jupiter.api.Assertions.assertEquals;

import org.junit.jupiter.api.Test;

public class CalculatorServiceTest {

    @Test
    public void testAddition() {
        CalculatorService calculatorService = new CalculatorService();
        int result = calculatorService.add(3, 7);
        assertEquals(10, result, "Addition should return the correct sum.");
    }
}
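To illustrate the earlier point about mocking dependencies, here’s a minimal sketch using Mockito. It assumes a hypothetical OperationHistoryRepository that the service calls to record each operation; the repository is mocked so the test exercises only the service’s own logic.

// Unit test mocking a dependency so only the service logic is exercised
// (OperationHistoryRepository and its save method are hypothetical, for illustration only)
import static org.junit.jupiter.api.Assertions.assertEquals;
import static org.mockito.Mockito.mock;
import static org.mockito.Mockito.verify;

import org.junit.jupiter.api.Test;

public class CalculatorServiceWithDependencyTest {

    @Test
    public void testAdditionRecordsHistory() {
        // The dependency is mocked; its real implementation is never invoked
        OperationHistoryRepository historyRepository = mock(OperationHistoryRepository.class);
        CalculatorService calculatorService = new CalculatorService(historyRepository);

        int result = calculatorService.add(3, 7);

        assertEquals(10, result);
        // Verify the interaction with the dependency without testing the dependency itself
        verify(historyRepository).save("addition", 10);
    }
}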
Integration Testing
Integration testing ensures that the application can communicate and interact effectively with its external dependencies like databases, messaging platforms, external APIs, etc. Again, it is crucial to note that the core domain business logic should not be validated through integration tests since those should already have been validated through unit tests.
This ensures that integration tests remain focused on interactions and data flow rather than duplicating the work of unit tests. This type of testing is essential for verifying that the integration between architectural components works as intended, providing confidence that the system under test behaves as expected when integrated with its dependencies.
These tests should confirm that valid requests produce appropriate responses, while ensuring that anticipated errors, such as a database going offline or receiving invalid inputs, are handled gracefully.
Additionally, they should verify that expected error messages are returned for various issues, including invalid routes, parameters, or contracts. To simulate real-world scenarios without using actual production data, stubbing external services and databases is recommended. The test data should resemble production conditions as closely as possible to ensure realistic results.
Each functionality being tested should have its designated test class or module to keep the tests organized and maintainable. Integration tests should be able to create an application context or its equivalent. Integration tests must be fully automated to ensure they can be executed as part of the CI/CD pipeline, supporting continuous integration and delivery.
Another benefit of integration tests is that they allow validating a system’s integration with its dependencies before it’s deployed to the infrastructure where it’s going to run. This gives the delivery team quick feedback on the application’s behavior and potential defects early in the delivery process.
Example:
// Integration test using WireMock and an embedded Kafka broker for the Calculator API
// This example assumes we're testing an API which depends on an
// external notification API for notifying the user of completed operations.
// This API also publishes an event to Kafka for every operation performed.
@SpringBootTest(webEnvironment = SpringBootTest.WebEnvironment.RANDOM_PORT)
@EmbeddedKafka(partitions = 1, topics = {"operation-performed"})
public class CalculatorIntegrationTest {

    @Autowired
    private TestRestTemplate restTemplate;

    private WireMockServer wireMockServer;

    @BeforeEach
    public void setupMocks() {
        // WireMock setup for the external notification service
        wireMockServer = new WireMockServer(8081);
        wireMockServer.start();
        wireMockServer.stubFor(post(urlEqualTo("/notify"))
                .willReturn(aResponse().withStatus(200)));
    }

    @AfterEach
    public void tearDownMocks() {
        wireMockServer.stop();
    }

    @Test
    public void testAdditionEndpoint() {
        ResponseEntity<String> response = this.restTemplate.postForEntity(
                "/calculate/add", new OperationRequest(3, 7), String.class);
        assertThat(response.getStatusCode()).isEqualTo(HttpStatus.OK);
        assertThat(response.getBody()).contains("operation", "result");

        // Verify that an OperationPerformed event was published to Kafka.
        // createKafkaConsumer() is a helper (omitted here) that builds a consumer
        // pointing at the embedded broker.
        Consumer<String, String> consumer = createKafkaConsumer();
        consumer.subscribe(Collections.singletonList("operation-performed"));
        ConsumerRecord<String, String> record = KafkaTestUtils.getSingleRecord(consumer, "operation-performed");
        assertThat(record.value()).contains("operation", "addition", "result", "10");
        consumer.close();
    }

    @Test
    public void testDivideByZero() {
        ResponseEntity<String> response = this.restTemplate.postForEntity(
                "/calculate/divide", new OperationRequest(10, 0), String.class);
        assertThat(response.getStatusCode()).isEqualTo(HttpStatus.BAD_REQUEST);
        assertThat(response.getBody()).contains("Cannot divide by zero");
    }
}
End-to-End (E2E) Testing
End-to-end testing aims to validate the entire application flow and ensure that all components in the system interact seamlessly. However, traditional E2E tests tend to be complex, flaky, and costly to maintain.
Beyond being time-consuming, these tests are prone to failure at the slightest change in the system’s infrastructure, test data, or dependencies, making them difficult to rely on for continuous integration.
To address these limitations, contract testing is a more efficient alternative: it offers focused validation of the interactions between services without the extensive infrastructure or fragile nature of full E2E testing. Together with unit and integration tests, contract tests provide largely the same assurances with far less overhead.
I won’t be presenting an example since I consider such tests a bad practice or anti-pattern.
Contract Testing
Contract testing ensures that different components of an architecture, such as services and clients, interact correctly based on predefined agreements or “contracts.” These tests validate that both the producer (data provider) and the consumer (data user) adhere to these contracts. This includes both synchronous and asynchronous communication between components.
The consumer defines the data structure it needs, while the producer guarantees it can deliver this data format. By versioning contracts alongside the codebase and storing them in a shared repository, both sides can stay in sync. The most well-known contract test framework is Pact.
Contract tests should be executed at every stage of the CI/CD pipeline, validating published contracts in each environment (e.g., Dev, QA, Pre-Prod, and Prod) and ensuring that changes in one component do not unexpectedly impact another, keeping producers and consumers aligned.
None of the other test categories covered in this article can provide the guarantees of contract tests. Contract tests are essential when implementing distributed systems.
Example:
// Consumer-side contract test using Pact (JUnit 5 consumer API)
@ExtendWith(PactConsumerTestExt.class)
public class CalculatorConsumerContractTest {

    @Pact(consumer = "CalculatorConsumer", provider = "CalculatorProvider")
    public RequestResponsePact createPact(PactDslWithProvider builder) {
        return builder
                .given("Calculator provides addition operation")
                .uponReceiving("A request for addition")
                .path("/calculate/add")
                .method("POST")
                .body("{\"num1\": 3, \"num2\": 7}")
                .willRespondWith()
                .status(200)
                .body("{\"operation\": \"addition\", \"result\": 10}")
                .toPact();
    }

    @Test
    @PactTestFor(providerName = "CalculatorProvider", pactMethod = "createPact")
    public void testConsumerPact(MockServer mockServer) {
        // The request goes to Pact's mock server, which verifies it against the contract
        RestTemplate restTemplate = new RestTemplate();
        String response = restTemplate.postForObject(
                mockServer.getUrl() + "/calculate/add", new OperationRequest(3, 7), String.class);
        assertThat(response).contains("operation", "addition", "result");
    }
}
// Producer-side contract test using Pact (JUnit 5 provider verification)
@Provider("CalculatorProvider")
@PactFolder("pacts") // or @PactBroker(...) when contracts are stored in a Pact Broker
@SpringBootTest(webEnvironment = SpringBootTest.WebEnvironment.DEFINED_PORT)
public class CalculatorProviderContractTest {

    @BeforeEach
    void before(PactVerificationContext context) {
        context.setTarget(new HttpTestTarget("localhost", 8080));
    }

    @TestTemplate
    @ExtendWith(PactVerificationInvocationContextProvider.class)
    void verifyPact(PactVerificationContext context) {
        context.verifyInteraction();
    }
}
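When contracts are published to a Pact Broker, each pipeline stage can additionally gate promotion on contract compatibility. Here’s a sketch of such a gate using the Pact Broker CLI; the broker URL and version variable are placeholders, and older CLI versions use --to with a tag instead of --to-environment:

# Ask the broker whether this version is compatible with what's deployed in production
pact-broker can-i-deploy \
  --pacticipant CalculatorProvider \
  --version "$GIT_COMMIT" \
  --to-environment production \
  --broker-base-url https://pact-broker.example.com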
Exploratory Testing
Exploratory testing is performed to examine aspects of the application that are challenging to automate, such as user interface behavior and user experience. This testing type relies on the skills and intuition of QA professionals to identify unexpected behaviors and potential usability issues.
Conducted in a controlled QA environment, exploratory testing leverages the creativity and expertise of testers to investigate various scenarios. This approach helps uncover issues that structured test scripts might miss, ensuring a more holistic evaluation of the software.
Smoke Testing
Smoke testing serves as a quick validation method to verify that a recent deployment was successful. It is a lightweight test that checks basic application functionality without diving into deeper, more detailed testing.
This testing type focuses on ensuring that the application is accessible, responding as expected, and available at the correct routes. Typically performed after deployments in UAT and production, smoke tests provide immediate feedback on deployment success.
At this level we want to validate what can’t be validated by the integration and unit tests, i.e., that our application is capable of running in the provisioned infrastructure and can talk to the real dependencies deployed to that environment.
Example:
// Smoke test to verify basic availability of the Calculator API after deployment.
// It runs against the deployed environment rather than booting the application locally,
// so the base URL would typically come from the pipeline's configuration
// (the environment variable name here is illustrative).
public class CalculatorSmokeTest {

    private final RestTemplate restTemplate = new RestTemplate();
    private final String baseUrl = System.getenv("CALCULATOR_BASE_URL");

    @Test
    public void testServiceIsUp() {
        ResponseEntity<String> response = restTemplate.getForEntity(baseUrl + "/health", String.class);
        assertThat(response.getStatusCode()).isEqualTo(HttpStatus.OK);
        assertThat(response.getBody()).contains("status", "UP");
    }
}
Synthetic Monitoring
Synthetic monitoring involves running a subset of automated tests in a live production environment to ensure the system continues to work as expected. This proactive measure helps detect issues before users encounter them.
These tests use innocuous data, such as fake client profiles, dummy accounts, or synthetic transactions, that do not interfere with real transactions or analytics. By integrating synthetic tests with monitoring tools, organizations can receive alerts if these tests detect problems, allowing for quick intervention.
Example:
// Synthetic monitoring test: runs a health check that performs a synthetic operation in production.
// Like the smoke test, it calls the live environment directly instead of booting the application.
public class CalculatorSyntheticMonitoringTest {

    private final RestTemplate restTemplate = new RestTemplate();

    @Test
    public void testProductionHealthCheckWithSyntheticOperation() {
        // URL for a health endpoint that performs a synthetic operation
        String syntheticTestUrl = "https://production-url.com/health";
        ResponseEntity<String> response = restTemplate.getForEntity(syntheticTestUrl, String.class);
        assertThat(response.getStatusCode()).isEqualTo(HttpStatus.OK);
        assertThat(response.getBody()).contains("status", "UP");
    }
}
// Controller code for the health endpoint with a synthetic operation flag
@RestController
public class HealthController {

    @Autowired
    private CalculatorService calculatorService;

    @GetMapping("/health")
    public ResponseEntity<Map<String, Object>> performSyntheticOperation() {
        Map<String, Object> response = new HashMap<>();
        boolean isSynthetic = true;
        // The synthetic flag tells the service not to publish an OperationPerformed event,
        // so the check doesn't pollute real transactions or analytics
        int result = calculatorService.add(5, 10, isSynthetic);
        response.put("operation", "addition");
        response.put("result", result);
        response.put("status", "UP");
        return ResponseEntity.ok(response);
    }
}
Performance Testing
Performance testing aims to assess how the system performs under expected and peak load conditions. Shifting performance testing to the left—incorporating it early during the development phase—helps identify and resolve potential bottlenecks sooner.
Incorporating performance tests as part of the continuous delivery pipeline ensures that each new version of the software meets performance benchmarks, preventing performance degradation over time.
Performance is usually considered a non-functional requirement or, as I prefer, a cross-functional requirement. In the book Building Evolutionary Architectures, the authors present the concept of fitness functions, which are a way to ensure such requirements are met throughout the lifecycle of the system’s architecture.
When implementing fitness functions, I believe it’s totally fine to consider this category of tests as fitness functions or cross-functional tests, where performance tests are just a subset of this larger category. Logging and security, to name a few, are other potential subsets of tests that would belong in this category.
Example:
// Performance test using JUnit and a simple load-testing approach for the REST API endpoint
@SpringBootTest(webEnvironment = SpringBootTest.WebEnvironment.RANDOM_PORT)
public class CalculatorPerformanceTest {

    @Autowired
    private TestRestTemplate restTemplate;

    @Test
    public void testAdditionEndpointPerformance() {
        long startTime = System.nanoTime();
        for (int i = 0; i < 1000; i++) {
            ResponseEntity<String> response = restTemplate.postForEntity(
                    "/calculate/add", new OperationRequest(i, i + 1), String.class);
            assertThat(response.getStatusCode()).isEqualTo(HttpStatus.OK);
        }
        long endTime = System.nanoTime();
        Duration duration = Duration.ofNanos(endTime - startTime);
        // In AssertJ, withFailMessage must be set before the assertion is evaluated
        assertThat(duration.getSeconds())
                .withFailMessage("Performance test failed: took too long to complete.")
                .isLessThan(10L);
    }
}
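As an illustration of a fitness function outside the performance subset, here’s a minimal sketch using ArchUnit to enforce an architectural rule on every build; the package names are assumptions for illustration. A rule like this guards a cross-functional requirement the same way a performance threshold does.

// Architectural fitness function using ArchUnit (package names are illustrative assumptions)
import static com.tngtech.archunit.lang.syntax.ArchRuleDefinition.noClasses;

import com.tngtech.archunit.junit.AnalyzeClasses;
import com.tngtech.archunit.junit.ArchTest;
import com.tngtech.archunit.lang.ArchRule;

@AnalyzeClasses(packages = "com.example.calculator")
public class ArchitectureFitnessTest {

    // Domain logic must not depend on the web layer; the rule fails the build if violated
    @ArchTest
    static final ArchRule domainDoesNotDependOnWeb =
            noClasses().that().resideInAPackage("..domain..")
                    .should().dependOnClassesThat().resideInAPackage("..controller..");
}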
Mutation Testing
Mutation testing is a method of testing the test suite itself. By introducing small changes (mutations) to the code, this practice ensures that the existing tests can detect and fail appropriately when the code is altered.
Mutation testing helps assess the effectiveness and coverage of the test suite, revealing areas where additional tests may be necessary to improve robustness.
Those tests are usually performed by a library or framework which mutates the application code and then runs the existing suite of tests, with the goal of validating that the test suite fails when it should.
I don’t consider mutation testing as a category of its own. I think of it as an extension to unit testing.
Example:
// Mutation testing example using the PIT (Pitest) library
public class CalculatorMutationTest {

    @Test
    public void testAddition() {
        CalculatorService calculatorService = new CalculatorService();
        int result = calculatorService.add(2, 3);
        assertEquals(5, result, "Mutation test: ensure the addition logic is intact.");
    }

    // Note: The actual mutation testing is conducted by PIT, e.g., by running
    // the PIT Maven plugin or configuring it in your build tool.
    // This code example represents a standard unit test that PIT will mutate
    // to check if the test fails when the code is altered.
}

To run mutation testing with PIT, add the following to your Maven POM file:

<plugin>
  <groupId>org.pitest</groupId>
  <artifactId>pitest-maven</artifactId>
  <version>1.6.8</version>
  <configuration>
    <targetClasses>your.package.name.*</targetClasses>
    <targetTests>your.package.name.*Test</targetTests>
  </configuration>
</plugin>
Property Testing
Property testing focuses on verifying that the system holds true to specified properties over a range of inputs. This type of testing is designed to explore edge cases and input variations that a developer may not have initially considered.
In property testing, instead of specifying exact input and output pairs, the properties or invariants that the function should uphold are defined. The test framework then generates random input data and checks that the properties always hold. This method ensures that the software can handle a broader range of conditions and helps reveal hidden bugs that traditional example-based testing might miss.
Property testing complements unit and integration tests by pushing beyond predetermined cases and validating the system’s behavior in unexpected scenarios. Integrating property testing into the existing testing framework can be done by selecting tools that support property-based testing, such as QuickCheck, Hypothesis, or jqwik (used in the example below), and incorporating them into the test suite.
Developers should start by identifying key properties that functions or modules should satisfy and implement these as tests. This approach helps ensure that, across a variety of inputs, the software consistently meets the defined invariants, bolstering the overall reliability of the codebase.
By incorporating property testing, developers can gain greater confidence in the robustness of their code and discover vulnerabilities early in the development cycle.
Similar to mutation testing, I don’t consider property testing as a category of its own. I also think of it as an extension to unit testing.
Example:
// Property-based test using jqwik (@Property and @ForAll generate random inputs)
public class CalculatorPropertyTest {

    @Property
    public void testAdditionProperties(@ForAll int a, @ForAll int b) {
        CalculatorService calculatorService = new CalculatorService();
        int result = calculatorService.add(a, b);
        // The result must match the mathematical definition...
        assertThat(result).isEqualTo(a + b);
        // ...and addition must be commutative for any pair of inputs
        assertThat(result).isEqualTo(calculatorService.add(b, a));
    }
}
Testing Multi-Threaded and Asynchronous Code
I don’t consider multi-threaded and asynchronous tests as a separate category of testing, but since I’ve seen many teams struggle with it, I believe it deserves its own section.
Testing multi-threaded and asynchronous code presents unique challenges due to issues like non-determinism, where the order of execution can vary between runs. This variability can make tests flaky and difficult to trust.
To mitigate these challenges, it is essential to design tests that focus on the individual behavior performed by each thread or asynchronous task. A rule of thumb I use is to ensure the scope of a given test scenario ends at the boundary of a thread or async call. A telltale sign that something is wrong when testing multi-threaded or async behavior is the need to add a wait or sleep call for the test to pass.
Non-determinism can also be avoided by using synchronization mechanisms or testing frameworks that simulate controlled environments, ensuring that the tests remain predictable. Additionally, tests should isolate and validate smaller, independent units of work to avoid race conditions.
By adopting these practices, developers can build confidence that tests that validate multi-threaded and asynchronous code won’t result in flaky and untrustworthy tests.
Example:
@Testcontainers
public class KafkaPublisherIntegrationTest {

    // A static @Container field lets Testcontainers manage the broker lifecycle;
    // no manual start()/stop() calls are needed
    @Container
    private static KafkaContainer kafkaContainer =
            new KafkaContainer(DockerImageName.parse("confluentinc/cp-kafka:latest"));

    private static KafkaProducer<String, String> producer;
    private static KafkaConsumer<String, String> consumer;

    @BeforeAll
    public static void setUp() {
        // Producer properties (bootstrap servers from kafkaContainer.getBootstrapServers(),
        // serializers, etc.)
        Properties producerProps = new Properties();
        ...
        producer = new KafkaProducer<>(producerProps);
        // Consumer properties (bootstrap servers, group id, deserializers, etc.)
        Properties consumerProps = new Properties();
        ...
        consumer = new KafkaConsumer<>(consumerProps);
        consumer.subscribe(Collections.singletonList("test-topic"));
    }

    @AfterAll
    public static void tearDown() {
        producer.close();
        consumer.close();
    }

    @Test
    public void testEventPublication() throws ExecutionException, InterruptedException {
        String topic = "test-topic";
        String key = "test-key";
        String value = "test-value";
        // Publish the event to Kafka; blocking on the future confirms the broker accepted it
        Future<RecordMetadata> future = producer.send(new ProducerRecord<>(topic, key, value));
        RecordMetadata metadata = future.get();
        assertNotNull(metadata);
        assertEquals(topic, metadata.topic());
    }

    @Test
    public void testEventConsumption() {
        String topic = "test-topic";
        String key = "consumption-key";
        String value = "consumption-value";
        // Publish an event to set up the test
        producer.send(new ProducerRecord<>(topic, key, value));
        producer.flush(); // Ensure the event is sent before consuming
        // Poll the consumer and look for the record we just published; the same topic
        // may also contain records produced by other tests, so don't assert an exact count
        ConsumerRecords<String, String> records = consumer.poll(Duration.ofSeconds(5));
        boolean found = false;
        for (ConsumerRecord<String, String> record : records) {
            if (key.equals(record.key()) && value.equals(record.value())) {
                found = true;
            }
        }
        assertTrue(found);
    }
}
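Circling back to the wait/sleep smell mentioned above: when a test genuinely has to observe an asynchronous effect, polling for a condition with a timeout beats a fixed sleep. Here’s a minimal sketch using Awaitility; AsyncOperationProcessor and OperationRepository are hypothetical names for illustration.

// Awaiting an asynchronous effect without Thread.sleep, using Awaitility.
// AsyncOperationProcessor and OperationRepository are hypothetical, for illustration only.
import static org.awaitility.Awaitility.await;
import static org.junit.jupiter.api.Assertions.assertEquals;

import java.time.Duration;
import org.junit.jupiter.api.Test;

public class AsyncOperationTest {

    @Test
    public void testAsyncOperationCompletes() {
        AsyncOperationProcessor processor = new AsyncOperationProcessor();
        OperationRepository repository = new OperationRepository();

        processor.submitAddition(3, 7, repository); // processed on another thread

        // Poll until the expected state is observed or the timeout is reached,
        // instead of sleeping for a fixed amount of time
        await().atMost(Duration.ofSeconds(5)).untilAsserted(() ->
                assertEquals(10, repository.findLatestResult()));
    }
}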
General Best Practices for Automated Testing
To maintain the reliability of automated testing, flaky tests should be fixed, quarantined, or removed immediately; tests that fail inconsistently erode trust in the test suite and the CI/CD pipeline. Failing tests should stop the pipeline until they are resolved, ensuring that issues are not overlooked.
Running a subset of tests locally before committing code helps developers identify potential issues early and prevents surprises during CI/CD runs. Lastly, tests should never be commented out, ignored, or removed to pass a failing pipeline, as this quick-fix approach undermines the integrity of the testing process and can mask underlying issues.
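As an example of running a subset of tests locally, with Maven and the Surefire plugin one might run only the fast unit tests before committing (the naming pattern is an assumption about the project’s conventions):

# Run only test classes matching the unit-test naming pattern (pattern is illustrative)
mvn test -Dtest='*ServiceTest'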
By adhering to these best practices, development teams can create robust, maintainable, and high-quality software products while minimizing risks and ensuring a seamless user experience.
Conclusion
Diff-based testing is an approach to testing that is heavily based on the test pyramid paradigm but goes one step further and states that:
- We should always test a functionality using the fastest type of test possible, e.g., unit tests over integration tests, integration tests over smoke tests, and so on.
- We shouldn’t duplicate test logic in different layers of tests. Each layer should add coverage to the behavior that couldn’t be tested in the previous layer.
By doing so, we ensure we end up with a healthy suite of tests that’s effective, efficient, and easy to execute and maintain.