Welcome to our new blog series about everything related to testing. Hopefully, these blog posts will give you a general idea of how we write tests here at Mediatoolkit and why. Before we dive into examples, let’s start off with some basic definitions, ideas, and benefits of writing tests.
What kind of tests are out there?
Not all tests are created equal. There are different kinds of tests for different purposes. While this post focuses mainly on unit tests, we should be aware of the differences and understand the benefits of each.
1. Manual tests
Manual testing involves tests executed manually by a tester. These are usually done by Quality Assurance or by developers themselves. QA should be seen as the last line of defense, not the main application tester (UI testing, where manual testing by QA is the main way of testing, is a valid exception).
Developers usually run manual tests when they want to try things out locally and see how the system behaves without pushing any code to staging or, God save you, production. However, the goal of these tests is not the code’s robustness. Their only role is to catch bugs and ensure the quality of your product.
2. Integration tests
Systems are composed of multiple components, or at least they should be. Integration tests check the public API of these components. Not just REST API, but any publicly exposed API, such as your Kafka topic communication.
If a component “says” it outputs a message on a topic in a certain format, that is its public API or contract. Integration testing is checking if multiple individually developed components can communicate and if their contracts match as a group.
If you only test these components individually, their individual behavior might work correctly, but when you try to connect them in a greater group, you discover that their contracts differ.
3. End-to-end (E2E) tests
End-to-end testing verifies the user experience flow of the end product. The complexity of systems today is hard to cover with tests. Many systems rely on each other, and E2E tests ensure that the product as a whole does what is expected. QA validates the correctness of your product by going through the end user’s flow and checking if all the systems behave as expected.
4. Unit tests
Unit tests are the backbone of any reliable software. They make the foundation for other tests. They test individual units.
Developers might mistake a unit for a single method. Unit tests should test the behavior of a component, which might be composed of multiple different classes. Public methods accessible by other components need to be tested. Do not test protected methods or classes: they are implementation details, not part of your public API.
Why should we even test?
If we write good tests and write them often, we can ensure the quality of our product before it sees the light of day.
As our system ages, the benefits of testing become more and more apparent. Our environments become reliable, and tests save development time and a lot of hair-pulling when things inevitably go wrong. Your colleagues will be thankful when they can take a peek at your tests and understand what your code should and shouldn’t do without having to manually run things.
Before we start talking about “robust code”, how should we define it? What makes code robust? Does that mean that we should program defensively and think about how other developers might abuse our code?
Robust code is always simple and clean. Simplicity is robust, complexity is fragile. We should handle invalid inputs, but that doesn’t mean that we need to program defensively and not trust our team.
Gall’s Law: A complex system that works is invariably found to have evolved from a simple system that worked.
The inverse proposition also appears to be true: a complex system designed from scratch never works and cannot be made to work. You have to start over, beginning with a working simple system.
When your code is covered by tests, you no longer fear changing existing code. After each change, you can run your tests and make sure you didn’t break stuff. When you have tests, you don’t have to program defensively.
Refactoring without tests is going down a slippery slope that will end up in sleepless nights and working Sundays. This topic is too broad to be covered here and deserves a blog post of its own in the future.
How do we not write tests?
It is just as important to know how NOT to write unit tests as it is to know how to write them.
- Writing a test that prints the result of a method call is not a test since we do not validate the desired outcome.
- If your test reads data from a file in your Documents folder, it isn’t a real unit test since tests should not depend on the environment.
- Any developer should be able to check out your code and run the tests successfully without doing anything else.
- Every unit test should be independent of other tests. That implies that the execution order of your tests also shouldn’t matter.
- Running tests multiple times should always end with the same results if we don’t change any code.
- Unit tests should test behaviors, not individual method calls. Not every class and method needs to have its own test.
Behavior is something that produces a real value that the user of your system needs. Does your user need to know if productFactory.create() created the same object when called twice or if your repository was called with some parameters? Probably not, but still many developers write exactly these kinds of tests.
If your tests look like that, they are tightly coupled with your implementation. Each time you want to change the details of your implementation, you need to update your tests, even though the behavior is the same. Your tests should change only when the behavior changes, not the implementation details. In other words, test what your code does, not how it does it.
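As a minimal sketch of testing what the code does rather than how, assume a hypothetical PriceCalculator that applies a percentage discount. None of these names come from our codebase, and the small check helper only stands in for the assertions a framework such as JUnit would provide. The test looks only at the price the caller gets back, so it should survive any change to how the discount is computed internally.

```java
final class BehaviorTestExample {

    // Hypothetical component under test; how it computes the total is an implementation detail.
    static final class PriceCalculator {
        private final int discountPercent;

        PriceCalculator(int discountPercent) {
            this.discountPercent = discountPercent;
        }

        // Public behavior: the total in cents after the discount is applied.
        long totalInCents(long priceInCents, int quantity) {
            long gross = priceInCents * quantity;
            return gross - discountOf(gross);
        }

        // Private helper: free to be renamed, inlined or replaced without touching the test.
        private long discountOf(long gross) {
            return gross * discountPercent / 100;
        }
    }

    // Stand-in for a test framework assertion (an assumption made to keep the sketch self-contained).
    static void check(boolean condition, String message) {
        if (!condition) {
            throw new AssertionError(message);
        }
    }

    // Verifies the resulting price, not which internal methods were called.
    static void tenPercentDiscount_reducesTotalByTenPercent() {
        PriceCalculator calculator = new PriceCalculator(10);

        long total = calculator.totalInCents(2_000, 3);

        check(total == 5_400, "expected 5400 but got " + total);
    }

    public static void main(String[] args) {
        tenPercentDiscount_reducesTotalByTenPercent();
        System.out.println("all checks passed");
    }
}
```

If discountOf were later inlined or replaced by a lookup table, this test would not need to change as long as the resulting price stays the same.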
How do we write tests?
Our tests must follow best coding practices, be environment independent, and execute fast.
It is important to keep the test execution time as short as possible. Each test shouldn’t take more than a couple of milliseconds. When tests take too long to execute, people tend to skip them and just rely on their CI server, such as Jenkins, to make a fuss when it cannot build their deployment executables.
Each test is composed of three ‘A’ sections (the AAA pattern): Arrange, Act, and Assert.
In the Arrange section of our test, we ensure our system is in a specific state before calling the behavior we want to test. The ‘system’ could be an object which we need to set up in a specific way to produce behavior, a set of temporary files we need to create, or things of that nature.
The code in this section is usually larger than the other two combined.
One design pattern that should prove particularly useful to keep this section small is Object Mother. This design pattern is very similar to Factory, but it has more specific methods that build pre-configured objects for you. While a standard Factory could have a single general-purpose creation method, an ObjectMother will have methods named after the specific, ready-made object each one returns.
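To make the difference concrete, here is a small sketch. The User, UserFactory and UserMother names and their methods are hypothetical and chosen only for illustration; they are not taken from our code.

```java
final class ObjectMotherExample {

    // Hypothetical domain object used only for illustration.
    static final class User {
        final String name;
        final boolean active;
        final boolean admin;

        User(String name, boolean active, boolean admin) {
            this.name = name;
            this.active = active;
            this.admin = admin;
        }
    }

    // A generic factory: the caller has to supply every detail each time.
    static final class UserFactory {
        static User create(String name, boolean active, boolean admin) {
            return new User(name, active, admin);
        }
    }

    // An Object Mother: each method names the pre-configured object it returns.
    static final class UserMother {
        static User activeAdmin() {
            return UserFactory.create("alice", true, true);
        }

        static User deactivatedUser() {
            return UserFactory.create("bob", false, false);
        }
    }

    public static void main(String[] args) {
        // The Arrange section shrinks to one intention-revealing line.
        User admin = UserMother.activeAdmin();
        System.out.println(admin.name + " is admin: " + admin.admin);
    }
}
```

A test that needs an administrator can now say so directly instead of repeating constructor arguments whose meaning the reader has to decode.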
The Act section of your test must have one line. This line executes the behavior under test. If you find yourself writing more than one line for this section, you probably don’t have the right encapsulation of your behavior. Your clients shouldn’t be expected to call multiple methods of your object in a particular order, so why would your tests?
This line is the method call we want to test. If this method returns a result, store that value in a variable so you can check it against the expected value in the Assert step.
After we have prepared the system in the Arrange section and executed our action under test in the Act section, we need to validate the result of the action in the Assert section. We usually check the method’s result here, but sometimes our methods do not return values and produce side effects instead. If our code was expected to change the state of an object, create a file, or remove something from a List, we should check if it did exactly that.
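Put together, the three sections could look like the sketch below. ShoppingCart is a hypothetical class invented for this example, and check is a stand-in for the assertion methods of a test framework such as JUnit.

```java
import java.util.ArrayList;
import java.util.List;

final class ArrangeActAssertExample {

    // Hypothetical component under test.
    static final class ShoppingCart {
        private final List<String> items = new ArrayList<>();

        void add(String item) {
            items.add(item);
        }

        // Returns true if the item was present and has been removed.
        boolean remove(String item) {
            return items.remove(item);
        }

        int size() {
            return items.size();
        }

        boolean contains(String item) {
            return items.contains(item);
        }
    }

    // Stand-in for a test framework assertion (an assumption of this sketch).
    static void check(boolean condition, String message) {
        if (!condition) {
            throw new AssertionError(message);
        }
    }

    static void cartWithTwoItems_removeOne_leavesTheOther() {
        // Arrange: put the system into the state the behavior needs.
        ShoppingCart cart = new ShoppingCart();
        cart.add("book");
        cart.add("pen");

        // Act: a single line that triggers the behavior; keep the result for the next step.
        boolean removed = cart.remove("book");

        // Assert: check the returned value and the side effect on the cart.
        check(removed, "remove should report success");
        check(cart.size() == 1, "one item should remain");
        check(cart.contains("pen"), "the other item should be untouched");
    }

    public static void main(String[] args) {
        cartWithTwoItems_removeOne_leavesTheOther();
        System.out.println("all checks passed");
    }
}
```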
Stubs vs Mocks
Most developers use the terms mock and stub interchangeably, but there are differences.
A stub cannot fail the test, a mock can.
Stubs are state-based: they return hard-coded values. “What result did I get?”
Mocks are behavior-based: you use them to verify how your code interacts with them. “How did I get the result?”
If your unit tests are littered with mocks, your tests end up very fragile, meaning that each time you change one of your implementation details, you need to update all of your mock calls.
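The sketch below writes both kinds of test double by hand to show the difference. Converter, ExchangeRates and AuditLog are hypothetical names made up for this example; in practice a library such as Mockito would usually generate the doubles for you.

```java
import java.util.ArrayList;
import java.util.List;

final class StubVsMockExample {

    // Hypothetical collaborators of the component under test.
    interface ExchangeRates {
        double rate(String from, String to);
    }

    interface AuditLog {
        void record(String entry);
    }

    // Hypothetical component under test.
    static final class Converter {
        private final ExchangeRates rates;
        private final AuditLog log;

        Converter(ExchangeRates rates, AuditLog log) {
            this.rates = rates;
            this.log = log;
        }

        double convert(double amount, String from, String to) {
            log.record(from + "->" + to);
            return amount * rates.rate(from, to);
        }
    }

    // Stub: returns a hard-coded value and can never fail the test by itself.
    static final class FixedRates implements ExchangeRates {
        @Override
        public double rate(String from, String to) {
            return 2.0;
        }
    }

    // Hand-written mock: remembers how it was called and fails the test if that is not what we expected.
    static final class MockAuditLog implements AuditLog {
        private final List<String> entries = new ArrayList<>();

        @Override
        public void record(String entry) {
            entries.add(entry);
        }

        void verifyRecordedExactly(String expected) {
            if (entries.size() != 1 || !entries.get(0).equals(expected)) {
                throw new AssertionError("expected exactly [" + expected + "] but got " + entries);
            }
        }
    }

    // State-based: "What result did I get?"
    static void eurToUsd_convert_multipliesByRate() {
        Converter converter = new Converter(new FixedRates(), entry -> { }); // do-nothing log

        double result = converter.convert(10.0, "EUR", "USD");

        if (result != 20.0) {
            throw new AssertionError("expected 20.0 but got " + result);
        }
    }

    // Behavior-based: "How did I get the result?" This test has to change whenever the logging details change.
    static void eurToUsd_convert_writesOneAuditEntry() {
        MockAuditLog log = new MockAuditLog();
        Converter converter = new Converter(new FixedRates(), log);

        converter.convert(10.0, "EUR", "USD");

        log.verifyRecordedExactly("EUR->USD");
    }

    public static void main(String[] args) {
        eurToUsd_convert_multipliesByRate();
        eurToUsd_convert_writesOneAuditEntry();
        System.out.println("all checks passed");
    }
}
```

Note that the second test knows the exact format of the audit entry. That is the coupling to implementation details that makes mock-heavy tests fragile.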
Test only one thing.
We need to be able to isolate one behavior and prove that it works. If that behavior should work differently with different inputs, we need to write a new test for each of those behaviors. It is hard to know why our test failed if we have a large test that tests multiple things at once. Moreover, it gets harder to remove features we don’t need anymore and see which features we broke when we add new code.
It is perfectly fine to have multiple asserts in the final section of your tests as long as that doesn’t require that you call the behavior under test multiple times. If you do need to call it more than once, it will be hard to pinpoint the faulty behavior and fix it.
When we assert multiple behaviors in one test, we don’t get a clear picture of what exactly doesn’t work because the test will report only the first failure and the rest are skipped. It gets a lot harder to understand what changes are necessary and how many things do not work as expected.
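As a sketch of this rule, the example below uses a hypothetical Version.parse method. The first test has two asserts, but both inspect the result of a single call. The rejected input is a different behavior, so it gets its own test. The check helper is again only a stand-in for a test framework.

```java
final class OneBehaviorPerTestExample {

    // Hypothetical unit under test: parses "major.minor" strings.
    static final class Version {
        final int major;
        final int minor;

        Version(int major, int minor) {
            this.major = major;
            this.minor = minor;
        }

        static Version parse(String text) {
            String[] parts = text.split("\\.");
            if (parts.length != 2) {
                throw new IllegalArgumentException("expected major.minor but got: " + text);
            }
            return new Version(Integer.parseInt(parts[0]), Integer.parseInt(parts[1]));
        }
    }

    // Stand-in for a test framework assertion (an assumption of this sketch).
    static void check(boolean condition, String message) {
        if (!condition) {
            throw new AssertionError(message);
        }
    }

    // One behavior, one call, several asserts on the same result.
    static void validText_parse_returnsMajorAndMinor() {
        Version version = Version.parse("1.2");

        check(version.major == 1, "major should be 1");
        check(version.minor == 2, "minor should be 2");
    }

    // A different input produces a different behavior, so it is a separate test.
    static void textWithoutDot_parse_isRejected() {
        boolean rejected = false;
        try {
            Version.parse("12");
        } catch (IllegalArgumentException expected) {
            rejected = true;
        }

        check(rejected, "input without a dot should be rejected");
    }

    public static void main(String[] args) {
        validText_parse_returnsMajorAndMinor();
        textWithoutDot_parse_isRejected();
        System.out.println("all checks passed");
    }
}
```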
One thing that separates good tests from great tests is the test name. Tests should tell us not only what they do but also when they do it.
There are a lot of good naming patterns you can use, so pick the one you find most descriptive and stick with it.
When you write unit tests, it is much more important to communicate what you are testing than it is to follow the best method naming practices. For example, in Java, we use camelCase when writing methods, but it is perfectly valid to use underscore (_) to separate the state from the action in your test name.
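The following names are illustrative only, not taken from our test suite. They follow one possible state_action_expectedOutcome pattern, with underscores separating the parts, and exercise java.util.ArrayDeque so that the sketch stays self-contained.

```java
import java.util.ArrayDeque;
import java.util.Deque;

final class TestNamingExample {

    // Stand-in for a test framework assertion (an assumption of this sketch).
    static void check(boolean condition, String message) {
        if (!condition) {
            throw new AssertionError(message);
        }
    }

    // state: empty stack, action: push, outcome: it is no longer empty
    static void emptyStack_push_makesItNonEmpty() {
        Deque<String> stack = new ArrayDeque<>();

        stack.push("a");

        check(!stack.isEmpty(), "stack should contain the pushed element");
    }

    // state: one element, action: pop, outcome: that element comes back
    static void stackWithOneElement_pop_returnsThatElement() {
        Deque<String> stack = new ArrayDeque<>();
        stack.push("a");

        String popped = stack.pop();

        check(popped.equals("a"), "pop should return the pushed element");
    }

    // state: empty stack, action: peek, outcome: null instead of an exception
    static void emptyStack_peek_returnsNull() {
        Deque<String> stack = new ArrayDeque<>();

        String top = stack.peek();

        check(top == null, "peek on an empty stack should return null");
    }

    public static void main(String[] args) {
        emptyStack_push_makesItNonEmpty();
        stackWithOneElement_pop_returnsThatElement();
        emptyStack_peek_returnsNull();
        System.out.println("all checks passed");
    }
}
```

Reading only the three method names already tells us which states and actions are covered, without opening a single test body.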
The tests you write should follow all the clean code practices you apply to your code. Tests are not second-class citizens and you need to apply the same level of care as you do with the rest of your code to make them readable.
The definition of code duplication in tests is very important. The DRY principle (Don’t Repeat Yourself) applies to extracting behavior that changes for the same reason. Tests change for different reasons, so be wary of extracting things from your tests unless they truly change for the same reason. Spoiler alert, they often do not.
if statements do not belong in tests. An if statement tells us that our test does at least two different things and that we would be better off rewriting it as two different tests. When the tests are named properly, it will be easier to understand what they do and what all the different behaviors are.
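A small sketch of that rewrite, using a hypothetical shipping rule invented for this example: the first test hides two behaviors behind a branch, while the two tests after it each cover one behavior and say so in their names.

```java
final class NoIfInTestsExample {

    // Hypothetical rule: orders of 5000 cents or more ship for free, others pay 499 cents.
    static long shippingInCents(long orderTotalInCents) {
        return orderTotalInCents >= 5_000 ? 0 : 499;
    }

    // Stand-in for a test framework assertion (an assumption of this sketch).
    static void check(boolean condition, String message) {
        if (!condition) {
            throw new AssertionError(message);
        }
    }

    // Avoid: the branch repeats the production logic and hides two behaviors behind one vague name.
    static void shippingCostIsCorrect(long orderTotalInCents) {
        long cost = shippingInCents(orderTotalInCents);
        if (orderTotalInCents >= 5_000) {
            check(cost == 0, "expected free shipping");
        } else {
            check(cost == 499, "expected the flat fee");
        }
    }

    // Prefer: one test per behavior, each readable on its own.
    static void orderAtThreshold_shipping_isFree() {
        long cost = shippingInCents(5_000);

        check(cost == 0, "expected free shipping but got " + cost);
    }

    static void orderBelowThreshold_shipping_costsFlatFee() {
        long cost = shippingInCents(4_999);

        check(cost == 499, "expected 499 but got " + cost);
    }

    public static void main(String[] args) {
        orderAtThreshold_shipping_isFree();
        orderBelowThreshold_shipping_costsFlatFee();
        System.out.println("all checks passed");
    }
}
```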
When should we write tests?
Guided by the principles of TDD, we should write tests before writing new code.
When we need to add a new feature, we first describe the desired behavior as a new test. We make the least amount of changes necessary to pass that test without breaking any others.
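As a rough sketch of that order of work, imagine we need a hypothetical slugify feature that turns a title into a URL-friendly string. The test is written first and describes the behavior; the implementation below it is only the smallest change that makes the test pass. Both names are invented for this example.

```java
import java.util.Locale;

final class TestFirstExample {

    // Stand-in for a test framework assertion (an assumption of this sketch).
    static void check(boolean condition, String message) {
        if (!condition) {
            throw new AssertionError(message);
        }
    }

    // Step 1: describe the desired behavior as a test. Until step 2 exists, this cannot pass.
    static void titleWithSpaces_slugify_usesLowercaseAndDashes() {
        String slug = slugify("Hello World");

        check(slug.equals("hello-world"), "expected hello-world but got " + slug);
    }

    // Step 2: the least amount of code that makes the test pass, and nothing more.
    static String slugify(String title) {
        return title.toLowerCase(Locale.ROOT).replace(' ', '-');
    }

    public static void main(String[] args) {
        titleWithSpaces_slugify_usesLowercaseAndDashes();
        System.out.println("all checks passed");
    }
}
```

Handling punctuation or repeated spaces is deliberately left out: each of those would first get its own failing test.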
Otherwise, the more time passes, the more code we have that has not been tested and the chance of introducing bugs or overengineering goes up.
Also, the tests become more complex since we have more code to test now, and what usually happens is that developers adjust the tests to the code. That means that we adjust the expected behavior to match what our code does instead of the other way around.
When we write tests early, the definition of the problem becomes smaller and it is easier to wrap our head around such issues than more generic and complex behaviors.
While writing tests might seem like an optional thing to do, it is crucial to begin with good foundations. Coding is hard already, so make it easier on yourself and your teammates by writing testable code that is easier to read, understand and maintain. Finally, if you have difficulties making your tests comply with your wishes, the problem is more likely in your code than in your tests.