When teams start writing tests for their codebase, it’s very common to run into difficulty because certain types of code are hard to test, or seemingly “untestable”. A common mistake is to reach for mocks and elaborate setup code, trying to work around the bits that are hard to test. Generally speaking, these hard-to-test parts are dependencies, such as data access or frameworks. The better approach, and the key to testing, is to isolate things from each other, so that different types of tests can be applied at the appropriate scopes, where they provide the highest value for the least cost.
The classic “testing pyramid” describes a healthy structure of tests for an application: a wide base of cheap, fast-running component tests, a smaller layer of infrastructure tests that hit real databases and so on, and a small number of end-to-end automation tests at the top. Each layer adds value by doing something different. The component (“unit”) tests work entirely in-memory, with no IO dependencies, and can cover all the logical variations very cheaply and quickly. Infrastructure tests make sure that all the database calls work correctly, without the logical twists of the application, but with realistic infrastructure – this is much slower and more expensive, as the infrastructure has to be set up and torn down each time to keep the tests isolated. Automation tests make sure the system’s parts are composed correctly, by hitting the application from the outside and making sure it responds correctly for a handful of cases.
This works because we can apply elimination. When a component test fails, we know instantly that the problem is with logic. A failing infrastructure test tells us exactly where to look. A failing automation test doesn’t tell us exactly what’s wrong… but we know it’s not the things that are covered by component or infrastructure tests, so we can narrow down the search area pretty quickly. The better the job we do of covering the application with component and infrastructure tests, the smaller that search area is. The trick is in building our application in such a way as to maximize the coverage the cheaper tests can give us.
I think of this as a “cross shape”:
Component tests cover the entire width of the system, that is, all the logical variants. Every branch of every if statement, every case statement, every try/catch – all the routes through the codebase – should be captured in the core of the system, which depends not directly on external systems but on abstractions that can be stubbed. If you compose your system from pure functions, with no side effects, you get this outcome. If you’re using OO paradigms, put side effects behind interfaces, but keep all the decisions in the core of the application, so that the dependencies only do what the core of the application tells them to. The aim is to be able to cover the full width of the system, but zero depth – that is, zero infrastructure or frameworks.
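A minimal sketch of the idea, in Python for brevity (the article’s context is .NET, but the shape is the same); all names here are hypothetical, not from the article. The decision lives in a pure function, the side effect sits behind an abstraction, and the component test stubs that abstraction to cover every branch in-memory:

```python
from typing import Protocol

class OrderRepository(Protocol):
    """Abstraction for the side effect; the real implementation talks to a database."""
    def save(self, order_id: str, total: float) -> None: ...

def price_order(subtotal: float, is_member: bool) -> float:
    """Pure core logic: every branch here is cheap to cover in-memory."""
    if subtotal < 0:
        raise ValueError("subtotal cannot be negative")
    discount = 0.10 if is_member else 0.0
    return round(subtotal * (1 - discount), 2)

class StubRepository:
    """In-memory stand-in for the real data layer, used only in tests."""
    def __init__(self):
        self.saved = []
    def save(self, order_id, total):
        self.saved.append((order_id, total))

def checkout(repo: OrderRepository, order_id: str, subtotal: float, is_member: bool) -> float:
    total = price_order(subtotal, is_member)  # the decision stays in the core
    repo.save(order_id, total)                # the side effect only does what it's told
    return total

# Component test: full logical width, zero infrastructure depth.
repo = StubRepository()
assert checkout(repo, "A1", 100.0, True) == 90.0
assert checkout(repo, "A2", 100.0, False) == 100.0
assert repo.saved == [("A1", 90.0), ("A2", 100.0)]
```

Note that `checkout` never knows whether it is talking to a stub or a real database – that is what makes the full width of the logic testable without any infrastructure.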
Infrastructure tests cover the “bottom” of the system, as viewed in the traditional n-tier way. Infrastructure tests don’t deal with the domain or core of the application; they simply exercise the data-access methods directly. This mostly applies to data access, and you don’t want to be limited by having to drive it through intricate application flows. The goal is to make sure each method works correctly, essentially “unit testing” the data layer in isolation, against realistic but isolated infrastructure.
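As a hedged sketch of this (again in Python; names are illustrative), a tiny data-access class can be exercised directly against a real but isolated database – here an in-memory SQLite instance standing in for whatever database the application really uses – with no application logic involved:

```python
import sqlite3

class OrderStore:
    """Hypothetical data-access class: just persistence, no domain decisions."""
    def __init__(self, conn: sqlite3.Connection):
        self.conn = conn
        self.conn.execute(
            "CREATE TABLE IF NOT EXISTS orders (id TEXT PRIMARY KEY, total REAL)")

    def insert(self, order_id: str, total: float) -> None:
        self.conn.execute("INSERT INTO orders VALUES (?, ?)", (order_id, total))

    def get_total(self, order_id: str) -> float:
        row = self.conn.execute(
            "SELECT total FROM orders WHERE id = ?", (order_id,)).fetchone()
        return row[0]

# Infrastructure test: each test gets a fresh database, so tests stay isolated.
conn = sqlite3.connect(":memory:")
store = OrderStore(conn)
store.insert("A1", 90.0)
total = store.get_total("A1")
assert total == 90.0
conn.close()
```

The point is that the test calls `insert` and `get_total` directly, rather than going through the application’s flows, and the setup/teardown cost is paid per test to keep them independent.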
Automation tests cover the entire depth of the system, hitting a deployed instance. Your automation tests essentially see what an end user would see, and should know nothing about the internals of the system. This obviously severely limits what can be tested this way, but you should by now have covered a lot of the system in other cheaper ways. Automation tests tell you that you have dealt with frameworks correctly, configured HTTP servers correctly, set up connection strings and configuration correctly, deployed correctly and so on. If anything is wrong, at any level, the automation tests should tell you about it by failing. These tests are expensive, but really important – they are your automated smoke test that tells you everything else is alright, for real. Because these tests are expensive, and hard to write well, save them for deep-dives through the application, touching as much as possible in a simple use case. As a rule-of-thumb, I go for automation tests for a single “happy case” (no errors) for each feature.
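To sketch the shape of such a test (a hypothetical example: here a toy HTTP server stands in for the deployed application, and the endpoint name is invented), the test knows nothing about internals – it just hits the system over HTTP and checks one happy case:

```python
import json
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

class AppHandler(BaseHTTPRequestHandler):
    """Toy stand-in for the real deployed application."""
    def do_GET(self):
        if self.path == "/health":
            body = json.dumps({"status": "ok"}).encode()
            self.send_response(200)
            self.send_header("Content-Type", "application/json")
            self.end_headers()
            self.wfile.write(body)
        else:
            self.send_response(404)
            self.end_headers()
    def log_message(self, *args):  # keep test output quiet
        pass

server = HTTPServer(("127.0.0.1", 0), AppHandler)  # port 0: pick a free port
threading.Thread(target=server.serve_forever, daemon=True).start()

# The automation test: one end-to-end happy case through the whole stack.
url = f"http://127.0.0.1:{server.server_port}/health"
with urllib.request.urlopen(url) as resp:
    status = resp.status
    payload = json.loads(resp.read())
server.shutdown()

assert status == 200
assert payload == {"status": "ok"}
```

In a real suite the server would not be started by the test at all – the test would point at a genuinely deployed instance, which is exactly what makes a pass meaningful.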
The tests that I exclude are the “middle” ones… things like testing controllers in an ASP.Net application by mocking/stubbing/using fake servers. The problem with the middle tests is that they tend to be very tightly coupled to the frameworks you’re using, so much so that they become useless in any significant refactor of the system. If you want to replace ASP.Net WebApi with Nancy or some other framework, you can lean on your automation tests, but any kind of “controller tests” will need so much modification that they become meaningless as a safety net. These tests tend to be hard and expensive to write, and tell you the least about the system – so don’t bother with them. Thin down your code at this level, so that all your controllers do is mediate between HTTP and the core system – people refer to this as “thin controllers”.
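A rough sketch of what “thin” means here (a framework-free Python illustration; the request/response shapes and names are invented for the example): the controller only translates between HTTP and the core, leaving no decision logic worth testing at this layer beyond input validation.

```python
def quote_price(subtotal: float, is_member: bool) -> float:
    """Core logic – already fully covered by component tests."""
    return round(subtotal * (0.9 if is_member else 1.0), 2)

def price_controller(request: dict) -> dict:
    """Thin controller: parse the HTTP-shaped input, call the core,
    shape the result – nothing else."""
    try:
        subtotal = float(request["subtotal"])
        is_member = bool(request.get("member", False))
    except (KeyError, ValueError):
        return {"status": 400, "body": {"error": "bad request"}}
    return {"status": 200, "body": {"total": quote_price(subtotal, is_member)}}

assert price_controller({"subtotal": "100", "member": True}) == \
    {"status": 200, "body": {"total": 90.0}}
assert price_controller({})["status"] == 400
```

With the controller this thin, a framework swap only means rewriting the translation layer – the core and its tests survive untouched, and the automation tests verify the new wiring.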
Keep the infrastructure and logic in your system strictly separate, and you’ll find it easier to test and maintain – and earn plenty of SOLID kudos points! Push complexity away from the edges, away from code you don’t control, and away from things that are hard to test.