
Vanoma software testing philosophy

Tuesday, May 3rd, 2022

Vanoma relies heavily on continuous deployment to ship new features and improvements to our customers. The confidence to ship continuously would not be possible without reliable automated tests, which have become an industry standard for most software companies. In this blog post, we want to highlight our testing philosophy and explain why we do things a certain way.

Facts and constraints

Before we dive into how we approach testing at Vanoma, let's lay out some facts and constraints of our tech stack.

First and foremost, there are many techniques and philosophies about software testing, with varying degrees of applicability to different software needs. So our philosophy evolved out of the need to iterate fast on the product side while ensuring that we are not serving our customers a half-baked, buggy product. Second, we happen to use Python and Java for our backend, so our testing approach should transcend language barriers and remain agnostic of the programming language. Finally, given how fast our codebase changes, documentation is almost non-existent! Our testing philosophy should therefore account for this reality so that tests can serve as an organized body of documentation, at least in some capacity.

Endpoint-based testing

Vanoma uses a micro-service architecture, as depicted in the picture below. Each service exposes a set of RESTful API endpoints that other services or our internal and external clients can call. Each service can also optionally call external third-party services while processing a request.

A simplified view of Vanoma's architecture

Using service B depicted in the picture above as an example, a client of this service (be it our web apps or API customers) only cares whether the service fulfills its responsibility: returning an appropriate response for a given request. That same idea underpins Vanoma's testing philosophy. We hit all endpoints exposed by the service during testing and assert that the service returns the expected responses. Our tests do not care how the service fulfills the request, in the same way that the clients do not. To illustrate this concept, let's assume a /foo endpoint that creates a foo resource and sends an SMS to the phone number associated with foo.

In a test case of the /foo endpoint (sketched in code after the list):

  1. We call the endpoint with a valid foo payload in the request body.
  2. Next, we assert that the endpoint returns a response with a 201 status. As part of the assertion, we also verify that the attributes returned in the response body match the payload included in the request.
  3. Finally, we verify that the service created a foo record in the database, in addition to validating the payload of the request sent to the SMS provider. This last step of verifying database and SMS provider interactions is critical, as it validates our assumptions about how the service interacts with its dependencies.
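The test case above might look something like the following pytest sketch. The client and db fixtures, the app.sms and app.models module paths, and the payload attributes are hypothetical stand-ins for whatever the real service wires up; this is an illustration of the approach, not our actual code.

    from unittest.mock import patch

    from app.models import Foo  # hypothetical ORM model


    def test_create_foo(client, db):
        payload = {"name": "example", "phoneNumber": "+250788000000"}

        # 1. Call the endpoint with a valid foo payload, stubbing out the
        #    hypothetical SMS client so no real SMS is sent.
        with patch("app.sms.send_sms") as send_sms:
            response = client.post("/foo", json=payload)

        # 2. Assert the 201 status and that the response echoes the payload.
        assert response.status_code == 201
        body = response.get_json()
        assert body["name"] == payload["name"]
        assert body["phoneNumber"] == payload["phoneNumber"]

        # 3. Verify the interactions with the service's dependencies: a foo
        #    record exists in the database, and the SMS provider was called
        #    with the expected phone number.
        assert db.session.query(Foo).filter_by(name="example").count() == 1
        send_sms.assert_called_once()
        assert send_sms.call_args.args[0] == payload["phoneNumber"]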

We just described one test case of the /foo endpoint, but we can add more. We can, for example, add another test case with an invalid foo payload or simulate the SMS provider being down, as sketched below. We try to capture as many "what-if" scenarios as possible in the tests.
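Two such "what-if" cases might look as follows, using the same hypothetical fixtures and module paths as before. The specific status codes (400 and 502) are assumptions about how the service would report these failures.

    from unittest.mock import patch

    from app.models import Foo  # hypothetical ORM model


    def test_create_foo_with_invalid_payload(client, db):
        # Missing the required phoneNumber attribute.
        response = client.post("/foo", json={"name": "example"})

        assert response.status_code == 400
        # No foo record should have been created.
        assert db.session.query(Foo).count() == 0


    def test_create_foo_when_sms_provider_is_down(client, db):
        payload = {"name": "example", "phoneNumber": "+250788000000"}

        # Simulate the SMS provider failing mid-request.
        with patch("app.sms.send_sms", side_effect=ConnectionError):
            response = client.post("/foo", json=payload)

        assert response.status_code == 502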

Pros and cons

Our testing philosophy affords a certain degree of confidence that our services are functional while shielding us from the need to have unit and integration tests. Given the pace at which our codebase changes, unit testing would result in lots of tests becoming useless in a short time. As for integration tests that touch all the micro-services, there would be a non-trivial amount of overhead needed to maintain them at this stage. We found that service-level endpoint tests strike a reasonable balance between unit and integration testing. We still have a handful of unit tests for code not interfaced through endpoints (e.g., cron jobs), but they make up a tiny fraction of our test suite.
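For the sake of illustration, such a unit test for non-endpoint code might look like the sketch below. The expire_stale_orders helper and its 24-hour cutoff are purely hypothetical; only the shape of the test reflects our practice.

    from datetime import datetime, timedelta


    def expire_stale_orders(orders, now, max_age=timedelta(hours=24)):
        """Return the orders that are older than max_age (hypothetical helper)."""
        return [order for order in orders if now - order["created_at"] > max_age]


    def test_expire_stale_orders():
        now = datetime(2022, 5, 3, 12, 0)
        orders = [
            {"id": 1, "created_at": now - timedelta(hours=30)},  # stale
            {"id": 2, "created_at": now - timedelta(hours=1)},   # fresh
        ]

        stale = expire_stale_orders(orders, now)

        assert [order["id"] for order in stale] == [1]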

Finally, as with anything in software development, we are making some tradeoffs by adopting this philosophy. The main tradeoff is that we're not optimizing for "test run time." There's a fractional overhead needed to set up each endpoint test, which adds up in the cumulative test run time. So as the number of tests increases, it might become impractical to stick to endpoint testing while covering all the edge cases. We might have to adopt the classic software testing pyramid at that point. But until then, "test run time" is the tithe we pay for adopting this philosophy!

-- Theophile Nsengimana