Thinking Bicycle

This website is copyright © 1995-2016 Christopher Mark Gore, all rights reserved. The content may not be copied or republished via any means, traditional or electronic, without express written permission of the author.

Chris Gore: Blog: 2015: Wednesday, August 26

Introducing Generative Load Testing

Generative Testing

I've been using various forms of generative testing for a while now, and it's great. For those of you not already familiar with it, the basic idea behind generative testing is to randomly generate lots of inputs for the function, program, or whatever you are trying to test, and run all of those generated inputs through it. You don't define specific individual tests with brittle and limited sample inputs; instead you define generators that should produce any and all valid input. The upside is that you don't have to maintain brittle test cases, you don't have to wonder (as much) whether you are missing some key input, and you can more clearly express why you are testing something, not just how, which is often unclear in brittle one-off tests.

There are a lot of really good resources for generative testing already out there. Since I've been doing Clojure for the last two years, I'll focus on that. Stuart Halloway gives some really good information about generative testing here, and in Clojure the primary library for generative testing these days is test.check; the library itself is a good introduction to the concepts. Another really good library you'll want for generative testing is test.chuck, which adds lots of useful missing generators. I've recently started work on my own library of generators called gin, but it isn't very far along yet.
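As a minimal sketch of what this looks like with test.check (assuming the library is on your classpath), here's a property stating that reversing a vector twice gives back the original, checked against randomly generated vectors rather than hand-picked examples:

```clojure
(require '[clojure.test.check :as tc]
         '[clojure.test.check.generators :as gen]
         '[clojure.test.check.properties :as prop])

;; A property: for any vector of integers, reversing twice is a no-op.
(def reverse-twice-is-identity
  (prop/for-all [v (gen/vector gen/int)]
    (= v (reverse (reverse v)))))

;; Run the property against 100 randomly generated vectors.
(tc/quick-check 100 reverse-twice-is-identity)
```

The generator replaces the sample inputs you would otherwise write by hand; when a property fails, test.check also shrinks the failing input down to a minimal counterexample for you.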

Load Testing and Benchmarking

Load testing is an unfortunate reality of software development. Sometimes a program only needs to be correct. Most of the time that's not enough: it also needs to meet some minimum performance requirements, and that's where load testing comes into the picture. Load testing means putting a high demand on the system under test and seeing how it performs. For example, if you are developing a service that provides some data over a JSON API, you might want it to be able to handle dozens or even hundreds of concurrent client connections. How do you figure out if it can handle that many connections at once? The solution, of course, is to actually make that many connections and see if it works. You might go well beyond the expected load requirements to see just how long until the system breaks; that's benchmarking in a nutshell: load test until you know the maximum load. Or, if you are unlucky, you might find out the system can't handle the load it needs to, and then you need to spend a lot of time figuring out how to make it more efficient in some way. One of the best benchmarking libraries for Clojure I've seen so far is Criterium; I would definitely recommend it if you haven't given it a shot yet.
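If you haven't tried Criterium, a minimal benchmark looks something like this (quick-bench is the faster, less rigorous variant of bench; the expression being timed here is just a placeholder):

```clojure
(require '[criterium.core :as crit])

;; Benchmark a simple expression. Criterium handles JIT warm-up,
;; tries to control for GC interference, and reports statistics
;; (mean, standard deviation, outliers) over many timed runs.
(crit/quick-bench (reduce + (range 1000)))
```

That statistical rigor is what makes it far more trustworthy than a naive `(time ...)` around your code.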

So What Is Generative Load Testing?

Quite simply, it's just combining the two, and they are well suited for each other. One of the biggest obstacles to realistic benchmarking is getting lots of realistic input to test with, and generators are perfect for that. Simple load tests will often run over and over on the same input, but that's not a very good load test, for several reasons. How fast or slow a service is often depends on the input, frequently in non-obvious ways, or at least in ways that aren't obvious until after the tests. And there's often caching of some sort enabled on your service, so if you run the same input over and over again it only has to do the real work once, then just send you the same result lots of times; you end up exercising only the network side of your code, not the real backend work.

How Do We Do Generative Load Testing?

So far I've done simpler load testing this way: I generate a lazy sequence of inputs, split it across a number of threads, and have each thread run through its inputs to completion to get basic timing information. In the long run, though, I'd love to see something built on top of Criterium to support this idea, because all of Criterium's extra features would be great to have. I'm hoping to make that a real thing sometime soon. It would also be great to measure other resources under these sorts of generative load tests: memory, CPU, network usage, and the like.
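A rough sketch of that approach, assuming test.check is available. The handle-request function here is a hypothetical stand-in for the system under test (a real version might make an HTTP call), and the input count and thread count are just illustrative:

```clojure
(require '[clojure.test.check.generators :as gen])

(defn handle-request
  "Stand-in for the system under test."
  [s]
  (reduce + 0 (map int s)))

(defn generative-load-test
  "Run n generated inputs through f, split across n-threads threads.
  Returns the total elapsed wall-clock time in milliseconds."
  [f generator n n-threads]
  (let [inputs  (take n (gen/sample-seq generator))       ; lazy seq of inputs
        size    (long (Math/ceil (/ n (double n-threads))))
        chunks  (partition-all size inputs)               ; one chunk per thread
        start   (System/nanoTime)
        workers (mapv (fn [xs] (future (doseq [x xs] (f x)))) chunks)]
    (run! deref workers)                                  ; wait for all threads
    (/ (- (System/nanoTime) start) 1e6)))

;; 10,000 generated alphanumeric strings, split over 8 threads:
(generative-load-test handle-request gen/string-alphanumeric 10000 8)
```

It's crude compared to what Criterium does for single-threaded benchmarks (no warm-up, no statistics), but it shows the shape of the idea: the generator supplies realistic, varied input, and the threads supply the load.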