Mea culpa: Lessons learned from firefighting

Background

On the morning of Tuesday, May 23 there was a mood of excitement within the Australian team of The Conversation as donations from generous supporters began to roll in. The annual appeal for financial support from our readers had swung into action.

Little did the development team anticipate how exciting the day would become.

When we received reports that some donors were receiving multiple receipts we initially suspected a bug in the emailing code. Unfortunately it transpired that this was not the case. Nor was it the only bug in our donations code-base that was affecting donors.

Our response

While most of the donations were processed without any problems, before long we received reports that led us to identify three bugs affecting donors:

a very small percentage of donors were being charged multiple times
incompatibilities with Internet Explorer 11 were preventing any donations from people using that browser
some users who entered invalid credit card details were unable to correct their details and donate.

Several members of our development team collaborated as we received reports that all was not well with our Donations platform. Fortunately, early in our troubleshooting we identified that the “multiple-charging” problem was only affecting a very small number of users, less than 20 in all. Consequently we were able to continue to receive donations while we divided the tasks of monitoring errors and fixing bugs amongst our team.

We also liaised with other key staff so that they could contact affected users.

Once we attended to the urgent tasks associated with deploying fixes to the bugs that had been identified, our thoughts turned to reflection. What had led to these problems that affected our generous readers? We resolved to ensure that we would not only learn lessons from the experience but take appropriate steps to fundamentally improve the code-base.

Postmortem

A couple of days after our firefighting effort, we converged as a team to review what had led to this embarrassing scenario. A key aspect of our postmortem was that it was framed as a blameless exercise. Our aim was to review what happened, what should have happened and what we resolved to do about it, both in terms of code quality and process actions.

Our postmortem also resulted in a list of things about the Donations code-base that we were unhappy with. The title of this list may or may not have been a four letter word analogous to excrement but with two fewer syllables.

Code quality actions

A key outcome of our postmortem was the resolution that we needed comprehensive acceptance tests. Given that financial transactions were involved, “happy path” acceptance were insufficient. For example, we needed automated testing of validation errors and correction of errors.

We also painfully acknowledged that our JavaScript had been written in a way that had made unit testing far too difficult. This had to change.

More generally we resolved to refactor based on the “smelly” list referred to above.

Process actions

We resolved to strive harder to unit test more Ruby and JS code. When reviewing, we felt that we should feel confident about pointing out missing specs. We felt it important to discuss with the author why they’re missing. Part of that discussion should cover what the trade-offs are with respect to adding unit tests, compared with accepting the risk of not having those tests.

We noted that it was important:

to ask questions to ensure that both the reviewer and the developer understand the code
to confirm that both the JS and Ruby code is tested
to use judgement to write tests to inform our code design as well as to reject valid scenarios where effort to test outweighs benefit
to not aim for 100% coverage
that the reviewer should describe what they’ve done to review.

Another key insight was the importance of considering a feature freeze. The week before future donation campaigns should be reserved for manual testing in a staging environment.

Resulting improvements

Acceptance tests

The first step we took was to set out a safety net by ensuring that we had satisfactory acceptance spec coverage.

We decided that it was no longer sufficient to cover only the “happy path” cases. This was because most of the new bugs we saw arose from sad path interactions. For instance, there was a bug that only arose after the user had attempted a donation with an incorrect email address. We were mindful that happy and sad paths involved potentially taking money from our supporters. A (much) longer-running test suite was a fair trade-off for the sense of security we’d get from covering all eventualities.

The specs actually interact with the test sandbox servers at our payment providers, which is uncommon. Stubbing all our interactions with our providers would have been quite complicated. Also, it has lead to some false negatives in the past. Again, we decided to accept the extra running time, and the occasional flakiness that comes from relying on external services.

Finally, we incorporated BrowserStack into our test suite. One of the bugs we found arose only in IE11. BrowserStack’s ability to run our tests against IE11 and a host of other browsers automatically will hopefully save us from such problems in future. With our acceptance specs in place, we felt confident to proceed to proceed with the improvements on our list.

Front end

We had tended to return to our Donations code-base once a year around campaign time. The effort required to understand existing features in that limited time window meant that we ended up working around them rather than with them. As a consequence they tended to become more complex over time. They began to stray from their initial intent.

Our code ended up suffering from “palimpsest” development. Recent additions to the team understood their predecessors’ code incompletely, and had to use it in ways the original developers did not necessarily intend. This became telling in our front end code when we ended up calling a complicated method with an innocuous name multiple times during our validation process. It turned out that the method would raise a credit card charge when successful.

We determined that we needed to be able to implement our validation code in a way that was more easy to understand, and was testable. We would like to have been able to understand the existing code, aided by existing tests. Additionally, new tests would help us implement new behaviour.

However, the structure of the existing code was too tangled to work like this. We needed to cut the Gordian Knot of this complexity.

With the acceptance test safety net under us, we proceeded to replace existing components of our UI, piece by piece, with React components. Those of us who were directly involved with this “React-ification” came to love it, as it enabled exactly isolating the behaviour of specific parts of the UI in a way that makes unit testing possible, even easy.

A React component’s “shape” (how it is rendered) at any given moment is always determined by a set of knowable inputs:

state (attributes which it manages);
props (attributes managed by its parents) and
the events triggered on it (e.g. changes in its input boxes, etc).

Thus it is always possible to construct tests in terms of inputs and expected outputs. This makes testing much easier than it usually is in vanilla Javascript, with all its asynchronous behaviour and action at a distance.

Naturally, this isolation also makes it much easier to understand the elements on the page. It makes their borders and interactions very clear.

It also makes it relatively simple to know where to look for the code for a given component. Wherever it may sit overall according to your local conventions, a component’s code can sit in a directory with its specs along with any styles which belong to it specifically. We recommend this approach.

Back end

Our back end code had ended up in a similar state of complexity to our front end code.

Premature generalization

Developers this year were having trouble divining and using the structures laid down by our predecessors. A fair amount of this was down to finding ourselves with what you could call premature, or speculative, generalizations.

There was, for instance, an inheritance hierarchy in our payment processing code. This code was effectively only serving one of our payment providers at the time.

Consequently, the classes at different levels of the hierarchy had logic which specifically catered to one provider only. This would have to be unpicked when we added another provider.

We were attempting to rewrite our code to serve both of our payment providers: Stripe and PayPal. We also needed to support both of our payment options: once-off donations and regular monthly “subscriptions”.

Two providers, two payment options, one base class – but which functionality belongs to the base class? Author provided

For each piece of new functionality, we needed not just to understand it in itself, but to understand which level of the hierarchy implemented it. Also, we had to understand how it differed from the existing functionality. So, rather than conferring a benefit from sharing code, inheritance left us with an extra cognitive burden.

The inheritance hierarchy was getting in the way. In the end, there was not enough shared code between all cases to justify them sharing a common ancestor.

As soon as you find yourself with a hierarchy which shares behaviour conditionally, you have what Sandi Metz would label the wrong abstraction. In such cases, coding for specific classes involves not just figuring out the correct behaviour in the case you were working on. It also necessitates understanding how that differs from the other cases covered by the hierarchy. We haven’t totally fixed this problem. However, we’ve made some progress in refactoring away the hierarchy, backed up, as ever, by our acceptance tests.

Unclear abstractions

Another cause of complexity was in the unclear division between two important classes: Donation and Subscription.

A Donation represented a person making a one off donation.
A Subscription represented a person signing up to give us a monthly donation.

It sounds like a fair enough division, given that some slightly different logic applies in each of the two cases. Our payment providers offer different mechanisms for collecting once-off and regular payments.

But matters were confused here. For each once-off donation, we will send a thank you email and a tax receipt email. For a monthly donation, we will only send a thank you email once, but will send a tax receipt email every month.

To help support this behaviour, our code would create a Donation every time a monthly donation was received, as well as for every once-off donation. Because each class could be our only source of information about a donor, both classes had user contact information.

Instances of each of the two classes would have payment logic and also contact information. Consequently, they may find themselves passed into a lot of the same places as each other, with the receiving code taking responsibility for figuring out whether they were a Subscription or a Donation. This uncertainty was usually reflected in the name of the parameter/variable which received the object, which was the vague and ambiguous word “source”.

We eventually figured out that we could shift the dividing line between the classes a little and remove a lot of the confusion.

We did this by making the classes reflect more closely the natural categories of things as we perceive them. Instead of the major divide being between a once-off and a monthly donation, we drew it between the donation and the person making the donation. The person is responsible for having contact information and knowing when the donations are supposed to happen, and the donation is responsible for tracking the actual transaction.

Naturally, in this scheme, a person has zero to many donations. When we followed through this logic, it became abundantly clear in each case which type of object should be passed to which job. The job for sending a thank you email should always expect a person, given that it happens once when a person signs up to make donations. The job to send tax receipts, on the other hand, should always expect a donation, as it will happen once for each financial transaction which occurs.

By performing this refactoring, we have been able to remove a lot of conditional logic. Also, perhaps as important, we have been able to name the relevant methods and variables unambiguously according to whether they are (or are expecting) a person or a donation. This will almost certainly have made our code a lot clearer and more maintainable for the next developers to work on it (i.e. most likely us in a few months).

Summary

What were the key insights that we gained from this exercise?

To begin with, always consider happy path acceptance tests as a bedrock indication that your code will do as intended. Where your code handles money, especially where it can handle a lot at once, consider expanding your suite of acceptance tests to the sad paths. Consider incorporating your payment provider’s test tools/sandbox environments. Also, try to make automated cross-browser testing part of the mix.

Secondly, try to give yourself a enough of a run-up to to your deadline give yourself room to manoeuvre. No matter how simple you think your application is, or how undramatic you think your change is, allow a buffer. It will take a while for future you to understand past you’s intentions.

Next, take the time to make it easier for future you by paying attention to the code smells of speculative generalisation and classes or variables with vague or ambiguous names. Classes with vague or ambiguous purposes are a likely indicator that they should be divided differently.

Also, use React to structure your front end code. As described above, this approach is already paying dividends for us.

Whilst there was some pain involved in fighting the fires that our sub-optimally maintained Donations platform resulted in, we’re pleased with our response. Our subsequent improvements have left us feeling confident that future donation campaigns will run smoothly.

Building The Conversation