Hajime, the duck guy

Monday, June 24, 2024, by Hajime Yamasaki Vukelic

Test-driven coding platforms may be a double-edged sword

I'm talking about sites like LeetCode and CodeWars, where you're presented with a problem, and an automated test suite to verify the solution.

This thought came to me after an exchange with an aspiring developer last night. In particular, it was prompted by a comment in the code he showed me:

// added to pass a test

I'm glad he added that comment, because otherwise I would have had a hard time figuring out why the line did what it did, and might never have gained these insights.

Reactive coding

On CodeWars, you sometimes have two sets of tests. One set tests for more or less arbitrarily selected cases that are supposed to cover all possible cases (but sometimes don't), and then there's sometimes a set of randomized samples that does more thorough testing. When encountering failed tests, it's easy to get worked up and do crazy things.

The code this guy lovingly tweaked to pass the first set failed the second suite, because the author of the problem did not foresee some edge cases in the manually written test cases. The developer, in turn, did not realize what was missing in the first suite because he was so focused on passing the tests that he didn't think his solution through. Eventually he got stuck and asked for my help.

This is the trap many developers are likely to fall into when doing these online exercises. He was lucky this time because there were randomized tests. I know, however, that not all of the exercises include those. And even when they do, it's not an actual guarantee there are no missed cases. It just means that it's a bit less likely.
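Here's a rough sketch of how this plays out. Everything in it is made up for illustration: the exercise (return the largest number in a non-empty array), the names largest, fixedCases, passesFixedSuite and findRandomCounterexample, and the test cases themselves. It is not the code from the exchange, and it is not how CodeWars runs its tests.

```javascript
// Hypothetical exercise: return the largest number in a non-empty array.
// A "reactive" solution, tweaked until the hand-written tests went green.
function largest(numbers) {
  if (numbers.length === 1) return numbers[0] // added to pass a test
  let max = 0 // silently assumes at least one non-negative value
  for (const n of numbers) {
    if (n > max) max = n
  }
  return max
}

// Hand-written cases (hypothetical). No array with two or more
// values that are all negative, so the bug stays hidden.
const fixedCases = [
  { input: [1, 2, 3], expected: 3 },
  { input: [7, 0, 5], expected: 7 },
  { input: [-4], expected: -4 },
]

function passesFixedSuite(solution) {
  return fixedCases.every(({ input, expected }) => solution(input) === expected)
}

// Randomized suite: compare against a trusted reference on random inputs
// and return the first input where the two disagree (or null if none).
function findRandomCounterexample(solution, runs = 1000) {
  for (let run = 0; run < runs; run++) {
    const length = 1 + Math.floor(Math.random() * 5)
    const input = Array.from({ length }, () => Math.floor(Math.random() * 201) - 100)
    if (solution(input) !== Math.max(...input)) return input
  }
  return null
}

console.log(passesFixedSuite(largest))         // true
console.log(findRandomCounterexample(largest)) // some all-negative array
```

The special case for single-element arrays makes the fixed suite pass without fixing the real problem, which is the starting value of max. The randomized run is only likely to catch it, not guaranteed to.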

It's up to the exercise author to provide appropriate coverage. When the problem doesn't have a complete test suite, it's quite probable that this kind of reactive coding will lead to a suboptimal solution that is potentially broken in some edge cases not covered by tests. Once you learn this approach is "valid" (because it passes the tests), you'll need to work twice as hard later to undo the damage.

If you're a newbie, and you're doing these types of exercises, my opinion is that you should forget the test suites. Just think about what you're doing, and don't use the tests to guide your thought process.

Now these tests will run eventually, and the solution may not pass them. It's not like you can literally avoid them. If this happens, first try to snap out of the urge to fix things. Go out for a walk, turn around and count to 30, whatever. Then return to it with a cool head. Don't rush in to tweak the code. Find a proper solution.

Side notes

After this experience, I made a few more observations that you might find interesting.

On code comments

There are people who hate on those who add comments. Let me show you again the comment that prompted this post:

// added to pass a test

Can you really write the code to express this concept? Are you going to define a function that is named passTheTest() and put that line in it?

The comment did reveal an important fact that clarified why the line looked the way it did (out of place). Without that comment, I wouldn't know whether it's weird because it's wrong, or it's weird because I'm overlooking something.

Now some may argue that this code should not have existed to begin with because it's a hack, etc. That's actually beside the point. This comment shows that there are concepts that cannot be expressed, or are just too ridiculous to express, as code. Comments are valuable when they provide additional context that is far cheaper to express in natural language than in code.

On TDD

One of the solutions to a problem included something like this (paraphrased):

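// Counts how many times solution() has been called so far
let i = 0
function solution() {
  // Expected results in test order; the rest of the list is elided
  return [2, -1, 4, 4, 6, 23, /* .... */][i++]
}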

If you're not sure what this means, it means that the test-suite has a finite number of tests and each test expects a certain result. The solution() function simply returns the expected values one by one.
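To make the mechanics concrete, here's a minimal sketch of a test runner that such a solution would satisfy. The runner, the suite, the placeholder inputs and the names makeLookupSolution and runSuite are all my assumptions; I only reuse the first three expected values from the snippet above, and I have no idea what the real inputs were.

```javascript
// Hypothetical fixed suite. The inputs are placeholders, because the
// lookup-table solution never reads its argument anyway.
const suite = [
  { input: 'first', expected: 2 },
  { input: 'second', expected: -1 },
  { input: 'third', expected: 4 },
]

// Builds a fresh lookup-table solution with its own call counter
function makeLookupSolution() {
  let i = 0
  return function solution(_input) {
    return [2, -1, 4][i++]
  }
}

// Runs the cases in the order given and reports whether all of them pass
function runSuite(solution, cases) {
  return cases.every(({ input, expected }) => solution(input) === expected)
}

const inOrder = runSuite(makeLookupSolution(), suite)                 // true
const reversed = runSuite(makeLookupSolution(), [...suite].reverse()) // false

console.log(inOrder, reversed)
```

The trick only works as long as the suite is finite and always runs in the same order. Shuffle the cases, or run the same instance twice, and it falls apart.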

The comment below the solution says:

TDD in a nutshell

To me this comment was quite amusing, because it's also quite true. Now, quite obviously, it doesn't mean that people who do TDD literally write functions like these. After all, they work on very serious projects where such solutions wouldn't work.

Consider this, though. The solution in question passes the entire test suite and it's one of the 'accepted' solutions. This is where I'd like you to focus.

The aforementioned code is a perfect example of the worst code you could possibly come up with — granted, with some creativity — for the given test suite. It takes one of the TDD rules — it's ok to hard-code values in the first pass — to the extreme.

There were other solutions that actually competently solved the given problem, but if we only consider the test suite as the ultimate judge, the parody solution is still completely valid, and other solutions are actually implementing things that aren't covered in the test suite. This is against the basic principles of TDD where the idea is to write just enough code to pass the tests — satisfy the requirement covered by the test, in other words.

Now, I know the next step in TDD is to refactor, where you're supposed to get rid of hard-coded values. But if that's where you add code that solves for cases that aren't covered, you're not doing TDD. Removing hard-coded values in a way that covers more cases than the tests call for is cheating, because the tests are no longer driving development. If you're really driving development using tests, you need to add more cases to force the code to evolve.
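As a sketch of what "adding more cases" looks like, take a made-up requirement: double(n) returns n times 2. The requirement, the cases and the names passes, doubleV1 and doubleV2 are hypothetical and only there to show the shape of the process.

```javascript
// Checks a candidate function against a list of cases
const passes = (fn, suite) =>
  suite.every(({ input, expected }) => fn(input) === expected)

// Step 1: a single test, and the least code that satisfies it
const cases = [{ input: 1, expected: 2 }]
const doubleV1 = () => 2 // hard-coded, as the first pass allows

// Step 2: a new case is what makes the hard-coded value fail...
const moreCases = [...cases, { input: 3, expected: 6 }]

// ...and only then does the code get to evolve
const doubleV2 = (n) => n * 2

console.log(passes(doubleV1, cases))     // true
console.log(passes(doubleV1, moreCases)) // false
console.log(passes(doubleV2, moreCases)) // true
```

Going from doubleV1 straight to doubleV2 without the second case would work just as well, but at that point it's your head driving the design, not the tests.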

To put it another way, considering the provided incomplete test suite, all solutions that cover more cases than the parody code (regardless of how they are implemented) are over-designed. Again, keep in mind that this is if we only look at the provided test suite. To put it yet another way, within the constraints of TDD, the only way to prevent this kind of cheating is to have a more robust test suite.

Which leads us to one of the most common criticisms of TDD: if you're able to account for all requirements and imagine a complete test suite that prevents incomplete solutions, you are likely well-placed to create a complete solution to the problem straight away. Otherwise, the solution would likely be incomplete. At this point, it shouldn't matter whether you write the tests first or later.

And another one: multiple requirements can impose constraints on the design simultaneously. Optimizing for one requirement first can yield a design that must later be fixed to cover a different requirement, and this backtracking takes time. Unless, of course, you cheat by over-designing in the refactoring step. If you are aware of all the constraints ahead of time, then it's more efficient to consider these overlaps first.

Posted in Opinion