Looks like we were also computing our test cases in a slightly sketchy way, and just testing that we failed in exactly the same way. We still do, but now we generate better test data.