Looks like we were also computing our test cases in a slightly sketchy
way, and just testing that we failed in exactly the same way. We do, but
now we generate better test data.
In general, it's a straight translation of the Google code, which in
turn is "mostly taken from the ref10 version of Ed25519 in SUPERCOP
10241124.", except that it's been hand translated to rust with some
test case generators. Future versions should clean this up to be more
normally rust-y.