CODEGATE 2026 Quals - oldschool
- Category: Reverse / AEG
- Challenge: oldschool
- Description: Back to the past
- Solver: solver.py
- Client helper: drive_client.py
TL;DR
The provided Go client is only a courier. The real challenge is the ELF it downloads every round. Each round binary checks sha256(input[:4]), uses those same four bytes to decrypt a 7-instruction VM program, runs the remaining 60 bytes through that VM, applies one more generated bytewise transform, and compares the result against a target buffer in .rodata. I solved it by separating the stable part from the unstable part: recover the 4-byte prefix statically, invert the VM cleanly, and let one or two GDB probes reveal the final generated stage instead of trying to re-lift that tail by hand every round.
Overview
The handout looked almost intentionally unhelpful. There was no obvious challenge binary: the archive held only a Go client, plus a second copy of that same file. That immediately told me where to start: before doing any reversing on the per-round binaries, I needed to understand how the client asked for them and how it submitted answers.
That was a good first move because the client was not hiding anything clever. The useful functions were easy to find, the protobuf message types were obvious, and the framing was simple: 1-byte type + 4-byte big-endian length + protobuf payload. Once I traced RequestChallenge and SubmitAnswer, the server interaction became mechanical. The only things I really needed from the client were the dropped ELF path and the final success/failure messages.
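As a rough illustration of that framing, here is a minimal Python sketch of packing and unpacking one frame. The function names and the message type values are hypothetical; only the layout (1-byte type, 4-byte big-endian length, then the payload) comes from the description above, and the payload is treated as opaque bytes rather than decoded protobuf.

```python
import struct

# 1-byte message type followed by a 4-byte big-endian payload length.
HEADER = struct.Struct(">BI")


def pack_frame(msg_type: int, payload: bytes) -> bytes:
    """Prefix an already-serialized payload with the 5-byte header."""
    return HEADER.pack(msg_type, len(payload)) + payload


def unpack_frame(buf: bytes):
    """Split one frame off the front of buf.

    Returns (msg_type, payload, remaining_bytes).
    """
    if len(buf) < HEADER.size:
        raise ValueError("buffer shorter than the 5-byte header")
    msg_type, length = HEADER.unpack_from(buf)
    end = HEADER.size + length
    if len(buf) < end:
        raise ValueError("buffer shorter than the declared payload length")
    return msg_type, buf[HEADER.size:end], buf[end:]
```

In the actual solve this layer was never reimplemented; the sketch only documents what the official client puts on the wire.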
Running the official client against the service made the actual challenge finally appear: prob1.bin, prob2.bin, and so on. At that point the job stopped being “reverse a Go client” and became “reverse twenty related ELFs quickly enough that the transport never becomes the hard part.”
Analysis
The first useful binary made the overall shape obvious. It reads exactly 64 bytes, hashes part of the input, transforms the rest through a tiny custom VM, runs one more stage, then compares against a 60-byte target in .rodata.
The first important correction was noticing that the SHA-256 check only covers the first 4 input bytes. That same 4-byte prefix is also reused to decrypt the embedded program seed into the real 7-instruction VM program. That split is what made the whole challenge manageable:
- the prefix decides the VM program
- the suffix is the data the program transforms
Once I saw that, brute-forcing the prefix stopped sounding ridiculous. I was not brute-forcing a whole answer. I was only testing four independent bytes against very strong structural filters. For each byte position, I kept only values that decrypted instructions with sane properties: valid opcode, non-zero repeat count, valid next-PC, and valid table selector. That collapses the search space very quickly. The embedded SHA-256 digest then removes the last ambiguity.
That is why I chose a structural search for the prefix instead of trying to symbolically solve the whole binary end to end. The binary itself was already telling me how to prune the search, and the prefix space was tiny compared to the full 64-byte input.
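The shape of that search can be sketched in Python as below. This is not the code from solver.py: `looks_sane` is a hypothetical callback standing in for "decrypt the instructions affected by this prefix byte and check opcode, repeat count, next-PC, and table selector", because the decryption details differ per binary and are not reproduced here. Only the two-stage structure (per-position filtering, then a SHA-256 check on the surviving combinations) follows the text.

```python
import hashlib
from itertools import product
from typing import Callable, List, Optional


def candidates_per_position(looks_sane: Callable[[int, int], bool]) -> List[List[int]]:
    """Keep, for each of the 4 prefix positions, the byte values that pass
    the structural filter. looks_sane(pos, value) is a hypothetical stand-in
    for the real decrypt-and-validate step."""
    return [[v for v in range(256) if looks_sane(pos, v)] for pos in range(4)]


def recover_prefix(cands: List[List[int]], digest: bytes) -> Optional[bytes]:
    """Try every surviving combination against the embedded SHA-256 digest."""
    for combo in product(*cands):
        prefix = bytes(combo)
        if hashlib.sha256(prefix).digest() == digest:
            return prefix
    return None
```

Even if the structural filter left a few dozen values per position, the product stays far smaller than the unfiltered 2^32 prefix space, so the final hash pass is cheap.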
Once the prefix was fixed, the VM was much less intimidating than it first looked. The instruction families are all table-based and all invertible once the right tables are loaded from .rodata: a bytewise transform table, a 60-byte permutation, a 256-byte substitution table, and a slightly stateful opcode whose behavior still depends only on fixed data plus the recovered prefix.
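To illustrate why table-based stages are easy to invert, here is a small Python sketch for the two simplest cases, a 256-byte substitution table and a 60-byte permutation. The helper names are mine, and the permutation direction (`out[i] = data[perm[i]]`) is an assumption; the real binaries may index the other way, in which case the forward and inverse roles simply swap.

```python
from typing import List, Sequence


def invert_sbox(sbox: bytes) -> bytes:
    """Return the inverse of a 256-entry substitution table."""
    if sorted(sbox) != list(range(256)):
        raise ValueError("substitution table is not a bijection on bytes")
    inv = bytearray(256)
    for i, v in enumerate(sbox):
        inv[v] = i
    return bytes(inv)


def apply_sbox(data: bytes, sbox: bytes) -> bytes:
    """Substitute every byte through the table."""
    return bytes(sbox[b] for b in data)


def invert_perm(perm: Sequence[int]) -> List[int]:
    """Return the permutation that undoes apply_perm(data, perm)."""
    inv = [0] * len(perm)
    for i, p in enumerate(perm):
        inv[p] = i
    return inv


def apply_perm(data: bytes, perm: Sequence[int]) -> bytes:
    """Assumed direction: output position i takes input byte perm[i]."""
    return bytes(data[p] for p in perm)
```

Running the inverse tables in reverse instruction order is what turns a forward emulator into a backward one.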
My first real mistake came from overfitting to one sample. I initially treated some of the tables as if their absolute addresses mattered. That happened to work on the first binary I was staring at, then failed immediately on the next one. The real invariant was not address identity. It was section-relative layout inside .rodata. Switching to pyelftools and reading everything by .rodata offset fixed that class of mistakes for good.
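A minimal sketch of the section-relative approach with pyelftools follows. The helper names are hypothetical and no real table offsets are shown, since those come from the per-round template and are not listed in this writeup. The pyelftools import sits inside the loader so the pure helpers stay usable without the library installed.

```python
def load_rodata(path: str):
    """Return (section_base_address, section_bytes) for .rodata."""
    # Third-party dependency: pip install pyelftools
    from elftools.elf.elffile import ELFFile

    with open(path, "rb") as f:
        section = ELFFile(f).get_section_by_name(".rodata")
        if section is None:
            raise ValueError("no .rodata section found")
        # data() must be called while the file is still open.
        return section["sh_addr"], section.data()


def vaddr_to_offset(rodata_base: int, vaddr: int) -> int:
    """Convert an absolute address seen in a disassembler into a
    section-relative offset, which is the stable quantity across rounds."""
    return vaddr - rodata_base


def read_table(rodata: bytes, offset: int, size: int) -> bytes:
    """Slice a fixed-size table out of .rodata, failing loudly on overrun."""
    chunk = rodata[offset:offset + size]
    if offset < 0 or len(chunk) != size:
        raise ValueError("table does not fit inside .rodata")
    return chunk
```

A call such as `read_table(rodata, some_offset, 60)` would then fetch the permutation table, with `some_offset` measured once against the template rather than copied from one sample's addresses.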
The second wrong turn was more subtle. After reversing one round, I thought the final post-VM stage was simple enough to just lift directly and reuse. That also broke on fresh rounds. The binaries all come from the same template, but the last transform is generated differently enough that hard-coding it is exactly the kind of shortcut this challenge punishes.
That was the point where the solve became much cleaner. Instead of insisting on a full static lift, I asked what I actually needed. By that point I already trusted my model of the prefix recovery and the VM itself. The only unknown was the last 60-byte transform right before the final memcmp(). So I stopped reversing there and used the binary itself as the oracle for the unstable part.
The GDB probe is small but decisive. I run the binary under gdb --batch, break at the final memcmp() when rdx == 0x3c, and dump the 60-byte buffer passed in $rdi. For one or two chosen payloads, that gives me clean (pre_final, post_final) pairs. Once I have those pairs, the inference problem is tiny compared to the original reversing problem. For each modulo-4 byte class, I search over a narrow family of transforms: xor, add, or sub, optionally combined with a rotate under a small modular condition. That search is cheap, and it matched every round cleanly.
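The inference step could look roughly like the Python sketch below. It is a simplified model, not the solver's exact search: it assumes the rotate (if any) is applied before the xor/add/sub, tries every rotate amount from 0 to 7, and does not model the modular condition mentioned above. All names are hypothetical.

```python
OPS = {
    "xor": lambda x, k: x ^ k,
    "add": lambda x, k: (x + k) & 0xFF,
    "sub": lambda x, k: (x - k) & 0xFF,
}
INVERSE = {"xor": "xor", "add": "sub", "sub": "add"}


def rotl8(x: int, r: int) -> int:
    """Rotate an 8-bit value left by r bits."""
    r %= 8
    return ((x << r) | (x >> (8 - r))) & 0xFF


def infer_final(pairs):
    """pairs: list of (pre_final, post_final) buffers of equal length.
    Returns one (op, key, rotate) triple per index class i % 4."""
    model = []
    for cls in range(4):
        samples = [(pre[i], post[i])
                   for pre, post in pairs
                   for i in range(cls, len(pre), 4)]
        fit = next(((op, k, r)
                    for op in OPS for r in range(8) for k in range(256)
                    if all(OPS[op](rotl8(a, r), k) == b for a, b in samples)),
                   None)
        if fit is None:
            raise ValueError(f"no transform in the family fits class {cls}")
        model.append(fit)
    return model


def apply_final(model, data: bytes) -> bytes:
    """Forward direction: rotate, then apply the keyed op."""
    return bytes(OPS[op](rotl8(b, r), k)
                 for b, (op, k, r) in ((b, model[i % 4]) for i, b in enumerate(data)))


def invert_final(model, data: bytes) -> bytes:
    """Undo the keyed op, then rotate right by r (left by 8 - r)."""
    return bytes(rotl8(OPS[INVERSE[op]](b, k), 8 - r)
                 for b, (op, k, r) in ((b, model[i % 4]) for i, b in enumerate(data)))
```

With 15 samples per class from a single 60-byte pair, most wrong candidates are eliminated immediately; a second chosen payload helps when the first one happens to leave more than one candidate consistent.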
That is also why I kept a local validator in the solver. This challenge is the kind where a nearly-correct model looks convincing right up until the server rejects it. I wanted the script to prove the recovered answer against the binary locally before it ever went back to the official client.
Exploit
The final workflow was:
- Request the next round with the official client and wait for the dropped ELF.
- Parse .rodata with pyelftools instead of relying on absolute addresses.
- Recover the 4-byte prefix by decrypting candidate VM programs and filtering on instruction structure.
- Keep the candidate whose prefix digest matches the embedded SHA-256 value.
- Emulate the 7-instruction VM forward and backward.
- Run one or two chosen payloads under GDB and dump the final compare buffer at memcmp().
- Infer the generated final-stage transform from those buffer pairs.
- Invert the whole pipeline, validate the recovered answer locally, then feed it back to the official client.
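The GDB step in that workflow could be driven from Python along these lines. This is a sketch under several assumptions: the binary reads its 64 bytes from stdin, the `memcmp` symbol is resolvable (a pending breakpoint is enabled in case it lives in a shared library), and paths contain no spaces. The function names are hypothetical; the GDB commands themselves (`break ... if`, `run <`, `x/60xb`) are standard.

```python
import re
import subprocess


def gdb_command(binary: str, payload_path: str, nbytes: int = 60):
    """Build a batch-mode GDB invocation that stops at the final memcmp()
    and dumps the buffer passed in $rdi."""
    return [
        "gdb", "--batch",
        "-ex", "set breakpoint pending on",
        "-ex", f"break memcmp if $rdx == {nbytes:#x}",
        "-ex", f"run < {payload_path}",   # assumes the input arrives on stdin
        "-ex", f"x/{nbytes}xb $rdi",
        binary,
    ]


def parse_x_dump(text: str) -> bytes:
    """Collect the byte values from x/Nxb output lines ('0xADDR...: 0x.. 0x..')."""
    out = bytearray()
    for line in text.splitlines():
        addr, sep, rest = line.partition(":")
        if not sep or not addr.strip().startswith("0x"):
            continue  # skip breakpoint banners and program output
        out += bytes(int(tok, 16) for tok in re.findall(r"0x([0-9a-fA-F]{2})\b", rest))
    return bytes(out)


def probe(binary: str, payload_path: str, nbytes: int = 60) -> bytes:
    """Run the probe once and return the dumped compare buffer."""
    result = subprocess.run(gdb_command(binary, payload_path, nbytes),
                            capture_output=True, text=True, timeout=30)
    dumped = parse_x_dump(result.stdout)
    if len(dumped) != nbytes:
        raise RuntimeError(f"expected {nbytes} bytes from GDB, got {len(dumped)}")
    return dumped
```

Keeping the length check in `probe` matters in practice: if the breakpoint never fires, the failure should be loud rather than a short buffer flowing into the inference step.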
I deliberately kept the network layer boring after that. drive_client.py does not try to replace the official client or speak protobuf directly. It just watches the existing client output for binary_path=..., solves the dropped ELF, and writes the answer back. That was a pragmatic choice. Once the protobuf path was already working, there was no reason to introduce a second custom transport layer and risk a self-inflicted bug.
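The watching logic can be as small as one regular expression. The sketch below is illustrative only: the exact log line format of the official client is not reproduced here, so the pattern simply assumes the path follows `binary_path=` and runs to the next whitespace.

```python
import re
from typing import Optional

# Assumption: the path contains no whitespace and follows "binary_path=".
BINARY_PATH_RE = re.compile(r"binary_path=(\S+)")


def extract_binary_path(line: str) -> Optional[str]:
    """Return the dropped ELF path from one line of client output, if any."""
    match = BINARY_PATH_RE.search(line)
    return match.group(1) if match else None
```

A driver would read the client's stdout line by line, call this on each line, solve the ELF when a path appears, and write the answer to the client's stdin.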
Verification
I reran the full service flow on March 29, 2026 with drive_client.py and let it solve all 20 rounds again. The method did not need any structural change; only the final flag changed, which is exactly what I want to see from an AEG-style solve.
The fresh rerun ended with:
| |
Final flag:
| |