Readme checkpoint

This commit is contained in:
Satsuoni 2021-07-29 17:59:53 +09:00
parent 8c931c4b83
commit 11aa79d55c
2 changed files with 20 additions and 1 deletions

@ -129,7 +129,21 @@ while(true)
...
```
After several days of investigation, it became obvious that this is a form of code obfuscation that breaks the code flow into small segments and arranges them in a switch statement in an order defined by a primitive PRNG. The PRNG can be controlled to execute if/else statements and loops. The *halt_baddata* portion causes an access violation crash when reached. Any jump table index outside the bounds leads to *while(true)* executing indefinitely. Since the switch is driven by the PRNG, the decompiler cannot seem to find the limits of the jump tables, resulting in invalid switch statements or mangled decompilation. I tried to ameliorate that by [fixing jump tables](ghidra_scripts/FixJumptable.py), but the results were not encouraging. I then tried to [follow the instruction flow](ghidra_scripts/Deobfuscatorr.py) by using the Ghidra Emulator API. After a lot of experimentation, I drew the following conclusions:
- Many of the switch cases are near-duplicates, and some are either never reached or reached only when a check fails, crashing the program or sending it into an infinite loop.
- The anti-debugging code is hidden within the switch statements.
- Most of the anti-debugging code seems to be similar to what is described [here](https://anti-debug.checkpoint.com/techniques/misc.html). The list of debugger window names is exactly the same, which is amusing (and outdated).
- Some functions actually use memory checksums **as PRNG seeds**, which makes it impossible to guess where the flow goes next without knowing the checksum, how many iterations it took to calculate it, the results of the various checks in the middle, and so on.
- None of the anti-debugger tricks are triggered by emulation, but emulation is hundreds, if not thousands, of times slower than direct CPU execution, so a checksum calculation can take several hours (depending on log verbosity).
- Emulating just one function does not help much, since the flow might depend on input parameters :( .
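
The dispatch pattern described above can be modelled in a few lines of Python. This is only a toy sketch: the LCG constants, the table size, the seed and the four block names are made up for illustration and are not taken from the binary. It shows how a loop survives flattening: the normal path steps the PRNG, while a branch or a loop back edge overwrites the PRNG state.

```python
# Toy model of PRNG-driven control-flow flattening.
# All constants and block names are hypothetical, chosen for illustration.

TABLE_SIZE = 64  # assumed jump table size


def lcg(state):
    """One step of a small linear congruential generator (made-up constants)."""
    return (state * 75 + 74) % 65537


def flattened_sum(n, seed=1):
    """Compute 0 + 1 + ... + (n - 1) with the loop split into switch cases."""
    # The "obfuscator" side: each block gets the case index that the PRNG
    # produces at the moment the block is supposed to run.
    s_init = seed
    s_check = lcg(s_init)
    s_body = lcg(s_check)
    s_exit = lcg(s_body)
    case_of = {
        s_init % TABLE_SIZE: "init",
        s_check % TABLE_SIZE: "check",
        s_body % TABLE_SIZE: "body",
        s_exit % TABLE_SIZE: "exit",
    }
    assert len(case_of) == 4, "seed gives colliding case indices, pick another"

    state, total, i = seed, 0, 0
    trace = []
    while True:
        case = case_of.get(state % TABLE_SIZE)
        trace.append(case)
        if case == "init":
            total, i = 0, 0
            state = lcg(state)            # fall through to "check"
        elif case == "check":
            # Branch: step the PRNG to enter the body, or reseed to leave.
            state = lcg(state) if i < n else s_exit
        elif case == "body":
            total += i
            i += 1
            state = s_check               # loop back edge: reseed
        elif case == "exit":
            return total, trace
        else:
            # The original binary spins forever here; the toy just stops.
            raise RuntimeError("index outside the jump table")


total, trace = flattened_sum(3)
print(total, trace)
```

Reading the code statically, nothing ties the `body` case to the `check` case except the PRNG arithmetic, which is roughly why a decompiler has trouble recovering the switch bounds and the original flow.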
After that, I tried to reverse the Protobuf encoding/decoding functions found in the code. While I did manage to find some of them (using *getchar* as a convenient breakpoint to attach a debugger), they did not match the Protobuf functions in the original repository, leading me to believe that the source file was changed. For example, SignedMessage now has more than 9 fields, rather than the original 5. Luckily, the protocol seems backward compatible enough, so the necessary signatures/keys can still be extracted. To parse protobuf messages, I used either the original extension or this [convenient website](https://protogen.marcgravell.com/decode).
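
Counting the fields of a captured message does not require the schema, because the Protobuf wire format stores the field number and wire type in front of every value. The following is a minimal sketch of such a scan; the `sample` bytes are invented for the example and are not a real SignedMessage, and the deprecated group wire types (3 and 4) are not handled.

```python
def read_varint(buf, pos):
    """Decode a base-128 varint starting at pos; return (value, new_pos)."""
    result = 0
    shift = 0
    while True:
        byte = buf[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return result, pos
        shift += 7


def scan_fields(buf):
    """Return [(field_number, wire_type, raw_value), ...] without a schema."""
    fields = []
    pos = 0
    while pos < len(buf):
        key, pos = read_varint(buf, pos)
        field_number, wire_type = key >> 3, key & 7
        if wire_type == 0:        # varint
            value, pos = read_varint(buf, pos)
        elif wire_type == 1:      # fixed 64-bit
            value, pos = buf[pos:pos + 8], pos + 8
        elif wire_type == 2:      # length-delimited (bytes, string, submessage)
            length, pos = read_varint(buf, pos)
            value, pos = buf[pos:pos + length], pos + length
        elif wire_type == 5:      # fixed 32-bit
            value, pos = buf[pos:pos + 4], pos + 4
        else:
            raise ValueError("unsupported wire type %d" % wire_type)
        fields.append((field_number, wire_type, value))
    return fields


# Made-up message: field 1 = 150, field 2 = b"abc", field 10 = 1.
sample = bytes([0x08, 0x96, 0x01, 0x12, 0x03]) + b"abc" + bytes([0x50, 0x01])
for field in scan_fields(sample):
    print(field)
```

A field number that the known `.proto` file does not define (field 10 in the made-up sample) is the kind of hint that the schema was extended.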
In any case, that investigation did not seem to lead anywhere, and in the end (after several weeks and lots of cursing), I decided to emulate the whole program in Ghidra. To that end, I developed a simple [script](ghidra_scripts/Longsusrunner.py) that emulates the system and host calls made by the DLL. The necessary system calls were identified by running the emulation until it reached code it could not execute and replacing that code with a Python function. As an aside, the script executes one instruction at a time, so it is slower than using Ghidra breakpoints, but easier for me to manage. Manipulating it allowed me to dump logs of the program flow and memory contents, as well as save and restore the emulator state. Eventually, I managed to reach the point where the emulator formed a valid signature and called into the fake host code with it. It took several days, though, with the longest part being something that seemed to be the generation of jump tables and calculation tables. After that, it was just a matter of tracing the signature back to the function that generated it.
![Probable signature generation function](images/screen1.png?raw=true)
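
The single-stepping approach with Python stand-ins for system and host calls can be illustrated without Ghidra. The sketch below is a toy machine, not the Ghidra Emulator API and not the linked script: the instruction set, the `FAKE_HOST_CALL` address and the hook are all hypothetical. It only shows the shape of the loop: check for a hook at the current address, otherwise execute one instruction, log it, and keep the state in a form that can be copied and restored.

```python
import copy

FAKE_HOST_CALL = 0x7000  # hypothetical address with no emulatable code behind it


def run(program, hooks, state, max_steps=1000, log=None):
    """Single-step a toy machine; addresses found in hooks are handled in Python."""
    for _ in range(max_steps):
        pc = state["pc"]
        if pc in hooks:
            # Stand-in for a system or host call: run Python, then return.
            hooks[pc](state)
            state["pc"] = state["stack"].pop()
            continue
        if pc not in program:
            # This is where a new hook would have to be written.
            raise RuntimeError("no code or hook at 0x%x" % pc)
        op = program[pc]
        if log is not None:
            log.append((pc, op))
        if op[0] == "addi":          # ("addi", register, immediate)
            state["regs"][op[1]] += op[2]
            state["pc"] = pc + 1
        elif op[0] == "call":        # ("call", target)
            state["stack"].append(pc + 1)
            state["pc"] = op[1]
        elif op[0] == "halt":
            return state
        else:
            raise RuntimeError("unknown instruction %r" % (op,))
    raise RuntimeError("step limit reached")


def fake_host_call(state):
    """Python replacement for the host function: just returns 42 in r1."""
    state["regs"]["r1"] = 42


program = {
    0: ("addi", "r0", 5),
    1: ("call", FAKE_HOST_CALL),
    2: ("addi", "r0", 1),
    3: ("halt",),
}
hooks = {FAKE_HOST_CALL: fake_host_call}

state = {"pc": 0, "regs": {"r0": 0, "r1": 0}, "stack": []}
snapshot = copy.deepcopy(state)      # saved state, can be resumed later
log = []
final = run(program, hooks, state, log=log)
print(final["regs"], len(log))
```

Stepping one instruction at a time costs speed, but it gives a single place to add logging, snapshots and new hooks, which matches the trade-off described above.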
### Extracting part of the exponent
@ -141,4 +155,9 @@ After several days of investigation, it became obvious that that is a form of co
### Conclusion
### Some references
https://datatracker.ietf.org/doc/html/rfc3447#page-36

BIN
images/screen1.png Normal file
