Mostly finished README and extension now may work with Shaka

This commit is contained in:
Satsuoni 2021-08-01 22:16:11 +09:00
parent faf90be4aa
commit e9e22bdb2d
5 changed files with 27295 additions and 20 deletions

128
README.md

@ -5,9 +5,11 @@
- This work is based (obviously) on the [widevine-l3-decryptor extension](https://github.com/cryptonek/widevine-l3-decryptor). Many parts are the same, parts of Readme are a verbatim copy, etc.
- I have no working knowledge of Chrome extension structure.
- Some parts of code are copied from RFC documents, wikipedia, etc. *shrug*
- Tldr: The result seems to work, but relies on code lifting into wasm module and lots of brute-forcing, resulting in about 15-minute wait for a single RSA decryption.
- Tldr: The result seems to work, but relies on code lifting into wasm module and lots of brute-forcing, resulting in about 15-minute wait for a single RSA decryption. **UPDATE** While writing README, I found encoding tables
- I am too lazy to improve on this.
- Bignum arithmetics was taken from [CryptoPP](https://github.com/weidai11/cryptopp) library. I found it the easiest library to work with and easiest to compile into wasm as well.
- Should work with widevine for 64bit Windows of version 4.10.2209. Unlikely to work for any other versions.
## Introduction
@ -18,8 +20,8 @@ But Widevine's least secure security level, L3, as used in most browsers and PCs
This Chrome extension demonstrates how it's possible to bypass Widevine DRM by hijacking calls to the browser's [Encrypted Media Extensions (EME)](https://www.html5rocks.com/en/tutorials/eme/basics) and (very slowly) decrypting all Widevine content keys transferred - effectively turning it into a clearkey DRM.
## Usage
To see this concept in action, just load the extension in Developer Mode and browse to any website that plays Widevine-protected content, such as https://bitmovin.com/demos/drm _[Update: link got broken?]_.
First, extension will try to brute-force input encoding for the code-lifted part, dumping progess to console. Then, assuming it succeeds, keys will be logged in plaintext to the javascript console.
To see this concept in action, just load the extension in Developer Mode and browse to any website that plays Widevine-protected content, such as https://bitmovin.com/demos/drm .
First, extension will try to brute-force input encoding for the code-lifted part. Then, assuming it succeeds, keys will be logged in plaintext to the javascript console. (__Update__: will avoid brute forcing now)
e.g:
@ -42,7 +44,7 @@ ffmpeg -decryption_key 100b6c20940f779a4589152b57d2dacb -i encrypted_media.mp4 -
It is my honest opinion that DRM is a malignant tumor growing upon various forms of media, and that people that either implement or enforce implementation are morally repugnant and do no good to society. With that in mind, I was sad to learn in May 2021 that the original extension would soon be rendered obsolete. I found myself with some free time on my hands, and so I decided to try and replicate original key extraction. Unfortunately, there was not much data pertaining to what process the original's authors used, and even some confusion as to [who was the one who performed extraction](https://github.com/tomer8007/widevine-l3-decryptor/issues/14). Nevertheless, I decided to give it a go, and hopefully boost my flagging self-confidence a little. I did not succeed in either of those tasks, but I managed to write a barely-functioning decryptor, and decided to document the steps I followed, in case they are of use to somebody else.
### Reverse enginering and emulating
### Reverse engineering and emulating
In order to deal with executable, I decided to use [Ghidra](https://github.com/NationalSecurityAgency/ghidra), despite its association with NSA, mostly because it is free and has most features that I wanted. I also wrote a simple snippet to be able to debug dll.
@ -132,7 +134,7 @@ while(true)
After several days of investigation, it became obvious that that is a form of code obfuscation, breaking down code flow into small segments and arranging them in switch statement in order defined by a primitive PRNG. PRNG can be controlled to execute if/else statements and loops. The *halt_baddata* portion causes access violation crash when reached. Any jump table index outside bounds leads to *while(true)* executing indefinitely. Since switch is driven by PRNG, decompiler cannot seem to find limits of jump tables, resulting in invalid switch statements or mangled decompilation. I tried to ameliorate that by [fixing jump tables](ghidra_scripts/FixJumptable.py), but results were not encouraging. I then tried to [follow the instruction flow](ghidra_scripts/Deobfuscatorr.py) by using Ghidra Emulator API. After a lot of experimentation, I drew the following conclusions:
- Many of the switch cases are almost-duplicates, and some are either never reached or only reached in case of failed check, craching program or sending it into infinite loop.
- Many of the switch cases are almost-duplicates, and some are either never reached or only reached in case of failed check, crashing program or sending it into infinite loop.
- The anti-debugging code is hidden within the switch statements.
- Most of anti-debugging code seems to be similar to what is described [here](https://anti-debug.checkpoint.com/techniques/misc.html). The list of the debugger windows names is exactly the same, which is amusing (and outdated).
- Some functions actually use memory checksums **as PRNG seeds** which makes guessing where it would go after impossible without knowing the checksum. And how many iterations it took to calculate it. And results of various checks in the middle. Etc...
@ -188,11 +190,11 @@ After staring at the wall of poorly decompiled code for a while, I realized that
}
```
Same function was used most of the time, but with different offsets and initial arrays, resulting in a variety of permutations. Regardless, I was able to roughly identify montgomery multiplications, subtractions and additions performed on 256-byte arrays (implying the use of 2048 bit keys). One of the most important factors was the use of "ADC" assemler command, mostly restricted to two areas of the code, which I tentatively identified as "signature generation" and "session key decryption". I concentrated on the former, since I could access and verify the output. Which did however raise the question about what kind of input the function took. More about that later.
Same function was used most of the time, but with different offsets and initial arrays, resulting in a variety of permutations. Regardless, I was able to roughly identify montgomery multiplications, subtractions and additions performed on 256-byte arrays (implying the use of 2048 bit keys). One of the most important factors was the use of "ADC" assembler command, mostly restricted to two areas of the code, which I tentatively identified as "signature generation" and "session key decryption". I concentrated on the former, since I could access and verify the output. Which did however raise the question about what kind of input the function took. More about that later.
Of course, sick, sadisitic minds behind the obfuscation did not use a straightforward exponentiation algorithms. As described in Google patent [US20160328543A1](https://patentimages.storage.googleapis.com/0c/f3/c0/08ce4394385810/US20160328543A1.pdf), they multiply input by constant and output by reversing constant, use permutation function to confuse memory layouts, and seem to use "split variables" at times, though not often in this case. In any case, resulting exponentiation function also has some additions which cancel each other in the end.
Of course, sick, sadistic minds behind the obfuscation did not use a straightforward exponentiation algorithm. As described in Google patent [US20160328543A1](https://patentimages.storage.googleapis.com/0c/f3/c0/08ce4394385810/US20160328543A1.pdf), they multiply input by constant and output by reversing constant, use permutation function to confuse memory layouts, and seem to use "split variables" at times, though not often in this case. In any case, resulting exponentiation function also has some additions which cancel each other in the end.
In order to extract the exponent from the code, I first logged most of the inputs and outputs of the functions that seemed to operate on bignum, unscrambling the permutation using the already generated tables in memory. Then, I used [python script](log_parsing/prfnd.py) to guess the operations performed on the numbers, and a [separate script](log_parsing/prfold.py) to map those operations into a tree. The second script went through several iterations as I tried various things, including adding [dual number](https://en.wikipedia.org/wiki/Dual_number) support in order to extract exponent from the result's derivative. Ultimately, I settled on simple single-variable tracing. Finding a route that did not lead to exponential explosion in number of polynomial powers was somewhat of a challenge, but eventually (after,once again, a week or two of work :| ) I succeded in extracting an exponent and multiplicative constant:
In order to extract the exponent from the code, I first logged most of the inputs and outputs of the functions that seemed to operate on bignum, unscrambling the permutation using the already generated tables in memory. Then, I used [python script](log_parsing/prfnd.py) to guess the operations performed on the numbers, and a [separate script](log_parsing/prfold.py) to map those operations into a tree. The second script went through several iterations as I tried various things, including adding [dual number](https://en.wikipedia.org/wiki/Dual_number) support in order to extract exponent from the result's derivative. Ultimately, I settled on simple single-variable tracing. Finding a route that did not lead to exponential explosion in number of polynomial powers was somewhat of a challenge, but eventually (after,once again, a week or two of work :| ) I succeeded in extracting an exponent and multiplicative constant:
```
Integer sec_pwr("3551441281151793803468150009579234152846302559876786023474384116741665435201433229827460838178073195265052758445179713180253281489421418956836612831335004147646176505141530943528324883137600012405638129713792381026483382453797745572848181778529230302846010343984669300346417859153737685586930651901846172995428426406351792520589303063891856952065344275226656441810617840991639119780740882869045853963976689135919705280633103928107644634342854948241774287922955722249610590503749361988998785615792967601265193388327434138142757694869631012949844904296215183578755943228774880949418421910124192610317509870035781434478472005580772585827964297458804686746351314049144529869398920254976283223212237757896308731212074522690246629595868795188862406555084509923061745551806883194011498591777868205642389994190989922575357099560320535514451309411366278194983648543644619059240366558012360910031565467287852389667920753931835645260421");
@ -202,7 +204,7 @@ Integer sec_mul("0x15ba06219067ebfbe9ed0b5f446f1dca81c3276915b6cd27621bfefe5cf28
An example of log and script output can be found in [log_parsing](/log_parsing) folder.
One can easily see that the exponent is 3072 bits in length, which is a lot longer than expected (2048). Obvisouly, since exponent is periodic, it can be extended to any length. It can be also confirmed that this is not a complete exponent, since the first bignum-like structure in the function does not match the encryption input. (Decryption of the RSA is easily done using public exponent, 65537). There is also no linear. or quadratic, or... (I checked polynomials to about 128th power) dependency. Which leads me to the following stage.
One can easily see that the exponent is 3072 bits in length, which is a lot longer than expected (2048). Obviously, since exponent is periodic, it can be extended to any length. It can be also confirmed that this is not a complete exponent, since the first bignum-like structure in the function does not match the encryption input. (Decryption of the RSA is easily done using public exponent, 65537). There is also no linear, or quadratic, or... (I checked polynomials to about 128th power) dependency. Which leads me to the following stage.
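The remark about the public exponent can be illustrated with a toy key. This is only a sketch: the primes, the message and the helper names `sign_raw` and `recover_input` are made up for illustration, and the real key is 2048 bits. It shows why the value fed into the private-key operation can always be checked from its output with 65537.

```python
import math

# Toy RSA key built from two known Mersenne primes (assumption: illustration only).
p = 2**31 - 1
q = 2**61 - 1
n = p * q
e = 65537
phi = (p - 1) * (q - 1)
assert math.gcd(e, phi) == 1
d = pow(e, -1, phi)  # private exponent; modular inverse via pow() needs Python 3.8+


def sign_raw(m: int) -> int:
    """Private-key operation: the part hidden inside the whitebox."""
    return pow(m, d, n)


def recover_input(sig: int) -> int:
    """Public-key operation: recovers what was actually exponentiated."""
    return pow(sig, e, n)


message = 0x1234567890ABCDEF
signature = sign_raw(message)
recovered = recover_input(signature)
```

Comparing `recovered` against a bignum-like buffer seen in memory is the kind of check that shows whether that buffer really is the exponentiation input.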
### Descending into despair
@ -219,7 +221,7 @@ In there, Param_1 seems to be constant, or at least input-independent. It is sti
01050603060004070103060106030306000601000003010000030103000405010601000601010601030003060003010006010100010300060603030301000106060003030106030306030003060100010000000601010301030001010101010106010003030003030301000600000300030106010006010600000000060000000303040205000301000101000300060006010601060101010003060101030300010301040501000300030300060104050604050600060606060003060000010303030001060103000603000003000100060000060101060300000101000101000001000100000003060303000006000603030300000103040506030606040106000001030103060100060001000103010306010101060600000600000405000106000603030405060003030601000100060606060000000306000600030106060000010006000303000106060001030603000301030001030300000101060601030001010101000603000001010300030101010106000101040504020702020501000003060300060600010000000603060003010405000106030000060003000003030306000306060003060606000000000106060601000303010101010407020504050601060006000401060306030003010101030303030001030100000001000101030000060003010100060103030004050101060001000100000006000000000006030106030106010001010103000006040504050101010606000601000103010301010101010603060006010405010300000106060405000601010006030601030003000003060001010103060606030106030100010000000303010001010300000303040506060103000006010601060000000006000003060000060100000101000603000401060100000601000001000106030100010306010603000301030601060103010106010604070205000001030003030001060106010300000405010306040500040100000004050100030306000600010606030603060606010300000306030101000000060600030003000300010405010306030301060600060100010006030100060103010306060006000401000000030401030103000001000300060300010006000601000001030303060003010405000600060106000601060004050300000106030601030006000001000301060601040103010306030106000600000001030300000300000601030407050101040503010000030000000004010106000001060000010603030601010306000001000106060106000000030300030001030303010106060103000101030000030103010601060000060100000603000001010000060405000103060301000601010600060101060103030100
0001010601030000000001000603040106010306060101010000
```
Lookup tables in *ConstUser_18016b077* essentally map 11 bit number(2x3 bit+5bit "carry") to 8-bit number(3-bit output plus carry). There are also other tables in the code that work on larger number of bits. But, since input and outputs are permuted in random order (and possibly have a carry bit), I could not for the life of me figure out what each of the (several thousands of) tables actually *did*. Each operation seemed to invoke a new table, or at least, a new sequence offset.
Lookup tables in *ConstUser_18016b077* essentially map 11 bit number(2x3 bit+5bit "carry") to 8-bit number(3-bit output plus carry). There are also other tables in the code that work on larger number of bits. But, since input and outputs are permuted in random order (and possibly have a carry bit), I could not for the life of me figure out what each of the (several thousands of) tables actually *did*. Each operation seemed to invoke a new table, or at least, a new sequence offset.
In any event, we have 4 of those numbers somehow generated from input and presented to exponentiation function. Where they are split into overlapping 18-byte increments, processed in a loop, compressed back to 4-byte integers and passed on into *yet another* function:
@ -228,7 +230,7 @@ void ManyMutiplies_1801720e0
(byte* param_1, byte* param_2, byte* param_3, byte* param_4, byte* param_5, byte* out)
```
Where... I have no idea :( I've spent a lot of time looking at the code, but to this day I have no idea what *exactly* it does to 4 input buffers. Those buffers do not seem to be representations of 256-byte bignums ( buffer length vary, but are mostly multiples of 90). A lot ofoperations involve preparations like
Where... I have no idea :( I've spent a lot of time looking at the code, but to this day I have no idea what *exactly* it does to 4 input buffers. Those buffers do not seem to be representations of 256-byte bignums (buffer lengths vary, but are mostly multiples of 90). A lot of operations involve preparations like
```
do {
@ -258,7 +260,7 @@ Which seem to use lookup tables (DAT_18091af30) to look up 8-byte carries? Yeah,
### Code lifting
After spending far too much time staring dumbly on decompiler and trying to run code modifications in Ghidra emulator, I decided to try dumping decompiled code into c++ file and making it compile again, with the "bright" idea of "maybe manipulatinfg inputs will give me some insight". I believe that is what is called "code lifting"? That came with its own set of challenges. The major one was the fact that decompiler was confused by overlapping buffer accesses, and could not separate local variables properly. Other was that somebody in Ghidra decompiler team thought that accessing, say, last two bytes in uint64 should be represented as *variable._6_2* instead of, say *\((short\*)&variable)\[3\]*. One of those is not proper C... So I had to go through code and replace that. As well as guess at stack variable overlaps and split those, which took weeks of painstaking register comparison.
After spending far too much time staring dumbly at the decompiler and trying to run code modifications in Ghidra emulator, I decided to try dumping decompiled code into c++ file and making it compile again, with the "bright" idea of "maybe manipulating inputs will give me some insight". I believe that is what is called "code lifting"? That came with its own set of challenges. The major one was the fact that decompiler was confused by overlapping buffer accesses, and could not separate local variables properly. Other was that somebody in Ghidra decompiler team thought that accessing, say, last two bytes in uint64 should be represented as *variable._6_2* instead of, say *\((short\*)&variable)\[3\]*. One of those is not proper C... So I had to go through code and replace that. As well as guess at stack variable overlaps and split those, which took weeks of painstaking register comparison.
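For reference, the Ghidra sub-field notation reads as "byte offset, then size in bytes". A minimal sketch of what the last two bytes of a little-endian uint64 evaluate to (the helper name `subfield` and the sample value are illustrative, not from the lifted code):

```python
import struct


def subfield(value: int, offset: int, size: int, total: int = 8) -> int:
    """Return `size` bytes starting at byte `offset` of a little-endian integer."""
    raw = value.to_bytes(total, "little")
    return int.from_bytes(raw[offset:offset + size], "little")


variable = 0x1122334455667788

# Offset 6, size 2: the last two bytes of the uint64.
last_two = subfield(variable, 6, 2)

# Same thing expressed as "fourth 16-bit element", like ((short*)&variable)[3].
as_short_index = struct.unpack("<4H", struct.pack("<Q", variable))[3]
```

Both expressions pick the most significant 16 bits on a little-endian target, which is what the replacement in the lifted code has to preserve.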
Next hurdle was a function that took two buffers already encoded into long form and spat out long form of almost-output. That one first ran table generation (unpacking?) and then jumped to runtime-generated point. Then it used a long array of addresses and values to jump over 6(?) possible code points and execute a variety of operations on data. The structure in the array looked somewhat like:
@ -275,20 +277,108 @@ Next hurdle was a function that took two buffers already encoded into long form
And the array was long... 5153 operations long. If my guess about Fourier transformation is correct, that would probably be the function that performs inverse transformation, but once again, no idea ;(
The final hurdle of the code-lifting, and the one that contributed the most to the wasm size, was constant extraction. Some constants were available from the beginning, while others, such as lookup tables, were generated at various points at runtime. There were over 600 constants used, so in the end I just automatically grabbed them from [memory dumps](/memory_dumps) with a python script without checking the appropriate legth, which resulted in a lot of overlap (it is better to have a too-long constant than access violation of undefined behavior). It is probably possible to cut the wasm size by at least half by carefully removing overlaps (and checking afterwards, since some seem necessary).
The final hurdle of the code-lifting, and the one that contributed the most to the wasm size, was constant extraction. Some constants were available from the beginning, while others, such as lookup tables, were generated at various points at runtime. There were over 600 constants used, so in the end I just automatically grabbed them from [memory dumps](/memory_dumps) with a python script without checking the appropriate length, which resulted in a lot of overlap (it is better to have a too-long constant than access violation or undefined behavior). It is probably possible to cut the wasm size by at least half by carefully removing overlaps (and checking afterwards, since some seem necessary).
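A rough sketch of that kind of extraction, assuming the dump has already been parsed into a `{start_address: bytes}` map like the `mem` dictionary built by misc/Memsearcher.py. The function names, the chunk base address and the output format are assumptions made for the example, not the script actually used.

```python
def extract_constant(mem: dict, addr: int, length: int) -> bytes:
    """Return `length` bytes at virtual address `addr` from a {start: bytes} map."""
    for start, data in mem.items():
        if start <= addr and addr + length <= start + len(data):
            off = addr - start
            return data[off:off + length]
    raise KeyError(f"range {addr:#x}+{length} is not covered by any chunk")


def to_c_array(name: str, blob: bytes, per_line: int = 12) -> str:
    """Format a byte string as a C array definition."""
    rows = []
    for i in range(0, len(blob), per_line):
        row = ", ".join(f"0x{b:02x}" for b in blob[i:i + per_line])
        rows.append("    " + row + ",")
    header = f"static const unsigned char {name}[{len(blob)}] = {{\n"
    return header + "\n".join(rows) + "\n};"


# Hypothetical dump: a single 64-byte chunk mapped at 0x18091af00.
demo_mem = {0x18091AF00: bytes(range(64))}
blob = extract_constant(demo_mem, 0x18091AF30, 8)
c_source = to_c_array("DAT_18091af30", blob)
```

Grabbing a generous `length` is the safe default here: an overlong array only costs size, while a short one reads out of bounds.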
After performing all that, I managed to recreate *HasMulAdc_18016d24d* in c++ code. Unfortunately, I did not gain any insight. The dependencies of actual input number on input buffers seemed highy non-linear as well. After a lot of trial and error(s), I was left with no recourse but to recreate input function for signature, which, luckily, was not obfuscated by switch statement. Unlike previous version, hovewer, actual RSA message to be exponentiated was never in memory during runtime, so I had to trace its creation from protobuf message.
After performing all that, I managed to recreate *HasMulAdc_18016d24d* in c++ code. Unfortunately, I did not gain any insight. The dependencies of actual input number on input buffers seemed highly non-linear as well. After a lot of trial and error(s), I was left with no recourse but to recreate input function for signature, which, luckily, was not obfuscated by switch statement. Unlike previous version, however, actual RSA message to be exponentiated was never in memory during runtime, so I had to trace its creation from protobuf message.
(to be continued)
One of the first ideas I came with, which eventually proved to be the most fruitful, was tracking SHA1 invocations. All SHA1 invocations should use the same starting values, as per [wiki](https://en.wikipedia.org/wiki/SHA-1):
```
h0 = 0x67452301
h1 = 0xEFCDAB89
h2 = 0x98BADCFE
h3 = 0x10325476
h4 = 0xC3D2E1F0
```
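These initial values are distinctive enough to search for directly. A minimal sketch, assuming a `{start_address: bytes}` memory map like the one produced by misc/Memsearcher.py and little-endian 32-bit storage (the function name and the demo chunk are made up):

```python
import struct

SHA1_INIT = (0x67452301, 0xEFCDAB89, 0x98BADCFE, 0x10325476, 0xC3D2E1F0)


def find_sha1_constants(mem: dict) -> list:
    """Return sorted (address, constant) pairs for every little-endian hit."""
    hits = []
    for start, data in mem.items():
        for word in SHA1_INIT:
            needle = struct.pack("<I", word)
            pos = data.find(needle)
            while pos != -1:
                hits.append((start + pos, word))
                pos = data.find(needle, pos + 1)
    return sorted(hits)


# Hypothetical chunk containing h0 and h4 with some filler in between.
demo = {
    0x1000: b"\x00" * 5
    + struct.pack("<I", 0x67452301)
    + b"\xff" * 3
    + struct.pack("<I", 0xC3D2E1F0)
}
hits = find_sha1_constants(demo)
```

Several of the five constants clustered within a few dozen bytes are a much stronger signal than a single isolated hit.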
By searching for those values or round constants in memory and tracking references to them, I managed to find a few areas that appeared to calculate SHA1, one of them quite near the exponentiation code(abridged):
```
void Longstringproc_18017e3b0(byte **param_1,stdstring *data,uint len,stdstring *param_4)
{
byte *charbuffer;
longlong lVar1;
undefined8 local_24b8;
byte output_24b0 [512];
byte local_22b0 [2056];
byte local_1aa8 [2056];
byte local_12a0 [1040];
byte local_e90 [1040];
byte local_a80 [1032];
byte local_678 [1032];
byte local_270 [82];
SHA1_buf buffer;
ulonglong local_50;
undefined8 uStack72;
lVar1 = 0x100;
STLStringResizeUninitialize(param_4,0x100,0);
charbuffer = (byte *)GetStrOffset_1801d456e(param_4,0);
local_24b8 = *param_1;
FUN_18011394e();
Fill_SHA_buffer_18016ae81(&buffer);
LooksLikeSha1_1801695b1((byte *)data,len,&buffer);
Shabuf_Transform_18016b9ac((uint *)local_270,&buffer);
Crazery_18016c0bb((char *)local_270,local_678,local_a80,local_e90,local_12a0);
OtherConstUser_180169484(0x10004020000345e1,local_678,local_678,local_1aa8);
OtherConstUser_180169484(0x1000402000007410,local_a80,local_a80,local_22b0);
Maybe_MEMSET_180512a50((char *)output_24b0,0xaa,0x200);
HasMulAdc_18016d24d(local_24b8,local_1aa8,local_22b0,local_e90,local_12a0,output_24b0);
do {
*charbuffer = output_24b0[lVar1 + -1];
charbuffer = charbuffer + 1;
lVar1 = lVar1 + -1;
} while (lVar1 != 0);
}
```
Indeed, that proved to be a signing function, with *data* being the message to be signed. *LooksLikeSha1_1801695b1* calculates message hash, while other functions encode and decode normal hash to and from longform. As I mentioned before, at no point does the exponentiated value itself (that is, message hash padded as per [RSA-PSS guidelines](https://datatracker.ietf.org/doc/html/rfc3447#page-36) with "0xbc" appended) appear in memory in "normal" form, even permuted. Neither is the [MGF1](https://en.wikipedia.org/wiki/Mask_generation_function#MGF1) calculated "in the clear". So where *is* it calculated? Why, in the function using runtime-generated jump tables, of course! That is, *Crazery_18016c0bb*... That function also uses the same functionality as *ConstUser_18016b077*, but with a twist: they use modulo arithmetics to permute the byte order in memory. Otherwise, the procedure is the same.
Unfortunately (for me), Ghidra was confused by missing jump table and produced garbage in decompiler, so I had to decompile the function mostly by hand. Fortunately, it only had ~6 jump entries which were not very long. After that, I ran the function while logging [data inputs and outputs](misc/rolls1.txt). In there, the first part is mostly MGF1 function, and of particular interest are these entries, since the data manipulated is 256 bytes, the size of RSA input:
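For comparison, this is roughly what the same padding looks like when computed "in the clear". It is a sketch of EMSA-PSS encoding and MGF1 as described in the RFC, not a transcription of the whitebox code; SHA-1 for both hash and mask and a 20-byte salt are assumptions here, and the function names are mine.

```python
import hashlib
import os

H_LEN = hashlib.sha1().digest_size  # 20 bytes


def mgf1(seed: bytes, length: int) -> bytes:
    """MGF1 mask generation with SHA-1."""
    out = b""
    counter = 0
    while len(out) < length:
        out += hashlib.sha1(seed + counter.to_bytes(4, "big")).digest()
        counter += 1
    return out[:length]


def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def emsa_pss_encode(message: bytes, em_bits: int = 2047, salt: bytes = None) -> bytes:
    """Build the 256-byte block that is then raised to the private exponent."""
    em_len = (em_bits + 7) // 8
    if salt is None:
        salt = os.urandom(H_LEN)  # assumption: salt length equals hash length
    m_hash = hashlib.sha1(message).digest()
    h = hashlib.sha1(b"\x00" * 8 + m_hash + salt).digest()
    ps = b"\x00" * (em_len - len(salt) - H_LEN - 2)
    db = ps + b"\x01" + salt
    masked_db = bytearray(xor(db, mgf1(h, em_len - H_LEN - 1)))
    masked_db[0] &= 0xFF >> (8 * em_len - em_bits)  # clear the unused top bit
    return bytes(masked_db) + h + b"\xbc"


em = emsa_pss_encode(b"license request", salt=b"\x11" * 20)
```

With a 2048-bit modulus the block is 235 masked bytes, then the 20-byte hash, then the trailing 0xbc.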
```
0x3f9 Total length: 1026 Zeros1: 84 Chunk1: 942 Input: 330 Output: 11218 Cnt: 26762
0x812 First len: 86 Second len: 940 Source 1: 1402 Source 2: 11218 Destination: 19942 Cnt: 26763
0x60f Initial skip: 3 Processed len: 1023 Second len: 3 Input: 19942 Output: 11218 Cnt: 26764
0x812 First len: 1026 Second len: 0 Source 1: 7530 Source 2: 11218 Destination: 7530 Cnt: 26765
0x812 First len: 1026 Second len: 0 Source 1: 19942 Source 2: 7530 Destination: 11218 Cnt: 26766
0x812 First len: 1026 Second len: 0 Source 1: 3582 Source 2: 11218 Destination: 3582 Cnt: 26767
0x812 First len: 1026 Second len: 0 Source 1: 15058 Source 2: 7530 Destination: 7530 Cnt: 26768
0x812 First len: 1026 Second len: 0 Source 1: 7530 Source 2: 3582 Destination: 15058 Cnt: 26769
0x812 First len: 1026 Second len: 0 Source 1: 1490 Source 2: 7530 Destination: 1490 Cnt: 26770
0x812 First len: 1026 Second len: 0 Source 1: 11218 Source 2: 1490 Destination: 7530 Cnt: 26771
0x812 First len: 1026 Second len: 0 Source 1: 15058 Source 2: 15058 Destination: 3582 Cnt: 26772
0x812 First len: 1026 Second len: 0 Source 1: 7530 Source 2: 7530 Destination: 1490 Cnt: 26773
0x2b4 Length skip: 0 Length proc: 1026 Len 00: 12 Input1 7530 Input 2 7530 Output 23150 Cnt: 26774
```
After that, the input is somehow split into 4 parts of 259 bytes each. Part of the division is just splitting original input into sum of two numbers. The exact nature of further manipulation remains a mystery to me.
With this, the whole signing process is in c++ code!... so I can sign license requests, but cannot create custom inputs for decryption... yet.
### FaIlUrEs uPoN fAiLuReS
By this point, i HaVe fAiLeD AlReaDy iN My gOaL oF eXtRaCtInG RsA key <_< IaM pRoBaBLy mISsinG soMeThINg trIViAl. *AGAIN*
All that I had left for me was to maybe find a way to modify input so that it would approach the encoded value, thereby decrypting ciphertext with session key. To do that, I tried to modify values at various steps above, then running the whole encryption/decryption cycle to see what the input is like. Some modifications did not produce any input differences at all, hinting at redundancies/variable splitting (meaning, I was modifying something that was used as obfuscation and then cancelled out). Eventually, I found a few values (steps/memory offsets) that produced "linear" modifications to the input, linear in this case meaning that modification to a single byte resulted in localized modifications to the "input", not affecting previous input bytes unless wrapping was involved. Unfortunately, try as I might, I could not figure out the actual encoding used... Also, there were several locations affecting input (last 21 bytes of seed+padding and first 235 bytes were split into different variables). Eventually, I gave up and decided to brute-force input in 2-bit increments. Since one "decryption" operation took about 1 second, that took... Quite a while.
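The idea behind that brute force can be sketched as a greedy search. Everything here is a stand-in: `toy_oracle` is an invented function with the one property described above (a chunk only influences output at its own position and later), not the real encoding, and the names are hypothetical.

```python
def recover_by_chunks(oracle, target, n_chunks, values=range(4)):
    """Fix the guess one 2-bit chunk at a time, left to right."""
    guess = [0] * n_chunks
    calls = 0
    for i in range(n_chunks):
        for v in values:
            guess[i] = v
            calls += 1
            if oracle(guess)[i] == target[i]:
                break  # chunk i is right, earlier chunks are untouched
        else:
            raise ValueError(f"no 2-bit value matches at chunk {i}")
    return guess, calls


def toy_oracle(chunks):
    """Invented encoding: output i depends only on chunks 0..i."""
    state, out = 7, []
    for c in chunks:
        state = (state * 5 + c + 3) % 251
        out.append(state)
    return out


# 256 bytes of input correspond to 1024 two-bit chunks.
secret = [(i * 7 + 3) % 4 for i in range(1024)]
target = toy_oracle(secret)
recovered, calls = recover_by_chunks(toy_oracle, target, len(secret))
```

With four candidates per chunk the search needs at most 4 × 1024 = 4096 oracle calls, which at about a second per call is where the long wait comes from.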
Failure, failure, FaIlURe <_< But better than nothing?
**Update:** While writing this ReadMe, I kept fiddling with input encoding, and found out that just an operation later the output was XORed with another buffer to form original input. I also realized that whitebox engineers (who I would rather have do something more moral, like human experimentation) were lazy, and used the same buffer-to-long-form encoding tables in all locations instead of chaining them. So, after brute forcing table order, I managed to work out encoding procedure that seems to work for most inputs (maybe all, but I cannot prove that), so only a single decrypting operation is needed instead of ~4000 on average.
Now I had a chunk of c++ code doing "decryption" and "encryption" given a long-form input guess. I needed a way to connect it to the Chrome browser. The obvious way, of course, is to use [Emscripten](https://emscripten.org/docs/getting_started/downloads.html) to compile c++ code into WebAssembly, support for which was added to Chrome... recently enough. Emscripten also provides an initial JS wrapper for the exported functions.
Luckily, a [single command](/build_wasm.bat) managed to compile c++ file with some CryptoPP support after a few minor modifications. Originally, I put the brute-forcing code out into Javascript so it could be more easily interrupted and monitored. Unfortunately, while the program was working, it was unbearably slow and tended to freeze video playback. AnoTHer faIlUre <_< At least that one was later resolved.
The last thing was to remove [OAEP padding](https://datatracker.ietf.org/doc/html/rfc8017#page-19). Why is it so hard to get proper info on those formats outside of RFC? Unfortunately, original repository used library that combined decryption with padding removal, so I decided to simply put a rough implementation of RFC into c++ code, since it was already plenty bloated. That seemed to work well enough for the purpose.
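For anyone else hunting for a readable description, here is a compact sketch of the OAEP unpadding step from the RFC. It is not the c++ code from this repo; SHA-1 for both the label hash and MGF1 and an empty label are assumptions, and `oaep_encode` exists only to produce a test block.

```python
import hashlib

H_LEN = hashlib.sha1().digest_size  # 20 bytes


def mgf1(seed: bytes, length: int) -> bytes:
    """MGF1 mask generation with SHA-1."""
    out = b""
    counter = 0
    while len(out) < length:
        out += hashlib.sha1(seed + counter.to_bytes(4, "big")).digest()
        counter += 1
    return out[:length]


def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def oaep_decode(em: bytes, label: bytes = b"") -> bytes:
    """Strip OAEP padding from a decrypted block: 0x00 | maskedSeed | maskedDB."""
    k = len(em)
    if k < 2 * H_LEN + 2 or em[0] != 0:
        raise ValueError("decryption error")
    masked_seed = em[1:1 + H_LEN]
    masked_db = em[1 + H_LEN:]
    seed = xor(masked_seed, mgf1(masked_db, H_LEN))
    db = xor(masked_db, mgf1(seed, k - H_LEN - 1))
    # DB layout: lHash | zero padding | 0x01 | message
    sep = db.find(b"\x01", H_LEN)
    if db[:H_LEN] != hashlib.sha1(label).digest() or sep == -1 or any(db[H_LEN:sep]):
        raise ValueError("decryption error")
    return db[sep + 1:]


def oaep_encode(message: bytes, k: int, seed: bytes, label: bytes = b"") -> bytes:
    """Inverse of oaep_decode, used here only to build a demo block."""
    ps = b"\x00" * (k - len(message) - 2 * H_LEN - 2)
    db = hashlib.sha1(label).digest() + ps + b"\x01" + message
    masked_db = xor(db, mgf1(seed, k - H_LEN - 1))
    masked_seed = xor(seed, mgf1(masked_db, H_LEN))
    return b"\x00" + masked_seed + masked_db


session_key = bytes(range(16))
block = oaep_encode(session_key, 256, seed=b"\x22" * H_LEN)
unpadded = oaep_decode(block)
```

The checks are not constant-time, which is fine for a one-off extraction tool but not for anything facing an attacker.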
### Conclusion
### Some references
In the end, I only extracted about half of the RSA key. I am not sure how long is the key that remains in whitebox, though I have checked values up to about 64000 (power value, not bits). Nor am I sure why or how input was split into 4 buffers. I am leaving this ReadMe and scripts here in some hope that they may help when Google inevitably changes key again. As an additional reference, author of original repo, Tomer8007, uploaded writeup on [original extraction method](https://github.com/tomer8007/widevine-l3-decryptor/wiki/Reversing-the-old-Widevine-Content-Decryption-Module), seemingly somewhere around the time I was uploading my repo. It is a lot better than mine here, so give it a read as well.
All in all, it was a decent, albeit somewhat depressing exercise that I have little desire to ever repeat. I will probably cease updating repo soon after Readme is finished, so for people that want it modified: please fork or copy it and modify as you see fit. Attribution is appreciated, though ;)
*The end.*
https://datatracker.ietf.org/doc/html/rfc3447#page-36
https://patentimages.storage.googleapis.com/0c/f3/c0/08ce4394385810/US20160328543A1.pdf
https://github.com/tomer8007/widevine-l3-decryptor/wiki/Reversing-the-old-Widevine-Content-Decryption-Module


@ -90,7 +90,7 @@ EmeInterception.prototype.addListenersToNavigator_ = function()
{
if (navigator.listenersAdded_)
return;
console.log('Adding listeners to navigator');
var originalRequestMediaKeySystemAccessFn = EmeInterception.extendEmeMethod(
navigator,
navigator.requestMediaKeySystemAccess,
@ -114,7 +114,35 @@ EmeInterception.prototype.addListenersToNavigator_ = function()
    }.bind(this));
  }.bind(this);
  if(navigator.mediaCapabilities)
  {
    if(navigator.mediaCapabilities.decodingInfo)
    {
      var originalDecodingInfoFn = EmeInterception.extendEmeMethod(
          navigator.mediaCapabilities, navigator.mediaCapabilities.decodingInfo,"DecodingInfoCall");
      navigator.mediaCapabilities.decodingInfo = function()
      {
        var self = arguments[0];
        //console.log(arguments);
        // slice "It is recommended that a robustness level be specified" warning
        var modifiedArguments = arguments;
        //var modifiedOptions = EmeInterception.addRobustnessLevelIfNeeded(options);
        //modifiedArguments[1] = modifiedOptions;
        var result = originalDecodingInfoFn.apply(null, modifiedArguments);
        // Attach listeners to returned MediaKeySystemAccess object
        return result.then(function(res)
        {
          //console.log(res);
          if(res.keySystemAccess)
            this.addListenersToMediaKeySystemAccess_(res.keySystemAccess);
          return Promise.resolve(res);
        }.bind(this));
      }.bind(this);
    }
  }
  navigator.listenersAdded_ = true;
};

322
misc/Memsearcher.py Normal file
View File

@ -0,0 +1,322 @@
# script that I used to look up values and dump constants out of memory dumps produced by Ghidra emulator. Very rough.
import struct
import gzip
import sys
import os
fl1=sys.argv[1]
#fl2=sys.argv[2]
srch=None
if len(sys.argv)>2:
    srch=sys.argv[2]
def readRegister(dct,fl):
    # record layout: u16 name length, ASCII name, u16 value size, value
    dt=fl.read(2)
    if len(dt)<2: return False
    nmlen=struct.unpack("<H",dt)[0]
    nml=fl.read(nmlen)
    if len(nml)<nmlen: return False
    nm=nml.decode("ascii")
    dt=fl.read(2)
    if len(dt)<2: return False
    ln=struct.unpack("<H",dt)[0]
    if ln==16:
        dt=fl.read(16)
        if len(dt)<16: return False
        vals=struct.unpack("<QQ",dt)
        val=(vals[0]<<64)+vals[1]
    elif ln==8:
        dt=fl.read(8)
        if len(dt)<8: return False
        val=struct.unpack("<Q",dt)[0]
    elif ln==4:
        dt=fl.read(4)
        if len(dt)<4: return False
        val=struct.unpack("<I",dt)[0]
    else:
        dt=fl.read(2)
        if len(dt)<2: return False
        val=struct.unpack("<H",dt)[0]
    if not "registers" in dct:
        dct["registers"]={}
    dct["registers"][nm]=val
    return True
def readMemoryChunk(dct,fl):
    # chunk layout: u64 start address, u64 length, raw bytes
    if not "mem" in dct:
        dct["mem"]={}
    dt=fl.read(8)
    if len(dt)<8: return False
    start=struct.unpack("<Q",dt)[0]
    dt=fl.read(8)
    if len(dt)<8: return False
    ln=struct.unpack("<Q",dt)[0]
    if ln>0:
        dat=fl.read(ln)
        if len(dat)<ln: return False
        dct["mem"][start]=dat
    return True
def readSnapshot(dct,fl):
    # u32 register count, then the registers, then memory chunks until EOF
    numreg=struct.unpack("<I",fl.read(4))[0]
    for i in range(numreg):
        if not readRegister(dct,fl):
            print("Corrupt snapshot: not enough registers")
            return
    while readMemoryChunk(dct,fl):
        pass
basedir="./"
def loadSnapshot(dct,name):
    global basedir
    fname=os.path.join(basedir,name)
    with gzip.open(fname, 'rb') as f:
        readSnapshot(dct,f)
kt={}
loadSnapshot(kt,fl1)
print(kt["mem"].keys())
import codecs
st="22e54cd8"#"22E54CD8A10671840752EF46"
if srch is not None:
st=srch
bts=codecs.decode(st,"hex")
def find_all(a_str, sub):
    # yield the offset of every non-overlapping occurrence of sub
    start = 0
    while True:
        start = a_str.find(sub, start)
        if start == -1: return
        yield start
        start += len(sub)
for offs in kt["mem"]:
    i=list(find_all(kt["mem"][offs],bts))
    for ko in i:
        print("{:x}".format(offs+ko))
def readAddr(dct,addr,nm):
    # find the chunk that contains addr and return nm bytes starting there
    for offs in dct["mem"]:
        dt=dct["mem"][offs]
        if offs<=addr<offs+len(dt):
            return dt[addr-offs:addr-offs+nm]
def readULL(dct,addr):
    dt=readAddr(dct,addr,8)
    return struct.unpack("<Q",dt)[0]
def readUI(dct,addr):
    dt=readAddr(dct,addr,4)
    return struct.unpack("<I",dt)[0]
def readUS(dct,addr):
    dt=readAddr(dct,addr,2)
    return struct.unpack("<H",dt)[0]
def readByte(dct,addr):
    dt=readAddr(dct,addr,1)
    return struct.unpack("<c",dt)[0][0]
def lesserConstShuffle(const,p1,p2):
    global kt
    ret=[]
    length=(const >> 0x24) & 0x3fff
    offset=(const & 0x3fffff)
    if length>0:
        eax=0
        ret=[0]*length
        for k in range(length):
            eax=eax&0xf8
            #print("{:b}".format(eax))
            fl=p1[k]
            eax=eax^fl
            f2=(p2[k]<<8)
            fl=f2+eax
            f3=readByte(kt,offset+k+0x180a85ad0)<<11
            #print(readByte(kt,offset+k+0x180a25040))
            fl=fl+f3
            eax=readByte(kt,0x1809cde30+fl)
            ret[k]=eax&7
    return ret
def cnstShuffle(const,p1,p2):
    global kt
    ret=[]
    length=(const >> 0x24) & 0x3fff
    offset=(const & 0x3fffff)
    slen=(const >>0x32)
    if length>0:
        eax=0
        ret=[0]*length
        for k in range(length):
            eax=eax&0xf8
            #print("{:b}".format(eax))
            fl=p1[k]
            eax=eax^fl
            f2=(p2[k]<<8)
            fl=f2+eax
            f3=readByte(kt,offset+k+0x180a25040)<<11
            #print(readByte(kt,offset+k+0x180a25040))
            fl=fl+f3
            eax=readByte(kt,0x1809cde30+fl)
            ret[k]=eax&7
    if slen>0:
        while len(ret)<slen+length:
            ret.append(0)
        for l in range(slen):
            k=l+length
            eax=eax&0xf8
            esi=(p2[k]<<8)
            esi=esi|eax
            eax=(readByte(kt,offset+k+0x180a25040)<<11)
            eax=(eax^esi)
            eax=readByte(kt,0x1809cde30+eax)
            ret[k]=eax&7
    return ret
def otherShuffle(const,p1,p2):
    global kt
    ret=[]
    length=(const >> 0x24) & 0x3fff
    offset=(const & 0x3fffff)
    sublen=(const >> 0x16) & 0x3fff
    if sublen ==0:
        eax=0
    else:
        eax=0
        rtval=0
        for a in range(sublen):
            # build the table index from the table number, both inputs and the carried state
            eax=(readByte(kt,offset+a+0x180a25040)<<11)+(p2[a]<<8)+(eax&0xf8)+(p1[a]&0x7)
            eax=readByte(kt,0x1809cde30+eax)
    slen=(const >>0x32)
    if length>0:
        eax=0
        ret=[0]*length
        for k in range(length):
            eax=eax&0xf8
            fl=p1[k+sublen]
            eax=eax^fl
            f2=(p2[k+sublen]<<8)
            fl=f2+eax
            f3=readByte(kt,offset+k+sublen+0x180a25040)<<11
            #print(readByte(kt,offset+k+0x180a25040))
            fl=fl+f3
            eax=readByte(kt,0x1809cde30+fl)
            ret[k]=eax&7
    if slen>0:
        while len(ret)<slen+length+sublen:
            ret.append(0)
        for l in range(slen):
            k=l+length+sublen
            eax=eax&0xf8
            esi=(p2[k]<<8)
            esi=esi|eax
            eax=(readByte(kt,offset+k+0x180a25040)<<11)
            eax=(eax^esi)
            eax=readByte(kt,0x1809cde30+eax)
            ret[k]=eax&7
    return ret
def cnPack(lst,stp=0):
    # pack the low 2 bits of each entry, four entries per byte, lowest bits first
    eax=0
    ret=[]
    for k in range(stp,len(lst),1):
        ebx=lst[k]&3
        ebx=(ebx<<(eax&6))# 110
        ecx=((k-stp)>>2)
        while len(ret)<ecx+1: ret.append(0)
        ret[ecx]|=ebx
        eax+=2
    return ret
def cnUnpack(lst,ln=None):
    # split each byte back into four 2-bit values, lowest bits first
    ret=[]
    for k in range(len(lst)):
        l=lst[k]
        ret.append(l&0x3)
        ret.append((l>>2)&0x3)
        ret.append((l>>4)&0x3)
        ret.append((l>>6)&0x3)
    if ln is not None:
        while(len(ret)<ln): ret=[0,*ret]
    return ret
"""
Tables:
each side has 3 bits (permuted) - 8 values. 4 have carry
Many tables do who-knows-what XD
some are "sum" tables -can be recognized by the fact that they have all values in each row if varied (and can be permuted to symmetrical form)
With additional "carry" sum tables can affect up to 3 cells forward by variation (more if other cells are full)
Some are normalizer, work on the doubled arguments (a=b) and allow to pack data afterwards...
a+carry(b) table? (carry goes to next "digit" - interleaved??
table has separate encodings for a,b,c (8x8 table)
etc ...
"""
#0x12000027448 -sum table, it seems
#0x1200002b000 - another sum table?
#0x12000026a1b - normalizer table...
#0x25000037501 - carry flipper?
#0x1000002cd3a - carry ... no-flipper? not sure
for q in range(8):
    ls=[]
    for w in range(8):
        a2=[1]*0x40e
        a1=[1]*0x40e
        a1[0]=0
        a2[0]=0
        a1[1]=q
        a2[1]=1
        a1[2]=w
        a2[2]=w
        #a3=cnstShuffle(0x12000033f37,a1,a2) #0x120000054d6
        a3=cnstShuffle(0x12000033f37,a1,a1) #0x120000054d6
        #print(a3)
        ls.append(a3[4])
    print("{}: {}".format(q,ls))
carry=0
def printTC(num):
    # print the diagonal (a=b) of table num for every carry value
    offs=num<<11
    ecr=set()
    print("TC {}".format(num))
    for carry in range(32):
        st=[]
        for q in range(8):
            dm=offs+(q<<8)+(carry<<3)+q
            dt=readByte(kt,0x1809cde30+dm)
            ec=dt>>3
            vl=dt&7
            st.append(vl)
            ecr.add(ec)
        print("{}: {}".format(carry,st))
    print(ecr)
printTC(22)
printTC(47)
sys.exit()
for ss in range(256):
    offs=ss<<11
    crr=set()
    print("Table # {}".format(ss))
    for q in range(8):
        ls=[]
        for w in range(8):
            dm=offs+(q<<8)+(carry<<3)+w
            dt=readByte(kt,0x1809cde30+dm)
            ls.append([dt>>3,dt&0x7])
            crr.add(dt>>3)
        print("{}: {}".format(q,ls))
    print("Carries: {} {}".format(len(crr),crr))
print("{:x}".format(readULL(kt,0x181253ac8)))
llen=38848
with open("dats.lg","r") as fl:
    for ln in fl:
        ls=ln.strip()
        if "DAT" in ls:
            offs=ls.split("_")[1]
            l=[]
            for i in range(llen):
                l.append(readByte(kt,int("0x"+offs,16)+i))
            ssl='{'+', '.join(["{}".format(k) for k in l])+'};'
            print("unsigned char DAT_{} [{}]={}".format(offs,len(l),ssl))
        elif "INT" in ls:
            offs=ls.split("_")[1]
            l=[]
            for i in range(llen):
                l.append(readUI(kt,int("0x"+offs,16)+i*4))
            ssl='{'+', '.join(["{}".format(k) for k in l])+'};'
            print("unsigned int INT_{} [{}]={}".format(offs,len(l),ssl))
        elif "QWORD" in ls:
            offs=ls.split("_")[1]
            l=[]
            for i in range(llen):
                l.append(readULL(kt,int("0x"+offs,16)+i*8))
            ssl='{'+', '.join(["{}".format(k) for k in l])+'};'
            print("unsigned long long QWORD_{} [{}]={}".format(offs,len(l),ssl))
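
The snapshot layout that `readSnapshot` expects can be read off the parsing code above: a little-endian `u32` register count, then one record per register (`u16` name length, ASCII name, `u16` value size, value), then memory chunks (`u64` start address, `u64` length, raw bytes) until end of file, all gzip-compressed. The sketch below is not part of the repo. It writes a tiny snapshot in that layout and reads it back, mainly as a way to test tooling without a real Ghidra dump. The names `write_snapshot`, `read_snapshot` and `read_bytes`, the register names and all sample values are made up for illustration, and 16-byte registers are left out for brevity.

```python
import gzip
import io
import struct

# struct formats for the register sizes handled in this sketch (assumption: 16-byte values omitted)
FORMATS = {8: "<Q", 4: "<I", 2: "<H"}


def write_snapshot(stream, registers, chunks):
    """Write registers {name: (size, value)} and chunks {start: bytes} to stream."""
    stream.write(struct.pack("<I", len(registers)))
    for name, (size, value) in registers.items():
        raw_name = name.encode("ascii")
        stream.write(struct.pack("<H", len(raw_name)) + raw_name)
        stream.write(struct.pack("<H", size))
        stream.write(struct.pack(FORMATS[size], value))
    for start, data in chunks.items():
        stream.write(struct.pack("<QQ", start, len(data)) + data)


def read_snapshot(stream):
    """Parse the layout written by write_snapshot into a dict."""
    snap = {"registers": {}, "mem": {}}
    (count,) = struct.unpack("<I", stream.read(4))
    for _ in range(count):
        (name_len,) = struct.unpack("<H", stream.read(2))
        name = stream.read(name_len).decode("ascii")
        (size,) = struct.unpack("<H", stream.read(2))
        (value,) = struct.unpack(FORMATS[size], stream.read(size))
        snap["registers"][name] = value
    while True:
        header = stream.read(16)
        if len(header) < 16:  # end of file: no more chunks
            break
        start, length = struct.unpack("<QQ", header)
        snap["mem"][start] = stream.read(length)
    return snap


def read_bytes(snap, addr, count):
    """Return count bytes at addr from whichever chunk contains it, or None."""
    for start, data in snap["mem"].items():
        if start <= addr < start + len(data):
            return data[addr - start:addr - start + count]
    return None


# Build a hypothetical snapshot in memory, compress it, and read it back.
buffer = io.BytesIO()
write_snapshot(
    buffer,
    {"RAX": (8, 0x1809CDE30), "EFLAGS": (4, 0x246)},
    {0x180001000: bytes.fromhex("0022e54cd8a1")},
)
blob = gzip.compress(buffer.getvalue())
snapshot = read_snapshot(io.BytesIO(gzip.decompress(blob)))
print(hex(snapshot["registers"]["RAX"]))
print(read_bytes(snapshot, 0x180001001, 4).hex())
```

Keeping the chunks in a dict keyed by start address mirrors what the script does, so an address lookup is a linear scan over chunks. That is fine for a handful of mapped regions, which is all an emulator dump usually contains.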

31
misc/oaep.py Normal file
View File

@ -0,0 +1,31 @@
#simple script that shows how to unmask OAEP padding from big integer
import hashlib
def i2osp(integer, size):
    return bytes([((integer >> (8 * i)) & 0xFF) for i in reversed(range(size))])
def mgf1(input_str, length, hash=hashlib.sha1):
    """Mask generation function."""
    counter = 0
    output = b""
    while len(output) < length:
        C = i2osp(counter, 4)
        output += hash(input_str + C).digest()
        counter += 1
    return output[:length]
def decode(bts,seed):
    # bts is the masked data block, seed is the masked seed
    zr=mgf1(bts,20)
    sd=bytearray()
    for (a,b) in zip(seed,zr):
        sd.append(a^b)
    seed=bytes(sd)
    xormask=mgf1(seed,len(bts))
    ret=bytearray()
    for (a,b) in zip(bts,xormask):
        ret.append(a^b)
    return bytes(ret)
ii=int("e4ae6c475d00d73552eae63d3456cd59f17e0f4bbad2a587d34c774658b9b5ce7857491e6e06fbc79cc8f688ad20e9c2f6d65419b3ec86657c1b87a80cd4a5c012a1d7571b842ff7c0f56c1d83ae003b73e73633f65f4c3644f0570c57dffa72f7e00788365a0726511b05bb3d440777770742cc776f3266456755b803b3743a0cd1b139d2a8522b1f6e4970afd74096a9e11abbdbfdb06b10a529877840e825d42b117c285bb064fc4778dd4242cb2e9df49e63c3ab60dc54a0f2d45126683bb71602bf5963468e56e8e84bc6c58c3c68f4670b080937db93aa22d90f35d8e8767654965f40b2fde20a84d2d57e9e12ecf9dddf02c3943cb0d2f513d0c965",16).to_bytes(256,'big')
print("isValidPrefix={}".format(ii[0]==0))
print("Masked data: {}".format(ii[21:].hex()))
print("Seed Mask: {}".format(mgf1(ii[21:],20).hex()))
print("Unmasked data: {}".format(decode(ii[21:],ii[1:21]).hex()))
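
The script above only goes one way: it takes an already-decrypted block and strips the masks. As a cross-check, here is a minimal round-trip sketch of the same EME-OAEP layout from PKCS #1 (`0x00 || maskedSeed || maskedDB`, where the data block is `lHash || PS || 0x01 || message`), using SHA-1 and a 256-byte block as in the script. The function names, the empty label and the sample message are assumptions made for illustration, and there is no RSA step here, only the padding.

```python
import hashlib
import os

H_LEN = 20  # SHA-1 digest size, matching the script above


def mgf1(seed, length):
    """MGF1 with SHA-1: hash seed || 4-byte big-endian counter until long enough."""
    out = b""
    counter = 0
    while len(out) < length:
        out += hashlib.sha1(seed + counter.to_bytes(4, "big")).digest()
        counter += 1
    return out[:length]


def xor(a, b):
    return bytes(x ^ y for x, y in zip(a, b))


def oaep_encode(message, k=256, label=b"", seed=None):
    """Build the k-byte encoded block for message (no RSA applied)."""
    if len(message) > k - 2 * H_LEN - 2:
        raise ValueError("message too long")
    if seed is None:
        seed = os.urandom(H_LEN)
    padding = b"\x00" * (k - len(message) - 2 * H_LEN - 2)
    data_block = hashlib.sha1(label).digest() + padding + b"\x01" + message
    masked_db = xor(data_block, mgf1(seed, k - H_LEN - 1))
    masked_seed = xor(seed, mgf1(masked_db, H_LEN))
    return b"\x00" + masked_seed + masked_db


def oaep_decode(block, label=b""):
    """Undo oaep_encode and return the message, checking the label hash and padding."""
    if block[0] != 0:
        raise ValueError("bad leading byte")
    masked_seed, masked_db = block[1:1 + H_LEN], block[1 + H_LEN:]
    seed = xor(masked_seed, mgf1(masked_db, H_LEN))
    data_block = xor(masked_db, mgf1(seed, len(masked_db)))
    if data_block[:H_LEN] != hashlib.sha1(label).digest():
        raise ValueError("label hash mismatch")
    separator = data_block.index(b"\x01", H_LEN)
    if any(data_block[H_LEN:separator]):
        raise ValueError("bad padding")
    return data_block[separator + 1:]


encoded = oaep_encode(b"sixteen byte key")
print(len(encoded), oaep_decode(encoded))
```

If a candidate decryption is correct, the unmasked data printed by the script should start with the SHA-1 hash of the label, followed by zero bytes and a single `0x01`. That makes the label-hash check a quick way to tell a correct RSA result from garbage.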

26804
misc/rolls1.txt Normal file

File diff suppressed because it is too large Load Diff