This challenge gives a .pyc file and a Python 3.11 alpha7 based patch file.
Patch
This patch file mainly patches the opcode.py
file. After the patch, opcode.py
file first generated two random lists and modifies the opcode according to the values of the lists. Finally, it will output the new opcode to a tmp file.
+import sys
+if "generate_opcode_h" in sys.argv[0]:
+ random = __import__("random")
+ random.shuffle(perm)
+ random.shuffle(perm2)
-def_op('CACHE', 0)
-def_op('POP_TOP', 1)
-def_op('PUSH_NULL', 2)
+def_op('CACHE', perm[0])
+def_op('POP_TOP', perm[1])
+def_op('PUSH_NULL', perm[2])
+with open("/tmp/opcode_map", "w") as f:
+ for x, y in enumerate(opname):
+ f.write("%s %s\n" % (x, y))
The patch file also modifies the ceval.c
file, but I don’t understand what this does.
diff --git a/Python/ceval.c b/Python/ceval.c
index 1d2c6432d0..e275158d9d 100644
--- a/Python/ceval.c
+++ b/Python/ceval.c
@@ -1344,6 +1344,8 @@ eval_frame_handle_pending(PyThreadState *tstate)
#define OR_DTRACE_LINE
#endif
+#define USE_COMPUTED_GOTOS 0
+
#ifdef HAVE_COMPUTED_GOTOS
#ifndef USE_COMPUTED_GOTOS
#define USE_COMPUTED_GOTOS 1
@@ -1705,6 +1707,7 @@ _PyEval_EvalFrameDefault(PyThreadState *tstate, _PyInterpreterFrame *frame, int
#if USE_COMPUTED_GOTOS
/* Import the static jump table */
#include "opcode_targets.h"
+ asdasdasd, break the compilation...
#endif
#ifdef Py_STATS
Opcode
Based on the information in the patch file, we came up with two ideas. One is to solve it by cracking the random number generator, and the other is to recover all opcodes manually.
Cracking the random number generator
Since randon.shuffle is based on mt19937, cracking this requires providing 624 32bit numbers which is impossible. So we spent a lot of time on cracking the seed of the random number generator. For example, I found the file generation date of the patch file very suspicious, so I tried to use that time as random seed. We also used the sleeping hours to run the explosion program. Until we find:
random.seed(a=None, version=2)
Initialize the random number generator.
If a is omitted or None, the current system time is used. If randomness sources are provided by the operating system, they are used instead of the system time (see the os.urandom() function for details on availability).
We do not think there is any way to guess os.urandom(), so we focused on the second method.
Recover opcodes by hand
It’s difficult to disasm python 3.11 .pyc files, for the marshal and opcodes changes a lot. The newist tools I know is pycdc which still support python 3.10 only.
I started with some easy functions. For example, it is obvious that __main__
start with import sys
and followd with functions defination.
0 RESUME 0
2 LOAD_CONST 0
4 LOAD_CONST 1
6 IMPORT_NAME 0 (sys)
8 STORE_NAME 0
26 LOAD_CONST 2
28 MAKE_FUNCTION 0
30 STORE_NAME 3 (ks)
68 LOAD_CONST 1
70 RETURN_VALUE 0
Therefore we got:
code | op |
---|---|
0x61 | RESUME |
0x60 | LOAD_CONST |
0xab | IMPORT_NAME |
0x96 | STORE_NAME |
0x31 | RETURN |
0x63 | MAKE_FUNCTION |
I also find there were lots of 20 00
in .pyc file. Which is CACHE
, and we can ignore it.
Since there were bugs in dis
module after I change the opcode.py, I gave it up. But we can still use dis.show_code()
to unserialize marshal.
>>> import marshal, dis
>>> fd = open('x.pyc', 'rb')
>>> fd.seek(16)
16
>>> a = marshal.load(fd)
>>> dis.show_code(a)
Name: <module>
Filename: /usr/local/google/home/xxxxxxxxx/yyyy/zzzzzzzzzzzzzzzz/xxxxxxxxxxx.py
Argument count: 0
Positional-only arguments: 0
Kw-only arguments: 0
Number of locals: 0
Stack size: 2
Flags: 0x0
Constants:
0: 0
1: None
2: <code object ks at 0x101419470, file "/usr/local/google/home/xxxxxxxxx/yyyy/zzzzzzzzzzzzzzzz/xxxxxxxxxxx.py", line 5>
3: <code object cry at 0x1016003f0, file "/usr/local/google/home/xxxxxxxxx/yyyy/zzzzzzzzzzzzzzzz/xxxxxxxxxxx.py", line 11>
4: <code object fail at 0x101553000, file "/usr/local/google/home/xxxxxxxxx/yyyy/zzzzzzzzzzzzzzzz/xxxxxxxxxxx.py", line 18>
5: <code object game1 at 0x12c704920, file "/usr/local/google/home/xxxxxxxxx/yyyy/zzzzzzzzzzzzzzzz/xxxxxxxxxxx.py", line 23>
6: <code object game2 at 0x12c704cc0, file "/usr/local/google/home/xxxxxxxxx/yyyy/zzzzzzzzzzzzzzzz/xxxxxxxxxxx.py", line 68>
7: <code object game3 at 0x12c7051f0, file "/usr/local/google/home/xxxxxxxxx/yyyy/zzzzzzzzzzzzzzzz/xxxxxxxxxxx.py", line 96>
8: <code object main at 0x12c704f20, file "/usr/local/google/home/xxxxxxxxx/yyyy/zzzzzzzzzzzzzzzz/xxxxxxxxxxx.py", line 123>
Names:
0: sys
1: random
2: time
3: ks
4: cry
5: fail
6: game1
7: game2
8: game3
9: main
>>> dis.show_code(a.co_consts[2])
Name: ks
Filename: /usr/local/google/home/xxxxxxxxx/yyyy/zzzzzzzzzzzzzzzz/xxxxxxxxxxx.py
Argument count: 1
Positional-only arguments: 0
Kw-only arguments: 0
Number of locals: 1
Stack size: 4
Flags: OPTIMIZED, NEWLOCALS, GENERATOR
Constants:
0: None
1: True
2: 0
3: 255
4: 13
5: 17
6: 256
Names:
0: random
1: seed
2: randint
Variable names:
0: seed
>>>
Start with the short function will be easier. For example, it is easy to recognize BINARY_OP
and LOAD_FAST
in function w()
which defined in function game1()
.
Name: w
Filename: /usr/local/google/home/xxxxxxxxx/yyyy/zzzzzzzzzzzzzzzz/xxxxxxxxxxx.py
Argument count: 3
Positional-only arguments: 0
Kw-only arguments: 0
Number of locals: 3
Stack size: 3
Flags: OPTIMIZED, NEWLOCALS, NESTED
Constants:
0: None
1: 10
2: 1
Variable names:
0: m
1: i
2: j
0x7a2 0x7c0
0 RESUME 0
2 LOAD_FAST 0 (m)
4 LOAD_FAST 1 (i)
6 LOAD_CONST 1 (10)
8 BINARY_OP 5 (*)
12 LOAD_FAST 2 (j)
14 BINARY_OP 0 (+)
18 BINARY_OP 9 (>>)
22 LOAD_CONST 2 (1)
24 BINARY_OP 1 (&)
28 RETURN_VALUE 0
Some important information of LOAD_*
from docs:
LOAD_CONST(consti)
Pushes
co_consts[consti]
onto the stack.
LOAD_NAME(namei)
LOAD_NAME [namei]
Pushesco_names[x // 2]
onto the stack.
LOAD_FAST(var_num)
Pushes a reference to the local
co_varnames[var_num]
onto the stack.
LOAD_METHOD(namei)
LOAD_METHOD x
Pushesco_names[x]
onto the stack.
By recognize print ()
in main()
:
0 RESUME 0
2 PUSH_NULL
4 LOAD_NAME 0 (print)
16 LOAD_CONST 1 ('Pass 3 tests to prove your worth!')
18 PRECALL 1
22 CALL 1
32 POP_TOP 0
We got:
code | op |
---|---|
0xB9 | LOAD_NAME |
0xBF | PRECALL |
0x8E | CALL |
0x01 | POP_TOP |
0x50 | PUSH_NULL |
After recover all functions, we got:
code | op | |
---|---|---|
0x7b | JUMP_BACKWARD | |
0xb6 | FOR_ITER | |
0x1f | GET_ITER | |
0x8d | STORE_FAST | |
0x8f | UNPACK_SEQUENCE | |
0x62 | LOAD_FAST | |
0xc6 | BINARY_OP | (+, &, //, «, @, *, %, |, **, », -, /, ^, +=) |
0x74 | BUILD_LIST | |
0x68 | COMPARE_OP | (’<’, ‘<=’, ‘==’, ‘!=’, ‘>’, ‘>=’) |
0xba | POP_JUMP_FORWARD_IF_FALSE | |
0x5d | JUMP_FORWARD | |
0x92 | LOAD_METHOD | |
0x88 | BUILD_SLICE | |
0xaf | POP_JUMP_BACKWARD_IF_TRUE | |
0x82 | BUILD_TUPLE | |
0x75 | CONTAINS_OP | |
0xb1 | BUILD_CONST_KEY_MAP |
There’s still some opcode I didn’t recognize, but that doesn’t affect our decompilation anymore.
The decompilation result (by hand) was:
import sys
import time
import random
def ks(seed):
random.seed(seed)
while 1:
yield (random.randint(0, 255) * 13 + 17) % 256
def cry(s, seed):
r = []
for x, y in zip(ks(seed), s):
r.append(x ^ y)
bytes(r)
def fail(s):
print (s)
print ("Thanks for playing!")
sys.exit(0)
def game1():
def w(m, i, j):
return (m >> (i * 10 + j)) & 1
m = 1267034045110727999721745963007
fuel = 8
x, y = (1, 1)
stops = set([(5, 4), (3, 3)])
log = ''
print("Fuels:", fuel)
for i in range(10):
s = ''
for j in range(10):
if w(m, i, j):
s += '🧱'
elif (j, i) == (x, y):
s += '🚓'
elif (j, i) == (8, 8):
s += '🏁'
elif (j, i) in stops:
s += '⛽️'
else:
s += ' '
print (s)
inp = input().strip()
for c in inp:
log += c
if c not in 'wasd':
fail('Nope!')
elif fuel == 0:
fail('Empty...')
dx, dy = {'w':(0, -1), 's':(0, 1), 'a':(-1, 0), 'd':(1, 0)}[c]
x += dx
y += dy
if w(m, y, x):
fail('Crash!')
fuel -= 1
if (x, y) in stops:
stops.remove((x, y))
fuel += 15
elif (x, y) == (8, 8):
print ('Nice!')
return log
def game2():
print ('Math quiz time!')
qs = (('sum', 12, 5), ('difference', 45, 14), ('product', 8, 9), ('ratio', 18, 6), ('remainder from division', 23, 7))
log = '_'
for q in qs:
print ('What is the %s of %d and %d?' % q)
x, a, b = q
if x == 'sum':
r = a + b
elif x == 'difference':
r = a - b
elif x == 'product':
r = a * b
elif x == 'ratio':
r = a // b
elif x == 'remainder from division':
r = a % b
else:
print ('What?')
inp = int(input())
if inp == r:
print ('Correct!')
log += str(inp) + '_'
else:
print ('Wrong!')
return log
#_17_31_72_3_2_
def game3():
print ('Speed typing game.')
t = time.time()
text = '''
Text: Because of its performance advantage, today many language implementations
execute a program in two phases, first compiling the source code into bytecode,
and then passing the bytecode to the virtual machine.
'''
words = text.split()
it = 1
log = '_'
while it != len(words):
print ('%0.2f seconds left.' % (20 - (time.time() - t)))
print ('\x1b[32m', ' '.join(words[:it]), '\x1b[39m ', words[it])
inp = input()
if time.time() > t + 20:
fail('Too slow!')
if inp == words[it]:
log += words[it].upper() + '_'
it += 1
else:
fail('You made a mistake!')
print ('Nice!')
return log
#_BECAUSE_OF_ITS_PERFORMANCE_ADVANTAGE,_TODAY_MANY_LANGUAGE_IMPLEMENTATIONS_EXECUTE_A_PROGRAM_IN_TWO_PHASES,_FIRST_COMPILING_THE_SOURCE_CODE_INTO_BYTECODE,_AND_THEN_PASSING_THE_BYTECODE_TO_THE_VIRTUAL_MACHINE._
def main():
print('Pass 3 tests to prove your worth!')
seed = 'seed:'
seed += game1() + ':'
print (seed)
seed += game2() + ':'
print (seed)
seed += game3()
print (seed)
print ()
print ("You can drive to work, know some maths and can type fast. You're hired!")
print('Your sign-on bonus:', cry(b'\xa0?n\xa5\x7f)\x1f6Jvh\x95\xcc!\x1e\x95\x996a\x11\xf6OV\x88\xc1\x9f\xde\xb50\x9d\xae\x14\xde\x18YHI\xd8\xd5\x90\x8a\x181l\xb0\x16^O;]', seed).decode())
SOLVE!
import random
def ks(seed):
random.seed(seed)
while 1:
yield (random.randint(0, 255) * 13 + 17) % 256
def cry(s, seed):
r = []
for x, y in zip(ks(seed), s):
r.append(x ^ y)
bytes(r)
return r
seed = 'seed:sssddwwddwddsssdssaaawwssaaaassddddddd:_17_31_72_3_2_:_BECAUSE_OF_ITS_PERFORMANCE_ADVANTAGE,_TODAY_MANY_LANGUAGE_IMPLEMENTATIONS_EXECUTE_A_PROGRAM_IN_TWO_PHASES,_FIRST_COMPILING_THE_SOURCE_CODE_INTO_BYTECODE,_AND_THEN_PASSING_THE_BYTECODE_TO_THE_VIRTUAL_MACHINE._'
print('Your sign-on bonus:', cry(b'\xa0?n\xa5\x7f)\x1f6Jvh\x95\xcc!\x1e\x95\x996a\x11\xf6OV\x88\xc1\x9f\xde\xb50\x9d\xae\x14\xde\x18YHI\xd8\xd5\x90\x8a\x181l\xb0\x16^O;]', seed))
Afterword
After failed to solve adamd in DEFCON 30 QUALS, I finally solved a Python 3.11 challenge :D