BCACTF 3.0 2022

a Legit proteCtor of flaGs (175 points; 30 solves)

const v0 = process.env.FLAG; // example: bcactf{example flag}

function* v1(v2, v3, v4) {
    while (true) {
        v2 = (v3 * v2 + v4) % 65536
        yield v2;
    }
}

const v5 = ~~(Math.random() * 65536)
const v6 = ~~(Math.random() * 65536)
const v7 = v1(v5, v6, 17483);

const v8 = new TextEncoder().encode(v0);
const v9 = new Uint16Array(v8.buffer);

for (let va = 0; va < v9.length; ++va) {
    var v = v7.next().value;
    v9[va] ^= v;
}

let vb = "";

for (const vc of v8) vb += vc.toString(16).padStart(2, "0");

console.log(vb);

The challenge contains the hint: Challenge names are not random, what is this program doing to generate these numbers?

Understanding the process

Essentially, what the program.js script is doing is:

  1. Generating random numbers using a Linear Congruence Generator, with the increment value being shown to us, but not the seed and the multiplier
  2. Converting the flag into a byte array, and then feeding the BUFFER itself into a Uint16Array instance
  3. XORing each character of the flag with a random number < 65536
  4. Combining each result into a giant hexadecimal string, which is the encrypted output we’re provided

Now, before we get into the specifics, I would like to discuss why it is important that I emphasized the fact that the buffer itself of the flag bytes array is being placed into the Uint16Array. To explain why, I would like to demonstrate the process in a node terminal, with our test flag bcactf{test_flag}

It all looks normal. 98 is the ascii value “b”, 99 is the ascii value “c”, etc. But, then we go to actually make the Uint16Array with the byte buffer, and…

What’s this? What are these numbers??

Well, let’s take a look at the first number’s hex value in a python shell. As you can see, the numbers aren’t actually anything new or interesting; they’re just pairs of bytes from the flag, but concatenated together and then converted to an integer. The low order is the char that came first, and the high order is the char that came after.

So, really, the plaintext we’re encrypting with random numbers is actually not each byte of the flag, but rather an integer representing two bytes of the flag.

Decrypting

My AP Computer Science type solution script is already pretty self-explanatory, but I’d just like to provide a quick TL;DR to the faint of heart before they collapse at the sheer size of the script:

  1. In order to retrieve “m” for the LCG in order to generate the keys used for encryption, we will simply brute-force it. 65536 is well-within bruteforceable range.
  2. When we figure out the ciphertext for the first portion of the flag, we will XOR it with 0x6362 (the flag will always start with this value, due to the above). This will grant us the state of the LCG, which assuming our “m” is correct, will allow us to successfully generate the rest of the keys from there.

Due to the fact that the ciphertext has no clear seperation between each encrypted char, I ended up having the program bruteforce each possible length of each character ciphertext, and then running the decryption on each length possibility. In each trial, I would XOR with the key for that current ciphertext, then strain out the flag bytes via the high-order low-order thing.

In the event that there were multiple possibilities, I was too mentally wounded to bother having the program solve it. So I ended up just making it ask me what I though was the most logical, and I’d see on the next iteration if my guess was correct.

But regardless, here’s big bertha:

import string

#define LCG params
n = 65536
increment = 17483

#simple LCG
class LCG():
    def __init__(self, state, m):
        self.state = state
        self.m = m

    def generate(self):
        self.state = (self.state*m + increment)%n
        return self.state

#define variables
#since Uint16Array was used in the process of encryption: the plaintext flag length is a multiple of 2
encrypted_flag = "f31824e8f96ea624ce93587f893d8c20da4fc92c8ed3fb898ca9ef0fe2898ddd848fdfc1"
#encrypted_flag = "92e1fac490f52c44dd17dae90d77" #test flag

#simple function to make reading the encrypted_flag easier
def read_ct(offset, bytes=1):
    #get the data
    #return none if we exhausted all the flag bytes
    data = encrypted_flag[offset:offset+bytes]
    if data == "":
        return None

    #swap the high order and low order bytes
    high_order = data[2:]
    low_order = data[:2]
    data = f"{high_order}{low_order}"

    #fix up the offset and return the portion of the flag
    offset += bytes
    return int(data,16)

#decrypt function
def decrypt(offset, bytes, key):
    #read from the ciphertext at the provided offset and length
    #return none if the reading returns none
    data = read_ct(offset, bytes)
    if data == None:
        return None

    #XOR this data with the key
    pt = data ^ key

    #because the program.js file used a Uint16Array() with the flag buffer as the input, each plaintext is actually two hex bytes, concatenated as an integer
    #these bytes are in reverse order (i.e: higher order bytes are the second byte, lower order bytes are the first byte)
    #so, for instance, 25442 == 0x6362 == 0x63 0x62 == 0x62 0x63 == "bc" (the start of the flag)
    pt_formatted = hex(pt)[2:]
    first_byte = pt_formatted[2:]
    second_byte = pt_formatted[:2]

    #both bytes should exist. if they don't, our data is likely not part of the flag
    if first_byte == "" or second_byte == "":
        return None

    #convert each byte to ascii
    charify = lambda b: chr(int(b, 16))
    try:
        #verify the char is printable
        returned_data = ""
        for pt_byte in [charify(first_byte), charify(second_byte)]:
            if pt_byte not in string.ascii_letters+string.digits+string.punctuation:
                raise Exception("not printable; invalid char")
            else:
                returned_data += pt_byte

        #return the decrypted data
        return returned_data
    except:
        return None

#generates candidates in the form (NEW_OFFSET_AFTER_CANDIDATE, CANDIDATE)
def generate_candidates(offset: int, key: int, target_text=None):
    #generate the key and make some candidates
    candidates = [(offset+i, decrypt(offset, i, key)) for i in range(1,6)]

    #if there is no target text, simply return the candidates
    if target_text == None:
        return candidates

    #otherwise, try to find the target text
    #only return if it is present
    else:
        for candidate_len, candidate in candidates:
            #check if the candidate is equal to the target text
            #if it is, return it
            if candidate == target_text:
                return (candidate_len, candidate)

def validate_m(m) -> tuple[LCG, int]:
    #bruteforce
    for first_char_len in range(1,6):
        #get the ciphertext and XOR with known plaintext (0x6362) and generate the LCG
        ct = read_ct(0, first_char_len)
        ct ^= 0x6362
        lcg = LCG(ct,m)

        #now that we have the first char, we will check if the next candidates are ["ac", "tf"], spelling (in total): "bcactf"
        offset = first_char_len
        loop_terminated = False
        for next_chars in ["ac", "tf"]:
            #generate the key and the candidate
            key = lcg.generate()
            candidate = generate_candidates(offset, key, next_chars)

            #if the candidate is none, we terminate the for loop. this shouldn't happen in ideal circumstances
            if candidate == None:
                loop_terminated = True
                break

            #set the offset to the candidate length
            offset = candidate[0]

        #if the loop was not terminated and we managed to make it here, we have ourselves a winner for "m"!
        if not loop_terminated:
            return (lcg, offset)

#we are going to bruteforce m (it's small, we don't know it, and m < n)
for m in range(n):
    #notify
    print(f"\rLocating m:{m}...", end="")

    #check if "m" is valid. if it isn't, the statement below will return "none", and we will continue
    validation_result = validate_m(m)
    if validation_result == None:
        continue

    #we now have an LCG instance ready to generate the next key, and the offset for when we fetch the next candidates
    flag = "bcactf"
    offset = validation_result[1]
    lcg = validation_result[0]
    print(f"\nOffset: {offset}")

    #now that we have "m", we have to manually (by hand) distinguish what the next possible characters might be
    #ain't nothing like the power of human language recognition, ey?
    while not flag.endswith("}"):
        #generate the candidates
        candidates = {new_offset:candidate for new_offset,candidate in generate_candidates(offset, lcg.generate()) if candidate != None}

        #if the candidates are only of a length 1, then we can just do it without needing user interaction
        if len(candidates) == 1:
            value_to_add = list(candidates.items())[0]
            offset = value_to_add[0]
            flag += value_to_add[1]
            continue

        #we need user interaction now
        print("Next sequence?")
        for new_offset, candidate in candidates.items():
            print(f"[{new_offset}]: {flag}{candidate}")

        #wait for input
        while True:
            try:
                #get the number the user entered
                user_in = int(input("> ").strip())

                #get from the candidates dictionary and set accordingly
                flag += candidates[user_in]
                offset = user_in

                #break the loop
                break
            except Exception as e:
                print(f"{type(e).__name__}: {e}")
                pass

    #print the flag
    print(flag)
    exit()

bcactf{g3neR4t1ng_th3_FU1UR3_50a93b}

Leave a Reply