Department of Computer Science and Technology

Security Group

SHA3-32bit dataset: building templates for recovering SHA-3 inputs

The SHA3-32bit dataset contains recordings of the power-supply current changes of the 32-bit processor STM32F303RCT7, which has one ARM Cortex-M4 core, on a ChipWhisperer-Lite (CW-Lite) board. We used an NI PXIe-5160 10-bit oscilloscope, which can sample at 2.5 GS/s into 2 GB of sampling memory, and an NI PXIe-5423 wave generator, as an external clock signal source, to supply the target board with a 5 MHz square wave signal.

More details of the attack are described in the following paper:

Because the datasets are quite large, we provide only part of the data here as an example. If you are interested in access to the full data, please contact us by email.

Source code

The source code of the SHA-3 implementation on CW-Lite is available below.

Recording scripts

The scripts to control the recording platform are available below.

The reference trace, which is the average of 1600 recorded raw traces:

A raw trace contains 7500000 samples (Raw[0] to Raw[7499999]), covering 15000 clock cycles at 500 samples per clock cycle. In the later experiments, we defined the first clock cycle (Clock[0]) as the 500 samples from Raw[75455] to Raw[75954], where Clock[0][0] = Raw[75455], and we used the samples in the 14500 clock cycles starting from Clock[0].
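The indexing above can be sketched in NumPy as follows. This is only an illustration, not one of the recording or processing scripts: the function name `split_into_cycles` and the synthetic `raw` array are assumptions made for the example, while the constants come from the description above.

```python
import numpy as np

SAMPLES_PER_CYCLE = 500   # 2.5 GS/s sampling rate / 5 MHz clock
FIRST_SAMPLE = 75455      # Raw index of Clock[0][0], as defined above
NUM_CYCLES = 14500        # clock cycles used in the later experiments


def split_into_cycles(raw):
    """Return an array `cycles` with cycles[x][i] == Clock[x][i]."""
    raw = np.asarray(raw)
    end = FIRST_SAMPLE + NUM_CYCLES * SAMPLES_PER_CYCLE
    if raw.shape[0] < end:
        raise ValueError("raw trace is too short")
    return raw[FIRST_SAMPLE:end].reshape(NUM_CYCLES, SAMPLES_PER_CYCLE)


# Synthetic stand-in for one raw trace (hypothetical data, not from the dataset):
# each sample holds its own index, so the mapping is easy to inspect.
raw = np.arange(7_500_000, dtype=np.int32)
cycles = split_into_cycles(raw)
```

With this layout, `cycles[0][0]` corresponds to Raw[75455] and `cycles[0][499]` to Raw[75954].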

Detection dataset

We recorded traces of 16000 Keccak-f permutations for interesting clock cycle detection. The raw data are stored in 100 sets, such as:

The raw traces were later reduced to processed data, in which each clock cycle is represented by a single sample S[x] = sum(Clock[x][20], ..., Clock[x][69]), such as:
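The per-cycle reduction S[x] described above can be sketched as follows, assuming the trace has already been arranged as one row of 500 samples per clock cycle. The function name `compress_cycles` and the synthetic input are illustrative assumptions, not part of the published scripts.

```python
import numpy as np


def compress_cycles(cycles, first=20, last=69):
    """Sum samples first..last (inclusive) of every clock cycle.

    `cycles` has shape (n_cycles, samples_per_cycle); the result has one
    value S[x] per clock cycle.
    """
    cycles = np.asarray(cycles)
    # Accumulate in 64-bit integers so that 50 samples cannot overflow
    # a narrow sample type.
    return cycles[:, first:last + 1].sum(axis=1, dtype=np.int64)


# Synthetic stand-in for four clock cycles (hypothetical data)
cycles = np.arange(4 * 500).reshape(4, 500)
S = compress_cycles(cycles)
```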

With the pre-calculated intermediate_values_DN.zip (updated: 2022-01-23), the detection results of the interesting clock cycle sets are:

The interesting clock cycle sets contain the indices of the interesting clock cycles for each intermediate 32-bit word, where we use the tag "A00" to represent state \(\alpha'_{0}\), "B01" for \(\beta_{1}\), "C02" for \(\mathbf{C}_{2}\), etc.
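This page does not spell out how the interesting clock cycles are selected; the paper describes the actual procedure. Purely as an illustration, one common approach is to regress the per-cycle sample S[x] on the 32 bits of an intermediate word and keep the clock cycles with a high coefficient of determination. The sketch below assumes that approach; the function name, the threshold of 0.5, and the synthetic data are all hypothetical.

```python
import numpy as np


def r_squared_per_cycle(samples, words, n_bits=32):
    """R^2 of a linear model 'sample ~ bits of word' for every clock cycle.

    samples: (n_traces, n_cycles) array, one value per clock cycle
    words:   (n_traces,) array of intermediate 32-bit words
    """
    samples = np.asarray(samples, dtype=np.float64)
    words = np.asarray(words, dtype=np.uint64)
    shifts = np.arange(n_bits, dtype=np.uint64)
    bits = ((words[:, None] >> shifts) & np.uint64(1)).astype(np.float64)
    design = np.column_stack([np.ones(len(words)), bits])
    coef, *_ = np.linalg.lstsq(design, samples, rcond=None)
    residual = samples - design @ coef
    ss_res = (residual ** 2).sum(axis=0)
    ss_tot = ((samples - samples.mean(axis=0)) ** 2).sum(axis=0)
    return 1.0 - ss_res / ss_tot


# Synthetic demonstration (hypothetical data): only clock cycle 3 leaks
# the Hamming weight of the word; the other cycles are pure noise.
rng = np.random.default_rng(0)
words = rng.integers(0, 2**32, size=2000, dtype=np.uint64)
hamming_weight = np.array([bin(int(w)).count("1") for w in words])
samples = rng.normal(size=(2000, 5))
samples[:, 3] += hamming_weight

r2 = r_squared_per_cycle(samples, words)
interesting = np.flatnonzero(r2 > 0.5)  # hypothetical threshold
```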

Profiling dataset

We recorded traces of 64000 Keccak-f permutations to profile the templates. The raw data are stored in 400 sets, such as:

The raw traces were later processed by resampling down to 10 samples per clock cycle, such as:
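The resampling method is not specified on this page. As a minimal sketch, assuming that each group of 50 consecutive samples within a clock cycle is averaged into one output sample, the reduction from 500 to 10 samples per clock cycle could look like this (the function name `downsample_cycles` is hypothetical):

```python
import numpy as np


def downsample_cycles(cycles, samples_out=10):
    """Average equal-sized blocks of samples within each clock cycle.

    cycles: (n_cycles, samples_per_cycle) array; samples_per_cycle must be
    a multiple of samples_out (500 and 10 in the setup described above).
    """
    cycles = np.asarray(cycles, dtype=np.float64)
    n_cycles, n_in = cycles.shape
    if n_in % samples_out != 0:
        raise ValueError("samples per cycle must be a multiple of samples_out")
    block = n_in // samples_out
    return cycles.reshape(n_cycles, samples_out, block).mean(axis=2)


# Synthetic stand-in for two clock cycles (hypothetical data)
cycles = np.arange(2 * 500).reshape(2, 500)
reduced = downsample_cycles(cycles)
```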

The processed trace sets were then further processed according to the interesting clock cycle sets, by concatenating the samples of the interesting clock cycles of the target 32-bit word into trace fragments, such as:

which are for the first 32-bit word in state \(\alpha'_{0}\).
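The concatenation step can be sketched as follows, assuming the processed traces are held as an array of shape (traces, clock cycles, samples per clock cycle) and the interesting clock cycles are given as a list of indices. The function name `build_fragments` and the example indices are hypothetical.

```python
import numpy as np


def build_fragments(traces, interesting_cycles):
    """Concatenate the samples of the selected clock cycles of every trace.

    traces:             (n_traces, n_cycles, samples_per_cycle) array
    interesting_cycles: 1-D sequence of clock cycle indices
    Returns an array of shape
    (n_traces, len(interesting_cycles) * samples_per_cycle).
    """
    traces = np.asarray(traces)
    idx = np.asarray(interesting_cycles, dtype=np.intp)
    selected = traces[:, idx, :]
    return selected.reshape(traces.shape[0], -1)


# Synthetic stand-in: 2 traces, 6 clock cycles, 10 samples per clock cycle
traces = np.arange(2 * 6 * 10).reshape(2, 6, 10)
fragments = build_fragments(traces, [1, 4])  # hypothetical cycle indices
```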

The intermediate values for the target bytes are as follows:

With the trace fragments, the intermediate values, as well as the interesting clock cycle sets, we can build the templates:
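For readers unfamiliar with template building, the sketch below shows a generic Gaussian template with one mean vector per byte value and a pooled covariance matrix, followed by classification with the Mahalanobis distance. It is a textbook illustration under these assumptions and not necessarily the procedure used for the published templates; all names and data in it are hypothetical.

```python
import numpy as np


def build_templates(fragments, labels, n_classes=256):
    """Return (means, pooled_cov) estimated from labelled trace fragments."""
    fragments = np.asarray(fragments, dtype=np.float64)
    labels = np.asarray(labels)
    dim = fragments.shape[1]
    means = np.zeros((n_classes, dim))
    scatter = np.zeros((dim, dim))
    for c in range(n_classes):
        group = fragments[labels == c]
        if len(group) == 0:
            raise ValueError(f"no profiling traces for value {c}")
        means[c] = group.mean(axis=0)
        centered = group - means[c]
        scatter += centered.T @ centered
    pooled_cov = scatter / (len(fragments) - n_classes)
    return means, pooled_cov


def classify(fragment, means, pooled_cov):
    """Return the candidate value with the smallest Mahalanobis distance."""
    diff = means - np.asarray(fragment, dtype=np.float64)
    solved = np.linalg.solve(pooled_cov, diff.T).T
    distances = np.einsum("ij,ij->i", diff, solved)
    return int(np.argmin(distances))


# Synthetic demonstration data (hypothetical; not taken from the dataset)
rng = np.random.default_rng(1)
n_classes, per_class, dim = 256, 16, 8
true_means = rng.normal(scale=10.0, size=(n_classes, dim))
labels = np.repeat(np.arange(n_classes), per_class)
fragments = true_means[labels] + rng.normal(scale=0.1, size=(labels.size, dim))

means, pooled_cov = build_templates(fragments, labels)
probe = true_means[42] + rng.normal(scale=0.1, size=dim)
predicted = classify(probe, means, pooled_cov)
```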

Testing dataset (SHA3-512)

In our experiments, we tested all SHA-3 and SHAKE functions with inputs that can be absorbed within one or two invocations of the Keccak-f permutation. As an example, we publish here the testing dataset for SHA3-512 with inputs that are absorbed in one invocation.

We recorded 1000 traces for this test. The raw data are stored in 10 zip files, such as:

As with the profiling traces, the raw traces were later processed by resampling down to 10 samples per clock cycle, and the results are stored in the following 10 zip files:

We also keep all the corresponding input and output strings of these recorded traces for checking if we correctly predicted these values:
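Such a check can be sketched with Python's standard `hashlib` module: a predicted input is correct if hashing it with SHA3-512 reproduces the recorded output. The sketch assumes that the recorded output is available as a hexadecimal string, which may differ from the actual file format; the function names and example values are hypothetical.

```python
import hashlib

# SHA3-512 absorbs 1600 - 2 * 512 = 576 bits = 72 bytes per invocation.
SHA3_512_RATE_BYTES = 72


def fits_in_one_invocation(message: bytes) -> bool:
    """True if the padded message fits into a single 72-byte block.

    The padding needs at least one byte, so at most 71 message bytes fit.
    """
    return len(message) < SHA3_512_RATE_BYTES


def prediction_is_correct(predicted_input: bytes, recorded_output_hex: str) -> bool:
    """Compare SHA3-512(predicted_input) with the recorded output."""
    digest = hashlib.sha3_512(predicted_input).hexdigest()
    return digest == recorded_output_hex.strip().lower()


# Hypothetical example values (not taken from the dataset)
recorded_input = bytes(range(32))
recorded_output = hashlib.sha3_512(recorded_input).hexdigest()
wrong_guess = bytes(32)

ok = prediction_is_correct(recorded_input, recorded_output)
bad = prediction_is_correct(wrong_guess, recorded_output)
```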

With these processed traces and these corresponding data, as well as the interesting clock cycle sets and the templates, we can finish this test with the following code:

A faster version of Python code

We have optimized our template building and testing Python code as follows:

This optimization mostly takes advantage of the parallelization in the NumPy library to reduce the computing time. We executed this optimized version on a 32-core server with 256 GB of memory.

Note that the new templates and testing results will be a little different from the original version we have published in our paper due to some precision issues related to floating-point numbers. However, the differences are not statistically significant, and we publish the results of the new version here: