CPU SHA-3: what is it?
SHA-3 (Secure Hash Algorithm 3) is the latest member of the Secure Hash Algorithm family of standards, released by NIST on August 5, 2015.[4][5] Although part of the same series of standards, SHA-3 is internally quite different from the MD5-like structure of SHA-1 and SHA-2.
SHA-3 is a subset of the broader cryptographic primitive family Keccak (/ˈkɛtʃæk, -ɑːk/),[6][7] designed by Guido Bertoni, Joan Daemen, Michaël Peeters, and Gilles Van Assche, building upon RadioGatún. Keccak’s authors have proposed additional uses for the function, not (yet) standardized by NIST, including a stream cipher, an authenticated encryption system, a "tree" hashing scheme for faster hashing on certain architectures,[8][9] and AEAD ciphers Keyak and Ketje.[10][11]
Keccak is based on a novel approach called sponge construction.[12] Sponge construction is based on a wide random function or random permutation, and allows inputting ("absorbing" in sponge terminology) any amount of data, and outputting ("squeezing") any amount of data, while acting as a pseudorandom function with regard to all previous inputs. This leads to great flexibility.
NIST does not currently plan to withdraw SHA-2 or remove it from the revised Secure Hash Standard. The purpose of SHA-3 is that it can be directly substituted for SHA-2 in current applications if necessary, and to significantly improve the robustness of NIST’s overall hash algorithm toolkit.[13]
History
The Keccak algorithm is the work of Guido Bertoni, Joan Daemen (who also co-designed the Rijndael cipher with Vincent Rijmen), Michaël Peeters, and Gilles Van Assche. It is based on the earlier hash function designs PANAMA and RadioGatún. PANAMA was designed by Daemen and Craig Clapp in 1998. RadioGatún, a successor of PANAMA, was designed by Daemen, Peeters, and Van Assche, and was presented at the NIST Hash Workshop in 2006.[14] The reference implementation source code was dedicated to the public domain via a CC0 waiver.[15]
In 2006 NIST started to organize the NIST hash function competition to create a new hash standard, SHA-3. SHA-3 is not meant to replace SHA-2, as no significant attack on SHA-2 has been demonstrated. Because of the successful attacks on MD5, SHA-0 and SHA-1,[16] NIST perceived a need for an alternative, dissimilar cryptographic hash, which became SHA-3.
After a setup period, submissions were due by the end of 2008. Keccak was accepted as one of the 51 candidates. In July 2009, 14 algorithms were selected for the second round. Keccak advanced to the last round in December 2010.[17]
During the competition, entrants were permitted to "tweak" their algorithms to address issues that were discovered. Changes that have been made to Keccak are:[18][19]
- The number of rounds was increased from 12 + ℓ to 12 + 2ℓ to be more conservative about security.
- The message padding was changed from a more complex scheme to the simple 10*1 pattern described below.
- The rate r was increased to the security limit, rather than rounding down to the nearest power of 2.
On October 2, 2012, Keccak was selected as the winner of the competition.[6]
In 2014, NIST published a draft of FIPS 202, "SHA-3 Standard: Permutation-Based Hash and Extendable-Output Functions".[20] FIPS 202 was approved on August 5, 2015.[21]
On August 5, 2015 NIST announced that SHA-3 had become a hashing standard.[22]
Design
Figure: The sponge construction for hash functions. Pi are input blocks, Zi are hashed output blocks. The unused "capacity" c should be twice the desired resistance to collision or preimage attacks.
SHA-3 uses the sponge construction,[12][23] in which data is "absorbed" into the sponge, then the result is "squeezed" out. In the absorbing phase, message blocks are XORed into a subset of the state, which is then transformed as a whole using a permutation function f. In the "squeeze" phase, output blocks are read from the same subset of the state, alternated with the state transformation function f. The size of the part of the state that is written and read is called the "rate" (denoted r), and the size of the part that is untouched by input/output is called the "capacity" (denoted c). The capacity determines the security of the scheme. The maximum security level is half the capacity.
Given an input bit string N, a padding function pad, a permutation function f that operates on bit blocks of width b, a rate r and an output length d, we have capacity c = b − r and the sponge construction Z = sponge[f,pad,r](N,d), yielding a bit string Z of length d, works as follows:[24]:18
- pad the input N using the pad function, yielding a padded bit string P with a length divisible by r (such that n = len(P)/r is integer),
- break P into n consecutive r-bit pieces P0, ..., Pn−1,
- initialize the state S to a string of b 0 bits,
- absorb the input into the state: for each block Pi,
  - extend Pi at the end by a string of c 0 bits, yielding one of length b,
  - XOR that with S, and
  - apply the block permutation f to the result, yielding a new state S,
- initialize Z to be the empty string,
- while the length of Z is less than d:
  - append the first r bits of S to Z,
  - if Z is still less than d bits long, apply f to S, yielding a new state S,
- truncate Z to d bits.
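The absorb and squeeze steps above can be sketched in a few lines of Python. This is a minimal illustration working on whole bytes rather than bits; `toy_f` and `toy_pad` are hypothetical stand-ins (a trivially invertible byte mixer and a byte-level 10*1 padding), not Keccak-f or the FIPS 202 padding, so the output is not a secure hash.

```python
def toy_f(state: bytes) -> bytes:
    """Hypothetical toy permutation (NOT Keccak-f): invertible, but not secure."""
    mixed = bytes((x * 5 + i) % 256 for i, x in enumerate(state))
    return mixed[1:] + mixed[:1]  # rotate the bytes by one position


def toy_pad(message: bytes, r_bytes: int) -> bytes:
    """Byte-level 10*1 padding: first pad bit set, zeros, last pad bit set."""
    pad_len = r_bytes - (len(message) % r_bytes)
    padding = bytearray(pad_len)
    padding[0] |= 0x01
    padding[-1] |= 0x80
    return message + bytes(padding)


def sponge(f, pad, r_bytes: int, b_bytes: int, message: bytes, d_bytes: int) -> bytes:
    """Z = sponge[f, pad, r](N, d), with all sizes given in bytes."""
    c_bytes = b_bytes - r_bytes
    padded = pad(message, r_bytes)
    state = bytes(b_bytes)  # S starts as b zero bits
    # Absorbing phase: XOR each zero-extended block into S, then permute.
    for offset in range(0, len(padded), r_bytes):
        block = padded[offset:offset + r_bytes] + bytes(c_bytes)
        state = f(bytes(s ^ p for s, p in zip(state, block)))
    # Squeezing phase: emit r bytes at a time, permuting between outputs.
    z = b""
    while len(z) < d_bytes:
        z += state[:r_bytes]
        if len(z) < d_bytes:
            state = f(state)
    return z[:d_bytes]  # truncate Z to d


digest = sponge(toy_f, toy_pad, r_bytes=4, b_bytes=8, message=b"abc", d_bytes=10)
print(digest.hex())
```

Because squeezing is deterministic, asking for a shorter output from the same input yields a prefix of the longer one.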
The fact that the internal state S contains c additional bits of information in addition to what is output to Z prevents the length extension attacks that SHA-2, SHA-1, MD5 and other hashes based on the Merkle–Damgård construction are susceptible to.
In SHA-3, the state S consists of a 5 × 5 array of w = 64-bit words, b = 5 × 5 × w = 5 × 5 × 64 = 1600 bits total. Keccak is also defined for smaller power-of-2 word sizes w down to 1 bit (25 bits total state). Small state sizes can be used to test cryptanalytic attacks, and intermediate state sizes (from w = 8, 200 bits, to w = 32, 800 bits) can be used in practical, lightweight applications.[10][11]
For the SHA3-224, SHA3-256, SHA3-384, and SHA3-512 instances, r is greater than d, so there is no need for additional block permutations in the squeezing phase; the leading d bits of the state are the desired hash. However, SHAKE128 and SHAKE256 allow an arbitrary output length, which is useful in applications such as optimal asymmetric encryption padding.
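As a small illustration of the arbitrary output length, Python's standard `hashlib` module exposes SHAKE128 with a caller-chosen digest size. The message below is an arbitrary example; the sketch only shows that a shorter output is a prefix of a longer one for the same input.

```python
import hashlib

message = b"example message"  # arbitrary sample input

# Request 16 bytes and 64 bytes of output from the same SHAKE128 input.
short_output = hashlib.shake_128(message).digest(16)
long_output = hashlib.shake_128(message).digest(64)

print(short_output.hex())
print(long_output.hex())
print(long_output.startswith(short_output))
```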
Padding
To ensure the message can be evenly divided into r-bit blocks, padding is required. SHA-3 uses the pattern 10*1 in its padding function: a 1 bit, followed by zero or more 0 bits (maximum r − 1) and a final 1 bit.
The maximum of r − 1 0 bits occurs when the last message block is r − 1 bits long. Then another block is added after the initial 1 bit, containing r − 1 0 bits before the final 1 bit.
The two 1 bits will be added even if the length of the message is already divisible by r.[24]:5.1 In this case, another block is added to the message, containing a 1 bit, followed by a block of r − 2 0 bits and another 1 bit. This is necessary so that a message with length divisible by r ending in something that looks like padding does not produce the same hash as the message with those bits removed.
The initial 1 bit is required so messages differing only in a few additional 0 bits at the end do not produce the same hash.
The position of the final 1 bit indicates which rate r was used (multi-rate padding), which is required for the security proof to work for different hash variants. Without it, different hash variants of the same short message would be the same up to truncation.
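The padding rule can be written directly on bit strings. The sketch below represents bits as a Python list of 0/1 integers (a representation chosen here for clarity, not efficiency) and uses a small rate r = 8 so that the edge cases described above are easy to see.

```python
from typing import List


def pad10star1(message_bits: List[int], r: int) -> List[int]:
    """Append 1, then the minimum number of 0 bits, then 1, so len % r == 0."""
    zeros = (-len(message_bits) - 2) % r
    return message_bits + [1] + [0] * zeros + [1]


r = 8  # toy rate, for illustration only
for length in (6, 7, 8):
    padded = pad10star1([0] * length, r)
    zeros_added = len(padded) - length - 2
    print(length, len(padded), zeros_added)
```

With r = 8, a 6-bit message gets no 0 bits, a 7-bit message (r − 1) gets the maximum of r − 1 zeros and spills into a second block, and an 8-bit message gets a full extra block containing r − 2 zeros.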
The block permutation
The block transformation f, which is Keccak-f[1600] for SHA-3, is a permutation that uses XOR, AND, and NOT operations, and is designed for easy implementation in both software and hardware.
It is defined for any power-of-two word size, w = 2^ℓ bits. The main SHA-3 submission uses 64-bit words, ℓ = 6.
The state can be considered to be a 5 × 5 × w array of bits. Let a[i][j][k] be bit (5i + j) × w + k of the input, using a little-endian bit numbering convention and row-major indexing. That is, i selects the row, j the column, and k the bit.
Index arithmetic is performed modulo 5 for the first two dimensions and modulo w for the third.
The basic block permutation function consists of 12 + 2ℓ rounds of five steps:
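- θ (theta): Compute the parity of each of the 5w (320, when w = 64) 5-bit columns, and exclusive-or that into two nearby columns in a regular pattern. To be precise, a[i][j][k] ← a[i][j][k] ⊕ parity(a[0...4][j−1][k]) ⊕ parity(a[0...4][j+1][k−1])
- ρ (rho): Bitwise rotate each of the 25 words by a different triangular number 0, 1, 3, 6, 10, 15, ....
- π (pi): Permute the 25 words in a fixed pattern.
- χ (chi): Bitwise combine along rows, using x ← x ⊕ (¬y & z). To be precise, a[i][j][k] ← a[i][j][k] ⊕ (¬a[i][j+1][k] & a[i][j+2][k]). This is the only non-linear step.
- ι (iota): Exclusive-or a round constant into one word of the state. To be precise, in round n, for 0 ≤ m ≤ ℓ, a[0][0][2^m − 1] is exclusive-ORed with bit m + 7n of a degree-8 LFSR sequence. This breaks the symmetry that is preserved by the other steps.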
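To make the round structure concrete, here is a compact Python sketch of Keccak-f[1600] together with a SHA3-256 wrapper, checked against `hashlib`. It is an illustration rather than a reference implementation: it stores the state as 64-bit lanes indexed `lanes[x][y]` (a layout chosen for this sketch, which differs from the a[i][j][k] notation above), fuses the rotation and word-permutation steps into one loop, generates the round constants from the LFSR on the fly, and assumes byte-aligned input.

```python
import hashlib

MASK64 = (1 << 64) - 1


def rol64(value: int, n: int) -> int:
    """Rotate a 64-bit word left by n bits."""
    n %= 64
    return ((value << n) | (value >> (64 - n))) & MASK64


def keccak_f1600(state: bytearray) -> bytearray:
    """Apply the 24-round permutation to a 200-byte state."""
    lanes = [[int.from_bytes(state[8 * (x + 5 * y):8 * (x + 5 * y) + 8], "little")
              for y in range(5)] for x in range(5)]
    lfsr = 1
    for _ in range(24):
        # Column parities XORed into neighbouring columns.
        c = [lanes[x][0] ^ lanes[x][1] ^ lanes[x][2] ^ lanes[x][3] ^ lanes[x][4]
             for x in range(5)]
        d = [c[(x + 4) % 5] ^ rol64(c[(x + 1) % 5], 1) for x in range(5)]
        lanes = [[lanes[x][y] ^ d[x] for y in range(5)] for x in range(5)]
        # Rotate each word by a triangular number and move it to its new position.
        x, y = 1, 0
        current = lanes[x][y]
        for t in range(24):
            x, y = y, (2 * x + 3 * y) % 5
            current, lanes[x][y] = lanes[x][y], rol64(current, (t + 1) * (t + 2) // 2)
        # Non-linear step: x <- x XOR (NOT y AND z) along each row.
        for y in range(5):
            row = [lanes[x][y] for x in range(5)]
            for x in range(5):
                lanes[x][y] = row[x] ^ (~row[(x + 1) % 5] & row[(x + 2) % 5])
        # Round constant from a degree-8 LFSR, XORed into lane (0, 0).
        for j in range(7):
            lfsr = ((lfsr << 1) ^ ((lfsr >> 7) * 0x71)) % 256
            if lfsr & 2:
                lanes[0][0] ^= 1 << ((1 << j) - 1)
    out = bytearray(200)
    for x in range(5):
        for y in range(5):
            out[8 * (x + 5 * y):8 * (x + 5 * y) + 8] = lanes[x][y].to_bytes(8, "little")
    return out


def sha3_256(message: bytes) -> bytes:
    """SHA3-256 for byte-aligned input: rate 136 bytes, suffix bits 01, pad10*1."""
    rate = 136
    padded = bytearray(message) + bytes(rate - len(message) % rate)
    padded[len(message)] ^= 0x06  # suffix 01 followed by the first padding 1 bit
    padded[-1] ^= 0x80            # final padding 1 bit
    state = bytearray(200)
    for offset in range(0, len(padded), rate):
        for i in range(rate):
            state[i] ^= padded[offset + i]
        state = keccak_f1600(state)
    return bytes(state[:32])


print(sha3_256(b"abc") == hashlib.sha3_256(b"abc").digest())
```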
Speed
The speed of SHA-3 hashing of long messages is dominated by the computation of f = Keccak-f[1600] and XORing S with the extended Pi, an operation on b = 1600 bits. However, since the last c bits of the extended Pi are 0 anyway, and XOR with 0 is a noop, it is sufficient to perform XOR operations only for r bits (r = 1600 − 2 × 224 = 1152 bits for SHA3-224, 1088 bits for SHA3-256, 832 bits for SHA3-384 and 576 bits for SHA3-512). The lower r is (and, conversely, the higher c = b − r = 1600 − r), the less efficient but more secure the hashing becomes since fewer bits of the message can be XORed into the state (a quick operation) before each application of the computationally expensive f. The authors report the following speeds for software implementations of Keccak-f[1600] plus XORing 1024 bits,[1] which roughly corresponds to SHA3-256:
- 57.4 cpb on IA-32, Intel Pentium 3[25]
- 41 cpb on IA-32+MMX, Intel Pentium 3
- 20 cpb on IA-32+SSE, Intel Core 2 Duo or AMD Athlon 64
- 12.6 cpb on a typical x86-64-based machine
- 6–7 cpb on IA-64[citation needed]
For the exact SHA3-256 on x86-64, Bernstein measures 11.7–12.25 cpb depending on the CPU.[26]:7 SHA-3 has been criticized for being slow in software: on an Intel Skylake processor clocked at 3.2 GHz, SHA2-512 is more than twice as fast as SHA3-512, and SHA-1 is more than three times as fast.[27] The authors have reacted to this criticism by suggesting the use of SHAKE128 and SHAKE256 instead of SHA3-256 and SHA3-512, at the expense of cutting the preimage resistance in half (while keeping the collision resistance). With this, performance is on par with SHA2-256 and SHA2-512. To further increase speed, the authors suggest using KangarooTwelve, a variant of Keccak with half the number of rounds (trading safety margin for speed) that uses parallelizable tree hashing to exploit the parallelism available in the processor.
However, in hardware implementations, SHA-3 is notably faster than all other finalists,[28] and also faster than SHA-2 and SHA-1.[27]
Instances
The NIST standard defines the following instances, for message M and output length d:[24]:20,23
Security strengths (collision, preimage, 2nd preimage) are given in bits.

Instance         Output size d   Rate r = block size   Capacity c   Definition                    Collision       Preimage       2nd preimage
SHA3-224(M)      224             1152                  448          Keccak[448](M || 01, 224)     112             224            224
SHA3-256(M)      256             1088                  512          Keccak[512](M || 01, 256)     128             256            256
SHA3-384(M)      384             832                   768          Keccak[768](M || 01, 384)     192             384            384
SHA3-512(M)      512             576                   1024         Keccak[1024](M || 01, 512)    256             512            512
SHAKE128(M, d)   d               1344                  256          Keccak[256](M || 1111, d)     min(d/2,128)    ≥min(d,128)    min(d,128)
SHAKE256(M, d)   d               1088                  512          Keccak[512](M || 1111, d)     min(d/2,256)    ≥min(d,256)    min(d,256)

With the following definitions
- Keccak[c](N, d) = sponge[Keccak-f[1600], pad10*1, r](N, d)[24]:20
- Keccak-f[1600] = Keccak-p[1600, 24][24]:17
- c is the capacity
- r is the rate = 1600 − c
- N is the input bit string
Note that the appended postfixes are written as bit strings, not hexadecimal digits.
The SHA-3 instances are drop-in replacements for SHA-2, with identical security claims. The SHAKE instances are so-called XOFs, Extendable-Output Functions. For example, SHAKE128(M, 256) can be used as a hash function with a 256-bit output length and 128-bit overall security.
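The relationship between output size, rate, and capacity for the four fixed-length instances can be checked against Python's `hashlib`, whose SHA-3 objects report the rate in bytes as `block_size`. The helper name `sha3_instance_parameters` is made up for this sketch.

```python
import hashlib


def sha3_instance_parameters() -> dict:
    """Return {name: (d, r, c)} in bits for the fixed-length SHA-3 instances."""
    params = {}
    for name in ("sha3_224", "sha3_256", "sha3_384", "sha3_512"):
        h = hashlib.new(name)
        d = h.digest_size * 8   # output length in bits
        r = h.block_size * 8    # rate in bits
        c = 1600 - r            # capacity = state width minus rate
        params[name] = (d, r, c)
    return params


for name, (d, r, c) in sha3_instance_parameters().items():
    print(name, d, r, c, c == 2 * d)
```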
Note that all instances append some bits to the message, the rightmost of which represent the domain separation suffix. The purpose of this is to ensure that it is not possible to construct messages that produce the same hash output for different applications of the Keccak hash function. The following domain separation suffixes exist:[24][29]
Suffix   Meaning
...0     reserved for future use
01       SHA-3
...11    RawSHAKE

RawSHAKE is the basis for the Sakura coding for tree hashing, which has not been standardized yet. However, the SHAKE suffix has been carefully chosen so that it is forward compatible with Sakura. Sakura appends 0 for a chaining hop or 1 for a message, then 10*0 for a non-final (inner) node or 1 for a final node, before it applies RawSHAKE. Sequential hashing corresponds to a hop tree with a single message node, which means that 11 is appended to the message before RawSHAKE is applied. Thus, the SHAKE XOFs append 1111 to the message, i.e., 1 for message, 1 for final node, and 11 for the RawSHAKE domain separation suffix.[29]:16
Since 10*1 padding always adds at least two bits, in byte aligned libraries there are always six unused zero bits. Therefore, these appended extra bits never make the padded message longer.
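In byte-aligned code the domain separation suffix and the first padding bit are usually folded into a single byte. The sketch below assumes the common least-significant-bit-first convention, under which the suffix 01 plus the first padding bit becomes 0x06 and the suffix 1111 plus the first padding bit becomes 0x1F; the function name `pad_bytes` is made up for this example.

```python
SHA3_SUFFIX = 0x06   # bits 0,1 (suffix 01) then 1 (start of pad10*1), LSB first
SHAKE_SUFFIX = 0x1F  # bits 1,1,1,1 (suffix 1111) then 1 (start of pad10*1)


def pad_bytes(message: bytes, rate_bytes: int, suffix: int) -> bytes:
    """Append suffix and 10*1 padding so the length is a multiple of the rate."""
    pad_len = rate_bytes - len(message) % rate_bytes
    padding = bytearray(pad_len)
    padding[0] ^= suffix   # suffix bits plus the first padding 1 bit
    padding[-1] ^= 0x80    # final padding 1 bit (top bit of the last byte)
    return message + bytes(padding)


# One byte short of a full block: suffix and final bit share one byte (0x86).
print(pad_bytes(b"a" * 135, 136, SHA3_SUFFIX)[-1:].hex())
# Exactly one full block: a whole extra block of padding is appended.
print(len(pad_bytes(b"a" * 136, 136, SHA3_SUFFIX)))
```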
Security against quantum attacks
There is a general result (Grover’s algorithm) that quantum computers can perform a structured preimage attack in √(2^d) = 2^(d/2), while a classical brute-force attack needs 2^d. A structured preimage attack implies a second preimage attack[30] and thus a collision attack. A quantum computer can also perform a birthday attack, thus break collision resistance, in ∛(2^d) = 2^(d/3)[31] (although that is disputed[32]). Noting that the maximum strength can be c/2, this gives the following upper[33] bounds on the quantum security of SHA-3:
Security strengths in bits:

| Instance | Collision (Brassard et al.) | Collision (Bernstein) | Preimage | 2nd preimage |
| SHA3-224(M) | 74⅔ | 112 | 112 | 112 |
| SHA3-256(M) | 85⅓ | 128 | 128 | 128 |
| SHA3-384(M) | 128 | 192 | 192 | 192 |
| SHA3-512(M) | 170⅔ | 256 | 256 | 256 |
| SHAKE128(M, d) | min(d/3,128) | min(d/2,128) | ≥min(d/2,128) | min(d/2,128) |
| SHAKE256(M, d) | min(d/3,256) | min(d/2,256) | ≥min(d/2,256) | min(d/2,256) |

It has been shown that the Merkle–Damgård construction, as used by SHA-2, is collapsing and, by consequence, quantum collision-resistant,[34] but for the sponge construction used by SHA-3, the authors provide proofs only for the case that the block function f is not efficiently invertible; Keccak-f[1600], however, is efficiently invertible, and so their proof does not apply.[35]
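The fixed-length rows of these bounds follow from simple arithmetic: take d/3 or d/2 and cap the result at c/2. A short sketch with exact fractions (the function name `quantum_bounds` is made up for this example) reproduces values such as 74⅔ for SHA3-224.

```python
from fractions import Fraction


def quantum_bounds(d: int, c: int) -> dict:
    """Upper bounds in bits for output length d and capacity c, capped at c/2."""
    cap = Fraction(c, 2)
    return {
        "collision_brassard": min(Fraction(d, 3), cap),
        "collision_bernstein": min(Fraction(d, 2), cap),
        "preimage": min(Fraction(d, 2), cap),
    }


for name, d, c in (("SHA3-224", 224, 448), ("SHA3-256", 256, 512),
                   ("SHA3-384", 384, 768), ("SHA3-512", 512, 1024)):
    bounds = quantum_bounds(d, c)
    print(name, {key: str(value) for key, value in bounds.items()})
```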
Capacity change controversy
In February 2013 at the RSA Conference, and then in August 2013 at CHES, NIST announced they would select different values for the capacity, i.e. the security parameter, for the SHA-3 standard, compared to the submission.[36][37] The changes caused some turmoil.
The hash function competition called for hash functions at least as secure as the SHA-2 instances. It means that a d-bit output should have d/2-bit resistance to collision attacks and d-bit resistance to preimage attacks, the maximum achievable for d bits of output. Keccak’s security proof allows an adjustable level of security based on a "capacity" c, providing c/2-bit resistance to both collision and preimage attacks. To meet the original competition rules, Keccak’s authors proposed c=2d. The announced change was to accept the same d/2-bit security for all forms of attack and standardize c=d. This would have sped up Keccak by allowing an additional d bits of input to be hashed each iteration. However, the hash functions would not have been drop-in replacements with the same preimage resistance as SHA-2 anymore; it would have been cut in half, making it vulnerable to advances in quantum computing, which effectively would cut it in half once more.[30]
In September 2013, Daniel J. Bernstein suggested on the NIST hash-forum mailing list[38] to strengthen the security to the 576-bit capacity that was originally proposed as the default Keccak, in addition to and not included in the SHA-3 specifications.[39] This would have provided at least a SHA3-224 and SHA3-256 with the same preimage resistance as their SHA-2 predecessors, but SHA3-384 and SHA3-512 would have had significantly less preimage resistance than theirs. In late September, the Keccak team responded by stating that they had proposed 128-bit security by setting c = 256 as an option already in their SHA-3 proposal.[40] Although the reduced capacity was justifiable in their opinion, in the light of the negative response, they proposed raising the capacity to c = 512 bits for all instances. This would be as much as any previous standard up to the 256-bit security level, while providing reasonable efficiency,[41] but not the 384/512 bit preimage resistance offered by SHA2-384/512. The authors tried to justify that with the claim that "claiming or relying on security strength levels above 256 bits is meaningless."
In early October 2013, Bruce Schneier criticized NIST’s decision on the basis of its possible detrimental effects on the acceptance of the algorithm, saying:
There is too much mistrust in the air. NIST risks publishing an algorithm that no one will trust and no one (except those forced) will use.[42]
Paul Crowley, a cryptographer and senior developer at an independent software development company, expressed his support of the decision, saying that Keccak is supposed to be tunable and there is no reason for different security levels within one primitive. He also added:
Yes, it’s a bit of a shame for the competition that they demanded a certain security level for entrants, then went to publish a standard with a different one. But there’s nothing that can be done to fix that now, except re-opening the competition. Demanding that they stick to their mistake doesn’t improve things for anyone.[43]
There was also some confusion that internal changes were made to Keccak. The Keccak team clarified this, stating that NIST’s proposal for SHA-3 is a subset of the Keccak family, for which one can generate test vectors using their reference code submitted to the contest, and that this proposal was the result of a series of discussions between them and the NIST hash team.[44] Also, Bruce Schneier corrected his earlier statement, saying:
I misspoke when I wrote that NIST made "internal changes" to the algorithm. That was sloppy of me. The Keccak permutation remains unchanged. What NIST proposed was reducing the hash function’s capacity in the name of performance. One of Keccak’s nice features is that it’s highly tunable.[42]
In response to the controversy, in November 2013 John Kelsey of NIST proposed to go back to the original c = 2d proposal for all SHA-2 drop-in replacement instances.[45] These changes were confirmed in the April 2014 draft.[46] This proposal was implemented in the final release standard in August 2015.[4]
The reduced-capacity forms were published as SHAKE128 and SHAKE256, where the number indicates the security level and the number of bits of output is variable, but should be twice as large as the required collision resistance.
Examples of SHA-3 variants
The following hash values are from NIST.gov:[47]
Changing a single bit causes each bit in the output to change with 50% probability, demonstrating an avalanche effect:
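The avalanche effect can be observed with Python's `hashlib`. The two sample messages below differ in a single input bit (the final letter g versus f); the sketch counts how many of the 256 output bits of SHA3-256 differ, which should land near one half.

```python
import hashlib


def bit_difference(a: bytes, b: bytes) -> int:
    """Count the bit positions in which two equal-length byte strings differ."""
    return sum(bin(x ^ y).count("1") for x, y in zip(a, b))


message_1 = b"The quick brown fox jumps over the lazy dog"
message_2 = b"The quick brown fox jumps over the lazy dof"  # last byte differs by one bit

digest_1 = hashlib.sha3_256(message_1).digest()
digest_2 = hashlib.sha3_256(message_2).digest()

flipped = bit_difference(digest_1, digest_2)
print(digest_1.hex())
print(digest_2.hex())
print(flipped, "of 256 output bits differ")
```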
Comparison of SHA functions
In the table below, internal state means the number of bits that are carried over to the next block.
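| Algorithm and variant | Output size (bits) | Internal state size (bits) | Block size (bits) | Rounds | Operations | Security against collision attacks (bits) | Capacity against length extension attacks (bits) | Performance on Skylake, long messages (median cpb)[48] | Performance on Skylake, 8 bytes (median cpb)[48] | First published |
| MD5 (as reference) | 128 | 128 (4 × 32) | 512 | 64 | And, Xor, Rot, Add (mod 2^32), Or | ≤18 (collisions found)[49] | 0 | 4.99 | 55.00 | 1992 |
| SHA-0 | 160 | 160 (5 × 32) | 512 | 80 | And, Xor, Rot, Add (mod 2^32), Or | <34 (collisions found) | 0 | ≈ SHA-1 | ≈ SHA-1 | 1993 |
| SHA-1 | 160 | 160 (5 × 32) | 512 | 80 | And, Xor, Rot, Add (mod 2^32), Or | <63 (collisions found[50]) | 0 | 3.47 | 52.00 | 1995 |
| SHA-2: SHA-224 | 224 | 256 (8 × 32) | 512 | 64 | And, Xor, Rot, Add (mod 2^32), Or, Shr | 112 | 32 | 7.62 | 84.50 | 2004 |
| SHA-2: SHA-256 | 256 | 256 (8 × 32) | 512 | 64 | And, Xor, Rot, Add (mod 2^32), Or, Shr | 128 | 0 | 7.63 | 85.25 | 2001 |
| SHA-2: SHA-384 | 384 | 512 (8 × 64) | 1024 | 80 | And, Xor, Rot, Add (mod 2^64), Or, Shr | 192 | 128 (≤ 384) | 5.12 | 135.75 | 2001 |
| SHA-2: SHA-512 | 512 | 512 (8 × 64) | 1024 | 80 | And, Xor, Rot, Add (mod 2^64), Or, Shr | 256 | 0 | 5.06 | 135.50 | 2001 |
| SHA-2: SHA-512/224 | 224 | 512 (8 × 64) | 1024 | 80 | And, Xor, Rot, Add (mod 2^64), Or, Shr | 112 | 288 | ≈ SHA-384 | ≈ SHA-384 | 2012 |
| SHA-2: SHA-512/256 | 256 | 512 (8 × 64) | 1024 | 80 | And, Xor, Rot, Add (mod 2^64), Or, Shr | 128 | 256 | ≈ SHA-384 | ≈ SHA-384 | 2012 |
| SHA-3: SHA3-224 | 224 | 1600 (5 × 5 × 64) | 1152 | 24[51] | And, Xor, Rot, Not | 112 | 448 | 8.12 | 154.25 | 2015 |
| SHA-3: SHA3-256 | 256 | 1600 (5 × 5 × 64) | 1088 | 24[51] | And, Xor, Rot, Not | 128 | 512 | 8.59 | 155.50 | 2015 |
| SHA-3: SHA3-384 | 384 | 1600 (5 × 5 × 64) | 832 | 24[51] | And, Xor, Rot, Not | 192 | 768 | 11.06 | 164.00 | 2015 |
| SHA-3: SHA3-512 | 512 | 1600 (5 × 5 × 64) | 576 | 24[51] | And, Xor, Rot, Not | 256 | 1024 | 15.88 | 164.00 | 2015 |
| SHA-3: SHAKE128 | d (arbitrary) | 1600 (5 × 5 × 64) | 1344 | 24[51] | And, Xor, Rot, Not | min(d/2, 128) | 256 | 7.08 | 155.25 | 2015 |
| SHA-3: SHAKE256 | d (arbitrary) | 1600 (5 × 5 × 64) | 1088 | 24[51] | And, Xor, Rot, Not | min(d/2, 256) | 512 | 8.59 | 155.50 | 2015 |

Algorithms / SHA-3 hash function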
SHA-3 (Secure Hash Algorithm Version 3), also known as Keccak, is a one-way function that produces digital fingerprints of a chosen length (the standard specifies 224, 256, 384, or 512 bits) from input data of any size. It was developed by a group of authors led by Joan Daemen in 2008 and adopted as a new FIPS standard in 2015. The algorithm works by mixing its internal state and compressing it to the chosen size with a "cryptographic sponge".
The original Keccak algorithm has many tunable parameters (data block size, algorithm state size, number of rounds in the function f, and others) so that the trade-off between cryptographic strength and speed can be optimized for the target platform.
The SHA-3 version of the algorithm differs from the original Keccak in several ways:
- the slow modes c = 768 and c = 1024 were dropped in NIST's 2013 proposal (the final FIPS 202 standard kept them for SHA3-384 and SHA3-512);
- the padding algorithm was simplified;
- "extendable-output functions" (XOFs) SHAKE128 and SHAKE256 were introduced, which made it necessary to append a 2-bit or 4-bit suffix to the message being hashed, depending on the type of function.
The core of the algorithm's compression function is the function f, which mixes the internal state of the algorithm. The state A is represented as a 5×5 array of 64-bit words initialized to zero bits (the state size is 5*5*64 = 1600 bits); temporary arrays B (5*5*64 bits), C, and D (5*64 bits each) are also initialized. The function f performs 24 rounds, each of which applies a series of operations with array indices taken modulo 5 and XORs a round constant into the word A[0, 0]. In the notation of these operations:
r is the array that defines the cyclic shift amount for each word of the state;
¬x is the bitwise complement of x.
Before the compression function is applied, fragments of the input message are XORed with fragments of the state. The result is processed by the function f. This XOR step together with the compression function, performed for every block of input data, makes up the "absorbing" phase of the cryptographic sponge. The resulting hash value is computed during the "squeezing" phase of the cryptographic sponge, which is also built on the function f described above.
Hardware encryption in processors
In the modern world encryption is used almost everywhere, both to protect especially important information in specialized fields (the defense sector, banking, and so on) and inside consumer devices: computers, smartphones, and televisions. Moreover, the range of applications of cryptography widens every year, and the volume of encrypted data being transmitted and stored keeps growing.
At the same time, encrypting data significantly increases the computational load on the devices that perform it, so it is not surprising that cryptographic operations are increasingly moved to the hardware level (often as dedicated cryptographic coprocessors or expansion cards). In recent years encryption instructions have also been widely built directly into the central processors that the largest brands make for consumer PCs and mobile devices.
Use of encryption
As electronics and automation spread into every area of life, the need to protect transmitted data and to restrict access to key components grows. Practically all modern computers, tablets and smartphones, routers, "smart" household appliances, cars, and so on actively use encryption. For example, cryptographic algorithms are used:
- when connecting to most kinds of wireless data networks (Wi-Fi, Bluetooth, and others);
- in mobile communications;
- in mobile operating systems (iOS, Android), which encrypt data on the device to protect it from unauthorized access;
- for secure password storage, which requires certain cryptographic functions (so most devices that let the user set a password use cryptography);
- in bank cards, ATMs, and payment terminals, which are always cryptographically protected;
- in cryptocurrencies, which are built on cryptographic principles.
More and more organizations and individuals recognize the importance of encryption for protecting data. Thanks to the joint efforts of many companies, Google in particular, the share of web traffic encrypted with HTTPS grew from 30% at the start of 2014 to 70% at the start of 2018 [1].
At the same time, any encryption is a mathematically complex transformation of data and demands additional computing resources from the hardware. Depending on how the data is used, introducing encryption can reduce overall throughput (the volume of data processed per unit of time) several times over [2].
Encryption algorithms
There are a great many cryptographic algorithms [3]. Supporting all of them in hardware would be technically difficult.
Some algorithms, however, are used far more often than others. Many algorithms are considered obsolete or insufficiently secure, while others turn out to be too computationally expensive. There are other reasons as well.
Among symmetric block ciphers, AES (Advanced Encryption Standard) stands out first of all. It was selected as a US national standard through a competition [4]. AES is the main symmetric encryption algorithm in many protocols and technologies: TLS, Wi-Fi, Bluetooth (since version 4.0), GPG, IPsec, BitLocker (Windows file system encryption), LUKS (Linux file system encryption), Microsoft Office, many archivers (WinZip, 7-Zip), and others.
Cryptographic hash algorithms are also extremely widely used. Because MD5 has been declared insecure, the most common algorithms today are those of the SHA series, above all SHA-1 and SHA-2, which are also US FIPS standards. Over time they will be succeeded by SHA-3, which won the corresponding competition in 2012.
Among public-key algorithms, RSA, DSA, and Diffie-Hellman are worth noting.
Recent generations of processors of the most common architecture, x86-64 (made by Intel and AMD), implement special instructions that accelerate AES, SHA-1, and SHA-2 (256-bit) computations.
Intel instructions
In 2008 Intel proposed new instructions for the x86-64 architecture that added hardware support for the AES symmetric encryption algorithm. AES is currently one of the most popular block ciphers, so a hardware implementation should improve the performance of programs that use it.
The new instruction set is called AES-NI (AES New Instructions) and consists of four instructions for AES encryption and decryption:
- AESENC: perform one round of AES encryption,
- AESENCLAST: perform the last round of AES encryption,
- AESDEC: perform one round of AES decryption,
- AESDECLAST: perform the last round of AES decryption,
and two more instructions for working with the AES key:
- AESIMC: Inverse Mix Columns,
- AESKEYGENASSIST: assist in generating an AES round key.
Like earlier extensions, these are SIMD (Single Instruction, Multiple Data) instructions. All three AES key sizes are supported (128, 192, and 256 bits, with 10, 12, and 14 rounds of substitution and permutation respectively).
Using these instructions speeds up encryption operations several times over [5].
In 2013 Intel published the specification of a new instruction set for the SHA-1 and SHA-256 algorithms:
- SHA-1: SHA1RNDS4, SHA1NEXTE, SHA1MSG1, SHA1MSG2
- SHA-256: SHA256RNDS2, SHA256MSG1, SHA256MSG2
These instructions are meant to be called at different stages of computing a message's hash in order to speed up the most computationally expensive operations.
The first processors supporting these instructions were introduced in 2016 (the Goldmont microarchitecture).
Processor support
The AES-NI instruction set is supported by Intel processors based on the following architectures [6]:
- Westmere:
  - Westmere-EP (Xeon 56xx)
  - all desktop models except Pentium, Celeron, Core i3
  - mobile models: Core i7 and Core i5 only
and by AMD processors based on the following architectures:
- Bulldozer
- Piledriver
- Steamroller
- Excavator
- Jaguar
- Puma
- Zen
The SHA instructions are supported by Intel processors starting with the Goldmont architecture (2016) and by AMD processors starting with the Zen architecture (2017).
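On Linux, one way to check whether a given CPU advertises these extensions is to look at the flag list in /proc/cpuinfo. The sketch below parses such text and is a rough illustration only: the helper name `crypto_flags` and the sample strings are made up for this example, and the flag names it looks for (`aes` and `sha_ni` on x86, `aes`, `sha1`, `sha2`, `sha3`, `sha512` on 64-bit ARM) are the ones the Linux kernel commonly reports; other operating systems expose this information differently.

```python
WANTED_FLAGS = {"aes", "sha_ni", "sha1", "sha2", "sha3", "sha512"}


def crypto_flags(cpuinfo_text: str) -> set:
    """Return the crypto-related flags found in /proc/cpuinfo-style text."""
    found = set()
    for line in cpuinfo_text.splitlines():
        key, _, value = line.partition(":")
        # x86 kernels label the line "flags", ARM kernels label it "Features".
        if key.strip().lower() in ("flags", "features"):
            found |= WANTED_FLAGS & set(value.split())
    return found


# Made-up sample lines in the style of /proc/cpuinfo, for illustration only.
x86_sample = "processor\t: 0\nflags\t\t: fpu sse2 aes avx2 sha_ni\n"
arm_sample = "processor\t: 0\nFeatures\t: fp asimd aes pmull sha1 sha2 sha3 sha512\n"

print(sorted(crypto_flags(x86_sample)))
print(sorted(crypto_flags(arm_sample)))
```

On a real Linux system the same function can be applied to the contents of /proc/cpuinfo read as text.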
Other processors
General-purpose processors of other architectures and from other manufacturers also often include support for special cryptographic instructions.
For example, AES support is implemented in processors:
- based on the ARMv8 architecture and later (for example, from Qualcomm) [7];
- SPARC T4 and later [8];
- VIA (C3, C7) [9].
The ARM architecture also has a set of instructions for the SHA algorithms:
- SHA1C: SHA1 hash update accelerator, choose
- SHA1H: SHA1 fixed rotate
- SHA1M: SHA1 hash update accelerator, majority
- SHA1P: SHA1 hash update accelerator, parity
- SHA1SU0: SHA1 schedule update accelerator, first part
- SHA1SU1: SHA1 schedule update accelerator, second part
- SHA256H: SHA256 hash update accelerator
- SHA256H2: SHA256 hash update accelerator, upper part
- SHA256SU0: SHA256 schedule update accelerator, first part
- SHA256SU1: SHA256 schedule update accelerator, second part
Notes
- [1] Share of pages loaded by the Firefox browser over HTTPS
- [2] Wi-Fi network performance with various encryption and authentication algorithms
- [3] Symmetric algorithms (incomplete list)
- [4] NIST competition
- [5] A 5x speedup (TrueCrypt), in
Background
SHA-3 is short for Secure Hash Algorithm 3. This means that SHA-3 is a hash function and meets certain attack resistance criteria. If you don’t know what those are, you can read my introduction post on hashing functions. In a later post I will go into depth about attacks on hashing functions, what makes a hashing function secure, and how previous functions were defeated.
This post covers, step by step, how SHA-3 takes in data and creates a hashed output. Its best-known deployment is Ethereum (ETH), which uses Keccak-256, the pre-standardization variant whose padding differs from that of NIST's SHA3-256.
I wrote this post mainly because it was pretty hard for me to break down SHA-3 coming from a Chemical Engineering background and there were really no posts out there that tried to explain it to someone without a strong Computer Science/Mathematics background.
Sponge Construction
The SHA-3 algorithm is unique to other SHA functions due to the fact it uses an approach called sponge construction. After trying to make sense of a few diagrams, I decided to create a version that made more sense to me.
SHA-3’s sponge construction works by:
- Breaking the padded input data into r-bit chunks (r1, r2, and so on), where r is the rate. In SHA-3’s case the rate and capacity bits sum to 1600 bits.
- The state starts out with all rate and capacity bits set to zero, and its rate bits are XORed with the first rate block r1.
- The combined rate and capacity block is put through a function, usually consisting of multiple rounds.
- The first r bits of the function’s output are the new rate and the remaining c bits are the new capacity.
- These r bits are XORed with r2 and the whole state is fed into the function again.
- This is repeated until there are no data blocks left over.
- After the last data block has been processed, the hash is taken from the r-bit rate portion of the function’s output.
- If more bits are needed, the rate and capacity are fed through the function again without inputting any additional data.
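The steps above can be sketched in Python. This is only a rough illustration under a few assumptions that are not from the post: the rate is a whole number of bytes, the input has already been padded, and `permute` is a stand-in parameter for the block transformation. The `toy_permute` function is a made-up placeholder (it is not Keccak and it is not secure) so that the sketch runs on its own.

```python
def sponge(padded: bytes, rate_bytes: int, out_bytes: int, permute,
           state_bytes: int = 200) -> bytes:
    """Absorb already-padded input, then squeeze out_bytes of output."""
    assert len(padded) % rate_bytes == 0, "input must be padded to a multiple of the rate"
    state = bytearray(state_bytes)            # rate and capacity both start as zeros

    # Absorbing: XOR each rate-sized block into the rate part, then permute the whole state.
    for off in range(0, len(padded), rate_bytes):
        block = padded[off:off + rate_bytes]
        for i, byte in enumerate(block):
            state[i] ^= byte                  # the capacity bytes are never XORed with input
        state = bytearray(permute(bytes(state)))

    # Squeezing: read the rate part, permuting again whenever more output is needed.
    out = bytes(state[:rate_bytes])
    while len(out) < out_bytes:
        state = bytearray(permute(bytes(state)))
        out += bytes(state[:rate_bytes])
    return out[:out_bytes]


def toy_permute(state: bytes) -> bytes:
    """Hypothetical placeholder permutation (NOT Keccak): rotate the bytes and flip bits."""
    return bytes(b ^ 0xA5 for b in state[1:] + state[:1])


# Tiny 8-byte state with a 4-byte rate, squeezing 10 bytes (more than one rate block).
digest = sponge(b"\x01" * 8, rate_bytes=4, out_bytes=10, permute=toy_permute, state_bytes=8)
print(digest.hex())
```

Asking for more output than the rate holds simply triggers extra calls to the permutation with no new input, which is the last bullet in the list.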
Padding
Since data rarely divides perfectly into a whole number of blocks, SHA-3 must pad the input data so that its length is a multiple of r bits. This padding is done by adding a 1 bit, then zero or more 0 bits, and then a final 1 bit. The two cases for input padding are described below.
When the input is already perfectly divisible into r-bit blocks, we use case 2 as shown above, which appends a full extra block of padding. This is done to make sure that a message whose length is divisible by r bits, and that ends in something that looks like padding, does not produce the same hash as the message with those bits removed.
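As a small sketch of the padding rule, the function below returns the padding bits for a message of a given bit length. The function name and the string-of-bits representation are my own choices for illustration. Note that FIPS 202 also has SHA-3 append the two bits 01 to the message before this padding is applied; that suffix is left out here to match the description above.

```python
def pad10star1(msg_bits: int, rate_bits: int) -> str:
    """Return the padding (a 1, zero or more 0s, then a 1) for a msg_bits-long message."""
    zeros = (-msg_bits - 2) % rate_bits   # 0 bits needed between the two 1 bits
    return "1" + "0" * zeros + "1"


# Toy rate of 8 bits so the padding strings stay readable.
for m in (5, 6, 7, 8):
    pad = pad10star1(m, 8)
    print(m, pad, (m + len(pad)) % 8 == 0)
```

With a rate of 8, a 6-bit message gets the shortest possible padding (just the two 1 bits), while a message that already fills a block exactly gets a full extra block of padding.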
The Keccak Function
The Keccak function is the heart of SHA-3; it is the function block in the above diagram. It uses only XOR, AND and NOT operations, which makes it easy to implement in both software and hardware.
∈ – Element of, means that it is a part of the next set of numbers
ℤ – Represents the family of Integers
b – number of bits the function takes
ℓ – any integer from 0 to 6; SHA-3 uses ℓ = 6, which gives a lane size of w = 2^ℓ = 64 bits and b = 25 × w = 1600 bits.
Explaining the mathematical notation of line 3 in plain English:
ℓ is a member of the family of integers, so it needs to be a whole number, and ℓ is greater than or equal to 0 but less than or equal to 6.
All transformations of the Keccak function are done on the above 3-D array; the numbering of the indices is defined in FIPS 202. This 3-dimensional object is a 5-by-5-by-w array (the w in the equation above).
The entire 3-D array is called a state and various other parts of the state array are denoted above.
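To make the sizes concrete, here is a short sketch that lists the lane size w and state width b for every allowed ℓ, and maps a coordinate (x, y, z) to its position in the flat b-bit state using the FIPS 202 layout A[x, y, z] = S[w(5y + x) + z]. The function names are my own.

```python
def keccak_widths() -> dict:
    """Map each l in 0..6 to (lane size w, state width b)."""
    return {l: (2 ** l, 25 * 2 ** l) for l in range(7)}


def bit_index(x: int, y: int, z: int, w: int = 64) -> int:
    """Position of bit A[x, y, z] in the flat state string (FIPS 202 layout)."""
    return w * (5 * y + x) + z


for l, (w, b) in keccak_widths().items():
    print(f"l={l}: w={w:2d} bits per lane, b={b:4d} bits of state")

print(bit_index(0, 0, 0), bit_index(1, 0, 0), bit_index(0, 1, 0), bit_index(4, 4, 63))
```

The last line shows that lanes are laid out along x first and then y, and that the final bit of the 1600-bit state is A[4, 4, 63].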
Capacity and Rate
For the main Keccak implementation, b is fixed at 1600 bits, which gives us a selection of four SHA-3 hash functions, each with a different output size. This disregards the SHAKE variants, which can produce output of any length.
Above is a table of the four NIST standard SHA-3 functions; as you can see, r + c = 1600 for each instance.
Rate – sometimes called bit rate, is the size of the data blocks that will be absorbed by the function.
Capacity – is the part of the state that is never directly touched by input or output: absorbing new data does not XOR anything into these bits, although running the function on the newly absorbed data does change them. None of the capacity bits are used for output of the hash function. The size of the capacity is directly related to how secure the instance is.
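The rate and capacity for each instance can be recomputed from the digest size. The sketch below relies on the FIPS 202 rule that the capacity is twice the digest length, and cross-checks the result against Python’s `hashlib`, which reports the rate in bytes as `block_size`. The helper name `sha3_params` is my own.

```python
import hashlib

B = 1600  # state width in bits for all four SHA-3 instances


def sha3_params(digest_bits: int):
    """Return (rate, capacity) in bits for a SHA-3 digest size."""
    capacity = 2 * digest_bits
    return B - capacity, capacity


for d in (224, 256, 384, 512):
    r, c = sha3_params(d)
    # hashlib exposes the rate (in bytes) of each SHA-3 object as block_size
    assert hashlib.new(f"sha3_{d}").block_size * 8 == r
    print(f"SHA3-{d}: r={r:4d}  c={c:4d}  r+c={r + c}")
```

A larger digest means a larger capacity and therefore a smaller rate, so the longer hashes absorb fewer bits per call of the function.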
The Block Transformation
The block transformation, represented by the block labeled function in the sponge function diagram above, is broken up into 5 steps that are repeated for a number of rounds.
The 5 steps are:
- θ (theta)
- ρ (rho)
- π (pi)
- χ (chi)
- ι (iota)
In SHA-3, since we have ℓ = 6, we run this block transformation for 12 + 2ℓ = 24 rounds each time the function is run. The round index, denoted ir, starts at 0 and is incremented by 1 each time the iota step finishes. Once ir = 24, the output of the iota step becomes the new capacity and rate. This means that if the padded input fills two rate blocks, we run 48 rounds in total: 24 for each rate block input, r1 and r2 respectively in the sponge function diagram above.
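The round arithmetic is small enough to write down directly. This sketch uses the 12 + 2ℓ rule from the Keccak specification; the function names are my own.

```python
def num_rounds(l: int) -> int:
    """Rounds per call of the block transformation."""
    return 12 + 2 * l


def total_rounds(rate_blocks: int, l: int = 6) -> int:
    """Rounds spent absorbing the given number of padded rate blocks."""
    return rate_blocks * num_rounds(l)


print(num_rounds(6))     # 24
print(total_rounds(2))   # 48
```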
The θ (theta) Step
The θ step consists of two separate functions, the C function and the D function.
θ C Function
The C function is probably one of the easiest to grasp conceptually. It consists of 4 stages of consecutive XOR calculations.
In the example above I show the C function calculation for x=0. We can bulk calculate with respect to the z axis and give the results per lane, as shown in the final 4-bit array. The output of the C function for x=0 with respect to z is broken down below, showing the mapping per bit.
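Here is a minimal sketch of the C function working on whole lanes. It assumes the state is stored as a list `A[x][y]` of Python integers, one integer per lane; that layout and the example values are mine, not taken from the diagram.

```python
def theta_c(A):
    """Column parities: XOR the five lanes of each sheet x together."""
    return [A[x][0] ^ A[x][1] ^ A[x][2] ^ A[x][3] ^ A[x][4] for x in range(5)]


# Example state with 4-bit lanes; only the x=0 sheet is filled in.
A = [[0] * 5 for _ in range(5)]
A[0] = [0b1010, 0b0110, 0b0001, 0b1111, 0b0000]

print(format(theta_c(A)[0], "04b"))   # 0010
```

Each output bit is the parity of one column, so a single XOR per lane handles all z positions at once.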
θ D Function
The D function introduces cross-slice diffusion via the right bitwise rotation with respect to z. It takes two offset C function outputs and XORs them together.
In this example I am showing the D function calculation for x=0. The visual representation of the array has changed axes to show a plane, and the values are the output from the C function. The C value has no y-axis component since it XORs each column together; you can think of the C function as the bitwise sum in the y-direction, effectively squishing our state into a single plane.
For D[0], you can see that it does not use C[0]; it actually uses the x+1 and x-1 neighbors of C[0], wrapping around modulo 5, so C[1] and C[4]. This provides diffusion across the x-direction.
The second argument to the D function offsets z by one position. While you can just use z-1 when calculating an individual bit, when you calculate an entire lane this z offset can be represented as a right rotation. This provides diffusion across the z-direction.
The final step is to XOR both arguments together, giving you D[0], the lane representation of the D function output for x=0. The mapping per bit is broken down for D[0].
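A sketch of the D function on whole lanes follows. One assumption to be aware of: lanes are stored as Python integers in which bit z has weight 2^z, so moving a bit from position z-1 to position z appears as a left rotation of the integer, even though the diagram draws it as a shift to the right. The helper names and example values are my own.

```python
def rotl(lane: int, n: int, w: int = 64) -> int:
    """Rotate a w-bit lane so that the bit at position z moves to z + n (mod w)."""
    n %= w
    return ((lane << n) | (lane >> (w - n))) & ((1 << w) - 1)


def theta_d(C, w: int = 64):
    """D[x] = C[x-1] XOR (C[x+1] with every bit moved up one z position)."""
    return [C[(x - 1) % 5] ^ rotl(C[(x + 1) % 5], 1, w) for x in range(5)]


# 8-bit lanes; C[1] has its top bit set so the rotation wraps around.
C = [0b00000001, 0b10000000, 0, 0, 0b00001111]
print(format(theta_d(C, w=8)[0], "08b"))   # 00001110
```

D[0] is built from C[4] and the rotated C[1] only, which matches the observation that C[0] itself is not used.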
θ Output
The output of the theta function is the XORed result of the initial state array A and the output of the D function.
In this example I am showing the final θ XOR output for x=0 and y=0. Since D[x,z] does not have a y-component, we end up using D[0] for all subsequent XORs when x=0, regardless of the y-value.
We can calculate this like our previous examples on a lane (with respect to the z-axis). The output of the function is labeled A’ (A-prime), which has a size of 5-by-5-by-w bits (the same size as the initial state). A’ is then fed into the ρ (rho) step.
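Putting the C function, the D function and the final XOR together gives the whole θ step. This is a sketch under the same assumptions as before (state stored as `A[x][y]` with one Python integer per lane, bit z having weight 2^z); the names are mine.

```python
def rotl(lane: int, n: int, w: int = 64) -> int:
    """Rotate a w-bit lane so that the bit at position z moves to z + n (mod w)."""
    n %= w
    return ((lane << n) | (lane >> (w - n))) & ((1 << w) - 1)


def theta(A, w: int = 64):
    """Return A' where every lane A[x][y] is XORed with D[x]."""
    C = [A[x][0] ^ A[x][1] ^ A[x][2] ^ A[x][3] ^ A[x][4] for x in range(5)]
    D = [C[(x - 1) % 5] ^ rotl(C[(x + 1) % 5], 1, w) for x in range(5)]
    return [[A[x][y] ^ D[x] for y in range(5)] for x in range(5)]


# A state with a single bit set shows how far theta spreads it.
A = [[0] * 5 for _ in range(5)]
A[0][0] = 1
A_prime = theta(A)
print(sum(bin(lane).count("1") for sheet in A_prime for lane in sheet))   # 11
```

The single input bit ends up affecting 11 bits: itself, plus one full column on each side of it.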
The ρ (rho) Step
The ρ step thankfully only consists of one step. This step is conceptually easy to apply but difficult to understand when written in math notation. We will use A as the input state and A’ as the modified state. In this case A is the output from the final θ step above.
In my example I took the sheet where x=0, with w=8, and applied the ρ function to that sheet. The function is quite simple to apply if you have the table. I also gave the derivation of the table, and I will show an example calculation for one value of t later.
Basically for A[0,0] you rotate it 0-bits so it stays the same. To find the rotation for A[0,2] we look at the table and find where x=0 and y=2 intersect, in this case we get 3. We will then rotate the bits in lane A[0,2] by 3-bits to the right, this is visually represented by a dot (its initial position) with an arrow pointing to a grey box (its final position). I also gave a visual representation of the rotation for A[0,2]. This rotation is then performed on all lanes in the 5-by-5-by-w state.
In order to be able to explain how the ρ offset table is generated we need to understand how matrix multiplication works.
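Applying ρ with the table can be sketched as below. The offsets are the FIPS 202 values, indexed here as `RHO[x][y]`. As an assumption of this sketch, lanes are Python integers whose bit z has weight 2^z, so moving a bit toward higher z (the rightward arrow in the diagram) is a left rotation of the integer. Offsets larger than the lane size are taken modulo w.

```python
# Rotation offsets from FIPS 202, indexed as RHO[x][y].
RHO = [
    [0, 36, 3, 105, 210],
    [1, 300, 10, 45, 66],
    [190, 6, 171, 15, 253],
    [28, 55, 153, 21, 120],
    [91, 276, 231, 136, 78],
]


def rotl(lane: int, n: int, w: int = 64) -> int:
    """Rotate a w-bit lane so that the bit at position z moves to z + n (mod w)."""
    n %= w
    return ((lane << n) | (lane >> (w - n))) & ((1 << w) - 1)


def rho(A, w: int = 64):
    """Rotate every lane A[x][y] by its table offset."""
    return [[rotl(A[x][y], RHO[x][y], w) for y in range(5)] for x in range(5)]


# 8-bit lanes, one bit set at z=0 of lane A[0][2]; the table says to move it 3 positions.
A = [[0] * 5 for _ in range(5)]
A[0][2] = 0b00000001
print(format(rho(A, w=8)[0][2], "08b"))   # 00001000
```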
Matrix multiplication is pretty simple, just follow the pretty colors. To raise a matrix to the exponent n, you multiply the matrix raised to n-1 by the initial matrix. Matrix multiplication is not commutative in general, so the order of the factors normally matters, although powers of the same matrix do commute with each other.
Now that you have a rough understanding, we can do a sample calculation of the ρ offset, x and y for t=10.
First we plug 10 into the ρ offset formula, (t+1)(t+2)/2, which gives us a 66-bit rotation (the rotation is applied modulo the lane size w, so with w=64 it is effectively 2 bits). Now we need to know what lane this rotation is referring to. We plug 10 into our matrix equation and with the help of wolframalpha we get our values for i and j. We then take the mod 5 of each of these values and we end up with x=1 and y=4. We can verify that for these x and y values on the table we have a 66-bit rotation.
With this information you can now just trust the table and forget the matrix algebra you just learned.
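If you would rather regenerate the table than trust it, the derivation fits in a few lines. This sketch assumes the matrix in the equation is the one from the Keccak specification, [[0, 1], [2, 3]], applied to the starting vector (1, 0); multiplying by it once turns (x, y) into (y, 2x + 3y), so stepping t times is the same as raising the matrix to the power t. The function name is mine.

```python
def rho_offsets():
    """Rebuild the rho table as offsets[x][y] by walking the 24 non-origin lanes."""
    offsets = [[0] * 5 for _ in range(5)]      # lane (0, 0) keeps offset 0
    x, y = 1, 0                                # position for t = 0
    for t in range(24):
        offsets[x][y] = (t + 1) * (t + 2) // 2
        x, y = y, (2 * x + 3 * y) % 5          # one multiplication by the matrix, mod 5
    return offsets


table = rho_offsets()
print(table[1][4])   # 66, the t=10 example
print(table[0][2])   # 3
```

Reducing modulo 5 at every step gives the same coordinates as computing the full matrix power first and reducing at the end, and keeps the numbers small.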
The π(pi) step
The π step is actually quite simple as well. The wording of the mathematical function confused me a bit to begin with, but once you get it, it’s pretty straightforward.
I decided to use the first example of the π function that was given in the NIST FIPS 202 publication of SHA-3. The π function is applied specifically to the diagonal points on the slice listed above. The function rearranges the positions of lanes in the state.
I decided to list two different versions of the function: the first one, which was taken from the official specification, and the second one, taken from the pseudo-code implementation. While they both do the same thing, the inputs differ.
The first function takes in the location from the output state and gives you the input state location from where it was moved from.
The second function takes in the input state location and gives you where to move it to for the output state.
I have used both functions in the calculations to show that they both do the same thing.
Above are the illustrations taken from the SHA-3 specification that show the π function being applied to all x and y positions.
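Both forms of π can be written side by side and compared. The first follows FIPS 202, A’[x, y] = A[(x + 3y) mod 5, x]; the second follows the Keccak team’s pseudo-code, B[y, 2x + 3y] = A[x, y]. The function names and the string labels used as lane contents are my own choices for illustration.

```python
def pi_spec(A):
    """Specification form: for each output position, look up where it came from."""
    return [[A[(x + 3 * y) % 5][x] for y in range(5)] for x in range(5)]


def pi_pseudocode(A):
    """Pseudo-code form: for each input position, compute where it goes."""
    B = [[None] * 5 for _ in range(5)]
    for x in range(5):
        for y in range(5):
            B[y][(2 * x + 3 * y) % 5] = A[x][y]
    return B


# Label every lane with its starting coordinates so the movement is visible.
A = [[f"{x}{y}" for y in range(5)] for x in range(5)]
print(pi_spec(A) == pi_pseudocode(A))   # True
print(pi_spec(A)[1][0])                 # 11
```

The second print shows that output position (1, 0) receives the lane that started at (1, 1); the lane at (0, 0) is the only one that never moves.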
The χ (chi) Step
The χ step is the only nonlinear mapping step in the block transformation. It consists of XOR, NOT, and AND operations on bits across the x-direction and is applied to a plane at a time.
I decided to apply the function to a row instead of a plane to reduce clutter in visualization. I also color coded the x values to help contrast where each logic gate input gets its input value.
The function loop translates to applying the χ function to one plane at a time, for all y values.
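A sketch of χ on whole lanes follows, using the FIPS 202 rule A’[x, y] = A[x, y] XOR ((NOT A[x+1, y]) AND A[x+2, y]). The state layout `A[x][y]` with integer lanes and the example row are my own assumptions.

```python
def chi(A, w: int = 64):
    """Combine each lane with the next two lanes along x in the same plane."""
    mask = (1 << w) - 1                       # keeps NOT within w bits
    return [[A[x][y] ^ ((~A[(x + 1) % 5][y] & mask) & A[(x + 2) % 5][y])
             for y in range(5)] for x in range(5)]


# One-bit lanes (w=1) so that the plane y=0 is just a single row of 5 bits.
row = [1, 0, 1, 1, 0]
A = [[row[x], 0, 0, 0, 0] for x in range(5)]
out = chi(A, w=1)
print([out[x][0] for x in range(5)])   # [0, 0, 1, 0, 0]
```

The AND is what makes this step nonlinear; every other step is built from XORs and bit movements only.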
The ι (iota) Step
The ι step is the simplest of all the steps. The purpose of the ι step is to modify some of the bits of Lane(0,0); the other 24 lanes are not affected by ι. This is done by XORing Lane(0,0) with the round constant (RC[ir]).
Although the ι step is the most simple, it took me the longest to understand, specifically understanding how to generate the round constant vs using the supplied table.
In the case of ℓ=6, the z-position for the RC is given by the first table. The table shows the 7 bits that can be set and their positions in the 64-bit lane: z = 2^j − 1 for j from 0 to ℓ, that is 0, 1, 3, 7, 15, 31 and 63. In the visual representation I show only these 7 bits because the rest of the round constant’s bits are always 0, so XORing with them does not change the lane at any other position.
In this example, ir was chosen to be 1, meaning this is the second round of the block transformation.
The round constants table gives the round constant lane for each value of ir from 0 to 23, represented in hexadecimal format.
I have highlighted the 7 bits that can be set for the XOR process to better show what is happening.
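For the ir = 1 example, the ι step can be sketched as below. The constant 0x0000000000008082 is the published RC[1]; the state layout `A[x][y]` with integer lanes and the function name are my own assumptions.

```python
RC_1 = 0x0000000000008082                   # round constant for ir = 1

CANDIDATE_BITS = [2 ** j - 1 for j in range(7)]   # the 7 positions that can ever be set


def iota(A, rc: int):
    """XOR the round constant into Lane(0,0); all other lanes are copied unchanged."""
    out = [sheet[:] for sheet in A]
    out[0][0] ^= rc
    return out


print(CANDIDATE_BITS)                                   # [0, 1, 3, 7, 15, 31, 63]
print([z for z in range(64) if (RC_1 >> z) & 1])        # [1, 7, 15]

A = [[0xFF] * 5 for _ in range(5)]
print(hex(iota(A, RC_1)[0][0]))                         # 0x807d
```

Only three of the seven candidate positions are set in RC[1], and only Lane(0,0) changes.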
Coming from a Chemical Engineering background I usually want to understand where constants and equations are derived from before trusting them. Generally because things explode or fail catastrophically when you’re wrong. So, I decided to try to understand how the round constants were calculated.
This took me way longer to understand and required me to learn about linear-feedback shift registers (LFSR). There are a few ways to calculate LFSRs given the representative polynomial: you can apply it as a Fibonacci or a Galois LFSR. I had a lot of trouble since neither of these was referenced in the SHA-3 documentation.
I finally ended up figuring out the round constants were generated using a Galois LFSR with the bit shift direction flipped.
In the above example I gave calculations of RC[0] and RC[1]. I think most of the diagram is self-explanatory and not much needs to be explained in text. The one thing worth adding is that the polynomial degrees represent the bit positions of the “taps” of the LFSR.
After the ι step, ir is incremented by 1 and the output of the ι step is the new input to the θ step. Once ir is 24, the output of the ι step is the new rate and capacity. The rate part of this new state is then XORed with the next rate chunk, if any remain, and fed into the function again. See the Sponge Construction section for next steps.
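The round constants can be regenerated in a few lines. This sketch follows the bit-by-bit procedure written out in FIPS 202 (shift in a 0, then XOR the bit that falls off the end into positions 0, 4, 5 and 6) rather than any particular diagram, and places output bit rc(j + 7·ir) at position 2^j − 1 of the lane. The function names are my own.

```python
def rc_bit(t: int) -> int:
    """One output bit of the LFSR after t steps (FIPS 202 procedure)."""
    if t % 255 == 0:
        return 1
    R = [1, 0, 0, 0, 0, 0, 0, 0]
    for _ in range(t % 255):
        R = [0] + R            # shift: a 0 enters at position 0
        R[0] ^= R[8]           # feed the bit that overflowed back into the taps
        R[4] ^= R[8]
        R[5] ^= R[8]
        R[6] ^= R[8]
        R = R[:8]              # drop the overflow bit
    return R[0]


def round_constant(ir: int, l: int = 6) -> int:
    """Assemble RC[ir] from l + 1 LFSR bits placed at positions 2^j - 1."""
    rc = 0
    for j in range(l + 1):
        rc |= rc_bit(j + 7 * ir) << (2 ** j - 1)
    return rc


print(hex(round_constant(0)))   # 0x1
print(hex(round_constant(1)))   # 0x8082
```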
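To tie everything together, here is a compact sketch of the whole pipeline for SHA3-256, checked against Python’s `hashlib`. It is an illustration rather than a reference implementation: it is slow, the function names are mine, and it uses the byte-level form of the padding (0x06 ... 0x80), which combines the two-bit 01 suffix that FIPS 202 adds for SHA-3 with the 1, 0s, 1 padding.

```python
import hashlib

MASK = (1 << 64) - 1

def rotl(v, n):
    n %= 64
    return ((v << n) | (v >> (64 - n))) & MASK

def rc_bit(t):
    R = [1, 0, 0, 0, 0, 0, 0, 0]
    for _ in range(t % 255):
        R = [0] + R
        for tap in (0, 4, 5, 6):
            R[tap] ^= R[8]
        R = R[:8]
    return R[0]

# Round constants: bit rc(j + 7*ir) goes to position 2^j - 1.
RC = [sum(rc_bit(j + 7 * ir) << (2 ** j - 1) for j in range(7)) for ir in range(24)]

# Rho offsets, indexed RHO[x][y].
RHO = [[0] * 5 for _ in range(5)]
_x, _y = 1, 0
for _t in range(24):
    RHO[_x][_y] = (_t + 1) * (_t + 2) // 2
    _x, _y = _y, (2 * _x + 3 * _y) % 5

def keccak_f(A):
    """24 rounds of theta, rho, pi, chi, iota on a 5x5 state of 64-bit lanes."""
    for ir in range(24):
        C = [A[x][0] ^ A[x][1] ^ A[x][2] ^ A[x][3] ^ A[x][4] for x in range(5)]
        D = [C[(x - 1) % 5] ^ rotl(C[(x + 1) % 5], 1) for x in range(5)]
        A = [[A[x][y] ^ D[x] for y in range(5)] for x in range(5)]            # theta
        A = [[rotl(A[x][y], RHO[x][y]) for y in range(5)] for x in range(5)]  # rho
        A = [[A[(x + 3 * y) % 5][x] for y in range(5)] for x in range(5)]     # pi
        A = [[A[x][y] ^ ((~A[(x + 1) % 5][y] & MASK) & A[(x + 2) % 5][y])
              for y in range(5)] for x in range(5)]                           # chi
        A[0][0] ^= RC[ir]                                                     # iota
    return A

def sha3_256(msg: bytes) -> bytes:
    rate = 136                                   # 1088-bit rate, in bytes
    padded = bytearray(msg) + b"\x06"            # suffix 01 plus the first padding 1 bit
    padded += b"\x00" * (-len(padded) % rate)
    padded[-1] |= 0x80                           # final padding 1 bit
    A = [[0] * 5 for _ in range(5)]
    for off in range(0, len(padded), rate):      # absorb one rate block at a time
        block = padded[off:off + rate]
        for i in range(rate // 8):
            A[i % 5][i // 5] ^= int.from_bytes(block[8 * i:8 * i + 8], "little")
        A = keccak_f(A)
    # Squeeze: the 256-bit digest is the first four lanes of the rate.
    return b"".join(A[i % 5][i // 5].to_bytes(8, "little") for i in range(4))

for message in (b"", b"abc", b"a" * 200):
    assert sha3_256(message) == hashlib.sha3_256(message).digest()
print(sha3_256(b"abc").hex())
```

The 200-byte message needs two rate blocks, so it exercises the path where the function is run again after XORing in the second block.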