0%

2019 De1CTF XORZ

汉明距离【Hamming distance】

在信息论中,两个等长字符串之间的汉明距离 是两个字符串对应位置的不同字符的个数。

简单来说 汉明距离度量了通过替换字符的方式将字符串x变成y所需要的最小的替换次数。

比如

1011101与1001001 之间的汉明距离是2。

1011101

1001001

2143896与2233796之间的汉明距离是3。

2143896

2233796

"toned"与"roses"之间的汉明距离是3。

toned

roses

快速求解汉明距离的方法 就是 将等长字符串 进行异或。两个二进制字符异或后计算值为1的 比特位格式 就是最后的汉明距离。

2019 De1CTF XORZ

1
2
3
4
5
6
7
8
9
10
11
12
from itertools import *
from data import flag,plain

key=flag.strip("de1ctf{").strip("}")
assert(len(key<38))
salt="WeAreDe1taTeam"
ki=cycle(key)
si=cycle(salt)
cipher = ''.join([hex(ord(p) ^ ord(next(ki)) ^ ord(next(si)))[2:].zfill(2) for p in plain])
print cipher
# output:
# 49380d773440222d1b421b3060380c3f403c3844791b202651306721135b6229294a3c3222357e766b2f15561b35305e3c3b670e49382c295c6c170553577d3a2b791470406318315d753f03637f2b614a4f2e1c4f21027e227a4122757b446037786a7b0e37635024246d60136f7802543e4d36265c3e035a725c6322700d626b345d1d6464283a016f35714d434124281b607d315f66212d671428026a4f4f79657e34153f3467097e4e135f187a21767f02125b375563517a3742597b6c394e78742c4a725069606576777c314429264f6e330d7530453f22537f5e3034560d22146831456b1b72725f30676d0d5c71617d48753e26667e2f7a334c731c22630a242c7140457a42324629064441036c7e646208630e745531436b7c51743a36674c4f352a5575407b767a5c747176016c0676386e403a2b42356a727a04662b4446375f36265f3f124b724c6e346544706277641025063420016629225b43432428036f29341a2338627c47650b264c477c653a67043e6766152a485c7f33617264780656537e5468143f305f4537722352303c3d4379043d69797e6f3922527b24536e310d653d4c33696c635474637d0326516f745e610d773340306621105a7361654e3e392970687c2e335f3015677d4b3a724a4659767c2f5b7c16055a126820306c14315d6b59224a27311f747f336f4d5974321a22507b22705a226c6d446a37375761423a2b5c29247163046d7e47032244377508300751727126326f117f7a38670c2b23203d4f27046a5c5e1532601126292f577776606f0c6d0126474b2a73737a41316362146e581d7c1228717664091c

去盐

key 38个十六进制数 30个未知字符

c = p ^ ki ^ si

已知ci

salt key 都是循环利用的

salt是已知的 所以先将salt层 异或去掉 得到 si

c ^ si = p ^ ki

c 和 si是 已知的 所以可以求出来

1
2
3
4
5
6
7
8
9
10
11
c = '49380d773440222d1b421b3060380c3f403c3844791b202651306721135b6229294a3c3222357e766b2f15561b35305e3c3b670e49382c295c6c170553577d3a2b791470406318315d753f03637f2b614a4f2e1c4f21027e227a4122757b446037786a7b0e37635024246d60136f7802543e4d36265c3e035a725c6322700d626b345d1d6464283a016f35714d434124281b607d315f66212d671428026a4f4f79657e34153f3467097e4e135f187a21767f02125b375563517a3742597b6c394e78742c4a725069606576777c314429264f6e330d7530453f22537f5e3034560d22146831456b1b72725f30676d0d5c71617d48753e26667e2f7a334c731c22630a242c7140457a42324629064441036c7e646208630e745531436b7c51743a36674c4f352a5575407b767a5c747176016c0676386e403a2b42356a727a04662b4446375f36265f3f124b724c6e346544706277641025063420016629225b43432428036f29341a2338627c47650b264c477c653a67043e6766152a485c7f33617264780656537e5468143f305f4537722352303c3d4379043d69797e6f3922527b24536e310d653d4c33696c635474637d0326516f745e610d773340306621105a7361654e3e392970687c2e335f3015677d4b3a724a4659767c2f5b7c16055a126820306c14315d6b59224a27311f747f336f4d5974321a22507b22705a226c6d446a37375761423a2b5c29247163046d7e47032244377508300751727126326f117f7a38670c2b23203d4f27046a5c5e1532601126292f577776606f0c6d0126474b2a73737a41316362146e581d7c1228717664091c'
salt = "WeAreDe1taTeam"
si2 = cycle(salt)
nosalt = ''
for i in range(int(len(c)/2)):
temp = c[i*2:i*2+2]
# print(binascii.a2b_hex(temp))
temphex_int = ord(binascii.a2b_hex(temp).decode())
# p = chr(temphex_int ^ ord(next(si2)) ^ ord(next(ki2)))
temp_re = hex(temphex_int ^ ord(next(si2)))[2:].zfill(2)
nosalt = nosalt + temp_re

猜解密钥长度

key长度小于38 其中已知位数8位 所以未知位数小于30

可以使用汉明距离 去猜解密钥长度

1
2
3
hamming("1010", "1111") == 2
hamming("1111", "0000") == 4
hamming("1111", "1111") == 0

汉明距离 与 密钥长度猜解之间的联系:

两个以 Ascii编码的英文字符 之间的汉明距离是 23 所以正常英文字母的评价汉明距离为 23 【每比特】,任意字符(非纯字母)的两两汉明距离 平均为4。

大写字母A-Z ascii 65-90 65对应二进制 01000001 90对应二进制 01011010

小写字母a-z ascii 97-122

65对应二进制 01000001

90对应二进制 01011010 字母汉明距离平均为2-3

正确分组的密文与密文直接的汉明距离 等于明文与明文之间的汉明距离【通过按正确密钥长度分组的密文与密文异或等于明文与明文异或证明】

所以 当使用正确的密钥长度后,两两字母进行计算汉明距离,这个值会趋于最小。

密文解密

1.异或加密中 使用相同密钥加密的明文和密文加存在这个规律: 密文和密文异或 等于 明文和明文异或。

2.空格和所有小写字母异或的结果是对应的大写字母; 空格和所有大写字母 异或的结果是对应的小写字母。除空格外,有些组合可出现异或结果是大小写字母,但是空格出现时,结果在大小写字母间的概率最大。

3.两个以ascii编码的英文字符的汉明距离是2-3之间,也就是说正常英文字母的平均汉明距离为2-3(每比特),任意字符(非纯字母)的两两汉明距离平均为4。

4.在破解这类问题的三步走:猜解密钥长度;根据密钥长度分组,依次求解密钥每个字节得出密钥;最后根据密钥还原出明文。

直接用文章的脚本 https://www.anquanke.com/post/id/161171#h2-0 当然题目和这个是不一样的 但是去盐后就是同一类题目

后面稍加变动即可 学习下这种思路

WP

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
import binascii
import base64
import string
from itertools import *

def bxor(a, b): # xor two byte strings of different lengths
if len(a) > len(b):
return bytes([x ^ y for x, y in zip(a[:len(b)], b)])
else:
return bytes([x ^ y for x, y in zip(a, b[:len(a)])])

def hamming_distance(b1, b2):
differing_bits = 0
for byte in bxor(b1, b2):
differing_bits += bin(byte).count("1")
return differing_bits

# text = ''
# with open("6.txt","r") as f:
# for line in f:
# text += line

# b = base64.b64decode(text)

def break_single_key_xor(text):
key = 0
possible_space=0
max_possible=0
letters = string.ascii_letters.encode('ascii')
for a in range(0, len(text)):
maxpossible = 0
for b in range(0, len(text)):
if(a == b):
continue
c = text[a] ^ text[b]
if c not in letters and c != 0:
continue
maxpossible += 1
if maxpossible>max_possible:
max_possible=maxpossible
possible_space=a
key = text[possible_space]^ 0x20
return chr(key)


c='49380d773440222d1b421b3060380c3f403c3844791b202651306721135b6229294a3c3222357e766b2f15561b35305e3c3b670e49382c295c6c170553577d3a2b791470406318315d753f03637f2b614a4f2e1c4f21027e227a4122757b446037786a7b0e37635024246d60136f7802543e4d36265c3e035a725c6322700d626b345d1d6464283a016f35714d434124281b607d315f66212d671428026a4f4f79657e34153f3467097e4e135f187a21767f02125b375563517a3742597b6c394e78742c4a725069606576777c314429264f6e330d7530453f22537f5e3034560d22146831456b1b72725f30676d0d5c71617d48753e26667e2f7a334c731c22630a242c7140457a42324629064441036c7e646208630e745531436b7c51743a36674c4f352a5575407b767a5c747176016c0676386e403a2b42356a727a04662b4446375f36265f3f124b724c6e346544706277641025063420016629225b43432428036f29341a2338627c47650b264c477c653a67043e6766152a485c7f33617264780656537e5468143f305f4537722352303c3d4379043d69797e6f3922527b24536e310d653d4c33696c635474637d0326516f745e610d773340306621105a7361654e3e392970687c2e335f3015677d4b3a724a4659767c2f5b7c16055a126820306c14315d6b59224a27311f747f336f4d5974321a22507b22705a226c6d446a37375761423a2b5c29247163046d7e47032244377508300751727126326f117f7a38670c2b23203d4f27046a5c5e1532601126292f577776606f0c6d0126474b2a73737a41316362146e581d7c1228717664091c'
salt="WeAreDe1taTeam"
si2=cycle(salt)
nosalt=''
for i in range(int(len(c)/2)):
temp=c[i*2:i*2+2]
# print(binascii.a2b_hex(temp))
temphex_int=ord(binascii.a2b_hex(temp).decode())
# p = chr(temphex_int ^ ord(next(si2)) ^ ord(next(ki2)))
temp_re= hex(temphex_int ^ ord(next(si2)))[2:].zfill(2)
nosalt=nosalt+temp_re


b=binascii.a2b_hex(nosalt)

normalized_distances = []


for KEYSIZE in range(2, 40):
#我们取其中前6段计算平局汉明距离
b1 = b[: KEYSIZE]
b2 = b[KEYSIZE: KEYSIZE * 2]
b3 = b[KEYSIZE * 2: KEYSIZE * 3]
b4 = b[KEYSIZE * 3: KEYSIZE * 4]
b5 = b[KEYSIZE * 4: KEYSIZE * 5]
b6 = b[KEYSIZE * 5: KEYSIZE * 6]

normalized_distance = float(
hamming_distance(b1, b2) +
hamming_distance(b2, b3) +
hamming_distance(b3, b4) +
hamming_distance(b4, b5) +
hamming_distance(b5, b6)
) / (KEYSIZE * 5)
normalized_distances.append(
(KEYSIZE, normalized_distance)
)
normalized_distances = sorted(normalized_distances,key=lambda x:x[1])


for KEYSIZE,_ in normalized_distances[:5]:
block_bytes = [[] for _ in range(KEYSIZE)]
for i, byte in enumerate(b):
block_bytes[i % KEYSIZE].append(byte)
keys = ''
try:
for bbytes in block_bytes:
keys += break_single_key_xor(bbytes)
key = bytearray(keys * len(b), "utf-8")
plaintext = bxor(b, key)
print("keysize:", KEYSIZE)
print("key is:", keys, "n")
s = bytes.decode(plaintext)
print(s)
except Exception:
continue

References

https://fitzbc.github.io/2019/08/04/De1CTF-xorz-Writeup/

https://www.anquanke.com/post/id/161171#h2-0

欢迎关注我的其它发布渠道

------------- 💖 🌞 本 文 结 束 😚 感 谢 您 的 阅 读 🌞 💖 -------------