The cryptographic foundation of blockchain

The cryptographic foundation of blockchain

Original: shuwoom.com/?p=643

1. hash algorithm

(1) Concept

Hash algorithm is also called hash function, hash algorithm, and hash function. It can generate a fixed-length message digest from any data length (as shown in the figure below). For a particular message, the message digest (or hash value) can be regarded as the fingerprint of the message, and the message is the only representation. The hash function is the core part of digital signature and verification.

(2) The characteristics of the hash algorithm

  • Unidirectional
    hash function must be one-way, that is to say, given an input, a hash value can be obtained through the hash function, but conversely, given a hash value, the input cannot be obtained. In other words, given a hash value (fingerprint), it is impossible for us to find the corresponding message.
  • Conflict avoidance
    is difficult to find two different messages, so that the generated hash values are consistent (a conflict occurs).

  • Any slight change in the input of sensitive raw data will result in a completely different hash value

  • Given a message and a hash algorithm in a forward direction, the hash value is calculated within a limited time and effective resources

(3) Hash collision

Hash collision means that there are different messages (inputs), so that the output of the hash function, that is, the hash value, is the same. This probability is very, very low. Take the following md5 as an example, the two inputs are slightly different, but the hash value is the same:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
# coding: utf-8
import hashlib
# HEX
a = bytearray.fromhex("0e306561559aa787d00bc6f70bbdfe3404cf03659e704f8534c00ffb659c4c8740cc942feb2da115a3f4155cbb8607497386656d7d1f34a42059d78f5a8dd1ef")
b = bytearray.fromhex("0e306561559aa787d00bc6f70bbdfe3404cf03659e744f8534c00ffb659c4c8740cc942feb2da115a3f415dcbb8607497386656d7d1f34a42059d78f5a8dd1ef")
# MD5
print(hashlib.md5(a).hexdigest())
print(hashlib.md5(b).hexdigest())
### a b
cee9a457e790cf20d4bdaa6d69f01e41
cee9a457e790cf20d4bdaa6d69f01e41

Take CRC32 as an example. Its digest length is 32bit, which means there are 2^32 possibilities. Today, the total number of Internet pages has exceeded 1 trillion in 2008. If the CRC32 digest is used, then a lot of hash collisions will occur. The MD5 digest length is 128bit, which means there are 2^128 possibilities, but the MD5 algorithm has been proven to be insecure. The same SHA1 algorithm (160bit) has also been proved to be insecure. It is recommended to use the SHA256 algorithm to be more secure.

(4) Common hash algorithms

Currently the more popular hash algorithms are MD5, SHA-1 and SHA-2

  • MD4

MD4 has been proven to be insecure and is not recommended

  • MD5

MD5 is an improved version of MD4, and the output is 128bit. Similarly, MD5 has also been proven not to have strong collision resistance and is not safe enough.

  • SHA-1

SHA is a family of hash functions. The length of the hash value of the SHA-1 algorithm is 160 bits, which is more resistant to exhaustion than MD5 and MD4. However, SHA-1 has been proven not to have "strong collision resistance" (Google has breached SHA-1),

  • SHA-2

The SHA-224, SHA-256, SHA-384, and SHA-512 algorithms are collectively referred to as SHA-2, and the algorithm principle is similar to SHA-1.

In summary, MD5 and SHA-1 are no longer secure enough, it is recommended to use at least the SHA-256 algorithm.

(5) Application

The application of hash algorithm is mainly reflected in the following three aspects:

  • Document verification
  • digital signature
  • Authentication protocol (secure communication)

2. Asymmetric encryption technology

(1) Concept

Asymmetric encryption algorithm is also called public key encryption algorithm, which is divided into three parts: public key, private key, and encryption and decryption algorithm.
Asymmetric encryption often requires a cryptographically secure pseudo-random number generator to generate a pair of secret keys (public key and private key). The two are paired. The public key can be disclosed, while the private key is kept by the user. . Data encrypted with the private key can only be decrypted with the public key (conversely, data encrypted with the public key can only be decrypted with the private key). This mathematical relationship between the public key and the private key allows the private key to be used to generate a signature for a specific message. And this signature can be verified by the public key without exposing the private key.

That is to say, I sign a message with a private key (that is, encrypt), and then send this data together with the signature and my public key to the other party, and the other party can verify the signature (decrypt) and compare the data with the public key to verify The validity of the data.
In the Bitcoin system, the wallet address generated by the public key is used to receive bitcoins, and the private key is used to sign transactions when paying in bitcoins.
When paying for bitcoin, the owner of the bitcoin needs to submit his public key and signature of the transaction in the transaction. All nodes in the Bitcoin network can verify the submitted public key and signature to confirm the payer's ownership of the transaction Bitcoin.

(2) Encryption and decryption process

  • Encryption process: Encrypt the plain text through the encryption algorithm and the public key to obtain the cipher text
  • Decryption process: Decrypt by decryption algorithm (here, encryption and decryption algorithms must be the same) and secret key to get the plaintext.

As shown in the figure below, the encryption process is one-way, and the ciphertext encrypted by the public key can only be decrypted with the corresponding private key.

(3) Common asymmetric encryption algorithms

  • RSA
  • ElGamal
  • Knapsack algorithm
  • Rabin (a special case of RSA)
  • Public Key Encryption Algorithm in Diffie-Hellman Key Exchange Protocol
  • Elliptic Curve Cryptography (English: Elliptic Curve Cryptography, ECC)

The most widely used is the RSA algorithm (from the acronyms of the inventors Rivest, Shmir, and Adleman), which is a well-known public key encryption algorithm, and ElGamal is another commonly used asymmetric encryption algorithm.

(4) Application

The application of asymmetric encryption algorithm is very extensive, the following are the common ones

  • digital signature
  • Digital certificate
  • Encrypted communication HTTPS

(5) Python programming implementation

Generate key pair

1
2
3
4
5
6
7
8
def create_genisus_keypair():
#
pubkey, privkey = rsa.newkeys(1024)
with open('genisus_public.pem', 'w+') as f:
f.write(pubkey.save_pkcs1().decode())
with open('genisus_private.pem', 'w+') as f:
f.write(privkey.save_pkcs1().decode())

Open genisus_public.pem and genisus_private.pem directly, we can see a long string of strings.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
genisus_public.pem
-----BEGIN RSA PUBLIC KEY-----
MIGJAoGBAJIiGuUOlhKFQEHIr5YXa9uajM+sI5FEZ/8RJdR5EOC4Wo+9bZfyrvnu
PRBtK7PJzUXHdXCaNohPzA5IuiEpkoELPyWfCozQF9FAgK2Gyf1rBeC7e2lUvuwI
h5IXhRjC2Rom6wiJRWXn/W0/EezrXl8YlYmrRR1Boa9OB1RFJvJXAgMBAAE=
-----END RSA PUBLIC KEY-----
genisus_private.pem
-----BEGIN RSA PRIVATE KEY-----
MIICYAIBAAKBgQCSIhrlDpYShUBByK+WF2vbmozPrCORRGf/ESXUeRDguFqPvW2X
8q757j0QbSuzyc1Fx3VwmjaIT8wOSLohKZKBCz8lnwqM0BfRQICthsn9awXgu3tp
VL7sCIeSF4UYwtkaJusIiUVl5/1tPxHs615fGJWJq0UdQaGvTgdURSbyVwIDAQAB
AoGAOiw7ep3A3iSPfOCQDXbLaANxNKa5DfYmVCKWZavALUUWQAxPmWJxh2rwgh6D
fDHEdpe9R5MMTF0/xRvsQGQvZU2sNNhA2ebVOB4mMCPcURYWCbJXjT14IC7rfwPp
YnkpiCHxP+HBS2xjMyf63vk7tKR/zCfzm9tsDAKqKuFUYPECRQCrQpiuYxp2Znch
szwR15q0HUpU8/HdCQlKdBK2fa4M2Y1xPqNjpOlQ2B8zCS3aOQ/9nhbVaRy0LqFD
sUjPPJsTqbaSyQI9ANpwsv1MHkpYGlYSrCItFS1oRoD/foxByVdVORCtarA0z9xe
Am8kEi9HcnZf2jsYvqHAeQsTtuPw0PvMHwJEXL0+csilztHj1zL452yKkNh/pQtI
wPogttmuPHZIZxrz9gwGbHIkCixOkNN6qf5Wg281TDGUYpoRp9d75wUZsQcpH8kC
PE0rvYBREO5w27UG2bslNDMbgLT4DkwcvbXVzNhAe82OitSufauoEaiUVDLPwDha
kJZyehDYwSccH6ilPwJFAIhBzCQ61A2zfEJlgKeuX3OJGq1pywpnv+vFyqnDc4T6
oEW9kfnrAI+6x4L4jyyHOWMNfkAPajtkQ+YwxZqUmKL8ixr0
-----END RSA PRIVATE KEY-----

encrypt and decode

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
#
with open('genisus_private.pem', 'r') as f:
privkey = rsa.PrivateKey.load_pkcs1(f.read().encode())
with open('genisus_public.pem', 'r') as f:
pubkey = rsa.PublicKey.load_pkcs1(f.read().encode())
#
message = 'hello world'
#
crypto = rsa.encrypt(message.encode(), pubkey)
#
message = rsa.decrypt(crypto, privkey).decode()
print(message)

3. digital signature

(1) Concept

A digital signature is a digital string that can only be generated by the sender of the information, and this digital string is also an effective proof of the authenticity of the information sent by the sender. It is a universal technology for identity authentication and secure online transactions.
The digital signature uses the asymmetric encryption technology (public key encryption algorithm) and hash algorithm mentioned in the previous two sections.

(2) Function

  • Digital signature has three main functions:
  • Ensure the integrity of information transmission (integrity)
  • Sender's identity authentication (authentication)
  • To prevent denial (non-repudiation) in transactions,
    digital signatures are an encryption process, and digital signature verification is a decryption process.

(3) Signature and verification

  • The sender s signature process The
    digital signature is a digest of the information that the sender first obtains through a hash algorithm

    Then sign the digest with the secret key

    Finally, the signature and the original text are sent to the recipient
  • Receiver verification process
    digital signature verification, after the receiver receives the signature and original text sent by the sender, it first separates

    Extract the digest of the original text using the same hash algorithm as the sender

    Then use the sender's public key to decrypt the signature to obtain the decrypted digest

    Comparing the two, if they are consistent, it means that the received information is complete and has not been modified during transmission.

    Here, if you think carefully, the reader will find that if a hacker replaces the sender's private key and the sender's public key possessed by the receiver with the hacker's own private key and public key. At this time, the receiver cannot find the problem during verification, because the private key used for signature is the private key of the hacker, and the public key used for verification is also the public key of the hacker, and the two are a pair. This becomes the verification of the legality of the public key, so to solve this problem involves the concept of digital certificates (as shown in the figure below). This reader can read another article I wrote "What is a digital certificate", I won't go into it here.

(4) Application

Digital signature technology is widely used, such as personal secure mail certificates, access to secure https sites, online signing, electronic transactions, etc.

(5) Python programming implementation

First generate a pair of secret keys (public key and private key)

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
#
message = 'hello world'
#
with open('genisus_private.pem', 'r') as f:
privkey = rsa.PrivateKey.load_pkcs1(f.read().encode())
with open('genisus_public.pem', 'r') as f:
pubkey = rsa.PublicKey.load_pkcs1(f.read().encode())
#
signature = rsa.sign(message.encode(), privkey, 'SHA-1')
#
try:
rsa.verify(message.encode(), signature, pubkey)
except rsa.pkcs1.VerificationError:
print 'invalid'

Follow my WeChat public account (shuwoom's blog) and push articles regularly every week:

Reference:
https://www.jianshu.com/p/bf1d7eee28d0
https://zh.wikipedia.org/wiki/%E6%95%A3%E5%88%97%E5%87%BD%E6%95% B8
http://www.infoq.com/cn/news/2017/02/google-first-sha1-collision
http://blog.jobbole.com/106733/
https://baike.baidu.com/item/%E6%95%B0%E5%AD%97%E7%AD%BE%E5%90%8D
https://blog.csdn.net/u011630575/article/details/53241027