Welcome to BrainLab's hands-on training sessions! The purpose is to illustrate the concepts behind the demos in the course in a way that allows you to explore different functions, arguments and examples as you master the material.
In this Notebook we'll address the importance of hashing in the blockchain.
Hashing is what allows us to quickly verify that the data in the blockchain has not been altered: it gives us a way to create a digital fingerprint for an arbitrary block of data.
We'll now dive into how hashing works in Python (though the same concepts apply regardless of the programming language we're using).
SHA-256 in Python
First, we'll import the hashlib module from Python's standard library. Its sha256 function implements the hashing algorithm that's used in the Bitcoin blockchain. Then, we will test the function with various arguments in order to check that every input leads to a different hexadecimal string.
It's also worth mentioning that the name sha256 stands for "Secure Hash Algorithm" with a 256-bit output, which means there are 2 to the power of 256 (or 115792089237316195423570985008687907853269984665640564039457584007913129639936) possible hashes.
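As a quick check of those figures, the short sketch below uses only hashlib and Python's built-in integers. It confirms that a SHA-256 digest is 256 bits long and prints the exact value of 2 to the power of 256; the variable names are chosen just for this example.

```python
import hashlib

# digest_size is reported in bytes; multiply by 8 to get bits.
digest_bits = hashlib.sha256().digest_size * 8
print(digest_bits)

# Python integers have arbitrary precision, so 2**256 is computed exactly.
possible_hashes = 2 ** digest_bits
print(possible_hashes)
```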
Let's get started!
import hashlib


def create_hash(a):
    """Return the sha256 of the provided string."""
    # sha256 works on bytes, so the string is encoded as UTF-8 first.
    return hashlib.sha256(a.encode('utf-8')).hexdigest()


def print_hash(a):
    """Print the sha256 of the provided string."""
    print(f"create_hash('{a}')\t=>\t{create_hash(a)}\n")
Now that we have our create_hash() function, we're ready to try out different inputs. We've also defined a secondary function, print_hash(), to display the hash in a helpful format.
Let's get the hash of dog first and see how it compares to a few other strings: cat and mouse.
# Hash of "dog"
print_hash('dog')
# Hash of "cat"
print_hash('cat')
# Hash of "mouse"
print_hash('mouse')
Well, the hashes of dog, cat and mouse were completely different from each other. No surprise, though, since the three terms didn't have much in common to begin with.
What happens, however, if we try to compute the hashes of terms that are not too different from each other? Say, dog and dog1, which only differ by one character? Or how about dog and Dog?
# Hash of "dog"
print_hash('dog')
# Hash of "dog1"
print_hash('dog1')
# Hash of "Dog"
print_hash('Dog')
Notice how the hashes are completely different from each other, even though the inputs were somewhat similar!
This is, precisely, what makes hashing powerful. Two strings that are the least bit different from each other will lead to completely different outputs.
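One way to put a number on "completely different" is to count how many of the 256 output bits change between two hashes. The sketch below is illustrative only: count_differing_bits is a helper name made up for this example, not part of hashlib. For a well-behaved hash function, roughly half of the bits would be expected to flip, even when the inputs differ by a single character.

```python
import hashlib


def count_differing_bits(a, b):
    """Count the bits that differ between the sha256 digests of two strings.

    Hypothetical helper for illustration; not part of hashlib.
    """
    digest_a = hashlib.sha256(a.encode('utf-8')).digest()
    digest_b = hashlib.sha256(b.encode('utf-8')).digest()
    # XOR each pair of bytes: a 1 bit in the result marks a position that differs.
    return sum(bin(x ^ y).count('1') for x, y in zip(digest_a, digest_b))


print(count_differing_bits('dog', 'dog1'))
print(count_differing_bits('dog', 'Dog'))
print(count_differing_bits('dog', 'dog'))  # identical inputs, so no bits differ
```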
These outputs, which in the case of sha256 are always 64 hexadecimal characters long, can be compared in an instant. If two hashes are identical, we can assume that their input strings were identical too (two different inputs producing the same hash is possible in theory, but so unlikely that it can be ignored in practice). If the hashes differ, then the input strings are guaranteed to differ, no matter how small the change was!
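To see the fixed output length in practice, the sketch below hashes inputs of very different sizes and checks the length of each result. The sha256_hex helper is a stand-in defined here so the snippet runs on its own; it mirrors create_hash() from earlier in the notebook.

```python
import hashlib


def sha256_hex(text):
    """Return the sha256 hex digest of a string (same idea as create_hash)."""
    return hashlib.sha256(text.encode('utf-8')).hexdigest()


inputs = ['', 'dog', 'dog' * 10000]
for text in inputs:
    digest = sha256_hex(text)
    # Every digest is 64 hex characters, whatever the input length.
    print(len(text), len(digest))

# Comparing two hashes is a plain string comparison.
print(sha256_hex('dog') == sha256_hex('dog'))  # True: same input, same hash
print(sha256_hex('dog') == sha256_hex('Dog'))  # False: different input
```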
On the blockchain, all of this means that, if someone were to tamper with a block of data in the most minor way (even changing a lowercase letter to an uppercase one!), its resulting hash would be completely different.
Then, the rest of the network would learn that a block has been tampered with and discard it!
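The tamper check described above can be sketched in a few lines. This is a simplified illustration, not how any particular blockchain implements it: the block contents, the recorded_hash variable, and the is_untampered helper are all made up for this example.

```python
import hashlib


def is_untampered(data, recorded_hash):
    """Return True if data still hashes to the value recorded earlier.

    Hypothetical helper for illustration only.
    """
    current_hash = hashlib.sha256(data.encode('utf-8')).hexdigest()
    return current_hash == recorded_hash


# Made-up block contents and the hash recorded when the block was created.
block_data = 'alice pays bob 5'
recorded_hash = hashlib.sha256(block_data.encode('utf-8')).hexdigest()

print(is_untampered(block_data, recorded_hash))          # True: nothing changed
print(is_untampered('Alice pays bob 5', recorded_hash))  # False: one letter changed
```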
Now that we understand how hashing works, we'll move on to how it's used with blocks.