Welcome to BrainLab's hands-on training sessions! Their purpose is to illustrate the concepts behind the course demos in a way that lets you explore different functions, arguments and examples as you master the material.

Hashing

Why It's Important

In this notebook we'll address the importance of hashing in the blockchain.

Hashing is what allows us to quickly verify that the data in the blockchain has not been altered. It does so by producing a short, fixed-size digital fingerprint (called a hash, or digest) for an arbitrary block of data.

How It Works

We'll now dive into how hashing works in Python (though the same concepts apply regardless of the programming language we're using).

SHA-256 in Python

First, we'll import the hashlib module from Python's standard library. It provides sha256, the hashing algorithm that's used in the Bitcoin blockchain. Then, we will test the function with various arguments in order to check that every input leads to a different hexadecimal string.

It's also worth mentioning that the name sha256 stands for "Secure Hash Algorithm" with a 256-bit output, which means there are 2 to the power of 256 (or 115792089237316195423570985008687907853269984665640564039457584007913129639936) possible hashes.
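As a quick check of these numbers, here is a minimal sketch that uses only hashlib and Python's built-in integers. The variable names are ours, chosen purely for illustration.

```python
import hashlib

# A sha256 digest is 32 bytes long, which is 256 bits.
digest_bits = hashlib.sha256().digest_size * 8
print(digest_bits)

# Number of distinct values that fit in 256 bits.
possible_hashes = 2 ** digest_bits
print(possible_hashes)

# Each hexadecimal character encodes 4 bits, so a hex digest has 64 characters.
hex_length = digest_bits // 4
print(hex_length)
```

The last value explains why every hash printed in this notebook has the same length, no matter how long the input was.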

Let's get started!

In [53]:
import hashlib


def create_hash(a):
    """Return the sha256 of the provided string."""
    
    return hashlib.sha256(a.encode('utf-8')).hexdigest()


def print_hash(a):
    """Print the sha256 of the provided string."""
    
    print(("create_hash('%s')\t=>\t" % (a)) + create_hash(a) + '\n')

Examples

Now that we have our create_hash() function, we're ready to try out different inputs. We've also defined a secondary function, print_hash(), to display the hash in a helpful format.

Let's get the hash of dog first and see how it compares to a few other strings: cat and mouse.

In [54]:
# Hash of "dog"
print_hash('dog')

# Hash of "cat"
print_hash('cat')

# Hash of "mouse"
print_hash('mouse')
create_hash('dog')	=>	cd6357efdd966de8c0cb2f876cc89ec74ce35f0968e11743987084bd42fb8944

create_hash('cat')	=>	77af778b51abd4a3c51c5ddd97204a9c3ae614ebccb75a606c3b6865aed6744e

create_hash('mouse')	=>	47c5c28cae2574cdf5a194fe7717de68f8276f4bf83e653830925056aeb32a48

Well, the hashes of dog, cat and mouse were completely different from each other. No surprise, though, since the three terms didn't have much in common to begin with.

What happens, however, if we try to compute the hashes of terms that are not too different from each other? Say, dog and dog1, which only differ by one character? Or how about dog and Dog?

In [55]:
# Hash of "dog"
print_hash('dog')

# Hash of "dog1"
print_hash('dog1')

# Hash of "Dog"
print_hash('Dog')
create_hash('dog')	=>	cd6357efdd966de8c0cb2f876cc89ec74ce35f0968e11743987084bd42fb8944

create_hash('dog1')	=>	5d50a42e107cae9ce625a95646578db5a56c52c9fb30e3f08de5c0ab88c0573b

create_hash('Dog')	=>	0eb129bf94594aaeee66e38361d7be212cd927c3df4dd92e3ded2e0da0c7ad88

Notice how the hashes are completely different from each other, even though the inputs were somewhat similar!

This is precisely what makes hashing powerful: two strings that differ even slightly will lead to completely different outputs.

These outputs (which, in the case of sha256, will always be 64 hexadecimal characters long) can be compared in an instant. If two hashes are different, we know for certain that their input strings were different, no matter how similar those strings looked. If two hashes are identical, we can assume that their input strings were also identical, because finding two different inputs with the same sha256 hash is considered computationally infeasible.
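To make the comparison concrete, the sketch below redefines create_hash() so that it runs on its own, and adds a hypothetical helper, differing_bits() (not part of the original notebook), that counts how many of the 256 bits differ between two hashes. It's meant as an illustration rather than a rigorous measurement.

```python
import hashlib


def create_hash(a):
    """Return the sha256 of the provided string."""
    return hashlib.sha256(a.encode('utf-8')).hexdigest()


def differing_bits(hex_a, hex_b):
    """Count the bit positions at which two hex digests differ (hypothetical helper)."""
    return bin(int(hex_a, 16) ^ int(hex_b, 16)).count('1')


hash_lower = create_hash('dog')
hash_upper = create_hash('Dog')

# Every sha256 hex digest has the same length, whatever the input.
print(len(hash_lower), len(create_hash('a much longer input string')))

# Comparing two hashes is a plain string comparison.
print(hash_lower == create_hash('dog'))  # same input, same hash
print(hash_lower == hash_upper)          # different input, different hash

# Out of 256 bits, how many changed when one letter was capitalised?
print(differing_bits(hash_lower, hash_upper))
```

For a well-behaved hash function we'd expect roughly half of the bits to flip after a one-character change. Try a few other pairs of inputs and see what you get.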

Conclusions

On the blockchain, all of this means that, if someone were to tamper with a block of data in the most minor way (even changing a lowercase letter to an uppercase one!), its resulting hash would be completely different.

The rest of the network would then notice that the block's hash no longer matches the expected one, conclude that the block has been tampered with and discard it!
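The idea can be sketched in a few lines. Note that this is a simplified illustration under our own assumptions: the block contents, the is_untampered() helper and the variable names are all hypothetical, and real blockchains store and validate far more than a single string.

```python
import hashlib


def create_hash(a):
    """Return the sha256 of the provided string."""
    return hashlib.sha256(a.encode('utf-8')).hexdigest()


def is_untampered(data, expected_hash):
    """Return True if the data still hashes to the recorded value (hypothetical helper)."""
    return create_hash(data) == expected_hash


# Hypothetical block contents, with the hash recorded when the block was created.
original_data = 'Alice pays Bob 5 coins'
recorded_hash = create_hash(original_data)

# Unchanged data hashes to the recorded value.
print(is_untampered(original_data, recorded_hash))

# Changing a single uppercase letter to lowercase is enough to break the match.
tampered_data = 'alice pays Bob 5 coins'
print(is_untampered(tampered_data, recorded_hash))
```

The check never needs to know what was changed. It only needs the recorded hash and the data as it stands now.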

Next Steps: Blocks!

Now that we understand how hashing works, we'll move on to how it's used with blocks.