Blockchain Part 1: How Exactly Does It Work?

Blockchain for real estate, blockchain for resumes, blockchain for banking… everyone is talking about how blockchain is going to disrupt everything, so let’s get under the hood a bit.

After talking to a lot of smart people about what seems to be a buzzword these days, I realized that a lot of people may not actually understand how Bitcoin, cryptocurrency and blockchain work.

A Typical Database

A blockchain is a type of database that works differently from the databases that came before it. Databases are nothing new: they run just about everything that makes modern technology possible. However, traditional databases are centralized, meaning there is one central architect, manager and keeper of all of the data: the creator of the data.

Anyone accessing this data is getting it from a single source. This makes keeping and maintaining data risky in several ways: the single copy is a target for hacking, and if it contains any misinformation, there’s no independent way to verify it. If you’re using the data, you’re completely reliant on the single source.

How Blockchain Works

A blockchain is a database too, but it’s a distributed database. It works particularly well for keeping track of transactions that happen, one after the next, after the next, after the next. Think: the path of money, real estate transactions, a person’s resume history.

Few have explained this better than Sean Han in his interactive blockchain demo. The screenshots below are from that excellent tutorial. Feel free to follow along!

A blockchain is a list of blocks, and each block represents a transaction or an event. It always starts with one block called the genesis block.

This is a block on the blockchain:

‘Previous Hash’ is ‘0’ because nothing came before this block; it’s the first one. The index is this transaction or block’s position on the chain. This index is ‘Genesis’. The next will be ‘Block # 1’.

There’s a timestamp record of when this block was mined, or created.

This is the hash:

A hash looks like a bunch of random numbers and letters. It’s the digital fingerprint of the information contained on the block. In this example, a valid hash starts with four zeros.

The hash is generated by a cryptographic function called SHA-256, and it has meaning. It is computed from the block index, the previous hash, the actual data about this transaction/block, the timestamp and the nonce. The individual characters don’t map to those fields one by one; the hash is a fingerprint of all of them together. When any of these things change, a new and completely different hash is created.
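To make that concrete, here is a minimal Python sketch of how such a hash could be computed. The function name `block_hash`, the sample timestamp and data, and the choice to simply join the five fields into one string are assumptions for illustration; a real blockchain defines its own exact encoding.

```python
import hashlib

def block_hash(index: int, previous_hash: str, timestamp: int, data: str, nonce: int) -> str:
    """Return the SHA-256 hex digest of the block's fields joined into one string."""
    # Assumption: fields are concatenated in this order before hashing.
    payload = f"{index}{previous_hash}{timestamp}{data}{nonce}"
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()

# Hypothetical genesis-style block: index 0, previous hash "0".
original = block_hash(0, "0", 1508270000000, "Welcome to the blockchain!", 0)

# Same block with a single character of the data changed.
tampered = block_hash(0, "0", 1508270000000, "Welcome to the blockchain?", 0)

print(original)
print(tampered)
print(original != tampered)  # True
```

The two printed hashes look unrelated even though only one character of the input differs, which is what makes the hash useful as a fingerprint.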

In this case, the previous hash is 0 because it’s the genesis block. The next block will contain this hash as its previous hash.

This is the data held on the block, the purpose for this new block. If this were Bitcoin for example, it would be the transaction. If this were real estate, it would describe that transaction.

This is important: if you were to go back and change the data in any way, the hash would change, because the data is one of the inputs the hash is computed from. The new hash would almost certainly not have the four leading zeros, and this block would become invalid.

The number of zeros required at the front of a valid hash is called the difficulty, in this case four. When data changes, the recomputed hash will almost certainly lack those four leading zeros, making the block invalid.
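As a rough sketch, the difficulty check is just a test on the start of the hash. The helper name `is_valid_hash` and the sample hash strings are made up for illustration. Since each hexadecimal character has 16 possible values, a random hash passes a four-zero check about once in 16 to the 4th power tries.

```python
DIFFICULTY = 4  # number of leading zeros required, as in the example above

def is_valid_hash(hash_hex: str, difficulty: int = DIFFICULTY) -> bool:
    """Return True if the hash starts with at least `difficulty` zeros."""
    return hash_hex.startswith("0" * difficulty)

print(is_valid_hash("0000a3f1"))  # True
print(is_valid_hash("000fa3f1"))  # False

# Each hex character has 16 possible values, so roughly 1 hash in 16**4 qualifies.
print(16 ** DIFFICULTY)  # 65536
```

That is why a tampered block practically never stays valid by accident: the odds are about 1 in 65,536 at this difficulty, and they shrink by a factor of 16 for every extra zero required.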

Because each block’s hash is computed from the previous block’s hash, changing one block invalidates every subsequent block on the chain as well. Someone can’t go back to a prior transaction and make a change without it being known to everyone.
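Here is a small Python sketch of that chain reaction. Everything in it (the `Block` class, the field order used for hashing, the sample transactions and timestamps) is a simplified assumption for illustration, not how any particular blockchain is implemented.

```python
import hashlib
from dataclasses import dataclass

DIFFICULTY = 4  # leading zeros required for a valid hash

@dataclass
class Block:
    index: int
    previous_hash: str
    timestamp: int
    data: str
    nonce: int = 0
    hash: str = ""

    def compute_hash(self) -> str:
        # Assumption: the five fields are concatenated in this order.
        payload = f"{self.index}{self.previous_hash}{self.timestamp}{self.data}{self.nonce}"
        return hashlib.sha256(payload.encode("utf-8")).hexdigest()

    def mine(self) -> None:
        """Try nonces 0, 1, 2, ... until the hash meets the difficulty."""
        self.nonce = 0
        self.hash = self.compute_hash()
        while not self.hash.startswith("0" * DIFFICULTY):
            self.nonce += 1
            self.hash = self.compute_hash()

def build_chain(entries):
    """Mine one block per entry, linking each to the hash of the one before."""
    chain, previous_hash = [], "0"
    for index, data in enumerate(entries):
        block = Block(index, previous_hash, 1_700_000_000 + index, data)
        block.mine()
        chain.append(block)
        previous_hash = block.hash
    return chain

def first_invalid_index(chain):
    """Return the index of the first invalid block, or None if the chain is valid."""
    previous_hash = "0"
    for block in chain:
        stored_ok = block.hash == block.compute_hash()
        difficulty_ok = block.hash.startswith("0" * DIFFICULTY)
        link_ok = block.previous_hash == previous_hash
        if not (stored_ok and difficulty_ok and link_ok):
            return block.index
        previous_hash = block.hash
    return None

chain = build_chain(["genesis", "Alice pays Bob 5", "Bob pays Carol 2"])
print(first_invalid_index(chain))  # None: the untouched chain is valid

chain[1].data = "Alice pays Bob 500"  # tamper with an earlier block
print(first_invalid_index(chain))  # 1: its stored hash no longer matches its contents

chain[1].mine()  # the attacker re-mines the tampered block...
print(first_invalid_index(chain))  # 2: ...but the next block still points at the old hash
```

Even after the tampered block is re-mined, the damage just moves one block down the chain, so an attacker would have to redo the work for every block that follows.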

Blockchain Mining

Mining is how a blockchain is born. It’s the process of finding a valid and unused hash.

The nonce is a number the miner changes on each try until the hash comes out valid, so here it records how many iterations the miner had to try. In this case, the miner that created this genesis block went through 77,117 different versions before finding a valid hash to start the blockchain.
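The mining loop itself is simple enough to sketch in a few lines of Python. The function name `mine`, the field order and the sample block values are assumptions for illustration, so the nonce it finds will not match the 77,117 from the example above.

```python
import hashlib

def mine(index: int, previous_hash: str, timestamp: int, data: str, difficulty: int = 4):
    """Count the nonce up from 0 until the block's hash has enough leading zeros."""
    target = "0" * difficulty
    nonce = 0
    while True:
        # Assumption: the five fields are concatenated in this order before hashing.
        payload = f"{index}{previous_hash}{timestamp}{data}{nonce}"
        digest = hashlib.sha256(payload.encode("utf-8")).hexdigest()
        if digest.startswith(target):
            return nonce, digest
        nonce += 1

# Hypothetical genesis-style block.
nonce, digest = mine(0, "0", 1508270000000, "Welcome to the blockchain!")
print(f"nonce: {nonce}")
print(f"hash:  {digest}")
```

There is no shortcut for finding the nonce other than trying values one after another, which is why mining takes real computing work while checking the result takes a single hash.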

Here’s the really cool part: a global network of computers called nodes verifies every block in the chain. Nodes are the machines that are connected to the network. That’s how you know it’s valid: many independent nodes around the world reach a consensus. To defraud someone and pass along false data, you’d have to get most of the network to accept that same exact false data. This peer-to-peer verification changes the security and fraud game versus a traditional database, where changing information in one single place can rewrite history.
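As a toy illustration of the idea, imagine each node reporting the hash of the latest block it holds, and the network accepting whichever hash more than half of them agree on. This simple majority rule, the node names and the shortened hashes are all assumptions made up for this sketch; real networks use more involved rules (Bitcoin nodes, for instance, follow the valid chain with the most accumulated proof-of-work).

```python
from collections import Counter

def majority_tip(node_tips: dict):
    """Return the latest-block hash reported by more than half of the nodes, or None."""
    tip, votes = Counter(node_tips.values()).most_common(1)[0]
    return tip if votes > len(node_tips) / 2 else None

# Hypothetical nodes and shortened hashes, for illustration only.
node_tips = {
    "node-a": "0000a1",
    "node-b": "0000a1",
    "node-c": "0000a1",
    "node-d": "0000ff",  # a node holding a tampered copy
}

agreed = majority_tip(node_tips)
outliers = sorted(name for name, tip in node_tips.items() if tip != agreed)

print(agreed)    # 0000a1
print(outliers)  # ['node-d']
```

One node with a different history simply gets outvoted, which is the intuition behind why rewriting the record in a single place is not enough.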