Cryptocurrency from a Computer Science Lens

All good software engineering involves the use of computer science to design the way the desired software will operate. Today, the variety of cryptocurrencies all have different goals and road maps towards their implementation, but I have yet to find one project that doesn't use hashing as a foundational operation.

Hashing

Hashing is a cryptographic function in computer science that takes data of any input size and maps it to a hash with a given fixed size. If designed well, it is practically impossible to run inversely, and can ultimately be used as a one-way function. In practice, cryptographic hashing can be used to securely store passwords of users in a database (in this case, encryption would not be a suitable way to store passwords securely). Some criteria for a well designed and implemented cryptographic hashing function are as follows:

  1. It should be impossible to recover the input to the hashing function by using the output of the function.
  2. It should avoid collisions, meaning each unique input should have a unique output.
  3. The smallest possible change to an input should change the resulting hash value.
  4. The hash should be fast to compute for any kind of input data.
  5. Identical inputs should have identical outputs (this is true for any mathematical function in general).

It is essential to keep in mind that not all hashing functions meet these four important criteria. The MD5 hash, for example, was intended to be a cryptographic hashing function, but in the last few decades, it has been shown to have serious vulnerabilities. So why might a previously viable cryptographic function no longer be secure? In some cases, computer hardware will catch up and surpass the computational power to break certain cryptographic functions. For example, standard hardware can now find MD5 collisions rather quickly. Alternatively, non-brute-force algorithms may be found that will invert a poorly designed hashing function.

Hashing's Place in Cryptocurrency

Now if you understand the basics of hashing, it is quite simple to see how useful it is for computations and record keeping on cryptocurrency ledgers. If you have been in the cryptocurrency space for a while, you may already know that each block in the blockchain has a distinct hash. Specifically for a Bitcoin block hash you will notice it begins with many zeros.

Bitcoin and SHA256

In general, Bitcoin uses the sha256 cryptographic hashing function to keep a record of its blocks. The input data to this hashing function is the list of all the transactions that a miner is accepting, and a specific number that must be found.

This specific number referred to as the nonce along with the transactions will produce a unique hash for the given block. Great! So now we have a hash of the block that anyone can reproduce so long as they use the same transactions and nonce. What's even better is that the Bitcoin network uses the hashes in a manner that keeps block generation timing consistent.

Block Timing

If we pick the first hash that someone comes up with, Bitcoin blocks would be generated almost immediately, and that isn't particularly helpful for a ledger or the blockchain data structure that Bitcoin uses. Instead, for the network to accept a block, the hash given must have a minimum number of leading zeros. Because sha256 can't be reversed, the only way to find the nonce that will provide us with this hash is to try many different values for the nonce. Once a hash is produced that has the correct number of leading zeros or more, the rest of the network is notified of the mining difficulty(required leading zeros) transactions, and nonce used to produce the hash. Each mining node can very quickly verify that the transactions and nonce do, indeed give a valid hash. It is then recorded on the ledger with a time-stamp and all the corresponding data about the block.

As new miners join and others stop, the network will compensate for fluctuating computing power by adjusting the block difficulty, by changing the minimum required leading zeros in the hash. Every 2016 blocks, the timestamps are checked to analyze how quickly each block was found on average. For Bitcoin specifically, the ideal is ten minutes for each block. So if miners are finding this on average faster than ten minutes, the difficulty will be increased to make it take longer to generate blocks. Likewise, if miners are finding blocks in more than ten minutes, the difficulty will be reduced in an attempt to bring the average time back down.

Example

You can take a look at a specific block to see each of this in practice. Block 570000 has the resulting hash, as well as the nonce, timestamp, difficulty, and other useful information about the block. If you have the transaction data, you should be able to make a sha256 hash of the transactions, nonce, and you can see you should get the exact hash of that block or any other block that you pick.

Other Algorithms

Not all cryptocurrencies use sha256. There are a variety of other hashing algorithms that exist in the cryptocurrency ecosystem:

  • Litecoin - Scrypt
  • Dogecoin - Scrypt
  • Dash - X11
  • Bitcoin Cash - SHA256
  • Monero - CryptoNight-R

Interesting Block Time Quirks

If you are at all interested in Monero, you may have been paying attention to the algorithm hardfork that occurred this past March to kick potential ASIC or FPGA miners off of the network. In general, the target time for a Monero block is two minutes. But if you use a block explorer, you can see a very interesting effect of this fork. On block 1788000 the Monero hard fork occurred, you can see that from block 1787999 it took over two hours to find the next nonce. This is because the difficulty of Monero is adjusted with very similar reasoning to Bitcoin. At the time the hard fork occurred, the combined computing power of the network was nearly 1 Gh/s. As expected, when the algorithm changed, many powerful ASIC machines were unable to continue guessing nonces for the hash. The network isn't able to respond to a change in computer power instantly, so, on block 1788000, miners with a combined 270 Mh/s were trying to solve a block with difficulty designed for an ecosystem with a combined 1 Gh/s of hashing power. The way Monero determines difficulty takes into account the previous seven hundred and twenty block times, excluding the sixty highest and lowest times. The average block time is then calculated from the remaining six hundred blocks. For this reason, it wasn't until around a day later that the Monero block time returned to normal.

I hope this gave you a little more insight into the nature of cryptocurrency mining. Whether you are mining coins yourself or just interested in the topic it helps to be aware of what's going on under the hood when people talk about block hashes and mining difficulty.