Bitcoin: The Tree of Bytes

26 March 2024

The blockchain truly is a marvelous piece of technology. A mechanism to timestamp the order of digital information without needing to depend on a centralized operator. A decentralized mechanism with no one in charge, that provides undeniable cryptographic guarantees around what data was added to the temporal record in what order. This property is the entire reason Bitcoin is useful as a form of digital money; without it there would be no way for the system to function at all without a centralized authority.

All of these guarantees are provided by three simple technical building blocks: private/public key cryptography, merkle trees, and hash algorithms. Every Bitcoin block is just some extra necessary data wrapped around the root of a merkle tree of all the transactions in it. The rest of the header includes data like the timestamp, difficulty target, block version, the hash of the previous block in the chain, and the nonce that miners vary while hashing the header in search of a hash with enough leading zeros.
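To make the header concrete, here is a minimal Python sketch of how its six fields serialize into 80 bytes and get hashed. The function names are my own for illustration; the field values are the well-known parameters of the genesis block.

```python
import hashlib
import struct


def double_sha256(data: bytes) -> bytes:
    """Bitcoin hashes headers with SHA-256 applied twice."""
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()


def serialize_header(version, prev_hash_hex, merkle_root_hex, timestamp, bits, nonce):
    """Pack the six header fields into the 80-byte wire format.

    Hashes are conventionally displayed byte-reversed, so the hex strings
    are flipped back into internal byte order before being packed.
    """
    return (
        struct.pack("<I", version)
        + bytes.fromhex(prev_hash_hex)[::-1]
        + bytes.fromhex(merkle_root_hex)[::-1]
        + struct.pack("<III", timestamp, bits, nonce)
    )


def header_hash_hex(header: bytes) -> str:
    """Block hash in the usual display order (leading zeros first)."""
    return double_sha256(header)[::-1].hex()


def bits_to_target(bits: int) -> int:
    """Expand the compact difficulty encoding into the full 256-bit target."""
    exponent = bits >> 24
    mantissa = bits & 0xFFFFFF
    return mantissa << (8 * (exponent - 3))


def meets_target(header: bytes, bits: int) -> bool:
    """Proof-of-work check: the hash, read as an integer, must not exceed the target."""
    return int.from_bytes(double_sha256(header), "little") <= bits_to_target(bits)


# Genesis block header fields.
genesis = serialize_header(
    version=1,
    prev_hash_hex="00" * 32,
    merkle_root_hex="4a5e1e4baab89f3a32518a88c31bc87f618f76673e2cc77ab2127b7afdeda33b",
    timestamp=1231006505,
    bits=0x1D00FFFF,
    nonce=2083236893,
)
print(header_hash_hex(genesis))
```

The "leading 0s" miners hunt for are just the visible side of the comparison in meets_target: a hash below the target necessarily starts with a run of zeros.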

Cryptographic Commitments, Publishing and Verification

Miners don’t actually hash the whole block, and they don’t have to, because of how a merkle tree works. Each piece of data in a merkle tree is hashed, and then each pair of hashes is hashed together upwards until you arrive at the single hash of the merkle root. Simply by mining over the header that includes that single hash, miners can prove beyond the shadow of a doubt all the transactions in the block were part of the block they mined, and that it pointed back to a single previous block with a specific set of prior transactions, and so on. In a similar fashion, when people sign Bitcoin transactions, they aren’t signing over the actual transaction’s raw bytes; they’re signing the hash of them. The two are the same thing in terms of cryptographic commitment.
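The pairing process described above can be sketched in a few lines of Python. This follows Bitcoin's convention of double SHA-256 and of duplicating the last hash when a level has an odd number of entries; the transaction ids here are hypothetical placeholders rather than real transactions.

```python
import hashlib


def double_sha256(data: bytes) -> bytes:
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()


def merkle_root(txids):
    """Fold a list of 32-byte transaction ids into a single root hash."""
    if not txids:
        raise ValueError("a block always contains at least one transaction")
    level = list(txids)  # copy so the caller's list is not modified
    while len(level) > 1:
        if len(level) % 2 == 1:
            level.append(level[-1])  # odd level: pair the last hash with itself
        level = [
            double_sha256(level[i] + level[i + 1])
            for i in range(0, len(level), 2)
        ]
    return level[0]


# Hypothetical stand-ins for three transaction ids.
txids = [double_sha256(f"hypothetical tx {n}".encode()) for n in range(3)]
root = merkle_root(txids)
print(root.hex())
```

Changing a single byte of any transaction changes its id, which changes every hash above it, which changes the root in the header the miner did the work on.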

The way cryptographic commitments work in combination with proof-of-work is what guarantees we can have a linear view of what was cryptographically committed to in what order. This is the entire basis of Bitcoin: proof-of-work creates a material cost to adding to that chain, and that cost is used to sequence all of the actual data (transactions) committed to, so that anyone can completely verify no funny business occurred. As a miner you can’t “mine” two different Bitcoin blocks at the same time, and you can’t fake digital signatures or break hash functions.

The entire functioning of the Bitcoin network can be boiled down essentially to two things: committing to information, and publishing that information to be verified. Bitcoin provides two commitment guarantees in terms of data relevant to the protocol: that individual transactions were properly committed to by the correct signatures and other witness data, and that blocks bundling transactions have been committed to by an appropriate amount of work.

This is what gives value to Bitcoin as a network and system: the commitment guarantees it provides using cryptography and thermodynamics, and the publishing of those commitments so everyone who wants to can verify them. Without the soundness of its commitments, and the public circulation of those commitments, it would be useless as a trustless money.

Those properties of commitment, publishing, and verification are valuable far beyond the use case of money. The movement of money is by no means the only type of information that can gain value from a cryptographic and thermodynamic commitment to when it was created (or the earliest point it existed) and when its existence was publicized to the world. Jpegs have shown people value this for even pointlessly stupid arbitrary information, but there is information immensely more valuable than jpegs in this world.

Density of Information

You have to pay for blockspace when you transact on Bitcoin, and that blockspace is priced in bytes. For every byte of space you take up in that block you have to compete with every other person trying to use that blockspace to pay the going market rate, and anyone can always just pay more and push that rate higher. This gives denser information a competitive advantage in trying to get included in a block. If the density of information is very high, i.e. how many bytes of space you need is very small, you can use that blockspace while paying a lower fee in absolute terms than someone with less dense information.
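A quick back-of-the-envelope calculation shows how much density matters. Every size and the fee rate below are made-up round numbers chosen for illustration, not measurements of real transactions.

```python
def fee_sats(size_vbytes: int, feerate_sat_per_vbyte: int) -> int:
    """Absolute fee paid for a transaction of a given size at a given rate."""
    return size_vbytes * feerate_sat_per_vbyte


FEERATE = 50  # hypothetical going market rate, in satoshis per virtual byte

payment_fee = fee_sats(150, FEERATE)         # a small, simple payment
inscription_fee = fee_sats(20_000, FEERATE)  # a transaction carrying an image

print(payment_fee, inscription_fee, inscription_fee / payment_fee)
```

Both users pay the same rate per byte, but the one with the denser information pays far less in absolute terms for what they are trying to accomplish.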

The use of blockspace to transfer economic value is one of the densest forms of information that can be included in a block. This will always be the case, and despite all of the drama and rabble rousing about Bitcoin turning into Ethereum, this will ensure Bitcoin’s primary use case remains the transfer of economic value. It is simply the most competitive use of the system in terms of information density.

However, this does not mean that it will be the only use of Bitcoin. If Bitcoin truly does succeed, the reality is the current market frenzy and activity surrounding Ordinals and Inscriptions will die off. As the cost of blockspace rises, it will stop being cost effective for lower net worth individuals to engage in such activities, and as fees keep rising that dynamic will compound until the use case is either priced out entirely or reserved for only immensely wealthy individuals. Maybe one day nation states will inscribe images or data to commemorate important historical events, but middle class degenerate gamblers won’t be inscribing jpegs like trading cards in the future.

They will have to either stop playing those games, or take their games somewhere else.

There Is No Blocksize Limit

Merkle trees are magical. They can be arbitrarily large, and all you need to prove that a piece of data is part of one is the root hash and the sibling hashes along the path from that piece of data up to the root, a proof that grows only logarithmically with the size of the tree. Cryptographic magic. The only reason the merkle tree in a Bitcoin block is limited in size is that users need to validate the contents of the entire block to ensure every transaction inside it is valid. Verifiability of the commitments in a block is integral to Bitcoin’s functioning as a system.
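Here is a rough Python sketch of what such a proof looks like and how it is checked. The helper names are my own, and the tree uses the same double SHA-256 and duplicate-the-last-hash conventions as Bitcoin's block merkle tree; treat it as an illustration rather than a wire-compatible implementation.

```python
import hashlib


def double_sha256(data: bytes) -> bytes:
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()


def merkle_levels(leaves):
    """Return every level of the tree, from the leaves up to the root."""
    if not leaves:
        raise ValueError("need at least one leaf")
    levels = [list(leaves)]
    while len(levels[-1]) > 1:
        level = levels[-1]
        if len(level) % 2 == 1:
            level = level + [level[-1]]  # pad odd levels with a duplicate
        levels.append(
            [double_sha256(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
        )
    return levels


def merkle_proof(leaves, index):
    """Collect the sibling hash at each level on the path from leaf to root."""
    proof = []
    for level in merkle_levels(leaves)[:-1]:
        if len(level) % 2 == 1:
            level = level + [level[-1]]
        sibling = index ^ 1  # neighbour within the same pair
        proof.append((level[sibling], sibling < index))  # (hash, sibling is on the left)
        index //= 2
    return proof


def verify_proof(leaf, proof, root):
    """Recompute the root from a leaf and its sibling path."""
    current = leaf
    for sibling, sibling_is_left in proof:
        pair = sibling + current if sibling_is_left else current + sibling
        current = double_sha256(pair)
    return current == root


leaves = [double_sha256(f"item {n}".encode()) for n in range(5)]
root = merkle_levels(leaves)[-1][0]
proof = merkle_proof(leaves, 4)
print(len(proof), verify_proof(leaves[4], proof, root))
```

The verifier never sees the other leaves. A tree of a million items needs only about twenty sibling hashes per proof, which is why the size of the tree barely matters to the person checking it.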

You can stick a hash inside of an individual Bitcoin transaction, which means because of the magic of merkle trees, there is no such thing as the blocksize limit when it comes to the Bitcoin blockchain committing to data outside of the scope of Bitcoin transactions themselves. The same way that the small blockheader commits to every transaction in a block with a single hash, a Bitcoin transaction itself can commit to a massive merkle tree made up of immense amounts of data. This has literally been done before with the entire contents of the Internet Archive.

Earlier I said that transferring economic value is one of the densest forms of data that could utilize Bitcoin blockspace. One of, not the densest. That is because of general purpose timestamping. A single transaction, with a single hash embedded in it, can literally timestamp an infinite amount of data in a way that 100% proves it existed when that block was mined. It is impossible for any use case of blockspace to be denser in informational terms than this.
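To see why nothing can be denser, consider one common way of embedding a hash in a transaction: an OP_RETURN output that pushes a 32-byte digest. The sketch below builds only that output script, not a full transaction, and the function name is my own.

```python
import hashlib

OP_RETURN = 0x6A  # opcode marking an output as provably unspendable data


def commitment_script(digest: bytes) -> bytes:
    """Build an OP_RETURN script that pushes a single 32-byte digest."""
    if len(digest) != 32:
        raise ValueError("expected a 32-byte digest")
    # For pushes this small, the push opcode is just the length of the data.
    return bytes([OP_RETURN, len(digest)]) + digest


small = hashlib.sha256(b"x").digest()
large = hashlib.sha256(b"x" * 10_000_000).digest()  # ten megabytes of data

print(len(commitment_script(small)), len(commitment_script(large)))
```

Whether the digest stands for one byte, ten megabytes, or the root of a merkle tree over an entire archive, the on-chain footprint is the same 34 bytes of script.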

Because everything in this merkle tree a transaction commits to has nothing to do with Bitcoin transactions, or whether or not they are valid, it can completely ignore the Bitcoin blocksize limit. On the other hand, it also cannot depend on the Bitcoin network to actually propagate the published information itself, but that is not a critical problem in the digital age.

Using The Trees

Satoshi himself, in the recently released emails with Martti Malmi, discussed the use of Bitcoin as a general purpose timestamping tool. This is something many people have done for as long as Bitcoin has existed. Old projects like Eternity Wall would let you pay to stamp messages into the blockchain. People have announced weddings, the birth of children, as well as other much more childish things using OP_RETURN on the blockchain for over a decade. This combines both the commitment and publication functions into a single action, but one that is incredibly inefficient in its use of blockspace.

Opentimestamps

Opentimestamps (OTS) is the perfect example of a scalable mechanism to facilitate at least the commitment aspect of timestamping. The publication of the data (as well as its commitment in the form of a merkle proof) is left entirely to the user timestamping the information, but the actual timestamping commitment is handled by the OTS Calendar Server. As users submit documents or files to the server, it bundles them up into an unordered merkle tree. It continues aggregating all the hash commitments of individual users’ files into a single tree until it conducts a periodic on-chain Bitcoin transaction which includes the current root hash of the entire tree it is building.
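The aggregation step can be sketched as a toy calendar in Python. This is not the real OTS server, client, or proof format; the class and method names are hypothetical, and the on-chain transaction is left out entirely. It only shows how many submissions collapse into one root while each user walks away with their own proof.

```python
import hashlib


def sha256(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


class CalendarSketch:
    """Toy aggregator: collects digests, then folds them into one merkle root."""

    def __init__(self):
        self.pending = []

    def submit(self, document: bytes) -> bytes:
        """Users hand over only a digest of their file, never the file itself."""
        digest = sha256(document)
        if digest not in self.pending:
            self.pending.append(digest)
        return digest

    def anchor(self):
        """Build the tree and return (root, proofs); the root is what goes on-chain."""
        if not self.pending:
            raise ValueError("nothing to anchor")
        # Each node carries its hash plus the submitted digests underneath it.
        nodes = [(d, [d]) for d in self.pending]
        proofs = {d: [] for d in self.pending}
        while len(nodes) > 1:
            next_nodes = []
            for i in range(0, len(nodes) - 1, 2):
                (left, left_leaves), (right, right_leaves) = nodes[i], nodes[i + 1]
                for d in left_leaves:
                    proofs[d].append(("append", right))
                for d in right_leaves:
                    proofs[d].append(("prepend", left))
                next_nodes.append((sha256(left + right), left_leaves + right_leaves))
            if len(nodes) % 2 == 1:
                next_nodes.append(nodes[-1])  # an unpaired node moves up unchanged
            nodes = next_nodes
        self.pending = []
        return nodes[0][0], proofs


def verify(digest: bytes, proof, root: bytes) -> bool:
    """Replay the proof steps and check that they end at the anchored root."""
    current = digest
    for op, sibling in proof:
        current = sha256(current + sibling) if op == "append" else sha256(sibling + current)
    return current == root


calendar = CalendarSketch()
digests = [calendar.submit(doc) for doc in (b"contract", b"photo", b"web page")]
root, proofs = calendar.anchor()
print(all(verify(d, proofs[d], root) for d in digests))
```

Each user keeps their own file and their own proof. The calendar only needs to get a single 32-byte root into a Bitcoin transaction, however many submissions it has collected.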

As evidenced by the demonstration cited above, this can have immense value as a utility. Now that the entirety of the Internet Archive as of 2017 is timestamped using OTS, it is thermodynamically impossible to alter the contents of anything contained in that archive in a way that could not be detected. Centralized information stores such as the Internet Archive have historically functioned as what amounts to an oracle. They duplicate and copy the state of different pages or information and we trust them not to lie when they say “this is what that information looked like at this date.”

With a proper Opentimestamps integration, they would never be a trusted entity in that way ever again. They would simply be a host that stores the information itself alongside an OTS merkle proof, and that itself would prove beyond the shadow of a doubt that the information they are showing you existed in that form at roughly the time they claimed it did. The historical state of arbitrary information secured thermodynamically by Bitcoin.

Mainstay

Anyone even remotely familiar with timestamping knows that OTS has one major problem: I can timestamp as many different conflicting things as I want to, and only show you one of them after the fact. For many use cases that boil down to needing to prove a piece of data existed at a certain time, this is a detail that doesn’t matter, but for others it does.

If I needed to prove that a piece of data was signed off by someone, say a corporate document signed by an executive’s private key, it doesn’t matter if he signed other (even conflicting) things with that key at the same time. All I’m trying to do is prove he signed one specific thing. OTS works fine for that. But imagine a situation where someone wants to attest to a file and prove that “officially” they have attested to only that file and not any others.

Mainstay is a variation of Opentimestamps that addresses this problem. Rather than a completely unordered merkle tree, it’s very specifically organized in such a way that every user has a specific “slot” in the tree where they can commit to data. Now while this doesn’t prevent people from committing to other conflicting data in general, when using a Mainstay tree they can publicly use an identifiable slot as their “official” commitment. Anyone verifying such commitments can then ignore or not treat as legitimate any commitment with a merkle proof located in any other part of the tree.
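The slot idea can be illustrated with a fixed-depth tree in which the slot number, not the prover, decides whether each sibling goes on the left or the right. This is a simplified sketch under my own assumptions (single SHA-256, all-zero placeholders for empty slots, hypothetical function names); the real Mainstay protocol differs in its details.

```python
import hashlib

EMPTY_SLOT = bytes(32)  # assumed placeholder for slots nobody has used


def sha256(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def build_slot_tree(commitments: dict, depth: int):
    """Build a tree with 2**depth fixed positions; returns all levels."""
    level = [commitments.get(slot, EMPTY_SLOT) for slot in range(2 ** depth)]
    levels = [level]
    while len(level) > 1:
        level = [sha256(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
        levels.append(level)
    return levels


def slot_proof(levels, slot: int):
    """Sibling hashes from the given slot up to the root."""
    proof = []
    for level in levels[:-1]:
        proof.append(level[slot ^ 1])
        slot //= 2
    return proof


def verify_slot(commitment: bytes, slot: int, proof, root: bytes) -> bool:
    """The bits of the slot number dictate the hashing order at every level."""
    current = commitment
    for sibling in proof:
        if slot & 1:
            current = sha256(sibling + current)  # we are the right child
        else:
            current = sha256(current + sibling)  # we are the left child
        slot >>= 1
    return current == root


official = sha256(b"the file I officially attest to")
conflicting = sha256(b"something else entirely")
levels = build_slot_tree({3: official, 5: conflicting}, depth=3)
root = levels[-1][0]

print(verify_slot(official, 3, slot_proof(levels, 3), root))
print(verify_slot(conflicting, 3, slot_proof(levels, 5), root))
```

The conflicting commitment is in the tree, but its proof only checks out for slot 5. If slot 3 is publicly known as the official slot, anything proven at another position can simply be disregarded.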

Para-Consensus Systems

The basic concept of Mainstay can be extended even further to create para-consensus systems piggybacking on top of Bitcoin; Stacks is probably the best known example. By committing the merkle root of arbitrary data in an ordered/identifiable way, and by publishing that information out of band somewhere else so it can be verified against arbitrary rules, a whole new consensus system can be built by anchoring itself into Bitcoin’s blockchain.

Bitcoin itself doesn’t need to be aware of this in any way. Because of that fact, information that is consensus invalid to the para-consensus system can be committed to by Bitcoin and published out of band, but participants in that para-consensus system can simply ignore it and wait for the next commitment to valid data in their system. This can allow informational density of other economic assets to match that of arbitrary data timestamps.
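A toy model of that "ignore it and wait for the next one" behaviour is below. Everything in it is hypothetical: the rule (each update must carry the next height) stands in for whatever rules a real para-consensus system enforces, and the anchors are plain SHA-256 digests rather than anything read from actual Bitcoin transactions.

```python
import hashlib
import json


def sha256(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def apply_anchors(anchors, published, is_valid):
    """Walk commitments in Bitcoin's order and keep only what the rules accept.

    anchors:   commitment hashes in the order Bitcoin confirmed them
    published: commitment hash -> payload found out of band
    is_valid:  the para-consensus rules, a function of (state, payload)
    """
    state = []
    for commitment in anchors:
        payload = published.get(commitment)
        if payload is None or sha256(payload) != commitment:
            continue  # never published, or does not match what was committed
        if not is_valid(state, payload):
            continue  # committed by Bitcoin, but invalid under our own rules
        state.append(payload)
    return state


def toy_rule(state, payload) -> bool:
    """Hypothetical rule: each update must be JSON carrying the next height."""
    try:
        update = json.loads(payload)
    except ValueError:
        return False
    return isinstance(update, dict) and update.get("height") == len(state)


good0 = b'{"height": 0}'
junk = b"not valid under our rules"
withheld = b'{"height": 2}'  # committed on-chain but never published
good1 = b'{"height": 1}'

anchors = [sha256(p) for p in (good0, junk, withheld, good1)]
published = {sha256(p): p for p in (good0, junk, good1)}

print(apply_anchors(anchors, published, toy_rule))
```

Bitcoin orders all four commitments without knowing or caring what they mean. Participants in the toy system end up with exactly the two valid updates, in the order Bitcoin fixed.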

This might not be desirable, but it is unstoppable.

Other Uses

While tokens like Stacks are rather pointless uses of extending Bitcoin’s thermodynamic commitments in my opinion, some ‘assets’ that are not strictly monetary do actually have very sound use cases that could benefit from timestamping. Domain Names and namespaces in general are one. The entire way you interact with the web is steered by DNS, a centralized and trusted system. When you type in www.google.com a hierarchy of servers is telling your computer what actual IP address to connect to. Those servers can arbitrarily redirect you anywhere, they can deny people access to a domain, they can revoke domains, they have total control over those “directions” everyone’s computer listens to.

An open and decentralized DNS system piggybacking on top of Bitcoin can address those issues. Rather than an authority granting access to a domain, any person can independently register and commit to a “name” tied to a cryptographic key themselves. Software can find published commitments to such data, and on a basis of trusting the first entries to be the “owner” of a domain, acquire directions to the correct server to connect to from a system that is open, decentralized, and cryptographically verifiable without a centralized authority.
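The "first entry wins" rule is simple enough to sketch directly. The registration tuples below are hypothetical stand-ins for commitments that software would find anchored in the chain, and signature checking against the registered key is deliberately left out.

```python
def resolve_names(registrations):
    """Map each name to the key of its earliest registration.

    registrations: iterable of (block_height, tx_index, name, pubkey) tuples,
    where block_height and tx_index give the position Bitcoin assigned.
    """
    owners = {}
    for height, tx_index, name, pubkey in sorted(
        registrations, key=lambda r: (r[0], r[1])
    ):
        # setdefault keeps the first claim and ignores every later one.
        owners.setdefault(name.lower(), pubkey)
    return owners


registrations = [
    (840_010, 2, "example", "key_b"),  # later claim to the same name
    (840_000, 7, "example", "key_a"),  # earliest claim, so this key owns it
    (840_000, 1, "Other", "key_c"),
]

print(resolve_names(registrations))
```

No authority decides who owns "example". Everyone running the same rule over the same chain arrives at the same answer.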

A Map of Space and Time

Everyone fixates on the use of Bitcoin as money, and rightly so: it is the primary and core functionality of the protocol and network. The economic incentives its use as money creates are the core of what keeps it secure and functioning; it could not exist without that aspect of itself. It would collapse and fail without it.

But Bitcoin is so much more than just that money system. It is a distributed timestamping system with a decentralized network for publishing everything the system commits to. It is a thermodynamically guaranteed map of digital data in space and time. One that is infinitely extendable. The blocksize limit governs the maximum size of Bitcoin transactions that can be committed to in a single bundle at a time, but it has absolutely no power to restrict any other type of data that the blockchain can commit to.

Bitcoin is a thermodynamically driven black hole in a digital era, and it will gobble up every byte of information into its merkle trees that in any way can benefit from the cryptographic guarantees that it can provide. Bitcoin is not just money, and no matter how many times people chant it is only money and nothing else, it will never be true.

Bitcoin is a digital monster, and it will eat everything. 
