What is Ethereum?

Ethereum is one of the hottest things in the world of cryptocurrencies. As it is nearing its first big release, here is my take on explaining what it is.

In a way, Ethereum is like a cloud infrastructure that is extremely open: the cloud has been specifically designed to increase trust in your code. Firstly, the code you run in Ethereum has to be essentially open source: you can only run your code if you publish it, unmodified. Secondly, even your database contents are public. (Privacy is still possible with the use of client-side encryption, though naturally, client-side encryption will not work if the server also needs to read the data. Even in that case, preserving full privacy is still possible, but may require sophisticated cryptography.) This trustworthiness goes further: the underlying cloud infrastructure has no single owner. Ethereum is a fully distributed system, enabled by the block chain technology.

Why would anybody need this much openness? There are certain applications that truly require this level of trust and auditability. The Bitcoin currency is one example. Bitcoin already uses the same underlying block chain technology as Ethereum (in fact, Bitcoin pioneered it). It is unlikely that Bitcoin could have succeeded otherwise. Indeed, who would trust a startup that essentially prints and sells their own money, and whose founders are not known?

Granted, we don’t know if Bitcoin will succeed in the long run, but it is a lot more trustworthy because it is fully auditable and open. In fact, even if the creators of Bitcoin wanted to print (fake) money for themselves, they could not do it: the technology would not allow it. All Bitcoin clients automatically check up on each other and eventually reach a consensus over which transactions are valid. Thus, a single bad actor cannot harm the network, even if the actor is the creator of the network. (I will describe the protocol in more detail below.)

Note that Ethereum is not limited to cryptocurrencies in any way: you can run any code you want in the cloud. Of course, you do have to specifically design your application for Ethereum, perhaps using the JavaScript-like Solidity language. Additionally, there are performance limitations, and again, everything has to be publicly visible, even the database contents (though possibly encrypted). Yet for all these downsides, the upsides can be worth it: you get increased trust and auditability, among other benefits.

Without such protocols, it can be difficult to achieve the highest levels of trust. As an example, imagine a web-based e-mail provider like Google’s Gmail, but from a less reputable company. Let’s suppose they are accused of deleting customers’ e-mails and that for some strange reason, they decide to defend themselves by releasing their source code (perhaps to prove that their code has no back doors for deleting letters). The problem is that in this case, showing their code would not be enough, for two reasons. Firstly, we don’t know that they are actually running the same code that they are showing us. Secondly, as they have direct access to their own database, they could simply delete e-mails manually from the data store. Thus, they cannot easily be audited by an outside party.

Of course, usually we simply trust our e-mail providers. However, very few start-ups enjoy the same level of trust as well-known companies like Google or Microsoft, particularly when doing something as risky as launching a new global cryptocurrency.

So how does Ethereum work? It is a peer-to-peer network of servers, called nodes. Everybody is free to run their own nodes. If you connect your own node to the network, the first thing that your server does is download the entire cloud. This literally means that your server’s hard disk will contain the entire world-wide cloud: all code and all database contents. (In case you are wondering, more scalable architectures are also being discussed.)

Now that your server has the entire cloud state, it can start to serve customer requests on behalf of the cloud. But how do clients know that they can trust your server? They don’t. This is why clients send their requests (transactions) to all servers at the same time. That is, all servers serve all requests, give or take a few.

This creates the next problem: every server can produce a different result (especially if someone purposefully sends conflicting requests to different servers). How do we handle conflicts? By essentially rolling a die: roughly every 12 seconds, one server is chosen at random as the winner of this round. The lucky winner gets some money for its hard work, and everybody accepts the output of this server as the new state of the cloud.

Of course, before accepting the winner, all other servers double-check its calculations (essentially, they re-process the same requests as the winner did, and verify that they get the same state for the cloud in the end). The result of every round is called a “block”. In order to preserve complete history, every such block contains a reference to its previous block, using a cryptographic hash function. This is where the underlying technology gets its name from — we are creating a “block chain”.
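To make the double-checking and the chaining a bit more concrete, here is a toy sketch in Python. It is not how Ethereum is actually implemented: the "cloud state" is just a dictionary of balances, the requests are simple transfers, and all the names (apply_transactions, make_block, verify_block) are made up for this illustration. The only ideas it tries to show are that a block points to its predecessor through a hash, and that any node can re-process the same requests to check the winner's result.

```python
import hashlib
import json


def apply_transactions(state, transactions):
    """Return the new balances after applying [sender, receiver, amount] transfers."""
    new_state = dict(state)
    for sender, receiver, amount in transactions:
        if new_state.get(sender, 0) < amount:
            continue  # skip transfers the sender cannot afford
        new_state[sender] -= amount
        new_state[receiver] = new_state.get(receiver, 0) + amount
    return new_state


def block_hash(block):
    """Hash a block's contents deterministically with SHA-256."""
    encoded = json.dumps(block, sort_keys=True).encode("utf-8")
    return hashlib.sha256(encoded).hexdigest()


def make_block(prev_block, transactions):
    """What the round's winner does: process the requests and link to the previous block."""
    return {
        "prev_hash": block_hash(prev_block),
        "transactions": transactions,
        "state": apply_transactions(prev_block["state"], transactions),
    }


def verify_block(prev_block, block):
    """What every other node does: re-process the requests and compare the results."""
    if block["prev_hash"] != block_hash(prev_block):
        return False
    expected = apply_transactions(prev_block["state"], block["transactions"])
    return expected == block["state"]


# Hypothetical starting point: alice owns 10 units, bob owns none.
genesis = {"prev_hash": None, "transactions": [], "state": {"alice": 10, "bob": 0}}
block1 = make_block(genesis, [["alice", "bob", 3]])
print(verify_block(genesis, block1))  # True: an honest block passes

# A winner that invents a different state is caught by everyone else.
forged = dict(block1, state={"alice": 10, "bob": 1000})
print(verify_block(genesis, forged))  # False
```

Because each block stores the hash of the block before it, quietly rewriting an old block would change its hash and break every later link, which is what makes the history auditable.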

There is a price to pay, of course: such a cloud is slower than a typical cloud service. After all, one round of confirmations takes 12 seconds on average. (Or worse: to get more security, you may need to wait for multiple rounds.) Luckily though, sometimes the client software doesn’t need to wait (for example, even credit card terminals don’t always verify account balances on small purchases).
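As a rough back-of-the-envelope sketch of that waiting time: if one round takes about 12 seconds on average, waiting for several rounds simply multiplies the delay. The function name and the example round counts below are my own picks for illustration, not recommended settings, and real rounds vary randomly around the average.

```python
BLOCK_TIME_SECONDS = 12  # average round length quoted in the text


def expected_wait_seconds(confirmations, block_time=BLOCK_TIME_SECONDS):
    """Rough average time until a request is `confirmations` rounds deep."""
    if confirmations < 0:
        raise ValueError("confirmations must be non-negative")
    return confirmations * block_time


# Illustrative round counts only.
for rounds in (1, 6, 12):
    print(rounds, "rounds:", expected_wait_seconds(rounds), "seconds")
```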

There are several application areas that can benefit from Ethereum – typical examples are custom cryptocurrencies and various smart contracts in general (e.g. rental cars that change their ownership automatically, based on payments from customers). IBM and Samsung have been experimenting with Ethereum regarding their Internet of Things initiative, as a way to bring down the cost of datacenters, among other benefits.

Posted in Software development | Leave a comment

How frequently do you run your tests?

Nowadays it is quite common for developers to write automated tests. Somewhat less common is running the tests very frequently — getting them to pass every 2 minutes or so. This way of working can seem slow at first: you need to get the code to a working state every couple of minutes. However, in many cases it can actually be faster. Indeed, there are times when it feels almost as if your IDE had developed a magical ability to highlight buggy code; here’s why.

First, you simply cannot write many lines of code in 2 minutes. So when a test fails, you don’t have a lot of code to check: the bug is probably in one of the 2 or 3 new lines you just wrote. (The rest of the code is probably fine because the tests worked before adding the new lines.) In contrast, if you had modified 50 lines of code before running the tests, you might need more debugging to determine which of the 50 lines is in error.

Additionally, odds are that the offending code is still fresh in your head: it is easy to remember what you did 1 minute ago. Indeed, sometimes when I’m lucky I don’t even need to look at the failure message to recognize the mistake.

Of course, there are always changes that work better in bigger chunks. Nevertheless, for beginners I would still recommend trying small chunks even when they feel slow. This is because it may take some time before the small chunks start to feel productive — perhaps a month or two. For some inspiration to get started, a good example is the bowling game kata by Robert C. Martin.
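To give a feel for how small the steps can be, here is a sketch of what the first two steps of the bowling game kata might look like in Python. The names Game, roll and score follow the kata as I remember it; the Python port and the helper roll_many are my own assumptions. Each test is added and made to pass on its own, with only a line or two of production code in between.

```python
import unittest


class Game:
    """Smallest scorer that passes the two tests below (no spares or strikes yet)."""

    def __init__(self):
        self._rolls = []

    def roll(self, pins):
        self._rolls.append(pins)

    def score(self):
        return sum(self._rolls)


class BowlingGameTest(unittest.TestCase):
    def roll_many(self, game, count, pins):
        for _ in range(count):
            game.roll(pins)

    def test_gutter_game(self):
        # Step 1: twenty rolls that hit nothing score zero.
        game = Game()
        self.roll_many(game, 20, 0)
        self.assertEqual(game.score(), 0)

    def test_all_ones(self):
        # Step 2: twenty rolls of one pin each score twenty.
        game = Game()
        self.roll_many(game, 20, 1)
        self.assertEqual(game.score(), 20)


# Run the tests in-process; `python -m unittest` would work just as well.
suite = unittest.defaultTestLoader.loadTestsFromTestCase(BowlingGameTest)
result = unittest.TextTestRunner().run(suite)
```

The next step would be a failing test for a spare, followed by just enough code to make it pass, and so on. If a test goes red at any point, the culprit is almost certainly in the handful of lines written since the last green run.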
