The Fox in the Henhouse

Slogging

I’m writing a new time series DB in C++ called Henhouse, and this is a story about a bug and my heroic battle with it. Actually, it’s not really a story about a bug, but about a way of finding bugs and preventing them. I hope to convince you that there are better ways of writing code that don’t involve slogging through unit tests.

Freakin Lasers!

So I ran a randomized test that hammered my new precious DB. It dumped random data into it and did random queries. I started it and came back several hours later to find this…

!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
!! require failed
!! expr: (_metadata->size <= _max_items) [_metadata->size = 171, _max_items = 170]
!! func: size
!! file: /home/maxim/src/max/henhouse/src/db/../util/mapped_vector.hpp (63)
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!

./henhouse(_ZN8henhouse4util5traceERSo+0x35) [0x68dbea]
./henhouse(_ZN8henhouse4util5raiseEPKc+0x81) [0x68dd32]
./henhouse(_ZN8henhouse4util6raise2ImiEEvPKcS3_iS3_S3_S3_RKT_S3_RKT0_+0x2b8) [0x67923f]
....

 

See that?! That is a bug right there!

You see, I am writing the DB in a style of programming called Design by Contract. My code is littered with contracts, and whenever one of them fails, the program prints a helpful message with a stack trace and then violently aborts. No remorse. Want a guaranteed way to have a fairly bug-free system? Make sure the system can detect bugs, and then can find your family and send them down river.

The nasty bug above was found in this bit of code.

uint64_t size() const
{
    REQUIRE(_metadata);
    REQUIRE_LESS_EQUAL(_metadata->size, _max_items);

    return _metadata->size;
}

This is a member function of a class that implements a memory-mapped vector. Every time you call size() on the vector, its contracts are checked.
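For readers who haven’t seen contract macros before, here is a minimal sketch of what REQUIRE/ENSURE-style macros might look like. This is illustrative only, not Henhouse’s actual implementation: the real macros also print the offending values and a stack trace.

```cpp
#include <cstdio>
#include <cstdlib>

// Minimal sketch of contract macros (illustrative; the real Henhouse
// macros also print the offending values and a stack trace).
inline void contract_failed(const char* kind, const char* expr,
                            const char* file, int line)
{
    std::fprintf(stderr, "!! %s failed\n!! expr: (%s)\n!! file: %s (%d)\n",
                 kind, expr, file, line);
    std::abort(); // fail early, fail fast: no recovery, no remorse
}

#define REQUIRE(expr) \
    do { if (!(expr)) contract_failed("require", #expr, __FILE__, __LINE__); } while (0)

#define ENSURE(expr) \
    do { if (!(expr)) contract_failed("ensure", #expr, __FILE__, __LINE__); } while (0)

#define REQUIRE_LESS_EQUAL(a, b) REQUIRE((a) <= (b))

// Example contract-carrying function: a capped increment.
// A violated REQUIRE blames the caller; a violated ENSURE blames the callee.
int increment_capped(int value, int cap)
{
    REQUIRE_LESS_EQUAL(value, cap);

    const int result = value < cap ? value + 1 : value;

    ENSURE(result <= cap);
    return result;
}
```

The useful property is the blame assignment: a failed REQUIRE means the caller broke the contract, a failed ENSURE means the callee did.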

This Machine Kills Fascist Bugs

Design by Contract is one of those secret weapons nobody teaches you. The term was coined by Bertrand Meyer, who created the Eiffel programming language. Eiffel is really the only language with truly native support for this technique. Well, Eiffel and Perl 6. But we all know Perl 6 is from the future, and talking about Perl 6 is for another day.

The idea is that units in your software, like a function or an object, establish contracts between the caller and the callee. The callee says “I REQUIRE you to give me parameters that meet these criteria, and I will ENSURE this result”. Contracts come in three flavors:

Preconditions
Postconditions
Invariants

The important bit to understand here is that a RELATIONSHIP is established between pieces of code. Not just an informal relationship, but an executable one! A relationship that can be tested millions of times a second. All day, every day on real user data.

This is substantially different from unit testing, which ignores the relationships between pieces of code.

Parallel Monkeys Are Terrible

The bug was introduced when I added support for parallel puts and queries to my DB. As data is put into the DB, a function resizes the memory-mapped file. It looked like this:

void resize(size_t new_size)
{
    REQUIRE(_data_file);
    REQUIRE_GREATER_EQUAL(new_size, _data_file->size() + sizeof(data_type));
    const auto old_max = _max_items;

    _data_file->resize(new_size);
    _items = reinterpret_cast<data_type*>(_data_file->data());
    _max_items = _data_file->size() / sizeof(data_type);

    ENSURE_GREATER(_max_items, old_max);
}

The failure showed that “_max_items” was stale somewhere, which indicated that a copy was being made in another thread.

Amazing! Our contract found a multi-threading error!

I solved the bug with a standard method: I created a worker pool and made sure puts and gets for a piece of data always go to the same worker. Each worker has a request queue that it pulls from, processing requests one after another. No more chance of a get and a put happening in parallel. The much trickier solution would have been to use read/write locks, but I know better.
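The shape of that fix can be sketched roughly like this. Everything below is illustrative (Henhouse’s actual worker code isn’t shown in this post): the key is hashed to pick a worker, so all requests for the same key are serialized through one queue and can never race.

```cpp
#include <atomic>
#include <condition_variable>
#include <cstddef>
#include <functional>
#include <mutex>
#include <queue>
#include <string>
#include <thread>
#include <vector>

// One worker: owns a queue and runs jobs one at a time, so two
// requests routed to the same worker can never run in parallel.
class worker
{
public:
    worker() : _thread{[this]{ run(); }} {}

    ~worker()
    {
        {
            std::lock_guard<std::mutex> l{_m};
            _done = true;
        }
        _cv.notify_one();
        _thread.join();
    }

    void post(std::function<void()> job)
    {
        {
            std::lock_guard<std::mutex> l{_m};
            _jobs.push(std::move(job));
        }
        _cv.notify_one();
    }

private:
    void run()
    {
        for (;;)
        {
            std::function<void()> job;
            {
                std::unique_lock<std::mutex> l{_m};
                _cv.wait(l, [this]{ return _done || !_jobs.empty(); });
                if (_jobs.empty()) return; // done and drained
                job = std::move(_jobs.front());
                _jobs.pop();
            }
            job();
        }
    }

    std::queue<std::function<void()>> _jobs;
    std::mutex _m;
    std::condition_variable _cv;
    bool _done = false;
    std::thread _thread; // declared last: run() uses the members above
};

// Route every put/get for a key to the same worker by hashing the key.
class worker_pool
{
public:
    explicit worker_pool(std::size_t n) : _workers(n) {}

    void post(const std::string& key, std::function<void()> job)
    {
        const auto i = std::hash<std::string>{}(key) % _workers.size();
        _workers[i].post(std::move(job));
    }

private:
    std::vector<worker> _workers;
};

// Demo: push 100 jobs through 2 workers; the pool destructor
// drains each queue before joining, so all jobs complete.
int run_demo()
{
    std::atomic<int> count{0};
    {
        worker_pool pool{2};
        for (int i = 0; i < 100; ++i)
            pool.post("key" + std::to_string(i % 4), [&count]{ ++count; });
    }
    return count.load();
}
```

The design choice here is the same trade the post describes: instead of fine-grained locking around shared state, all mutation of a key’s data happens on one thread, which is much harder to get wrong.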

The Fat Bollard

What would our code above look like using unit tests instead? Well first thing, we would not have contracts, because we are TDD people and TDD people don’t know about contracts.

void resize(size_t new_size)
{
    _data_file->resize(new_size);
    _items = reinterpret_cast<data_type*>(_data_file->data());
    _max_items = _data_file->size() / sizeof(data_type);
}

What we lose should be immediately obvious: for example, the requirement that new_size be bigger than the old size. If we enjoy defensive programming, we may refine the code further to look like this.

void resize(size_t new_size)
{
    if(new_size < _data_file->size() + sizeof(data_type))
        throw runtime_error("new size too small");

    _data_file->resize(new_size);
    _items = reinterpret_cast<data_type*>(_data_file->data());
    _max_items = _data_file->size() / sizeof(data_type);
}

A lot of people ask me, “What is the difference between Design by Contract and plain old defensive programming?” I think the example above makes it pretty obvious that they are not the same thing. First, defensive programming allows the caller to recover. With the Design by Contract approach there is no remorse: fail early and fail fast. Second, with defensive programming the caller has no information about what the callee ensures in the result, so the caller must check the result itself.

Being good TDD folks we would actually have written a test first that looks like this.

void test_resize()
{
    size_t initial_capacity = 50;
    mapped_vector<int> v{"test_file", initial_capacity};
    for(int a = 0; a < 100; a++)
        v.push_back(a);
    EXPECT_EQ(v.size(), 100);
}

And guess what? This would not have found the parallel bug! Even if we modified the test to insert into the vector from multiple threads, the bug would still not fail the test. We would also have to query the size while inserting in parallel for the bug to show up, and in that case the expected size is unknowable, since the query and the inserts race each other. The best you could do is somehow coerce the test to segfault (which it won’t, since it is accessing valid memory). Or worse, expose _max_items to the outside world, muddying the interface.

I tested what would happen if I removed the contract. I ran the test and waited a day. And guess what? Nothing happened, except that the DB put data in the wrong place. In other words, silent and deadly data corruption.

Not All Tests Are Created Equal

I combined Design by Contract with other powerful tools, such as randomized testing and fuzzing. There are many tools for randomized testing, such as QuickCheck, and for fuzz testing, such as afl-fuzz. I decided to write my own tests in Perl 6. For example, here is a test that dumps random data into the DB. Henhouse is compatible with Graphite’s input format.

use v6;

sub MAIN($workers, $a, $z)
{
    my @keys = "$a".."$z";
    my @ps;

    for (1..$workers) -> $w {
        @ps.push: start { put(@keys); };
    }
    await Promise.allof(@ps);
}

sub put(@keys)
{
    my $s = IO::Socket::INET.new(:host<localhost>, :port<2003>);
    loop
    {
        my $c = (0..10).pick;
        my $k = @keys.pick;
        $s.print("{$k} {$c} {DateTime.now.posix}\n");
        sleep 0.01;
    }
}

Why should I write many manually created unit tests when I can create a test that makes tests? A typical project using unit testing has tens of thousands of hand-written tests. Look Ma! I just made a million!
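The test above speaks Graphite’s plaintext protocol, where each line is `<key> <value> <unix-timestamp>`. On the receiving side, a minimal parser for that format might look like this (a sketch; this is not Henhouse’s actual parsing code):

```cpp
#include <cstdint>
#include <sstream>
#include <string>

// One datapoint in Graphite's plaintext protocol:
// "<key> <value> <unix-timestamp>\n", e.g. "cpu.load 0.5 1461110400".
struct datapoint
{
    std::string key;
    double value = 0;
    std::uint64_t timestamp = 0;
};

// Parse a single protocol line; returns false if the line is malformed.
bool parse_line(const std::string& line, datapoint& out)
{
    std::istringstream in{line};
    return static_cast<bool>(in >> out.key >> out.value >> out.timestamp);
}
```

Because the format is this simple, a randomized test generator needs only a socket and three `pick` calls, which is exactly what the Perl 6 script does.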

Do This, Be Brave

Design by Contract is a far better weapon to fight bugs and improve code quality than Unit Testing. Design By Contract…

  • Helps you do code reviews by being explicit about assumptions.
  • Tests both the input domain and the output codomain.
  • Tests ranges of values instead of individual points.
  • Tests real data in production vs. fake data in development.
  • Is always on, and ships with production code.
  • Prevents data corruption and secures code.
  • Fails close to the source of the problem.

If you combine this technique with…

  • Randomized Testing
  • Fuzz Testing
  • Acceptance and Behavioral Testing

You will never look back. Stop unit testing, do it, be brave.

 

The Mother Software


Not the Earth

I am writing this on Earth Day, but this isn’t about the Earth. On my way to work, I was riding the bus, and as usual I was catching up on my reading, glued to my phone. During a stop, I looked around and, as many of you probably have, found everyone else doing mostly the same thing. It isn’t too different from the scenes way back when, where men in hats read their newspapers on the bus. The format changed, but the activity stays the same. Instead of hats we have headphones.

Lamenting the dreams of Alan Kay, I have come to understand that computers have replaced the old media. Not just in their form, but in their place as people makers. Society and its stories are what make our people, and while Mother Earth creates everything that keeps us alive, it is Mother Software that is now creating who we are.


This Machine Kills Fascists

David Graeber talks about how, up until the 1700s, nobody had ever written a book describing the conditions that create the most wealth. They were writing about the conditions that create the best people. I am starting to see a moral transformation where people are asking these questions again. It all started in 2011, when a wave of revolutions swept the Middle East and social movements like Occupy Wall Street spread throughout the world, before being violently crushed. You can see it today in the slow destruction of America’s two-party system, with Bernie Sanders and Trump representing the tension between the old guard and the new constituency on the Democratic and Republican sides.


Walking Away

I believe a similar moral transformation will take place in our little world of software. We have to reconsider the place of software in our society, because the production output of society is not the creation of things, but the creation of people.

Which brings us back to the people glued to their phones. Besides replacing the old media, software is replacing old economic structures. The clearest example is crowdfunding. Prior to crowdfunding, bank managers and venture capitalists decided which ideas got funded, and we were passive subjects in the decisions of what gets made. Thanks to software innovation, we can now be part of that decision process.

A similar structural transformation is happening before our eyes in politics. Bernie Sanders is raising more money than Hillary Clinton by using software; he raised $46 million in March alone. Where before he would have had to beg the owning class for funds, he can simply ask us. His campaign for the presidency would not be possible without software like ActBlue.

These are examples where Software is helping produce better people. People who feel empowered. This is software helping people practice democracy in both politics and economics. Yes, technology is helping create more Socialists.


Bubble of the Mind

Mother Software is also replacing other fundamental functions of our society. Social institutions like local newspapers, zines, and libraries are being replaced with what DuckDuckGo calls the “Filter Bubble”. We now more often self-select the information we receive. Websites like Google and Reddit take our input and feed it right back to us, creating a cycle of perversion. So instead of getting opinions about information and ideas from those around us, we get them from those around the world who are like us.


Cycle of Perversion

This is further reinforced by news feed algorithms like Facebook’s, which show you what they think you like, based on your likes. It seems people are slowly being exposed to a narrower variety of ideas.

It is time we, the software makers, start thinking about whether the software we are creating is creating the friends we would want to have. What is the software that creates the best people? Because, as Earth Day reminds us, we are approaching a global crisis. Global warming is happening, and its effects are largely unknown. We will want the best people around our children when they have to handle that fatal bind.


Fatal Bind

 

The Cloud’s Shadow on Grass Computing

Fortunately for all of us, in the 15 years since the dot-com bubble collapsed, there has been an increased effort from Free Software and other developers to create decentralized systems: BitTorrent, Tox, ZeroTier One, TeleHash, MediaGoblin, Sandstorm, GNUnet, and countless others. This includes my own Fire★ project, which is part of what I have coined Grass Computing.

Grass Computing is running decentralized Free Software.

If Cloud Computing is running decentralized software on rented machines corralled behind a wall, Grass Computing is running decentralized Free Software on your own hardware, in the open. It is the Yin to the Cloud’s Yang.

There are many modern projects that enable people to do this now, but one myth is that this is something new: that in the beginning the internet and the web were centralized, and that they are moving towards a decentralized model. This is how the story is told, but the reality is the opposite. The internet and many applications that run on it were first decentralized and have since been centralized. This forgotten history should be a warning to the developers of modern decentralized systems: their systems can be co-opted and centralized at any time.

The most obvious example is email. Email is inherently decentralized: anyone can run a mail server, and two mail servers can communicate directly without a middleman. Since the late 90s, many companies started providing centralized email services, and now most people in the world don’t run their own mail server, but rely on corporations to mediate their email communications for them.

There are many reasons why this happened, but the most compelling and provocative one comes from the Telekommunist Manifesto, which basically states that capitalists didn’t understand what the internet was, but knew it was important, and started buying everything they could. This led to the dot-com bubble of the late nineties. And in order to monetize their investments, they invested in centralized computing, because they cannot control the communication of peer-to-peer software. They need central control because it is the clearest way to get a return on their investment.

In other words, capital investment moved heavily towards centralized services instead of peer-to-peer ones, starving peer-to-peer software development of resources. The brightest and best paid engineers now devote their minds and bodies to building centralized systems. These are called star networks, because all communication between nodes is mediated through a central point, giving the network the shape of a star.

This is why people prefer a central email service: it is really hard to run your own mail server and really easy to log into Gmail. Imagine a different world, where Google invested in making it easy to run your own email server, accessible from any web browser. Imagine a “Download Gmail” button where, with a couple of short steps, you could be running your own Gmail software on your own machine at home. There is no technical reason why they could not do it. There is just no incentive for them to do it.

Another example is chat. Before centralized chat programs became popular, people talked on IRC, which is federated and can be decentralized. I can run an IRC server which you connect to, and we can talk directly as a group. Now people use centralized chat services because they are far more convenient than running your own. Not because of any technical limitation, but simply because companies spend more time polishing centralized chat, making it easier and more attractive than any competing decentralized chat.

Another example is GitHub, which takes Git’s inherently decentralized model and centralizes it. People use GitHub because it is easier than hosting your own code.

We need more Grass Computing software, and I believe that with more social and capital investment, fellow engineers can make using decentralized software as easy as logging into Gmail or talking over Skype (which used to be p2p but became centralized).

Check out Redecentralize for some great interviews with engineers making grass computing easier to use. Get inspired and participate. And as a shameless plug, check out my own Fire★ project, which aims to make writing p2p software easier than writing client-server software.

If you take anything from this post take this:

Spread the idea of Grass Computing and warn others that it is a constant fight not to have decentralized software become co-opted and centralized.

Also fight for a basic income. If we have a basic income, then many more people may spend time working on decentralized systems. Consider it a social capital investment.