Deep Latent Space Maps

As I watch my children grow up, I am always amazed at their pace of learning. It turns out that by age 18 the average person has learned about 60,000 words, which works out to roughly 9 words a day. Typically these new words are learned from as little as one example of the thing or concept. Anyone working in the AI field understands that one huge drawback of the latest NN-based learning systems is that the number of examples the system needs to learn a label is enormous. It typically takes thousands to millions of examples, which is out of reach for most researchers in the field; this is why public datasets (TIMIT, MNIST, ImageNet, ActivityNet, etc.) are used for training and testing new models.

I think it’s clear to most researchers that Deep Learning has a deep problem, because these systems learn in ways that are clearly inferior to how we do. As I watch my children learn, I am deeply unsatisfied with current techniques and know there must be a better way. I am not the only one who feels this way, and many have started looking at older ideas in ML to combine with Deep Learning (Deep Learning itself is an old idea).

There is some exciting research coming out suggesting that grid cells represent objects across many domains, taking input from the senses and voting on the best model of the world being perceived. This research seems to partly validate Geoffrey Hinton’s idea of capsule networks, in which many capsules vote on predictions for higher-level features. In Hinton’s case, the capsules typically vote on manually selected features such as the position and orientation of an object. It seems we could learn the features instead, because it’s possible the brain uses grid cells to map onto features in an abstract latent space.

One promising approach to learning this mapping is to use deep learning to learn embeddings from some medium, that is, to take input data and map it into a latent space. This is called Deep Feature Extraction. Such a trained network can convert an image, sound, or text into a Feature Vector. This technique is useful because it can be done unsupervised, requiring only many examples of data (images, sound, text, etc.) instead of labeled ground truth. These learned feature vectors are then typically used downstream by some other technique (XGBoost, SVM) to learn labels.
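
To make this concrete, here is a minimal sketch of the pipeline in Python, assuming PyTorch/torchvision and scikit-learn are available: a pretrained ResNet-18 stands in for the deep feature extractor, and an SVM learns labels from the extracted feature vectors downstream. The images here are random stand-in tensors rather than a real dataset.

```python
import torch
import torchvision.models as models
from sklearn.svm import SVC

# A pretrained CNN used as a fixed feature extractor: drop the final classification layer.
backbone = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)
extractor = torch.nn.Sequential(*list(backbone.children())[:-1]).eval()

def embed(images):
    """Map a batch of preprocessed images (N, 3, 224, 224) to feature vectors (N, 512)."""
    with torch.no_grad():
        return extractor(images).flatten(1).numpy()

# Stand-in data: random tensors in place of real, preprocessed images.
images = torch.randn(8, 3, 224, 224)
labels = [0, 0, 1, 1, 2, 2, 3, 3]

# Downstream learner (an SVM here) trained on the extracted feature vectors.
clf = SVC(kernel="rbf").fit(embed(images), labels)
print(clf.predict(embed(torch.randn(2, 3, 224, 224))))
```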

One area I want to explore is to take Peter Gärdenfors’s idea of a Conceptual Space and combine it with deep feature extraction and Geoffrey Hinton’s idea of capsule networks. What I want to do is use deep feature extraction to produce feature vectors in latent space. Then labeled data can be used to map out the latent space over several domains using Voronoi space partitioning. You can then train many of these mappings for the domains of your choosing and use a voting mechanism to extract the probable labels. I call this latent space partitioning using labeled domains Deep Latent Space Maps.
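
Here is a minimal sketch of what a single map and the voting step might look like, assuming feature vectors are plain NumPy arrays; the names `LatentSpaceMap` and `vote` are illustrative, not an existing library. Each labeled feature vector becomes a Voronoi site, classification within a map is a nearest-site lookup, and several maps vote on the final label.

```python
import numpy as np
from collections import Counter

class LatentSpaceMap:
    """One labeled Voronoi partition of a latent space: 1-NN over labeled sites."""
    def __init__(self):
        self.sites, self.labels = [], []

    def add(self, feature_vector, label):
        # A single labeled example carves out a new Voronoi cell for that label.
        self.sites.append(np.asarray(feature_vector, dtype=float))
        self.labels.append(label)

    def classify(self, feature_vector):
        v = np.asarray(feature_vector, dtype=float)
        distances = [np.linalg.norm(v - site) for site in self.sites]
        return self.labels[int(np.argmin(distances))]

def vote(maps, feature_vector):
    """Each domain's map casts a ballot; the most common label wins."""
    ballots = [m.classify(feature_vector) for m in maps]
    return Counter(ballots).most_common(1)[0][0]

# Two toy "domains" voting on the same feature vector.
shape_map, texture_map = LatentSpaceMap(), LatentSpaceMap()
shape_map.add([1.0, 0.0], "cat"); shape_map.add([0.0, 1.0], "dog")
texture_map.add([0.9, 0.1], "cat"); texture_map.add([0.1, 0.9], "dog")
print(vote([shape_map, texture_map], [0.8, 0.2]))   # -> "cat"
```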

The devil is in the details, but in principle, learning becomes the creation of these Deep Latent Space Maps. Classification is then a matter of running the inputs through all the mappers and using the voting mechanism to extract the probable labels. In other words, you can train it with as little as one example of an object, word, or thing. As more examples are provided, the map can be re-adjusted to take in the new information. The interesting bit here is exploring Peter Gärdenfors’s idea of how learning works in our brains, which new research is increasingly validating. It just feels right.
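
To show the one-shot and re-adjustment idea, here is a small illustrative sketch where each label keeps a prototype, the running mean of its examples. A single example is enough to define a cell in the map, and later examples shift it. The helper names are hypothetical, not part of any library.

```python
import numpy as np

prototypes = {}  # label -> (mean feature vector, number of examples seen)

def learn(label, feature_vector):
    """One example is enough; later examples re-adjust the label's prototype."""
    v = np.asarray(feature_vector, dtype=float)
    if label not in prototypes:
        prototypes[label] = (v, 1)                              # one-shot: a single point defines the cell
    else:
        mean, n = prototypes[label]
        prototypes[label] = ((mean * n + v) / (n + 1), n + 1)   # running mean shifts the cell

def classify(feature_vector):
    v = np.asarray(feature_vector, dtype=float)
    return min(prototypes, key=lambda label: np.linalg.norm(v - prototypes[label][0]))

learn("cat", [0.9, 0.1]); learn("dog", [0.1, 0.9])    # one example each
print(classify([0.8, 0.2]))                           # -> "cat"
learn("cat", [0.7, 0.3])                              # new evidence re-adjusts the "cat" prototype
```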

We Must Become System-System Thinkers

Everyone building software must become a system-system thinker. A system-system thinker is someone who understands that the system they are building is part of a larger whole of other systems. The software we build is part of a more significant system that extends beyond the computer screen and network packets. Despite what many say, software does have a physical manifestation, in the form of the computers it runs on and the people who make and use it. The internet and the World Wide Web are mappable and have a physical form. All software runs on hardware, and hardware exists in the real world, maintained by people.

The most critical computer system in the world is undoubtedly the internet. It’s the only computer system that has grown to the scale of trillions of components. All the atoms of the internet have been replaced, and it has never been shut down. This is no small accomplishment and is a result of the great design of TCP/IP and the protocol stack.

The reason the internet was able to scale so successfully is that its design is excellent. It scales because it was designed like another system of vast scale: the ecosystem.

The term “ecosystem” is an interesting concept in itself because it was inspired by the electrical view of the brain proposed by Sigmund Freud. One night, Arthur Tansley, the ecologist who promoted the concept of the ecosystem, had a dream in which he shot his wife. The dream disturbed him so much that he read the works of Freud. Freud described the brain as an electrical system of flowing energy, and Tansley thought that nature must work the same way: instead of electricity, nature has flows of minerals and energy. In other words, our view of nature is inspired by the technological developments of its time. Our view of nature is also that of a stable system that can adjust to shocks. As we shall see, it appears this view is wrong.

We have a much bigger systemic problem that will likely impact the computer and software industry in the coming years. The community can ignore it, but I want to make it clear: if nothing is done, there won’t be a computer industry or a software profession. High technology needs organized human life to develop and function, and that won’t be possible amid systemic failure.

The biggest threat of technological systemic failure is global warming. Why is global warming the biggest threat to the computer and software industry? We must first put our system-system thinking caps on. Now that our caps are on, let’s look at the extent of the problem.

The warming is accelerating, and the IPCC’s new report on 1.5 degrees explains that we have a short window, 11 years until 2030, to dramatically reduce emissions. I sense that this is a conservative estimate, and that most scientists privately believe we will blow past 1.5 degrees. And it is the ecosystem that global warming is hitting hardest.

Those who want to understand whether a systemic failure is already underway in our ecosystem should read the reports about the mass extinction happening now. A recent article in the Guardian described the collapse of insect populations in the Puerto Rican rainforest, stating that 98% of ground insects had vanished. This is a total collapse of the ecosystem, since a collapse of insects means a collapse of any animals relying on them. Scientists have attributed the insect collapse to global warming, because the number of extremely hot days in the rainforest has dramatically increased, going from almost zero to 40% of days. The 2018 edition of the Living Planet Report found an average 60% decline in wild vertebrate populations since 1970. In other words, in less than 50 years, we have lost most of the wild animals of the world. To understand how few animals are left, a study published in the Proceedings of the National Academy of Sciences found that humans make up 36% of all mammal biomass on the planet, domesticated mammals make up around 60%, and only 4% is wild.

While the internet has proven to be a robust system, it isn’t disconnected from the larger world. If the total collapse of our ecosystem causes mass social upheaval, the internet will stop functioning. It relies on points of centralization (large organizations) and large-scale investment to keep running.

We have a choice as technologists: either design systems that could be robust against social collapse, or build systems that help maintain organized human life. In the past, I worked on the problem by designing robust networking systems. I built Fire★, a P2P computing system, back in 2013 because I believed a technical solution was possible. However, having understood that our computer systems are connected to much larger systems, I no longer believe in a purely technical solution. One of those larger systems, the ecosystem, is decidedly collapsing (not tomorrow, but now), so it’s not far-fetched to imagine that our social systems will follow, because they too rely on the ecosystem. If our social systems collapse, the framework that allows technological progress and development will collapse with them.

The future of the computer industry and software profession is in peril. We must acknowledge this as a community and take appropriate actions. We must be system-system thinkers!

The Divide by Zero Programmer

Most working programmers have heard of the 10x programmer, a mythical creature that accomplishes ten times more than their peers. Some believe it’s a myth, and some believe it’s real. I personally don’t know, but I do know of a creature that is ultra-rare but very real: the Divide by Zero programmer.

In the IEEE 754 floating-point arithmetic used by most programming languages, dividing a nonzero number by zero is defined as infinity. The intuition comes from the function 1/x as x approaches zero from above: 1/0.1 = 10, 1/0.01 = 100, 1/0.000001 = 1,000,000, and so on.
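
A quick demonstration in Python: the limit is visible with ordinary division, while NumPy stands in for the IEEE 754 rule, since Python’s own `/` raises `ZeroDivisionError` rather than returning infinity.

```python
import numpy as np

# The limit: 1/x blows up as x shrinks toward zero from the positive side.
for x in [0.1, 0.01, 0.000001]:
    print(1 / x)                                 # 10.0, 100.0, 1000000.0

# IEEE 754 defines a nonzero number divided by 0.0 as infinity. Python's own
# `/` raises ZeroDivisionError instead, so a NumPy float shows the IEEE rule.
with np.errstate(divide="ignore"):
    print(np.float64(1.0) / np.float64(0.0))     # inf
```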

If we define “one” in our case to mean “Able to start a project from scratch” and “zero” to mean “Unable to start a project from scratch”, then a programmer who is able to start a project is unknowably more valuable than someone who can’t.

Now you may ask, “Valuable how?” Many great programmers are wonderful at modifying existing software, and I would argue that, in aggregate, their work is more valuable than what the Divide by Zero programmer does. After all, you make a cup once but wash it a thousand times. It’s no secret that most software is in maintenance mode; software goes into maintenance mode as soon as version one is released. So it’s perfectly valuable to have people who can improve and maintain that software.

However, the Divide by Zero programmer can be an engine that drives invention in software. They are rare because the blank page is terrifying. It takes a bit of what I call “The Fighting Spirit” to tackle the blank page.

The story doesn’t end there. Those alert enough might have realized that x can approach zero from the other side! That is, 1/-0.1 = -10, 1/-0.01 = -100, 1/-0.000001 = -1,000,000, and so on. In other words, the Divide by Zero programmer can be unknowably destructive too!
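
The same demonstration works for the other side of the limit; IEEE 754’s signed zero even gives us a literal negative infinity.

```python
import numpy as np

# Approaching zero from below: the same limit runs off toward negative infinity.
for x in [-0.1, -0.01, -0.000001]:
    print(1 / x)                                 # -10.0, -100.0, -1000000.0

# IEEE 754's signed zero captures the "other side" directly.
with np.errstate(divide="ignore"):
    print(np.float64(1.0) / np.float64(-0.0))    # -inf
```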