AI Will Change Our Lives More Than The Printing Press or Internet
Machine learning, commonly referred to as artificial intelligence, has been going through a phase change.
A phase change is what happens when water shifts from ice to liquid, or from liquid to gas: the substance undergoes a fundamental reorganization, and the shift is rapid and total. At 32 degrees Fahrenheit water is solid; one degree warmer and it is liquid. In the same way, AI has undergone a fundamental change in its capabilities.
Case in point: in 2018, Google’s BERT model had a game-changing 110 million parameters. In 2022, the top AI programs, like China’s Wu Dao (“Enlightenment”), are running more than a trillion parameters. In four years, AI programs have grown roughly 10,000 times larger. Follow that trajectory over a mere 10 years, and the implications become staggering.
We’ve moved from AI that barely understood us to AI that can guess what we’ll say next and speak it back in plain English with the right tonal inflection. It can look at a picture and name the elements in it, or compose poetry that is purportedly “excellent.” But it gets weirder.
Emergent properties have begun to manifest, and the whole is revealing itself to be greater than the sum of its parts. They can’t be found through Cartesian reductionism. For instance, a watch with gears is easy to understand if you take it apart. Not a brain. Single neurons, “the gears” of our brains, do nothing by themselves. Put a hundred billion of them together in the right way, and the emergent property of mind appears. It is uncanny, meaning that it isn’t just strange and mysterious, it’s unsettling.
For example, Microsoft’s Florence model and GPT-3 belong to a family of systems that can generate images from text input that appear original and artistically sophisticated. The cover image above was created by such a system for The Economist after it was fed the headline “March of the machines.” Note the art deco elements, the relevance of the imagery, and the coherent style.
Other tools, like Midjourney, have learned to create images from a sequence of words and then apply a style they have taught themselves to associate with a name, such as “Salvador Dali” or the Renaissance style of “Bruegel.” They can even collage the two styles together, which brings us to another benefit of emergent properties: flexibility.
Historically, an AI did only one very specific thing; newer models do much more with relative ease and require only minor changes in programming. These electronic “brains,” built by exposing AI to massive amounts of data, are called “foundation models.” The results they produce are shifting from the artsy and unpredictable to the precise and useful, inspiring AI expert and author Jack Clark to claim that “AI is moving into its industrial age.”
To understand the size and importance of its impact, we must first understand “general-purpose technologies.” These are core technologies with applications so broad that they spill over into other sectors, causing cascading repercussions throughout associated industries. When the printing press was invented, it allowed a body of knowledge to be copied and shared across great distances, ultimately helping to spark the Renaissance. Likewise, the steam engine gave locomotives the power to move huge amounts of freight across the country, transforming the economy. Now it is becoming clear that AI foundation models are going to be a general-purpose technology, too.
They’re so important that companies like Microsoft, Facebook, Google and Tesla are devoting upwards of 80% of their research to this field. Given the Chinese government’s support for the development of Wu Dao, China clearly considers the field a national priority as well. The race is on to gain what could be a major technological edge, and whoever does so will likely benefit from a concentration of economic and political power. Fei-Fei Li, co-director of Stanford University’s Institute for Human-Centred AI, says we are in a “phase change in AI.” To understand what that means, consider how different today’s AI is from that of days gone by.
All modern machine learning is based on “neural networks”: programming that mimics how brain cells interact with each other by simulating small clusters of neurons that are “trained” through trial and error. Their output depends on how they learn to process specific inputs. Historically these were interesting models, but ultimately impractical because of the computing power they required.
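For readers who like to see the nuts and bolts, here is a toy sketch of that trial-and-error loop, written in Python with nothing but the NumPy library. A single simulated neuron guesses, measures how wrong it was, and nudges its internal weights until it reliably computes a simple logical function. It is an illustration of the principle, not any production system.

```python
# Toy illustration of "training by trial and error":
# one artificial neuron learns the logical AND function.
import numpy as np

rng = np.random.default_rng(0)
inputs = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
targets = np.array([0, 0, 0, 1], dtype=float)           # logical AND

weights = rng.normal(size=2)                             # start with random guesses
bias = 0.0
learning_rate = 0.5

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))                      # squash output to 0..1

for step in range(5000):                                 # the trial-and-error loop
    guesses = sigmoid(inputs @ weights + bias)           # trial: make predictions
    errors = guesses - targets                           # error: how wrong were we?
    grad = errors * guesses * (1 - guesses)              # direction to nudge each guess
    weights -= learning_rate * (inputs.T @ grad)         # adjust the "gears"
    bias -= learning_rate * grad.sum()

print(np.round(sigmoid(inputs @ weights + bias), 2))     # values close to [0, 0, 0, 1]
```

Scale this idea up from one neuron to billions, and from a four-row truth table to the internet, and you have the recipe behind the systems described below.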
Around 2010, computers became powerful enough to simulate large clusters of neurons trained on internet data such as imagery. This enabled AI to provide useful services, like text translation, which has become more and more accurate.
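To make that concrete, here is a minimal sketch of what calling a modern neural translation model looks like today. It assumes the open-source Hugging Face Transformers library and the small pretrained T5 model, which are illustrative stand-ins rather than the specific systems discussed in this article.

```python
# Minimal sketch of neural machine translation (assumes `pip install transformers`).
from transformers import pipeline

# Load a small pretrained model that can translate English to French.
translator = pipeline("translation_en_to_fr", model="t5-small")

result = translator("Machine learning has been going through a phase change.")
print(result[0]["translation_text"])
```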
Today, text-to-speech plugins can easily read your text out loud with a close approximation of the right tonal inflection. How did this happen so quickly? Blame it on video gamers.
Modern gamers have a voracious appetite for computing power to render photo-realistic games in real time, supplied by “graphics processing units” (GPUs) like those from the chipmaker Nvidia. These cheap chips run many calculations in parallel, making them ideal for simulating neural nets. Around 2010, their performance began to increase rapidly.
In 2017, researchers at Google and the University of Toronto created a new software architecture, the transformer, which underpins Google’s BERT, a natural-language processor that helps Google better understand your search queries. Previous systems processed text sequentially, word by word; a transformer processes it all at once, finding word patterns by looking at a whole field of text rather than stepping through it one word at a time.
BERT was a major evolution in machine learning because it didn’t require pre-labelled data sets to learn. It could teach itself using a technique called self-supervised learning: while scanning text, it would hide words from itself, then guess, based on context, what the missing words should be. After a few billion guess-compare-improve-guess cycles, it got incredibly good. Since then, the same principle has been applied to pictures, videos and more, all so that you and I could get better search results.
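That “hide a word and guess it” trick is easy to see in action. The sketch below assumes the open-source Hugging Face Transformers library and a publicly released BERT checkpoint; it asks the model to fill in a masked word and prints its best guesses.

```python
# Masked-word guessing with a pretrained BERT model (assumes `pip install transformers`).
from transformers import pipeline

fill_mask = pipeline("fill-mask", model="bert-base-uncased")

# The model was never given a labelled answer for this sentence; it learned by
# guessing millions of hidden words and comparing its guesses with the originals.
sentence = "The printing press allowed knowledge to be [MASK] across great distances."
for guess in fill_mask(sentence):
    print(f"{guess['token_str']:>12}  score={guess['score']:.3f}")
```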
An important discovery from these tests was that the models appear to work better the bigger they get. This led to an important breakthrough with OpenAI’s GPT-3 in 2020. Its predecessor, GPT-2, had been fed 7,000 unpublished works of fiction, totalling 40GB, and had 1.5bn parameters. GPT-3 processed 570 gigabytes; its data set included more books, a chunk of the internet and all of Wikipedia. It was more than 100 times bigger than GPT-2, with 175bn parameters.
Because it had been fed a big chunk of the internet, GPT-3 saw a lot of code, and another emergent property appeared. GPT-3 had not only trained itself to read and write clear English; it had trained itself to write computer code. The results were unimpressive for large programs, but acceptable for smaller ones. Today, developers on GitHub, an open-source code repository, are using Copilot, an AI pair-programming tool built on this technology, to produce about a third of their code.
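To give a feel for what that looks like in practice, here is a hypothetical illustration (not actual Copilot output) of the workflow: the developer writes a plain-English comment, and the model suggests a working implementation beneath it.

```python
# Prompt written by the developer:
# "return the n-th Fibonacci number, with fib(0) == 0 and fib(1) == 1"

# The kind of completion a code-generation model typically suggests:
def fib(n: int) -> int:
    a, b = 0, 1
    for _ in range(n):
        a, b = b, a + b
    return a

print([fib(i) for i in range(8)])  # [0, 1, 1, 2, 3, 5, 8, 13]
```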
The pace of development has been phenomenally fast. In early April 2022, Google released PaLM, which has 540bn parameters and outperforms GPT-3 in many ways; it can even explain jokes. A month later DeepMind, a Google sister company under Alphabet, released Gato, which can play video games (meaning it can process what it is seeing) and can also manipulate a robot arm to perform tasks. Now Meta, the parent company of Facebook, has begun developing what it calls a “World Model” to examine facial movements and body signals to power its metaverse.
Other firms are using foundation models to provide additional services. Fable Studio makes interactive stories with them; Elicit answers research questions based on academic papers; and Viable sifts through customer feedback looking for important insights. Foundation models are also distilling corporate data to understand the basics of banking or car making.
Naturally, all is not perfect. Feed the system a nonsensical prompt and it will make up results rather than admit it doesn’t know the answer. Ask GPT-3 to continue a story in which Muslims are doing something and it will often take the narrative in a violent direction, which it does not do when prompted with references to other faiths.
Even though models can be refined using reinforcement learning with human feedback, and some biases can be reduced, biases are still very much present. The bigger concern is that biases humans don’t even know to look for may creep into the results. And there are other social implications.
At Stanford, economist Erik Brynjolfsson worries that the push to make AI bigger and more human-like could catch us in a “Turing trap”, named for Alan Turing, the father of modern computing, who speculated about machine intelligence. The trap comes from building machines to do things humans already do instead of augmenting human capabilities. If machines start replacing humans, for example on the assembly line, more people will lose their jobs and their bargaining power will shrink. Wealth and power could then be consolidated into the hands of oligarchs or plutocrats.
This is already starting to happen. Google and Microsoft run cloud systems that require huge amounts of resources, and Nvidia executives predict that cutting-edge models will soon cost $1bn to train, making it cost-prohibitive for institutions like Stanford to run their own. Stanford’s AI institute is therefore pushing for a government-funded “National Research Cloud”, analogous to China’s Wu Dao effort or France’s BigScience, to give universities computing power now available only to private companies.
Another factor that could drive centralization in this field, based on what we’ve seen in the past, is that the winner takes all. Why? As more users and developers move to a given platform, be it an operating system or a social network, social proof makes it more attractive to still more users and developers, until it becomes an industry standard. This is how companies like IBM and Microsoft rose to dominance.
More importantly, national security may be at stake. In his book “Daemon”, author Daniel Suarez envisions an AI spreading through the internet, first replacing organized-crime structures, then the government. That may be far-fetched, but consider a service like Copilot, which uses AI to write computer code. Copilot has limitations; a similar service without them could build viruses and release them. What if such a self-programming code generator were created by a government that wanted to destabilize the world? Another malicious use already within reach is automating the creation of deepfake video streams at scale to spread misinformation. And as the quality of AI-generated art improves, what happens when it is used to create propaganda?
Either way, few people believe AI is going to become sentient. But what if an AI began to chart its own course? Some have said “we’re building a self-driving car without a steering wheel.” What if, for example, an AI begins to modify itself? It could build a better AI, which would build a better AI, and so on. Self-evolving code could move incredibly fast, and then what? Given access to the internet, what could it learn, and how could it affect us?
Stepping away from these concerns, it will be exciting to see what happens from a communications perspective as AI augments our relationships. What if you could create a personal negotiation program, a kind of “daemon” in the Unix sense of a helper that runs quietly in the background? Such a daemon could be invoked to work through psychological issues. One man has already used an AI as therapy for his depression, and claims it saved his marriage.
As always, there will be benefits and drawbacks to this technology. It will both simplify our lives and make them more complex, given how poor design tends to infiltrate systems, and that is something we will have to pay close attention to. Consider, for example, how time-consuming it has become to log in to your software. Or consider that I’m typing with ten fingers on a full keyboard, while most people type with two thumbs on their phones.
In conclusion, my grandmother lived to be 93. She saw the birth of radio, color TVs, personal computers and the internet. All within one lifetime.
Given the rate of exponential growth we’re seeing with AI, and the incredible repercussions this general-purpose technology will have on countless industries, one thing in our future is certain.
We live in an Age of Technomagic where inconceivably advanced technologies will emerge out of nowhere, like the iPhone or Facebook, and change our lives forever.