Elon Musk filed a lawsuit in San Francisco's Superior Court accusing OpenAI and its CEO, Sam Altman, of betraying the startup's initial commitment to openness, the betterment of society, and lack of profit as a motive. Among other things, Musk's 35-page complaint argues that OpenAI has violated its original deal to share its GPT large language models with Microsoft, which stated that the software giant would lose access to new LLMs once OpenAI had achieved AGI. According to the complaint, OpenAI reached that epoch-shifting moment a year ago with GPT-4, its most powerful model to date.
Musk, who cofounded OpenAI but left in 2018, is at least as entitled as anyone to come up with his own definition of AGI. His complaint describes it as "a general purpose artificial intelligence system - a machine having intelligence for a wide variety of tasks like a human." That does sound like GPT-4 as I, a mere layperson, experience it in ChatGPT Plus.
But Musk's declaration that the AGI era is already upon us is hardly the consensus among AI scientists. Even those who think it's not far off predict arrival dates that are at least a few years away. And GPT-4 falls well short of meeting OpenAI's own explanation of the term: "A highly autonomous system that outperforms humans at most economically valuable work."
Consider the evidence:
GPT-4 isn't remotely autonomous; indeed, it does its best work when humans provide plenty of hand-holding in the form of detailed prompts. The world is still in the process of figuring out what tasks GPT-4 can do, and we frequently overrate its competence. That's not even getting into the fact that OpenAI's reference to "most economically valuable work" suggests that true AGI may involve not just software but also sophisticated robotics that don't exist yet.
To guess when OpenAI (or a rival such as Google, Anthropic, Meta, Mistral, or Perplexity) might reach AGI, as OpenAI defines it, is to expect that it'll be an obvious moment in time. But OpenAI's definition, like all the others, is squishy and difficult to put to a conclusive test. To riff on Supreme Court Justice Potter Stewart's famous comment about pornography, maybe we'll know it when we see it. At the moment, however, I'm convinced that obsessing over AGI's existence or nonexistence is counterproductive.
The whole notion of AGI is predicated on the assumption that AI started out dumber than a human but could someday match or exceed our level of thinking. Already, though, generative AI is different from human intelligence: far closer to omniscient than any individual flesh-and-blood thinker, yet also preternaturally gullible and prone to blurring fact and fiction in ways that don't map to common human frailties. That's because it's a predictive engine, trained to string together words without truly understanding them. If its present trajectory of simulated brilliance mixed with boneheadedness continues, it might wander off in a direction far afield from most definitions of AGI.
Even if the world lands on a new, more inclusive definition of AGI, it may be hard to prove whether a particular LLM has attained it. Musk's lawsuit cites proof points of GPT-4's reasoning power, such as its scoring in the 90th percentile on the Uniform Bar Exam for lawyers and the 99th percentile on the GRE Verbal Assessment. That it can do so is astounding. But acing tests is not synonymous with performing useful work. And even if it were, who gets to decide how many tests an LLM must pass before it's achieved AGI rather than just bobbled somewhere in its vicinity?
For decades, the Turing Test, which a computer would pass by fooling a human into thinking that it, too, was human, was computer science's beloved thought experiment for determining when AI had gotten real. Strangely enough, it's useless as a tool for assessing today's LLM-based chatbots. That's not because they know too little to fake humanity convincingly, or can't express it glibly enough, but because they betray their artificiality by being so good at churning out endless wordage on more topics than any human knows. AGI could end up in a similar predicament: a benchmark, devised by humans, that's rendered obsolete by the technology it was meant to measure.
DID YOU HEAR THE ONE ABOUT THE "MAC CAR?"
Last week, Apple's long, expensive quest to build an autonomous EV entered its rearview-mirror phase, a sad fate my colleague Jared Newman blamed on the company's sometimes counterproductive pursuit of perfection. Wondering what an Apple car would be like has been an obsession for techies since 2012, when news broke that Steve Jobs had toyed with getting into the automobile business even before there was an iPhone. Or maybe it started in 2008, when reports of a meeting between Steve Jobs and Volkswagen's CEO led to wild speculation about an "iCar."
Or how about 1998? According to Snopes, that's when a joke involving cars designed by software companies began spreading like crabgrass across the internet, eventually evolving into an urban legend involving a Bill Gates keynote and a General Motors press release. Along with a Microsoft car that crashed twice a day and occasionally needed its engine replaced for no apparent reason, it mentioned a "Mac car" that "was powered by the sun, was reliable, five times as fast, twice as easy to drive - but would only run on 5% of the roads."
Yeah, absolutely. Even AI as a term has become a crock of shit because it's been latched on to by companies to market their products in the AI equivalent of the dotcom boom.
Artificial Intelligence was once a sufficient term for Artificial General Intelligence. Now any old algorithm is being labelled AI to sell it.
But the terms don't matter - the concept is sound, but it's further away than we probably expect because so much crap is being sold to make a quick buck.
Chat-GPT is basically beta software and it is practically useless because it's inaccurate. You can't use a tool in business, government or healthcare when it can be wrong and, worse, so confidently wrong. It's an impressive tool but they still haven't got that working well, let alone any further "advances".
And blindly throwing data at LLMs and hoping to stumble on AGI is not going to work - crudely, that is the approach of many of the cowboy outfits out there claiming to be innovating in AI. That includes big tech companies who have jumped on the bandwagon over the last 18 months.
The term "Artificial Intelligence" is an umbrella term for a wide range of algorithms and techniques that has been in use by the scientific and engineering communities for over half a century. The term was brought into use by the Dartmouth workshop in 1956. It's perfectly applicable to LLMs and other similar generative algorithms being used today, and many less sophisticated ones as well. "Artificial general intelligence" is a subset of AI.