There’s no question that Large Language Models (LLMs) like ChatGPT have upended our society. This is especially evident when you think back just a few short years and realize that the tasks LLMs now perform routinely would have seemed like pure science fiction. Conjuring up striking art, an insightful editorial, or a super-charged search answer… each of these feats is a small miracle in its own right, and already we take them for granted. Still, this seemingly democratic “level up” for society as a whole has been remarkable, and it will make the history books in the same way the internet and social media have reshaped how we live. So what’s the problem?
Just like other major advances in tech, LLMs are not without their downsides. We are already seeing the darker side of LLMs creep into society as they are put to work by people with dubious ethics. Bad actors aside, though, LLMs carry baggage that is starting to affect their performance and our ability to use them. This is frustrating because, on the whole, LLMs can offer us far more than previous AI models, and they are far more accessible to the general public than a very narrowly trained, brittle algorithm.
These models are worth fighting for, and worth improving upon. Their weak points carry real costs, but there are signs that large models can find a more comfortable middle ground between the bloated, over-generalized LLMs available today and the hyper-focused AI algorithms that solve a single problem in the hands of a specialized technician. Let’s look at the latest evolutions to see how this middle ground is being achieved thanks to the decentralized nature of Web3. Organizations like the Artificial Superintelligence (ASI) Alliance are bringing some very interesting use cases to life, offering a glimpse of what large, ambitious AI models could look like in the near future: pairing breadth of knowledge in a given area with the specialization needed to offer real, reliable value.
The LLM Bloat
For the casual user of ChatGPT, the enormous and concerning bloat of LLMs might not be evident. After all, the current business model makes it easy for the average user to log into a clean, simple interface, ask a question, and get an instant answer for free. For many questions, the answers are adequate, even excellent at times. What users don’t see, however, is the bloat behind the scenes. The energy it takes to run a single LLM query is eye-watering compared to a conventional search engine, and the energy needed to train an LLM is far greater still, with a single training run consuming as much electricity as more than a hundred homes use in a year.
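To put rough numbers on that claim, here is a minimal back-of-envelope sketch in Python. The figures are commonly cited public estimates (roughly 1,300 MWh to train a GPT-3-class model, about 10,500 kWh per year for an average US household, and on the order of 3 Wh per LLM query versus 0.3 Wh per traditional web search), not official measurements, so treat the output as an order-of-magnitude illustration only.

```python
# Back-of-envelope energy comparison. All figures are rough,
# commonly cited estimates, not official measurements.

TRAINING_ENERGY_MWH = 1_300        # approx. energy to train a GPT-3-class model
HOUSEHOLD_KWH_PER_YEAR = 10_500    # approx. annual electricity use of a US home
QUERY_WH_LLM = 3.0                 # approx. energy per LLM query
QUERY_WH_SEARCH = 0.3              # approx. energy per traditional web search

homes_equivalent = (TRAINING_ENERGY_MWH * 1_000) / HOUSEHOLD_KWH_PER_YEAR
query_ratio = QUERY_WH_LLM / QUERY_WH_SEARCH

print(f"Training one large model ~ {homes_equivalent:.0f} homes' annual electricity")
print(f"One LLM query ~ {query_ratio:.0f}x the energy of one web search")
```

Under these assumptions, a single training run lands at well over a hundred homes’ worth of annual electricity, which is where the figure above comes from.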
Beyond the sheer energy, however, is the core challenge of building a model that can do “everything.” The comparison to humans is fairly straightforward. An LLM is typically trained on a very wide breadth of data and subject matter, because user queries vary widely. But don’t think of it as a person learning a wide range of topics; we learn a little about a lot of things as part of our natural education and experience. An LLM is more akin to a reigning game show champion, fielding very specific, in-depth questions about nearly any subject. The training required for that is orders of magnitude more intensive than natural experience accumulated over time.
Even when successful, the massive amount of training required for an LLM doesn’t translate into the performance you might expect. When you cram that much training and sheer knowledge into a single brain, even an artificial brain designed for it, problems follow. For one thing, an LLM can seem smart while being astonishingly inaccurate. LLMs “hallucinate,” offering information conjured from nowhere or giving vague answers they cannot trace to any source. Needless to say, this is not helpful. We have also discovered that there is a limit to LLM performance. Spread this thin across knowledge areas, LLMs tend to plateau, developing latency problems, accuracy problems, and poor power efficiency for the value they provide. What is the result of all this? The bottom line is that when LLMs need to answer detailed, accurate, and important questions, they are biased, verbose without being relevant, wasteful with resources, and sometimes just wrong. For those taking the answer with a big grain of salt, or using LLM output to feed a creative process, this is fine. But for serious work, like discovering cures for diseases or simulating complex robotics, LLMs are not cut out for the job.
A Happy Medium: Domain-specific Foundational Models
There has been a natural evolution as these LLM weaknesses become more obvious. The domain-specific foundational model has shown a lot of promise for solving the more serious problems that are too general for traditional, narrowly trained AI, yet demand more depth than LLMs can deliver. This type of model confines itself to a specific domain, such as robotics, biotech, or quantum mechanics, then trains deeply and accurately across that subject. Early progress suggests that use cases like drug discovery and robot simulation could be revolutionary, providing reliable answers to a wide range of problems without having to train a separate AI model for each individual problem.
Currently the leader in this field is the Artificial Superintelligence (ASI) Alliance mentioned above, although the field continues to grow as the results speak for themselves. The ASI Alliance has developed an interesting twist with its model, “ASI<Train/>”, which transforms the ability to scale up training while keeping answers validated. The secret to its success is using Web3 to create an automated, distributed process that rewards contributors as their contributions are collected and verified. A client submits the problem they need solved, and their budget rewards a globally distributed group of participants who each take on a small part of the process, validate it, and submit it to advance the model’s training. At a high level the process is simple; the details of scaling and validation are where the organization ensures success. The results so far suggest this may be the next revolution in AI, combining the ability to go deep and accurate on important questions with enough breadth to serve many professionals answering questions that could truly benefit society.
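To make the high-level flow concrete, here is a hypothetical sketch in Python of a distributed contribute-validate-reward loop of the kind described above. This is not the ASI Alliance’s actual protocol or the ASI<Train/> API; the class names, the simple majority-quorum validation rule, and the even payout split are all assumptions made purely for illustration.

```python
from dataclasses import dataclass, field

@dataclass
class TrainingTask:
    """A client-submitted problem, funded by a reward budget."""
    description: str
    budget: float                      # total reward pool funded by the client
    shards: list = field(default_factory=list)

@dataclass
class Contribution:
    contributor: str
    shard_id: int
    result: str                        # e.g. gradients, labels, or evaluations

def validate(contribution, validators, quorum=0.66):
    """Hypothetical check: a contribution is accepted if enough independent
    validators approve it. A real system would rely on on-chain attestation,
    redundancy, or cryptographic proofs rather than this toy rule."""
    approvals = sum(1 for v in validators if v(contribution))
    return approvals / len(validators) >= quorum

def distribute_rewards(task, accepted):
    """Split the client's budget evenly across accepted contributions."""
    if not accepted:
        return {}
    share = task.budget / len(accepted)
    return {c.contributor: share for c in accepted}

# Example: one client task, three independent validators, two contributions.
task = TrainingTask(description="simulate robot arm kinematics", budget=900.0)
validators = [lambda c: True, lambda c: True, lambda c: len(c.result) > 0]
contributions = [
    Contribution("alice", shard_id=0, result="grad_update_0"),
    Contribution("bob", shard_id=1, result="grad_update_1"),
]
accepted = [c for c in contributions if validate(c, validators)]
print(distribute_rewards(task, accepted))   # -> {'alice': 450.0, 'bob': 450.0}
```

The point of the sketch is simply that the incentive loop, small units of work, independent validation, and automatic payout, is what lets training scale out across many participants while keeping contributions verifiable.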
What’s Next?
LLMs will likely stay in the spotlight as the average user gets what they need from over-generalized but mostly accurate answers. However, with unsustainable business models and growing inefficiencies, the LLM architecture will fade and give way to better evolutions in AI, such as the domain-specific foundational model. One thing is for sure: we will continue to see growth, improvement, and a tireless drive to push further, now that AI has given us a glimpse of what it can do.
