Generative artificial intelligence has a reliability problem. Here’s how investors can gain confidence in portfolios that deploy the technology.
As generative artificial intelligence (GAI) gains popularity, the technology’s tendency to fabricate responses remains a big flaw. We believe specialist models can be designed to reduce hallucinations and improve AI’s accuracy and efficacy for use in investing applications.
If you’ve played with ChatGPT or GAI-driven applications over the last year, you’ve probably been amazed and skeptical. The technology has dazzled us with its ability to write smart summaries, compose poetry, tell jokes and answer questions on a range of topics in remarkably well-written prose. Yet it also tends to fabricate information: between 3% and 27% of the time, depending on the model, according to one study by AI start-up Vectara. While this defect may be tolerable in entertainment applications, GAI’s hallucinations must be tamed for investors to gain a high level of confidence in its output for portfolios.
Why Does GAI Hallucinate?
The magic of GAI happens in large language models (LLMs). LLMs are algorithms, based on deep learning technology, that can recognize, summarize, translate, predict and generate text and other forms of content. The knowledge that drives these models is based on massive datasets and the statistical probabilities of words and word sequences occurring in a particular context.
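To make that idea of "statistical probabilities of word sequences" concrete, here is a minimal, purely illustrative sketch that estimates next-word probabilities from bigram counts in a toy corpus. Production LLMs learn far richer conditional distributions with deep neural networks over massive datasets; the corpus and word choices below are assumptions for demonstration only.

```python
from collections import Counter, defaultdict

# Toy illustration of word-sequence probabilities:
# estimate P(next_word | current_word) from bigram counts in a tiny corpus.
corpus = "rates rise and bond prices fall while rates fall and bond prices rise".split()

bigram_counts = defaultdict(Counter)
for current_word, next_word in zip(corpus, corpus[1:]):
    bigram_counts[current_word][next_word] += 1

def next_word_probabilities(word):
    """Return the empirical distribution over words that follow `word`."""
    counts = bigram_counts[word]
    total = sum(counts.values())
    return {w: c / total for w, c in counts.items()}

print(next_word_probabilities("rates"))   # {'rise': 0.5, 'fall': 0.5}
print(next_word_probabilities("prices"))  # {'fall': 0.5, 'rise': 0.5}
```

A model built this way simply echoes the patterns in its training text, which is why unfamiliar data or a misleading prompt can push it toward fluent but false output.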
But building large models comes at a cost. LLMs are generalists, which means they are trained on generic data that can be found across the internet, with no fact-checking of the source. These models may also fail when faced with unfamiliar data that were not included in training. And depending on how the user prompts the model, it may come up with answers that are simply not true.
Fixing hallucinations is a big focus of GAI providers seeking to boost confidence in and commercialization of the technology. For investment applications, we believe the key to solving the problem is to create specialist models that can improve the output. These smaller models are known as knowledge graphs (KGs), which are built on narrower, defined datasets. KGs use graph-based technology—a type of machine learning that improves a model’s ability to capture reliable relationships and patterns.
Specialist Applications Are More Consistent
Using a graph-based model structure and training an AI brain on a smaller yet more focused dataset helps confine the boundaries of responses. For an investment application, KGs can guide a model through the open-source LLM’s vast knowledge, for example, by determining the relevance of various terms that may or may not be related to technology and innovation (Display). While an LLM focuses on the statistical probability of a word appearing near another word, a graph-based model can become an increasingly intelligent subject-matter expert. It can be designed to understand causal relationships among concepts, words and phrases in areas such as macroeconomics, technology, finance and geopolitics that are more likely to have an impact on financial markets and securities.
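As a minimal sketch of the idea, a knowledge graph can be stored as typed edges between concepts, and a term’s relevance to a theme such as technology and innovation can be checked by walking those edges. The entities and relations below are hypothetical examples, not a description of any actual investment model.

```python
# Illustrative knowledge graph: a set of (subject, relation, object) edges.
knowledge_graph = {
    ("generative_ai", "is_a", "technology_and_innovation"),
    ("semiconductors", "enables", "generative_ai"),
    ("chip_export_controls", "restricts", "semiconductors"),
    ("poetry", "is_a", "literature"),
}

def related_to(entity, target, graph, seen=None):
    """Follow outgoing edges to decide whether `entity` connects to `target`."""
    seen = seen or set()
    if entity == target or entity in seen:
        return entity == target
    seen.add(entity)
    neighbors = [obj for subj, _, obj in graph if subj == entity]
    return any(related_to(n, target, graph, seen) for n in neighbors)

print(related_to("chip_export_controls", "technology_and_innovation", knowledge_graph))  # True
print(related_to("poetry", "technology_and_innovation", knowledge_graph))                # False
```

Because the edges carry explicit relationships rather than co-occurrence statistics, this kind of structure is what lets a specialist model reason about which concepts actually bear on markets and securities.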
KGs can improve accuracy because the relationships between the words and topics discussed are more precisely defined. Building an AI model that combines a KG’s specificity with an LLM’s breadth can deliver the best of both worlds for investors (Display), in our view.
The reliability of a KG-based AI model can also be improved by carefully choosing training data from verified sources of unbiased knowledge in an array of fields. Sources can include material from the International Monetary Fund, the World Trade Organization and global central banks. The KG must constantly evolve by capturing and digesting new information from the selected reliable sources.
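One simple way to picture that curation step, sketched below under illustrative assumptions (the source list and triple format are examples, not an actual pipeline), is to admit new facts into the KG only when they come from an approved source.

```python
# Illustrative sketch: only ingest new facts that come from vetted sources.
APPROVED_SOURCES = {"imf.org", "wto.org", "ecb.europa.eu", "federalreserve.gov"}

knowledge_graph = set()

def ingest_fact(subject, relation, obj, source_domain):
    """Add a (subject, relation, object) triple only if its source is approved."""
    if source_domain not in APPROVED_SOURCES:
        return False  # unverified source: keep it out of the graph
    knowledge_graph.add((subject, relation, obj))
    return True

ingest_fact("world_trade_volume", "tracked_by", "wto", "wto.org")        # accepted: vetted source
ingest_fact("world_trade_volume", "tracked_by", "rumor", "random-blog")  # rejected: unvetted source
```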
Quality Control and Fact-Checking
Still, carefully curating KG training materials won’t eliminate hallucinations. When building a model, AI professionals can also deploy other tools that, we believe, can lead to much more trustworthy outputs.
For example, knowledge-based reasoning can foster smarter prompting. This involves using KGs to generate prompts that are grounded in factual information and vetted sources. It provides the LLM with a more accurate and relevant starting point for generating its output.
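One rough way to picture knowledge-grounded prompting, sketched below with an assumed retrieval step and prompt format rather than any specific vendor API, is to pull the vetted facts attached to the entities in a question out of the KG and place them at the top of the prompt before it goes to the LLM.

```python
# Rough sketch of knowledge-grounded prompting: retrieve KG facts about the
# entities mentioned in a question and prepend them to the LLM prompt.
# Graph contents and prompt wording are illustrative assumptions.
knowledge_graph = {
    "semiconductors": ["semiconductors enable generative AI workloads",
                       "semiconductor demand is sensitive to export controls"],
    "inflation": ["central banks target inflation with policy rates"],
}

def build_grounded_prompt(question):
    """Attach vetted facts for any entity named in the question to the prompt."""
    facts = []
    for entity, entity_facts in knowledge_graph.items():
        if entity in question.lower():
            facts.extend(entity_facts)
    context = "\n".join(f"- {fact}" for fact in facts) or "- (no vetted facts found)"
    return (
        "Use only the vetted facts below when answering.\n"
        f"Facts:\n{context}\n\nQuestion: {question}"
    )

print(build_grounded_prompt("How could export controls affect semiconductors?"))
```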
A graph structure can also fact-check outcomes: a well-designed graph can quantify the reliability of an LLM’s output and warn systems and/or human administrators of potential hallucinations.
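As an illustrative sketch of that check, each claim in a draft answer can be compared against the graph’s edges to produce a simple support score and a list of unsupported statements. The claim format is assumed here, and in practice the claims would be extracted from the model’s text rather than hard-coded.

```python
# Illustrative sketch: score an LLM draft by checking its claims against KG edges.
knowledge_graph = {
    ("rate_hikes", "pressure", "bond_prices"),
    ("chip_export_controls", "restrict", "semiconductor_sales"),
}

def check_output(claims):
    """Return the supported fraction and the claims not found in the graph."""
    unsupported = [c for c in claims if c not in knowledge_graph]
    score = 1 - len(unsupported) / len(claims) if claims else 0.0
    return score, unsupported

# Claims would normally be extracted from the model's text; hard-coded for brevity.
draft_claims = [
    ("rate_hikes", "pressure", "bond_prices"),          # supported by the graph
    ("chip_export_controls", "boost", "poetry_sales"),  # unsupported: likely hallucination
]

score, unsupported = check_output(draft_claims)
print(f"Support: {score:.0%}")
for claim in unsupported:
    print("Possible hallucination:", claim)
```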
Why Does This Matter for Investors?
For investment models, an AI-driven system must be designed for clearly defined goals. That requires combining different models and technologies for the targeted output, such as identifying securities that may benefit or underperform because of an event that affects an industry. While nontechnical investors may not fully understand how the models work, it’s important to ask a portfolio manager who uses AI to explain in plain English how the models form a coherent architecture that supports the investment strategy.
With the right combination of techniques, we believe hallucinations can be dramatically curbed in an AI model. Greater accuracy is the cornerstone of an AI investing brain, which can be designed to search beyond traditional data sources and quantitative strategies for attractive investments across asset classes, unhindered by human behavioral biases.
In a world of increasing uncertainty and rapid change, investors seeking to capitalize on the AI revolution need conviction that the technology isn’t delusional. By combining LLMs with focused KGs and incorporating thorough risk-control mechanisms, systematic sanity checks can be applied to ensure that an AI-driven investing model is firmly rooted in a world of real opportunity.
The views expressed herein do not constitute research, investment advice or trade recommendations and do not necessarily represent the views of all AB portfolio-management teams. Views are subject to change over time.
© AllianceBernstein