For a technology that promises to help businesses cut costs, artificial intelligence has had one big problem: it is awfully costly itself.
AI’s scaling laws, which hold that more powerful models require more computing power, have set tech companies racing to spend billions of dollars building vast data centers and buying powerful chips — costs they can’t help passing on to their customers. Google’s AI tool for generating documents and emails for office workers is not cheap: it adds $20 per staff member to the $6 their employer already pays each month for the company’s Workspace suite. Microsoft Corp.’s Copilot AI assistant costs $30 a month per worker.
Meanwhile, deploying AI directly into a company’s systems can cost between $5 million and $20 million, according to research firm Gartner, which estimates that 30% of generative AI projects will be abandoned by the end of 2025, in part because of all that expense.
The good news for those customers is that AI costs appear to be coming down, helping to close the gap between benefit and investment. The bad news: That still doesn’t address the bigger issue of utility, which will take a few years yet to solve.
The prevailing wisdom in Silicon Valley is to keep spending to get a foothold on the future. On Tuesday, Microsoft announced that its capital expenditures hit a record $19 billion in the last quarter, more than 80% higher than a year earlier. Chief Executive Officer Satya Nadella said all that investment would continue to “capture the opportunity of AI.” Alphabet Inc. CEO Sundar Pichai said much the same on a recent earnings call about Google’s results: “The risk of underinvesting is dramatically greater than the risk of overinvesting for us.” Investors aren’t entirely buying it: Microsoft’s shares are down about 2% since its latest earnings announcement, Alphabet’s by 5%.
But even as the cost of training AI has risen over the years, as shown in the chart above, both tech giants’ AI services seem to be heading in a cheaper direction. A spokesman for Google says the company’s latest Gemini model — which companies can use to automate their customer service operations or summarize internal documents — is more powerful than its predecessor, but close to half the price.¹
OpenAI’s latest model, known as GPT-4o, is faster but also 50% cheaper than its predecessor, GPT-4 Turbo. A spokeswoman told me the cost of accessing its models, which is priced by tokens processed (tokens being, roughly, the word fragments its language models read and write), has dropped by 99% since 2022. “We’re committed to continuing this trajectory,” she added.
Among AI scientists, cutting costs with techniques like “sparsity” and “quantization,” which trim a model’s parameters or reduce the numerical precision of its calculations so it is cheaper to run, has been a major focus at recent conferences. Neerav Kingsland, an executive at OpenAI rival Anthropic, told me it was plausible that its models could fall to a quarter of their current price over the next one to two years, and that the company — which has raised $8.8 billion from investors including Google and Amazon.com Inc. — had already halved the cost of building a recent model through novel research methods.
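For a rough sense of what quantization means in practice — a toy sketch, not how OpenAI, Anthropic or any other lab actually builds its models — storing model weights as 8-bit integers instead of 32-bit floating-point numbers cuts their memory footprint by three quarters, at the cost of a small approximation error:

```python
# Toy illustration of 8-bit quantization; the random array stands in for model weights.
import numpy as np

rng = np.random.default_rng(0)
weights = rng.standard_normal(1_000_000).astype(np.float32)  # stand-in "model" weights

# Symmetric quantization: rescale float32 values into the int8 range [-127, 127].
scale = np.abs(weights).max() / 127.0
quantized = np.round(weights / scale).astype(np.int8)

print(f"{weights.nbytes / 1e6:.1f} MB -> {quantized.nbytes / 1e6:.1f} MB")  # 4.0 MB -> 1.0 MB

# Converting back shows the trade-off: a small loss of precision.
restored = quantized.astype(np.float32) * scale
print("mean absolute error:", float(np.abs(weights - restored).mean()))
```

Smaller, lower-precision numbers mean less memory and cheaper hardware for every query served, which is why such techniques translate directly into lower prices.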
There are other signs that costs are falling. In China, AI companies have been engaged in a price war that has driven down the cost of using generative AI, thanks in part to a laxer regulatory environment, lower labor costs and government subsidies. One AI startup there, DeepSeek, charges business users $0.14 per million tokens, while a comparable model from OpenAI costs $10.
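To put that gap in concrete terms, here is the back-of-the-envelope arithmetic at the list prices quoted above; the monthly token volume is a made-up workload for illustration, not a figure from either company:

```python
# Rough arithmetic on the per-million-token prices quoted above.
deepseek_price = 0.14  # dollars per million tokens (DeepSeek, business users)
openai_price = 10.00   # dollars per million tokens (a comparable OpenAI model)

monthly_tokens_millions = 500  # hypothetical workload: 500 million tokens a month

print("DeepSeek:", deepseek_price * monthly_tokens_millions)  # $70 a month
print("OpenAI:  ", openai_price * monthly_tokens_millions)    # $5,000 a month
```

At those rates, the same hypothetical workload runs roughly 70 times cheaper on DeepSeek’s model.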
And cost efficiencies are coming from business users too. Many are realizing they don’t need the most powerful AI to give their staff a productivity edge, so they’re experimenting with open-weight models from companies like Meta Platforms Inc., or with smaller models that are cheaper, even if slower. A customer service chatbot might need a state-of-the-art AI tool that can do real-time inference, but analyzing customer calls to improve them? That can be done with less-advanced technology.
Man Group Plc, a global asset management firm, tells me that the models it’s using to summarize text for its portfolio managers, or to shrink a day’s work to 30 minutes for other staff, are indeed coming down in cost. But there are lingering questions about how sustainable that price trajectory will be.
Silicon Valley has a history of subsidizing prices, with streaming platforms, ride-sharing apps and cloud services all taking a margin hit to grow market share. The goal is to ride out the competition and eventually raise prices to become profitable. But here’s the sticking point for generative AI: There’s still a broad question of how useful it can be to a business’s bottom line, which Gartner says is the main reason it predicts 30% of projects will be abandoned by the end of next year.
If the technology remains stuck at dispensing chatbots and summarizing text, it might not be worth even the lower price tags. That, perhaps more than cost, is the issue tech companies should be grappling with.
1. The latest Gemini model costs $0.35 per million tokens to access, down from $0.50.