The popularity of ChatGPT and generative AI is on the rise, but the expenses associated with them can be exorbitant.




SUMMARY
1. Developing and maintaining generative AI software can be extremely costly.
2. Nvidia dominates the GPU market for the AI industry, with its primary data center chip priced at $10,000.
3. Analysts and technologists estimate that training a large language model like GPT-3 can cost more than $4 million.


Before OpenAI's ChatGPT gained widespread recognition for its ability to generate compelling sentences, a small startup named Latitude was impressing users with AI Dungeon, a game that used artificial intelligence to let players build fantastical stories from their own prompts. But as AI Dungeon grew more popular, Latitude CEO Nick Walton watched the cost of keeping the game running climb sharply. The game's text generation relied on OpenAI's GPT language software, which became more expensive to use as more people played, leaving Latitude with an ever-larger bill.

To compound matters, Walton found out that content marketers were using AI Dungeon to create promotional copy, a use for the game that his team had not anticipated but one that contributed to the startup's AI bill. At its height in 2021, Latitude was spending nearly $200,000 monthly on OpenAI's generative AI software and Amazon Web Services to process millions of user queries daily.

By the end of 2021, Walton said that Latitude had switched from using OpenAI's GPT software to a cheaper but equally capable language software provided by startup AI21 Labs. Additionally, the startup incorporated open-source and free language models into its service to reduce costs. Walton said that Latitude's generative AI bills had dropped to less than $100,000 per month, and the startup charged users a monthly subscription fee for advanced AI features to help lower the cost.

The high cost of developing and maintaining generative AI software is troubling for firms that create foundation models and those that use AI to power their software. This is particularly true as venture capitalists look to invest in companies that could be worth trillions, and larger companies such as Microsoft, Meta, and Google use their considerable capital to gain a lead in the technology that smaller challengers cannot match.

The high cost of machine learning, however, is a structural expense that sets this boom apart from previous waves of computing. Even after the software is built, running a large language model takes significant computing power because the model performs billions of calculations every time it returns a response to a prompt. Serving web apps or pages, by comparison, requires far fewer calculations.
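
To put that in rough perspective, here is an illustrative sketch using the common rule of thumb that a transformer's forward pass costs about two floating-point operations per parameter per generated token. The parameter count and response length below are assumptions for illustration, not figures from this article.

# Back-of-envelope sketch: compute per generated response for a large model.
# Rule of thumb: ~2 FLOPs per parameter per token for the forward pass.
PARAMS = 175e9               # assumed GPT-3-scale model, 175B parameters
TOKENS_PER_RESPONSE = 500    # assumed length of a typical generated response

flops_per_token = 2 * PARAMS
flops_per_response = flops_per_token * TOKENS_PER_RESPONSE

print(f"~{flops_per_token:.1e} FLOPs per generated token")
print(f"~{flops_per_response:.1e} FLOPs per response")
# A typical web request, by contrast, involves orders of magnitude less arithmetic.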

These calculations also require specialized hardware. Most training and inference now takes place on graphics processing units (GPUs), which were originally designed for 3D gaming but have become the standard for AI workloads because they can perform many simple calculations simultaneously. Nvidia dominates the GPU market for the AI industry, and its primary data center chip costs $10,000. Scientists who build these models often joke that they "melt GPUs."

If the margin for AI applications remains permanently smaller than that of previous software-as-a-service models due to the high cost of computing, it could dampen the current boom.

The Costly and Cautious Process of Training Large Language Models

Rowan Curran, a Forrester analyst who covers AI and machine learning, estimates that training a large language model like OpenAI's GPT-3 could cost more than $4 million, with more sophisticated models running into the "high single-digit millions."

Last month, Meta unveiled its largest LLaMA model, which was trained on 1.4 trillion tokens using 2,048 Nvidia A100 GPUs over 21 days. The company disclosed that training consumed approximately 1 million GPU hours, which would cost more than $2.4 million at AWS dedicated pricing. At 65 billion parameters, the model is still smaller than OpenAI's latest GPT models, such as GPT-3, which has 175 billion parameters.
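
The arithmetic behind that figure is easy to reproduce. In the sketch below, the GPU count and training duration come from Meta's disclosure, while the per-GPU-hour price is an assumption chosen to land near the article's $2.4 million figure; actual cloud pricing varies by provider and contract.

# Reproducing the LLaMA training-cost arithmetic.
gpus = 2048               # Nvidia A100s used for training (per Meta's disclosure)
days = 21                 # reported training duration
usd_per_gpu_hour = 2.40   # assumed effective A100 rate; real pricing varies

gpu_hours = gpus * days * 24
cost = gpu_hours * usd_per_gpu_hour

print(f"GPU-hours: {gpu_hours:,}")                  # ~1.03 million, matching the ~1M figure
print(f"Estimated training cost: ${cost:,.0f}")     # ~$2.5 million under these assumptions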

According to Clement Delangue, the CEO of AI startup Hugging Face, training the company's Bloom large language model was a lengthy process that took more than two and a half months. Delangue revealed that the process required access to a supercomputer equivalent to around 500 GPUs. He also emphasized the need for caution when retraining large language models, as the process can be expensive and resource-intensive, making it essential to ensure that any retraining efforts are targeted and efficient.

Delangue emphasized that large language models are not retrained continuously, which is why models like ChatGPT lack knowledge of events after a certain cutoff, such as 2021. He added that Hugging Face is currently retraining version two of Bloom, which is expected to cost no more than $10 million, and noted that retraining on that scale cannot be done often, given the time and resources involved.


The Cost of Inference for Large Language Models: Challenges and Considerations for Startups and Enterprises

Engineers use an " inference " process to make predictions or generate text using a trained machine learning model. This process can be much more expensive than training because it may need to run millions of times for popular products. OpenAI estimates that it could have cost $40 million to process the millions of prompts people fed into the ChatGPT software in January, as the platform had reached 100 million monthly active users. For tools used billions of times a day, costs can skyrocket. For example, financial analysts estimate that Microsoft's Bing AI chatbot needs at least $4 billion of infrastructure to serve responses to all Bing users.

Startups specializing in generative AI are attracting billions of dollars from venture capitalists and tech giants; Salesforce Ventures, for instance, debuted a $250 million fund for generative AI startups. Even so, some entrepreneurs worry about relying on subsidized AI models that they do not control and that they pay for on a per-use basis. Some companies are exploring how to run language models through Microsoft's Azure cloud service, which is currently offered at a discounted price.

How the Cost of AI Computing Could Change

The cost of AI computation in the future remains uncertain as the industry evolves. Various companies, including those creating foundational models, semiconductor makers, and startups, are exploring ways to reduce the price of running AI software.

Although Nvidia currently dominates the AI chip market with an estimated 95% share, the pace of improvement in individual chips has slowed in recent years. However, Jensen Huang, Nvidia's CEO, predicts that AI will become "a million times" more efficient within the next decade thanks to advances in chips, software, and other computer components.

Startups also see the high cost of AI as an opportunity for innovation. D-Matrix, for instance, is developing a system that performs more of the processing in a computer's memory rather than on a GPU, aiming to cut the cost of inference. The approach can help when workloads spike suddenly, since such surges can demand more GPU capacity than is available for large-scale computation.

According to Hugging Face CEO Delangue, companies would benefit more from focusing on smaller, specialized models that are cheaper to train and run than on the larger language models currently receiving most of the attention.

OpenAI, meanwhile, recently announced that it has cut the price companies pay to use its GPT models: it now charges one-fifth of one cent for roughly 750 words of output. The price cut has caught the attention of Latitude, the maker of AI Dungeon, which said it is continually assessing how to deliver the best experience to users and will evaluate all available AI models to make sure it has the best product.
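
At that rate, a rough sketch of a product's monthly API bill looks like the following. The request volume and average output length are hypothetical; the price corresponds to the roughly one-fifth-of-a-cent-per-750-words figure above (about $0.002 per 1,000 tokens, assuming ~750 words per 1,000 tokens).

# Translating per-token pricing into a hypothetical monthly bill.
price_per_1k_tokens_usd = 0.002   # ~one-fifth of a cent for ~750 words of output
requests_per_month = 5_000_000    # assumed request volume
tokens_per_request = 1_000        # assumed average output length

monthly_cost = requests_per_month * (tokens_per_request / 1000) * price_per_1k_tokens_usd
print(f"Estimated monthly API bill: ${monthly_cost:,.0f}")   # $10,000 under these assumptions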
