One of my friends is looking at an investment in Anthropic. He asked me what I thought – are the LLMs going to get commoditized? I don’t have THE answer, but here are a few thoughts:
- There are a lot of data points saying LLMs are getting more expensive to train. Witness the huge rounds of funding Anthropic and OpenAI are raising, mostly for GPUs for model training.
- Dario Amodei, CEO of Anthropic, has said it costs around $100M to train a model today, that in a few years it will cost $1B, and that in 5 years it could cost up to $10B.
- But… every day, a new LLM seems to come to market. If you look at the leaderboards, you'll see decent to best-in-class LLMs from OpenAI, Anthropic, Meta, Google, Mistral, Cohere, X, Cohesity, Inflection, and Databricks. Amazon is now working on one too.
- Our experience at Bito, using a number of the LLMs mentioned above, is that there are significant differences on complicated use cases. For example, if you need structured output with complex reasoning, GPT-4 and Claude 3 far exceed the capabilities of models such as Llama 2 and related models like Mistral (see the sketch after this list). If you just want a paragraph about Paris, the models are all similar.
- Interestingly, most of these model companies are not that big in terms of headcount. Mistral launched its first model with a team of about 10 people. Anthropic has said Claude 3 was trained by 60 people.
- I was talking to a top-5 exec at a top-5 cloud company, and he said he is training a 500B-parameter model for $3M. Most of the money goes into failed training runs, where you find out after 4 weeks that you made a mistake; the successful run itself takes 2 months.
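To make the "structured output with complex reasoning" point concrete, here is a minimal sketch of the kind of call I mean. The `call_llm` helper and the code-review schema are hypothetical placeholders, not Bito's actual prompts or pipeline; the parse-and-validate step at the end is where stronger and weaker models tend to diverge.

```python
import json

# Hypothetical helper: swap in whichever provider SDK you actually use
# (OpenAI, Anthropic, etc.). Here it returns a canned response so the
# sketch runs end to end.
def call_llm(prompt: str) -> str:
    return ('{"severity": "high", '
            '"reasoning": "The original loop iterates one past the end of the list.", '
            '"fix": "Use range(len(items))."}')

PROMPT = """Review the code change below and respond ONLY with JSON in this shape:
{"severity": "<low|medium|high>", "reasoning": "<one sentence>", "fix": "<suggested fix>"}

Diff:
- for i in range(len(items) + 1):
+ for i in range(len(items)):
"""

raw = call_llm(PROMPT)

# Weaker models often wrap the JSON in extra prose or drop required keys;
# this parse-and-validate step is where the quality gap shows up.
try:
    result = json.loads(raw)
    missing = {"severity", "reasoning", "fix"} - set(result)
    if missing:
        raise ValueError(f"missing keys: {missing}")
    print(result["severity"], "-", result["fix"])
except (json.JSONDecodeError, ValueError) as err:
    print("Structured output failed, falling back:", err)
```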
So, is a lot of capital going to be required to build models? Or is it mostly about the people?
If it’s capital, then we are probably headed for a world with only a few LLMs that matter. And even if there are just a few, and they are similar, they will form an oligopoly and capture a lot of the profits. As an example, the three major cloud vendors (AWS, Azure, and GCP) are similar in capabilities, but have cornered the market.
If it’s cheap to train an LLM and the gating factor is people, then we will probably end up with many LLMs. People with training experience are relatively scarce right now, but that will change as the field grows and more and more people learn how to train a model. There are probably 10x as many people who have trained a model today as there were 18 months ago.
One other point I’ll make is around data. If your LLM truly leverages a data set that is hard to get or not readily available, you could be the LLM for that vertical. Hippocratic AI, for example, is focused on healthcare and is doing some unique work there leveraging actual patient voice conversations and encryption. That’s pretty different from OpenAI.
So, is Anthropic a good investment? My gut tells me that the cost of training models has been going up, and will continue to go up. So capital will become a more defining factor. I think we will see a few big LLMs over time, so it seems reasonable that Anthropic could be a winner. Is the $18B valuation a reasonable price… well, that’s a different question for a different post!