Disillusion in AI

Speed vs. Moat, Obsolescence from Advancements, and Other Thoughts

Mark

Speed vs. Moat

Incumbent AI startups with millions of dollars in VC funding will struggle in the coming quarters. Rifts will form between co-founders, board members, and investors as the speed-vs.-moat debate brews under the surface. Many of these companies will experience attrition as employees' patience runs out.

Some of the most impressive startups have been slow to release public betas. Their demos look impressive, but it's impossible to properly evaluate their capabilities without hands-on use, and the lack of a public demo also means less publicity. These startups are often the same ones building and training their own proprietary models, all in the hope of avoiding dependence on the likes of OpenAI.

Considering how OpenAI's stance shifted between GPT-2 and GPT-3, that proactiveness seems prudent. In a way, proprietary, fine-tuned models can eventually become a moat. The counterargument is that these efforts are fruitless given the progress the likes of OpenAI have made: the lead time, training volume, and established brand name make them difficult to surpass. GPT-3 is already a two-year-old technology into which OpenAI sank an estimated $4.6M in training compute alone. Even as the influx of users eats into OpenAI's funding each day, the publicity around ChatGPT has made the company's platform instantly recognizable, not to mention the millions of users providing feedback on the model's responses. Furthermore, OpenAI continues to release new features that make similar features from standalone applications obsolete.

Weighing these factors, creating a new foundation model, or a fork of an existing one, purely for the sake of a moat looks ineffective. I would argue the time spent on new models is better used building a better product on top of existing ones, making minor adjustments that don't require excessive fine-tuning. Using widely available models like GPT-3 or Jurassic-1, startups should iterate quickly on their ideas atop technologies developed with hundreds of millions of dollars of venture capital funding.

The additional benefit is that the models continue to be improved by their developers. It's reminiscent of the cloud, specifically AWS: hosted AI models are the equivalent of AWS's managed services, and end users don't have to worry about the infrastructure or hardware powering them. By amortizing the cost of the model and infrastructure across millions of users, OpenAI provides access to its flagship models at a fraction of what it would cost anyone to train one from scratch.
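
The amortization argument can be made concrete with a back-of-envelope calculation. The training figure below is the reported GPT-3 compute cost mentioned above; the user base and recovery window are purely hypothetical assumptions for illustration.

```python
# Back-of-envelope: amortizing a one-time training cost across API users.
# Only the training cost comes from the text; the rest are assumed numbers.

TRAINING_COST_USD = 4_600_000      # reported GPT-3 training compute cost
MONTHLY_ACTIVE_USERS = 1_000_000   # hypothetical user base
AMORTIZATION_MONTHS = 24           # hypothetical recovery window

cost_per_user_month = TRAINING_COST_USD / (MONTHLY_ACTIVE_USERS * AMORTIZATION_MONTHS)
print(f"Training cost amortized per user-month: ${cost_per_user_month:.4f}")
```

Even under these rough assumptions, the one-time training cost works out to cents per user per month, which is the core of the AWS analogy: no single customer could justify the upfront spend, but spread across a large user base it becomes a rounding error.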

Scattered Thoughts

Obsolescence from Advancements in Computing Power: As computing power progresses rapidly, innovations designed to work around compute constraints become obsolete. For example, baking human understanding into a system was once considered the holy grail of AI, but advances in computing power made general methods like search and learning far more powerful. Today, many state-of-the-art AI systems for computer chess and poker use some form of search and deep learning. This is all thanks to the falling cost and time needed to train AI models. According to Stanford, the cost to train image-classification systems and their training times have decreased by 63.6% and 94.4%, respectively, since 2018, with similar trends observed for tasks like recommendation and language processing. The race is on to build a new system that will achieve today's holy grail of AGI, and it remains to be seen which approach will win out. [Further Reading: 2022 AI Index Report]

Fun AI Stats: The fastest AI systems in 2021 used roughly 9x as many accelerators (e.g., GPUs or TPUs) as the average across all systems in 2018. Does this mean the titans of the AI space are far ahead of the rest, or that blazing-fast AI systems aren't a necessity for success?

Black Box of AI: LLMs continue to be black boxes. Researchers are often surprised at how minor tweaks like chain-of-thought prompting yield significant accuracy gains, or at how a model's performance improves rapidly (so-called "phase transitions") after a certain threshold of training. Diving into the first of these, chain-of-thought prompting means prompting the model in a way that gets it to spell out its reasoning before arriving at an answer. This minor change can significantly boost an LLM's capability without any fine-tuning, and it shows how hard it may be to predict a model's performance without a trial-and-error approach. [Further Reading: Chain of Thought Prompting Elicits Reasoning in Large Language Models]
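
As a minimal sketch of what this tweak looks like in practice, here is the same question framed two ways. The worked example in the chain-of-thought version (the tennis-ball problem from the paper cited above) nudges the model to show its reasoning before answering; the model call itself is omitted, these are just the prompts.

```python
# Chain-of-thought prompting sketch: a direct prompt vs. one that includes a
# worked example whose answer walks through the reasoning step by step.

question = "A cafeteria had 23 apples. It used 20 and bought 6 more. How many now?"

direct_prompt = f"Q: {question}\nA:"

cot_prompt = (
    "Q: Roger has 5 tennis balls. He buys 2 more cans of 3 tennis balls each. "
    "How many tennis balls does he have now?\n"
    "A: Roger started with 5 balls. 2 cans of 3 tennis balls is 6 balls. "
    "5 + 6 = 11. The answer is 11.\n"
    f"Q: {question}\nA:"
)
```

Nothing about the model changes between the two prompts; only the few extra lines of demonstrated reasoning do the work, which is exactly why the resulting accuracy jump surprised researchers.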

Black Box of AI #2: In science broadly, emergence can be defined as "a qualitative change that arises from quantitative changes." For LLMs, emergent abilities are ones absent in smaller models that appear in larger ones. That may seem obvious; what is less obvious is the concept of phase transitions, which occur when a model's performance improves sharply after a certain threshold of training, specifically its test accuracy, i.e., how well it performs on data it hasn't seen before. One instance of this phenomenon is AlphaZero showing a marked jump in its grasp of chess concepts after around 32,000 training steps. [Further Reading: Future ML Systems Will Be Qualitatively Different]
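
The shape of a phase transition is easy to visualize with a toy curve: test accuracy sits near chance for most of training, then jumps sharply past a threshold. The numbers below are purely synthetic (a logistic curve centered on the 32,000-step figure from the AlphaZero example), not anyone's actual learning curve.

```python
import math

# Toy "phase transition": accuracy stays low before the threshold step count,
# then rises sharply. Synthetic illustration only; not real training data.

def toy_test_accuracy(step, threshold=32_000, sharpness=1 / 2_000):
    """Logistic curve centered on `threshold`: near 0 before, near 1 after."""
    return 1.0 / (1.0 + math.exp(-(step - threshold) * sharpness))

for step in (10_000, 30_000, 32_000, 34_000, 50_000):
    print(f"step {step:>6}: accuracy ~ {toy_test_accuracy(step):.3f}")
```

The point of the toy curve is the forecasting problem it creates: sampling performance anywhere before the threshold gives almost no hint that the jump is coming, which is what makes these transitions hard to predict from extrapolation alone.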

What the Black Box Means for Startups: I would argue that application-focused AI startups should stay current on the latest research, ensuring new techniques are implemented properly and quickly. Until we firmly grasp how LLMs with billions (or even trillions) of parameters actually work and can reliably predict their performance, rapid iteration seems to be the key to succeeding. I suspect this is why firms like Stability AI can succeed with what they claim is just one Ph.D. on the team. Beyond the fact that being open source enables community-driven development, the Cambrian explosion in AI means new papers come out every day, and a Ph.D. isn't needed to read a paper and implement its techniques.

Marriage of AI and Big Tech: Microsoft tied the knot with OpenAI through its $1B investment back in 2019. Amazon has courted AI21 Labs (by offering the company's models on AWS) and Stability AI (by becoming its preferred cloud provider). Apple has hinted at its interest by optimizing its silicon to run Stable Diffusion efficiently, though it has yet to move toward an official partnership; perhaps it aims to derive services revenue from this kind of optimization rather than hardware sales. It'd be worthwhile to dig into what other resources AI startups covet in order to predict the next big partnership announcement. Distribution? Devices? Developer mindshare?
