Google has unveiled what it calls a new “reasoning” AI model, although it is still in the experimental stage and, based on our quick testing, there is definitely room for improvement.
The new model, which is called Gemini 2.0 Flash Thinking Experimental, can be found in Google’s AI prototyping tool, AI Studio. According to a model card, it is the “best for multimodal understanding, reasoning and coding” and can “reason over the most complex problems” in domains including physics, mathematics, and computer science.
Logan Kilpatrick, who leads product for AI Studio, called the model “the first step in [Google’s] reasoning journey” in a post on X. In a separate post, Google DeepMind principal scientist Jeff Dean stated that Gemini 2.0 Flash Thinking Experimental has been “trained to use thoughts to strengthen its reasoning.”
In response to a question, Dean stated, “We see promising results when we increase inference time computation.” Inference-time computation refers to the amount of processing power used to “run” the model when it is answering a prompt.
Just when you thought it was over… we’re introducing Gemini 2.0 Flash Thinking, a new experimental model that unlocks stronger reasoning capabilities and shows its thoughts.
The model plans (with thoughts visible), can solve complex problems with Flash speeds, and more 🧵
— Logan Kilpatrick (@OfficialLoganK) December 19, 2024
Gemini 2.0 Flash Thinking Experimental, which is based on Google’s newly revealed Gemini 2.0 Flash model, appears to be designed similarly to OpenAI’s o1 and other so-called reasoning models. Unlike most AI, reasoning models effectively check their own facts, which helps them avoid some of the common mistakes that trip up other models.
The drawback is that reasoning models typically take longer to solve problems, sometimes seconds or even minutes longer.
Rise of Reasoning Models in AI
Given a prompt, Gemini 2.0 Flash Thinking Experimental pauses before responding, considering a number of related prompts and “explaining” its reasoning as it goes. After a while, the model summarises what it judges to be the best answer.
In our quick test, the result matched expectations. For instance, when asked, “How many vowels are in the word pineapple?” Gemini 2.0 Flash Thinking Experimental correctly responded with “four.”
Your results may vary.
Since the release of o1, AI labs beyond Google have rapidly introduced reasoning models of their own. DeepSeek, an AI research firm backed by quantitative traders, unveiled a preview of its first reasoning model, DeepSeek-R1, in early November. Later that month, Alibaba’s Qwen team introduced what it called the first “open” competitor to o1.
In October, Bloomberg reported that several Google teams were developing reasoning models. In November, The Information revealed that the company had over 200 people dedicated to researching the technology.
What spurred the development of reasoning models? One factor is the search for new ways to refine generative AI. Traditional “brute force” methods of scaling up models are reportedly not as effective as they once were.
While reasoning models have garnered significant support, they are not universally seen as the solution. A key concern is their high cost, driven by the enormous processing power required. Although reasoning models have performed well on benchmarks to date, it remains uncertain whether they can sustain such performance in the future.