The OpenAI o1 model takes longer than ChatGPT to respond to queries, but shows promise in tackling some of the world’s most pressing problems
OpenAI has launched the much-anticipated OpenAI o1 model (nicknamed Strawberry), which can think through problems and break them down step by step, reasoning much as a PhD student would.
Its areas of focus are science, maths and coding, and video testimonials on its website show a genetics expert, a quantum physicist, an economist and a programmer test-driving the new function with varying results. Mario Krenn, research lead at the Max Planck Institute for the Science of Light, explains in the video, “I asked a question about a certain quantum operator which I know previous models like GPT-4 would fail in this task. In contrast to GPT-4, this one gives me very detailed mathematics [...] I think it tries to do something incredibly difficult.”
OpenAI tested the new model by having it sit a qualifying exam for the International Mathematics Olympiad: “GPT-4 correctly solved only 13 percent of problems, while the reasoning model scored 83 percent. Their coding abilities were evaluated in contests and reached the 89th percentile in Codeforces competitions.”
Here’s everything you need to know about OpenAI o1 so far.
There are some caveats
OpenAI o1 is more expensive than ChatGPT Plus, so don’t go asking OpenAI o1 for gifting ideas or to clean up a messy email thread. For now, only paying users of ChatGPT Plus and Teams can access the new programme, capped at 30 prompts a week.
OpenAI o1 also takes slightly longer to answer prompts because, “like a human”, it’s analysing the question and working out the answer.
Experts suggest making prompts as detailed as possible and reserving the programme for queries that earlier models would struggle with. For now, o1 cannot browse the web for information, nor can documents or images be uploaded to it.
Tech platform Tom’s Guide tried five detailed prompts and received complex answers that broke the model’s reasoning down step by step, with explanations for each section. The results are impressive (although, given my lack of knowledge about terraforming Mars, I can’t attest to the validity of the response).