OpenAI GPT score on Humanity’s Last Exam by June 30? - AI Odds Analysis
Outcomes | Market Price | AI Fair Value | Value Edge
40%+ (Yes / No)
50%+ (Yes / No)
AI Insights:
Updated 03.06 11:23
Fair Value Reasoning:
Google Gemini 3 Pro's score of 38.3% has fundamentally shifted expectations. For OpenAI, clearing the 35% threshold (an improvement from ~29.9%) now looks trivial, and the 40% threshold, which sits only 1.7 percentage points above Gemini's score, is highly accessible. Given OpenAI's history of rivaling Google and the roughly four-month window remaining, the '40%+' option is extremely likely, justifying a significant fair value (FV) increase to 85c. The '50%+' option, however, demands a leap of nearly 12 percentage points over the current SOTA. On a benchmark like Humanity's Last Exam, where each additional point is assumed to be much harder to gain than the last, this represents a generational capability gap. Despite recent market mania, the rational fair value remains low (22c), because a leap of that size within four months is unlikely.
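The arithmetic behind this reasoning can be checked with a short Python sketch. The scores, thresholds, and fair values are taken from the text above. The market prices are an assumption: they are the last prices quoted in the Movers section (64c for '40%+' on Feb 24 and 21c for '50%+' on March 4), which may not match the price at the time of the update. The helper names `gap_points` and `edge_cents` are illustrative only.

```python
# Scores quoted in the reasoning above (percent on Humanity's Last Exam).
GEMINI_3_PRO_SCORE = 38.3  # current SOTA cited in the text
OPENAI_SCORE = 29.9        # approximate OpenAI score cited in the text


def gap_points(threshold, score):
    """Percentage-point gap between a threshold and a score."""
    return round(threshold - score, 1)


def edge_cents(fair_value, market_price):
    """Edge in cents; positive means the market prices the option below fair value."""
    return round(fair_value - market_price, 1)


gaps = {
    "40%+ vs Gemini": gap_points(40, GEMINI_3_PRO_SCORE),
    "50%+ vs Gemini": gap_points(50, GEMINI_3_PRO_SCORE),
    "40%+ vs OpenAI": gap_points(40, OPENAI_SCORE),
    "50%+ vs OpenAI": gap_points(50, OPENAI_SCORE),
}

# Fair values are from the text; market prices are ASSUMED from the Movers section.
edges = {
    "40%+": edge_cents(fair_value=85, market_price=64),
    "50%+": edge_cents(fair_value=22, market_price=21),
}

for name, gap in gaps.items():
    print(f"{name}: {gap} points")
for name, edge in edges.items():
    print(f"{name} edge: {edge}c")
```

Under these assumed prices, the '40%+' option carries most of the edge, while the '50%+' option trades close to its stated fair value.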
Exotics
'Humanity's Last Exam' (HLE) is a relatively new, niche AI benchmark designed to measure AI performance on extremely hard tasks. While AI performance prediction is a hot topic, this market is more specific and novel than predictions on general benchmarks like GSM8K or MMLU, making it moderately exotic.
Movers
March 2, 2026 - March 4, 2026: the '50%+' option crashed from 67.5c to 21c. This move represents a severe 'return to reality.' The initial speculation driven by Gemini 3 Pro's high score, a bet on an imminent OpenAI 'singularity,' collapsed as traders recalculated how steeply the difficulty rises between 38% and 50%, and the price reverted to a rational range.
Feb 21, 2026 - Feb 24, 2026: the '40%+' option saw a V-shaped recovery (64.5c -> 55c -> 64c). The market initially feared OpenAI was lagging after Gemini 3 Pro's release, but quickly realized that a competitor's high score validated the technical feasibility of such scores, which increased the probability of OpenAI breaking 40%.
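To put the two moves on a common scale, the following Python sketch computes the absolute change (in cents) and the relative change (in percent) for each leg, using only the prices quoted above. The `move` helper is a hypothetical name introduced for illustration.

```python
def move(start, end):
    """Return (absolute change in cents, relative change in percent)."""
    change = end - start
    return round(change, 1), round(100 * change / start, 1)


crash_50 = move(67.5, 21.0)    # '50%+' option, March 2 to March 4, 2026
dip_40 = move(64.5, 55.0)      # '40%+' option, first leg of the V
rebound_40 = move(55.0, 64.0)  # '40%+' option, second leg of the V

print("50%+ crash:  ", crash_50)
print("40%+ dip:    ", dip_40)
print("40%+ rebound:", rebound_40)
```

The '50%+' crash erased roughly two thirds of the option's price, whereas the '40%+' dip was far smaller and almost fully reversed within the same window.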