top | item 46166072

(no title)

iamjackg | 2 months ago

Curious how this will fare when playing Pokemon Red.


minimaxir|2 months ago

Gemini 3 Pro has been playing Pokemon Crystal (which is significantly harder than Red) in a race against Gemini 2.5 Pro: https://www.twitch.tv/gemini_plays_pokemon

Gemini 3 Pro has been making steady progress (12/16 badges) while Gemini 2.5 Pro is stuck (3/16 badges) despite using double the turns and tokens.

theLiminator|2 months ago

I think what would be interesting is if it could play the game with vision-only inputs. That would represent a massive leap in multimodal understanding.

euvin|2 months ago

Yeah, the "High frame rate understanding" feature caught my eye; actual real-time analysis of live video feeds seems really cool. Also wondering what they mean by "video reasoning/thinking"?

skybrian|2 months ago

I don’t think it’s real time? The videos were likely taken previously.

danso|2 months ago

> 3. Turning long videos into action: Gemini 3 Pro bridges the gap between video and code. It can extract knowledge from long-form content and immediately translate it into functioning apps or structured code

I'm curious how close these models are to achieving that long-ago, much-mocked claim (by Microsoft, I think?) that AIs could view gameplay video of long-lost games and produce the code to emulate them.