In a groundbreaking demonstration in Google AI Studio, Google showed off the capabilities of its newest model, Gemini 1.5 Pro, which features an experimental function known as “long context understanding”. The feature was showcased in a screen recording using a 44-minute Buster Keaton film, which translates to about 600,000 tokens of data.
The demo involved uploading the film to Google AI Studio and issuing a complex prompt: find the exact moment a piece of paper is taken from someone’s pocket, then report the key details written on it along with the timecode. The screen capture is sped up, but it displays the actual processing time for each prompt, with the caveat that processing times may vary.
Gemini 1.5 Pro responded with remarkable accuracy, pinpointing the timecode at 12:01 and identifying the paper as a pawn ticket from Goldman & Co Pawn Brokers, including the date and cost. Checking the film at that timecode confirmed that the model had located the right scene and extracted the text accurately.
The demonstration also explored the model’s multimodal capabilities by presenting a simple drawing of a scene and querying the corresponding timecode. The model successfully returned the correct timecode of 15:34, showcasing its ability to interpret and match abstract visual details to specific moments in the video content.
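To check answers like these against the film, it helps to turn a timecode into a seek position in seconds. The snippet below is a minimal sketch, not part of the demo: the helper names `timecode_to_seconds` and `within_tolerance` are hypothetical, and the two-second tolerance is an assumption chosen only for illustration.

```python
def timecode_to_seconds(timecode: str) -> int:
    """Convert an 'MM:SS' or 'HH:MM:SS' timecode into seconds."""
    seconds = 0
    for part in timecode.strip().split(":"):
        # Each field is worth 60x the one to its right.
        seconds = seconds * 60 + int(part)
    return seconds


def within_tolerance(reported: str, expected: str, tolerance_s: int = 2) -> bool:
    """True if two timecodes are at most tolerance_s seconds apart.

    The default tolerance is an assumption for this sketch.
    """
    gap = abs(timecode_to_seconds(reported) - timecode_to_seconds(expected))
    return gap <= tolerance_s


# The two timecodes reported in the demo.
print(timecode_to_seconds("12:01"))  # 721
print(timecode_to_seconds("15:34"))  # 934
```

Seeking a video player to 721 and 934 seconds is then enough to confirm each answer by eye.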
These examples highlight Gemini 1.5 Pro’s potential for understanding and processing extensive multimodal contexts of up to 1 million tokens from very short prompts. While the model’s responses may vary and are not always flawless, its ability to interpret complex prompts and abstract visuals without extensive explanations sets a new standard for generative AI models.
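The figures quoted from the demo allow a rough back-of-the-envelope estimate of how much video fits in the context window. The sketch below assumes that token count scales linearly with running time, which the demo does not state, and the names `estimated_tokens` and `fits_in_context` are hypothetical.

```python
# Figures quoted from the demo.
FILM_MINUTES = 44
FILM_TOKENS = 600_000
CONTEXT_LIMIT = 1_000_000

# Assumption: tokens grow linearly with video length.
tokens_per_minute = FILM_TOKENS / FILM_MINUTES
tokens_per_second = tokens_per_minute / 60
max_minutes = CONTEXT_LIMIT / tokens_per_minute


def estimated_tokens(minutes: float) -> int:
    """Estimate the token count for a video of the given length."""
    return round(minutes * tokens_per_minute)


def fits_in_context(minutes: float, prompt_tokens: int = 0) -> bool:
    """True if the video plus the prompt stays within the context limit."""
    return estimated_tokens(minutes) + prompt_tokens <= CONTEXT_LIMIT


print(f"{tokens_per_second:.0f} tokens per second of video")
print(f"about {max_minutes:.0f} minutes fit in the context window")
print(fits_in_context(44), fits_in_context(90))
```

Under that linear assumption, the 44-minute film uses about 60% of the window, and the limit is reached a little past the 73-minute mark.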