OpenAI's new GPT‑4o dazzles with real-time sound and image analysis

OpenAI has unveiled its latest model, GPT-4o, which can analyze audio, images, and text in real time. Most striking is how quickly the model responds to audio input.

14 May 2024 08:04

Artificial intelligence enthusiasts eagerly awaited the OpenAI Spring Update, a presentation by the creators of ChatGPT. Anticipation was heightened by widely circulated industry reports that the company might unveil an AI-powered internet search engine. This time, however, the spotlight was on its latest model.

GPT-4o operates in real time

OpenAI introduced the GPT-4o model, which is designed to make interactions more natural. According to the company, GPT-4o can respond to audio input in as little as about 230 milliseconds, with an average response time of around 320 milliseconds. This is comparable to human response time in a conversation. In terms of performance, the model is on par with GPT-4 Turbo on English text and outperforms it in other languages.

OpenAI claims that GPT-4o is also significantly better at interpreting images and audio than existing models. So, what can this new tool do? One of my highlights was a demonstration in which GPT-4o was asked to count from one to ten.

When told to change its pace, GPT-4o adjusted instantly. Another fascinating demonstration had GPT-4o act as a Spanish teacher, analyzing objects it saw through the camera.

When can we expect access to GPT-4o? OpenAI has announced that the model's text and image capabilities are already available in ChatGPT. The new model can be accessed in the free version, and paying subscribers get message limits up to five times higher. OpenAI also plans to release an alpha of the new GPT-4o voice mode to ChatGPT Plus users in the coming weeks.

Remember, OpenAI isn't just about ChatGPT. The forthcoming Sora model will enable users to create videos, an advancement many creators keenly anticipate.

© Daily Wrap