Google (NASDAQ:GOOG) (NASDAQ:GOOGL) has unveiled its new Gemini generative AI model, positioning it as a strong competitor to GPT-4 from Microsoft-backed (NASDAQ: MSFT) OpenAI. According to DeepMind CEO Demis Hassabis, Gemini is Google’s “most capable and general model” to date.
Gemini stands out as a natively multimodal model, adept at analyzing text, audio, video, images, and code. Unlike models that are pieced together from separate components for different media, Gemini has been designed from the ground up to handle all of these media together.
This integrated approach allows Gemini to more effectively understand and process multimodal data, enhancing its performance across various applications, from interpreting handwritten notes to analyzing images and videos.
Google showcased Gemini’s capabilities through several demonstrations. One highlighted the AI’s ability to recognize both a drawing and a physical version of a blue duck. In another, the AI preferred a roller coaster with a loop over one without, demonstrating its contextual understanding.
Gemini also has practical applications for everyday tasks. In educational settings, it can read and assess a student’s handwritten math answers, providing corrections and explanations where needed. In the coding domain, Google touts Gemini as a leading model, proficient in programming languages like Python, Java, C++, and Go.
The company is introducing three versions of Gemini: Gemini Ultra for complex tasks in data centers, the mid-range Gemini Pro, and Gemini Nano for mobile devices, including the Google Pixel 8 Pro.
Google plans to integrate Gemini Nano into the Summarize feature of its Recorder app on the Pixel 8 Pro, enabling it to analyze recordings and generate bullet-point summaries. Additionally, Gemini will power Smart Reply in Gboard, initially in WhatsApp, before expanding to other apps next year.
Furthermore, Gemini Pro is now available in the English version of Google’s Bard chatbot, enhancing its capabilities in understanding, summarizing, reasoning, coding, and planning. Google also announced Bard Advanced, powered by Gemini Ultra, which is set to launch next year.
Featured Image – Freepik