
'Google's Gemini': Largest AI model to redefine text, image, video, audio and coding


By ETV Bharat Tech Team

Published : Dec 7, 2023, 10:10 AM IST

Updated : Dec 7, 2023, 10:16 AM IST

Google on Wednesday introduced Gemini, its most capable AI model, which works across multiple formats such as text, code, audio, image and video and reports state-of-the-art performance on many leading benchmarks. It comes in three versions: Ultra, Pro and Nano. The model's primary advantage is its ability to operate across different formats.

Google has launched Gemini, its largest AI model, which will be available in three versions: Ultra, Pro and Nano. The model is designed to operate across different information types such as text, code, audio, image and video.

Hyderabad: In the never-ending race among tech giants in the realm of Artificial Intelligence, Google has entered the fray with its latest AI model, Gemini, designed to excel in processing text, images, audio, video, and code.

Google CEO Sundar Pichai highlighted Gemini's "state-of-the-art performance across many leading benchmarks" and introduced three versions: Ultra (for highly complex tasks), Pro (for scaling across a wide range of tasks) and Nano (for on-device tasks).

These models aim to generalise and seamlessly operate across different information types like text, code, audio, image, and video. Gemini's capabilities were showcased in demos, emphasising its ability to perceive like a human eye, evaluate real-time data, and suggest actions.

Gemini Ultra, the largest model, targets highly complex tasks, while Gemini Pro excels at scaling across various tasks. Gemini Nano is tailored for on-device functions, and it has already been integrated into Pixel 8 Pro for features like Summarise in the Recorder app and Smart Reply via Gboard on WhatsApp.

Google plans to expand Gemini's integration into other products and services, such as Search, Ads, Chrome, and Duet AI. "These are the first models of the Gemini era and the first realisation of the vision we had when we formed Google DeepMind earlier this year," said Alphabet and Google CEO Sundar Pichai.

According to Demis Hassabis, CEO of Google DeepMind, Gemini Ultra outperforms existing large language models on 30 out of 32 widely used academic benchmarks. Google also says it surpasses human experts on the Massive Multitask Language Understanding (MMLU) benchmark, which assesses knowledge and problem-solving skills across 57 subjects. Gemini Pro has also outperformed GPT-3.5 on various benchmarks.

  • The Gemini era is here. Thrilled to launch Gemini 1.0, our most capable & general AI model. Built to be natively multimodal, it can understand many types of info. Efficient & flexible, it comes in 3 sizes each best-in-class & optimized for different uses https://t.co/VUu1277bC2 pic.twitter.com/pKyBxXwdYw

    - Demis Hassabis (@demishassabis) December 6, 2023

"With a score of 90 per cent, Gemini Ultra is the first model to outperform human experts on MMLU (massive multitask language understanding), which uses a combination of 57 subjects such as maths, physics, history, law, medicine and ethics for testing both world knowledge and problem-solving abilities," Google said.

Gemini's flexibility is emphasised as it can efficiently run on both data centres and mobile devices. This versatility enhances developers' ability to build and scale AI applications. Hassabis envisions Gemini as a step closer to creating AI models that feel like expert helpers or assistants, providing useful and intuitive interactions.

The model's multimodal reasoning capabilities enable it to comprehend complex written and visual information, extracting insights from extensive datasets.

The first version of Gemini exhibits proficiency in understanding nuanced information, answering complex questions in subjects like maths and physics, and generating high-quality code in popular programming languages. Hassabis acknowledges ongoing efforts to enhance Gemini's capabilities, including advancements in planning and memory, and expanding the context window for improved responses.

Google plans to roll out Gemini incrementally, starting with its integration into the chatbot Bard for English language settings. It will be available in over 170 countries and territories, with developers gaining access through Google Cloud's API from December 13.

Gemini's compact version will power suggested messaging replies on Pixel 8 smartphones, with further integration into Google products anticipated in the coming months. The most powerful version of Gemini is expected to debut in 2024, subject to extensive trust and safety checks.

To address potential risks, Alphabet, Google's parent company, is implementing new protections in alignment with safety policies and AI principles.

