Google Gemini 1.5 Flash, Project Astra, Imagen 3 and more introduced!

Google announced major developments in artificial intelligence at its I/O 2024 event. The company is shaping the future of AI with new features in the Gemini model family, faster and more efficient models, generative media tools, new search experiences, and Trillium, the 6th generation of Google Cloud TPU.

Gemini is now faster and smarter!

Google DeepMind CEO Demis Hassabis announced updates to the Gemini model family. Gemini 1.0, the first natively multimodal model, was released in December in three sizes (Ultra, Pro, Nano) and was soon followed by 1.5 Pro, which delivered improved performance and a context window of 1 million tokens.

Developers and enterprise customers have found 1.5 Pro’s long context window, multimodal reasoning capabilities, and impressive overall performance highly useful. In response to feedback that some applications need lower latency and lower serving costs, Google has added a new member to the Gemini family: 1.5 Flash.

A new era that pushes the limits in artificial intelligence: Google Gemini has evolved!

Google is opening a new era in the field of artificial intelligence with a series of updates to the Gemini model family.

Gemini 1.5 Flash

Optimized for speed and efficiency, this lightweight model is ideal and cost-effective for high-volume, high-frequency tasks. With an expanded context window of 1 million tokens, 1.5 Flash excels at tasks such as summarization, chat applications, image and video captioning, and data extraction from long documents and tables. 1.5 Flash was trained via “distillation” from the larger 1.5 Pro, a process that transfers the most essential knowledge and skills to a smaller, more efficient model.
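
For a sense of how such high-volume tasks map onto the Gemini API, here is a minimal summarization sketch using the google-generativeai Python SDK. The API key placeholder and the input file are illustrative assumptions; “gemini-1.5-flash” is the model identifier Google exposes for this model.

```python
# Minimal sketch: summarizing a long document with Gemini 1.5 Flash.
# Assumes the google-generativeai SDK is installed and an API key is available.
import google.generativeai as genai

genai.configure(api_key="YOUR_API_KEY")  # placeholder, not a real key

# 1.5 Flash targets high-volume, low-latency work such as summarization.
model = genai.GenerativeModel("gemini-1.5-flash")

# Hypothetical input file standing in for a long document.
with open("long_report.txt", "r", encoding="utf-8") as f:
    document = f.read()

response = model.generate_content(
    f"Summarize the key points of the following document:\n\n{document}"
)
print(response.text)
```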

Gemini 1.5 Pro

Google has also significantly improved 1.5 Pro, its best model for overall performance. The context window has been expanded to 2 million tokens, and code generation, logical reasoning and planning, multi-turn conversation, and audio and video understanding have been enhanced through data and algorithmic improvements.

1.5 Pro can now follow increasingly complex and nuanced instructions, including ones that specify product-level behavior involving role, format, and style. Control over the model’s responses has been improved for specific use cases, such as crafting the persona and response style of a chat application or automating workflows through multiple function calls. Users can now steer the model’s behavior by setting system instructions.
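
As a rough illustration of that last point, the sketch below sets a system instruction to fix a chat assistant’s role and style, again assuming the google-generativeai Python SDK; the persona text and the user message are made up for the example.

```python
# Minimal sketch: steering model behavior with a system instruction.
import google.generativeai as genai

genai.configure(api_key="YOUR_API_KEY")  # placeholder

# The system instruction fixes role, format, and style for every response.
model = genai.GenerativeModel(
    "gemini-1.5-pro",
    system_instruction=(
        "You are a concise support assistant for a photo-editing app. "
        "Answer in at most three sentences and always suggest one next step."
    ),
)

chat = model.start_chat()
reply = chat.send_message("How do I remove an object from a picture?")
print(reply.text)
```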

Gemini Nano

Going beyond text-only input, Gemini Nano can now also process images. Starting with Pixel phones, apps that use Gemini Nano with Multimodality will be able to understand the world the way people do: not just through text, but also through sights, sounds, and spoken language.

The future of artificial intelligence assistants: Project Astra

As part of its mission to build AI responsibly for the benefit of humanity, Google DeepMind announced Project Astra, an effort to create universal AI agents that can help in everyday life. Astra aims to produce agents that understand context and take action, responding to a complex and dynamic world much as people do.

These agents will serve as proactive, approachable and personalized assistants. Users will be able to talk to these agents naturally and without delay. Astra was designed to be able to process and remember video and speech input.

These agents are built on the Gemini model and other task-specific models to process information faster by continuously encoding video frames, combining video and speech input into a timeline of events, and caching that information for efficient recall. Some of Astra’s features will be integrated into Google products such as the Gemini app later this year.
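
Google has not published how Astra is implemented. Purely as a toy illustration of the idea described above, continuously encoding inputs into a timeline of events and caching them for efficient recall, a sketch in Python might look like the following; every name in it is hypothetical.

```python
# Toy sketch of the described idea: encode incoming frames and speech into a
# single timeline of events and cache it for later recall.
# Illustration only; this is not Google's Astra implementation.
from dataclasses import dataclass, field
from collections import deque
import time


@dataclass
class Event:
    timestamp: float
    kind: str        # "frame" or "speech"
    embedding: list  # placeholder for an encoded representation


@dataclass
class Timeline:
    max_events: int = 1000
    events: deque = field(default_factory=deque)

    def add(self, kind: str, embedding: list) -> None:
        # Continuously append encoded inputs; drop the oldest when the cache is full.
        if len(self.events) >= self.max_events:
            self.events.popleft()
        self.events.append(Event(time.time(), kind, embedding))

    def recall(self, since_seconds: float) -> list:
        # Return recent events so the agent can reason over what it just saw and heard.
        cutoff = time.time() - since_seconds
        return [e for e in self.events if e.timestamp >= cutoff]


timeline = Timeline()
timeline.add("frame", [0.1, 0.2, 0.3])    # stand-in for an encoded video frame
timeline.add("speech", [0.4, 0.5, 0.6])   # stand-in for an encoded utterance
recent = timeline.recall(since_seconds=60)
```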

New generative media models and tools

Google also introduced new generative media models and tools for creative work:

Veo

Veo, Google’s most capable video generation model to date, can create high-quality 1080p videos that run beyond a minute in length. Supporting a wide range of cinematic and visual styles, Veo understands natural language and visual semantics to produce videos that reflect the user’s creative vision. The model also understands cinematic terms like “timelapse” or “aerial shot of a landscape,” providing an unprecedented level of creative control.

Veo creates consistent and coherent shots in which people, animals, and objects move realistically throughout. Google is inviting a number of filmmakers and creators to experiment with the model and explore how Veo can best support storytellers’ creative process.

Imagen 3

Imagen 3, Google’s highest-quality text-to-image model, produces incredibly detailed, photorealistic images. Because it interprets prompts written in natural language, Imagen 3 can incorporate small details from long prompts and render text within an image.

Music AI Sandbox

Music AI Sandbox, a suite of tools that helps musicians create new instrumental tracks from scratch, transform audio, and explore creative work, aims to open a new playground for creativity.

Responsible AI development

Google is committed to not only advancing technology, but also doing so responsibly. Therefore, measures are being taken to address the challenges posed by generative technologies and help people work responsibly with AI-generated content.

These measures include collaborating with the creative community and other stakeholders, gathering insights for the safe and responsible development and deployment of technologies, listening to feedback, and giving creators a voice. Google believes that artificial intelligence technologies should be used to benefit humanity and is working to ensure that these technologies are developed ethically, responsibly and fairly.
