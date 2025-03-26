Hyderabad: OpenAI has announced the introduction of image generation capability into GPT-4o, resulting in images that are hard to identify as AI-generated pictures. The company says that the image generation is not only beautiful but useful as well, showcasing text rendering, character consistency, restyling, and transparent layers.

"We trained our models on the joint distribution of online images and text, learning not just how images relate to language, but how they relate to each other. Combined with aggressive post-training, the resulting model has surprising visual fluency, capable of generating images that are useful, consistent, and context-aware," OpenAI explained in a blog post.

OpenAI claims that GPT‑4o image generation excels at accurately rendering text, precisely following prompts, and leveraging 4o’s inherent knowledge base and chat context. The company says that these capabilities allow users to create exactly the image they envision.

GPT-4o image generation capabilities

Text rendering allows the GPT-4o image generation model to write long sentences and understand requests for complex text placement within a picture. Earlier image generation models struggled to write even a single word with correct spelling and a consistent font and style. The new model, however, can design complete restaurant menus, invitation cards, and street signs filled with text and images.

Since image generation is now native to GPT-4o, users can refine images through natural conversation. With improved character consistency, the model can keep track of a subject and alter it according to your directions, allowing you to turn anything into a video game character or a sticker.

It can also follow detailed prompts closely and can handle up to 10-20 different objects. Additionally, with in-context learning, GPT-4o can analyse and learn from user-uploaded images, integrating their details into its context to inform image generation. For instance, you can have it generate an image of a real building from a sketch or turn a painting into a photo-realistic picture.

OpenAI CEO Sam Altman called it an incredible technology/product. "I remember seeing some of the first images come out of this model and having a hard time believing they were really made by AI," Altman wrote in an X post. "We think people will love it, and we are excited to see the resulting creativity."

Availability

GPT-4o image generation has started rolling out to Plus, Pro, Team, and Free users as the default image generator in ChatGPT. Enterprise and Education users will get access soon. The new image-generation tool is also available in Sora. Notably, DALL-E can still be accessed through a dedicated DALL-E GPT.