On Tuesday, OpenAI CEO Sam Altman announced during a livestream the first major upgrade to ChatGPT’s image-generation capabilities in over a year.
With this upgrade, ChatGPT can now use the company’s GPT-4o model to create and modify images natively. GPT-4o has been the foundation of the AI-powered chatbot platform, but until now it could only generate and edit text, not images.
According to Altman, GPT-4o’s native image generation is now live in ChatGPT and Sora, the company’s AI video-generation product, for subscribers to OpenAI’s $200-a-month Pro plan. The feature is expected to roll out soon to Plus and free users of ChatGPT, as well as to developers using the company’s API service.
GPT-4o takes slightly longer to process an image than DALL-E 3, the model it replaces, but produces what OpenAI describes as more accurate and detailed output. The new model can edit existing images, including those with people, and can “inpaint” details such as foreground and background objects, transforming them as needed.
To develop the new image feature, OpenAI trained GPT-4o on a combination of publicly available data and proprietary data from partnerships with companies like Shutterstock, the company told the Wall Street Journal.
Many generative AI vendors consider training data a competitive advantage and therefore keep it, and related information, confidential. That discretion can also help them avoid potential IP-related lawsuits.
OpenAI’s Chief Operating Officer, Brad Lightcap, stated that the company is committed to respecting artists’ rights and has implemented policies to prevent the generation of images that directly mimic the work of living artists, as quoted in the Wall Street Journal.
OpenAI provides an opt-out form that allows creators to request the removal of their work from its training datasets. Furthermore, the company respects requests to disallow its web-scraping bots from collecting training data, including images, from specific websites.
The upgraded image-generation feature in ChatGPT follows the introduction of Google’s experimental native image output for Gemini 2.0 Flash, one of the company’s flagship models. Google’s feature went viral on social media, in part because its lack of guardrails let users remove watermarks and create images depicting copyrighted characters.
This article was updated at 12pm PT to include OpenAI’s statement to the Wall Street Journal regarding GPT-4o’s training data.