
ZDNET’s key takeaways
- OpenAI reframes images as a visual language.
- Thinking mode builds context-aware infographics.
- Brand consistency remains unreliable in early testing.
Today, OpenAI introduced ChatGPT Images 2.0, its next-generation image model, which the company says is focused on precision, usability, and complex visual tasks.
The most notable new capability is the ability to combine text and images to build complex, beautiful pages. OpenAI is reframing the whole idea of image generation from a process that creates decorations (their word) to a language (also their term).
OpenAI describes it as, “A good image does what a good sentence does — it selects, arranges, and reveals. It can explain a mechanism, stage a mood, test an idea, or make an argument.”
Thinking capabilities enable complex workflows
In addition to its vastly improved ability to mix text and graphics, the new model uses enhanced thinking capabilities. It can generate multiple images per prompt with continuity across outputs. This approach is possible because the model actually integrates reasoning into the image output.
This shift is big. Instead of just generating an image that nearly matches the prompt details, Images 2.0 can take a much vaguer prompt, like “Generate an infographic about activities I should do with tomorrow’s weather in San Francisco in mind.”
From this prompt, the AI will gather weather and activity data about San Francisco, determine activities appropriate to the weather, and then build an image or set of images that match the results.
According to OpenAI, “In this model, Images 2.0 acts more like a visual thought partner, helping carry a project from rough concept to finished asset with significantly less work on your part.”
Precision and design control improve usability
Many of us have long struggled to convince ChatGPT to generate images in a specific desired aspect ratio. Often, the AI stubbornly produces what it wants. But now, with Images 2.0, the model has support for “aspect ratios as wide as 3:1 and as tall as 1:3.”
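To make the quoted range concrete, here is a minimal Python sketch that checks whether a requested width and height fall between 1:3 and 3:1. The function name and the choice to treat both limits as inclusive are assumptions for illustration; the article does not say how the model validates or rounds sizes.

```python
from fractions import Fraction

# Limits taken from the quoted range: as wide as 3:1, as tall as 1:3.
# Treating both ends as inclusive is an assumption.
WIDEST = Fraction(3, 1)
TALLEST = Fraction(1, 3)


def is_supported_ratio(width: int, height: int) -> bool:
    """Return True if width:height lies within the quoted 1:3 to 3:1 range.

    Hypothetical helper for illustration, not part of any OpenAI library.
    """
    if width <= 0 or height <= 0:
        raise ValueError("width and height must be positive")
    # Fraction keeps the comparison exact, with no floating-point rounding.
    ratio = Fraction(width, height)
    return TALLEST <= ratio <= WIDEST


print(is_supported_ratio(16, 9))  # True: 16:9 sits well inside the range
print(is_supported_ratio(3, 1))   # True: exactly the widest quoted ratio
print(is_supported_ratio(4, 1))   # False: wider than 3:1
```

By this check, a 16:9 infographic is comfortably inside the supported range, while a 4:1 banner is not.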
The model also supports higher-fidelity outputs that (mostly) produce accurate object placement, detailed text rendering, and complex compositions. We’ll see if we can remove the word “mostly” from that sentence after the product is officially launched.
The AI also supports small text, UI elements, and stylistic constraints at up to 2K resolution. Cool.
Testing the preview
I was given access to a day-before-release preview, and the model is impressive, mostly. I fed it a screenshot of the ZDNET home page and a draft of the Images 2.0 press release.
Then I instructed, “Based on the contents of the press release, generate a 16:9 infographic about the new image update and generate it using the ZDNET brand style as shown in the ZDNET home page document.”
The model did a great job on the infographic, but try as it might, it couldn’t reproduce the ZDNET logo. On its first try, it rendered the Z in ZDNET with a slight droop.
I tried a variety of requests on the order of, “Fix the ZDNET logo. The Z droops in your version but is not droopy in the actual logo.” But Images 2.0 never managed to fix it.
So I started a new session. This time, I included the instruction, “Use special care to reproduce the ZDNET logo accurately.”
Here is where things got very odd. For its first run, the model somehow dug up a copy of ZDNET’s logo from before our 2022 redesign. This logo is nowhere to be found on our current home page. Weirdly, it rendered that old logo using the current color scheme. The model then pushed the logo and the infographic information off the left edge of the image. It also chose a light blue for “Images 2.0” that is not a ZDNET brand color.
I tried mightily to convince it to use the current logo. I managed to get it to push the image to the right, so nothing was cut off. But adding the prompt, “Use the ZDNET logo that is on the provided page. Do not search for an alternate logo,” did nothing to fix the problem.
I took one more shot at the problem before deciding to return to finishing up this article. Once again, I started a new session so the AI did not have muscle memory from its earlier miscalculations.
The model messed up the logo again. This time, the AI decided to add a rudder shape to the stem of the stretched-out capital D.
To be fair, I am using a pre-release version of Images 2.0. I will be back with a much more comprehensive test run of the model after the official product launch.
I also tried a similar test using a different document with Google’s Nano Banana Pro, but because it did not handle the synthesis the way that this new version of OpenAI’s product does, it wasn’t really able to repeat the results I got here. We’ll know more as we do more advanced tests.
Pricing and availability
The new model is available today to all ChatGPT and Codex users. Advanced outputs and the thinking capability are available to ChatGPT Plus, Pro, Business, and Enterprise users. Be sure to select “Thinking” from the ChatGPT dropdown bar at the top of the screen.
At the time of writing, before launch, the new Images 2.0 model is only available on the desktop. But OpenAI promises that these capabilities will be in the mobile version as well, including the ability to finger-select images using your mobile touchscreen.
The images are also available via API using the gpt-image-2 model. API pricing varies depending on the quality, thinkiness (my word), and desired image resolution.
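For readers curious what an API request might look like, here is a rough Python sketch that only assembles a JSON request body and does not call any service. The `model`, `prompt`, and `size` fields are the ones OpenAI's existing image generation endpoint (`/v1/images/generations`) accepts; that gpt-image-2 takes the same fields, and which `size` values it allows, are assumptions I could not confirm before launch. The helper name is hypothetical.

```python
import json


def build_image_request(prompt: str, size: str = "1024x1024") -> dict:
    """Assemble a JSON body for an image generation request.

    Hypothetical helper: the field names follow OpenAI's existing
    image generation endpoint. Whether gpt-image-2 accepts the same
    fields and size values is an assumption, not confirmed here.
    """
    if not prompt.strip():
        raise ValueError("prompt must not be empty")
    return {
        "model": "gpt-image-2",  # model name given in the article
        "prompt": prompt,
        "size": size,            # assumed format: "WIDTHxHEIGHT" in pixels
    }


# Build the body only; sending it would require an API key and HTTP client.
body = build_image_request(
    "An infographic summarizing the attached press release"
)
print(json.dumps(body, indent=2))
```

Check OpenAI's API documentation for the actual parameters and pricing tiers before relying on any of these values.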
If an AI can handle layout and content together, will that change how you approach design projects? Let us know in the comments below.
You can follow my day-to-day project updates on social media. Be sure to subscribe to my weekly update newsletter, and follow me on Twitter/X at @DavidGewirtz, on Facebook at Facebook.com/DavidGewirtz, on Instagram at Instagram.com/DavidGewirtz, on Bluesky at @DavidGewirtz.com, and on YouTube at YouTube.com/DavidGewirtzTV.





