MC.88: The world is about to go bananas.
How a mysterious model with a quirky name changed everything we thought we knew about AI image editing
Remember when GPT-4 first dropped and suddenly everyone was having conversations with AI that felt... different? Not just better than GPT-3, but fundamentally more capable, more reliable, more there? We just had that moment again, except this time it's happening with images.
Google's Gemini 2.5 Flash Image, which was dominating benchmarks under the mysterious codename "nano-banana," isn't just another incremental improvement in AI image generation. It's the moment when AI image editing crossed from "impressive tech demo" to "holy shit, this actually works."
Why This Feels Different
I've been testing AI image tools since DALL-E first blew our minds with surreal dogs wearing spacesuits. But there's something fundamentally different about what Google just released. Four things make this feel like a watershed moment:
It's stupidly simple to use. No more crafting the perfect prompt like you're casting a magic spell. You can literally tell it "make this person wearing a red shirt instead of blue" and it just... does it. Correctly. The first time.
The adherence is borderline uncanny. Ask for a specific change and Gemini actually follows your instructions instead of giving you something vaguely related. Want to change someone's hair color while keeping everything else identical? It understands the assignment.
It's fast enough to feel conversational. We're talking 2-3 seconds per generation. Fast enough that you can iterate, experiment, and refine in real-time rather than waiting around wondering if this attempt will be the one.
It's practically free. At $0.039 per image, you could generate a thousand professional-quality edits for less than $40. Compare that to hiring a photographer or graphic designer.
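The cost claim above is easy to sanity-check. Here's a minimal back-of-the-envelope sketch in Python, using the $0.039 per-image rate quoted above (the function name is my own, just for illustration):

```python
# Back-of-the-envelope cost check for bulk AI image edits.
PRICE_PER_IMAGE_USD = 0.039  # per-image rate cited in the text

def batch_cost(num_images: int, price: float = PRICE_PER_IMAGE_USD) -> float:
    """Total cost in USD for a batch of generated or edited images."""
    return num_images * price

if __name__ == "__main__":
    # A thousand professional-quality edits comes in under $40.
    print(f"1,000 edits: ${batch_cost(1000):.2f}")
```

Compare that $39 to a single hour of a photographer's or designer's time and the economics of the comparison become obvious.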
Images Hit Different Than Text
Here's the thing about text AI that we've gotten used to: there's always been a subtle uncanny valley. You can usually spot ChatGPT's writing: it's a bit too polished, slightly generic, missing that human spark. It's incredibly useful, but rarely passes for genuinely human. (NB: I love when LLMs are so meta)
Images don't have that luxury. When you look at a photo, your brain makes an instant judgment: real or fake? And with Gemini 2.5 Flash Image, that judgment is getting a lot harder to make.
Unlike text, which you process sequentially and analytically, images hit you viscerally and immediately. There's no gradual realization that something feels "AI-generated": it either looks real or it doesn't. And increasingly, it looks real.
The Manipulation Problem Just Got Real
This is where things get complicated. When you're generating images from scratch ("create a photo of a cat wearing sunglasses"), it's obvious what's happening.
But when you're editing existing photos? The line between "enhanced" and "fabricated" becomes incredibly blurry.
I spent an afternoon editing real photos with Gemini 2.5 Flash Image. Changing someone's outfit, adjusting their expression, moving them to different backgrounds. The results were seamless enough that even I started losing track of which versions were real.
This isn't theoretical. We're already seeing perfect deepfakes created by people with zero technical expertise. When the tools are this good and this accessible, every photo becomes potentially suspect.
Watermarks vs. Reality
Google's solution? They're embedding an invisible watermark (called SynthID) in every generated image. It's technically impressive; the watermark survives compression, cropping, and minor edits. But here's the problem: it only works if everyone uses Google's tool.
What happens when the inevitable open-source clones emerge? When someone builds a version without watermarking? When bad actors specifically seek out unwatermarked alternatives?
Watermarking feels like bringing a policy solution to a technical arms race. It's a good start, but hardly sufficient for what's coming.
Platform Responsibility in the Age of Perfect Fakes
The bigger question isn't technical; it's social. Who's responsible for identifying and labeling AI-generated content?
Should platforms like Twitter, Instagram, and Facebook be required to detect and flag AI images? Should they ban them entirely? Should creators be legally required to disclose when content is AI-generated?
The technology is advancing faster than our social systems can adapt. While we're still debating disclosure requirements, millions of people are already creating perfect fakes that spread faster than fact-checkers can keep up.
Living in the Post-Photography Era
Whether we're ready or not, we've entered the post-photography era. The question isn't whether AI can create convincing fake images; it already can. The question is how we adapt our media literacy, our legal frameworks, and our social norms to a world where seeing is no longer believing.
Google's nano-banana moment is remarkable not just for what it enables, but for what it forces us to confront. We've crossed a threshold where the technology has outpaced our ability to manage its implications.
The genie isn't going back in the bottle. The question now is whether we can build the social and technical infrastructure to live responsibly with perfect fake images, or whether we'll let the chaos unfold while we figure it out.
Cheers,
Olivier
Like this newsletter? Forward it to a friend and have them sign up here.
Until next Thursday 🎉
¹ Since we wrote this piece, previous models are now back for Plus users.