MC.24: AI is like water

Explore transformative shifts: NFX's AI analogy with water, Google's Gemini 1.5 monumental context window, OpenAI's new text-to-video model, and massive investments in China and Spain.

modern chaos issue 24

Modern chaos is a newsletter exploring tech and AI through the journey of a dev agency turning into a startup studio. We share our notes, analysis, and experiments, plus a bit of Emacs tips.

Hello everyone,

In my last newsletter, I asked how you would work with AI agents on tasks that require your credentials. Fortunately, nobody would give their credentials directly, but you are 80% willing to let them handle tasks on your behalf.

After all, this isn't surprising. We expect AI to ease our burden by taking some tasks off our shoulders. This is the promise AI companies are selling us: more time, more productivity, you can work on things that matter to you, our AI will do the rest.

But something is missing: how to delegate when AI agents can't log in or use APIs? We can't let them use our passwords; it's too risky. Today, APIs are made for developers to interface software, transit data, and make some actions. Managing the whole back office through a programmatic interface is rare and far from the main business goals.

Startups have a chance to achieve Zapier's success by focusing to AI agents' needs: authentication, APIs, and interaction with their owner's software and apps. Addressing a subset of the problems might be enough to gain early traction:

  • Security: Preventing AI agents from going rogue.

  • Quality and reliability: Ensuring AI agents' work meets standards.

  • Web interactivity: Enabling AI agents to understand and control browser apps.

  • Native interactivity: Allowing AI agents to understand and control OS-native apps.

Software companies should start thinking about it. How can they design and extend their systems to support AI agents? What does it mean in terms of security and features?

I believe that customers will look for companies that support AI agents just like we evaluate apps by their integrations in an ecosystem. That’s why starting some implementations should prove profitable in the long run. Waiting for a hypothetical provider that solves this could delay their capabilities to answer a market shift towards AI agents.

If you know of people or companies working on the topic, I'd be happy to have some names and cover their work: [email protected]

Poll Results

  • I'd share my credentials directly with the Agent 0%

  • I'd set up a delegation account 60%

  • I'd use a programmatic interface, if available 20%

  • I wouldn't grant access due to potential security risks 20%

Cheers,

In this issue:

🌊 NFX: AI Is Like Water
🔗 BAIR: The Shift from Models to Compound AI Systems
🚀 Moonshot AI: $1 billion in funding led by Alibaba
💰 Microsoft: 4X investment in AI and Cloud in Spain
👀 Meta: a new AI model that learns by watching videos
🎥 Sora: a new text-to-video model from OpenAI
🌌 Google: Gemini 1.5, next-generation AI model
✨ Magika: Casting Spells on File Identification
📝 Hamel.dev: Show Me The Prompt
⚡ Groq: Inference 18 times faster than anybody
🎞️ DomoAI: Video to Video 2.0

Updates & tools

NFX: AI Is Like Water 
NFX drops a profound analogy comparing AI to water - it's everywhere, and it's transformative. The core message? We're at a pivotal moment where AI technology, like the $200 battery or the $100 genome sequence, has become incredibly accessible. This shift means companies need to sprint through the gates of innovation they might not have even realized were open. What's captivating here is the emphasis on differentiation in a sea of sameness. How to brand your product when your core tech is the same as 1000 others startups in the space? MORE

BAIR: The Shift from Models to Compound AI Systems 
The Berkeley Artificial Intelligence Research believes that compound AI systems will remain the best way to maximize the quality and reliability of AI applications going forward, even in case of new strong models. I share this view, businesses staying at the model level will be outpaced by others adopting and experimenting with more complex systems. MORE

Moonshot AI: $1 billion in funding led by Alibaba
The Chinese startup is now valued at $2.5 billion. While the USA is still ahead in the game thanks to OpenAI, there are more and more players with enough funding to compete. MORE

Microsoft: 4X investment in AI and Cloud in Spain 
When the words investment and AI are mentioned in a press release, it's more often in billions than millions. Here it's $2.1 billion, the company's largest investment in Spain in over three decades. If we weren't already 100% sure that their strategy deck is made of one slide "AI everywhere", now we know. MORE

Meta: a new AI model that learns by watching videos 
Their new model isn't just learning from text; it's absorbing the world through videos. Yann LeCun's vision suggests that the key to artificial general intelligence (AGI) might lie in teaching AI to understand our world as we do: through sights and sounds. The addition of audio learning is next on their list, promising to further bridge the gap between human and machine learning processes. MORE

Sora: a new text-to-video model from OpenAI 
OpenAI is at it again, this time with Sora, a groundbreaking text-to-video model. It can churn out minute-long videos, pushing the boundaries of AI-generated content. The implications for entertainment, education, and even advertising are monumental. Checkout the demo reels and see by yourself: the era of AI in filmmaking is upon us. MORE

Google: Gemini 1.5, next-generation AI model 
This isn't just another model; it's a paradigm shift. Gemini Pro 1.5 shatters expectations with a context window up to 10 million tokens—yes, you read that right. Imagine finding precise quotes in a 402-page document or pinpointing scenes in a 44-minute movie without breaking a sweat. This model isn't just about size; it's about precision and understanding on an unprecedented scale. MORE

Magika: Casting Spells on File Identification 
Google introduces Magika, a tool that brings a kind of magic to the often tedious task of identifying file types. Magika is fast and efficient, a GPU is not required and it's available on GitHub under the Apache2 License. This innovation hints at a future where mundane (but complex) tasks are dispatched with a whisper of AI. MORE

Hamel.dev: Show Me The Prompt 
Lots of libraries try to optimize your prompts by rewriting them for you. Hamel Husain shows in this blog post how to check what they do by intercepting API calls. He tried 5 different libraries; one made hundreds of calls to OpenAI, another one used a lengthy prompt with XML to constrain the output, enlarging your token usage. The moral? Understand your tools before committing. MORE

Groq: Inference 18 times faster than anybody 
Groq platform broke the speed inference benchmark with their LPUs. It's 18 times faster than what's offered by top cloud services. The service is still in alpha. LLama2 and Mistral models are their current lineup. MORE

DomoAI: Video to Video 2.0 
Expect smoother videos, clearer backgrounds, and better character sync. They also added a Ghibly-style render. I imagine myself creating a manga-like series from my family video for my kid. MORE


Reply

or to participate.