• Modern Chaos
  • Posts
  • MC.12: Kyutai's AI Vision, From Screenshot to Syntax, and DeepMind's Music AI Experiments

MC.12: Kyutai's AI Vision, From Screenshot to Syntax, and DeepMind's Music AI Experiments

This week, discover the new research lab Kyutai backed by €300 million, grasp the magic of turning screenshots into code, and experience DeepMind's innovative fusion of AI with the art of music composition.

modern chaos issue 12

Modern chaos is a newsletter exploring tech and AI through the journey of a dev agency shifting from services to product design. We share our notes, analysis and experiments.

Hello everyone!

Like many, I've been caught in the recent drama at OpenAI. It's sparked much speculation. Is it ego, an AGI breakthrough, or something else? The truth is unclear. This reminds us of Cloud services' wisdom: don't rely on one provider. This is true for applications and Language Learning Models (LLMs) alike. Diversifying sources and providers is crucial.

Cheers, 

In this issue:

📺 YouTube: Our Approach to Responsible AI Innovation
🌟 Awesome-GPTs: Curated List of GPTs Built on OpenAI
🖼️ Screenshot-to-Code: Convert a Screenshot to HTML/Tailwind/JS Code
📊 Hallucination Leaderboard: LLM Performance in Summarizing
🎶 Google DeepMind: Transforming the Future of Music Creation
🤖 Kyutai: A New AI Research Lab with a €330 Million Budget
🧠 Orca 2: Teaching Small Language Models How to Reason
💻 DeepSeek Coder: A New Open-Source Code LLM in English & Chinese

Tech updates & tools

Youtube: Our Approach to Responsible AI Innovation
Google will introduce new content labels to inform viewers when the content they’re viewing is synthetic. Creators will be required to disclose any use of AI tools, even if it justs partially alteration. The labels will be prominently displayed on sensitive contents, such as elections. In my opinion, this is a step in the right direction. MORE

Awesome-GPTs: Curated List of GPTs Built on OpenAI
It’s great to see the work of other AI enthusiasts and have the ability to try it out immediately. Plus, if you like one of them, it’s easy to retrieve the prompt and create your own personal version. MORE

Screenshot-to-Code: Convert a Screenshot to HTML/Tailwind/JS Code
A simple app using GPT-4 Vision to generate code and images from a given screenshot. I’m curious to see how paid services like v0.dev will survive when so many open-source alternatives are emerging. VIDEO | MORE

Hallucination Learboard: LLM Performance in Summarizing
This is a public LLM leaderboard computed using Vectara's Hallucination Evaluation Model. It evaluates how often an LLM introduces hallucinations when summarizing a document. GPT-4 is at 3%, and Mistral 7B at 9.4%. MORE

Google Deepmind: Transforming the Future of Music Creation
Have a sneal peek at the AI-related music experiments taking place in Youtube. They showcase Dream Track for Shorts, generative music mimicking various artists, and Music AI tools where you just have to sing to turn your idea into music. Impressive. MORE

Kyutai: A New Ai Research Lab with a €330 Million Budget 
I talked last week about how France is betting on AI. This news is another strong signal. This lab is aiming to democratize AI and will open-source and disclose everything, including the data used. MORE

Orca 2: Teaching Small Language Models How to Reason
Orca 2 significantly surpasses models of similar size and attains performance levels better than models 5-10 times larger, as assessed on complex tasks that test advanced reasoning abilities in zero-shot settings. Model 7B & 13B available. MORE | HUGGINGFACE

DeepSeek Coder: A New Open Source Code LLM in English & Chinese
Another state-of-the-art code model to try. It’s available on Ollama and features various model size from 1.3B to 33B. MORE

Reply

or to participate.