ε Pulse: Issue #13
Google, MD 👩🏼⚕, Apple’s New Ferret 🐿, and Platforms for Multi AI Agent Systems 🚣♂.
[AI X Industry + Products] 🤖🖥👨🏿💻
[I]: 📹AI Video Generators
“OpenAI captivated the tech world a few months back with a generative AI model, Sora, that turns scene descriptions into original videos — no cameras or film crews required. But Sora has so far been tightly gated, and the firm seems to be aiming it toward well-funded creatives like Hollywood directors — not hobbyists or small-time marketers, necessarily.
Alex Mashrabov, the former head of generative AI at Snap, sensed an opportunity. So he launched Higgsfield AI, an AI-powered video creation and editing platform designed for more tailored, personalized applications…”
In other news, Adobe Premiere Pro is getting generative AI video tools:
“Adobe is working on a generative AI video model for its Firefly family that will bring new tools to its Premiere Pro video editing platform. These new Firefly tools — alongside some proposed third-party integrations with Runway, Pika Labs, and OpenAI’s Sora models — will allow Premiere Pro users to generate video and add or remove objects using text prompts (just like Photoshop’s Generative Fill feature) and extend the length of video clips.”
[II]: 👩🏼⚕Google, MD.
I would be shocked if this weren’t already happening:
“Google may want to bring its AI expertise to the doctor’s office. The company filed multiple patent applications related to machine learning-based note-taking and research assistance in healthcare settings…”
[III]: 💂 Google’s (new) AI-powered products: Agents et al.
Folks over at Google are busy. Here is a walkthrough video of some of the latest AI products from Google. They have something like OpenAI’s GPTs for enterprises going on, I like the new Google Vids product, and I am also looking forward to trying out Gemini Code Assist (it looks like it is currently available on a free plan).
[IV]: 🐿 Ferret: Apple’s New AI Model that can see.
If this works, it’s going to be a game changer. App Agent, I suppose.
“The new paper for Ferret-UI explains that, while there have been noteworthy advancements in MLLM usage, they still "fall short in their ability to comprehend and interact effectively with user interface (UI) screens." Ferret-UI is described as a new MLLM tailored for understanding mobile UI screens, complete with "referring, grounding, and reasoning capabilities…"”
[V]: 🐇 Rabbit R1
If you have not yet seen Marques Brownlee’s review of the Rabbit R1, this is it.
[VI]: 🚣♂ Platform for Multi AI Agent Systems
I have been playing with multi-AI agent systems lately. I started out with LangGraph, but I have been trying out CrewAI recently. This is a good tutorial on CrewAI. My impression is that it can be very useful, depending on your use case and how much control you want to have within the loop.
[AI + Commentary] 📝🤖📰
[I]: 👝Mark Zuckerberg on Llama 3 et al
[II]: 🎍Pinecone CEO on Pioneering Vector Databases
[III]: 🏗 Vinod’s Advice for Entrepreneurs Innovating with AI
I particularly enjoyed the discussion about risk.
[IV]: 👮♂ Andrew Ng on Agentic AI
Ng discusses agentic AI design patterns in these five newsletters: introduction, reflection, tool use, planning, and multi-agent collaboration. It goes without saying that these patterns are powering the next generation of AI applications.
[V]: 🤖 AI Models, Data Scaling, Enterprise, and Personal AI
I recently came across this podcast, and it’s quite good. It looks like they just got started.