[AI/ML + Bio] 🤖 🦠 🧬
[I]: 🏥 Causal Machine Learning for Predicting Treatment Outcomes
Summary: I have been trying to learn a bit about causal machine learning (ML), and this perspective paper was a great way to get started.
Unlike traditional ML, causal ML focuses on quantifying changes in outcomes due to treatment, answering "what if" questions about potential outcomes under different treatment scenarios. It can be applied to observational data from real-world sources. However, careful consideration of assumptions and potential biases is crucial for reliable causal inference.
The causal ML workflow involves formulating the problem structure, selecting the causal quantity of interest, assessing assumptions, choosing and fitting a causal ML method, evaluating the method, performing robustness checks, and interpreting the results.
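To make the "choose and fit a causal ML method" step concrete, here is a minimal sketch using a simple T-learner on synthetic observational data (the covariates, treatment assignment, and effect size below are made up for illustration; the paper covers a much wider range of methods, assumptions, and robustness checks):

```python
# Minimal T-learner sketch: estimate the average treatment effect (ATE)
# from (X: covariates, t: binary treatment, y: outcome). Illustrative only.
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

rng = np.random.default_rng(0)
n = 2000
X = rng.normal(size=(n, 3))
t = rng.binomial(1, p=1 / (1 + np.exp(-X[:, 0])))       # confounded treatment assignment
y = 2.0 * t + X[:, 0] + rng.normal(scale=0.5, size=n)   # true effect = 2.0

# T-learner: fit one outcome model per treatment arm
m1 = GradientBoostingRegressor().fit(X[t == 1], y[t == 1])
m0 = GradientBoostingRegressor().fit(X[t == 0], y[t == 0])

# Contrast predicted potential outcomes under treatment vs. control for everyone
ate = np.mean(m1.predict(X) - m0.predict(X))
print(f"Estimated ATE: {ate:.2f}")  # should land near 2.0 under these assumptions
```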
I also started reading this book on the subject of causal ML.
[II]: 👨🔬 Large Language Model for Predictive Chemistry.
Summary: This (really cool) paper demonstrates that large language models (LLMs) like GPT-3, which are primarily trained on vast amounts of text data, can be fine-tuned to effectively perform various predictive tasks in chemistry and materials science, often outperforming traditional machine learning models, especially in low-data scenarios.
Some details:
The study involves fine-tuning GPT-3 to perform chemistry-related tasks, for instance predicting the phase of high-entropy alloys. The model architecture leverages the capabilities of GPT-3, a transformer-based model trained on extensive internet-sourced text, further fine-tuned on domain-specific questions. The fine-tuning process uses the OpenAI API and typically requires only a few minutes to adapt the model to a new task. The system facilitates tasks such as property prediction, classification, and inverse design of molecules and materials. The work fills a gap by showcasing how general LLMs can be adapted for highly specialized scientific tasks without the need for chemistry-specific pre-training, making advanced predictive capabilities accessible even with limited data.
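To give a flavour of how this looks in practice, here is a rough sketch of the idea rather than the paper's exact setup: frame a property-prediction task as plain question/answer text and fine-tune through the OpenAI API. The alloy compositions, labels, file name, and model choice below are hypothetical stand-ins (the paper fine-tuned GPT-3-class models):

```python
# Sketch: turn tabular chemistry data into Q/A text and launch a fine-tuning job.
import json
from openai import OpenAI

examples = [
    {"composition": "CoCrFeNi",   "phase": "single phase"},   # hypothetical rows
    {"composition": "AlCoCrFeNi", "phase": "multi phase"},
]

# Write the training set in the chat-style JSONL format the fine-tuning API expects
with open("alloy_phase_train.jsonl", "w") as f:
    for ex in examples:
        record = {
            "messages": [
                {"role": "user",
                 "content": f"What is the phase of {ex['composition']}?"},
                {"role": "assistant", "content": ex["phase"]},
            ]
        }
        f.write(json.dumps(record) + "\n")

client = OpenAI()  # reads OPENAI_API_KEY from the environment
training_file = client.files.create(
    file=open("alloy_phase_train.jsonl", "rb"), purpose="fine-tune"
)
job = client.fine_tuning.jobs.create(
    training_file=training_file.id, model="gpt-3.5-turbo"
)
print(job.id)  # poll the job; once finished, query the fine-tuned model as usual
```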
[AI X Industry + Products] 🤖🖥👨🏿💻
[I]: 👨⚕Google Medical AI Outperforms Doctors.
Med-Gemini, a breakthrough medical AI developed by Google and DeepMind, significantly advances clinical diagnostics with impressive capabilities for interpreting complex medical data. Med-Gemini excels at long-context reasoning, can access web-based information to improve clinical reasoning, and sets new standards on medical benchmarks, outperforming previous models and human doctors in various tests. It is trained on specialized datasets that enhance its reasoning and can perform complex diagnostic tasks, making it a powerful tool in healthcare.
[II]: 👨Sundar Pichai on the Future of AI & Google
Link to The Circuit Episode from Bloomberg Originals.
[III]: 🤖 OpenAI Spring Update
By the time you read this, unless you are living under a rock, you must have heard about GPT-4o. Just in case you haven’t, here are links. Quite a big deal!
Here is a summary from Sam Witteveen: GPT-4o, what they didn’t say. (As a side note, for hands-on learning about building LLM applications, Sam’s YouTube channel is one of the most helpful out there.)
[IV]: 🐇 Google I/O
I spent a considerable amount of time going through virtually all of the Google I/O sessions online. It’s quite a lot of products, if you ask me. A few that caught my eye:
Google Photos: Ask questions of your photos
Text to Video: Veo from Google via VideoFX
Illuminate: Turn academic papers to AI-generated audio discussions
Search: AI Overviews, AI tools while browsing (you have probably heard about this, with the news on glueing pizza, eating rocks, and what-not)
Learn About: Learn about anything.
Code: Data Science Agent & Code Transformation Agent.
Gemini Live and Gemini with Gems (akin to OpenAI’s GPTs)
Other interesting breakout sessions I liked: Multimodal RAG.
[V]: 🧑✈ Microsoft Team Copilot
Microsoft’s new ‘Team Copilot’ AI assistant runs meetings, manages projects, assigns tasks.
[VI]: 👓PaliGemma: An Open Vision Language Model
PaliGemma is a lightweight open vision-language model (VLM) inspired by PaLI-3, and based on open components like the SigLIP vision model and the Gemma language model. PaliGemma takes both images and text as inputs and can answer questions about images with detail and context, meaning that PaliGemma can perform deeper analysis of images and provide useful insights, such as captioning for images and short videos, object detection, and reading text embedded within images.
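For a sense of how it is used, here is a minimal inference sketch, assuming the Hugging Face transformers integration and the public mix checkpoint; the image URL is just a placeholder:

```python
# Sketch: caption an image with PaliGemma via transformers.
import requests
from PIL import Image
from transformers import AutoProcessor, PaliGemmaForConditionalGeneration

model_id = "google/paligemma-3b-mix-224"
model = PaliGemmaForConditionalGeneration.from_pretrained(model_id)
processor = AutoProcessor.from_pretrained(model_id)

image = Image.open(requests.get("https://example.com/cat.jpg", stream=True).raw)
prompt = "caption en"  # PaliGemma uses task prefixes, e.g. "caption en", "detect cat", "ocr"

inputs = processor(text=prompt, images=image, return_tensors="pt")
input_len = inputs["input_ids"].shape[-1]
generation = model.generate(**inputs, max_new_tokens=30, do_sample=False)

# Decode only the newly generated tokens (the prompt tokens come first)
print(processor.decode(generation[0][input_len:], skip_special_tokens=True))
```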
Tutorial on finetuning PaliGemma.
[VII]: 👷 AutoGen Update: Complex Tasks and Agents
Adam Fourney discusses the effectiveness of using multiple agents working together to complete complex multi-step tasks. He showcases their capability to outperform previous single-agent solutions on benchmarks like GAIA, using customizable arrangements of agents that collaborate, reason, and use tools to achieve complex outcomes.
Microsoft Research Forum, June 4, 2024
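To illustrate the basic multi-agent pattern, here is a toy two-agent sketch with the pyautogen package (the task prompt and configuration are illustrative, not the GAIA setup from the talk):

```python
# Sketch: an assistant agent paired with a user proxy that executes its code.
import os
from autogen import AssistantAgent, UserProxyAgent

config_list = [{"model": "gpt-4o", "api_key": os.environ["OPENAI_API_KEY"]}]

assistant = AssistantAgent("assistant", llm_config={"config_list": config_list})
user_proxy = UserProxyAgent(
    "user_proxy",
    human_input_mode="NEVER",  # fully automated loop, no human in the middle
    code_execution_config={"work_dir": "coding", "use_docker": False},
)

# The proxy runs any code the assistant writes and feeds results back,
# iterating until the assistant signals the task is complete.
user_proxy.initiate_chat(
    assistant,
    message="Fetch recent arXiv abstracts on multi-agent LLMs and summarize the top three.",
)
```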
[VIII]: 🪟 Generative User Interface
This is a very nice introduction to generative user interfaces using LangChain.
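As a toy illustration of the underlying idea (LangChain's generative UI tooling is largely JavaScript/React; this Python sketch only shows the structured-output half): the model returns a component spec that a frontend could render. The WeatherCard schema and prompt here are hypothetical:

```python
# Sketch: have the LLM emit a structured "UI component spec" instead of free text.
from pydantic import BaseModel
from langchain_openai import ChatOpenAI

class WeatherCard(BaseModel):
    """A UI card the frontend knows how to render."""
    city: str
    temperature_c: float
    summary: str

llm = ChatOpenAI(model="gpt-4o").with_structured_output(WeatherCard)
card = llm.invoke("Suggest a plausible weather card for Lagos today.")
print(card)  # a frontend would map this object onto a card component
```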
[IX]: 🍎 Apple Intelligence
Every new AI feature coming to the iPhone and Mac:
“Starting later this year, Apple is rolling out what it says is a more conversational Siri, custom, AI-generated “Genmoji,” and GPT-4o access that lets Siri turn to OpenAI’s chatbot when it can’t handle what you ask it for.”
Apple’s On-Device and Server Foundation Models: a detailed blog post on Apple’s models powering the new ‘Apple Intelligence’ system:
Apple Intelligence is comprised of multiple highly-capable generative models that are specialized for our users’ everyday tasks and can adapt on the fly for their current activity. The foundation models built into Apple Intelligence have been fine-tuned for user experiences such as writing and refining text, prioritizing and summarizing notifications, creating playful images for conversations with family and friends, and taking in-app actions to simplify interactions across apps…
[AI + Commentary] 📝🤖📰
[I]: 👝Elon Musk’s Plan for AI News.
I realized early enough that one ‘easy’ use case of AI is aggregating news content. See also the AI-powered Discover Daily Podcast by Perplexity X 11Labs.
[II]: 👨💻Startup Building
A few commentaries on AI and startup building.
[III]: 💻These New Computers are Getting Creepy
Commentary on Microsoft’s Copilot+ PCs
Folks say it’s creepy, but I suppose the question is: how many technologies did we consider creepy years ago but have warmed up to today? And why have we warmed up?
[IV] 🎧 AI-Voiced Audiobooks Top 40,000 Titles on Audible
“During the beta we are learning more about what our customers want as we continue to innovate on their behalf,” said an Audible spokesperson over email. These AI-voiced titles have an “average overall rating of 4+,” the spokesperson said.
Still, some consumers worry this development portends a tough future for narrators who will lose work while listeners suffer from lagging quality.
“So depressing to discover Virtual Voice narrations of audiobooks on Audible,” posted one X user. “Yes, they are good enough. Alas.”
Link to article.
[V] AGI is coming in 2027
If whatever this guy is saying comes to pass, the chaos will be enormous. And 2027 is not far off: AGI is coming in 2027.
[VI] Plentiful, high-paying jobs in the age of AI
Perhaps a complementary read to the last essay on ‘AGI coming in 2027’; the argument is very rational but not intuitive:
“…because of comparative advantage, it’s possible that many of the jobs that humans do today will continue to be done by humans indefinitely, no matter how much better AIs are at those jobs. And it’s possible that humans will continue to be well-compensated for doing those same jobs.”
(And with some relevant caveats.)
[VII] 🎙 Podcast on AI and GenAI
A few other podcast episodes I listened to over the past few weeks.