ε Pulse: Issue #17
Language Agents 🗣️, LangGraph Ecosystem 🕸, and Why Distribution is all You Need 📦.
Note: I am currently working on an AI project mostly in the science space with a friend, and we are searching for a technical partner with full-stack software engineering expertise. Please reach out to me via obifarin@yahoo.com or simply reply to this newsletter so we can schedule a time to chat. Ideally, we are looking for a candidate based in the United States.
My other publications:
See my latest Around the Web newsletter: Robots at the Gate.
Watch my latest video essay: On being antifragile.
In this newsletter:
🧬
AI Scientist: Towards Fully Automated Open-Ended Scientific Discovery
Language Agents Achieve Superhuman Synthesis of Scientific Knowledge
An LLM-based Knowledge Synthesis and Scientific Reasoning Framework for Biomedical Discovery
👨‍💻
GraphRAG: The Marriage of Knowledge Graphs and RAG
100M Token Context Windows
The LangGraph Ecosystem
🏭
Cursor AI: The AI Code Editor
Top 100 GenAI Consumer Apps
AI Friend Necklace.
Arizona State University X OpenAI
Google AI Experiments: Illuminate, Learn About, NoteBookLM
HasAnyOne.com: Topic Research Verification
📝
What does it take to build an AI Scientist?
Answer.ai & AI Magic with Jeremy Howard
An Artificial Intelligence Conversation With Andrew Ng
Distribution is all You Need
Anthropic CEO Dario Amodei Talks Everything AI
[LLM in Science et al] 🤖 🦠 🧬
[I]: 🧪 AI Scientist: Towards Fully Automated Open-Ended Scientific Discovery
The paper "The AI Scientist" presents an innovative framework for fully automated open-ended scientific discovery using LLMs. It automates the entire research process—from ideation to paper generation and review—illustrating its potential to significantly democratize and accelerate research.
I think pulling all these together was only a matter of time, since all the bits and pieces are already implemented elsewhere (e.g., chain-of-thought, self-reflection, web access as a tool, Aider, etc.). Nevertheless, this is, for better or worse, pointing toward a future that we will have to embrace. A bit more detail:
AI Scientist operates in three key phases: idea generation, experimental iteration, and manuscript preparation. It takes minimal input—a research direction and a basic codebase template—and autonomously generates new research ideas. Through LLM-driven "brainstorming," it refines these ideas using chain-of-thought and self-reflection methods. Once the idea is selected, the system leverages an LLM-based coding assistant (Aider) to modify existing codebases and execute experiments. After generating results, the system produces visualizations, composes the manuscript in LaTeX, and conducts peer reviews using an automated reviewing process modeled after NeurIPS standards. The AI Scientist was tested on machine learning subfields like diffusion modeling, transformer-based language modeling, and learning dynamics. It generated papers at a low cost of approximately $15 per paper, achieving near-human quality in evaluations. The framework addresses a gap in research automation, offering scalability and efficiency beyond hyperparameter tuning and architecture search.
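To make the loop concrete, here is a minimal sketch of the three-phase pipeline described above. The `llm()` call is a stub standing in for real model and Aider calls, and all the function names are my own invention for illustration, not the paper's actual API:

```python
def llm(prompt: str) -> str:
    """Stand-in for a real LLM call; returns a canned response."""
    return f"response to: {prompt[:30]}"

def generate_idea(direction: str, n_reflections: int = 2) -> str:
    """Phase 1: brainstorm an idea, then refine it via self-reflection."""
    idea = llm(f"Propose a research idea about {direction}")
    for _ in range(n_reflections):
        idea = llm(f"Critique and improve this idea: {idea}")
    return idea

def run_experiments(idea: str, codebase: str) -> dict:
    """Phase 2: an LLM coding assistant (Aider in the paper) edits the
    codebase and runs experiments; here we just return dummy results."""
    return {"idea": idea, "codebase": codebase, "metric": 0.0}

def write_and_review(results: dict) -> tuple[str, str]:
    """Phase 3: draft a LaTeX manuscript and score it with an automated
    reviewer modeled on conference review criteria."""
    paper = llm(f"Write a paper about {results['idea']}")
    review = llm(f"Review this paper: {paper}")
    return paper, review

def ai_scientist(direction: str, codebase: str) -> tuple[str, str]:
    idea = generate_idea(direction)
    results = run_experiments(idea, codebase)
    return write_and_review(results)

paper, review = ai_scientist("diffusion modeling", "template_repo/")
```

The point of the sketch is how little orchestration glue sits between the phases once the LLM primitives exist.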
Another interesting note that stands out in the paper: “The AI Scientist can generate hundreds of interesting, medium-quality papers over the course of a week.” And let’s face it, most of the papers humans generate are medium quality. I suppose the question to ask is: who will read all these papers? Maybe AI, too?
Here is a high-level summary of the paper and the codebase.
[II]: 🗒 Language Agents Achieve Superhuman Synthesis of Scientific Knowledge
This paper presents PaperQA2 from FutureHouse, an advanced language agent designed for scientific knowledge synthesis. It shows superhuman performance on literature search, summarization, and contradiction-detection tasks, surpassing human experts in precision. I covered the first version of the agent, PaperQA, in Issue 10 of the Pulse.
Some detail: The primary input to PaperQA2 includes natural language queries that the agent uses to search scientific papers. It employs RAG to collect relevant sections from papers. Key tools in PaperQA2 include a "Paper Search" for identifying papers, "Gather Evidence" for retrieving and summarizing key sections, and "Generate Answer" for synthesizing final responses. The model is built around a multi-step retrieval process, grounded by a dense-vector retrieval system, followed by a re-ranking and contextual summarization (RCS) step that prevents irrelevant information from contaminating the context. Outputs are Wikipedia-style articles and contradictions identified within scientific papers, validated by human experts. PaperQA2 is benchmarked using a novel dataset, LitQA2, which emphasizes questions whose answers are not found in paper abstracts but in their main bodies. Compared to other retrieval-based systems, PaperQA2’s multi-step approach allows it to modify searches iteratively, improving both precision and recall. This work fills gaps in traditional retrieval systems that often fail to handle such complex synthesis tasks in scientific research, making it highly relevant for detailed literature exploration.
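The paper-search → gather-evidence → generate-answer flow can be sketched roughly as below. Note the word-overlap scoring is a toy stand-in for PaperQA2's dense-vector retrieval, and the function names are illustrative, not the agent's actual tool interfaces:

```python
from collections import Counter
import math

def embed(text: str) -> Counter:
    """Toy bag-of-words 'embedding' (a real system uses dense vectors)."""
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[w] * b[w] for w in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def paper_search(query, chunks, k=3):
    """Step 1: retrieve the k chunks most similar to the query."""
    q = embed(query)
    return sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)[:k]

def gather_evidence(query, retrieved, threshold=0.1):
    """Step 2 (re-rank + contextual summarization): keep only chunks
    relevant enough, so off-topic text never contaminates the context."""
    q = embed(query)
    return [c for c in retrieved if cosine(q, embed(c)) >= threshold]

def generate_answer(query, evidence):
    """Step 3: synthesize a final response from the surviving evidence."""
    return f"Q: {query}\nEvidence used: {len(evidence)} chunk(s)"

chunks = [
    "CRISPR enables targeted genome editing in mammalian cells",
    "transformers use self-attention over token sequences",
    "genome editing efficiency depends on guide RNA design",
]
hits = paper_search("genome editing efficiency", chunks, k=2)
answer = generate_answer("genome editing efficiency",
                         gather_evidence("genome editing efficiency", hits))
```

The multi-step structure is what lets the real agent reformulate its search iteratively rather than committing to a single retrieval pass.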
[III]: 👨‍🔬 An LLM-based Knowledge Synthesis and Scientific Reasoning Framework for Biomedical Discovery
This paper presents BioLunar, an LLM-based framework for biomedical discovery that integrates heterogeneous data sources, specialized tools, and contextual reasoning to support complex scientific workflows, demonstrated through a use case in molecular-level evidence enrichment for biomarker discovery in oncology.
The BioLunar framework integrates LLMs with specialized databases and biomedical tools to support complex scientific reasoning across distributed evidence spaces. It takes gene sets and contextual information as inputs and employs a modular design with reusable data access and analysis components. The system queries multiple knowledge bases like CIVIC, Human Protein Atlas, and COSMIC through API connectors, and utilizes analytical tools such as gene enrichment analysis. LLM-based natural language inference components interpret results from each subworkflow in the given context. The outputs include statistical tables, evidence-based interpretations, and a comprehensive summary synthesizing findings across subworkflows. The low-code user interface allows researchers of all programming levels to construct LLM-enabled scientific workflows. By facilitating automatic scientific discovery and inference from heterogeneous evidence, BioLunar addresses the challenge of coherently analyzing diverse biomedical data sources to support expert-level knowledge synthesis, particularly in oncology research.
You can watch a demo here.
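For a rough feel of the modular design, here is a hedged sketch of the pattern: reusable connectors query knowledge bases, an LLM-style step interprets each subworkflow's evidence, and a final step synthesizes across them. The connector outputs below are invented placeholders, not real CIViC or COSMIC responses:

```python
def civic_connector(gene):
    """Stand-in for a CIViC API call (placeholder output)."""
    return f"{gene}: clinical evidence record"

def cosmic_connector(gene):
    """Stand-in for a COSMIC API call (placeholder output)."""
    return f"{gene}: somatic mutation record"

def interpret(evidence):
    """Stand-in for the LLM inference component that reads one
    subworkflow's evidence in context."""
    return f"interpretation of [{evidence}]"

def biomarker_workflow(genes, connectors):
    """Run each connector over each gene, interpret the results, and
    return a summary synthesizing across all subworkflows."""
    interpretations = []
    for connector in connectors:
        for gene in genes:
            interpretations.append(interpret(connector(gene)))
    return {"n_findings": len(interpretations),
            "summary": " | ".join(interpretations)}

report = biomarker_workflow(["BRAF", "KRAS"],
                            [civic_connector, cosmic_connector])
```

Swapping a connector in or out without touching the rest of the pipeline is the essence of the reusable-component design the paper describes.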
[AI/LLM Engineering] 🤖🖥⚙
[I]: 🔍 GraphRAG: The Marriage of Knowledge Graphs and RAG
“A famous poet once said "Natural language is most powerful when it can draw from a rich context." Ok fine, I said that. But that's true of both poetry, and of LLMs! Well, Knowledge Graphs excel at capturing context. How can combining Knowledge Graphs with RAG – an emerging technique known as GraphRAG – give context to your RAG application, and lead to more accurate and complete results, accelerated development, and explainable AI decisions? This talk will go deep on the why and how of GraphRAG, and where best to apply it. You’ll get concepts, examples, and specifics on how you can get started. You’ll walk away with an understanding of how GraphRAG can improve the context you pass to the LLM and the performance of your AI applications.”
Link to talk.
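For intuition, here is a toy illustration of the core GraphRAG move: retrieving an entity's graph neighborhood as context instead of isolated text chunks. The graph and triples below are invented purely for illustration:

```python
# A tiny knowledge graph as an adjacency dict of (relation, object) edges.
graph = {
    "aspirin": [("inhibits", "COX-1"), ("treats", "headache")],
    "COX-1": [("produces", "prostaglandins")],
}

def neighborhood(entity, graph, depth=2):
    """Collect (subject, relation, object) triples reachable from an
    entity within `depth` hops, to serve as retrieval context."""
    triples, frontier = [], [entity]
    for _ in range(depth):
        next_frontier = []
        for node in frontier:
            for relation, obj in graph.get(node, []):
                triples.append((node, relation, obj))
                next_frontier.append(obj)
        frontier = next_frontier
    return triples

context = neighborhood("aspirin", graph)
prompt = "Answer using these facts:\n" + "\n".join(
    f"{s} {r} {o}" for s, r, o in context)
```

A plain chunk retriever would likely miss the second-hop fact about prostaglandins; the graph traversal pulls it in automatically, which is the "richer context" the talk is about.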
[II]: 🔖 100M Token Context Windows
“There are currently two ways for AI models to learn things: training, and in-context during inference. Until now, training has dominated, because contexts are relatively short. But ultra-long context could change that.
Instead of relying on fuzzy memorization, our LTM (Long-Term Memory) models are trained to reason on up to 100M tokens of context given to them during inference.
While the commercial applications of these ultra-long context models are plenty, at Magic we are focused on the domain of software development…”
[III]: 👨‍💻 The LangGraph Ecosystem
In the last issue of this newsletter, I wrote about LangGraph Cloud and LangGraph Engineer.
I've recently taken the time to test them out, and they're quite interesting. However, as a postdoc, I don't particularly want to spend $40 on a hobbyist project. I was able to clone LangGraph Studio and LangGraph Engineer, which I hadn't paid enough attention to realize was open source. These two tutorials helped me get started, and they could be useful for anybody who wants to try it out.
Also, the folks at LangChain recently launched an Introduction to LangGraph course. I am planning to spend some time working through it; hopefully it will clear up the confusion I still have about the multiple different ways of doing a single thing in LangChain.
[AI X Industry + Products] 🤖🖥👨🏿‍💻
[I]: 🖱 Cursor AI: The AI Code Editor
From where I stand, software engineering and programming appear to be among the fields that will be most impacted by AI. I had heard about Cursor, an AI-native IDE, over the past year or so, but left it dormant on my to-do list. The IDE crossed my radar again recently, and it looks like they have a solid product now. This video gives a great high-level use case of Cursor. Software engineering has changed forever.
[II]: 📱 Top 100 GenAI Consumer Apps
“The magic of creative tools continues to draw in consumers. Fifty-two percent of the companies on the web list are focused on content generation or editing, across modalities — image, video, music, speech, and more. Of the 12 new entrants, 58% are in the creative tool space.
This included four of the top five first-time listmakers by rank: Luma (#14), Viggle (#21), SeaArt (#29), and Udio (#33). And the biggest leap in the past six months was music generator Suno, which rose from #36 to #5.”
“Perplexity is now #3 on web — the AI-powered search engine focuses on delivering concise, real-time, and accurate answers to queries, with cited sources. Perplexity slightly edges out ChatGPT in visit duration (at over seven minutes) according to Similarweb data, suggesting that users are deeply engaged. Perplexity also made the top 50 mobile list for the first time. Anthropic’s Claude, arguably even more of a direct ChatGPT competitor, entered the top five on web at #4, up from #10 in the prior ranking. The company recently launched Artifacts, which goes head-to-head with ChatGPT’s GPTs.”
Read the full article from a16z.
[III]: 👔 AI Friend Necklace.
“Do you ever find yourself speaking to Alexa or Siri just to have someone to talk to? Then you might be the right audience for an upcoming AI-powered necklace called friend (lowercase "f" is their style), an always-listening pendant. The $99 necklace can't turn your lights on and off or Google things for you like Alexa can. Its aim is to keep you company if you're lonely.
Friend also doesn't speak back to you via an AI voice, like Amazon's Alexa. Instead, you tap your pendant to start the conversation, and it responds via text…”
See the product’s trailer; it’s really something. These kinds of gadgets are primed to do crazy things, like identifying predictive markers of conflict between partners, for example. All you’d have to do is get an embedding of, say, each night’s conversation and watch how it drifts over time. Brave new world!
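That drift idea can be sketched in a few lines. Word counts stand in for a real learned sentence embedding here, purely for illustration; the example conversations are invented:

```python
from collections import Counter
import math

def embed(text: str) -> Counter:
    """Toy bag-of-words stand-in for a sentence embedding model."""
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[w] * b[w] for w in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def drift_series(nightly_conversations):
    """Cosine distance (1 - similarity) between consecutive nights;
    rising values suggest the tone of the conversations is shifting."""
    vecs = [embed(c) for c in nightly_conversations]
    return [1 - cosine(a, b) for a, b in zip(vecs, vecs[1:])]

nights = [
    "we planned the weekend trip together",
    "we planned the trip and argued about money",
    "argued about money again nothing resolved",
]
drift = drift_series(nights)
```

With a real embedding model, a sustained upward trend in this series is exactly the kind of signal an always-listening device could surface.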
[IV]: 🏫 Arizona State University X OpenAI
“Arizona State University (ASU) is one of the largest public universities in the United States, serving 181,000 students in a given year and offering over 800 degree options. For nine straight years, U.S. News and World Report has named ASU the most innovative university in America. Today, ASU is enhancing educational outcomes by integrating ChatGPT Edu into projects across teaching, research, and operations. …”
[V]: 👨‍🏫 Google AI Experiments: Illuminate, Learn About, NoteBookLM
I have been on the waitlist for a few of Google’s AI experiments for some time now, and I am finally able to try them:
Learn About: A chatbot that guides you to learn about anything you want
Illuminate: Convert scientific papers (for now optimized for computer science papers) into podcasts. They have also recently deployed the tech to NotebookLM, and it’s pretty good, I have to say: see the announcement on X.
[VI]: 🧑‍🔬 HasAnyOne.com: Topic Research Verification
A minimalist AI tool to search if anyone has ever researched a given topic. This tool is part of FutureHouse's suite of AI-powered tools aimed at accelerating scientific discovery and automating research processes.
[AI + Commentary] 📝🤖📰
[I]: 🧑‍🔬 What does it take to build an AI Scientist?
An insightful commentary by Sam Rodriques on the AI Scientist paper I discussed in the first entry of this newsletter. Rodriques essentially points out that building an AI scientist is not going to be as easy as declaring that you have just built one. The commentary focuses on the natural sciences. Some key points:
There are key challenges including navigating vast amounts of scientific data, designing experiments, and interpreting the results.
There is a need for robust evaluation methods.
A true AI scientist would need to be able to form its own hypotheses and design experiments, which would require major breakthroughs in scientific reasoning and engineering.
[II]: 👝 Answer.ai & AI Magic with Jeremy Howard
I particularly admire Jeremy Howard's anti-mimetic approach to thinking and his commitment to embodying it. In this insightful podcast, he discussed continuous pre-training, multi-phase pre-training, and structuring companies for long-term value creation instead of short-term profits (a stark contrast to many large research labs). He also touched on building a new BERT, dialogue engineering (a technique he uses to refine thought processes before generating code), and his hiring philosophy, which prioritizes individuals with unique backgrounds who bring fresh perspectives and have demonstrated resilience in overcoming challenges. Nice conversation with folks at latent space.
[III]: 👨 An Artificial Intelligence Conversation With Andrew Ng
A nice podcast here between the folks at ARK Invest and Andrew Ng about AI. A few takeaways, predictions, etc.:
Agentic systems are here and now: high level of confidence that agentic systems are already being developed, and the technical risk of these systems seems low (I cover these routinely in the papers I share and in the AI engineering section).
Open source is the way to go: Ng believes that open source is the right approach for AI, and the safety concerns regarding open source are likely overblown.
The bottleneck for AI deployment is evaluation: Even though AI models are becoming more advanced, evaluating their outputs remains a challenge. Owning distribution channels might be more important than owning the models themselves because it allows for real-time evaluation through user feedback.
There are many good AI ideas waiting to be deployed due to lack of resources: Many companies have ideas for AI applications that could be commercially viable, but they are limited by factors like GPU availability and software engineering resources. Adjacent to that point: many AI models and techniques are already powerful but haven't been implemented in the real world yet. Additionally, there are architectural improvements on the horizon that will lead to even more powerful AI.
Fast inference is becoming a major bottleneck: While significant computing power was previously required to train AI models, the ability to quickly use these models (inference) is becoming a bigger hurdle. This is critical for agentic workloads, where fast response times are necessary.
It will take time for AI to transform industries: While AI is making significant progress, it can take a long time to change industries because of the slow pace of business and cultural adoption.
[IV]: 📦 Distribution is all You Need.
The cost of developing software applications continues to decrease, as evidenced by the recent release of Replit's AI agent. This trend highlights three key factors for successful AI startups:
a) Low entry barriers: the ability to start with minimal resources.
b) Expertise in target workflows: a deep understanding of the processes you aim to improve, enabling tailored user experiences.
c) Effective distribution: the capacity to reach and acquire users efficiently.
As software development becomes more accessible, distribution will likely emerge as the most critical factor for success. In a landscape where creating decent software is no longer a significant hurdle, the ability to stand out and reach your target audience will become paramount.
[V]: 💻 Anthropic CEO Dario Amodei Talks Everything AI.
For some reason, I hadn’t heard (or seen) Dario Amodei talk on a podcast before this one with Erik Torenberg and Noah Smith. Perhaps I’ve just been missing him; in any case, I find him to be very brilliant. A brief highlight:
The discussion revolved around the economics of AI development, the comparative advantage of AI companies, and inequality in a world powered by AI.
Amodei argues that technological innovation can make business process innovation less important. AI could potentially substitute for human intelligence in many tasks, making business models less relevant. He cautions, however, that there is no guarantee that the AI scaling laws driving these innovations will continue indefinitely (I suppose that goes without saying).
Also, he disagrees with the idea that AI will render comparative advantage obsolete. He believes that even if AI becomes very powerful, some tasks will remain better suited for humans. Humans will adapt and find new ways to add value in the economy. In the same vein, he argues that AI is unlikely to replace humans in most jobs, but instead will complement human workers and create new business models. He compares the potential impact of AI to the impact of electricity on manufacturing. Electricity did not replace steam boilers entirely, but instead led to new ways of working that increased productivity.
The conversation also touched on other topics, including the importance of aligning the goals of AI with human values, and the AI arms race.
[VI]: 🎙 Podcasts on AI and GenAI
(Additional) podcast episodes I listened to over the past few weeks.