Why Your RAG System Is Broken, and How to Fix It
Ex-Meta Data Scientist: Jason Liu
Credit and Thanks:
Based on insights from The TWIML AI Podcast with Sam Charrington.
Key Learnings
Understanding customer needs is crucial for effective AI implementation.
Prioritize building robust datasets tailored to user needs to enhance AI model performance.
Implement fast evaluation loops to facilitate rapid testing and iteration of AI systems.
Focus on retrieval mechanisms rather than solely tuning generation to improve model accuracy.
Encourage a culture of experimentation within teams to foster innovation and problem-solving.
Leverage multimodal approaches to create richer user experiences and differentiate offerings.
Fine-tuning models for specific tasks can lead to significant performance improvements.
Today’s Podcast Host: Sam Charrington
Title
Why Your RAG System Is Broken, and How to Fix It
Guests
Jason Liu
Guest Credentials
Jason Liu is an independent consultant specializing in recommendation systems and AI applications, with previous experience as a Staff Machine Learning Engineer at Stitch Fix and working on safety at Meta. He is the creator of Instructor and Flight, as well as an ML and data science educator.
Podcast Duration
57:33
Read Time
Approx. 5 mins
Deep Dive
One of the primary challenges in generative AI is ensuring that models can perform complex reasoning. Jason emphasizes that many companies mistakenly focus on tuning the generation side of their models without addressing the underlying retrieval mechanisms. He advises founders to diagnose their systems by first examining the quality of the retrieved data rather than solely adjusting prompts. This approach encourages a culture of critical thinking and problem-solving within teams, where founders empower their engineers to experiment and trust their instincts when developing solutions. Jason recalls cases where companies were losing customers because of ineffective retrieval, underscoring the need for a robust evaluation loop that prioritizes precision and recall over mere generation quality.
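The evaluation loop Jason describes can be made concrete with a simple recall@k metric over a gold set of query-to-chunk mappings. This is a minimal sketch, not Jason's exact tooling; the `retrieve` callable and the gold-set shape are assumptions for illustration.

```python
def recall_at_k(retrieved_ids, relevant_ids, k=5):
    """Fraction of the relevant chunks that appear in the top-k results."""
    hits = set(retrieved_ids[:k]) & set(relevant_ids)
    return len(hits) / len(relevant_ids)

def evaluate_retriever(retrieve, gold_pairs, k=5):
    """Average recall@k over (query, relevant_chunk_ids) pairs.
    `retrieve` is any function mapping a query to a ranked list of chunk ids."""
    scores = [recall_at_k(retrieve(query), relevant, k)
              for query, relevant in gold_pairs]
    return sum(scores) / len(scores)
```

A metric like this turns "the answers feel wrong" into a number the team can move, which is the precondition for the fast iteration Jason advocates.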
Building a comprehensive dataset is another crucial aspect discussed. Jason notes that many engineers struggle with data literacy, which can hinder their ability to create effective datasets. Founders should prioritize training their teams on what constitutes a good dataset and encourage them to conduct small experiments to validate their hypotheses. For example, generating synthetic questions from text chunks can help engineers understand the relationship between queries and the data they are working with. This iterative process not only enhances the dataset but also fosters a culture of experimentation, which is vital for innovation.
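The synthetic-question technique mentioned above can be sketched as follows. The `ask_llm` callable is a hypothetical stand-in for whatever model API a team uses; the point is the data shape: each generated question is paired with the chunk that produced it, giving free retrieval ground truth.

```python
def build_retrieval_eval_set(chunks, ask_llm):
    """For each chunk, ask an LLM to write a question that the chunk answers.
    The resulting (question, source_chunk_id) pairs become ground truth:
    a good retriever should surface the source chunk for its own question."""
    pairs = []
    for chunk_id, text in chunks.items():
        prompt = f"Write one question answered by this passage:\n{text}"
        pairs.append((ask_llm(prompt), chunk_id))
    return pairs
```

Because the source chunk is known by construction, these pairs plug directly into a recall-style evaluation without any manual labeling.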
The conversation also delves into the importance of a decision matrix for embedding implementation. Jason suggests that rather than relying on off-the-shelf embedding models, founders should consider building their own systems tailored to their specific use cases. This requires a deep understanding of the data and the context in which it will be used. By running multiple experiments to test different embedding strategies, founders can identify the most effective approach for their applications. This hands-on experimentation can lead to significant performance improvements and a competitive edge in the market.
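One way to run the embedding experiments Jason recommends is to score each candidate strategy against the same synthetic eval set and compare. This sketch uses a toy bag-of-words embedder purely as a placeholder; in practice each entry in `strategies` would wrap a real embedding model.

```python
import math
from collections import Counter

def bow_embed(text):
    """Toy bag-of-words 'embedding' (word-count vector) for illustration."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def compare_embedding_strategies(strategies, corpus, eval_set, k=3):
    """Score each named embedder by recall@k on the same eval set."""
    results = {}
    for name, embed in strategies.items():
        index = {cid: embed(text) for cid, text in corpus.items()}
        hits = 0
        for question, source_id in eval_set:
            q_vec = embed(question)
            ranked = sorted(index, key=lambda c: cosine(q_vec, index[c]),
                            reverse=True)
            hits += source_id in ranked[:k]
        results[name] = hits / len(eval_set)
    return results
```

Holding the corpus and eval set fixed while swapping only the embedder is what makes the comparison a controlled experiment rather than a hunch.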
Evaluation tooling and metrics are essential for assessing the effectiveness of AI systems. Jason advocates for the development of fast, efficient evaluation processes that allow for rapid testing and iteration. Founders should implement simple metrics, such as compression rates in summarization tasks, to gauge the performance of their models. By monitoring these metrics regularly, teams can quickly identify areas for improvement and make data-driven decisions. For instance, Jason shares how tracking the average length of summaries relative to input lengths can reveal insights into model behavior, enabling teams to adjust their prompts and improve outcomes.
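The compression-rate metric Jason describes is cheap to compute and monitor. A minimal sketch, using word counts as a stand-in for token counts and with the acceptable range (`low`, `high`) as assumed thresholds a team would tune:

```python
def compression_rate(source, summary):
    """Summary length as a fraction of the source length (word counts)."""
    return len(summary.split()) / len(source.split())

def monitor_summaries(pairs, low=0.05, high=0.5):
    """Return the average compression rate over (source, summary) pairs,
    plus the indices of outliers falling outside the expected range."""
    rates = [compression_rate(src, summ) for src, summ in pairs]
    average = sum(rates) / len(rates)
    outliers = [i for i, r in enumerate(rates) if not (low <= r <= high)]
    return average, outliers
```

Tracking this average over time is exactly the kind of lightweight signal that reveals when a prompt change has quietly made summaries too long or too terse.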
Fine-tuning in RAG is another critical area where founders can gain an advantage. Jason explains that fine-tuning should focus on specific tasks, such as ranking or metadata filtering, rather than attempting to fine-tune large language models for general tasks. By leveraging transfer learning and fine-tuning rankers, founders can achieve significant performance gains with relatively low data requirements. This targeted approach allows startups to optimize their systems without the extensive resources typically associated with large-scale model training.
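The ranking approach above usually takes the shape of two-stage retrieval: a cheap retriever fetches a broad candidate pool, then a fine-tuned reranker orders it. A minimal sketch, where the `score` function is a hypothetical stand-in for a trained cross-encoder or similar model:

```python
def rerank(query, candidates, score, top_n=5):
    """Two-stage retrieval, stage two: a cheap retriever supplies
    `candidates`, and the fine-tuned reranker (`score`) orders them
    so only the best `top_n` reach the generation prompt."""
    ranked = sorted(candidates, key=lambda doc: score(query, doc),
                    reverse=True)
    return ranked[:top_n]
```

Because only the small `score` model needs fine-tuning, this pattern captures much of the performance gain without retraining a large language model.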
The discussion also highlights the significance of long context lengths in generative AI. Jason notes that as context lengths increase, the complexity of instructions and interactions can also grow. Founders should be mindful of the trade-offs between context length and latency, as even minor delays can impact user experience and revenue. By strategically managing context and ensuring that relevant information is prioritized, startups can enhance the effectiveness of their AI systems.
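Managing the context/latency trade-off often comes down to a token budget: include the most relevant chunks first and stop at the limit. A minimal sketch, with a whitespace word count standing in for a real tokenizer:

```python
def pack_context(ranked_chunks, token_budget,
                 count_tokens=lambda t: len(t.split())):
    """Fill the context window with the most relevant chunks first,
    skipping any chunk that would overflow the budget."""
    picked, used = [], 0
    for chunk in ranked_chunks:
        n = count_tokens(chunk)
        if used + n <= token_budget:
            picked.append(chunk)
            used += n
    return picked
```

Capping the prompt this way bounds latency per request while still guaranteeing the highest-ranked information makes it in.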
Optimizations play a crucial role in improving AI performance. Jason emphasizes the need for founders to focus on user experience and product-facing aspects, such as reducing perceived latency through effective UI design. By integrating feedback mechanisms into their systems, startups can gather valuable data that informs future iterations and optimizations. For example, Jason recounts how a simple change in wording on a feedback prompt led to a fivefold increase in user responses, demonstrating the power of thoughtful UX design in driving engagement and improvement.
The conversation also touches on the potential of multimodal approaches in AI. Jason expresses excitement about the advancements in visual language models that can enhance search capabilities and provide richer interactions. Founders should explore these technologies to differentiate their offerings and create more engaging user experiences. By leveraging multimodal capabilities, startups can tap into new markets and applications, ultimately driving growth and innovation.
Agentic programs, which involve breaking down complex tasks into manageable steps, are another area of focus. Jason suggests that founders should consider how to structure their AI systems to allow for iterative improvements and refinements. By segmenting problem spaces and developing specific indices for different types of queries, startups can enhance the efficiency and effectiveness of their AI applications. This structured approach not only simplifies the development process but also enables teams to respond more effectively to user needs.
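The segmentation idea above can start as simple query routing: dispatch each query to a purpose-built index before retrieval even begins. A minimal sketch with keyword predicates and invented index names; a production system might use a classifier instead.

```python
def route_query(query, routes, default_index="general"):
    """Send each query to a purpose-built index via simple predicates.
    `routes` is an ordered list of (predicate, index_name) pairs."""
    for predicate, index_name in routes:
        if predicate(query):
            return index_name
    return default_index

# Hypothetical routes for illustration only.
routes = [
    (lambda q: "invoice" in q.lower(), "billing_docs"),
    (lambda q: "error" in q.lower(), "troubleshooting_docs"),
]
```

Each specialized index can then be tuned and evaluated independently, which is what makes the iterative refinement Jason describes tractable.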
Finally, Jason discusses the growing field of AI consulting, emphasizing the importance of equipping teams with the knowledge and skills necessary to leverage AI effectively. Founders should invest in training and resources that empower their teams to navigate the complexities of AI implementation. By fostering a culture of continuous learning and adaptation, startups can position themselves for success in an increasingly competitive landscape.
Actionable Insights
Train your team on data literacy to ensure they understand what constitutes a good dataset.
Establish simple metrics for evaluating model performance, such as compression rates in summarization tasks.
Conduct regular experiments to test different embedding strategies and optimize retrieval systems.
Integrate user feedback mechanisms into your AI applications to gather valuable insights for improvement.
Explore the use of agentic programs to break down complex tasks into manageable steps for better efficiency.
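The feedback-mechanism insight above needs very little infrastructure to act on. A minimal sketch of capturing feedback events as JSON lines; the field names and file path are assumptions, not a prescribed schema:

```python
import json
import time

def record_feedback(query, answer, signal, path="feedback.jsonl"):
    """Append one user-feedback event (e.g. 'helpful' / 'not_helpful')
    as a JSON line, building a dataset for later analysis or fine-tuning."""
    event = {"ts": time.time(), "query": query,
             "answer": answer, "signal": signal}
    with open(path, "a") as f:
        f.write(json.dumps(event) + "\n")
    return event
```

Even this crude log pairs queries with outcomes, which is the raw material for the data-driven iteration the insights above call for.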
Key Quote
"Founders can draw from this by encouraging their teams to take ownership and demonstrate value before being formally assigned roles, creating a culture of proactivity and results."
Future Trends & Predictions
As the landscape of generative AI continues to evolve, startups will increasingly focus on integrating multimodal capabilities into their systems, allowing for richer interactions and more nuanced outputs. The demand for effective evaluation metrics and rapid iteration processes will become paramount, as founders seek to differentiate their offerings in a competitive market. Additionally, the rise of AI consulting will provide startups with the necessary expertise to navigate the complexities of AI implementation, ultimately leading to more innovative and effective solutions in the industry.
Check out the podcast here:
Latest in AI
1. Deutsche Telekom unveiled an AI Phone powered by Perplexity's chatbot at MWC 2025, designed to be primarily controlled by voice for tasks like booking flights or making restaurant reservations. The device features multimodal input, deep integration of Perplexity AI assistant, and a comprehensive AI ecosystem including Magenta AI, Google Cloud AI, and ElevenLabs.
2. Opera has introduced Browser Operator, a native AI agent capable of performing browsing tasks for users, marking a shift towards agentic browsing. This feature allows users to instruct the browser in natural language to complete online tasks, such as shopping or travel planning, while maintaining user privacy and control.
3. Microsoft launched Dragon Copilot, an AI assistant for healthcare professionals that combines voice dictation, ambient listening, and generative AI capabilities. This all-in-one tool enables clinicians to dictate medical notes, search for information, and automate tasks like orders and referral letters, aiming to reduce administrative burden and combat burnout.
Startup World
1. Anthropic has raised $3.5 billion in a Series E funding round, valuing the company at $61.5 billion. The investment will be used to advance the development of next-generation AI systems, expand compute capacity, deepen research in interpretability and alignment, and accelerate international expansion.
2. SoftBank Group is reportedly in advanced talks to secure $16 billion in funding for AI investments, as part of its strategy to heavily invest in AI technologies. This funding round is expected to close in the near future, positioning SoftBank to lead in the AI revolution.
Analogy
Building an AI system without focusing on retrieval is like assembling a library with beautifully bound books but no catalog system. No matter how well-written the books are, if readers can't find the right one, the library fails its purpose. Jason argues that many founders obsess over improving how their AI generates responses while neglecting how it retrieves information. Instead, they should prioritize fine-tuning retrieval mechanisms, ensuring engineers experiment with data and embedding strategies. Just as a well-organized library serves its visitors efficiently, a well-structured AI system enhances user experience by delivering precise, relevant information—leading to real impact.
Thanks for reading, have a lovely day!
Jiten-One Cerebral
All summaries are based on publicly available content from podcasts. One Cerebral provides complementary insights and encourages readers to support the original creators by engaging directly with their work; by listening, liking, commenting or subscribing.