Simplifying RAG systems: An open-source blueprint for scalable and customizable development

26.2.2025
Piotr Kalota
Large Language Models (LLMs) are being widely adopted across industries to optimize business processes. Retrieval-augmented generation (RAG) systems, the focus of this blog post, enhance LLMs by incorporating internal knowledge that the models wouldn't otherwise have access to. In this article, we'll explore the advantages and applications of RAG systems—from e-commerce personalization to knowledge bases. We'll then introduce the RAG Blueprint, our open-source codebase that's customizable for different use cases, scalable for production environments, and secure for your data.
In this blog article, you’ll learn:
- What RAG systems are
- The possible applications of RAG systems
- What makes building a RAG system complex
- How to build a RAG system using our RAG Blueprint
- Links to further resources and documentation on GitHub
Here's a demo of a RAG system using a Bavarian beer data source.
Thinking of building your own RAG system?
We’re here to answer questions and help you get started with the RAG Blueprint.
What is a RAG system?
RAG systems enhance LLMs by giving them access to external data sources, enabling them to answer questions about private data, such as "What are the sales figures for 2024?" They supply the LLM with relevant internal information alongside the user query, so it can provide accurate, data-backed answers.
The high-level concept of a RAG system.
The result? Reliable answers based on real data, complete with references to the source material. This not only enhances accuracy but also reduces LLM hallucinations (when an AI model generates plausible but false or misleading answers). Moreover, RAG systems are near real-time solutions, adapting to dynamic data sources and document updates as they happen.
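To make this concrete, here's a minimal sketch of the query-time flow in Python. The `vector_store` and `llm` objects are hypothetical placeholders standing in for whatever retrieval and model components you plug in; this is not the Blueprint's actual API.

```python
# Minimal sketch of the retrieve -> augment -> generate flow.
# `vector_store` and `llm` are hypothetical placeholders, not the Blueprint's API.
def answer_question(question: str, vector_store, llm, top_k: int = 4) -> str:
    # 1. Retrieve: find the documents most relevant to the question.
    documents = vector_store.search(question, top_k=top_k)

    # 2. Augment: put the retrieved content into the prompt alongside the query.
    context = "\n\n".join(f"[{i + 1}] {doc}" for i, doc in enumerate(documents))
    prompt = (
        "Answer the question using only the context below, and cite your sources.\n\n"
        f"Context:\n{context}\n\nQuestion: {question}"
    )

    # 3. Generate: the LLM answers grounded in the retrieved documents.
    return llm.generate(prompt)
```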
What are the possible applications of RAG systems?
RAG systems have a wide range of applications. Here are just a few examples:
- E-commerce personalization: RAG enhances shopping experiences by using real-time data (browsing history, past purchases, preferences) to deliver tailored product recommendations, boosting conversion rates and customer engagement. [1]
- Knowledge bases: Say goodbye to manually digging through vast documentation. A RAG system can search your knowledge base, surface the most relevant documents, and summarize them for you. [2]
- Coding copilots: RAG-powered copilots can help developers explain, fix, improve, or extend codebases, accelerating the coding process. [3]
These use cases highlight common requirements: data security, accuracy, factual grounding, and relevance—areas where RAG systems outperform traditional generative AI solutions.
Why is building a RAG system complex?
While building a small-scale RAG system for a single use case is relatively straightforward, scaling it for multiple use cases or adapting it for production introduces significant challenges.
RAG system architecture.
The architecture above showcases the many components and processes involved. Each of these can vary based on the specific use case. For instance:
- Data sources: Different use cases require different document formats and datasets.
- Performance & cost trade-offs: Choosing between cloud-based and on-premise solutions affects both system performance and operating costs. The same applies to different vector store providers.
- Privacy concerns: You may prefer a locally hosted LLM to safeguard sensitive data.
All these components, together with embedding, retrieval, and augmentation processes, demand a flexible codebase that can easily adapt to diverse requirements.
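To illustrate how these processes fit together, here's a rough sketch of an ingestion pipeline that chunks documents, embeds each chunk, and writes it to a vector store. The chunking strategy and the `embedding_model`/`vector_store` objects are illustrative assumptions, not the Blueprint's actual interfaces.

```python
# Rough sketch of document ingestion: chunk, embed, store.
# `embedding_model` and `vector_store` are hypothetical placeholders.
def chunk(text: str, size: int = 500, overlap: int = 50) -> list[str]:
    """Split a document into overlapping character chunks."""
    step = size - overlap
    return [text[i:i + size] for i in range(0, len(text), step)]

def ingest(documents: list[str], embedding_model, vector_store) -> None:
    for doc in documents:
        for piece in chunk(doc):
            vector = embedding_model.embed(piece)           # text -> embedding vector
            vector_store.add(text=piece, embedding=vector)  # index it for retrieval
```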
How to build a RAG system: Our RAG Blueprint
To tame this complexity, we created an open-source RAG codebase, available on GitHub. It enables developers to easily configure the system to fit specific use cases and extend it with new components. Our goal is to help you adapt and scale your RAG system while keeping your data secure.
Flexibility
Our modular architecture allows developers to:
- Integrate various data sources with ease.
- Swap out vector stores, embedding models, and LLMs to fine-tune performance and cost.
- Customize processes to align with specific project goals.
This flexibility ensures that your RAG system is optimized for both functionality and efficiency.
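As a sketch of what this modularity can look like, the snippet below defines small interfaces that any provider (hosted or on-premise) can satisfy, so swapping a vector store, embedding model, or LLM doesn't ripple through the rest of the pipeline. The protocol names are illustrative, not the Blueprint's actual abstractions.

```python
# Illustrative sketch of swappable components behind small interfaces.
# These protocols are examples, not the Blueprint's actual abstractions.
from typing import Protocol

class EmbeddingModel(Protocol):
    def embed(self, text: str) -> list[float]: ...

class VectorStore(Protocol):
    def add(self, text: str, embedding: list[float]) -> None: ...
    def search(self, query: str, top_k: int) -> list[str]: ...

class LLM(Protocol):
    def generate(self, prompt: str) -> str: ...

# Any concrete provider that satisfies the protocol can be dropped in,
# so changing vector stores or models becomes a configuration choice.
```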
User interface
The blueprint comes with an out-of-the-box chat interface for interacting with the system. However, developers can easily extend it to integrate with other end-user interfaces.
Monitoring
A key aspect of any production-scale system is monitoring. Our blueprint includes monitoring tools that provide metrics on system performance and user interactions.
Langfuse as a monitoring system for RAG.
We’ve integrated Langfuse to provide detailed tracing of every step involved in generating an answer. It also records usage metrics and costs to help assess business value, and for developers it serves as a troubleshooting tool, making issues easier to identify and fix.
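As a rough illustration, the snippet below wraps a generation step with Langfuse's `@observe` decorator so each call is recorded as a trace. The import path differs between SDK versions, and the stubbed LLM call is a placeholder rather than the Blueprint's code.

```python
# Minimal sketch of tracing an answer-generation step with Langfuse.
# Import path varies by SDK version; `stub_llm_call` is a placeholder.
from langfuse.decorators import observe

def stub_llm_call(prompt: str) -> str:
    return "stubbed answer"  # stand-in for a real LLM client call

@observe()  # each call becomes a trace with inputs, outputs, and latency
def generate_answer(question: str, context: str) -> str:
    prompt = f"Context:\n{context}\n\nQuestion: {question}"
    return stub_llm_call(prompt)
```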
Evaluation
Every robust system needs consistent evaluation. Our blueprint supports human-in-the-loop evaluations, where users can rate responses with a simple thumbs-up or thumbs-down. This feedback feeds into evaluation datasets that help track performance trends.
Additionally, you can manually upload evaluation datasets if pre-existing data is available. We employ various evaluation methods, from simple statistical techniques to advanced approaches involving embeddings and language models, ensuring a comprehensive assessment of system performance.
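As a simplified sketch of the embedding-based end of that spectrum, the snippet below scores an answer against a reference by cosine similarity and shows the kind of feedback record that thumbs-up/down ratings could feed into. The data shapes and the `embed` callable are illustrative, not the Blueprint's evaluation pipeline.

```python
# Simplified sketch: a feedback record plus an embedding-based similarity score.
# The data shapes and the `embed` callable are illustrative assumptions.
import math
from dataclasses import dataclass
from typing import Callable

@dataclass
class Feedback:
    question: str
    answer: str
    thumbs_up: bool  # user rating that flows into the evaluation dataset

def cosine_similarity(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def semantic_score(answer: str, reference: str, embed: Callable[[str], list[float]]) -> float:
    """Closer to 1.0 means the answer is semantically close to the reference."""
    return cosine_similarity(embed(answer), embed(reference))
```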
Summary
In this article, we explored what RAG systems are, their wide range of applications, and the complexities involved in building them. We also introduced our RAG Blueprint, a complete and scalable open-source solution designed to simplify RAG development while ensuring data security.
Dive deeper into the technical details in our GitHub repository and documentation. Or take a look at our data science & AI services to see how we can support you with your use cases.
Got questions or feedback? Reach out—we’d love to hear from you!
References
[1] Alvarez, E. (2024, April 2). Transforming Retail with RAG: The Future of Personalized Shopping. Medium.
[2] Jay, S. (2023, September 12). Leveraging Retrieval-Augmented Generation and Embeddings for Advanced Document Search. Medium.
[3] GitHub. (2025, January 30). Enhancing software development with retrieval-augmented generation. GitHub Resources.