Dev - The Codespace Blog

AI

Oct 27, 2024

2 min read

Retrieval-Augmented Generation (RAG)

Retrieval-Augmented Generation (RAG) is an architectural approach that improves the accuracy and reliability of LLM applications by grounding their outputs in external data....

Dev

Oct 20, 2024

5 min read

RAG: Deployment and Operations in Production

The final step is deploying the RAG system so that end-users or applications can consume it, and setting up an operational workflow to maintain it. We’ll discuss various deployment architectures and considerations on AWS, using NVIDIA and Databricks components....

Dev

Oct 19, 2024

6 min read

RAG: Performance Optimization with NVIDIA TensorRT and Quantization

Large Language Models and semantic search can be resource-intensive. To deploy a responsive RAG system at scale, we need to optimize both the retrieval and generation components for latency and efficiency....

Dev

Oct 13, 2024

4 min read

RAG: Post-Processing & Evaluation

Once the LLM generates an answer based on retrieved documents, there are a few additional steps that can enhance the quality and reliability of the RAG system: Answer Post-Processing, Evaluation and Continuous Monitoring....

Dev

Oct 05, 2024

4 min read

RAG: LLM Integration and Generation

Next comes the generative part: using a Large Language Model to produce the final answer, augmented with the retrieved context. To do that, we need to integrate an LLM into our pipeline....

Dev

Sep 28, 2024

4 min read

RAG: Query Handling and Document Retrieval Workflow

When a user query comes into a RAG system, the system must retrieve relevant documents and prepare them to feed into the LLM....

Dev

A collection of 13 posts