Making AI Work with Your Newsroom's Data

A Practical Guide to RAG, Agents, and Fine-Tuning

Workshop · 75 min

What Is an LLM?

  • Trained on massive text — books, web, code, news articles
  • Learns to predict the next token
  • Builds a compressed model of language, facts, and reasoning patterns
  • Only "knows" what was in its training data
Billions of text documents → Training → LLM — knowledge frozen at the training cutoff date

GPT-4, Claude, Llama, Mistral — same core idea, different training data and techniques.

In-Context Learning

LLMs can reason over text you put directly in the prompt — no retraining needed.

System instructions
Your documents / context
User question

This is the foundation — every approach we'll discuss is about getting the right data into that green zone.
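In code, in-context learning is just prompt assembly. A minimal sketch of the three zones above (function and document contents are illustrative, not any specific framework's API):

```python
# Sketch: assembling an in-context-learning prompt from the three zones.
# All names and example documents here are illustrative.

def build_prompt(system: str, documents: list[str], question: str) -> str:
    """Combine system instructions, supplied context, and the user question."""
    context = "\n\n".join(f"[Document {i + 1}]\n{doc}" for i, doc in enumerate(documents))
    return f"{system}\n\n# Context\n{context}\n\n# Question\n{question}"

prompt = build_prompt(
    system="Answer only from the provided documents. Cite document numbers.",
    documents=[
        "Luxembourg lifted mask mandates in 2022.",
        "Spain extended mask mandates into 2023.",
    ],
    question="When did Luxembourg lift mask mandates?",
)
print(prompt)
```

Everything that follows in this workshop is about filling the `documents` argument well.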

The Gap

  • Your archives, sources, internal docs — not in the training set
  • LLMs hallucinate when they don't know — confidently wrong
  • Recent events may be past the knowledge cutoff
  • You can paste docs into the prompt, but that doesn't scale to thousands of articles
55,895 press releases × ~500 words each ≈ 40M tokens
Won't fit in any context window

How do we bridge the gap?

Three Ways to Bridge the Gap

1. RAG

Search your data, inject results into the prompt. The model answers based on what you found.

Automated copy-paste

2. Agents

Give the model tools to search and reason on its own. It decides what to look for.

A researcher with access

3. Fine-Tuning

Retrain the model on your data so it internalizes your domain knowledge and style.

Training a new journalist

Each has different strengths, costs, and failure modes. Let's look at each one.

RAG — Retrieval-Augmented Generation

  • User asks a question
  • System searches your data for relevant documents
  • Top results are injected into the prompt as context
  • LLM generates an answer grounded in those documents

Think: automated copy-paste. The system finds the right documents so you don't have to.

Instructions
Retrieved doc 1
Retrieved doc 2
Retrieved doc 3
User question

What the LLM actually sees

RAG: How It Works

User Question → Data System (search, top K results) → Question + Documents → LLM → Grounded Answer

Key characteristics

  • One search, one LLM call
  • Simple, fast, predictable
  • The quality of the answer depends entirely on what the search returns
  • No reasoning about what to search for — the user query goes straight to the data system
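The whole pipeline can be sketched in a few lines. Here the retriever and the LLM are stubs standing in for a real search engine and model API, so the one-search-one-call shape is visible:

```python
# Minimal RAG flow: one search, one LLM call.
# search() and call_llm() are stubs; a real system would query Typesense
# (or similar) and call a model API (e.g., via OpenRouter).

def search(query: str, top_k: int = 3) -> list[str]:
    """Stub retriever over a tiny in-memory corpus."""
    corpus = [
        "Germany announced new climate targets in March 2023.",
        "Italy's budget press release covers 2024 spending.",
        "Spain published climate policy updates in 2023.",
    ]
    words = query.lower().split()
    return [doc for doc in corpus if any(w in doc.lower() for w in words)][:top_k]

def call_llm(prompt: str) -> str:
    """Stub LLM call; reports how many documents it was grounded in."""
    return f"(answer grounded in {prompt.count('[Doc')} retrieved documents)"

def rag_answer(question: str) -> str:
    docs = search(question)                                              # 1. search your data
    context = "\n".join(f"[Doc {i + 1}] {d}" for i, d in enumerate(docs))  # 2. inject results
    prompt = f"Answer from these documents only:\n{context}\n\nQ: {question}"
    return call_llm(prompt)                                              # 3. one grounded LLM call

print(rag_answer("climate policy"))  # → (answer grounded in 2 retrieved documents)
```

Note that the user query goes straight into `search()` unchanged: if the retrieval misses, the answer misses.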

RAG: Weaknesses

Great for direct factual questions. Struggles with complex, multi-faceted research.

Agentic Systems

  • The LLM doesn't just answer — it drives the research
  • Given tools (search, APIs, databases) it decides what to look for
  • It can refine queries, follow leads, cross-reference results
  • Multiple search → reason → search cycles

Less like copy-paste, more like giving a researcher access to your archive.

Agent thinking:
"Let me search for covid policy in Spain..."
→ 2,438 results
"Now let me compare with Luxembourg..."
→ 1,176 results
"Interesting. Let me check the timeline..."
→ filtered by year
"Now I can synthesize an answer."

Agents: How They Work

User Question → Agent (LLM) plans research → tool call → Data System → results → Agent reasons, refines → Data System → … (repeat as needed) → Synthesized Answer

Key characteristics

  • Multiple loops — the agent controls the research process
  • Can use different tools: search, filter, count, compare
  • Each step informed by previous results
  • Produces deeper analysis but takes longer and costs more

Tools: Claude Code, custom agents with tool use, LangChain, CrewAI
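The control flow behind all of these tools is the same loop: the model picks the next tool call, the loop executes it and feeds the result back, until the model decides to answer. A sketch with a scripted stub in place of the model, so the loop itself is visible:

```python
# Sketch of an agent loop. fake_llm() is a scripted stand-in for a real
# model API call; the plan it follows mirrors the example on the slide.

def fake_llm(history: list[str]) -> dict:
    """Scripted plan: search Spain, then Luxembourg, then synthesize."""
    steps = [
        {"action": "search", "query": "covid policy Spain"},
        {"action": "search", "query": "covid policy Luxembourg"},
        {"action": "answer", "text": "Synthesis based on both searches."},
    ]
    return steps[len(history)]

def search_tool(query: str) -> str:
    """Stand-in for a real search tool hitting the data system."""
    return f"{len(query) * 100} results for '{query}'"

history: list[str] = []
while True:
    step = fake_llm(history)          # agent reasons over all prior results
    if step["action"] == "answer":
        answer = step["text"]
        break
    history.append(search_tool(step["query"]))  # tool call; result fed back

print(answer)  # multiple search → reason → search cycles, then synthesis
```

Each loop iteration is a fresh LLM call over the growing history, which is why agents cost more and take longer than single-shot RAG.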

Agents: Weaknesses

Powerful for complex investigations. Overkill (and expensive) for simple lookups.

Fine-Tuning with LoRA

  • Instead of showing data at runtime, bake it into the model
  • LoRA — Low-Rank Adaptation: train a small adapter on top of a base model
  • Much cheaper than full fine-tuning — updates a fraction of parameters
  • Shines at simpler, focused tasks: classification, sentiment analysis, tagging, summarization in a house style
  • Can run locally — no API costs at inference time

Model parameters: base model (frozen) + LoRA adapter — typically 0.1–5% of total parameters
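The low-rank idea fits in a few lines of NumPy: freeze the base weight matrix W and learn only a small update B·A of rank r. The sizes below are toy numbers chosen to make the parameter savings visible:

```python
import numpy as np

# LoRA in miniature: W stays frozen; only the low-rank factors A and B train.
d, r = 1024, 8                            # hidden size, adapter rank
rng = np.random.default_rng(0)

W = rng.standard_normal((d, d))           # base weights: frozen, never updated
A = rng.standard_normal((r, d)) * 0.01    # trainable (r × d)
B = np.zeros((d, r))                      # trainable (d × r); zero init, so
                                          # W_adapted starts out equal to W
W_adapted = W + B @ A                     # effective weights at inference

base_params = W.size                      # 1,048,576
lora_params = A.size + B.size             # 16,384
print(f"trainable fraction: {lora_params / base_params:.2%}")  # → 1.56%
```

Training updates only `A` and `B` (16K parameters here) while the million-parameter `W` stays untouched, which is where the cost savings come from.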

Great for newsroom tasks like:

Topic classification · Sentiment analysis · Auto-tagging · Headline generation · Language detection

Fine-Tuning: How It Works

Training phase (offline): Data System → export examples → curate Q&A pairs → Base Model + LoRA training → Fine-Tuned Model
Inference (runtime): Question → Model → Answer
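The "curate Q&A pairs" step usually means exporting examples as JSONL, one chat-style record per line. A sketch (the tagging task, field contents, and filename are illustrative):

```python
import json

# Sketch: writing curated training examples in the JSONL chat format most
# fine-tuning tooling expects. Examples and labels here are made up.

examples = [
    {"messages": [
        {"role": "user", "content": "Tag this press release: 'Spain announces new climate targets...'"},
        {"role": "assistant", "content": "topics: climate, energy; country: ES"},
    ]},
    {"messages": [
        {"role": "user", "content": "Tag this press release: 'Luxembourg budget for 2024...'"},
        {"role": "assistant", "content": "topics: budget, finance; country: LU"},
    ]},
]

with open("train.jsonl", "w", encoding="utf-8") as f:
    for ex in examples:
        f.write(json.dumps(ex, ensure_ascii=False) + "\n")

print(f"wrote {len(examples)} training examples")
```

Curation quality matters more than volume: the model will reproduce whatever patterns (and mistakes) these pairs contain.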

Key characteristics

  • Data system feeds training, not inference
  • Knowledge is in the weights — no search at runtime
  • Fast inference, no retrieval latency
  • But knowledge is frozen until you retrain

Fine-Tuning: Weaknesses

Best for style/format adaptation. Not ideal when you need current facts with sources.

The Common Foundation

RAG: Query → Data System → LLM → Answer
Agents: Agent ⇄ Data System (repeated) → Answer
Fine-Tuning: Data System → Export → Train → Model

Every approach starts with the same green box

Before you choose an AI approach, build a searchable, structured, API-accessible data system.
Get this right and you can swap AI approaches as needs evolve.

What Goes in That Box?

Typesense
Fast, typo-tolerant, multilingual. Easy to self-host. Built-in vector search. Great developer experience.
Elasticsearch / OpenSearch
Industry standard. Extremely powerful and flexible. Complex to operate. Supports vectors via kNN.
Meilisearch
Developer-friendly, fast for small-to-medium datasets. Growing vector and AI search features.
PostgreSQL + pgvector
Add vector search to your existing database. Zero new infrastructure. Good enough for many use cases.
Apache Solr
Mature, battle-tested. Common in large newsrooms and archives. Dense vector support via plugins.
Dedicated Vector DBs
Pinecone, Weaviate, Qdrant, Chroma. Built for embeddings. Less useful for traditional keyword search.

The right choice depends on your data size, team skills, and existing infrastructure.

Inside the Green Box

Two ways to search

Keyword search

Match words directly. Fast, exact, interpretable. Struggles with synonyms and meaning.

"climate policy" → finds "climate policy"
misses "environmental regulation"

Vector search

Compare meaning, not words. Finds semantically similar content across languages.

"climate policy" → also finds
"Klimaschutzpolitik", "política climática"

What are embeddings?

  • An embedding is a list of numbers that represents the meaning of a text
  • An embedding model reads your document and outputs a vector (e.g., 384 numbers)
  • Texts with similar meaning end up close together in vector space
  • At query time: embed the question, find the nearest document vectors
"climate policy" → Embedding Model → [0.12, -0.34, 0.78, …]
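Nearest-vector lookup is just cosine similarity. A toy illustration with tiny hand-made vectors standing in for the 384-dimensional output of a real embedding model such as e5-small:

```python
import math

# Toy vector search: these 3-dimensional "embeddings" are invented to show
# the geometry. A real model would produce them from the text itself.

embeddings = {
    "climate policy":         [0.9, 0.1, 0.0],
    "Klimaschutzpolitik":     [0.8, 0.2, 0.1],  # German synonym: close in space
    "football transfer news": [0.0, 0.1, 0.9],  # unrelated topic: far away
}

def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

query = embeddings["climate policy"]
ranked = sorted(embeddings, key=lambda t: cosine(query, embeddings[t]), reverse=True)
print(ranked)  # the German synonym ranks above the unrelated text
```

No word overlaps between "climate policy" and "Klimaschutzpolitik", yet vector search finds it: that is what keyword search cannot do.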

Best systems combine both: hybrid search — keyword precision + semantic recall.
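One common way to combine the two result lists is reciprocal rank fusion (RRF); a minimal sketch with made-up document IDs:

```python
# Hybrid search sketch: fuse a keyword ranking and a vector ranking with
# reciprocal rank fusion (RRF). Document IDs are illustrative.

def rrf(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Score each doc by the sum of 1 / (k + rank) across all rankings."""
    scores: dict[str, float] = {}
    for ranking in rankings:
        for rank, doc in enumerate(ranking, start=1):
            scores[doc] = scores.get(doc, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

keyword_hits = ["doc_climate_2023", "doc_budget_2024", "doc_energy_2022"]
vector_hits  = ["doc_klimaschutz_2023", "doc_climate_2023", "doc_energy_2022"]

fused = rrf([keyword_hits, vector_hits])
print(fused[0])  # → doc_climate_2023 (found by both keyword and vector search)
```

Documents that both rankings agree on float to the top; documents found by only one search still survive, just lower down.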

Our Setup

Data

  • Typesense — self-hosted, single container
  • 55,895 EU government press releases
  • 4 countries — Spain, Germany, Luxembourg, Italy
  • 4 languages — ES, DE, FR, IT
  • Multilingual embeddings (e5-small, 384d)
  • Full-text + vector + faceted search

Tools

  • Open WebUI — RAG chat interface
  • Claude Code — agentic research via API
  • LLMs via OpenRouter
  • Everything on one server
User → Open WebUI → Pipeline → Typesense
Pipeline → OpenRouter (LLM)

All data from the EU Open Data Portal. Open-licensed (CC-BY-4.0 / CC0).

Typesense Search

Exploring 56K press releases — full-text search, facets, typo tolerance, multilingual queries

Keyword search
Facet by country/year
Typo tolerance
Cross-language
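A search like the ones in this demo is one request with a handful of parameters. A sketch of the request body (collection and field names are assumptions about this demo's schema; host, port, and API key depend on your deployment):

```python
# Sketch of a Typesense search request. Field and collection names below are
# illustrative; adapt them to your own schema.

search_params = {
    "q": "klimapolitik",              # typo-tolerant, multilingual keyword query
    "query_by": "title,body",         # fields to search
    "filter_by": "country:=Germany",  # faceted filtering
    "facet_by": "country,year",       # facet counts in the response
    "per_page": 10,
}

# With a running server and the typesense Python client, the call would be:
# import typesense
# client = typesense.Client({
#     "nodes": [{"host": "localhost", "port": "8108", "protocol": "http"}],
#     "api_key": "YOUR_API_KEY",
# })
# results = client.collections["press_releases"].documents.search(search_params)

print(search_params["q"], "→", search_params["filter_by"])
```

The same `search_params` dict drives both the dashboard exploration shown here and the RAG retrieval on the next slide.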

Typesense Dashboard · :8109


RAG in Action

Asking questions about EU press releases — Typesense retrieves, LLM answers with citations

Ask a question
Typesense searches
LLM synthesizes
Grounded response

Open WebUI · :3000


Agent Research

Claude Code investigates a complex question — multiple searches, cross-referencing, synthesis

Complex question
Agent plans
Searches & reasons
Synthesized report

Claude Code · terminal

Choosing Your Approach

|                   | RAG                | Agents                       | Fine-Tuning                               |
|-------------------|--------------------|------------------------------|-------------------------------------------|
| Best for          | Direct factual Q&A | Complex, multi-step research | Classification, tagging, sentiment, style |
| Task complexity   | Simple questions   | Open-ended investigation     | Focused, repeatable tasks                 |
| Latency           | Seconds            | Minutes                      | Milliseconds                              |
| Cost / query      | $                  | $$$$                         | Free (after training)                     |
| Data freshness    | Real-time          | Real-time                    | Stale until retrained                     |
| Can cite sources  | Yes                | Yes                          | No                                        |
| Setup complexity  | Low                | Medium                       | High (ML expertise)                       |
| Needs data system | At runtime         | At runtime                   | At training time only                     |

These aren't mutually exclusive. A newsroom might use RAG for reporter Q&A, agents for investigations, and a fine-tuned model to auto-tag incoming articles.


Questions & Discussion