Better Search & Embed = What Reggie Thinks (Multi-Round Analysis)

Almost all of DSW's ability to help participants lives in three data streams: incident reports, shift reports, and Discord conversations. These contain the daily reality of what happens, who was involved, how staff responded, and what participants experienced.

The v2 analysis pipeline analyses each data type for what it actually IS, through multiple focused LLM passes ("lenses"), and stores the results in multiple output formats designed for different consumers. The same analysis that powers search also powers agent context, compliance dashboards, and frontend display.

This is not an embedding upgrade. This is getting our most important data into the database correctly.


1. The Problem with V1

One Prompt For Everything

The v1 extraction pipeline uses a single prompt for all content types. The only differentiation between an incident report and a Discord message is four lines telling the LLM to "drop chatter for short messages" and "preserve factual tone for staff reports." The unit types, domain tags, entity extraction, output format, verification, and enrichment are identical for everything.

Incident reports and shift reports even share the same container_type = 'staff_report', so the LLM has no signal to tell them apart.

What Gets Lost

| Data Type | What the LLM Sees in V1 | What It Should See |
|---|---|---|
| Incident reports | Before/incident/after narrative only | Full context: date, location, staff, participants, severity, NDIS flags, incident type -- THEN the narrative |
| Shift reports | Participant notes + staff comment | Shift type, expected content for that type, all participants, vehicle data -- THEN the notes |
| Discord messages | Raw message text only | Channel context, author role, preceding conversation, thread topic -- THEN the message |

What V1 Produces

One generic set of "semantic units" and one set of embeddings per source. No structured analysis of what the content actually means from different perspectives. No assessment of quality, compliance, gaps, or patterns.


2. Multi-Round Analysis: The V2 Approach

The Core Idea

Instead of one generic prompt, each source type gets a custom analysis pipeline consisting of multiple focused LLM passes. Each pass has one specific mission and produces one specific structured output.

For every piece of content, we produce and store:

| Output Type | What It Is | Who Uses It |
|---|---|---|
| Full analysis | Complete structured JSONB per lens | Agents needing deep context, compliance reporting |
| Summary | 1-3 sentence human-readable summary | Search result cards, tooltips, dashboards |
| Single-line facts | Atomic fact statements extracted from content | Embedding vectors, search index, quick retrieval |
| Metadata | Entities, domain tags, scores, flags | Filtering, relevance scoring, cross-referencing |
| Embedding vectors | 3072-dim vectors (atomic, contextual, summary) | Semantic similarity search |

Every consumer gets what it needs from the same analysis pass. We analyse once, store in multiple formats, serve to everyone.

How It Flows

RAW DATA ARRIVES (incident report, shift report, Discord message)
|
v
ENRICHMENT: Build the full picture
- Prepend structured metadata the LLM needs to see
- For incidents: add date, location, staff, participants, severity
- For shifts: add shift type, expected content, all participants
- For Discord: add preceding messages, channel context, author role
|
v
MULTI-LENS ANALYSIS: Run the focused LLM passes for this source type (4-6 lenses in total, including the final QA pass)
- Each pass answers ONE specific question deeply
- Passes can run in parallel where independent
- Each produces structured JSONB output
|
v
QUALITY ASSURANCE: Final LLM pass cross-checks all outputs
- Flags contradictions between lenses
- Adjusts confidence scores
- Produces overall quality score
|
v
STORAGE: Multiple output formats from the same analysis
- source_analysis table: full JSONB per lens (for agents, compliance)
- Summaries: human-readable condensations (for search cards, tooltips)
- Single-line facts: atomic statements (for embedding, quick retrieval)
- Metadata: entities, tags, scores (for filtering, relevance)
- Embedding vectors: 3 strategies per unit (for similarity search)
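
The middle of this flow (independent lenses in parallel, then one QA pass over their outputs) can be sketched as follows. This is a minimal illustration, not the pipeline's actual code: `run_pipeline` and the stand-in lens functions are hypothetical names, and a real lens would call the LLM with a focused prompt instead of inspecting the text directly.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict

# A lens takes the enriched text and returns structured output (stored as JSONB).
Lens = Callable[[str], dict]


def run_pipeline(enriched_text: str,
                 lenses: Dict[str, Lens],
                 qa_lens: Callable[[str, Dict[str, dict]], dict]) -> Dict[str, dict]:
    """Run independent lenses in parallel, then one QA pass over their outputs."""
    with ThreadPoolExecutor(max_workers=len(lenses) or 1) as pool:
        futures = {name: pool.submit(fn, enriched_text) for name, fn in lenses.items()}
        results = {name: fut.result() for name, fut in futures.items()}
    # QA runs last because it needs every other lens output.
    results["qa"] = qa_lens(enriched_text, results)
    return results


# Stand-in lenses (assumptions for illustration only).
def fake_factual(text: str) -> dict:
    return {"facts": [line for line in text.splitlines() if line]}


def fake_compliance(text: str) -> dict:
    return {"ndis_reportable": "restraint" in text.lower()}


def fake_qa(text: str, outputs: Dict[str, dict]) -> dict:
    return {"lenses_checked": sorted(outputs), "contradictions": []}


demo = run_pipeline(
    "Severity: 4\nStaff used physical restraint.",
    {"factual_extraction": fake_factual, "compliance": fake_compliance},
    fake_qa,
)
```

Each key in `demo` corresponds to one row that would be written to the source_analysis table.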

Why Multiple Passes Instead Of One

A single LLM pass trying to extract facts AND assess participant impact AND evaluate staff conduct AND check compliance produces mediocre results at everything. Separate passes with focused missions produce expert-level results at each.

With Tier 4 limits at less than 1% utilisation, running 6 passes on an incident report costs fractions of a cent and takes seconds in a background worker. The quality difference is enormous.


3. What Each Data Type Gets

3.1 Incident Reports -- 6 Analysis Lenses

An incident report is a structured narrative about something that went wrong. It feeds compliance reporting, pattern detection, staff development, participant safety monitoring, and operational improvement.

Pre-analysis enrichment: The raw_text is rebuilt to include all structured context (date, location, staff, participants, severity, incident type, NDIS flags) alongside the narrative sections. The v1 pipeline sends only the narrative -- the LLM never sees the structured metadata.
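
A minimal sketch of that rebuild, assuming a simple in-memory record. The `Incident` class, its field names, and the section labels are hypothetical; only the list of context items (date, location, staff, participants, severity, incident type, NDIS flags) comes from the description above.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Incident:
    # Hypothetical field names; the real schema may differ.
    date: str
    location: str
    staff: List[str]
    participants: List[str]
    severity: int
    incident_type: str
    ndis_flags: List[str] = field(default_factory=list)
    before: str = ""
    during: str = ""
    after: str = ""


def build_enriched_text(inc: Incident) -> str:
    """Put the structured context first, THEN the narrative sections."""
    header = [
        f"Date: {inc.date}",
        f"Location: {inc.location}",
        f"Staff: {', '.join(inc.staff)}",
        f"Participants: {', '.join(inc.participants)}",
        f"Severity: {inc.severity}",
        f"Incident type: {inc.incident_type}",
        f"NDIS flags: {', '.join(inc.ndis_flags) or 'none'}",
    ]
    narrative = [
        f"BEFORE:\n{inc.before}",
        f"INCIDENT:\n{inc.during}",
        f"AFTER:\n{inc.after}",
    ]
    return "\n".join(header) + "\n\n" + "\n\n".join(narrative)


example = build_enriched_text(Incident(
    date="2026-03-13", location="Coral Flame", staff=["Sam"],
    participants=["Participant X"], severity=4, incident_type="restraint",
    ndis_flags=["restrictive_practice"],
    before="Participant X became agitated.",
    during="Physical restraint was used.",
    after="Participant X was distressed.",
))
```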

| Lens | Mission | Key Outputs |
|---|---|---|
| Factual Extraction | What happened, when, where, who, what actions | Events, timeline, entities, atomic facts |
| Participant Impact | Impact on participant wellbeing, dignity, safety | Wellbeing indicators, restrictive practices used, pattern indicators, follow-up needs |
| Staff Conduct & Response | Was the response proportionate? Language analysis | Response assessment, areas glossed over vs detailed, author sentiment (defensive? factual? compensatory?) |
| Gap Analysis | What SHOULD be in this report that is NOT | Missing elements, timeline gaps, unanswered questions, expected vs actual content |
| Compliance Assessment | NDIS reportable? WHS implications? Safeguarding? | NDIS indicators, WHS flags, recommended actions, regulatory confidence |
| Quality Assurance | Cross-check all lenses for consistency | Contradictions between lenses, adjusted confidence, final quality score |

Example stored output for one incident:

embeddings.source_analysis (6 rows):
incident_factual_extraction: {events: [...], entities: [...], timeline: [...]}
incident_participant_impact: {wellbeing: {...}, restrictive_practices: [...]}
incident_staff_conduct: {response: {...}, sentiment: "defensive", glossed: [...]}
incident_gap_analysis: {missing: [...], unanswered: [...], score: 0.6}
incident_compliance: {ndis_reportable: true, whs_flags: [...]}
incident_qa: {contradictions: [...], final_score: 0.78}

embeddings.sources (metadata):
tooltip_summary: "Restraint incident at Coral Flame involving [participant]..."
keywords: "restraint coral flame [participant] [staff] severity-4 ndis"
importance_score: 0.9

embeddings.units (searchable atoms, each with 3 embedding vectors):
"Staff member [name] used physical restraint on [participant] at 3:15pm"
"[Participant] was distressed and crying following the restraint"
"De-escalation was not documented before restraint was applied"

3.2 Discord Messages -- 4 Analysis Lenses

Discord messages are conversational fragments. A single message might be meaningless alone. The value comes from understanding conversation flow, operational significance buried in casual language, and social dynamics.

Pre-analysis enrichment: The preceding conversation (up to 15 messages) is injected alongside the target message, together with channel context and author role.
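
One way to assemble that context window, as a sketch only. The `Message` shape, the `build_discord_context` name, and the text layout are assumptions; the 15-message cap is the figure given above.

```python
from dataclasses import dataclass
from typing import List

MAX_CONTEXT_MESSAGES = 15  # "up to 15 messages" from the enrichment rule


@dataclass
class Message:
    # Hypothetical shape; real Discord records carry more fields.
    author: str
    author_role: str
    text: str


def build_discord_context(channel: str, history: List[Message], target: Message) -> str:
    """Prepend channel context and recent conversation to the target message."""
    preceding = history[-MAX_CONTEXT_MESSAGES:]  # most recent messages, oldest first
    lines = [f"Channel: #{channel}", "Preceding conversation:"]
    lines += [f"  [{m.author_role}] {m.author}: {m.text}" for m in preceding]
    lines += ["Target message:",
              f"  [{target.author_role}] {target.author}: {target.text}"]
    return "\n".join(lines)


history = [Message(f"user{i}", "support worker", f"message {i}") for i in range(20)]
target = Message("Sam", "support worker",
                 "Hey can someone cover me tomorrow arvo, I'm sick")
context = build_discord_context("roster", history, target)
```

With 20 messages of history, only the last 15 reach the prompt, followed by the target message.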

| Lens | Mission | Key Outputs |
|---|---|---|
| Operational Content | Extract operationally relevant info buried in casual language | Shift cover requests, leave notices, participant updates, equipment issues |
| Conversational Context | How does this fit the conversation? Does it change prior meaning? | Thread topic, conversation role, retroactive implications |
| Social & Team Dynamics | Team morale, workload stress, collaboration patterns | Tone markers, wellbeing indicators, coordination signals |
| Quality Assurance | Is extraction depth appropriate for this message? | Over/under extraction flags |

Critical distinction from v1: The v1 prompt says "drop chatter/greetings." This is wrong for Discord. "Hey can someone cover me tomorrow arvo, I'm sick" is casual language but operationally critical. The v2 operational content lens specifically recognises operational importance in informal language.

Retroactive re-analysis: When the conversational context lens identifies that a message changes the meaning of preceding messages (e.g., "just kidding" after a series of complaints), those earlier sources are flagged for re-processing. This is unique to conversational data.
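
The flagging step could look roughly like this. It is a sketch under assumptions: the `retroactive_source_ids` key in the lens output and the in-memory `statuses` map are hypothetical stand-ins for whatever the lens actually emits and however extraction status is stored.

```python
from typing import Dict, List


def flag_for_reprocessing(context_output: dict, statuses: Dict[str, str]) -> List[str]:
    """Mark earlier sources as pending when a later message changes their meaning.

    `context_output` is the conversational context lens result. The
    `retroactive_source_ids` key is a hypothetical field name.
    """
    flagged = []
    for source_id in context_output.get("retroactive_source_ids", []):
        if source_id in statuses and statuses[source_id] != "pending":
            statuses[source_id] = "pending"
            flagged.append(source_id)
    return flagged


statuses = {"msg-101": "done", "msg-102": "done", "msg-103": "done"}
lens_output = {
    "thread_topic": "workload complaints",
    # msg-103 is the "just kidding" that changes how msg-101 and msg-102 read
    "retroactive_source_ids": ["msg-101", "msg-102"],
}
flagged = flag_for_reprocessing(lens_output, statuses)
```

Running the same lens output a second time flags nothing, so repeated analysis does not queue the same sources again.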

3.3 Shift Reports -- 5 Analysis Lenses

Shift reports are daily care narratives. They are arguably the most important documents for NDIS compliance and participant care quality.

Pre-analysis enrichment: An expected-content section is generated from the shift type (e.g., "Group Home" shifts should document meals, medications, personal care, sleep patterns) and prepended to the notes.
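
A small sketch of how that section might be generated. Only the "Group Home" entry comes from the example above; the lookup table, the function name, and the rendered layout are assumptions.

```python
from typing import Dict, List

# Only the "Group Home" entry is taken from the text; other shift types
# would need their own lists.
EXPECTED_CONTENT: Dict[str, List[str]] = {
    "Group Home": ["meals", "medications", "personal care", "sleep patterns"],
}


def expected_content_section(shift_type: str) -> str:
    """Render the expected-content block that is prepended to the shift notes."""
    topics = EXPECTED_CONTENT.get(shift_type)
    if not topics:
        return f"Shift type: {shift_type}\nExpected content: (no expectations defined)"
    bullets = "\n".join(f"- {topic}" for topic in topics)
    return f"Shift type: {shift_type}\nExpected content:\n{bullets}"


section = expected_content_section("Group Home")
```

The same list can later be compared against what the notes actually cover, which is the job of the completeness lens.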

| Lens | Mission | Key Outputs |
|---|---|---|
| Activity & Routine | What activities happened, per participant | Activities, routines, deviations from normal |
| Participant Wellbeing | Physical, emotional, social wellbeing per participant | Mood indicators, health observations, behaviour notes, engagement level |
| Care Quality | Was the care person-centred or task-oriented? | Care quality score, choice indicators, activity meaningfulness |
| Completeness & Data Quality | What is missing that should be there? | Missing elements for this shift type, brevity flags, follow-up suggestions |
| Quality Assurance | Cross-check: does wellbeing match activities? | Contradictions, overall quality score |

4. Output Formats: One Analysis, Multiple Representations

Every analysis pass produces outputs stored in multiple formats simultaneously. This is how one analysis serves search, agents, compliance, and frontend without re-processing.

4.1 Full Structured Analysis

Stored in embeddings.source_analysis as complete JSONB per lens. This is the deep, detailed version.

-- Fetch all analysis for an incident
SELECT lens_name, analysis_output, confidence
FROM embeddings.source_analysis
WHERE source_id = '...'
ORDER BY lens_name;

Used by: agents needing full context, compliance audits, investigation support.

4.2 Source-Level Summaries

Human-readable condensations stored directly on the source record.

| Field | Purpose | Example |
|---|---|---|
| tooltip_summary | 1-2 sentence summary for hover previews | "Restraint incident at Coral Flame involving participant X, severity 4, NDIS reportable" |
| keywords | Space-separated terms for text search | "restraint coral flame participant-x staff-y severity-4 ndis" |
| importance_score | 0.0-1.0 relevance weight | 0.9 (high -- involves restrictive practice) |

Used by: search result previews, dashboard cards, notification summaries.

4.3 Atomic Facts / Semantic Units

Single-line, self-contained statements optimised for embedding and search. Each gets three embedding vectors (atomic, contextual, summary) at 3072 dimensions using text-embedding-3-large.

Used by: semantic similarity search, RAG retrieval, agent context assembly.

4.4 Metadata and Scores

Structured fields on units and sources for filtering and relevance scoring:

| Field | Lives On | Purpose |
|---|---|---|
| domain_tags | units | Filter by domain: incidents, participants, roster, medical |
| entity_mentions | units | Who is mentioned: Staff: Sam, Participant: X |
| resolved_entities | units | Matched to database IDs via YP3000 |
| confidence | units, source_analysis | How reliable is this extraction (0.0-1.0) |
| sentiment | units | Author tone, subject sentiment, urgency |
| intent | units | Why was this communicated: inform, request, escalate |
| record_scope | units | Is this about a staff member, participant, vehicle, venue |
| pipeline_version | sources | Which version of the analysis pipeline processed this |

5. What This Means for Agents

When someone asks an agent "what did you think of that incident from Sam on Friday?", the agent does not re-read the raw incident report. It queries the database:

-- Find Sam's incident from Friday
-- (half-open range: BETWEEN would also match midnight at the start of Saturday)
SELECT s.id, s.tooltip_summary, s.importance_score
FROM embeddings.sources s
WHERE s.origin_system = 'incidents'
AND s.author_display ILIKE '%sam%'
AND s.created_at >= '2026-03-13'
AND s.created_at < '2026-03-14';

-- Get all lens analysis for that incident
SELECT lens_name, analysis_output
FROM embeddings.source_analysis
WHERE source_id = '...';

The agent now has: factual extraction, participant impact assessment, staff conduct analysis, gap analysis, compliance assessment, and quality score. All pre-computed. The agent can answer in depth without any live analysis.

Cross-referencing ("what were they saying about it on Discord?") is a time-band query on sources with matching entities. The database has everything. Agents just ask for it.
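
The time-band idea can be illustrated in memory as follows. This is a sketch, not the actual query: the `Source` class mirrors only a few columns, the `"discord"` origin value is assumed, and the 48-hour band is an arbitrary illustrative choice.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List, Set


@dataclass
class Source:
    # Hypothetical in-memory view of a row in the sources table.
    id: str
    origin_system: str
    created_at: datetime
    entities: Set[str]


def cross_reference(anchor: Source, candidates: List[Source],
                    band: timedelta = timedelta(hours=48)) -> List[Source]:
    """Return Discord sources near the anchor in time that share an entity."""
    return [
        s for s in candidates
        if s.origin_system == "discord"
        and abs(s.created_at - anchor.created_at) <= band
        and s.entities & anchor.entities  # at least one entity in common
    ]


incident = Source("inc-1", "incidents", datetime(2026, 3, 13, 15, 15),
                  {"Sam", "Participant X"})
messages = [
    Source("d-1", "discord", datetime(2026, 3, 13, 18, 0), {"Sam"}),
    Source("d-2", "discord", datetime(2026, 3, 13, 19, 0), {"Alex"}),
    Source("d-3", "discord", datetime(2026, 3, 20, 9, 0), {"Sam"}),
]
related = cross_reference(incident, messages)
```

Here only `d-1` qualifies: `d-2` shares no entity with the incident and `d-3` falls outside the band.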

This is not agent-specific. Any agent type, any version, any future system can query the same tables. The database is the knowledge. Agents are consumers.


6. The Embedding Vector System

6.1 Model and Dimensions

All embeddings use text-embedding-3-large at 3072 dimensions, consistent across all source types.

6.2 Three Embedding Strategies Per Unit

Each semantic unit gets three vectors, each capturing meaning at a different level:

| Strategy | What Gets Embedded | Best For |
|---|---|---|
| Atomic | The unit text alone | Precise factual queries |
| Contextual | Unit text + surrounding source context | Queries needing broader understanding |
| Summary | Unit text + domain/topic framing | Broad topic queries |

Search queries all three strategies and takes the best match per unit.
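
A minimal in-memory sketch of "best match per unit", assuming cosine similarity as the scoring function (the text does not name the metric). The function names and toy 3-dimensional vectors are illustrative; production vectors have 3072 dimensions and the comparison would more plausibly happen inside the database.

```python
import math
from typing import Dict, List, Tuple

Vector = List[float]


def cosine(a: Vector, b: Vector) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def best_match_per_unit(query: Vector,
                        units: Dict[str, Dict[str, Vector]]) -> List[Tuple[str, str, float]]:
    """Score each unit by its best-scoring strategy, highest score first."""
    ranked = []
    for unit_id, vectors in units.items():
        strategy, score = max(
            ((name, cosine(query, vec)) for name, vec in vectors.items()),
            key=lambda pair: pair[1],
        )
        ranked.append((unit_id, strategy, score))
    return sorted(ranked, key=lambda row: row[2], reverse=True)


# Toy vectors for illustration only.
units = {
    "unit-a": {"atomic": [1.0, 0.0, 0.0],
               "contextual": [0.6, 0.8, 0.0],
               "summary": [0.0, 1.0, 0.0]},
    "unit-b": {"atomic": [0.0, 0.0, 1.0],
               "contextual": [0.0, 0.6, 0.8],
               "summary": [0.0, 1.0, 0.0]},
}
results = best_match_per_unit([1.0, 0.0, 0.0], units)
```

Each unit appears once in `results`, tagged with the strategy that matched best, so a unit is never listed three times in the search output.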

6.3 What Gets Embedded vs What Gets Stored

| Content | Stored Where | Embedded? |
|---|---|---|
| Full lens analysis | source_analysis.analysis_output (JSONB) | No -- too large, not for similarity search |
| Tooltip summary | sources.tooltip_summary | No -- for display only |
| Keywords | sources.keywords | No -- for text search (tsvector) |
| Atomic facts | units.text_for_embedding | Yes -- three vectors each |
| Source overview | source_overviews.overview_summary | Yes -- one vector |

Vectors find the needle (similarity search). Structured analysis tells you everything about the needle.


7. Data Quality Fixes

The v1 pipeline has several confirmed problems addressed in v2:

Incident Reports: New Incidents Not Being Ingested

The Monday.com import is CLI-only. It is neither scheduled nor automated, so new incidents are not being pulled in.

Fix: Schedule the import as a server cron every 2 hours with column validation.

Incident Reports: Future Dates

The LLM date parser has no plausibility validation. Records appear with dates in years such as 3405 and 2028.

Fix: Run date plausibility checks after parsing. Suspicious dates are quarantined for review rather than rejected (legitimate late reports exist).
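
A sketch of such a check. The two bounds (`EARLIEST_PLAUSIBLE` and `MAX_FUTURE_SKEW`) are assumptions chosen for illustration; the real thresholds are a policy decision. Nothing is rejected: the function only decides between accepting and quarantining.

```python
from datetime import date, timedelta
from typing import Optional, Tuple

# Assumed bounds for illustration only.
EARLIEST_PLAUSIBLE = date(2015, 1, 1)
MAX_FUTURE_SKEW = timedelta(days=1)


def check_incident_date(parsed: date, today: Optional[date] = None) -> Tuple[str, str]:
    """Return ("ok" | "quarantine", reason). Nothing is rejected outright."""
    today = today or date.today()
    if parsed > today + MAX_FUTURE_SKEW:
        return "quarantine", f"date {parsed} is in the future"
    if parsed < EARLIEST_PLAUSIBLE:
        return "quarantine", f"date {parsed} is before {EARLIEST_PLAUSIBLE}"
    return "ok", ""


reference_day = date(2026, 3, 16)
far_future = check_incident_date(date(3405, 6, 1), today=reference_day)
late_report = check_incident_date(date(2026, 1, 5), today=reference_day)
```

A report filed weeks late still carries a past date, so it passes, while a parse that lands in 3405 is held for review.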

Shift Reports: Late Notes Never Embedded

When a shift arrives from Deputy without participant notes, it gets skipped for embedding. When notes arrive hours later, the embedding is never regenerated.

Fix: A daily re-checker re-fetches the last 7 days from Deputy. If notes have arrived, the shift is re-embedded. After 7 days without notes, the shift is flagged as confirmed-no-notes.
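
The re-checker's per-shift decision might be expressed like this. It is a sketch: the action names, the `already_embedded` flag, and the choice to treat exactly 7 days as expired are assumptions not stated in the fix.

```python
from datetime import date, timedelta

RECHECK_WINDOW = timedelta(days=7)  # the 7-day window from the fix


def recheck_action(shift_date: date, has_notes: bool,
                   already_embedded: bool, today: date) -> str:
    """Decide what the daily re-checker does with one shift."""
    if has_notes:
        # Assumption: shifts that were already embedded with notes are left alone.
        return "skip" if already_embedded else "re-embed"
    if today - shift_date >= RECHECK_WINDOW:
        return "flag-confirmed-no-notes"
    return "check-again-tomorrow"


today = date(2026, 3, 16)
late_notes = recheck_action(date(2026, 3, 15), True, False, today)
expired = recheck_action(date(2026, 3, 9), False, False, today)
```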

Shift Reports: Updated Content Not Re-Embedded

The ON CONFLICT clause does not reset extraction status. Updated content is stored but never re-embedded.

Fix: Use ON CONFLICT DO UPDATE and set extraction_status = 'pending', so updated content is queued for extraction again.
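
The effect of the fix can be demonstrated with an in-memory SQLite database standing in for the production database. The table layout and column names other than extraction_status are hypothetical, and the `WHERE` guard (reset only when the text actually changed) is an optional addition, not part of the stated fix.

```python
import sqlite3

# SQLite stand-in; requires SQLite 3.24+ for upsert support.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE sources (
        external_id TEXT PRIMARY KEY,
        raw_text TEXT NOT NULL,
        extraction_status TEXT NOT NULL DEFAULT 'pending'
    )
""")

# On conflict, store the new text AND reset the status so the row is
# picked up for extraction again. IS NOT is SQLite's null-safe inequality.
UPSERT = """
    INSERT INTO sources (external_id, raw_text)
    VALUES (?, ?)
    ON CONFLICT (external_id) DO UPDATE
    SET raw_text = excluded.raw_text,
        extraction_status = 'pending'
    WHERE sources.raw_text IS NOT excluded.raw_text
"""

# First arrival, then the extractor marks the row as done.
conn.execute(UPSERT, ("shift-1", "No notes yet."))
conn.execute("UPDATE sources SET extraction_status = 'done' WHERE external_id = 'shift-1'")

# Updated content arrives: the status goes back to 'pending'.
conn.execute(UPSERT, ("shift-1", "Participant X ate well and slept through."))
status = conn.execute(
    "SELECT extraction_status FROM sources WHERE external_id = 'shift-1'"
).fetchone()[0]
```

Without the `extraction_status = 'pending'` assignment, the second upsert would store the new text but leave the row marked as done, which is the v1 behaviour described above.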


8. Pipeline Versioning and Re-Processing

Every source record carries a pipeline_version integer. Every pipeline improvement increments the version. A background re-processor works through old-version records at low priority.

Lens outputs are stored separately per lens. This means a single improved lens can be re-run across all sources without re-running the entire pipeline.

Why this matters: In two months we will think of something that improves our analysis. The pipeline versioning ensures every improvement is applied retroactively to all historical data, automatically, in the background.
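
A sketch of how the re-processor might decide what to re-run for one record. The pipeline_version integer is described above; the per-lens version map is an assumption introduced here to illustrate re-running a single improved lens, and all names are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List

CURRENT_PIPELINE_VERSION = 2
# Hypothetical per-lens versions, so one improved lens can be re-run alone.
CURRENT_LENS_VERSIONS: Dict[str, int] = {"factual_extraction": 1, "gap_analysis": 2}


@dataclass
class SourceRecord:
    id: str
    pipeline_version: int
    lens_versions: Dict[str, int]


def stale_lenses(record: SourceRecord) -> List[str]:
    """List the lenses that must be re-run for this record."""
    if record.pipeline_version < CURRENT_PIPELINE_VERSION:
        return sorted(CURRENT_LENS_VERSIONS)  # whole pipeline is out of date
    return sorted(
        name for name, version in CURRENT_LENS_VERSIONS.items()
        if record.lens_versions.get(name, 0) < version
    )


old = SourceRecord("src-1", 1, {})
partial = SourceRecord("src-2", 2, {"factual_extraction": 1, "gap_analysis": 1})
fresh = SourceRecord("src-3", 2, {"factual_extraction": 1, "gap_analysis": 2})
```

A v1 record gets every lens, a v2 record with an outdated gap analysis gets only that lens, and an up-to-date record gets nothing.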


9. Execution Plan

| Phase | Goal | Key Work |
|---|---|---|
| 1. Foundation | Fix plumbing, get data flowing | Schema migrations, Monday.com cron, shift re-checker, date validation |
| 2. Incidents | Highest-value data type, 6 lenses | Build enriched raw_text, implement all lenses, re-process existing |
| 3. Discord | Conversational-aware, 4 lenses | Enhanced context window, retroactive re-analysis triggers |
| 4. Shifts | Care-quality-aware, 5 lenses | Shift-type expectations, participant wellbeing per person |
| 5. Search UI | Global search on v2 data | Wire header search bar, type-specific result cards, filtering |
| 6. Continuous | Make it repeatable | Re-processing command, reconciliation cron, health dashboard |

V2 is built alongside v1 (which continues running). Both write to the same tables; pipeline_version distinguishes them. When v2 is proven, v1 goes dormant and all old content is queued for re-processing at low priority.


10. Key Principle

The database is the product.

We are not building a search engine, an agent brain, or a compliance tool. We are building a database that contains deeply analysed, well-structured operational knowledge about the care we provide.

Search consumes it. Agents consume it. Dashboards consume it. Compliance tools consume it. Things we have not thought of yet will consume it. The better we analyse and store the data, the better everything works.

Almost all of our insight and ability to help our participants lives in this data. Getting it right is some of the most important work in the system.