Reggie SMS V3 Architecture

Status: Active
Version: 3.0
Last Updated: 2026-01-12


1. Overview

Reggie V3 is a complete rebuild addressing issues discovered in V2:

  • Exhausted iterations - V2 would loop requesting the same data until it hit its iteration limit
  • Complex validation - V2 validation was too strict, rejecting valid responses
  • Keyword guessing - V2 guessed which data to fetch based on keywords

V3 uses a clean dual-loop architecture with:

  • Mega Prompt - All available data upfront in the first LLM call
  • Dual Conversation Model - USER channel (staff↔Reggie) + SYSTEM channel (code↔LLM)
  • LLM-Driven Decisions - LLM decides if response needed, not regex patterns
  • Lenient Validation - Advisory, not blocking
  • Graceful Failures - Personality-consistent error messages

2. Architecture

┌─────────────────────────────────────────────────────────────────┐
│ REGGIE V3 PIPELINE │
├─────────────────────────────────────────────────────────────────┤
│ │
│ ┌────────────────┐ │
│ │ 1. IDENTIFY │ YP3000 lookup + Deputy ID fetch │
│ │ CALLER │ → name, staffId, deputyId, isAdmin │
│ └───────┬────────┘ │
│ │ │
│ ▼ │
│ ┌────────────────────────────────────────────────────────────┐ │
│ │ 2. BUILD MEGA PROMPT (everything upfront) │ │
│ │ ┌─────────────────────────────────────────────────────┐ │ │
│ │ │ • Admin team directory (who handles what) │ │ │
│ │ │ • Staff's leave balances (Xero API) │ │ │
│ │ │ • Staff's upcoming shifts - 14 days (Deputy API) │ │ │
│ │ │ • Staff's recent shifts - 7 days (Deputy API) │ │ │
│ │ │ • Pay info & next payday │ │ │
│ │ │ • Discord channels list │ │ │
│ │ │ • Available document titles │ │ │
│ │ │ • Document topic examples │ │ │
│ │ │ • Conversation history (last 10 messages, 24h) │ │ │
│ │ └─────────────────────────────────────────────────────┘ │ │
│ └───────┬────────────────────────────────────────────────────┘ │
│ │ │
│ ▼ │
│ ┌────────────────────────────────────────────────────────────┐ │
│ │ 3. OUTER LOOP (3 validation retries) │ │
│ │ ┌──────────────────────────────────────────────────────┐│ │
│ │ │ INNER LOOP (4 data-gathering turns) ││ │
│ │ │ Turn 1: LLM sees mega prompt ││ │
│ │ │ Turn 2-4: LLM can request more data ││ │
│ │ │ (search_discord, search_documents, ││ │
│ │ │ search_directory) ││ │
│ │ └──────────────────────────────────────────────────────┘│ │
│ │ │ │ │
│ │ ▼ │ │
│ │ ┌──────────────────────────────────────────────────────┐│ │
│ │ │ VALIDATION (LLM-based, lenient) ││ │
│ │ │ Valid? → Send SMS ││ │
│ │ │ Invalid? → Retry with feedback ││ │
│ │ └──────────────────────────────────────────────────────┘│ │
│ └───────┬────────────────────────────────────────────────────┘ │
│ │ │
│ ▼ │
│ ┌────────────────┐ │
│ │ 4. GRACEFUL │ After 3 retries, send personality- │
│ │ FAILURE │ consistent apology message │
│ └────────────────┘ │
│ │
└─────────────────────────────────────────────────────────────────┘
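The outer and inner loops in the diagram can be sketched as two nested `for` loops. This is a minimal sketch under stated assumptions, not the contents of reggie-sms-v3.js: `runDualLoop` and the injected `callLlm`, `validate`, `fetchData`, and `gracefulFailure` functions are hypothetical names, and only the limits (3 retries, 4 turns) come from the diagram.

```javascript
// Hypothetical sketch of the V3 dual loop; all function names are assumptions.
const MAX_RETRIES = 3; // outer loop: validation attempts
const MAX_TURNS = 4;   // inner loop: data-gathering turns

async function runDualLoop(megaPrompt, { callLlm, validate, fetchData, gracefulFailure }) {
  // SYSTEM channel: code <-> LLM orchestration messages
  const systemChannel = [{ role: 'system', content: megaPrompt }];

  for (let retry = 1; retry <= MAX_RETRIES; retry++) {
    let candidate = null;

    for (let turn = 1; turn <= MAX_TURNS; turn++) {
      const reply = await callLlm(systemChannel);
      if (!reply.needs_response) return { success: true, silent: true };
      if (reply.can_answer) {
        candidate = reply.response;
        break;
      }
      // The LLM asked for more data: fetch it and append it to the system channel.
      const data = await fetchData(reply.need_data, reply.query);
      systemChannel.push({ role: 'system', content: `Requested data: ${JSON.stringify(data)}` });
    }

    // If the turns ran out without an answer, fall through to the next retry.
    if (candidate !== null) {
      const verdict = await validate(candidate);
      if (verdict.valid) return { success: true, response: candidate };
      systemChannel.push({ role: 'system', content: `Validation feedback: ${verdict.feedback}` });
    }
  }

  // All retries exhausted: send the apology instead of an answer.
  return { success: false, response: gracefulFailure() };
}
```

Passing the LLM, validator, and search functions in as parameters keeps the loop itself testable with stubs.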

3. Dual Conversation Model

V3 maintains two separate conversation threads:

USER Channel (Visible)

Staff ↔ Reggie SMS conversation

Staff: "How much leave do I have?"
Reggie: "Hi Sarah! You have 42 hours annual leave..."
Staff: "Thanks!"

SYSTEM Channel (Invisible)

Code ↔ LLM orchestration

SYSTEM: Here's the mega prompt with all data...
LLM: { "needs_response": true, "can_answer": true, "response": "..." }
SYSTEM: Validating response...
LLM: { "valid": true }

The LLM sees both channels, but only USER channel messages are ever sent to staff as SMS.
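One way to keep the two threads apart is to store them as separate arrays on a single conversation object. The sketch below is illustrative only; `createConversation`, `addUserMessage`, `addSystemMessage`, `llmContext`, and `outboundSms` are hypothetical names, not functions from the V3 service.

```javascript
// Hypothetical sketch: one conversation object holding both channels.
function createConversation() {
  return { user: [], system: [] };
}

// USER channel: the visible staff <-> Reggie SMS thread.
function addUserMessage(conv, direction, text) {
  conv.user.push({ direction, text }); // direction: 'inbound' or 'outbound'
}

// SYSTEM channel: invisible code <-> LLM orchestration.
function addSystemMessage(conv, role, content) {
  conv.system.push({ role, content });
}

// The LLM is given both channels as context.
function llmContext(conv) {
  return { smsHistory: conv.user, orchestration: conv.system };
}

// Only outbound USER channel messages are candidates for sending as SMS.
function outboundSms(conv) {
  return conv.user.filter((m) => m.direction === 'outbound').map((m) => m.text);
}
```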


4. LLM Response Format

V3 uses structured JSON responses from the LLM:

{
  "needs_response": true,             // Does this message need a reply?
  "can_answer": true,                 // Can we answer with current data?
  "response": "Hi Sarah! ...",        // The SMS response text
  "need_data": ["search_discord"],    // If more data needed
  "query": "dress code policy",       // Search query for requested data
  "confidence": "high",               // high/medium/low
  "reasoning": "User asked about leave, I have their Xero data"
}

needs_response Logic

The LLM decides if a message needs a response:

  • ✅ "What shifts do I have?" → needs_response: true
  • ✅ "Thanks mate!" (first ack) → needs_response: true (brief acknowledgment)
  • ❌ "👍" (after we already replied) → needs_response: false
  • ❌ "Ok cool" (after Reggie answered) → needs_response: false

can_answer Logic

  • true → Has enough info to answer, response field populated
  • false → Needs more data, need_data and query fields populated
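The needs_response and can_answer rules above amount to a small decision function. The following is a minimal sketch, assuming the LLM returns plain JSON without the explanatory comments shown in the format example; `nextAction` and its action names (`silent`, `validate`, `fetch`, `retry`) are hypothetical.

```javascript
// Hypothetical helper: decide the next pipeline step from the LLM's raw JSON reply.
function nextAction(rawReply) {
  let reply;
  try {
    reply = JSON.parse(rawReply);
  } catch (err) {
    return { type: 'retry', reason: 'LLM reply was not valid JSON' };
  }

  // No reply needed (e.g. an emoji after Reggie already answered).
  if (reply.needs_response === false) return { type: 'silent' };

  // Enough information: the response field should be populated.
  if (reply.can_answer === true && typeof reply.response === 'string' && reply.response.trim() !== '') {
    return { type: 'validate', response: reply.response };
  }

  // More data needed: need_data and query should be populated.
  if (Array.isArray(reply.need_data) && reply.need_data.length > 0) {
    return { type: 'fetch', sources: reply.need_data, query: reply.query || '' };
  }

  return { type: 'retry', reason: 'Reply had neither a response nor a data request' };
}
```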

5. Data Sources

5.1 Pre-fetched in Mega Prompt

Data                 | Source                      | Notes
---------------------|-----------------------------|--------------------------------
Admin team           | core_source.org_directory   | Who handles what, phone numbers
Leave balances       | Xero API                    | Annual, personal, sick leave
Upcoming shifts      | Deputy API                  | Next 14 days
Recent shifts        | Deputy API                  | Last 7 days
Pay info             | Static                      | Next payday, pay frequency
Discord channels     | comms.discord_channels      | Channel list
Document titles      | resources.documents         | Available handbook sections
Conversation history | comms.reggie_conversations  | Last 10 messages in 24h
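Pre-fetching these sources could look like the sketch below: every source is fetched in parallel and rendered into one prompt, and the conversation history is trimmed to the last 10 messages within 24 hours. `buildMegaPrompt`, `recentHistory`, and the shape of the `fetchers` map are assumptions for illustration; tolerating a failed source with `Promise.allSettled` is also an assumption, not documented V3 behavior.

```javascript
// Hypothetical sketch: fetch every pre-fetched source in parallel, then render one prompt.
async function buildMegaPrompt(caller, fetchers) {
  const names = Object.keys(fetchers);
  // allSettled so that one failing source does not block the rest (an assumption).
  const results = await Promise.allSettled(names.map(async (name) => fetchers[name](caller)));

  const sections = names.map((name, i) => {
    const r = results[i];
    const body = r.status === 'fulfilled' ? JSON.stringify(r.value, null, 2) : '(unavailable)';
    return `=== ${name} ===\n${body}`;
  });
  return [`Caller: ${caller.name}`, ...sections].join('\n\n');
}

// Keep only the most recent `limit` messages from the last `windowHours` hours, oldest first.
function recentHistory(messages, now = new Date(), limit = 10, windowHours = 24) {
  const cutoff = now.getTime() - windowHours * 60 * 60 * 1000;
  return messages
    .filter((m) => new Date(m.created_at).getTime() >= cutoff)
    .sort((a, b) => new Date(a.created_at) - new Date(b.created_at))
    .slice(-limit);
}
```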

5.2 On-Demand Searches

Source           | Function                    | Returns
-----------------|-----------------------------|-----------------------------
search_discord   | Semantic search Discord KB  | Q&A context with follow-ups
search_documents | Semantic search handbook    | Relevant policy chunks
search_directory | Text search org directory   | Staff contact info
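When the LLM returns need_data, the requested source names have to be mapped to search functions. The dispatcher below is a hedged sketch: `fetchRequestedData` and the `searchers` map are hypothetical, and the actual search implementations (in resource-kb.js and discord-kb-sync.js) are not shown here.

```javascript
// Hypothetical dispatcher: run each on-demand search the LLM asked for.
async function fetchRequestedData(needData, query, searchers) {
  const results = {};
  for (const source of needData) {
    // Only call functions explicitly registered in the searchers map.
    const known = Object.prototype.hasOwnProperty.call(searchers, source)
      && typeof searchers[source] === 'function';
    if (!known) {
      results[source] = { error: `Unknown source: ${source}` };
      continue;
    }
    results[source] = await searchers[source](query);
  }
  return results;
}
```

Reporting an unknown source back as an error, rather than throwing, lets the LLM see the problem on its next turn and pick a valid source.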

6. Validation System

V3 uses lenient LLM-based validation:

// Validation prompt (abbreviated)
`Review this SMS response for SERIOUS quality issues only.

ONLY flag as invalid if:
1. Obviously fake/placeholder info (e.g., "John Doe", "xxx-xxxx")
2. Completely fails to address the question
3. Contains harmful or inappropriate content

DO NOT flag as invalid for:
- Australian phone numbers (02, 04 prefixes are normal)
- Informal but friendly tone
- Minor formatting differences
- Admitting when information isn't available

Be LENIENT. Most responses should be valid.`

Validation Flow

  1. Response generated → Validate
  2. Valid? → Send SMS
  3. Invalid? → Add feedback to system channel, retry
  4. After 3 retries → Graceful failure

7. Graceful Failure Messages

When V3 exhausts all retries, it sends a personality-consistent apology:

const GRACEFUL_FAILURES = [
  "Apologies {name}, bit of a migraine coming on. Mind checking with the admin team at (02) 8783 0544 while I have a lie down? Cheers!",
  "Sorry {name}, brain's gone fuzzy on me. Try (02) 8783 0544 for the admin crew.",
  "Having a moment {name}! The admin team at (02) 8783 0544 can help while I collect myself.",
  "Drawing a blank here I'm afraid {name}. Give admin a bell on (02) 8783 0544.",
  "My brain's being uncooperative {name}. Try (02) 8783 0544 for the admin team. Sorry about that!",
  // ... 5 more variations
];
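Choosing one of these templates and filling in the {name} placeholder might look like the sketch below. `pickGracefulFailure` is a hypothetical helper, the random selection strategy is an assumption, and the sample array repeats only two of the messages above so the block is self-contained.

```javascript
// Two of the templates listed above, repeated so this sketch runs on its own.
const SAMPLE_FAILURES = [
  "Sorry {name}, brain's gone fuzzy on me. Try (02) 8783 0544 for the admin crew.",
  "Drawing a blank here I'm afraid {name}. Give admin a bell on (02) 8783 0544.",
];

// Hypothetical helper: pick a template at random and substitute the staff name.
// `random` is injectable so the choice can be made deterministic in tests.
function pickGracefulFailure(name, messages = SAMPLE_FAILURES, random = Math.random) {
  const template = messages[Math.floor(random() * messages.length)];
  // A replacer function avoids special handling of "$" sequences in the name.
  return template.replace(/\{name\}/g, () => name);
}
```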

8. Logging

V3 uses descriptive retry/turn logging:

[Reggie-V3] Incoming from +61413731705: "What shifts do I have?"
[Reggie-V3] Identified: Sarah Johnson (admin: false, deputyId: 123)
[Reggie-V3] R1/3 T0/4 →: Starting attempt 1/3
[Reggie-V3] R1/3 T1/4 →: Sending to LLM
[Reggie-V3] R1/3 T1/4 ←: needs_response: true, can_answer: true, confidence: high
[Reggie-V3] R1/3 T0/4 ?: Validating response
[Reggie-V3] R1/3 T0/4 ✓: Valid
[Reggie-V3] Response generated in 3421ms

Log format: R{retry}/{maxRetry} T{turn}/{maxTurn} {symbol}: {message}
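A formatter for this log format could be as small as the following sketch. `formatLogLine` is a hypothetical name; the `[Reggie-V3]` prefix is taken from the sample log lines above.

```javascript
// Hypothetical formatter for the R{retry}/{maxRetry} T{turn}/{maxTurn} {symbol}: {message} format.
function formatLogLine(retry, maxRetry, turn, maxTurn, symbol, message) {
  return `[Reggie-V3] R${retry}/${maxRetry} T${turn}/${maxTurn} ${symbol}: ${message}`;
}

console.log(formatLogLine(1, 3, 1, 4, '→', 'Sending to LLM'));
// prints "[Reggie-V3] R1/3 T1/4 →: Sending to LLM"
```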


9. Model Selection

Component         | Model       | Rationale
------------------|-------------|---------------------------------------
Main conversation | gpt-4o      | Better instruction following, smarter
Validation        | gpt-4o-mini | Faster, simpler task

With ~80 users, the cost difference between models is negligible.


10. Database

Conversations Table

comms.reggie_conversations
- phone_number VARCHAR(20)
- identity_id UUID
- staff_id UUID
- staff_name VARCHAR(100)
- direction VARCHAR(10) -- 'inbound' or 'outbound'
- message TEXT
- topic VARCHAR(50)
- sources_used JSONB
- response_time_ms INTEGER
- conversation_id UUID
- created_at TIMESTAMPTZ

Legacy Logging

hr.reggie_runs
- phone_from, message_body
- question_type = 'V3'
- response_text, processing_time_ms
- staff_id, staff_name

11. Files

File                              | Purpose
----------------------------------|--------------------------------------------
backend/services/reggie-sms-v3.js | Main V3 service
backend/services/reggie-sms-v2.js | V2 fallback
backend/services/resource-kb.js   | Handbook/document search
bot/discord-kb-sync.js            | Discord KB sync & search
backend/routes_v1p/sms.js         | Twilio webhook (uses V3, falls back to V2)

12. Comparison: V2 vs V3

Feature            | V2                      | V3
-------------------|-------------------------|------------------------
Model              | gpt-4.1-mini            | gpt-4o
Initial data       | Keyword-guessed         | Everything upfront
Question detection | Regex patterns          | LLM decides
Data fetching      | Parallel keyword search | Mega prompt + on-demand
Validation         | Strict, blocking        | Lenient, advisory
Failure handling   | Generic error           | Personality-consistent
Max iterations     | 5 total                 | 3 retries × 4 turns
Logging            | Basic                   | Retry/turn tracking

13. Fallback Chain

// In sms.js webhook handler
let result;
try {
  result = await reggieSmsV3.handleIncomingSMS(from, body);
} catch (err) {
  // Log the V3 error so fallbacks can be diagnosed, then hand off to V2
  console.error('[reggie-v3] Falling back to V2...', err);
  result = await reggieSmsV2.handleIncomingSMS(from, body);
}

14. Example Flow

Staff texts: "What's the sick leave policy?"

  1. Identify: YP3000 → Sarah Johnson (deputyId: 123)
  2. Mega Prompt: Built with all data
  3. R1/3 T1/4: LLM sees prompt
    • needs_response: true
    • can_answer: false
    • need_data: ["search_documents"]
    • query: "sick leave policy"
  4. Fetch: Search handbook for "sick leave policy"
  5. R1/3 T2/4: LLM sees search results
    • can_answer: true
    • response: "Hi Sarah! Per the handbook, you should notify your supervisor ASAP when calling in sick..."
  6. Validate: Lenient check → Valid
  7. Send: SMS sent, logged to comms.sms_messages and hr.reggie_runs

15. Silent Responses

V3 can return silent: true for messages that don't need responses:

// Emoji or acknowledgment after Reggie already replied
if (!llmResponse.needs_response) {
  return {
    success: true,
    silent: true,
    reasoning: "Message is acknowledgment, no response needed"
  };
}

The webhook handler checks for silent and skips sending SMS.
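On the webhook side, that check could be sketched as follows. `deliverResult` and the injected `sendSms` function are hypothetical stand-ins; the actual handler in sms.js and its Twilio calls are not shown in this document.

```javascript
// Hypothetical sketch: skip sending when the V3 result is marked silent.
// `sendSms` is an injected function (an assumption), not a real Twilio API call.
async function deliverResult(result, to, sendSms) {
  if (result.silent) {
    return { sent: false }; // acknowledgment only, nothing goes out
  }
  await sendSms(to, result.response);
  return { sent: true };
}
```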