Reggie SMS V3 Architecture
Status: Active
Version: 3.0
Last Updated: 2026-01-12
1. Overview
Reggie V3 is a complete rebuild addressing issues discovered in V2:
- Exhausted iterations - V2 would loop requesting the same data
- Complex validation - V2 validation was too strict, rejecting valid responses
- Keyword guessing - V2 guessed which data to fetch based on keywords
V3 uses a clean dual-loop architecture with:
- Mega Prompt - All available data upfront in the first LLM call
- Dual Conversation Model - USER channel (staff↔Reggie) + SYSTEM channel (code↔LLM)
- LLM-Driven Decisions - LLM decides if response needed, not regex patterns
- Lenient Validation - Advisory, not blocking
- Graceful Failures - Personality-consistent error messages
2. Architecture
┌─────────────────────────────────────────────────────────────────┐
│ REGGIE V3 PIPELINE │
├─────────────────────────────────────────────────────────────────┤
│ │
│ ┌────────────────┐ │
│ │ 1. IDENTIFY │ YP3000 lookup + Deputy ID fetch │
│ │ CALLER │ → name, staffId, deputyId, isAdmin │
│ └───────┬────────┘ │
│ │ │
│ ▼ │
│ ┌────────────────────────────────────────────────────────────┐ │
│ │ 2. BUILD MEGA PROMPT (everything upfront) │ │
│ │ ┌─────────────────────────────────────────────────────┐ │ │
│ │ │ • Admin team directory (who handles what) │ │ │
│ │ │ • Staff's leave balances (Xero API) │ │ │
│ │ │ • Staff's upcoming shifts - 14 days (Deputy API) │ │ │
│ │ │ • Staff's recent shifts - 7 days (Deputy API) │ │ │
│ │ │ • Pay info & next payday │ │ │
│ │ │ • Discord channels list │ │ │
│ │ │ • Available document titles │ │ │
│ │ │ • Document topic examples │ │ │
│ │ │ • Conversation history (last 10 messages, 24h) │ │ │
│ │ └─────────────────────────────────────────────────────┘ │ │
│ └───────┬────────────────────────────────────────────────────┘ │
│ │ │
│ ▼ │
│ ┌────────────────────────────────────────────────────────────┐ │
│ │ 3. OUTER LOOP (3 validation retries) │ │
│ │ ┌──────────────────────────────────────────────────────┐│ │
│ │ │ INNER LOOP (4 data-gathering turns) ││ │
│ │ │ Turn 1: LLM sees mega prompt ││ │
│ │ │ Turn 2-4: LLM can request more data ││ │
│ │ │ (search_discord, search_documents, ││ │
│ │ │ search_directory) ││ │
│ │ └──────────────────────────────────────────────────────┘│ │
│ │ │ │ │
│ │ ▼ │ │
│ │ ┌──────────────────────────────────────────────────────┐│ │
│ │ │ VALIDATION (LLM-based, lenient) ││ │
│ │ │ Valid? → Send SMS ││ │
│ │ │ Invalid? → Retry with feedback ││ │
│ │ └──────────────────────────────────────────────────────┘│ │
│ └───────┬────────────────────────────────────────────────────┘ │
│ │ │
│ ▼ │
│ ┌────────────────┐ │
│ │ 4. GRACEFUL │ After 3 retries, send personality- │
│ │ FAILURE │ consistent apology message │
│ └────────────────┘ │
│ │
└─────────────────────────────────────────────────────────────────┘
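The control flow in the diagram can be sketched roughly as follows. This is a minimal illustration, not the code in reggie-sms-v3.js: `runPipeline`, `callLLM`, `fetchData`, `validate`, and the message shapes are hypothetical names injected for the example, and the choice to start a new retry when the inner loop runs out of turns is an assumption.

```javascript
// Hypothetical sketch of the dual-loop pipeline. All collaborators are
// injected, so nothing here depends on the real service code.
const MAX_RETRIES = 3; // outer loop: validation retries
const MAX_TURNS = 4;   // inner loop: data-gathering turns

async function runPipeline({ megaPrompt, callLLM, fetchData, validate }) {
  // SYSTEM channel: code <-> LLM messages, never sent to staff.
  const system = [{ role: 'system', content: megaPrompt }];

  for (let retry = 1; retry <= MAX_RETRIES; retry++) {
    let draft = null;

    for (let turn = 1; turn <= MAX_TURNS; turn++) {
      const reply = await callLLM(system);
      if (!reply.needs_response) return { success: true, silent: true };
      if (reply.can_answer) {
        draft = reply.response;
        break;
      }
      // The LLM asked for more data: fetch it and feed it back next turn.
      const data = await fetchData(reply.need_data, reply.query);
      system.push({ role: 'system', content: `Requested data: ${JSON.stringify(data)}` });
    }

    if (draft !== null) {
      const verdict = await validate(draft);
      if (verdict.valid) return { success: true, response: draft };
      // Invalid: add feedback to the SYSTEM channel and retry.
      system.push({ role: 'system', content: `Validation feedback: ${verdict.feedback}` });
    }
  }

  // All retries used up: the caller sends a graceful failure message.
  return { success: false, gracefulFailure: true };
}
```

Under these assumptions a single incoming SMS triggers at most 3 × 4 = 12 main LLM calls plus 3 validation calls.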
3. Dual Conversation Model
V3 maintains two separate conversation threads:
USER Channel (Visible)
Staff ↔ Reggie SMS conversation
Staff: "How much leave do I have?"
Reggie: "Hi Sarah! You have 42 hours annual leave..."
Staff: "Thanks!"
SYSTEM Channel (Invisible)
Code ↔ LLM orchestration
SYSTEM: Here's the mega prompt with all data...
LLM: { "needs_response": true, "can_answer": true, "response": "..." }
SYSTEM: Validating response...
LLM: { "valid": true }
The LLM sees both channels, but only USER-channel messages are sent to staff as SMS; SYSTEM-channel messages stay internal.
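One way to picture the two threads is as two arrays on a single conversation object. The sketch below is illustrative only: the function names and field names (`createConversation`, `direction`, `from`, and so on) are hypothetical and are not taken from the V3 service.

```javascript
// Hypothetical shape for the two threads; field names are illustrative.
function createConversation() {
  return { user: [], system: [] };
}

// USER channel: what staff and Reggie actually said over SMS.
function addUserMessage(convo, direction, text) {
  convo.user.push({ direction, text }); // direction: 'inbound' | 'outbound'
}

// SYSTEM channel: orchestration between code and the LLM.
function addSystemMessage(convo, from, text) {
  convo.system.push({ from, text }); // from: 'SYSTEM' | 'LLM'
}

// The LLM sees both threads; here USER history is listed first.
function buildLLMContext(convo) {
  const userLines = convo.user.map(
    (m) => `${m.direction === 'inbound' ? 'Staff' : 'Reggie'}: ${m.text}`
  );
  const systemLines = convo.system.map((m) => `${m.from}: ${m.text}`);
  return [...userLines, ...systemLines].join('\n');
}

// Only outbound USER-channel messages are ever candidates for SMS.
function outboundSms(convo) {
  return convo.user.filter((m) => m.direction === 'outbound').map((m) => m.text);
}
```

Keeping the threads separate means validation chatter and data requests can grow on the SYSTEM side without polluting the stored SMS history.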
4. LLM Response Format
V3 uses structured JSON responses from the LLM:
{
  "needs_response": true,            // Does this message need a reply?
  "can_answer": true,                // Can we answer with current data?
  "response": "Hi Sarah! ...",       // The SMS response text
  "need_data": ["search_discord"],   // If more data needed
  "query": "dress code policy",      // Search query for requested data
  "confidence": "high",              // high/medium/low
  "reasoning": "User asked about leave, I have their Xero data"
}
needs_response Logic
The LLM decides if a message needs a response:
- ✅ "What shifts do I have?" → needs_response: true
- ✅ "Thanks mate!" (first ack) → needs_response: true (brief acknowledgment)
- ❌ "👍" (after we already replied) → needs_response: false
- ❌ "Ok cool" (after Reggie answered) → needs_response: false
can_answer Logic
- true → Has enough info to answer; response field populated
- false → Needs more data; need_data and query fields populated
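The needs_response and can_answer rules above amount to a three-way branch: stay silent, reply, or fetch more data. A minimal sketch of that branch follows. `parseLLMResponse` and its return shape are hypothetical, and the sketch assumes the LLM returns strict JSON (the // comments in the format listing are annotations and would not parse).

```javascript
// Hypothetical normalizer for the LLM's JSON reply.
function parseLLMResponse(raw) {
  let data;
  try {
    data = JSON.parse(raw);
  } catch (err) {
    return { ok: false, error: 'not valid JSON' };
  }
  if (typeof data !== 'object' || data === null) {
    return { ok: false, error: 'not a JSON object' };
  }
  if (typeof data.needs_response !== 'boolean') {
    return { ok: false, error: 'needs_response missing' };
  }

  // No reply needed: nothing else has to be populated.
  if (!data.needs_response) return { ok: true, action: 'silent' };

  // can_answer: true requires a non-empty response.
  if (data.can_answer === true) {
    if (typeof data.response !== 'string' || data.response.trim() === '') {
      return { ok: false, error: 'can_answer is true but response is empty' };
    }
    return { ok: true, action: 'reply', response: data.response };
  }

  // can_answer: false requires need_data and query.
  if (!Array.isArray(data.need_data) || data.need_data.length === 0 ||
      typeof data.query !== 'string') {
    return { ok: false, error: 'can_answer is false but need_data/query missing' };
  }
  return { ok: true, action: 'fetch', needData: data.need_data, query: data.query };
}
```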
5. Data Sources
5.1 Pre-fetched in Mega Prompt
| Data | Source | Notes |
|---|---|---|
| Admin team | core_source.org_directory | Who handles what, phone numbers |
| Leave balances | Xero API | Annual, personal, sick leave |
| Upcoming shifts | Deputy API | Next 14 days |
| Recent shifts | Deputy API | Last 7 days |
| Pay info | Static | Next payday, pay frequency |
| Discord channels | comms.discord_channels | Channel list |
| Document titles | resources.documents | Available handbook sections |
| Conversation history | comms.reggie_conversations | Last 10 messages in 24h |
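The "last 10 messages in 24h" rule for conversation history can be read as a time-window filter followed by a count limit. The sketch below shows that reading. `recentHistory` is a hypothetical helper operating on rows that carry a `created_at` field like the one in comms.reggie_conversations; the real service may apply the limit in SQL instead.

```javascript
const HISTORY_LIMIT = 10;                       // at most 10 messages
const HISTORY_WINDOW_MS = 24 * 60 * 60 * 1000;  // no older than 24 hours

// Returns the newest messages inside the window, oldest first.
function recentHistory(messages, now = Date.now()) {
  return messages
    .filter((m) => now - new Date(m.created_at).getTime() <= HISTORY_WINDOW_MS)
    .sort((a, b) => new Date(a.created_at) - new Date(b.created_at))
    .slice(-HISTORY_LIMIT);
}
```

Because `filter` returns a new array, the in-place `sort` does not reorder the caller's input.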
5.2 On-Demand Searches
| Function | What it searches | Returns |
|---|---|---|
| search_discord | Semantic search Discord KB | Q&A context with follow-ups |
| search_documents | Semantic search handbook | Relevant policy chunks |
| search_directory | Text search org directory | Staff contact info |
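When the LLM returns a need_data list, the code has to map each name to one of the three searches above. A minimal dispatcher might look like the sketch below. `runSearches` and the injected `handlers` object are hypothetical, and returning an error entry for an unknown name (rather than throwing) is an assumption.

```javascript
// Function names from the table above; the handlers themselves are injected.
const SEARCH_FUNCTIONS = ['search_discord', 'search_documents', 'search_directory'];

async function runSearches(needData, query, handlers) {
  const results = {};
  for (const name of needData) {
    // Guard against names the LLM invents or that have no handler wired up.
    if (!SEARCH_FUNCTIONS.includes(name) || typeof handlers[name] !== 'function') {
      results[name] = { error: 'unknown data source' };
      continue;
    }
    results[name] = await handlers[name](query);
  }
  return results;
}
```

Reporting unknown names back to the LLM, instead of failing the turn, gives it a chance to correct the request on the next turn.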
6. Validation System
V3 uses lenient LLM-based validation:
// Validation prompt (abbreviated)
`Review this SMS response for SERIOUS quality issues only.
ONLY flag as invalid if:
1. Obviously fake/placeholder info (e.g., "John Doe", "xxx-xxxx")
2. Completely fails to address the question
3. Contains harmful or inappropriate content
DO NOT flag as invalid for:
- Australian phone numbers (02, 04 prefixes are normal)
- Informal but friendly tone
- Minor formatting differences
- Admitting when information isn't available
Be LENIENT. Most responses should be valid.`
Validation Flow
- Response generated → Validate
- Valid? → Send SMS
- Invalid? → Add feedback to system channel, retry
- After 3 retries → Graceful failure
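One possible way to keep validation lenient in code is to reject a response only on an explicit `valid: false` verdict. The sketch below illustrates that idea. `interpretValidation` is a hypothetical helper, the `feedback` field is an assumption (the source only shows `{ "valid": true }`), and treating unreadable validator output as valid is a design guess rather than documented behavior.

```javascript
// Hypothetical interpreter for the validator LLM's raw output.
function interpretValidation(raw) {
  try {
    const verdict = JSON.parse(raw);
    // Only an explicit valid: false counts as a rejection.
    if (verdict && verdict.valid === false) {
      return { valid: false, feedback: String(verdict.feedback || 'no feedback given') };
    }
  } catch (err) {
    // Unreadable validator output is not treated as a rejection.
  }
  return { valid: true };
}
```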
7. Graceful Failure Messages
When V3 exhausts all retries, it sends a personality-consistent apology:
const GRACEFUL_FAILURES = [
  "Apologies {name}, bit of a migraine coming on. Mind checking with the admin team at (02) 8783 0544 while I have a lie down? Cheers!",
  "Sorry {name}, brain's gone fuzzy on me. Try (02) 8783 0544 for the admin crew.",
  "Having a moment {name}! The admin team at (02) 8783 0544 can help while I collect myself.",
  "Drawing a blank here I'm afraid {name}. Give admin a bell on (02) 8783 0544.",
  "My brain's being uncooperative {name}. Try (02) 8783 0544 for the admin team. Sorry about that!",
  // ... 5 more variations
];
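Choosing a message and filling in the {name} placeholder could be done as in the sketch below. `pickGracefulFailure` is a hypothetical helper, random selection is an assumption (the source does not say how a variation is chosen), and the template list is shortened to two of the messages above.

```javascript
// Two of the templates listed above, used here only as sample data.
const SAMPLE_FAILURES = [
  "Sorry {name}, brain's gone fuzzy on me. Try (02) 8783 0544 for the admin crew.",
  "Drawing a blank here I'm afraid {name}. Give admin a bell on (02) 8783 0544.",
];

// `random` is injectable so the choice can be made deterministic in tests.
function pickGracefulFailure(templates, name, random = Math.random) {
  const index = Math.floor(random() * templates.length);
  // A replacer function avoids special handling of "$" in the name.
  return templates[index].replace(/\{name\}/g, () => name);
}
```

For example, `pickGracefulFailure(SAMPLE_FAILURES, 'Sarah')` returns one of the two messages with "Sarah" substituted for the placeholder.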
8. Logging
V3 uses descriptive retry/turn logging:
[Reggie-V3] Incoming from +61413731705: "What shifts do I have?"
[Reggie-V3] Identified: Sarah Johnson (admin: false, deputyId: 123)
[Reggie-V3] R1/3 T0/4 →: Starting attempt 1/3
[Reggie-V3] R1/3 T1/4 →: Sending to LLM
[Reggie-V3] R1/3 T1/4 ←: needs_response: true, can_answer: true, confidence: high
[Reggie-V3] R1/3 T0/4 ?: Validating response
[Reggie-V3] R1/3 T0/4 ✓: Valid
[Reggie-V3] Response generated in 3421ms
Log format: R{retry}/{maxRetry} T{turn}/{maxTurn} {symbol}: {message}
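A small formatter that produces this prefix might look like the following. `formatLog` is a hypothetical helper, and the default limits of 3 and 4 are taken from the sample output above.

```javascript
// Builds one log line in the R{retry}/{maxRetry} T{turn}/{maxTurn} format.
function formatLog({ retry, maxRetry = 3, turn, maxTurn = 4, symbol, message }) {
  return `[Reggie-V3] R${retry}/${maxRetry} T${turn}/${maxTurn} ${symbol}: ${message}`;
}

console.log(formatLog({ retry: 1, turn: 1, symbol: '→', message: 'Sending to LLM' }));
// prints "[Reggie-V3] R1/3 T1/4 →: Sending to LLM"
```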
9. Model Selection
| Component | Model | Rationale |
|---|---|---|
| Main conversation | gpt-4o | Better instruction following, smarter |
| Validation | gpt-4o-mini | Faster, simpler task |
With ~80 users, the cost difference between models is negligible.
10. Database
Conversations Table
comms.reggie_conversations
- phone_number VARCHAR(20)
- identity_id UUID
- staff_id UUID
- staff_name VARCHAR(100)
- direction VARCHAR(10) -- 'inbound' or 'outbound'
- message TEXT
- topic VARCHAR(50)
- sources_used JSONB
- response_time_ms INTEGER
- conversation_id UUID
- created_at TIMESTAMPTZ
Legacy Logging
hr.reggie_runs
- phone_from, message_body
- question_type = 'V3'
- response_text, processing_time_ms
- staff_id, staff_name
11. Files
| File | Purpose |
|---|---|
| backend/services/reggie-sms-v3.js | Main V3 service |
| backend/services/reggie-sms-v2.js | V2 fallback |
| backend/services/resource-kb.js | Handbook/document search |
| bot/discord-kb-sync.js | Discord KB sync & search |
| backend/routes_v1p/sms.js | Twilio webhook (uses V3, falls back to V2) |
12. Comparison: V2 vs V3
| Feature | V2 | V3 |
|---|---|---|
| Model | gpt-4.1-mini | gpt-4o |
| Initial data | Keyword-guessed | Everything upfront |
| Question detection | Regex patterns | LLM decides |
| Data fetching | Parallel keyword search | Mega prompt + on-demand |
| Validation | Strict, blocking | Lenient, advisory |
| Failure handling | Generic error | Personality-consistent |
| Max iterations | 5 total | 3 retries × 4 turns |
| Logging | Basic | Retry/turn tracking |
13. Fallback Chain
// In sms.js webhook handler
let result;
try {
  result = await reggieSmsV3.handleIncomingSMS(from, body);
} catch (err) {
  // V3 threw: log the cause, then let V2 handle the message
  console.log('[reggie-v3] Falling back to V2...', err.message);
  result = await reggieSmsV2.handleIncomingSMS(from, body);
}
14. Example Flow
Staff texts: "What's the sick leave policy?"
- Identify: YP3000 → Sarah Johnson (deputyId: 123)
- Mega Prompt: Built with all data
- R1/3 T1/4: LLM sees prompt
  - needs_response: true
  - can_answer: false
  - need_data: ["search_documents"]
  - query: "sick leave policy"
- Fetch: Search handbook for "sick leave policy"
- R1/3 T2/4: LLM sees search results
  - can_answer: true
  - response: "Hi Sarah! Per the handbook, you should notify your supervisor ASAP when calling in sick..."
- Validate: Lenient check → Valid
- Send: SMS sent, logged to comms.sms_messages and hr.reggie_runs
15. Silent Responses
V3 can return silent: true for messages that don't need responses:
// Emoji or acknowledgment after Reggie already replied
if (!llmResponse.needs_response) {
  return {
    success: true,
    silent: true,
    reasoning: "Message is acknowledgment, no response needed"
  };
}
The webhook handler checks for silent and skips sending SMS.
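On the webhook side, the silent check could look like the sketch below. It assumes the handler replies to Twilio with TwiML in the HTTP response, which the source does not confirm (it may send via the REST API instead). `buildTwiml` and `escapeXml` are hypothetical helpers. An empty `<Response>` element tells Twilio to send no reply.

```javascript
// Escape characters that are special in XML before embedding the SMS text.
function escapeXml(text) {
  const entities = { '<': '&lt;', '>': '&gt;', '&': '&amp;', "'": '&apos;', '"': '&quot;' };
  return text.replace(/[<>&'"]/g, (ch) => entities[ch]);
}

// Turns a V3 result into the TwiML body returned to Twilio.
function buildTwiml(result) {
  if (result.silent) {
    return '<Response></Response>'; // empty TwiML: no SMS is sent
  }
  return `<Response><Message>${escapeXml(result.response)}</Message></Response>`;
}
```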