Communications Plan: Voice, SMS, and Chat

This document outlines the strategy, recommended providers, and implementation plan for integrating real-time communication channels into RABS.

1. Provider Comparison & Recommendation

We reviewed Twilio, ClickSend, and the existing TextMagic service to determine the best “telecom spine” for Reggie.

Capability	Twilio	ClickSend	TextMagic (Current)
Local AU Numbers (Voice + SMS)	Yes, instant provisioning via API.	Yes, dedicated numbers available.	Yes, virtual numbers available.
Programmable Voice (PSTN ↔ WebRTC)	Yes, full IVR, call recording, and SDKs for in-browser calls. This is a core strength.	No, outbound Text-to-Speech (TTS) only. Not suitable for real-time conversational agents.	No, only allows calls through their web UI; no developer-grade voice API.
Omni-channel Chat (Web, SMS, WhatsApp)	Yes, via Twilio Conversations, which unifies multiple channels into a single thread.	No native chat widget.	None.
Pricing (AU Outbound SMS)	~A$0.0515 / msg	~A$0.04 / msg (slightly cheaper for bulk)	~A$0.059 / msg (highest of the three)
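
To put the pricing row in perspective, the per-message rates can be multiplied out for a given volume. The sketch below is illustrative only: the rates are the approximate figures from the table, the 10,000-message volume is a hypothetical example, and it assumes every message fits in a single SMS segment and ignores number rental and carrier fees.

```python
# Approximate per-message rates in AUD, copied from the pricing row above.
RATES_AUD = {
    "Twilio": 0.0515,
    "ClickSend": 0.04,
    "TextMagic": 0.059,
}


def sms_cost(messages: int) -> dict[str, float]:
    """Estimate outbound SMS cost per provider in AUD (single-segment messages assumed)."""
    return {name: round(rate * messages, 2) for name, rate in RATES_AUD.items()}


if __name__ == "__main__":
    volume = 10_000  # hypothetical monthly volume, not a figure from this plan
    cheapest_first = sorted(sms_cost(volume).items(), key=lambda item: item[1])
    for name, cost in cheapest_first:
        print(f"{name:<10} A${cost:,.2f}")
```

At that hypothetical volume the gap between ClickSend and Twilio is about A$115, so the saving only becomes meaningful for genuinely large sends.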

Recommendation

Adopt Twilio as the primary telecom spine. It is the only provider of the three that can handle inbound/outbound voice calls, SMS, and embedded web chat under a single, unified API surface. This simplifies development and provides a single, coherent conversation history.
Keep LiveKit for ultra-low-latency in-browser voice, especially for use cases requiring screen-sharing or multi-party calls. PSTN calls from Twilio can be bridged into LiveKit rooms when necessary.
Use ClickSend selectively for large, one-way bulk notifications (e.g., mass weather alerts) where its slightly lower per-message cost provides a benefit.
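
The three recommendations above amount to a simple routing rule. The following is a minimal sketch of that rule, not an existing RABS module: the channel names, the `choose_provider` function, and the `BULK_THRESHOLD` value are all hypothetical and would need to be agreed with the team.

```python
from enum import Enum


class Provider(Enum):
    TWILIO = "twilio"
    LIVEKIT = "livekit"
    CLICKSEND = "clicksend"


# Hypothetical cut-off for what counts as a "large" bulk send.
BULK_THRESHOLD = 1_000


def choose_provider(channel: str, recipients: int = 1, two_way: bool = True) -> Provider:
    """Pick a provider for one outbound interaction, following the recommendation above."""
    if channel == "browser_voice":
        # Low-latency in-browser audio, screen-sharing, multi-party calls.
        return Provider.LIVEKIT
    if channel in ("pstn_voice", "chat"):
        # Phone calls and embedded web chat stay on the primary spine.
        return Provider.TWILIO
    if channel == "sms":
        # Only large, one-way notifications justify the cheaper bulk route.
        if not two_way and recipients >= BULK_THRESHOLD:
            return Provider.CLICKSEND
        return Provider.TWILIO
    raise ValueError(f"Unknown channel: {channel!r}")
```

For example, `choose_provider("sms", recipients=5_000, two_way=False)` selects ClickSend, while a two-way SMS to a single person stays on Twilio so the reply lands in the same conversation history.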

2. Kick-off Integration Plan (Step-by-Step)

This plan outlines the first two weeks of focused work to build the initial voice and chat pipeline.

Step	Task	Outcome / Rationale
1. Define Success Criteria	List the three most important metrics for the voice experience (e.g., “< 250 ms latency”, “high accuracy for Australian accents”).	Aligns the team and prevents endless technology churn.
2. Spin up a LiveKit Cloud Sandbox	Create a free LiveKit project and copy the API keys into your `.env` file.	Provides a playground for low-latency WebRTC audio without long-term commitment.
3. Wire a “Hello World” Voice Loop	Use the LiveKit Node SDK to capture microphone audio → STT (e.g., AssemblyAI) → LLM Gateway → TTS (e.g., Hume AI) back to user.	Confirms end-to-end plumbing (`mic → text → LLM → speech`) and exposes latency bottlenecks early.
4. Pipe Transcripts into `internalmonologue`	Store each final STT segment and its vector embedding, tagged with `source = 'voice_livekit'`.	Begins populating the RAG knowledge base with real user speech.
5. Add a Text Chat Channel (Twilio Conversations)	Embed Twilio’s JS widget on a dev page and forward messages to the LLM Gateway.	Provides an immediate fallback for users who prefer text and enables A/B comparison of voice vs text.
6. Run a 1-Week Alpha with Staff	Collect subjective ratings (clarity, empathy, fatigue) and note edge cases (crosstalk, strong accents, long pauses).	Real-world usage surfaces issues that lab tests miss (e.g., office noise causing false positives).
7. Decide and Lock In Providers	Compare alpha metrics to the success criteria from Step 1 and select STT/TTS providers for public beta.	Ensures choices are based on evidence, not marketing hype.
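
Steps 1, 3, and 4 can be prototyped before any provider is wired in by timing each stage of the loop against stand-in functions. The sketch below is provider-agnostic and makes several assumptions: the `stt`, `llm`, and `tts` callables are hypothetical placeholders for the real AssemblyAI, LLM Gateway, and Hume AI calls; the 250 ms budget is the example figure from Step 1; and the row shape for `internalmonologue` is a guess, with the embedding left out.

```python
import time
from dataclasses import dataclass, field
from typing import Callable

LATENCY_BUDGET_MS = 250.0  # example target from Step 1, not a measured value


@dataclass
class TurnResult:
    transcript: str
    reply_text: str
    reply_audio: bytes
    timings_ms: dict[str, float] = field(default_factory=dict)

    @property
    def total_ms(self) -> float:
        return sum(self.timings_ms.values())

    @property
    def within_budget(self) -> bool:
        return self.total_ms <= LATENCY_BUDGET_MS


def _timed(label: str, fn: Callable, arg, timings: dict[str, float]):
    """Run one stage and record how long it took, in milliseconds."""
    start = time.perf_counter()
    result = fn(arg)
    timings[label] = (time.perf_counter() - start) * 1000.0
    return result


def run_turn(
    audio: bytes,
    stt: Callable[[bytes], str],
    llm: Callable[[str], str],
    tts: Callable[[str], bytes],
) -> TurnResult:
    """One pass of mic -> text -> LLM -> speech, with per-stage timings."""
    timings: dict[str, float] = {}
    transcript = _timed("stt", stt, audio, timings)
    reply_text = _timed("llm", llm, transcript, timings)
    reply_audio = _timed("tts", tts, reply_text, timings)
    return TurnResult(transcript, reply_text, reply_audio, timings)


def to_monologue_row(result: TurnResult) -> dict:
    """Shape a final transcript for storage; the embedding would be added separately."""
    return {"text": result.transcript, "source": "voice_livekit"}


if __name__ == "__main__":
    # Stand-in stages so the loop runs without network access.
    result = run_turn(
        b"hello",
        stt=lambda audio: audio.decode("utf-8"),
        llm=lambda text: f"You said: {text}",
        tts=lambda text: text.encode("utf-8"),
    )
    print(result.timings_ms, "within budget:", result.within_budget)
    print(to_monologue_row(result))
```

Swapping a stand-in for a real provider call leaves the rest of the loop unchanged, so the per-stage timings collected during the alpha can be compared directly with the criteria from Step 1.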
