System Architecture Diagram
- When a customer call comes in, the call enters the SoftSwitch dialplan. Within the dialplan, a
curlcommand is executed to call an API endpoint provided by EasyCallCenter365, sending the call UUID, caller number, callee number, and other related information.
At this point, the call is taken over and managed by EasyCallCenter365. - EasyCallCenter365 starts call recording and attempts to establish a connection with the large language model (LLM).
- The LLM returns responses through a streaming HTTP response. While EasyCallCenter365 receives the text stream, it simultaneously sends the text to theĀ
speakcommand for text-to-speech synthesis. - After the SoftSwitch TTS module receives the speech synthesis command, it extracts the text parameters, connects to the TTS server, and sends a speech synthesis request.
- Since text is continuously generated throughout the process, the TTS module sends text for synthesis while simultaneously receiving the synthesized audio stream, decoding the audio, and playing it back in real time.
-
The ASR module is responsible for sending the audio stream and receiving the speech recognition result text.
Throughout the entire call, the SoftSwitch ASR module continuously performs real-time speech recognition.
- Through Event Socket messages, SoftSwitch sends the recognized speech text back to EasyCallCenter365.
- EasyCallCenter365 packages the speech recognition result together with the previous conversation context and sends it to the LLM.
- The process then continues in a loop until the phone call ends.

No comments to display
No comments to display