At a high level, Evo Voice is an application service running in the Azure cloud that communicates with Twilio over HTTP to perform call routing and control. End-user applications and devices such as SIP phones, the HostedSuite console, and the Voice app communicate with the Evo Voice cloud and the Twilio cloud to provide the standard VoIP services commonly associated with PBXs.
Twilio (is not a Phone System)
It's important to understand that Twilio is not a phone system. It is a platform that provides programmatic/API control of calls.
Twilio can be thought of as occupying a spot above your typical carrier but well below your traditional PBX. It provides very basic call services via an API and relies upon developers to build applications on top of it.
Twilio communicates with third-party applications (such as Evo Voice) over HTTP, which allows for only a very restricted set of voice features.
Twilio provides the very basic services that are needed in order to build voice applications. The list of services that Twilio provides includes:
- Incoming phone numbers - Twilio allows you to purchase phone numbers in various geographic regions and associate them with an HTTP endpoint that is called whenever the phone rings (or an SMS is received)
- Text to Speech - Twilio provides basic text to speech services that applications can use to "talk" to the caller
- Audio/Video conferences - Twilio provides the ability to move incoming calls into conference rooms where other calls can be joined to create a conference call.
- Dial - Twilio provides the ability to ring a WebRTC endpoint (such as a web browser or a mobile device via its SDK) or a SIP endpoint (such as a Polycom handset)
- Record - Twilio provides the ability to record the audio that comes over a call and to then do something with that audio
- Play - Twilio allows you to play recorded audio over an established call
- Gather Digits - Twilio can notify an application when the caller dials digits on their handset
- Client SDKs - Twilio provides WebRTC SDKs for web browsers, iOS and Android devices that are able to connect to the Twilio servers and receive calls (when the above Dial is used)
- Call Recording - Twilio provides the ability to record calls and then notify an application with the URL to the audio file
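To make the HTTP-driven model concrete, here is a minimal sketch of how an application might answer one of Twilio's HTTP requests with TwiML, combining the Text to Speech and Gather Digits services from the list above. The function name, the menu wording, and the "/menu-choice" URL are assumptions made for illustration; they are not part of Evo Voice. Only the Python standard library is used to build the XML.

```python
import xml.etree.ElementTree as ET


def build_menu_twiml(action_url: str) -> str:
    """Return TwiML that reads a short menu and waits for one keypress.

    action_url is a hypothetical application endpoint that Twilio would
    call with the digit the caller pressed.
    """
    response = ET.Element("Response")

    # <Gather> listens for keypad input and sends it to action_url.
    gather = ET.SubElement(
        response, "Gather", numDigits="1", action=action_url, method="POST"
    )

    # A <Say> nested inside <Gather> is spoken while Twilio listens.
    prompt = ET.SubElement(gather, "Say")
    prompt.text = "Press 1 for sales, or 2 for support."

    # These verbs run only if the gather ends without any input.
    fallback = ET.SubElement(response, "Say")
    fallback.text = "We did not receive any input. Goodbye."
    ET.SubElement(response, "Hangup")

    return ET.tostring(response, encoding="unicode")


if __name__ == "__main__":
    print(build_menu_twiml("/menu-choice"))
```

Everything beyond this, such as what "sales" and "support" actually mean and who gets rung, is application logic that lives outside Twilio.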
Although the above list may seem small compared to the established PBX vendors of the world, Twilio was designed to be leveraged by application developers to build (potentially complicated) applications on top of it.
Limitations
Although the title of this section is limitations, it is important to note that these are not limitations in the traditional sense: Twilio was not designed to compete with the PBX vendors of the world. Instead, it is an API that developers can leverage to build applications.
Examples of features that Twilio does not provide out of the box include:
- Voicemail - Twilio does not have any concept of voicemail, although it can easily be built using the Play and Record features plus a little bit of programmer magic
- Conference Rooms - while Twilio does support conferencing multiple calls together, it has no concept of rooms, passwords, members, etc. - these all require developer work to make happen
- Hold, Transfer, Blind Transfer, etc. - Twilio only supports the basic call control primitives, namely answer and disconnect. To provide more traditional features such as Transfer, developers must implement them
- Account codes/authentication - Twilio does not handle any account codes or authentication; these must all be handled by applications
- Extensions - Twilio does not have the concept of an extension. When a user picks up their SIP phone (or dials via WebRTC), what happens next is entirely under application control
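As an illustration of how an application fills one of these gaps, the following is a rough sketch of voicemail assembled from the Play and Record primitives. It is not Evo Voice's implementation. The function names, URLs, and the in-memory mailbox list are hypothetical; RecordingUrl and RecordingDuration are the parameters Twilio sends to the Record action URL.

```python
import xml.etree.ElementTree as ET


def build_voicemail_twiml(greeting_url: str, done_url: str, max_seconds: int = 120) -> str:
    """Return TwiML that plays a greeting and then records a message."""
    response = ET.Element("Response")

    # Play the mailbox owner's pre-recorded greeting.
    play = ET.SubElement(response, "Play")
    play.text = greeting_url

    # Record the caller; Twilio requests done_url when recording ends.
    ET.SubElement(
        response,
        "Record",
        action=done_url,
        method="POST",
        maxLength=str(max_seconds),
        playBeep="true",
    )

    # Reached only if no audio was recorded.
    ET.SubElement(response, "Hangup")
    return ET.tostring(response, encoding="unicode")


def handle_recording_done(form: dict, mailbox: list) -> str:
    """Store the recording reference Twilio sent, then end the call.

    mailbox is a stand-in for whatever storage the application uses.
    """
    mailbox.append(
        {
            "from": form.get("From"),
            "url": form.get("RecordingUrl"),
            "seconds": int(form.get("RecordingDuration", "0")),
        }
    )
    response = ET.Element("Response")
    ET.SubElement(response, "Hangup")
    return ET.tostring(response, encoding="unicode")
```

Mailbox ownership, message notifications, and playback are all concepts the application has to add; Twilio only supplies the audio file and its URL.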
There are many more traditional phone system features that do not come out of the box with Twilio; instead, Twilio requires applications to fill in the gaps with their own functionality.
The above lists may make it seem like Twilio is a very poor choice for a PBX platform, but the reality is that Twilio combined with an application such as Evo Voice can fulfill the vast majority of the features that traditional phone systems offer at a fraction of the cost.
Because Twilio has focused on providing outstanding global voice coverage and a simple, focused API, it has created an opening for vendors such as Evo to build a communication platform on top of it and charge significantly less.
Per the diagram above, Microsoft Azure is the second big piece of the Evo Voice architecture.
At its core, Microsoft Azure is a managed cloud that offers global reach and high scalability for applications such as Evo Voice.
At a technical level, Evo Voice is composed of the following Azure components:
- App Service
- Web Site
- MongoDB Atlas
- Redis Cache
App Service/Web Site
Azure app services (web sites) are redundant, distributed applications that are accessible via HTTP and support features such as auto scaling, hot swapping, etc.
The core Evo Voice application is an Azure App Service that is accessed via its web service (HTTP) interface by Twilio, browsers, apps, etc. It is configured for automatic scaling so that as load increases, more instances of it will automatically be created.
MongoDB Atlas is a geo-distributed database that automatically scales as needed and handles geographic distribution of data so that the app service always has a close source of data (reducing latency).
Evo Voice Scalability
Evo Voice was designed from the beginning to be able to scale depending upon load and geographic need. This is possible due to the following technologies:
- App Services can be configured to automatically start new instances as load demands. Additionally, Azure's geographic routing can automatically send client requests to the closest app service instance
- MongoDB Atlas supports geographic distribution, and as clients go live in new regions, we can configure it to replicate to those regions, giving the local app service fast access to data
A Simple Example
In order to understand the basic architecture, we will use an example of an external party making a call to a client's incoming phone number that will ring the operators.
(Before the call)
- The operator logs into their console
- Evo Voice initializes the Twilio Voice browser SDK and registers for incoming calls
- The Twilio Voice SDK sets up the microphone/speaker and establishes the necessary HTTP communication with the Twilio cloud so that it can receive incoming calls sent to it via the Dial command
- The operator is now ready to take calls
- The caller dials the 10-digit number from their mobile device
- This call arrives at the Twilio servers and Twilio looks up the HTTP endpoint associated with the phone number (this will be an Evo Voice URL)
- Twilio makes an HTTP request to the Evo Voice app servers running in the Azure cloud
- Evo Voice determines the flow associated with this phone number and begins executing the flow
- The flow reaches a Dial node, which Evo Voice translates into Twilio's TwiML (XML) language; it responds to Twilio with an instruction to dial the operator user that signed in above
- Twilio gets this HTTP response back and establishes the WebRTC connection to the operator's browser
- HostedSuite shows a red box indicating that there is a new ringing call and performs a screen pop based on the 10-digit number that was dialed
- The operator clicks the Answer button
- The operator is now talking to the caller
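The server side of the steps above can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not Evo Voice's actual code. The FLOWS table, the phone number, the "operator_1" client identity, and the function name are all hypothetical. The To parameter is one of the form-encoded fields Twilio includes in its voice webhook request.

```python
import xml.etree.ElementTree as ET
from urllib.parse import parse_qs

# Hypothetical flow table: incoming phone number -> ordered flow nodes.
FLOWS = {
    "+15551230000": [{"node": "dial", "client": "operator_1"}],
}


def handle_incoming_call(body: str) -> str:
    """Turn the form-encoded body of Twilio's HTTP request into TwiML."""
    # Keep the first value for each form field.
    params = {key: values[0] for key, values in parse_qs(body).items()}

    # Look up the flow associated with the number that was dialed.
    flow = FLOWS.get(params.get("To", ""))

    response = ET.Element("Response")
    if flow is None:
        # No flow is configured for this number, so refuse the call.
        ET.SubElement(response, "Reject")
        return ET.tostring(response, encoding="unicode")

    for step in flow:
        if step["node"] == "dial":
            # <Dial><Client> asks Twilio to ring a registered SDK client,
            # such as the operator's browser.
            dial = ET.SubElement(response, "Dial")
            client = ET.SubElement(dial, "Client")
            client.text = step["client"]

    return ET.tostring(response, encoding="unicode")


if __name__ == "__main__":
    print(handle_incoming_call("CallSid=CA123&From=%2B15559870000&To=%2B15551230000"))
```

In a real deployment this function would sit behind an HTTP endpoint, and the flow lookup would come from the database rather than an in-memory dictionary.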
It's important to note that all of the above functionality happened via the Twilio APIs and the Evo Voice/HostedSuite UIs. Twilio is a set of APIs, not a user interface or a PBX.
Evo Voice is a PBX-like application built on top of the Azure cloud and the Twilio voice API. It makes use of the Twilio voice primitives and a lot of custom code in order to provide an easy-to-manage, highly scalable, hosted VoIP platform.
Twilio provides the basic voice capabilities exposed via an API, but it's important to note that it is not a PBX and does not intend to be. Twilio is first and foremost an API that allows programmatic control of voice calls through a simple set of primitives.