Phone Call Transcriptions is a feature that allows you to have your calls transcribed into a PDF by our network.
Transcription is built into our Call Recording Feature and can be enabled on a phone number by phone number basis.
By default, the Skyetel Transcription service costs $0.05/min (in 60 second increments) to transcribe a call. However, Transcription requires calls to be recorded (we have to send the recorded calls to the Transcription Engine). Because of this, you will also be billed the $0.0025/min Call Recording rate in addition to the Transcription rate. This is true even if you select "Only transcribe calls" from the phone number options. Therefore, the total cost to transcribe a call will be $0.0525/min in 60 second increments.
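As a rough illustration of the arithmetic above, here is a minimal Python sketch. The function name is hypothetical, and it assumes that "60 second increments" means a partial minute is rounded up to a full minute; your invoice is the authoritative source.

```python
from decimal import Decimal

# Per-minute rates quoted in the text above.
TRANSCRIPTION_RATE = Decimal("0.05")
RECORDING_RATE = Decimal("0.0025")


def transcription_cost(duration_seconds: int) -> Decimal:
    """Estimate the total cost of transcribing one call.

    Assumption: billing rounds any partial minute up to a full minute.
    """
    billed_minutes = (duration_seconds + 59) // 60  # round up to whole minutes
    return billed_minutes * (TRANSCRIPTION_RATE + RECORDING_RATE)
```

Under that assumption, a call of 2 minutes 30 seconds would be billed as 3 minutes, or 3 x $0.0525 = $0.1575.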
The Transcription UI allows you to choose between displaying the calls in either a Flow format or Standard format. By default, your calls are displayed in the Flow format.
This is one of our favorite parts of this feature! We've introduced a new concept for displaying Transcribed Calls that we call "Flow."
The fundamental principle guiding this technique is that people do not speak on the phone as they do on walkie-talkies; there are no clear pauses in phone calls and people do not take turns speaking. People regularly talk over each other, or add contextual noises like "yep" or "mhmm" to indicate they are listening. Since this regularly happens while the other participant is still speaking, it is hard to display that noise as if the caller were speaking during "their turn."
As such, Flow is formatted by the cadence of the call rather than the turn of the speaker. We measure the cadence of the call by the silence shared in the call by both parties. Therefore, each break in the "flow" of the call occurs when the callers shared a moment of silence. You can identify each participant by the color of the text.
In our testing, we found that the concept of a phone call cadence varied by listener/reader (which was interesting by itself!). To account for the variable way people experience conversational cadences, we allow you to select what period of silence you want to use for the break. By default we use 0.3 seconds, the value favored by most people. However, you can adjust this to any value and make the conversation easier to read based on your own particular style.
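To make the idea of breaking on shared silence more concrete, here is a minimal Python sketch. It is not our actual speech processor: the Word type, the flow_segments function, and the assumption that every word carries start and end timestamps are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class Word:
    speaker: str   # hypothetical label, e.g. "John" or "Sue"
    start: float   # seconds from the start of the call
    end: float
    text: str


def flow_segments(words, silence_threshold=0.3):
    """Group words into segments separated by silence shared by both parties.

    A new segment starts only when nobody has been speaking for at least
    silence_threshold seconds (0.3 is the default mentioned above).
    """
    segments = []
    current = []
    latest_end = 0.0  # latest moment at which anyone was still speaking
    for word in sorted(words, key=lambda w: w.start):
        if current and word.start - latest_end >= silence_threshold:
            segments.append(current)
            current = []
        current.append(word)
        latest_end = max(latest_end, word.end)
    if current:
        segments.append(current)
    return segments
```

In this sketch, a "mhmm" spoken while the other party is still talking stays in the same segment as the words around it, and raising silence_threshold produces fewer, longer segments.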
Standard view displays the call as though the parties took turns speaking. We don't like this method very much, but you might!
This view is a much more conventional method to display text, but we feel it is not an accurate representation of the call. The main exception to this would be in circumstances where callers are expected to speak in turn (such as an earnings call or conference bridge). In those cases, the Standard view may perform better.
Things to note
In order to transcribe calls accurately, we transcribe each leg of the call separately and then merge them together. This means that in a conversation between John and Sue, John's portion of the audio is transcribed and then merged with Sue's portion of the transcribed audio. This allows us to easily identify which speaker was talking and makes the transcription significantly more accurate. However, because we perform the transcriptions in this way, our Transcription service will not work well where more than 2 parties are on a call. (The calls will complete successfully, and the parties will not notice any audio issues. However, the transcription for the third party may either be missing or merged with one of the other two channels.)
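The merge step described above can be pictured with a short Python sketch. This is only an illustration, not our implementation: the merge_legs function and the (start time, text) input format are assumptions, and each leg is assumed to be already sorted by start time.

```python
import heapq


def merge_legs(legs):
    """Merge separately transcribed legs into one timeline.

    legs maps a speaker label to a list of (start_seconds, text) tuples,
    each list already sorted by start time (an assumption of this sketch).
    Returns (start_seconds, speaker, text) tuples in chronological order.
    """
    tagged = [
        [(start, speaker, text) for start, text in phrases]
        for speaker, phrases in legs.items()
    ]
    return list(heapq.merge(*tagged))
```

Because the speaker label in this sketch comes from the leg itself rather than from the audio, a third voice sharing one of the two legs has no label of its own, which matches the limitation described above.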
There can be a considerable lag, up to about 4 hours, between the time a call completes and the time its transcription is available. This is caused primarily by the Transcription Engine processing the call and our speech processor creating the "Flow" of the call.
While the laws surrounding call Transcriptions do differ from the laws surrounding call Recording, we are not lawyers. If you have any questions about the legality of recorded calls, please consult your legal counsel.