How do Interprefy Captions work?

Frequently asked questions about Interprefy's automatic captioning services for meetings and events

Captions are on the rise: more than 80% of Netflix members use closed captions or subtitles at least once a month. We give meeting and event organisers a powerful tool to deliver a live captioning experience in multiple languages during their online or hybrid events.

What we offer:

  • ASR Captions: Automatic closed captions, rendering the speaker's audio into text in real time.
    • Glossary function: Enhance caption accuracy by pre-loading the system with specific terms, brand names, or acronyms.
    • Linguistic support: Vetted professional linguists support you in preparing the glossary for your event.
  • MT Captions: Real-time machine-translated captions, rendering the content into a different language automatically.

Interprefy Captions are generated from the live audio of each speaker (and interpreter, if active) using Automated Speech Recognition (ASR) technology powered by Artificial Intelligence (AI).

This speech-to-text processing renders text directly from the words being spoken, giving the audience a visual aid to follow the speech. Just like interpretation, the live transcription appears slightly after the speaker has delivered their words.

Additionally, we offer machine-translated captions that render the automatic captions into another language in real time.
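For technically minded readers, the pipeline can be thought of as two chained stages: ASR first transcribes the speech, and MT then translates the resulting text. The sketch below is purely illustrative; the function names and stub data are hypothetical and do not reflect Interprefy's actual systems:

```python
# Illustrative two-stage captioning pipeline (hypothetical, not Interprefy's API).

def asr_transcribe(audio_chunk: str) -> str:
    # Stand-in for an ASR engine: a lookup keyed by a dummy audio-chunk ID.
    fake_asr_engine = {"chunk-001": "welcome to the conference"}
    return fake_asr_engine.get(audio_chunk, "")

def mt_translate(text: str, target_lang: str) -> str:
    # Stand-in for a machine-translation engine (English to German only).
    fake_dictionary = {"welcome to the conference": "willkommen zur Konferenz"}
    return fake_dictionary.get(text, text) if target_lang == "de" else text

def live_caption(audio_chunk: str, target_lang: str = "") -> str:
    caption = asr_transcribe(audio_chunk)   # Stage 1: ASR captions
    if target_lang:                         # Stage 2: optional MT captions
        caption = mt_translate(caption, target_lang)
    return caption

print(live_caption("chunk-001"))        # -> welcome to the conference
print(live_caption("chunk-001", "de"))  # -> willkommen zur Konferenz
```

In a real deployment both stages run as streaming services, which is why captions trail the speaker slightly, as described above.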

How it works

[Diagrams: how automatic captioning works (MT; RSI with ASR and MT)]

Frequently Asked Questions

 

1. What is the difference between ASR captions and MT captions? 

ASR = Automated Speech Recognition. This AI-powered technology automatically recognises the speech and transcribes it into text in real time.

MT = Machine Translation. This AI-powered technology automatically translates the speech from one language into another and displays it as text in real time.

 

2. What are the differences between ASR captions from interpreters and machine-translated captions? 

This comes down to the key differences between machine translation and human interpretation. Conference interpreters will always strive to convey the message of the speaker, and may paraphrase, while machine translation aims for a complete translation of the sentences spoken.

ASR captions from interpreting audio are used in conferences involving simultaneous interpretation and are in sync with the interpreted audio.

Because the captions are based on a professional live audio translation from a vetted and subject-savvy conference interpreter, the speech is translated by taking cultural aspects, context, and tone of voice into consideration. 

MT captions provide a complete translation of the sentences spoken.

 

3. Are Interprefy Captions available for events or meetings without simultaneous interpretation? 

Yes, Interprefy Captions with machine translation can be used for events without interpretation. We can also combine simultaneous interpretation (spoken or signed) with ASR and MT captions in the same event.

 

4. Can a user select to listen to the floor audio language and read captions in a different language? 

Yes. 

 

5. Why do I need live captions? 

Captions are especially useful for delegates and attendees who are unable to hear what is being said, who prefer to read rather than listen, or who need visual reinforcement. There are a number of benefits for event organisers and content creators, which we outline in detail in this blog article.

Example users include: 

  • The deaf and hard-of-hearing, who can follow the dialogue in written form with the aid of captioning. 
  • People who wish to follow the discussion but are in a location where another dialogue is taking place. 
  • Individuals in a noisy environment, such as a café, who wish to follow the event even when listening conditions are poor. 
  • Those who wish to have a readable feed to back up their understanding of what is being said. For instance, at a chemistry conference where complex formulas are being voiced, it is sometimes useful to have a readable text feed alongside the spoken words. 
  • Those attending (but not contributing) in areas of poor network connection where audio feeds may be unreliable. 

 

6. How do I make sure the captions are accurately reflecting the words of the speaker? 

The words and terms spoken by the speaker or interpreter are automatically recognised by AI technology. For the system to recognise the speech, good source audio quality is essential.

Limitations of automatic speech recognition:

  • Background noise
  • Volume and clarity of the speaker's voice
  • Specialised vocabulary and heavy accents

As with any multilingual meeting, we recommend educating speakers about the importance of high audio quality and clear, precise, and paced speech.  

Populating the glossary before the event further supports transcription accuracy and is essential for key terms, acronyms, and names to be spelled correctly. 

 

7. What is the delay for captions to appear on screen? 

Interprefy Captions can be enabled in two different modes. By default, the text appears within 4 seconds of the speaker completing a sentence. If 'instant mode' is activated, text appears in real time with instant auto-correction. 

 

8. Are Interprefy Captions available in the Interprefy mobile app? 

Yes, the Interprefy mobile app also supports captions. This is particularly useful for audiences at a venue accessing live captions on their mobile. 

 

9. Are Interprefy Captions available in events using the Interprefy Select widget on a third-party platform? 

Captions are available on Interprefy Connect, Interprefy Connect Pro and selected third-party platforms. Please connect with an Interprefy representative to discuss availability on your preferred platform. 

 

10. Which languages are Interprefy captions available in? 

ASR captions are currently available in the following languages: Arabic, Bulgarian, Catalan, Croatian, Czech, Danish, Dutch, English, Finnish, French, German, Greek, Hindi, Hungarian, Italian, Japanese, Korean, Latvian, Lithuanian, Malay, Mandarin, Norwegian, Polish, Portuguese, Romanian, Russian, Slovak, Slovenian, Spanish, Swedish, and Turkish. 

MT captions are currently available from English into the following languages: Danish, Dutch, Italian, French, German, Norwegian, Portuguese, Russian, Spanish, Swedish 

  • If the spoken language is not English, an interpreter can translate into English, allowing the translation engines to translate from English into one or more of the languages above.

11. What is the pricing model for adding Interprefy Captions to my meeting or event? 

Interprefy Captions are available as a cost option. Pricing depends on two factors, the number of languages required and the event duration, and involves the following cost items: 

  • Set-up fee per floor language (including support from a professional linguist) 
  • Streaming fee per hour, per language 

 

12. Can I have both the audio and text appear at the same time? 

Yes, of course. Many users like to have written reinforcement of what’s being said, while others prefer to read the content only. Users can turn captions on or off at any time and adjust text size and colour. 

 

13. Are captions available as transcripts after the event? 

Yes, transcripts of the captions can be made available after the event. 

 

14. Which translation engine does Interprefy use? 

We don’t use a single translation engine, but hand-select engines for each language pair. Our AI Delivery team continuously tests and compares leading translation engines to ensure that for each language combination, we pick the best-performing engine.