Using Microsoft's Speech to Text REST API? Don't use it on Chrome
I have been mentoring at a couple of hackathons recently.
Super fun. Wrote about the experience here.
There was a very gritty problem that I helped a team with. The solution wasn’t very obvious, but I feel like it should be.
I’m writing about it here in the hope of making it more obvious.
Context
A team was using Microsoft’s Speech to text REST API.
Without getting too deep into the details, the team wanted to record audio using the browser’s MediaRecorder, and send the audio file directly to the API. The following illustration depicts the plan.
As shown in the Microsoft documentation, the API supports either audio/wav; codecs=pcm or audio/ogg; codecs=opus.
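The plan above can be sketched roughly as follows. This is only an illustration, not the team's actual code: buildRecognitionRequest is a hypothetical helper, and the endpoint and key are placeholders you would take from your own Speech resource. Ocp-Apim-Subscription-Key is the header Azure Cognitive Services REST endpoints commonly use for key authentication, but check the current documentation before relying on it.

```javascript
// Hypothetical helper: builds the arguments for fetch() without sending anything.
// `endpoint` and `subscriptionKey` are placeholders supplied by the caller.
function buildRecognitionRequest(audioBody, { endpoint, subscriptionKey, contentType }) {
  return {
    url: endpoint,
    init: {
      method: 'POST',
      headers: {
        'Ocp-Apim-Subscription-Key': subscriptionKey,
        // Should describe what the bytes really are, not what we wish they were.
        'Content-Type': contentType,
        'Accept': 'application/json',
      },
      body: audioBody,
    },
  };
}

// Usage in the browser (assumes `chunks` were collected from the recorder's
// `dataavailable` events and `recorder` is the MediaRecorder instance):
//   const blob = new Blob(chunks, { type: recorder.mimeType });
//   const { url, init } = buildRecognitionRequest(blob, {
//     endpoint: YOUR_ENDPOINT,
//     subscriptionKey: YOUR_KEY,
//     contentType: recorder.mimeType,
//   });
//   const response = await fetch(url, init);
```

Keeping the request-building separate from the actual fetch call makes it easy to log the Content-Type you are about to send, which is exactly the detail that matters in this story.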
The Problem
The team tried both, but neither worked.
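// Attempt 1: Ogg container with the Opus codec
const mimeType = { 'type' : 'audio/ogg; codecs=opus' }
// Attempt 2 (tried separately): WAV container with PCM
const mimeType = { 'type' : 'audio/wav; codecs=pcm' }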
The API kept returning 400 Invalid Request in both cases.
The Reason
Turns out, Chrome’s MediaRecorder doesn’t support either format.
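One way to catch this early is to compare what the recorder says it is producing against what the API accepts. Here is a minimal sketch under a few assumptions: ACCEPTED_BY_API just restates the two formats listed above, normalizeMimeType and isAcceptedByApi are hypothetical helper names, and the WebM string in the example is what I believe Chrome typically reports for audio, so treat it as an assumption rather than a guarantee.

```javascript
// The two formats the post says the API accepts.
const ACCEPTED_BY_API = ['audio/wav; codecs=pcm', 'audio/ogg; codecs=opus'];

// Lowercase and strip whitespace so 'audio/ogg;codecs=opus' and
// 'audio/ogg; codecs=opus' compare equal.
function normalizeMimeType(mimeType) {
  return mimeType.toLowerCase().replace(/\s+/g, '');
}

// True only if the recorder's reported type matches one of the accepted formats.
function isAcceptedByApi(recorderMimeType) {
  const actual = normalizeMimeType(recorderMimeType);
  return ACCEPTED_BY_API.some((t) => normalizeMimeType(t) === actual);
}

// In the browser you would pass `recorder.mimeType` here.
console.log(isAcceptedByApi('audio/webm;codecs=opus')); // false
console.log(isAcceptedByApi('audio/ogg; codecs=opus')); // true
```

If this returns false, labelling the upload with a different Content-Type will not help, because the bytes themselves are still in the format the browser chose.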
To check whether a format and codec combination is supported, run the following in your DevTools console.
MediaRecorder.isTypeSupported('audio/ogg; codecs=opus')
You can check other combinations the same way.
MediaRecorder.isTypeSupported('audio/wav; codecs=pcm')
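The same check can be done in code instead of by hand in the console. The sketch below picks the first format the current browser can record. pickSupportedMimeType is a hypothetical helper name, and the optional second argument exists only so the function can be tried outside a browser; in a real page it falls back to MediaRecorder.isTypeSupported.

```javascript
// Returns the first candidate the predicate accepts, or null if none match.
// `isSupported` defaults to MediaRecorder.isTypeSupported when that exists
// (i.e. in browsers); elsewhere nothing is considered supported.
function pickSupportedMimeType(candidates, isSupported) {
  const check = isSupported ||
    (typeof MediaRecorder !== 'undefined'
      ? (t) => MediaRecorder.isTypeSupported(t)
      : () => false);
  return candidates.find((t) => check(t)) ?? null;
}

const candidates = ['audio/ogg; codecs=opus', 'audio/wav; codecs=pcm'];
const supported = pickSupportedMimeType(candidates);

if (supported === null) {
  // None of the formats the API accepts can be recorded here.
  console.warn('This browser cannot record a format the API accepts.');
} else {
  console.log('Recording with', supported);
}
```

Failing loudly when nothing matches would have surfaced the real cause right away, rather than leaving a 400 from the API as the only clue.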
The Solution
Use a browser that supports the format & codec.
Turns out Firefox supports audio/ogg; codecs=opus.
MediaRecorder.isTypeSupported('audio/ogg; codecs=opus'); // false on Chrome, true on Firefox
Go download it if you need a workaround. https://www.mozilla.org/en-CA/firefox/new/
References
The golden blog post that saved the night. Thank you Nick!