Deepgram, a company developing speech recognition technology for the enterprise, today raised $47 million in new funding led by Madrona Venture Group with participation from Citi Ventures and Alkeon. The round, an expansion of Deepgram’s Series B that launched in February 2021 under lead investor Tiger Global, brings the startup’s total raised to $86 million, which CEO Scott Stephenson says will be spent on R&D in areas such as emotion detection, intent recognition, summarization, topic detection, translation and editing.
“We are pleased that Deepgram achieved its highest ever pre- and post-money valuation, despite challenging market conditions,” Stephenson told londonbusinessblog.com in an email interview. (Unfortunately, he declined to reveal exactly what the valuation was.) “We believe Deepgram is in a strong position to thrive in this more challenging macroeconomic environment. Deepgram’s speech AI is the core technology behind many of our customers’ applications, and the demand for speech understanding is growing as companies look for greater efficiencies.”
Launched in 2015, Deepgram focuses on building custom speech recognition solutions for clients such as Spotify, Auth0 and even NASA. The company’s data scientists collect, create, label and evaluate speech data to produce speech recognition models that can understand brands and jargon, capture a range of languages and accents, and adapt to challenging audio environments. For example, Deepgram built a model for NASA to transcribe communications between Mission Control and the International Space Station.
“Audio data is one of the world’s largest untapped data resources. [But] it is difficult to use in its audio format because audio is an unstructured data type and therefore cannot be mined for insights without further processing,” said Stephenson. “Deepgram takes unstructured audio data and structures it as text and metadata at high speeds and low cost, designed for enterprise scale… [W]ith Deepgram, [companies] can send all of their customer audio – hundreds of thousands or millions of hours – to be transcribed and analyzed.”
Where does the audio data come from to train Deepgram’s models? Stephenson was a little coy about that, though he didn’t deny that Deepgram uses customer data to improve its systems. He was quick to point out that the company is GDPR compliant and that users can request that their data be deleted at any time.
“Deepgram’s models are primarily trained on data collected or generated by our data curation experts, in addition to some anonymized data submitted by our users,” said Stephenson. “Training models on real-world data is a cornerstone of our product quality; it’s what enables machine learning systems like ours to produce human-like results. That said, we allow our users to opt out of having their anonymized data used for training if they wish.”
Through Deepgram’s API, companies can build the platform into their tech stacks to enable voice-based automations and customer experiences. For organizations in highly regulated industries such as healthcare and government, Deepgram offers an on-premises deployment option that allows customers to manage and process data locally. (It’s worth noting that In-Q-Tel, the CIA’s strategic investment arm, has backed Deepgram in the past.)
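To illustrate the kind of integration described above, here is a minimal sketch of preparing a call to Deepgram's hosted transcription endpoint (`/v1/listen`). The API key and audio URL are placeholders, and the request is only constructed, not sent; consult Deepgram's own API documentation for the authoritative parameters.

```python
import json
import urllib.request

# Placeholder credential; a real integration would load this from a
# secret store or environment variable.
API_KEY = "YOUR_DEEPGRAM_API_KEY"

def build_transcription_request(audio_url: str) -> urllib.request.Request:
    """Prepare (but do not send) a POST asking Deepgram to transcribe
    an audio file hosted at audio_url."""
    body = json.dumps({"url": audio_url}).encode("utf-8")
    return urllib.request.Request(
        url="https://api.deepgram.com/v1/listen",
        data=body,
        headers={
            "Authorization": f"Token {API_KEY}",
            "Content-Type": "application/json",
        },
        method="POST",
    )

req = build_transcription_request("https://example.com/call-recording.wav")
print(req.full_url)  # https://api.deepgram.com/v1/listen
print(req.method)    # POST
```

Sending the prepared request with `urllib.request.urlopen(req)` (given a valid key) would return a JSON transcript that an application could feed into downstream automations.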
Deepgram – a Y Combinator graduate founded by Stephenson and Noah Shutty, a physics graduate of the University of Michigan – is competing with a number of vendors in a speech recognition market that could be worth $48.8 billion by 2030, according to one (optimistic?) source. Tech giants like Nuance, Cisco, Google, Microsoft and Amazon provide real-time speech transcription and captioning services, as do startups like Otter, Speechmatics, Voicera and Verbit.
The technology has hurdles to overcome. According to a 2022 report from Speechmatics, 29% of executives have observed AI bias in speech technologies, specifically imbalances in the types of voices understood by speech recognition. But demand is clearly strong enough to sustain the vendors serving it; Stephenson claims that Deepgram’s gross margins are “in line with the top performing software companies.”
That’s in contrast to the consumer speech recognition market, which has been in decline recently. Amazon’s Alexa division is reportedly on track to lose $10 billion this year, and Google is rumored to be cutting back on Google Assistant development in favor of more profitable projects.
Stephenson says Deepgram’s focus in recent months has been on on-the-fly language translation, sentiment analysis, and speaker-separated transcripts of multi-party conversations. The company is also scaling up, now reaching over 300 customers and over 15,000 users.
On the hunt for new customers, Deepgram recently launched the Deepgram Startup Program, which offers $10 million in free speech recognition credits on the Deepgram platform to education and business startups. Participating companies do not have to pay any fees and can use the credits in combination with existing grant, incubator and accelerator benefits.
“Deepgram’s business continues to grow rapidly. As a fundamental AI infrastructure company, we have not seen a decline in demand for Deepgram,” said Stephenson. “We’ve even seen companies look for ways to cut costs and delegate repetitive, menial tasks to AIs, giving people more time to pursue interesting, demanding work. Examples include reducing large cloud computing bills by switching from big-cloud transcription services to Deepgram’s transcription product, or new use cases such as drive-thru ordering and triaging the first round of customer service responses.”
Deepgram currently has 146 employees across offices in Ann Arbor and San Francisco. When asked about hiring plans for the rest of the year, Stephenson declined to answer – no doubt wary of the unpredictability of today’s global economy and reluctant to commit to a fixed number.