Saturday, January 28, 2023

Twelve Labs secures $12 million for AI that understands video context • londonbusinessblog.com

Shreya Christina, londonbusinessblog.com
Shreya has been with londonbusinessblog.com for 3 years, writing copy for client websites, blog posts, EDMs and other mediums to engage readers and encourage action. By collaborating with clients, our SEO manager and the wider londonbusinessblog.com team, Shreya seeks to understand an audience before creating memorable, persuasive copy.

For Jae Lee, a data scientist by training, it never made sense that video, which has become a huge part of our lives with the rise of platforms such as TikTok, Vimeo and YouTube, was so difficult to search, owing to the technical barriers of understanding context. Searching the titles, descriptions and tags of videos has always been easy enough, requiring nothing more than a basic algorithm. But searching inside videos for specific moments and scenes has long been beyond the capabilities of the technology, especially when those moments and scenes aren't labeled in an obvious way.

To solve this problem, Lee and friends from the tech industry built a cloud service for searching and understanding videos. It became Twelve Labs, which went on to raise $17 million in venture capital, $12 million of which came from a seed extension round that closed today. Radical Ventures led the extension, with participation from Index Ventures, WndrCo, Spring Ventures, Weights & Biases CEO Lukas Biewald and others, Lee told londonbusinessblog.com in an email.

“Twelve Labs’ vision is to help developers build programs that can see, listen and understand the world the way we do, by providing them with the most powerful video understanding infrastructure,” said Lee.

A demo of the capabilities of the Twelve Labs platform. Image Credits: Twelve Labs

Twelve Labs, which is currently in closed beta, is using AI to try to extract "rich information" from videos, such as movement and actions, objects and people, sound, on-screen text, and speech, and to identify the relationships between them. The platform converts these different elements into mathematical representations called "vectors" and forms "temporal connections" between frames, enabling applications such as video scene search.
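To make the idea concrete, here is a toy sketch of how vector-based scene search can work. The hand-made three-dimensional embeddings and segment labels below are stand-ins for the learned multimodal vectors a system like Twelve Labs would produce; this is not the company's actual method or API.

```python
import numpy as np

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    """Cosine similarity between two embedding vectors."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def search_moments(query_vec: np.ndarray, segment_vecs: dict) -> str:
    """Return the segment whose embedding is closest to the query embedding."""
    return max(segment_vecs, key=lambda k: cosine_similarity(query_vec, segment_vecs[k]))

# Each video segment is represented by a vector in a shared embedding space.
segments = {
    "00:00-00:10": np.array([0.9, 0.1, 0.0]),  # e.g. "person cooking"
    "00:10-00:20": np.array([0.1, 0.8, 0.2]),  # e.g. "dog running"
    "00:20-00:30": np.array([0.0, 0.2, 0.9]),  # e.g. "car driving"
}

# A text query is embedded into the same space; the nearest segment wins.
query = np.array([0.1, 0.9, 0.1])  # embedded query, e.g. "dog playing"
print(search_moments(query, segments))  # → 00:10-00:20
```

The key point is that query and video segments live in one embedding space, so a moment can be found even when it was never labeled with matching keywords.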

“As part of realizing the company’s vision to help developers create intelligent video applications, the Twelve Labs team is building ‘foundation models’ for multimodal video understanding,” said Lee. “Developers can access these models through a series of APIs to perform not only semantic search but also other tasks such as ‘chapterizing’ long videos, generating summaries, and answering video questions.”
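One of the tasks Lee mentions, chapterizing, can also be sketched from the same vector representation. A simple (hypothetical, not Twelve Labs') approach is to start a new chapter wherever the similarity between consecutive segment embeddings drops sharply, i.e. where the content visibly changes:

```python
import numpy as np

def cosine(a: np.ndarray, b: np.ndarray) -> float:
    """Cosine similarity between two embedding vectors."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def chapterize(seg_vecs: list, threshold: float = 0.7) -> list:
    """Group consecutive segment indices into chapters.

    A new chapter starts whenever the similarity between adjacent
    segment embeddings falls below the threshold.
    """
    chapters, current = [], [0]
    for i in range(1, len(seg_vecs)):
        if cosine(seg_vecs[i - 1], seg_vecs[i]) >= threshold:
            current.append(i)
        else:
            chapters.append(current)
            current = [i]
    chapters.append(current)
    return chapters

# Four toy segment embeddings: the first two are similar to each other,
# the last two are similar to each other, with a sharp change in between.
vecs = [np.array(v, dtype=float) for v in
        [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]]
print(chapterize(vecs))  # → [[0, 1], [2, 3]]
```

Real systems would learn both the embeddings and the boundary decision, but the principle, detecting chapter breaks as drops in cross-frame similarity, is the same.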

Google takes a similar approach to understanding videos with its MUM AI system, which the company uses to drive video recommendations on Google Search and YouTube by picking out topics in videos (for example, “acrylic materials”) based on the audio, text and visual content. But while the technology is similar, Twelve Labs is one of the first vendors to bring it to market; Google has chosen to keep MUM internal and has declined to make it available through a public API.

That said, Google, as well as Microsoft and Amazon, offer services (i.e. Google Cloud Video AI, Azure Video Indexer, and AWS Rekognition) that recognize objects, places, and actions in videos and extract rich, frame-level metadata. There’s also Reminiz, a French computer vision startup that claims to be able to index any type of video and add tags to both recorded and live-streamed content. But Lee argues that Twelve Labs is sufficiently differentiated, in part because the platform allows customers to tailor the AI to specific categories of video content.

Mockup of an API for fine-tuning the model to work better with salad-related content. Image Credits: Twelve Labs

“What we found is that narrow AI products built to detect specific problems show high accuracy in their ideal scenarios in a controlled environment, but don’t scale as well to messy real-world data,” said Lee. “They operate more like a rule-based system and therefore lack the ability to generalize when deviations occur. We also see this as a limitation arising from a lack of understanding of the context. Understanding context is what gives people the unique ability to make generalizations across seemingly different real-world situations, and this is where Twelve Labs comes into its own.”

In addition to search, Lee says Twelve Labs’ technology can boost things like ad insertion and content moderation: for example, intelligently figuring out which videos featuring knives are violent versus instructional. It can also be used for media analysis and real-time feedback, he says, as well as automatically generating highlight reels from videos.

Just over a year after its inception (March 2021), Twelve Labs has paying customers — Lee wouldn’t reveal exactly how many — and a multi-year contract with Oracle to train AI models using Oracle’s cloud infrastructure. Looking ahead, the startup plans to invest in building out its technology and growing its team. (Lee declined to disclose the current size of Twelve Labs’ workforce, but LinkedIn data shows it’s about 18 people.)

“Despite the tremendous value that can be achieved with large models, most companies are not equipped to train, operate and maintain these models themselves. By leveraging the Twelve Labs platform, any organization can tap powerful video understanding capabilities with just a few intuitive API calls,” said Lee. “The future direction of AI innovation is heading straight toward multimodal video understanding, and Twelve Labs is well positioned to push the boundaries even further in 2023.”
