Lightspeed Ventures-backed audio platform Pocket FM announced it has partnered with voice-cloning company ElevenLabs to quickly convert text content, such as scripts, into audio series using AI.
Pocket FM, which raised $103 million in Series D funding in March, told TechCrunch at the time that it was already experimenting with the ability to convert text content into audio using ElevenLabs’ tech. Now, the India-based company has expanded the partnership to make the conversion tool available to all creators over the next few weeks.
In the test phase, Pocket FM has already produced 30,000 hours of audio series using ElevenLabs’ AI tech. With the new roll-out, the startup expects to triple its content library of over 100,000 hours of audio content this year. Pocket FM also said that during the experimental phase, the AI-powered tools helped it cut the cost of producing audio by 90%.
Pocket FM’s co-founder and CTO Prateek Dixit told TechCrunch over a call that with this partnership, the company wants to make it easier for writers to convert their writings into audio series.
“We have over 250,000 writers (including the ones on the company’s Pocket Novel writing platform) and this partnership decreases the cost of setting up and recording audio for them,” he said.
“Even with a good setup of recording tools and equipment, writers can produce roughly 30 minutes of high-quality audio content per day. With the AI tools, this output can be 10 times more,” he added.
Pocket FM has built a tool integrating ElevenLabs tech, through which it is offering 50 voices for writers who want to convert their content. ElevenLabs’ co-founder Mati Staniszewski said that his company’s tool understands the context of the writing and infers emotions through the voice automatically.
“Working with Pocket FM, we are deploying our newer models that understand the genre of writing and are emotionally better,” Staniszewski said.
Dixit noted that based on data from users’ engagement with this kind of content, the platform also plans to suggest voices that work well for writers in a particular genre.
Pocket FM is not the only audio series platform experimenting with AI-powered tools. Google-backed Kuku FM is using GPT-4, Claude, BandLab and even ElevenLabs to help its writers with different stages of creation, including refining scripts, generating thumbnails, adding sound effects and converting text into audio.
Kuku FM told TechCrunch that it is also experimenting with using visual generation tools such as Midjourney and Runway to create ads related to content.
Quality of content and impact on artists
The promise of AI-powered tools is to generate more content faster, but that doesn’t mean the content is good. Pocket FM’s answer for surfacing quality content is to make its discovery algorithm more sophisticated and to experiment with user engagement.
“If a writer publishes an audio series, we surface that content to a select number of users and observe engagement metrics. If these metrics are positive, we further propagate that,” Dixit said.
Kuku FM said it is working with its quality control team to ensure only high-quality content is promoted on its app, even if creators have used AI in the process.
“We realized the importance of having a human Quality Control team at the center of our decision-making when it comes to audio content production. We have developed a core team of Content Producers who have high ownership & authority on the artistic standards,” the company’s co-founder and CEO Lal Chand Bisu said.
Utilizing AI could lead to quicker results and a bigger content library for these platforms, but it will also reduce the roles of voiceover artists working with them. India’s Association of Voiceover Artists (AVA) has expressed its concerns about AI taking over.
“If AI takes over, we are finished. As voice artists, we need to get some regulation in place so that our livelihood is protected,” Amarinder Singh Sodhi, the association’s general secretary, told Indian publication Scroll.
Sodhi also told Scroll about incidents where voiceover artists were called into the studio to record samples that were then used to train AI, without the artists being informed or asked for their consent.
“On an emotional level, it scares me. By using AI, you are essentially diluting the human experience of storytelling. You lose out on an emotional connection,” Delhi-based voiceover artist Aditya Mattoo told TechCrunch.
He added that giving access to premium voices to people who don’t have the taste and skill to produce quality content will lead to the market getting flooded by bad content.
Voice artists in other parts of the world have also raised concerns about AI impacting their jobs. And despite working with some of the AI companies, they feel uncomfortable about their voices being altered.
When we asked about the impact of AI-powered voice generation on Pocket FM, the company didn’t directly answer the question. However, Dixit noted that engagement with AI-generated content in its experiments is “as good as human voiceover production.” Notably, the company is also working on technology to incorporate multiple voices in one audio output.
Neither Pocket FM nor Kuku FM currently labels its content to indicate whether AI has been used in the creation process.