TechVilla

OpenAI Introduces New Audio Models in API, Can Be Used for Agentic Workflows

Admin
March 21, 2025
OpenAI on Thursday introduced new audio models in its application programming interface (API) that it says offer improved accuracy and reliability. The San Francisco-based AI firm released three new artificial intelligence (AI) models covering speech-to-text transcription and text-to-speech (TTS). The company claimed that these models will enable developers to build applications with agentic workflows, and that the API can help businesses automate operations such as customer support. Notably, the new models are based on the company’s GPT-4o and GPT-4o mini AI models.

OpenAI Brings New Audio Models in API

In a blog post, the AI firm detailed the new API-specific AI models. The company highlighted that in recent months it has released several agentic products and developer tools, such as Operator, Deep Research, Computer-Using Agents, and the Responses API with built-in tools. However, it added that the true potential of agents can only be unlocked when they can perform intuitively and interact across mediums beyond text.

There are three new audio models. GPT-4o-transcribe and GPT-4o-mini-transcribe are the speech-to-text models, and GPT-4o-mini-tts is, as the name suggests, a TTS model. OpenAI claims that these models outperform its existing Whisper models, which were first released in 2022. However, unlike Whisper, the new models are not open source.

OpenAI stated that GPT-4o-transcribe achieves a lower “word error rate” (WER) on the Few-shot Learning Evaluation of Universal Representations of Speech (FLEURS) benchmark, which tests AI models on multilingual speech across more than 100 languages. The company said the improvements were a result of targeted training techniques such as reinforcement learning (RL) and extensive mid-training with high-quality audio datasets.
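
For readers unfamiliar with the metric, WER is commonly defined as the number of word-level substitutions, deletions, and insertions needed to turn the model's transcript into the reference transcript, divided by the number of words in the reference. The sketch below illustrates that standard definition in Python. It is not OpenAI's evaluation code, it assumes simple whitespace tokenisation with no text normalisation, and the function name is hypothetical.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (ref_word != hyp_word)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)
```

For example, scoring the transcript "turn off the lights" against the reference "turn on the light" gives two substitutions over four reference words, a WER of 0.5. Lower values are better, and an exact match scores 0.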

OpenAI says these speech-to-text models can transcribe audio more reliably even in challenging scenarios such as heavy accents, noisy environments, and varying speech speeds.

The GPT-4o-mini-tts model also comes with significant improvements. The AI firm claims that the model can speak with customisable inflections, intonations, and emotional expressiveness. This could enable developers to build applications for a wide range of tasks, including customer service and creative storytelling. Notably, the model only offers artificial, preset voices.
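
As a rough illustration of how a developer might ask for a particular speaking style, the Python sketch below builds (but does not send) an HTTP request for OpenAI's speech endpoint. None of these details come from the article: the endpoint URL, the `instructions` field, and the `alloy` preset voice are assumptions drawn from OpenAI's public API reference and should be checked against the current documentation. The helper name `build_tts_request` is hypothetical.

```python
import json
import urllib.request

# Assumed endpoint, taken from OpenAI's public API reference, not the article.
SPEECH_ENDPOINT = "https://api.openai.com/v1/audio/speech"


def build_tts_request(text, api_key, voice="alloy", instructions=None):
    """Prepare a text-to-speech request without sending it."""
    payload = {
        "model": "gpt-4o-mini-tts",
        "input": text,
        "voice": voice,  # one of the preset voices; "alloy" is an assumed default
    }
    if instructions:
        # Free-text guidance on tone and delivery (assumed field name).
        payload["instructions"] = instructions
    return urllib.request.Request(
        SPEECH_ENDPOINT,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

Passing the returned object to `urllib.request.urlopen` with a valid API key would be expected to return audio bytes. The sketch stops short of that step so it can be read and run without network access.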

OpenAI’s API pricing page lists the GPT-4o-based audio model at $40 (roughly Rs. 3,440) per million input tokens and $80 (roughly Rs. 6,880) per million output tokens. The GPT-4o mini-based audio models, on the other hand, are priced at $10 (roughly Rs. 860) per million input tokens and $20 (roughly Rs. 1,720) per million output tokens.
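
To put the per-million-token rates in perspective, the following Python sketch estimates the cost of a single workload from the figures quoted above. The tier labels and the function name are hypothetical, and the rates are simply the ones reported in this article, so they should be checked against OpenAI's current pricing page before being relied on.

```python
from decimal import Decimal

# USD per one million tokens, as quoted in the article.
# The tier names are illustrative labels, not official model identifiers.
RATES_PER_MILLION = {
    "gpt-4o-audio": {"input": Decimal("40"), "output": Decimal("80")},
    "gpt-4o-mini-audio": {"input": Decimal("10"), "output": Decimal("20")},
}
TOKENS_PER_MILLION = Decimal(1_000_000)


def estimate_cost_usd(tier: str, input_tokens: int, output_tokens: int) -> Decimal:
    """Return the estimated charge in USD for the given token counts."""
    rate = RATES_PER_MILLION[tier]
    total = rate["input"] * input_tokens + rate["output"] * output_tokens
    return total / TOKENS_PER_MILLION
```

At these rates, a workload of 250,000 input tokens and 50,000 output tokens would come to $14 on the GPT-4o-based tier and $3.50 on the GPT-4o mini-based tier.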

All of the audio models are now available to developers via the API. OpenAI is also releasing an integration with its Agents software development kit (SDK) to help developers build voice agents.
