Gnani AI launches new speech recognition model for Indian languages

The Bengaluru startup says its latest speech-to-text model delivers stronger accuracy across Indian languages, dialects, and noisy environments, addressing longstanding challenges in voice AI applications used by enterprises and customer-facing services.

Bengaluru-based voice artificial intelligence company Gnani AI has introduced Prisma v2.5, a new speech-to-text model designed to improve recognition accuracy across India’s many languages and its often challenging audio conditions.

The company said the latest version of its automatic speech recognition (ASR) platform demonstrated leading performance across multiple Indian language benchmarks, particularly in scenarios involving regional accents, rural dialects, background noise, and mixed-language conversations. According to Gnani AI, the model secured top rankings in eight out of nine Indian languages evaluated through real-world speech-recognition tests.

The launch comes as businesses increasingly adopt voice AI technologies to automate customer interactions, analyze conversations, and improve operational efficiency. However, many existing speech-recognition systems continue to face difficulties when handling non-standard accents, poor-quality telephony audio, and conversations that frequently switch between English and regional languages.

Built for real-world Indian speech patterns

Gnani AI said Prisma v2.5 was trained using a proprietary dataset comprising nearly 14 million hours of speech data collected across 12 languages. The dataset includes regional dialects, code-switched conversations, and audio samples recorded in noisy environments, enabling the model to better reflect real-world communication patterns in India.

The company claimed that the new model achieved significantly lower word error rates than several competing speech-recognition platforms. It said the gains were most notable in rural Hindi dialects and Dravidian languages, areas where voice technologies have historically struggled to maintain accuracy.
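
For readers unfamiliar with the metric, word error rate is conventionally the number of word-level substitutions, deletions, and insertions needed to turn a system's transcript into the reference transcript, divided by the number of words in the reference. The Python sketch below illustrates that standard definition only. It is not Gnani AI's evaluation code, the function name `word_error_rate` is chosen here for illustration, and the sample sentence is invented rather than taken from any benchmark.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the reference length.

    Illustrative sketch of the conventional WER definition; real
    evaluations usually normalize text (case, punctuation, numerals)
    before scoring, which is omitted here.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far (excluding the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)


# Invented Hindi-English code-switched example: one of four words is wrong.
score = word_error_rate("mera account balance batao",
                        "mera account balance bata")
print(score)
```

A lower value means fewer transcription errors. Because insertions are counted, the rate can exceed 1.0 when a transcript is much longer than the reference.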

A key feature of Prisma v2.5 is its ability to understand multilingual conversations without requiring manual language tagging. According to the company, the model can process exchanges that switch between Hindi and English, Tamil and English, and other regional language combinations at the word level.

Gnani AI also stated that the system natively supports audio transmitted over GSM and Voice over Internet Protocol (VoIP) networks, making it suitable for contact centers and telephony-based applications.

Enterprise focus across critical sectors

The company is positioning Prisma v2.5 for industries where transcription accuracy is critical, including banking, financial services, insurance, healthcare, and customer support operations. Errors involving names, account details, numerical data, or technical terminology can have significant implications for compliance, customer records, and operational workflows.

According to Bharath Shankar, Co-founder and Chief Product and Engineering Officer, post-training optimization has allowed the company to double processing throughput compared with the previous version while maintaining accuracy levels.

Co-founder and CEO Ganesh Gopalan said many voice AI deployments encounter limitations because they are not adequately trained on the realities of Indian speech. He noted that accents, background noise, code-switching, and compressed telephony audio are everyday characteristics of conversations in the country rather than exceptional cases.

With Prisma v2.5, Gnani AI aims to close that gap and accelerate the adoption of voice AI technologies across India’s enterprise ecosystem.