Text-to-Speech (TTS) | VitalVoice. Russian Speech Synthesis -- Speech Technology

Sept. 9, 2011 - PRLog -- Speech synthesis is the conversion of arbitrary natural-language text into speech. Spoken output of information is one half of a speech interface, and without it voice communication with a device cannot take place. In effect, speech synthesis opens one more data transmission channel from a PC or mobile phone to a person.

It is very convenient to listen to your e-mail, your schedule for the day or the latest news while preparing for negotiations. There is no longer any need to strain your eyes and nerves trying to read tiny fonts in emails or on web pages. With text-to-speech (TTS), you can listen to anything you used to read.

Speech synthesis technology is becoming more and more important in society. Speech synthesis systems are widely applied and built into many solutions. Text-to-speech technology is particularly valuable for people with impaired eyesight.

Any text consists of words separated by spaces and punctuation marks. The pronunciation of a word depends on its position in the sentence and on its meaning; the intonation of a phrase depends on punctuation. Accordingly, for synthetic speech to sound natural, a whole range of tasks must be solved: ensuring the naturalness of the voice (smooth sound and intonation), placing stress correctly, and expanding abbreviations, numbers and special characters in accordance with the grammar of the Russian language.

There are several approaches to solving these problems:

- Allophone synthesis systems give stable but insufficiently natural, robotic-sounding speech
- Systems based on the Unit Selection method sound more natural but may contain speech fragments with sudden drops in quality, up to the loss of intelligibility
- Hybrid technology is based on the Unit Selection method and supplemented with units of allophone synthesis

The VitalVoice system was developed on the basis of the hybrid technology, which ensures stable and natural sound at the acoustic level.
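The hybrid idea can be illustrated with a toy sketch. The source does not describe how VitalVoice combines the two methods, so everything below is an assumption made for illustration: the `Unit` class, the `pick_units` function, the single per-unit cost and the `max_cost` threshold are all hypothetical, and a real Unit Selection system searches over target and join costs rather than choosing greedily.

```python
from dataclasses import dataclass


@dataclass
class Unit:
    allophone: str
    source: str   # "recorded" (from the speech database) or "allophone-synth"
    cost: float   # hypothetical mismatch cost; lower is better


def pick_units(targets, database, max_cost=1.0):
    """Greedy toy selection: take the cheapest recorded unit for each
    target allophone, and fall back to a synthesized allophone when no
    recorded unit exists or the best one is too poor a match."""
    chosen = []
    for allophone in targets:
        candidates = database.get(allophone, [])
        best = min(candidates, key=lambda u: u.cost, default=None)
        if best is None or best.cost > max_cost:
            # Fallback unit with a nominal cost (an arbitrary choice here).
            best = Unit(allophone, "allophone-synth", max_cost)
        chosen.append(best)
    return chosen


if __name__ == "__main__":
    db = {
        "a": [Unit("a", "recorded", 0.2), Unit("a", "recorded", 0.5)],
        "b": [Unit("b", "recorded", 3.0)],  # too costly, triggers fallback
    }
    for unit in pick_units(["a", "b", "c"], db):
        print(unit.allophone, unit.source, unit.cost)
```

The point of the fallback is that a badly matching recorded fragment is never forced into the output, which is the failure mode attributed above to pure Unit Selection.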

Corporate solutions:

- Automated telephone information and reference systems for voice self-service in contact centers (SVS, voice self-service system)
- Integration into corporate information systems
- Notification and warning systems
- Voicing of information posted on websites (Voice Internet)

Mobile devices:

- Navigation Systems
- Reading aloud of information from websites (news feeds, blogs, etc.)
- Automatic translators
- Portable devices for people with sight and speech disabilities

PC-based applications:

- E-mail reading, quick access to business information
- Russian language programs
- Creation of audiobooks
- Computer games
- Integration into devices (payment terminals, automated kiosks)

Potential users:

- Owners and developers of online news sites and sites with frequently updated content
- Public bodies that host sites on the Internet whose information should be easily accessible to all citizens
- Private companies whose sites aim to make information about the company accessible to the widest possible audience
- Companies interested in creating and hosting their own podcasts from an unlimited amount of content without announcers or special acoustic conditions

VitalVoice meets the basic requirement that users place on a speech synthesis system: it can voice any text, even non-standard text (SMS, emails, online forums, etc.), so that the listener has the impression of hearing a natural human voice.

Text can be read in different synthesized voices. Each voice is built on a speaker's speech database (about 10 hours of speech) annotated on 9 levels, including the text transcription, words, syllables, allophones, pauses, markers of word and phrase stress, intonation types, non-speech events and other phonetic phenomena.

To produce correct intonation and determine the position of stress in words, a powerful module for automated processing of Russian text has been developed, using morphological, syntactic and semantic analysis. This module is what makes VitalVoice a unique Russian speech synthesis technology.

Advantages:

- High-quality, natural-sounding speech for any text
- Accounts for the phonetic, morphological and grammatical features of the Russian language
- Technology of natural intonation cloning
- Correct stress placement
- Correct expansion of abbreviations, numbers and special characters
- Simplicity of use and implementation
- Support for standard protocols, programming interfaces and markup languages (MRCP, SAPI, SSML)
- User dictionary
- Ability to change the pitch of the voice and the speech rate over a wide range
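As an illustration of the last two points, SSML (the W3C Speech Synthesis Markup Language) provides a `prosody` element with `pitch` and `rate` attributes. The sketch below only builds such a document as a string; the `build_ssml` helper and its default values are hypothetical, and which attribute values VitalVoice actually accepts is not stated in this release.

```python
import xml.etree.ElementTree as ET

SSML_NS = "http://www.w3.org/2001/10/synthesis"


def build_ssml(text, pitch="+10%", rate="slow", lang="ru-RU"):
    """Wrap plain text in an SSML <speak> document that asks the
    synthesizer for a given pitch change and speech rate."""
    speak = ET.Element("speak", {
        "version": "1.0",
        "xmlns": SSML_NS,
        "xml:lang": lang,
    })
    prosody = ET.SubElement(speak, "prosody", {"pitch": pitch, "rate": rate})
    prosody.text = text
    return ET.tostring(speak, encoding="unicode")


if __name__ == "__main__":
    # The resulting string would be passed to an SSML-capable engine.
    print(build_ssml("Hello, world", pitch="-5%", rate="fast"))
```

Keeping pitch and rate in markup rather than in engine-specific settings means the same text can be sent unchanged to any synthesizer that supports SSML.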

Functional features:

- Expansion of standard abbreviations using semantic analysis (Minsk, Brest, Vitebsk, 2010, 145)
- Correct reading of abbreviations (State Auto Inspection, BSU)
- Expansion of dates and times, correct reading of numbers (02/26/2010, 10:40)
- Expansion of special characters ($ 20, house number 7)
- Correct interpretation of formulas (2 * 3 = 6)
- Homograph disambiguation (correct pronunciation of words with the same spelling but different meanings)
- 8 different synthesizer voices
- Ability to change the pitch of the voice and the speech rate over a wide range
- Sound files are generated 10-12 times faster than real time
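The expansion features listed above amount to text normalization: rewriting digits and symbols as words before synthesis. The following is a minimal sketch of the idea, not the VitalVoice module. It uses English words purely for illustration, handles only numbers from 0 to 99, and skips dates, abbreviations and homographs; real Russian normalization must also agree in case and gender. The names `number_to_words` and `normalize` are hypothetical.

```python
import re

ONES = ["zero", "one", "two", "three", "four", "five", "six", "seven",
        "eight", "nine", "ten", "eleven", "twelve", "thirteen", "fourteen",
        "fifteen", "sixteen", "seventeen", "eighteen", "nineteen"]
TENS = ["", "", "twenty", "thirty", "forty", "fifty", "sixty", "seventy",
        "eighty", "ninety"]


def number_to_words(n):
    """Spell out an integer from 0 to 99 (toy range)."""
    if not 0 <= n <= 99:
        raise ValueError("this sketch handles 0..99 only")
    if n < 20:
        return ONES[n]
    tens, ones = divmod(n, 10)
    return TENS[tens] if ones == 0 else f"{TENS[tens]}-{ONES[ones]}"


def normalize(text):
    """Expand a few special forms into words, most specific first."""
    # Currency: "$ 20" or "$20" becomes "twenty dollars".
    text = re.sub(r"\$\s?(\d{1,2})\b",
                  lambda m: number_to_words(int(m.group(1))) + " dollars",
                  text)
    # Clock time: "10:40" becomes "ten forty" (no special case for ":00").
    text = re.sub(r"\b(\d{1,2}):(\d{2})\b",
                  lambda m: number_to_words(int(m.group(1))) + " "
                  + number_to_words(int(m.group(2))),
                  text)
    # Formula operators.
    text = text.replace("*", "times").replace("=", "equals")
    # Any remaining one- or two-digit integers.
    return re.sub(r"\b\d{1,2}\b",
                  lambda m: number_to_words(int(m.group(0))), text)


if __name__ == "__main__":
    print(normalize("2 * 3 = 6"))
    print(normalize("Meet at 10:40 and bring $ 20"))
```

The order of the rules matters: currency and time patterns run before the generic number rule so that their digits are read in context rather than one by one.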

Technical characteristics:

- Input data formats: txt, doc, rtf
- Output data formats: wav, mp3
- WAV file format: 22050 Hz sampling frequency, 16-bit PCM, mono
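For readers who want to check that an audio file matches these WAV parameters, here is a small sketch using Python's standard `wave` module. It writes a silent file with the stated settings and reads the header back; the helper names `make_silent_wav` and `describe_wav` are illustrative and not part of any VitalVoice API.

```python
import io
import wave

SAMPLE_RATE_HZ = 22050    # sampling frequency from the specification above
SAMPLE_WIDTH_BYTES = 2    # 16 bits per sample
CHANNELS = 1              # mono


def make_silent_wav(seconds):
    """Return the bytes of a silent PCM WAV file of the given length."""
    n_frames = int(SAMPLE_RATE_HZ * seconds)
    buf = io.BytesIO()
    with wave.open(buf, "wb") as w:
        w.setnchannels(CHANNELS)
        w.setsampwidth(SAMPLE_WIDTH_BYTES)
        w.setframerate(SAMPLE_RATE_HZ)
        w.writeframes(b"\x00" * (n_frames * SAMPLE_WIDTH_BYTES * CHANNELS))
    return buf.getvalue()


def describe_wav(data):
    """Read the header fields of WAV data held in memory."""
    with wave.open(io.BytesIO(data), "rb") as w:
        return {
            "channels": w.getnchannels(),
            "bits_per_sample": w.getsampwidth() * 8,
            "sample_rate_hz": w.getframerate(),
            "frames": w.getnframes(),
        }


if __name__ == "__main__":
    print(describe_wav(make_silent_wav(1.0)))
    # Uncompressed audio at these settings takes
    # 22050 samples/s * 2 bytes * 1 channel = 44100 bytes per second.
    print(SAMPLE_RATE_HZ * SAMPLE_WIDTH_BYTES * CHANNELS, "bytes per second")
```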

Check out http://speetech.com/technologies/vitalvoice

# # #

Speech Technology Ltd. designs and develops science-intensive software for recording, processing and analyzing speech data. Our products make it possible to recognize speech, change voice and speech rate, implement voice biometrics, analyze audio and video data, and more. Our solutions include warning systems, stenography systems, audio editors and video surveillance systems. Our developers are professionals in building cross-platform applications, from mobile devices to server stations. Detailed information is available on our website: http://speetech.com/
