Gnuspeech

An extensible, text-to-speech package, based on real-time, articulatory, speech-synthesis-by-rules
Gnuspeech Ranking & Summary

  • License: GPL
  • Price: FREE
  • Publisher Name: David R. Hill
  • Publisher web site: http://www.cpsc.ucalgary.ca/~hill
  • Operating Systems: Mac OS X
  • File Size: 2 MB

Gnuspeech Description

An extensible, text-to-speech package, based on real-time, articulatory, speech-synthesis-by-rules. That is, Gnuspeech converts text strings into phonetic descriptions, aided by a pronouncing dictionary, letter-to-sound rules, and rhythm and intonation models; transforms the phonetic descriptions into parameters for a low-level articulatory synthesiser; and uses these to drive an articulatory model of the human vocal tract, producing an output suitable for the normal sound output devices used by GNU/Linux.

The synthesiser is a tube resonance, or waveguide, model that accurately models the behaviour of the real vocal tract. The associated modules are those used to develop the original spoken English databases, and they could be used for other languages. Gnuspeech is suitable for linguistic and psycho-acoustic research.

Here are some key features of "Gnuspeech":

  • A Tube Resonance Model (TRM) for the human vocal tract (also known as a transmission-line analog, or a waveguide model) that truly represents the physical properties of the tract, including energy balance between the nasal and oral cavities as well as the radiation impedance at lips and nose.
  • A control model for the TRM, based on formant sensitivity analysis, that allows accurate specification of the relevant vocal tract configurations for speech. It comprises a low-level articulatory model with a small number of parameters and a low bit rate. The model is based on research at KTH in Stockholm, LCTI (ENST) in Paris, and the University of Calgary.
  • Databases specifying the articulatory postures and control dynamics required to produce English speech from an augmented phonemic input. Some French vowels are also included.
  • Models of English rhythm and intonation based on research at IPO in the Netherlands, the University of Essex (UK), and the University of Calgary.
  • “Monet”: a GUI-based database creation and editing system that allows the phonetic data and dynamic rules to be set up and modified for arbitrary languages. The MONET real-time engine also translates augmented phonetic strings into synthesiser parameters.
  • A text-to-augmented-phonetics module that converts arbitrary text, preferably with normal punctuation, into the input required by the MONET engine. This module also provides the API for the text-to-speech system.
  • A 70,000+ word English pronouncing dictionary with rules for derivatives such as plurals and adverbs. The dictionary also provides part-of-speech information for later addition of grammatical parsing and includes 6,000 given names.
  • Sub-dictionaries that allow different user- or application-specific pronunciations to be substituted for the default pronunciations coming from the main dictionary.
  • Letter-to-sound rules to deal with spellings and words that are not in the dictionaries.
  • Tools for managing the dictionary and carrying out analysis of speech.
  • “Synthesiser”: a GUI-based application that allows experimentation with a stand-alone TRM. All parameters may be varied and the output monitored and analysed. It was an important component in the research needed to create the original English speech databases.
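The pronunciation lookup described above (sub-dictionaries take precedence over the main dictionary, with letter-to-sound rules as the fallback for unknown words) can be illustrated with a minimal Python sketch. This is not Gnuspeech's actual API: the function names, the dictionary contents, and the phoneme strings are all hypothetical, and the letter-to-sound step is a crude placeholder that simply spells the word out.

```python
from typing import Dict, List

# Hypothetical data for illustration only; not taken from Gnuspeech's dictionaries.
MAIN_DICTIONARY: Dict[str, str] = {"speech": "s p ee ch", "tube": "t uu b"}
USER_DICTIONARY: Dict[str, str] = {"tube": "t y uu b"}  # application-specific override


def letter_to_sound(word: str) -> str:
    """Placeholder for real letter-to-sound rules: one symbol per letter."""
    return " ".join(word)


def pronounce(word: str, sub_dictionaries: List[Dict[str, str]]) -> str:
    """Look a word up in sub-dictionaries first, then the main dictionary,
    and fall back to letter-to-sound rules if it is found in neither."""
    key = word.lower()
    for dictionary in sub_dictionaries:
        if key in dictionary:
            return dictionary[key]
    if key in MAIN_DICTIONARY:
        return MAIN_DICTIONARY[key]
    return letter_to_sound(key)
```

Calling `pronounce("tube", [USER_DICTIONARY])` returns the override from the sub-dictionary, while `pronounce("tube", [])` returns the main-dictionary entry; a word in neither goes through the placeholder rules.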
