How to get text-to-speech output using the command line?

How can I get speech output from entered text using the command line?

I would also like to be able to change the speech rate, pitch, volume, etc. with a simple command.

Asked By: Pandya

||

espeak is a nice little tool.

I just like playing around with it on the command line. You might find it conflicts with PulseAudio, so I’m using a long-winded version that avoids having to set it up properly.

sudo apt-get install espeak
espeak --stdout "this is a test" | paplay

espeak --help will show you the options to calibrate reading speed, pitch, voice, etc.
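As a rough sketch of those options in use: -s sets the speed in words per minute, -p the pitch (0 to 99), and -a the amplitude (0 to 200). The helper name espeak_tuned and the DRY_RUN switch are made up for illustration; only the espeak flags themselves come from its help text.

```shell
# Hypothetical helper: speak text with explicit speed, pitch, and amplitude.
# Usage: espeak_tuned TEXT [SPEED_WPM] [PITCH_0_99] [AMPLITUDE_0_200]
espeak_tuned() {
    text=$1
    speed=${2:-175}      # espeak default: 175 words per minute
    pitch=${3:-50}       # espeak default: 50
    amplitude=${4:-100}  # espeak default: 100
    if [ "${DRY_RUN:-0}" = 1 ]; then
        # Illustrative switch: print the command instead of running it.
        printf 'espeak -s %s -p %s -a %s --stdout "%s" | paplay\n' \
            "$speed" "$pitch" "$amplitude" "$text"
    else
        espeak -s "$speed" -p "$pitch" -a "$amplitude" --stdout "$text" | paplay
    fi
}
```

For example, espeak_tuned "this is a test" 140 60 should read a little slower and slightly higher than the defaults.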

When you’re doing your notes, save them as a text file and then:

echo "these are my notes" > text.txt
espeak --stdout -f text.txt > text.wav
paplay text.wav # you should hear "these are my notes"

You can then play around with ffmpeg et al. to compress this down from PCM to something more manageable like MP3 or OGG. But that’s a different story.
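A minimal sketch of that compression step, assuming ffmpeg is installed and your build includes MP3 and Ogg encoders (ffmpeg picks the encoder from the output file extension). The helper name wav_to is hypothetical.

```shell
# Hypothetical helper: convert a WAV file to MP3 or OGG next to the original.
# Usage: wav_to FILE.wav mp3|ogg
wav_to() {
    case "$2" in
        mp3|ogg) ;;  # the two formats mentioned above
        *) echo "unsupported format: $2" >&2; return 1 ;;
    esac
    # text.wav becomes text.mp3 or text.ogg
    ffmpeg -i "$1" "${1%.wav}.$2"
}
```

With the file from the example above, wav_to text.wav mp3 would write text.mp3.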

Answered By: Oli

Even though you’ve already accepted an answer, I wanted to mention festival, which I like quite a lot too. This post on the Ubuntu forums has a lot of information on getting very nice voices set up for it.

Answered By: frabjous

And yet another espeak gui: gespeaker. It uses both espeak and mbrola engines. Also, it has more options than espeak-gui.

Answered By: luri

The following is not a FLOSS solution (it runs under Wine), but you may find it worthwhile.

I’m personally very keen on TTS and use it quite often, e.g. for listening to a rambling discourse which I would never bother to stick with otherwise (because I need to get another cup of coffee… 🙂)

A few things I’ve discovered along the way.. or should I say, things I haven’t discovered along the way… To put it bluntly: Every piece of FOSS TTS voice software I’ve tried is under par and therefore unsuitable for any semi-protracted listening…

I currently use AT&T’s Natural Voices. It is only available for Windows (maybe the Mac), but it does run under Wine in Ubuntu. It has a minor glitch, where I sometimes need to click on the panel when I move away from the reader, but that is a minor issue compared to the advantage gained from the quality of speech of Natural Voices.

Some other things I’ve found to be virtually essential for a half-sensible listening experience:

  1. These TTS programs are not intelligent (well, maybe as intelligent as a young baboon), so they need every bit of help they can get. There is one (and only one) reader program I’ve found which helps greatly with this. The app is called ReadPlease (2003 Pro). It allows you to specially modify words and groups of words to be pronounced as you want them. It is by no means perfect, but for me it made the difference between the entire process being usable and not usable.

  2. Patience. The speech in Natural Voices is “okay”, but it is a bit boring. (There are other good products too, but they are all for Windows, unfortunately.) It inflects surprisingly well sometimes, but OMG, initially it is a pain! So #2 is patience, and lots of updating of your “special words” list. By patience, I mean that I actually became accustomed to my particular baboon’s speech patterns :) By the way, I currently have about 3000 words that now sound “human” enough that I no longer cringe when I hear them.

  3. “Follow the Bouncing Ball”. Again, because the voice is never as good as a real speaker, things sometimes need to be clarified. The reader program I use has one feature for which I even put up with its clunky-looking interface: it has a “select the word currently being read” option. Many readers have this, but ReadPlease keeps the current line bang on center of the screen. This is invaluable for seeing ahead and behind to quickly re-read what you just missed (so auto-centering the current line is good).

Well, that’s my experience. I’m going to make a coffee now, and while I’m doing it, I’ll be listening to this to see how it “reads”. TTS is surprisingly good for picking up typos (I make lots of typos)…

If something as good as AT&T Natural Voices turns up in the Ubuntu repositories, I’ll jump at it.

Here is a link to some samples of Natural Voices: I use “Mike”.

Answered By: Peter.O

SVOX pico2wave

That’s what I use. It sounds natural, is easy to understand, and recognizes units (m, °C, kg, …).

Here is my first post about pico2wave.

All you have to do is: Go to Ubuntu Software Center and search for "pico". You’ll find 4 or 5 entries with "Small Footprint Ling…". Install them.

A possible use of pico2wave is described in my first posting (follow the link above).

Answered By: user85321

Mbrola hasn’t worked since Ubuntu 11.10.

The SVOX (pico) tools are easy to install and use, and bring good-quality voices to Ubuntu. Install them:

sudo apt-get install libttspico0 libttspico-utils libttspico-data
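Once installed, a typical use is to write a WAV file with pico2wave and then play it. The sketch below assumes paplay is available for playback; the helper name pico_say and the default output path are hypothetical, and the language list is the set I believe the pico data package ships.

```shell
# Hypothetical helper: synthesize text with pico2wave, then play the result.
# Usage: pico_say TEXT [LANG] [OUTPUT.wav]
pico_say() {
    lang=${2:-en-US}
    out=${3:-/tmp/pico_say.wav}   # pico2wave expects a name ending in .wav
    case "$lang" in
        en-US|en-GB|de-DE|es-ES|fr-FR|it-IT) ;;
        *) echo "unsupported language: $lang" >&2; return 1 ;;
    esac
    # -l selects the language, -w names the output file
    pico2wave -l "$lang" -w "$out" "$1" && paplay "$out"
}
```

For example: pico_say "It is 20 °C outside" en-GB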

Even easier, you can use LibreOffice in combination with the SVOX (pico) tools by installing the “Read Text” extension, which gives you a “GUI” for this excellent TTS software:

Set up Read Text Extension’s options with Tools – Add-ons – Read selection…. Use /usr/bin/python as the external program. Select a command line option that includes the token (PICO_READ_TEXT_PY).

Answered By: leoperbo

From man spd-say:

NAME
       spd-say - send text-to-speech output request to speech-dispatcher

SYNOPSIS
       spd-say [options] "some text"

DESCRIPTION
       spd-say  sends text-to-speech output request to speech-dispatcher process which handles it and ideally outputs the result
       to the audio system.

OPTIONS
       -r, --rate
              Set the rate of the speech (between -100 and +100, default: 0)

       -p, --pitch
              Set the pitch of the speech (between -100 and +100, default: 0)

       -i, --volume
              Set the volume (intensity) of the speech (between -100 and +100, default: 0)

Hence you can get text-to-speech output with the following command:

spd-say "<type text>"

For example:

spd-say "Welcome to Ubuntu Linux"

You can also set the speech rate, pitch, volume, etc.; see the man page.
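A small sketch of those three options together. Since the man page gives a range of -100 to +100 for each, the hypothetical helper clamp_spd keeps values inside it before they reach spd-say; both function names are invented for this example.

```shell
# Hypothetical helper: clamp an integer to spd-say's documented -100..100 range.
clamp_spd() {
    if [ "$1" -lt -100 ]; then
        echo -100
    elif [ "$1" -gt 100 ]; then
        echo 100
    else
        echo "$1"
    fi
}

# Usage: spd_speak TEXT [RATE] [PITCH] [VOLUME]   (each defaults to 0)
spd_speak() {
    spd-say -r "$(clamp_spd "${2:-0}")" \
            -p "$(clamp_spd "${3:-0}")" \
            -i "$(clamp_spd "${4:-0}")" \
            "$1"
}
```

For example, spd_speak "Welcome to Ubuntu Linux" -30 10 50 speaks more slowly, slightly higher, and louder than the defaults.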

Answered By: Pandya

In order of descending popularity:

  • say converts text to audible speech using the GNUstep speech engine.

    sudo apt-get install gnustep-gui-runtime
    say "hello"
    
  • festival General multi-lingual speech synthesis system.

    sudo apt-get install festival
    echo "hello" | festival --tts
    
  • spd-say sends text-to-speech output request to speech-dispatcher

    sudo apt-get install speech-dispatcher
    spd-say "hello"
    
  • espeak is a multi-lingual software speech synthesizer.

    sudo apt-get install espeak
    espeak "hello"
    
Answered By: Sylvain Pineau

Balabolka under Wine works fine (for me) with SAPI4 voices (SAPI5 voices are not detected on my Linux system). It can open files and start reading.

Here is a link to Wine’s AppDB entry for Balabolka.

Answered By: Hemantkumar Garach

For festival (the voice seems more natural to me):

sudo apt-get install festival
echo "hello" | festival --tts

Pitch and speed configuration:

Create ~/.festivalrc with the following content:

(Parameter.set 'Audio_Command "play -b 16 -c 1 -e signed-integer -r $SR -t raw $FILE tempo 1.5 pitch -100")
(Parameter.set 'Audio_Method 'Audio_Command)

See also http://www.solomonson.com/content/ubuntu-linux-text-speech

Update: I tried this on another Ubuntu computer and had to install an English voice package for festival to work properly:

sudo apt-get install festvox-kallpc16k

Also play is a cli command which comes with the sox package:

sudo apt-get install sox

Answered By: d9k

Python Google Speech:

pip install google_speech

google_speech "Test the hello world"

Svox from Android:

apt-get install svox-pico

pico2wave --wave=test.wav "Test the hello world"
play test.wav

Svox Nanotts:

git clone https://github.com/gmn/nanotts.git
cd nanotts
make

./nanotts -v en-US "Test the hello world"

Linked resource: Comparison of speech synthesizers
Post source: Linuxhacks.org
Disclosure: I am the owner of Linuxhacks.org

Answered By: intika

The tool gTTS is great for generating audio files from text. It uses Google Translate’s text-to-speech API and generates MP3 files.
Given that it uses pip for installation, I strongly recommend you install Miniconda, and then use conda to create an environment where you can install gTTS. You can download Miniconda from here.

gTTS GitHub repository and documentation.
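For a command-line flavour, gTTS installs a gtts-cli program. The wrapper below is only a sketch: the function name gtts_say and its defaults are hypothetical, it assumes gTTS is already installed in the active environment (e.g. with pip install gTTS), and it needs network access because the speech is generated by Google’s service.

```shell
# Hypothetical helper: turn a piece of text into an MP3 with gtts-cli.
# Usage: gtts_say TEXT [OUTPUT.mp3] [LANG]
gtts_say() {
    if [ -z "$1" ]; then
        echo "usage: gtts_say TEXT [OUTPUT.mp3] [LANG]" >&2
        return 2
    fi
    # --lang selects the language, --output names the MP3 file to write
    gtts-cli "$1" --lang "${3:-en}" --output "${2:-speech.mp3}"
}
```

For example, gtts_say "these are my notes" notes.mp3 would write notes.mp3, which any MP3-capable player can then play.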

Answered By: evaristegd

Meet espeak-ng – A multi-lingual software speech synthesizer:

espeak-ng "text to read"
espeak-ng -f ~/"file to read" # keep ~ outside the quotes so the shell expands it

It uses a default English voice, but numerous other voices for other languages and even dialects are available. They can be listed with espeak-ng --voices (for all) or e.g. espeak-ng --voices=en (for English), and set with -v together with either the language abbreviation or the file name, e.g. for Scottish or Swahili:

espeak-ng -v en-gb-scotland "text to read" # language name
espeak-ng -v bnt/sw "text to read" # file name: “bnt” for Bantu, “sw” for Swahili

There are many other options available, e.g. -s for the speed and -w to write the output to a wave file, see the manpage linked below.
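As an illustration of -s and -w together, this sketch reads a text file into a WAV file of the same base name. The helper names wav_name and read_to_wav are invented for the example, and the name derivation assumes the input file has an extension.

```shell
# Hypothetical helper: notes.txt -> notes.wav (strips the last extension).
wav_name() {
    printf '%s.wav\n' "${1%.*}"
}

# Usage: read_to_wav FILE [SPEED_WPM]
read_to_wav() {
    # -f reads text from a file, -s sets words per minute (default 175),
    # -w writes a WAV file instead of speaking aloud
    espeak-ng -s "${2:-175}" -f "$1" -w "$(wav_name "$1")"
}
```

For example, read_to_wav notes.txt 140 would produce notes.wav at a slightly slower pace.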

Further reading

espeak-ng (“ng” for “next generation”) is an actively developed fork of the original espeak speech synthesizer software, see the History chapter on Wikipedia. Both are available from the official sources via the package espeak or espeak-ng respectively.

Answered By: dessert

Update for 2023. Pico2wave is a very lightweight utility, however these two are very natural sounding:

Audio comparison of free Linux TTS 2022. – YouTube

Answered By: alchemy

Piper

A fast, local neural text-to-speech system. See the project site for installation, voice downloads, and usage. For example:

echo 'Welcome to the world of speech synthesis!' | 
  ./piper --model blizzard_lessac-medium.onnx --output_file welcome.wav

Answered By: Pablo Bianchi