Damn Small Linux (DSL) Forums
Author Topic: SPEECH Synthesis and Voice Recognition  (Read 6896 times)
mterras
« on: September 15, 2012, 04:00:16 PM »

Hi everybody!
I'm currently working on a great project that combines oldies like espeak, cmu-sphinx2 and murgaLua to command and script some repetitive tasks, like internet searches, reading mail, automating media-database updates, etc.
I'm scripting on a Debian Etch system (4.0, kernel 2.6.18), and will try to "backport" to DSL-N 0.4RC1, and maybe DSL 4.x...

You can help by backporting espeak and cmu-sphinx2 to sarge or, better, woody (woody is the base of a hard-drive-installed DSL 4.x).

My goal: to make those who threw away their old PCs cry!
Seriously: reuse plus optimization is the future of computer science...
Sincerely,
lm
« Reply #1 on: December 07, 2012, 02:46:37 PM »

Would be very interested to hear more about your project and how it's going.  Do you have a web site with more information on it?  Any opinions on flite versus espeak?  Would also be interested in any other optimizations to reuse old computers that you're working on.  Thanks.
mterras
« Reply #2 on: December 09, 2012, 08:46:25 AM »

Hi lm,

1/ I didn't imagine myself working on such a project ... hmm ... one year ago. Then came murgaLua (a scripting language with an embedded, powerful and simple-to-use graphical user interface) ... with it, everything became possible. But my basic knowledge of computer science didn't let me write an entire speech recognition engine, especially for a foreign -minor- language like French...

2/ Searching for an old but functional and -always- simple-to-use E.S.R. led me to the fantastic cmu sphinx2 (http://sourceforge.net/projects/cmusphinx/files/sphinx2/), open source naturally, and present in the Debian repos since woody (which corresponds to Damn Small Linux!). This complex app (especially its command-line arguments) does a VERY SIMPLE task for us: it writes the text recognized from your speech to a dummy file. murgaLua then reads this file periodically (every 2 seconds, for example) and extracts the recognized words to send to the system, with the help of the powerful command line!
DON'T EXPECT 100% recognition accuracy! You can improve it by reducing the set of words to be recognized... and by selecting the best-recognized words...
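
The polling loop described in point 2 can be sketched roughly as follows. This is a minimal illustration in Python rather than murgaLua, and it is not taken from Alfreid.lua: the file name, the vocabulary and all function names are hypothetical.

```python
import os
import time

# Hypothetical small vocabulary: a short word list is easier to recognize reliably.
COMMANDS = {"mail", "search", "music", "stop"}


def read_new_words(path, offset):
    """Return (words, new_offset) for the bytes appended to `path` since `offset`."""
    if not os.path.exists(path):
        return [], offset
    with open(path, "rb") as handle:
        handle.seek(offset)
        chunk = handle.read()
    words = chunk.decode("utf-8", errors="replace").lower().split()
    return words, offset + len(chunk)


def pick_commands(words, vocabulary=COMMANDS):
    """Keep only the words that belong to the known command vocabulary."""
    return [word for word in words if word in vocabulary]


def poll_forever(path, handler, interval=2.0):
    """Check the recognizer's output file every `interval` seconds (never returns)."""
    offset = 0
    while True:
        words, offset = read_new_words(path, offset)
        for command in pick_commands(words):
            handler(command)
        time.sleep(interval)
```

Calling `poll_forever("heard.txt", print)` would print each recognized command as it shows up in the file. The sketch assumes the recognizer only ever appends whole words to the file; a word written in two steps could be split across two polls.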

3/ cmu sphinx2 comes with English as its default recognized language (good for you?), so I had to find a French language model, and I found one built for a similar project (but written in Perl), PerlBox-fr (http://perlboxfr.tuxfamily.org/); the model is part of this archive.

4/ The rest is in the source script attached here (42 KB!), named "Alfreid.lua".

5/ As for your question about espeak versus flite, I selected espeak because it is easy to add the better-quality mbrola voices (http://tcts.fpms.ac.be/synthesis/), including two French voices. Alfreid gives vocal "feedback" for some commands, and for the user, understanding this feedback is sometimes "critical". For example, with Alfreid, I kill some Unix processes with my voice. Amazing, no?
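
A vocal feedback call of the kind mentioned in point 5 could look roughly like this. It is a minimal Python sketch, not the code from Alfreid.lua: the function names and default values are hypothetical, and only the standard espeak options `-v` (voice) and `-s` (speed in words per minute) are used. The default voice here is espeak's own French voice; an mbrola voice name would need mbrola and its voice data installed, and older espeak versions had to be piped through mbrola.

```python
import subprocess


def espeak_command(text, voice="fr", speed=140):
    """Build the espeak argument list that speaks `text` with the given voice."""
    return ["espeak", "-v", voice, "-s", str(speed), text]


def speak(text, voice="fr", speed=140):
    """Speak `text` aloud and return espeak's exit code (requires espeak installed)."""
    return subprocess.call(espeak_command(text, voice, speed))
```

Passing the arguments as a list, rather than as one shell string, means the spoken text needs no quoting even if it contains apostrophes, which are common in French.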

6/ At this stage, I have no other material to upload to my personal website than this script, so I'm sharing it with you (attached Alfreid.lua) before uploading a more "finished" version. I must say, too, that I had the mad idea of porting it to MS-Windows (XP) and Mac OS X (Tiger), but I have no time for it... So this attached script works in Windows, but I haven't found an MS-Windows binary of cmu-sphinx2. (I think I have to use the successor of sphinx2, called pocketsphinx, but its different command-line syntax will take time to understand... or use the embedded Microsoft Speech Recognition Engine -> neither free nor open source => no way!) So I used this Windows version as a test for the vocal feedback: a virtual loop with random commands sent to the "command manager" function.

7/ You can find some Linux-Mac-Windows murgaLua scripts on my home site (http://michelterras.perso.sfr.fr/).

Thanks for your interest.

PS: Oops! I forgot to say that this development script was made on a Debian 4 system with murgaLua 0.5.5.
« Last Edit: December 09, 2012, 09:31:26 AM by mterras »
lm
« Reply #3 on: December 12, 2012, 04:39:13 PM »

Not sure if uploads are working to this board.  Couldn't find the attached script.  If you have it somewhere at your site or add it there, I should be able to download from there and check it out. 

I have espeak and flite compiled from source on Windows using mingw so I can cross-test some applications on that machine.  Haven't tried cmu-sphinx2, but it looks like it should compile with mingw as well.  Don't have my Windows development system up at this time or I would have tried it already.  If I do test it out and get it working, do you want a copy of the executable?

The murgaLua download and scripts at your site look really interesting.  I'm personally more comfortable with C/C++ than Lua, but I've done a little Lua programming for use with SciTE.  Look forward to trying murgaLua out with some of your scripts.

If you have any other tips on useful software that runs well on older systems would enjoy hearing about it.
mterras
« Reply #4 on: December 12, 2012, 06:12:09 PM »

Hi lm!
Downloading the attached "Alfreid.lua" script works for me.
I'm uploading it again as an attachment on this board, and putting it on my site NOW!
(http://michelterras.perso.sfr.fr/, section "murgalua"), so you can download the script from there.
You will have to make heavy changes to this script, as my primary development system was Debian Etch and the focus was on the French language model...
There are plenty of command-line calls, which are very "system-specific", but with sites like http://ss64.com/ you can quickly find equivalent commands for Mac OS X, MS Windows and Linux.
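
One way to keep such system-specific calls in a single place is a lookup table keyed by operating system. The following is only a Python sketch under that assumption, not how Alfreid.lua is organized: the action names and the table are hypothetical, while `killall`, `taskkill /F /IM`, `ls -l` and `dir` are the usual commands on their platforms.

```python
import platform

# Hypothetical table: one abstract action, one command template per OS.
COMMAND_TABLE = {
    "kill_process": {
        "Linux": ["killall", "{name}"],
        "Darwin": ["killall", "{name}"],
        "Windows": ["taskkill", "/F", "/IM", "{name}"],
    },
    "list_directory": {
        "Linux": ["ls", "-l", "{path}"],
        "Darwin": ["ls", "-l", "{path}"],
        "Windows": ["cmd", "/c", "dir", "{path}"],
    },
}


def build_command(action, system=None, **params):
    """Return the argument list for `action` on `system` (default: the current OS)."""
    system = system or platform.system()
    template = COMMAND_TABLE[action][system]
    return [part.format(**params) for part in template]
```

For example, `build_command("kill_process", system="Windows", name="firefox.exe")` gives the `taskkill` form, while the same call with `system="Linux"` gives the `killall` form, so the rest of the script never has to mention either command directly.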

If you're not interested in French recognition, you just have to install "basic" cmu-sphinx2, which comes with an American English language model & dictionary.

As for me, I'm very interested if you can send me the Windows binaries of sphinx2, and in your code or executables...
As soon as I have enough time, I'll make my script more "multi-platform", smarter, and more multi-language...

Thanks again for your interest, and for shaking me (I was just about to temporarily leave this project to begin another one!)
lm
« Reply #5 on: December 20, 2012, 12:26:18 PM »

Quote
As for me, i'm very interested if you can send me the windows binaries of sphinx2, and for your codes or executables...

Needed some minor patching, but sphinx2 0.6 appears to build fine with MinGW (4.7.2).  I haven't tried running the executables yet.  Sent you an e-mail via your web site e-mail address.  We'll have to work out how to get the binaries to you.

Also curious if sphinx2 is the best option (since it's no longer actively developed) or if some of the other sphinx projects would be worth looking into.  You also mentioned pocketsphinx, any improvements with that version?  One web site mentioned sphinx2 is currently used with some commercial products.
mterras
« Reply #6 on: December 20, 2012, 06:27:45 PM »

Hi lm,
You're right about pocketsphinx; I was testing it this evening on an MS-Windows XP system.
It's light (as for the binaries), fast, and as accurate as the previous version, cmu-sphinx2. Language models & acoustic data (for French as well as for US English) are easy to find on the CMU Sphinx SourceForge site, and easy to set up, too. The downside is that pocketsphinx is not in the default or earlier Debian repositories (it's mainly in the Ubuntu repos, starting from the Lucid version: http://packages.ubuntu.com/lucid/pocketsphinx-utils). If you can compile it (I think it's written in C?), I would be very interested in getting your binaries for my Linux systems (i386 arch).
I will "keep" CMU-Sphinx2 in this project for earlier versions of Linux kernels, as in DSL and DSL-Not...
Thanks for your always relevant questions.
Sincerely,
« Last Edit: December 20, 2012, 06:54:35 PM by mterras »
mterras
« Reply #7 on: December 28, 2012, 06:43:44 AM »

Well, well, from huge, my project went to VHP (Very Huge Project), i.e. a lot to do!
For the "universal" user (speaking US English), I decided to extend my project with a US English language set.
Now, I'm working with:
  • two operating systems (Windows & Debian Linux 4), with very different command-line syntaxes;
  • two speech recognition engines (Sphinx2 for Linux, and PocketSphinx for Windows), both rather easy to set up and use;
  • two language sets (French & US English), which need some tests for speed and accuracy depending on the selected language model.
I'm having some difficulty keeping the code (script) OS- and language-independent!
My current work (since 27/12/12) is about writing a better table for handling menus and spoken commands...
I will update the Alfreid.lua script on my site soon... as soon as I get a working prototype!
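
A table for spoken commands that stays language-independent might be organized along these lines. This is a hedged Python sketch, not the table from Alfreid.lua: the phrases, action names and function name are all made up for illustration. Each language maps its own phrases onto shared action names, so the rest of the script only deals with actions.

```python
# Hypothetical phrase tables: every language maps onto the same action names.
PHRASES = {
    "en": {"read mail": "read_mail", "search": "web_search", "stop": "stop"},
    "fr": {"lire courrier": "read_mail", "chercher": "web_search", "stop": "stop"},
}


def resolve_action(utterance, language):
    """Return the action name for a recognized utterance, or None if unknown."""
    table = PHRASES[language]
    # Lower-case and collapse whitespace so "  Lire  Courrier " still matches.
    normalized = " ".join(utterance.lower().split())
    return table.get(normalized)
```

Adding a language then means adding one more entry to `PHRASES`, without touching the code that carries out the actions.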

Have a nice holiday season, everyone, for this end of 2012!
mterras
« Reply #8 on: February 03, 2013, 10:55:27 AM »

Hello everybody!
The project progresses slowly but surely. I got various prototypes "in operation", but as said previously, I focused mainly on the French language model.
So, I have four DSL-Not (kernel 2.6.12) systems for running Alfreid, with some problems:
    - the first, P200MMX+256 MB RAM, audio hardware PCI ES1371, compatible with the latest version of CMU-sphinx2 (0.6)
    - the second, Celeron500+128MB RAM (HP Pavilion 8545), audio hardware PCI-1371, compatible with sphinx2
    - the third, VIA Samuel2 800MHz+512MB RAM (mini ITX), audio hardware via82cxxx, compatible with sphinx2
    - the last, Celeron 450+256MB RAM (laptop Samsung VM 7000), audio hardware i810, NOT compatible with sphinx2, and giving this annoying message when cmu-sphinx2 is launched:

Audio ioctl(SPEED): 47280, expected: 16000
FATAL_ERROR: "tty-continuous.c", line 219: ad_open_sps failed
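
A launcher script could watch for this message and warn the user that the sound card refused the requested sample rate. The Python sketch below is only an illustration: the function name is hypothetical, and the pattern is based solely on the single line quoted above, so other sphinx2 versions may word it differently.

```python
import re

# Pattern taken from the one log line quoted in this post.
SPEED_LINE = re.compile(r"Audio ioctl\(SPEED\): (\d+), expected: (\d+)")


def parse_speed_report(line):
    """Return (actual_rate, expected_rate) from a sphinx2 SPEED line, else None."""
    match = SPEED_LINE.search(line)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))
```

If the two numbers differ, as with 47280 versus 16000 here, the audio device did not accept the rate sphinx2 asked for.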

I also noticed that:
- because DSL-N has a "generic" OSS driver and no ALSA driver installed, "piping" espeak to use the higher-quality mbrola voices is NOT possible; you'll have to use the available poor-quality synthetic voices (French is available)...
- espeak (binaries taken from the Debian Etch repo) is NOT compatible with the VIA Samuel2 motherboard (floating point exception error), so you will have to use festival instead (with no default French voice, I think!?). I have not tested festival on this machine, although it is implemented in Alfreid!

I'm going on with the fast prototyping (Debian 4 + MS Windows XP), but giving up on the DSL platforms (sorry, no time!)... If you're interested, go to my murgaLua webpage, download Alfreid's script, and fine-tune it for DSL & DSL-N.

http://michelterras.perso.sfr.fr/index.html?n=15

This prototype IS FULLY WORKING for DSL-N users on compatible hardware (see above). CMU-Sphinx2 is accurate and WAY MORE RESPONSIVE THAN the later pocketsphinx version, which is used on my MS-XP platforms...

lm, if you're still around, what about the MinGW compilation of CMU-sphinx2 to get MS-Windows binaries? Possible or not???

I will update my murgaLua webpage as soon as possible. (Done this 03/02/2013, at 12h25!)

Goodbye!
« Last Edit: February 03, 2013, 11:22:12 AM by mterras »
lm
« Reply #9 on: February 04, 2013, 02:23:25 PM »

Quote
This prototype IS FULLY WORKING for DSL-N users on compatible hardware (see below). CMU-Sphinx2 is accurate and WAY MORE REACTIVE THAN the later pocketsphinx version, that is used on my MS-XP platforms...

LM, if you're still in the corner, what about the MinGW compilation of CMU-sphinx2 to get MS-Windows binaries? Possible or not???

Was definitely wondering if CMU-Sphinx2 or Pocketsphinx was the better way to go.  Thanks for the information.  Haven't tried building Pocketsphinx, but CMU-Sphinx2 builds with MinGW.  I have the binaries if you want them.  Would be curious to hear if it performs better on Windows versus pocketsphinx too or if it's just a Windows versus Linux performance issue.  Going to try building CMU-Sphinx2 on Linux from source when I have more time.  Want to see if the build scripts I created work cross-platform.
mterras
« Reply #10 on: February 05, 2013, 10:51:03 AM »

Quote
CMU-Sphinx2 builds with MinGW.  I have the binaries if you want them
Yes, please, please, please lm, send me the Sphinx2 MS-Windows binaries at my personal mail (michel.terras@laposte.net). I tried to compile it, but it is too complex a task for me, as I don't know MinGW at all!

As for the speed differences between Linux and MS-Windows systems, there will always be the problem of ever-increasing "anti-virus" activity, eating more and more resources (RAM and CPU cycles)... But there are some other explanations, too...
Thanks in advance for your mail.
Sincerely,

PS: Last-minute notes
1/ Yesterday I tried the Debian package wmctrl on DSL-N, which lets you give focus to any window on the desktop from the command line, and it works without ANY dependencies! So I quickly wrote some code to bring Alfreid's main graphical window to the foreground, for example after it has launched Firefox, whose window hides it. I will update the script on my website soon.
2/ I found the marvelous freeware NirCmd, which does the same focusing as wmctrl and much more (sending keystrokes), but this time for the MS-Windows version of Alfreid...
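
The two focusing tools from these notes could be wrapped behind one call, roughly as below. This is a Python sketch, not the code added to Alfreid.lua: the function name and the window title are hypothetical. It relies on `wmctrl -a`, which activates a window whose title contains the given text, and on NirCmd's `win activate ititle` form, which does a similar partial title match; both should be checked against the versions actually installed.

```python
import platform


def focus_command(title, system=None):
    """Return the argument list that brings the window named `title` to the front."""
    system = system or platform.system()
    if system == "Windows":
        # NirCmd: activate the window whose title contains `title`.
        return ["nircmd", "win", "activate", "ititle", title]
    # wmctrl (X11 desktops): activate the window whose title contains `title`.
    return ["wmctrl", "-a", title]
```

After launching the browser, the script would run something like `subprocess.call(focus_command("Alfreid"))` to bring its own window back on top.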
« Last Edit: February 05, 2013, 10:52:41 AM by mterras »
lm
« Reply #11 on: February 20, 2013, 03:01:51 PM »

Sent you an e-mail.  Let me know if you have any issues getting it.