Hi Olli, In general for FxOS devices, the thought is to let the OEMs decide 
which language models they would like to ship preloaded. That way there is 
partner choice based on region, but users could also directly download the 
packages they like. For now, since we are at a very early stage, we only have 
English support. We need help building and testing other language models in 
parallel. 

Sandip 

----- Original Message -----

> From: "Andre Natal" <ana...@gmail.com>
> To: "smaug" <sm...@welho.com>
> Cc: "Sandip Kamat" <ska...@mozilla.com>, dev-platform@lists.mozilla.org
> Sent: Saturday, November 8, 2014 8:50:44 PM
> Subject: Re: Intent to ship: Web Speech API - Speech Recognition with
> Pocketsphinx

> Hi Olli,

> > How much does Pocketsphinx increase binary size? Or download size?

> In the past it was suggested that we avoid shipping the models with the
> packages, and instead create a preferences panel in the apps to let users
> download the models they want.

> About the size of the pocketsphinx libraries themselves: on Mac OS they add
> up to ~2.3 MB [1]. I don't know which kind of compression the build system
> applies when compiling/packaging, but it should be efficient enough.

> [1]
> MacBook-Air-de-AndreNatal:gecko-dev andrenatal$ ls -lsa /usr/local/lib/libsphinxbase.a
> 2184 -rw-r--r-- 1 root admin 1114840 Jul 7 14:39 /usr/local/lib/libsphinxbase.a
> MacBook-Air-de-AndreNatal:gecko-dev andrenatal$ ls -lsa /usr/local/lib/libpocketsphinx.a
> 2352 -rw-r--r-- 1 root admin 1201240 Jul 7 14:52 /usr/local/lib/libpocketsphinx.a

> > When the pref is enabled, how much memory does it use on desktop, and what
> > about on b2g?

> On b2g, it uses memory only after the decoder is activated and the models are
> loaded. I did a profile on a ZTE Open C; here is the report [2] and here is
> the exact snapshot [3]. It seems ~21 MB is used after loading the models.

> On desktop Mac OS Nightly, the memory usage was ~11 MB.

> [2] https://www.dropbox.com/s/cf1drl3thkf6mp1/memory-reports?dl=0
> [3] https://www.dropbox.com/s/1rt6z9t5h30whn0/Vaani_b2g_openc.png?dl=0

> > -Olli

> > On 10/31/2014 01:18 AM, Andre Natal wrote:

> > > I've been researching speech recognition in Firefox for two years: first
> > > SpeechRTC, then emscripten, and now the Web Speech API with CMU
> > > pocketsphinx [1] embedded in Gecko's C++ layer, a project that I had the
> > > luck to develop for Google Summer of Code with the mentoring of Olli
> > > Pettay, Guilherme Gonçalves, Steven Lee, Randell Jesup, and others, and
> > > with the management of Sandip Kamat.

> > > The implementation already works in B2G, Fennec, and all Firefox desktop
> > > versions, and the first language supported will be English. The API and
> > > implementation conform to the W3C standard [2]. The preference to enable
> > > it is: media.webspeech.service.default = pocketsphinx
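
> > > For illustration, here is a minimal sketch of how a page could drive the
> > > recognizer through the standard API once that preference is set. The
> > > names come from the W3C spec [2]; the event handling below is just
> > > example usage, not the only way to use it:

> > > // Assumes media.webspeech.service.default has been set to "pocketsphinx",
> > > // e.g. in user.js (or the equivalent entry in about:config):
> > > //   user_pref("media.webspeech.service.default", "pocketsphinx");
> > > const recognition = new SpeechRecognition();
> > > recognition.lang = 'en-US';
> > > recognition.onresult = (event) => {
> > >   // First alternative of the first result.
> > >   console.log('heard:', event.results[0][0].transcript);
> > > };
> > > recognition.onerror = (event) => console.error('recognition error:', event.error);
> > > recognition.start();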

> > > The patches required to achieve this are:

> > > - Import the pocketsphinx sources into Gecko. Bug 1051146 [3]
> > > - Embed the English models. Bug 1065911 [4]
> > > - Change SpeechGrammarList to store grammars inside SpeechGrammar objects
> > >   (see the grammar sketch after this list). Bug 1088336 [5]
> > > - Create a SpeechRecognitionService for Pocketsphinx. Bug 1051148 [6]
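
> > > To make the SpeechGrammarList change concrete, here is a small sketch of
> > > how a page hands a JSGF grammar to the recognizer. Again this is only
> > > illustrative usage of the API in the spec [2]; the grammar itself is a
> > > toy example:

> > > // A toy JSGF grammar; each addFromString() call is stored as its own
> > > // SpeechGrammar object inside the SpeechGrammarList (the point of bug 1088336).
> > > const jsgf = '#JSGF V1.0; grammar colors; public <color> = red | green | blue ;';
> > > const grammars = new SpeechGrammarList();
> > > grammars.addFromString(jsgf, 1.0); // optional weight between 0.0 and 1.0
> > > recognition.grammars = grammars;   // 'recognition' from the earlier sketch
> > > recognition.start();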

> > > Also, other important features for which we don't have patches yet:

> > > - Relax the VAD strategy to be less strict, so recognition doesn't stop
> > >   in the middle of speech when low-volume phonemes are spoken [7]
> > > - Integrate or develop a grapheme-to-phoneme algorithm for real-time
> > >   generation when compiling grammars [8]
> > > - Include and build models for other languages [9]
> > > - Continuous and word-spotting recognition [10]

> > > The WIP repo is here [11], and this Air Mozilla video [12] plus this
> > > wiki [13] have more detailed info.

> > > In this comment you can see the CPU usage on a Flame while recognition
> > > is happening [14].

> > > I would like to hear your comments.

> > > Thanks,

> > > Andre Natal

> > > [1] http://cmusphinx.sourceforge.net/
> > > [2] https://dvcs.w3.org/hg/speech-api/raw-file/tip/speechapi.html
> > > [3] https://bugzilla.mozilla.org/show_bug.cgi?id=1051146
> > > [4] https://bugzilla.mozilla.org/show_bug.cgi?id=1065911
> > > [5] https://bugzilla.mozilla.org/show_bug.cgi?id=1088336
> > > [6] https://bugzilla.mozilla.org/show_bug.cgi?id=1051148
> > > [7] https://bugzilla.mozilla.org/show_bug.cgi?id=1051604
> > > [8] https://bugzilla.mozilla.org/show_bug.cgi?id=1051554
> > > [9] https://bugzilla.mozilla.org/show_bug.cgi?id=1065904 and
> > >     https://bugzilla.mozilla.org/show_bug.cgi?id=1051607
> > > [10] https://bugzilla.mozilla.org/show_bug.cgi?id=967896
> > > [11] https://github.com/andrenatal/gecko-dev
> > > [12] https://air.mozilla.org/mozilla-weekly-project-meeting-20141027/ (Jump to 12:00)
> > > [13] https://wiki.mozilla.org/SpeechRTC_-_Speech_enabling_the_open_web
> > > [14] https://bugzilla.mozilla.org/show_bug.cgi?id=1051148#c14
_______________________________________________
dev-platform mailing list
dev-platform@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-platform
