Re: Intent to ship: Web Speech API - Speech Recognition with Pocketsphinx

Sandip Kamat Fri, 14 Nov 2014 15:37:42 -0800

Hi Andre, I suggest we update the wiki with these sizes (as well as answers to
the other questions in this thread) so we can use it as a central place for
this info.


-Sandip 

----- Original Message -----

> From: "Andre Natal" <ana...@gmail.com>
> To: "smaug" <sm...@welho.com>
> Cc: "Sandip Kamat" <ska...@mozilla.com>, dev-platform@lists.mozilla.org
> Sent: Saturday, November 8, 2014 8:50:44 PM
> Subject: Re: Intent to ship: Web Speech API - Speech Recognition with
> Pocketsphinx

> Hi Olli,

> > How much does Pocketsphinx increase binary size? or download size?

> In the past it was suggested that we avoid shipping the models with the
> packages and instead create a preferences panel in the apps that lets the
> user download the models they want.

> As for the size of the pocketsphinx libraries themselves, on Mac OS they
> total ~2.3 MB [1]. I don't know what kind of compression the build system
> applies when compiling/packaging, but it should be efficient enough.

> [1]
> MacBook-Air-de-AndreNatal:gecko-dev andrenatal$ ls -lsa /usr/local/lib/libsphinxbase.a
> 2184 -rw-r--r-- 1 root admin 1114840 Jul 7 14:39 /usr/local/lib/libsphinxbase.a
> MacBook-Air-de-AndreNatal:gecko-dev andrenatal$ ls -lsa /usr/local/lib/libpocketsphinx.a
> 2352 -rw-r--r-- 1 root admin 1201240 Jul 7 14:52 /usr/local/lib/libpocketsphinx.a
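
For reference, the ~2.3 MB figure can be checked from the byte counts in the listing above. The following is only a minimal TypeScript sketch: the names librarySizes and totalMegabytes are illustrative, and it measures the static archives as listed, not whatever ends up in a compressed, shipped package.

```typescript
// Byte sizes copied from the `ls -lsa` output quoted above.
const librarySizes: Record<string, number> = {
  "libsphinxbase.a": 1114840,
  "libpocketsphinx.a": 1201240,
};

// Sum the archive sizes and convert to decimal megabytes (1 MB = 1,000,000 bytes).
function totalMegabytes(sizes: Record<string, number>): number {
  let totalBytes = 0;
  for (const name of Object.keys(sizes)) {
    totalBytes += sizes[name];
  }
  return totalBytes / 1e6;
}

console.log(totalMegabytes(librarySizes).toFixed(1)); // prints "2.3"
```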

> > When the pref is enabled, how much does it use memory on desktop, what
> > about on b2g?

> On b2g, it uses memory only after the decoder has been activated and has
> loaded the models. I profiled it on a ZTE Open C; here is the report [2] and
> here is the exact snapshot [3]. It seems ~21 MB is used after the models are
> loaded.

> On desktop Mac OS Nightly, the memory usage was ~11 MB.

> [2] https://www.dropbox.com/s/cf1drl3thkf6mp1/memory-reports?dl=0
> [3] https://www.dropbox.com/s/1rt6z9t5h30whn0/Vaani_b2g_openc.png?dl=0

> > > -Olli

> > > On 10/31/2014 01:18 AM, Andre Natal wrote:

> > > > I've been researching speech recognition in Firefox for two years:
> > > > first SpeechRTC, then emscripten, and now the Web Speech API with CMU
> > > > pocketsphinx [1] embedded in the Gecko C++ layer, a project that I had
> > > > the luck to develop for Google Summer of Code with the mentoring of
> > > > Olli Pettay, Guilherme Gonçalves, Steven Lee, Randell Jesup plus
> > > > others, and with the management of Sandip Kamat.

> > > > The implementation already works in B2G, Fennec and all FF desktop
> > > > versions, and the first language supported will be English. The API
> > > > and implementation conform to the W3C standard [2]. The preference to
> > > > enable it is: media.webspeech.service.default = pocketsphinx
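
As a rough illustration of how a page might exercise such an implementation through the Web Speech API, the sketch below builds a small JSGF grammar and hands it to SpeechRecognition via SpeechGrammarList. It is not code from the patches: the helper names (buildJsgfGrammar, startRecognition), the grammar and rule names, and the choice of a JSGF word list are assumptions made for the example, and it presumes the recognition service has been enabled through the preference mentioned above.

```typescript
// Build a JSGF grammar string offering a fixed set of alternative words.
function buildJsgfGrammar(grammarName: string, ruleName: string, words: string[]): string {
  return `#JSGF V1.0; grammar ${grammarName}; public <${ruleName}> = ${words.join(" | ")} ;`;
}

// Start recognition if the environment exposes the Web Speech API.
// Returns false when the API is unavailable (e.g. pref disabled, or not a browser).
function startRecognition(words: string[], onResult: (transcript: string) => void): boolean {
  const scope = globalThis as any;
  const Recognition = scope.SpeechRecognition || scope.webkitSpeechRecognition;
  const GrammarList = scope.SpeechGrammarList || scope.webkitSpeechGrammarList;
  if (!Recognition || !GrammarList) {
    return false;
  }

  // Register the grammar with weight 1 so recognition is limited to these words.
  const grammars = new GrammarList();
  grammars.addFromString(buildJsgfGrammar("commands", "command", words), 1);

  const recognition = new Recognition();
  recognition.grammars = grammars;
  recognition.lang = "en-US";
  // Report the top transcript of the first result to the caller.
  recognition.onresult = (event: any) => onResult(event.results[0][0].transcript);
  recognition.start();
  return true;
}
```

A caller would invoke something like startRecognition(["call", "open", "close"], t => console.log(t)) from a page and fall back to another input method when it returns false.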

> > > > The patches required to achieve this are:

> > > > - Import pocketsphinx sources in Gecko. Bug 1051146 [3]
> > > > - Embed English models. Bug 1065911 [4]
> > > > - Change SpeechGrammarList to store grammars inside SpeechGrammar
> > > > objects. Bug 1088336 [5]
> > > > - Creation of a SpeechRecognitionService for Pocketsphinx. Bug 1051148
> > > > [6]

> > > > Also, other important features that we don't have patches for yet:
> > > > - Relax the VAD strategy to be less strict and avoid stopping in the
> > > > middle of speech when speaking low-volume phonemes [7]
> > > > - Integrate or develop a grapheme-to-phoneme algorithm for real-time
> > > > generation when compiling grammars [8]
> > > > - Include and build models for other languages [9]
> > > > - Continuous and word-spotting recognition [10]

> > > > The WIP repo is here [11], and this Air Mozilla video [12] plus this
> > > > wiki [13] have more detailed info.

> > > > In this comment you can see the CPU usage on Flame while recognition
> > > > is happening [14].

> > > > I'd like to hear your comments.

> > > > Thanks,

> > > > Andre Natal

> > > > [1] http://cmusphinx.sourceforge.net/
> > > > [2] https://dvcs.w3.org/hg/speech-api/raw-file/tip/speechapi.html
> > > > [3] https://bugzilla.mozilla.org/show_bug.cgi?id=1051146
> > > > [4] https://bugzilla.mozilla.org/show_bug.cgi?id=1065911
> > > > [5] https://bugzilla.mozilla.org/show_bug.cgi?id=1088336
> > > > [6] https://bugzilla.mozilla.org/show_bug.cgi?id=1051148
> > > > [7] https://bugzilla.mozilla.org/show_bug.cgi?id=1051604
> > > > [8] https://bugzilla.mozilla.org/show_bug.cgi?id=1051554
> > > > [9] https://bugzilla.mozilla.org/show_bug.cgi?id=1065904 and
> > > > https://bugzilla.mozilla.org/show_bug.cgi?id=1051607
> > > > [10] https://bugzilla.mozilla.org/show_bug.cgi?id=967896
> > > > [11] https://github.com/andrenatal/gecko-dev
> > > > [12] https://air.mozilla.org/mozilla-weekly-project-meeting-20141027/
> > > > (Jump to 12:00)
> > > > [13] https://wiki.mozilla.org/SpeechRTC_-_Speech_enabling_the_open_web
> > > > [14] https://bugzilla.mozilla.org/show_bug.cgi?id=1051148#c14
_______________________________________________
dev-platform mailing list
dev-platform@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-platform
