Re: Intent to ship: Web Speech API - Speech Recognition with Pocketsphinx

Chris Hofmann Thu, 30 Oct 2014 17:47:28 -0700

On 10/30/14 5:24 PM, smaug wrote:

On 10/31/2014 02:21 AM, smaug wrote:
Intent to ship is too strong for this.
We need to first have the implementation landed and tested ;)
I wouldn't ship the implementation in desktop FF without plenty more testing.
But I guess the question is what people think about shipping the Pocketsphinx + API, even if disabled by default.
Andre, we need some numbers here. How much does Pocketsphinx increase binary size? Or download size? When the pref is enabled, how much memory does it use on desktop, and what about on b2g?

This is important work, and the competition is ramping up quickly after many years of promises about this year being the year of voice recognition. We will probably fall behind quickly if we don't get something going here in the next year.

Can you also talk a bit about what the plan and set of challenges look like for expanding the supported languages, and how these would impact the numbers Olli has asked for?

The place we really need this is b2g, but phones are only shipping in international markets right now, so English only is not all that helpful.


-chofmann



-Olli


On 10/31/2014 01:18 AM, Andre Natal wrote:

I've been researching speech recognition in Firefox for two years. First SpeechRTC, then emscripten, and now the Web Speech API with CMU pocketsphinx [1] embedded in the Gecko C++ layer, a project that I had the luck to develop for Google Summer of Code with the mentoring of Olli Pettay, Guilherme Gonçalves, Steven Lee, Randell Jesup plus others, and with the management of Sandip Kamat.

The implementation already works in B2G, Fennec and all FF desktop versions, and the first language supported will be English. The API and implementation conform to the W3C standard [2]. The preference to enable it is: media.webspeech.service.default = pocketsphinx
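
For readers unfamiliar with the API, here is a minimal sketch of how a page might drive grammar-based recognition through the Web Speech API interfaces (SpeechRecognition, SpeechGrammarList.addFromString). It assumes the unprefixed constructors are exposed on the global object and that the service accepts JSGF grammars. The helper names buildJsgfGrammar and recognizeOnce are hypothetical and not part of the patches.

```typescript
// Build a one-rule JSGF grammar string from a list of words.
// buildJsgfGrammar is a hypothetical helper, not part of any API.
function buildJsgfGrammar(name: string, words: string[]): string {
  return `#JSGF V1.0; grammar ${name}; public <${name}> = ${words.join(" | ")} ;`;
}

// Start one recognition pass if the page exposes the Web Speech API.
// Returns false when the constructors are missing (for example, when
// the feature is disabled or the code runs outside a browser).
function recognizeOnce(words: string[], onText: (text: string) => void): boolean {
  const g = globalThis as any; // assumption: avoid relying on DOM typings
  const Recognition = g.SpeechRecognition;
  const GrammarList = g.SpeechGrammarList;
  if (!Recognition || !GrammarList) return false;

  // Restrict recognition to the given words via a grammar of weight 1.
  const grammars = new GrammarList();
  grammars.addFromString(buildJsgfGrammar("words", words), 1);

  const recognition = new Recognition();
  recognition.grammars = grammars;
  recognition.lang = "en-US";
  // Hand the top alternative of the first result to the caller.
  recognition.onresult = (event: any) => onText(event.results[0][0].transcript);
  recognition.start();
  return true;
}
```

A page could call recognizeOnce(["red", "green", "blue"], text => console.log(text)) and fall back to another input method when it returns false.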

The required patches to achieve this are:

  - Import pocketsphinx sources in Gecko. Bug 1051146 [3]
  - Embed English models. Bug 1065911 [4]
  - Change SpeechGrammarList to store grammars inside SpeechGrammar objects. Bug 1088336 [5]
  - Creation of a SpeechRecognitionService for Pocketsphinx. Bug 1051148 [6]



Also, other important features for which we don't have patches yet:
  - Relax the VAD strategy to be less strict and avoid stopping in the middle of speech when speaking low-volume phonemes [7]
  - Integrate or develop a grapheme-to-phoneme algorithm for realtime generation when compiling grammars [8]
  - Include and build models for other languages [9]
  - Continuous and wordspotting recognition [10]

The WIP repo is here [11], and this Air Mozilla video [12] plus this wiki [13] have more detailed info.

In this comment you can see a flame graph of CPU usage while recognition is happening [14]

I wish to hear your comments.

Thanks,

Andre Natal

[1] http://cmusphinx.sourceforge.net/
[2] https://dvcs.w3.org/hg/speech-api/raw-file/tip/speechapi.html
[3] https://bugzilla.mozilla.org/show_bug.cgi?id=1051146
[4] https://bugzilla.mozilla.org/show_bug.cgi?id=1065911
[5] https://bugzilla.mozilla.org/show_bug.cgi?id=1088336
[6] https://bugzilla.mozilla.org/show_bug.cgi?id=1051148
[7] https://bugzilla.mozilla.org/show_bug.cgi?id=1051604
[8] https://bugzilla.mozilla.org/show_bug.cgi?id=1051554
[9] https://bugzilla.mozilla.org/show_bug.cgi?id=1065904 and
https://bugzilla.mozilla.org/show_bug.cgi?id=1051607
[10] https://bugzilla.mozilla.org/show_bug.cgi?id=967896
[11] https://github.com/andrenatal/gecko-dev

[12] https://air.mozilla.org/mozilla-weekly-project-meeting-20141027/ (Jump to 12:00)
[13] https://wiki.mozilla.org/SpeechRTC_-_Speech_enabling_the_open_web
[14] https://bugzilla.mozilla.org/show_bug.cgi?id=1051148#c14


_______________________________________________
dev-platform mailing list
dev-platform@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-platform

