Hi all,
As usual, it's available from
http://www.imsc.res.in/~golam/anubadok/
Alternatively, you may try Google("Anubadok") or even
Google("অনুবাদক") :-)
Here are the release notes...
Anubadok started a little more than a year ago. Within this
time frame, and given my own constraints and limitations, it
is heartening to announce its first official release.
A few key features of the first official release:
* For the first time, the number of entries in Anubadok's
English-to-Bengali dictionary has reached five figures.
It currently stands at 10,128.
* Anubadok now completely supports the free "gposttl" tagger
along with the restricted "treetagger". Further, it
includes some error-correcting code for some known
tagging errors of gposttl. This also means that one
can expect slightly higher translation accuracy
when using "gposttl" than when using "treetagger".
* Anubadok now has an improved proper noun handling
mechanism. For example, it can now recognise a pattern
like "Bay of Bengal" as a single proper noun and will
translate it as "bangopsagor" instead of "banglar
upsagor". Although it can recognise such patterns,
for the translation to proceed it needs to have a
corresponding entry ("bay.of.bengal") in its E2B
dictionary. Otherwise, Anubadok will use a fall-back
mechanism and will translate the phrase as usual.
Nevertheless, it will report the missing entry through
"new_words.list".
* Documentation has been slightly improved, though it
still needs more work.
* English sentence splitter for complex sentences:
This has been lacking ever since Anubadok was born, but
recently I have started working on it. Version 0.1
itself has some code for it. This will be the main
area of thrust for anubadok-0.2, which is available
as the CVS version.
* Anubadok-0.1 does not have any dedicated program for
Wikipedia translation, but one can use the "anubadok" or
"english2bangla" scripts for generic translations.
However, anubadok-0.2-cvs now includes "wiki_anubadok"
for translation of wiki text. A script, "wikiget", is
also included in the package for fetching wiki articles
in text format; you just need to give it the title of
an English article.
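
For the curious, the proper noun lookup described in the list
above can be sketched roughly as follows. This is only an
illustration in Python, not Anubadok's actual code: the function
names, the fallback callable and the one-key-per-line layout
assumed for "new_words.list" are all hypothetical.

```python
# Hypothetical sketch of the proper-noun lookup with fall-back.
# None of these names come from Anubadok itself.

def phrase_key(words):
    """Join the words of a phrase into a key like 'bay.of.bengal'."""
    return ".".join(word.lower() for word in words)


def translate_proper_noun(words, e2b, fallback, missing):
    """Translate a recognised multi-word proper noun.

    words    -- the English words of the phrase, in order
    e2b      -- dict standing in for the E2B dictionary
    fallback -- callable taking the word list and returning the
                usual translation (assumed interface)
    missing  -- list collecting keys that had no dictionary entry
    """
    key = phrase_key(words)
    if key in e2b:
        return e2b[key]
    # No entry: remember the key so it can be reported later,
    # then translate the phrase the usual way.
    missing.append(key)
    return fallback(words)


def report_missing(missing, path="new_words.list"):
    """Append missing keys to a report file, one per line (assumed format)."""
    with open(path, "a", encoding="utf-8") as handle:
        for key in missing:
            handle.write(key + "\n")


# Small usage example with a toy dictionary.
toy_e2b = {"bay.of.bengal": "bangopsagor"}
not_found = []
print(translate_proper_noun(["Bay", "of", "Bengal"], toy_e2b,
                            lambda ws: "banglar upsagor", not_found))
```

With the toy dictionary the phrase is found directly; if the
"bay.of.bengal" entry were removed, the fallback result would be
returned instead and the key would end up in the list of missing
entries.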
-------------------------------------------------------------
Lastly, I thought of drawing up my own balance sheet of how
much I could and couldn't do in the last year, mainly using
the pre-release versions of Anubadok (it's also sort of
beating my own drum :-)). In the last few months, I have
translated more than six thousand PO strings in KDE. I am
sure I would have come nowhere near that without using
it. So to conclude, though Anubadok still has a long way to
go, its current performance, given the amount of code in it,
is certainly encouraging for the future of Bengali machine
translation.
Cheers,
Golam
--
http://www.imsc.res.in/~golam/