Dear Anna,
I hope it is OK if I give my feedback here.
While it is true that better syntactic handling would give Apertium a
better chance of producing good translations, particularly for
languages that are not closely related, your proposal would need to be
more convincing as to why adding a corpus-based shallow function
labeller would be better than the approaches that are already
available, such as statistical (HMM or sliding-window) part-of-speech
tagging, constraint grammar, or pattern-based syntactic transfer.
Which problems are not handled correctly at present and would be handled
better with the new approach? Which language pairs are you thinking of?
How do you plan to treat possible discrepancies between the tagset of
the UD treebank and the Apertium tagset(s)? How will you ensure that the
machine-learned module, when inserted, will not slow down the Apertium
pipeline too much?
Also, Apertium started as a rule-based machine translation system with
one corpus-based component: an HMM part-of-speech tagger. Later on, some
languages were given rule-based part-of-speech tagging based on
constraint grammars, a move which clearly makes Apertium more
rule-based and more transparent. Therefore, the adoption of a
corpus-based component needs to be justified better.
I hope this helps.
Mikel
On 20/03/17 at 23:08, Анна Кондратьева wrote:
Hello everyone,
I'm a wannabe GSoC student and I'm very interested in Apertium projects.
I have written a draft of my proposal and would be extremely happy
if someone could take a look at it and give me some constructive feedback.
http://wiki.apertium.org/wiki/User:Deltamachine/proposal
Thanks in advance!
------------------------------------------------------------------------------------------------------
Best regards,
Anna Kondratjeva
------------------------------------------------------------------------------
Check out the vibrant tech community on one of the world's most
engaging tech sites, Slashdot.org! http://sdm.link/slashdot
_______________________________________________
Apertium-stuff mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/apertium-stuff
--
Mikel L. Forcada http://www.dlsi.ua.es/~mlf/
Departament de Llenguatges i Sistemes Informàtics
Universitat d'Alacant
E-03690 Sant Vicent del Raspeig
Spain
Office: +34 96 590 9776