On 2019-06-13 22:34, Danielle Rossetti Dos Santos wrote:
Hello,
I'm working with the monolingual transfer rule learning code and have
a few questions:
1. I see some language pairs used to have a multi mode (such as in
this old version of eng-cat [1]). They also used to have "poly"
dictionaries (such as this one [2]). These files seem necessary for
the latest monolingual rule learning script I've found [3]. Why do
language pairs no longer have a multi mode or poly dictionaries?
They are deprecated.
2. Is there a script that can generate a poly dictionary from a
bilingual dictionary?
Not really, no. It is deprecated.
3. The third step in the monolingual rule learning script I linked
above says this should be run:
cat europarl.en-es.es.tagged \
  | ~/source/apertium-lex-tools/multitrans ~/source/apertium-en-es/en-es.autobil -m -f -t -n \
  > europarl.en-es.es.multi-trimmed
I was trying to do this step with the apertium-en-pt language pair
using 10% of the English-Portuguese Europarl corpus. I stopped the
program because the output file was getting really big (dozens of
gigabytes). Is this expected behavior from ./multitrans with the -m
option? If so, how are the English-Spanish Europarl examples run?
Yes, they are run with a very large hard disk. :)
However, in order to work out if there is a problem, it would be
helpful to know
1) what kind of output you are getting
2) what the exact setup is that you are using.
F.
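
One way to see what kind of output the command produces, without filling
the disk, is to run it on a small sample first and extrapolate. The
following is only a minimal sketch, assuming a POSIX shell; the helper
name estimate_output_bytes is hypothetical and not part of
apertium-lex-tools, and the linear extrapolation is an assumption that
may not hold for real corpora.

```shell
# Hypothetical helper (not part of apertium-lex-tools): run a command on the
# first N lines of a file and linearly extrapolate the total output size.
# Usage: estimate_output_bytes INPUT_FILE SAMPLE_LINES COMMAND [ARGS...]
estimate_output_bytes() {
    input=$1
    sample=$2
    shift 2

    # Count the lines of the full input; the arithmetic expansion strips any
    # padding that some wc implementations print.
    total=$(( $(wc -l < "$input") ))
    if [ "$total" -eq 0 ]; then
        echo 0
        return 0
    fi

    # Never sample more lines than the file contains.
    if [ "$sample" -gt "$total" ]; then
        sample=$total
    fi

    # Feed only the sample to the command and count the bytes it emits.
    bytes=$(( $(head -n "$sample" "$input" | "$@" | wc -c) ))

    # Assumption: output size grows roughly linearly with input lines.
    echo $(( bytes * total / sample ))
}

# Example call, reusing the command quoted in the question:
# estimate_output_bytes europarl.en-es.es.tagged 1000 \
#     ~/source/apertium-lex-tools/multitrans \
#     ~/source/apertium-en-es/en-es.autobil -m -f -t -n
```

The printed number is a rough byte count for the full run. Piping the
same sample through head instead of wc -c would show the actual output
lines, which answers point 1) above.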