On Thu, Aug 09, 2012 at 02:54:27PM +0200, Kevin Brubeck Unhammer wrote:
> Francis Tyers <[email protected]> writes:
>
> What you're describing is gisting/translation for understanding; I can't
> imagine gisting MT would be very useful for sv-nb/nn (and I suspect
> people would use Google for that anyway). But with these closely related
> languages, it's possible to get to a standard good enough for
> post-editing (pre-publishing), e.g. with OmegaT as you mentioned, and in
> that case the users definitely know which language it is already.
>
> > There are three possibilities.
> >
> > (1) You can make an sv-nb (or sv-nn) translator, and then include a
> > subset of the nn-nb translator in it, piping the output of sv-nb into
> > sv-nn. (here you would have an sv-nb dictionary and an nb-nn dictionary)
> >
> > (2) You make two translators in parallel.
> >
> > (3) You make the two translators in the one pair. For this, you could
> > have the same Swedish dictionary, but would need different nb and nn
> > dictionaries, different sv-nb and sv-nn dictionaries and different sv-nb
> > and sv-nn transfer rules.
> >
> > I think that (3) is probably best, but would like input from others
> > (e.g. Unhammer or Trond).
>
> (3) sounds best to me too. Perhaps you could even do with one bidix, and
> just use the alt="nn" vs alt="nb" attribute; a rough and dirty count
> shows that the majority of entries in the nn-nb bidix carry over the
> same lemma/tag:
>
> $ lt-expand apertium-nn-nb.nn-nb.dix | grep -v ':[<>]:' | awk -F: '$1==$2' | wc -l
> 71628
> $ lt-expand apertium-nn-nb.nn-nb.dix | grep -v ':[<>]:' | awk -F: '$1!=$2' | wc -l
> 11365
>
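
A side note on the two counts quoted above: both can be had from a single
pass over the lt-expand output. What follows is only a rough sketch, not
anything that exists in the repository. The function name count_bidix_sides
is made up for illustration, and it assumes that lt-expand prints one
left:right pair per line, with direction-restricted entries marked :<: or
:>: (the lines that the grep in the quoted commands filters out).

```shell
# Hypothetical helper: reads lt-expand output on stdin and reports how many
# unrestricted entries have identical vs. different left and right sides.
count_bidix_sides() {
  # Drop direction-restricted entries (left:<:right or left:>:right),
  # then compare the two colon-separated sides as strings.
  grep -v ':[<>]:' | awk -F: '
    ($1 "") == ($2 "") { same++ }
    ($1 "") != ($2 "") { diff++ }
    END { printf "same %d\ndiff %d\n", same, diff }
  '
}

# Usage, with the file name from the quoted message:
#   lt-expand apertium-nn-nb.nn-nb.dix | count_bidix_sides
```

On the nn-nb bidix this should report the same two numbers as the separate
commands, provided the assumptions about the output format hold.
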
As Danish is a kind of old Norwegian bokmaal, maybe we could include that
language too. Then all three languages could benefit from the combined work.

> >> B. I have looked in the repository and found that some work has been
> >> done on the following dictionaries:
> >>
> >> Danish (da) - Norwegian Bokmål (nb) - nursery
> >> Swedish (sv) - Norwegian Bokmål (nb) - incubator
> >>
> >> Tihomir told me he's working on Swedish-Icelandic and has expanded the
> >> Swedish monolingual dictionary from sv-da. But which is the most
> >> complete Norwegian Bokmål (nb) monolingual dictionary? The one from the
> >> pair Norwegian Bokmål (nb) - Norwegian Nynorsk (nn)?
> >
> > Yes, I would take the Swedish dictionary from sv-is and the Norwegian
> > dictionar(y,ies) from nn-nb.

I have also been working on Swedish nouns from SALDO. I was working on a
scheme that could remove about 60% of the ambiguities.

Best regards
keld

_______________________________________________
Apertium-stuff mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/apertium-stuff
