Hi Jacob,
Thanks for your compliment.
Dixtools: I tested sort and merge a long time ago. But
autoconcord puzzles me: it's obviously used for the Swedish-Danish pair
(sv-da), and I suppose I break it by introducing compound paradigms.
I will dig deeper into this later.
My tools are intended as a first step towards a Windows GUI for laymen
wanting to contribute to dictionaries. Something I possibly cannot
achieve without help.
Yours,
Per Tunedal
On Mon, Jan 7, 2013, at 17:00, Jacob Nordfalk wrote:
Hej Per,
Fine code, most of us have made our own code at some point.
In Linux these tools are not that needed, as there are powerful
tools like sed and grep that make text processing from the command line
easy enough.
WRT formatting and sorting, there is apertium-dixtools.
It's even mentioned in your dixes:
<?xml version="1.0" encoding="UTF-8"?>
<!--
Dictionary:
Sections: 3
Entries: 10140
Sdefs: 64
Paradigms: 341
Last processed by: apertium-dixtools fix apertium-sv-da.da.dix x.dix
-->
You should try that out, and you might make an apertium-dixtools
extension that fits your needs.
dixtools is a monster program that reads (several) dixes into memory
as Java objects and lets you manipulate them and save them again. It
remembers formatting as well.
$ apertium-dixtools
Usage: apertium-dixtools [task] [generic options] [task parameters]
...
Tasks:
  autorestrict:     automatically adds restrictions on a bidix so no ambiguity exitst
  autoconcord:      automatically makes gender, number etc in bidix concord with the monodices
  cross:            cross 2 language pairs (using linguistic res. XML file)
  cross-param:      cross 2 language pairs (using command line parameters)
  merge-morph:      merges two morphological dictionaries (monodix)
  equiv-paradigms:  finds equivalent paradigms and updates references
  list:             lists entries in a dictionary
  reverse-bil:      reverses a bilingual dictionary
  sort:             sorts (and groups by category) a dictionary
  format:           formats a dictionary (according to Generic Options)
  fix:              fix a dictionary (remove duplicates, convert spaces)
For help on a task, invoke it without parameters
Generic options: (mostly for tasks that outputs dix files)
  -debug                  print extra debugging information
  -useTabs                use tabs (instead of default 2 spaces) when indenting
  -noProcComments         don't add processing comments (telling what was done)
  -noHeader               don't put header comment with a summary in the top
  -stripEmptyLines        removes empty lines (originating from original file)
  -alignBidix             align a bidix (<p> or <i> at col 10, <r> at col 55)
  -alignMonodix           align a monodix (pardef 10, 30, other entries 25, 45)
  -align [[E] P R]        custom align (default <p>/<i> at col 10, <r> at col 55)
  -alignpardef [[E] P R]  paradigm alignment (if differ from general align)
  -noalign                old, noncompact XML-ish output (one tag per line, lots of indents)
If no -align option is specified, the alignment is autodetected
Use - as file name for piping (read/write .dix files on standard input/output)
More info: [1]http://wiki.apertium.org/wiki/Apertium-dixtools
2013/1/7 Per Tunedal <[2][email protected]>
Hi,
yes, you're right. But ...
The pair Swedish-Danish (sv-da) is full of comments about different
groups of words (within the categories). Comments like: check these
etc.
Sorting would create a terrible mess! And when the dictionary isn't
sorted, it's not obvious where to put a new entry.
My simple tools make it possible to add new words without solving all
the old problems. I prefer to tackle them one at a time, when I feel
like it.
BTW: isn't it a shame that we all have our own personal tools for
adding words? It would be great if we could build a universal tool for
adding words.
Yours,
Per Tunedal
PS I've just updated my tools to version 0.2. I've added the possibility
to create a monodix with the help of a bidix, and fixed a lot of bugs.
Now you can work like this:
A. Create a monodix in language 1.
B. Create a bidix from the monodix (simply adding the translation),
and finally,
C. Create a monodix in language 2 from the bidix.
[3]http://www.tunedal.nu/download/AddToDix/
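Step B above can be pictured roughly as follows. This is only a minimal Python sketch and not how the AddToDix tools actually work: the sample monodix, the function names monodix_lemmas and bidix_entry, and the hand-made translation table are all made up for illustration. It also assumes that paradigm names end in "__<category>" (as in "hus__n"), and it does not handle multiword lemmas.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

# Tiny made-up monodix, for illustration only.
MONODIX = """<dictionary>
  <pardefs>
    <pardef n="hus__n"><e><p><l/><r><s n="n"/></r></p></e></pardef>
  </pardefs>
  <section id="main" type="standard">
    <e lm="hus"><i>hus</i><par n="hus__n"/></e>
  </section>
</dictionary>"""


def monodix_lemmas(monodix_xml):
    """Yield (lemma, category) for every entry that carries an lm attribute."""
    root = ET.fromstring(monodix_xml)
    for entry in root.iter("e"):
        lemma = entry.get("lm")
        par = entry.find("par")
        if lemma is None or par is None:
            continue  # e.g. entries inside pardefs have no lemma
        name = par.get("n", "")
        if "__" not in name:
            continue  # the naming assumption does not hold; skip the entry
        # Assumption: the text after the last "__" is the category tag.
        yield lemma, name.rpartition("__")[2]


def bidix_entry(lemma, category, translation):
    """Return one bidix entry pairing lemma and translation, same category."""
    tag = '<s n="%s"/>' % category
    return "<e><p><l>%s%s</l><r>%s%s</r></p></e>" % (
        escape(lemma), tag, escape(translation), tag)


if __name__ == "__main__":
    translations = {"hus": "hus"}  # the translations are supplied by hand
    for lemma, category in monodix_lemmas(MONODIX):
        print(bidix_entry(lemma, category, translations[lemma]))
```

The translation is the only part that has to be typed in; the lemma and its category are read from the monodix.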
On Sun, Jan 6, 2013, at 23:31, Bernard Chardonneau wrote:
> > X-Mailer: MessagingEngine.com Webmail Interface - html
> > Date: Wed, 02 Jan 2013 17:32:44 +0100
> > From: Per Tunedal <[4][email protected]>
> > To: Apertium Stuff <[5][email protected]>
> > Reply-To: [6][email protected]
> > Subject: [Apertium-stuff] Tools for adding to dictionaries
> >
> > Hi again,
> > (..........)
> >
> > A side effect is that it will be unnecessary to sort the dictionaries!
> > You can be sure that you don't add some word already present and new
> > words can be pasted anywhere in the dictionary files.
> >
> I will just answer this point.
>
> As I wrote previously, I also have my own tools for adding words,
> and I think many of us work like this.
>
> But sometimes, we just have to correct a mistake.
>
> For instance, last week I compared the analyses of French produced by
> the eo-fr and fr-es pairs, and I found an analysis of the surface form
> "joue" as the verb "jouir", which is wrong. The solution was to change
> the paradigm for this verb in the French monodix.
>
> To do that, a text editor is enough, and it has a search function.
> But if you just search for the string "jouir", you can find longer
> words containing the same letters. So, if you did not think to type
> the word between "" for a monodix and between >< for a bidix, having
> words in alphabetical order makes it easier to see whether you have
> gone too far in the file (when working on more than one word).
>
> So, an automatic sorting when adding words may save time and make
> dictionaries more pleasant to edit on other occasions.
>
> > (..........)
> >
> > Yours,
> > Per Tunedal
> >
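Bernard's point about exact matching can be sketched in a few lines of Python. This is only an illustration: the two sample entries, their paradigm name, and the function names find_entries and substring_matches are made up, and a real monodix is of course much larger. Comparing the whole lm attribute finds only the wanted verb, while a plain substring search also hits longer words.

```python
import xml.etree.ElementTree as ET

# Made-up sample entries; the paradigm name is illustrative only.
SAMPLE = """<dictionary>
  <section id="main" type="standard">
    <e lm="jouir"><i>jou</i><par n="fin/ir__vblex"/></e>
    <e lm="réjouir"><i>réjou</i><par n="fin/ir__vblex"/></e>
  </section>
</dictionary>"""


def find_entries(dix_xml, lemma):
    """Return the <e> elements whose lm attribute equals lemma exactly."""
    root = ET.fromstring(dix_xml)
    return [e for e in root.iter("e") if e.get("lm") == lemma]


def substring_matches(dix_xml, text):
    """Return every lemma containing text, as a naive editor search would."""
    root = ET.fromstring(dix_xml)
    return [e.get("lm") for e in root.iter("e") if text in (e.get("lm") or "")]


if __name__ == "__main__":
    print([e.get("lm") for e in find_entries(SAMPLE, "jouir")])
    print(substring_matches(SAMPLE, "jouir"))
```

An exact match like this works whether or not the dictionary is sorted, which is the same effect as typing the surrounding quotes in an editor search.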
--snip--
_______________________________________________
Apertium-stuff mailing list
[9][email protected]
[10]https://lists.sourceforge.net/lists/listinfo/apertium-stuff
--
[11]Jacob Nordfalk
[12]javabog.dk
Android developer and instructor at [13]IHK and [14]Lund&Bendsen
References
1. http://wiki.apertium.org/wiki/Apertium-dixtools
2. mailto:[email protected]
3. http://www.tunedal.nu/download/AddToDix/
4. mailto:[email protected]
5. mailto:[email protected]
6. mailto:[email protected]
7. http://ASP.NET/
8. http://p.sf.net/sfu/learnmore_122412
9. mailto:[email protected]
10. https://lists.sourceforge.net/lists/listinfo/apertium-stuff
11. http://profiles.google.com/jacob.nordfalk
12. http://javabog.dk/
13. http://cv.ihk.dk/diplomuddannelser/itd/vf/MAU
14. https://www.lundogbendsen.dk/undervisning/beskrivelse/LB1809/
15. http://p.sf.net/sfu/learnmore_122412
16. mailto:[email protected]
17. https://lists.sourceforge.net/lists/listinfo/apertium-stuff