Hello,

>> When trying to maintain a huge database (something akin to
>> a master game repository), I encounter an annoying
>> problem.  Double-checking names easily leads to a nightmare:
>> I have to choose some name of a player without knowing
>> exactly who to pick.  Szabo, for instance, seems to be a
>> popular name for a chess player.
>>
>
> It is. It is just a popular name anyway. VIAF lists 827
> persons of that name. And VIAF is only built for persons who
> published some book or the like...


I did not know about VIAF.  Great stuff !  What I mean is that there is no
point in spellchecking a big file if all these decisions have to be made by
hand.  If we had a tournament file with the names of the players, that kind
of correction would really be a spellcheck, not a second-guessing game for
too many names.

> What you suggest would be to build up a database of all
> tournaments and their participants.


Exactly, like Franz does for players' names.  It should be consistent with
his work, because the main idea would be to complement it.


> Eh? Why is a collection of games not a real database? Do you
> mean in the IBM sense? (DB2 is a real database, Access not.)
>
> BTW: I'd surely collect those metadata in a real database.


Yes, my point would be to let Scid handle this metadata for its maintenance
spellchecks.

> Well, working within a central database would be in fact the
> only way to go. But you suggest to build up an authority
> file (or registry if you prefer that naming) for all chess
> games of the world. Even if you just start out with
> important tournaments of the past and work through them,
> this is a huge effort.


Exactly.  In fact, once you have this tool at hand, working on a central
database does not seem that crazy anymore...


>> But I don't see any solution to the lack of information,
>> except trying to build a tournament file that would be
>> orthogonal to a file of chessplayers' names.
>>
>
> Actually, the real solution would be something like VIAF.


That would be the final step.  For now, a text file cross-indexing
tournaments and players would simplify spellchecking a lot.
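
To make that idea concrete, here is a minimal sketch in Python of what such a
cross-index could look like.  The file format and the function names are
purely my own invention for illustration; this is not an existing Scid or
spelling-file format.

# Hypothetical cross-index format: one "Event|Site|Year" line per
# tournament, followed by indented canonical participant names, e.g.
#
#   Candidates Tournament|Zuerich|1953
#       Szabo, Laszlo
#       Smyslov, Vassily
#
# With such a roster at hand, fixing a game header becomes a plain
# lookup instead of guessing which "Szabo" is meant.

def load_rosters(path):
    """Parse the cross-index file into {(event, site, year): set of names}."""
    rosters = {}
    current = None
    with open(path, encoding="utf-8") as f:
        for line in f:
            if not line.strip() or line.startswith("#"):
                continue                              # skip blanks and comments
            if line.startswith((" ", "\t")):          # indented participant line
                rosters[current].add(line.strip())
            else:                                     # tournament header line
                event, site, year = line.strip().split("|")
                current = (event, site, year)
                rosters[current] = set()
    return rosters

def check_name(rosters, event, site, year, name):
    """True if the name is a known participant of that tournament."""
    return name in rosters.get((event, site, year), set())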


> There you build up an authority file based upon the
> publications of a person. For your project you'd build up
> the authority file containing tournaments and persons based
> upon the games of chess they played. It's very similar (just
> to avoid the word "identical") ;)
>
> Still, you miss a point here: you'll have to individualize
> the players and the games. You'd have to end up at an
> authority record for each tournament connected to the
> authority records of each participating player and all that
> would be connected to the actual games being the backend of
> the whole thing.


I don't miss that point.  It's just that I like bottom-up approaches.  I
liked correcting names by hand, when I was young.  But 5 million games ?
This is nonsense !
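
Just to show that I do see it, here is roughly how I picture that linkage,
as a throwaway SQLite sketch in Python.  All table and column names are
invented for illustration; this is not Scid's internal format, just the
shape of the thing: tournament records, player authority records, a link
table for participation, and the games as the backend.

import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE player        (player_id      INTEGER PRIMARY KEY,
                            canonical_name TEXT,
                            viaf_id        TEXT);     -- optional VIAF linkage
CREATE TABLE tournament    (tournament_id  INTEGER PRIMARY KEY,
                            event TEXT, site TEXT, year INTEGER);
CREATE TABLE participation (tournament_id  INTEGER REFERENCES tournament,
                            player_id      INTEGER REFERENCES player);
CREATE TABLE game          (game_id        INTEGER PRIMARY KEY,
                            tournament_id  INTEGER REFERENCES tournament,
                            white_id       INTEGER REFERENCES player,
                            black_id       INTEGER REFERENCES player,
                            result         TEXT);
""")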


> It is a good idea, but it is a huge effort. Have a look at
> VIAF's record for Garry Kasparov e.g.
>
>   http://viaf.org/24621810
>
> You might want to unfold the MARC-21 section, check out the
> 400's and 700's, just to get an idea about the spellings for
> his name. And just to get a vague idea about the size of the
> project you might just call up http://viaf.org to see who is
> participating in this game.  And it's only about the
> names part... (Plus using databases of books we already have
> in huge databases built by autopsy using trained personnel.
> Creating such a catalogue record for one book from scratch
> takes about 15min on the average.)


This is just too good to be true !


> Debate is over. I know that I'm right and I've quite some
> forces behind me. ;) LoC, BNB, DNB, BNF (just to name the
> really big ones) can't be wrong.


The debate about being right is not the important one: the important one is
how to implement the idea in Scid's maintenance window... ;-)


>
> Still I see the difficulty to get it done in a spare time
> project. I see a possibility to gradually work in linkages
> to VIAF.
>
> Franz: getting VIAF record IDs into the spelling file is the
> way to go to get linkage up into Wikipedia and whatever.
> VIAF-ID is persistent, of course.  (Try to search for
> 118721097 at de.wikipedia.org. :) At the moment they use
> PND-IDs as german wikipedia mainly set this up with DNB but
> that's about to change and they go for VIAF-IDs (there's a
> 1:1 mapping PND <-> VIAF) plus if I got them right this
> should start to spread out over the whole wikipedia.
> Wikipedia wants to reuse as much data from authority files
> as possible. Guess why ;) The national libraries are
> currently investigating if they could use data from
> Wikipedia to enrich their authority files. There was just
> recently a talk about this issue at the Bibliothekartag in
> Erfurt.


Great stuff, once again !
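
If Franz does put VIAF record IDs into the spelling file, this is roughly
what I imagine on the Scid side, as a small Python sketch.  The dictionary
and the implied spelling-file handling are purely illustrative assumptions;
only the VIAF ID and the URL pattern come from your Kasparov example above.

# Map canonical player names to VIAF record IDs (illustrative only;
# in practice the IDs would be read from the spelling file itself).
VIAF_IDS = {
    "Kasparov, Garry": "24621810",   # ID taken from the example above
}

def viaf_url(name):
    """Return the persistent VIAF URL for a known player, or None."""
    viaf_id = VIAF_IDS.get(name)
    return "http://viaf.org/" + viaf_id if viaf_id else None

print(viaf_url("Kasparov, Garry"))   # -> http://viaf.org/24621810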

Keep the good ideas coming !

Bye,

B