Andreas,

Thank you, I’ll experiment with that.

David

> On Mar 31, 2023, at 7:49 AM, Fischlin Andreas <[email protected]> wrote:
> 
> If the cite keys remain constant, i.e. serve as the primary key, then you can
> clean up your new database by using the following AppleScript. Note that it
> overwrites any existing older record with the imported one. If this is not
> your case, you must not use this AppleScript.
> 
> (*
> 	Name		Cleanup Duplicates
> 
> 	Purpose		When updated records are imported into a BibDesk file,
> 			any previously imported copies of those records become
> 			obsolete, since they are outdated. Such duplicates are
> 			deleted while retaining static group memberships.
> 
> 	Remark		This script also properly restores static group
> 			memberships even in cases where duplicates are present
> 			in multiple numbers, i.e. not only simple pairs (see
> 			variable 'youngestPub').
> 
> 	Installation	Copy this script to the folder
> 			'~/Library/Application Support/BibDesk/Scripts/'.
> 
> 	Usage		Run this script by choosing the corresponding menu
> 			command from within BibDesk's Script menu.
> 
> 	Remark		Alternatively, you can also run this script from
> 			anywhere on your system without any installation.
> 
> 	Programmer	Andreas Fischlin, [email protected],
> 			http://www.sysecol.ethz.ch/staff/af/, building on the
> 			script 'Select Obsolete Duplicates' written by
> 			Christiaan Hofmann, as of 1.Sep.2009, and the previous,
> 			but comparably very slow, script 'Cleanup Duplicates'
> 			written by af.
> 
> 	History
> 	af 01.Sep.2009	v 1.0: First implementation (works with BibDesk
> 			version 1.5.2 (1879) under Snow Leopard OS X 10.6.4).
> 			In contrast to its predecessor, this script does no
> 			logging, to be as efficient as possible. This algorithm
> 			is considerably more efficient than its much more
> 			complex predecessor variant.
> 	af 08.Oct.2011	v 1.0.1: Introduced the core algorithm as a separate
> 			routine to enable copy/paste maintenance (identical
> 			routines used in the scripts 'Cleanup Duplicates.scpt'
> 			and 'Fix PDF and URL Links.scpt').
> 	af 15.Sep.2019	v 1.1: Enhancement: For a large library the algorithm
> 			becomes very time consuming. I therefore added a case
> 			working only within the set of currently selected pubs.
> 			This also works when called after an import, making the
> 			importing much more efficient.
> 	af 8.Apr.2022	v 1.2: Enhancement: Added the routine
> 			alertUserOfFailedDeletion to notify the user when the
> 			deletion of records fails.
> *)
> 
> on run {}
> 	CleanupDuplicates()
> end run
> 
> -- IMPORTANT NOTE: The following routine is an identical copy of the one
> -- contained in the files 'Cleanup Duplicates.scpt' and 'Fix PDF and URL
> -- Links.scpt'. Make sure the two copies are always kept identical.
> on CleanupDuplicates()
> 	set theBibDeskDocu to document 1 of application "BibDesk"
> 	tell document 1 of application "BibDesk"
> 		set thePubs to selection
> 		if (count of thePubs) = 0 then
> 			-- Get and sort all publications by cite key, ensuring that in
> 			-- any set of publications with the same cite key the youngest
> 			-- comes first and the oldest, typically the only one of the set
> 			-- that is still a member of any static groups, comes last. To
> 			-- retain static group memberships we have to ensure that such
> 			-- "membership info" is copied from the last to the first
> 			-- publication of any set of publications with the same cite key
> 			-- (see vars 'aPub', 'prevPub', 'youngestPub').
> 			-- With a large library this functionality becomes quite tedious,
> 			-- and with BibDesk 1.7.1 under OS X 10.14.x (Mojave) it seems
> 			-- that duplicates are no longer found with the menu command
> 			-- 'Database -> Select Duplicates -> Only Duplicates'. Choosing
> 			-- this menu command results only in a beep, despite the fact
> 			-- that duplicates (by cite key) are present and correctly shown
> 			-- in red.
> 			set thePubs to (sort (get publications) by "Cite Key" subsort by "Date-Added" without ascending)
> 		else
> 			-- This part of the algorithm searches for duplicates only among
> 			-- the currently selected publications, regardless of whether
> 			-- additional other duplicates are present in the database. For
> 			-- every cite key within the current selection it finds any
> 			-- matching publications in the database. Only the thus
> 			-- constructed set of publications is then fixed for duplicates,
> 			-- keeping only the youngest record. This speeds things up
> 			-- considerably for large libraries, in particular when the set
> 			-- of imported potential duplicates is relatively small compared
> 			-- to the entire library.
> 			set theCitekeys to {}
> 			repeat with aPub in thePubs
> 				set aCiteKey to cite key of aPub
> 				set end of theCitekeys to aCiteKey
> 			end repeat
> 			set n to count of theCitekeys
> 			set thePubs to {}
> 			repeat with aCiteKey in theCitekeys
> 				set foundPubs to (get publications whose cite key is aCiteKey)
> 				-- display dialog aCiteKey & ": m = " & (count of foundPubs)
> 				repeat with aPub in foundPubs
> 					set end of thePubs to aPub
> 				end repeat
> 			end repeat
> 			set m to count of thePubs
> 			-- display dialog m
> 			if m = n then return
> 			-- Sort all publications in the set thePubs by cite key, ensuring
> 			-- that in any set of publications with the same cite key the
> 			-- youngest comes first and the oldest, typically the only one of
> 			-- the set that is still a member of any static groups, comes
> 			-- last. To retain static group memberships we have to ensure
> 			-- that such "membership info" is copied from the last to the
> 			-- first publication of any set of publications with the same
> 			-- cite key (see vars 'aPub', 'prevPub', 'youngestPub').
> 			set thePubs to (sort thePubs by "Cite Key" subsort by "Date-Added" without ascending)
> 		end if
> 		set theDupes to {}
> 		set prevCiteKey to missing value
> 		set prevPub to missing value
> 		set youngestPub to missing value
> 		repeat with aPub in thePubs
> 			set aCiteKey to cite key of aPub
> 			ignoring case
> 				if aCiteKey is prevCiteKey then
> 					set end of theDupes to aPub
> 					-- We fix the static group membership redundantly in
> 					-- cases where aPub is also merely an obsolete duplicate,
> 					-- since we have possibly not yet advanced to the end of
> 					-- the set with the same cite key. But this is
> 					-- unavoidable with this algorithm, which simply loops
> 					-- through all publications. The end result will be that
> 					-- youngestPub (first in a set of publications with the
> 					-- same cite key) will be a member of all static groups
> 					-- of the publications in the set (unification). The
> 					-- latter should be no big issue, since typically in
> 					-- multiple sets of publications it is only the last
> 					-- publication that matters. If this should be an issue,
> 					-- we would need to first delete all static group
> 					-- membership info in 'youngestPub' whenever we encounter
> 					-- a 3rd, 4th, etc. instance of the same cite key in
> 					-- 'aPub', and copy only those of 'aPub'. However, for
> 					-- the sake of efficiency I do not wish to support this
> 					-- behavior.
> 					my fixGroupMembership(theBibDeskDocu, aCiteKey, aPub, youngestPub)
> 				else
> 					-- Remember in 'youngestPub' a possible candidate for a
> 					-- new set of publications with the same cite key.
> 					set youngestPub to aPub
> 				end if
> 			end ignoring
> 			set prevCiteKey to aCiteKey
> 			set prevPub to aPub
> 		end repeat
> 		repeat with aPub in theDupes
> 			try
> 				set theCiteKey to cite key of aPub
> 				delete aPub
> 			on error errText number errNum
> 				my alertUserOfFailedDeletion(errText, errNum, theCiteKey)
> 			end try
> 		end repeat
> 	end tell
> end CleanupDuplicates
> 
> 
> on alertUserOfFailedDeletion(errText, errNum, theCiteKey)
> 	display dialog "Unexpected error " & errNum & " encountered during deletion of record with cite key " & theCiteKey & ": " & errText ¬
> 		buttons {"OK"} default button {"OK"} with title "Unexpected error encountered" with icon caution giving up after 3
> end alertUserOfFailedDeletion
> 
> 
> on fixGroupMembership(theBibDeskDocu, theCiteKey, oldPub, newPub)
> 	tell application "BibDesk"
> 		tell theBibDeskDocu
> 			set thePubsGroups to (get static groups whose publications contains oldPub)
> 			if (count of thePubsGroups) is greater than 0 then
> 				repeat with aGroup in thePubsGroups
> 					add newPub to aGroup
> 				end repeat
> 			end if
> 		end tell
> 	end tell
> end fixGroupMembership
> 
> 
> Andreas
> 
> 
> ETH Zurich
> Prof. em. Dr. Andreas Fischlin
> IPCC Vice-Chair WGII
> Systems Ecology - Institute of Biogeochemistry and Pollutant Dynamics
> CHN E 24
> Universitaetstrasse 16
> 8092 Zurich
> SWITZERLAND
> 
> [email protected]
> www.sysecol.ethz.ch/people/andreas.fischlin.html
> 
> +41 44 633-6090 phone
> +41 44 633-1136 fax
> +41 79 595-4050 mobile
> 
> Make it as simple as possible, but distrust it!
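For readers who want the gist of the script's algorithm without AppleScript, here is a minimal sketch in Python. It is an illustration only, not BibDesk's scripting API: the record fields `cite_key`, `date_added` (larger means younger), and `groups` are hypothetical stand-ins. The strategy is the same as above: sort so the youngest record of each cite key comes first, unify the static-group memberships of the older copies into it, and delete the older copies.

```python
# Hedged illustration only: NOT BibDesk's scripting API. Records are
# modeled as plain dicts with hypothetical fields 'cite_key',
# 'date_added' (larger = younger), and 'groups' (set of group names).

def cleanup_duplicates(pubs):
    """Return (kept, deleted): older records sharing a cite key are
    removed after merging their group memberships into the youngest."""
    # Mirror the script's sort by "Cite Key" subsort by "Date-Added"
    # without ascending: youngest record of each cite key comes first.
    # Cite keys compare case-insensitively, like 'ignoring case'.
    ordered = sorted(pubs,
                     key=lambda p: (p["cite_key"].lower(), p["date_added"]),
                     reverse=True)
    kept, deleted = [], []
    prev_key = None
    youngest = None
    for pub in ordered:
        key = pub["cite_key"].lower()
        if key == prev_key:
            # Obsolete duplicate: unify its group memberships into the
            # youngest record, then schedule it for deletion.
            youngest["groups"] |= pub["groups"]
            deleted.append(pub)
        else:
            # First (youngest) record of a new cite-key set.
            youngest = pub
            kept.append(pub)
        prev_key = key
    return kept, deleted
```

Note that, as in the AppleScript, group memberships are unified rather than taken only from the oldest copy when three or more records share a cite key.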
> ________________________________________________________________________
> 
>> On Fri, 31.03.23, at 05:01, David Craig <[email protected]> wrote:
>> 
>> What’s the recommended way to merge two bibliographies?
>> 
>> I know I can drag all entries from one bibliography into another, but in
>> this case BibDesk is happy to copy over duplicate entries, which then have
>> to be removed by hand. Is there a better way to do this? (I’ve searched
>> the manual and wiki, but haven’t found what I’m looking for.)
>> 
>> Thanks,
>> David Craig
>> 
>> <http://www.panix.com/~dac/>

David Craig <http://www.panix.com/~dac/>
_______________________________________________
Bibdesk-users mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/bibdesk-users
