Re: [Bibdesk-users] Merging Bibliographies?

Fischlin Andreas Fri, 31 Mar 2023 08:05:13 -0700

If the cite keys remain constant, i.e. serve as the primary key, then you can 
clean up your new database with the following AppleScript. Note that it 
overwrites any existing older record with the imported one. If that is not what 
you want, you must not use this AppleScript.


(* 
       Name     Cleanup Duplicates
   
       Purpose  When updated records are imported into a BibDesk file, the 
previously imported records with the same cite key become obsolete. Such 
outdated duplicates are deleted while retaining static group memberships.
           
        Remark  This script does also properly restore static group memberships 
even in cases where duplicates are present in multiple numbers, i.e. not only 
simple pairs (see variable 'youngestPub')
                   
        Installation    
                        Copy this script to folder '~/Library/Application 
Support/BibDesk/Scripts/'
        
        Usage           Run this script by choosing corresponding menu command 
from within BibDesk's Script menu.
                
        Remark  Alternatively you can also run this script from anywhere on 
your system without any installation.                                           
    
        
        Programmer      Andreas Fischlin, [email protected], 
http://www.sysecol.ethz.ch/staff/af/ 
                                building on script 'Select Obsolete Duplicates' 
written by Christiaan Hofmann, as of 1.Sep.2009 and the previous, but 
comparably very slow script 'Cleanup Duplicates' written by af
         
        History         
        
          af            01.Sep.2009 v 1.0: First implementation (works with 
BibDesk Version 1.5.2 (1879) under Snow Leopard OS X 10.6.4). In contrast to 
its predecessor, this script does no logging to be as efficient as possible. 
This algorithm is considerably more efficient than its much more complex 
predecessor variant.
          af            08.Oct.2011 v 1.0.1: Introducing core algorithm as a 
separate routine to enable copy/paste maintenance (identical routines used in 
scripts 'Cleanup Duplicates.scpt' and 'Fix PDF and URL Links.scpt'). 
          af            15.Sep.2019 v 1.1: Enhancement: For a large library the 
algorithm becomes very time consuming. I added therefore a case working only 
within the set of currently selected pubs. This works also in the case when 
called after an import, making the importing much more efficient.
          af            8.Apr.2022 v 1.2: Enhancement: Adding routine 
alertUserOfFailedDeletion to notify user when deletion of records failed
          
*)


on run {}
        CleanupDuplicates()
end run


-- IMPORTANT NOTE: The following routine is an identical copy of the one
-- contained in files 'Cleanup Duplicates.scpt' and 'Fix PDF and URL Links.scpt'.
-- Make sure the two copies are always kept identical.
on CleanupDuplicates()
        set theBibDeskDocu to document 1 of application "BibDesk"
        tell document 1 of application "BibDesk"
                set thePubs to selection
                if (count of thePubs) = 0 then
                        -- Get and sort all publications by cite key, ensuring
                        -- that in any set of publications with the same cite
                        -- key the youngest comes first and the oldest, typically
                        -- the only one of the set that is still a member of any
                        -- static groups, comes last. To retain static group
                        -- memberships we have to ensure that such "membership
                        -- info" is copied from the last to the first publication
                        -- of any set of publications with the same cite key
                        -- (see vars 'aPub', 'prevPub', 'youngestPub').
                        -- With a large library this functionality becomes quite
                        -- tedious, and with BibDesk 1.7.1 under OS X 10.14.x
                        -- (Mojave) it seems that duplicates are no longer found
                        -- with the menu command 'Database -> Select Duplicates
                        -- -> Only Duplicates'. Choosing this menu command
                        -- results only in a beep, despite the fact that
                        -- duplicates (by cite key) are present and correctly
                        -- shown in red.
                        set thePubs to (sort (get publications) by "Cite Key" ¬
                                subsort by "Date-Added" without ascending)
                else
                        -- This part of the algorithm searches for duplicates
                        -- of the currently selected publications only,
                        -- regardless of whether other duplicates are present in
                        -- the database. For every cite key within the current
                        -- selection it finds all matching publications in the
                        -- database. Only the set of publications constructed
                        -- this way is then fixed for duplicates by keeping the
                        -- younger record only. This speeds things up
                        -- considerably for large libraries, in particular when
                        -- the imported potential duplicates are a relatively
                        -- small set compared to the entire library.
                        set theCitekeys to {}
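                        repeat with aPub in thePubs
                                set aCiteKey to cite key of aPub
                                -- Collect each cite key only once, so that a
                                -- selection containing several records with the
                                -- same cite key does not add the same
                                -- publication to thePubs more than once below.
                                if theCitekeys does not contain aCiteKey then
                                        set end of theCitekeys to aCiteKey
                                end if
                        end repeat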
                        set n to count of theCitekeys
                        set thePubs to {}
                        repeat with aCiteKey in theCitekeys
                                set foundPubs to (get publications whose cite key is aCiteKey)
                                -- display dialog aCiteKey & ":  m = " & (count of foundPubs)
                                repeat with aPub in foundPubs
                                        set end of thePubs to aPub
                                end repeat
                        end repeat
                        set m to count of thePubs
                        -- display dialog m
                        -- If every cite key matched exactly one publication,
                        -- there are no duplicates and nothing is left to do.
                        if m = n then return
                        -- Sort all publications in set thePubs by cite key,
                        -- ensuring that in any set of publications with the
                        -- same cite key the youngest comes first and the
                        -- oldest, typically the only one of the set that is
                        -- still a member of any static groups, comes last. To
                        -- retain static group memberships we have to ensure
                        -- that such "membership info" is copied from the last
                        -- to the first publication of any set of publications
                        -- with the same cite key (see vars 'aPub', 'prevPub',
                        -- 'youngestPub').
                        set thePubs to (sort thePubs by "Cite Key" ¬
                                subsort by "Date-Added" without ascending)
                end if
                set theDupes to {}
                set prevCiteKey to missing value
                set prevPub to missing value
                set youngestPub to missing value
                repeat with aPub in thePubs
                        set aCiteKey to cite key of aPub
                        ignoring case
                                if aCiteKey is prevCiteKey then
                                        set end of theDupes to aPub
                                        -- We fix the static group membership
                                        -- redundantly in cases where aPub is
                                        -- also merely an obsolete duplicate,
                                        -- since we have possibly not yet
                                        -- advanced to the end of the set with
                                        -- the same cite key. But this is
                                        -- unavoidable with this algorithm
                                        -- looping simply through all
                                        -- publications. The end result will be
                                        -- that youngestPub (first in the set of
                                        -- publications with the same cite key)
                                        -- will be a member of all static groups
                                        -- of the publications in the set
                                        -- (unification). The latter should be
                                        -- no big issue, since typically in
                                        -- multiple sets of publications it is
                                        -- only the last publication that
                                        -- matters. If this should be an issue,
                                        -- then we would need to first delete
                                        -- all static group membership info in
                                        -- 'youngestPub' in case we encounter a
                                        -- 3rd, 4th etc. same cite key in
                                        -- 'aPub', and copy only those of
                                        -- 'aPub'. However, for the sake of
                                        -- efficiency I wish not to support
                                        -- this behavior.
                                        my fixGroupMembership(theBibDeskDocu, aCiteKey, aPub, youngestPub)
                                else
                                        -- Remember in 'youngestPub' the possible
                                        -- candidate for a new set of
                                        -- publications with the same cite key.
                                        set youngestPub to aPub
                                end if
                        end ignoring
                        set prevCiteKey to aCiteKey
                        set prevPub to aPub
                end repeat
                repeat with aPub in theDupes
                        try
                                set theCiteKey to cite key of aPub
                                delete aPub
                        on error errText number errNum
                                my alertUserOfFailedDeletion(errText, errNum, theCiteKey)
                        end try
                end repeat
        end tell
end CleanupDuplicates


on alertUserOfFailedDeletion(errText, errNum, theCiteKey)
        display dialog "Unexpected error " & errNum & " encountered during deletion of record with cite key " & theCiteKey & ": " & errText ¬
                buttons {"OK"} default button "OK" with title "Unexpected error encountered" with icon caution giving up after 3
end alertUserOfFailedDeletion


on fixGroupMembership(theBibDeskDocu, theCiteKey, oldPub, newPub)
        tell application "BibDesk"
                tell theBibDeskDocu
                        set thePubsGroups to (get static groups whose publications contains oldPub)
                        if (count of thePubsGroups) is greater than 0 then
                                repeat with aGroup in thePubsGroups
                                        add newPub to aGroup
                                end repeat
                        end if
                end tell
        end tell
end fixGroupMembership
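

If you want to see what the core loop does before letting it loose on a real 
library, the following is a minimal dry-run sketch on plain AppleScript records, 
without BibDesk. The record labels citeKey, addedOn and memberOf are hypothetical 
stand-ins I made up for this illustration, not BibDesk properties, and the list 
is assumed to be already sorted by cite key with the youngest record first, 
which is what the sort command does in the real script.

```applescript
-- Dry run on made-up records (hypothetical labels, no BibDesk involved).
-- Assumption: the list is already sorted by cite key, youngest first.
set thePubs to {¬
	{citeKey:"Smith2020", addedOn:3, memberOf:{}}, ¬
	{citeKey:"Smith2020", addedOn:2, memberOf:{"Review"}}, ¬
	{citeKey:"smith2020", addedOn:1, memberOf:{"Thesis"}}, ¬
	{citeKey:"Wong2019", addedOn:1, memberOf:{"Thesis"}}}

set keptPubs to {} -- youngest record of every cite key
set droppedCount to 0 -- number of older duplicates that would be deleted
set prevCiteKey to missing value

repeat with aPubRef in thePubs
	set aPub to contents of aPubRef
	set aCiteKey to citeKey of aPub
	-- same case-insensitive comparison as in the real script
	ignoring case
		set isDuplicate to (aCiteKey is prevCiteKey)
	end ignoring
	if isDuplicate then
		-- older record: pass its memberships on to the youngest, then drop it
		set youngestPub to last item of keptPubs
		repeat with aGroupRef in memberOf of aPub
			set aGroupName to contents of aGroupRef
			if memberOf of youngestPub does not contain aGroupName then
				set end of memberOf of youngestPub to aGroupName
			end if
		end repeat
		set droppedCount to droppedCount + 1
	else
		-- first record of a new cite key, i.e. the youngest of its set
		set end of keptPubs to aPub
	end if
	set prevCiteKey to aCiteKey
end repeat

return {keptPubs, droppedCount}
```

Run it in Script Editor and inspect the result pane. The intent is that only the 
youngest record of each cite key is kept and that it ends up as a member of 
every group any of its older duplicates belonged to, which is the unification 
described in the comments of the real script.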


Andreas


ETH Zurich
Prof. em. Dr. Andreas Fischlin
IPCC Vice-Chair WGII
Systems Ecology - Institute of Biogeochemistry and Pollutant Dynamics
CHN E 24
Universitaetstrasse 16
8092 Zurich
SWITZERLAND

[email protected]
www.sysecol.ethz.ch/people/andreas.fischlin.html 
<http://www.sysecol.ethz.ch/people/andreas.fischlin.hml>

+41 44 633-6090 phone
+41 44 633-1136 fax
+41 79 595-4050 mobile

             Make it as simple as possible, but distrust it!
________________________________________________________________________









> On Fri, 31.03.23, at 05:01, David Craig <[email protected]> wrote:
> 
> What’s the recommended way to merge two bibliographies?
> 
> I know I can drag all entries from one bibliography into another, but in this 
> case BibDesk is happy to copy over duplicate entries, which then have to be 
> removed by hand.   Is there a better way to do this?  (I’ve searched the 
> manual and wiki, but haven’t found what I’m looking for.)
> 
> Thanks,
> David Craig
> 
> 
> <http://www.panix.com/~dac/>
> 
> _______________________________________________
> Bibdesk-users mailing list
> [email protected]
> https://lists.sourceforge.net/lists/listinfo/bibdesk-users


_______________________________________________
Bibdesk-users mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/bibdesk-users
