Looking into one particular example,
https://github.com/seabbs/idmodelr/blob/master/DESCRIPTION
this appears to be the authors' fault:
Authors@R: c(
    person(given = "Sam Abbott",
           role = c("aut", "cre"),
           email = "cont...@samabbott.co.uk",
           comment = c(ORCID = "0000-0001-8057-8037")),
    person(given = "Akira Endo",
           role = c("aut"),
           email = "akira.e...@lshtm.ac.uk",
           comment = c(ORCID = "0000-0001-6377-7296")))
Maybe CRAN should start checking for missing 'family' fields in
Authors@R ... ???
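
A check along those lines might look like the sketch below. This is only an
illustration, not what CRAN runs: has_no_family() is a hypothetical helper,
and the two person() entries are illustrative stand-ins (the second one is
split into given/family purely for contrast).

```r
# Hypothetical helper: TRUE for each person entry that lacks a 'family' name.
has_no_family <- function(p) {
    stopifnot(inherits(p, "person"))
    # A person object is a list of per-person lists; inspect each one.
    vapply(unclass(p),
           function(e) is.null(e$family) || !any(nzchar(e$family)),
           logical(1))
}

# Illustrative entries: the first puts the full name in 'given'.
authors <- c(
    person(given = "Sam Abbott", role = c("aut", "cre")),
    person(given = "Akira", family = "Endo", role = "aut")
)

flagged <- has_no_family(authors)
print(format(authors[flagged]))
```

Only the first entry should be flagged, since its full name sits in 'given'.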
cheers
Ben Bolker
On 2024-08-20 9:47 a.m., Kurt Hornik wrote:
Kurt Hornik writes:
The attached variant drops the URL and does unique().
Hmm, the ones in
head(with(a, sort_by(a, ~ family + given)), 100)
without a family look suspicious ...
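
To pull those rows out directly, something like the following sketch could be
used. It assumes 'a' has the character columns given, family and oid, and
that a missing family ends up as "" (which is what paste(NULL, collapse = " ")
returns). The small data frame is an illustrative stand-in for the real
result; the third row uses ORCID's documented sample record.

```r
# Illustrative stand-in for the data frame 'a' built from the CRAN package db.
a <- data.frame(
    given  = c("Sam Abbott", "Akira Endo", "Josiah"),
    family = c("", "", "Carberry"),
    oid    = c("0000-0001-8057-8037",
               "0000-0001-6377-7296",
               "0000-0002-1825-0097"),
    stringsAsFactors = FALSE
)

# Rows whose 'family' is empty are the suspicious ones.
suspicious <- a[!nzchar(a$family), , drop = FALSE]
print(suspicious)
```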
Best
-k
Dirk Eddelbuettel writes:
On 20 August 2024 at 07:57, Dirk Eddelbuettel wrote:
|
| Hi Kurt,
|
| On 20 August 2024 at 14:29, Kurt Hornik wrote:
| | I think for now you could use something like what I attach below.
| |
| | Not ideal: I had not too long ago started adding orcidtools.R to tools,
| | which e.g. has .persons_from_metadata(), but that works on the unpacked
| | sources and not the CRAN package db. Need to think about that ...
|
| We need something like that too as I fat-fingered the string 'ORCID'. See
| fortunes::fortune("Dirk can type").
|
| Will try the function below later. Many thanks for sending it along.
Very nice. Resisted my common impulse to make it a data.table for easy
sorting via keys etc. After running your code the line
head(with(a, sort_by(a, ~ family + given)), 100)
shows that we need a bit more QA: person entries are not properly split
between 'family' and 'given', some ORCID entries use the URL form, and we
have repeats. Excluding those is next.
Right. One should canonicalize the ORCID (having the URLs is from being
nice) and then do unique() ...
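
A minimal sketch of that step, assuming 'a' has the columns given, family and
oid as character vectors. canonical_orcid() is a hypothetical helper written
for this illustration, not a function from tools or utils, and the rows are
illustrative.

```r
# Hypothetical helper: strip an optional orcid.org URL prefix and return the
# bare iD, or NA if what remains does not look like an ORCID iD.
canonical_orcid <- function(x) {
    x <- trimws(as.character(x))
    x <- sub("^(https?://)?(www\\.)?orcid\\.org/", "", x)
    ok <- grepl("^[0-9]{4}-[0-9]{4}-[0-9]{4}-[0-9]{3}[0-9X]$", x)
    x[!ok] <- NA_character_
    x
}

# Illustrative rows: the same person once with a bare iD, once with the URL.
a <- data.frame(
    given  = c("Josiah", "Josiah", "Sam Abbott"),
    family = c("Carberry", "Carberry", ""),
    oid    = c("0000-0002-1825-0097",
               "https://orcid.org/0000-0002-1825-0097",
               "0000-0001-8057-8037"),
    stringsAsFactors = FALSE
)

a$oid <- canonical_orcid(a$oid)
a <- unique(a)   # the two Carberry rows collapse into one
print(a)
```

This only checks the shape of the iD, not its checksum digit.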
Best
-k
Dirk
| Dirk
|
| |
| | Best
| | -k
| |
| | ********************************************************************
| | x <- tools::CRAN_package_db()
| | a <- lapply(x[["Authors@R"]],
| |             function(a) {
| |                 if(!is.na(a)) {
| |                     a <- tryCatch(utils:::.read_authors_at_R_field(a),
| |                                   error = identity)
| |                     if (inherits(a, "person"))
| |                         return(a)
| |                 }
| |                 NULL
| |             })
| | a <- do.call(c, a)
| | a <- lapply(a,
| |             function(e) {
| |                 if(is.null(o <- e$comment["ORCID"]) || is.na(o))
| |                     return(NULL)
| |                 cbind(given = paste(e$given, collapse = " "),
| |                       family = paste(e$family, collapse = " "),
| |                       oid = unname(o))
| |             })
| | a <- as.data.frame(do.call(rbind, a))
| | ********************************************************************
| |
| | > Salut Thierry,
| |
| | > On 20 August 2024 at 13:43, Thierry Onkelinx wrote:
| | > | Happy to help. I'm working on a new version of the checklist package.
| | > | I could export the function if that makes it easier for you.
| |
| | > Would be happy to help / iterate. Can you take a stab at making the
| | > per-column split more robust so that we can bulk-process all non-NA
| | > entries of the returned db?
| |
| | > Best, Dirk
| |
| | > --
| | > dirk.eddelbuettel.com | @eddelbuettel | e...@debian.org
|
______________________________________________
R-package-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-package-devel
--
Dr. Benjamin Bolker
Professor, Mathematics & Statistics and Biology, McMaster University
Director, School of Computational Science and Engineering
> E-mail is sent at my convenience; I don't expect replies outside of working hours.