[Rd] Making a package CITATION file from BibTeX
Dear Colleagues,

I would like to provide a CITATION file for my package nat.nblast [1]. I have the correct citation in BibTeX format [2]. How can I convert this BibTeX to the format needed by R for a package CITATION file (I have a lot of other packages needing citations ...).

I think what I need is the opposite of RefManageR::toBiblatex [3]. This seems like it should be a common need, so I feel sure I must be missing something, but I can't seem to google up any hints.

With many thanks,
Greg Jefferis.

[1] http://github.com/jefferislab/nat.nblast
    https://cran.r-project.org/package=nat.nblast

[2] @article{Costa:2016aa,
      Author = {Costa, Marta and Manton, James D and Ostrovsky, Aaron D and Prohaska, Steffen and Jefferis, Gregory S X E},
      Doi = {10.1016/j.neuron.2016.06.012},
      Journal = {Neuron},
      Month = {Jul},
      Number = {2},
      Pages = {293-311},
      Title = {NBLAST: Rapid, Sensitive Comparison of Neuronal Structure and Construction of Neuron Family Databases},
      Volume = {91},
      Year = {2016}}

[3] https://rdrr.io/github/ropensci/RefManageR/man/toBiblatex.html

--
Gregory Jefferis
Division of Neurobiology
MRC Laboratory of Molecular Biology
Francis Crick Avenue
Cambridge Biomedical Campus
Cambridge, CB2 OQH, UK
http://www2.mrc-lmb.cam.ac.uk/group-leaders/h-to-m/g-jefferis
http://jefferislab.org
http://www.zoo.cam.ac.uk/departments/connectomics

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
Re: [Rd] Making a package CITATION file from BibTeX
Dear Achim,

Thank you so much for taking the time to write out a perfect response to my question. I hope it will soon be available for others to google!

@Inaki: Thank you. You are quite right. I intended to write to r-package-devel. Achim's reply is still copied below for users of that list.

All the best,
Greg.

On 29 May 2019, at 23:46, Achim Zeileis wrote:

On Thu, 30 May 2019, Dr Gregory Jefferis wrote:

Dear Colleagues, I would like to provide a CITATION file for my package nat.nblast [1]. I have the correct citation in BibTeX format [2]. How can I convert this BibTeX to the format needed by R for a package CITATION file (I have a lot of other packages needing citations ...).

(1) You can use read.bib() from the "bibtex" package to read the .bib file containing the relevant reference.

(2) This gives you a "bibentry" object that can be turned into the R code generating it with format(..., style = "R") from the basic "utils" package.

(3) Then you can writeLines() this R code on the console or writeLines(..., "CITATION") to a CITATION file.

(4) Optionally you can also include a $header in your bibentry with a short introductory sentence. Or if you have multiple references to go into the same CITATION you might want to include a $header for each and an $mheader for everything.

A worked example is included below. Background information is given in: https://doi.org/10.32614/RJ-2012-009

Let's assume that your BibTeX entry [2] is the first entry in a file called "my.bib".
Then you can do:

    ## read first item from BibTeX as "bibentry" object
    b <- bibtex::read.bib("my.bib")[[1]]

    ## delete the bib key and add a header for the citation
    b$key <- NULL
    b$header <- "To cite nat.nblast in publications use:"

    ## turn the "bibentry" into the R code generating it
    b <- format(b, style = "R")

    ## write the R code to the console
    writeLines(b)

    bibentry(bibtype = "Article",
             header = "To cite nat.nblast in publications use:",
             author = c(person(given = "Marta", family = "Costa"),
                        person(given = c("James", "D"), family = "Manton"),
                        person(given = c("Aaron", "D"), family = "Ostrovsky"),
                        person(given = "Steffen", family = "Prohaska"),
                        person(given = c("Gregory", "S", "X", "E"), family = "Jefferis")),
             doi = "10.1016/j.neuron.2016.06.012",
             journal = "Neuron",
             month = "Jul",
             number = "2",
             pages = "293-311",
             title = "NBLAST: Rapid, Sensitive Comparison of Neuronal Structure and Construction of Neuron Family Databases",
             volume = "91",
             year = "2016")

I think what I need is the opposite of RefManageR::toBiblatex [3]. This seems like it should be a common need, so I feel sure I must be missing something, but I can't seem to google up any hints.

With many thanks,
Greg Jefferis.
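Steps (2) to (4) of the reply can be tried without a .bib file or the "bibtex" package. The following is only a sketch under assumptions: the bibentry is built by hand with utils::bibentry() as a stand-in for bibtex::read.bib("my.bib")[[1]], the output goes to a temporary file rather than the package's inst/CITATION, and the variable names (b, code, citation_file, b2) are made up for the illustration.

```r
## Stand-in for bibtex::read.bib("my.bib")[[1]]: build the bibentry by hand
## from the fields of reference [2].
b <- utils::bibentry(
  bibtype = "Article",
  author  = c(utils::person(given = "Marta", family = "Costa"),
              utils::person(given = c("James", "D"), family = "Manton"),
              utils::person(given = c("Aaron", "D"), family = "Ostrovsky"),
              utils::person(given = "Steffen", family = "Prohaska"),
              utils::person(given = c("Gregory", "S", "X", "E"), family = "Jefferis")),
  title   = "NBLAST: Rapid, Sensitive Comparison of Neuronal Structure and Construction of Neuron Family Databases",
  journal = "Neuron",
  volume  = "91",
  number  = "2",
  pages   = "293-311",
  month   = "Jul",
  year    = "2016",
  doi     = "10.1016/j.neuron.2016.06.012"
)

## Step (4): add an introductory header.
b$header <- "To cite nat.nblast in publications use:"

## Step (2): turn the bibentry into the R code that generates it.
code <- format(b, style = "R")

## Step (3): write that code to a file. A temporary file is used here;
## in a package source tree the target would be inst/CITATION.
citation_file <- tempfile("CITATION")
writeLines(code, citation_file)

## Sanity check: evaluating the generated code gives a bibentry again.
b2 <- eval(parse(citation_file))
print(b2, style = "text")
```

For several packages, the same three calls could be wrapped in a loop over the relevant .bib files.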
Re: [Rd] Comments requested on "changedFiles" function
Dear Duncan,

This certainly looks useful. Might you consider adding the ability to supply an alternative digest function? Details below.

I often use a homemade "make" type function which starts by looking at modification times e.g. in a private package https://github.com/jefferis/nat.utils/blob/master/R/make.r

For some of my work, I use hash functions. However because I typically work with many large files I often use a special digest process e.g. using the crc checksum embedded in a gzip file directly or hashing only the part of a large file that is (almost) certain to change.

Perhaps (code unchecked) along the lines of:

    changedFiles <- function(snapshot, timestamp = tempfile("timestamp"),
                             file.info = NULL, digest = FALSE, digestfun = NULL,
                             full.names = FALSE, ...)

    if (digest) {
        if (is.null(digestfun)) digestfun = tools::md5sum
        else digestfun = match.fun(digestfun)
        info <- data.frame(info, digest = digestfun(fullnames))
    }
    etc

OR alternatively using only one argument:

    changedFiles <- function(snapshot, timestamp = tempfile("timestamp"),
                             file.info = NULL, digest = FALSE,
                             full.names = FALSE, ...)

    if (is.logical(digest)) {
        if (digest) digestfun = tools::md5sum
    } else {
        # Assume that digest specifies a function that we want to use
        digestfun = match.fun(digest)
        digest = TRUE
    }
    if (digest)
        info <- data.frame(info, digest = digestfun(fullnames))
    etc

Many thanks,
Greg.

On 4 Sep 2013, at 18:53, Duncan Murdoch wrote:

In a number of places internal to R, we need to know which files have changed (e.g. after building a vignette). I've just written a general purpose function "changedFiles" that I'll probably commit to R-devel. Comments on the design (or bug reports) would be appreciated.

The source for the function and the Rd page for it are inline below.

changedFiles.R:

    changedFiles <- function(snapshot, timestamp = tempfile("timestamp"),
                             file.info = NULL, md5sum = FALSE,
                             full.names = FALSE, ...)
    {
        dosnapshot <- function(args) {
            fullnames <- do.call(list.files, c(full.names = TRUE, args))
            names <- do.call(list.files, c(full.names = full.names, args))
            if (isTRUE(file.info) || (is.character(file.info) && length(file.info))) {
                info <- file.info(fullnames)
                rownames(info) <- names
                if (isTRUE(file.info))
                    file.info <- c("size", "isdir", "mode", "mtime")
            } else
                info <- data.frame(row.names = names)
            if (md5sum)
                info <- data.frame(info, md5sum = tools::md5sum(fullnames))
            list(info = info, timestamp = timestamp, file.info = file.info,
                 md5sum = md5sum, full.names = full.names, args = args)
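The single-argument variant suggested above (a `digest` argument that is either a logical or a function) can be sketched in isolation. This is an illustration only, not the code that went into R: `file_digests` and `head_md5` are hypothetical names, and hashing just the first `n` bytes is an assumed stand-in for "hashing only the part of a large file that is (almost) certain to change".

```r
## Hypothetical helper: `digest` is FALSE (no hashing), TRUE (use
## tools::md5sum) or a function / function name applied to the file paths.
file_digests <- function(files, digest = FALSE) {
  if (is.logical(digest)) {
    if (!digest) return(NULL)
    digestfun <- tools::md5sum
  } else {
    ## Assume that digest specifies a function that we want to use
    digestfun <- match.fun(digest)
  }
  digestfun(files)
}

## Hypothetical custom digest: md5 of only the first `n` bytes of each file.
head_md5 <- function(files, n = 1024L) {
  vapply(files, function(f) {
    bytes <- readBin(f, what = "raw", n = n)
    tf <- tempfile()
    on.exit(unlink(tf))
    writeBin(bytes, tf)
    unname(tools::md5sum(tf))
  }, character(1))
}

## Usage on a small temporary file.
f <- tempfile(fileext = ".txt")
writeLines("some content", f)
full_hash <- file_digests(f, digest = TRUE)      # default md5sum
head_hash <- file_digests(f, digest = head_md5)  # custom function
no_hash   <- file_digests(f, digest = FALSE)     # hashing switched off
```

For a file shorter than `n` bytes the two hashes coincide, since the "head" is then the whole file.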
[Rd] inflate zlib compressed data using base R or CRAN package?
Hello,

I have a binary file type that includes a zlib compressed data block (i.e. not gzip). Is anyone aware of a way using base R or a CRAN package to decompress this kind of data (from disk or memory)? So far I have found Rcompression::decompress on omegahat, but I would prefer to keep dependencies on CRAN (or bioconductor). I am also trying to avoid writing yet another C level interface to part of zlib.

Many thanks for any pointers,
Greg.
Re: [Rd] inflate zlib compressed data using base R or CRAN package?
Dear Murray,

On 28 Nov 2013, at 1:30, Murray Stokely wrote:

I think none of these examples describe a zlib compressed data block inside a binary file that the OP asked about, as all of your examples are e.g. prepending gzip or zip headers. Greg, is memDecompress what you are looking for?

Yes, so long as one explicitly specifies type='gzip' - even though it isn't! I'm not sure I would have guessed it from the help. Thank you!

Best,
Greg.
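A small round trip illustrates the memDecompress(type = "gzip") answer. It is a sketch with assumptions: the 4-byte "file header" and the payload are invented, and memCompress(type = "gzip") is used to manufacture the compressed block (to my understanding it produces a zlib stream rather than a gzip file, but that is worth checking in ?memCompress for your R version).

```r
## Made-up payload and a compressed block produced in memory.
payload <- charToRaw(paste(rep("zlib-compressed block. ", 50), collapse = ""))
z <- memCompress(payload, type = "gzip")

## Pretend binary file contents: an invented 4-byte header, then the block.
header <- as.raw(c(0xDE, 0xAD, 0xBE, 0xEF))
blob <- c(header, z)

## Skip the header and inflate the remaining bytes.
block <- blob[-seq_along(header)]
restored <- memDecompress(block, type = "gzip")

identical(restored, payload)
```

With a real file, `blob` would come from readBin(con, what = "raw", n = ...) and the header length from the file format's specification.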
[Rd] possible bug: graphics::image seems to ignore getOption("preferRaster")
The details section of ?image says:

    If useRaster is not specified, raster images are used when the getOption("preferRaster") is true, the grid is regular and either dev.capabilities("raster") is "yes" or it is "non-missing" and there are no missing values.

but in my experience this is never the case and getOption("preferRaster") is ignored. As far as I can see, the logic for checking this in image is broken here:

    ras <- dev.capabilities("raster")
    if (identical(ras, "yes")) useRaster <- TRUE

because dev.capabilities("raster") returns a list like this (on my machine, R.version in footer):

    $rasterImage
    [1] "yes"

You can test this by doing:

    ras=structure(list(rasterImage = "yes"), .Names = "rasterImage")
    identical(ras,'yes') # returns FALSE

so the test would need to be something like:

    ras <- dev.capabilities("raster")[[1]]
    if (identical(ras, "yes")) useRaster <- TRUE

I can't find any relevant changes in R news: http://stat.ethz.ch/R-manual/R-devel/doc/html/NEWS.html

This discussion https://www.mail-archive.com/r-devel@r-project.org/msg22811.html suggests that Simon Urbanek may have added the useRaster option, and looking at git blame on this mirror repo:

https://github.com/wch/r-source/blame/c3ba5b0be36d3a1290e18fe189142c88f1e43236/src/library/graphics/R/image.R#L111-L120

suggests that Brian Ripley's svn commit 56949 was the last to touch these lines:

https://github.com/wch/r-source/commit/b9012424f895bf681daf1b85255942547d495bcd

Thanks for any pointers if I am missing something!

Best wishes,
Greg Jefferis.
R.version

                   _
    platform       x86_64-apple-darwin10.8.0
    arch           x86_64
    os             darwin10.8.0
    system         x86_64, darwin10.8.0
    status
    major          3
    minor          0.3
    year           2014
    month          03
    day            06
    svn rev        65126
    language       R
    version.string R version 3.0.3 (2014-03-06)
    nickname       Warm Puppy
[Rd] subscripting a data.frame (without changing row order) changes internal row.names
Dear R-devel,

Can anyone help me to understand this? It seems that subscripting the rows of a data.frame without actually changing their order somehow changes an internal representation of row.names that is revealed by e.g. dput/dump/serialize.

I have read the docs and inspected the (R) code for data.frame, rownames, row.names and dput without enlightenment.

    df=data.frame(a=1:10, b=1)
    dput(df)
    df2=df[1:nrow(df), ]
    # R thinks they are equal (so do I!)
    all.equal(df, df2)
    dput(df2)

Looking at the output of the dputs:

    dput(df)
    structure(list(a = 1:10, b = c(1, 1, 1, 1, 1, 1, 1, 1, 1, 1)), .Names = c("a", "b"), row.names = c(NA, -10L), class = "data.frame")

    dput(df2)
    structure(list(a = 1:10, b = c(1, 1, 1, 1, 1, 1, 1, 1, 1, 1)), .Names = c("a", "b"), row.names = c(NA, 10L), class = "data.frame")

we have row.names = c(NA, -10L) in the first case and row.names = c(NA, 10L) in the second, so somehow these objects have a different representation. Can anyone explain why?

This has come up because

    library(digest)
    digest(df)==digest(df2)
    [1] FALSE

digest uses serialize under the hood, but serialize, dput and dump all show the same effect (I've pasted an example below using dump, md5sum from base R).

Many thanks for any enlightenment! More generally is there any way to calculate a digest of a data.frame that could get round this issue or is that not possible?

Best wishes,
Greg.
A digest using base R:

    library(tools)
    td=tempfile()
    dir.create(td)
    tempfiles=file.path(td,c("df", "df2"))
    dump("df",tempfiles[1])
    dump("df2",tempfiles[2])
    md5sum(tempfiles) # different md5sum

    sessionInfo() # for my laptop but also observed on R 3.1.2

    R version 3.1.1 (2014-07-10)
    Platform: x86_64-apple-darwin13.1.0 (64-bit)

    locale:
    [1] en_GB.UTF-8/en_GB.UTF-8/en_GB.UTF-8/C/en_GB.UTF-8/en_GB.UTF-8

    attached base packages:
    [1] tools     stats     graphics  grDevices utils     datasets  methods   base

    other attached packages:
    [1] nat_1.5.14      nat.utils_0.4.2 digest_0.6.4    Rvcg_0.9        devtools_1.6.1  igraph_0.7.1
    [7] testthat_0.9.1  rgl_0.93.1098

    loaded via a namespace (and not attached):
    [1] codetools_0.2-9   filehash_2.2-2    nabor_0.4.3       parallel_3.1.1    plyr_1.8.1
    [6] Rcpp_0.11.3       rstudio_0.98.1062 rstudioapi_0.1    XML_3.98-1.1      yaml_2.1.13
Re: [Rd] subscripting a data.frame (without changing row order) changes internal row.names
Hi Kevin, Joshua,

Many thanks for this additional information.

On 10 Nov 2014, at 22:21, Kevin Ushey wrote:

I believe the question here is related to the sign on the compact row names representation: why is it sometimes `c(NA, -n)` and sometimes `c(NA, n)` -- why the difference in sign?

It was indeed the difference in sign that Kevin highlights that was puzzling me.

To the best of my knowledge, older versions of R used the signed-ness of compact row.names to differentiate between different 'types' of data.frames, but that should no longer be necessary. Unless there is some reason not to, I believe R should standardize on one representation, and consider it a bug if the other is seen.

[snip, Joshua wrote]

Look at ?.row_names_info (which is mentioned in the See Also section of ?row.names) and its type argument. The first are "automatic". The second are a compact form of 1:10, as mentioned in ?row.names. I'm not certain of the root cause/reason, but the second object will not have "automatic" rownames because you have subset it with a non-missing 'i'.

Quoting ?.row_names_info for .row_names_info(x, type = 1L):

    Currently type = 0 returns the internal "row.names" attribute (possibly NULL), type = 2 the number of rows implied by the attribute, and type = 1 the latter with a negative sign for 'automatic' row names.

    .row_names_info(df2)
    [1] 10
    .row_names_info(df)
    [1] -10

So indeed the first case is marked as automatic, the second not.

Thanks again,
Greg.
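On the earlier question of a digest that gets round the row.names representation, one possible base-R approach is to hash a normalised payload instead of the data.frame itself. The sketch below is an assumption-laden illustration, not an established API: `df_md5` is a hypothetical helper that deparses the plain list of columns together with the character row names (which come out the same for either internal form) and takes the md5sum of that text.

```r
## Hypothetical helper: md5 of a text representation that does not depend
## on how row.names are stored internally.
df_md5 <- function(df) {
  payload <- list(columns = as.list(df),      # as.list() drops the row.names attribute
                  row_names = row.names(df))  # always a character vector
  tf <- tempfile()
  on.exit(unlink(tf))
  dput(payload, file = tf)
  unname(tools::md5sum(tf))
}

df  <- data.frame(a = 1:10, b = 1)
df2 <- df[1:nrow(df), ]

## Same hash whether or not the row names are marked "automatic".
same <- identical(df_md5(df), df_md5(df2))

## A genuine change in the data still changes the hash.
df3 <- df
df3$b[1] <- 2
changed <- !identical(df_md5(df), df_md5(df3))
```

Because the payload is deparsed text, differences beyond the precision that dput() writes for doubles would go unnoticed; whether that matters depends on the use case.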
[Rd] depends/suggests when making a new generic to override a function in a user package
Dear R developers,

I would like to add a new S3 generic to override a function in a user package, specifically STAR::as.repeatedTrain. I have followed the recommendation here:

http://cran.r-project.org/doc/manuals/R-exts.html#Adding-new-generics

doing this:

    as.repeatedTrain<-function(x,...){
      UseMethod("as.repeatedTrain")
    }

    as.repeatedTrain.default<-function(x,...) {
      STAR::as.repeatedTrain(x)
    }

but the question now arises: how do I make sure that my as.repeatedTrain generic function is used in preference to STAR's? If my package suggests STAR, then STAR will likely be loaded afterwards and my generic will get clobbered. So at the moment I have made my package depend on STAR rather than suggest it. However I will only occasionally use STAR, and it is quite a big package with a number of dependencies, so I would prefer to avoid this.

Is there another approach? I assume this isn't a problem for a non-generic in a base package because they are loaded first.

Many thanks for any suggestions,
Greg Jefferis.
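One common pattern when a package is only in Suggests is to guard the fallback with requireNamespace(). The sketch below shows that pattern only; it does not by itself stop STAR's function masking the generic if STAR is attached later, in which case the generic would still need to be called with an explicit namespace prefix. The "mytrain" class and its method are invented for the illustration and do not produce a real STAR object.

```r
## Generic, as in the message above.
as.repeatedTrain <- function(x, ...) {
  UseMethod("as.repeatedTrain")
}

## Default method: defer to STAR only if it is installed (Suggests-style).
as.repeatedTrain.default <- function(x, ...) {
  if (!requireNamespace("STAR", quietly = TRUE))
    stop("Package 'STAR' is needed for this method; please install it.",
         call. = FALSE)
  STAR::as.repeatedTrain(x)
}

## Hypothetical method for a made-up class "mytrain"; it only relabels the
## object and stands in for a real conversion.
as.repeatedTrain.mytrain <- function(x, ...) {
  structure(unclass(x), class = "repeatedTrain")
}

## Dispatch on the made-up class does not touch STAR at all.
x <- structure(list(c(0.1, 0.5, 0.9)), class = "mytrain")
r <- as.repeatedTrain(x)
class(r)
```

With this arrangement STAR is only loaded when the default method is actually reached.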
Re: [Rd] [BioC] enabling reproducible research & R package management & install.package.version & BiocLite
On 5 Mar 2013, at 14:36, Cook, Malcolm wrote:

So, even if I wanted to go where dragons lurked, it would not be possible to cobble a version of biocLite that installed specific versions of software. Thus, I might rather consider an approach that at 'publish' time tarzips up a copy of the R package dependencies based on a config file defined from sessionInfo and caches it in the project directory. Then when/if the project is revisited (and found to produce different results under current R enviRonment), I can "simply" install an old R (oops, I guess I'd have to build it), and then un-tarzip the dependencies into the project's own R/Library which I would put on .libPaths.

Sounds a little like this: http://cran.r-project.org/web/packages/rbundler/index.html (which I haven't tested).

Best,
Greg.
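The "config file defined from sessionInfo" part of that idea can be sketched with base R alone. This is only an illustration under assumptions: `snapshot_versions` is a hypothetical helper, the manifest is written to a temporary CSV file rather than a project directory, and nothing here archives or reinstalls the packages themselves.

```r
## Hypothetical helper: record the installed version of each named package.
snapshot_versions <- function(pkgs = loadedNamespaces()) {
  pkgs <- sort(unique(pkgs))
  versions <- vapply(pkgs,
                     function(p) as.character(utils::packageVersion(p)),
                     character(1))
  data.frame(package = pkgs, version = unname(versions),
             stringsAsFactors = FALSE)
}

## Record a few base packages and write the manifest to a file.
snap <- snapshot_versions(c("stats", "utils", "tools"))
manifest <- tempfile(fileext = ".csv")
write.csv(snap, manifest, row.names = FALSE)

## Later: read the manifest back (as character, so versions are not
## mangled into numbers) and compare with what is installed now.
restored <- read.csv(manifest, colClasses = "character")
current  <- snapshot_versions(restored$package)
drifted  <- restored$package[restored$version != current$version]
```

An empty `drifted` means the recorded packages still have the versions noted at 'publish' time.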