On 05/02/2018 03:21 PM, Joris Meys wrote:
Dear all,

I've noticed a problem when trying to download gz files from here:

https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSM907811

At the bottom of that page one can download GSM907811.CEL.gz. If I download this manually and try

  oligo::read.celfiles("GSM907811.CEL.gz")

everything works fine. (oligo is a Bioconductor package.) However, if I download using

  download.file("https://www.ncbi.nlm.nih.gov/geo/download/?acc=GSM907811&format=file&file=GSM907811%2ECEL%2Egz",
                destfile = "GSM907811.CEL.gz")
On Windows, the 'mode' argument to download.file() needs to be "wb" (write binary) for binary files.
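A minimal sketch of what that might look like, assuming the same URL and destination file as in the original message. The helper names download_binary() and is_gzip() are hypothetical and only for illustration; they are not part of utils or oligo. The second helper checks for the two gzip magic bytes as a quick sanity check that the file was written unchanged.

```r
# Hypothetical helper (not from the thread): download a binary file.
download_binary <- function(url, destfile) {
  # mode = "wb" writes the bytes unchanged. On Windows, the default
  # text mode ("w") translates line endings and corrupts binary data.
  status <- utils::download.file(url, destfile = destfile, mode = "wb")
  if (status != 0) stop("download failed with status ", status)
  invisible(destfile)
}

# Hypothetical sanity check: gzip files start with the bytes 0x1f 0x8b.
is_gzip <- function(path) {
  magic <- readBin(path, what = "raw", n = 2L)
  identical(magic, as.raw(c(0x1f, 0x8b)))
}

# Usage (needs network access, so only run interactively):
if (interactive()) {
  url <- paste0("https://www.ncbi.nlm.nih.gov/geo/download/",
                "?acc=GSM907811&format=file&file=GSM907811%2ECEL%2Egz")
  download_binary(url, "GSM907811.CEL.gz")
  is_gzip("GSM907811.CEL.gz")
}
```

If the magic-byte check passes, the file can then be handed to oligo::read.celfiles() as in the manual-download case.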
Martin
The file is downloaded, but oligo::read.celfiles() returns the following error:

  Error in checkChipTypes(filenames, verbose, "affymetrix", TRUE) :
    End of gz file reached unexpectedly. Perhaps this file is truncated.

Moreover, if I try to delete it after using download.file(), I get a warning that permission is denied. I can only remove it using Windows file explorer after I closed the R session, indicating that the connection is still open. Yet, showConnections() doesn't show any open connections either.

Session info below. Note that I started from a completely fresh R session. oligo is needed due to the specific file format of these gz files. They're not standard tarred files.

Cheers
Joris

Session Info
-------------------------------------------------------------------------------------

R version 3.5.0 (2018-04-23)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows >= 8 x64 (build 9200)

Matrix products: default

locale:
[1] LC_COLLATE=English_United Kingdom.1252  LC_CTYPE=English_United Kingdom.1252
[3] LC_MONETARY=English_United Kingdom.1252 LC_NUMERIC=C
[5] LC_TIME=English_United Kingdom.1252

attached base packages:
[1] stats4    parallel  stats     graphics  grDevices utils     datasets  methods
[9] base

other attached packages:
 [1] pd.hugene.1.0.st.v1_3.14.1 DBI_0.8                    oligo_1.44.0
 [4] Biobase_2.39.2             oligoClasses_1.42.0        RSQLite_2.1.0
 [7] Biostrings_2.48.0          XVector_0.19.9             IRanges_2.13.28
[10] S4Vectors_0.17.42          BiocGenerics_0.25.3

loaded via a namespace (and not attached):
 [1] Rcpp_0.12.16               compiler_3.5.0
 [3] BiocInstaller_1.30.0       GenomeInfoDb_1.15.5
 [5] bitops_1.0-6               iterators_1.0.9
 [7] tools_3.5.0                zlibbioc_1.25.0
 [9] digest_0.6.15              bit_1.1-12
[11] memoise_1.1.0              preprocessCore_1.41.0
[13] lattice_0.20-35            ff_2.2-13
[15] pkgconfig_2.0.1            Matrix_1.2-14
[17] foreach_1.4.4              DelayedArray_0.5.31
[19] yaml_2.1.18                GenomeInfoDbData_1.1.0
[21] affxparser_1.52.0          bit64_0.9-7
[23] grid_3.5.0                 BiocParallel_1.13.3
[25] blob_1.1.1                 codetools_0.2-15
[27] matrixStats_0.53.1         GenomicRanges_1.31.23
[29] splines_3.5.0              SummarizedExperiment_1.9.17
[31] RCurl_1.95-4.10            affyio_1.49.2
______________________________________________
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel