Re: [Rd] 64-bit integer type warning on windows

2018-03-14 Thread Juan Telleria Ruiz de Aguirre
It does not directly answer your question, but have you tried the "bit64"
CRAN package? :)

https://cran.r-project.org/web/packages/bit64/index.html
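
For context, here is a minimal base-R sketch of why a dedicated 64-bit type
is needed at the R level. It is only an illustration and does not call bit64
itself: R's native integer type is 32-bit, and doubles represent integers
exactly only up to 2^53. The variable name `overflow` is made up for the
example.

```r
# R's built-in integer type is 32-bit on every platform.
.Machine$integer.max

# Going past that limit yields NA (with a warning) rather than wrapping around.
overflow <- suppressWarnings(.Machine$integer.max + 1L)
is.na(overflow)

# Doubles hold integers exactly only up to 2^53; beyond that, neighbours collide.
2^53 + 1 == 2^53
```

Values that must stay exact beyond these limits are the case that a package
such as bit64 is meant to address.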

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] 64-bit integer type warning on windows

2018-03-14 Thread Dirk Eddelbuettel

On 13 March 2018 at 17:29, Christophe DUTANG wrote:
| So, I was wondering if this warning was removable at all? Does anyone
| encounter this issue?

That is a pretty old topic.

Did you look into Writing R Extensions?  The first mention is

   * Do be very careful with passing arguments between R, C and FORTRAN
     code.  In particular, 'long' in C will be 32-bit on some R
     platforms (including 64-bit Windows), but 64-bit on most modern
     Unix and Linux platforms.  It is rather unlikely that the use of
     'long' in C code has been thought through: if you need a longer
     type than 'int' you should use a configure test for a C99/C++11
     type such as 'int_fast64_t' (and failing that, 'long long' (8)) and
     typedef your own type to be 'long' or 'long long', or use another
     suitable type (such as 'size_t').

     It is not safe to assume that 'long' and pointer types are the same
     size, and they are not on 64-bit Windows.  If you need to convert
     pointers to and from integers use the C99/C++11 integer types
     'intptr_t' and 'uintptr_t' (which are defined in the header
     '' and are not required to be implemented by the C99
     standard but are used in C code by R itself).

     Note that 'integer' in FORTRAN corresponds to 'int' in C on all R
     platforms.

   [...]
   
   (8) but note that 'long long' is not a standard C++98 type, and C++
   compilers set up for strict checking will reject it.

so with C++11 you get by: simply make your package use 'CXX_STD = CXX11'.

Dirk

-- 
http://dirk.eddelbuettel.com | @eddelbuettel | e...@debian.org



[Rd] truncation/rounding bug with write.csv

2018-03-14 Thread Gregory Michaelson
Hello, I have looked at https://www.r-project.org/bugs.html , and it seems
that posting here is the only way to report this.

The issue is that the precision used by write.csv is not consistent for big
files.  See the following code:

First I create a large dataframe filled with random uniform values.  Then I
write it to .csv and print out the first and last lines.


df = data.frame(replicate(100, runif(100, 0,1)))

write.csv(df, "temp.csv")
system('tail -n1 temp.csv')
system('head -n2 temp.csv')


If you run this, you will note that the precision of the first line is
different from the precision of the last line.  I'm not sure what is
controlling this, but in the code that led me to this bug, I was only
getting 3 decimal places by the end of the file.

If you use the write functionality in readr, then you get consistent
precision:

readr::write_csv(df, "temp2.csv")
system('tail -n1 temp2.csv')
system('head -n2 temp2.csv')

I hope that this is helpful.  If this is not the proper way to submit the
bug, please let me know.


-- 

Greg



Re: [Rd] truncation/rounding bug with write.csv

2018-03-14 Thread Dirk Eddelbuettel

What OS are you on?  On Ubuntu 17.10 with R 3.4.3 all seems well (see
below for your example, I just added a setwd()).

[ That said, I have long held the (apparently minority) view that csv is, for
all intents and purposes, a less-than-ideal format.  If you have that much
data, you generally do not want to serialize it back and forth, as that is
slow and may drop precision.  The rds format is great for R alone; we now have
C code to read it from other apps (in the librdata repo by Evan Miller).
Different portable serializations work too (protocol buffer, msgpack, ...),
there are databases and on and on... ]
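
A small sketch of the difference between the two round trips, using only
base R. The data frame size, the seed and the temporary file names are
arbitrary choices for illustration. The rds round trip restores the object
exactly, whereas the csv round trip goes through a decimal text
representation, so it is only checked with all.equal() (equality within
numerical tolerance).

```r
set.seed(42)
df <- data.frame(replicate(5, runif(10)))

rds_file <- tempfile(fileext = ".rds")
csv_file <- tempfile(fileext = ".csv")

# Binary serialization: the object comes back bit-for-bit.
saveRDS(df, rds_file)
rds_exact <- identical(readRDS(rds_file), df)

# Text serialization: numbers are written with a limited number of
# significant digits, so only approximate equality is checked here.
write.csv(df, csv_file, row.names = FALSE)
from_csv <- read.csv(csv_file)
csv_close <- isTRUE(all.equal(from_csv, df))

rds_exact
csv_close
```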

Dirk


R> df <- data.frame(replicate(100, runif(100, 0,1)))
R> setwd("/tmp")
R> write.csv(df, "temp.csv")
R> system('tail -n1 temp.csv')
"100",0.11496100993827,0.740764639340341,0.519190795486793,0.736045523779467,0.537115448853001,0.769496953347698,0.102257401449606,0.437617724528536,0.173321532085538,0.351960731903091,0.397348914295435,0.496789071243256,0.463006566744298,0.573105450021103,0.575196429155767,0.821617329493165,0.112913676071912,0.187580146361142,0.121353451395407,0.576333721866831,0.00763232703320682,0.468676633667201,0.451408475637436,0.0172415724955499,0.946199159137905,0.439950440311804,0.109224532730877,0.657066411571577,0.0524766123853624,0.54859598656185,0.94473168021068,0.500153199071065,0.636756601976231,0.221365773351863,0.620196332456544,0.559639401268214,0.198483835440129,0.397874651942402,0.710652963491157,0.317212327616289,0.239299293374643,0.0606942125596106,0.165786643279716,0.667431530542672,0.436631754040718,0.812185280025005,0.374252707697451,0.421187321422622,0.730321826180443,0.904493971262127,0.399387824581936,0.650714065413922,0.594219180056825,0.147960299625993,0.941945064114407,0.357223904458806,0.275038427906111,0.191008436959237,0.957893384154886,0.211530723143369,0.680650093592703,0.503884038887918,0.754094189498574,0.74776051659137,0.673691919771954,0.236221367260441,0.825558929471299,0.21071959589608,0.246618688805029,0.686810691142455,0.0247942050918937,0.572868114337325,0.494058627169579,0.684360746992752,0.0139967589639127,0.626861660508439,0.417218193877488,0.410173830809072,0.390906651504338,0.477168896235526,0.382211019750684,0.597674581920728,0.198329919017851,0.0684413285925984,0.450342149706557,0.133007253985852,0.755873151356354,0.372862737858668,0.762442974606529,0.582133987685665,0.692048883531243,0.259269661735743,0.147847984684631,0.635266482364386,0.320955650880933,0.00151186063885689,0.446474697208032,0.0673662247136235,0.791947861900553,0.0973296447191387
R> system('head -n2 temp.csv')
"","X1","X2","X3","X4","X5","X6","X7","X8","X9","X10","X11","X12","X13","X14","X15","X16","X17","X18","X19","X20","X21","X22","X23","X24","X25","X26","X27","X28","X29","X30","X31","X32","X33","X34","X35","X36","X37","X38","X39","X40","X41","X42","X43","X44","X45","X46","X47","X48","X49","X50","X51","X52","X53","X54","X55","X56","X57","X58","X59","X60","X61","X62","X63","X64","X65","X66","X67","X68","X69","X70","X71","X72","X73","X74","X75","X76","X77","X78","X79","X80","X81","X82","X83","X84","X85","X86","X87","X88","X89","X90","X91","X92","X93","X94","X95","X96","X97","X98","X99","X100"
"1",0.995067856274545,0.0237177284434438,0.839840568602085,0.99880409357138,0.455015312181786,0.967688028467819,0.191194181796163,0.903533136472106,0.570170691236854,0.86230118968524,0.23530788696371,0.30707904486917,0.256274404237047,0.369592409580946,0.989929250674322,0.50812312704511,0.806819133926183,0.536566868191585,0.0863138805143535,0.294523851014674,0.676951135974377,0.195627561537549,0.261776751372963,0.383222601376474,0.578275503357872,0.79082652577199,0.19860127940774,0.0204593606758863,0.659964868798852,0.42379029514268,0.69516694964841,0.0594558380544186,0.124592808773741,0.289328144863248,0.524508266709745,0.84306427766569,0.317027662880719,0.273440480465069,0.111866136547178,0.217484838794917,0.354757327819243,0.973936082562432,0.673076402861625,0.300948366522789,0.219195493729785,0.912278874544427,0.276768424082547,0.959344451315701,0.500720858341083,0.431024399353191,0.81699790329,0.0738761406391859,0.600137831410393,0.639816240407526,0.405302967177704,0.941259450744838,0.190415472723544,0.0382565588224679,0.486769351176918,0.127647049957886,0.55870802059,0.686994878342375,0.176803215174004,0.794697789475322,0.59406904829666,0.0897431457415223,0.196549082174897,0.0750515828840435,0.736311340238899,0.00494878669269383,0.383522965712473,0.960385771468282,0.101023471681401,0.209177070530131,0.798869548132643,0.147874428424984,0.187238642480224,0.148522146046162,0.32379064662382,0.620601811446249,0.201180462958291,0.179565666476265,0.466121524339542,0.245493365218863,0.980698639061302,0.342919659335166,0.387780519668013,0.393966492731124,0.148554262006655,0.521724705817178,0.722740866011009,0.105151653522626,0.461909410310909,0.905382365221158,0.073629385553,0.636923864483833,0.540197744267061,0.425208077067509,0.666353516280651,0.584139186656103
R> 

Re: [Rd] truncation/rounding bug with write.csv

2018-03-14 Thread Joris Meys
To my surprise, I can confirm this on Windows 10 using R 3.4.3. As tail is
not recognized by Windows cmd, I replaced it with:

system('powershell -nologo "& "Get-Content -Path temp.csv -Tail 1')

The last line shows only 7 digits after the decimal point, whereas the first
has 15 digits after the decimal point. I agree with Dirk though, 1.6Gb csv
files are not the best way to work with datasets.
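
A platform-independent way to make the same comparison, as a sketch: read
the file back with readLines() instead of shelling out to head, tail or
powershell, and count the digits after the decimal point in the first and
last data rows. The helper name `digits_after_point` and the use of a
temporary file are assumptions made for this example. Fields written in
scientific notation without a decimal point are counted as 0.

```r
set.seed(1)
df <- data.frame(replicate(100, runif(100, 0, 1)))
path <- tempfile(fileext = ".csv")
write.csv(df, path)

lines <- readLines(path)

# Number of characters after the decimal point for each numeric field in a
# csv line (the first field, the quoted row name, is dropped).
digits_after_point <- function(line) {
  fields <- strsplit(line, ",", fixed = TRUE)[[1]][-1]
  nchar(sub("^[^.]*\\.?", "", fields))
}

first_digits <- digits_after_point(lines[2])             # first data row
last_digits  <- digits_after_point(lines[length(lines)]) # last data row

summary(first_digits)
summary(last_digits)
```

If the two summaries differ markedly, the file shows the inconsistency
described in the original report.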

Cheers
Joris




[Rd] clusterApply arguments

2018-03-14 Thread Florian Schwendinger
Hi!

I noticed that the argument matching of clusterApply (and therefore
parLapply) goes wrong when one of the arguments of the function is called "c".
In this case, the argument "c" is used as the cluster and the functions give
the following error message: "Error in checkCluster(cl) : not a valid cluster".

Of course, "c" is for many reasons an unfortunate argument name and this can be
easily fixed on the user side.

See below for a small example.

library(parallel)

clu <- makeCluster(2, "PSOCK")

fun <- function(x0, x1) (x0 + x1)
clusterApply(clu, x = 1:2, fun = fun, x1 = 1) ## OK
parLapply(cl = clu, X = 1:2, fun = fun, x1 = 1) #OK


fun <- function(b, c) (b + c)
clusterApply(clu, x = 1:2, fun = fun, c = 1) ## Error
clusterApply(cl = clu, x = 1:2, fun = fun, c = 1) ## OK 
parLapply(cl = clu, X = 1:2, fun = fun, c = 1) ## Error

stopCluster(clu)


I used "R version 3.4.3 Patched (2018-01-07 r74099)".


Best regards,
Florian



Re: [Rd] truncation/rounding bug with write.csv

2018-03-14 Thread Gregory Michaelson
I ran this code in RStudio Server on a Linux machine, but I don’t know the
R version offhand.  I will try to get it tomorrow.

Thanks,
Greg Michaelson
www.datarobot.com
704-981-1118





[Rd] punctuation in utils::cite()

2018-03-14 Thread Georgi Boshnakov
Hi,

I wonder if the following is a design decision in the default bib style for
utils::cite().

Create a bibentry object:

> bibs <- bibtex::read.bib(package = "tools")
> bibs

Murdoch D (2009). "Parsing Rd files." http://developer.r-project.org/parseRd.pdf>.

When an entry is cited with extra text supplied via the argument `after', I
would expect a comma after the year, but both the textual and the
parenthesised citation include only a space as delimiter:

> cite("murdoch:2009", bib = bibs, after = "section 2", textual = TRUE)
[1] "Murdoch (2009 section 2)"

> cite("murdoch:2009", bib = bibs, after = "section 2")
[1] "(Murdoch 2009 section 2)"


Is this intended?  Including the comma in `after' doesn't help, since the space 
is before it:

> cite("murdoch:2009", bib = bibs, after = ", section 2", textual = TRUE)
[1] "Murdoch (2009 , section 2)"

> cite("murdoch:2009", bib = bibs, after = ", section 2")
[1] "(Murdoch 2009 , section 2)"


Georgi Boshnakov




Re: [Rd] clusterApply arguments

2018-03-14 Thread Henrik Bengtsson
This is nothing specific to parallel::clusterApply() per se. It is the
default behavior of R, which allows partial matching of argument names.  I
don't think there's much that can be done here except always using
fully named arguments to the "apply" function itself, as you show.

You can "alert" yourself when there's a mistake by using:

options(warnPartialMatchArgs = TRUE)

e.g.

> clusterApply(clu, x = 1:2, fun = fun, c = 1) ## Error
Warning in clusterApply(clu, x = 1:2, fun = fun, c = 1) :
  partial argument match of 'c' to 'cl'
Error in checkCluster(cl) : not a valid cluster

It's still only a warning, but an informative one.
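
The mechanism can be reproduced without a cluster. The sketch below uses a
hypothetical stand-in, `apply_like()`, whose formals mirror the
`cl, x, fun, ...` layout; it is not the parallel API, it only reports where
each argument ended up.

```r
# Hypothetical stand-in with the same formal layout as clusterApply():
# 'cl' comes before '...', so a named argument 'c' can partially match it.
apply_like <- function(cl = NULL, x, fun, ...) {
  list(cl = cl, dots = list(...))
}

# 'cl' passed positionally: the named 'c = 1' partially matches 'cl',
# and the intended cluster object falls through into '...'.
res_bad <- apply_like("my cluster", x = 1:2, fun = identity, c = 1)

# 'cl' named explicitly: 'cl' is already taken by exact matching,
# so 'c = 1' is passed on through '...' as intended.
res_ok <- apply_like(cl = "my cluster", x = 1:2, fun = identity, c = 1)

str(res_bad)
str(res_ok)
```

In the first call `cl` receives 1, which is why checkCluster() complains in
the real function; in the second call `c = 1` reaches the worker function.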

/Henrik



Re: [Rd] truncation/rounding bug with write.csv

2018-03-14 Thread Ista Zahn
I don't see the issue here. It would be helpful if people would report
their sessionInfo() when reporting whether or not they see this issue.
Mine is

> sessionInfo()
R version 3.4.3 (2017-11-30)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Arch Linux

Matrix products: default
BLAS/LAPACK: /usr/lib/libopenblas_haswellp-r0.2.20.so

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C
 [9] LC_ADDRESS=C               LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

loaded via a namespace (and not attached):
[1] compiler_3.4.3 rmsfact_0.0.3  cowsay_0.5.0   fortunes_1.5-4


Re: [Rd] truncation/rounding bug with write.csv

2018-03-14 Thread Joris Meys
My apologies for not including sessionInfo(), and I'm a bit angry at myself
for that. Retrying in a fresh session of R, I get different results. More
specifically, I get the expected result where accuracy is the same in the
first and the last line. As I didn't include my sessionInfo() in my
previous mail, I can't figure out why I now have a different result. So I'm
positive I've seen the behaviour described by Gregory, but I can't
reproduce consistently.

Results and session Info below.

Cheers
Joris

df = data.frame(replicate(100, runif(100, 0,1)))
write.csv(df, "temp.csv")

> system('head -n2 temp.csv')
"","X1","X2","X3","X4","X5","X6","X7","X8","X9","X10","X11","X12","X13","X14","X15","X16","X17","X18","X19","X20","X21","X22","X23","X24","X25","X26","X27","X28","X29","X30","X31","X32","X33","X34","X35","X36","X37","X38","X39","X40","X41","X42","X43","X44","X45","X46","X47","X48","X49","X50","X51","X52","X53","X54","X55","X56","X57","X58","X59","X60","X61","X62","X63","X64","X65","X66","X67","X68","X69","X70","X71","X72","X73","X74","X75","X76","X77","X78","X79","X80","X81","X82","X83","X84","X85","X86","X87","X88","X89","X90","X91","X92","X93","X94","X95","X96","X97","X98","X99","X100"
"1",0.278388975420967,0.370451691094786,0.717217007186264,0.116161955753341,0.144262576242909,0.937281515449286,0.373484081588686,0.955863541224971,0.826917823404074,0.821003203978762,0.592950115678832,0.0627794633619487,0.815737818833441,0.0805139308795333,0.238502083579078,0.509200588334352,0.73775092815049,0.868772336747497,0.0352788285817951,0.96509046619758,0.403636189643294,0.435718205757439,0.0162769011221826,0.597037401981652,0.504837732296437,0.206882111029699,0.883217994589359,0.548339378088713,0.294472687412053,0.996299823047593,0.84715538774617,0.206719091162086,0.936834576772526,0.439650829415768,0.48171737533994,0.847850588615984,0.168411831371486,0.74452265072614,0.148969533387572,0.410039864480495,0.778313281945884,0.432499173562974,0.512454774230719,0.16644035698846,0.82063413807191,0.978053349768743,0.99700310616754,0.874686364317313,0.796479270327836,0.816980117466301,0.274035695008934,0.00785374757833779,0.678476774599403,0.660274159396067,0.184961069142446,0.681200950173661,0.611048432299867,0.73395977425389,0.209964233217761,0.310086127603427,0.975754244253039,0.125808657845482,0.015794032253325,0.526331929024309,0.531722096726298,0.59097072808072,0.815139955608174,0.529103851644322,0.183188699418679,0.910278890514746,0.237709420500323,0.752752122003585,0.14534721034579,0.00572531204670668,0.222574554383755,0.895228188252077,0.899962505558506,0.987743409816176,0.592631630599499,0.948386731324717,0.86595072131604,0.0715177122037858,0.0426598901394755,0.336731978459284,0.641609625890851,0.949697833275422,0.26424896903336,0.528028564760461,0.562290757661685,0.653207891387865,0.513830083655193,0.818740799557418,0.86044091056101,0.790382120991126,0.227793522411957,0.580261130817235,0.181467723799869,0.295633365400136,0.548259064555168,0.833231552969664
> system('powershell -nologo & Get-Content -Path temp.csv -Tail 1')
"100",0.946863592602313,0.656343327835202,0.627083137864247,0.482342466711998,0.337082419078797,0.424337374512106,0.626660786569118,0.870844106189907,0.78627574048005,0.0107703430112451,0.50574235082604,0.182688802946359,0.29385484661907,0.0441680049989372,0.375604564556852,0.895043386844918,0.510951161850244,0.865806604968384,0.0833957826253027,0.100834607845172,0.139034334337339,0.854574690107256,0.121182460337877,0.86904955166392,0.616418665507808,0.616997531382367,0.325345175806433,0.487117795739323,0.0097313594771,0.30411878527,0.0132197963539511,0.654607841046527,0.896146323531866,0.358923224499449,0.968490360304713,0.757937406655401,0.926832290366292,0.863271801266819,0.325824091676623,0.140821835258976,0.550571520347148,0.645497811725363,0.545551799703389,0.440615838393569,0.296690225601196,0.838868388207629,0.488215223187581,0.512655091006309,0.764586469857022,0.156665422255173,0.109298826660961,0.660329486243427,0.220234925625846,0.192423258908093,0.672684306278825,0.239764124620706,0.754978574579582,0.636799369007349,0.240582759492099,0.458807958755642,0.196174292825162,0.477994701592252,0.725636600283906,0.473409370519221,0.741089153569192,0.906417449470609,0.540478575974703,0.360421892022714,0.933905930491164,0.631188633851707,0.416520888684317,0.485372453462332,0.700725849252194,0.186034456361085,0.903570784721524,0.0693298415280879,0.261779377236962,0.128776200115681,0.0801852298900485,0.665786169003695,0.144309232477099,0.485807131510228,0.0646850543562323,0.909404250094667,0.848976222565398,0.862456669798121,0.949187902035192,0.240288577275351,0.177118748193607,0.0833796421065927,0.0747064722236246,0.107194342184812,0.774909492349252,0.424547733273357,0.848057812545449,0.913047505775467,0.134580536745489,0.904593974584714,0.90503191947937,0.386907825712115

> sessionInfo()
R version 3.4.3 (2017-11-30)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: