Re: [Rd] Vignette problem and CRAN policies
This has nothing to do with CRAN policies (nor R). The issue is that the current upquote.sty does not play well with the 'ae' fonts used by default by Sweave. The change is in TeX, and that is what Spencer Graves was told.

On 19/09/2013 04:35, Spencer Graves wrote:

Hello, All:

The vignette in the sos package used "upquote.sty", which was required by the R Journal when the article was published in 2009. Current CRAN policy disallows "upquote.sty", and I have so far not found a way to pass "R CMD check" on sos without it. I changed sos.Rnw per an email exchange with Prof. Ripley without solving the problem; see below. The key error messages (see the results of "R CMD build" below) are

    sos.tex:16: LaTeX Error: Environment article undefined
    sos.tex:558: LaTeX Error: \begin{document} ended by \end{article}.

When the article worked, it had both \begin{document} and \begin{article}, with matching \end statements for both; I've tried commenting out either one without success. The current non-working code is available on R-Forge via anonymous SVN checkout using "svn checkout svn://svn.r-forge.r-project.org/svnroot/rsitesearch/". Any suggestions on how to fix this would be greatly appreciated.

Thanks,
Spencer

## COMPLETE RESULTS FROM R CMD check

    Microsoft Windows [Version 6.1.7600]
    Copyright (c) 2009 Microsoft Corporation. All rights reserved.

    C:\Users\sgraves>cd 2013
    C:\Users\sgraves\2013>cd R_pkgs
    C:\Users\sgraves\2013\R_pkgs>cd sos
    C:\Users\sgraves\2013\R_pkgs\sos>cd pkg
    C:\Users\sgraves\2013\R_pkgs\sos\pkg>R CMD build sos
    * checking for file 'sos/DESCRIPTION' ... OK
    * preparing 'sos':
    * checking DESCRIPTION meta-information ... OK
    * installing the package to re-build vignettes
    * creating vignettes ... ERROR
    Loading required package: brew
    Attaching package: 'sos'
    The following object is masked from 'package:utils': ?
    Loading required package: WriteXLS
    Perl found.
    The following Perl modules were not found on this system:
    Text::CSV_XS
    If you have more than one Perl installation, be sure the correct one
    was used here. Otherwise, please install the missing modules. See the
    package INSTALL file for more information.
    Loading required package: RODBC
    Warning in odbcUpdate(channel, query, mydata, coldata[m, ], test = test, :
      character data 'Adrian Baddeley and Rolf Turner with substantial
      contributions of code by Kasper Klitgaard Berthelsen; Abdollah Jalilian;
      Marie-Colette van Lieshout; Ege Rubak; Dominic Schuhmacher; and Rasmus
      Waagepetersen. Additional contributions by Q.W. Ang; S. Azaele;
      C. Beale; R. Bernhardt; T. Bendtsen; A. Bevan; B. Biggerstaff;
      R. Bivand; F. Bonneu; J. Burgos; S. Byers; Y.M. Chang; J.B. Chen;
      I. Chernayavsky; Y.C. Chin; B. Christensen; J.-F. Coeurjolly;
      R. Corria Ainslie; M. de la Cruz; P. Dalgaard; P.J. Diggle;
      P. Donnelly; I. Dryden; S. Eglen; O. Flores; N. Funwi-Gabga;
      A. Gault; M. Genton; J. Gilbey; J. Goldstick; P. Grabarnik; C. Graf;
      J. Franklin; U. Hahn; A. Hardegen; M. Hering; M.B. Hansen;
      M. Hazelton; J. Heikkinen; K. Hornik; R. Ihaka; A. Jammalamadaka;
      R. John-Chandran; D. Johnson; M. Kuhn; J. Laake; F. Lavancier;
      T. Lawrence; R.A. Lamb; J. Lee; G.P. Leser; [... truncated]
    Warning in odbcUpdate(channel, query, mydata, coldata[m, ], test = test, :
      character data 'John Fox [aut, cre], Sanford Weisberg [aut], Douglas
      Bates [ctb], Steve Ellison [ctb], David Firth [ctb], Michael Friendly
      [ctb], Gregor Gorjanc [ctb], Spencer Graves [ctb], Richard Heiberger
      [ctb], Rafael Laboissiere [ctb], Georges Monette [ctb], Henric Nilsson
      [ctb], Derek Ogle [ctb], Brian Ripley [ctb], Achim Zeileis [ctb],
      R-Core [ctb]' truncated to 255 bytes in column 'Author'
    Warning in odbcUpdate(channel, query, mydata, coldata[m, ], test = test, :
      character data 'John Fox [aut, cre], Liviu Andronic [ctb], Michael Ash
      [ctb], Milan Bouchet-Valat [ctb], Theophilius Boye [ctb], Stefano Calza
      [ctb], Andy Chang [ctb], Philippe Grosjean [ctb], Richard Heiberger
      [ctb], Kosar Karimi Pour [ctb], G. Jay Kerns [ctb], Renaud Lancelot
      [ctb], Matthieu Lesnoff [ctb], Uwe Ligges [ctb], Samir Messad [ctb],
      Martin Maechler [ctb], Robert Muenchen [ctb], Duncan Murdoch [ctb],
      Erich Neuwirth [ctb], Dan Putler [ctb], Brian Ripley [ctb], Miroslav
      Ristic [ctb], Peter Wolf [ctb]' truncated to 255 bytes in column 'Author'
    Perl found.
    The following Perl modules were not found on this system:
    Text::CSV_XS
    If you have more than one Perl installation, be sure the correct one
    was used here. Otherwise, please install the missing modules. See the
    package INSTALL file for more information.
    Warning in odbcUpdate(channel, query, mydata, coldata[m, ], test = test, :
      character data 'Adrian Baddeley
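To make the advice concrete, here is a minimal preamble sketch in the direction Prof. Ripley points at, assuming the vignette is no longer built with the R Journal's style so the article environment can simply go: drop the \begin{article}/\end{article} pair entirely, and pass Sweave's noae option so the 'ae' fonts that clash with upquote.sty are never loaded. This is a sketch, not Spencer's actual sos.Rnw.

    \documentclass{article}
    % 'noae' stops Sweave.sty loading the 'ae' fonts that upquote.sty
    % does not play well with
    \usepackage[noae]{Sweave}
    \usepackage{upquote}
    \begin{document}
    % vignette body here; no \begin{article} ... \end{article} pair
    \end{document}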
Re: [Rd] dbeta may hang R session for very large values of the shape parameters
The issue is underflow to zero in bd0 (C file src/nmath/bd0.c). We'll fix that, but given that the R 3.0.2 RC is in code freeze and this has existed for years, not for 3.0.2.

On 18/09/2013 23:52, Kosmidis, Ioannis wrote:

Dear all,

We received a bug report for betareg that in some cases the optim call in betareg.fit would hang the R session, and the command cannot be interrupted by Ctrl-C. We narrowed the problem down to the dbeta function, which is used for the log-likelihood evaluation in betareg.fit. In particular, the following command hangs the R session at 100% CPU usage on every system we tried (OS X 10.8.4, Debian GNU/Linux, Ubuntu 12.04), with both R-3.0.1 and the R-devel version (on each system I waited 3 minutes before killing R):

    ## Warning: this will hang the R session
    dbeta(0.9, 1e+308, 10)

Furthermore, through a trial-and-error investigation using the following code

    ## Warning: this will hang the R session
    x <- 0.9
    for (i in 0:100) {
        a <- 1e+280 * 2^i
        b <- 10
        cat("shape1 =", a, "\n")
        cat("shape2 =", b, "\n")
        cat("Beta density", dbeta(x, shape1 = a, shape2 = b), "\n")
        cat("===\n")
    }

I noticed that:

* this seems to happen when shape1 is about 1e+308, seemingly irrespective of the value of shape2 (run the above with another value of b), and apparently only when x >= 0.9 and x < 1 (run the above lines with x <- 0.8, for example, and everything works as expected);

* similar problems are encountered for small x values when shape2 is massive.

I am not sure why this happens, but it looks deep to me. The temporary fix for the purposes of betareg was a hack (a simple if statement that returns NA for the log-likelihood if any shape parameter is greater than, say, 1e+300). Nevertheless, I thought this was an issue worth reporting to R-devel (instead of R-help), especially since dbeta may be used within generic optimisers, and figuring out that dbeta is the problem can be hard --- it took us some time before we started suspecting dbeta. Interestingly, this appears to happen close to what R considers infinity: typing 1.799e+308 into R returns Inf.

I hope the above limited analysis is informative.

Best regards,
Ioannis

--
Brian D. Ripley, rip...@stats.ox.ac.uk
Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel: +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK, Fax: +44 1865 272595
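A minimal sketch of the guard Ioannis describes (safe_dbeta_log is a hypothetical name for illustration, not the actual betareg code): refuse to call dbeta when a shape parameter is large enough to trigger the hang, and return NA so an optimiser can move on.

    ## sketch of the workaround; the 1e300 cutoff follows the message above
    safe_dbeta_log <- function(x, shape1, shape2, cutoff = 1e300) {
        if (shape1 > cutoff || shape2 > cutoff) return(NA_real_)
        dbeta(x, shape1, shape2, log = TRUE)
    }
    safe_dbeta_log(0.9, 1e+308, 10)   # NA instead of hanging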
Re: [Rd] getParseData() for imaginary numbers
On 13-09-18 1:47 PM, Yihui Xie wrote:
> Hi, the imaginary unit is gone in the 'text' column of the data frame
> returned by getParseData(); e.g. in the example below, perhaps the
> text should be 1i instead of 1:

Yes, I can confirm the bug. I'll fix it in R-devel now, and in R-patched after 3.0.2 is released next week.

Duncan Murdoch

>     p = parse(text = '1i')
>     getParseData(p)
>       line1 col1 line2 col2 id parent     token terminal text
>     1     1    1     1    2  1      2 NUM_CONST     TRUE    1
>     2     1    1     1    2  2      0      expr    FALSE
>
>     sessionInfo()
>     R version 3.0.1 (2013-05-16)
>     Platform: x86_64-pc-linux-gnu (64-bit)
>
>     locale:
>      [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
>      [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
>      [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8
>      [7] LC_PAPER=C                 LC_NAME=C
>      [9] LC_ADDRESS=C               LC_TELEPHONE=C
>     [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
>
>     attached base packages:
>     [1] stats graphics grDevices utils datasets methods base
>
> Regards,
> Yihui
> --
> Yihui Xie, Web: http://yihui.name
> Department of Statistics, Iowa State University
> 2215 Snedecor Hall, Ames, IA
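Until the fix lands, note that the column positions in the parse data are still correct (col1 = 1, col2 = 2 above span both characters of "1i"), so one possible workaround is to recover the token text from the source string itself. This is a sketch, not anything proposed in the thread, and as written it only handles single-line sources:

    src <- "1i"
    p <- parse(text = src, keep.source = TRUE)
    pd <- getParseData(p)
    num <- pd[pd$token == "NUM_CONST", ]
    ## take the text straight from the source using col1/col2
    substring(src, num$col1, num$col2)   # "1i"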
Re: [Rd] dbeta may hang R session for very large values of the shape parameters
Dear Brian,

Many thanks, that is great.

Best regards,
Ioannis

On 19/09/13 12:15, Prof Brian Ripley wrote:
> The issue is underflow to zero in bd0 (C file src/nmath/bd0.c). We'll
> fix that, but given that the R 3.0.2 RC is in code freeze and this has
> existed for years, not for 3.0.2.

--
Dr Ioannis Kosmidis
Department of Statistical Science, University College, London, WC1E 6BT, UK
Webpage: http://www.ucl.ac.uk/~ucakiko
Re: [Rd] Design for classes with database connection
Simon,

Your idea to use SQLite, and the nature of some of the sorting and extracting you are suggesting, make me wonder why you are thinking of R data structures as the home for the data storage. I would be inclined to put the data in an SQL database as the prime repository, then extract the parts you want with SQL queries and bring them into R for analysis and graphics. If the full data set is large, and the parts you want to analyze in R at any one time are relatively small, then this will be much faster. After all, SQL is primarily for databases, whereas R's strength is more in statistics and graphics.

In the project http://tsdbi.r-forge.r-project.org/ I have code that does some of the things you probably want. There the focus is on a single identifier for a series, and various observation frequencies are supported. Tick data is supported (as time-stamped data) but not extensively tested, as I do not work with tick data much. There is a function TSquery, currently in TSdbi on CRAN but very shortly being split, along with the SQL-specific parts of the interface, into a package TSsql. It is very much like the queries you seem to have in mind, but I have not used it with tick data. It generates a time series by formulating a query to a database with several possible sorting fields, very much as you describe, and then ordering the data according to the time index.

If your data set is large, then you need to think carefully about which fields you index. You certainly do not want to be building the indexes on the fly, as you would need to do if you dumped all the data out of R into an SQL db just to do a sort. If the data set is small, then indexing does not matter too much; also, for a small data set there is much less advantage in keeping the data in an SQL db rather than in R. You do need to be a bit more specific about what "huge" means. (Tick data for 5 days or 20 years? 100 IDs or 10 million?) Large for an R structure is not necessarily large for an SQL db. With more specifics I might be able to give more suggestions. (R-SIG-DB may be a better forum for this discussion.)

HTH,
Paul

On 13-09-18 01:06 PM, Simon Zehnder wrote:

Dear R-Devels,

I am currently designing a package intended to simplify the handling of market microstructure data (tick data, order data, etc.). As these data are usually pretty huge and need to be reordered quite often (e.g. if data for several securities are batched together, or if only a certain time range should be considered), the package needs to handle this. Before I start, I would like to mention some facts that made me decide to write my own package instead of using e.g. the packages bigmemory, highfrequency, zoo or xts: AFAIK bigmemory does not provide the ability to handle data of different types (timestamp, string and numerics) and their appropriate sorting; for this task databases offer better tools. Package highfrequency is designed to work with one specific data structure, and market microstructure data have much greater variety. Packages zoo and xts offer a lot of versatility but do not offer the data-sorting ability needed for such big data. I would like some feedback on this decision and on the short design overview that follows.

My design idea is:

1. Base the package on S4 classes, with one class that handles data reading from external sources, structuring and reordering. Structuring is done with respect to specific data variables, i.e. security ID, company ID, timestamp, price, volume (not all have to be provided, but some surely exist in market microstructure data). The less important variables are kept in a slot @other and are only ordered with respect to the other variables. Something like this:

    .mmstruct <- setClass("mmstruct",
        representation(
            name      = "character",
            index     = "array",
            N         = "integer",
            K         = "integer",
            compID    = "array",
            secID     = "array",
            tradetime = "POSIXlt",
            flag      = "array",
            price     = "array",
            vol       = "array",
            other     = "data.frame"))

2. To enable a lightweight ordering function, the class should basically create an SQLite database on construction and delete it when 'rm()' is called. Throughout its life an object holds the database path and can execute queries on the database tables. This way I can use the table sorting of SQLite (e.g. by constructing an index for each important variable). I assume this is faster and more efficient than programming something on my own --- why reinvent the wheel? For this I would use VIRTUAL classes like:

    .mmstructBASE <- setClass("mmstructBASE",
        representation(
            dbName  = "character",
            dbTable = "character"))

    .mmstructDB <- setClass("mmstructDB",
        representation(
            conn = "SQLiteConnection"),
        contains = c("mmstructBASE"))

    .mmstruct <- setClass("mmstruct",
        representation(
            name = "character",
            index
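As a concrete illustration of Paul's "SQL as prime repository" suggestion, here is a minimal sketch using a current DBI with RSQLite; tick_df, the ticks table and its column names are hypothetical stand-ins, not anything from the thread:

    library(DBI)
    library(RSQLite)

    con <- dbConnect(SQLite(), "ticks.db")

    ## load the raw tick data into the database once
    dbWriteTable(con, "ticks", tick_df)

    ## build indexes up front on the fields used for sorting/filtering
    dbExecute(con, "CREATE INDEX idx_sec_time ON ticks (secID, tradetime)")

    ## pull only the slice to analyse into R, already ordered by time
    slice <- dbGetQuery(con, "
        SELECT tradetime, price, vol
          FROM ticks
         WHERE secID = 'XYZ'
         ORDER BY tradetime")

    dbDisconnect(con)

The point of the design is that sorting and subsetting happen in SQLite, on indexed columns, and only the small ordered result ever becomes an R object.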
[Rd] Using long long types in C++
Hello,

In Rcpp we'd like to do something useful for types such as long long and unsigned long long, for example supporting our usual wrap construct. We'd like to be able to "wrap" a long long, or perhaps a std::vector<long long>, so that it is returned to the R side as something meaningful (we are considering several options, such as losing some precision and returning an int, losing a bit less precision and returning a double, or using bit-shifting tricks to do something compatible with the bit64 package).

To do this, we try to be careful and hide the code behind these two preprocessor tests:

    #if defined(__GNUC__) && defined(__LONG_LONG_MAX__)

which test for a gcc-compatible compiler (this includes clang) and the availability of the long long type. Now this is not enough, and we also have to use __extension__ to disable the warnings emitted by -pedantic. So we have something like this:

    #if defined(__GNUC__) && defined(__LONG_LONG_MAX__)
    __extension__ typedef long long int rcpp_long_long_type;
    __extension__ typedef unsigned long long int rcpp_ulong_long_type;
    #define RCPP_HAS_LONG_LONG_TYPES
    #endif

For the rest of what we do with these types, we use rcpp_long_long_type and rcpp_ulong_long_type and hide the code behind

    #if defined(RCPP_HAS_LONG_LONG_TYPES)

But apparently this is still not enough, and on some versions of gcc (e.g. 4.7 something) -pedantic still generates the warnings unless we also use -Wno-long-long. Dirk tells me that the fact that these warnings show up means the package would not be accepted on CRAN.

I understand that -pedantic is useful for finding potential portability problems, but in this case I believe everything is done to isolate the use of long long to a situation where we know we can use it, given that we test for a compiler (gcc) and its known way of checking for the existence of long long: __LONG_LONG_MAX__.

What are my options here?

Romain
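For illustration, a minimal sketch of the "return a double" option mentioned above; rcpp_wrap_ll is a hypothetical helper for this example, not Rcpp's actual wrap machinery:

    #if defined(RCPP_HAS_LONG_LONG_TYPES)
    #include <Rinternals.h>
    // convert to double; values above 2^53 lose precision
    SEXP rcpp_wrap_ll(rcpp_long_long_type x) {
        return Rf_ScalarReal(static_cast<double>(x));
    }
    #endif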
Re: [Rd] Using long long types in C++
On Fri, Sep 20, 2013 at 12:51:52AM +0200, rom...@r-enthusiasts.com wrote:
> In Rcpp we'd like to do something useful for types such as long long
> and unsigned long long.
...
> But apparently this is still not enough and on some versions of gcc
> (e.g. 4.7 something), -pedantic still generates the warnings unless
> we also use -Wno-long-long

Can you also add -std=c++0x, or is that considered as bad as adding -Wno-long-long?

(And why not use autoconf's AC_TYPE_LONG_LONG_INT and AC_TYPE_UNSIGNED_LONG_LONG_INT for the tests?)

Cheers,

Patrick
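For reference, a sketch of how the autoconf route Patrick mentions might be consumed on the C++ side; HAVE_LONG_LONG_INT is the symbol AC_TYPE_LONG_LONG_INT defines, and the configure.ac line and config.h generation are assumed here:

    /* configure.ac would contain:  AC_TYPE_LONG_LONG_INT  */
    #include "config.h"
    #ifdef HAVE_LONG_LONG_INT
    typedef long long rcpp_long_long_type;
    #define RCPP_HAS_LONG_LONG_TYPES
    #endif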
Re: [Rd] Using long long types in C++
Romain,

Can you use int64_t and uint64_t instead? IMHO that would be more useful than long long anyway.

Karl

On Sep 19, 2013 5:33 PM, "Patrick Welche" wrote:
> On Fri, Sep 20, 2013 at 12:51:52AM +0200, rom...@r-enthusiasts.com wrote:
> > In Rcpp we'd like to do something useful for types such as long long
> > and unsigned long long.
> ...
> > But apparently this is still not enough and on some versions of gcc
> > (e.g. 4.7 something), -pedantic still generates the warnings unless
> > we also use -Wno-long-long
>
> Can you also add -std=c++0x or is that considered as bad as adding
> -Wno-long-long?
>
> (and why not use autoconf's AC_TYPE_LONG_LONG_INT and
> AC_TYPE_UNSIGNED_LONG_LONG_INT for the tests?)
>
> Cheers,
>
> Patrick
Re: [Rd] Using long long types in C++
On 20/09/2013 03:04, Karl Millar wrote:
> Romain,
> Can you use int64_t and uint64_t instead? IMHO that would be more
> useful than long long anyway.

'Writing R Extensions' does say:

"Do be very careful with passing arguments between R, C and FORTRAN code. In particular, long in C will be 32-bit on some R platforms (including 64-bit Windows), but 64-bit on most modern Unix and Linux platforms. It is rather unlikely that the use of long in C code has been thought through: if you need a longer type than int you should use a configure test for a C99 type such as int_fast64_t (and failing that, long long) and typedef your own type to be long or long long, or use another suitable type (such as size_t)."

Note that int64_t is not portable, even in C99, since its implementation is optional.

On 20/09/2013 01:31, Patrick Welche wrote:
> Can you also add -std=c++0x or is that considered as bad as adding
> -Wno-long-long?

That is not portable: it is g++-specific, and AFAIR not accepted by the version of g++ used on OS X (which dates from 2007).

--
Brian D. Ripley, rip...@stats.ox.ac.uk
Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel: +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK, Fax: +44 1865 272595
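A sketch of the typedef approach that passage recommends; my_int64 is a hypothetical name chosen for illustration, and in a package the choice would normally come from a configure test rather than be hard-coded:

    #include <stdint.h>
    // int_fast64_t is required by C99 (and C++11), unlike int64_t,
    // which is optional even there
    typedef int_fast64_t my_int64;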
Re: [Rd] Using long long types in C++
On 20 Sep 2013, at 02:31, Patrick Welche wrote:

> On Fri, Sep 20, 2013 at 12:51:52AM +0200, rom...@r-enthusiasts.com wrote:
>> In Rcpp we'd like to do something useful for types such as long long
>> and unsigned long long.
> ...
>> But apparently this is still not enough and on some versions of gcc
>> (e.g. 4.7 something), -pedantic still generates the warnings unless
>> we also use -Wno-long-long
>
> Can you also add -std=c++0x or is that considered as bad as adding
> -Wno-long-long?

IIRC a package on CRAN is not allowed to change -std; there are, or at least were, barriers to forbid this. Plus, some of us use the default settings on OS X, which is still a (quasi-)gcc 4.2.1 that has long long but does not implement C++11.

> (and why not use autoconf's AC_TYPE_LONG_LONG_INT and
> AC_TYPE_UNSIGNED_LONG_LONG_INT for the tests?)

Because no matter how many precautions we take, if at the end of the day we end up having mentions of long long in the code, even behind sufficient tests, it will still generate warnings, which I'm told would prevent the CRAN distribution of the package. I'd really like to hear from the CRAN maintainers on this.
Re: [Rd] Using long long types in C++
Karl,

Brian gave some insights already. I'm also reluctant to use int64_t because there does not seem to be a standard version of what the type is. E.g. on OS X, int64_t is a typedef for long long; IIRC there are cases where it is a typedef for long ... At least with long and long long they are guaranteed to be different types, and I don't need to resort to configure voodoo: I can just rely on the compiler and its preprocessor.

Romain

On 20 Sep 2013, at 04:04, Karl Millar wrote:

> Romain,
>
> Can you use int64_t and uint64_t instead? IMHO that would be more
> useful than long long anyway.
>
> Karl
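Romain's overload point can be made concrete (wrap here is a stand-in declaration, not Rcpp's actual one): long and long long are always distinct built-in types, so both overloads coexist on every platform, whereas int64_t is an alias for one of them and adding a third overload can become a redeclaration.

    void wrap(long x);        // one overload
    void wrap(long long x);   // always a distinct type, hence a distinct overload
    // void wrap(int64_t x);  // on platforms where int64_t is long long,
    //                        // this redeclares wrap(long long)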