FWIW %m is the proper conversion for months. %M is minutes. Looks like a bug.
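
A quick way to check this, as a minimal sketch: it uses base R's strptime rather than timeDate or RTAQ, and the sample string is just the DATE and TIME fields from the quoted CSV pasted together (my assumption about what ends up being parsed).

```r
# Base R only; no RTAQ or timeDate needed for this check.
stamp <- "20101101 10:30:00"

# %m = month and %d = day of month, so the date part matches.
good <- as.POSIXct(strptime(stamp, format = "%Y%m%d %H:%M:%S", tz = "GMT"))

# %M = minute and %D = shorthand for %m/%d/%y, so the date part cannot match.
bad <- as.POSIXct(strptime(stamp, format = "%Y%M%D %H:%M:%S", tz = "GMT"))

print(good)        # a valid timestamp
print(is.na(bad))  # TRUE when the parse failed
```

A failed parse comes back as NA without a warning, which would explain the <NA> index in the TAQLoad output.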
Jeffrey Ryan | Founder | jeffrey.r...@lemnica.com
www.lemnica.com

On Oct 13, 2012, at 10:33 AM, Nicolae Caprarescu <capra...@cs.man.ac.uk> wrote:

> Hi Michael,
>
> Thank you for pointing me in the right direction, I'm now using an email
> client rather than Nabble.
>
> Related to the issue I described below, it's resolved now, I have managed
> to fix it myself. However, I believe this might be a bug, or at least
> something that needs improving; I have described both how to reproduce this
> issue and its solution in the below 4 steps:
> 1) library(RTAQ)
> 2) Create XXX_trades.csv file with the contents below using a relative
> path like [somewhere]/TAQData/2010-11-01/XXX_trades.csv
> SYMBOL,DATE,TIME,PRICE,SIZE,G127,CORR,COND,EX
> XXX,20101101,10:30:00,11.49,500,0,0,@,B
> XXX,20101101,10:30:02,11.49,322,0,0,0,B
> XXX,20101101,10:30:02,11.49,178,0,0,@,B
> XXX,20101101,10:30:03,11.49,500,0,0,@,B
> XXX,20101101,10:30:03,11.49,187,0,0,@,D
> 3)
> #convert does not generate any errors/warnings, however it does not work properly
> convert(from="2010-11-01",
> to="2010-11-01",datasource="[somewhere]/TAQData/",
> datadestination="[somewhere]/TAQDataRData/",trades=T,quotes=F,ticker="XXX",dir=T,
> extention="csv", header=T, tradecolnames=c("SYMBOL", "DATE", "TIME",
> "PRICE", "SIZE", "G127", "CORR", "COND", "EX"))
> #loading the RData created by convert
> TAQLoad("XXX",from="2010-11-01",to="2010-11-01",datasource="[somewhere]TAQDataRData/",
> trades=T,quotes=F)
> #output of TAQLoad
>      SYMBOL EX  PRICE   SIZE  COND CORR G127
> <NA> "XXX"  "B" "11.49" "500" "@"  "0"  "0"
> <NA> "XXX"  "B" "11.49" "322" "0"  "0"  "0"
> <NA> "XXX"  "B" "11.49" "178" "@"  "0"  "0"
> <NA> "XXX"  "B" "11.49" "500" "@"  "0"  "0"
> <NA> "XXX"  "D" "11.49" "187" "@"  "0"  "0"
> Warning message:
> timezone of object (GMT) is different than current timezone ().
>
> The problem is the <NA>s. If one does not supply the format of date and time
> to the convert function, it is assumed that the standard NYSE format is
> used, and therefore RTAQ internally (convert_to_RData.r line 32) represents
> this as "%Y%M%D %H:%M:%S". Whilst this works fine for some things, when a
> timeDate is initialised using this format (convert_to_RData.r line 102), it
> does not work. timeDate expects a correct format like "%Y%m%d %H:%M:%S"
> rather than "%Y%M%D %H:%M:%S".
> Run the below two to confirm:
> tdobject=timeDate:::timeDate(paste(as.vector("2010-10-11"),
> as.vector("10:30:30")), format="%Y%M%D %H:%M:%S",FinCenter="GMT",zone="GMT")
> #tdobject is GMT [1] [NA]
> tdobject=timeDate:::timeDate(paste(as.vector("20101011"),
> as.vector("10:30:30")), format="%Y%m%d %H:%M:%S",FinCenter="GMT",zone="GMT")
> #tdobject is now GMT [1] [2010-10-11 10:30:30]
>
> Therefore, if one explicitly includes format="%Y%m%d %H:%M:%S" in the
> convert function, everything works fine and the <NA> problem above is
> solved; this is my solution. Can I please suggest that, once you
> investigate this and provided that you confirm my understanding,
> convert_to_RData.r is changed in order to use "%Y%m%d %H:%M:%S" as the
> default format?
>
> 4) My environment:
> R version 2.15.1 (2012-06-22)
> Platform: i686-pc-linux-gnu (32-bit)
>
> locale:
> [1] LC_CTYPE=en_GB LC_NUMERIC=C LC_TIME=en_GB
> [4] LC_COLLATE=C LC_MONETARY=en_GB LC_MESSAGES=en_GB
> [7] LC_PAPER=C LC_NAME=C LC_ADDRESS=C
> [10] LC_TELEPHONE=C LC_MEASUREMENT=en_GB LC_IDENTIFICATION=C
>
> attached base packages:
> [1] stats graphics grDevices utils datasets methods base
>
> other attached packages:
> [1] RTAQ_0.2 timeDate_2160.97 xts_0.8-6 zoo_1.7-8
>
> loaded via a namespace (and not attached):
> [1] grid_2.15.1 lattice_0.20-6
>
>
> Best wishes,
> Nicolae
>
>
>
> On Fri, 12 Oct 2012 21:52:22 +0100, "R. Michael Weylandt"
> <michael.weyla...@gmail.com> wrote:
>> I'm forwarding this to the R-SIG-Finance list, where you'll have a more
>> specialized audience.
>>
>> In the meanwhile, you may wish to look at
>> http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example
>>
>> Finally, I note you're posting from Nabble. Please do include context in
>> your reply -- I don't believe Nabble does this automatically, so
>> you'll need to manually include it. Most of the regular respondents on
>> these lists don't use Nabble -- it is a _mailing list_ after all -- so
>> we don't get the forum view you do, only emails of the individual
>> posts. Combine that with the high volume of posts, and it's quite
>> difficult to trace a discussion if we all don't make sure to include
>> context.
>>
>> Cheers,
>> Michael
>>
>> On Fri, Oct 12, 2012 at 7:01 PM, caprarn9 <capra...@cs.man.ac.uk> wrote:
>>> Hello,
>>>
>>> I am closely following the RTAQ documentation in order to load my dataset
>>> into R, however I get this warning when running the convert function in
>>> the following way:
>>>
>>> convert(from="2010-11-01", to="2010-11-01",datasource=datasource,
>>> datadestination=datadestination,trades=T,quotes=T,ticker="BAC",dir=T,
>>> extention="csv", header=T, tradecolnames=c("SYMBOL", "DATE", "TIME",
>>> "PRICE", "SIZE", "G127", "CORR", "COND", "EX"), quotecolnames=c("SYMBOL",
>>> "DATE", "TIME", "BID", "OFR", "BIDSIZ", "OFRSIZ", "MODE", "EX"))
>>>
>>> The only warning returned is:
>>> In `[<-.factor`(`*tmp*`, is.na(tdata$G127), value = c(1L, 1L, 1L, :
>>> invalid factor level, NAs generated
>>>
>>> As it is a warning, the .RData files still get created and I can use
>>> TAQLoad to load them:
>>>
>>> x <-
>>> TAQLoad("BAC",from="2010-11-01",to="2010-11-01",datasource=datadestination,
>>> trades=T,quotes=T)
>>>
>>> The PROBLEM:
>>> head(x)
>>>      SYMBOL EX  PRICE     SIZE   COND CORR G127
>>> <NA> "BAC"  "B" "11.4900" " 500" "@"  "0"  "0"
>>> ...
>>>
>>> This is the same for the quotes objects, but different headers
>>> obviously. I get a <NA> instead of the expected YYYY-MM-DD HH:MM:SS
>>> format for each observation.
>>>
>>> I've spent a fair number of hours on trying to get this right, no
>>> success.
>>> Can you please provide me with some guidance?
>>>
>>> Thank you.
>>>
>>> A sample from the CSV files I use:
>>>
>>> SYMBOL,DATE,TIME,BID,OFR,BIDSIZ,OFRSIZ,MODE,EX
>>> BAC,20101101,9:30:00,11.5,11.51,5,116,12,P
>>> ...
>>>
>>> SYMBOL,DATE,TIME,PRICE,SIZE,G127,CORR,COND,EX
>>> BAC,20101101,10:30:00,11.49,500,0,0,@,B
>>> ...
>>>
>>>
>>> --
>>> View this message in context:
>>> http://r.789695.n4.nabble.com/RTAQ-convert-function-warning-causes-incorrect-loading-of-data-tp4646025.html
>>> Sent from the R help mailing list archive at Nabble.com.
>>>
>>> ______________________________________________
>>> R-help@r-project.org mailing list
>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>> PLEASE do read the posting guide
>>> http://www.R-project.org/posting-guide.html
>>> and provide commented, minimal, self-contained, reproducible code.
>
> _______________________________________________
> r-sig-fina...@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-sig-finance
> -- Subscriber-posting only. If you want to post, subscribe first.
> -- Also note that this is not the r-help list where general R questions
> should go.