[R] Loop too slow for Bid calc - BUT cannot figure out how to do with matrix
Hi,

I am trying to create Bid/Ask for each second from a high volume stock, and the only way I have been able to solve this is by using loops to create the target matrix from the source tick data matrix. Looping is too slow and not practical to use on multiple stocks.

For example:

Bids Matrix (a real one is 400,000++ length):

Bid     Time
10.03   11:05:03.124
10.04   11:05:03.348
10.05   11:05:04.010

One Second Bid Matrix (Bid price for every second of the day):

Bid     Second
10.02   11:05:03
??      11:05:04

sec.onesec) {
      onesec$Bid[sec] = bids$Price[r -1]  # Price of previous bid
      bidrow = r  # save bidrow as starting point to find next bid.
      break
    } #if
  }# for
}# for

Hope this is clear and thanks for your help.

Chris

--
View this message in context: http://r.789695.n4.nabble.com/Loop-too-slow-for-Bid-calc-BUT-cannot-figure-out-how-to-do-with-matrix-tp2955116p2955116.html
Sent from the R help mailing list archive at Nabble.com.
__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Re: [R] Loop too slow for Bid calc - BUT cannot figure out how to do with matrix
Duncan and Martin,

Thank you for your replies. I went with Martin's suggestion as it did not require loops and is probably the fastest... though it did take me 3 hours to figure out exactly how it was working!

Here is what I am now using:

bids = cbind(bids, timeCalc)
orderBids = bids[order(bids$timeCalc),]  # order bids on timeCalc col.
result = orderBids[c(diff(orderBids$timeCalc) != 0, TRUE),]  # get dataframe with last bid for each second

Thanks again for the advice.

Chris

--
View this message in context: http://r.789695.n4.nabble.com/Loop-too-slow-for-Bid-calc-BUT-cannot-figure-out-how-to-do-with-matrix-tp2955116p2966174.html
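The order()/diff() idiom quoted in this message can be tried on a few rows. This is only a sketch: the data frame below is made up, and it assumes timeCalc holds the whole second (here, seconds since midnight) that each bid falls in, which the thread does not spell out.

```r
# Hypothetical tick data: bid prices with times in seconds since midnight
# (39903 is 11:05:03).
bids <- data.frame(
  Price = c(10.03, 10.04, 10.05, 10.06),
  Time  = c(39903.124, 39903.348, 39904.010, 39906.500)
)

# Assumed meaning of timeCalc: the whole second each bid belongs to.
bids$timeCalc <- floor(bids$Time)

# Sort by time, then keep a row whenever the NEXT row starts a new second.
# The trailing TRUE keeps the very last row.
orderBids <- bids[order(bids$Time), ]
result <- orderBids[c(diff(orderBids$timeCalc) != 0, TRUE), ]
print(result)
```

Seconds with no bid at all (39905 in this toy data) get no row; they would still need to be filled from the previous second.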
[R] Vector replace 0 elements without using a loop
Hi,

With a vector like:

x = c(22, 23, 22.5, 0, 0, 24, 0, 23.2, 23.5, 0, 0, 0, 26)

How can I replace the 0's with the last previous non-zero value without looping through the vector? Something tells me I am missing the obvious.

Thanks,
Chris

--
View this message in context: http://r.789695.n4.nabble.com/Vector-replace-0-elements-without-using-a-loop-tp2966191p2966191.html
Re: [R] Odp: Vector replace 0 elements without using a loop
Petr and Bill,

Thanks for your replies. I have gone with Petr's use of na.locf(), but expect I can use Bill's full function in the near future.

Chris

--
View this message in context: http://r.789695.n4.nabble.com/Vector-replace-0-elements-without-using-a-loop-tp2966191p2968909.html
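na.locf() comes from the zoo package and carries the last non-NA value forward, so the zeros have to be turned into NA first. For comparison, here is a base R sketch of the same idea that needs no package. It is not Petr's or Bill's code, which is not shown in this message, and it assumes the first element is non-zero.

```r
x <- c(22, 23, 22.5, 0, 0, 24, 0, 23.2, 23.5, 0, 0, 0, 26)

# seq_along(x) * (x != 0) is the index where x is non-zero and 0 elsewhere;
# cummax() then carries the most recent non-zero index forward.
last_nonzero <- cummax(seq_along(x) * (x != 0))

# Assumption: x[1] is non-zero. A leading 0 would produce index 0,
# which R silently drops, shortening the result.
filled <- x[last_nonzero]
print(filled)
```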
[R] Time OffSet From GMT - Losing it
Losing time offset from GMT:

> sTime = as.POSIXct(paste("2008-03-03","09:30:01"), origin="1970-01-01")
> sTime
[1] "2008-03-03 09:30:01 EST"    <- 9:30am EST
> sTime
[1] 1204554601
> t = as.numeric(sTime)
> as.POSIXct(t, origin="1970-01-01")
[1] "2008-03-03 14:30:01 EST"    <- no tz option and t is sTime + 5 hours ahead (assume because I am in EST)
> as.numeric(as.POSIXct(t, origin="1970-01-01"))
[1] 1204572601    <- value of t is sTime + 5 hours
> as.POSIXct(t, origin="1970-01-01", tz="GMT")
[1] "2008-03-03 14:30:01 GMT"    <- time has not changed even though I used tz="GMT"
> as.numeric(as.POSIXct(t, origin="1970-01-01", tz="GMT"))
[1] 1204554601    <- value is still sTime + 5 hours though I am using GMT

I am in New York so am on EST time. Missing the obvious... but what is it?

Thanks,
Chris

--
View this message in context: http://r.789695.n4.nabble.com/Time-OffSet-From-GMT-Losing-it-tp2968940p2968940.html
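One reading of the output above, offered as an assumption since the reply is not reproduced here, is that the origin string was parsed in the local time zone when no tz was given. That would shift the result by the five-hour EST offset. A minimal sketch that names the zone explicitly and checks that the underlying number is unchanged:

```r
# Seconds since 1970-01-01 00:00:00 UTC for 2008-03-03 09:30:01 EST
# (the value printed in the post).
t <- 1204554601

# Give tz explicitly so the origin is read as UTC, not as local time.
utc <- as.POSIXct(t, origin = "1970-01-01", tz = "UTC")
print(utc)

# The stored number is the same instant; only the printed zone differs.
print(as.numeric(utc) == t)

# Show the same instant as New York wall-clock time
# (assumes the "America/New_York" zone name is available on the system).
print(format(utc, "%Y-%m-%d %H:%M:%S", tz = "America/New_York"))
```

A POSIXct value is always stored as seconds since the epoch in UTC; tz only affects parsing and printing.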
[R] Slow reading multiple tick data files into list of dataframes
Hi,

I am trying to find the best way to read 85 tick data files of format:

> head(nbbo)
1 bid CON 09:30:00.722 09:30:00.722 32.71  98
2 ask CON 09:30:00.782 09:30:00.810 33.14 300
3 ask CON 09:30:00.809 09:30:00.810 33.14 414
4 bid CON 09:30:00.783 09:30:00.810 33.06 200

Each file has between 100,000 and 300,300 rows.

Currently doing

nbbo.list <- lapply(filePath, read.csv)

to create a list with 85 data.frame objects... but it is taking minutes to read in the data, and afterwards I get the following message on the console when taking further actions (though it does then stop):

The R Engine is busy. Please wait, and try your command again later.

filePath in the above example is a vector of filenames:

> head(filePath)
[1] "C:/work/A/A_2010-10-07_nbbo.csv"
[2] "C:/work/AAPL/AAPL_2010-10-07_nbbo.csv"
[3] "C:/work/ADBE/ADBE_2010-10-07_nbbo.csv"
[4] "C:/work/ADI/ADI_2010-10-07_nbbo.csv"

Is there a better/quicker or more R way of doing this?

Thanks,
Chris

--
View this message in context: http://r.789695.n4.nabble.com/Slow-reading-multiple-tick-data-files-into-list-of-dataframes-tp2990723p2990723.html
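One commonly suggested speed-up for read.csv() is to declare the column types with colClasses, so R does not have to guess them. The sketch below uses two tiny stand-in files; the column names and types are assumptions, since the post does not show a header row.

```r
# Two small stand-in files in a temp directory (hypothetical names and data).
tmp <- tempfile("nbbo")
dir.create(tmp)
filePath <- file.path(tmp, c("A_nbbo.csv", "AAPL_nbbo.csv"))
for (f in filePath) {
  writeLines(c("side,cond,t1,t2,price,size",
               "bid,CON,09:30:00.722,09:30:00.722,32.71,98",
               "ask,CON,09:30:00.782,09:30:00.810,33.14,300"), f)
}

# Declaring types up front avoids type guessing; stringsAsFactors = FALSE
# avoids building factor levels for the text columns.
cols <- c("character", "character", "character", "character",
          "numeric", "integer")
nbbo.list <- lapply(filePath, read.csv,
                    colClasses = cols, stringsAsFactors = FALSE)
names(nbbo.list) <- basename(filePath)
str(nbbo.list[[1]])
```

Whether this removes the minutes-long delay depends on the real files, so it is worth timing with system.time() on one file first.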
Re: [R] Time OffSet From GMT - Losing it
That is embarrassing... thanks for pointing out my mistake.

Chris

--
View this message in context: http://r.789695.n4.nabble.com/Time-OffSet-From-GMT-Losing-it-tp2968940p2990987.html
[R] SApply versus for loop for list of data.frames
Hi,

I am trying to find the total number of rows for a list of data.frames and want to know if there is a better way than using a loop like:

> df = { list of data.frame with varying number of rows... each one has a column called "COL" }
> r = 0
> for (i in 1:length(df)) {
+   r = r + length(df[[i]]$COL)
+ }
> r
6000123    <- number of rows.

Thanks,
Chris

--
View this message in context: http://r.789695.n4.nabble.com/SApply-versus-for-loop-for-list-of-data-frames-tp2991107p2991107.html
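The loop in this question can be replaced by applying nrow() to each list element and summing. A minimal sketch with a made-up list:

```r
# Hypothetical list of data.frames with varying numbers of rows.
df <- list(data.frame(COL = 1:3),
           data.frame(COL = 1:5),
           data.frame(COL = 1:2))

# nrow() of each element, then one sum.
r <- sum(sapply(df, nrow))
print(r)
```

vapply(df, nrow, integer(1)) does the same with a checked return type.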
[R] Regex to remove last character
Hi,

I have been having trouble figuring out the right regex parameters to remove the last "." in a timestamp with the following format:

Convert 09:30:00.377.853 to 09:30:00.377853

Thanks,
Chris

--
View this message in context: http://r.789695.n4.nabble.com/Regex-to-remove-last-character-tp3172466p3172466.html
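One way to do the conversion asked for here is sub() with a pattern anchored at the end of the string. This is a sketch, not the answer given on the list, and the second sample value is invented.

```r
ts <- c("09:30:00.377.853", "09:30:00.000.245")

# "\\.([0-9]+)$" matches the final dot plus the digits after it;
# the replacement "\\1" keeps the digits and drops the dot.
fixed <- sub("\\.([0-9]+)$", "\\1", ts)
print(fixed)
```

sub() is vectorized, so it works on a whole column of timestamps in one call.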
[R] XTS : merge.xts seems to have problem with character vectors
Hi,

Please can you tell me what I am doing wrong. When trying to merge two xts objects, one of which has multiple character vectors for columns... I am just getting NAs.

> str(t)
 POSIXct[1:1], format: "2011-01-04 11:45:37"
> y2 = xts(matrix(c(letters[1:10]),5), order.by=as.POSIXct(c(t + 1:5)))
> names(y2) = c(1,2)
> y2
                    1   2
2011-01-04 11:45:38 "a" "f"
2011-01-04 11:45:39 "b" "g"
2011-01-04 11:45:40 "c" "h"
2011-01-04 11:45:41 "d" "i"
2011-01-04 11:45:42 "e" "j"
> y1 = xts(c(1:5), order.by=as.POSIXct(c(t + 1:5)))
> y1
                    [,1]
2011-01-04 11:45:38    1
2011-01-04 11:45:39    2
2011-01-04 11:45:40    3
2011-01-04 11:45:41    4
2011-01-04 11:45:42    5
> merge(y1, y2)
                    y1 X1 X2
2011-01-04 11:45:38  1 NA NA
2011-01-04 11:45:39  2 NA NA
2011-01-04 11:45:40  3 NA NA
2011-01-04 11:45:41  4 NA NA
2011-01-04 11:45:42  5 NA NA
Warning message:
In merge.xts(y1, y2) : NAs introduced by coercion

Why do I lose the character columns?

Cheers,
Chris

--
View this message in context: http://r.789695.n4.nabble.com/XTS-merge-xts-seems-to-have-problem-with-character-vectors-tp3174125p3174125.html
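A likely explanation, stated as an assumption because no reply is shown here: an xts object stores its data in a single matrix, and a matrix can hold only one type. The warning in the output says the character columns were coerced to numeric, which turns letters into NA. The base R sketch below shows the same coercion and one alternative layout, a data.frame with a time column (a hypothetical layout, not an xts feature).

```r
# A matrix holds one type: mixing numbers and letters gives characters.
m <- cbind(1:3, letters[1:3])
print(typeof(m))

# Forcing letters to numeric yields NA (with an "NAs introduced by coercion"
# warning, suppressed here).
coerced <- suppressWarnings(as.numeric(letters[1:3]))
print(coerced)

# A data.frame keeps one type per column, so both can sit side by side.
t0 <- as.POSIXct("2011-01-04 11:45:37", tz = "UTC")
y1 <- data.frame(time = t0 + 1:3, n = 1:3)
y2 <- data.frame(time = t0 + 1:3, a = letters[1:3],
                 stringsAsFactors = FALSE)
both <- merge(y1, y2, by = "time")
print(both)
```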
[R] Aggragating subsets of data in larger vector with sapply
I have 40,000 rows of buy/sell trade data and am trying to add up the buys for each second. The code works but it is very slow. Any suggestions on how to improve the sapply function?

secEP = endpoints(xSym$Direction, "secs")  # vector of last second on an XTS timeseries object with multiple entries for each second.
d = xSym$Direction
s = xSym$Size
buySize = sapply(1:(length(secEP)-1), function(y) {
  i = (secEP[y]+ 1):secEP[y+1];  # index of vectors between each secEP
  return(sum(as.numeric(s[i][d[i] == "buy"])));
} )

Object details:

secEP = numeric vector of one-second endpoints in xSym$Direction.

> head(xSym$Direction)
                    Direction
2011-01-05 09:30:00 "unkn"
2011-01-05 09:30:02 "sell"
2011-01-05 09:30:02 "buy"
2011-01-05 09:30:04 "buy"
2011-01-05 09:30:04 "buy"
2011-01-05 09:30:04 "buy"
> head(xSym$Size)
                    Size
2011-01-05 09:30:00 " 865"
2011-01-05 09:30:02 " 100"
2011-01-05 09:30:02 " 100"
2011-01-05 09:30:04 " 100"
2011-01-05 09:30:04 " 100"
2011-01-05 09:30:04 " 41"

Thanks,
Chris

--
View this message in context: http://r.789695.n4.nabble.com/Aggragating-subsets-of-data-in-larger-vector-with-sapply-tp3206445p3206445.html
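The per-second sum can also be written without indexing inside sapply: zero out the non-buy sizes once, then sum within each second. The sketch below works on plain vectors built from the head() output above; pulling them out of the xts object (for example with index() and coredata()) is left as an assumption.

```r
# Stand-ins for the post's data: one element per trade.
sec  <- as.POSIXct("2011-01-05 09:30:00", tz = "UTC") + c(0, 2, 2, 4, 4, 4)
dirn <- c("unkn", "sell", "buy", "buy", "buy", "buy")
size <- as.numeric(c(" 865", " 100", " 100", " 100", " 100", "  41"))

# Convert sizes to numeric once, set non-buys to 0,
# then sum within each second in a single call.
buy <- ifelse(dirn == "buy", size, 0)
buySize <- tapply(buy, format(sec, "%H:%M:%S"), sum)
print(buySize)
```

Seconds that contain trades but no buys come out as 0, matching the original loop.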
[R] Create a zoo/xts Time Series with Millisecond jumps
Is there an easy way to create the time index for a zoo/xts object for every 100 milliseconds?

E.g. the time index would be:

10:00:00:100
10:00:00:200
10:00:00:300
10:00:00:400

I am looking to build an empty zoo/xts object with a time index from 10am to 3pm, where the index jumps by 100ms each row.

Thanks,
Chris

--
View this message in context: http://r.789695.n4.nabble.com/Create-a-zoo-xts-Time-Series-with-Millisecond-jumps-tp3332427p3332427.html
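A 100 ms index like the one described can be built in base R by adding fractional seconds to a POSIXct start time. This is a sketch; the date and the UTC zone are placeholders, not taken from the post.

```r
# Hypothetical start time: 10am on an arbitrary date.
start <- as.POSIXct("2011-03-02 10:00:00", tz = "UTC")

# 5 hours (10am to 3pm) at ten steps per second.
n_steps <- 5 * 60 * 60 * 10

# Integer step count divided by 10, rather than repeatedly adding 0.1,
# so rounding error does not accumulate along the index.
idx <- start + (0:n_steps) / 10
print(length(idx))
print(head(format(idx, "%H:%M:%OS3")))
```

A vector like idx could then be used as the order.by argument of xts() or zoo(). Note that %OS3 truncates rather than rounds, so some printed values may show .099 instead of .100.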
[R] as.POSIXct show milliseconds with format
Hi,

I am trying to create a POSIXct index for an xts object that will display the POSIXct index as HH:MM:SS.MMM.

First of all, I am trying to get as.POSIXct to work with format...

> as.POSIXct(paste("2011-03-02 09:00:00.000", sep=""), tz="EST", format="%H:%M:%OS3")
[1] NA

Why is this returning NA?

I can get hours and minutes... but only with the format as %H %M.

> as.POSIXct(paste("2011-03-02 09:00:00.000", sep=""), tz="EST", format="%H %M")
[1] "2011-03-02 20:11:00 EST"

BUT if I do it with format="%H:%M" I also get an NA:

> as.POSIXct(paste("2011-03-02 09:00:00.000", sep=""), tz="EST", format="%H:%M")
[1] NA

What am I not understanding?

Is it possible to create a POSIXct index for xts (or zoo) that will display (e.g. with head(my_xts_object)) the index in format HH:MM:SS.MMM so I can see the milliseconds?

Thanks,
Chris

--
View this message in context: http://r.789695.n4.nabble.com/as-POSIXct-show-milliseconds-with-format-tp3332733p3332733.html
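The NA results are consistent with how the format argument works on input: it has to describe the string from its first character, so a time-only format starts reading at "2011". That also explains the odd "20:11" in the second call, where "20" and "11" were read as hour and minute. A minimal sketch, using UTC and a non-zero millisecond value as placeholders:

```r
s <- "2011-03-02 09:00:00.123"

# The input format describes the whole string, date included.
# On input, %OS reads seconds with their fractional part.
x <- as.POSIXct(s, tz = "UTC", format = "%Y-%m-%d %H:%M:%OS")

# A time-only format begins parsing at "2011" and cannot match.
bad <- as.POSIXct(s, tz = "UTC", format = "%H:%M:%OS")
print(is.na(bad))

# On output, %OS3 prints three decimals (truncated, not rounded,
# so the last digit can be one lower than expected).
print(format(x, "%H:%M:%OS3"))
```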
[R] xts POSIXct index format
Hi,

I cannot figure out how to change the index format when displaying POSIXct objects. I would like the xts index to display as %H:%M:%OS3 when viewing the xts object.

Think I am missing the obvious.

Cheers,
Chris

--
View this message in context: http://r.789695.n4.nabble.com/xts-POSIXct-index-format-tp3336136p3336136.html
Re: [R] xts POSIXct index format
Thank you for your help. indexFormat(x) solved the problem nicely.

> head(a)
2011-03-04 09:30:00.0 22.10
2011-03-04 09:30:00.1 22.09
2011-03-04 09:30:00.2 22.10
2011-03-04 09:30:00.3 22.09
2011-03-04 09:30:00.4 22.10
2011-03-04 09:30:00.5 22.09
> indexFormat(a) <- "%H:%M:%OS3"
> head(a)
09:30:00.000 22.10
09:30:00.100 22.09
09:30:00.200 22.10
09:30:00.300 22.09
09:30:00.400 22.10
09:30:00.500 22.09

--
View this message in context: http://r.789695.n4.nabble.com/xts-POSIXct-index-format-tp3336136p3337167.html
[R] Replace for loop when vector calling itself
Hi,

I am missing something obvious.

Need to create vector as:

(0, i-1 + TheoP(i) - TheoP(i-1), repeat)

Where i is the index position in the vector and i[1] is always 0.

Found myself having to use a for loop because I could not get sapply working. Any suggestions?

delta <- function(x) {
  start = index[x]
  end = index[x+1] - 1
  iTheo = start:end
  len = length(iTheo)
  theoP = as.numeric(TheoBA$Bid[iTheo])
  d = vector(mode = "numeric", length= len)
  d[1] = 0
  if (len>1) for (i in 2:len) { d[i] = d[i-1] + theoP[i] - theoP[i-1] }
  return(d)
}

Thanks,
Chris

--
View this message in context: http://r.789695.n4.nabble.com/Replace-for-loop-when-vector-calling-itself-tp3338383p3338383.html
Re: [R] Replace for loop when vector calling itself
Hope this clarifies my Q.

Creating a vector where each element (except the first, which is 0) is:

the previous element + a calculation from another vector, theoP[i] - theoP[i-1]

I could not figure out how to do this without a for loop, as the vector had to reference itself for the next element... I am missing something obvious, but not too sure what.

d = vector(mode = "numeric", length= len)
d[1] = 0
if (len>1) for (i in 2:len) { d[i] = d[i-1] + theoP[i] - theoP[i-1] }

Thanks,
Chris

> Hi,
>
> I am missing something obvious.
>
> Need to create vector as:
>
> (0, i-1 + TheoP(i) - TheoP(i-1), repeat) Where i is the index position
> in the vector and i[1] is always 0.

I think your prototype is not agreeing with the code below. Is "i" supposed to be the index (as suggested above) or the prior term (as implied below)?

> Found myself having to use a For Loop because I could not get sapply
> working. Any suggestions ?

Assuming the code below, if you construct the first three or four values by hand, I think you will find that the intermediate values of TheoP will have alternating signs.

term2 = 2-1 + TheoP(2) - TheoP(1)
term3 = 3-1 + TheoP(3) - (2-1 + TheoP(2) - TheoP(1))
term4 = 4-1 + TheoP(4) - (3-1 + TheoP(3) - (2-1 + TheoP(2) - TheoP(1)) )

The answer to the first question will determine how you proceed. If the index is being used, then there are two series of cumulative sums and perhaps you can construct an expression that can be fed to the cumsum function. The diff function is also available, and if the index version is correct, then it might even be as simple as

c(0, ((1:len)-1)+diff(TheoP) )

So clarify what is intended.

--
David.

> delta <- function(x) {
>   start = index[x]
>   end = index[x+1] - 1
>   iTheo = start:end
>   len = length(iTheo)
>   theoP = as.numeric(TheoBA$Bid[iTheo])
>   d = vector(mode = "numeric", length= len)
>   d[1] = 0
>   if (len>1) for (i in 2:len) { d[i] = d[i-1] + theoP[i] - theoP[i-1] }
>   return(d)
> }
>
> Thanks,
> Chris

--
View this message in context: http://r.789695.n4.nabble.com/Replace-for-loop-when-vector-calling-itself-tp3338383p3339351.html
Re: [R] Replace for loop when vector calling itself
Hi,

Thanks for your replies. In summary:

1. Replace code with c(0, cumsum(diff(theoP)) ). This is indeed correct and I had not realized it!

>> d = vector(mode = "numeric", length= len)
>> d[1] = 0
>> if (len>1) for (i in 2:len) { d[i] = d[i-1] + theoP[i] - theoP[i-1] }

2. How to create a recursive vector, e.g. vector = previous vector * new_data... suggested to look at the filter() function.

Cheers,
Chris

--
View this message in context: http://r.789695.n4.nabble.com/Replace-for-loop-when-vector-calling-itself-tp3338383p3339938.html
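The equivalence in point 1 can be checked directly. The sketch below runs the original loop and the cumsum(diff()) form on made-up prices. Because the sum telescopes, both also equal theoP - theoP[1].

```r
theoP <- c(10.0, 10.5, 10.2, 10.9, 10.4)  # hypothetical prices
len <- length(theoP)

# Original loop: each element is the previous one plus the price change.
d_loop <- numeric(len)
if (len > 1) for (i in 2:len) {
  d_loop[i] <- d_loop[i - 1] + theoP[i] - theoP[i - 1]
}

# Vectorized form from the thread: running total of the changes.
d_vec <- c(0, cumsum(diff(theoP)))

print(all.equal(d_loop, d_vec))
print(all.equal(d_vec, theoP - theoP[1]))
```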
[R] Error: evaluation nested too deeply: infinite recursion / options(expressions=)?
Hi,

I am processing tick data and my code has stopped working as I have increased the size of the data being processed. Now I am receiving an error for basic tasks in RConsole:

> a = c(1:1000)
Error: evaluation nested too deeply: infinite recursion / options(expressions=)?

My R code worked fine with 50 stocks and 500,000 rows per stock, but when I increased this to 50 stocks and 5,000,000 rows per stock the code stopped halfway through with the above error message (Error: evaluation nested too deeply). Now I am getting this error with simple commands in the console, as per the above example.

Please advise where I should look to resolve this.

I am using R 2.12.1 64-bit on Windows 7 with memory max at 8Gb.

Thanks,
Chris

--
View this message in context: http://r.789695.n4.nabble.com/Error-evaluation-nested-too-deeply-infinite-recursion-options-expressions-tp3344168p3344168.html
Re: [R] Error: evaluation nested too deeply: infinite recursion / options(expressions=)?
For further info, I cannot check my memory usage or even use ls():

> memory.limit
Error: evaluation nested too deeply: infinite recursion / options(expressions=)?
> ?memory.limit
Error: evaluation nested too deeply: infinite recursion / options(expressions=)?
> ls()
Error: evaluation nested too deeply: infinite recursion / options(expressions=)?
>

Thanks,
Chris

--
View this message in context: http://r.789695.n4.nabble.com/Error-evaluation-nested-too-deeply-infinite-recursion-options-expressions-tp3344168p3344170.html
[R] Replace split with regex for speed ?
I have timestamps in format HH:MM:SS.MMM.UUU and need to remove the last "." so they are in format HH:MM:SS.MMMUUU.

What is the fastest way to do this, since it has to be repeated on millions of rows? Should I use regex?

Currently doing it with a string split, which is slow:

> head(ts)
[1] 09:30:00.000.245 09:30:00.000.256 09:30:00.000.633 09:30:00.001.309 09:30:00.003.635 09:30:00.026.370

ts = strsplit(ts, ".", fixed = TRUE)
ts = lapply(ts, function(x) { paste(x[1], ".", x[2], x[3], sep="") } )  # Remove last . from timestamp, from HH:MM:SS.MMM.UUU to HH:MM:SS.MMMUUU
ts = unlist(ts)

Thanks,
Chris

--
View this message in context: http://r.789695.n4.nabble.com/Replace-split-with-regex-for-speed-tp3386098p3386098.html
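Since the timestamps shown have a fixed width, one option that avoids both the split and a regex is to cut by character position. This is a sketch under that fixed-width assumption, and it is not necessarily what was suggested on the list.

```r
ts <- c("09:30:00.000.245", "09:30:00.026.370")

# Fixed layout assumed: characters 1-12 are HH:MM:SS.MMM,
# character 13 is the unwanted dot, characters 14-16 are UUU.
out <- paste0(substr(ts, 1, 12), substr(ts, 14, 16))
print(out)
```

Both substr() and paste0() are vectorized, so the whole column is handled in one pass. Timing it against a regex with system.time() on the real data would show which is faster there.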
Re: [R] Replace split with regex for speed ?
Thanks for your suggestions.

Cheers,
Chris

--
View this message in context: http://r.789695.n4.nabble.com/Replace-split-with-regex-for-speed-tp3386098p3388958.html
[R] Source Code File For an Object
Is there any way to query an object to find its source code file? I created object F from file F.r; can object F tell me this?

Thanks,
Chris

--
View this message in context: http://r.789695.n4.nabble.com/Source-Code-File-For-an-Object-tp3461566p3461566.html
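For functions, R can record a source reference when the file is loaded with keep.source = TRUE, and utils::getSrcFilename() reads the file name back from it. The sketch below assumes the object is a function (other objects carry no such reference) and uses a temp file standing in for F.r. The function is named myF here only to avoid masking R's built-in F.

```r
# Write a tiny script to a temp file (hypothetical stand-in for F.r).
path <- file.path(tempdir(), "F.r")
writeLines("myF <- function(x) x + 1", path)

# keep.source = TRUE attaches a srcref to functions defined in the file.
source(path, keep.source = TRUE)

# getSrcFilename() returns the base name of the file the function came from.
print(getSrcFilename(myF))
```

Passing full.names = TRUE to getSrcFilename() returns the path as it was given to source().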
[R] Vi mode in Linux console - how to stop it ?
Hi,

When I am editing a command using the default R console in Linux, it sometimes goes into "vi mode"... not too sure how/why this is happening. It then requires me to use vi commands to edit the line, which is very frustrating when I just want to use the delete key instead of "x" to delete a character.

How do I turn this off?

Thanks,
Chris

--
View this message in context: http://r.789695.n4.nabble.com/Vi-mode-in-Linux-console-how-to-stop-it-tp3612235p3612235.html