[R] readBin into a data frame

2013-08-01 Thread Zhang Weiwu
Hello. readBin is designed to read a batch of data with the same spec, e.g. 
reading a run of floats into a vector. In practice I read into a data frame, not 
a vector. For each row, I need to read an integer and a float.


for (i in 1:1000) {
    dataframe$int[i]   <- readBin(con, integer(), n = 1, size = 2)
    dataframe$float[i] <- readBin(con, numeric(), n = 1, size = 4)
}

And I need to read 100 such data files, so I end up with a for loop inside a 
for loop. Something feels wrong here; it is often said that if you use nested 
for loops you are not speaking R.


What is the R way of doing this? I can think of wrapping the body of the loop 
in a function and vectorizing it -- but the result would be a list of lists, 
not exactly a data frame, and the list grows incrementally, which is 
inefficient, since I know the size of my data frame at the outset. I am a new 
learner who does not speak half the R vocabulary yet, so kindly provide a hint, 
please :)
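For what it's worth, one vectorised approach (a sketch, assuming each record is a 2-byte integer followed by a 4-byte float, native endianness) is to read the whole file as raw bytes and slice the interleaved fields apart; the demo below builds a small buffer in memory instead of a real file:

```r
# Build a demo buffer: 3 records of (int16, float32), in place of a real file.
out <- rawConnection(raw(0), "wb")
for (i in 1:3) {
  writeBin(i, out, size = 2)      # 2-byte integer field
  writeBin(i / 2, out, size = 4)  # 4-byte float field
}
bytes <- rawConnectionValue(out)
close(out)

# Parse all records at once: 6 bytes per record, one column per record.
n <- length(bytes) %/% 6L
m <- matrix(bytes, nrow = 6L)
df <- data.frame(
  int   = readBin(as.vector(m[1:2, ]), integer(), n = n, size = 2),
  float = readBin(as.vector(m[3:6, ]), numeric(), n = n, size = 4)
)
print(df)
```

readBin accepts a raw vector in place of a connection, so the two fields can each be decoded in a single vectorised call, with no per-record loop.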


Best.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] use Vectorized function as range of for statement

2013-08-01 Thread Zhang Weiwu


I guess this has been discussed before, but I don't know the name of this 
problem, so I have to ask again.


Consider this scenario:


fun <- function(x) { print(x)}
for (i in Vectorize(fun, "x")(1:3)) print("OK")

[1] 1
[1] 2
[1] 3
[1] "OK"
[1] "OK"
[1] "OK"

The behaviour I would prefer is:


fun <- function(x) { print(x)}
for (i in Vectorize(fun, "x")(1:3)) print("OK")

[1] 1
[1] "OK"
[1] 2
[1] "OK"
[1] 3
[1] "OK"

That is, each iteration of the vectorized function should yield one result to 
the 'for' statement, rather than having all results collected beforehand.


The intention of such a pattern is to separate the data-generation logic 
from the data-processing logic.


The latter mechanism, I think, is more efficient because it doesn't cache 
all the data before processing -- and the interpreter knows for certain 
that caching is not needed, since the vectorized function is used not in an 
assignment but as a range.


The difference may seem trivial, but this pseudocode demonstrates otherwise:

readSample <- function(x) {

    sampling_time <- readBin(con, integer(), 1, size=4)
    sample_count <- readBin(con, integer(), 1, size=2)
    samples <- readBin(con, numeric(), sample_count, size=4)  # numeric(), as R has no float()

    matrix # return a big matrix representing one sample
}

for (sample in Vectorize(readSample, "x")(1:1)) {
    # process sample
}

The data file is a few gigabytes, so caching it is far from free. Not 
having to cache it would make a real difference.


This email asks: 1. to validate this need in the language; 2. for an 
alternative design pattern to work around it; 3. for the proper place to 
discuss this.


Thanks and best...



Re: [R] use Vectorized function as range of for statement

2013-08-01 Thread Zhang Weiwu



On Thu, 1 Aug 2013, Jeff Newmiller wrote:

The Vectorize function is essentially a wrapped up for loop, so you are 
really executing two successive for loops. Note that the Vectorize 
function is not itself vectorised, so there is no particular advantage to 
using it in this way. You might as well call fun as a statement in the for 
loop.
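Jeff's suggestion -- calling the function directly inside the loop -- might look like this sketch, with a stand-in readSample (illustration only, not the real file reader):

```r
# Stub standing in for the real file-reading function.
readSample <- function(i) i * i

results <- numeric(0)
for (i in 1:3) {
  s <- readSample(i)            # one sample generated...
  results <- c(results, s + 1)  # ...and processed immediately; nothing is cached
}
print(results)
```

Each sample can be discarded as soon as it is processed, which gives exactly the no-caching behaviour asked for above.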


Thanks all who answered me! Now it is answered.



[R] why Vectorize conjures a list, not a vector?

2013-08-14 Thread Zhang Weiwu


The manual seems to suggest that, with the default SIMPLIFY = TRUE option, 
Vectorize would produce a vector where possible.


Quote:

SIMPLIFY: logical or character string; attempt to reduce the result to
  a vector, matrix or higher dimensional array; see the
  ‘simplify’ argument of ‘sapply’.

I assume that if each run of the function returns a vector of the same type, 
the result should be a vector as well; a list is needed only when the 
data are of different types.


Or, given vectors of the same type, it should produce a vector of the same type.

But it doesn't work that way -- see below -- so what's the magic inside?

REPRODUCE:

First, to make sure each run of the function always returns a vector of the 
same type:



for (datafile in list.files(full.names=TRUE,"16b")) 
print(mode(list.files(full.names=TRUE,datafile)))

[1] "character"
[1] "character"
[1] "character"
[1] "character"
[1] "character"
[1] "character"
[1] "character"
[1] "character"
[1] "character"
[1] "character"
[1] "character"


Then, vectorize it:


datafiles <- c(Vectorize(list.files, "path")(full.names = TRUE,
    path = list.files(base_dir, full.names = TRUE)))
mode(datafiles)

[1] "list"

The same thing happens with sapply, which should produce a list only when a 
vector is impossible -- yet it produced a list even though every result was a 
character vector:



mode(sapply(list.files(base_dir,full.names=TRUE), list.files))

[1] "list"


Re: [R] why Vectorize conjures a list, not a vector?

2013-08-15 Thread Zhang Weiwu



On Wed, 14 Aug 2013, Hervé Pagès wrote:


Hi Zhang,

First note that a list is a vector (try is.vector(list())).
The documentation for sapply() and Vectorize() should say *atomic*
vector instead of vector in the description of the 'simplify' and
'SIMPLIFY' arguments.

So in order for sapply() to be able to simplify the result, all runs
of the function not only need to produce an atomic vector of the same
type, but also of the same length. If this common length is 1, then the
final result can be simplified to an atomic vector of the same length
as the input.
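A small illustration of the rule (all results must be atomic vectors of the same type and the same length for simplification to happen):

```r
a  <- sapply(1:3, function(i) i * 2)       # all length 1  -> atomic vector
b  <- sapply(1:3, function(i) seq_len(i))  # lengths 1,2,3 -> stays a list
c3 <- sapply(1:3, function(i) c(i, i^2))   # all length 2  -> simplified to a matrix
mode(a); mode(b); class(c3)
```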


Thanks. Thanks to Jeff Newmiller as well; you answered clearly and the 
answer solves the problem.


[R] how to retain dimension when selecting one row from a matrix?

2013-08-15 Thread Zhang Weiwu

When you select a single row from a matrix, the dimension is lost:


n <- matrix(nrow = 3, ncol = 5)
dim(n)

[1] 3 5

dim(n[1,])

NULL

dim(n[2,])

NULL

This doesn't happen if you select more than one row:


dim(n[1:2,])

[1] 2 5

This causes trouble. SCENARIO: when I filter out unqualified sampled 
data, I may not be aware that only one row of data qualifies, and the 
resulting selection then has no dimensions. I would call matrix[,3] 
and get an error.


One way to mend this is to re-assign the dimensions to the result, but 
that destroys the column names...



selected = n[1,]
selected

unixtime  agio  count  ask  bid
      NA    NA     NA   NA   NA

dim(selected)

NULL

dim(selected) <- c(1,5)
selected

 [,1] [,2] [,3] [,4] [,5]
[1,]   NA   NA   NA   NA   NA

Is there a way to retain dimension when selecting only one row from a 
matrix?
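For the record, the standard idiom for this is the `drop = FALSE` argument to `[`, which keeps both the dimensions and the dimnames (column names below are just illustrative):

```r
n <- matrix(NA_real_, nrow = 3, ncol = 5,
            dimnames = list(NULL, c("unixtime", "agio", "count", "ask", "bid")))

selected <- n[1, , drop = FALSE]  # stays a 1x5 matrix, names intact
dim(selected)
colnames(selected)
```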




[R] to match samples by minute

2013-08-15 Thread Zhang Weiwu


Perhaps this is simple and common, but it took me quite a while to admit I 
cannot solve it in a simple way.


The data frame `df` has the following columns:

   unixtime, value, factor

Now I need a matrix of:

   unixtime, value-difference-between-factor1-and-factor2

The naive solution is:

   df[df$factor == "factor1",] - df[df$factor == "factor2",]

It won't work, because factor1 has 1000 valid samples while factor2 has 1400. 
The invalid samples are dropped on site, i.e. removed before being 
piped into R.


To solve it, I got 2 ideas.

1. create a new data.frame with 24*60 records, each record representing a 
minute of the day, because sampling is done once per minute; then fit all 
records into their 'slots' by their nearest minute.


2. pair each record with another that has similar unixtime but different 
factor.


Both ideas require a for loop over individual records. It feels too C-like to 
write a program that way. Is there a professional way to do it in R? If not, 
I'd rather rewrite the sampler (in C) not to discard invalid samples on site 
than mangle R.
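A vectorised sketch of idea 2, assuming hypothetical column values and taking "similar unixtime" to mean "same nearest minute": bucket each record by minute, then merge() the two factors' buckets:

```r
# Toy data standing in for the real samples (values are assumptions).
df <- data.frame(
  unixtime = c(0, 61, 122, 59, 121),
  value    = c(10, 11, 12, 1, 2),
  factor   = c("factor1", "factor1", "factor1", "factor2", "factor2")
)

df$minute <- round(df$unixtime / 60)  # nearest-minute slot
a <- df[df$factor == "factor1", c("minute", "value")]
b <- df[df$factor == "factor2", c("minute", "value")]
m <- merge(a, b, by = "minute", suffixes = c(".1", ".2"))  # inner join drops unmatched minutes
m$diff <- m$value.1 - m$value.2
print(m)
```

merge() performs the pairing that the for loop would otherwise do, and minutes present in only one factor simply fall out of the result.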


Thanks.



[R] on how to make a skip-table

2013-09-12 Thread Zhang Weiwu


I've got two data frames, as shown below:
(NR means Number of Record)


record.lengths

NR length
 1    100
 2    130
 3    150
 4    148
 5    100
 6     83
 7     60


valid.records

NR factor
 1      3
 2      4
 4      8
 7      9

And I intend to obtain the following skip-table:


skip.table

NR skip factor
 1    0      3
 2    0      4
 4  150      8
 7  183      9


The column 'skip' is the space needed to skip invalid records.

For example, the 3rd row of skip.table has a skip of 150, intended to 
skip the invalid record No. 3 in record.lengths.


For example, the 4th row of skip.table has a skip of 183, intended to 
skip the invalid records No. 5 and No. 6, together 100+83.


It is apparently intended for reading huge data files, and it looks like 
simple arithmetic, yet I admit I couldn't find an R-ish way of doing it.


Thanks in advance, and also thanks for pointing out whether I have been on the 
right track to start with.




Re: [R] on how to make a skip-table

2013-09-12 Thread Zhang Weiwu


It is a nice surprise to wake up receiving three answers, all producing 
correct results. Many thanks to all of you.


Jim Holtman solved it with amazing clarity, Gang Peng used a traditional 
C-like pointer style, and Arun wrote awesomely tight code thanks to diff().


I am embarrassed to see my misspellings inherited in the answers ('lenths' 
should be 'lengths' and 'valida' should be 'valid'). This experience should 
behove me not to code at midnight again.


For anyone wishing to test these methods, I have compiled them all into one 
R script file, pasted at the end of this email.


Jim Holtman asked me to elaborate the problem:

It is a common problem in reading sparse variable-length record data
files.  Records are stored in the file one after another. The length of
each record is known in advance, but many of the records are invalid
and should be skipped to make efficient use of memory.

Ideally the datafile-reading routine should receive a skip-table. Before
reading each wanted/valid record, it seeks forward by the distance
given in the skip-table. The problem is how to obtain such a skip-table.

What we have at hand to produce the skip-table is a set of two data
frames: a record.lengths data frame with each record's length, and a
valid.records data frame indicating which records are significant and
should be read.

--

## input data:

record.lengths <- read.table(text = "NR length
 1   100
 2   130
 3   150
 4   148
 5   100
 6    83
 7    60", header = TRUE)

valid.records <- read.table(text = "  NR factor
 1   3
 2   4
 4   8
 7   9", header = TRUE)

### Jim Holtman's method:

x <- merge(record.lengths, valid.records, by = "NR", all.x = TRUE)
x$seq <- cumsum(!is.na(x$factor))

# need to add 1 to lines with NA to associate with next group
x$seq[is.na(x$factor)] <- x$seq[is.na(x$factor)] + 1

# split by 'seq', output last record and sum of preceding records
skip.table <- do.call(rbind
 , lapply(split(x, x$seq), function(.sk){
 if (nrow(.sk) > 1) .sk$skip <- sum(.sk$length[1:(nrow(.sk) - 1L)])
 else .sk$skip <- 0
 .sk[nrow(.sk), ] # return the last row of the group
 })
 )

print(skip.table)


### Gang Peng's method:

n.record <- length(record.lengths$NR)
index<- record.lengths$NR %in% valid.records$NR
tmp <- 1:n.record
ind <- tmp[index]
st  <- 1
skip <- rep(0,length(ind))
for (i in 1:length(ind)) {
    if (st < ind[i]) skip[i] <- sum(record.lengths$length[st:(ind[i] - 1)])
    st <- ind[i] + 1
}
print(data.frame(NR = valid.records$NR, skip = skip, factor = valid.records$factor))


[R] how to get values within a threshold

2013-09-13 Thread Zhang Weiwu


input:

> values
[1] 0.854400 1.648465 1.829830 1.874704 7.670915 7.673585 7.722619

> thresholds
[1] 1 3 5 7 9

expected output:

[1] 1 4 4 4 7

That is, I need a vector of the indices of the largest value below each threshold.

e.g.
The first element is "1", because values[1] is the largest value below threshold "1".
The second element is "4", because values[4] is the largest value below threshold "3".

The way I do it is:


sapply(1:length(thresholds), function(x) length(values[values < thresholds[x]]))

[1] 1 4 4 4 7

It just seems to me too long and clumsy to be R. Is this already the best way?

Somehow I feel which() was designed for a purpose like this, but I couldn't 
figure out a way to apply which here.
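One way to press which() into service (a sketch; it relies on 'values' being sorted ascending, and assumes every threshold has at least one value below it, since max() on an empty index set would warn and return -Inf):

```r
values     <- c(0.854400, 1.648465, 1.829830, 1.874704, 7.670915, 7.673585, 7.722619)
thresholds <- c(1, 3, 5, 7, 9)

# Last index below each threshold = index of the largest value below it.
idx <- sapply(thresholds, function(th) max(which(values < th)))
print(idx)
```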




Re: [R] how to get values within a threshold

2013-09-13 Thread Zhang Weiwu



On Fri, 13 Sep 2013, William Dunlap wrote:


findInterval(thresholds, values)

[1] 1 4 4 4 7


Thanks a lot! But now I have a new problem, perhaps a typical R issue.

First, let's look at a successful case:

> thresholds <- c(1,3,5,7,9)
> values <- c(0.854, 1.648, 1.829, 1.874, 7.670, 7.673, 7.722)
> values[findInterval(thresholds, values)]
[1] 0.854 1.874 1.874 1.874 7.722

Then a new batch of values came; notice that only the first element of the new 
values differs:


> thresholds <- c(1,3,5,7,9)
> values <- c(1.254, 1.648, 1.829, 1.874, 7.670, 7.673, 7.722)
> findInterval(thresholds, values)
[1] 0 4 4 4 7
> values[findInterval(thresholds, values)]
[1] 1.874 1.874 1.874 7.722

This is a surprise. The desirable output is:

[1] 0 1.874 1.874 1.874 7.722

This is desirable because it maintains the same number of elements throughout 
the calculation. (You may suggest leaving out the indices and calculating the 
maximum-value-below-threshold directly, but the indices are useful for 
addressing other fields of the data frame the values came from.)


This problem can be simplified as follows:

in R, we have:
> a <- 1:10
> a[c(1,3)]
[1] 1 3
> a[c(0,3)]
[1] 3

While I was hoping to get:
> a <- 1:10
> a[c(1,3)]
[1] 1 3
> a[c(0,3)]
[1] 0 3

The straightforward solution is to shift the whole set of test values one 
position, so that the first value is always zero:


> values <- c(0, 1.254, 1.648, 1.829, 1.874, 7.670, 7.673, 7.722)

This solution, besides begetting a train of changes elsewhere in the code, 
is semantically wrong, since the first element of values should be the first 
value, while now it is actually the 0-th value.
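Another way to keep the element count, without padding the data, is to map the zero indices to NA explicitly (a sketch):

```r
thresholds <- c(1, 3, 5, 7, 9)
values     <- c(1.254, 1.648, 1.829, 1.874, 7.670, 7.673, 7.722)

idx <- findInterval(thresholds, values)
out <- rep(NA_real_, length(idx))      # NA marks thresholds below the smallest value
out[idx > 0] <- values[idx[idx > 0]]
print(out)
```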


What would you do in the case?



Re: [R] how to get values within a threshold

2013-09-13 Thread Zhang Weiwu



On Fri, 13 Sep 2013, William Dunlap wrote:


You may want to append -Inf (or 0 if you know the data cannot be
negative) to the start of your 'values' vector so you don't
have to write code to catch the cases when a threshold is below
the range of the values.
  > findInterval(thresholds, c(0,values,Inf))
  [1] 1 5 5 5 8
  > c(0, values, Inf)[.Last.value]
  [1] 0.000 1.874 1.874 1.874 7.722


Thanks a lot! I'll stick with this method for this project.

Thanks a lot to arun as well, for profiling different methods.



[R] the problem of buying and selling

2013-09-13 Thread Zhang Weiwu


I owe a lot to the folks on the r-help list, especially arun, who answered 
every one of my questions and was never wrong. I am disinclined to ask yet 
again, since this question is more arithmetic than technical. But having 
worked two days on it, I realized my brain is just not juicy enough.


Here is the problem.

Trust not for freedom to the Franks---
They have a king who buys and sells.
- Lord Byron: The Isles of Greece

Suppose the French King commands you to buy and sell, and tells you to deal
only if the profit is higher than 2%. Question: what quantity will be 
dealt, and what is the actual profit? In fact, the King wants to see the 
relationship between his minimum-profit requirement and your result, in 
order to better his decisions.


Let's look at the input data - a dump of which is attached to this mail.

Column 1 is the price of the market where you buy goods from, column 2 is 
the quantity of goods that is being sold at that price.


Column 3 is the price of the market where you sell goods to; column 4 is the 
quantity the buyers are willing to buy at that price.



cbind(t(to_buy_from), t(to_sell_to))


          [,1]   [,2]   [,3]   [,4]
 [1,]  61.7050    190 63.170   2500
 [2,]  61.7500     29 63.150    799
 [3,]  61.8050    166 63.110    500
 [4,]  61.8950    166 63.060      1
 [5,]  61.9450    166 63.020   7840
 [6,]  61.9805   6150 62.995   2000
 [7,]  62.0000   3069 62.930   2000
 [8,]  62.0600    166 62.860  10811
 [9,]  62.1100    166 62.780  18054
[10,]  62.1450    166 62.755   9000
[11,]  62.1750    166 62.690  10960
[12,]  62.2250    166 62.635    100
[13,]  62.2450    166 62.585   2380
[14,]  62.2720    100 62.550   2119
[15,]  62.2830   4000 62.525 108091
[16,]  62.2875    100 62.505   2000
[17,]  62.2955    100 62.485    816
[18,]  62.3250    307 62.435    600
[19,]  62.3800   2906 62.400    300
[20,]  62.3940   1969 62.375   4611
[21,]  62.4250    166 62.355   5111
[22,]  62.4505   2000 62.335   1969
[23,]  62.4700    259 62.315    500
[24,]  62.4755     50 62.250   5142
[25,]  62.4800    166 62.165    660
[26,]  62.4935    305 62.115   2428
[27,]  62.4975   7786 62.085    779
[28,]  62.4995  50049 62.050  12811
[29,]  62.5045    914 62.015    192
[30,]  62.5150   1110 61.975   1200
[31,]  62.5285    400 61.895      4
[32,]  62.5500   6352 61.835    100
[33,]  62.5750      9 61.775    133
[34,]  62.6000    394 61.750   7723

For the simplest case, if the King had commanded that the minimum profit 
be 2.3742%, which equals 63.170/61.7050 - 1 (look at the first row), 
then you can easily project that a quantity of 190 will be dealt (the 
minimum of [1,2] and [1,4]), and that the actual profit is 2.3742%.


If the King, however, has commanded that a deal be carried out only if 
the profit is higher than 2%, the calculation becomes more complicated. I 
don't know the right method, but I can demonstrate a wrong method and 
explain why it is wrong.


The wrong approach is the following:

The idea is to write a function that takes the volume (total quantity) 
you want to deal and returns the profit. This generates a relationship 
between volume and profit, and with interpolation you can get the volume 
for any given minimum-profit requirement.



revenues <- function(open_orders, volumes) {
# calculate revenue from a list of open orders and the desired "volumes" of goods

# expecting volumes as a vector, to test the revenue (total amount of money)
# for each volume (total amount of goods to deal) in 'volumes'

volume  <- sapply(1:length(open_orders[2,]),
function(x) { sum(open_orders[2,1:x])})
revenue <- sapply(1:length(open_orders[2,]),
function(x) { sum(open_orders[1, 1:x] * open_orders[2,1:x])})
i <- findInterval(volumes, c(0, volume))
c(0, revenue)[i] + c(open_orders[1,], 0)[i]*(
volumes - c(0, volume)[i])
}

data.frame(volume = volumes, profit = revenues(to_sell_to, volumes) /
  revenues(to_buy_from, volumes) - 1)

With the above routine, let us test the profit with the following volumes:


volumes = c(10, 100, 500, 1000, 5000, 10000, 30000, 50000, 70000, 90000)


And the result:


data.frame(volume = volumes, profit = revenues(to_sell_to, volumes) /

+   revenues(to_buy_from, volumes) - 1)
   volume      profit
1      10 0.023741938
2     100 0.023741938
3     500 0.022424508
4    1000 0.020974612
5    5000 0.018972785
6   10000 0.018087976
7   30000 0.012223652
8   50000 0.009288480
9   70000 0.007729286
10  90000 0.006204251

So, looking up the table, if the King requires a minimum profit of 2%, the 
volume (total quantity) of goods dealt should be a bit more than 1000. 
This answer is inexact, but our French King can get by with it. After 
all, he remembers nothing beyond the number of digits.
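To read an exact volume off such a table for a given minimum profit, linear interpolation with approx() is one option (a sketch on the two table rows that bracket the 2% requirement):

```r
# Volume/profit pairs from the table around the 2% requirement.
vol    <- c(1000, 5000)
profit <- c(0.020974612, 0.018972785)

# Interpolate the volume at which profit crosses 2%
# (approx() orders the x values itself).
target <- approx(profit, vol, xout = 0.02)$y
print(target)
```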


Now let's look at why it is wrong. This answer is, actually, correct, but 
the method won'

Re: [R] the problem of buying and selling

2013-09-14 Thread Zhang Weiwu



On Sat, 14 Sep 2013, Zhang Weiwu wrote:

I owe a lot to the folks on the r-help list, especially arun, who answered 
every one of my questions and was never wrong. I am disinclined to ask yet 
again, since this question is more arithmetic than technical. But having 
worked two days on it, I realized my brain is just not juicy enough.


Here is the problem.

Trust not for freedom to the Franks---
They have a king who buys and sells.
- Lord Byron: The Isles of Greece

Suppose the French King commands you to buy and sell, and tells you to deal
only if the profit is higher than 2%. Question: what quantity will be 
dealt, and what is the actual profit? In fact, the King wants to see the 
relationship between his minimum-profit requirement and your result, in order 
to better his decisions.


Let's look at the input data - a dump of which is attached to this mail.

Column 1 is the price of the market where you buy goods from, column 2 is the 
quantity of goods that is being sold at that price.


Column 3 is the price of the market where you sell goods to; column 4 is the 
quantity the buyers are willing to buy at that price.


Forgive my carelessness. I should emphasize that there is only one type of 
goods to be dealt. The table below should be read this way: there are 190 
units of goods being sold at 61.7050 that you can buy, and 29 units being 
sold at 61.7500 (rows 1 and 2, columns 1 and 2).  They are exactly the same 
type of goods, and you can sell the total volume of 219 at the price of 
63.170.



cbind(t(to_buy_from), t(to_sell_to))


          [,1]   [,2]   [,3]   [,4]
 [1,]  61.7050    190 63.170   2500
 [2,]  61.7500     29 63.150    799
 [3,]  61.8050    166 63.110    500
 [4,]  61.8950    166 63.060      1
 [5,]  61.9450    166 63.020   7840
 [6,]  61.9805   6150 62.995   2000
 [7,]  62.0000   3069 62.930   2000
 [8,]  62.0600    166 62.860  10811
 [9,]  62.1100    166 62.780  18054
[10,]  62.1450    166 62.755   9000
[11,]  62.1750    166 62.690  10960
[12,]  62.2250    166 62.635    100
[13,]  62.2450    166 62.585   2380
[14,]  62.2720    100 62.550   2119
[15,]  62.2830   4000 62.525 108091
[16,]  62.2875    100 62.505   2000
[17,]  62.2955    100 62.485    816
[18,]  62.3250    307 62.435    600
[19,]  62.3800   2906 62.400    300
[20,]  62.3940   1969 62.375   4611
[21,]  62.4250    166 62.355   5111
[22,]  62.4505   2000 62.335   1969
[23,]  62.4700    259 62.315    500
[24,]  62.4755     50 62.250   5142
[25,]  62.4800    166 62.165    660
[26,]  62.4935    305 62.115   2428
[27,]  62.4975   7786 62.085    779
[28,]  62.4995  50049 62.050  12811
[29,]  62.5045    914 62.015    192
[30,]  62.5150   1110 61.975   1200
[31,]  62.5285    400 61.895      4
[32,]  62.5500   6352 61.835    100
[33,]  62.5750      9 61.775    133
[34,]  62.6000    394 61.750   7723


Re: [R] Instructions for upgrading R on ubuntu

2013-09-16 Thread Zhang Weiwu


On Sun, 15 Sep 2013, Andrew Crane-Droesch wrote:

The c2d4u PPA is the main search result when googling "upgrade R 3.0.1 
ubuntu".


And it should be, because a PPA re-distribution is more likely to work well 
for Ubuntu than a general distribution, even if this software is an 
exceptional case this time.


You must be in bad need of 64-bit memory access or long vectors† to do a 
manual upgrade just one month ahead of Ubuntu's own maintenance upgrade 
(13.10), which contains R 3.0.1.


† These are the major features offered by R 3.0, so I guess most users are 
unlikely to be argued into hurrying an upgrade from R 2.x.

http://www.r-bloggers.com/r-3-0-0-is-released-whats-new-and-how-to-upgrade/