Re: [R] Dicrete Laplace distribution

2010-03-11 Thread Moshe Olshansky
Dear Nicolette, You can always use the brute force solution which works for every discrete distribution with a finite number of states: let p0,p1,...,pK be the probabilities of 0,1,...,K (such that they sum up to 1). Let P <- c(p0,p1,...,pK) and P1 <- c(cumsum(P),1) Now let x = runif() (uniform in
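Sketch (not part of the original post): the inverse-CDF idea described above, assuming a hypothetical three-state distribution. `findInterval` is used here as one convenient way to look up the state; the original message is cut off, so this is not necessarily how it continued.

```r
# Hypothetical probabilities of the states 0, 1, 2 (they must sum to 1)
P <- c(0.2, 0.5, 0.3)

# Cumulative probabilities, dropping the last one (which equals 1)
breaks <- cumsum(P)[-length(P)]

set.seed(1)
u <- runif(10000)              # uniform draws in (0, 1)
x <- findInterval(u, breaks)   # number of breaks <= u, i.e. the sampled state

table(x) / length(x)           # empirical frequencies, should be close to P
```

Base R's `sample(0:K, n, replace = TRUE, prob = P)` does the same job in one call.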

Re: [R] Optimise huge data.frame construction

2010-02-24 Thread Moshe Olshansky
Hi Daniele, One possibility would be to make two runs. In the first run you are not building the matrix but just calculating the number of rows you need (in a loop). Then you allocate such matrix (only once) and fill it in the second run. Regards, Moshe. --- On Wed, 24/2/10, Daniele Amberti w

Re: [R] reading "surfer" files

2010-02-23 Thread Moshe Olshansky
Check read.table (?read.table). --- On Wed, 24/2/10, RagingJim wrote: > From: RagingJim > Subject: [R] reading "surfer" files > To: r-help@r-project.org > Received: Wednesday, 24 February, 2010, 3:23 PM > > To the R experts, > > I am currently playing with a program which was designed so > th

Re: [R] Quadprog help

2010-02-22 Thread Moshe Olshansky
Hi Sergio, Having singular Dmat is certainly a problem. I can see two possibilities: 1) try to eliminate X1,...,X9, so that you are left with P1,...,P6 only. 2) if you can not do this, add eps*X1^2+...+eps*X9^2 to your matrix Dmat so that it is positive definite (eps is a small positive number). Y

Re: [R] Goodness of fit test for count data

2010-02-22 Thread Moshe Olshansky
You can compute the conditional probability that your variable equals k given that it is non-zero. For example, if X has a Poisson distribution with parameter lambda then P(X=k | X!=0) = P(X=k)/(1-P(X=0)) = (exp(-lambda)/(1-exp(-lambda)))*lambda^k/k! Now you can find lambda for which the sum of square
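Sketch (not part of the original post): the zero-truncated Poisson probabilities from the formula above, written with `dpois`. The function name `dztpois` and the value of lambda are made up for illustration.

```r
# P(X = k | X != 0) for a Poisson variable X, k = 1, 2, ...
dztpois <- function(k, lambda) dpois(k, lambda) / (1 - dpois(0, lambda))

lambda <- 2.5                     # hypothetical parameter
k <- 1:100

total <- sum(dztpois(k, lambda))  # the conditional probabilities sum to 1

# Same thing written out as in the formula
explicit <- exp(-lambda) / (1 - exp(-lambda)) * lambda^(1:10) / factorial(1:10)
```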

Re: [R] Normal distribution (Lillie.test())

2010-02-22 Thread Moshe Olshansky
Hi, As far as I understand, D is the value of the (Kolmogorov-Smirnov) statistic and the p-value is the probability of getting that (or a greater) value for normally distributed variables (so in your case you would most probably reject the hypothesis that your data is normal). --- On Tue, 23/2/10, Bosken w

Re: [R] Integral of function of dnorm

2010-02-17 Thread Moshe Olshansky
Yes, this can be easily computed analytically (even though my result is a bit different). --- On Fri, 12/2/10, dav...@rhotrading.com wrote: > From: dav...@rhotrading.com > Subject: Re: [R] Integral of function of dnorm > To: "Greg Snow" , "Trafim Vanishek" > , "Peter Dalgaard" > Cc: r-help@r

Re: [R] difftimes; histogram; memory problems

2010-02-15 Thread Moshe Olshansky
Hi Jonathan, If minDate = min(Condition1) - max(Condition2) and maxDate = max(Condition1) - min(Condition2) then all your differences would be between minDate and maxDate, and hopefully this is not a very big range (unless you are going many thousands of years into the past or the future). So basica

Re: [R] Question about rank() function

2010-02-10 Thread Moshe Olshansky
Hi, I believe that the reason is that even though the first 4 elements of your fmodel look equal (when rounded to 4 decimal places) they are actually not. To check this try fmodel[1:4]-fmodel[1] --- On Thu, 11/2/10, Something Something wrote: > From: Something Something > Subject: [R] Questio

Re: [R] Resampling a grid to coarsen its resolution

2010-02-09 Thread Moshe Olshansky
One possibility I can see is to replace - by NA and use mean with na.rm=TRUE. --- On Wed, 10/2/10, Steve Murray wrote: > From: Steve Murray > Subject: [R] Resampling a grid to coarsen its resolution > To: r-help@r-project.org > Received: Wednesday, 10 February, 2010, 3:20 AM > > Dear all,

Re: [R] Polynomial equation

2010-01-07 Thread Moshe Olshansky
> > However, due to my limited stats background, I am unable to > find out the > equation of the trendline from the summary table. Besides, > how do I fit the > trendline on the graph? > > I intend to put the first column of data onto x axis and > the second column >

Re: [R] Polynomial equation

2010-01-07 Thread Moshe Olshansky
Hi Chris, You can use lm with poly (look at ?lm, ?poly). If x and y are your arrays of points and you wish to fit a polynomial of degree 4, say, enter: model <- lm(y~poly(x,4,raw=TRUE)) and then summary(model) The raw=TRUE causes poly to use 1,x,x^2,x^3,... instead of orthogonal polynomials (which are
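Sketch (not part of the original post): fitting a degree-4 polynomial with `lm` and `poly(..., raw = TRUE)` on made-up, noise-free data, so that the fitted coefficients can be compared with the ones used to generate y.

```r
# Made-up data from a known polynomial: 1 + 2x - 0.5x^2 + 0x^3 + 0.25x^4
x <- seq(-2, 2, length.out = 50)
y <- 1 + 2 * x - 0.5 * x^2 + 0.25 * x^4

# raw = TRUE makes the coefficients refer to 1, x, x^2, x^3, x^4 directly
model <- lm(y ~ poly(x, 4, raw = TRUE))
coef(model)

# The fitted trendline can be drawn over the points like this:
# plot(x, y); lines(x, fitted(model))
```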

[R] Confidence intervals - a statistical question, nothing to do with R

2009-11-18 Thread Moshe Olshansky
Dear list, I have r towns, T1,...,Tr where town i has population Ni. For each town I randomly sampled Mi individuals and found that Ki of them have a certain property. So Pi = Ki/Mi is an unbiased estimate of the proportion of people in town i having that property and the weighted average of Pi

Re: [R] Kolmogorov smirnov test

2009-10-12 Thread Moshe Olshansky
Hi Roslina, I believe that you can ignore the warning. Alternatively, you may add a very small random noise to pairs with ties, i.e. something like xobs[which(duplicated(xobs))] <- xobs[which(duplicated(xobs))] + 1.0e-6*sd(xobs)*rnorm(length(which(duplicated(xobs Regards, Moshe. --- On Tu
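Sketch (not part of the original post): one way the tie-breaking idea above might be written out in full. The data are made up, and the logical index `dup` is an illustrative name; the original command is truncated in this listing, so this is a guess at its spirit rather than a copy.

```r
set.seed(42)
xobs <- round(rnorm(50), 1)   # made-up data; rounding creates ties

dup <- duplicated(xobs)       # TRUE for repeated values
# Add a tiny random perturbation to the repeated values only
xobs[dup] <- xobs[dup] + 1.0e-6 * sd(xobs) * rnorm(sum(dup))

anyDuplicated(xobs)           # 0 means no ties are left
```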

Re: [R] keeping all rows with the same values, and not only unique ones

2009-09-24 Thread Moshe Olshansky
test[which(test[,"total"] %in% needed),] --- On Fri, 25/9/09, Dimitri Liakhovitski wrote: > From: Dimitri Liakhovitski > Subject: [R] keeping all rows with the same values, and not only unique ones > To: "R-Help List" > Received: Friday, 25 September, 2009, 8:52 AM > Dear R-ers, > > I have a

Re: [R] Basic population dynamics

2009-09-01 Thread Moshe Olshansky
Assuming that at the end all of them are dead, you can do the following: sum(deaths)-cumsum(deaths) Regards, Moshe. --- On Wed, 2/9/09, Frostygoat wrote: > From: Frostygoat > Subject: [R] Basic population dynamics > To: r-help@r-project.org > Received: Wednesday, 2 September, 2009, 4:48 AM
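Sketch (not part of the original post): the survivors-after-each-period calculation above on a made-up `deaths` vector.

```r
deaths <- c(3, 2, 5)                   # hypothetical deaths per period

# Number still alive after each period, assuming everyone is dead at the end
alive <- sum(deaths) - cumsum(deaths)
alive                                  # 7 5 0
```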

Re: [R] Help on efficiency/vectorization

2009-08-26 Thread Moshe Olshansky
You can do for (i in 1:ncol(x)) {names <- rownames(x)[which(x[,i]==1)];eval(parse(text=paste("V",i,".ind<-names",sep="")));} --- On Thu, 27/8/09, Steven Kang wrote: > From: Steven Kang > Subject: [R] Help on efficiency/vectorization > To: r-help@r-project.org > Received: Thursday, 27 August

Re: [R] Submit a R job to a server

2009-08-26 Thread Moshe Olshansky
Hi Deb, Based on your last note (and after briefly looking at Rserve) I believe that you should install R with all the packages you need on the server and then use it like you are using any workstation, i.e. log in to it and do whatever you need. Regards, Moshe. --- On Thu, 27/8/09, Debabrat

Re: [R] extra .

2009-08-20 Thread Moshe Olshansky
My guess is that 6. stands for 6.0 - something which comes from programming languages where 6 represents 6 as an integer while 6. (or 6.0) represents 6 as a floating point number. --- On Fri, 21/8/09, kfcnhl wrote: > From: kfcnhl > Subject: [R] extra . > To: r-help@r-project.org > Received: Friday

Re: [R] Principle components analysis on a large dataset

2009-08-20 Thread Moshe Olshansky
Hi Misha, Since PCA is a linear procedure and you have only 6000 observations, you do not need 68000 variables. Using any 6000 of your variables so that the resulting 6000x6000 matrix is non-singular will do. You can choose these 6000 variables (columns) randomly, hoping that the resulting matr

Re: [R] expanding 1:12 months to Jan:Dec

2009-08-20 Thread Moshe Olshansky
One possible (but not very elegant) solution is: > aa <- paste(1:12,":10:2009",sep="") > dd<-as.Date(aa,format="%m:%d:%Y") > mon <- format(dd,"%b") > mon [1] "Jan" "Feb" "Mar" "Apr" "May" "Jun" "Jul" "Aug" "Sep" "Oct" "Nov" "Dec" --- On Thu, 20/8/09, Liviu Andronic wrote: > From: Liviu Andron
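Sketch (not part of the original post): base R also ships the constant `month.abb`, which gives the English abbreviations without going through dates (and without depending on the locale, which `format(dd, "%b")` does). The vector `m` is a made-up example.

```r
mon <- month.abb[1:12]   # "Jan" "Feb" ... "Dec"

m <- c(3, 1, 12)         # hypothetical month numbers
month.abb[m]             # "Mar" "Jan" "Dec"
```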

Re: [R] feature weighting in randomForest

2009-08-16 Thread Moshe Olshansky
Hi Tim, As far as I know you can not weigh predictors (and I believe that you really should not). You may weigh classes (and, in a sense, cases), but this is an entirely different issue. --- On Wed, 5/8/09, "Häring, Tim (LWF)" wrote: > From: "Häring, Tim (LWF)" > Subject: [R] feature weighti

Re: [R] System is computationally singular and scale of covariates

2009-08-16 Thread Moshe Olshansky
Hi, What do you mean by outer product? If you have two vectors, say x and y, of length n and you define matrix A by A(i,j) = x(i)*y(j) then your matrix has rank one and it is VERY singular (in exact arithmetic). Is this what you mean by outer product? --- On Sun, 16/8/09, Stephan Lindner

Re: [R] Solutions of equation systems

2009-08-13 Thread Moshe Olshansky
Is your system of equations linear? --- On Fri, 14/8/09, Moreno Mancosu wrote: > From: Moreno Mancosu > Subject: [R] Solutions of equation systems > To: r-help@r-project.org > Received: Friday, 14 August, 2009, 2:29 AM > Hello all! > > Maybe it's a newbie question(in fact I _am_, a newbie), bu

Re: [R] problem about t test

2009-08-13 Thread Moshe Olshansky
You could do the following: y <- apply(dat,1,function(a) t.test(a[1:10],a[11:30])$p.value) This will produce an array of 2 p-values. --- On Fri, 14/8/09, Gina Liao wrote: > From: Gina Liao > Subject: [R] problem about t test > To: r-h...@stat.math.ethz.ch > Received: Friday, 14 August, 20
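Sketch (not part of the original post): the row-wise t-test above on a made-up matrix with 5 rows and 30 columns. One p-value is returned per row of `dat`.

```r
set.seed(7)
dat <- matrix(rnorm(5 * 30), nrow = 5)   # made-up data: 5 rows, 30 columns

# For each row compare columns 1-10 with columns 11-30
y <- apply(dat, 1, function(a) t.test(a[1:10], a[11:30])$p.value)
length(y)                                 # one p-value per row
```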

Re: [R] Matrix Integral

2009-08-12 Thread Moshe Olshansky
Hi, Is your matrix K symmetric? If yes, there is an "analytical" solution. --- On Sat, 1/8/09, nhawrylyshyn wrote: > From: nhawrylyshyn > Subject: [R] Matrix Integral > To: r-help@r-project.org > Received: Saturday, 1 August, 2009, 12:15 AM > > Hi, > > Any help on this would be appreciated:

Re: [R] Re : How to Import Excel file into R 2.9.0 version

2009-08-11 Thread Moshe Olshansky
Alternatively download the xlsReadWrite package from http://treetron.googlepages.com/ install it and proceed as in older versions of R. --- On Tue, 11/8/09, Inchallah Yarab wrote: > From: Inchallah Yarab > Subject: [R] Re : How to Import Excel file into R 2.9.0 version > To: r-help@r-project.o

Re: [R] Counting the number of non-NA values per day

2009-08-11 Thread Moshe Olshansky
Try tempFun <- function(x) sum(!is.na(x)) nonZeros <- aggregate(pollution["pol"],format(pollution["date"],"%Y-%j"), FUN = tempFun) --- On Wed, 12/8/09, Tim Chatterton wrote: > From: Tim Chatterton > Subject: [R] Counting the number of non-NA values per day > To: r-help@r-project.org > Recei
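Sketch (not part of the original post): counting non-NA values per day with `aggregate` on a made-up `pollution` data frame. Here the grouping variable is passed as a named list, which is a small variation on the call above.

```r
# Made-up data: two days, some readings missing
pollution <- data.frame(
  date = as.Date(c("2009-08-01", "2009-08-01",
                   "2009-08-02", "2009-08-02", "2009-08-02")),
  pol  = c(12.1, NA, 9.8, 10.4, NA)
)

tempFun <- function(x) sum(!is.na(x))   # number of non-missing values

counts <- aggregate(pollution["pol"],
                    by  = list(day = format(pollution$date, "%Y-%j")),
                    FUN = tempFun)
counts                                   # one row per day
```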

Re: [R] Sampling of non-overlapping intervals of variable length

2009-07-19 Thread Moshe Olshansky
Another possibility, if the total length of your intervals is small in comparison to the "big interval" is to choose the starting points of all your intervals randomly and to dismiss the entire set if some of the intervals overlap. Most probably you will not have too many such cases (assuming,

Re: [R] searching for elements

2009-07-15 Thread Moshe Olshansky
?outer --- On Thu, 16/7/09, Chyden Finance wrote: > From: Chyden Finance > Subject: [R] searching for elements > To: r-help@r-project.org > Received: Thursday, 16 July, 2009, 3:00 AM > Hello! > > For the past three years, I have been using R extensively > in my PhD program in Finance for stat

Re: [R] Grouping data in dataframe

2009-07-14 Thread Moshe Olshansky
Try ?aggregate --- On Wed, 15/7/09, Timo Schneider wrote: > From: Timo Schneider > Subject: [R] Grouping data in dataframe > To: "r-help@r-project.org" > Received: Wednesday, 15 July, 2009, 1:56 PM > Hello, > > I have a dataframe (obtained from read.table()) which looks > like > >   >    Ex

Re: [R] ifultools on ppc debian

2009-07-14 Thread Moshe Olshansky
Hi Stephen, The error message clearly says what is wrong. Big Endian and Little Endian are two ways of storing data (most often double precision numbers) in memory. A double precision number occupies two blocks of 4 bytes each. On Big Endian machines (most machines which are not Intel) if the

Re: [R] Nested for loops

2009-07-13 Thread Moshe Olshansky
Make it for (i in 1:9) This is not the general solution, but in your case when i=10 you do not want to do anything. --- On Tue, 14/7/09, Michael Knudsen wrote: > From: Michael Knudsen > Subject: [R] Nested for loops > To: r-help@r-project.org > Received: Tuesday, 14 July, 2009, 3:38 PM > Hi,

Re: [R] averaging two matrices whilst ignoring missing values

2009-07-13 Thread Moshe Olshansky
One (awkward) way to do this is: x <- matrix(c(c(test),c(test2)),ncol=2) y <- rowMeans(x,na.rm=TRUE) testave <- matrix(y,nrow=nrow(test)) --- On Tue, 14/7/09, Tish Robertson wrote: > From: Tish Robertson > Subject: [R] averaging two matrices whilst ignoring missing values > To: r-help@r-proje
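Sketch (not part of the original post): the same three steps on two made-up 2x2 matrices. A cell that is NA in both matrices would come out as NaN.

```r
# Made-up matrices with missing values
test  <- matrix(c(1, NA, 3, 4),  nrow = 2)
test2 <- matrix(c(3, 2, NA, NA), nrow = 2)

x <- matrix(c(c(test), c(test2)), ncol = 2)   # one column per matrix
y <- rowMeans(x, na.rm = TRUE)                # cell-wise mean, ignoring NA
testave <- matrix(y, nrow = nrow(test))
testave
```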

Re: [R] how to keep row name if there is only one row selected from a data frame

2009-07-12 Thread Moshe Olshansky
Try A[1,,drop=FALSE] - see help("[") --- On Mon, 13/7/09, Weiwei Shi wrote: > From: Weiwei Shi > Subject: [R] how to keep row name if there is only one row selected from a > data frame > To: "r-h...@stat.math.ethz.ch" > Received: Monday, 13 July, 2009, 1:55 PM > Hi, there: > > Assume I hav
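Sketch (not part of the original post): what `drop = FALSE` changes, shown on a small made-up matrix with row names.

```r
A <- matrix(1:6, nrow = 2,
            dimnames = list(c("r1", "r2"), c("a", "b", "c")))

v <- A[1, ]                  # plain named vector, the row name "r1" is gone
B <- A[1, , drop = FALSE]    # still a 1 x 3 matrix, row name kept

rownames(B)                  # "r1"
```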

Re: [R] naming of columns in R dataframe consisting of mixed data (alphanumeric and numeric)

2009-07-09 Thread Moshe Olshansky
Hi Mary, Your data.frame has just one column (not 2)! You can check this by dim(tresult2). What appears to you to be the first column (names) are indeed rownames. If you really want to have two columns do something like tresult2 <- cbind(colnames(tresult),data.frame(t(tresult),row.names=NULL))

Re: [R] Substituting numerical values using `apply'

2009-07-09 Thread Moshe Olshansky
Let M be your matrix. Do the following: B <- t(matrix(colnames(M),nrow=ncol(M),ncol=nrow(M))) B[M==0] <- NA --- On Thu, 9/7/09, Olivella wrote: > From: Olivella > Subject: [R] Substituting numerical values using `apply' > To: r-help@r-project.org > Received: Thursday, 9 July, 2009, 6:25 AM

Re: [R] print() to file?

2009-07-09 Thread Moshe Olshansky
One possibility is to use sink (see ?sink). --- On Thu, 9/7/09, Steve Jaffe wrote: > From: Steve Jaffe > Subject: [R] print() to file? > To: r-help@r-project.org > Received: Thursday, 9 July, 2009, 5:03 AM > > I'd like to write some objects (eg arrays) to a log file. > cat() flattens them >

Re: [R] Extracting a column name in loop?

2009-07-08 Thread Moshe Olshansky
If df is your dataframe then names(df) contains the column names and so names(df)[i] is the name of i-th column. --- On Thu, 9/7/09, mister_bluesman wrote: > From: mister_bluesman > Subject: [R] Extracting a column name in loop? > To: r-help@r-project.org > Received: Thursday, 9 July, 2009,

Re: [R] Uncorrelated random vectors

2009-07-07 Thread Moshe Olshansky
As mentioned by somebody before, there is no problem for the normal case - use mvrnorm function from MASS package with any mu and make Sigma be any diagonal matrix (with strictly positive diagonal). Note that even though all the correlations are 0, the SAMPLE correlations won't be 0. If you wan

Re: [R] Counting the number of cycles in a temperature test

2009-07-07 Thread Moshe Olshansky
Hi Antje, Are your measurements taken every minute (i.e. 30 minutes correspond to 30 consecutive values)? How fast is your transition? If you had 30 minutes of upper temperature, then 1000 minutes of room temperature and then 30 minutes of lower temperature - would you count this as a cycle? C

Re: [R] a really simple question on polynomial multiplication

2008-10-15 Thread Moshe Olshansky
One way is to use convolution (?convolve): If A(x) = a_p*x^p + ... + a_1*x + a_0 and B(x) = b_q*x^q + ... + b_1*x + b_0 and if C(x) = A(x)*B(x) = c_(p+q)*x^(p+q) + ... + c_0 then c = convolve(a,rev(b),type="open") where c is the vector (c_(p+q),...,c_0), a is (a_p,...,a_0) and b is (b_q,...,b_0)
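Sketch (not part of the original post): the `convolve` recipe above applied to two made-up polynomials, A(x) = x^2 + 2x + 3 and B(x) = 4x + 5, whose product is 4x^3 + 13x^2 + 22x + 15. Since `convolve` works through the FFT, the result carries tiny rounding errors.

```r
a <- c(1, 2, 3)   # coefficients of A, highest power first
b <- c(4, 5)      # coefficients of B, highest power first

# Coefficients of A(x)*B(x), highest power first
prod_coef <- convolve(a, rev(b), type = "open")
round(prod_coef, 10)
```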

Re: [R] runs of heads when flipping a coin

2008-10-09 Thread Moshe Olshansky
First of all, we must define what is a run of length r: is it a tail, then EXACTLY r heads and a tail again or is it AT LEAST r heads. Let's assume that we are looking for a run of EXACTLY r heads (and we toss the coin n times). Let X[1],X[2],...,X[n-r+1] be random variables such that Xi = 1 if t

Re: [R] ordering problem

2008-10-07 Thread Moshe Olshansky
Try AA <- apply(A,1,function(x) paste(x,collapse="")) and work with AA. --- On Tue, 30/9/08, Jose Luis Aznarte M. <[EMAIL PROTECTED]> wrote: > From: Jose Luis Aznarte M. <[EMAIL PROTECTED]> > Subject: [R] ordering problem > To: [EMAIL PROTECTED] > Received: Tuesday, 30 September, 2008, 8:43 P

Re: [R] design question on piping multiple data sets from 1 file into R

2008-09-24 Thread Moshe Olshansky
I think that you can use read.csv with nrows and skip arguments (see ?read.table). --- On Mon, 22/9/08, DS <[EMAIL PROTECTED]> wrote: > From: DS <[EMAIL PROTECTED]> > Subject: [R] design question on piping multiple data sets from 1 file into R > To: r-help@r-project.org > Received: Monday, 22 S

Re: [R] perl expression question

2008-09-22 Thread Moshe Olshansky
Hi Mark, stock<-"/opt/limsrv/mark/research/equity/projects/testDL/stock_data/fhdb/US/BLC.NYSE" > gsub(".*/([^/]+)$", "\\1",stock) [1] "BLC.NYSE" --- On Tue, 23/9/08, [EMAIL PROTECTED] <[EMAIL PROTECTED]> wrote: > From: [EMAIL PROTECTED] <[EMAIL PROTECTED]> > Subject: [R] perl expression questi
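Sketch (not part of the original post): for file paths specifically, base R's `basename` gives the same result as the regular expression above.

```r
stock <- "/opt/limsrv/mark/research/equity/projects/testDL/stock_data/fhdb/US/BLC.NYSE"

by_regex    <- gsub(".*/([^/]+)$", "\\1", stock)  # everything after the last "/"
by_basename <- basename(stock)                     # same idea, built in
```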

Re: [R] sort a data matrix by all the values and keep the names

2008-09-22 Thread Moshe Olshansky
One possibility is: > x <- data.frame(x1=c(1,7),x2=c(4,6),x3=c(8,2)) > names <- t(matrix(rep(names(x),times=nrow(x)),nrow=ncol(x))) > m <- as.matrix(x) > ind <- order(m) > df <- data.frame(name=names[ind],value=m[ind]) > df name value 1 x1 1 2 x3 2 3 x2 4 4 x2 6 5 x1

Re: [R] Calculating interval for conditional/unconditional correlation matrix

2008-09-21 Thread Moshe Olshansky
Hi Ana, There are two problems: First of all, if you want your matrix to have 4 columns its number of elements should not be 17! Secondly, and this is what causes your error message, you should not call your second function matrix. Call it matrix1, my_matrix, whatever. Otherwise R thinks tha

Re: [R] help on sampling from the truncated normal/gamma distribution on the far end (probability is very low)

2008-09-18 Thread Moshe Olshansky
Well, I made a mistake - your lambda should be 400 and not 40!!! --- On Thu, 18/9/08, Moshe Olshansky <[EMAIL PROTECTED]> wrote: > From: Moshe Olshansky <[EMAIL PROTECTED]> > Subject: Re: [R] help on sampling from the truncated normal/gamma > distribution on the far end

Re: [R] help on sampling from the truncated normal/gamma distribution on the far end (probability is very low)

2008-09-18 Thread Moshe Olshansky
Hi Sonia, If I did not make a mistake, the conditional distribution of X given that X > 0 is very close to exponential distribution with parameter lambda = 40, so you can sample from this distribution. --- On Mon, 15/9/08, Daniel Davis <[EMAIL PROTECTED]> wrote: > From: Daniel Davis <[EMAIL P

Re: [R] inserting values for null

2008-09-17 Thread Moshe Olshansky
Hi Ramya, Assuming that the problem is well defined (i.e. the values in col1 of the data.frames are unique and every value in D.F.sub.2[,1] appears also in D.F1[,1]) you can do the following: ind <- match(D.F.sub.2[,1],D.F1[,1]) D.F1[ind,] <- D.F.sub.2 --- On Thu, 18/9/08, Rajasekaramya <[EMA
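Sketch (not part of the original post): the `match`-based update above on two small made-up data frames. The column names `id` and `val` are hypothetical.

```r
# Made-up full table with missing values and a smaller table holding some of them
D.F1 <- data.frame(id  = c("a", "b", "c", "d"),
                   val = c(NA_real_, NA_real_, NA_real_, NA_real_),
                   stringsAsFactors = FALSE)
D.F.sub.2 <- data.frame(id  = c("c", "a"),
                        val = c(30, 10),
                        stringsAsFactors = FALSE)

# Position in D.F1 of every key found in D.F.sub.2
ind <- match(D.F.sub.2[, 1], D.F1[, 1])

# Overwrite the matching rows
D.F1[ind, ] <- D.F.sub.2
D.F1
```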

Re: [R] database table merging tips with R

2008-09-11 Thread Moshe Olshansky
Just a small correction: start with s <- paste(r$userid,collapse=",") and not s <- paste(r$userid,sep=",") --- On Fri, 12/9/08, Moshe Olshansky <[EMAIL PROTECTED]> wrote: > From: Moshe Olshansky <[EMAIL PROTECTED]> > Subject: Re: [R] dat
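Sketch (not part of the original post): why `collapse` is needed here rather than `sep`, shown on a made-up vector of ids. No database call is made; only the query string is built.

```r
userid <- c(101, 102, 103)                 # hypothetical ids

with_sep <- paste(userid, sep = ",")       # still three separate strings
s        <- paste(userid, collapse = ",")  # one string: "101,102,103"

query <- paste("select t.userid, x, y, z from largetable t where t.userid in (",
               s, ")", sep = "")
query
```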

Re: [R] database table merging tips with R

2008-09-11 Thread Moshe Olshansky
One possibility is as follows: If r$userid is your array of (2000) ID's then s <- paste(r$userid,sep=",") s<- paste("select t.userid, x, y, z from largetable t where t.userid in (",s,")",sep="") and finally d <- sqlQuery(connection,s) Regards, Moshe. --- On Fri, 12/9/08, Avram Aelony <[EMAIL P

Re: [R] densities with overlapping area of 0.35

2008-09-08 Thread Moshe Olshansky
Just a correction: if we take X+2a then everything is OK (the curves intersect at a), so a = 0.9345893 is correct but one must take X ~ N(0,1) and Y ~N(2*a,1). --- On Tue, 9/9/08, Moshe Olshansky <[EMAIL PROTECTED]> wrote: > From: Moshe Olshansky <[EMAIL PROTECTED]>
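Sketch (not part of the original post): computing a from 2*P(X > a) = 0.35 and checking the overlap of N(0,1) and N(2a,1). The two densities cross at a, so the overlap is the left tail of Y below a plus the right tail of X above a.

```r
a <- qnorm(1 - 0.35 / 2)   # solves 2 * P(X > a) = 0.35 for X ~ N(0,1)
a

# Overlap area of the densities of X ~ N(0,1) and Y ~ N(2a,1)
overlap <- pnorm(a, mean = 2 * a) + (1 - pnorm(a))
overlap
```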

Re: [R] densities with overlapping area of 0.35

2008-09-08 Thread Moshe Olshansky
Let X be normally distributed with mean 0 and let f be its density. Now the density of X+a will be f shifted right by a. Since the density is symmetric around the mean it follows that the area of overlap of the two densities is exactly P(X>a) + P(X<-a). So if X~N(0,1), we want P(X>a) + P(X<-a) = 2P

Re: [R] intercept of 3D line? (Orthogonal regression)

2008-09-01 Thread Moshe Olshansky
I do not see why you can not use regression even in this case. To make things more simple suppose that the exact model is: y = a + b*x, i.e. y1 = a + b*x1 ... yn = a + b*xn But you can not observe y and x. Instead you observe ui = xi + ei (i=1,...,n) and vi = yi + di (i=1,...,n) Now you have

Re: [R] Integrate a 1-variable function with 1 parameter (Jose L. Romero)

2008-08-27 Thread Moshe Olshansky
This can be done analytically: after changing a variable (2*t -> t) and some scaling we need to compute f(x) = integral from 0 to 20 of (t^x*exp(-t))dt/factorial(x) f(0) = int from 0 to 20 of exp(-t)dt = 1 - exp(-20) and integration by parts yields (for x=1,2,3,...) f(x) = -exp(-20)*20^x/facto
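Sketch (not part of the original post): a numerical check of the recurrence above for x = 0,...,5. The comparison with `pgamma` is an addition of mine: the scaled integral is the regularized lower incomplete gamma function, which R exposes as `pgamma(20, shape = x + 1)`.

```r
# f(x) by direct numerical integration
f_num <- function(x)
  integrate(function(t) t^x * exp(-t), 0, 20)$value / factorial(x)

# f(x) by the recurrence f(x) = f(x-1) - exp(-20) * 20^x / x!,  f(0) = 1 - exp(-20)
f_rec <- function(x)
  if (x == 0) 1 - exp(-20) else f_rec(x - 1) - exp(-20) * 20^x / factorial(x)

x <- 0:5
num <- sapply(x, f_num)
rec <- sapply(x, f_rec)
pg  <- pgamma(20, shape = x + 1)

cbind(x, num, rec, pg)
```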

Re: [R] Finding a probability

2008-08-26 Thread Moshe Olshansky
Your commands are correct and the interpretation is that the probability that a normal random variable with mean 1454.190 and standard deviation 162.6301 achieves a value of 417 or less is 8.99413e-11 --- On Wed, 27/8/08, rr400 <[EMAIL PROTECTED]> wrote: > From: rr400 <[EMAIL PROTECTED]> > Subje

Re: [R] Problem with Integrate for NEF-HS distribution

2008-08-26 Thread Moshe Olshansky
If you look at your sech(pi*x/2) you can write it as sech(pi*x/2) = 2*exp(pi*x/2)/(1 + exp(pi*x)) For x < -15, exp(pi*x) < 10^-20, so for this interval you can replace sech(pi*x/2) by 2*exp(pi*x/2) and so the integral from -Inf to -15 (or even -10 - depends on your accuracy requirements) can be

Re: [R] paste: multiple collapse arguments?

2008-08-25 Thread Moshe Olshansky
One possibility is: y <- rep(" ",6) y[6] <- "" y[c(2,4)] <- "\n" res <- paste(paste(x,y,sep=""),collapse="") --- On Tue, 26/8/08, remko duursma <[EMAIL PROTECTED]> wrote: > From: remko duursma <[EMAIL PROTECTED]> > Subject: [R] paste: multiple collapse arguments? > To: r-help@r-project.org > Rec
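Sketch (not part of the original post): the same commands with a made-up `x` (the original poster's vector is not shown in this listing). Elements are joined by a space, except after the 2nd and 4th, where a newline is used.

```r
x <- letters[1:6]          # hypothetical input: "a" ... "f"

y <- rep(" ", 6)           # separator to put after each element
y[6] <- ""                 # nothing after the last one
y[c(2, 4)] <- "\n"         # line break after the 2nd and 4th

res <- paste(paste(x, y, sep = ""), collapse = "")
cat(res, "\n")
```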

Re: [R] Igraph library: How to calculate APSP (shortest path matrix) matrix for a subset list of nodes.

2008-08-24 Thread Moshe Olshansky
I was too optimistic - the complexity is O(E*log(V)) where V is the number of nodes, but since log(25000) < 20 this is still reasonable. --- On Mon, 25/8/08, Moshe Olshansky <[EMAIL PROTECTED]> wrote: > From: Moshe Olshansky <[EMAIL PROTECTED]> > Subject: Re: [R] I

Re: [R] Igraph library: How to calculate APSP (shortest path matrix) matrix for a subset list of nodes.

2008-08-24 Thread Moshe Olshansky
As far as I know/remember, if your graph is connected and contains E edges then you can find the shortest distance from any particular vertex to all other vertices in O(E) operations. You can repeat this procedure starting from every node (out of the 500). If you have 100,000 edges this will req

Re: [R] deconvolution: Using the output and a IRF to get the input

2008-08-24 Thread Moshe Olshansky
Hi Wolf, Without noise you could use FFT, i.e. FFT of a convolution is the product of the individual FFTs and so you get the FFT of your input signal and using inverse FFT you get the signal itself. When there is noise you must experiment. You may want to filter the response before doing FFT.
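Sketch (not part of the original post): the noise-free FFT idea above on made-up vectors, assuming circular convolution and an impulse response whose FFT has no zeros. With noise, dividing by small FFT values amplifies it, which is why some filtering is needed in practice.

```r
n <- 8
x <- c(1, 4, 2, 0, 3, 0, 0, 0)          # made-up input signal
h <- c(0.6, 0.3, 0.1, 0, 0, 0, 0, 0)    # made-up impulse response, zero-padded

# Output = circular convolution of x and h, computed via FFT
y <- Re(fft(fft(x) * fft(h), inverse = TRUE)) / n

# Deconvolution: divide the FFTs and transform back
x_rec <- Re(fft(fft(y) / fft(h), inverse = TRUE)) / n
round(x_rec, 10)
```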

Re: [R] How I can read the binary file with "different type"?

2008-08-21 Thread Moshe Olshansky
Hi Miao, I can write a function which takes an integer and produces a float number whose binary representation equals to that of the integer, but this would be an awkward solution. So if nobody suggests anything better I will write such a function for you, but let's wait for a better solution.

Re: [R] Null and Alternate hypothesis for Significance test

2008-08-21 Thread Moshe Olshansky
Hi Nitin, I believe that you can not have null hypothesis to be that A and B come from different distributions. Asymptotically (as both sample sizes go to infinity) KS test has power 1, i.e. it will reject H0:A=B for any case where A and B have different distributions. To work with a finite samp

Re: [R] Help Regarding 'integrate'

2008-08-21 Thread Moshe Olshansky
The phenomenon is most likely caused by numerical errors. I do not know how 'integrate' works but numerical integration over a very long interval does not look like a good idea to me. I would do the following: f1<-function(x){ return(dchisq(x,9,77)*((13.5/x)^5)*exp(-13.5/x)) } f2<-function(y){

Re: [R] Random sequence of days?

2008-08-19 Thread Moshe Olshansky
How about d[sample(length(d),10)] --- On Wed, 20/8/08, Lauri Nikkinen <[EMAIL PROTECTED]> wrote: > From: Lauri Nikkinen <[EMAIL PROTECTED]> > Subject: [R] Random sequence of days? > To: [EMAIL PROTECTED] > Received: Wednesday, 20 August, 2008, 4:04 PM > Dear list, > > I tried to find a solutio

Re: [R] Conversion - lowercase to Uppercase letters

2008-08-19 Thread Moshe Olshansky
Use toupper or tolower (see ?toupper, ?tolower) --- On Wed, 20/8/08, suman Duvvuru <[EMAIL PROTECTED]> wrote: > From: suman Duvvuru <[EMAIL PROTECTED]> > Subject: [R] Conversion - lowercase to Uppercase letters > To: r-help@r-project.org > Received: Wednesday, 20 August, 2008, 2:19 PM > I would

Re: [R] A doubt about "lm" and the meaning of its summary

2008-08-18 Thread Moshe Olshansky
Hi Alberto, Please disregard my previous note - I probably had a black-out!!! --- On Tue, 19/8/08, Alberto Monteiro <[EMAIL PROTECTED]> wrote: > From: Alberto Monteiro <[EMAIL PROTECTED]> > Subject: [R] A doubt about "lm" and the meaning of its summary > To: r-help@r-project.org > Received: Tue

Re: [R] A doubt about "lm" and the meaning of its summary

2008-08-18 Thread Moshe Olshansky
Hi Alberto, In your second case the linear model y = a*x + b + error does not hold. --- On Tue, 19/8/08, Alberto Monteiro <[EMAIL PROTECTED]> wrote: > From: Alberto Monteiro <[EMAIL PROTECTED]> > Subject: [R] A doubt about "lm" and the meaning of its summary > To: r-help@r-project.org > Recei

Re: [R] matrix row product and cumulative product

2008-08-17 Thread Moshe Olshansky
Hi Jeff, If I understand correctly, the overhead of a loop is that at each iteration the command must be interpreted, and this time is independent of the number of rows N. So if N is small this overhead may be very significant but when N is large this should be very small compared to the time n

Re: [R] Vectorization of duration of the game in the gambler ruin's problem

2008-08-14 Thread Moshe Olshansky
Hi Jose, If you are only interested in the expected duration, the problem can be solved analytically - no simulation is needed. Let P be the probability to get total.capital (and then 1-P is the probability to lose all the money) when starting with initial.capital. This probability P is well k
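Sketch (not part of the original post): the standard textbook formulas for the gambler's ruin problem with unit bets, where p is the probability of winning one round. The function name `ruin` and its arguments are made up, and since the original message is truncated this may not match its notation.

```r
# i = initial capital, N = total capital (target), p = P(win one round)
ruin <- function(i, N, p) {
  q <- 1 - p
  if (isTRUE(all.equal(p, 0.5))) {
    P <- i / N                      # probability of reaching N
    D <- i * (N - i)                # expected number of rounds
  } else {
    r <- q / p
    P <- (1 - r^i) / (1 - r^N)
    D <- (i - N * P) / (q - p)
  }
  c(P_win = P, expected_duration = D)
}

fair   <- ruin(3, 10, 0.5)
biased <- ruin(3, 10, 0.45)
```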

Re: [R] missing TRUE/FALSE error in conditional construct

2008-08-13 Thread Moshe Olshansky
The problem is that if x is either NA or NaN then x != 0 is NA (and not FALSE or TRUE) and the function is.nan tests for a NaN but not for NA, i.e. is.nan(NA) returns FALSE. You can do something like: mat_zeroless[!is.na(mat) & mat != 0] <- mat[!is.na(mat) & mat != 0] --- On Thu, 14/8/08, rco
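Sketch (not part of the original post): the NA/NaN behaviour described above, and the suggested indexing, on a small made-up matrix.

```r
is.nan(NA)     # FALSE: NA is not NaN
is.na(NaN)     # TRUE:  is.na catches both
NA != 0        # NA, which is why it cannot be used as a condition directly

mat <- matrix(c(1, 0, NA, NaN, 5, 0), nrow = 2)   # made-up data
mat_zeroless <- matrix(NA_real_, nrow(mat), ncol(mat))

keep <- !is.na(mat) & mat != 0      # TRUE only for real, non-zero entries
mat_zeroless[keep] <- mat[keep]
mat_zeroless
```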

Re: [R] ignoring zeros or converting to NA

2008-08-13 Thread Moshe Olshansky
Since 0 can be represented exactly as a floating point number, there is no problem with something like x[x==0]. What you can not rely on is something like 0.1+0.2 == 0.3 to be TRUE. --- On Thu, 14/8/08, Roland Rau <[EMAIL PROTECTED]> wrote: > From: Roland Rau <[EMAIL PROTECTED]> > Subject: Re:
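Sketch (not part of the original post): the two cases mentioned above. Comparing with exact zero is safe; comparing computed decimals is not, and `all.equal` is the usual tolerance-based alternative.

```r
x <- c(0, 1.5, 0, 2)       # made-up data
x[x == 0] <- NA            # exact comparison with 0 is fine
x

exact  <- (0.1 + 0.2 == 0.3)                   # FALSE because of rounding
approx <- isTRUE(all.equal(0.1 + 0.2, 0.3))    # TRUE, compares with a tolerance
```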

Re: [R] Covariance matrix

2008-08-07 Thread Moshe Olshansky
Just interchange rows 2 and 3 and then columns 2 and 3 of the original covariance matrix. --- On Fri, 8/8/08, Zhang Yanwei - Princeton-MRAm <[EMAIL PROTECTED]> wrote: > From: Zhang Yanwei - Princeton-MRAm <[EMAIL PROTECTED]> > Subject: [R] Covariance matrix > To: "r-help@r-project.org" > Recei

Re: [R] simulate data based on partial correlation matrix

2008-08-05 Thread Moshe Olshansky
Hi Benjamin, Creating 0 correlations is easier and always possible, but creating arbitrary correlations can be done as well (when possible - see below). Suppose that x1,x2,x3,x4 have mean 0 and suppose that the desired correlations are r = (r1,r2,r3,r4). Let A be an orthogonal 4x4 matrix such th

Re: [R] cutting out numbers from vectors

2008-07-31 Thread Moshe Olshansky
Yes, this is how it should be done! --- On Fri, 1/8/08, Christos Hatzis <[EMAIL PROTECTED]> wrote: > From: Christos Hatzis <[EMAIL PROTECTED]> > Subject: Re: [R] cutting out numbers from vectors > To: "'calundergrad'" <[EMAIL PROTECTED]>, r-help@r-project.org > Received: Friday, 1 August, 2008,

Re: [R] Grouping Index of Matrix Based on Certain Condition

2008-07-31 Thread Moshe Olshansky
y <- 2 - (x[,1] > x[,2]) you can also do cbind(x,y) if you wish. --- On Fri, 1/8/08, Gundala Viswanath <[EMAIL PROTECTED]> wrote: > From: Gundala Viswanath <[EMAIL PROTECTED]> > Subject: [R] Grouping Index of Matrix Based on Certain Condition > To: [EMAIL PROTECTED] > Received: Friday, 1 Aug

Re: [R] cutting out numbers from vectors

2008-07-31 Thread Moshe Olshansky
This is something that is easier done in C than in R (to the best of my very limited knowledge). To do this in R you could do something like: > x <- "082-232-232-1" > y <-unlist(strsplit(x,"")) > i <- which(y != "0")[1]-1 > paste(y[-(1:i)],collapse="") [1] "82-232-232-1" --- On Fri, 1/8/08, c
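Sketch (not part of the original post): assuming the goal is only to strip leading zeros, a regular expression does it in one call.

```r
x <- "082-232-232-1"
res <- sub("^0+", "", x)   # remove one or more zeros at the start of the string
res
```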

Re: [R] Code to calculate internal rate of return

2008-07-31 Thread Moshe Olshansky
You can use uniroot (see ?uniroot). As an example, suppose you have a $100 bond which pays 3% every half year (6% coupon) and lasts for 4 years. Suppose that it now sells for $95. In such a case your time intervals are 0,0.5,1,...,4 and the payoffs are: -95,3,3,...,3,103. To find internal rate
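Sketch (not part of the original post): the bond example above solved with `uniroot`. Annual compounding and the search interval c(0, 1) are my assumptions; the original message is truncated before it states its own convention.

```r
times <- seq(0, 4, by = 0.5)          # payment times in years
cash  <- c(-95, rep(3, 7), 103)       # pay 95 now, 3 every half year, 103 at the end

# Net present value at annual rate r (annual compounding assumed)
npv <- function(r) sum(cash / (1 + r)^times)

# The internal rate of return is the rate at which the NPV is zero
irr <- uniroot(npv, interval = c(0, 1), tol = 1e-10)$root
irr
```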

Re: [R] stats question

2008-07-31 Thread Moshe Olshansky
Hello Jason, You are not specific enough. What do you mean by "significant difference"? Let's assume that indeed the incidence in A is 6% and in B is 10% and we are looking for Na and Nb such that with probability of at least 80% the mean of Nb sample from B will be at least, say, 0.03 (=3%) abo

Re: [R] Sampling two exponentials

2008-07-30 Thread Moshe Olshansky
I am not sure that this is well defined. For a multivariate normal distribution (which is well defined), the covariance matrix (and the means vector) fully determine the distribution. In the exponential case, what is multivariate (bivariate) exponential distribution? I believe that knowing tha

Re: [R] Urgent

2008-07-29 Thread Moshe Olshansky
Hi Yunlei, Is your problem constrained or not? If it is unconstrained and your matrix is not positive definite, the minimum is unbounded (unless you are extremely lucky and the matrix is positive semi-definite and the vector which multiplies the unknowns is exactly perpendicular to all the eige

Re: [R] Chi-square parameter estimation

2008-07-29 Thread Moshe Olshansky
If v is your vector of sample variances (and assuming that their distribution is chi-square) you can define f(df) <- sum(dchisq(v,df,log=TRUE)) and now you need to maximize f, which can be done using any optimization function (like optim). --- On Sat, 26/7/08, Julio Rojas <[EMAIL PROTECTED]> wr
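Sketch (not part of the original post): maximizing the chi-square log-likelihood over df with `optimize` (one-dimensional, so `optim` is not needed). The data are simulated with a known df of 5, and the search interval is an arbitrary choice.

```r
set.seed(11)
v <- rchisq(200, df = 5)                       # made-up sample with known df

f <- function(df) sum(dchisq(v, df, log = TRUE))   # log-likelihood

est <- optimize(f, interval = c(0.1, 50), maximum = TRUE)$maximum
est                                            # should be near 5
```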

Re: [R] finding a faster way to do an iterative computation

2008-07-29 Thread Moshe Olshansky
Try abs(outer(xk,x,"-")) (see ?outer) --- On Wed, 30/7/08, dxc13 <[EMAIL PROTECTED]> wrote: > From: dxc13 <[EMAIL PROTECTED]> > Subject: [R] finding a faster way to do an iterative computation > To: r-help@r-project.org > Received: Wednesday, 30 July, 2008, 4:12 AM > useR's, > > I am trying
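Sketch (not part of the original post): what the `outer` call returns, on two short made-up vectors. Entry [i, j] is |xk[i] - x[j]|.

```r
xk <- c(0, 1, 2)       # hypothetical grid points
x  <- c(0.5, 1.5)      # hypothetical data points

d <- abs(outer(xk, x, "-"))   # 3 x 2 matrix of absolute differences
d
```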

Re: [R] product of successive rows

2008-07-29 Thread Moshe Olshansky
Assuming that the number of rows is even and that your matrix is A, the element-wise product of pairs of rows can be calculated as A[seq(1,nrow(A),by=2),]*A[seq(2,nrow(A),by=2),] --- On Mon, 28/7/08, rcoder <[EMAIL PROTECTED]> wrote: > From: rcoder <[EMAIL PROTECTED]> > Subject: [R] product of su
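A short check of the row-pair product on a made-up 4 x 2 matrix; `drop = FALSE` is an addition here so that the result stays a matrix even when A has only two rows:

```r
# Hypothetical matrix with an even number of rows.
A <- matrix(1:8, nrow = 4)

odd  <- A[seq(1, nrow(A), by = 2), , drop = FALSE]  # rows 1, 3, ...
even <- A[seq(2, nrow(A), by = 2), , drop = FALSE]  # rows 2, 4, ...

# Row k of P is row (2k-1) of A times row 2k of A, element by element.
P <- odd * even
P
```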

Re: [R] simple random number generation

2008-07-24 Thread Moshe Olshansky
Or, as suggested by Duncan Murdoch, qnorm(runif(500,pnorm(-1.5),pnorm(1.5))) --- On Fri, 25/7/08, jim holtman <[EMAIL PROTECTED]> wrote: > From: jim holtman <[EMAIL PROTECTED]> > Subject: Re: [R] simple random number generation > To: "dxc13" <[EMAIL PROTECTED]> > Cc: r-help@r-project.org > Rec
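The one-liner above is inverse-CDF sampling: uniform draws restricted to the interval [pnorm(-1.5), pnorm(1.5)] are mapped back through `qnorm`, so every value lands between -1.5 and 1.5. A sketch with the bounds pulled out as variables (the names `lo`, `hi` and `z` are illustrative):

```r
set.seed(42)
lo <- -1.5
hi <-  1.5

# Uniform draws on [F(lo), F(hi)], then mapped through the normal quantile function.
u <- runif(500, pnorm(lo), pnorm(hi))
z <- qnorm(u)

range(z)  # should lie inside [lo, hi]
```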

Re: [R] Constrained coefficients in lm (correction)

2008-07-23 Thread Moshe Olshansky
This problem can be easily solved analytically: we want to minimize sum(res(i) -a*st(i) -b*mod(i))^2 subject to a+b=1,a,b>=0, so we want to minimize f(a) = sum((res(i)-mod(i)) - a*(st(i)-mod(i)))^2 for 0<=a<=1 Define Xi = res(i) - mod(i), Yi = st(i) - mod(i), then f(a) = sum(Xi - a*Yi)^2 f(0
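The message is cut off, so the following is only a sketch of one way to finish the calculation, not necessarily the author's own continuation. It assumes the usual argument for a one-dimensional convex quadratic: take the unconstrained minimiser sum(Xi*Yi)/sum(Yi^2) and clip it to [0, 1]. The data are simulated and all variable names are hypothetical:

```r
set.seed(7)
n   <- 200
st  <- rnorm(n)
mod <- rnorm(n)
res <- 0.3 * st + 0.7 * mod + rnorm(n, sd = 0.1)  # simulated: true a = 0.3, b = 0.7

X <- res - mod
Y <- st - mod

# Unconstrained least-squares minimiser of sum((X - a*Y)^2) ...
a_unc <- sum(X * Y) / sum(Y^2)
# ... clipped to [0, 1]; valid because f(a) is a convex quadratic in a.
a <- min(max(a_unc, 0), 1)
b <- 1 - a
c(a = a, b = b)
```

A numerical cross-check such as `optimize(function(a) sum((X - a * Y)^2), c(0, 1))` should agree with the clipped value up to the optimiser's tolerance.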

Re: [R] spectral decomposition for near-singular pd matrices

2008-07-16 Thread Moshe Olshansky
D]> wrote: > From: Prasenjit Kapat <[EMAIL PROTECTED]> > Subject: Re: [R] spectral decomposition for near-singular pd matrices > To: [EMAIL PROTECTED] > Received: Thursday, 17 July, 2008, 10:56 AM > Moshe Olshansky yahoo.com> > writes: > > > How large is your matr

Re: [R] spectral decomposition for near-singular pd matrices

2008-07-16 Thread Moshe Olshansky
How large is your matrix? Are the very small eigenvalues well separated? If your matrix is not very small and the lower eigenvalues are clustered, this may be a really hard problem! You may need a special purpose algorithm and/or higher precision arithmetic. If your matrix is A and there exists

Re: [R] number of effective tests

2008-07-10 Thread Moshe Olshansky
It looks like SR, SU and ST are strongly correlated to each other, as well as DR, DU and DT. You can try to do PCA on your 6 variables, pick the first 2 principal components as your new variables and use them for regression. --- On Fri, 11/7/08, Georg Ehret <[EMAIL PROTECTED]> wrote: > From: G
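A rough sketch of the PCA-then-regress suggestion, using simulated data in place of the poster's. The column names follow the message, but the data, the response `y` and the choice to standardise the variables are all assumptions:

```r
set.seed(3)
n <- 100
s <- rnorm(n)  # latent factor behind SR, SU, ST (simulated)
d <- rnorm(n)  # latent factor behind DR, DU, DT (simulated)

X <- data.frame(
  SR = s + rnorm(n, sd = 0.1), SU = s + rnorm(n, sd = 0.1), ST = s + rnorm(n, sd = 0.1),
  DR = d + rnorm(n, sd = 0.1), DU = d + rnorm(n, sd = 0.1), DT = d + rnorm(n, sd = 0.1)
)
y <- s - d + rnorm(n)  # hypothetical response

# Principal components of the standardised predictors.
pc     <- prcomp(X, scale. = TRUE)
scores <- pc$x[, 1:2]  # first two components as the new variables

fit <- lm(y ~ scores)
summary(pc)$importance[, 1:2]
```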

Re: [R] rounding

2008-07-10 Thread Moshe Olshansky
below 255, so that x is less than 2.55 and should have been rounded to 2.5. --- On Fri, 11/7/08, Moshe Olshansky <[EMAIL PROTECTED]> wrote: > From: Moshe Olshansky <[EMAIL PROTECTED]> > Subject: Re: [R] rounding > To: [EMAIL PROTECTED], "Korn, Ed (NIH/NCI) [E]"

Re: [R] rounding

2008-07-10 Thread Moshe Olshansky
The problem is that neither 0.55 nor 2.55 are exact machine numbers (the computer uses binary representation), so it may happen that the machine representation of 0.55 is slightly less than 0.55 while the machine representation of 2.55 is slightly above 2.55. --- On Fri, 11/7/08, Korn, Ed (NIH
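Which side of the decimal value each stored number falls on can be checked directly by printing more digits than the default. A minimal sketch:

```r
# Print the stored binary values of 0.55 and 2.55 to 20 decimal places.
digits <- sprintf("%.20f", c(0.55, 2.55))
digits

# Rounding to one decimal place acts on these stored values, not on the
# decimal literals that were typed in.
round(c(0.55, 2.55), 1)
```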

Re: [R] Sum(Random Numbers)=100

2008-07-08 Thread Moshe Olshansky
method is also correct except > it is based on > the conditioning. > On 2008-7-8, at 1:58 PM, Shubha Vishwanath Karanth > wrote: > On 2008-7-8, at 2:39 PM, Moshe Olshansky wrote: > > > If they are really random you can not expect their sum > to be 100. > >

Re: [R] Sum(Random Numbers)=100

2008-07-07 Thread Moshe Olshansky
If they are really random you can not expect their sum to be 100. However, it is not difficult to get that given that the sum of n independent Poisson random variables equals N, any individual one has the conditional binomial distribution with size = N and p = 1/n, i.e. P(Xi=k/Sn=N) = (N over k)*
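One way to use this in practice, offered as a sketch rather than as the rest of the truncated message: conditional on their sum being N, n independent Poisson variables with a common mean are jointly multinomial with equal cell probabilities 1/n, which gives the binomial marginals stated above. `rmultinom` therefore yields non-negative integers that sum to exactly N. The values of `n` and `N` are illustrative:

```r
set.seed(10)
n <- 8    # how many numbers are wanted (illustrative)
N <- 100  # required total

# One multinomial draw: N items spread over n equally likely cells.
x <- as.vector(rmultinom(1, size = N, prob = rep(1 / n, n)))
x
sum(x)
```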

Re: [R] odd dnorm behaviour (?)

2008-07-07 Thread Moshe Olshansky
dnorm() computes the density, so it may be > 1; pnorm() computes the distribution function. --- On Tue, 8/7/08, Mike Lawrence <[EMAIL PROTECTED]> wrote: > From: Mike Lawrence <[EMAIL PROTECTED]> > Subject: Re: [R] odd dnorm behaviour (?) > To: "Rhelp" <[EMAIL PROTECTED]> > Received: Tuesday, 8
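A quick illustration with a small standard deviation (the value 0.1 is chosen only for the example): the density at the mean exceeds 1, while the distribution function never does.

```r
# Density of N(0, 0.1^2) at its mean: 1 / (0.1 * sqrt(2 * pi)), which is above 1.
d <- dnorm(0, mean = 0, sd = 0.1)

# Distribution function at the same point: probability of a value <= 0.
p <- pnorm(0, mean = 0, sd = 0.1)

c(density = d, probability = p)
```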

Re: [R] multiplication question

2008-07-07 Thread Moshe Olshansky
The answer to your first question is sum(x)*sum(y) - sum(x*y) and for the second one x %*% R %*% y - sum(x*y*diag(R)) --- On Thu, 3/7/08, Murali Menon <[EMAIL PROTECTED]> wrote: > From: Murali Menon <[EMAIL PROTECTED]> > Subject: [R] multiplication question > To: [EMAIL PROTECTED] > Received:
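The original question is not shown here, so the following assumes it asks for the sum of x[i]*y[j] over all pairs with i != j, and for the same sum weighted by R[i, j]. Under that assumption, a brute-force check on random data agrees with the closed forms:

```r
set.seed(5)
n <- 6
x <- rnorm(n)
y <- rnorm(n)
R <- matrix(rnorm(n * n), n, n)

# Closed forms: the full double sum minus its diagonal (i == j) terms.
s1 <- sum(x) * sum(y) - sum(x * y)
s2 <- as.numeric(x %*% R %*% y) - sum(x * y * diag(R))

# Brute force: keep only the off-diagonal entries of the n x n product table.
off <- row(R) != col(R)
b1  <- sum(outer(x, y)[off])
b2  <- sum((outer(x, y) * R)[off])

all.equal(c(s1, s2), c(b1, b2))
```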

Re: [R] Plot Mixtures of Synthetically Generated Gamma Distributions

2008-07-06 Thread Moshe Olshansky
I know very little about graphics, so my primitive and brute force solution would be plot(density(x[1:30]),col="blue");lines(density(x[31:60]),col="red");lines(density(x[61:90]),col="green") --- On Mon, 7/7/08, Gundala Viswanath <[EMAIL PROTECTED]> wrote: > From: Gundala Viswanath <[EMAIL PROT

Re: [R] Lots of huge matrices, for-loops, speed

2008-07-06 Thread Moshe Olshansky
t: Re: [R] Lots of huge matrices, for-loops, speed > To: [EMAIL PROTECTED] > Cc: r-help@r-project.org, "Zarza" <[EMAIL PROTECTED]> > Received: Monday, 7 July, 2008, 9:40 AM > On 7/07/2008, at 11:05 AM, Moshe Olshansky wrote: > > > Another possibility is t

Re: [R] Lots of huge matrices, for-loops, speed

2008-07-06 Thread Moshe Olshansky
Another possibility is to use explicit formula, i.e. if you are doing linear regression like y = a*x + b then the explicit formulae are: a = (meanXY - meanX*meanY)/(meanX2 - meanX^2) b = (meanY*meanX2 - meanX*meanXY)/(meanX2 - meanX^2) where meanX is mean(x), meanXY is mean(x*y), meanX2 is mean(
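A small sketch comparing the explicit formulae with `lm` on simulated data (the data and the seed are made up for the check):

```r
set.seed(11)
x <- runif(50)
y <- 2 * x + 1 + rnorm(50, sd = 0.2)

meanX  <- mean(x)
meanY  <- mean(y)
meanXY <- mean(x * y)
meanX2 <- mean(x^2)

# Slope and intercept from the explicit formulae in the message.
a <- (meanXY - meanX * meanY) / (meanX2 - meanX^2)
b <- (meanY * meanX2 - meanX * meanXY) / (meanX2 - meanX^2)

# lm() returns (intercept, slope), so compare against c(b, a).
rbind(explicit = c(b, a), lm = unname(coef(lm(y ~ x))))
```

When the same regression has to be repeated for many matrices, these means can be computed with vectorised operations, which avoids calling `lm` inside a loop.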
