from:"eric"

[R] optimisation, pseudo maximum likelihood and exponential family in R

2010-04-21 Thread Eric

Hello,

Having written the log-likelihood (log_vraisemblance), gradient and Hessian
functions for a density from a family of the form exp(theta.xi), I am looking
for efficient estimators by pseudo maximum likelihood (PML).

So far I have looked at optim, nlminb and the maxLik procedure.

But I am not sure that the heteroscedasticity of my estimators is taken into
account.

Has anyone dealt with this kind of problem?

Thanks for your help.
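
One common way to make PML inference robust to a misspecified variance (heteroscedasticity included) is a sandwich covariance, inv(A) B inv(A), where A is the Hessian of the negative pseudo-log-likelihood and B is the sum of outer products of the per-observation scores. The sketch below is only an illustration under assumptions: it uses a Poisson pseudo-likelihood, simulated overdispersed counts, and made-up names (negll, neggr, V_robust); none of this comes from the original post.

```r
# Minimal sketch: Poisson pseudo-ML with optim() and a sandwich covariance.
# Data and all names are illustrative assumptions.
set.seed(42)
n <- 500
X <- cbind(1, rnorm(n))                          # design matrix with intercept
mu <- drop(exp(X %*% c(0.5, 0.3)))
y <- rnbinom(n, size = 2, mu = mu)               # overdispersed counts

# Negative Poisson pseudo-log-likelihood (constants dropped) and its gradient
negll <- function(b) { eta <- drop(X %*% b); -sum(y * eta - exp(eta)) }
neggr <- function(b) { eta <- drop(X %*% b); -drop(crossprod(X, y - exp(eta))) }

fit <- optim(c(0, 0), negll, neggr, method = "BFGS", hessian = TRUE)
b_hat <- fit$par

A <- fit$hessian                                 # "bread": Hessian at the optimum
scores <- X * (y - drop(exp(X %*% b_hat)))       # n x p matrix of per-observation scores
B <- crossprod(scores)                           # "meat": sum of score outer products

V_naive  <- solve(A)                             # valid only if the variance is correctly specified
V_robust <- V_naive %*% B %*% V_naive            # sandwich covariance
se_robust <- sqrt(diag(V_robust))
```

Comparing sqrt(diag(V_naive)) with se_robust gives a rough idea of how much the naive standard errors are affected by the variance misspecification.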

 

 



__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Obtaining partial output from a function that does not run to completion.

2009-12-31 Thread Eric



To heck with print(), use R's debugging capabilities instead:

trace ( minBMI, browser )

Be sure to ?trace and ?browser so you can figure out how to 
interactively debug.
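
As a non-interactive alternative to trace() and browser(), the step that fails can be wrapped in tryCatch() so the earlier results are still printed and returned. This is a sketch with a made-up function (run_steps) and toy data, not the original minBMI; flush.console() is included on the assumption that the console may be buffering output.

```r
# Minimal sketch: keep earlier output when a later model fit fails.
# run_steps and the data are illustrative assumptions.
run_steps <- function(x, y) {
  fit_lm <- lm(y ~ x)
  print(coef(fit_lm))                 # printed as soon as this line runs
  flush.console()                     # push buffered output to the console

  fit_glm <- tryCatch(
    glm(y ~ x, family = Gamma(link = "log")),
    error = function(e) {
      message("glm() failed: ", conditionMessage(e))
      NULL                            # return NULL instead of aborting the function
    }
  )
  list(lm = fit_lm, glm = fit_glm)
}

x <- 1:10
y_bad <- c(-1, 2:10)                  # Gamma family rejects non-positive responses
res_bad <- run_steps(x, y_bad)

y_ok <- c(2.1, 3.9, 6.2, 8.1, 9.8, 12.2, 14.1, 15.9, 18.2, 20.1)
res_ok <- run_steps(x, y_ok)
```

In the failing case the lm() coefficients are printed, the glm() error is reported as a message, and the function still returns the lm() fit.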


Eric


On 12/31/09 6:29 AM, John Sorkin wrote:

I have written a function that runs
lm()
vif()  and
glm()

When the glm() blows up with an error message, I don't get the output from 
either the lm() or vif() even though neither lm() nor vif() has any problems. 
How can I force the function to print sequential results rather than wait for 
the entire function to complete before listing the function's output?
Thanks,
John



minBMI<-function(SS,SimData)
{
SampleData<-sample(1:SS,size=SS,replace=TRUE)
fitBMIEpiRevlm<-lm(AAMTCARE~BMIEpiRevAdjc+BMIEpiRevAdjcSq+SEX+jPHI+jMEDICAID+H_AGE+jMARSTAT+factor(jEDUCATION)+factor(jsmokercat)+factor(jrace)+log(INCOME_C+1),data=SimData[SampleData,],x=TRUE)
print(summary(fitBMIEpiRevlm))
print(vif(fitBMIEpiRevlm))
fitBMIEpiRev<- 
glm(AAMTCARE~BMIEpiRevAdjc+BMIEpiRevAdjcSq+SEX+jPHI+jMEDICAID+H_AGE+jMARSTAT+factor(jEDUCATION)+factor(jsmokercat)+factor(jrace)+log(INCOME_C+1),data=SimData[SampleData,],family=Gamma(link="log"))

print(summary(fitBMIEpiRev))
}
minBMI(SS,SimData)


John David Sorkin M.D., Ph.D.
Chief, Biostatistics and Informatics
University of Maryland School of Medicine Division of Gerontology
Baltimore VA Medical Center
10 North Greene Street
GRECC (BT/18/GR)
Baltimore, MD 21201-1524
(Phone) 410-605-7119
(Fax) 410-605-7913 (Please call phone number above prior to faxing)





[R] Creating dates to plot

2012-11-15 Thread eric

I have a dataframe that looks like this:

head(x)
  Period  AP AlMA BB All
1 200812  903,231   1,985,460   905,422   3,312,088   7,106,201 
2 200901  880,491   1,924,111   892,980   3,006,050   6,703,631 
3 200902  883,994   1,926,169   890,021   3,247,530   6,947,714 
4 200903  888,021   1,901,182   892,593   3,216,730   6,898,526 
5 200904  869,024   1,829,841   857,723   3,121,458   6,678,046 
6 200905  847,450   1,776,100   847,593   3,110,783   6,581,925 

x$Period is a numeric value that represents the year and month. How would I
take those values and turn them into dates so that I could plot the various
columns vs the date ?
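
One possible approach, sketched here on a few of the Period values shown above: append a day of month to each YYYYMM value and parse the result with as.Date(). Using the first of the month is an assumption; the variable names are illustrative.

```r
# Minimal sketch: turn numeric YYYYMM periods into Date objects.
period <- c(200812, 200901, 200902, 200903)

# Append "01" so each value becomes YYYYMMDD, then parse it
dates <- as.Date(paste0(period, "01"), format = "%Y%m%d")

values <- c(903231, 880491, 883994, 888021)   # first numeric column from the post, commas removed
if (interactive()) {
  plot(dates, values, type = "l", xlab = "", ylab = "value")
}
```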



--
View this message in context: 
http://r.789695.n4.nabble.com/Creating-dates-to-plot-tp4649677.html
Sent from the R help mailing list archive at Nabble.com.


Re: [R] Dealing with factors ???

2012-11-15 Thread eric

The table is much bigger than what was shown. I just displayed a few rows.
Seems like there should be a better way than the approach you are proposing.
What is also not clear to me is why the factors are appearing at all. I do a
read.csv on a table full of numbers from Excel and I'm seeing factors
everywhere.
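
A likely explanation, judging from values such as " 903,231 ": the thousands separators make read.csv treat each column as text, and with stringsAsFactors = TRUE (the default before R 4.0) text columns become factors. A minimal sketch of one way to convert them, using a made-up two-row CSV and a hypothetical helper to_number:

```r
# Minimal sketch: numbers with thousands separators arrive as factors.
csv_text <- 'Period,TX
200812," 3,312,088 "
200901," 3,006,050 "'

x <- read.csv(text = csv_text, stringsAsFactors = TRUE)
was_factor <- is.factor(x$TX)          # TRUE: the commas prevent numeric conversion

# Strip commas and spaces from the labels, then convert.
# as.character() matters: as.numeric() on a factor returns level codes.
to_number <- function(v) as.numeric(gsub("[, ]", "", as.character(v)))
x$TX <- to_number(x$TX)
```

To convert several columns at once, something like x[cols] <- lapply(x[cols], to_number) should work, where cols names the affected columns.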



--
View this message in context: 
http://r.789695.n4.nabble.com/Dealing-with-factors-tp4649686p4649689.html
Sent from the R help mailing list archive at Nabble.com.


[R] Dealing with factors ???

2012-11-15 Thread eric

I have a data frame x that came from read.csv. It seemed to read in ok but
then I tried doing some plotting of the values and ran into difficulties. 
The plot command seems to be plotting factors instead of the values. How do
I get rid of these factors ? The plot command I use is : plot (x$dat, x$TX,
type='l'). I also tried  ...plot(x$dat, levels(x$TX), type='l') but got an
error (shown below).

What am I doing wrong here ?

Error in plot.window(...) : need finite 'ylim' values
In addition: Warning messages:
1: In xy.coords(x, y, xlabel, ylabel, log) : NAs introduced by coercion
2: In min(x) : no non-missing arguments to min; returning Inf
3: In max(x) : no non-missing arguments to max; returning -Inf

> head(x,4)
  Period  PA        NJ          MD        TX          All         dat
1 200812  903,231   1,985,460   905,422   3,312,088   7,106,201   2008-12-31
2 200901  880,491   1,924,111   892,980   3,006,050   6,703,631   2009-01-31
3 200902  883,994   1,926,169   890,021   3,247,530   6,947,714   2009-03-03
4 200903  888,021   1,901,182   892,593   3,216,730   6,898,526   2009-03-31
> str(x)
'data.frame':   41 obs. of  7 variables:
 $ Period: int  200812 200901 200902 200903 200904 200905 200906 200907
200908 200909 ...
 $ PA  : Factor w/ 41 levels " 818,037 "," 823,191 ",..: 26 22 23 25 19 7 10
2 1 12 ...
 $ NJ   : Factor w/ 41 levels " 1,599,113 ",..: 31 28 29 27 22 19 20 17 14
16 ...
 $ MD   : Factor w/ 41 levels " 800,827 "," 807,154 ",..: 27 25 23 24 15 13
11 6 5 3 ...
 $ TX   : Factor w/ 41 levels " 2,472,690 ",..: 41 23 40 39 35 34 32 21 18
27 ...
 $ All   : Factor w/ 41 levels " 6,111,993 ",..: 40 27 38 36 25 21 19 13 11
16 ...
 $ dat   :Class 'Date'  num [1:41] 14244 14275 14306 14334 14365 ...





--
View this message in context: 
http://r.789695.n4.nabble.com/Dealing-with-factors-tp4649686.html
Sent from the R help mailing list archive at Nabble.com.


[R] Reshaping a dataframe

2012-11-17 Thread eric

Seems like this should be easy but I'm struggling a bit. How do I rearrange a
data frame to go from the first one to the second shown below ?


State   Date    lbs
TX      200701  400
TX      200702  650
TX      200703  950
TX      200704  1000
FL      200701  200
FL      200702  300
FL      200703  500
FL      200704  333
NJ      200701  409
NJ      200702  308
NJ      200703  300
NJ      200704  800



Date    TX    FL   NJ
200701  400   200  409
200702  650   300  308
200703  950   500  300
200704  1000  333  800
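
One base-R way to get from the first layout to the second is tapply() over the Date and State columns. This is only a sketch of one option (reshape() or the reshape2 package would also do it); the object names long, wide_mat and wide are illustrative.

```r
# Minimal sketch: long-to-wide with base R tapply().
long <- data.frame(
  State = rep(c("TX", "FL", "NJ"), each = 4),
  Date  = rep(c(200701, 200702, 200703, 200704), times = 3),
  lbs   = c(400, 650, 950, 1000,  200, 300, 500, 333,  409, 308, 300, 800)
)

# One cell per Date/State pair; sum() is harmless here because each pair occurs once
wide_mat <- tapply(long$lbs, list(long$Date, long$State), sum)

# Turn the matrix into a data frame with Date as an ordinary column
wide <- data.frame(Date = as.numeric(rownames(wide_mat)), wide_mat, row.names = NULL)
wide <- wide[, c("Date", "TX", "FL", "NJ")]    # put columns in the requested order
```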




--
View this message in context: 
http://r.789695.n4.nabble.com/Reshaping-a-dataframe-tp4649874.html
Sent from the R help mailing list archive at Nabble.com.


Re: [R] Reshaping a dataframe

2012-11-17 Thread eric


What am I doing wrong ? Using the method you proposed on SLIGHTLY different 
data.

> head(x)
 Region YearMon  kg
 AP  200701 290,311
 AP  200702 322,671
 AP  200703 216,600
 AP  200704 450,711
 AP  200705 245,215
 AP  200706 212,492
> j <-dcast(x,YearMon~Region,value.var='kg')
Using kg as value column: use value_var to override.
Error in names(data) <- array_names(res$labels[[2]]) : 
 'names' attribute [4] must be the same length as the vector [1]
> j <-dcast(x,YearMon~Region,value_var='kg')
Error in names(data) <- array_names(res$labels[[2]]) : 
 'names' attribute [4] must be the same length as the vector [1]








-Original Message-
From: arun kirshna [via R] 
To: eric 
Sent: Sat, Nov 17, 2012 5:23 pm
Subject: Re: Reshaping a dataframe


HI, 
Try this: 
dat1<-read.table(text=" 
State Date lbs 
TX 200701 400 
TX 200702 650 
TX 200703 950 
TX 200704 1000 
FL 200701 200 
FL 200702 300 
FL 200703 500 
FL 200704 333 
NJ 200701 409 
NJ 200702 308 
NJ 200703 300 
NJ 200704 800 
",sep="",header=TRUE,stringsAsFactors=FALSE) 
library(reshape2) 

 dcast(dat1,Date~State,value.var="lbs") 
#Date  FL  NJ   TX 
#1 200701 200 409  400 
#2 200702 300 308  650 
#3 200703 500 300  950 
#4 200704 333 800 1000 


A.K. 









--
View this message in context: 
http://r.789695.n4.nabble.com/Reshaping-a-dataframe-tp4649874p4649890.html
Sent from the R help mailing list archive at Nabble.com.

Re: [R] Reshaping a dataframe

2012-11-17 Thread eric

Didn't try the other two methods as I spent a bit of time trying to learn
about reshape2. I was also able to melt the data and that went fine
(although based on your post, melting is not needed).  

Any ideas on why reshape2 dcast is giving fits ?



--
View this message in context: 
http://r.789695.n4.nabble.com/Reshaping-a-dataframe-tp4649874p4649892.html
Sent from the R help mailing list archive at Nabble.com.


Re: [R] Reshaping a dataframe

2012-11-17 Thread eric


Could it be that my data is showing up as factors ?
> class(x)
[1] "data.frame"
> str(x)
'data.frame':   284 obs. of  3 variables:
 $ Region : Factor w/ 4 levels "AP","EU","LA",..: 1 1 1 1 1 1 1 1 1 1 ...
 $ YearMon: int  200701 200702 200703 200704 200705 200706 200707 200708 200709 200710 ...
 $ kg     : Factor w/ 284 levels "-18,646","-3,199,893",..: 123 137 97 175 107 96 173 178 121 146 ...




-Original Message-
From: arun kirshna [via R] 
To: eric 
Sent: Sat, Nov 17, 2012 5:47 pm
Subject: Re: Reshaping a dataframe


HI, 
This is what I got: 
dat2<-read.table(text=" 
Region YearMon  kg 
1 AP  200701 290,311 
2 AP  200702 322,671 
3 AP  200703 216,600 
4 AP  200704 450,711 
5 AP  200705 245,215 
6 AP  200706 212,492 
",sep="",header=TRUE,stringsAsFactors=FALSE) 
dcast(dat2,YearMon~Region,value.var="kg") 
#  YearMon  AP 
#1  200701 290,311 
#2  200702 322,671 
#3  200703 216,600 
#4  200704 450,711 
#5  200705 245,215 
#6  200706 212,492 
 reshape(dat2,v.names="kg",idvar="YearMon",timevar="Region",direction="wide") 
#  YearMon   kg.AP 
#1  200701 290,311 
#2  200702 322,671 
#3  200703 216,600 
#4  200704 450,711 
#5  200705 245,215 
#6  200706 212,492 

# xtabs() will not work on the raw data: the commas in the kg column must be
# removed and the column converted to numeric first
 dat2$kg<-as.numeric(gsub("[,]","",dat2$kg)) 
xtabs(kg~YearMon+Region,data=dat2) 
#Region 
#YearMon  AP 
 # 200701 290311 
  #200702 322671 
  #200703 216600 
  #200704 450711 
  #200705 245215 
  #200706 212492 
A.K. 









--
View this message in context: 
http://r.789695.n4.nabble.com/Reshaping-a-dataframe-tp4649874p4649894.html
Sent from the R help mailing list archive at Nabble.com.

[R] Replace <NA> with something else

2012-11-18 Thread eric

I am reading some data into R from an Excel spreadsheet using read.csv. Some
of the original data that comes into column 1 from the spreadsheet is text
that says NA. The NA stands for North America. As it comes in, R converts
the NA over to <NA>. 

What is the cleanest way to change the <NA> values to something else? In
other words, get rid of the brackets ? Maybe convert <NA> to NAM. 
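
One option is to tell read.csv that only empty fields count as missing, via na.strings = "", so the text NA is kept as an ordinary string and can then be recoded. The sketch below uses a made-up three-row CSV; the column names Region, YearMon and kg are assumptions.

```r
# Minimal sketch: keep the literal text "NA" when reading a CSV.
csv_text <- "Region,YearMon,kg
LA,201211,212897
NA,200701,875634
NA,200702,614856"

# With na.strings = "", the string "NA" is no longer converted to a missing value
x <- read.csv(text = csv_text, na.strings = "", stringsAsFactors = FALSE)
kept_as_text <- !anyNA(x$Region)

# Recode to something unambiguous
x$Region[x$Region == "NA"] <- "NAM"
```

If the data are already loaded with real missing values, x$Region[is.na(x$Region)] <- "NAM" is an alternative for a character column; a factor column would need as.character() first.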

635 LA 201207  557329
636 LA 201208  683771
637 LA 201209  613851
638 LA 201210  764217
639 LA 201211  212897
782 <NA> 200701  875634
783 <NA> 200702  614856
784 <NA> 200703  521520
785 <NA> 200704 1406400



--
View this message in context: 
http://r.789695.n4.nabble.com/Replace-NA-with-something-else-tp4649974.html
Sent from the R help mailing list archive at Nabble.com.


Re: [R] Replace <NA> with something else

2012-11-18 Thread eric

I inserted na.strings='' and that seemed to work except for a problem with a
plot statement

plot(x1$NA,type='l',ylab='M kg/ y ',xlab='')

Error: unexpected numeric constant in "plot(x1$NA"

> tail(x1)
           AP   EU   LA    NA total
Jun 2012 2.32 2.26 5.38 13.74 23.70
Jul 2012 2.46 2.21 5.33 12.94 22.94
Aug 2012 2.69 2.24 5.28 13.32 23.54
Sep 2012 2.62 2.28 5.14 12.99 23.06
Oct 2012 2.61 2.27 5.31 12.59 22.80
Nov 2012 2.55 2.18 5.08 12.56 22.39

> str(x1)
‘zoo’ series from Dec 2007 to Nov 2012
  Data: num [1:60, 1:5] 2.02 1.91 1.79 1.66 1.37 1.25 1.32 1.31 1.32 1.31
...
 - attr(*, "dimnames")=List of 2
  ..$ : NULL
  ..$ : chr [1:5] "AP" "EU" "LA" "NA" ...
  Index: Class 'yearmon'  num [1:60] 2008 2008 2008 2008 2008 ...

What's wrong with my plot statement ?
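
NA is a reserved word in R, so x1$NA cannot be parsed as a column reference. Quoting the name with backticks, or indexing by a character string, avoids the problem. The sketch below uses a plain data frame with two made-up rows; the same backtick quoting should apply to $ on a zoo object, though that is not tested here.

```r
# Minimal sketch: referring to a column that is literally named "NA".
x1 <- data.frame(AP = c(2.32, 2.46), LA = c(5.38, 5.33), check.names = FALSE)
x1[["NA"]] <- c(13.74, 12.94)         # add the awkwardly named column

by_backtick <- x1$`NA`                # backticks quote a non-syntactic name
by_string   <- x1[["NA"]]             # character indexing, list style
by_matrix   <- x1[, "NA"]             # character indexing, matrix style

if (interactive()) {
  plot(by_backtick, type = "l", ylab = "M kg/ y", xlab = "")
}
```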



--
View this message in context: 
http://r.789695.n4.nabble.com/Replace-NA-with-something-else-tp4649974p4649988.html
Sent from the R help mailing list archive at Nabble.com.


[R] Help with loess

2012-11-19 Thread eric

Not sure what I'm doing wrong. Can't seem to get loess values. It looks like
loess is returning the same values as the input.

j <-loess(x1$total~as.numeric(index(x1)))

plot(x1$total,type='l', ylab='M coms/y global',xlab='')
lines(loess(total~as.numeric(index(x1)),x1))

The plot statement works fine
No errors with the "lines" statement

But I don't get a smoothed value on the graph. From what I can tell, the
reason is that loess smoothed values are the same as x1$total. In other
words, x1$total -j$y = 0

What am I missing ?

> head(x1)
           AP    EU   LA    NA total
Dec 2007 3.98 14.12 6.42 26.33 50.88
Jan 2008 3.98 13.74 6.41 27.68 51.85
Feb 2008 3.90 13.80 6.39 26.07 50.18
Mar 2008 3.91 13.94 6.51 25.52 49.89
Apr 2008 3.91 14.15 6.48 25.79 50.35
May 2008 3.99 13.99 6.58 23.91 48.49
> mode(x1)
[1] "numeric"
> class(x1)
[1] "zoo"
> str(x1)
‘zoo’ series from Dec 2007 to Nov 2012
  Data: num [1:60, 1:5] 3.98 3.98 3.9 3.91 3.91 3.99 3.88 3.9 3.82 3.81 ...
 - attr(*, "dimnames")=List of 2
  ..$ : NULL
  ..$ : chr [1:5] "AP" "EU" "LA" "NA" ...
  Index: Class 'yearmon'  num [1:60] 2008 2008 2008 2008 2008 ...
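
A probable explanation: the y component of a loess object is the original response, not the smooth, so x1$total - j$y is zero by construction, and lines() on the fitted object likely picks up those x and y components and redraws the raw data. The smoothed values come from fitted() or predict(). A sketch on simulated data (t_num and total are made up):

```r
# Minimal sketch: where the smoothed values live in a loess fit.
set.seed(1)
t_num <- 1:60
total <- 50 - 0.05 * t_num + rnorm(60)

fit <- loess(total ~ t_num)

same_as_input <- all(fit$y == total)   # fit$y is just the input response
smoothed <- fitted(fit)                # the smoothed values

if (interactive()) {
  plot(t_num, total, type = "l")
  lines(t_num, smoothed, col = "red")  # pass x and the fitted values explicitly
}
```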



--
View this message in context: 
http://r.789695.n4.nabble.com/Help-with-loess-tp4650132.html
Sent from the R help mailing list archive at Nabble.com.


Re: [R] how to get bootstrap estimates

2012-11-19 Thread eric

You might want to check out the bootstrap package.

Also consider clarifying what you want to bootstrap  ...mec or vec or what

Lastly, it is not clear what you mean when you say ...

and I have the next errors: 

ro 12 = ro (mec,vec) 
ro 34 = ro (alg,ana) 
ro 35 = ro (alg,sta) 
ro 45 = ro (ana,sta) 
ro 14 = ro (mec,ana) 

I assume ro is actually row. And what other information is there about the
errors in those rows ? What kind of error is there ? I see there is a number
in those rows.






--
View this message in context: 
http://r.789695.n4.nabble.com/how-to-get-bootstrap-estimates-tp4650071p4650134.html
Sent from the R help mailing list archive at Nabble.com.


[R] Plotting specific points with type='l'

2012-11-22 Thread eric

I have a dataframe (x) and I'm plotting the 5th column vs the index. I also
have a vector (v) with a few select points that I want to emphasize with a
dot for those points

> head(x)
period   AP   EU   LA   NA
1 Jan 2007 0.18 0.45 0.19 3.19
2 Feb 2007 0.14 0.48 0.36 3.55
3 Mar 2007 0.14 0.42 0.46 2.61
4 Apr 2007 0.24 0.73 0.32 4.32
5 May 2007 0.19 0.60 0.32 4.40
6 Jun 2007 0.14 0.38 0.32 1.09

v <-c(2,4,7)
plot(x[,5], type='l')

How do I put a solid dot at just the points I want to highlight ? In other
words, a solid dot at the 2nd, 4th, and 7th point on the plot. All other
points according to type='l'. I tried ... points(x[v,5], pch=19) ...
but the points didn't plot in the right spot. 
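
With a single vector, points() plots the values against 1, 2, 3, ..., so the three selected values end up at x = 1, 2, 3. Passing v as the x coordinate should put them in the right place. A sketch using the NA column from the post; the seventh value is made up because only six rows are shown.

```r
# Minimal sketch: highlight selected points on a line plot.
y <- c(3.19, 3.55, 2.61, 4.32, 4.40, 1.09, 2.75)   # last value is invented
v <- c(2, 4, 7)

highlight_x <- v          # x positions are the indices themselves
highlight_y <- y[v]       # y values at those indices

if (interactive()) {
  plot(y, type = "l")
  points(highlight_x, highlight_y, pch = 19)
}
```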



--
View this message in context: 
http://r.789695.n4.nabble.com/Plotting-specific-points-with-type-l-tp4650475.html
Sent from the R help mailing list archive at Nabble.com.


[R] How do I subtract sequential values ?

2010-11-28 Thread eric


Just starting to learn R so excuse me if this is a simple question. I'm
wondering how I get the percent difference in sequential values in one
column of a dataframe. If I had a dataframe and one of the columns was
"value", how would I go about calculating  (v2-v1)/v1 (v3-v2)/v2
(v4-v3)/v3 ...etc ?
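
One way to do this is diff() for the numerators and the same column without its last element for the denominators. A sketch with made-up numbers:

```r
# Minimal sketch: percent change between consecutive values.
df <- data.frame(value = c(100, 110, 99, 120))

# (v2-v1)/v1, (v3-v2)/v2, ...: one element shorter than the column
pct <- diff(df$value) / head(df$value, -1)

# Pad with NA so the result lines up with the rows of the data frame
df$pct_change <- c(NA, pct)
```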
-- 
View this message in context: 
http://r.789695.n4.nabble.com/How-do-I-subtract-sequential-values-tp3063019p3063019.html
Sent from the R help mailing list archive at Nabble.com.


[R] pdf package help files

2010-12-18 Thread eric


Newbie here...just learning

Do most packages come with pdf versions of the help files ? If yes, how do I
access the entire pdf file to be able to print it ? Is there a standard
command for that ? 
-- 
View this message in context: 
http://r.789695.n4.nabble.com/pdf-package-help-files-tp3093926p3093926.html
Sent from the R help mailing list archive at Nabble.com.


[R] package update

2010-12-26 Thread eric


I'm running Linux Ubuntu and tried to update my packages using the
update.packages() command. It appeared to download the updates ok but I got
the following message:


The downloaded packages are in ‘/tmp/RtmpFM82Ry/downloaded_packages’
Warning in install.packages(update[instlib == l, "Package"], l, contriburl =
contriburl,  :
  'lib = "/usr/lib/R/site-library"' is not writable
Error in install.packages(update[instlib == l, "Package"], l, contriburl =
contriburl,  : 
  unable to install packages
Calls: update.packages -> install.packages

What does this mean ? And more importantly, how do I address it ?
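
The message says the system library is owned by root, so an ordinary user cannot write to it. Besides running R with root privileges, one option is to install into a personal library. The helper below (use_personal_library, a made-up name) is a sketch that assumes the R_LIBS_USER environment variable is set, as it normally is.

```r
# Minimal sketch: create a personal library and put it first on the search path.
use_personal_library <- function(path = Sys.getenv("R_LIBS_USER")) {
  stopifnot(nzchar(path))                        # fail early if no path is available
  dir.create(path, recursive = TRUE, showWarnings = FALSE)
  .libPaths(c(path, .libPaths()))                # personal library takes precedence
  invisible(.libPaths()[1])
}

# Typical use (not run here, needs network access):
# use_personal_library()
# update.packages(instlib = .libPaths()[1], ask = FALSE)
```

Updated copies then live in the personal library and mask the older ones in the system library.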
-- 
View this message in context: 
http://r.789695.n4.nabble.com/package-update-tp3164690p3164690.html
Sent from the R help mailing list archive at Nabble.com.


[R] data frame column name change

2011-01-16 Thread eric


How do I change the name of one column in a data frame ? Suppose I have a
data frame x with 5 columns. If the names were date, col1, col2, col3, col4
and I wanted to simply change the name of date, what would the command be ?
I tried the following and it didn't seem to work :

names(x[1]) <- "newname"

Thanks in advance for the comments
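
The attempt above appears to rename a one-column extract that is then written back under the old name, so nothing changes. Indexing the names vector instead should work. A sketch on a small made-up data frame:

```r
# Minimal sketch: rename a single column of a data frame.
x <- data.frame(date = 1:2, col1 = 3:4, col2 = 5:6)

names(x)[1] <- "newname"                     # rename by position
names(x)[names(x) == "col2"] <- "other"      # rename by current name
```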
-- 
View this message in context: 
http://r.789695.n4.nabble.com/data-frame-column-name-change-tp3220684p3220684.html
Sent from the R help mailing list archive at Nabble.com.


[R] Log difference in a dataframe column

2011-01-17 Thread eric


What am I doing wrong here ? And what's the right way to calculate the log
differences in a column in a df ?

# first 3 rows of 5000 rows
y[1:3,]

 Date  Open  High   Low Close
1 1983-03-30 29.96 30.51 29.96 30.35
2 1983-03-31 30.35 30.55 30.20 30.24
3 1983-04-04 30.25 30.65 30.24 30.39

#equation in question ...why is this giving zeros ?
y1 <- 100*log(y[,5]/(lag(y[,5],1)))

# first 10 values from the equation...all zeros
head(y1,10)
 [1] 0 0 0 0 0 0 0 0 0 0
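
A probable cause: on a plain numeric vector, stats::lag() only shifts the time attribute and leaves the values where they are, so y/lag(y) is 1 everywhere and its log is 0. diff(log(...)) gives the log differences directly. A sketch using the three Close values shown above:

```r
# Minimal sketch: log differences without lag().
close <- c(30.35, 30.24, 30.39)

lagged <- stats::lag(close, 1)
unchanged <- all(as.numeric(lagged) == close)   # values are not shifted

log_ret <- 100 * diff(log(close))               # 100 * log(close[t] / close[t-1])
y1 <- c(NA, log_ret)                            # pad so it lines up with the rows
```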
-- 
View this message in context: 
http://r.789695.n4.nabble.com/Log-difference-in-a-dataframe-column-tp3221225p3221225.html
Sent from the R help mailing list archive at Nabble.com.


Re: [R] Log difference in a dataframe column

2011-01-18 Thread eric


Just learning so excuse me if I'm being too basic here. But I'm wondering how
should I know that as.ts would be needed for lag ? Is there a thought
process or way to inspect that I should have gone through to know that log
would work on y[,5] but lag would not work on y[,5] ? 

Is the general rule that lag is not in the base package and therefore would
not work ?

Thanks for the comments

-- 
View this message in context: 
http://r.789695.n4.nabble.com/Log-difference-in-a-dataframe-column-tp3221225p3224478.html
Sent from the R help mailing list archive at Nabble.com.


[R] Vectorization

2011-01-23 Thread eric


Is there a way to vectorize this loop or a smarter way to do it ?

y
 [1]  0.003990746 -0.037664639  0.005397999  0.010415496  0.003500676
 [6]  0.001691775  0.008170774  0.011961998 -0.016879531  0.007284486
[11] -0.015083581 -0.006645958 -0.013153103  0.028148639 -0.005724317
[16] -0.027408025  0.014767422 -0.001619691  0.018334730 -0.009747171

x <-numeric(length(y))
for (i in 1 :length(y)) {
x[i] <- ifelse( i==1, 10000*(1+y[i]), (1+y[i])*x[i-1])
}

x
 [1] 10039.907  9661.758  9713.912  9815.087  9849.447  9866.110  9946.724
 [8] 10065.706  9895.802  9967.888  9817.536  9752.289  9624.016  9894.919
[15]  9838.278  9568.630  9709.934  9694.207  9871.948  9775.724

Basically trying to see how the equity of an investment changes after each
return period. Start with $10,000 and a series of returns over time. Figure
out the equity after each time period (return).
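
The loop is a running product, so cumprod() should give the same result in one call. A sketch using the first four returns from the post and the $10,000 starting equity:

```r
# Minimal sketch: equity curve as a cumulative product of (1 + return).
y <- c(0.003990746, -0.037664639, 0.005397999, 0.010415496)

equity <- 10000 * cumprod(1 + y)

# Loop version, kept only to check that the two agree
x <- numeric(length(y))
for (i in seq_along(y)) {
  x[i] <- if (i == 1) 10000 * (1 + y[i]) else (1 + y[i]) * x[i - 1]
}
```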



-- 
View this message in context: 
http://r.789695.n4.nabble.com/Vectorization-tp3233340p3233340.html
Sent from the R help mailing list archive at Nabble.com.


[R] Using diff and transform

2011-01-25 Thread eric


I want to use diff to take the differences in a column "Close" of a data
frame "y". But I'd like to do it using the transform function so a new data
frame is created with a difference column. The problem is that diff gives
one less than the number of elements in the original data frame. So
transform gives an error message. I know the first element of diff should be
NA or ideally 0. But not sure how I get there. Any suggestions ?


j
Date   Close
1 11/20/1985 156.412
2 11/21/1985 155.112
3 11/22/1985 154.182
4 11/25/1985 154.192
5 11/26/1985 153.722
6 11/27/1985 153.712

d <-diff(j[,2])
d
[1] -1.30 -0.93  0.01 -0.47 -0.01

j <- transform(j, delta=diff(j[,2]))

Error in data.frame(list(Date = c(5620L, 5639L, 5657L, 5706L, 5721L, 5736L : 
  arguments imply differing number of rows: 6, 5
Calls: transform -> transform.data.frame -> do.call -> data.frame
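
Padding the differences with a leading NA (or 0) makes the new column the same length as the data frame, which is what transform() needs. A sketch using the six rows shown above:

```r
# Minimal sketch: add a difference column with transform().
j <- data.frame(
  Date  = c("11/20/1985", "11/21/1985", "11/22/1985",
            "11/25/1985", "11/26/1985", "11/27/1985"),
  Close = c(156.412, 155.112, 154.182, 154.192, 153.722, 153.712)
)

# Inside transform(), columns can be referred to by name
j <- transform(j, delta = c(NA, diff(Close)))

# Use c(0, diff(Close)) instead if a zero is preferred for the first row
```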

-- 
View this message in context: 
http://r.789695.n4.nabble.com/Using-diff-and-transform-tp3237397p3237397.html
Sent from the R help mailing list archive at Nabble.com.


[R] why doesn't ifelse work ?

2011-04-28 Thread eric

I have the following lines of code:

ind <- rollapply(GSPC, 200, mean)
signal <- ifelse(diff(ind, 5) > 0 , 1 , -1)
signal[is.na(signal)] <- 0

I never get a value of -1 for signal even though I know diff(ind , 5) is
less than zero frequently. It looks like when diff(ind , 5) is less than
zero, signal gets set to 0 instead of - 1. Any ideas why ?  Here's some
information on ind and diff(ind, 5) :

> mode(diff(ind, 5) >0)
[1] "logical"
> class(diff(ind, 5) >0 )
[1] "zoo"
> str(diff(ind, 5) > 0 )
‘zoo’ series from 1990-05-31 to 2010-12-02
  Data: logi [1:5171, 1] FALSE FALSE FALSE FALSE FALSE FALSE ...
 - attr(*, "dimnames")=List of 2
  ..$ : NULL
  ..$ : chr "GSPC.Adjusted"
  Index:  Date[1:5171], format: "1990-05-31" "1990-06-01" "1990-06-04"
"1990-06-05" "1990-06-06" "1990-06-07" "1990-06-08" "1990-06-11" ...
> class(ind)
[1] "zoo"
> mode(ind)
[1] "numeric"
> str(ind)
‘zoo’ series from 1990-05-23 to 2010-12-02
  Data: num [1:5176, 1] 339 339 338 338 338 ...
 - attr(*, "dimnames")=List of 2
  ..$ : NULL
  ..$ : chr "GSPC.Adjusted"
  Index:  Date[1:5176], format: "1990-05-23" "1990-05-24" "1990-05-25"
"1990-05-29" "1990-05-30" "1990-05-31" "1990-06-01" "1990-06-04" 

--
View this message in context: 
http://r.789695.n4.nabble.com/why-doesn-t-ifelse-work-tp3482680p3482680.html
Sent from the R help mailing list archive at Nabble.com.


Re: [R] why doesn't ifelse work ?

2011-04-28 Thread eric

require(quantmod)
require(PerformanceAnalytics)
rm(list=ls())
getSymbols("^GSPC", src="yahoo", from="1990-01-01", to=Sys.Date())
GSPC <-na.omit(Ad(GSPC))
ind <- rollapply(GSPC, 200, mean)
signal <- ifelse(diff(ind, 5) > 0 , 1 , -1)
signal[is.na(signal)] <- 0

--
View this message in context: 
http://r.789695.n4.nabble.com/why-doesn-t-ifelse-work-tp3482680p3482737.html
Sent from the R help mailing list archive at Nabble.com.


Re: [R] why doesn't ifelse work ?

2011-04-28 Thread eric

from the console ...

> table(signal)
signal
   0    1 
1286 3885 

note there is no -1 value.

This is consistent with what I see if I plot(signal). When I issue that
statement from the console, I see signal vary between 0 and 1.0 but it never
goes to - 1
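
The cause is not established in this thread. One diagnostic is to run the same comparison on plain numbers: the sketch below shows that ifelse() itself does return -1 on an ordinary numeric vector, which would point at how the time-series class handles the result rather than at ifelse(). The values of ind are made up; with zoo loaded, coredata(ind) and index(ind) would give the plain data and the dates (not run here).

```r
# Minimal sketch: the same logic on a plain numeric vector.
ind <- c(339.0, 339.2, 338.9, 338.5, 338.4, 338.0, 338.6, 339.5, 340.1)

d <- diff(ind, lag = 5)               # change over 5 periods
signal <- ifelse(d > 0, 1, -1)        # +1 when rising, -1 otherwise

counts <- table(signal)
```

If this gives both -1 and 1 but the zoo version does not, computing the signal on coredata() and rebuilding the series from the result and the original index is one workaround to try.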



--
View this message in context: 
http://r.789695.n4.nabble.com/why-doesn-t-ifelse-work-tp3482680p3482792.html
Sent from the R help mailing list archive at Nabble.com.


[R] package update

2011-05-08 Thread eric

I tried to update my packages using update.packages() 

I got the following message:

The downloaded packages are in
‘/tmp/RtmpyDYdTX/downloaded_packages’
Warning in install.packages(update[instlib == l, "Package"], l, contriburl =
contriburl,  :
  'lib = "/usr/lib/R/library"' is not writable
Error in install.packages(update[instlib == l, "Package"], l, contriburl =
contriburl,  : 
  unable to install package

How do I fix this ?

--
View this message in context: 
http://r.789695.n4.nabble.com/package-update-tp3507479p3507479.html
Sent from the R help mailing list archive at Nabble.com.


Re: [R] package update

2011-05-10 Thread eric

ok, how do I do root permissions from the RStudio GUI ?  What is the specific
command that allows me to do that when I type ...update.packages()  ?

Also, why are the packages installing in the first place if I can't write to
that location ?

Currently running linux ubuntu 10.04 by the way.

--
View this message in context: 
http://r.789695.n4.nabble.com/package-update-tp3507479p3510977.html
Sent from the R help mailing list archive at Nabble.com.


[R] Need help with text processing / string split

2011-05-15 Thread eric

I used screen scraping to extract some information and put it into a table
called tbl. Now I want to modify the table a bit so the data can be more
useful. Here's the code I used:

library(XML)
rm(list=ls())
url <-
"http://webapp.montcopa.org/sherreal/salelist.asp?saledate=05/25/2011";
tbl <-data.frame(readHTMLTable(url))[2:405, c(3,5,6,8,9)]
names(tbl) <- c("Address", "Township", "Parcel", "Sale Date", "Costs")

tbl is attached as txt for your convenience. Entries in the last column of
the dataframe (tbl$Costs) appear as follows: $173,933.60$2,410.28  . 
http://r.789695.n4.nabble.com/file/n3524793/tbl.txt tbl.txt 

How do I:

1. Split the string
2. Have the two values show up as actual numbers that can be used
3. Put the numbers in two separate columns of the dataframe.

In other words $173,933.60$2,410.28 would show up as 173933.60 in one column
and 2410.28 would show up in a second column of tbl

I tried using strsplit but I could not get it working properly. 
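
One way this could be approached, as a minimal sketch: it assumes every entry holds exactly two dollar amounts, uses a small made-up vector `costs` in place of the real column, and the output column names `Debt` and `Court` are only illustrative.

```r
# Hypothetical stand-in for the Costs column returned by readHTMLTable()
costs <- c("$173,933.60$2,410.28", "$98,100.00$1,950.75")

# Drop the thousands separators, then split on the literal dollar sign
parts <- strsplit(gsub(",", "", costs), "$", fixed = TRUE)

# Each element looks like c("", "173933.60", "2410.28"); keep the two amounts
amounts <- t(sapply(parts, function(p) as.numeric(p[2:3])))

# Two numeric columns, one row per original entry
result <- data.frame(Debt = amounts[, 1], Court = amounts[, 2])
print(result)
```

The `fixed = TRUE` argument matters here because `$` is otherwise treated as a regular-expression anchor.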

 

--
View this message in context: 
http://r.789695.n4.nabble.com/Need-help-with-text-processing-string-split-tp3524793p3524793.html

[R] What am I doing wrong with sapply ?

2011-05-25 Thread eric

Statement 9 using sapply does not seem to give the correct answer (or at
least to me). Yet I do what I think is the same thing with statement 11 and
I get the answer I'm looking for. 

9 : s <-sapply(unlist(v[c(1:length(v))]), max)
11: for(i in 1 :length(v)) v1[i] <- max(unlist(v[i]))

Shouldn't I get the same answer ? 


library(XML)
rm(list=ls())
url <-
"http://webapp.montcopa.org/sherreal/salelist.asp?saledate=05/25/2011";
tbl <-data.frame(readHTMLTable(url))[2:404, c(3,5,6,8,9)]
names(tbl) <- c("Address", "Township", "Parcel", "SaleDate", "Costs");
rownames(tbl) <- c(1:length(tbl[,1]))
x <-tbl
v <- gregexpr("( aka )|( AKA )",x$Address)
s <-sapply(unlist(v[c(1:length(v))]), max)
v1 <-numeric(length(v))
for(i in 1 :length(v)) v1[i] <- max(unlist(v[i]))
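
A likely reason the two statements differ is that unlist() flattens the whole list before sapply sees it, so max() is applied to one match position at a time instead of to each address. A small sketch with made-up addresses (the vector `addr` is hypothetical):

```r
# Small stand-in for x$Address
addr <- c("12 Main St aka 14 Main St", "5 Oak Ave", "1 A St AKA 2 B St aka 3 C St")
v <- gregexpr("( aka )|( AKA )", addr)

# unlist() flattens the list first, so max() sees one number at a time
flat <- sapply(unlist(v), max)

# Applying max() to each list element matches what the for loop does
s <- sapply(v, max)

v1 <- numeric(length(v))
for (i in seq_along(v)) v1[i] <- max(unlist(v[i]))

print(length(flat))                  # one entry per match, not per address
print(identical(as.numeric(s), v1))  # the per-element version agrees with the loop
```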

--
View this message in context: 
http://r.789695.n4.nabble.com/What-am-I-doing-wrong-with-sapply-tp3551598p3551598.html

[R] newbie xml parsing question

2011-05-28 Thread eric

I am trying to read some data off the Zillow site. I'm a newbie to XML, HTML,
parsing and the XML package. I've been able to load the web page I'm
interested in with the following code, but I'm not sure of the next step to get
the information I want into R:

library(XML)
url <- "http://www.zillow.com/homes/511 W Lafayette St, Norristown, PA_rb"
doc <- htmlTreeParse(url, isURL=TRUE)
doc

I'd like to be able to pull the following information into R 

href home details string : 

/homedetails/236-Arundel-Ave-Horsham-PA-19044/9933810_zpid/#{scid=hdp-site-map-bubble-address}

value for Zestimate \ Price: $239,000
Beds : 3
Baths: 1.0
Sqft :1630

I noticed all that information is in "doc". The section of doc where the
information is contained is shown below. How do I go about extracting this
information and getting it into R for the general case where the address in
url will change ?

LatLong.createFromDegrees(40.187567, -75.125861),
" 
  
  9933810  
 
\"/homedetails/236-Arundel-Ave-Horsham-PA-19044/9933810_zpid/#{scid=hdp-site-map-bubble-address}\"
 
236 Arundel Ave, Horsham, PA    Zestimate®: $239,000  \"#\"  
   Close  
Zestimate  A Zestimate® home valuation is
Zillow's estimated market value. It is not an appraisal. Use it as a
starting point to determine a home's value. Learn more   Mortgage payment: $963/mo   
\"/mortgage-rates/#{scid=mor-site-mapbubrates}\" See rates   
  Beds: 3 Baths:
1.0 Sqft: 1,630 Lot: 21,745 
   
\"/homedetails/236-Arundel-Ave-Horsham-PA-19044/9933810_zpid/#{scid=hdp-site-map-bubble-details}\"
Details \"#\" Views   
  Save  Close 
   "
)
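
If the bubble text can be obtained as a single character string, a regular-expression approach in base R might be enough. This is only a sketch: the string `bubble` is a hand-copied, simplified version of the text above (without the registered-trademark sign), and the helper name `grab` is made up for illustration.

```r
# Hypothetical string holding a simplified copy of the bubble text
bubble <- "236 Arundel Ave, Horsham, PA Zestimate: $239,000 Beds: 3 Baths: 1.0 Sqft: 1,630 Lot: 21,745"

# Pull the first number that follows a label such as "Beds:"
grab <- function(label, text) {
  m <- regmatches(text, regexpr(paste0(label, ":\\s*\\$?[0-9][0-9,.]*"), text))
  # Keep what follows the colon and strip everything but digits and the decimal point
  as.numeric(gsub("[^0-9.]", "", sub(".*:", "", m)))
}

info <- c(price = grab("Zestimate", bubble),
          beds  = grab("Beds", bubble),
          baths = grab("Baths", bubble),
          sqft  = grab("Sqft", bubble))
print(info)
```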


--
View this message in context: 
http://r.789695.n4.nabble.com/newbie-xml-parsing-question-tp3558067p3558067.html

[R] Need help reading website info with XML package and XPath

2011-05-30 Thread eric

Hi, I'm looking for help extracting some information from the Zillow website.
I'd like to do this for the general case where I manually change the address
by modifying the url (see code below). With the url containing the address,
I'd like to be able to extract the same information each time. The specific
information I'd like to be able to extract includes the homedetails url,
price (zestimate), number of beds, number of baths, and the Sqft. All this
information is shown in a bubble on the webpage.

I use the code below to try to do this, but it's not working. I know the
information I'm interested in is there because if I print out "doc", I see it
all in one area. I've attached the relevant section of "doc" that shows and
highlights all the information I'm interested in (note that either url
that's highlighted in doc is fine). 
http://r.789695.n4.nabble.com/file/n3561075/relevant-section-of-doc.pdf
relevant-section-of-doc.pdf 

I'm guessing my xpath statements are wrong or getNodeSet needs something
else to get to information contained in a bubble on a webpage. Any
suggestions or ideas would be GREATLY appreciated. 


library(XML)
url <- "http://www.zillow.com/homes/511 W Lafayette St, Norristown, PA_rb"
doc <- htmlTreeParse(url, useInternalNodes=TRUE, isURL=TRUE)
f1 <- getNodeSet(doc, "//a[contains(@href,'homedetails')]")
f2 <- getNodeSet(doc, "//span[contains(@class,'price')]")
f3 <- getNodeSet(doc, "//LIST[@Beds]")
f4 <- getNodeSet(doc, "//LIST[@Baths]")
f5 <- getNodeSet(doc, "//LIST[@Sqft]")
g1 <-sapply(f1, xmlValue)
g2 <-sapply(f2, xmlValue)
g3 <-sapply(f3, xmlValue)
g4 <-sapply(f4, xmlValue)
g5 <-sapply(f5, xmlValue)
print(f1)



--
View this message in context: 
http://r.789695.n4.nabble.com/Need-help-reading-website-info-with-XML-package-and-XPath-tp3561075p3561075.html

Re: [R] advice?

2010-05-05 Thread Eric

Download the trial version of UltraEdit (Windows only) to open, inspect,
and edit the file. Rename columns as needed. Set a long tab stop,
find/replace your delimiters with tabs (^t), and use column mode to remove
unneeded columns. 

You can also split the file and check whether loading it in increments helps.  
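
Loading in increments can also be tried from within R without splitting the file on disk. A rough sketch, assuming a whitespace-delimited file with no header; the demo file, the chunk size and the per-chunk processing are all placeholders:

```r
# Write a small demo file; the real file path and layout are not known here
path <- tempfile(fileext = ".dat")
write.table(matrix(1:60, nrow = 12), path, row.names = FALSE, col.names = FALSE)

con <- file(path, open = "r")
chunk_rows <- 5          # hypothetical chunk size
total <- 0
repeat {
  txt <- readLines(con, n = chunk_rows)      # next block of raw lines
  if (length(txt) == 0) break                # end of file reached
  chunk <- read.table(text = txt, header = FALSE)
  total <- total + nrow(chunk)               # replace with real per-chunk processing
}
close(con)
print(total)
```

Only one chunk is held in memory at a time, which is the point of the exercise.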


"Daniel Malter"  wrote:
> Hi, on the one hand, you write "fairly large," on the other hand, you
> write
> "should be readable by anything." The warnings indicate that you are
> plain
> out of memory at some point. Not too surprising, given that your
> dataset has
> about 45 rows and 720 columns. You may search the r-help files
> first for
> how to allocate memory/how to read large files, since these questions
> are
> asked frequently.
> 
> The error, however, seems to refer to the fact that there are columns
> with
> identical column names, which is not allowed.
> 
> Daniel
> 
> -
> cuncta stricte discussurus
> -
> 
> -Original Message-
> From: r-help-boun...@r-project.org
> [mailto:r-help-boun...@r-project.org] On
> Behalf Of Carson Baughman
> Sent: Monday, May 03, 2010 6:17 PM
> To: R-help@r-project.org
> Subject: [R] advice?
> 
> All-
>Thank you in advance for any help you might be able to lend. 
> Here is
> my issue.  I am trying to open a fairly large .dat file.  The file
> originally was downloaded as a GZ file but I unzipped it (with 7-zip)
> into
> it's current 1.86 gig .dat format.  I know that the data is "just a
> plain
> ASCII file with 720 columns and 360 rows per time step (month). It
> should be
> readable by anything!"  There are 1272 steps.  Here is what happens
> when I
> try to assign the file to an object:
> 
> > clds<-read.table("C:\\CRU
> Data\\TS3.0\\Cloud\\cru_ts_3_00.1901.2006.cld.dat", header = TRUE,
> row.names
> = 1)
>Error in read.table("C:\\CRU
> Data\\TS3.0\\Cloud\\cru_ts_3_00.1901.2006.cld.dat",  :
>duplicate 'row.names' are not allowed
>   In addition: There were 45 warnings (use warnings() to see them)
> >warnings()
>1: In scan(file, what, nmax, sep, dec, quote, skip, nlines, 
> ... :
>Reached total allocation of 1535Mb: see help(memory.size)
>X 25
>26: In type.convert(data[[i]], as.is = as.is[i], dec = dec, 
> ... :
>Reached total allocation of 1535Mb: see help(memory.size)
> 
>


[R] How do I fix this ?

2011-01-26 Thread eric


Just when I think I'm starting to learn 

Statement z1 works, statement z doesn't. Why doesn't z work and what do I do
to fix it ? Clearly the problem is with the first NA, but I would think it's
handled through the loop vectorization.


y1 <- rnorm(20, 0, .013)

y1
 [1] -0.0068630836 -0.0101106230 -0.0169663344 -0.0066314769  0.0075063818
 [6] -0.0033548024  0.0015647863  0.0119815982 -0.0021430336  0.0044617167
[11]  0.0053447708 -0.0005590323  0.0063195781  0.0073059640 -0.0181872678
[16] -0.0098094568  0.0013679040 -0.0028490887 -0.0131129191  0.0126610358

z1 <- ifelse(is.na(y1), 10000, 10000*cumprod(1+y1))

z1
 [1] 9931.369 9830.957 9664.162 9600.074 9672.136 9639.688 9654.772 9770.451
 [9] 9749.513 9793.012 9845.354 9839.850 9902.034 9974.378 9792.971 9696.907
[17] 9710.172 9682.506 9555.541 9676.524



y <-c(NA, rnorm(19,0, .013))

y
 [1]            NA  0.0056258152 -0.0117690116  0.0163961630  0.0007818773
 [6]  0.0007761957  0.0139769376  0.0041086982 -0.0049545337  0.0059587216
[11] -0.0079022056  0.0083076357 -0.0075823658  0.0173806814 -0.0034915869
[16] -0.0045480358  0.0168642491  0.0038681635 -0.0123010077  0.0087494624

z <-ifelse(is.na(y), 10000, 10000*cumprod(1+y))

z
 [1] 10000    NA    NA    NA    NA    NA    NA    NA    NA    NA    NA    NA
[13]    NA    NA    NA    NA    NA    NA    NA    NA
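
The probable cause is that cumprod() is evaluated on the whole vector before ifelse() looks at it, and a single NA makes every later running product NA. A sketch of one workaround, assuming a missing return should count as zero (the seed is arbitrary, and any starting value can be multiplied in afterwards):

```r
set.seed(1)                       # arbitrary seed so the sketch is repeatable
y <- c(NA, rnorm(19, 0, .013))

# Treat a missing return as 0 so it contributes a factor of 1 to the product
z <- cumprod(1 + ifelse(is.na(y), 0, y))

print(z[1])             # the NA position becomes 1
print(any(is.na(z)))    # no NA is propagated
```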

-- 
View this message in context: 
http://r.789695.n4.nabble.com/How-do-I-fix-this-tp3239239p3239239.html

[R] There must be a smarter way

2011-01-27 Thread eric


Newbie and trying to learn the right way of doing things in R. In this case,
I just have that feeling that my convoluted line of code is way more
complicated than it needs to be. Please help me see the easier way.

I want to do something pretty simple. I have a dataframe called x that is
6945 rows long. I'd like to create a vector rtn = log(x[2,2]/x[1,3]),
then log(x[3,2]/x[2,3]), then log(x[4,2]/x[3,3])
...log(x[6945,2]/x[6944,3]). I also want to put zero as the first element.

I know I can do it with a loop but I'd like to figure out the simple way to
vectorize it. Here's my solution (it works but it's sure complicated
looking) :

rtn <-c(0,log(x[2:length(x[,1]),2]/x[1:length(x[,1])-1,3]))

Here's what x looks like:

head(x)
Date  Open Close
1 03/30/1983 29.96 30.35
2 03/31/1983 30.35 30.24
3 04/04/1983 30.25 30.39
4 04/05/1983 30.45 30.66
5 04/06/1983 30.85 30.85
6 04/07/1983 30.85 31.12
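
A somewhat shorter form might use negative indices and column names. A sketch using the first three rows shown above:

```r
# First rows of x as shown in head(x)
x <- data.frame(Date  = c("03/30/1983", "03/31/1983", "04/04/1983"),
                Open  = c(29.96, 30.35, 30.25),
                Close = c(30.35, 30.24, 30.39))

n <- nrow(x)
# Open of row i divided by Close of row i-1, with 0 in the first slot
rtn <- c(0, log(x$Open[-1] / x$Close[-n]))
print(rtn)
```

Dropping the first Open and the last Close lines the two vectors up so that no explicit index arithmetic is needed.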
-- 
View this message in context: 
http://r.789695.n4.nabble.com/There-must-be-a-smarter-way-tp3243651p3243651.html

[R] How to do a moving window on standard deviation

2011-01-30 Thread eric


I'd like to use vectorization to take a 4 point moving window on standard
deviation on the close column and create another variable (st.dev) in the
dataframe. Here's the dataframe


head(xyz)
Date Close
1 2011-01-28 56.42
2 2011-01-27 57.37
3 2011-01-26 56.48
4 2011-01-25 56.39
5 2011-01-24 55.74
6 2011-01-21 55.46

So the first 3 elements of the new st.dev column would be zero (rep(0,3)),
then the 4th element of the new st.dev column would be the standard deviation
of the first 4 closes. The next element would be the sd of Close[5]:Close[2], then
the sd of Close[6]:Close[3] ...and so on until the last row of xyz.

There must be an easy vectorized way to do this but I don't see it. Sorry for
the basic question, but I'm still figuring this new language out.

Thanks in advance for the help
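
One base-R possibility is embed(), which lays out each run of consecutive values as a matrix row. A sketch using the six closes shown above; apply() still loops internally, so this avoids an explicit loop rather than being vectorized in the strict sense:

```r
# Close column from head(xyz)
xyz <- data.frame(Close = c(56.42, 57.37, 56.48, 56.39, 55.74, 55.46))

w <- 4                                   # window length
# embed() returns one row per window of w consecutive values
windows <- embed(xyz$Close, w)
xyz$st.dev <- c(rep(0, w - 1), apply(windows, 1, sd))
print(xyz)
```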

-- 
View this message in context: 
http://r.789695.n4.nabble.com/How-to-do-a-moving-window-on-standard-deviation-tp3247566p3247566.html

Re: [R] How to do a moving window on standard deviation

2011-01-30 Thread eric


I'd rather do this without getting into zoo objects at the moment. OK, how
about with a simple loop ? Why doesn't this work ?

attach(xyz)
j <- for(i in 4: length(Close)) sd(Close[i]:Close[(i-3)])
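
Two things probably go wrong in that line: a for loop itself returns NULL, so nothing is stored in j, and Close[i]:Close[(i-3)] builds a sequence between two prices instead of selecting elements i-3 to i. A sketch of a corrected loop, using the closes from head(xyz):

```r
Close <- c(56.42, 57.37, 56.48, 56.39, 55.74, 55.46)   # values from head(xyz)

j <- numeric(length(Close))          # results must be stored explicitly
for (i in 4:length(Close)) {
  j[i] <- sd(Close[(i - 3):i])       # index range, not Close[i]:Close[i-3]
}
print(j)
```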
-- 
View this message in context: 
http://r.789695.n4.nabble.com/How-to-do-a-moving-window-on-standard-deviation-tp3247566p3247611.html

[R] Accessing DF index

2011-02-19 Thread eric


I have a dataframe called x2. It seems to have a date column but I can't
access it or give it a name or convert it to a date. How do I refer to that
first column and make it a date ? When I try x2[1,] I get the second column.

head(x2)
                  FAIRX        SP500        delta
2000-08-31  0.010101096  0.007426964  0.002674132
2000-09-29  0.096679730 -0.054966292  0.151646023
2000-10-31 -0.008245580 -0.004961785 -0.003283795
2000-11-30  0.037024545 -0.083456134  0.120480679
2000-12-29  0.080042708  0.004045193  0.075997514
2001-01-31 -0.009042396  0.034050246 -0.043092643

x[1,1]
FAIRX
2000-08-31 0.01010110

x[1,2]
 SP500
2000-08-31 0.007426964

str(x2)
'data.frame':   127 obs. of  3 variables:
 $ FAIRX: num  0.0101 0.09668 -0.00825 0.03702 0.08004 ...
 $ SP500: num  0.00743 -0.05497 -0.00496 -0.08346 0.00405 ...
 $ delta: num  0.00267 0.15165 -0.00328 0.12048 0.076 ..
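
The str() output lists only three variables, which suggests the dates are row names rather than a column. A minimal sketch under that assumption, using a made-up three-row stand-in for x2:

```r
# Small stand-in for x2 with dates stored as row names
x2 <- data.frame(FAIRX = c(0.0101, 0.0967, -0.0082),
                 SP500 = c(0.0074, -0.0550, -0.0050),
                 row.names = c("2000-08-31", "2000-09-29", "2000-10-31"))

x2$Date <- as.Date(rownames(x2))     # ISO dates parse without a format string
rownames(x2) <- NULL                 # optional: fall back to plain row numbers

# The new column can then be used like any other, for example in subset()
recent <- subset(x2, Date >= as.Date("2000-09-01"))
print(recent)
```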
-- 
View this message in context: 
http://r.789695.n4.nabble.com/Accessing-DF-index-tp3314649p3314649.html

Re: [R] Accessing DF index

2011-02-19 Thread eric


So how would I convert those row names to dates and give that column the name
"Date" so that I can use subset and other functions on the Date column ?
-- 
View this message in context: 
http://r.789695.n4.nabble.com/Accessing-DF-index-tp3314649p3314689.html

[R] Plotting two lines on a graph when using par(mfrow=)

2011-02-27 Thread eric

Basic question but still learning 

How do I plot two lines (f$equity and f$bh.equity) on one of the three
graphs under mfrow ? I tried putting brackets around the first plot and
lines command but that didn't work.

par(mfrow=c(3,1))
{plot(f$Date,f$equity, col="blue", type="l", main="equity")
lines(f$bh.equity, col="gray")}
plot(f$Date,f$indicator, col="green", type="l", main="indicator")
plot(f$Date, f$SPY, col="red", type="l", main="SPY")

What I want is the first graph to have two lines(equity and bh.equity), then
the next two graphs to have one line each.
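
The braces are probably not the issue: lines() adds to whichever panel was drawn last, but it needs the same x values as the plot, otherwise the second series is drawn against 1..n and falls outside the date axis. A sketch with a made-up data frame standing in for f, drawn to a null device:

```r
# Made-up data frame standing in for f
f <- data.frame(Date = as.Date("2011-01-01") + 0:9,
                equity = cumsum(rnorm(10)), bh.equity = cumsum(rnorm(10)),
                indicator = rnorm(10), SPY = 100 + cumsum(rnorm(10)))

pdf(NULL)                         # null device for this sketch; omit to see the plot
par(mfrow = c(3, 1))
# ylim covers both series so the second line is not clipped
plot(f$Date, f$equity, col = "blue", type = "l", main = "equity",
     ylim = range(f$equity, f$bh.equity))
lines(f$Date, f$bh.equity, col = "gray")   # same x values as the plot above
plot(f$Date, f$indicator, col = "green", type = "l", main = "indicator")
plot(f$Date, f$SPY, col = "red", type = "l", main = "SPY")
panels <- par("mfrow")
dev.off()
```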

-- 
View this message in context: 
http://r.789695.n4.nabble.com/Plotting-two-lines-on-a-graph-when-using-par-mfrow-tp3326979p3326979.html

[R] What am I doing wrong with this loop ?

2011-03-02 Thread eric

What is wrong with this loop ?  I am getting an error saying incorrect number
of dimensions y[i,2]

x <- as.data.frame(runif(2000, 12, 38))
z <-numeric(length(x))
y <- as.data.frame(z)
for(i in 1:length(x)) {
  y <- ifelse(i < 500, as.data.frame(lowess(x[1:i,1], f=1/9)) ,
as.data.frame(lowess(x[(i-499):i,1], f=1/9)))  
  z[i] <-y[i,2]
}

--
View this message in context: 
http://r.789695.n4.nabble.com/What-am-I-doing-wrong-with-this-loop-tp3332703p3332703.html

Re: [R] What am I doing wrong with this loop ?

2011-03-03 Thread eric

Bill, I addressed the first issue with the data frames and length(x). But my
loop still isn't working. More importantly, you commented that I should be
using if(...) ... else ... rather than ifelse(.,.,).

Please help me understand the difference. I thought ifelse was just a faster
way of writing if(...) ... else ...

What is the difference between these two methods ?
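
As far as I understand it, if/else picks one whole expression based on a single TRUE/FALSE, while ifelse() works element by element and returns something shaped like its test. A small sketch with made-up values:

```r
i <- 3
x <- c(10, 20, 30, 40)

# if/else chooses one whole expression based on a single TRUE/FALSE
a <- if (i < 500) x[1:i] else x[(i - 499):i]

# ifelse() is elementwise: the result has the shape of the test, here length 1
b <- ifelse(i < 500, x[1:i], x[(i - 499):i])

print(a)   # all three selected values
print(b)   # only the first element of the "yes" argument
```

That would explain why a whole data frame cannot be returned through ifelse() with a length-one test.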

--
View this message in context: 
http://r.789695.n4.nabble.com/What-am-I-doing-wrong-with-this-loop-tp3332703p3334591.html

Re: [R] What am I doing wrong with this loop ?

2011-03-03 Thread eric

Never mind, Bill, got it. It always seems to happen this way: I can't figure
something out, post to the site and wham, 5 min later (after posting), it's
all clear.

Oh well, thanks for the tips


--
View this message in context: 
http://r.789695.n4.nabble.com/What-am-I-doing-wrong-with-this-loop-tp3332703p3334609.html

[R] Why doesn't this work ?

2011-03-16 Thread eric

Why doesn't this work and is there a better way ?

z <-ifelse(t==1 || 2 || 3, 1,0)
t <-3
z
[1] 1
t <-4
z
[1] 1

trying to say ...if t == 1 or if t== 2 or if t ==3 then true, otherwise
false
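
The expression t==1 || 2 || 3 is read as (t==1) || 2 || 3, and any non-zero number counts as TRUE, so the result is always 1. In addition, z is not recomputed when t changes later. A sketch using %in%, wrapped in a helper whose name `in_set` is made up:

```r
# Returns 1 where t is one of 1, 2 or 3, and 0 elsewhere
in_set <- function(t) ifelse(t %in% c(1, 2, 3), 1, 0)

print(in_set(3))
print(in_set(4))
print(in_set(c(1, 4, 2)))   # works on vectors too
```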

--
View this message in context: 
http://r.789695.n4.nabble.com/Why-doesn-t-this-work-tp3383656p3383656.html

[R] How do I modify uniroot function to return .0001 if error ?

2011-04-03 Thread eric

I am calling the uniroot function from inside another function using these
lines (last two lines of the function) :

d <- uniroot(k, c(.001, 250), tol=.05)
return(d$root)

The problem is that on occasion there's a problem with the values I'm
passing to uniroot. In those instances uniroot stops and sends a message
that it can't calculate the root because f.upper * f.lower is greater than
zero.  All I'd like to do in those cases is be able to set the return value
of my calling function "return(d$root)" to .0001. But I'm not sure how to
pull that off. I tried a few modifications to uniroot but so far no luck.

For convenience, the uniroot function is shown below:

uniroot <- function (f, interval, ..., lower = min(interval),
    upper = max(interval), f.lower = f(lower, ...), f.upper = f(upper, ...),
    tol = .Machine$double.eps^0.25, maxiter = 1000) 
{
    if (!missing(interval) && length(interval) != 2L) 
        stop("'interval' must be a vector of length 2")
    if (!is.numeric(lower) || !is.numeric(upper) || lower >= 
        upper) 
        stop("lower < upper  is not fulfilled")
    if (is.na(f.lower)) 
        stop("f.lower = f(lower) is NA")
    if (is.na(f.upper)) 
        stop("f.upper = f(upper) is NA")
    if (f.lower * f.upper > 0)
        stop("f.up * f.down > 0")
    val <- .Internal(zeroin2(function(arg) f(arg, ...), lower, 
        upper, f.lower, f.upper, tol, as.integer(maxiter)))
    iter <- as.integer(val[2L])
    if (iter < 0) {
        warning("_NOT_ converged in ", maxiter, " iterations")
        iter <- maxiter
    }
    list(root = val[1L], f.root = f(val[1L], ...), iter = iter, 
        estim.prec = val[3L])
}
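
Rather than modifying uniroot itself, the call could be wrapped in tryCatch() so that any error yields the fallback value. A sketch in which the objective `k` and the wrapper name `safe_root` are hypothetical:

```r
# k is a hypothetical objective; this one has no sign change on the interval
k <- function(x) x^2 + 1

safe_root <- function(f, interval, fallback = 1e-04) {
  tryCatch(uniroot(f, interval, tol = 0.05)$root,
           error = function(e) fallback)   # any uniroot error yields the fallback
}

print(safe_root(k, c(0.001, 250)))                  # falls back to 1e-04
print(safe_root(function(x) x - 2, c(0.001, 250)))  # an ordinary root near 2
```

Note that this catches every error from uniroot, not only the sign-change check.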

--
View this message in context: 
http://r.789695.n4.nabble.com/How-do-I-modify-uniroot-function-to-return-0001-if-error-tp3424092p3424092.html

[R] Where did my packages go ?

2011-04-18 Thread eric

Running linux 10.04 ubuntu.

Looks like Ubuntu automatically updated to version 2.13. I was running
version 2.12 until today. But now I'm getting error messages when I use the
require or library command and one of the packages I've downloaded in the
past.  For example:

> require(quantmod)
Loading required package: quantmod
Warning message:
In library(package, lib.loc = lib.loc, character.only = TRUE, logical.return
= TRUE,  :
  there is no package called 'quantmod'

I noticed under my Home folder in the R folder that I have a
i486-pc-gnu-library subfolder. Inside that sub folder are 3 subfolders
labeled 2.10, 2.12, 2.13. If I look in 2.12, I see a lot of subfolders with
the names of the packages I use (quantmod for instance). If I look in the
2.13 subfolder, there's nothing there at all; it's empty.

So what is the right course of action ? 


--
View this message in context: 
http://r.789695.n4.nabble.com/Where-did-my-packages-go-tp3459079p3459079.html

[R] How do I subset a dataframe

2011-08-14 Thread eric

I have a dataframe zeespan. One of the columns has the name "customer". The
data in the customer column is text. I would like to return a subset of the
dataframe with all rows that DON'T begin with either "ibm" or "exxon", or
"sears" in the customer column.

I tried   subset(zeespan, customer != c("ibm" | "exxon" | "sears") )

That didn't work and even if it did, the text would have to be an exact
match where what I really want is "begins with".

Suggestions on how to do this would be appreciated
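
A regular expression anchored at the start of the string may do this. A sketch with a made-up stand-in for zeespan:

```r
# Made-up stand-in for zeespan
zeespan <- data.frame(customer = c("ibm corp", "acme", "exxon mobil", "sears", "big ibm fan"),
                      sales = 1:5)

# ^ anchors the match at the start; | separates the alternatives
keep <- !grepl("^(ibm|exxon|sears)", zeespan$customer)
result <- zeespan[keep, ]
print(result)
```

Adding ignore.case = TRUE to grepl() would also catch entries such as "IBM".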

--
View this message in context: 
http://r.789695.n4.nabble.com/How-do-I-subset-a-dataframe-tp3742172p3742172.html

[R] Not sure how to use aggregate, colSums, by

2011-08-14 Thread eric

I have a data frame called test, shown below, that I would like to summarize in
a particular way:

I want to show the column sums (columns y, f) grouped by country (column
e1). However, I'm looking for the data to be split according to column e2.
In other words, two tables of sums by country: one table for "con" and one
table for "std", as shown in column e2. Finally, at the bottom of the two tables,
I would like the overall sums (Totals) for all the countries for the two
columns (y, f). The layouts for the two tables I'm looking for are also
shown below in case my description isn't completely clear.

I would also like to be able to use the Totals of y and f for the two tables
in other calculations. 

I can get the two sets of totals with the following commands but not the
sums by country.

colSums(test[test$e2=="std", c(3,4)])
colSums(test[test$e2=="con", c(3,4)])

I know there's an easy way to do this with a combination of colSums, by,
aggregate but I can't seem to get it.

std      y     f

usa      sum   sum
france   sum   sum
can      sum   sum
italy    sum   sum
Totals   sum   sum

con      y     f

usa      sum   sum
france   sum   sum
can      sum   sum
italy    sum   sum
Totals   sum   sum

       e1  e2 y  f
1     usa std 1  1
2     usa std 1  2
3     can con 1  3
4  france con 1  4
5     can std 1  5
6   italy con 1  6
7     usa std 2  7
8     usa std 2  8
9     can con 2  9
10 france con 2 10
11    can std 2 11
12  italy con 2 12
13    usa std 3 13
14    usa std 3 14
15    can con 3 15
16 france con 3 16
17    can std 3 17
18  italy con 3 18
19    usa std 4 19
20    usa std 4 20
21    can con 4 21
22 france con 4 22
23    can std 4 23
24  italy con 4 24
25    usa std 5 25
26    usa std 5 26
27    can con 5 27
28 france con 5 28
29    can std 5 29
30  italy con 5 30
31    usa std 6 31
32    usa std 6 32
33    can con 6 33
34 france con 6 34
35    can std 6 35
36  italy con 6 36
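
One possible route is aggregate() with a formula, followed by split() to get one table per level of e2. A sketch that rebuilds only the first twelve rows of the listing; countries with no rows for a given e2 value simply do not appear in that table:

```r
# Rebuild the first twelve rows of test from the listing
test <- data.frame(
  e1 = rep(c("usa", "usa", "can", "france", "can", "italy"), 2),
  e2 = rep(c("std", "std", "con", "con", "std", "con"), 2),
  y  = rep(1:2, each = 6),
  f  = 1:12,
  stringsAsFactors = FALSE)

# Sums of y and f for every country within each e2 group
sums <- aggregate(cbind(y, f) ~ e1 + e2, data = test, FUN = sum)

# One table per level of e2, each with a Totals row appended
tables <- lapply(split(sums, sums$e2), function(d) {
  d <- data.frame(e1 = as.character(d$e1), y = d$y, f = d$f,
                  stringsAsFactors = FALSE)
  rbind(d, data.frame(e1 = "Totals", y = sum(d$y), f = sum(d$f),
                      stringsAsFactors = FALSE))
})
print(tables)
```

The totals stay available for later calculations, for example tables$std[tables$std$e1 == "Totals", ].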

--
View this message in context: 
http://r.789695.n4.nabble.com/Not-sure-how-to-use-aggregate-colSums-by-tp3743258p3743258.html

[R] factor coercion with read.csv or read.table

2012-06-06 Thread eric

How do I fix this error ? I tried coercion to a vector but that didn't work.

msci <-read.csv("..MSCIexUS.csv", header=TRUE)

head(msci)

          Date  index
1 Dec 31, 1969    100
2 Jan 30, 1970 97.655
3 Feb 27, 1970 96.154
4 Mar 31, 1970 95.857
5 Apr 30, 1970 85.564
6 May 29, 1970 79.005

> str(msci)
'data.frame':   510 obs. of  2 variables:
 $ Date : Factor w/ 510 levels "Apr 28, 1972",..: 98 178 134 311 13 342 268
228 55 481 ...
 $ index: Factor w/ 510 levels "100","1,000.302",..: 1 499 493 488 444 412
418 434 441 448 ...


> msci$Date <-as.Date(msci$Date, dateFormat='%b %d, %Y')
Error in charToDate(x) : 
  character string is not in a standard unambiguous format
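
Two things may be going on here: as.Date() takes the pattern through an argument named format (dateFormat is not one of its arguments, so the pattern is never used), and the index column is read as a factor because of the thousands separators. A sketch on a two-row stand-in for msci; %b depends on the locale and assumes English month abbreviations:

```r
# Stand-in for msci as read by read.csv, with both columns stored as factors
msci <- data.frame(Date  = factor(c("Dec 31, 1969", "Jan 30, 1970")),
                   index = factor(c("100", "1,000.302")))

# Go through character first; the argument is named format, not dateFormat
msci$Date  <- as.Date(as.character(msci$Date), format = "%b %d, %Y")

# Strip the commas before converting the text to numbers
msci$index <- as.numeric(gsub(",", "", as.character(msci$index)))
str(msci)
```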


--
View this message in context: 
http://r.789695.n4.nabble.com/factor-coercion-with-read-csv-or-read-table-tp4632622.html

[R] Gaps on merging xts objects

2012-06-10 Thread eric

Looking for a little help figuring out what's driving gaps in data after
merging two xts objects (msci.m and x2). The merge statement I'm using is
...   y <-merge(x2,msci.m, all=FALSE). Here's info on the output , y:

head(y)
 t-bill msci
Sep 1985  7.310  316.963
Mar 1986  6.560  463.471
Jun 1986  6.180  498.791
Jul 1987  6.200  778.898
Aug 1987  6.400  833.519
Nov 1987  5.690  704.726
Feb 1988  5.780  783.533
May 1988  6.730  813.289

tail(y)
Mar 2008  1.465 2039.001
Jun 2008  1.936 1990.710
Aug 2009  0.152 1506.642
Nov 2009  0.061 1573.659
Jan 2011  0.152 1730.912
Feb 2011  0.147 1791.900
May 2011  0.061 1772.492
Oct 2011  0.010 1538.103
Apr 2012  0.096 1549.291

Notice the gap between Sept 1985 and March 1986 for example. Or the missing
2010 data. Both x2 and msci.m are monthly data. Here's info about both those
objects:

> str(x2)
An ‘xts’ object from Dec 1984 to May 2012 containing:
  Data: num [1:330, 1] 8.14 8.06 8.69 8.71 8.15 7.49 7.33 7.48 7.31 7.31 ...
 - attr(*, "dimnames")=List of 2
  ..$ : NULL
  ..$ : chr "x1.Close"
  Indexed by objects of class: [yearmon] TZ: 
  xts Attributes:  
List of 2
 $ tclass: chr "Date"
 $ tzone : chr ""
> str(msci.m)
An ‘xts’ object from Dec 1969 to May 2012 containing:
  Data: num [1:510, 1] 100 97.7 96.2 95.9 85.6 ...
 - attr(*, "dimnames")=List of 2
  ..$ : NULL
  ..$ : chr "msci.Close"
  Indexed by objects of class: [yearmon] TZ: 
  xts Attributes:  
List of 2
 $ tclass: chr "Date"
 $ tzone : chr ""

> head(msci.m)
         msci.Close
Dec 1969    100.000
Jan 1970     97.655
Feb 1970     96.154
Mar 1970     95.857
Apr 1970     85.564
May 1970     79.005
> head(x2)
 x1.Close
Dec 1984 8.14
Jan 1985 8.06
Feb 1985 8.69
Mar 1985 8.71
Apr 1985 8.15
May 1985 7.49

tail(msci.m)
 msci.Close
Dec 2011   1445.066
Jan 2012   1521.751
Feb 2012   1601.262
Mar 2012   1582.660
Apr 2012   1549.291
May 2012   1363.978
> tail(x2)
         x1.Close
Dec 2011    0.025
Jan 2012    0.051
Feb 2012    0.117
Mar 2012    0.086
Apr 2012    0.096
May 2012    0.086

So how do I figure out what's driving the gaps ?



--
View this message in context: 
http://r.789695.n4.nabble.com/Gaps-on-merging-xts-objects-tp4632941.html

Re: [R] Gaps on merging xts objects

2012-06-10 Thread eric

An update ...

I did a bit more searching on the internet and got some ideas.

I set the start month for both series to the same date. That didn't help.
Then I tried

.index(x2)==.index(msci.m)
FALSE

I was able to fix the problem with:
index(x2) <- as.Date(index(x2))
index(msci.m) <- as.Date(index(msci.m))

Now the xts objects merge just fine. But I'm not exactly sure what the root
cause was or why this fix worked.

One other part of this is that the x2 data started as weekly and the msci data
started as daily. Both were converted to monthly data with to.monthly. I
suspect that has something to do with it, but I don't understand how.

Any words of wisdom in gaining a better understanding would be appreciated
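
One plausible explanation, offered as an assumption and not verified against these objects: both series display as months, but the underlying time stamps are the last observation in each month, and those differ between a weekly and a daily source. A merge on the exact stamp then keeps only the months where the two happen to coincide. A base-R sketch of that idea with made-up dates and dummy values:

```r
# Hypothetical month-end stamps: one series built from weekly data, one from daily
weekly <- data.frame(when = as.Date(c("1985-09-30", "1985-10-25", "1985-11-29")),
                     tbill = c(1, 2, 3))
daily  <- data.frame(when = as.Date(c("1985-09-30", "1985-10-31", "1985-11-29")),
                     msci = c(10, 20, 30))

# Joining on the exact stamp drops months whose last observation differs
exact <- merge(weekly, daily, by = "when")

# Joining on a year-month key keeps every month
weekly$ym <- format(weekly$when, "%Y-%m")
daily$ym  <- format(daily$when, "%Y-%m")
by_month <- merge(weekly, daily, by = "ym")

print(nrow(exact))      # October is lost
print(nrow(by_month))   # all three months survive
```

If that is what happened, forcing both indexes to a common representation before merging would line the months up, which would be consistent with the fix reported above.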


--
View this message in context: 
http://r.789695.n4.nabble.com/Gaps-on-merging-xts-objects-tp4632941p4632950.html

[R] Stuck ...can't get sapply and xmlTreeParse working

2011-07-04 Thread eric

Can't seem to get the code below working. It gets stuck on line 24 inside the
function hm; comments show the line in question. The function hm is called
by sapply and is at the bottom of the code. Other stuff above line 24 works
correctly including the first couple of lines of the function hm. Should I
be using a different apply function or am I doing something wrong with
xmlTreeParse ? 


library(XML)
url.montco <-
"http://webapp.montcopa.org/sherreal/salelist.asp?saledate=07/27/2011";
tbl <-data.frame(readHTMLTable(url.montco))[, c(3,5,6,8,9)]
tbl <-tbl[2: length(tbl[,1]),]
names(tbl) <- c("Address", "Township", "Parcel", "SaleDate", "Costs");
rownames(tbl) <- NULL
v <- gregexpr("( aka )|( AKA )",tbl$Address)
s <-sapply(v, function(x) max(unlist(x)))
tbl$Address <- substring(tbl$Address, ifelse(s== -1, 0, s+4), 1)
tbl$Cost <- gsub(',', '', tbl$Costs) 
temp <- strsplit(tbl$Cost, "\\$")  
temp <- do.call(rbind, temp)  # create a matrix
mode(temp) <- 'numeric'
tbl$Debt <- round(temp[, 2]/1000,2) 
tbl$Court <- round(temp[, 3]/1000,2)
z <- data.frame(substr(tbl$SaleDate,regexpr("[A-Za-z]", tbl$SaleDate),
regexpr("[0-9]", tbl$SaleDate,)-1)) ; names(z) <- "Action"
y <- data.frame(substr(tbl$SaleDate,regexpr("[0-9]", tbl$SaleDate),2011)) ;
names(y) <- "ActionDate"
tbl <-cbind(tbl[, c(1,2,3,7,8)],z,y)
new.add <- paste(tbl$Address,"&citystatezip=",tbl$Township,"%2C+PA", sep='')
new.add <- sub("^( )+","", new.add)
new.add <-data.frame(gsub("( )+",'+', new.add)); names(new.add) <-
"ParseAddress"
hm <- function(x) {
  url.zill <- paste("http://www.zillow.com/webservice/GetDeepSearchResults.htm?zws-id=X1-ZWz1bup03e49vv_5kvb6&address=", x,
                    sep="")
  ## problem line is next #
  zdoc <-xmlTreeParse(url.zill, useInternalNode=TRUE, isURL=TRUE)
  # problem line above  ##
  f$zpid <- sapply(getNodeSet(zdoc, "//result/zpid"), xmlValue)
  f$zest.low <-sapply(getNodeSet(zdoc, "//valuationRange/low"), xmlValue)
  f$zest <- sapply(getNodeSet(zdoc, "//zestimate/amount"), xmlValue)
  rm(zdoc)
  return(f)
}
j <-sapply(new.add, FUN=hm)
print(zest)
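
A side note that may be relevant, stated as an assumption since the failing
output isn't shown: new.add is a one-column data frame, and sapply() over a
data frame iterates over its columns, not its rows. hm would then be called
once with the whole column, paste() would build a vector of URLs, and
xmlTreeParse would receive all of them at once. The toy example below
(made-up addresses, no web access, n_seen is a hypothetical helper) shows the
difference.

```r
# Stand-in for new.add: a one-column data frame of already-encoded addresses
new.add <- data.frame(ParseAddress = c("1+Main+St", "2+Oak+Ave"),
                      stringsAsFactors = FALSE)

# Hypothetical helper: reports how many elements it was handed in one call
n_seen <- function(x) length(x)

# Iterating over the data frame: one call, which sees both addresses
by_column <- sapply(new.add, n_seen)

# Iterating over the column itself: one call per address
by_address <- sapply(new.add$ParseAddress, n_seen)

print(by_column)
print(by_address)
```

Passing the column (new.add$ParseAddress, as character) rather than the data
frame is one way to get a call per address.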

--
View this message in context: 
http://r.789695.n4.nabble.com/Stuck-can-t-get-sapply-and-xmlTreeParse-working-tp3644894p3644894.html
Sent from the R help mailing list archive at Nabble.com.


[R] Help with tryCatch

2011-07-10 Thread eric

Having a hard time understanding the help files for tryCatch. Looking for a
little help with the following statement which sits inside a for loop

zest[i] <- tryCatch(sapply(getNodeSet(zdoc, "//zestimate/amount"),
xmlValue), error=function() zest[i] <-"NA")

zest is a numeric vector

If the sapply statement evaluates to an error, I'd like to set the value of
zest[i] to NA and continue with the loop.

Suggestions ?
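
For reference, a minimal sketch of the tryCatch pattern being asked about.
Two points it illustrates: the error handler is called with the condition
object, so it needs to accept one argument, and the handler's return value
becomes the value of tryCatch(), so there is no need to assign inside it.
The risky() function and the inputs are made up for the example.

```r
# Hypothetical function that fails for some inputs
risky <- function(x) {
  if (x < 0) stop("negative input")
  sqrt(x)
}

inputs <- c(4, -1, 9)
out <- numeric(length(inputs))

for (i in seq_along(inputs)) {
  # The handler takes the condition `e` and returns the fallback value
  out[i] <- tryCatch(risky(inputs[i]), error = function(e) NA_real_)
}

print(out)  # 2 NA 3
```

Note that NA (unquoted) keeps the vector numeric, whereas "NA" is a
character string.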

--
View this message in context: 
http://r.789695.n4.nabble.com/Help-with-tryCatch-tp3657859p3657859.html
Sent from the R help mailing list archive at Nabble.com.


Re: [R] Help with tryCatch

2011-07-10 Thread eric

I tried the following:

zest[i] <- tryCatch(sapply(getNodeSet(zdoc, "//zestimate/amount"),
xmlValue), error=function() NA)

Here's what happens :

Error in zest[i] <- tryCatch(sapply(getNodeSet(zdoc, "//zestimate/amount"), 
: 
  replacement has length zero
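
A likely reading of this message, given as a sketch: no error was raised at
all. When the node set is empty, sapply() returns a zero-length result, so
tryCatch() has nothing to catch and the assignment zest[i] <- (nothing)
fails. The example uses an empty list in place of getNodeSet() output, and
first_or_na is a hypothetical helper name.

```r
# sapply() over an empty input returns a zero-length list, not an error
empty_result <- sapply(list(), toupper)
print(length(empty_result))  # 0

# Hypothetical helper: first element as a number, or NA when there is none
first_or_na <- function(x) {
  if (length(x) == 0) NA_real_ else as.numeric(x[[1]])
}

zest <- numeric(3)
zest[1] <- first_or_na(list("250000"))  # a value was found
zest[2] <- first_or_na(empty_result)    # nothing found, so NA
print(zest)
```

Checking the length before assigning handles the "nothing found" case, which
tryCatch() alone does not.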



--
View this message in context: 
http://r.789695.n4.nabble.com/Help-with-tryCatch-tp3657859p3658113.html
Sent from the R help mailing list archive at Nabble.com.


Re: [R] Help with tryCatch

2011-07-10 Thread eric

Bill, first off, thanks much for helping me through this. I think the best
approach might be for me to attach the actual code. 

I could probably do the if-else-else that you suggested. But I have eight
different variables with the same basic issue (note that six of the eight
are commented out while I figure out how to solve the tryCatch problem). So
doing an "if else" for each variable seems like a lot of code. I'm thinking
there must be a better way.

What I'm doing here is setting up a dataframe full of addresses and then
looking up data for each address on the web. Things seem to go well until
one of the addresses I have is not valid. When that happens, the data I'm
looking for (e.g. zest, bed, bath, sqft ...etc) does not seem to get set to
NA.

Note that the code works fine until about the 4th iteration when I run into
an invalid address and I get the replacement has length 0 error.

Any ideas how I can fix this so it works right without adding lots of
if-else-else code ?
http://r.789695.n4.nabble.com/file/n3658347/code.sher.pdf code.sher.pdf 
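
One way to avoid a separate if-else per variable, sketched under
assumptions: wrap the lookup for a single address in a function that always
returns one complete row, with NA for anything missing, and bind the rows
afterwards. lookup_one() below is a made-up stand-in for the web request
(it simply fails on an empty address); the field names follow the ones
mentioned above.

```r
# Hypothetical stand-in for the web lookup; errors on an "invalid" address
lookup_one <- function(addr) {
  if (!nzchar(addr)) stop("invalid address")
  list(zest = nchar(addr) * 1000, bed = 3, bath = 2)
}

# Always returns a one-row data frame; missing or failed fields become NA
safe_lookup <- function(addr, fields = c("zest", "bed", "bath")) {
  res <- tryCatch(lookup_one(addr), error = function(e) NULL)
  row <- setNames(as.list(rep(NA_real_, length(fields))), fields)
  for (f in fields) {
    val <- res[[f]]                       # NULL when lookup failed or field absent
    if (length(val) > 0) row[[f]] <- as.numeric(val[[1]])
  }
  data.frame(address = addr, row, stringsAsFactors = FALSE)
}

addresses <- c("1 Main St", "", "22 Oak Ave")
out <- do.call(rbind, lapply(addresses, safe_lookup))
print(out)
```

Adding a ninth variable then means adding one name to fields rather than
another block of if-else code.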

--
View this message in context: 
http://r.789695.n4.nabble.com/Help-with-tryCatch-tp3657859p3658347.html
Sent from the R help mailing list archive at Nabble.com.


[R] Is there a better way ?

2011-07-10 Thread eric

Is there a more compact way to say this ?

r <-numeric(length(p)) ; s <-numeric(length(p)); t <- numeric(length(p)); u
<- numeric(length(p)); v <- numeric(length(p)) ; x <-numeric(length(p))

all these variables will be used in a loop 

for (i in 1 : length(p)) {
r[i] <-
s[i] <-
t[i] <-
etc
}
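
Two compact alternatives, as a sketch with a made-up p: a chained assignment
that creates the same six vectors in one statement, or a single data frame
with one column per variable. The loop body is illustrative only.

```r
p <- c(2, 4, 6)  # hypothetical input

# Option 1: chained assignment, same six vectors as before
r <- s <- t <- u <- v <- x <- numeric(length(p))

# Option 2: one preallocated data frame, one column per variable
res <- as.data.frame(matrix(0, nrow = length(p), ncol = 6,
                            dimnames = list(NULL, c("r", "s", "t", "u", "v", "x"))))

for (i in seq_along(p)) {
  res$r[i] <- p[i]^2
  res$s[i] <- p[i] / 2
}
print(res)
```

seq_along(p) is also a little safer than 1:length(p), which misbehaves when
p is empty.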

--
View this message in context: 
http://r.789695.n4.nabble.com/Is-there-a-better-way-tp3658588p3658588.html
Sent from the R help mailing list archive at Nabble.com.


[R] install.packages problem

2011-11-05 Thread eric

I'm trying to install the rdatamarket package. I did an
install.packages('rdatamarket') command but got an error about half way
through the install as follows:

* installing *source* package ‘RCurl’ ...
checking for curl-config... no
Cannot find curl-config
ERROR: configuration failed for package ‘RCurl’

The install continued after the error but looks like it was completed. I'm
trying to figure out what the error means and how I fix it. 

Here's what I'm seeing ...ideas on how to address this would be appreciated
:

install.packages('rdatamarket')
Installing package(s) into ‘/home/eric/R/i486-pc-linux-gnu-library/2.13’
(as ‘lib’ is unspecified)
--- Please select a CRAN mirror for use in this session ---
also installing the dependencies ‘RCurl’, ‘RJSONIO’

trying URL 'http://lib.stat.cmu.edu/R/CRAN/src/contrib/RCurl_1.7-0.tar.gz'
Content type 'application/x-gzip' length 813252 bytes (794 Kb)
opened URL
==
downloaded 794 Kb

trying URL
'http://lib.stat.cmu.edu/R/CRAN/src/contrib/RJSONIO_0.96-0.tar.gz'
Content type 'application/x-gzip' length 1144519 bytes (1.1 Mb)
opened URL
==
downloaded 1.1 Mb

trying URL
'http://lib.stat.cmu.edu/R/CRAN/src/contrib/rdatamarket_0.6.3.tar.gz'
Content type 'application/x-gzip' length 12432 bytes (12 Kb)
opened URL
==
downloaded 12 Kb

* installing *source* package ‘RCurl’ ...
checking for curl-config... no
Cannot find curl-config
ERROR: configuration failed for package ‘RCurl’
* removing ‘/home/eric/R/i486-pc-linux-gnu-library/2.13/RCurl’
* installing *source* package ‘RJSONIO’ ...
Trying to find libjson.h header file
checking for gcc... gcc
checking whether the C compiler works... yes
checking for C compiler default output file name... a.out
checking for suffix of executables... 
checking whether we are cross compiling... no
checking for suffix of object files... o
checking whether we are using the GNU C compiler... yes
checking whether gcc accepts -g... yes
checking for gcc option to accept ISO C89... none needed
USE_LOCAL = ""
Using local libjson code. Copying files
/tmp/RtmpFw9QeX/R.INSTALL4ebf657f/RJSONIO
configure: creating ./config.status
config.status: creating src/Makevars
config.status: creating cleanup
** libs
gcc -I/usr/share/R/include -I. -Ilibjson -Ilibjson/Source -DNDEBUG=1
-DJSON_NO_EXCEPTIONS=1 -fpic  -std=gnu99 -O3 -pipe  -g -c ConvertUTF.c
-o ConvertUTF.o
g++ -I/usr/share/R/include -I. -Ilibjson -Ilibjson/Source -DNDEBUG=1
-DJSON_NO_EXCEPTIONS=1 -fpic  -O3 -pipe  -g -c JSONChildren.cpp -o
JSONChildren.o
g++ -I/usr/share/R/include -I. -Ilibjson -Ilibjson/Source -DNDEBUG=1
-DJSON_NO_EXCEPTIONS=1 -fpic  -O3 -pipe  -g -c JSONDebug.cpp -o
JSONDebug.o
g++ -I/usr/share/R/include -I. -Ilibjson -Ilibjson/Source -DNDEBUG=1
-DJSON_NO_EXCEPTIONS=1 -fpic  -O3 -pipe  -g -c JSONIterators.cpp -o
JSONIterators.o
g++ -I/usr/share/R/include -I. -Ilibjson -Ilibjson/Source -DNDEBUG=1
-DJSON_NO_EXCEPTIONS=1 -fpic  -O3 -pipe  -g -c JSONMemory.cpp -o
JSONMemory.o
g++ -I/usr/share/R/include -I. -Ilibjson -Ilibjson/Source -DNDEBUG=1
-DJSON_NO_EXCEPTIONS=1 -fpic  -O3 -pipe  -g -c JSONNode.cpp -o
JSONNode.o
g++ -I/usr/share/R/include -I. -Ilibjson -Ilibjson/Source -DNDEBUG=1
-DJSON_NO_EXCEPTIONS=1 -fpic  -O3 -pipe  -g -c JSONNode_Mutex.cpp -o
JSONNode_Mutex.o
g++ -I/usr/share/R/include -I. -Ilibjson -Ilibjson/Source -DNDEBUG=1
-DJSON_NO_EXCEPTIONS=1 -fpic  -O3 -pipe  -g -c JSONStream.cpp -o
JSONStream.o
g++ -I/usr/share/R/include -I. -Ilibjson -Ilibjson/Source -DNDEBUG=1
-DJSON_NO_EXCEPTIONS=1 -fpic  -O3 -pipe  -g -c JSONValidator.cpp -o
JSONValidator.o
g++ -I/usr/share/R/include -I. -Ilibjson -Ilibjson/Source -DNDEBUG=1
-DJSON_NO_EXCEPTIONS=1 -fpic  -O3 -pipe  -g -c JSONWorker.cpp -o
JSONWorker.o
g++ -I/usr/share/R/include -I. -Ilibjson -Ilibjson/Source -DNDEBUG=1
-DJSON_NO_EXCEPTIONS=1 -fpic  -O3 -pipe  -g -c JSONWriter.cpp -o
JSONWriter.o
g++ -I/usr/share/R/include -I. -Ilibjson -Ilibjson/Source -DNDEBUG=1
-DJSON_NO_EXCEPTIONS=1 -fpic  -O3 -pipe  -g -c JSON_Base64.cpp -o
JSON_Base64.o
gcc -I/usr/share/R/include -I. -Ilibjson -Ilibjson/Source -DNDEBUG=1
-DJSON_NO_EXCEPTIONS=1 -fpic  -std=gnu99 -O3 -pipe  -g -c JSON_parser.c
-o JSON_parser.o
gcc -I/usr/share/R/include -I. -Ilibjson -Ilibjson/Source -DNDEBUG=1
-DJSON_NO_EXCEPTIONS=1 -fpic  -std=gnu99 -O3 -pipe  -g -c RJSON.c -o
RJSON.o
g++ -I/usr/share/R/include -I. -Ilibjson -Ilibjson/Source -DNDEBUG=1
-DJSON_NO_EXCEPTIONS=1 -fpic  -O3 -pipe  -g -c internalJSONNode.cpp -o
internalJSONNode.o
g++ -I/usr/share/R/include -I. -Ilibjson -Ilibjson/Source -DNDEBUG=1
-DJSON_NO_EXCEPTIONS=1 -fpic  -O3 -pipe  -g -c libjson.cpp -o libjson.o
gcc -I/usr/share/R/include -I. -Ilibjson -Ilibjson/Source -DNDEBUG=1
-DJSON_NO_EXCEP
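
For what it's worth, the "Cannot find curl-config" line means RCurl's
configure script could not find the curl-config program, which normally
ships with the libcurl development files rather than with R. On Debian or
Ubuntu systems that is typically a package such as libcurl4-openssl-dev,
installed with the system package manager (an assumption about this machine;
the package name differs across distributions). A small check from within R:

```r
# Sys.which() returns "" when the program is not on the PATH
has_curl_config <- unname(nzchar(Sys.which("curl-config")))

if (has_curl_config) {
  message("curl-config found; installing RCurl from source should get past this step")
} else {
  message("curl-config not found; install the libcurl development files first")
}
```

Once curl-config is available, re-running install.packages('rdatamarket')
should pick up RCurl again, since it is listed as a dependency.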

[R] DESeq

2011-11-07 Thread Eric

Hello,

I have RNAseq data, which I am trying to analyze with DESeq. My file (tab 
delimited .txt) appears to be correct:

>head(myfile)
VZ_w13 VZ_w14a VZ_w14b VZ_w15a VZ_w15b VZ_w16a
ENSG0253101  0   0   0   0   0   0
ENSG0223972  0   0   0   0   0   0...

However, when I try to analyze the data with

>cds <- newCountDataSet(myfile,conds)

I get the following message:

"Error in newCountDataSet(myfile,conds) : The countData is not integer."

The problem, as far as I can tell, is that my data are numerical, not integer, 
because when I run

>str(myfile)
'data.frame':   53433 obs. of  14 variables:
 $ VZ_w13   : num  0 0 0 0 8 0 0 0 0 0 ...

Does anyone have a way to convert my file from numerical to integer? As you can 
see, the data are in fact already integers, so I'm a bit confused. 

Thanks,
Eric
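
A short sketch of the distinction involved: R stores these counts as type
double ("num" in str()) even though every value is a whole number, and
integer is a separate storage type. The data frame below is a made-up
miniature of myfile, and the gene names are placeholders.

```r
# Hypothetical miniature of the count table: whole numbers stored as double
myfile <- data.frame(VZ_w13  = c(0, 0, 8),
                     VZ_w14a = c(0, 3, 0),
                     row.names = c("geneA", "geneB", "geneC"))
print(sapply(myfile, typeof))  # "double" for both columns

# Make sure nothing would be truncated, then convert every column in place
stopifnot(all(sapply(myfile, function(col) all(col == round(col)))))
myfile[] <- lapply(myfile, as.integer)

str(myfile)  # columns now show as int
```

Assigning into myfile[] keeps the data frame structure and row names while
replacing the columns.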



[R] Warning message interpretation

2011-11-07 Thread eric

Using the rdatamarket package and getting a warning message. 

What does this warning message tell me ? What could I do to eliminate or
address it ?

require(rdatamarket)
Loading required package: rdatamarket
Loading required package: zoo


Warning message:
In assignInNamespace("as.Date.numeric", function(x, origin, ...) { :
  binding of ‘as.Date.numeric’ is locked and will not be changed

--
View this message in context: 
http://r.789695.n4.nabble.com/Warning-message-interpretation-tp4014483p4014483.html
Sent from the R help mailing list archive at Nabble.com.


[R] update.packages() issue

2011-11-09 Thread eric

I updated packages using the update.packages() command but am now having
problems. I now know I should have done this from the terminal (sudo rstudio
and then update.packages()). Instead I just opened rstudio without the sudo
command. So I know what to do going forward, but how do I resolve the
existing problem?

Note that these are just the packages I loaded tonight. There are lots of
others that I use on occasion that I would probably have the same problem
with as below.

What's the easiest way to correct this ?

> require(quantmod)
Loading required package: quantmod
Warning message:
In library(package, lib.loc = lib.loc, character.only = TRUE, logical.return
= TRUE,  :
  there is no package called ‘quantmod’
> require(rdatamarket)
Loading required package: rdatamarket
Warning message:
In library(package, lib.loc = lib.loc, character.only = TRUE, logical.return
= TRUE,  :
  there is no package called ‘rdatamarket’
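
A sketch of one way to find out which packages need reinstalling, assuming
the packages really are missing from the libraries R is searching (printing
.libPaths() shows which directories those are). missing_packages is a
hypothetical helper name, and the install step is left commented out.

```r
# Hypothetical helper: which of these packages cannot be loaded?
missing_packages <- function(pkgs) {
  ok <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  pkgs[!ok]
}

print(.libPaths())  # library directories R searches

to_fix <- missing_packages(c("stats", "notARealPackage12345"))
print(to_fix)

# Reinstall whatever is missing into the default library:
# install.packages(missing_packages(c("quantmod", "rdatamarket")))
```

The same helper can be run over a longer list of occasionally used packages
to catch them all in one go.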

--
View this message in context: 
http://r.789695.n4.nabble.com/update-packages-issue-tp4021998p4021998.html
Sent from the R help mailing list archive at Nabble.com.


[R] Error in axis ????

2011-11-09 Thread eric

I did an update of both rstudio and my packages. I had some trouble but was
able to move a lot of the packages so most troubles seem to be behind me.
But having a problem with code that previously ran fine. See below:

require(quantmod)
Loading required package: quantmod
Loading required package: Defaults
Loading required package: xts
Loading required package: zoo

Attaching package: ‘zoo’

The following object(s) are masked from ‘package:base’:

as.Date, as.Date.numeric

Loading required package: TTR
> require(rdatamarket)
Loading required package: rdatamarket
> rm(list=ls())
> g <- dmlist("http://datamarket.com/data/set/1jz5/st-louis-financial-stress-index#display=line&ds=1jz5")
> g$Date <-as.Date(g[,1], "%Y-%m-%d")
> h <-as.xts(g, order.by=g[,1])
> j <-h[,2]
> s <-getSymbols('^GSPC', from="1990-01-01", to=Sys.Date())
> s <-to.weekly(GSPC)
> s <-s[,6]
> x <-na.omit(merge(s,j)) ; names(x) <-c("sp","stress")
> print(head(x))
   sp stress
1993-12-31 466.45 -0.453
1994-01-07 469.90 -0.442
1994-01-14 474.91 -0.435
1994-01-21 474.72 -0.449
1994-01-28 478.70 -0.462
1994-02-04 469.81 -0.513
> par(mfrow=c(2,1))
> plot(x[,1]/400, ylim=c(-1,5), col="blue")
Error in axis(1, at = xycoords$x, labels = FALSE, col = "#BB", ...) : 
  formal argument "col" matched by multiple actual arguments
> lines(x[,2], col="red")
> ccf(drop(x[,1]), drop(x[,2]))

How do I fix the error ?
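
On what the message itself means, as a sketch: this error arises when a
function passes a named argument explicitly and also forwards the same name
through "...". The toy functions below are made up to reproduce the message;
whether the installed plotting method for xts does exactly this is an
assumption based on the axis() call shown in the error.

```r
# Made-up functions: outer_fn hard-codes col and also forwards ...
inner_fn <- function(x, col = "black", ...) col
outer_fn <- function(x, ...) inner_fn(x, col = "grey", ...)

ok  <- outer_fn(1)  # fine: col is supplied only once
msg <- tryCatch(outer_fn(1, col = "blue"),
                error = function(e) conditionMessage(e))
print(msg)  # formal argument "col" matched by multiple actual arguments
```

If that is the cause, leaving col out of the plot() call, or plotting a
plain zoo object instead (for example via as.zoo()), may sidestep it until
the package is updated.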

--
View this message in context: 
http://r.789695.n4.nabble.com/Error-in-axis-tp4022356p4022356.html
Sent from the R help mailing list archive at Nabble.com.


[R] Error in axis ????

2011-11-09 Thread eric

Recently updated my packages (update.packages()) and also updated rstudio to
the latest version. Had some problems but I thought I was able to get around
them by moving my folders. However, now I'm seeing a bit of a snag with some
code that previously ran fine.  Hoping someone can suggest a fix. Here's the
screen output:

require(quantmod)
Loading required package: quantmod
Loading required package: Defaults
Loading required package: xts
Loading required package: zoo

Attaching package: ‘zoo’

The following object(s) are masked from ‘package:base’:

as.Date, as.Date.numeric

Loading required package: TTR

> require(rdatamarket)
Loading required package: rdatamarket

> g <- dmlist("http://datamarket.com/data/set/1jz5/st-louis-financial-stress-index#display=line&ds=1jz5")
> g$Date <-as.Date(g[,1], "%Y-%m-%d")
> h <-as.xts(g, order.by=g[,1])
> j <-h[,2]
> s <-getSymbols('^GSPC', from="1990-01-01", to=Sys.Date())
> s <-to.weekly(GSPC)
> s <-s[,6]
> x <-na.omit(merge(s,j)) ; names(x) <-c("sp","stress")
> print(head(x))
   sp stress
1993-12-31 466.45 -0.453
1994-01-07 469.90 -0.442
1994-01-14 474.91 -0.435
1994-01-21 474.72 -0.449
1994-01-28 478.70 -0.462
1994-02-04 469.81 -0.513
> par(mfrow=c(2,1))
> plot(x[,1]/400, ylim=c(-1,5), col="blue")
Error in axis(1, at = xycoords$x, labels = FALSE, col = "#BB", ...) : 
  formal argument "col" matched by multiple actual arguments
> lines(x[,2], col="red")
> ccf(drop(x[,1]), drop(x[,2]))



--
View this message in context: 
http://r.789695.n4.nabble.com/Error-in-axis-tp4022364p4022364.html
Sent from the R help mailing list archive at Nabble.com.


Re: [R] Question about which kind of plot to use

2007-12-19 Thread Eric

Deepayan Sarkar wrote:
> On 12/19/07, Dylan Beaudette <[EMAIL PROTECTED]> wrote:
>   
>> On Wednesday 19 December 2007, Max wrote:
>> 
>>> Hi Everyone,
>>>
>>> I've got a question about data representation. I have some psychometric
>>> data with 5 scores for 15 different groups. I've been asked to show
>>> some kind of mean plots. The data below is the mean and SD for a given
>>> group, unfortunately my employer doesn't want me posting full datasets.
>>>
>>> :(
>>>
>>>  The groups V,W,X,Y,Z are divided into Bottom, (B), Middle (M) and Top
>>> (T). An example of my data is shown below.
>>>
>>> Score 1
>>>          Mean                  SD
>>>      B     M     T       B     M     T
>>> V  86.9  13.0  88.8    16.9   2.0  10.5
>>> W  16.1  96.1  17.7     2.2   4.6   1.7
>>> X  50.7  61.1  74.7     4.7   3.7   7.6
>>> Y  68.5  99.7  37.6     6.0   8.0   2.3
>>> Z  92.7  22.3  69.4     6.5   1.2   2.2
>>>
>>> What I did before was a standard mean plot:
>>>
>>> plotMeans(w$score1, w$Factor,
>>> error.bars="sd",xlab="Factor",ylab="Score",main="Group W Score 1 Plot")
>>>
>>>  However, with 15 groups and 5 scores this turns into 75 individual
>>> graphs. Is there a way to layer mean plots? Or show several mean plots
>>> in the Same graph? Any ideas or suggestions would be great.
>>>
>>> thanks,
>>>
>>> -Max
>>>
>>>   
>> How about a lattice plot using panels ? plot the distribution of each score
>> (box and whisker style), using a panel for each group?
>>
>> a <- rnorm(100)
>> b <- rnorm(100)
>> c <- rnorm(100)
>>  d <- rnorm(100)
>>
>> library(lattice)
>> new <- make.groups(a,b,c,d)
>>
>> new$grp <- rep(gl(5,20, labels=c('A','B','C','D','E')), 4)
>>
>> bwplot(data ~ which | grp, data=new)
>>
>> Not quite means, but close!
>> 
>
> And
>
> demo("intervals", package = "lattice")
>
> shows you how to incorporate confidence intervals.
>
> -Deepayan
>
>   

Perhaps as long as you're learning a new plotting system, you might also 
check out whether ggplot2 might be an option.

I did a quick and dirty version (which I'm sure Hadley can improve and 
also remind me how to get rid of the legend that shows the "3" that I 
set the size to).

Assuming your data is re-shaped, so it comes out something like mine in 
the artificial example below, then it's a two-liner in ggplot:


maxdat.df <- data.frame (
score1 =  rnorm(9, mean = rep(c(10,20,30), each = 3), sd = 1 ) ,
SD = runif(9) * 2 + .5,
Group = factor ( rep ( c("V", "W", "X"), each = 3 ) ),
subGroup = rep( c("B","M","T"), 3) )
   
maxdat.df

library(ggplot2)
ggp <- ggplot ( maxdat.df, aes (y = score1, x = interaction(Group , 
subGroup), min = score1 - SD, max = score1 + SD, size = 3) )
ggp + geom_pointrange() + coord_flip()
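
The re-shaping step assumed above could look something like this in base R.
It is a sketch using the Score 1 means and SDs as I read them from Max's
table (one decimal place per value), turned into one row per Group and
subGroup combination with the same column names as maxdat.df.

```r
groups <- c("V", "W", "X", "Y", "Z")
pos    <- c("B", "M", "T")

# Means and SDs for Score 1, as read from the quoted table (rows = groups)
means <- matrix(c(86.9, 13.0, 88.8,
                  16.1, 96.1, 17.7,
                  50.7, 61.1, 74.7,
                  68.5, 99.7, 37.6,
                  92.7, 22.3, 69.4),
                nrow = 5, byrow = TRUE, dimnames = list(groups, pos))
sds <- matrix(c(16.9, 2.0, 10.5,
                 2.2, 4.6,  1.7,
                 4.7, 3.7,  7.6,
                 6.0, 8.0,  2.3,
                 6.5, 1.2,  2.2),
              nrow = 5, byrow = TRUE, dimnames = list(groups, pos))

# as.vector() walks a matrix column by column: all B values, then M, then T
long <- data.frame(Group    = factor(rep(groups, times = length(pos))),
                   subGroup = rep(pos, each = length(groups)),
                   score1   = as.vector(means),
                   SD       = as.vector(sds))
print(long)
```

The resulting data frame has the same shape as maxdat.df, so it can be
passed to the ggplot call in its place.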


Eric


Re: [R] adding layers in ggplot2 (data and code included)

2008-09-21 Thread Eric



The way you've attempted to get this result seems to align with the way 
R "should" work, but it fails in this case.

The fix is to break things up a little bit:

p <- ggplot(mydata, aes(x=Est, y=Tri))
p <- p + geom_point(aes(colour=factor(Group),shape=factor(Group)))
p <- p + 
geom_smooth(aes(group=factor(Group),color=factor(Group)),method=lm,se=F)

p


Eric



Juliet Hannah wrote:

Here is some sample data:

mydata <- read.table(textConnection("Est GroupTri
   00 4.639644
   10 4.579189
   20 4.590714
   01 4.443696
   11 4.588243
   21 4.650505
   02 4.296608
   12 4.826036
   22 4.765386"),header=TRUE);
  closeAllConnections();

I can form two plots, scatter and  lines, as follows:

p <- ggplot(mydata, aes(x=Est, y=Tri))
p + geom_point(aes(colour=factor(Group),shape=factor(Group)))

and

p+ geom_smooth(aes(group=factor(Group),color=factor(Group)),method=lm,se=F).

However, I am unable to have the plots together.

I obtain the following error:

  

p + 
geom_point(aes(colour=factor(Group),shape=factor(Group)))+geom_smooth(aes(group=factor(Group),color=factor(Group)),method=lm,se=F)


Error in `[.data.frame`(df, , var) : undefined columns selected

Thanks,

Juliet


Re: [R] ggplot2 problem

2008-11-26 Thread Eric

aes() does not have an argument, "Year" but it does have an argument "x" 
so try:


df <- data.frame(Year = rep(1:5,2))
m <- ggplot(df, aes(x=Year))
m + geom_bar()

(It works for me.)

Eric



steve wrote:

I'm using ggplot2 2.0.8 and R 2.8.0

df = data.frame(Year = rep(1:5,2))
m = ggplot(df, aes(Year=Year))
m + geom_bar()

Error in get("calculate", env = ., inherits = TRUE)(., ...) :
  attempt to apply non-function


Re: [R] as.data.frame doesn't set col.names

2017-10-25 Thread Eric Berger

Hi Peter,
Thanks for contributing such a great answer. Can you please provide a
pointer to the documentation where it explains why dd$B <- s and dd["B"] <-
s have such different behavior?

(I am perfectly happy if you write the explanation but if it saves you time
to point to some reference that works fine for me.)

Regards,
Eric
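
For anyone following along, a small self-contained comparison of the two
assignments in question. It is only a sketch of the observable behaviour
(the help page for the data frame methods, ?"[<-.data.frame", is where I
would expect the details to be); stringsAsFactors is set explicitly so the
result does not depend on the R version.

```r
l <- letters[1:5]
s <- data.frame(up = toupper(l), stringsAsFactors = FALSE)

# $<- stores the value as-is: column B is itself a data frame
dd1 <- data.frame(A = l, stringsAsFactors = FALSE)
dd1$B <- s

# [<- treats a data frame value as a list of columns and copies the column in
dd2 <- data.frame(A = l, stringsAsFactors = FALSE)
dd2["B"] <- s

# Extracting the column first gives the same result with $<-
dd3 <- data.frame(A = l, stringsAsFactors = FALSE)
dd3$B <- s[[1]]

print(class(dd1$B))  # "data.frame"
print(class(dd2$B))  # "character"
print(names(dd2))    # "A" "B"
```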


On Wed, Oct 25, 2017 at 2:27 PM, Peter Dalgaard  wrote:

>
> > On 24 Oct 2017, at 22:45 , David L Carlson  wrote:
> >
> > You left out all the most important bits of information. What is yo? Are
> you trying to assign a data frame to a single column in another data frame?
> Printing head(samples) tells us nothing about what data types you have,
> especially if the things that look like text are really factors that were
> created when you used one of the read.*() functions. Use str(samples) to
> see what you are dealing with.
>
> Actually, I think there is enough information to diagnose this. The main
> issue is as you point out, assignment of an entire data frame to a column
> of another data frame:
>
> > l <- letters[1:5]
> > s <- as.data.frame(sapply(l,toupper))
> > dput(s)
> structure(list(`sapply(l, toupper)` = structure(1:5, .Label = c("A",
> "B", "C", "D", "E"), class = "factor")), .Names = "sapply(l, toupper)",
> row.names = c("a",
> "b", "c", "d", "e"), class = "data.frame")
>
> (incidentally, setting col.names has no effect on this; notice that it is
> only documented as an argument to "list" and "matrix" methods, and sapply()
> returns a vector)
>
> Now, if we do this:
>
> > dd <- data.frame(A=l)
> > dd$B <- s
>
> we end up with a data frame whose B "column" is another data frame
>
> > dput(dd)
> structure(list(A = structure(1:5, .Label = c("a", "b", "c", "d",
> "e"), class = "factor"), B = structure(list(`sapply(l, toupper)` =
> structure(1:5, .Label = c("A",
> "B", "C", "D", "E"), class = "factor")), .Names = "sapply(l, toupper)",
> row.names = c("a",
> "b", "c", "d", "e"), class = "data.frame")), .Names = c("A",
> "B"), row.names = c(NA, -5L), class = "data.frame")
>
> in printing such data frames, the inner frame "wins" the column names,
> which is sensible if you consider what would happen if it had more than one
> column:
>
> > dd
>   A sapply(l, toupper)
> 1 a  A
> 2 b  B
> 3 c  C
> 4 d  D
> 5 e  E
>
> To get the effect that Ed probably expected, do
>
> > dd <- data.frame(A=l)
> > dd["B"] <- s
> > dd
>   A B
> 1 a A
> 2 b B
> 3 c C
> 4 d D
> 5 e E
>
> (and notice that single-bracket indexing is crucial here)
>
> -pd
>

Re: [R] R encountered a fatal error. The session was terminated. + * caught illegal operation *

2017-10-26 Thread Eric Berger

How about going back to earlier versions if you don't need the latest ones?

On Thu, Oct 26, 2017 at 12:59 PM, Klaus Michael Keller <
klaus.kel...@graduateinstitute.ch> wrote:

> Dear all,
>
> I just installed the "Short Summer" R update last week. Now, my R Studio
> doesn't open anymore!
>
> --> R encountered a fatal error.  The session was terminated.
>
> and my R terminal doesn't close properly
>
> --> *** caught illegal operation ***
>
> I restarted my Mac OS Sierra 10.12.6 and reinstalled both R 3.4.2 and the
> latest R studio but the problem persists.
>
> How can that issue be solved?
>
> Thanks in advance for your a precious help!
>
> All the best from Switzerland,
>
> Klaus

Re: [R] My function and NA Values Problem

2017-10-27 Thread Eric Berger

na.rm=TRUE  (you need to capitalize)
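
Spelled out as a sketch with a made-up g1 (the attached file did not come
through): R's logical constants are TRUE and FALSE in capitals, and na.rm
belongs inside the call to sum().

```r
# Hypothetical stand-in for g1.csv
g1 <- data.frame(Rice = c(1, 0, 3), Coke = c(NA, 0, 2))

# Count zeros per column, ignoring NA values
zeros <- apply(g1, 2, function(c) sum(c == 0, na.rm = TRUE))
print(zeros)

# Equivalent without apply()
zeros2 <- colSums(g1 == 0, na.rm = TRUE)
print(zeros2)
```

Both forms work unchanged for 450 columns.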


On Fri, Oct 27, 2017 at 10:43 AM, Engin YILMAZ 
wrote:

> Dear R Staff
>
> My working file is in the annex. "g1.csv"
> I have only 2 columns. Rice and coke.
> I try to execute following(below) function, but do not work.
> Because "Coke" value has NA values.
> I try to add "na.rm=True" to the function but do not work
> How can I solve this problem with this function or another algorithm?
> (Note: I have normally 450 columns)
>
> Sincerely
> Engin YILMAZ
>
>
> apply(g1, 2, function(c) sum(c==0))
>
> Rice Coke
>0   NA

Re: [R] Count non-zero values in excluding NA Values

2017-10-29 Thread Eric Berger

If one does not need all the intermediate results then after defining data
just one line:

grand_total <- nrow(data)*ncol(data) - sum( sapply(data, function(x) sum(
is.na(x) | x == 0 ) ) )
# 76
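
An equivalent way to write the same count, shown on a small made-up data
frame so the result can be checked by eye: a cell is counted when it is
neither NA nor zero.

```r
# Hypothetical data: 6 cells, of which one is NA and two are zero
data <- data.frame(a = c(1, 0, NA), b = c(0, 5, 7))

# The one-liner from above
grand_total <- nrow(data) * ncol(data) -
  sum(sapply(data, function(x) sum(is.na(x) | x == 0)))

# Same count, phrased positively
grand_total2 <- sum(!is.na(data) & data != 0)

print(grand_total)   # 3
print(grand_total2)  # 3
```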




On Sun, Oct 29, 2017 at 2:38 PM, Rui Barradas  wrote:

> Hello,
>
> Your attachment didn't came through, R-Help strips off most types of
> files, including CSV.
> Anyway, the following will do what I understand of your question. Tested
> with a fake dataset.
>
>
> set.seed(3026)# make the results reproducible
> data <- matrix(1:100, ncol = 10)
> data[sample(100, 15)] <- 0
> data[sample(100, 10)] <- NA
> data <- as.data.frame(data)
>
> zero <- sapply(data, function(x) sum(x == 0, na.rm = TRUE))
> na <- sapply(data, function(x) sum(is.na(x)))
> totals <- nrow(data) - zero - na  # totals non zero per column
> grand_total <- sum(totals)# total non zero
>
> totals
> # V1  V2  V3  V4  V5  V6  V7  V8  V9 V10
> #  6   8   8   8   8   7   7   8   6  10
>
> grand_total
> #[1] 76
>
> # another way
> prod(dim(data)) - sum(zero + na)
> #[1] 76
>
>
> Hope this helps,
>
> Rui Barradas
>
>
> Em 29-10-2017 10:25, Engin YILMAZ escreveu:
>
>> Dear R Staff
>>
>> You can see my data.csv file in the annex.
>>
>> I try to count non-zero values in dataset but I need to exclude NA in this
>> calculation
>>
>> My code is very long (following),
>> How can I write this code more efficiently and shortly?
>>
>> ## [NA_Count] - Find NA values
>>
>> data.na =sapply(data[,3:ncol(data)], function(c) sum(length(which(is.na(c)))))
>>
>>
>> ## [Zero] - Find zero values
>>
>> data.z=apply(data[,3:ncol(data)], 2, function(c) sum(c==0))
>>
>>
>> ## [Non-Zero] - Find non-zero values
>>
>> data.nz=nrow(data[,3:ncol(data)])- (data.na+data.z)
>>
>>
>> Sincerely
>> Engin YILMAZ
>>

Re: [R] Pass Parameters to RScript?

2017-10-30 Thread Eric Berger

I did a simple search and got hits immediately, e.g.
https://www.r-bloggers.com/passing-arguments-to-an-r-script-from-command-lines/
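
In short, the approach described there relies on commandArgs(). A minimal
sketch, assuming a script invoked as "Rscript myscript.R 10 input.csv" (the
script name, the two parameters, and parse_args are all made up for the
example):

```r
# Hypothetical parser: expects <n> <file> after the script name
parse_args <- function(args = commandArgs(trailingOnly = TRUE)) {
  if (length(args) < 2) stop("usage: Rscript myscript.R <n> <file>")
  # Arguments always arrive as character strings, so convert as needed
  list(n = as.integer(args[1]), file = args[2])
}

# Simulating the command line above; inside the real script, call parse_args()
opts <- parse_args(c("10", "input.csv"))
print(opts$n)     # 10
print(opts$file)  # "input.csv"
```

Any program that can launch a process with arguments can pass values to the
script this way.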

On Mon, Oct 30, 2017 at 2:30 PM, Morkus via R-help 
wrote:

> Is it possible to pass parameters to an R Script, say, from Java or other
> language?
>
> I did some searches, but came up blank.
>
> Thanks very much in advance,
>
> Sent from [ProtonMail](https://protonmail.com), Swiss-based encrypted
> email.

Re: [R] Pass Parameters to RScript?

2017-10-30 Thread Eric Berger

I do not program in Java but it seems a Java program can make system calls
which would be equivalent to running from the command line, but done from
within a Java program. Not sure whether that would meet your needs and if
not why not. Just a suggestion.

Check out

http://www.java-samples.com/showtutorial.php?tutorialid=8



On Mon, Oct 30, 2017 at 5:10 PM, Morkus  wrote:

> Thanks Eric,
>
> I saw that page, too, but it states:
>
> "This post describes how to pass external arguments to *R* when calling a
> Rscript *with a command line.*"
>
> Not what I'm trying to do.
>
> Thanks for your reply.
>
> Sent from ProtonMail <https://protonmail.com>, Swiss-based encrypted
> email.
>
>
>  Original Message 
> Subject: Re: [R] Pass Parameters to RScript?
> Local Time: October 30, 2017 9:39 AM
> UTC Time: October 30, 2017 1:39 PM
> From: ericjber...@gmail.com
> To: Morkus 
> r-help@r-project.org 
>
> I did a simple search and got  hits immediately, e.g.
> https://www.r-bloggers.com/passing-arguments-to-an-r-
> script-from-command-lines/
>
>
> On Mon, Oct 30, 2017 at 2:30 PM, Morkus via R-help 
> wrote:
>
>> Is it possible to pass parameters to an R Script, say, from Java or other
>> language?
>>
>> I did some searches, but came up blank.
>>
>> Thanks very much in advance,
>>
>> Sent from [ProtonMail](https://protonmail.com), Swiss-based encrypted
>> email.
>>
>
>


Re: [R] convertTime package.

2017-10-31 Thread Eric Berger

If you need a function (e.g. convertTime) from a package whose name you do
not know, you cannot simply instruct R to install the function.
e.g. if you give the command
> install.packages("convertTime")
you will get an error message like
"package 'convertTime' is not available (for R version 3.4.1)"

I did a Google search and found a package called SGP that has a function
"convertTime". I have no idea if it is the function you are looking for but
installing that package worked fine in R version 3.4.1.  So you can try

> install.packages("SGP")
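
Before installing anything, it can also help to check whether a package you already have exports the function you are after. A minimal sketch follows; the helper name has_function is made up for illustration, and it only looks at packages that are already installed.

```r
# Hypothetical helper: TRUE if 'pkg' is installed and exports a function
# (or other object) called 'fn'; FALSE otherwise.
has_function <- function(pkg, fn) {
  requireNamespace(pkg, quietly = TRUE) && fn %in% getNamespaceExports(pkg)
}

has_function("stats", "median")          # a base package, always available
# has_function("SGP", "convertTime")     # try this once SGP is installed
```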

HTH,
Eric


On Tue, Oct 31, 2017 at 9:15 PM, Sarah Goslee 
wrote:

> Hi Scott,
>
> Where did you get this function originally? I can't find anything about it.
>
> What OS are you using?
>
> What says, "not available for the version"? Where are you getting that
> error?
>
> What are you trying to accomplish? What does that function actually
> do? It's impossible to suggest a work-around for a function of unknown
> purpose and origin.
>
> (The posting guide for this list suggests you include all of that
> information when you inquire.)
>
> Sarah
>
>
> On Tue, Oct 31, 2017 at 2:04 PM, Scott Anderwald via R-help
>  wrote:
> > To whom it might concern.  I am working on a project that needs the
> convertTime function. I am currently using version 3.4.1 and it says not
> available for the version.  Two questions is there a work around for the
> function or is there another package that contains that functions.
> >
> >
> > Thanks,
> >
> >
> > Scott Anderwald
>
>
>
> --
> Sarah Goslee
> http://www.functionaldiversity.org
>
>


Re: [R] beta binomial distribution installation

2017-11-01 Thread Eric Berger

Hi,
I did a quick search for other packages that provide the beta binomial
distribution and found "rmutil".

> install.packages("rmutil")

The package has the CDF (pbetabinom) and inverse CDF (qbetabinom) among
other functions.

HTH,
Eric



On Wed, Nov 1, 2017 at 7:50 AM, MCGUIRE, Rhydwyn <
rm...@doh.health.nsw.gov.au> wrote:

> Hi there,
>
> It looks like you also need the bioconductor package biobase, I found
> instructions for downloading that package here:
> www.bioconductor.org/install
>
> Good luck.
>
> Cheers,
> Rhydwyn
>
> -Original Message-
> From: R-help [mailto:r-help-boun...@r-project.org] On Behalf Of Amany
> Abdel-Karim
> Sent: Wednesday, 1 November 2017 2:13 PM
> To: r-h...@stat.math.ethz.ch
> Subject: [R] beta binomial distribution installation
>
> Hello,
>
> I tried to install the package TailRank using the commands
> install.packages("TailRank") and library(TailRank), but I got the following
> errors. So, how can I install TailRank in RStudio to get the beta-binomial
> distribution, CDF and inverse CDF of the beta-binomial?
>
> The commands I used are:
>
> > install.packages("TailRank")
>
> Installing package into C:/Users/stator-guest/Documents/R/win-library/3.4
>
> (as lib is unspecified)
>
> Warning in install.packages :
>
>   dependency Biobase is not available
>
> trying URL 'https://cran.rstudio.com/bin/windows/contrib/3.4/TailRank_
> 3.1.3.zip'
>
> Content type 'application/zip' length 331270 bytes (323 KB)
>
> downloaded 323 KB
>
>
>
> package TailRank successfully unpacked and MD5 sums checked
>
>
>
> The downloaded binary packages are in
>
> C:\Users\stator-guest\AppData\Local\Temp\RtmpoVx40V\
> downloaded_packages
>
> > library(TailRank)
>
> Error: package or namespace load failed for TailRank in loadNamespace(i,
> c(lib.loc, .libPaths()), versionCheck = vI[[i]]):
>
>  there is no package called Biobase
>
> In addition: Warning message:
>
> package TailRank was built under R version 3.4.2
>
>
>
>
>


Re: [R] Function to save results

2017-11-01 Thread Eric Berger

Some comments:
1. sink() does not return a value. There is no point in setting attr <-
sink(...). Just give the command sink("C://etc")
2. To complete the saving to the file you must give a second sink command
with no argument:  sink()
So your code would be (pseudo-code, not actual code)

sink( "filename" )
do something that prints output which will be captured by sink
sink()
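
A runnable version of that pattern, as a small sketch. The temporary file and the printed summary are stand-ins of my own, not your actual paths or analysis.

```r
out_file <- tempfile(fileext = ".txt")   # stand-in for your output path

sink(out_file)                  # start diverting console output to the file
print(summary(c(1, 2, 3, 4)))   # anything printed here lands in the file
sink()                          # stop diverting; output goes to the console again

captured <- readLines(out_file) # read the file back to see what was saved
```

Only what is actually printed between the two sink() calls ends up in the file.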

HTH,
Eric

On Wed, Nov 1, 2017 at 1:32 PM, Priya Arasu via R-help  wrote:

> Hi,
> I want the results to be saved automatically in an output text file
> after the script has finished running.
>
> I used the sink function in the following example, but the results file
> (output.txt) was empty.
>
> # First I loaded the input file for which I want to identify attractors
> net <- loadNetwork("C://Users//Priya//Desktop//Attractor analysis_all genes//synaptogenesis//regulationof_dopamine_signaling_submodule3.txt")
> # used the sink function to save the results from attr function
> attr <- sink("C://Users//Priya//Desktop//Attractor analysis_all genes//synaptogenesis//output.txt")
> # then ran the script for identifying attractors
> attr <- getAttractors(net, type="asynchronous")
>
> Is there any function to save the results before setting the script to
> run, so that results are automatically saved in a text file after the
> script has finished running?
>
> Thank you
> Priya
>
>
>
>


Re: [R] Function to save results

2017-11-01 Thread Eric Berger

Hi Priya,

You did not follow the logic of the pseudo-code.
The sink("filename"), sink() pair captures whatever output is generated
between the first sink statement and the second sink statement.
You need (possibly) to do:

sink("C://Users//Priya//Desktop//Attractor analysis_all genes//synaptogenesis//attr.txt")


net <- loadNetwork("C://Users//Priya//Desktop//Attractor analysis_all genes//synaptogenesis//regulationof_dopamine_signaling_submodule3.txt")

attr <- getAttractors(net, type="asynchronous")


sink()
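
One thing to keep in mind (a general R point, not specific to your packages): an assignment prints nothing, so there may be nothing for sink() to capture unless the result is printed explicitly. The sketch below uses capture.output() as a convenient stand-in for the sink() pair, with a toy calculation in place of your analysis.

```r
x <- c(2, 4, 6)

# An assignment on its own produces no console output ...
silent  <- capture.output(y <- mean(x))
# ... whereas an explicit print() does
visible <- capture.output(print(mean(x)))

length(silent)    # number of captured lines for the assignment
visible           # captured lines for the explicit print
```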


HTH,

Eric






On Wed, Nov 1, 2017 at 4:10 PM, Priya Arasu  wrote:

> Hi Eric,
> I tried as you suggested but I could not find the output in the text file
> I created (attr.txt)
>
> net <- loadNetwork("C://Users//Priya//Desktop//Attractor analysis_all 
> genes//synaptogenesis//regulationof_dopamine_signaling_submodule3.txt")
>
> sink("C://Users//Priya//Desktop//Attractor analysis_all 
> genes//synaptogenesis//attr.txt")
>
>
> sink()
>
> attr <- getAttractors(net, type="asynchronous")
>
>
> Priya
>
>


Re: [R] beta binomial distribution installation

2017-11-01 Thread Eric Berger

Hi Amany,
I had no trouble installing TailRank and bioconductor using the link
Rhydwyn provided.
I was curious about your statement that TailRank uses a different
parameterization for the betabinomial distribution than rmutil.
I looked at the documentation for the two packages and the transformation
to go from one to the other is straightforward.
If you want, you can do the following.
Let (N,u,v) be the parameters used in TailRank and (N,m,s) the parameters
used in rmutil. The correspondence is:

N = N, m  = u/(u+v), s = u+v.

This means you can define functions such as:

mypbb <- function(x,N,u,v) { rmutil::pbetabinom(x,N,u/(u+v),u+v) }   # CDF
myqbb <- function(x,N,u,v) { rmutil::qbetabinom(x,N,u/(u+v),u+v) }   # inverse CDF
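
To sanity-check the correspondence without either package installed, here is a base-R sketch of the beta-binomial in both forms. It does not call TailRank or rmutil; it only relies on the mapping m = u/(u+v), s = u+v stated above, and the function names (dbb_uv, dbb_ms, pbb_uv, qbb_uv) are made up for this illustration.

```r
# pmf in the (N, u, v) shape-parameter form: choose(N,x) * B(x+u, N-x+v) / B(u,v)
dbb_uv <- function(x, N, u, v) {
  exp(lchoose(N, x) + lbeta(x + u, N - x + v) - lbeta(u, v))
}

# Same pmf in the (N, m, s) form, using u = m*s and v = (1-m)*s
dbb_ms <- function(x, N, m, s) {
  dbb_uv(x, N, u = m * s, v = (1 - m) * s)
}

# CDF for integer q in 0..N
pbb_uv <- function(q, N, u, v) {
  cumsum(dbb_uv(0:N, N, u, v))[q + 1]
}

# Inverse CDF: smallest x whose CDF reaches p (small tolerance for rounding)
qbb_uv <- function(p, N, u, v) {
  cdf <- cumsum(dbb_uv(0:N, N, u, v))
  vapply(p, function(pr) which(cdf >= pr - 1e-12)[1] - 1, numeric(1))
}

N <- 20; u <- 3; v <- 10
x <- 0:N
# largest discrepancy between the two parameterizations (should be negligible)
max_diff <- max(abs(dbb_uv(x, N, u, v) - dbb_ms(x, N, u / (u + v), u + v)))
```

If the wrappers above are correct, their results should agree with pbb_uv and qbb_uv for the same (N,u,v).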

HTH,
Eric





On Wed, Nov 1, 2017 at 6:09 PM, Amany Abdel-Karim 
wrote:

> Hello,
>
> Thank you for your response. I need to install the TailRank package since it
> contains the beta-binomial distribution, CDF and inverse CDF in the usual
> form, which I need to use. The rmutil package, however, uses unusual forms
> for these functions, so it is easier for me to work with the forms
> contained in TailRank.
>
> I tried to install  bioconductor package, using the following commands
> but I still got the following errors:
>
>
>  (1) I tried biocLite() and then library ("TailRank"), I got the following
> errors.
>
> > biocLite()
> Error in biocLite() : could not find function "biocLite"
> > library("TailRank")
> Loading required package: oompaBase
> Error: package or namespace load failed for ‘TailRank’ in loadNamespace(i, c(lib.loc, .libPaths()), versionCheck = vI[[i]]):
>  there is no package called ‘Biobase’
> In addition: Warning messages:
> 1: package ‘TailRank’ was built under R version 3.4.2
> 2: package ‘oompaBase’ was built under R version 3.4.2
>
>
> (2) I tried to write the command biocLite(), then biocLite("TailRank"), I got 
> the following errors:
>
> > biocLite()
> Error in biocLite() : could not find function "biocLite"
> > biocLite("ilRank")
> Error in biocLite("ilRank") : could not find function "biocLite"
> > biocLite()
> Error in biocLite() : could not find function "biocLite"
> > biocLite("TailRank")
> Error in biocLite("TailRank") : could not find function "biocLite"
>
> >
>
>
> Also, I checked under packages on the right side of the R window and I
> found TailRank , Description is Tail-Rank statistic, and version is 3.1.3.
> So, I tried to write the following code in the console window to check if
> the package works:
>
> > N<-20
> > u<-3
> > v<-10
> > p<-u/u+v
> > x<-0:N
>
> > yy<-dbb(x,N,u,v)
>
>
> I got the following error:
> Error in dbb(x, N, u, v) : could not find function "dbb"
>
> >
>
> I am confused: if the package TailRank is already there, why does the
> previous code fail to calculate dbb(x,N,u,v) with an error? If
> I do not have the package, would you please let me know the right commands
> I should write in the script window to install TailRank, because the commands
> I used (which I mentioned at the beginning of the email) did not work and
> gave errors. I appreciate your help since I am a new user of R.
>
>
> Amany
>
>
>

Re: [R] Correct subsetting in R

2017-11-01 Thread Eric Berger

matches <- merge(training,data,by=intersect(names(training),names(data)))

HTH,
Eric


On Wed, Nov 1, 2017 at 6:13 PM, Elahe chalabi via R-help <
r-help@r-project.org> wrote:

> Hi all,
> I have two data frames that one of them does not have the column ID:
>
> > str(data)
> 'data.frame':   499 obs. of  608 variables:
> $ ID   : int  1 2 3 4 5 6 7 8 9 10 ...
> $ alright  : int  1 0 0 0 0 0 0 1 2 1 ...
> $ bad  : int  1 0 0 0 0 0 0 0 0 0 ...
> $ boy  : int  1 2 1 1 0 2 2 4 2 1 ...
> $ cooki: int  1 2 2 1 0 1 1 4 2 3 ...
> $ curtain  : int  1 0 0 0 0 2 0 2 0 0 ...
> $ dish : int  2 1 0 1 0 0 1 2 2 2 ...
> $ doesnt   : int  1 0 0 0 0 0 0 0 1 0 ...
> $ dont : int  2 1 4 2 0 0 2 1 2 0 ...
> $ fall : int  3 1 0 0 1 0 1 2 3 2 ...
> $ fell : int  1 0 0 0 0 0 0 0 0 0 ...
>
> and the other one is:
>
> > str(training)
> 'data.frame':   375 obs. of  607 variables:
> $ alright  : num  1 0 0 0 1 2 1 0 0 0 ...
> $ bad  : num  1 0 0 0 0 0 0 0 0 0 ...
> $ boy  : num  1 1 2 2 4 2 1 0 1 0 ...
> $ cooki: num  1 1 1 1 4 2 3 1 2 2 ...
> $ curtain  : num  1 0 2 0 2 0 0 0 0 0 ...
> $ dish : num  2 1 0 1 2 2 2 1 4 1 ...
> $ doesnt   : num  1 0 0 0 0 1 0 0 0 0 ...
> $ dont : num  2 2 0 2 1 2 0 0 1 0 ...
> $ fall : num  3 0 0 1 2 3 2 0 2 0 ...
> $ fell : num  1 0 0 0 0 0 0 0 0 0 ...
> Does anyone know how should I get the IDs of training from data?
> thanks for any help!
> Elahe
>
>


Re: [R] Function to save results

2017-11-01 Thread Eric Berger

Hi Priya,
I think your original question may have been phrased in a way that caused
David and me some confusion.
I think sink() may not be the appropriate function in your case.
sink() is used to capture output that would otherwise go to the console (so
to speak).
You are trying to save the results of calculations returned, in this case
in the variable 'attr'.
You need to do something like:

attr <- getAttractors( ... )
saveRDS( attr, "filename.RDS")

and then later you can read the results back in another R session:

savedAttr <- readRDS("filename.RDS")

Look at the documentation ?saveRDS and ?readRDS
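
Here is a self-contained round trip you can run as-is. The list below is only a stand-in for whatever getAttractors() returns, and the temporary file stands in for your own file name.

```r
# Stand-in for the object returned by your analysis
result <- list(attractors = c(1, 3, 5), type = "asynchronous")

rds_file <- tempfile(fileext = ".RDS")   # stand-in for "filename.RDS"
saveRDS(result, rds_file)                # write the R object itself to disk

restored <- readRDS(rds_file)            # later, possibly in a new R session
same <- identical(result, restored)      # check the object survived the round trip
```

If the saveRDS() call is the last line of the script, the result is written as soon as the calculation finishes, with no manual step afterwards.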

HTH,
Eric

On Wed, Nov 1, 2017 at 6:02 PM, David L Carlson  wrote:

> Let's try a simple example.
>
> > # Create a script file of commands
> > # Note we must print the results of quantile explicitly
> > cat("x <- rnorm(50)\nprint(quantile(x))\nstem(x)\n", file="Test.R")
> >
> > # Test it by running it to the console
> > source("Test.R")
>         0%        25%        50%        75%       100%
> -2.4736219 -0.7915433 -0.1178056  0.7023577  2.9158617
>
>   The decimal point is at the |
>
>   -2 | 510
>   -1 | 7631110
>   -0 | 998877733211
>    0 | 011244889
>    1 | 00045
>    2 | 19
>
> >
> > # Now run it and save the file
> > sink("Testout.txt")
> > source("Test.R")
> > sink()
> >
> > # What is located in "Testout.txt"?
> > cat(readLines("Testout.txt"), sep="\n")
>          0%         25%         50%         75%        100%
> -2.47511893 -0.47919111  0.05761628  0.67403447  1.79825459
>
>   The decimal point is at the |
>
>   -2 | 5
>   -2 | 4
>   -1 |
>   -1 | 432000
>   -0 | 87755
>   -0 | 442110
>    0 | 001244
>    0 | 556789
>    1 | 113
>    1 | 5788
>
> > # Success
>
> Depending on your operating system, you may also be able to save the
> output with File | Save to File.
>
> ---
> David L Carlson
> Department of Anthropology
> Texas A&M University
> College Station, TX 77843-4352
>
> -Original Message-
> From: R-help [mailto:r-help-boun...@r-project.org] On Behalf Of Priya
> Arasu via R-help
> Sent: Wednesday, November 1, 2017 9:57 AM
> To: Eric Berger 
> Cc: r-help@r-project.org
> Subject: Re: [R] Function to save results
>
> Hi Eric,
> Thanks for the explanation. Is there a way to save the results
> automatically after the analysis is over? I recently lost my
> results because I didn't save them. I don't want to run the sink or
> save command after the analysis is over, but rather run the command for
> saving the file before starting the analysis, so the file gets saved
> automatically after the script has finished running.
> Priya
>
>
>

Re: [R] Correct subsetting in R

2017-11-01 Thread Eric Berger

training$TrainingRownum <- 1:nrow(training)
data$DataRownum <- 1:nrow(data)
matches <- merge(training,data,by=intersect(names(training),names(data)))

The data frame 'matches' now has additional columns telling you the row in
each data frame corresponding to the matched items.
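
A toy version you can run to see the idea. The small data frames are invented for illustration, and the sketch assumes every row of 'data' is unique across the shared columns (with duplicated rows, merge would return more than one ID per training row).

```r
# Toy stand-ins for the real data frames
data     <- data.frame(ID = 1:5,
                       alright = c(1, 0, 0, 2, 1),
                       bad     = c(1, 0, 2, 0, 3))
training <- data[c(2, 4, 5), c("alright", "bad")]   # subset without the ID column

by_cols <- intersect(names(training), names(data))  # columns present in both
training$TrainingRownum <- seq_len(nrow(training))  # remember the original order

matches <- merge(training, data, by = by_cols)

# merge() sorts by the key columns, so restore the training order
ids <- matches$ID[order(matches$TrainingRownum)]
```

Here 'ids' holds one ID per row of 'training', in the order of 'training'.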

Regards,
Eric

On Wed, Nov 1, 2017 at 9:29 PM, Elahe chalabi 
wrote:

>
> It's not what I want, the first data frame has 499 observations and the
> second data frame is a subset of the first one but with 375 observations. I
> want something that returns the ID for training data frame
>
>
>


Re: [R] Adding Records to a Table in R

2017-11-01 Thread Eric Berger

Hi Paul,

#First I set up some sample data since I don't have a copy of your data
dtOrig <- as.Date( c("1985-04-01","1985-07-01","1985-12-01","1986-04-01"))
dfOrig <- data.frame( TransitDate=dtOrig, Transits=c(100,100,500,325),
                      CargoTons=c(1000,1080,3785,4200) )

#Generate the complete set of dates as a data frame
dfDates<- data.frame( TransitDate=seq(from=as.Date("1985-04-01"),by="1 month",length=13) )

# do the merge adding the "missing" rows (where NA will appear)
dfNew  <- merge(dfDates, dfOrig, by="TransitDate", all.x=TRUE )

# replace the NA's by zero
dfNew[is.na(dfNew)] <- 0
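
A variation on the same merge, as a sketch: instead of hard-coding the number of months, take the range from the first to the last date present in the data, and zero-fill only the numeric columns. The names allDates, dfFull and value_cols are just illustrative, and the sample data is the same made-up data as above.

```r
dfOrig <- data.frame(
  TransitDate = as.Date(c("1985-04-01", "1985-07-01", "1985-12-01", "1986-04-01")),
  Transits    = c(100, 100, 500, 325),
  CargoTons   = c(1000, 1080, 3785, 4200)
)

# Monthly sequence spanning the first to the last observed date
allDates <- data.frame(
  TransitDate = seq(from = min(dfOrig$TransitDate),
                    to   = max(dfOrig$TransitDate),
                    by   = "month")
)

# Left join: keep every month, with NA where the original had no row
dfFull <- merge(allDates, dfOrig, by = "TransitDate", all.x = TRUE)

# Replace NA by zero in the value columns only (the date column is untouched)
value_cols <- c("Transits", "CargoTons")
dfFull[value_cols][is.na(dfFull[value_cols])] <- 0
```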

HTH,
Eric


On Wed, Nov 1, 2017 at 9:45 PM, Paul Bernal  wrote:

> Dear R friends,
>
> I am currently working with time series data, and I have a table(as data
> frame) that has looks like this (TransitDate are in format = "%e-%B-%Y") :
>
> TransitDate   Transits  CargoTons
> 1985-04-01    100       2500
> 1985-05-01    135       4500
> 1985-06-01    120       1750
> 1985-07-01    100       3750
> 1985-08-01    200       1250
>
> The problem is, that there are several periods that don't exist in the
> table, so it has the following behavior:
>
> TransitDate   Transits  CargoTons
> 1985-04-01    100       1000
> 1985-07-01    100       1080
> 1985-12-01    500       3785
> 1986-04-01    325       4200
> .
> .
> 2017-09-01    400       2350 (*this is the last observation)
>
> You can see in the last table fragment that the series jumps from
> 1985-04-01 to 1985-07-01, then it jumps from there to 1985-12-01 making the
> time series quite irregular (non-constant chronologically speaking).
>
> What I want to do is create a dummy table that has the sequence from the
> first observation (1985-04-01) up to the last one (2017-09-01) and then
> develop code that checks if the dates contained in the dummy table exist
> in the original table; if they don't exist, then add those dates and put
> zeroes in the fields.
>
> How can I achieve this?
>
> Any help will be greatly appreciated,
>
> Best regards,
>
> Paul
>


Re: [R] FW: Time Series

2017-11-07 Thread Eric Berger

Following Erin's pointer:

library(zoo)
times <- seq(from=as.POSIXct("2015-12-18 00:00:00"),
to=as.POSIXct("2017-10-24 23:00:00"), by="hour")
mydata <- rnorm(length(times))
tseri  <- zoo( x=mydata, order.by=times )

HTH,
Eric


On Tue, Nov 7, 2017 at 9:59 AM, Erin Hodgess 
wrote:

> Hello!
>
> What is the error message, please?
>
> At first glance, you are using the "ts" function.  That doesn't work for
> hourly frequency.
>
> You may want to create a zoo object.
>
> This is Round One.
>
> Sincerely,
> Erin
>
>
> On Tue, Nov 7, 2017 at 1:46 AM, Emre Karagülle 
> wrote:
>
> >
> > Hi,
> > I would like to ask a question about time series.
> > I am trying to convert my data into time series data.
> > I have hourly data from “2015-12-18 00:00” to “2017-10-24 23:00”
> > I am trying the following codes but they are not working.
> > Could you help me out?
> >
> > tseri <- ts(data ,seq(from=as.POSIXct("2015-12-18 00:00:00"),
> > to=as.POSIXct("2017-10-24 23:00:00"), by="hour"))
> >
> > tseri <- ts(data ,seq(from=as.Date("2015-12-18 00:00:00"),
> > to=as.Date("2017-10-24 23:00:00"), by="hour"))
> >
> >
> > Thank you
> >
> > --
> > Emre
> >
> >
> >
>
>
>
>
> --
> Erin Hodgess
> Associate Professor
> Department of Mathematical and Statistics
> University of Houston - Downtown
> mailto: erinm.hodg...@gmail.com
>
>


[R] Fwd: FW: Time Series

2017-11-07 Thread Eric Berger

[Please send replies to r-help, not to individual responders]
Emre,
In R, when a function is defined via something like
f <- function( foo, bar )
then you can call it as, for example,

a <- f(x,y)

or

a <- f(foo=x, bar=y)

or even

a <- f( bar=y, foo=x)   # notice I switched the order!

The first approach requires you to pass the arguments in the same order as
the function is expecting.
In the second and third examples you can pass the arguments in any order
you want since you have indicated the variable names.
The point is that the variable name goes on the left of the '=' and the
values (or variables) you are passing go on the right of the '='.
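
A small runnable sketch of the three call styles described above. The function f and its body (a subtraction, chosen only so that the argument order matters) are made up for illustration:

```r
# Hypothetical function: subtraction makes the argument order visible.
f <- function(foo, bar) {
  foo - bar
}

x <- 10
y <- 3

a1 <- f(x, y)              # positional: foo = x, bar = y
a2 <- f(foo = x, bar = y)  # named, same order as the definition
a3 <- f(bar = y, foo = x)  # named, order switched: same result
a4 <- f(y, x)              # positional with the order switched: a different call

print(c(a1, a2, a3, a4))
```

The first three calls give the same value; only the last one differs, because positional matching assigns y to foo.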

The zoo function is defined as
zoo( x = NULL, order.by = index(x), etc ... )

In my example code I passed the variable 'mydata' to the parameter 'x' via

zoo(x=mydata, order.by=times)

If you called your local variable x then you can call zoo via

zoo(x=x, order.by=whatever)  # using the 'named parameters' approach

or

zoo(x, order.by=whatever)   # where zoo will match the first argument to
                            # the first parameter in its definition.

Hopefully this will help you understand why some of your attempts worked
and some did not work.

Regards,
Eric

Hi Erin and Eric

As both of you suggested, I followed Erin's suggestion.

The following command failed when I wrote x=mydata (my x is a numeric
vector). It says "unused argument".

tseri  <- zoo( x=mydata, order.by=times )

When I use it without x=mydata, like

tseri  <- zoo( x, order.by=times )

it works.

I checked it with the following command

x[times==as.POSIXct("2015-12-18 02:00:00")] and it gave me the correct value.

Do you think it is okay?

By the way, I appreciate the fast reply.

Thank you.

--
Emre

*From: *Eric Berger 
*Sent: *Tuesday, November 7, 2017 11:08 AM
*To: *Erin Hodgess 
*Cc: *Emre Karagülle ; r-help@r-project.org
*Subject: *Re: [R] FW: Time Series

Following Erin's pointer:

library(zoo)
times <- seq(from=as.POSIXct("2015-12-18 00:00:00"),
to=as.POSIXct("2017-10-24 23:00:00"), by="hour")
mydata <- rnorm(length(times))
tseri  <- zoo( x=mydata, order.by=times )
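
Before handing the data to zoo, it may be worth checking the time index itself. The sketch below uses base R only; the fixed UTC time zone is an assumption added here (with a local time zone, daylight-saving changes can affect the hourly sequence), and mydata is a random stand-in for the real measurements:

```r
# Hourly index in a fixed time zone (tz = "UTC" is an assumption of this sketch)
times <- seq(from = as.POSIXct("2015-12-18 00:00:00", tz = "UTC"),
             to   = as.POSIXct("2017-10-24 23:00:00", tz = "UTC"),
             by   = "hour")
mydata <- rnorm(length(times))   # stand-in for the real hourly values

# Checks worth running before calling zoo(x = mydata, order.by = times)
stopifnot(length(mydata) == length(times))        # one value per time stamp
stopifnot(all(diff(as.numeric(times)) == 3600))   # evenly spaced, one hour apart
stopifnot(!anyDuplicated(times))                  # no repeated time stamps

# Look up the value recorded at a given hour
idx <- which(times == as.POSIXct("2015-12-18 02:00:00", tz = "UTC"))
mydata[idx]
```

If the length check fails on real data, the series probably has missing or duplicated hours, which would need handling before building the zoo object.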

HTH,

Eric

On Tue, Nov 7, 2017 at 9:59 AM, Erin Hodgess 
wrote:

Hello!

What is the error message, please?

At first glance, you are using the "ts" function.  That doesn't work for
hourly frequency.

You may want to create a zoo object.

This is Round One.

Sincerely,
Erin

On Tue, Nov 7, 2017 at 1:46 AM, Emre Karagülle 
wrote:

>
> Hi,
> I would like to ask a question about time series.
> I am trying to convert my data into time series data.
> I have hourly data from “2015-12-18 00:00” to “2017-10-24 23:00”
> I am trying the following codes but they are not working.
> Could you help me out?
>
> tseri <- ts(data ,seq(from=as.POSIXct("2015-12-18 00:00:00"),
> to=as.POSIXct("2017-10-24 23:00:00"), by="hour"))
>
> tseri <- ts(data ,seq(from=as.Date("2015-12-18 00:00:00"),
> to=as.Date("2017-10-24 23:00:00"), by="hour"))
>
>
> Thank you
>
> --
> Emre
>
>
>

--
Erin Hodgess
Associate Professor
Department of Mathematical and Statistics
University of Houston - Downtown
mailto: erinm.hodg...@gmail.com


Re: [R] Fitdistrplus and Custom Probability Density

2017-11-07 Thread Eric Berger

Why not define your own functions based on d?
e.g.
myCumDist <- function(x) { integrate(d, lower=-Inf, upper=x)$value  }
myQuantile <- function(p) { uniroot(f=function(y) { myCumDist(y) - p },
interval=c(-5,5))$root }
# the limits -5,5 should be replaced by your own, which might require some fiddling

e.g.
# just an example for you to test with; use your own density d(x) in your case
d <- function(x) { exp(-x^2/2)/(sqrt(2*pi)) }

Then define myCumDist, myQuantile as above and compare with pnorm, qnorm.
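
Put together as a runnable sketch, with the standard normal density standing in for the custom one. The search interval c(-5, 5) is an assumption that suits this example only and has to bracket the quantile being sought:

```r
# Example density only (standard normal); replace with your own d(x).
d <- function(x) exp(-x^2 / 2) / sqrt(2 * pi)

# Cumulative distribution: integrate the density numerically.
myCumDist <- function(x) integrate(d, lower = -Inf, upper = x)$value

# Quantile: solve myCumDist(y) = p for y by root finding.
myQuantile <- function(p) {
  uniroot(function(y) myCumDist(y) - p, interval = c(-5, 5))$root
}

c(myCumDist(1), pnorm(1))            # the two should agree closely
c(myQuantile(0.975), qnorm(0.975))   # the two should agree to a few decimals
```

Both helpers handle one value at a time. A fitting routine will most likely pass whole vectors, in which case wrapping them with Vectorize() or sapply() is one option.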

HTH,
Eric




On Tue, Nov 7, 2017 at 4:22 PM, Lorenzo Isella 
wrote:

> Dear All,
> Apologies for not providing a reproducible example, but if I could, then I
> would be able to answer myself my question.
> Essentially, I am trying to fit a very complicated custom probability
> distribution to some data.
> Fitdistrplus does in principle everything that I need, but it requires me
> to specify not only the density function d, but also the cumulative p and
> the inverse cumulative function q (see for instance
>
> http://www.stat.umn.edu/geyer/old/5101/rlook.html
>
> to understand what these quantities are in the case of a normal
> distribution).
>
> The analytical calculation of p and q is a big task in my case, so my
> question is if there is a workaround for this, i.e. a way to fit the
> unknown parameters of my probability distribution without specifying (at
> least analytically) p and q, but only the density d.
> Many thanks
>
> Lorenzo
>
>


Re: [R] Ggplot error

2017-11-08 Thread Eric Berger

I was not able to reproduce this problem. I tried two environments
1. Ubuntu 14.04.5 LTS, R version 3.4.2 (same R version as yours)
2. Windows 10, same R version



On Wed, Nov 8, 2017 at 9:50 AM, Zeki ÇATAV  wrote:

> Hello,
> I've an error recently.
>
> ggplot(data = mtcars, aes(x= wt, y= mpg)) + geom_line()
> Error: Found object is not a stat.
>
> > sessionInfo()
> R version 3.4.2 (2017-09-28)
> Platform: x86_64-pc-linux-gnu (64-bit)
> Running under: Ubuntu 16.04.3 LTS
>
> Matrix products: default
> BLAS: /usr/lib/openblas-base/libblas.so.3
> LAPACK: /usr/lib/libopenblasp-r0.2.18.so
>
> locale:
>  [1] LC_CTYPE=tr_TR.UTF-8   LC_NUMERIC=C
>  LC_TIME=tr_TR.UTF-8
>  [4] LC_COLLATE=tr_TR.UTF-8 LC_MONETARY=tr_TR.UTF-8
> LC_MESSAGES=tr_TR.UTF-8
>  [7] LC_PAPER=tr_TR.UTF-8   LC_NAME=C  LC_ADDRESS=C
>
> [10] LC_TELEPHONE=C LC_MEASUREMENT=tr_TR.UTF-8
> LC_IDENTIFICATION=C
>
> attached base packages:
> [1] stats graphics  grDevices utils datasets  methods   base
>
> other attached packages:
> [1] dplyr_0.7.4 purrr_0.2.4 readr_1.1.1 tidyr_0.7.2
>  tibble_1.3.4tidyverse_1.1.1
> [7] ggplot2_2.2.1
>
> loaded via a namespace (and not attached):
>  [1] Rcpp_0.12.13   lubridate_1.7.1lattice_0.20-35class_7.3-14
>  assertthat_0.2.0
>  [6] ipred_0.9-6psych_1.7.8foreach_1.4.3  R6_2.2.2
>  cellranger_1.1.0
> [11] plyr_1.8.4 stats4_3.4.2   httr_1.3.1 rlang_0.1.4
>   lazyeval_0.2.1
> [16] caret_6.0-77   readxl_1.0.0   kernlab_0.9-25 rpart_4.1-11
>  Matrix_1.2-11
> [21] splines_3.4.2  CVST_0.2-1 ddalpha_1.3.1  gower_0.1.2
>   stringr_1.2.0
> [26] foreign_0.8-69 munsell_0.4.3  broom_0.4.2
> compiler_3.4.2 modelr_0.1.1
> [31] pkgconfig_2.0.1mnormt_1.5-5   dimRed_0.1.0   nnet_7.3-12
>   prodlim_1.6.1
> [36] DRR_0.0.2  codetools_0.2-15   RcppRoll_0.2.2 withr_2.1.0
>   MASS_7.3-47
> [41] recipes_0.1.0  ModelMetrics_1.1.0 grid_3.4.2 nlme_3.1-131
>  jsonlite_1.5
> [46] gtable_0.2.0   magrittr_1.5   scales_0.5.0
>  stringi_1.1.5  reshape2_1.4.2
> [51] bindrcpp_0.2   timeDate_3012.100  robustbase_0.92-8  xml2_1.1.1
>  lava_1.5.1
> [56] iterators_1.0.8tools_3.4.2forcats_0.2.0  glue_1.2.0
>  DEoptimR_1.0-8
> [61] sfsmisc_1.1-1  hms_0.3parallel_3.4.2
>  survival_2.41-3yaml_2.1.14
> [66] colorspace_1.3-2   rvest_0.3.2bindr_0.1  haven_1.1.0
>
>
> > conflicts(detail = TRUE)
> $.GlobalEnv
> [1] "iris"
>
> $`package:dplyr`
>  [1] "%>%"   "%>%"   "add_row"   "as_data_frame"
> "as_tibble" "data_frame"
>  [7] "data_frame_"   "frame_data""glimpse"   "lst"
>  "lst_"  "tbl_sum"
> [13] "tibble""tribble"   "trunc_mat" "type_sum"
> "filter""lag"
> [19] "intersect" "setdiff"   "setequal"  "union"
>
> $`package:purrr`
> [1] "%>%" "%>%"
>
> $`package:tidyr`
> [1] "%>%" "%>%"
>
> $`package:tibble`
>  [1] "add_row"   "as_data_frame" "as_tibble" "data_frame"
> "data_frame_"   "frame_data"
>  [7] "glimpse"   "lst"   "lst_"  "tbl_sum"
>  "tibble""tribble"
> [13] "trunc_mat" "type_sum"
>
> $`package:ggplot2`
> [1] "Position"
>
> $`package:stats`
> [1] "filter" "lag"
>
> $`package:datasets`
> [1] "iris"
>
> $`package:methods`
> [1] "body<-""kronecker"
>
> $`package:base`
> [1] "body<-""intersect" "kronecker" "Position"  "setdiff"   "setequal"
> "union"
>
>
> How can I solve this problem?
> Thanks.
>
> --
> Zeki Çatav
> zekicatav.com
>


Re: [R] Adding Records to a Table in R

2017-11-08 Thread Eric Berger

Hi Paul,
The following worked for me:

library(lubridate)
dataset1 <- read.csv("dataset1.csv",stringsAsFactors=FALSE)
dataset1$TransitDate <- mdy(dataset1$TransitDate)
TransitDateFrame <- data.frame(TransitDate=seq(as.Date("1985-10-01"),
as.Date("2017-10-01"), by = "month"))
dataset1NEW <- merge(TransitDateFrame, dataset1, by="TransitDate",
all.x=TRUE)
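
The same steps as a self-contained sketch on made-up data, using base R's as.Date in place of lubridate::mdy. The sample values and the month/day/year text format are assumptions for illustration. One common cause of a merge that returns only NAs is a key column that is text (or wrongly parsed) on one side and class Date on the other, so the sketch converts the key first:

```r
# Made-up sample data; dates arrive as text in month/day/year form.
dataset1 <- data.frame(
  TransitDate = c("04/01/1985", "07/01/1985", "12/01/1985", "04/01/1986"),
  Transits    = c(100, 100, 500, 325),
  CargoTons   = c(1000, 1080, 3785, 4200),
  stringsAsFactors = FALSE
)

# Convert the key column to class Date so it matches the sequence below
dataset1$TransitDate <- as.Date(dataset1$TransitDate, format = "%m/%d/%Y")

# Complete monthly sequence, also of class Date
TransitDateFrame <- data.frame(
  TransitDate = seq(as.Date("1985-04-01"), as.Date("1986-04-01"), by = "month")
)

# Left join: keep every month, NA where dataset1 has no record
dataset1NEW <- merge(TransitDateFrame, dataset1, by = "TransitDate", all.x = TRUE)

# Replace the NAs by zero
dataset1NEW[is.na(dataset1NEW)] <- 0

dataset1NEW
```

Checking class(dataset1$TransitDate) and class(TransitDateFrame$TransitDate) before the merge is a quick way to confirm that both keys are of the same type.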

HTH,
Eric



On Wed, Nov 8, 2017 at 4:32 PM, PIKAL Petr  wrote:

> Hi
>
> Instead of attachments copy directly result of dput(TransitDateFrame) and
> dput(dataset1) to your email. Or, if your data have more than about 20 rows
> you could copy only part of it.
>
> dput(head(TransitDateFrame, 20))
> dput(head(dataset1, 20))
>
> Only with this approach we can evaluate your data in all aspects and
> provide correct answer.
>
> Cheers
> Petr
>
> > -Original Message-
> > From: R-help [mailto:r-help-boun...@r-project.org] On Behalf Of Paul
> Bernal
> > Sent: Wednesday, November 8, 2017 2:46 PM
> > To: Eric Berger 
> > Cc: r-help@r-project.org
> > Subject: Re: [R] Adding Records to a Table in R
> >
> > Dear Eric,
> >
> > Hope you are doing great. I also tried the following:
> >
> > #First I created the complete date sequence
> >
> > TransitDateFrame <- data.frame(TransitDate=seq(as.Date(dataset1[1,1]),
> > as.Date(dataset1[nrow(dataset1),1]), by = "month"))
> >
> > #Then I did the merging
> >
> >  dataset1NEW <- merge(TransitDateFrame, dataset1, by="TransitDate",
> > all.x=TRUE)
> >
> > Now it has, as expected the total number of rows. The problem is, it
> filled
> > absolutely everything with NAs, and this shouldn´t be the case since
> there are
> > dates that actually have data.
> >
> > why is this happening?
> >
> > I am attaching the dataset1 table as a .csv document for your reference.
> > Basically what I want is to bring all the values in dataset1 and only
> add the
> > dates missing with value 0.
> >
> > Best regards,
> >
> > Paul
> >
> > 2017-11-01 15:21 GMT-05:00 Eric Berger :
> >
> > > Hi Paul,
> > >
> > > #First I set up some sample data since I don't have a copy of your data
> > > dtOrig <- as.Date(
> > >   c("1985-04-01","1985-07-01","1985-12-01","1986-04-01"))
> > > dfOrig <- data.frame( TransitDate=dtOrig, Transits=c(100,100,500,325),
> > >   CargoTons=c(1000,1080,3785,4200) )
> > >
> > > #Generate the complete set of dates as a data frame
> > > dfDates<- data.frame( TransitDate=seq(from=as.Date("1985-04-01"),
> > >   by="1 month",length=13) )
> > >
> > > # do the merge adding the "missing" rows (where NA will appear)
> > > dfNew <- merge(dfDates, dfOrig, by="TransitDate", all.x=TRUE )
> > >
> > > # replace the NA's by zero
> > > dfNew[is.na(dfNew)] <- 0
> > >
> > > HTH,
> > > Eric
> > >
> > >
> > > On Wed, Nov 1, 2017 at 9:45 PM, Paul Bernal 
> > > wrote:
> > >
> > >> Dear R friends,
> > >>
> > >> I am currently working with time series data, and I have a table (as a
> > >> data frame) that looks like this (TransitDate is in format "%e-%B-%Y"):
> > >>
> > >> TransitDate   Transits  CargoTons
> > >> 1985-04-01    100       2500
> > >> 1985-05-01    135       4500
> > >> 1985-06-01    120       1750
> > >> 1985-07-01    100       3750
> > >> 1985-08-01    200       1250
> > >> The problem is, that there are several periods that don´t exist in
> > >> the table, so it has the following behavior:
> > >>
> > >> TransitDate   Transits  CargoTons
> > >> 1985-04-01    100       1000
> > >> 1985-07-01    100       1080
> > >> 1985-12-01    500       3785
> > >> 1986-04-01    325       4200
> > >> .
> > >> .
> > >> 2017-09-01    400       2350 (*this is the last observation)
> > >>
> > >> You can see in the last table fragment that the series jumps from
> > >> 1985-04-01 to 1985-07-01, then it jumps from there to 1985-12-01
> > >> making the time series quite irregular (non-constant chronologically
>

Re: [R] Calculating frequencies of multiple values in 200 colomns

2017-11-10 Thread Eric Berger

How about this workaround - add 1 to the vector
x <- c(1,0,2,1,0,2,2,0,2,1)
tabulate(x)
# [1] 3 4
tabulate(x+1)
#[1] 3 3 4
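
A slightly fuller sketch of the same workaround, applied column by column. Fixing nbins = 3 keeps every result the same length even when a column contains no 2s. The matrix Values and its second column are made up for illustration:

```r
x <- c(1, 0, 2, 1, 0, 2, 2, 0, 2, 1)

# Shift by one so that 0, 1, 2 land in bins 1, 2, 3; nbins fixes the length
counts_shift <- tabulate(x + 1, nbins = 3)

# Alternative: declare the allowed levels so table() reports all three values
counts_table <- table(factor(x, levels = 0:2))

# Column-wise on a small made-up matrix (standing in for the 200 columns)
Values <- cbind(a = x, b = c(0, 0, 0, 1, 1, 2, 2, 2, 2, 2))
counts_by_col <- apply(Values, 2, function(col) tabulate(col + 1, nbins = 3))
rownames(counts_by_col) <- 0:2
counts_by_col
```

Each column of counts_by_col then holds the frequencies of 0, 1 and 2, which can be passed on to whatever function computes the maf.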


On Fri, Nov 10, 2017 at 4:34 PM, Marc Schwartz  wrote:

> Hi,
>
> To clarify the default behavior that Boris is referencing below, note the
> definition of the 'bin' argument to the tabulate() function:
>
> bin: a numeric vector ***(of positive integers)***, or a factor. Long
> vectors are supported.
>
> I added the asterisks for emphasis.
>
> This is also noted in the examples used for the function in ?tabulate at
> the bottom of the help page.
>
> The second argument, 'nbins', which defaults to max(1, bin, na.rm = TRUE),
> also affects the output:
>
> > tabulate(c(2, 3, 5))
> [1] 0 1 1 0 1
>
> In this case, with each element in the returned vector indicating how many
> 1's, 2's, 3's, 4's and 5's are present in the source vector.
>
> Compare that to:
>
> > tabulate(c(2, 3, 5), nbins = 3)
> [1] 0 1 1
>
> In the above example, 5 is ignored.
>
> Note also that tabulate(), unlike table(), does not return a named vector,
> just the frequencies.
>
> While tabulate() is used within the table() function, reviewing the code
> for the latter reveals how the default behavior of tabulate() is modified
> and preceded/wrapped in other code for use there.
>
> Regards,
>
> Marc Schwartz
>
>
> > On Nov 10, 2017, at 8:43 AM, Boris Steipe 
> wrote:
> >
> > |> x <- sample(0:2, 10, replace = TRUE)
> > |> x
> > [1] 1 0 2 1 0 2 2 0 2 1
> > |> tabulate(x)
> > [1] 3 4
> > |> table(x)
> > x
> > 0 1 2
> > 3 3 4
> >
> >
> >
> > B.
> >
> >
> >
> >> On Nov 10, 2017, at 4:32 AM, Allaisone 1 
> wrote:
> >>
> >>
> >>
> >> Thank you for your effort Bert..,
> >>
> >>
> >> I knew what is the problem now, the values (1,2,3) were only an
> example. The values I have are 0 , 1, 2 . Tabulate () function seem to
> ignore calculating the frequency of 0 values and this is my exact problem
> as the frequency of 0 values should also be calculated for the maf to be
> calculated correctly.
> >>
> >> 
> >> From: Bert Gunter 
> >> Sent: 09 November 2017 23:51:35
> >> To: Allaisone 1; R-help
> >> Subject: Re: [R] Calculating frequencies of multiple values in 200
> colomns
> >>
> >> [[elided Hotmail spam]]
> >>
> >> "For example, if I have the values : 1 , 2 , 3 in each column, applying
> Tabulate () would calculate the frequency of 1 and 2 without 3"
> >>
> >> Huh??
> >>
> >>> x <- sample(1:3,10,TRUE)
> >>> x
> >> [1] 1 3 1 1 1 3 2 3 2 1
> >>> tabulate(x)
> >> [1] 5 2 3
> >>
> >> Cheers,
> >> Bert
> >>
> >>
> >>
> >> Bert Gunter
> >>
> >> "The trouble with having an open mind is that people keep coming along
> and sticking things into it."
> >> -- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )
> >>
> >> On Thu, Nov 9, 2017 at 3:44 PM, Allaisone 1  mailto:allaiso...@hotmail.com>> wrote:
> >>
> >> Thank you so much for your reply
> >>
> >>
> >> Actually, I tried apply() function but struggled with the part of
> writing the appropriate function inside it which calculate the frequency of
> the 3 values. Tabulate () function is a good start but the problem is that
> this calculates the frequency of two values only per column which means
> that when I apply maf () function , maf value will be calculated using the
> frequency of these 2 values only without considering the frequency of the
> 3rd value. For example, if I have the values : 1 , 2 , 3 in each column,
> applying Tabulate () would calculate the frequency of 1 and 2 without 3 . I
> need a way to calculate the frequencies of all of the 3 values so the
> calculation of maf will be correct as it will consider all the 3
> frequencies but not only 2 .
> >>
> >>
> >> Regards
> >>
> >> Allahisone
> >>
> >> 
> >> From: Bert Gunter mailto:bgunter.4...@gmail.com
> >>
> >> Sent: 09 November 2017 20:56:39
> >> To: Allaisone 1
> >> Cc: r-help@R-project.org
> >> Subject: Re: [R] Calculating frequencies of multiple values in 200
> colomns
> >>
> >> This is not a good way to do things! R has many powerful built in
> functions to do this sort of thing for you. Searching  -- e.g. at
> rseek.org or even a plain old google search -- can help
> you find them. Also, it looks like you need to go through a tutorial or two
> to learn more about R's basic functionality.
> >>
> >> In this case, something like (no reproducible example given, so can't
> confirm):
> >>
> >> apply(Values, 2, function(x)maf(tabulate(x)))
> >>
> >> should be close to what you want .
> >>
> >>
> >> Cheers,
> >> Bert
> >>
> >>
> >>
> >>
> >>
> >>
> >>
> >> Bert Gunter
> >>
> >> "The trouble with having an open mind is that people keep coming along
> and sticking things into it."
> >> -- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )
> >>
> >> On Thu, Nov 9, 2017 at 11:44 AM, Allaisone 1  mailto:allaiso...@hotmail.com>> wrote:
> >>
> >> Hi All
> >>
> >>
> >> I have a dataset of 200 columns and 1000 rows , there are 3

Re: [R] effects package x axis labels

2017-11-11 Thread Eric Berger

Hi Andras,
I have not used this package before, but I followed the steps below to arrive
at an answer. Hopefully the answer is what you are looking for, and the steps
also show how you can answer such questions yourself in the future.
1. R is an object-oriented language, but there are several ways in which
classes are supported. In particular, methods for some classes don't reside
with the class but with extensions to "generic" functions. The 'plot'
function is such an example. So the first step is to understand the class
returned by the function allEffects.

> myObj <- allEffects(mylogit)
> class(myObj)
# efflist

2. Next look at the documentation for the extensions to 'plot' for an
'efflist' class

> ?plot.efflist

3. Search in the help documentation for 'axes' to understand what is going
on (they also supply a lot of examples at the end of the help page). A few
experiments and the following seems to do what you asked for:
> plot(allEffects(mylogit),
+      axes=list(x=list(gre=list(lab="black"), gpa=list(lab="white"),
+                       rank=list(lab="green")),
+                y=list(lab="Prob(xyz)")))
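
The dispatch mechanism mentioned in step 1 can be seen on a toy example. The generic describe and the class "myclass" below are invented for illustration and have nothing to do with the effects package:

```r
# A made-up generic: UseMethod picks a method based on class(x)
describe <- function(x, ...) UseMethod("describe")

# Fallback used when no class-specific method exists
describe.default <- function(x, ...) "no special method for this object"

# Method for objects of the (hypothetical) class "myclass"
describe.myclass <- function(x, ...) paste("myclass object holding", x$value)

obj <- structure(list(value = 42), class = "myclass")

class(obj)       # the class decides which method is called
describe(obj)    # dispatches to describe.myclass
describe(1:3)    # falls back to describe.default
```

In the same way, plot() applied to an object of class 'efflist' ends up in plot.efflist, which is why ?plot.efflist is the help page to read.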

HTH,
Eric

On Sat, Nov 11, 2017 at 2:20 AM, Andras Farkas via R-help <
r-help@r-project.org> wrote:

> Dear All,
>
> probably a simple enough solution, but I don't seem to be able to get my head
> around it... example based on a publicly available data set:
>
> mydata <- read.csv("https://stats.idre.ucla.edu/stat/data/binary.csv")
> mylogit <- glm(admit ~ gre + gpa + rank, data = mydata, family =
> "binomial")
> library(effects)
> plot(allEffects(mylogit)
>  ,axes=list(y=list(lab="Prob(xyz)"))
> )
>
> axes=list(y=list(lab="Prob(xyz)")) changes the y axis labels for all 3
> plots... Any thoughts on how I could change the x axis labels to let say
> 'black' (plot 1), 'white' (plot 2) and 'green' (plot 3) for the 3
> respective plots produced?
>
>
> appreciate the help...
>
> Andras
>


Re: [R] R6 object that is a list of referenced object

2017-11-16 Thread Eric Berger

Hi Cristina,
You can try this:

> Community <- R6Class("Community",
    public = list(
      e = NULL,
      initialize = function() { self$e <- list() },
      add = function( person ) { self$e[[ length(self$e) + 1 ]] <- person }
    )
  )

> crowd <- Community$new()
> crowd$add(Person1)
> crowd$add(Person2)
> crowd$e

HTH,
Eric


On Thu, Nov 16, 2017 at 9:55 AM, Jeff Newmiller 
wrote:

> See below.
>
> On Wed, 15 Nov 2017, Cristina Pascual wrote:
>
> Dear community,
>>
>> I am having a class, let's say Person,
>>
>> Person <-  R6Class("Person",
>> public = list(
>>   idPerson = NULL,
>>   name = NULL,
>>   age = NULL,
>>   initialize = function(idPerson = NA, name = NA, age
>> = NA) {
>>
>
> It is a bad idea to setup default values for all your parameters in any
> function, but particularly so for an initialization function. A Person with
> NA in the idPerson field is essentially unusable, so encouraging the
> creation of such an object is very bad practice.
>
>self$idPerson <- idPerson
>>self$name <- name
>>self$age <- age
>>   }
>> ) # public
>>
>> ) # Person
>>
>> I have created:
>> Person1 <- Person$new(1,'a',4)
>> Person2 <- Person$new(2,'b',5)
>>
>> and I also have a class Community:
>>
>> Community <- R6Class("Community",
>> public = list(
>>   e = NULL,
>>   initialize = function() self$e <- Person$new()
>>
>
> Initializing a Community with a bogus person is as bad as the idPerson
> being NA. It makes a lot more sense to have the set of persons in a
> community be the null set than to have a minimum of one person in the
> community who happens to have invalid identification.
>
> )
>> )
>>
>> I want to create
>>
>> Community1 = List
>>
>> and add Person1 and Person2 to Community1 (Community1 <-
>> Community1$add(Person1)
>>
>>  Community1 <- Community1$add(Person2)
>>
>> )
>>
>> How can I write this with R6? I cannot find the proper example in the
>> website.
>>
>> Can anybody help me?
>>
>> Thanks in advance,
>>
>
> You don't seem to be very familiar with either R or conventional
> object-oriented design. Although I am giving you a reprex below, I
> recommend that you avoid R6 until you are more familiar with how normal
> functional programming and S3 object oriented coding styles work in R.
> Using R6 as a crutch to avoid that learning process will only lead you to
> frustration and inefficient data handling. That is, this whole thing should
> just be a data frame.
>
> 
> library(R6)
> Person <- R6Class( "Person"
>                  , public = list( idPerson = NA
>                                 , name = NA
>                                 , age = NA
>                                 , initialize = function( idPerson
>                                                        , name
>                                                        , age
>                                                        ) {
>                                     self$idPerson <- idPerson
>                                     self$name <- name
>                                     self$age <- age
>                                   }
>                                 ) # public
>                  ) # Person
>
> Person1 <- Person$new( 1, 'a', 4 )
> Person2 <- Person$new( 2, 'b', 5 )
>
> Community <- R6Class( "Community"
>                     , public = list( e = NULL
>                                    , addPerson = function( p ) {
>                                        self$e <- append( self$e, p )
>                                      }
>                                    )
>                     )
>
> Community1 <- Community$new()
> Community1$addPerson( Person1 )
> Community1$addPerson( Person2 )
> Community1$e
> #> [[1]]
> #> <Person>
> #>   Public:
> #> age: 4
> #> clone: function (deep = FALSE)
> #> idPerson: 1
> #> initialize: function (idPerson, name, age)
> #> name: a
> #>
> #> [[2]]
> #> <Person>
> #>   Public:
> #> age: 5
> #> clone: function (deep = FALSE)
> #> idPerson: 2
> #> initialize: function (idPerson, name, age)
> #> name: b
>
> # Standard R approach:
> Person1a <

Re: [R] Risks of using "function <- package::function" ?

2017-11-17 Thread Eric Berger

As Jeff recommends, I use the pkg::fun for clarity.
However I probably use it more than needed (e.g. I use the dplyr:: prefix
on all dplyr function calls instead of just the functions with name
collisions).
Are there any tools that can be used (like a form of lint) to identify uses
of functions without the pkg:: prefix and which are part of a name
collision?
One could then edit the code to include the pkg:: prefix to disambiguate
those cases and verify via a repeated use of such a tool that there are no
outstanding cases.

Or alternative approaches to the issue?
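
For reference, the silent overwrite Duncan describes in the quoted message can be reproduced in a few lines, and base R's conflicts() gives a first, coarse view of colliding names. It is not a lint tool, only a starting point. In this sketch base::Filter merely stands in for a second package's function so that no extra packages are needed:

```r
# Binding a short name to one package's function...
filter <- stats::filter

# ...and later, perhaps far away in the same script, rebinding it.
# base::Filter is only a stand-in for another package's function here.
filter <- base::Filter

# R raises no warning; the second assignment silently wins
identical(filter, base::Filter)

# conflicts() lists names defined in more than one place on the search path
conflicts(detail = TRUE)
```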

Thanks,
Eric

On Fri, Nov 17, 2017 at 9:30 AM, Jeff Newmiller 
wrote:

> Obvious?  How about "obscurity"? Just directly use pkg::fun if you have
> name collision.
> --
> Sent from my phone. Please excuse my brevity.
>
> On November 16, 2017 4:46:15 PM PST, Duncan Murdoch <
> murdoch.dun...@gmail.com> wrote:
> >On 16/11/2017 4:53 PM, Boris Steipe wrote:
> >> Large packages sometimes mask each other's functions and that creates
> >a headache, especially for teaching code, since function signatures may
> >depend on which order packages were loaded in. One of my students
> >proposed using the idiom
> >>
> >> function <- package::function
> >>
> >> ... in a preamble, when we use just a small subset of functions from
> >a larger package. I like that idea and can't see obvious
> >disadvantages(1).
> >>
> >> Are there subtle risks to that approach?
> >
> >You might do it twice.  R isn't going to complain if you have
> >
> >filter <- stats::filter
> >
> ># some other code here...
> >
> >filter <- dplyr::filter
> >
> >in your code, but the second one will overwrite the first one.
> >
> >The normal way to handle this is in the NAMESPACE file, where you
> >should
> >have
> >
> >importFrom(stats, filter)
> >
> >If you then have
> >
> >importFrom(dplyr, filter)
> >
> >you should get a warning:
> >
> >Warning: replacing previous import ‘stats::filter’ by ‘dplyr::filter’
> >when loading ‘testpkg’.
> >
> >Duncan Murdoch
> >
>
>


Re: [R] Do I need to transform backtest returns before using pbo (probability of backtest overfitting) package functions?

2017-11-21 Thread Eric Berger

Hi Joe,
The centering and re-scaling is done for the purposes of his example, and
also to be consistent with his definition of the sharpe function.
In particular, note that the sharpe function has the rf (riskfree)
parameter with a default value of .03/252 i.e. an ANNUAL 3% rate converted
to a DAILY rate, expressed in decimal.
That means that the other argument to this function, x, should be DAILY
returns, expressed in decimal.

Suppose he wanted to create random data from a distribution of returns with
ANNUAL mean MU_A and ANNUAL std deviation SIGMA_A, both stated in decimal.
The equivalent DAILY

Then he does two steps: (1) generate a matrix of random values from the
N(0,1) distribution. (2) convert them to DAILY
After initializing the matrix with random values (from N(0,1)), he now
wants to create a series of DAILY
sr_base <- 0
mu_base <- sr_base/(252.0)
sigma_base <- 1.00/(252.0)**0.5
for ( i in 1:n ) {
  m[,i] = m[,i] * sigma_base / sd(m[,i]) # re-scale
  m[,i] = m[,i] + mu_base - mean(m[,i]) # re-center
}

On Tue, Nov 21, 2017 at 2:10 PM, Bert Gunter  wrote:

> Wrong list.
>
> Post on r-sig-finance instead.
>
> Cheers,
> Bert
>
>
>
> On Nov 20, 2017 11:25 PM, "Joe O"  wrote:
>
> Hello,
>
> I'm trying to understand how to use the pbo package by looking at a
> vignette. I'm curious about a part of the vignette that creates simulated
> returns data. The package author transforms his simulated returns in a way
> that I'm unfamiliar with, and that I haven't been able to find an
> explanation for after searching around. I'm curious if I need to replicate
> the transformation with real returns. For context, here is the vignette
> (cleaned up a bit to make it reproducible):
>
> (Full vignette:
> https://cran.r-project.org/web/packages/pbo/vignettes/pbo.html)
>
> library(pbo)
> #First, we assemble the trials into an NxT matrix where each column
> #represents a trial and each trial has the same length T. This example
> #is random data so the backtest should be overfit.
>
> set.seed(765)
> n <- 100
> t <- 2400
> m <- data.frame(matrix(rnorm(n*t),nrow=t,ncol=n,
>dimnames=list(1:t,1:n)), check.names=FALSE)
>
> sr_base <- 0
> mu_base <- sr_base/(252.0)
> sigma_base <- 1.00/(252.0)**0.5
> for ( i in 1:n ) {
>   m[,i] = m[,i] * sigma_base / sd(m[,i]) # re-scale
>   m[,i] = m[,i] + mu_base - mean(m[,i]) # re-center
> }
>
> #We can use any performance evaluation function that can work with the
> #reassembled sub-matrices during the cross validation iterations.
> #Following the original paper we can use the Sharpe ratio as
>
> sharpe <- function(x,rf=0.03/252) {
>   sr <- apply(x,2,function(col) {
> er = col - rf
> return(mean(er)/sd(er))
>   })
>   return(sr)
> }
>
> #Now that we have the trials matrix we can pass it to the pbo function
>  #for analysis.
>
> my_pbo <- pbo(m,s=8,f=sharpe,threshold=0)
>
> summary(my_pbo)
>
> Here's the portion i'm curious about:
>
> sr_base <- 0
> mu_base <- sr_base/(252.0)
> sigma_base <- 1.00/(252.0)**0.5
> for ( i in 1:n ) {
>   m[,i] = m[,i] * sigma_base / sd(m[,i]) # re-scale
>   m[,i] = m[,i] + mu_base - mean(m[,i]) # re-center
> }
>
> Why is the data transformed within the for loop, and does this kind of
> re-scaling and re-centering need to be done with real returns? Or is this
> just something the author is doing to make his simulated returns look more
> like the real thing?
>
> Googling around turned up some articles regarding scaling volatility to the
> square root of time, but the scaling in the code here doesn't look quite
> like what I've seen. Re-scalings I've seen involve multiplying some short
> term (i.e. daily) measure of volatility by the root of time, but this isn't
> quite that. Also, the documentation for the package doesn't include this
> chunk of re-scaling and re-centering code. Documentation: https://cran.r-
> project.org/web/packages/pbo/pbo.pdf
>
> So:
>
>-
>
>Why is the data transformed in this way/what is result of this
>transformation?
>-
>
>Is it only necessary for this simulated data, or do I need to
>similarly transform real returns?
>
> I read in the posting guide that stats questions are acceptable given
> certain conditions, I hope this counts. Thanks for reading,
>
> -Joe
>
>

Re: [R] Do I need to transform backtest returns before using pbo (probability of backtest overfitting) package functions?

2017-11-21 Thread Eric Berger

[re-sending - previous email went out by accident before complete]
Hi Joe,
The centering and re-scaling is done for the purposes of his example, and
also to be consistent with his definition of the sharpe function.
In particular, note that the sharpe function has the rf (riskfree)
parameter with a default value of .03/252 i.e. an ANNUAL 3% rate converted
to a DAILY rate, expressed in decimal.
That means that the other argument to this function, x, should be DAILY
returns, expressed in decimal.

Suppose he wanted to create random data from a distribution of returns with
ANNUAL mean MU_A and ANNUAL std deviation SIGMA_A, both stated in decimal.
The equivalent DAILY returns would have mean MU_D = MU_A / 252 and standard
deviation SIGMA_D =  SIGMA_A/SQRT(252).

He calls MU_D by the name mu_base  and  SIGMA_D by the name sigma_base.

His loop now converts the random numbers in his matrix so that each column
has mean MU_D and std deviation SIGMA_D.
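
As a minimal sketch of what the loop achieves, one can check that every column ends up with the daily mean and standard deviation. The names mu_annual, sigma_annual, n_obs and n_trials are mine, chosen for illustration, and the matrix is smaller than the one in the vignette:

```r
set.seed(765)
n_trials <- 5      # illustrative; the vignette uses 100 trials
n_obs    <- 2400   # number of daily observations per trial

# Step 1: raw N(0,1) draws, one column per trial.
m <- matrix(rnorm(n_obs * n_trials), nrow = n_obs, ncol = n_trials)

# Step 2: annual targets converted to daily values (252 trading days assumed).
mu_annual    <- 0
sigma_annual <- 1
mu_daily     <- mu_annual / 252
sigma_daily  <- sigma_annual / sqrt(252)

for (i in seq_len(n_trials)) {
  m[, i] <- m[, i] * sigma_daily / sd(m[, i])   # re-scale to the daily sd
  m[, i] <- m[, i] + mu_daily - mean(m[, i])    # re-center to the daily mean
}

col_means <- colMeans(m)       # each equals mu_daily up to rounding error
col_sds   <- apply(m, 2, sd)   # each equals sigma_daily up to rounding error
```

Real backtest returns already carry their own mean and standard deviation, so this step only serves to give the simulated columns known daily properties.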

HTH,
Eric



> On Tue, Nov 21, 2017 at 2:10 PM, Bert Gunter 
> wrote:
>
>> Wrong list.
>>
>> Post on r-sig-finance instead.
>>
>> Cheers,
>> Bert
>>
>>
>>
>> On Nov 20, 2017 11:25 PM, "Joe O"  wrote:
>>
>> Hello,
>>
>> I'm trying to understand how to use the pbo package by looking at a
>> vignette. I'm curious about a part of the vignette that creates simulated
>> returns data. The package author transforms his simulated returns in a way
>> that I'm unfamiliar with, and that I haven't been able to find an
>> explanation for after searching around. I'm curious if I need to replicate
>> the transformation with real returns. For context, here is the vignette
>> (cleaned up a bit to make it reproducible):
>>
>> (Full vignette:
>> https://cran.r-project.org/web/packages/pbo/vignettes/pbo.html)
>>
>> library(pbo)
>> #First, we assemble the trials into an NxT matrix where each column
>> #represents a trial and each trial has the same length T. This example
>> #is random data so the backtest should be overfit.
>>
>> set.seed(765)
>> n <- 100
>> t <- 2400
>> m <- data.frame(matrix(rnorm(n*t),nrow=t,ncol=n,
>>dimnames=list(1:t,1:n)), check.names=FALSE)
>>
>> sr_base <- 0
>> mu_base <- sr_base/(252.0)
>> sigma_base <- 1.00/(252.0)**0.5
>> for ( i in 1:n ) {
>>   m[,i] = m[,i] * sigma_base / sd(m[,i]) # re-scale
>>   m[,i] = m[,i] + mu_base - mean(m[,i]) # re-center
>> }
>> #We can use any performance evaluation function that can work with the
>> #reassembled sub-matrices during the cross validation iterations.
>> #Following the original paper we can use the Sharpe ratio as
>>
>> sharpe <- function(x,rf=0.03/252) {
>>   sr <- apply(x,2,function(col) {
>>     er = col - rf
>>     return(mean(er)/sd(er))
>>   })
>>   return(sr)
>> }
>> #Now that we have the trials matrix we can pass it to the pbo function
>>  #for analysis.
>>
>> my_pbo <- pbo(m,s=8,f=sharpe,threshold=0)
>>
>> summary(my_pbo)
>>
>> Here's the portion I'm curious about:
>>
>> sr_base <- 0
>> mu_base <- sr_base/(252.0)
>> sigma_base <- 1.00/(252.0)**0.5
>> for ( i in 1:n ) {
>>   m[,i] = m[,i] * sigma_base / sd(m[,i]) # re-scale
>>   m[,i] = m[,i] + mu_base - mean(m[,i]) # re-center
>> }
>>
>> Why is the data transformed within the for loop, and does this kind of
>> re-scaling and re-centering need to be done with r

Re: [R] Do I need to transform backtest returns before using pbo (probability of backtest overfitting) package functions?

2017-11-21 Thread Eric Berger

Correct 


> On 21 Nov 2017, at 22:42, Joe O  wrote:
> 
> Hi Eric,
> 
> Thank you, that helps a lot. If I'm understanding correctly, if I’m wanting 
> to use actual returns from backtests rather than simulated returns, I would 
> need to make sure my risk-adjusted return measure, sharpe ratio in this case, 
> matches up in scale with my returns (i.e. daily returns with daily sharpe, 
> monthly with monthly, etc). And I wouldn’t need to transform returns like the 
> simulated returns are in the vignette, as the real returns are going to have 
> whatever properties they have (meaning they will have whatever average and 
> std dev they happen to have). Is that correct? 
> 
> Thanks, -Joe
> 
> 

Re: [R] Scatterplot of many variables against a single variable

2017-11-27 Thread Eric Berger

LOL. Great reply Jim.
(N.B. Jim's conclusion is "debatable" by a judicious choice of seed. e.g.
set.seed(79) suggests that making the request more readable will actually
lower the number of useful answers. :-))


On Mon, Nov 27, 2017 at 11:42 AM, Jim Lemon  wrote:

> Hi Engin,
> Sadly, your illustration was ambushed on the way to the list. Perhaps
> you want something like this:
>
> # proportion of useful answers to your request
> pua<-sort(runif(20))
> #legibility of your request
> lor<-sort(runif(20))+runif(20,-0.5,0.5)
> # is a data set provided?
> dsp<-sort(runif(20))+runif(20,-0.5,0.5)
> # generate a linear model for the above
> pua.lm<-lm(pua~lor+dsp)
> # get the coefficients
> pua.lm
>
> Call:
> lm(formula = pua ~ lor + dsp)
>
> Coefficients:
> (Intercept)  lor  dsp
> 0.1692   0.6132   0.3311
>
> plot(pua~lor,col="red",
>   main="Proportion of useful answers by request quality")
> points(pua~dsp,col="blue",pch=2)
> abline(0.1692,0.6132,col="red")
> abline(0.1692,0.3311,col="blue")
>
> So, the more readable your request and the better the data that
> you provide, the more useful answers you are likely to receive.
>
> Jim
>
>
> On Mon, Nov 27, 2017 at 7:56 PM, Engin YILMAZ 
> wrote:
> > Dear
> >
> > I am trying to create a scatterplot matrix that plots *one single variable
> > against all other variables*, each with a *regression line*. You can see my
> > EViews version in the attachment.
> >
> > How can I draw this graph with RStudio?
> >
> >
> > Sincerely
> > Engin YILMAZ

Re: [R] DeSolve Package and Moving Average

2017-11-29 Thread Eric Berger

Since you only provide pseudo-code, I can only guess at the source of the
problem.
It is easy to get "burned" by ifelse. Its result has the same "shape"
(length) as its first argument, so a length-one test returns only the first
element of the chosen value.
My suggestion is to try replacing ifelse with a standard

if (  ) {
} else {
}
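
To illustrate the difference, here is a small sketch with made-up numbers. The vector `sales` and the running-mean expression are hypothetical stand-ins for the moving average in your model, not your actual code:

```r
sales <- c(10, 12, 9, 14)          # hypothetical sales history
have_history <- length(sales) > 0  # a single TRUE/FALSE

# ifelse() takes its shape from the length-one test, so only the first
# element of the running mean comes back.
from_ifelse <- ifelse(have_history, cumsum(sales) / seq_along(sales), 1)

# if/else returns the chosen value whole.
from_if <- if (have_history) cumsum(sales) / seq_along(sales) else 1

length(from_ifelse)  # 1
length(from_if)      # 4
```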

HTH,
Eric



On Wed, Nov 29, 2017 at 1:29 PM, Werning, Jan-Philipp <
jan-philipp.wern...@whu.edu> wrote:

> Dear all,
>
>
> I am using the DeSolve Package to simulate a system dynamics model. At the
> problematic point in the model, I basically want to decide how many
> products shall be produced to be sold. In order to determine the amount a
> basic forecasting model of using the average of the last 12 time periods
> shall be used. My code looks like the following.
>
> “ […]
>
> # Time units in month
> START<-0; FINISH<-120; STEP<-1
>
> # Set seed for reproducibility
>
>  set.seed(123)
>
> # Create time vector
> simtime  <- seq(START, FINISH, by=STEP)
>
> # Create a stock vector with initial values
> stocks   <- c([…])
>
> # Create an aux vector for the fixed aux values
> auxs<- c([…])
>
>
> model <- function(time, stocks, auxs){
>   with(as.list(c(stocks, auxs)),{
>
> [… “lots of aux, flow, and stock functions” … ]
>
>
> aMovingAverage  <-  ifelse(!exists("ResultsSimulation"),
> 1,movavg(ResultsSimulation$TotalSales, 12, type = "s"))
>
>
> return (list(c([…])))
>
>   })
> }
>
> # Call Solver, and store results in a data frame
> ResultsSimulation <-  data.frame(ode(y=stocks, times=simtime, func = model,
>   parms=auxs, method="euler"))
>
> […]”
>
> My problem is that the moving average (function: movavg) is only computed
> once, and the same value is used in every timestep of the model. That is,
> when running the model for the first time, 1 is used; when running it the
> next time, the total sales value of the first timestep is used. Since only
> one timestep exists, this is logical. Yet I would expect the movavg
> function to produce a new value in each of the 120 timesteps, as is the
> case with all other flow, stock and aux calculations.
>
> It would be great if you could help me with fixing this problem.
>
>
> Many thanks in advance!
>
> Yours,
>
> Jan
>
>
>
>
>

Re: [R] source files in temp environment

2017-12-02 Thread Eric Berger

I totally agree with Duncan's last point. I find it hard to reconcile your
early remarks (which indicate a deep knowledge of programming) with the
idea that your code is not built up by combining small(ish) functions.
Small functions are generally considered best practice. Try searching
on this topic to see discussions and pointers.
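
As a rough sketch of what that could look like for the file.r example, the temporaries can live inside a function that returns only the result. The function name compute_desired is hypothetical:

```r
# Instead of a script that leaves temp1 and temp2 behind when sourced,
# keep the temporaries local to a function and return only the result.
compute_desired <- function() {
  temp1 <- 1
  temp2 <- 2
  temp1 + temp2   # the only value that leaves the function
}

desired_var <- compute_desired()
desired_var   # 3
```

With this layout there is nothing to rm() afterwards and no need for <<-, since the caller decides where the result is stored.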

On Sat, Dec 2, 2017 at 1:01 PM, Duncan Murdoch 
wrote:

> On 02/12/2017 5:48 AM, Alexander Shenkin wrote:
>
>> Hi all,
>>
>> I often keep code in separate files for organizational purposes, and
>> source() that code from higher level scripts.  One problem is that those
>> sourced files often create temporary variables that I don't want to keep
>> around.  I could clean up after myself with lots of rm()'s, but that's a
>> pain, and is messy.
>>
>> I'm wondering if one solution might be to source the code in a temporary
>> environment, assign outputs of interest to the .GlobalEnv with <<-, and
>> then delete the environment afterwards.  One way to do this:
>>
>> file.r:
>> temp1 = 1
>> temp2 = 2
>> desired_var <<- temp1 + temp2
>>
>> console:
>> temp_e = new.env()
>> source("file.r", local = temp_e)
>> rm(temp_e)
>>
>> It's a bit messy to create and delete environments, so I tried what
>> others have referred to:
>>
>> source("file.r", local = attach(NULL))
>>
>> This, however, results in a persistent "NULL" environment in the search
>> path.
>>
>>   > search()
>>  ".GlobalEnv"        "package:bindrcpp"  "NULL"
>>  "tools:rstudio"     "package:stats"     "package:graphics"
>>  "package:grDevices" "package:utils"     "package:datasets"
>>  "package:methods"   "Autoloads"         "package:base"
>>
>> Of course, functions are built to encapsulate like this (and do so in
>> their own temporary environment), but in many cases, turning the sourced
>> code into functions is possible but clunky.
>>
>> Any thoughts or suggestions would be much appreciated.
>>
>
> I would wrap the calls in the local() function, or put them in a function
> and call that.  That is,
>
> local({
>   source("file.R", local = TRUE)
> })
>
> or
>
> sourceit <- function() {
>   source("file.R", local = TRUE)
> }
> sourceit()
>
> With respect to your last comment (turning the code in file.R into
> functions which don't leave their locals behind):  I think that would be
> the best solution.  You may find it clunky now, but in the long run it
> likely will help you to make better code.
>
> Duncan Murdoch
>
>

Re: [R] Rcpp, dyn.load and C++ problems

2017-12-02 Thread Eric Berger

.Call("compute_values_cpp")
Also, if you were passing arguments to the C++ function you would need to
declare the function differently.
Do a search on "Rcpp calling C++ functions from R"

HTH,
Eric


On Sun, Dec 3, 2017 at 3:06 AM, Martin Møller Skarbiniks Pedersen <
traxpla...@gmail.com> wrote:

> Hi,
>
>   I have written a small C++ function and compiled it.
>   However, in R I can't see the function I have defined in C++.
>   I have read some web pages about Rcpp and C++, but it is a bit confusing
> for me.
>
> Anyway,
>   This is the C++-code:
>
> #include <Rcpp.h>
> using namespace Rcpp;
>
> // [[Rcpp::export]]
> List compute_values_cpp(int totalPoints = 1e5, double angle_increment =
> 0.01, int radius = 400, double grow = 3.64) {
>   double xn = 0.5;
>   double angle = 0.1;
>   double xn_plus_one, yn_plus_one;
>   NumericVector x(totalPoints);
>   NumericVector y(totalPoints);
>
>   for (int i=0; i<totalPoints; i++) {
>     xn_plus_one = xn*cos(angle)*radius;
>     yn_plus_one = xn*sin(angle)*radius;
>     angle += angle_increment;
>     xn = grow*xn*(1-xn);
>     x[i] = xn_plus_one;
>     y[i] = yn_plus_one;
>   }
>   return List::create(Rcpp::Named("x") = x, Rcpp::Named("y") = y);
> }
>
> And I compile it like this:
> PKG_CXXFLAGS=$(Rscript -e 'Rcpp:::CxxFlags()') \
> PKG_LIBS=$(Rscript -e 'Rcpp:::LdFlags()')  \
> R CMD SHLIB logistic_map.cpp
> without problems and I get a logistic_map.so file as expected.
>
> However in R:
> R> dyn.load("logistic_map.so")
> R> compute_values_cpp()
> Error in compute_values_cpp() :
>   could not find function "compute_values_cpp"
>
> Please advise,
>   What piece of the puzzle is missing?
>
> Regards
> Martin M. S. Pedersen
>

Re: [R] problem with the behaviour of dashed lines in R plots

2017-12-04 Thread Eric Berger

Hi,
Sarah's last comment about using min/max x got me thinking. It's not that
the points are "very close together", it's that the x-values are not
ordered. So the plot is actually drawing a dashed line back-and-forth
between different points on the line, which has the effect of making the
result appear non-dashed. If you sort by the x-values before plotting you
will see the output is very different. I am not saying this is a better
solution than Sarah's regarding just using the end-points, but at least it
partially explains the relevance of that suggestion.

For example, here is a slightly modified version of the code that does an
ordering before plotting:

df1<-data.frame(B=runif(20,1.4,1.6),A=runif(20,-19.5,-9.8))
regressor<-lm(A~B,data = df1)
df2 <- data.frame( x=df1$B, yhat= predict(regressor,df1))
df2 <- df2[ order(df2$x), ]
plot(df2$x,df2$yhat,type="l", col="black", mgp=c(2,0.5,0),cex.lab=1.6,
lwd=2, lty=2,xlim=range(c(1.2,1.7)),ylim=rev(range(c(-19,-8))))
par(new = TRUE)
plot(df1$B,as.numeric(df1$A),type="p", col="black",
mgp=c(2,0.5,0),cex.lab=1.6,cex=2, xlab = "", ylab =
"",xlim=range(c(1.2,1.7)),ylim=rev(range(c(-19,-8))),pch=17)
box(lwd=3)

HTH,
Eric

On Mon, Dec 4, 2017 at 8:30 PM, jean-philippe <
jeanphilippe.fonta...@gssi.infn.it> wrote:

> hi Sarah,
>
> Thanks a lot for having taken time to answer me and for your reply. I
> wonder how I missed this solution. Indeed plotting the line with the 2
> extreme data points works perfectly.
>
>
> Best,
>
>
> Jean-Philippe Fontaine
>
>
>
> On 04/12/2017 18:30, Sarah Goslee wrote:
>
>> It's because you are plotting a line between each of the points in
>> your data frame, and they are very close togethe
>>
>

Re: [R] Curiously short cycles in iterated permutations with the same seed

2017-12-07 Thread Eric Berger

Hi Boris,
Do a search on "the order of elements of the symmetric group". (This search
will also turn up homework questions and solutions.) You will understand
why you are seeing this once you understand how a permutation is decomposed
into cycles and how the order relates to a partition of n (n=10 in your
case).
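
A minimal sketch of that idea in base R: the order of a permutation is the least common multiple of its cycle lengths. The helper names gcd, lcm, cycle_lengths and perm_order are my own for this illustration, not functions from any package:

```r
# Greatest common divisor and least common multiple of positive integers.
gcd <- function(a, b) if (b == 0) a else gcd(b, a %% b)
lcm <- function(a, b) a / gcd(a, b) * b

# Cycle lengths of a permutation p of 1..n, where p[i] is the image of i.
cycle_lengths <- function(p) {
  seen <- logical(length(p))
  lens <- integer(0)
  for (start in seq_along(p)) {
    if (seen[start]) next
    len <- 0L
    i <- start
    while (!seen[i]) {        # walk the cycle until it closes
      seen[i] <- TRUE
      i <- p[i]
      len <- len + 1L
    }
    lens <- c(lens, len)
  }
  lens
}

# The order of p is the least common multiple of its cycle lengths.
perm_order <- function(p) Reduce(lcm, cycle_lengths(p), 1)

p <- c(2, 3, 1, 5, 4, 6, 7, 8, 9, 10)  # a 3-cycle, a 2-cycle, five fixed points
cycle_lengths(p)  # 3 2 1 1 1 1 1
perm_order(p)     # 6
```

Since the cycle lengths must add up to n = 10, the least common multiple cannot get large: the biggest possible value is 30 (cycle lengths 2, 3 and 5), which agrees with your observation that no cycle is longer than 30.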

Enjoy!
Eric

On Fri, Dec 8, 2017 at 6:39 AM, Boris Steipe 
wrote:

> I have noticed that when I iterate permutations of short vectors with the
> same seed, the cycle lengths are much shorter than I would expect by
> chance. For example:
>
> X <- 1:10
> Xorig <- X
> start <- 112358
> N <- 10
>
> for (i in 1:N) {
>   seed <- start + i
>   for (j in 1:1000) {      # Maximum cycle length to consider
>     set.seed(seed)         # Re-seed RNG to same initial state
>     X <- sample(X)         # Permute X and iterate
>     if (all(X == Xorig)) {
>       cat(sprintf("Seed:\t%d\tCycle: %d\n", seed, j))
>       break()
>     }
>   }
> }
>
> Seed:   112359  Cycle: 14
> Seed:   112360  Cycle: 14
> Seed:   112361  Cycle: 8
> Seed:   112362  Cycle: 14
> Seed:   112363  Cycle: 8
> Seed:   112364  Cycle: 10
> Seed:   112365  Cycle: 10
> Seed:   112366  Cycle: 10
> Seed:   112367  Cycle: 9
> Seed:   112368  Cycle: 12
>
> I understand that I am performing the same permutation operation over and
> over again - but I don't see why that would lead to such a short cycle (in
> fact the cycle for the first 100,000 seeds is never longer than 30). Does
> this have a straightforward explanation?
>
>
> Thanks!
> Boris
>

[R] inefficient for loop, is there a better way?

2017-12-12 Thread Morway, Eric

The code below is a small reproducible example of a much larger problem.
While the script below works, it is really slow on the true dataset, which
has many more rows and columns.  I'm hoping to get the same result in examp,
but with significant time savings.

The example below sets up a data.frame for an ensuing regression
analysis.  The purpose of the script is to append columns to 'examp'
that contain the total number of days in the previous 7 ('per') on which
the stage was above some threshold ('elev1' or 'elev2').  Is there a
faster method that leverages existing R functionality?  I feel like the
hack below is pretty clunky and can be sped up on the true dataset.  I
would like to run a more efficient script many times, adjusting the value of
'per'.

ts <- 1:1000
examp <- data.frame(ts=ts, stage=sin(ts))

hi1 <- list()
hi2 <- list()
per <- 7
elev1 <- 0.6
elev2 <- 0.85
for(i in per:nrow(examp)){
    examp_per <- examp[seq(i - (per - 1), i, by=1),]
    stg_hi_cond1 <- subset(examp_per, examp_per$stage > elev1)
    stg_hi_cond2 <- subset(examp_per, examp_per$stage > elev2)

    hi1 <- c(hi1, nrow(stg_hi_cond1))
    hi2 <- c(hi2, nrow(stg_hi_cond2))
}
examp$days_abv_0.6_in_last_7   <- c(rep(NA, times=per-1), unlist(hi1))
examp$days_abv_0.85_in_last_7  <- c(rep(NA, times=per-1), unlist(hi2))
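
For comparison, one possible vectorized form of the same trailing count is sketched below using cumsum(). This is only a sketch under the assumption that the window always covers the current row and the per - 1 rows before it; the helper name roll_count is made up for this example:

```r
ts <- 1:1000
examp <- data.frame(ts = ts, stage = sin(ts))
per <- 7

# Number of TRUE values in each trailing window of length `per`,
# with NA for the first per - 1 rows (window not yet full).
# Assumes length(flag) >= per.
roll_count <- function(flag, per) {
  cs <- cumsum(flag)                          # running count of TRUEs
  out <- cs - c(rep(0, per), head(cs, -per))  # count at i minus count at i - per
  out[seq_len(per - 1)] <- NA
  out
}

examp$days_abv_0.6_in_last_7  <- roll_count(examp$stage > 0.6,  per)
examp$days_abv_0.85_in_last_7 <- roll_count(examp$stage > 0.85, per)
```

Because the work is done by one cumsum() per threshold rather than one subset() per row, changing 'per' only changes the offset in the subtraction.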


Re: [R] difference between ifelse and if...else?

2017-12-13 Thread Eric Berger

ifelse returns a result with the "shape" of its first argument.

In your ifelse, the first argument "3 > 2" is a logical vector of length one,
so the result is a vector of length one: only the first element of 1:3.

Avoid "ifelse" until you are very comfortable with it. It can often burn
you.
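
A short sketch of the three cases side by side. The variable names a, b, d and x are only for illustration:

```r
# Length-one test: only the first element of `yes` is used.
a <- ifelse(3 > 2, 1:3, length(1:3))   # 1

# if/else is ordinary control flow and returns the chosen object whole.
b <- if (3 > 2) 1:3 else length(1:3)   # 1 2 3

# ifelse() is meant for a vector test, giving an element-wise choice.
x <- 1:5
d <- ifelse(x > 2, x, 0L)              # 0 0 3 4 5
```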

On Wed, Dec 13, 2017 at 5:33 PM, jeremiah rounds 
wrote:

> ifelse is vectorized.
>
> On Wed, Dec 13, 2017 at 7:31 AM, Jinsong Zhao  wrote:
>
> > Hi there,
> >
> > I don't know why the following lines of code return different results.
> >
> > > ifelse(3 > 2, 1:3, length(1:3))
> > [1] 1
> > > if (3 > 2) 1:3 else length(1:3)
> > [1] 1 2 3
> >
> > Any hints?
> >
> > Best,
> > Jinsong
> >

Re: [R] help with recursive function

2017-12-14 Thread Eric Berger

You seem to have a typo at this expression (and some others like it)

Namely, you write

any(!dat2$norm_sd) >= 1

when you possibly meant to write

!( any(dat2$norm_sd) >= 1 )

i.e. I think your ! seems to be in the wrong place.
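
To see why the placement matters, here is a small sketch on a made-up vector. It assumes the intent is "stop when no norm_sd value reaches 1", which is my reading of the post and not something stated in it; in that case the comparison belongs inside any():

```r
norm_sd <- c(0.2, 0.7, 1.3)   # made-up values, one of them >= 1

# As written in the post: ! turns every non-zero value into FALSE first,
# then the single logical from any() is compared with 1.
as_posted <- any(!norm_sd) >= 1     # FALSE here

# Comparison inside any(): is at least one value >= 1?
has_outlier <- any(norm_sd >= 1)    # TRUE here

# Stopping condition for the recursion under the stated assumption.
all_clean <- !any(norm_sd >= 1)     # FALSE here
```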

HTH,
Eric


On Thu, Dec 14, 2017 at 3:26 PM, DIGHE, NILESH [AG/2362] <
nilesh.di...@monsanto.com> wrote:

> Hi, I need some help with running a recursive function. I would like to run
> funlp2 recursively.
> When I try to run the recursive function inside another function named "calclp", I
> get this "Error: any(!dat2$norm_sd) >= 1 is not TRUE".
>
> I have never built a recursive function before, so I am having trouble executing
> it in this case.  I would appreciate any help or guidance to resolve this
> issue. Please see my data and the three functions that I am using below.
> Please note that calclp is the function I am running and the other two
> functions are called within this calclp function.
>
> # code:
> Test<- calclp(dataset = dat)
>
> # calclp function
>
> calclp<- function (dataset)
>
> {
>
> dat1 <- funlp1(dataset)
>
> recursive_funlp <- function(dataset = dat1, func = funlp2) {
>
> dat2 <- dataset %>% select(uniqueid, field_rep, lp) %>%
>
> mutate(field_rep = paste(field_rep, "lp", sep = ".")) %>%
>
> spread(key = field_rep, value = lp) %>% mutate_at(.vars =
> grep("_",
>
> names(.)), funs(norm = round(scale(.), 3)))
>
> dat2$norm_sd <- round(apply(dat2[, grep("lp_norm", names(dat2))],
>
> 1, sd, na.rm = TRUE), 3)
>
> dat2$norm_max <- round(apply(dat2[, grep("lp_norm", names(dat2))],
>
> 1, function(x) {
>
> max(abs(x), na.rm = TRUE)
>
> }), 3)
>
> stopifnot(any(!dat2$norm_sd) >= 1)
>
> if (any(!dat2$norm_sd) >= 1) {
>
> df1 <- dat1
>
> return(df1)
>
> }
>
> else {
>
> df2 <- recursive_funlp()
>
> return(df2)
>
> }
>
> }
>
> df3 <- recursive_funlp(dataset = dat1, func = funlp2)
>
> df3
>
> }
>
>
> # funlp1 function
>
> funlp1<- function (dataset)
>
> {
>
> dat2 <- dataset %>% select(field, set, ent_num, rep_num,
>
> lp) %>% unite(uniqueid, set, ent_num, sep = ".") %>%
>
> unite(field_rep, field, rep_num) %>% mutate(field_rep =
> paste(field_rep,
>
> "lp", sep = ".")) %>% spread(key = field_rep, value = lp) %>%
>
> mutate_at(.vars = grep("_", names(.)), funs(norm = round(scale(.),
>
> 3)))
>
> dat2$norm_sd <- round(apply(dat2[, grep("lp_norm", names(dat2))],
>
> 1, sd, na.rm = TRUE), 3)
>
> dat2$norm_max <- round(apply(dat2[, grep("lp_norm", names(dat2))],
>
> 1, function(x) {
>
> max(abs(x), na.rm = TRUE)
>
> }), 3)
>
> data1 <- dat2 %>% gather(key, value, -uniqueid, -norm_max,
>
> -norm_sd) %>% separate(key, c("field_rep", "treatment"),
>
> "\\.") %>% spread(treatment, value) %>% mutate(outlier = NA)
>
> df_clean <- with(data1, data1[norm_sd < 1, ])
>
> datD <- with(data1, data1[norm_sd >= 1, ])
>
> s <- split(datD, datD$uniqueid)
>
> sdf <- lapply(s, function(x) {
>
> data.frame(x, x$outlier <- ifelse(is.na(x$lp_norm), NA,
>
> ifelse(abs(x$lp_norm) == x$norm_max, "yes", "no")),
>
> x$lp <- with(x, ifelse(outlier == "yes", NA, lp)))
>
> x
>
> })
>
> sdf2 <- bind_rows(sdf)
>
> all_dat <- bind_rows(df_clean, sdf2)
>
> all_dat
>
> }
>
>
> # funlp2 function
>
> funlp2<-function (dataset)
>
> {
>
> data1 <- dataset
>
> df_clean <- with(data1, data1[norm_sd < 1, ])
>
> datD <- with(data1, data1[norm_sd >= 1, ])
>
> s <- split(datD, datD$uniqueid)
>
> sdf <- lapply(s, function(x) {
>
> data.frame(x, x$outlier <- ifelse(is.na(x$lp_norm), NA,
>
> ifelse(abs(x$lp_norm) == x$norm_max, "yes", "no")),
>
> x$lp <- with(x, ifelse(outlier == "yes", NA, lp)))
>
> x
>
> })
>
> sdf2 <- bind_rows(sdf)
>
> all_dat <- bind_rows(df_clean, sdf2)
>
> all_dat
>
