[R] [R-pkgs] Major Update to asciiSetupReader Package

2019-02-05 Thread Jacob
 The latest version of the asciiSetupReader package includes a number of
major improvements and bug fixes. This package lets you read in fixed-width
.dat files that come paired with an .sps or .sas setup file.

See the following for an overview of the package:
https://jacobkap.github.io/asciiSetupReader/


For a list of changes please see here:
https://cran.r-project.org/web/packages/asciiSetupReader/news/news.html

Best,
Jacob


___
R-packages mailing list
r-packa...@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-packages

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] bug (?) with lapply / clusterMap / clusterApply etc

2016-03-22 Thread jacob


Hello, I have encountered a bug(?) with the parallel package. When run
from within a function, parLapply appears to copy the entire enclosing
environment (the interior of the calling function) into all child nodes
in the cluster, one node at a time, which is very slow. The copied
contents are not even accessible within the child nodes, even though
they show up in the memory footprint. I may be misusing the terms
"parent" and "node" here...


The code below demonstrates the issue. The same parallel command is
used twice within the function, once before creating a large object and
once afterwards. Both commands should take a nearly identical amount of
time. The first call takes less than 1/100th of a second, but the
second call takes hundreds of times longer...


Example Code:

    library(parallel)

    # create a cluster of nodes
    if (!"clus1" %in% ls()) clus1 <- makeCluster(10)

    # function used to demonstrate the bug
    rows_fn1 <- function(x, clus) {

        # first set of parallel code
        print(system.time(parLapply(clus, 1:5, function(z) { y <- rnorm(5000); mean(y) })))

        # create a large vector
        x <- rnorm(10^7)

        # second set
        print(system.time(parLapply(clus, 1:5, function(z) { y <- rnorm(5000); mean(y) })))
    }

    # demonstrate the bug - watch Task Manager and see Windows slowly
    # copy the vector to each node in the cluster
    rows_fn1(1:5000, clus1)

Although the child nodes bloat proportionally to the size of x in the  
parent environment, x is not available in the child nodes. The code  
above can be tweaked to add more variables (x1,x2,x3 ...) and the  
child nodes will bloat to the same degree.
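For reference, one way to keep such implicit state out of the workers (a sketch of a common pattern, not taken from this post) is to define the worker function at top level, so that only the function itself, and not the calling frame, is serialized to the nodes; anything the workers genuinely need can be shipped explicitly with clusterExport() or passed as an extra argument. The names `worker` and `big_wrapper` below are illustrative:

```r
library(parallel)

cl <- makeCluster(2)

# define the worker at top level: a function whose environment is the
# global environment is serialized without dragging local state along
worker <- function(z) mean(rnorm(5000))

big_wrapper <- function(clus) {
  x <- rnorm(10^7)              # large local object, never shipped
  parLapply(clus, 1:5, worker)  # only 'worker' goes to the nodes
}

res <- big_wrapper(cl)
stopCluster(cl)
```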


I am working on Windows Server 2012 with 64-bit R version 3.2.1. I
upgraded to 3.2.4-revised and observed the same bug.


I have googled for this issue and have not found anyone else reporting
a similar problem.


I have attempted to reboot my machine without effect (aside from the obvious).

Any suggestions would be greatly appreciated!

With regards,

Jacob L Strunk
Forest Biometrician (PhD), Statistician (MSc)
and Data Munger



Re: [R] bug (?) with lapply / clusterMap / clusterApply etc

2016-03-23 Thread jacob

Very informative! Thank you.

Quoting Martin Morgan :


On 03/22/2016 01:46 PM, ja...@forestlidar.org wrote:


Hello I have encountered a bug(?) with the parallel package. When run
from within a function, the parLapply function appears to be copying the
entire parent environment into all child nodes in the cluster, one node
at a time [...]

[example code trimmed]

Although the child nodes bloat proportionally to the size of x in the
parent environment, x is not available in the child nodes.


With this

library(parallel)
cl <- makeCluster(2)
f <- function() {
x <- 10
parSapply(cl, 1:5, function(i) x * i)
}

we see both that x is available and why the variable is copied (so that
symbols available in the environment in which FUN is defined are
available, just as in serial evaluation):



f()

[1] 10 20 30 40 50

Defining the function in the global environment, rather than in the  
body of a function, avoids copying implicit state,


cl <- makeCluster(2)
FUN <- function(i) x * i
f <- function() {
x <- 10
parSapply(cl, 1:5, FUN)
}

but requires that all arguments are defined / passed


f()

Error in checkForRemoteErrors(val) (from #3) :
  2 nodes produced errors; first error: object 'x' not found

updating the function definition and use

FUN <- function(i, x) x * i
f <- function() {
x <- 10
parSapply(cl, 1:5, FUN, x)
}


f()

[1] 10 20 30 40 50

The foreach package tries to be smart and export only symbols used  
(but can be tricked)


library(foreach)
library(doSNOW)
registerDoSNOW(cl)
g <- function() {
x <- 10
foreach(i=1:2) %dopar% { get("x") }
}


g()  # fails because 'x' is not referenced directly so not exported

Error in { (from #3) : task 1 failed - "object 'x' not found"

versus

g <- function() {
x <- 10
foreach(i=1:2) %dopar% { get("x"); x }
}

and


g()  # works because 'x' referenced and exported

[[1]]
[1] 10

[[2]]
[1] 10


Martin










[R] R abrupt exit

2010-04-02 Thread jacob
Dear Lists:

I recently ran into quite an annoying problem while running R on Ubuntu
9.10. While the program was running, the system suddenly exited the R
session with the following warning:

 #
OMP: Hint: This may cause performance degradation and correctness
issues. Set environment variable KMP_DUPLICATE_LIB_OK=TRUE to ignore
this problem and force the program to continue anyway. Please note
that the use of KMP_DUPLICATE_LIB_OK is unsupported and using it may
cause undefined behavior. For more information, please contact
Intel(R) Premier Support.
Aborted
##
I have to restart R, and all the calculations are lost.

Following the warning, I set the variable to TRUE; no good. I
reinstalled R; no good either. I googled the problem, and it seems
there has been no R-help discussion of this topic so far.
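One likely pitfall here (my observation, not confirmed in the thread): options() sets an R-level option, not an operating-system environment variable, so the OpenMP runtime may never have seen the setting. A sketch of setting the variable where native libraries can read it:

```r
# set the environment variable for the current R process; the OpenMP
# runtime reads the process environment, not R's options() list
Sys.setenv(KMP_DUPLICATE_LIB_OK = "TRUE")

# verify it is visible as an environment variable
Sys.getenv("KMP_DUPLICATE_LIB_OK")

# by contrast, options("KMP_DUPLICATE_LIB_OK") only queries or sets an
# R-level option, which native libraries never consult
```

Alternatively, `export KMP_DUPLICATE_LIB_OK=TRUE` in the shell before starting R.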


Could the whole OS have a library conflict? I have only one copy of the
following file on my system:

/usr/lib/R/lib/libguide.so


Thanks.

jacob

#
>options("KMP_DUPLICATE_LIB_OK")
$KMP_DUPLICATE_LIB_OK
[1] TRUE
> sessionInfo()
R version 2.10.1 (2009-12-14)
x86_64-pc-linux-gnu

locale:
 [1] LC_CTYPE=en_US.UTF-8   LC_NUMERIC=C
 [3] LC_TIME=en_US.UTF-8LC_COLLATE=en_US.UTF-8
 [5] LC_MONETARY=C  LC_MESSAGES=en_US.UTF-8
 [7] LC_PAPER=en_US.UTF-8   LC_NAME=C
 [9] LC_ADDRESS=C   LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] datasets  grDevices splines   graphics  stats utils methods
[8] base

other attached packages:
 [1] Design_2.3-0 Hmisc_3.7-0  survival_2.35-8
 [4] GEOquery_2.11.3  RCurl_1.4-0  bitops_1.0-4.1
 [7] affy_1.24.2  Biobase_2.6.1preprocessCore_1.8.0
[10] R.methodsS3_1.0.3

loaded via a namespace (and not attached):
[1] affyio_1.14.0  cluster_1.12.1 grid_2.10.1lattice_0.18-3



[R] [R-pkgs] New package: fastDummies

2017-04-21 Thread Jacob Kaplan
Dear R users,

I am happy to announce that the package fastDummies is now on CRAN
(https://cran.r-project.org/package=fastDummies).

The fastDummies package lets you quickly and easily create dummy columns
from categorical variables in data.tables or data.frames. This package is
much faster than model.matrix() at handling data with more than 1,000
rows.
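A minimal usage sketch (argument names as I understand them from the package; see ?dummy_cols for the full interface):

```r
library(fastDummies)

d <- data.frame(animal = c("dog", "cat", "dog"))

# adds one 0/1 indicator column per level of 'animal'
dummy_cols(d, select_columns = "animal")
```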

Please submit all questions or issues here:
https://github.com/jacobkap/fastDummies/issues

Best,
Jacob





[R] [R-pkgs] New CRAN Package Announcement: asciiSetupReader

2017-09-15 Thread Jacob Kaplan
I'm pleased to announce that asciiSetupReader is now on CRAN:
https://cran.r-project.org/web/packages/asciiSetupReader/index.html

This package allows users to read ASCII files that have an SPSS or SAS
setup file (.sps or .sas). Datasets that come in these txt-sps and txt-sas
pairs are now accessible through R. The function has the option of
correcting value labels (e.g. 1 to Male, 2 to Female) and column names
(e.g. V1 to Sex). You may also select only certain columns to read in,
which is helpful when dealing with very large data sets.
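A sketch of typical usage (the function and argument names are my best understanding of the package's interface, and the file names are illustrative; consult the vignette for the exact details):

```r
library(asciiSetupReader)

# read a fixed-width data file using its SPSS setup file
dat <- read_ascii_setup(data       = "crime_data.txt",
                        setup_file = "crime_data.sps")
```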

A vignette is available explaining how to use the package with SPSS setup
files. The process is the same as for SAS setup files.

Please let me know if you find any bugs or problems in the package:
https://github.com/jacobkap/asciiReader/issues

Jacob





[R] package.skeleton, environment argument causes error

2016-08-18 Thread Jacob Strunk
Hello, I have been using package.skeleton from within an lapply statement
successfully (assuming good source code) with the following setup in the
past:

x=try(package.skeleton(package_name,path=pathi,code_files=file_i))


but it now fails with the error:

Error: evaluation nested too deeply: infinite recursion /
options(expressions=)?

I am working in RStudio Version 0.99.896, with 64 bit R version 3.3.1
(2016-06-21)




I have been probing the code for package.skeleton a bit and noticed that
the default arguments for 'list' and 'environment' are supplied in the
function definition, thus making it impossible to achieve the conditions

envIsMissing=TRUE
missing(list) = TRUE


Because missing(list) cannot be TRUE, the classesList argument is empty
and the call

classes0 <- .fixPackageFileNames(classesList)


then fails with the error

Error: evaluation nested too deeply: infinite recursion /
options(expressions=)?


If I remove the default arguments I get further, but hit the same error I
had before (Error: evaluation nested too deeply: infinite recursion /
options(expressions=)?) after executing the following code:

methods0 <- .fixPackageFileNames(methodsList)


and the contents of methodsList look like

An object of class "ObjectsWithPackage":

Object:
Package:


The function .fixPackageFileNames fails when it reaches

list <- as.character(list)


where in this case the contents of 'list' look like

str(list)
Formal class 'ObjectsWithPackage' [package "methods"] with 2 slots
  ..@ .Data  : chr(0)
  ..@ package: chr(0)


I am not sure if the problem arose from changes to package.skeleton
or methods::getClasses and methods::getGenerics or if there is something
peculiar about my environment.

My current ugly fix is to define the function .fixPackageFileNames in the
global environment, add a try statement, and exit when it results in an
object of class "try-error":

.fixPackageFileNames <- function (list)
{
    list <- try(as.character(list))
    if (class(list) == "try-error") return(list)
    if (length(list) == 0L)
        return(list)
    list0 <- gsub("[[:cntrl:]\"*/:<>?\\|]", "_", list)
    wrong <- grep("^(con|prn|aux|clock\\$|nul|lpt[1-3]|com[1-4])(\\..*|)$",
        list0)
    if (length(wrong))
        list0[wrong] <- paste0("zz", list0[wrong])
    ok <- grepl("^[[:alnum:]]", list0)
    if (any(!ok))
        list0[!ok] <- paste0("z", list0[!ok])
    list1 <- tolower(list0)
    list2 <- make.unique(list1, sep = "_")
    changed <- (list2 != list1)
    list0[changed] <- list2[changed]
    list0
}


Any assistance with this error would be greatly appreciated!

Thank you,

-- 
Jacob Strunk
stru...@gmail.com




[R] nlme splines model.frame.default error

2015-06-02 Thread Jacob Wegelin

I want to use a specialized function to compute knots on the fly to pass to 
splines::bs within a call to nlme::lme.

The specialized function will compute knots from the x-axis variable (the x 
argument in a call to splines::bs).

The syntax works with lm. But when I try it with lme, the following error is 
returned:

Error in model.frame.default(formula = ~why + eks + toydat + ID, data = list( :
  invalid type (list) for variable 'toydat'

Below is a simplified example showing the "desired syntax," which returns the error, and 
a workaround. I am mystified as to what is inherently wrong with the "desired syntax," 
since it works fine with stats::lm.

set.seed(5)
library(splines)
library(nlme)
NID<-5
toydat<-data.frame(eks=0:((2*NID)-1))
toydat$ID<-factor(rep(LETTERS[1:NID], each=2))
toydat
toydat$why<-with(toydat, as.integer(100*sin((eks / 7 * pi)^2) + rnorm(eks)/10))
customKnotsFn<-function(some)3.5
print(toydat)
print(summary(toydat))
print(customKnotsFn)

# lm has no trouble:

mylm<-lm(why~bs(eks,knots=customKnotsFn(some=toydat$eks)), data=toydat)
print(mylm$call)

#  The "desired syntax" returns an error:

lme(fixed=why~bs(eks,knots=customKnotsFn(some=toydat$eks)),random=~1|ID, 
data=toydat)

#  Removing the argument from customKnotsFn eliminates the error, but then 
customKnotsFn is useless for practical purposes:

lme(fixed=why~bs(eks,knots=customKnotsFn()),random=~1|ID, data=toydat)

#  Workaround returns no error:

mylme<- with(toydat, 
lme(fixed=why~bs(eks,knots=customKnotsFn(some=eks)),random=~1|ID))
print(mylme$call)

Of course, with the workaround the resulting mylme does not know where the data 
came from.

Why should a workaround be necessary? Is there something inherently misguided about the 
"desired syntax" (above)?

Thanks for any insight

Jacob Wegelin


sessionInfo()

R version 3.1.3 (2015-03-09)
Platform: x86_64-apple-darwin10.8.0 (64-bit)
Running under: OS X 10.7.5 (Lion)

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] splines   stats graphics  grDevices utils datasets  methods   base

other attached packages:
[1] nlme_3.1-120

loaded via a namespace (and not attached):
[1] grid_3.1.3  lattice_0.20-30



Re: [R] nlme splines model.frame.default error

2015-06-02 Thread Jacob Wegelin

On 2015-06-02 Tue 14:20, Jacob Wegelin wrote:
I want to use a specialized function to compute knots on the fly to pass to 
splines::bs within a call to nlme::lme. [... original post, example code, and 
workaround trimmed; see above ...]


# Here is a perhaps slicker workaround:

mybs <-function(x, df) {
 myknots<-customKnotsFn(x)
 bs(x, knots=myknots)
}

lme(fixed=why~mybs(eks, df=2), random=~1|ID, data=toydat)

# But why cannot lme handle the "desired syntax"?



[R] ggplot2 geom_boxplot limits

2015-07-20 Thread Jacob Wegelin

With base graphics, one can use the "ylim" argument to zoom in on a boxplot.

With ggplot2, using "limits" to try to zoom in on a boxplot *changes the box*.  
Since the box usually indicates the 25th and 75th percentiles of a quantitative variable, 
this is puzzling.

The toy code below demonstrates this. In ggplot2, "zooming in" causes the two 
boxes to overlap, when they did not overlap in the full plot.  Also, the center lines --- 
which usually indicate the median of the variable --- change when one zooms in.

In base graphics, "zooming in" does not cause the boxes to overlap or, as far 
as I can see, the median line to move relative to the scale.

What is going on here?

pdf(file="toy-example.pdf")
set.seed(1)
toy1<-data.frame(Y=rnorm(500, mean=3), A="one")
toy2<-data.frame(Y=rnorm(500, mean=1.6), A="two")
toy<-rbind(toy1,toy2)
toy$A<-factor(toy$A)
library(ggplot2)
mybreaks<-signif(seq(from=min(toy$Y),to=max(toy$Y),by=0.5),digits=2)
mylimits<-c(0.61,3.7)
print(myplot<-ggplot(toy, aes(x=A,y=Y)) + 
geom_boxplot()+scale_y_continuous(breaks=mybreaks)+theme_bw())
print(myplot+scale_y_continuous(breaks=mybreaks,limits=mylimits))
boxplot(toy1$Y,toy2$Y)
boxplot(toy1$Y,toy2$Y, ylim=mylimits)
graphics.off()


sessionInfo()

R version 3.2.1 (2015-06-18)
Platform: x86_64-apple-darwin13.4.0 (64-bit)
Running under: OS X 10.10.4 (Yosemite)

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] stats graphics  grDevices utils datasets  methods   base

other attached packages:
[1] ggplot2_1.0.1

loaded via a namespace (and not attached):
 [1] MASS_7.3-40  colorspace_1.2-6 scales_0.2.5 magrittr_1.5 
plyr_1.8.3   tools_3.2.1  gtable_0.1.2 reshape2_1.4.1
 [9] Rcpp_0.11.6  stringi_0.5-5    grid_3.2.1   stringr_1.0.0
digest_0.6.8 proto_0.3-10 munsell_0.4.2


Jacob A. Wegelin



Re: [R] ggplot2 geom_boxplot limits

2015-07-20 Thread Jacob Wegelin

On 2015-07-20 Mon 15:19, Thierry Onkelinx wrote:


Limits in scales set values outside the limits to NA. Hence the boxplots, smoothers,... 
change. Use coord_cartesian() to "zoom in".


Thanks. What do I do if I also want to use coord_flip(), that is, if I want the 
boxes to lie horizontally *and* to zoom in?

myplot+coord_cartesian(ylim=mylimits) # zooms in

myplot+coord_cartesian(ylim=mylimits) + coord_flip() # flips but does not zoom

myplot + coord_flip()+coord_cartesian(ylim=mylimits) # zooms but does not flip
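A note on why the combinations above fail (my reading, not stated in the thread): a ggplot keeps only one coordinate system, so the second coord_* call replaces the first. More recent ggplot2 versions let coord_flip() itself take limits, so the zoom and the flip can be combined in a single call; a sketch (check which of xlim/ylim applies after flipping in your ggplot2 version):

```r
library(ggplot2)

set.seed(1)
toy <- rbind(data.frame(Y = rnorm(500, mean = 3),   A = "one"),
             data.frame(Y = rnorm(500, mean = 1.6), A = "two"))
mylimits <- c(0.61, 3.7)

p <- ggplot(toy, aes(x = A, y = Y)) + geom_boxplot()

# zoom (without dropping data, as coord_cartesian does) and flip
# in one coordinate-system call
p + coord_flip(ylim = mylimits)
```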

Jacob Wegelin







[R] Probability distribution of fitted gaussian distribution.

2015-01-03 Thread Jacob Varughese
Hi,

I have a discrete set of data on the returns for 3 indices, with 206 data
points. Since the number of points is small, the data do not look exactly
like a Gaussian distribution.

I wanted to fit the data to a Gaussian distribution, so I used the
fitdist function and obtained the plots and the mean and sd estimates
for the Gaussian that fits my data.

What I then want to do is get U = F(x), where U is the uniform variable
obtained by applying the fitted theoretical CDF to the data. How can I
get that?


Below is the equivalent approach I found in MATLAB. There, ksdensity
returns arrays of f and xi values, and I could use the f values for my
purpose. But I am trying to work it out in R. The steps I am going
through in R are below. I have also attached the input sheet I am using
for the indices. Apologies in advance in case this is a dumb question.

Estimate Density

Generate a sample data set from a mixture of two normal distributions.

rng default  % for reproducibility
x = [randn(30,1); 5+randn(30,1)];

Plot the estimated density.

[f,xi] = ksdensity(x);
figure
plot(xi,f);

Steps that I am following.

# Reading and finding the returns for 3 indices.
CDSPrices <- read.csv("CDS.csv")
numRows <- nrow(CDSPrices)
CDSReturnsN225 <- CDSPrices$N225[2:numRows] / CDSPrices$N225[1:(numRows - 1)] - 1
CDSReturnsSPX <- CDSPrices$SPX[2:numRows] / CDSPrices$SPX[1:(numRows - 1)] - 1
CDSReturnsIBOVESPA <- CDSPrices$IBOVESPA[2:numRows] / CDSPrices$IBOVESPA[1:(numRows - 1)] - 1
CDS_Returns <- cbind(CDSReturnsN225, CDSReturnsSPX, CDSReturnsIBOVESPA)

# Using fitdist to fit a Gaussian distribution onto the discrete
# empirical data I have.
library(fitdistrplus)
fittedNormal <- fitdist(CDS_Returns[, 1], "norm")
plot(fittedNormal)


> fittedNormal[]
$estimate
mean   sd
-0.002035951  0.028654032

$method
[1] "mle"

$sd
   mean  sd
0.001996421 0.001403953
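One possible answer (a sketch, not from the original post): since the fitted distribution is normal, the fitted CDF is just pnorm() with the estimated parameters, so the probability-integral transform U = F(x) follows directly from a fit object like the one above. The `returns` vector below is illustrative stand-in data:

```r
library(fitdistrplus)

# illustrative stand-in for one column of the returns matrix
set.seed(1)
returns <- rnorm(206, mean = -0.002, sd = 0.0287)

fittedNormal <- fitdist(returns, "norm")

# U = F(x): apply the fitted CDF to the data; U is approximately
# uniform on (0, 1) if the normal model fits well
U <- pnorm(returns,
           mean = fittedNormal$estimate["mean"],
           sd   = fittedNormal$estimate["sd"])

hist(U)  # should look roughly flat
```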

Reference

http://cran.r-project.org/web/packages/fitdistrplus/fitdistrplus.pdf  ~
 Page 15


-- 
*Jacob Varughese*


Re: [R] Proposed changes to the R Language

2018-12-29 Thread jacob bogers
It has some good ideas, but R (my personal assessment) is not built for
super speed, but for super ease of use.


On Fri, Dec 28, 2018 at 3:01 PM Angus C Hewitt (DHHS) <
angus.hew...@dhhs.vic.gov.au> wrote:

> Hi Team,
>
> Please advise if there are any plans in the pipeline for change the R
> language to "B"  as proposed in the below mentioned statistical series help
> at Auckland Uni.
>
> https://www.youtube.com/watch?v=88TftllIjaY
>
>
> Kind Regards,
>
> Angus Hewitt
> Senior Analyst | Decision Support
> System Design, Planning & Decision Making |  Health & Well Being
> Department of Health and Human Services | 19th floor, 50 Lonsdale Street,
> Melbourne Victoria 3000
> t. 9096 5859  | m. 0468 364 744 | e. angus.hew...@dhhs.vic.gov.au
>
>
>




[R] confirm family==binomial and link==logistic

2016-02-12 Thread Jacob Wegelin

To check that a regression object comes from logistic regression, I employ the 
following two lines:

stopifnot(glmObject$family$family=="binomial")

stopifnot(glmObject$family$link=="logit")

For instance:

toyfunction<-function(glmObject) {
stopifnot(inherits(glmObject, "glm"))
stopifnot(glmObject$family$family=="binomial")
stopifnot(glmObject$family$link=="logit")
cat("okay, I guess\n")
glmObject
}

mydata <- read.csv("http://www.ats.ucla.edu/stat/data/binary.csv")

someobject<- glm(admit~gre+gpa, data=mydata)

toyfunction(someobject)

someobject<- glm(admit~gre+gpa, data=mydata, family="binomial")

toyfunction(someobject)

But Doug Bates once stated that it's preferable to use extractor functions (and perhaps 
other ready-made functions?) rather than "deconstructing" an object (his term), 
as I do here.

Accordingly, is there a smarter way to perform the check that I perform inside 
toyfunction?
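One extractor-based alternative (a sketch; family() is the documented accessor for the error distribution and link of a fitted model, so no "deconstruction" of internals is needed):

```r
# check a fitted model via the family() extractor rather than
# reaching into the object's components directly
is_logistic <- function(object) {
  fam <- family(object)  # returns a "family" object for glm/lm fits
  identical(fam$family, "binomial") && identical(fam$link, "logit")
}

fit <- glm(am ~ mpg, data = mtcars, family = binomial)
is_logistic(fit)   # TRUE

fit2 <- lm(mpg ~ wt, data = mtcars)
is_logistic(fit2)  # FALSE: family(fit2) is gaussian with identity link
```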

Thanks for any insight

Jacob A. Wegelin
Assistant Professor
C. Kenneth and Dianne Wright Center for Clinical and Translational Research
Department of Biostatistics
Virginia Commonwealth University
830 E. Main St., Seventh Floor
P. O. Box 980032
Richmond VA 23298-0032
U.S.A. 
CTSA grant: UL1TR58
E-mail: jacobwege...@fastmail.fm 
URL: http://www.people.vcu.edu/~jwegelin




[R] extract minimal variables from model

2017-01-06 Thread Jacob Wegelin

Given any regression model, created for instance by lm, lme, lmer, or rqs, such 
as

z1<-lm(weight~poly(Time,2), data=ChickWeight)

I would like a general way to obtain only those variables used in the model. In the 
current example, this "minimal data frame" would consist of the "weight" and "Time" 
variables and none of the other columns of ChickWeight.

(Motivation: Sometimes the data frame contains thousands of variables which are 
not used in the current regression, and I do not want to keep copying and 
propagating them.)

The "model" component of the regression object doesn't serve this purpose:


head(z1$model)

  weight poly(Time, 2).1 poly(Time, 2).2
1     42    -0.066020938     0.072002235
2     51    -0.053701293     0.031099018
3     59    -0.041381647    -0.001334588
4     64    -0.029062001    -0.025298582
5     76    -0.016742356    -0.040792965
6     93    -0.004422710    -0.047817737

The following awkward workaround seems to do it when variable names contain only 
"word characters" as defined by regex:

minimalvariablesfrommodel20161120 <- function(object, originaldata){
    stopifnot(!missing(originaldata))
    stopifnot(!missing(object))
    intersect(
        unique(unlist(strsplit(format(object$call$formula),
                               split = "\\W", perl = TRUE)))
        , names(originaldata)
    )
}


minimalvariablesfrommodel20161120(z1, ChickWeight)
[1] "weight" "Time" 




But if a variable has a space in its name, my workaround fails:


ChickWeight$"dog tail"<-ChickWeight$Time
z1<-lm(weight~poly(`dog tail`,2), data=ChickWeight)
head(z1$model)

  weight poly(`dog tail`, 2).1 poly(`dog tail`, 2).2
1 42  -0.066020938   0.072002235
2 51  -0.053701293   0.031099018
3 59  -0.041381647  -0.001334588
4 64  -0.029062001  -0.025298582
5 76  -0.016742356  -0.040792965
6 93  -0.004422710  -0.047817737

minimalvariablesfrommodel20161120(z1, ChickWeight)

[1] "weight"




Is there a more elegant, and hence more reliable, approach?
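For what it's worth, a sketch of one candidate using base R's formula utilities instead of a regex (untested beyond this toy example):

```r
# all.vars() walks the formula's parse tree, so back-quoted
# (non-syntactic) names survive intact:
z1 <- lm(weight ~ poly(Time, 2), data = ChickWeight)
all.vars(formula(z1))
# [1] "weight" "Time"

# get_all_vars() goes one step further and returns the minimal
# data frame directly:
head(get_all_vars(formula(z1), ChickWeight))
```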

Thanks

Jacob A. Wegelin
Assistant Professor
C. Kenneth and Dianne Wright Center for Clinical and Translational Research
Department of Biostatistics
Virginia Commonwealth University
830 E. Main St., Seventh Floor
P. O. Box 980032
Richmond VA 23298-0032
U.S.A. 
URL: http://www.people.vcu.edu/~jwegelin




[R] [R-pkgs] Announcing caesar

2017-01-23 Thread Jacob Kaplan
Dear R users,

I am happy to announce that the package caesar is now on CRAN (
https://cran.r-project.org/web/packages/caesar/index.html).

The caesar package lets you encrypt and decrypt strings using the common
Caesar cipher or a more secure pseudorandom number generator method.

If you would like to know more please visit the repository on GitHub (
https://github.com/jacobkap/caesar).

Best,
Jacob



[R] Validating Minitab's "Expanded Gage R&R Study" using R and lme4

2017-02-15 Thread Matt Jacob
I'm trying to validate the results of an "Expanded Gage R&R Study" in
Minitab using R and lme4, but I can't get the numbers to match up in
certain situations. I can't tell whether my model is wrong, my data is
bad, or something else is going on.

For instance, here's some data for which the results don't match:

https://i.stack.imgur.com/5PCgm.png

After running the gage study, these are the results according to
Minitab:

                                 Study Var  %Study Var  %Tolerance
Source            StdDev (SD)     (6 * SD)       (%SV)  (SV/Toler)
Total Gage R&R        1.76277      10.5766      100.00       14.36
  Repeatability       0.00000       0.0000        0.00        0.00
  Reproducibility     1.76277      10.5766      100.00       14.36
    B                 0.00000       0.0000        0.00        0.00
    A*B               1.76277      10.5766      100.00       14.36
Part-To-Part          0.00000       0.0000        0.00        0.00
  A                   0.00000       0.0000        0.00        0.00
Total Variation       1.76277      10.5766      100.00       14.36

But when I mimic Minitab's results by parsing the output from lmer() and
doing the arithmetic in Excel, this is what I see:

https://i.stack.imgur.com/EGg9F.png

The raw output from lmer() was:

Linear mixed model fit by REML ['lmerMod']
Formula: y ~ 1 + (1 | A) + (1 | B) + (1 | A:B)
   Data: d

REML criterion at convergence: -100.1

Scaled residuals: 
   Min 1Q Median 3QMax 
-1.308e-07 -1.308e-07 -1.308e-07 -6.541e-08  1.308e-07 

Random effects:
 Groups   NameVariance  Std.Dev. 
 A:B  (Intercept) 1.333e+00 1.154e+00
 B(Intercept) 7.066e-04 2.658e-02
 A(Intercept) 2.260e-03 4.754e-02
 Residual 2.655e-14 1.629e-07
Number of obs: 8, groups:  A:B, 4; B, 2; A, 2

Fixed effects:
            Estimate Std. Error t value
(Intercept)    52.17       0.57   91.53
convergence code: 0
Model failed to converge with max|grad| = 0.422755 (tol = 0.002,
component 1)
Model is nearly unidentifiable: very large eigenvalue
 - Rescale variables?

And the R code that produced that output is:

library(lme4)
A <- factor(c(1, 1, 2, 2, 2, 1, 2, 1))
B <- factor(c(1, 2, 1, 2, 1, 2, 2, 1))
y <- c(51.356124843620798, 51.356124843620798, 54.8816618912481,
51.356124843620798, 54.8816618912481, 51.356124843620798,
51.356124843620798, 51.356124843620798)
d <- data.frame(y, A, B)
fm <- lmer(y ~ 1 + (1|A) + (1|B) + (1|A:B), d)
summary(fm)

For a different measurement with a different response, it's a completely
different situation! Given the following data:

https://i.stack.imgur.com/cH0bO.png

The resulting table from Minitab is:

                                 Study Var  %Study Var  %Tolerance
Source            StdDev (SD)     (6 * SD)       (%SV)  (SV/Toler)
Total Gage R&R       0.193649      1.16190       55.90        1.00
  Repeatability      0.093541      0.56125       27.00        0.48
  Reproducibility    0.169558      1.01735       48.95        0.88
    B                0.132288      0.79373       38.19        0.68
    A*B              0.106066      0.63640       30.62        0.55
Part-To-Part         0.287228      1.72337       82.92        1.49
  A                  0.287228      1.72337       82.92        1.49
Total Variation      0.346410      2.07846      100.00        1.79

And after plugging my R results into Excel, I get exactly the same
thing:

https://i.stack.imgur.com/jUEAP.png

Which was produced by this R code:

library(lme4)
A <- factor(c(1, 1, 2, 2, 2, 1, 2, 1))
B <- factor(c(1, 2, 1, 2, 1, 2, 2, 1))
y <- c(-49.4, -49.8, -50.1, -50.1, -50.0, -49.9, -50.2, -49.6)
d <- data.frame(y, A, B)
fm <- lmer(y ~ 1 + (1|A) + (1|B) + (1|A:B), d)
summary(fm)

That generated the following lmer() summary:

Linear mixed model fit by REML ['lmerMod']
Formula: y ~ 1 + (1 | A) + (1 | B) + (1 | A:B)
   Data: d

REML criterion at convergence: -3.8

Scaled residuals: 
Min  1Q  Median  3Q Max 
-0.7705 -0.6853 -0.1039  0.4379  1.4151 

Random effects:
 Groups   NameVariance Std.Dev.
 A:B  (Intercept) 0.01125  0.10607 
 B(Intercept) 0.01750  0.13229 
 A(Intercept) 0.08250  0.28723 
 Residual 0.00875  0.09354 
Number of obs: 8, groups:  A:B, 4; B, 2; A, 2

Fixed effects:
            Estimate Std. Error t value
(Intercept) -49.8875     0.2322  -214.9

Is the difference attributable to the warnings produced by lmer() about
the model failing to converge and being nearly unidentifiable? What
could Minitab be doing differently when the measurement data contains
only two distinct values?
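For reference, the derived columns in these tables are simple arithmetic on the REML standard deviations: Study Var = 6 * SD, %Study Var = 100 * SD / total SD, and %Tolerance = 100 * Study Var / tolerance width. A hedged sketch of that bookkeeping (gage_table and its tolerance argument are my own illustrative names, with the tolerance back-solved from the table above, not taken from Minitab output):

```r
library(lme4)

# Rebuild the gage-table columns from a fitted lmer model.
# `tolerance` is the full tolerance width used for %Tolerance.
gage_table <- function(fm, tolerance) {
  vc  <- as.data.frame(VarCorr(fm))   # columns grp, vcov, sdcor, ...
  tot <- sqrt(sum(vc$vcov))           # total variation SD
  data.frame(Source       = vc$grp,
             StdDev       = vc$sdcor,
             StudyVar     = 6 * vc$sdcor,   # Minitab's default 6-sigma width
             pctStudyVar  = 100 * vc$sdcor / tot,
             pctTolerance = 100 * 6 * vc$sdcor / tolerance)
}

A <- factor(c(1, 1, 2, 2, 2, 1, 2, 1))
B <- factor(c(1, 2, 1, 2, 1, 2, 2, 1))
y <- c(-49.4, -49.8, -50.1, -50.1, -50.0, -49.9, -50.2, -49.6)
fm <- lmer(y ~ 1 + (1 | A) + (1 | B) + (1 | A:B), data.frame(y, A, B))
gage_table(fm, tolerance = 116.19)
```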

Matt

This question is cross-posted to
http://stats.stackexchange.com/questions/262170/how-can-i-validate-minitabs-expanded-gage-rr-study-using-open-source-tools


[R] problem with extract raster with polygon

2017-03-20 Thread Jacob Strunk
Hello, I am having some trouble extracting pixels from a raster using
polygons. When I attempt to do so, pixels that are only partially
intersected by polygons are not included.

In the example below the number of pixels returned is less than the number
of pixels which can be seen intersecting polygons.

As an aside, I also encountered strange behavior in the example below:
after buffering the polygons by 10 units, the number of pixels returned
by extract was actually reduced.


Example
r <- raster(ncol=5, nrow=5)
r[] <- 1:ncell(r)

cds1 <- rbind(c(-180,-20), c(-160,5), c(-60, 0), c(-160,-70), c(-180,-20))
cds2 <- rbind(c(80,0), c(100,60), c(120,0), c(120,-55), c(80,0))
polys <- spPolygons(cds1, cds2)


plot(r)
plot(polys, add=TRUE)
plot(buffer(polys,10,dissolve=F), add=TRUE)
plot(buffer(polys,30,dissolve=F), add=TRUE)
plot(buffer(polys,100,dissolve=F), add=TRUE)

extract(r, polys)
extract(r, buffer(polys,10,dissolve=F))
extract(r, buffer(polys,30,dissolve=F))
extract(r, buffer(polys,100,dissolve=F))

my R output:
> extract(r, polys)
[[1]]
[1] 11 12 16

[[2]]
[1] 14  4 15 25

> extract(r, buffer(polys,10,dissolve=F))
[[1]]
[1] 11 12 16

[[2]]
[1] 14

> extract(r, buffer(polys,30,dissolve=F))
[[1]]
[1] 11 12 16 17 21

[[2]]
[1]  9 14 15 19 20 25

> extract(r, buffer(polys,100,dissolve=F))
[[1]]
 [1]  1  2  3  6  7  8 11 12 13 16 17 18 21 22 23

[[2]]
 [1]  3  4  5  8  9 10 13 14 15 18 19 20 24 25
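By default extract() keeps a cell only when its center falls inside the polygon, which would explain the missing edge pixels. A hedged sketch of a workaround using the package's documented weights and small arguments (not verified against raster 2.5-8 specifically):

```r
library(raster)
r <- raster(ncol = 5, nrow = 5)
r[] <- 1:ncell(r)
cds1 <- rbind(c(-180,-20), c(-160,5), c(-60,0), c(-160,-70), c(-180,-20))
polys <- spPolygons(cds1)

# weights = TRUE returns every cell the polygon touches, paired with the
# (approximate) fraction of the cell that is covered, instead of only
# cells whose centers are inside:
extract(r, polys, weights = TRUE)

# small = TRUE guarantees at least one cell even for polygons smaller
# than a single cell:
extract(r, polys, small = TRUE)
```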

> R.Version()
$platform
[1] "x86_64-w64-mingw32"

$arch
[1] "x86_64"

$os
[1] "mingw32"

$system
[1] "x86_64, mingw32"

$status
[1] ""

$major
[1] "3"

$minor
[1] "3.2"

$year
[1] "2016"

$month
[1] "10"

$day
[1] "31"

$`svn rev`
[1] "71607"

$language
[1] "R"

$version.string
[1] "R version 3.3.2 (2016-10-31)"

$nickname
[1] "Sincere Pumpkin Patch"



"Documentation for package ‘raster’ version 2.5-8" from Raster help pages


Thanks for any help

-- 
Jacob


[R] nested calls, variable scope

2008-07-17 Thread Jacob Wegelin


Below is an example of a problem I encounter repeatedly when I write functions. 
 A call works at the command line, but it does not work inside a function, even 
when I have made sure that all required variables are available within the 
function. The only way I know to solve it is to make the required variable 
global, which of course is dangerous. What is the elegant or appropriate way to 
solve this?


# The call to ns works at the command line:


junk<-ns(DAT$Age, knots=74)


# The call to lmer works at the command line, with the call to ns nested within 
it:


junk2<-lmer(formula=Finished ~ Sex01 + ns(DAT$Age, knots= 74 ) + MinCC + PzC + (1 | NAME 
) + (1 | Year), data=myDATonly, family="binomial")


But now I want to do this within a function such as the following:


myfn

function(myknot=74) {
library(lme4)
library(splines)
cat("myknot=", myknot, "\n")
myMER<-lmer(formula=Finished ~ Sex01 + ns(DAT$Age, knots= myknot) + MinCC + PzC + 
(1 | NAME ) + (1 | Year), data=myDATonly, family="binomial")
}


myfn(74)
myknot= 74 
Error in ns(DAT$Age, knots = myknot) : object "myknot" not found


# The bad way to do it: revise myfn to make myknot a global variable:

myfn

function(myknot=74) {
library(lme4)
library(splines)
cat("myknot=", myknot, "\n")
myknot<<-myknot
myMER<-lmer(formula=Finished ~ Sex01 + ns(DAT$Age, knots= myknot
) + MinCC + PzC + (1 | NAME ) + (1 | Year), data=myDATonly,
family="binomial")
}


z<-myfn(74)

myknot= 74

# Runs fine but puts a global variable into .GlobalEnv.
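One scoping-safe sketch (illustrated with lm, splines, and synthetic data rather than the original lmer model, since that data isn't shown): splice the knot value into the formula with bquote(), so the formula carries the value itself and no free variable needs to be found later.

```r
library(splines)

d <- data.frame(Age = 20:80)
d$y <- rnorm(nrow(d))

myfn <- function(myknot = 50) {
  # .(myknot) is replaced by its VALUE before the formula is built, so
  # re-evaluating the formula never needs access to this function's frame
  fml <- eval(bquote(y ~ ns(Age, knots = .(myknot))))
  lm(fml, data = d)
}

coef(myfn(50))   # intercept plus two spline coefficients
```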



sessionInfo()
R version 2.6.2 (2008-02-08) 
i386-pc-mingw32


locale:
LC_COLLATE=English_United States.1252;LC_CTYPE=English_United
States.1252;LC_MONETARY=English_United
States.1252;LC_NUMERIC=C;LC_TIME=English_United States.1252

attached base packages:
[1] splines   stats graphics  grDevices datasets  utils methods
base

other attached packages:
[1] lme4_0.999375-13  Matrix_0.999375-9 lattice_0.17-6

loaded via a namespace (and not attached):
[1] boot_1.2-32 grid_2.6.2 




Thanks for any insight.

Jacob A. Wegelin
[EMAIL PROTECTED] 
Assistant Professor

Department of Biostatistics
Virginia Commonwealth University
730 East Broad Street Room 3006
P. O. Box 980032
Richmond VA 23298-0032
U.S.A. 
http://www.people.vcu.edu/~jwegelin




[R] dput vs unclass to see what a factor really is composed of

2008-07-31 Thread Jacob Wegelin
I used read.dta() to read in a Stata 9 dataset to R. The "Sex01" variable
takes on two values in Stata: 0 and 1, and it is labeled "M" and "F"
respectively, analogous to an R factor. Thus, read.dta reads it in as a
factor.

Now, I wanted to see what this variable *really* is, in R. For instance,
sometimes R converts a 0/1 variable into a 1/2 variable when it considers it
a factor. The "dput" function often enables me to see what a variable really
is, but in this case it tells me that the components of the factor are "1L"
and "2L", that is, "one uppercase ell" and "two uppercase ell."  What does
the uppercase ell mean?

In this case, "unclass" seems to enable me to see what the variable really
consists of. But what do "1L" and "2L" mean?

> summary(DAT$Sex01[1:150])
  M   F
137  13
> dput(DAT$Sex01[1:150])
structure(c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 1L, 1L, 2L, 2L, 1L, 1L, 1L, 1L,
1L, 2L, 2L, 1L, 1L, 1L, 2L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L,
2L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 2L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L,
2L, 2L, 1L, 1L, 1L, 1L, 1L, 1L, 1L), .Label = c("M", "F"), class = "factor")

> unclass(junk)
  [1] 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
1 2 1 1 2 2 1 1 1 1 1 2 2 1 1 1 2 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
1 1 2 2 2 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 2 1 1 1 1 1 1 1 1 1 1 1 1 1 1
1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 2 2 2 1 1 1 1 1
[149] 1 1
attr(,"levels")
[1] "M" "F"

> str(junk)
 Factor w/ 2 levels "M","F": 1 1 1 1 1 1 1 1 1 1 ...
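A small illustration of what the "L" suffix denotes (the suffix marks an integer literal, as opposed to a double; a factor is stored as 1-based integer codes plus a levels attribute):

```r
# "L" marks an integer literal; a bare 1 is a double
typeof(1L)   # "integer"
typeof(1)    # "double"

# A factor is integer codes (1-based, in level order) plus a levels attribute
f <- factor(c("M", "F", "M"), levels = c("M", "F"))
typeof(f)          # "integer"
as.integer(f)      # 1 2 1
attr(f, "levels")  # "M" "F"
```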


Jake



Re: [R] nested calls, variable scope

2008-08-01 Thread Jacob Wegelin
Dear Bert:

Thank you for your reply. But the "eval.parent" function does not seem to
solve the problem. Below is the code. I have tried running it with

eval.parent(myknot)

as well as with integers from 0 through 4, as in

eval.parent(myknot, 4)

Each attempt produces the same error:

Error in eval(expr, p) : object "myknot" not found

I'm replying to the list in case someone can explain this.

Jake

fnJunk <-function(myknot=44 , myMER ) {
# Fit lmer (mer, lme4) models for WSER project, using spline (ns), and
# plot them for newdata.
   library(lme4)
   library(splines)

if(missing(myMER)){
   cat("myknot=", myknot)
   cat("fit another MER??")
   readline()
   myMER<-lmer(formula=Finished ~ Sex01 +
   ns(DAT$Age, knots= eval.parent(myknot, 0) ) +
   MinCC + PzC + (1 | NAME ) + (1 | Year), data=(DAT), family="binomial")
   }
# more stuff below here
}

On Thu, Jul 17, 2008 at 6:19 PM, Bert Gunter <[EMAIL PROTECTED]> wrote:

> As you may be aware, this is a case of FAQ for R 3.31, lexicographic
> scoping. It **can** be tricky, especially in a complex situation like
> yours.
> Typically eval() and/or substitute are what you need in such situations, to
> force evaluation in the appropriate environment. So try changing your
>
> ns(Dat$Age,knots=myknot)
>
> ## to
>
> ns(Dat$Age,knots=eval.parent(myknot))
>
> If that doesn't work, await wiser advice.
>
> Cheers,
>
> Bert Gunter
> Genentech
>
> -Original Message-
> From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]
> On
> Behalf Of Jacob Wegelin
> Sent: Thursday, July 17, 2008 1:58 PM
> To: [EMAIL PROTECTED]
> Subject: [R] nested calls, variable scope
>
> Below is an example of a problem I encounter repeatedly when I write
> functions.  A call works at the command line, but it does not work inside a
> function, even when I have made sure that all required variables are
> available within the function. The only way I know to solve it is to make
> the required variable global, which of course is dangerous. What is the
> elegant or appropriate way to solve this?
>
>
> # The call to ns works at the command line:
>
> > junk<-ns(DAT$Age, knots=74)
>
> # The call to lmer works at the command line, with the call to ns nested
> within it:
>
> > junk2<-lmer(formula=Finished ~ Sex01 + ns(DAT$Age, knots= 74 ) + MinCC +
> PzC + (1 | NAME ) + (1 | Year), data=myDATonly, family="binomial")
>
> But now I want to do this within a function such as the following:
>
> > myfn
> function(myknot=74) {
> library(lme4)
> library(splines)
> cat("myknot=", myknot, "\n")
> myMER<-lmer(formula=Finished ~ Sex01 + ns(DAT$Age, knots= myknot) +
> MinCC + PzC + (1 | NAME ) + (1 | Year), data=myDATonly, family="binomial")
> }
>
> > myfn(74)
> myknot= 74
> Error in ns(DAT$Age, knots = myknot) : object "myknot" not found
>
> # The bad way to do it: revise myfn to make myknot a global variable:
> > myfn
> function(myknot=74) {
> library(lme4)
> library(splines)
> cat("myknot=", myknot, "\n")
> myknot<<-myknot
> myMER<-lmer(formula=Finished ~ Sex01 + ns(DAT$Age, knots= myknot
> ) + MinCC + PzC + (1 | NAME ) + (1 | Year), data=myDATonly,
> family="binomial")
> }
>
> > z<-myfn(74)
> myknot= 74
>
> # Runs fine but puts a global variable into .GlobalEnv.
>
> > sessionInfo()
> R version 2.6.2 (2008-02-08)
> i386-pc-mingw32
>
> locale:
> LC_COLLATE=English_United States.1252;LC_CTYPE=English_United
> States.1252;LC_MONETARY=English_United
> States.1252;LC_NUMERIC=C;LC_TIME=English_United States.1252
>
> attached base packages:
> [1] splines   stats graphics  grDevices datasets  utils methods
> base
>
> other attached packages:
> [1] lme4_0.999375-13  Matrix_0.999375-9 lattice_0.17-6
>
> loaded via a namespace (and not attached):
> [1] boot_1.2-32 grid_2.6.2
> >
>
> Thanks for any insight.
>
> Jacob A. Wegelin
> [EMAIL PROTECTED]
> Assistant Professor
> Department of Biostatistics
> Virginia Commonwealth University
> 730 East Broad Street Room 3006
> P. O. Box 980032
> Richmond VA 23298-0032
> U.S.A.
> http://www.people.vcu.edu/~jwegelin
>
> [[alternative HTML version deleted]]
>
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>
>



[R] lattice: control size of axis title and axis labels

2009-08-08 Thread Jacob Wegelin


Was: Re: Adjusting x/y text labels for a bwplot using cex.lab

The purpose of this email is merely to explain how to control, separately,
two different text sizes on the axes in an R plot, in traditional or lattice
graphics. I did not find an explicit exposition of this, with examples, in a
quick hunt on the Web or in a couple books on R graphics. 

I want to control the size separately of

 (1) the title of the axis ("Important predictor" or "My outcome" or "X" or
"Y")

 (2) the numbers on the axis (or text in place of numbers).

In R, the word "label" is ambiguous. The term "axis label" (e.g., in
documentation of the "las" parameter) refers to the numbers on the axis,
whereas "xlab" refers to the title of the axis. 

In traditional R graphics, 

x<-rnorm(15); y<- x^2; plot(x,y, cex.axis=1, cex.lab=2, xlab="Important
predictor", ylab="My outcome")

makes "Important predictor" and "My outcome" big, whereas 

plot(x,y, cex.axis=2, cex.lab=1)

puts huge numbers on the axes. And one can control one axis at a time. For
instance, 

plot(x,y, axes=F, ylab=""); axis(1, cex.axis=2)

puts huge numbers on the x axis and does not mark the y axis at all.

In lattice:

xyplot(y~x, xlab=list(label="Important predictor", cex=2))

makes "Important predictor" big, whereas

xyplot(y~x, scales=list(x=list(cex=2)) )

makes the numbers on the x axis huge. And it does not have to be numbers
along the x axis. One can put text in place of numbers, as in:

xyplot(y~x, scales=list(x=list( cex=2 , at=(-1:1) , labels=c("little", "Oh",
"big"))) )

Jake Wegelin



S Ellison wrote:
> 
> You probably missed the bit in the lattice documentation which says that
> few if any of the standard par() parameters work on lattice. lattice
> uses its own system.
> 
> Look at the xyplot help page and seek out the scales argument. That
> tells you that scales is a list, optionally with x and y components,
> that controls the axis appearance. The bit you probably want is the cex
> component of that.
> 
> Using one of the simpler examples from ?xyplot, this looks like:
> xyplot(decrease ~ treatment, OrchardSprays, groups = rowpos,
>type = "a",
>auto.key = list(space = "right", points = FALSE, lines = TRUE),
>scales=list(x=list(cex=1.5))) #specifies bigger text on the
> x-axis
> 
> Steve E
> 
>>> mcobb_berkeley 07/21/09 10:10 PM >>>
> 
> Searched for this and found some help, but I still can't figure it out. 
> 
> I have trying to enlarge the x and y labels on my box plot.  I
> understand
> that you can do this using "cex.lab", but it does not seem to be working
> for
> me.  I must be adding it in the wrong spot.  Any help would be greatly
> appreciated.  Here is my code:
> 
> bwplot(hr~Herd, data=telemetry, notch=T, ylab="Home Range Area (ha)",
> xlab="Herd", 
> par.settings = list(plot.symbol = list(col = "black"),box.umbrella =
> list(col ="black"), box.rectangle=list(col="black")),fill="light blue")
> 
> ~McCrea
> -- 
> View this message in context:
> http://www.nabble.com/Adjusting-x-y-text-labels-for-a-bwplot-using-cex.lab-tp24595920p24595920.html
> Sent from the R help mailing list archive at Nabble.com.
> 
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
> 
> 
> ***
> This email and any attachments are confidential. Any use...{{dropped:8}}
> 
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
> 
> 

-- 
View this message in context: 
http://www.nabble.com/Adjusting-x-y-text-labels-for-a-bwplot-using-cex.lab-tp24595920p24877995.html
Sent from the R help mailing list archive at Nabble.com.



[R] symbols(x,y, circles=sqrt(N)) with lattice xyplot

2009-09-14 Thread Jacob Wegelin
How would I create the following plot using lattice?

symbols( combPsummary$pastRate, combPsummary$finRate,
circles=sqrt(combPsummary$N) )

The idea is to plot finRate vs pastRate using circles whose areas are
proportional to the number of people in each group.

The following attempt does not really work:

   xyplot(
  finRate ~ pastRate
  , data=combPsummary
  , panel=function( x, y, ...) {
 panel.xyplot( x,y, type="n" )
 symbols( x , y, circles=sqrt(combPsummary$N))
 }
  )

It apparently first draws a rectangle, i.e., the lattice panel, and
then inside that rectangle it draws the same thing that is drawn by
the symbols() command outside of a panel function. That is, it draws
another rectangle in addition to the desired circles, and it writes
its own axis labels.

I didn't find "symbols" in the index of Deepayan's book. And I don't
think the xyplot help file deals with this either? I don't think there
is a "lsymbols" or "panel.symbols" function?

(Bill Cleveland points out that the human eye does not interpret area
accurately, so that making the area of a circle proportional to sample
size may be a flawed approach. Thus I'd also be interested in any
comments on how one might best graphically display x, y, and sample
size in a single plot.)
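One hedged sketch (with synthetic data standing in for combPsummary, which isn't shown): stay inside lattice by passing a vectorized cex to xyplot and scaling it by sqrt(N), since the plotted radius is proportional to cex and the area is then proportional to N.

```r
library(lattice)

set.seed(1)
d <- data.frame(pastRate = runif(10),
                finRate  = runif(10),
                N        = sample(5:100, 10))

xyplot(finRate ~ pastRate, data = d,
       pch = 1,
       # radius ~ sqrt(N), so circle AREA ~ N; rescale to a sane cex range
       cex = 4 * sqrt(d$N) / max(sqrt(d$N)))
```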

Thanks for any tips

Jake Wegelin



[R] match.call to obtain the name of a function

2010-02-26 Thread Jacob Wegelin


Within a function I'd often like to obtain a text string equal to the name of 
the function.

One use for this: to generate a filename for use in pdf(). This lets me
keep track of which function generated a particular graphic.

match.call() puts parentheses at the end of the name. I don't want parentheses 
in a filename.

The following kludgey function gives the desired result.


JANK

function(x, y) {
   one<-deparse(match.call())
   functionName<-gsub("\\(.*", "", one, perl=T)
   cat("The name of this function is ", functionName, "\n")
}

JANK(55, pi^2)
The name of this function is  JANK 

JANK()

The name of this function is  JANK

Is there not a more direct way? To paraphrase Douglas Bates, the above approach is like 
the diner scene in the movie "Five Easy Pieces". You get an order of toast by 
first ordering a chicken sandwich and then telling the waitress to hold (that is, to 
subtract) the meat, lettuce, and mayonnaise.
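For what it's worth, one more direct candidate: the first element of the call object is the function name itself, so no regex is needed (a sketch, assuming the function is called by name rather than anonymously):

```r
JANK2 <- function(x, y) {
  # match.call()[[1]] is the symbol the function was called as
  functionName <- deparse(match.call()[[1]])
  cat("The name of this function is", functionName, "\n")
  invisible(functionName)
}

JANK2(55, pi^2)
# The name of this function is JANK2
```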

Thanks for any insights

Jacob Wegelin



[R] fisher.test gives p>1

2010-03-04 Thread Jacob Wegelin


The purpose of this email is to

(1) report an example where fisher.test returns p > 1

(2) ask if there is a reliable way to avoid p>1 with fisher.test.

If one has designed one's code to return an error when it finds a "nonsensical" 
probability, of course a value of p>1 can cause havoc.

Example:


junk<-data.frame(score=c(rep(0,14), rep(1,29), rep(2, 16)))
junk<-rbind(junk, junk)
junk$group<-c(rep("DOG", nrow(junk)/2), rep("kitty", nrow(junk)/2))
table(junk$score, junk$group)


DOG kitty
  0  1414
  1  2929
  2  1616

dput(fisher.test(junk$score, junk$group)$p.value)

1.12


dput(fisher.test(junk$score, junk$group, simulate.p.value=TRUE)$p.value)

1

In this particular case, specifying a simulated p value solved the problem. But is 
there a reliable way to avoid p>1 in general?
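One hedged explanation is floating-point accumulation: the two-sided p-value is a sum of many small table probabilities, and rounding can push the sum just above 1. A defensive wrapper (an assumption about the cause; it clamps the result rather than fixing fisher.test itself):

```r
safe_fisher_p <- function(...) {
  p <- fisher.test(...)$p.value
  min(p, 1)   # clamp floating-point overshoot back into [0, 1]
}

junk <- data.frame(score = c(rep(0, 14), rep(1, 29), rep(2, 16)))
junk <- rbind(junk, junk)
junk$group <- c(rep("DOG", nrow(junk) / 2), rep("kitty", nrow(junk) / 2))
safe_fisher_p(junk$score, junk$group)
```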


sessionInfo()
R version 2.10.1 (2009-12-14) 
x86_64-apple-darwin9.8.0


locale:
[1] en_US.UTF-8/en_US.UTF-8/C/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] stats graphics  grDevices utils datasets  methods   base

loaded via a namespace (and not attached):
[1] tools_2.10.1




Thanks for any insight

Jacob A. Wegelin
Assistant Professor
Department of Biostatistics
Virginia Commonwealth University
730 East Broad Street Room 3006
P. O. Box 980032
Richmond VA 23298-0032
U.S.A.



[R] obtain names of variables and data from glm object

2009-07-26 Thread Jacob Wegelin
Suppose we have some glm object such as:

myglm <- glm( y ~ x, data=DAT)

Is there an elegant way--or the "right way" within the R way of thinking--to
obtain the names of the response variable, the predictor variables, and the
dataset, as character strings?

For instance, suppose the "right way" was to use the (currently fictitious)
functions theresponse(), thepredictors(), and theDataSet().  Then I would be
able to write a function that obtains the names and subsequently pastes text
along the following lines:

theResponse <- theresponse( myglm )

theFirstPredictor <- thepredictors( myglm )[1]

theDataSet <- theDataSet(myglm)

title(main= paste(theResponse, " is the response and ", theFirstPredictor, "
is the only predictor")

In reality, I can of course extract

> formula(myglm)
y ~ x

but I see no elegant way to extract the names of the predictor and response
from this object. The deparse() function doesn't quite solve this problem:

> deparse(formula(myglm))
[1] "y ~ x"
> deparse(formula(myglm)[2])
[1] "y()"
> deparse(formula(myglm)[3])
[1] "x()"

Ideally the elegant method would, in this example, return the character
strings "x", "y", and "DAT".
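A sketch of the extractor-style pieces that come closest in base R (shown with a built-in dataset; hedged in that model$call only records whatever expression was passed as data=, which need not be a bare name):

```r
myglm <- glm(mpg ~ wt, data = mtcars)

all.vars(formula(myglm))      # "mpg" "wt"  (response first, then predictors)
deparse(formula(myglm)[[2]])  # "mpg" -- [[2]] (not [2]) indexes the parse tree
deparse(myglm$call$data)      # "mtcars"
```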

Thanks for any insights.

Jake

Jacob A. Wegelin
Assistant Professor
Department of Biostatistics
Virginia Commonwealth University
730 East Broad Street Room 3006
P. O. Box 980032
Richmond VA 23298-0032
U.S.A.
E-mail: jwege...@vcu.edu
URL: http://www.people.vcu.edu/~jwegelin



[R] session logging

2009-08-03 Thread Jacob Wegelin


Consider all the text that one sees on the console during an R session.

Is there a way, within R, to make all this text--both the "output" and the 
"messages"--automatically get copied to a single text file, in addition to seeing it on 
the console?

If I remember to save the console to a file at the end of my R session, that 
does it. But

(1) That requires pointing and clicking--can it be automated as a text command?

(2) It would be nice to issue the text command at the start of the R session, such as 
"log this entire session in mylog.txt, append", if this would ensure that the 
session is logged whether I remember to save the console or not.

As far as I can tell,

sink(file="mylog.txt")

will hide the output from me and put it into mylog.txt. But it still shows me 
the error messages.

An attempt to put the output and messages into separate files returns an error:


sink("junkout.txt", type="output")
sink("junkmsg.txt", type="message")

Error in sink("junkmsg.txt", type = "message") :
  'file' must be NULL or an already open connection

and at any rate I'd like both messages in the same file, just like on the 
console.

People who run R at the unix command line apparently use the unix command 
-script-. But I mean something that will work within R, platform-independent.

A 2003 post to R-help suggests savehistory(), but this does *not* save the 
console; I tried it just now. Another post from the same thread suggests using 
emacs. But that is not platform-independent.

The existence of the 2003 thread suggests that this issue comes up 
periodically. Was it a deliberate design decision not to make logs available, 
in contrast to the way logging works in Stata?

I use the Rgui on a MacBook Pro:


sessionInfo()
R version 2.8.1 (2008-12-22) 
i386-apple-darwin8.11.1


locale:
en_US.UTF-8/en_US.UTF-8/C/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] stats graphics  grDevices utils datasets  methods   base

other attached packages:
[1] lme4_0.999375-28   Matrix_0.999375-21 lattice_0.17-17
foreign_0.8-29

loaded via a namespace (and not attached):
[1] boot_1.2-34 grid_2.8.1


Thanks for any insights.

Jacob A. Wegelin
Assistant Professor
Department of Biostatistics
Virginia Commonwealth University
730 East Broad Street Room 3006
P. O. Box 980032
Richmond VA 23298-0032
U.S.A. 
E-mail: jwege...@vcu.edu 
URL: http://www.people.vcu.edu/~jwegelin





Re: [R] session logging

2009-08-03 Thread Jacob Wegelin
Maybe I'm missing something in the history help file. I read it and
subsequently tried

> savehistory("trythis.txt")

but the resulting file does not contain the text that scrolled past
the console. It contains only a few uninteresting timestamps. Is
there some more useful application of history() or savehistory()?

On Mon, Aug 3, 2009 at 4:09 PM, roger koenker  wrote:

> have you read
>
>?history
>
>
> url: www.econ.uiuc.edu/~roger    Roger Koenker
> email: rkoen...@uiuc.edu         Department of Economics
> vox: 217-333-4558                University of Illinois
> fax: 217-244-6678                Urbana, IL 61801
>
>
>
>
> On Aug 3, 2009, at 3:01 PM, Jacob Wegelin wrote:
>
>  Consider all the text that one sees on the console during an R session.
>>
>> Is there a way, within R, that all this text--i.e., both the "output" and
>> the "messages"--can be automatically copied to a single file, in addition
>> to
>> its being visible on the console?
>>
>> If I remember to save the console to a file at the end of my R session,
>> that
>> does it. But
>>
>> (1) That requires pointing and clicking--can it be automated as a text
>> command?
>>
>> (2) It would be nice to issue the text command at the start of the R
>> session, such as "log this entire session in mylog.txt, append", to ensure
>> that the session is logged whether I remember to save the console or not.
>>
>> As far as I can tell,
>>
>> sink(file="mylog.txt")
>>
>> will hide the output from me and put it into mylog.txt. But it still shows
>> me the error messages.
>>
>> An attempt to put the output and messages into separate files returns an
>> error:
>>
>>  sink("junkout.txt", type="output")
>>> sink("junkmsg.txt", type="message")
>>>
>> Error in sink("junkmsg.txt", type = "message") :
>>  'file' must be NULL or an already open connection
>>
>> and at any rate I'd like both messages in the same file, just like on the
>> console.
>>
>> People who run R at the unix command line apparently use the unix command
>> -script-. But I mean something that will work within R,
>> platform-independent.
>>
>> A 2003 post to R-help suggests savehistory(), but this does *not* save the
>> console; I tried it just now. Another post from the same thread suggests
>> using emacs. But that is not platform-independent.
>>
>> The existence of the 2003 thread suggests that this issue comes up
>> periodically. Was it a deliberate design decision not to make logs
>> available, in contrast to the way logging works in Stata?
>>
>> I use the Rgui on a MacBook Pro:
>>
>>  sessionInfo()
>>>
>> R version 2.8.1 (2008-12-22)
>> i386-apple-darwin8.11.1
>>
>> locale:
>> en_US.UTF-8/en_US.UTF-8/C/C/en_US.UTF-8/en_US.UTF-8
>>
>> attached base packages:
>> [1] stats graphics  grDevices utils datasets  methods   base
>>
>> other attached packages:
>> [1] lme4_0.999375-28   Matrix_0.999375-21 lattice_0.17-17
>> foreign_0.8-29
>>
>> loaded via a namespace (and not attached):
>> [1] boot_1.2-34 grid_2.8.1
>>
>>
>> Thanks for any insights.
>>
>> Jacob A. Wegelin
>> Assistant Professor
>> Department of Biostatistics
>> Virginia Commonwealth University
>> 730 East Broad Street Room 3006
>> P. O. Box 980032
>> Richmond VA 23298-0032
>> U.S.A.
>> E-mail: jwege...@vcu.edu
>> URL: http://www.people.vcu.edu/~jwegelin
>>
>>[[alternative HTML version deleted]]
>>
>>
>
>



Re: [R] session logging

2009-08-04 Thread Jacob Wegelin
Thanks. This seems to mirror the output *both* to the console *and* to the
sink file. The error messages, on the other hand, show up only in the
console.
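A sketch of the approach that emerges from this thread, taken one step further: sink(split = TRUE) mirrors output to console and file, and a second sink() on the same connection diverts messages to that file as well (the message stream cannot be split, so messages land only in the file).

```r
# Sketch: log both output (mirrored) and messages (file only) to one file.
con <- file("mylog.txt", open = "wt")
sink(con, split = TRUE)        # output: console AND file
sink(con, type = "message")    # messages: file only
print("visible on console and logged")
message("logged but not shown on console")
sink(type = "message")         # restore the message stream
sink()                         # restore the output stream
close(con)
```

Issued at the start of a session, this gives something close to Stata-style logging, with the caveat that messages vanish from the console while the message sink is active.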

On Tue, Aug 4, 2009 at 3:24 PM, Allan Engelhardt  wrote:

> sink(..., type=c("output","message"), split=TRUE) at the beginning of your
> session should do it?
>
> Jacob Wegelin wrote:
>
>>
>> Consider all the text that one sees on the console during an R session.
>>
>> Is there a way, within R, to make all this text--both the "output" and the
>> "messages"--automatically get copied to a single text file, in addition to
>> seeing it on the console?
>>
>> If I remember to save the console to a file at the end of my R session,
>> that does it. But
>>
>> (1) That requires pointing and clicking--can it be automated as a text
>> command?
>>
>> (2) It would be nice to issue the text command at the start of the R
>> session, such as "log this entire session in mylog.txt, append", if this
>> would ensure that the session is logged whether I remember to save the
>> console or not.
>>
>> As far as I can tell,
>>
>> sink(file="mylog.txt")
>>
>> will hide the output from me and put it into mylog.txt. But it still shows
>> me the error messages.
>>
>> An attempt to put the output and messages into separate files returns an
>> error:
>>
>>  sink("junkout.txt", type="output")
>>> sink("junkmsg.txt", type="message")
>>>
>> Error in sink("junkmsg.txt", type = "message") :
>>  'file' must be NULL or an already open connection
>>
>> and at any rate I'd like both messages in the same file, just like on the
>> console.
>>
>> People who run R at the unix command line apparently use the unix command
>> -script-. But I mean something that will work within R,
>> platform-independent.
>>
>> A 2003 post to R-help suggests savehistory(), but this does *not* save the
>> console; I tried it just now. Another post from the same thread suggests
>> using emacs. But that is not platform-independent.
>>
>> The existence of the 2003 thread suggests that this issue comes up
>> periodically. Was it a deliberate design decision not to make logs
>> available, in contrast to the way logging works in Stata?
>>
>> I use the Rgui on a MacBook Pro:
>>
>>  sessionInfo()
>>>
>> R version 2.8.1 (2008-12-22) i386-apple-darwin8.11.1
>>
>> locale:
>> en_US.UTF-8/en_US.UTF-8/C/C/en_US.UTF-8/en_US.UTF-8
>>
>> attached base packages:
>> [1] stats graphics  grDevices utils datasets  methods   base
>>
>> other attached packages:
>> [1] lme4_0.999375-28   Matrix_0.999375-21 lattice_0.17-17
>> foreign_0.8-29
>>
>> loaded via a namespace (and not attached):
>> [1] boot_1.2-34 grid_2.8.1
>>
>>
>> Thanks for any insights.
>>
>> Jacob A. Wegelin
>> Assistant Professor
>> Department of Biostatistics
>> Virginia Commonwealth University
>> 730 East Broad Street Room 3006
>> P. O. Box 980032
>> Richmond VA 23298-0032
>> U.S.A. E-mail: jwege...@vcu.edu URL: http://www.people.vcu.edu/~jwegelin
>>
>
>



[R] multiple lty on same panel in xyplot

2009-08-05 Thread Jacob Wegelin
I would like to use lattice graphics to plot multiple functions (or groups
or subpopulations) on the same plot region, using different line types "lty"
or colors "col" to distinguish the functions (or groups).

In traditional graphics, this seems straightforward: First plot all the data
using 'type="n"', and subsequently execute a series of "points" or "lines"
commands, one for each different group or function.

What is the elegant way to do this using xyplot?

To make this concrete, consider the following toy example:

k<- 10
x<- (1:k)/3
yM<-6 + x^2
yF<-12 + x^(1.5)
xNA<-x[length(x)]

# Insertion of NA row is necessary to prevent a meaningless line
# from being drawn from the females to the males across the entire plot.

DAT<-data.frame(
x=c(x, xNA, x)
,
y=c(yF, NA, yM)
,
sex=c( rep(0, k ), 0,  rep(1, k))
)

library("lattice")

xyplot(
y ~ x
, data=DAT
, type="l"
)

This draws both men and women in the same color and line type. Instead, I
want to specify different "col" and "lty" values for the two groups.

More generally, does a reference exist that explains this kind of thing,
with examples? I have not yet found an answer to this question in Paul
Murrell's book. Does Deepayan Sarkar's _Lattice_ go into that kind of
detail?

(I use lattice, not traditional graphics, because the plot will eventually
be conditioned on a third categorical variable.)

Thanks for any suggestions.

Jacob A. Wegelin
Assistant Professor
Department of Biostatistics
Virginia Commonwealth University
730 East Broad Street Room 3006
P. O. Box 980032
Richmond VA 23298-0032
U.S.A.
E-mail: jwege...@vcu.edu
URL: http://www.people.vcu.edu/~jwegelin



Re: [R] multiple lty on same panel in xyplot

2009-08-05 Thread Jacob Wegelin
On Wed, Aug 5, 2009 at 1:30 PM, Jacob Wegelin wrote:

> I would like to use lattice graphics to plot multiple functions (or groups
> or subpopulations) on the same plot region, using different line types "lty"
> or colors "col" to distinguish the functions (or groups).
>
> In traditional graphics, this seems straightforward: First plot all the
> data using 'type="n"', and subsequently execute a series of "points" or
> "lines" commands, one for each different group or function.
>
> What is the elegant way to do this using xyplot?
>

> To make this concrete, consider the following toy example:
>
> k<- 10
>  x<- (1:k)/3
> yM<-6 + x^2
> yF<-12 + x^(1.5)
>  xNA<-x[length(x)]
>
> # Insertion of NA row is necessary to prevent a meaningless line
> # from being drawn from the females to the males across the entire plot.
>
> DAT<-data.frame(
>  x=c(x, xNA, x)
> ,
> y=c(yF, NA, yM)
>  ,
> sex=c( rep(0, k ), 0,  rep(1, k))
> )
>
> library("lattice")
>

Solution (easier than I had imagined):

   myPanel<-function( x, y ) {
  panel.xyplot(x, y, type="n")
  llines( x[DAT$sex==0], y[DAT$sex==0], col="red", lty=1 )
  llines( x[DAT$sex==1], y[DAT$sex==1], col="blue", lty=2 )
   }

   xyplot(
  y ~ x
  , data=DAT
  , type="l"
  , panel=myPanel
   )

Jacob A. Wegelin
Assistant Professor
Department of Biostatistics
Virginia Commonwealth University
730 East Broad Street Room 3006
P. O. Box 980032
Richmond VA 23298-0032
U.S.A.
E-mail: jwege...@vcu.edu
URL: http://www.people.vcu.edu/~jwegelin



Re: [R] multiple lty on same panel in xyplot

2009-08-05 Thread Jacob Wegelin
On Wed, 5 Aug 2009, Deepayan Sarkar wrote:

> On 8/5/09, Jacob Wegelin  wrote:
>> I would like to use lattice graphics to plot multiple functions (or
groups
>>  or subpopulations) on the same plot region, using different line types
"lty"
>>  or colors "col" to distinguish the functions (or groups).
>>
>>  In traditional graphics, this seems straightforward: First plot all the
data
>>  using 'type="n"', and subsequently execute a series of "points" or
"lines"
>>  commands, one for each different group or function.
>>
>>  What is the elegant way to do this using xyplot?
>>
>>  To make this concrete, consider the following toy example:
>>
>>  k<- 10
>>  x<- (1:k)/3
>>  yM<-6 + x^2
>>  yF<-12 + x^(1.5)
>>  xNA<-x[length(x)]
>>
>>  # Insertion of NA row is necessary to prevent a meaningless line
>>  # from being drawn from the females to the males across the entire plot.
>>
>>  DAT<-data.frame(
>>  x=c(x, xNA, x)
>>  ,
>>  y=c(yF, NA, yM)
>>  ,
>>  sex=c( rep(0, k ), 0,  rep(1, k))
>>  )
>
> It's much simpler in lattice, and you don't need to play such tricks.
Option 1:
>
> xyplot(yM + yF ~ x, type = "l", auto.key = list(points = FALSE, lines =
TRUE))
>
> and if you want to control lty etc:
>
> xyplot(yM + yF ~ x, type = "l", auto.key = list(points = FALSE, lines =
TRUE),
>   par.settings = simpleTheme(lty = c(2, 3)))
>
>
> Option 2 (a bit more work, but less mysterious under the hood):
>
> DAT<-
>data.frame(x = c(x, x), y=c(yF, yM),
>   sex= rep(c("Female", "Male"), each = length(x)))
>
> xyplot(y ~ x, data = DAT, groups = sex, type = "l")

Dear Bill and Deepayan,

Thanks. This is helpful. Where can one find a thorough documentation of all
these features like par.settings, simpleTheme, the options for where to
place the legend or "key", auto.key, the different locations besides "top"
where one can place the "auto.key", etc.?  I don't think this is all clearly
laid out in the R help files or latticeLab.pdf. But using your hints I found
that the following worked:


xyplot(
y ~ x
, groups= ~ sex
, type="l"
, auto.key = list(columns=2, points = FALSE, lines = TRUE)
, par.settings = simpleTheme(lty = c(1, 2), col="black")
, data=DAT
)

Now, how would I use lattice tools to plot males with a line and females
with points--and still get an informative autokey?
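One possible sketch (rebuilding the toy data from earlier in the thread): panel.superpose's distribute.type argument pairs each element of type with one group level, so one group gets points and the other a line. The auto-generated key may still need manual tweaking to show the matching symbols.

```r
library(lattice)
# Sketch: females (first group level) as points, males as a line.
k <- 10
x <- (1:k)/3
DAT2 <- data.frame(x = c(x, x),
                   y = c(12 + x^1.5, 6 + x^2),
                   sex = rep(c("Female", "Male"), each = k))
p <- xyplot(y ~ x, data = DAT2, groups = sex,
            type = c("p", "l"), distribute.type = TRUE,
            auto.key = list(columns = 2))
# print(p) renders it on the current device
```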

Jacob A. Wegelin
Assistant Professor
Department of Biostatistics
Virginia Commonwealth University
730 East Broad Street Room 3006
P. O. Box 980032
Richmond VA 23298-0032
U.S.A.
E-mail: jwege...@vcu.edu
URL: http://www.people.vcu.edu/~jwegelin



[R] specify lattice black-and-white theme

2009-08-05 Thread Jacob Wegelin
Is there a simple way to specify a theme or trellis (lattice) parameters so
that, in a multipanel (conditioned) plot, there is no color and in the
strips there is no shading? This is the effect achieved on page 124 of
Deepayan Sarkar's "Lattice" (figure 7.2).
I managed to trick lattice into making a grayscale plot on my interactive
display as follows:

> graphics.off()
> postscript("junk.ps")
> mystuff<-trellis.par.get()
> graphics.off()
> trellis.par.set(mystuff)

Then when I made a multipanel (conditioned) plot, at least there was no
color, but the strips contained gray shading. Is this the "black and white
theme"? If one wants strips with white background (no shading), as in the
example cited above, must one go deeper into the details of the settings or
write one's own panel or strip function?
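A sketch of one way to get there without the device trick: standard.theme() returns a full settings list (here requested in black and white for the "pdf" device), and the strip shading lives in its strip.background component. Passing the list as par.settings applies it per plot; trellis.par.set() would apply it session-wide.

```r
library(lattice)
# Sketch: grayscale settings with unshaded (transparent) strips.
bw <- standard.theme("pdf", color = FALSE)  # black-and-white theme
bw$strip.background$col <- "transparent"    # no strip shading
p <- xyplot(yield ~ year | site, data = barley, par.settings = bw)
# trellis.par.set(bw) would instead make this the session default.
```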

Thanks for any info.

> sessionInfo()
R version 2.8.1 (2008-12-22)
i386-apple-darwin8.11.1

locale:
en_US.UTF-8/en_US.UTF-8/C/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] stats graphics  grDevices utils datasets  methods   base

other attached packages:
[1] lattice_0.17-17

loaded via a namespace (and not attached):
[1] Matrix_0.999375-21 boot_1.2-34grid_2.8.1
lme4_0.999375-28
>

Jacob A. Wegelin
Assistant Professor
Department of Biostatistics
Virginia Commonwealth University
730 East Broad Street Room 3006
P. O. Box 980032
Richmond VA 23298-0032
U.S.A.
E-mail: jwege...@vcu.edu
URL: http://www.people.vcu.edu/~jwegelin



[R] lattice: simultaneously control aspect & outer whitespace

2009-08-07 Thread Jacob Wegelin
Suppose we wish to achieve the following three aims:
(1) Control the aspect ratio of our plot (i.e., tweak this till it looks
great)

(2) Save the plot as a PDF with zero or minimal white space outside it.

(3) Preserve this in code, so that in the future the exact same plot can be
reproduced by simply sourcing the code.

I can almost achieve (1) and (2) on my MacBook Pro by pointing and clicking,
as follows:

   •   Start with graphics.off().

   •   Generate the plot  on the computer screen; then the outer margins are
pretty thin by default.

   •   Use the mouse to adjust the aspect ratio until it looks right.

The fact that I pointed and clicked precludes (3).

If, to avoid point-and-click, I attempt to control the aspect ratio with the
"aspect" argument to xyplot, then a large whitespace appears outside the
graphic in the plot on my computer screen. This whitespace is carried over
to the saved *.pdf file.

If, instead of first creating the graphic on the computer screen and then
saving it, I create it in a postscript graphics device,

   trellis.device(postscript, file="junk.ps")

then the right outer margin is zero. That is, the graphic ends exactly at
the rightmost border of the box around the plot. But the bottom, left, and
top contain extra whitespace outside the graphic region.

Specifying

   par(oma=rep(0,4))
   par(mar=rep(0,4))

after opening the Postscript device appears to have no effect
whatsoever on this.

What is the right approach? It would be nice to be able to specify both the
aspect ratio and the amount of whitespace outside the plot, and then let the
software compute the necessary dimensions of the graphic device that are
necessary to accommodate these specs.
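A sketch of one scripted workaround: fix aspect in the trellis object, shrink lattice's outer padding through par.settings (component names from lattice's layout.heights/layout.widths settings), and size the PDF device to roughly match the chosen aspect ratio. par(oma)/par(mar) have no effect because lattice is grid-based, not base-graphics-based.

```r
library(lattice)
# Sketch: controlled aspect plus minimal outer whitespace, reproducibly.
p <- xyplot(yield ~ year | site, data = barley, aspect = 0.6,
            par.settings = list(
              layout.heights = list(top.padding = 0, bottom.padding = 0),
              layout.widths  = list(left.padding = 0, right.padding = 0)))
pdf("myplot.pdf", width = 8, height = 8 * 0.6)  # device roughly matches aspect
print(p)
dev.off()
```

The height multiplier is an approximation; strips and axis labels consume some of the device, so a small amount of trial and error on the device dimensions may remain.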

Thanks for any ideas

Jacob A. Wegelin
Assistant Professor
Department of Biostatistics
Virginia Commonwealth University
730 East Broad Street Room 3006
P. O. Box 980032
Richmond VA 23298-0032
U.S.A.
E-mail: jwege...@vcu.edu
URL: http://www.people.vcu.edu/~jwegelin



[R] fixed portion of lme4 model in newdata

2010-04-15 Thread Jacob Wegelin


Suppose we have

•   An lmer{lme4} model, MyModel, computed on dataframe SomeDATA;

•   Dataframe NEWDATA, which contains the variables used in 
computing MyModel, but different values for these variables.

In NEWDATA I would like to compute (in an automated, quick, easy, 
non-error-prone manner) the fitted values of the fixed portion of the model. 
Also I want to compute the standard errors for these fits.

One use of this would be to display graphically the fitted portion of an lmer model so as 
to see the "effects" of selected variables. For instance, such a display would 
be useful for judging the *practical* significance of variables that have tested as 
*statistically* significant.

To this end, I would like to obtain (in an automated manner) the "design 
matrix" of the model for the NEWDATA, say NEWDATA_designMatrix.  Then the fits would 
be

NEWDATA_designMatrix %*% cbind(fixef(MyModel))

and the variance of the fits

NEWDATA_designMatrix %*% vcov(MyModel)  %*% t(NEWDATA_designMatrix)

Are not model.frame() and model.matrix() the way to obtain 
NEWDATA_designMatrix?  Below is an unsuccessful attempt to use these functions.

Can anyone explain how to obtain the desired result?

#   Simulate unbalanced longitudinal data which include a factor (categorical predictor).
set.seed(5)
obsPerID<-5
N_ID<-10
SomeDATA<-data.frame(
ID=rep( 1:N_ID, each=obsPerID)
, X=rnorm( obsPerID * N_ID )
, categ=factor(sample(LETTERS[1:3], size=obsPerID * N_ID, replace=T))
)
SomeDATA$Y <- SomeDATA$X + SomeDATA$ID + rnorm( obsPerID * N_ID ) / 2 + 
ifelse(SomeDATA$categ=="C", 5, 0)
SomeDATA<-SomeDATA[order(SomeDATA$ID, SomeDATA$X),]
print(summary(SomeDATA))
library(lattice)
print(
xyplot( Y ~ X | categ, groups=ID, data=SomeDATA, type="l"
, layout=c(3,1)
)
)
#	I would like to easily slap the fixed portion of the fits of a regression model on top of this plot. 
library(lme4)

MyModel<-lmer( Y ~ X + categ + (1|ID), data=SomeDATA)
print(MyModel)

#	Create an orderly NEWDATA. 
N_NEWDATA<-7

NEWDATA<-data.frame(
X=seq(-2, 2, length=N_NEWDATA)
, ID=2
, categ=factor(
sample(c("A", "C"), replace=T, size=N_NEWDATA)
, levels=levels(SomeDATA$categ))
, Y=0
)
NEWDATA$nuisance<-sample(1:3, replace=T, size=nrow(NEWDATA))
print(NEWDATA)
NEWDATA_frame<-model.frame( formula=formula(MyModel), data=NEWDATA)
print(NEWDATA_frame)
#   The following generates an error:
NEWDATA_matrix<-model.matrix( object=formula(MyModel), data=NEWDATA_frame)
#   The following does not conform to NEWDATA but instead reverts to SomeDATA:
NEWDATA_matrix<-model.matrix( object=MyModel, data=NEWDATA_frame)
print(NEWDATA_matrix)
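A sketch of one way around the error above (with a smaller simulated data set standing in for SomeDATA): build the design matrix from only the fixed-effects part of the formula, so model.matrix() never sees the (1|ID) term that it cannot handle.

```r
library(lme4)
# Sketch: tiny simulated data mirroring the post's setup.
set.seed(5)
d <- data.frame(ID = rep(1:10, each = 5), X = rnorm(50))
d$Y <- d$X + d$ID + rnorm(50) / 2
m <- lmer(Y ~ X + (1 | ID), data = d)

# Fixed-effects design matrix for new data, from the fixed formula only.
newd  <- data.frame(X = seq(-2, 2, length.out = 7))
X_new <- model.matrix(~ X, data = newd)        # fixed part of Y ~ X + (1|ID)
fit   <- drop(X_new %*% fixef(m))              # fitted fixed-portion values
se    <- sqrt(diag(X_new %*% as.matrix(vcov(m)) %*% t(X_new)))
```

Writing the fixed formula out by hand (here ~ X) is the assumption to watch: it must stay in sync with the model's fixed effects, including factor levels, for the columns of X_new to line up with fixef(m).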

Thanks for any insights

Jacob A. Wegelin
Assistant Professor
Department of Biostatistics
Virginia Commonwealth University
730 East Broad Street Room 3006
P. O. Box 980032
Richmond VA 23298-0032

APPENDIX


sessionInfo()
R version 2.10.1 (2009-12-14) 
i386-apple-darwin9.8.0


locale:
[1] en_US.UTF-8/en_US.UTF-8/C/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] stats graphics  grDevices utils datasets  methods   base

other attached packages:
[1] lme4_0.999375-32   Matrix_0.999375-33 lattice_0.17-26

loaded via a namespace (and not attached):
[1] grid_2.10.1  tools_2.10.1

END APPENDIX


[R] italics help in plot

2009-10-06 Thread Jacob Kasper
Part of my script reads:

speciesName <- names(data)[i]
plot(year,depth, xlab="Year",
ylab="Depth(m)",main=expression(italic(paste(speciesName))) )

Unfortunately, this just plots *speciesName *on my graph, not the name of
the species in italics. Any suggestions on how to resolve this?
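A sketch of the usual fix: expression() quotes its argument literally, whereas bquote() (or substitute()) evaluates anything wrapped in .() before building the plotmath expression.

```r
# Sketch: build an italic title from a character variable.
speciesName <- "Gadus morhua"          # hypothetical species name
lbl <- bquote(italic(.(speciesName)))  # .() substitutes the value
plot(1:10, main = lbl)
```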

Thank you
Jacob



[R] princomp data input format

2009-10-07 Thread Jacob Kasper
I am trying do to a PCA analysis using princomp. I get a result, but I
wonder if I have the data in the correct format.
My data contains many stations where fish were sampled as well as
environmental information for each station (lat, lon, depth, temp and year).
the format is like this:
species 1 | species 2 | species 3 | ... | lat | lon | depth | temp | year
1    31   32   42   70   200   2   1988
4    34   0    43   54   400   1   1988
200  65   4    42   70   200   2   1989
1    3    2    43   54   400   0   1989

with nearly 14,000 stations, each in its own row.

the result of running princomp(data, cor=T) is just a mess. How do I 'tell' R
what columns are species and what are environmental factors?
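A sketch of one common approach (column names here are assumptions): run the PCA on the species columns only, keeping the environmental columns aside for later interpretation, since princomp() has no way of knowing which columns are which.

```r
# Sketch with a toy stations data frame; the real data has ~14,000 rows.
set.seed(1)
stations <- data.frame(sp1 = rpois(20, 5), sp2 = rpois(20, 3),
                       sp3 = rpois(20, 8),
                       lat = runif(20, 42, 44), lon = runif(20, 54, 70),
                       depth = runif(20, 100, 400),
                       temp = runif(20, 0, 4), year = 1988:1989)
env_cols <- c("lat", "lon", "depth", "temp", "year")
pc <- princomp(stations[, setdiff(names(stations), env_cols)], cor = TRUE)
```

The environmental variables can then be correlated with the station scores (pc$scores) to interpret the components.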



[R] lattice auto.key drop unused levels

2009-10-10 Thread Jacob Wegelin
The following code produces a legend ("key") that mentions the unused
levels of Block.

library(MEMSS)
xyplot(yield~nitro, subset=(Block=="I" | Block=="II"), data=Oats,
group=Block, auto.key=T)

and adding "drop.unused.levels=T" does not fix it. And in fact even
the following does not solve the problem:

 xyplot(yield~nitro, data=Oats[Oats$Block=="I" | Oats$Block=="II",],
group=Block, auto.key=T)

The following workaround solves it, but seems inelegant:

junk<-Oats[Oats$Block=="I" | Oats$Block=="II",]
junk$Block<-factor(as.character(junk$Block))
xyplot(yield~nitro, group=Block, data=junk, auto.key=T)

What is the elegant or "proper R thinking" way to do this? That is, I
want to get a key that only mentions the levels of Block that are used
in the plot.
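For what it's worth, a sketch using droplevels(), which entered base R (2.12.0) after this thread and does exactly this level-tidying in one step:

```r
library(lattice)
library(MEMSS)   # for the Oats data
# Sketch: subset, then drop unused factor levels before plotting.
junk <- droplevels(subset(Oats, Block %in% c("I", "II")))
p <- xyplot(yield ~ nitro, groups = Block, data = junk, auto.key = TRUE)
```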

Thanks

Jacob A. Wegelin
Assistant Professor
Department of Biostatistics
Virginia Commonwealth University
730 East Broad Street Room 3006
P. O. Box 980032
Richmond VA 23298-0032
U.S.A.
E-mail: jwege...@vcu.edu
URL: http://www.people.vcu.edu/~jwegelin



Re: [R] lattice auto.key drop unused levels

2009-10-10 Thread Jacob Wegelin
On Sat, Oct 10, 2009 at 8:51 PM, Peter Ehlers  wrote:
> The key will show the levels of the 'groups' factor. So you will
> have to ensure that the factor fed to groups has the levels
> that you want displayed. ?xyplot explicitly states that
> drop.unused.levels will NOT do that for you.

So the answer seems to be that there is no way to drop the levels
without bypassing the "subset" argument.

The following creates the desired effect:

xyplot(yield~nitro, data=Oats[Oats$Block=="I" | Oats$Block=="II" , ] ,
group=Block [, drop=T], auto.key=T)

whereas dropping levels in the "group" argument does not create the
desired effect

xyplot(yield~nitro, subset=(Block=="I" | Block=="II"), data=Oats,
group=Block [, drop=T], auto.key=T)

Jacob Wegelin

>  -Peter Ehlers
>
> Jacob Wegelin wrote:
>>
>> The following code produces a legend ("key") that mentions the unused
>> levels of Block.
>>
>> library(MEMSS)
>> xyplot(yield~nitro, subset=(Block=="I" | Block=="II"), data=Oats,
>> group=Block, auto.key=T)
>>
>> and adding "drop.unused.levels=T" does not fix it. And in fact even
>> the following does not solve the problem:
>>
>>  xyplot(yield~nitro, data=Oats[Oats$Block=="I" | Oats$Block=="II",],
>> group=Block, auto.key=T)
>>
>> The following workaround solves it, but seems inelegant:
>>
>> junk<-Oats[Oats$Block=="I" | Oats$Block=="II",]
>> junk$Block<-factor(as.character(junk$Block))
>> xyplot(yield~nitro, group=Block, data=junk, auto.key=T)
>>
>> What is the elegant or "proper R thinking" way to do this? That is, I
>> want to get a key that only mentions the levels of Block that are used
>> in the plot.



[R] xyplot does not find variable in data

2009-10-12 Thread Jacob Wegelin
When we call a lattice function such as xyplot, to what extent does
the "data" designation cause the function to look inside the "data"
for variables?

In the examples below, the "subset" argument understands that
"Variety" is a variable in the data.

But the "scales" argument does not understand that "nitro" is a
variable in the data.

What principle is at work?


library(MEMSS)

# The following works fine:

xyplot(
yield ~ nitro
, data=Oats
, scales=list(
x=list(
at=unique(Oats$nitro)
)
)
, subset=Variety=="Victory"
)

# But the following returns an error:

xyplot(
yield ~ nitro
, data=Oats
, scales=list(
x=list(
at=unique(nitro)
    )
)
)

Thanks for any insight

Jacob A. Wegelin
Assistant Professor
Department of Biostatistics
Virginia Commonwealth University
730 East Broad Street Room 3006
P. O. Box 980032
Richmond VA 23298-0032
U.S.A.
E-mail: jwege...@vcu.edu
URL: http://www.people.vcu.edu/~jwegelin



[R] twoord.plot y lab size

2009-10-22 Thread Jacob Kasper
I am using twoord.plot and the tick labels on the left y-axis overlap. I tried
using cex.axis in my par command, but that only adjusted the x-axis labels, not
the y. cex.axis in twoord.plot did not help either. How do I adjust the y-axis
tick labels in twoord.plot?
example:
twoord.plot(lx=myear, ly=z, rx=myear, ry=sta, xlim=c(1985,2010),
    xlab="Year", ylab="Individuals", rylab="# Stations")

-- 
Jacob Kasper
http://twitter.com/Protect_Oceans
66°04' N
23°07' W
Coastal & Marine Management Master's Student
University Centre of the Westfjords

Sundstræti 1437 Devens Rd
Ísafjörður, 400   Swampscott, MA 01907
Iceland   USA
+354 456 0195




[R] variable labels to accompany data.frame

2009-10-28 Thread Jacob Wegelin


Often it is useful to keep a "codebook" to document the contents of a dataset.
(By "dataset" I mean a rectangular structure such as a dataframe.)

The codebook has as many rows as the dataset has columns (variables, fields).
The columns (fields) of the codebook may include:

•   variable name

•   type (character, factor, integer, etc.)

•   variable label (e.g., a variable called "bmi2" might be labeled
    "BMI hand-input by clinic personnel, must be checked")

•   permissible values

•   which values indicate missing (and potentially different kinds of missing)

Some statistics software (e.g., SPSS and Stata) provides at least a subset of
this kind of information automatically in a convenient form. For instance, in
Stata one can define a "label" for a variable and it is thenceforth linked to
the variable. In output from certain modeling and graphics functions, Stata by
default uses the label rather than the variable name.

Furthermore: in Stata, if "myvariable" is a labeled numeric (in R lingo, a
factor), and I type

codebook myvariable

then Stata tells me, among other things, the "levels" of myvariable.

Does a tool of this sort exist in R?

The prompt() function is related to this, but prompt(someDataFrame) creates a
text file on disk. The text file is associated with, but not unambiguously
linked to, someDataFrame.

The epicalc function codebook() provides a summary of a dataframe similar to
that created by summary() but easier to read. But this is not a way to define
and keep track of labels that are linked to variables.

To link a dataframe to its codebook, one could do the following "by hand":
Create a list, say, "somedata", where somedata$DATA is a dataframe that
contains the data, and somedata$VARIABLE is also a dataframe, but serves as
the codebook. For instance, the following function creates a template that
one could subsequently edit to insert variable labels and turn into
somedata$VARIABLE.

fnJunk <- function( THESEDATA ) {
    #  From a dataframe, make the start of a codebook.
    if(!is.data.frame(THESEDATA)) stop("!is.data.frame(THESEDATA)")
    data.frame(
        Variable = names(THESEDATA)
        , class = sapply(THESEDATA, function(x) class(x)[1])  # first class only, for multi-class columns
        , type = sapply(THESEDATA, typeof)
        , label = ""
        , comment = ""
        , stringsAsFactors = FALSE  # keep label and comment editable as character
    )
}


But the following automatic behavior would be nice:

•   We should be able to treat somedata exactly as we treat a dataframe, so
    that the fact that it possesses a "codebook" is merely an added benefit,
    not an interference with the usual tasks.

•   If we delete a column of somedata$DATA, the associated row of
    somedata$VARIABLE should be automatically deleted.

•   If we add a column to somedata$DATA, the associated row should be
    inserted in somedata$VARIABLE, and some of the fields automatically
    populated, such as variable name and type. It could get fancier. For
    instance:

•   If we try to add a value to a field in somedata$DATA which is not
    permitted by the "permissible values" listed for this field in
    somedata$VARIABLE, we get an error.

Has anyone already thought this through, maybe defined a class and associated
methods?
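An editorial sketch, not part of the original post, using only base R: store a label on each column as an attribute, then collect the labels into a codebook on demand. (Packages such as Hmisc take a similar approach with their label() function; set_label and get_label below are hypothetical helpers.)

```r
# Store a per-column label as an attribute and gather labels into a codebook.
set_label <- function(x, lab) { attr(x, "label") <- lab; x }
get_label <- function(x) { lab <- attr(x, "label"); if (is.null(lab)) "" else lab }

mydata <- data.frame(bmi2 = c(21.3, 30.1), site = c("A", "B"))
mydata$bmi2 <- set_label(mydata$bmi2,
    "BMI hand-input by clinic personnel, must be checked")

codebook <- data.frame(
    Variable = names(mydata)
    , class = sapply(mydata, function(x) class(x)[1])
    , label = sapply(mydata, get_label)
    , row.names = NULL
)
codebook
```

Column attributes survive $-assignment and column subsetting, though some operations (e.g., rbind or ifelse) can drop them, so this is a lightweight convention rather than a guaranteed link.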

Thanks

Jacob A. Wegelin
Assistant Professor
Department of Biostatistics
Virginia Commonwealth University
730 East Broad Street Room 3006
P. O. Box 980032
Richmond VA 23298-0032
U.S.A. 
E-mail: jwege...@vcu.edu 
URL: http://www.people.vcu.edu/~jwegelin


[R] superscript troubles

2009-11-02 Thread Jacob Kasper
I know that this has been revisited over and over, yet I cannot figure out
how to solve this case of superscript troubles...
I would like the 2 in r2 to be superscript, yet I am pasting text before and
after it. I have tried several variations but have not solved this yet, any
suggestions?

legend("topright", bty="n",
    paste("r2=", round(summary(lat_x)$r.squared, digits=3),
        ", P=", round(coefficients(summary(lat_x))[2,4], digits=3)))
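An editorial sketch, not part of the original post: build the legend text as a plotmath expression with bquote(), so the 2 renders as a true superscript. r2val and pval below stand in for the rounded values that would come from summary(lat_x).

```r
# Substitute numeric values into a plotmath call; == displays as "=".
r2val <- 0.123
pval  <- 0.045
lab <- bquote(r^2 == .(r2val) ~ "," ~ P == .(pval))
plot(1:10, 1:10)
legend("topright", bty = "n", legend = as.expression(lab))
```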
Thank you
Jacob

-- 
Jacob Kasper
http://twitter.com/Protect_Oceans
66°04' N
23°07' W
Coastal & Marine Management Master's Student
University Centre of the Westfjords

Sundstræti 1437 Devens Rd
Ísafjörður, 400   Swampscott, MA 01907
Iceland   USA




[R] loop through variable names

2009-11-11 Thread Jacob Wegelin


Often I perform the same task on a series of variables in a dataframe, by 
looping through a character vector that holds the names and using paste(), 
eval(), and parse() inside the loop.

For instance:

thesevars<-names(environmental)
environmental$ToyOutcome<-rnorm(nrow(environmental))

tableOfResults<-data.frame(var=thesevars)
tableOfResults$Beta<- NA
rownames(tableOfResults)<-thesevars

for( thisvar in thesevars) {
    thiscommand <- paste("thislm <- lm( ToyOutcome ~ ", thisvar, ", data=environmental)")
    eval(parse(text=thiscommand))
    tableOfResults[thisvar, "Beta"] <- coef(thislm)[thisvar]
}

print(tableOfResults)

Note that it's not always as simple a task as in this example. Within the loop, 
I might first figure out whether the variable is continuous or categorical, 
then perform an operation depending on its type--maybe lm() for continuous but 
wilcox.test() for dichotomous.

But the use of paste(), eval(), and parse() seems awkward.  Is there a more 
elegant way to approach this?
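An editorial sketch, not from the original thread: reformulate() builds the formula directly from character names, so paste()/eval()/parse() are unnecessary for this case. It reuses the post's setup (the environmental data from the lattice package).

```r
# Build "ToyOutcome ~ <thisvar>" programmatically with reformulate().
library(lattice)                       # provides the 'environmental' data
thesevars <- names(environmental)
set.seed(1)
environmental$ToyOutcome <- rnorm(nrow(environmental))

tableOfResults <- data.frame(var = thesevars, Beta = NA,
                             row.names = thesevars)
for (thisvar in thesevars) {
  thislm <- lm(reformulate(thisvar, response = "ToyOutcome"),
               data = environmental)
  tableOfResults[thisvar, "Beta"] <- coef(thislm)[thisvar]
}
print(tableOfResults)
```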

Thanks

Jacob A. Wegelin
Department of Biostatistics
Virginia Commonwealth University
Richmond VA 23298-0032
U.S.A.



[R] wilcox.test loop through variable names

2009-11-15 Thread Jacob Wegelin


Often I perform the same task on a series of variables in a dataframe,
by looping through a character vector that holds the names and using
paste(), eval(), and parse() inside the loop.

For instance:

rm(environmental)
thesevars<-names(environmental)
environmental$ToyReal <-rnorm(nrow(environmental)) 
environmental$ToyDichot<- environmental$ToyReal < 0.53


tableOfResults<-data.frame(var=thesevars)

tableOfResults$p_wilcox <- NA

tableOfResults$Beta_lm <- NA

rownames(tableOfResults)<-thesevars

for( thisvar in thesevars) {
    thiscommand <- paste("thiswilcox <- wilcox.test (", thisvar, " ~ ToyDichot , data=environmental)")
    eval(parse(text=thiscommand))
    tableOfResults[thisvar, "p_wilcox"] <- thiswilcox$p.value
    thislm <- lm( environmental[ c( "ToyReal", thisvar )])
    tableOfResults[thisvar, "Beta_lm"] <- coef(thislm)[thisvar]
}

print(tableOfResults)

Of course, the loop above is a toy example. In real life I might first figure
out whether the variable is continuous, dichotomous, or categorical taking on
several values, then perform an operation depending on its type.

The use of paste(), eval(), and parse() seems awkward.  As Gabor Grothendieck 
showed
(http://tolstoy.newcastle.edu.au/R/e8/help/09/11/4520.html), if we
are calling a regression function such as lm() we can avoid using
paste(), as shown above.

But is there a way to avoid paste() and eval() when one uses t.test()
or wilcox.test()?
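An editorial sketch, not from the original thread: wilcox.test() has a formula method, so reformulate() can build the formula here too, avoiding paste()/eval()/parse(). It reuses the post's setup with the environmental data.

```r
# "<v> ~ ToyDichot" built from character names; no string parsing needed.
library(lattice)                      # 'environmental' data, as in the post
set.seed(1)
thesevars <- names(environmental)
environmental$ToyDichot <- factor(rnorm(nrow(environmental)) < 0.53)

p_wilcox <- sapply(thesevars, function(v)
  wilcox.test(reformulate("ToyDichot", response = v),
              data = environmental)$p.value)
p_wilcox
```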

Thanks

Jacob A. Wegelin
Department of Biostatistics
Virginia Commonwealth University
Richmond VA 23298-0032
U.S.A.



Re: [R] wilcox.test loop through variable names

2009-11-15 Thread Jacob Wegelin
On Sun, 15 Nov 2009 14:33 -0500, "Jacob Wegelin"
 wrote:
> 
> Often I perform the same task on a series of variables in a dataframe,
> by looping through a character vector that holds the names and using
> paste(), eval(), and parse() inside the loop.
> 
> For instance:
> 
> rm(environmental)
> thesevars<-names(environmental)
> environmental$ToyReal <-rnorm(nrow(environmental)) 
> environmental$ToyDichot<- environmental$ToyReal < 0.53
> 
> tableOfResults<-data.frame(var=thesevars)
> 
> tableOfResults$p_wilcox <- NA
> 
> tableOfResults$Beta_lm <- NA
> 
> rownames(tableOfResults)<-thesevars
> 
> for( thisvar in thesevars) {
>   thiscommand<- paste("thiswilcox <- wilcox.test (", thisvar, " ~ 
> ToyDichot , data=environmental)")
>   eval(parse(text=thiscommand))
>   tableOfResults[thisvar, "p_wilcox"] <- thiswilcox$p.value
>   thislm<-lm( environmental[ c( "ToyReal", thisvar )])
>   tableOfResults[thisvar, "Beta_lm"] <- coef(thislm)[thisvar]
> }
> 
> print(tableOfResults)
> 
> Of course, the loop above is a toy example. In real life I might first
> figure out whether the variable is
> continuous, dichotomous, or categorical taking on several values, then
> perform an operation depending on
> its type.
> 
> The use of paste(), eval(), and parse() seems awkward.  As Gabor
> Grothendieck showed
> (http://tolstoy.newcastle.edu.au/R/e8/help/09/11/4520.html), if we
> are calling a regression function such as lm() we can avoid using
> paste(), as shown above.
> 
> But is there a way to avoid paste() and eval() when one uses t.test()
> or wilcox.test()?

Here is a solution:

rm(environmental)
thesevars<-names(environmental)
environmental$ToyReal <-rnorm(nrow(environmental))
environmental$ToyDichot<- environmental$ToyReal < 0.53

ThisList <- lapply( environmental[thesevars], function( OneVar ) {
    c(
        p_wilcox = wilcox.test( OneVar ~ environmental$ToyDichot )$p.value
        , Beta_lm = as.numeric(coef(lm( environmental$ToyReal ~ OneVar ))["OneVar"])
    )
})

do.call("rbind", ThisList)

Jacob A. Wegelin 
Department of Biostatistics 
Virginia Commonwealth University 
Richmond VA 23298-0032 
U.S.A.



[R] facet_grid problem: don't understand error message

2009-05-26 Thread Etches Jacob
The following is a plot that used to work (last December, I think with  
a patched version of ggplot2 from Hadley), but has stopped working  
with recent updates.  Other instances of facet_grid work and the data  
look normal to me.  Since I have no idea what caused the error  
message, I'm having trouble producing a reproducible example or  
solving the problem.  Has anyone seen this before?

Many thanks,
Jacob Etches


> p <- qplot(year, prop.excl,
    data=subset(excl, !is.na(ftf) & exclusion %in%
        c("disab_inc_rec","family_death","fs_ch","ft2_ch","retir_inc_rec")),
    geom=c("line"), colour=Exclusion,
    ylab="Proportion of persons per exposure year",
    xlab="Start year of exposure window")
> p + facet_grid(ftf + Sex ~ Age)
Error in rbind(c("spacer", "strip_h", "strip_h", "strip_h", "strip_h", :
    number of columns of matrices must match (see arg 2)
In addition: Warning message:
In rbind(list(1, 1, 1, 1, 1, 1), list(0.77705, 1, 1, 1, 1, 0.5450417, :
    number of columns of result is not a multiple of vector length (arg 1)






[R] course in ecological data analysis

2009-04-03 Thread Capelle, Jacob
Dear all,
 
For my PhD study I'm looking for relevant short-term courses/workshops
in ecological data analysis with R in Europe. After 2 days of searching I'm
convinced that google is probably not the right medium to find this
information. If anyone can help me I will be most grateful.
Best regards - J. Capelle




[R] xyplot 3 panels 3 different Y variables

2010-02-04 Thread Jacob Wegelin


Often, when exploring a dataset, I'd like to plot several very different Y 
variables against the same X variable, in panels stacked one over the other. Is 
there an easy way to do this?

I'd like to achieve an elegant look similar to the look achieved by lattice in 
conditioned plots--for instance no space between panels. But unlike in 
straightforward conditioned plot, each panel may be on a different scale.

Example.

•   Plot Estrogen, Creatinine, and their ratio, all by the same predictor
    variable (say, Day).

    Or: In a longitudinal study of hormones in reproductive-age women, plot
    progesterone, estradiol, testosterone, luteinizing hormone,
    follicle-stimulating hormone, and thyroid-stimulating hormone all on one
    page, parallel. Note that several of these variables are measured in
    different units.

•   One panel for each outcome variable, arranged one above the other.

•   Minimal vertical space between the panels.

To make this concrete, let's generate toy data:

N<-40
set.seed(5234767)
JUNK<-data.frame(Day=1:N)
JUNK$Creatinine<-exp(2*rnorm(nrow(JUNK)))
JUNK$Estrogen<- (sin(JUNK$Day/pi) + 2.5) * ( exp(2*rnorm(nrow(JUNK))) * JUNK$Creatinine )
JUNK$Creatinine[10]<-0.0001
JUNK$Ratio<- JUNK$Estrogen / JUNK$Creatinine

The following traditional graphics commands put an annoying wide space between
the "panels" by default. Also, the X ticks are repeated unnecessarily.

par(mfrow=c(3,1))
par(oma=c(0,0,1,0))
plot(JUNK$Day, JUNK$Estrogen, xlab="", ylab="Estrogen", type="o")
plot(JUNK$Day, JUNK$Creatinine, xlab="", ylab="Creatinine", type="o")
plot(JUNK$Day, JUNK$Ratio, xlab="Day", ylab="Ratio", type="o")
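An editorial sketch, not from the original post: in traditional graphics the wide gaps come from each panel's own top and bottom margins. Shrinking mar, moving the shared x label to the outer margin, and suppressing the inner x axes removes both the gaps and the repeated ticks. The toy data are abbreviated from the post's setup.

```r
# Abbreviated toy data, as in the post.
set.seed(5234767)
JUNK <- data.frame(Day = 1:40)
JUNK$Creatinine <- exp(2 * rnorm(40))
JUNK$Estrogen   <- (sin(JUNK$Day / pi) + 2.5) * exp(2 * rnorm(40)) * JUNK$Creatinine
JUNK$Ratio      <- JUNK$Estrogen / JUNK$Creatinine

# Zero inner top/bottom margins; shared x label goes in the outer margin.
par(mfrow = c(3, 1), mar = c(0, 4, 0, 2) + 0.1, oma = c(4, 0, 1, 0))
plot(JUNK$Day, JUNK$Estrogen,   xaxt = "n", xlab = "", ylab = "Estrogen",   type = "o")
plot(JUNK$Day, JUNK$Creatinine, xaxt = "n", xlab = "", ylab = "Creatinine", type = "o")
plot(JUNK$Day, JUNK$Ratio,                  xlab = "", ylab = "Ratio",      type = "o")
mtext("Day", side = 1, line = 2.5, outer = TRUE)
```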

The following lattice approach gives a kinda nice-looking end product, but
seems so counterintuitive that I want to call it a workaround. For instance,
it generates a "time" variable that actually is a category. And the variable
names are converted into the levels of a factor by hand, so that the process
is susceptible to human error.

Also, the Y variables are not labeled on the axes, only in the strip. This is
not ideal.

JUNK2<-JUNK
names(JUNK2)[names(JUNK2)=="Creatinine"]<-"Y.1"
names(JUNK2)[names(JUNK2)=="Estrogen"]<-"Y.2"
names(JUNK2)[names(JUNK2)=="Ratio"]<-"Y.3"
JUNKlong<- reshape(JUNK2, dir="long", varying=2:4)
JUNKlong$outcome<-factor( JUNKlong$time, levels=1:3, labels=c("Creatinine", "Estrogen", "Ratio") )
JUNKlong$time<-NULL
library(lattice)
xyplot( Y ~ Day | outcome, data=JUNKlong, layout=c(1,3), type="o",
    scales=list(x=list(alternating=3), y=list(relation="free", alternating=3)), ylab="")

Am I making this harder than it needs to be?

Thanks

Jacob A. Wegelin
Assistant Professor
Department of Biostatistics
Virginia Commonwealth University
730 East Broad Street Room 3006
P. O. Box 980032
Richmond VA 23298-0032
U.S.A.


[R] parallel xyplot, different y scales, conditioned

2010-02-05 Thread Jacob Wegelin


The syntax below creates parallel time-series plots of three different y 
variables conditioned by a dichotomous factor.  (Thanks to several people who 
answered an earlier post from Thursday, 
http://tolstoy.newcastle.edu.au/R/e9/help/10/02/3848.html. The syntax below is 
based on their helpful replies.)

The y variables differ wildly in their scales, and the specification

scales$y$relation="free"

enables each y variable to be plotted on its own distinct scale.  Unfortunately, this also causes 
the scales of the *same* y variable to differ between the two levels of the dichotomous factor. 
Thus one does not easily see (for instance) that QQQ takes on much smaller values for 
"feline" than for "Dog".

Is there a way to lock the levels of the factor (the columns) into the same 
scale for each y variable (for each row), while allowing the scales to differ 
between the y variables (between the rows)?

#   Toy data to illustrate:

N<-15
w <- (1:N)/N
x <- w^2
QQQ <- exp(w)
z <- x / QQQ
JUNK<-data.frame( x=x, QQQ=QQQ, z=z, w=w)
JUNK$ID<-1
jank<-JUNK
jank$ID<-2
jank$x<-jank$x / 2
jank$QQQ<-jank$QQQ / 2
jank$z<-jank$x/jank$QQQ
JUNK<-rbind(JUNK, jank)
jank<-JUNK
jank$x<-(jank$x) ^(1/4)
jank$QQQ<-(jank$QQQ) / 10
jank$z <- jank$x / jank$QQQ
JUNK$Species<-"Dog"
jank$Species<-"feline"
JUNK<-rbind(JUNK, jank)
JUNK$Species<-factor(JUNK$Species)
JUNK$ID<-factor(JUNK$ID)
summary(JUNK)


#   The three commands below differ only in the value of scales$y$relation.

#	scales$y$relation="same"
#	forces x, QQQ, and z to the same scale, which obscures signal.
#   But at least it enables us to see that the range of QQQ
#   differs immensely between Dog and feline.
xyplot ( x + QQQ + z ~ w | Species
, group=ID
, data=JUNK
, ylab=c("x (mg/L)", "QQQ (pg/L)", "z (mIU/mL)")
, xlab=c("Dog", "feline")
, type="o"
, strip= TRUE
, outer=TRUE
, layout=c(2,3)
, scales=list(
x=list( alternating=3)
, y=list(
relation="same"
, alternating=3
, rot=0
, log=T
)
)
)


#	scales$y$relation="free"
#	displays x, QQQ, and z on different scales, but it also allows
#   the scales for each variable to differ between Dog and feline.
#   This prevents us from visually comparing the species.
xyplot ( x + QQQ + z ~ w | Species
, group=ID
, data=JUNK
, ylab=c("x (mg/L)", "QQQ (pg/L)", "z (mIU/mL)")
, xlab=c("Dog", "feline")
, type="o"
, strip= TRUE
, outer=TRUE
, layout=c(2,3)
, scales=list(
x=list( alternating=3)
, y=list(
relation="free"
, alternating=3
, rot=0
, log=T
)
)
)


#	scales$y$relation="sliced"
#	shows us that the difference max(z)-min(z) differs greatly between
#   Dog and feline.  But it obscures the fact that
#   QQQ differs wildly between Dog and feline, as we saw when
#   relation="same".
xyplot ( x + QQQ + z ~ w | Species
, group=ID
, data=JUNK
, ylab=c("x (mg/L)", "QQQ (pg/L)", "z (mIU/mL)")
, xlab=c("Dog", "feline")
, type="o"
, strip= TRUE
, outer=TRUE
, layout=c(2,3)
, scales=list(
x=list( alternating=3)
, y=list(
relation="sliced"
, alternating=3
, rot=0
, log=T
)
)
)
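An editorial sketch, not part of the original post: with relation="free", xyplot() also accepts ylim as a list with one entry per panel, so each row (each y variable) can be pinned to a common range while the rows still differ. The panel order is assumed to run Species fastest within each y variable; the toy data below are a condensed stand-in for JUNK.

```r
library(lattice)

# Condensed toy data: two species, three y variables on wildly different scales.
JUNK <- data.frame(w = rep((1:15) / 15, 4),
                   Species = rep(c("Dog", "feline"), each = 30))
JUNK$x   <- JUNK$w^2    * ifelse(JUNK$Species == "Dog", 1, 0.1)
JUNK$QQQ <- exp(JUNK$w) * ifelse(JUNK$Species == "Dog", 1, 0.05)
JUNK$z   <- JUNK$x / JUNK$QQQ

# One common range per y variable, repeated for each species column.
rng <- function(v) range(JUNK[[v]])
ylims <- rep(list(rng("x"), rng("QQQ"), rng("z")), each = 2)

p <- xyplot(x + QQQ + z ~ w | Species, data = JUNK, outer = TRUE,
            layout = c(2, 3), type = "o",
            scales = list(y = list(relation = "free")),
            ylim = ylims)
```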


#   In comparing these plots, we also see that, of necessity, it is only
#   with relation="same" that we can avoid repeating the y axis ticks for
#   the second column of panels. With relation="same", xyplot
#   automatically gets rid of that repetition.  And we see an example of
#   the fact that scales$alternating=3 only works when relation="same".

Thanks for any insight

Jacob A. Wegelin
Assistant Professor
Department of Biostatistics
Virginia Commonwealth University
730 East Broad Street Room 3006
P. O. Box 980032
Richmond VA 23298-0032
U.S.A.



[R] conditioned xyplot, many y variables

2010-02-07 Thread Jacob Wegelin
#   QQQ differs wildly between Dog and feline, as we saw when
#   relation="same".
xyplot ( ppp + QQQ + z ~ TIME | Species
, group=ID
, data=JUNK
, ylab=c("ppp (mg/L)", "QQQ (pg/L)", "z (mIU/mL)")
, xlab=c("Dog", "feline")
, type="o"
, strip= FALSE
, outer=TRUE
, layout=c(2,3)
, scales=list(
x=list( alternating=3)
, y=list(
relation="sliced"
, alternating=3
, rot=0
, log=T
)
)
)

Comments:

The traditional graphics solution requires many lines of custom code.  Also, in traditional graphics I was 
unable to get the "ylab" labels to read parallel to the axis at the same time that the tick labels 
read perpendicular to the axis. The lattice package achieves this with "rot=0". In traditional 
graphics, "las" apparently governs the mtext (outer label) as well as the axis tick labels.

Thanks for any insights

Jacob A. Wegelin
Assistant Professor
Department of Biostatistics
Virginia Commonwealth University
730 East Broad Street Room 3006
P. O. Box 980032
Richmond VA 23298-0032
U.S.A.


[R] [R-pkgs] {peruse}

2021-03-10 Thread Jacob Goldsmith
Hello all,

R package {peruse 0.3.1} is on CRAN. {peruse} provides simple tools for 
sequence iteration and set comprehension. It is especially useful for analyzing 
recursively generated sequences, including any stochastic process. The main 
page at https://jacgoldsm.github.io/peruse/ provides in-depth examples and 
extensive explanation of how to use the tool.

CRAN link:
https://CRAN.R-project.org/package=peruse


Best,
Jacob Goldsmith





___
R-packages mailing list
r-packa...@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-packages



Re: [R] Seeking implementation of my algorithm 'spdspds' - a novel algorithm for solving Linear Programming Problems with O(L^1.5) computational complexity

2020-06-11 Thread jacob bogers
" not having any access to the resources "
You have internet access (as proven by sending this email), and these days
you can even run an R CLI on mobile (I do, anyway).
Learn R, write the code, and let me know how it turns out.



On Wed, Jun 10, 2020 at 8:46 PM Keshava PRASADa Halemane <
k.prasa...@gmail.com> wrote:

> Friends:
> i am a retired Professor -
> not having any access to the resources (human/financial/business/whatever)
> that may be required -
> therefore i am seeking implementation of my algorithm 'spdspds' -
> a novel algorithm for solving Linear Programming Problems with O(L^1.5)
> computational complexity -
> in order to show/convince the esteemed world optimization community
> that it is indeed a great grand breakthrough in terms of achievement of the
> linear programming performance challenge of the millennium -
> with far reaching deep impact on optimization algorithm development in
> general -
> holy grail fantasy realized!
>
> I need some individual or team who is interested & willing to work on this.
> Earlier experience in implementation of optimization/LP algorithms will
> greatly help.
>
> You may access / download / read my paper -
> "Unbelievable *O*(*L*^1.5) worst case computational complexity achieved by
> spdspds algorithm for linear programming problem"
> which is available at - arxiv . org / abs / 1405 . 6902
>
> Thanks a lot.
>  - Dr(Prof) Keshava Prasad Halemane
>
>


Re: [R] scope, lme, ns, nlme, splines

2012-12-19 Thread Jacob Wegelin


Your solution works beautifully for predict.lme in the same data that were
used to compute the model. But what if I want to compute the population
fitted values in newdata? In the expanded example below, predict.lm is able
to find the object called "fixed", but predict.lme is unable to find it and
returns an error.

library("nlme")
library(splines)
library(ggplot2)
set.seed(5)
junk<-data.frame(x=rnorm(100))
junk$y<-junk$x + rnorm(nrow(junk))
junk$idvar<- factor(sample(LETTERS[1:10], size=nrow(junk), replace=TRUE))
PRETTYNEWDATA<-data.frame(x=seq(-2, 2, length=20))

fitfunction<-function(splineDF=1) {
    fixed <- eval(
        substitute(
            expr= y ~ ns(x, df=splineDF)
            , env= list(splineDF=splineDF)
        )
    )
    thislm<- lm( fixed , data= junk)
    junk$fitlm<-predict(thislm)
    thislme<-lme( fixed= fixed , random= ~ 1 | idvar , data= junk)
    junk$fitlme<-predict(thislme, level=0)
    print(
        ggplot( junk, aes(x, y))
        + geom_point(shape=1)
        + geom_line( aes(x, fitlme), color="red", lwd=2)
        + geom_line( aes(x, fitlm), color="blue", lty=3)
        + ggtitle(deparse(fixed))
    )
    PRETTYNEWDATA$fitlm<-predict(thislm, newdata=PRETTYNEWDATA)
    print(head(PRETTYNEWDATA))
    cat("about to use newdata with lme \n")
    PRETTYNEWDATA$fitlme<-predict(thislme, level=0, newdata=PRETTYNEWDATA)
    thislme
}

Jake Wegelin

On Thu, 6 Dec 2012, Duncan Murdoch wrote:


On 06/12/2012 2:28 PM, Jacob Wegelin wrote:

I want to fit a series of lme() regression models that differ only in the
degrees of freedom of a ns() spline. I want to use a wrapper function to do
this. The models will be of the form

y ~ ns(x, df=splineDF)

where splineDF is passed as an argument to a wrapper function.

This works fine if the regression function is lm(). But with lme(),
I get an error.  fitfunction() below demonstrates this.

Why?


Presumably this is a bug in nlme.  Formulas have associated environments.  In 
your example, that would be the local frame of the fitfunction, and splineDF 
lives there.  It looks as though lme is not looking in the formula 
environment for the splineDF value.


Another workaround besides the one you found is to use substitute() to put 
the value of splineDF into the formula, i.e.


fitfunction<-function(splineDF=1) {
 junk$splineDF <- splineDF
thislm<-lm( y ~ ns(x, df=splineDF), data= junk)
thislm
cat("finished thislm\n")
 fixed <- eval(substitute( y ~ ns(x, df=splineDF), list(splineDF=splineDF)))

thislme<-lme(
fixed= fixed
, random= ~ 1 | idvar
, data= junk)
thislme
}

It's not much better than your kludge in this example, but it might be more 
convenient if you want to work with different formulas.


Duncan Murdoch



KLUDGEfit() below provides a clumsy solution. It turns the lme()
command, along with the appropriate value of splineDF, into a text
string and then uses eval(parse(text=mystring)).  But is there not a
more elegant solution?

set.seed(5)
junk<-data.frame(x=rnorm(100))
junk$y<-junk$x + rnorm(nrow(junk))
junk$idvar<- factor(sample(LETTERS[1:10], size=nrow(junk), replace=TRUE))
library("nlme")
library(splines)

fitfunction<-function(splineDF=1) {
thislm<-lm( y ~ ns(x, df=splineDF), data= junk)
thislm
cat("finished thislm\n")
thislme<-lme(
fixed= y ~ ns(x, df=splineDF)
, random= ~ 1 | idvar
, data= junk)
thislme
}
fitfunction()

KLUDGEfit<-function(splineDF=2) {
thislm<-lm( y ~ ns(x, df=splineDF), data= junk)
thislm
cat("finished thislm\n")
mystring<-paste(
"thislme<-lme( fixed= y ~ ns(x, df="
, splineDF
, "), random= ~ 1 | idvar , data= junk)"
, sep="")
eval(parse(text=mystring))
thislme
}
KLUDGEfit()

Thanks for any insight

Jacob A. Wegelin
Assistant Professor
Department of Biostatistics
Virginia Commonwealth University
830 E. Main St., Seventh Floor
P. O. Box 980032
Richmond VA 23298-0032
U.S.A.








[R] Plotting histogram of RT data

2013-03-31 Thread Jacob Fake
Hey there,

I'm relatively new to R, and am currently working my way through some basic
tutorials. I have a large data set that I've been able to import into the
program. I'm using a script in which x=readdir (the directory). I am trying
to create a histogram of the data in the 17th column, which is response
time data ranging from ~200 to 15000. I have tried to create a histogram by
entering hist(x[,17]). However, the output that I receive has an x-axis
that is way off, running from 0e+00 up to 6e+05 even though the data stop at
15000. All the data are clustered together in one bar at 0e+00, with a
frequency of 15000 on the y-axis. What am I doing wrong? How can I change the
output so that I can
assess the shape of the distribution?
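An editorial sketch of one likely cause, not from the original thread: if column 17 was read in as character or factor, hist() may be plotting something other than the response times, giving a nonsense axis. Inspecting and coercing the column first usually reveals the problem. The tiny data frame below is a hypothetical stand-in for the real data.

```r
# Stand-in for the real data; in practice inspect str(x[, 17]) directly.
x <- data.frame(rt = c("250", "15000", "320"))
str(x$rt)

# Coerce via character so both factor and character columns convert safely.
rt <- as.numeric(as.character(x$rt))
hist(rt, breaks = 50, main = "Response times", xlab = "RT (ms)")
```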




[R] lme(y ~ ns(x, df=splineDF)) error

2012-09-26 Thread Jacob Wegelin


I would like to fit regression models of the form

y ~ ns(x, df=splineDF)

where splineDF is passed as an argument to a wrapper function.

This works fine if the regression function is lm(). But with lme(),
I get two different errors, depending on how I handle splineDF
inside the wrapper function.

A workaround is to turn the lme() command, along with the appropriate
value of splineDF, into a text string and then use
eval(parse(text=mystring)).  Is there not a more elegant solution?

The function below demonstrates this. According to the value of
WhichApproach (1, 2, or 3), it attempts one of the approaches that
does not work or the workaround using the text string.


sessionInfo()

R version 2.15.1 (2012-06-22)
Platform: x86_64-apple-darwin9.8.0/x86_64 (64-bit)

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] splines   stats     graphics  grDevices utils     datasets  methods   base

other attached packages:
[1] nlme_3.1-104  ggplot2_0.9.1

loaded via a namespace (and not attached):
 [1] colorspace_1.1-1   dichromat_1.2-4    digest_0.5.2       grid_2.15.1        labeling_0.1
 [6] lattice_0.20-6     MASS_7.3-18        memoise_0.1        munsell_0.3        plyr_1.7.1
[11] proto_0.3-9.2      RColorBrewer_1.0-5 reshape2_1.2.1     scales_0.2.1       stringr_0.6
[16] tools_2.15.1


wrapper <-function(WhichApproach=1, splineDF=4 ){
#   Create toy dataset
nID<-25
IDnames<-LETTERS[1:nID]

longdat<-data.frame( x=rep(-10:10, nID) / 3 , ID= rep( IDnames , each=21)
)
set.seed(5)
longdat$x<-longdat$x + rnorm(nrow(longdat))/10
IDeffect<-rnorm(nID) * 20
names(IDeffect) <- IDnames
longdat$IDeffect<-IDeffect[as.character(longdat$ID)]
longdat$y<- (longdat$x
+ longdat$x^3
+ (longdat$x-1)^4  / 5
+ 1/(abs(longdat$x/50) + 0.02)
+ longdat$IDeffect
+ rnorm(1:nrow(longdat)) * 2
)
longdat<-longdat[order(longdat$x),]

library(splines)
#   Calling ns within lm works fine:
mylm<- lm( y ~ ns(x,df=splineDF), data=longdat)
longdat$lmfit<-predict(mylm)
library(ggplot2)
print(
ggplot(longdat, aes(x, y))
+ geom_point(shape=1)
+ geom_line(aes(x=x, y=lmfit), color="red")
)


cat("Enter to attempt lme.")
readline()
library(nlme)

if(WhichApproach==1) {
# returns: "Error in eval(expr, envir, enclos) : object 'splineDF' not found"
#
# First make sure splineDF does not exist in .GlobalEnv, else we would be
# in WhichApproach==2.
if(exists("splineDF", where=".GlobalEnv")) remove(list="splineDF", pos=".GlobalEnv")
mylme<-lme(fixed= y ~ ns(x, df=splineDF)
, random= ~ 1 | ID
, correlation = corAR1()
, data=longdat
)
} else if(WhichApproach==2) {
# returns: "Error in model.frame.default(formula = ~ID + y + x + splineDF, data = list( : 
# variable lengths differ (found for 'splineDF')"

assign(x="splineDF", value=splineDF, pos=".GlobalEnv")
mylme<-lme(fixed= y ~ ns(x, df=splineDF)
, random= ~ 1 | ID
, correlation = corAR1()
, data=longdat
)
} else if (WhichApproach==3) {

thecommandstring <- paste("mylme<-lme(fixed= y ~ ns(x, df=", splineDF, "), random= ~ 1 | ID, correlation = corAR1(), data=longdat)")
thecommandstring
eval(parse(text=thecommandstring))

} else {
stop(paste("WhichApproach=", WhichApproach, " not valid."))
}

mylme
longdat$fullfit<-predict(mylme)

library(ggplot2)
print(
ggplot( longdat, aes(x,y))
+ geom_point(shape=1)
+ facet_wrap( ~ ID )
+ geom_line( aes(x, fullfit), color="red")
)

invisible(mylme)
}

Jacob A. Wegelin
Assistant Professor
Department of Biostatistics
Virginia Commonwealth University
830 E. Main St., Seventh Floor
P. O. Box 980032
Richmond VA 23298-0032
U.S.A.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Logistic curve fitting with y values >1 (R version 2.15.2, OS is OS X 10.6.8)

2012-11-09 Thread Jacob B.
Hello,

I'm trying to fit a logistic curve to data but I'm having a hard time
discovering how.  Every tutorial I've come across either assumes the
logistic curve has 0 < y < 1. The two tutorials I found are
http://www.apsnet.org/EDCENTER/ADVANCED/TOPICS/ECOLOGYANDEPIDEMIOLOGYINR/DISEASEPROGRESS/Pages/NonlinearRegression.aspx,
which assumes I have multiple categories of data, and
http://ww2.coastal.edu/kingw/statistics/R-tutorials/logistic.html
which only accepts y between 0 and 1.
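[Editor's sketch, not part of the original message: nls() with the self-starting SSlogis model estimates the upper asymptote (Asym) from the data, so y is not restricted to 0..1. The data below are simulated to illustrate.]

```r
# Hypothetical data: a logistic curve rising toward an asymptote near 15000,
# i.e. y values far above 1.
set.seed(1)
x <- seq(0, 10, length.out = 50)
y <- 15000 / (1 + exp((5 - x) / 1.2)) + rnorm(50, sd = 200)
dat <- data.frame(x, y)

# SSlogis is a self-starting logistic model: Asym / (1 + exp((xmid - x)/scal)).
# Asym (the upper asymptote) is a free parameter, not fixed at 1.
fit <- nls(y ~ SSlogis(x, Asym, xmid, scal), data = dat)
coef(fit)
```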

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Performing gage R&R study in R w/more than 2 factors

2012-11-19 Thread Matt Jacob
Hi everyone,

I'm fairly new to R, and I don't have a background in statistics, so
please bear with me. ;-)

I'm dealing with 2^k factorial designs, and I was just wondering if
there's any way to analyze more than two factors of a gage R&R study in
R. For example, Minitab has an "expanded gage R&R" function that lets
you include up to eight additional factors besides the usual two that
are present in gage studies (parts and operators). If I wanted to
include n additional random factors, is there a package or built-in
functionality that will allow me to do that?

I've been experimenting with the SixSigma package, and that has a ss.rr
method which works great---as long as your experiment only contains two
factors. I've also been using lmer from lme4 to fit a linear model of my
experiment, but the standard deviations generated by lmer don't match
what I'm seeing in Minitab. Since all my factors are random, the formula
I'm using looks like this:

vals ~ 1 + (1|f1) + (1|f2) + (1|f3) + (1|f1:f2) + (1|f1:f3) + (1|f2:f3)

What am I doing wrong, and how can I fix it?

Thanks,

Matt

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Performing gage R&R study in R w/more than 2 factors

2012-11-20 Thread Matt Jacob
On Mon, Nov 19, 2012, at 16:31, Bert Gunter wrote:
> I believe that you need to consult a local statistician, as there are
> likely way too many statistical issues here that you do not fully
> understand. Alternatively, try posting to a statistical list like
> stats.stackexchange.com, as I think most of your issues are primarily
> statistical, not R related.

Yes, you are correct. I've actually been working with a statistician
within my organization, but the dilemma is that he's a stats guy who
knows Minitab, and I'm a software guy who's trying to deploy some tools
that are dependent on R. I've basically been trying to match up the
output of R with the output of Minitab to check my work.

Matt

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Performing gage R&R study in R w/more than 2 factors

2012-11-20 Thread Matt Jacob
On Mon, Nov 19, 2012, at 18:26, David Winsemius wrote:
> My guess is that you do not understand the meaning of a "random  
> factor". I certainly did not when I first encountered it. All my  
> training had been with ordinary regression and analysis of variance.  
> These are methods for what in mixed models are fixed effects. My  
> opinion is that these terms are completely confusing to the new  
> student of this sort of analysis.

You're absolutely right---the distinction of fixed vs. random factors is
confusing. However, I was under the impression that all factors in a
gage R&R study were random, since we're trying to determine the sources
of variability on the system. 

> My guess is the you may just want the output of:
> 
> lm( vals ~ f1 * f2 * f3, data = yourdat)

I'm trying to get the variance component estimates, and from there, I
can calculate the percent tolerance and other interesting statistics. It
doesn't look like lm gives me that information, though. FWIW, your
formula is the same as what I'm feeding into aov, and the ANOVA table
output *does* match up with what Minitab is producing.
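[Editor's sketch, not part of the message: variance component estimates from a random-effects fit can be read off with lme4's VarCorr(). Assumes lme4 is installed; the data and factor names below are hypothetical stand-ins for parts (f1) and operators (f2).]

```r
library(lme4)

# Hypothetical two-factor gage data: 10 parts x 3 operators x 3 replicates.
set.seed(42)
d <- expand.grid(f1 = factor(1:10), f2 = factor(1:3), rep = 1:3)
d$vals <- rnorm(nrow(d)) + rnorm(10)[d$f1] + rnorm(3)[d$f2]

m <- lmer(vals ~ 1 + (1 | f1) + (1 | f2), data = d)
vc <- as.data.frame(VarCorr(m))   # one row per variance component
vc[, c("grp", "vcov")]            # component names and variance estimates
</imports>
```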

Matt

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] scope, lme, ns, nlme, splines

2012-12-06 Thread Jacob Wegelin


I want to fit a series of lme() regression models that differ only in the
degrees of freedom of a ns() spline. I want to use a wrapper function to do
this. The models will be of the form

y ~ ns(x, df=splineDF)

where splineDF is passed as an argument to a wrapper function.

This works fine if the regression function is lm(). But with lme(),
I get an error.  fitfunction() below demonstrates this.

Why?

KLUDGEfit() below provides a clumsy solution. It turns the lme()
command, along with the appropriate value of splineDF, into a text
string and then uses eval(parse(text=mystring)).  But is there not a
more elegant solution?

set.seed(5)
junk<-data.frame(x=rnorm(100))
junk$y<-junk$x + rnorm(nrow(junk))
junk$idvar<- factor(sample(LETTERS[1:10], size=nrow(junk), replace=TRUE))
library("nlme")
library(splines)

fitfunction<-function(splineDF=1) {
thislm<-lm( y ~ ns(x, df=splineDF), data= junk)
thislm
cat("finished thislm\n")
thislme<-lme(
fixed= y ~ ns(x, df=splineDF)
, random= ~ 1 | idvar
, data= junk)
thislme
}
fitfunction()

KLUDGEfit<-function(splineDF=2) {
thislm<-lm( y ~ ns(x, df=splineDF), data= junk)
thislm
cat("finished thislm\n")
mystring<-paste(
"thislme<-lme( fixed= y ~ ns(x, df="
, splineDF
, "), random= ~ 1 | idvar , data= junk)"
, sep="")
    eval(parse(text=mystring))
thislme
}
KLUDGEfit()
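[Editor's sketch, not from the original post: the string-pasting workaround can usually be replaced by splicing the value of splineDF into the call with substitute() before evaluating it. Same toy data and names as in the post above.]

```r
library(nlme)
library(splines)

# Same toy data as the post above.
set.seed(5)
junk <- data.frame(x = rnorm(100))
junk$y <- junk$x + rnorm(nrow(junk))
junk$idvar <- factor(sample(LETTERS[1:10], size = nrow(junk), replace = TRUE))

SPLICEfit <- function(splineDF = 2) {
  thecall <- substitute(
    lme(fixed = y ~ ns(x, df = DF), random = ~ 1 | idvar, data = junk),
    list(DF = splineDF)   # DF is replaced by the numeric value before evaluation
  )
  eval(thecall)
}
SPLICEfit()
```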

Thanks for any insight

Jacob A. Wegelin
Assistant Professor
Department of Biostatistics
Virginia Commonwealth University
830 E. Main St., Seventh Floor
P. O. Box 980032
Richmond VA 23298-0032
U.S.A.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] scales percent precision

2014-02-27 Thread Jacob Wegelin


scales::percent appears not to be documented.

Details:

At http://cran.r-project.org/web/packages/scales/scales.pdf, equivalently in 
?percent, I find no answer to the following two questions.

(1) How can I specify the number of decimal points in a call to percent()? For 
instance, 0.010101 could be

1%

1.0%

1.01%

etc. depending on what kind of report I'm writing.

I can control precision myself by writing

mypercent<-function(theargument, siglevel=2) {
stopifnot(is.numeric(theargument))
paste(signif(theargument, siglevel) * 100, "%", sep="")
}

and then we have


mypercent(0.010101)

[1] "1%"

mypercent(0.010101, 5)

[1] "1.0101%"

mypercent(0.010101, 3)

[1] "1.01%"

(2) What is the function precision() inside percent()? I find no documentation 
for it, and in fact it does not appear in the search path. Nor does round_any().


percent(0.010101)

[1] "1.01%"

percent
function (x) 
{
    x <- round_any(x, precision(x)/100)
    str_c(comma(x * 100), "%")
}


find("precision")

character(0)

find("round_any")

character(0)
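[Editor's sketch, not part of the original post: a small second-order formatter in the style of the scales package (a function that returns a function), with explicit control over decimal places. All names here are illustrative.]

```r
# Returns a formatting function, mirroring scales' function-factory style.
percent_fmt <- function(digits = 1) {
  function(x) sprintf(paste0("%.", digits, "f%%"), 100 * x)
}

fmt2 <- percent_fmt(2)
fmt2(0.010101)   # "1.01%"
```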




Thanks for any insights

Jacob Wegelin


sessionInfo()

R version 2.15.3 (2013-03-01)
Platform: x86_64-apple-darwin9.8.0/x86_64 (64-bit)

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
 [1] tools     grid      splines   stats     graphics  grDevices utils     datasets  methods   base

other attached packages:
[1] scales_0.2.3xtable_1.7-0reshape2_1.2.2  moments_0.13
corrplot_0.70   ggplot2_0.9.3.1 nlme_3.1-108

loaded via a namespace (and not attached):
 [1] colorspace_1.2-0   dichromat_1.2-4    digest_0.6.0       gtable_0.1.2
 [5] labeling_0.1       lattice_0.20-13    MASS_7.3-23        munsell_0.4
 [9] plyr_1.8           proto_0.3-10       psych_1.2.8        RColorBrewer_1.0-5
[13] stringr_0.6.2




__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] scales percent precision

2014-02-27 Thread Jacob Wegelin

But percent_format() does not take the argument, multiply it by 100, and paste 
on a percent sign, as we see here:


?scales::percent_format
percent_format(0.0101010101)

Error in percent_format(0.0101010101) : unused argument(s) (0.0101010101)

args(percent_format)
function () 
NULL


And how do we control the significant digits when we use percent()?


percent(0.0101010101)

[1] "1.01%"

My point is that


?scales::percent_format


does not answer these questions. This is what I mean by saying that the 
function is not documented.

On 2014-02-27 Thu 14:34, Dennis Murphy wrote:

Hi:

On Thu, Feb 27, 2014 at 8:49 AM, Jacob Wegelin  wrote:


scales::percent appears not to be documented.


?scales::percent_format

where it tells you that it takes its argument, multiplies it by 100
and then attaches a percent sign to it. For most situations, the data
should be relative frequencies/proportions. BTW, many of the functions
in the scales package are second-order R functions, which means there
are two calls in the function invocation. The first call returns a
function and the second is a call to the returned function.



Details:

At http://cran.r-project.org/web/packages/scales/scales.pdf, equivalently in
?percent, I find no answer to the following two questions.

(1) How can I specify the number of decimal points in a call to percent()?
For instance, 0.010101 could be

1%

1.0%

1.01%

etc. depending on what kind of report I'm writing.

I can control precision myself by writing

mypercent<-function(theargument, siglevel=2) {
stopifnot(is.numeric(theargument))
paste(signif(theargument, siglevel) * 100, "%", sep="")
}

and then we have


mypercent(0.010101)


[1] "1%"


mypercent(0.010101, 5)


[1] "1.0101%"


mypercent(0.010101, 3)


[1] "1.01%"


percent_format() uses pretty breaks by default, so you'd probably want
to pass your desired labels to scale_y_continuous() directly and avoid
percent_format(). You could call the function on a vector of breaks
and use the return values for the labels.



(2) What is the function precision() inside percent()? I find no
documentation for it, and in fact it does not appear in the search path. Nor
does round_any().


round_any() comes from the plyr package. I have no idea where
precision() comes from; I've wondered about that myself a couple of
times. I imagine it comes from one of the imported packages, but I
didn't find it in any of plyr, stringr or labeling. I didn't check the
color-related packages (RColorBrewer, dichromat or munsell). It could
also be a hidden function.

Dennis



percent(0.010101)


[1] "1.01%"


percent


function (x) {
x <- round_any(x, precision(x)/100)
str_c(comma(x * 100), "%")
}



find("precision")


character(0)


find("round_any")


character(0)





Thanks for any insights

Jacob Wegelin


sessionInfo()


R version 2.15.3 (2013-03-01)
Platform: x86_64-apple-darwin9.8.0/x86_64 (64-bit)

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
 [1] tools grid  splines   stats graphics  grDevices utils
datasets  methods   base

other attached packages:
[1] scales_0.2.3xtable_1.7-0reshape2_1.2.2  moments_0.13
corrplot_0.70   ggplot2_0.9.3.1 nlme_3.1-108

loaded via a namespace (and not attached):
 [1] colorspace_1.2-0   dichromat_1.2-4digest_0.6.0   gtable_0.1.2
labeling_0.1   lattice_0.20-13MASS_7.3-23munsell_0.4
 [9] plyr_1.8   proto_0.3-10   psych_1.2.8
RColorBrewer_1.0-5 stringr_0.6.2





__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.




__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] scales percent precision

2014-02-27 Thread Jacob Wegelin


Incidentally,

?scales::percent

brings up exactly the same text as

?scales::percent_format

On 2014-02-27 Thu 14:47, Jacob Wegelin wrote:
But percent_format() does not take the argument, multiply it by 100, and 
paste on a percent sign, as we see here:



?scales::percent_format
percent_format(0.0101010101)

Error in percent_format(0.0101010101) : unused argument(s) (0.0101010101)

args(percent_format)

function () NULL

And how do we control the significant digits when we use percent()?


percent(0.0101010101)

[1] "1.01%"

My point is that


?scales::percent_format


does not answer these questions. This is what I mean by saying that the 
function is not documented.


On 2014-02-27 Thu 14:34, Dennis Murphy wrote:

Hi:

On Thu, Feb 27, 2014 at 8:49 AM, Jacob Wegelin  
wrote:


scales::percent appears not to be documented.


?scales::percent_format

where it tells you that it takes its argument, multiplies it by 100
and then attaches a percent sign to it. For most situations, the data
should be relative frequencies/proportions. BTW, many of the functions
in the scales package are second-order R functions, which means there
are two calls in the function invocation. The first call returns a
function and the second is a call to the returned function.



Details:

At http://cran.r-project.org/web/packages/scales/scales.pdf, equivalently 
in

?percent, I find no answer to the following two questions.

(1) How can I specify the number of decimal points in a call to percent()?
For instance, 0.010101 could be

1%

1.0%

1.01%

etc. depending on what kind of report I'm writing.

I can control precision myself by writing

mypercent<-function(theargument, siglevel=2) {
stopifnot(is.numeric(theargument))
paste(signif(theargument, siglevel) * 100, "%", sep="")
}

and then we have


mypercent(0.010101)


[1] "1%"


mypercent(0.010101, 5)


[1] "1.0101%"


mypercent(0.010101, 3)


[1] "1.01%"


percent_format() uses pretty breaks by default, so you'd probably want
to pass your desired labels to scale_y_continuous() directly and avoid
percent_format(). You could call the function on a vector of breaks
and use the return values for the labels.



(2) What is the function precision() inside percent()? I find no
documentation for it, and in fact it does not appear in the search path. 
Nor

does round_any().


round_any() comes from the plyr package. I have no idea where
precision() comes from; I've wondered about that myself a couple of
times. I imagine it comes from one of the imported packages, but I
didn't find it in any of plyr, stringr or labeling. I didn't check the
color-related packages (RColorBrewer, dichromat or munsell). It could
also be a hidden function.

Dennis



percent(0.010101)


[1] "1.01%"


percent


function (x) {
x <- round_any(x, precision(x)/100)
str_c(comma(x * 100), "%")
}



find("precision")


character(0)


find("round_any")


character(0)





Thanks for any insights

Jacob Wegelin


sessionInfo()


R version 2.15.3 (2013-03-01)
Platform: x86_64-apple-darwin9.8.0/x86_64 (64-bit)

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
 [1] tools grid  splines   stats graphics  grDevices utils
datasets  methods   base

other attached packages:
[1] scales_0.2.3xtable_1.7-0reshape2_1.2.2  moments_0.13
corrplot_0.70   ggplot2_0.9.3.1 nlme_3.1-108

loaded via a namespace (and not attached):
 [1] colorspace_1.2-0   dichromat_1.2-4digest_0.6.0   gtable_0.1.2
labeling_0.1   lattice_0.20-13MASS_7.3-23munsell_0.4
 [9] plyr_1.8   proto_0.3-10   psych_1.2.8
RColorBrewer_1.0-5 stringr_0.6.2





__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide 
http://www.R-project.org/posting-guide.html

and provide commented, minimal, self-contained, reproducible code.




__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.



__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] plus/minus +/- in factor; not plotmath not expression

2013-12-02 Thread Jacob Wegelin


I want to put the "plus or minus" symbol into a character variable, so that this can be 
turned into a factor and be displayed in the "strip" of a faceted ggplot2 plot.

A very nice solution, thanks to Professor Ripley's post of Nov 16, 2008; 
3:13pm, visible at 
http://r.789695.n4.nabble.com/Symbols-to-use-in-text-td874239.html and 
subsequently http://www.fileformat.info/info/unicode/char/00b1/index.htm, is:

junk<- "\u00B1"
print(junk)

#   This works very nicely. For instance:

junk<-data.frame(gug=c(
rep( "\u00B1 1.2", 10)
,
rep( "\u00B1 2.3", 10)
)
)
junk$eks<-1:nrow(junk)
junk$why<-with(junk, as.numeric(gug) + eks)
print(summary(junk))
library(ggplot2)
print(
ggplot(data=junk, mapping=aes(x=eks, y=why))
+ geom_point()
+ facet_grid(. ~ gug)
)

This works very nicely on my system, but I just wanted to enquire:

Is this machine-independent and stable?

Is there a "native R" way to do this?

I did this in:


sessionInfo()

R version 2.15.3 (2013-03-01)
Platform: x86_64-apple-darwin9.8.0/x86_64 (64-bit)

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] stats graphics  grDevices utils datasets  methods   base

other attached packages:
[1] ggplot2_0.9.3.1

loaded via a namespace (and not attached):
 [1] colorspace_1.2-0   dichromat_1.2-4    digest_0.6.0       grid_2.15.3
 [5] gtable_0.1.2       labeling_0.1       MASS_7.3-23        munsell_0.4
 [9] plyr_1.8           proto_0.3-10       psych_1.2.8        RColorBrewer_1.0-5
[13] reshape2_1.2.2     scales_0.2.3       stringr_0.6.2




Incidentally (and for the sake of keyword searches): Although a google search 
initially led me to posts about expression() and plotmath, those eventually had 
nothing to do with the solution.

Jacob A. Wegelin
Assistant Professor
Department of Biostatistics
Virginia Commonwealth University
830 E. Main St., Seventh Floor
P. O. Box 980032
Richmond VA 23298-0032
U.S.A. 
CTSA grant: UL1TR58

URL: http://www.people.vcu.edu/~jwegelin

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] ggplot2 boxplot: horizontal, univariate

2010-06-18 Thread Jacob Wegelin


In ggplot2, I would like to make a boxplot that has the following properties:

(1) Contrary to default, the meaningful axis should be the horizontal axis.

Lattice does this, for instance, by

library(lattice);bwplot(~mtcars$mpg)

(2) It is *univariate*, i.e., of a single vector, say mtcars$mpg.  I do not 
wish to make separate plots for the different values of mtcars$cyl.

(3) Nothing on the meaningless axis--the axis that does not correspond to 
values of mtcars$mpg, the axis which by default is horizontal--should suggest a 
scale. Thus, there should be no axis title or label; there should be no axis 
ticks.

Partial solutions:

To achieve (1), one might save a vertical plot (with mpg on the y-axis) as a 
pdf and use Adobe Acrobat to rotate it 90 degrees. But is there not a way to do 
this *within* ggplot2?

Since ggplot2 has been carefully thought out starting with the grammar of 
graphics, I wonder if there is some conceptual argument against making 
univariate boxplots and against boxplots with a horizontal continuous axis.

But in teaching an introductory statistics course, I would like to compare the 
ways that a histogram and a boxplot summarize a single continuous variable. 
Thus I would like the continuous axis to be horizontal in both plots.

The following code achieves (2) and part of (3):

library(ggplot2)

qplot(factor(0), mpg, data=mtcars, geom="boxplot", xlab="")+ opts(axis.text.x = 
theme_blank(), axis.ticks=theme_blank())

but to remove the ticks from the meaningless (horizontal) axis I had to also 
remove the ticks from the meaningful (vertical) axis.  On page 143 of Wickham's 
ggplot2 book, I find axis.ticks as a theme element but nothing like 
axis.ticks.x.

Is there a way to remove the axis ticks from the x axis and not from the y axis?

But, more important: How do I make a boxplot that is rotated (or transposed) 
from the default, so that the x axis carries the information?
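[Editor's note, not part of the original message: in later ggplot2 versions, coord_flip() transposes the plot (answering point (1) within ggplot2), opts() became theme(), and theme_blank() became element_blank(), which allows blanking only the meaningless axis.]

```r
library(ggplot2)

# Univariate horizontal boxplot of mtcars$mpg: coord_flip() puts the
# continuous scale on the horizontal axis; the dummy axis is blanked.
p <- ggplot(mtcars, aes(x = factor(0), y = mpg)) +
  geom_boxplot() +
  coord_flip() +
  theme(axis.text.y  = element_blank(),
        axis.ticks.y = element_blank(),
        axis.title.y = element_blank())
p
```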

Thanks

Jacob A. Wegelin
Assistant Professor
Department of Biostatistics
Virginia Commonwealth University
730 East Broad Street Room 3006
P. O. Box 980032
Richmond VA 23298-0032
U.S.A.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] ess - History Expansion

2010-06-21 Thread Vinu Jacob
Hello R world,

Can't seem to get history expansion working in R under emacs 23.2 with ess on 
mac osx. (Vincent Goulet's package). Even '!!' does not get me the previous 
command.  What could be wrong? 

Thanks,

vj  

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] plot xyz data in 2D

2010-09-21 Thread Jacob Oosterwijk
Hello,

 

I want to make an 2D plot of .xyz data. So plot on x-axis and y-axis and
use a color scale for the z.

For every x-location along a cross section of the soil at several depths
a resistivity must be displayed. This must result in a picture/graph
which shows the resistivity for that cross section. The depths differ
for each x-location.

 

How can I do that?

 

In fact it is an cross section with the following format (simple
example)

 

X    Depth   Resistivity
-3    0       1
-2   -0.25    2
-2    0       1
-1   -0.5     3
-1   -0.25    2
-1    0       1
 0   -0.75    2
 0   -0.5     3
 0   -0.25    2
 0    0       1
 1   -0.5     3
 1   -0.25    2
 1    0       1
 2   -0.25    2
 2    0       2
 3    0       2

 

 

Acacia Water

Jan van Beaumontstraat 1

2805 RN  Gouda

The Netherlands

Tel. office: +31(0) 182686424

Tel. mobile: +31(0) 6 12143599

Fax: +31(0)182686239

Email: jacob.oosterw...@acaciawater.com  

Website: www.acaciawater.com  
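[Editor's sketch, not part of the message: with the three columns in a data frame, ggplot2 can map Resistivity to a colour scale directly. The values below are the cross-section data from the table in the message.]

```r
library(ggplot2)

# The cross-section data from the table above.
xs <- data.frame(
  X           = c(-3, -2, -2, -1, -1, -1, 0, 0, 0, 0, 1, 1, 1, 2, 2, 3),
  Depth       = c(0, -0.25, 0, -0.5, -0.25, 0, -0.75, -0.5, -0.25, 0,
                  -0.5, -0.25, 0, -0.25, 0, 0),
  Resistivity = c(1, 2, 1, 3, 2, 1, 2, 3, 2, 1, 3, 2, 1, 2, 2, 2)
)

# Plot x along the horizontal axis, depth vertically, resistivity as colour.
p <- ggplot(xs, aes(X, Depth, colour = Resistivity)) +
  geom_point(size = 5)
p
```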

 

 


 


[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] ancova help

2010-10-19 Thread Jacob Kasper
I am trying to run an ancova and am having trouble setting it up properly.
I have nearly 10,000 measurements of fish length, girth and stage of sexual
development. I am suspicious that the stage of development is affecting the
length (as they get full of eggs they get more round and are more difficult
to measure and measure shorter).
My data looks something like this:

Length  girth  stage
40      50     2
42      48     3
37      40     5
38      38     6
34      44     2
36      45     3
36      39     4
39      42     4
39      39     6
but now I am not quite sure what to do next.

I have tried this: fit<-lm(length ~ girth + stage)

summary(fit)

Residuals:
  Min1QMedian3Q   Max
-12.18198  -1.59198  -0.06057   1.50504  17.56265

Coefficients:
   Estimate Std. Error t value Pr(>|t|)
(Intercept)24.974310.29956   83.37   <2e-16 ***
data$girth  0.246160.00571   43.11   <2e-16 ***
data$stage0.673710.02742   24.57   <2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 2.388 on 9864 degrees of freedom
  (1101 observations deleted due to missingness)
Multiple R-squared: 0.1934,Adjusted R-squared: 0.1933
F-statistic:  1183 on 2 and 9864 DF,  p-value: < 2.2e-16


but this has not told me anything about where the differences in length that
are attributable to sexual development lie. any suggestions?
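[Editor's sketch, not part of the thread: treating stage as a factor and comparing nested models with anova() shows whether stage explains length beyond girth, and the per-stage coefficients show which stages differ. The data below are simulated stand-ins for the fish measurements.]

```r
# Simulated stand-in for the fish data (hypothetical numbers).
set.seed(1)
n <- 200
dat <- data.frame(girth = runif(n, 35, 50),
                  stage = factor(sample(2:6, n, replace = TRUE)))
dat$length <- 25 + 0.25 * dat$girth +
  0.7 * as.numeric(as.character(dat$stage)) + rnorm(n, sd = 2)

fit0 <- lm(length ~ girth, data = dat)
fit1 <- lm(length ~ girth + stage, data = dat)  # stage as a factor, not numeric

anova(fit0, fit1)            # does stage add explanatory power beyond girth?
summary(fit1)$coefficients   # per-stage contrasts relative to the first stage
```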


Thank you
Jacob

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.



[R] "Ghost" values after subsetting

2011-01-13 Thread Jacob Kasper
I am using subset to select the data I want to use for my analysis and find
that after I subset my data frame on one column I get ghost values in the
other columns. here is an example:

> table(data$Dags)

   2008/04/12 2008/04/13 2008/04/16 2008/04/17 2008/04/19 2008/05/06

   103140 82187179212 68

2008/05/07 2008/05/12 2008/05/15 2008/05/25 2008/05/28 2008/05/29 2009/04/17

   184308120227250150259

2009/04/18 2009/04/20 2009/04/21 2009/05/04 2009/05/15 2009/06/09 2009/06/10

   246241252153366225 79

2009/06/24 2009/06/25 2009/06/26 2010/03/11 2010/04/27 2010/05/07 2010/05/08

   126278297135285286275

2010/05/10 2010/05/11 2010/05/20 2010/05/21 2010/06/02 2010/07/20 2010/08/12

   290 22259291381 20648

2010/08/16 2010/08/18
11  2


> data10<-subset(data, data$Year==2010 & data$Recatpure1==1)

> table(data10$Dags)

   2008/04/12 2008/04/13 2008/04/16 2008/04/17 2008/04/19 2008/05/06

 0  0  0  0  0  0  0

2008/05/07 2008/05/12 2008/05/15 2008/05/25 2008/05/28 2008/05/29 2009/04/17

 0  0  0  0  0  0  0

2009/04/18 2009/04/20 2009/04/21 2009/05/04 2009/05/15 2009/06/09 2009/06/10

 0  0  0  0  0  0  0

2009/06/24 2009/06/25 2009/06/26 2010/03/11 2010/04/27 2010/05/07 2010/05/08

 0  0  0 23 38 20 29

2010/05/10 2010/05/11 2010/05/20 2010/05/21 2010/06/02 2010/07/20 2010/08/12

18  1 15 45 38  1  5

2010/08/16 2010/08/18
 0  0
How can I perform a subset so that these ghost values do not appear at all
in my new table?
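[Editor's sketch, not from the thread: as diagnosed in the reply, the column is a factor, and subsetting keeps unused levels; droplevels() removes them.]

```r
# Unused factor levels persist after subsetting; droplevels() discards them.
df <- data.frame(Dags = factor(c("2008/04/12", "2010/05/10", "2010/05/10")),
                 Year = c(2008, 2010, 2010))

d10 <- droplevels(subset(df, Year == 2010))
table(d10$Dags)   # only 2010/05/10 appears; no zero-count "ghost" levels
```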

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] "Ghost" values after subsetting

2011-01-13 Thread Jacob Kasper
It is a factor, thank you for pointing me in the right direction.

Jacob

On Thu, Jan 13, 2011 at 11:12, Sarah Goslee  wrote:

> Hi Jacob,
>
> You don't give us enough information to answer your question. Specifically,
> what is your dataframe?
> str(data)
> would be helpful (and calling your data data is not usually wise).
>
> My guess is that Dags is actually a factor -- do you want it to be a
> factor? --
> and so you are retaining all of the levels. You need to think about how you
> are getting the data into R, whether you want a factor for that column, and
> whether you should drop the unused levels.
>
> Reading the help for factor may be enlightening.
>
> Sarah
>
> On Thu, Jan 13, 2011 at 3:57 AM, Jacob Kasper 
> wrote:
> > I am using subset to select the data I want to use for my analysis and
> find
> > that after I subset my data frame on one column I get ghost values in the
> > other columns. here is an example:
> >
> >> table(data$Dags)
> >
> >   2008/04/12 2008/04/13 2008/04/16 2008/04/17 2008/04/19
> 2008/05/06
> >
> >   103140 82187179212
> 68
> >
> > 2008/05/07 2008/05/12 2008/05/15 2008/05/25 2008/05/28 2008/05/29
> 2009/04/17
> >
> >   184308120227250150
>  259
> >
> > 2009/04/18 2009/04/20 2009/04/21 2009/05/04 2009/05/15 2009/06/09
> 2009/06/10
> >
> >   246241252153366225
> 79
> >
> > 2009/06/24 2009/06/25 2009/06/26 2010/03/11 2010/04/27 2010/05/07
> 2010/05/08
> >
> >   126278297135285286
>  275
> >
> > 2010/05/10 2010/05/11 2010/05/20 2010/05/21 2010/06/02 2010/07/20
> 2010/08/12
> >
> >   290 22259291381 20
>  648
> >
> > 2010/08/16 2010/08/18
> >11  2
> >
> >
> >> data10<-subset(data, data$Year==2010 & data$Recatpure1==1)
> >
> >> table(data10$Dags)
> >
> >   2008/04/12 2008/04/13 2008/04/16 2008/04/17 2008/04/19
> 2008/05/06
> >
> > 0  0  0  0  0  0
>  0
> >
> > 2008/05/07 2008/05/12 2008/05/15 2008/05/25 2008/05/28 2008/05/29
> 2009/04/17
> >
> > 0  0  0  0  0  0
>  0
> >
> > 2009/04/18 2009/04/20 2009/04/21 2009/05/04 2009/05/15 2009/06/09
> 2009/06/10
> >
> > 0  0  0  0  0  0
>  0
> >
> > 2009/06/24 2009/06/25 2009/06/26 2010/03/11 2010/04/27 2010/05/07
> 2010/05/08
> >
> > 0  0  0 23 38 20
> 29
> >
> > 2010/05/10 2010/05/11 2010/05/20 2010/05/21 2010/06/02 2010/07/20
> 2010/08/12
> >
> >18  1 15 45 38  1
>  5
> >
> > 2010/08/16 2010/08/18
> > 0  0
> > How can I perform a subset so that these ghost values do not appear at
> all
> > in my new table?
> >
>
>
> --
> Sarah Goslee
> http://www.functionaldiversity.org
>

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] coxph() - unexpected result using Crawley's seedlings data (The R Book)

2011-06-28 Thread Jacob Brogren
Hi,

I ran the example on pp. 799-800 from Michael Crawley's "The R Book" using 
package survival v. 2.36-5, R 2.13.0 and RStudio 0.94.83. The model is a Cox's 
Proportional Hazards model. The result was quite different compared to the R 
Book. I have compared my code to the code in the book but can not find any 
differences in the function call. My results are attached as well as a link to 
the results presented in the book (link to Google Books).

When running the examples on pp. 797-799 I can't detect any differences in 
results so I don't think there are errors in the data set or in the creation of 
the status variable.

-
Original from the R Book:
http://books.google.com/books?id=8D4HVx0apZQC&lpg=PA799&ots=rQgd_8ofeS&dq=r%20coxph%20crawley&pg=PA799#v=onepage&q&f=false

-
My result:
> summary(model1)
Call:
coxph(formula = Surv(death, status) ~ strata(cohort) * gapsize, 
data = seedlings)

  n= 60, number of events= 60 

                                             coef exp(coef)  se(coef)      z Pr(>|z|)
gapsize                                 -0.001893  0.998109  0.593372 -0.003    0.997
gapsize:strata(cohort)cohort=September   0.717407  2.049112  0.860807  0.833    0.405

                                       exp(coef) exp(-coef) lower .95 upper .95
gapsize                                   0.9981      1.002    0.3120     3.193
gapsize:strata(cohort)cohort=September    2.0491      0.488    0.3792    11.074

Rsquare= 0.022   (max possible= 0.993 )
Likelihood ratio test= 1.35  on 2 df,   p=0.5097
Wald test            = 1.32  on 2 df,   p=0.5178
Score (logrank) test = 1.33  on 2 df,   p=0.514

Anyone have an idea why this is occurring?

Kind Regards

Jacob
__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] coxph() - unexpected result using Crawley's seedlings data (The R Book)

2011-06-28 Thread Jacob Brogren
Hi,

sorry about that; here is the full output - data set, structure, model and 
result.

Cheers

Jacob

> seedlings
      cohort death gapsize status
1  September     7  0.5889      1
2  September     3  0.6869      1
3  September    12  0.1397      1
4  September     1  0.1921      1
5  September     4  0.2798      1
6  September     2  0.2607      1
7  September     6  0.9467      1
8  September     6  0.6375      1
9  September     8  0.1527      1
10 September     3  0.8237      1
11 September     1  0.5979      1
12 September     1  0.2914      1
13 September     3  0.5053      1
14 September     5  0.4714      1
15 September     2  0.6041      1
16 September     8  0.8812      1
17 September     4  0.8416      1
18 September     1  0.0577      1
19 September     2  0.9034      1
20 September     2  0.4348      1
21 September     7  0.9878      1
22 September    11  0.1486      1
23 September    14  0.5003      1
24 September     1  0.8507      1
25 September    10  0.8187      1
26 September    14  0.0291      1
27 September     1  0.3785      1
28 September     4  0.8384      1
29 September     2  0.8351      1
30 September     2  0.9674      1
31   October     1  0.6943      1
32   October     1  0.2591      1
33   October     2  0.7397      1
34   October     2  0.4663      1
35   October    14  0.9115      1
36   October     5  0.1750      1
37   October     1  0.5628      1
38   October     8  0.2681      1
39   October     5  0.6967      1
40   October     2  0.7020      1
41   October     4  0.7971      1
42   October     3  0.4047      1
43   October     5  0.0498      1
44   October    10  0.0364      1
45   October     9  0.4080      1
46   October     1  0.6226      1
47   October    11  0.3002      1
48   October     3  0.8111      1
49   October    21  0.4894      1
50   October     1  0.0375      1
51   October     4  0.2560      1
52   October     9  0.2168      1
53   October     8  0.7437      1
54   October     1  0.9082      1
55   October     3  0.9496      1
56   October     9  0.1040      1
57   October     9  0.8691      1
58   October    16  0.9502      1
59   October     6  0.0790      1
60   October     1  0.5658      1
> str(seedlings)
'data.frame':   60 obs. of  4 variables:
 $ cohort : Factor w/ 2 levels "October","September": 2 2 2 2 2 2 2 2 2 2 ...
 $ death  : int  7 3 12 1 4 2 6 6 8 3 ...
 $ gapsize: num  0.589 0.687 0.14 0.192 0.28 ...
 $ status : num  1 1 1 1 1 1 1 1 1 1 ...
> model1 <- coxph(Surv(death,status)~strata(cohort)*gapsize,data=seedlings)
> summary(model1)
Call:
coxph(formula = Surv(death, status) ~ strata(cohort) * gapsize, 
data = seedlings)

  n= 60, number of events= 60 

                                             coef exp(coef)  se(coef)      z Pr(>|z|)
gapsize                                 -0.001893  0.998109  0.593372 -0.003    0.997
gapsize:strata(cohort)cohort=September   0.717407  2.049112  0.860807  0.833    0.405

                                       exp(coef) exp(-coef) lower .95 upper .95
gapsize                                   0.9981      1.002    0.3120     3.193
gapsize:strata(cohort)cohort=September    2.0491      0.488    0.3792    11.074

Rsquare= 0.022   (max possible= 0.993 )
Likelihood ratio test= 1.35  on 2 df,   p=0.5097
Wald test            = 1.32  on 2 df,   p=0.5178
Score (logrank) test = 1.33  on 2 df,   p=0.514

28 jun 2011 kl. 15.48 skrev Robert A LaBudde:

> Did you create the 'status' variable the way indicated on p. 797?
> 
> Frequently with Surv() it pays to use syntax such as Surv(death, status==1) 
> to make a clear logical statement of what is an event (status==1) vs. 
> censored.
> 
> PS. Next time include head(seedlings) and str(seedlings) to make clear what 
> you are using as data.
> 
> 
> At 06:51 AM 6/28/2011, Jacob Brogren wrote:
>> Hi,
>> 
>> I ran the example on pp. 799-800 from Michael Crawley's "The R Book" using 
>> package survival v. 2.36-5, R 2.13.0 and RStudio 0.94.83. The model is a 
>> Cox's Proportional Hazards model. The result was quite different compared to 
>> the R Book. I have compared my code to the code in the book but can not find 
>> any differences in the function call. My results are attached as well as a 
>> link to the results presented in the book (link to Google Books).
>> 
>> When running the examples on pp. 797-799 I can't detect any differences in 
>> results so I don't think there are errors in the data set or in the creation 
>> of the status variable.
>> 
>> -
>> Original from the R Book:
>> http://books.google.com/books?id=8D4HVx0apZQC&lpg=PA799&ots=rQgd_8ofeS&dq=r%20coxph%20crawley&pg=PA799#v=onepage&q&f=false
>> 
>> -
>> My result:
>

Re: [R] coxph() - unexpected result using Crawley's seedlings data (The R Book)

2011-06-28 Thread Jacob Brogren
All,

I reran it once again and managed to reproduce the results from the textbook. 
I made no changes to the code. Could it be some problem with convergence?

Anyhow, now it works!

Cheers

Jacob

ps. I find "The R Book" very useful.

28 jun 2011 kl. 15.48 skrev Robert A LaBudde:

> Did you create the 'status' variable the way indicated on p. 797?
> 
> Frequently with Surv() it pays to use syntax such as Surv(death, status==1) 
> to make a clear logical statement of what is an event (status==1) vs. 
> censored.
> 
> PS. Next time include head(seedlings) and str(seedlings) to make clear what 
> you are using as data.
> 
> 
> At 06:51 AM 6/28/2011, Jacob Brogren wrote:
>> Hi,
>> 
>> I ran the example on pp. 799-800 from Michael Crawley's "The R Book" using 
>> package survival v. 2.36-5, R 2.13.0 and RStudio 0.94.83. The model is a 
>> Cox's Proportional Hazards model. The result was quite different compared to 
>> the R Book. I have compared my code to the code in the book but can not find 
>> any differences in the function call. My results are attached as well as a 
>> link to the results presented in the book (link to Google Books).
>> 
>> When running the examples on pp. 797-799 I can't detect any differences in 
>> results so I don't think there are errors in the data set or in the creation 
>> of the status variable.
>> 
>> -
>> Original from the R Book:
>> http://books.google.com/books?id=8D4HVx0apZQC&lpg=PA799&ots=rQgd_8ofeS&dq=r%20coxph%20crawley&pg=PA799#v=onepage&q&f=false
>> 
>> -
>> My result:
>> > summary(model1)
>> Call:
>> coxph(formula = Surv(death, status) ~ strata(cohort) * gapsize,
>>data = seedlings)
>> 
>>  n= 60, number of events= 60
>> 
>>                                              coef exp(coef)  se(coef)      z Pr(>|z|)
>> gapsize                                 -0.001893  0.998109  0.593372 -0.003    0.997
>> gapsize:strata(cohort)cohort=September   0.717407  2.049112  0.860807  0.833    0.405
>> 
>>                                        exp(coef) exp(-coef) lower .95 upper .95
>> gapsize                                   0.9981      1.002    0.3120     3.193
>> gapsize:strata(cohort)cohort=September    2.0491      0.488    0.3792    11.074
>> 
>> Rsquare= 0.022   (max possible= 0.993 )
>> Likelihood ratio test= 1.35  on 2 df,   p=0.5097
>> Wald test            = 1.32  on 2 df,   p=0.5178
>> Score (logrank) test = 1.33  on 2 df,   p=0.514
>> 
>> Anyone have an idea why this is occurring?
>> 
>> Kind Regards
>> 
>> Jacob
>> __
>> R-help@r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
> 
> 
> Robert A. LaBudde, PhD, PAS, Dpl. ACAFS  e-mail: r...@lcfltd.com
> Least Cost Formulations, Ltd.URL: http://lcfltd.com/
> 824 Timberlake Drive Tel: 757-467-0954
> Virginia Beach, VA 23464-3239Fax: 757-467-2947
> 
> "Vere scire est per causas scire"
> 
> 

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] plotting groups via density and different colors

2011-07-18 Thread Jacob Kasper
I have a data set that looks like this:
dene <- data.frame(length =
c(35,32,33,34,41,40,46,35,41,40,45,36,38,37,39,40,42,42,42,43,44),
sex=c(1,1,1,1,2,2,2,1,2,2,2,1,2,2,2,2,2,2,2,2,2))

I would like to plot the density (frequency of occurrence) of each length
class but I want to have different colors for sex. I used the following:
library(sm)
sex.f<-factor(as.factor(dene$sex),levels=c(1,2),
labels=c("Males","Females"))
sm.density.compare(dene$length, dene$sex)
yet I want the sum of the areas under the two curves to equal 1, not the
area under each individual curve to equal 1.

I can plot the frequency using hist, but then I do not get the two colors
indicating the difference in sex.


hist(dene$length, freq=F, breaks=11)

any thoughts on how to approach this would be appreciated.

Thank you
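
One way to get the requested scaling with base graphics (a sketch, not specific to the sm package): weight each group's density by its share of the observations, so the two areas together sum to 1.

```r
dene <- data.frame(
  length = c(35,32,33,34,41,40,46,35,41,40,45,36,38,37,39,40,42,42,42,43,44),
  sex    = c(1,1,1,1,2,2,2,1,2,2,2,1,2,2,2,2,2,2,2,2,2))
d1 <- density(dene$length[dene$sex == 1])
d2 <- density(dene$length[dene$sex == 2])
w1 <- mean(dene$sex == 1)  # proportion of males
w2 <- mean(dene$sex == 2)  # proportion of females
d1$y <- d1$y * w1          # scale each curve by its group's share,
d2$y <- d2$y * w2          # so the two areas together integrate to 1
plot(d1, col = "blue", ylim = range(d1$y, d2$y),
     main = "Length density by sex", xlab = "length")
lines(d2, col = "red")
legend("topleft", legend = c("Males", "Females"),
       col = c("blue", "red"), lty = 1)
```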

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] lme( y ~ ns(x, df=splineDF)) error

2012-09-06 Thread Jacob Wegelin


I would like to fit regression models of the form

y ~ ns(x, df=splineDF)

where splineDF is passed as an argument to a wrapper function.

This works fine if the regression function is lm(). But with lme(),
I get two different errors, depending on how I handle splineDF
inside the wrapper function.

A workaround is to turn the lme() command, along with the appropriate
value of splineDF, into a text string and then use
eval(parse(text=mystring)).  Is there not a more elegant solution?

The function below demonstrates this.


sessionInfo()

R version 2.15.1 (2012-06-22)
Platform: x86_64-apple-darwin9.8.0/x86_64 (64-bit)

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] splines   stats graphics  grDevices utils datasets  methods
base

other attached packages:
[1] nlme_3.1-104  ggplot2_0.9.1

loaded via a namespace (and not attached):
 [1] colorspace_1.1-1   dichromat_1.2-4    digest_0.5.2       grid_2.15.1        labeling_0.1
 [6] lattice_0.20-6     MASS_7.3-18        memoise_0.1        munsell_0.3        plyr_1.7.1
[11] proto_0.3-9.2      RColorBrewer_1.0-5 reshape2_1.2.1     scales_0.2.1       stringr_0.6
[16] tools_2.15.1

The following function can also be downloaded from
http://jacobwegelin.net/tmp/r-help/.

wrapper <-function(WhichApproach=1, splineDF=4 ){
#   Create toy dataset
nID<-25
IDnames<-LETTERS[1:nID]

longdat<-data.frame( x=rep(-10:10, nID) / 3 , ID= rep( IDnames , each=21)
)
set.seed(5)
longdat$x<-longdat$x + rnorm(nrow(longdat))/10
IDeffect<-rnorm(nID) * 20
names(IDeffect) <- IDnames
longdat$IDeffect<-IDeffect[as.character(longdat$ID)]
longdat$y<- (longdat$x
+ longdat$x^3
+ (longdat$x-1)^4  / 5
+ 1/(abs(longdat$x/50) + 0.02)
+ longdat$IDeffect
+ rnorm(1:nrow(longdat)) * 2
)
longdat<-longdat[order(longdat$x),]

library(splines)
#   Calling ns within lm works fine:
mylm<- lm( y ~ ns(x,df=splineDF), data=longdat)
longdat$lmfit<-predict(mylm)
library(ggplot2)
print(
ggplot(longdat, aes(x, y))
+ geom_point(shape=1)
+ geom_line(aes(x=x, y=lmfit), color="red")
)


cat("Enter to attempt lme.")
readline()
library(nlme)

if(WhichApproach==1) {
# returns: "Error in eval(expr, envir, enclos) : object 'splineDF' not found"
#
# First make sure splineDF does not exist in .GlobalEnv, else we would be in WhichApproach==2.
if(exists("splineDF", where=".GlobalEnv")) remove(list="splineDF", pos=".GlobalEnv")
mylme<-lme(fixed= y ~ ns(x, df=splineDF)
, random= ~ 1 | ID
, correlation = corAR1()
, data=longdat
)
} else if(WhichApproach==2) {
# returns: "Error in model.frame.default(formula = ~ID + y + x + splineDF, data = list( : 
# variable lengths differ (found for 'splineDF')"

assign(x="splineDF", value=splineDF, pos=".GlobalEnv")
mylme<-lme(fixed= y ~ ns(x, df=splineDF)
, random= ~ 1 | ID
, correlation = corAR1()
, data=longdat
)
} else if (WhichApproach==3) {

thecommandstring<- paste("mylme<-lme(fixed= y ~ ns(x, df=", splineDF, ") , random= ~ 1 | ID , correlation = corAR1() , data=longdat)")
thecommandstring
eval(parse(text=thecommandstring))

} else {
stop(paste("WhichApproach=", WhichApproach, " not valid."))
}

mylme
longdat$fullfit<-predict(mylme)

library(ggplot2)
print(
ggplot( longdat, aes(x,y))
+ geom_point(shape=1)
+ facet_wrap( ~ ID )
+ geom_line( aes(x, fullfit), color="red")
)

invisible(mylme)
}
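
A possibly more elegant alternative to the eval(parse()) workaround of WhichApproach 3, sketched below: splice the numeric value into the formula with bquote(), so the formula lme() receives contains the literal number and no free splineDF variable to look up.

```r
# Build the fixed-effects formula with the df value already substituted in.
make_spline_formula <- function(splineDF) {
  eval(bquote(y ~ ns(x, df = .(splineDF))))
}
fml <- make_spline_formula(4)
fml   # y ~ ns(x, df = 4)
# Inside the wrapper this would replace WhichApproach 3:
#   mylme <- lme(fixed = make_spline_formula(splineDF),
#                random = ~ 1 | ID, correlation = corAR1(), data = longdat)
```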

Jacob A. Wegelin
Assistant Professor
Department of Biostatistics
Virginia Commonwealth University
830 E. Main St., Seventh Floor
P. O. Box 980032
Richmond VA 23298-0032
U.S.A.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] unable to load Hmisc in R 2.14.0

2011-11-10 Thread Jacob Wegelin


On my MacBook Pro (OS 10.6.8), after updating to R version 2.14.0 (2011-10-31) 
and reinstalling the Hmisc package, I am unable to load the Hmisc library.

Hmisc was working *before* I updated R.

Any idea what's wrong?

Details below.


install.packages("Hmisc", dependencies=TRUE)

trying URL
'http://cran.case.edu/bin/macosx/leopard/contrib/2.14/Hmisc_3.9-0.tgz'
Content type 'application/x-gzip' length 1457353 bytes (1.4 Mb)
opened URL
==
downloaded 1.4 Mb


The downloaded packages are in

/var/folders/HR/HRdt7XjbGC4gfgEM2M7yWE+++TM/-Tmp-//RtmpvPPB8R/downloaded_packages

library(Hmisc)

Error in dyn.load(file, DLLpath = DLLpath, ...) :
  unable to load shared object
'/Library/Frameworks/R.framework/Versions/2.14/Resources/library/Hmisc/libs/x86_64/Hmisc.so':
  
dlopen(/Library/Frameworks/R.framework/Versions/2.14/Resources/library/Hmisc/libs/x86_64/Hmisc.so,
6): Library not loaded:
@rpath/R.framework/Versions/2.14/Resources/lib/libR.dylib
  Referenced from:
/Library/Frameworks/R.framework/Versions/2.14/Resources/library/Hmisc/libs/x86_64/Hmisc.so
  Reason: image not found
Error: package/namespace load failed for ‘Hmisc’


sessionInfo()

R version 2.14.0 (2011-10-31)
Platform: x86_64-apple-darwin9.8.0/x86_64 (64-bit)

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] splines   stats graphics  grDevices utils datasets  methods
base

other attached packages:
[1] survival_2.36-10

loaded via a namespace (and not attached):
[1] cluster_1.14.1 grid_2.14.0    lattice_0.20-0 psych_1.1.11   tools_2.14.0




Jacob A. Wegelin
Assistant Professor
Department of Biostatistics
Virginia Commonwealth University
830 E. Main St., Seventh Floor
P. O. Box 980032
Richmond VA 23298-0032

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] rJava load failure, 64-bit R, 64-bit Java, R 2.14.1 (works fine on R 2.12.2 64-bit, same computer)

2012-02-07 Thread Strunk, Jacob
Hello R users,

I have encountered difficulty in attempting to load the package "rJava" for 
64-bit R 2.14.1. It appears that the problem is specific (for my system) to 
this version of R. I had no trouble loading rJava on an earlier version 
(2.12.2) of R.

N:\>java -version
java version "1.6.0_25"
Java(TM) SE Runtime Environment (build 1.6.0_25-b06)
Java HotSpot(TM) 64-Bit Server VM (build 20.0-b11, mixed mode)

I used the following to install rJava:
install.packages('rJava', .libPaths()[1], 'http://www.rforge.net/')

> library(rJava)
Error : .onLoad failed in loadNamespace() for 'rJava', details:
  call: inDL(x, as.logical(local), as.logical(now), ...)
  error: unable to load shared object 'C:/Program 
Files/R/R-2.14.1/library/rJava/libs/x64/rJava.dll':
  LoadLibrary failure:  The specified path is invalid.

Error: package/namespace load failed for 'rJava'

The specified file is in fact present, and my R console says "(64-bit)" at the 
top

I am aware of the following discussion on this topic:
http://stackoverflow.com/questions/2399027/cannot-load-rjava-because-cannot-load-a-shared-library

I have updated my path to include the following:
C:\Program Files\R\R-2.14.1\bin\x64
And
C:\Program Files\Java\jre6\bin\

When I download the package and install from a local zip I get the following 
error:

> library(rJava)
Error in get(Info[i, 1], envir = env) :
  cannot allocate memory block of size 2.9 Gb
Error: package/namespace load failed for 'rJava'

There should not be any memory issues:

> memory.limit()
[1] 16381


Thank you!
Jacob L Strunk

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Help

2008-05-15 Thread Jacob, Mody
Dear All:

I am new to Linux and I have inherited a Red Hat Linux system with "R Version 
2.1.0 (2005-04-18)" installed. Can somebody help me with step-by-step 
instructions to download the new version and install it?

Thanks
mjacob




The contents of this communication, including any attachments, may be 
confidential, privileged or otherwise protected from disclosure.  They are 
intended solely for the use of the individual or entity to whom they are 
addressed.  If you are not the intended recipient, please do not read, copy, 
use or disclose the contents of this communication.  Please notify the sender 
immediately and delete the communication in its entirety.

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Graphics after invoking R from the command line

2007-11-28 Thread Mithun Jacob
I've tried running graphics commands like plot by invoking R at the
command line but the graphics window does not appear. I'm using
R-2.6.0 on Windows XP and am using the cmd shell. Here's a sample
session:

R --slave --save --file=-
x<-c(1,2,3,4)
plot(x,x)

This leads to nothing. I've found a way around it, but the graph
window is frozen and commands to it have to be sent twice for the data
to appear.

R --slave --save --file=-
x<-c(1,2,3,4)
windows()
plot(x,x)
plot(x,x)

Am I doing something wrong here? It would be nice if I could display a
graphics window which isn't frozen.

Regards,
Mithun

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Graphics after invoking R from the command line

2007-11-28 Thread Mithun Jacob
>>Duncan Murdoch

I wish to run R with Visual C++ as a front end. So I was hoping to run
a file such as graph.r in the following manner:

R --slave --save --file=graph.r

But when the plot function runs, it does not display the graph.

>>Prof Brian Ripley:
So would I be correct to assume that it's not possible to run graphics
in batch mode?

Thanks for all the help!

Regards,
Mithun

On 11/28/07, Prof Brian Ripley <[EMAIL PROTECTED]> wrote:
> On Wed, 28 Nov 2007, Mithun Jacob wrote:
>
> > I've tried running graphics commands like plot by invoking R at the
> > command line but the graphics window does not appear. I'm using
> > R-2.6.0 on Windows XP and am using the cmd shell. Here's a sample
> > session:
> >
> > R --slave --save --file=-
> > x<-c(1,2,3,4)
> > plot(x,x)
> >
> > This leads to nothing. I've found a way around it, but the graph
> > window is frozen and commands to it have to be sent twice for the data
> > to appear.
> >
> > R --slave --save --file=-
> > x<-c(1,2,3,4)
> > windows()
> > plot(x,x)
> > plot(x,x)
> >
> > Am I doing something wrong here? It would be nice if I could display a
> > graphics window which isn't frozen.
>
> So use R interactively, rather than in batch mode typing at stdin. (Using
> --file puts you into batch mode.) Only in interactive use does the event
> loop give time to windows such as the screen device, pagers, data frame
> viewers 
>
> I have no idea where you got the idea of 'R --slave --save --file=-' from:
> the way to use R 'at the command line' on Windows is to use Rterm.
> 'Rterm --slave --save' would do the same without the freezing, but most
> people do find prompts useful.
>
> --
> Brian D. Ripley,  [EMAIL PROTECTED]
> Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
> University of Oxford, Tel:  +44 1865 272861 (self)
> 1 South Parks Road, +44 1865 272866 (PA)
> Oxford OX1 3TG, UKFax:  +44 1865 272595
>
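
If the end goal is simply to produce graphics from a batch run, an alternative that sidesteps the event loop entirely is to draw to a file device rather than a screen window (a minimal sketch; pdf() is available on every platform):

```r
# Write the plot to a file; no interactive window or event loop required.
pdf("plot.pdf", width = 6, height = 4)
x <- c(1, 2, 3, 4)
plot(x, x)
dev.off()
file.exists("plot.pdf")  # TRUE
```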

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] ggplot2: specifying legend titles

2009-03-20 Thread Etches Jacob
I am trying to specify a legend title to be other than the variable  
name, but I find that the legend splits because scale_shape() takes  
effect but scale_colour() does not.  Can someone spot my error?   
Here's some toy code that produces the problem on my system (R2.8.1,  
windows, today's CRAN package updates):


library(ggplot2)
a <- sample(letters[1:5],100,replace=T)
b <- rnorm(100)
c <- rnorm(100)
qplot(b,c,shape=a,colour=a) +
  scale_colour("Better legend title") +
  scale_shape("Better legend title")
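
One fix, assuming a reasonably current ggplot2: give both aesthetics the same legend title, e.g. via labs(), and the colour and shape legends merge into one. (In older ggplot2 versions, naming both discrete scales identically, scale_colour_discrete("...") plus scale_shape_discrete("..."), had the same effect.)

```r
library(ggplot2)
set.seed(1)
a <- sample(letters[1:5], 100, replace = TRUE)
b <- rnorm(100)
c <- rnorm(100)
# Giving BOTH aesthetics the same title lets ggplot2 draw a single,
# combined colour+shape legend instead of two split ones.
ggplot(data.frame(a, b, c), aes(b, c, shape = a, colour = a)) +
  geom_point() +
  labs(colour = "Better legend title", shape = "Better legend title")
```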



Any review or distribution by anyone other than the person for whom it was 
originally intended is prohibited.
If you have received this e-mail in error please delete all copies.
Opinions conclusions or other information contained in this e-mail may not be 
that of the organization.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] xYplot, error bars, log scale

2008-02-05 Thread Jacob Etches
I'm new to Hmisc and trying to get the following to work, but if I uncomment
the y-scale list (in order to get a log scale for the hazard ratio), the
error bars become strangely large.  The dataframe is simply ODS output from
TPHREG in SAS.  Can someone point me towards what I'm sure is a naive error?
Many thanks!

tphreg1_parameterestimates1987$LCL <-
  exp(tphreg1_parameterestimates1987$estimate -
      qnorm(.975)*tphreg1_parameterestimates1987$stderr)
tphreg1_parameterestimates1987$UCL <-
  exp(tphreg1_parameterestimates1987$estimate +
      qnorm(.975)*tphreg1_parameterestimates1987$stderr)

xYplot(Cbind(exp(estimate),LCL,UCL) ~ lag|malef*ftf, groups=classval0,  
as.table=T,
ylim=c(.8,4), pch=1:4, type="b", method="alt bars", label.curves=F,  
lty.bar=3,
data=tphreg1_parameterestimates1987, subset=ft != 2 & parameter ==  
"finc_mean_qu",
scales=list(
#y=list(log="e",at=c(1,1.25,1.5,2,2.5,3)),
x=list(at=seq(3,18,by=3))
),
key=list(columns=4,
text=list(c("Q0: Poorest","Q1","Q2","Q3")),
points=list(pch = 1:4,col = trellis.par.get("superpose.symbol") 
$col[1:4])
)
)







Doctoral candidate, Epidemiology Program
Department of Public Health Sciences, University of Toronto Faculty of  
Medicine

Research Associate
Institute for Work & Health
800-481 University Avenue, Toronto, Ontario, Canada   M5G 2E9
T: 416.927.2027 ext. 2290
F: 416.927.4167
[EMAIL PROTECTED]
www.iwh.on.ca

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] ggplot2: can one have separate ylim for each facet?

2008-11-17 Thread Etches Jacob

In lattice

#toy data
library(ggplot2)
library(lattice)
x <- rnorm(100)
y <- rnorm(100)
k <- sample(c("Weak","Strong"),100,replace=T)
j <- sample(c("Tall","Short"),100,replace=T)
w <- data.frame(x,y,j,k)

xyplot(y~x|j+k,scales=list(y=list(relation="free")))

will give you a scale in each subplot with a range equal to the range  
of y within each subplot.


Is this possible using ggplot2?

qplot(x,y,data=w) + facet_grid(j~k) + ylim(-2,2)

produces a plot with the same range in each subplot.  Can the lattice  
behaviour be reproduced in ggplot2?
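
In current ggplot2 (this argument did not exist at the time of the original post, so treat it as the modern answer), facet_wrap() accepts a scales argument that reproduces lattice's relation = "free" panel by panel:

```r
library(ggplot2)
set.seed(1)
w <- data.frame(x = rnorm(100), y = rnorm(100),
                j = sample(c("Tall", "Short"), 100, replace = TRUE),
                k = sample(c("Weak", "Strong"), 100, replace = TRUE))
# scales = "free_y" gives each panel its own y-axis range,
# as in the lattice example above.
ggplot(w, aes(x, y)) + geom_point() + facet_wrap(j ~ k, scales = "free_y")
```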


Thanks,

Jacob Etches

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] mixed exponential distribution

2008-12-10 Thread Fazekas, Jacob
Good morning,
 
Is there anyway to do Mixed Exponential Distribution in R?  I am trying
to load some lag-weighted empirical survival distribution into R and run
a mixed exponential on that data.
 
Thanks, 
Jacob Fazekas
Jacob Fazekas
Assistant Actuary
Auto-Owners Insurance Company
517-703-2543
[EMAIL PROTECTED]

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] R in Windows 11

2024-10-09 Thread Jacob Williams
Hi,

What is the minimum version of R supported on Windows 11?

Thanks,
Jacob

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide https://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Help with dates

2017-12-28 Thread Simmering, Jacob E
Your dates have incomplete year information: "34" is only a two-digit year. R 
assumes that 00-68 mean 2000 to 2068 and that 69-99 mean 1969 to 1999. See 
?strptime, under the details for %y. 

You can either append “19” to the start of your year variable to make it 
completely express the year or check if the date is in the future (assuming all 
dates should be in the past) and subtract 100 years from the date. 
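
Both fixes can be sketched as follows (the regular expression assumes month/day/two-digit-year strings like the poster's):

```r
x <- "02/27/34"                          # intended to mean 1934-02-27
d <- as.Date(x, format = "%m/%d/%y")
d                                        # 2034-02-27 under the 00-68 rule
# Fix 1: make the century explicit before parsing.
d1 <- as.Date(sub("/(\\d{2})$", "/19\\1", x), format = "%m/%d/%Y")
# Fix 2: parse as-is, then roll dates in the future back 100 years.
d2 <- if (d > Sys.Date()) seq(d, by = "-100 years", length.out = 2)[2] else d
format(d1, "%Y")  # "1934"
format(d2, "%Y")  # "1934"
```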


> On Dec 28, 2017, at 11:13 AM, Ramesh YAPALPARVI 
>  wrote:
> 
> Hi all,
> 
> I’m struggling to get the dates in proper format.
> I have dates as factors and is in the form 02/27/34( 34 means 1934). If I use
> 
> as.Date with format %d%m%y it gets converted to 2034-02-27. I tried changing 
> the origin in the as.Date command but nothing worked. Any help is appreciated.
> 
> Thanks,
> Ramesh
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Problems with "predict" function

2018-01-31 Thread Simmering, Jacob E
Your messages about masking come from attaching your data set to the R session. 
In general, that is bad practice as it leads to confusing code. It is typically 
better to use the “data” argument in things like lm() to accomplish this task. 

As near as I can tell, your second set of predictions is not working because 
your call to lm() directly references vectors from the highdf data frame. If 
you do this:

h.lm <- lm(sales ~ month, data = highdf)
news <- data.frame(month = nrow(ptr) + 1)
hcs <- predict(h.lm, news, interval = "predict")

You should see the expected results. Note that here I’m directly referring to 
the variables “sales” and “month” and not using the bracket notation. 
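
A self-contained sketch of that pattern, with toy data standing in for the poster's highdf:

```r
set.seed(1)
highdf <- data.frame(month = 1:13,
                     sales = 100 * (1:13) + rnorm(13, sd = 20))
# Fit against the data frame, then predict for a new month whose column
# name matches the variable used in the formula.
h.lm <- lm(sales ~ month, data = highdf)
news <- data.frame(month = 14)
predict(h.lm, newdata = news, interval = "prediction")
# one row with columns fit, lwr, upr
```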

> On Jan 31, 2018, at 11:08 AM, WRAY NICHOLAS via R-help  
> wrote:
> 
> Hello,
> 
> I am synthesising some sales data over a twelve month period, and then trying 
> to
> use the "predict" function, firstly to generate a thirteenth month forecast 
> with
> upper and lower 95% confidence limits.  So far so good
> 
> But what I then want to do is add the upper sales value at the 95% confidence
> limit to the vector of thirteen months and their respective sales, creating a
> fourteenth month with a predicted sale and its 95% upper confidence limit, and
> so on.  The idea is to create a "trumpet" of extreme positions.
> But instead of one line of predictions for the fourteenth month, I keep
> getting a whole set.  What I don't understand is why it works OK with my
> original synthetic set of twelve months but not with the set of thirteen
> sales data points, even though as far as I can see I'm just repeating the
> process, albeit with a different label.  I have tried different column
> labels in case that was the problem, but it doesn't seem to make any
> difference.
> 
> I am also getting these weird warning messages telling me that things are 
> being
> "masked":
> 
> The following object is masked _by_ .GlobalEnv:
> 
> sales
> 
> The following object is masked from highdf (pos = 4):
> 
> sales
> Etc
> 
> Is it something to do with attaching the various data frames?  I am a bit at 
> sea
> on this and would be thankful for any pointers
> 
> Nick
> 
> My code:
> 
> 
> m<-runif(1,0,1)
> m
> mres<-m*(seq(1,12))
> mres
> ssd<-rexp(1,1)
> ssd
> devs<-rep(0,length(mres))
> for(i in 1:length(mres)){devs[i]<-rnorm(1,0,ssd)}
> devs
> plot(-10,-10,xlim=c(1,24),ylim=c(0,2))
> sales<-round((mres+devs)*1000)
> 
> points(sales,pch=19)
> 
> ptr<-cbind(1:length(sales),sales,sales,sales)
> 
> ptr
> sdf<-data.frame(cbind(1:nrow(ptr),sales))
> sdf
> 
> colnames(sdf)<-c("monat","mitte")
> sdf
> attach(sdf)
> s.lm<-lm(mitte~monat)
> 
> s.lm
> abline(s.lm,lty=2)
> news<-data.frame(monat=nrow(sdf)+1)
> news
> fcs<-predict(s.lm,news,interval="predict")
> fcs
> 
> points(1+nrow(ptr),fcs[,1],col="grey",pch=19)
> points(1+nrow(ptr),fcs[,2])
> points(1+nrow(ptr),fcs[,3])
> ptr<-rbind(ptr,c(1+nrow(ptr),fcs[2],fcs[1],fcs[3]))
> ptr
> 
> highdf<-data.frame(ptr[,c(1,4)])
> highdf
> colnames(highdf)<-c("month","sales")
> highdf
> 
> attach(highdf)
> h.lm<-lm(highdf[,2]~highdf[,1])
> h.lm
> abline(h.lm,col="gray",lty=2)
> news<-data.frame(month=nrow(ptr)+1)
> news
> hcs<-predict(h.lm,news,interval="predict")
> hcs

Re: [R] Too strange that I cannot install several packages

2017-04-10 Thread Simmering, Jacob E
Bruce,

`sample` doesn’t appear to be an R package. `resample` installed for me. 
`apply` and `sapply` aren’t packages either. 

`sample`, `apply` and `sapply` are all functions, however. 
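A quick way to check this from the console (a base-R sketch):

```r
# "sample" is a base function, not a CRAN package, so there is nothing
# to install -- it is available in every R session.
exists("sample")                      # TRUE
environmentName(environment(sample))  # "base"

# A name with no matching object in the session is a candidate for
# install.packages() (assuming the resample package is not loaded here):
exists("resample")
```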



> On Apr 10, 2017, at 12:08 PM, BR_email wrote:
> 
> Hi Rers:
> Is there anything I can check for as to why I cannot install several 
> packages, i.e., sample, resample, resample_bootstrap, apply, sapply, ...?
> 
> The error message I get is:
> 
>> install.packages("sample")
> Installing package into ‘C:/Users/BruceRatner/Documents/R/win-library/3.3’
> (as ‘lib’ is unspecified)
> Warning in install.packages :
>  package ‘sample’ is not available (for R version 3.3.3)
> 
> 
> Thanks.
> Bruce
> 

Re: [R] Could not load Package 'rgl'

2017-04-12 Thread Simmering, Jacob E
Christofer,

This SO thread may be helpful:

http://stackoverflow.com/questions/33634871/installing-rgl-package-in-r-mac-osx-el-captian-fixed


On Apr 12, 2017, at 12:22 PM, Christofer Bogaso wrote:

Hi again,

I could not load the 'rgl' package with below Error details :

library(rgl)
Error : .onLoad failed in loadNamespace() for 'rgl', details:
 call: dyn.load(file, DLLpath = DLLpath, ...)
 error: unable to load shared object
'/Library/Frameworks/R.framework/Versions/3.3/Resources/library/rgl/libs/rgl.so':
 
dlopen(/Library/Frameworks/R.framework/Versions/3.3/Resources/library/rgl/libs/rgl.so,
6): Library not loaded: /opt/X11/lib/libGLU.1.dylib
 Referenced from:
/Library/Frameworks/R.framework/Versions/3.3/Resources/library/rgl/libs/rgl.so
 Reason: image not found
Error: package or namespace load failed for ‘rgl’

I am using R under below environment:

R version 3.3.2 (2016-10-31) -- "Sincere Pumpkin Patch"
Copyright (C) 2016 The R Foundation for Statistical Computing
Platform: x86_64-apple-darwin13.4.0 (64-bit)

Could you please help how to resolve this error.

Thanks for your time.


Re: [R] seek non-black box alternative to randomForest

2017-05-30 Thread Simmering, Jacob E
Barry, 

This is mostly a mailing list about R - you may have more luck with 
statistical questions on stats.stackexchange.com. 

That said - the editor is wrong. The limitation of trees that random forests 
“solve” is overfitting. The mechanism by which a random forest classifier is 
built is not a black box - some number of features and some number of rows are 
selected to produce each split. The reason why this approach avoids the issues 
associated with trees is also clear. These are theory-based claims. The random 
selection is critical to the function of the process. I’d suggest resubmitting 
the paper to a different journal instead of trying to find some way to fit a 
random forest without the random part.  
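The row and feature sampling being described can be made completely explicit. A base-R sketch of the draws a forest makes (illustrative only, not the randomForest package's implementation):

```r
set.seed(42)
# Each tree sees a bootstrap sample of rows and considers a random subset
# of mtry candidate features. Every draw is fully inspectable, so nothing
# about the sampling procedure itself is hidden.
n_trees <- 5
mtry    <- 3
draws <- lapply(seq_len(n_trees), function(i) {
  list(rows     = sample(nrow(mtcars), replace = TRUE),         # bootstrap rows
       features = sample(setdiff(names(mtcars), "mpg"), mtry))  # feature subset
})
draws[[1]]$features  # the candidate predictors tree 1 would consider
```

The "randomness" is just these two sample() calls applied per tree; the resulting splits can be printed and audited like any other tree.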


> On May 30, 2017, at 1:54 PM, Barry King  wrote:
> 
> I've recently had a research manuscript rejected by an editor. The
> manuscript showed that for a real life data set, random forest outperformed
> multiple linear regression with respect to predicting the target variable.
> The editor's objection was that random forest is a black box where the
> random assignment of features to trees was intractable. I need to find an
> alternative method to random forest that does not suffer from the black box
> label. Any suggestions? Would caret::treebag be free of random assignment
> of features? Your assistance is appreciated.
> --
> 

[R] interaction between clusterMap(), read.csv() and try() - try does not catch error

2016-08-08 Thread Strunk, Jacob (DNR)
Hello, I am attempting to process a list of csv files in parallel, some of which 
may be empty and fail with read.csv. I tend to use clusterMap as my go-to 
parallel function but have run into an interesting behavior. The behavior is 
that try(read.csv(x)) does not catch read errors resulting from having an empty 
csv file inside of clusterMap. I have not tested this with other functions 
(e.g. read.table, mean, etc.). The parLapply function does, it appears, 
correctly catch the errors. Any suggestions on how I should code with 
clusterMap such that try is guaranteed to catch the error?


I am working on Windows Server 2012.
I have the latest version of R and parallel.
I am executing the code from within the RStudio IDE, version 0.99.896.

Here is a demonstration of the failure

R code used in demonstration:
#prepare csv files - an empty file and a file with data
close(file("c:/temp/badcsv.csv",open="w"))
write.table(data.frame(x=2),"c:/temp/goodcsv.csv")

#prepare a parallel cluster
library(parallel)
clus0=makeCluster(1, rscript_args = "--no-site-file")

#read good / bad files in parallel with parLapply - which succeeds: try Does 
catch err
x1=parLapply(clus0,c("c:/temp/badcsv.csv","c:/temp/goodcsv.csv"),function(...)try(read.csv(...)))
print(x1)

#read good / bad files in parallel with clusterMap - which fails: try does Not 
catch error
x0=clusterMap(clus0,function(...)try(read.csv(...)),c("c:/temp/badcsv.csv","c:/temp/goodcsv.csv"),SIMPLIFY=F)
print(x0)

R output:

> #prepare csv files - an empty file and a file with data
> close(file("c:/temp/badcsv.csv",open="w"))
> write.table(data.frame(x=2),"c:/temp/goodcsv.csv")
>
> #prepare a parallel cluster
> clus0=makeCluster(1, rscript_args = "--no-site-file")
>
> #read good / bad files in parallel with parLapply - which succeeds: try Does 
> catch err
> x1=parLapply(clus0,c("c:/temp/badcsv.csv","c:/temp/goodcsv.csv"),function(...)try(read.csv(...)))
> print(x1)
[[1]]
[1] "Error in read.table(file = file, header = header, sep = sep, quote = 
quote,  : \n  no lines available in input\n"
attr(,"class")
[1] "try-error"
attr(,"condition")
<simpleError in read.table(file = file, header = header, sep = sep, quote = 
quote, dec = dec, fill = fill, comment.char = comment.char, ...): no lines 
available in input>
[[2]]
x
1 1 2

>
> #read good / bad files in parallel with clusterMap - which fails: try does 
> Not catch error
> x0=clusterMap(clus0,function(...)try(read.csv(...)),c("c:/temp/badcsv.csv","c:/temp/goodcsv.csv"),SIMPLIFY=F)
Error in checkForRemoteErrors(val) :
  one node produced an error: Error in read.table(file = file, header = header, 
sep = sep, quote = quote,  :
  no lines available in input
> print(x0)
Error in print(x0) : object 'x0' not found
>


Thanks for any help,
Jacob




Re: [R] interaction between clusterMap(), read.csv() and try() - try does not catch error

2016-08-08 Thread Strunk, Jacob (DNR)
Ok - got it, I can handle that. Thank you Luke!


Jacob L Strunk
___
From: luke-tier...@uiowa.edu [luke-tier...@uiowa.edu]
Sent: Monday, August 08, 2016 12:17 PM
To: Strunk, Jacob (DNR)
Cc: r-help@r-project.org
Subject: Re: [R] interaction between clusterMap(), read.csv() and try() - try 
does not catch error

try is working fine. The problem is that your remote function is
returning the try-error result, which the parallel infrastructure is
interpreting as an error on the remote node, since the remote calling
infrastructure is using try as well. This could be implemented more
robustly, but in any case it would probably be better for your code to
use tryCatch and have the error function return something easier to
work with, like NULL.
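A sketch of that approach (temp files stand in for the original's c:/temp paths):

```r
library(parallel)
# Sketch: convert errors to NULL on the worker with tryCatch, so clusterMap's
# own error checking never sees a "try-error" object to re-raise on the master.
safe_read <- function(f) tryCatch(read.csv(f), error = function(e) NULL)

bad  <- tempfile(fileext = ".csv"); file.create(bad)  # empty file -> read error
good <- tempfile(fileext = ".csv")
write.csv(data.frame(x = 2), good, row.names = FALSE)

clus0 <- makeCluster(1)
res <- clusterMap(clus0, safe_read, c(bad, good), SIMPLIFY = FALSE)
stopCluster(clus0)
# res[[1]] is NULL (unreadable file); res[[2]] is a one-row data frame
```

The NULL entries can then be filtered out afterwards, e.g. with `Filter(Negate(is.null), res)`.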

Best,

luke


--
Luke Tierney
Ralph E. Wareham Professor of Mathematical Sciences
University of Iowa                Phone: 319-335-3386
Department of Statistics and      Fax:   319-335-3017
  Actuarial Science
241 Schaeffer Hall                email: luke-tier...@uiowa.edu
Iowa City, IA 52242               WWW:   http://www.stat.uiowa.edu



[R] package.skeleton fails

2016-08-24 Thread Strunk, Jacob (DNR)
Hello, I have been using package.skeleton from within an lapply statement
successfully (assuming good source code) with the following setup in the
past:

writeLines("testfun=function(){}", "c:\\temp\\testfun.r")
x=try(package.skeleton("test_pack",path="c:\\temp\\tests\\",code_files= 
"c:\\temp\\testfun.r"))

but it now fails with the error:

Error: evaluation nested too deeply: infinite recursion /
options(expressions=)?

I am working in RStudio Version 0.99.896, with 64 bit R version 3.3.1
(2016-06-21)

I have been poking in the code and the error appears to happen within the 
subfunction '.fixPackageFileNames'.

Thanks for any assistance you might be able to provide.

Jacob




Re: [R] package.skeleton fails

2016-08-29 Thread Strunk, Jacob (DNR)
Ok, I apologize - we seem to have a component in our environment that interacts 
with this function. Thank you for your help.

You wouldn't happen to know if there is a way to modify the way the 
environment is loaded in package.skeleton? I don't see any way to pass it a 
configuration option like "--no-site-file".

Thank you,

Jacob 

-Original Message-
From: Uwe Ligges [mailto:lig...@statistik.tu-dortmund.de] 
Sent: Sunday, August 28, 2016 7:28 AM
To: Strunk, Jacob (DNR) ; r-help@r-project.org
Subject: Re: [R] package.skeleton fails

Your code works for me, and I do not see any lapply in the example you provide 
below.

Best,
Uwe Ligges





[R] Simulating points from GLM corresponding to new x-values

2009-08-12 Thread Jacob Nabe-Nielsen
Dear List,

Does anyone know how to simulate data from a GLM object corresponding  
to values of the independent (x) variable that do not occur in the  
original dataset?

I have tried using simulate(), but it generates a new value of the  
dependent variable corresponding to each of the original x-values,  
which is not what I need. Ideally I would like to simulate new values  
for GLM objects both with family="gaussian" and with family="binomial".

Thanks in advance,
Jacob

Jacob Nabe-Nielsen, PhD, MSc
Scientist
  --
Section for Climate Effects and System Modelling
Department of Arctic Environment
National Environmental Research Institute
Aarhus University
Frederiksborgvej 399, Postbox 358
4000 Roskilde, Denmark

email: n...@dmu.dk
fax: +45 4630 1914
phone: +45 4630 1944




Re: [R] Simulating points from GLM corresponding to new x-values

2009-08-12 Thread Jacob Nabe-Nielsen

Hi Cliff -- thanks for the suggestion.

I tried extracting the predicted mean and standard error using  
predict(). Afterwards I simulated the dependent variable using  
rnorm(), with mean and standard deviation taken from the predict()  
function (sd = sqrt(n)*se). The points obtained this way were  
scattered far too much (compared to points obtained with simulate())  
-- I am not quite sure why.


Unfortunately the documentation of the simulate() function does not  
provide much information about how it is implemented, which makes it  
difficult to judge which method is best (predict() or simulate()), and  
it is also unclear whether simulate() can be applied to GLMs (with  
family=gaussian or family=binomial).


Any suggestions for how to proceed?
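One possible approach (a sketch, not from the thread): take the fitted mean at the new x-values from predict() and draw new observations from the corresponding family, using the residual standard deviation (sigma(fit), not the standard error of the mean) for the gaussian case:

```r
set.seed(1)
# Sketch: simulate new responses at x-values absent from the original data.
x   <- 1:12
y   <- 100 * x + rnorm(12, sd = 50)
fit <- glm(y ~ x, family = gaussian)

newx <- data.frame(x = 13:15)               # x-values not in the data
mu   <- predict(fit, newdata = newx, type = "response")
ysim <- rnorm(nrow(newx), mean = mu, sd = sigma(fit))  # gaussian case
ysim

# Binomial case (assuming a fitted binomial glm `binfit`): draw 0/1
# outcomes from the predicted probabilities instead.
# p <- predict(binfit, newdata = newx, type = "response")
# rbinom(nrow(newx), size = 1, prob = p)
```

Using sigma(fit), the residual sd, rather than the standard error from predict(..., se.fit = TRUE) may explain the excess scatter reported above: se.fit measures uncertainty in the mean, not the spread of individual observations.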

Jacob


On 12 Aug 2009, at 13:11, Clifford Long wrote:


Would the "predict" routine (using 'newdata') do what you need?

Cliff Long
Hollister Incorporated







