from:"Benjamin"

[R] specific package to laply

2015-05-14 Thread Benjamin

Hi,

I have a batch jobs problem for parallel code without sudo.  When I try to
send a package to the different nodes, as follows:

.libPaths( c(.libPaths(),
 "/my/first/library",
 "/my/second/library")
  )
library(foreach)
library(iterators)
library(parallel)
library(doParallel)
library(rvest)
cl <- makeCluster(detectCores())
registerDoParallel(cl)
sites <- paste0("https://www.site",1:2,".com";)

html0 <- foreach(i=sites,.packages='rvest') %dopar% html(i)


I get the following output:

Error in e$fun(obj, substitute(ex), parent.frame(), e$data) :
  worker initialization failed: there is no package called ‘rvest’
Calls: %dopar% -> 

Presumably, I need a way to export my .libPaths() to the nodes.  Any
suggestions?

Thanks,
Benjamin

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] party package: is cforest really bagging?

2015-12-15 Thread Benjamin

Hi all,

I'm using the "party" package to create random forest of regression trees. I've
created a ForestControl class in order to limit my number of trees (ntree),
of nodes (maxdepth) and of variables I use to fit a tree (mtry). One thing
I'm not sure of is if the cforest algo is using subsets of my training set
for each tree it generates or not.

I've seen in the documentation that it is bagging so I assume it should.
But I'm not sure to understand well what the "subset" input is in that
function.

I'm also puzzled by the results I get using ctree: when plotting the tree,
I see that all my variables of my training set are classified in the
different terminal tree nodes while I would have exepected that it only
uses a subset here too.

So my question is, is cforest doing the same thing as ctree or is it really
bagging my training set?

Thanks in advance for you help!

Ben

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Package random forest: can corr.bias be harmful?

2013-06-21 Thread Benjamin

Hi, Christian.

It sounds like to the maintainer feels that the package needs improvements,
i.e., it may not function as expected.  However, you may contact him
directly using maintainer("randomForest").

Best,
Benjamin


On Fri, Jun 21, 2013 at 9:36 AM, Christian Kampichler <
christian.kampich...@sovon.nl> wrote:

> There exists an option corr.bias for making regression analyses with the R
> package randomForest. The manual warns Experimental. Use at your own
> risk. What does corr.bias precisely do and why can it be dangerous to use
> it? As far as I know the bias correction is based on a linear regression
> between the out-of-bag-prediction and the observed values which is used for
> correction of all predictions made by the model, but this does not appear
> so harmful to me.
>
> --
> Dr. Christian Kampichler
> Sovon Dutch Centre for Field Ornithology
> PO Box 6521, 6503 GA Nijmegen, The Netherlands
> +31 24 7 410 410
> www.sovon.nl
>
> __**
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/**listinfo/r-help<https://stat.ethz.ch/mailman/listinfo/r-help>
> PLEASE do read the posting guide http://www.R-project.org/**
> posting-guide.html <http://www.R-project.org/posting-guide.html>
> and provide commented, minimal, self-contained, reproducible code.
>

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] read.table(..., header == FALSE, colClasses = )

2017-10-23 Thread Benjamin Tyner


Hello

I noticed that starting with R version 3.3.0 onward, this generates a 
warning:


   > txt <- c("a", "3.14")
   > read.table(file = textConnection(txt), header = FALSE, colClasses 
= c(x = "character", y = "numeric"))


the warning is "not all columns named in 'colClasses' exist" and I guess 
the change was made in response to this?


   https://bugs.r-project.org/bugzilla3/show_bug.cgi?id=16478

Regardless, I am wondering whether this is desirable, that as a result 
of the change, the code has become stricter about the presence of a 
(formerly) harmless names attribute. Or am I missing something?


Regards

Ben

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] read.table(..., header == FALSE, colClasses = )

2017-10-24 Thread Benjamin Tyner

Jeff,

Thank you for your reply. The intent was to construct a minimum 
reproducible example. The same warning occurs when the 'file' argument 
points to a file on disk with a million lines. But you are correct, my 
example was slightly malformed and in fact gives an error under R 
version 3.2.2. Please allow me to try again; in older versions of R,

   > read.table(file = textConnection("a\t3.14"), header = FALSE, 
colClasses = c(x = "character", y = "numeric"), sep="\t")

     V1   V2
   1  a 3.14

(with no warning). As of version 3.3.0,

   > read.table(file = textConnection("a\t3.14"), header = FALSE, 
colClasses = c(x = "character", y = "numeric"), sep="\t")

     V1   V2
   1  a 3.14
   Warning message:
   In read.table(file = textConnection("a\t3.14"), header = FALSE,  :
     not all columns named in 'colClasses' exist

My intent was not to complain but rather to learn more about best 
practices regarding the names attribute.

Regards

Ben

On 10/23/2017 08:51 PM, Jeff Newmiller wrote:

You are constructing the equivalent of a two-line data file, and complaining 
that it is not treating it like it was one line. If it did used to accept this 
silently [skeptical] then I for one am glad it produces a warning now.

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] read.table(..., header == FALSE, colClasses = )

2017-10-24 Thread Benjamin Tyner

Yes, it makes sense now; lesson learned. Thank you both! Sometimes it 
seems that no matter how good the documentation, some useR will 
inevitably (ab)use the code in ways that were never intended by the 
authors. Then when the code and/or documentation changes, it is not 
always obvious to the useR whether the intent of the authors has 
changed, or whether the useR had just been "getting the right answer for 
the wrong reason" all along. In this particular case, the change was 
documented as stemming from a "new feature" (as opposed to a bugfix or 
more stringent argument checking) that might appear to be a non fully 
backwards compatible change. For example one might have the (apparently) 
bad habit of using col.names as a shortcut to rename headers on-the-fly ...

   > getRversion()
   [1] ‘3.2.2’

   > read.table(textConnection("x y\na 3.14"), header = TRUE, 
colClasses = c(x = "character", y = "numeric"), col.names = c("foo", "bar"))

     foo  bar
   1   a 3.14

but indeed, the names attribute has zero effect on the result:

   > read.table(textConnection("x y\na 3.14"), header = TRUE, 
colClasses = c(y = "character", x = "numeric"), col.names = c("foo", "bar"))

     foo  bar
   1   a 3.14

so I agree it is good that we are checking for that now.

Regards
Ben

On 10/24/2017 08:55 AM, Martin Maechler wrote:

Benjamin Tyner 
 on Tue, 24 Oct 2017 07:21:33 -0400 writes:

 > Jeff,
 > Thank you for your reply. The intent was to construct a minimum
 > reproducible example. The same warning occurs when the 'file' argument
 > points to a file on disk with a million lines. But you are correct, my
 > example was slightly malformed and in fact gives an error under R
 > version 3.2.2. Please allow me to try again; in older versions of R,

 >    > read.table(file = textConnection("a\t3.14"), header = FALSE,
 > colClasses = c(x = "character", y = "numeric"), sep="\t")
 >      V1   V2
 >    1  a 3.14

 > (with no warning). As of version 3.3.0,

 >    > read.table(file = textConnection("a\t3.14"), header = FALSE,
 > colClasses = c(x = "character", y = "numeric"), sep="\t")
 >      V1   V2
 >    1  a 3.14
 >    Warning message:
 >    In read.table(file = textConnection("a\t3.14"), header = FALSE,  :
 >      not all columns named in 'colClasses' exist

 > My intent was not to complain but rather to learn more about best
 > practices regarding the names attribute.

which is a nice attitude, thank you.

An even shorter MRE (as header=FALSE is default, and the default
sep="" works, too):

tt <- read.table(textConnection("a 3.14"), colClasses = c(x="character", 
y="numeric"))

Warning message:
In read.table(file = textConnection("a 3.14"), colClasses = c(x = "character",  
:
   not all columns named in 'colClasses' exist
If you read in the help page -- you did read that before posting, did you?---
how 'colClasses' should be specified ,

 colClasses: character.  A vector of classes to be assumed for the
  columns.  If unnamed, recycled as necessary.  If named, names
  are matched with unspecified values being taken to be ‘NA’.

  Possible values are ..
  .

and the 'x' and 'y' names you used, are matched with the
colnames ... which on the other hand are "V1" and "V2"  for
you, and so you provoke a warning.

Once you have read (and understood) the above part of the help
page, it becomes, easy, no?

tt <- read.table(textConnection("a 3.14"), colClasses = 
c("character","numeric"))
t2 <- read.table(textConnection("a 3.14"), colClasses=c(x="character",y="numeric"), 
col.names=c("x","y"))
t2

   xy
1 a 3.14
i.e., no warning in both of these two cases.

So please, please, PLEASE: at least non-beginners like you *should*
take the effort to read the help page (and report if these seem
incomplete or otherwise improvable)...

Best,
Martin Maechler
ETH Zurich

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] trying to find the multiple combinations...

2017-12-07 Thread Benjamin Sabatini

Hi,

I'm trying to find a way to determine what multiples of the combination of 
three or more numbers equals a forth number.

So, if I had a number set like:

c(13.4689, 12.85212, 17.05071)

What combination and multiples of these numbers would average to 15.0078? (so, 
something that would tell me x, y, and z in (x*13.4689 + y*12.85212+ 
z*17.05071) / x+y+z) = 15.0078

I think this is doable with aggregate?

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] setSessionLimit

2018-01-19 Thread Benjamin Tyner


Hello

The doc/NEWS.2 file mentions a setSessionLimit function, added with 
version 2.8.0


    o   setTimeLimit() function to set limits on the CPU
    and/or elapsed time for each top-level computation, and
    setSessionLimit() to set limits for the rest of the session.

However, I no longer see this function in recent versions of R, and 
there is no mention of its removal in the NEWS nor in the svn log. So 
I'm curious to learn why it was removed...does anyone recall?


Regards

Ben

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] rJava garbage collect

2018-02-05 Thread Benjamin Tyner


Hi

Does rJava offer a way to instruct the JVM to perform a garbage collection?

Regards

Ben

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] rJava garbage collect

2018-02-06 Thread Benjamin Tyner


Thanks Jeff; indeed it works:

   .jcall("java/lang/System", method = "gc")


On 02/05/2018 11:53 PM, Jeff Newmiller wrote:

rJava offers a mechanism to call arbitrary methods in Java. Wouldn't you use 
that mechanism to call whatever you would call if you were programming in Java 
(e.g. System.gc)?


__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] rJava garbage collect

2018-02-06 Thread Benjamin Tyner


Hi Martin,

Thanks for providing the reference.

In this particular case, it helped me to discover that 13 JVM threads 
were garbage collecting in parallel, occasionally resulting in a race 
condition. Setting


    options(java.parameters = "-XS:ParallelGCThreads=1")

appears to resolve the issue.

Regards

Ben



On 6 February 2018 at 04:34, Benjamin Tyner https://stat.ethz.ch/mailman/listinfo/r-help>> wrote:
>/Hi />//>/Does rJava offer a way to instruct the JVM to perform a garbage 
collection? /

Do you really, really need to run the garbage collector?

Consider reading:
https://stackoverflow.com/questions/5086800/java-garbage-collection

Regards
Martin



__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] system() or pipe(..., open = "r") without child process?

2018-05-22 Thread Benjamin Tyner


Greetings

On linux, is it possible to invoke an OS command from within R without 
spawning a child process? If not, is it possible to avoid copying the 
"parts of the caller's context" that are mentioned on the clone manpage?


 ENOMEM Cannot  allocate  sufficient memory to allocate a task structure
  for the child, or to copy those parts of  the caller's  
context

  that need to be copied.

(I am trying to avoid that ENOMEM condition when calling system(), 
pipe(), etc).


Regards,

Ben

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Efficient manipulation with list object

2018-06-10 Thread Benjamin Christoffersen

You may be able to speed it up further by using `data.table`'s
`rbindlist` or a similar function as shown here
https://stackoverflow.com/a/49772719/5861244.

2018-06-10 21:20 GMT+02:00 Christofer Bogaso :
> Using do.call() reduces my calculation time significantly.
>
> On Sun, Jun 10, 2018 at 10:45 PM ruipbarradas  wrote:
>
>> Hello,
>>
>> Instead of Reduce try do.call.
>>
>> do.call ('rbind', list)
>>
>> But with such a long list it will still take time.
>>
>> Hope this helps,
>>
>> Rui Barradas
>>
>>
>>
>> Enviado a partir do meu smartphone Samsung Galaxy.
>>  Mensagem original 
>> De: Christofer Bogaso 
>> Data: 10/06/2018 16:33 (GMT+00:00)
>> Para: r-help 
>> Assunto: [R] Efficient manipulation with list object
>>
>> Hi,
>>
>> I have a list of length 10,000, and each element of that list is a matrix
>> with 3 columns and 2,000 rows.
>>
>> Now when I tried to make a Matrix object with that list using
>> Reduce('rbind', list), my code is taking a considerable amount of time.
>>
>> Is there any way to implement same above task in more efficient way?
>>
>> Thanks,
>>
>> [[alternative HTML version deleted]]
>>
>> __
>> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide
>> http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>>
>
> [[alternative HTML version deleted]]
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] undo compile? (or: remove bytecode from closure)

2018-07-15 Thread Benjamin Tyner


Hi

Given a closure which has been compiled, what's the recommended way to 
recover the original? For example,


    > f <- function(x) x+1
    > fc <- cmpfun(f)
    > rm(f)
    > fc
    function(x) x+1
    

what's the best way to recover f from fc ?

Regards

Ben

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] undo compile? (or: remove bytecode from closure)

2018-07-16 Thread Benjamin Tyner

Thanks Rui and Duncan, this is most helpful.

On 07/16/2018 06:58 AM, Duncan Murdoch wrote:

On 16/07/2018 5:31 AM, Rui Barradas wrote:

Hello,

Maybe the following is not the recommended way but it works
(and I believe makes sense).

f <- function(){}
formals(f) <- formals(fc)
body(f) <- body(fc)

That's not quite right:  it might lose the environment of fc, if it 
isn't the environment where this took place.  But a simpler solution 
is just

f <- fc
body(f) <- body(f)

because any assignment to the body of a function causes the bytecode 
to be dropped.

Both of our approaches will also cause the source references to be 
dropped.  If you want to save those, you need more steps:

f <- fc
body(f) <- body(f)
attr(f, "srcref") <- getSrcref(fc)

Duncan Murdoch

f
#function (x)
#{
#  x <- x + 1
#  pi * x
#}

f(1)
#[1] 6.283185

Hope this helps,

Rui Barradas

Às 03:25 de 16-07-2018, Benjamin Tyner escreveu:

Hi

Given a closure which has been compiled, what's the recommended way to
recover the original? For example,

      > f <- function(x) x+1
      > fc <- cmpfun(f)
      > rm(f)
      > fc
      function(x) x+1

what's the best way to recover f from fc ?

Regards

Ben

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide 
http://www.R-project.org/posting-guide.html

and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] Unknown anomaly

2017-04-18 Thread Benjamin Robira

Dear Sir,

I writting to you as I am facing an irregularity in R that I do not know
the origin. When doing a sequence from 0 to 1 by 0.02 and assigning it to
a vector (i.e. code: a <- seq(from=0, to=1, by=0.02)) then, when I try to
use the 36th element (and two others behave the same way) it is not
recognized correctly. For instance a[36]==0.7, what should give TRUE,
gives instead FALSE. However, this works fine for element 35 and 37 and
all other elements but two.
I do not know the reason. I restarted my R session and tried on another
computer. This has been the same. None of my colleagues had an answer. I
hope that you would be able to help me fix that as it must be a pretty
straightforward error that I do not realise.

I would be thankful for any help,
With my very Best Regards,
Benjamin.

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] plot command error message

2017-06-04 Thread Benjamin Gföhler

 
I tried to plot a clustered linear regression model with the cplot command in R 
(code below). Leaflet is a binary variable (I know logit would be better), 
partisan is nummeric variable (0-4) and partisan_mis a dummy (0,1). As you can 
see it is clustered around two variables: around individuals and around the 
specific survey. When I try to run the cplot command I always get this error 
message:

error in plot.window(...) : need finite 'ylim' values 

Zusätzlich: Warnmeldungen:   1: In min(x) : no non-missing arguments to 
min; returning Inf 

2: In max(x) : no non-missing 
arguments to max; returning -Inf

Does anyone know how to solve this problem?

As I suppose that the problem has something to do with the way I coded the 
variables, here is how i did it:


voxit$leaflet[voxit$a65== "Ja"] <- 1
voxit$leaflet[voxit$a65== "Nein"] <- 0
voxitf <- subset(voxit, leaflet >= 0 ) 

voxitf$partisan[voxitf$a87x== "1 Tag oder weniger vorher"] <- 0
voxitf$partisan[voxitf$a87x== "Einige Tage vorher"] <- 1
voxitf$partisan[voxitf$a87x== "1-2 Woche(n) vorher"] <- 2
voxitf$partisan[voxitf$a87x== "Mehrere Wochen vorher"] <- 3
voxitf$partisan[voxitf$a87x== "Schon immer klar"] <- 4
mean(voxitf$partisan, na.rm = TRUE)

voxitf$partisan[is.na(voxitf$a87x)]<- 2.73493
table(voxitf$partisan)

voxitf$partisan_mis[is.na(voxitf$a87x)]<- 1
voxitf$partisan_mis[!is.na(voxitf$a87x)]<- 0


model1.1 <- lm(leaflet ~ partisan + partisan_mis , data = voxitf)

vcov_clust1.1 <- cluster.vcov(model1.1, cbind(voxitf$id,voxitf$projetx))

(marginal_effects1.1 <- margins(model1.1)) 

summary(marginal_effects1.1)

cplot(model1.1, x = "partisan", dx = "leaflet", what = "effect", se.type = 
"shade")
[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] Svyglm Error

2017-07-05 Thread Courtney Benjamin

Greetings,

I am revisiting code from several different files I have saved from the past 
and all used to run flawlessly; now when I run any of the svyglm related 
functions, I am coming up with an error:

Error in model.frame.default(formula = F3ATTAINB ~ F1PARED, data = data,  :
  the ... list does not contain 4 elements
The following is a minimal reproducible example:
library(RCurl)
library(survey)

data <- 
getURL("https://raw.githubusercontent.com/cbenjamin1821/careertech-ed/master/elsq1adj.csv";)
elsq1ch <- read.csv(text = data)

#Specifying the svyrepdesign object which applies the BRR weights
elsq1ch_brr<-svrepdesign(variables = elsq1ch[,1:16], repweights = 
elsq1ch[,18:217], weights = elsq1ch[,17], combined.weights = TRUE, type = "BRR")
elsq1ch_brr

##Resetting baseline levels for predictors
elsq1ch_brr <- update( elsq1ch_brr , F1HIMATH = relevel(F1HIMATH,"PreAlg or 
Less") )
elsq1ch_brr <- update( elsq1ch_brr , BYINCOME = relevel(BYINCOME,"0-25K") )
elsq1ch_brr <- update( elsq1ch_brr , F1RACE = relevel(F1RACE,"White") )
elsq1ch_brr <- update( elsq1ch_brr , F1SEX = relevel(F1SEX,"Male") )
elsq1ch_brr <- update( elsq1ch_brr , F1RTRCC = relevel(F1RTRCC,"Other") )

##Univariate testing for Other subset
Othpared <- 
svyglm(formula=F3ATTAINB~F1PARED,family="quasibinomial",design=subset(elsq1ch_brr,BYSCTRL==1&G10COHRT==1&F1RTRCC=="Other"),na.action=na.omit)
summary(Othpared)?


Any help in resolving this concern would be greatly appreciated.

Sincerely,

Courtney


Courtney Benjamin

Broome-Tioga BOCES

Automotive Technology II Teacher

Located at Gault Toyota

Doctoral Candidate-Educational Theory & Practice

State University of New York at Binghamton

cbenj...@btboces.org<mailto:cbenj...@btboces.org>

607-763-8633

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Svyglm Error

2017-07-05 Thread Courtney Benjamin

?Problem solved; I did not have the most updated version of R.  When I updated 
it and ran the code, all worked well again.  Thanks very much for helping me 
solve the mistake of an R novice!



Courtney Benjamin

Broome-Tioga BOCES

Automotive Technology II Teacher

Located at Gault Toyota

Doctoral Candidate-Educational Theory & Practice

State University of New York at Binghamton

cbenj...@btboces.org<mailto:cbenj...@btboces.org>

607-763-8633


From: Anthony Damico 
Sent: Wednesday, July 5, 2017 3:47 PM
To: Courtney Benjamin
Cc: r-help@r-project.org
Subject: Re: [R] Svyglm Error

hi, i am not hitting an error when i copy and paste your code into a fresh 
console.  maybe compare your sessionInfo() to mine?


> sessionInfo()
R version 3.4.1 (2017-06-30)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows Server 2008 R2 x64 (build 7601) Service Pack 1

Matrix products: default

locale:
[1] LC_COLLATE=English_United States.1252  LC_CTYPE=English_United 
States.1252LC_MONETARY=English_United States.1252 LC_NUMERIC=C  
 LC_TIME=English_United States.1252

attached base packages:
[1] grid  stats graphics  grDevices utils datasets  methods   
base

other attached packages:
[1] survey_3.32-1   survival_2.41-3 Matrix_1.2-10   RCurl_1.95-4.8  
bitops_1.0-6

loaded via a namespace (and not attached):
[1] compiler_3.4.1  splines_3.4.1   lattice_0.20-35





On Wed, Jul 5, 2017 at 2:24 PM, Courtney Benjamin 
mailto:cbenj...@btboces.org>> wrote:
Greetings,

I am revisiting code from several different files I have saved from the past 
and all used to run flawlessly; now when I run any of the svyglm related 
functions, I am coming up with an error:

Error in model.frame.default(formula = F3ATTAINB ~ F1PARED, data = data,  :
  the ... list does not contain 4 elements
The following is a minimal reproducible example:
library(RCurl)
library(survey)

data <- 
getURL("https://raw.githubusercontent.com/cbenjamin1821/careertech-ed/master/elsq1adj.csv";)
elsq1ch <- read.csv(text = data)

#Specifying the svyrepdesign object which applies the BRR weights
elsq1ch_brr<-svrepdesign(variables = elsq1ch[,1:16], repweights = 
elsq1ch[,18:217], weights = elsq1ch[,17], combined.weights = TRUE, type = "BRR")
elsq1ch_brr

##Resetting baseline levels for predictors
elsq1ch_brr <- update( elsq1ch_brr , F1HIMATH = relevel(F1HIMATH,"PreAlg or 
Less") )
elsq1ch_brr <- update( elsq1ch_brr , BYINCOME = relevel(BYINCOME,"0-25K") )
elsq1ch_brr <- update( elsq1ch_brr , F1RACE = relevel(F1RACE,"White") )
elsq1ch_brr <- update( elsq1ch_brr , F1SEX = relevel(F1SEX,"Male") )
elsq1ch_brr <- update( elsq1ch_brr , F1RTRCC = relevel(F1RTRCC,"Other") )

##Univariate testing for Other subset
Othpared <- 
svyglm(formula=F3ATTAINB~F1PARED,family="quasibinomial",design=subset(elsq1ch_brr,BYSCTRL==1&G10COHRT==1&F1RTRCC=="Other"),na.action=na.omit)
summary(Othpared)?


Any help in resolving this concern would be greatly appreciated.

Sincerely,

Courtney


Courtney Benjamin

Broome-Tioga BOCES

Automotive Technology II Teacher

Located at Gault Toyota

Doctoral Candidate-Educational Theory & Practice

State University of New York at Binghamton

cbenj...@btboces.org<mailto:cbenj...@btboces.org><mailto:cbenj...@btboces.org<mailto:cbenj...@btboces.org>>

607-763-8633

[[alternative HTML version deleted]]

__
R-help@r-project.org<mailto:R-help@r-project.org> mailing list -- To 
UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] Creating New Variable Using Ifelse

2017-08-09 Thread Courtney Benjamin

Hello R Help List,

I am an R novice and trying to use the ifelse function to create a new binary 
variable based off of the responses of two other binary variables; NAs are 
involved.  I pulled it off almost successfully, but when I checked the counts 
of my new variable for accuracy, I found that a small portion of the NA cases 
were not being passed through as NAs, but as "0" counts in my new variable.  My 
many attempts at creating a nested ifelse statement that would pass the NAs 
through properly have not been successful.  Any help is greatly appreciated.

Here is a MRE:?

library(RCurl)
data <- 
getURL("https://raw.githubusercontent.com/cbenjamin1821/careertech-ed/master/elsq2wbl.csv";)
elsq2wbl <- read.csv(text = data)

##Recoding Negative Responses to NA
elsq2wbl [elsq2wbl[, "EVERRELJOB"] < -3, "EVERRELJOB"] <- NA
elsq2wbl [elsq2wbl[, "PSWBL"] < -2, "PSWBL"] <- NA

#Labeling categorical variable levels
elsq2wbl$EVERRELJOB <- factor(elsq2wbl$EVERRELJOB, levels = c(0,1), labels = 
c("No","Yes"))
elsq2wbl$PSWBL <- factor(elsq2wbl$PSWBL, levels = c(0,1), labels = 
c("No","Yes"))

##Trying to create a new variable to indicate if the student had a job
#related to the college studies that was NOT a WBL experience
elsq2wbl$NONWBLRELJOB <- ifelse(elsq2wbl$PSWBL=="No" & 
elsq2wbl$EVERRELJOB=="Yes",1,0)

#Cross tab to check counts of two variables that new variable is based upon
xtabs(~PSWBL+EVERRELJOB,subset(elsq2wbl,BYSCTRL==1&G10COHRT==1),addNA=TRUE)

#Checking count of newly created variable
Q2sub <- subset(elsq2wbl,BYSCTRL==1&G10COHRT==1)
library(plyr)
count(Q2sub,'NONWBLRELJOB')

#The new variable has the correct count of "1", but 88 cases too many for "0"
#The cross tab shows 20 and 68 NA cases that are being incorrectly counted as 
"0" in the new variable

#My other approach at trying to handle the NAs properly-returns an error
elsq2wbl$NONWBLRELJOB <- ifelse(elsq2wbl$PSWBL=="No" & 
elsq2wbl$EVERRELJOB=="Yes",1,ifelse(is.na(elsq2wbl$PSWBL)&is.na(elsq2wbl$EVERRELJOB),NA,

   ifelse(elsq2wbl$PSWBL!="No" & elsq2wbl$EVERRELJOB!="Yes",0)))



Courtney Benjamin

Broome-Tioga BOCES

Automotive Technology II Teacher

Located at Gault Toyota

Doctoral Candidate-Educational Theory & Practice

State University of New York at Binghamton

cbenj...@btboces.org<mailto:cbenj...@btboces.org>

607-763-8633

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Creating New Variable Using Ifelse

2017-08-10 Thread Courtney Benjamin

Thanks very much; with your tips, I was able to get the nested ifelse statement 
to work properly!

Courtney Benjamin



From: PIKAL Petr 
Sent: Thursday, August 10, 2017 5:39 AM
To: Courtney Benjamin; r-help@r-project.org
Subject: RE: Creating New Variable Using Ifelse

Hi

see in line

> -Original Message-
> From: R-help [mailto:r-help-boun...@r-project.org] On Behalf Of Courtney
> Benjamin
> Sent: Thursday, August 10, 2017 5:55 AM
> To: r-help@r-project.org
> Subject: [R] Creating New Variable Using Ifelse
>
> Hello R Help List,
>
> I am an R novice and trying to use the ifelse function to create a new binary
> variable based off of the responses of two other binary variables; NAs are
> involved.  I pulled it off almost successfully, but when I checked the counts 
> of
> my new variable for accuracy, I found that a small portion of the NA cases 
> were
> not being passed through as NAs, but as "0" counts in my new variable.  My
> many attempts at creating a nested ifelse statement that would pass the NAs
> through properly have not been successful.  Any help is greatly appreciated.
>
> Here is a MRE:?
>
> library(RCurl)
> data <-
> getURL("https://raw.githubusercontent.com/cbenjamin1821/careertech-
> ed/master/elsq2wbl.csv")

Did not work for me probably due to some restriction of data access.

> elsq2wbl <- read.csv(text = data)
>
> ##Recoding Negative Responses to NA
> elsq2wbl [elsq2wbl[, "EVERRELJOB"] < -3, "EVERRELJOB"] <- NA
> elsq2wbl [elsq2wbl[, "PSWBL"] < -2, "PSWBL"] <- NA

If you wanted recode values below zero to NA you can do this easily without any 
ifelse

> summary(test)
   mp tepl   kryst
 Min.   : 7.11   Min.   :100.0   Min.   : 21.70
 1st Qu.:32.25   1st Qu.:400.0   1st Qu.: 24.15
 Median :37.50   Median :500.0   Median : 26.55
 Mean   :38.44   Mean   :485.7   Mean   : 42.64
 3rd Qu.:44.75   3rd Qu.:600.0   3rd Qu.: 33.62
 Max.   :76.02   Max.   :900.0   Max.   :150.00
 NA's   :3   NA's   :6
> test[is.na(test)] <- 999
> summary(test)
   mp  tepl   kryst
 Min.   :  7.11   Min.   :100.0   Min.   : 21.70
 1st Qu.: 34.27   1st Qu.:400.0   1st Qu.: 25.93
 Median : 41.75   Median :500.0   Median : 92.90
 Mean   :244.28   Mean   :485.7   Mean   :452.51
 3rd Qu.: 70.05   3rd Qu.:600.0   3rd Qu.:999.00
 Max.   :999.00   Max.   :900.0   Max.   :999.00
>
> test[test>900] <- NA
> summary(test)
   mp tepl   kryst
 Min.   : 7.11   Min.   :100.0   Min.   : 21.70
 1st Qu.:32.25   1st Qu.:400.0   1st Qu.: 24.15
 Median :37.50   Median :500.0   Median : 26.55
 Mean   :38.44   Mean   :485.7   Mean   : 42.64
 3rd Qu.:44.75   3rd Qu.:600.0   3rd Qu.: 33.62
 Max.   :76.02   Max.   :900.0   Max.   :150.00
 NA's   :3   NA's   :6


>
> #Labeling categorical variable levels
> elsq2wbl$EVERRELJOB <- factor(elsq2wbl$EVERRELJOB, levels = c(0,1), labels =
> c("No","Yes"))
> elsq2wbl$PSWBL <- factor(elsq2wbl$PSWBL, levels = c(0,1), labels =
> c("No","Yes"))
>
> ##Trying to create a new variable to indicate if the student had a job
> #related to the college studies that was NOT a WBL experience
> elsq2wbl$NONWBLRELJOB <- ifelse(elsq2wbl$PSWBL=="No" &
> elsq2wbl$EVERRELJOB=="Yes",1,0)

You can use simple logical functions to get desired result

First sample data
> a <- sample(c("Y", "N"), 30, replace=TRUE)
> b <- sample(c("Y", "N"), 30, replace=TRUE)
> (a=="N")*(b=="Y")
 [1] 1 1 0 0 1 0 0 0 0 0 1 0 0 1 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 1

has 1 only if both conditions are met.

and it works with NA values too.

> a[c(3,5,8)] <- NA
> b[c(3,6,7,8)] <- NA
> (a=="N")*(b=="Y")
 [1]  1  1 NA  0 NA NA NA NA  0  0  1  0  0  1  0  0  0  0  1  0  0  0  0  0  0
[26]  0  0  0  0  1

Cheers
Petr


>
> #Cross tab to check counts of two variables that new variable is based upon
> xtabs(~PSWBL+EVERRELJOB,subset(elsq2wbl,BYSCTRL==1&G10COHRT==1),add
> NA=TRUE)
>
> #Checking count of newly created variable
> Q2sub <- subset(elsq2wbl,BYSCTRL==1&G10COHRT==1)
> library(plyr)
> count(Q2sub,'NONWBLRELJOB')
>
> #The new variable has the correct count of "1", but 88 cases too many for "0"
> #The cross tab shows 20 and 68 NA cases that are being incorrectly counted as
> "0" in the new variable
>
> #My other approach at trying to handle the NAs properly-returns an error
> elsq2wbl$NONWBLRELJOB <- ifelse(elsq2wbl$PSWBL=="No" &
> elsq2wbl$EVERRELJOB==

[R] How to Display Value Labels in R Outputs?

2016-07-28 Thread Courtney Benjamin

Hello R Experts,

I am using the Survey Package in R to do some initial descriptive stats for my 
dissertation. With the outputs for both the svymean and the barplot, I would 
like the value labels to be displayed for the variable-it would make the 
descriptive statistics much easier to interpret. Instead of the output labels 
of: F1RTRCC1, F1RTRCC2-I would like to see the value labels of "academic" and 
"occupational" to be displayed.

How do I go about making this happen? I am including a minimal reproducible 
example with a small subset of my actual data:

https://drive.google.com/file/d/0B5fHCR5TGRjaZ29wNzR0cF9YRXc/view?usp=sharing


Any help is greatly appreciated. My only experience thus far is with SPSS and I 
have a feeling that the reason the variable value labels are not appearing is 
due to either the way I read the dataset into R: elsq1ch<-read.table 
(file="els-Q1-04-21-16.dat", header = TRUE, sep = "\t", quote = "\"", dec =".") 
or I am not specifying some detail that is required to manually assign labels 
to the values of the variables.?


Courtney Benjamin

Broome-Tioga BOCES

Automotive Technology II Teacher

Located at Gault Toyota

Doctoral Candidate-Educational Theory & Practice

State University of New York at Binghamton

cbenj...@btboces.org<mailto:cbenj...@btboces.org>

607-763-8633

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] commercial license

2016-09-12 Thread benjamin . stocker

Dear r-project Team

How does It cost a commercial license for the R Console and the R-comander GUI 
without the Rstudio enviroment.
Thanks for helping me.


Freundliche Gr�sse/Kind regards

Benjamin Stocker
Reporting/Controlling MBC
Mercedes-Benz Schweiz AG
Bernstrasse 55
8952 Schlieren/Switzerland

Phone +41 44 755 84 24
Fax +41 44 755 82 17
mailto:benjamin.stoc...@daimler.com

www.mercedes-benz.ch<http://www.mercedes-benz.ch/>


If you are not the addressee, please inform us immediately that you have 
received this e-mail by mistake, and delete it. We thank you for your support.


[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] Svyglm Error in Survey Package

2016-09-23 Thread Courtney Benjamin

In attempting to use the svyglm call in the R Survey Package, I am receiving 
the error: Error in pwt[i] : invalid subscript type 'list'

I have not been able to find a lot of information on how to resolve the error; 
one source advised it was related to how the subsetting command was executed.

This was my initial attempt:

mc1 <- 
svyglm(F3ATTAINMENT~F1SES2QU+F1RGPP2,elsq1ch_brr,subset(elsq1ch_brr,BYSCTRL==1 
& G10COHRT==1),na.action)
summary(mc1)
This was my second approach trying to change up how I had subsetted the data:
summary(mc1)
samp1 <- subset(elsq1ch_brr,BYSCTRL==1 & G10COHRT==1)
dim(samp1)
mc1 <- svyglm(F3ATTAINMENT~F1SES2QU+F1RGPP2,elsq1ch_brr,subset=samp1,na.action)
summary(mc1)?

Both attempts resulted in the same error stated above.  Any advisement in how 
to resolve this error would be greatly appreciated.
Sincerely,
Courtney Benjamin

?



Courtney Benjamin

Broome-Tioga BOCES

Automotive Technology II Teacher

Located at Gault Toyota

Doctoral Candidate-Educational Theory & Practice

State University of New York at Binghamton

cbenj...@btboces.org<mailto:cbenj...@btboces.org>

607-763-8633

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Svyglm Error in Survey Package

2016-09-25 Thread Courtney Benjamin

Hello Dr. Fox,
Thank you very much for your recommendations and I will take action on 
following them.  I appreciate your willingness to provide guidance on my novice 
questions.
Sincerely,
Courtney


From: Fox, John 
Sent: Sunday, September 25, 2016 9:21 AM
To: Courtney Benjamin
Cc: r-help@r-project.org
Subject: RE: Svyglm Error in Survey Package

Dear Courtney,

You're confusing a function call, na.action(na.omit), with an argument 
specification, na.action=na.omit  (and, in any event, there is no na.action() 
function). But you don't have to specify na.action=na.omit, because na.omit 
(which produces a complete-case analysis) is the default na.action. See ?glm 
for more information.

If you haven't already done so, you might want to read something about how 
statistical modeling functions in R -- and possibly R more generally -- work. 
In the long run, that likely will save you some time.

I'm cc'ing this to the r-help email list, where you posed your original 
question. It's generally a good idea to keep responses on the list.

I hope this helps,
 John

> -Original Message-
> From: Courtney Benjamin [mailto:cbenj...@btboces.org]
> Sent: September 24, 2016 10:55 PM
> To: Fox, John 
> Subject: Re: Svyglm Error in Survey Package
>
> Hello Dr. Fox,
> Thank you very much for your explanation; I am in better shape now with the
> subset argument and I am no longer getting the error.  Now I am not having
> success with specifying the na.action argument.  I would like to exclude NAs
> like I have done in other commands with na.rm=TRUE.
> The following are my attempts:
> >
> summary(svyglm(F3ATTAINMENT~F1PARED+F1SES2QU,design=elsq1ch_brr,su
> bset
> > =BYSCTRL==1 & G10COHRT==1,na.action))
> Error in glm(formula = F3ATTAINMENT ~ F1PARED + F1SES2QU, subset =
> BYSCTRL ==  :
>   argument "na.action" is missing, with no default
>
> >
> summary(svyglm(F3ATTAINMENT~F1PARED+F1SES2QU,design=elsq1ch_brr,su
> bset
> > =BYSCTRL==1 & G10COHRT==1,na.action(na.omit))
> +
> Any guidance you would be willing to provide is greatly appreciated; I have
> only been using R for about 6 months.
> Sincerely
> Courtney
>
> Courtney Benjamin
> Broome-Tioga BOCES
> Automotive Technology II Teacher
> Located at Gault Toyota
> Doctoral Candidate-Educational Theory & Practice State University of New York
> at Binghamton cbenj...@btboces.org
> 607-763-8633
>
> 
> From: Fox, John 
> Sent: Saturday, September 24, 2016 9:38 AM
> To: Courtney Benjamin
> Cc: r-help@r-project.org
> Subject: RE: Svyglm Error in Survey Package
>
> Dear Courtney,
>
> I think that you're confused about how to use the subset argument to svyglm()
> and about what the subset() function returns. The subset argument should be a
> logical expression, evaluating to TRUE or FALSE for each case; subset() 
> returns
> a data set (e.g., a "survey.design" object).
>
> Here's an example adapted from ?svyglm:
>
> --- snip---
> data(api)
> dstrat<-svydesign(id=~1,strata=~stype, weights=~pw, data=apistrat, fpc=~fpc)
> summary(svyglm(api00~ell+meals+mobility, design=dstrat))
> summary(svyglm(api00~ell+meals+mobility, design=dstrat, subset = pcttest >
> 90)) summary(svyglm(api00~ell+meals+mobility, design=subset(dstrat, subset
> = pcttest > 90)))
> --- snip---
>
> The last two commands produce the same results.
>
> I hope this helps,
>  John
>
> -
> John Fox, Professor
> McMaster University
> Hamilton, Ontario
> Canada L8S 4M4
> Web: socserv.mcmaster.ca/jfox
>
>
>
> > -Original Message-
> > From: R-help [mailto:r-help-boun...@r-project.org] On Behalf Of
> > Courtney Benjamin
> > Sent: September 23, 2016 11:01 PM
> > To: r-help@r-project.org
> > Subject: [R] Svyglm Error in Survey Package
> >
> > In attempting to use the svyglm call in the R Survey Package, I am
> > receiving the
> > error: Error in pwt[i] : invalid subscript type 'list'
> >
> > I have not been able to find a lot of information on how to resolve
> > the error; one source advised it was related to how the subsetting
> > command was executed.
> >
> > This was my initial attempt:
> >
> > mc1 <-
> >
> svyglm(F3ATTAINMENT~F1SES2QU+F1RGPP2,elsq1ch_brr,subset(elsq1ch_brr,
> > BYSCTRL==1 & G10COHRT==1),na.action)
> > summary(mc1)
> > This was my second approach trying to change up how I had subsetted
> > the
> > data:
> > summary(mc1)
> > s

Re: [R] non-terminal token lacking children from utils::getParseData

2015-05-02 Thread Benjamin Tyner

Thank you Yihui for also reporting the bug here:

https://bugs.r-project.org/bugzilla/show_bug.cgi?id=16354

and thank you Duncan for finding the issue and fixing it! I definitely
like your idea to report a summary message instead of the long text string.

Regards
Ben

> I tried to reduce the offending portion as best I could to a
> more-or-less minimal example (1136 bytes), which can be downloaded via:
>
> wget https://www.dropbox.com/s/74rgxr5x2aalr99/badstring.R
>
> then once in R,
>
> > b <- parse(file = "~/badstring.R", keep.source = TRUE)
> > d <- getParseData(b, includeText = FALSE)
> > subset(d, line1 == 2L)
>line1 col1 line2 col2 id parent token terminal
> 10 25241 10 21  exprFALSE
> > subset(d, parent == 10)
> [1] line1col1 line2col2 id   parent   token   
> terminal
> <0 rows> (or 0-length row.names)
>
> here is my
>
> > sessionInfo()
> R version 3.0.2 (2013-09-25)
> Platform: x86_64-pc-linux-gnu (64-bit)
>
> locale:
>  [1] LC_CTYPE=en_US.UTF-8   LC_NUMERIC=C 
>  [3] LC_TIME=en_US.UTF-8LC_COLLATE=en_US.UTF-8   
>  [5] LC_MONETARY=en_US.UTF-8LC_MESSAGES=en_US.UTF-8  
>  [7] LC_PAPER=en_US.UTF-8   LC_NAME=C
>  [9] LC_ADDRESS=C   LC_TELEPHONE=C   
> [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C  
>
> attached base packages:
> [1] stats graphics  grDevices utils datasets  methods   base
>
> note that while it says R version 3.0.2 above, I have seen the same
> behaviour under version 3.1.2 as well.
>
> Regards
> Ben
>
> On 02/19/2015 06:34 PM, Duncan Murdoch wrote:
> >/ On 19/02/2015 6:31 PM, B Tyner wrote:
> />>/ Hi,
> />>/
> />>/ I have run across a source file for which the return value
> />>/ of getParseData() includes a record having FALSE for $terminal, yet it is
> />>/ not the parent of any other tokens. Before I spend time constructing a
> />>/ reproducible example, I wanted to verify that this is in fact unexpected
> />>/ behavior (under R 3.1.2)?
> />/ Before I spend the time thinking about that, I'd like to see a
> />/ reproducible example.
> />/
> />/ Duncan Murdoch
> />/
> />


__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] package implementing continuous binomial?

2015-05-06 Thread Benjamin Tyner

Hi

I'm wondering if anyone is aware of an R package implementing (i.e.,
providing a pdf, cdf, and/or quantile function) for the continuous
binomial distribution? Specifically the one characterized here:

http://www2.math.uni-paderborn.de/fileadmin/Mathematik/AG-Indlekofer/Workshop/Satellite_meeting/ilenko.pdf

Figured I would check here first, before attempting to code it up myself.

Regards
Ben

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] package implementing continuous binomial?

2015-05-07 Thread Benjamin Tyner

Thanks David! I'll take a look at zipfR.

Regards
Ben

On 05/07/2015 03:10 PM, David Winsemius wrote:
> On May 6, 2015, at 7:00 PM, Benjamin Tyner wrote:
>
>> Hi
>>
>> I'm wondering if anyone is aware of an R package implementing (i.e.,
>> providing a pdf, cdf, and/or quantile function) for the continuous
>> binomial distribution? Specifically the one characterized here:
>>
>> http://www2.math.uni-paderborn.de/fileadmin/Mathematik/AG-Indlekofer/Workshop/Satellite_meeting/ilenko.pdf
>>
>> Figured I would check here first, before attempting to code it up myself.
> I found that reading the ArXiv version of that material was easier to 
> understand:
>
> http://arxiv.org/abs/1303.5990
>
> zipfR package has an implementation of the incomplete beta function that 
> might make some of the coding of the pdf and cdf more simple. 
>
> Searching done with Graves' very useful utility package: 
>
> library('sos')
> findFn("incomplete beta function")
>
> (I did't think that doing a search on "continuous Binomial" was likely to be 
> helpful, but I tried it anyway and did not find any functions named 
> "continuous binomial" in their help page titles.)
>


__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] Scatterplot : smoothing colors according to density of points

2015-06-02 Thread Benjamin Dubreuil

Hello everyone,

I have a data frame D with 4 columns id,X,Y,C.
I want to plot a simple scatter plot of D$X vs. D$Y and using D$C values as a 
color. (id is just a text string not used for the plot)

But actually, I don't want to use the raw values of D$C, I would prefer to 
calculate the average values of D$C according to the density of points in a 
fixed neighborhood.
In other words, I would like to smooth the colors according to the density of 
points.

I am looking for any function,package that could solve this.
So far, I've been looking at library MASS and the function kde2d which can 
calculate the density of points in 2 directions, but I don't see how I could 
then use this information to recalculate my D$C values.

Here is a piece of the matrix :
 > head(D)
  id X YC
1 O13297 44.44  21.61220 -0.136651639
2 O13329 31.272085   4.01590 -0.117016949
3 O13525  6.865672   2.43884 -0.161173913
4 O13539 14.176245   7.81217 -0.075756757
5 O13541 73.275862   3.59012 -0.006988235
6 O13547 28.991597 258.99900 -0.013985507

> dim(D)
[1] 36164

> apply(D[,-1],2,range)
   X  Y  C
[1,]   0.3378378 0.0003 -0.738
[2,] 100.000 24556.4000  0.5582500
(Y is not linear, so I use log='y' in the plot function)

I used a palette of 100 colors ranging from Blue to Yellow to red.
>pal =  colorRampPalette(c("blue","yellow","red"))(100)

To make D$C values correspond to a color, I used a cut with the following 
breaks (101 breaks from -1.2 to 1.2):
> BREAKS
  [1] -1.2000 -0.8000 -0.4000 -0.3600 -0.3200 -0.2800 -0.2400 -0.2000 -0.1925
 [10] -0.1850 -0.1775 -0.1700 -0.1625 -0.1550 -0.1475 -0.1400 -0.1368 -0.1336
 [19] -0.1304 -0.1272 -0.1240 -0.1208 -0.1176 -0.1144 -0.1112 -0.1080 -0.1048
 [28] -0.1016 -0.0984 -0.0952 -0.0920 -0.0888 -0.0856 -0.0824 -0.0792 -0.0760
 [37] -0.0728 -0.0696 -0.0664 -0.0632 -0.0600 -0.0568 -0.0536 -0.0504 -0.0472
 [46] -0.0440 -0.0408 -0.0376 -0.0344 -0.0312 -0.0280 -0.0248 -0.0216 -0.0184
 [55] -0.0152 -0.0120 -0.0088 -0.0056 -0.0024  0.0008  0.0040  0.0072  0.0104
 [64]  0.0136  0.0168  0.0200  0.0232  0.0264  0.0296  0.0328  0.0360  0.0392
 [73]  0.0424  0.0456  0.0488  0.0520  0.0552  0.0584  0.0616  0.0648  0.0680
 [82]  0.0712  0.0744  0.0776  0.0808  0.0840  0.0872  0.0904  0.0936  0.0968
 [91]  0.1000  0.1250  0.1500  0.1750  0.2000  0.2250  0.2500  0.4875  0.7250
[100]  0.9625  1.2000
> C.levels = as.numeric(cut(D$C,breaks=BREAKS))
>length(C.levels)
[1] 3616

C.levels ranges from 2 to 98 and then to plot the colors I used pal[C.levels].
> plot( x=D$x, y=D$Y, col=pal[ C.levels ],log='y')



[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] Rprof and system

2015-08-11 Thread Benjamin Tyner

Hi

I have an R script which invokes WriteXLS() (from the package of the
same name) which as you may know, calls perl via system(). I've noticed
that when I enable profiling using Rprof(), when the script gets to the
part where perl is called, it gets "stuck": it just sits there using
99-100% CPU and around 10% of the RAM, and the perl command does not
even show up as a running process under 'top'. While it is churning
away, new profiling results continue to be written to the profiling file
(at which point the entries are all "system" / "WriteXLS").  I have also
noticed this behavior with other system() calls; not just perl. I've
also tried pipe(cmd, open = "r") as an alternative to system(cmd) and
the result is the same. I have tried in interactive as well as
non-interactive mode, and the issue occurs in both modes although seems
to be more common in the latter.

So, I'm wondering if perhaps Rprof() + system() not a recommended
combination ? I did notice from ?Rprof that "the profiler interrupts R
asynchronously"; though I do not know what that means, perhaps that is
somehow relevant? This is running on R version 3.2.1 under linux (RHEL6).

Any suggestions would be greatly appreciated.

Regards
Ben

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] non-terminal token lacking children from utils::getParseData

2015-02-19 Thread Benjamin Tyner

I tried to reduce the offending portion as best I could to a
more-or-less minimal example (1136 bytes), which can be downloaded via:

wget https://www.dropbox.com/s/74rgxr5x2aalr99/badstring.R

then once in R,

> b <- parse(file = "~/badstring.R", keep.source = TRUE)
> d <- getParseData(b, includeText = FALSE)
> subset(d, line1 == 2L)
   line1 col1 line2 col2 id parent token terminal
10 25241 10 21  exprFALSE
> subset(d, parent == 10)
[1] line1col1 line2col2 id   parent   token   
terminal
<0 rows> (or 0-length row.names)
   
here is my

> sessionInfo()
R version 3.0.2 (2013-09-25)
Platform: x86_64-pc-linux-gnu (64-bit)
   
locale:
 [1] LC_CTYPE=en_US.UTF-8   LC_NUMERIC=C 
 [3] LC_TIME=en_US.UTF-8LC_COLLATE=en_US.UTF-8   
 [5] LC_MONETARY=en_US.UTF-8LC_MESSAGES=en_US.UTF-8  
 [7] LC_PAPER=en_US.UTF-8   LC_NAME=C
 [9] LC_ADDRESS=C   LC_TELEPHONE=C   
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C  
   
attached base packages:
[1] stats graphics  grDevices utils datasets  methods   base

note that while it says R version 3.0.2 above, I have seen the same
behaviour under version 3.1.2 as well.

Regards
Ben

On 02/19/2015 06:34 PM, Duncan Murdoch wrote:
> On 19/02/2015 6:31 PM, B Tyner wrote:
>> Hi,
>>
>> I have run across a source file for which the return value
>> of getParseData() includes a record having FALSE for $terminal, yet it is
>> not the parent of any other tokens. Before I spend time constructing a
>> reproducible example, I wanted to verify that this is in fact unexpected
>> behavior (under R 3.1.2)?
> Before I spend the time thinking about that, I'd like to see a
> reproducible example.
>
> Duncan Murdoch
>
>


__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] mboost: Proportional odds boosting model - how to specify the offset?

2015-03-20 Thread Benjamin Hofner


Dear Madlene,

the problem that you observed was twofold.

First, mboost expects the offset to be a scalar or a vector with length
equal to the number of observations. However, fitted(p.iris) is a 
matrix. In PropOdds(), the linear or additive predictor is shared among 
all outcome categories and the thresholds are treated as nuisance 
parameter. What you need to supply as offset is the result of the linear 
or additive predictor (i.e., x'beta) instead of the fitted class 
probabilities.


Second, there was a bug in mboost. I fixed it on R-forge [1]. If the 
package was successfully built use

  install.packages("mboost", repos="http://R-Forge.R-project.org";)
to install it. You can also email to me off list. Then I will send you 
the package sources directly.


Your nuisance parameters (which represent the class thresholds) can be
extracted via nuisance(mlp). More details are given in the example below.

Best,
Benjamin

[1] http://r-forge.r-project.org/projects/mboost/

 Example code 

library(MASS)
library(mboost)

data(iris)
iris$Species <- factor(iris$Species, ordered = T)
p.iris <- polr(Species  ~ Sepal.Length, data = iris)
p.iris

lm.iris <- glmboost(Species  ~ Sepal.Length, data = iris,
family = PropOdds(nuirange = c(-0.5, 3)))
lm.iris[1000]
## thresholds:
nuisance(lm.iris)

## to make these comparable to p.iris use
nuisance(lm.iris) - coef(lm.iris)["(Intercept)"] -
attr(coef(lm.iris), "offset")

## now use linear predictor as offset:
mlp <- gamboost(Species ~ bols(Sepal.Length) + bols(Sepal.Width),
data = iris, family = PropOdds(nuirange = c(0, 1)),
offset = fitted(lm.iris))




Nussbaum  Madlene wrote

Dear R team

The package mboost allows for boosting of proportional odds models.
However, I would like to include an offset for every observation.
This produces an error - no matter how I put the offset (as response
probabilities or as response link).

Fitting gamboost-models with offset works satisfactory with family =
Gaussian() or Multinomial().

Questions: 1) How do I need to specify the offset with family =
PropOdds()?

2) Where in the mboost-object do I find the Theta's (response
category dependent intercept)?

>
>
>
> # --- minimal example with iris data ---
>
> library(MASS)
> library(mboost)
>
> data(iris)
> iris$Species <- factor(iris$Species, ordered = T)
> p.iris <- polr(Species  ~ Sepal.Length, data = iris)
> mlp <- gamboost(Species ~ bols(Sepal.Length) + bols(Sepal.Width),
>data = iris, family = PropOdds(),
>  offset = fitted(p.iris) )
>
> Error in tmp[[i]] : subscript out of bounds
>
>
> Thank you
> M. Nussbaum
>
> --
>
> ETH Zürich
> Madlene Nussbaum
> Institut für Terrestrische Ökosysteme
> Boden- und Terrestrische Umweltphysik
> CHN E 37.2
> Universitätstrasse 16
> CH-8092 Zürich
>
> Telefon + 44 632 73 21
> Mobile  + 79 761 34 66
> madlene.nussbaum@.ethz
> www.step.ethz.ch

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] Having trouble with gdata read in

2015-03-25 Thread Benjamin Baker

Trying to read and clean up the FERC data on Advanced Metering infrastructure. 
Of course it is in XLS for the first two survey years and then converts to XLSX 
for the final two. Bad enough that it is all in excel, they had to change the 
survey design and data format as well. Still, I’m sorting through it. However, 
when I try and read in the 2008 data, I’m getting this error:
###
Wide character in print at 
/Library/Frameworks/R.framework/Versions/3.1/Resources/library/gdata/perl/xls2csv.pl
 line 270.
Warning message:
In scan(file, what, nmax, sep, dec, quote, skip, nlines, na.strings,  :
  EOF within quoted string
###



Here is the code I’m running to get the data:
###
install.packages("gdata")
library("gdata")
fileUrl <- 
"http://www.ferc.gov/industries/electric/indus-act/demand-response/2008/survey/ami_survey_responses.xls";
download.file(fileUrl, destfile="./ami.data/ami-data2008.xls")
list.files("ami.data")
dateDown.2008 <- date()
ami.data2008 <- read.xls("./ami.data/ami-data2008.xls", sheet=1, header=TRUE)
###


Reviewed the data in the XLS file, and both “” and # are present within it. 
Don’t know how to get the read.xls to ignore them so I can read all the data 
into my data frame. Tried :
###
ami.data2008 <- read.xls("./ami.data/ami-data2008.xls", sheet=1, quote="", 
header=TRUE)
###


And it spits out “More columns than column names” output.


Been searching this, and I can find some “solutions” for read.table, but 
nothing specific to read.xls


Many thanks,


Benjamin Baker



—
Sent from Mailbox
[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Having trouble with gdata read in

2015-03-27 Thread Benjamin Baker

Anthony,




XLSX won’t read an XLS file. Additionally, the legacy Java that is required for 
the xlsx package really effs up my computer. Have to reinstall my OS to fix it.




—
Sent from Mailbox

On Wed, Mar 25, 2015 at 3:51 PM, Anthony Damico 
wrote:

> maybe
> library(xlsx)
> tf <- tempfile()
> ami <- "
> http://www.ferc.gov/industries/electric/indus-act/demand-response/2008/survey/ami_survey_responses.xls
> "
> download.file( ami , tf , mode = 'wb' )
> ami.data2008 <- read.xlsx( tf , sheetIndex = 1 )
> On Wed, Mar 25, 2015 at 5:01 PM, Benjamin Baker  wrote:
>> Trying to read and clean up the FERC data on Advanced Metering
>> infrastructure. Of course it is in XLS for the first two survey years and
>> then converts to XLSX for the final two. Bad enough that it is all in
>> excel, they had to change the survey design and data format as well. Still,
>> I’m sorting through it. However, when I try and read in the 2008 data, I’m
>> getting this error:
>> ###
>> Wide character in print at
>> /Library/Frameworks/R.framework/Versions/3.1/Resources/library/gdata/perl/
>> xls2csv.pl line 270.
>> Warning message:
>> In scan(file, what, nmax, sep, dec, quote, skip, nlines, na.strings,  :
>>   EOF within quoted string
>> ###
>>
>>
>>
>> Here is the code I’m running to get the data:
>> ###
>> install.packages("gdata")
>> library("gdata")
>> fileUrl <- "
>> http://www.ferc.gov/industries/electric/indus-act/demand-response/2008/survey/ami_survey_responses.xls
>> "
>> download.file(fileUrl, destfile="./ami.data/ami-data2008.xls")
>> list.files("ami.data")
>> dateDown.2008 <- date()
>> ami.data2008 <- read.xls("./ami.data/ami-data2008.xls", sheet=1,
>> header=TRUE)
>> ###
>>
>>
>> Reviewed the data in the XLS file, and both “” and # are present within
>> it. Don’t know how to get the read.xls to ignore them so I can read all the
>> data into my data frame. Tried :
>> ###
>> ami.data2008 <- read.xls("./ami.data/ami-data2008.xls", sheet=1, quote="",
>> header=TRUE)
>> ###
>>
>>
>> And it spits out “More columns than column names” output.
>>
>>
>> Been searching this, and I can find some “solutions” for read.table, but
>> nothing specific to read.xls
>>
>>
>> Many thanks,
>>
>>
>> Benjamin Baker
>>
>>
>>
>> —
>> Sent from Mailbox
>> [[alternative HTML version deleted]]
>>
>> __
>> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide
>> http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Having trouble with gdata read in

2015-03-27 Thread Benjamin Baker

Jim,




Thanks, XLConnect with proper syntax works great for both types of files.


—
Sent from Mailbox

On Thu, Mar 26, 2015 at 5:15 AM, jim holtman  wrote:

> My suggestion is to use XLConnect to read the file:
>> x <-
> "C:\\Users\\jh52822\\AppData\\Local\\Temp\\Rtmp6nVgFC\\file385c632aba3.xls"
>> require(XLConnect)
> Loading required package: XLConnect
> Loading required package: XLConnectJars
> XLConnect 0.2-10 by Mirai Solutions GmbH [aut],
>   Martin Studer [cre],
>   The Apache Software Foundation [ctb, cph] (Apache POI, Apache Commons
> Codec),
>   Stephen Colebourne [ctb, cph] (Joda-Time Java library)
> http://www.mirai-solutions.com ,
> http://miraisolutions.wordpress.com
>> input <- f.readXLSheet(x, 1)
>>
>> str(input)
> 'data.frame':   2266 obs. of  51 variables:
>  $ EIA  : num  34 59 87 97 108 118 123 149
> 150 157 ...
>  $ Entity.Name  : chr  "City of Abbeville" "City of
> Abbeville" "City of Ada" "Adams Electric Cooperative" ...
>  $ State: chr  "SC" "LA" "MN" "IL" ...
>  $ NERC.Region  : chr  "SERC" "SPP" "MRO" "SERC" ...
>  $ Filing.Order : num  12 11 1237 392 252 ...
>  $ Q5.MultRegion: chr  "" "" "" "" ...
>  $ Q6.OwnMeters.: chr  "Yes" "Yes" "Yes" "Yes" ...
>  $ Q7.ResMeters : num  3051 4253 857 8154 33670 ...
>  $ Q7.ComMeters : num  531 972 132 155 1719 ...
>  $ Q7.IntMeters : num  0 19 32 NA 626 NA 29 0 2 NA
> ...
>  $ Q7.TransMeters   : num  0 NA NA NA NA NA NA 0 0 NA
> ...
>  $ Q7.OtherMeters   : num  0 NA NA 57 NA NA NA 0 0 NA
> ...
>  $ Q7...total.meters: num  3582 5244 1021 8366 36015 ...
>  $ Q8.15Min.ResAMI  : num  0 NA NA NA NA NA NA NA NA NA
> ...
>  $ Q8.15Min.ComAMI  : num  0 NA NA 155 NA NA NA NA NA
> NA ...
>  $ Q8.15Min.IndAMI  : num  0 NA NA NA NA NA NA NA NA NA
> ...
>  $ Q8.15Min.TransAMI: num  0 NA NA NA NA NA NA NA NA NA
> ...
>  $ Q8.15Min.OtherAMI: num  0 NA NA NA NA NA NA NA NA NA
> ...
>  $ Q8.15Min.TotalAMI: num  0 0 0 155 0 0 0 0 0 0 ...
>  $ Q8.Hourly.ResAMI : num  0 NA NA NA 16100 NA NA NA NA
> NA ...
>  $ Q8.Hourly.ComAMI : num  0 NA NA NA 1600 NA NA NA NA
> NA ...
> 
> Jim Holtman
> Data Munger Guru
> What is the problem that you are trying to solve?
> Tell me what you want to do, not how you want to do it.
> On Wed, Mar 25, 2015 at 5:01 PM, Benjamin Baker  wrote:
>> Trying to read and clean up the FERC data on Advanced Metering
>> infrastructure. Of course it is in XLS for the first two survey years and
>> then converts to XLSX for the final two. Bad enough that it is all in
>> excel, they had to change the survey design and data format as well. Still,
>> I’m sorting through it. However, when I try and read in the 2008 data, I’m
>> getting this error:
>> ###
>> Wide character in print at
>> /Library/Frameworks/R.framework/Versions/3.1/Resources/library/gdata/perl/
>> xls2csv.pl line 270.
>> Warning message:
>> In scan(file, what, nmax, sep, dec, quote, skip, nlines, na.strings,  :
>>   EOF within quoted string
>> ###
>>
>>
>>
>> Here is the code I’m running to get the data:
>> ###
>> install.packages("gdata")
>> library("gdata")
>> fileUrl <- "
>> http://www.ferc.gov/industries/electric/indus-act/demand-response/2008/survey/ami_survey_responses.xls
>> "
>> download.file(fileUrl, destfile="./ami.data/ami-data2008.xls")
>> list.files("ami.data")
>> dateDown.2008 <- date()
>> ami.data2008 <- read.xls("./ami.data/ami-data2008.xls", sheet=1,
>> header=TRUE)
>> ###
>>
>>
>> Reviewed the data in the XLS file, and both “” and # are present within
>> it. Don’t know how to get the read.xls to ignore them so I can read all the
>> data into my data frame. Tried :
>> ###
>> ami.data2008 <- read.xls("./ami.data/ami-data2008.xls", sheet=1, quote="",
>> header=TRUE)
>> ###
>>
>>
>> And it spits out “More columns than column names” output.
>>
&g

Re: [R] Having trouble with gdata read in

2015-03-27 Thread Benjamin Baker

Jim,

I’m not seeing the command f.readXLSheet in the documentation, nor is it 
executing in my code.

—
Sent from Mailbox

On Thursday, Mar 26, 2015 at 5:15 AM, jim holtman , wrote:

My suggestion is to use XLConnect to read the file:

> x <- 
> "C:\\Users\\jh52822\\AppData\\Local\\Temp\\Rtmp6nVgFC\\file385c632aba3.xls"

> require(XLConnect)

Loading required package: XLConnect

Loading required package: XLConnectJars

XLConnect 0.2-10 by Mirai Solutions GmbH [aut],

  Martin Studer [cre],

  The Apache Software Foundation [ctb, cph] (Apache POI, Apache Commons

    Codec),

  Stephen Colebourne [ctb, cph] (Joda-Time Java library)

http://www.mirai-solutions.com ,

http://miraisolutions.wordpress.com

> input <- f.readXLSheet(x, 1)

> 

> str(input)

'data.frame':   2266 obs. of  51 variables:

 $ EIA                                  : num  34 59 87 97 108 118 123 149 150 
157 ...

 $ Entity.Name                          : chr  "City of Abbeville" "City of 
Abbeville" "City of Ada" "Adams Electric Cooperative" ...

 $ State                                : chr  "SC" "LA" "MN" "IL" ...

 $ NERC.Region                          : chr  "SERC" "SPP" "MRO" "SERC" ...

 $ Filing.Order                         : num  12 11 1237 392 252 ...

 $ Q5.MultRegion                        : chr  "" "" "" "" ...

 $ Q6.OwnMeters.                        : chr  "Yes" "Yes" "Yes" "Yes" ...

 $ Q7.ResMeters                         : num  3051 4253 857 8154 33670 ...

 $ Q7.ComMeters                         : num  531 972 132 155 1719 ...

 $ Q7.IntMeters                         : num  0 19 32 NA 626 NA 29 0 2 NA ...

 $ Q7.TransMeters                       : num  0 NA NA NA NA NA NA 0 0 NA ...

 $ Q7.OtherMeters                       : num  0 NA NA 57 NA NA NA 0 0 NA ...

 $ Q7...total.meters                    : num  3582 5244 1021 8366 36015 ...

 $ Q8.15Min.ResAMI                      : num  0 NA NA NA NA NA NA NA NA NA ...

 $ Q8.15Min.ComAMI                      : num  0 NA NA 155 NA NA NA NA NA NA ...

 $ Q8.15Min.IndAMI                      : num  0 NA NA NA NA NA NA NA NA NA ...

 $ Q8.15Min.TransAMI                    : num  0 NA NA NA NA NA NA NA NA NA ...

 $ Q8.15Min.OtherAMI                    : num  0 NA NA NA NA NA NA NA NA NA ...

 $ Q8.15Min.TotalAMI                    : num  0 0 0 155 0 0 0 0 0 0 ...

 $ Q8.Hourly.ResAMI                     : num  0 NA NA NA 16100 NA NA NA NA NA 
...

 $ Q8.Hourly.ComAMI                     : num  0 NA NA NA 1600 NA NA NA NA NA 
...

Jim Holtman
Data Munger Guru

What is the problem that you are trying to solve?
Tell me what you want to do, not how you want to do it.

On Wed, Mar 25, 2015 at 5:01 PM, Benjamin Baker  wrote:
Trying to read and clean up the FERC data on Advanced Metering infrastructure. 
Of course it is in XLS for the first two survey years and then converts to XLSX 
for the final two. Bad enough that it is all in excel, they had to change the 
survey design and data format as well. Still, I’m sorting through it. However, 
when I try and read in the 2008 data, I’m getting this error:

###

Wide character in print at 
/Library/Frameworks/R.framework/Versions/3.1/Resources/library/gdata/perl/xls2csv.pl
 line 270.

Warning message:

In scan(file, what, nmax, sep, dec, quote, skip, nlines, na.strings,  :

  EOF within quoted string

###

Here is the code I’m running to get the data:

###

install.packages("gdata")

library("gdata")

fileUrl <- 
"http://www.ferc.gov/industries/electric/indus-act/demand-response/2008/survey/ami_survey_responses.xls";

download.file(fileUrl, destfile="./ami.data/ami-data2008.xls")

list.files("ami.data")

dateDown.2008 <- date()

ami.data2008 <- read.xls("./ami.data/ami-data2008.xls", sheet=1, header=TRUE)

###

Reviewed the data in the XLS file, and both “” and # are present within it. 
Don’t know how to get the read.xls to ignore them so I can read all the data 
into my data frame. Tried :

###

ami.data2008 <- read.xls("./ami.data/ami-data2008.xls", sheet=1, quote="", 
header=TRUE)

###

And it spits out “More columns than column names” output.

Been searching this, and I can find some “solutions” for read.table, but 
nothing specific to read.xls

Many thanks,

Benjamin Baker

—

Sent from Mailbox

        [[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help

PLEASE do read the posting guide http://www.R-project.org/posting-guide.html

and provide commented, minimal, self-contained, reproducible code.

[R] lattice: packet.number() versus panel.number()

2014-08-26 Thread Benjamin Tyner

Hi,

According to
https://svn.r-project.org/R-packages/trunk/lattice/R/print.trellis.R,

"[panel.number] is usually the same as, but can be different from
packet.number"

and I had been under the impression that as long as the user is not
using a custom index.cond nor perm.cond, the panel.number would in fact
be the same as the packet.number.

However, I have recently come across a case where the two are *not* the
same, even though I am not using a custom index.cond nor perm.cond.

So my question is, what might be some other possible situations in which
the two would be expected to differ?

Regards,
Ben

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] lattice: packet.number() versus panel.number()

2014-08-27 Thread Benjamin Tyner

Thanks David. It turns out that I was in fact using a custom index.perm,
but not on purpose. What happened was I used the "[" method on the
trellis object ( lattice:::`[.trellis`), which of course is nothing but
a short-cut for updating the index.perm. Lesson learned...

Regards,
Ben

On 08/27/2014 02:10 AM, David Winsemius wrote:
> On Aug 26, 2014, at 3:25 PM, Benjamin Tyner wrote:
>
>> Hi,
>>
>> According to
>> https://svn.r-project.org/R-packages/trunk/lattice/R/print.trellis.R,
>>
>>"[panel.number] is usually the same as, but can be different from
>> packet.number"
>>
>> and I had been under the impression that as long as the user is not
>> using a custom index.cond nor perm.cond, the panel.number would in fact
>> be the same as the packet.number.
>>
>> However, I have recently come across a case where the two are *not* the
>> same, even though I am not using a custom index.cond nor perm.cond.
>>
>> So my question is, what might be some other possible situations in which
>> the two would be expected to differ?
> The immediate hypothesis that leaps to mind is cases where there are multiple 
> pages. On each page I suspect the upper left numbering restarts with 1, but I 
> suspect the packet numbers are sequential increasing.
>

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] from cut.Date

2014-09-17 Thread Benjamin Tyner

Hello,

I'm wondering if this is expected?

> cut(structure(1, class="Date"), structure(c(11100,1),
class="Date"))
[1] 
Levels: 2000-05-23

The help page says that "for ‘"Date"’ objects, only ‘"day"’, ‘"week"’,
‘"month"’, ‘"quarter"’ and ‘"year"’ are allowed" [for the 'breaks'
argument]. Though I am not sure whether this statement is only
applicable in the context of the previous sentence about interval
specification (i.e., a roundabout way of saying that ‘"sec"’, ‘"min"’,
‘"hour"’, and ‘"DSTday"’ are not allowed for 'Date' objects), or whether
it also means that a vector of cut points (as in my example) is likewise
not allowed? If the latter, then perhaps the function out to error out
rather than return  in this case?

Regards
Ben

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] from cut.Date

2014-09-17 Thread Benjamin Tyner

Thanks Brian! 

The confusion was due to my failure to notice that these two functions have 
opposite defaults for 'right':

   > args(cut.default)
   function (x, breaks, labels = NULL, include.lowest = FALSE, right = TRUE, 
dig.lab = 3L, ordered_result = FALSE, ...) 

   > args(cut.Date)
   function (x, breaks, labels = NULL, start.on.monday = TRUE, right = FALSE, 
...) 

I suppose the latter does make sense, given that days, months, years etc are 
right-continuous functions of time.

Regards
Ben

> On 17/09/2014 12:04, Benjamin Tyner wrote:
> >/ Hello,
> />/
> />/ I'm wondering if this is expected?
> /
> It is as documented!
> >/
> />/  > cut(structure(1, class="Date"), structure(c(11100,1),
> />/ class="Date"))
> />/  [1] 
> />/  Levels: 2000-05-23
> />/
> />/ The help page says that "for ‘"Date"’ objects, only ‘"day"’, ‘"week"’,
> />/ ‘"month"’, ‘"quarter"’ and ‘"year"’ are allowed" [for the 'breaks'
> />/ argument]. Though I am not sure whether this statement is only
> />/ applicable in the context of the previous sentence about interval
> />/ specification (i.e., a roundabout way of saying that ‘"sec"’, ‘"min"’,
> />/ ‘"hour"’, and ‘"DSTday"’ are not allowed for 'Date' objects), or whether
> />/ it also means that a vector of cut points (as in my example) is likewise
> />/ not allowed? If the latter, then perhaps the function out to error out
> />/ rather than return  in this case?
> /
> The NA is correct: the value you pass is not covered by the 'breaks' you 
> specified.  As the help says
>
>   Using both ‘right = TRUE’ and ‘include.lowest = TRUE’ will include
>   both ends of the range of dates.
>
> With the default values, only the lower end is included.
>
> -- 
> Brian D. Ripley,  ripley at stats.ox.ac.uk 
> <https://stat.ethz.ch/mailman/listinfo/r-help>
> Emeritus Professor of Applied Statistics, University of Oxford
> 1 South Parks Road, Oxford OX1 3TG, UK


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] scaling of Amat still recommended for quadprog::solve.QP ?

2014-09-25 Thread Benjamin Tyner

Greetings,

I ran across this recommendation, to keep the norms of the columns of
the Amat on similar
scales,

https://stat.ethz.ch/pipermail/r-help/2007-September/141335.html

However, when I looked at the code, I noticed that the norms are already
being calculated:

c
c calculate the norm of each column of the A matrix
c
  do 51 i=1,q
 sum = 0.d0
 do 52 j=1,n
sum = sum + amat(j,i)*amat(j,i)
 52  continue
 work(iwnbv+i) = sqrt(sum)
 51   continue
  nact = 0
  iter(1) = 0
  iter(2) = 0
 50   continue

though I am not sure exactly how these norms get used subsequently. My
question is, is it no longer necessary to follow Berwin's recommendation
from 2007? Or are the norms being calculated for some other purpose, and
the recommendation still applies?

Regards
Ben

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] quadprog::solve.QP sometimes returns NaNs

2014-09-26 Thread Benjamin Tyner

Hello,

Here is an example; hopefully it is reproducible on others' platform:

library(quadprog)

n <- 66L

set.seed(6860)
X <- matrix(1e-20, n, n)
diag(X) <- 1
Dmat <- crossprod(X)
y <- seq_len(n)
dvec <- crossprod(X, y)

Amat <- diag(n)
bvec <- y + runif(n)

sol <- solve.QP(Dmat, dvec, Amat, bvec, meq = n)

print(sol$solution) # this gives all NaNs

under sessionInfo():

R version 3.0.2 (2013-09-25)
Platform: x86_64-pc-linux-gnu (64-bit)

locale:
 [1] LC_CTYPE=en_US.UTF-8   LC_NUMERIC=C 
 [3] LC_TIME=en_US.UTF-8LC_COLLATE=en_US.UTF-8   
 [5] LC_MONETARY=en_US.UTF-8LC_MESSAGES=en_US.UTF-8  
 [7] LC_PAPER=en_US.UTF-8   LC_NAME=C
 [9] LC_ADDRESS=C   LC_TELEPHONE=C   
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C  

attached base packages:
[1] stats graphics  grDevices utils datasets  methods  
base

other attached packages:
[1] quadprog_1.5-5

Any ideas?

Thanks
Ben

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] backsolve, chol, Matrix, and SparseM

2015-09-24 Thread Benjamin Tyner

Hi

I have some code which does (on a symmetric matrix 'x')

backsolve(chol(x), diag(nrow(x)))

and I am wondering what is the recommended way to accomplish this when x
is also sparse (from package:Matrix). I know that package:Matrix
provides a chol method for such matrices, but not a backsolve method. On
the other hand, package:SparseM does provide a backsolve method, but
doesn't actually return a sparse matrix. Moreover, I am a little
hesitant to use SparseM, as the vignette seems to be from 2003.

I did notice that help(topic = "solve", package = "Matrix") says "In
‘solve(a,b)’ in the ‘Matrix’ package, ‘a’ may also be a
‘MatrixFactorization’ instead of directly a matrix." which makes me
think this is the right way:

Matrix::solve(Cholesky(x), .sparseDiagonal(nrow(x)))

but unfortunately this didn't give the same result as:

Matrix::solve(chol(x), .sparseDiagonal(nrow(x)))

so I'm asking here in case someone has any suggestions.

Regards,
Ben

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] backsolve, chol, Matrix, and SparseM

2015-09-25 Thread Benjamin Tyner

Hi Martin,

Thanks for the remarks and examples, and for confirming that I was
indeed barking up the wrong tree with SparseM.

A. I assume that is a typo and you meant to say, no need for backsolve().
B. Absolutely; however, in this case I am taking advantage of
quadprog::solve.QP(..., factorized = TRUE) which requires the inverse of
the Cholesky factor; it turns out to be faster to compute this one time
upfront rather than have solve.QP(..., factorized = FALSE) do it over
and over again. Of course the holy grail would be a QP solver which
takes advantage of the various innovations from package:Matrix, but I
digress...
C. Agreed, assuming you are talking about Matrix::solve(X) on X of class
Matrix. On the other hand for a regular matrix x it is not difficult to
construct examples where backsolve(chol(x), diag(nrow(x))) is twice as
fast as base::solve(chol(x)), which led me down this path in the first
place.

By the way, is R-forge still the correct place to report bugs in
package:Matrix?

Regards
Ben


On 09/25/2015 04:25 AM, Martin Maechler wrote:
> Dear Ben,
>
>>>>>> Benjamin Tyner 
>>>>>> on Thu, 24 Sep 2015 13:47:58 -0400 writes:
> > Hi I have some code which does (on a symmetric matrix 'x')
>
> > backsolve(chol(x), diag(nrow(x)))
>
> > and I am wondering what is the recommended way to
> > accomplish this when x is also sparse (from
> > package:Matrix). I know that package:Matrix provides a
> > chol method for such matrices, but not a backsolve
> > method. On the other hand, package:SparseM does provide a
> > backsolve method, but doesn't actually return a sparse
> > matrix. Moreover, I am a little hesitant to use SparseM,
> > as the vignette seems to be from 2003.
>
> Roger Koenker has agreed in the past, that new projects should
> rather use Matrix.   SparseM has been the very first R package
> providing sparse matrix support.
>
>
> > I did notice that help(topic = "solve", package =
> > "Matrix") says "In ‘solve(a,b)’ in the ‘Matrix’ package,
> > ‘a’ may also be a ‘MatrixFactorization’ instead of
> > directly a matrix." which makes me think this is the right
> > way:
>
> > Matrix::solve(Cholesky(x), .sparseDiagonal(nrow(x)))
>
> > but unfortunately this didn't give the same result as:
>
> > Matrix::solve(chol(x), .sparseDiagonal(nrow(x)))
>
> > so I'm asking here in case someone has any suggestions.
>
> You don't give any examples.
> So a few remarks and a reproducible example to get more concrete
>
> A. As the Matrix package has classes for triangular matrices and
>   Matrix :: chol() returns them, there   is no need for
>   forwardsolve() or backwardsolve(), as just   solve() is always
>   enough.
>
> B. As Doug Bates has been teaching for many decennia, "it is
>   almost always computationally *wrong* to compute a matrix
>   inverse explicitly".
>   Rather computeA^{-1} B   or  A^{-1} x  {for vector x,
>   matrix B (but different from Identity).
>
> C. Inspite of B, there are cases (such as computing sandwich
>   estimates of covariance matrices) where you do want the inverse.
>   In that case,
>
>solve(A) is semantically equivalent to
>solve(A, diag(.))
>
>and almost always the *first* form is implempented more
>efficiently than the second.
>
> D. In Matrix,  use chol(.) ... unless you really read a bit
>about Cholesky(.) and its special purpose sparse cholesky decompositions.
>As mentioned above,  Matrix :: chol()  will return a
>"formally triangular" matrix, i.e., inheriting from
>"triangularMatrix"; in the sparse case, very typically of
>specific class "dtCMatrix".
>
> Here's a small reproducible example,
> please use it to ask further questions:
>
> *.R:
>
> library(Matrix)
> M <- as(diag(4)+1,"dsCMatrix")
> m <- as(M, "matrix") # -> traditional R matrix
> stopifnot( all(M == m) )
> M
> L <- Cholesky(M,super=TRUE,perm=FALSE) # a MatrixFactorization ("dCHMsuper")
> (L. <- as(L, "Matrix")) #-> lower-triagonal (sparseMatrix, specifically 
> "dtCMatrix")
> (cM <- chol(M))# *upper* triagonal ("dtCMatrix")
> (cm <- chol(m))#  upper  triagonal traditional matrix -- the same "of course" 
> :
> all.equal(as.matrix(cM), cm) # TRUE
>
> (r. <- backsolve(cm, diag(4)))# upper tri. (traditional) matrix
> (R. <- solve(cM) ) ## the "same"  (but nicer printing)
> all.equal

[R] efficiently multiply each column of a sparse Matrix by a sparse vector

2015-11-28 Thread Benjamin Tyner


Hi,

Say I have a sparse Matrix X, and a sparse vector (stored as a 1-column 
sparse Matrix A), with X and A having the same number of rows, and I 
wish to multiply each column of X by A, but would like the operation to 
take full advantage of the sparseness of both X and A. In other words I 
want the result to be another sparse Matrix but not having any zeros 
calculated or stored unnecessarily. For concreteness,


   library(Matrix)
   set.seed(6860)
   X <- sparseMatrix(i = sample(1:10, 5L),
 j = sample(1:10, 5L),
 x = rep(1, 5),
 dims = c(10L, 10L)
 )
   A <- sparseMatrix(i = sample(1:10, 5L),
 j = rep(1L, 5L),
 x = rep(1, 5),
 dims = c(10L, 1L)
 )

and observe that

   print(X * A[, 1L, drop=TRUE])

gives the following, in which three 0s are not represented sparsely,

   10 x 10 sparse Matrix of class "dgCMatrix"

[1,] . . . . . . . . . .
[2,] . . 1 . . . . . . .
[3,] . . . . . . . . . .
[4,] . . . . . 0 . . . .
[5,] . . . . . . 0 . . .
[6,] . 1 . . . . . . . .
[7,] . . . . . . . . . .
[8,] . . . . . . . . . .
[9,] . . . . . . . . . .
   [10,] 0 . . . . . . . . .

in other words I am wondering if there is a more efficient way to arrive 
at the same result as,


   print(X * A[, rep(1L, ncol(X)), drop=FALSE])

   10 x 10 sparse Matrix of class "dgCMatrix"

[1,] . . . . . . . . . .
[2,] . . 1 . . . . . . .
[3,] . . . . . . . . . .
[4,] . . . . . . . . . .
[5,] . . . . . . . . . .
[6,] . 1 . . . . . . . .
[7,] . . . . . . . . . .
[8,] . . . . . . . . . .
[9,] . . . . . . . . . .
   [10,] . . . . . . . . . .

without the additional overhead of duplicating A for ncol(X) times.

This seems like such a simple thing, but has me stumped. Any ideas?

Regards,
Ben

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] choropleth packages (US)

2015-12-06 Thread Benjamin Tyner


Hi

I wish to draw a basic choropleth (US, by state) and am wondering if 
anyone has any recommendations? I've tried the following thus far:


1. choroplethr: this works, but required installation of 30+ 
dependencies. I would prefer something with fewer dependencies.
2. tmap: this also seems promising, but most of the examples I saw were 
specific to European maps. Can it be adapted for US?
3. statebins: doesn't draw true choropleths, but I liked that it doesn't 
have many dependencies.


Regards
Ben

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] choropleth packages (US)

2015-12-10 Thread Benjamin Tyner

Very nice Adrian. Is there a straightforward way to add Alaska and 
Hawaii at the lower left? (without resorting to choroplethr package)


On 12/10/2015 06:09 AM, Adrian Waddell wrote:

Hi,

You can also use the 'maps' package for the map data and the 'scales'
package for the color mapping.

E.g.

library(maps)
library(scales)

m <- map('state', fill=TRUE, plot=FALSE)

s_data <- tolower(rownames(USArrests))
s_map <- tolower(m$names)

mapping <- lapply(s_data, function(state) {
   which(grepl(state, s_map))
})
## check if the mapping is good!

col_pal <- col_numeric("Greens", domain=NULL, na.color = 'lightyellow')

cols <- rep('lightyellow', length(s_data))

Map(function(indices, col) {
   cols[indices] <<- col
}, mapping, col_pal(USArrests$UrbanPop))

map(m, col=cols, fill=TRUE)


Adrian



On Mon, Dec 7, 2015 at 9:34 AM, Erich Neuwirth
 wrote:

ggplot2 also can do this with
fortify
geom_polygon

Von meinem iPad gesendet


Am 06.12.2015 um 21:03 schrieb Benjamin Tyner :

Hi

I wish to draw a basic choropleth (US, by state) and am wondering if anyone has 
any recommendations? I've tried the following thus far:

1. choroplethr: this works, but required installation of 30+ dependencies. I 
would prefer something with fewer dependencies.
2. tmap: this also seems promising, but most of the examples I saw were 
specific to European maps. Can it be adapted for US?
3. statebins: doesn't draw true choropleths, but I liked that it doesn't have 
many dependencies.

Regards
Ben

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] R_DirtyImage and Rprof

2016-05-11 Thread Benjamin Tyner


Hello,

I have some code which was running in interactive mode while Rprof(..., 
line.profiling = TRUE). Near the end of my script, it opens up a 
pipe(..., open = "w") to a perl script, and at that point the execution 
gets stuck using 100% cpu.


(The perl script itself never showed up in pstree, as far as I can tell).

I did a "tail -f" on the file being written to by Rprof, and it was 
reporting "sys.save.image" over and over, and in fact an ".RData" file 
appeared when I had not asked for one, and I was able to load it later.


This got me curious, as nowhere in my code do I directly use that 
function. Looking through the source code for R, it appears that 
"sys.save.image" is called whenever an R_DirtyImage condition is triggered.


This was using R version 3.2.2 under RHEL. My efforts to create a 
reproducible example of this behavior have thus far been unsuccessful.


My questions: is there any documentation for R_DirtyImage, and how 
plausible is it that the R_DirtyImage condition was triggered by 
something Rprof did? The reason for my conjecture is that 
sys.save.image() calls closeAllConnections(), which I imagine might have 
interfered with the pipe that was open for writing, thus causing the 
stuck execution at that point.


If so, any advice for avoiding the R_DirtyImage condition while profiling?

If not, any conjectures for what might actually be going on? For what 
it's worth, I have observed a similar situation when using Rprof + 
system() instead of pipe(); for example:


   https://stat.ethz.ch/pipermail/r-help/2015-August/431286.html

Regards
Ben

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] Significance of Svyrepdesign Object Warning

2016-10-23 Thread Courtney Benjamin

Hello R Users,

I am using Lumley's Survey Package in R to analyze complex survey data that 
involves 200 balanced repeated replicate (BRR) weight variables.  I have 
ensured that my svyrepdesign object that specifies the application of the BRR 
weights to the data set is accurate and I have matched the published standard 
errors of the data set.

When doing a logistic regression through the svyglm call, I receive the 
following warning:

In object$survey.design$pweights * presid^2 :
  longer object length is not a multiple of shorter object length?
I have search around quite a bit online and have not been able to find any good 
interpretation of its meaning.  I want to be sure that I am not making some 
type of mistake that is causing this warning to be produced.  Any advisement is 
greatly appreciated.
The following is an MRE that can be pasted into the R console:
library(RCurl)
library(survey)
data <- 
getURL("https://raw.githubusercontent.com/cbenjamin1821/careertech-ed/master/elsq1adj.csv";)
elsq1ch <- read.csv(text = data)
#Specifying the svyrepdesign object which applies the BRR weights
elsq1ch_brr<-svrepdesign(variables = elsq1ch[,1:16], repweights = 
elsq1ch[,18:217], weights = elsq1ch[,17], combined.weights = TRUE, type = "BRR")
elsq1ch_brr
#Logistic regression call which yields a warning regarding svyrepdesign object
svyglm(formula=F3ATTAINB~F1PARED+BYINCOME+F1RACE+F1SEX+F1RGPP2+F1HIMATH+F1RTRCC,family="binomial",design=elsq1ch_brr,subset=BYSCTRL==1&G10COHRT==1,na.action=na.exclude)
allCC <- 
summary(svyglm(formula=F3ATTAINB~F1PARED+BYINCOME+F1RACE+F1SEX+F1RGPP2+F1HIMATH+F1RTRCC,family="binomial",design=elsq1ch_brr,subset=BYSCTRL==1&G10COHRT==1,na.action=na.exclude))
allCC

#Session Info
#R version 3.3.1 (2016-06-21)
#Platform: x86_64-w64-mingw32/x64 (64-bit)
#Running under: Windows >= 8 x64 (build 9200)

#locale:
#  [1] LC_COLLATE=English_United States.1252  LC_CTYPE=English_United 
States.1252
#[3] LC_MONETARY=English_United States.1252 LC_NUMERIC=C
#[5] LC_TIME=English_United States.1252
#attached base packages:
#  [1] grid  stats graphics  grDevices utils datasets  methods   
base
#other attached packages:
#[1] survey_3.31-2   survival_2.39-4 Matrix_1.2-6RCurl_1.95-4.8  
bitops_1.0-6
#loaded via a namespace (and not attached):
#[1] tools_3.3.1 splines_3.3.1   knitr_1.14  lattice_0.20-33


Courtney Benjamin

Broome-Tioga BOCES

Automotive Technology II Teacher

Located at Gault Toyota

Doctoral Candidate-Educational Theory & Practice

State University of New York at Binghamton

cbenj...@btboces.org<mailto:cbenj...@btboces.org>

607-763-8633

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Significance of Svyrepdesign Object Warning

2016-10-23 Thread Courtney Benjamin

Thank you for your help.  I did try Anthony's recommendation of removing the 
'na.action=na.exclude' ; I thought I needed that argument as the data set 
includes NA values.  I found it interesting that without the 
'na.action=na.exclude' argument, the baseline level of two of my predictor 
variables (BYINCOME & F1HIMATH) were changed.


Courtney Benjamin

Broome-Tioga BOCES

Automotive Technology II Teacher

Located at Gault Toyota

Doctoral Candidate-Educational Theory & Practice

State University of New York at Binghamton

cbenj...@btboces.org<mailto:cbenj...@btboces.org>

607-763-8633


From: William Dunlap 
Sent: Sunday, October 23, 2016 2:24 PM
To: Anthony Damico
Cc: Courtney Benjamin; r-help@r-project.org; Thomas Lumley
Subject: Re: [R] Significance of Svyrepdesign Object Warning

The immediate problem could be solved by changing the following lines in 
survey:::summary.svrepglm from
presid <- resid(object, "pearson")
dispersion <- sum(object$survey.design$pweights * presid^2,
na.rm = TRUE)/sum(object$survey.design$pweights)
to
presid <- resid(object, "pearson")
pweights <- naresid(object$na.action, object$survey.design$pweights)
dispersion <- sum(pweights * presid^2, na.rm = TRUE)/sum(pweights,
na.rm = TRUE)

'naresid' uses the information from na.exclude to match up the residuals
with the row in the data that they correspond to.  resid() calls it so it should
also be applied to pweights so they line up correctly.




Bill Dunlap
TIBCO Software
wdunlap tibco.com<http://tibco.com>

On Sun, Oct 23, 2016 at 11:17 AM, Anthony Damico 
mailto:ajdam...@gmail.com>> wrote:
hi, great example.  i am ccing survey package author/maintainer dr.
lumley.  why do you have `na.action=na.exclude`?  if you remove it, things
work as expected--


library(RCurl)
library(survey)
data <- getURL("
https://raw.githubusercontent.com/cbenjamin1821/careertech-ed/master/elsq1adj.csv
")
elsq1ch <- read.csv(text = data)
#Specifying the svyrepdesign object which applies the BRR weights
elsq1ch_brr<-svrepdesign(variables = elsq1ch[,1:16], repweights =
elsq1ch[,18:217], weights = elsq1ch[,17], combined.weights = TRUE, type =
"BRR")
elsq1ch_brr
#Logistic regression call which yields a warning regarding svyrepdesign
object

# your warning
a <-
svyglm(formula=F3ATTAINB~F1PARED+BYINCOME+F1RACE+F1SEX+F1RGPP2+F1HIMATH+F1RTRCC,family="binomial",design=elsq1ch_brr,subset=BYSCTRL==1&G10COHRT==1,na.action=na.exclude)
summary(a)

# works fine
a <-
svyglm(formula=F3ATTAINB~F1PARED+BYINCOME+F1RACE+F1SEX+F1RGPP2+F1HIMATH+F1RTRCC,family="binomial",design=elsq1ch_brr,subset=BYSCTRL==1&G10COHRT==1)
summary(a)



the mismatch of vectors generating that warning happens inside

debug(survey:::summary.svrepglm)

[..snip..]

Browse[2]> length(presid)
[1] 12614
Browse[2]> length(object$survey.design$pweights)
[1] 8397


and including vs excluding the na.action=na.exclude gives you a
slightly different dispersion parameter calculation

(Dispersion parameter for binomial family taken to be 0.7756235)

(Dispersion parameter for binomial family taken to be 0.7849244)


not sure if the two survey:::residuals.sv<http://residuals.sv>* methods should 
deal with the
na.action= parameter?


thanks

On Sun, Oct 23, 2016 at 11:56 AM, Courtney Benjamin 
mailto:cbenj...@btboces.org>>
wrote:

> Hello R Users,
>
> I am using Lumley's Survey Package in R to analyze complex survey data
> that involves 200 balanced repeated replicate (BRR) weight variables.  I
> have ensured that my svyrepdesign object that specifies the application of
> the BRR weights to the data set is accurate and I have matched the
> published standard errors of the data set.
>
> When doing a logistic regression through the svyglm call, I receive the
> following warning:
>
> In object$survey.design$pweights * presid^2 :
>   longer object length is not a multiple of shorter object length?
> I have search around quite a bit online and have not been able to find any
> good interpretation of its meaning.  I want to be sure that I am not making
> some type of mistake that is causing this warning to be produced.  Any
> advisement is greatly appreciated.
> The following is an MRE that can be pasted into the R console:
> library(RCurl)
> library(survey)
> data <- getURL("https://raw.githubusercontent.com/
> cbenjamin1821/careertech-ed/master/elsq1adj.csv")
> elsq1ch <- read.csv(text = data)
> #Specifying the svyrepdesign object which applies the BRR weights
> elsq1ch_brr<-svrepdesign(variables = elsq1ch[,1:16], repweights =
> elsq1ch[,18:217], weights = elsq1ch[,17]

Re: [R] Significance of Svyrepdesign Object Warning

2016-10-27 Thread Courtney Benjamin

Hello Mr. Dunlap,

I have gone back and re-read the responses to my question.  I am interested in 
trying to apply your recommendation so I am doing things correctly; however I 
am not sure how to go about doing it within my code.  It appears that you are 
digging quite deeply into R where I am not yet familiar.  I am including a 
reproducible example; would you be willing to show an example of how it would 
be done?  I greatly appreciate your advisement and time.

Sincerely,

Courtney


library(RCurl)
library(survey)
data <- 
getURL("https://raw.githubusercontent.com/cbenjamin1821/careertech-ed/master/elsq1adj.csv";)
elsq1ch <- read.csv(text = data)
#Specifying the svyrepdesign object which applies the BRR weights
elsq1ch_brr<-svrepdesign(variables = elsq1ch[,1:16], repweights = 
elsq1ch[,18:217], weights = elsq1ch[,17], combined.weights = TRUE, type = "BRR")
elsq1ch_brr
#Logistic regression call which yields a warning regarding svyrepdesign object
allCC 
<-svyglm(formula=F3ATTAINB~F1PARED+BYINCOME+F1RACE+F1SEX+F1RGPP2+F1HIMATH+F1RTRCC,family="binomial",design=elsq1ch_brr,subset=BYSCTRL==1&G10COHRT==1,na.action=na.exclude)
summary(allCC)





Courtney Benjamin

Broome-Tioga BOCES

Automotive Technology II Teacher

Located at Gault Toyota

Doctoral Candidate-Educational Theory & Practice

State University of New York at Binghamton

cbenj...@btboces.org<mailto:cbenj...@btboces.org>

607-763-8633


From: William Dunlap 
Sent: Sunday, October 23, 2016 2:24 PM
To: Anthony Damico
Cc: Courtney Benjamin; r-help@r-project.org; Thomas Lumley
Subject: Re: [R] Significance of Svyrepdesign Object Warning

The immediate problem could be solved by changing the following lines in 
survey:::summary.svrepglm from
presid <- resid(object, "pearson")
dispersion <- sum(object$survey.design$pweights * presid^2,
na.rm = TRUE)/sum(object$survey.design$pweights)
to
presid <- resid(object, "pearson")
pweights <- naresid(object$na.action, object$survey.design$pweights)
dispersion <- sum(pweights * presid^2, na.rm = TRUE)/sum(pweights,
na.rm = TRUE)

'naresid' uses the information from na.exclude to match up the residuals
with the row in the data that they correspond to.  resid() calls it so it should
also be applied to pweights so they line up correctly.




Bill Dunlap
TIBCO Software
wdunlap tibco.com<http://tibco.com>

On Sun, Oct 23, 2016 at 11:17 AM, Anthony Damico 
mailto:ajdam...@gmail.com>> wrote:
hi, great example.  i am ccing survey package author/maintainer dr.
lumley.  why do you have `na.action=na.exclude`?  if you remove it, things
work as expected--


library(RCurl)
library(survey)
data <- getURL("
https://raw.githubusercontent.com/cbenjamin1821/careertech-ed/master/elsq1adj.csv
")
elsq1ch <- read.csv(text = data)
#Specifying the svyrepdesign object which applies the BRR weights
elsq1ch_brr<-svrepdesign(variables = elsq1ch[,1:16], repweights =
elsq1ch[,18:217], weights = elsq1ch[,17], combined.weights = TRUE, type =
"BRR")
elsq1ch_brr
#Logistic regression call which yields a warning regarding svyrepdesign
object

# your warning
a <-
svyglm(formula=F3ATTAINB~F1PARED+BYINCOME+F1RACE+F1SEX+F1RGPP2+F1HIMATH+F1RTRCC,family="binomial",design=elsq1ch_brr,subset=BYSCTRL==1&G10COHRT==1,na.action=na.exclude)
summary(a)

# works fine
a <-
svyglm(formula=F3ATTAINB~F1PARED+BYINCOME+F1RACE+F1SEX+F1RGPP2+F1HIMATH+F1RTRCC,family="binomial",design=elsq1ch_brr,subset=BYSCTRL==1&G10COHRT==1)
summary(a)



the mismatch of vectors generating that warning happens inside

debug(survey:::summary.svrepglm)

[..snip..]

Browse[2]> length(presid)
[1] 12614
Browse[2]> length(object$survey.design$pweights)
[1] 8397


and including vs excluding the na.action=na.exclude gives you a
slightly different dispersion parameter calculation

(Dispersion parameter for binomial family taken to be 0.7756235)

(Dispersion parameter for binomial family taken to be 0.7849244)


not sure if the two survey:::residuals.sv<http://residuals.sv>* methods should 
deal with the
na.action= parameter?


thanks

On Sun, Oct 23, 2016 at 11:56 AM, Courtney Benjamin 
mailto:cbenj...@btboces.org>>
wrote:

> Hello R Users,
>
> I am using Lumley's Survey Package in R to analyze complex survey data
> that involves 200 balanced repeated replicate (BRR) weight variables.  I
> have ensured that my svyrepdesign object that specifies the application of
> the BRR weights to the data set is accurate and I have matched the
> published standard errors of the data set.
>
> When doing a logistic regression through the svyglm call, I receive the
> following warning:
>
> In object$survey.des

Re: [R] Significance of Svyrepdesign Object Warning

2016-10-27 Thread Courtney Benjamin

Thank you; I will do so.


Courtney Benjamin

Broome-Tioga BOCES

Automotive Technology II Teacher

Located at Gault Toyota

Doctoral Candidate-Educational Theory & Practice

State University of New York at Binghamton

cbenj...@btboces.org<mailto:cbenj...@btboces.org>

607-763-8633


From: William Dunlap 
Sent: Thursday, October 27, 2016 9:59 PM
To: Courtney Benjamin
Cc: Anthony Damico; r-help@r-project.org; Thomas Lumley
Subject: Re: [R] Significance of Svyrepdesign Object Warning

For, now I would just use na.action=na.omit instead of na.exclude.  My comments 
were mainly for the package author.

Bill Dunlap
TIBCO Software
wdunlap tibco.com<http://tibco.com>

On Thu, Oct 27, 2016 at 5:53 PM, Courtney Benjamin 
mailto:cbenj...@btboces.org>> wrote:

Hello Mr. Dunlap,

I have gone back and re-read the responses to my question.  I am interested in 
trying to apply your recommendation so I am doing things correctly; however I 
am not sure how to go about doing it within my code.  It appears that you are 
digging quite deeply into R where I am not yet familiar.  I am including a 
reproducible example; would you be willing to show an example of how it would 
be done?  I greatly appreciate your advisement and time.

Sincerely,

Courtney


library(RCurl)
library(survey)
data <- 
getURL("https://raw.githubusercontent.com/cbenjamin1821/careertech-ed/master/elsq1adj.csv";)
elsq1ch <- read.csv(text = data)
#Specifying the svyrepdesign object which applies the BRR weights
elsq1ch_brr<-svrepdesign(variables = elsq1ch[,1:16], repweights = 
elsq1ch[,18:217], weights = elsq1ch[,17], combined.weights = TRUE, type = "BRR")
elsq1ch_brr
#Logistic regression call which yields a warning regarding svyrepdesign object
allCC 
<-svyglm(formula=F3ATTAINB~F1PARED+BYINCOME+F1RACE+F1SEX+F1RGPP2+F1HIMATH+F1RTRCC,family="binomial",design=elsq1ch_brr,subset=BYSCTRL==1&G10COHRT==1,na.action=na.exclude)
summary(allCC)





Courtney Benjamin

Broome-Tioga BOCES

Automotive Technology II Teacher

Located at Gault Toyota

Doctoral Candidate-Educational Theory & Practice

State University of New York at Binghamton

cbenj...@btboces.org<mailto:cbenj...@btboces.org>

607-763-8633


From: William Dunlap mailto:wdun...@tibco.com>>
Sent: Sunday, October 23, 2016 2:24 PM
To: Anthony Damico
Cc: Courtney Benjamin; r-help@r-project.org<mailto:r-help@r-project.org>; 
Thomas Lumley
Subject: Re: [R] Significance of Svyrepdesign Object Warning

The immediate problem could be solved by changing the following lines in 
survey:::summary.svrepglm from
presid <- resid(object, "pearson")
dispersion <- sum(object$survey.design$pweights * presid^2,
na.rm = TRUE)/sum(object$survey.design$pweights)
to
presid <- resid(object, "pearson")
pweights <- naresid(object$na.action, object$survey.design$pweights)
dispersion <- sum(pweights * presid^2, na.rm = TRUE)/sum(pweights,
na.rm = TRUE)

'naresid' uses the information from na.exclude to match up the residuals
with the row in the data that they correspond to.  resid() calls it so it should
also be applied to pweights so they line up correctly.




Bill Dunlap
TIBCO Software
wdunlap tibco.com<http://tibco.com>

On Sun, Oct 23, 2016 at 11:17 AM, Anthony Damico 
mailto:ajdam...@gmail.com>> wrote:
hi, great example.  i am ccing survey package author/maintainer dr.
lumley.  why do you have `na.action=na.exclude`?  if you remove it, things
work as expected--


library(RCurl)
library(survey)
data <- getURL("
https://raw.githubusercontent.com/cbenjamin1821/careertech-ed/master/elsq1adj.csv
")
elsq1ch <- read.csv(text = data)
#Specifying the svyrepdesign object which applies the BRR weights
elsq1ch_brr<-svrepdesign(variables = elsq1ch[,1:16], repweights =
elsq1ch[,18:217], weights = elsq1ch[,17], combined.weights = TRUE, type =
"BRR")
elsq1ch_brr
#Logistic regression call which yields a warning regarding svyrepdesign
object

# your warning
a <-
svyglm(formula=F3ATTAINB~F1PARED+BYINCOME+F1RACE+F1SEX+F1RGPP2+F1HIMATH+F1RTRCC,family="binomial",design=elsq1ch_brr,subset=BYSCTRL==1&G10COHRT==1,na.action=na.exclude)
summary(a)

# works fine
a <-
svyglm(formula=F3ATTAINB~F1PARED+BYINCOME+F1RACE+F1SEX+F1RGPP2+F1HIMATH+F1RTRCC,family="binomial",design=elsq1ch_brr,subset=BYSCTRL==1&G10COHRT==1)
summary(a)



the mismatch of vectors generating that warning happens inside

debug(survey:::summary.svrepglm)

[..snip..]

Browse[2]> length(presid)
[1] 12614
Browse[2]> length(object$survey.design$pweights)
[1] 8397


and including vs excluding the na.action=na.exclude gives you a
slightly different dispersion parameter calculation

(Di

[R] Resetting Baseline Level of Predictor in svyglm Function

2016-10-31 Thread Courtney Benjamin

Hello R Users:

I am using the survey package in R for modeling with complex survey data. I am 
trying to reset the baseline level of certain predictor variables being used in 
a logistic regression without success. The following is a reproducible example:

library(RCurl)
library(survey)

data <- 
getURL("https://raw.githubusercontent.com/cbenjamin1821/careertech-ed/master/elsq1adj.csv";)
elsq1ch <- read.csv(text = data)

#Specifying the svyrepdesign object which applies the BRR weights
elsq1ch_brr<-svrepdesign(variables = elsq1ch[,1:16], repweights = 
elsq1ch[,18:217], weights = elsq1ch[,17], combined.weights = TRUE, type = "BRR")
elsq1ch_brr

#Log. Reg. model
allCC <- 
svyglm(formula=F3ATTAINB~F1PARED+BYINCOME+F1RACE+F1SEX+F1RGPP2+F1HIMATH+F1RTRCC,family="binomial",design=elsq1ch_brr,subset=BYSCTRL==1&G10COHRT==1,na.action=na.omit)
summary(allCC)

##Attempting to reset baseline level for predictor variable
#Both attempts did not work
elsq1ch$F1HIMATH <- C(elsq1ch$F1HIMATH,contr.treatment, base=1)
elsq1ch$F1HIMATH <- relevel(elsq1ch$F1HIMATH,"PreAlg or Less")

#Log. Reg. model with no changes in baseline levels for the predictors
allCC <- 
svyglm(formula=F3ATTAINB~F1PARED+BYINCOME+F1RACE+F1SEX+F1RGPP2+F1HIMATH+F1RTRCC,family="binomial",design=elsq1ch_brr,subset=BYSCTRL==1&G10COHRT==1,na.action=na.omit)
summary(allCC)


Any guidance is greatly appreciated.?

Sincerely,

Courtney?

Courtney Benjamin

Broome-Tioga BOCES

Automotive Technology II Teacher

Located at Gault Toyota

Doctoral Candidate-Educational Theory & Practice

State University of New York at Binghamton

cbenj...@btboces.org<mailto:cbenj...@btboces.org>

607-763-8633

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Resetting Baseline Level of Predictor in svyglm Function

2016-11-03 Thread Courtney Benjamin

Thank you, Anthony; it worked flawlessly.?


Courtney Benjamin

Broome-Tioga BOCES

Automotive Technology II Teacher

Located at Gault Toyota

Doctoral Candidate-Educational Theory & Practice

State University of New York at Binghamton

cbenj...@btboces.org<mailto:cbenj...@btboces.org>

607-763-8633


From: Anthony Damico 
Sent: Tuesday, November 1, 2016 3:20 AM
To: Courtney Benjamin
Cc: r-help@r-project.org
Subject: Re: [R] Resetting Baseline Level of Predictor in svyglm Function

hi, i think you want

elsq1ch_brr <- update( elsq1ch_brr , F1HIMATH = relevel(F1HIMATH,"PreAlg or 
Less") )





On Mon, Oct 31, 2016 at 9:05 PM, Courtney Benjamin 
mailto:cbenj...@btboces.org>> wrote:
Hello R Users:

I am using the survey package in R for modeling with complex survey data. I am 
trying to reset the baseline level of certain predictor variables being used in 
a logistic regression without success. The following is a reproducible example:

library(RCurl)
library(survey)

data <- 
getURL("https://raw.githubusercontent.com/cbenjamin1821/careertech-ed/master/elsq1adj.csv";)
elsq1ch <- read.csv(text = data)

#Specifying the svyrepdesign object which applies the BRR weights
elsq1ch_brr<-svrepdesign(variables = elsq1ch[,1:16], repweights = 
elsq1ch[,18:217], weights = elsq1ch[,17], combined.weights = TRUE, type = "BRR")
elsq1ch_brr

#Log. Reg. model
allCC <- 
svyglm(formula=F3ATTAINB~F1PARED+BYINCOME+F1RACE+F1SEX+F1RGPP2+F1HIMATH+F1RTRCC,family="binomial",design=elsq1ch_brr,subset=BYSCTRL==1&G10COHRT==1,na.action=na.omit)
summary(allCC)

##Attempting to reset baseline level for predictor variable
#Both attempts did not work
elsq1ch$F1HIMATH <- C(elsq1ch$F1HIMATH,contr.treatment, base=1)
elsq1ch$F1HIMATH <- relevel(elsq1ch$F1HIMATH,"PreAlg or Less")

#Log. Reg. model with no changes in baseline levels for the predictors
allCC <- 
svyglm(formula=F3ATTAINB~F1PARED+BYINCOME+F1RACE+F1SEX+F1RGPP2+F1HIMATH+F1RTRCC,family="binomial",design=elsq1ch_brr,subset=BYSCTRL==1&G10COHRT==1,na.action=na.omit)
summary(allCC)


Any guidance is greatly appreciated.?

Sincerely,

Courtney?

Courtney Benjamin

Broome-Tioga BOCES

Automotive Technology II Teacher

Located at Gault Toyota

Doctoral Candidate-Educational Theory & Practice

State University of New York at Binghamton

cbenj...@btboces.org<mailto:cbenj...@btboces.org><mailto:cbenj...@btboces.org<mailto:cbenj...@btboces.org>>

607-763-8633

[[alternative HTML version deleted]]

__
R-help@r-project.org<mailto:R-help@r-project.org> mailing list -- To 
UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] Archer-Lemeshow Goodness of Fit Test for Survey Data with Log. Regression

2016-11-16 Thread Courtney Benjamin

?Hello R Experts,

I am trying to implement the Archer-Lemeshow GOF Test for survey data on a 
logistic regression model using the survey package based upon an R Help Archive 
post that I found where Dr. Thomas Lumley advised how to do it: 
http://r.789695.n4.nabble.com/Goodness-of-t-tests-for-Complex-Survey-Logistic-Regression-td4668233.html

Everything is going well until I get to the point where I have to add the 
objects 'r' and 'g' as variables to the data frame by either using the 
transform function or the update function to update the svrepdesign object.  
The log. regression model involved uses a subset of data and some of the values 
in the data frame are NA, so that is affecting my ability to add 'r' and 'g' as 
variables; I am getting an error because I only have 8397 rows for the new 
variables and 16197 in the data frame and svrepdesign object.  I am not sure 
how to overcome this error.

The following is a MRE:

##Archer Lemeshow Goodness of Fit Test for Complex Survey Data with Logistic 
Regression

library(RCurl)
library(survey)

data <- 
getURL("https://raw.githubusercontent.com/cbenjamin1821/careertech-ed/master/elsq1adj.csv";)
elsq1ch <- read.csv(text = data)

#Specifying the svyrepdesign object which applies the BRR weights
elsq1ch_brr<-svrepdesign(variables = elsq1ch[,1:16], repweights = 
elsq1ch[,18:217], weights = elsq1ch[,17], combined.weights = TRUE, type = "BRR")
elsq1ch_brr

##Resetting baseline levels for predictors
elsq1ch_brr <- update( elsq1ch_brr , F1HIMATH = relevel(F1HIMATH,"PreAlg or 
Less") )
elsq1ch_brr <- update( elsq1ch_brr , BYINCOME = relevel(BYINCOME,"0-25K") )
elsq1ch_brr <- update( elsq1ch_brr , F1RACE = relevel(F1RACE,"White") )
elsq1ch_brr <- update( elsq1ch_brr , F1SEX = relevel(F1SEX,"Male") )
elsq1ch_brr <- update( elsq1ch_brr , F1RTRCC = relevel(F1RTRCC,"Academic") )

#Log. Reg. model-all curric. concentrations including F1RTRCC as a predictor
allCC <- 
svyglm(formula=F3ATTAINB~F1PARED+BYINCOME+F1RACE+F1SEX+F1RGPP2+F1HIMATH+F1RTRCC,family="binomial",design=elsq1ch_brr,subset=BYSCTRL==1&G10COHRT==1,na.action=na.omit)
summary(allCC)

#Recommendations from Lumley (from R Help Archive) on implementing the Archer 
Lemeshow GOF test
r <- residuals(allCC, type="response")
f<-fitted(allCC)
g<- cut(f, c(-Inf, quantile(f,  (1:9)/10, Inf)))

# now create a new design object with r and g added as variables
#This is the area where I am having problems as my model involves a subset and 
some values are NA as well
#I am also not sure if I am naming/specifying the new variables of r and g 
properly
transform(elsq1ch,r=r,g=g)
elsq1ch_brr <- update(elsq1ch_brr,tag=g,tag=r)
#then:
decilemodel<- svyglm(r~g, design=newdesign)
regTermTest(decilemodel, ~g)
#is the F-adjusted mean residual test from the Archer Lemeshow paper

Thank you,
Courtney

?

Courtney Benjamin

Broome-Tioga BOCES

Automotive Technology II Teacher

Located at Gault Toyota

Doctoral Candidate-Educational Theory & Practice

State University of New York at Binghamton

cbenj...@btboces.org<mailto:cbenj...@btboces.org>

607-763-8633

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Archer-Lemeshow Goodness of Fit Test for Survey Data with Log. Regression

2016-11-18 Thread Courtney Benjamin

Thank you, Anthony.  Your approach does work; however, I am concerned to some 
degree about subsetting the data prior to creating the new svrepdesign as I 
know that it is not recommended to subset the data prior to creating the 
svrepdesign object.  I am not sure if this is a significant concern in this 
context as the model was fitted with the original svrepdesign that was created 
prior to subsetting any data and the new svrepdesign is being used to run the 
diagnostic for the model.  Any thoughts on that issue?

Also, from my understanding of the outcome of the diagnostic, low p values 
indicate a poor model fit.

Sincerely,

Courtney


Courtney Benjamin

Broome-Tioga BOCES

Automotive Technology II Teacher

Located at Gault Toyota

Doctoral Candidate-Educational Theory & Practice

State University of New York at Binghamton

cbenj...@btboces.org<mailto:cbenj...@btboces.org>

607-763-8633


From: Anthony Damico 
Sent: Thursday, November 17, 2016 4:28 AM
To: Courtney Benjamin
Cc: r-help@r-project.org
Subject: Re: [R] Archer-Lemeshow Goodness of Fit Test for Survey Data with Log. 
Regression

great minimal reproducible example, thanks.  does something like this work?



#Log. Reg. model-all curric. concentrations including F1RTRCC as a predictor
allCC <- 
svyglm(formula=F3ATTAINB~F1PARED+BYINCOME+F1RACE+F1SEX+F1RGPP2+F1HIMATH+F1RTRCC,family="binomial",design=elsq1ch_brr,subset=BYSCTRL==1&G10COHRT==1,na.action=na.exclude)
summary(allCC)

r <- residuals(allCC, type="response")
f<-fitted(allCC)
your_g<- cut(f, c(-Inf, quantile(f,  (1:9)/10, Inf)))

elsq1ch[ elsq1ch$BYSCTRL==1&elsq1ch$G10COHRT==1 , 'your_g' ] <- your_g
elsq1ch[ elsq1ch$BYSCTRL==1&elsq1ch$G10COHRT==1 , 'r' ] <- r
newdesign<-svrepdesign(variables = elsq1ch, repweights = elsq1ch[,18:217], 
weights = elsq1ch[,17], combined.weights = TRUE, type = "BRR")

decilemodel<- svyglm(r~your_g, design=newdesign,subset=BYSCTRL==1&G10COHRT==1)
regTermTest(decilemodel, ~your_g)




On Wed, Nov 16, 2016 at 10:15 PM, Courtney Benjamin 
mailto:cbenj...@btboces.org>> wrote:
?Hello R Experts,

I am trying to implement the Archer-Lemeshow GOF Test for survey data on a 
logistic regression model using the survey package based upon an R Help Archive 
post that I found where Dr. Thomas Lumley advised how to do it: 
http://r.789695.n4.nabble.com/Goodness-of-t-tests-for-Complex-Survey-Logistic-Regression-td4668233.html

Everything is going well until I get to the point where I have to add the 
objects 'r' and 'g' as variables to the data frame by either using the 
transform function or the update function to update the svrepdesign object.  
The log. regression model involved uses a subset of data and some of the values 
in the data frame are NA, so that is affecting my ability to add 'r' and 'g' as 
variables; I am getting an error because I only have 8397 rows for the new 
variables and 16197 in the data frame and svrepdesign object.  I am not sure 
how to overcome this error.

The following is a MRE:

##Archer Lemeshow Goodness of Fit Test for Complex Survey Data with Logistic 
Regression

library(RCurl)
library(survey)

data <- 
getURL("https://raw.githubusercontent.com/cbenjamin1821/careertech-ed/master/elsq1adj.csv";)
elsq1ch <- read.csv(text = data)

#Specifying the svyrepdesign object which applies the BRR weights
elsq1ch_brr<-svrepdesign(variables = elsq1ch[,1:16], repweights = 
elsq1ch[,18:217], weights = elsq1ch[,17], combined.weights = TRUE, type = "BRR")
elsq1ch_brr

##Resetting baseline levels for predictors
elsq1ch_brr <- update( elsq1ch_brr , F1HIMATH = relevel(F1HIMATH,"PreAlg or 
Less") )
elsq1ch_brr <- update( elsq1ch_brr , BYINCOME = relevel(BYINCOME,"0-25K") )
elsq1ch_brr <- update( elsq1ch_brr , F1RACE = relevel(F1RACE,"White") )
elsq1ch_brr <- update( elsq1ch_brr , F1SEX = relevel(F1SEX,"Male") )
elsq1ch_brr <- update( elsq1ch_brr , F1RTRCC = relevel(F1RTRCC,"Academic") )

#Log. Reg. model-all curric. concentrations including F1RTRCC as a predictor
allCC <- 
svyglm(formula=F3ATTAINB~F1PARED+BYINCOME+F1RACE+F1SEX+F1RGPP2+F1HIMATH+F1RTRCC,family="binomial",design=elsq1ch_brr,subset=BYSCTRL==1&G10COHRT==1,na.action=na.omit)
summary(allCC)

#Recommendations from Lumley (from R Help Archive) on implementing the Archer 
Lemeshow GOF test
r <- residuals(allCC, type="response")
f<-fitted(allCC)
g<- cut(f, c(-Inf, quantile(f,  (1:9)/10, Inf)))

# now create a new design object with r and g added as variables
#This is the area where I am having problems as my model involves a subset and 
some values are NA as well
#I am also not sure if I am naming/specifying the new variables of r and g 
properly
transform(elsq1ch,r=r,g=g)
elsq1ch_brr <- update(els

Re: [R] Archer-Lemeshow Goodness of Fit Test for Survey Data with Log. Regression

2016-11-18 Thread Courtney Benjamin

Thank you; I appreciate your advisement.


Courtney Benjamin

Broome-Tioga BOCES

Automotive Technology II Teacher

Located at Gault Toyota

Doctoral Candidate-Educational Theory & Practice

State University of New York at Binghamton

cbenj...@btboces.org<mailto:cbenj...@btboces.org>

607-763-8633


From: Anthony Damico 
Sent: Friday, November 18, 2016 5:15 AM
To: Courtney Benjamin
Cc: r-help@r-project.org
Subject: Re: [R] Archer-Lemeshow Goodness of Fit Test for Survey Data with Log. 
Regression

hi, my code does not subset the survey design on the line that creates the 
svrepdesign().  subsetting in order to create a variable while your data is 
still a data.frame is probably okay, so long as you expect the observations 
outside of the subset to be NAs like they are in this case.  
nrow(elsq1ch_brr)==nrow(newdesign)

On Fri, Nov 18, 2016 at 5:06 AM, Courtney Benjamin 
mailto:cbenj...@btboces.org>> wrote:

Thank you, Anthony.  Your approach does work; however, I am concerned to some 
degree about subsetting the data prior to creating the new svrepdesign as I 
know that it is not recommended to subset the data prior to creating the 
svrepdesign object.  I am not sure if this is a significant concern in this 
context as the model was fitted with the original svrepdesign that was created 
prior to subsetting any data and the new svrepdesign is being used to run the 
diagnostic for the model.  Any thoughts on that issue?

Also, from my understanding of the outcome of the diagnostic, low p values 
indicate a poor model fit.

Sincerely,

Courtney


Courtney Benjamin

Broome-Tioga BOCES

Automotive Technology II Teacher

Located at Gault Toyota

Doctoral Candidate-Educational Theory & Practice

State University of New York at Binghamton

cbenj...@btboces.org<mailto:cbenj...@btboces.org>

607-763-8633


From: Anthony Damico mailto:ajdam...@gmail.com>>
Sent: Thursday, November 17, 2016 4:28 AM
To: Courtney Benjamin
Cc: r-help@r-project.org<mailto:r-help@r-project.org>
Subject: Re: [R] Archer-Lemeshow Goodness of Fit Test for Survey Data with Log. 
Regression

great minimal reproducible example, thanks.  does something like this work?



#Log. Reg. model-all curric. concentrations including F1RTRCC as a predictor
allCC <- 
svyglm(formula=F3ATTAINB~F1PARED+BYINCOME+F1RACE+F1SEX+F1RGPP2+F1HIMATH+F1RTRCC,family="binomial",design=elsq1ch_brr,subset=BYSCTRL==1&G10COHRT==1,na.action=na.exclude)
summary(allCC)

r <- residuals(allCC, type="response")
f<-fitted(allCC)
your_g<- cut(f, c(-Inf, quantile(f,  (1:9)/10, Inf)))

elsq1ch[ elsq1ch$BYSCTRL==1&elsq1ch$G10COHRT==1 , 'your_g' ] <- your_g
elsq1ch[ elsq1ch$BYSCTRL==1&elsq1ch$G10COHRT==1 , 'r' ] <- r
newdesign<-svrepdesign(variables = elsq1ch, repweights = elsq1ch[,18:217], 
weights = elsq1ch[,17], combined.weights = TRUE, type = "BRR")

decilemodel<- svyglm(r~your_g, design=newdesign,subset=BYSCTRL==1&G10COHRT==1)
regTermTest(decilemodel, ~your_g)




On Wed, Nov 16, 2016 at 10:15 PM, Courtney Benjamin 
mailto:cbenj...@btboces.org>> wrote:
?Hello R Experts,

I am trying to implement the Archer-Lemeshow GOF Test for survey data on a 
logistic regression model using the survey package based upon an R Help Archive 
post that I found where Dr. Thomas Lumley advised how to do it: 
http://r.789695.n4.nabble.com/Goodness-of-t-tests-for-Complex-Survey-Logistic-Regression-td4668233.html

Everything is going well until I get to the point where I have to add the 
objects 'r' and 'g' as variables to the data frame by either using the 
transform function or the update function to update the svrepdesign object.  
The log. regression model involved uses a subset of data and some of the values 
in the data frame are NA, so that is affecting my ability to add 'r' and 'g' as 
variables; I am getting an error because I only have 8397 rows for the new 
variables and 16197 in the data frame and svrepdesign object.  I am not sure 
how to overcome this error.

The following is a MRE:

##Archer Lemeshow Goodness of Fit Test for Complex Survey Data with Logistic 
Regression

library(RCurl)
library(survey)

data <- 
getURL("https://raw.githubusercontent.com/cbenjamin1821/careertech-ed/master/elsq1adj.csv";)
elsq1ch <- read.csv(text = data)

#Specifying the svyrepdesign object which applies the BRR weights
elsq1ch_brr<-svrepdesign(variables = elsq1ch[,1:16], repweights = 
elsq1ch[,18:217], weights = elsq1ch[,17], combined.weights = TRUE, type = "BRR")
elsq1ch_brr

##Resetting baseline levels for predictors
elsq1ch_brr <- update( elsq1ch_brr , F1HIMATH = relevel(F1HIMATH,"PreAlg or 
Less") )
elsq1ch_brr <- update( elsq1ch_brr , BYINCOME = relevel(BYINCOME,"0-25K") )
elsq1ch_brr <- update( elsq1ch_brr , F1R

Re: [R] Archer-Lemeshow Goodness of Fit Test for Survey Data with Log. Regression

2016-11-18 Thread Courtney Benjamin

Like I had said, I was concerned with the low p value outcome of this 
diagnostic test, so I went on to test other interactions that I had not 
previously tested.  I did find an interaction that is significant.  Now when I 
go to run the diagnostic to check for any improvement, I am coming up with an 
error.

##Testing with interaction model
allCCI12 <- 
svyglm(formula=F3ATTAINB~F1PARED+BYINCOME+F1RACE+F1SEX+F1RGPP2+F1HIMATH+F1RTRCC+F1HIMATH*F1RGPP2,family="binomial",design=elsq1ch_brr,subset=BYSCTRL==1&G10COHRT==1,na.action=na.exclude)
summary(allCCI12)

r <- residuals(allCCI12, type="response")
f<-fitted(allCCI12)
your_g<- cut(f, c(-Inf, quantile(f,  (1:9)/10, Inf)))

elsq1ch[ elsq1ch$BYSCTRL==1&elsq1ch$G10COHRT==1 , 'your_g' ] <- your_g
elsq1ch[ elsq1ch$BYSCTRL==1&elsq1ch$G10COHRT==1 , 'r' ] <- r
newdesign<-svrepdesign(variables = elsq1ch, repweights = elsq1ch[,18:217], 
weights = elsq1ch[,17], combined.weights = TRUE, type = "BRR")

decilemodel<- svyglm(r~your_g, design=newdesign,subset=BYSCTRL==1&G10COHRT==1)
regTermTest(decilemodel, ~your_g)






Courtney Benjamin

Broome-Tioga BOCES

Automotive Technology II Teacher

Located at Gault Toyota

Doctoral Candidate-Educational Theory & Practice

State University of New York at Binghamton

cbenj...@btboces.org<mailto:cbenj...@btboces.org>

607-763-8633


From: Anthony Damico 
Sent: Friday, November 18, 2016 5:15 AM
To: Courtney Benjamin
Cc: r-help@r-project.org
Subject: Re: [R] Archer-Lemeshow Goodness of Fit Test for Survey Data with Log. 
Regression

hi, my code does not subset the survey design on the line that creates the 
svrepdesign().  subsetting in order to create a variable while your data is 
still a data.frame is probably okay, so long as you expect the observations 
outside of the subset to be NAs like they are in this case.  
nrow(elsq1ch_brr)==nrow(newdesign)

On Fri, Nov 18, 2016 at 5:06 AM, Courtney Benjamin 
mailto:cbenj...@btboces.org>> wrote:

Thank you, Anthony.  Your approach does work; however, I am concerned to some 
degree about subsetting the data prior to creating the new svrepdesign as I 
know that it is not recommended to subset the data prior to creating the 
svrepdesign object.  I am not sure if this is a significant concern in this 
context as the model was fitted with the original svrepdesign that was created 
prior to subsetting any data and the new svrepdesign is being used to run the 
diagnostic for the model.  Any thoughts on that issue?

Also, from my understanding of the outcome of the diagnostic, low p values 
indicate a poor model fit.

Sincerely,

Courtney


Courtney Benjamin

Broome-Tioga BOCES

Automotive Technology II Teacher

Located at Gault Toyota

Doctoral Candidate-Educational Theory & Practice

State University of New York at Binghamton

cbenj...@btboces.org<mailto:cbenj...@btboces.org>

607-763-8633

____
From: Anthony Damico mailto:ajdam...@gmail.com>>
Sent: Thursday, November 17, 2016 4:28 AM
To: Courtney Benjamin
Cc: r-help@r-project.org<mailto:r-help@r-project.org>
Subject: Re: [R] Archer-Lemeshow Goodness of Fit Test for Survey Data with Log. 
Regression

great minimal reproducible example, thanks.  does something like this work?



#Log. Reg. model-all curric. concentrations including F1RTRCC as a predictor
allCC <- 
svyglm(formula=F3ATTAINB~F1PARED+BYINCOME+F1RACE+F1SEX+F1RGPP2+F1HIMATH+F1RTRCC,family="binomial",design=elsq1ch_brr,subset=BYSCTRL==1&G10COHRT==1,na.action=na.exclude)
summary(allCC)

r <- residuals(allCC, type="response")
f<-fitted(allCC)
your_g<- cut(f, c(-Inf, quantile(f,  (1:9)/10, Inf)))

elsq1ch[ elsq1ch$BYSCTRL==1&elsq1ch$G10COHRT==1 , 'your_g' ] <- your_g
elsq1ch[ elsq1ch$BYSCTRL==1&elsq1ch$G10COHRT==1 , 'r' ] <- r
newdesign<-svrepdesign(variables = elsq1ch, repweights = elsq1ch[,18:217], 
weights = elsq1ch[,17], combined.weights = TRUE, type = "BRR")

decilemodel<- svyglm(r~your_g, design=newdesign,subset=BYSCTRL==1&G10COHRT==1)
regTermTest(decilemodel, ~your_g)




On Wed, Nov 16, 2016 at 10:15 PM, Courtney Benjamin 
mailto:cbenj...@btboces.org>> wrote:
?Hello R Experts,

I am trying to implement the Archer-Lemeshow GOF Test for survey data on a 
logistic regression model using the survey package based upon an R Help Archive 
post that I found where Dr. Thomas Lumley advised how to do it: 
http://r.789695.n4.nabble.com/Goodness-of-t-tests-for-Complex-Survey-Logistic-Regression-td4668233.html

Everything is going well until I get to the point where I have to add the 
objects 'r' and 'g' as variables to the data frame by either using the 
transform function or the update function to update the svrepdesign object.  
The log. regression model involved uses a subset of data a

[R] Further Subsetting of Data for Log. Reg. Results in qr.default Error

2016-11-23 Thread Courtney Benjamin

Hello R Experts,

In further subsetting data within a logistic regression in the survey package, 
I am getting a qr.default error.

Below is an example of the original model that runs correctly, but when I 
subset the data further to look at students of a particular curriculum 
concentration, the qr.default error occurs.  I thought it may have been related 
to converting the F1RTRCC variable into a factor for use in the original model; 
I went back and restored that variable to its original form and it didn't help.

Any guidance is greatly appreciated.


library(RCurl)
library(survey)

data <- 
getURL("https://raw.githubusercontent.com/cbenjamin1821/careertech-ed/master/elsq1adj.csv";)
elsq1ch <- read.csv(text = data)

#Specifying the svyrepdesign object which applies the BRR weights
elsq1ch_brr<-svrepdesign(variables = elsq1ch[,1:16], repweights = 
elsq1ch[,18:217], weights = elsq1ch[,17], combined.weights = TRUE, type = "BRR")
elsq1ch_brr

##Resetting baseline levels for predictors
elsq1ch_brr <- update( elsq1ch_brr , F1HIMATH = relevel(F1HIMATH,"PreAlg or 
Less") )
elsq1ch_brr <- update( elsq1ch_brr , BYINCOME = relevel(BYINCOME,"0-25K") )
elsq1ch_brr <- update( elsq1ch_brr , F1RACE = relevel(F1RACE,"White") )
elsq1ch_brr <- update( elsq1ch_brr , F1SEX = relevel(F1SEX,"Male") )
elsq1ch_brr <- update( elsq1ch_brr , F1RTRCC = relevel(F1RTRCC,"Academic") )

#Log. Reg. model-all curric. concentrations including F1RTRCC as a predictor
allCC <- 
svyglm(formula=F3ATTAINB~F1PARED+BYINCOME+F1RACE+F1SEX+F1RGPP2+F1HIMATH+F1RTRCC,family="binomial",design=elsq1ch_brr,subset=BYSCTRL==1&G10COHRT==1,na.action=na.omit)
summary(allCC)

##CTE Log. Reg. model that is resulting in the qr.default error
CTE <- 
svyglm(formula=F3ATTAINB~F1PARED+BYINCOME+F1RACE+F1SEX+F1RGPP2+F1HIMATH,family="binomial",design=elsq1ch_brr,subset=BYSCTRL==1&G10COHRT==1&F1RTRCC=="Academic",na.action=na.omit)
summary(CTE)



Courtney Benjamin

Broome-Tioga BOCES

Automotive Technology II Teacher

Located at Gault Toyota

Doctoral Candidate-Educational Theory & Practice

State University of New York at Binghamton

cbenj...@btboces.org<mailto:cbenj...@btboces.org>

607-763-8633

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] Svyolr-Properly Specifying Start Values

2016-11-29 Thread Courtney Benjamin

Hello R Users,

I am trying to use the svyolr command and coming up with the following error:

Error in MASS::polr(formula, data = df, ..., Hess = TRUE, model = FALSE,  :
  attempt to find suitable starting values failed
>From what I have read online, a possible solution is to specify a value in the 
>start argument of svyolr; unfortunately, I have not been able to find any 
>detailed/clear descriptions about how to go about specifying values for the 
>start argument.  Any help in explaining how to go about establishing 
>reasonable start values would be greatly appreciated.

library (survey)
library(RCurl)

data <- 
getURL("https://raw.githubusercontent.com/cbenjamin1821/careertech-ed/master/elsq1adj2.csv";)
elsq1ch <- read.csv(text = data)


#Specifying the svyrepdesign object which applies the BRR weights
elsq1ch_brr<-svrepdesign(variables = elsq1ch[,1:16], repweights = 
elsq1ch[,18:217], weights = elsq1ch[,17], combined.weights = TRUE, type = "BRR")
elsq1ch_brr

allCColr <- 
svyolr(F3ATTAINMENT~F1PARED+BYINCOME+F1RACE+F1SEX+F1RGPP2+F1HIMATH+F1RTRCC,design=subset(elsq1ch_brr,BYSCTRL==1&G10COHRT==1),na.action=na.omit)

?





Courtney Benjamin

Broome-Tioga BOCES

Automotive Technology II Teacher

Located at Gault Toyota

Doctoral Candidate-Educational Theory & Practice

State University of New York at Binghamton

cbenj...@btboces.org<mailto:cbenj...@btboces.org>

607-763-8633

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] modify the imported version of a function

2016-12-16 Thread Benjamin Tyner


Hi

I saw on the assignInNamespace help page, that it changes "the copy in 
the namespace, but not any copies already exported from the namespace, 
in particular an object of that name in the package (if already 
attached) and any copies already imported into other namespaces."


So now I'm wondering, whether there is a way to modify such copies 
already imported into other namespaces? For a concrete example, consider 
the lint package (now archived on CRAN) which includes a function 
check_pattern, which calls stringr:::perl, which has been deprecated in 
newer versions:


   > stringr:::perl
   function (pattern)
   {
   message("perl is deprecated. Please use regexp instead")
   regex(pattern)
   }
   

so, what if I wanted to directly replace stringr:::perl with 
stringr:::regex, in such a way that lint::check_pattern sees the 
replacement? Is there a way, preferably without having to attach stringr 
to the search path?


(I'm mostly just interested in learning about how namespace imports 
actually work; of course in this toy example one could use 
suppressMessages to hide that deprecation message).


Regards
Ben

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] modify the imported version of a function

2016-12-19 Thread Benjamin Tyner

Figured it out...in case it is useful to others:

   > library(lint)

   > lint.enclos <- parent.env(asNamespace("lint"))

   > stopifnot(all(c("perl", "regex") %in% ls(lint.enclos)))

   > lint.enclos$perl
   function (pattern)
   {
   message("perl is deprecated. Please use regexp instead")
   regex(pattern)
   }

   > unlockBinding("perl", lint.enclos)

   > lint.enclos$perl <- lint.enclos$regex

   > lint.enclos$perl
   function (pattern, ignore_case = FALSE, multiline = FALSE, comments 
= FALSE,

   dotall = FALSE, ...)
   {
   options <- stri_opts_regex(case_insensitive = ignore_case,
   multiline = multiline, comments = comments, dotall = dotall,
   ...)
   structure(pattern, options = options, class = c("regex",
   "pattern", "character"))
   }

On 12/16/2016 06:06 PM, Benjamin Tyner wrote:

Hi

I saw on the assignInNamespace help page, that it changes "the copy in 
the namespace, but not any copies already exported from the namespace, 
in particular an object of that name in the package (if already 
attached) and any copies already imported into other namespaces."

So now I'm wondering, whether there is a way to modify such copies 
already imported into other namespaces? For a concrete example, 
consider the lint package (now archived on CRAN) which includes a 
function check_pattern, which calls stringr:::perl, which has been 
deprecated in newer versions:

   > stringr:::perl
   function (pattern)
   {
   message("perl is deprecated. Please use regexp instead")
   regex(pattern)
   }

so, what if I wanted to directly replace stringr:::perl with 
stringr:::regex, in such a way that lint::check_pattern sees the 
replacement? Is there a way, preferably without having to attach 
stringr to the search path?

(I'm mostly just interested in learning about how namespace imports 
actually work; of course in this toy example one could use 
suppressMessages to hide that deprecation message).

Regards
Ben

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] file.exists() on device files

2017-01-11 Thread Benjamin Tyner


Hi,

On my linux machine (Ubuntu, and also tested on RHEL), I am curious to 
know what might be causing file.exists (and also normalizePath) to not 
see the final device file here:


   > list.files("/dev/fd", full.names = TRUE)
   [1] "/dev/fd/0" "/dev/fd/1" "/dev/fd/2" "/dev/fd/3"
   > file.exists(list.files("/dev/fd", full.names = TRUE))
   [1]  TRUE  TRUE  TRUE FALSE
   > normalizePath(list.files("/dev/fd", full.names = TRUE))
   [1] "/dev/pts/2" "/dev/pts/2" "/dev/pts/2" "/dev/fd/3"
   Warning message:
   In normalizePath(list.files("/dev/fd", full.names = TRUE)) :
 path[4]="/dev/fd/3": No such file or directory

   > sessionInfo()
   R version 3.2.5 (2016-04-14)
   Platform: x86_64-pc-linux-gnu (64-bit)
   Running under: Ubuntu 16.04.1 LTS

   locale:
[1] LC_CTYPE=en_US.UTF-8   LC_NUMERIC=C
[3] LC_TIME=en_US.UTF-8LC_COLLATE=en_US.UTF-8
[5] LC_MONETARY=en_US.UTF-8LC_MESSAGES=en_US.UTF-8
[7] LC_PAPER=en_US.UTF-8   LC_NAME=C
[9] LC_ADDRESS=C   LC_TELEPHONE=C
   [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

   attached base packages:
   [1] stats graphics  grDevices utils datasets  methods base

Regards
Ben

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] [FORGED] file.exists() on device files

2017-01-13 Thread Benjamin Tyner


Thank you for the insights, Rolf and Henrik.

To give another example, this time in non-interactive mode,

   Rscript -e "file.exists(commandArgs(TRUE))" <(echo "Hi")

   [1] TRUE

versus

   Rscript -e "normalizePath(commandArgs(TRUE))" <(echo "Hi")
   [1] "/dev/fd/63"
   Warning message:
   In normalizePath(commandArgs(TRUE)) :
 path[1]="/dev/fd/63": No such file or directory

It almost seems like file.exists and normalizePath use separate criteria 
for determining existence?


Regards

Ben

On 01/12/2017 01:42 AM, Rolf Turner wrote:

On 12/01/17 16:33, Henrik Bengtsson wrote:




FYI, the /proc is there because Unix has something called the "proc
filesystem (procfs; https://en.wikipedia.org/wiki/Procfs) is a special
filesystem in Unix-like operating systems that presents information
about processes and other system information in a hierarchical
file-like structure".  For instance, you can query the uptime of the
machine by reading from /proc/uptime:

$ cat /proc/uptime
332826.96 661438.10

$ cat /proc/uptime
332871.40 661568.50


You can get all IDs (PIDs) of all processes currently running:

$ ls /proc/ | grep -E '^[0-9]+$'

and for each process you there are multiple attributes mapped as
files, e.g. if I start R as:

$ R --args -e "message('hello there')"

then I can query that process as:

$ pid=$(pidof R)
$ echo $pid
26323

$ cat /proc/26323/cmdline
/usr/lib/R/bin/exec/R--args-emessage('hello there')

Unix is neat


Indeed.  Couldn't agree more.  Thanks for the insight.



cheers,

Rolf



__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] flushing on.exit prior to q()

2017-02-21 Thread Benjamin Tyner


Hi,

When using a custom error function that calls q(), what is the 
recommended way to "flush" the calling function's on.exit ?


For example, say I have a script:

   #!/usr/bin/Rscript --no-init-file

   options(error = function() {

  cat("on error message\n", file = stderr())

  q(save = "no", status = 1)
   })

   test <- function() {

   on.exit(cat("on exit message\n", file = stderr()))

   stop("error")
   }

   test()

when I run the script, the "on exit message" does not print. (I found 
one solution might involve adding a stop() to my custom error function 
prior to the q(); however this is somewhat less than ideal: it also 
triggers an "error during wrapup" because of the recursion).


Other ideas, best practices?

Thanks
Ben

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Simple... but...

2008-07-23 Thread Nutter, Benjamin

> x <- c(1,3,5)
> y <- c(2,4,6)
> xy <- sort(c(x,y))

-Original Message-
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]
On Behalf Of Shubha Vishwanath Karanth
Sent: Wednesday, July 23, 2008 8:55 AM
To: [EMAIL PROTECTED]
Subject: [R] Simple... but...

Hi R,

If 

x=c(1,3,5)

y=c(2,4,6)

I need a vector which is c(1,2,3,4,5,6) from x and y.

How do I do it? I mean the best way

Thanks, Shubha

This e-mail may contain confidential and/or privileged
i...{{dropped:13}}

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

P Please consider the environment before printing this e-mail

Cleveland Clinic is ranked one of the top hospitals
in America by U.S. News & World Report (2007).  
Visit us online at http://www.clevelandclinic.org for
a complete listing of our services, staff and
locations.

Confidentiality Note:  This message is intended for use\...{{dropped:13}}

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Matrix barplot

2008-07-25 Thread Nutter, Benjamin

> data <- data.frame(Year=c(2000,2001,2002),
 A=c(4,2,1), 
 B=c(3,1,2), 
 C=c(0,3,5))

> data.mat <- as.matrix(data)[,2:4]
> rownames(data.mat) <- data$Year
> data.mat <- t(data.mat)
> barplot(data.mat,beside=TRUE)

-Original Message-
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]
On Behalf Of Andreas Tille
Sent: Friday, July 25, 2008 8:14 AM
To: r-help@r-project.org
Subject: [R] Matrix barplot

Hi,

as a bloody R beginner I failed to solve the probably simple problem
to create a barplot of the following data read from a file

  Year  A BC
  2000  4 30
  2001  2 13
  2002  1 25

The Barplot should look like

 5 | C
 4 |   A C
 3 |   AB  C C
 2 |   ABA CBC
 1 |   ABABC   ABC
   +--
  2000  2001  2002

(well, something like that - the colors are encoded as letters in this
  ASCII-graphics - assume the coloring / shading as usual).

My problem is that if I read the table using

data <- read.table(file='data.dat', sep = '\t', fill=TRUE,
header=TRUE )

everything looks is read as expected if I try

data

but I have no idea how to separate the Matrix (A B C) to specify the
height parameter of barplot as a matrix and the names.arg parameter
for the inscription.  Whatever I tried I get

'height' must be a vector or a matrix

and I have no idea to do this right.  So I think I'm just facing a
data conversion problem and have to turn data[['Year']]  into a vector
and the remaining table into a matrix.

I guess this is a really simple question - but I failed to find a
solution.

Kind regards

Andreas.

-- 
http://fam-tille.de

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.



P Please consider the environment before printing this e-mail

Cleveland Clinic is ranked one of the top hospitals
in America by U.S. News & World Report (2007).  
Visit us online at http://www.clevelandclinic.org for
a complete listing of our services, staff and
locations.


Confidentiality Note:  This message is intended for use\...{{dropped:13}}

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Color of box frame in Legend (Was: Matrix barplot)

2008-07-28 Thread Nutter, Benjamin

Try sourcing in the 'new.legend' function below. It's the legend
function with a new argument called 'box.col'.  The argument will change
the color of the box surrounding the legend.  If I understand what it is
you are looking for, this should work.  Also, I didn't see a way to
change the axis bar in your code, so I suppressed the axis in the call
to barplot, and manually replaced the axis using the axis function.  I
hope this works for you.

Benjamin

data <- data.frame(Year=c(2000,2001,2002),
  A=c(2,2,1),
  B=c(3,1,2),
  C=c(0,3,5))

data.mat <- as.matrix(data)[,2:4]
rownames(data.mat) <- data[['Year']]
data.mat <- t(data.mat)

textcolor="yellow"
par(col.axis=textcolor,col.main=textcolor)

barplot(data.mat,beside=TRUE,col=rainbow(3),main="Test",yaxt="n")
axis(2,at=0:5,col=textcolor)
new.legend(x="topleft", colnames(data[,2:4]),fill=rainbow(3),
inset=0.05,
text.col=textcolor,box.col=textcolor
   )



###

###
new.legend <- function (x, y = NULL, legend, fill = NULL, col =
par("col"), 
lty, lwd, pch, angle = 45, density = NULL, bty = "o", bg =
par("bg"), 
box.lwd = par("lwd"), box.lty = par("lty"), box.col="black",
pt.bg = NA, cex = 1, 
pt.cex = cex, pt.lwd = lwd, xjust = 0, yjust = 1, x.intersp = 1, 
y.intersp = 1, adj = c(0, 0.5), text.width = NULL, text.col =
par("col"), 
merge = do.lines && has.pch, trace = FALSE, plot = TRUE, 
ncol = 1, horiz = FALSE, title = NULL, inset = 0) 
{
if (missing(legend) && !missing(y) && (is.character(y) || 
is.expression(y))) {
legend <- y
y <- NULL
}
mfill <- !missing(fill) || !missing(density)
title <- as.graphicsAnnot(title)
if (length(title) > 1) 
stop("invalid title")
legend <- as.graphicsAnnot(legend)
n.leg <- if (is.call(legend)) 
1
else length(legend)
if (n.leg == 0) 
stop("'legend' is of length 0")
auto <- if (is.character(x)) 
match.arg(x, c("bottomright", "bottom", "bottomleft", 
"left", "topleft", "top", "topright", "right", "center"))
else NA
if (is.na(auto)) {
xy <- xy.coords(x, y)
x <- xy$x
y <- xy$y
nx <- length(x)
if (nx < 1 || nx > 2) 
stop("invalid coordinate lengths")
}
else nx <- 0
xlog <- par("xlog")
ylog <- par("ylog")
rect2 <- function(left, top, dx, dy, density = NULL, angle, 
...) {
r <- left + dx
if (xlog) {
left <- 10^left
r <- 10^r
}
b <- top - dy
if (ylog) {
top <- 10^top
b <- 10^b
}
rect(left, top, r, b, angle = angle, density = density, 
...)
}
segments2 <- function(x1, y1, dx, dy, ...) {
x2 <- x1 + dx
if (xlog) {
x1 <- 10^x1
x2 <- 10^x2
}
y2 <- y1 + dy
if (ylog) {
y1 <- 10^y1
y2 <- 10^y2
}
segments(x1, y1, x2, y2, ...)
}
points2 <- function(x, y, ...) {
if (xlog) 
x <- 10^x
if (ylog) 
y <- 10^y
points(x, y, ...)
}
text2 <- function(x, y, ...) {
if (xlog) 
x <- 10^x
if (ylog) 
y <- 10^y
text(x, y, ...)
}
if (trace) 
catn <- function(...) do.call("cat", c(lapply(list(...), 
formatC), list("\n")))
cin <- par("cin")
Cex <- cex * par("cex")
if (is.null(text.width)) 
text.width <- max(abs(strwidth(legend, units = "user", 
cex = cex)))
else if (!is.numeric(text.width) || text.width < 0) 
stop("'text.width' must be numeric, >= 0")
xc <- Cex * xinch(cin[1], warn.log = FALSE)
yc <- Cex * yinch(cin[2], warn.log = FALSE)
if (xc < 0) 
text.width <- -text.width
xchar <- xc
xextra <- 0
yextra <- yc * (y.intersp - 1)
ymax <- yc * max(1, strheight(legend, units = "user", cex = cex)/yc)
ychar <- yextra + ymax
if (trace) 
catn("  xchar=", xchar, "; (yextra,ychar)=", c(yextra, 
ychar))
if (mfill) {
xbox <- xc * 0.8
ybox <- yc * 0.5
dx.fill

[R] parent in Creating environment object

2008-08-01 Thread Benjamin Otto

Hi,

I would like to convert a simple list into an environment object. It seems I
have to create an environment object with new.env() and assign the single
values afterwards. Now what I did not really understand from the guides
until now is, how the parent environment supplied to the new.env() function
influence the final environment. So:

1. Do I ALWAYS have to supply a parent during creation?
2. If yes, what would that be, when all I want is a conversion from a simple
list?

Best regards

Benjamin

==
Benjamin Otto
University Hospital Hamburg-Eppendorf
Institute For Clinical Chemistry
Martinistr. 52
D-20246 Hamburg

Tel.: +49 40 42803 1908
Fax.: +49 40 42803 4971
==



-- 
Pflichtangaben gemäß Gesetz über elektronische Handelsregister und 
Genossenschaftsregister sowie das Unternehmensregister (EHUG):

Universitätsklinikum Hamburg-Eppendorf
Körperschaft des öffentlichen Rechts
Gerichtsstand: Hamburg

Vorstandsmitglieder:
Prof. Dr. Jörg F. Debatin (Vorsitzender)
Dr. Alexander Kirstein
Ricarda Klein
Prof. Dr. Dr. Uwe Koch-Gromus

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] Saving environment object

2008-08-15 Thread Benjamin Otto

Hi,

When I create an environment object with new.env() and populate it with
values then how can I save it into an .RData file properly, so it can be
loaded later on in a new session?

Saving an environment object with save() or save.image() results in an error
message when loading again:

Error: protect(): protection stack overflow

Regards,

benjamin

==
Benjamin Otto
University Hospital Hamburg-Eppendorf
Institute For Clinical Chemistry
Martinistr. 52
D-20246 Hamburg

Tel.: +49 40 42803 1908
Fax.: +49 40 42803 4971
==



-- 
Pflichtangaben gemäß Gesetz über elektronische Handelsregister und 
Genossenschaftsregister sowie das Unternehmensregister (EHUG):

Universitätsklinikum Hamburg-Eppendorf
Körperschaft des öffentlichen Rechts
Gerichtsstand: Hamburg

Vorstandsmitglieder:
Prof. Dr. Jörg F. Debatin (Vorsitzender)
Dr. Alexander Kirstein
Ricarda Klein
Prof. Dr. Dr. Uwe Koch-Gromus

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Saving environment object

2008-08-15 Thread Benjamin Otto

Hi Erik,

Yes this is what I was trying and your example or the one of luke is working
fine with me.

So now I'm not sure if this is due to an environment which takes too much
space. The environment troubling me has 644276 entries. Is this too much?

Benjamin

-Ursprüngliche Nachricht-
Von: Erik Iverson [mailto:[EMAIL PROTECTED] 
Gesendet: Friday, August 15, 2008 3:37 PM
An: Benjamin Otto
Cc: R-Help
Betreff: Re: [R] Saving environment object

Benjamin Otto wrote:
> Hi,
> 
> When I create an environment object with new.env() and populate it with
> values then how can I save it into an .RData file properly, so it can be
> loaded later on in a new session?
> 
> Saving an environment object with save() or save.image() results in an
error
> message when loading again:
> 
> Error: protect(): protection stack overflow

Can you give a small, reproducible example as the posting guide asks? 
And also provide your sessionInfo() ?

I am not able to replicate this.

test <- new.env()
assign("hi", pi, pos = test)
save(test, file = "~/testenv.Rdata")

does not give me an error.  Is this basically what you're trying?

-- 
Pflichtangaben gemäß Gesetz über elektronische Handelsregister und 
Genossenschaftsregister sowie das Unternehmensregister (EHUG):

Universitätsklinikum Hamburg-Eppendorf
Körperschaft des öffentlichen Rechts
Gerichtsstand: Hamburg

Vorstandsmitglieder:
Prof. Dr. Jörg F. Debatin (Vorsitzender)
Dr. Alexander Kirstein
Ricarda Klein
Prof. Dr. Dr. Uwe Koch-Gromus
__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Saving environment object

2008-08-15 Thread Benjamin Otto

Hi Luke, hi all,

You have been right guys, the hashing solved the problem. Thanks for your
hard efforts. :)

Best regards

Benjamin

-Ursprüngliche Nachricht-
Von: Luke Tierney [mailto:[EMAIL PROTECTED] 
Gesendet: Friday, August 15, 2008 6:29 PM
An: Prof Brian Ripley
Cc: Benjamin Otto; R-Help
Betreff: Re: [R] Saving environment object

On Fri, 15 Aug 2008, Prof Brian Ripley wrote:

> Having been told that this was a particular very large environment, Luke's

> comments in serialize.c wouild seem to apply:
>
>   The output format for dotted pairs writes the ATTRIB value first
>   rather than last.  This allows CDR's to be processed by iterative
>   tail calls to avoid recursion stack overflows when processing long
>   lists.  The writing code does take advantage of this, but the
>   reading code does not.  It hasn't been a big issue so far--the only
>   case where it has come up is in saving a large unhashed environment
>   where saving succeeds but loading fails because the PROTECT stack
>   overflows.  With the ability to create hashed environments at the
>   user level this is likely to be even less of an issue now.  But if
>   we do need to deal with it we can do so without a change in the
>   serialization format--just rewrite ReadItem to pass the place to
>   store the CDR it reads. (It's a bit of a pain to do, that is why it
>   is being deferred until it is clearly needed.)

Thanks -- I had forgotten about that (and would still prefer to defer
rewriting ReadItem :-))

luke

> So I think the moral is to hash large environments, and increasing 
> --max-ppsize should enable this one to be read in.
>
> On Fri, 15 Aug 2008, Luke Tierney wrote:
>
>> On Fri, 15 Aug 2008, Benjamin Otto wrote:
>> 
>>> Hi,
>>> 
>>> When I create an environment object with new.env() and populate it with
>>> values then how can I save it into an .RData file properly, so it can be
>>> loaded later on in a new session?
>>> 
>>> Saving an environment object with save() or save.image() results in an 
>>> error
>>> message when loading again:
>>> 
>>> Error: protect(): protection stack overflow
>> 
>> save/load works fine (and is used in many places):
>> 
>>> e<-new.env()
>>> assign("e", e, envir = e)
>>> assign("x", 2, envir = e)
>>> save(e, file = "test.Rda")
>>> rm(e)
>>> load("test.Rda")
>>> e
>> 
>> 
>> There may be something about the values you are using that is causing
>> problems, but there is no way to tell without a reproducible example.
>> 
>> luke
>> 
>>> 
>>> Regards,
>>> 
>>> benjamin
>>> 
>>> ==
>>> Benjamin Otto
>>> University Hospital Hamburg-Eppendorf
>>> Institute For Clinical Chemistry
>>> Martinistr. 52
>>> D-20246 Hamburg
>>> 
>>> Tel.: +49 40 42803 1908
>>> Fax.: +49 40 42803 4971
>>> ==
>>> 
>>> 
>>> 
>>> 
>> 
>> -- 
>> Luke Tierney
>> Chair, Statistics and Actuarial Science
>> Ralph E. Wareham Professor of Mathematical Sciences
>> University of Iowa  Phone: 319-335-3386
>> Department of Statistics andFax:   319-335-3017
>>   Actuarial Science
>> 241 Schaeffer Hall  email:  [EMAIL PROTECTED]
>> Iowa City, IA 52242 WWW:  http://www.stat.uiowa.edu
>> 
>> __
>> R-help@r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide 
>> http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>> 
>
>

-- 
Luke Tierney
Chair, Statistics and Actuarial Science
Ralph E. Wareham Professor of Mathematical Sciences
University of Iowa  Phone: 319-335-3386
Department of Statistics andFax:   319-335-3017
Actuarial Science
241 Schaeffer Hall  email:  [EMAIL PROTECTED]
Iowa City, IA 52242 WWW:  http://www.stat.uiowa.edu



-- 
Pflichtangaben gemäß Gesetz über elektronische Handelsregister und 
Genossenschaftsregister sowie das Unternehmensregister (EHUG):

Universitätsklinikum Hamburg-Eppendorf
Körperschaft des öffentlichen Rechts
Gerichtsstand: Hamburg

Vorstandsmitglieder:
Prof. Dr. Jörg F. Debatin (Vorsitzender)
Dr. Alexander Kirstein
Ricarda Klein
Prof. Dr. Dr. Uwe Koch-Gromus
__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] Generic attribute in multinom function

2009-08-11 Thread Benjamin Geckle


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] Generic attribute in multinom function

2009-08-11 Thread Benjamin Geckle


Hello,
I try estimate a MNL model but I need one coefficient as generic. How 
can I get the same coefficient for all alternatives?




As example

Call:

multinom(formula = choice ~ time + costs, data = DATA)



Coefficients:

  (Intercept)time costs

14.79-0.28  -3.97

2   -3.30 0.053 0.59

32.10-0.048-0.65



But I would like to get the costs parameters as generic attribute, so 
that I have the following result:




Coefficients:

  (Intercept)time   costs

14.79-0.3-1.8

2   -3.36 0.4-1.8

32.10-0.5-1.8

Thanks.

Best Regards

Benjamin

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] How can I do a generic specification in multiple logistic regression

2009-08-14 Thread Benjamin Geckle

I use the function multinom to estimate a multiple logistic regression. 
but I need one coefficient as generic. How can I do this? Or is it 
possible to do this with an another function?




As example  this is what I get:


Call:

multinom(formula = choice ~ time + costs, data = DATA)



Coefficients:

  (Intercept)time costs

14.79-0.28  -3.97

2   -3.30 0.053 0.59

32.10-0.048-0.65



But I would like to get the costs parameters as generic attribute, so 
that I have the following result:


Coefficients:

  (Intercept)time   costs

14.79-0.3-1.8

2   -3.36 0.4-1.8

32.10-0.5-1.8




Thank you for you help.

Best Regards

Benjamin

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] applying summary() to an object created with ols()

2009-08-21 Thread Benjamin Volland


Hello R-list,

I am trying to calculate a ridge regression using first the *lm.ridge()* 
function from the MASS package and then applying the obtained Hoerl 
Kennard Baldwin (HKB) estimator as a penalty scalar to the *ols()* 
function provided by Frank Harrell in his Design package.

It looks like this:
> rrk1<-lm.ridge(lnbcpc ~ lntex + lnbeerp + lnwinep + lntemp + pop, 
subset(aa, Jahr>=1957 & Jahr<=1966))
> f <- ols(lnbcpc ~ lntex + lnbeerp + lnwinep + lntemp + pop, 
subset(aa, Jahr>=1957 & Jahr<=1966), penalty = rrk$kHKB)

> f

which returns
>Linear Regression Model
>
>ols(formula = lnbcpc ~ lntex + lnbeerp + lnwinep + lntemp + pop,
>data = subset(aa, Jahr >= 1957 & Jahr <= 1966), penalty = rrk$kHKB)
>
> n Model L.R.   d.f. R2  Sigma
>10  38.59  8.814 0.98390.02796
>
>Residuals:
>1 2 3 4 5 6 
7 8 910
>-0.014653 -0.002787  0.017515 -0.018145 -0.008757 -0.008035  0.006066  
0.045826 -0.001244 -0.015786

>
>Coefficients:
>Value Std. Error   t Pr(>|t|)
>Intercept  1.5240 3.3034  0.4613   0.8496
>lntex  0.3722 0.2071  1.7975   0.6801
>lnbeerp0.9085 0.5760  1.5771   0.6964
>lnwinep   -0.1458 0.1874 -0.7781   0.7863
>lntemp-0.0772 0.1344 -0.5743   0.8240
>pop   -4.1889 1.9286 -2.1720   0.6571
>
>Adjusted R-Squared: 0.2227

All in all beautiful (leaving aside that the results suck). The problem 
starts when I want to write the obtained coefficients (incl. Std. 
Errors, t-, and p-values) into a matrix.
Via the *f$coef* command I can only access the betas (1st column) and 
using the *summary(f)* function I get

> summary(f)
>Fehler in summary.Design(f) :
>adjustment values not defined here or with datadist for lntex lnbeerp 
lnwinep lntemp pop


Does anyone know how I can set the *datadist()* and the *options()* such 
that I will get access to all coefficients?


I tried:
> options(datadist=NULL)
> f <- ols(lnbcpc ~ lntex + lnbeerp + lnwinep + lntemp + pop, 
subset(aa, Jahr>=1957 & Jahr<=1966), penalty = rrk$kHKB)

> d <- datadist(f)
but got:
> Fehler in sort.list(unique(y)) : 'x' must be atomic for 'sort.list'
> Have you called 'sort' on a list?

In the R documentation on ?ols() it states concerning the values 
returned: "the same objects returned from |lm| (/unless |penalty| or 
|penalty.matrix| are given/ - then an abbreviated list is returned since 
|lm.pfit| is used as a fitter)..." Unfortunately no information seems to 
be available on lm.pfit.
Does anyone know why the using that function leads to an abbreviated 
return list? Is there a trick to circumvent that?


Thanks
Benjamin Volland

P.S. Currently using R-version 2.7.1 on a Windows PC.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] Problem with specifying variance-covariance matrix for random effects (nlme package)

2010-02-14 Thread Benjamin Cheah

Hi all,

I've been struggling with trying to specify a diagnoal matrix for linear mixed 
effects model. I think I've got nearly everything correct, except the following 
message appears:

In lme.formula(fixed = fwave ~ sex + sexXbulbar + visit + age +  :
  Fewer observations than random effects in all level 1 groups

Not sure if i've provided enough details, but I'm basically trying to perform a 
mixed effects model analysis, controlling for several covariates. Incorporating 
random effects for the intercept and slope.

What does the second line, 'Fewer observations than random effects in all level 
1 groups' mean?

Many thanks in advance,

Ben Cheah



  
__
[[elided Yahoo spam]]

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] all possible subsets, with AIC

2010-02-15 Thread Nutter, Benjamin

I've dabbled in this a little bit, and the result of my dabbling is
attached.  I'll give you fair warning, however.  The attached function
can take a long time to run, and if your model has 10 or more
predictors, you may be retired before it finishes running.

In any case, it will models for all possible subsets of predictors in
lm, glm, or coxph.  If requested, it will also plot the R-squared,
Adjusted R-squared, AIC, or BIC of those models (when the values are
applicable to the model).  It might give you a good starting point.

Benjamin

-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org]
On Behalf Of kcleary2
Sent: Friday, February 12, 2010 3:19 PM
To: r-help@r-project.org
Subject: [R] all possible subsets, with AIC



Hello, 

I have a question about doing ALL possible subsets regression with a
general linear model. My goal is to produce cumulative Akaike weights
for each of 7 predictor variables-to obtain this I need R to: 

1.
Show me ALL possible subsets, not just the best possible subsets 

2. Give
me an AIC value for each model (instead of a BIC value). 

I have tried to
do this in library(RcmdrPlugin.HH), and using the "leaps" code below.
With the leaps code my problem is that my response is not a vector, it's
a single value (density of a species) 

ANy help would be greatly
appreciated. Thanks a lot,
Kate 

ALL-SUBSETS
REGRESSIOM

DESCRIPTION

leaps() performs an exhaustive search for the best subsets of the
variables in x for predicting y in linear regression, using an efficient
branch-and-bound algorithm. It is a compatibility wrapper for regsubsets
[1] does the same thing better. 

Since the algorithm returns a
best model of each size, the results do not depend on a penalty model
for model size: it doesn't make any difference whether you want to use
AIC, BIC, CIC, DIC, ... 

USAGE

leaps(x=, y=, wt=rep(1, NROW(x)), int=TRUE, method=c("Cp", "adjr2",
"r2"), nbest=10, names=NULL, df=NROW(x),
strictly.compatible=TRUE)

ARGUMENTS

x
A matrix of predictors

y

A response vector

wt
Optional weight vector

int
Add an
intercept to the model

method
Calculate Cp, adjusted R-squared or
R-squared

nbest
Number of subsets of each size to report

names

vector of names for columns of x

df
Total degrees of freedom to
use instead of nrow(x) in calculating Cp and adjusted R-squared


strictly.compatible
Implement misfeatures of leaps() in S

--
Kate
Cleary
MS Candidate
Department of Fish, Wildlife, and Conservation Biology Colorado State
University Fort Collins, CO
970-491-3535



Links:
--
[1]
https://webmail.warnercnr.colostate.edu/leaps/help/regsubsets

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


===

P Please consider the environment before printing this e-mail

Cleveland Clinic is ranked one of the top hospitals
in America by U.S.News & World Report (2009).  
Visit us online at http://www.clevelandclinic.org for
a complete listing of our services, staff and
locations.


Confidentiality Note:  This message is intended for use
only by the individual or entity to which it is addressed
and may contain information that is privileged,
confidential, and exempt from disclosure under applicable
law.  If the reader of this message is not the intended
recipient or the employee or agent responsible for
delivering the message to the intended recipient, you are
hereby notified that any dissemination, distribution or
copying of this communication is strictly prohibited.  If
you have received this communication in error,  please
contact the sender immediately and destroy the material in
its entirety, whether electronic or hard copy.  Thank you.
__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] First. Last. Data row selection

2010-02-23 Thread Nutter, Benjamin

I've attached some functions I've written based on previous questions
that have been posted here.  Unfortunately, I was too lazy to give
credit to previous commenters in my Rd file, and for that I hope they'll
forgive me.  

In any case, please be assured that the functions I've attached are in
no way my original work.

Benjamin

-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org]
On Behalf Of wookie1976
Sent: Tuesday, February 23, 2010 12:40 PM
To: r-help@r-project.org
Subject: [R] First. Last. Data row selection


I am in the process of switching from SAS over to R.  I am working on
very large CSV datasets that contain vehicle information.  As I am
processing the data, I need to select the first (or sometimes the
second) record (by date) for any records that have the same license
plate number.  In SAS, there is a function called 'first.' that can be
used on sorted datasets to pull out those first entries for each
occurrence of a particular variable  (in this case the variable is
'license plate') found in the data.  I have spent some time looking
around and cannot seem to find an equivalent function in R. 
Can anyone recommend an efficient technique that would pull this off?  I
assume the database must first be sorted by vehicle plate and date, and
then apply the filter or function.  Any help would be greatly
appreciated.  

Thanks, Joe
--
View this message in context:
http://n4.nabble.com/First-Last-Data-row-selection-tp1566260p1566260.htm
l
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


===

P Please consider the environment before printing this e-mail

Cleveland Clinic is ranked one of the top hospitals
in America by U.S.News & World Report (2009).  
Visit us online at http://www.clevelandclinic.org for
a complete listing of our services, staff and
locations.


Confidentiality Note:  This message is intended for use
only by the individual or entity to which it is addressed
and may contain information that is privileged,
confidential, and exempt from disclosure under applicable
law.  If the reader of this message is not the intended
recipient or the employee or agent responsible for
delivering the message to the intended recipient, you are
hereby notified that any dissemination, distribution or
copying of this communication is strictly prohibited.  If
you have received this communication in error,  please
contact the sender immediately and destroy the material in
its entirety, whether electronic or hard copy.  Thank you.
__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] Circles around letters or numbers in plot title

2010-02-23 Thread Nutter, Benjamin

Has anyone ever tried putting a circle around a letter or a number in a
plot title?

For instance, if I have a plot title "Scatterplot for Subject 24", I
want to put a circle around 24 to distinguish that plot from the other
30 I've generated.  Any tips or ideas beyond plotting a circle in the
margin?

Benjamin 
 


  Benjamin Nutter |  Biostatistician  |  Quantitative Health Sciences
  Cleveland Clinic  |  9500 Euclid Ave.   |  Cleveland, OH 44195  |
(216) 445-1365


===

P Please consider the environment before printing this e-mail

Cleveland Clinic is ranked one of the top hospitals
in America by U.S.News & World Report (2009).  
Visit us online at http://www.clevelandclinic.org for
a complete listing of our services, staff and
locations.


Confidentiality Note:  This message is intended for use\...{{dropped:13}}

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] counting the number of ones in a vector

2010-02-26 Thread Nutter, Benjamin

What you did works well.  You could also try the following.

table(x)["1"]



-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org]
On Behalf Of Randall Wrong
Sent: Friday, February 26, 2010 9:41 AM
To: r-help@r-project.org
Subject: [R] counting the number of ones in a vector

Dear R users,

I want to count the number of ones in a vector x.

That's what I did : length( x[x==1] )

Is that a good solution ?
Thank you very much,
Randall

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


===

P Please consider the environment before printing this e-mail

Cleveland Clinic is ranked one of the top hospitals
in America by U.S.News & World Report (2009).  
Visit us online at http://www.clevelandclinic.org for
a complete listing of our services, staff and
locations.


Confidentiality Note:  This message is intended for use\...{{dropped:13}}

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] counting the number of ones in a vector

2010-02-26 Thread Nutter, Benjamin

But if x has any missing values:

> x <- c(1, 1, 1, NA, NA, 2, 1, NA)
> 
> sum( x == 1)
[1] NA
> 
> sum(x==1, na.rm=TRUE)
[1] 4

-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On 
Behalf Of Henrique Dallazuanna
Sent: Friday, February 26, 2010 9:47 AM
To: Randall Wrong
Cc: r-help@r-project.org
Subject: Re: [R] counting the number of ones in a vector

Try:

sum(x == 1)

On Fri, Feb 26, 2010 at 11:40 AM, Randall Wrong  wrote:
> Dear R users,
>
> I want to count the number of ones in a vector x.
>
> That's what I did : length( x[x==1] )
>
> Is that a good solution ?
> Thank you very much,
> Randall
>
>        [[alternative HTML version deleted]]
>
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide 
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

--
Henrique Dallazuanna
Curitiba-Paraná-Brasil
25° 25' 40" S 49° 16' 22" O

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

===

P Please consider the environment before printing this e-mail

Cleveland Clinic is ranked one of the top hospitals
in America by U.S.News & World Report (2009).  
Visit us online at http://www.clevelandclinic.org for
a complete listing of our services, staff and
locations.

Confidentiality Note:  This message is intended for use\...{{dropped:13}}

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Row-wisely converting a data frame into a list

2010-03-02 Thread Nutter, Benjamin

> as.data.frame(t(df)) 

For example

> x <- as.data.frame(t(mtcars))
> typeof(x)
[1] "list"

-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org]
On Behalf Of Sebastian Bauer
Sent: Tuesday, March 02, 2010 8:12 AM
To: r-help@r-project.org
Subject: [R] Row-wisely converting a data frame into a list

Hello,

is there an elegant way, how I can convert each row of a data frame into
distinct elements of a list?

In essence, what I'm looking for is something like

rows.to.lists <- function( df ) {
ll <- NULL
for( i in 1:nrow(df) )
ll <- append( ll, list(df[i,]) )
return (ll)
}

but more done more efficiently (the data frame may contain ten-thousands
of rows). I thought about using apply() but this function always returns
a matrix.

Thanks in advance!

Bye,
Sebastian

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


===

P Please consider the environment before printing this e-mail

Cleveland Clinic is ranked one of the top hospitals
in America by U.S.News & World Report (2009).  
Visit us online at http://www.clevelandclinic.org for
a complete listing of our services, staff and
locations.


Confidentiality Note:  This message is intended for use\...{{dropped:13}}

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Help on getting help from manuals

2010-03-12 Thread Nutter, Benjamin

As has been pointed out, there are tools in R to help find the commands
you are looking for.

As a practical note, I recommend starting with '?' if you think you know
what command you need.

If you're unsure of what command you need, my next step would be
help.search().  Often, the results from here will point you to a command
that you are looking for.

If a handful of tries with help.search() doesn't get you what you need,
then try RSiteSearch().  This will search the mailing list for entries
relevant to your search criteria.  Often times, you'll find that someone
has already asked your question.  In fact, I can only recall a couple of
times where a search of the mailing list has failed to provide a
solution for me.

Aside from reading lots of documentation, the one thing that has helped
me the most in picking up R has been reading Peter Dalgaard's
"Introductory Statistics with R."  It's a very small, basic introduction
to R, but laid a solid enough foundation that I soon found myself able
to produce and refine my own solutions.

In short, the best way to improve our knowledge of R is lots of reading
and a bit of trial and error.

Benjamin 

-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org]
On Behalf Of ManInMoon
Sent: Friday, March 12, 2010 4:41 AM
To: r-help@r-project.org
Subject: [R] Help on getting help from manuals


Hi,

A number of people have suggested "I read the manuals"...

Could someone help me by telling me where the primary start point is
please?

For example, I am interested in writing functions with variable number
of arguments - where should I start to look?

"An introduction to R" only show a brief example - with no pointer to
where to find further data.

I can't do ?xxx from R console in most cases - as I don't know what the
function name is that I am looking for!!!

People have helped me find "substitute" to get some metadata out - BUT
how could I have found that without guidance from nice people in Nabble?

Any help on this very much appreciated.


--
View this message in context:
http://n4.nabble.com/Help-on-getting-help-from-manuals-tp1590323p1590323
.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


===

P Please consider the environment before printing this e-mail

Cleveland Clinic is ranked one of the top hospitals
in America by U.S.News & World Report (2009).  
Visit us online at http://www.clevelandclinic.org for
a complete listing of our services, staff and
locations.


Confidentiality Note:  This message is intended for use\...{{dropped:13}}

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] Retrieving latitude and longitude via Google Maps API

2010-03-16 Thread Nutter, Benjamin

Does anyone have any experience retrieving latitutde and longitude for
an address from the Google Maps API?  

I'd like to have an R script that submits a street address, city, state,
and zip code and returns the coordinates.  So far, I've been submitting
the coordinates from another program, then loading the coordinates in R
and merging them back into the data frame I want to use.  It'd be nice
to be able to do it all in one script, but I'm not comprehending the API
thing very well.

I'm using R 2.9.1 on Windows XP.  Any suggestions or pointers?

Benjamin

 
 
  Benjamin Nutter |  Biostatistician  |  Quantitative Health Sciences
  Cleveland Clinic  |  9500 Euclid Ave.   |  Cleveland, OH 44195  |
(216) 445-1365


===

P Please consider the environment before printing this e-mail

Cleveland Clinic is ranked one of the top hospitals
in America by U.S.News & World Report (2009).  
Visit us online at http://www.clevelandclinic.org for
a complete listing of our services, staff and
locations.


Confidentiality Note:  This message is intended for use\...{{dropped:13}}

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] mboost: Interpreting coefficients from glmboost if center=TRUE

2010-03-19 Thread Benjamin Hofner

Sorry for the tardy reply but I just found your posting incidentally 
today. To make long things short:

You are right about the centering. We forgot to correct the intercept if 
center = TRUE. We lately found the problem ourself and fixed it in the 
current version (mboost 2.0-3). However the problem only occurred if you 
extracted the coefficients. As the intercept is rarely interpretable we 
didn't have a closer look and thus it took some time until we found this 
bug. The model predictions however always took care of the centering and 
thus were not affected (as you already pointed out).

As you realized centering is of hight importance if you use glmboost as 
it reduces the number of boosting iterations needed to estimate the 
model and furthermore often improves the estimates. Centering is also 
important if you use gamboost() and specify linear base-learners without 
intercept (e.g. bols(x, intercept=FALSE)). However, in this case you 
have to center the covariates yourself and take care of the intercept 
correction afterwards.

You wrote you weren't aware of a mailing list for mboost but you could 
write to the maintainer (see 
http://cran.at.r-project.org/package=mboost) for possible help and/or to 
report bugs.

HTH
 Benjamin

Kyle Werner wrote:

Thanks for your reply. In fact, I do use the predict method for model
assessment, and it shows that centering leads to a substantial
improvement using even the bluntest of assessments of 'goodness'
(i.e., binary categorization accuracy). So I agree that the package
authors must have internal tools to reverse the effects of centering
the variables, at least within the predict method. But it seems to me
that the coefficients that I get out should be related to the values
that I input, not to the centered values. In other words, centering
seems like it should be done "invisibly;" unless I center the
variables myself, I would expect the coefficients to be applicable to
the original data.

I extract the coefficients returned by the model and store them in a
database which is web accessible. I reconstruct models periodically,
and track various statistics associated with these models in the
database. This is why I highly value the fact that mboost has
glmboost, which can return linearly interpretable coefficients. It is
also why I do not directly call upon R every time I want to query a
model. (As an aside, if I were to use R directly, I might consider the
gamboost or blackboost methods, which do not return scalar
coefficients that are readily extractable.)

On Sun, Feb 7, 2010 at 6:31 PM, David Winsemius comcast.net> wrote:

 >
 > On Feb 7, 2010, at 5:03 PM, Kyle Werner wrote:
 >
 >> I'm running R 2.10.1 with mboost 2.0 in order to build predictive
 >> models . I am performing prediction on a binomial outcome, using a
 >> linear function (glmboost). However, I am running into some confusion
 >> regarding centering. (I am not aware of an mboost-specific mailing
 >> list, so if the main R list is not the right place for this topic,
 >> please let me know.)
 >>
 >> The boost_control() function allows for the choice between center=TRUE
 >> and center=FALSE. If I select center=FALSE, I am able to interpret the
 >> coefficients just like those from standard logistic regression.
 >> However, if I select center=TRUE, this is no longer the case. In
 >> theory and in practice with my data, centering improves the
 >> predictions made by the model, so this is an issue worth pursuing for
 >> me.
 >>
 >> Below is output from running the exact same data in exactly the same
 >> way, only differing by whether the "center" bit is flipped or not:
 >>
 >> Output with center=TRUE:
 >> [(Intercept)] => -0.04543632
 >> [painscore] => 0.007553608
 >> [Offset] => -0.546520621809327
 >>
 >> Output with center=FALSE:
 >> [(Intercept)] => -0.989742
 >> [painscore] => 0.001342585
 >> [Offset] => -0.546520621809327
 >>
 >> The mean of painscore is 741. It seems to me that for center=FALSE,
 >> mboost should modify the intercept by subtracting 741*0.007553608 from
 >> it (thus intercept should = -11.285). If I manually do this, the
 >> output is credible, and in the ballpark of that given by other methods
 >> (e.g., lrm or glm with a Binomial link function). If I don't do this,
 >> then the inverse logistic interpretation of the output is off by
 >> orders of magnitude.
 >>
 >> In the end, with "center=TRUE", and I want to make a prediction based
 >> on the coefficients returned by mboost, the results only make sense if
 >> I manually rescale my independent variables prior to making a
 >> prediction. Is this the desired behavior, or am I doing something
 >&

[R] dump() an object of type raw ?

2009-07-21 Thread Benjamin Tyner


Is there a way to do this? I tried

  x <- writeBin(pi, raw())
  dump("x","x.R")
  source("x.R")

but is seems x.R is not source()-able, as it contains an unexpected symbol.

Thanks
Ben

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] dump() an object of type raw ?

2009-07-21 Thread Benjamin Tyner

Thank you David for taking the time to respond to my question. Perhaps I 
should clarify: the man page says "a 'dump' file can be 'source'd"; have 
I taken the documentation too literally in this case?

David Winsemius wrote:

On Jul 21, 2009, at 7:19 PM, Benjamin Tyner wrote:

Is there a way to do this? I tried

 x <- writeBin(pi, raw())
 dump("x","x.R")
 source("x.R")

but is seems x.R is not source()-able, as it contains an unexpected 
symbol.

Yes, "2d".I get:
> source("/Users/davidwinsemius/x.R")
Error in source("/Users/davidwinsemius/x.R") :
  /Users/davidwinsemius/x.R:2:8: unexpected symbol
1: x <-
2: c(18, 2d
  ^

So there was an good faith effort to source the file but the 
interpreter was not put on notice that it would be getting 
hexadecimal. The file contains:

x <-
c(18, 2d, 44, 54, fb, 21, 09, 40)

and when you try to execute that from the command line you get:

> x <-
+ c(18, 2d, 44, 54, fb, 21, 09, 40)
Error: syntax error

(R variable names cannot start with 2 and R wouldn't assume these are 
hexadecimal numbers.)

David Winsemius, MD
Heritage Laboratories
West Hartford, CT

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] numbers on barplot

2009-07-27 Thread Nutter, Benjamin

The only thing you're missing is the midpoints of the bars.  Since you
specified

> Graph <- barplot(dat$Average)

You can get the midpoints from the Graph object.  So to put the number
on top of each bar you might use something like:

> text(Graph, dat$Average, dat$Average)


-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org]
On Behalf Of Mohsen Jafarikia
Sent: Monday, July 27, 2009 10:02 AM
To: r-h...@stat.math.ethz.ch
Subject: [R] numbers on barplot

Hello all,
I have this simple barplot code:

ifn <- "id.dat"
dat <- read.table(ifn)
ofn <- "id.png"

bitmap(ofn, type = "png256", width = 30, height = 30, pointsize = 30, bg
=
"white",res=50)
par(mar=c(5, 5, 3, 2),lwd=5)
par(cex.main=1.6,cex.lab=1.6,cex.axis=1.6)

names(dat)<-c("NumberOfPeople","Average")
Graph<-barplot(dat$Average)
dev.off()

and here is the data (id.dat):

150.08
 60.09
 70.37

I want to write down the ""NumberOfPeople" on top of each of the bars.
Can
anybody help me on this?

Thanks,
Mohsen

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


===

P Please consider the environment before printing this e-mail

Cleveland Clinic is ranked one of the top hospitals
in America by U.S. News & World Report (2008).  
Visit us online at http://www.clevelandclinic.org for
a complete listing of our services, staff and
locations.


Confidentiality Note:  This message is intended for use\...{{dropped:13}}

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Drawing lines in margins

2009-07-29 Thread Nutter, Benjamin

Look at the xpd option in ?par.  If you set par(xpd=TRUE) you should be
able to add a segment for what you want.  But please let me know if
someone gives you a better way to do this.

-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org]
On Behalf Of Alan Cohen
Sent: Wednesday, July 29, 2009 10:22 AM
To: r-help@r-project.org
Subject: [R] Drawing lines in margins

Hi all,

Quick question: What function can I use to draw a line in the margin of
a plot?  segments() and lines() both stop at the margin.

In case the answer depends on exactly what I'm trying to do, see below.
I'm using R v. 2.8.1 on Windows XP.

Cheers,
Alan

I'm trying to make a horizontal barplot with a column of numbers on the
right side.  I'd like to put a line between the column header and the
numbers.  The following reconstructs the idea - just copy and paste it
in:
aa <- 1:10
plot.mtx2<-cbind(aa,aa+1)
colnames(plot.mtx2)<-c("Male","Female")
lci2<- cbind(aa-1,aa)
uci2<- cbind(aa+1,aa+2)
par(mar=c(5,6,4,5))
cols <- c("grey79","grey41")
bplot2<-barplot(t(plot.mtx2),beside=TRUE,xlab="Malaria death rates per
100,000",
names.arg=paste("state",aa,sep=""),legend.text=F,las=1,xlim=c(0,13),
horiz=T, col=cols,
main="Malaria death rates by state and sex")
legend(8,6,legend=c("Female","Male"),fill=cols[order(2:1)])
segments(y0=bplot2, y1=bplot2, x0=t(lci2), x1=t(uci2))
mtext(10*(aa+1),side=4,line=4,at=seq(3,3*length(aa),by=3)-0.35,padj=0.5,
adj=1,las=1,cex=0.85)
mtext(10*aa,side=4,line=4,at=seq(2,3*length(aa)-1,by=3)-0.65,padj=0.5,ad
j=1,las=1,cex=0.85)
mtext("Estimated",side=4,line=3,at=3*length(aa)+2.75,padj=0.5,adj=0.5,la
s=1,cex=0.85)
mtext("Deaths",side=4,line=3,at=3*length(aa)+1.25,padj=0.5,adj=0.5,las=1
,cex=0.85)

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

===

P Please consider the environment before printing this e-mail

Cleveland Clinic is ranked one of the top hospitals
in America by U.S. News & World Report (2008).  
Visit us online at http://www.clevelandclinic.org for
a complete listing of our services, staff and
locations.

Confidentiality Note:  This message is intended for use\...{{dropped:13}}

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Converting date format

2010-03-23 Thread Nutter, Benjamin

If x is your vector of character date variables:

orig.date <- as.Date(x, format=c("%m/%d/%Y"))
new.date <- format(x, format=c("%m/%d/%y"))

-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org]
On Behalf Of Hosack, Michael
Sent: Tuesday, March 23, 2010 10:11 AM
To: R-help@r-project.org
Subject: [R] Converting date format

R community:

Hello, I would to like to convert a character date variable from
%m/%d/%Y to %m/%d/%y. Any advice would be greatly appreciated. I have
tried functions for changing the formatting and removing the unnecessary
digits without success.

Mike

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

===

P Please consider the environment before printing this e-mail

Cleveland Clinic is ranked one of the top hospitals
in America by U.S.News & World Report (2009).  
Visit us online at http://www.clevelandclinic.org for
a complete listing of our services, staff and
locations.

Confidentiality Note:  This message is intended for use\...{{dropped:13}}

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] (no subject)

2010-04-21 Thread Benjamin Stewart


I've got a problem with the sparseby command (reshape
library), and I have reached the peak of my R knowledge (it isn't really that
high).

 

I have a small data frame of 23 rows and 15 columns, here
is a subset, the first four columns are factors and the rest are numeric (only
one, line54 is provided).

 

   bearID YEAR
Season SEX  line54

5    1900    8 
3   0  16.3923519

11   2270    5 
1   0 233.7414014

12   2271    5 
1   0 290.8207652

13   2271    5 
2   0 244.7820844

15   2291    5 
1   0   0.000

16   2291    5 
2   0  14.5037795

17   2291    6 
1   0   0.000

18   2293    5 
2   0 144.7440752

19   2293    5 
3   0   0.000

20   2293    6 
1   0  16.0592270

21   2293    6 
2   0  30.1383426

28   2298    5 
1   0   0.9741067

29   2298    5 
2   0   9.6641018

30   2298    6 
2   0   8.6533828

31   2309    5 
2   0  85.9781303

32   2325    6 
1   0 110.8892153

35   2331    6 
1   0  26.7335562

44   2390    7 
2   0   7.1690620

45   2390    8 
2   0  44.1109897

46   2390    8 
3   0 503.9074898

47   2390    9 
2   0   8.4393660

54   2416    7 
3   0  48.6910907

58   2418    8      2  
0   5.7951139

 

Sparseby works fine when I try to calculate mean

 

>sparseby(mF[1:5], mF$Season, mean)

 

  mF$Season bearID
YEAR Season SEX    line54

1 1 NA  
NA NA   0 
84.90228

2 2 NA  
NA NA   0 
54.90713

3 3     NA  
NA NA   0 142.24773

 

But it goes nuts when looking for max or min

 

> sparseby(mF[5:6], mF$Season, max)

  mF$Season
structure(c(2169.49621795108, 1885.22677689026, 2492.17544685464

1 1
2169.496

2 2
1885.227

3 3
2492.175

 

Any ideas? All I want is to calculate create three
data.frames, mean, min and max.

 

Thanks,

 

Ben Stewart

  

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] RExcel

2009-10-06 Thread Benjamin Ward

Hello-

I am a Graduate Assistant for an instructor who has written programs for
statistics calculations such as binomial distributions and regressions.

The programs had worked with no problem in Excel 2003. Now we are trying to
use it with Excel 2007, and we are having some trouble.

I have downloaded RandFriends and have ran the binomial distribution
process in 2007 Excel and have received an error that says: "Compile error
in hidden module: UFDBinomial"

However, ther are two demo excel files in the RExcel file called RdemoDens.
When I open the first RDemoDens excel file, I can run the processes and they
work fine. When I run the second RDemoDens excel file, or a blank excel 2007
file, the processes do not work and I get the error message.

I am trying to figure out what is different about the first RDemoDens excel
file that allows the calculations to process correctly. I am thinking that
something in the macro library in the demo must be different than what is in
a blank excel document. I just cannot seem to figure out what it is.

One thing that I did notice is that there are two different RExcel files in
the RExcel folder. One is labled "RExcel" and one is labed "RExcel 2007."
What are the difference between these two RExcel files? I am not sure if
this has anything to do with the problem, but perhaps the excel demo in
which our calculations work uses the correct RExcel file while a regular
excel 2007 document does not call the correct one.

If anyone has an idea about what might be happening here, or who else I
could ask about the situation, I would appreciate any input.

Thanks,

Ben Ward

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] row selection

2009-10-08 Thread Nutter, Benjamin

> x[seq(1, nrow(x), by=5), ] 

-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org]
On Behalf Of David Winsemius
Sent: Thursday, October 08, 2009 4:19 PM
To: Ashta
Cc: R help
Subject: Re: [R] row selection

On Oct 8, 2009, at 4:14 PM, Ashta wrote:

> Hi all,
> I have a matrix  named x with N by  C
> I want to select every 5 th rrow from matrix x I used the following 
> code
> n<- nrow(x)
>> for(i in 1: n){
> + b <- a[i+5,]
>> b
> }
> Error: subscript out of bounds

What did you expect when "i" in your loop counter became one greater
than the number of rows?

>

David Winsemius, MD
Heritage Laboratories
West Hartford, CT

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

===

P Please consider the environment before printing this e-mail

Cleveland Clinic is ranked one of the top hospitals
in America by U.S. News & World Report (2008).  
Visit us online at http://www.clevelandclinic.org for
a complete listing of our services, staff and
locations.

Confidentiality Note:  This message is intended for use\...{{dropped:13}}

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] row selection

2009-10-09 Thread Nutter, Benjamin

sub3 <- x[-seq(1, nrow(x), by=5), ]

Notice the '-' in front of the seq() command.  This will select
everything but what is in the sequence.

From: Ashta [mailto:sewa...@gmail.com] 
Sent: Friday, October 09, 2009 12:42 PM
To: Nutter, Benjamin
Cc: r-help@r-project.org
Subject: Re: [R] row selection

Hi all,

Thank you for your help. Now I am able to select every 5th row of the
data from the main data set (x) 
using

sub1<- x[seq(1, nrow(x), by=5), ]

So sub1 contains one fith of the data set  X.  I want also create
another data set that will contain the remaining  data set from X (ie.,
four fifth of the data set).

Any help is highly appreciated.

I have a matrix  named x with N by  C
I want to select every 5 th rrow from matrix x I used the following
code
> n<- nrow(x)
>> for(i in 1: n){
> + b <- a[i+5,]
>> b
> }

>sc < x[seq(1, nrow(x), by=5), ]

-Original Message-
From: r-help-boun...@r-project.org
[mailto:r-help-boun...@r-project.org]
On Behalf Of David Winsemius
Sent: Thursday, October 08, 2009 4:19 PM
To: Ashta
Cc: R help
Subject: Re: [R] row selection

On Oct 8, 2009, at 4:14 PM, Ashta wrote:

> Hi all,
> I have a matrix  named x with N by  C
> I want to select every 5 th rrow from matrix x I used the
following
> code
> n<- nrow(x)
>> for(i in 1: n){
> + b <- a[i+5,]
>> b
> }
> Error: subscript out of bounds

What did you expect when "i" in your loop counter became one
greater
than the number of rows?

>

David Winsemius, MD
Heritage Laboratories
West Hartford, CT

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible
code.

===

P Please consider the environment before printing this e-mail

Cleveland Clinic is ranked one of the top hospitals
in America by U.S. News & World Report (2008).
Visit us online at http://www.clevelandclinic.org for
a complete listing of our services, staff and
locations.

Confidentiality Note:  This message is intended for
use\...{{dropped:13}}

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible
code.

===

P Please consider the environment before printing this e-mail

Cleveland Clinic is ranked one of the top hospitals
in America by U.S. News & World Report (2008).  
Visit us online at http://www.clevelandclinic.org for
a complete listing of our services, staff and
locations.

Confidentiality Note:  This message is intended for use\...{{dropped:16}}

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] Functional data analysis - problems with smoothing

2009-10-22 Thread Benjamin Cheah

Hi all,

I'm having major issues with smoothing my functional data

I'm referring to Jim Ramsay's
examples in his books. The following error message keeps appearing,
despite all my data being numeric can anyone kindly offer any suggestions?

isi - vector of argument values - i.e. the independent variable of the curves
rlz - data array
TMSfdPar - functional data parameter.

> TMSfdPar = fdPar(TMSbasis, 4, 0.01)
> TMSfdsmooth = smooth.basis(isi, rlz, TMSfdPar)
Error in smooth.basis(isi, rlz, TMSfdPar) : 'argvals' is not numeric.

I don't understand why the error message keeps popping up. I've tried playing 
around with Jim Ramsay's datasets and I think my data is organised in a similar 
manner to his, so can't understand what's going on oh, the frustration!

Thanks in advance,

Ben
(the amateur R data analyst and statistician.)



  
__


[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Functional data analysis - problems with smoothing

2009-10-23 Thread Benjamin Cheah

Hi Peter and Spencer (and everyone else out there),

Thank you for your prompt reply to my post re. FDA - FYI - i'm very new to R 
and FDA. Currently half way through a biostats masters degree in Sydney, 
Australia! Absolutely loving it and I hope to include R computing on my future 
CV!

Just to clarify things:

When I entered what you suggested, this is what I obtained with my own 
dataset
> str(isi)
 num [1:14] 1 1.5 2 2.5 3 3.5 4 5 7 10 ...

Similarly, for Jim Ramsay's dataset (chapter five of Use R book):
> str(age)
 num [1:31] 1 1.25 1.5 1.75 2 3 4 5 6 7 ...

Doesn't this mean that my data object, 'isi' is numeric?

Also, I was looking through Jim Ramsay's datasets - I'm fairly sure that my 
vector, 'isi' was similarly organised to his vector, 'age' in chapter 5 of his 
new Use R FDA book, so I'm fairly certain that the data is numeric but it 
obviously isn't

Hope this email clarified things - apologies if it hasn't - still very new to 
the terminology and overall feel of R.

The R user network is terrific! Thank you for caring!

Ben

From: Peter Ehlers 

Cc: R-help@r-project.org
Sent: Fri, 23 October, 2009 5:07:15 PM
Subject: Re: [R] Functional data analysis - problems with smoothing

The error message is pretty clear: regardless of what
*you* think, R says that 'isi' is not numeric.

Are you sure that 'isi' is not a *factor* object?
I'm willing to bet that it is.

Use str() to check your data.

-Peter Ehlers

Benjamin Cheah wrote:
> Hi all,
> 
> I'm having major issues with smoothing my functional data
> 
> I'm referring to Jim Ramsay's
> examples in his books. The following error message keeps appearing,
> despite all my data being numeric can anyone kindly offer any suggestions?
> 
> isi - vector of argument values - i.e. the independent variable of the curves
> rlz - data array
> TMSfdPar - functional data parameter.
> 
>> TMSfdPar = fdPar(TMSbasis, 4, 0.01)
>> TMSfdsmooth = smooth.basis(isi, rlz, TMSfdPar)
> Error in smooth.basis(isi, rlz, TMSfdPar) : 'argvals' is not numeric.
> 
> I don't understand why the error message keeps popping up. I've tried playing 
> around with Jim Ramsay's datasets and I think my data is organised in a 
> similar manner to his, so can't understand what's going on oh, the 
> frustration!
> 
> Thanks in advance,
> 
> Ben
> (the amateur R data analyst and statistician.)
> 
> 
> 
>   
> __
> 
> 
> [[alternative HTML version deleted]]
> 
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
> 
> 

__

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] extracting the last row of each group in a data frame

2009-11-17 Thread Nutter, Benjamin

I usually use the following function:

last.record <- function(data, id, ..., na.last=TRUE, decreasing=FALSE){

  #*** Make vector of variables to sort by
  v <- c(id, unlist(list(...)))

  #*** Sort Data Frame
  data <- data[do.call(order,
c(data[,v, drop=FALSE], na.last=na.last,
decreasing=decreasing)),]

  #*** Extract last record for each id
  data[!duplicated(data[,id], fromLast=TRUE),]
}


DataData Frame from which the record is to be extracted
Id  ID variable from which the record is to be extracted.  The
data frame is automatically
  sorted by this variable.  May be either a character string
or an integer.
... Names of variables (or indices) in additon to id by which
data should be sorted.
na.last Argument passed to order().  Determines if missing values
are placed at the end
  of the sorting.
Decreasing  Argument passed to order().  Determines if data frame is
sorted in descending
  order.



So, in your example

> df <- data.frame(Name = c("A", "A", "A", "B", "B", "C", "D"),
   Value = c(1, 2, 3, 4, 8, 2, 3))


> last.record(df, "Name", "Value")

> last.record(df, 1, 2)

-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org]
On Behalf Of Hao Cen
Sent: Monday, November 16, 2009 2:43 PM
To: r-help@r-project.org
Subject: [R] extracting the last row of each group in a data frame

Hi,

I would like to extract the last row of each group in a data frame.

The data frame is as follows

Name Value
A 1
A 2
A 3
B 4
B 8
C 2
D 3

I would like to get a data frame as
Name Value
A 3
B 8
C 2
D 3

Thank you for your suggestions in advance

Jeff

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


===

P Please consider the environment before printing this e-mail

Cleveland Clinic is ranked one of the top hospitals
in America by U.S.News & World Report (2009).  
Visit us online at http://www.clevelandclinic.org for
a complete listing of our services, staff and
locations.


Confidentiality Note:  This message is intended for use\...{{dropped:13}}

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] Adding and Multiplying two Unevaluated Expressions

2009-12-01 Thread Benjamin Müller

HI,

As I'm trying to compute Taylor series, I'm having problems in adding and 
multiplying unevaluated expressions. I searched for a solution but found none.

my Taylor function works fine for evaluating functions as you can see here:


rTaylorVal=function(exp,x0,dx,n) {

ls=list(x=x0)

newexp=eval(exp,ls)

exp0=exp

for (i in 1:n){
exp0=D(exp0,"x")
newexp=newexp+eval(exp0,ls)/factorial(i)*dx^i
}

return(newexp)

}

Where exp is an expression like exp=expression(x^2*sin(x)), x0 is the 
startvalue, dx the difference between startvalue and searched value and n is 
the length of the series.

So I tried to remove dx as a value, to get a Taylor series expression, but it 
doesn't work as simple multiplication (*) and accumulation (+) is not good for 
expressions.

That's my point so far, now my question:

Is it actually possible to add and/or multiply expressions, and how?

Thank you so far.

Benjamin Müller
Geographer (B.Sc.)


-- 
GRATIS für alle GMX-Mitglieder: Die maxdome Movie-FLAT!

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] [Fwd: Re: Adding and Multiplying two Unevaluated Expressions]

2009-12-02 Thread Benjamin Müller




 Original-Nachricht 
Betreff:Re: [R] Adding and Multiplying two Unevaluated Expressions
Datum:  Tue, 01 Dec 2009 23:49:39 +0100
Von:Benjamin Müller 
An: Rolf Turner 
Referenzen: 	<20091201144125.316...@gmx.net> 
<8e40e49f-e8fc-4fbd-8cc5-93789ffb0...@auckland.ac.nz>




This works fine for your example, but doesn't work as simple if there's 
more than these expressions.


In my example this would be in line 11:

newexp=as.expression(substitute(a+con1/con2*b^con3,list(a=newexp[[1]],b=expression(dx)[[1]],con1=eval(exp0,ls),con2=factorial(i),con3=1*i)))

this works just fine and you can still evaluate this. thank you very much!

If there's an easier way, let me know.
Greets,
Ben Müller


Rolf Turner schrieb:


On 2/12/2009, at 3:41 AM, Benjamin Müller wrote:


HI,

As I'm trying to compute Taylor series, I'm having problems in adding 
and multiplying unevaluated expressions. I searched for a solution 
but found none.


my Taylor function works fine for evaluating functions as you can see 
here:



rTaylorVal=function(exp,x0,dx,n) {

ls=list(x=x0)

newexp=eval(exp,ls)

exp0=exp

for (i in 1:n){
exp0=D(exp0,"x")
newexp=newexp+eval(exp0,ls)/factorial(i)*dx^i
}

return(newexp)

}

Where exp is an expression like exp=expression(x^2*sin(x)), x0 is the 
startvalue, dx the difference between startvalue and searched value 
and n is the length of the series.


So I tried to remove dx as a value, to get a Taylor series 
expression, but it doesn't work as simple multiplication (*) and 
accumulation (+) is not good for expressions.


That's my point so far, now my question:

Is it actually possible to add and/or multiply expressions, and how?


This may well be a case of the blind leading the partially sighted,
but for what it's worth my answer is ``Yes, but it's a kludge.''
One needs to use substitute, it seems to me.

E.g.:

e1 <- expression((x+y)^2)
e2 <- expression(1/(x^2 + y^2))
e3 <- as.expression(substitute(a+b,list(a=e1[[1]],b=e2[[1]])))
e3
D(e3,"x")
D(e3,"y")

Older and wiser heads may provide you with better counsel.

cheers,

Rolf Turner




##
Attention:This e-mail message is privileged and confidential. If you 
are not theintended recipient please delete the message and notify the 
sender.Any views or opinions presented are solely those of the author.


This e-mail has been scanned and cleared by 
MailMarshalwww.marshalsoftware.com

##



__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

1 2 3 4 >

1 - 100 of 396 matches

Mail list logo