from:"Andrew"

[R] Smallest Space Analysis (SSA) in R

2018-09-19 Thread Andrew

Hi

As part of my forensics psych course, we have been introduced to 
Guttman's smallest space analysis (SSA). I want to explore this approach 
using R, but despite finding some queries on the web about this same 
thing, have yet to find any answers. The MASS package doesn't seem to do 
the job, and the only thing I have been able to find is some proprietary 
software HUDAP  (Hebrew University Data Analysis Package) which may/ not 
be compatible with R (or GNU/Linux for that matter).

Does anyone have information on how to do SSA using R?

Many thanks

Andrew


[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Smallest Space Analysis (SSA) in R

2018-09-20 Thread Andrew

Hi Eric

I will need to dig into this a bit deeper, but this looks like it might 
hold some promise. The web link you shared seems familiar - perhaps I 
came across it but not at the site you linked to. I will read the 
sources with interest.

Thank you for bringing them to my attention.

Regards,
Andrew


On 20/09/18 21:28, Eric Berger wrote:
> Hi Andrew,
> I don't have any experience in this area but I was intrigued by your 
> question. Here is what I learned.
>
> 1, A bit of poking around turned up a thread on stats.stackexchange 
> that mentions that "smallest space analysis" (SSA) is a special case 
> of "multidimensional scaling" (MDS).
> See the thread here:
> https://stats.stackexchange.com/questions/82462/guttmans-smallest-space-analysis
>
> 2. The R package SMACOF implements some solutions for MDS. See the 
> documentation "Multidimensional Scaling in R: SMACOF" available at
> https://mran.microsoft.com/snapshot/2018-05-13/web/packages/smacof/vignettes/smacof.pdf
>
> HTH,
> Eric
>
>
>
> On Wed, Sep 19, 2018 at 2:00 PM, Andrew  <mailto:phaedr...@gmail.com>> wrote:
>
> Hi
>
> As part of my forensics psych course, we have been introduced to
> Guttman's smallest space analysis (SSA). I want to explore this
> approach
> using R, but despite finding some queries on the web about this same
> thing, have yet to find any answers. The MASS package doesn't seem
> to do
> the job, and the only thing I have been able to find is some
> proprietary
> software HUDAP  (Hebrew University Data Analysis Package) which
> may/ not
> be compatible with R (or GNU/Linux for that matter).
>
> Does anyone have information on how to do SSA using R?
>
> Many thanks
>
> Andrew
>
>
>         [[alternative HTML version deleted]]
>
> __
> R-help@r-project.org <mailto:R-help@r-project.org> mailing list --
> To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> <https://stat.ethz.ch/mailman/listinfo/r-help>
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> <http://www.R-project.org/posting-guide.html>
> and provide commented, minimal, self-contained, reproducible code.
>
>


[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Smallest Space Analysis (SSA) in R

2018-09-21 Thread Andrew


Hi Martin

I was aware of the 'old' MDS package, thanks.

Hi Eric

I've now read the paper you referenced, thanks. Still didn't see 
anything about SSA however, so will keep looking.


However, if anyone on this list comes across something specific to SSA, 
could you kindly drop me a line to let me know?


Many thanks
Andrew


On 21/09/18 07:46, Martin Maechler wrote:

Eric Berger
 on Thu, 20 Sep 2018 23:28:27 +0300 writes:

 > Hi Andrew,
 > I don't have any experience in this area but I was intrigued by your
 > question. Here is what I learned.

 > 1, A bit of poking around turned up a thread on stats.stackexchange that
 > mentions that "smallest space analysis" (SSA) is a special case of
 > "multidimensional scaling" (MDS).
 > See the thread here:
 > 
https://stats.stackexchange.com/questions/82462/guttmans-smallest-space-analysis

 > 2. The R package SMACOF implements some solutions for MDS. See the
 > documentation "Multidimensional Scaling in R: SMACOF" available at
 > 
https://mran.microsoft.com/snapshot/2018-05-13/web/packages/smacof/vignettes/smacof.pdf

Well, but MDS  is "old" and already available in standard R in
its classical form:


apropos("MDS")

[1] "cmdscale"

?cmdscale  # "Classical Multidimensional Scaling"

then gives the help page for 'cmdscale' which in its 'See Also:'
mentions the

isomds()
sammon()

functions from package MASS which (as formally "Recommended"
pkg) is part of every full R installation.

So, at first, there's no need for any extra package...


 > HTH,
 > Eric



 > On Wed, Sep 19, 2018 at 2:00 PM, Andrew  wrote:

 >> Hi
 >>
 >> As part of my forensics psych course, we have been introduced to
 >> Guttman's smallest space analysis (SSA). I want to explore this approach
 >> using R, but despite finding some queries on the web about this same
 >> thing, have yet to find any answers. The MASS package doesn't seem to do
 >> the job, and the only thing I have been able to find is some proprietary
 >> software HUDAP  (Hebrew University Data Analysis Package) which may/ not
 >> be compatible with R (or GNU/Linux for that matter).
 >>
 >> Does anyone have information on how to do SSA using R?
 >>
 >> Many thanks
 >>
 >> Andrew
 >>
 >>
 >> [[alternative HTML version deleted]]
 >>
 >> __
 >> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
 >> https://stat.ethz.ch/mailman/listinfo/r-help
 >> PLEASE do read the posting guide http://www.R-project.org/
 >> posting-guide.html
 >> and provide commented, minimal, self-contained, reproducible code.
 >>

 > [[alternative HTML version deleted]]

 > __
 > R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
 > https://stat.ethz.ch/mailman/listinfo/r-help
 > PLEASE do read the posting guide 
http://www.R-project.org/posting-guide.html
 > and provide commented, minimal, self-contained, reproducible code.



__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Smallest Space Analysis (SSA) in R

2018-09-22 Thread Andrew


Hi Michael

This looks like it could be really helpful in moving my project forwards 
thank you.


I remember many years ago using (proprietary) software from the 
University of Liverpool which did a nice job of allowing regions to be 
defined, and then for the space to be rotated to obtain visual 
inspection of relative distance from different angles. I appreciate that 
smacof will not do that, but as long as the analysis allows for the 
graph to be plotted and analysed, that's what's important.


Thank you again, and to all of those who responded.

Best wishes
Andrew


On 21/09/18 14:07, Michael Friendly wrote:
Smallest space analysis (SSA) is just the name given to software 
developed by Guttman & Lingoes around the time the various versions

of multidimensional scaling were being developed.  Call it Israeli MDS
or Falafel MDS if you prefer. The reason you encountered it in your
course is presumably that the instructor was trained in that.

There are several variants of MDS-like algorithms for embedding
points representing objects in a space, using data representing
similarities or distances among objects -- metric (cmdscale)
and non-metric (MASS::isoMDS), using only rank order information, and 
a variety of

measures of goodness-of-fit ("stress").  I don't recall the details
of the SSA programs, but that should matter little conceptually.

The smacof package offers the widest array of possibilities.

-Michael


On 9/19/2018 7:00 AM, Andrew wrote:

Hi

As part of my forensics psych course, we have been introduced to
Guttman's smallest space analysis (SSA). I want to explore this approach
using R, but despite finding some queries on the web about this same
thing, have yet to find any answers. The MASS package doesn't seem to do
the job, and the only thing I have been able to find is some proprietary
software HUDAP  (Hebrew University Data Analysis Package) which may/ not
be compatible with R (or GNU/Linux for that matter).

Does anyone have information on how to do SSA using R?

Many thanks

Andrew


[[alternative HTML version deleted]]






__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Smallest Space Analysis (SSA) in R

2018-09-24 Thread Andrew


Ha! Even better - thank you. Plenty here for me to play around with.

Many thanks
Andrew


On 23/09/18 15:22, Michael Friendly wrote:

On 9/22/2018 6:49 AM, Andrew wrote:

Hi Michael

This looks like it could be really helpful in moving my project 
forwards thank you.


I remember many years ago using (proprietary) software from the 
University of Liverpool which did a nice job of allowing regions to 
be defined, and then for the space to be rotated to obtain visual 
inspection of relative distance from different angles. I appreciate 
that smacof will not do that, but as long as the analysis allows for 
the graph to be plotted and analysed, that's what's important.


You need not rely on the plots provided directly by a given package.
Just roll your own using standard plotting libraries.
Here is just the first Google hit on "R MDS 3D plot"

http://omnilogia.blogspot.com/2014/05/basic-2d-3d-multi-dimensional-scaling.html 



which shows a rotating 3D plot, colored by a grouping variable. Here 
is another:


http://whatzcookinlab.blogspot.com/2013/05/spinning-3d-mds-plot.html

The vegan and ade4 packages also have a variety of plots and related 
methods.



-Michael



Thank you again, and to all of those who responded.

Best wishes
Andrew


On 21/09/18 14:07, Michael Friendly wrote:
Smallest space analysis (SSA) is just the name given to software 
developed by Guttman & Lingoes around the time the various versions

of multidimensional scaling were being developed.  Call it Israeli MDS
or Falafel MDS if you prefer. The reason you encountered it in your
course is presumably that the instructor was trained in that.

There are several variants of MDS-like algorithms for embedding
points representing objects in a space, using data representing
similarities or distances among objects -- metric (cmdscale)
and non-metric (MASS::isoMDS), using only rank order information, 
and a variety of

measures of goodness-of-fit ("stress").  I don't recall the details
of the SSA programs, but that should matter little conceptually.

The smacof package offers the widest array of possibilities.

-Michael


On 9/19/2018 7:00 AM, Andrew wrote:

Hi

As part of my forensics psych course, we have been introduced to
Guttman's smallest space analysis (SSA). I want to explore this 
approach

using R, but despite finding some queries on the web about this same
thing, have yet to find any answers. The MASS package doesn't seem 
to do
the job, and the only thing I have been able to find is some 
proprietary
software HUDAP  (Hebrew University Data Analysis Package) which 
may/ not

be compatible with R (or GNU/Linux for that matter).

Does anyone have information on how to do SSA using R?

Many thanks

Andrew


[[alternative HTML version deleted]]











__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] Help with if else branching print out

2018-12-18 Thread Andrew

Hi all

Playing around with some branching using if, if else, and else, I wrote 
the following very basic script:


[code block]

## Take single value input from user:
# Note 'as.integer' to convert character vector to numeric for testing

nums <- (as.numeric(readline("Please enter a number between 1 and 15: ")))

## Truth test routine:
# Set conditional values and test input against conditions and then 
print outcome

if ((nums > 15) || (nums < 1))
{
   print(c(nums, "is not between 1 or 15"))
} else if ((nums > 5) && (nums < 10) || (nums == 12))
{
   print(c(nums, "falls within 6 - 9 inclusively, or is 12"))
}  else
{
   print( c(nums, "is *not* 6 - 9 inclusive, nor is it 12"))
}

[/ code block]


However, it prints out like this:

[output]

Please enter a number between 1 and 15: 17
[1] "17" "is not between 1 or 15"

Please enter a number between 1 and 15: 9
[1] "9"
[2] "falls within 6 - 9 inclusively, or is 12"

Please enter a number between 1 and 15: 13
[1] "13" "is *not* 6 - 9 inclusive, 
nor is it 12"

[/ output]


How do I:

(a) reduce the gap between the reported number (i.e., 17, 9, 13) in each 
of the lines? and

(b) ensure that in the case of the second run using 9 as the input, the 
print is not over two lines?

I will try the switches function soon, which may yield a different 
output format, but wanted to get my head around why this is printing out 
the way it is.

Many thanks for any help.

Andrew


[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] [Solved] Re: Help with if else branching print out

2018-12-18 Thread Andrew


Dear Eric, Ivan & Rui

Thanks all for your suggestions. FWIW, I've opted to use paste0 and 
after a bit of playing around found it formatted the output nicely.


Eric, I wasn't aware of sprintf - seems a handy function to format text 
with, so will try to remember that.

Ivan, I'll dig into 'message' a little more, so that's a good one to know.
Rui, I liked the idea of the output vector and indexing - that will be 
useful for longer/ more complicated branching


Thank you all again.

Best wishes
Andrew


On 18/12/2018 09:25, Ivan Krylov wrote:

On Tue, 18 Dec 2018 08:56:23 +
Andrew  wrote:


How do I:

(a) reduce the gap between the reported number (i.e., 17, 9, 13) in
each of the lines? and

(b) ensure that in the case of the second run using 9 as the input,
the print is not over two lines?

Build a single string from your string parts. ?sprintf has already been
mentioned; another option is ?paste.

To prevent strings from being printed in quotes, use ?message.



__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] Workaround for RODBC asymmetric numeric data treatment

2014-10-03 Thread Andrew

Note: I did raise report the issue below to   r-sig...@r-project.org, but 
didn't see any reply.  
I'm hoping somebody on r-help can help me devise a workaround for a problem I'm 
having 
with RODB:


I use RODBC to read and write a good deal of data to SQL Server and I'd be 
extremely grateful 
if anyone has found a workaround in order to be able to write dataframes to SQL 
Server
using RODBC dynamically created SQL tables and read the data from those tables, 
or indeed any 
arbitrary SQL Server table with "float" datatypes and end up with numeric 
columns instead of "factor" columns
in a dataframe in R.


I have found that when RODBC creates a Microsoft SQL Server data table from a 
dataFrame using sqlSave(append=FALSE), 
RODBC uses the SQL "float" datatype to store R numeric data in a 
dynamically-created table on the server.

However, when RODBC reads any SQL Server "float" datatype from SQL Server via 
sqlQuery it interprets float columns as "factor" data.


I created a standalone sample below to demonstrate the odd behavior of RODBC 
that I hope to overcome:

# Assuming the reader has access to SQL Server the code below is self-contained 
and repeatable

# I believe it demonstrates unexpected and undesirable behavior in RODBC 


library(RODBC)
library(fPortfolio)
library(timeSeries)
head(SWX.RET$SBI)
str(SWX.RET$SBI)
mydata<-as.timeSeries(SWX.RET)
head(mydata)

df2beSavedByRODBC =as.data.frame(mydata)

str(df2beSavedByRODBC)

# shows the numeric data in the dataframe
# 
# data.frame':  1916 obs. of  6 variables:
#   $ SBI : num  -0.002088 -0.000105 -0.00136 0.000419 0 ...
# $ SPI : num  -0.03439 -0.01041 0.01212 0.02246 0.00211 ...
...


# Let's save the dataframe to SQL Server:
  
dbconn<-odbcDriverConnect(connection="Driver={SQL 
Server};server=_YOURSERVERNAMEHER_;database=_YOURDBNAME_;Trusted_Connection=True;");
sqlSave(channel=dbconn,dat=df2beSavedByRODBC,tablename="testTable",rownames=TRUE,append=FALSE,addPK=FALSE,verbose=FALSE)

# The sqlSave above works very well.  The new table is create in the Microsoft 
SQL database and the ddl for the table is:
# 
# [dbo].[testTable](
#   [rownames] [varchar](255) NULL,
# [SBI] [float] NULL,
# [SPI] [float] NULL,
# [SII] [float] NULL,
# [LP25] [float] NULL,
# [LP40] [float] NULL,
# [LP60] [float] NULL
# )


# The numeric values from the dataframe are stored as float (i.e. numeric) in 
SQL server -- good!

## now let's read back the data RODBC stored in SQL server from a SQL table 
RODBC created:
  
  
sqlString = "select * from testTable"


dataFrameFromDB = sqlQuery(dbconn,sqlString,errors=TRUE);

str(dataFrameFromDB)

# 
# 'data.frame':  1916 obs. of  7 variables:
# $ rownames: Factor w/ 1916 levels "2000-01-04","2000-01-05",..: 1 2 3 4 5 6 7 
8 9 10 ...
# $ SBI : Factor w/ 1742 levels "-0.00041080415489958",..: 349 42 161 1418 
828 48 49 1419 1024 135 ...
# $ SPI : Factor w/ 1848 levels "-0.0020169904194276",..: 445 48 970 883 
1187 377 1157 1065 951 1840 ...
...

#*   RODBC wrote numeric data to SQL Server as float, but read the same 
data back as Factor !  


I could use some help to create a robust and flexible workaround for RODBC's 
asymmetric treatment of numeric data.
If there were some way to force RODBC sqlQuery to interpret all SQL Server 
float datatypes as numeric my problem would be solved.
FWIW:  RODBC does interpret the SQL Server "real" datatype as numeric.


Thank you,

Andrew

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] dispaly only character variables from the dataset in R

2009-08-16 Thread andrew

you could try stringsAsFactors in the read.table function


On Aug 17, 4:19 pm, rajclinasia  wrote:
> Hi everyone,
>
> After reading the external file into R it will become a R dataset. Now my
> querry is i want to dispaly only character variables from that dataset.
> please help me in this aspect. it will be very helpful for us.
>
> Thank you in Advance.
> --
> View this message in 
> context:http://www.nabble.com/dispaly-only-character-variables-from-the-datas...
> Sent from the R help mailing list archive at Nabble.com.
>
> __
> r-h...@r-project.org mailing listhttps://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guidehttp://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Easy way to get top 2 items from vector

2009-09-03 Thread andrew

it is speedier to use sort than a combination of [] and order:

N<- 100
x <- runif(N)
> system.time(x[order(x)[c(N-1,N)]])
   user  system elapsed
   1.030.001.03
> system.time(sort(x)[c(N-1,N)])
   user  system elapsed
   0.280.000.28



On Sep 4, 11:17 am, Noah Silverman  wrote:
> Phil,
>
> That's perfect.  (For my application, I've never seen a tie.  While
> possible, the likelihood is almost none.)
>
> Thanks!
>
> --
> Noah
>
> On 9/3/09 4:29 PM, Phil Spector wrote:
>
>
>
> > Noah -
> >    max(x[-which.max(x)]  will give you the second largest value,
> > but it doesn't handle ties.
> >    x[order(x,decreasing=TRUE)[n]]  will give you the nth largest
> > value, with the same caveat regarding ties.  For example,
> > x[order(x,decreasing=TRUE)[1:3]] will give you the three largest
> > values.
>
> >                     - Phil Spector
> >                      Statistical Computing Facility
> >                      Department of Statistics
> >                      UC Berkeley
> >                      spec...@stat.berkeley.edu
>
> > On Thu, 3 Sep 2009, Noah Silverman wrote:
>
> >> Hi,
>
> >> I use the max function often to find the top value from a matrix or
> >> column of a data.frame.
>
> >> Now I'm looking to find the top 2 (or three) values from my data.
>
> >> I know that I could sort the list and then access the first two
> >> items, but that seems like the "long way".  Is there some way to
> >> access "max_2" or similar?
>
> >> Thanks!
>
> >> --
> >> Noah
>
> >> __
> >> r-h...@r-project.org mailing list
> >>https://stat.ethz.ch/mailman/listinfo/r-help
> >> PLEASE do read the posting guide
> >>http://www.R-project.org/posting-guide.html
> >> and provide commented, minimal, self-contained, reproducible code.
>
>         [[alternative HTML version deleted]]
>
> __
> r-h...@r-project.org mailing listhttps://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guidehttp://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] dimensions checking

2010-02-10 Thread andrew

Hello r-help-archive list,

I have an algorithm that is fairly slow and I was wondering whether my
problem if it was an indexing going out of bounds and forcing the
creation of a new vector.  Basically, I want to check the dimensions
of all of my vectors before and after the algorithm runs:

I want something like

mapply(dim, ls())

but this just finds the dimension of the character array of the name
of the R object, not the actual size of the vector that has the same
name.

Any thoughts on how to do this please?

Regards,

Andrew.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] strange strsplit gsub problem 0 is this a bug or a string length limitation?

2009-07-10 Thread Andrew

Thanks Marc.  I really appreciate your help.  I'm going to try my function hack.

I forwarded your suggestion to Yohan at rmetrics.

Warm regards,

Andrew

--- On Fri, 7/10/09, Marc Schwartz  wrote:

From: Marc Schwartz 
Subject: Re: [R] strange strsplit gsub problem 0 is this a bug or a string 
length limitation?
To: "tradenet" 
Cc: r-help@r-project.org
Date: Friday, July 10, 2009, 7:34 AM

On Jul 10, 2009, at 9:07 AM, tradenet wrote:

> 
> Thanks Marc!
> 
> I just found that the ~500 char limitation via an online search for the
> specs for the formula class
> The rmetrics library I'm using get's it's character array of assets by
> parsing a formula passed as an input parameter to the portfolioBacktest
> function.  Can I copy the portfolioBacktest function from the source, call
> it portfolioBackest_hack, add an additional pamater, an array of asset
> names, and have my version use this argument instead of parsing the formula?
> I'm fairly new to R so I don't know if R will find my function and if my
> function will find the other fPortfolio functions that may be referenced by
> the original, non "_hack" version of the function.
> 
> Warm regards,
> 
> Andrew

Hi Andrew,

Happy to help.

In terms of your proposal as a short term fix, it may be possible to do that. 
If you do create a new local function and call it directly, it will be seen 
instead of the package default version of the same function. However, without 
reviewing the code and package in detail, you have to be careful about other 
function dependencies and namespace issues that may be present.  I would go 
ahead and try it to see it it works.

A better and longer term approach would be to have the function author(s) 
modify the way in which they manipulate the formula object. They may wish to 
review this thread from 2001:

  https://stat.ethz.ch/pipermail/r-help/2001-August/014628.html

Back then, the as.character() limit was 60 and was increased by Prof. Ripley to 
500 in response to that discussion.

However, in that thread, Prof. Ripley also proposes a better way of 
manipulating the formula object passed to the function. That approach uses 
deparse() rather than using as.character().

HTH,

Marc Schwartz

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Newbie Question About Setting Plot Axis Limits

2009-09-23 Thread andrew

This is not a complete answer, but try the following in order

plot(c(1,1), ylim=c(0, 15), xlim=c(0, 13), axes = FALSE)
axis(1)
axis(2)
box()

If you want lines, use abline.  Hacking something together can be done
with these commands.


On Sep 24, 9:36 am, Jason Rupert  wrote:
> Using the following: plot(c(1,1), ylim=c(0, 15), xlim=c(0, 13))
>
> However, it produces the 
> following:http://n2.nabble.com/Simple-Plot-Axis-Limits-Question-td3703091.html
>
> This is not what I expected because I would have expected the origin to be 
> (0, 0), but on the plot it looks a little different on the plot.  
>
> The y axis also appears to extend a little beyond 15.  Is there some way to 
> have the plot command respect the requested ylim and xlim settings?  
>
> Thanks again for any insight.
>
> __
> r-h...@r-project.org mailing listhttps://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guidehttp://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] subsetting from a vector or matrix

2009-09-25 Thread andrew

both the following will probably do the trick.

?subset
?"["

Basically on the second one, you want to come down to something that
looks like

x[L]

where x is a matrix/vector, and L is a logical vector that has the
same dimension as x, but is TRUE on the values of x that you want to
select.

for instance

x <- rnorm(10)
L <- x > 3
x[L] will return all values of x that are greater than 3.

or you can just do

x[x>3]

On Sep 25, 9:45 am, "Jim Bouldin"  wrote:
> I realize this should be simple but I'm having trouble subsetting vectors
> and matrices, for example extracting all values meeting a certain
> criterion, from a vector. Cannot seem to figure out the correct syntax and
> help page not very helpful.  Or should I be using some other function than
> subset.  Thanks for any help.
>
> Jim Bouldin
>
> __
> r-h...@r-project.org mailing listhttps://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guidehttp://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] split-apply question

2009-10-01 Thread andrew

?subset is probably what you want:

subset(x, x1 == 'A')

On Oct 2, 1:43 pm, Kavitha Venkatesan 
wrote:
> Hi,
>
> I have a data frame that looks like this:
>
> >x
>
> x1  x2  x3
> A   1    1.5
> B   2    0.9
> B   3    2.7
> C   7    1.8
> D   7    1.3
>
> I want to "group" by the x1 column and in the case of multiple x$x1 values
> (e.g., "B")d, return rows that have the smallest values of x2. In the case
> of rows with only one value of x1 (e.g., "A"), return the row as is. How can
> I do that?  For example, in the above case, the output I want would be:
>
> x1  x2  x3
> A   1    1.5
> B   2    0.9
> C   7    1.8
> D   7    1.3
>
> Thanks!
>
>         [[alternative HTML version deleted]]
>
> __
> r-h...@r-project.org mailing listhttps://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guidehttp://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Dummy variables or factors?

2009-10-20 Thread andrew

The following is *significantly* easier to do than try and add in
dummy variables, although the dummy variable approach is going to give
you exactly the same answer as the factor method, but possibly with a
different baseline.

Basically, you might want to search the lm help and possibly consult a
stats book on information about how the design matrix is constructed
in both cases.

> xF <- factor(1:10)
> N <- 1000
> xFs <- sample(x=xF,N,replace = T)
> yFs <- rnorm(N, mean = as.numeric(xFs))
> lm(yFs ~ xFs)

Call:
lm(formula = yFs ~ xFs)

Coefficients:
(Intercept) xFs2 xFs3 xFs4
xFs5 xFs6 xFs7 xFs8
 0.7845   1.1620   2.1474   3.1391   4.2183
5.2621   6.0814   7.4170
   xFs9xFs10
 8.2193   9.2987

> lm(yFs ~ diag(10)[,1:9][xFs,])

Call:
lm(formula = yFs ~ diag(10)[, 1:9][xFs, ])

Coefficients:
(Intercept)  diag(10)[, 1:9][xFs, ]1  diag(10)[, 1:9]
[xFs, ]2  diag(10)[, 1:9][xFs, ]3
 10.083   -9.299
-8.137   -7.151
diag(10)[, 1:9][xFs, ]4  diag(10)[, 1:9][xFs, ]5  diag(10)[, 1:9]
[xFs, ]6  diag(10)[, 1:9][xFs, ]7
 -6.160   -5.080
-4.037   -3.217
diag(10)[, 1:9][xFs, ]8  diag(10)[, 1:9][xFs, ]9
 -1.882   -1.079




On Oct 21, 9:44 am, David Winsemius  wrote:
> On Oct 20, 2009, at 4:00 PM, Luciano La Sala wrote:
>
>
>
> > Dear R-people,
>
> > I am analyzing epidemiological data using GLMM using the lmer  
> > package. I usually explore the assumption of linearity of continuous  
> > variables in the logit of the outcome by creating 4 categories of  
> > the variable, performing a bivariate logistic regression, and then  
> > plotting the coefficients of each category against their mid points.  
> > That gives me a pretty good idea about the linearity assumption and  
> > possible departures from it.
>
> > I know of people who create 0,1 dummy variables in order to relax  
> > the linearity assumption. However, I've read that dummy variables  
> > are never needed (nor are desireble) in R! Instead, one should make  
> > use of factors variable. That is much easier to work with than dummy  
> > variables and the model itself will create the necessary dummy  
> > variables.
>
> > Having said that, if my data violates the linearity assumption, does  
> > the use of a factors for the variable in question helps overcome the  
> > lack of linearity?
>
> No. If done by dividing into samall numbers of categories after  
> looking at the data, it merely creates other (and probably more  
> severe) problems. If you are in the unusal (although desirable)  
> position of having a large number of events across the range of the  
> covariates in your data, you may be able to cut your variable into  
> quintiles or deciles and analyze the resulting factor, but the  
> preferred approach would be to fit a regression spline of sufficient  
> complexity.
>
> > Thanks in advance.
>
> --
>
> David Winsemius, MD
> Heritage Laboratories
> West Hartford, CT
>
> __
> r-h...@r-project.org mailing listhttps://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guidehttp://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Dummy variables or factors?

2009-10-20 Thread andrew

Oh dear, that doesn't look right at all.  I shall have a think about
what I did wrong and maybe follow my own advice and consult the doco
myself!


On Oct 21, 2:45 pm, andrew  wrote:
> The following is *significantly* easier to do than try and add in
> dummy variables, although the dummy variable approach is going to give
> you exactly the same answer as the factor method, but possibly with a
> different baseline.
>
> Basically, you might want to search the lm help and possibly consult a
> stats book on information about how the design matrix is constructed
> in both cases.
>
> > xF <- factor(1:10)
> > N <- 1000
> > xFs <- sample(x=xF,N,replace = T)
> > yFs <- rnorm(N, mean = as.numeric(xFs))
> > lm(yFs ~ xFs)
>
> Call:
> lm(formula = yFs ~ xFs)
>
> Coefficients:
> (Intercept)         xFs2         xFs3         xFs4
> xFs5         xFs6         xFs7         xFs8
>      0.7845       1.1620       2.1474       3.1391       4.2183
> 5.2621       6.0814       7.4170
>        xFs9        xFs10
>      8.2193       9.2987
>
> > lm(yFs ~ diag(10)[,1:9][xFs,])
>
> Call:
> lm(formula = yFs ~ diag(10)[, 1:9][xFs, ])
>
> Coefficients:
>             (Intercept)  diag(10)[, 1:9][xFs, ]1  diag(10)[, 1:9]
> [xFs, ]2  diag(10)[, 1:9][xFs, ]3
>                  10.083                   -9.299
> -8.137                   -7.151
> diag(10)[, 1:9][xFs, ]4  diag(10)[, 1:9][xFs, ]5  diag(10)[, 1:9]
> [xFs, ]6  diag(10)[, 1:9][xFs, ]7
>                  -6.160                   -5.080
> -4.037                   -3.217
> diag(10)[, 1:9][xFs, ]8  diag(10)[, 1:9][xFs, ]9
>                  -1.882                   -1.079
>
> On Oct 21, 9:44 am, David Winsemius  wrote:
>
>
>
> > On Oct 20, 2009, at 4:00 PM, Luciano La Sala wrote:
>
> > > Dear R-people,
>
> > > I am analyzing epidemiological data using GLMM using the lmer  
> > > package. I usually explore the assumption of linearity of continuous  
> > > variables in the logit of the outcome by creating 4 categories of  
> > > the variable, performing a bivariate logistic regression, and then  
> > > plotting the coefficients of each category against their mid points.  
> > > That gives me a pretty good idea about the linearity assumption and  
> > > possible departures from it.
>
> > > I know of people who create 0,1 dummy variables in order to relax  
> > > the linearity assumption. However, I've read that dummy variables  
> > > are never needed (nor are desireble) in R! Instead, one should make  
> > > use of factors variable. That is much easier to work with than dummy  
> > > variables and the model itself will create the necessary dummy  
> > > variables.
>
> > > Having said that, if my data violates the linearity assumption, does  
> > > the use of a factors for the variable in question helps overcome the  
> > > lack of linearity?
>
> > No. If done by dividing into samall numbers of categories after  
> > looking at the data, it merely creates other (and probably more  
> > severe) problems. If you are in the unusal (although desirable)  
> > position of having a large number of events across the range of the  
> > covariates in your data, you may be able to cut your variable into  
> > quintiles or deciles and analyze the resulting factor, but the  
> > preferred approach would be to fit a regression spline of sufficient  
> > complexity.
>
> > > Thanks in advance.
>
> > --
>
> > David Winsemius, MD
> > Heritage Laboratories
> > West Hartford, CT
>
> > __
> > r-h...@r-project.org mailing 
> > listhttps://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guidehttp://www.R-project.org/posting-guide.html
> > and provide commented, minimal, self-contained, reproducible code.
>
> __
> r-h...@r-project.org mailing listhttps://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guidehttp://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Dummy variables or factors?

2009-10-20 Thread andrew

Sorry for this third posting - the second method is the same as the
first after all: the coefficients of the first linear model *is* a
linear transformation of the second.  Just got confused with the
pasting, tis all.


On Oct 21, 2:51 pm, andrew  wrote:
> Oh dear, that doesn't look right at all.  I shall have a think about
> what I did wrong and maybe follow my own advice and consult the doco
> myself!
>
> On Oct 21, 2:45 pm, andrew  wrote:
>
>
>
> > The following is *significantly* easier to do than try and add in
> > dummy variables, although the dummy variable approach is going to give
> > you exactly the same answer as the factor method, but possibly with a
> > different baseline.
>
> > Basically, you might want to search the lm help and possibly consult a
> > stats book on information about how the design matrix is constructed
> > in both cases.
>
> > > xF <- factor(1:10)
> > > N <- 1000
> > > xFs <- sample(x=xF,N,replace = T)
> > > yFs <- rnorm(N, mean = as.numeric(xFs))
> > > lm(yFs ~ xFs)
>
> > Call:
> > lm(formula = yFs ~ xFs)
>
> > Coefficients:
> > (Intercept)         xFs2         xFs3         xFs4
> > xFs5         xFs6         xFs7         xFs8
> >      0.7845       1.1620       2.1474       3.1391       4.2183
> > 5.2621       6.0814       7.4170
> >        xFs9        xFs10
> >      8.2193       9.2987
>
> > > lm(yFs ~ diag(10)[,1:9][xFs,])
>
> > Call:
> > lm(formula = yFs ~ diag(10)[, 1:9][xFs, ])
>
> > Coefficients:
> >             (Intercept)  diag(10)[, 1:9][xFs, ]1  diag(10)[, 1:9]
> > [xFs, ]2  diag(10)[, 1:9][xFs, ]3
> >                  10.083                   -9.299
> > -8.137                   -7.151
> > diag(10)[, 1:9][xFs, ]4  diag(10)[, 1:9][xFs, ]5  diag(10)[, 1:9]
> > [xFs, ]6  diag(10)[, 1:9][xFs, ]7
> >                  -6.160                   -5.080
> > -4.037                   -3.217
> > diag(10)[, 1:9][xFs, ]8  diag(10)[, 1:9][xFs, ]9
> >                  -1.882                   -1.079
>
> > On Oct 21, 9:44 am, David Winsemius  wrote:
>
> > > On Oct 20, 2009, at 4:00 PM, Luciano La Sala wrote:
>
> > > > Dear R-people,
>
> > > > I am analyzing epidemiological data using GLMM using the lmer  
> > > > package. I usually explore the assumption of linearity of continuous  
> > > > variables in the logit of the outcome by creating 4 categories of  
> > > > the variable, performing a bivariate logistic regression, and then  
> > > > plotting the coefficients of each category against their mid points.  
> > > > That gives me a pretty good idea about the linearity assumption and  
> > > > possible departures from it.
>
> > > > I know of people who create 0,1 dummy variables in order to relax  
> > > > the linearity assumption. However, I've read that dummy variables  
> > > > are never needed (nor are desireble) in R! Instead, one should make  
> > > > use of factors variable. That is much easier to work with than dummy  
> > > > variables and the model itself will create the necessary dummy  
> > > > variables.
>
> > > > Having said that, if my data violates the linearity assumption, does  
> > > > the use of a factors for the variable in question helps overcome the  
> > > > lack of linearity?
>
> > > No. If done by dividing into samall numbers of categories after  
> > > looking at the data, it merely creates other (and probably more  
> > > severe) problems. If you are in the unusal (although desirable)  
> > > position of having a large number of events across the range of the  
> > > covariates in your data, you may be able to cut your variable into  
> > > quintiles or deciles and analyze the resulting factor, but the  
> > > preferred approach would be to fit a regression spline of sufficient  
> > > complexity.
>
> > > > Thanks in advance.
>
> > > --
>
> > > David Winsemius, MD
> > > Heritage Laboratories
> > > West Hartford, CT
>
> > > __
> > > r-h...@r-project.org mailing 
> > > listhttps://stat.ethz.ch/mailman/listinfo/r-help
> > > PLEASE do read the posting 
> > > guidehttp://www.R-project.org/posting-guide.html
> > > and provide commented, minimal, self-contained, reproducible code.
>
> > __
> > r-h...@r-project.org mailing 
> > listhttps://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guidehttp://www.R-project.org/posting-guide.html
> > and provide commented, minimal, self-contained, reproducible code.
>
> __
> r-h...@r-project.org mailing listhttps://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guidehttp://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] integrate() function error

2009-10-22 Thread andrew

Change e2 to the following and it works

e2 <- function(a) a^2/2

The reason it doesn't is that e2 must be able to handle vector inputs
correctly.   Your original function does not do this.

from ?integrate

f - an R function taking a numeric first argument and returning a
numeric vector of the same length. Returning a non-finite element will
generate an error.

You could change it to the following

e2 <- function(x) { out <- 0*x; for(n in 1:length(x)) out[n] <-
integrate(function(y) y,lower=0,upper=x[n])$value; out }

and it seems to give the correct answer for me.

On Oct 23, 1:13 pm, fuzuo xie  wrote:
> This is my code , when i run it ,error happed . can you tell me what's the
> reason and modify it ?thank you very much !! the error is "evaluation of
> function gave a result of wrong length"
>
> e2<-function(a) integrate(function(x) x,lower=0,upper=a)$value
> integrate(e2,lower=0, upper=0.5)$value
>
>         [[alternative HTML version deleted]]
>
> __
> r-h...@r-project.org mailing listhttps://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guidehttp://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] splitting a vector of strings...

2009-10-22 Thread andrew

xs <- "this is string"
xsv <- paste(xs, 1:10)
sapply(xsv, function(x) strsplit(x, '\\sis\\s'))

This will split the vector of string "xsv" on the word 'is' that has a
space immediately before and after it.



On Oct 23, 1:34 pm, Jonathan Greenberg  wrote:
> Quick question -- if I have a vector of strings that I'd like to split
> into two new vectors based on a substring that is inside of each string,
> what is the most efficient way to do this?  The substring that I want to
> split on is multiple characters, if that matters, and it is contained in
> every element of the character vector.
>
> --j
>
> --
>
> Jonathan A. Greenberg, PhD
> Postdoctoral Scholar
> Center for Spatial Technologies and Remote Sensing (CSTARS)
> University of California, Davis
> One Shields Avenue
> The Barn, Room 250N
> Davis, CA 95616
> Phone: 415-763-5476
> AIM: jgrn307, MSN: jgrn...@hotmail.com, Gchat: jgrn307
>
> __
> r-h...@r-project.org mailing listhttps://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guidehttp://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] splitting a vector of strings...

2009-10-22 Thread andrew

the following works - double backslash to remove the "or"
functionality of | in a regex.  (Bill Dunlap showed that you don't
need sapply for it to work)

xs <- "this is | string"
xsv <- paste(xs, 1:10)
strsplit(xsv, "\\|")


On Oct 23, 3:50 pm, Jonathan Greenberg  wrote:
> William et al:
>
>     Thanks!  I think I have a somewhat more complicated issue due to the
> type of string I'm using -- the split is " | " (space pipe space) -- how
> do I code that based on your sub code below?  Using " | *" doesn't seem
> to be working.  Thanks!
>
> --j
>
>
>
> William Dunlap wrote:
> >> -Original Message-
> >> From: r-help-boun...@r-project.org
> >> [mailto:r-help-boun...@r-project.org] On Behalf Of Jonathan Greenberg
> >> Sent: Thursday, October 22, 2009 7:35 PM
> >> To: r-help
> >> Subject: [R] splitting a vector of strings...
>
> >> Quick question -- if I have a vector of strings that I'd like
> >> to split
> >> into two new vectors based on a substring that is inside of
> >> each string,
> >> what is the most efficient way to do this?  The substring
> >> that I want to
> >> split on is multiple characters, if that matters, and it is
> >> contained in
> >> every element of the character vector.
>
> > strsplit and sub can both be used for this.  If you know
> > the string will be split into 2 parts then 2 calls to sub
> > with slightly different patterns will do it.  strsplit requires
> > less fiddling with the pattern and is handier when the number
> > of parts is variable or large.  strsplit's output often needs to
> > be rearranged for convenient use.
>
> > E.g., I made 100,000 strings with a 'qaz' in their middles with
> >   x<-paste("X",sample(1e5),sep="")
> >   y<-sub("X","Y",x)
> >   xy<-paste(x,y,sep="qaz")
> > and split them by the 'qaz' in two ways:
> >   system.time(ret1<-list(x=sub("qaz.*","",xy),y=sub(".*qaz","",xy)))
> >   # user  system elapsed
> >   # 0.22    0.00    0.21
>
> > system.time({tmp<-strsplit(xy,"qaz");ret2<-list(x=unlist(lapply(tmp,`[`,
> > 1)),y=unlist(lapply(tmp,`[`,2)))})
> >    user  system elapsed
> >   # 2.42    0.00    2.20
> >   identical(ret1,ret2)
> >   #[1] TRUE
> >   identical(ret1$x,x) && identical(ret1$y,y)
> >   #[1] TRUE
>
> > Bill Dunlap
> > Spotfire, TIBCO Software
> > wdunlap tibco.com
>
> >> --j
>
> >> --
>
> >> Jonathan A. Greenberg, PhD
> >> Postdoctoral Scholar
> >> Center for Spatial Technologies and Remote Sensing (CSTARS)
> >> University of California, Davis
> >> One Shields Avenue
> >> The Barn, Room 250N
> >> Davis, CA 95616
> >> Phone: 415-763-5476
> >> AIM: jgrn307, MSN: jgrn...@hotmail.com, Gchat: jgrn307
>
> >> __
> >> r-h...@r-project.org mailing list
> >>https://stat.ethz.ch/mailman/listinfo/r-help
> >> PLEASE do read the posting guide
> >>http://www.R-project.org/posting-guide.html
> >> and provide commented, minimal, self-contained, reproducible code.
>
> --
>
> Jonathan A. Greenberg, PhD
> Postdoctoral Scholar
> Center for Spatial Technologies and Remote Sensing (CSTARS)
> University of California, Davis
> One Shields Avenue
> The Barn, Room 250N
> Davis, CA 95616
> Phone: 415-763-5476
> AIM: jgrn307, MSN: jgrn...@hotmail.com, Gchat: jgrn307
>
> __
> r-h...@r-project.org mailing listhttps://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guidehttp://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] a problem about integrate function in R .thank you .

2009-10-23 Thread andrew

I don't seem to get a problem with this.  Have you tried a Monte Carlo
approach to verify that you are getting incorrect answers?

For me, I get when the upper is 1 that

> integrate(e2, lower = 0, upper = 1)
-0.2820948 with absolute error < 5e-05

> sum(e2(runif(1)))/1
[1] -0.2825667

which seems to be consistent, while for upper = 0.75 I get

> 0.75*sum(e2(runif(1, min=0, max=0.75)))/1
[1] -0.2333506
> integrate(e2,lower=0,upper = 0.75)
-0.2341178 with absolute error < 7.8e-05



On Oct 23, 2:52 pm, fuzuo xie  wrote:
> e2 <- function(x) {
>     out <- 0*x
>     for(i in 1:length(x))
>     out[i] <-integrate(function(y) qnorm(y),lower=0,upper=x[i])$value
>     out }
> integrate(e2,lower=0, upper=a)$value
>
> above is my code , when a is small , say a<0.45  the result is right .
> however , when a>0.5
> the result is incorrect .  why ?    thank you .
>
>         [[alternative HTML version deleted]]
>
> __
> r-h...@r-project.org mailing listhttps://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guidehttp://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] how do I plot a regression curve with the data?

2009-10-27 Thread andrew

Hi Ken,

Perhaps something like

plot(x,y)
lines(sort(x), fit$fitted.values[order(x)])

should do what I think you want.  I think you have to be a bit careful
on the ordering of the fitted values so that they match up with the
order of the x variable, otherwise you get an odd looking line plot.

Regards,

Andrew

On Oct 28, 5:42 am, Ken Ervin  wrote:
> I have a data set of 6 or so ordered pairs, and I've been able to graph
> them and have decided to use a high-order polynomial regression.  I've
> used the following piece of code:
>
> regression <- function(x,y) {
>     x <- c(insert_numbers_here)
>     y <- c(insert_other_numbers_here)
>     fit <- lm(y ~ x + I(x^2) + I(x^3) + I(x^4) + I(x^5) + I(x^6) +
> I(x^7) + I(x^8) + I(x^9))
>     summary(fit)
>
> This gives me the coefficients for the regression very nicely, but I
> would like to plot both the data and the regression curve together.  How
> do I plot that regression curve as a function, and can I put it on the
> same set of axes as my data scatter plot?
>
> Thanks in advance for your help!
>
> -KE
>
> __
> r-h...@r-project.org mailing listhttps://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guidehttp://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] getting the name of a single object in R for debugging output

2009-11-18 Thread Andrew

I often use a debug flag (set to TRUE) to turn on various debugging print 
statements in my R scripts.  I was thinking I should create a function 
debugPrint(object,debugFlag),
to print out the object name and contents if the debugFlag is set to TRUE.  
Then I wouldn't have to make my script ugly(..er) than it already is by adding 
IF statements all over the place.  I've seen how ls() dumps object names, but 
how do I get access to the character representation of the name of an object.

E.g.

myVar<- 10

print(myVar) produces "10"

I'd like to print out something like " myVar : 10"

I'd appreciate any suggestions.

Regards,

Andrew



  
[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] getting the name of a single object in R for debugging output

2009-11-18 Thread Andrew



Henrique,
 
It works great.  Perfect!   Thank you.
 
Warm regards,
 
Andrew
--- On Wed, 11/18/09, Henrique Dallazuanna  wrote:


Try this:

debugPrint <- function(x, ...){
    print(sprintf("%s: %d", deparse(substitute(x)), x), ...)
}

On Wed, Nov 18, 2009 at 8:35 AM, Andrew  wrote:
> I often use a debug flag (set to TRUE) to turn on various debugging print 
> statements in my R scripts.  I was thinking I should create a function 
> debugPrint(object,debugFlag),
> to print out the object name and contents if the debugFlag is set to TRUE..  
> Then I wouldn't have to make my script ugly(..er) than it already is by 
> adding IF statements all over the place.  I've seen how ls() dumps object 
> names, but how do I get access to the character representation of the name of 
> an object.
>
> E.g.
>
> myVar<- 10
>
> print(myVar) produces "10"
>
> I'd like to print out something like " myVar : 10"
>
> I'd appreciate any suggestions.
>
> Regards,
>
> Andrew
>
>
>
>
>        [[alternative HTML version deleted]]
>
>
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>
>



-- 
Henrique Dallazuanna
Curitiba-Paraná-Brasil
25° 25' 40" S 49° 16' 22" O



  
[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] smoothing or curve-fit a time series using lowess, polynomial or whatever I can get working

2009-12-02 Thread Andrew

I was looking for suggestions as to how to smooth a timeseries and, having 
accomplished that, how to find the fitted curve values for intermediate points. 
 
I've tried numerous examples of possible approaches in R that I've found on the 
web, but when applied to my simple data, R returns an error message in numerous 
cases.
 
The main problem seems to be that I have monthly data, starting on the 4th 
month of the first year.  So if I create a timeseries with null for the first 3 
months, or starting on the 3rd month, R libraries function fail.
I don't care about the quality of the data,I just want to get R working with 
time series, so I added obviously false values for jan-march of 2008 and I get 
the same error messages "only univariate series are allowed"

1.0
1.0
1.0
29925.0
60663.0
221873.5
480491.0
231454.5
193697.5
289078.1
446774.9
471155.7
211032.2
214740.9
865103.5
1358822.0
1484117.0
2152747.0
1825709.0
1403461.0
2502712.0
2392353.1
3368193.8
2496529.3
3089231.2
4334911.7
3577551.2
4450100.7
4581147.2
4501380.8
4406765.3
4326864.4
4253378.1
4142437.6
3931249.9
3658728.2
3364516.9
3103204.5
2878541.5
2684296.9
2513390.0
2360164.3
2211494.8
2074479.4
1947117.0
1828018.3
1716195.3
1610921.2
dec=".",)
> myTS<- ts(df, start=c(2008,1), frequency=12)
> 
> myTS
   Jan   Feb   Mar   Apr   May   Jun   Jul  
 Aug   Sep   Oct   Nov   Dec
2008   0.0   0.0   0.0   29925.0   60663.0  221873.5  480491.0  
231454.5  193697.5  289078.1  446774.9  471155.7
2009  211032.2  214740.9  865103.5 1358822.0 1484117.0 2152747.0 1825709.0 
1403461.0 2502712.0 2392353.1 3368193.8 2496529.3
2010 3089231.2 4334911.7 3577551.2 4450100.7 4581147.2 4501380.8 4406765.3 
4326864.4 4253378.1 4142437.6 3931249.9 3658728.2
2011 3364516.9 3103204.5 2878541.5 2684296.9 2513390.0 2360164.3 2211494.8 
2074479.4 1947117.0 1828018.3 1716195.3 1610921.2
> 
> 
> library(stats)
> 
> stl(log(myTS),s.window="periodic")
Error in stl(log(myTS), s.window = "periodic") : 
  only univariate series are allowed
>  
 

I don't understand what is wrong with my time series and I certain I'm missing 
something very basic here --

 I'd be very grateful for any suggestions as to how to fit a curve to this 
data, and then how to find the fitted value of the  curve at any arbitrary 
month within the series (i.e. no extraplolation is needed)

I would not mind using some other method, lm, polynomial fit etc, and am not 
set on using stl.

I'd be very grateful for any suggestions as to how to fit a curve to this data, 
and then how to find the fitted value of the  curve at any arbitrary month 
within the series (i.e. no extraplolation is needed)



 
Warm regards,

Andrew 
 
 
 
 




__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Legend containing maths symbol and values of variables

2009-03-23 Thread andrew

the plotmath help page should give the answer

http://www.stat.ucl.ac.be/ISdidactique/Rhelp/library/base/html/plotmath.html

On Mar 24, 2:53 pm, Daren Tan  wrote:
> I need to have the maths symbol for >= in the legend, and to
> substitute threshold variable with its value. Somehow, various
> attempts weren't successful. Please help.
>
> threshold <- 0.5
>
> plot(NA, xlab="", ylab="", main="", axes=F, xlim=c(0,1), ylim=c(0,1),
> xaxs="i", yaxs="i")
> legend(x=0, y=1, fill=c("orange", "white", "green", "gray90"),
> cex=0.8, pt.cex=1.8, x.intersp=0.6, bg = "white",
>            legend=c(expression("Up (" >= threshold ")"), "Normal",
> "Down", "NA"), bty = "n")
>
> __
> r-h...@r-project.org mailing listhttps://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guidehttp://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Selecting elements from a vector (a simple question with weird results)

2009-03-29 Thread andrew

aa[order(aa)] is the same as sort, although sort is much quicker and
has lower memory requirements:

> aa <- rnorm(1000)
> all(aa[order(aa)] == sort(aa))
[1] TRUE
> system.time(sort(aa))
   user  system elapsed
   7.270.088.25
> system.time(aa[order(aa)])
   user  system elapsed
  29.580.32   31.31

(yes, I have a pitifully slow computer)

HTH,

Andrew.

On Mar 29, 8:01 am, Tal Galili  wrote:
> Yes Johannes - That helped, thank you.
>
> On Sat, Mar 28, 2009 at 11:50 PM, Johannes Huesing 
> wrote:
>
>
>
> > Tal Galili  [Sat, Mar 28, 2009 at 06:48:36PM CET]:
> > > Hello people.
>
> > > I wish to reorder a simple vector of numbers by another vector of the
> > order
> > > (and then do the same, but with a data frame rows)
>
> > > I try this (which doesn't work) :
> > > > aa <- c(3, 1 ,2 )
> > > > aa[aa]
> > > [1] 2 3 1
>
> > To my mind, it does what you told it to, and therefore "works"
> > in my book. The routine orders the numbers by placing the third element
> > first, the first second, and the second third.
>
> > Maybe aa[order(aa)] does what you mean it to do?
>
> > Best wishes
>
> > Johannes
>
> > --
> > Johannes Hüsing               There is something fascinating about science.
> >                              One gets such wholesale returns of conjecture
> > mailto:johan...@huesing.name  from such a trifling investment of fact.
> >http://derwisch.wikidot.com        (Mark Twain, "Life on the
> > Mississippi")
>
> --
> --
>
> My contact information:
> Tal Galili
> Phone number: 972-50-3373767
> FaceBook: Tal Galili
> My 
> Blogs:http://www.r-statistics.com/http://www.talgalili.comhttp://www.biostatistics.co.il
>
>         [[alternative HTML version deleted]]
>
> __
> r-h...@r-project.org mailing listhttps://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guidehttp://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] How may I add "is.outer" method

2009-04-12 Thread andrew

I am not quite sure how you define your outlier, but the definition
that I am familiar with is that an outlier is greater than 1.5*Inter
quartile range above the 3rd quartile or below the 1st quartile (this
is the default for the whiskers in boxplot).  This can be easily found
by

  x[(x < quantile(x)[2] - 1.5*IQR(x)) | (x > quantile(x)[4] + 1.5*IQR
(x))]

or if you prefer

  is.outlier <- function(x) {(x < quantile(x)[2] - 1.5*IQR(x)) | (x >
quantile(x)[4] + 1.5*IQR(x))}
  x[is.outlier(x)]

HTH.

Andrew.

On Apr 12, 8:51 am, Grześ  wrote:
> Hello
> I have a problem like this:
>
> > Glass$RI[is.outlier(Glass$RI)]
>
> Error: could not find function "is.outlier"
>
> Which commend I may add "is.outer" to my application?
>
> --
> View this message in 
> context:http://www.nabble.com/How-may-I-add-%22is.outer%22-method-tp23006216p...
> Sent from the R help mailing list archive at Nabble.com.
>
> __
> r-h...@r-project.org mailing listhttps://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guidehttp://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Help with for/if loop

2009-04-23 Thread andrew

or perhaps

agec <- 0*age
age[age<=46] <- 1
age[age>46 & age<=58 <- 2
age[age>58] <- 3

or perhaps a one liner

cut(x = age, breaks= c(0,46, 58,Inf), labels = c(1,2,3))



On Apr 24, 11:24 am, Jorge Ivan Velez 
wrote:
> Dear Hollie,
> ifelse() is one alternative in this particular case:
>
> # Some data
> age<- c(46,47,43,46,47,59,50,54,59,60)
>
> ifelse(age<=46, 1,
>         ifelse( age>46 & age<=58 , 2, 3) )
> [1] 1 2 1 1 2 3 2 2 3 3
>
> See ?ifelse for more details.
>
> HTH,
>
> Jorge
>
> On Thu, Apr 23, 2009 at 8:07 PM, Giggles_Fairy
> wrote:
>
>
>
>
>
> > I have a set of data that includes various data columns. One if the
> > survival
> > time and another if a continuous variable of ages. I want to put the ages
> > into intervals so that I can then perform the Kalpan Meier test. I am
> > trying
> > to use the following code to build a column with the age group numbers in
>
> > agecatagory<-c( )
> > for (i in 1:137)
> > {
> > {
> > if(age[i]<=46)  {agecat[i]<-1}
> > if(age[i]>46 & age[i]<= 58) {agecat[i]<-2}
> > if(age[i]>58) {agecat[i]<-3}
> > }
> > agecatagory<-c(agecatagory, agecat[i])
> > }
>
> > I have been getting various errors for various things and have finally got
> > it so that only one error comes up
> > Error in if (age[i] <= 46) { : missing value where TRUE/FALSE needed
>
> > Could anyone pleaseee shed some light on this for me and tell em where
> > I
> > am going wrong. I am sure it is just a minor thing but I cant for the life
> > of me figure it out!
>
> > Your replies will be very much appreciated!
>
> > Hollie
> > --
> > View this message in context:
> >http://www.nabble.com/Help-with-for-if-loop-tp23207428p23207428.html
> > Sent from the R help mailing list archive at Nabble.com.
>
> > __
> > r-h...@r-project.org mailing list
> >https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide
> >http://www.R-project.org/posting-guide.html
> > and provide commented, minimal, self-contained, reproducible code.
>
>         [[alternative HTML version deleted]]
>
> __
> r-h...@r-project.org mailing listhttps://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guidehttp://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] FOURIER INTEGRALS IN R

2009-05-04 Thread andrew

integrate offers some one-dimensional algorithms, but you need to
start with a smooth function to get it to converge properly.  With a
cosine integral, there may be certain routines that offer better value
for money: the Clenshaw-Curtis integration, or perhaps the FFT.  You
would have to recast your problem by doing some sort of substitution.

Perhaps post some latex code to show the exact type of integral you
are wanting to calculate.

Regards,

On May 5, 6:32 am, "Achilleas Achilleos"  wrote:
> Hi,
>
> I am wondering whether there exist any function in R (any package) that
> calculates Fourier Integrals.
>
> Particularly, I am interested for estimation of a Cosine Fourier integral...
>
> I would be much obliged if you could help me on this..
>
> Thanks.
>
> Andreas
>
> --
>
> __
> r-h...@r-project.org mailing listhttps://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guidehttp://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] FOURIER INTEGRALS IN R

2009-05-05 Thread andrew

Isn't it possible to write this as the real part of a fourier
transform?  You could then use the fft to calculate it for all the
values of x, all at once.  The Madan and Carr paper offers some
pointers on how to do this: http://www.imub.ub.es/events/sssf/vgfrier7.pdf

On May 6, 1:06 am, "Ravi Varadhan"  wrote:
> Your integrand is smooth.
>
> What are the limits of integration: 0 to 1 or 0 to infinity?  Main challenge
> is that it is increasingly oscillatory as x and/or t increase.  You can find
> the zeros of thecosinefunction and add up the integrals between successive
> zeros.  In what context does this inttegral arise?  It must have been
> studied well using asymptotic approximation and such.  
>
> Ravi.
>
> 
> ---
>
> Ravi Varadhan, Ph.D.
>
> Assistant Professor, The Center on Aging and Health
>
> Division of Geriatric Medicine and Gerontology
>
> Johns Hopkins University
>
> Ph: (410) 502-2619
>
> Fax: (410) 614-9625
>
> Email: rvarad...@jhmi.edu
>
> Webpage:  http://www.jhsph.edu/agingandhealth/People/Faculty/Varadhan.html
>
> 
> 
>
> -Original Message-
> From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On
>
> Behalf Of A Achilleos
> Sent: Tuesday, May 05, 2009 10:18 AM
> To: r-h...@r-project.org
> Subject: Re: [R] FOURIER INTEGRALS IN R
>
> Ok thanks..
>
> No, my function is not smooth.
>
> Actually, I am dealing with an integral having the following form, for
> example:
>
> \int cos(tx) (1-t^2)^3 \exp(0.5*t^2) dt
>
> I want to estimate this FourierCosineintegral for a given value of x.
>
> Thanks for the help.
>
> AA
>
> On Tue, May 5, 2009 2:34 am, andrew wrote:
> > integrate offers some one-dimensional algorithms, but you need to
> > start with a smooth function to get it to converge properly.  With a
> >cosineintegral, there may be certain routines that offer better value
> > for money: the Clenshaw-Curtis integration, or perhaps the FFT.  You
> > would have to recast your problem by doing some sort of substitution.
>
> > Perhaps post some latex code to show the exact type of integral you
> > are wanting to calculate.
>
> > Regards,
>
> > On May 5, 6:32 am, "Achilleas Achilleos"  wrote:
> >> Hi,
>
> >> I am wondering whether there exist any function in R (any package)
> >> that calculates Fourier Integrals.
>
> >> Particularly, I am interested for estimation of aCosineFourier
> >> integral...
>
> >> I would be much obliged if you could help me on this..
>
> >> Thanks.
>
> >> Andreas
>
> >> --
>
> >> __
> >> r-h...@r-project.org mailing
> >> listhttps://stat.ethz.ch/mailman/listinfo/r-help
> >> PLEASE do read the posting
> >> guidehttp://www.R-project.org/posting-guide.html
> >> and provide commented, minimal, self-contained, reproducible code.
>
> > __
> > r-h...@r-project.org mailing list
> >https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide
> >http://www.R-project.org/posting-guide.html
> > and provide commented, minimal, self-contained, reproducible code.
>
> --
> A Achilleos
> ma...@bristol.ac.uk
>
> __
> r-h...@r-project.org mailing listhttps://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guidehttp://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>
> __
> r-h...@r-project.org mailing listhttps://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guidehttp://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>
>

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] How to reduce the sparseness in a TDM to make a cluster plot readable?

2020-09-14 Thread Andrew

Hello all

I am doing some text mining on a set of five plain text files and have 
run into a snag when I run hclust in that there are just too many leaves 
for anything to be read. It returns a solid black line.

The texts have been converted into a TDM which has a dim of 5,292 and 5 
(as per 5 docs).

My code for removing sparsity is as follows:

 > tdm2 <- removeSparseTerms(tdm, sparse=0.9)

 > inspect(tdm2)

<>
Non-/sparse entries: 10415/16045
Sparsity   : 61%
Maximal term length: 22
Weighting  : term frequency (tf)

While the tf-idf weighting returns this when 0.9 sparseness is removed:

 > inspect(tdm.tfidf)
<>
Non-/sparse entries: 7915/18545
Sparsity   : 70%
Maximal term length: 22
Weighting  : term frequency - inverse document frequency 
(normalized) (tf-idf)

I have experimented by decreasing the value I use for decreasing 
sparseness, and that helps a bit, for example:

 > tdm2 <- removeSparseTerms(tdm, sparse=0.215)
 > inspect(tdm2)
<>
Non-/sparse entries: 3976/369
Sparsity   : 8%
Maximal term length: 14
Weighting  : term frequency (tf)

But, no matter what I do, the resulting plot is unreadable. The code for 
plotting the cluster is:

 > hc <- hclust(dist(tdm2, method = "euclidean"), method = "complete")
 > plot(hc, yaxt = 'n', main = "Hierarchical clustering")

Can someone kindly either advise me what I am doing wrong and/ or 
signpost me to some detailed info on how to fix this.

Many thanks in anticipation.

Andy


[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] How to reduce the sparseness in a TDM to make a cluster plot readable?

2020-09-18 Thread Andrew

Hi Abby

Many thanks for reaching out with an offer of help. Very much appreciated.

(1) The packages I'm using are 'tm' for text-mining and the TDM and for 
the clustering it is 'cluster'
(2) Not sure where the problem is happening as it doesn't show up as an 
error. Where it manifests is in the plotting, however logic would 
suggest that it concerns the removal of sparse terms, so that would be 
in the TDM process
(3) I don't think I can provide a reproducible example. When I practice 
using data sets that packages provide, all is fine. The trouble is when 
I apply it to my own data sets which are five documents, etc., as described.

I think the nub of it is really to find a way that I can subset the TDM 
to return the twenty or thirty most frequently used words, and then to 
plot those using hclust. However, when searching on-line I haven't been 
able to find any suggestions on how to do that, nor is there any mention 
of using that approach in the books and tutorials I have.

If you (or someone on this list) can advise on how I can sort the terms 
in the TDM from most to least frequent, and then to subset the top 
twenty or thirty most frequently occurring terms (preferably using tf as 
well as tf-idf) and then I can plot that sub-set, then I think that that 
would do the trick, and the terms would be plotted clearly and legibly.

Thanks again for your offer of help. I hope that my reply helps clarify 
rather than muddy the situation.

Best wishes
Andy

On 17/09/2020 08:43, Abby Spurdle wrote:

I'm not familiar with these subjects.
And hopefully, someone who is, will offer some better suggestions.

But to get things started, maybe...
(1) What packages are you using (re: tdm)?
(2) Where does the problem happen, in dist, hclust, the plot method
for hclust, or in the package(s) you are using?
(3) Do you think you could produce a small reproducible example,
showing what is wrong, and explaining you would like it to do instead?

Note that if the problem relates to hclust, or the plot method, then
you should be able to produce a much simpler example.
e.g.

 mycount.matrix <- matrix (rpois (25000, 20),, 5)
 head (mycount.matrix, 3)
 tail (mycount.matrix, 3)

 plot (hclust (dist (mycount.matrix) ) )

On Tue, Sep 15, 2020 at 6:54 AM Andrew  wrote:

Hello all

I am doing some text mining on a set of five plain text files and have
run into a snag when I run hclust in that there are just too many leaves
for anything to be read. It returns a solid black line.

The texts have been converted into a TDM which has a dim of 5,292 and 5
(as per 5 docs).

My code for removing sparsity is as follows:

  > tdm2 <- removeSparseTerms(tdm, sparse=0.9)

  > inspect(tdm2)

<>
Non-/sparse entries: 10415/16045
Sparsity   : 61%
Maximal term length: 22
Weighting  : term frequency (tf)

While the tf-idf weighting returns this when 0.9 sparseness is removed:

  > inspect(tdm.tfidf)
<>
Non-/sparse entries: 7915/18545
Sparsity   : 70%
Maximal term length: 22
Weighting  : term frequency - inverse document frequency
(normalized) (tf-idf)

I have experimented by decreasing the value I use for decreasing
sparseness, and that helps a bit, for example:

  > tdm2 <- removeSparseTerms(tdm, sparse=0.215)
  > inspect(tdm2)
<>
Non-/sparse entries: 3976/369
Sparsity   : 8%
Maximal term length: 14
Weighting  : term frequency (tf)

But, no matter what I do, the resulting plot is unreadable. The code for
plotting the cluster is:

  > hc <- hclust(dist(tdm2, method = "euclidean"), method = "complete")
  > plot(hc, yaxt = 'n', main = "Hierarchical clustering")

Can someone kindly either advise me what I am doing wrong and/ or
signpost me to some detailed info on how to fix this.

Many thanks in anticipation.

Andy

 [[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] How to reduce the sparseness in a TDM to make a cluster plot readable?

2020-09-18 Thread Andrew

Hello Jim

Thanks for that. I'll read up on it and will give it a go, either later 
today or tomorrow. I am assuming this will work for both tf and tf-idf 
weighted TDMs?

Much appreciated. :-)

Best wishes
Andy

On 18/09/2020 09:18, Jim Lemon wrote:

Hi Andrew,
>From your last email the answer to your problem may be the
findFreqTerms() function. Just increase the number of times a term has
to appear and check the result until you get the matrix size that you
want.

Jim

On Fri, Sep 18, 2020 at 5:32 PM Andrew  wrote:

Hi Abby

Many thanks for reaching out with an offer of help. Very much appreciated.

(1) The packages I'm using are 'tm' for text-mining and the TDM and for
the clustering it is 'cluster'
(2) Not sure where the problem is happening as it doesn't show up as an
error. Where it manifests is in the plotting, however logic would
suggest that it concerns the removal of sparse terms, so that would be
in the TDM process
(3) I don't think I can provide a reproducible example. When I practice
using data sets that packages provide, all is fine. The trouble is when
I apply it to my own data sets which are five documents, etc., as described.

I think the nub of it is really to find a way that I can subset the TDM
to return the twenty or thirty most frequently used words, and then to
plot those using hclust. However, when searching on-line I haven't been
able to find any suggestions on how to do that, nor is there any mention
of using that approach in the books and tutorials I have.

If you (or someone on this list) can advise on how I can sort the terms
in the TDM from most to least frequent, and then to subset the top
twenty or thirty most frequently occurring terms (preferably using tf as
well as tf-idf) and then I can plot that sub-set, then I think that that
would do the trick, and the terms would be plotted clearly and legibly.

Thanks again for your offer of help. I hope that my reply helps clarify
rather than muddy the situation.

Best wishes
Andy

On 17/09/2020 08:43, Abby Spurdle wrote:

I'm not familiar with these subjects.
And hopefully, someone who is, will offer some better suggestions.

But to get things started, maybe...
(1) What packages are you using (re: tdm)?
(2) Where does the problem happen, in dist, hclust, the plot method
for hclust, or in the package(s) you are using?
(3) Do you think you could produce a small reproducible example,
showing what is wrong, and explaining you would like it to do instead?

Note that if the problem relates to hclust, or the plot method, then
you should be able to produce a much simpler example.
e.g.

  mycount.matrix <- matrix (rpois (25000, 20),, 5)
  head (mycount.matrix, 3)
  tail (mycount.matrix, 3)

  plot (hclust (dist (mycount.matrix) ) )

On Tue, Sep 15, 2020 at 6:54 AM Andrew  wrote:

Hello all

I am doing some text mining on a set of five plain text files and have
run into a snag when I run hclust in that there are just too many leaves
for anything to be read. It returns a solid black line.

The texts have been converted into a TDM which has a dim of 5,292 and 5
(as per 5 docs).

My code for removing sparsity is as follows:

   > tdm2 <- removeSparseTerms(tdm, sparse=0.9)

   > inspect(tdm2)

<>
Non-/sparse entries: 10415/16045
Sparsity   : 61%
Maximal term length: 22
Weighting  : term frequency (tf)

While the tf-idf weighting returns this when 0.9 sparseness is removed:

   > inspect(tdm.tfidf)
<>
Non-/sparse entries: 7915/18545
Sparsity   : 70%
Maximal term length: 22
Weighting  : term frequency - inverse document frequency
(normalized) (tf-idf)

I have experimented by decreasing the value I use for decreasing
sparseness, and that helps a bit, for example:

   > tdm2 <- removeSparseTerms(tdm, sparse=0.215)
   > inspect(tdm2)
<>
Non-/sparse entries: 3976/369
Sparsity   : 8%
Maximal term length: 14
Weighting  : term frequency (tf)

But, no matter what I do, the resulting plot is unreadable. The code for
plotting the cluster is:

   > hc <- hclust(dist(tdm2, method = "euclidean"), method = "complete")
   > plot(hc, yaxt = 'n', main = "Hierarchical clustering")

Can someone kindly either advise me what I am doing wrong and/ or
signpost me to some detailed info on how to fix this.

Many thanks in anticipation.

Andy

  [[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-hel

[R] Biologist R learner

2012-11-19 Thread andrew

I am a Biologist and a beginner, please help me to solve this: please
anyone..its my homework and I dont have a clue abt R yet..
   Write a function which does the following tasks:

(a) Calculates minimum and maximum value of a given argument x.
(b) If x is positive, some new vector gets the value of TRUE, and FALSE
otherwise.
(c) Creates a vector where the i:th and (i-1):th values of x are always
summed. First value of
the new vector has the same value as the first component of x. Use the
created function to
some vector x to show that the function works.




--
View this message in context: 
http://r.789695.n4.nabble.com/Biologist-R-learner-tp4650044.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] How do I generate one vector for every row of a data frame?

2008-12-18 Thread andrew

I think this should work

rgmm <- function(n, gmm) {
M <- sample(1:4, n, replace = TRUE, prob= gmm$weight)
mean <- gmm[M, ]$mean
sd <- gmm[M, ]$sd

return(gmm[M,]$sd*rnorm(n) + gmm[M,]$mean)
}

hist(rgmm(1, gmm), breaks = 500)


On Dec 19, 4:14 pm, "Bill McNeill (UW)" 
wrote:
> I am trying to generate a set of data points from a Gaussian mixture
> model.  My mixture model is represented by a data frame that looks
> like this:
>
> > gmm
>
>   weight mean  sd
> 1    0.3    0 1.0
> 2    0.2   -2 0.5
> 3    0.4    4 0.7
> 4    0.1    5 0.3
>
> I have written the following function that generates the appropriate data:
>
> gmm_data <- function(n, gmm) {
>         c(rnorm(n*gmm[1,]$weight, gmm[1,]$mean, gmm[1,]$sd),
>                 rnorm(n*gmm[2,]$weight, gmm[2,]$mean, gmm[2,]$sd),
>                 rnorm(n*gmm[3,]$weight, gmm[3,]$mean, gmm[3,]$sd),
>                 rnorm(n*gmm[4,]$weight, gmm[4,]$mean, gmm[4,]$sd))
>
> }
>
> However, the fact that my mixture has four components is hard-coded
> into this function.  A better implementation of gmm_data() would
> generate data points for an arbitrary number of mixture components
> (i.e. an arbitrary number of rows in the data frame).
>
> How do I do this?  I'm sure it's simple, but I can't figure it out.
>
> Thanks.
> --
> Bill McNeillhttp://staff.washington.edu/billmcn/index.shtml
>
> __
> r-h...@r-project.org mailing listhttps://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guidehttp://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Memory Efficiency of Symmetric Matrix

2009-01-06 Thread andrew

the SparseM package might be what you are looking for

http://www.econ.uiuc.edu/~roger/research/sparse/SparseM.pdf

On Jan 7, 11:36 am, Søren Højsgaard  wrote:
> You can do
> mat[lower.tri(mat, diag=F)]
> Søren
>
> 
>
> Fra: r-help-boun...@r-project.org på vegne af Nathan S. Watson-Haigh
> Sendt: on 07-01-2009 01:28
> Til: r-h...@r-project.org
> Emne: [R] Memory Efficiency of Symmetric Matrix
>
> -BEGIN PGP SIGNED MESSAGE-
> Hash: SHA1
>
> I'm generating a symmetric correlation matrix using a data matrix as input:
> mat <- cor(data.mat)
>
> My question is:
> Is there a more memory efficient way to store this data? For instance, since:
> all(mat == t(mat))
> every value is duplicated, and I should be able to almost half the memory 
> usage for large matrices.
>
> Any thoughts/comments?
>
> Cheers,
> Nathan
>
> - --
> - 
> Dr. Nathan S. Watson-Haigh
> OCE Post Doctoral Fellow
> CSIRO Livestock Industries
> Queensland Bioscience Precinct
> St Lucia, QLD 4067
> Australia
>
> Tel: +61 (0)7 3214 2922
> Fax: +61 (0)7 3214 2900
> Web:http://www.csiro.au/people/Nathan.Watson-Haigh.html
> - 
>
> -BEGIN PGP SIGNATURE-
> Version: GnuPG v1.4.9 (MingW32)
> Comment: Using GnuPG with Mozilla 
> -http://enigmail.mozdev.org
>
> iEYEARECAAYFAklj9yAACgkQ9gTv6QYzVL6MGQCg1CHsRGAwEMah/8ZuZ9QFI6O5
> lcIAnjZ68DE9FABLMd07A3AfdMPRpXIH
> =5bet
> -END PGP SIGNATURE-
>
> __
> r-h...@r-project.org mailing listhttps://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guidehttp://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>
> __
> r-h...@r-project.org mailing listhttps://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guidehttp://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Returning Non-Unique Index with Which (alternatives?)

2009-01-13 Thread andrew

Hi Gundala,

The following should work.

x <- numeric(length(qr))
for(k in 1:3) x[k] <- which(repo %in% qr[k])

also, be careful about overwriting qr - it is an base package
function.

regards,

Andrew

On Jan 13, 1:41 pm, "Gundala Viswanath"  wrote:
> Dear all,
>
> I tried to find index in repo given  a query with this:
>
> > repo <- c("AAA", "AAT", "AAC", "AAG", "ATA", "ATT")
> > qr <- c("AAC", "ATT", "ATT")
> > which(repo%in%qr)
>
> [1] 3 6
>
> Note that the query contain repeating elements, yet
> the output of which only returns unique.
>
> How can I make it returning
>
> [1] 3 6 6
>
> instead?
>
> - Gundala Viswanath
> Jakarta - Indonesia
>
> __
> r-h...@r-project.org mailing listhttps://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guidehttp://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] remove columns containing all zeros (or other value)

2009-01-14 Thread andrew

or this

x[,!(colSums(abs(x)) == 0)]

On Jan 15, 10:00 am, Marc Schwartz  wrote:
> Careful:
>
> x <- matrix(c(1, 5, 3, 2, 1, 4, -1, 0, 1),
>             ncol = 3, nrow = 3)
>
> > x
>
>      [,1] [,2] [,3]
> [1,]    1    2   -1
> [2,]    5    1    0
> [3,]    3    4    1
>
> > x[, colSums(x) != 0]
>
>      [,1] [,2]
> [1,]    1    2
> [2,]    5    1
> [3,]    3    4
>
> Not quite the result wanted...  :-)
>
> Try this:
>
> > x[, colSums(x == 0) != nrow(x)]
>
>      [,1] [,2]
> [1,]    1    2
> [2,]    5    1
> [3,]    3    4
>
> x <- matrix(c(1, 5, 3, 2, 1, 4, -1, 0, 1),
>               ncol = 3, nrow = 3)
>
> > x[, colSums(x == 0) != nrow(x)]
>
>      [,1] [,2] [,3]
> [1,]    1    2   -1
> [2,]    5    1    0
> [3,]    3    4    1
>
> HTH,
>
> Marc Schwartz
>
> on 01/14/2009 04:29 PM Gustavo Carvalho wrote:
>
>
>
> > Sorry for the double post, but this is probably faster:
>
> > x[, colSums(x) != 0]
>
> > On Wed, Jan 14, 2009 at 8:22 PM, Gustavo Carvalho
> >  wrote:
> >> You can also try this:
>
> >> x[,-(which(colSums(x) == 0))]
>
> >> Cheers,
>
> >> Gustavo.
>
> >> On Wed, Jan 14, 2009 at 8:01 PM, Anthony Dick  wrote:
> >>> Hello-
>
> >>> I would like to remove the columns of a matrix that contain all zeros. For
> >>> example, from
> >>> x<-matrix(c(1,5,3,2,1,4,0,0,0), ncol=3,nrow=3)
>
> >>> I would like to remove the third column. However, because this is in a 
> >>> loop
> >>> I need a way to first determine which columns are all zeros, and only then
> >>> remove them. I.e., I don't know which column of x contains all zeros until
> >>> after x is created.
>
> >>> Thanks!
>
> >>> Anthony
>
> __
> r-h...@r-project.org mailing listhttps://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guidehttp://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Partial sort?

2009-01-14 Thread andrew

I had a similar misunderstanding of using partial in sort() a while
back when I was trying to sort by columns, a la Excel.  In this case,
I ended up using the order() function, eg:

require(stats)
swiss[order(rownames(swiss)), ] #sort by location
swiss[order(swiss$fertility), ] #sort by fertility

Not sure if this helps, as I am attempting to do a mind read based on
ASCII (I get better results using unicode)

Regards,

Andrew

On Jan 15, 10:42 am, Duncan Murdoch  wrote:
> rkevinbur...@charter.net wrote:
> > This is definitely a newbie question but from the documentation I have not 
> > been able to figure out what the partial sort option on the sort method 
> > does. I have read and re-read the documentation and looked at the examples 
> > but for some reason it doesn't register. Would someone attempt to explain 
> > what sort with a non-null partial array of indices does?
>
> It guarantees that those particular indices are sorted correctly, but
> doesn't guarantee anything else.  For example,
>
>  > x <- 10:1
>  > sort(x, partial=1)
>
> guarantees that the first entry in x (i.e. 10) is placed correctly, but
> nothing else, and the result is:
>
>  [1]  1  9  8  7  6  5  4  3  2 10
>
> Duncan Murdoch
>
> __
> r-h...@r-project.org mailing listhttps://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guidehttp://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] WhisperStation R

2009-01-26 Thread andrew

any idea why DDR2 667 MHz RAM isn't used instead of DDR?  I thought
that DDR 400MHz was almost finished in production...

On Jan 27, 1:01 pm, zerfetzen  wrote:
> What do you think of this:
>
> http://www.microway.com/whisperstation/whisperstation-r.html
>
> I'm considering ditching my Windows Vista 2 GB RAM computer for
> WhisperStation R using Debian 64-bit Linux with 32 GB RAM and setting the
> whole thing up for R and WinBUGS.  I put in a price request, but I know
> nothing about Linux, or WhisperStation R for that matter, and am really
> curious what you think?
> --
> View this message in 
> context:http://www.nabble.com/WhisperStation-R-tp21678280p21678280.html
> Sent from the R help mailing list archive at Nabble.com.
>
> __
> r-h...@r-project.org mailing listhttps://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guidehttp://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Environmental variables

2009-02-01 Thread andrew

can you use the system function?

system("echo $PATH")



On Feb 2, 11:10 am, David Epstein  wrote:
> I use a Mac (10.4.11 Mac Os X).
>
> In my .tcshrc I define an environmental variable MY.
> Is it possible to find out its value from inside R? When one loads
> R for Mac OS X Cocoa GUI written by:
>         Simon Urbanek
>         Stefano M. Iacus
> are files like .tcshrc read by R?
>
> Can I make the value of this environmental variable available to R?
>
> Sys.getenv() produces a lot of output, with the values of many environmental
> variables, but not this one.
>
> Do I need a file with the correct value of MY, and have both R and my Unix
> shell read the same file, or is there a better way to proceed? I want to
> avoid duplicating the information in source files, as this can lead to a
> setup that is very hard to maintain.
>
> Thanks for any help.
> David
> --
> View this message in 
> context:http://www.nabble.com/Environmental-variables-tp21782296p21782296.html
> Sent from the R help mailing list archive at Nabble.com.
>
> __
> r-h...@r-project.org mailing listhttps://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guidehttp://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] pause in function execution

2009-02-09 Thread andrew

I am not too sure what your question is, but try

?debug

or take a look at http://www.ats.ucla.edu/stat/r/library/R-debug-tools.pdf

On Feb 10, 9:07 am, Fuchs Ira  wrote:
> I would like to have a function which gets data, does a calculation  
> and prints a result and then waits some number of seconds and  
> repeats.  If I use Sys.sleep, the execution is pausing but the  
> function output is buffered so that it all comes out when the function  
> terminates.  How can I get output while the function continues to  
> execute?
>
> __
> r-h...@r-project.org mailing listhttps://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guidehttp://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] solve a linear system with argument is a matrix

2009-02-15 Thread andrew

Hi,

This looks like a Lyapanov equation.  It is used in control theory I
think.  There might be something if you search in those packages that
deal with this.

HTH,

On Feb 16, 9:25 am, �|珊�x  wrote:
> Hi everyone,
>
> I have a question to solve linear equations with argument is a matrix.
> For example, X is a q by p matrix satisfies
>
> F(X)=A and t(B)%*%X+t(X)%*%B=0,
>
> where A and B are known q by p matrices and 0 is p by p matrix with
> all elements are 0s. And F(.) is a known function of matrix, X.
> Now, I try to solve X. Can anyone help me? Thanks a lot.
>
> __
> r-h...@r-project.org mailing listhttps://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guidehttp://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] PCA functions

2009-02-16 Thread andrew

The PCA is just a singular value decomposition on a sample covariance/
correlation matrix.  Do a search for ?svd and get the eigenvalues and
vectors from that function.

On Feb 14, 10:30 am, "glenn"  wrote:
> Hi All, would appreciate an answer on this if you have a moment;
>
> Is there a function (before I try and write it !) that allows the input of a
> covariance or correlation matrix to calculate PCA, rather than the actual
> data as in princomp()
>
> Regards
>
> Glenn
>
>         [[alternative HTML version deleted]]
>
> __
> r-h...@r-project.org mailing listhttps://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guidehttp://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] PCA functions

2009-02-16 Thread andrew

sqrt(svd(x)$d) maybe 2 more operations than princomp(covmat=x), but it
is hardly a chore.

On Feb 16, 9:15 pm, Mark Difford  wrote:
> Hi Glen, Andrew,
>
> >> The PCA is just a singular value decomposition on a sample covariance/...
>
> I believe that Bjørn-Helge Mevik's point was that __if you read the
> documentation__ you will see the argument "covmat" to princomp(). This,
> really, is much more straightforward and practical than Andrew's suggestion.
>
> Regards, Mark.
>
>
>
> andrew-246 wrote:
>
> > The PCA is just a singular value decomposition on a sample covariance/
> > correlation matrix.  Do a search for ?svd and get the eigenvalues and
> > vectors from that function.
>
> > On Feb 14, 10:30 am, "glenn"  wrote:
> >> Hi All, would appreciate an answer on this if you have a moment;
>
> >> Is there a function (before I try and write it !) that allows the input
> >> of a
> >> covariance or correlation matrix to calculate PCA, rather than the actual
> >> data as in princomp()
>
> >> Regards
>
> >> Glenn
>
> >>         [[alternative HTML version deleted]]
>
> >> __
> >> r-h...@r-project.org mailing
> >> listhttps://stat.ethz.ch/mailman/listinfo/r-help
> >> PLEASE do read the posting
> >> guidehttp://www.R-project.org/posting-guide.html
> >> and provide commented, minimal, self-contained, reproducible code.
>
> > __
> > r-h...@r-project.org mailing list
> >https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide
> >http://www.R-project.org/posting-guide.html
> > and provide commented, minimal, self-contained, reproducible code.
>
> --
> View this message in 
> context:http://www.nabble.com/PCA-functions-tp22006964p22034611.html
> Sent from the R help mailing list archive at Nabble.com.
>
> __
> r-h...@r-project.org mailing listhttps://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guidehttp://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] No results show up when running Rmdr

2009-02-18 Thread Andrew

Hi all,
When running Rmdr using the demo data file using the following
commands:

data(mdrdata)
cvk<-10
nbr=2
res<-rmdr(mdrdata,10,2, randomize=TRUE)


I could not find the statistical results, but like this:

[1] 1
 Cross Validation 1 Wed Feb 18 09:05:23 2009
The best set of loci is 13 17
 Cross Validation 2 Wed Feb 18 09:05:26 2009

Could you please tell me why and how to retrieve the results?

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] array manipulation simple questions

2009-02-23 Thread andrew



On Feb 24, 3:57 pm, jdeisenberg  wrote:
> Λεωνίδας Μπαντής wrote:
>
> > 1.   Suppose I have a<-c(1:10)  (or a<-array(c(1:10),dim=c(1,10)))
>
> > and I want to end up with vector b=[0 0 0 0 0 1 1 1 1 1]. i.e. I want to
> > substitute alla elements that are <5 with 0 and >5 with 1.
>
> I think you mean >=5, not > 5.  In that case, try this:
>
> b <- ifelse( a < 5, 0, 1)

you could also use

b <- 1*(x >= 5) #the logical elements get converted to numeric

>
> Λεωνίδας Μπαντής wrote:
>
> > 2.  Suppose I have a<-c(1,1,2,2,3,3)  (or array again)
>
> > And I want to place a "4,5" before every "2" and end up with a new
> > "bigger" vector "b":
>
> > b=[1 1 4 5 2 4 5 2 3 3]
>
> > Also I want to find where the 2's in array "a" (if it was an array) are
> > located i.e. positions 3,4.
>
> Not sure how to insert the 4,5, but this will find the where the 2's are:
> which(a == 2)
>

you could use

b <- rep(a,times = 2*(a==2) + 1)
replace(b, which(b==2), c(4,5,2))

> --
> View this message in 
> context:http://www.nabble.com/array-manipulation-simple-questions-tp22173710p...
> Sent from the R help mailing list archive at Nabble.com.
>
> __
> r-h...@r-project.org mailing listhttps://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guidehttp://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] question about 3-d plot

2009-02-26 Thread andrew

the following should work

library(lattice)
x <- seq(1,100)
y <- seq(1,100)
gr <- expand.grid(x,y)
gr$z <- x + y + rnorm(1,0,100)
cloud(z ~ x + y, data = gr)

also, look for the package rgl which does similar but with more
possiblities.

On Feb 27, 4:28 pm, Dipankar Basu  wrote:
> Hi R Users,
>
> I have produced a simulated scatter plot of y versus x tightly clustered
> around the 45 degree line through the origin with the following code:
>
>  x <- seq(1,100)
>  y <- x+rnorm(100,0,10)
>  plot(x,y,col="blue")
>  abline(0,1)
>
> Is there some way to generate a 3-dimensional analogue of this? Can I get a
> similar simulated scatter plot of points in 3 dimensions where the points
> are clustered around a plane through the origin where the plane in question
> is the 3-dimensional analogue of the 45 degree line through the origin?
>
> Deepankar
>
>         [[alternative HTML version deleted]]
>
> __
> r-h...@r-project.org mailing listhttps://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guidehttp://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] question about 3-d plot

2009-02-26 Thread andrew

actually, I just realised you also want a line in the plot.  I am not
super-sure how to do this.

On Feb 27, 5:20 pm, andrew  wrote:
> the following should work
>
> library(lattice)
> x <- seq(1,100)
> y <- seq(1,100)
> gr <- expand.grid(x,y)
> gr$z <- x + y + rnorm(1,0,100)
> cloud(z ~ x + y, data = gr)
>
> also, look for the package rgl which does similar but with more
> possiblities.
>
> On Feb 27, 4:28 pm, Dipankar Basu  wrote:
>
>
>
> > Hi R Users,
>
> > I have produced a simulated scatter plot of y versus x tightly clustered
> > around the 45 degree line through the origin with the following code:
>
> >  x <- seq(1,100)
> >  y <- x+rnorm(100,0,10)
> >  plot(x,y,col="blue")
> >  abline(0,1)
>
> > Is there some way to generate a 3-dimensional analogue of this? Can I get a
> > similar simulated scatter plot of points in 3 dimensions where the points
> > are clustered around a plane through the origin where the plane in question
> > is the 3-dimensional analogue of the 45 degree line through the origin?
>
> > Deepankar
>
> >         [[alternative HTML version deleted]]
>
> > __
> > r-h...@r-project.org mailing 
> > listhttps://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guidehttp://www.R-project.org/posting-guide.html
> > and provide commented, minimal, self-contained, reproducible code.
>
> __
> r-h...@r-project.org mailing listhttps://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guidehttp://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Partial sum of a vector

2009-03-01 Thread andrew

perhaps this?

M <- dim(data_m)[2]

for(j in 1:M){
for (i in 4:T) {
data_q[i-3,j]=sum(data_m[(i-3):i,j])
}
}

of course, you can vectorize this and speed it up significantly, but
there is something evil about premature optimization.

On Mar 2, 1:29 pm, Mohammad Sabr  wrote:
> I am trying to run a loop where I can sum parts of a matrix. For example, I 
> want at each step in the loop to sum the the next 4-values of a vector. I 
> tried to do the following but the result were wrong:
>  
> for (i in 4:T) {
> data_q[i-3,]=sum(data_m[i-3,]:data_m[i,])}
>
>  
> can anyone please direct me on what should I do to run this loop properly
>  
> Thanks in advance
>         [[alternative HTML version deleted]]
>
> __
> r-h...@r-project.org mailing listhttps://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guidehttp://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Multivariate GARCH Package

2009-03-03 Thread andrew

in the seond link on the pdf you have cited is

http://www.vsthost.com/vstDocs/mgarchBEKK/

HTH.

On Mar 4, 1:52 pm, Mohammad Sabr  wrote:
> Good day everyone,
>  
> I tried to find a multivariate GARCH package and failed to find one. Although 
> when I searched R I found the following link which describes the package:
>  http://www.r-project.org/user-2006/Slides/Schmidbauer+Tunalioglu.pdf
>  
> can any one help me with this issue.
>  
> Thank you in advance
>         [[alternative HTML version deleted]]
>
> __
> r-h...@r-project.org mailing listhttps://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guidehttp://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] multivariate integration and partial differentiation

2009-03-07 Thread andrew

The adapt package has multivariate integration.

However, I am not sure you need the multivariate integration for the
example you describe: you only need one dimensional integration.  For
this, you can check out

?integrate

For differentiation, depending on how well behaved the cdf is, you
could use ?diff to calculate approximations for the slope.

HTH

Andrew

On Mar 7, 5:08 pm, Wei-han Liu  wrote:
> Could somebody share some tips on implementing multivariate integration and 
> partial differentiation in R?
> For example, for a trivariate joint distribution (cumulative density 
> function) of F(x,y,z), how to differentiate with respect to x and get the 
> bivariate distribution (probability density function) of f(y,z). Or integrate 
> f(x,y,z) with respect to x to get bivariate distribution of (y,z).
> Your sharing is appreciated.
>
> Wei-han Liu
>
>         [[alternative HTML version deleted]]
>
> __
> r-h...@r-project.org mailing listhttps://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guidehttp://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] mean reverting model

2009-03-09 Thread andrew

Mean reverting model = autoregression?  If so, then search for

?ar

or

?arima

to fit a time series.

On Mar 10, 4:36 am, Josuah Rechtsteiner  wrote:
> dear useRs,
>
> i'm working with a mean reverting model of the following specification:
>
> y = mu + beta(x - mu) + errorterm, where mu is a constant
>
> currently I estimate just y = x (with lm()) to get beta and then  
> calculate mu = estimated intercept / (1-beta).
>
> but I'd like to estimate mu and beta together in one regression-step  
> and also get the test-statistics (including parameter variance) for mu  
> as well as for beta in the summary of the regression.
>
> could you please help me?
>
> thanks very much in advance!
>
> josuah
>
> __
> r-h...@r-project.org mailing listhttps://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guidehttp://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] mean reverting model

2009-03-09 Thread andrew

Autoregression is just X(n+1) = a X(n) + b + error.  The mean
reverting model is when |a| < 1.  Estimation is carried out using

x_ar <- ar(x)
summary(x_ar)

standard error is found in the square root of the diagonal of the x_ar
$asy.var.coef matrix.

please read the documentation found at ?ar to get full details.

On Mar 10, 9:18 am, Josuah Rechtsteiner  wrote:
> hi andrew,
>
> the problem is that I don't know what kind of model this exactly is...
> I only know that I have to do it this way and how the model is  
> structured.
>
>
>
> > Mean reverting model = autoregression?  If so, then search for
>
> > ?ar
>
> > or
>
> > ?arima
>
> > to fit a time series.
>
> > On Mar 10, 4:36 am, Josuah Rechtsteiner  wrote:
> >> dear useRs,
>
> >> i'm working with a mean reverting model of the following  
> >> specification:
>
> >> y = mu + beta(x - mu) + errorterm, where mu is a constant
>
> >> currently I estimate just y = x (with lm()) to get beta and then
> >> calculate mu = estimated intercept / (1-beta).
>
> >> but I'd like to estimate mu and beta together in one regression-step
> >> and also get the test-statistics (including parameter variance) for  
> >> mu
> >> as well as for beta in the summary of the regression.
>
> >> could you please help me?
>
> >> thanks very much in advance!
>
> >> josuah
>
> >> __
> >> r-h...@r-project.org mailing listhttps://stat.ethz.ch/mailman/
> >> listinfo/r-help
> >> PLEASE do read the posting guidehttp://www.R-project.org/posting-guide.html
> >> and provide commented, minimal, self-contained, reproducible code.
>
> > __
> > r-h...@r-project.org mailing list
> >https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guidehttp://www.R-project.org/posting-guide.html
> > and provide commented, minimal, self-contained, reproducible code.
>
> __
> r-h...@r-project.org mailing listhttps://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guidehttp://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] mean reverting model

2009-03-09 Thread andrew

Autoregression is more general than the (discretized) Ornstein
Uhlenbeck process.  For a start, a discretized version of the Ornstein-
Uhlenbeck is just an AR(1) process X(n+1) = X(n) + a (X(n) - mu) +
error(n+1), but with the coefficient a restricted to 0 < a < 1.  This
restriction is necessary as the OU process is continuous and reverts
to its mean mu.

You can get a translation between the coefficients of an OU process
and an AR process by just equating coefficients on like terms.  This
shouldn't be more difficult that a bunch of linear equations.

On Mar 10, 11:06 am, markle...@verizon.net wrote:
> i think there's confusion here between a time series that reverts to its
> long term mean
> and an "ornstein uhlenbeck" type of mean reversion. they're not the same
> thing and
> I don't want to go into the difference because I would probably just add
> to the confusion.
>
> you might be better off sending your original question to the
> R-Sig-Finance list although
> you may have already because I saw something abiout the same topic
> earlier ?
>
> If you google for ornstein uhlenbeck, there should be something
> somewhere on the net that shows that a discrete version of an ornstein
> uhlenbeck is think a an AR(2) with some complex parameters which are
> functions of the volatility and mean reverting parameter of the
> continuous OU process. I googled earlier because I was going to send it
> to you but the site where I wanted to go was busy. I think it's called
> planetmath.org or something like that.
>
> On Mon, Mar 9, 2009 at  7:54 PM, andrew wrote:
> > Autoregression is just X(n+1) = a X(n) + b + error.Â  The mean
>
> reverting model is when |a| < 1.Â  Estimation is carried out using
>
> x_ar <- ar(x)
> summary(x_ar)
>
> standard error is found in the square root of the diagonal of the x_ar
> $asy.var.coef matrix.
>
> please read the documentation found at ?ar to get full details.
>
> On Mar 10, 9:18Â am, Josuah Rechtsteiner  <mailto:rechtstei...@bgki.net> Â  <mailto:rechtstei...@bgki.net> > wrote:
>
>
>
> > hi andrew,
>
> > the problem is that I don't know what kind of model this exactly is...
> > I only know that I have to do it this way and how the model is Â 
> > structured.
>
> >> Mean reverting model = autoregression? Â If so, then search for
>
> >> ?ar
>
> >> or
>
> >> ?arima
>
> >> to fit a time series.
>
> >> On Mar 10, 4:36 am, Josuah Rechtsteiner  >> <mailto:rechtstei...@bgki.net> Â  <mailto:rechtstei...@bgki.net> >
> >> wrote:
> >>> dear useRs,
>
> >>> i'm working with a mean reverting model of the following Â 
> >>> specification:
>
> >>> y = mu + beta(x - mu) + errorterm, where mu is a constant
>
> >>> currently I estimate just y = x (with lm()) to get beta and then
> >>> calculate mu = estimated intercept / (1-beta).
>
> >>> but I'd like to estimate mu and beta together in one regression-step
> >>> and also get the test-statistics (including parameter variance) for
> >>> Â 
> >>> mu
> >>> as well as for beta in the summary of the regression.
>
> >>> could you please help me?
>
> >>> thanks very much in advance!
>
> >>> josuah
>
> >>> __
> >>> r-h...@r-project.org <mailto:r-h...@r-project.org> Â 
> >>> <mailto:r-h...@r-project.org> Â mailing
> >>> listhttps://stat.ethz.ch/mailman/<https://stat.ethz.ch/mailman/> Â 
> >>> <https://stat.ethz.ch/mailman/> listinfo/r-help
> >>> PLEASE do read the posting
> >>> guidehttp://www.R-project.org/posting-guide.html
> >>> <http://www.R-project.org/posting-guide.html> Â 
> >>> <http://www.R-project.org/posting-guide.html> and provide commented,
> >>> minimal, self-contained, reproducible code.
>
> >> __
> >> r-h...@r-project.org <mailto:r-h...@r-project.org> Â 
> >> <mailto:r-h...@r-project.org> Â mailing list
> >>https://stat.ethz.ch/mailman/listinfo/r-help
> >> <https://stat.ethz.ch/mailman/listinfo/r-help> Â 
> >> <https://stat.ethz.ch/mailman/listinfo/r-help> PLEASE do read the
> >> posting guidehttp://www.R-project.org/posting-guide.html
> >> <http://www.R-project.org/posting-guide.html> Â 
> >> <http://www.R-project.org/posting-guide.html> and provide commented,
> >> minimal,

Re: [R] R equivalent to MATLAB's "whos" Command?

2009-03-10 Thread andrew

I saw this in
http://groups.google.com.au/group/r-help-archive/browse_thread/thread/7dca300a7fde5286/718dc5f1405618c9?lnk=gst&q=sort(+sapply(ls()%2Cfunction(x){object.size(get(x))}))#718dc5f1405618c9

The command is something like

sort( sapply(ls(),function(x){object.size(get(x))}))

very useful, so I have added it to my .First function


On Mar 11, 8:29 am, Jason Rupert  wrote:
> Wow.  Thank you for your quick response.  However, I'm looking for an R 
> command that lists what is currently in the R workspace.
>
> Here is a link to a description of the MATLAB "whos" 
> command:http://www.math.carleton.ca/old/help/matlab/MathWorks_R13Doc/techdoc/...
> Essentially, it does the following:
> "who lists the variables currently in the workspace.
>
> whos lists the current variables and their sizes and types. It also reports 
> the totals for sizes.
>
> who('global') and whos('global') list the variables in the global workspace
> "
>
> --- On Tue, 3/10/09, Jorge Ivan Velez  wrote:
>
>
>
> > From: Jorge Ivan Velez 
> > Subject: Re: [R] R equivalent to MATLAB's "whos" Command?
> > To: jasonkrup...@yahoo.com
> > Cc: r-h...@r-project.org
> > Date: Tuesday, March 10, 2009, 4:14 PM
> > Dear Jason,
> > How about this?
>
> > ?ls()
> > ls()
>
> > HTH,
>
> > Jorge
>
> > On Tue, Mar 10, 2009 at 5:05 PM, Jason Rupert
> > wrote:
>
> By any chance is there an R equivalent to MATLAB' "whos" command? I tried 
> searching R and R-seek, but didn't really come up with anything. There are 
> several items I would like to make sure are stored in the workspace and check 
> their values.  Thank you again for your help and any feedback.
>
>
>
> > > > ?workspace
> > > No documentation for 'workspace' in specified
> > packages and libraries:
> > > you could try '??workspace'
> > > > ??workspace
> > > No help files found with alias or concept or title
> > matching using fuzzy matching.
> > > > ??whos
> > > No help files found with alias or concept or title
> > matching using regular expression matching.
> > > > ??whos
> > > No help files found with alias or concept or title
> > matching using regular expression matching.
> > > > ?whos
> > > No documentation for 'whos' in specified
> > packages and libraries:
> > > you could try '??whos'
>
> > >        [[alternative HTML version deleted]]
>
> > > __
> > > r-h...@r-project.org mailing list
> > >https://stat.ethz.ch/mailman/listinfo/r-help
> > > PLEASE do read the posting guide
> > >http://www.R-project.org/posting-guide.html
> > > and provide commented, minimal, self-contained,
> > reproducible code.
>
> __
> r-h...@r-project.org mailing listhttps://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guidehttp://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] plotting question

2009-03-14 Thread andrew

check out the interaction.plot.  This *may* be what you are looking
for.

?interaction.plot

On Mar 15, 8:14 am, David Kaplan  wrote:
> Greetings all,
>
> I have two questions.  I have a data set that is arranged in the example
> below.  I wish to obtain a plot of the performance of each ID over Year
> on v1.  It's not clear how I set this up?
>
> ID  Year   V1  
> 1   1980    1  
> 1   1981    2
> 1   1982    6
> 1   1983    4
> 2   1980    5
> 2   1981    5
> 2   1982    5
> 2   1983    6
>
> Also,I would like to transpose the data to have the variable across the
> columns such as
>
> ID   v1980   v1981 v1982 v1983
> 1       1       2      6   4
> 2       5       5      5   6
>
> Is there a straightforward way to do this in R?
>
> Thanks in advance,
>
> David
>
> --
> ===
> David Kaplan, Ph.D.
> Professor
> Department of Educational Psychology
> University of Wisconsin - Madison
> Educational Sciences, Room, 1061
> 1025 W. Johnson Street
> Madison, WI 53706
>
> email: dkap...@education.wisc.edu
> homepage:http://www.education.wisc.edu/edpsych/default.aspx?content=kaplan.html
> Phone: 608-262-0836
>
> __
> r-h...@r-project.org mailing listhttps://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guidehttp://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] How do I set the Windows temporary directory in R?

2009-03-17 Thread andrew

setwd

On Mar 18, 12:42 pm, Jonathan Greenberg  wrote:
> I'm trying to redirect where temporary files go under R to
> D:\temp\somerandomname from its default (C:\Documents and
> Settings\username\Temp\somerandomname) -- how do I go about doing this?
>
> --j
>
> --
>
> Jonathan A. Greenberg, PhD
> Postdoctoral Scholar
> Center for Spatial Technologies and Remote Sensing (CSTARS)
> University of California, Davis
> One Shields Avenue
> The Barn, Room 250N
> Davis, CA 95616
> Cell: 415-794-5043
> AIM: jgrn307, MSN: jgrn...@hotmail.com, Gchat: jgrn307
>
> __
> r-h...@r-project.org mailing listhttps://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guidehttp://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] print of objects in R

2009-03-19 Thread andrew

xtable in the library xtable does a good job if you are using latex to
write up your results.  For example:

> xtable(matrix(rnorm(20),5,5))
% latex table generated in R 2.8.0 by xtable 1.5-4 package
% Fri Mar 20 13:48:53 2009
\begin{table}[ht]
\begin{center}
\begin{tabular}{rr}
  \hline
 & 1 & 2 & 3 & 4 & 5 \\
  \hline
1 & -0.45 & 0.25 & -0.42 & -1.64 & -0.45 \\
  2 & 1.39 & 0.06 & 0.08 & 2.12 & 1.39 \\
  3 & 0.49 & -0.78 & -1.28 & -0.45 & 0.49 \\
  4 & -0.11 & -0.81 & 1.48 & 0.30 & -0.11 \\
  5 & 0.12 & -0.11 & -0.14 & 1.50 & 0.12 \\
   \hline
\end{tabular}
\end{center}
\end{table}

Or you could roll your own using for loops and the print function if
this format isn't what you want.

HTH,

Andrew.

On Mar 20, 12:48 pm, "Mary A. Marion"  wrote:
> Hello,
>
> I have been watching my output as I create functions and do other things
> in r.
> One thing I don't like is the [1,] type notation at the beginning of a
> line.  I have been
> able to change that to a number such as 1 2 etc.   using
> as.data.frame(object).
>
> How can I stop the printing of a line number and column heading if I
> want to?
> I am thinking about publishing and writing of papers.  It is much easier
> to not have to
> remove that leading line number when inserting output into papers.
>
> Thank you.
>
> Sincerely,
> mmstat
>
> __
> r-h...@r-project.org mailing listhttps://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guidehttp://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] recursive term

2008-12-11 Thread andrew

you might try the following.

Pochhammer_n <- function (a,b,c,n) {
if(n==0) return(1)
return(a*b*Pochhammer_n(a+1, b+1, c+1, n-1)/(c*n))
}

hypergeo_sum <- function (a,b,c,z,n) {
comb_sum <- 0
for (i in 0:n)   {
comb_sum <- comb_sum + Pochhammer_n(a,b,c,i)*z^i
}
return(comb_sum)
}


hypergeo_sum2 <- function (a,b,c,z,n) {
comb <- rep(1,n+1)
for (i in 2:(n+1)){
comb[i] <- comb[i-1]*(a+i-2)*(b+i-2)*z/((i-1)*(c+i-2))
}
return(sum(comb))
}

system.time(hypergeo_sum (1.25,1.75,1.25,0.5,300))
system.time(hypergeo_sum2(1.25,1.75,1.25,0.5,300))


On Dec 10, 4:28 pm, Roslina Zakaria  wrote:
> Hi,
>  
> I would like to write a function for this formula:
> F(a,b,c,z)=1+(ab)/(1!c)+ +(a(a+1)b(b+1))/(2!c(c+1))+ 
> +(a(a+1)(a+2)b(b+1)(b+2))/(3!c(c+1)(c+2))+…
>  
> I wrote this function but not sure what is not right:
>  
> hypergeo_sum <- function (a,b,c,z,n)
> { for (i in 1:n)
>   { aa <- 1+(a*b*z)/c
>   aa[i] <- (aa-1)*(a+i)*(b+i)*z/((i+1)*(c+i))
>   comb_sum <- aa + aa[i]+ aa[i+2]
>   }
>   comb_sum}
>
>  
> hypergeo_sum (1.25,1.75,1.25,0.5,3)
>  
> output:> hypergeo_sum (1.25,1.75,1.25,0.5,3)
>
> [1] 2.394531   NA 1.039062
>
>  
> The answer should be 2.852539.
>  
> Or can you suggest any R book that I can refer to. Thank you so much for your 
> help.
>
> __
> r-h...@r-project.org mailing listhttps://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guidehttp://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] how to create duplicated ID in multi-records per subject dataset

2008-12-14 Thread andrew

if the records are in the file dupIDs.txt, then when you read them in,
the IDs become factors.  Coercing them to numeric gets them to assign
a unique number to each factor.

So, you could try the following:

dupIDs <- read.table("dupIDs.txt", header = T)
dupIDs$ID2 <- cummax(as.numeric(dupIDs$ID)-1)

> dupIDs
  ID record ID2
1  1 20   1
2  . 30   1
3  . 25   1
4  2 26   2
5  . 15   2
6  3 21   3

HTH,

Andrew.


On Dec 15, 12:56 pm, "Zhixin Liu"  wrote:
> Hi R helpers,
>
> If I have a dataset looks like:
> ID   record
> 1        20
> .         30
> .         25
> 2         26
> .         15
> 3         21
> 4.
>
> And I want it becomes
> ID   record
> 1        20
> 1        30
> 1        25
> 2         26
> 2        15
> 3         21
> 4.
>
> That is, I have to duplicate IDs for those with multiple records. I am 
> wondering it is possible to be done in R, and I am grateful if you would like 
> to show me the direction.
>
> Many thanks!
>
> Zhixin
>
> __
> r-h...@r-project.org mailing listhttps://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guidehttp://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Movement within a circle

2008-12-15 Thread andrew

The following has a bug in it, but it does a reasonable job.  The bug
occurs in the lines that says

if(inner(xnew,xnew) > R^2)
   xnew <- x[n,] - eps ##error - sometimes x[n,] - eps is outside
circle

If it is changed to something like xnew <- proj(x[n,] + eps) where
proj is a projection function onto the circle, you should be ok.

HTH,

Andrew.

##=
N <- 1000 #steps
x <- matrix(0, nrow=N+1, ncol =2)
R <- 1 #radius of circle
delta <- 0.5 #step size

inner <- function(x,y) {
if(length(x) != length(y)){
print("Wrong length")
return(0)
}

return(t(x) %*% (y))
}

for(n in 1:N){
eps <- delta*runif(2)
xnew <- x[n,] + eps
if(inner(xnew,xnew) > R^2)
xnew <- x[n,] - eps ##error - sometimes x[n,] - eps is outside
circle

x[n+1, ]<- xnew

}

semicircle <- function(x, centre = c(0,0), radius=R){
return(sqrt(radius^2 - x^2))
}
xvaries <- seq(-R,R,by=0.01)
plot(xvaries, semicircle(xvaries), type = 'l', xlim= c(-R, R), ylim =c
(-R,R))
lines(xvaries,- semicircle(xvaries))
points(x)


On Dec 16, 10:34 am, David Winsemius  wrote:
> On Dec 15, 2008, at 5:20 PM, Juliane Struve wrote:
>
> > Dear list,
>
> > I am trying to program semi-random movement within a circle, with no  
> > particles leaving the circle. I would like them to bounce back when  
> > they come to close to the wall, but I don't seem to be able to get  
> > this right.  Would somebody be able to give me a hint ? This is my  
> > code so far, the particle starts at some point and moves towards the  
> > wall, but I don't get the "bouncing off" part right .
>
> That is a bit vague for an audience that is numerically oriented.
>
>
>
> > Any help would be much appreciated.
>
> > Juliane
>
> > days=10
> > circularspace
> > =
> > data
> > .frame
> > (day
> > =c(0:days),xcoord=1,ycoord=1,xvelocity=1,yvelocity=1,xdistwall=0,  
> > ydistwall=0, wallxvel=0, wallyvel=0,stochasticxvel=0,stochasticyvel=0)
>
> You have initialized this object with 10 rows, but why would you  
> specify the distance to the walls as 0?
>
>
>
>
>
> > xmax=10
> > xmin=-10
> > ymax=10
> > ymin=-10
> > mindist=8
> > plot(xmin:xmax, ymin:ymax, type = "n")
> > circularspace
> > radius=10
> > timesteplength=1
> > weightfactor=1
> > for(i in 1:days)
> > {
> > #This is the stochastic component of the movement
> > circularspace$stochasticxvel[day=i+1]=runif(1,min=-1,max=1)
> > circularspace$stochasticyvel[day=i+1]=runif(1,min=-1,max=1)
> > #This is calculating the movment speed
> > circularspace$xvelocity[day=i+1]=weightfactor*(circularspace
> > $xvelocity[day=i])+ (1-weightfactor)*circularspace
> > $stochasticxvel[day=i+1]
>
> See below. That looks all wrong. should just be [i] not [day=i]
>
>
>
> > circularspace$yvelocity[day=i+1]=weightfactor*(circularspace
> > $yvelocity[day=i])+ (1-weightfactor)*circularspace
> > $stochasticyvel[day=i+1]
> > #This is calculating the new position
> > circularspace$xcoord[day=i+1]=circularspace$xcoord[day=i]
> > +circularspace$xvelocity[day=i]/timesteplength
>
>                               
>                            here you need to learn to use <- for  
> assignment
>
>
>
> > circularspace$ycoord[day=i+1]=circularspace$ycoord[day=i]
> > +circularspace$yvelocity[day=i]/timesteplength
> > #Tis is checking the distance from the wall
> > circularspace$xdistwall[day=i+1]=xmax-abs(circularspace$xcoord[day=i])
> > circularspace$ydistwall[day=i+1]=ymax-abs(circularspace$ycoord[day=i])
>
> I would have expected to see code that checked whether the radial  
> distance were greater than 10,
> To do so would require calculating x^ + y^2. At the moment, you appear  
> to have a square bounding box.
>
> And why is the distance on day 5 calculated in terms of day 4?
>
> And you need to look at the definitions of logical operators. "=" is  
> not a logical operator. Above you were making unnecessary assignments,  
> now you are conflating "=" , one of the assignment operators, with  
> "==", the logical test for equality.
>
> ?"==" # or
> ?Comparison
>
>
>
>
>
> > #This is updating the movement if distance to wall too small
> > if (circularspace$xdistwall[day=i+1] <= mindist){
> > circularspace$wallxvel[day=i+1]= -1*(circularspace$xcoord[day=i+1])
> > circularspace$xvelocity[day=i+1]=weightfactor*circularspace
>

[R] lmerTest Issues

2017-12-08 Thread Andrew Harmon

Hello all,

Everything was working very well. Now when I try to load lmerTest using:
library("lmerTest"), I get this error:

Error: package or namespace load failed for ‘lmerTest’ in
loadNamespace(j <- i[[1L]], c(lib.loc, .libPaths()), versionCheck =
vI[[j]]):
 there is no package called ‘purrr’


Nothing I've done has worked. I have uninstalled both R and R studio. I've
updated packages as well, but nothing works.

Any suggestions?

Thanks,

Drew

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] Perform mantel test on subset of distance matrix

2017-12-31 Thread Andrew Marx

I'm trying to perform a mantel test that ignores specific pairs in my
distance matrices. The reasoning is that some geographic distances
below a certain threshold suffer from spatial autocorrelation, or
perhaps ecological relationships become less relevant that stochastic
processes above a certain threshold.

The problem is that I can't find a way to do it. If I replace values
in either or both of the distance matrices with NA, mantel.rtest (ade4
package) gives the following error: Error in if (any(distmat < tol))
warning("Zero distance(s)") : missing value where TRUE/FALSE needed

Here's a trivial example that tries to exclude elements of the first
matrix that equal 11:

library(ade4)
a <- matrix(data = 1:36, nrow = 6)
b <- matrix(data = 1:36, nrow = 6)
a[a==11] <- NA
mantel.rtest(as.dist(a), as.dist(b))

Is there a way to do this, either with this package or another?

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] The stages of standard function evaluation

2018-05-02 Thread Andrew Hoerner

Dear R Help folks --

I have been trying to put together a list of the steps or stages of R
function evaluation, with particular focus on those that have "standard" or
"nonstandard" forms. This is both for my own edification and also because I
am thinking of joining the world of R bloggers and have been trying to put
together some draft posting that might be useful. I seem to have an
affirmative genius for finding incorrect interpretations of R's evaluation
rules; I'm trying to make that an asset.

I am hoping that you can tell me:


   1. Is this list complete, or are there additional stages I am missing?
   2. Have I inserted one or more imaginary stages?
   3. Are the terms I use below to name each stage appropriate, or are
   there other terms more widely used or recognizable?
   4. Is the order correct?

I begin each name with “Standard,” to express my belief that each of these
things has a usual or default form, but also that (unless I am mistaken)
almost none of them exist only in a single form true of all R functions. (I
have marked with an asterisk a few evaluation steps that I think may always
be followed).

It is my ultimate goal (which I do not feel at all close to accomplishing)
to determine a way to mechanically test for “standardness” along each of
these dimensions, so that each function could be assigned a logical vector
showing the ways that it is and is not standard. One thing I think is
conceptually or procedurally difficult about this project is that I think
“standardness” should be determined by what a function does, rather than by
how it does it, so that a primitive function that takes unevaluated
arguments positionally could still have standard matching, scoping, etc.,
by internal emulation. A related goal is to identify which evaluation steps
most often use an alternative form, and perhaps determine if there is more
than one such alternative. Finally, an easier short-term goal is simply to
find instances of one or more function with standard and non-standard
evaluation for each evaluation step.

For the most part below I am treating the evaluation of closures as the
standard from which “nonstandard” is defined. However, I do not assume that
other kinds of functions are automatically nonstandard on any particular
dimension below. Most of this comes from the R Language Definition, but
there are numerous places where I am by no means certain that my
interpretation is correct. I have highlighted some of these below with a
“??”.

I look forward to learning from you.

Warmest regards,

J. Andrew Hoerner


** Standard function recognition:* recognizing some or all of a string code
as a function. (Part of code line parsing)

*Standard environment construction:* construction of the execution
environment, and of pointers to the calling and enclosing environments.

*Standard function identification:* Get the name of the function, if any

** Standard f**unction scoping*: Search the current environment and then up
the chain of enclosing environments until you find the first binding
environment, an environment where the name of the function is bound, i.e.
linked to a pointer to the function definition. The binding environment is
usually (but not always) the same as the defining environment (i.e. the
enclosing environment when the function is defined. Note that function
lookup, unlike function argument lookup, necessarily starts from the
calling environment, because a function does not know what it is – its
formals, body, and environments – until it is found. Named functions are
always found by scoping. R never learns "where" they are -- they have to be
looked up each time. For this reason, anonymous functions must be used in
place, and called by a function that takes a function as an argument, or by
(function(formals){body})(actual args)

*Standard f**unction **retrieval**:*

load (??) the function, i.e. transfer the list (??) of formals and defaults
and the list (??) of expressions that constitute the function body into the
execution environment

Note that the function body is parsed at the time the function is created
(true?? Or is it parsed every time the function is called?)

*Standard argument matching*:

assignment of expressions and default arguments to formals via the
usually-stated matching rulesIf matched positionall at call time, the name
is scoped like an actual argument.. Note that giving an argument the same
name as a formal when calling the function will only match it to that
formal if matched positionally or by tag, not by name.

*Standard a**rgument parsing:*

Converts argument expressions into abstract syntax trees. “Standard”
argument parsing and promise construction take place before the arguments
are passed into the body.



*Standard p**romise construction*:

Assigning each name (including function names) in an AST to its binding the
calling environment (if ordinary) or the execution environment (if default)
(Am I right that th

[R] R in a real time MS BI environment

2017-04-11 Thread Andrew Scotchmer

Hi

Hope someone can help with the best way forward for a project I am working 
on.

My employer uses the MS BI stack - SSRS, SSIS, SQLServer, etc - and the 
developers have built a web portal in C# and ASP.Net to display real time 
management reports in a dashboard format using the bootstrap theme. 

I have developed a prototype dashboard with R and Shiny running on an 
Ubuntu virtual server which is more graphical and includes reactive 
components and machine learning analytics. Everyone is very impressed with 
the dashboard and want to incorporate the analytics and graphical 
components (ggplot with plotly) into the existing portal.  As it was only a 
prototype however it uses files that are created in SQL server and exported 
nightly as csv's into a folder shared with Ubuntu.  So far we have just 
created iframes that point to the shiny server but management are not happy 
with this approach.  They want to integrate the R models and graphics into 
the real time portal.

I know SQL Server 2016 bundles R services, but how do you incorporate real 
time R analytics and graphics in the existing MS/.Net stack?

Cheers
Andrew
__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] Memory leak in nleqslv()

2017-06-11 Thread Andrew Leach

Hello all,

I am relatively new to R, but enjoying it very much.  I am hoping that
someone on this list can help me with an issue I am having.

I am having issues with iterations over nleqslv, in that the solver
does not appear to clean up memory used in previous iterations. I
believe I've isolated the/my issue in a small sample of code:

library(nleqslv)

cons_ext_test <- function(x){
rows_x <- length(x)/2
x_1 <- x[1:rows_x]
x_2 <- x[(rows_x+1):(rows_x*2)]
eq1<- x_1-100
eq2<-x_2*10-40
return(c(eq1,eq2))
}

model_test <- function()
{
reserves<-(c(0:200)/200)^(2)*2000
lambda <- numeric(NROW(reserves))+5
res_ext <- pmin((reserves*.5),5)
x_test <- c(res_ext,lambda)
#print(x_test)
for(test_iter in c(1:1000))
   nleqslv(x_test,cons_ext_test,jacobian=NULL)
i<- sort( sapply(ls(),function(x){object.size(get(x))}))
print(i[(NROW(i)-5):NROW(i)])
}

model_test()

When I run this over 1000 iterations, memory use ramps up to over 2.4 GB

While running it with 10 iterations uses far less memory, only 95MB:

Running it once has my rsession with 62Mb of use, so growth in memory
allocation scales with iterations.

Even after 1000 iterations, with 2+ GB of memory used by the R
session, no large-sized objects are listed, although mem_use() shows
2+ GB of memory used.

test_iterlambda   res_ext  reservesx_test
   48  1648 1648  1648  3256

I've replicated this on OS-X and in Windows both on a desktop and a
Surface Pro, however colleagues have run this on their machines and
not found the same result.  gc() does not rectify the issue, although
re-starting R does.

Any help would be much appreciated.

AJL


-- 
Andrew Leach
Associate Professor of Natural Resources, Energy and Environment (NREE)
Academic Director, Energy Programs
Alberta School of Business| 3-20D Business Building
University of Alberta |Edmonton, AB  T6G 2R6 | Canada
T. 780.492.8489
E. andrew.le...@ualberta.ca
www.business.ualberta.ca

Follow me on Twitter at @andrew_leach

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] Converting SAS Code

2017-09-29 Thread Andrew Harmon

Hello all,

My statistical analysis training up until this point has been entirely done
in SAS. The code I frequently used was:

*Yield Champagin;

data yield;

set stress;

if field='YV' then delete;

if field='HB' then delete;

if barcode='16187DD4015' then delete;

if barcode='16187DD6002' then delete;

if barcode='16187DD2007' then delete;

if barcode='16187DD5016' then delete;

if barcode='16187DD8007' then delete;

if barcode='16187DD7010' then delete;

if barcode='16187DD7007' then delete;

if barcode='16187DD8005' then delete;

if barcode='16187DD6004' then delete;

if barcode='16187DD5008' then delete;

if barcode='16187DD7012' then delete;

if barcode='16187DD6010' then delete;

run; quit;



Title'2016 Asilomar Stress Relief champagin yield';

proc mixed method=reml data=yield;

class rep Management Foliar_Fungicide Chemical_Treatment;

model Grain_Yield__Mg_h_ =Management|Foliar_Fungicide|Chemical_Treatment
Final_Stand__Plants_A_ / outpred=resids residual ddfm=kr;

random rep rep*Management rep*Management*Foliar_Fungicide;

lsmeans Management|Foliar_Fungicide|Chemical_Treatment / pdiff;

ods output diffs=ppp lsmeans=means;

ods listing exclude diffs lsmeans;

run; quit;

%include'C:\Users\harmon12\Desktop\pdmix800.sas';

%pdmix800(ppp,means,alpha=0.10,sort=yes);

ods graphics off;

run; quit;

proc univariate data=resids normal plot; id Barcode Grain_Yield__Mg_h_
pearsonresid; var resid;
proc print data=resids (obs=3);run;

Can someone please help me convert my code to R? Any help would be much
appreciated.


Thanks,


Andrew Harmon

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Converting SAS Code

2017-10-11 Thread Andrew Harmon

I have no problem setting up my mixed model, or performing anova or lsmeans
on my model’s outputs. However, performing lsd mean separation is giving me
fits.



So I do not have a problem when using two-way anova model. When using the
code:

fit.yield.add <- lm(data = ryzup, Yield ~ Rep + Nitrogen + Treatment)

LSD.test(fit.yield.add, trt = "Nitrogen", alpha = 0.1, console = TRUE)



It works beautifully. However when I run a mixed model:



yield.lmer <- lmer(data = ryzup, Yield ~ Nitrogen + Treatment +
(1|Rep/Nitrogen), REML = FALSE)

LSD.test(yield.lmer, trt = "Nitrogen", alpha = 0.1, console = TRUE)



It gives me fits. It produces errors like:



Error in as.data.frame.default(x[[i]], optional = TRUE) :

  cannot coerce class "structure("merModLmerTest", package = "lmerTest")"
to a data.frame


Do you have any suggestions for that?

Thanks

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] Sample selection using multiple logit or similar

2016-05-27 Thread Andrew Perrin


Greetings-

I am seeking to fit a model using Heckman-style selection; however the 
wrinkle is that the selection is into multiple categories, not a binary 
in/out. In this case, selection is into the type of higher-education 
institution a student attended; the goal is to estimate post-graduation 
outcomes while correcting for selection into universities.


There is an old Stata file, svyselmlog, that may or may not solve this 
problem in stata. Any guidance or pointers toward solving it in R would 
be most welcome.


Thanks-
Andy Perrin

--
-
Andrew J Perrin - Professor of Sociology, UNC-Chapel Hill
Director, Carolina Seminars   http://carolinaseminars.unc.edu
Special Assistant to the Provost and Dean for Accreditation
and Curricular Innovation
andrew_per...@unc.edu http://perrin.socsci.unc.edu

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] lm() silently drops NAs

2016-07-26 Thread Andrew Robinson

Agh.  I've argued elsewhere that the default behaviour should be to
fail, and the user should take the responsibility to explicitly handle
the missing values, even if that simply be by changing the argument.
Probably Peter and I have different experiences with the completeness
of datasets, but anything that encourages me not to think about what
I'm doing in that realm seems like a bad idea.

Best wishes,

Andrew

On 27 July 2016 at 07:30, peter dalgaard  wrote:
>
>> On 26 Jul 2016, at 22:26 , Hadley Wickham  wrote:
>>
>> On Tue, Jul 26, 2016 at 3:24 AM, Martin Maechler
>>  wrote:
>>>
> ...
>> To me, this would be the most sensible default behaviour, but I
>> realise it's too late to change without breaking many existing
>> expectations.
>
> Probably.
>
> Re. the default choice, my recollection is that at the time the only choices 
> available were na.omit and na.fail. S-PLUS was using na.fail for all the 
> usual good reasons, but from a practical perspective, the consequence was 
> that, since almost every data set has NA values,  you got an error unless you 
> added na.action=na.omit to every single lm() call. And habitually typing 
> na.action=na.omit doesn't really solve any of the issues with different 
> models being fit to different subsets and all that. So the rationale for 
> doing it differently in R was that it was better to get some probably 
> meaningful output rather than to be certain of getting nothing. And, that was 
> what the mainstream packages of the time were doing.
>
>> On a related note, I've never really understood why it's called
>> na.exclude - from my perspective it causes the _inclusion_ of missing
>> values in the predictions/residuals.
>
> I think the notion is that you exclude them from the analysis, but keep them 
> around for the other purposes.
>
> -pd
>
>>
>> Thanks for the (as always!) informative response, Martin.
>>
>> Hadley
>>
>> --
>> http://hadley.nz
>>
>> __
>> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>
> --
> Peter Dalgaard, Professor,
> Center for Statistics, Copenhagen Business School
> Solbjerg Plads 3, 2000 Frederiksberg, Denmark
> Phone: (+45)38153501
> Office: A 4.23
> Email: pd@cbs.dk  Priv: pda...@gmail.com
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.



-- 
Andrew Robinson
Deputy Director, CEBRA, School of Biosciences
Reader & Associate Professor in Applied Statistics  Tel: (+61) 0403 138 955
School of Mathematics and StatisticsFax: +61-3-8344 4599
University of Melbourne, VIC 3010 Australia
Email: a.robin...@ms.unimelb.edu.au
Website: http://www.ms.unimelb.edu.au/~andrewpr

MSME: http://www.crcpress.com/product/isbn/9781439858028
FAwR: http://www.ms.unimelb.edu.au/~andrewpr/FAwR/
SPuR: http://www.ms.unimelb.edu.au/spuRs/

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] R wont accept my zero count values in the GLM with quasi_poisson dsitribution

2015-07-28 Thread Andrew Robinson

You have selected the binomial family in the call to glm.  You should
instead try something like

 family=quasipoisson(link = "log")

I hope this helps

Andrew

On Tue, Jul 28, 2015 at 4:33 PM, Charlotte <
charlotte.hu...@griffithuni.edu.au> wrote:

> Hello
>
> I have count values for abundance which follow a pattern of over-dispersal
> with many zero values.  I have read a number of documents which suggest
> that
> I don't use data transforming methods but rather than I run the GLM with
> the
> quasi poisson distribution.  So I have written my script and R is telling
> me
> that Y should be more than 0.
>
> Everything I read tells me to do it this way but I can't get R to agree.
> Did I need to add something else to my script to get it to work and keep my
> data untransformed? The script I wrote is as follows:
>
> > fit <- glm(abundance~Gender,data=teminfest,family=binomial())
>
> then I get this error
> Error in eval(expr, envir, enclos) : y values must be 0 <= y <= 1
>
> I don't use R a lot so I am having trouble figuring out what to do next.
>
> I would appreciate some help
>
> Many Thanks
> Charlotte
>
>
>
>
>
>
> --
> View this message in context:
> http://r.789695.n4.nabble.com/R-wont-accept-my-zero-count-values-in-the-GLM-with-quasi-poisson-dsitribution-tp4710462.html
> Sent from the R help mailing list archive at Nabble.com.
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>



-- 
Andrew Robinson
Deputy Director, CEBRA, School of Biosciences
Reader & Associate Professor in Applied Statistics  Tel: (+61) 0403 138 955
School of Mathematics and StatisticsFax: +61-3-8344
4599
University of Melbourne, VIC 3010 Australia
Email: a.robin...@ms.unimelb.edu.au
Website: http://www.ms.unimelb.edu.au/~andrewpr

MSME: http://www.crcpress.com/product/isbn/9781439858028
FAwR: http://www.ms.unimelb.edu.au/~andrewpr/FAwR/
SPuR: http://www.ms.unimelb.edu.au/spuRs/

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Help with nonlinear least squares regression curve fitting

2015-02-25 Thread Andrew Robinson

Finding starting values is a bit of a dark art.  That said, there are steps
you can take, but it may take time.

First, I would scale Year so that it's not in the thousands! Experiment
with subtracting 1980 or so.  For specific advice, see inline.

On Thu, Feb 26, 2015 at 3:03 AM, Corey Callaghan 
wrote:

> The curves' functions that I want to test are in the code here (hopefully
> correctly):
>
> Inverse Quadratic Curve:
> fitmodel <- nls(Area ~ (-a*Year)*(Year + b), data = df, start=list(a=??,
> b=??, c=??))
>

I would plot the data and a smooth spline, differentiate the curve
function, identify some parameter values somewhere stable, and estimate
some values by eye, or even predict them from the first derivative of the
spline - spline.smooth will do this.

Sigmodial Curve:
> fitmodel <- nls(Area~a/(1+exp(-(b+c*Year))), data=df, start=list(a=???,
> b=???, c=??))
>

I'd use the highest value as a, fit spline as above then invert area at two
times to get b and c.

Double sigmoidal Curve:
> fitmodel <- nls(Area~a+2b(1/(1+exp(-abs(-c*Year+d)))-1/2)*sign(-c*Year+d),
> data=df, start=list(a=???, b=???, c=???)
>

 I'd use min(Area) as a, figure out b from the maximum (I guess 2b+a is the
asymptote), and experiment with two values for year to retrieve c and d
 uniroot might help?

Cheers

Andrew

-- 
Andrew Robinson
Deputy Director, CEBRA, School of Biosciences
Reader & Associate Professor in Applied Statistics  Tel: (+61) 0403 138 955
School of Mathematics and StatisticsFax: +61-3-8344
4599
University of Melbourne, VIC 3010 Australia
Email: a.robin...@ms.unimelb.edu.au
Website: http://www.ms.unimelb.edu.au/~andrewpr

MSME: http://www.crcpress.com/product/isbn/9781439858028
FAwR: http://www.ms.unimelb.edu.au/~andrewpr/FAwR/
SPuR: http://www.ms.unimelb.edu.au/spuRs/

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] Building and scoring multiple models with dplyr functions

2014-08-18 Thread Andrew Agrimson

Hello All,

I have a question regarding building multiple models and then scoring new
data with these models. I have been tasked with converting existing SAS
modeling code into equivalent R code, but unfortunately I rarely use R so
I'm in unfamiliar territory. I've think I've found a good way to build the
models using dplyr functions, however I'm having difficulty figuring out
how to use the models to score new data.


*The SAS code I'm converting builds multiple binomial models using the "BY"
statement in the GLIMMIX procedure. The results of the modeling fitting
process are stored using the "STORE" statement. *

*proc* *glimmix* data = model_data;

by group;

model y/n = var1-var10/dist=bin;

store model;

*quit*;


*The next step is to score a new data set using the PLM procedure. The "New
Data" is also grouped by "group" and PLM is able to match and apply the
appropriate model with the appropriate "by" value. *

*proc* *plm* restore=model;

score data=new_data out=scored predicted=p/ilink;

*run*;


*In R I've been able to reproduce the first model building step using dplyr
functions and it seems to work quite well. In fact it's much faster than my
SAS implementation.*

by_group <- group_by(model_data, group)

models <- by_group %>% do(mod = glm(cbind(y,n) ~ var1 + var2 + var3 + var4
+ var5 + var6 + var7 + var8 + var9 + var10,
family = binomial, data = .))


*As stated above, I cannot figure out how to apply these models to new
data.  I've scoured the internet and the documentation for an example but
so far no luck. I want to extract the model objects out of the data frame
"models" and apply the "predict" function, but my novice knowledge of R and
dplyr specifically is making this very difficult.*

*Any help or advice would be greatly appreciated.*


*Thanks,*

*Andy*

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] Help with analysis of variance

2014-08-22 Thread McCulloch, Andrew

Hi,



I have an observational dataset which consists of eight annual observations 
(year) for children (id) recording the rate of unemployment in the 
neighbourhood in which they lived (rate). I know if children move home so the 
data also has an identifier for spells in the same neighbourhood (spell). I 
want to decompose the overall variation in children's experience of area 
unemployment, given by the sum of (rate - mean rate)^2, into a) the component 
within a residential spell, sum of (rate - spell mean of rate)^2, b) the 
component between spells, sum of (spell mean), and c) the component between 
children, sum of (rate - mean rate for child). I think I can do this longhand 
using the calculations below:



mobility <- structure(list(year = c(2002L, 2003L, 2004L, 2005L, 2006L, 
2007L,2008L, 2002L, 2003L, 2004L, 2005L, 2006L, 2007L, 2008L, 2002L,

2003L, 2004L, 2005L, 2006L, 2007L, 2008L, 2002L, 2003L, 2004L,

2005L, 2006L, 2007L, 2008L, 2002L, 2003L, 2004L, 2005L, 2006L,

2007L, 2008L), rate = c(13.08962, 14.27165, 4.496403, 3.89839,

4.60199, 5.138746, 5.251025, 4.874652, 5.880996, 5.813953, 6.204044,

6.93802, 6.866853, 7.614808, 4.405841, 4.826733, 4.760742, 3.762136,

4.60199, 5.138746, 5.251025, 4.405841, 4.826733, 4.760742, 3.762136,

4.60199, 5.138746, 5.251025, 4.405841, 5.789474, 5.889423, 4.61211,

4.642526, 6.838906, 9.683488), spell = c(1L, 2L, 2L, 3L, 3L,

3L, 3L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,

1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L), id = c(1L,

1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 3L, 3L, 3L,

3L, 3L, 3L, 3L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 5L, 5L, 5L, 5L, 5L,

5L, 5L)), .Names = c("year", "rate", "spell", "id"), row.names = c(NA,

-35L), class = "data.frame")



mobility$id <- factor(mobility$id)

mobility$spell <- factor(mobility$spell)



mobility$spellmean <- ave(mobility$rate,mobility$id,mobility$spell,FUN=mean)

mobility$personmean <- ave(mobility$rate,mobility$id,FUN=mean)

mobility$totalmean <- mean(mobility$rate,na.rm=TRUE)

N <- dim(mobility)[1]

# observation deviation from overall mean

sum(((mobility$rate-mobility$totalmean)^2)/N)

5.159846

# observation deviation from spell mean

sum(((mobility$rate-mobility$spellmean)^2)/N)

2.039461

# deviation of spell mean from person mean

sum(((mobility$spellmean-mobility$personmean)^2)/N)

2.13787

# deviation of person mean from overall mean

sum(((mobility$personmean-mobility$totalmean)^2)/N)

0.982515



I think this is correct because the sum of the three components of variation 
sums to the total:

2.039461+2.13787+0.982515 = 5.159846



Can someone show me how to use the analysis of variance functions in R to get 
the same result. Thanks.

Andrew McCulloch
Leeds Metropolitan University






>From 22 September 2014 Leeds Metropolitan University will become Leeds Beckett 
>University.
Find out more at http://www.leedsbeckett.ac.uk
To view the terms under which this email is distributed, please go to:-
http://www.leedsmet.ac.uk/email-disclaimer.htm

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] recording age into different variables

2014-08-25 Thread Andrew Kabala

Hi,
I am new to R, I would like to know how i can summarize age data into
different categories i.e.
Age
14
16
20
60
45
.
.
I would like to generate a summary for example showing the frequencies in
each custom defined group
0-10 yrs  11-20 years 21-30 years
20 5   6

many thanks
Andrew

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] memory usage with party::cforest

2014-11-21 Thread Andrew Z

Is there a way to shrink the size of RandomForest-class (an S4
object), so that it requires less memory during run-time and less disk
space for serialization?

On my system the data slot is about 2GB, which is causing problems,
and I'd like to see whether predict() works without it.

# example with a much smaller data set (i.e., less than 2GB)
require(party)
data(iris)
cf <- cforest(Species ~ ., data=iris)
str(cf, max.level=2)
cf@data <- NULL # this fails



Andrew

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Error in dis[sppInSample, sppInSample]:subscript out of bounds

2015-08-21 Thread Andrew Park

I'm pretty sure you are using the package Picante. When I received the error
you described in my own project, it was due to the naming of species in the
community matrix and the phylogenetic tree being slightly different (one
separated latin binomials with a space, the other with an underscore). When
I corrected one set of names (using gsub) to be identical to the other then
the error went away. I hope this helps.



--
View this message in context: 
http://r.789695.n4.nabble.com/Error-in-dis-sppInSample-sppInSample-subscript-out-of-bounds-tp4705418p4711366.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] Error Message when using VarSelection in YaiImpute package with the randomForest

2015-09-16 Thread andrew haywood

Dear All,

when using the following code

x <- iris[,1:2]  # Sepal.Length Sepal.Width
y <- iris[,3:4]  # Petal.Length Petal.Width
vsel <-
varSelection(x=x,y=y,nboot=5,yaiMethod="randomForest",useParallel=FALSE)


I get the following error code

Error in yai(x = xa, y = y, method = yaiMethod, bootstrap = bootstrap,  :
  object 'xcvRefs' not found


If anybody could tell me what I am doing wrong.

Cheers
Andrew

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] Error with Segmented package

2015-10-07 Thread andrew haywood

Dear List,

I am trying to run a simple pieewise regression using the segmented package.

When running the following code

library(segmented)
data = data.frame(x=c(50,60,70,80,90,100,110) , y=
c(703.786,705.857,708.153,711.056,709.257, 707.4, 705.6))

model.lm = segmented(lm(y~x,data = data),seg.Z = ~x, psi = NA, control =
seg.control(K=1))

I get the following error.

Error in if (psi == Inf) psi <- median(XREGseg) :
  missing value where TRUE/FALSE needed

Any advice would be greatly appreciated.

Kind regards
Andrew

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Error with Segmented package

2015-10-07 Thread andrew haywood

Thanks David,

I apologies for not posting the versions.

I am running

R version 3.2.2 (2015-08-14)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 7 x64 (build 7601) Service Pack 1

locale:
[1] LC_COLLATE=English_United Kingdom.1252
[2] LC_CTYPE=English_United Kingdom.1252
[3] LC_MONETARY=English_United Kingdom.1252
[4] LC_NUMERIC=C
[5] LC_TIME=English_United Kingdom.1252

attached base packages:
[1] stats graphics  grDevices utils datasets  methods   base

other attached packages:
[1] segmented_0.5-1.2

And I still get the error. What is the best way to debug the error?

Kind regards
Andrew


On Thu, Oct 8, 2015 at 7:11 AM, David Winsemius 
wrote:

>
> On Oct 7, 2015, at 6:50 AM, andrew haywood wrote:
>
> > Dear List,
> >
> > I am trying to run a simple pieewise regression using the segmented
> package.
> >
> > When running the following code
> >
> > library(segmented)
> > data = data.frame(x=c(50,60,70,80,90,100,110) , y=
> > c(703.786,705.857,708.153,711.056,709.257, 707.4, 705.6))
> >
> > model.lm = segmented(lm(y~x,data = data),seg.Z = ~x, psi = NA, control =
> > seg.control(K=1))
> >
> > I get the following error.
> >
> > Error in if (psi == Inf) psi <- median(XREGseg) :
> >  missing value where TRUE/FALSE needed
>
> I don't get any error, despite being a bit behind the times. You need to
> specify the versions (OS, R, segmented) and prepare for some debugging
> efforts. Easiest way to do this is with the output of sessionInfo().
>
> R version 3.2.1 (2015-06-18) -- "World-Famous Astronaut"
> Copyright (C) 2015 The R Foundation for Statistical Computing
> Platform: x86_64-apple-darwin10.8.0 (64-bit)
>
> I also have an embarrassing number of packages loaded.
>
> other attached packages:
>  [1] segmented_0.5-1.1   boot_1.3-17 sqldf_0.4-10
>
> (remaining 48 are omitted)
>
>
> >   [[alternative HTML version deleted]]
> >
> > __
> > R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> > and provide commented, minimal, self-contained, reproducible code.
>
> David Winsemius
> Alameda, CA, USA
>
>

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Error with Segmented package

2015-10-10 Thread andrew haywood

Thanks Dave,

when I run traceback() the following output

Error in if (psi == Inf) psi <- median(XREGseg) :
  missing value where TRUE/FALSE needed
> traceback()
2: segmented.lm(lm(y ~ x, data = data), seg.Z = ~x, psi = NA, control =
seg.control(K = 1))
1: segmented(lm(y ~ x, data = data), seg.Z = ~x, psi = NA, control =
seg.control(K = 1))


I am unsure how to interpret this.

In addition when I run the command

options(error="browser")


I get the following error

Error in options(error = "browser") : invalid value for 'error'



Any help would be greatly appreciated.

Kind regards
Andrew

On Thu, Oct 8, 2015 at 10:33 AM, David Winsemius 
wrote:

>
> On Oct 7, 2015, at 6:10 PM, andrew haywood wrote:
>
> > Thanks David,
> >
> > I apologies for not posting the versions.
> >
> > I am running
> >
> > R version 3.2.2 (2015-08-14)
> > Platform: x86_64-w64-mingw32/x64 (64-bit)
> > Running under: Windows 7 x64 (build 7601) Service Pack 1
> >
> > locale:
> > [1] LC_COLLATE=English_United Kingdom.1252
> > [2] LC_CTYPE=English_United Kingdom.1252
> > [3] LC_MONETARY=English_United Kingdom.1252
> > [4] LC_NUMERIC=C
> > [5] LC_TIME=English_United Kingdom.1252
> >
> > attached base packages:
> > [1] stats graphics  grDevices utils datasets  methods   base
> >
> > other attached packages:
> > [1] segmented_0.5-1.2
> >
> > And I still get the error. What is the best way to debug the error?
>
> The first thing I do is run traceback() immediately after the error. If
> that fails to illuminate the problem, my next step is falling back to:
>
> options(error="browser")
> # which should allow you to examine the system-state at the time of the
> error.
>
> And at that point I start considering sending an email to the maintainer
> with my reproducible example, especially so since it fails with a more
> recent version. It may be that the maintainer has the same OS as you have.
>
> --
> David.
> >
> > Kind regards teht
> > Andrew
> >
> >
> > On Thu, Oct 8, 2015 at 7:11 AM, David Winsemius 
> wrote:
> >
> > On Oct 7, 2015, at 6:50 AM, andrew haywood wrote:
> >
> > > Dear List,
> > >
> > > I am trying to run a simple pieewise regression using the segmented
> package.
> > >
> > > When running the following code
> > >
> > > library(segmented)
> > > data = data.frame(x=c(50,60,70,80,90,100,110) , y=
> > > c(703.786,705.857,708.153,711.056,709.257, 707.4, 705.6))
> > >
> > > model.lm = segmented(lm(y~x,data = data),seg.Z = ~x, psi = NA, control
> =
> > > seg.control(K=1))
> > >
> > > I get the following error.
> > >
> > > Error in if (psi == Inf) psi <- median(XREGseg) :
> > >  missing value where TRUE/FALSE needed
> >
> > I don't get any error, despite being a bit behind the times. You need to
> specify the versions (OS, R, segmented) and prepare for some debugging
> efforts. Easiest way to do this is with the output of sessionInfo().
> >
> > R version 3.2.1 (2015-06-18) -- "World-Famous Astronaut"
> > Copyright (C) 2015 The R Foundation for Statistical Computing
> > Platform: x86_64-apple-darwin10.8.0 (64-bit)
> >
> > I also have an embarrassing number of packages loaded.
> >
> > other attached packages:
> >  [1] segmented_0.5-1.1   boot_1.3-17 sqldf_0.4-10
> >
> > (remaining 48 are omitted)
> >
> >
> > >   [[alternative HTML version deleted]]
> > >
> > > __
> > > R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> > > https://stat.ethz.ch/mailman/listinfo/r-help
> > > PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> > > and provide commented, minimal, self-contained, reproducible code.
> >
> > David Winsemius
> > Alameda, CA, USA
> >
> >
>
> David Winsemius
> Alameda, CA, USA
>
>

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] Fwd: Error Message when using VarSelection in YaiImpute package with the randomForest

2015-10-10 Thread andrew haywood

Dear All,

as I new to the R-help list. I didnt realise I should provide the following
information

R version 3.0.2 (2013-09-25)
Platform: x86_64-pc-linux-gnu (64-bit)

locale:
 [1] LC_CTYPE=en_US.UTF-8   LC_NUMERIC=C
 [3] LC_TIME=ms_MY.UTF-8LC_COLLATE=en_US.UTF-8
 [5] LC_MONETARY=ms_MY.UTF-8LC_MESSAGES=en_US.UTF-8
 [7] LC_PAPER=ms_MY.UTF-8   LC_NAME=C
 [9] LC_ADDRESS=C   LC_TELEPHONE=C
[11] LC_MEASUREMENT=ms_MY.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] stats graphics  grDevices utils datasets  methods   base

other attached packages:
[1] yaImpute_1.0-26   strucchange_1.5-1 sandwich_2.3-4zoo_1.7-12
[5] segmented_0.5-1.2

loaded via a namespace (and not attached):
[1] grid_3.0.2  lattice_0.20-33 randomForest_4.6-12
[4] tools_3.0.2


Running traceback() I get the following

> traceback()
10: yai(x = xa, y = y, method = yaiMethod, bootstrap = bootstrap,
...)
9: withCallingHandlers(expr, warning = function(w)
invokeRestart("muffleWarning"))
8: suppressWarnings(yai(x = xa, y = y, method = yaiMethod, bootstrap =
bootstrap,
   ...))
7: grmsd(one = suppressWarnings(yai(x = xa, y = y, method = yaiMethod,
   bootstrap = bootstrap, ...)), ancillaryData = y, wts = wts)
6: withCallingHandlers(expr, warning = function(w)
invokeRestart("muffleWarning"))
5: suppressWarnings(grmsd(one = suppressWarnings(yai(x = xa, y = y,
   method = yaiMethod, bootstrap = bootstrap, ...)), ancillaryData = y,
   wts = wts))
4: FUN(1:5[[1L]], ...)
3: myapply(1:max(1, nboot), function(i, xa, y, wts, yaiMethod, bootstrap,
   ...) suppressWarnings(grmsd(one = suppressWarnings(yai(x = xa,
   y = y, method = yaiMethod, bootstrap = bootstrap, ...)),
   ancillaryData = y, wts = wts)), xa, y, wts, yaiMethod, bootstrap,
   ...)
2: unlist(myapply(1:max(1, nboot), function(i, xa, y, wts, yaiMethod,
   bootstrap, ...) suppressWarnings(grmsd(one = suppressWarnings(yai(x
= xa,
   y = y, method = yaiMethod, bootstrap = bootstrap, ...)),
   ancillaryData = y, wts = wts)), xa, y, wts, yaiMethod, bootstrap,
   ...))
1: varSelection(x = x, y = y, nboot = 5, yaiMethod = "randomForest",
   useParallel = FALSE)


Any help/guidance would be greatly appreciated.

Kind regards
Andrew

-- Forwarded message --
From: andrew haywood 
Date: Thu, Sep 17, 2015 at 1:57 AM
Subject: Error Message when using VarSelection in YaiImpute package with
the randomForest
To: r-help@r-project.org


Dear All,

when using the following code

x <- iris[,1:2]  # Sepal.Length Sepal.Width
y <- iris[,3:4]  # Petal.Length Petal.Width
vsel <-
varSelection(x=x,y=y,nboot=5,yaiMethod="randomForest",useParallel=FALSE)


I get the following error code

Error in yai(x = xa, y = y, method = yaiMethod, bootstrap = bootstrap,  :
  object 'xcvRefs' not found


If anybody could tell me what I am doing wrong.

Cheers
Andrew

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Question about lme syntax

2015-11-23 Thread Andrew Robinson

Hi Angelo,

it's dangerous to fit a model that includes interaction effects but omits
main effects.  Among other things, what can happen is that the statistical
tests become scale dependent, which is most unattractive.

I think that you should include the main effects in your model, even as
nuisance variables, and test the interaction using the model that includes
them.

BTW, your question might better be located with the mixed-effects models
special interest group.

https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models

Best wishes

Andrew

On Mon, Nov 23, 2015 at 9:19 PM, angelo.arc...@virgilio.it <
angelo.arc...@virgilio.it> wrote:

> Dear list,
> I need an help to understand the syntax of lme to fit my model according
> to the analysis I want to perform.
>
> My dependent variable resulted from a perceptual experiment in which
> responses of participants were measured twice for each provided stimulus.
> My goal is to verify whether the responses depend on two properties of the
> participants that I know to be related to each other (height and weight, so
> they need to be considered together as an interaction). More importantly, I
> need to understand how this relationship modulates according to the type of
> stimulus participants were presented to.
>
> Based on my understanding of lme syntax, the formula I have to use should
> be the following (because I am only interested in the interaction factor of
> Weight and Height)
>
> lme_dv <- lme(dv ~ Weight:Height:Stimulus_Type, data = scrd, random = ~ 1
> | Subject)
>
> Am I correct?
>
>
> Thank you in advance
>
> Best regards
>
> Angelo
>
>
>
> [[alternative HTML version deleted]]
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>



-- 
Andrew Robinson
Deputy Director, CEBRA, School of Biosciences
Reader & Associate Professor in Applied Statistics  Tel: (+61) 0403 138 955
School of Mathematics and StatisticsFax: +61-3-8344
4599
University of Melbourne, VIC 3010 Australia
Email: a.robin...@ms.unimelb.edu.au
Website: http://www.ms.unimelb.edu.au/~andrewpr

MSME: http://www.crcpress.com/product/isbn/9781439858028
FAwR: http://www.ms.unimelb.edu.au/~andrewpr/FAwR/
SPuR: http://www.ms.unimelb.edu.au/spuRs/

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] Ensure parameter is a string when passed within an lapply & called function runs a 'substitute' on it

2016-05-10 Thread Andrew Clancy

Hi, 

I’m trying to solve what looks like the same issue as stack overflow article, 
but within an lapply:
http://stackoverflow.com/questions/18939254/cant-use-a-variable-as-an-argument-but-can-use-its-value

I’ve replicated the issue with partialPlot below in ‘testFunc’. The lines up to 
the final print can’t change (including the substitute). In the first call it 
prints out ‘X1’ correctly, in the second it prints out ‘var’. I’ve tried eval, 
quote etc as the article suggests. Any ideas?

numObs  <- 10
numVars <- 6
dataSet    <- data.frame(replicate(numVars,rnorm(numObs)))
# partialPlot(x = model, pred.data = dataSet, x.var = 'X1', plot = F) 

testFunc <- function(x, pred.data, x.var, plot=F) {
  x.var <- substitute(x.var)
  # print(paste('is.character(x.var)', is.character(x.var), 'is.name(x.var)', 
is.name(x.var)))
  xname <- if (is.character(x.var)) x.var else {
    if (is.name(x.var)) deparse(x.var) else {
      eval(x.var)
    }
  }
  print(xname)
  # print(head(pred.data[,xname]))
}

vars <- names(dataSet)[[1]]
testFunc(x = model, pred.data = dataSet, x.var = local(vars), plot = F)

lapply(vars, function(var) {
  # print(paste('var', var))
  testFunc(x = model, pred.data = dataSet, x.var = var, plot = F)
})

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Ensure parameter is a string when passed within an lapply & called function runs a 'substitute' on it

2016-05-11 Thread Andrew Clancy

Thanks David - my earlier response to Bert contains the resolution.
partialPlot was commented out deliberately as it was the target function
who's behaviour I was replicating in testFunc. The original behaviour, ie.
printing 'X1' was correct, and the do.call fix yields this same response
when testFunc is called within lapply. As I'm replicating partialPlot, no
changes can be made to testFunc (eg. your removal of 'substitute')
otherwise I'd need to patch the randomForest::partialPlot package &
function. The correct patch would be to change the eval to use the parent
environment, the subtitue should remain.

See the resolution here (jcheng beat r-help to it this time!)
https://groups.google.com/forum/?utm_medium=email&utm_source=footer#!topic/shiny-discuss/cIZJzQmw8tQ


On 11 May 2016 at 08:48, David Winsemius  wrote:

>
> > On May 9, 2016, at 2:39 PM, Andrew Clancy  wrote:
> >
> > Hi,
> >
> > I’m trying to solve what looks like the same issue as stack overflow
> article, but within an lapply:
> >
> http://stackoverflow.com/questions/18939254/cant-use-a-variable-as-an-argument-but-can-use-its-value
>
>
> It would be helpful if you could articulate the issue.
>
> >
> > I’ve replicated the issue with partialPlot below in ‘testFunc’. The
> lines up to the final print can’t change (including the substitute). In the
> first call it prints out ‘X1’ correctly, in the second it prints out ‘var’.
> I’ve tried eval, quote etc as the article suggests. Any ideas?
> >
> > numObs  <- 10
> > numVars <- 6
> > dataSet<- data.frame(replicate(numVars,rnorm(numObs)))
> > # partialPlot(x = model, pred.data = dataSet, x.var = 'X1', plot = F)
>
> I'm assuming that the comment character is actually something that was
> inserted in hte process of stripping hte HTML from this posting.
>
> It throws an error when removed:
>
> Error in partialPlot(x = model, pred.data = dataSet, x.var = "X1", plot =
> F) :
>   object 'model' not found
>
> >
>
> > testFunc <- function(x, pred.data, x.var, plot=F) {
> >   x.var <- substitute(x.var)
>
> Try changing to eval(x.bar)
>
> >   # print(paste('is.character(x.var)', is.character(x.var), 
> > 'is.name(x.var)',
> is.name(x.var)))
>
> >   xname <- if (is.character(x.var)) x.var else {
> > if (is.name(x.var)) deparse(x.var) else {
> >   eval(x.var)
> > }
> >   }
> >   print(xname)
> >   # print(head(pred.data[,xname]))
> > }
> >
> > vars <- names(dataSet)[[1]]
> > testFunc(x = model, pred.data = dataSet, x.var = local(vars), plot = F)
>
> Returns:
> [1] "is.character(x.var) TRUE is.name(x.var) FALSE"
> [1] "X1"
> [1]  0.8704543 -0.4421564 -0.6725336 -1.3096399 -1.0531335 -0.4979650
>
>
> >
> > lapply(vars, function(var) {
> >   # print(paste('var', var))
> >   testFunc(x = model, pred.data = dataSet, x.var = var, plot = F)
> > })
>
> Retruns:
> [1] "var X1"
> [1] "is.character(x.var) TRUE is.name(x.var) FALSE"
> [1] "X1"
> [1]  0.8704543 -0.4421564 -0.6725336 -1.3096399 -1.0531335 -0.4979650
> [[1]]
> [1]  0.8704543 -0.4421564 -0.6725336 -1.3096399 -1.0531335 -0.4979650
>
>
> >
> >   [[alternative HTML version deleted]]
>
> Please learn to post in plain text for this mailing list.
>
> --
>
> David Winsemius
> Alameda, CA, USA
>
>

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Finding starting values for the parameters using nls() or nls2()

2016-10-09 Thread Andrew Robinson

Here are some things to try.  Maybe divide Area by 1000 and retention
by 100.  Try plotting the data and superimposing the line that
corresponds to the 'fit' from nls2.  See if you can correct it with
some careful guesses.

Getting suitable starting parameters for non-linear modeling is one of
the black arts of statistical fitting. Good luck!  And don't forget to
check for sensitivity.

Andrew

On 9 October 2016 at 22:21, Pinglei Gao  wrote:
> Hi,
>
> I have some data that i'm trying to fit a double exponential model: data.
> Frame (Area=c (521.5, 689.78, 1284.71, 2018.8, 2560.46, 524.91, 989.05,
> 1646.32, 2239.65, 2972.96, 478.54, 875.52, 1432.5, 2144.74, 2629.2),
>
> Retention=c (95.3, 87.18, 44.94, 26.36, 18.12, 84.68, 37.24, 33.04, 23.46,
> 9.72, 97.92, 71.44, 44.52, 24.44, 15.26) ) and the formula of the double
> exponential is: exp (b0*exp (b1*x^th)).
>
>
>
> I failed to guess the initial parameter values and then I learned a measure
> to find starting values from Nonlinear Regression with R (pp. 25-27):
>
>
>
>> cl<-data.frame(Area =c(521.5, 689.78, 1284.71, 2018.8, 2560.46, 524.91,
> 989.05, 1646.32, 2239.65, 2972.96, 478.54, 875.52, 1432.5, 2144.74, 2629.2),
>
> + Retention =c(95.3, 87.18, 44.94, 26.36, 18.12, 84.68, 37.24, 33.04, 23.46,
> 9.72, 97.92, 71.44, 44.52, 24.44, 15.26) )
>
>> expFct <- function(Area, b0, b1,th) {exp(b0*exp(b1*Area^th))}
>
>> grid.Disperse <- expand.grid(list(b0 = seq(0.01,4, by = 0.01), th =
> c(0.02),b1 = seq(0.01, 4, by = 0.01)))
>
>> Disperse.m2a <- nls2(Retention ~expFct(Area, b0, b1,th), data = cl, start
> = grid.Disperse, algorithm = "brute-force")
>
>> Disperse.m2a
>
> Nonlinear regression model
>
>   model: Retention ~ expFct(Area, b0, th, b1)
>
>data: cl
>
> b0   th   b1
>
> 3.82 0.02 0.01
>
> residual sum-of-squares: 13596
>
> Number of iterations to convergence: 16
>
> Achieved convergence tolerance: NA
>
>
>
> I got no error then I use the output as starting values to nls2 ():
>
>> nls.m2<- nls2(Retention ~ expFct(Area, b0, b1, th), data = cl, start =
> list(b0 = 3.82, b1 = 0.02, th = 0.01))
>
> Error in (function (formula, data = parent.frame(), start, control =
> nls.control(),  :
>
> Singular gradient
>
>
>
> Why? Did I do something wrong or misunderstand something?
>
>
>
> Later, I found another measure from Modern Applied Statistics with S (pp.
> 216-217):
>
>
>
>> negexp <- selfStart(model = ~ exp(b0*exp(b1*x^th)),initial =
> negexp.SSival, parameters = c("b0", "b1", "th"),
>
> + template = function(x, b0, b1, th) {})
>
>> Disperse.ss <- nls(Retention ~ negexp(Area, b0, b1, th),data = cl, trace =
> T)
>
>  b0  b1  th
>
>4.208763  144.205455 1035.324595
>
> Error in qr.default(.swts * attr(rhs, "gradient")) :
>
>  NA/NaN/Inf (arg1) can not be called when the external function is called.
>
>
>
> Error happened again. How can I fix it? I am desperate.
>
>
>
> Best regards,
>
>
>
> Pinglei Gao
>
>
>
>
> [[alternative HTML version deleted]]
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.



-- 
Andrew Robinson
Deputy Director, CEBRA, School of Biosciences
Reader & Associate Professor in Applied Statistics  Tel: (+61) 0403 138 955
School of Mathematics and StatisticsFax: +61-3-8344 4599
University of Melbourne, VIC 3010 Australia
Email: a.robin...@ms.unimelb.edu.au
Website: http://www.ms.unimelb.edu.au/~andrewpr

MSME: http://www.crcpress.com/product/isbn/9781439858028
FAwR: http://www.ms.unimelb.edu.au/~andrewpr/FAwR/
SPuR: http://www.ms.unimelb.edu.au/spuRs/

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] nlme

2016-10-24 Thread Andrew Robinson

Without access to the data, or commented, minimal, self-contained,
reproducible code, it's pretty hard to speculate.  I suggest that you
reframe your question so that we can see what you can see.

Andrew

On 25 October 2016 at 03:03, Santiago Bueno  wrote:
> Dear people:
>
>
> I am getting the following error when trying to run the piece of code
> below. Any insights??
>
>
>
> Error in nlme.formula(Btronc ~ a + b * dbh^2 * haut, data = cbind(dat,  :
>
>   object 'a' not found
>
>
>
> library(nlme)
>
> start <- coef(lm(Btronc~I(dbh**2*haut),data=dat))
>
> names(start) <- c("a","b")
>
> summary(nlme(Btronc~a+b*dbh**2*haut, data=cbind(dat,g="a"), fixed=a+b_1,
> start=start,
>
> groups=~g, weights=varPower(form=~dbh)))
>
>
> Best,
>
>
> Santiago
>
> [[alternative HTML version deleted]]
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.



-- 
Andrew Robinson
Deputy Director, CEBRA, School of Biosciences
Reader & Associate Professor in Applied Statistics  Tel: (+61) 0403 138 955
School of Mathematics and StatisticsFax: +61-3-8344 4599
University of Melbourne, VIC 3010 Australia
Email: a.robin...@ms.unimelb.edu.au
Website: http://www.ms.unimelb.edu.au/~andrewpr

MSME: http://www.crcpress.com/product/isbn/9781439858028
FAwR: http://www.ms.unimelb.edu.au/~andrewpr/FAwR/
SPuR: http://www.ms.unimelb.edu.au/spuRs/

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] XML to CSV

2017-01-03 Thread Andrew Lachance

up votdown votefavorite
<http://stats.stackexchange.com/questions/254328/how-to-convert-a-large-xml-file-to-a-csv-file-using-r?noredirect=1#>

I am completely new to R and have tried to use several functions within the
xml packages to convert an XML to a csv and have had little success. Since
I am so new, I am not sure what the necessary steps are to complete this
conversion without a lot of NA.

-- 
Andrew D. Lachance
Chief of Service, Bates Emergency Medical Service
Residence Coordinator, Hopkins House
Bates College Class of 2017
alach...@bates.edu 
(207) 620-4854

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] spatial analysis using quickPCNM

2017-01-23 Thread Andrew Halford

Hi Listers,

I posted this message to the R-sig-ecology group last Friday but have not
had a response hence my post here.

I have been trying to run spatial analyses on a fish community dataset.

My fish dataset has 114 species(variables) x 45 sites
My spatial dataset has the Lat and Long values for each site, converted to
cartesian coordinates



> fish <- read.table(file.choose())> latlong <- read.table(file.choose()) > 
> fish.h <- decostand (fish, "hellinger")> fish.PCNM.quick <- 
> quickPCNM(fish.h,latlong)

Truncation level = 639.5348
Time to compute PCNMs = 0.82  sec Error in if (temp2.test[1, 5] <=
alpha) { : argument is of length zeroTiming stopped at: 1.06 0.05 1.19

I do not understand the error message coming up and would appreciate
some advice.

Andy


-- 
Andrew Halford Ph.D
Research Scientist (Kimberley Marine Parks)
Dept. Parks and Wildlife
Western Australia

Ph: +61 8 9219 9795
Mobile: +61 (0) 468 419 473

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] XML to CSV

2017-01-25 Thread Andrew Lachance

Hello all,

Thank you for the extremely helpful information. As a follow up, some of
the nested elements are of the form below:
-



I've been having trouble extracting this information and was wondering if
anyone had any suggestions.

Thank you,
Andrew

On Thu, Jan 5, 2017 at 7:39 AM, Franzini, Gabriele [Nervianoms] <
gabriele.franz...@nervianoms.com> wrote:

> Hello Andrew,
>
> as you are "clean slate" anyway in handling XML files, you could take a
> look to XSLT processing -- also an off-topic area.
> There are free tools available around, and many examples of "XML to CSV
> XSLT" on StackOverflow.
>
> HTH,
> Gabriele
>
> -Original Message-
>
> On January 4, 2017 12:45:08 PM PST, Ben Tupper 
> wrote:
> >Hi,
> >
> >You should keep replies on the list - you never know when someone will
> >swoop in with the right answer to make your life easier.
> >
> >Below is a simple example that uses xpath syntax to identify (and in
> >this case retrieve) children that match your xpath expression.  xpath
> >epxressions are sort of like /a/directory/structure/description so you
> >can visualize elements of XML like nested folders or subdirectories.
> >
> >Hopefully this will get you started.  A lot more on xpath here
> >http://www.w3schools.com/xml/xml_xpath.asp  There are other extraction
> >tools in xml2 - just type ?xml2 at the command prompt to see more.
> >
> >Since you have more deeply nested elements you'll need to play with
> >this a bit first.
> >
> >library(xml2)
> >uri = 'http://www.w3schools.com/xml/simple.xml'
> >x = read_xml(uri)
> >
> >name_nodes = xml_find_all(x, "//name")
> >name = xml_text(name_nodes)
> >
> >price_nodes = xml_find_all(x, "//price")
> >price = xml_text(price_nodes)
> >
> >calories_nodes = xml_find_all(x, "//calories")
> >calories = xml_double(calories_nodes)
> >
> >X = data.frame(name, price, calories, stringsAsFactors = FALSE)
> >write.csv(X, file = 'foo.csv')
> >
> >Cheers,
> >Ben
> >
> >> On Jan 4, 2017, at 2:13 PM, Andrew Lachance 
> >wrote:
> >>
> >> Hello Ben,
> >>
> >> Thank you for the advice. I am extremely new to any sort of coding so
> >I have learned a lot already. Essentially, I was given an XML file and
> >was told to convert all of it to a csv so that it could be uploaded
> >into a database. Unfortunately the information I am working with is
> >medical information and can't really share it. I initially tried to
> >convert it using online programs, however that ended up with a large
> >amount of blank spaces that wasn't useful for uploading into the
> >database.
> >>
> >> So essentially, my goal is to parse all the data in the XML to a
> >coherent, succinct CSV that could be uploaded. In the document, there
> >are 361 patient files with 13 subcategories for each patient which
> >further branches off to around 150 categories total. Since I am so new,
> >I have been having a hard time seeing the bigger picture or knowing if
> >there are any intermediary steps that will prevent all the blank spaces
> >that the online conversion programs created.
> >>
> >> I will look through the information on the xml2 package. Any advice
> >or recommendations would be greatly appreciated as I have felt fairly
> >stuck. Once again, thank you very much for your help.
> >>
> >> Best,
> >> Andrew
> >>
> >> On Tue, Jan 3, 2017 at 2:29 PM, Ben Tupper  ><mailto:btup...@bigelow.org>> wrote:
> >> Hi,
> >>
> >> It's hard to know what to advise - much depends upon the XML data you
> >have and what you want to extract from it. Without knowing about those
> >two things there is little anyone could do to help.  Can you post to
> >the internet a to example data and provide the link here?  Then state
> >explicitly what you want to have in hand at the end.
> >>
> >> If you are just starting out I suggest that you try xml2 package (
> >https://cran.r-project.org/web/packages/xml2/
> ><https://cran.r-project.org/web/packages/xml2/> ) rather than XML
> >package ( https://cran.r-project.org/web/packages/XML/
> ><https://cran.r-project.org/web/packages/XML/> ). I have been using it
> >much more since the authors added the ability to create xml nodes
> >(rather than just extracting data from existing xml nodes).
> >>
> >> Cheers,
> >> Ben
> >>
> &

[R] with lapply() how can you retrieve the name of the object

2008-07-18 Thread Andrew Yee

In the following code, I'd like to be able to create a new variable
containing the value of the names of the list.


a <- data.frame(var.1 = 1:5)
b <- data.frame(var.1 = 11:15)

test.list <- list(a=a, b=b)

# in this case, names(test.list) is "a" and "b"

# and I'd like to use lapply() so that
# I get something that looks like
# var.1 var.2
# 1  a
# 2  a
# 3  a
#etc.

new.list <- lapply(test.list, function(x) {x$var.2 <- names(x)
x} )


# the above clearly doesn't do it.  How do you pull out the names of the
thing that is being lapplied?

Thanks,
Andrew

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] with lapply() how can you retrieve the name of the object

2008-07-21 Thread Andrew Yee

Thanks everyone for your suggestions (and sorry for the delay in the
acknowledgement).  Jorge and Jim, thanks for pointing out your approach.

Andrew

On Fri, Jul 18, 2008 at 7:02 PM, Jorge Ivan Velez
<[EMAIL PROTECTED]>wrote:

>
> Dear Andrew,
>
> Following Jim Holtman' solution (thanks for pointed it out), try:
>
> a <- data.frame(var.1 = 1:5)
> b <- data.frame(var.1 = 11:15)
> test.list <- list(a=a, b=b)
> do.call(rbind,lapply(names(test.list), function(x){
>  cbind(test.list[[x]], var.2=x)
> }))
>
>
> HTH,
>
> Jorge
>
>
>
>
> On Fri, Jul 18, 2008 at 6:57 PM, jim holtman <[EMAIL PROTECTED]> wrote:
>
>> Is this what you wanted; you would use the 'names' to get the names in
>> the lapply:
>>
>> > lapply(names(test.list), function(x){
>> + cbind(test.list[[x]], var.2=x)
>> + })
>> [[1]]
>>  var.1 var.2
>> 1 1 a
>> 2 2 a
>> 3 3 a
>> 4 4 a
>> 5 5 a
>>
>> [[2]]
>>  var.1 var.2
>> 111 b
>> 212 b
>> 313 b
>> 414 b
>> 515 b
>>
>>
>>
>> On Fri, Jul 18, 2008 at 6:31 PM, Jorge Ivan Velez
>> <[EMAIL PROTECTED]> wrote:
>> > Dear Andrew,
>> >
>> > It's not the best solution but it does what you need:
>> >
>> > a <- data.frame(var.1 = 1:5)
>> > b <- data.frame(var.1 = 11:15)
>> > test.list <- list(a=a, b=b)
>> > res=do.call(rbind,test.list)
>> > res$var.2=substr(rownames(res),1,1)
>> > rownames(res)=NULL
>> > res
>> >
>> >
>> > HTH,
>> >
>> > Jorge
>> >
>> >
>> > On Fri, Jul 18, 2008 at 6:21 PM, Andrew Yee <[EMAIL PROTECTED]>
>> wrote:
>> >
>> >> In the following code, I'd like to be able to create a new variable
>> >> containing the value of the names of the list.
>> >>
>> >>
>> >> a <- data.frame(var.1 = 1:5)
>> >> b <- data.frame(var.1 = 11:15)
>> >>
>> >> test.list <- list(a=a, b=b)
>> >>
>> >> # in this case, names(test.list) is "a" and "b"
>> >>
>> >> # and I'd like to use lapply() so that
>> >> # I get something that looks like
>> >> # var.1 var.2
>> >> # 1  a
>> >> # 2  a
>> >> # 3  a
>> >> #etc.
>> >>
>> >> new.list <- lapply(test.list, function(x) {x$var.2 <- names(x)
>> >>x} )
>> >>
>> >>
>> >> # the above clearly doesn't do it.  How do you pull out the names of
>> the
>> >> thing that is being lapplied?
>> >>
>> >> Thanks,
>> >> Andrew
>> >>
>> >>[[alternative HTML version deleted]]
>> >>
>> >> __
>> >> R-help@r-project.org mailing list
>> >> https://stat.ethz.ch/mailman/listinfo/r-help
>> >> PLEASE do read the posting guide
>> >> http://www.R-project.org/posting-guide.html
>> >> and provide commented, minimal, self-contained, reproducible code.
>> >>
>> >
>> >[[alternative HTML version deleted]]
>> >
>> > __
>> > R-help@r-project.org mailing list
>> > https://stat.ethz.ch/mailman/listinfo/r-help
>> > PLEASE do read the posting guide
>> http://www.R-project.org/posting-guide.html
>> > and provide commented, minimal, self-contained, reproducible code.
>> >
>>
>>
>>
>> --
>> Jim Holtman
>> Cincinnati, OH
>> +1 513 646 9390
>>
>> What is the problem you are trying to solve?
>>
>
>

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] Rf_error crashes entire program.

2008-07-28 Thread Andrew Redd

I'm having a problem with the error and warning functions.  I've tried this
on multiple machine so I'm fairly sure it's not machine dependent and I've
tried it on the latest versions 2.6.0-2.7.1.  Whenever my program gets to an
error or warning it crashes the entire program rather than throwing the
error like it should.

Here are the relevant bits.

...
char const * const ExeedsMinVarianceError = "PFDA ERR: Near zero
variance encountered.  Estimation Unstable. Terminating Estimation.";

if(debug){printf("Da:\n");printmat(DaOld,1,*ka);fflush(stdout);}
daxpy_(ka, &mOne, Da, &one, DaOld, &one);
for(i=0;i<*ka;i++)convergenceCriteria+=fabs(DaOld[i]);
if(Da[*ka] < MinVariance){
printf("PING");fflush(stdout);
warning(ExeedsMinVarianceError);
break;
}

and the output from running in batch mode that I get is this:
Da:
0.18034.988e-017

PING
and here the program crashes.   I've tried this in multiple places and
sometimes the error is thrown sometimes not.  Does anyone have an idea of
what is going on.  I could not find any discussion of this error yet.
Thank you,
Andrew Redd

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Rf_error crashes entire program.

2008-07-28 Thread Andrew Redd

Switching all of my printfs to Rprintf fixed the problem the errors now
proceed correctly.  I was unable to reproduce the error, but now that is
irrelevant, at least to me.  Thanks for the help.
-Andrew

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] Changing values

2008-08-03 Thread Andrew Ramsey


Hello--

I am a relatively new user to R and I cannot find the information I  
need. Please help.


I have a very large data set with values including letters, numbers,  
and symbols (sometimes within the same vector value [ie X9-].


I've imported the data using read.fwp and it arrives in list format.

I'd like to change the letters and symbols to numbers (ie X9-  ->   
00911) in every entry.


How would you recommend I try to do so?

Thank you!

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] Decomposing tests of interaction terms in mixed-effects models

2008-08-03 Thread Andrew Robinson

Dear R colleagues,

a friend and I are trying to develop a modest workflow for the problem
of decomposing tests of higher-order terms into interpretable sets of
tests of lower order terms with conditioning.

For example, if the interaction between A (3 levels) and C (2 levels)
is significant, it may be of interest to ask whether or not A is
significant at level 1 of C and level 2 of C.

The textbook response seems to be to subset the data and perform the
tests on the two subsets, but using the RSS and DF from the original
fit.  

We're trying to answer the question using new explanatory variables.
This approach (seems to) work just fine in aov, but not lme.  

For example,

##

# Build the example dataset

set.seed(101)

Block <- gl(6, 6, 36, lab = paste("B", 1:6, sep = ""))
A <- gl(3, 2, 36, lab = paste("A", 1:3, sep = ""))
C <- gl(2, 1, 36, lab = paste("C", 1:2, sep = ""))
example <- data.frame(Block, A, C) 
example$Y <- rnorm(36)*2 + as.numeric(example$A)*as.numeric(example$C)^2 + 
3 * rep(rnorm(6), each=6)

with(example, interaction.plot(A, C, Y)) 

# lme 

require(nlme) 
anova(lme(Y~A*C, random = ~1|Block, data = example)) 

summary(aov(Y ~ Error(Block) + A*C, data = example))

# Significant 2-way interaction.  Now we would like to test the effect
# of A at C1 and the effect of A at C2.  Construct two new variables
# that decompose the interaction, so an LRT is possible.

example$AC <- example$AC1 <- example$AC2 <- interaction(example$A, example$C) 

example$AC1[example$C == "C1"] <- "A1.C1"  # A is constant at C1
example$AC2[example$C == "C2"] <- "A1.C2"  # A is constant at C2

# lme fails (as does lmer)

lme(Y ~ AC1 + AC, random = ~ 1 | Block, data = example)

# but aov seems just fine.

summary(aov(Y ~ Error(Block) + AC1 + AC, data = example)) 

## AC was not significant, so A is not significant at C1

summary(aov(Y ~ Error(Block) + AC2 + AC, data = example)) 

## AC was significant, so A is significant at C2

## Some experimentation suggests that lme doesn't like the 'partial
## confounding' approach that we are using, rather than the variables
## that we have constructed.

lme(Y ~ AC, random = ~ 1 | Block, data = example)
lme(Y ~ A + AC, random = ~ 1 | Block, data = example)

##########

Are we doing anything obviously wrong?   Is there another approach to
the goal that we are trying to achieve?

Many thanks,

Andrew

-- 
Andrew Robinson  
Department of Mathematics and StatisticsTel: +61-3-8344-6410
University of Melbourne, VIC 3010 Australia Fax: +61-3-8344-4599
http://www.ms.unimelb.edu.au/~andrewpr
http://blogs.mbs.edu/fishing-in-the-bay/

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Decomposing tests of interaction terms in mixed-effects models

2008-08-04 Thread Andrew Robinson

On Mon, Aug 04, 2008 at 10:17:38AM +0200, Peter Dalgaard wrote:
> Andrew Robinson wrote:
> >Dear R colleagues,
> >
> >a friend and I are trying to develop a modest workflow for the problem
> >of decomposing tests of higher-order terms into interpretable sets of
> >tests of lower order terms with conditioning.
> >
> >For example, if the interaction between A (3 levels) and C (2 levels)
> >is significant, it may be of interest to ask whether or not A is
> >significant at level 1 of C and level 2 of C.
> >
> >The textbook response seems to be to subset the data and perform the
> >tests on the two subsets, but using the RSS and DF from the original
> >fit.  
> >
> >We're trying to answer the question using new explanatory variables.
> >This approach (seems to) work just fine in aov, but not lme.  
> >
> >For example,
> >
> >##
> >
> ># Build the example dataset
> >
> >set.seed(101)
> >
> >Block <- gl(6, 6, 36, lab = paste("B", 1:6, sep = ""))
> >A <- gl(3, 2, 36, lab = paste("A", 1:3, sep = ""))
> >C <- gl(2, 1, 36, lab = paste("C", 1:2, sep = ""))
> >example <- data.frame(Block, A, C) 
> >example$Y <- rnorm(36)*2 + as.numeric(example$A)*as.numeric(example$C)^2 + 
> >3 * rep(rnorm(6), each=6)
> >
> >with(example, interaction.plot(A, C, Y)) 
> >
> ># lme 
> >
> >require(nlme) 
> >anova(lme(Y~A*C, random = ~1|Block, data = example)) 
> >
> >summary(aov(Y ~ Error(Block) + A*C, data = example))
> >
> ># Significant 2-way interaction.  Now we would like to test the effect
> ># of A at C1 and the effect of A at C2.  Construct two new variables
> ># that decompose the interaction, so an LRT is possible.
> >
> >example$AC <- example$AC1 <- example$AC2 <- interaction(example$A, 
> >example$C) 
> >example$AC1[example$C == "C1"] <- "A1.C1"  # A is constant at C1
> >example$AC2[example$C == "C2"] <- "A1.C2"  # A is constant at C2
> >
> ># lme fails (as does lmer)
> >
> >lme(Y ~ AC1 + AC, random = ~ 1 | Block, data = example)
> >
> ># but aov seems just fine.
> >
> >summary(aov(Y ~ Error(Block) + AC1 + AC, data = example)) 
> >
> >## AC was not significant, so A is not significant at C1
> >
> >summary(aov(Y ~ Error(Block) + AC2 + AC, data = example)) 
> >
> >## AC was significant, so A is significant at C2
> >
> >## Some experimentation suggests that lme doesn't like the 'partial
> >## confounding' approach that we are using, rather than the variables
> >## that we have constructed.
> >
> >lme(Y ~ AC, random = ~ 1 | Block, data = example)
> >lme(Y ~ A + AC, random = ~ 1 | Block, data = example)
> >
> >##
> >
> >Are we doing anything obviously wrong?   Is there another approach to
> >the goal that we are trying to achieve?
> >  
> This looks like a generic issue with lme/lmer, in that they are not 
> happy with rank deficiencies in the design matrix.
> 
> Here's a "dirty" trick which you are not really supposed to know about 
> because it is hidden inside the "stats" namespace:
> 
> M <- model.matrix(~AC1+AC, data=example)
> M2 <- stats:::Thin.col(M)
> summary(lme(Y ~ M2 - 1, random = ~ 1 | Block, data = example)
> 
> (Thin.col(), Thin.row(), and Rank() are support functions for 
> anova.mlm() et al., but maybe we should document them and put them out 
> in the open.)

That is a neat idea, thanks, Peter, but it doesn't quite fit the bill.
The summary provides t-tests but I am hoping to find F-tests,
otherwise I'm not sure how to efficiently test A (3 levels) at the two
levels of C.  

The anova.lme function doesn't help, sadly:

> anova(lme(Y ~ M2 - 1, random = ~ 1 | Block, data = example))
   numDF denDF F-value p-value
M2 625 23.0198  <.0001

so I'm still flummoxed!

Andrew
-- 
Andrew Robinson  
Department of Mathematics and StatisticsTel: +61-3-8344-6410
University of Melbourne, VIC 3010 Australia Fax: +61-3-8344-4599
http://www.ms.unimelb.edu.au/~andrewpr
http://blogs.mbs.edu/fishing-in-the-bay/

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

1 2 3 4 5 6 7 8 >

1 - 100 of 793 matches

Mail list logo