[R] Fast reading of hex data?

2012-05-08 Thread Fang
Hi all,

Basically, I have data in the format of (up to 1 gig in size) text files
containing stuff like:

F34060F81000F28055F8A000F2E05EF8F000F34 (...)

The data is basically strings denoting hex values (9 = 9, A = 10, B = 11,
...) organised in fixed, small blocks. What I want to do is to read in a
specified segment of the string, break it up into blocks, and convert it
into a vector of integers for further processing. And I want to do this
fast, and hopefully without using masses of memory. So, I'm wondering if
anyone has any better ideas than what I'm doing - well, anything that would
make a sizable difference anyway.

Right now, my methodology is the following:

1. Use mmap (from library mmap) to map the file to a memory-mapped variable,
   reading in each byte as a uint8 integer:
     obj <- mmap("file.txt", mode = uint8())
     tmp <- obj[bytepos]
2. Convert the integer representation of each byte into the appropriate
   integer:
     tmp <- tmp - 48 - 7*(tmp>64)
3. Collate blocksize values together:
     tmp <- matrix(tmp, ncol = blocksize, byrow = TRUE) %*% 16^((blocksize - 1):0)

Now, my question is: is there a better way? My attempts with rawToChar and
strtoi seem to take drastically longer for reasonably lengthy bytepos,
presumably because of string manipulation/storage, but possibly I am doing
it wrong somehow. If there is no better way in R, would there be much value
in implementing this in C, for example, or would the computational
improvement be small?
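
For reference, the byte-to-integer steps described above can be collected into
one base-R function. This is only a minimal sketch, not a benchmarked answer:
the name hex_blocks_to_int is hypothetical, it takes a raw vector (e.g. from
readBin or charToRaw) rather than an mmap object, and it assumes the input
contains upper-case hex digits only.

```r
# Hypothetical helper: decode ASCII hex digits into one number per block.
hex_blocks_to_int <- function(bytes, blocksize) {
  stopifnot(length(bytes) %% blocksize == 0)
  digits <- as.integer(bytes)
  # '0'-'9' are codes 48-57, 'A'-'F' are codes 65-70
  digits <- digits - 48L - 7L * (digits > 64L)
  # place values for one block, most significant digit first
  weights <- 16^((blocksize - 1):0)
  as.vector(matrix(digits, ncol = blocksize, byrow = TRUE) %*% weights)
}

x <- hex_blocks_to_int(charToRaw("F34060F8"), blocksize = 4)
print(x)
```

The result is a double vector, so blocks of up to 13 hex digits stay exact;
strtoi(c("F340", "60F8"), 16L) gives the same values on this small input and is
a handy cross-check.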

Thanks,

Zhou

--
View this message in context: 
http://r.789695.n4.nabble.com/Fast-reading-of-hex-data-tp4617024.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] How to plot PCA output?

2012-05-08 Thread Fang
I think the question on your mind should be: 'what do I want to do with this
plot'? Just producing output from the PCA is easy - plotting output$sdev
(the component standard deviations) is probably quite informative. From the
sounds of it, though, you want to do clustering with the PCA component
loadings? (Since that's mostly what the biplot accomplishes using the first
two PCs.)

The first thing to note, then, is that you might not want to plot all 36 PCs!
Once you go higher than the first few, your results will likely become
remarkably awful in ways that might not be obvious. A biplot with PCs 1 & 2,
or 2 & 3, for example, could easily be sufficient.

If you want to still plot many PCs, from an exploratory point of view,
something like a parallel coordinates plot might be helpful. Alternatively,
you could look at rgl for general plotting of 3d points (so you can do a 3d
version of the biplot), or apply more systematic clustering algorithms.
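
A minimal sketch of the plots mentioned above, using simulated data because the
original data set is not shown in this thread. The matrix X and the choice of
prcomp (rather than princomp) are assumptions made for illustration.

```r
set.seed(1)
X <- matrix(rnorm(200 * 6), ncol = 6)     # stand-in data: 200 rows, 6 variables
pca <- prcomp(X, scale. = TRUE)

# Proportion of variance explained by each component
var_explained <- pca$sdev^2 / sum(pca$sdev^2)

plot(pca$sdev, type = "b", xlab = "Component", ylab = "Standard deviation")
biplot(pca, choices = c(1, 2))            # PCs 1 & 2
biplot(pca, choices = c(2, 3))            # PCs 2 & 3
```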

Zhou

--
View this message in context: 
http://r.789695.n4.nabble.com/How-to-plot-PCA-output-tp4614732p4617165.html


[R] [R-pkgs] Announcing XGR

2016-12-15 Thread Hai Fang
Dear R users,

I am happy to announce that the package 'XGR' (Exploring Genomic Relations,
available at http://cran.r-project.org/package=XGR) has been on CRAN since
this April. It has now been published in Genome Medicine (see
http://dx.doi.org/10.1186/s13073-016-0384-y). Together with its web app,
XGR provides a user-friendly tool for exploring genomic relations
at the gene, SNP and genomic region level.

Best regards,

Dr Hai Fang
Wellcome Trust Centre for Human Genetics
University of Oxford
Roosevelt Drive Headington
Oxford OX3 7BN


___
R-packages mailing list
r-packa...@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-packages



[R] Want help on data resampling!

2009-11-18 Thread ke fang
Dear all,
  I have a data matrix in which each row contains a specific individual's
information, including individual observations and properties. I'm trying to
use R to create some bootstrap samples from this data matrix. I have tried the
boot() function in the boot package, but it seems that this function needs one
or more statistics to be summarized; I can't just get my data resampled. I also
tried the resample() function but got nothing. Can somebody give me a hint
on solving this problem?
  Thanks in advance!
  Here is some of my data. Rows represent individuals. Columns represent
individual information and observations.

10 168 133 22.5 1 0 3.45 4.890349 
11 672 15 25.5 1 0 3.9 2.70805 
12 168 201 25.7 1 0 3.9 5.303305 
17 216 125 46.5 0 0 4.7 4.828314 
18 216 103 95 0 0 9.5 4.634729 
19 504 92 64 0 0 7 4.521789 
20 504 52 81.5 0 0 8.2 3.951244 
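
For what it is worth, plain row resampling does not need boot() at all. A
minimal sketch, assuming the rows above are stored in a numeric matrix called
dat (a hypothetical name) and that three bootstrap samples are wanted:

```r
dat <- matrix(c(
  10, 168, 133, 22.5, 1, 0, 3.45, 4.890349,
  11, 672,  15, 25.5, 1, 0, 3.9,  2.70805,
  12, 168, 201, 25.7, 1, 0, 3.9,  5.303305,
  17, 216, 125, 46.5, 0, 0, 4.7,  4.828314,
  18, 216, 103, 95,   0, 0, 9.5,  4.634729,
  19, 504,  92, 64,   0, 0, 7,    4.521789,
  20, 504,  52, 81.5, 0, 0, 8.2,  3.951244
), ncol = 8, byrow = TRUE)

set.seed(42)
# Each bootstrap sample draws nrow(dat) row indices with replacement
boot_samples <- replicate(
  3,
  dat[sample(nrow(dat), replace = TRUE), , drop = FALSE],
  simplify = FALSE
)
```

Each element of boot_samples is a matrix with the same shape as dat, in which
some individuals appear more than once and others not at all.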




[R] Binary operators in packages and documentation?

2009-11-25 Thread Zhou Fang
Hi,

I'm trying to make a package defining a new (S3?) class. Part of this
involves a custom version of a binary operator. e.g. "*.foo", so I can
do obj.foo * bar, or things like that.

Now, I think to make this work with a NAMESPACE, I can do

S3method("*", foo)

in the NAMESPACE file, right? The question I was wondering was what
the appropriate way to document this operator is. i.e. What should I
put in the \usage section, etc?

'Writing R Extensions' doesn't seem to say much about this, but maybe
I'm missing something obvious.
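
A minimal sketch of such an operator method, with a hypothetical constructor
new_foo standing in for the real class. The Rd line in the comment reflects my
understanding of the \method{generic}{class} markup described in 'Writing R
Extensions'; it should be checked against the manual for the R version in use.

```r
# Hypothetical class: a number tagged with class "foo"
new_foo <- function(x) structure(x, class = "foo")

# S3 method for the binary operator "*"; called when either operand is a foo
"*.foo" <- function(e1, e2) {
  new_foo(unclass(e1) * unclass(e2))
}

a <- new_foo(2)
r <- a * 3

# NAMESPACE:      S3method("*", foo)
# Rd \usage line: \method{*}{foo}(e1, e2)
```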

Thanks,

Zhou Fang



[R] Replace is leaking?

2009-05-27 Thread Zhou Fang
Okay, someone explain this behaviour to me:

Browse[1]> replace(rep(0, 4000), temp1[12] , temp2[12])[3925]
[1] 0.4462404
Browse[1]> temp1[12]
[1] 3926
Browse[1]> temp2[12]
[1] 0.4462404
Browse[1]> replace(rep(0, 4000), 3926 , temp2[12])[3925]
[1] 0

For some reason, R seems to shift indices along when doing this replacement.

Has anyone encountered this bug before? It seems to crop up from time
to time, seemingly at random. Any idea for a fix? Reassigning the
variables seems to preserve the magicness of the numbers. It all seems
very bizarre and worrying.

If anyone is interested in a R workspace to reproduce this, email me.
This is running in R 2.9.

Zhou Fang



Re: [R] Replace is leaking?

2009-05-27 Thread Zhou Fang

Oh hang on, I've figured it out.

Rounding error, doh. Somewhere along the line I got lazy and took the 
weighted average of two values that are equal. as.integer truncates, so, 
yeah. Never mind.
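
The effect is easy to reproduce. In this sketch the index idx is made up: it
prints as 3926 at the default 7 significant digits but sits just below it, so
the truncation done by indexing lands on 3925.

```r
idx <- 3926 - 1e-9        # prints as 3926, but is slightly smaller
print(idx)
print(as.integer(idx))    # truncates toward zero

v <- replace(rep(0, 4000), idx, 0.5)
hit <- which(v != 0)      # position that was actually replaced

# Rounding the index first gives the intended position
v2 <- replace(rep(0, 4000), round(idx), 0.5)
hit2 <- which(v2 != 0)
```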


Zhou Fang



[R] Scaled MPSE as a test for regressors?

2009-03-23 Thread Zhou Fang
Hi,

This is really more a stats question than a R one, but

Does anyone have any familiarity with using the mean squared prediction
error, scaled by the variance of the response, as a 'scale
free' criterion for evaluating different regression algorithms?

E.g.

Generate X_train, Y_train, X_test, Y_test from true f. X_test/Y_test
are generated without noise, maybe?

Use X_train, Y_train and the algorithm to make \hat{f}

Look at var(Y_test - \hat{f}(X_test))/var(Y_test)

(Some of these var maybe should be replaced with mean squared values instead.)


It seems sort of reasonable to me. You get a number between zero and
one out of it, with 1 the solution for constant fits. Anyone seen
anything like this, or know anything about properties? Has it got a
name?
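
A small sketch of the criterion as described above, using mean squared values
in both numerator and denominator. The true function f, the noise level and the
cubic lm fit are all made-up choices for illustration only.

```r
set.seed(1)
f <- function(x) sin(2 * pi * x)                 # assumed true function

train <- data.frame(x = runif(100))
train$y <- f(train$x) + rnorm(100, sd = 0.3)     # noisy training responses
test <- data.frame(x = runif(1000))
test$y <- f(test$x)                              # noise-free test responses

fit <- lm(y ~ poly(x, 3), data = train)          # stand-in regression algorithm
pred <- predict(fit, newdata = test)

denom <- mean((test$y - mean(test$y))^2)
scaled_mspe <- mean((test$y - pred)^2) / denom

# A constant fit at the test mean scores exactly 1
scaled_const <- mean((test$y - mean(test$y))^2) / denom
```

Computed this way the quantity is one minus an out-of-sample R-squared. It is 0
for a perfect fit and 1 for the constant fit, but a fit that does worse than
the constant will exceed 1.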

Zhou Fang



[R] [R-pkgs] Package 'dnet' for omics data integrative analysis

2014-05-26 Thread Hai Fang
Dear R package developers and users,

I am pleased to announce the official release of our newly developed
package 'dnet', which intends to analyse omics data in terms of network,
evolution and ontology.

It has features:
1. Identification of gene-active networks from high-throughput omics data;
2. Network-based sample classifications and visualisations on 2D sample
landscape;
3. Random Walk with Restart for network affinity calculation;
4. Semantic similarity between ontology terms (and between their annotated
genes);
5. Enrichment analysis using a variety of built-in databases;
6. A wide variety of built-in RData (
http://dnet.r-forge.r-project.org/rdata.html): ontologies (including Gene
Ontology, Disease Ontology, Human Phenotype and Mammalian Phenotype), gene
evolutionary age information and gene association networks in well-studied
organisms, including human, mouse, rat, chicken, c.elegans, fruitfly,
zebrafish and arabidopsis;
7. Support for high-performance parallel computing.

To help it be used widely, we have analysed several real cases with
step-by-step protocols (http://dnet.r-forge.r-project.org/demos.html).
Since it supports many functionalities, we also introduce them as individual
topics in the form of FAQs (http://dnet.r-forge.r-project.org/faqs.html).

Enjoy it!

Hai Fang, Ph.D.
From Prof. Gough's Group (http://bioinformatics.bris.ac.uk)
Department of Computer Science
University of Bristol
Bristol, United Kingdom
hf...@cs.bris.ac.uk
http://www.cs.bris.ac.uk/~hfang



[R] How to make LN(x) transformation in R?

2013-09-27 Thread Xiao Fang
Dear R colleagues,

I am a newbie to R. I can not figure out how to compute Ln(x) value in R.

My question may be so easy for you but I will really appreciate if you can
help me. Thanks so much for your time!
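
For the record, the natural logarithm in R is log(). A short sketch with
made-up input values:

```r
x <- c(1, exp(1), 10)

ln_x <- log(x)             # natural log (base e) is the default
log10_x <- log10(x)        # base 10
log2_x <- log(x, base = 2) # any other base via the 'base' argument

print(ln_x)
```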

Kathy



[R] how to convert a set of strings to a list of unique numeric id?

2010-06-20 Thread G FANG
Hi,

I have been a matlab user and am learning R.

I want to convert a large list of strings to a list of unique numeric
ids to reduce storage space.

For example,

there is a string list (there are duplicates)

ABC
ACCDEDF
ACCGEDF
ACCGEGF
.
ACCDEDF
ACCGEGF

and I want to have a corresponding numeric id list

1
2
3
4

2
4

In matlab, the 'unique' function can do this in addition to giving the
unique set, but in R, 'unique' only gives the unique set.


Please advise me on this.
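
One way to get both pieces in base R is to combine unique() with match(). A
minimal sketch on the example strings above; the variable names are made up.

```r
x <- c("ABC", "ACCDEDF", "ACCGEDF", "ACCGEGF", "ACCDEDF", "ACCGEGF")

u <- unique(x)       # the unique set, in order of first appearance
ids <- match(x, u)   # for each string, its position in the unique set

print(ids)

# Equivalent result via a factor with explicitly ordered levels
ids2 <- as.integer(factor(x, levels = u))
```

The original strings can be recovered with u[ids].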

Thanks,

Gang



[R] how to efficiently compute set unique?

2010-06-21 Thread G FANG
Hi,

I want to get the unique set from a large numeric k by 1 vector, k is
in tens of millions

when I used the matlab function unique, it takes less than 10 secs

but when I tried to use the unique in R with similar CPU and memory,
it is not done in minutes

I am wondering, am I using the function in the right way?

dim(cntxtn)
[1] 13584763        1
uniqueCntxt = unique(cntxtn);    # this is taking really long

Please advise.

Thanks,

Gang



Re: [R] how to efficiently compute set unique?

2010-06-22 Thread G FANG
Hi All,

I think I have figured out what the problem is. I have been a matlab user,
so in all my code I keep everything in as.matrix format, which makes
unique much slower.

I tried not doing the as.matrix conversion, and now it takes just a few
seconds to do unique, as well as other computations.
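
A small sketch of the difference, on a much smaller made-up vector than the one
in this thread. On a matrix, unique() dispatches to the matrix method, which
compares whole rows; dropping the dim attribute first uses the vector method.

```r
set.seed(1)
x <- sample(10, size = 1e5, replace = TRUE)
m <- as.matrix(x)                 # 1e5 x 1 matrix, as in the matlab habit

u_vec <- unique(x)                # vector method
u_mat <- unique(m)                # matrix method: unique rows, returns a matrix
u_fast <- unique(as.vector(m))    # drop the dim attribute, then use the vector method

# system.time(unique(m)); system.time(unique(as.vector(m)))  # compare on real data
```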

Thanks a lot Duncan, Steve, David, and Douglas,

Hopefully, this case can also help future matlab->R users who get
stuck in the matlab thinking style.

Gang


On Mon, Jun 21, 2010 at 7:01 PM, Douglas Bates  wrote:
> On Mon, Jun 21, 2010 at 8:38 PM, David Winsemius  
> wrote:
>>
>> On Jun 21, 2010, at 9:18 PM, Duncan Murdoch wrote:
>>
>>> On 21/06/2010 9:06 PM, G FANG wrote:
>>>>
>>>> Hi,
>>>>
>>>> I want to get the unique set from a large numeric k by 1 vector, k is
>>>> in tens of millions
>>>>
>>>> when I used the matlab function unique, it takes less than 10 secs
>>>>
>>>> but when I tried to use the unique in R with similar CPU and memory,
>>>> it is not done in minutes
>>>>
>>>> I am wondering, am I using the function in the right way?
>>>>
>>>> dim(cntxtn)
>>>> [1] 13584763        1
>>>> uniqueCntxt = unique(cntxtn);    # this is taking really long
>>>
>>> What type is cntxtn?  If I do that sort of thing on a numeric vector, it's
>>> quite fast:
>>>
>>> > x <- sample(10, size=13584763, replace=T)
>>> > system.time(unique(x))
>>>  user  system elapsed
>>>  3.61    0.14    3.75
>>
>> If it's a factor, it could be as simple as:
>>
>> levels(cntxtn)  # since the work of "unique-ification" has already been
>> done.
>
> Not quite.  When you generate a factor, as you do in your example, the
> levels correspond to the unique values of the original vector.  But
> when you take a subset of a factor the levels are preserved intact,
> even if some of those levels do not occur in the subset.  This is why
> there are unusual arguments with names like drop.unused.levels in
> functions like model.frame.  It is also a subtle difference in the
> behavior of factor(x) and as.factor(x) when x is already a factor.
>
>> ff <- factor(sample.int(200, 1000, replace = TRUE))
>> ff1 <- ff[1:40]
>> length(levels(ff))
> [1] 199
>> length(levels(ff1))
> [1] 199
>> length(levels(as.factor(ff1)))
> [1] 199
>> length(levels(factor(ff1)))
> [1] 34
>
>>> x <- factor(sample(10, size=13584763, replace=T))
>>> system.time(levels(x))
>>   user  system elapsed
>>      0       0       0
>>> system.time(y <- levels(x))
>>   user  system elapsed
>>      0       0       0
>
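
The point about levels being preserved when a factor is subset can be checked
directly. A minimal sketch following the quoted example; droplevels() is
available in R 2.12.0 and later.

```r
set.seed(1)
ff <- factor(sample.int(200, 1000, replace = TRUE))
ff1 <- ff[1:40]

n_all <- nlevels(ff)                    # levels of the full factor
n_sub <- nlevels(ff1)                   # unchanged by subsetting
n_used <- nlevels(factor(ff1))          # re-factoring drops unused levels
n_dropped <- nlevels(droplevels(ff1))   # same effect, stated explicitly
```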



[R] how to group a large list of strings into categories based on string similarity?

2010-06-23 Thread G FANG
Hi,

I want to group a large list (20 million) of strings into categories
based on string similarity?

The specific problem is: given a list of DNA sequence as below

ACTCCCGCCGTTCGCGCGCAGCATGATCCTG
ACTCCCGCCGTTCGCGCGC
CAGGATCATGCTGCGCGCGAACGGCGGGAGT
CAGGATCATGCTGCGCGCGAANN
CAGGATCATGCTGCGCGCG
..
.
NNNCCGTTCGCGCGCAGCATGATCCTG
CGCGCGCAGCATGATCCTG
GCGCGCGAACGGCGGGAGT
NNCGCGCAGCATGATCCTG
NNNTGCGCGCGAACGGCGGGAGT
NNTTCGCGCGCAGCATGATCCTG

'N' is the missing letter

It can be seen that some strings are the same except for those N's
(i.e. N can match with any base)

given this list of string, I want to have

1) a vector corresponding to each row (string), for each string assign
an id, such that similar strings (those only differ at N's) have the
same id
2) also get a mapping list from unique strings ('unique' in term of
the same similarity defined above) to the ids

I am a matlab user shifting to R. Please advise on efficient ways to do this.

Thanks!

Gang



Re: [R] how to group a large list of strings into categories based on string similarity?

2010-06-26 Thread G FANG
Hi Martin,

Thanks a lot for your advice.

I tried the process you suggested, as below. It worked, but in a
different way than I planned.

library(Biostrings)
x <- c("ACTCCCGCCGTTCGCGCGCAGCATGATCCTG",
  "ACTCCCGCCGTTCGCGCGC",
  "CAGGATCATGCTGCGCGCGAACGGCGGGAGT",
  "CAGGATCATGCTGCGCGCGAANN",
  "NCAGGATCATGCTGCGCGCGAAN",
  "CAGGATCATGCTGCGCGCG",
  "NNNCAGGATCATGCTGCGCGCGAANNN")
names(x) <- seq_along(x)
dna <- DNAStringSet(x)
while (!all(width(dna) == width(dna <- trimLRPatterns("N", "N", dna)))) {}
names(dna)[order(dna)[rank(dna, ties.method="min")]]

The output is
"1" "2" "3" "4" "4" "6" "4", which is the right answer after trimming the
N's, i.e. which strings are the same when the N's are ignored.

But the match I actually planned is a position-to-position match, i.e. the
1st and 2nd strings are the same except for the N's.

So, the expected output is 1 1 2 2 3 2 4

Please advise.
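
To make the intended rule concrete, here is a base-R sketch of a
position-to-position match. It rests on assumptions not stated in the thread:
a shorter string is compared only over the overlapping prefix, and each string
joins the first earlier group whose first member it is compatible with. The
function names are hypothetical, and this greedy loop is far too slow for 20
million strings; it only pins down the expected output.

```r
# TRUE if a and b agree at every overlapping position, with N as a wildcard
compatible <- function(a, b) {
  n <- min(nchar(a), nchar(b))
  ca <- strsplit(substr(a, 1, n), "")[[1]]
  cb <- strsplit(substr(b, 1, n), "")[[1]]
  all(ca == cb | ca == "N" | cb == "N")
}

# Greedy grouping: compare each string with the first member of each group
group_ids <- function(x) {
  ids <- integer(length(x))
  reps <- character(0)
  for (i in seq_along(x)) {
    hit <- 0L
    for (j in seq_along(reps)) {
      if (compatible(x[i], reps[j])) { hit <- j; break }
    }
    if (hit == 0L) {            # no compatible group yet: start a new one
      reps <- c(reps, x[i])
      hit <- length(reps)
    }
    ids[i] <- hit
  }
  ids
}

x <- c("ACTCCCGCCGTTCGCGCGCAGCATGATCCTG",
       "ACTCCCGCCGTTCGCGCGC",
       "CAGGATCATGCTGCGCGCGAACGGCGGGAGT",
       "CAGGATCATGCTGCGCGCGAANN",
       "NCAGGATCATGCTGCGCGCGAAN",
       "CAGGATCATGCTGCGCGCG",
       "NNNCAGGATCATGCTGCGCGCGAANNN")
ids <- group_ids(x)
print(ids)
```

Note that matching with wildcards is not transitive, so the grouping depends on
the order of the input.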

Thanks!

--gang

On Wed, Jun 23, 2010 at 7:55 PM, Martin Morgan  wrote:
> On 06/23/2010 07:46 PM, Martin Morgan wrote:
>> On 06/23/2010 06:55 PM, G FANG wrote:
>>> Hi,
>>>
>>> I want to group a large list (20 million) of strings into categories
>>> based on string similarity?
>>>
>>> The specific problem is: given a list of DNA sequence as below
>>>
>>> ACTCCCGCCGTTCGCGCGCAGCATGATCCTG
>>> ACTCCCGCCGTTCGCGCGC
>>> CAGGATCATGCTGCGCGCGAACGGCGGGAGT
>>> CAGGATCATGCTGCGCGCGAANN
>>> CAGGATCATGCTGCGCGCG
>>> ..
>>> .
>>> NNNCCGTTCGCGCGCAGCATGATCCTG
>>> CGCGCGCAGCATGATCCTG
>>> GCGCGCGAACGGCGGGAGT
>>> NNCGCGCAGCATGATCCTG
>>> NNNTGCGCGCGAACGGCGGGAGT
>>> NNTTCGCGCGCAGCATGATCCTG
>>>
>>> 'N' is the missing letter
>>>
>>> It can be seen that some strings are the same except for those N's
>>> (i.e. N can match with any base)
>>>
>>> given this list of string, I want to have
>>>
>>> 1) a vector corresponding to each row (string), for each string assign
>>> an id, such that similar strings (those only differ at N's) have the
>>> same id
>>> 2) also get a mapping list from unique strings ('unique' in term of
>>> the same similarity defined above) to the ids
>>>
>>> I am a matlab user shifting to R. Please advice on efficient ways to do 
>>> this.
>>
>> The Bioconductor Biostrings package has many tools for this sort of
>> operation. See http://bioconductor.org/packages/release/Software.html
>>
>> Maybe a one-time install
>>
>>    source('http://bioconductor.org/biocLite.R')
>>    biocLite('Biostrings')
>>
>> then
>>
>>   library(Biostrings)
>>   x <- c("ACTCCCGCCGTTCGCGCGCAGCATGATCCTG",
>>         "ACTCCCGCCGTTCGCGCGC",
>>         "CAGGATCATGCTGCGCGCGAACGGCGGGAGT",
>>         "CAGGATCATGCTGCGCGCGAANN",
>>         "NCAGGATCATGCTGCGCGCGAAN",
>>         "CAGGATCATGCTGCGCGCG",
>>         "NNNCAGGATCATGCTGCGCGCGAANNN")
>>   names(x) <- seq_along(x)
>>   dna <- DNAStringSet(x)
>>   while (!all(width(dna) ==
>>               width(dna <- trimLRPatterns("N", "N", dna)))) {}
>>   names(dna)[rank(dna)]
>
> oops, maybe closer to
>
>   names(dna)[order(dna)[rank(dna, ties.method="min")]]
>
>> although there might be a faster way (e.g., match 8, 4, 2, 1 N's). Also,
>> your sequences likely come from a fasta file (Biostrings::readFASTA) or
>> a text file with a column of sequences (ShortRead::readXStringColumns)
>> or from alignment software (ShortRead::readAligned /
>> ShortRead::readFastq). If you go this route you'll want to address
>> questions to the Bioconductor mailing list
>>
>>   http://bioconductor.org/docs/mailList.html
>>
>> Martin
>>
>>> Thanks!
>>>
>>> Gang
>>>
>>
>>
>
>
> --
> Martin Morgan
> Computational Biology / Fred Hutchinson Cancer Research Center
> 1100 Fairview Ave. N.
> PO Box 19024 Seattle, WA 98109
>
> Location: Arnold Building M1 B861
> Phone: (206) 667-2793
>



[R] Get the indices of non-zero entries of a sparse matrix in R

2010-07-06 Thread G FANG
Hi,

I am trying to get the indices of non-zero entries of a sparse matrix in R

     s    r d
1 1089 3772 1
2 1109  190 1
3 1109 2460 1
4 1109 3071 1
5 1109 3618 1
6 1109   38 1

I found that the following can create a sparse matrix,

library(Matrix)
Y <- sparseMatrix(s,r,x=d)

but I have no idea, and did not find online, how to convert a sparse
matrix back to three columns efficiently, i.e. how to get the indices of
the non-zero entries of a sparse matrix.

In matlab, the find function can do this efficiently. I am wondering
whether there is a similar one in R?

Please advise. Thanks!
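
One option I am aware of in the Matrix package is summary() on the sparse
matrix, which returns the row index, column index and value of each stored
entry. A minimal sketch rebuilding the six entries shown above:

```r
library(Matrix)

s <- c(1089, 1109, 1109, 1109, 1109, 1109)
r <- c(3772, 190, 2460, 3071, 3618, 38)
d <- rep(1, 6)

Y <- sparseMatrix(i = s, j = r, x = d)

trip <- summary(Y)   # data-frame-like object with columns i, j, x
print(trip)
```

The rows come back in column-major order, not in the order the entries were
supplied.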

--gang



[R] what does SEXP in R internals stand for

2010-11-13 Thread John Fang
Hi all,

Is there any one that would give an explanation on the abbreviation SEXP
used in R internals to represent a pointer to a data structure?

Thanks!

Best wishes,
John



Re: [R] what does SEXP in R internals stand for

2010-11-13 Thread John Fang
Thank you!

Best wishes,
John


2010/11/13 Alexx Hardt 

> Am 13.11.2010 14:50, schrieb John Fang:
>
> Hi all,
>>
>> Is there any one that would give an explanation on the abbreviation SEXP
>> used in R internals to represent a pointer to a data structure?
>>
>> Thanks!
>>
>>
> S-Expression, I believe:
> http://en.wikipedia.org/wiki/S-expression
>
> Best wishes,
>  Alex
>
>



[R] msvcr80.dll is missing

2011-04-27 Thread Fang, Yongxiang
Dear All,

I run R on a Windows 7 machine and it has been working very well. I recently
installed Graphviz 2.20.3 and Rgraphviz. However, I cannot load the Rgraphviz
package: an error message pops up in a window titled
"R Console: Rgui.exe - System error":
  The program can't start because MSVCR80.dll is missing from your computer.
  Try reinstalling the program to fix this problem.

When I searched for msvcr80.dll, it appeared in subfolders of Windows/winsxs/,
which is said to be the correct location by:
Martyn Lovell
Development Lead
Visual C++ Libraries
martynl.TakeThisOut@microsoft.com

I wonder why the .dll could not be loaded?
Could anyone provide a solution for the problem?

Thanks

Yongxiang







[R] setMethod does not work in Window 7??

2010-06-03 Thread Fang, Jianwen
I am developing an S4 class but have had trouble making setMethod work
on Windows 7.  I tested an example found in the setMethod manual:

> require(graphics)
> setMethod("plot", signature(x="track", y="missing"),
+   function(x,  y, ...) plot(slot(x, "x"), slot(x, "y"), ...)
+ )

It gave me:

Error in setMethod("plot", signature(x = "track", y = "missing"),
function(x,  :
  unused argument(s) (function(x, y, ...) plot(slot(x, "x"), slot(x,
"y"), ...))

It works perfectly fine in Linux.   Does anybody know why?
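
For comparison, here is a self-contained version of the help-page example,
including the class definition it relies on. The check on setMethod's
environment reflects only a guess at the cause, not something confirmed in this
thread: an "unused argument(s)" error would be consistent with some other
function named setMethod masking the one from the methods package.

```r
library(methods)

# The example needs the "track" class to exist first
setClass("track", representation(x = "numeric", y = "numeric"))

setMethod("plot", signature(x = "track", y = "missing"),
  function(x, y, ...) plot(slot(x, "x"), slot(x, "y"), ...)
)

# Diagnostic: which package does the visible setMethod come from?
set_method_home <- environmentName(environment(setMethod))
print(set_method_home)
```

If that prints anything other than "methods", calling methods::setMethod
explicitly would be worth trying.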

 

Thanks in advance!

JF

 

 




[R] Spatstat - envelope, Thomas process and Cramer-von Mises

2011-02-14 Thread Jeff Fang
Hi all,

I am using "spatstat" to investigate the spatial structure of some plant
populations, and I want to detect these patterns with an IPP and a Thomas
process, based on the pair-correlation function. I know the function
"pcfinhom" is available to characterize the IPP, but I have no idea how to
use the pcf with a Thomas process. Additionally, generating simulation
envelopes using these two null models is another problem for me. The
Cramer-von Mises (CvM) statistic can assess the curve-wise significance of
deviations from null hypotheses, but are there any functions in the spatstat
package that can do this?

I am not very familiar with R, so my problem might be simple.  I hope anyone
who has experience with this kind of work can give me some assistance.

Many thanks


Jeff



[R] How to detect the spatial point pattern with Thomas process based on pcf

2011-02-19 Thread Jeff Fang
Hi all,

I am using "spatstat" to investigate the spatial structure of some plant
populations, but I have no idea how to detect the spatial point pattern
with a Thomas process based on the pcf. Additionally, generating a simulation
envelope using this null model is another problem for me. I am not very
familiar with R, so I hope anyone who has experience with this work can
give me some assistance.

Many thanks


Jeff



[R] Which function in R package "Spatstat" can help me to get the Cramer-von Mises statistic

2011-02-22 Thread Jeff Fang
Hi all,

When I analyse a spatial point pattern, I want to use the Cramer-von Mises
statistic to assess the curve-wise significance of deviations from null
hypotheses. Can anyone tell me which function in the R package "Spatstat"
can do this?

Thanks a lot


Jeff



[R] Problem in simulating envelopes with R package "Spatstat"

2011-02-27 Thread Jeff Fang
Hi all,

While using the R package "Spatstat" to detect the spatial point pattern of
my data, I met a problem.  When I compute simulation envelopes with a
fitted point process model, a Poisson cluster (Thomas) process
(kappa=2.751010e-05; sigma=5.634274e+01; mu=4.943639e+02), I cannot get the
high values of the envelope, which show "NA", but the "mmean"
values computed by averaging the simulated values are available. This result
confused me. Is it a bug in this package or only caused by my data? How can
I solve this issue?


Thanks a lot

#my code is as follow:

>dist = seq(0, 12.5, 0.1)
>species.c = kppm(species, ~ 1, "Thomas")
>species.cenv = envelope(species.c, Kest, r=dist, nsim=999, nrank=5)



Jeff



[R] Question about Chi-squared test

2011-03-04 Thread Jeff Fang
Hi all,

I know a Chi-squared test can be done on frequency data with the R function
"chisq.test()", but I am not sure if it can be applied to percentage
data? An example of my data is as follows:

#

                   KSL    MHL    MWS   CLGC   LYGC
 independent (%) 96.22  92.18  68.54  93.80  85.74

#
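
As a general point, chisq.test() works on counts, so percentages need to be
turned back into counts using the sample size behind each one. A minimal
sketch: the sample sizes n below are invented purely for illustration and would
have to be replaced by the real ones.

```r
pct <- c(KSL = 96.22, MHL = 92.18, MWS = 68.54, CLGC = 93.80, LYGC = 85.74)

# HYPOTHETICAL sample sizes per site; not given in the original post
n <- c(KSL = 500, MHL = 500, MWS = 500, CLGC = 500, LYGC = 500)

indep <- round(pct / 100 * n)                 # counts of "independent"
tab <- rbind(independent = indep, other = n - indep)

res <- chisq.test(tab)                        # test of homogeneity across sites
print(res)
```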


Thanks


Jeff



[R] Question in Chi-squared test, can I do it with percentage data?

2011-03-04 Thread Jeff Fang
Hi all,

I know a Chi-squared test can be done on frequency data with the R function
"chisq.test()", but I am not sure if it can be applied to percentage
data? An example of my data is as follows:

#

                   KSL    MHL    MWS   CLGC   LYGC
 independent (%) 96.22  92.18  68.54  93.80  85.74

#


Thanks


Jeff



Re: [R] What is the most cost effective hardware for R?

2012-05-08 Thread Zhou Fang
How many data points do you have?

--
View this message in context: 
http://r.789695.n4.nabble.com/What-is-the-most-cost-effective-hardware-for-R-tp4617155p4617187.html


[R] Seek() on windows - safe use cases?

2012-05-08 Thread Zhou Fang
So, I'm maintaining someone else's code, which is, as always, a fun thing. One
feature of this code is the use of the 'seek' command.

In ?seek:

  We have found so many errors in the Windows implementation of file
  positioning that users are advised to use it only at their own
  risk, and asked not to waste the R developers' time with bug
  reports on Windows' deficiencies.

So, yeah. I guess my question would be this: are there any 'safe' use cases
of seek? I assume that doing anything unusual with it would be pretty bad,
but in this case, the file input is absolutely predictable, and so seek
seems a lot more convenient than the alternatives. 

Would, in particular, using seek to skip the first N bytes of an
uncompressed text file being read in be consistent and reliable? The
references to seek problems in the dev mailing list seem mostly limited to
compressed files, or reading and writing files at the same time.
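
For concreteness, a minimal sketch of the use case in question (skipping a fixed N-byte prefix of an uncompressed text file), next to a seek-free alternative that reads the prefix with readBin and discards it. The file contents and the names path and n_skip are made up for illustration. Whether seek itself is dependable on Windows is exactly the open question, so this shows the two call patterns and nothing more.

```r
# Hypothetical file: a 7-byte prefix ("HEADER\n") followed by two payload lines.
path <- tempfile(fileext = ".txt")
writeBin(charToRaw("HEADER\npayload line 1\npayload line 2\n"), path)
n_skip <- 7L  # number of leading bytes to skip (assumed known in advance)

# Variant 1: position the connection with seek(), then read the rest.
con <- file(path, open = "rb")
invisible(seek(con, where = n_skip, origin = "start"))
via_seek <- readLines(con)
close(con)

# Variant 2: no seek(); read the prefix as raw bytes and throw it away.
con <- file(path, open = "rb")
invisible(readBin(con, what = "raw", n = n_skip))
via_readbin <- readLines(con)
close(con)
```

The second variant costs one throwaway read of N bytes, which may be acceptable when N is small compared with the file.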

Zhou

--
View this message in context: 
http://r.789695.n4.nabble.com/Seek-on-windows-safe-use-cases-tp4617858.html
Sent from the R help mailing list archive at Nabble.com.



Re: [R] registry vulnerabilities in R

2012-05-10 Thread Zhou Fang
What about using a Portable Apps style packaging of R? That might solve some
of the issues.

--
View this message in context: 
http://r.789695.n4.nabble.com/registry-vulnerabilities-in-R-tp4619217p4623388.html
Sent from the R help mailing list archive at Nabble.com.



Re: [R] Remove a number from a vector

2012-05-11 Thread Zhou Fang
Better yet, remove the which altogether, and it'll run a slight bit faster
and maybe look a little neater.
x <- x[x!="bobo"] 
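
A small sketch of the difference, assuming the earlier suggestion in the thread was of the form x[which(...)] (the earlier message is not shown here, so that is a guess). The vectors x and z are made-up examples.

```r
x <- c("apple", "bobo", "pear", "bobo")

with_which    <- x[which(x != "bobo")]  # index by positions
without_which <- x[x != "bobo"]         # index by the logical vector directly

# The two forms agree as long as x contains no NA.
# With an NA present they differ: which() drops the NA, logical indexing keeps it.
z <- c("apple", NA, "bobo")
z_which   <- z[which(z != "bobo")]      # "apple"
z_logical <- z[z != "bobo"]             # "apple" NA
```

So the shorter form is a drop-in replacement only when missing values are not a concern.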

--
View this message in context: 
http://r.789695.n4.nabble.com/Remove-a-number-from-a-vector-tp851865p4626413.html
Sent from the R help mailing list archive at Nabble.com.



Re: [R] “For” calculation is so slow

2012-05-22 Thread Zhou Fang
For loops are really, really slow in R. In general, you want to avoid them
like the plague. If you absolutely must insist on using them in large,
computationally intense and complex code, consider implementing the relevant
parts in C, say, and calling that from R.

Staying within R, you can probably considerably speed up that code by
storing gx and gy as multi-dimensional arrays (e.g. for sample data,
something like

rawGy = sample( 1:240, 240^2 * 241, replace = TRUE)
rawGx = sample( 1:240, 240^2 * 241, replace = TRUE)
gx = array(rawGx, dim = c(length(s) - 1, 240,  max(rawGx)+1  ) )
gy = array(rawGy, dim = c(length(s) - 1, 240,  max(rawGy)+1 ) )

 ), in which case, you can easily do the computation without loops by

idx = 1:(length(s) - 1)
gxa = gx[ , a, 1] + 1
gya = gy[ , a, 1] + 1
uv = gx[cbind(idx, b, gxa)] / gx[cbind(idx, a, gxa)] -
     gy[cbind(idx, b, gya)] / gy[cbind(idx, a, gya)]

or similar, which will be enormously faster (on my computer, there's an over
30x speed up). With a bit of thought, I'm sure you can also figure out how
to let it vectorise in a, as well...
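
The trick doing the work here is indexing an array with a matrix whose rows are (i, j, k) triples. A toy sketch of just that step, with made-up dimensions and names (arr, j, lay are not from the original code):

```r
set.seed(1)
n <- 5; m <- 4; k <- 3                    # small, made-up dimensions
arr <- array(runif(n * m * k), dim = c(n, m, k))
j   <- 2                                  # fixed second index (plays the role of a or b)
lay <- sample(1:k, n, replace = TRUE)     # per-row third index (plays the role of gxa)

# Loop version: one element per iteration
loop_out <- numeric(n)
for (i in 1:n) loop_out[i] <- arr[i, j, lay[i]]

# Vectorised version: each row of the index matrix is one (i, j, layer) triple
vec_out <- arr[cbind(1:n, j, lay)]
```

cbind() recycles the scalar j, so the index matrix has n rows and three columns, and the result is a plain vector of length n.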

Zhou

--
View this message in context: 
http://r.789695.n4.nabble.com/For-calculation-is-so-slow-tp4630830p4630855.html
Sent from the R help mailing list archive at Nabble.com.



Re: [R] “For” calculation is so slow

2012-05-22 Thread Zhou Fang
I'm not sure what you are trying to prove with that example - the loopless
versions are massively faster, no?

I don't disagree that loops are sometimes unavoidable, and I suppose
sometimes loops can be faster when the non-loop version e.g. breaks your
memory budget, or performs tons of needless computations. But I think
avoiding for loops whenever you can is a good rule of thumb in R coding.

--
View this message in context: 
http://r.789695.n4.nabble.com/For-calculation-is-so-slow-tp4630830p4630897.html
Sent from the R help mailing list archive at Nabble.com.



Re: [R] Could "incomplete final line found" be more serious than a warning?

2012-05-22 Thread Zhou Fang
If you look at the new file in raw mode, you'll see that it's chock full of
ASCII nuls, while the old file has none. This is probably what's giving you
the problems, because R does not allow strings containing embedded nul
characters. (I believe this is because Nul in strings is pretty dangerous in
programming, because they are often used to delimit the end of strings, and
so allowing you to read it in directly can be used for various code
injection exploits.)

To read the new data files, you need some way of dealing with the file as a
raw stream, and stripping out all the nul characters before converting back
to character. Investigate ?readBin...
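
A rough sketch of that approach, on a tiny made-up file (the bytes written below stand in for the real data, which I don't have): read everything as raw, drop the zero bytes, then convert to character.

```r
# Hypothetical input: the bytes of "a\0b\n\0cd\n"
path <- tempfile()
writeBin(as.raw(c(0x61, 0x00, 0x62, 0x0a, 0x00, 0x63, 0x64, 0x0a)), path)

bytes <- readBin(path, what = "raw", n = file.info(path)$size)  # whole file as raw
clean <- bytes[bytes != as.raw(0)]                              # strip the nuls
txt   <- rawToChar(clean)                                       # now safe to convert
lines <- strsplit(txt, "\n", fixed = TRUE)[[1]]                 # split into lines
```

For files too large to hold in memory at once, the same idea would need to be applied in chunks; that is not shown here.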

Zhou

--
View this message in context: 
http://r.789695.n4.nabble.com/Could-incomplete-final-line-found-be-more-serious-than-a-warning-tp4630932p4630944.html
Sent from the R help mailing list archive at Nabble.com.



[R] Looping over subsets

2008-01-19 Thread Z. Fang
Hi,

Possibly a dumb question, but I wonder if anyone can help me with this. 
What I want to do, essentially, is to loop over all ordered subsets of a 
given size of a certain set. Ultimately, the idea is to find the subset 
that maximises a certain value.

The set in question is likely large (the subset size is likely small, 
though), so things like combn don't seem to be a good solution. The biggest 
concern is keeping memory usage sane, but processing time is also fairly 
important.

Any ideas?

Zhou Fang



[R] Inserting blank lines into a file

2008-10-22 Thread Zhou Fang
Hi,

Should be a quickie:

I want to make a datafile in R for plotting in gnuplot (which has
friendlier 3D plotting options, as far as I can tell). So, I want to
create a file with contents along the lines of

#File begins
0 0 10
0 13 10
0.2 2 10

1 0 10.12
1 1 5
1 2 10

2 0 10
2 1 1
2 2 10

It's probably fairly easy to write the space-separated numbers with
write.table, sink, or similar. But what I haven't figured out is how
to get the blank lines between data blocks that I need.

Does anyone know?
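
One possible shape for this, as a sketch only: split the data by a block variable and write each piece to an open connection, followed by an empty line. The helper name write_blocks and the data in dat are made up for illustration.

```r
# Hypothetical helper: write df block by block, with a blank line after each block.
write_blocks <- function(df, block, path) {
  con <- file(path, open = "w")
  on.exit(close(con))
  for (piece in split(df, block)) {
    write.table(piece, con, quote = FALSE, row.names = FALSE, col.names = FALSE)
    writeLines("", con)  # the empty line that ends the block
  }
  invisible(path)
}

dat <- data.frame(x = rep(0:2, each = 3),
                  y = rep(0:2, times = 3),
                  z = c(10, 13, 10, 10.12, 5, 10, 10, 1, 10))
out <- write_blocks(dat, dat$x, tempfile(fileext = ".dat"))
```

Because the connection is opened once and kept open, each write.table call appends to what is already there instead of overwriting it.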

Zhou



[R] Loading workspaces from the command line

2009-01-12 Thread Zhou Fang
Hi,

Is there any way to load workspaces (e.g. stuff from save.image) from
the command line? I'm on Linux, and would find this very helpful.

I'm guessing this functionality can be duplicated with a skillful bash
script to rename the particular file to .RData (and then back once R
terminates), but I'm wondering if there's a better way.

Zhou Fang



Re: [R] Loading workspaces from the command line

2009-01-12 Thread Zhou Fang
That's not really what I meant by 'command line'. I meant, well,
loading from e.g. a bash shell, not from within an interactive R
session itself.

Thanks anyways,

Zhou

(Possibly this email was sent twice. Apologies)

On Mon, Jan 12, 2009 at 12:15 PM, Henrique Dallazuanna  wrote:
> See ?load
>
> On Mon, Jan 12, 2009 at 10:12 AM, Zhou Fang  wrote:
>>
>> Hi,
>>
>> Is there any way to load workspaces (e.g. stuff from save.image) from
>> the command line? I'm on Linux, and would find this very helpful.
>>
>> I'm guessing this functionality can be duplicated with a skillful bash
>> script to rename the particular file to .RData (and then back once R
>> terminates), but I'm wondering if there's a better way.
>>
>> Zhou Fang
>>
>> __
>> R-help@r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide
>> http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>
>
>
> --
> Henrique Dallazuanna
> Curitiba-Paraná-Brasil
> 25° 25' 40" S 49° 16' 22" O
>



Re: [R] Loading workspaces from the command line

2009-01-12 Thread Zhou Fang
Ok, looks like I can do what I want with --args, commandArgs() and an
appropriate .First.

Thanks,

Zhou

On Mon, Jan 12, 2009 at 2:27 PM, David Winsemius  wrote:
> See if this material is helpful:
>
> http://cran.r-project.org/doc/manuals/R-intro.html#Invoking-R-from-the-command-line
>
> -- David Winsemius
>
> On Jan 12, 2009, at 7:24 AM, Zhou Fang wrote:
>
>> That's not really what I meant by 'command line'. I meant, well,
>> loading from e.g. a bash shell, not from within an interactive R
>> session itself.
>>
>> Thanks anyways,
>>
>> Zhou
>>
>> (Possibly this email was sent twice. Apologies)
>>
>> On Mon, Jan 12, 2009 at 12:15 PM, Henrique Dallazuanna 
>> wrote:
>>>
>>> See ?load
>>>
>>> On Mon, Jan 12, 2009 at 10:12 AM, Zhou Fang  wrote:
>>>>
>>>> Hi,
>>>>
>>>> Is there any way to load workspaces (e.g. stuff from save.image) from
>>>> the command line? I'm on Linux, and would find this very helpful.
>>>>
>>>> I'm guessing this functionality can be duplicated with a skillful bash
>>>> script to rename the particular file to .RData (and then back once R
>>>> terminates), but I'm wondering if there's a better way.
>>>>
>>>> Zhou Fang
>>>>
>>>> __
>>>> R-help@r-project.org mailing list
>>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>>> PLEASE do read the posting guide
>>>> http://www.R-project.org/posting-guide.html
>>>> and provide commented, minimal, self-contained, reproducible code.
>>>
>>>
>>>
>>> --
>>> Henrique Dallazuanna
>>> Curitiba-Paraná-Brasil
>>> 25° 25' 40" S 49° 16' 22" O
>>>
>>
>> __
>> R-help@r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide
>> http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>
>



Re: [R] Loading workspaces from the command line

2009-01-12 Thread Zhou Fang
Well, that isn't ideal for my purposes. (A little context - basically
I have a script that I'm running for a lot of simulations, which is
kinda buggy, and what I'm doing is I'm having the script periodically
save whatever it has done so far to an automatically named file. Then
if something odd happens in between two saves, I can run forward from
a previously saved point to find the problem and figure out why it
happened, and also I won't risk losing everything if something
catastrophic happens.)

Anyways, if anyone's interested, in .Rprofile

.First <- function(){
  args <- rev(commandArgs())
  # isTRUE() guards against the NA you get when R is started without extra arguments
  if (isTRUE(args[2] == "ld")){
    load(args[1], .GlobalEnv)
  }
}

Then e.g.

alias Rload='R --args ld'

or make a bash script with

gnome-terminal --command "R --args ld $1"

and set some Open With options, and you'll be able to open R
workspaces from Nautilus etc by point and click.
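
A variant of the same idea, as a sketch: commandArgs(trailingOnly = TRUE) returns only what follows --args, which avoids the rev() bookkeeping. The helper name workspace_from_args is made up, and the "ld" marker is kept only to match the alias above.

```r
# Hypothetical helper: given the arguments after --args, return the workspace
# file to load, or NULL if the "ld" marker is absent.
workspace_from_args <- function(args = commandArgs(trailingOnly = TRUE)) {
  if (length(args) >= 2 && identical(args[1], "ld")) args[2] else NULL
}

.First <- function() {
  ws <- workspace_from_args()
  # Only load when a file was named and it actually exists
  if (!is.null(ws) && file.exists(ws)) load(ws, envir = .GlobalEnv)
}
```

Started as R --args ld sim.RData (file name made up), this would load sim.RData into the global environment; started plainly, it does nothing.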

Zhou


On Mon, Jan 12, 2009 at 3:14 PM, Gabor Grothendieck
 wrote:
> Another possibility is to have a separate directory
> for each project and place an .RData file in each.
> Now just cd to whatever directory corresponds to the
> project you wish to work on and start R normally.
> No code is needed.
>
> On Mon, Jan 12, 2009 at 10:04 AM, Zhou Fang  wrote:
>> Ok, looks like I can do what I want with --args, commandArgs() and an
>> appropiate .First.
>>
>> Thanks,
>>
>> Zhou
>>
>> On Mon, Jan 12, 2009 at 2:27 PM, David Winsemius  
>> wrote:
>>> See if this material is helpful:
>>>
>>> http://cran.r-project.org/doc/manuals/R-intro.html#Invoking-R-from-the-command-line
>>>
>>> -- David Winsemius
>>>
>>> On Jan 12, 2009, at 7:24 AM, Zhou Fang wrote:
>>>
>>>> That's not really what I meant by 'command line'. I meant, well,
>>>> loading from e.g. a bash shell, not from within an interactive R
>>>> session itself.
>>>>
>>>> Thanks anyways,
>>>>
>>>> Zhou
>>>>
>>>> (Possibly this email was sent twice. Apologies)
>>>>
>>>> On Mon, Jan 12, 2009 at 12:15 PM, Henrique Dallazuanna 
>>>> wrote:
>>>>>
>>>>> See ?load
>>>>>
>>>>> On Mon, Jan 12, 2009 at 10:12 AM, Zhou Fang  wrote:
>>>>>>
>>>>>> Hi,
>>>>>>
>>>>>> Is there any way to load workspaces (e.g. stuff from save.image) from
>>>>>> the command line? I'm on Linux, and would find this very helpful.
>>>>>>
>>>>>> I'm guessing this functionality can be duplicated with a skillful bash
>>>>>> script to rename the particular file to .RData (and then back once R
>>>>>> terminates), but I'm wondering if there's a better way.
>>>>>>
>>>>>> Zhou Fang
>>>>>>
>>>>>> __
>>>>>> R-help@r-project.org mailing list
>>>>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>>>>> PLEASE do read the posting guide
>>>>>> http://www.R-project.org/posting-guide.html
>>>>>> and provide commented, minimal, self-contained, reproducible code.
>>>>>
>>>>>
>>>>>
>>>>> --
>>>>> Henrique Dallazuanna
>>>>> Curitiba-Paraná-Brasil
>>>>> 25° 25' 40" S 49° 16' 22" O
>>>>>
>>>>
>>>> __
>>>> R-help@r-project.org mailing list
>>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>>> PLEASE do read the posting guide
>>>> http://www.R-project.org/posting-guide.html
>>>> and provide commented, minimal, self-contained, reproducible code.
>>>
>>>
>>
>> __
>> R-help@r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>>
>



[R] Pausing processing into an interactive session

2009-01-26 Thread Zhou Fang
Hi all,

As a possibly silly request, is it possible to interactively pause an
R calculation and do a browser(), say, without browser or other debug
handlers being explicitly included in the code?

Imagine the following situation:

You write up a big calculation for R to calculate. We are talking
hours here, or worse. A few hours into the calculation, you decide
that you want to check on how it's going. Unfortunately, you didn't
foresee the output you really want to check on. Oops.

What would seem ideal is something like this: as well as Ctrl-C, which
would terminate the current computation, we really want some key combo
perhaps that would pause the computation, perhaps at the next
'reasonable spot'. (Not Ctrl-Z either, as it doesn't let you look at
what's going on in the program). Then you can examine variables, for
example. Maybe even tweak them manually. And press the key to resume
the calculation.

Is this already possible somehow? Can it be made possible? Or would
there not be any point?

Thanks,

Zhou



Re: [R] How to compare two regression line slopes

2009-01-27 Thread Zhou Fang

Hi,

Yes, the two methods are equivalent.

The p-value R calculates is based on the same t-statistic used in your 
manual analysis. You can see this by doing the second method:


y2 = rbind(df1, df2)
y2 = cbind(c(0,0,0,1,1,1), y2)
summary(lm(y2[,3] ~ y2[,1] + y2[,2] + y2[,2]*y2[,1]))

Look at the values you previously calculated and see where they reappear...
print(td)
print(db)
print(sd)

Looked at from the other way, the model with the D's and so on is one 
way to explain where the t-test comes from. Just do H0: b2=0 vs H1: 
b2!=0, and sprinkle some independence and normality assumptions.


It's probably preferable to use the automatic lm based method, because 
then you specify the model explicitly, while with the seemingly recipe 
based approach the actual models and hypotheses you are testing may not 
be clear. Plus you get nice diagnostic statistics and pretty graphs. The 
downside is that you might get lured into complacency...
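
A self-contained sketch of the lm based method on simulated data (the numbers and the names x, g, y are made up; your df1 and df2 are not used here). It shows that the interaction coefficient is the difference between the two separately fitted slopes, and the t statistic on that row is the test of H0: b2 = 0.

```r
set.seed(42)
x <- rep(1:10, times = 2)
g <- rep(c(0, 1), each = 10)              # dummy D: 0 = first group, 1 = second
y <- 2 + 0.5 * x + g * (1 + 0.8 * x) + rnorm(20, sd = 0.3)

fit_all <- lm(y ~ g * x)                  # y = c + D.b0 + b1.x + D.b2.x
slope_1 <- coef(lm(y ~ x, subset = g == 0))["x"]
slope_2 <- coef(lm(y ~ x, subset = g == 1))["x"]

b2     <- coef(fit_all)["g:x"]            # difference in slopes between groups
b2_row <- summary(fit_all)$coefficients["g:x", ]  # estimate, std. error, t, p-value
```

Writing the formula as y ~ g * x expands to g + x + g:x, so the intercept is also allowed to differ between the groups.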


Zhou Fang

PS: Your model equation isn't right. In both, we are also allowing the 
intercept to vary between groups. So really you want

y = c + D.b0 + b1.x + D.b2.x



Re: [R] for/if loop

2009-01-28 Thread Zhou Fang

What are you trying to do with
> for (pp in 1:pp+1){
?

Also, note that 1:rr+1 and 1:(rr+1) mean different things.
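
To see the difference, with rr = 3 as an arbitrary example value:

```r
rr <- 3
shifted  <- 1:rr + 1     # parsed as (1:rr) + 1, i.e. 2 3 4
extended <- 1:(rr + 1)   # the sequence 1 2 3 4
```

The colon operator binds more tightly than +, so the unparenthesised form shifts the whole sequence instead of extending it.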

Zhou



[R] Finding a basis in a set of vectors

2009-02-06 Thread Zhou Fang
Hi,

Okay, I have a n x p matrix X, which I know is not full rank. In
particular, there may be linear dependencies amongst the columns (but
not that many). What is a fast way of finding a linearly independent
subset of the columns of X that will span the column space of X, in R?
If it helps, I have the QR decomposition of the original X 'for free'.

I know that it's possible to do this directly by looping over the
columns and adding them, but at the very least, a solution without
horrible slow loops would be nice.

Any ideas welcome.

Zhou Fang



Re: [R] Finding a basis in a set of vectors

2009-02-06 Thread Zhou Fang
Ah ha, that does work.

What do you mean it isn't robust, though? I mean, obviously linear
dependency structures in general are not stable under small
perturbations...?

Or is it that it's platform dependent?
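
For reference, a sketch of the kind of computation under discussion, using the pivot information that qr() already returns. This is my own illustration on made-up data and I have not checked that it matches what stats:::Thin.col does internally. The rank is decided by a numerical tolerance, which is one way such a selection can be sensitive to small perturbations.

```r
set.seed(7)
a <- rnorm(6); b <- rnorm(6); d <- rnorm(6)
X <- cbind(a, b, ab = a + b, d)        # third column is a + b, so the rank is 3

qrX   <- qr(X)                          # default LINPACK routine, limited pivoting
keep  <- qrX$pivot[seq_len(qrX$rank)]   # columns judged linearly independent
basis <- X[, keep, drop = FALSE]        # spans the same column space as X
```

If the QR decomposition is already available, only the last two lines are needed.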

Zhou

On Fri, Feb 6, 2009 at 2:28 PM, Peter Dalgaard  wrote:
> Zhou Fang wrote:
>> Hi,
>>
>> Okay, I have a n x p matrix X, which I know is not full rank. In
>> particular, there may be linear dependencies amongst the columns (but
>> not that many). What is a fast way of finding a linearly independent
>> subset of the columns of X that will span the column space of X, in R?
>> If it helps, I have the QR decomposition of the original X 'for free'.
>>
>> I know that it's possible to do this directly by looping over the
>> columns and adding them, but at the very least, a solution without
>> horrible slow loops would be nice.
>
> Have a look at stats:::Thin.col(), but beware that it isn't terribly robust.
>
>> Any ideas welcome.
>>
>> Zhou Fang
>>
>> __
>> R-help@r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>
>
> --
>   O__   Peter Dalgaard Øster Farimagsgade 5, Entr.B
>  c/ /'_ --- Dept. of Biostatistics PO Box 2099, 1014 Cph. K
>  (*) \(*) -- University of Copenhagen   Denmark  Ph:  (+45) 35327918
> ~~ - (p.dalga...@biostat.ku.dk)  FAX: (+45) 35327907
>
>



[R] Fast ave for sorted data?

2009-02-15 Thread Zhou Fang

Hi,

This is probably really obvious, but I can't seem to find anything on it.

Is there a fast version of ave for when the data is already sorted in 
terms of the factor, or if the breaks are already known?


Basically, I have:
X = 0.1, 0.2, 0.32, 0.32, 0.4, 0.56, 0.56, 0.7...
Y = 223, 434, 343, 544, 231 etc
of the same, admittedly large length.

Now note that some of the values of X are repeated. What I want to do 
is, for those X that are repeated, take the corresponding values of Y 
and change them to the average for that particular X.


So, ave(Y,X) will work. But it's very slow, and certainly not suited to 
my problem, where Y changes and X stays the same and I need to 
repeatedly recalculate the averaging of Y. Ave also does not take 
advantage of the sorting of the data.


So, is there an alternative? (Presumably avoiding loops.)

Thanks,

Zhou Fang



Re: [R] Fast ave for sorted data?

2009-02-15 Thread Zhou Fang
Thanks! That does exactly what I want. (Heck, maybe this should be
included as a default sorted alternative to ave.)

I was thinking of doing it another way using cumsums, but maybe this
method is faster.
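
For what it's worth, a sketch of the cumsum idea, assuming X is already sorted so that equal values are adjacent. The function name sorted_ave is made up, and the Y values past the fifth are invented to fill out the example. I have not timed it against the suggested method.

```r
# Hypothetical helper: group means for data already sorted by x, via cumsum.
sorted_ave <- function(y, x) {
  n     <- length(x)
  ends  <- c(which(x[-1] != x[-n]), n)   # last index of each run of equal x
  sizes <- diff(c(0, ends))              # run lengths
  sums  <- diff(c(0, cumsum(y)[ends]))   # per-run sums from the running total
  rep(sums / sizes, times = sizes)       # expand the means back to full length
}

X <- c(0.1, 0.2, 0.32, 0.32, 0.4, 0.56, 0.56, 0.7)
Y <- c(223, 434, 343, 544, 231, 100, 200, 50)
Y_avg <- sorted_ave(Y, X)
```

Since X stays the same between calls, ends and sizes could be computed once and reused, leaving only the cumsum and the division to repeat for each new Y.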

Zhou



[R] Bias correction for random forests?

2009-03-11 Thread Zhou Fang

Hi,

Way back in 2004, an update to randomForest added an option 'corr.bias'. 
The explanation was a bit vague, but it turns out it improves RF's 
predictive fit with my data substantially. But I am having trouble 
understanding it.


Does anyone know what this 'bias correction' actually does? Or what the 
justification for it is? Or when it would be necessary? Is there a paper 
I can look at?


And is the feature likely to emerge from 'experimental' any time soon?

Zhou Fang



[R] How to find "p"(proportion) in binomial(n, p)?

2007-09-20 Thread fang liu
Hi,

I have a problem. I am trying to find "p" in a binomial distribution,
X ~ Bin(n, p).
I want to find the value of "p" such that Pr(X <= k) <= alpha.
Here, n, k are known.
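
To make the question concrete, here is a sketch of one reading of it: since Pr(X <= k) decreases as p increases, look for the boundary value of p where Pr(X <= k) equals alpha; every larger p then satisfies the inequality. The values of n, k and alpha below are made up.

```r
n <- 20; k <- 3; alpha <- 0.05   # made-up values for illustration

# Numerical route: solve pbinom(k, n, p) = alpha for p
p_root <- uniroot(function(p) pbinom(k, n, p) - alpha,
                  interval = c(0, 1), tol = 1e-10)$root

# Closed-form route: Pr(X <= k) = 1 - pbeta(p, k + 1, n - k)
p_beta <- qbeta(1 - alpha, k + 1, n - k)
```

Both routes should give the same boundary, up to the tolerance of uniroot.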

Thank you for helping me with this!

Catherine

_
[[replacing trailing spam]]



Re: [R] Time series graphs, question about using zoo

2007-09-20 Thread fang liu
Hi,
Can you tell me what "tail, 1" means in the call to "aggregate"?
I also want to get a similar graph, but my data is not time series data.
Suppose this is my data, called one. I want a graph whose x-axis is just the 
index (1:9).
The graph should plot all the variables A, B, C, D, so there should be 4 lines 
on the graph. For the A line, at each point, the letter A should be on 
the line. And the same goes for the B line.

  A   B   C   D
  8   4   9   8
  7   5   4   7
  6   8   4   4
  3   7   6   2
  5   1   8   5
  6   4   7   1
  2   8   3   4
  1   2   4   8
  4   3   1   9

So I add a name for each row
rownames(one) <- c(1:9)
z <- zooreg(as.matrix(one), start = 1, freq = 1)
z <- aggregate(z, as.Date, tail, 1)
plot(z, plot.type = "single",  type = "o",
pch = c("A", "B", "C", "D"), lty = 1:2)

I get the plot, which I think should be right, but the problem is that 
the x-axis still has months (Jan, ...) on it and I did not get "A,B,C,D" on my 
graph. Is there anything wrong?



>From: "Gabor Grothendieck" <[EMAIL PROTECTED]>
>To: "Bill Pepe" <[EMAIL PROTECTED]>
>CC: r-help@r-project.org
>Subject: Re: [R] Time series graphs
>Date: Thu, 20 Sep 2007 14:15:54 -0400
>
>Using plot.zoo in the zoo package try this:
>
>Lines <- "Bob.A Bob.BTom.ATom.B
>  Jan  84 9 8
>  Feb 7 5 4 7
>  Mar 6 8 4 4
>  Apr 3 7  6 2
>  May5 1 8 5
>  Jun 6 4  71
>  July 2 8 3 4
>  Aug 12  4 8
>  Sep4  3  19
>"
>DF <- read.table(textConnection(Lines))
>
>library(zoo)
>z <- zooreg(as.matrix(DF), start = as.yearmon(as.Date("2007-01-01")), freq 
>= 12)
>z <- aggregate(z, as.Date, tail, 1)
>plot(z, plot.type = "single",  type = "o",
>   pch = c("A", "A", "B", "B"), lty = 1:2)
>legend("bottomleft", c("Bob", "Tom"), lty = 1:2)
>
>
>
>On 9/20/07, Bill Pepe <[EMAIL PROTECTED]> wrote:
> > I'm fairly new to S-Plus and I need to get this done quickly. Suppose I 
>have the following fake data below:
> >
> >  There are two companies, call them Bob and Tom. Each have two 
>variables, call them A and B, that have observations.
> >
> > Bob Tom
> >
> >A BAB
> >  Jan  84 9 8
> >  Feb 7 5 4 7
> >  Mar 6 8 4 4
> >  Apr 3 7  6 2
> >  May5 1 8 5
> >  Jun 6 4  71
> >  July 2 8 3 4
> >  Aug 12  4 8
> >  Sep4  3  19
> >
> >  Here is what I want to do: I want to make two different graphs, one for 
>Bob and one for Tom. For each graph, plot both variables A and B. Connect 
>the A values with a line, and connect the B values with a different type of 
>line. So there should be two lines for each graph. For the A line, at each 
>time point, the letter A should be on the line. And the same goes for the B 
>line. Either R or S-Plus since they are essentially the same.
> >
> >  I'm sure this is easy, but any help would be greatly appreciated.
> >
> >  Thanks,
> >
> >  Bill
> >
> >
> > -
> > Pinpoint customers who are looking for what you sell.
> >[[alternative HTML version deleted]]
> >
> > __
> > R-help@r-project.org mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide 
>http://www.R-project.org/posting-guide.html
> > and provide commented, minimal, self-contained, reproducible code.
> >
>
>__
>R-help@r-project.org mailing list
>https://stat.ethz.ch/mailman/listinfo/r-help
>PLEASE do read the posting guide 
>http://www.R-project.org/posting-guide.html
>and provide commented, minimal, self-contained, reproducible code.

_
[[replacing trailing spam]]


[R] Question on Binom.Confint

2018-09-13 Thread Guo, Fang (Associate)
Hi,

I have a question with the function Binom.Confint(x,n,"method"=lrt). For 
likelihood ratio test, I'd like to ask how you define the upper limit when the 
frequency of successes is zero. Thanks!


Fang Guo
Associate

CORNERSTONE RESEARCH
699 Boylston Street, 5th Floor
Boston, MA 02116-2836
617.927.3042 direct
fa...@cornerstone.com

www.cornerstone.com


***
Warning: This email may contain confidential or privileged information
intended only for the use of the individual or entity to whom it is
addressed. If you are not the intended recipient, please understand 
that any disclosure, copying, distribution, or use of the contents 
of this email is strictly prohibited.
***

[[alternative HTML version deleted]]



Re: [R] [FORGED] Question on Binom.Confint

2018-09-14 Thread Guo, Fang (Associate)
It's method="lrt" and I used the "binom" package.

-Original Message-
From: Rolf Turner [mailto:r.tur...@auckland.ac.nz] 
Sent: Thursday, September 13, 2018 10:02 PM
To: Guo, Fang (Associate) 
Cc: r-help@R-project.org
Subject: Re: [FORGED] [R] Question on Binom.Confint


On 09/14/2018 08:15 AM, Guo, Fang (Associate) wrote:

> Hi,
> 
> I have a question with the function Binom.Confint(x,n,"method"=lrt).
> For likelihood ratio test, I'd like to ask how you define the upper 
> limit when the frequency of successes is zero. Thanks!

Point 1:  This question is inappropriate for this list, since it is about 
statistical theory and not about R syntax and programming.

Point 2: Where did you find the function Binom.Confint()?  I can find no such 
function anywhere.  I did manage to locate a function
binom.confint() (note the lower case "b" and "c") but it does not have an 
argument "method".  Please do not expect those whom you are addressing to be 
telepathic.

Point 3:  Having "method"=lrt in the call is decidedly weird.  Perhaps you 
meant method="lrt"; this is entirely different.

cheers,

Rolf Turner

--
Technical Editor ANZJS
Department of Statistics
University of Auckland
Phone: +64-9-373-7599 ext. 88276


Re: [R] Question on Binom.Confint

2018-09-14 Thread Guo, Fang (Associate)
I used library(binom). 

-Original Message-
From: Bert Gunter [mailto:bgunter.4...@gmail.com] 
Sent: Thursday, September 13, 2018 10:04 PM
To: Guo, Fang (Associate) 
Cc: r-help-requ...@r-project.org; R-help 
Subject: Re: [R] Question on Binom.Confint

In what package?
Binomial confidence interval functions are in several.

Bert Gunter

"The trouble with having an open mind is that people keep coming along and 
sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )

On Thu, Sep 13, 2018 at 6:38 PM Guo, Fang (Associate)  
wrote:
>
> Hi,
>
> I have a question with the function Binom.Confint(x,n,"method"=lrt). For 
> likelihood ratio test, I'd like to ask how you define the upper limit when 
> the frequency of successes is zero. Thanks!
>
>
> Fang Guo
> Associate
>
> CORNERSTONE RESEARCH
> 699 Boylston Street, 5th Floor
> Boston, MA 02116-2836
> 617.927.3042 direct
> fa...@cornerstone.com<mailto:fa...@cornerstone.com>
>
> www.cornerstone.com<http://www.cornerstone.com/>
>
>


Re: [R] Question on Binom.Confint

2018-09-14 Thread Guo, Fang (Associate)
I did use library(binom). However, I was able to use the method "lrt" which is 
short for likelihood ratio test. 
-Original Message-
From: Jim Lemon [mailto:drjimle...@gmail.com] 
Sent: Thursday, September 13, 2018 11:50 PM
To: Guo, Fang (Associate) ; r-help mailing list 

Subject: Re: [R] Question on Binom.Confint

Hi Fang,
Let's assume that you are using the "binom.confint" function in the "binom" 
package and you have made a spelling mistake or two. This function employs nine 
methods for estimating the binomial confidence interval. Sadly, none of these 
is "lrt". The zero condition is discussed in the help page for four of these 
methods. Assuming you want to use another method, you will have to look up the 
method. A good start is:

https://en.wikipedia.org/wiki/Binomial_proportion_confidence_interval

Jim
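For the zero-successes case raised in the question, one textbook way to define a likelihood-ratio upper limit can be sketched as follows. With x = 0 the log-likelihood is n * log(1 - p), which is maximised at p = 0, so the likelihood-ratio statistic is -2 * n * log(1 - p); the interval keeps every p for which that statistic does not exceed the chi-square(1) quantile. This is only an illustration of that definition, not necessarily what any package computes, and the name lrt_upper_zero is made up for the example.

```r
# Upper limit of a likelihood-ratio interval when x = 0 successes in n trials.
# Solves -2 * n * log(1 - p) = qchisq(conf.level, df = 1) for p.
lrt_upper_zero <- function(n, conf.level = 0.95) {
  crit <- qchisq(conf.level, df = 1)
  1 - exp(-crit / (2 * n))
}

upper <- lrt_upper_zero(20)
print(upper)
```

The lower limit in this case is simply 0, since the maximum-likelihood estimate sits on the boundary.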



[R] extract day or month as in Splus

2009-10-23 Thread Fang (Betty) Yang
Dear all,

I am writing to ask for R code that does the same thing as the following
Splus code:

dates <- c("02/27/1992", "02/27/1992", "01/14/1992", "02/28/1992",
"02/01/1992")
timeDate(as.character(dates),in.format="%m/%d/%Y","%a")
[1] Thu Thu Tue Fri Sat

Could anyone give me some R code that produces the same result as above
(extracting the day of the week from dates), please?

Thanks in advance!

Betty
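
One possible base R approach to the question above is to parse the strings with as.Date() and format the result. This is a sketch, not a drop-in replacement for timeDate(); the variable names are chosen for illustration, and the abbreviated day names returned by "%a" depend on the current locale.

```r
dates <- c("02/27/1992", "02/27/1992", "01/14/1992", "02/28/1992",
           "02/01/1992")

# Parse month/day/year strings into Date objects.
d <- as.Date(dates, format = "%m/%d/%Y")

# Abbreviated weekday names (locale-dependent).
day_abbr <- format(d, "%a")

# Locale-independent alternative: ISO weekday number, 1 = Monday.
day_num <- as.integer(format(d, "%u"))

print(day_abbr)
print(day_num)
```

The same pattern extracts other parts of the date, for example format(d, "%m") for the month.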




[R] problems with read.csv

2009-11-02 Thread Fang (Betty) Yang
Dear all,

I'd like to ask for help with R code to get the same results as the
following Splus code:

>indata<-importData("/home/data_new.csv")
>indata[1:5,4]
[1] 0930 1601 1006 1032 1020

I tried the following R code:

> indata<-read.csv("/home/data_new.csv")
> indata[1:5,4]
[1]  930 1601 1006 1032 1020

I'd like the first one to be 0930, too.

Thanks in advance,

Betty
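
The leading zero is lost because read.csv() guesses that the column is numeric. Two possible ways around this are sketched below, using a small inline CSV in place of /home/data_new.csv; the column names (id, date, code, time) are invented for the example.

```r
# Inline stand-in for the real file; column names are hypothetical.
csv_text <- "id,date,code,time
1,2009-11-02,A,0930
2,2009-11-02,B,1601
3,2009-11-02,C,1006"

# Option 1: read the column as character so nothing is stripped.
indata <- read.csv(text = csv_text, colClasses = c(time = "character"))
print(indata[, 4])

# Option 2: read it as a number, then pad to four digits afterwards.
as_num <- read.csv(text = csv_text)
padded <- sprintf("%04d", as_num$time)
print(padded)
```

Option 1 keeps exactly what is in the file; option 2 is only appropriate when every value is meant to be four digits wide.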


[R] keep empty subsets using aggregate

2009-11-24 Thread Fang (Betty) Yang
Dear all,

I am struggling with a small problem. When I use aggregate, the empty subsets
are removed. I need each empty subset to be 0. Any suggestions will be
appreciated.

Code:

edref = aggregate(rep(1,times=dim(eds)[1]),list(eds[,11], eds[,7],
eds[,27]), sum)

Thanks in advance,

Betty
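
Since the call above sums a vector of ones, it is counting rows per group, and one possible alternative is table(), which keeps empty factor combinations as 0. The sketch below uses a tiny made-up data frame with two grouping factors rather than the real eds columns 11, 7 and 27, so all names in it are assumptions.

```r
# Toy data; factor levels include combinations that never occur.
eds <- data.frame(
  region = factor(c("A", "O", "O"), levels = c("A", "O", "S")),
  gender = factor(c("F", "F", "M"), levels = c("F", "M"))
)

# Cross-tabulate and flatten; empty cells appear with Freq = 0.
counts <- as.data.frame(table(Region = eds$region, Gender = eds$gender))
print(counts)
```

This relies on the grouping columns being factors, because the empty cells are derived from the factor levels.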


[R] sum a particular column by group

2010-02-05 Thread Fang (Betty) Yang
Dear all,

I have a table like this:

> eds
  R.ID Region Gender  Agegr  Time nvisits
1    1      A      F 60--64  1:00       1
2    2      O      F 55--59  1:20       1
3    3      O      F 55--59  3:45       3
4    4      S      M 60--64  1:10       3
5    5      W      F 55--59 12:30       1
6    6      W      M 60--64  8:00       2

I got a bootstrap sample using the following code:

> r<-sample(eds[,1],replace=TRUE)
> r
[1] 2 4 3 2 6 4
> beds<-eds[r,]
> beds
    R.ID Region Gender  Agegr Time nvisits
2      2      O      F 55--59 1:20       1
4      4      S      M 60--64 1:10       3
3      3      O      F 55--59 3:45       3
2.1    2      O      F 55--59 1:20       1
6      6      W      M 60--64 8:00       2
4.1    4      S      M 60--64 1:10       3

I want to sum the last column by columns 2, 3, and 4 (including a 0 for
empty groups).  I tried the following code:

#1: only gets the frequency, not the sum of the last column.

> table<-as.data.frame(with(beds,table(beds[,2],beds[,3],beds[,4])))
> table
   Var1 Var2   Var3 Freq
1     A    F 55--59    0
2     O    F 55--59    3
3     S    F 55--59    0
4     W    F 55--59    0
5     A    M 55--59    0
6     O    M 55--59    0
7     S    M 55--59    0
8     W    M 55--59    0
9     A    F 60--64    0
10    O    F 60--64    0
11    S    F 60--64    0
12    W    F 60--64    0
13    A    M 60--64    0
14    O    M 60--64    0
15    S    M 60--64    2
16    W    M 60--64    1

#2: only gets the sum of the last column, but misses the groups with 0 counts.

> aggregate(beds[,6],list(beds[,2],beds[,3],beds[,4]),sum)
  Group.1 Group.2 Group.3 x
1       O       F  55--59 5
2       S       M  60--64 6
3       W       M  60--64 2

In conclusion, the following is what I want:

   Var1 Var2   Var3 Freq
1     A    F 55--59    0
2     O    F 55--59    5
3     S    F 55--59    0
4     W    F 55--59    0
5     A    M 55--59    0
6     O    M 55--59    0
7     S    M 55--59    0
8     W    M 55--59    0
9     A    F 60--64    0
10    O    F 60--64    0
11    S    F 60--64    0
12    W    F 60--64    0
13    A    M 60--64    0
14    O    M 60--64    0
15    S    M 60--64    6
16    W    M 60--64    2

Does anyone know code to do this, or can anyone give a hint? Thank you in
advance.

Betty
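
One possible base R route to the table requested above is xtabs(), which sums the left-hand side of its formula within each cell and reports 0 for empty cells. The sketch below rebuilds eds from the printed table and uses the same resample indices shown in the post; it assumes the grouping columns are factors.

```r
# Rebuild the example data from the printed table.
eds <- data.frame(
  R.ID    = 1:6,
  Region  = factor(c("A", "O", "O", "S", "W", "W")),
  Gender  = factor(c("F", "F", "F", "M", "F", "M")),
  Agegr   = factor(c("60--64", "55--59", "55--59", "60--64", "55--59", "60--64")),
  Time    = c("1:00", "1:20", "3:45", "1:10", "12:30", "8:00"),
  nvisits = c(1, 1, 3, 3, 1, 2)
)

# Same bootstrap indices as in the post (fixed here instead of sampled).
beds <- eds[c(2, 4, 3, 2, 6, 4), ]

# Sum nvisits over every Region x Gender x Agegr cell, empty cells included.
sums <- as.data.frame(xtabs(nvisits ~ Region + Gender + Agegr, data = beds))
print(sums)
```

Because row subsetting keeps unused factor levels, region A still appears in the result even though it was not drawn in this resample.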


Re: [R] sum a particular column by group

2010-02-05 Thread Fang (Betty) Yang
Thanks for your help. Finally, I got it.

 

From: Dennis Murphy [mailto:djmu...@gmail.com] 
Sent: Friday, February 05, 2010 12:20 PM
To: Fang (Betty) Yang
Cc: r-help@r-project.org
Subject: Re: [R] sum a particular column by group

 

Hi:

This is not an elegant solution by any means, but it gets what you want...
using the data frame from your bootstrap sample,

# All combinations of the three factors
xx <- with(beds, expand.grid(Region = levels(Region),
                             Gender = levels(Gender),
                             Agegr = levels(Agegr)))
> dim(xx)
[1] 12  3    # differs from the 16, but bootstrapping probably explains it...
# One way to get a summary (there are others...)
library(plyr)
yy <- ddply(beds, .(Region, Gender, Agegr), summarise,
            Nvisits = sum(nvisits))
res <- merge(xx, yy, all.x = TRUE)
res <- within(res, Nvisits[is.na(Nvisits)] <- 0)
> res
   Region Gender  Agegr Nvisits
1       O      F 55--59       5
2       O      F 60--64       0
3       O      M 55--59       0
4       O      M 60--64       0
5       S      F 55--59       0
6       S      F 60--64       0
7       S      M 55--59       0
8       S      M 60--64       6
9       W      F 55--59       0
10      W      F 60--64       0
11      W      M 55--59       0
12      W      M 60--64       2


HTH,
Dennis
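
The same expand-then-merge idea can also be sketched without plyr, using aggregate() from base R. The data frame below is typed in from the bootstrap sample in the thread, with Region limited to the three levels that appear in the 12-row result; that restriction is an assumption made to match the output shown.

```r
beds <- data.frame(
  Region  = factor(c("O", "S", "O", "O", "W", "S"), levels = c("O", "S", "W")),
  Gender  = factor(c("F", "M", "F", "F", "M", "M")),
  Agegr   = factor(c("55--59", "60--64", "55--59", "55--59", "60--64", "60--64")),
  nvisits = c(1, 3, 3, 1, 2, 3)
)

# Every combination of the factor levels, observed or not.
grid <- expand.grid(Region = levels(beds$Region),
                    Gender = levels(beds$Gender),
                    Agegr  = levels(beds$Agegr))

# Sums for the combinations that do occur.
sums <- aggregate(nvisits ~ Region + Gender + Agegr, data = beds, FUN = sum)

# Left join onto the full grid, then turn the missing sums into zeros.
res <- merge(grid, sums, all.x = TRUE)
res$nvisits[is.na(res$nvisits)] <- 0
print(res)
```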



[R] using Statistics::R

2010-12-11 Thread Yi-Fang Liu
Hi all,

I want to use R from Perl. The Statistics::R module is great, but I have run
into a problem: I don't know how to pass an array from Perl to R. Can anyone
show me how?
Any suggestions will be appreciated.

Thank you!

