Thanks, Moshe! I apologize for not being clearer about the second part. The data are shown again below. The patterns in columns s1 and s2, with their counts, are:

(-1 -1)  (-1 0)   (-1 1)   (0 -1)   (0 0)    (0 1)    (1 -1)   (1 0)    (1 1)
104      131      57       631      305      668      33       15       107

There are 9 patterns, that is, the 9 combinations of -1, 0 and 1 given in parentheses. The number of rows with each pattern is shown underneath.

What I would like is this: scan the data from the beginning, and whenever consecutive rows have the same pattern (one of the 9 combinations above), record the following information:

  - the number in the 'chr' column,
  - the number in the 'pos' column for the first of the consecutive rows,
  - the number in the 'pos' column for the last of the consecutive rows,
  - how many consecutive rows there are,
  - the pattern they share.

I forgot to stress one requirement in my definition of consecutive rows: they must be adjacent in the data and must all have the same 'chr' value. To illustrate, take the data:

BAC          chr  pos        s1  s2
RP11-80G24   1    77465510    0   0
RP11-198H14  1    78696291   -1   0
RP11-267M21  1    79681704   -1   0
RP11-89A19   1    80950808   -1   0
RP11-6B16    1    82255496   -1   0
RP11-210E16  2    228801510  -1   0

Rows 2-6 all have the same pattern, (-1 0), and are adjacent, but row 6 has a different 'chr' value from the others. Therefore row 6 is not counted, and we end up with:

chr  Start     End       #of_rows  pattern
1    78696291  82255496  4         (-1 0)

I hope this is clear. Thank you once again, and Merry X'mas!

Best,
Allen

> BAC          chr  pos        s1  s2
> RP11-80G24   1    77465510   -1   0
> RP11-198H14  1    78696291   -1   0
> RP11-267M21  1    79681704   -1   0
> RP11-89A19   1    80950808   -1   0
> RP11-6B16    1    82255496   -1   0
> RP11-210E16  1    228801510   0  -1
> RP11-155C15  1    230957584   0  -1
> RP11-210F8   1    237932418   0  -1
> RP11-263L17  2    65724492    0   1
> RP11-340F16  2    65879898    0   1
> RP11-68A1    2    67718674    0   0
> RP11-474G23  2    68318411    0   0
> RP11-218N6   2    68454651    0   0
> CTD-2003M22  2    68567494    0   0
> .....
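
The grouping described above could be sketched in base R as follows. This is only a sketch under stated assumptions: the data frame is called M as in the thread and is already in the intended row order; summarize_runs is a hypothetical helper name; n_rows stands in for the '#of_rows' column; and dropping runs of length 1 is one reading of the request, not something confirmed in the thread.

```r
# Six-row example copied from the message above.
M <- data.frame(
  BAC = c("RP11-80G24", "RP11-198H14", "RP11-267M21", "RP11-89A19",
          "RP11-6B16", "RP11-210E16"),
  chr = c(1, 1, 1, 1, 1, 2),
  pos = c(77465510, 78696291, 79681704, 80950808, 82255496, 228801510),
  s1  = c(0, -1, -1, -1, -1, -1),
  s2  = c(0, 0, 0, 0, 0, 0),
  stringsAsFactors = FALSE
)

# Hypothetical helper: one output row per run of adjacent rows
# that share the same chr, s1 and s2.
summarize_runs <- function(M) {
  key   <- paste(M$chr, M$s1, M$s2)   # a run breaks when chr, s1 or s2 changes
  runs  <- rle(key)                   # run lengths over adjacent equal keys
  last  <- cumsum(runs$lengths)       # row index of the last row in each run
  first <- last - runs$lengths + 1    # row index of the first row in each run
  data.frame(
    chr     = M$chr[first],
    Start   = M$pos[first],
    End     = M$pos[last],
    n_rows  = runs$lengths,
    pattern = paste0("(", M$s1[first], " ", M$s2[first], ")"),
    stringsAsFactors = FALSE
  )
}

res <- summarize_runs(M)
out <- res[res$n_rows > 1, ]   # assumption: single-row runs are not reported
print(out)
```

On the six-row example this leaves a single run: chr 1, Start 78696291, End 82255496, 4 rows, pattern (-1 0). Including chr in the key is what stops a run at a chromosome boundary even when s1 and s2 stay the same.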
> On Dec 24, 2007 3:54 AM, Moshe Olshansky <[EMAIL PROTECTED]> wrote:
> To answer your first question try
>
> M[-which( M$s1 == 0 & M$s2 == 0),]
>
> For the second question, you must start with the more
> precise definition of the grouping criterion.
>
> --- affy snp <[EMAIL PROTECTED]> wrote:
>
> > Hello list,
> >
> > I have a data frame M like:
> >
> > BAC          chr  pos        s1  s2
> > RP11-80G24   1    77465510   -1   0
> > RP11-198H14  1    78696291   -1   0
> > RP11-267M21  1    79681704   -1   0
> > RP11-89A19   1    80950808   -1   0
> > RP11-6B16    1    82255496   -1   0
> > RP11-210E16  1    228801510   0  -1
> > RP11-155C15  1    230957584   0  -1
> > RP11-210F8   1    237932418   0  -1
> > RP11-263L17  2    65724492    0   1
> > RP11-340F16  2    65879898    0   1
> > RP11-68A1    2    67718674    0   0
> > RP11-474G23  2    68318411    0   0
> > RP11-218N6   2    68454651    0   0
> > CTD-2003M22  2    68567494    0   0
> > .....
> >
> > how to remove those rows which have 0 for both of
> > columns s1,s2?
> > sth like M[!M$21=0&!M$s2=0]?
> >
> > Moreover, I want to get a list which could find a
> > subset of rows which have
> > the same pattern of data. For example, the first 8
> > rows in M can be
> > clustered
> > into 2 groups (represented below in 2 rows) and
> > shown as:
> >
> > chr  Start      End        # of rows  Pattern
> > 1    77465510   82255496   5          (-1 0)
> > 1    228801510  237932418  3          (0 -1)
> >
> > Can anybody help me out of this? Thank you very much
> > and happy holiday!
> >
> > Best,
> > Allen
> >
> > [[alternative HTML version deleted]]
> >
> > ______________________________________________
> > R-help@r-project.org mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide
> > http://www.R-project.org/posting-guide.html
> > and provide commented, minimal, self-contained,
> > reproducible code.
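
For the first question in the quoted thread (dropping rows where s1 and s2 are both 0), here is a small sketch using a logical index instead of -which(). The four rows are taken from the data above; both_zero, kept, M2, n_which and n_logical are illustrative names only. The logical form gives the same result as the -which() form whenever at least one row matches, and it also behaves sensibly when no row matches.

```r
# Four rows taken from the data in the thread (BAC column omitted for brevity).
M <- data.frame(
  chr = c(1, 1, 2, 2),
  pos = c(77465510, 228801510, 67718674, 68318411),
  s1  = c(-1, 0, 0, 0),
  s2  = c(0, -1, 0, 0)
)

both_zero <- M$s1 == 0 & M$s2 == 0   # TRUE for rows to drop
kept <- M[!both_zero, ]              # keep every row that is not (0 0)
print(kept)

# Edge case: a subset with no (0 0) rows at all.
M2 <- M[1:2, ]
# -which() yields an empty index here, so this selects zero rows.
n_which   <- nrow(M2[-which(M2$s1 == 0 & M2$s2 == 0), ])
# The logical index keeps both rows, as intended.
n_logical <- nrow(M2[!(M2$s1 == 0 & M2$s2 == 0), ])
print(c(n_which, n_logical))
```

The difference only shows up when nothing matches: which() then returns an empty integer vector, negating it is still empty, and indexing by an empty vector selects no rows.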