On Jul 9, 2011, at 12:45 PM, Bansal, Vikas wrote:

Dear sir,

I was using different code, which is why you did not get the output I described. Please use this code on the summary file:

I have a file, summary.txt (attached). We can read this file using:

dfa <- read.table("summary.txt", fill = TRUE, colClasses = "character", header = TRUE)

The V10 column holds ASCII characters, which I converted into decimal
numbers using this code:

dfa$V10 <- sapply(dfa$V10, function(a) paste(as.integer(charToRaw(a)), collapse = ' '))

Now you will get this output:

dfa
   V7 V8  V9      V10
1    0  1   G       96
snipped
26   0  1   C       95
27   0  1   A       88
28   0  1   g       96
29   0  2  GG    92 92
30   0  2  GG    91 94
31   0  2  AT    89 94
32   0  2  GG    96 93

The values in column V10 correspond to the letters A, C, G and T in column V9. I want
only those whose score is more than 90, so the output for the above should be:
V7 V8  V9      V10
1    0  1   G       96

snipped the easy lines
29   0  2  GG    92 92
30   0  2  GG    91 94
31   0  2  T       89 94
32   0  2  GG    96 93

So in the output the 15th and 27th rows should be deleted, and the 31st row should be:

31   0  2  T    89 94

because 89 is the score for A and 94 is the score for T. A is therefore deleted because its score is less than 90.

At the moment I have a version of dfa that has the original V10 and another column named 'value' in the fifth position. Since apply removes attributes and names, functions written to work with an apply function need to refer to positions:

dfa$newcol <-
  apply(dfa, 1, function(x) {
    # scores from the fifth column ('value') as a numeric vector
    vals <- as.numeric(strsplit(x[5], " ")[[1]])
    # letters from the third column (V9), one per score
    bases <- strsplit(x[3], "")[[1]]
    # keep the qualifying letters and paste them into a single
    # character string so they will fit back into a dataframe
    paste(bases[vals >= 90], collapse = ", ")
  })

Here's the middle of that dataframe:
> dfa
    V7 V8  V9  V10    value  newcol
snipped
25   0  1   A    a       97       A
26   0  1   C    _       95       C
27   0  1   A    X       88
28   0  1   g    `       96       g
29   0  2  GG \\\\    92 92    G, G
30   0  2  GG   [^    91 94    G, G
31   0  2  AA   Y^    89 94       A
32   0  2  GG   `]    96 93    G, G
33   0  2  AA   a^    97 94    A, A
34   0  2  GG   ]^    93 94    G, G
35   0  2  AA  a\\    97 92    A, A
36   0  2  GG   a]    97 93    G, G
37   0  2  GG   Z]    90 93    G, G
38   0  2  GG   ]^    93 94    G, G
39   0  2  CC  W\\    87 92       C
40   0  2  CC   a]    97 93    C, C
41   0  2  TT   ``    96 96    T, T
42   0  2  GG  a\\    97 92    G, G
43   0  2  GG   ``    96 96    G, G
44   0  2  aa   aa    97 97    a, a
45   0  2  AA   a^    97 94    A, A
46   0  2  CC   b`    98 96    C, C
47   0  2  AA  _\\    95 92    A, A
48   0  2  CC   ]`    93 96    C, C
49   0  2  TT  ^\\    94 92    T, T
50   0  2  CC   Z`    90 96    C, C
51   0  2  Ac   `a    96 97    A, c
52   0  3 AAA  b`a 98 96 97 A, A, A
53   0  3 GGG  aa] 97 97 93 G, G, G
54   0  3 AAA  `[_ 96 91 95 A, A, A
55   0  3 CCC  a`_ 97 96 95 C, C, C
56   0  3 TTT  _]^ 95 93 94 T, T, T
57   0  3 CCC  aaa 97 97 97 C, C, C
58   0  3 CCC  ^a` 94 97 96 C, C, C
59   0  3 CCC  _`` 95 96 96 C, C, C
60   0  3 AAA  Z`] 90 96 93 A, A, A
61   0  3 CCC \\a] 92 97 93 C, C, C
62   0  2  GG   `_    96 95    G, G
63   0  2  GG   `Y    96 89       G
64   0  2  AA   a]    97 93    A, A
65   0  1   G    Z       90       G
66   0  1   G    _       95       G
67   0  1   T    ^       94       T
68   0  1   T    ^       94       T
69   0  1   C    Y       89
70   0  1   A   \\       92       A
71   0  1   G    ]       93       G

snipped
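
The request was also to delete rows where no letter qualifies (row 27 above has an empty newcol). A minimal sketch of that last step on made-up rows, so it can be run without the attached file. The data frame `toy` and the helper `keep_letters` are names invented for this illustration; it uses mapply() over the two columns by name instead of apply() by position, with the same >= 90 cutoff:

```r
# Made-up rows for illustration (not read from summary.txt)
toy <- data.frame(V9    = c("A", "g", "AT", "GG"),
                  value = c("88", "96", "89 94", "96 93"),
                  stringsAsFactors = FALSE)

# Hypothetical helper: keep the letters whose score is at least 'cutoff'
keep_letters <- function(bases, scores, cutoff = 90) {
  vals <- as.numeric(strsplit(scores, " ")[[1]])
  paste(strsplit(bases, "")[[1]][vals >= cutoff], collapse = ", ")
}

toy$newcol <- unname(mapply(keep_letters, toy$V9, toy$value))

# nzchar() is FALSE for "", so this drops rows where no letter qualified
kept <- toy[nzchar(toy$newcol), ]
kept$newcol   # "g" "T" "G, G"
```

Whether a score of exactly 90 should pass depends on whether "more than 90" is meant strictly; change `>=` to `>` in the helper if so.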

--
David.




Can you help me, please?

Thanking you,
Warm Regards
Vikas Bansal
Msc Bioinformatics
Kings College London
________________________________________
From: David Winsemius [dwinsem...@comcast.net]
Sent: Saturday, July 09, 2011 12:04 AM
To: Bansal, Vikas
Cc: r-help@r-project.org
Subject: Re: [R] For column values-Quality control

On Jul 8, 2011, at 6:46 PM, Bansal, Vikas wrote:

Yes sir, you are right. After this I use this code to convert the ASCII
characters in column V10 to decimal numbers:

dfa$V10=lapply(dfa[,4], function(c) as.numeric(charToRaw(c)))

Now you will get output something like this:

V7 V8  V9            V10
 0  1   G             82
 0  1 CGT  c(90, 92, 96)
 0  1  GA      c(78, 92)
 0  1 GAG  c(90, 92, 92)
 0  1   G             88
 0  1   A             96
 0  1 ATT  c(90, 96, 92)
 0  1   T             94
 0  1   C             97

Now, after this, I am facing the problem:


I don't think so. Here's what I get as the top of dfa after that
operation:
str(dfa)
'data.frame':   111 obs. of  4 variables:
 $ V7 : chr  "0" "0" "0" "0" ...
 $ V8 : chr  "1" "1" "1" "1" ...
 $ V9 : chr  "G" "T" "C" "A" ...
 $ V10:List of 111
  ..$ : num 96
  ..$ : num 97
  ..$ : num 97
  ..$ : num 97
  ..$ : num 95
  ..$ : num 90
  ..$ : num 94
  ..$ : num 92
  ..$ : num 90
  ..$ : num 97
  ..$ : num 94
  ..$ : num 92
  ..$ : num 95
  ..$ : num 97
  ..$ : num 88
  ..$ : num 96
  ..$ : num 97
  ..$ : num 95
  ..$ : num 97
  ..$ : num 97
  ..$ : num 97
  ..$ : num 97
  ..$ : num 97
  ..$ : num 97
  ..$ : num 97
  ..$ : num 95
  ..$ : num 88
  ..$ : num 96
  ..$ : num  92 92
  ..$ : num  91 94
  ..$ : num  89 94
.... more follows; the output was truncated

I say again: read the Posting Guide and use dump() or dput().
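
For readers unfamiliar with dput(), a minimal sketch of what is being asked for. The two-row data frame `dfa_small` is made up for this illustration and only stands in for the real dfa:

```r
# Made-up two-row data frame standing in for the real dfa
dfa_small <- data.frame(V7 = c("0", "0"), V8 = c("1", "2"),
                        V9 = c("G", "AT"), V10 = c("`", "Y^"),
                        stringsAsFactors = FALSE)

# dput() prints R code that rebuilds the object; paste that output
# into the email instead of the printed table
dput(dfa_small)

# Round trip: capture that text and rebuild the object from it
txt <- paste(capture.output(dput(dfa_small)), collapse = "\n")
rebuilt <- eval(parse(text = txt))
identical(rebuilt, dfa_small)
```

Unlike a printed table, the dput() text carries the column types (character, numeric, list) with it, so the recipient gets the same structure the sender has. For a large object, something like dput(head(dfa, 20)) keeps the posting short.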

--
David.


The values in column V10 correspond to the letters A, C, G and T in column V9. I want
only those whose score is more than 91, so the output for the above should be:

V7 V8  V9            V10
 0  1  GT  c(90, 92, 96)
 0  1   A      c(78, 92)
 0  1  AG  c(90, 92, 92)
 0  1   A             96
 0  1  TT  c(90, 96, 92)
 0  1   T             94
 0  1   C             97

The first row should be deleted because it contains 82, which is less
than 91. In the second row, C should be deleted because its score in
column V10 is less than 91.
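
A minimal sketch of the filtering described above, assuming V10 is a list column of numeric vectors (which is what the lapply() call produces). The data frame `toy` is made up to mirror the first three example rows; it is not read from summary.txt:

```r
# Made-up rows mirroring the example above
toy <- data.frame(V9 = c("G", "CGT", "GA"), stringsAsFactors = FALSE)
toy$V10 <- list(82, c(90, 92, 96), c(78, 92))   # list column, as lapply() gives

# For each row keep the letters whose score exceeds 91
keep <- Map(function(bases, scores) strsplit(bases, "")[[1]][scores > 91],
            toy$V9, toy$V10)
toy$V9 <- vapply(keep, paste, character(1), collapse = "", USE.NAMES = FALSE)

# Drop rows where no letter passed the cutoff
toy <- toy[nzchar(toy$V9), ]
toy$V9   # "GT" "A"
```

As in the expected output above, the scores in V10 are left untouched; only V9 is trimmed and the all-failing row is removed.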


Thanking you,
Warm Regards
Vikas Bansal
Msc Bioinformatics
Kings College London
________________________________________
From: David Winsemius [dwinsem...@comcast.net]
Sent: Friday, July 08, 2011 11:37 PM
To: Bansal, Vikas
Cc: r-help@r-project.org
Subject: Re: [R] For column values-Quality control

I get something entirely different when I execute that input command
with the attached file:

This is what I see as the first 14 lines for a displayed value for
dfa:

dfa
   V7 V8  V9  V10
1    0  1   G    `
2    0  1   T    a
3    0  1   C    a
4    0  1   A    a
5    0  1   G    _
6    0  1   G    Z
7    0  1   C    ^
8    0  1   C   \\
9    0  1   A    Z
10   0  1   T    a
11   0  1   g    ^
12   0  1   A   \\
13   0  1   C    _
14   0  1   G    a

If this is different than what you see when you type dfa after input
of that file in that manner then you should consider alternative
methods of communicating an unambiguous representation of your dfa
object.... as I have detailed in prior private messages.
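
For reference, the characters shown in V10 above map to decimal scores through their ASCII codes. A minimal sketch using base R's charToRaw(); the helper name `ascii_codes` is made up for this illustration:

```r
# Hypothetical helper: one decimal code per character in the string
ascii_codes <- function(s) as.integer(charToRaw(s))

ascii_codes("`")    # 96
ascii_codes("Z")    # 90
ascii_codes("Y^")   # 89 94
ascii_codes("\\")   # 92 (a single backslash; R prints it as "\\")
```

The last case explains why rows 8 and 12 look two characters wide in the printed table: each holds one backslash, which print() shows escaped.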

--

David.

On Jul 8, 2011, at 6:10 PM, Bansal, Vikas wrote:


Dear all,

I am really sorry for not providing the input file; in my earlier mail I
did not explain my problem well.

I have a file, summary.txt (attached). We can read this file using:

dfa <- read.table("summary.txt", fill = TRUE, colClasses = "character", header = TRUE)

The V10 column holds ASCII characters, which I converted into decimal
numbers using this code:

dfa$V10=lapply(dfa[,4], function(c) as.numeric(charToRaw(c)))

Now I have a dataframe dfa with columns something like this:

V7 V8  V9            V10
 0  1   G             82
 0  1 CGT  c(90, 92, 96)
 0  1  GA      c(78, 92)
 0  1 GAG  c(90, 92, 92)
 0  1   G             88
 0  1   A             96
 0  1 ATT  c(90, 96, 92)
 0  1   T             94
 0  1   C             97

The values in column V10 correspond to the letters A, C, G and T in column V9. I want
only those whose score is more than 91, so the output for the above should be:

V7 V8  V9            V10
 0  1  GT  c(90, 92, 96)
 0  1   A      c(78, 92)
 0  1  AG  c(90, 92, 92)
 0  1   A             96
 0  1  TT  c(90, 96, 92)
 0  1   T             94
 0  1   C             97

Can you please tell me the solution.

Thanking you,
Warm Regards
Vikas Bansal
Msc Bioinformatics
Kings College London

<summary.txt>
______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

David Winsemius, MD
West Hartford, CT
