On Jul 9, 2011, at 12:45 PM, Bansal, Vikas wrote:
Dear sir,
I was using different code, which is why you did not get the output
I described. Please use this code on the summary file.
I have a file, summary.txt (attached). We can read
this file using:
dfa=read.table("summary.txt",fill=TRUE,colClasses = "character",header=TRUE)
The V10 column holds ASCII characters, which I converted into their decimal
codes using this code:
dfa$V10 <- sapply(dfa$V10, function(a)
    paste(as.integer(charToRaw(a)), collapse = ' '))
Now you will get this output:
dfa
V7 V8 V9 V10
1 0 1 G 96
snipped
26 0 1 C 95
27 0 1 A 88
28 0 1 g 96
29 0 2 GG 92 92
30 0 2 GG 91 94
31 0 2 AT 89 94
32 0 2 GG 96 93
The values in column V10 correspond to the letters (A, C, G, T) in column V9. I want
only those letters whose score is more than 90, so the output of the above should be:
V7 V8 V9 V10
1 0 1 G 96
snipped the easy lines
29 0 2 GG 92 92
30 0 2 GG 91 94
31 0 2 T 89 94
32 0 2 GG 96 93
So in the output the 15th and 27th rows should be deleted, and the 31st row should
be:
31 0 2 T 89 94
because 89 is the score for A and 94 is the score for T. A has therefore been
deleted because its score is less than 90.
At the moment I have a version of dfa that has the original V10 and
another column named 'value' in the fifth position. Since apply()
coerces the data frame to a character matrix and hands each row to the
function as a plain vector, functions written to work with apply() can
simply refer to positions:
dfa$newcol <-
  apply(dfa, 1, function(x) {
    # numeric scores from the space-separated 'value' string (fifth column)
    vals <- as.numeric(strsplit(x[5], " ")[[1]])
    # individual letters of V9 (third column)
    bases <- strsplit(x[3], "")[[1]]
    # keep the qualifying letters and paste them into a single character
    # string so they will fit back into a dataframe
    paste(bases[vals >= 90], collapse = ", ")
  })
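For anyone without the attached file, the same steps can be tried on a few made-up rows. This is only a sketch under stated assumptions: the data frame `toy` and its four rows are invented for illustration (they are not the contents of summary.txt), and `keep_good` is a hypothetical helper name. It decodes each quality string with charToRaw(), keeps the letters of V9 whose score is at least 90, and drops rows where no letter survives.

```r
# Toy rows standing in for summary.txt (made up for illustration)
toy <- data.frame(V7  = c("0", "0", "0", "0"),
                  V8  = c("1", "1", "2", "2"),
                  V9  = c("G", "A", "AT", "GG"),
                  V10 = c("`", "X", "Y^", "`]"),
                  stringsAsFactors = FALSE)

# Decimal ASCII codes of the quality characters, one integer vector per row
scores <- lapply(toy$V10, function(q) as.integer(charToRaw(q)))

# keep_good (hypothetical helper): letters of `bases` whose score is >= cutoff
keep_good <- function(bases, score, cutoff = 90) {
  paste(strsplit(bases, "")[[1]][score >= cutoff], collapse = "")
}

# Human-readable scores, and the filtered letters, as ordinary character columns
toy$value  <- sapply(scores, paste, collapse = " ")
toy$newcol <- mapply(keep_good, toy$V9, scores, USE.NAMES = FALSE)

# Drop rows in which no letter reached the cutoff
result <- toy[nzchar(toy$newcol), ]
print(result)
```

Using mapply() over V9 and the score list sidesteps the positional indexing that apply() needs. The letters are pasted with collapse = "" here so that a partly filtered row reads "T" rather than "T, "; change the separator if a delimited form is preferred.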
Here's the middle of that dataframe:
> dfa
V7 V8 V9 V10 value newcol
snipped
25 0 1 A a 97 A
26 0 1 C _ 95 C
27 0 1 A X 88
28 0 1 g ` 96 g
29 0 2 GG \\\\ 92 92 G, G
30 0 2 GG [^ 91 94 G, G
31 0 2 AA Y^ 89 94 A
32 0 2 GG `] 96 93 G, G
33 0 2 AA a^ 97 94 A, A
34 0 2 GG ]^ 93 94 G, G
35 0 2 AA a\\ 97 92 A, A
36 0 2 GG a] 97 93 G, G
37 0 2 GG Z] 90 93 G, G
38 0 2 GG ]^ 93 94 G, G
39 0 2 CC W\\ 87 92 C
40 0 2 CC a] 97 93 C, C
41 0 2 TT `` 96 96 T, T
42 0 2 GG a\\ 97 92 G, G
43 0 2 GG `` 96 96 G, G
44 0 2 aa aa 97 97 a, a
45 0 2 AA a^ 97 94 A, A
46 0 2 CC b` 98 96 C, C
47 0 2 AA _\\ 95 92 A, A
48 0 2 CC ]` 93 96 C, C
49 0 2 TT ^\\ 94 92 T, T
50 0 2 CC Z` 90 96 C, C
51 0 2 Ac `a 96 97 A, c
52 0 3 AAA b`a 98 96 97 A, A, A
53 0 3 GGG aa] 97 97 93 G, G, G
54 0 3 AAA `[_ 96 91 95 A, A, A
55 0 3 CCC a`_ 97 96 95 C, C, C
56 0 3 TTT _]^ 95 93 94 T, T, T
57 0 3 CCC aaa 97 97 97 C, C, C
58 0 3 CCC ^a` 94 97 96 C, C, C
59 0 3 CCC _`` 95 96 96 C, C, C
60 0 3 AAA Z`] 90 96 93 A, A, A
61 0 3 CCC \\a] 92 97 93 C, C, C
62 0 2 GG `_ 96 95 G, G
63 0 2 GG `Y 96 89 G
64 0 2 AA a] 97 93 A, A
65 0 1 G Z 90 G
66 0 1 G _ 95 G
67 0 1 T ^ 94 T
68 0 1 T ^ 94 T
69 0 1 C Y 89
70 0 1 A \\ 92 A
71 0 1 G ] 93 G
snipped
--
David.
Can you help me please.
Thanking you,
Warm Regards
Vikas Bansal
Msc Bioinformatics
Kings College London
________________________________________
From: David Winsemius [dwinsem...@comcast.net]
Sent: Saturday, July 09, 2011 12:04 AM
To: Bansal, Vikas
Cc: r-help@r-project.org
Subject: Re: [R] For column values-Quality control
On Jul 8, 2011, at 6:46 PM, Bansal, Vikas wrote:
Yes sir, you are right. After this I use this code to convert the ASCII
characters in column V10 to decimal numbers:
dfa$V10=lapply(dfa[,4], function(c) as.numeric(charToRaw(c)))
Now you will get output something like this:
V7 V8  V9           V10
 0  1   G            82
 0  1 CGT c(90, 92, 96)
 0  1  GA     c(78, 92)
 0  1 GAG c(90, 92, 92)
 0  1   G            88
 0  1   A            96
 0  1 ATT c(90, 96, 92)
 0  1   T            94
 0  1   C            97
now after this I am facing the problem-
I don't think so. Here's what I get as the top of dfa after that
operation:
str(dfa)
'data.frame': 111 obs. of 4 variables:
$ V7 : chr "0" "0" "0" "0" ...
$ V8 : chr "1" "1" "1" "1" ...
$ V9 : chr "G" "T" "C" "A" ...
$ V10:List of 111
..$ : num 96
..$ : num 97
..$ : num 97
..$ : num 97
..$ : num 95
..$ : num 90
..$ : num 94
..$ : num 92
..$ : num 90
..$ : num 97
..$ : num 94
..$ : num 92
..$ : num 95
..$ : num 97
..$ : num 88
..$ : num 96
..$ : num 97
..$ : num 95
..$ : num 97
..$ : num 97
..$ : num 97
..$ : num 97
..$ : num 97
..$ : num 97
..$ : num 97
..$ : num 95
..$ : num 88
..$ : num 96
..$ : num 92 92
..$ : num 91 94
..$ : num 89 94
... more follows; output was terminated
I say again: read the Posting Guide and use dump() or dput().
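As a hedged illustration of that advice: dput() writes an object out as R code that a reader can paste straight back into a session, so the column types survive the trip through email. The two-row data frame `small` below is a made-up stand-in, not the real dfa.

```r
# Made-up stand-in for a couple of rows of dfa
small <- data.frame(V9  = c("G", "GG"),
                    V10 = c("`", "\\\\"),   # "\\\\" is two backslash characters
                    stringsAsFactors = FALSE)

# Print a text representation that can be pasted into an email
dput(small)

# Round trip: capture that text and rebuild the object from it
txt     <- paste(capture.output(dput(small)), collapse = "\n")
rebuilt <- eval(parse(text = txt))

# The rebuilt object should match the original
stopifnot(isTRUE(all.equal(small, rebuilt)))
```

For a large object, something like dput(head(dfa, 20)) keeps the posted text short while still being unambiguous.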
--
David.
The values in column V10 correspond to the letters (A, C, G, T) in column V9. I want
only those letters whose score is more than 91, so the output of the above should be:
V7 V8 V9           V10
 0  1 GT c(90, 92, 96)
 0  1  A     c(78, 92)
 0  1 AG c(90, 92, 92)
 0  1  A            96
 0  1 TT c(90, 96, 92)
 0  1  T            94
 0  1  C            97
The first row should be deleted because it contains 82, which is less
than 91. In the second row C should be deleted because it has a score below 91
in column V10.
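The filtering described here can also be done while V10 is still a list column, by walking V9 and V10 in parallel. A minimal sketch, assuming the cutoff of "more than 91" from this message; the data frame `ex` retypes the nine example rows shown in the message, and the names `ex`, `keep` and `out` are made up for illustration. Unlike the desired output quoted above, this version also drops the failing scores from V10 so that letters and scores stay aligned.

```r
# The nine example rows from the message, with V10 as a list column
ex <- data.frame(V7 = rep("0", 9), V8 = rep("1", 9),
                 V9 = c("G", "CGT", "GA", "GAG", "G", "A", "ATT", "T", "C"),
                 stringsAsFactors = FALSE)
ex$V10 <- list(82, c(90, 92, 96), c(78, 92), c(90, 92, 92), 88,
               96, c(90, 96, 92), 94, 97)

cutoff <- 91  # "more than 91", as stated in the message

# One logical vector per row: TRUE where the score passes the cutoff
keep <- lapply(ex$V10, function(s) s > cutoff)

# Filter letters and scores with the same mask so they stay aligned
ex$V9  <- mapply(function(b, k) paste(strsplit(b, "")[[1]][k], collapse = ""),
                 ex$V9, keep, USE.NAMES = FALSE)
ex$V10 <- Map(function(s, k) s[k], ex$V10, keep)

# Drop rows where no letter passed
out <- ex[nzchar(ex$V9), ]
out$V9
```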
Thanking you,
Warm Regards
Vikas Bansal
Msc Bioinformatics
Kings College London
________________________________________
From: David Winsemius [dwinsem...@comcast.net]
Sent: Friday, July 08, 2011 11:37 PM
To: Bansal, Vikas
Cc: r-help@r-project.org
Subject: Re: [R] For column values-Quality control
I get something entirely different when I execute that input command
with the attached file:
This is what I see as the first 14 lines for a displayed value for
dfa:
dfa
V7 V8 V9 V10
1 0 1 G `
2 0 1 T a
3 0 1 C a
4 0 1 A a
5 0 1 G _
6 0 1 G Z
7 0 1 C ^
8 0 1 C \\
9 0 1 A Z
10 0 1 T a
11 0 1 g ^
12 0 1 A \\
13 0 1 C _
14 0 1 G a
If this is different than what you see when you type dfa after input
of that file in that manner then you should consider alternative
methods of communicating an unambiguous representation of your dfa
object.... as I have detailed in prior private messages.
--
David.
On Jul 8, 2011, at 6:10 PM, Bansal, Vikas wrote:
Dear all,
I am really sorry for not giving the input file and for not explaining
my problem well in my mail.
I have a file, summary.txt (attached). We can read
this file using:
dfa=read.table("summary.txt",fill=TRUE,colClasses = "character",header=TRUE)
The V10 column holds ASCII characters, which I converted into decimal
numbers using this code:
dfa$V10=lapply(dfa[,4], function(c) as.numeric(charToRaw(c)))
Now I have a dataframe dfa with columns something like this:
V7 V8  V9           V10
 0  1   G            82
 0  1 CGT c(90, 92, 96)
 0  1  GA     c(78, 92)
 0  1 GAG c(90, 92, 92)
 0  1   G            88
 0  1   A            96
 0  1 ATT c(90, 96, 92)
 0  1   T            94
 0  1   C            97
The values in column V10 correspond to the letters (A, C, G, T) in column V9. I want
only those letters whose score is more than 91, so the output of the above should be:
V7 V8 V9           V10
 0  1 GT c(90, 92, 96)
 0  1  A     c(78, 92)
 0  1 AG c(90, 92, 92)
 0  1  A            96
 0  1 TT c(90, 96, 92)
 0  1  T            94
 0  1  C            97
Can you please tell me the solution.
Thanking you,
Warm Regards
Vikas Bansal
Msc Bioinformatics
Kings College London
<summary.txt>
______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
David Winsemius, MD
West Hartford, CT