I used two rows to test the data frame, as follows.

> dat <- read.table("TOV_43_Protein_Clusters_abundance1.tab", header=TRUE,sep = "\t")
> dat1 <- dat[1:2,]
> str(dat1)
'data.frame': 2 obs. of 44 variables:
 $ X      : Factor w/ 1075762 levels "","POV_Cluster_1000001",..: 305266 625028
 $ X109DCM: Factor w/ 46 levels "","1","10","109DCM",..: 1 1
 $ X109SUR: Factor w/ 41 levels "","1","10","109SUR",..: 1 1
 $ X18DCM : Factor w/ 31 levels "","1","10","11",..: 1 1
 $ X18SUR : Factor w/ 25 levels "","1","10","11",..: 1 1
 $ X22SUR : Factor w/ 50 levels "","1","10","11",..: 1 2
 $ X23DCM : Factor w/ 46 levels "","1","10","11",..: 1 1
 $ X25DCM : Factor w/ 42 levels "","1","10","11",..: 1 1
 $ X25SUR : Factor w/ 47 levels "","1","10","11",..: 1 1
 $ X30DCM : Factor w/ 34 levels "","1","10","11",..: 1 1
 $ X31SUR : Factor w/ 43 levels "","1","10","11",..: 1 1
 $ X32DCM : Factor w/ 15 levels "","1","10","11",..: 1 1
 $ X32SUR : Factor w/ 58 levels "","1","10","11",..: 1 1
 $ X34DCM : Factor w/ 53 levels "","1","10","11",..: 1 35
 $ X34SUR : Factor w/ 47 levels "","1","10","11",..: 10 14
 $ X36DCM : Factor w/ 48 levels "","1","10","11",..: 2 43
 $ X36SUR : Factor w/ 45 levels "","1","10","11",..: 23 38
 $ X38DCM : Factor w/ 40 levels "","1","10","11",..: 3 23
 $ X38SUR : Factor w/ 44 levels "","1","10","11",..: 7 41
 $ X39DCM : Factor w/ 38 levels "","1","10","11",..: 34 38
 $ X39SUR : Factor w/ 40 levels "","1","10","11",..: 13 40
 $ X41DCM : Factor w/ 47 levels "","1","10","11",..: 13 40
 $ X41SUR : Factor w/ 40 levels "","1","10","11",..: 1 1
 $ X42DCM : Factor w/ 48 levels "","1","10","11",..: 2 3
 $ X42SUR : Factor w/ 41 levels "","1","10","11",..: 2 1
 $ X46SUR : Factor w/ 31 levels "","1","10","11",..: 2 2
 $ X52DCM : Factor w/ 49 levels "","1","10","11",..: 13 23
 $ X64DCM : Factor w/ 35 levels "","1","10","11",..: 1 2
 $ X64SUR : Factor w/ 36 levels "","1","10","11",..: 1 1
 $ X65DCM : Factor w/ 38 levels "","1","10","11",..: 1 1
 $ X65SUR : Factor w/ 35 levels "","1","10","11",..: 1 1
 $ X66DCM : Factor w/ 27 levels "","1","10","11",..: 1 1
 $ X66SUR : Factor w/ 35 levels "","1","10","11",..: 1 1
 $ X67SUR : Factor w/ 38 levels "","1","10","11",..: 1 1
 $ X68DCM : Factor w/ 33 levels "","1","10","11",..: 1 1
 $ X68SUR : Factor w/ 36 levels "","1","10","11",..: 1 1
 $ X70MES : Factor w/ 23 levels "","1","10","11",..: 1 1
 $ X70SUR : Factor w/ 37 levels "","1","10","11",..: 1 1
 $ X72DCM : Factor w/ 40 levels "","1","10","11",..: 13 27
 $ X72SUR : Factor w/ 38 levels "","1","10","11",..: 1 1
 $ X76DCM : Factor w/ 44 levels "","1","10","11",..: 1 1
 $ X76SUR : Factor w/ 34 levels "","1","10","11",..: 1 1
 $ X82DCM : Factor w/ 29 levels "","1","10","11",..: 1 1
 $ X85DCM : Factor w/ 30 levels "","1","10","11",..: 1 1

Thank you!!
Dawn

On Tue, Jul 14, 2015 at 3:48 PM, Jeff Newmiller <jdnew...@dcn.davis.ca.us> wrote:

> I suspect your data frame "dat" has non-numeric data in some of the
> columns that have ABC in their names. Any column of a data frame can be
> numeric or not, but the data frame as a unit cannot be numeric. If your
> data file has odd characters in some of the otherwise-numeric columns, the
> whole column will be read in as a factor or character strings. Look at the
> output of str(dat) for columns that don't show "num". If you can find the
> column, and then one of the bad rows, you can use a text editor to fix them
> manually, or show us examples of the bad data and we can suggest ways to
> fix it in R.
> ---------------------------------------------------------------------------
> Jeff Newmiller
> DCN: <jdnew...@dcn.davis.ca.us>
> Research Engineer (Solar/Batteries/Software/Embedded Controllers)
> ---------------------------------------------------------------------------
> Sent from my phone. Please excuse my brevity.
>
> On July 14, 2015 2:35:38 PM PDT, Dawn <dawn1...@gmail.com> wrote:
> >Hi,
> >
> >I used a small set of data (several columns and rows) and it works fine
> >using the following command:
> >abc <- rowSums(test[,grep("ABC",names(test),fixed=T)],na.rm=T)
> >
> >But when I used the real big data table, "Error in rowSums(dat[,
> >grep("ABC", names(dat), fixed = T)], na.rm = T) :
> > 'x' must be numeric"
> >Then it didn't work either using as.numeric():
> >> as.numeric(dat)
> >Error: (list) object cannot be coerced to type 'double'
> >
> >Thanks!
> >Dawn
> >
> >
> >On Fri, Jul 10, 2015 at 4:35 PM, Dawn <dawn1...@gmail.com> wrote:
> >
> >> Thank you all and sorry for the data messing. It has worked!
> >>
> >> Best,
> >> Dawn
> >>
> >> On Fri, Jul 10, 2015 at 4:15 AM, Jim Lemon <drjimle...@gmail.com> wrote:
> >>
> >>> Hi Dawn,
> >>> Your data are a bit messed up, but try the following:
> >>>
> >>> colSums(dat[,grep("ABC",names(dat),fixed=TRUE)],na.rm=TRUE)
> >>> colSums(dat[,grep("XYZ",names(dat),fixed=TRUE)],na.rm=TRUE)
> >>>
> >>> I'm assuming that you want to discard the NA values.
> >>>
> >>> Jim
> >>>
> >>> On Fri, Jul 10, 2015 at 6:52 AM, Rui Barradas <ruipbarra...@sapo.pt> wrote:
> >>> > Hello,
> >>> >
> >>> > Please use ?dput to give a data example, like this it's completely
> >>> > unreadable. If your data.frame is named 'dat' use
> >>> >
> >>> > dput(head(dat, 30)) # paste the output of this in your mail
> >>> >
> >>> > And don't post in html, use plain text only, like the posting guide says.
> >>> >
> >>> > Rui Barradas
> >>> >
> >>> > On 09-07-2015 18:12, Dawn wrote:
> >>> >>
> >>> >> Hi,
> >>> >>
> >>> >> I have a big dataframe as follows
> >>> >>
> >>> >> 109ABC 109XYZ 18ABC 18XYZ 22XYZ 23ABC 25ABC 25XYZ 30ABC 31XYZ 32ABC
> >>> >> 32XYZ 34DCM 34XYZ 36ABC 36SUR 38DCM 38XYZ 39DCM 39SUR 41DCM 41SUR
> >>> >> 42DCM 42SUR 46SUR 52DCM 64ABC 64XYZ 65ABC 65XYZ 66ABC 66XYZ 67XYZ
> >>> >> 68ABC 68SUR 70MES 70SUR 72ABC 72XYZ 76ABC 76XYZ 82ABC 85ABC POV
> >>> >> Cluster_1 17 1 3 10 14 5 2 2 1 1 1 2 2 TT:61
> >>> >> Cluster_2 1 4 20 6 5 3 6 9 9 6 10 1 3 1 4 TT:88
> >>> >> Cluster_3 3 3 6 4 17 17 18 13 17 19 22 11 5 21 8 5 18 4 7 9 TT:227
> >>> >> ........
> >>> >>
> >>> >> I want to get two columns, i.e., one is to sum columns for all including
> >>> >> ABC for each row and the other is to sum columns for all including XYZ
> >>> >> for each row.
> >>> >>
> >>> >> Is there some help? Thank you!
> >>> >> Dawn
> >>> >>
> >>> >> [[alternative HTML version deleted]]
> >>> >>
> >>> >> ______________________________________________
> >>> >> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> >>> >> https://stat.ethz.ch/mailman/listinfo/r-help
> >>> >> PLEASE do read the posting guide
> >>> >> http://www.R-project.org/posting-guide.html
> >>> >> and provide commented, minimal, self-contained, reproducible code.
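
A minimal sketch of how Jeff's suggestion could be followed up in R, given
that the str() output at the top of this message shows every column as a
Factor. The data frame below is a made-up toy stand-in for the real table
(its values are hypothetical, apart from echoing the "109DCM" level seen in
the str() output), and the "DCM"/"SUR" patterns are taken from the column
names shown there. Note that calling as.numeric() directly on a factor
returns its integer codes, so the conversion goes through as.character()
first. With R 4.0.0 or later the same columns would arrive as character
rather than factor, and the same steps should still apply.

```r
# Toy stand-in for the real table (hypothetical values, for illustration only).
dat <- data.frame(
  X       = c("Cluster_1", "Cluster_2", "Cluster_3"),
  X109DCM = c("1", "", "109DCM"),  # a stray header-like string among the counts
  X18DCM  = c("3", "4", "5"),
  X18SUR  = c("2", "", "7"),
  stringsAsFactors = TRUE          # mimic the Factor columns from str(dat1)
)

# 1. List the columns that were not read in as numeric.
not_num <- names(dat)[!vapply(dat, is.numeric, logical(1))]

# 2. For one suspect column, show the non-empty values that fail to convert.
vals <- as.character(dat$X109DCM)
bad  <- vals[vals != "" & is.na(suppressWarnings(as.numeric(vals)))]

# 3. Convert every count column via character; unparseable entries become NA.
count_cols <- setdiff(names(dat), "X")
dat[count_cols] <- lapply(
  dat[count_cols],
  function(x) suppressWarnings(as.numeric(as.character(x)))
)

# 4. Row sums over the columns whose names contain each label.
dcm <- rowSums(dat[, grep("DCM", names(dat), fixed = TRUE), drop = FALSE],
               na.rm = TRUE)
sur <- rowSums(dat[, grep("SUR", names(dat), fixed = TRUE), drop = FALSE],
               na.rm = TRUE)
```

Step 2 is worth running before step 3 on the real data: turning the bad
entries into NA silently hides whatever put them in the file, and the
offending rows may need to be fixed or dropped instead.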