On 04-Nov-09 21:09:42, Mark W. Miller wrote:
> I have a list of scientific names in a data set. I would like
> to split the names into genus, species and subspecies.
> Not all names include a subspecies. Could someone show me how
> to do this?
>
> My example code is:
> a <- matrix(c('genusA speciesA', 10,
>               'genusB speciesAA', 20,
>               'genusC speciesAAA subspeciesA', 15,
>               'genusC speciesAAA subspeciesB', 25), nrow=4, byrow=TRUE)
> aa <- data.frame(a)
> colnames(aa) <- c('species', 'counts')
> aa
>
> # The code returns
>                         species counts
> 1               genusA speciesA     10
> 2              genusB speciesAA     20
> 3 genusC speciesAAA subspeciesA     15
> 4 genusC speciesAAA subspeciesB     25
>
> # I would like there to be 4 columns as below
> genus   species     subspecies     counts
> genusA  speciesA    no.subspecies  10
> genusB  speciesAA   no.subspecies  20
> genusC  speciesAAA  subspeciesA    15
> genusC  speciesAAA  subspeciesB    25
>
> I have tried using 'strsplit', but cannot get the desired result.
> Thank you for any help with this.
>
> Mark Miller
> Gainesville, Florida

The following seems to work for your example. However, others can
probably propose a less clumsy version (but at least this one breaks
it down into its elements):

  a <- matrix(c('genusA speciesA', 10,
                'genusB speciesAA', 20,
                'genusC speciesAAA subspeciesA', 15,
                'genusC speciesAAA subspeciesB', 25), nrow=4, byrow=TRUE)
  a
  #      [,1]                            [,2]
  # [1,] "genusA speciesA"               "10"
  # [2,] "genusB speciesAA"              "20"
  # [3,] "genusC speciesAAA subspeciesA" "15"
  # [4,] "genusC speciesAAA subspeciesB" "25"

  A <- NULL
  for (i in seq_len(nrow(a))) {
    # Split the name on one or more spaces
    Names <- unlist(strsplit(a[i,1], "[ ]+"))
    # Pad two-part names with a placeholder subspecies
    if (length(Names) == 2) Names <- c(Names, "no.subspecies")
    A <- rbind(A, c(Names, a[i,2]))
  }
  colnames(A) <- c("Genus","Species","Subspecies","Count")
  # Keep the columns as character: if Count became a factor,
  # as.numeric() would return the level codes (1, 3, 2, 4)
  # rather than the counts themselves.
  A <- as.data.frame(A, stringsAsFactors=FALSE)
  A$Count <- as.numeric(A$Count)
  A
  #    Genus    Species    Subspecies Count
  # 1 genusA   speciesA no.subspecies    10
  # 2 genusB  speciesAA no.subspecies    20
  # 3 genusC speciesAAA   subspeciesA    15
  # 4 genusC speciesAAA   subspeciesB    25

Hoping this helps!
Ted.

--------------------------------------------------------------------
E-Mail: (Ted Harding) <ted.hard...@manchester.ac.uk>
Fax-to-email: +44 (0)870 094 0861
Date: 04-Nov-09                                       Time: 21:37:03
------------------------------ XFMail ------------------------------
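
For what it's worth, the loop above can probably be avoided. The following is only a minimal sketch of one vectorised alternative, assuming the names are always separated by spaces and have at most three parts. It starts from a data frame rather than a matrix so that the counts stay numeric throughout. The helper `pad3` is a hypothetical name introduced here for illustration and is not part of base R.

```r
# Same example data, held in a data frame so the counts are numeric from the start.
aa <- data.frame(
  species = c("genusA speciesA",
              "genusB speciesAA",
              "genusC speciesAAA subspeciesA",
              "genusC speciesAAA subspeciesB"),
  counts = c(10, 20, 15, 25),
  stringsAsFactors = FALSE
)

# Split every name on runs of spaces; this gives a list of character vectors.
parts <- strsplit(aa$species, "[ ]+")

# Hypothetical helper: pad a vector to exactly three elements with the
# placeholder, and drop anything beyond the third element.
pad3 <- function(x) {
  c(x, rep("no.subspecies", max(0, 3 - length(x))))[1:3]
}

# vapply() returns a 3 x n character matrix; transpose so each row is one name.
m <- t(vapply(parts, pad3, character(3)))

result <- data.frame(Genus      = m[, 1],
                     Species    = m[, 2],
                     Subspecies = m[, 3],
                     Count      = aa$counts,
                     stringsAsFactors = FALSE)
result
```

Note that a name with only a genus would have the placeholder in the Species column as well, so this sketch would need adjusting if such names occur in the real data.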