It does sound like there could be a problem with the merging process. I have two questions about your merge command:

  chaffmerge2<-merge(chaff, chafffat, by.x=c("RINGNO", "FAT", "FATMTD"), by.y=c("RINGNO", "FAT", "FATMTD"), all=T)

1. What is the reason for matching on "FAT" and "FATMTD"? From your description of the data, I assume that "RINGNO" is the individual identifier. I'd have thought matching on that alone would be appropriate.

2. What happens if you omit the "all=T" argument? In particular, how does the size of the merged dataset compare to the inputs?

--
James Reilly
Department of Statistics, University of Auckland
Private Bag 92019, Auckland, New Zealand

On 27/11/07 2:31 PM, Katherine Jones wrote:
> Hi,
>
> This is probably a case where someone has to see what is happening on
> my computer and it is complicated by my data being from SPSS (not my
> choice). It is quite hard to give my data, because it is such a large
> dataset. I have analysed 9 other datasets that work fine, but this
> particular dataset was inputted wrong so requires merging of two
> datasets. This may be the problem.
>
> Example of data:-
> File 1.
> [1] Individual [2] Habitat type [3] Weight
> File 2.
> [1] Individual [2] Fat [3] Fat method.
>
> I merge the two files to create:-
> [1] Individual [2] Habitat type [3] Weight [4] Fat [5] Fat method
>
> My merging appears to work in the sense that I can plot Weight versus
> Fat and I get data, but if I ask to see the data file I see a sea of
> "NAs". So I'm not sure how there can be data there to plot, see
> levels for and create tables for but I can't see it as a dataframe. I
> do get the plot I want.
>
> Fat method contains either blank cells, " B" or " E".
>
> I wish to select all the rows in columns 1-4 which contain an " E" in
> Fat method.
>
> e.g.
> 120, 3, 20.2, 4, E
> 121, 4, 20.0, 5, B
> 132, 3, 21.2, 4,
>
> I want to select only the row containing " E", so I can plot Fat vs
> Habitat and Weight vs. Fat.
>
> I have been doing this by using
>
> selectE<-Data[Fatmethod==" E",].
>
> However, this does not work. It removes all of my data in the other
> columns to "NA" and I am left only with fatmethod and fat scores.
>
> It is odd it works with other datasets but not this one. Although
> with my other datasets when I ask to select " E", I can still see
> " B" using levels(Fat method) but there is no data there, so my plots
> are correct.
>
> Sorry this is long. I'm having difficulty explaining it.
>
> Katherine
>
>
> On 26-Nov-07, at 5:09 PM, jim holtman wrote:
>
>> That should give you back a subset of 'data' (with all its columns),
>> for those with " E" in 'column'. Can you show an example of your data
>> and what the desired output would be. The posting guide asks "provide
>> commented, minimal, self-contained, reproducible code" so we don't
>> have to speculate on what you want.
>>
>> On Nov 26, 2007 5:04 PM, Katherine Jones
>> <[EMAIL PROTECTED]> wrote:
>>> This sort of works. It does select the E data, but unfortunately
>>> it doesn't select the data from the other columns; I want to select
>>> data across about 5 columns by the factor " E" in one of the
>>> columns. It should be easy, but for some reason it is not working.
>>> The spaces being added don't help.
>>>
>>> It seems to work on my non-merged data files, although the merged
>>> file contains all the data I need.
>>>
>>> Thanks for the subset command though. Hadn't thought of using that.
>>>
>>>
>>>
>>> On 26-Nov-07, at 4:46 PM, jim holtman wrote:
>>> ?subset
>>>
>>>
>>> subset(data, column == " E")
>>>
>>
>>
>> --
>> Jim Holtman
>> Cincinnati, OH
>> +1 513 646 9390
>>
>> What is the problem you are trying to solve?
>
> ______________________________________________
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
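
P.S. To make the two merge questions at the top concrete, here is a minimal sketch on made-up data. Only the column names RINGNO, FAT and FATMTD come from the merge call; the values, the HABITAT and WEIGHT columns, and especially the assumption that chaff carries empty FAT and FATMTD columns are hypothetical. Under that assumption no three-key combination matches, so all=TRUE pads every row with NA, whereas matching on RINGNO alone gives one complete row per individual.

```r
# Hypothetical toy data. ASSUMPTION: 'chaff' has FAT and FATMTD columns
# that were never filled in, which is why the real values live in 'chafffat'.
chaff <- data.frame(
  RINGNO  = c(120, 121, 132),
  HABITAT = c(3, 4, 3),
  WEIGHT  = c(20.2, 20.0, 21.2),
  FAT     = NA_real_,
  FATMTD  = NA_character_,
  stringsAsFactors = FALSE
)
chafffat <- data.frame(
  RINGNO = c(120, 121, 132),
  FAT    = c(4, 5, 4),
  FATMTD = c(" E", " B", ""),
  stringsAsFactors = FALSE
)

keys <- c("RINGNO", "FAT", "FATMTD")

# Question 1: matching on all three columns. No key triple agrees between
# the two inputs, so all = TRUE keeps every row from both sides, padded with NA.
three_key_all <- merge(chaff, chafffat, by = keys, all = TRUE)

# Question 2: the same merge without all = TRUE keeps only matched rows.
three_key <- merge(chaff, chafffat, by = keys)

# Matching on the individual identifier alone, after dropping the empty
# FAT/FATMTD columns from 'chaff' so they do not clash with the real ones.
by_ring <- merge(chaff[, c("RINGNO", "HABITAT", "WEIGHT")], chafffat,
                 by = "RINGNO")

# Compare the size of each result with the inputs.
sizes <- c(chaff         = nrow(chaff),
           chafffat      = nrow(chafffat),
           three_key_all = nrow(three_key_all),
           three_key     = nrow(three_key),
           by_ring       = nrow(by_ring))
print(sizes)
```

If the real data behave like this toy case, comparing nrow() of the inputs with nrow() of each merged result should show it quickly.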
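
A related sketch, again on hypothetical data (the names Data, Individual, Weight, Fat and Fatmethod are placeholders modelled on the description in the quoted message): if the fat method column contains NA after the merge, a logical index built with == carries those NAs through and returns rows that are entirely NA, which may be one source of the "sea of NAs". Wrapping the test in which(), or using subset() as Jim suggested, drops them, and droplevels() removes unused levels such as " B".

```r
# Hypothetical merged data; the last row stands in for an individual whose
# fat record is missing after the merge.
Data <- data.frame(
  Individual = c(120, 121, 132, 140),
  Weight     = c(20.2, 20.0, 21.2, 19.8),
  Fat        = c(4, 5, 4, NA),
  Fatmethod  = c(" E", " B", "", NA),
  stringsAsFactors = TRUE
)

# The comparison is NA wherever Fatmethod is NA, and an NA in a logical
# row index produces a row in which every column is NA.
with_na <- Data[Data$Fatmethod == " E", ]

# which() keeps only the positions where the comparison is TRUE.
selectE <- Data[which(Data$Fatmethod == " E"), ]

# subset() also treats NA in the condition as FALSE.
same <- subset(Data, Fatmethod == " E")

# Subsetting a factor keeps all of its original levels, so " B" is still
# listed here even though no " B" rows remain.
levels_before <- levels(selectE$Fatmethod)

# droplevels() discards the levels that are no longer used.
selectE <- droplevels(selectE)
levels_after <- levels(selectE$Fatmethod)

print(with_na)
print(selectE)
```

Referring to the column as Data$Fatmethod, rather than a free-standing Fatmethod, also avoids depending on an attached copy that might not line up with the merged data frame.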