Hi there, I am having trouble subsetting a data frame by a condition on one column (of many).
I read the file into R with read.fwf, specifying the column widths. The original data is a .DAT file. I then used the names function to assign the column headings. I want to subset the entire data set, keeping only the rows where one column, PRVDR_NUM, equals 050108. This is where I'm having trouble. I've tried code like this:

    newinpatient <- subset(oldinpatient, oldinpatient$PRVDR_NUM == 050108)
    # OR
    newinpatient <- oldinpatient[oldinpatient$PRVDR_NUM == 050108, ]
    # OR
    providernum <- data.frame(newdim(PRVDR_NUM = c(050108))
    newinpatient <- merge(providernum, oldinpatient)

Checking with class at one point, I found that R interprets PRVDR_NUM as a factor, not a number, so I can see a potential reason why the code above would fail. I then tried something like this:

    newPRVDR_NUM <- format(as.numeric(levels(oldinpatient$PRVDR_NUM)[oldinpatient$PRVDR_NUM]))
    numericprvdr <- data.frame(oldinpatient, newPRVDR_NUM)
    bestprvdr <- numericprvdr[,-2]

I thought that after converting PRVDR_NUM to numeric, one of the three options above would work, but it has not. (I did confirm that the factor -> numeric conversion itself worked.)

R runs all three options above with no errors, but a dim check returns: 0 93. The number of columns is correct, but the number of rows (obviously) is not. (I did confirm that the desired value occurs multiple times in that column, so 0 is definitely incorrect.)

I would also like to work with PRVDR_NUM as a variable on its own, but I've found that for any of these column names I have to write "allinpatient$PRVDR_NUM". R does not recognize PRVDR_NUM alone. Why?

More and more I think my problem is more fundamental. Is it using read.fwf in the first place, or not using read.fwf correctly?
I've made enough progress with other variables and data sets of this type that I've been fine so far, but from now on I need to repeat this code often enough that help in understanding my errors, and a more elegant/efficient solution, would be greatly appreciated.

Also note that R does not read all 93 columns as factors. Why would R interpret this six-wide column as a factor, but the nine-wide column next door as numeric?

Your help is most appreciated!

--
View this message in context: http://r.789695.n4.nabble.com/Subsetting-a-data-frame-Read-in-with-FWF-format-from-DAT-file-tp4461051p4461051.html
Sent from the R help mailing list archive at Nabble.com.
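The symptom described above (0 rows, 93 columns) can be reproduced with a small sketch. Everything in it other than read.fwf, PRVDR_NUM and the value 050108 is an assumption made for illustration: the two-column layout, the AMOUNT column, the sample values, and in particular the guess that at least one provider ID in the real file contains a non-digit character, which would explain why that column is read as a factor while a purely numeric neighbour is not. stringsAsFactors = TRUE is passed explicitly to mimic the default of older R versions.

```r
# Hypothetical fixed-width data: a 6-wide provider ID, then a 9-wide amount.
# The ID "05T123" contains a letter; this is an assumption about the real file.
lines <- c("050108000012345",
           "05T123000000678",
           "050108000009999")
tmp <- tempfile(fileext = ".DAT")
writeLines(lines, tmp)

# stringsAsFactors = TRUE mimics the default behaviour of older R versions.
old <- read.fwf(tmp, widths = c(6, 9),
                col.names = c("PRVDR_NUM", "AMOUNT"),
                stringsAsFactors = TRUE)
class(old$PRVDR_NUM)  # a factor, because "05T123" cannot be parsed as a number
class(old$AMOUNT)     # every value parses as a number, so no factor here

# The unquoted literal 050108 is the number 50108, which is compared
# as the text "50108" and never matches the factor label "050108".
n_numeric <- nrow(old[old$PRVDR_NUM == 050108, ])

# Quoting the value keeps the leading zero, so the comparison can match.
n_string <- nrow(old[old$PRVDR_NUM == "050108", ])

# Reading the ID column as character avoids the factor altogether.
chr <- read.fwf(tmp, widths = c(6, 9),
                col.names = c("PRVDR_NUM", "AMOUNT"),
                colClasses = c("character", "numeric"))

# Inside subset(), column names are looked up in the data frame,
# so PRVDR_NUM can be written without the data frame prefix.
n_chr <- nrow(subset(chr, PRVDR_NUM == "050108"))
```

If the real file behaves like this sketch, treating provider IDs as character strings rather than numbers would also preserve their leading zeros. Outside of helpers such as subset() or with(), a column is not a free-standing variable, which is why the data frame prefix is otherwise needed.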