Hi Thierry,

On 01-Sep-08 09:45:27, ONKELINX, Thierry wrote:
>
> Dear Ted,
>
> I noticed that as.is was set by default in read.fwf. So if the
> user sets stringsAsFactors it is passed through ... to read.table.
> But I'm not sure how as.is is passed to read.table when only
> stringsAsFactors is set. If it's the default (FALSE) then it might
> be conflicting with stringsAsFactors. Therefore my suggestion to
> use as.is instead of stringsAsFactors in this case.

Yes, that is how I think I see it too. I have now written two tiny
test functions. temp.table() is the same as temp() before [below];
temp.fwf() uses its arguments in the same way as read.fwf(). Also,
temp.fwf() calls temp.table() in the same way as read.fwf() calls
read.table() (as far as 'as.is' and 'stringsAsFactors' are
concerned -- I hope!).

  temp.table<-function(as.is = !stringsAsFactors,
                       stringsAsFactors = default.stringsAsFactors()){
    print(c(as.is=as.is, sAF=stringsAsFactors))
  }

  temp.fwf<-function(as.is=FALSE,...){
    temp.table(as.is=as.is,...)
  }

and now:

  temp.fwf(as.is=FALSE,stringsAsFactors=FALSE)
  # as.is   sAF
  # FALSE FALSE

  temp.fwf(as.is=FALSE,stringsAsFactors=TRUE)
  # as.is   sAF
  # FALSE  TRUE

  temp.fwf(as.is=TRUE,stringsAsFactors=FALSE)
  # as.is   sAF
  #  TRUE FALSE

  temp.fwf(as.is=TRUE,stringsAsFactors=TRUE)
  # as.is   sAF
  #  TRUE  TRUE

  temp.fwf(stringsAsFactors=TRUE)
  # as.is   sAF
  # FALSE  TRUE

  temp.fwf(stringsAsFactors=FALSE)
  # as.is   sAF
  # FALSE FALSE

showing that the 'as.is' result from temp.fwf() is independent of
any value of 'stringsAsFactors' set in its parameter list.

> I suppose it might be a good idea to add stringsAsFactors to the
> argument list of read.fwf and give it the same defaults as read.table.

I was thinking the same, too.
Ted.

> Cheers,
>
> Thierry
>
> From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]
> On behalf of [EMAIL PROTECTED]
> Sent: Monday 1 September 2008 11:23
> To: r-help@r-project.org
> Subject: Re: [R] Avoiding factors and levels in data frames
>
> On 01-Sep-08 08:20:25, ONKELINX, Thierry wrote:
>>
>> Try to add options(stringsAsFactors = FALSE) in your Rprofile.site
>> (in the etc directory). Using as.is = TRUE seems safer than
>> stringsAsFactors = FALSE in the read.fwf function. Because as.is
>> is set to FALSE by default and stringsAsFactors is not set.
>>
>> HTH,
>>
>> Thierry
>
> Can I ask for some elucidation about how the code operates here?
> Apparently read.fwf() calls read.table(), and ?read.fwf refers
> you to ?read.table for things like 'as.is' and 'stringsAsFactors'.
>
> When I look at the code for read.table, I see in the parameter
> list:
>
>   function (file, .... , as.is = !stringsAsFactors, ... ,
>             stringsAsFactors = default.stringsAsFactors(), ... )
>
> with *no further reference whatever* to 'stringsAsFactors' in the
> body of the function. In particular, there is no test that I can
> see of whether or not 'stringsAsFactors' has been set by the user
> in the call.
>
> The standard result of default.stringsAsFactors() is TRUE.
>
> I've written a tiny test function:
>
>   temp<-function(as.is = !stringsAsFactors,
>                  stringsAsFactors = default.stringsAsFactors()){
>     print(c(as.is=as.is, sAF=stringsAsFactors))
>   }
>
>   temp()
>   # as.is   sAF
>   # FALSE  TRUE
>
>   temp(stringsAsFactors = FALSE)
>   # as.is   sAF
>   #  TRUE FALSE
>
>   temp(as.is=FALSE,stringsAsFactors = FALSE)
>   # as.is   sAF
>   # FALSE FALSE
>
> So, if read.table is called with 'as.is=FALSE' (which is the default
> set by read.fwf(), with any reference to 'stringsAsFactors' in the
> call being part of the "..." which is passed to read.table()), then
> read.table will be called with 'as.is=FALSE' regardless of whether
> 'stringsAsFactors=FALSE' has been set explicitly in calling read.fwf().
>
> The only way to get 'as.is' to be TRUE would be to set it explicitly
> in the call to read.fwf() (and in that case one need not bother with
> 'stringsAsFactors', since its only purpose seems to be to determine
> the value of 'as.is'). Or, of course, to set default.stringsAsFactors
> to be FALSE; but in many cases people will want to have per-case
> control over what happens in cases like this.
>
> Well, that's how it seems to me, on reading the code. Is this what
> Thierry really means when he says "stringsAsFactors is not set"?
>
> If that is the case, then it seems to indicate some conflict or
> inconsistency between read.fwf() and read.table() in this respect.
> In any case, it strikes me as something of an undesirable tangle!
>
> With thanks for any comments,
> Ted.
>
>> -----Original message-----
>> From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]
>> On behalf of Asher Meir
>> Sent: Sunday 31 August 2008 11:02
>> To: r-help@r-project.org
>> Subject: [R] Avoiding factors and levels in data frames
>>
>> Hello all.
>>
>> I am an experienced R user; I have used R for many years for a wide
>> variety of applications. However, I keep on running into one obstacle:
>> I never want factors or levels in my data frames, but I keep on
>> getting them. Is there any way to globally turn this whole feature of
>> data frames off? Using options(stringAsFactors=FALSE) does not seem to
>> work.
>> Alternatively, if I have a data frame with levels, can I just get rid
>> of them in that data frame?
>>
>> Here is an example: I have a large text file, of which part is in the
>> fixed-width tabular form I need. I created a widths vector and a
>> column names vector. I then read the file as follows:
>>
>> raw1<-read.fwf(fn1,widths=widmax,col.names=headermax,stringsAsFactors=FALSE)
>>
>> But raw1 still has factors! It is an old class data frame:
>>
>> > is(raw1)
>> [1] "data.frame" "oldClass"
>>
>> And it still has levels:
>> > raw1[1,1]
>> [1] Gustav wind
>> 229 Levels: - - - - - - - - - - - WIN - - - M ... Z INDICATES C
>>
>> My question is:
>> 1. Can I get rid of the levels in raw1?
>> 2. Even better -- can I stop it getting read in as a data frame with
>> factors?
>> 3. Even better -- can I just tell R to never use factors in my data
>> frames?
>>
>> Or any other solution that occurs to people -- maybe this is the wrong
>> way to go about reading in fixed width data in this kind of file.
>>
>> I would appreciate any help.
>>
>> Asher

--------------------------------------------------------------------
E-Mail: (Ted Harding) <[EMAIL PROTECTED]>
Fax-to-email: +44 (0)870 094 0861
Date: 01-Sep-08                                       Time: 11:13:12
------------------------------ XFMail ------------------------------
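
Asher's questions 1 and 2 can be tried out with a minimal sketch along the following lines. Everything in it is made up for illustration: the three-line input, the widths c(6, 3) and the column names "name" and "wind" are assumptions, not taken from Asher's file. It follows the suggestion in the thread of setting as.is = TRUE in the read.fwf() call, and for data that already contains factors it converts those columns to character.

```r
# Hypothetical fixed-width data: a 6-character name, then a 3-character number.
lines <- c("Gustav 95",
           "Hanna  60",
           "Ike    85")
tmp <- tempfile(fileext = ".txt")
writeLines(lines, tmp)

# Question 2: read without creating factors.
# as.is = TRUE reaches read.table(), which then keeps character
# columns as character vectors instead of converting them.
raw <- read.fwf(tmp, widths = c(6, 3),
                col.names = c("name", "wind"), as.is = TRUE)
unlink(tmp)
str(raw)

# Question 1: remove levels from a data frame that already has factors.
# stringsAsFactors = TRUE is set explicitly here so that the example
# does not depend on the session default.
df <- data.frame(name = c("Gustav", "Hanna"), wind = c(95, 60),
                 stringsAsFactors = TRUE)
is_fac <- vapply(df, is.factor, logical(1))   # which columns are factors
df[is_fac] <- lapply(df[is_fac], as.character)
str(df)
```

On question 3, note that the option Thierry names is stringsAsFactors; if it was typed as it appears in Asher's message, options(stringAsFactors=FALSE), R would silently create an unrelated option and the setting would have no effect.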