Re: [Rd] Characters vs. factors
On Mon, Oct 5, 2009 at 4:33 PM, hadley wickham wrote:
> It seems like a recent trend in R has been to make character vectors
> and factors almost equivalent (apart from the way that factors always
> remember their original range). There are a few exceptions:

A related issue is that modeling functions throw a warning when
character objects are used in place of factors:

  > shopping <-
      read.csv("http://spreadsheets.google.com/pub?key=tE9pXlYLwTAeiDWxL8h_viA&single=true&gid=0&range=A1%3AE37&output=csv",
               as.is=TRUE)
  > shopping$seconds <- as.numeric(as.difftime(shopping$Total.Time))
  > fit <- lm(seconds ~ Number.of.Items + Payment - 1, shopping, subset=-8)
  Warning message:
  In model.matrix.default(mt, mf, contrasts) :
    variable 'Payment' converted to a factor

The warning doesn't affect R's behaviour, of course, but it does make
it difficult to sanction the otherwise sensible advice to R beginners
to read in data files with as.is=TRUE. (The warning leads to
difficult-to-answer questions.) For similar reasons I deleted the
warning from this post:

http://blog.revolution-computing.com/2009/09/is-the-express-line-really-faster-1.html

In general the trend towards equivalence of factors and character
vectors is welcome, though.

# David

On Mon, Oct 5, 2009 at 4:33 PM, hadley wickham wrote:
> It seems like a recent trend in R has been to make character vectors
> and factors almost equivalent (apart from the way that factors always
> remember their original range). There are a few exceptions:
>
> * summary.character != summary.factor
>
> * table(x, exclude = NULL) != table(factor(x), exclude=NULL) when x
>   includes missing values
>
> * strsplit on a factor
>
>   > strsplit(factor(c("a", "a b")), " ")
>   Error in strsplit(factor(c("a", "a b")), " ") : non-character argument
>
> * nchar on a factor:
>
>   > nchar(factor(c("abc", "d", "defgh")))
>   [1] 1 1 1
>
> * : with two character strings
>
>   > "a":"b"
>   Error in "a":"b" : NA/NaN argument
>   In addition: Warning messages:
>   1: NAs introduced by coercion
>   2: NAs introduced by coercion
>   > factor("a"):factor("b")
>   [1] a:b
>   Levels: a:b
>
> Regards,
>
> Hadley
>
> --
> http://had.co.nz/

--
David M Smith
Director of Community, REvolution Computing www.revolution-computing.com
Tel: +1 (206) 577-4778 x3203 (San Francisco, USA)

Check out our upcoming events schedule at www.revolution-computing.com/events

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
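As a hedged illustration of the points raised in this message (not code from either poster): converting explicitly at the boundary avoids relying on implicit coercion, both in the model fit and in the string functions listed among the exceptions. The data frame below is invented for the example and only borrows the idea of a character `payment` column from the shopping data; the column names and values are assumptions.

```r
# Invented stand-in for the shopping data; values are illustrative only.
shopping <- data.frame(
  seconds = c(120, 95, 210, 180, 60, 150),
  items   = c(10, 7, 22, 15, 3, 12),
  payment = c("cash", "card", "card", "cash", "cash", "card"),
  stringsAsFactors = FALSE  # keep strings as character, like as.is=TRUE
)

# Convert to a factor explicitly before modelling, so the fit does not
# depend on model.matrix() coercing a character column.
shopping$payment <- factor(shopping$payment)
fit <- lm(seconds ~ items + payment - 1, data = shopping)

# For string functions, go through as.character() whenever the input
# might be a factor.
f <- factor(c("abc", "d", "defgh"))
lengths_chr <- nchar(as.character(f))
pieces <- strsplit(as.character(factor(c("a", "a b"))), " ")
```

The design choice here is simply to make each conversion visible in the code, so that a beginner reading data in as character still decides where a factor is wanted.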
Re: [Rd] [R] Using R in a corporate environment
If you're looking for business and technical justifications for a
business to adopt R, I wrote some up last year:

http://blog.revolution-computing.com/2009/02/how-to-get-it-to-accept-and-love-r.html

In summary:

1. R is mainstream
2. R is supported
3. R is high-quality
4. R leads commercial packages in innovation
5. R is cost effective

Ironically, for many companies #1 is more important than #5, so
pointing to other companies using R is often a good strategy (see
http://blog.revolution-computing.com/rmedia/ for some media articles
that may help).

And also (if I may indulge the list for a small plug) some companies
are more comfortable paying for open-source software than installing
it free of cost (for the support, validation, and so on). If that's
the case for your company, there's REvolution R Enterprise:

http://www.revolution-computing.com/products/revolution-enterprise.php

Hope this helps,
# David Smith

--
David M Smith
VP of Marketing, REvolution Computing http://blog.revolution-computing.com
Tel: +1 (650) 330-0553 x205 (Palo Alto, CA, USA)
Download REvolution R free: www.revolution-computing.com/downloads/revolution-r.php

On Wed, Mar 10, 2010 at 10:07 AM, Fernando Henrique Ferraz Pereira da Rosa wrote:
> Dear r-useRs,
>
> After a couple of years in a 'R exile' of sorts, I've recently changed jobs
> and my current employer (an American multinational in the food manufacturing
> industry) is much more open than my past employer (which wouldn't even want
> to hear about anything that didn't begin with SAS...). So, after my
> insistence corporate IT is now considering adopting R as part of our
> statistical applications toolbox.
>
> Things are not that simple though, and I'm now in the process of collecting
> data for writing a Business Case for R in our corporation, and this is the
> reason I'm writing you. If you have any examples (preferably with
> references) and/or experience in a similar scenario please do write me. I've
> already googled for some materials, and there's an excellent piece on the
> NyTimes of last year, which pointed that even Google was adopting R, and
> this is exactly the sort of thing I need to help convincing IT they'll be
> making a sound choice in adopting R.
>
> Thanks in advance for your attention,
>
> Fernando Rosa
>
> --
> "Though this be randomness, yet there is structure in't."
> Rosa, F.H.F.P
>
> Instituto de Matemática e Estatística
> Universidade de São Paulo
> Fernando Henrique Ferraz P. da Rosa
> http://www.feferraz.net
Re: [Rd] FW: [R] The Quality & Accuracy of R
On Sun, Jan 25, 2009 at 4:20 PM, Peter Dalgaard wrote:
> - a good reason to want post-install validation is that validity can depend
> on other parts of the system outside developer control (e.g. an overzealous
> BLAS optimization, sacrificing accuracy and/or standards compliance for
> speed, can cause trouble). This is also a reason for not making too
> far-reaching statements about validity.

I wanted to echo Peter's point here. It's the main reason why we don't
claim our distribution of R is validated: *no* software can be
considered validated outside of the environment where it is installed
and used. (We do however claim Revolution R is ready for a validation
*process*, a small but significant part of which is coming on-site to
run tests and verify the results.) We've come across a number of
environmental issues (locales, random number generators, shared
libraries, path settings, many others) that may affect the validation
process.

My main point here is that R can only be validated in situ, and the
process isn't practical to automate. With the right build tools in
place, many of the *tests* can be automated, but that leaves out
validation on how the results are stored, used, and accessed in
practice.

> Muenchen, Robert A (Bob) wrote:
>> Asking to add a superfluous step to an installation may seem like a
>> waste of time, and technically it is. But psychologically this testing
>> will have an important impact that will silence many critics.

Nonetheless, Bob has an excellent point here -- even short of a
complete validation process, *perception* can prevent the validation
ball from getting stuck in the first place. Giving the user some
degree of easily-digestible feedback that the installed R has run and
passed a battery of tests could help for that, and is something we'll
look at for the Revolution R distribution.

# David Smith

P.S. For those who subscribe to r-devel but not r-help, some further
discussion of validation for R is here:

http://blog.revolution-computing.com/2009/01/analyzing-clinical-trial-data-with-r.html

--
David M Smith
Director of Community, REvolution Computing www.revolution-computing.com
Tel: +1 (206) 577-4778 x3203 (Seattle, USA)
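A rough sketch of the kind of easily digestible post-install feedback discussed in this message. The checks, their names, and the output format are all invented for illustration; this is not the Revolution R test battery or R's own test suite, and passing it says nothing about validation in the in-situ sense described above. It exercises a BLAS-backed operation, a linear solve, and RNG reproducibility, then prints one PASS/FAIL line per check.

```r
# Hypothetical mini-battery: each check returns TRUE on success.
checks <- list(
  blas_crossprod = function() {
    x <- matrix(c(1, 2, 3, 4, 5, 6), nrow = 3)
    expected <- matrix(c(14, 32, 32, 77), nrow = 2)
    isTRUE(all.equal(crossprod(x), expected))
  },
  linear_solve = function() {
    a <- matrix(c(2, 0, 0, 4), nrow = 2)
    isTRUE(all.equal(solve(a, c(2, 4)), c(1, 1)))
  },
  rng_reproducible = function() {
    set.seed(42); first <- runif(5)
    set.seed(42); second <- runif(5)
    identical(first, second)
  }
)

# Treat an error inside a check as a failure rather than aborting the run.
run_check <- function(f) isTRUE(tryCatch(f(), error = function(e) FALSE))
results <- vapply(checks, run_check, logical(1))

# One line per check, then a short summary including the collation locale.
cat(sprintf("%-18s %s\n", names(results), ifelse(results, "PASS", "FAIL")), sep = "")
cat(sprintf("%d of %d checks passed (locale: %s)\n",
            sum(results), length(results), Sys.getlocale("LC_COLLATE")))
```

Reporting the locale alongside the results reflects the environmental issues mentioned above: the same checks can behave differently depending on where R is installed and run.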
[Rd] Identifying graphics files produced by R
Oftentimes, I see graphs on the web that *look* like they've been
produced by R, but I can never be sure. Or can I? I notice that
PostScript files include a "%%Creator: R Software" line, but do R
graphics drivers encode any identifying information in GIF or PNG
files more commonly used on the web? And if so, would such evidence
necessarily be obliterated in post-processing (e.g. cropping)?

I'm trying to do an informal survey of R's use to create statistical
graphics on the web, and if there's a way to identify graph files I
see as coming from R it would help a lot.

Thanks,
# David Smith

--
David M Smith
Director of Community, REvolution Computing www.revolution-computing.com
Tel: +1 (206) 577-4778 x3203 (Seattle, USA)
[Rd] Summary: Identifying graphics files produced by R
Thanks to all those who responded to the question below, either
on-list or privately. The bottom line is that there's no identifying
information from R in the metadata for PNG or JPG files (and R doesn't
produce GIFs). I did however figure out a way to automate a search for
PDF and PostScript files produced by R, and the details are here:

http://blog.revolution-computing.com/2009/02/r-graphics-in-the-media.html

Thanks,
# David Smith

On Fri, Feb 13, 2009 at 1:15 PM, David M Smith <da...@revolution-computing.com> wrote:
> Oftentimes, I see graphs on the web that *look* like they've been
> produced by R, but I can never be sure. Or can I? I notice that
> PostScript files include a "%%Creator: R Software" line, but do R
> graphics drivers encode any identifying information in GIF or PNG
> files more commonly used on the web? And if so, would such evidence
> necessarily be obliterated in post-processing (e.g. cropping)?
>
> I'm trying to do an informal survey of R's use to create statistical
> graphics on the web, and if there's a way to identify graph files I
> see as coming from R it would help a lot.
>
> Thanks,
> # David Smith
>
> --
> David M Smith
> Director of Community, REvolution Computing www.revolution-computing.com
> Tel: +1 (206) 577-4778 x3203 (Seattle, USA)

--
David M Smith
Director of Community, REvolution Computing www.revolution-computing.com
Tel: +1 (206) 577-4778 x3203 (Seattle, USA)
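For readers who want to try the PostScript case locally, here is a minimal sketch. It is not the search method described in the linked blog post; it only checks a single file for the Creator comment that R's postscript() device writes in the file header. The function name `looks_like_r_postscript` and the `n_header` parameter are made up for this example.

```r
# Returns TRUE if one of the leading lines of the file is R's
# PostScript Creator comment. `n_header` is an arbitrary scan depth.
looks_like_r_postscript <- function(path, n_header = 50L) {
  header <- readLines(path, n = n_header, warn = FALSE)
  any(grepl("^%%Creator: R Software", header))
}

# Self-contained demonstration: write a small plot with R's own device.
ps_file <- tempfile(fileext = ".ps")
postscript(ps_file)
plot(1:10)
dev.off()

# A plain text file for comparison.
txt_file <- tempfile(fileext = ".txt")
writeLines(c("%!PS-Adobe-3.0", "%%Creator: something else"), txt_file)

from_r     <- looks_like_r_postscript(ps_file)
not_from_r <- looks_like_r_postscript(txt_file)
```

As the summary above notes, this kind of header check has no counterpart for PNG or JPG output, and any tool that rewrites the PostScript header would remove the evidence.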
Re: [Rd] Building R for Vistax64
On Wed, Mar 11, 2009 at 1:15 PM, Sim, Fraser wrote:
> Hi all,
>
> I have successfully built from source the 32-bit version of R on my
> Vista 64-bit box. I was hoping to graduate to a 64-bit version so I
> could analyze some larger data sets. I have 8gb RAM installed.

We (REvolution Computing) are beta testing a 64-bit build of R (2.7.2)
and its packages for Windows now. There's more information at:

http://www.revolution-computing.com/products/windows-64bit.php

# David Smith

--
David M Smith
Director of Community, REvolution Computing www.revolution-computing.com
Tel: +1 (206) 577-4778 x3203 (Seattle, USA)
Re: [Rd] Closed-source non-free ParallelR ?
> --
> Pat Shields
> Software Engineer
> REvolution Computing
> One Century Tower | 265 Church Street, Suite 1006
> New Haven, CT 06510
> P: 203-777-7442 x250 | www.revolution-computing.com
>
> Check out our upcoming events schedule at
> www.revolution-computing.com/events

--
David M Smith
Director of Community, REvolution Computing www.revolution-computing.com
Tel: +1 (206) 577-4778 x3203 (San Francisco, USA)

Check out our upcoming events schedule at www.revolution-computing.com/events
Re: [Rd] About ParallelR and licensing of packages
I rather feel that this discussion has gone beyond a topic and tone
suitable for r-devel. I would like to say however, as an author of
several GPL works myself, that I am confident that REvolution
Computing (my employer, in case that's not clear) is a good-faith
member of the open-source community and adheres to the letter and
spirit of all licenses.

We will reply to the particulars of Mr Dowle's message in private
email. I invite any others who may wish to share comments or concerns
to do so to me directly at da...@revolution-computing.com.

# David Smith

--
David M Smith
Director of Community, REvolution Computing www.revolution-computing.com
Tel: +1 (206) 577-4778 x3203 (San Francisco, USA)

On Sun, Apr 26, 2009 at 9:21 PM, Matthew Dowle wrote:
> Dear Danese,
>
> Without prejudice save as to costs
>
> I am the author of the R library "data.table". I released data.table under
> the provisions of the General Public License (GPL). This email is to notify
> REvolution that we may be in dispute. If we are in dispute then I am
> entitled to issue litigation proceedings against REvolution for breach of
> contract.
>
> To establish if we are in fact in dispute, please answer the following :
>
> 1. Does REvolution R Enterprise include the library data.table ?
> 2. Has REvolution R Enterprise been distributed yet, for example has
>    REvolution sold a copy ?
> 3. If it was distributed, was it distributed under a GPL-compatible license ?
>
> FSF guidance :
> http://www.fsf.org/licensing/licenses/gpl-faq.html#GPLInProprietarySystem
>
> Notwithstanding a potential dispute on the basis above, please also answer
> the following :
>
> 4. Has REvolution distributed any program code, written in R or any other
>    language or environment or otherwise, which uses the library data.table, for
>    example by calling functions that are provided by data.table at run time ?
> 5. If so, was such program code distributed under a GPL compatible license ?
> FSF guidance :
> http://www.fsf.org/licensing/licenses/gpl-faq.html#IfInterpreterIsGPL (3rd
> paragraph)
> http://www.fsf.org/licensing/licenses/gpl-faq.html#IfLibraryIsGPL
> http://www.fsf.org/licensing/licenses/gpl-faq.html#NFUseGPLPlugins
>
> I am making every effort to agree with you that we are not in dispute. I
> have several suggestions which may avoid dispute, for example you could
> remove data.table from REvolution R Enterprise. You could confirm that the
> aggregate work REvolution R Enterprise is released under a GPL-compatible
> license. There may well be other solutions you could suggest. You could
> decide to postpone distribution of REvolution R Enterprise until all
> potential disputes are resolved. If I have not heard from you or your
> representatives within 21 days of today 26 April 2009 then I will instruct
> my legal representatives to establish whether there is a dispute.
> Alternatively you can confirm we are in dispute and I will start to accrue
> legal costs immediately thereon. Any such costs will themselves form part of
> the claim. I intend to be as open and forthcoming with you about costs as my
> lawyers permit me.
>
> This potential dispute is between myself only and REvolution. You must
> engage with me directly by answering the questions above with respect to
> data.table. It is a matter for you whether you answer publicly, via your
> lawyers or privately to me. It is my understanding that any other GPL'd R
> library owners is also entitled to establish, either now or in the future,
> whether they are also potentially in dispute with you on the same basis as
> above. There are up to 1,700 distinct R libraries, each of which could
> potentially generate 1,700 claims of breach of contract on you. One of those
> is the R Foundation, who as license holder for the library "base" have
> stated they will make a public statement in due course. That is a matter for
> the R Foundation, and them alone.
> In my potential dispute with you, under
> English law I have 6 years between the date of any as yet unknown breach of
> contract and the date by which I must serve notice on you and submit
> particulars of claim to the court.
> My lawyers cannot start to draft particulars of claim until we have
> established we are actually in dispute.
>
> I remind you of the contract by which you are bound by me of your
> distributing of my library, or your distributing of programs (yours or
> otherwise) which use my library :
>
> Licensing FAQ page: http://www.fsf.org/licenses/gpl-faq.html
> Text of the GNU GPL: http://www.fsf.org/copyleft/gpl.html
> Text of the GNU LGPL: http://www.fsf.org/copyleft/lgpl.html
> FSF license list page: http://www.fsf.org/licenses/license-list.html
>
> I look fo