Hi Gabriel,

> Personally, no I wouldn't. I would consider m==0 a degenerate case, where there is no data, but I personally find matrices (or data.frames) with rows but no columns a very strange concept.

This distinction between matrices and data.frames is the crux in this case.
From the dimensional modelling point of view, a matrix can have non-zero rows and zero columns, but a data.frame (assuming it maps to a database table structure) should never have non-zero rows and zero columns.

This kind of issue was raised before in our issue tracker:
https://github.com/Rdatatable/data.table/issues/2422
You should find that discussion useful.

Best,
Jan Gorecki

On Fri, May 17, 2019 at 8:11 AM Pages, Herve <hpa...@fredhutch.org> wrote:
>
> On 5/16/19 17:48, Gabriel Becker wrote:
>
> Hi Herve,
>
> Inline.
>
> On Thu, May 16, 2019 at 4:45 PM Pages, Herve <hpa...@fredhutch.org> wrote:
>
> Hi Gabe,
>
> ncol(data.frame(aa=c("a", "b", "c"), AA=c("A", "B", "C")))
> # [1] 2
>
> ncol(data.frame(aa="a", AA="A"))
> # [1] 2
>
> ncol(data.frame(aa=character(0), AA=character(0)))
> # [1] 2
>
> ncol(cbind(aa=c("a", "b", "c"), AA=c("A", "B", "C")))
> # [1] 2
>
> ncol(cbind(aa="a", AA="A"))
> # [1] 2
>
> ncol(cbind(aa=character(0), AA=character(0)))
> # [1] 2
>
> nrow(rbind(aa=c("a", "b", "c"), AA=c("A", "B", "C")))
> # [1] 2
>
> nrow(rbind(aa="a", AA="A"))
> # [1] 2
>
> nrow(rbind(aa=character(0), AA=character(0)))
> # [1] 2
>
> Sure, but
>
> > nrow(rbind(aa = c("a", "b", "c"), AA = c("a", "b", "c")))
> [1] 2
>
> > nrow(rbind(aa = c("a", "b", "c"), AA = "a"))
> [1] 2
>
> > nrow(rbind(aa = c("a", "b", "c"), AA = character()))
> [1] 1
>
> Ah, I see now.
>
> But:
>
> > data.frame(aa = c("a", "b", "c"), AA = character())
> Error in data.frame(aa = c("a", "b", "c"), AA = character()) :
> arguments imply differing number of rows: 3, 0
>
> and
>
> > mapply(`*`, 1:5, integer(0))
> Error in mapply(`*`, 1:5, integer(0)) :
> zero-length inputs cannot be mixed with those of non-zero length
>
> So I would declare rbind(aa = c("a", "b", "c"), AA = character())
> inconsistent rather than making the case that rbind(aa = character(), AA =
> character()) needs to change.
>
> Cheers,
> H.
>
> So even if I ultimately "lose" this debate (which really wouldn't shock me,
> even if R-core did agree with me there's backwards compatibility to
> consider), you have to concede that the current behavior is more complicated
> than the above is acknowledging.
>
> By rights of the invariance that you and Hadley are advocating, as far as I
> understand it, the last should give 2 rows, one of which is all NAs, rather
> than giving only one row as it currently does (and, I assume?, always has).
>
> So there are two different behavior patterns that could coherently (and
> internally-consistently) be generalized to apply to the rbind(character(),
> character()) case, not just one. I'm making the case that the other one (that
> length 0 vectors do not add rows because they don't contain data) would be
> equally valid, and to N>1 people, at least equally intuitive.
>
> Best,
> ~G
>
> hmmm... not sure why ncol(cbind(aa=character(0), AA=character(0))) or
> nrow(rbind(aa=character(0), AA=character(0))) should do anything
> different from what they do.
>
> In my experience, and more generally speaking, the desire to treat
> 0-length vectors as a special case that deviates from the
> non-zero-length case has never been productive.
>
> H.
>
> On 5/16/19 13:17, Gabriel Becker wrote:
> > Hi all,
> >
> > Apologies if this has been asked before (a quick google didn't find it for
> > me), and I know this is a case of behaving as documented but it's so
> > unintuitive (to me at least) that I figured I'd bring it up here anyway. I
> > figure it's probably going to not be changed, but I'm happy to submit a
> > patch if this is something R-core feels can/should change.
> >
> > So I recently got bitten by the fact that
> >
> >> nrow(rbind(character(), character()))
> > [1] 2
> >
> > I was checking whether the result of an rbind call had more than one row,
> > and that unexpectedly returned true, causing all sorts of shenanigans
> > downstream as I'm sure you can imagine.
> >
> > Now I know that from ?rbind
> >
> >> For ‘cbind’ (‘rbind’), vectors of zero length (including ‘NULL’)
> >> are ignored unless the result would have zero rows (columns), for
> >> S compatibility. (Zero-extent matrices do not occur in S3 and are
> >> not ignored in R.)
> >
> > But there's a couple of things here. First, for the rowbind case this
> > reads as "if there would be zero columns, the vectors will not be
> > ignored". This wording implies to me that not ignoring the vectors is a
> > remedy to the "problem" of the potential for a zero-column return, but
> > that's not the case. The result still has 0 columns, it just does not also
> > have zero rows. So even if the behavior is not changed, perhaps this
> > wording can be massaged for clarity?
> >
> > The other issue, which I admit is likely a problem with my intuition, but
> > which I don't think I'm alone in having, is that even if I can't have a 0x0
> > matrix (which is what I'd prefer) I would have expected/preferred a 1x0
> > matrix, the reasoning being that if we must avoid a 0x0 return value, we
> > would do the minimum required to avoid it, which is to not ignore the first
> > length 0 vector, to ensure a non-zero-extent matrix, but then ignore the
> > remaining ones as they contain information for 0 new rows.
> >
> > Of course I can program around this now that I know the behavior, but
> > again, it's so unintuitive (even for someone with a fairly well developed
> > intuition for R's sometimes "quirky" behavior) that I figured I'd bring it
> > up.
> >
> > Thoughts?
> >
> > Best,
> > ~G
> >
> > [[alternative HTML version deleted]]
> >
> > ______________________________________________
> > R-devel@r-project.org mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-devel
>
> --
> Hervé Pagès
>
> Program in Computational Biology
> Division of Public Health Sciences
> Fred Hutchinson Cancer Research Center
> 1100 Fairview Ave. N, M1-B514
> P.O. Box 19024
> Seattle, WA 98109-1024
>
> E-mail: hpa...@fredhutch.org
> Phone: (206) 667-5791
> Fax: (206) 667-1319
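
For anyone who wants to program around the current behaviour, as mentioned in the original post, here is a minimal sketch. It assumes the goal is that zero-length vectors never contribute rows. `rbind_nonempty` is a hypothetical helper (not part of base R), and returning a 0 x 0 matrix when every input is empty is an assumption about the preferred result, not something the thread settled on.

```r
# Hypothetical helper, not part of base R: row-bind only the inputs that
# actually carry data, so zero-length vectors never add rows.
rbind_nonempty <- function(...) {
  args <- list(...)
  keep <- lengths(args) > 0L
  if (!any(keep)) {
    # Assumption: with no data at all, return an explicit 0 x 0 matrix
    # (logical type) instead of base R's k x 0 result.
    return(matrix(nrow = 0L, ncol = 0L))
  }
  # Names of the retained arguments become the row names, as with rbind().
  do.call(rbind, args[keep])
}

nrow(rbind(character(), character()))                        # 2 (base R, as reported above)
nrow(rbind_nonempty(character(), character()))               # 0
nrow(rbind_nonempty(aa = c("a", "b", "c"), AA = character())) # 1
```

A check such as `nrow(x) > 1` then only succeeds when at least two inputs contained data, which is the test the original post was trying to perform.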