[Rd] help.start() and Debian packaging (PR#8483)
Full_Name: Greg Kochanski Version: 2.2.0 OS: Debian Linux on i686 Submission from: (NULL) (212.159.16.190) Debian packages the R documentation separately from the R core code. Consequently, it is possible for people to have R without the HTML documentation. (In fact, the docs are not installed by default, so it's very likely.) Thus, help.start() cannot depend on the HTML documentation being there. It should check for one (or a few) files and produce some reasonable error message if it is not there. Maybe something like "Warning: the HTML documentation is not installed." Alternatively, help.start() could produce references to some on-line HTML documentation, instead of local documentation. A related bug is that if one calls help.start() when the HTML documentation does not exist, all future calls to help() will lead to errors. __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
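The check requested above could look roughly like the sketch below. It is only an illustration, not the actual help.start() code: the helper names html_docs_installed() and safe_help_start() are hypothetical, and the assumption that the HTML index lives at html/index.html under R.home("doc") may not hold for every packaging.

```r
# Hypothetical helper: TRUE if the HTML index page exists under doc_dir.
# The location html/index.html is an assumption about the install layout.
html_docs_installed <- function(doc_dir = R.home("doc")) {
  file.exists(file.path(doc_dir, "html", "index.html"))
}

# Hypothetical wrapper: warn and return FALSE instead of failing later.
safe_help_start <- function(doc_dir = R.home("doc")) {
  if (!html_docs_installed(doc_dir)) {
    warning("the HTML documentation is not installed.")
    return(invisible(FALSE))
  }
  help.start()
  invisible(TRUE)
}
```

Calling safe_help_start() on a system without the HTML files would then emit the suggested warning and leave later help() calls untouched.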
[Rd] Section 7.1 HTML documentation (PR#8484)
Full_Name: Greg Kochanski Version: 2.2.0 OS: Debian Linux i686 Submission from: (NULL) (212.159.16.190) In /usr/share/doc/r-doc-html/manual/R-data.html (at least that's where it is on Debian...) the documentation is unclear. Comments below. The paragraph has unclear references, and I have no idea what it actually means. >> Base R comes with some facilities to communicate via BSD sockets on systems that support them (...). One potential problem >> For new projects it is suggested that socket connections are used instead. "Used instead"? Instead of what? >>The earlier low-level interface is given by functions make.socket, read.socket, write.socket and close.socket. What does "earlier" mean? Earlier than what?
[Rd] Minor: bad label in "faithful" dataset (PR#8485)
Full_Name: Greg Kochanski Version: 2.2.0 OS: Debian Linux i686 Submission from: (NULL) (212.159.16.190) The data set for "faithful" appears to be (column 1) the duration of the eruptions, and (column 2) the interval between eruptions. (See http://stat.ethz.ch/R-manual/R-patched/library/datasets/html/faithful.html). The label of column 1 is wrong, and should be "eruption duration" not just "eruptions", which implies a count of eruptions not a duration.
Re: [Rd] help.start() and Debian packaging (PR#8483)
While I agree with you, I find that the Debian packager does not. I already reported the problem to Debian, and they said that enough people want light-weight installations that they will continue splitting R into several parts. The package maintainer is Dirk Eddelbuettel <[EMAIL PROTECTED]>, and the relevant bug report is 348051. His response was this: | > Ok, that confirms that all you need to do is to install r-doc-html. No bug, | > it is designed this way. Consequently, I can only appeal to your humanity and to good programming practice. It is good programming practice to protect the user from his/her own mistakes, even if those mistakes are made easier/encouraged by Debian. It is also good programming practice to provide appropriate error messages when something goes wrong, even if it "shouldn't" ever go wrong. So, yeah, you can make an argument that you don't have to do it, but R will be a better piece of software if you make the change. Prof Brian Ripley wrote: > This is all based on a false premise: that a partial install of Debian > files is 'R'. > > R's own scripts do always install the HTML documentation, so > help.start() is entitled to assume that it is present. ... > > Note that your version of 'R' is not current. > > If there is a bug here, it is in the Debian re-packaging. I trust the > Debian packages do contain a bug reporting address other than this one: > please use the correct one. (The other binary distributions that I am > aware of, e.g. RPMs, do seem to include all of R.) > > On Sat, 14 Jan 2006 [EMAIL PROTECTED] wrote: > >> Full_Name: Greg Kochanski >> Version: 2.2.0 >> OS: Debian Linux on i686 >> Submission from: (NULL) (212.159.16.190) >> >> >> Debian packages the R documentation separately from the R core code. >> Consequently, it is possible for people to have R without >> the HTML documentation. (In fact, the docs are not installed by >> default, >> so it's very likely.) 
>> >> >> Thus, help.start() cannot depend on the HTML documentation being there. >> It should check for one (or a few) files and produce some reasonable >> error message if it is not there. Maybe something like >> "Warning: the HTML documentation is not installed." >> >> Alternatively, help.start() could produce references to some on-line >> HTML documentation, instead of local documentation. >> >> >> >> A related bug is that if one calls >> help.start() when the HTML documentation does not exist, >> all future calls to help() will lead to errors. > > > Working as documented is not a bug.
Re: [Rd] Section 7.1 HTML documentation (PR#8484)
Well, I don't know how it can be precise and correct when it has dangling antecedents. Grammatically speaking, that's the equivalent of an uninitialized pointer. However, I agree with you that it probably just needs a minor bit of fiddling to make sure it answers "Instead of what?" and "Earlier than what?" Hin-Tak Leung wrote: > [EMAIL PROTECTED] wrote: > >> Full_Name: Greg Kochanski >> Version: 2.2.0 >> OS: Debian Linux i686 >> Submission from: (NULL) (212.159.16.190) >> >> >> In /usr/share/doc/r-doc-html/manual/R-data.html (at least that's where >> it is on Debian...) the documentation is unclear. Comments below. > > > The documentation is, I believe, correct and precise as it stands. > What it doesn't emphasize and mention is the difference between > "BSD socket" and "socket connection", or an "R connection of the socket > type". And it is recommended that you > use "socket connection" instead of "BSD socket". > > The earlier "BSD socket" is created, read, written and closed with > "make.socket"/"read.socket"/"write.socket"/"close.socket". > > The newer "socket connection" is created by creating a new connection > object like this: > con <- socketConnection(port = 79, blocking = TRUE) > and invoking the open/write/read method of the "connection" > object. Type "?connection" at an R prompt for details. > > "BSD socket" is a unix concept, "socket connection" is an R object. > The paragraph should have put "BSD socket" and "socket connection" > in quotes or italics. Make more sense? > > Somebody please fix the paragraph... :-). > >> The paragraph has unclear references, and I have no idea what >> it actually means. >> >> >>>> Base R comes with some facilities to communicate via BSD sockets on >>>> systems >> >> >> that support them (...). One potential problem >> >>>> For new projects it is suggested that socket connections are used >>>> instead. >> >> >> >> "Used instead"? Instead of what? 
>> >> >>>> The earlier low-level interface is given by functions make.socket, >> >> >> read.socket, write.socket and close.socket. >> What does "earlier" mean? Earlier than what?
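To make the two interfaces discussed above concrete, here is a minimal side-by-side sketch. The wrapper names read_line_via_connection() and read_via_bsd_socket() are hypothetical, and the host and port are whatever service you happen to be talking to; nothing here is taken from the R-data manual itself.

```r
# Newer interface: an R connection object of the socket type.
read_line_via_connection <- function(host, port) {
  con <- socketConnection(host = host, port = port,
                          blocking = TRUE, open = "r+")
  on.exit(close(con))          # always release the connection
  readLines(con, n = 1)        # generic connection method
}

# Earlier low-level interface: a bare BSD socket handle.
read_via_bsd_socket <- function(host, port) {
  sock <- make.socket(host, port)
  on.exit(close.socket(sock))  # always release the socket
  read.socket(sock)            # returns whatever bytes are available
}
```

The first form is the one "used instead": it works with the ordinary connection functions (readLines, writeLines, close), whereas the second needs its own read.socket/write.socket/close.socket calls.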
Re: [Rd] Section 7.1 HTML documentation (PR#8484)
Well, you make two very strong assumptions. First, that your readers start in the beginning and read to the end. Second, that your readers are sufficiently dedicated to learn your terminology. The first is false: I got to that page via Google. The second is only true in varying degrees, and I wouldn't depend on it too strongly. When writing documentation, you really have to write for the case of someone who has a specific problem and wants to understand that problem as quickly as possible. That means the manuals should have "local support" -- most of what you need to know should be in one place, and everything else should be referenced or hyperlinked. Speaking almost professionally (since I'm almost a linguist), the word "instead" is normally used in the form "instead of X", and you can only delete the "of X" when X is clear and obvious. For instance, one wouldn't just write "I go to work instead." because your readers won't know the alternative to work. Likewise, with "earlier": the underlying form is "earlier than Y", and you can only delete "than Y" when your readers are quite clear what you are comparing to. That's what I meant by "dangling": that X and Y were not clear. Hin-Tak Leung wrote: > Greg Kochanski wrote: > >> Well, I don't know how it can be precise >> and correct when it has dangling antecedents. >> Grammatically speaking, that's the equivalent of >> an uninitialized pointer. > > > I don't think there is anything "dangling" there. What the paragraph > assumes (and quite patently wrongly) is that the reader had encountered > the concept of "R connection object of the socket type" elsewhere. > Without that background, one tends to interpret the phrase "socket > connection" in the traditional unix sense (i.e. = "BSD socket"), and > hence one reads the paragraph as " XXX is older than XXX and XXX is > newer than XXX and there had been potential problems with XXX and > one should use XXX instead (of XXX)". Yep. 
> >> However, I agree with you that it probably just >> needs a minor bit of fiddling to make sure it >> answers "Instead of what?" and "Earlier than what?" > > > I have re-read R-data and it seems the fault is yours. Because > "Connection" is mentioned in quite a major way and is the entire subject > of chapter 6 and comes earlier than the paragraph you quoted in > chapter 7. So it seems to be your own fault of trying to > understand chapter 7 without noticing the header of chapter 6 > nor reading it! That may be so, but it is irrelevant. The object of this exercise is not to assign blame, but to make the software more useful for the next user. Consequently, you might want to fix it (even if it is my fault), so long as it is likely to help the next guy (even if it is his fault). And, I contend that a lot more people Google into the middle of the documentation than read it from beginning to end. QED.
[Rd] --gui=Tk window does not stretch (PR#8520)
Full_Name: Greg Kochanski Version: 2.2.0 OS: Debian Linux Submission from: (NULL) (212.159.16.190) When you grab the corner of the Tk-R (R's console) window, the window stretches, but the useable area does not. It remains firmly fixed at the (rather small) value of 24 lines. In fact, you end up with a grey border of wasted pixels around the active white area that contains the text. (And, please don't tell me that it's not a bug because it's been that way for 15 years, or because the S documentation states that the terminal window is 24 lines high. That would shatter my dreams and illusions.)
[Rd] dataentry() (PR#8535)
Full_Name: Greg Kochanski Version: 2.2.1 OS: Debian Linux (testing) Submission from: (NULL) (212.159.16.190) In writing class notes to teach people how to use R, I came across a design failure of dataentry(). It seems that if you add a new value outside the bounds of an array, dataentry() fills the intervening space with NA. That's reasonable, but what happens if you *accidentally* entered a value outside the bounds? There's no way to get rid of it. Note that you are doomed once you type anything beyond the end of an array, even if you delete your typing before moving the mouse out of the cell -- even then, that cell and others between it and the end of the array will be filled with NA. I would suggest that some mechanism be added to allow arrays to be shortened in the data editor. It would be generally useful, even beyond fixing typing mistakes. (I recognize that you can shorten an array with something like x <- x[1:132], but it should still be possible in the editor.)
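For reference, the command-line workaround mentioned at the end of the report can be written so that the cut-off point is found automatically. This is only a sketch: the vector x is made-up example data, and it assumes the unwanted cells are trailing NA values and that no genuine NA sits at the end of the data.

```r
# Example data: three real values followed by accidental NA padding.
x <- c(1.5, 2.5, 3.5, NA, NA)

# Index of the last non-missing entry.
last_real <- max(which(!is.na(x)))

# Keep everything up to and including that entry.
x <- x[seq_len(last_real)]
```

This does outside the editor what the report asks the editor to allow directly.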
[Rd] mosaicplot() labels overlap (PR#8536)
Full_Name: Greg Kochanski Version: 2.2.1 OS: Debian Linux (testing) Submission from: (NULL) (212.159.16.190) This is really a feature request. When you do mosaicplot() on a data set where the probability of several nearby rows is small, then the labels for those rows are plotted overlapping each other. This situation can be improved by calling mosaicplot() with a large value of "off", but sometimes, even off=50 (the largest allowable value) isn't sufficient, especially if the labels are several characters long. The problem exists even if the labels don't overlap, because one needs space between the labels to avoid confusion. For instance, labels "L*H", "!H*", and "L%" when too close together turn into "L*H!H*L%" which is confusing to anyone. The problem could be solved by breaking the assumption that the label position need always be exactly matched to the graphic. This is OK, especially for rows because (a) the graphical blocks that are part of a single row aren't aligned with each other anyway, and (b) if you can read the labels, you can generally match things up by counting. One way to do this in a fairly nice way is to position the labels in such a way as to minimize the sum of the squared error between the label center and the average position of the blocks on that row, subject to the constraint that labels be non-overlapping. This problem is actually not too hard to solve: it is essentially Kruskal's algorithm for finding a best-fit monotonic sequence (which probably exists in CRAN already). Neglecting edge effects, assume you have a vector of desired positions z, and a vector of minimum widths for each label w. Then, you can compute the space used up by the labels: s[i] = -0.5*w[1] + sum(j [the remainder of this message is truncated in the archive]
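Since the formula in the report is cut off, the following is not a reconstruction of it but one possible sketch of the idea, under stated assumptions: labels are kept in their given order, the minimum centre-to-centre gap between neighbours is half the sum of their widths, and the monotone least-squares fit is done with isoreg() from the stats package. The function name place_labels() and the example numbers are hypothetical.

```r
# Hypothetical helper: move label centres as little as possible (least
# squares) from their desired positions z while keeping neighbours at
# least (w[i] + w[i+1]) / 2 apart.
place_labels <- function(z, w) {
  n <- length(z)
  # Minimum centre-to-centre spacing between label i and label i + 1.
  gap <- (w[-n] + w[-1]) / 2
  # Offset of each label if all labels were packed tightly together.
  s <- c(0, cumsum(gap))
  # After subtracting s, "non-overlapping" becomes "non-decreasing",
  # so an isotonic regression gives the least-squares solution.
  q <- isoreg(seq_len(n), z - s)$yf
  q + s
}

z <- c(1.0, 1.2, 1.3, 5.0)   # desired centres; the first three collide
w <- c(1, 1, 1, 1)           # label widths
p <- place_labels(z, w)      # adjusted centres
```

In this example the three crowded labels are spread out symmetrically around their common average, while the isolated fourth label stays where it was.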
[Rd] Mosaicplot coloring (PR#8537)
Full_Name: Greg Kochanski Version: 2.2.1 OS: Debian Linux (testing) Submission from: (NULL) (212.159.16.190) mosaicplot(x, shade=TRUE) is intended to color the blocks blue if they are more common than one might expect and red if they are rarer than one might expect. Unfortunately, if a block is much rarer than expected, it is so narrow that one cannot see the red. Thus, a casual inspection of the mosaicplot will miss some of the most statistically significant results. This is partially an intrinsic problem and cannot be entirely fixed, but it is made worse by the black outlines around each block. Blocks with very small probabilities show as black, not red. The broken outlines on the red blocks help, but not quite enough. I would suggest that there be an option to either turn off the black outlines for the colored blocks, or an option to use colored outlines. If those options are somehow hidden in the ... part of the argument list for mosaicplot(), I apologize, but then this bug report should be converted to a documentation bug. Nowhere in help(mosaicplot) does it say what one can put into the unspecified arguments (...).
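As a hedged illustration of the option being asked for: current releases of R document a border argument for mosaicplot(), so, assuming your version has it (this thread does not establish whether 2.2.1 did), the outlines can be switched off as sketched below. The HairEyeColor data set and the null PDF device are used only to keep the example self-contained.

```r
# Draw to a null device so the example runs without opening a window.
pdf(file = NULL)

# Shaded mosaic plot with the cell outlines suppressed (border = NA),
# so that very narrow cells show their fill colour instead of black.
mosaicplot(HairEyeColor, shade = TRUE, border = NA)

dev.off()
```

Passing a colour instead of NA would give coloured outlines, the other variant suggested in the report.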
Re: [Rd] Mosaicplot coloring (PR#8537)
Achim Zeileis wrote: > On Sun, 29 Jan 2006 [EMAIL PROTECTED] wrote: > > >>Full_Name: Greg Kochanski >>Version: 2.2.1 >>OS: Debian Linux (testing) >>Submission from: (NULL) (212.159.16.190) >> >> >>mosaicplot(x, shade=TRUE) is intended to color the blocks >>blue if they are more common than one might expect >>and red if they are rarer than one might expect. >> >>Unfortunately, if a block is much rarer than expected, >>it is so narrow that one cannot see the red. > > > Where is the bug?? Please read Section 9 in > http://CRAN.R-project.org/doc/manuals/R-FAQ.html > and also the posting guide at > http://www.R-project.org/posting-guide.html > The bug is that the software produces results that could lead to the wrong conclusion in a research paper, or could lead the readers of the research paper to an erroneous belief. That sounds like a relevant definition of a bug to me. From section 9: > Finally, a command's intended definition may not be best for > statistical analysis. This is a very important sort of problem, > but it is also a matter of judgment. > > ... The manual's job is to make everything clear. > It is just as important to report documentation bugs > as program bugs From my reading of section 9, this is a documentation bug. ... > For an enhanced implementation of mosaic plots written in the grid > graphics system, see the package "vcd" and the functions mosaic() and > strucplot(). See the package vignettes for details on control of the > graphical appearance and also for combining shading and significance > testing. To overcome the problem of small cells, another approach is to > plot expected instead of observed frequencies. > Z You shouldn't be telling this to me, you should be putting it in the documentation where it might help more than one person. Putting a "see also" note in help(mosaicplot) that points to the "vcd" package, and "mosaic" and "strucplot" functions might be a solution to the problem. 
Re: [Rd] Mosaicplot coloring (PR#8537)
Achim Zeileis wrote: > On Mon, 30 Jan 2006, Greg Kochanski wrote: > > >>The bug is that the software produces results that could >>lead to the wrong conclusion in a research paper, >>or could lead the readers of the research paper to >>an erroneous belief. That sounds like a >>relevant definition of a bug to me. > > > Maybe. However, it seems to be a bug in the way you interpret mosaic > displays, not in the way they are implemented/documented in R. OK. Call it that if you want, though I expect that I share the bug with many other people. > > As I said before: This is a known issue with mosaic displays which is not > so hard to find out if you consult the references given in ?mosaicplot. The problem I see is that you (as a representative for the r-project) are the wrong person to judge the success or failure of the documentation. You presumably know the software in detail. Documentation is (to at least some degree) intended for use by people who _do_not_ know the software well. Expecting a package's developer to judge documentation is like asking a bald man to judge which comb is best. He knows what a comb is for, he may remember using one, but it's not quite the same as actually needing and using one. > > Another solution to your problem might be to use association plots > (assocplot() is referred to in ?mosaicplot, assoc() is again a more > flexible implementation in "vcd"). Thanks; that may help me; I appreciate the suggestion. (I must point out, though, it doesn't help improve mosaicplot().) > > >>>For an enhanced implementation of mosaic plots... >>You shouldn't be telling this to me, >>you should be putting it in the documentation where > > > Note that I'm neither the author of the mosaicplot() function nor > its manual page. Just out of curiosity, why are you responding to bug reports that you don't have the power to fix? >>Putting a "see also" note in help(mosaicplot) that points >>to the "vcd" package, ... >>might be a solution to the problem. 
> > vcd is `only' a contributed package on CRAN, hence not referred to from > base packages. But, if the contributed packages are better (as you seem to say) than the base packages, perhaps they *should* be mentioned?
Re: [Rd] --gui=Tk window does not stretch (PR#8520)
Peter Dalgaard wrote: > > As you're bound to discover, the Tk console is mainly a > proof-of-concept with shortcomings in many other areas as well. It's > been largely undeveloped (as has the Gnome GUI) because we had very > little feedback to indicate that people were actually interested in > getting it to work better. Patches might be considered. Does that mean "patches will be considered and accepted if they meet reasonable criteria for simplicity and correctness." or does it mean something less?
Re: [Rd] Mosaicplot coloring (PR#8537)
Achim Zeileis wrote: > Greg: > > >>OK. Call it that if you want, though I expect that I share >>the bug with many other people. > > > What I tried to say here was: Reports of user errors do not belong on > R-bugs. A request on R-help would have been more likely to generate a > useful, friendly and widely shared reply/discussion. And what I was trying to say is that any behaviour of a program that makes user errors likely can reasonably be considered a bug. I'll grant you that an individual user error is mere anecdotal evidence, but an individual's report combined with a plausible argument that other people will make the same mistake deserves some attention. I would make an analogy to accident reports from aircraft accidents. Most accidents are caused (at least in part) by user error, and yet the reports recommend design changes to the aircraft to minimize the probability of future errors. > FYI: Meanwhile, R-core has signalled that they would add a cross > reference to vcd in this case. Hence, I'll suggest a documentation > patch to R-core. > Thanks!