[Rd] help.start() and Debian packaging (PR#8483)

2006-01-14 Thread greg . kochanski
Full_Name: Greg Kochanski
Version: 2.2.0
OS: Debian Linux on i686
Submission from: (NULL) (212.159.16.190)


Debian packages the R documentation separately from the R core code.
Consequently, it is possible for people to have R without
the HTML documentation.   (In fact, the docs are not installed by default,
so it's very likely.)


Thus, help.start() cannot depend on the HTML documentation being there.
It should check for one (or a few) files and produce some reasonable
error message if it is not there.   Maybe something like
"Warning: the HTML documentation is not installed."

Alternatively, help.start() could produce references to some on-line
HTML documentation, instead of local documentation.



A related bug is that if one calls
help.start()  when the HTML documentation does not exist,
all future calls to help() will lead to errors.
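
A minimal sketch of such a check, assuming the HTML index lives at
html/index.html under R.home("doc") (the exact layout may differ between
R versions and distributions). The helper names html_docs_installed() and
safe_help_start() are hypothetical and not part of R:

```r
# Hypothetical helper: TRUE if the HTML index file is present.
# The path html/index.html is an assumption about the installed layout.
html_docs_installed <- function(doc_dir = R.home("doc")) {
  file.exists(file.path(doc_dir, "html", "index.html"))
}

# Hypothetical wrapper: warn and return FALSE instead of failing later.
safe_help_start <- function(doc_dir = R.home("doc")) {
  if (!html_docs_installed(doc_dir)) {
    warning("the HTML documentation is not installed", call. = FALSE)
    return(invisible(FALSE))
  }
  help.start()
  invisible(TRUE)
}
```

With a guard of this kind, a missing r-doc-html package would produce a
warning rather than leaving later help() calls in a broken state.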

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


[Rd] Section 7.1 HML documentation (PR#8484)

2006-01-14 Thread greg . kochanski
Full_Name: Greg Kochanski
Version: 2.2.0
OS: Debian Linux i686
Submission from: (NULL) (212.159.16.190)


In /usr/share/doc/r-doc-html/manual/R-data.html (at least that's where
it is on Debian...) the documentation is unclear.   Comments below.


The paragraph has unclear references, and I have no idea what
it actually means.

>> Base R comes with some facilities to communicate via BSD sockets on systems
>> that support them (...). One potential problem ...
>> For new projects it is suggested that socket connections are used instead.

"Used instead"?   Instead of what?

>> The earlier low-level interface is given by functions make.socket,
>> read.socket, write.socket and close.socket.

What does "earlier" mean?   Earlier than what?



[Rd] Minor: bad label in "faithful" dataset (PR#8485)

2006-01-14 Thread greg . kochanski
Full_Name: Greg Kochanski
Version: 2.2.0
OS: Debian Linux i686
Submission from: (NULL) (212.159.16.190)


The data set for "faithful" appears to be (column 1) the duration
of the eruptions, and (column 2) the interval between eruptions.
(See 
http://stat.ethz.ch/R-manual/R-patched/library/datasets/html/faithful.html).

The label of column 1 is wrong, and should be "eruption duration"
not just "eruptions", which implies a count of eruptions not a duration.



Re: [Rd] help.start() and Debian packaging (PR#8483)

2006-01-16 Thread greg . kochanski
While I agree with you, I find that the Debian packager does not.
I already reported the problem to Debian, and they said that
enough people want light-weight installations that they will
continue splitting R into several parts.
The package maintainer is  Dirk Eddelbuettel <[EMAIL PROTECTED]>,
and the relevant bug report is 348051.

His response was this:
| > Ok, that confirms that all you need to do is to install r-doc-html.
| > No bug, it is designed this way.


Consequently, I can only appeal to your humanity and
to good programming practice.

It is good programming practice to protect the user from
his/her own mistakes, even if those mistakes are made
easier/encouraged by Debian.   It is also good programming
practice to provide appropriate error messages when something
goes wrong, even if it "shouldn't" ever go wrong.

So, yeah, you can make an argument that you don't have to
do it, but R will be a better piece of software if you make
the change.


Prof Brian Ripley wrote:
> This is all based on a false premise: that a partial install of Debian
> files is 'R'.
> 
> R's own scripts do always install the HTML documentation, so 
> help.start() is entitled to assume that it is present. ...
> 
> Note that your version of 'R' is not current.
> 
> If there is a bug here, it is in the Debian re-packaging.  I trust the 
> Debian packages do contain a bug reporting address other than this one: 
> please use the correct one.  (The other binary distributions that I am 
> aware of, e.g. RPMs, do seem to include all of R.)
> 
> On Sat, 14 Jan 2006 [EMAIL PROTECTED] wrote:
> 
>> Full_Name: Greg Kochanski
>> Version: 2.2.0
>> OS: Debian Linux on i686
>> Submission from: (NULL) (212.159.16.190)
>>
>>
>> Debian packages the R documentation separately from the R core code.
>> Consequently, it is possible for people to have R without
>> the HTML documentation.   (In fact, the docs are not installed by 
>> default,
>> so it's very likely.)
>>
>>
>> Thus, help.start() cannot depend on the HTML documentation being there.
>> It should check for one (or a few) files and produce some reasonable
>> error message if it is not there.   Maybe something like
>> "Warning: the HTML documentation is not installed."
>>
>> Alternatively, help.start() could produce references to some on-line
>> HTML documentation, instead of local documentation.
>>
>>
>>
>> A related bug is that if one calls
>> help.start()  when the HTML documentation does not exist,
>> all future calls to help() will lead to errors.
> 
> 
> Working as documented is not a bug.
>



Re: [Rd] Section 7.1 HML documentation (PR#8484)

2006-01-17 Thread Greg Kochanski
Well, I don't know how it can be precise
and correct when it has dangling antecedents.
Grammatically speaking, that's the equivalent of
an uninitialized pointer.

However, I agree with you that it probably just
needs a minor bit of fiddling to make sure it
answers "Instead of what?" and "Earlier than what?"


Hin-Tak Leung wrote:
> [EMAIL PROTECTED] wrote:
> 
>> Full_Name: Greg Kochanski
>> Version: 2.2.0
>> OS: Debian Linux i686
>> Submission from: (NULL) (212.159.16.190)
>>
>>
>> In /usr/share/doc/r-doc-html/manual/R-data.html (at least that's where
>> it is on Debian...) the documentation is unclear.   Comments below.
> 
> 
> The documentation is, I believe, correct and precise as it stands.
> What it doesn't emphasize and mention is the difference between
> "BSD socket" and "socket connection", or an "R connection of the socket 
> type". And it is recommended that you
> use "socket connection" instead of "BSD socket".
> 
> The earlier "BSD socket" is created, read, written and closed with
> "make.socket"/"read.socket"/"write.socket"/"close.socket".
> 
> The newer "socket connection" is created by creating a new connection 
> object like this:
>  con <- socketConnection(port = 79, blocking = TRUE)
> and invoking the open/write/read method of the "connection"
> object. type "?connection" in an R prompt for details.
> 
> "BSD socket" is a unix concept, "socket connection" is an R object.
> The paragraph should have put "BSD socket" and "socket connection"
> in quotes or italics. Make more sense?
> 
> Somebody please fix the paragraph... :-).
> 
>> The paragraph has unclear references, and I have no idea what
>> it actually means.
>>
>>
>>>> Base R comes with some facilities to communicate via BSD sockets on 
>>>> systems
>>
>>
>> that support them (...). One potential problem
>>
>>>> For new projects it is suggested that socket connections are used 
>>>> instead.
>>
>>
>>
>> "Used instead"?   Instead of what?
>>
>>
>>>> The earlier low-level interface is given by functions make.socket,
>>
>>
>> read.socket, write.socket and close.socket.
>> What does "earlier" mean?   Earlier than what?
>>
>> __
>> R-devel@r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-devel
> 
> 
> 
>

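To make the distinction drawn in the quoted reply concrete, here is a
hedged sketch contrasting the two interfaces. The wrapper names
fetch_low_level() and fetch_connection() are invented for illustration;
the calls inside them (make.socket, write.socket, read.socket,
close.socket, socketConnection, writeLines, readLines, close) are
existing R functions. Nothing is contacted until one of the wrappers is
called with a real host and port:

```r
# Earlier low-level interface: an explicit socket handle is passed to
# make.socket/write.socket/read.socket/close.socket.
fetch_low_level <- function(host, port, request) {
  s <- make.socket(host, port)
  on.exit(close.socket(s))
  write.socket(s, request)
  read.socket(s)                    # returns whatever arrives, as a string
}

# Newer interface: a connection object of the socket type, used with the
# same generic functions as files and pipes.
fetch_connection <- function(host, port, request) {
  con <- socketConnection(host = host, port = port, blocking = TRUE)
  on.exit(close(con))
  writeLines(request, con)
  readLines(con, n = 1)             # first line of the reply
}
```

Read this way, "instead" means "instead of the make.socket family", and
"earlier" means "earlier than socketConnection".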


Re: [Rd] Section 7.1 HML documentation (PR#8484)

2006-01-18 Thread Greg Kochanski
Well, you make two very strong assumptions.

First, that your readers start at the beginning and read to the
end.
Second, that your readers are sufficiently dedicated to learn
your terminology.

The first is false:  I got to that page via Google.
The second is only true in varying degrees,
and I wouldn't depend on it too strongly.

When writing documentation, you really have to write for
the case of someone who has a specific problem and wants
to understand that problem as quickly as possible.
That means the manuals should have "local support" --
most of what you need to know should be in one place, and
everything else should be referenced or hyperlinked.

Speaking almost professionally (since I'm almost a linguist),
the word "instead" is normally used in the form "instead of X",
and you can only delete the "of X" when X is clear and obvious.

For instance, one wouldn't just write

"I go to work instead."

because your readers won't know the
alternative to work.
Likewise, with "earlier":  the underlying form is
"earlier than Y", and you can only delete "than Y" when your
readers are quite clear what you are comparing to.

That's what I meant by "dangling": that X and Y were not clear.

Hin-Tak Leung wrote:
> Greg Kochanski wrote:
> 
>> Well, I don't know how it can be precise
>> and correct when it has dangling antecedents.
>> Grammatically speaking, that's the equivalent of
>> an uninitialized pointer.
> 
> 
> I don't think there is anything "dangling" there. What the paragraph 
> assumes (and quite patently wrongly) is that the reader had encountered 
> the concept of "R connection object of the socket type" elsewhere. 
> Without that background, one tends to interpret the phrase "socket
> connection" in the traditional unix sense (i.e. = "BSD socket"), and
> hence one reads the paragraph as " XXX is older than XXX and XXX is
> newer than XXX and there had been potential problems with XXX and
> one should use XXX instead (of XXX)".

Yep.


> 
>> However, I agree with you that it probably just
>> needs a minor bit of fiddling to make sure it
>> answers "Instead of what?" and "Earlier than what?"
> 
> 
> I have re-read R-data and it seems the fault is yours. Because 
> "Connection" is mentioned in quite a major way and is the entire subject
> of chapter 6 and comes earlier than the paragraph you quoted in
> chapter 7. So it seems to be your own fault of trying to
> understand chapter 7 without noticing the header of chapter 6
> nor reading it!

That may be so, but it is irrelevant.   The object of this
exercise is not to assign blame, but to make the software
more useful for the next user.

Consequently, you might want to fix it (even if it is my fault),
so long as it is likely to help the next guy (even if it is his fault).
And, I contend that a lot more people Google into the middle
of the documentation than read it from beginning to end.  QED.



[Rd] --gui=Tk window does not stretch (PR#8520)

2006-01-23 Thread greg . kochanski
Full_Name: Greg Kochanski
Version: 2.2.0
OS: Debian Linux
Submission from: (NULL) (212.159.16.190)


When you grab the corner of the Tk-R (R's console) window,
the window stretches, but the useable area does not.
It remains firmly fixed at the (rather small) value of
24 lines.

In fact, you end up with a grey border  of wasted pixels
around the active white area that contains the text.

(And, please don't tell me that it's not a bug because
it's been that way for 15 years, or because the S
documentation states that the terminal window is
24 lines high.   That would shatter my dreams
and illusions.)



[Rd] dataentry() (PR#8535)

2006-01-29 Thread greg . kochanski
Full_Name: Greg Kochanski
Version: 2.2.1
OS: Debian Linux (testing)
Submission from: (NULL) (212.159.16.190)


In writing class notes to teach people how to use R, I came across
a design failure of dataentry().

It seems that if you add a new value outside the bounds of an array,
dataentry() fills the intervening space with NA. That's reasonable,
but what happens if you *accidentally* entered a value outside the
bounds?  There's no way to get rid of it.

Note that you are doomed once you type anything beyond the end
of an array, even if you delete your typing before moving the
mouse out of the cell -- even then, that cell and others
between it and the end of the array will be filled with NA.

I would suggest that some mechanism be added to allow
arrays to be shortened in the data editor. It would
be generally useful, even beyond fixing typing mistakes.

(I recognize that you can shorten an array with something
like x <- x[1:132], but it should still be possible in the
editor.)
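
For the record, a sketch of that command-line workaround, which trims
the trailing NA padding rather than hard-coding a length.
drop_trailing_na() is a hypothetical helper, not part of R, and it
assumes the unwanted NAs are all at the end of the vector:

```r
# Hypothetical helper: drop NA values after the last non-NA element.
# NAs in the middle of the vector are kept, since they may be real data.
drop_trailing_na <- function(x) {
  keep <- which(!is.na(x))
  if (length(keep) == 0L) return(x[0])   # nothing but NA: empty vector
  x[seq_len(max(keep))]
}

x <- c(10, 20, NA, 30, NA, NA)           # NA padding left by the editor
x <- drop_trailing_na(x)                 # x is now 10 20 NA 30
```

This only repairs the damage afterwards; it does not replace a way to
shorten the array inside the editor.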



[Rd] mosaicplot() labels overlap (PR#8536)

2006-01-29 Thread greg . kochanski
Full_Name: Greg Kochanski
Version: 2.2.1
OS: Debian Linux (testing)
Submission from: (NULL) (212.159.16.190)


This is really a feature request.

When you do mosaicplot() on a data set where the probability of
several nearby rows is small, then the labels for those
rows are plotted overlapping each other.

This situation can be improved by calling mosaicplot()
with a large value of "off", but sometimes, even off=50
(the largest allowable value) isn't sufficient,
especially if the labels are several characters long.

The problem exists even if the labels don't overlap,
because one needs space between the labels to avoid
confusion.   For instance, labels "L*H", "!H*", and
"L%" when too close together turn into
"L*H!H*L%" which is confusing to anyone.

The problem could be solved by breaking the assumption that
the label position need always be exactly matched to the
graphic.  This is OK, especially for rows because
(a) the graphical blocks that are part of a single row
aren't aligned with each other anyway, and
(b) if you can read the labels, you can generally
match things up by counting.

One way to do this in a fairly nice way is to position
the labels in such a way to minimize the
sum of the squared error between the label center
and the average position of the blocks on that row,
subject to the constraint that labels be
non-overlapping.

This problem is actually not too hard to solve:
it is essentially Kruskal's algorithm for finding
a best-fit monotonic sequence  (which probably exists in
CRAN already).

Neglecting edge effects, assume you have a
vector of desired positions z, and
a vector of minimum widths for each label w.
Then, you can compute the space used up by
the labels:  s[i] = -0.5*w[1] + sum(j ...
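
A rough sketch of the idea in R. Because the formula above is cut off,
the offset s used here is an assumption: s[i] is taken to be the
distance from the centre of label 1 to the centre of label i when the
labels are packed edge to edge. With that choice the constrained problem
reduces to a monotone least-squares fit, done here with isoreg() from
the stats package. place_labels() is a hypothetical name, and z is
assumed to be sorted in increasing order:

```r
# Hypothetical helper: label centres closest (in squared error) to the
# desired positions z, subject to labels of width w not overlapping.
place_labels <- function(z, w) {
  # Assumed offsets: centre-to-centre distance from label 1 to label i
  # when labels sit edge to edge, so s[1] is 0.
  s <- cumsum(w) - 0.5 * w - 0.5 * w[1]
  # Non-overlap of p = y + s is equivalent to y being nondecreasing,
  # so fit a nondecreasing sequence to the shifted targets z - s.
  y <- isoreg(z - s)$yf
  y + s
}

z <- c(1, 1.1, 1.2, 5)      # three rows crowded together, one far away
w <- c(1, 1, 1, 1)          # each label needs one unit of space
place_labels(z, w)
```

Labels that already have enough room keep their desired positions; only
the crowded ones are spread out.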


[Rd] Mosaicplot coloring (PR#8537)

2006-01-29 Thread greg . kochanski
Full_Name: Greg Kochanski
Version: 2.2.1
OS: Debian Linux (testing)
Submission from: (NULL) (212.159.16.190)


mosaicplot(x, shade=TRUE) is intended to color the blocks
blue if they are more common than one might expect
and red if they are rarer than one might expect.

Unfortunately, if a block is much rarer than expected,
it is so narrow that one cannot see the red.  Thus,
a casual inspection of the mosaicplot will miss some
of the most statistically significant results.

This is partially an intrinsic problem and cannot be
entirely fixed, but it is made worse by the black outlines
around each block.  Blocks with very small probabilities
show as black, not red.   The broken outlines on the
red blocks help, but not quite enough.

I would suggest an option either to turn off the black outlines
for the colored blocks or to use colored outlines.

If those options are somehow hidden in the ... part of the
argument list for mosaicplot(), I apologize, but then this
bug report should be converted to a documentation bug.
Nowhere in help(mosaicplot) does it say what one can put into
the unspecified arguments (...).
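
Until the plot itself is changed, one hedged workaround is to list the
extreme cells numerically so that a sliver too thin to see is not
overlooked. The sketch below uses the Pearson residuals from
chisq.test(), which for a two-way table should be the quantities the
shading is based on; extreme_cells() is a hypothetical helper and the
cutoff of 2 is an assumption chosen to match the usual shading
threshold:

```r
# Hypothetical helper: cells of a two-way table whose Pearson residual
# under independence exceeds the cutoff in absolute value.
extreme_cells <- function(tab, cutoff = 2) {
  res <- chisq.test(tab)$residuals
  cells <- as.data.frame(as.table(res), responseName = "residual")
  cells <- cells[abs(cells$residual) > cutoff, ]
  cells[order(cells$residual), ]     # most under-represented cells first
}

tab <- margin.table(HairEyeColor, c(1, 2))   # built-in Hair x Eye table
extreme_cells(tab)
```

The rows at the top of the result are the rare cells that the plot
tends to hide.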



Re: [Rd] Mosaicplot coloring (PR#8537)

2006-01-30 Thread Greg Kochanski


Achim Zeileis wrote:
> On Sun, 29 Jan 2006 [EMAIL PROTECTED] wrote:
> 
> 
>>Full_Name: Greg Kochanski
>>Version: 2.2.1
>>OS: Debian Linux (testing)
>>Submission from: (NULL) (212.159.16.190)
>>
>>
>>mosaicplot(x, shade=TRUE) is intended to color the blocks
>>blue if they are more common than one might expect
>>and red if they are rarer than one might expect.
>>
>>Unfortunately, if a block is much rarer than expected,
>>it is so narrow that one cannot see the red.
> 
> 
> Where is the bug?? Please read Section 9 in
>   http://CRAN.R-project.org/doc/manuals/R-FAQ.html
> and also the posting guide at
>   http://www.R-project.org/posting-guide.html
> 

The bug is that the software produces results that could
lead to the wrong conclusion in a research paper,
or could lead the readers of the research paper to
an erroneous belief.  That sounds like a
relevant definition of a bug to me.

From section 9:
> Finally, a command's intended definition may not be best for
> statistical analysis. This is a very important sort of problem,
> but it is also a matter of judgment.
> ... The manual's job is to make everything clear.
> It is just as important to report documentation bugs
> as program bugs

From my reading of section 9, this is a documentation bug.

...
> For an enhanced implementation of mosaic plots written in the grid
> graphics system, see the package "vcd" and the functions mosaic() and
> strucplot(). See the package vignettes for details on control of the
> graphical appearance and also for combining shading and significance
> testing. To overcome the problem of small cells, another approach is to
> plot expected instead of observed frequencies.
> Z


You shouldn't be telling this to me,
you should be putting it in the documentation where
it might help more than one person.
Putting a "see also" note in help(mosaicplot) that points
to the "vcd" package, and "mosaic" and "strucplot" functions
might be a solution to the problem.



Re: [Rd] Mosaicplot coloring (PR#8537)

2006-01-30 Thread Greg Kochanski


Achim Zeileis wrote:
> On Mon, 30 Jan 2006, Greg Kochanski wrote:
> 
> 
>>The bug is that the software produces results that could
>>lead to the wrong conclusion in a research paper,
>>or could lead the readers of the research paper to
>>an erroneous belief.  That sounds like a
>>relevant definition of a bug to me.
> 
> 
> Maybe. However, it seems to be a bug in the way you interpret mosaic
> displays, not in the way they are implemented/documented in R.

OK.  Call it that if you want, though I expect that I share
the bug with many other people.


> 
> As I said before: This is a known issue with mosaic displays which is not
> so hard to find out if you consult the references given in ?mosaicplot.

The problem I see is that you (as a representative for the r-project)
are the wrong person to judge the success or failure of the
documentation.  You presumably know the software in detail.
Documentation is (to at least some degree) intended for use by
people who _do_not_ know the software well.

Expecting a package's developer to judge documentation is
like asking a bald man to judge which comb is best.   He knows
what a comb is for, he may remember using one, but it's not
quite the same as actually needing and using one.



> 
> Another solution to your problem might be to use association plots
> (assoplot() is referred to in ?mosaicplot, assoc() is again a more
> flexible implementation in "vcd").

Thanks; that may help me; I appreciate the suggestion.
(I must point out, though, it doesn't help improve
mosaicplot().)


> 
> 
>>>For an enhanced implementation of mosaic plots...
>>You shouldn't be telling this to me,
>>you should be putting it in the documentation where
> 
> 
> Note that I'm neither the author of the mosaicplot() function nor
> its manual page.

Just out of curiosity,
why are you responding to bug reports that you don't
have the power to fix?


>>Putting a "see also" note in help(mosaicplot) that points
>>to the "vcd" package, ...
>>might be a solution to the problem.
> 
> vcd is `only' a contributed package on CRAN, hence not referred to from
> base packages.

But, if the contributed packages are better (as you seem to say)
than the base packages, perhaps they *should* be mentioned?



Re: [Rd] --gui=Tk window does not stretch (PR#8520)

2006-01-30 Thread Greg Kochanski


Peter Dalgaard wrote:

> 
> As you're bound to discover, the Tk console is mainly a
> proof-of-concept with shortcomings in many other areas as well. It's
> been largely undeveloped (as has the Gnome GUI) because we had very
> little feedback to indicate that people were actually interested in
> getting it to work better. Patches might be considered.

Does that mean "patches will be considered and accepted if they
meet reasonable criteria for simplicity and correctness",
or does it mean something less?



Re: [Rd] Mosaicplot coloring (PR#8537)

2006-01-30 Thread Greg Kochanski


Achim Zeileis wrote:
> Greg:
> 
> 
>>OK.  Call it that if you want, though I expect that I share
>>the bug with many other people.
> 
> 
> What I tried to say here was: Reports of user errors do not belong on
> R-bugs. A request on R-help would have been more likely to generate a
> useful, friendly and widely shared reply/discussion.


And what I was trying to say is that any behaviour of a
program that makes user errors likely can reasonably
be considered a bug.  I'll grant you that an individual
user error is mere anecdotal evidence, but an individual's
report combined with a plausible argument that other people
will make the same mistake deserves some attention.

I would make an analogy to accident reports from aircraft accidents.
Most accidents are caused (at least in part) by user error,
and yet the reports recommend design changes to the
aircraft to minimize the probability of future errors.


> FYI: Meanwhile, R-core has signalled that they would add a cross
> reference to vcd in this case. Hence, I'll suggest a documentation
> patch to R-core.
> 

Thanks!
