Re: [Rd] Using the lazy data mechanism

2006-02-05 Thread Prof Brian Ripley
John,

There are lots of examples without a namespace on CRAN.  The first in the 
alphabet (in the C locale) is DAAG.  Another example (but handled 
specially) is package 'datasets'.

Taking the version on CRAN (car_1.0-18) and adding

LazyData: yes
LazyLoad: yes

worked for me (and you don't actually need the second, as it works whether 
or not the R code is lazy-loaded).
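For concreteness, this is where the directives sit in the package's DESCRIPTION file (a sketch; only the relevant fields are shown, taken from the car example above):

```
Package: car
Version: 1.0-18
LazyData: yes
LazyLoad: yes
```

With these fields, R CMD INSTALL builds the lazy-data database from the .rda files in the data subdirectory at install time, so the datasets become visible on library(car) without calls to data().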

Perhaps you can let me know offline what the problems are?

Brian

On Sat, 4 Feb 2006, John Fox wrote:

> Dear list members,
>
> I'm trying to use the lazy data mechanism with the car package, so far
> without success. The data sets are in the source package's data subdirectory
> in the form of compressed .rda files, and I added the directive LazyData:
> yes to the package's DESCRIPTION file.
>
> I suspect that the problem is that the package has no namespace, but I've
> been unable to find a reference in the Writing R Extensions manual (nor
> elsewhere) that suggests that this is necessary. Is there a place that I've
> missed that describes the lazy data mechanism?

There is an article in R News, but that describes the mechanism per se. 
How to use it is in `Writing R Extensions'.

[...]

-- 
Brian D. Ripley,  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK    Fax:  +44 1865 272595

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] a generic 'attach'?

2006-02-05 Thread Peter Dalgaard
<[EMAIL PROTECTED]> writes:

> Is there any reason why 'attach' is not generic in R?
> 
> I notice that it is in another system, for example,

I wonder which one? ;-)

> and I can see some
> applications if it were so in R.

I suppose there is no particular reason, except that it was probably
"good enough for now" at some point in time. 

Apropos attach(), and apologies in advance for the lengthy rant that
follows:

There are a couple of other annoyances with the attach/detach
mechanism that could do with a review. In particular, detach() is not
behaving according to documentation (return value is really NULL). I
feel that sensible semantics for editing an attached database and
storing it back would be useful. The current semantics tend to get
people in trouble, and some of the stuff you need to explain really
feels quite odd:

attach(airquality)
airquality$Month <- factor(airquality$Month)
# oops, that's not going to work. You need:
detach(airquality)
attach(airquality)

(notice in particular that this tends to keep two copies of the data
in memory at a time).

You can actually modify a database after attaching it (I'm
deliberately not saying "data frame", because it will not be one at
that stage), but it leads to contortions like

assign("Month", factor(Month), "airquality")

or

with(pos.to.env(2), Month <- factor(Month))

(or even with(pos.to.env(match("airquality",search())),))

I've been thinking on and off about these matters. It is a bit tricky
because we'd rather not break code (and books!) all over the place,
but suppose we 

(a) allowed with() to have its first argument interpreted like the 3rd
argument in assign()

(b) made detach() do what it claims: return the (possibly modified)
database. This requires that more metadata are kept around than
currently. Also, the semantics of 

attach(airquality)
assign("foo", function(bar)baz, "airquality") 
aq <- detach(airquality)

would need to be sorted out. Presumably "foo" needs to be dropped
with a warning.

Potentially, one could then also devise mechanisms for load/store
directly to/from the search path.

Alternative ideas include changing the search path itself to be an
actual list of objects (rather than a nesting of environments), but
that leads to the same sort of issues.


-- 
   O__   Peter Dalgaard Øster Farimagsgade 5, Entr.B
  c/ /'_ --- Dept. of Biostatistics PO Box 2099, 1014 Cph. K
 (*) \(*) -- University of Copenhagen   Denmark  Ph:  (+45) 35327918
~~ - ([EMAIL PROTECTED])  FAX: (+45) 35327907

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] a generic 'attach'?

2006-02-05 Thread Bill.Venables
What have I started?  I had nothing anywhere near as radical as that in mind, 
Peter...

One argument against making 'attach' generic might be that such a move would 
slow it down a bit, but I can't really see why speed would be much of an issue 
with 'attach'.

I've noticed that David Brahm's package, g.data, for example, really has a 
method for attach as part of it (well, almost), but he has to call it 
g.data.attach.

Another package that has an obvious application for a method for attach is the 
filehash package of Roger Peng.

And as it happens I have another, but for now I call it 'Attach', which is 
pretty unsatisfying from an aesthetic point of view.

I think I'll just sow the seed for now.  The thing about generic functions is 
that if they exist people sometimes find quite innovative uses for them, and if 
they come at minimal cost, and break no existing code, I suggest we think about 
implementing them.

(Notice I have had no need to use a 'compatibility with another system' 
argument at any stage...)
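For what it's worth, the minimal S3 version of the idea is only a few lines; a sketch only, since base R's attach() is not generic, and the class "foo" below is invented for illustration:

```r
## Sketch: an S3-generic attach(), keeping the existing behaviour as the
## default method.  Purely illustrative; this is not how base R works.
attach <- function(what, ...) UseMethod("attach")
attach.default <- function(what, ...) base::attach(what, ...)

## A package could then register a method for its own database class,
## instead of resorting to names like g.data.attach or Attach:
attach.foo <- function(what, ...) message("attaching a 'foo' database")
```

Dispatch then picks the right method from the class of the first argument, exactly as for print() or summary().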

---

Another, even more minor issue I've wondered about is having rm() return the 
object, or list of objects, that it removes.  Thus

newName <- rm(object)

would become essentially a renaming of an object in memory.
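In current R the renaming has to be spelled out in two steps; a sketch of the equivalent idiom:

```r
## What a value-returning rm() would buy: newName <- rm(object) as a
## one-step rename.  Today the same effect needs a copy plus a remove:
object <- 1:10
newName <- object   # bind the value to the new name
rm(object)          # drop the old binding
```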

For some reason I seem to recall that this was indeed a feature of a very early 
version of the S language, but dropped out (I think) when S3 was introduced.  
Have I got that completely wrong?  (I seem to recall a lot of code had to be 
scrapped at that stage, including something rather reminiscent of the R with(), 
but I digress...)

Bill.


-Original Message-
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Peter Dalgaard
Sent: Sunday, 5 February 2006 8:35 PM
To: Venables, Bill (CMIS, Cleveland)
Cc: [EMAIL PROTECTED]
Subject: Re: [Rd] a generic 'attach'?


[...]

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] a generic 'attach'?

2006-02-05 Thread Prof Brian Ripley
What are you proposing the generic be, and how should it be described?

Most of the current attach seems to be general; the only parts which are 
specific to save() images and lists are

 value <- .Internal(attach(NULL, pos, name))
 load(what, envir = as.environment(pos))
 }
 else value <- .Internal(attach(what, pos, name))

So maybe it is not attach() but some internal version of it (which 
populates a frame on the search list) which needs to be generic.  Indeed, 
dbLoad() in pkg filehash looks like just what one wants here.

[That code is a bit strange: is not 'value' the environment into which you 
want to load things?  So why go via as.environment?]

The devil is in the `well almost'.

On Sun, 5 Feb 2006 [EMAIL PROTECTED] wrote:

> What have I started?  I had nothing anywhere near as radical as that in 
> mind, Peter...
>
> One argument against making 'attach' generic might be that such a move 
> would slow it down a bit, but I can't really see why speed would be much 
> of an issue with 'attach'.

Speed is not an issue.  The major issue in making a function generic is 
describing what a generic function is required to do (including what it is 
required to return), and thereby ensuring that you do not break existing 
code without unduly limiting future uses.

> I've noticed that David Brahm's package, g.data, for example really has 
> a method for attach as part of it, (well almost), but he has to calls it 
> g.data.attach.

Another candidate is lazyload/lazydata databases.

> Another package that has an obvious application for a method for attach 
> is the filehash package of Roger Peng.
>
> And as it happens I have another, but for now I call it 'Attach', which 
> is pretty unsatisfying from an aesthetic point of view.
>
> I think I'll just sew the seed for now.  The thing about generic 
> functions is that if they exist people sometimes find quite innovative 
> uses for them, and if they come at minimal cost, and break no existing 
> code, I suggest we thik about implementing them.
>
> (Notice I have had no need to use a 'compatibility with another system' 
> argument at any stage...)

A good point, as it is not actually documented to be generic under that 
system, as far as I can see.

[...]

-- 
Brian D. Ripley,  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK    Fax:  +44 1865 272595

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] a generic 'attach'?

2006-02-05 Thread Roger Peng
I think having a generic attach might be useful in the end.  But I agree 
that some more thought needs to go into how such a generic would behave. 
I've always avoided using `attach()' precisely because I didn't fully 
understand the semantics.

One related possibility would be to create a method for `with()' (which 
is already generic) which would work on (in my case) "filehash" 
databases.  It still wouldn't be quite as nice as `attach()' for 
interactive work but it could serve some purposes.
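The with() route could look like the following sketch, where the class name "filehashDB" and the accessor dbAsList() are both invented here for illustration (the real filehash API may differ):

```r
## Hypothetical accessor: materialise the database's contents as a list.
## A real backend would read values from disk here.
dbAsList <- function(db) unclass(db)

## with() is already S3-generic, so a method suffices: evaluate the
## expression with the database's contents in scope, and fall back to
## the caller's frame for everything else.
with.filehashDB <- function(data, expr, ...)
  eval(substitute(expr), dbAsList(data), enclos = parent.frame())

## Usage, with a toy in-memory "database":
db <- structure(list(Month = 5:9), class = "filehashDB")
with(db, table(factor(Month)))
```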

-roger

[EMAIL PROTECTED] wrote:
> [...]

-- 
Roger D. Peng  |  http://www.biostat.jhsph.edu/~rpeng/

__

Re: [Rd] SaveImage, LazyLoad, S4 and all that {was "install.R ... files"}

2006-02-05 Thread Prof Brian Ripley
On Fri, 3 Feb 2006, Robert Gentleman wrote:

> My understanding, and John or others may correct that, is that you need 
> SaveImage if you want to have the class hierarchy and generic functions, plus 
> associated methods all created and saved at build time.

That meaning the time of using R CMD INSTALL rather than using R CMD 
build, I guess?  (We do have an unfortunate ambiguity.)

> This is basically a sort of compilation step, and IMHO, should always be 
> done since it only needs to be done once, rather than every time a 
> package is loaded. Note that attaching your methods to other people's 
> generics has to happen at load time, since you won't necessarily know 
> where they are or even what they are until then (using an import 
> directive may alleviate some of those issues but I have not tested just 
> what does and does not work currently).

My understanding is that the `compilation step' creates objects which are then 
saved in the image.  Such objects would also be saved if the image is 
converted into a lazyload database.

> I hope that LazyLoad does what it says it does, that is dissociates the value 
> from the symbol in such a way that the value lives on disk until it is 
> wanted, but the symbol is available at package load time. I do not see how 
> this relates to precomputing an image,

You obviously have this defined a different way to me: I believe (and so 
does my dictionary) that the image is what I save in my camera, not the 
real world scene.  I understand 'save' to save an image of an environment, 
that is to make a representation on a connection that can be used to 
recreate the environment at a later date.

> and would not be very happy if the two ideas became one, they really are 
> different and can be used to solve very different problems.

To create a lazyload database you first need an environment to save. On 
loading the package it then recreates not the environment but symbols 
linked to promises that will recreate the values at a later date.  So both 
mechanisms create an environment which they `image' in different ways.
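The promise mechanism described here can be mimicked at user level with delayedAssign(); a rough sketch of how a symbol can exist while its value stays unevaluated until first use:

```r
## Bind the symbol "big" in an environment, but defer computing (or, in
## a package, loading) its value until it is first touched.
e <- new.env()
delayedAssign("big", {
  message("value materialised now")
  seq_len(10)
}, assign.env = e)

ls(e)    # the symbol is already visible
e$big    # first access forces the promise and yields the value
```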

The difference here is an inadvertent difference in the Unix INSTALL 
script, which I have now corrected.


> Prof Brian Ripley wrote:
>> The short answer is that there are no known (i.e. documented) differences, 
>> and no examples on CRAN which do not work with lazy-loading (except party, 
>> which loads the saved image in a test).  And that includes examples of 
>> packages which share S4 classes.  But my question was to tease things like 
>> this out.
>> 
>> You do need either SaveImage or LazyLoad in a package that defines S4 
>> classes and methods, since setClass etc break the `rules' for R files in 
>> packages in `Writing R Extensions'.
>> 
>> When I have time I will take a closer look at this example.
>> 
>> 
>> On Fri, 3 Feb 2006, Martin Maechler wrote:
>> 
>> 
>>> "Seth" == Seth Falcon <[EMAIL PROTECTED]>
>>> on Thu, 02 Feb 2006 11:32:42 -0800 writes:
>>> 
>>>   Seth> Thanks for the explanation of LazyLoad, that's very helpful.
>>>   Seth> On  1 Feb 2006, [EMAIL PROTECTED] wrote:
>>>   >> There is no intention to withdraw SaveImage: yes.  Rather, if
>>>   >> lazy-loading is not doing a complete job, we could see if it could
>>>   >> be improved.
>>> 
>>>   Seth> It seems to me that LazyLoad does something different with respect 
>>> to
>>>   Seth> packages listed in Depends and/or how it interacts with 
>>> namespaces.
>>> 
>>>   Seth> I'm testing using the Bioconductor package graph and find that if 
>>> I
>>>   Seth> change SaveImage to LazyLoad I get the following:
>>> 
>>> Interesting.
>>> 
>>> I had also the vague feeling that  saveImage  was said to be
>>> important when using  S4 classes and methods; particularly when
>>> some methods are for generics from a different package/Namespace
>>> and other methods for `base' classes (or other classes defined
>>> elsewhere).
>>> This is the case of 'Matrix', my primary experience here.
>>> OTOH, we now only use 'LazyLoad: yes' , not (any more?)
>>> 'SaveImage: yes' -- and honestly I don't know / remember why.
>>> 
>>> Martin
>>> 
>>> 
>>>   Seth> ** preparing package for lazy loading
>>>   Seth> Error in makeClassRepresentation(Class, properties, superClasses, 
>>> prototype,  :
>>>   Seth> couldn't find function "getuuid"
>>> 
>>>   Seth> Looking at the NAMESPACE for the graph package, it looks like it 
>>> is
>>>   Seth> missing some imports.  I added lines:
>>>   Seth> import(Ruuid)
>>>   Seth> exportClasses(Ruuid)
>>> 
>>>   Seth> Aside: am I correct in my reading of the extension manual that if 
>>> one
>>>   Seth> uses S4 classes from another package with a namespace, one
>>>   Seth> must import the classes and *also* export them?
>>> 
>>>   Seth> Now I see this:
>>> 
>>>   Seth> ** preparing package for lazy loading
>>>   Seth> Error in getClass("Ruuid") : "Ruuid" is not a defined class
>>>   Seth> Error: unable to load R code in package 'graph'
>>>   Seth> Execution halted

Re: [Rd] The install.R and R_PROFILE.R files

2006-02-05 Thread Prof Brian Ripley
I had a bumpy ride with this one.

Ruuid/src/Makefile.win refers to src/include, which is not in a binary 
distribution so cannot be installed from an installed version of R 2.2.1. 
(That's a bug report.)

graph throws an S4 signature error in R-devel.

After fixing those, it works with LazyLoad on Windows but not in Unix 
where there is an error in the INSTALL script which I have now fixed.


On Thu, 2 Feb 2006, Seth Falcon wrote:

> Thanks for the explanation of LazyLoad, that's very helpful.
>
> On  1 Feb 2006, [EMAIL PROTECTED] wrote:
>> There is no intention to withdraw SaveImage: yes.  Rather, if
>> lazy-loading is not doing a complete job, we could see if it could
>> be improved.
>
> It seems to me that LazyLoad does something different with respect to
> packages listed in Depends and/or how it interacts with namespaces.
>
> I'm testing using the Bioconductor package graph and find that if I
> change SaveImage to LazyLoad I get the following:
>
>   ** preparing package for lazy loading
>   Error in makeClassRepresentation(Class, properties, superClasses, 
> prototype,  :
>   couldn't find function "getuuid"
>
> Looking at the NAMESPACE for the graph package, it looks like it is
> missing some imports.  I added lines:
>  import(Ruuid)
>  exportClasses(Ruuid)
>
> Aside: am I correct in my reading of the extension manual that if one
> uses S4 classes from another package with a namespace, one
> must import the classes and *also* export them?
>
> Now I see this:
>
>** preparing package for lazy loading
>Error in getClass("Ruuid") : "Ruuid" is not a defined class
>Error: unable to load R code in package 'graph'
>Execution halted
>
> But Ruuid _is_ defined and exported in the Ruuid package.
>
> Is there a known difference in how dependencies and imports are
> handled with LazyLoad as opposed to SaveImage?
>
> Thanks,
>
> + seth
>
>

-- 
Brian D. Ripley,  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK    Fax:  +44 1865 272595

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] pbinom with size argument 0 (PR#8560)

2006-02-05 Thread Uffe Høgsbro Thygesen
Hello all

A pragmatic argument for allowing size==0 is the situation where the size is in 
itself a random variable (that's how I stumbled over the inconsistency, by the 
way).

For example, in textbooks on probability it is stated that:

  If X is Poisson(lambda), and the conditional 
  distribution of Y given X is Binomial(X,p), then 
  Y is Poisson(lambda*p).

(cf. e.g. Pitman's "Probability", p. 400)

Clearly this statement requires Binomial(0,p) to be a well-defined distribution.

Such statements would be quite convoluted if we did not define Binomial(0,p) as 
a legal (but degenerate) distribution. The same applies to code where the size 
parameter may attain the value 0.
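The thinning statement is easy to check numerically, and the check only goes through if Binomial(0, p) is accepted as a degenerate distribution (a sketch; the Poisson sum is truncated at 100, which is ample for lambda = 2):

```r
## P(Y = y) = sum_x dpois(x, lambda) * dbinom(y, x, p) should equal
## dpois(y, lambda * p); the x = 0 term requires dbinom(., 0, p) to work.
lambda <- 2; p <- 0.4
y <- 0:15
marginal <- sapply(y, function(k)
  sum(dpois(0:100, lambda) * dbinom(k, 0:100, p)))
all.equal(marginal, dpois(y, lambda * p))
```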

Just my 2 cents.

Cheers,

Uffe


-Original Message-
From: [EMAIL PROTECTED] on behalf of Peter Dalgaard
Sent: Sun 05-02-2006 01:33
To: P Ehlers
Cc: [EMAIL PROTECTED]; Peter Dalgaard; [EMAIL PROTECTED]; 
r-devel@stat.math.ethz.ch; Uffe Høgsbro Thygesen
Subject: Re: [Rd] pbinom with size argument 0 (PR#8560)
 
P Ehlers <[EMAIL PROTECTED]> writes:

> I prefer a (consistent) NaN. What happens to our notion of a
> Binomial RV as a sequence of Bernoulli RVs if we permit n=0?
> I have never seen (nor contemplated, I confess) the definition
> of a Bernoulli RV as anything other than some dichotomous-outcome
> one-trial random experiment. 

What's the problem ??

An n=0 binomial is the sum of an empty set of Bernoulli RV's, and the
sum over an empty set is identically 0.

> Not n trials, where n might equal zero,
> but _one_ trial. I can't see what would be gained by permitting a
> zero-trial experiment. If we assign probability 1 to each outcome,
> we have a problem with the sum of the probabilities.

Consistency is what you gain. E.g. 

 binom(.,n=n1+n2,p) == binom(.,n=n1,p) * binom(.,n=n2,p)

where * denotes convolution. This will also hold for n1=0 or n2=0 if
the binomial in that case is defined as a one-point distribution at
zero. Same thing as any(logical(0)) etc., really.
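The identity can be verified numerically, including the n1 = 0 case; the sketch below relies on dbinom() accepting size = 0, which is exactly the behaviour under discussion:

```r
## Convolve Binomial(n1, p) with Binomial(n2, p) and compare against
## Binomial(n1 + n2, p); with n1 = 0 the first factor must be a point
## mass at zero for the identity to survive.
n1 <- 0; n2 <- 5; p <- 0.3
x <- 0:(n1 + n2)
direct <- dbinom(x, n1 + n2, p)
conv <- sapply(x, function(k)
  sum(dbinom(0:k, n1, p) * dbinom(k - (0:k), n2, p)))
all.equal(direct, conv)
</imports>
```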

-- 
   O__   Peter Dalgaard Øster Farimagsgade 5, Entr.B
  c/ /'_ --- Dept. of Biostatistics PO Box 2099, 1014 Cph. K
 (*) \(*) -- University of Copenhagen   Denmark  Ph:  (+45) 35327918
~~ - ([EMAIL PROTECTED])  FAX: (+45) 35327907

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel



Re: [Rd] javascript device for R

2006-02-05 Thread Tom Short
Romain Francois  free.fr> writes:

> 
> Hi,
> 
> Has anyone started a javascript device for R.
> I don't see anything like that by googling or on 
> http://www.stat.auckland.ac.nz/~paul/R/devices.html
> For example, using this graphics library: 
> http://www.walterzorn.com/jsgraphics/jsgraphics_e.htm
> (I cc that message to the author.)
> 
> PS: this is not a feature request, I will do it myself. But if someone has 
> started on this, let me know.
> 
> Romain
> 

Not that I know of, but here are some pointers that might get you started.
Here's an example of a graph in SVG done in the Dojo javascript toolkit:

http://archive.dojotoolkit.org/nightly/tests/widget/test_Chart.html

The idea is that it generates SVG for SVG-capable browsers and VML for Internet
Explorer (although I don't think they finished the IE part yet).

Another option is to use the canvas tag available in Firefox and Safari. You can
use this to emulate the canvas tag in IE:

http://me.eae.net/archive/2005/12/29/canvas-in-ie/

- Tom

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] e1071: using svm with sparse matrices (PR#8527)

2006-02-05 Thread David Meyer
> 
> First: this is not a bug, more a feature request.

[not in R, as correctly pointed out, so just for the record:]

It _is_ a bug (in e1071), since svm() is indeed supposed to support
sparse data. The bug was introduced in 1.5-9, I think, when support for
correct NA-handling was added. The bug will be fixed in 1.5-13.

Thanks for pointing this out!

David.

> 
> Secondly, even if it was a bug, it is _not_ a bug in R, please read  
> the posting rules for bugs. Now a member of R-core has to use  
> valuable time to clean up after your bug report.
> 
> Correspondence such as this should really be sent to the package  
> maintainers (and - perhaps - a cc to R-devel).
> 
> Having said all of this, of course it would be nice if svm supported  
> sparse matrices.

> Libsvm, the `engine' underneath svm() in `e1071', uses a sparse
> representation of the data.  I vaguely recall seeing Chih-Jen Lin's code
> that uses the SparseM package to pass sparse data to svm()...  David would
> know best, of course.

> Andy

 
> /Kasper
> 
> 
> On Jan 27, 2006, at 2:02 AM, [EMAIL PROTECTED] wrote:
> 
> > Full_Name: Julien Gagneur
> > Version: 2.2.1
> > OS: Linux (Suse 9.3)
> > Submission from: (NULL) (194.94.44.4)
> >
> >
> > Using the SparseM library (SparseM_0.66)
> > and the e1071 library (e1071_1.5-12)
> >
> >
> > I fail using svm method with a sparse matrix. Here is a sample  
> > example.
> >
> > I experienced the same problem under Windows.
> >
> >
> >
> >> library(SparseM)
> > [1] "SparseM library loaded"
> >> library("e1071")
> > Loading required package: class
> >> data(iris)
> >> attach(iris)
> >> M=as.matrix(iris[,1:4])
> >> Msparse=as.matrix.csr(M)
> >> Species=iris[,5]
> >> mod=svm(Msparse,Species)
> > Error in svm.default(Msparse, Species) : object "nac" not found
> >

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel