Re: [Rd] (PR#11281) Bug in R 2.7 for over long lines

2008-05-12 Thread bugreports
On Sat, 2008-05-10 at 11:19 +0100, Prof Brian Ripley wrote:
> You will see the current code is different, and your 'fix' is not needed
> nor applies in R-devel.

would be nice...

> You failed to provide an example to reproduce the alleged bug, but the

well the bug was obvious, I told that I can trigger it and that the
proposed fix fixed it - no need to provide an example.

> issue does seem to be using lines beyond the documented line length.

exactly. one can crash R with too long lines.

> So it would have only affected people who did that 

or use auto-generated code like e.g. swig produces.

> And generating a new report (PR#11438) was distinctly unfriendly.

I did what I was told, reply and keep the PR#11281 in the subject. I am
sensing another bug.

> If after studying the R FAQ you have a reproducible example in a current
> version of R (R-devel or R-patched), plus add it to *this* report number.

I don't intend to play with devel versions of R, I was just trying to
get swig for R2.7 to work. Sorry that it triggered a bug in R. I will
try R2.7.1 when it is released and report back.

Soeren.

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] Bug in R 2.7 for over long lines (crasher+proposed fix!) (PR#11281)

2008-05-12 Thread maechler
Hi Soeren,
> "SS" == Soeren Sonnenburg <[EMAIL PROTECTED]>
> on Sat, 10 May 2008 05:32:14 + writes:

SS> On Sat, 2008-04-26 at 09:38 +0200, Peter Dalgaard wrote:
>> [EMAIL PROTECTED] wrote:
>> > OK, I am just sending it here too as it looks like
>> > r-devel@r-project.org is not the right place:
>> >
>> I think it was seen there too, just that no one got around
>> to reply. In R-bugs, there's a filing system so that it
>> won't be completely forgotten...

SS> Looks like no one cares about this :(

Just "looks like" but it aint...

SS> What should I do now? I mean I pointed directly to the
SS> bug and did show how it could be fixed

I'm not among the parse experts within R-core, but I think the
main problem with your report is that 
you talk about a crash but do not provide "self-contained
reproducible" code to produce such a crash, but just the
assertion that you get crashes when working on R <-> Swig
interaction.
Can you construct simple R code producing the crash?

Best regards
Martin


>> However, your mail seems to have gotten encoded in
>> quoted-printable, you might want to follow up with a
>> cleaned version. (Just keep the (PR#11281) in the
>> header).

SS> To me crashers are critical bugs... isn't really no one
SS> interested in seeing this fixed?

>> > On Fri, 2008-04-25 at 08:48 +0200, Soeren Sonnenburg wrote:
>> >> While trying to fix swig & R2.7 I actually discovered that there is a
>> >> bug in R 2.7 causing a crash (so R & swig might actually work):
>> >>
>> >> the bug is in ./src/main/gram.c line 3038:
>> >>
>> >>             } else { /* over-long line */
>> >> fixthis -->     char *LongLine = (char *) malloc(nc);
>> >>                 if(!LongLine)
>> >>                     error(_("unable to allocate space for source line %d"), xxlineno);
>> >>                 strncpy(LongLine, (char *)p0, nc);
>> >> bug -->         LongLine[nc] = '\0';
>> >>                 SET_STRING_ELT(source, lines++,
>> >>                                mkChar2((char *)LongLine));
>> >>                 free(LongLine);
>> >>
>> >> note that LongLine is only nc chars long, so the LongLine[nc]='\0' might
>> >> be an out of bounds write. the fix would be to do
>> >>
>> >>                 char *LongLine = (char *) malloc(nc+1);
>> >>
>> >> in line 3034
>> >>
>> >> Please fix and thanks to dirk for the debian r-base-dbg package!
>> >
>> > Looking at the code again there seems to be another bug above this for
>> > the MAXLINESIZE test too:
>> >
>> >         if (*p == '\n' || p == end - 1) {
>> >             nc = p - p0;
>> >             if (*p != '\n')
>> >                 nc++;
>> >             if (nc <= MAXLINESIZE) {
>> >                 strncpy((char *)SourceLine, (char *)p0, nc);
>> > bug2 -->        SourceLine[nc] = '\0';
>> >                 SET_STRING_ELT(source, lines++,
>> >                                mkChar2((char *)SourceLine));
>> >             } else { /* over-long line */
>> >                 char *LongLine = (char *) malloc(nc+1);
>> >                 if(!LongLine)
>> >                     error(_("unable to allocate space for source line %d"),
>> >                           xxlineno);
>> > bug1 -->        strncpy(LongLine, (char *)p0, nc);
>> >                 LongLine[nc] = '\0';
>> >                 SET_STRING_ELT(source, lines++,
>> >                                mkChar2((char *)LongLine));
>> >                 free(LongLine);
>> >             }
>> >             p0 = p + 1;
>> >         }
>> >
>> > So I guess the test would be for nc < MAXLINESIZE above or to change
>> > SourceLine to have MAXLINESIZE+1 size.
>> >
>> > Alternatively as the strncpy manpage suggests do this for all
>> > occurrences of strncpy
>> >
>> >         strncpy(buf, str, n);
>> >         if (n > 0)
>> >             buf[n - 1] = '\0';
>> >
>> > this could even be made a makro / helper function ...
>> >
>> > And another update: This does fix the R+swig crasher for me (tested)!
>> >
>> > Soeren
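
The helper-function idea at the end of the quoted report can be sketched
roughly as follows. This is only an illustration, not code from R: the name
copy_terminated and its signature are hypothetical, and it uses memcpy with
an explicit length clamp instead of strncpy, so it assumes src has at least
n readable bytes (as p0 does in the parser loop).

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not part of R: copy at most n bytes of src into
 * dst (capacity dstsize bytes) and always NUL-terminate the result.
 * Returns the number of bytes copied, excluding the terminator. */
static size_t copy_terminated(char *dst, size_t dstsize,
                              const char *src, size_t n)
{
    if (dstsize == 0)
        return 0;               /* no room even for the terminator */
    if (n > dstsize - 1)
        n = dstsize - 1;        /* truncate rather than write past the end */
    memcpy(dst, src, n);
    dst[n] = '\0';              /* n <= dstsize - 1, so this is in bounds */
    return n;
}
```

With a buffer declared as char SourceLine[MAXLINESIZE], a call such as
copy_terminated(SourceLine, sizeof SourceLine, p0, nc) could never write at
index MAXLINESIZE, which is the boundary case marked "bug2" above.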

SS> Soeren



Re: [Rd] (PR#11281) Bug in R 2.7 for over long lines

2008-05-12 Thread Duncan Murdoch

On 5/10/2008 5:20 PM, [EMAIL PROTECTED] wrote:
> On Sat, 2008-05-10 at 11:19 +0100, Prof Brian Ripley wrote:
>> You will see the current code is different, and your 'fix' is not needed
>> nor applies in R-devel.
>
> would be nice...
>
>> You failed to provide an example to reproduce the alleged bug, but the
>
> well the bug was obvious, I told that I can trigger it and that the
> proposed fix fixed it - no need to provide an example.

But it would be helpful to provide an example, so that we can test the
fix.  As Brian told you, your fix was no good:  it was not against the
current code.

>> issue does seem to be using lines beyond the documented line length.
>
> exactly. one can crash R with too long lines.

Then the bug is also in your code, for sending lines that are too long.
R shouldn't crash on user error, but "don't do that" is an appropriate
response.

>> So it would have only affected people who did that
>
> or use auto-generated code like e.g. swig produces.

Then swig should be modified to produce valid code.

>> And generating a new report (PR#11438) was distinctly unfriendly.
>
> I did what I was told, reply and keep the PR#11281 in the subject. I am
> sensing another bug.
>
>> If after studying the R FAQ you have a reproducible example in a current
>> version of R (R-devel or R-patched), plus add it to *this* report number.
>
> I don't intend to play with devel versions of R, I was just trying to
> get swig for R2.7 to work. Sorry that it triggered a bug in R. I will
> try R2.7.1 when it is released and report back.

If you aren't interested in being helpful by testing fixes for your
code, then I doubt if any of us are going to go out of our way to help
you with your errors.

Duncan Murdoch

> Soeren.



[Rd] k means

2008-05-12 Thread cgenolin

Hi the devel list,

I am using k-means with a non-standard distance. As far as I can see, the
function kmeans can use four different algorithms, but not a user-defined
distance.


In addition, kmeans cannot deal with missing values, whereas there are
several ways k-means could handle them; one is to use a distance that
takes missing values into account, such as a distance with the Gower
adjustment (which is what the regular dist() function in R uses).
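
A distance of this kind can be sketched as follows, in C rather than R. It
is only an illustration under stated assumptions: the function name
sqdist_skip_missing is hypothetical, missing values are assumed to be
encoded as NaN, and the rescaling follows what the dist() documentation
describes (coordinates with a missing value are excluded and the sum is
scaled up in proportion to the number of coordinates used). It returns the
squared Euclidean distance, which is the quantity k-means minimises.

```c
#include <assert.h>
#include <math.h>
#include <stddef.h>

/* Hypothetical helper: squared Euclidean distance between x and y
 * (both of length n), skipping coordinates where either value is NaN
 * and scaling the sum by n / (number of coordinates actually used).
 * Returns NaN when no coordinate is usable. */
static double sqdist_skip_missing(const double *x, const double *y, size_t n)
{
    double sum = 0.0;
    size_t used = 0;

    for (size_t i = 0; i < n; i++) {
        if (isnan(x[i]) || isnan(y[i]))
            continue;                   /* coordinate is missing: skip it */
        double d = x[i] - y[i];
        sum += d * d;
        used++;
    }
    if (used == 0)
        return NAN;                     /* nothing to compare */
    return sum * (double)n / (double)used;
}
```

Whether k-means with such a distance still converges sensibly is a separate
question, since the centre update step also has to cope with the missing
coordinates.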


So would it be possible to adapt kmeans to let the user supply a
'distance to use' argument?


Christophe



Re: [Rd] k means

2008-05-12 Thread Bill.Venables
I would not support an extension of kmeans to do this.  I think it is
best left simple and fast as it now is.  

I can think of three ways you might handle your problem

1. Use, for example, pam() in the cluster package, which does a similar
job to kmeans (not quite the same, of course) with a general distance
measure.

2. If you are working with a non-standard metric and you really want to
use the k-means algorithm, then perhaps one way to do so is to use an
approximate euclidean coordinatisation for the points with a
multidimensional scaling first and then use kmeans.  (e.g. cmdscale,
isoMDS, sammon, ...)  I've no idea what the traps are with this
approach, but it seems kind of feasible.

3. If the algorithms are there and available as you say, write the code
yourself and contribute it to the R-project as a simple package.
Everyone will benefit. 


Bill Venables
CSIRO Laboratories
PO Box 120, Cleveland, 4163
AUSTRALIA
Office Phone (email preferred): +61 7 3826 7251
Fax (if absolutely necessary):  +61 7 3826 7304
Mobile: +61 4 8819 4402
Home Phone: +61 7 3286 7700
mailto:[EMAIL PROTECTED]
http://www.cmis.csiro.au/bill.venables/ 

-Original Message-
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of
[EMAIL PROTECTED]
Sent: Tuesday, 13 May 2008 3:25 AM
To: r-devel@r-project.org
Subject: [Rd] k means

Hi the devel list,

I am using k-means with a non-standard distance. As far as I can see, the
function kmeans can use four different algorithms, but not a user-defined
distance.

In addition, kmeans cannot deal with missing values, whereas there are
several ways k-means could handle them; one is to use a distance that
takes missing values into account, such as a distance with the Gower
adjustment (which is what the regular dist() function in R uses).

So would it be possible to adapt kmeans to let the user supply a
'distance to use' argument?

Christophe



Re: [Rd] Bug in R 2.7 for over long lines (crasher+proposed fix!) (PR#11456)

2008-05-12 Thread r-ml
On Mon, 2008-05-12 at 11:10 +0200, [EMAIL PROTECTED] wrote:
> Hi Soeren,
> > "SS" == Soeren Sonnenburg <[EMAIL PROTECTED]>
> > on Sat, 10 May 2008 05:32:14 + writes:
> 
> SS> On Sat, 2008-04-26 at 09:38 +0200, Peter Dalgaard wrote:
> >> [EMAIL PROTECTED] wrote:
> >> > OK, I am just sending it here too as it looks like
> >> > r-devel@r-project.org is not the right place:
> >> >
> >> I think it was seen there too, just that no one got around
> >> to reply. In R-bugs, there's a filing system so that it
> >> won't be completely forgotten...
> 
> SS> Looks like no one cares about this :(
> 
> Just "looks like" but it aint...
> 
> SS> What should I do now? I mean I pointed directly to the
> SS> bug and did show how it could be fixed
> 
> I'm not among the parse experts within R-core, but I think the
> main problem with your report is that 
> you talk about a crash but do not provide "self-contained
> reproducible" code to produce such a crash, but just the
> assertion that you get crashes when working on R <-> Swig
> interaction.
> Can you construct simple R code producing the crash?

No. I put however difficult autogenerated (~800k big!) .R code that will
crash R 2.7 at http://nn7.de/debugging/Features.R for everyone to
enjoy :)

Sourcing it will crash R2.7.0 (without my fix) but not 2.8.

Soeren



Re: [Rd] Bug in R 2.7 for over long lines (crasher+proposed fix!) (PR#11281)

2008-05-12 Thread Soeren Sonnenburg
On Mon, 2008-05-12 at 11:10 +0200, [EMAIL PROTECTED] wrote:
> Hi Soeren,
> > "SS" == Soeren Sonnenburg <[EMAIL PROTECTED]>
> > on Sat, 10 May 2008 05:32:14 + writes:
> 
> SS> On Sat, 2008-04-26 at 09:38 +0200, Peter Dalgaard wrote:
> >> [EMAIL PROTECTED] wrote:
> >> > OK, I am just sending it here too as it looks like
> >> > r-devel@r-project.org is not the right place:
> >> >
> >> I think it was seen there too, just that no one got around
> >> to reply. In R-bugs, there's a filing system so that it
> >> won't be completely forgotten...
> 
> SS> Looks like no one cares about this :(
> 
> Just "looks like" but it aint...
> 
> SS> What should I do now? I mean I pointed directly to the
> SS> bug and did show how it could be fixed
> 
> I'm not among the parse experts within R-core, but I think the
> main problem with your report is that 
> you talk about a crash but do not provide "self-contained
> reproducible" code to produce such a crash, but just the
> assertion that you get crashes when working on R <-> Swig
> interaction.
> Can you construct simple R code producing the crash?

No. I put however difficult autogenerated (~800k big!) .R code that will
crash R 2.7 at http://nn7.de/debugging/Features.R for everyone to
enjoy :)

Sourcing it will crash R2.7.0 (without my fix) but not 2.8.

Soeren



Re: [Rd] (PR#11281) Bug in R 2.7 for over long lines

2008-05-12 Thread Soeren Sonnenburg
On Mon, 2008-05-12 at 08:39 -0400, Duncan Murdoch wrote:
> On 5/10/2008 5:20 PM, [EMAIL PROTECTED] wrote:
> > On Sat, 2008-05-10 at 11:19 +0100, Prof Brian Ripley wrote:
> >> You will see the current code is different, and your 'fix' is not needed
> >> nor applies in R-devel.
> > 
> > would be nice...
> > 
> >> You failed to provide an example to reproduce the alleged bug, but the
> > 
> > well the bug was obvious, I told that I can trigger it and that the
> > proposed fix fixed it - no need to provide an example.
> 
> But it would be helpful to provide an example, so that we can test the 
> fix.  As Brian told you, your fix was no good:  it was not against the 
> current code.

Well it was when I posted it 4 days after R 2.7.0 was released. 

And the bug was very obvious, I mean look at this (quoting from my
original report):

char *LongLine = (char *) malloc(nc);
...
LongLine[nc] = '\0';

note that LongLine is only nc chars long, so the LongLine[nc]='\0' might
be an out of bounds write. the fix would be to do 
char *LongLine = (char *) malloc(nc+1);
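
The allocation fix can be sketched in isolation like this. dup_line is a
hypothetical stand-in for the over-long-line branch, not the actual gram.c
code; it assumes p0 points at nc readable bytes, as in the parser loop, and
it reports allocation failure by returning NULL instead of calling error().

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the over-long-line branch: duplicate the nc
 * bytes starting at p0 into a fresh NUL-terminated string. The caller
 * frees the result; NULL means the allocation failed. */
static char *dup_line(const char *p0, size_t nc)
{
    char *line = malloc(nc + 1);    /* nc bytes of text plus the '\0' */
    if (!line)
        return NULL;
    memcpy(line, p0, nc);
    line[nc] = '\0';                /* index nc is valid: size is nc + 1 */
    return line;
}
```

With malloc(nc) instead, the write to line[nc] lands one byte past the
allocation, which may or may not crash depending on the allocator and
platform.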


Anyway an example that will crash R 2.7.0 is here
http://nn7.de/debugging/Features.R .
 
> >> issue does seem to be using lines beyond the documented line length.
> > 
> > exactly. one can crash R with too long lines.
> 
> Then the bug is also in your code, for sending lines that are too long. 
>   R shouldn't crash on user error, but "don't do that" is an appropriate 
> response.

I would just like to see this bug in R fixed.

> >> So it would have only affected people who did that 
> > 
> > or use auto-generated code like e.g. swig produces.
> 
> Then swig should be modified to produce valid code.

Sure. That's what I am trying to achieve.

[...]
> > I don't intend to play with devel versions of R, I was just trying to
> > get swig for R2.7 to work. Sorry that it triggered a bug in R. I will
> > try R2.7.1 when it is released and report back.
> 
> If you aren't interested in being helpful by testing fixes for your 
> code, then I doubt if any of us are going to go out of our way to help 
> you with your errors.

I still don't understand what I could have possibly done wrong 
in my initial
post (http://article.gmane.org/gmane.comp.lang.r.devel/16243/)
to cause this meta-discussion.

But to put things in the right light. There is no bug in my code (this
time). But in R. And I did not ask for help - to the contrary: I've
pointed out a trivial to fix bug in R 2.7.0 and showed how it could be
fixed. 

The problem is that I am not really an R user, but just wanted to
support the R community by porting shogun to R in the hope that it may
be useful for some. To achieve this I am fixing bugs in the R swig
interface generator and now also R. So the detour I am taking here is
massive and I have not received any help (except from Dirk so far).

So if possible let's stay focused on the bug: Dirk helped me to get the R
from svn-trunk (it says R 2.8 at startup) to compile and voila, sourcing
the code from above does not generate any crashes anymore. So the
rewrite of gram.c fixes it I guess.

Soeren



Re: [Rd] (PR#11281) Bug in R 2.7 for over long lines (crasher+proposed

2008-05-12 Thread ripley
This example does not crash in R 2.7.0, R-patched nor R-devel (r45677) for 
me (x86_64 F8 Linux.)  It also does not crash with the CRAN build of R 
2.7.0 on Windows XP.

On Tue, 13 May 2008, Soeren Sonnenburg wrote:

> On Mon, 2008-05-12 at 11:10 +0200, [EMAIL PROTECTED] wrote:
>> Hi Soeren,
>>> "SS" == Soeren Sonnenburg <[EMAIL PROTECTED]>
>>> on Sat, 10 May 2008 05:32:14 + writes:
>>
>> SS> On Sat, 2008-04-26 at 09:38 +0200, Peter Dalgaard wrote:
>>>> [EMAIL PROTECTED] wrote:
>>>>> OK, I am just sending it here too as it looks like
>>>>> r-devel@r-project.org is not the right place:
>>>>>
>>>> I think it was seen there too, just that no one got around
>>>> to reply. In R-bugs, there's a filing system so that it
>>>> won't be completely forgotten...
>>
>> SS> Looks like no one cares about this :(
>>
>> Just "looks like" but it aint...
>>
>> SS> What should I do now? I mean I pointed directly to the
>> SS> bug and did show how it could be fixed
>>
>> I'm not among the parse experts within R-core, but I think the
>> main problem with your report is that
>> you talk about a crash but do not provide "self-contained
>> reproducible" code to produce such a crash, but just the
>> assertion that you get crashes when working on R <-> Swig
>> interaction.
>> Can you construct simple R code producing the crash?
>
> No. I put however difficult autogenerated (~800k big!) .R code that will
> crash R 2.7 at http://nn7.de/debugging/Features.R for everyone to
> enjoy :)
>
> Sourcing it will crash R2.7.0 (without my fix) but not 2.8.
>
> Soeren

-- 
Brian D. Ripley,                  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595
