date:20091024

Re: [Rd] Advice on how to arrange fix of buglet

2009-10-24 Thread Uwe Ligges


John,

I guess it is time to file a bug report, given that this has not been done
so far and nobody has found the time to look at it.


Thanks,
Uwe


Prof. John C Nash wrote:


Recently I reported a small bug in optim's SANN method failing to report 
that it had exceeded the maximum function evaluation limit in the 
convergence code. This is a small enough matter that I was reluctant to 
create a full-blown bug report. Indeed in the optimx package Ravi 
Varadhan and I have been developing on r-forge (under the OptimizeR 
project) it was a minimal work around to fix the matter in our wrapper 
that incorporates optim() and a number of other tools. While I don't 
normally do C code, I could likely figure out a fix for optim too.


My query is about how to best get this done without causing a lot of 
work for others, i.e., where do I send patches etc. I expect there are a 
number of similar issues for different areas of R and its documentation, 
and a clarification from someone in the core team could streamline 
things. Maybe the bug system is still the right place?


Cheers, JN

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel



Re: [Rd] Confusion regarding allocating Matrices.

2009-10-24 Thread Douglas Bates

On Fri, Oct 23, 2009 at 2:02 PM, Abhijit Bera  wrote:
> Sorry, I made a mistake while writing the code. The declaration of Data
> should have been first.

> I still have some doubts:

Because you are making some sweeping and incorrect assumptions about
the way that the internals of R operate.  R allows for arrays to be
dynamically resized but this is accomplished internally by allocating
new storage, copying the current contents to this new location and
installing the values of the new elements.  It is an expensive
operation, which is why it is discouraged.

Your design is deeply flawed.  Go back to the drawing board.
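
The allocate-and-copy resizing described above can be sketched in plain C. This is only an illustration under stated assumptions, not R's internal code: grow_rows is a hypothetical helper, the storage is an ordinary malloc'd buffer rather than an R object, and the matrix is assumed to be laid out in column-major order, as R matrices are.

```c
#include <assert.h>
#include <stdlib.h>

/*
 * Return a newly allocated column-major matrix with new_nrow rows and ncol
 * columns. Its first old_nrow rows hold the contents of `old`
 * (old_nrow x ncol); the added elements are set to `fill`.
 * The caller frees the result. Returns NULL if allocation fails.
 * grow_rows is a hypothetical helper, not part of any R API.
 */
double *grow_rows(const double *old, size_t old_nrow, size_t ncol,
                  size_t new_nrow, double fill)
{
    assert(new_nrow >= old_nrow);
    double *out = malloc(new_nrow * ncol * sizeof *out);
    if (out == NULL)
        return NULL;
    for (size_t j = 0; j < ncol; j++) {
        /* Element (i, j) lives at i + j * nrow, so every column moves. */
        for (size_t i = 0; i < old_nrow; i++)
            out[i + j * new_nrow] = old[i + j * old_nrow];
        for (size_t i = old_nrow; i < new_nrow; i++)
            out[i + j * new_nrow] = fill;
    }
    return out;
}
```

Every existing element is touched on each call, which is why repeating this for each incoming batch of rows gets expensive.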

> When you say calloc and realloc are you talking about R's C interface Calloc
> and Realloc or the regular calloc and realloc?

Either one.

> I want to feed data directly into a R matrix and grow it as required. So at
> one time I might have 100 rows coming in from a data source. The next time I
> might have 200 rows coming in from a data source. I want to be able to
> expand the R-matrix instead of creating a regular C float matrix and then
> make an R-matrix based on the new size. I just want to have one R object and
> be able to expand its size dynamically.

R stores floating-point numbers as the C data type double, not float.
It may seem pedantic to point out distinctions like that but not when
you are writing programs.  Compilers are the ultimate pedants - they
are real sticklers for getting the details right.
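
As a small illustration of why the distinction matters: if a data source did deliver float values (an assumption here, not something stated in the thread), they could not be copied byte-for-byte into double storage and would have to be widened element by element. floats_to_doubles below is a hypothetical helper.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Widen n float values into a double buffer. The buffers must not overlap.
 * floats_to_doubles is a hypothetical helper, not part of any R API.
 */
void floats_to_doubles(const float *src, double *dst, size_t n)
{
    assert(src != NULL && dst != NULL);
    for (size_t i = 0; i < n; i++)
        dst[i] = (double)src[i];   /* exact: every float is a valid double */
}
```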

As I said, it just doesn't work the way that you think it does.  The
fact that there is an R object with a certain name before and after an
operation doesn't mean it is the same R object.

> I was reading the language specs. It says that one could declare an object
> in R like this:
>
> m=matrix(nrow=10,ncol=10)
>
> and then one could assign
>
> m[101]=1.00
>
> to expand the object.
>
> but this has one problem when I do a
>
> dim(m)
>
> I get
>
> NULL instead of 10 10
>
> So what is happening here?
>
>
> I am aware that R matrices are stored in column major order.
>
> Thanks for the tip on using float *dat= REAL(Data);
>
> Regards
>
> Abhijit Bera
>
>
>
> On Fri, Oct 23, 2009 at 7:27 PM, Douglas Bates  wrote:
>>
>> On Fri, Oct 23, 2009 at 9:23 AM, Douglas Bates 
>> wrote:
>> > On Fri, Oct 23, 2009 at 8:39 AM, Abhijit Bera 
>> > wrote:
>> >> Hi
>> >>
>> >> I'm having slight confusion.
>> >
>> > Indeed.
>> >
>> >> I plan to grow/realloc a matrix depending on the data available in a C
>> >> program.
>> >
>> >> Here is what I tried to do:
>> >
>> >> Data=allocMatrix(REALSXP,3,4);
>> >> SEXP Data;
>> >
>> > Those lines should be in the other order, shouldn't they?
>> >
>> > Also, you need to PROTECT Data or bad things will happen.
>> >
>> >> REAL(Data)[8]=0.001123;
>> >> REAL(Data)[20]=0.001125;
>> >> printf("%f %f\n\n\n\n",REAL(Data)[8],REAL(Data)[20]);
>>
>> And I forgot to mention, it is not a good idea to write REAL(Data)
>> many times like this.  REAL is a function, not a macro and you are
>> calling the same function over and over again unnecessarily.  It is
>> better to write
>>
>> double *dat = REAL(Data);
>>
>> and use the dat pointer instead of REAL(Data).
>>
>> >> Here is my confusion:
>> >
>> >> Do I always need to allocate the exact number of data elements in an
>> >> R matrix?
>> >
>> > Yes.
>> >
>> >> In the above code segment I have clearly exceeded the number of
>> >> elements that have been allocated but my program doesn't crash.
>> >
>> > Remember that when programming in C you have a lot of rope with which
>> > to hang yourself.   You have corrupted a memory location beyond that
>> > allocated to the array but nothing bad has happened  - yet.
>> >
>> >> I don't find any specific R functions for reallocation in case my
>> >> data set grows. How do I reallocate?
>> >
>> > You allocate a new matrix, copy the contents of the current matrix to
>> > the new matrix, then release the old one.  It gets tricky in that you
>> > should unprotect the old one and protect the new one but you need to
>> > watch the order of those operations.
>> >
>> > This approach is not a very good one.  If you really need to grow an
>> > array it is better to allocate and reallocate the memory within your C
>> > code using calloc and realloc then, at the end of the calculations,
>> > allocate an R matrix and copy the results over.
>> >
>> > Also, you haven't said whether you are growing the matrix by row or by
>> > column or both.  If you are adding rows then you can't just reallocate
>> > storage because R stores matrices in column-major order. The positions
>> > of the elements in a matrix with n+1 rows are different from those in
>> > a matrix with n rows.
>> >
>> >> Is it necessary to reallocate or is R handling
>> >> the memory management for the matrix that I have allocated?
>> >>
>> >> Regards
>> >>
>> >> Abhijit Bera
>> >>

Re: [Rd] Confusion regarding allocating Matrices.

2009-10-24 Thread Abhijit Bera

Ok, I get it. So every time it does an alloc and copy.

I haven't finished the design yet. I'm just thinking about how randomly the
data might arrive; it's real-time data. So I will allocate a large chunk of
memory and keep track of when it fills up; once the data exceeds it, I will
alloc and copy the data (provided the size is within system limits). In
this manner I should be able to reduce the number of expensive operations of
allocing and copying.
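
The plan above, combined with the earlier advice in the thread to use calloc/realloc inside the C code, might look roughly like the following. It is a minimal sketch: dbuf, dbuf_push and dbuf_free are hypothetical names, and the starting capacity of 16 and the doubling factor are arbitrary choices.

```c
#include <assert.h>
#include <stdlib.h>

/* Growable array of doubles; all names here are hypothetical. */
typedef struct {
    double *data;   /* heap storage, NULL while empty */
    size_t  len;    /* number of values stored */
    size_t  cap;    /* number of values that fit before the next realloc */
} dbuf;

/*
 * Append one value, doubling the capacity when the buffer is full.
 * Returns 0 on success and -1 if realloc fails (buffer left unchanged).
 */
int dbuf_push(dbuf *b, double x)
{
    assert(b != NULL);
    if (b->len == b->cap) {
        size_t new_cap = b->cap ? b->cap * 2 : 16;
        double *p = realloc(b->data, new_cap * sizeof *p);
        if (p == NULL)
            return -1;
        b->data = p;
        b->cap = new_cap;
    }
    b->data[b->len++] = x;
    return 0;
}

/* Release the storage and reset the buffer to the empty state. */
void dbuf_free(dbuf *b)
{
    free(b->data);
    b->data = NULL;
    b->len = b->cap = 0;
}
```

Doubling keeps the number of reallocations logarithmic in the final size. If whole rows are appended one after another, the buffer ends up in row-major order, so the final copy into an R matrix would also have to transpose into column-major order.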

Thanks for your input.

Regards

Abhijit Bera


Re: [Rd] Confusion regarding allocating Matrices.

2009-10-24 Thread Simon Urbanek



On Oct 24, 2009, at 2:58 PM, Abhijit Bera wrote:

> Ok, I get it. So every time it does an alloc and copy.
>
> I haven't finished the design yet. I'm just thinking about how
> randomly the data might arrive; it's real-time data. So I will
> allocate a large chunk of memory and keep track of when it fills up;
> once the data exceeds it, I will alloc and copy the data (provided the
> size is within system limits). In this manner I should be able to
> reduce the number of expensive operations of allocing and copying.




Many smart people have thought about those things before you, so it's
worthwhile to read about it. I would suggest reading a bit about
data structures and programming in C. What you describe is usually
tackled by allocating additional (usually linked) buffers as you go,
since that means you don't have to copy anything (except for that last
step where you create the R object). It's also trivial to implement.
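
One possible shape for the linked-buffer approach is sketched below. It is an illustration only: the names chunk, chunk_list and the chunk_list_* functions are hypothetical, and CHUNK_LEN is an arbitrary size. New data never forces existing data to move; the only copy is the final flatten.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define CHUNK_LEN 1024   /* values per chunk; an arbitrary illustrative size */

/* One fixed-size buffer in a singly linked list (hypothetical names). */
typedef struct chunk {
    struct chunk *next;
    size_t        used;              /* values stored in this chunk */
    double        vals[CHUNK_LEN];
} chunk;

typedef struct {
    chunk *head, *tail;
    size_t total;                    /* values stored across all chunks */
} chunk_list;

/* Append one value, linking a fresh chunk when the last one is full.
   Returns 0 on success and -1 if malloc fails. */
int chunk_list_push(chunk_list *l, double x)
{
    assert(l != NULL);
    if (l->tail == NULL || l->tail->used == CHUNK_LEN) {
        chunk *c = malloc(sizeof *c);
        if (c == NULL)
            return -1;
        c->next = NULL;
        c->used = 0;
        if (l->tail != NULL)
            l->tail->next = c;
        else
            l->head = c;
        l->tail = c;
    }
    l->tail->vals[l->tail->used++] = x;
    l->total++;
    return 0;
}

/* Copy all values, in arrival order, into dst (room for l->total values). */
void chunk_list_flatten(const chunk_list *l, double *dst)
{
    for (const chunk *c = l->head; c != NULL; c = c->next) {
        memcpy(dst, c->vals, c->used * sizeof *dst);
        dst += c->used;
    }
}

/* Free every chunk and reset the list to the empty state. */
void chunk_list_free(chunk_list *l)
{
    chunk *c = l->head;
    while (c != NULL) {
        chunk *next = c->next;
        free(c);
        c = next;
    }
    l->head = l->tail = NULL;
    l->total = 0;
}
```

In an R extension, dst would presumably be the REAL() pointer of a freshly allocated and PROTECTed numeric vector or matrix of length total, filled once at the end.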


Cheers,
Simon






Re: [Rd] names drop for columns from data.frame (PR#14002)

2009-10-24 Thread Peter Ehlers


Have you tried names(a[,1,drop=FALSE])?
Then have a look at help("[").

-Peter

ve...@clemson.edu wrote:

Full_Name: Francisco Vera
Version: 2.9.2
OS: Windows
Submission from: (NULL) (74.248.242.164)


Run the following commands:

a<-data.frame(x=1:2,y=3:4,row.names=c("i","j"))
names(a$x)
names(a[,1])

For names(a$x) I get NULL instead of c("i","j"). The same thing happens with
names(a[,1]). It works fine for rows, i.e., names(a[1,]) gives what it is
supposed to.

It also works fine for matrices. If you issue the commands
b<-matrix(1:4,ncol=2,dimnames=list(c("x","y"),c("i","j")))
names(b[,1])
names(b[1,])

Thanks





--
Peter Ehlers
University of Calgary

