Re: [Rd] Increase transparency: suggestion on how to avoid namespaces and/or unnecessary overwrites of existing functions

2011-10-01 Thread Dominick Samperi
On Tue, Aug 23, 2011 at 2:23 PM, Janko Thyson wrote:
> Dear list,
>
> I'm aware of the fact that I posted on something related a while ago, but I
> just can't sweat this off and would like to ask you for an opinion:
>
> The problem:
> Namespaces are great, but they don't resolve certain conflicts regarding
> name clashes. There are more and more people out there trying to come up
> with their own R packages, which is great also! Yet, it becomes more and
> more likely that programmers will choose identical names for their exported
> functions and/or that they add functionality to existing functions (i.e.
> overwriting existing functions).
> The whole process of which packages overwrite which functions is somewhat
> obscure and in addition depends on their order in the search path. On the
> other hand, it is not possible to use "namespace" functionality (i.e.
> 'namespace::fun()'; also less efficient than direct call; see illustration
> below) during early stages of the development process (i.e. the package is
> not finished yet) as there is no namespace available yet.
>
> I know of at least two cases where such overwrites (I think it's called
> masking, right?) led to some confusion at our chair:
> 1) loading package forecast overwrites certain functions in stats which made
> some code refactoring necessary
> 2) loading package 'R.utils' followed by package 'roxygen' overwrites
> 'parse.default()' which results in errors for something like
> 'eval(parse(text="a <- 1"))' ; see illustration below)
> And I'm sure the community could come up with lots more of such scenarios.
>
> Suggestions:
> 1) In order to avoid name clashes/unintended overwrites, how about switching
> to a coding paradigm that explicitly (and automatically) includes a
> package's name in all its functions' names once code is turned into a real
> package? E.g., getting used to "preemptively" type 'package_fun()' or
> 'package.fun()' instead of just 'fun()'. Better to be safe than sorry,
> right? This could be realized pretty easily (see example below) and, IMHO,
> would significantly increase transparency.
> 2) In order to avoid intended (but for the user often pretty obscure)
> overwrites of existing functions, we could use the same mechanism together
> with the "rule": just don't provide any functions that overwrite existing
> ones, rather prepend your version of that function with your package name
> and leave it up to the user which version he wants to call.

Experts from the Lisp-Stats community have added a number
of functions to R that were inspired by Lisp, but one feature that apparently
was not added is the shadowing feature of Common Lisp. Here the default
behavior is not to permit packages to import conflicting names unless
explicit shadowing directives are specified.

Arguably a package is not intended to be used like a callable library,
yet this is the way they are often used in the R context. This kind of
shadowing tool might help to make this practice safer, at the expense
of requiring the developer to specify explicit shadowing directives.
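
To make the idea concrete, here is a minimal sketch of what such a check could look like in R. It is not an existing R facility: 'attach_strict()' and its 'shadow' argument are hypothetical names invented for this illustration, and a plain list stands in for a package.

```r
# Hypothetical helper (not part of R): attach 'objs' to the search path only
# if none of its names is already visible, unless the name is listed in
# 'shadow' (the explicit "shadowing directive").
attach_strict <- function(objs, name, shadow = character()) {
  visible <- vapply(names(objs), exists, logical(1), envir = globalenv())
  unexpected <- setdiff(names(objs)[visible], shadow)
  if (length(unexpected) > 0) {
    stop("would mask without a shadowing directive: ",
         paste(unexpected, collapse = ", "))
  }
  attach(objs, name = name, warn.conflicts = FALSE)
  invisible(name)
}

# A list standing in for a package that defines its own 'median'.
toy <- list(median = function(x, ...) "toy median")

# Without a directive the clash is refused ...
refused <- tryCatch(attach_strict(toy, "toy"),
                    error = function(e) conditionMessage(e))

# ... and with an explicit directive it is allowed.
attach_strict(toy, "toy", shadow = "median")
shadowed <- median(c(1, 2, 3))   # resolves to the toy version
detach("toy")
```

The design choice here is that masking becomes opt-in: the default is an error, and the caller has to name each function it intends to shadow.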

Dominick

> At the moment, all of this is probably not that big of a deal yet, but my
> suggestion has more of a mid-term/long-term character.
>
> Below you find a little illustration. I'm probably asking too much, but it'd
> be great if we could get a little discussion going on how to improve the way
> of loading packages!
>
> Best regards and thanks for R and all its packages!
> Janko
>
> 
> # PROOF OF CONCEPT
> 
>
> # 1) PROBLEM
> # IMHO, with the number of packages submitted to CRAN constantly increasing,
> # over time we will be likely to see problems with respect to name clashes.
> # The main reasons I see for this are the following:
> # a) package developers picking identical names for their exported functions
> # b) package developers overwriting base functions in order to add
> #    functionality to existing functions
> # c) ...
> #
> # This can create scenarios in which the user might not exactly know that
> # he/she is using a 'modified' version of a specific function. More so, the
> # user needs to carefully read the description of each new package he plans
> # to use in order to find out which functions are exported and which
> # existing functions might be overwritten. This in turn might imply that the
> # user's existing code needs to be refactored (i.e. instead of using 'fun()'
> # it might now be necessary to type 'namespace::fun()' to be sure that the
> # desired function is called).
>
> # 2) SUGGESTED SOLUTION
> # That being said, why don't we switch to a 'preemptive' coding paradigm
> # where the default way of calling functions includes the specification of
> # its namespace? In principle, the functionality offered by
> # 'namespace::fun()' gets the job done.
> # BUT:
> # a) it is slower compared to

Re: [Rd] Increase transparency: suggestion on how to avoid namespaces and/or unnecessary overwrites of existing functions

2011-10-01 Thread Spencer Graves
  When selecting names for functions and variables, I sometimes use 
library(sos) to look for existing conflicts with other packages.  This 
won't solve all the problems Janko mentioned, but it can help avoid 
some.  Spencer
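
A more limited check is possible with base R alone. The sketch below only looks at packages currently attached in the session (it does not search CRAN the way sos does), and the candidate names are made up for the illustration.

```r
# Candidate names for new exported functions (hypothetical examples).
candidates <- c("filter", "my_unlikely_name_xyz")

# find() lists every attached environment that defines a given name.
hits <- lapply(candidates, find)
names(hits) <- candidates

# Names that are already taken somewhere on the search path.
taken <- candidates[lengths(hits) > 0]
print(hits)
```

Any name that ends up in 'taken' would mask, or be masked by, an existing object once the new package is attached.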



On 10/1/2011 9:11 AM, Dominick Samperi wrote:

[snip]

Re: [Rd] Increase transparency: suggestion on how to avoid namespaces and/or unnecessary overwrites of existing functions

2011-10-01 Thread Duncan Murdoch

On 11-08-23 2:23 PM, Janko Thyson wrote:

> Dear list,
>
> I'm aware of the fact that I posted on something related a while ago,
> but I just can't sweat this off and would like to ask you for an opinion:
>
> The problem:
> Namespaces are great, but they don't resolve certain conflicts regarding
> name clashes. There are more and more people out there trying to come up
> with their own R packages, which is great also! Yet, it becomes more and
> more likely that programmers will choose identical names for their
> exported functions and/or that they add functionality to existing
> functions (i.e. overwriting existing functions).
> The whole process of which packages overwrite which functions is
> somewhat obscure and in addition depends on their order in the search
> path. On the other hand, it is not possible to use "namespace"
> functionality (i.e. 'namespace::fun()'; also less efficient than direct
> call; see illustration below) during early stages of the development
> process (i.e. the package is not finished yet) as there is no namespace
> available yet.



I agree there can be a problem, but I don't think it is necessarily as 
serious as you suggest.  Even though there are more and more packages 
available, most people will still use roughly the same number of them. 
Just because CRAN has thousands of packages doesn't mean I use all of 
them at the same time.




> I know of at least two cases where such overwrites (I think it's called
> masking, right?) led to some confusion at our chair:
> 1) loading package forecast overwrites certain functions in stats which
> made some code refactoring necessary


If your code had been in a package with a NAMESPACE, it would not have 
been affected by a user loading forecast.  (If you start importing it, 
then of course it could cause masking problems.)


You suggest above that users only put code into a package very late in 
the development process.  The solution is, don't do that.  Create a 
package early on, and use it through the majority of development time.


You can leave the choice of exports until late by exporting everything; 
you'll still get the benefit of the more controlled name search from the 
beginning.
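
As a sketch of what "exporting everything" could look like, a NAMESPACE file might contain a single pattern directive. The exact pattern below is an assumption chosen for illustration; it exports every object whose name does not start with a dot.

```r
# NAMESPACE (file in the package root); illustrative pattern only.
exportPattern("^[^.]")
```

Later in development the pattern can be replaced by explicit export() directives once the public interface has settled.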


You say you can't use "namespace::call()" until the namespace package 
has been written.  But why would you want to?  If the call is coming 
from the new package, objects in it will be used with first priority in 
resolving the call.  You only need the :: notation when there are 
ambiguities in calls to external packages.
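
The difference between top-level code and package code can be imitated with plain environments. This is only a sketch: 'pkg_env' is a hypothetical stand-in for a package namespace with an explicit import, and 'masking_pkg' is a list standing in for a package that masks 'median'.

```r
# Code typed at top level resolves 'median' through the search path.
user_fun <- function(x) median(x)

# Stand-in for package code: its enclosing environment holds an explicit
# copy of stats::median, loosely analogous to importFrom(stats, median).
pkg_env <- new.env(parent = baseenv())
pkg_env$median <- stats::median
pkg_fun <- function(x) median(x)
environment(pkg_fun) <- pkg_env

# Stand-in for a later-attached package that masks 'median'.
masking_pkg <- list(median = function(x, ...) "masked")
attach(masking_pkg, name = "masking_pkg", warn.conflicts = FALSE)

user_result <- user_fun(c(1, 2, 3))  # picks up the masking version
pkg_result  <- pkg_fun(c(1, 2, 3))   # still uses stats::median

detach("masking_pkg")
```

The function whose enclosing environment already defines 'median' never consults the search path for it, which is why attaching another package does not change its behaviour.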




> 2) loading package 'R.utils' followed by package 'roxygen' overwrites
> 'parse.default()' which results in errors for something like
> 'eval(parse(text="a<- 1"))' ; see illustration below)
> And I'm sure the community could come up with lots more of such scenarios.
>
> Suggestions:
> 1) In order to avoid name clashes/unintended overwrites, how about
> switching to a coding paradigm that explicitly (and automatically)
> includes a package's name in all its functions' names once code is
> turned into a real package? E.g., getting used to "preemptively" type
> 'package_fun()' or 'package.fun()' instead of just 'fun()'. Better to be
> safe than sorry, right? This could be realized pretty easily (see
> example below) and, IMHO, would significantly increase transparency.


I think long names with consistent prefixes are harder to read than 
short descriptive names.  I think this would make code harder to read. 
For example, the first few lines of mean.default would change from


    if (!is.numeric(x) && !is.complex(x) && !is.logical(x)) {
        warning("argument is not numeric or logical: returning NA")
        return(NA_real_)
    }

to

    if (!base_is.numeric(x) && !base_is.complex(x) &&
        !base_is.logical(x)) {
        base_warning("argument is not numeric or logical: returning NA")
        return(base_NA_real_)
    }



> 2) In order to avoid intended (but for the user often pretty obscure)
> overwrites of existing functions, we could use the same mechanism
> together with the "rule": just don't provide any functions that
> overwrite existing ones, rather prepend your version of that function
> with your package name and leave it up to the user which version he
> wants to call.


That seems like good advice.

Duncan Murdoch





Re: [Rd] Increase transparency: suggestion on how to avoid namespaces and/or unnecessary overwrites of existing functions

2011-10-01 Thread Dominick Samperi
On Sat, Oct 1, 2011 at 1:08 PM, Duncan Murdoch  wrote:
[snip]
>> 2) In order to avoid intended (but for the user often pretty obscure)
>> overwrites of existing functions, we could use the same mechanism
>> together with the "rule": just don't provide any functions that
>> overwrite existing ones, rather prepend your version of that function
>> with your package name and leave it up to the user which version he
>> wants to call.
>
> That seems like good advice.
>
> Duncan Murdoch

Except that namespace::foo should be assigned to another local
variable instead of using package::foo in a tight loop, because
repeated calls to "::" can introduce a significant performance
penalty. (This has been discussed in another thread.)
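
A minimal sketch of that pattern, with made-up data and 'stats::median' chosen only as an example of a qualified call:

```r
values <- c(3, 1, 4, 1, 5, 9, 2, 6)
n <- 1000
slow <- numeric(n)
fast <- numeric(n)

# Variant 1: '::' is evaluated again on every iteration.
for (i in seq_len(n)) {
  slow[i] <- stats::median(values + i)
}

# Variant 2: look the function up once, then call the local binding.
med <- stats::median
for (i in seq_len(n)) {
  fast[i] <- med(values + i)
}

identical(slow, fast)
```

Both loops compute the same values; the second one simply avoids repeating the namespace lookup inside the loop body.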


Re: [Rd] Increase transparency: suggestion on how to avoid namespaces and/or unnecessary overwrites of existing functions

2011-10-01 Thread Duncan Murdoch

On 11-10-01 5:14 PM, Dominick Samperi wrote:

[snip]

> Except that namespace::foo should be assigned to another local
> variable instead of using package::foo in a tight loop, because
> repeated calls to "::" can introduce a significant performance
> penalty. (This has been discussed in another thread.)


That's good advice too.

Duncan Murdoch






Re: [Rd] Increase transparency: suggestion on how to avoid namespaces and/or unnecessary overwrites of existing functions

2011-10-01 Thread Simon Urbanek

On Oct 1, 2011, at 6:14 PM, Duncan Murdoch wrote:

[snip]
>> Except that namespace::foo should be assigned to another local
>> variable instead of using package::foo in a tight loop, because
>> repeated calls to "::" can introduce a significant performance
>> penalty. (This has been di

Re: [Rd] Increase transparency: suggestion on how to avoid namespaces and/or unnecessary overwrites of existing functions

2011-10-01 Thread Joshua Wiley
On Sat, Oct 1, 2011 at 3:14 PM, Duncan Murdoch  wrote:
> On 11-10-01 5:14 PM, Dominick Samperi wrote:
[snip]
>> Except that namespace::foo should be assigned to another local
>> variable instead of using package::foo in a tight loop, because
>> repeated calls to "::" can introduce a significant performance
>> penalty. (This has been discussed in another thread.)
>
> That's good advice too.
>
> Duncan Murdoch

Is this performance hit the sort of thing that byte compiling would
help with, or am I misunderstanding its use?



-- 
Joshua Wiley
Ph.D. Student, Health Psychology
Programmer Analyst II, ATS Statistical Consulting Group
University of California, Los Angeles
https://joshuawiley.com/

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] Increase transparency: suggestion on how to avoid namespaces and/or unnecessary overwrites of existing functions

2011-10-01 Thread Simon Urbanek

On Oct 1, 2011, at 7:52 PM, Joshua Wiley wrote:

> On Sat, Oct 1, 2011 at 3:14 PM, Duncan Murdoch wrote:
>> On 11-10-01 5:14 PM, Dominick Samperi wrote:
> [snip]
>>> Except that namespace::foo should be assigned to another local
>>> variable instead of using package::foo in a tight loop, because
>>> repeated calls to "::" can introduce a significant performance
>>> penalty. (This has been discussed in another thread.)
>> 
>> That's good advice too.
>> 
>> Duncan Murdoch
> 
> Is this performance hit the sort of thing that byte compiling would help 
> with, or am I misunderstanding its use?
> 

Depending on what you mean ;). If you mean that compiling code containing :: 
removes the issue, yes, that is indeed true (i.e., foo::bar gets compiled into 
the actual value since the compiler can assume that it is a constant and thus 
:: is not actually called).

If you mean compiling base, and thus :: itself, then that helps only to a degree.
According to microbenchmark the overhead of :: is in the order of 7.5µs when
compiled and 11µs when not compiled on my test machine (Xeon X5690 and R-devel
which is actually much faster than the 30µs in R 2.13.1). If the time spent in
your actual function is in the order of microseconds then the overhead is
still huge in either case. If it's much bigger then either doesn't matter.
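
For anyone who wants to repeat the comparison without installing microbenchmark, here is a rough sketch using only base R and the compiler package. 'time_per_call_us()' is a hypothetical helper written for this illustration, the result is much coarser than a proper benchmark, and the numbers depend on the machine and the R version (recent versions of R also byte-compile closures automatically, which narrows the gap).

```r
# Hypothetical helper: average wall-clock time per call of 'f', in microseconds.
time_per_call_us <- function(f, reps = 1e5) {
  elapsed <- system.time(for (i in seq_len(reps)) f())[["elapsed"]]
  1e6 * elapsed / reps
}

with_colon <- function() base::sum   # lookup through '::'
direct     <- function() sum         # ordinary lookup
compiled   <- compiler::cmpfun(with_colon)

timings <- c(
  with_colon = time_per_call_us(with_colon),
  direct     = time_per_call_us(direct),
  compiled   = time_per_call_us(compiled)
)
print(timings)
```

Whether the difference matters depends, as noted above, on how long the called function itself runs.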

Cheers,
Simon
