Re: [Rd] Is it a good idea or even possible to redefine attach?

2014-08-10 Thread Grant Rettke
Thank you for that pleasant and concise explanation!

I will keep at it.
Grant Rettke | ACM, ASA, FSF, IEEE, SIAM
g...@wisdomandwonder.com | http://www.wisdomandwonder.com/
“Wisdom begins in wonder.” --Socrates
((λ (x) (x x)) (λ (x) (x x)))
“Life has become immeasurably better since I have been forced to stop
taking it seriously.” --Thompson


On Tue, Aug 5, 2014 at 7:54 PM, Winston Chang  wrote:
> On Tue, Aug 5, 2014 at 4:37 PM, Grant Rettke  wrote:
>>
>> That is delightful.
>>
>> When I run it like this:
>> • Start R
>> • Nothing in .Rprofile
>> • Paste in your code
>> ╭
>> │ gcrenv <- new.env()
>> │ gcrenv$attach.old <- attach
>> │ gcrenv$attach <- function(...){stop("NEVER USE ATTACH")}
>> │ base::attach(gcrenv, name="gcr", warn.conflicts = FALSE)
>> ╰
>> • I get exactly what is expected, I think
>> ╭
>> │ search()
>> ╰
>> ╭
>> │  [1] ".GlobalEnv"        "gcr"               "ESSR"
>> │  [4] "package:stats"     "package:graphics"  "package:grDevices"
>> │  [7] "package:utils"     "package:datasets"  "package:methods"
>> │ [10] "Autoloads"         "package:base"
>> ╰
>>
>> Just to be sure:
>> • Is that what is expected?
>> • I am surprised because I thought that `gcr' would come first before
>>   `.GlobalEnv'
>>   • Perhaps I misunderstand, as `.GlobalEnv' is actually the "REPL"?
>>
>> My goal is to move that to my .Rprofile so that it is "always run" and I
>> can forget about it more or less.
>>
>> Reading [this] I felt like `.First' would be the right place to put it,
>> but then read further to find that packages are only loaded /after/
>> `.First' has completed.  Curious, I tried it just to be sure. I am now
>> :).
>>
>> This is the .Rprofile file:
>>
>> ╭
>> │ cat(".Rprofile: Setting CMU repository\n")
>> │ r = getOption("repos")
>> │ r["CRAN"] = "http://lib.stat.cmu.edu/R/CRAN/";
>> │ options(repos = r)
>> │ rm(r)
>> │
>> │ .First <- function() {
>> │«same code as above»
>> │ }
>> ╰
>>
>> (I included the repository load, and understand it should not impact
>> things here)
>>
>> This is run after normal startup of R:
>>
>> ╭
>> │ search()
>> ╰
>> ╭
>> │  [1] ".GlobalEnv"        "package:stats"     "package:graphics"
>> │  [4] "package:grDevices" "package:utils"     "package:datasets"
>> │  [7] "gcr"               "package:methods"   "Autoloads"
>> │ [10] "package:base"
>> ╰
>>
>> When I read this, I read it as:
>> • My rebind of `attach' occurs
>> • Then all of the packages are loaded and they are referring to
>>   my-rebound `attach'
>> • That is a problem because it *will* break package code
>> • Clearly me putting that code in `.Rprofile' is the wrong place.
>>
>
> That order for search path should actually be fine. To understand why,
> you first have to know the difference between the _binding_
> environment for an object, and the _enclosing_ environment for a
> function.
>
> The binding environment is where you can find an object. For example,
> in the global env, you have a bunch of bindings (we often call them
> variables), that point to various objects - vectors, data frames,
> other environments, etc.
>
> The enclosing environment for a function is where the function "runs
> in" when it's called.
>
> Most R objects have just a binding environment (a variable or
> reference that points to the object); functions also have an enclosing
> environment. These two environments aren't necessarily the same.
>
> When you run search(), it shows the set of environments where R will
> look for an object of a given name, when you run stuff at the console
> (and are in the global env). The trick is that, although you can find
> a function (they are bound) in one of these _package_
> environments, those functions run in (are enclosed by) a different
> environment: a corresponding _namespace_ environment.
>
> The way that a namespace environment is set up with the arrangement of
> its ancestor environments, it will find the base namespace version of
> `attach` before it finds yours, even if your personal gcr environment
> comes early in the search path.
>
> =
> # Here's an example to illustrate. The `utils::alarm` function calls
> # `cat`, which is in base.
>
> alarm
> # function ()
> # {
> # cat("\a")
> # flush.console()
> # }
> # 
>
>
> # Running it makes the screen flash or beep
> alarm()
> # [screen flashes]
>
>
> # We'll put a replacement version of cat early in the search path,
> # between utils and base
> my_stuff <- new.env()
> my_stuff$cat <- function(...) stop("Tried to call cat")
> base::attach(my_stuff, pos=length(search()) - 1, name="my_stuff")
>
> search()
> #  [1] ".GlobalEnv"        "tools:rstudio"     "package:stats"     "package:graphics"
> #  [5] "package:grDevices" "package:utils"     "package:datasets"  "package:methods"
> #  [9] "my_stuff"          "Autoloads"         "package:base"
>
> # Calling cat from the console gives the error, as expected
> cat()
> # Error in cat() : Tried to call cat
>
> # Bu
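
The lookup order described in the explanation above (a package function looks in its own namespace, then its imports, then the base namespace, and only afterwards in the global environment and the search path) can be checked directly. This is a minimal sketch using base R only; the variable names `ns`, `imports`, `chain_ok` and `found_base_cat` are illustrative and not part of the original message.

```r
# The enclosing environment of utils::alarm is the utils namespace,
# not the "package:utils" entry that search() shows.
ns <- environment(utils::alarm)
is_ns <- isNamespace(ns)

# Walk the ancestors: namespace -> imports -> base namespace -> global env.
imports <- parent.env(ns)
chain_ok <- identical(parent.env(imports), .BaseNamespaceEnv) &&
  identical(parent.env(.BaseNamespaceEnv), globalenv())

# Shadow `cat` in the global environment, then look it up the way code
# inside the utils namespace would. The base version is found first.
assign("cat", function(...) stop("Tried to call cat"), envir = globalenv())
found_base_cat <- identical(get("cat", envir = ns), base::cat)
rm("cat", envir = globalenv())
```

If all three flags come back `TRUE`, a replacement `cat` (or `attach`) placed in the global environment or on the search path is not what functions inside a package namespace resolve to.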

Re: [Rd] Is it a good idea or even possible to redefine attach?

2014-08-10 Thread Grant Rettke
As it turns out, my approach was a bit aggressive. A critical package
was using it and could see my new attach!

I will just warn, and encourage:

.First <- function() {
    gcr <- new.env()
    gcr$unsafe.attach <- attach
    gcr$attach <- function(...) {
        warning("NEVER USE ATTACH! Use `unsafe.attach` if you must.")
        unsafe.attach(...)
    }
    base::attach(gcr, name="gcr", warn.conflicts = FALSE)
}

[Rd] How to redefine `require' to generate a warning and delegate work to the original require?

2014-08-10 Thread Grant Rettke
Good afternoon,

My goal is to warn the user any time that they use the `require'
function. The reason is that they probably wanted to use `library'
instead. There is a case where they should use the former though, so I
just want to issue a warning.

My approach was to:
• Re-bind `require' to `original.require'
• Re-implement `require' to
  • Warn the user
  • Delegate the real work back to `original.require'
  • Like this inside of my `.Rprofile':

╭
│ .First <- function() {
│ gcr <- new.env()
│ gcr$original.require(...) <- require
│ gcr$require <- function(...) {
│ warning("Are you sure you wanted `require` instead of `library`?")
│ original.require(...)
│ }
│ base::attach(gcr, name="gcr", warn.conflicts = FALSE)
│ }
╰

On startup I get the following error though:

╭
│ Error in gcr$original.require(...) <- require :
│   '...' used in an incorrect context
╰

What am I doing wrong here and what should I have read to grok my
mistake?

╭
│ > sessionInfo()
│ R version 3.1.1 (2014-07-10)
│ Platform: x86_64-apple-darwin13.2.0 (64-bit)
│
│ locale:
│ [1] en_US/en_US/en_US/C/en_US/en_US
│
│ attached base packages:
│ [1] stats graphics  grDevices utils datasets  methods   base
│
│ loaded via a namespace (and not attached):
│ [1] compiler_3.1.1 tools_3.1.1
╰

Kind regards,


__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] How to redefine `require' to generate a warning and delegate work to the original require?

2014-08-10 Thread Jeroen Ooms
On Sun, Aug 10, 2014 at 7:22 PM, Grant Rettke  wrote:
>
> │ Error in gcr$original.require(...) <- require :
> │   '...' used in an incorrect context

I think you mean: gcr$original.require <- base::require. You don't
need the parentheses since you are not defining or calling a function.
You are simply assigning an object to another environment.
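
A minimal sketch of the corrected assignment, written so it can be pasted into a running session rather than placed in `.First`. As an assumption made for this demo, the wrapper calls `gcr$original.require` directly instead of relying on `gcr` being attached to the search path; the `messages` and `loaded` variables are only there to show that the warning fires and the call is still delegated.

```r
gcr <- new.env()

# Plain assignment: no parentheses, no `...` on the left-hand side.
gcr$original.require <- base::require

gcr$require <- function(...) {
  warning("Are you sure you wanted `require` instead of `library`?")
  gcr$original.require(...)
}

# Call the wrapper, collecting the warning text instead of printing it.
messages <- character()
loaded <- withCallingHandlers(
  gcr$require("stats", character.only = TRUE),
  warning = function(w) {
    messages <<- c(messages, conditionMessage(w))
    invokeRestart("muffleWarning")
  }
)
```

Here `loaded` holds the value returned by the original `require`, and `messages` holds the wrapper's warning.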

Basic questions about R usage/syntax like these are better suited for
the r-help list or Stack Overflow. Have a look at:
https://stat.ethz.ch/mailman/listinfo/r-devel.



[Rd] "Fastest" way to merge 300+ .5MB dataframes?

2014-08-10 Thread Grant Rettke
Good afternoon,

Today I was working on a practice problem. It was simple, and perhaps
even realistic. It looked like this:
• Get a list of all the data files in a directory
• Load each file into a dataframe
• Merge them into a single data frame

Because all of the columns were the same, the simplest solution in my
mind was to `Reduce' the vector of dataframes with a call to
`merge'. That worked fine, I got what was expected. That is key
actually. It is literally a one-liner, and there will never be index
or scoping errors with it.

Now with that in mind, what is the idiomatic way? Do people usually do
something else because it is /faster/ (by some definition)?
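
For reference, a common base-R idiom when every file has the same columns is to read the files with `lapply` and stack the rows with `do.call(rbind, ...)`; note that `merge` performs a join on shared columns rather than appending rows. The sketch below assumes CSV input and makes no claim about relative speed. The directory, file names and column names are hypothetical and exist only to make the example self-contained.

```r
# Create three small CSV files with identical columns (hypothetical data).
demo_dir <- file.path(tempdir(), "merge_demo")
dir.create(demo_dir, showWarnings = FALSE)
for (i in 1:3) {
  write.csv(data.frame(id = i, value = i * 10),
            file.path(demo_dir, sprintf("part%d.csv", i)),
            row.names = FALSE)
}

# 1. List the data files, 2. load each one, 3. stack them row-wise.
files    <- list.files(demo_dir, pattern = "\\.csv$", full.names = TRUE)
frames   <- lapply(files, read.csv)
combined <- do.call(rbind, frames)
```

Like the `Reduce` approach, this involves no explicit indexing; the difference is that rows are appended rather than matched on key columns.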

Kind regards,




Re: [Rd] "Fastest" way to merge 300+ .5MB dataframes?

2014-08-10 Thread Joshua Ulrich
The same comment Jeroen Ooms made about your last email also applies
to this one: it is better suited to R-help.
--
Joshua Ulrich  |  about.me/joshuaulrich
FOSS Trading  |  www.fosstrading.com




Re: [Rd] "Fastest" way to merge 300+ .5MB dataframes?

2014-08-10 Thread Grant Rettke
My sincere apologies.

Having read http://www.r-project.org/posting-guide.html , I had wanted
to post this one to R-help.


Re: [Rd] How to redefine `require' to generate a warning and delegate work to the original require?

2014-08-10 Thread Grant Rettke
Jeroen, my sincere apologies.

Having read http://www.r-project.org/posting-guide.html I had
determined that my question was specifically "discussion about code
development in R" and was not in the realm of those who "want to use R
to solve problems but who are not necessarily interested in or
knowledgeable about programming".

As such, I posted it here.

Where did I go wrong?

Thanks for letting me know on both parts and have a great day.


[Rd] Error when assigning value in environment which is a locked binding

2014-08-10 Thread Winston Chang
If an environment x contains a locked binding y which is also an
environment, and then you try to assign a value to a binding inside of
y, it can either succeed or fail, depending on how you refer to
environment y.

x <- new.env()
x$y <- new.env()
lockEnvironment(x, bindings = TRUE)

# This assignment fails
x$y$z <- 1
# Error in x$y$z <- 1 : cannot change value of locked binding for 'y'

# Saving x$y to another variable, and then assigning there works
y2 <- x$y
y2$z <- 10  # OK
print(x$y$z)
# [1] 10


Is this a bug or a feature? I realize that x$y is a locked binding
while y2 is not.

-Winston



Re: [Rd] Error when assigning value in environment which is a locked binding

2014-08-10 Thread Winston Chang
Another oddity - even though there's an error thrown in assignment to
x$y$z, the assignment succeeds.

x <- new.env()
x$y <- new.env()
lockEnvironment(x, bindings = TRUE)
x$y$z <- 1
# Error in x$y$z <- 1 : cannot change value of locked binding for 'y'

x$y$z
# [1] 1


So I assume there must be a bug somewhere in here.
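
One possible reading of the behaviour reported above, offered as a sketch rather than a diagnosis: the R Language Definition describes a nested replacement such as `x$y$z <- 1` as an inner update of `x$y` followed by an outer re-assignment of `y` in `x`. Because environments are modified by reference, the inner update would already have taken effect by the time the outer re-assignment meets the locked binding. The code below imitates those two steps with `assign`; it is an approximation of the expansion, not R's internal code, and `outer_result` is an illustrative name.

```r
x <- new.env()
x$y <- new.env()
lockEnvironment(x, bindings = TRUE)

# Locking x (with bindings = TRUE) locks the binding `y` inside x,
# but it does not lock the environment that `y` points to.
y_binding_locked <- bindingIsLocked("y", x)
y_env_locked     <- environmentIsLocked(x$y)

# Inner step: modify the environment y in place. This is allowed.
assign("z", 1, envir = x$y)

# Outer step: re-bind `y` in x (to the very same environment).
# This touches the locked binding, so it signals an error.
outer_result <- tryCatch({
  assign("y", x$y, envir = x)
  "rebound"
}, error = function(e) conditionMessage(e))

# The inner change persists even though the outer step failed.
z_value <- x$y$z
```

Under this reading, assigning through a separate reference (as with `y2` in the first message) or with `assign("z", value, envir = x$y)` avoids the outer re-assignment altogether.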

-Winston


