Re: [Rd] write.csv problems
At 17:02 on 28/06/2024, Spencer Graves wrote:
> Hello, All:
>
> I'm getting strange errors with write.csv with some objects of class
> c('findFn', 'data.frame'). Consider the following:
>
>     df1 <- data.frame(x=1)
>     class(df1) <- c('findFn', 'data.frame')
>     write.csv(df1, 'df1.csv')
>     # Error in x$Package : $ operator is invalid for atomic vectors
>
>     df2 <- data.frame(a=letters[1:2], b=as.POSIXct('2024-06-28'))
>     class(df2) <- c('findFn', 'data.frame')
>     write.csv(df2, 'df1.csv')
>     # Error in tapply(rep(1, nrow(x)), xP, length) :
>     #   arguments must have same length
>
> "write.csv" works with some objects of class c('findFn', 'data.frame')
> but not others. I have a 'findFn' object with 5264 rows that fails with
> the following error:
>
>     Error in `[<-.data.frame`(`*tmp*`, needconv, value = list(Count = c("83", :
>       replacement element 1 has 526 rows, need 5264
>
> I have NOT yet been able to reproduce this error with a smaller example.
> However, starting 'write.csv' with something like the following should
> fix all these problems:
>
>     if(is.data.frame(x)) class(x) <- 'data.frame'
>
> Comments?
> Thanks for all your work to help improve the quality of statistical
> software available to the world.
>
> Spencer Graves

Hello,

I don't know if this answers the question. I wasn't able to reproduce the errors, but I was able to reproduce warnings.
One way to avoid both errors and warnings is to call write.csv at the end of a pipe such as the following.

    df1 <- findFn("mean")
    df1 |> as.data.frame() |> write.csv("df1.csv")

This solution is equivalent to the code proposed in the OP, without the need for a change in base R.

Hope this helps,

Rui Barradas

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
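A small sketch of why the as.data.frame() step in the pipe helps: when extra classes sit in front of "data.frame" in the class vector, as.data.frame() drops them, so later subsetting uses the ordinary data-frame methods. The object below only mimics the class vector of a findFn result; it does not call findFn(), so the sos package is not needed, and for the same reason the errors from the original post will not appear here. The temporary file name is an assumption for illustration, and the |> operator needs R 4.1 or later.

```r
# Build a small object that only mimics the class vector of a findFn result.
df1 <- data.frame(x = 1)
class(df1) <- c("findFn", "data.frame")

# as.data.frame() keeps "data.frame" and drops the classes in front of it.
plain <- as.data.frame(df1)
print(class(plain))

# Write to a temporary file (illustrative path) and read it back.
out <- tempfile(fileext = ".csv")
df1 |> as.data.frame() |> write.csv(out)
back <- read.csv(out)
print(back)
```

The same effect could be had with class(x) <- "data.frame" on a copy, which is what the original post proposed doing inside write.csv.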
Re: [Rd] write.csv problems
Hi, Rui et al.:

On 6/29/24 14:24, Rui Barradas wrote:
> Hello,
>
> I don't know if this answers the question. I wasn't able to reproduce
> the errors, but I was able to reproduce warnings.
> One way to avoid both errors and warnings is to call write.csv at the
> end of a pipe such as the following.
>
>     df1 <- findFn("mean")
>     df1 |> as.data.frame() |> write.csv("df1.csv")
>
> This solution is equivalent to the code proposed in the OP, without the
> need for a change in base R.

Thanks for this. Ivan Krylov informed me that this was NOT a problem with base R but with "[.findFn". I fixed that and got help from Ivan fixing another problem with "sos". Now it is officially "on its way to CRAN."

> Hope this helps,

Yes. I'm not yet facile with "|>", but I'm learning.

Spencer Graves

> Rui Barradas
[Rd] |>
> Yes. I'm not yet facile with "|>", but I'm learning.
>
> Spencer Graves

There's very little to know. This:

    x |> f() |> g()

is just a different way of writing

    g(f(x))

If f() or g() have extra arguments, just add them afterwards:

    x |> f(a = 1) |> g(b = 2)

is just

    g(f(x, a = 1), b = 2)

This isn't quite true of the magrittr pipe, but it is exactly true of the base pipe.

Duncan Murdoch
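The equivalence described above can be checked directly. This is a minimal sketch; f() and g() are toy functions made up for illustration, and the |> operator needs R 4.1 or later.

```r
# Two toy functions (hypothetical, for illustration only).
f <- function(x, a = 0) x + a
g <- function(x, b = 1) x * b

x <- 3
piped  <- x |> f(a = 1) |> g(b = 2)
nested <- g(f(x, a = 1), b = 2)
print(identical(piped, nested))

# The parser rewrites the base pipe before evaluation, so the two
# quoted expressions are the same call object.
same_call <- identical(quote(x |> f(a = 1) |> g(b = 2)),
                       quote(g(f(x, a = 1), b = 2)))
print(same_call)
```

Printing quote(x |> f(a = 1) |> g(b = 2)) at the console is a quick way to see the nested form that the parser produces.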
Re: [Rd] |>
Hi, Duncan:

On 6/29/24 17:24, Duncan Murdoch wrote:
>> Yes. I'm not yet facile with "|>", but I'm learning.
>>
>> Spencer Graves
>
> There's very little to know. This:
>
>     x |> f() |> g()
>
> is just a different way of writing
>
>     g(f(x))
>
> If f() or g() have extra arguments, just add them afterwards:
>
>     x |> f(a = 1) |> g(b = 2)
>
> is just
>
>     g(f(x, a = 1), b = 2)

Agreed. If I understand correctly, the supporters of the former think it's easier to highlight and execute a subset of the earlier character string, e.g., "x |> f(a = 1)", than the corresponding subset of the latter, "f(x, a = 1)". I remain unconvinced.

For debugging, I prefer the following:

    fx1 <- f(x, a = 1)
    g(fx1, b=2)

Yes, "fx1" occupies storage space that the other two do not. If you are writing code for an 8086, the difference is important. However, for my work, ease of debugging is important, which is why I prefer "fx1 <- f(x, a = 1); g(fx1, b=2)".

Thanks, again, for the reply.

Spencer Graves

> This isn't quite true of the magrittr pipe, but it is exactly true of
> the base pipe.
>
> Duncan Murdoch
Re: [Rd] |>
I agree with you (I think we may be similarly aged), but there is the `magrittr::debug_pipe()` function, which can be inserted anywhere into either kind of pipe. It will call `browser()` at that point, and let you examine the current value, before passing it on to the next entry.

You can't single step through a pipe (as far as I know), but with that modification, you can see what you've got at any point.

Duncan Murdoch

On 2024-06-29 6:57 p.m., Spencer Graves wrote:
> Agreed. If I understand correctly, the supporters of the former think
> it's easier to highlight and execute a subset of the earlier character
> string, e.g., "x |> f(a = 1)", than the corresponding subset of the
> latter, "f(x, a = 1)". I remain unconvinced.
>
> For debugging, I prefer the following:
>
>     fx1 <- f(x, a = 1)
>     g(fx1, b=2)
>
> Yes, "fx1" occupies storage space that the other two do not. If you are
> writing code for an 8086, the difference is important. However, for my
> work, ease of debugging is important, which is why I prefer
> "fx1 <- f(x, a = 1); g(fx1, b=2)".
>
> Thanks, again, for the reply.
>
> Spencer Graves
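For non-interactive runs, the idea of inspecting a value in the middle of a pipe can be sketched with a small pass-through helper. The function peek() below is hypothetical (it is not part of magrittr or base R): it prints the structure of the current value and returns it unchanged, so it can sit between two steps of a base pipe. In an interactive session, magrittr::debug_pipe() could be placed in the same position. The functions f() and g() are toy stand-ins.

```r
# Hypothetical helper: show the intermediate value, then pass it along.
peek <- function(x) {
  str(x)
  x
}

# Toy functions for illustration only.
f <- function(x, a = 0) x + a
g <- function(x, b = 1) x * b

# peek() reports what f() produced before g() sees it.
result <- 1:3 |> f(a = 1) |> peek() |> g(b = 2)
print(result)
```

Because peek() returns its argument untouched, removing it from the pipe does not change the result.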
Re: [Rd] |>
I suggest there is actually quite a lot to know about piping, albeit you can use it fine while knowing little.

For those who can happily write complex lines of code containing nested function calls and never have to explain it to anyone, feel free. I can do that, and sometimes months later I only figure out what I did in ten minutes and then check to see if I got it right!

But for people who are used to vaguely similar features in other languages, pipes are a great way to visualize data and process flow, as they show a sort of sequence. No, they are not at all the same as a UNIX pipe, but that is not a bad model, as it lets you write shell scripts that do one conceptual step at a time and pass along data to the input of another program that processes it further and passes it along until you reach some goal.

Many languages, such as ones using variations on object-oriented programming, have a sort of pipeline that can look like:

    a.method_a(args).method_b(args)

And in some languages, that can be spread across multiple lines to look a bit more like a pipeline. This too is an inexact analogy, as what really happens is that the underlying object can return perhaps another object when you call a method, and then you can call a method on that object, and so on. This can make it limited in some ways or quite powerful.

The many versions of an R pipe that have been created can be variations on many themes. As an example, you could take the multiple lines in a pipeline and rearrange them to look like the nested code with function calls as arguments in other functions and then evaluate it. It would, in effect, be a sort of syntactic sugar that makes it easier for SOME programmers.

But the topic now shifts to debugging and, indeed, the underlying implementation of a pipeline can affect how one debugs. The simplest case is trivial to debug.
No visible pipes:

    Temp1 <- f1(x, args)
    Temp2 <- f2(Temp1, args)
    Result <- f3(Temp2, args)
    rm(Temp1, Temp2)

So one form of piping does something like this under the table. For code like:

    x PIPED f1(args) PIPED f2(args) PIPED f3(args) -> Result

it simply does something like this:

    . <- x
    . <- f1(., args)
    . <- f2(., args)
    Result <- f3(., args)

The variable "." just gets re-used repeatedly. But as this code swap is done outside normal view, can a debugger follow it? And "." keeps changing.

As a nice feature, some implementations may actually check, and if you place "." as an argument past the beginning, as in f3(args, ., more_args), allow you to pipe in not just to the first argument, for the many functions that may want the data second or third or ...

There are other implementations possible that allow syntactic sugar without necessarily being run as shown. I am not sure how the native pipe that was added is implemented, but it seems quite a bit faster than many other implementations and has some quirks, such as requiring all functions to include parentheses, even if empty, like piping to head(), and the way to do some things using anonymous functions is a tad annoying.

I think the focus for many people is the HUMAN who is programming and sees a logical way to describe what they want without much ambiguity. Of course, if you want to keep playing with your code, don't use pipes except perhaps when it is pretty much done.

An analogy to consider is another variant of piping used by ggplot, where "+" is overloaded and:

    ggplot(args) +
      geom_point(args) +
      geom_line(args) +
      xlab(args) +
      theme_bw() +
      coord_flip() + ...

is a common way of writing a fairly complex set of operations. But what is being piped there is a growing object that each step modifies, and at the end, the object is rendered into a graph based on whatever complex contents it contains. And, yes, that can be painful to debug, and a simple option is:

    P <- ggplot(args)
    P <- P + geom_point(args)
    P <- P + geom_line(args)
    ...
    print(P)

Being able to declare incremental changes and layers to a graph this way is more intuitive to some. Not using a pipelined approach allows you to comment out parts easily, such as not making it black/white sometimes, albeit you can as easily comment out the other version.

What some people need to understand is that adding pipes of any of the varieties has never taken away the ability to write the code in other ways. It is not in any way required. And for some people, it aligns better with how they reason. Yet, if you need lots of debugging in your programs, writing them differently may be a better idea, at least until the code is debugged.

I have written code for my clients with quite elegant pipelines as well as functions like the dplyr mutate() that allow me to do many things in one function call, and formatted it beautifully with varying levels of indentation so you can see at a glance where things line up. Parts of the code are nested function calls, and when it all leads to a ggplot structure like the one above, it can be a tad hard for many people to appreciate what it is doing. But