Duncan Murdoch wrote:
On 16/03/2009 5:06 PM, Romain Francois wrote:
hadley wickham wrote:
It would be pretty easy to use the output from the R parser (which is never
wrong, is it?) and dump some markup out of it. For example, the showTree
function in codetools dumps an R expression as Lisp, which is not too far
from generating HTML or any other markup.
As this sounds like fun, I'll volunteer to do something about this. Another
advantage is that we could plug hyperlinks into R code that lives in HTML
help pages.
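
To make the "dump the parse tree as markup" idea concrete, here is a minimal sketch of a recursive walk over a parsed expression that prints it in a Lisp-like form. It is not how codetools implements showTree; the helper name to_lisp is made up for illustration, and it only aims at simple calls (empty arguments such as x[1, ] are not handled).

```r
# Hypothetical helper: render a parsed R expression as a Lisp-style string.
to_lisp <- function(e) {
  if (is.call(e)) {
    # A call behaves like a list: the function first, then its arguments.
    parts <- vapply(as.list(e), to_lisp, character(1))
    paste0("(", paste(parts, collapse = " "), ")")
  } else if (is.symbol(e)) {
    as.character(e)
  } else {
    # Constants such as numbers and strings.
    paste(deparse(e), collapse = " ")
  }
}

expr <- parse(text = "mean(1:10)")[[1]]
to_lisp(expr)
# [1] "(mean (: 1 10))"
```

Replacing the parentheses with opening and closing tags in the same walk would give HTML or any other markup instead.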
This also sounds like a good idea for a Google Summer of Code project
- that way you might be able to get a student to give you a hand as
well.
Hadley
That did cross my mind earlier this evening. It just seems a bit too
easy to last all summer, but maybe I am missing something difficult.
I will start to play with this over the next few days and make up my
mind.
It depends on your standards. You said you want R to parse the code
in the Rd file. That's going to be hard, because Rd files contain
something that is only "R-like", as far as the parser is concerned.
You'll need to convert it into R code before you can pass it to the R
parser.
I would assume this would be outsourced to the experimental parse_Rd
function.
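
A rough sketch of that route, assuming tools::parse_Rd is available and that flattening the \examples section back to text is good enough. The Rd content below is made up for illustration, and Rd escapes or macros such as \dontrun would need more care than this.

```r
library(tools)

# A tiny, made-up Rd file written to a temporary location for illustration.
rd_text <- c(
  "\\name{demo}",
  "\\title{Demo}",
  "\\description{A made-up help page.}",
  "\\examples{",
  "mean(1:10)",
  "}"
)
rd_file <- tempfile(fileext = ".Rd")
writeLines(rd_text, rd_file)

rd <- parse_Rd(rd_file)

# Each top-level element carries its section name in the "Rd_tag" attribute.
tags <- vapply(rd, function(x) attr(x, "Rd_tag"), character(1))
examples <- rd[[which(tags == "\\examples")[1]]]

# Flatten the section back to plain text, then hand it to the R parser.
code <- paste(unlist(examples), collapse = "")
parsed <- parse(text = code)
```

Here parsed is an ordinary R expression vector, so anything built on the parser output can take over from this point.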
And then there's the question of scoping, which gets into the
evaluator, not just the parser. (The parser only recognizes "mean" as
an identifier; it's the evaluator that decides whether it's the
function in the base package or a local variable.)
That is an issue. I guess I will fall back on what the parser says and
infer the scoping from it. In the lines below, mean would be different
each time:
mean( 1:10 )
lapply( 1:10, mean)
mean <- (1+4) / 2
lapply( list( mean, median), function( f ) f( 1:10) )
{ mean <- median; mean( 1:10 ) }
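
For what it is worth, here is a sketch of what "falling back on what the parser says" could look like, assuming utils::getParseData (available in current versions of R) is an acceptable source of token information. It only reports how the parser classifies each occurrence of mean; it says nothing about which object the evaluator would actually find.

```r
code <- "mean( 1:10 )
lapply( 1:10, mean)
mean <- (1+4) / 2
lapply( list( mean, median), function( f ) f( 1:10) )
{ mean <- median; mean( 1:10 ) }"

# keep.source = TRUE is needed so that the parse data is recorded.
pd <- utils::getParseData(parse(text = code, keep.source = TRUE))

# Keep only the terminal tokens whose text is "mean".
mean_tokens <- pd[pd$terminal & pd$text == "mean", c("line1", "col1", "token")]
print(mean_tokens, row.names = FALSE)
```

The parser separates mean used in call position (SYMBOL_FUNCTION_CALL) from mean used as a plain symbol (SYMBOL), which is about as far as a highlighter can go without evaluating anything.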
So if you've got high standards, it's probably quite hard. On the
other hand, if you're willing to accept the usual sort of errors that
syntax highlighters make, it's not so bad, but not trivial.
There is probably some middle ground between the job a highlighter
would do and the way the R evaluator would eventually interpret the
expression. Given that this is more of a nice-to-have feature, I guess we
can accept some errors. checkUsage is wrong sometimes, but it is still a
good tool.
One of the problems I might run into is performance. If we want this
to handle all Rd files, we are going to want something very efficient,
and it might not be enough to build on top of codetools (which uses
recursion at the R level); it could make sense to provide a C-level
implementation.
Remember what Knuth said about premature optimization. Write it first
in R, and only optimize it if it's not fast enough.
Deal
(I'd guess it'll be fast enough: Brian Ripley reported that all the R
code he wrote for conversions in R-devel was faster than the Perl code
it was replacing.)
That is good news
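
In that spirit, a quick way to find out whether the R-level version is fast enough is simply to time it. This is only a sketch: the snippet and the repetition count are arbitrary choices, not a real benchmark over Rd files.

```r
# Hypothetical workload: the same small snippet parsed many times.
snippet <- "lapply(list(mean, median), function(f) f(1:10))"
n <- 1000

timing <- system.time(
  for (i in seq_len(n)) {
    utils::getParseData(parse(text = snippet, keep.source = TRUE))
  }
)
elapsed <- timing[["elapsed"]]
cat(sprintf("%d snippets in %.3f seconds\n", n, elapsed))
```

Only if numbers like these turn out to be a problem on the full set of help pages would a C-level implementation be worth the effort.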
This could lead to interesting things such as:
- syntax highlighting in Sweave (or decumar)
- pretty printing in the console (using ANSI escape sequences)
- syntax highlighting in R help files, potentially with hyperlinks
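
As an illustration of the console case, here is a minimal sketch that colours a few token types with ANSI escape sequences, based on utils::getParseData. The function name ansi_highlight and the colour choices are made up, and the code assumes ASCII input without tabs so that column positions match character positions.

```r
# Hypothetical helper: wrap selected tokens in ANSI colour codes.
ansi_highlight <- function(code) {
  lines <- strsplit(code, "\n", fixed = TRUE)[[1]]
  pd <- utils::getParseData(parse(text = code, keep.source = TRUE))

  # Arbitrary palette: blue calls, green numbers, red strings, grey comments.
  colours <- c(
    SYMBOL_FUNCTION_CALL = "\033[34m",
    NUM_CONST            = "\033[32m",
    STR_CONST            = "\033[31m",
    COMMENT              = "\033[90m"
  )
  reset <- "\033[0m"

  # Only single-line terminal tokens that have a colour assigned.
  keep <- pd$terminal & pd$line1 == pd$line2 & pd$token %in% names(colours)
  pd <- pd[keep, ]

  # Work from right to left so earlier column positions stay valid.
  pd <- pd[order(pd$line1, -pd$col1), ]
  for (i in seq_len(nrow(pd))) {
    ln <- pd$line1[i]
    s <- lines[ln]
    lines[ln] <- paste0(
      substr(s, 1, pd$col1[i] - 1),
      colours[[as.character(pd$token[i])]],
      substr(s, pd$col1[i], pd$col2[i]),
      reset,
      substr(s, pd$col2[i] + 1, nchar(s))
    )
  }
  paste(lines, collapse = "\n")
}

cat(ansi_highlight("mean(1:10)  # average"), "\n")
```

The same token table could drive HTML output by swapping the escape sequences for tags.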
I have requested the creation of a project on R-Forge. Does anyone else
want to play with this?
I'll sign up once it's going.
Duncan Murdoch
--
Romain Francois
Independent R Consultant
+33(0) 6 28 91 30 30
http://romainfrancois.blog.free.fr