[Rd] Inconsistent Parse Behavior

brodie gaslam via R-devel Thu, 25 Dec 2014 09:22:59 -0800

Under some specific conditions, `parse` seems to produce inconsistent and 
potentially incorrect results the first time it is run in a fresh R session. 
Consider this code, where we parse the same text twice in a row and get one 
value in the parse data that is mismatched:
```
Type 'demo()' for some demos, 'help()' for on-line help, or
'help.start()' for an HTML browser interface to help.
Type 'q()' to quit R.


> txt <- 'c("", {
+   c(integer(3L), 1:3)
+   c(integer(), 1:3, 1L)         # TRUE
+   c(integer(), c(1, 2, 3), 1L)  # TRUE
+ } )
+ c("", {
+   lst <- list(list( 1,  2), list( 3, list( 4, list( 5, list(6, 6.1, 6.2)))))
+ } )
+ c("", {
+   TRUE
+ } )'
> prs1 <- parse(text=txt, keep.source=TRUE)
> prs2 <- parse(text=txt, keep.source=TRUE)
> which(attr(prs1, "srcfile")$parseData != attr(prs2, "srcfile")$parseData)
[1] 1176
> sessionInfo()
R version 3.1.2 (2014-10-31)
Platform: x86_64-apple-darwin13.4.0 (64-bit)

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base 
```
This discrepancy does not happen if I simplify the text being parsed in any 
way.  The code shown is already a much-simplified version of the code that 
first produced the error for me.  I cannot reduce it further without also 
eliminating the error.
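
To see which rows are affected rather than a raw index into the internal matrix, the two parses can be compared through `getParseData`. The following is only a minimal sketch: `diff_parse_data` is a hypothetical helper name (not part of R), and it assumes both parses yield the same number of rows.

```r
# Hypothetical helper: parse the same text twice and return the rows of
# the parse data on which the two parses disagree.
diff_parse_data <- function(text) {
  p1 <- utils::getParseData(parse(text = text, keep.source = TRUE))
  p2 <- utils::getParseData(parse(text = text, keep.source = TRUE))
  # Assumption: both parses produce tables of the same shape.
  stopifnot(identical(dim(p1), dim(p2)))
  # A row differs if any of its columns differs between the two parses.
  differs <- rowSums(p1 != p2) > 0
  list(first = p1[differs, ], second = p2[differs, ])
}
```

Calling `diff_parse_data(txt)` as the first parse in a fresh session should then list the mismatched rows side by side; if the two parses agree, both data frames come back with zero rows.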
Unfortunately, the discrepancy is meaningful.  The problem is the first parse.  
Looking at `getParseData` output:
```
> subset(getParseData(prs1), id %in% c(226, 234))
    line1 col1 line2 col2  id parent token terminal text
226     6    1     8    3 226    234  expr    FALSE     
234     9    5     9    5 234    251   ','     TRUE    ,
```
Notice how item 226 has as its parent item 234, which starts on line 9, col 5, 
after item 226 ends.  I'm not sure how this is possible.
In the second parse, the parse data is as one would expect:
```
> subset(getParseData(prs2), id == 226)
    line1 col1 line2 col2  id parent token terminal text
226     6    1     8    3 226      0  expr    FALSE    
```
The parent here is the top level (0), as would be expected looking at the 
source code in `txt` (226 represents the second `c(...)` block).
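
The inconsistency described above (a parent that begins after its child ends) can be checked for mechanically. This is a rough sketch, not a definitive validator: `check_parent_spans` is a hypothetical helper, and it assumes that in well-formed parse data every positive `parent` id is present in the table and its span encloses the child's span. Rows with a parent of 0 or below are skipped.

```r
# Hypothetical helper: return the rows whose parent is missing from the
# table or whose parent's source span does not enclose the row's own span.
check_parent_spans <- function(pd) {
  kids <- pd[pd$parent > 0, ]
  par  <- pd[match(kids$parent, pd$id), ]
  missing <- is.na(par$id)
  # Child begins before its parent begins.
  starts_before <- kids$line1 < par$line1 |
    (kids$line1 == par$line1 & kids$col1 < par$col1)
  # Child ends after its parent ends.
  ends_after <- kids$line2 > par$line2 |
    (kids$line2 == par$line2 & kids$col2 > par$col2)
  kids[missing | starts_before | ends_after, ]
}
```

Under these assumptions, `check_parent_spans(getParseData(prs1))` should flag item 226, while the same call on `prs2` should return no rows. Run it on the full parse data rather than a subset, since a subset will report parents that were filtered out as missing.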
I suspect the problem is caused by the use of `{}` inside a function call 
(`c()` in the example), but again, it is not that simple, since any further 
simplification of my code above seems to resolve the problem.  I also don't 
know why it would work fine the second time, though there must be some state 
initialization going on inside the parser.
Any help appreciated.
Best,
Brodie