On Dec 10, 2012, at 2:05 PM, Simon Urbanek wrote:
>
> On Dec 10, 2012, at 1:51 AM, Paul Johnson wrote:
>
>> I'm continuing my work on finding speedups in generalized inverse
>> calculations in some simulations. It leads me back to .C and .Call,
>> and some questions I've never been able to answer for myself. It may
>> be I can push some calculations to LAPACK in or C BLAS, that's why I
>> realized again I don't understand the call by reference or value
>> semantics of .Call
>>
>> Why aren't users of .Call encouraged to "const" their arguments, and
>> why doesn't .Call do this for them (if we really believe in return by
>> value)?
>>
>
> Because there is a difference between the *data* part of the SEXP and the
> object itself. Internal structure of the object may need to be modified (e.g.
> the NAMED ref counting increased when you assign it) in a call to R API. You
> can't flag the data part as const separately, so you have to use non-const
> SEXP.
>
>
>> R Gentleman's R Programming for Bioinformatics is the most
>> understandable treatment I've found on .Call. It appears to me .Call
>> leaves "wiggle room" where there should be none. Here's Gentleman on
>> p. 185. "For .Call and .External, the return value is an R object (the
>> C functions must return a SEXP), and for these functions the values
>> that were passed are typically not modified. If they must be
>> modified, then making a copy in R, prior to invoking the C code, is
>> necessary."
>>
>> I *think* that means:
>>
>> .Call allows return by reference, BUT we really wish users would not
>> use it. Users can damage R data structures that are pointed to unless
>> they really truly know what they are doing on the C side. ??
>>
>> This seems dangerous. Why allow return by reference at all?
>>
>
> Because it is completely legal to do things like
>
> SEXP last(SEXP bar) {
>     if (TYPEOF(bar) == VECSXP && LENGTH(bar) > 0)
>         return VECTOR_ELT(bar, LENGTH(bar) - 1);
>     Rf_error("sorry, I only work on lists");
> }
>
Martin Morgan pointed out that this example is a bad one -- which is
true. The common idiom that is safe is

SEXP foo(SEXP bar) {
    ...
    return bar;
}

However, the last() example above is bad, because returning the element
directly is a bad idea -- the conservative approach would be to use
duplicate(), the more efficient one would be to bump up NAMED. Sorry, my
bad. I guess I was rather strengthening Paul's point to duplicate() when
in doubt even if it's less efficient :).

Cheers,
Simon


> There is no point in duplicating the element.
>
>
>
>> On p. 197, there's a similar comment "Any function that has been
>> invoked by either .External or .Call will have all of its arguments
>> protected already. You do not need to protect them. .... [T]hey were
>> not duplicated and should be treated as read-only values."
>>
>> "should be ... read-only" concerns me. They are "protected" in the
>> garbage collector sense,
>
> Yes
>
>
>> but they are not protected from "return by
>> reference" damage. Right?
>>
>
> There is no "return by reference damage".
>
> The only problem is if you modify input arguments while someone else holds a
> reference, but there is no way in C to prevent that while still allowing them
> to be useful. Note that it is legal to modify input arguments if there are no
> references to it.
>
> Cheers,
> Simon
>
>
>> Why doesn't the documentation recommend function writers to mark
>> arguments to C functions as const? Isn't that what the return by
>> value policy would suggest?
>>
>> Here's a troublesome example in R src/main/array.c:
>>
>> /* DropDims strips away redundant dimensioning information. */
>> /* If there is an appropriate dimnames attribute the correct */
>> /* element is extracted and attached to the vector as a names */
>> /* attribute. Note that this function mutates x. */
>> /* Duplication should occur before this is called.
>> */
>>
>> SEXP DropDims(SEXP x)
>> {
>>     SEXP dims, dimnames, newnames = R_NilValue;
>>     int i, n, ndims;
>>
>>     PROTECT(x);
>>     dims = getAttrib(x, R_DimSymbol);
>> [... SNIP ....]
>>     setAttrib(x, R_DimNamesSymbol, R_NilValue);
>>     setAttrib(x, R_DimSymbol, R_NilValue);
>>     setAttrib(x, R_NamesSymbol, newnames);
>> [... SNIP ....]
>>
>>     return x;
>> }
>>
>> Well, at least there's a warning comment with that one. But it does
>> show .Call allows call by reference.
>>
>> Why allow it, though? DropDims could copy x, modify the copy, and return it.
>>
>> I wondered why DropDims bothers to return x at all. We seem to be
>> using modify and return by reference there.
>>
>> I also wondered why x is PROTECTED, actually. It's an argument, wasn't
>> it automatically protected? Is it no longer protected after the
>> function starts modifying it?
>>
>> Here's an example with similar usage in Writing R Extensions, section
>> 5.10.1 "Calling .Call". It protects the arguments a and b (needed
>> ??), then changes them.
>>
>> #include <R.h>
>> #include <Rdefines.h>
>>
>> SEXP convolve2(SEXP a, SEXP b)
>> {
>>     R_len_t i, j, na, nb, nab;
>>     double *xa, *xb, *xab;
>>     SEXP ab;
>>
>>     /* PJ wonders, doesn't this alter "a" in calling code */
>>     PROTECT(a = AS_NUMERIC(a));
>>     PROTECT(b = AS_NUMERIC(b));
>>     na = LENGTH(a); nb = LENGTH(b); nab = na + nb - 1;
>>     PROTECT(ab = NEW_NUMERIC(nab));
>>     xa = NUMERIC_POINTER(a); xb = NUMERIC_POINTER(b);
>>     xab = NUMERIC_POINTER(ab);
>>     for(i = 0; i < nab; i++) xab[i] = 0.0;
>>     for(i = 0; i < na; i++)
>>         for(j = 0; j < nb; j++) xab[i + j] += xa[i] * xb[j];
>>     UNPROTECT(3);
>>     return(ab);
>> }
>>
>>
>> Doesn't
>>
>> PROTECT(a = AS_NUMERIC(a));
>>
>> alter the data structure "a" both inside the C function and
>> in the calling R code as well? And, if a was PROTECTED automatically,
>> could we do without that PROTECT()?
>>
>> pj
>>
>> --
>> Paul E. Johnson
>> Professor, Political Science Assoc.
>> Director
>> 1541 Lilac Lane, Room 504      Center for Research Methods
>> University of Kansas           University of Kansas
>> http://pj.freefaculty.org      http://quant.ku.edu
>>
>> ______________________________________________
>> R-devel@r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-devel
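
On the question of whether PROTECT(a = AS_NUMERIC(a)) alters "a" in the calling code: a SEXP is a pointer, and C passes pointers by value, so assigning to the parameter only rebinds the local copy. Writing through the pointer, as DropDims does with setAttrib(), is what the caller can see. The following is a minimal sketch in plain C with no R headers. Obj, as_numeric, rebind_only and mutate_in_place are made-up names for illustration, and as_numeric only assumes that a coercion returns its argument unchanged when the type already matches and a fresh object otherwise.

```c
#include <stdlib.h>

/* Hypothetical stand-in for an R object; NOT the real SEXPREC layout. */
typedef struct Obj {
    int type;      /* made-up tags: 0 = integer-like, 1 = numeric-like */
    double value;
} Obj;

/* Mimics the shape of a coercion such as AS_NUMERIC(a) (assumption):
   returns the same object if the type already matches, else a copy. */
static Obj *as_numeric(Obj *x)
{
    if (x->type == 1)
        return x;
    Obj *y = malloc(sizeof *y);
    if (y == NULL)
        abort();
    y->type = 1;
    y->value = x->value;
    return y;
}

/* Rebinds the local parameter only, like `a = AS_NUMERIC(a)`.
   The caller's pointer and the object it points to are untouched. */
static double rebind_only(Obj *a)
{
    Obj *orig = a;
    a = as_numeric(a);          /* changes the local copy of the pointer */
    double v = a->value * 2.0;
    if (a != orig)
        free(a);                /* release the temporary copy */
    return v;
}

/* Writes through the pointer, like setAttrib(x, ...) in DropDims.
   The caller sees the change because both pointers name one object. */
static Obj *mutate_in_place(Obj *x)
{
    x->type = 1;
    x->value += 1.0;
    return x;                   /* same address the caller passed in */
}
```

Calling rebind_only(p) leaves both p and *p as they were, while mutate_in_place(p) changes *p and hands back the same address. In the real API the fresh object from the coercion is the reason for the PROTECT in convolve2: the arguments are protected on entry, but a newly allocated result is not.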