On Dec 10, 2012, at 1:51 AM, Paul Johnson wrote:

> I'm continuing my work on finding speedups in generalized inverse
> calculations in some simulations. It leads me back to .C and .Call,
> and some questions I've never been able to answer for myself. It may
> be I can push some calculations to LAPACK in or C BLAS, that's why I
> realized again I don't understand the call by reference or value
> semantics of .Call
>
> Why aren't users of .Call encouraged to "const" their arguments, and
> why doesn't .Call do this for them (if we really believe in return by
> value)?
>

Because there is a difference between the *data* part of the SEXP and the object itself. The internal structure of the object may need to be modified (e.g. the NAMED ref counting increased when you assign it) in a call to the R API. You can't flag the data part as const separately, so you have to use a non-const SEXP.

> R Gentleman's R Programming for Bioinformatics is the most
> understandable treatment I've found on .Call. It appears to me .Call
> leaves "wiggle room" where there should be none. Here's Gentleman on
> p. 185. "For .Call and .External, the return value is an R object (the
> C functions must return a SEXP), and for these functions the values
> that were passed are typically not modified. If they must be
> modified, then making a copy in R, prior to invoking the C code, is
> necessary."
>
> I *think* that means:
>
> .Call allows return by reference, BUT we really wish users would not
> use it. Users can damage R data structures that are pointed to unless
> they really truly know what they are doing on the C side. ??
>
> This seems dangerous. Why allow return by reference at all?
>

Because it is completely legal to do things like

    SEXP last(SEXP bar) {
        /* return the last element of a list as-is, without copying it */
        if (TYPEOF(bar) == VECSXP && LENGTH(bar) > 0)
            return VECTOR_ELT(bar, LENGTH(bar) - 1);
        Rf_error("sorry, I only work on lists");
    }

There is no point in duplicating the element.

> On p. 197, there's a similar comment "Any function that has been
> invoked by either .External or .Call will have all of its arguments
> protected already. You do not need to protect them. .... [T]hey were
> not duplicated and should be treated as read-only values."
>
> "should be ... read-only" concerns me. They are "protected" in the
> garbage collector sense,

Yes

> but they are not protected from "return by
> reference" damage. Right?
>

There is no "return by reference damage".
The only problem is if you modify input arguments while someone else holds a reference to them, but there is no way in C to prevent that while still allowing them to be useful. Note that it is legal to modify input arguments if there are no other references to them.

Cheers,
Simon

> Why doesn't the documentation recommend function writers to mark
> arguments to C functions as const? Isn't that what the return by
> value policy would suggest?
>
> Here's a troublesome example in R src/main/array.c:
>
> /* DropDims strips away redundant dimensioning information. */
> /* If there is an appropriate dimnames attribute the correct */
> /* element is extracted and attached to the vector as a names */
> /* attribute. Note that this function mutates x. */
> /* Duplication should occur before this is called. */
>
> SEXP DropDims(SEXP x)
> {
>     SEXP dims, dimnames, newnames = R_NilValue;
>     int i, n, ndims;
>
>     PROTECT(x);
>     dims = getAttrib(x, R_DimSymbol);
>     [... SNIP ....]
>     setAttrib(x, R_DimNamesSymbol, R_NilValue);
>     setAttrib(x, R_DimSymbol, R_NilValue);
>     setAttrib(x, R_NamesSymbol, newnames);
>     [... SNIP ....]
>
>     return x;
> }
>
> Well, at least there's a warning comment with that one. But it does
> show .Call allows call by reference.
>
> Why allow it, though? DropDims could copy x, modify the copy, and return it.
>
> I wondered why DropDims bothers to return x at all. We seem to be
> using modify and return by reference there.
>
> I also wondered why x is PROTECTED, actually. It's an argument, wasn't
> it automatically protected? Is it no longer protected after the
> function starts modifying it?
>
> Here's an example with similar usage in Writing R Extensions, section
> 5.10.1 "Calling .Call". It protects the arguments a and b (needed
> ??), then changes them.
>
> #include <R.h>
> #include <Rdefines.h>
>
> SEXP convolve2(SEXP a, SEXP b)
> {
>     R_len_t i, j, na, nb, nab;
>     double *xa, *xb, *xab;
>     SEXP ab;
>
>     PROTECT(a = AS_NUMERIC(a)); /* PJ wonders, doesn't this alter
>                                    "a" in calling code */
>     PROTECT(b = AS_NUMERIC(b));
>     na = LENGTH(a); nb = LENGTH(b); nab = na + nb - 1;
>     PROTECT(ab = NEW_NUMERIC(nab));
>     xa = NUMERIC_POINTER(a); xb = NUMERIC_POINTER(b);
>     xab = NUMERIC_POINTER(ab);
>     for(i = 0; i < nab; i++) xab[i] = 0.0;
>     for(i = 0; i < na; i++)
>         for(j = 0; j < nb; j++) xab[i + j] += xa[i] * xb[j];
>     UNPROTECT(3);
>     return(ab);
> }
>
>
> Doesn't
>
> PROTECT(a = AS_NUMERIC(a));
>
> alter the data structure "a" both inside the C function and
> in the calling R code as well? And, if a was PROTECTED automatically,
> could we do without that PROTECT()?
>
> pj
>
> --
> Paul E. Johnson
> Professor, Political Science    Assoc. Director
> 1541 Lilac Lane, Room 504       Center for Research Methods
> University of Kansas            University of Kansas
> http://pj.freefaculty.org       http://quant.ku.edu
>
> ______________________________________________
> R-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel
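
The two points in the reply above (const on a SEXP-style argument does not protect the data, and modifying an argument is only a problem when someone else holds a reference) can be illustrated with a toy model in plain C. This is only a sketch: toy_vec, TOY_SEXP, toy_alloc, toy_duplicate, scale_in_place and toy_scale are hypothetical names invented for illustration, and the struct is not R's real object layout. The "named" field is a simplified stand-in for the NAMED marker mentioned above.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Toy stand-in for an R vector. NOT R's real SEXPREC layout. */
typedef struct toy_vec {
    int named;       /* simplified reference marker (assumption) */
    size_t length;
    double *data;
} toy_vec;

typedef toy_vec *TOY_SEXP;   /* a pointer typedef, as SEXP is */

/* Allocate a zero-filled toy vector; returns NULL on failure. */
TOY_SEXP toy_alloc(size_t n)
{
    TOY_SEXP x = malloc(sizeof *x);
    if (x == NULL) return NULL;
    x->data = calloc(n ? n : 1, sizeof *x->data);
    if (x->data == NULL) { free(x); return NULL; }
    x->named = 0;
    x->length = n;
    return x;
}

/* Deep copy; the copy starts with no other references (named == 0). */
TOY_SEXP toy_duplicate(const toy_vec *x)
{
    TOY_SEXP copy = toy_alloc(x->length);
    if (copy == NULL) return NULL;
    memcpy(copy->data, x->data, x->length * sizeof *x->data);
    return copy;
}

/* "const TOY_SEXP" makes the POINTER const, not the object it points
   to, so this compiles and still mutates the caller's data. */
void scale_in_place(const TOY_SEXP x, double k)
{
    assert(x != NULL);
    for (size_t i = 0; i < x->length; i++)
        x->data[i] *= k;
}

/* Modify in place only when nobody else refers to x; otherwise work
   on a duplicate and leave the caller's object untouched. */
TOY_SEXP toy_scale(TOY_SEXP x, double k)
{
    TOY_SEXP out = (x->named > 0) ? toy_duplicate(x) : x;
    if (out == NULL) return NULL;
    scale_in_place(out, k);
    return out;
}
```

In real .Call code the analogous guard is usually a check such as MAYBE_SHARED(x) (or NAMED(x) in R versions from the time of this thread) followed by duplicate(x); Writing R Extensions has the exact rules, which this toy does not try to reproduce.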