https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65752
Jeehoon Kang <jeehoon.kang at sf dot snu.ac.kr> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |jeehoon.kang at sf dot snu.ac.kr

--- Comment #43 from Jeehoon Kang <jeehoon.kang at sf dot snu.ac.kr> ---
(In reply to Alexander Cherepanov from comment #38)
> The evident solution is to not apply this optimization when provenance info
> of the two variables differs. I guess for most integers it will be the same.

IMO tracking provenance info is not a good idea, since it is really
complicated.

First, since integers and pointers can be cast to each other, not only
pointers but also integers would have to carry provenance information.

Second, tracking provenance info may work for simple examples, but it is hard
even to define the provenance info itself for complex expressions. For
example, what is the provenance of "e1-e2"? "2*e"? "e1 XOR e2"? "e1 * e2"?
(even given the provenance info for integer expressions "e*")

I would rather mark pointers cast to integers as *escaped* and forget about
provenance altogether. Here are several reasons why this works well:

- Standard optimizations are supported. Say we want to support the following
  constant propagation example:

    char f() {
      char a = '0';
      g(); // unknown function; may guess the address of "a" and
           // try to access it (but it is always unsuccessful)
      return a; // -> return '0'
    }

  Since the address of "a" is not cast to an integer, "a" is private to the
  function "f" (i.e., not escaped from "f"), and "g" cannot access "a". So we
  know "a == '0'" at the return.

- The semantics is simple. There is no need to track provenance info for
  variables. Once a pointer is cast to an integer, it is just an integer
  without any tracked information. As a result, the standard integer
  optimizations of interest, such as the following, are fully supported:

    if (x != y) x = y;   ->   x = y;

- Performance degradation due to "cast pointers as escaped" is insignificant.
  Morally, if a pointer is cast to an integer, the address is regarded as
  "global": having the integer value of the pointer means you can access the
  pointer. So there will not be much optimization opportunity (or intent) for
  pointers cast to integers. Of course, this argument should be validated by
  experiment; yet I am quite convinced that the performance degradation is
  insignificant.

I would like to ask what you think of this suggestion.

Note that my argument here is based on my paper on this issue, where you can
find the formal memory model we proposed, proofs that the optimization
examples are correct, and a reasoning principle for proving optimizations (see
the paper and the slides):

  A Formal C Memory Model Supporting Integer-Pointer Casts.
  Jeehoon Kang, Chung-Kil Hur, William Mansky, Dmitri Garbuzov,
  Steve Zdancewic, Viktor Vafeiadis.
  PLDI 2015.
  http://sf.snu.ac.kr/intptrcast/
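
To make the "private" versus "escaped" distinction above concrete, here is a
minimal sketch in C. It is only an illustration of the proposal, not code from
the paper or from GCC. The names leaked_addr, f_private and f_escaped are
hypothetical, and the global integer simply stands in for "anyone may hold
this integer value".

```c
#include <stdint.h>

/* Hypothetical global: models an address that became "global" knowledge
   once it was cast to an integer. Zero means nothing has been published. */
uintptr_t leaked_addr = 0;

/* Plays the role of the unknown function g() from the comment. */
void g(void) {
    if (leaked_addr != 0) {
        /* The object is reachable here only because its address was
           cast to an integer and published beforehand. */
        char *p = (char *)(void *)leaked_addr;
        *p = '1';
    }
}

/* The address of "a" is never cast to an integer, so "a" stays private
   to this function; under the proposal, folding the return value to '0'
   is a valid optimization. */
char f_private(void) {
    char a = '0';
    g();
    return a;
}

/* The address of "a" is cast to an integer, so "a" counts as escaped;
   g() may legitimately modify it and the value must be reloaded. */
char f_escaped(void) {
    char a = '0';
    leaked_addr = (uintptr_t)(void *)&a;
    g();
    leaked_addr = 0; /* do not leave a dangling address behind */
    return a;
}
```

Calling f_private() returns '0', while f_escaped() returns '1', because only
in the second function has the address of "a" been turned into an integer
that g() can use.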