https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98748
Bug ID: 98748 Summary: Increased precision for points-to analysis Product: gcc Version: unknown Status: UNCONFIRMED Severity: normal Priority: P3 Component: ipa Assignee: unassigned at gcc dot gnu.org Reporter: bugzilla at bash9 dot xyz CC: marxin at gcc dot gnu.org Target Milestone: --- Hi, this ticket is to discuss the possibility of extending the points-to analysis in GCC. The points-to analysis in GCC is currently field-sensitive, context-insensitive and flow-insensitive. However the current GCC implementation suffers from some limitations: 1) field-sensitivity is unavailable for heap variables (i.e. those that are allocated via malloc) 2) field-sensitivity is similarly unavailable for structs allocated in dynamic arrays (again via malloc). I understand that these two limitations stem from the fact that no type information is available at the allocation site. I also believe that the current IPA-PTA may not take advantage of type information (however, I am not 100% sure of this at the moment). I've been reading this paper [0] which has a Datalog implementation for a points-to analysis that computes the points-to solution with field-sensitivity for heap variables and field-sensitivity for dynamic arrays. The implementation is also available on GitHub [1]. At the moment it analyzes LLVM-IR. This added sensitivity should be enough to unlock data-layout transformations at the granularity of alias-sets (i.e., it will change the layout of all variables pointing to the same object) as opposed to a coarser type granularity (i.e., to change the layout of all variables with a given type). For example: dropping unused fields from structs [2] and transforming a "struct of arrays to array of structs & vice versa". If one went and performed the most direct translation of the available analysis in GCC, one would need to iterate over all gimple code and generate Datalog facts that would then be fed to the solver. I believe that XSB is an implementation of Datalog that could be called from a GCC-plugin to compute the points-to analysis. However, it would be preferable to have the points-to analysis be performed directly in GCC. Please let me know if you have thoughts about this. I am happy to have a dialogue so that we can all come to a maintainable and reliable solution. Thanks! [0] Balatsouras, George, and Yannis Smaragdakis. "Structure-sensitive points-to analysis for C and C++." International Static Analysis Symposium. Springer, Berlin, Heidelberg, 2016. (available at the author's website) [1] https://github.com/plast-lab/cclyzer [2] https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92801