Re: GSoC: want to take part in `Extend the static analysis pass for CPython Extension`
I do not have specific ideas on (c). I prefer to work on (b) if possible. The PEP 701 branch is under active development now. I review others' PRs and open some PRs myself. https://github.com/pablogsal/cpython/pull/54 https://github.com/pablogsal/cpython/pull/61 https://github.com/pablogsal/cpython/pull/63 I will submit a proposal on (b) as soon as possible. And by the way, I can get to work long before the start-coding timepoint of GSoC timeline. From: David Malcolm Sent: Monday, April 3, 2023 7:41 To: Sun Steven ; gcc@gcc.gnu.org Subject: Re: GSoC: want to take part in `Extend the static analysis pass for CPython Extension` On Sat, 2023-04-01 at 20:32 +, Sun Steven via Gcc wrote: > Hello, Hi! I just replied to your other email in the "[GSoC] Interest and initial proposal for project on reimplementing cpychecker as -fanalyzer plugin " thread. > > I want to take part in this project. > > b. Write a plugin to add checking for usage of the CPython API (e.g. > reference-counting); see > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107646 > > > I know the deadline is arriving, but this idea just came to me now. Indeed; the deadline for submitting proposals to the official GSoC website is April 4 - 18:00 UTC (i.e. this coming Tuesday); see: https://developers.google.com/open-source/gsoc/timeline Google are very strict about that deadline. > > Self-intro: > I am a fan of C++, and have expertise in writing low-latency codes. I > previously worked at a high-frequency trading company, mainly writing > C++ and Python on Linux. > > Familiarity with GCC: > I get an overall idea of how the compiler works. I have debugged > several GCC c++ frontend bugs. (eg. 108218, 99686, 99019,...) Thanks; I just took a look at those. > But I only checked the c++ frontend codes in detail, not the middle > or backend codes. I have the ability to work with large codebases. > > Familiarity with CPython: > I use a lot of CPython. 
Recently, I am contributing to the CPython > interpreter on PEP 701 (mainly on the parser, which I am familiar > with) > > > I have always been wanting to contribute major changes to GCC, but > just don't know if that project exists. I understand how middle-end > works, but never really interact with the GIMPLE. This project allows > me to take a real look at how GCC's middle end works. Given your knowledge of both C++ and of CPython internals, this project sounds like a good way for you to get involved. > > I want to know if anyone was already on this project. I would prefer > a large-sized object (350hrs). I see you've already posted to the thread Eric started. > > If b. was already taken, I also accept a. and c. I had to check the wiki page to see which ones (a) and (c) were; (a) is "Add format-string support to -fanalyzer." (c) is "Add a checker for some API or project of interest to the contributor (e.g. the Linux kernel, a POSIX API that we're not yet checking, or something else), either as a plugin, or as part of the analyzer core for e.g. POSIX." Do you have specific ideas for (c)? (a) would make a great project, in that it's reasonably self-contained. Eric's proposal for (b) plans to eventually tackle it, but there's a huge amount of potential work in (b) already. > By the way, I don't really care about the GSoC. If we miss the > deadline, we can still push forward this project without the support > of GSoC, as long as I get coached. I'm keen on helping new GCC contributors, with or without GSoC. A good next step is to build GCC from source, and try hacking in a new warning. See: https://gcc-newbies-guide.readthedocs.io/en/latest/ But remember that the GSoC deadline is April 4 - 18:00 UTC (i.e. this coming Tuesday), so if you're going to apply, you need to act fast. Good luck Dave
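For readers new to the reference-counting rules that project (b) is meant to check, the following is a minimal sketch in plain C++. It deliberately does not use the real CPython API: `Obj`, `Slot`, `incref`, `decref`, `store_stealing` and `get_borrowed` are hypothetical stand-ins that only model the general idea of "stolen" and "borrowed" references. The second function shows the kind of mistake a checker would be expected to report.

```cpp
#include <cassert>

// Toy stand-in for a reference-counted object; NOT the real PyObject.
struct Obj {
    int refcnt = 1;  // a freshly created object is owned by its creator
};

inline void incref(Obj &o) { ++o.refcnt; }

inline void decref(Obj &o) {
    assert(o.refcnt > 0);  // dropping a reference that does not exist is a bug
    --o.refcnt;
}

// A one-slot container that owns one reference to whatever it holds.
struct Slot {
    Obj *item = nullptr;
};

// "Steals" the caller's reference: ownership moves into the slot,
// so the caller must not decref afterwards.
inline void store_stealing(Slot &s, Obj &o) { s.item = &o; }

// Returns a "borrowed" reference: the count is unchanged, and the caller
// must not decref unless it first takes its own reference.
inline Obj *get_borrowed(const Slot &s) { return s.item; }

// Correct sequence: the slot's reference is the only one left at the end.
inline int correct_usage() {
    Obj o;                     // refcnt == 1, owned by us
    Slot s;
    store_stealing(s, o);      // the slot now owns that reference
    Obj *b = get_borrowed(s);  // borrowed: still 1
    incref(*b);                // take our own reference: 2
    decref(*b);                // release our own reference: 1
    return o.refcnt;
}

// Buggy sequence: decref on a reference we only borrowed.
inline int buggy_usage() {
    Obj o;
    Slot s;
    store_stealing(s, o);
    Obj *b = get_borrowed(s);
    decref(*b);                // wrong: we never owned this reference
    return o.refcnt;           // 0 while the slot still points at the object
}
```

In real extension code the second pattern tends to show up far from the point where the object is finally used, which is why a path-sensitive checker is attractive for it.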
Re: [GSoC] Interest and initial proposal for project on reimplementing cpychecker as -fanalyzer plugin
Thanks for bringing this to my attention Dave! I’m happy to collaborate on this project with Steven. I will reply in more detail in the other thread. Best, Eric On Sun, Apr 2, 2023 at 7:28 PM David Malcolm wrote: > > On Sat, 2023-04-01 at 19:49 -0400, Eric Feng wrote: > > > For the task above, I think it's almost all there, it's "just" a > > > case > > > of implementing the special-case knowledge about the CPython API, > > > mostly via known_function subclasses. > > > > Sounds good. > > > > > > > In cpychecker I added some custom function attributes: > > > > > > https://gcc-python-plugin.readthedocs.io/en/latest/cpychecker.html > > > which were: > > > __attribute__((cpychecker_returns_borrowed_ref)) > > > __attribute__((cpychecker_steals_reference_to_arg(n))) > > > > > [...] > > > > > > But exactly what these macros would look like would be a decision > > > for > > > the CPython community (hence do it via PEP, based on a sample > > > implementation). > > > > Ok, I see what you mean now. Thanks for clarifying! > > > > > > > Yeah, this sounds like a big project. Fortunately there are a lot > > > of > > > possible subtasks in this one, and the project has benefits to GCC > > > and > > > to CPython even if you only get a subset of the ideas done in the > > > time > > > available (refcount checking being probably the highest-value > > > subtask). > > > > Sounds good. > > > > I refactored the project description and timeline sections of the > > proposal according to our conversation. Notably, I moved format > > string > > checking to task #2 in the timeline since its subtasks are > > particularly beneficial. I also suggest in the timeline section to > > reach out to the CPython community via PEP about the specifics of new > > attributes in week 9/10 since I think we should have a somewhat > > mature > > prototype by that point. Let me know if you think it should be done > > earlier/later. 
Please find the changed sections below (I omitted > > unchanged sections for brevity) > > ___ > > > > Describe the project and clearly define its goals: > > One pertinent use case of the gcc-python-plugin was as a static > > analysis tool for CPython extension modules. The main goal of the > > plugin was to help programmers writing extensions identify common > > coding errors. The gcc-python-plugin has bitrotted over the years > > and, > > in particular, cpychecker stopped working some GCC releases ago. > > Broadly, the goal of this project is to port the functionalities of > > cpychecker to a -fanalyzer plugin. > > > > Below is a brief description of the functionalities of the static > > analysis tool that I will work on porting over to a -fanalyzer > > plugin. The structure of the objectives is based on the > > gcc-python-plugin documentation: > > > > Reference count checking: > > > > Format string checking: Some CPython APIs such as PyArg_ParseTuple, > > PyArg_ParseTupleAndKeywords, etc. take format strings as arguments. > > This check involves verifying that the format strings taken in by > > these APIs are correct with respect to the number and types of > > arguments passed in. In particular, I will work on integrating the > > analyzer with -Wformat > > (https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107017) and adding > > plugin support for -Wformat > > (https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100121). We should > > then > > be able to specify our own archetype which reflects the format string > > syntax for the relevant CPython APIs and take advantage of the > > integrated analyzer to check them. > > > > Associating PyTypeObject instances with compile-time-types: > > > from original proposal> > > > > Error-handling checking (including errors in exception handling): > > Common errors such as dereferencing a NULL value are already checked > > by the analyzer.
I will extend this functionality by implementing
> > special-case knowledge about the CPython API.
> >
> > Verification of PyMethodDef tables: > proposal>
> >
> > Provide an expected timeline:
> > Please find a rough estimate of the weekly progress in relation to the
> > features described below. Tasks that I expect to take longer than one
> > week are broken down in more detail. In addition to what’s described,
> > each task also involves adding test coverage pertaining to its specific
> > feature to a regression test suite.
> >
> > Week 1 - 7: Reference counting checking
> >   Week 1: Set up the overall infrastructure of the plugin and begin
> >     building core functionality
> >   Week 1 - 6: Core reference counting functionality
> >   Week 7: Refine prototype
> > Week 8 - 10.5: Format string checking (including associating
> >   PyTypeObject instances with compile-time-types)
> >   Week 8 - ~9: RFE: support printf-style formatted functions in -fanalyzer
> >   Week ~9 - 10.5: RFE: plugin support for -Wformat via
> >     __attribute__((format()))
> >   Additionally, begin conversing with CPython community via PEP
> >     about the exact
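To make the "format string checking" item in the proposal above more concrete, here is a small self-contained sketch of the argument-count half of that check. It is not how -Wformat or the analyzer implements it, and it handles only a simplified subset of the PyArg_ParseTuple format syntax, assumed here for illustration: the one-argument units i, l, d, f, s and O, the '|' marker for optional arguments, and ':' or ';' ending the list of units. The names `expected_args` and `call_matches` are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Counts how many output arguments a format string expects.
// Returns -1 for any character outside the simplified subset.
inline int expected_args(const std::string &fmt) {
    int count = 0;
    for (char c : fmt) {
        if (c == ':' || c == ';') break;  // function name / error message follows
        if (c == '|') continue;           // optional-arguments marker, no argument
        switch (c) {
            case 'i': case 'l': case 'd': case 'f': case 's': case 'O':
                ++count;                  // each of these units takes one argument
                break;
            default:
                return -1;                // unit not modelled in this sketch
        }
    }
    return count;
}

// True if a call passing n_passed output arguments matches the format.
inline bool call_matches(const std::string &fmt, std::size_t n_passed) {
    const int n = expected_args(fmt);
    return n >= 0 && static_cast<std::size_t>(n) == n_passed;
}

// Small self-check showing the intended behaviour.
inline void self_check() {
    assert(expected_args("is|O:func") == 3);
    assert(call_matches("id", 2));
    assert(!call_matches("id", 1));
}
```

A real checker would also have to compare the type of each argument against its format unit, which is where the association between PyTypeObject instances and compile-time types mentioned in the timeline comes in.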
Re: Re: GSoC: want to take part in `Extend the static analysis pass for CPython Extension`
Hi Steven,

I’m happy to collaborate on this project together; it would be great to have your experience with CPython internals on the team.

> And by the way, I can get to work long before the start-coding time point of
> GSoC timeline.

I can be involved in some capacity before the start-coding period as well (I originally planned to spend the time getting well acquainted so as to hit the ground running), but I would prefer that we leave the more involved tasks (e.g. reference count checking, format string checking) to the start-coding time point, as I can’t work full time until late May due to school commitments before then. Perhaps we can begin with the lower-hanging fruit, such as error-handling checking, errors in exception handling, and verification of PyMethodDef tables, in the time before the start-coding period? Starting with these smaller tasks might also make us more efficient when we tackle the more involved ones, and they would be easy to divide between us.

Best,
Eric
Re: Re: GSoC: want to take part in `Extend the static analysis pass for CPython Extension`
Hello,

On Mon, Apr 03 2023, Eric Feng via Gcc wrote:
> Hi Steven,
>
> I’m happy to collaborate on this project together; it would be great
> to have your experience with CPython internals on the team.
>

While I normally welcome collaboration, please note that GSoC rules and reasonable caution dictate that the two projects must not be dependent on one another. I.e. if one of you, for any reason, could not finish your bits in time, it must not have severe adverse effects on the other.

If mostly independent collaboration is possible (and David agrees), then sure, go ahead.

Thanks for understanding,

Martin
Re: GSoC Separate Host Process Offloading
Hi Tobias and Thomas - just wondering if you've had a chance to look at this? Thanks, Adi From: Prasad, Adi Sent: Saturday, April 1, 2023 5:16 am To: Tobias Burnus ; Thomas Schwinge Cc: gcc@gcc.gnu.org Subject: RE: GSoC Separate Host Process Offloading Hi Tobias and Thomas, My apologies for the double email; I have an unrelated administrative ask. Would it be possible to provide any past successful GSoC proposals? I'm interested in any things GCC specifically is looking for in proposals (I've seen quite a few generic guides on the web but none specific to GCC). Thanks, Adi > -----Original Message----- > From: Prasad, Adi > Sent: Saturday, April 1, 2023 4:16 AM > To: 'Tobias Burnus' ; Thomas Schwinge > > Cc: gcc@gcc.gnu.org > Subject: RE: GSoC Separate Host Process Offloading > > Hi Tobias, > Thanks for the reply! > > > > > Note that multiple offload targets are possible. For instance, on > > Debian/Ubuntu, 'gcc -v' shows: > > 'OFFLOAD_TARGET_NAMES=nvptx-none:amdgcn-amdhsa' and lto-wrapper then > > cycles through those, finding the offloading compiler in > > $PATH/accel//mkoffload > > > > Example: x86_64-none-linux-gnu/12.2.1/accel/amdgcn-amdhsa/mkoffload > > > > Thus, if you install it to 'x86_64-none-linux-gnu' and add it to > > OFFLOAD_TARGET_NAMES,* it will work; albeit, we probably want to have > > some special handling in gcc.cc to avoid host-process offloading by > > default and permit something like -foffload=host instead of having to > > specify -foffload=x86_64-none-linux-gnu > > > Understood. Forgive me if I'm misunderstanding this, but I wonder if it might be > better to put the new mkoffload in an "accel/host" directory, and add "host" to > OFFLOAD_TARGET_NAMES rather than have the specific host e.g. "x86_64-none-linux-gnu"?
This would 1) enable the use of "-foffload=host" automatically > and 2) > distinguish between compiling for the same device on a separate process versus > compiling to a separate device with the same architecture and kernel as the > host. > I can imagine this clash wouldn’t happen in practice, since compiling for a > separate host process would target CPUs while compiling for a separate device > would target GPUs, but it might be nicer to keep them conceptually separate > all > the same. > > > I think it would be useful to start posting patches early – such that > > they can be reviewed and discussed. Thus, this is not really the 4th > > and 5th item. > > > I can post patches every week instead since my proposal will set a milestone > target for each week. > Additionally, what do you think about me doing some other small tasks besides > the proposed scope? What I was thinking about specifically was that it might > be > helpful to get the offloading documentation page up to date and add info on > OpenACC. > > > No quick idea for work items – maybe I get one – or Thomas does :-) > > > > Tobias > > > Thank you so much for all the info, and do let me know if any small tasks come > up! > Adi
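The lookup Tobias describes (a colon-separated OFFLOAD_TARGET_NAMES list, with one mkoffload per target under an accel/ directory) can be sketched as below. This is only an illustration of the directory layout shown in the example path quoted above, not the actual lto-wrapper code; `split_targets`, `mkoffload_path` and `candidate_paths` are hypothetical names, and the "host" entry in the usage note is Adi's suggestion, not an existing target.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Splits a colon-separated list such as "nvptx-none:amdgcn-amdhsa".
// Empty entries (e.g. from a trailing ':') are skipped.
inline std::vector<std::string> split_targets(const std::string &names) {
    std::vector<std::string> out;
    std::string::size_type start = 0;
    while (start <= names.size()) {
        std::string::size_type end = names.find(':', start);
        if (end == std::string::npos) end = names.size();
        if (end > start) out.push_back(names.substr(start, end - start));
        start = end + 1;
    }
    return out;
}

// Builds the mkoffload path for one target below a compiler directory,
// following the layout "<compiler_dir>/accel/<target>/mkoffload".
inline std::string mkoffload_path(const std::string &compiler_dir,
                                  const std::string &target) {
    return compiler_dir + "/accel/" + target + "/mkoffload";
}

// One candidate path per configured offload target, in list order.
inline std::vector<std::string> candidate_paths(const std::string &compiler_dir,
                                                const std::string &names) {
    std::vector<std::string> paths;
    for (const std::string &t : split_targets(names))
        paths.push_back(mkoffload_path(compiler_dir, t));
    return paths;
}

// Small self-check using the values quoted in the thread.
inline void self_check() {
    const auto p = candidate_paths("x86_64-none-linux-gnu/12.2.1",
                                   "nvptx-none:amdgcn-amdhsa");
    assert(p.size() == 2);
    assert(p[1] == "x86_64-none-linux-gnu/12.2.1/accel/amdgcn-amdhsa/mkoffload");
}
```

Under Adi's proposal, adding "host" to the list would simply yield one more candidate ending in accel/host/mkoffload, which keeps host-process offloading distinct from a real device that happens to share the host's target triplet.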
Re: [GSoC][analyzer-c++] Submission of a draft proposal
Hi David, On Mon, Apr 3, 2023 at 12:38 AM David Malcolm wrote: > > To be fair, C ones can be as well; the analyzer's exploded graphs tend > to get very big on anything but the most trivial examples. > > > [...snip...] > > Indeed - you'll have to do a lot of looking at gimple IR dumps, what > the supergraph looks like, etc, for all of this. > > Yep. I have already tried my hands on them, but to be fair I'm still often troubled by them. Still, doing so have already provided essential insight on the analyzer. [...snip...] > > 4. Extension of out-of-bounds > > ( - Extending -Wout-of-bounds to the many vec<...> might be a > > requirement. > > However I have to look into more details for that one, I don't see > > yet how > > it could be done without a similar reuse of the assertions as for the > > libstdc++.) > > > > From what I saw, despite the bugs not being FIXED, vfuncs seem to be > > working nicely enough after the fix from GSoC 2021. > > IIRC I was keeping those bugs open because there's still a little room > for making the analyzer smarter about the C++ type system e.g. if we > "know" that a foo * is pointing at a particular subclass, maybe we know > things about what vfunc implementations could be called. > > We could even try an analysis mode where we split the analysis path at > a vfunc call, where we could create an out-edge in the egraph for each > known concrete subclass for foo *, so that we can consider all the > possible subclasses and the code that would be called. (I'm not sure > if this is a *good* idea, but it intrigues me) > Like adding a flag to run in a non-standard mode, to debug when an unexpected vfunc analysis occurs ? TBH I didn't look that much into vfuncs support, as my dummy tests behave OK and I assumed it was fixed after last GSoC. > > > > > Unfortunately I couldn't devote as much time as I wanted to gcc > > yesterday, > > I plan to send a proposal draft tomorrow evening. 
Sincerely sorry for the > > short time frame before the deadline.
>
> Sounds promising. Note that the deadline for submitting proposals to
> the official GSoC website is April 4 - 18:00 UTC (i.e. this coming
> Tuesday) and that Google are very strict about that deadline; see:
> https://developers.google.com/open-source/gsoc/timeline
>
> I believe you've already had a go at posting gcc patches to our mailing
> list: that's a great thing to mention in your application.

Thanks for the tip, I added it to my draft!

> Good luck!
> Dave

Thanks! BTW here is my draft proposal (in a google doc, I hope this is OK). If you can find the time to give me some feedback (as always), I would greatly appreciate it!
Below I will dump the "project goals" part, so that it's openly available on the mail list.
Note that I've annotated some sections with [RFC]; this is for your ease of use when reviewing the parts I'm explicitly asking for feedback on. Just do a Ctrl-F on the string [RFC].

A bit out of context, but since you always sign your mails 'Dave', should I address you that way? Unsure about that.

Best,
Benjamin. (see below for the dump)

The aim of this project is to enable the analyzer to analyze itself. To do so, the following items should be implemented (m: minor, M: Major feature)
> Generalize gcc.dg/analyzer tests to be run with both C and C++ [PR96395] [M]
> Support the options relative to checker sm-malloc
  -Wanalyzer-double-free should behave properly for the C++ allocation pairs new, new[], delete and delete[], in both throwing and non-throwing versions.
  At the moment, only their non-throwing counterparts are somewhat handled, yet incorrectly, as the expected -Wanalyzer-double-free is replaced by -Wanalyzer-use-after-free [m] and an incorrect -Wanalyzer-possible-null-dereference is emitted [fixed].
  I filed it as bug PR109365 [2].
> Add support for tracking unique_ptr null-dereference [M].
While smart_ptr is correctly handled, the code snippet below demonstrates that this warning is not emitted for unique_ptr [4].

Figure 1 - First test case for unique_ptr support

    #include <memory>

    struct A {int x; int y;};

    int main () {
      std::unique_ptr<A> a;
      a->x = 12; /* -Wanalyzer-null-dereference missing */
      return 0;
    }

> Improve the diagnostic paths for the standard library, with shared_ptr as a comparison point, so that they do not wander through the standard library code. [M]

Figure 2 - Reproducer to demonstrate unnecessarily long diagnostic paths when using the standard library.

    #include <memory>

    struct A {int x; int y;};

    int main () {
      std::shared_ptr<A> a;
      a->x = 4; /* Diagnostic path should stop here rather than going to shared_ptr_base.h */
      return 0;
    }

[RFC] I believe this could be a 350-hour project as time flies by quickly, but I’m more than open to your suggestions to support self-analysis. I’ve read your idea on splitting at vfunc points, and I’m currently looking into it. An additional goal I have considered is to add out-of-bounds support for the auto_vec. This would include supporting templates, but a shallow
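As background for the sm-malloc item in the goals above, the sketch below shows the general idea of a per-pointer state machine that distinguishes a double free from a use after free and from a new/delete[] mismatch. It is a toy model written for this note, not GCC's sm-malloc implementation: `ToyChecker`, its methods, the integer pointer ids and the warning strings are all hypothetical, with the strings only loosely mirroring the names of the analyzer warnings.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// State tracked for each (hypothetical) pointer id.
enum class State { LiveScalar, LiveArray, Freed };

class ToyChecker {
public:
    void on_new(int ptr)       { state_[ptr] = State::LiveScalar; }  // new
    void on_new_array(int ptr) { state_[ptr] = State::LiveArray; }   // new[]

    // array_form is true for delete[], false for plain delete.
    void on_delete(int ptr, bool array_form) {
        auto it = state_.find(ptr);
        if (it == state_.end()) return;  // unknown pointer: stay silent
        if (it->second == State::Freed) {
            warnings_.push_back("double-free");
            return;
        }
        const bool was_array = (it->second == State::LiveArray);
        if (was_array != array_form)
            warnings_.push_back("mismatching-deallocation");
        it->second = State::Freed;
    }

    // Any read or write through the pointer.
    void on_use(int ptr) {
        auto it = state_.find(ptr);
        if (it != state_.end() && it->second == State::Freed)
            warnings_.push_back("use-after-free");
    }

    const std::vector<std::string> &warnings() const { return warnings_; }

private:
    std::map<int, State> state_;
    std::vector<std::string> warnings_;
};

// A second delete of the same pointer is a double free, not a use after free.
inline void self_check() {
    ToyChecker c;
    c.on_new(1);
    c.on_delete(1, false);
    c.on_delete(1, false);
    assert(c.warnings().size() == 1);
    assert(c.warnings()[0] == "double-free");
}
```

The point of the model is the distinction reported in PR109365: deallocating an already freed pointer is its own transition, separate from merely touching freed memory.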
Re: [GSoC][analyzer-c++] Submission of a draft proposal
Following last mail, a classic I forgot to link my draft ! https://docs.google.com/document/d/1MaLDo-Rt8yrJIvC1MO8SmFc6fp4eRQM_JeSdv-1kbsc/edit?usp=sharing Best, Benjamin. On Mon, Apr 3, 2023 at 6:44 PM Benjamin Priour wrote: > [...snip...]
RE: GSoC Separate Host Process Offloading
Hello,

On Sat, Apr 01 2023, Prasad, Adi via Gcc wrote:
> Hi Tobias and Thomas,
>
> My apologies for the double email; I have an unrelated administrative
> ask. Would it be possible to provide any past successful GSoC
> proposals? I'm interested in any things GCC specifically is looking
> for in proposals (I've seen quite a few generic guides on the web but
> none specific to GCC).

Unfortunately no, not without seeking permission of their authors first.

But generally speaking, if you can demonstrate that you have the skills and ability to understand the offloading architecture, the current relevant sources (not necessarily in depth but more-or-less correctly) and that you have the C/C++ coding skills to be able to successfully change them, you are likely to be selected (depending on how many slots we get from Google, of course).

One way to demonstrate it is to provide good milestones in your proposal and a timeline which will demonstrate that you already have an idea what you would be working on in the first few weeks, beyond setting things up and learning stuff.

>
> Thanks,
> Adi
>
>> -----Original Message-----
>> From: Prasad, Adi
>> Sent: Saturday, April 1, 2023 4:16 AM
>> To: 'Tobias Burnus' ; Thomas Schwinge
>> Cc: gcc@gcc.gnu.org
>> Subject: RE: GSoC Separate Host Process Offloading
>>
>> Hi Tobias,
>> Thanks for the reply!
>>
>> >
>> > Note that multiple offload targets are possible.
For instance, on >> > Debian/Ubuntu, 'gcc -v' shows: >> > 'OFFLOAD_TARGET_NAMES=nvptx-none:amdgcn-amdhsa' and lto-wrapper >> then >> > cycles through those, finding the offloading compiler in >> > $PATH/accel//mkoffload >> > >> > Example: x86_64-none-linux-gnu/12.2.1/accel/amdgcn-amdhsa/mkoffload >> > >> > Thus, if you install it to 'x86_64-none-linux-gnu' and add it to >> > OFFLOAD_TARGET_NAMES,* it will work; albeit, we probably want to have >> > some special handling in gcc.cc to avoid host-process offloading by >> > default and permit something like -foffload=host instead of having to >> > specify -foffload=x86_64-none-linux-gnu >> > >> Understood. Forgive me if I'm misunderstanding this, but I wonder if it >> might be >> better to put the new mkoffload in an "accel/host" directory, and add "host" >> to >> OFFLOAD_TARGET_NAMES rather than have the specific host e.g. "x86_64-none- >> linux-gnu"? This would 1) enable the use of "-foffload=host" automatically >> and 2) >> distinguish between compiling for the same device on a separate process >> versus >> compiling to a separate device with the same architecture and kernel as the >> host. >> I can imagine this clash wouldn’t happen in practice, since compiling for a >> separate host process would target CPUs while compiling for a separate device >> would target GPUs, but it might be nicer to keep them conceptually separate >> all >> the same. These are details which can be tweaked later but yeah, some simplification will be necessary. >> >> > I think it would be useful to start posting patches early – such that >> > they can be reviewed and discussed. Thus, this is not really the 4th >> > and 5th item. >> > >> I can post patches every week instead since my proposal will set a milestone >> target for each week. >> Additionally, what do you think about me doing some other small tasks besides >> the proposed scope? 
What I was thinking about specifically was that it might >> be >> helpful to get the offloading documentation page up to date and add info on >> OpenACC. Updating documentation would be very welcome but not as part of a GSoC project, the rules forbid that. As far as small tasks are concerned, that is always very difficult in GCC and so we do not really expect all applicants to manage completing any. But it is important to demonstrate understanding of the relevant bits of GCC, for example by asking good questions on this list. Good luck, Martin
Re: [GSOC] few question about Bypass assembler when generating LTO object files
While going through the patch and simple-object.c I understood that the file simple-object.c is used to handle the object file format. However, this file does not contain all the architecture information required for LTO object files, so the workaround used in the patch is to read the crtbegin.o file and merge the missing attributes. While this workaround is functional, it is not optimal, and the ideal solution would be to extend simple-object.c to include the missing information.

Regarding the phrase "Support in the driver to properly execute *1 binary", it is not entirely clear what it refers to. My interpretation is that the compiler driver (the program that coordinates the compilation process) needs to be modified to correctly output LTO object files instead of assembler files (the current approach involves passing the -S and -o .o options) and also to skip the assembler when the -fbypass-asm option is used, but I am not sure. Can Jan or Martin please shed some light on this?

Thanks & Regards
Rishi Raj

On Sun, 2 Apr 2023 at 03:05, Rishi Raj wrote:
> Hi Everyone,
> I had already expressed my interest in the "Bypass assembler when
> generating LTO object files" project and am making a proposal for the same. I
> know I should have done it earlier but I was admitted to the hospital for
> the past few days :(.
> I have a few doubts.
> 1)
>
> "One problem is that the object files produced by libiberty/simple-object.c
> (which is the low-level API used by the LTO code)
> are missing some information (such as the architecture info and symbol
> table) and API of the simple object will need to be extended to handle
> that" I found this in the previous mailing list discussion. So who currently
> outputs this information in the object file? Is it the assembler?
>
> Also, in the current patch for this project by Jan Hubicka, where are we
> getting this information from? Is it from crtbegin.o?
>
> 2)
> "Support in driver to properly execute *1 binary."
I found this on Jan > original patch's email. what does it mean > > exactly? > > Regards > > Rishi Raj > > > >
RE: GSoC Separate Host Process Offloading
Hi Adi! I've not been able yet to review your items in detail, but it's very good that you're discussing your ideas! At least a few comments: On 2023-04-01T03:16:28+, "Prasad, Adi via Gcc" wrote: > Tobias wrote: >> [...] permit something like -foffload=host instead of having to >> specify -foffload=x86_64-none-linux-gnu Right -- but I'd be happy if initially the latter worked, and then a 'host' variant can be made work incrementally. > Understood. Forgive me if I'm misunderstanding this, but [...] No, these are certainly good ideas! :-) (I can't investigate the details right now, but surely will, once the time comes.) Please spend some time on this central question that I'd raised: | Make some thoughts (or actual experiments) about how we could | use/implement a separate host process for code offloading. That is, include in your project proposal (or, discuss here, if there's still time) your ideas about how to actually implement that. As Martin wrote, don't worry too much about the specific format of your application. It's more important that we're able to see that you're understanding the scope of the project, timeline, expected difficulties, and so on. All within reasonable bounds, of course -- we're all very well aware of the difficulties of estimating software projects... Yet, some plausible timeline, milestones, etc. are necessary in the project proposal. Good luck! Grüße Thomas - Siemens Electronic Design Automation GmbH; Anschrift: Arnulfstraße 201, 80634 München; Gesellschaft mit beschränkter Haftung; Geschäftsführer: Thomas Heurung, Frank Thürauf; Sitz der Gesellschaft: München; Registergericht München, HRB 106955
Re: [GSOC] few question about Bypass assembler when generating LTO object files
Hello,

> While going through the patch and simple-object.c I understood that the
> file simple-object.c is used to handle the object file format. However,
> this file does not contain all the architecture information required for
> LTO object files, so the workaround used in the patch is to read the
> crtbegin.o file and merge the missing attributes. While this workaround is
> functional, it is not optimal, and the ideal solution would be to extend
> simple-object.c to include the missing information.

Yes, simple-object.c simply uses architecture settings it read earlier, which is a problem since at compile time we do not read any object files (we just parse sources). In my original patch the architecture flags were simply left blank. I am not sure if there is a version reading crtbegin.o, which would probably not be that bad a workaround, at least for a start. Having a way to specify this from the machine descriptions would be better.

Besides the architecture bits, for simple-object files to work we need to add the symbol table. For practically useful information we also need to stream the debug info.

>
> Regarding the phrase "Support in the driver to properly execute *1 binary",
> it is not entirely clear what it refers to. My interpretation is that the
> compiler driver (the program that coordinates the compilation process)
> needs to be modified to correctly output LTO object files instead of
> assembler files (the current approach involves passing the -S and -o
> .o options) and also skip the assembler option while using
> -fbypass-asm option but I am not sure. Can Jan or Martin please shed some
> light on this?

Yes, the compiler driver decides what to do, and it needs to know that with -flto it does not need to produce an assembly file and then invoke gas. If we go the way of reading in crtbegin.o, it will also need to pass the correct crtbegin to the *1 binary.
This is generally not that hard to do, just needs to be done :) Honza > > Thanks & Regards > > Rishi Raj > > On Sun, 2 Apr 2023 at 03:05, Rishi Raj wrote: > > > Hii Everyone, > > I had already expressed my interest in the " Bypass assembler when > > generating LTO object files" project and making a proposal for the same. I > > know I should have done it earlier but I was admitted to the hospital for > > past few days :(. > > I have a few doubts. > > 1) > > > > "One problem is that the object files produced by libiberty/simple-object.c > > (which is the low-level API used by the LTO code) > > are missing some information (such as the architecture info and symbol > > table) and API of the simple object will need to be extended to handle > > that" I found this in the previous mailing list discussion. So who output > > this information currently in the object file, is it assembler? > > > > Also in the current patch for this project by Jan Hubica, from where are we > > getting these information from? Is it from crtbegin.o? > > > > 2) > > "Support in driver to properly execute *1 binary." I found this on Jan > > original patch's email. what does it mean > > > > exactly? > > > > Regards > > > > Rishi Raj > > > > > > > >
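To make the "architecture bits" concrete: on ELF targets they live in the file header, in the `e_machine` and `e_flags` fields that the assembler normally fills in. The following is a minimal sketch, assuming a little-endian ELF64 relocatable object for x86-64. It does not use libiberty, and the helper names `build_elf64_header` and `read_machine` are made up for illustration.

```python
import struct

# Constants from the ELF specification.
ELFCLASS64 = 2    # 64-bit object
ELFDATA2LSB = 1   # little-endian
EV_CURRENT = 1
ET_REL = 1        # relocatable object file
EM_X86_64 = 62    # architecture assumed for this sketch

# Layout of the Elf64_Ehdr structure: e_ident, e_type, e_machine, e_version,
# e_entry, e_phoff, e_shoff, e_flags, e_ehsize, e_phentsize, e_phnum,
# e_shentsize, e_shnum, e_shstrndx.
ELF64_HEADER = struct.Struct("<16sHHIQQQIHHHHHH")


def build_elf64_header(machine=EM_X86_64, flags=0):
    """Return the 64-byte header of an empty ELF64 relocatable object."""
    ident = bytes([0x7F]) + b"ELF" + bytes([ELFCLASS64, ELFDATA2LSB, EV_CURRENT])
    ident = ident.ljust(16, b"\0")
    return ELF64_HEADER.pack(
        ident,
        ET_REL,             # e_type
        machine,            # e_machine: the architecture information
        EV_CURRENT,         # e_version
        0,                  # e_entry
        0,                  # e_phoff
        0,                  # e_shoff (no section headers in this sketch)
        flags,              # e_flags: target-specific architecture flags
        ELF64_HEADER.size,  # e_ehsize
        0,                  # e_phentsize
        0,                  # e_phnum
        64,                 # e_shentsize
        0,                  # e_shnum
        0,                  # e_shstrndx
    )


def read_machine(header):
    """Extract e_machine, as a tool copying it from crtbegin.o might do."""
    return ELF64_HEADER.unpack(header[:ELF64_HEADER.size])[2]
```

Leaving the flags blank, as in the original patch, corresponds to `flags=0` here. Reading them from crtbegin.o corresponds to unpacking the same fields from an existing object and reusing them.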
Re: [GSOC] few question about Bypass assembler when generating LTO object files
Thanks, Jan, for the reply! I have completed a draft proposal for this project. I would appreciate feedback on it from you, Martin, or anybody else. Here is the link to my proposal:
https://docs.google.com/document/d/1r9kzsU96kOYfIhWZx62jx4ALG-J_aJs5U0sDpwFUtts/edit?usp=sharing

On Tue, 4 Apr 2023 at 04:35, Jan Hubicka wrote:
> [...snip...]
Fwd: [GSOC] few question about Bypass assembler when generating LTO object files
---------- Forwarded message ---------
From: Rishi Raj
Date: Tue, 4 Apr 2023 at 05:57
Subject: Re: [GSOC] Submission of draft proposal.
To: Jan Hubicka

Oops, I forgot to change the subject in the previous email :(

[...snip...]
[GSOC] Submission of draft proposal for Bypass assembler when generating LTO object files
Sorry, I messed up the subject in my previous two emails :( so I am sending it again. I have completed a draft proposal for this project. I would appreciate feedback on it from Jan, Martin, or anybody else. Here is the link to my proposal:
https://docs.google.com/document/d/1r9kzsU96kOYfIhWZx62jx4ALG-J_aJs5U0sDpwFUtts/edit?usp=sharing
Re: [GSoC][analyzer-c++] Submission of a draft proposal
On Mon, 2023-04-03 at 18:46 +0200, Benjamin Priour wrote:
> Following last mail, a classic I forgot to link my draft !
> https://docs.google.com/document/d/1MaLDo-Rt8yrJIvC1MO8SmFc6fp4eRQM_JeSdv-1kbsc/edit?usp=sharing

Some notes:

* The document still has some notes in italics marked "[RFC]" which you'll want to fix before formally submitting it.

* "Project Goals": item 4: you give a reproducer; perhaps add a link to godbolt.org (Compiler Explorer) demonstrating the overlong diagnostic path?

* Part 1: as part of moving the test cases to c-c++-common you'll probably have to debug/write a little .exp code (in Tcl) so that it actually runs the tests, probably in analyzer.exp. So you might want to allow some time to read up on Tcl, which is the language our testsuite is written in (I wish it was in Python, but fixing that would be a different project, alas).

* Part 2: your grep for unique_ptr found 2903 uses, but I guess many of these are in libstdc++-v3. As I understand it, this is compiled for the target (as a library for use by the compiler user), whereas I'm much more interested in the code below "gcc", which is compiled for the host into the compiler itself. You might want to split out these numbers.

One task is to try adding -fanalyzer to the build flags for the compiler itself, and see what happens: is it usable? Is it unusably slow on some of our source files? Does it find true problems? Does it report false positives? The current document suggests doing this in part 3 as the last 20% of the project; I think it makes more sense to do the initial attempt at this much earlier, to get an earlier idea of what the problems might be.

"Motivation and Skill set": the first paragraph is poorly worded; for example the 2nd sentence seems to just stop halfway through.

"Mentor": yes, I would be the mentor.

Other than that, looks good.
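One way to split out the unique_ptr numbers from the Part 2 note is to count per top-level directory of the source tree. The following is a rough sketch, not an established GCC script. The function name `count_uses_by_top_dir` is made up, and it counts occurrences rather than matching lines, so its totals may differ slightly from a plain grep.

```python
import os
from collections import Counter


def count_uses_by_top_dir(root, needle="unique_ptr"):
    """Count occurrences of `needle` per top-level directory under `root`."""
    counts = Counter()
    for dirpath, _dirnames, filenames in os.walk(root):
        rel = os.path.relpath(dirpath, root)
        top = rel.split(os.sep)[0]
        if top == ".":
            continue  # skip files sitting directly in the root
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                with open(path, encoding="utf-8", errors="replace") as f:
                    counts[top] += f.read().count(needle)
            except OSError:
                continue  # unreadable file: ignore it
    return counts
```

Run on a checkout, for example `count_uses_by_top_dir("/path/to/gcc-checkout")` with a path of your own, this gives separate figures for "gcc" and "libstdc++-v3", so the host-side uses can be reported on their own.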
The deadline for formally submitting this to the GSoC website is April 4th at 18:00 UTC (less than 24 hours from now), and Google are strict about this deadline.

Good luck!
Dave

>
> Best,
> Benjamin.
>
> On Mon, Apr 3, 2023 at 6:44 PM Benjamin Priour wrote:
>
> > Hi David,
> >
> > On Mon, Apr 3, 2023 at 12:38 AM David Malcolm wrote:
> >
> > >
> > > To be fair, C ones can be as well; the analyzer's exploded graphs tend
> > > to get very big on anything but the most trivial examples.
> > >
> >
> > [...snip...]
> >
> > >
> > > Indeed - you'll have to do a lot of looking at gimple IR dumps, what
> > > the supergraph looks like, etc, for all of this.
> > >
> >
> > Yep. I have already tried my hand at them, but to be fair I'm still often
> > troubled by them. Still, doing so has already provided essential insight
> > into the analyzer.
> >
> > [...snip...]
> >
> > > > 4. Extension of out-of-bounds
> > > > ( - Extending -Wout-of-bounds to the many vec<...> might be a
> > > > requirement.
> > > > However I have to look into more details for that one, I don't see
> > > > yet how it could be done without a similar reuse of the assertions
> > > > as for the libstdc++.)
> > > >
> > > > From what I saw, despite the bugs not being FIXED, vfuncs seem to be
> > > > working nicely enough after the fix from GSoC 2021.
> > >
> > > IIRC I was keeping those bugs open because there's still a little room
> > > for making the analyzer smarter about the C++ type system e.g. if we
> > > "know" that a foo * is pointing at a particular subclass, maybe we know
> > > things about what vfunc implementations could be called.
> > >
> > > We could even try an analysis mode where we split the analysis path at
> > > a vfunc call, where we could create an out-edge in the egraph for each
> > > known concrete subclass for foo *, so that we can consider all the
> > > possible subclasses and the code that would be called. (I'm not sure
> > > if this is a *good* idea, but it intrigues me)
> > >
> >
> > Like adding a flag to run in a non-standard mode, to debug when an
> > unexpected vfunc analysis occurs ? TBH I didn't look that much into vfuncs
> > support, as my dummy tests behave OK and I assumed it was fixed after last
> > GSoC.
> >
> > > > Unfortunately I couldn't devote as much time as I wanted to gcc
> > > > yesterday, I plan to send a proposal draft tomorrow evening. Sincerely
> > > > sorry for the short time frame before the deadline.
> > >
> > > Sounds promising. Note that the deadline for submitting proposals to
> > > the official GSoC website is April 4 - 18:00 UTC (i.e. this coming
> > > Tuesday) and that Google are very strict about that deadline; see:
> > > https://developers.google.com/open-source/gsoc/timeline
> > >
> > > I believe you've already had a go at posting gcc patches to our mailing
> > > list: that's a great thing to mention in your application.
> > >
> > Thank