Re: GSoC: want to take part in `Extend the static analysis pass for CPython Extension`

2023-04-03 Thread Steven Sun via Gcc
I do not have specific ideas on (c). I prefer to work on (b) if possible.

The PEP 701 branch is under active development now. I review others' PRs
and open some PRs myself.

https://github.com/pablogsal/cpython/pull/54
https://github.com/pablogsal/cpython/pull/61
https://github.com/pablogsal/cpython/pull/63


I will submit a proposal on (b) as soon as possible. By the way, I can start
work well before the coding period in the GSoC timeline begins.


From: David Malcolm 
Sent: Monday, April 3, 2023 7:41
To: Sun Steven ; gcc@gcc.gnu.org 
Subject: Re: GSoC: want to take part in `Extend the static analysis pass for 
CPython Extension`

On Sat, 2023-04-01 at 20:32 +, Sun Steven via Gcc wrote:
> Hello,

Hi!

I just replied to your other email in the "[GSoC] Interest and initial
proposal for project on reimplementing cpychecker as -fanalyzer plugin"
thread.

>
> I want to take part in this project.
>
> b. Write a plugin to add checking for usage of the CPython API (e.g.
> reference-counting); see
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107646
>
>
> I know the deadline is arriving, but this idea just came to me now.

Indeed; the deadline for submitting proposals to the official GSoC
website is April 4 - 18:00 UTC (i.e. this coming Tuesday); see:
https://developers.google.com/open-source/gsoc/timeline

Google are very strict about that deadline.

>
> Self-intro:
> I am a fan of C++, and have expertise in writing low-latency code. I
> previously worked at a high-frequency trading company, mainly writing
> C++ and Python on Linux.
>
> Familiarity with GCC:
> I have an overall idea of how the compiler works. I have debugged
> several GCC C++ frontend bugs (e.g. 108218, 99686, 99019, ...).

Thanks; I just took a look at those.


> But I have only examined the C++ frontend code in detail, not the middle
> end or back end. I am able to work with large codebases.
>
> Familiarity with CPython:
> I use CPython a lot. Recently, I have been contributing to the CPython
> interpreter on PEP 701 (mainly on the parser, which I am familiar
> with).
>
>
> I have always wanted to contribute major changes to GCC, but
> just didn't know whether such a project existed. I understand how the
> middle end works, but have never really worked with GIMPLE. This project
> would let me take a real look at how GCC's middle end works.

Given your knowledge of both C++ and of CPython internals, this project
sounds like a good way for you to get involved.

>
> I want to know if anyone is already working on this project. I would
> prefer a large-sized project (350hrs).

I see you've already posted to the thread Eric started.

>
> If b. is already taken, I would also accept a. or c.

I had to check the wiki page to see which ones (a) and (c) were;

(a) is "Add format-string support to -fanalyzer."

(c) is "Add a checker for some API or project of interest to the
contributor (e.g. the Linux kernel, a POSIX API that we're not yet
checking, or something else), either as a plugin, or as part of the
analyzer core for e.g. POSIX."

Do you have specific ideas for (c)?

(a) would make a great project, in that it's reasonably self-contained.
Eric's proposal for (b) plans to eventually tackle it, but there's a
huge amount of potential work in (b) already.
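
To make (a) a little more concrete, the kind of mismatch a format-string
check looks for can be modelled in a few lines. This is only an
illustrative sketch and not analyzer code: count_conversions and
args_match_format are hypothetical helpers, and the model is deliberately
simplified ("%%" is a literal percent sign, and every other '%' is assumed
to consume exactly one argument).

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper (not part of GCC): count the conversion
// specifications in a printf-style format string.  Simplified model:
// "%%" is a literal percent sign, and every other '%' is assumed to
// start one conversion consuming one argument ('*' widths and
// positional arguments are not modelled).
std::size_t count_conversions (const std::string &fmt)
{
  std::size_t n = 0;
  for (std::size_t i = 0; i < fmt.size (); ++i)
    {
      if (fmt[i] != '%')
        continue;
      if (i + 1 < fmt.size () && fmt[i + 1] == '%')
        {
          ++i;  // skip the escaped percent sign
          continue;
        }
      ++n;
    }
  return n;
}

// True if NUM_ARGS arguments would satisfy FMT under the model above.
bool args_match_format (const std::string &fmt, std::size_t num_args)
{
  return count_conversions (fmt) == num_args;
}
```

A real check would also have to compare argument types, which is where
most of the work in -Wformat lies.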

> By the way, I don't really care about the GSoC. If we miss the
> deadline, we can still push forward this project without the support
> of GSoC, as long as I get coached.

I'm keen on helping new GCC contributors, with or without GSoC.  A good
next step is to build GCC from source, and try hacking in a new
warning.  See:
  https://gcc-newbies-guide.readthedocs.io/en/latest/

But remember that the GSoC deadline is April 4 - 18:00 UTC (i.e. this
coming Tuesday), so if you're going to apply, you need to act fast.

Good luck
Dave



Re: [GSoC] Interest and initial proposal for project on reimplementing cpychecker as -fanalyzer plugin

2023-04-03 Thread Eric Feng via Gcc
Thanks for bringing this to my attention Dave! I’m happy to
collaborate on this project with Steven. I will reply in more detail
in the other thread.

Best,
Eric


On Sun, Apr 2, 2023 at 7:28 PM David Malcolm  wrote:
>
> On Sat, 2023-04-01 at 19:49 -0400, Eric Feng wrote:
> > > For the task above, I think it's almost all there, it's "just" a
> > > case
> > > of implementing the special-case knowledge about the CPython API,
> > > mostly via known_function subclasses.
> >
> > Sounds good.
> >
> >
> > > In cpychecker I added some custom function attributes:
> > >
> > > https://gcc-python-plugin.readthedocs.io/en/latest/cpychecker.html
> > > which were:
> > >   __attribute__((cpychecker_returns_borrowed_ref))
> > >   __attribute__((cpychecker_steals_reference_to_arg(n)))
> > >
> > [...]
> > >
> > > But exactly what these macros would look like would be a decision
> > > for
> > > the CPython community (hence do it via PEP, based on a sample
> > > implementation).
> >
> > Ok, I see what you mean now. Thanks for clarifying!
> >
> >
> > > Yeah, this sounds like a big project.  Fortunately there are a lot
> > > of
> > > possible subtasks in this one, and the project has benefits to GCC
> > > and
> > > to CPython even if you only get a subset of the ideas done in the
> > > time
> > > available (refcount checking being probably the highest-value
> > > subtask).
> >
> > Sounds good.
> >
> > I refactored the project description and timeline sections of the
> > proposal according to our conversation. Notably, I moved format
> > string
> > checking to task #2 in the timeline since its subtasks are
> > particularly beneficial. I also suggest in the timeline section to
> > reach out to the CPython community via PEP about the specifics of new
> > attributes in week 9/10 since I think we should have a somewhat
> > mature
> > prototype by that point. Let me know if you think it should be done
> > earlier/later. Please find the changed sections below (I omitted
> > unchanged sections for brevity)
> > ___
> >
> > Describe the project and clearly define its goals:
> > One pertinent use case of the gcc-python-plugin was as a static
> > analysis tool for CPython extension modules. The main goal of the
> > plugin was to help programmers writing extensions identify common
> > coding errors. The gcc-python-plugin has bitrotted over the years
> > and,
> > in particular, cpychecker stopped working some GCC releases ago.
> > Broadly, the goal of this project is to port the functionalities of
> > cpychecker to a -fanalyzer plugin.
> >
> > Below is a brief description of the functionalities of the static
> > analysis tool that I will work on porting over to a -fanalyzer
> > plugin. The structure of the objectives is based on the
> > gcc-python-plugin documentation:
> >
> > Reference count checking: <unchanged from original proposal>
> >
> > Format string checking: Some CPython APIs such as PyArg_ParseTuple,
> > PyArg_ParseTupleAndKeywords, etc. take format strings as arguments.
> > This check involves verifying that the format strings taken in by
> > these APIs are correct with respect to the number and types of
> > arguments passed in. In particular, I will work on integrating the
> > analyzer with -Wformat
> > (https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107017) and adding
> > plugin support for -Wformat
> > (https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100121) . We should
> > then
> > be able to specify our own archetype which reflects the format string
> > syntax for the relevant CPython APIs and take advantage of the
> > integrated analyzer to check them.
> >
> > Associating PyTypeObject instances with compile-time-types:
> > <unchanged from original proposal>
> >
> > Error-handling checking (including errors in exception handling):
> > Common errors such as dereferencing a NULL value are already checked
> > by the analyzer. I will extend this functionality by implementing
> > special-case knowledge about the CPython API.
> >
> > Verification of PyMethodDef tables: <unchanged from original proposal>
> >
> > Provide an expected timeline:
> > Please find a rough estimate of the weekly progress in relation to
> > the
> > features described below. Tasks that I expect to take longer than one
> > week are broken down in more detail. In addition to what’s described,
> > each task also involves adding test coverage pertaining to its specific
> > feature to a regression test suite.
> >
> > Week 1 - 7: Reference counting checking
> > Week 1: Set up the overall infrastructure of the plugin and begin
> > building core functionality
> > Week 1 - 6: Core reference counting functionality
> > Week 7: Refine prototype
> > Week 8 - 10.5: Format string checking (including associating
> > PyTypeObject instances with compile-time-types)
> > Week 8 - ~9: RFE: support printf-style formatted functions in
> > -fanalyzer
> > Week ~9 - 10.5: RFE: plugin support for -Wformat via
> > __attribute__((format()))
> > Additionally, begin conversing with CPython community via PEP
> > about the exact

Re: Re: GSoC: want to take part in `Extend the static analysis pass for CPython Extension`

2023-04-03 Thread Eric Feng via Gcc
Hi Steven,

I’m happy to collaborate on this project together — it would be great
to have your experience with CPython internals on the team.

> And by the way, I can get to work long before the start-coding time point of 
> GSoC timeline.


I can be involved in some capacity before the start-coding period as
well (I originally planned to spend the time getting well acquainted
so as to hit the ground running), but I would prefer that we leave the
more involved tasks (e.g. reference count checking, format string
checking) until the start-coding period, as I can’t work full time
until late May due to school commitments. Perhaps we can begin with the
low-hanging fruit, such as error-handling checking, errors in exception
handling, and verification of PyMethodDef tables, in the time before the
start-coding period? Starting with these smaller tasks might also make
us more efficient at tackling the more involved ones, and they would be
easy to divvy up.
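
As a rough illustration of what "verification of PyMethodDef tables" could
involve, here is a toy check that a method table ends with a sentinel
entry. This is a sketch under assumptions: method_def is a mock stand-in
(not CPython's PyMethodDef, so it builds without Python.h), and
has_sentinel is a hypothetical helper rather than anything from cpychecker
or the analyzer.

```cpp
#include <cstddef>

// Mock of the shape of a method table entry; this is NOT CPython's
// PyMethodDef, just a stand-in for illustration.
struct method_def
{
  const char *name;   // method name, or nullptr in the sentinel entry
  void (*meth) ();    // callback, stored behind a generic function type
  int flags;          // calling-convention flags
};

// Hypothetical check: does TABLE (of CAPACITY entries) contain a
// sentinel entry (name == nullptr) within its bounds?
bool has_sentinel (const method_def *table, std::size_t capacity)
{
  for (std::size_t i = 0; i < capacity; ++i)
    if (table[i].name == nullptr)
      return true;
  return false;
}

static void dummy_method () {}

// A well-formed table ends with an all-zero sentinel entry...
const method_def good_table[] = {
  {"spam", dummy_method, 1},
  {nullptr, nullptr, 0},
};

// ...whereas this one forgets it, so a consumer that walks the table
// until it sees a null name would read past the end.
const method_def bad_table[] = {
  {"spam", dummy_method, 1},
};
```

In a compiler plugin the table size is known from the initializer, so a
check of this shape can be done without running any code.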

Best,
Eric


Re: Re: GSoC: want to take part in `Extend the static analysis pass for CPython Extension`

2023-04-03 Thread Martin Jambor
Hello,

On Mon, Apr 03 2023, Eric Feng via Gcc wrote:
> Hi Steven,
>
> I’m happy to collaborate on this project together — it would be great
> to have your experience with CPython internals on the team.
>

While I normally welcome collaboration, please note that GSoC rules and
reasonable caution dictate that the two projects must not be dependent on
one another.  I.e. if one of you, for any reason, could not finish your
bits in time, it must not have severe adverse effects on the other.

If mostly independent collaboration is possible (and David agrees),
then sure, go ahead.

Thanks for understanding,

Martin


Re: GSoC Separate Host Process Offloading

2023-04-03 Thread Prasad, Adi via Gcc
Hi Tobias and Thomas - just wondering if you've had a chance to look at this?

Thanks,
Adi

From: Prasad, Adi 
Sent: Saturday, April 1, 2023 5:16 am
To: Tobias Burnus ; Thomas Schwinge 

Cc: gcc@gcc.gnu.org 
Subject: RE: GSoC Separate Host Process Offloading

Hi Tobias and Thomas,
My apologies for the double email; I have an unrelated administrative ask. 
Would it be possible to provide any past successful GSoC proposals? I'm 
interested in anything GCC specifically is looking for in proposals (I've 
seen quite a few generic guides on the web but none specific to GCC).

Thanks,
Adi

> -Original Message-
> From: Prasad, Adi
> Sent: Saturday, April 1, 2023 4:16 AM
> To: 'Tobias Burnus' ; Thomas Schwinge
> 
> Cc: gcc@gcc.gnu.org
> Subject: RE: GSoC Separate Host Process Offloading
>
> Hi Tobias,
> Thanks for the reply!
>
> >
> > Note that multiple offload targets are possible. For instance, on
> > Debian/Ubuntu, 'gcc -v' shows:
> > 'OFFLOAD_TARGET_NAMES=nvptx-none:amdgcn-amdhsa' and lto-wrapper then
> > cycles through those, finding the offloading compiler in
> > $PATH/accel//mkoffload
> >
> > Example: x86_64-none-linux-gnu/12.2.1/accel/amdgcn-amdhsa/mkoffload
> >
> > Thus, if you install it to 'x86_64-none-linux-gnu' and add it to
> > OFFLOAD_TARGET_NAMES,* it will work; albeit, we probably want to have
> > some special handling in gcc.cc to avoid host-process offloading by
> > default and permit something like -foffload=host instead of having to
> > specify -foffload=x86_64-none-linux-gnu
> >
> Understood. Forgive me if I'm misunderstanding this, but I wonder if it
> might be better to put the new mkoffload in an "accel/host" directory, and
> add "host" to OFFLOAD_TARGET_NAMES rather than have the specific host, e.g.
> "x86_64-none-linux-gnu"? This would 1) enable the use of "-foffload=host"
> automatically and 2) distinguish between compiling for the same device on a
> separate process versus compiling to a separate device with the same
> architecture and kernel as the host.
> I can imagine this clash wouldn’t happen in practice, since compiling for a
> separate host process would target CPUs while compiling for a separate device
> would target GPUs, but it might be nicer to keep them conceptually separate
> all the same.
>
> > I think it would be useful to start posting patches early – such that
> > they can be reviewed and discussed. Thus, this is not really the 4th
> > and 5th item.
> >
> I can post patches every week instead since my proposal will set a milestone
> target for each week.
> Additionally, what do you think about me doing some other small tasks besides
> the proposed scope? Specifically, I was thinking it might be helpful to get
> the offloading documentation page up to date and add info on OpenACC.
>
> > No quick idea for work items – maybe I get one – or Thomas does :-)
> >
> > Tobias
> >
> Thank you so much for all the info, and do let me know if any small tasks come
> up!
> Adi


Re: [GSoC][analyzer-c++] Submission of a draft proposal

2023-04-03 Thread Benjamin Priour via Gcc
Hi David,

On Mon, Apr 3, 2023 at 12:38 AM David Malcolm  wrote:

>
> To be fair, C ones can be as well; the analyzer's exploded graphs tend
> to get very big on anything but the most trivial examples.
>
>
>
[...snip...]


>
> Indeed - you'll have to do a lot of looking at gimple IR dumps, what
> the supergraph looks like, etc, for all of this.
>
>
Yep. I have already tried my hand at them, but to be fair I'm still often
troubled by them. Still, doing so has already provided essential insight
into the analyzer.

[...snip...]


> > 4. Extension of out-of-bounds
> >  ( - Extending -Wout-of-bounds to the many vec<...> might be a
> > requirement.
> > However I have to look into more details for that one, I don't see
> > yet how
> > it could be done without a similar reuse of the assertions as for the
> > libstdc++.)
> >
> > From what I saw, despite the bugs not being FIXED, vfuncs seem to be
> > working nicely enough after the fix from GSoC 2021.
>
> IIRC I was keeping those bugs open because there's still a little room
> for making the analyzer smarter about the C++ type system e.g. if we
> "know" that a foo * is pointing at a particular subclass, maybe we know
> things about what vfunc implementations could be called.
>
> We could even try an analysis mode where we split the analysis path at
> a vfunc call, where we could create an out-edge in the egraph for each
> known concrete subclass for foo *, so that we can consider all the
> possible subclasses and the code that would be called.  (I'm not sure
> if this is a *good* idea, but it intrigues me)
>

Like adding a flag to run in a non-standard mode, to debug when an
unexpected vfunc analysis occurs? TBH I didn't look that much into vfunc
support, as my dummy tests behave OK and I assumed it was fixed after the
last GSoC.
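
To picture the splitting idea, here is a toy model only, not analyzer
code. The classes bar and baz and the function explore_all_overrides are
made up for illustration: given a call through a foo *, the function
enumerates every known concrete subclass and records what the call would
do for each, which is loosely what one out-edge per subclass in the egraph
would amount to.

```cpp
#include <memory>
#include <string>
#include <vector>

struct foo
{
  virtual ~foo () = default;
  virtual std::string name () const = 0;
};

struct bar : foo
{
  std::string name () const override { return "bar"; }
};

struct baz : foo
{
  std::string name () const override { return "baz"; }
};

// At a call such as p->name () the static type foo * does not say which
// override runs.  This toy "analysis" tries each known concrete subclass
// in turn and collects the outcome of the call for each of them.
std::vector<std::string> explore_all_overrides ()
{
  std::vector<std::unique_ptr<foo>> candidates;
  candidates.push_back (std::make_unique<bar> ());
  candidates.push_back (std::make_unique<baz> ());

  std::vector<std::string> outcomes;
  for (const auto &p : candidates)
    outcomes.push_back (p->name ());
  return outcomes;
}
```

The obvious cost is that the number of paths grows with the number of
subclasses at every virtual call, which is presumably why it would need to
be an opt-in mode.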


>
> >
> > Unfortunately I couldn't devote as much time as I wanted to gcc
> > yesterday,
> > I plan to send a proposal draft tomorrow evening. Sincerely sorry for
> > the
> > short time frame before the deadline.
>
> Sounds promising.  Note that the deadline for submitting proposals to
> the official GSoC website is April 4 - 18:00 UTC (i.e. this coming
> Tuesday) and that Google are very strict about that deadline; see:
> https://developers.google.com/open-source/gsoc/timeline
>
> I believe you've already had a go at posting gcc patches to our mailing
> list: that's a great thing to mention in your application.
>
Thanks for the tip, I added it to my draft !

>
> Good luck!
> Dave
>
Thanks! BTW here is my draft proposal (in a Google doc, I hope this is
OK).
If you can find the time to give me some feedback (as always), I would
greatly appreciate it!
Below I will dump the "project goals" part, so that it's openly available
on the mailing list.
Note that I've annotated some sections with [RFC] for your ease of use
when reviewing: these are the parts I'm explicitly asking for feedback on.
Just do a Ctrl-F on the string [RFC].

A bit out of context, but since you always sign your mails 'Dave', should I
address you that way ? Unsure about that.

Best,
Benjamin. (see below for the dump)

The aim of this project is to enable the analyzer to analyze itself.
To do so, the following items should be implemented (m: minor, M: Major
feature)
> Generalize gcc.dg/analyzer tests to be run with both C and C++ [PR96395]
[M]
> Support the options relative to checker sm-malloc
   -Wanalyzer-double-free should behave properly for the C++ allocation
pairs new, new[], delete and delete[], in both their throwing and
non-throwing versions.
At the moment, only their non-throwing counterparts are somewhat
handled, yet incorrectly, as the expected -Wanalyzer-double-free is replaced
by -Wanalyzer-use-after-free [m] and an incorrect
-Wanalyzer-possible-null-dereference is emitted [fixed].
 I filed it as bug PR109365 [2].
> Add support for tracking unique_ptr null-dereference [M]. While smart_ptr
is correctly handled, the code snippet below demonstrates that this warning
is not emitted for unique_ptr [4].
Figure 1 - First test case for unique_ptr support
#include <memory>

struct A {int x; int y;};
int main () {
  std::unique_ptr<A> a;  /* default-constructed, so it holds nullptr */
  a->x = 12; /* -Wanalyzer-null-dereference missing */
  return 0;
}
> Improve the diagnostic path for the standard library, with shared_ptr as
a comparison point, so that they do not wander through the standard library
code. [M]
Figure 2 - Reproducer to demonstrate unnecessarily long diagnostic paths
when using the standard library.
#include <memory>

struct A {int x; int y;};
int main () {
  std::shared_ptr<A> a;
  a->x = 4; /* Diagnostic path should stop here rather than going to
               shared_ptr_base.h */
  return 0;
}
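
Both reproducers rely on the fact that a default-constructed smart pointer
holds nullptr, so the member access is a null dereference. The following
sketch just states that property in runnable form and shows the guarded
variant that should not warn; the function names are made up for
illustration.

```cpp
#include <memory>

struct A { int x; int y; };

// A default-constructed std::unique_ptr<A> owns nothing.
bool default_unique_ptr_is_null ()
{
  std::unique_ptr<A> a;
  return a == nullptr && a.get () == nullptr;
}

// The same holds for a default-constructed std::shared_ptr<A>.
bool default_shared_ptr_is_null ()
{
  std::shared_ptr<A> a;
  return !a && a.use_count () == 0;
}

// Guarded write: only dereference once the pointer is known non-null.
// Returns false (and writes nothing) if A_PTR is empty.
bool guarded_write (std::unique_ptr<A> &a_ptr, int value)
{
  if (!a_ptr)
    return false;
  a_ptr->x = value;
  return true;
}
```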
[RFC] I believe this could be a 350-hours project as time flies by quickly,
but I’m more than open to your suggestions to support self-analysis. I’ve
read your idea on splitting at vfunc points, I’m currently looking into it.
An additional goal I have considered is to add out-of-bounds support for
the auto_vec. This would include supporting templates, but a shallow

Re: [GSoC][analyzer-c++] Submission of a draft proposal

2023-04-03 Thread Benjamin Priour via Gcc
Following up on my last mail: a classic, I forgot to link my draft!
https://docs.google.com/document/d/1MaLDo-Rt8yrJIvC1MO8SmFc6fp4eRQM_JeSdv-1kbsc/edit?usp=sharing

Best,
Benjamin.

On Mon, Apr 3, 2023 at 6:44 PM Benjamin Priour  wrote:

> [...snip...]

RE: GSoC Separate Host Process Offloading

2023-04-03 Thread Martin Jambor
Hello,

On Sat, Apr 01 2023, Prasad, Adi via Gcc wrote:
> Hi Tobias and Thomas,
>
> My apologies for the double email; I have an unrelated administrative
> ask. Would it be possible to provide any past successful GSoC
> proposals? I'm interested in anything GCC specifically is looking
> for in proposals (I've seen quite a few generic guides on the web but
> none specific to GCC).

Unfortunately no, not without seeking permission of their authors first.

But generally speaking, if you can demonstrate that you have the skills
and ability to understand the offloading architecture, the current
relevant sources (not necessarily in depth but more-or-less correctly)
and that you have the C/C++ coding skills to be able to successfully
change them, you are likely to be selected (depending on how many slots
we get from Google, of course).

One way to demonstrate it is to provide good milestones in your proposal
and a timeline which will demonstrate that you already have an idea what
you would be working on in the first few weeks, beyond setting things up
and learning stuff.

>
> Thanks,
> Adi
>
>> -Original Message-
>> From: Prasad, Adi
>> Sent: Saturday, April 1, 2023 4:16 AM
>> To: 'Tobias Burnus' ; Thomas Schwinge
>> 
>> Cc: gcc@gcc.gnu.org
>> Subject: RE: GSoC Separate Host Process Offloading
>> 
>> Hi Tobias,
>> Thanks for the reply!
>> 
>> >
>> > Note that multiple offload targets are possible. For instance, on
>> > Debian/Ubuntu, 'gcc -v' shows:
>> > 'OFFLOAD_TARGET_NAMES=nvptx-none:amdgcn-amdhsa' and lto-wrapper then
>> > cycles through those, finding the offloading compiler in
>> > $PATH/accel//mkoffload
>> >
>> > Example: x86_64-none-linux-gnu/12.2.1/accel/amdgcn-amdhsa/mkoffload
>> >
>> > Thus, if you install it to 'x86_64-none-linux-gnu' and add it to
>> > OFFLOAD_TARGET_NAMES,* it will work; albeit, we probably want to have
>> > some special handling in gcc.cc to avoid host-process offloading by
>> > default and permit something like -foffload=host instead of having to
>> > specify -foffload=x86_64-none-linux-gnu
>> >
>> Understood. Forgive me if I'm misunderstanding this, but I wonder if it
>> might be better to put the new mkoffload in an "accel/host" directory, and
>> add "host" to OFFLOAD_TARGET_NAMES rather than have the specific host, e.g.
>> "x86_64-none-linux-gnu"? This would 1) enable the use of "-foffload=host"
>> automatically and 2) distinguish between compiling for the same device on a
>> separate process versus compiling to a separate device with the same
>> architecture and kernel as the host.
>> I can imagine this clash wouldn’t happen in practice, since compiling for a
>> separate host process would target CPUs while compiling for a separate device
>> would target GPUs, but it might be nicer to keep them conceptually separate
>> all the same.

These are details which can be tweaked later but yeah, some
simplification will be necessary.
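
For illustration, the lookup Tobias describes can be modelled as splitting
the colon-separated OFFLOAD_TARGET_NAMES value and forming one
accel/NAME/mkoffload path per entry. This is only a sketch: split_targets
and mkoffload_paths are hypothetical names, the prefix parameter is an
assumption, and the real lto-wrapper logic is more involved.

```cpp
#include <string>
#include <vector>

// Split a colon-separated list such as "nvptx-none:amdgcn-amdhsa".
// Empty entries are dropped.
std::vector<std::string> split_targets (const std::string &names)
{
  std::vector<std::string> out;
  std::string::size_type start = 0;
  while (start <= names.size ())
    {
      std::string::size_type end = names.find (':', start);
      if (end == std::string::npos)
        end = names.size ();
      if (end > start)
        out.push_back (names.substr (start, end - start));
      start = end + 1;
    }
  return out;
}

// Hypothetical: build the mkoffload path for each target under PREFIX,
// following the "PREFIX/accel/NAME/mkoffload" layout from the example.
std::vector<std::string> mkoffload_paths (const std::string &prefix,
                                          const std::string &names)
{
  std::vector<std::string> paths;
  for (const std::string &target : split_targets (names))
    paths.push_back (prefix + "/accel/" + target + "/mkoffload");
  return paths;
}
```

Under this model, adding "host" to the list would simply resolve to an
accel/host/mkoffload entry next to the GPU ones.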

>> 
>> > I think it would be useful to start posting patches early – such that
>> > they can be reviewed and discussed. Thus, this is not really the 4th
>> > and 5th item.
>> >
>> I can post patches every week instead since my proposal will set a milestone
>> target for each week.
>> Additionally, what do you think about me doing some other small tasks besides
>> the proposed scope? Specifically, I was thinking it might be helpful to get
>> the offloading documentation page up to date and add info on OpenACC.

Updating documentation would be very welcome, but not as part of a GSoC
project; the rules forbid that.

As far as small tasks are concerned, that is always very difficult in
GCC and so we do not really expect all applicants to manage completing
any.  But it is important to demonstrate understanding of the relevant
bits of GCC, for example by asking good questions on this list.

Good luck,

Martin


Re: [GSOC] few question about Bypass assembler when generating LTO object files

2023-04-03 Thread Rishi Raj via Gcc
While going through the patch and simple-object.c I understood that the
file simple-object.c is used to handle the object file format. However,
this file does not contain all the architecture information required for
LTO object files, so the workaround used in the patch is to read the
crtbegin.o file and merge the missing attributes. While this workaround is
functional, it is not optimal, and the ideal solution would be to extend
simple-object.c to include the missing information.

Regarding the phrase "Support in the driver to properly execute *1 binary",
it is not entirely clear what it refers to. My interpretation is that the
compiler driver (the program that coordinates the compilation process)
needs to be modified to correctly output LTO object files instead of
assembler files (the current approach involves passing the -S and -o
.o options) and also to skip the assembler step when the
-fbypass-asm option is used, but I am not sure. Can Jan or Martin please
shed some light on this?

Thanks & Regards

Rishi Raj

On Sun, 2 Apr 2023 at 03:05, Rishi Raj  wrote:

> Hi everyone,
> I had already expressed my interest in the "Bypass assembler when
> generating LTO object files" project and am preparing a proposal for it. I
> know I should have done it earlier, but I was admitted to the hospital for
> the past few days :(.
> I have a few questions.
> 1)
>
> "One problem is that the object files produced by libiberty/simple-object.c
> (which is the low-level API used by the LTO code)
> are missing some information (such as the architecture info and symbol
> table) and API of the simple object will need to be extended to handle
> that" I found this in the previous mailing list discussion. So what
> currently outputs this information in the object file: is it the assembler?
>
> Also, in the current patch for this project by Jan Hubicka, where are we
> getting this information from? Is it from crtbegin.o?
>
> 2)
> "Support in driver to properly execute *1 binary." I found this in Jan's
> original patch email. What does it mean exactly?
>
> Regards
>
> Rishi Raj
>
>
>
>


RE: GSoC Separate Host Process Offloading

2023-04-03 Thread Thomas Schwinge
Hi Adi!

I've not been able yet to review your items in detail, but it's very good
that you're discussing your ideas!

At least a few comments:

On 2023-04-01T03:16:28+, "Prasad, Adi via Gcc"  wrote:
> Tobias wrote:
>> [...] permit something like -foffload=host instead of having to
>> specify -foffload=x86_64-none-linux-gnu

Right -- but I'd be happy if initially the latter worked, and then a
'host' variant can be made to work incrementally.

> Understood. Forgive me if I'm misunderstanding this, but [...]

No, these are certainly good ideas!  :-) (I can't investigate the details
right now, but surely will, once the time comes.)


Please spend some time on this central question that I'd raised:

| Make some thoughts (or actual experiments) about how we could
| use/implement a separate host process for code offloading.

That is, include in your project proposal (or, discuss here, if there's
still time) your ideas about how to actually implement that.
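
One conceivable first experiment, offered purely as an assumption and not
as a description of how libgomp or any existing plugin works: run the
"offloaded" function in a forked child process and pass the result back
through a pipe. The names run_in_child and square are made up for this
sketch, and it uses only POSIX fork/pipe/waitpid.

```cpp
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Run KERNEL (ARG) in a forked child process and return its result via
// a pipe.  Returns false on any failure.
bool run_in_child (int (*kernel) (int), int arg, int *result)
{
  int fds[2];
  if (pipe (fds) != 0)
    return false;

  pid_t pid = fork ();
  if (pid < 0)
    {
      close (fds[0]);
      close (fds[1]);
      return false;
    }

  if (pid == 0)
    {
      // Child: compute the value, send it to the parent, and exit
      // without running the parent's atexit handlers.
      close (fds[0]);
      int value = kernel (arg);
      ssize_t n = write (fds[1], &value, sizeof value);
      _exit (n == (ssize_t) sizeof value ? 0 : 1);
    }

  // Parent: read the value, then reap the child.
  close (fds[1]);
  int value = 0;
  ssize_t n = read (fds[0], &value, sizeof value);
  close (fds[0]);

  int status = 0;
  if (waitpid (pid, &status, 0) != pid)
    return false;
  if (n != (ssize_t) sizeof value
      || !WIFEXITED (status) || WEXITSTATUS (status) != 0)
    return false;

  *result = value;
  return true;
}

// Stand-in for an offloaded region.
static int square (int x) { return x * x; }
```

With fork the child inherits a copy of the parent's memory, so this
sidesteps the data-mapping questions a real separate host process would
have to answer; it is only meant as a starting point for discussion.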


As Martin wrote, don't worry too much about the specific format of your
application.  It's more important that we're able to see that you're
understanding the scope of the project, timeline, expected difficulties,
and so on.  All within reasonable bounds, of course -- we're all very
well aware of the difficulties of estimating software projects...  Yet,
some plausible timeline, milestones, etc. are necessary in the project
proposal.


Good luck!

Grüße
 Thomas


Re: [GSOC] few question about Bypass assembler when generating LTO object files

2023-04-03 Thread Jan Hubicka via Gcc
Hello,
> While going through the patch and simple-object.c I understood that the
> file simple-object.c is used to handle the object file format. However,
> this file does not contain all the architecture information required for
> LTO object files, so the workaround used in the patch is to read the
> crtbegin.o file and merge the missing attributes. While this workaround is
> functional, it is not optimal, and the ideal solution would be to extend
> simple-object.c to include the missing information.

Yes, simple-object.c simply reuses the architecture settings it read
earlier, which is a problem since at compile time we do not read any
object files (we just parse sources). In my original patch the
architecture flags were simply left blank.  I am not sure whether there
is a version that reads crtbegin.o, which would probably not be that bad
a workaround, at least for a start.  Having a way to specify this from
the machine descriptions would be better.

Besides the architecture bits, for simple-object files to work we need
to add the symbol table. For the result to be practically useful we also
need to stream the debug info.
> 
> Regarding the phrase "Support in the driver to properly execute *1 binary",
> it is not entirely clear what it refers to. My interpretation is that the
> compiler driver (the program that coordinates the compilation process)
> needs to be modified to correctly output LTO object files instead of
> assembler files (the current approach involves passing the -S and -o
> .o options) and also skip the assembler option while using
> -fbypass-asm option but I am not sure. Can Jan or Martin please shed some
> light on this?
Yes, the compiler driver decides what to do, and it needs to know that
with -flto it does not need to produce an assembly file and then invoke
gas.  If we go the way of reading in crtbegin.o, it will also need to pass
the correct crtbegin to the *1 binary.  This is generally not that hard to
do; it just needs to be done :)
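
The driver's decision described above can be sketched roughly as follows.
This is a simplified model in Python, not GCC's actual driver code:
plan_steps and the (tool, input, output) tuples are hypothetical, and it
assumes that -fbypass-asm (the option name mentioned earlier in this
thread) only takes effect together with -flto.

```python
def plan_steps(flags, source="foo.c"):
    """Return the (tool, input, output) steps a simplified driver would run."""
    stem = source.rsplit(".", 1)[0]
    obj = stem + ".o"
    if "-flto" in flags and "-fbypass-asm" in flags:
        # Assumed bypass path: the compiler proper writes the LTO object
        # file itself, so the assembler is never invoked.
        return [("cc1", source, obj)]
    asm = stem + ".s"
    # Usual path: the compiler proper emits assembly, then the assembler
    # turns it into an object file.
    return [("cc1", source, asm), ("as", asm, obj)]
```

With the flags ["-flto", "-fbypass-asm"] this yields a single cc1 step;
without -fbypass-asm it yields the cc1 step followed by the assembler.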

Honza
> 
> Thanks & Regards
> 
> Rishi Raj
> 
> On Sun, 2 Apr 2023 at 03:05, Rishi Raj  wrote:
> 
> > Hi everyone,
> > I had already expressed my interest in the "Bypass assembler when
> > generating LTO object files" project and am writing a proposal for it. I
> > know I should have done it earlier, but I was admitted to the hospital for
> > the past few days :(.
> > I have a few questions.
> > 1)
> >
> > "One problem is that the object files produced by libiberty/simple-object.c
> > (which is the low-level API used by the LTO code)
> > are missing some information (such as the architecture info and symbol
> > table) and API of the simple object will need to be extended to handle
> > that" I found this in the previous mailing list discussion. So what
> > currently outputs this information in the object file? Is it the assembler?
> >
> > Also, in the current patch for this project by Jan Hubicka, where are we
> > getting this information from? Is it from crtbegin.o?
> >
> > 2)
> > "Support in driver to properly execute *1 binary." I found this in Jan's
> > original patch email. What does it mean exactly?
> >
> > Regards
> >
> > Rishi Raj
> >
> >
> >
> >


Re: [GSOC] few question about Bypass assembler when generating LTO object files

2023-04-03 Thread Rishi Raj via Gcc
Thanks, Jan, for the reply! I have completed a draft proposal for this
project. I would appreciate your, Martin's, or anybody else's feedback on
it.
Here is the link to my proposal:
https://docs.google.com/document/d/1r9kzsU96kOYfIhWZx62jx4ALG-J_aJs5U0sDpwFUtts/edit?usp=sharing



Fwd: [GSOC] few question about Bypass assembler when generating LTO object files

2023-04-03 Thread Rishi Raj via Gcc
---------- Forwarded message ---------
From: Rishi Raj 
Date: Tue, 4 Apr 2023 at 05:57
Subject: Re: [GSOC] Submission of draft proposal.
To: Jan Hubicka 
Cc: , 
Oops, I forgot to change the subject in the previous email :(

Thanks, Jan, for the reply! I have completed a draft proposal for this
project. I would appreciate your, Martin's, or anybody else's feedback on
it.
Here is the link to my proposal:
https://docs.google.com/document/d/1r9kzsU96kOYfIhWZx62jx4ALG-J_aJs5U0sDpwFUtts/edit?usp=sharing



[GSOC] Submission of draft proposal for Bypass assembler when generating LTO object files

2023-04-03 Thread Rishi Raj via Gcc
Sorry, I messed up the subject in my previous two emails :( so I am sending
it again.
I have completed a draft proposal for this project. I would appreciate
feedback on it from Jan, Martin, or anybody else.
Here is the link to my proposal:
https://docs.google.com/document/d/1r9kzsU96kOYfIhWZx62jx4ALG-J_aJs5U0sDpwFUtts/edit?usp=sharing


Re: [GSoC][analyzer-c++] Submission of a draft proposal

2023-04-03 Thread David Malcolm via Gcc
On Mon, 2023-04-03 at 18:46 +0200, Benjamin Priour wrote:
> Following up on my last mail: a classic, I forgot to link my draft!
> https://docs.google.com/document/d/1MaLDo-Rt8yrJIvC1MO8SmFc6fp4eRQM_JeSdv-1kbsc/edit?usp=sharing

Some notes:
  * The document still has some notes in italics marked "[RFC]" which
you'll want to fix before formally submitting it.
  * "Project Goals": item 4: you give a reproducer; perhaps add a link
to godbolt.org (Compiler Explorer) demonstrating the overlong
diagnostic path?
  * Part 1: as part of moving the test cases to c-c++-common you'll
probably have to debug/write a little .exp code (in Tcl) so that it
actually runs the tests, probably in analyzer.exp.  So you might want
to allow some time to read up on Tcl, which is the language our
testsuite is written in (I wish it was in Python, but fixing that would
be a different project, alas)
  * Part 2: your grep for unique_ptr found 2903 uses, but I guess many
of these are in libstdc++-v3.  As I understand it, this is compiled for
the target (as a library for use by the compiler user), whereas I'm
much more interested in the code below "gcc", which is compiled for the
host into the compiler itself.  You might want to split out these
numbers.
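
A minimal sketch of splitting such a count by top-level directory, assuming
a plain substring match is good enough for a rough estimate (so it will not
necessarily reproduce the grep figure).  count_by_top_dir is a hypothetical
helper written for this illustration, not an existing script in the GCC
tree.

```python
import os
from collections import Counter


def count_by_top_dir(root, needle="unique_ptr"):
    """Count occurrences of `needle` per top-level directory under `root`."""
    counts = Counter()
    for dirpath, _dirnames, filenames in os.walk(root):
        # First path component below root, e.g. "gcc" or "libstdc++-v3";
        # files directly in root are grouped under ".".
        top = os.path.relpath(dirpath, root).split(os.sep)[0]
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                with open(path, encoding="utf-8", errors="replace") as f:
                    hits = f.read().count(needle)
            except OSError:
                continue  # unreadable file: skip it
            if hits:
                counts[top] += hits
    return counts
```

Running it on a GCC checkout and comparing counts["gcc"] with
counts["libstdc++-v3"] would give the kind of split suggested above.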

One task is to try adding -fanalyzer to the build flags for the
compiler itself, and see what happens: is it usable?  is it unusably
slow on some of our source files? does it find true problems?  does it
report false positives?  The current document suggests doing this in
part 3 as the last 20% of the project; I think it makes more sense to
do the initial attempt at this much earlier, to get an earlier idea of
what the problems might be.

"Motivation and Skill set": the first paragraph is poorly worded; for
example the 2nd sentence seems to just stop halfway through.

"Mentor": yes, I would be the mentor

Other than that, looks good.

The deadline for formally submitting this to the GSoC website is April
4th at 18:00 UTC (less than 24 hours from now), and Google are strict
about this deadline.

Good luck!
Dave


> 
> Best,
> Benjamin.
> 
> On Mon, Apr 3, 2023 at 6:44 PM Benjamin Priour 
> wrote:
> 
> > Hi David,
> > 
> > On Mon, Apr 3, 2023 at 12:38 AM David Malcolm 
> > wrote:
> > 
> > > 
> > > To be fair, C ones can be as well; the analyzer's exploded graphs
> > > tend
> > > to get very big on anything but the most trivial examples.
> > > 
> > > 
> > > 
> > [...snip...]
> > 
> > 
> > > 
> > > Indeed - you'll have to do a lot of looking at gimple IR dumps,
> > > what
> > > the supergraph looks like, etc, for all of this.
> > > 
> > > 
> > Yep. I have already tried my hand at them, but to be fair I'm
> > still often troubled by them. Still, doing so has already provided
> > essential insight into the analyzer.
> > 
> > [...snip...]
> > 
> > 
> > > > 4. Extension of out-of-bounds
> > > >  ( - Extending -Wout-of-bounds to the many vec<...> might be a
> > > > requirement.
> > > > However I have to look into more details for that one, I don't
> > > > see
> > > > yet how
> > > > it could be done without a similar reuse of the assertions as
> > > > for the
> > > > libstdc++.)
> > > > 
> > > > From what I saw, despite the bugs not being FIXED, vfuncs seem
> > > > to be
> > > > working nicely enough after the fix from GSoC 2021.
> > > 
> > > IIRC I was keeping those bugs open because there's still a little
> > > room
> > > for making the analyzer smarter about the C++ type system e.g. if
> > > we
> > > "know" that a foo * is pointing at a particular subclass, maybe
> > > we know
> > > things about what vfunc implementations could be called.
> > > 
> > > We could even try an analysis mode where we split the analysis
> > > path at
> > > a vfunc call, where we could create an out-edge in the egraph for
> > > each
> > > known concrete subclass for foo *, so that we can consider all
> > > the
> > > possible subclasses and the code that would be called.  (I'm not
> > > sure
> > > if this is a *good* idea, but it intrigues me)
> > > 
> > 
> > Like adding a flag to run in a non-standard mode, to debug when an
> > unexpected vfunc analysis occurs? TBH I didn't look that much into
> > vfunc support, as my dummy tests behave OK and I assumed it was fixed
> > after the last GSoC.
> > 
> > 
> > > 
> > > > 
> > > > Unfortunately I couldn't devote as much time as I wanted to gcc
> > > > yesterday,
> > > > I plan to send a proposal draft tomorrow evening. Sincerely
> > > > sorry for
> > > > the
> > > > short time frame before the deadline.
> > > 
> > > Sounds promising.  Note that the deadline for submitting proposals
> > > to
> > > the official GSoC website is April 4 - 18:00 UTC (i.e. this
> > > coming
> > > Tuesday) and that Google are very strict about that deadline;
> > > see:
> > > https://developers.google.com/open-source/gsoc/timeline
> > > 
> > > I believe you've already had a go at posting gcc patches to our
> > > mailing
> > > list: that's a great thing to mention in your application.
> > > 
> > Thank