Re: AutoFDO profile toolchain is open-sourced

2015-05-09 Thread Jan Hubicka
> > Yes, it will. But it's not well tuned at all. I will start tuning it
> > if I have free cycles. It would be great if opensource community can
> > also contribute to this tuning effort.
> 
> If you could outline portions of code which needs tuning, rewriting, that 
> will help get started in this effort.

Optimization passes in GCC are generally designed to work with any kind of
edge profile they get.  There are only a few cases where they care which
kind of profile is available.

At the moment we consider two types of profiles: static (guessed) and FDO.
For the static one we disable the use of profile info in some heuristics.
For example, we do not expect loop trip counts in a guessed profile to be
reliable, because they are not.  You can look for code checking
profile_status_for_fn.
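
As a rough illustration of that kind of check, here is a minimal standalone
sketch.  It is not GCC source: struct fn_model, model_profile_status and
model_trip_count are hypothetical names invented for this example, and the
enum only mirrors the guessed-versus-read distinction described above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified stand-in for a per-function profile status.  */
enum profile_status { PROFILE_ABSENT, PROFILE_GUESSED, PROFILE_READ };

/* Hypothetical, stripped-down function record; not GCC's struct function.  */
struct fn_model
{
  enum profile_status status;
  long header_count;  /* How often the loop header executed.  */
  long entry_count;   /* How often the loop was entered.  */
};

/* Model of a profile_status_for_fn style accessor.  */
enum profile_status
model_profile_status (const struct fn_model *fn)
{
  assert (fn != NULL);
  return fn->status;
}

/* Store an average trip count in *TRIPS and return true only when the
   profile was read from feedback data.  For guessed or absent profiles
   the counts are not trusted and *TRIPS is left untouched.  */
bool
model_trip_count (const struct fn_model *fn, long *trips)
{
  if (model_profile_status (fn) != PROFILE_READ)
    return false;
  if (fn->entry_count <= 0)
    return false;
  *trips = fn->header_count / fn->entry_count;
  return true;
}
```

With a guessed profile the helper declines to answer, so a caller has to
fall back to some other heuristic; with a read profile it reports the
measured average.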

Auto-FDO does not have a special value for profile_status_for_fn, so it goes
through the same code paths as FDO.  Dehao has some patches for Auto-FDO
tuning, but my impression is that he mostly got around the problems by just
making the optimizer a bit more robust against nonsensical profiles, which is
always good, since even FDO profiles can be wrong.  BTW, Dehao, do you think
you can submit these changes for this stage1?
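
To make "robust against nonsensical profiles" a bit more concrete, here is a
small standalone sketch.  It is only an assumption about what such hardening
can look like, not a description of Dehao's patches: MODEL_PROB_BASE and
model_edge_probability are hypothetical names, and the fallback policy
(an even split when the block count is unusable) is chosen for illustration.

```c
#include <assert.h>

/* Hypothetical fixed-point scale for probabilities in this model.  */
#define MODEL_PROB_BASE 10000

/* Return the probability of taking an edge, scaled to MODEL_PROB_BASE,
   while tolerating inconsistent counts such as an edge that appears to
   execute more often than the block it leaves.  */
int
model_edge_probability (long edge_count, long block_count, int n_succs)
{
  assert (n_succs > 0);

  /* No usable block count: spread the probability evenly.  */
  if (block_count <= 0)
    return MODEL_PROB_BASE / n_succs;

  /* Clamp impossible edge counts into the valid range.  */
  if (edge_count < 0)
    edge_count = 0;
  if (edge_count > block_count)
    edge_count = block_count;

  /* Round to nearest; floating point avoids overflow on large counts.  */
  return (int) ((double) edge_count / (double) block_count
                * MODEL_PROB_BASE + 0.5);
}
```

A consistent input such as 50 out of 100 yields half of the scale, while an
impossible input such as 150 out of 100 is clamped to the full scale instead
of producing a probability above 100%.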

I suppose in this case we have yet another kind of profile that is less
reliable than FDO, and we need to start by simply benchmarking, looking for
cases where this profile does worse, and handling them one by one :)

Honza


Re: Merging debug-early work?

2015-05-09 Thread Aldy Hernandez

On 05/08/2015 01:51 AM, Richard Biener wrote:


> Did you see if --with-build-config=bootstrap-lto still works?


Just did on x86-64 Linux.  Bootstrap succeeds without any problems.


> While doing the LTO work I wondered why you have the late_global_decl
> loop in toplev.c:compile_file at all (well, maybe due to the missing one
> for global vars output).  At least for function decls we should have covered
> everything by that point.


It took a lot of trial and error, with Jason reviewing these entry
points, to find the right place to put things without getting a slew of
unused DIEs created or cases left unhandled (local statics, C++ clones,
optimized-away symbols, locals, etc.).  I can't remember the details, though
I'm sure we discussed it at length when the patches went into the branch.


Perhaps now that adding an unused decl DIE removal pass is a must, we 
could make the early/late business less efficient but cleaner, and then 
have the upcoming DIE removal pass clean things up?  Ughh...it did take 
a while to get right without bloating things too much, but I don't doubt 
there are ways to streamline it further.



> Early debug seems to be enforced at finalize-compilation-unit time, but
> only for function decls - I wonder why not as well for global vars
> (and why the odd !decl_function_context check is there for functions).


IIRC, stuff living in functions was handled through recursion within
dwarf2out.  I think we would miss local statics, or some such corner
case.  I invite you to comment things out and see where things fail in
the bootstrap/tests :).


Aldy



How to use old GPU (Fermi) in gcc with OpenACC?

2015-05-09 Thread Satoshi_OHSHIMA
Hi,

I'm trying to use and evaluate gcc with OpenACC on some NVIDIA GPUs.
I succeeded in building gcc with OpenACC, using
http://scelementary.com/2015/04/25/openacc-in-gcc.html as a reference,
and I was then able to run programs on a Kepler GPU.
However, when I tried older GPUs (Fermi), my programs failed to execute.
I noticed that there are some "sm_30" and "COMPUTE_30" keywords in the gcc
and nvptx sources.
I changed them to "sm_20" and "COMPUTE_20", but my programs still failed to
execute.
Is there any developer who can make gcc with OpenACC support targets other
than "sm_30"?
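
A minimal OpenACC test of the kind that could be used to check each GPU might
look like the sketch below.  It is illustrative only and is not the program
from this report; vec_add is a hypothetical name.  When built without OpenACC
support the pragma is ignored and the loop simply runs on the host, so the
result alone does not show that offloading happened.

```c
#include <assert.h>
#include <stddef.h>

/* Add two vectors element by element.  With OpenACC enabled and a working
   offload target the loop may run on the GPU; otherwise it runs on the
   host.  */
void
vec_add (const float *a, const float *b, float *c, size_t n)
{
  assert (a != NULL && b != NULL && c != NULL);

#pragma acc parallel loop copyin(a[0:n], b[0:n]) copyout(c[0:n])
  for (size_t i = 0; i < n; i++)
    c[i] = a[i] + b[i];
}
```

Calling vec_add on a few small arrays and comparing the output with the
expected sums gives a quick pass/fail check on each machine.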

thanks



