A simple question about virtual destructors.

2016-06-02 Thread K

Hi,

My question is: what if the compiler generated a virtual destructor (or
converted a nonvirtual one to virtual) in a base class whenever the base
class has at least one virtual function and classes further down the
hierarchy have nontrivial destructors? In other words, make the compiler
responsible for proper destruction of a polymorphic object.


Are there any serious arguments against this? This suggestion may look
stupid, but just think how many keystrokes and hours spent hunting
memory leaks it could save.


Kirill.



Re: A simple question about virtual destructors.

2016-06-02 Thread K

On 06/02/16 13:14, Marcin Baczyński wrote:

> 2016-06-02 11:55 GMT+02:00 K :
>
>> Hi,
>>
>> My question is: what if the compiler generated a virtual destructor (or
>> converted a nonvirtual one to virtual) in a base class whenever the base
>> class has at least one virtual function and classes further down the
>> hierarchy have nontrivial destructors? In other words, make the compiler
>> responsible for proper destruction of a polymorphic object.
>>
>> Are there any serious arguments against this?
>
> One thing that immediately comes to mind is the compiler would need to
> know about those derived classes and their destructors when compiling
> the base class.

Yes, this looks like a problem, but then there is a much simpler
solution: if a base class has a virtual function then it must have a
virtual destructor, and if a base class has a virtual destructor then
every class in the hierarchy will have a pointer to its destructor in
the vtbl.

>> This suggestion may look stupid, but just think how many keystrokes and
>> hours spent hunting memory leaks it could save.
>
> -Wnon-virtual-dtor or -Weffc++ could be of some help there.

Yes, I know; -Wall also gives proper warnings.

I wanted to send this as a suggestion to the @isocpp.org mailing list, but
then decided to ask the people who work on the compiler for their opinions.
This issue with virtual destructors has bothered people for decades, and
maybe it is time to make the compiler do proper destruction in this case.

Kirill.
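
To make the problem under discussion concrete, here is a minimal sketch. The
class and function names are hypothetical and do not come from the thread.
It shows the well-defined case, where the base class declares its destructor
virtual and the derived destructor therefore runs when the object is deleted
through a pointer to the base. If `virtual` were dropped from ~Base, the same
deletion would be undefined behaviour, which is the situation the proposal
wants the compiler to rule out.

```cpp
#include <cassert>
#include <memory>

// Counts how many times the derived destructor has run.
static int derived_dtor_calls = 0;

struct Base {
    virtual void f() {}
    // Without `virtual` here, deleting a Derived through Base* is undefined behaviour.
    virtual ~Base() = default;
};

struct Derived : Base {
    ~Derived() override { ++derived_dtor_calls; }
};

// Destroys a Derived through a pointer to Base and returns the running count.
int destroy_through_base() {
    std::unique_ptr<Base> p(new Derived);
    p.reset();  // calls ~Derived via the virtual destructor
    return derived_dtor_calls;
}
```

Each call to destroy_through_base() increases the counter by one, which is
only guaranteed because the destructor is found through the vtbl.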





Re: "Uninitialized array" warnings by c++ with -O2

2017-06-07 Thread K




> That snippet invokes undefined behavior at runtime (violates C++ aliasing
> rules), so just fix it, rather than bother with bugreports.  E.g. look for
> -fstrict-aliasing in GCC documentation, or read the C++ standard.  With
> -fno-strict-aliasing, which is a workaround for broken code you won't get
> any warnings about uninitialized uses.
>
> Jakub



Yes, this code has problems with aliasing. But the warning messages
are extremely misleading anyway.
And I found that a version which I believe should not have aliasing
problems still generates the same warnings.


The code:

static uint32_t calc_16bit_checksum_part(uint8_t* buf, int len, uint32_t ret) {
    struct ui16 {
        uint16_t d;
    };
    ui16 *ptr = (ui16*)buf;
    for (int i = 0; i < (len / 2); ++i) {
        ret += ptr[i].d;
    }
    if (len % 2) {
        ret += buf[len-1];
    }

    return ret;
}

static uint32_t calc_ch_udp_pseudo(uint32_t src, uint32_t dst, uint16_t len) {
    struct udp_pseudo {
        uint32_t src;
        uint32_t dst;
        uint8_t  z;
        uint8_t  proto;
        uint16_t len;
    } tmp;

    tmp.src = src;
    tmp.dst = dst;
    tmp.z = 0;
    tmp.proto = UDP_PROTO_NUMBER;
    tmp.len = len;

    auto ret = calc_16bit_checksum_part((uint8_t*)&tmp, sizeof(tmp), 0);

    return ret;
}

and messages

static uint16_t calc_16bit_checksum(uint8_t* buf, int len) {
 ^~~
snippet.cc: In function ‘int main()’:
snippet.cc:66:31: warning: ‘tmp.calc_16bit_checksum_part(uint8_t*, int, uint32_t)::ui16::d’ is used uninitialized in this function [-Wuninitialized]

 ret += ptr[i].d;
~~~^
snippet.cc:66:31: warning: ‘*((void*)(& tmp)+2).calc_16bit_checksum_part(uint8_t*, int, uint32_t)::ui16::d’ is used uninitialized in this function [-Wuninitialized]
snippet.cc:66:31: warning: ‘*((void*)(& tmp)+4).calc_16bit_checksum_part(uint8_t*, int, uint32_t)::ui16::d’ is used uninitialized in this function [-Wuninitialized]
snippet.cc:66:31: warning: ‘*((void*)(& tmp)+6).calc_16bit_checksum_part(uint8_t*, int, uint32_t)::ui16::d’ is used uninitialized in this function [-Wuninitialized]
snippet.cc:66:31: warning: ‘*((void*)(& tmp)+8).calc_16bit_checksum_part(uint8_t*, int, uint32_t)::ui16::d’ is used uninitialized in this function [-Wuninitialized]
snippet.cc:66:31: warning: ‘*((void*)(& tmp)+10).calc_16bit_checksum_part(uint8_t*, int, uint32_t)::ui16::d’ is used uninitialized in this function [-Wuninitialized]

[


ui16 is an aggregate type and uint8_t is unsigned char, so there will be 
no undefined behaviour.


Kirill.


Re: "Uninitialized array" warnings by c++ with -O2

2017-06-07 Thread K



On 06/07/2017 04:56 PM, Andrew Haley wrote:

> On 07/06/17 14:45, K wrote:
>
>> And I found that a version which I believe should not have aliasing
>> problems still generates the same warnings.
>
> It still has aliasing problems: you can't make them magically go away
> by using an intermediate uint8_t*.
>
> You're doing this:
>
>   struct udp_pseudo {
>       uint32_t src;
>       uint32_t dst;
>       uint8_t  z;
>       uint8_t  proto;
>       uint16_t len;
>   } tmp;
> ...
>
>   auto ret = calc_16bit_checksum_part((uint8_t*)&tmp, sizeof(tmp), 0);
>
>   static uint32_t calc_16bit_checksum_part(uint8_t* buf, int len, uint32_t ret) {
>       struct ui16 {
>           uint16_t d;
>       };
>       ui16 *ptr = (ui16*)buf;
>
> There's no need for any of this messing about with pointer casts, as has
> been explained.



Sorry, but I still don't get the idea. A cast from udp_pseudo to uint8_t
doesn't have an aliasing problem (std 8.8), and a cast from uint8_t to
ui16 still doesn't have an aliasing problem (std 8.6), or maybe I
missed something?



Kirill.
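
One way to sidestep the aliasing question raised in this thread is to copy
the bytes into a 16-bit object instead of reading them through a cast
pointer. The following is a minimal sketch under that assumption; the name
sum16 is hypothetical, and the function sums native-endian 16-bit words just
as the posted loop does, adding a trailing odd byte as-is. It is not the fix
that was suggested on the list, only an illustration of the std::memcpy
approach.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Sums the buffer as native-endian 16-bit words; a trailing odd byte is added as-is.
std::uint32_t sum16(const unsigned char* buf, std::size_t len, std::uint32_t acc) {
    for (std::size_t i = 0; i + 1 < len; i += 2) {
        std::uint16_t word;
        // Copy the two bytes rather than dereferencing a cast pointer.
        std::memcpy(&word, buf + i, sizeof word);
        acc += word;
    }
    if (len % 2 != 0) {
        acc += buf[len - 1];
    }
    return acc;
}
```

A caller would pass the address of the object as a byte pointer, for example
sum16(reinterpret_cast<const unsigned char*>(&tmp), sizeof(tmp), 0), since
reading any object through unsigned char is permitted.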


GSoC topic: Implement hot cold splitting at GIMPLE IR level

2020-03-02 Thread Aditya K
Hi Everyone,
I was one of the original authors of the hot/cold splitting optimization in
LLVM, and I was wondering whether implementing a region-based hot/cold
splitting optimization would be useful in GCC. GCC already has an optimal
implementation of SESE region detection
(https://github.com/gcc-mirror/gcc/blob/master/gcc/sese.h), so an IR-level
implementation of hot/cold splitting could leverage some of that.

Motivation:
With the increasing popularity of the RISC-V architecture, where most
applications are constrained by code size, I assume those applications will
(soon) be suffering from app launch time and page faults. I don't have
numbers for this, sorry; it is just a guess. Having an IR-level hot/cold
splitting pass would benefit applications deployed on such devices by
reducing their startup working set.
I'd be happy to mentor a GSoC candidate if we choose to list this as one of
the projects.

Description of the project: Region-based hot/cold splitting is an IR-level
function splitting transformation. The goal of hot/cold splitting is to
improve the memory locality of code and to help reduce the startup working
set. The splitting pass does this by identifying cold blocks and moving them
into separate functions. Because it is implemented at the IR level, all
back-end targets benefit from it. It is a relatively new optimization; it
was recently presented at the LLVM Dev Meeting in 2019 and the slides are
here: https://llvm.org/devmtg/2019-10/talk-abstracts.html#tech8. There are
fast algorithms to detect SESE regions, as illustrated in
(http://impact.gforge.inria.fr/impact2016/papers/impact2016-kumar.pdf), and
we can leverage those to detect regions.

Deliverables:
- Implement hot cold splitting pass at GIMPLE level
- Detect maximal cold region in a function and outline it as a separate function
- Use static as well as dynamic profile information to mark cold edges
- Write unit tests to show variety of regions outlined


Thanks,
-Aditya
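
At source level, the effect of outlining a cold region can be sketched by
hand as below. This is only an illustration of the idea, not what a GIMPLE
pass would emit. The function names are hypothetical; `cold` and `noinline`
are existing GCC function attributes and `__builtin_expect` is an existing
GCC builtin (Clang accepts them as well).

```cpp
#include <cassert>
#include <cstdio>

// Cold region moved into its own function. The attributes ask the compiler
// not to inline it and to treat it as unlikely to be executed.
__attribute__((cold, noinline))
static int report_bad_input(int value) {
    std::fprintf(stderr, "unexpected negative input: %d\n", value);
    return -1;
}

// The hot path stays small: only a compare and a call remain for the cold case.
int process(int value) {
    if (__builtin_expect(value < 0, 0)) {
        return report_bad_input(value);
    }
    return value * 2;
}
```

The error handling no longer occupies space next to the hot code, which is
the locality benefit the proposal is after.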

Fw: GSoC topic: Implement hot cold splitting at GIMPLE IR level

2020-03-03 Thread Aditya K


Hi Martin,
Thank you for explaining the status quo. After reading the code of
bb-reorder.c, it looks pretty good and does not seem to need any significant
improvements.
In that case, the only value GIMPLE-level hot/cold splitting could bring is
to enable aggressive code-size optimization by merging similar/identical
functions: after outlining, cold regions may be suitable candidates for
function merging.
ipa-split might already enable some of that; region-based function splitting
could improve ipa-split.

-Aditya


--
From: Martin Liška 
Sent: Tuesday, March 3, 2020 2:47 AM
To: Aditya K ; gcc@gcc.gnu.org 
Cc: Jan Hubicka 
Subject: Re: GSoC topic: Implement hot cold splitting at GIMPLE IR level

Hello.
Thank you for the idea. I would like to provide some comments about what GCC
can currently do, and I'm curious whether we need something extra on top of
that.
Right now, GCC can do hot/cold partitioning based on functions and basic
blocks. With a PGO profile, the optimization is quite aggressive and can move
quite a lot of code into the cold partition, where it is optimized for size.
Without a profile, we do a static profile guess (predict.c), where we also
propagate information about cold blocks (determine_unlikely_bbs). Later in
RTL, we use that information and do the real reordering (bb-reorder.c).

Martin





Combined top-down and bottom-up instruction scheduler

2015-09-08 Thread Aditya K
IIUC, in haifa-sched.c the default scheduling algorithm seems to be
top-down (before reload).
Is there a way to schedule the other way (bottom-up), or both ways?

As a use case for bottom-up or some other heuristic:
Currently, the first priority in the selection is given to the longest path.
In some cases this may produce code with stalls at the end of the basic
block, whereas with combined top-down + bottom-up scheduling we would end up
having stalls in the middle of the basic block.

Thanks,
-Aditya

  

RE: Combined top-down and bottom-up instruction scheduler

2015-09-08 Thread Aditya K



> Subject: Re: Combined top-down and bottom-up instruction scheduler
> To: hiradi...@msn.com; gcc@gcc.gnu.org
> CC: vmaka...@redhat.com
> From: l...@redhat.com
> Date: Tue, 8 Sep 2015 12:51:24 -0600
>
> On 09/08/2015 12:39 PM, Aditya K wrote:
>> IIUC, in the haifa-sched.c, the default scheduling algorithm seems to
>> be top-down (before reload). Is there a way to schedule the other way
>> (bottom up), or both ways?
> Not that I'm aware of. Note that region scheduling allows insns to move
> between basic blocks to help fill the bubbles that can occur at the end
> of a block.
>
>>
>> As a use case for bottom-up or some other heuristic: Currently, the
>> first priority in the selection is given to the longest path, in some
>> cases this may produce code with stalls at the end of the basic
>> block. Whereas in the case of combined top-down + bottom-up
>> scheduling we would end up having stalls in the middle of the basic
>> block.
> GCC's original scheduler worked bottom-up until ~1997. IBM Haifa's work
> turned it into a top-down model and was a small, but clear improvement.
>
> There's certainly better things that can be done than strictly top-down
> or bottom-up, but revamping the scheduler again hasn't been seen as a
> major win for the most common processors GCC targets these days. Thus
> it hasn't been a significant area of focus.

Do you have pointers on where to look if I want to explore bottom-up, or
maybe a combination of the two?

Thanks,
-Aditya

>
> Jeff
  

Help with gcc-plugin (Traverse all loops inside function)

2015-09-15 Thread Aditya K
I started with one of the test cases in the plugin testsuite "def_plugin.c". 
Pasted the code for convenience.
I want to traverse all the loops in a function.

I thought I could use loops_for_fn (DECL_STRUCT_FUNCTION (fndef)), but this
does not seem to work.


/* Callback function to invoke after GCC finishes a function definition. */

void plugin_finish_parse_function (void *event_data, void *data)
{
  tree fndef = (tree) event_data;
  //struct loops *l  = loops_for_fn (DECL_STRUCT_FUNCTION (fndef));
  warning (0, G_("Finish fndef %s"),
           IDENTIFIER_POINTER (DECL_NAME (fndef)));
}

int
plugin_init (struct plugin_name_args *plugin_info,
             struct plugin_gcc_version *version)
{
  const char *plugin_name = plugin_info->base_name;

  register_callback (plugin_name, PLUGIN_START_PARSE_FUNCTION,
                     plugin_start_parse_function, NULL);

  register_callback (plugin_name, PLUGIN_FINISH_PARSE_FUNCTION,
                     plugin_finish_parse_function, NULL);
  return 0;
}

Thanks,
-Aditya


  

RE: Proposal for adding splay_tree_find (to find elements without updating the nodes).

2015-03-09 Thread Aditya K



> Date: Mon, 9 Mar 2015 15:26:26 -0400
> From: tbsau...@tbsaunde.org
> To: gcc@gcc.gnu.org
> Subject: Re: Proposal for adding splay_tree_find (to find elements without 
> updating the nodes).
>
> On Mon, Mar 09, 2015 at 06:59:10PM +, vax mzn wrote:
>> w.r.t, https://gcc.gnu.org/wiki/Speedup_areas where we want to improve the 
>> performance of splay trees.
>>
>> The function `splay_tree_node splay_tree_lookup (splay_tree, 
>> splay_tree_key);'
>> updates the nodes every time a lookup is done.
>>
>> IIUC, There are places where we call this function in a loop i.e., we lookup 
>> different elements every time.
>
> So, this makes sense, but I've always wondered if it wouldn't make more
> sense to just use the red black tree in libstdc++ (or does it have a
> splay tree of its own too?)
>
> Trev
>

 Red-black trees do have a better balancing cost, although at the cost of
higher complexity.
 I'm not sure whether using red-black trees would improve the performance,
because I don't have any data to back that up.

 -Aditya

>> e.g.,
>> In this example we are looking for a different `t' in each iteration.
>>
>> gcc/gimplify.c:1096: splay_tree_lookup (ctx->variables, (splay_tree_key) t) 
>> == NULL)
>>
>> Here, we change the tree itself `ctx'
>> gcc/gimplify.c:5532: n = splay_tree_lookup (ctx->variables, 
>> (splay_tree_key)decl);
>>
>>
>> I think we don't need to update the tree in these cases at least.
>>
>>
>> -Aditya
>>
  

RE: Proposal for adding splay_tree_find (to find elements without updating the nodes).

2015-03-09 Thread Aditya K



> From: stevenb@gmail.com
> Date: Mon, 9 Mar 2015 23:59:52 +0100
> Subject: Re: Proposal for adding splay_tree_find (to find elements without 
> updating the nodes).
> To: hiradi...@msn.com
> CC: gcc@gcc.gnu.org
>
> On Mon, Mar 9, 2015 at 7:59 PM, vax mzn wrote:
>> w.r.t, https://gcc.gnu.org/wiki/Speedup_areas where we want to improve the 
>> performance of splay trees.
>>
>> The function `splay_tree_node splay_tree_lookup (splay_tree, 
>> splay_tree_key);'
>> updates the nodes every time a lookup is done.
>>
>> IIUC, There are places where we call this function in a loop i.e., we lookup 
>> different elements every time.
>> e.g.,
>> In this example we are looking for a different `t' in each iteration.
>
>
> If that's really true, then a splay tree is a horrible choice of data
> structure. The splay tree will simply degenerate to a linked list. The
> right thing to do would be, not to "break" one of the key features of
> splay trees (i.e. the latest lookup is always on top), but to use
> another data structure.
>
> Ciao!
> Steven

So I have this patch, which replaces splay_tree_lookup with a new function,
splay_tree_find, in some places.
I hope this is helpful.

commit 64f203f36661efd95958474f31b588a134dedb41
Author: Aditya 
Date:   Mon Mar 9 22:47:04 2015 -0500

    add splay_tree_find for finding elements without updating the tree

diff --git a/gcc/gimplify.c b/gcc/gimplify.c
index d822913..1053eee 100644
--- a/gcc/gimplify.c
+++ b/gcc/gimplify.c
@@ -1093,7 +1093,7 @@ gimplify_bind_expr (tree *expr_p, gimple_seq *pre_p)
  /* Mark variable as local.  */
  if (ctx && !DECL_EXTERNAL (t)
  && (! DECL_SEEN_IN_BIND_EXPR_P (t)
- || splay_tree_lookup (ctx->variables,
+  || splay_tree_find (ctx->variables,
    (splay_tree_key) t) == NULL))
    {
  if (ctx->region_type == ORT_SIMD
@@ -5529,7 +5529,7 @@ omp_firstprivatize_variable (struct gimplify_omp_ctx 
*ctx, tree decl)
 
   do
 {
-  n = splay_tree_lookup (ctx->variables, (splay_tree_key)decl);
+  n = splay_tree_find (ctx->variables, (splay_tree_key)decl);
   if (n != NULL)
    {
  if (n->value & GOVD_SHARED)
@@ -6428,7 +6428,7 @@ gimplify_adjust_omp_clauses_1 (splay_tree_node n, void 
*data)
  while (ctx != NULL)
    {
  splay_tree_node on
-   = splay_tree_lookup (ctx->variables, (splay_tree_key) decl);
+    = splay_tree_find (ctx->variables, (splay_tree_key) decl);
  if (on && (on->value & (GOVD_FIRSTPRIVATE | GOVD_LASTPRIVATE
  | GOVD_PRIVATE | GOVD_REDUCTION
  | GOVD_LINEAR | GOVD_MAP)) != 0)
@@ -6529,7 +6529,7 @@ gimplify_adjust_omp_clauses (gimple_seq *pre_p, tree 
*list_p)
    case OMP_CLAUSE_FIRSTPRIVATE:
    case OMP_CLAUSE_LINEAR:
  decl = OMP_CLAUSE_DECL (c);
- n = splay_tree_lookup (ctx->variables, (splay_tree_key) decl);
+  n = splay_tree_find (ctx->variables, (splay_tree_key) decl);
  remove = !(n->value & GOVD_SEEN);
  if (! remove)
    {
@@ -6551,7 +6551,7 @@ gimplify_adjust_omp_clauses (gimple_seq *pre_p, tree 
*list_p)
  if (ctx->outer_context->combined_loop
  && !OMP_CLAUSE_LINEAR_NO_COPYIN (c))
    {
- n = splay_tree_lookup (ctx->outer_context->variables,
+  n = splay_tree_find (ctx->outer_context->variables,
 (splay_tree_key) decl);
  if (n == NULL
  || (n->value & GOVD_DATA_SHARE_CLASS) == 0)
@@ -6578,7 +6578,7 @@ gimplify_adjust_omp_clauses (gimple_seq *pre_p, tree 
*list_p)
  /* Make sure OMP_CLAUSE_LASTPRIVATE_FIRSTPRIVATE is set to
 accurately reflect the presence of a FIRSTPRIVATE clause.  */
  decl = OMP_CLAUSE_DECL (c);
- n = splay_tree_lookup (ctx->variables, (splay_tree_key) decl);
+  n = splay_tree_find (ctx->variables, (splay_tree_key) decl);
  OMP_CLAUSE_LASTPRIVATE_FIRSTPRIVATE (c)
    = (n->value & GOVD_FIRSTPRIVATE) != 0;
  break;
@@ -6587,7 +6587,7 @@ gimplify_adjust_omp_clauses (gimple_seq *pre_p, tree 
*list_p)
  decl = OMP_CLAUSE_DECL (c);
  if (!is_global_var (decl))
    {
- n = splay_tree_lookup (ctx->variables, (splay_tree_key) decl);
+  n = splay_tree_find (ctx->variables, (splay_tree_key) decl);
  remove = n == NULL || !(n->value & GOVD_SEEN);
  if (!remove && TREE_CODE (TREE_TYPE (decl)) == POINTER_TYPE)
    {
@@ -6600,7 +6600,7 @@ gimplify_adjust_omp_clauses (gimple_seq *pre_p, tree 
*list_p)
    for (octx = ctx->outer_context; octx;
 octx = octx->outer_context)
  {
-   n =

RE: Proposal for adding splay_tree_find (to find elements without updating the nodes).

2015-03-13 Thread Aditya K
---
> Date: Tue, 10 Mar 2015 11:20:07 +0100
> Subject: Re: Proposal for adding splay_tree_find (to find elements without 
> updating the nodes).
> From: richard.guent...@gmail.com
> To: stevenb@gmail.com
> CC: hiradi...@msn.com; gcc@gcc.gnu.org
>
> On Mon, Mar 9, 2015 at 11:59 PM, Steven Bosscher  
> wrote:
>> On Mon, Mar 9, 2015 at 7:59 PM, vax mzn wrote:
>>> w.r.t, https://gcc.gnu.org/wiki/Speedup_areas where we want to improve the 
>>> performance of splay trees.
>>>
>>> The function `splay_tree_node splay_tree_lookup (splay_tree, 
>>> splay_tree_key);'
>>> updates the nodes every time a lookup is done.
>>>
>>> IIUC, There are places where we call this function in a loop i.e., we 
>>> lookup different elements every time.
>>> e.g.,
>>> In this example we are looking for a different `t' in each iteration.
>>
>>
>> If that's really true, then a splay tree is a horrible choice of data
>> structure. The splay tree will simply degenerate to a linked list. The
>> right thing to do would be, not to "break" one of the key features of
>> splay trees (i.e. the latest lookup is always on top), but to use
>> another data structure.
>
> I agree with Steven here and wanted to say the same. If you don't
> benefit from splay trees LRU scheme then use a different data structure.
>
> Richard.
>
>> Ciao!
>> Steven


Thanks Richard and Steven for the feedback. I tried to replace the use of
the splay tree with std::map, and got it to bootstrap.
The compile time on a few runs shows an improvement, but I won't trust the
data because it is too good to be true.
Maybe I don't have enough data points, and this machine runs other things
too.

Baseline: 1426175470 (SHA:7386c9d)
Patch:    1426258562 (SHA:592c06f)

                                                        | Baseline               | Patch
Program                                                 | CC_Time  CC_Real_Time  | CC_Time  CC_Real_Time
MultiSource/Applications/ALAC/decode/alacconvert-decode |  4.2605        4.9840  |  2.6455        3.0826
MultiSource/Applications/ALAC/encode/alacconvert-encode |  4.3124        4.9600  |  2.7138        3.0725
MultiSource/Applications/Burg/burg                      |  5.6053        6.4204  |  3.5663        4.0837
MultiSource/Applications/JM/ldecod/ldecod               | 33.3773       35.9444  | 20.7180       22.4260
MultiSource/Applications/JM/lencod/lencod               | 74.2836       78.6588  | 47.3016       50.0484
MultiSource/Applications/SIBsim4/SIBsim4                |  5.5832        5.8932  |  3.0524        3.2456
MultiSource/Applications/SPASS/SPASS                    | 67.4992       72.1056  | 43.2258       45.7996
MultiSource/Applications/aha/aha                        |  0.5019        0.5894  |  0.3860        0.4406
MultiSource/Applications/d/make_dparser                 | 13.4930       14.5575  |  8.5084        9.2331
MultiSource/Applications/hbd/hbd                        |  4.7727        5.9225  |  2.9896        3.8366
MultiSource/Applications/hexxagon/hexxagon              |  3.1735        3.6957  |  1.8297        2.2171
MultiSource/Applications/kimwitu++/kc                   | 29.6117       31.8364  | 18.0744       19.5862
MultiSource/Applications/lambda-0.1.3/lambda            |  4.5274        4.9125  |  2.7136        2.9241


I have attached my patch. Please give feedback/suggestions for improvement.

Thanks
-Aditya



  

splay.patch
Description: Binary data


RE: Proposal for adding splay_tree_find (to find elements without updating the nodes).

2015-03-13 Thread Aditya K
You're right. I'll change this to:

/* A stable comparison functor to sort trees.  */
struct tree_compare_decl_uid {
  bool  operator ()(const tree &xa, const tree &xb) const
  {
    return DECL_UID (xa) < DECL_UID (xb);
  }
};

New patch attached.


Thanks,
-Aditya
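
The comparator question in this exchange comes down to the difference
between a three-way comparison (negative, zero, positive), as commonly used
with C-style tree containers, and the boolean strict weak ordering that
std::map expects. The sketch below uses a hypothetical Decl struct with a
plain uid field as a stand-in for a tree and its DECL_UID; it is an
illustration of the pitfall, not code from the patch. A difference converted
to bool is true whenever the two ids differ, in either order, so it reports
both a < b and b < a.

```cpp
#include <cassert>
#include <map>

struct Decl { unsigned uid; };  // hypothetical stand-in for a tree with a DECL_UID

// Three-way result squeezed into bool: true whenever the uids differ.
struct by_difference {
    bool operator()(const Decl& a, const Decl& b) const {
        return static_cast<int>(a.uid) - static_cast<int>(b.uid);
    }
};

// Proper strict weak ordering on the uid.
struct by_uid {
    bool operator()(const Decl& a, const Decl& b) const { return a.uid < b.uid; }
};

// A strict weak ordering must never report both a < b and b < a.
template <class Compare>
bool is_asymmetric_for(const Decl& a, const Decl& b) {
    Compare comp;
    return !(comp(a, b) && comp(b, a));
}

// std::map keyed by Decl using the valid ordering.
int lookup_demo() {
    std::map<Decl, int, by_uid> m;
    m[Decl{7}] = 70;
    m[Decl{3}] = 30;
    return m.find(Decl{3})->second;
}
```

With by_difference as the map's comparator the container's behaviour would
be undefined, which is why the `<` form is the one to use.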



> Date: Fri, 13 Mar 2015 19:02:11 +
> Subject: Re: Proposal for adding splay_tree_find (to find elements without 
> updating the nodes).
> From: jwakely@gmail.com
> To: hiradi...@msn.com
> CC: richard.guent...@gmail.com; stevenb@gmail.com; gcc@gcc.gnu.org
>
> Are you sure your compare_variables functor is correct?
>
> Subtracting the two values seems very strange for a strict weak ordering.
>
> (Also "compare_variables" is a pretty poor name!)
  

splay.patch
Description: Binary data


RE: Examples of GCC plugins

2015-03-14 Thread Aditya K



> Date: Sat, 14 Mar 2015 18:22:29 +0300
> From: malts...@gmail.com
> To: gcc@gcc.gnu.org; san...@codesourcery.com
> Subject: Examples of GCC plugins
>
> Hi, all.
>
> When I first tried to write a simple plugin for GCC, it turned out that
> existing docs on plugins
> (https://gcc.gnu.org/onlinedocs/gccint/Plugins.html#Plugins) are rather
> brief, so I had to refer to the GCC source code. When I grepped the
> source looking for one of plugin API functions, I found this directory:
> gcc/testsuite/gcc.dg/plugin.
>
> Plugins in that directory were very helpful as a starting point. I
> think, they are definitely worth being mentioned in documentation (maybe
> I was inattentive and missed some link to them?) or perhaps even
> separated from testsuite (of course, I don't mean removing something
> from testsuite, but rather adding a separate Makefile, not dependent on
> DejaGNU). Any thoughts on this?
>
> --
> Regards,
> Mikhail Maltsev

I think it would be very helpful to mention this in the documentation.

-Aditya

  

RE: Proposal for another approach for Loop transformation with conditional in Loops.

2015-03-14 Thread Aditya K



> From: ajit.kumar.agar...@xilinx.com
> To: l...@redhat.com; richard.guent...@gmail.com; gcc@gcc.gnu.org
> CC: vin...@xilinx.com; shail...@xilinx.com; vid...@xilinx.com; 
> nmek...@xilinx.com
> Subject: Proposal for another approach for Loop transformation with 
> conditional in Loops.
> Date: Sun, 15 Mar 2015 04:40:54 +
>
> Hello All:
>
> I am proposing the new approach to Loop transformation as given below in the 
> example For the loops with
> conditional expression inside the Loops. The Loop body should be reducible 
> control flow graph. The iteration
> space is partitioned into different spaces for which either the cond_expr is 
> true or cond_expr is false. The
> evaluation of cond_expr for the partitioned iteration spaces can be 
> determined statically. For the partitioned
> iterations and based on the analysis of cond_expr on the partitioned 
> iterations spaces the Loops in fig (a) will
> be transformed to Loops in fig (b).
>
> for the iteration spaces of the conditional inside the loop is live in at the 
> entry of the cond_expr and Live out
> the Control flow graph is topological ordered with the basic blocks for the 
> conditional CFG and the partitioned
> iteration spaces can be formed for which the spaces can be true for 
> conditional expr and false and unknown.
>
> Based on the above info and analysis the Loop of Fig (a) will be transformed 
> to Loop Fig (b).
>
> This is much more optimized code as compared to the performance. The cases 
> will be triggered for SPEC
> Benchmarks. Loops is partitioned to different version with different 
> iteration spaces. Optimized in the presence
> Of the transformed generation of the loops without conditional.
>
> For ( x1= lb1; x1<= ub1; x1++)
> ..
> For(xn= lbn; xn <= ubn; xn++)
> {
>
> V = cond_expr;
> If ( V)
> Code_FOR _THEN;
> Else
> Code_FOR_ELSE;
> }
>
> }
> Fig(a)
>
> /* Loop for cond_expr == true */
> For( x1 = )
> For(xn = )
> CODE_FOR_THEN;
> End
> End
>
> /* Loop for cond_expr == false */
> For ( x1 = ..)
> For( xn = ...)
> CODE_FOR_ELSE;
> End
> End
>
> /* Loop for cond_expr == unknown *//
> For ( x1 = ...)
> For( xn = 
> {
> V = cond_expr;
> If( v)
> CODE_FOR_THEN;
> Else
> CODE_FOR_ELSE;
> }
> }
>
> Fig ( b).
>
> Thoughts Please ?
>
> Thanks & Regards
> Ajit

Also, to add to this, we could use profile feedback so that we only do this
for hot loops, and not at all when optimizing for size.


-Aditya

  

RE: Proposal for another approach for Loop transformation with conditional in Loops.

2015-03-14 Thread Aditya K



> From: ajit.kumar.agar...@xilinx.com
> To: l...@redhat.com; richard.guent...@gmail.com; gcc@gcc.gnu.org
> CC: vin...@xilinx.com; shail...@xilinx.com; vid...@xilinx.com; 
> nmek...@xilinx.com
> Subject: Proposal for another approach for Loop transformation with 
> conditional in Loops.
> Date: Sun, 15 Mar 2015 04:40:54 +
>
> Hello All:
>
> I am proposing the new approach to Loop transformation as given below in the 
> example For the loops with
> conditional expression inside the Loops. The Loop body should be reducible 
> control flow graph. The iteration
> space is partitioned into different spaces for which either the cond_expr is 
> true or cond_expr is false. The
> evaluation of cond_expr for the partitioned iteration spaces can be 
> determined statically. For the partitioned
> iterations and based on the analysis of cond_expr on the partitioned 
> iterations spaces the Loops in fig (a) will
> be transformed to Loops in fig (b).
>
> for the iteration spaces of the conditional inside the loop is live in at the 
> entry of the cond_expr and Live out
> the Control flow graph is topological ordered with the basic blocks for the 
> conditional CFG and the partitioned
> iteration spaces can be formed for which the spaces can be true for 
> conditional expr and false and unknown.
>
> Based on the above info and analysis the Loop of Fig (a) will be transformed 
> to Loop Fig (b).
>
> This is much more optimized code as compared to the performance. The cases 
> will be triggered for SPEC
> Benchmarks. Loops is partitioned to different version with different 
> iteration spaces. Optimized in the presence
> Of the transformed generation of the loops without conditional.
>
> For ( x1= lb1; x1<= ub1; x1++)
> ..
> For(xn= lbn; xn <= ubn; xn++)
> {
>
> V = cond_expr;
> If ( V)
> Code_FOR _THEN;
> Else
> Code_FOR_ELSE;
> }
>
> }
> Fig(a)
>
> /* Loop for cond_expr == true */
> For( x1 = )
> For(xn = )
> CODE_FOR_THEN;
> End
> End
>
> /* Loop for cond_expr == false */
> For ( x1 = ..)
> For( xn = ...)
> CODE_FOR_ELSE;
> End
> End
>
> /* Loop for cond_expr == unknown *//
> For ( x1 = ...)
> For( xn = 
> {
> V = cond_expr;
> If( v)
> CODE_FOR_THEN;
> Else
> CODE_FOR_ELSE;
> }
> }
>
> Fig ( b).
>
> Thoughts Please ?
>
> Thanks & Regards
> Ajit

Ajit,

How is this different from the loop-unswitch pass already in gcc
(tree-ssa-loop-unswitch.c)?

-Aditya
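
For reference, the transformation from Fig (a) to Fig (b) can be illustrated
on a one-dimensional loop whose condition depends on the induction variable
in a statically known way. This is a minimal sketch assuming the condition
is `i < k` and that n is non-negative; the function names and loop bodies
are hypothetical. Note that loop unswitching, as the term is generally used,
hoists a loop-invariant condition out of the loop, whereas here the
condition changes with i and the iteration space is split at the point where
it changes value.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Fig (a) style: the condition is evaluated on every iteration.
std::vector<int> with_branch(int n, int k) {
    std::vector<int> out(n);
    for (int i = 0; i < n; ++i) {
        if (i < k) out[i] = i * 2;  // code for "then"
        else       out[i] = -i;     // code for "else"
    }
    return out;
}

// Fig (b) style: one loop per partition of the iteration space, no branch inside.
std::vector<int> split_loops(int n, int k) {
    std::vector<int> out(n);
    const int mid = std::max(0, std::min(k, n));  // first index where i < k is false
    for (int i = 0; i < mid; ++i) out[i] = i * 2;  // cond_expr == true
    for (int i = mid; i < n; ++i) out[i] = -i;     // cond_expr == false
    return out;
}
```

In this simple case no "unknown" partition is needed, because the condition
can be decided for every iteration at the split point.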


  

RE: Proposal for adding splay_tree_find (to find elements without updating the nodes).

2015-03-15 Thread Aditya K



> Date: Sun, 15 Mar 2015 02:32:23 -0400
> From: tbsau...@tbsaunde.org
> To: gcc@gcc.gnu.org
> Subject: Re: Proposal for adding splay_tree_find (to find elements without 
> updating the nodes).
>
> hi,
>
> I'm only commenting on algorithmic stuff at this point, you should make
> sure this doesn't regress anything in make check. This stuff only
> affects code using omp stuff so compiling random c++ is unlikely to test
> this code at all.
>
> Also please follow the style in
> https://gcc.gnu.org/codingconventions.html
> and usually try to make new code similar to what's around it.
>
> @@ -384,7 +386,7 @@ new_omp_context (enum omp_region_type region_type)
>
> c = XCNEW (struct gimplify_omp_ctx);
> c->outer_context = gimplify_omp_ctxp;
> - c->variables = splay_tree_new (splay_tree_compare_decl_uid, 0, 0);
> + //c->variables = splay_tree_new (splay_tree_compare_decl_uid, 0, 0);
>
> I don't think this is what you want, xcnew is a calloc wrapper and
> doesn't call the ctor for gimplify_omp_ctx. For now placement new is
> probably the simplest way to get what you want.
>
Thanks for pointing this out. I'll do it the way c->privatized_types is
allocated, i.e., by making c->variables a pointer to std::map and setting
c->variables = new gimplify_tree_t;
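
The point about XCNEW can be sketched as follows. A calloc-style allocation
zeroes the bytes but does not run constructors, so a std::map member inside
the struct is not yet a valid object. The sketch uses std::calloc in place of
XCNEW and a hypothetical Ctx struct in place of gimplify_omp_ctx; it shows
the placement-new route mentioned in the review, while the reply above opts
for a separately allocated map instead.

```cpp
#include <cassert>
#include <cstdlib>
#include <map>
#include <new>

struct Ctx {  // hypothetical stand-in for gimplify_omp_ctx
    std::map<int, unsigned> variables;
    int region_type;
};

// Allocate zeroed storage, then run the constructor in place.
Ctx* make_ctx() {
    void* raw = std::calloc(1, sizeof(Ctx));
    if (!raw) return nullptr;
    return new (raw) Ctx();  // placement new constructs the map member
}

// Mirror image: run the destructor explicitly before releasing the storage.
void free_ctx(Ctx* c) {
    if (!c) return;
    c->~Ctx();
    std::free(c);
}
```

Either way, whatever constructs the map must be paired with matching
destruction in the function that frees the context.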


> -static void
> -delete_omp_context (struct gimplify_omp_ctx *c)
> -{
> - splay_tree_delete (c->variables);
> - delete c->privatized_types;
> - XDELETE (c);
> -}
>
> hm, why?
>
My bad, I'll restore this.

> -gimplify_adjust_omp_clauses_1 (splay_tree_node n, void *data)
> +gimplify_adjust_omp_clauses_1 (std::pair n, void *data)
>
> You can now change the type of data from void * to const
> gimplify_adjust_omp_clauses_data *

Done!


Thanks for the feedback, it was really helpful. I have updated the patch.
Please review it.
Also, although I ran `make check` while compiling gcc (with bootstrap
enabled), I'm not sure whether the 'omp' related tests were exercised.
I'm still unfamiliar with several components of gcc. Any pointers on how to
ensure all tests were run would be useful.


-Aditya




>
> thanks!
>
> Trev
>
> On Fri, Mar 13, 2015 at 07:32:03PM +, Aditya K wrote:
>> You're right. I'll change this to:
>>
>> /* A stable comparison functor to sort trees.  */
>> struct tree_compare_decl_uid {
>>   bool  operator ()(const tree &xa, const tree &xb) const
>>   {
>> return DECL_UID (xa) < DECL_UID (xb);
>>   }
>> };
>>
>> New patch attached.
>>
>>
>> Thanks,
>> -Aditya
>>
>>
>> 
>>> Date: Fri, 13 Mar 2015 19:02:11 +
>>> Subject: Re: Proposal for adding splay_tree_find (to find elements without 
>>> updating the nodes).
>>> From: jwakely@gmail.com
>>> To: hiradi...@msn.com
>>> CC: richard.guent...@gmail.com; stevenb@gmail.com; gcc@gcc.gnu.org
>>>
>>> Are you sure your compare_variables functor is correct?
>>>
>>> Subtracting the two values seems very strange for a strict weak ordering.
>>>
>>> (Also "compare_variables" is a pretty poor name!)
>>
>
>
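
On the strict weak ordering question quoted above, a small sketch may help. 
The original compare_variables functor is not shown in this thread, so the 
subtraction-based comparator below is only a guess at its shape; decl_sketch 
and all function names are hypothetical. The point is that a difference 
converted to bool is true whenever the two values differ, so both comp(a, b) 
and comp(b, a) hold, which an ordering for std::map must never allow.

```cpp
#include <cassert>

// Hypothetical stand-in for a tree node carrying a DECL_UID.
struct decl_sketch
{
  unsigned uid;
};

// Assumed shape of the questioned functor: returns the difference, which
// converts to true for any two distinct uids.
struct compare_by_subtraction
{
  bool operator () (const decl_sketch *a, const decl_sketch *b) const
  {
    return a->uid - b->uid;
  }
};

// The replacement from the thread: a plain less-than on the uid.
struct compare_by_less
{
  bool operator () (const decl_sketch *a, const decl_sketch *b) const
  {
    return a->uid < b->uid;
  }
};

// A strict weak ordering must never report both a < b and b < a.
template <typename Compare>
bool
is_asymmetric_for (const decl_sketch &a, const decl_sketch &b)
{
  Compare comp;
  return !(comp (&a, &b) && comp (&b, &a));
}

inline void
ordering_sketch_demo ()
{
  decl_sketch a = { 1 };
  decl_sketch b = { 2 };
  assert (!is_asymmetric_for<compare_by_subtraction> (a, b));
  assert (is_asymmetric_for<compare_by_less> (a, b));
}
```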


  

splay.patch
Description: Binary data


RE: Proposal for adding splay_tree_find (to find elements without updating the nodes).

2015-03-16 Thread Aditya K



> From: lopeziba...@gmail.com
> Date: Mon, 16 Mar 2015 15:16:55 +0100
> Subject: Re: Proposal for adding splay_tree_find (to find elements without 
> updating the nodes).
> To: tbsau...@tbsaunde.org; gcc@gcc.gnu.org; hiradi...@msn.com
>
>>> Thanks for the feedback, it was really helpful. I have updated the 
>>> patch. Please review this.
>>> Also, although I ran `make check` while compiling gcc (with bootstrap 
>>> enabled), I'm not sure if 'omp' related tests were exercised.
>>> I'm still unfamiliar with several components of gcc. Any pointers on how to 
>>> ensure all tests were run would be useful.
>>
>>
>> https://gcc.gnu.org/install/test.html should help, though unfortunately
>> you'll probably find the easiest way to check for regressions is to do
>> one run of straight trunk, then another with your patch. Sadly a bunch
>> of people have own scripts to deal with administrivia, but there isn't a
>> standardized way that's simple.
>
> I would recommend going through
>
> https://gcc.gnu.org/wiki/GettingStarted#Basics:_Contributing_to_GCC_in_10_easy_steps
>
> at least once. If you find something wrong, confusing or not answered
> there, please ask here and CC me, and I will NOT answer you ;) what I
> will do is update it, so the answer is there for you but also for the
> next person that comes after you.
>
> Of course, it is a wiki, anyone can update it and they are welcome to do it.
>
> Cheers,
>
> Manuel.

Hi Manuel,

I started looking at the steps to test gcc (https://gcc.gnu.org/Testing_GCC):

In the first step, installing the prerequisites (DejaGnu, Tcl and Expect):
    - The link to DejaGnu does not lead to any information, and there are no 
links for Tcl or Expect.


Thanks,
-Aditya   

RE: Proposal for adding splay_tree_find (to find elements without updating the nodes).

2015-03-16 Thread Aditya K



> From: lopeziba...@gmail.com
> Date: Mon, 16 Mar 2015 17:04:58 +0100
> Subject: Re: Proposal for adding splay_tree_find (to find elements without 
> updating the nodes).
> To: jwakely@gmail.com
> CC: hiradi...@msn.com; tbsau...@tbsaunde.org; gcc@gcc.gnu.org
>
> On 16 March 2015 at 16:55, Jonathan Wakely  wrote:
>> On 16 March 2015 at 15:54, Jonathan Wakely wrote:
>>> "DejaGnu" is not meant to be a link, but the wiki automatically treats
>>> any MixedCase word as a link.
>>
>> I've fixed that now.
>
> We can actually link to the DejaGNU page if someone is interested. But
> probably they only need to find it on their own GNU/Linux, so I
> mention that as well.
>
> Aditya, I hope it is clear now.

Yes it is. Thanks.

-Aditya   

RE: Proposal for adding splay_tree_find (to find elements without updating the nodes).

2015-03-16 Thread Aditya K



> From: hiradi...@msn.com
> To: lopeziba...@gmail.com; jwakely@gmail.com
> CC: tbsau...@tbsaunde.org; gcc@gcc.gnu.org
> Subject: RE: Proposal for adding splay_tree_find (to find elements without 
> updating the nodes).
> Date: Mon, 16 Mar 2015 18:45:22 +
>
>
>
> 
>> From: lopeziba...@gmail.com
>> Date: Mon, 16 Mar 2015 17:04:58 +0100
>> Subject: Re: Proposal for adding splay_tree_find (to find elements without 
>> updating the nodes).
>> To: jwakely@gmail.com
>> CC: hiradi...@msn.com; tbsau...@tbsaunde.org; gcc@gcc.gnu.org
>>
>> On 16 March 2015 at 16:55, Jonathan Wakely  wrote:
>>> On 16 March 2015 at 15:54, Jonathan Wakely wrote:
>>>> "DejaGnu" is not meant to be a link, but the wiki automatically treats
>>>> any MixedCase word as a link.
>>>
>>> I've fixed that now.
>>
>> We can actually link to the DejaGNU page if someone is interested. But
>> probably they only need to find it on their own GNU/Linux, so I
>> mention that as well.
>>
>> Aditya, I hope it is clear now.
>
> Yes it is. Thanks.
>
> -Aditya

So I tested my patch, and there were no regressions in make check.

Please review the patch:
http://gcc.gnu.org/ml/gcc/2015-03/msg00179/splay.patch

Thanks,
-Aditya   

RE: Proposal for adding splay_tree_find (to find elements without updating the nodes).

2015-03-18 Thread Aditya K



> Date: Wed, 18 Mar 2015 11:50:16 +0100
> Subject: Re: Proposal for adding splay_tree_find (to find elements without 
> updating the nodes).
> From: richard.guent...@gmail.com
> To: hiradi...@msn.com
> CC: tbsau...@tbsaunde.org; gcc@gcc.gnu.org
>
> On Mon, Mar 16, 2015 at 4:33 AM, Aditya K  wrote:
>>
>>
>> 
>>> Date: Sun, 15 Mar 2015 02:32:23 -0400
>>> From: tbsau...@tbsaunde.org
>>> To: gcc@gcc.gnu.org
>>> Subject: Re: Proposal for adding splay_tree_find (to find elements without 
>>> updating the nodes).
>>>
>>> hi,
>>>
>>> I'm only commenting on algorithmic stuff at this point, you should make
>>> sure this doesn't regress anything in make check. This stuff only
>>> affects code using omp stuff so compiling random c++ is unlikely to test
>>> this code at all.
>>>
>>> Also please follow the style in
>>> https://gcc.gnu.org/codingconventions.html
>>> and usually try to make new code similar to what's around it.
>>>
>>> @@ -384,7 +386,7 @@ new_omp_context (enum omp_region_type region_type)
>>>
>>> c = XCNEW (struct gimplify_omp_ctx);
>>> c->outer_context = gimplify_omp_ctxp;
>>> - c->variables = splay_tree_new (splay_tree_compare_decl_uid, 0, 0);
>>> + //c->variables = splay_tree_new (splay_tree_compare_decl_uid, 0, 0);
>>>
>>> I don't think this is what you want, xcnew is a calloc wrapper and
>>> doesn't call the ctor for gimplify_omp_ctx. For now placement new is
>>> probably the simplest way to get what you want.
>>>
>> Thanks for pointing this out. I'll do it the way c->privatized_types has 
>> been allocated.
>> e.g., by making c->variables a pointer to std::map and c->variables = new 
>> gimplify_tree_t;
>>
>>
>>> -static void
>>> -delete_omp_context (struct gimplify_omp_ctx *c)
>>> -{
>>> - splay_tree_delete (c->variables);
>>> - delete c->privatized_types;
>>> - XDELETE (c);
>>> -}
>>>
>>> hm, why?
>>>
>> My bad, I'll restore this.
>>
>>> -gimplify_adjust_omp_clauses_1 (splay_tree_node n, void *data)
>>> +gimplify_adjust_omp_clauses_1 (std::pair n, void *data)
>>>
>>> You can now change the type of data from void * to const
>>> gimplify_adjust_omp_clauses_data *
>>
>> Done!
>>
>>
>> Thanks for the feedback, it was really helpful. I have updated the patch. 
>> Please review this.
>> Also, although I ran `make check` while compiling gcc (with bootstrap 
>> enabled), I'm not sure if 'omp' related tests were exercised.
>> I'm still unfamiliar with several components of gcc. Any pointers on how to 
>> ensure all tests were run would be useful.
>
> I'm not sure we want to use std::map. Can you use GCC's own hash_map
> here?

OK, I'll try to use hash_map. I was under the impression that we can use 
standard library features, which is why I used std::map.

Thanks,
-Aditya

>
> Richard.
>
>>
>> -Aditya
>>
>>
>>
>>
>>>
>>> thanks!
>>>
>>> Trev
>>>
>>> On Fri, Mar 13, 2015 at 07:32:03PM +, Aditya K wrote:
>>>> You're right. I'll change this to:
>>>>
>>>> /* A stable comparison functor to sort trees. */
>>>> struct tree_compare_decl_uid {
>>>> bool operator ()(const tree &xa, const tree &xb) const
>>>> {
>>>> return DECL_UID (xa) < DECL_UID (xb);
>>>> }
>>>> };
>>>>
>>>> New patch attached.
>>>>
>>>>
>>>> Thanks,
>>>> -Aditya
>>>>
>>>>
>>>> 
>>>>> Date: Fri, 13 Mar 2015 19:02:11 +
>>>>> Subject: Re: Proposal for adding splay_tree_find (to find elements 
>>>>> without updating the nodes).
>>>>> From: jwakely@gmail.com
>>>>> To: hiradi...@msn.com
>>>>> CC: richard.guent...@gmail.com; stevenb@gmail.com; gcc@gcc.gnu.org
>>>>>
>>>>> Are you sure your compare_variables functor is correct?
>>>>>
>>>>> Subtracting the two values seems very strange for a strict weak ordering.
>>>>>
>>>>> (Also "compare_variables" is a pretty poor name!)
>>>>
>>>
>>>
>>
>>
>>
  

RE: Proposal for adding splay_tree_find (to find elements without updating the nodes).

2015-03-18 Thread Aditya K



> Date: Wed, 18 Mar 2015 22:01:18 +0530
> Subject: Re: Proposal for adding splay_tree_find (to find elements without 
> updating the nodes).
> From: prathamesh.kulka...@linaro.org
> To: hiradi...@msn.com
> CC: richard.guent...@gmail.com; tbsau...@tbsaunde.org; gcc@gcc.gnu.org
>
> On 18 March 2015 at 21:20, Aditya K  wrote:
>>
>>
>> 
>>> Date: Wed, 18 Mar 2015 11:50:16 +0100
>>> Subject: Re: Proposal for adding splay_tree_find (to find elements without 
>>> updating the nodes).
>>> From: richard.guent...@gmail.com
>>> To: hiradi...@msn.com
>>> CC: tbsau...@tbsaunde.org; gcc@gcc.gnu.org
>>>
>>> On Mon, Mar 16, 2015 at 4:33 AM, Aditya K  wrote:
>>>>
>>>>
>>>> 
>>>>> Date: Sun, 15 Mar 2015 02:32:23 -0400
>>>>> From: tbsau...@tbsaunde.org
>>>>> To: gcc@gcc.gnu.org
>>>>> Subject: Re: Proposal for adding splay_tree_find (to find elements 
>>>>> without updating the nodes).
>>>>>
>>>>> hi,
>>>>>
>>>>> I'm only commenting on algorithmic stuff at this point, you should make
>>>>> sure this doesn't regress anything in make check. This stuff only
>>>>> affects code using omp stuff so compiling random c++ is unlikely to test
>>>>> this code at all.
>>>>>
>>>>> Also please follow the style in
>>>>> https://gcc.gnu.org/codingconventions.html
>>>>> and usually try to make new code similar to what's around it.
>>>>>
>>>>> @@ -384,7 +386,7 @@ new_omp_context (enum omp_region_type region_type)
>>>>>
>>>>> c = XCNEW (struct gimplify_omp_ctx);
>>>>> c->outer_context = gimplify_omp_ctxp;
>>>>> - c->variables = splay_tree_new (splay_tree_compare_decl_uid, 0, 0);
>>>>> + //c->variables = splay_tree_new (splay_tree_compare_decl_uid, 0, 0);
>>>>>
>>>>> I don't think this is what you want, xcnew is a calloc wrapper and
>>>>> doesn't call the ctor for gimplify_omp_ctx. For now placement new is
>>>>> probably the simplest way to get what you want.
>>>>>
>>>> Thanks for pointing this out. I'll do it the way c->privatized_types has 
>>>> been allocated.
>>>> e.g., by making c->variables a pointer to std::map and c->variables = new 
>>>> gimplify_tree_t;
>>>>
>>>>
>>>>> -static void
>>>>> -delete_omp_context (struct gimplify_omp_ctx *c)
>>>>> -{
>>>>> - splay_tree_delete (c->variables);
>>>>> - delete c->privatized_types;
>>>>> - XDELETE (c);
>>>>> -}
>>>>>
>>>>> hm, why?
>>>>>
>>>> My bad, I'll restore this.
>>>>
>>>>> -gimplify_adjust_omp_clauses_1 (splay_tree_node n, void *data)
>>>>> +gimplify_adjust_omp_clauses_1 (std::pair n, void *data)
>>>>>
>>>>> You can now change the type of data from void * to const
>>>>> gimplify_adjust_omp_clauses_data *
>>>>
>>>> Done!
>>>>
>>>>
>>>> Thanks for the feedback, it was really helpful. I have updated the 
>>>> patch. Please review this.
>>>> Also, although I ran `make check` while compiling gcc (with bootstrap 
>>>> enabled), I'm not sure if 'omp' related tests were exercised.
>>>> I'm still unfamiliar with several components of gcc. Any pointers on how 
>>>> to ensure all tests were run would be useful.
>>>
>>> I'm not sure we want to use std::map. Can you use GCC's own hash_map
>>> here?
>>
>> OK, I'll try to use hash_map. I was under the impression that we can use 
>> standard library features, which is why I used std::map.
>>
> Using std::map caused a bootstrap build problem on AIX.
> see: https://gcc.gnu.org/ml/gcc-patches/2014-10/msg02608.html
> However I am not sure if that's true any more after the following fix
> was committed:
> https://gcc.gnu.org/ml/libstdc++/2014-10/msg00195.html
>

Thanks for letting me know.
-Aditya

> Regards,
> Prathamesh
>> Thanks,
>> -Aditya
>>
>>>
>>> Richard.
>>>
>>>>
>>>> -Aditya
>>>>
>>>>
>>>>
>>>>
>>>>>
>>>>> thanks!
>>>>>
>>>>> Trev
>>>>>
>>>>> On Fri, Mar 13, 2015 at 07:32:03PM +, Aditya K wrote:
>>>>>> You're right. I'll change this to:
>>>>>>
>>>>>> /* A stable comparison functor to sort trees. */
>>>>>> struct tree_compare_decl_uid {
>>>>>> bool operator ()(const tree &xa, const tree &xb) const
>>>>>> {
>>>>>> return DECL_UID (xa) < DECL_UID (xb);
>>>>>> }
>>>>>> };
>>>>>>
>>>>>> New patch attached.
>>>>>>
>>>>>>
>>>>>> Thanks,
>>>>>> -Aditya
>>>>>>
>>>>>>
>>>>>> 
>>>>>>> Date: Fri, 13 Mar 2015 19:02:11 +
>>>>>>> Subject: Re: Proposal for adding splay_tree_find (to find elements 
>>>>>>> without updating the nodes).
>>>>>>> From: jwakely@gmail.com
>>>>>>> To: hiradi...@msn.com
>>>>>>> CC: richard.guent...@gmail.com; stevenb@gmail.com; gcc@gcc.gnu.org
>>>>>>>
>>>>>>> Are you sure your compare_variables functor is correct?
>>>>>>>
>>>>>>> Subtracting the two values seems very strange for a strict weak 
>>>>>>> ordering.
>>>>>>>
>>>>>>> (Also "compare_variables" is a pretty poor name!)
>>>>>>
>>>>>
>>>>>
>>>>
>>>>
>>>>
>>
  

RE: Proposal for adding splay_tree_find (to find elements without updating the nodes).

2015-03-30 Thread Aditya K
So I have modified the patch to use hash_map instead of std::map. The patch is 
attached.

However, I got one regression after that.

# Comparing directories
## Dir1=../build-pristine/: 11 sum files
## Dir2=../build-test/: 11 sum files

# Comparing 11 common sum files
## /bin/sh ../contrib/compare_tests  /tmp/gxx-sum1.29214 /tmp/gxx-sum2.29214
Tests that now fail, but worked before:

c-c++-common/goacc/loop-private-1.c  -std=c++98  scan-tree-dump-times gimple 
"#pragma acc loop collapse\\(2\\) private\\(j\\) private\\(i\\)" 1
c-c++-common/goacc/loop-private-1.c scan-tree-dump-times gimple "#pragma acc 
loop collapse\\(2\\) private\\(j\\) private\\(i\\)" 1

## Differences found: 
# 1 differences in 11 common sum files found


The failure is due to a mis-comparison caused by the reordering of private(i) 
and private(j).

I wanted to know if the order of the private flags matters.


Thanks,
-Aditya



> From: hiradi...@msn.com
> To: prathamesh.kulka...@linaro.org
> CC: richard.guent...@gmail.com; tbsau...@tbsaunde.org; gcc@gcc.gnu.org
> Subject: RE: Proposal for adding splay_tree_find (to find elements without 
> updating the nodes).
> Date: Wed, 18 Mar 2015 18:14:49 +
>
>
>
> 
>> Date: Wed, 18 Mar 2015 22:01:18 +0530
>> Subject: Re: Proposal for adding splay_tree_find (to find elements without 
>> updating the nodes).
>> From: prathamesh.kulka...@linaro.org
>> To: hiradi...@msn.com
>> CC: richard.guent...@gmail.com; tbsau...@tbsaunde.org; gcc@gcc.gnu.org
>>
>> On 18 March 2015 at 21:20, Aditya K  wrote:
>>>
>>>
>>> 
>>>> Date: Wed, 18 Mar 2015 11:50:16 +0100
>>>> Subject: Re: Proposal for adding splay_tree_find (to find elements without 
>>>> updating the nodes).
>>>> From: richard.guent...@gmail.com
>>>> To: hiradi...@msn.com
>>>> CC: tbsau...@tbsaunde.org; gcc@gcc.gnu.org
>>>>
>>>> On Mon, Mar 16, 2015 at 4:33 AM, Aditya K  wrote:
>>>>>
>>>>>
>>>>> 
>>>>>> Date: Sun, 15 Mar 2015 02:32:23 -0400
>>>>>> From: tbsau...@tbsaunde.org
>>>>>> To: gcc@gcc.gnu.org
>>>>>> Subject: Re: Proposal for adding splay_tree_find (to find elements 
>>>>>> without updating the nodes).
>>>>>>
>>>>>> hi,
>>>>>>
>>>>>> I'm only commenting on algorithmic stuff at this point, you should make
>>>>>> sure this doesn't regress anything in make check. This stuff only
>>>>>> affects code using omp stuff so compiling random c++ is unlikely to test
>>>>>> this code at all.
>>>>>>
>>>>>> Also please follow the style in
>>>>>> https://gcc.gnu.org/codingconventions.html
>>>>>> and usually try to make new code similar to what's around it.
>>>>>>
>>>>>> @@ -384,7 +386,7 @@ new_omp_context (enum omp_region_type region_type)
>>>>>>
>>>>>> c = XCNEW (struct gimplify_omp_ctx);
>>>>>> c->outer_context = gimplify_omp_ctxp;
>>>>>> - c->variables = splay_tree_new (splay_tree_compare_decl_uid, 0, 0);
>>>>>> + //c->variables = splay_tree_new (splay_tree_compare_decl_uid, 0, 0);
>>>>>>
>>>>>> I don't think this is what you want, xcnew is a calloc wrapper and
>>>>>> doesn't call the ctor for gimplify_omp_ctx. For now placement new is
>>>>>> probably the simplest way to get what you want.
>>>>>>
>>>>> Thanks for pointing this out. I'll do it the way c->privatized_types has 
>>>>> been allocated.
>>>>> e.g., by making c->variables a pointer to std::map and c->variables = new 
>>>>> gimplify_tree_t;
>>>>>
>>>>>
>>>>>> -static void
>>>>>> -delete_omp_context (struct gimplify_omp_ctx *c)
>>>>>> -{
>>>>>> - splay_tree_delete (c->variables);
>>>>>> - delete c->privatized_types;
>>>>>> - XDELETE (c);
>>>>>> -}
>>>>>>
>>>>>> hm, why?
>>>>>>
>>>>> My bad, I'll restore this.
>>>>>
>>>>>> -gimplify_adjust_omp_clauses_1 (splay_tree_node n, void *data)
>>>>>> +gimplify_ad

RE: Proposal for adding splay_tree_find (to find elements without updating the nodes).

2015-03-31 Thread Aditya K



> From: tho...@codesourcery.com
> To: hiradi...@msn.com
> CC: richard.guent...@gmail.com; tbsau...@tbsaunde.org; gcc@gcc.gnu.org; 
> prathamesh.kulka...@linaro.org
> Subject: RE: Proposal for adding splay_tree_find (to find elements without 
> updating the nodes).
> Date: Tue, 31 Mar 2015 09:09:24 +0200
>
> Hi!
>
> On Mon, 30 Mar 2015 22:28:41 +, Aditya K  wrote:
>> So I have modified the patch to use hash_map instead of std::map. The patch 
>> is attached.
>>
>> However, I got one regression after that.
>>
>> # Comparing directories
>> ## Dir1=../build-pristine/: 11 sum files
>> ## Dir2=../build-test/: 11 sum files
>>
>> # Comparing 11 common sum files
>> ## /bin/sh ../contrib/compare_tests /tmp/gxx-sum1.29214 /tmp/gxx-sum2.29214
>> Tests that now fail, but worked before:
>>
>> c-c++-common/goacc/loop-private-1.c -std=c++98 scan-tree-dump-times gimple 
>> "#pragma acc loop collapse\\(2\\) private\\(j\\) private\\(i\\)" 1
>> c-c++-common/goacc/loop-private-1.c scan-tree-dump-times gimple "#pragma acc 
>> loop collapse\\(2\\) private\\(j\\) private\\(i\\)" 1
>>
>> ## Differences found:
>> # 1 differences in 11 common sum files found
>>
>>
>> The failure is due to a mis-comparison caused by the reordering of 
>> private(i) and private(j).
>>
>> I wanted to know if the order of the private flags matters.
>
> It doesn't matter. (I have not reviewed the proposed changes/patches
> themselves.)
>

Previously, when I replaced splay_tree with std::map, there were no 
regressions. I think that is because std::map iterators traverse the elements 
in a deterministic (sorted) order, whereas hash_map iterators may not.
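
A rough illustration of that difference, as a sketch only: GCC's hash_map is 
not usable outside the compiler, so std::unordered_map stands in for it here, 
and the string keys "i" and "j" are placeholders for the decls involved. All 
function names are made up for the example.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <unordered_map>
#include <vector>

// Collect the keys of any map-like container in its iteration order.
template <typename Map>
std::vector<std::string>
keys_in_iteration_order (const Map &m)
{
  std::vector<std::string> keys;
  for (const auto &entry : m)
    keys.push_back (entry.first);
  return keys;
}

// std::map iterates in key order regardless of insertion order.
inline std::vector<std::string>
sorted_clause_names ()
{
  std::map<std::string, int> ordered = { { "j", 2 }, { "i", 1 } };
  return keys_in_iteration_order (ordered);
}

// A hash table gives no ordering guarantee; only the contents are fixed.
inline std::vector<std::string>
hashed_clause_names ()
{
  std::unordered_map<std::string, int> hashed = { { "j", 2 }, { "i", 1 } };
  return keys_in_iteration_order (hashed);
}

inline void
iteration_order_demo ()
{
  std::vector<std::string> sorted = sorted_clause_names ();
  assert (sorted.size () == 2 && sorted[0] == "i" && sorted[1] == "j");
  // Both keys are present, but their relative order is unspecified.
  assert (hashed_clause_names ().size () == 2);
}
```

A test that matches the dump text literally, as loop-private-1.c does, will 
therefore be sensitive to a switch from a sorted container to a hashed one.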

Thanks,
-Aditya

>
> Grüße,
> Thomas
  

RE: AutoFDO profile toolchain is open-sourced

2015-04-21 Thread Aditya K
After patching Linux perf, this script collects a profile and creates a 
coverage file (e.g., for linpack) which can be used for FDO.


gcov=linpack-x86.gcov
MAKE='make'
# SRC is not set in this snippet; it must point at the tree that contains
# SingleSource/Benchmarks/Linpack.


# x86: build linpack, record branch samples with perf, convert them to gcov.
x86() {
CC=/usr/bin/gcc
CXX=/usr/bin/g++

export CFLAGS="-Ofast -g3 -static"
export CPPFLAGS="$CFLAGS"

$MAKE -C "$SRC/SingleSource/Benchmarks/Linpack" clean

$MAKE -C "$SRC/SingleSource/Benchmarks/Linpack" -k TEST=simple \
    TARGET_LLVMGCC=$CC TARGET_CXX=$CXX LLI_OPTFLAGS= TARGET_CC=$CC \
    TARGET_LLVMGXX=$CXX CC_UNDER_TEST_IS_GCC=1 TARGET_FLAGS= \
    USE_REFERENCE_OUTPUT=1 CC_UNDER_TEST_TARGET_IS_AARCH64=1 OPTFLAGS= \
    LLC_OPTFLAGS= ENABLE_OPTIMIZED=1 ARCH=x86_64 \
    ENABLE_HASHED_PROGRAM_OUTPUT=1 DISABLE_JIT=1

perfdata=autofdo-linpack/perf-x86.data

perf record -b -e branch-instructions -o "$perfdata" \
    "$SRC/SingleSource/Benchmarks/Linpack/Output/linpack-pc.simple"

autofdo/usr/bin/create_gcov \
    --binary="$SRC/SingleSource/Benchmarks/Linpack/Output/linpack-pc.simple" \
    --profile="$perfdata" --gcov="$gcov"

}


hth,
-Aditya


> From: a...@firstfloor.org
> To: i.palac...@samsung.com
> CC: dnovi...@google.com; gcc@gcc.gnu.org; davi...@google.com; hubi...@ucw.cz; 
> seb...@gmail.com; de...@google.com; v.bari...@samsung.com
> Subject: Re: AutoFDO profile toolchain is open-sourced
> Date: Tue, 21 Apr 2015 07:25:10 -0700
>
> Ilya Palachev  writes:
>>
>> But why create_gcov does not inform about that (no branch events were
>> found)? It creates empty gcov file and says nothing :(
>>
>> Moreover, in the mentioned README it is said that perf should also be
>> executed with option -e BR_INST_RETIRED:TAKEN.
>
> Standard perf doesn't have a full event list
> This assumes a perf patched with the libpfm patch.
>
> Also I suspect it really wants to use PEBS events, so pp should be added.
>
> Alternatively you can use ocperf (from
> http://github.com/andikleen/pmu-tools) which is just a wrapper:
>
> ocperf.py record -e br_inst_retired.near_taken:pp -b ...
>
> or specify the event manually (depending on your CPU, like)
>
> perf record -e
> cpu/event=0xc4,umask=0x20,name=br_inst_retired_near_taken,period=49/pp
> -b ...
>
> BTW the biggest problem with autofdo currently is that it is
> quite bitrotten and supports only several years old perf.
> So all of this above will only work with old distributions,
> unless you compile an old perf utility first.
>
> -Andi
>
> --
> a...@linux.intel.com -- Speaking for myself only
  

RE: Compiler warnings while compiling gcc with clang‏

2015-05-05 Thread Aditya K



> CC: hiradi...@msn.com; gcc@gcc.gnu.org
> From: pins...@gmail.com
> Subject: Re: Compiler warnings while compiling gcc with clang‏
> Date: Tue, 5 May 2015 01:11:38 -0700
> To: renato.go...@linaro.org
>
>
>
>
>
>> On May 5, 2015, at 1:00 AM, Renato Golin  wrote:
>>
>>> On 5 May 2015 at 05:58, Andrew Pinski  wrote:
>>> These two are bogus and really clang in GCC's mind. The main reason
>>> is the standard says struct and class are the same thing.
>>
>> Apart from the fact that classes are private by default and structs
>> are not. They may be similar for layout purposes, and it may be ok to
>> interchange them on local re-declarations when the compiler doesn't
>> need the type completely defined, but they're not the same thing.
>
> Read the standard again. They are the same. The standard is very clear they 
> are the same.
>
>>
>> The compiler might be smart and use the protection model that the
>> original declaration used (private/public), but what that warning is
>> saying is that you have refactored your code to include classes and
>> forgot to update all uses, which is a very valid warning. I can't see
>> why one would *want* to keep the "struct" keyword. If you're already
>> compiling in C++ mode, removing it from variable/argument declarations
>> should be valid, and re-declaring incomplete types should be made as
>> class.
>
>
> No the warning is there to try to warn people about microsoft's c++ and 
> nothing else.

At least for consistency/maintainability purposes it would be very useful to 
have either all structs or all classes.

There are, however, other differences between class and struct 
(http://stackoverflow.com/a/999810/811335), namely:

1. In the absence of an access-specifier for a base class, public is assumed 
when the derived class is declared struct and private is assumed when the 
class is declared class.

2. class can be used in place of typename to declare a template parameter, 
while struct cannot.
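
A short sketch of those two points follows; all type and function names are 
invented for the example and have nothing to do with the GCC sources.

```cpp
#include <cassert>
#include <type_traits>

struct base_sketch
{
  int value = 7;
};

struct derived_as_struct : base_sketch {};   // base is public by default
class derived_as_class : base_sketch {};     // base is private by default

// 'class' introduces a type template parameter exactly like 'typename';
// writing 'struct' in this position is not accepted.
template <class T>
constexpr bool
converts_to_base ()
{
  return std::is_convertible<T *, base_sketch *>::value;
}

// With a public base the pointer conversion is allowed; with a private
// base it is inaccessible from outside the class.
static_assert (converts_to_base<derived_as_struct> (),
               "public base: conversion allowed");
static_assert (!converts_to_base<derived_as_class> (),
               "private base: conversion not accessible");

inline int
read_through_public_base ()
{
  derived_as_struct d;
  assert (d.value == 7);   // reachable because both base and member are public
  return d.value;
}
```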

-Aditya

>
> Thanks,
> Andrew
>
>>
>> cheers,
>> --renato
  

RE: Compiler warnings while compiling gcc with clang‏

2015-05-05 Thread Aditya K
contains 1 element) [-Warray-bounds]
../../gcc/final.c:3957:8: warning: array index 1 is past the end of the array 
(which contains 1 element) [-Warray-bounds]
          if (CONST_DOUBLE_HIGH (x))
              ^
../../gcc/rtl.h:1757:30: note: expanded from macro 'CONST_DOUBLE_HIGH'
#define CONST_DOUBLE_HIGH(r) XCMWINT (r, 1, CONST_DOUBLE, VOIDmode)
                             ^           ~
../../gcc/rtl.h:1123:36: note: expanded from macro 'XCMWINT'
#define XCMWINT(RTX, N, C, M)       ((RTX)->u.hwint[N])
                                     ^
../../gcc/rtl.h:397:5: note: array 'hwint' declared here
    HOST_WIDE_INT hwint[1];
    ^
../../gcc/hwint.h:54:26: note: expanded from macro 'HOST_WIDE_INT'
#   define HOST_WIDE_INT long
                         ^



gcc/cse.c:6171:38: warning: array index 1 is past the end of the array (which 
contains 1 element) [-Warray-bounds]
../../gcc/cse.c:6171:38: warning: array index 1 is past the end of the array 
(which contains 1 element) [-Warray-bounds]
            || (CONST_DOUBLE_P (new_rtx) && CONST_DOUBLE_HIGH (new_rtx)>= 0))
                                            ^~~
../../gcc/rtl.h:1757:30: note: expanded from macro 'CONST_DOUBLE_HIGH'
#define CONST_DOUBLE_HIGH(r) XCMWINT (r, 1, CONST_DOUBLE, VOIDmode)
                             ^           ~
../../gcc/rtl.h:1123:36: note: expanded from macro 'XCMWINT'
#define XCMWINT(RTX, N, C, M)       ((RTX)->u.hwint[N])
                                     ^
../../gcc/rtl.h:397:5: note: array 'hwint' declared here
    HOST_WIDE_INT hwint[1];
    ^
../../gcc/hwint.h:54:26: note: expanded from macro 'HOST_WIDE_INT'
#   define HOST_WIDE_INT long
                         ^



gcc/gcov-tool.c:225:7: warning: variable 'ret' is used uninitialized whenever 
'if' condition is false [-Wsometimes-uninitialized]
gcc/gcov-tool.c:493:7: warning: variable 'ret' is used uninitialized whenever 
'if' condition is false [-Wsometimes-uninitialized]
../../gcc/gcov-tool.c:225:7: warning: variable 'ret' is used uninitialized 
whenever 'if' condition is false [-Wsometimes-uninitialized]
  if (argc - optind == 2)
      ^~
../../gcc/gcov-tool.c:230:10: note: uninitialized use occurs here
  return ret;
         ^~~
../../gcc/gcov-tool.c:225:3: note: remove the 'if' if its condition is always 
true
  if (argc - optind == 2)
  ^~~
../../gcc/gcov-tool.c:196:10: note: initialize the variable 'ret' to silence 
this warning
  int ret;
         ^
          = 0
../../gcc/gcov-tool.c:493:7: warning: variable 'ret' is used uninitialized 
whenever 'if' condition is false [-Wsometimes-uninitialized]
  if (argc - optind == 2)
      ^~
../../gcc/gcov-tool.c:498:10: note: uninitialized use occurs here
  return ret;
         ^~~
../../gcc/gcov-tool.c:493:3: note: remove the 'if' if its condition is always 
true
  if (argc - optind == 2)
  ^~~
../../gcc/gcov-tool.c:462:10: note: initialize the variable 'ret' to silence 
this warning
  int ret;
         ^
          = 0


I think I can fix a few of these if we want them fixed.
For some, e.g. (gcc/cse.c:6171:38: warning: array index 1 is past the end 
of the array (which contains 1 element)),
I have no idea what the proper fix is.

-Aditya


> Date: Tue, 5 May 2015 21:57:08 +0100
> Subject: Re: Compiler warnings while compiling gcc with clang‏
> From: jwakely@gmail.com
> To: hiradi...@msn.com
> CC: pins...@gmail.com; renato.go...@linaro.org; gcc@gcc.gnu.org
>
> On 5 May 2015 at 12:39, Aditya K wrote:
>> There are however, other differences between class and struct 
>> (http://stackoverflow.com/a/999810/811335) i.e.,
>>
>> 1. In absence of an access-specifier for a base class, public is assumed 
>> when the derived class is declared struct and private is assumed when the 
>> class is declared class.
>
> Yes, everyone here knows that. That is only relevant to the definition
> of the class, which can only occur once. For the purposes of
> declarations that are not definitions there is no difference.
>
>> 2. class can be used in place of a typename to declare a template parameter, 
>> while the struct cannot.
>
> Completely irrelevant in this context. The use of 'class' in a
> template parameter list has nothing to do with struct or class types,
> nor forward declarations of struct or class types.
  

RE: Compiler warnings while compiling gcc with clang‏

2015-05-05 Thread Aditya K



> CC: jwakely@gmail.com; renato.go...@linaro.org; gcc@gcc.gnu.org
> From: pins...@gmail.com
> Subject: Re: Compiler warnings while compiling gcc with clang‏
> Date: Tue, 5 May 2015 20:19:04 -0700
> To: hiradi...@msn.com
>
>
>
>
>
>> On May 5, 2015, at 8:13 PM, Aditya K  wrote:
>>
>> So, I analyzed other warnings and following is the list of relevant warning 
>> that I could collect. Hope this is useful.
>>
>>
>> gcc/ipa-icf.c:508:12: warning: logical not is only applied to the left hand 
>> side of this comparison
>> ../../gcc/ipa-icf.c:508:12: warning: logical not is only applied to the left 
>> hand side of this comparison [-Wlogical-not-parentheses]
>> if ((!type == FUNC || address || !opt_for_fn (decl, optimize_size))
>>
>> gcc/expr.c:5271:9: warning: comparison of constant -1 with expression of 
>> type 'unsigned int' is always false 
>> [-Wtautological-constant-out-of-range-compare]
>> ../../gcc/expr.c:5271:9: warning: comparison of constant -1 with expression 
>> of type 'unsigned int' is always false 
>> [-Wtautological-constant-out-of-range-compare]
>> if (!SUBREG_CHECK_PROMOTED_SIGN (target,
>> ^~~
>>
>> There was a similar bug posted some time ago 
>> (https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61271)
>> 
>>
>> gcc/reload1.c:470:28: warning: incrementing expression of type bool is 
>> deprecated [-Wdeprecated-increment-bool]
>> ../../gcc/reload1.c:470:28: warning: incrementing expression of type bool is 
>> deprecated [-Wdeprecated-increment-bool]
>> spill_indirect_levels++;
>> ~^
>>
>> Seems like for this bug we need to change the declaration of
>>
>> bool this_target_reload::x_spill_indirect_levels to an int. Even the comment 
>> there mentions that this variable might take
>> other integral values.
>>
>> /* Nonzero if indirect addressing is supported on the machine; this means
>> that spilling (REG n) does not require reloading it into a register in
>> order to do (MEM (REG n)) or (MEM (PLUS (REG n) (CONST_INT c))). The
>> value indicates the level of indirect addressing supported, e.g., two
>> means that (MEM (MEM (REG n))) is also valid if (REG n) does not get
>> a hard register. */
>> bool x_spill_indirect_levels;
>>
>>
>> 
>>
>> gcc/rtlanal.c:5573:23: warning: array index 1 is past the end of the array 
>> (which contains 1 element) [-Warray-bounds]
>> ../../gcc/rtlanal.c:5573:23: warning: array index 1 is past the end of the 
>> array (which contains 1 element) [-Warray-bounds]
>> *second = GEN_INT (CONST_DOUBLE_HIGH (value));
>> ^
>
> These warnings are bogus due to the array being the last element of the 
> structure.
>
> Please file that with clang.
>

IIRC, C++ does not allow flexible array members.

Thanks,
-Aditya


> Thanks,
> Andrew
>
>
>
>>
>> ../../gcc/rtl.h:1757:30: note: expanded from macro 'CONST_DOUBLE_HIGH'
>> #define CONST_DOUBLE_HIGH(r) XCMWINT (r, 1, CONST_DOUBLE, VOIDmode)
>> ^ ~
>> ../../gcc/rtl.h:1123:36: note: expanded from macro 'XCMWINT'
>> #define XCMWINT(RTX, N, C, M) ((RTX)->u.hwint[N])
>> ^
>> ../../gcc/rtl.h:3193:51: note: expanded from macro 'GEN_INT'
>> #define GEN_INT(N) gen_rtx_CONST_INT (VOIDmode, (N))
>> ^
>> ../../gcc/rtl.h:397:5: note: array 'hwint' declared here
>> HOST_WIDE_INT hwint[1];
>> ^
>> ../../gcc/hwint.h:54:26: note: expanded from macro 'HOST_WIDE_INT'
>> # define HOST_WIDE_INT long
>> ^
>>
>> 
>>
>> gcc/vec.h:1048:10: warning: offset of on non-POD type 'vec_embedded' (aka 
>> 'vec') [-Winvalid-offsetof]
>> ../../gcc/vec.h:1048:10: warning: offset of on non-POD type 'vec_embedded' 
>> (aka 'vec, va_heap, vl_embed>') 
>> [-Winvalid-offsetof]
>> return offsetof (vec_embedded, m_vecdata) + alloc * sizeof (T);
>> ^ ~
>> /home/hiraditya/work/llvm/install-release/bin/../lib/clang/3.7.0/include/stddef.h:120:24:
>>  note: expanded from macro 'offsetof'
>> #def

RE: AutoFDO profile toolchain is open-sourced

2015-05-08 Thread Aditya K



> Date: Fri, 8 May 2015 11:19:12 -0700
> Subject: Re: AutoFDO profile toolchain is open-sourced
> From: de...@google.com
> To: i.palac...@samsung.com
> CC: davi...@google.com; hubi...@ucw.cz; gcc@gcc.gnu.org; 
> v.bari...@samsung.com; dnovi...@google.com; seb...@gmail.com
>
> On Fri, May 8, 2015 at 2:00 AM, Ilya Palachev  wrote:
>> On 11.04.2015 01:49, Xinliang David Li wrote:
>>>
>>> On Fri, Apr 10, 2015 at 3:43 PM, Jan Hubicka  wrote:
>
> LBR is used for both cfg edge profiling and indirect call Target value
> profiling.

 I see, that makes sense ;) I guess if we want to support profile
 collection
 on targets w/o this feature we could still use one of the algorithms that
 try to guess edge profile from BB profile.
>>>
>>> Our experience with sampling cycles or retired instructions to guess
>>> BB profile has not been great -- the profile quality is significantly
>>> worse than LBR (which can almost match instrumentation based profile).
>>
>> Suppose that I have no opportunity to collect profile on x86 architecture
>> with LBR support and the only available architecture is arm/aarch64 (since
>> the application code is significantly different when compiled for different
>> architectures because of manual optimizations and different function names
>> and structure).
>
> If it's already manually tuned towards architecture (or even
> hand-written inlined-assembly), then I don't think FDO/AutoFDO can
> help much.
>
>>
>> Honza has mentioned that it's possible to guess edge profile from BB
>> profile. How do you think, can this help in the above described situation?
>> Yes, this will be much worse than LBR, but can it give any performance
>> benefit compared with no edge profile at all?
>
> Yes, it will. But it's not well tuned at all. I will start tuning it
> if I have free cycles. It would be great if opensource community can
> also contribute to this tuning effort.

If you could outline the portions of the code that need tuning or rewriting, that 
would help us get started on this effort.

Thanks,
-Aditya


>
> Cheers,
> Dehao
>
>>
>> --
>> Ilya
  

RE: AutoFDO profile toolchain is open-sourced

2015-05-12 Thread Aditya K
Recently we found an ICE while compiling a program with auto-fdo 
(https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65972).
The ICE occurred because SSA was not in a valid state when the early inliner 
ran. The fix was to call update_ssa before running the early inliner 
(https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65972#c4).
However, we still need to find out which pass left the SSA in that 
state; fixing the problem there might be more appropriate.


-Aditya



> Date: Sat, 9 May 2015 16:33:02 +0200
> From: hubi...@ucw.cz
> To: hiradi...@msn.com
> CC: de...@google.com; i.palac...@samsung.com; davi...@google.com; 
> hubi...@ucw.cz; gcc@gcc.gnu.org; v.bari...@samsung.com; dnovi...@google.com; 
> seb...@gmail.com
> Subject: Re: AutoFDO profile toolchain is open-sourced
>
>>> Yes, it will. But it's not well tuned at all. I will start tuning it
>>> if I have free cycles. It would be great if opensource community can
>>> also contribute to this tuning effort.
>>
>> If you could outline portions of code which needs tuning, rewriting, that 
>> will help get started in this effort.
>
> Optimization passes in GCC are generally designed to work with any kind of 
> edge profile they get.
> There are only few cases where they do care about what profile is around.
>
> At the moment we consider two types of profiles - static (guessed) and FDO. 
> For
> static one we shut down use of profile info for some heuristics - for example
> we do not expect loop trip counts to be reliable in the profiles because they
> are not. You can look for code checking profile_status_for_fn.
>
> Auto-FDO does not have a special value for profile_status_for_fn and it goes
> through the same code paths as FDO. Dehao has some patches for Auto-FDO tuning,
> but my impression is that he mostly got around this by just making the optimizer
> a bit more robust to nonsensical profiles, which is always good, since even FDO
> profiles can be wrong. BTW, Dehao, do you think you can submit these changes
> for this stage1?
>
> I suppose in this case we have yet another kind of profile that is less 
> reliable than
> FDO and we need to start by simply benchmarking and looking for cases where 
> this profile
> gets worse and handle them one by one :)
>
> Honza
  

returning struct or union with just double on Win32/x86

2018-12-04 Thread Jay K
typedef struct { double d; } Struct;


Struct f1 ()
{ 
Struct res = {3.0};
return res;
}

typedef union { double d; } Union;


Union f2 ()
{
Union res = {3.0};
return res;
}

x86 mingw 7.3.0

The first returns in ST0, the  second in edx:eax.

Msvc returns first in edx:eax.

Seems like a bug?

Thank you,
 - Jay

computed goto, const, truncated int, C++, etc.

2019-07-31 Thread Jay K
computed goto.

The documentation advertises read-only relative addresses.

Like this:

https://gcc.gnu.org/onlinedocs/gcc/Labels-as-Values.html

```
static const int array[] = { &&foo - &&foo, &&bar - &&foo,
                             &&hack - &&foo };
goto *(&&foo + array[i]);
```

However, this doesn't quite work and is a little flawed.

What one wants is:

1. int or int32_t, as stated.
1b. Or a target-specific type provided by the compiler
that encompasses the largest distance in a function.
But in reality that is int/int32_t.

Not blow the space unnecessarily on a 64-bit integer, if executables and 
therefore distances within functions are limited to 32 bits (I realize there is 
a signedness problem hypothetically, but ultimately I expect an assembler or 
linker warning for the label math overflow).

2. Syntax that works in C and C++.
And is truly const, no dynamic initializer.
This is crucial.

3. Preferably without casting.
But if I must, ok.


4. Instead of relative to a label, I should be able to use relative
to the array itself. Which then only allows a single ampersand.
Double might be nice, but whatever works.
I think that might save an instruction.
It isn't critical.

5. 32bit and 64bit.
Crucial.


Many combinations do work, but you sometimes have to cast
to char* or int or size_t.
Sometimes have to narrow.
Only sometimes can use the address of the array.
Not always valid C and C++.

And not all combinations do work.
We have code that compiles as C or C++, unless/until we decide
to use C++, and I couldn't make it work across the board.

But now I'll see if the code is really any better than switch...
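
For reference, here is a minimal sketch of the documented form wrapped in a complete function. It is GNU C only (it relies on the labels-as-values extension) and is compiled as C, not C++. The function name classify and the label names are made up for illustration, and I am assuming the int-sized offsets are accepted as static initializers on the target, as in the documentation's own example.

```c
/* GNU C only: uses the "labels as values" extension (&&label, goto *). */
int classify(int i)
{
    /* Offset of each target label from the base label `zero`,
       stored as int rather than as full pointers. */
    static const int offsets[] = {
        &&zero - &&zero,
        &&one  - &&zero,
        &&many - &&zero,
    };

    /* Clamp the index so the table lookup stays in range. */
    if (i < 0)
        i = 0;
    if (i > 2)
        i = 2;

    goto *(&&zero + offsets[i]);

zero:
    return 0;
one:
    return 10;
many:
    return 20;
}
```

The table holds no absolute addresses, so it needs no load-time relocation and can live in read-only data. That is the property the documentation advertises.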

 - Jay

Wrong header included in cross build with asymmetric GCC versions

2017-05-25 Thread K. B.
A non-bootstrapped build:
build triplet is x86_64-linux-gnu
host,target triplet is x86_64-none-linux-musl
host compiler is gcc version 6.3.0

./configure --prefix=/usr --sysconfdir=/etc
--target=x86_64-none-linux-musl --host=x86_64-none-linux-musl
--libexecdir=/usr/lib --enable-threads=posix --libdir=/usr/lib
--disable-nls --with-sysroot=/ --enable-libstdcxx-time=yes
--with-default-libstdcxx-abi=new --with-system-zlib
--with-target-system-zlib --with-tune=generic

libgcc/config/i386/cpuinfo.c fails on undeclared identifiers because the
included "cpuid.h" ends up being incorrectly sourced from the 6.3.0
host compiler.
I believe this error is not unique to this file and exposes a larger
flaw within the build system in the handling of include paths for
non-bootstrapping/cross builds, one that otherwise goes undetected until a
header mismatch of this sort occurs.


make[3]: Leaving directory
'/home/deb/src/build/buildroot-x86/custom/gcc-7.1.0/x86_64-none-linux-musl/libgcc'
x86_64-none-linux-musl-cc   -g -O2 -O2  -g -O2 -DIN_GCC-W -Wall
-Wno-narrowing -Wwrite-strings -Wcast-qual -Wstrict-prototypes
-Wmissing-prototypes -Wold-style-definition  -isystem ./include
-fpic -mlong-double-80 -DUSE_ELF_SYMVER -g -DIN_LIBGCC2
-fbuilding-libgcc -fno-stack-protector   -fpic -mlong-double-80
-DUSE_ELF_SYMVER -I. -I. -I../../host-x86_64-none-linux-musl/gcc
-I../.././libgcc -I../.././libgcc/. -I../.././libgcc/../gcc
-I../.././libgcc/../include  -DHAVE_CC_TLS  -DUSE_TLS -o cpuinfo.o -MT
cpuinfo.o -MD -MP -MF cpuinfo.dep  -c
../.././libgcc/config/i386/cpuinfo.c -fvisibility=hidden
-DHIDE_EXPORTS
../.././libgcc/config/i386/cpuinfo.c: In function ‘get_available_features’:
../.././libgcc/config/i386/cpuinfo.c:278:17: error:
‘bit_AVX512VPOPCNTDQ’ undeclared (first use in this function)
   if (ecx & bit_AVX512VPOPCNTDQ)
 ^~~
../.././libgcc/config/i386/cpuinfo.c:278:17: note: each undeclared
identifier is reported only once for each function it appears in
../.././libgcc/config/i386/cpuinfo.c:280:17: error: ‘bit_AVX5124VNNIW’
undeclared (first use in this function)
   if (edx & bit_AVX5124VNNIW)
 ^~~~
../.././libgcc/config/i386/cpuinfo.c:282:17: error: ‘bit_AVX5124FMAPS’
undeclared (first use in this function)
   if (edx & bit_AVX5124FMAPS)
 ^~~~
../.././libgcc/shared-object.mk:14: recipe for target 'cpuinfo.o' failed
make[2]: *** [cpuinfo.o] Error 1
make[2]: Leaving directory
'/home/deb/src/build/buildroot-x86/custom/gcc-7.1.0/x86_64-none-linux-musl/libgcc'
Makefile:13316: recipe for target 'all-target-libgcc' failed
make[1]: *** [all-target-libgcc] Error 2
make[1]: Leaving directory '/home/deb/src/build/buildroot-x86/custom/gcc-7.1.0'
Makefile:897: recipe for target 'all' failed
make: *** [all] Error 2


extern const initialized warns in C

2018-01-20 Thread Jay K
extern const int foo = 123;



Why does this warn?
This is a valid portable form, with the same meaning
across all compilers, and, importantly, portably
to C and C++.

I explicitly do not want to say:

  const int foo = 123

because I want the code to be valid and have the same meaning
in C and C++ (modulo name mangling).

I end up with:

// Workaround gcc warning.
#ifdef __cplusplus
#define EXTERN_CONST extern const
#else
#define EXTERN_CONST const
#endif


EXTERN_CONST int foo = 123;

and having to explain it to people.

$ cat 1.c
extern const int foo = 123;
$ $HOME/gcc720/bin/gcc -c -S 1.c
1.c:1:18: warning: 'foo' initialized and declared 'extern'
 extern const int foo = 123;
  ^~~
$ $HOME/gcc720/bin/gcc -c -S -xc++ -Wall -pedantic 1
$ $HOME/gcc720/bin/gcc -v
Using built-in specs.

COLLECT_GCC=/Users/jay/gcc720/bin/gcc
COLLECT_LTO_WRAPPER=/Users/jay/gcc720/libexec/gcc/x86_64-apple-darwin16.7.0/7.2.0/lto-wrapper
Target: x86_64-apple-darwin16.7.0
Configured with: ../gcc-7.2.0/configure -prefix=/Users/jay/gcc720 -disable-nls 
-disable-bootstrap
Thread model: posix
gcc version 7.2.0 (GCC) $ 


Thank you,
 - Jay



 

Re: extern const initialized warns in C

2018-01-22 Thread Jay K

By this argument there is a missing warning for the equivalent:

  const int foo = 123;

with no previous extern declaration.

As well, there is no warning in C++.
All three constructs are equivalent, yet only one gets a warning.

Interesting point, that I had not realized, and with an often acceptable
workaround, however also there exist coding conventions that prohibit use of 
static.
Instead they "hide" things by omitting them from headers only.

That can still be worked around, just put the declaration right before the 
definition,
in the same source file.
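
A minimal sketch of that workaround, assuming everything lives in one source file (foo is just the example name from this thread):

```c
/* Declaration first: gives foo external linkage in both C and C++. */
extern const int foo;

/* Definition: "extern" and the initializer are no longer on the same
   declaration, so the C warning does not apply, and in C++ the earlier
   declaration keeps foo from becoming internal. */
const int foo = 123;
```

This compiles with the same meaning as C and as C++ (modulo name mangling), without the EXTERN_CONST macro.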

I realize there are many arguments for and against file level static.

 - Jay  


From: David Brown 
Sent: Monday, January 22, 2018 8:32 AM
To: Jay K; gcc
Subject: Re: extern const initialized warns in C
  

On 21/01/18 08:12, Jay K wrote:
> extern const int foo = 123;
> 
> 
> 
> Why does this warn?
> This is a valid portable form, with the same meaning
> across all compilers, and, importantly, portably
> to C and C++.
> 
> I explicitly do not want to say:
> 
>   const int foo = 123
> 
> because I want the code to be valid and have the same meaning
> in C and C++ (modulo name mangling).
> 
> I end up with:
> 
> // Workaround gcc warning.
> #ifdef __cplusplus
> #define EXTERN_CONST extern const
> #else
> #define EXTERN_CONST const
> #endif
> 
> 
> EXTERN_CONST int foo = 123;
> 
> and having to explain it to people.
> 

<https://gcc.gnu.org/bugzilla/show_bug.cgi?id=45977>


45977 – "warning: 'i' initialized and declared 'extern ...
gcc.gnu.org
GCC Bugzilla – Bug 45977 "warning: 'i' initialized and declared 'extern'" could 
use a separate warning flag controlling it Last modified: 2017-07-26 15:36:22 
UTC

This suggests that gcc authors consider mixing "extern" and
initialization to be such bad style that the compiler warns by default.
 But the "bug" is that there is no flag to turn off this warning.
(Ideally every warning should have a matching flag, even if the warning
is enabled by default.)

Usually you do not want to have "extern" and initialisation in the same
line - it indicates a questionable organisation of your sources which is
more likely to be error-prone than the standard idioms.  (I say
"questionable", not necessarily wrong - but certainly I would question
it if I saw it in source code.)

Normally you want:

// file.h
// declaration, not definition
extern const int foo;

// file.c
#include "file.h"
// definition
const int foo = 123;

// otherfile.c
#include "file.h"
int usefoo(void) { return foo; }


The key advantages of this sort of setup are a cleaner separation
between declarations (which you need to /use/ things) and the
definitions (which should normally only exist once in the program -
certainly for C).  The declarations and definitions only exist in one
place, and they are checked for consistency - there are no "extern"
declarations lying around in C files that might get out of step from
changes in the headers or other files with definitions.

To be consistent with this, and to work consistently with C and C++, I
have a strict policy that a C (or C++) file never contains  declarations
without definitions (and initialisations as needed), with each
definition either also declared as "extern" in a matching header file,
or it is declared as "static".

This sort of arrangement is very common - though many people are lazy
about using "static".  (In C++, you can also use anonymous namespaces,
but "static" works for consistency between C and C++.)


Still, gcc should have a flag to disable this warning if you have reason
to use "extern const int foo = 123;" - it is, after all, correctly
defined C code.



> $ cat 1.c
> extern const int foo = 123;
> $ $HOME/gcc720/bin/gcc -c -S 1.c
> 1.c:1:18: warning: 'foo' initialized and declared 'extern'
>  extern const int foo = 123;
>   ^~~
> $ $HOME/gcc720/bin/gcc -c -S -xc++ -Wall -pedantic 1$ $HOME/gcc720/bin/gcc -v
> Using built-in specs.
> 
> COLLECT_GCC=/Users/jay/gcc720/bin/gcc
> COLLECT_LTO_WRAPPER=/Users/jay/gcc720/libexec/gcc/x86_64-apple-darwin16.7.0/7.2.0/lto-wrapper
> Target: x86_64-apple-darwin16.7.0
> Configured with: ../gcc-7.2.0/configure -prefix=/Users/jay/gcc720 
> -disable-nls -disable-bootstrap
> Thread model: posix
> gcc version 7.2.0 (GCC) $ 
> 
> 
> Thank you,
>  - Jay
> 
> 
> 
>  
> 



Re: extern const initialized warns in C

2018-01-22 Thread Jay K
Also the warning did not include a link explaining the desired workaround.


Since you advocate for static...and I know it has big value..

There are the following reasons against static:

 - It is prohibited in some coding conventions.
    They instead hide symbols by omitting them from any headers.

 - It allows/encourages symbols duplicated from a human point of view,
   leading to harder to read code; but this is also the point and good,
   it offers scope to pick shorter names, or at least hide
   names (you can still strive for globally unique names, in
   case the symbols later have to be made extern)
   
 - it leads to accidental duplication, static int foo = 123 in a header

 - There are toolsets that don't resolve statics in disassembly

 - It only allows for sharing within a file and hiding from all others,
   it doesn't allow sharing for within a few files and hiding from others

 - It sort of doesn't work with "unity builds" (old-fashioned LTO/LTCG), where one
   source file includes the rest
   

   I think a linker switch to report symbols that could be static
   might be useful.
   
   I find the "scoping" too hard to pass it, and if I need to make
   the symbol extern in future, I can afford a rename to do it.
   

    - Jay




From: Jay K
Sent: Monday, January 22, 2018 9:31 AM
To: David Brown; gcc
Subject: Re: extern const initialized warns in C
  


By this argument there is a missing warning for the equivalent:

  const int foo = 123;

with no previous extern declaration.

As well, there is no warning in C++.
All three constructs are equivalent, yet only one gets a warning.

Interesting point, that I had not realized, and with an often acceptable
workaround, however also there exist coding conventions that prohibit use of 
static.
Instead they "hide" things by omitting them from headers only.

That can still be worked around, just put the declaration right before the 
definition,
in the same source file.

I realize there are many arguments for and against file level static.

 - Jay  


From: David Brown 
Sent: Monday, January 22, 2018 8:32 AM
To: Jay K; gcc
Subject: Re: extern const initialized warns in C
  

On 21/01/18 08:12, Jay K wrote:
> extern const int foo = 123;
> 
> 
> 
> Why does this warn?
> This is a valid portable form, with the same meaning
> across all compilers, and, importantly, portably
> to C and C++.
> 
> I explicitly do not want to say:
> 
>   const int foo = 123
> 
> because I want the code to be valid and have the same meaning
> in C and C++ (modulo name mangling).
> 
> I end up with:
> 
> // Workaround gcc warning.
> #ifdef __cplusplus
> #define EXTERN_CONST extern const
> #else
> #define EXTERN_CONST const
> #endif
> 
> 
> EXTERN_CONST int foo = 123;
> 
> and having to explain it to people.
> 

<https://gcc.gnu.org/bugzilla/show_bug.cgi?id=45977>


45977 – "warning: 'i' initialized and declared 'extern ...
gcc.gnu.org
GCC Bugzilla – Bug 45977 "warning: 'i' initialized and declared 'extern'" could 
use a separate warning flag controlling it Last modified: 2017-07-26 15:36:22 
UTC


This suggests that gcc authors consider mixing "extern" and
initialization to be such bad style that the compiler warns by default.
 But the "bug" is that there is no flag to turn off this warning.
(Ideally every warning should have a matching flag, even if the warning
is enabled by default.)

Usually you do not want to have "extern" and initialisation in the same
line - it indicates a questionable organisation of your sources which is
more likely to be error-prone than the standard idioms.  (I say
"questionable", not necessarily wrong - but certainly I would question
it if I saw it in source code.)

Normally you want:

// file.h
// declaration, not definition
extern const int foo;

// file.c
#include "file.h"
// definition
const int foo = 123;

// otherfile.c
#include "file.h"
int usefoo(void) { return foo; }


The key advantages of this sort of setup are a cleaner separation
between declarations (which you need to /use/ things) and the
definitions (which should normally only exist once in the program -
certainly for C).  The declarations and definitions only exist in one
place, and they are checked for consistency - there are no "extern"
declarations lying around in C files that might get out of step from
changes in the headers or other files with definitions.

To be consistent with this, and to work consistently with C and C++, I
have a strict policy that a C (or C++) file never con

Re: extern const initialized warns in C

2018-01-22 Thread Jay K
> I find the "scoping" too hard to pass it, and if I need to make
> the symbol extern in future, I can afford a rename to do it.


I mean, I actually like the ability to shorten file level static symbols.


As you point out, true, you can have it both ways: static does not force 
shortening, it merely allows it, and protects you from inadvertent duplication.


Really, the prohibition against file level static is used in large code bases.

You already have to choose unique extern names, and depend on the linker to 
diagnose duplicate externals for them.

Extending this by barring statics isn't necessarily significant.

Such code bases prefix every extern identifier with some "local" prefix, and 
any missing prefix is easy to spot and a stylistic mistake.

(i.e. local to the subsystem or directory -- I realize it is the very 
definition of "local" that makes or breaks this)


I understand that hiding by omission from headers is not hiding at the linker 
level.


I agree there are scalability problems with naming in C, but it isn't clear 
static helps significantly.


There is an interesting side effect though that I think is not very much 
appreciated.

Large C code bases are more amenable to plain text search than large C++ code 
bases, due to the "more uniqueness" of symbols.


This plain text search aspect is one of extremely few advantages I see to C 
over C++, perhaps the only one.


 - Jay



From: David Brown 
Sent: Monday, January 22, 2018 10:14 AM
To: Jay K; gcc
Subject: Re: extern const initialized warns in C

Hi,

I made some points in my other reply.  But for completeness, I'll tackle
these too.

On 22/01/2018 10:38, Jay K wrote:
> Also the warning did not include a link explaining the desired workaround.
>
>
> Since you advocate for static...and I know it has big value..
>
> There are the following reasons against static:
>
>   - It is prohibited in some coding conventions.
>  They instead hide symbols by omitting them from any headers.

As noted before, that is insane.  It gives no benefits but makes it easy
to cause mistakes that are hard to find.

>
>   - It allows/encourages symbols duplicated from a human point of view,
> leading to harder to read code; but this is also the point and good,
> it offers scope to pick shorter names, or at least hide
> names (you can still strive for globally unique names, in
> case the symbols later have to be made extern)

Omitting "static" also allows symbol duplication.  It just means that
such duplication is an error in the code - which may or may not be
caught at link time.

You /can/ have a coding convention that discourages duplicate symbol
names - even when using "static".  That might help a little in
understanding, but will quickly mean bloated source code that is harder
to read and follow (because you end up with long-winded symbol names
everywhere).

Such conventions are not scalable, are hopeless for multi-programmer
projects, terrible for code re-use, and can make code far harder to read
and write.

The scoping and naming in C is limited enough without omitting half the
features it has to deal with modularisation.

>
>   - it leads to accidental duplication, static int foo = 123 in a header

It is quite simple - don't do that.  It is appropriate for constant data
- "static const int foo = 123;" in a header will be fine, because "foo"
has the same value everywhere and is likely to be "optimised away".
That is the reason C++ makes "const int foo = 123;" effectively static.

Headers (in C, and mostly in C++) are for /declarations/, not
definitions - at least if you want to write structured and modular code.

>
>   - There are toolsets that don't resolve statics in disassembly

Statics are local to the file.  Disassemblies should show them when they
are used.  For the tiny, tiny proportion of C programmers that ever use
a disassembler, if their toolchains are not good enough then they should
get better toolchains.  It should /never/ be a problem when using
assembly listing files generated by the compiler, which are almost
always more useful than disassembling object code.

Making a coding convention to suit this requirement is like making
gloves with 6 fingers so that they fit people with an extra digit.

>
>   - It only allows for sharing within a file and hiding from all others,
> it doesn't allow sharing for within a few files and hiding from others

C has no good way to allow sharing between a few files and hiding from
others.  Such shared identifiers must be program-wide global.  But that
does /not/ mean you should make /everything/ program-wide global!  It
means you should minimise such sharing, prefix such shared names in a
way likely to minimise conflicts, and organise your source c

Re: extern const initialized warns in C

2018-01-22 Thread Jay K
 > If you put static (non-const)
 > variables in your header files, you have misunderstood how to use header
 > files in C programming.


 Not me, and usually const, but people do it, both.
 Even the consts can get duplicated.
 Even the simple C++
   const int one = 1;


 I can take the address of.
 It is also unusual and maybe dumb. There are too many programmers
 writing C and C++ with too little oversight to rule these out.


 The static prohibition might be too closed to identity.


 I understand about the local edit, but it behooves
 one to make the non-edited build debuggable too.


 I know you can only go so far, can't litter it with printf
 or breakpoints or volatile, nor can compile it unoptimized
 and ship it, but uniquifying function/data names seems
 maybe affordable for debuggability of official builds.


  > If you want to switch from C to C++, that's fine by me


 But the rest of my team has to agree.


  > C++ gives you namespaces, which gives you
  > another way to group your identifiers and control their scope.


I know *all* about C++ but I think for newbies it is best
to start out focusing on member functions.
Pretend there is a type global namespace to replace C's function
global namespace. That is a huge improvement on its own.


 > It makes a large difference - both for code size and speed


 I only recently learned of how static impacts ELF visibility
 and therefore performance. Windows does not work this way,
 and I'm sure MacOS doesn't either. Non-static on Windows does not imply
 replaceable at dynamic link time, not even if the function is exported.
 Symbols are always resolved directly within the dll/sharedobject by
 the static linker if they are present, with no pointer or stub in the way.
 (Ok, if you incorrectly annotate as __declspec(dllexport) and you don't
 use LTO/LTCG, then you will call through a pointer, but no stub,
 no actual inter-positionableness, and it is a rare occurrence.)


 There is also always a two level namespace -- imported functions are qualified
 by the dll name they are expected to be in. For multiple dlls to export
 the same function name creates no ambiguity and implies no replacement
 of one by the other, and no semantic difference depending on load order.
 Unless someone writes very weird code calling dlopen/dlsym like crazy.
 There is no LD_PRELOAD, slight loss, and replacing e.g. operator new
 isn't really possible process-wide, only within the scope of the static link,
 and even that is so rare, that it is probably sufficient.


 There is no going out of the way to accurately simulate the static linker
 at dynamic link time. Functions are only exported if they are annotated
 in source or listed in a separate file. Not just by being non-static.


  - Jay



From: David Brown 
Sent: Monday, January 22, 2018 10:42 AM
To: Jay K; gcc
Subject: Re: extern const initialized warns in C



On 22/01/2018 11:14, Jay K wrote:
> I  meant:
>
>
> extern const foo = 123;
>
>
> does not warn in C++, but by these arguments, should.
>

Yes, I think it should.  But I am a compiler user, not a compiler
author, so my bias is strongly towards /my/ code rather than a wider
audience.

>
>
> I understand that const int foo = 123 is static in C++.
>
> It is this difference between C and C++ and the desire to write code
> that means the same in C and C++ is why I write extern const int foo =
> 123 in the first place. If I was just writing C forever and not planning
> to compile as C++ ever then I would omit the redundant extern -- and not
> get a warning -- the other inconsistency!

As I suggested, put the declaration in the header and the definition in
the source file.  Then it is the same code for C and C++, works
correctly, and gives no warnings no matter what flags you use.  And it
is modular, structured, and lets programmers see exactly what is
"exported" from that C file by looking in a short header rather than
digging through a long source file.

>
>
> To repeat for clarity:
>
>
>   1 C: extern const int foo = 123; => warns, for reasons explained and
> understood even if not agreed
>
>   2 C: const int foo = 123; => means the same thing, but no warning;
> inconsistent?
>
>   3 C++: extern const int foo = 123; => also no warning, inconsistent?
>
>
>
> The prohibition against file level static is actually quite widespread
> and adhered to.
>

Can you give references or links?  As I say, I think such a convention
is /seriously/ wrong.

(There are plenty of other conventions that I think are wrong - even
famous and "professional" standards like MISRA have some daft ideas.)

>
> Along with it, of course, comes a mandate to pick globally unique names.

That mandate I can understand.  There are rational justifications for
it, even though I don't agree wi

r9 on ARM32?

2018-04-23 Thread Jay K
I'm wondering what is the role of r9 on ARM32, on Linux and Android.
  On Apple it is documented as long ago reserved, these days available for 
scratch.

I've looked around a bit but haven't gotten the full answer.

It is "the PIC register", I see.

What does that imply? Volatile? Non-volatile?

In particular I'm looking for a spare register, to pass an extra "special" 
parameter in, that can be considered volatile and never otherwise has a 
parameter.

Most ABIs have a few candidates, but arm32 comes up relatively short.

Intra procedural scratch (r12) probably cannot work for me.
I know gcc uses it for nested function context and that is laudable. I wish I 
could guarantee no code between me setting it and it being consumed.

And if it is volatile, I'd want the dynamic linker stubs to still preserve it 
incoming.

Thank you,
 - Jay


suggest more inhibit_libc for ia64-linux -- problems with exception handling when haven't yet built glibc.

2011-11-12 Thread Jay K

Building cross gcc and binutils is easy, but for the libc/libgcc parts. I've 
wrestled with this a lot.


I'm trying to build an ia64-linux cross toolset from a Mac.
Including cross building glibc.


I've gone through many options and errors, including sysroot and not,
following the LFS stuff and the CLFS stuff.


(Linux-from-Scratch, Cross-Linux-from-Scratch)

(CLFS good in that it uses sysroot and is cross, but it uses older versions and 
for
now I gave up, and non-cross-LFS is basically cross anyway.)


Some of what I hit:


In file included from /src/gcc-4.6.2/libgcc/../gcc/unwind-sjlj.c:30:0:
/obj/gcc/./gcc/include/unwind.h:214:20: fatal error: stdlib.h: No such file or 
directory
/src/gcc-4.6.2/libgcc/../gcc/config/ia64/fde-glibc.c:33:20: fatal error: 
stdlib.h: No such file or directory
/src/gcc-4.6.2/libgcc/../gcc/config/ia64/fde-glibc.c:36:18: fatal error: 
link.h: No such file or directory




and suggest, more use of inhibit_libc:



jbook2:gcc-4.6.2 jay$ diff -u gcc/config/ia64/fde-glibc.c.orig 
gcc/config/ia64/fde-glibc.c
--- gcc/config/ia64/fde-glibc.c.orig 2011-11-12 13:30:55.0 -0800
+++ gcc/config/ia64/fde-glibc.c 2011-11-12 13:32:47.0 -0800
@@ -25,6 +25,8 @@
 /* Locate the FDE entry for a given address, using glibc ld.so routines
 to avoid register/deregister calls at DSO load/unload. */
 
+#ifndef inhibit_libc
+
 #ifndef _GNU_SOURCE
 #define _GNU_SOURCE 1
 #endif
@@ -160,3 +162,5 @@
 
return data.ret;
 }
+
+#endif


jbook2:gcc-4.6.2 jay$ diff -u gcc/unwind-generic.h.orig gcc/unwind-generic.h
--- gcc/unwind-generic.h.orig 2011-11-12 13:02:32.0 -0800
+++ gcc/unwind-generic.h 2011-11-12 16:11:46.0 -0800
@@ -211,7 +211,9 @@
 compatible with the standard ABI for IA-64, we inline these. */
 
#ifdef __ia64__
+#ifndef inhibit_libc
 #include <stdlib.h>
+#endif
 
static inline _Unwind_Ptr
 _Unwind_GetDataRelBase (struct _Unwind_Context *_C)
@@ -223,7 +225,9 @@
 static inline _Unwind_Ptr
 _Unwind_GetTextRelBase (struct _Unwind_Context *_C __attribute__ 
((__unused__)))
 {
+#ifndef inhibit_libc
 abort ();
+#endif
 return 0;
 }



I understand the result is "broken" and that a second build will be needed.
But that seems to be common practise in building a cross toolset when
"libc" doesn't already exist.

And even then, I'm not done. Maybe this still won't work.

I know about buildroot for example but they don't have IA64 support.


I'm going to see if there is an option for building without any exception 
handling support.
It looks like not. Though this diff sort of does that.


Another thing I'll try is skipping libgcc for the first pass.
But I did get an error about missing libgcc when building glibc.
The 64bit math stuff at least ought to go in, but I realize it's probably not 
needed for 64bit targets.


Thank you,
 - Jay


FW: gcc uses too much stack?

2012-01-07 Thread Jay K

Have people considered that stack space should be used more conservatively by 
gcc?
More malloc, less alloca, fewer/smaller local variables?
More std::stack or such, less recursion on the machine stack?
(Yes, I know std::stack has no existing use in gcc yet.)


Could the amount of stack used be made independent of the input,
so that if anything can be compiled with N bytes of stack, everything can be?
Is that a reasonable goal?


The heap is generally much larger than the stack, and its exhaustion is easier 
to detect.
Granted, I understand very well that heap allocation is much slower, requires 
an explicit free, and is subject to fragmentation.
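
To make "less recursion on the machine stack" concrete, here is a minimal sketch. It is not taken from gcc; the names `struct node`, `struct work_item` and `tree_depth` are hypothetical. It computes the depth of a binary tree using a heap-allocated work stack instead of recursion, so machine stack use stays constant whatever the input, and running out of memory shows up as a failed malloc/realloc rather than a stack overflow.

```c
#include <stdlib.h>

/* Hypothetical tree node, for illustration only. */
struct node {
    struct node *left;
    struct node *right;
};

/* One pending unit of work: a node and its depth in the tree. */
struct work_item {
    const struct node *n;
    size_t depth;
};

/* Returns the depth of the tree (0 for an empty tree), or (size_t)-1 if
   the heap is exhausted.  No recursion: pending nodes live on the heap. */
size_t tree_depth(const struct node *root)
{
    size_t cap = 16, len = 0, max_depth = 0;
    struct work_item *stack;

    if (root == NULL)
        return 0;

    stack = malloc(cap * sizeof *stack);
    if (stack == NULL)
        return (size_t)-1;

    stack[len].n = root;
    stack[len].depth = 1;
    len++;

    while (len > 0) {
        struct work_item item = stack[--len];

        if (item.depth > max_depth)
            max_depth = item.depth;

        /* Each pop pushes at most two children; grow first if needed. */
        if (len + 2 > cap) {
            struct work_item *grown = realloc(stack, 2 * cap * sizeof *stack);
            if (grown == NULL) {
                free(stack);
                return (size_t)-1;   /* exhaustion is detected here */
            }
            stack = grown;
            cap *= 2;
        }
        if (item.n->left != NULL) {
            stack[len].n = item.n->left;
            stack[len].depth = item.depth + 1;
            len++;
        }
        if (item.n->right != NULL) {
            stack[len].n = item.n->right;
            stack[len].depth = item.depth + 1;
            len++;
        }
    }

    free(stack);
    return max_depth;
}
```

The cost is the one conceded above: every traversal now pays for a malloc and possibly a few reallocs, and the buffer has to be freed on every exit path.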


Thanks,
 - Jay

> Date: Sun, 8 Jan 2012 00:05:01 -0500
> To: djgpp-digest-da...@delorie.com
> 
> 2012/01/07/15:03:06 ANNOUNCE: DJGPP port of GNU binutils 2.22 uploaded.
> 
> --
> 
...
>   DJGPP specific changes.
>   ===
>   - This port allows a maximal number of 4294967296 relocations per object 
> file
> and a maximal number of 4294967296 of lines per executable file.
> The previous limits were the classical COFF limitations of 65536 for 
> both.
> Please note, that due to limitations inherent to DOS and memory resources
> not every file can be compiled.  E.g.: to be able to compile a single file
> containing up to 3 * 65536 relocations I had to increment stack space of
> cc1.exe from 2MB to 10MB.  If the file contains 4 * 65536 relocations then
> cc1.exe aborts because memory has become exhausted.  Neither as.exe nor
> ld.exe have shown memory issues.  Both have the standard stack space of
> 512KB.  In other words, even if 32 bit values for relocation and line
> counters are now supported by DJGPP port of as.exe and ld.exe it does not
> imply that large files can be successfully compiled and linked.  There are
> memory limitations that may not be solvable.
...
> 
> Enjoy.
> 
> Guerrero, Juan Manuel 
...

  


Adding memory write to a sparc fsqrts insn

2010-01-29 Thread k e
Hello everybody,
I'd like to patch gcc's sparc machine description so
that the destination register of an FPU square-root operation (fsqrts)
is stored to memory after each fsqrts,

like this:

fsqrts %f2,%f4
st %f4, -4[%fp]  <= add this after every fsqrts where -4[%fp] is
 a slot allocated on the stack for each fsqrts insn

The sparc.md portion that defines the fsqrts pattern is:

(define_insn "sqrtsf2"
  [(set (match_operand:SF 0 "register_operand" "=f")
(sqrt:SF (match_operand:SF 1 "register_operand" "f")))]
  "TARGET_FPU"
  "fsqrts\t%1, %0"
  [(set_attr "type" "fpsqrts")])

I thought I could add the "st %f4, -4[%fp]" there. But now
my question:

- Where/when/how can I allocate the stackframe slot to save the
  destination fp reg in (the offset to %fp).

I'm not that familiar with the RTL representation and the stages
of compilation, so any help would be appreciated.

Maybe another architecture has a similar construct already
that I could study and use for my purpose. If somebody
can point me to such machine description part ...

-- Greetings Konrad


Generating store after fdivd: how to avoid delay slot

2010-02-09 Thread k e
I am trying to patch gcc so that after an fdivd the destination register is
stored to the stack.

fdivd %f0,%f2,%f4; std %f4, [%sp]

I generate the RTL for divdf3 using an emit_insn/DONE sequence in a
define_expand pattern (see below).

In the assembler output phase I use a define_insn and write
out "fdivd\t%%1, %%2, %%0; std %%0, %%3" as the expression string.

My question:
  - How can I mark the pattern so that it will not be scheduled into a
    delay slot? How can I specify that the output will be 2 instructions
    and hint the scheduler about it?
  - Is the (set_attr "length" "2") attribute in define_insn divdf3_store
(below) already sufficient?

-- Greetings Konrad


;; handle divdf3 
(define_expand "divdf3"
  [(parallel [(set (match_operand:DF 0 "register_operand" "=e")
  (div:DF (match_operand:DF 1 "register_operand" "e")
(match_operand:DF 2 "register_operand" "e")))
  (clobber (match_scratch:SI 3 ""))])]
  "TARGET_FPU"
  "{
  output_divdf3_emit (operands[0], operands[1], operands[2], operands[3]);
  DONE;
}")

(define_insn "divdf3_store"
  [(set (match_operand:DF 0 "register_operand" "=e")
  (div:DF (match_operand:DF 1 "register_operand" "e")
(match_operand:DF 2 "register_operand" "e")))
  (clobber (match_operand:DF 3 "memory_operand" ""  ))]
  "TARGET_FPU && TARGET_STORE_AFTER_DIVSQRT"
   {
   return output_divdf3 (operands[0], operands[1], operands[2],
operands[3]);
   }
   [(set_attr "type" "fpdivd")
   (set_attr "fptype" "double")
   (set_attr "length" "2")])

(define_insn "divdf3_nostore"
  [(set (match_operand:DF 0 "register_operand" "=e")
(div:DF (match_operand:DF 1 "register_operand" "e")
(match_operand:DF 2 "register_operand" "e")))]
  "TARGET_FPU && (!TARGET_STORE_AFTER_DIVSQRT)"
  "fdivd\t%1, %2, %0"
  [(set_attr "type" "fpdivd")
   (set_attr "fptype" "double")])




/* handle fdivd */
char *
output_divdf3 (rtx op0, rtx op1, rtx dest, rtx scratch)
{
  static char string[128];
  sprintf(string,"fdivd\t%%1, %%2, %%0; std %%0, %%3 !!!");
  return string;
}

void
output_divdf3_emit (rtx dest, rtx op0, rtx op1, rtx scratch)
{
  rtx slot0, div, divsave;

  div = gen_rtx_SET (VOIDmode,
 dest,
 gen_rtx_DIV (DFmode,
  op0,
  op1));

  if (TARGET_STORE_AFTER_DIVSQRT) {
slot0 = assign_stack_local (DFmode, 8, 8);
divsave = gen_rtx_SET (VOIDmode, slot0, dest);
emit_insn(divsave);
    emit_insn (gen_rtx_PARALLEL (VOIDmode,
                                 gen_rtvec (2,
                                            div,
                                            gen_rtx_CLOBBER (SImode,
                                                             slot0))));
  } else {
emit_insn(div);
  }
}


Issue on Solaris box

2009-05-20 Thread santosh k

Hi,
 
I'm running my C program on a Solaris machine with the following gcc version 
installed.
 
>>gcc version 3.4.6
>>Thread model 3.4.6
>>/usr/local/lib/gcc/sparc-sun-solaris2.9/3.4.6/spec
 
Issue:
The same code compiles and runs fine on 
mingw running on Windows with gcc version 3.4.5
cygwin running on Windows with gcc version 3.4.4
 
The C program compiles fine on the Solaris machine but fails during execution.
Is it possible for different versions of gcc to behave differently?
 
Is it a problem of the different architecture, or a bug in the code?
 
Could you give me a list of gcc versions that are stable/compatible for the 
Solaris environment?
 
Thanks
Santosh


  


gcc 4.5.0 vms-gcc_shell_handler.c needs #define __NEW_STARLET

2010-05-03 Thread Jay K

/src/gcc-4.5.0/libgcc/../gcc/config/alpha/vms-gcc_shell_handler.c: In function 
'get_dyn_handler_pointer':
/src/gcc-4.5.0/libgcc/../gcc/config/alpha/vms-gcc_shell_handler.c:73:3: error: 
'PDSCDEF' undeclared (first use in this function)
/src/gcc-4.5.0/libgcc/../gcc/config/alpha/vms-gcc_shell_handler.c:73:3: note: 
each undeclared identifier is reported only once for each function it appears in
/src/gcc-4.5.0/libgcc/../gcc/config/alpha/vms-gcc_shell_handler.c:73:13: error: 
'pd' undeclared (first use in this function)
/src/gcc-4.5.0/libgcc/../gcc/config/alpha/vms-gcc_shell_handler.c:73:18: 
warning: cast to pointer from integer of different size
/src/gcc-4.5.0/libgcc/../gcc/config/alpha/vms-gcc_shell_handler.c:73:18: error: 
expected expression before ')' token
/src/gcc-4.5.0/libgcc/../gcc/config/alpha/vms-gcc_shell_handler.c:111:17: 
warning: cast to pointer from integer of different size
/src/gcc-4.5.0/libgcc/../gcc/config/alpha/vms-gcc_shell_handler.c:111:10: 
warning: cast to pointer from integer of different size
make[4]: *** [vms-gcc_shell_handler.o] Error 1
make[3]: *** [multi-do] Error 1
make[2]: *** [all-multi] Error 2
make[1]: *** [all-target-libgcc] Error 2
make: *** [all] Error 2
 
 
Fix: put this at the top of the file:
  
 
#ifndef __NEW_STARLET
#define __NEW_STARLET
#endif
 
 
 - Jay
  


gcc 4.5.0 stddef.h clobbers __size_t with #define, breaks VMS (code already avoids similar on FreeBSD)

2010-05-03 Thread Jay K

VMS decc$types.h:

    typedef unsigned int __size_t;

but with GCC 4.5.0 this preprocesses as:

    typedef unsigned int ;
    
and there are ensuing errors e.g. when compiling gcc/libiberty/regex.c

probably because of:

/usr/local/lib/gcc/alpha-dec-vms/4_5_0/include/stddef.h (it does get included)
#if defined (__FreeBSD__) && (__FreeBSD__>= 5)
/* __size_t is a typedef on FreeBSD 5!, must not trash it. */
#else
#define __size_t
#endif

presumably should be more like:

#if (defined (__FreeBSD__) && (__FreeBSD__>= 5)) || defined(__vms)
/* __size_t is a typedef on FreeBSD 5 and VMS, must not trash it! */
#else
#define __size_t
#endif
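
As a small illustration of why the empty `#define __size_t` wipes out the typedef name, the sketch below stringifies a declaration after macro expansion. It uses a hypothetical stand-in name, `demo_size_t`, rather than the reserved `__size_t`, and the helper macros `DEMO_STR`/`DEMO_XSTR` are likewise made up for the demonstration.

```c
#include <string.h>

/* Two-level stringification so the argument is macro-expanded first. */
#define DEMO_STR(x)  #x
#define DEMO_XSTR(x) DEMO_STR(x)

/* Stand-in for "#define __size_t": an empty object-like macro.
   demo_size_t is a hypothetical name chosen for this sketch. */
#define demo_size_t

/* The declaration text as it looks after preprocessing: the typedef
   name has expanded to nothing, leaving "typedef unsigned int" with
   no name before the semicolon. */
static const char clobbered_decl[] =
    DEMO_XSTR(typedef unsigned int demo_size_t;);

/* Returns nonzero if the typedef name survived preprocessing. */
int typedef_name_survived(void)
{
    return strstr(clobbered_decl, "demo_size_t") != NULL;
}
```

Here `typedef_name_survived()` returns 0, which is the same effect decc$types.h suffers once stddef.h has been included.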


That gets further, then hits 


src/gcc-4.5.0/libiberty/regex.c: In function 'byte_insert_op2':
/src/gcc-4.5.0/libiberty/regex.c:4279:1: error: unrecognizable insn:
(insn 62 61 63 5 /src/gcc-4.5.0/libiberty/regex.c:4276 (set (reg:DI 135)
    (plus:SI (subreg/s:SI (reg/v/f:DI 109 [ pfrom ]) 0)
    (const_int 5 [0x5]))) -1 (nil))
/src/gcc-4.5.0/libiberty/regex.c:4279:1: internal compiler error: in 
extract_insn, at recog.c:2103
Please submit a full bug report,
with preprocessed source if appropriate.
See  for instructions.
make: *** [regex.o] Error 1


 - Jay

  


gcc-4.5.0 internal compiler error on alpha-dec-vms compiling libiberty/regex.c without -mbwx

2010-05-03 Thread Jay K

> src/gcc-4.5.0/libiberty/regex.c: In function 'byte_insert_op2':
> /src/gcc-4.5.0/libiberty/regex.c:4279:1: error: unrecognizable insn:
> (insn 62 61 63 5 /src/gcc-4.5.0/libiberty/regex.c:4276 (set (reg:DI 135)
> (plus:SI (subreg/s:SI (reg/v/f:DI 109 [ pfrom ]) 0)
> (const_int 5 [0x5]))) -1 (nil))
> /src/gcc-4.5.0/libiberty/regex.c:4279:1: internal compiler error: in 
> extract_insn, at recog.c:2103
> Please submit a full bug report,
> with preprocessed source if appropriate.
> See  for instructions.
> make: *** [regex.o] Error 1


Fixed by saying make CFLAGS=-mbwx, which enables some byte/word instructions.
More information needed?

Let's see.
Here is the code:

/* Like `insert_op1', but for two two-byte parameters ARG1 and ARG2.  */
/* ifdef WCHAR, integer parameter is 1 wchar_t.  */

static void
PREFIX(insert_op2) (re_opcode_t op, UCHAR_T *loc, int arg1,
    int arg2, UCHAR_T *end)
{
  register UCHAR_T *pfrom = end;
  register UCHAR_T *pto = end + 1 + 2 * OFFSET_ADDRESS_SIZE;

  while (pfrom != loc)
    *--pto = *--pfrom;

  PREFIX(store_op2) (op, loc, arg1, arg2);
}

Here is a reduced/preprocessed form that hits the same problem:


jbook2:~ jay$ cat re.c
typedef unsigned char UCHAR;


void insert_op2 (UCHAR *loc, UCHAR *end)
{
   UCHAR *pfrom = end;
   UCHAR *pto = end + 1;

  while (pfrom != loc)
    *--pto = *--pfrom;
}

jbook2:~ jay$ alpha-dec-vms-gcc -c re.c

jbook2:~ jay$ alpha-dec-vms-gcc -c -O2 re.c
re.c: In function 'insert_op2':
re.c:10:1: error: unrecognizable insn:
(insn 58 57 59 5 re.c:9 (set (reg:DI 120)
    (plus:SI (subreg/s:SI (reg/v/f:DI 108 [ pfrom ]) 0)
    (const_int 1 [0x1]))) -1 (nil))
re.c:10:1: internal compiler error: in extract_insn, at recog.c:2103
Please submit a full bug report,
with preprocessed source if appropriate.
See  for instructions.
jbook2:~ jay$ 


I opened a bug in the database.


  - Jay
  


gcc4.5.0/libiberty/pex-common.h missing pid_t on vms

2010-05-03 Thread Jay K

In file included from /src/gcc-4.5.0/libiberty/pex-common.c:23:0:
/src/gcc-4.5.0/libiberty/pex-common.h:73:3: error: expected 
specifier-qualifier-list before 'pid_t'


the code:

/* pid_t is may defined by config.h or sys/types.h needs to be
   included.  */
#if !defined(pid_t) && defined(HAVE_SYS_TYPES_H)
#include <sys/types.h>
#endif


proposed/tested fix:
#ifdef __vms
#include <unistd.h>
#endif

or similar.

This then hits:

/usr/local/lib/gcc/alpha-dec-vms/4_5_0/../../../../alpha-dec-vms/include/unistd.h:475:22:
 error: macro "geteuid" passed 1 arguments, but takes just 0
/usr/local/lib/gcc/alpha-dec-vms/4_5_0/../../../../alpha-dec-vms/include/unistd.h:476:22:
 error: macro "getuid" passed 1 arguments, but takes just 0
make[2]: *** [pex-common.o] Error 1


But I say that's a bug in the VMS headers and I patch it:

#if __USE_LONG_GID_T
#   pragma __extern_prefix __save
#   pragma __extern_prefix "__long_gid_"
#elif __CRTL_VER>= 7000 && !defined(_VMS_V6_SOURCE)
#   if __CAN_USE_EXTERN_PREFIX
#  pragma __extern_prefix __save
#  pragma __extern_prefix "__unix_"
#   else
-#    define geteuid() __unix_geteuid()
-#    define getuid() __unix_getuid()
+#    define geteuid __unix_geteuid
+#    define getuid __unix_getuid
#   endif
#endif


__uid_t geteuid (void);
__uid_t getuid  (void);
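
The difference between the two macro forms can be sketched in isolation. The names below (`demo_getuid`, `demo_unix_getuid`) are hypothetical stand-ins for `getuid` and `__unix_getuid`; the point is only that an object-like macro renames the identifier everywhere, including in a prototype written with `(void)`.

```c
/* Object-like macro: renames the identifier wherever it appears,
   in declarations, definitions and calls alike. */
#define demo_getuid demo_unix_getuid

/* After preprocessing this declares demo_unix_getuid (void).
   With a function-like "#define demo_getuid() demo_unix_getuid()",
   the "(void)" here would be taken as a macro argument and rejected,
   since the macro accepts none. */
unsigned int demo_getuid (void);

/* Likewise, this defines demo_unix_getuid. */
unsigned int demo_getuid (void)
{
    return 42u; /* arbitrary value for the demonstration */
}

/* Calls are renamed too. */
unsigned int call_through_macro (void)
{
    return demo_getuid ();
}

#undef demo_getuid

/* With the macro gone, the underlying symbol is reachable directly. */
unsigned int call_real_name (void)
{
    return demo_unix_getuid ();
}
```

Both `call_through_macro()` and `call_real_name()` end up calling the same function.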


I did the same thing in the VMS header unixlib.h.
Maybe something for fixincludes? (along with #define __NEW_STARLET, #define 
__int64 long long...)


(Alternate interpretation is that gcc should implement __CAN_USE_EXTERN_PREFIX
and the #pragmas. I'd be willing to #define __USE_LONG_GID_T but I assume the 
pragmas are a problem.)


 - Jay


gcc 4.5.0 libiberty .o vs. .obj confusion

2010-05-03 Thread Jay K

build=i386-darwin
host=alpha-dec-vms

target=alpha-dec-vms


alpha-dec-vms-ar rc ./libiberty.a \
      ./regex.o ./cplus-dem.o ./cp-demangle.o ./md5.o ./sha1.o ./alloca.o 
./argv.o ./choose-temp.o ./concat.o ./cp-demint.o ./crc32.o ./dyn-string.o 
./fdmatch.o ./fibheap.o ./filename_cmp.o ./floatformat.o ./fnmatch.o 
./fopen_unlocked.o ./getopt.o ./getopt1.o ./getpwd.o ./getruntime.o ./hashtab.o 
./hex.o ./lbasename.o ./lrealpath.o ./make-relative-prefix.o ./make-temp-file.o 
./objalloc.o ./obstack.o ./partition.o ./pexecute.o ./physmem.o ./pex-common.o 
./pex-one.o ./pex-unix.o ./safe-ctype.o ./sort.o ./spaces.o ./splay-tree.o 
./strerror.o ./strsignal.o ./unlink-if-ordinary.o ./xatexit.o ./xexit.o 
./xmalloc.o ./xmemdup.o ./xstrdup.o ./xstrerror.o ./xstrndup.o  ./asprintf.obj 
./insque.obj ./memmem.obj ./mempcpy.obj ./mkstemps.obj ./stpcpy.obj 
./stpncpy.obj ./strndup.obj ./strverscmp.obj ./vasprintf.obj ./vfork.obj 
./strncmp.obj
alpha-dec-vms-ar: ./asprintf.obj: No such file or directory
make: *** [libiberty.a] Error 1
jbook2:libiberty jay$ edit Makefile 


alpha-dec-gcc -c foo.c outputs foo.obj.

"Something" seems to know this, since:

libiberty/Makefile.in:
LIBOBJS = @LIBOBJS@


libiberty/Makefile:
LIBOBJS =  ${LIBOBJDIR}./asprintf$U.obj ${LIBOBJDIR}./insque$U.obj 
${LIBOBJDIR}./memmem$U.obj ${LIBOBJDIR}./mempcpy$U.obj 
${LIBOBJDIR}./mkstemps$U.obj ${LIBOBJDIR}./stpcpy$U.obj 
${LIBOBJDIR}./stpncpy$U.obj ${LIBOBJDIR}./strndup$U.obj 
${LIBOBJDIR}./strverscmp$U.obj ${LIBOBJDIR}./vasprintf$U.obj 
${LIBOBJDIR}./vfork$U.obj ${LIBOBJDIR}./strncmp$U.obj


and then later there are explicit rules for building asprintf.o, etc.
I'll probably just hack the configure Makefile to say .o.


This could be an autoconf/automake bug.
Or maybe libiberty is supposed to say $O or such in place of .o?


 - Jay
  


RE: gcc 4.5.0 libiberty .o vs. .obj confusion

2010-05-03 Thread Jay K

I'm guessing that every ".o" in libiberty/Makefile.in should be changed to 
$(OBJEXT).

Thanks,
 - Jay


> From: jay.kr...@cornell.edu
> To: gcc@gcc.gnu.org
> Subject: gcc 4.5.0 libiberty .o vs. .obj confusion
> Date: Mon, 3 May 2010 11:29:15 +
  


internal compiler error compiling gmp/get_d/gmpn_get_d for alpha-dec-vms

2010-05-03 Thread Jay K

build=i386-darwin
host=alpha-dec-vms
target=alpha-dec-vms


/bin/sh ../libtool --mode=compile alpha-dec-vms-gcc -mbwx -std=gnu99 
-DHAVE_CONFIG_H -I. -I/src/gcc-4.5.0/gmp/mpn -I.. -D__GMP_WITHIN_GMP 
-I/src/gcc-4.5.0/gmp -DOPERATION_`echo get_d | sed 's/_$//'`    -g -O2 -c -o 
get_d.lo get_d.c
 alpha-dec-vms-gcc -mbwx -std=gnu99 -DHAVE_CONFIG_H -I. 
-I/src/gcc-4.5.0/gmp/mpn -I.. -D__GMP_WITHIN_GMP -I/src/gcc-4.5.0/gmp 
-DOPERATION_get_d -g -O2 -c get_d.c -o get_d.obj
get_d.c: In function '__gmpn_get_d':
get_d.c:490:1: internal compiler error: in 
compute_frame_pointer_to_fb_displacement, at dwarf2out.c:16269
Please submit a full bug report,
with preprocessed source if appropriate.
See  for instructions.
make[2]: *** [get_d.lo] Error 1
make[1]: *** [all-recursive] Error 1
make: *** [all] Error 2
jbook2:gmp jay$ 


I said make CFLAGS=-mieee and it seemed to fix it.
I think that merely turned off optimization though.


I did configure gcc in the first place with -disable-shared 
-enable-sjlj-exception, since there were problems compiling the libgcc/Dwarf 
stuff.
I can try "normal" there again.


 - Jay


  


RE: gcc4.5.0/libiberty/pex-common.h missing pid_t on vms

2010-05-05 Thread Jay K

> Use #ifdef HAVE_UNISTD_H instead. There are many examples in
> libiberty.
>
> Ian


Thanks Ian, that worked.


--- /src/orig/gcc-4.5.0/libiberty/pex-common.h    2009-04-13 03:45:58.0 
-0700
+++ /src/gcc-4.5.0/libiberty/pex-common.h    2010-05-04 06:43:24.0 -0700
@@ -31,6 +31,9 @@
 #if !defined(pid_t) && defined(HAVE_SYS_TYPES_H)
 #include <sys/types.h>
 #endif
+#ifdef HAVE_UNISTD_H
+#include <unistd.h>
+#endif
 
 #define install_error_msg "installation problem, cannot exec `%s'"


Perhaps someone can apply it..
Sorry, not me.


 - Jay
  


RE: gcc 4.5.0 libiberty .o vs. .obj confusion

2010-05-05 Thread Jay K

> CC: gcc@
> From: iant@
>
> Jay:
>> I'm guessing that every ".o" in libiberty/Makefile.in should be changed to 
>> $(OBJEXT).
>
> Yes.
>
> Ian

Thanks. 

Specifically ".o" goes to "@objext@".

There's no way I'm going to be able to get "the papers" in.
I can try to squeak by via triviality of change.
I'm slightly derailed on other aspects of targeting VMS (e.g. *crt0*.c, 
vms-crtl.h), but this did work for me, attached.
It's many lines, but highly mechanical.
There are a few places where ".o" occurs in comments, can be left alone.
There is:

.c.o:
    false

> .c.obj:
>    false


and
<    -rm -rf *.o pic core errs \#* *.E a.out

>    -rm -rf *.o *.obj pic core errs \#* *.E a.out


and I wrapped the affected lines to one file per line, and spaces instead of 
tabs (consistent rendering)


 - Jay
  119a120,122
> .c.obj:
>   false
> 
160,178c163,213
< REQUIRED_OFILES = \
<   ./regex.o ./cplus-dem.o ./cp-demangle.o ./md5.o ./sha1.o\
<   ./alloca.o ./argv.o \
<   ./choose-temp.o ./concat.o ./cp-demint.o ./crc32.o  \
<   ./dyn-string.o  \
<   ./fdmatch.o ./fibheap.o ./filename_cmp.o ./floatformat.o\
<   ./fnmatch.o ./fopen_unlocked.o  \
<   ./getopt.o ./getopt1.o ./getpwd.o ./getruntime.o\
<   ./hashtab.o ./hex.o \
<   ./lbasename.o ./lrealpath.o \
<   ./make-relative-prefix.o ./make-temp-file.o \
<   ./objalloc.o ./obstack.o\
<   ./partition.o ./pexecute.o ./physmem.o  \
<   ./pex-common.o ./pex-one.o @pexecute@   \
<   ./safe-ctype.o ./sort.o ./spaces.o ./splay-tree.o ./strerror.o  \
<./strsignal.o  \
<   ./unlink-if-ordinary.o  \
<   ./xatexit.o ./xexit.o ./xmalloc.o ./xmemdup.o ./xstrdup.o   \
<./xstrerror.o ./xstrndup.o
---
> REQUIRED_OFILES =   \
> ./regex.@objext@ \
> ./cplus-dem.@objext@ \
> ./cp-demangle.@objext@ \
> ./md5.@objext@ \
> ./sha1.@objext@ \
> ./alloca.@objext@ \
> ./argv.@objext@ \
> ./choose-temp.@objext@ \
> ./concat.@objext@ \
> ./cp-demint.@objext@ \
> ./crc32.@objext@ \
> ./dyn-string.@objext@ \
> ./fdmatch.@objext@ \
> ./fibheap.@objext@ \
> ./filename_cmp.@objext@ \
> ./floatformat.@objext@ \
> ./fnmatch.@objext@ \
> ./fopen_unlocked.@objext@ \
> ./getopt.@objext@ \
> ./getopt1.@objext@ \
> ./getpwd.@objext@ \
> ./getruntime.@objext@ \
> ./hashtab.@objext@ \
> ./hex.@objext@ \
> ./lbasename.@objext@ \
> ./lrealpath.@objext@ \
> ./make-relative-prefix.@objext@ \
> ./make-temp-file.@objext@ \
> ./objalloc.@objext@ \
> ./obstack.@objext@ \
> ./partition.@objext@ \
> ./pexecute.@objext@ \
> ./physmem.@objext@ \
> ./pex-common.@objext@ \
> ./pex-one.@objext@ \
> @pexecute@ \
> ./safe-ctype.@objext@ \
> ./sort.@objext@ \
> ./spaces.@objext@ \
> ./splay-tree.@objext@ \
> ./strerror.@objext@ \
> ./strsignal.@objext@ \
> ./unlink-if-ordinary.@objext@ \
> ./xatexit.@objext@ \
> ./xexit.@objext@ \
> ./xmalloc.@objext@ \
> ./xmemdup.@objext@ \
> ./xstrdup.@objext@ \
> ./xstrerror.@objext@ \
> ./xstrndup.@objext@
183,203c218,276
< CONFIGURED_OFILES = ./asprintf.o ./atexit.o   \
<   ./basename.o ./bcmp.o ./bcopy.o ./bsearch.o ./bzero.o   \
<   ./calloc.o ./clock.o ./copysign.o   \
<   ./_doprnt.o \
<   ./ffs.o \
<   ./getcwd.o ./getpagesize.o ./gettimeofday.o \
<   ./index.o ./insque.o\
<   ./memchr.o ./memcmp.o ./memcpy.o ./memmem.o ./memmove.o \
<./mempcpy.o ./memset.o ./mkstemps.o\
<   ./pex-djgpp.o ./pex-msdos.o \
<./pex-unix.o ./pex-win32.o \
<./putenv.o \
<   ./random.o ./rename.o ./rindex.o\
<   ./setenv.o ./sigsetmask.o ./snprintf.o ./stpcpy.o ./stpncpy.o   \
<./strcasecm

RE: gcc 4.5.0 libiberty .o vs. .obj confusion

2010-05-05 Thread Jay K

Oops, something like this is also needed:

--- /src/orig/gcc-4.5.0/libiberty/configure    2010-01-04 15:46:56.0 
-0800
+++ /src/gcc-4.5.0/libiberty/configure    2010-05-05 05:40:52.0 -0700
@@ -6533,10 +6533,10 @@
 
 # Figure out which version of pexecute to use.
 case "${host}" in
- *-*-mingw* | *-*-winnt*)    pexecute=./pex-win32.o  ;;
- *-*-msdosdjgpp*)        pexecute=./pex-djgpp.o  ;;
- *-*-msdos*)        pexecute=./pex-msdos.o  ;;
- *)                pexecute=./pex-unix.o   ;;
+ *-*-mingw* | *-*-winnt*)    pexecute=./pex-win32.$ac_objext  ;;
+ *-*-msdosdjgpp*)        pexecute=./pex-djgpp.$ac_objext  ;;
+ *-*-msdos*)        pexecute=./pex-msdos.$ac_objext  ;;
+ *)                pexecute=./pex-unix.$ac_objext   ;;
 esac
 

--- /src/orig/gcc-4.5.0/libiberty/configure.ac    2010-01-04 15:46:56.0 
-0800
+++ /src/gcc-4.5.0/libiberty/configure.ac    2010-05-05 05:45:47.0 -0700
@@ -671,10 +671,10 @@
 
 # Figure out which version of pexecute to use.
 case "${host}" in
- *-*-mingw* | *-*-winnt*)    pexecute=./pex-win32.o  ;;
- *-*-msdosdjgpp*)        pexecute=./pex-djgpp.o  ;;
- *-*-msdos*)        pexecute=./pex-msdos.o  ;;
- *)                pexecute=./pex-unix.o   ;;
+ *-*-mingw* | *-*-winnt*)    pexecute=./pex-win32.$ac_objext  ;;
+ *-*-msdosdjgpp*)        pexecute=./pex-djgpp.$ac_objext  ;;
+ *-*-msdos*)        pexecute=./pex-msdos.$ac_objext  ;;
+ *)                pexecute=./pex-unix.$ac_objext   ;;
 esac
 AC_SUBST(pexecute)
 

I manually edited configure.
I don't know how to keep multiple versions of autoconf installed/working, other 
than to use Cygwin and its special packages dedicated to this problem.

configure.ac:3: error: Autoconf version 2.64 or higher is required
configure.ac:3: the top level
autom4te: /usr/bin/gm4 failed with exit status: 63
jbook2:libiberty jay$ 

 - Jay



> From: jay.krell@
> To: i...@m
> CC: g...@g
> Subject: RE: gcc 4.5.0 libiberty .o vs. .obj confusion
> Date: Wed, 5 May 2010 10:10:15 +
  


more .o vs. .obj targeting VMS

2010-05-05 Thread Jay K

Here's the next one:

alpha-dec-vms-ar cru libdecnumber.a decNumber.o decContext.o decimal32.o 
decimal64.o decimal128.o 
alpha-dec-vms-ar: decNumber.o: No such file or directory
make[2]: *** [libdecnumber.a] Error 1
make[1]: *** [all-libdecnumber] Error 2
make: *** [all] Error 2

jbook2:vms jay$ ls libdecnumber/
Makefile    config.log    decNumber.obj    decimal64.obj
config.cache    config.status    decimal128.obj    gstdint.h
config.h    decContext.obj    decimal32.obj    stamp-h1

 - Jay

  


gcc/resource.h conflicts with sysroot/usr/include/resource.h (alpha-dec-vms)

2010-05-06 Thread Jay K

alpha-dec-vms-gcc -c   -DIN_GCC   -W -Wall -Wwrite-strings -Wcast-qual 
-Wstrict-prototypes -Wmissing-prototypes -Wmissing-format-attribute 
-pedantic -Wno-long-long -Wno-variadic-macros -Wno-overlength-strings 
-Wold-style-definition -Wc++-compat   -DHAVE_CONFIG_H -I. -I. 
-I/src/gcc-4.5.0/gcc -I/src/gcc-4.5.0/gcc/. -I/src/gcc-4.5.0/gcc/../include 
-I/src/gcc-4.5.0/gcc/../libcpp/include -I/obj/gcc/vms/./gmp 
-I/src/gcc-4.5.0/gmp -I/obj/gcc/vms/./mpfr -I/src/gcc-4.5.0/mpfr 
-I/src/gcc-4.5.0/mpc/src  -I/src/gcc-4.5.0/gcc/../libdecnumber 
-I/src/gcc-4.5.0/gcc/../libdecnumber/dpd -I../libdecnumber 
/src/gcc-4.5.0/gcc/c-lang.c -o c-lang.o
In file included from /src/gcc-4.5.0/gcc/resource.h:24:0,
 from 
/usr/local/lib/gcc/alpha-dec-vms/4_5_0/../../../../alpha-dec-vms/include/wait.h:74,
 from 
/usr/local/lib/gcc/alpha-dec-vms/4_5_0/../../../../alpha-dec-vms/include/stdlib.h:51,
 from /src/gcc-4.5.0/gcc/system.h:211,
 from /src/gcc-4.5.0/gcc/c-lang.c:24:

/src/gcc-4.5.0/gcc/hard-reg-set.h:42:39: error: expected '=', ',', ';', 'asm' 
or '__attribute__' before 'HARD_REG_ELT_TYPE'


The problem is that there is both gcc/resource.h and 
sysroot/usr/include/resource.h.

When sysroot/usr/include/wait.h does:

#if defined _XOPEN_SOURCE_EXTENDED || !defined _POSIX_C_SOURCE
#   include         /* for siginfo_t */
#   include <resource.h>    /* for struct rusage */
#endif

it gets the wrong resource.h

for now I patched sysroot/usr/include/wait.h to #include "resource.h" instead.

An unfortunate fix might be to rename gcc/resource.h to gcc/gccresource.h?

 - Jay
 
  


builtin ffs vs. renamed ffs (vms-crtl.h)

2010-05-07 Thread Jay K

In gcc for VMS there is some mechanism to rename functions.
See the files:

/src/gcc-4.5.0/gcc/config/vms/vms-crtl-64.h
/src/gcc-4.5.0/gcc/config/vms/vms-crtl.h


which are mostly just lists of function from/to.


As well in gcc there is a mechanism for optimizing various "builtin" functions, 
like ffs.


These two mechanisms seem to conflict or be applied in the wrong order.
I didn't look at it deeply.


The symptom is that if you add ffs (to decc$ffs) to vms-crtl.h, the translation
is not done, and you end up with unresolved external ffs.


If you #if out the support for "builtin ffs", it works.
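
For readers unfamiliar with the idea, a from/to rename list can be sketched as a simple lookup table. This is a hypothetical illustration, not GCC's actual data structure or code; `struct rename_entry`, `rename_table` and `lookup_renamed` are made-up names, and the single entry is the ffs to decc$ffs pair mentioned above.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical from/to table; not GCC's actual data structure. */
struct rename_entry {
    const char *from;
    const char *to;
};

static const struct rename_entry rename_table[] = {
    { "ffs", "decc$ffs" },
};

/* Returns the renamed symbol, or NAME unchanged if there is no entry. */
const char *lookup_renamed (const char *name)
{
    size_t i;

    for (i = 0; i < sizeof rename_table / sizeof rename_table[0]; i++)
        if (strcmp (rename_table[i].from, name) == 0)
            return rename_table[i].to;
    return name;
}
```

One plausible reading of the symptom, which I have not verified, is that the builtin path emits a call to the literal name "ffs" without ever passing through a lookup of this kind, so the entry in the table is never consulted.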


My local hack is below but obviously that's not the way.


I'll enter a bug.


Thanks,
 - Jay


diff -u /src/orig/gcc-4.5.0/gcc/builtins.c ./builtins.c
--- /src/orig/gcc-4.5.0/gcc/builtins.c    2010-04-13 06:47:11.0 -0700
+++ ./builtins.c    2010-05-07 23:11:30.0 -0700
@@ -51,6 +51,8 @@
 #include "value-prof.h"
 #include "diagnostic.h"
 
+#define DISABLE_FFS
+
 #ifndef SLOW_UNALIGNED_ACCESS
 #define SLOW_UNALIGNED_ACCESS(MODE, ALIGN) STRICT_ALIGNMENT
 #endif
@@ -5899,6 +5901,7 @@
 return target;
   break;
 
+#ifndef DISABLE_FFS
 CASE_INT_FN (BUILT_IN_FFS):
 case BUILT_IN_FFSIMAX:
   target = expand_builtin_unop (target_mode, exp, target,
@@ -5906,6 +5909,7 @@
   if (target)
 return target;
   break;
+#endif
 
 CASE_INT_FN (BUILT_IN_CLZ):
 case BUILT_IN_CLZIMAX:
@@ -13612,6 +13616,7 @@
 case BUILT_IN_ABORT:
   abort_libfunc = set_user_assembler_libfunc ("abort", asmspec);
   break;
+#ifndef DISABLE_FFS
 case BUILT_IN_FFS:
   if (INT_TYPE_SIZE < BITS_PER_WORD)
 {
@@ -13620,6 +13625,7 @@
                        MODE_INT, 0), "ffs");
 }
   break;
+#endif
 default:
   break;
 }
diff -u /src/orig/gcc-4.5.0/gcc/optabs.c ./optabs.c
--- /src/orig/gcc-4.5.0/gcc/optabs.c    2010-03-19 12:45:01.0 -0700
+++ ./optabs.c    2010-05-07 23:11:36.0 -0700
@@ -45,6 +45,8 @@
 #include "basic-block.h"
 #include "target.h"
 
+#define DISABLE_FFS
+
 /* Each optab contains info on how this target machine
    can perform a particular operation
    for all sizes and kinds of operands.
@@ -3240,6 +3242,7 @@
 return temp;
 }
 
+#ifndef DISABLE_FFS
   /* Try implementing ffs (x) in terms of clz (x).  */
   if (unoptab == ffs_optab)
 {
@@ -3247,6 +3250,7 @@
   if (temp)
 return temp;
 }
+#endif
 
   /* Try implementing ctz (x) in terms of clz (x).  */
   if (unoptab == ctz_optab)
@@ -3268,7 +3272,11 @@
 
   /* All of these functions return small values.  Thus we choose to
  have them return something that isn't a double-word.  */
-  if (unoptab == ffs_optab || unoptab == clz_optab || unoptab == ctz_optab
+  if (
+#ifndef DISABLE_FFS
+  unoptab == ffs_optab ||
+#endif
+    unoptab == clz_optab || unoptab == ctz_optab
   || unoptab == popcount_optab || unoptab == parity_optab)
 outmode
   = GET_MODE (hard_libcall_value (TYPE_MODE (integer_type_node),
@@ -6301,7 +6309,9 @@
   init_optab (addcc_optab, UNKNOWN);
   init_optab (one_cmpl_optab, NOT);
   init_optab (bswap_optab, BSWAP);
+#ifndef DISABLE_FFS
   init_optab (ffs_optab, FFS);
+#endif
   init_optab (clz_optab, CLZ);
   init_optab (ctz_optab, CTZ);
   init_optab (popcount_optab, POPCOUNT);
@@ -6558,9 +6568,11 @@
   one_cmpl_optab->libcall_basename = "one_cmpl";
   one_cmpl_optab->libcall_suffix = '2';
   one_cmpl_optab->libcall_gen = gen_int_libfunc;
+#ifndef DISABLE_FFS
   ffs_optab->libcall_basename = "ffs";
   ffs_optab->libcall_suffix = '2';
   ffs_optab->libcall_gen = gen_int_libfunc;
+#endif
   clz_optab->libcall_basename = "clz";
   clz_optab->libcall_suffix = '2';
   clz_optab->libcall_gen = gen_int_libfunc;
@@ -6643,11 +6655,13 @@
   satfractuns_optab->libcall_basename = "satfractuns";
   satfractuns_optab->libcall_gen = gen_satfractuns_conv_libfunc;
 
+#ifndef DISABLE_FFS
   /* The ffs function operates on `int'.  Fall back on it if we do not
  have a libgcc2 function for that width.  */
   if (INT_TYPE_SIZE < BITS_PER_WORD)
 set_optab_libfunc (ffs_optab, mode_for_size (INT_TYPE_SIZE, MODE_INT, 0),
        "ffs");
+#endif
 
   /* Explicitly initialize the bswap libfuncs since we need them to be
  valid for things other than word_mode.  */
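
For readers who do not have optabs.c open: the comments in the hunks above mention implementing ffs (x) and ctz (x) in terms of clz (x). The sketch below shows the identities that makes possible. It is not GCC code; the names clz32, ctz32 and ffs32 are made up for illustration, and 32-bit operands are an assumption.

```c
#include <stdint.h>

/* Count leading zero bits of X.  X must be nonzero, otherwise the
   loop below never terminates.  */
int clz32(uint32_t x)
{
    int n = 0;
    for (uint32_t mask = UINT32_C(0x80000000); (x & mask) == 0; mask >>= 1)
        n++;
    return n;
}

/* ctz in terms of clz: isolate the lowest set bit, then convert its
   distance from the top into a distance from the bottom.  X must be
   nonzero.  */
int ctz32(uint32_t x)
{
    uint32_t lowest = x & (~x + 1u);
    return 31 - clz32(lowest);
}

/* ffs numbers bits from 1 and returns 0 when no bit is set.  */
int ffs32(uint32_t x)
{
    return x == 0 ? 0 : ctz32(x) + 1;
}
```

With these definitions ffs32 (12) is 3, because the lowest set bit of binary 1100 is bit number 3 when counting from 1.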


Thanks,
 - Jay
  


vmsdbgout.c int-to-enum cast and #define globalref

2010-05-07 Thread Jay K

vmsdbgout.c has an int-to-enum warning and needs some form of "globalref" when 
host=alpha-dec-vms since that #includes the VMS system headers.
Perhaps gcc should recognize globalref when target=*vms* and at least interpret 
it as extern.

Thanks,
 - Jay

diff -u /src/orig/gcc-4.5.0/gcc/vmsdbgout.c ./vmsdbgout.c
--- /src/orig/gcc-4.5.0/gcc/vmsdbgout.c    2009-11-25 02:55:54.0 -0800
+++ ./vmsdbgout.c    2010-05-06 01:40:20.0 -0700
@@ -21,6 +21,8 @@
 along with GCC; see the file COPYING3.  If not see
 .  */
 
+#define globalref extern
+
 #include "config.h"
 #include "system.h"
 #include "coretypes.h"
@@ -743,7 +745,7 @@
   modbeg.dst_b_modbeg_flags.dst_v_modbeg_version = 1;
   modbeg.dst_b_modbeg_flags.dst_v_modbeg_unused = 0;
   modbeg.dst_b_modbeg_unused = 0;
-  modbeg.dst_l_modbeg_language = module_language;
+  modbeg.dst_l_modbeg_language = (DST_LANGUAGE)module_language;
   modbeg.dst_w_version_major = DST_K_VERSION_MAJOR;
   modbeg.dst_w_version_minor = DST_K_VERSION_MINOR;
   modbeg.dst_b_modbeg_name = strlen (module_name);
@@ -822,7 +824,7 @@
  + string count byte + string length */
   header.dst__header_length.dst_w_length
 = DST_K_DST_HEADER_SIZE - 1 + 1 + 4 + 1 + strlen (go);
-  header.dst__header_type.dst_w_type = 0x17;
+  header.dst__header_type.dst_w_type = (DST_DTYPE)0x17;
 
   totsize += write_debug_header (&header, "transfer", dosizeonly);
 


  


pic+64bit+sun assembler+unwind-tables => illegal cross section subtraction

2010-05-09 Thread Jay K

I haven't tried 4.5.0 yet.
 
 
-bash-4.1$ /opt/csw/gcc4/bin/g++ -v
Using built-in specs.
Target: i386-pc-solaris2.10
Configured with: ../gcc-4.3.3/configure --prefix=/opt/csw/gcc4 --exec-prefix=/op
t/csw/gcc4 --with-gnu-as --with-as=/opt/csw/bin/gas --without-gnu-ld --with-ld=/
usr/ccs/bin/ld --enable-nls --with-included-gettext --with-libiconv-prefix=/opt/
csw --with-x --with-mpfr=/opt/csw --with-gmp=/opt/csw --enable-java-awt=xlib --e
nable-libada --enable-libssp --enable-objc-gc --enable-threads=posix --enable-st
age1-languages=c --enable-languages=ada,c,c++,fortran,java,objc
Thread model: posix
gcc version 4.3.3 (GCC)
 

/opt/csw/gcc4/bin/g++ 1.cpp -fPIC -S -m64
"1.s", line 117 : Warning: Illegal subtraction - symbols from different 
sections: ".LFB2", ".DOT-2"
"1.s", line 120 : Warning: Illegal subtraction - symbols from different 
sections: ".LLSDA2", ".DOT-3"
void F1();
void F2()
{
  try { F1(); } catch(...) {F2(); }
}

 
 /usr/ccs/bin/as -xarch=amd64 1.s
 


or similar:
-bash-4.1$ cat 2.c
void F1() { }
 

   /opt/csw/gcc4/bin/gcc -fPIC -S -funwind-tables -m64 2.c
   /usr/ccs/bin/as -xarch=amd64 2.s
Assembler: 2.c
"2.s", line 38 : Warning: Illegal subtraction - symbols from different 
sections: ".LFB2", ".DOT-1"
 

I'm aware of this thread:
http://gcc.gnu.org/ml/gcc-patches/2006-07/msg00908.html
 
 

I think I'll switch to GNU as, or omit -funwind-tables for now.
  Or see if 4.5.0 fixes it.
Sparc32, sparc64, x86 work.
 
 
-gstabs+ also generated .stabd that Sun assembler didn't like.
I switched to -gstabs.
Maybe I messed up something though, as it looks like gcc is aware not to output 
.stabd to non-gas.
  More reason to use GNU assembler, understood.

 
http://gcc.gnu.org/install/specific.html#ix86-x-solaris210
 could be a bit more precise:
  >> Recent versions of the Sun assembler in /usr/ccs/bin/as work almost as 
well, though. 

 
"almost as well"?
Maybe that should say more, like, use -g or -gstabs instead of -gstabs+, don't 
use 64bit+pic+unwind-tables or 64bit+pic+exceptions
 
 
I switched to Sun assembler because I'm seeing GNU as installed in different 
places on different machines.
  Some people don't install /usr/sfw and then install it elsewhere.
 
 
 - Jay


RE: pic+64bit+sun assembler+unwind-tables => illegal cross section subtraction

2010-05-09 Thread Jay K

Ah, good point. I don't think my "real" scenario did that though.
I'll investigate more. Networking problems were (and are) hampering my attempts
 to download, configure, and build 4.5.0.
 
I did come up with this Makefile fragment:
 
Assemble = $(shell if test -x /opt/csw/gnu/as ; then echo /opt/csw/gnu/as ; \
 elif test -x /usr/sfw/bin/gas ; then echo /usr/sfw/bin/gas ; \
 else echo "unable to find GNU assembler" ; fi )
 
:) which addresses why I wasn't using GNU as.
 
(Yes, I've heard of autoconf.)
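
One caveat with the fragment above: when neither path exists, the error text itself ends up as the value of Assemble. A rough sketch of the same lookup in C, returning NULL in that case so the caller can report the failure, might look like this. The function name find_first_executable is made up for illustration, and access() with X_OK is only an approximation of "test -x".

```c
#include <stddef.h>
#include <unistd.h>

/* Return the first entry of CANDIDATES that the current user may
   execute, or NULL when none of them qualifies.  */
const char *find_first_executable(const char *const *candidates, size_t count)
{
    for (size_t i = 0; i < count; i++)
        if (access(candidates[i], X_OK) == 0)
            return candidates[i];
    return NULL;
}
```

The caller would pass the two paths from the Makefile, /opt/csw/gnu/as and /usr/sfw/bin/gas, in that order.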
 
Thanks, later,
 - Jay



> From: pins...@gmail.com
> To: jay.kr...@cornell.edu
> Subject: Re: pic+64bit+sun assembler+unwind-tables => illegal cross section 
> subtraction
> Date: Sun, 9 May 2010 17:48:04 -0700
> CC: gcc@gcc.gnu.org
>
>
>
> Sent from my iPhone
>
> On May 9, 2010, at 5:42 PM, Jay K wrote:
>
>>
>> I haven't tried 4.5.0 yet.
>>
>>
>> -bash-4.1$ /opt/csw/gcc4/bin/g++ -v
>> Using built-in specs.
>> Target: i386-pc-solaris2.10
>> Configured with: ../gcc-4.3.3/configure --prefix=/opt/csw/gcc4 --
>> exec-prefix=/op
>> t/csw/gcc4
>
>
>
>> --with-gnu-as
>
>
> You configured gcc to build with the gnu as but then run with it so
> what do you expect.


RE: pic+64bit+sun assembler+unwind-tables => illegal cross section subtraction

2010-05-09 Thread Jay K

Fixed in 4.4.0.
 
I was getting:
 
.LASFDE1:
.long   .LASFDE1-.Lframe1
.long   .LFB2-.  <<<
.long   .LFE2-.LFB2
 
4.5.0 configured right:
 
.LASFDE1:
.long   .LASFDE1-.Lframe1
.long   .l...@rel <<< 
.long   .LFE0-.LFB0
 
dw2_asm_output_encoded_addr_rtx =>
 
#ifdef ASM_OUTPUT_DWARF_PCREL
   ASM_OUTPUT_DWARF_PCREL (asm_out_file, size, XSTR (addr, 0));
#else
   dw2_assemble_integer (size, gen_rtx_MINUS (Pmode, addr, pc_rtx));
#endif

  
 C:\src\gcc-4.4.0\gcc\config\i386\sol2-10.h(45):#define 
ASM_OUTPUT_DWARF_PCREL(FILE, SIZE, LABEL) \
 C:\src\gcc-4.5.0\gcc\config\i386\sol2-10.h(46):#define 
ASM_OUTPUT_DWARF_PCREL(FILE, SIZE, LABEL) \

 
#define ASM_OUTPUT_DWARF_PCREL(FILE, SIZE, LABEL) \
  do {   \
fputs (integer_asm_op (SIZE, FALSE), FILE);  \
assemble_name (FILE, LABEL);   \
fputs (SIZE == 8 ? "@rel64" : "@rel", FILE); \
  } while (0)
#endif

2009-01-29  Rainer Orth  
 * config/i386/sol2-10.h [!HAVE_AS_IX86_DIFF_SECT_DELTA]
 (ASM_OUTPUT_DWARF_PCREL): Define.

http://gcc.gnu.org/viewcvs?view=revision&revision=143758
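
To make the effect of the macro concrete, here is a small standalone sketch that builds the same text into a buffer. It is not GCC code: format_dwarf_pcrel and int_directive are hypothetical names, and the ".long"/".quad" strings returned by int_directive are an assumption standing in for whatever integer_asm_op returns on this target.

```c
#include <stdio.h>

/* Hypothetical stand-in for integer_asm_op (SIZE, FALSE); the exact
   directive strings are assumed, not taken from GCC.  */
static const char *int_directive(int size)
{
    return size == 8 ? "\t.quad\t" : "\t.long\t";
}

/* Write "<directive><label>@rel" (or "@rel64" for 8-byte values) into
   BUF, mirroring the three steps of ASM_OUTPUT_DWARF_PCREL.  Returns
   what snprintf returns.  */
int format_dwarf_pcrel(char *buf, size_t bufsz, int size, const char *label)
{
    return snprintf(buf, bufsz, "%s%s%s", int_directive(size), label,
                    size == 8 ? "@rel64" : "@rel");
}
```

For a 4-byte value and the label .LFB0 this yields a line ending in ".LFB0@rel", which replaces the ".LFB2-." subtraction that the Sun assembler rejected.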
 
 
Thanks,
 - Jay





-disable-fixincludes doesn't quite work, minor

2010-05-10 Thread Jay K

-disable-libgcc and/or -disable-fixincludes are useful, depending on your goal.

 Like if you just want to compile C to assembly or object files.


It fails, but only after doing what I want anyway.

make[2]: *** No rule to make target 
`../build-sparc-sun-solaris2.10/fixincludes/fixinc.sh', needed by 
`stmp-fixinc'.  Stop.
gmake[2]: Leaving directory `/home/jkrell/obj/gcc.sparc/gcc'
gmake[1]: *** [all-gcc] Error 2

Definitely not a big deal.


$HOME/src/gcc-4.5.0/configure -without-gnu-ld -with-ld=/usr/ccs/bin/ld 
-without-gnu-as -with-as=/usr/ccs/bin/as -disable-nls -disable-fixincludes 
-verbose -disable-libgcc -disable-bootstrap
gmake


(I'd still like -disable-intl; I always just delete that directory. 
-disable-intl /might/ work, I should try it again.
I'd also like  -without-libiconv, which I'm sure doesn't work. This way the 
resulting gcc doesn't require
libiconv, which isn't on all systems. I have automation to patch out the 
libiconv use.)

 - Jay
  


RE: pic+64bit+sun assembler+unwind-tables => illegal cross section subtraction

2010-05-10 Thread Jay K

It might also be necessary to configure for i586-sun-solaris2.10 instead of 
i586-solaris2.10.
Something I read said you can use various shorter forms, and I like the idea 
for convenience and to avoid those "pc"s and "unknown"s,
but this seems to have bitten me a number of times, not just today.

Anyway, it is working for me, configure -without-gnu-as i586-sun-solaris2.10, 
having applied Rainer's change to a 4.3 tree.
 (Yes, I'd like to upgrade.)

It seems using GNU as might still be slightly preferred in order to move data 
(jump tables) out of .text and into read only data, like, you know, "the less 
that is executable, the more secure". Though for locality, .text might be 
better.

For now I'm erring toward using what is more often present in the same location.

I should have just waited till I tested with 4.5, that would have shut me up. :)

Thanks,
 - Jay



RE: -disable-fixincludes doesn't quite work, minor

2010-05-10 Thread Jay K

OK if I do both, or are the emails just annoying?
I find that bugs are often ignored just as well (but not lost/forgotten, 
granted. :) )
 
Thanks,
 - Jay



> To: jay.kr...@cornell.edu
> CC: gcc@gcc.gnu.org
> Subject: Re: -disable-fixincludes doesn't quite work, minor
> From: i...@google.com
> Date: Mon, 10 May 2010 09:50:01 -0700
>
> Jay K writes:
>
>> -disable-libgcc and/or -disable-fixincludes are useful, depending on your 
>> goal.
>>
>> Like if you just want to compile C to assembly or object files.
>>
>>
>> It fails, but only after doing what I want anyway.
>>
>> make[2]: *** No rule to make target 
>> `../build-sparc-sun-solaris2.10/fixincludes/fixinc.sh', needed by 
>> `stmp-fixinc'. Stop.
>> gmake[2]: Leaving directory `/home/jkrell/obj/gcc.sparc/gcc'
>> gmake[1]: *** [all-gcc] Error 2
>
>
> Thanks for pointing these things out. However, I hope you are filing
> bug reports about these issues you are raising. Problems which are
> only reported to the mailing list will be reliably lost and forgotten.
> If you want them to be fixed, please open bug reports. Thanks.
>
> Ian 


rep prefix doesn't work with Solaris 2.9 Sun assembler

2010-05-11 Thread Jay K

Solaris 2.9 x86 gcc 4.5.0 configure -without-gnu-as -with-as=/usr/ccs/bin/as

 
 => Assembly syntax errors in gcov.c wherever there is a rep prefix.
 

I was actually looking for a problem with lock prefixes on 4.3 -- testing 
4.5.0, found this instead, which is about the same.
 
 
See:
 http://gcc.gnu.org/viewcvs?view=revision&revision=127728
  for handling of the lock prefix in a separate instruction.
 
 
See: 
http://developers.sun.com/sunstudio/downloads/ssx/express_Feb2008_readme.html
 "You can now place lock/rep/repnz/repz/repe/repne prefix on the same line as 
the following instruction."
 
 
But I'd like to stay compatible with the existing Sun assembler 
at /usr/ccs/bin/as.
 
 
I considered just changing them all to \;, like there is rep\;ret, but
I noticed this:
sync.md:  "lock{%;| }or{l}\t{$0, (%%esp)|DWORD PTR [esp], 0}"
 

which appears to be an attempt to output Microsoft/Intel assembly, so I went
with the space usually and ; only for Darwin/Solaris, like how sync.md
was already using ; for Darwin.
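
For anyone not familiar with the {att|intel} notation in those templates, here is a rough standalone sketch of how the brace groups select per-dialect text. It is not the GCC implementation; expand_dialect is a made-up name, and the sketch leaves "%;" untouched (in GCC that sequence is handled later, by the case ';' code in i386.c).

```c
#include <stddef.h>

/* Expand "{alt0|alt1|...}" groups in TMPL, keeping only alternative
   number DIALECT, and store the result in OUT (at most OUTSZ bytes,
   always NUL-terminated when OUTSZ > 0).  Returns the length stored.  */
size_t expand_dialect(const char *tmpl, int dialect, char *out, size_t outsz)
{
    size_t len = 0;
    int in_group = 0;   /* nonzero while inside {...} */
    int alt = 0;        /* index of the alternative being scanned */

    if (outsz == 0)
        return 0;
    for (const char *p = tmpl; *p != '\0'; p++) {
        if (!in_group && *p == '{') {
            in_group = 1;
            alt = 0;
        } else if (in_group && *p == '|') {
            alt++;
        } else if (in_group && *p == '}') {
            in_group = 0;
        } else if (!in_group || alt == dialect) {
            if (len + 1 < outsz)
                out[len++] = *p;
        }
    }
    out[len] = '\0';
    return len;
}
```

With dialect 0 (AT&T) the template "rep{%;| }movs{l|d}" becomes "rep%;movsl"; with dialect 1 (Intel) it becomes "rep movsd", so only the AT&T form ever reaches the ';' handling.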
 

Proposed patch below/attached.
  (-w to hide indent change)
 

I'll open a bug. And test it on some machines maybe.
Any marked with TARGET_64BIT I left alone. Maybe that is too inconsistent 
though.
 
 

diff -uw /home/jkrell/src/orig/gcc-4.5.0/gcc/config/i386/i386.c ./i386.c
--- /home/jkrell/src/orig/gcc-4.5.0/gcc/config/i386/i386.c Wed Apr  7 23:58:27 
2010
+++ ./i386.c Tue May 11 10:01:54 2010
@@ -11896,11 +11896,10 @@
return;
 
  case ';':
-#if TARGET_MACHO
+   if (TARGET_MACHO || TARGET_SOLARIS)
fputs (" ; ", file);
-#else
+   else
putc (' ', file);
-#endif
return;
 
  default:
diff -uw /home/jkrell/src/orig/gcc-4.5.0/gcc/config/i386/i386.h ./i386.h
--- /home/jkrell/src/orig/gcc-4.5.0/gcc/config/i386/i386.h Wed Mar 24 21:44:48 
2010
+++ ./i386.h Tue May 11 09:59:01 2010
@@ -467,6 +467,9 @@
redefines this to 1.  */
 #define TARGET_MACHO 0
 
+/* Like TARGET_MACHO, redefined in sol2.h. */
+#define TARGET_SOLARIS 0
+
 /* Likewise, for the Windows 64-bit ABI.  */
 #define TARGET_64BIT_MS_ABI (TARGET_64BIT && ix86_cfun_abi () == MS_ABI)
 
diff -uw /home/jkrell/src/orig/gcc-4.5.0/gcc/config/i386/i386.md ./i386.md
--- /home/jkrell/src/orig/gcc-4.5.0/gcc/config/i386/i386.md Wed Mar 24 19:49:49 
2010
+++ ./i386.md Tue May 11 09:49:05 2010
@@ -13811,7 +13811,7 @@
   [(return)
(unspec [(const_int 0)] UNSPEC_REP)]
   "reload_completed"
-  "rep\;ret"
+  "rep{%;| }ret"
   [(set_attr "length" "2")
(set_attr "atom_unit" "jeu")
(set_attr "length_immediate" "0")
@@ -17772,7 +17772,7 @@
  (mem:BLK (match_dup 4)))
(use (match_dup 5))]
   "!TARGET_64BIT"
-  "rep movs{l|d}"
+  "rep{%;| }movs{l|d}"
   [(set_attr "type" "str")
(set_attr "prefix_rep" "1")
(set_attr "memory" "both")
@@ -17808,7 +17808,7 @@
  (mem:BLK (match_dup 4)))
(use (match_dup 5))]
   "!TARGET_64BIT"
-  "rep movsb"
+  "rep{%;| }movsb"
   [(set_attr "type" "str")
(set_attr "prefix_rep" "1")
(set_attr "memory" "both")
@@ -18023,7 +18023,7 @@
(use (match_operand:SI 2 "register_operand" "a"))
(use (match_dup 4))]
   "!TARGET_64BIT"
-  "rep stos{l|d}"
+  "rep{%;| }stos{l|d}"
   [(set_attr "type" "str")
(set_attr "prefix_rep" "1")
(set_attr "memory" "store")
@@ -18056,7 +18056,7 @@
(use (match_operand:QI 2 "register_operand" "a"))
(use (match_dup 4))]
   "!TARGET_64BIT"
-  "rep stosb"
+  "rep{%;| }stosb"
   [(set_attr "type" "str")
(set_attr "prefix_rep" "1")
(set_attr "memory" "store")
@@ -18188,7 +18188,7 @@
(clobber (match_operand:SI 1 "register_operand" "=D"))
(clobber (match_operand:SI 2 "register_operand" "=c"))]
   "!TARGET_64BIT"
-  "repz cmpsb"
+  "repz{%;| }cmpsb"
   [(set_attr "type" "str")
(set_attr "mode" "QI")
(set_attr "prefix_rep" "1")])
@@ -18239,7 +18239,7 @@
(clobber (match_operand:SI 1 "register_operand" "=D"))
(clobber (match_operand:SI 2 "register_operand" "=c"))]
   "!TARGET_64BIT"
-  "repz cmpsb"
+  "repz{%;| }cmpsb"
   [(set_attr "type" "str")
(set_attr "mode" "QI")
(set_attr "prefix_rep" "1")])
@@ -18305,7 +18305,7 @@
(clobber (match_operand:SI 1 "register_operand" "=D"))
(clobber (reg:CC FLAGS_REG))]
   "!TARGET_64BIT"
-  "repnz scasb"
+  "repnz{%;| }scasb"
   [(set_attr "type" "str")
(set_attr "mode" "QI")
(set_attr "prefix_rep" "1")])
diff -uw /home/jkrell/src/orig/gcc-4.5.0/gcc/config/i386/sol2.h ./sol2.h
--- /home/jkrell/src/orig/gcc-4.5.0/gcc/config/i386/sol2.h Wed Mar 31 11:03:29 
2010
+++ ./sol2.h Tue May 11 10:02:17 2010
@@ -172,3 +172,6 @@
 #define TF_SIZE 113
 
 #define MD_UNWIND_SUPPORT "config/i386/sol2-unwind.h"
+
+#undef  TARGET_SOLARIS
+#define TARGET_SOLARIS 1
 
 
Thanks,
 - Jay

RE: rep prefix doesn't work with Solaris 2.9 Sun assembler

2010-05-11 Thread Jay K

Understood, but I'll have to stick to "small" changes as I can't get the papers.

Uros pointed to:
http://gcc.gnu.org/ml/gcc-patches/2010-05/msg00657.html

which appears to just be *very* coincident timing.
So I assume Rainer will fix it soon.
I have a patch now based on that discussion.

I used:
    case ';':
      if ((TARGET_MACHO || TARGET_SOLARIS) && ASSEMBLER_DIALECT == ASM_ATT)
        fputs (" ; ", file);
      else
        fputc (' ', file);
      return;

though even better might be to assume 64bit implies more recent assembler:

    case ';':
      if ((TARGET_MACHO || TARGET_SOLARIS) && ASSEMBLER_DIALECT == ASM_ATT && 
!TARGET_64BIT)
        fputs (" ; ", file);
      else
        fputc (' ', file);
      return;
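
Pulled out of print_operand, the second variant boils down to the decision below. This is only a sketch for discussion: prefix_separator is a hypothetical name, and the three int parameters stand in for (TARGET_MACHO || TARGET_SOLARIS), ASSEMBLER_DIALECT == ASM_ATT and TARGET_64BIT.

```c
/* Return the text to emit between a prefix such as "rep" or "lock"
   and the instruction that follows it.  */
const char *prefix_separator(int macho_or_solaris, int att_dialect,
                             int is_64bit)
{
    /* Older assemblers on these targets want the prefix as a separate
       statement; 64-bit is assumed to imply a newer assembler.  */
    if (macho_or_solaris && att_dialect && !is_64bit)
        return " ; ";
    return " ";
}
```

So a 32-bit Solaris build in AT&T syntax would print "rep ; ret", while every other combination keeps the single space.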

I guess there was an obscure bug where -masm=intel didn't work for Darwin 
targets?
What, do people use -masm=intel with masm/nasm/yasm instead of gas? Or just to 
make the output easier for a human to read?

Thanks,
 - Jay


> From: ebotca...@adacore.com
> To: jay.kr...@cornell.edu
> Subject: Re: rep prefix doesn't work with Solaris 2.9 Sun assembler
> Date: Tue, 11 May 2010 10:35:12 +0200
> CC: gcc@gcc.gnu.org
>
>> Proposed patch below/attached.
>> (-w to hide indent change)
>
> See http://gcc.gnu.org/contribute.html for guidelines.
>
>> I'll open a bug.
>
> See http://gcc.gnu.org/bugs for guidelines.
>
> Generally speaking, posting a patch inlined in a message on gcc@gcc.gnu.org
> will most likely result in it being lost and forgotten. In order to report
> an issue, please open a ticket with bugzilla. In order to submit a patch,
> please use gcc-patc...@gcc.gnu.org. In both cases, follow the guidelines
> written down in the aforementioned documentation. Thanks in advance.
>
> --
> Eric Botcazou
  


FW: [Bug c/44166] New: -fvisibility=protected doesn't work?

2010-05-17 Thread Jay K



> Date: Mon, 17 May 2010 13:41:57 +
> Subject: [Bug c/44166] New: -fvisibility=protected doesn't work?
> To: jay.kr...@cornell.edu
> From: gcc-bugzi...@gcc.gnu.org
>
> -fvisibility=protected doesn't work?
>
> a...@xlin2:~$ cat 1.c
> void F1() { }
> void* F2() { return F1; }
>
> j...@xlin2:~$ $HOME/bin/gcc 1.c -fPIC -shared -fvisibility=protected
>
> /usr/bin/ld: /tmp/cc0d6EQ3.o: relocation R_386_GOTOFF against protected
> function `F1' can not be used when making a shared object
> /usr/bin/ld: final link failed: Bad value
> collect2: ld returned 1 exit status
>
> j...@xlin2:~$ $HOME/bin/gcc -v
> Using built-in specs.
> COLLECT_GCC=/home/jay/bin/gcc
> COLLECT_LTO_WRAPPER=/home/jay/libexec/gcc/i686-pc-linux-gnu/4.5.0/lto-wrapper
> Target: i686-pc-linux-gnu
> Configured with: /src/gcc-4.5.0/configure -verbose -prefix=/home/jay
> -disable-bootstrap -disable-multilibs
> Thread model: posix
> gcc version 4.5.0 (GCC)
>
>
> --
> Summary: -fvisibility=protected doesn't work?
> Product: gcc
> Version: 4.5.0
> Status: UNCONFIRMED
> Severity: normal
> Priority: P3
> Component: c
> AssignedTo: unassigned at gcc dot gnu dot org
> ReportedBy: jay dot krell at cornell dot edu
> GCC build triplet: i686-pc-linux-gnu
> GCC host triplet: i686-pc-linux-gnu
> GCC target triplet: i686-pc-linux-gnu
>
>
> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=44166
>
> --- You are receiving this mail because: ---
> You reported the bug, or are watching the reporter.
  


alpha-dec-osf5.1 4.5 built/installed

2010-06-10 Thread Jay K

per http://gcc.gnu.org/install/finalinstall.html

Built/installed 4.5 on alpha-dec-osf.

alphaev67-dec-osf5.1

bash-4.1$ gcc -v
Using built-in specs.
COLLECT_GCC=gcc
COLLECT_LTO_WRAPPER=/home/jayk/libexec/gcc/alphaev67-dec-osf5.1/4.5.0/lto-wrapper
Target: alphaev67-dec-osf5.1
Configured with: /home/jayk/src/gcc-4.5.0/configure -disable-nls 
-prefix=/home/jayk
Thread model: posix
gcc version 4.5.0 (GCC) 

C and C++
Though I meant to let it do "all".

This isn't a "normal modern" 5.1, like 5.1a or 5.1b, but is old 5.1 rev 732.

Not easy to get the prerequisites for running tests: 
http://gcc.gnu.org/ml/gcc-testresults/2010-06/msg00967.html
More followup still to do.

and I only ran make check in the gcc directory, to skip gmp/mpfr/mpc.


 - Jay
  


how to get instruction codes in gcc passes?

2010-06-13 Thread Ilya K
Hi all.
(I have never used these mailing lists before. Sorry if I get something wrong.)

I am new to gcc and I need some help.

I am doing research on code optimization.
I have to write my own optimization pass for gcc, and in this pass
I need to get the instruction codes of the resulting assembly code.

I put my pass just before "NEXT_PASS (pass_final);" in the
init_optimization_passes function, so I think the asm instructions
are almost ready when my pass starts its work.
The pass is already inserted into the gcc code and runs. gcc builds,
and I can see my debug output in a specially generated file when gcc
runs.
I have no useful code yet; I just want to get some
information to start development.

To begin with, I want to do something like this:
for (insn = get_insns() ; insn ; insn = NEXT_INSN(insn))
{
int code = ...;   //I need help in this line!!!
myDebugOutputShowCode(++i, code);
}

That is, I just want to see the whole list of instruction codes, like
an assembler listing.

Can you give me some advice on how to do it?
I had a look at some *.md files in the gcc sources, but did not find any
source for the codes of assembler instructions. How does gcc generate
the binary file? Where does it get the binary representation of an asm
instruction?
Where is it described that "nop" is the value "0F 1F" for the x86 architecture,
and so on?

Thanks for any help.


Re: how to get instruction codes in gcc passes?

2010-06-13 Thread Ilya K
On Sun, Jun 13, 2010 at 6:38 PM, Richard Guenther
 wrote:
> GCC does not have an integrated assembler and only outputs
> assembler source.  At the point above you still have RTXen
> where you can query INSN_CODE to see what instruction from
> the machine description was matched.  In the define_insns
> there are patterns for generating the assembler source instruction.
>
> Richard.
>

OK. No low-level instruction codes in gcc.
I found the definitions of instructions in i386.md (thanks for this advice):

(define_insn "nop"
  [(const_int 0)]
  ""
  "nop"
  [(set_attr "length" "1")
   (set_attr "length_immediate" "0")
   (set_attr "modrm" "0")])

So gcc only knows that some specific internal representation
should be emitted as the "nop" string, right?

And what generates the binary file from the asm file? Does gcc call
an external program? What is the name of this program? Or can the sources I
need be found inside the gcc directory?

I think that if there is no such instruction-code table inside gcc, I
can build my own version of it inside my pass. But I need the sources
of some utility that knows how to convert the "nop" string into a
binary number. Where can I get it?


Re: how to get instruction codes in gcc passes?

2010-06-13 Thread Ilya K
On Sun, Jun 13, 2010 at 7:38 PM, Dave Korn  wrote:
...
>  What exact use would it be to you having the opcode bytes known during an
> optimisation pass?  There may be a better/easier way to do whatever it is
> you're trying to do.
>
>    cheers,
>      DaveK
>

My main aim is to build a platform-dependent optimization based on
minimizing the Hamming distance between instructions.
So I need platform-specific information such as instruction codes.
Well, maybe I should try to put it into GAS rather than gcc. I
have just never looked at these assemblers.
I thought that since gcc already has a base of optimizations, it
would be the best place to put another one. I do not know yet whether the
assembler tools have room for optimization.
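
For concreteness, the Hamming distance between two instruction encodings is the number of bit positions in which they differ. The following is only a minimal sketch in C, assuming the encodings are available as byte arrays. The names `popcount8` and `encoding_hamming_distance`, and the rule of zero-padding the shorter encoding, are assumptions of this sketch and not something gcc or binutils provides.

```c
#include <stddef.h>
#include <stdint.h>

/* Count the set bits in one byte (portable, no compiler builtins). */
static unsigned popcount8(uint8_t b)
{
    unsigned n = 0;
    while (b) {
        b &= (uint8_t)(b - 1);   /* clear the lowest set bit */
        ++n;
    }
    return n;
}

/* Hypothetical helper: Hamming distance between two encodings.
   The shorter encoding is treated as if padded with zero bytes. */
unsigned encoding_hamming_distance(const uint8_t *a, size_t alen,
                                   const uint8_t *b, size_t blen)
{
    size_t n = alen > blen ? alen : blen;
    unsigned dist = 0;
    size_t i;

    for (i = 0; i < n; ++i) {
        uint8_t x = i < alen ? a[i] : 0;
        uint8_t y = i < blen ? b[i] : 0;
        dist += popcount8((uint8_t)(x ^ y));   /* differing bits in this byte */
    }
    return dist;
}
```

A caller would pass two byte sequences, for example the `0F 1F` bytes mentioned earlier in the thread and some other encoding, and compare the returned bit counts across candidate instruction orderings.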

Ilya K


Re: how to get instruction codes in gcc passes?

2010-06-14 Thread Ilya K
On 13/06/2010 20:57, Ian Lance Taylor wrote:
> Take a look at http://code.google.com/p/mao/ .
> Ian

Thanks, Ian! This project looks very interesting.
I will try to play with it.



On Mon, Jun 14, 2010 at 2:28 AM, Dave Korn  wrote:
>  Or in binutils, LD's relaxation infrastructure might be usable to this end.
>  But I think if you want something so platform dependent as to care about the
> bitpatterns of opcodes, it almost certainly ought to live in the assembler or
> linker rather than then compiler.
>
>    cheers,
>      DaveK

Yes, I agree that the compiler may not be the best place for my
optimizations. I will see whether binutils or mao provides a better
environment for this development.
Thanks for pointing out where low-level optimizations
are more suitable.

Ilya K


Re: how to get instruction codes in gcc passes?

2010-06-14 Thread Ilya K
On Mon, Jun 14, 2010 at 12:25 AM, Basile Starynkevitch
 wrote:
>
> Why do you want to optimize the generated assembly code? AFAIK, all
> optimization passes in GCC work on some intermediate representation
> which is not the assembly code, and many of them work on Gimple.
> ...
> ... GCC emits textual assembly code. The
> assembler (that is binutils, not GCC) knows that nop is 0f 1f.
>
> Cheers.
>
> PS. I might have some details wrong; I am not very familiar with GCC
> back-ends & RTL passes.
>
> --
> Basile STARYNKEVITCH         http://starynkevitch.net/Basile/
> email: basilestarynkevitchnet mobile: +33 6 8501 2359
> 8, rue de la Faiencerie, 92340 Bourg La Reine, France
> *** opinions {are only mines, sont seulement les miennes} ***
>
>

Yes, thanks. I have already seen that GCC does not have the
instruction codes. Nevertheless I have to work at the low level, in
the back end. That is the nature of my work :).
So maybe I'll just switch to inserting my code into binutils or the mao project.

Ilya K


ARM FLOOR_MOD_EXPR?

2010-06-19 Thread Jay K

Do FLOOR_DIV_EXPR and FLOOR_MOD_EXPR work on ARM for 64-bit signed integers?
I have a front end (sort of), using gcc 4.3, that generates trees, and it doesn't work:
   type_for_mode(TImode) is NULL and that is dereferenced.
I realize TRUNC_* would be far more "normal", but I can't change that.
I guess I'll just go back to generating function calls.
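
For readers unfamiliar with the distinction: C's `/` and `%` truncate toward zero (the TRUNC_* flavor), while the FLOOR_* flavor rounds the quotient toward negative infinity, so the two differ when the operands have opposite signs. The following is a rough sketch of the kind of helper a front end could call instead of emitting FLOOR_DIV_EXPR or FLOOR_MOD_EXPR. The function names are hypothetical and are not taken from any runtime library.

```c
#include <stdint.h>

/* Floor division: quotient rounded toward negative infinity.
   The caller must ensure b != 0 and must avoid INT64_MIN / -1. */
int64_t floor_div64(int64_t a, int64_t b)
{
    int64_t q = a / b;               /* C99: truncates toward zero */
    int64_t r = a % b;
    if (r != 0 && ((r < 0) != (b < 0)))
        --q;                         /* signs differ: step down one */
    return q;
}

/* Floor modulus: the result is zero or has the sign of the divisor. */
int64_t floor_mod64(int64_t a, int64_t b)
{
    int64_t r = a % b;               /* has the sign of a, or is zero */
    if (r != 0 && ((r < 0) != (b < 0)))
        r += b;
    return r;
}
```

With these definitions the identity a == floor_div64(a, b) * b + floor_mod64(a, b) holds for the inputs allowed above.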

 - Jay
  


RE: ARM FLOOR_MOD_EXPR?

2010-06-20 Thread Jay K

in calls.c:

  tfom = lang_hooks.types.type_for_mode (outmode, 0);
  if (aggregate_value_p (tfom, 0))

For 64-bit mod, outmode ends up as TImode.
Our frontend doesn't support TImode -- reasonable? -- and so type_for_mode
returns NULL here.
aggregate_value_p then dereferences that NULL.

At least that's what happens in 4.3.

I tried hacking the C frontend to interpret % as FLOOR instead of TRUNC.
That works, though -- the C frontend supports TImode.
Seems a little bit odd to depend on that?

 - Jay


  


suggest assert wide_int larger than hashval_t

2010-07-18 Thread Jay K

I get this in 4.3.5:

../../gcc/gcc/varasm.c: In function `const_rtx_hash_1':
../../gcc/gcc/varasm.c:3387: warning: right shift count >= width of type

./include/hashtab.h:typedef unsigned int hashval_t;

  unsigned HOST_WIDE_INT hwi;
  hashval_t h, *hp;
 ...
    const int shift = sizeof (hashval_t) * CHAR_BIT;
    const int n = sizeof (HOST_WIDE_INT) / sizeof (hashval_t);
    int i;

    h ^= (hashval_t) hwi;
    for (i = 1; i < n; ++i)
      {
        hwi >>= shift;  /* <-- warning here */


It looks about the same in 4.5.0 except without const:


    int shift = (sizeof (hashval_t) * CHAR_BIT);


Something is amiss here locally, for the types to be the same size.


But maybe add gcc_assert(sizeof(hashval_t) < sizeof(HOST_WIDE_INT))
outside the loop? It should be optimized away anyway.


Maybe I'd get -Werror but I use -disable-bootstrap.
Native compiler is gcc, but old.


Thanks,
 - Jay

  


RE: suggest assert wide_int larger than hashval_t (32bit hwint?)

2010-07-19 Thread Jay K

Hm, it seems that on some 32-bit systems, where there is no -m64, HOST_WIDE_INT can be a
mere 32 bits.

In which case the code should probably say:

hwi = ((hwi >> (shift - 1)) >> 1);

This was targeting OpenBSD/x86.
Maybe I should just set need_64bit_hwint = yes in config.gcc for that and
move along?
Assume there is always long long or __int64?
Coverage of this case is pretty rare now, from my skimming.
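
The split shift can be tried outside gcc. In C, shifting by a count greater than or equal to the width of the type is undefined, and the compiler may warn about it even when the statement is in a loop that never runs. Splitting the shift into `shift - 1` and `1` keeps each count below the width. The following is only a sketch: `wide_t` is a stand-in for unsigned HOST_WIDE_INT, and `fold_wide` is a hypothetical name, not the function in varasm.c.

```c
#include <limits.h>

typedef unsigned int hashval_t;        /* as in include/hashtab.h */
typedef unsigned long long wide_t;     /* stand-in for unsigned HOST_WIDE_INT */

/* Fold a wide value into a hashval_t by XORing hashval_t-sized chunks. */
hashval_t fold_wide(wide_t hwi)
{
    const int shift = sizeof (hashval_t) * CHAR_BIT;
    const int n = sizeof (wide_t) / sizeof (hashval_t);
    hashval_t h = (hashval_t) hwi;
    int i;

    /* When both types have the same size, n == 1 and the loop never runs. */
    for (i = 1; i < n; ++i)
      {
        /* Two shifts, each with a count smaller than the width of wide_t,
           so the expression is well defined even if shift == width. */
        hwi = (hwi >> (shift - 1)) >> 1;
        h ^= (hashval_t) hwi;
      }
    return h;
}
```

On a host where `wide_t` is twice the size of `hashval_t`, the loop body runs once and XORs the high half into the low half.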

 - Jay


  


RE: suggest assert wide_int larger than hashval_t

2010-07-19 Thread Jay K

It's "just" a warning, no "real" effects seen.
I patched my copy to say
  hwi = ((hwi >> (shift - 1)) >> 1);

Thanks,
 - Jay


> From: i...@google.com
> To: jay.kr...@cornell.edu
> CC: gcc@gcc.gnu.org
> Subject: Re: suggest assert wide_int larger than hashval_t
> Date: Mon, 19 Jul 2010 00:36:06 -0700
>
> Jay K  writes:
>
> > I get this in 4.3.5:
> >
> > ../../gcc/gcc/varasm.c: In function `const_rtx_hash_1':
> > ../../gcc/gcc/varasm.c:3387: warning: right shift count >= width of type
> >
> > ./include/hashtab.h:typedef unsigned int hashval_t;
> >
> >   unsigned HOST_WIDE_INT hwi;
> >   hashval_t h, *hp;
> >  ...
> > const int shift = sizeof (hashval_t) * CHAR_BIT;
> > const int n = sizeof (HOST_WIDE_INT) / sizeof (hashval_t);
> > int i;
> >
> > h ^= (hashval_t) hwi;
> > for (i = 1; i < n; ++i)
> >   {
> > hwi >>= shift;  here
>
> It's not an actual problem, because the code is never executed in the
> case where right shift count >= width of type. Note that the loop
> starts at 1. If this is breaking bootstrap then it needs to be
> addressed one way or another; otherwise it's not too serious.
>
> Ian
  


http://gcc.gnu.org/install/specific.html maybe should mention memory settings for OpenBSD

2010-08-24 Thread Jay K

Possibly a note for:


http://gcc.gnu.org/install/specific.html
under OpenBSD.


or just for the mail archives:


Building a *slight* fork of 4.5.1 on OpenBSD/x86 4.7 I hit


gcc -c  -g -O2 -static -DIN_GCC   -W -Wall -Wwrite-strings \
-Wstrict-prototypes -Wmissing-prototypes -Wmissing-format-attribute   \
-DHAVE_CONFIG_H -I. -I. -I../../gcc-4.5/gcc -I../../gcc-4.5/gcc/. \
-I../../gcc-4.5/gcc/../include -I../../gcc-4.5/gcc/../libcpp/include \
-I/home/jay/dev2/cm3/m3-sys/m3cc/I386_OPENBSD/./gmp \
-I/home/jay/dev2/cm3/m3-sys/m3cc/gcc-4.5/gmp \
-I/home/jay/dev2/cm3/m3-sys/m3cc/I386_OPENBSD/./mpfr \
-I/home/jay/dev2/cm3/m3-sys/m3cc/gcc-4.5/mpfr \
-I/home/jay/dev2/cm3/m3-sys/m3cc/gcc-4.5/mpc/src \
-I../../gcc-4.5/gcc/../libdecnumber \
-I../../gcc-4.5/gcc/../libdecnumber/dpd -I../libdecnumber \
-I/usr/local/include insn-attrtab.c -o insn-attrtab.o


cc1: out of memory allocating 304988696 bytes after a total of 0 bytes
gmake: *** [insn-attrtab.o] Error 1


This was not a problem with 4.3.0 or 4.3.5. I don't know about 4.4.x.
We skipped them, just because we are slow and lagging.
 

I couldn't get ulimit to do anything as non-root. I do have swap.


I changed these from 512 to 768, probably not all of them necessary:


# pwd
/etc
# grep 768 login.conf  
    :datasize-max=768M:\
    :datasize-cur=768M:\
    :datasize-cur=768M:\


and then I can proceed.
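
To see what limit a process actually runs under, the data-segment limit can be queried from C. This is a sketch under the assumption that the `datasize-cur` and `datasize-max` entries in login.conf correspond to the soft and hard values of RLIMIT_DATA; the function name `print_data_limit` is hypothetical.

```c
#include <stdio.h>
#include <sys/resource.h>

/* Print the soft and hard data-segment limits of the current process.
   Returns 0 on success, -1 if getrlimit fails. */
int print_data_limit(FILE *out)
{
    struct rlimit rl;

    if (getrlimit(RLIMIT_DATA, &rl) != 0)
        return -1;

    if (rl.rlim_cur == RLIM_INFINITY)
        fprintf(out, "soft: unlimited\n");
    else
        fprintf(out, "soft: %llu bytes\n", (unsigned long long) rl.rlim_cur);

    if (rl.rlim_max == RLIM_INFINITY)
        fprintf(out, "hard: unlimited\n");
    else
        fprintf(out, "hard: %llu bytes\n", (unsigned long long) rl.rlim_max);

    return 0;
}
```

Calling `print_data_limit(stdout)` from the shell that runs gmake would show whether a 304988696-byte allocation can fit under the soft limit.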


System probably doesn't have much RAM, maybe only 512MB, so that
could be where the previous values came from. I had never touched them.


Smaller amounts of RAM seem "more normal" these days, to pack
more virtual machines onto one physical system.
(Though this is just an old laptop, not a virtual machine.)


 - Jay
  


built-in atomic functions

2010-08-30 Thread manju k
Hello,

Are there any new built-in functions equivalent to atomic_read and
atomic_write in the latest releases?
If not, should I rely on the trick suggested in one of the earlier discussions,
quoted below?

>> We don't have atomic read or atomic write builtins (ok, you could
>> abuse __sync_fetch_and_add (&x, 0) for atomic read and a loop
>> with __sync_val_compare_and_swap for atomic store, but that's a horrible
>> overkill.

I am also looking for atomic_set; I do not see any matching __sync* function.
But according to atomic.h, atomic_set just does a normal assignment, so would doing the same
be safe enough?
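
The trick from the quoted message can be written out as follows. This is only a sketch of that workaround, not a recommendation. In the GCC documentation the compare-and-swap builtins are spelled `__sync_val_compare_and_swap` and `__sync_bool_compare_and_swap`. The wrapper names `atomic_read_int` and `atomic_write_int` are hypothetical.

```c
/* Atomic read: add 0 and return the value that was there. */
int atomic_read_int(int *p)
{
    return __sync_fetch_and_add(p, 0);
}

/* Atomic write: retry compare-and-swap until our value is installed. */
void atomic_write_int(int *p, int v)
{
    int old = *(volatile int *) p;     /* initial guess only */

    for (;;) {
        /* Returns the value found in *p; the store happened if it equals old. */
        int seen = __sync_val_compare_and_swap(p, old, v);
        if (seen == old)
            break;
        old = seen;                    /* someone else changed it; retry */
    }
}
```

Both wrappers go through a full read-modify-write operation, which is why the quoted message calls the approach overkill for a plain load or store.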

Thanks for any help.

Thanks and Regards,
Manju



  


64 bit porting guide

2010-09-10 Thread manju k
Hello,
I am porting my application from a 32-bit to a 64-bit architecture on Intel.
Can anyone point me to some good references for 64-bit porting on the Intel
platform (32-bit i686 to 64-bit x86_64)?

Thanks,
manju


  


internal compiler error: in referenced_var_lookup, at tree-dfa.c

2010-09-10 Thread Jay K

So... we have a custom frontend
that uses process boundaries to avoid GPL crossing into BSDish licensed code.
So maybe you don't want to help me. Understood.

But just in case:

We generate trees. Probably they aren't of great quality,
e.g. relatively devoid of types, and we do field accesses by offsetting pointers and
casting.
We do our own layout earlier. Not great, but that's how it is.


Currently we use gcc 4.5.1.
When I enable inlining on some architectures, e.g. Macosx/x86, much code yields:


../FPrint.m3: In function 'FPrint__xCombine':
../FPrint.m3:25:0: internal compiler error: in referenced_var_lookup, at 
tree-dfa.c:525
Please submit a full bug report,
with preprocessed source if appropriate.
See  for instructions.


Now I've spent a while debugging this. I'm not completely lazy.


The original source (again, not a standard frontend):

PROCEDURE xCombine (): INTEGER =
  BEGIN
    RETURN xFix32 (0);
  END xCombine;

PROCEDURE xFix32 (x: INTEGER): INTEGER =
  CONST a = 1;
  BEGIN
    IF Word.And (x, a) = 0 THEN
      RETURN 0;
    END;
    RETURN 0;
  END xFix32;

 -fdump-tree-all, 003t.original:

FPrint__xCombine ()
{
  int_32 D.985;
  int_32 M3_AcxOUs__result;

    int_32 M3_AcxOUs__result;
    int_32 D.985;
  D.985 = FPrint__xFix32 (0);
   = D.985;
  return ;
}


FPrint__xFix32 (int_32 M3_AcxOUs_x)
{
  int_32 M3_AcxOUs__result;

    int_32 M3_AcxOUs__result;
  if ((M3_AcxOUs_x & 1) != 0)
    {
  goto ;
    }
   = 0;
  return ;
  :;
   = 0;
  return ;
}



040t.release_ssa:
;; Function FPrint__xFix32 (FPrint__xFix32)

Released 0 names, 0.00%
FPrint__xFix32 (int_32 M3_AcxOUs_x)
{
  int_32 D.987;

:
  D.987_3 = M3_AcxOUs_x_2(D) & 1;
  if (D.987_3 != 0)
    goto  ();
  else
    goto ;

:
  _4 = 0;
  goto ;

:
  _5 = 0;

:
  # _1 = PHI <_4(3), _5(4)>
  return _1;

}


Breakpoint 1, 0x0078f21a in copy_phis_for_bb (bb=0x4126a280, id=0xb5d8) at 
../../gcc-4.5/gcc/tree-inline.c:1937
1937    {
(gdb) n
1938      basic_block const new_bb = (basic_block) bb->aux;
(gdb) 
1943      jaykrell_check_1();  doesn't do anything now, was helping debug 
(gdb) 
1945      for (si = gsi_start (phi_nodes (bb)); !gsi_end_p (si); gsi_next (&si))
(gdb) 
1947      tree res = { 0 }, new_res = { 0 };
(gdb) 
1948      gimple new_phi = { 0 };
(gdb) 
1949      edge new_edge = { 0 };
(gdb) 
1951      phi = gsi_stmt (si);
(gdb) 
1952      res = PHI_RESULT (phi);
(gdb) 
1953      new_res = res;
(gdb) 
1954      if (is_gimple_reg (res))
(gdb) 
1956          walk_tree (&new_res, copy_tree_body_r, id, NULL);
(gdb) 
1957          new_phi = create_phi_node (new_res, new_bb);  This line split up 
locally, ok.
(gdb) 
1958          SSA_NAME_DEF_STMT (new_res) = new_phi;
(gdb) call debug_referenced_vars()

Referenced variables in FPrint__xCombine: 7

Variable: D.1036, UID D.1036, int_32gimple_default_def 0x412130a8 1036

Variable: D.1041, UID D.1041, int_32gimple_default_def 0x412130a8 1041

Variable: .MEM, UID D.1038, , is global, call 
clobberedgimple_default_def 0x412130a8 1038

Variable: M3_AcxOUs_x, UID D.1039, int_32gimple_default_def 0x412130a8 1039

Variable: D.1040, UID D.1040, int_32gimple_default_def 0x412130a8 1040

Variable: , UID D.979, int_32gimple_default_def 0x412130a8 979

Variable: D.985, UID D.985, int_32gimple_default_def 0x412130a8 985


(gdb) n
1959          FOR_EACH_EDGE (new_edge, ei, new_bb->preds)
(gdb) call debug_referenced_vars()

Referenced variables in FPrint__xCombine: 7

Variable: D.1036, UID D.1036, int_32gimple_default_def 0x412130a8 1036

Variable: D.1041, UID D.1041, int_32gimple_default_def 0x412130a8 1041

Variable: .MEM, UID D.1038, , is global, call 
clobberedgimple_default_def 0x412130a8 1038

Variable: M3_AcxOUs_x, UID D.1039, int_32gimple_default_def 0x412130a8 1039

Variable: D.1093058884, UID D.1093058884, int_32gimple_default_def 0x412130a8 
1093058884

Variable: , UID D.979, int_32gimple_default_def 0x412130a8 979

Variable: D.985, UID D.985, int_32gimple_default_def 0x412130a8 985


You can see D.1040 got overwritten with something else.
  And then later the assertion is that it is missing.
Is it valid for uids to be so high?
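
One quick sanity check on a value like this is to print it in hexadecimal and compare it with the addresses that appear in the same debugging session. A minimal sketch follows; the helper name `format_uid_hex` is hypothetical.

```c
#include <stdio.h>

/* Format a suspicious UID as hex so it can be compared by eye with
   pointer values printed by the debugger. Returns snprintf's result. */
int format_uid_hex(unsigned int uid, char *buf, size_t buflen)
{
    return snprintf(buf, buflen, "0x%08x", uid);
}
```

For 1093058884 this yields 0x4126c144, which lies in the same range as the `bb=0x4126a280` pointer shown by gdb above. That is consistent with, though it does not prove, a field being overwritten with a pointer-like value.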


Any clues/tips?


Thanks much,
 - Jay
  


RE: internal compiler error: in referenced_var_lookup, at tree-dfa.c

2010-09-10 Thread Jay K

Er, as I understand it, lack of a process boundary automatically implies GPL
"spread" through "linkage".

  That assumes "linkage" means "ld". I'm not sure I've seen "linkage" defined.
  However, if "linkage" or "derivation" includes "interaction via file or network I/O",
  then a lot of folks will be upset
  (and some people very pleased :) ). File and network I/O connect all code in
  the world.

A process boundary at least gives you a chance.


Some of the work -- the frontend -- is clearly derived, and linked, so it is 
GPL.



> a derived work! You need to consult a knowledgeable attorney before
> proceeding in this direction.



Most of the proceeding in this direction was done >10 years ago by others.
  Granted, I don't know what legal advice they had.

I'm proceeding not much further, e.g. merging to current gcc and making debugging
better.

At some point I might generate C to fix a number of problems (including this
assert, licensing, debugging, efficient exception handling, etc.), but that is
a different matter.



Anyway, I put this out there to give folks a chance to not "like" me and not 
help me.

I'll address the technical part separately.


Thanks,

 - Jay


> Date: Fri, 10 Sep 2010 11:17:59 -0400
> From: de...@adacore.com
> To: i...@google.com
> CC: jay.kr...@cornell.edu; gcc@gcc.gnu.org
> Subject: Re: internal compiler error: in referenced_var_lookup, at tree-dfa.c
>
> On 9/10/2010 11:08 AM, Ian Lance Taylor wrote:
> > Jay K writes:
> >
> >> That uses process boundaries to avoid GPL crossing into BSDish licensed 
> >> code.
> >> So maybe you don't want to help me. Understood.
> >
> > Note that different people have different opinions as to whether a
> > process boundary means that your code is not a derived work. Not that
> > we should get into that discussion on this mailing list.
>
> Indeed, it is important to realize that putting in an arbitrary
> process boundary is no guarantee at all that you have not created
> a derived work! You need to consult a knowledgeable attorney before
> proceeding in this direction.
  


RE: internal compiler error: in referenced_var_lookup, at tree-dfa.c

2010-09-10 Thread Jay K

[licensing dealt with separately]



> > Variable: D.1093058884, UID D.1093058884, int_32gimple_default_def 
> > 0x412130a8 1093058884
> This is clearly wrong, though I have no idea what caused it.
> > Is it valid for uids to be so high?
> No.

Thanks, that helps.


> From your description, you've implemented some sort of customized tree
> reader.


Not exactly, by my understanding of the terminology.
We do end up making gcc trees, but we serialize something that is separately
specified and not really the same, though some level of resemblance is
unavoidable since they both are involved with compilation, i.e. they both
have operations like "add" and notions of locals, parameters, functions, etc.
Ours is stack-based, though, for example (not my preference, but it was already
there).


 > Does it play nicely with the garbage collector?


I think so.
We have the GTY annotations; I've managed to crash things when I got them
wrong or left them out. I haven't moved all targets from 4.3.x to 4.5.x, so I even have
to hack on the code a bit because GTY has to be in a different place.
I put the type declarations in separate .h files, maintain both, and copy one
over the other before compilation.


We do have an open bug report about causing the gcc garbage collector
to consume infinite memory, maybe due to a cycle in our data.
But really the system works most of the time. I can compile and run tens of thousands of
lines of code, for multiple architectures. I "just" have to turn off inlining and
a small number of other optimizations. Clearly we are pretty good, and flawed.


Notice that the gcc middle end seems to have created this variable with the
high uid.
I checked the globals that guide uid assignment (I found them after sending the
first mail). They aren't so high.
I haven't yet found where this uid comes from.
I kind of suspect it might be a type mismatch, overwriting part of a tree node
  with the wrong type or such.
I'll have to dig more.

I know it comes from here:
copy_phis_for_bb:
...
      SSA_NAME_DEF_STMT (new_res)
        = new_phi = create_phi_node (new_res, new_bb);



Thanks,
 - Jay
  


RE: internal compiler error: in referenced_var_lookup, at tree-dfa.c

2010-09-11 Thread Jay K

> I kind of suspect it might be a type mismatch, overwriting part of a tree node

configure -enable-checking:

?"Bmr?: In function 'FPrint__xCombine':
`?"Bmr?:13:0: internal compiler error: tree check: expected ssa_name, have 
var_decl in copy_phis_for_bb, at tree-inline.c:1950

and some other problems..I really need to fix those...

 - Jay


  


RE: internal compiler error: in referenced_var_lookup, at tree-dfa.c

2010-09-11 Thread Jay K

Arg... well, I had replaced xmalloc with alloca, leading to some of the garbage
in the error output I posted, but
I am indeed still running afoul of the garbage collector.
I don't know if that is my original problem, but I should probably fix this
first,
i.e. now that I'm using -enable-checking, which I think collects earlier and more
often.


I need to "map" my internal unsigned long arbitrary integers to trees.
So I just have an array of struct { unsigned long id; tree t; };
I put GTY on the struct, on the field, and on the VEC of them.
When I append I mark it dirty.
When I search I qsort if dirty, then bsearch.

typedef struct GTY(()) m3type_t
{
  unsigned long id;
  tree GTY (()) t;
} m3type_t;


static GTY (()) VEC (m3type_t,gc)* m3type_table;

Seems reasonable, eh?
The files are in gtfiles. When I put the GTY in the wrong place I get compile
errors.

I guess I can try rolling my own "VEC" or even use a fixed size and see what
happens.
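
The lookup scheme described above (append, mark dirty, sort lazily, then binary-search) can be sketched outside gcc as follows. Everything here is an assumption made for illustration: `void *` stands in for `tree`, a fixed-size array stands in for the VEC, all names are hypothetical, and nothing about the garbage collector or GTY is modeled.

```c
#include <stdlib.h>

#define TABLE_MAX 1024                 /* arbitrary capacity for the sketch */

typedef struct {
    unsigned long id;
    void *t;                           /* stand-in for gcc's tree */
} entry_t;

static entry_t table[TABLE_MAX];
static size_t table_len;
static int table_dirty;                /* nonzero if appended since last sort */

static int cmp_entry(const void *a, const void *b)
{
    unsigned long x = ((const entry_t *) a)->id;
    unsigned long y = ((const entry_t *) b)->id;
    return (x > y) - (x < y);
}

/* Append an entry and mark the table dirty. Returns -1 when full. */
int table_add(unsigned long id, void *t)
{
    if (table_len == TABLE_MAX)
        return -1;
    table[table_len].id = id;
    table[table_len].t = t;
    ++table_len;
    table_dirty = 1;
    return 0;
}

/* Look up an id: sort first if needed, then binary-search. */
void *table_find(unsigned long id)
{
    entry_t key, *hit;

    if (table_dirty) {
        qsort(table, table_len, sizeof table[0], cmp_entry);
        table_dirty = 0;
    }
    key.id = id;
    key.t = NULL;
    hit = bsearch(&key, table, table_len, sizeof table[0], cmp_entry);
    return hit ? hit->t : NULL;
}
```

The lazy sort keeps appends cheap and pays the sorting cost only on the first lookup after a batch of additions.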

Thanks,
 - Jay


  


RE: internal compiler error: in referenced_var_lookup, at tree-dfa.c

2010-09-12 Thread Jay K

I have it seemingly working now, much better. Thanks for the nudges -- that
such a high id is indeed invalid, and to look again at my GTY use.
I don't know if it made the difference, but I changed some whitespace to match
others, and changed
typedef struct foo_t { ... } foo_t;
to
typedef struct foo { ... } foo_t; without the _t on the struct tag.

 - Jay


...snip...
  


eliminating mpc/mpfr and reducing gmp

2010-09-26 Thread Jay K

Hi. You know, gmp/mpfr/mpc are a significant
portion of building any frontend/backend.


So I looked at it.


mpc of course is only for complex numbers.
Our frontend doesn't use them.
Maybe only for "builtins" as well.


#define do_mpc_arg1(a, b, c) NULL_TREE
and such.


mpfr appears to be used for floating point builtins.
Not for floating point constant folding as I'd thought?


We generate very few builtins, just memcmp, memset, memmove, memcpy, and *sync*.
Either way, floating point doesn't matter to me much.
#define do_mpfr_arg1(a,b,c) NULL_TREE and such.
including something_logarithm, something_exponent.
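
The stubbing pattern can be illustrated in isolation. This is a sketch only: `WITH_MPFR`, `fold_sqrt`, and the stand-in `tree` type are hypothetical, and only the `do_mpfr_arg1` macro with its three placeholder arguments comes from the message above. It assumes that callers treat NULL_TREE as "nothing was folded".

```c
#include <stddef.h>

typedef struct tree_node *tree;        /* stand-in for gcc's tree type */
#define NULL_TREE ((tree) NULL)

#ifndef WITH_MPFR                      /* hypothetical configure-time switch */
/* Without MPFR: never fold. The arguments are discarded unevaluated. */
#define do_mpfr_arg1(a, b, c) NULL_TREE
#endif
/* With WITH_MPFR defined, the real declaration would have to be supplied. */

/* Hypothetical caller: returns a folded constant, or NULL_TREE to mean
   "leave the builtin call as it is". */
tree fold_sqrt(tree arg, tree type)
{
    tree res = do_mpfr_arg1(arg, type, NULL);
    return res;
}
```

Because the macro expands to a constant, the calling code compiles unchanged and simply never folds.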


And #if out the version report in toplev.c.
And the then unused conversions in real.c.
and the #include in real.h.


and then mpfr isn't needed.


That is: changes to gcc to eliminate mpc and mpfr dependencies are pretty darn 
small.
I used #if 1/#if 0, but it'd be easy enough to have configure -without-mpc 
-without-mpfr.
I was able to do this for 4.5, and then 4.3 was trivial, the same except no mpc.


And then, if you dig a bit more, you find that gmp
contains "n", "q" (rational), "z", "f" (float), and miscellaneous e.g. *rand*.
(apples and oranges: *rand* must operate on some mp type, of course)


None of "q" and "f" are needed, and vast swaths of "n" and "z"
are unused and can be deleted. Easier to truncate the
files first when experimenting.


Probably can do even better.
There is a file in GMP, dumbmp.c (as in dumb, stupid), that contains
a simple portable subset of GMP, used for some "generator" programs.
At first glance it isn't sufficient, but it might be close.


The result is a lot faster to build, if you are just doing
a single-stage build of a compiler.
To some extent it adds up -- multiple stages.
To some extent it is minor -- if you are building libstdc++/libada/libjava x
multilibs.
I build just a frontend/backend though.


I understand you can also build/install the libraries once.
That isn't a terrible option. I have operated that way. But it does have
drawbacks.
It is harder for Canadian cross and "cross back" (cross build a native
compiler),
which, admittedly, I don't do much.
There were problems last I tried, esp. with enable-shared and fixincludes,
and it is always a pain to get the C runtime headers/libraries.
For the first I opened bugs. The second can't really be fixed, except
e.g. with an integrated tree with glibc/newlib. Maybe some people
have scripts out there to scp the appropriate files. I end up taking
way more than necessary and sometimes hunting around for what I missed.


And gmp doesn't build with the default gcc 4.0 on Intel MacOSX 10.5.
Building within the gcc tree is one workaround, because of how it fiddles with
CFLAGS
and seemingly accidentally avoids the problems with "inline".
Granted, CC=gcc-4.2 is another easy one. There are other options.


Anyway.

- Jay 


RE: eliminating mpc/mpfr and reducing gmp

2010-09-27 Thread Jay K

Wow that is fast.


My fastest machine, and I have several slower:


gmp
time sh -c "CC=gcc-4.2 ./configure none-none-none -disable-shared 
-enable-static && make && ssh r...@localhost \"cd `pwd` && make install\""
real    2m2.594s

mpfr
time sh -c "./configure -disable-shared -enable-static && make && ssh 
r...@localhost \"cd `pwd` && make install\""
real    0m43.756s

mpc
time sh -c "./configure -disable-shared -enable-static && make && ssh 
r...@localhost \"cd `pwd` && make install\""
real    0m15.495s


which is still a significant fraction of building cc1 (I don't have that time,
sorry).

I used to use Cygwin. Things add up much faster there.


> mpfr et al. If you're not, it only happens once.


Almost anything long but incremental can be justified via incrementality.
But there are also, occasionally, mass bouts of trying to get the configure
switches just right and
starting over repeatedly... at least with me being unsure of incrementality in the
face of rerunning configure...


Anyway, just putting it out there; it probably won't happen, but configure
-without-mpc -without-mpfr might be nice
and isn't difficult. -without-gmp would be much better, but I can't yet claim it isn't
difficult.
Maybe something like this: if gmp is "in tree", after configure, the Makefile could
be hacked down
to omit mpf, mpq, and lots of others, but then the linkage between gcc and gmp
gets messy,
  i.e. as gmp changes.


 - Jay


> Date: Mon, 27 Sep 2010 11:37:04 +0100
> From: a...@redhat.com
> To: jay.kr...@cornell.edu
> CC: gcc@gcc.gnu.org
> Subject: Re: eliminating mpc/mpfr and reducing gmp
>
> On 09/27/2010 01:23 AM, Jay K wrote:
> >
> > Hi. You know, gmp/mpfr/mpc are a significant
> > portion of building any frontend/backend.
>
> I disagree. Most of the time I don't notice them.
>
> > The result is a lot faster to build, if you are just doing a just
> > a single stage build of a compiler.
>
> Sure, but if you're working on the compiler you don't need to rebuild
> mpfr et al. If you're not, it only happens once.
>
> On my box, for mpc:
>
> real 0m2.624s
> user 0m3.336s
> sys 0m1.663s
>
> and for mpfr:
>
> real 0m4.484s
> user 0m12.006s
> sys 0m5.127s
>
> Andrew.
  


RE: eliminating mpc/mpfr and reducing gmp

2010-09-27 Thread Jay K

I only do one language, no driver, one stage, no libraries (none of libgcc, 
libstdc++, libjava, libada, etc.), no fixincludes (the headers are probably 
fine, and aren't used by this frontend anyway), the bootstrap compiler is 
always pretty decent, thus one stage (or I'll "cheat" and do one full regular 
bootstrap+install of a recent release on the occasional oddball host).
 
 
It takes under 10 minutes on a year-old MacBook (sorry, I didn't measure).
 
 
I guess it is very unusual, but it is also actually very useful.
 
 
Everyone else probably depends on good incrementality when actively changing 
stuff, or pays a higher price overall when occasionally building the entire 
thing clean and not changing stuff, so this is minor.
 
 
We understand each other, fair enough.
 
 
 - Jay



> Date: Tue, 28 Sep 2010 01:51:31 +0100
> From: dave.korn.cyg...@gmail.com
> To: jay.kr...@cornell.edu
> CC: a...@redhat.com; gcc@gcc.gnu.org
> Subject: Re: eliminating mpc/mpfr and reducing gmp
>
> On 27/09/2010 12:39, Jay K wrote:
>
> > gmp
> > time sh -c "CC=gcc-4.2 ./configure none-none-none -disable-shared 
> > -enable-static && make && ssh r...@localhost \"cd `pwd` && make install\""
> > real 2m2.594s
> >
> > mpfr
> > time sh -c "./configure -disable-shared -enable-static && make && ssh 
> > r...@localhost \"cd `pwd` && make install\""
> > real 0m43.756s
> >
> > mpc
> > time sh -c "./configure -disable-shared -enable-static && make && ssh 
> > r...@localhost \"cd `pwd` && make install\""
> > real 0m15.495s
>
>
> > which is still a significant fraction of building cc1 (I don't have that
> > time sorry)
>
>
> Huh, am I doing something seriously wrong? It takes me four hours to
> boostrap GCC at with all languages enabled at -j8 on an AMD2x64; I'm not too
> concerned about 3 minutes per pass!
>
> cheers,
> DaveK
> 


RE: eliminating mpc/mpfr and reducing gmp

2010-09-27 Thread Jay K

> Well, the other thing is: why not just build them once and install them to
> your $prefix? There's no need to build them in-tree every time if you have
> sufficiently up-to-date versions installed.
>
> cheers,
> DaveK

 
I have a CVS tree, used by others, that builds a frontend.
 
 
"Others" are a few people and a few automated build machines (using Hudson),
with various operating systems (Linux, Solaris, Darwin, FreeBSD, etc.).
 
 
The requirements on these machines aren't of course zero -- they
exist, they are up and running -- but are kept low where possible.
 
 
So the decision was made, not exactly by me, but I went along,
to add gmp+mpfr to the tree, and later mpc.
 
 
It's pretty simple, reliable, works.
 
 
What I was often doing was deleting gmp/mpfr in my local CVS tree,
leaving configure to find them in /usr/local.
And then sometimes I'd accidentally restore them with cvs upd -dAP.
 
 
I toyed with the idea of having it always check /usr/local first
but didn't do that. What if the user has an insufficient version there?
I'd have to test for that and fall back to in-tree.
 
 
Since we are using Hudson we could probably add nice automation,
to build gmp/mpfr/mpc "somewhere", neither /usr/local, nor "in gcc",
 maybe install to $HOME/hudson, and point the frontend build at them.
But the Hudson stuff...is not well known to me, I'd kind of rather
leave it alone. And I'd have to e.g. deal with the gmp inline problem,
just another small detail...
 
 
I'm satisfied so far with the pruning. I can see it isn't for everyone.
We'll see, maybe I'll grow to dislike it as well.
 
 
 - Jay


atomicity of x86 bt/bts/btr/btc?

2010-10-19 Thread Jay K

gcc-4.5/gcc/config/i386/i386.md:

;; %%% bts, btr, btc, bt.
;; In general these instructions are *slow* when applied to memory,
;; since they enforce atomic operation. When applied to registers,


I haven't found documented confirmation that these instructions are atomic 
without a lock prefix,
having checked Intel and AMD documentation and random web searching.
They are mentioned as instructions that can be used with lock prefix.


- Jay 


RE: atomicity of x86 bt/bts/btr/btc?

2010-10-19 Thread Jay K


> Subject: Re: atomicity of x86 bt/bts/btr/btc?
> From: foxmuldrsterm
> To: jay.krell
> CC: gcc@gcc.gnu.org
> Date: Tue, 19 Oct 2010 02:52:34 -0500
>
> > ;; %%% bts, btr, btc, bt.
> > ;; In general these instructions are *slow* when applied to memory,
> > ;; since they enforce atomic operation. When applied to registers,
> >
> > I haven't found documented confirmation that these instructions are atomic 
> > without a lock prefix,
> > having checked Intel and AMD documentation and random web searching.
> > They are mentioned as instructions that can be used with lock prefix.
>
> They do not automatically lock the bus. They will lock the bus with the
> explicit LOCK prefix, and BTS is typically used for an atomic read/write
> operation.
>
> - Rick

 
Thanks Rick.
I'll go back to using them.
I'm optimizing mainly for size.
The comment should perhaps be amended.
The "since they enforce atomic operation" part seems wrong.
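
To make the distinction concrete, here is a minimal C11 sketch (not from gcc; the function names are made up for illustration) contrasting a plain read-modify-write, which is what an unprefixed bts on memory amounts to, with an atomic one, which is what the LOCK prefix is for. Whether a given compiler actually emits bts or lock bts for either function is its own choice and is not assumed here.

```c
#include <assert.h>
#include <limits.h>
#include <stdatomic.h>
#include <stddef.h>

/* Plain read-modify-write: another thread can interleave between the
   load and the store. Returns the previous value of the bit. */
static int test_and_set_plain(size_t *word, unsigned bit)
{
    size_t mask;
    int old;

    assert(bit < sizeof(size_t) * CHAR_BIT);
    mask = (size_t)1 << bit;
    old = (*word & mask) != 0;
    *word |= mask;
    return old;
}

/* Atomic read-modify-write: the whole update is indivisible.
   Returns the previous value of the bit. */
static int test_and_set_atomic(atomic_size_t *word, unsigned bit)
{
    size_t mask;

    assert(bit < sizeof(size_t) * CHAR_BIT);
    mask = (size_t)1 << bit;
    return (atomic_fetch_or(word, mask) & mask) != 0;
}
```

Both return the old bit, which is what bts leaves in the carry flag; only the second is safe when several threads update the same word.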
 
 - Jay


RE: atomicity of x86 bt/bts/btr/btc?

2010-10-19 Thread Jay K


> Subject: RE: atomicity of x86 bt/bts/btr/btc?
> From: foxmuldrster
> To: jay
> CC: gcc
> Date: Tue, 19 Oct 2010 03:05:26 -0500
>
> > > They do not automatically lock the bus. They will lock the bus with the
> > > explicit LOCK prefix, and BTS is typically used for an atomic read/write
> > > operation.
>
> > Thanks Rick.
> > I'll go back to using them.
> > I'm optimizing mainly for size.
> > The comment should perhaps be amended.
> > The "since they enforce atomic operation" part seems wrong.
>
> Np. For citation, see here (page 166).
>
> http://www.intel.com/Assets/PDF/manual/253666.pdf
>
> - Rick

 
Yep that was one of my references.
 
 
It might be nice if optimizing for size would use them with code like e.g.:
 
 
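#include <limits.h> /* CHAR_BIT */
#include <stddef.h> /* size_t */
 
/* Set bit number b in the bit array a. */
void set_bit(size_t* a, size_t b)
{
  const unsigned c = sizeof(size_t) * CHAR_BIT; /* bits per word */
  a[b / c] |= (((size_t)1) << (b % c));
}
 
/* Clear bit number b in the bit array a. */
void clear_bit(size_t* a, size_t b)
{
  const unsigned c = sizeof(size_t) * CHAR_BIT;
  a[b / c] &=  ~(((size_t)1) << (b % c));
}
 
/* Return 1 if bit number b in the bit array a is set, else 0. */
int get_bit(size_t* a, size_t b)
{
  const unsigned c = sizeof(size_t) * CHAR_BIT;
  return !!(a[b / c] & (((size_t)1) << (b % c)));
}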
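
As a quick sanity check of the helpers above, here is a self-contained sketch that repeats them (with a WORD_BITS macro and a const-qualified get_bit, both my own changes) and exercises a two-word bitmap. bitmap_demo is a hypothetical name used only for this illustration.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

#define WORD_BITS (sizeof(size_t) * CHAR_BIT)

static void set_bit(size_t *a, size_t b)
{
    a[b / WORD_BITS] |= (size_t)1 << (b % WORD_BITS);
}

static void clear_bit(size_t *a, size_t b)
{
    a[b / WORD_BITS] &= ~((size_t)1 << (b % WORD_BITS));
}

static int get_bit(const size_t *a, size_t b)
{
    return (a[b / WORD_BITS] & ((size_t)1 << (b % WORD_BITS))) != 0;
}

/* Sets and clears a few bits in a two-word bitmap and returns the
   number of bits left set. */
static size_t bitmap_demo(void)
{
    size_t bits[2] = { 0, 0 };
    size_t i, count = 0;

    set_bit(bits, 0);
    set_bit(bits, WORD_BITS + 1);   /* lands in bits[1], bit 1 */
    set_bit(bits, 5);
    clear_bit(bits, 5);

    assert(bits[1] == 2);
    for (i = 0; i < 2 * WORD_BITS; i++)
        count += (size_t)get_bit(bits, i);
    return count;
}
```

The bit index may exceed the width of one word; the division picks the word and the remainder picks the bit within it.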
 
 
 - Jay


is gcc garbage collector compacting?

2010-11-03 Thread Jay K

Is the gcc garbage collector compacting?


In particular I want to have ".def" file (like tree.def) where
the macros generate struct declarations,
and the fields might be tree.


That won't work with the gcc garbage collection scheme.


However, I'm also willing to store all trees
that go in these structs in a global VEC of trees,
that does work with the garbage collector.


That would suffice to keep them alive.
But it wouldn't keep them up to date if the
garbage collector is compacting.
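
The distinction can be sketched with a toy model (all names hypothetical; nothing here is gcc's actual collector). A compacting collector moves objects and can only fix up the pointer slots it knows about, so a copy of the pointer kept in a struct the collector cannot see goes stale. With a non-moving collector the same copy stays valid as long as some registered root keeps the object alive. The toy assumes each registered root points at a distinct object.

```c
#include <assert.h>
#include <stddef.h>

/* Toy object and two "spaces"; compaction copies from one to the other. */
struct obj { int value; };

static struct obj from_space[4];
static struct obj to_space[4];

/* Root table: holds the ADDRESSES of pointer slots, so that a moving
   collector can rewrite them. */
#define MAX_ROOTS 4
static struct obj **roots[MAX_ROOTS];
static size_t root_count;

static void register_root(struct obj **slot)
{
    assert(root_count < MAX_ROOTS);
    roots[root_count++] = slot;
}

/* "Compact": move every rooted object into to_space and update the
   registered slot. Unregistered copies of the pointer are not touched. */
static void compact(void)
{
    size_t i;

    for (i = 0; i < root_count; i++) {
        to_space[i] = **roots[i];
        *roots[i] = &to_space[i];
    }
}
```

After compact(), a registered slot points into to_space, while any unregistered copy of the old pointer still points into from_space.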


Heck, I guess, if need be, I'm willing to apply
a slight hack to disable the garbage collector.


If I really must -- if there is really much garbage
to collect, I could run the preprocessor ahead of time
and provide a file to gengtype that it understands.
I'm leery of this approach, because I want only
certain preprocessing to take place.
I could output markers before/after the chunk I want though.


Or, maybe the files do get preprocessed?
Evidence is no, as when I #if 0'ed out some types, gengtype still saw them.
Something to consider, perhaps.


Thanks,
 - Jay

  


RE: is gcc garbage collector compacting?

2010-11-03 Thread Jay K

 > Are you coding inside your branch, or just your plugin?
 > [implied] What are you actually doing? 


It isn't relevant or short, but if you really want to know:


It is a front end, and indeed, a branch, it won't ever be contributed.
It is the Modula-3 frontend, which plays slight licensing games and
the original authors aren't around, so couldn't get to FSF, and current
maintainers can't either.


Honestly, long term, I'd like to generate C instead, for a variety of reasons.
 - no more license games
 - more portability (e.g. to platforms that also fork/maintain gcc,
   so we don't have to fork and patch theirs as well: e.g. OpenBSD, iPhone;
   e.g. to platforms with no gcc such as NT/ia64)
 - generate C++ actually, for portable, usually efficient exception handling.
   Currently we use setjmp/longjmp and a thread local. It works and is very
   portable, but is very inefficient.
 - debuggability, with stock debuggers
 - solidly fix stuff I detail below
 - a source distribution story.
   Yes, I realize, compilers must be distributed as binaries, but one can
   paint the world as layered on C or C++ and then only their compilers need
   binary distribution.
 - I realize compile speed would suffer. And expression evaluation in the
   debugger would suffer. Those are the drawbacks.
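
The setjmp/longjmp scheme mentioned in the list above can be sketched roughly as follows. This is a minimal illustration, not the Modula-3 runtime; every name in it is hypothetical, and it assumes C11 for _Thread_local. It shows where the cost comes from: each protected region pushes a frame and calls setjmp on entry, whether or not anything is ever raised.

```c
#include <assert.h>
#include <setjmp.h>
#include <stddef.h>
#include <stdlib.h>

/* One frame per active handler. */
struct handler_frame {
    jmp_buf env;
    struct handler_frame *prev;
};

/* Innermost active handler and pending exception code, per thread. */
static _Thread_local struct handler_frame *current_handler;
static _Thread_local int pending_code;

/* Raise: pop the innermost handler and jump to it. */
static void raise_exception(int code)
{
    struct handler_frame *h = current_handler;

    if (h == NULL)
        abort();                      /* no handler installed */
    current_handler = h->prev;
    pending_code = code;
    longjmp(h->env, 1);
}

/* Stand-in for code that may raise. */
static int risky(int x)
{
    if (x < 0)
        raise_exception(42);
    return x * 2;
}

/* Returns risky(x), or minus the exception code if one was raised. */
static int call_with_handler(int x)
{
    struct handler_frame frame;
    int result;

    frame.prev = current_handler;
    current_handler = &frame;         /* push: paid on every entry */
    if (setjmp(frame.env) == 0) {
        result = risky(x);
        current_handler = frame.prev; /* normal exit: pop */
    } else {
        result = -pending_code;       /* raise_exception already popped */
    }
    assert(current_handler == frame.prev);
    return result;
}
```

Table-driven C++ exception handling avoids the per-entry push and setjmp, which is the efficiency argument made in the list.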


Making it a plugin might be viable in the future, but I'm more keen on 
generating C instead.
We also have our own frontend for NT/x86, that writes out COFF .objs, which 
might be interesting to extend to others,
but that is a lot of work.


Historically we have generated very poor trees.


One particular example is that our structs have size but no fields.
I only realized this when targeting Solaris/sparc64 and hitting
assertion failures in the middle end -- the code couldn't
figure out how to pass such things.


As well, historically, we made everything volatile, in order to defeat
the optimizer. Now what we do is almost never use volatile, but still
turn off a small number of optimizations.


Also, nobody had been using configure -enable-checking, and it complains 
variously.
I've fixed some of that.


So I'm working on filling in types much better.


I have made good fast progress.




This also greatly improves debugging with stock gdb, instead of forked/hacked
gdb that we get debug info to in a somewhat sleazy but fairly portable way
(custom stabs, as I understand, doesn't work e.g. on HP-UX/HPPA64 or on any 
MacOSX.)



However some of the types are circular, and the code currently
only makes one pass. So I simply want it to build up its own in-memory
representation, make some passes over that, and then generate the "real" trees.
The in-memory representation will contain some trees.. though I suppose
now that I explain it, there is a way around that.


The reading of the intermediate form is somewhat data driven, reading into 
locals.
I want to reuse that but read into structs, and then loop over them.
Simple stuff really.


I could probably also make extra passes over the gcc trees instead,
but I'm designing partly out of ignorance. I'm not sure when you can
continue to change them vs. when they are frozen.


The non-compacting of the gc makes this easier than it might be otherwise.
Though looking at it, we already store many of our trees in global arrays.
There's just a few stragglers I can also put in global arrays and be ok.


Thanks,
 - Jay
  


RE: is gcc garbage collector compacting?

2010-11-03 Thread Jay K

> I believe that your case is a very good example of why front-ends
> should be able to become plugins. It is not the case yet, and adding


Currently we do define a new tree code, I think just one.
And the implementation is *slightly* invasive.


I was tempted to write up a proposal to this list to add it to mainline gcc but
I ended up not.


I noticed the D/gcc frontend does something similar.


In particular..nested functions...taking their address..we implement them in a 
way that
doesn't involve runtime codegen. Though there are significant downsides
to what we do. Having thought about it, I believe there is no particularly
good way to implement nested functions (taking their address), just
multiple not very good ways.


What we do relies on function pointers being rare in our language.
Before calling through any function pointer, we read the data it points to and
check for a marker -- -1, currently of size_t size, though it probably should
always be 4 bytes (alignment issues), except maybe on IA64 (particularly
large instructions). This is a bit sleazy -- it assumes
that code is readable and that -1 isn't valid code, but other target-dependent
markers could be specified. Anyway, if it is -1, it is followed by the actual
function pointer and frame pointer, and those are used instead of calling it
directly.
Something like that.
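
A portable model of that check might look like the following. It is only a sketch: real code would read the first word of the machine code itself, which standard C cannot express, so here a plain procedure is represented by a struct whose first word stands in for those code bytes. All type and function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define CLOSURE_MARKER ((size_t)-1)

/* Stand-in for a plain procedure: first_word models the first bytes of
   its code and is assumed never to equal CLOSURE_MARKER. */
struct plain_proc {
    size_t first_word;
    int (*fn)(int);
};

/* Descriptor for a nested procedure whose address was taken. */
struct closure_desc {
    size_t marker;                 /* always CLOSURE_MARKER */
    int (*fn)(void *frame, int);   /* the actual nested function */
    void *frame;                   /* static link to the enclosing frame */
};

/* Call through a procedure value: check the marker first. */
static int call_proc(const void *p, int arg)
{
    size_t first;

    assert(p != NULL);
    memcpy(&first, p, sizeof first);
    if (first == CLOSURE_MARKER) {
        const struct closure_desc *d = p;
        return d->fn(d->frame, arg);
    }
    return ((const struct plain_proc *)p)->fn(arg);
}

/* Example targets. */
struct outer_frame { int base; };

static int add_one(int x)
{
    return x + 1;
}

static int add_base(void *frame, int x)
{
    return ((struct outer_frame *)frame)->base + x;
}
```

Every indirect call pays for the marker check, but taking the address of a nested function needs no executable stack and no generated code.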
The implication on the backend isn't perhaps clear from that, nor do I 
necessarily understand it. But certainly gcc's existing nested functions don't 
work this way.
I understand, again, there are major tradeoffs either way. Ours is not 
monotonically better than gcc's.
Ah, I guess the new tree code is to load the static link following the -1.


A C-generating front end has other advantages:
  I know what is well formed C.
  I don't know what is well formed gcc/tree or gcc/gimple.
  The various problems we have with the optimizer -- because our trees are 
poorly formed -- would go away.
   We wouldn't have to twiddle various optimizations off.
  This is sort of "magic/special" -- I just happen to know far more about C 
than gcc internals.
  

> (and there is also LLVM which could perhaps interest you).


Yeah, people have asked about using it. But nobody puts any work into
it from our side (we have precious few people doing anything!).
We aren't supposed to discuss it here. :)
And it has the same disadvantage vs. C/C++ that gcc/tree/gimple has -- I know 
what is valid C/C++,
whereas LLVM is another big unknown to investigate. And it hits fewer targets.
Granted, both mainline gcc and LLVM hit plenty of targets.
I do dream of having portability on par with: printf("hello world\n"); without
doing much porting myself/ourselves, and having access to weird systems such as 
VMS. :)


 > future gimple front-end


Interesting. But again similar problems.
  Given my knowledge, generating C/C++ is easier than anything else. 
Also we'd have to wait until these versions of gcc are widespread.
Look at Apple and OpenBSD for how older gcc remains in use.


Anyway, knowing the garbage collector isn't compacting is good.


Thanks,
 - Jay


> Date: Wed, 3 Nov 2010 10:23:14 +0100
> From: basile@
> To: jay.kr...@u
> CC: gcc@
> Subject: Re: is gcc garbage collector compacting?
>
[snip]
  


asm_fprintf inefficiency?

2010-11-05 Thread Jay K

so..I was bemoaning to myself the extra #ifs, extra autoconf..
the checking for puts_unlocked...
the fact that asm_fprintf calls putc one character at a time,
which probably benefits from _unlocked.



1) asm_fprintf probably should skip from % to %, calling
puts on each span, instead of putc one at a time.
Granted, its input strings tend to be short.


2) But more so..


given, e.g.:


asm_fprintf(file, "%Rfoo");


one could instead say


#ifndef REGISTER_PREFIX
#define REGISTER_PREFIX ""
#endif


fprintf(file, "%sfoo", REGISTER_PREFIX);


That works for all of asm_fprintf (I, L, R, U) except %O, and %O appears
little used or unused. And it doesn't handle {.
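
Point 1 above (skipping from % to %) could look roughly like this. It is a sketch under stated assumptions, not gcc's asm_fprintf: emit_spans is a hypothetical name, only literal text, %% and %R are handled, and fwrite is used for the spans in place of puts because the spans are not NUL-terminated and puts would append a newline.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#ifndef REGISTER_PREFIX
#define REGISTER_PREFIX ""
#endif

/* Write fmt to file, emitting each literal run between '%' escapes with
   a single fwrite instead of one putc per character. */
static void emit_spans(FILE *file, const char *fmt)
{
    assert(file != NULL && fmt != NULL);

    while (*fmt != '\0') {
        const char *pct = strchr(fmt, '%');
        size_t len = pct ? (size_t)(pct - fmt) : strlen(fmt);

        if (len > 0)
            fwrite(fmt, 1, len, file);    /* whole literal span at once */
        if (pct == NULL)
            break;

        switch (pct[1]) {
        case 'R':                         /* register prefix */
            fputs(REGISTER_PREFIX, file);
            fmt = pct + 2;
            break;
        case '%':                         /* literal percent sign */
            putc('%', file);
            fmt = pct + 2;
            break;
        case '\0':                        /* trailing '%': drop it */
            fmt = pct + 1;
            break;
        default:                          /* unhandled escape: copy as is */
            fwrite(pct, 1, 2, file);
            fmt = pct + 2;
            break;
        }
    }
}
```

With short format strings the saving is probably small, as noted above; the spans are rarely more than a few characters.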


jbook2:gcc jay$ grep asm_fprintf */*/* | grep { | wc -l
  33
jbook2:gcc jay$ grep asm_fprintf */*/* |  wc -l
 318

Maybe something else could be done for those 10%?

like:
before:
asm_fprintf (file, "\t{l|lwz} %s,", reg_names[0]);

after:
fprintf (file, "\t%s %s,", dialect_number ? "lwz" : "l", reg_names[0]);
or bigger/faster:
fprintf (file, dialect_number ? "\tlwz %s," : "\tl %s,", reg_names[0]);
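
For reference, the {a|b} notation picks one alternative per assembler dialect. A small stand-alone expander, written here only to illustrate the before/after strings above (expand_dialect is a hypothetical helper, not a gcc function), might be:

```c
#include <assert.h>
#include <stddef.h>

/* Copy tmpl to out, keeping only alternative number `dialect` (counting
   from 0) of each "{alt0|alt1|...}" group. Output is truncated to fit
   outsz bytes including the terminating NUL. */
static void expand_dialect(const char *tmpl, int dialect,
                           char *out, size_t outsz)
{
    size_t n = 0;
    int in_group = 0;
    int alt = 0;

    assert(tmpl != NULL && out != NULL && outsz > 0);

    for (; *tmpl != '\0'; tmpl++) {
        char c = *tmpl;

        if (!in_group && c == '{') {      /* start of a group */
            in_group = 1;
            alt = 0;
            continue;
        }
        if (in_group && c == '|') {       /* next alternative */
            alt++;
            continue;
        }
        if (in_group && c == '}') {       /* end of the group */
            in_group = 0;
            continue;
        }
        if (in_group && alt != dialect)   /* not the selected alternative */
            continue;
        if (n + 1 < outsz)
            out[n++] = c;
    }
    out[n] = '\0';
}
```

Expanding "\t{l|lwz} %s," gives "\tl %s," for dialect 0 and "\tlwz %s," for dialect 1, which are the two format strings in the "bigger/faster" variant.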


(Really I'd rather gcc just output .o files directly...)


 - Jay




  


RE: asm_fprintf

2010-11-05 Thread Jay K

 > And putc_unlocked is a macro which appends to a buffer. puts is not.
 
 
I *assumed* there is puts_unlocked like all the other *_unlocked.
Maybe not.

 
 > (Really I'd rather gcc just output .o files directly...) 
 > It would be an interesting project, but it's not a major component of 
 > optimizing compilation time. I would certainly encourage any interested 
 
 
Perhaps when not optimizing?
Eh, but I've taken no measurement.
There is the possible fork() cost on Cygwin.
But maybe spawn is used, much faster.
 
 
 
 - Jay


extern "C" applied liberally?

2010-11-15 Thread Jay K

I know it is debatable and I could be convinced otherwise, but I would suggest:
 
 
 
#ifdef __cplusplus
extern "C" {
#endif
 
...


#ifdef __cplusplus
} /* extern "C" */
#endif
 
 
be applied liberally in gcc.
Not "around" #includes; that is the job of each .h file, and mindful of #ifdefs 
(i.e. correctly).
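
A minimal sketch of what that looks like in one header, with the #include kept outside the extern "C" block. The file and function names are made up for illustration; the header and its implementation are shown together so the fragment compiles as a single file.

```c
/* example.h (hypothetical) */
#ifndef EXAMPLE_H
#define EXAMPLE_H

#include <stddef.h>   /* stays outside the extern "C" block */

#ifdef __cplusplus
extern "C" {
#endif

/* Declared with C linkage, so the symbol is not mangled under C++. */
size_t example_strlen(const char *s);

#ifdef __cplusplus
} /* extern "C" */
#endif

#endif /* EXAMPLE_H */

/* example.c (hypothetical) */
#include <assert.h>

size_t example_strlen(const char *s)
{
    size_t n = 0;

    assert(s != NULL);
    while (s[n] != '\0')
        n++;
    return n;
}
```

Compiled as C the guards vanish; compiled as C++ the definition picks up C linkage from the earlier declaration.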
 
 
Rationale:
  Any folks that get to see the mangled names, debugging, working on binutils, 
whatever, are saved from them.
 They are generally believed to be ugly, right? Yeah yeah, not a technical 
argument.
 
  For some reason, I wasn't able to set breakpoints in gdb on MacOSX otherwise, 
though this doesn't make sense and a small example didn't reproduce the 
behavior. I have since applied this to 300+ files in a 4.5.1 fork -- not all of 
them just out of time/laziness. (I tried and failed to automate it.)
 
 
 
I think it is a good idea for any C or historically C code when moving to a C++ 
compiler.
  I have done that with a few small/medium sized code bases.
 
 
 
They could/would be removed as templates/function overloads/operator 
overloading are introduced.
Or such sections of code/declarations could have extern "C++" { } around them 
(not a well known feature, but ok.)
 
 
 
 - Jay

