pedwarn_cxx98 and enum { foo, }
Hello.

I noticed that (foo.cpp):

    enum gaz { foo, };

generated a warning

    foo.cpp:1:15: warning: comma at end of enumerator list [-pedantic]

when compiled with

    g++ -std=gnu++0x -pedantic -fsyntax-only foo.cpp

According to n3290 this is acceptable, so I tried to make this warning go away.

This warning is issued at line 13588 in gcc/cp/parser.c (revision 173058) with a pedwarn. I then found the function pedwarn_cxx98 with the comment

    /* Issue an ISO C++98 pedantic warning at LOCATION, conditional on
       option OPT with text GMSGID.  Use this function to report
       diagnostics for constructs that are invalid C++98, but valid
       C++0x.  */
    bool
    pedwarn_cxx98 (location_t location, int opt, const char *gmsgid, ...)

and that sounds like what I want to happen here, so I changed the pedwarn to pedwarn_cxx98, recompiled and tried again. No change, I still got the warning.

So, now I would like to know if there is something amiss with pedwarn_cxx98 or the comment above it?

/MF
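For reference, a minimal self-contained sketch of the construct under discussion, assuming a compiler running in C++0x/C++11 mode. The extra enumerator `bar` and the `static_assert` are illustrative additions, not part of the original foo.cpp.

```cpp
// Trailing comma after the last enumerator: rejected by strict C++98,
// accepted by the C++0x draft the post cites (n3290).
// `bar` is an illustrative addition; the original had only `foo`.
enum gaz { foo, bar, };

// The trailing comma neither adds an enumerator nor changes any value.
static_assert(foo == 0 && bar == 1, "enumerator values are unaffected");
```

Compiling this with -std=c++98 -pedantic versus -std=gnu++0x -pedantic is one way to check whether a diagnostic is really conditional on the dialect.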
Re: GCC Optimisation, Part 0: Introduction
On Wed, Apr 27, 2011 at 10:29 PM, Basile Starynkevitch wrote:
>
>> Here are some areas I'll look closer to, as shown by some early profiling
>> I performed:
>> * hash tables (both htab and symtab)
>
> There is probably a lot of tuning possible around GCC hash tables. However,
> I would imagine that tuned performance might depend on the particular
> hardware running the so modified GCC. And C++ is becoming an acceptable
> programming language inside GCC. So replacing current C hashtables by some
> fancier C++ collections might be considered (but I have no idea if this will
> be a performance win, perhaps no!; it could be a readability win.).
>
> Perhaps some hashtables are used to associate data to important structures
> (like gimple) which are difficult to change. Perhaps if you have strong
> measurements that adding information inside gimple instead of keeping an
> hashtable for it you might be able to remove some hashtables, but this is
> difficult, since GCC essential data types have been very hand-optimized.
>

It seems to me that a hash table in GCC is more like a mapping (or dictionary, or associative array, or function(Key->Value)) than a container.

I think the main problem with a hash table is that the collision rate is unpredictable, so the lookup time is unpredictable. In the worst case, you will run the equal method on all the elements to find a slot.

Perhaps using a B+ tree to implement the mapping may be an alternative. Using a hash table, you have to implement hash and equal methods. Using a B+ tree, you only have to implement a compare method. Using a B+ tree, the time complexity of lookup is O(log(n)), which is much better.

I don't think the C++ STL map is necessarily better than a hash table or a B+ tree. For every (Key, Value) pair of types, it will instantiate a new class, making the resulting code size unnecessarily big.

>> * ggc_internal_alloc_stat() or maybe implementing proper memory
>> management instead of garbage collection, for hottest callers
>
> My opinion is that a garbage collector is (or can become) a proper memory
> manager.
> If you really want to switch back to manual memory management, please be
> sure to make changes which will be easy to revert.
> I believe that the current GCC garbage collector could be improved (but that
> is very hard work!)
>

Agree.

--
Chiheng Xu
Wuhan,China
Re: GCC Optimisation, Part 0: Introduction
On Thu, Apr 28, 2011 at 2:52 PM, Chiheng Xu wrote:
> On Wed, Apr 27, 2011 at 10:29 PM, Basile Starynkevitch
> wrote:
>>
>>> Here are some areas I'll look closer to, as shown by some early profiling
>>> I performed:
>>> * hash tables (both htab and symtab)
>>
>> There is probably a lot of tuning possible around GCC hash tables. However,
>> I would imagine that tuned performance might depend on the particular
>> hardware running the so modified GCC. And C++ is becoming an acceptable
>> programming language inside GCC. So replacing current C hashtables by some
>> fancier C++ collections might be considered (but I have no idea if this will
>> be a performance win, perhaps no!; it could be a readability win.).
>>
>> Perhaps some hashtables are used to associate data to important structures
>> (like gimple) which are difficult to change. Perhaps if you have strong
>> measurements that adding information inside gimple instead of keeping an
>> hashtable for it you might be able to remove some hashtables, but this is
>> difficult, since GCC essential data types have been very hand-optimized.
>>
>
> It seems to me that hash table in GCC is more like mapping(or
> dictionary, or associated array, or function(Key->Value)), instead of
> container.
>
> I think the main problem of hash table is that conflict rate is
> unpredictable, so the lookup time is unpredictable. At its worst
> condition, you will run equal method on all the elements to find a
> slot.
>
> Perhaps using B+ tree to implement mapping may be an alternative.
> Using hash table , you should implement hash and equal methods. Using
> B+ tree, you should only implement compare method. Using B+ tree, the
> time complexity of lookup is O(log(n)), which is much better.

Well, with a good hash-function your average lookup complexity is O(1)
which is much better.

Richard.
Re: GCC Optimisation, Part 0: Introduction
On 4/28/2011 8:55 AM, Richard Guenther wrote:

>> It seems to me that hash table in GCC is more like mapping(or
>> dictionary, or associated array, or function(Key->Value)), instead of
>> container.
>>
>> I think the main problem of hash table is that conflict rate is
>> unpredictable, so the lookup time is unpredictable. At its worst
>> condition, you will run equal method on all the elements to find a
>> slot.
>>
>> Perhaps using B+ tree to implement mapping may be an alternative.
>> Using hash table , you should implement hash and equal methods. Using
>> B+ tree, you should only implement compare method. Using B+ tree, the
>> time complexity of lookup is O(log(n)), which is much better.
>
> Well, with a good hash-function your average lookup complexity is O(1)
> which is much better.
>
> Richard.

I think the hash table is a much better choice than the B+ tree. You
really are much more interested in average case performance in a compiler
than worst case, especially when the worst case will not
happen in practice.
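The average-case versus worst-case point above can be sketched with the standard library rather than GCC's own htab. This is only an illustration under stated assumptions: `ConstantHash`, `CountingEqual` and `equality_calls_for_miss` are hypothetical names, and `std::unordered_set` stands in for a generic chained hash table. A hash that maps every key to the same value forces a failed lookup to test every stored element, while a reasonable hash keeps the number of equality tests small.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <unordered_set>

// Deliberately terrible hash (illustrative): every key collides.
struct ConstantHash {
    std::size_t operator()(int) const noexcept { return 0; }
};

// Equality predicate that counts how often the table calls it.
struct CountingEqual {
    std::size_t* calls;
    bool operator()(int a, int b) const {
        ++*calls;
        return a == b;
    }
};

// Fill a table with the keys 0..n-1, then look up a key that is absent
// and report how many equality tests that single lookup needed.
template <class Hash>
std::size_t equality_calls_for_miss(int n) {
    assert(n >= 0);
    std::size_t calls = 0;
    std::unordered_set<int, Hash, CountingEqual> table(
        0, Hash{}, CountingEqual{&calls});
    for (int i = 0; i < n; ++i) table.insert(i);
    calls = 0;               // ignore the cost of the insertions
    (void)table.count(-1);   // -1 was never inserted
    return calls;
}
```

Calling `equality_calls_for_miss<ConstantHash>(1000)` exercises the degenerate case described in the thread; `equality_calls_for_miss<std::hash<int>>(1000)` shows the ordinary one.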
Re: GCC 4.5.3 Release Candidate available from gcc.gnu.org
On 04/21/2011 12:40 PM, Richard Guenther wrote:

> A first release candidate for GCC 4.5.3 is available from
>
>   ftp://gcc.gnu.org/pub/gcc/snapshots/4.5.3-RC-20110421/
>
> and shortly its mirrors. It has been generated from SVN revision 172803.
>
> I have so far bootstrapped and tested the release candidate on
> x86_64-unknown-linux-gnu. Please test it and report any issues to
> bugzilla.
>
> If all goes well the final GCC 4.5.3 release will happen late next week.

I didn't see regressions in the testsuite compared to 4.5.2, for the architectures built by Debian.

Matthias
Re: ARM unaligned MMIO access with attribute((packed))
On Wed, 27 Apr 2011, Arnd Bergmann wrote:

> On Wednesday 27 April 2011 18:25:40 Alan Stern wrote:
> > On Wed, 27 Apr 2011, Rabin Vincent wrote:
> >
> > > On Wed, Apr 27, 2011 at 00:21, Alan Stern
> > > wrote:
> > > > On Tue, 26 Apr 2011, Rabin Vincent wrote:
> > > >> In my case it's this writel() in ehci-hub.c that gets chopped into
> > > >> strbs:
> > > >>
> > > >>       /* force reset to complete */
> > > >>       ehci_writel(ehci, temp & ~(PORT_RWC_BITS | PORT_RESET),
> > > >>                   status_reg);
> > > >
> > > > Why would that get messed up? The status_reg variable doesn't have any
> > > > __attribute__((packed)) associated with it.
> > >
> > > The initialization of status_reg is:
> > >
> > >       u32 __iomem *status_reg
> > >               = &ehci->regs->port_status[(wIndex & 0xff) - 1];
> > >
> > > where ehci->regs is a pointer to the packed struct ehci_regs. So, this
> > > is the same problem of casting pointers to stricter alignment.
> >
> > Right. I can understand the compiler complaining about the cast to
> > stricter alignment during the initialization. But I don't understand
> > why that would affect the code generated for the writel function.
>
> The compiler does not complain, it just silently assumes that it needs
> to do byte accesses. There is no way to tell the compiler to ignore
> what it knows about the alignment, other than using inline assembly
> for the actual pointer dereference. Most architectures today do that,
> but on ARM it comes down to "*(u32 *)status_reg = temp".

Ah -- so the compiler associates the alignment attribute with the data
value and not with the variable's type? I didn't know that.

Alan Stern
Re: pedwarn_cxx98 and enum { foo, }
On 04/28/2011 12:35 PM, Magnus Fromreide wrote:

> So, now I would like to know if there is something amiss with
> pedwarn_cxx98 or the comment above it?

Looking at the way pedwarn_cxx98 is used for long long (the original motivating case), I think the right way to use it for your issue too is to pass something more precise than just OPT_pedantic as the second argument. Or do something else altogether, like just not issuing any warning when the dialect is != cxx98. The right person to help you is Manuel, I think.

Paolo.
Re: ARM unaligned MMIO access with attribute((packed))
On Thursday 28 April 2011, Alan Stern wrote:
> > The compiler does not complain, it just silently assumes that it needs
> > to do byte accesses. There is no way to tell the compiler to ignore
> > what it knows about the alignment, other than using inline assembly
> > for the actual pointer dereference. Most architectures today do that,
> > but on ARM it comes down to "*(u32 *)status_reg = temp".
>
> Ah -- so the compiler associates the alignment attribute with the data
> value and not with the variable's type? I didn't know that.

The behavior here is unspecified because the underlying typecast is not valid. Gcc apparently uses some heuristics trying to do the right thing, and in recent versions that heuristic seems to have changed.

Arnd
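A small sketch of what "what it knows about the alignment" can mean, assuming a compiler that supports the GCC-style `__attribute__((packed))`. The two structs are hypothetical stand-ins with made-up members, not the kernel's real struct ehci_regs. It only shows that packing lowers the alignment the compiler may assume for the struct and hence for its members; it does not reproduce the ARM code generation discussed in the thread.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical register layouts, loosely modelled on the discussion;
// these are not the kernel's definitions.
struct plain_regs {
    std::uint32_t command;
    std::uint32_t port_status[15];
};

struct __attribute__((packed)) packed_regs {
    std::uint32_t command;
    std::uint32_t port_status[15];
};

// Packing changes neither the size nor the member offsets here...
static_assert(sizeof(plain_regs) == sizeof(packed_regs),
              "no padding to remove in this layout");
static_assert(offsetof(plain_regs, port_status) ==
                  offsetof(packed_regs, port_status),
              "members sit at the same offsets");

// ...but it lowers the alignment the compiler is allowed to assume to 1.
static_assert(alignof(packed_regs) == 1, "packed struct is byte-aligned");
static_assert(alignof(plain_regs) == alignof(std::uint32_t),
              "unpacked struct keeps the u32 alignment");
```

On a strict-alignment target, an access through a member of `packed_regs` may therefore be split into byte accesses, which is the effect reported for the writel() above.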
gcc-4.5-20110428 is now available
Snapshot gcc-4.5-20110428 is now available on
  ftp://gcc.gnu.org/pub/gcc/snapshots/4.5-20110428/
and on various mirrors, see http://gcc.gnu.org/mirrors.html for details.

This snapshot has been generated from the GCC 4.5 SVN branch
with the following options: svn://gcc.gnu.org/svn/gcc/branches/gcc-4_5-branch revision 173136

You'll find:

 gcc-4.5-20110428.tar.bz2             Complete GCC (includes all of below)
   MD5=b127e67e3117148ede7c68d9670305c8
   SHA1=efdf705315f0bad4de29ac9b6e0feadd9a89ebf4

 gcc-core-4.5-20110428.tar.bz2        C front end and core compiler
   MD5=82e1305e9cbb7732121d9c521c66929b
   SHA1=bb0c8cc9dd59c2624407a29303908786c834bfc3

 gcc-ada-4.5-20110428.tar.bz2         Ada front end and runtime
   MD5=88d29412d1999c718638b9f24f4bfee4
   SHA1=6f20daf2f6108c39a9e6444af5f68a16e10dda8e

 gcc-fortran-4.5-20110428.tar.bz2     Fortran front end and runtime
   MD5=0c02bb5d4d7a7b8f16229268e45a5fc6
   SHA1=7ba089bb1deb6ca77a54370b875522a983afb942

 gcc-g++-4.5-20110428.tar.bz2         C++ front end and runtime
   MD5=0b70acedf835e9814d53fe00b8a3b8c8
   SHA1=e8935c26d1312d6269a2baeaf12f0503d8ba1666

 gcc-go-4.5-20110428.tar.bz2          Go front end and runtime
   MD5=4a07b44527ce2cdd830623e42792bbda
   SHA1=e441fa7c6678cebe74252c4c811c37f904b48e90

 gcc-java-4.5-20110428.tar.bz2        Java front end and runtime
   MD5=ac9cfb820932711737e31773aa320c4c
   SHA1=fd9960f06aa225108d1f3c3beb55e6dbddb11f59

 gcc-objc-4.5-20110428.tar.bz2        Objective-C front end and runtime
   MD5=8d9c5844c7a9a0697873f3eae32e0093
   SHA1=7473e28ffe62fb0ec848fd5bfe6ad5cc33eed946

 gcc-testsuite-4.5-20110428.tar.bz2   The GCC testsuite
   MD5=26ec8033d1496fcca161eccf747782ba
   SHA1=786b71502bf46d65ce999a1f40701a06326d8780

Diffs from 4.5-20110421 are available in the diffs/ subdirectory.

When a particular snapshot is ready for public consumption the LATEST-4.5
link is updated and a message is sent to the gcc list. Please do not use
a snapshot before it has been announced that way.
Re: GCC Optimisation, Part 0: Introduction
Hi Xu,

B+ tree is more commonly used in file systems. In memory, I think RB-tree is better.
RB-tree vs. hash table is just like map vs. unordered_map.

--
Yuan Pengfei
Peking University, China
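The map vs. unordered_map comparison, together with the earlier remark that a tree needs only a compare method while a hash table needs hash and equal methods, can be sketched as follows. `SymbolKey` and its helper functors are hypothetical and exist only for this illustration; nothing here reflects GCC's actual data structures.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <map>
#include <string>
#include <unordered_map>

// Hypothetical key type, for illustration only.
struct SymbolKey {
    std::string name;
    int scope;
};

// A balanced tree (std::map) needs only a strict ordering on keys.
struct SymbolLess {
    bool operator()(const SymbolKey& a, const SymbolKey& b) const {
        if (a.scope != b.scope) return a.scope < b.scope;
        return a.name < b.name;
    }
};

// A hash table (std::unordered_map) needs a hash and an equality test.
struct SymbolHash {
    std::size_t operator()(const SymbolKey& k) const {
        return std::hash<std::string>{}(k.name) ^
               (std::hash<int>{}(k.scope) << 1);
    }
};
struct SymbolEqual {
    bool operator()(const SymbolKey& a, const SymbolKey& b) const {
        return a.scope == b.scope && a.name == b.name;
    }
};

using TreeMap = std::map<SymbolKey, int, SymbolLess>;
using HashMap = std::unordered_map<SymbolKey, int, SymbolHash, SymbolEqual>;

// Both containers answer the same query; only the requirements on the
// key type differ. Throws std::out_of_range if the key is missing.
int lookup_both(const TreeMap& tree, const HashMap& hash, const SymbolKey& key) {
    int from_tree = tree.at(key);
    int from_hash = hash.at(key);
    assert(from_tree == from_hash);
    return from_tree;
}
```

One practical difference is that iterating a `TreeMap` visits keys in `SymbolLess` order, whereas a `HashMap` gives no ordering guarantee.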
Integration of transactional memory support into a data-flow extension of OpenMP
Hi,

I am Ismail KURU, accepted by Google Summer of Code 2011.

My study mainly focuses on the integration of transactional memory support into a data-flow extension of OpenMP, aiming at increased expressiveness and performance while preserving the paradigms' properties.

My project combines development and research components and can be decomposed into 5 phases.

1. Study the compatibility of the transactional memory and OpenMP constructs in the transmem development branch of GCC and propose solutions to the possible technical difficulties.

2. Study the design and implementation of both data-flow (streaming) and transmem branches of GCC, interacting with their maintainers.

3. Address the infrastructure limitations, software engineering and version management issues related to the integration of both code bases into a single experimental branch.

4. Contribute to the design and implementation for a semantics of transactions nested within OpenMP tasks.

5. Write and rewrite relevant benchmarks leveraging the new programming model, performing systematic evaluations and detailed characterizations of the performance behavior.

I have finished the research part of the project related to the compatibility of the transactional memory and OpenMP constructs, by integrating the OpenMP constructs and trans-mem constructs of GCC on the LeeTM benchmark from the University of Manchester.

Note: just an example code from LeeTM for the compatibility research of OpenMP and trans-mem.

    #define MEMSET _ITM_memsetW
    #define MEMCPY _ITM_memcpyRtWt

    #define BEGIN_TRANSACTION \
        _ITM_beginTransaction (pr_instrumentedCode | pr_hasNoIrrevocable \
                               | pr_hasNoAbort)

    #define END_TRANSACTION \
        _ITM_commitTransaction ()

    void run_benchmark() {
        .
        _ITM_Initialize();

        // create Lee benchmark
        Lee *lee = new Lee(cmdl_args.input_file_name, false, false, false);

        // initialize thread arguments
        for(unsigned i = 0;i < cmdl_args.thread_count;i++) {
            targs[i].lee = lee;
            targs[i].id = i;
            targs[i].commits = 0;
            targs[i].aborts = 0;
            targs[i].private_buffer = create_private_buffer(lee);
        }

        long start_time_ms;
        if(targs->id==0)
            start_time_ms = get_time_ms();

        omp_set_num_threads(cmdl_args.thread_count);

        #pragma omp parallel private(nthreads, thread_id)
        {
            /* Only master thread does this */
            if (thread_id == 0) {
                nthreads = omp_get_num_threads();
                printf("Number of threads = %d\n", nthreads);
            }

            run_transactions(&targs[thread_id]);

            #pragma omp barrier
        }
    }

    void run_transactions(thread_args *targ) {
        Lee *lee = targ->lee;

        while(true) {
            WorkQueue *track = lee->getNextTrack();

            if(track == NULL) {
                break;
            }

            // check for aborts
            if(!first) {
                targ->aborts++;
            } else {
                first = false;
            }

            BEGIN_TRANSACTION;

    #ifdef IRREGULAR_ACCESS_PATTERN
            // perform an update or read of contention object
            if(should_irregular_write(&targs->seed)) {
                lee->update_contention_object();
            } else if(should_irregular_read(&targs->seed)) {
                lee->read_contention_object();
            }
    #endif // IRREGULAR_ACCESS_PATTERN

            // transaction body
            lee->layNextTrack(track, targ->private_buffer);

            END_TRANSACTION;

            targs->commits++;
        }
    }

There is not a big performance difference between our approach (OpenMP + trans-mem) and (pthreads + tinySTM).

I am trying to find a pipeline (data-flow) structure for the transactional region in LeeTM for benchmarking my study. Transaction size and scaling contention on shared data are my other open issues.

More importantly, I would like to ask whether you (GCC developers) have your own suggestions and priorities regarding the proposed topic and the interaction between TM and OpenMP.

Regards
Ismail KURU
Re: GCC Optimisation, Part 0: Introduction
On Thu, Apr 28, 2011 at 9:13 PM, Robert Dewar wrote:

> I think the hash table is a much better choice than the B+ tree. You
> really are much more interested in average case performance in a compiler
> than worst case, especially when the worst case will not
> happen in practice.

Basically, I agree with you. A hash table is better in most cases, but it is not always better, especially on an ultra-large key set.

O(1) is better than O(log(n)), but not by much. If you have 1 million elements, a lookup in a balanced binary tree needs only about 20 comparisons. A comparison may be much cheaper than hashing, so 20 comparisons may be faster than computing one hash.

--
Chiheng Xu
Wuhan,China
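The figure of about 20 comparisons for a million keys follows from 2^20 = 1,048,576 > 1,000,000. A minimal sketch that counts the ordering comparisons of a binary search over a sorted array, used here as a stand-in for a perfectly balanced tree; `LookupResult` and `counted_lookup` are hypothetical names for this illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

struct LookupResult {
    bool found;
    int comparisons;  // ordering comparisons only; the final equality test is not counted
};

// Binary search over a sorted vector, counting each `<` comparison.
// Every step at least halves the remaining range, so the loop runs at most
// floor(log2(n)) + 1 times: 20 for n = 1,000,000.
LookupResult counted_lookup(const std::vector<int>& sorted, int key) {
    assert(std::is_sorted(sorted.begin(), sorted.end()));
    std::size_t lo = 0, hi = sorted.size();
    int comparisons = 0;
    while (lo < hi) {
        std::size_t mid = lo + (hi - lo) / 2;
        ++comparisons;
        if (sorted[mid] < key) {
            lo = mid + 1;   // key, if present, lies to the right of mid
        } else {
            hi = mid;       // key, if present, lies at mid or to its left
        }
    }
    bool found = lo < sorted.size() && sorted[lo] == key;
    return {found, comparisons};
}
```

Whether those comparisons are cheaper than one hash computation depends on the key type and the hash function, so the count alone does not settle the question raised in the thread.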
Re: GCC Optimisation, Part 0: Introduction
On Fri, Apr 29, 2011 at 8:07 AM, Yuan Pengfei wrote:

> B+ tree is more commonly used in file systems. In memory, I think RB-tree is
> better.
> RB-tree vs. hash table is just like map vs. unordered_map.
>

Any balanced tree that has O(log(n)) lookup complexity would do, including a splay tree.

--
Chiheng Xu
Wuhan,China