Proposal on Unrolling factor based on Data reuse.
Hello All:

I would like to propose an unrolling factor based on data reuse between different iterations. This combines the data reuse of different iterations into a single iteration. A MaxFactor bounds the unroll factor computed from data reuse. MaxFactor is calculated as (MaxInsts / LoopBodyInsts), where MaxInsts is the number of instructions that does not degrade the instruction cache. The calculation of MaxInsts also takes the def-use distance into account, since unrolling widens live ranges: the limit derived from the def-use distance is combined with the MaxInsts derived from the instruction cache.

The following example is from the Livermore Loops benchmarks. The reuse of data from the previous iteration in the current iteration constitutes the data reuse. Unrolling the loop in Fig(1) can reuse that data and thus increase performance. The loop is unrolled 3 times in Fig(2), based on the algorithm for calculating the unrolling factor given in Fig(3).

for (i = 1; i < size; i++)
{
  x[i] = z[i] * (y[i] - x[i-1]);
}

Fig(1).

for (i = 1; i < size; i += 3)
{
  x[i]   = z[i] * (y[i] - x[i-1]);
  x[i+1] = z[i] * (y[i] - x[i]);
  x[i]   = z[i] * (y[i] - x[i+1]);
}

Fig(2).

Algorithm for unroll factor based on data reuse:

if (data_reuse == true)
{
  unroll_factor = reuse_distance + 1;
  if (unroll_factor < MaxFactor)
    return unroll_factor;
  else
  {
    unroll_factor = MaxFactor;
    if (unroll_factor < MaxDistance)
      return unroll_factor;
  }
}

Fig(3).

In the example in Fig(1), the data reuse distance and the MaxFactor are calculated so that the unrolled body neither degrades the instruction cache nor exceeds the limit on def-use distance that would widen live ranges too much. With both considered, the loop is unrolled 3 times.

Thoughts please?

Thanks & Regards
Ajit
RE: Proposal on Unrolling factor based on Data reuse.
Hello All:

There is a typo error. Fig(2) is corrected as follows:

for (i = 1; i < size; i += 3)
{
  x[i]   = z[i]   * (y[i]   - x[i-1]);
  x[i+1] = z[i+1] * (y[i+1] - x[i]);
  x[i+2] = z[i+2] * (y[i+2] - x[i+1]);
}

Fig(2).

Thanks & Regards
Ajit

-----Original Message-----
From: gcc-ow...@gcc.gnu.org [mailto:gcc-ow...@gcc.gnu.org] On Behalf Of Ajit Kumar Agarwal
Sent: Saturday, March 07, 2015 3:31 PM
To: Richard Biener; gcc@gcc.gnu.org
Cc: Vinod Kathail; Shail Aditya Gupta; Vidhumouli Hunsigida; Nagaraju Mekala
Subject: Proposal on Unrolling factor based on Data reuse.

[quoted original proposal trimmed]
Re: Proposal on Unrolling factor based on Data reuse.
On March 7, 2015 11:01:12 AM CET, Ajit Kumar Agarwal wrote:
>Hello All:
>
>I would like to propose the Unrolling factor based on Data reuse
>between different iterations.
>[...]
>Thoughts Please ?

I believe this is what predictive commoning does in a more general way.

Richard.
Re: GSoc-2015: Modular GCC
On 03/05/2015 06:51 PM, Sidharth Chaturvedi wrote:
> I like the idea of making the intermediate representations more
> streamable. But I think this task should also involve creating
> separate front-end and middle-end modules, as then there can be a
> clear distinction of what IR is an input to a module and what IR is
> output from a module (including a more specific structure to each IR).
> This would give a pipeline structure to GCC. I don't know how much of
> this can be achieved via GSoC, but I would still like to give it a
> try. Any interested mentors? Thanks.

Would it make sense for someone to pick this
https://gcc.gnu.org/wiki/GimpleFrontEnd up where it was left off?

Thanks,
Pedro Alves
Re: GSoc-2015: Modular GCC
On 03/07/15 07:41, Pedro Alves wrote:
> On 03/05/2015 06:51 PM, Sidharth Chaturvedi wrote:
>> I like the idea of making the intermediate representations more
>> streamable.
>> [...]
>> Any interested mentors? Thanks.
>
> Would it make sense for someone to pick this
> https://gcc.gnu.org/wiki/GimpleFrontEnd up where it was left off?

Absolutely. Especially since it's a necessary step towards unit testing
of the gimple optimizers.

Jeff
Re: GSoc-2015: Modular GCC
On 03/06/15 18:22, Mikhail Maltsev wrote:
> 05.03.2015 18:46, Jeff Law wrote:
>> Certainly still useful, the trick is finding a hunk of that work that
>> can be tackled via GSoC.
>
> Though I'm not participating in GSoC, I would be glad to get involved
> in GCC development. Could you please give some comments on the current
> status of the following part:
>
> "Before GCC can be modularized properly, it is necessary to make clear
> what interfaces and declarations are actually needed in each source
> file. This will, no doubt, be a huge job. It is unclear at the moment
> whether there are tools available that could help (Dehydra perhaps, or
> a dedicated plugin, or Ctags? Or turn this patch into a proper plugin?
> Maybe create a symbols database and identify dependencies to break?)."

It's an ongoing project. Two active sub-projects are pulling apart the
headers so that we can identify routines from the front-ends that are
being used outside the front-ends, and deferring of debugging output.
The intermediate goal of the first of those projects is the ability to
start extracting tree types out of the gimple and backends.

Other sub-projects that have received a lot of attention over the last
year are adding a C++ class hierarchy to parts of gimple and RTL. One
potentially easy project there would be to take David Malcolm's work to
turn the RTL EXPR & INSN lists into standard C++ forward lists. One
that's harder would be pushing the rtx_insn * class further. Basically,
most of the places where we have an as-a cast between rtx_insn * and an
rtx are good candidates for removal and further propagation of
rtx_insn * types.

Jeff
Proposal for inter-procedural loops fusion.
Hello All:

I am proposing inter-procedural loop fusion. Generally, loops adjacent to each other are the conformable candidates, and loop fusion is done for intra-procedural loops. Whole-program analysis with array-section analysis across procedure calls is needed to fuse loops across procedure calls.

In the example below, the doubly nested loop in routine sub1 in Fig(1) is a candidate for fusing with the doubly nested loop in routine sub2 in Fig(2). The main program calls sub1 and sub2, and the loops in sub1 and sub2 can be fused. Array-section analysis and dependence analysis are done on the regions across the procedure calls for the loops in sub1 and sub2.

Here is the proposal for loop fusion across procedure calls. Normally, loop peeling is done to make a loop a candidate for fusion, and the loops in sub2 can also be peeled. In the examples below, the arrays A and B are global variables so that loop fusion across the procedure calls can happen.

The examples are extracted from the article "Inter-procedure loop fusion, array compaction and rotation" by John Ng et al.

Examples:

      program main
      common /s/ A(N),B(N)
      call sub1()
      call sub2()
      end

      subroutine sub1()
      common /s/ A(N),B(N)
      do j=1,N
        do i=1,N
          A(i+1,j) = j - i
        enddo
      enddo
      do i=1,N
        A(1,i) = i * 2
      enddo
      do i=1,N
        A(i+1,n+1) = A(i+1,1)
      enddo
      end /* sub1 */

Fig(1)

      subroutine sub2()
      common /s/ A(N),B(N)
      do j=1,N
        do i=1,N
          B(i,j) = A(i+1,j+1) + A(i,j+1) + A(i,j) + A(i+1,j)
        enddo
      enddo
      end

Fig(2).

Thoughts please?

Thanks & Regards
Ajit