2016-04-26 22:50 GMT+03:00 H.J. Lu <hjl.to...@gmail.com>: > On Tue, Apr 26, 2016 at 11:42 AM, H.J. Lu <hjl.to...@gmail.com> wrote: >> On Tue, Apr 26, 2016 at 9:33 AM, H.J. Lu <hjl.to...@gmail.com> wrote: >>> On Tue, Apr 26, 2016 at 9:27 AM, Ilya Enkovich <enkovich....@gmail.com> >>> wrote: >>>> 2016-04-26 19:20 GMT+03:00 Ilya Enkovich <enkovich....@gmail.com>: >>>>> 2016-04-26 19:12 GMT+03:00 H.J. Lu <hjl.to...@gmail.com>: >>>>>> On Tue, Apr 26, 2016 at 9:07 AM, Ilya Enkovich <enkovich....@gmail.com> >>>>>> wrote: >>>>>>> 2016-04-26 18:39 GMT+03:00 H.J. Lu <hjl.to...@gmail.com>: >>>>>>>> On Tue, Apr 26, 2016 at 8:21 AM, Ilya Enkovich >>>>>>>> <enkovich....@gmail.com> wrote: >>>>>>>>> 2016-04-26 18:12 GMT+03:00 H.J. Lu <hjl.to...@gmail.com>: >>>>>>>>>> On Tue, Apr 26, 2016 at 8:05 AM, Ilya Enkovich >>>>>>>>>> <enkovich....@gmail.com> wrote: >>>>>>>>>>> 2016-04-26 17:55 GMT+03:00 H.J. Lu <hjl.to...@gmail.com>: >>>>>>>>>>>> On Tue, Apr 26, 2016 at 7:15 AM, Ilya Enkovich >>>>>>>>>>>> <enkovich....@gmail.com> wrote: >>>>>>>>>>>>> 2016-04-26 17:07 GMT+03:00 H.J. Lu <hjl.to...@gmail.com>: >>>>>>>>>>>>>> On Mon, Apr 25, 2016 at 9:13 AM, Ilya Enkovich >>>>>>>>>>>>>> <enkovich....@gmail.com> wrote: >>>>>>>>>>>>>>> 2016-04-25 18:27 GMT+03:00 H.J. Lu <hjl.to...@gmail.com>: >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> Ilya, can you take a look? >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> Thanks. >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> -- >>>>>>>>>>>>>>>> H.J. >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Hi, >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Algorithmic part of the patch looks OK to me except the >>>>>>>>>>>>>>> following piece of code. >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> +/* Check REF's chain to add new insns into a queue >>>>>>>>>>>>>>> + and find registers requiring conversion. */ >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Comment is wrong because you don't have any conversions >>>>>>>>>>>>>>> required for >>>>>>>>>>>>>>> your candidates. >>>>>>>>>>>>>> >>>>>>>>>>>>>> I will fix it. >>>>>>>>>>>>>> >>>>>>>>>>>>>>> + >>>>>>>>>>>>>>> +void >>>>>>>>>>>>>>> +scalar_chain_64::analyze_register_chain (bitmap candidates, >>>>>>>>>>>>>>> df_ref ref) >>>>>>>>>>>>>>> +{ >>>>>>>>>>>>>>> + df_link *chain; >>>>>>>>>>>>>>> + >>>>>>>>>>>>>>> + gcc_assert (bitmap_bit_p (insns, DF_REF_INSN_UID (ref)) >>>>>>>>>>>>>>> + || bitmap_bit_p (candidates, DF_REF_INSN_UID >>>>>>>>>>>>>>> (ref))); >>>>>>>>>>>>>>> + add_to_queue (DF_REF_INSN_UID (ref)); >>>>>>>>>>>>>>> + >>>>>>>>>>>>>>> + for (chain = DF_REF_CHAIN (ref); chain; chain = chain->next) >>>>>>>>>>>>>>> + { >>>>>>>>>>>>>>> + unsigned uid = DF_REF_INSN_UID (chain->ref); >>>>>>>>>>>>>>> + >>>>>>>>>>>>>>> + if (!NONDEBUG_INSN_P (DF_REF_INSN (chain->ref))) >>>>>>>>>>>>>>> + continue; >>>>>>>>>>>>>>> + >>>>>>>>>>>>>>> + if (!DF_REF_REG_MEM_P (chain->ref)) >>>>>>>>>>>>>>> + continue; >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> I believe here you wrongly jump to the next ref intead of >>>>>>>>>>>>>>> actually adding it >>>>>>>>>>>>>>> to a queue. You may just use >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> gcc_assert (!DF_REF_REG_MEM_P (chain->ref)); >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> because you should'n have a candidate used in address operand. >>>>>>>>>>>>>> >>>>>>>>>>>>>> I will update. >>>>>>>>>>>>>> >>>>>>>>>>>>>>> + >>>>>>>>>>>>>>> + if (bitmap_bit_p (insns, uid)) >>>>>>>>>>>>>>> + continue; >>>>>>>>>>>>>>> + >>>>>>>>>>>>>>> + if (bitmap_bit_p (candidates, uid)) >>>>>>>>>>>>>>> + add_to_queue (uid); >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Probably gcc_assert (bitmap_bit_p (candidates, uid)) since no >>>>>>>>>>>>>>> uses and defs >>>>>>>>>>>>>>> out of candidates list are allowed? >>>>>>>>>>>>>> >>>>>>>>>>>>>> That would be wrong since there are >>>>>>>>>>>>>> >>>>>>>>>>>>>> while (!bitmap_empty_p (queue)) >>>>>>>>>>>>>> { >>>>>>>>>>>>>> insn_uid = bitmap_first_set_bit (queue); >>>>>>>>>>>>>> bitmap_clear_bit (queue, insn_uid); >>>>>>>>>>>>>> bitmap_clear_bit (candidates, insn_uid); >>>>>>>>>>>>>> add_insn (candidates, insn_uid); >>>>>>>>>>>>>> } >>>>>>>>>>>>>> >>>>>>>>>>>>>> An instruction is a candidate and the bit is cleared when >>>>>>>>>>>>>> analyze_register_chain is called. >>>>>>>>>>>>> >>>>>>>>>>>>> You clear candidates bit but the first thing you do in add_insn >>>>>>>>>>>>> is set >>>>>>>>>>>>> insns bit. >>>>>>>>>>>>> Thus you should hit: >>>>>>>>>>>>> >>>>>>>>>>>>> if (bitmap_bit_p (insns, uid)) >>>>>>>>>>>>> continue; >>>>>>>>>>>>> >>>>>>>>>>>>> For handled candidates. >>>>>>>>>>>>> >>>>>>>>>>>>> Probably it would be more clear if we keep this clear/set pair >>>>>>>>>>>>> together? E.g. move >>>>>>>>>>>>> bitmap_clear_bit (candidates, insn_uid) to scalar_chain::add_insn. >>>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> After we started processing candidates, we only use candidates >>>>>>>>>>>> to check if an instruction is a candidate, not to check if an >>>>>>>>>>>> instruction is NOT a candidate. >>>>>>>>>>> >>>>>>>>>>> I don't see how it's related to what I said. My point is that >>>>>>>>>>> when you analyze added insn you shouldn't reach insns which are both >>>>>>>>>>> not in candidates and not in current scalar_chain_64. That's why I >>>>>>>>>>> think you miss an assert in scalar_chain_64::analyze_register_chain. >>>>>>>>>> >>>>>>>>>> Since all candidates will be processed by >>>>>>>>>> >>>>>>>>>> while (!bitmap_empty_p (queue)) >>>>>>>>>> { >>>>>>>>>> insn_uid = bitmap_first_set_bit (queue); >>>>>>>>>> bitmap_clear_bit (queue, insn_uid); >>>>>>>>>> bitmap_clear_bit (candidates, insn_uid); >>>>>>>>>> add_insn (candidates, insn_uid); >>>>>>>>>> } >>>>>>>>> >>>>>>>>> This process only queue, not all candidates. analyze_register_chain >>>>>>>>> fills queue bitmap to build a chain. >>>>>>>>> >>>>>>>>>> >>>>>>>>>> I will change to >>>>>>>>>> >>>>>>>>>> /* Check REF's chain to add new insns into a queue. */ >>>>>>>>>> >>>>>>>>>> void >>>>>>>>>> timode_scalar_chain::analyze_register_chain (bitmap candidates, >>>>>>>>>> df_ref ref) >>>>>>>>>> { >>>>>>>>>> gcc_assert (bitmap_bit_p (insns, DF_REF_INSN_UID (ref)) >>>>>>>>>> || bitmap_bit_p (candidates, DF_REF_INSN_UID (ref))); >>>>>>>>>> add_to_queue (DF_REF_INSN_UID (ref)); >>>>>>>>>> } >>>>>>>>>> >>>>>>>>> >>>>>>>>> The purpose of analyze_register_chain is to collect all register defs >>>>>>>>> or uses into chain's queue. You can't just remove a loop over ref's >>>>>>>>> chain then. >>>>>>>> >>>>>>>> That is true for DImode conversion. For TImode conversion, >>>>>>>> there are no register conversions required: >>>>>>>> >>>>>>>> https://gcc.gnu.org/ml/gcc-patches/2016-04/msg01394.html >>>>>>>> >>>>>>>> All it does it to call add_to_queue. >>>>>>> >>>>>>> No conversion means no mark_dual_mode_def calls in >>>>>>> analyze_register_chain >>>>>>> but it still may call add_to_queue multiple times in its loop. E.g. >>>>>>> >>>>>>> 1: r1 = 1 >>>>>>> ... >>>>>>> 5: r1 = 2 >>>>>>> .. >>>>>>> 10: r2 = r1 >>>>>>> >>>>>>> When you add #10 into a chain you should add both #1 and #5 into a >>>>>>> queue. >>>>>>> That's what analyze_register_chain is supposed to do. >>>>>> >>>>>> Since both i1 and i5 are candidates, they are added to queue. >>>>> >>>>> By who? They will just go into another chain. >>>> >>>> BTW you can just reuse existing analyze_register_chain and make >>>> mark_dual_mode_def >>>> virtual instead. Just put gcc_unreachable () into its TI version. >>>> >>> >> >> Here is the updated patch which does that. Ok for trunk if there >> is no regressions on x86-64? >> > > CSE works with SSE constants now. Here is the updated patch. > OK for trunk if there are no regressions on x86-64?
Looks OK to me. Thanks, Ilya > > > -- > H.J.