RE: Optimisation of type class methods / dictionaries

Simon Peyton-Jones Fri, 16 Nov 2012 01:19:12 -0800

See Note [ClassOp/DFun selection] in TcInstDcls, which I reproduce below.  Does 
that help?


Simon

Note [ClassOp/DFun selection]

One thing we see a lot is stuff like
    op2 (df d1 d2)
where 'op2' is a ClassOp and 'df' is DFun.  Now, we could inline *both*
'op2' and 'df' to get
     case (MkD ($cop1 d1 d2) ($cop2 d1 d2) ... of
       MkD _ op2 _ _ _ -> op2
And that will reduce to ($cop2 d1 d2) which is what we wanted.

But it's tricky to make this work in practice, because it requires us to
inline both 'op2' and 'df'.  But neither is keen to inline without having
seen the other's result; and it's very easy to get code bloat (from the
big intermediate) if you inline a bit too much.

Instead we use a cunning trick.
* We arrange that 'df' and 'op2' NEVER inline.

* We arrange that 'df' is ALWAYS defined in the sylised form
      df d1 d2 = MkD ($cop1 d1 d2) ($cop2 d1 d2) ...

* We give 'df' a magical unfolding (DFunUnfolding [$cop1, $cop2, ..])
   that lists its methods.

* We make CoreUnfold.exprIsConApp_maybe spot a DFunUnfolding and return
   a suitable constructor application -- inlining df "on the fly" as it
   were.

* We give the ClassOp 'op2' a BuiltinRule that extracts the right piece
   iff its argument satisfies exprIsConApp_maybe.  This is done in
   MkId mkDictSelId

* We make 'df' CONLIKE, so that shared uses stil match; eg
      let d = df d1 d2
      in ...(op2 d)...(op1 d)...
From: cvs-ghc-boun...@haskell.org [mailto:cvs-ghc-boun...@haskell.org] On 
Behalf Of Thomas Schilling
Sent: 15 November 2012 18:01
To: Cvs-ghc@haskell.org
Subject: Optimisation of type class methods / dictionaries

Hi,

I was investigating how GHC optimises dictionaries.  So, I'm using the 
following variant of the Eq type class:

    class XEq a where
      xeq, xne :: a -> a -> Bool
      xne a b = not (a `xeq` b) }

    instance XEq Bool where
      xeq True  True  = True
      xeq False False = True
      xeq _     _     = False

    instance XEq a => XEq [a] where
      xeq []     []     = True
      xeq (x:xs) (y:ys) = xeq x y && xeq xs ys
      xeq _             _ = False

I then compile this with:

     ghc -c -fforce-recomp -O2 -ddump-simpl -ddump-types <filename.hs>


I now have two questions regarding the output of the compiler.


Question 1: Why are the instance method selectors marked as NOINLINE?

Main.xeq [InlPrag=[NEVER]]
  :: forall a_aeV. Main.XEq a_aeV => a_aeV -> a_aeV -> GHC.Types.Bool



Question 2: The optimised code for the list implementation looks as follows:

Main.$fXEq[]_$cxeq [Occ=LoopBreaker]
  :: forall a_an3.
     Main.XEq a_an3 =>
     [a_an3] -> [a_an3] -> GHC.Types.Bool
[GblId, Arity=3, Caf=NoCafRefs, Str=DmdType LSS]
Main.$fXEq[]_$cxeq =
  \ (@ a_an3)
    ($dXEq_auY :: Main.XEq a_an3)
    (ds_dw8 :: [a_an3])
    (ds1_dw9 :: [a_an3]) ->
    case ds_dw8 of _ {
      [] ->
        case ds1_dw9 of _ {
          [] -> GHC.Types.True;
          : ipv_swB ipv1_swC -> GHC.Types.False
        };
      : x_an4 xs_an5 ->
        case ds1_dw9 of _ {
          [] -> GHC.Types.False;
          : y_an6 ys_an7 ->
            --- AAA
            case Main.xeq @ a_an3 $dXEq_auY x_an4 y_an6 of _ {
              GHC.Types.False -> GHC.Types.False;
              GHC.Types.True ->
                --- BBB
                Main.$fXEq[]_$cxeq @ a_an3 $dXEq_auY xs_an5 ys_an7
            }
        }
    }

If I turn off optimisation there are two calls to "xeq" which correspond 
exactly to the two calls from the implementation to either side of "&&".  In 
the optimised version, the second "xeq" has been inlined (marker BBB), but not 
the first one (marker AAA).  This is despite "xeq" being marked NOINLINE, by 
the way.  I would expect the code after AAA to look something like:

   case $dXEq_auY of _ {
     Main.D:XEq tpl_B2 _ ->
       case tpl_B2 x_an4 y_an6 of _ {
          ... ->

Adding -fdicts-strict to the command line also doesn't seem to convince the 
compiler to apply the worker/wrapper transformation and create a loop that just 
passes around the implementation of "xeq" instead of the whole dictionary.

Does anyone know what the issue is here?

Thanks,
  / Thomas

_______________________________________________
Cvs-ghc mailing list
Cvs-ghc@haskell.org
http://www.haskell.org/mailman/listinfo/cvs-ghc

RE: Optimisation of type class methods / dictionaries

Reply via email to