Hello TVM community,
I have a question about how to read out intermediate values in Relay IR.
For a mod that the user creates manually, I know we can expose arbitrary outputs by returning them in a tuple.
For example, to read out output_0, output_1, and output_2, we can write (with placeholder ops for illustration):
> import tvm
> from tvm import relay
>
> dshape = (1, 8)  # any shape works for this example
> data = relay.var("data", relay.TensorType(dshape, "float32"))
> output_0 = relay.nn.relu(data)            # whatever operations...
> output_1 = relay.add(output_0, output_0)  # whatever operations...
> output_2 = relay.tanh(output_1)           # whatever operations...
> func = relay.Function([data], relay.Tuple([output_0, output_1, output_2]))
> mod = tvm.IRModule.from_expr(func)
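After building, each tuple field can then be read back by its index. Here is a minimal sketch of how I do this (assuming an LLVM target and the placeholder ops above):
> import numpy as np
> from tvm.contrib import graph_executor
>
> # Build the tuple-output mod above and fetch each field by index.
> lib = relay.build(mod, target="llvm")
> m = graph_executor.GraphModule(lib["default"](tvm.cpu(0)))
> m.set_input("data", np.ones(dshape, dtype="float32"))
> m.run()
> out_0 = m.get_output(0).numpy()  # output_0
> out_1 = m.get_output(1).numpy()  # output_1
> out_2 = m.get_output(2).numpy()  # output_2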
However, for a mod converted from another DNN framework, I am wondering
how to add such arbitrary outputs.
For example, I have successfully converted a DistilBERT model to TVM
Relay IR.
Here is the output when I use print("original mod: \n",
mod.astext(show_meta_data=False)):
> def @main(%tf_distil_bert_for_sequence_classification/distilbert/embeddings/Gather/resource:
> Tensor[(30522, 768), float32], %x: Tensor[(1, 128), int32],
> %tf_distil_bert_for_sequence_classification/distilbert/embeddings/Gather_1/resource:
> Tensor[(512, 768), float32],
> %tf_distil_bert_for_sequence_classification/distilbert/embeddings/LayerNorm/batchnorm/mul/ReadVariableOp/resource:
> Tensor[(768), float32],
> %tf_distil_bert_for_sequence_classification/distilbert/embeddings/LayerNorm/batchnorm/ReadVariableOp/resource:
> Tensor[(768), float32],
> %tf_distil_bert_for_sequence_classification/distilbert/transformer/layer_._0/attention/q_lin/Tensordot/ReadVariableOp/resource:
> Tensor[(768, 768), float32],
> %tf_distil_bert_for_sequence_classification/distilbert/transformer/layer_._0/attention/q_lin/BiasAdd/ReadVariableOp/resource:
> Tensor[(768), float32],
> %tf_distil_bert_for_sequence_classification/distilbert/transformer/layer_._0/attention/k_lin/Tensordot/ReadVariableOp/resource:
> Tensor[(768, 768), float32],
> %tf_distil_bert_for_sequence_classification/distilbert/transformer/layer_._0/attention/k_lin/BiasAdd/ReadVariableOp/resource:
> Tensor[(768), float32],
> %tf_distil_bert_for_sequence_classification/distilbert/ones: Tensor[(1, 128),
> float32],
> %tf_distil_bert_for_sequence_classification/distilbert/transformer/layer_._0/attention/v_lin/Tensordot/ReadVariableOp/resource:
> Tensor[(768, 768), float32],
> %tf_distil_bert_for_sequence_classification/distilbert/transformer/layer_._0/attention/v_lin/BiasAdd/ReadVariableOp/resource:
> Tensor[(768), float32],
> %tf_distil_bert_for_sequence_classification/distilbert/transformer/layer_._0/attention/out_lin/Tensordot/ReadVariableOp/resource:
> Tensor[(768, 768), float32],
> %tf_distil_bert_for_sequence_classification/distilbert/transformer/layer_._0/attention/out_lin/BiasAdd/ReadVariableOp/resource:
> Tensor[(768), float32],
> %tf_distil_bert_for_sequence_classification/distilbert/transformer/layer_._0/sa_layer_norm/batchnorm/mul/ReadVariableOp/resource:
> Tensor[(768), float32],
> %tf_distil_bert_for_sequence_classification/distilbert/transformer/layer_._0/sa_layer_norm/batchnorm/ReadVariableOp/resource:
> Tensor[(768), float32],
> %tf_distil_bert_for_sequence_classification/distilbert/transformer/layer_._0/ffn/lin1/Tensordot/ReadVariableOp/resource:
> Tensor[(768, 3072), float32],
> %tf_distil_bert_for_sequence_classification/distilbert/transformer/layer_._0/ffn/lin1/BiasAdd/ReadVariableOp/resource:
> Tensor[(3072), float32],
> %tf_distil_bert_for_sequence_classification/distilbert/transformer/layer_._0/ffn/lin2/Tensordot/ReadVariableOp/resource:
> Tensor[(3072, 768), float32],
> %tf_distil_bert_for_sequence_classification/distilbert/transformer/layer_._0/ffn/lin2/BiasAdd/ReadVariableOp/resource:
> Tensor[(768), float32],
> %tf_distil_bert_for_sequence_classification/distilbert/transformer/layer_._0/output_layer_norm/batchnorm/mul/ReadVariableOp/resource:
> Tensor[(768), float32],
> %tf_distil_bert_for_sequence_classification/distilbert/transformer/layer_._0/output_layer_norm/batchnorm/ReadVariableOp/resource:
> Tensor[(768), float32])
> {
> %0 = expand_dims(meta[relay.Constant][0] /* ty=Tensor[(128), int32] */,
> axis=0) /* ty=Tensor[(1, 128), int32] */;
> %1 =
> take(%tf_distil_bert_for_sequence_classification/distilbert/embeddings/Gather_1/resource,
> %0, axis=0) /* ty=Tensor[(1, 128, 768), float32] */;
> %2 =
> take(%tf_distil_bert_for_sequence_classification/distilbert/embeddings/Gather/resource,
> %x, axis=0) /* ty=Tensor[(1, 128, 768), float32] */;
> %3 = tile(%1, reps=[1, 1, 1]) /* ty=Tensor[(1, 128, 768), float32] */;
> %4 = add(%2, %3) /* ty=Tensor[(1, 128, 768), float32] */;
> %5 = mean(%4, axis=[2], keepdims=True) /* ty=Tensor[(1, 128, 1),
> float32] */;
> %6 = subtract(%4, %5) /* ty=Tensor[(1, 128, 768), float32] */;
> %7 = multiply(%6, %6) /* ty=Tensor[(1, 128, 768), float32] */;
> %8 = mean(%7, axis=[2], keepdims=True) /* ty=Tensor[(1, 128, 1),
> float32] */;
> %9 = add(%8, 1e-12f /* ty=float32 */) /* ty=Tensor[(1, 128, 1),
> float32] */;
> %10 = power(%9, -0.5f /* ty=float32 */) /* ty=Tensor[(1, 128, 1),
> float32] */;
> %11 = multiply(%10,
> %tf_distil_bert_for_sequence_classification/distilbert/embeddings/LayerNorm/batchnorm/mul/ReadVariableOp/resource)
> /* ty=Tensor[(1, 128, 768), float32] */;
> %12 = multiply(%5, %11) /* ty=Tensor[(1, 128, 768), float32] */;
> %13 = multiply(%4, %11) /* ty=Tensor[(1, 128, 768), float32] */;
> %14 =
> subtract(%tf_distil_bert_for_sequence_classification/distilbert/embeddings/LayerNorm/batchnorm/ReadVariableOp/resource,
> %12) /* ty=Tensor[(1, 128, 768), float32] */;
> %15 = add(%13, %14) /* ty=Tensor[(1, 128, 768), float32] */;
> %16 = reshape(%15, newshape=[128, 768]) /* ty=Tensor[(128, 768),
> float32] */;
> %17 =
> transpose(%tf_distil_bert_for_sequence_classification/distilbert/transformer/layer_._0/attention/q_lin/Tensordot/ReadVariableOp/resource,
> axes=[1, 0]) /* ty=Tensor[(768, 768), float32] */;
> %18 = nn.dense(%16, %17, units=768) /* ty=Tensor[(128, 768), float32]
> */;
> %19 = reshape(%18, newshape=[1, 128, 768]) /* ty=Tensor[(1, 128, 768),
> float32] */;
> %20 = add(%19,
> %tf_distil_bert_for_sequence_classification/distilbert/transformer/layer_._0/attention/q_lin/BiasAdd/ReadVariableOp/resource)
> /* ty=Tensor[(1, 128, 768), float32] */;
> %21 = reshape(%20, newshape=[1, -1, 12, 64]) /* ty=Tensor[(1, 128, 12,
> 64), float32] */;
> %22 = cast(768 /* ty=int32 */, dtype="float64") /* ty=float64 */;
> %23 = cast(12 /* ty=int32 */, dtype="float64") /* ty=float64 */;
> %24 = divide(%22, %23) /* ty=float64 */;
> %25 = cast(%24, dtype="int32") /* ty=int32 */;
> %26 = cast(%25, dtype="float32") /* ty=float32 */;
> %27 = transpose(%21, axes=[0, 2, 1, 3]) /* ty=Tensor[(1, 12, 128, 64),
> float32] */;
> %28 = power(%26, -0.5f /* ty=float32 */) /* ty=float32 */;
> %29 = multiply(%27, %28) /* ty=Tensor[(1, 12, 128, 64), float32] */;
> %30 = reshape(%15, newshape=[128, 768]) /* ty=Tensor[(128, 768),
> float32] */;
> %31 =
> transpose(%tf_distil_bert_for_sequence_classification/distilbert/transformer/layer_._0/attention/k_lin/Tensordot/ReadVariableOp/resource,
> axes=[1, 0]) /* ty=Tensor[(768, 768), float32] */;
> %32 = nn.dense(%30, %31, units=768) /* ty=Tensor[(128, 768), float32]
> */;
> %33 = reshape(%32, newshape=[1, 128, 768]) /* ty=Tensor[(1, 128, 768),
> float32] */;
> %34 = add(%33,
> %tf_distil_bert_for_sequence_classification/distilbert/transformer/layer_._0/attention/k_lin/BiasAdd/ReadVariableOp/resource)
> /* ty=Tensor[(1, 128, 768), float32] */;
> %35 = reshape(%34, newshape=[1, -1, 12, 64]) /* ty=Tensor[(1, 128, 12,
> 64), float32] */;
> %36 = transpose(%35, axes=[0, 2, 1, 3]) /* ty=Tensor[(1, 12, 128, 64),
> float32] */;
> %37 = reshape(%29, newshape=[12, 128, 64]) /* ty=Tensor[(12, 128, 64),
> float32] */;
> %38 = reshape(%36, newshape=[12, 128, 64]) /* ty=Tensor[(12, 128, 64),
> float32] */;
> %39 = nn.batch_matmul(%37, %38, meta[relay.attrs.BatchMatmulAttrs][0])
> /* ty=Tensor[(12, 128, 128), float32] */;
> %40 =
> reshape(%tf_distil_bert_for_sequence_classification/distilbert/ones,
> newshape=[1, 1, 1, 128]) /* ty=Tensor[(1, 1, 1, 128), float32] */;
> %41 = subtract(1f /* ty=float32 */, %40) /* ty=Tensor[(1, 1, 1, 128),
> float32] */;
> %42 = reshape(%39, newshape=[1, 12, 128, 128]) /* ty=Tensor[(1, 12,
> 128, 128), float32] */;
> %43 = multiply(1e+30f /* ty=float32 */, %41) /* ty=Tensor[(1, 1, 1,
> 128), float32] */;
> %44 = subtract(%42, %43) /* ty=Tensor[(1, 12, 128, 128), float32] */;
> %45 = nn.softmax(%44) /* ty=Tensor[(1, 12, 128, 128), float32] */;
> %46 = reshape(%15, newshape=[128, 768]) /* ty=Tensor[(128, 768),
> float32] */;
> %47 =
> transpose(%tf_distil_bert_for_sequence_classification/distilbert/transformer/layer_._0/attention/v_lin/Tensordot/ReadVariableOp/resource,
> axes=[1, 0]) /* ty=Tensor[(768, 768), float32] */;
> %48 = nn.dense(%46, %47, units=768) /* ty=Tensor[(128, 768), float32]
> */;
> %49 = reshape(%48, newshape=[1, 128, 768]) /* ty=Tensor[(1, 128, 768),
> float32] */;
> %50 = add(%49,
> %tf_distil_bert_for_sequence_classification/distilbert/transformer/layer_._0/attention/v_lin/BiasAdd/ReadVariableOp/resource)
> /* ty=Tensor[(1, 128, 768), float32] */;
> %51 = reshape(%50, newshape=[1, -1, 12, 64]) /* ty=Tensor[(1, 128, 12,
> 64), float32] */;
> %52 = transpose(%51, axes=[0, 2, 1, 3]) /* ty=Tensor[(1, 12, 128, 64),
> float32] */;
> %53 = reshape(%52, newshape=[12, 128, 64]) /* ty=Tensor[(12, 128, 64),
> float32] */;
> %54 = reshape(%45, newshape=[12, 128, 128]) /* ty=Tensor[(12, 128,
> 128), float32] */;
> %55 = transpose(%53, axes=[0, 2, 1]) /* ty=Tensor[(12, 64, 128),
> float32] */;
> %56 = nn.batch_matmul(%54, %55, meta[relay.attrs.BatchMatmulAttrs][1])
> /* ty=Tensor[(12, 128, 64), float32] */;
> %57 = reshape(%56, newshape=[1, 12, 128, 64]) /* ty=Tensor[(1, 12, 128,
> 64), float32] */;
> %58 = transpose(%57, axes=[0, 2, 1, 3]) /* ty=Tensor[(1, 128, 12, 64),
> float32] */;
> %59 = reshape(%58, newshape=[1, -1, 768]) /* ty=Tensor[(1, 128, 768),
> float32] */;
> %60 = reshape(%59, newshape=[128, 768]) /* ty=Tensor[(128, 768),
> float32] */;
> %61 =
> transpose(%tf_distil_bert_for_sequence_classification/distilbert/transformer/layer_._0/attention/out_lin/Tensordot/ReadVariableOp/resource,
> axes=[1, 0]) /* ty=Tensor[(768, 768), float32] */;
> %62 = nn.dense(%60, %61, units=768) /* ty=Tensor[(128, 768), float32]
> */;
> %63 = reshape(%62, newshape=[1, 128, 768]) /* ty=Tensor[(1, 128, 768),
> float32] */;
> %64 = add(%63,
> %tf_distil_bert_for_sequence_classification/distilbert/transformer/layer_._0/attention/out_lin/BiasAdd/ReadVariableOp/resource)
> /* ty=Tensor[(1, 128, 768), float32] */;
> %65 = add(%64, %15) /* ty=Tensor[(1, 128, 768), float32] */;
> %66 = mean(%65, axis=[2], keepdims=True) /* ty=Tensor[(1, 128, 1),
> float32] */;
> %67 = subtract(%65, %66) /* ty=Tensor[(1, 128, 768), float32] */;
> %68 = multiply(%67, %67) /* ty=Tensor[(1, 128, 768), float32] */;
> %69 = mean(%68, axis=[2], keepdims=True) /* ty=Tensor[(1, 128, 1),
> float32] */;
> %70 = add(%69, 1e-12f /* ty=float32 */) /* ty=Tensor[(1, 128, 1),
> float32] */;
> %71 = power(%70, -0.5f /* ty=float32 */) /* ty=Tensor[(1, 128, 1),
> float32] */;
> %72 = multiply(%71,
> %tf_distil_bert_for_sequence_classification/distilbert/transformer/layer_._0/sa_layer_norm/batchnorm/mul/ReadVariableOp/resource)
> /* ty=Tensor[(1, 128, 768), float32] */;
> %73 = multiply(%66, %72) /* ty=Tensor[(1, 128, 768), float32] */;
> %74 = multiply(%65, %72) /* ty=Tensor[(1, 128, 768), float32] */;
> %75 =
> subtract(%tf_distil_bert_for_sequence_classification/distilbert/transformer/layer_._0/sa_layer_norm/batchnorm/ReadVariableOp/resource,
> %73) /* ty=Tensor[(1, 128, 768), float32] */;
> %76 = add(%74, %75) /* ty=Tensor[(1, 128, 768), float32] */;
> %77 = reshape(%76, newshape=[128, 768]) /* ty=Tensor[(128, 768),
> float32] */;
> %78 =
> transpose(%tf_distil_bert_for_sequence_classification/distilbert/transformer/layer_._0/ffn/lin1/Tensordot/ReadVariableOp/resource,
> axes=[1, 0]) /* ty=Tensor[(3072, 768), float32] */;
> %79 = nn.dense(%77, %78, units=3072) /* ty=Tensor[(128, 3072), float32]
> */;
> %80 = reshape(%79, newshape=[1, 128, 3072]) /* ty=Tensor[(1, 128,
> 3072), float32] */;
> %81 = add(%80,
> %tf_distil_bert_for_sequence_classification/distilbert/transformer/layer_._0/ffn/lin1/BiasAdd/ReadVariableOp/resource)
> /* ty=Tensor[(1, 128, 3072), float32] */;
> %82 = divide(%81, 1.41421f /* ty=float32 */) /* ty=Tensor[(1, 128,
> 3072), float32] */;
> %83 = erf(%82) /* ty=Tensor[(1, 128, 3072), float32] */;
> %84 = multiply(0.5f /* ty=float32 */, %81) /* ty=Tensor[(1, 128, 3072),
> float32] */;
> %85 = add(1f /* ty=float32 */, %83) /* ty=Tensor[(1, 128, 3072),
> float32] */;
> %86 = multiply(%84, %85) /* ty=Tensor[(1, 128, 3072), float32] */;
> %87 = reshape(%86, newshape=[128, 3072]) /* ty=Tensor[(128, 3072),
> float32] */;
> %88 =
> transpose(%tf_distil_bert_for_sequence_classification/distilbert/transformer/layer_._0/ffn/lin2/Tensordot/ReadVariableOp/resource,
> axes=[1, 0]) /* ty=Tensor[(768, 3072), float32] */;
> %89 = nn.dense(%87, %88, units=768) /* ty=Tensor[(128, 768), float32]
> */;
> %90 = reshape(%89, newshape=[1, 128, 768]) /* ty=Tensor[(1, 128, 768),
> float32] */;
> %91 = add(%90,
> %tf_distil_bert_for_sequence_classification/distilbert/transformer/layer_._0/ffn/lin2/BiasAdd/ReadVariableOp/resource)
> /* ty=Tensor[(1, 128, 768), float32] */;
> %92 = add(%91, %76) /* ty=Tensor[(1, 128, 768), float32] */;
> %93 = mean(%92, axis=[2], keepdims=True) /* ty=Tensor[(1, 128, 1),
> float32] */;
> %94 = subtract(%92, %93) /* ty=Tensor[(1, 128, 768), float32] */;
> %95 = multiply(%94, %94) /* ty=Tensor[(1, 128, 768), float32] */;
> %96 = mean(%95, axis=[2], keepdims=True) /* ty=Tensor[(1, 128, 1),
> float32] */;
> %97 = add(%96, 1e-12f /* ty=float32 */) /* ty=Tensor[(1, 128, 1),
> float32] */;
> %98 = power(%97, -0.5f /* ty=float32 */) /* ty=Tensor[(1, 128, 1),
> float32] */;
> %99 = multiply(%98,
> %tf_distil_bert_for_sequence_classification/distilbert/transformer/layer_._0/output_layer_norm/batchnorm/mul/ReadVariableOp/resource)
> /* ty=Tensor[(1, 128, 768), float32] */;
> %100 = multiply(%93, %99) /* ty=Tensor[(1, 128, 768), float32] */;
> %101 = multiply(%92, %99) /* ty=Tensor[(1, 128, 768), float32] */;
> %102 =
> subtract(%tf_distil_bert_for_sequence_classification/distilbert/transformer/layer_._0/output_layer_norm/batchnorm/ReadVariableOp/resource,
> %100) /* ty=Tensor[(1, 128, 768), float32] */;
> add(%101, %102) /* ty=Tensor[(1, 128, 768), float32] */
> }
To read out the final output, I can use **`module.get_output(0)`**.
However, this only allows the user to read the output of the last
expression, which is:
> add(%101, %102) /* ty=Tensor[(1, 128, 768), float32] */
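One partial workaround I came across is the debug executor, which, as far as I understand, dumps every node's output tensor to disk after a run. Here is a rough sketch of how I think it is used (the dump_root path is just an example, and token_ids stands for my (1, 128) int32 input):
> from tvm.contrib.debugger import debug_executor
>
> lib = relay.build(mod, target="llvm")
> dev = tvm.cpu(0)
> # create() takes the graph JSON, the compiled runtime module, the device,
> # and a dump directory; after run(), each node's output tensor is written
> # under dump_root for offline inspection.
> m = debug_executor.create(lib.get_graph_json(), lib.get_lib(), dev,
>                           dump_root="/tmp/tvmdbg")
> m.set_input(**lib.get_params())
> m.set_input("x", token_ids)
> m.run()
However, this dumps tensors per graph node rather than letting me pick a specific Relay expression, so it does not fully answer my question.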
I am wondering: is it possible to print out any intermediate value, such
as %60 or %80?
Can the user modify the number of outputs in the Relay IR and read them out?
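Concretely, I imagine rebuilding main so that selected intermediates become extra tuple outputs, something along these lines (a rough sketch: I collect the call nodes with relay.analysis.post_order_visit, and the indices 60 and 80 are only illustrative, since the traversal order may not match the %N numbering in the printed text exactly):
> import tvm
> from tvm import relay
>
> func = mod["main"]
>
> # Collect every call expression of the converted function in post order.
> calls = []
> relay.analysis.post_order_visit(
>     func.body,
>     lambda expr: calls.append(expr) if isinstance(expr, relay.Call) else None,
> )
>
> # Rebuild main so the chosen intermediates become extra outputs.
> new_body = relay.Tuple([func.body, calls[60], calls[80]])
> new_func = relay.Function(func.params, new_body)
> mod = tvm.IRModule.from_expr(new_func)
If that works, relay.build should treat the tuple fields as separate outputs, so module.get_output(1) and module.get_output(2) would return the intermediates. Is something like this the recommended approach?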
Thanks :)