Hello!

I am comparing the performance of direct conv2d and Winograd conv2d using TOPI.
In my experiments, however, conv2d with the Winograd algorithm is much slower 
than the direct version.
This is the code I used:

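    import numpy as np
    import tvm
    import topi

    ## Setup (assumed; target and ctx were not defined in the posted snippet)
    target = 'cuda'
    ctx = tvm.gpu(0)

    ## data shape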
    data_shape = (1,3,224,224)
    w_shape = (64,3,3,3)

    ## Data
    sample_data = np.random.uniform(-1,1, size=data_shape ).astype("float32")
    sample_p1 = np.random.uniform(-1,1, size=w_shape ).astype("float32")

    ## placeholder
    input_data = tvm.te.placeholder( shape = data_shape, dtype = "float32", name="Input" )
    p1 = tvm.te.placeholder( shape = w_shape, dtype="float32", name="p1" )

    ## Winograd conv2d
    with tvm.target.create('cuda'):
        conv = topi.cuda.conv2d_nchw_winograd(input_data
                                              ,p1 
                                              ,(1,1)
                                              ,(0,0)
                                              ,(1,1)
                                              ,"float32"  )
        sch = topi.cuda.schedule_conv2d_nchw_winograd([conv])
        winoMod = tvm.build( sch, [ input_data,p1,conv] , target, name='wino')

    ## Direct conv2d
    with tvm.target.create('cuda'):
        conv = topi.cuda.conv2d_nchw( input_data
                                        ,p1 
                                        ,[1,1]
                                        ,[0,0]
                                        ,[1,1] )
        sch = topi.cuda.schedule_conv2d_nchw([conv])
        simpleMod = tvm.build(sch, [input_data,p1,conv], target, name='direct' )


    ## Real data
    tvm_input = tvm.nd.array( sample_data , ctx )
    tvm_p1 = tvm.nd.array( sample_p1, ctx )
    ## Output buffer: 224 - 3 + 1 = 222 (stride 1, no padding), 64 output channels
    tvm_out = tvm.nd.array( np.zeros((1,64,222,222), dtype="float32"), ctx )

    ## Performance Testing
    ev_wino = winoMod.time_evaluator(winoMod.entry_name, ctx, number=1,repeat=100 )
    ev_conv = simpleMod.time_evaluator(simpleMod.entry_name, ctx, number=1,repeat=100 )

    ## Both modules are built with (input, weight, output), so pass all three
    timer = ev_conv( tvm_input, tvm_p1, tvm_out).mean*1e3
    print("Conv with Direct algo -> ",timer)
    timer = ev_wino( tvm_input, tvm_p1, tvm_out).mean*1e3
    print("Conv with Winograd Strassen algo -> ",timer )

The results (mean time in milliseconds) are as follows.

    Conv with Direct algo ->  0.11522044
    Conv with Winograd Strassen algo ->  4.70840109

Winograd is roughly 40 times slower than direct here.
According to the [Fast Algorithms for Convolutional Neural Networks 
paper](https://arxiv.org/abs/1509.09308), I expected Winograd to be at least as 
fast as direct conv2d.
Is there something I misunderstood?
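
As a rough sanity check on that expectation, the sketch below counts only the element-wise multiplications for this layer. It assumes the F(2x2, 3x3) tiling from the paper (16 multiplications per 2x2 output tile, against 36 for direct). The tile size TOPI actually uses may differ, the helper names are made up for illustration, and the count ignores the input, filter, and output transforms, so it is an upper bound on the theoretical gain and not a prediction of measured time.

```python
import math

def direct_mults(n, c_in, c_out, out_h, out_w, r=3):
    # One multiplication per (output pixel, input channel, filter tap).
    return n * c_out * c_in * out_h * out_w * r * r

def winograd_mults(n, c_in, c_out, out_h, out_w, m=2, r=3):
    # F(m x m, r x r): each m x m output tile costs (m + r - 1)^2
    # element-wise multiplications per (input channel, output channel) pair.
    tiles = math.ceil(out_h / m) * math.ceil(out_w / m)
    alpha = m + r - 1
    return n * c_out * c_in * tiles * alpha * alpha

# Shapes from the post: input (1,3,224,224), weights (64,3,3,3),
# stride 1, no padding -> output is 222 x 222.
out_h = out_w = 224 - 3 + 1
direct = direct_mults(1, 3, 64, out_h, out_w)
wino = winograd_mults(1, 3, 64, out_h, out_w, m=2)
ratio = direct / wino

print("direct multiplications:  ", direct)
print("winograd multiplications:", wino)
print("reduction factor:        ", ratio)
```

Under these assumptions the multiplication count drops by a factor of 2.25, which is why a 40x slowdown looks surprising.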




