
[TDA4] Convolution layer taking forever

Hi,

I have run the appended network in an adapted TIDL demo, but execution hangs after tivxKernelTIDLDumpToFile() was called for layer 20, so I assume it hangs in one of the following layers:

  • layer 20 (inception2_t0_0_relu)
  • layer 23 (inception2_t0_1_relu)
  • layer 21 (inception2_t1_0_relu).

Question: Is tivxKernelTIDLDumpToFile() called before or after the corresponding layer execution?

In host emulation, everything is fine. Other networks also run.

To make it run, I needed to increase the scratch memory (and reduce the heap) by editing psdk_rtos_auto_j7_06_01_00_15/vision_apps/apps/basic_demos/app_tirtos/common/app_cfg_c7x_1.h:

#define DDR_HEAP_MEM_SIZE     ((70)*0x100000u)  // earlier 80 MB
#define DDR_SCRATCH_SIZE      ((46)*0x100000u)  // earlier 16 MB
Questions: Can I fix this somehow? Will this be fixed in the next TIDL version?
Regards
Dominic Fellner
[Network]
layer {
  name: "input_1"
  type: "Input"
  top: "input_1"
  input_param {
    shape {
      dim: 1
      dim: 3
      dim: 512
      dim: 1024
    }
  }
}
layer {
  name: "conv0_0"
  type: "Convolution"
  bottom: "input_1"
  top: "conv0_0"
  convolution_param {
    num_output: 16
    bias_term: true
    pad: 3
    kernel_size: 7
    stride: 1
  }
}
layer {
  name: "conv0_0_relu"
  type: "ReLU"
  bottom: "conv0_0"
  top: "conv0_0_relu"
}
layer {
  name: "conv0_2"
  type: "Convolution"
  bottom: "conv0_0_relu"
  top: "conv0_2"
  convolution_param {
    num_output: 16
    bias_term: true
    pad: 2
    kernel_size: 5
    stride: 1
  }
}
layer {
  name: "conv0_2_relu"
  type: "ReLU"
  bottom: "conv0_2"
  top: "conv0_2_relu"
}
layer {
  name: "conv0_4"
  type: "Convolution"
  bottom: "conv0_2_relu"
  top: "conv0_4"
  convolution_param {
    num_output: 32
    bias_term: true
    pad: 1
    kernel_size: 3
    stride: 2
  }
}
layer {
  name: "conv0_4_relu"
  type: "ReLU"
  bottom: "conv0_4"
  top: "conv0_4_relu"
}
layer {
  name: "conv1_0"
  type: "Convolution"
  bottom: "conv0_4_relu"
  top: "conv1_0"
  convolution_param {
    num_output: 32
    bias_term: true
    pad: 2
    kernel_size: 5
    stride: 1
  }
}
layer {
  name: "conv1_0_relu"
  type: "ReLU"
  bottom: "conv1_0"
  top: "conv1_0_relu"
}
layer {
  name: "conv1_2"
  type: "Convolution"
  bottom: "conv1_0_relu"
  top: "conv1_2"
  convolution_param {
    num_output: 32
    bias_term: true
    pad: 1
    kernel_size: 3
    stride: 1
  }
}
layer {
  name: "conv1_2_relu"
  type: "ReLU"
  bottom: "conv1_2"
  top: "conv1_2_relu"
}
layer {
  name: "conv1_3"
  type: "Convolution"
  bottom: "conv1_2_relu"
  top: "conv1_3"
  convolution_param {
    num_output: 32
    bias_term: true
    pad: 1
    kernel_size: 3
    stride: 1
  }
}
layer {
  name: "conv1_3_relu"
  type: "ReLU"
  bottom: "conv1_3"
  top: "conv1_3_relu"
}
layer {
  name: "conv1_4"
  type: "Convolution"
  bottom: "conv1_3_relu"
  top: "conv1_4"
  convolution_param {
    num_output: 64
    bias_term: true
    pad: 1
    kernel_size: 3
    stride: 1
  }
}
layer {
  name: "conv1_4_relu"
  type: "ReLU"
  bottom: "conv1_4"
  top: "conv1_4_relu"
}
layer {
  name: "inception1_t0_0"
  type: "Convolution"
  bottom: "conv1_4_relu"
  top: "inception1_t0_0"
  convolution_param {
    num_output: 48
    bias_term: true
    kernel_size: 1
    stride: 1
  }
}
layer {
  name: "inception1_t1_0"
  type: "Convolution"
  bottom: "conv1_4_relu"
  top: "inception1_t1_0"
  convolution_param {
    num_output: 48
    bias_term: true
    kernel_size: 1
    stride: 1
  }
}
layer {
  name: "inception1_t0_0_relu"
  type: "ReLU"
  bottom: "inception1_t0_0"
  top: "inception1_t0_0_relu"
}
layer {
  name: "inception1_t1_0_relu"
  type: "ReLU"
  bottom: "inception1_t1_0"
  top: "inception1_t1_0_relu"
}
layer {
  name: "inception1_t3_0"
  type: "Pooling"
  bottom: "conv1_4_relu"
  top: "inception1_t3_0"
  pooling_param {
    pool: MAX
    kernel_size: 2
    stride: 2
  }
}
layer {
  name: "inception1_t0_1"
  type: "Convolution"
  bottom: "inception1_t0_0_relu"
  top: "inception1_t0_1"
  convolution_param {
    num_output: 48
    bias_term: true
    pad: 1
    kernel_size: 3
    stride: 2
  }
}
layer {
  name: "inception1_t1_1"
  type: "Convolution"
  bottom: "inception1_t1_0_relu"
  top: "inception1_t1_1"
  convolution_param {
    num_output: 48
    bias_term: true
    pad: 2
    kernel_size: 5
    stride: 2
  }
}
layer {
  name: "inception1_t3_1"
  type: "Convolution"
  bottom: "inception1_t3_0"
  top: "inception1_t3_1"
  convolution_param {
    num_output: 48
    bias_term: true
    kernel_size: 1
    stride: 1
  }
}
layer {
  name: "inception1_t0_1_relu"
  type: "ReLU"
  bottom: "inception1_t0_1"
  top: "inception1_t0_1_relu"
}
layer {
  name: "inception1_t1_1_relu"
  type: "ReLU"
  bottom: "inception1_t1_1"
  top: "inception1_t1_1_relu"
}
layer {
  name: "inception1_t3_1_relu"
  type: "ReLU"
  bottom: "inception1_t3_1"
  top: "inception1_t3_1_relu"
}
layer {
  name: "inception1concat"
  type: "Concat"
  bottom: "inception1_t0_1_relu"
  bottom: "inception1_t1_1_relu"
  bottom: "inception1_t3_1_relu"
  top: "inception1concat"
}
layer {
  name: "inception1_dim_red"
  type: "Convolution"
  bottom: "inception1concat"
  top: "inception1_dim_red"
  convolution_param {
    num_output: 64
    bias_term: true
    kernel_size: 1
    stride: 1
  }
}
layer {
  name: "inception1_dim_red_relu"
  type: "ReLU"
  bottom: "inception1_dim_red"
  top: "inception1_dim_red_relu"
}
layer {
  name: "conv2_0"
  type: "Convolution"
  bottom: "inception1_dim_red_relu"
  top: "conv2_0"
  convolution_param {
    num_output: 64
    bias_term: true
    pad: 1
    kernel_size: 3
    stride: 1
  }
}
layer {
  name: "conv2_0_relu"
  type: "ReLU"
  bottom: "conv2_0"
  top: "conv2_0_relu"
}
layer {
  name: "conv2_1"
  type: "Convolution"
  bottom: "conv2_0_relu"
  top: "conv2_1"
  convolution_param {
    num_output: 64
    bias_term: true
    pad: 1
    kernel_size: 3
    stride: 1
  }
}
layer {
  name: "conv2_1_relu"
  type: "ReLU"
  bottom: "conv2_1"
  top: "conv2_1_relu"
}
layer {
  name: "conv2_3"
  type: "Convolution"
  bottom: "conv2_1_relu"
  top: "conv2_3"
  convolution_param {
    num_output: 96
    bias_term: true
    pad: 1
    kernel_size: 3
    stride: 1
  }
}
layer {
  name: "conv2_3_relu"
  type: "ReLU"
  bottom: "conv2_3"
  top: "conv2_3_relu"
}
layer {
  name: "inception2_t0_0"
  type: "Convolution"
  bottom: "conv2_3_relu"
  top: "inception2_t0_0"
  convolution_param {
    num_output: 72
    bias_term: true
    kernel_size: 1
    stride: 1
  }
}
layer {
  name: "inception2_t1_0"
  type: "Convolution"
  bottom: "conv2_3_relu"
  top: "inception2_t1_0"
  convolution_param {
    num_output: 72
    bias_term: true
    kernel_size: 1
    stride: 1
  }
}
layer {
  name: "inception2_t0_0_relu"
  type: "ReLU"
  bottom: "inception2_t0_0"
  top: "inception2_t0_0_relu"
}
layer {
  name: "inception2_t1_0_relu"
  type: "ReLU"
  bottom: "inception2_t1_0"
  top: "inception2_t1_0_relu"
}
layer {
  name: "inception2_t3_0"
  type: "Pooling"
  bottom: "conv2_3_relu"
  top: "inception2_t3_0"
  pooling_param {
    pool: MAX
    kernel_size: 2
    stride: 2
  }
}
layer {
  name: "inception2_t0_1"
  type: "Convolution"
  bottom: "inception2_t0_0_relu"
  top: "inception2_t0_1"
  convolution_param {
    num_output: 72
    bias_term: true
    pad: 1
    kernel_size: 3
    stride: 2
  }
}
layer {
  name: "inception2_t1_1"
  type: "Convolution"
  bottom: "inception2_t1_0_relu"
  top: "inception2_t1_1"
  convolution_param {
    num_output: 72
    bias_term: true
    pad: 2
    kernel_size: 5
    stride: 2
  }
}
layer {
  name: "inception2_t3_1"
  type: "Convolution"
  bottom: "inception2_t3_0"
  top: "inception2_t3_1"
  convolution_param {
    num_output: 72
    bias_term: true
    kernel_size: 1
    stride: 1
  }
}
layer {
  name: "inception2_t0_1_relu"
  type: "ReLU"
  bottom: "inception2_t0_1"
  top: "inception2_t0_1_relu"
}
layer {
  name: "inception2_t1_1_relu"
  type: "ReLU"
  bottom: "inception2_t1_1"
  top: "inception2_t1_1_relu"
}
layer {
  name: "inception2_t3_1_relu"
  type: "ReLU"
  bottom: "inception2_t3_1"
  top: "inception2_t3_1_relu"
}
layer {
  name: "inception2concat"
  type: "Concat"
  bottom: "inception2_t0_1_relu"
  bottom: "inception2_t1_1_relu"
  bottom: "inception2_t3_1_relu"
  top: "inception2concat"
}
  • Hi Dominic.

       Can you disable the trace dump and try?

    Regards,

    Anshu

  • Hi Anshu,

    I only enabled the trace dump after I saw that the execution never returned.

    Regards

    Dom

  • Hi Dom,

    The extra memory requirement mainly comes from enabling layer-level traces. If you just want to see where the code is getting stuck, you can set TIDL_CreateParams->traceLogLevel = 1 (and traceWriteLevel = 0) and try again. This will only print the logs without dumping the trace of each layer.
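
    For example, a minimal sketch of those two settings, assuming the demo fills a TIDL_CreateParams structure (createParams is a hypothetical pointer name here) before creating the TIDL node:

    /* Console-only layer progress logs, no per-layer trace dumps.
     * Field names follow the advice above; verify them against your TIDL headers. */
    createParams->traceLogLevel   = 1;  /* print which layer is being executed */
    createParams->traceWriteLevel = 0;  /* do not write per-layer traces to file */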


    Regards,
    Anshu

  • Hi Anshu,

    I needed to increase the scratch memory because my model is rather big, but since the memory allocation succeeds, I assume that is not the problem.

    I tried setting the parameters as you described, but that did not change anything. The network is still stuck, this time without any trace of where it is stuck.

    I read in this forum that other people have observed similar behavior for their networks.

    Can you confirm that this also happens on your side, and will this be fixed in the next TIDL version?

    Regards

    Dom

  • Hi Dom,

     When you set TIDL_CreateParams->traceLogLevel = 1 (and traceWriteLevel = 0), it should print the layer-level debug traces to the console, provided you have set TIDL_CreateParams->TIDLVprintf correctly. This can help figure out for which layer the crash is happening.
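
     If it helps, here is a minimal sketch of a console print hook, assuming TIDLVprintf follows a vprintf-style signature and using the same hypothetical createParams as in the earlier sketch (the exact typedef and the helper name appTidlVprintf are assumptions; please check them against the TIDL headers in your SDK):

     #include <stdarg.h>
     #include <stdint.h>
     #include <stdio.h>

     /* Assumed hook that forwards TIDL debug messages to the console. */
     static int32_t appTidlVprintf(const char *format, va_list args)
     {
         return vprintf(format, args);
     }

     /* ... later, when filling the create-time parameters ... */
     createParams->TIDLVprintf = appTidlVprintf;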

     You can refer to the networks we have validated on TIDL in "TIDL User Guide" -> "Data Sheet" -> "Pre-trained CNN Models for TIDL". If your network matches or is similar to one of these, we expect it to work. For others there may be issues, but that depends entirely on the network and its properties. It is recommended to verify the network with standalone TIDL on PC first, then standalone on the EVM, and then with the SDK demo app.

     The next TIDL release is expected by the end of January.