Other Parts Discussed in Thread: TDA2
Hi, I am using group convolution and the fps is very low. What is the limit on the number of groups in a convolution?
Hi,
The maximum number of supported groups in a convolution is 1024.
Thanks,
Praveen
Hi,
Thank you for your reply.
The number of groups I use in each convolution is less than 1024, but my fps is still very low. I think the problem is in the TIDL import tool.
How should conv2dKernelType be set in tidl_import_JDetNet.txt? Should it be set to 1 only if the size of the conv is less than 32*32, or are there other conditions?
Thanks
Hi,
Sorry for the late reply. Please set conv2dKernelType to 1 if the size of the conv is up to 64*64, and check whether that improves the fps.
Thanks.
Praveen
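As a rough illustration of the rule above, the Python sketch below builds a per-layer conv2dKernelType line. It assumes that "size of conv" means the height and width of the layer's feature map, that the parameter takes one value per layer, and that 0 is the value used when the rule does not apply. None of this is confirmed in this thread. The helper names and the layer sizes are hypothetical.

```python
def conv2d_kernel_types(feature_map_sizes, limit=64):
    """Return one flag per layer: 1 if the map fits in limit x limit, else 0.

    Assumption: 0 is the alternative value; the thread only states when to use 1.
    """
    flags = []
    for height, width in feature_map_sizes:
        flags.append(1 if height <= limit and width <= limit else 0)
    return flags


def format_config_line(flags):
    """Format the flags as a single config line (assumed syntax)."""
    return "conv2dKernelType = " + " ".join(str(f) for f in flags)


# Hypothetical (height, width) feature-map sizes, one per conv layer.
sizes = [(256, 512), (128, 256), (64, 128), (64, 64), (32, 64), (16, 32)]
flags = conv2d_kernel_types(sizes)
print(format_config_line(flags))
```

The actual syntax and allowed values should be checked against the TIDL user guide before use.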
Hi,
Thank you for your reply. I have already tried setting conv2dKernelType to 1 where the size of the conv is up to 64*64, but the fps is still very low.
Are there any other restrictions?
Thanks.
Hi,
If you are running an SSD model, you may need to modify the detection output settings (for example keep_top_k and confidence_threshold) to overcome the low fps.
More details are in the thread below.
https://e2e.ti.com/support/processors/f/791/t/689617
Thanks,
Praveen
Hi,
Thank you for your reply. Yes, I am running an SSD model. I have already tried setting keep_top_k to 20 and confidence_threshold to 0.15, but the fps is still very low.
Are there any other restrictions, or does the import tool have some problems?
Thanks.
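For readers unfamiliar with these two settings, here is a minimal Python sketch of what they control in a generic SSD post-processing step: detections below confidence_threshold are dropped, and at most keep_top_k of the remainder are kept. This is a simplified illustration, not the TIDL or Caffe implementation (which also runs NMS between these steps). The function name and sample data are hypothetical.

```python
def filter_detections(detections, confidence_threshold=0.15, keep_top_k=20):
    """Keep the highest-scoring detections above the confidence threshold.

    detections: list of (score, label) tuples.
    """
    # Drop low-confidence candidates first; this shrinks the work for later stages.
    kept = [d for d in detections if d[0] > confidence_threshold]
    # Sort by score, best first, and cap the number of outputs.
    kept.sort(key=lambda d: d[0], reverse=True)
    return kept[:keep_top_k]


# Hypothetical detections: (score, label).
sample = [(0.90, "car"), (0.10, "car"), (0.50, "person"), (0.20, "bike")]
print(filter_detections(sample, confidence_threshold=0.15, keep_top_k=2))
```

A higher threshold or a smaller keep_top_k leaves fewer boxes to process, which is why these values can affect the frame rate.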
Hi,
>> Are there any other restrictions, or does the import tool have some problems?
No, there are no other problems in the import tool. Grouped convolutions take more time to process in TIDL, so this could be the reason for the low performance. Kindly try with smaller grouped convolutions.
Thanks,
Praveen
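For context, the arithmetic cost of a grouped convolution can be sketched as follows. This is the standard multiply-accumulate (MAC) count for a convolution layer, not a measurement of TIDL. The layer dimensions in the example are hypothetical.

```python
def conv_macs(out_h, out_w, in_ch, out_ch, kernel, groups=1):
    """Multiply-accumulate count of a 2D convolution with square kernels."""
    if in_ch % groups or out_ch % groups:
        raise ValueError("channel counts must be divisible by groups")
    # Each output element sums over (in_ch / groups) input channels and kernel*kernel taps.
    return out_h * out_w * out_ch * (in_ch // groups) * kernel * kernel


# Hypothetical layer: 64x64 output, 64 -> 64 channels, 3x3 kernel.
dense = conv_macs(64, 64, 64, 64, 3, groups=1)
grouped = conv_macs(64, 64, 64, 64, 3, groups=4)
print(dense, grouped, dense // grouped)
```

The MAC count falls in proportion to the number of groups, so any slowdown from grouping would have to come from how the kernels are executed rather than from the operation count.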
Hi,
Thank you for your reply.
Normally, group convolution increases the execution speed of a model, so why does using group convolution in TIDL make the fps drop?
By the way, the ssdJacintoNetV2 I trained has 3.25693e+06 total parameters and 3.6114 total Giga MACs, and after the import tool it runs at 20 fps. The SSD model I designed myself has fewer parameters (253024) and 1.4182 total Giga MACs, yet after the import tool it runs at only 15 fps. Why is it slower?
The settings follow the user guide. Are there any additional restrictions in TIDL or the import tool?
Thanks.
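The two data points above can be put side by side with a little arithmetic. The Python sketch below multiplies each model's Giga MACs per frame by its measured fps to get an effective throughput. The numbers are the ones quoted in this post. The metric is only an illustrative way to compare them.

```python
# (Giga MACs per frame, measured fps) as reported in the post above.
models = {
    "ssdJacintoNetV2": (3.6114, 20),
    "custom SSD": (1.4182, 15),
}


def effective_gmacs_per_second(gmacs_per_frame, fps):
    """Arithmetic actually completed per second at the measured frame rate."""
    return gmacs_per_frame * fps


throughput = {
    name: effective_gmacs_per_second(gmacs, fps)
    for name, (gmacs, fps) in models.items()
}
for name, value in throughput.items():
    print(f"{name}: {value:.3f} GMACs/s")
```

The reference model sustains roughly 72 GMACs per second while the custom model sustains roughly 21, so the custom model uses the hardware far less efficiently even though it needs fewer operations per frame.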
Hi,
I have shared my import config file. Please test it with the import tool and check the fps.
Thanks.
https://e2e.ti.com/cfs-file/__key/communityserver-discussions-components-files/791/import_5F00_config_5F00_file.7z
Hi,
The import config file looks fine, and conv2dKernelType is set correctly. The fps reduction is mainly due to uneven and small tensor sizes. The kernels are better optimized for tensor sizes that are multiples of 8. I think the fps you got is the final number for your model.
You can try one last thing: setting "nms_threshold: 0.4" in deploy.prototxt.
Thanks,
Praveen
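A quick way to spot tensor sizes that are not multiples of 8 is sketched below in Python. The helper names and the example shape are hypothetical. The sketch only reports alignment and the next multiple of 8; whether padding to that size would help is not established in this thread.

```python
def round_up(value, multiple=8):
    """Smallest multiple of `multiple` that is >= value."""
    return -(-value // multiple) * multiple


def alignment_report(shape, multiple=8):
    """For each dimension, return (size, is_aligned, next_aligned_size)."""
    return [(dim, dim % multiple == 0, round_up(dim, multiple)) for dim in shape]


# Hypothetical (channels, height, width) of an intermediate tensor.
for dim, aligned, padded in alignment_report((19, 38, 100)):
    print(dim, "aligned" if aligned else f"not aligned, next multiple is {padded}")
```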
Hi,
Thank you for your reply.
In my model, the situation you mentioned starts from the pool3 layer, but this design is the same as ssdJacintoNet. So why is my fps lower than ssdJacintoNet's?
What is the principle behind nms_threshold: 0.4? I have tried setting nms_threshold to 0.4, but the fps has not changed.
Thanks.
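On the question of what nms_threshold does: in general, non-maximum suppression (NMS) keeps the highest-scoring box and discards any other box whose overlap (IoU) with an already kept box exceeds the threshold. Below is a minimal Python sketch of greedy NMS. It illustrates the general technique, not TIDL's implementation, and the boxes and scores are made up.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2, y2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def nms(boxes, scores, nms_threshold=0.4):
    """Return indices of kept boxes, highest score first."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        # Keep a box only if it does not overlap too much with any kept box.
        if all(iou(boxes[i], boxes[j]) <= nms_threshold for j in keep):
            keep.append(i)
    return keep


boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)]
scores = [0.9, 0.8, 0.7]
print(nms(boxes, scores, nms_threshold=0.4))
```

A lower threshold suppresses more overlapping boxes, so fewer detections survive. If few boxes overlap to begin with, changing the threshold has little effect on the amount of work.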
Hi,
>> In my model, the situation you mentioned starts from the pool3 layer, but this design is the same as ssdJacintoNet. So why is my fps lower than ssdJacintoNet's?
Even though this situation starts from the pool3 layer, the earlier layers contain grouped conv layers, which are not well optimized on the TDA2 (as we do SIMD across numchannels), so there is performance degradation in your model.
Thanks,
Praveen
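One way to picture the remark about SIMD across channels is the toy model below. It assumes a hypothetical vector width of 8 lanes and that each group's channels are processed in full-width vectors, so a group with few channels leaves lanes idle. This is purely illustrative: the real vector width and kernel structure on the TDA2 are not stated in this thread.

```python
import math


def lane_utilization(channels_per_group, simd_width=8):
    """Fraction of SIMD lanes doing useful work in this toy model.

    simd_width is a hypothetical value chosen for illustration only.
    """
    vectors = math.ceil(channels_per_group / simd_width)
    return channels_per_group / (vectors * simd_width)


# Hypothetical layer with 16 channels, split into 1, 2, or 4 groups.
for groups in (1, 2, 4):
    per_group = 16 // groups
    print(groups, per_group, lane_utilization(per_group))
```

In this model, splitting 16 channels into 4 groups leaves 4 channels per group and halves the lane utilization, which is one plausible reading of why grouping can hurt despite lowering the MAC count.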
Hi,
Thank you for your reply.
>>In the earlier layers there are grouped conv layers which are not well optimized in the TDA2
But the maximum number of groups in any layer is only 4, which is the same as the maximum in ssdJacintoNet. Or is there a limit on the number of grouped convolutions? If so, what is it?
Thanks.