Part Number: DRA72XEVM
Dear TI experts,
I am trying to run TIDL inference using the app_tidl example code, with some modifications to handle N-channel inputs.
I observed that when I give an input tensor of size [1,400,248,296] (approx. 29.8 MB including input padding), the inference results do NOT match my PyTorch results.
However, with an input tensor of size [1,256,248,296] (approx. 19 MB including input padding), the inference results do match my PyTorch results.
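For reference, a minimal sketch of how the quoted sizes line up with the raw tensor dimensions, assuming 8-bit (1 byte per element) quantized tensors as commonly used by TIDL; the small remaining gap to the quoted figures would be the input padding:

```python
# Raw (unpadded) tensor size in MB, assuming 1 byte per element
# (8-bit quantized data, as commonly used by TIDL).
def tensor_mb(shape, bytes_per_elem=1):
    n = 1
    for d in shape:
        n *= d
    return n * bytes_per_elem / 1e6

print(tensor_mb([1, 400, 248, 296]))  # ~29.36 MB raw, vs. ~29.8 MB with padding
print(tensor_mb([1, 256, 248, 296]))  # ~18.79 MB raw, vs. ~19 MB with padding
```

So the failing case is roughly 10 MB larger than the working case once padding is included.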
What I observed during calibration is that calibration runs fine for both input tensor sizes, and it gives me the correct min and max output tensor ranges, matching PyTorch.
However, when I increase the number of channels to 400, calibration emits the following debug log:
------------------ Network Compiler Traces -----------------------------
Main iteration numer: 0....
Life time alive buffers are with ID: 10030 ( 0), 40001 ( 2), 27 ( 2), 60027 ( 1), 40002 ( 2), 40002 ( 2), 20030 ( 0), 40002 ( 2),
Preparing for memory allocation : internal iteration number: 0
Info: Couldn't perform operation 3 with data ID 3 in memSpace 1
Info: Memory Fragmentation issue, need to push to DDR
Main iteration numer: 1....
Could this be the cause of the incorrect inference results later on? Is there a limitation on the input tensor memory that can be passed to the network for inference?
Can I make this work by increasing some memory allocation, or is it a hard limitation?
Thanks in advance for your feedback!
Best Regards
Adit