J721S2XSOMXEVM: Output tensor size of tiadsegNet_v2

Part Number: J721S2XSOMXEVM

Tool/software:

Hello TI experts,

I am working on custom tidl applicaition using segmentation network that is used in app_tidl_seg demo application in ti-processor-sdk-rtos-j721s2-evm-10_00_00_05.

And I am trying to parse the output tensor of the network and have found that the tensor size is different than the image size.

The input image size of the network is 768x384 and the output tensor size seems to be 768x463x1.

I thought the tensor size would be 768x384x1 according to the size showed in the diagram below, but it isn't.

https://software-dl.ti.com/jacinto7/esd/processor-sdk-rtos-jacinto7/latest/exports/docs/vision_apps/docs/user_guide/group_apps_dl_demos_app_tidl_seg.html

Why is the tensor size of dimension1 463?

Regards,

Juhyun

  • Hi Juhyun,

    Can you explain the model you are using and where you are getting the possibly incorrect tensor size?  Specifically, what layer are you looking at?  

    regards,

    Chris

  • Hi Chris,

    The model is tiadsegNet_v2, which is , AFAIK, a model used in vision_apps seg dl demo, vision_apps/apps/dl_demos/app_tidl_seg.
    And the size I'm referring to is the model's output, the final layer.

    I have no idea what the exact size of the output of the model is, but my guess is that it would be the same size as the input, since the network is a semantic segmentation.

    So I don't think the size I checked is incorrect. Rather, I'd like to know about the size of the output tensor, especially the part of the difference from the size of the input.

    Regrads,

    Juhyun

  • Hi Juhyun,

    If you could send me the model it will help.  One way to find the output size is from the output of the previous layer.  For example please see the following image:

    The output tensor will be 1x12x80x80.  You can usually ignore the batch of 1, so the tensor will be 12x80x80.  Assuming the output is an image, that will be your image size.  In practical terms, outputs for an image are usually 3x300x300, where the first number is the number of channels X width X height.

     Chris