TDA4VM: Deep Learning Interface

Part Number: TDA4VM

Hello E2E Experts,

Good day.

I understand that the GPU isn't used for DL inferencing because of its higher power consumption. I also understand that the DMPAC consists of only two blocks, Dense Optical Flow (DOF) and the Stereo Disparity Engine (SDE), but I'm still wondering whether I can use them for other operations through the interfaces in the SDK. I have studied the TRM and many other links looking for ways to override the DMPAC functionality and run other functions on it, so please let me know if you have more information on this topic.

I also have some other questions that build on my research:
1- Can I use the Dense Optical Flow in the DMPAC to offload the Head Pose Estimation task from the ARM cores in my Driver Monitoring System?
2- The attached image was extracted from the link you gave; that part talks about potentially handling multiple models. Does that mean that several models can be deployed on the MMA in parallel, provided their combined TOPS and memory requirements fit within the 8 TOPS and 4 GB of DDR memory?
3- I also saw that TIDL can automatically analyze and optimize my DL model layers and determine which layers can be quantized with minimal loss of accuracy. How efficient is this analysis and optimization process? Would it be more efficient if done manually?

Regards,

TICSC

  • Hi,

    The GPU on the TDA4 devices is not available for inference, but it is planned to be available in the next generation of devices.   

    1- Can I use the Dense Optical Flow in the DMPAC to offload the Head Pose Estimation task from my ARM cores in my Driver Monitoring System?

    I do not understand this question. Offload to where? To a DSP core?

    2- The image attached was extracted from the link you gave, in this part it was talking about potentially handling multiple models. Does that mean that it's possible that some models can be deployed on the MMA in parallel if their requirements in TOPS and memory can be handled simultaneously by the 8 TOPS & 4 GB DDR memory?

    Models can be run in parallel on the DSP(s); this is automatic. We do not support manually running anything on the MMA, because TIDL owns the MMA so that it can handle multi-threading in model execution correctly.

    3- I also saw that TIDL can automatically optimize and analyze my DL model layers and see which layer can be quantized with minimal loss of accuracy. How efficient is this analysis and optimization process? Would it be more efficient if it's done manually?

    The efficiency improvement depends on the model, so the question cannot be answered in general. Perhaps you can make the model more efficient manually, but you should do this before you compile it. You will still need to run it through the import/compile steps to partition the model across the correct processors.
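
To make the idea of a per-layer quantization analysis more concrete, here is a minimal sketch in plain Python. It is a generic illustration, not TIDL's actual algorithm or API: the symmetric quantization scheme, the function names (`quantize_dequantize`, `relative_error`, `choose_bits`), the candidate bit widths, the tolerance, and the toy weights are all assumptions made for this example.

```python
def quantize_dequantize(weights, num_bits):
    """Round-trip weights through symmetric fixed-point quantization."""
    qmax = 2 ** (num_bits - 1) - 1          # 127 for 8 bits, 32767 for 16 bits
    max_abs = max(abs(w) for w in weights)
    if max_abs == 0.0:
        return list(weights)                # nothing to quantize
    scale = max_abs / qmax
    # Quantize to an integer, clamp to the representable range, scale back.
    return [max(-qmax, min(qmax, round(w / scale))) * scale for w in weights]


def relative_error(weights, num_bits=8):
    """Relative RMS error introduced by quantizing one layer's weights."""
    restored = quantize_dequantize(weights, num_bits)
    err = sum((w - r) ** 2 for w, r in zip(weights, restored))
    ref = sum(w ** 2 for w in weights)
    return (err / ref) ** 0.5 if ref > 0 else 0.0


def choose_bits(weights, tolerance, candidates=(8, 16)):
    """Smallest candidate bit width whose error is within tolerance, else None."""
    for bits in candidates:
        if relative_error(weights, bits) <= tolerance:
            return bits
    return None                             # would have to stay in floating point


# Hypothetical toy layers; real layers have thousands of weights.
layers = {
    "a": [-1.0, 0.0, 1.0],                  # lands exactly on the 8-bit grid
    "b": [0.3, -0.7, 0.11, 1.0],            # needs a finer grid for this tolerance
}
plan = {name: choose_bits(w, tolerance=1e-4) for name, w in layers.items()}
print(plan)
```

A weight-only error metric like this one is the simplest possible criterion. A manual analysis would more likely compare layer outputs on calibration data, which is one reason the benefit is model dependent.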

    Chris 

  • Hello Chris,

    Good day.

    Regarding item number one, I was aiming to offload the task to the DMPAC, not the DSP. I have read about how to offload tasks to the C66x DSPs, but I was thinking of offloading the head pose estimation algorithm to the DMPAC, since head pose estimation itself depends on motion processing using optical flow.

    Is this feasible?
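
To illustrate the dependency on optical flow described above, here is a rough sketch of how a dense flow field could be reduced to a coarse head rotation cue. It says nothing about whether the DMPAC can be used this way. The flow field layout (rows of `(dx, dy)` tuples), the face ROI, the rigid-sphere head model, and the names `mean_flow_in_roi`, `displacement_to_angle`, and `head_radius_px` are all assumptions made for this example, and the input format is not the DOF hardware's output format.

```python
import math


def mean_flow_in_roi(flow, roi):
    """Average (dx, dy) in pixels over a rectangular region of a dense flow field.

    flow: list of rows, each a list of (dx, dy) tuples.
    roi:  (x0, y0, x1, y1), x1 and y1 exclusive; assumed to bound the face.
    """
    x0, y0, x1, y1 = roi
    vectors = [flow[y][x] for y in range(y0, y1) for x in range(x0, x1)]
    if not vectors:
        raise ValueError("empty ROI")
    n = len(vectors)
    return (sum(v[0] for v in vectors) / n, sum(v[1] for v in vectors) / n)


def displacement_to_angle(displacement_px, head_radius_px):
    """Approximate rotation in degrees for a point at the front of a sphere.

    Assumes the head is a rigid sphere of apparent radius head_radius_px, so
    a frontal surface point shifts by about radius * sin(angle).
    """
    if head_radius_px <= 0:
        raise ValueError("head_radius_px must be positive")
    ratio = max(-1.0, min(1.0, displacement_px / head_radius_px))
    return math.degrees(math.asin(ratio))


# Hypothetical 4x4 flow field: the 2x2 face region moved 2 px to the right.
flow = [[(0.0, 0.0)] * 4 for _ in range(4)]
for y in range(1, 3):
    for x in range(1, 3):
        flow[y][x] = (2.0, 0.0)

mean = mean_flow_in_roi(flow, roi=(1, 1, 3, 3))
yaw = displacement_to_angle(mean[0], head_radius_px=4.0)
pitch = displacement_to_angle(mean[1], head_radius_px=4.0)
print(mean, yaw, pitch)
```

In this split, only the flow computation is a candidate for a flow accelerator. The ROI averaging and the angle conversion are lightweight post-processing that would still run on a CPU or DSP core.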

    Regards.

    TICSC