
TDA4VM: EdgeAI: Matmul with inputs of different types not working on target

Part Number: TDA4VM


Hello,

We are trying to run our model on EdgeAI 9.20.07: https://github.com/TexasInstruments/edgeai-tidl-tools/tree/09_02_07_00?tab=readme-ov-file

We have observed an issue with the Matmul layer on target when its inputs have different types (for example, TIDL_SignedShort and TIDL_UnsignedShort), even though the same scenario works in host emulation mode on PC.

The error produced on target is:

[C7x_1 ]     68.682882 s: Alg Init for Layer # -   56
[C7x_1 ]     68.683050 s: [tidsp/tidl_matmul_device.c:399] MMALIB init failed
[C7x_1 ]     68.683085 s: [tidsp/tidl_matmul_device.c:548] Alg init fail for handle 0

The example graph to test this issue is included in the attached patch for EdgeAI 9.20.07.

edgeai-tidl-tools_e2e_28Aug24.7z.txt
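For comparing host-emulation and target outputs, a small reference for the mixed-sign case can be useful. The sketch below is illustrative only: it is not how TIDL or MMALIB implements the layer, and the helper names (`check_range`, `matmul_s16_u16`) are hypothetical. It assumes the first input holds signed 16-bit values, the second holds unsigned 16-bit values, and that products are accumulated in a wider integer.

```python
# Illustrative reference only, not the TIDL/MMALIB implementation.
# Assumption: input A is signed 16-bit, input B is unsigned 16-bit.

INT16_MIN, INT16_MAX = -32768, 32767
UINT16_MAX = 65535


def check_range(matrix, lo, hi, name):
    """Raise ValueError if any element of matrix lies outside [lo, hi]."""
    for row in matrix:
        for value in row:
            if not lo <= value <= hi:
                raise ValueError(f"{name} value {value} outside [{lo}, {hi}]")


def matmul_s16_u16(a, b):
    """Multiply signed 16-bit matrix a by unsigned 16-bit matrix b.

    Python integers do not overflow, so the accumulation here models a
    wide accumulator rather than any particular hardware width.
    """
    check_range(a, INT16_MIN, INT16_MAX, "a")
    check_range(b, 0, UINT16_MAX, "b")
    inner = len(b)
    if any(len(row) != inner for row in a):
        raise ValueError("inner dimensions do not match")
    cols = len(b[0])
    return [
        [sum(a[i][k] * b[k][j] for k in range(inner)) for j in range(cols)]
        for i in range(len(a))
    ]


a = [[-32768, 100], [3, -4]]   # signed 16-bit values
b = [[65535, 1], [2, 0]]       # unsigned 16-bit values
result = matmul_s16_u16(a, b)
print(result)
```

The example values sit at the extremes of both ranges, which is where a sign-handling mismatch between two implementations would be most visible.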

  • Hi Deepanshu,

    I do apologize for the delay in providing an update - we have been looking at this internally. 

    I have reproduced a similar SVG with 9.2.9 tools. I am seeing a failure on target, but not the MMALIB init error. Can you clarify whether this error occurs with the full model, with the toy model, or with both? Based on the layer numbers in the log snippet, it seems to be the full model. For the full model, is it structured similarly, with the Softmax layer preceding the Matmul layer? Are you currently delegating the Softmax layer as discussed on the softmax ticket?

    Best,

    Asha

  • From 9/5,

    We need to use the tidl_tools provided by the customer to run into the matmul issue, as this will get past the softmax issue present in the 9.2.x tools for TDA4VM, as discussed in the softmax thread.

    Action Items:

    TI - Use the above to fully reproduce matmul issue, on TDA4VM (run on other EVM as necessary), file ticket as necessary - will update on this effort on 9/6

    Layer is part of lane centering model, this is a higher priority item for customer. 

    - Asha 

  • Hello Asha,

    I realised I had shared the default tidl_tools in the previous patch. Since a modified tidl_tools is not usually preferred by TI, it is best to use the default 9.2.7 tools.

    I have replaced the softmax with relu in the new test model. This should avoid the 'output transpose not supported issue' and hit the matmul issue on the target.

    (Attached 7z archive containing Matmul_problem_graph_1.onnx)

  • Hi Deepanshu,

    Thank you for providing this new toy model so we can reproduce the issue more easily on our end.

    I was able to see the same behavior as you on target, and I've filed a bug TIDL-4735 internally. As we discussed, this layer is on the lane centering model, so I will include this when discussing the other tickets internally.

    Best,

    Asha

  • Hi Deepanshu,

    In the meantime, can you try delegating this Matmul layer to ARM (adding the layer to the deny list) and seeing what the behavior is?

    Best,

    Asha
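As a rough illustration of the suggestion above, the sketch below builds a compile-options dictionary that places MatMul on the deny list so that it is offloaded to ARM. The key names (`tidl_tools_path`, `artifacts_folder`, `deny_list`) and the comma-separated operator format follow my reading of the edgeai-tidl-tools osrt_python examples and should be treated as assumptions to check against the README of the release in use. The paths are placeholders and `build_compile_options` is a hypothetical helper.

```python
def build_compile_options(tidl_tools_path, artifacts_folder, denied_ops):
    """Return a TIDL compile-options dict with the given operators denied.

    Assumption: the deny list is passed as a comma-separated string of
    ONNX operator type names under the "deny_list" key.
    """
    return {
        "tidl_tools_path": tidl_tools_path,
        "artifacts_folder": artifacts_folder,
        "deny_list": ", ".join(denied_ops),
    }


# Placeholder paths; replace with the real tool and artifact locations.
options = build_compile_options(
    "/path/to/tidl_tools",
    "/path/to/model-artifacts",
    ["MatMul"],
)
print(options["deny_list"])

# In the osrt_python ONNX examples, a dictionary like this is passed as the
# provider options of the TIDL compilation provider when the inference
# session is created. That step needs the TI build of onnxruntime and is
# left out so the sketch runs anywhere.
```

Denying by operator type affects every MatMul in the graph, so if only one layer is problematic, a narrower per-layer option (if the release provides one) would keep the remaining MatMul layers on the accelerator.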

  • Hello Asha,

    I am getting a segmentation fault for the full graph when I put the Matmul in the deny list.

    Thanks

  • Hi Deepanshu,

    Is this during compilation or inference? Can you attach the logs where you are seeing the fault (with debug_level=2)?

    Best,

    Asha
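Because a segmentation fault ends the process abruptly, one way to make sure the debug output is kept is to run the compile step as a child process and write its combined stdout and stderr to a file. The sketch below shows that pattern using only the Python standard library. The helper names are hypothetical, and the command in the demo is a stand-in for the real compile script (which would have debug_level=2 set in its options).

```python
import subprocess
import sys
from pathlib import Path


def run_and_capture(command, log_path):
    """Run command, write combined stdout/stderr to log_path, return exit code."""
    log_path = Path(log_path)
    with log_path.open("w") as log_file:
        completed = subprocess.run(
            command, stdout=log_file, stderr=subprocess.STDOUT
        )
    return completed.returncode


def describe_exit(returncode):
    """Summarise an exit code; on POSIX a negative value means a signal."""
    if returncode < 0:
        return f"terminated by signal {-returncode}"
    return f"exited with status {returncode}"


if __name__ == "__main__":
    # Stand-in command; replace with the actual compile script invocation.
    demo_command = [sys.executable, "-c", "print('compile step placeholder')"]
    code = run_and_capture(demo_command, "compile_log.txt")
    print(describe_exit(code))
```

On Linux, a child killed by SIGSEGV is reported with return code -11, so the summary line makes it clear whether the compile step crashed or exited with an ordinary error.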

  • Hi Asha,

    PFA the compilation logs with seg fault, as requested.

    err_log_e2e.txt

    Thanks

  • Hello Asha,

    As we have discussed this is a priority issue for the lane detection model, please help us with the current updates on this issue.

    Thanks

  • Could you please share an update on this? We are totally blocked on this!

  • Hi all,

    I do apologize for the delay - 

    I have transitioned this thread to Chris to communicate an expected fix date.

    Best,

    Asha

  • Could you please share an update on this Matmul issue?

    Regards,

    Ramesh

  • Hi Ramesh, 

    The TIDL call is using functionality that was not available in MMALIB.  The feature will be enabled in a build in early 1Q 2025.

  • Hi Chris,

    Could you kindly confirm once again the ETA for the fix for the Matmul issue with inputs of different signs?

    Regards

    Gajanan

  • Due to other priorities in the development pipeline, it is still 1Q25, or to be more specific, early 1Q25. I will update you as I hear from the dev team.

  • Thanks for the confirmation.