MCU-PLUS-SDK-AM243X: freeRTOS-Flash-access blocks all lower priority-tasks in a while loop

Part Number: MCU-PLUS-SDK-AM243X

Hello,

we noticed that flash access via the TI drivers busy-waits in a while loop in Flash_norOspiWaitReady (regardless of the SDK version):

    status = Flash_norOspiCmdRead(config, cmd, cmdAddr, numAddrBytes, &readStatus, 1);

    while((status != SystemP_SUCCESS) || timeOut > 0)
    {
        status = Flash_norOspiCmdRead(config, cmd, cmdAddr, numAddrBytes, &readStatus, 1);

        if((status == SystemP_SUCCESS) && ((readStatus & devDefines->NOR_SR_WIP) == 0))
        {
            break;
        }

        timeOut--;
    }

In a FreeRTOS-based environment this means that the calling task keeps running inside this loop, so all lower-priority tasks are completely blocked until the operation is finished. For example, a task with priority 5 makes no wait call or anything similar inside this function, so it loops and polls until the flash reports ready. During that time, no task below priority 5 is ever scheduled.

We could build a workaround with a dedicated flash-handler task, but I am not sure whether this behaviour is intended by the SDK. A solution inside the SDK for FreeRTOS would be appreciated.

Additionally, for now I would recommend mentioning this behaviour in your SDK documentation, since this may help other developers who run into the same problem.

regards

Felix

  • Hi Felix,

    I agree with your assessment. Those timeouts should have used a sleep instead of a busy wait.

    Unfortunately, the changes are not trivial and there is no easy workaround. I will file a JIRA ticket (MCUSDK-8416), and hopefully it will be addressed in the next release.

    Best regards,

    Ming

  • Yes, I can imagine, since the driver architecture seems to be independent of the OS.

    Well, I tried to "quick fix" it by overriding the Flash_norOspiWaitReady function with our own implementation, in which I inserted that sleep.

    I assumed this would only work by changing the SDK source code, which I don't want to do. So I looked for a linker option that links our function instead of the SDK one, and I found the --symbol_map option (https://software-dl.ti.com/codegen/docs/tiarmclang/rel1_3_0_LTS/compiler_manual/linker_description/04_linker_options/symbol-management-options.html#mapping-of-symbols-symbol-map-option)

    It states: "Symbol mapping allows a symbol reference to be resolved by a symbol with a different name. Symbol mapping allows functions to be overridden with alternate definitions. This feature can be used to patch in alternate implementations, which provide patches (bug fixes) or alternate functionality. "

    The SDK is built as a static library in our setup, so the library containing the function to override is board.am243x.r5f.ti-arm-clang.debug.lib. But the linker seems to ignore my option: the generated map file shows no overridden function, and our function does not get called on hardware. I tried setting the option at the start of the linker script as well as passing it as a normal command-line option.

    I also tried forcing the symbol with the --undef_sym option. It then appears in the map file, but no trampoline is generated, so it still does not get called.
    Edit: I searched for that linker option and found that this may be a known bug that is not yet resolved: sir.ext.ti.com/.../EXT_EP-10043

    Shall I open a new e2e-thread for this topic?

  • Hi Felix,

    As we discussed in the call, you can create a low-priority task for the flash-related operations, or temporarily reduce the calling task's priority while it performs the block erase and restore it after the block erase is done.

    As for the linker issue you reported above, please file a new e2e thread so we can track it separately.
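
    A minimal sketch of the second suggestion (temporarily lowering the calling task's priority around the blocking erase). It is written to build on a host: the FreeRTOS types and the two functions uxTaskPriorityGet / vTaskPrioritySet are replaced by stand-ins that mirror the FreeRTOS signatures, and blockingErase() is a hypothetical placeholder for the SDK erase call. On target, the stand-ins would be dropped in favour of FreeRTOS.h and task.h, and INCLUDE_vTaskPrioritySet and INCLUDE_uxTaskPriorityGet would need to be enabled in the FreeRTOS configuration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Host stand-ins so the sketch compiles off-target. On the AM243x these
 * come from FreeRTOS.h / task.h; the signatures mirror the FreeRTOS ones. */
typedef void *TaskHandle_t;
typedef unsigned long UBaseType_t;

static UBaseType_t g_currentPriority = 5U;     /* simulated priority of the caller */
static UBaseType_t g_priorityDuringErase = 0U; /* recorded for inspection only */

static UBaseType_t uxTaskPriorityGet(TaskHandle_t task)
{
    (void)task;
    return g_currentPriority;
}

static void vTaskPrioritySet(TaskHandle_t task, UBaseType_t newPriority)
{
    (void)task;
    g_currentPriority = newPriority;
}

/* Hypothetical placeholder for the blocking SDK erase call. */
static int32_t blockingErase(uint32_t blockNum)
{
    (void)blockNum;
    g_priorityDuringErase = g_currentPriority;
    return 0;
}

/* Run the blocking erase at a reduced priority, then restore the old one. */
static int32_t eraseAtLowPriority(uint32_t blockNum, UBaseType_t lowPriority)
{
    UBaseType_t saved = uxTaskPriorityGet(NULL); /* NULL = calling task */
    int32_t status;

    assert(lowPriority <= saved);                /* only ever lower the priority */

    vTaskPrioritySet(NULL, lowPriority);
    status = blockingErase(blockNum);
    vTaskPrioritySet(NULL, saved);               /* restore on success and on failure */

    return status;
}
```

    Note that this only helps tasks whose priority is above the reduced one. Tasks below the reduced priority are still starved for the duration of the erase.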

    Thanks!

    Ming

  • Hey Ming,

    yes, this would have been my last resort, but I found a "quick" fix. I managed to declare the function "weak" in the SDK and got it overridden by our own function, which is nearly the same but includes task.h and uses vTaskDelay.

    Unfortunately, the internal SDK flash API is not really documented, so I am not sure what unit the timeout variable has. The timeout comes from the NOR_BULK_ERASE_TIMEOUT field of the Flash_NorOspiDevDefines struct for our flash, and the value is 72000000. I guess that is not milliseconds. Is it related to the clock frequency of the Sitara? To the R5F? Or to another clock with a divider or multiplier?

    I wanted to use this timeout for a sleep-based implementation.

    For now I got it running by keeping the timeout as before and inserting a vTaskDelay(1) in every iteration of the while loop. This is certainly not a final fix, but it helps us right now. We may redesign our application to have a separate flash task inside our drivers, but that will take a bit longer.
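
    A minimal sketch of such a polling loop with a delay in every iteration. It is not the actual override: readStatusReg() and delayOneTick() are hypothetical host stand-ins (on target they would correspond to Flash_norOspiCmdRead and vTaskDelay(1)), the return codes and the write-in-progress mask are assumed values, and the loop here is bounded by timeOut alone.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_SUCCESS  (0)
#define SKETCH_TIMEOUT  (-1)
#define SKETCH_WIP_MASK (0x01U) /* assumed position of the write-in-progress bit */

/* Simulated device: reports "busy" for the first g_busyPolls status reads. */
static uint32_t g_busyPolls = 3U;
static uint32_t g_delayCalls = 0U;

/* Hypothetical stand-in for the status register read. */
static int32_t readStatusReg(uint8_t *status)
{
    assert(status != NULL);
    if (g_busyPolls > 0U)
    {
        g_busyPolls--;
        *status = SKETCH_WIP_MASK;
    }
    else
    {
        *status = 0U;
    }
    return SKETCH_SUCCESS;
}

/* Hypothetical stand-in; on target this would be vTaskDelay(1). */
static void delayOneTick(void)
{
    g_delayCalls++;
}

/* Poll until the device is ready, yielding the CPU between polls. */
static int32_t waitReadyYielding(uint32_t timeOut)
{
    uint8_t reg = 0U;

    while (timeOut > 0U)
    {
        if ((readStatusReg(&reg) == SKETCH_SUCCESS) &&
            ((reg & SKETCH_WIP_MASK) == 0U))
        {
            return SKETCH_SUCCESS;
        }

        delayOneTick(); /* lower-priority tasks can run while this task sleeps */
        timeOut--;
    }

    return SKETCH_TIMEOUT;
}
```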

    I will file a new e2e-ticket for the other topic.

    Thanks!

    Best regards,

    Felix

  • Hi Felix,

    I am glad the workaround works for you. I talked to the software development team yesterday. They told me that this should be handled at the application level, because the OSPI flash driver is OS-independent and this issue depends on the flash type. Of course, they will document it clearly in the release documentation of the next release.

    As for the timeout value, it is the number of times the driver tries to call Flash_norOspiCmdRead(). It is also flash-dependent. My suggestion is to use the same timeout value, but to sleep for the smallest possible unit after each Flash_norOspiCmdRead() call.
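
    Because the timeout is a poll count, adding a sleep per poll changes how long the timeout can last in wall-clock time. A small worked example, assuming a 1 kHz FreeRTOS tick (an assumption, not stated in this thread) and ignoring the duration of the status read itself: 72000000 polls with a one-tick sleep each could take up to 72000 seconds (20 hours) before the timeout expires. The helper name below is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Upper bound, in seconds, on the time spent sleeping when every poll is
 * followed by a sleep of sleepTicks ticks. The time spent in the status
 * read itself is ignored. */
static uint64_t worstCaseWaitSeconds(uint32_t pollCount,
                                     uint32_t sleepTicks,
                                     uint32_t tickRateHz)
{
    assert(tickRateHz != 0U);
    return ((uint64_t)pollCount * sleepTicks) / tickRateHz;
}
```

    With worstCaseWaitSeconds(72000000U, 1U, 1000U) the result is 72000. If a shorter bound is wanted, the poll count could be scaled down to match the chosen sleep period.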

    Best regards,

    Ming

  • OK, I see. We were thinking of an option to choose between "blocking" and "non-blocking" when constructing the flash object. A non-blocking flash driver could then use a registered callback that calls the sleep implementation of the OS in use, which would keep the driver OS-independent too. The callback could be provided by default inside the SDK, so that selecting "non-blocking" automatically connects to the OS implementation, or it could be provided by the user from outside. This could be configured via SysCfg or directly through the SDK API.

    We also had such discussions internally about standardized driver architectures, and we concluded that drivers can be designed completely independently of an OS (or no OS) by registering callbacks to the concrete implementations. The same approach may also be used for mutexes and so on.

    And yes, the flash we currently use is really slow when erasing blocks, so that surely plays a role here. But even shorter blocking periods can, depending on the application, cause severe problems.
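
    A minimal sketch of how such a registered callback could look. All names here (FlashSketch_Params, FlashSketch_DelayFxn, FlashSketch_waitReady) are hypothetical and do not exist in the SDK; the device is simulated so the sketch builds on a host. A NULL callback selects the blocking busy-wait, a non-NULL callback is invoked between polls.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical hook type: called once between two status polls. */
typedef void (*FlashSketch_DelayFxn)(void);

typedef struct
{
    FlashSketch_DelayFxn delayFxn; /* NULL = blocking busy-wait */
} FlashSketch_Params;

/* Simulated device: reports busy for a fixed number of polls. */
static uint32_t g_remainingBusyPolls;

static int deviceIsBusy(void)
{
    if (g_remainingBusyPolls > 0U)
    {
        g_remainingBusyPolls--;
        return 1;
    }
    return 0;
}

/* Example hook. A FreeRTOS application could call vTaskDelay(1) here. */
static uint32_t g_hookCalls;

static void countingDelayHook(void)
{
    g_hookCalls++;
}

/* The driver side knows nothing about the OS; it only calls the hook. */
static int32_t FlashSketch_waitReady(const FlashSketch_Params *params,
                                     uint32_t timeOut)
{
    assert(params != NULL);

    while (timeOut > 0U)
    {
        if (!deviceIsBusy())
        {
            return 0;
        }
        if (params->delayFxn != NULL)
        {
            params->delayFxn(); /* non-blocking mode: let other tasks run */
        }
        timeOut--;
    }

    return -1; /* timed out */
}
```

    The same pattern (a function pointer stored in the driver parameters) would also carry over to lock and unlock hooks for mutexes.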

    This is just a suggestion. For now it works for us like that.

  • Hi Felix,

    Thank you so much for your update and suggestion. I will certainly feed them back to our software team.

    Can you let us know which flash chips you are using? What is the block size when you do the block erase?

    Since the workaround works for you now, can I close this thread?

    Best regards,

    Ming 

  • Hey Ming, yes, I will close the topic. I will let you know our flash devices via mail.

    Best regards

    Felix