TMS570LC4357: Branch Prediction Behaviour - Cortex-R5

Part Number: TMS570LC4357

Hello!

We are trying to understand exactly how branch prediction (BP) works on the TMS570LC4357 (Cortex-R5). Our goal is for the BP behaviour to be deterministic for the worst-case execution time analysis that we have to perform.

We are not planning to use normal operation, as it seems difficult to demonstrate that the dynamic configuration is deterministic.

Our code contains many loops with a constant number of iterations. For that reason, we expected the "always taken" configuration to be quicker than the "always not taken" configuration. Unfortunately, measuring a code snippet shows almost no difference between the two configurations. Does anyone understand why?

normal operation -> 239.65 us

always taken     -> 340.65 us

always not taken -> 339.95 us

Best regards

Mathieu

  • Hi Mathieu,

    Per ARM Cortex-R5 TRM, to disable the program flow prediction, you must disable the return stack and set the branch prediction policy to always not-taken. Is the return stack disabled in your test?

    You can disable the return stack by setting RSDIS in the ACTLR (Auxiliary Control Register).
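
    A minimal sketch of that setting, assuming GCC-style inline assembly and privileged mode. The field positions (BP in bits [16:15], RSDIS in bit 17) are the ones referred to in this thread; the helper names are hypothetical. The bit arithmetic is kept in a pure function so it can be checked on a host, and the CP15 access is only compiled for ARM targets.

    ```c
    #include <stdint.h>

    /* ACTLR fields as referenced in this thread (Cortex-R5). */
    #define ACTLR_BP_SHIFT      15u
    #define ACTLR_BP_MASK       (0x3u << ACTLR_BP_SHIFT)  /* bits [16:15] */
    #define ACTLR_BP_NOT_TAKEN  (0x2u << ACTLR_BP_SHIFT)  /* b10 = always not taken */
    #define ACTLR_RSDIS         (1u << 17)                /* 1 = return stack disabled */

    /* Hypothetical helper: given the current ACTLR value, return the value
     * with BP = always not taken and the return stack disabled.
     * All other bits are left unchanged. */
    static uint32_t actlr_disable_flow_prediction(uint32_t actlr)
    {
        actlr &= ~ACTLR_BP_MASK;      /* clear the old BP policy */
        actlr |= ACTLR_BP_NOT_TAKEN;  /* select always not taken */
        actlr |= ACTLR_RSDIS;         /* disable the return stack */
        return actlr;
    }

    #if defined(__GNUC__) && defined(__arm__)
    /* Read-modify-write of the Auxiliary Control Register (CP15 c1, c0, 1).
     * Must run in a privileged mode. */
    static inline void disable_flow_prediction(void)
    {
        uint32_t actlr;
        __asm__ volatile ("mrc p15, 0, %0, c1, c0, 1" : "=r"(actlr));
        actlr = actlr_disable_flow_prediction(actlr);
        __asm__ volatile ("mcr p15, 0, %0, c1, c0, 1" : : "r"(actlr) : "memory");
        __asm__ volatile ("isb" : : : "memory");  /* make the change visible to following instructions */
    }
    #endif
    ```

    Other toolchains (for example the TI ARM compiler) use a different inline assembly syntax, so the register access part would need adapting there.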

  • Hi Wang,

    A colleague repeated the measurements:

    Scenario | BP (bits 16:15)       | RSDIS (bit 17)                    | Time for the first call of the function (us)
    1        | 01 = always taken     | 1 = Return stack disabled         | 239.40
    2        | 01 = always taken     | 0 = Normal return stack operation | 209.95
    3        | 10 = always not taken | 1 = Return stack disabled         | 427.10
    4        | 10 = always not taken | 0 = Normal return stack operation | 257.60
    5        | 00 = Normal operation | 1 = Return stack disabled         | 194.95
    6        | 00 = Normal operation | 0 = Normal return stack operation | 168.20

    We observed a bigger difference between the scenarios when the first call of the function is measured.

    What exactly is the behaviour of the BP if it is set to always taken and the return stack is not disabled? Is it still dynamic, but only ever predicting taken?

    Regards

    Mathieu

  • Please refer to section 5.2 of Cortex-R5 TRM: DDI0460D

    https://developer.arm.com/documentation/ddi0460/d/Prefetch-Unit

  • Thanks for the hint! I had missed chapter 5.4

     

    I would like to use the following configuration:

    BP = b01 = Branch always taken and history table updates disabled.

    RSDIS = 0 = Normal return stack operation. This is the reset value.

    FRCDIS = 0 = Normal fetch rate control operation. This is the reset value.

    DBHE = 0 = Enable the extension. This is the reset value.

    DEOLP = 0 = Enable loop prediction. This is the reset value.

     

    Is the program flow prediction of the BP deterministic with this configuration?
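
    For reference, the configuration listed above could be encoded roughly as follows. This is only a sketch: the type and function names are hypothetical, and only the two fields whose bit positions appear in this thread (BP in bits [16:15], RSDIS in bit 17) are touched. FRCDIS, DBHE and DEOLP are deliberately left unmodified, on the assumption that they still hold their reset values.

    ```c
    #include <stdbool.h>
    #include <stdint.h>

    /* BP policy encodings as used in this thread. */
    typedef enum {
        BP_NORMAL           = 0x0,  /* b00: normal (dynamic) operation */
        BP_ALWAYS_TAKEN     = 0x1,  /* b01: always taken, history table updates disabled */
        BP_ALWAYS_NOT_TAKEN = 0x2   /* b10: always not taken */
    } bp_policy_t;

    #define ACTLR_BP_SHIFT   15u
    #define ACTLR_BP_MASK    (0x3u << ACTLR_BP_SHIFT)  /* bits [16:15] */
    #define ACTLR_RSDIS_BIT  (1u << 17)

    /* Hypothetical helper: return `actlr` with the BP and RSDIS fields
     * replaced and every other bit preserved. */
    static uint32_t actlr_apply(uint32_t actlr, bp_policy_t bp, bool rsdis)
    {
        actlr &= ~(ACTLR_BP_MASK | ACTLR_RSDIS_BIT);
        actlr |= ((uint32_t)bp << ACTLR_BP_SHIFT) & ACTLR_BP_MASK;
        if (rsdis) {
            actlr |= ACTLR_RSDIS_BIT;
        }
        return actlr;
    }

    /* Decoders, useful for checking a value read back from the register. */
    static bp_policy_t actlr_get_bp(uint32_t actlr)
    {
        return (bp_policy_t)((actlr & ACTLR_BP_MASK) >> ACTLR_BP_SHIFT);
    }

    static bool actlr_get_rsdis(uint32_t actlr)
    {
        return (actlr & ACTLR_RSDIS_BIT) != 0u;
    }
    ```

    Calling actlr_apply(current, BP_ALWAYS_TAKEN, false) on the value read from the ACTLR gives the value to write back; reading the register again and decoding it with the two getters confirms that the setting took effect.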

  • Hi Mathieu,

    Without branch prediction, the CPU has to wait until a conditional branch instruction has passed the execute stage before the instruction that follows it can enter the fetch stage of the pipeline. The branch predictor tries to avoid this wasted time by guessing whether the conditional branch is more likely to be taken or not taken.

    If we force the prediction to always taken, the instructions at the branch target are fetched and executed speculatively. If the prediction later turns out to be wrong, the speculatively executed instructions are discarded and the pipeline restarts from the correct path, which incurs a delay. I am not sure whether a fixed prediction policy is more deterministic or not.
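
    To illustrate the reasoning for loops with a constant iteration count, here is a deliberately simplified cost model. It assumes the loop is closed by a single conditional backward branch that is taken on every iteration except the last, and that each misprediction costs a fixed number of cycles. The function names and the penalty parameter are hypothetical; the real penalty and the effects of the return stack, fetch behaviour and memory are not modelled, so this does not reproduce the timings measured above.

    ```c
    #include <stdint.h>

    typedef enum { PREDICT_TAKEN, PREDICT_NOT_TAKEN } static_policy_t;

    /* Mispredictions for one complete run of a loop whose closing
     * conditional branch executes `iterations` times:
     * taken (iterations - 1) times, not taken once at loop exit. */
    static uint32_t loop_mispredictions(uint32_t iterations, static_policy_t policy)
    {
        if (iterations == 0u) {
            return 0u;  /* branch never executed */
        }
        uint32_t taken = iterations - 1u;
        uint32_t not_taken = 1u;
        /* A static policy is wrong whenever the outcome is the opposite one. */
        return (policy == PREDICT_TAKEN) ? not_taken : taken;
    }

    /* Total penalty in cycles, with `penalty` an assumed fixed cost
     * per misprediction. */
    static uint64_t loop_penalty_cycles(uint32_t iterations,
                                        static_policy_t policy,
                                        uint32_t penalty)
    {
        return (uint64_t)loop_mispredictions(iterations, policy) * penalty;
    }
    ```

    Under this model, always taken costs one misprediction per loop run and always not taken costs one per iteration except the last. Forward conditional branches that are usually not taken (for example, skipped error handling) behave the other way round, which may offset part of the gain on the loops.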