TM4C1294NCPDT: Comparison of ROM vs Flash SSI/SPI Tivaware API's - shouldn't ROM version be faster?

Part Number: TM4C1294NCPDT

Fellows,

The code below performs two consecutive 32-bit transfers over an SPI port (each as two 16-bit frames), with CSB controlled by GPIO and the clock set to 7.5MHz. It takes 11.44us. The Tivaware functions are compiled to Flash, with no optimizations.

    IntMasterDisable();
    GPIOPinWrite(gyroHW->GPIOFSSPort,gyroHW->GPIOFSSPin,0);				// Lower CSB
    SSIDataPut(gyroHW->SSIBase, frameHigh);							// Sends only the upper 16 bits
    while(SSIBusy(gyroHW->SSIBase));								// Flush first 16 bits
    SSIDataGet(gyroHW->SSIBase, &readCrap);							// Read useless data
    SSIDataPut(gyroHW->SSIBase, (frameLow));						// This sends the other 16 bits
    while(SSIBusy(gyroHW->SSIBase));								// Flush second 16 bits
    GPIOPinWrite(gyroHW->GPIOFSSPort,gyroHW->GPIOFSSPin,gyroHW->GPIOFSSPin);	// Raise CSB
    SSIDataGet(gyroHW->SSIBase, &readCrap);							// Read useless data
    GPIOPinWrite(gyroHW->GPIOFSSPort,gyroHW->GPIOFSSPin,0);				// Lower CSB
    SSIDataPut(gyroHW->SSIBase, frameHigh);							// Flush third 16 bits
    while(SSIBusy(gyroHW->SSIBase));								// Wait for bits to be flushed out
    SSIDataGet(gyroHW->SSIBase, &read16High);						// Read 16 bits
    SSIDataPut(gyroHW->SSIBase, frameLow);							// Flush last 16 bits
    while(SSIBusy(gyroHW->SSIBase));								// Wait for bits to be flushed out
    SSIDataGet(gyroHW->SSIBase, &read16Low);						// Read more data
    GPIOPinWrite(gyroHW->GPIOFSSPort,gyroHW->GPIOFSSPin,gyroHW->GPIOFSSPin);	// Raise CSB
    IntMasterEnable();

The exact same sequence, with all the APIs called directly from ROM, takes 12.22us.

    IntMasterDisable();
    ROM_GPIOPinWrite(gyroHW->GPIOFSSPort,gyroHW->GPIOFSSPin,0);				// Lower CSB
    ROM_SSIDataPut(gyroHW->SSIBase, frameHigh);							// Sends only the upper 16 bits
    while(ROM_SSIBusy(gyroHW->SSIBase));								// Flush first 16 bits
    ROM_SSIDataGet(gyroHW->SSIBase, &readCrap);							// Read useless data
    ROM_SSIDataPut(gyroHW->SSIBase, (frameLow));						// This sends the other 16 bits
    while(ROM_SSIBusy(gyroHW->SSIBase));								// Flush second 16 bits
    ROM_GPIOPinWrite(gyroHW->GPIOFSSPort,gyroHW->GPIOFSSPin,gyroHW->GPIOFSSPin);	// Raise CSB
    ROM_SSIDataGet(gyroHW->SSIBase, &readCrap);							// Read useless data
    ROM_GPIOPinWrite(gyroHW->GPIOFSSPort,gyroHW->GPIOFSSPin,0);				// Lower CSB
    ROM_SSIDataPut(gyroHW->SSIBase, frameHigh);							// Flush third 16 bits
    while(ROM_SSIBusy(gyroHW->SSIBase));								// Wait for bits to be flushed out
    ROM_SSIDataGet(gyroHW->SSIBase, &read16High);						// Read 16 bits
    ROM_SSIDataPut(gyroHW->SSIBase, frameLow);							// Flush last 16 bits
    while(ROM_SSIBusy(gyroHW->SSIBase));								// Wait for bits to be flushed out
    ROM_SSIDataGet(gyroHW->SSIBase, &read16Low);						// Read more data
    ROM_GPIOPinWrite(gyroHW->GPIOFSSPort,gyroHW->GPIOFSSPin,gyroHW->GPIOFSSPin);	// Raise CSB
    IntMasterEnable();

Not that this will kill anyone, but all discussions and documentations to date imply that ROM calls are faster. Any thoughts as to the reason for the results above?

Regards

Bruno
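To put the two measurements in proportion, it helps to separate the time the bits spend on the wire from the time spent in software. The sketch below is only that arithmetic, assuming four 16-bit frames (64 bits) per sequence as the listings' comments suggest. The helper names are made up for illustration, and the 120MHz system clock mentioned afterwards is an assumption, since the thread does not state it.

```c
#include <assert.h>

/* Time, in microseconds, that `bits` bits occupy on the wire at `clock_hz`. */
static double wire_time_us(unsigned bits, double clock_hz)
{
    assert(clock_hz > 0.0);
    return (double)bits / clock_hz * 1e6;
}

/* The part of a measured sequence that wire time alone does not explain:
 * function calls, busy-wait polling, GPIO writes, and so on. */
static double overhead_us(double measured_us, unsigned bits, double clock_hz)
{
    return measured_us - wire_time_us(bits, clock_hz);
}

/* Convert a time in microseconds to CPU cycles at `sysclk_hz`. */
static double us_to_cycles(double us, double sysclk_hz)
{
    assert(sysclk_hz > 0.0);
    return us * 1e-6 * sysclk_hz;
}
```

With the numbers from the post, `wire_time_us(64, 7.5e6)` is about 8.53us, leaving roughly 2.91us of software overhead for the Flash build and 3.69us for the ROM build. The 0.78us gap is spread over at least sixteen driver calls plus however many busy-wait polls occur; if the part runs at 120MHz (an assumption), `us_to_cycles(0.78, 120e6)` puts it near 94 cycles in total.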

  • It appears that your "ROM vs Flash" listings have been reversed!   (top code listing described as "Flash" - yet all calls are to ROM!)

    Beware when employing "absolutes" (i.e. "ALL discussions & documentations...") - that's clearly untrue!   (You really didn't search for - then review - "ALL" of them, so the comment is gratuitous and incorrect!)   Amit has authored several posts which described "limitations" experienced (sometimes) via ROM calls...

    It has been noted that there are "overheads" - demanded by ROM functions - which are NOT present w/in Flash calls!   And the type, size, and even code/function "placement" w/in ROM have all been shown to impact code execution.   (A more thorough review of discussions/documentations is likely to reveal how/when/where to "tease optimal" performance from ROM calls.)

    It proves "good" that you run such comparative tests - but "not so good" that you draw conclusions which may rely upon "absolutes" (ALL) and may be in factual error...   (credit UCLA law)

    And do further detail (somewhat detail) the "murder" of the SPI Slave - via 8+MHz SPI clock - (communicated via PM).
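One plausible source of the ROM "overheads" mentioned in this reply is the way the call is dispatched: as I understand it, the ROM_ macros fetch the function address at run time through tables stored in ROM, whereas a Flash call is a direct branch resolved by the linker. The host-runnable sketch below only mimics that pattern. Every name in it (`hypothetical_api_table`, `HYP_ROM_SSIDataPut`, `fake_ssi_dr`) is invented for illustration and is not Tivaware's actual layout.

```c
#include <stdint.h>

/* Stand-in for a memory-mapped SSI data register. */
static volatile uint32_t fake_ssi_dr;

/* "Flash" style driver: the linker knows its address, so a call site
 * compiles to a single direct branch-and-link. */
static void flash_data_put(uint32_t data)
{
    fake_ssi_dr = data;
}

/* Body that would live in ROM in this analogy. */
static void rom_body_data_put(uint32_t data)
{
    fake_ssi_dr = data;
}

typedef void (*generic_fn)(void);

/* Hypothetical per-peripheral table of function addresses. */
static const generic_fn hypothetical_ssi_table[] = {
    (generic_fn)rom_body_data_put,
};

/* Hypothetical top-level table pointing at the per-peripheral tables. */
static const generic_fn *const hypothetical_api_table[] = {
    hypothetical_ssi_table,
};

/* "ROM" style call: two table loads, a cast back to the real signature,
 * then an indirect branch. */
#define HYP_ROM_SSIDataPut \
    ((void (*)(uint32_t))hypothetical_api_table[0][0])
```

Both `flash_data_put(x)` and `HYP_ROM_SSIDataPut(x)` end up doing the same register write; the second simply pays for extra loads and an indirect branch on each call. Whether that cost outweighs Flash wait states on a given part is exactly what has to be measured rather than assumed.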

  • cb1_mobile said:
    It appears that your "ROM vs Flash" listings have been reversed!

    True. Fixed!

    Found this discussion on the topic which has some very good insight.

    e2e.ti.com/.../1010015

    cb1_mobile said:
    "ALL discussions & documentations...")  that's clearly untrue!

    Noted. That's what proofreading, revisions and cb1s are for! I have based my statement on two lines of thought. First, this text copied from Tivaware UG:

    "For better accuracy, the ROM version of this function may be used. This version will not suffer from flash- and prefetch buffer-related timing variability but will still be delayed by interrupt service routines."

    Next, the fact that I myself, based on the above, have several times defended the use of ROM_ based functions in posts here, for both space and speed. Nobody had corrected me so far, to the point that I had taken it as a solid fact. Thanks for pointing that out, cb1.

    The conclusion is that ROM calls are likely to be smaller, but not necessarily faster (the latter, I see, will depend a lot on which function we are talking about, and on other "unpredictable variables" in the system). Since this post was meant to spark discussion and learning, I'll say the purpose is being met!

    Bruno

  • Greetings Bruno -  et, "Merci beaucoup."

    No rest for the wicked (i.e. profit-seeking, small tech firms) desiring to (further) "strike while tech is (if not hot) at least beyond luke warm."   (04:00 - in the "joint" - to set-up & prepare for the "weekend obsessed" (clock watching) crack crüe.)

    I believe your "broadening of conclusions" - which admit the likelihood of "performance deltas" - based upon multiple variables (some not always readily predictable) will best serve you.      And - you've learned (and nicely accepted) that, "Rules for the admissibility & acceptance of evidence" - may have "reach" beyond the courtroom.    (only those "facts" - earlier screened & approved - may be employed as "evidence.")

    If you've the interest - it may prove insightful to:

    • repeat your test - but first enlarge your code listings ("copy/paste" your code 10x) - and then measure/compare/contrast
    • repeat your test (several times) but (now) targeting different MCU Peripherals - to note potential variations (if any)   (in the past - we tested & noted such...)
    • to prevent the (unwanted) intrusion of "other" (impacting) program elements - run such tests in a pristine code environment - not subject to "outside" influences.   (i.e. interrupts, RTOS etc.)
    • different System Clocks will alter performance (i.e. 40MHz & below for 4C123) - may prove useful to: "Run - Measure - and Record"
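The suggestions above (repeat the sequence many times, keep the environment quiet, record the result) can be sketched as a small harness. This is a minimal sketch, not anyone's actual test code: `measure_avg_cycles` and the two `fake_` helpers are hypothetical names. The cycle source is passed in so the harness runs on a host; on the target it might wrap the Cortex-M4 DWT cycle counter (an assumption, and the counter must be enabled first).

```c
#include <assert.h>
#include <stdint.h>

typedef void (*test_fn)(void);
typedef uint32_t (*cycle_source_fn)(void);

/* Run `fn` `reps` times and return the average cost per run, in counts of
 * whatever `now` measures. The loop overhead is included, but it is the
 * same for the Flash and ROM variants, so the difference stays meaningful. */
static uint32_t measure_avg_cycles(test_fn fn, cycle_source_fn now, uint32_t reps)
{
    assert(fn != 0 && now != 0 && reps > 0);

    uint32_t start = now();
    for (uint32_t i = 0; i < reps; i++) {
        fn();
    }
    /* Unsigned subtraction gives the right answer across one counter wrap. */
    uint32_t elapsed = now() - start;

    return elapsed / reps;
}

/* Host-side stand-ins so the harness can be exercised without hardware. */
static uint32_t fake_cycles;

static uint32_t fake_now(void)
{
    return fake_cycles;
}

/* Pretends to be one SPI read sequence costing 100 counts. */
static void fake_transfer(void)
{
    fake_cycles += 100;
}
```

On the target, the Flash and ROM listings would each be wrapped in a function and passed as `fn`, with interrupts disabled around the whole measurement as in the original code. Averaging over many repetitions also smooths out the polling granularity of the busy-wait loops.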

    Should there exist a "universal truth" (i.e. ROM Functions are ALWAYS superior to FLASH - in terms of execution speed & memory footprint) could not such have been better "featured" & promoted?   As many users ARE Engineers, comparison charts (somehow) "come to mind."   Yet there are "few" (maybe none) - perhaps in recognition of the "variability" which arises.   (rendering the use of "absolutes" unwise!)

    BTW - the "MCU manual section" which you provided does NOT specifically note an, "execution speed" advantage via ROM calls.   (speaks to the (hedged) "timing variability" instead)     Your conclusion (may) prove valid - but it is not "certain" - and "in effect" under "each/every" operating condition...

    So - conclusions are critical - serve as great "guideposts" - and may (even) benefit from "methods commonly employed w/in (other, more strict) disciplines!"       (especially those: enjoying legal status/rule and having been developed prior to (real) engineering's birth!)