Possible CC2541 SBL bug

Hello,


I have now seen this on 2 devices. It happens rarely, but it bricks the device. After a download in which the SBL handshake failed to establish, I have not been able to recover the module via UART. Our design has no other method of updating the module, so the ICs were pre-programmed with an SBL-enabled image to allow future updates. We are running a modified Host Release project as the application.

All the instructions mentioned in the link below were followed.

http://processors.wiki.ti.com/index.php/SerialBootLoader

I have power cycled the module, but I get no response in this state. Has anyone experienced this before? If there is a bug in the SBL, it puts our design at greater risk.

regards

Ankit

  • On further investigation, I read the entire contents of the flash on the affected module and noticed that the SBL area was all 0xff, suggesting it was erased/overwritten somehow.
  • So you're saying that you can not communicate with the SBL? Can you attach a logic capture of the UART lines when you are attempting the handshake to verify that it is done correctly?
  • Hello Tim,


    Yes, I cannot communicate with the SBL. I even had the second entry to the bootloader [if ((SLEEPSTA & 0x10) || !sblWait())] compiled in as a backup. I probed the UART lines: only RX into the CC2541 shows data going in, and there is no TX from the CC2541. The flash area for the SBL is blank, as I mentioned in my last comment.

    Is it possible for SBL to overwrite itself?

    Regards

    Ankit
     

  • I suppose it is possible if you send an invalid address offset which would cause an overflow on the address calculation (t16 below):

    static void sblProc(void)
    {
      uint16 t16 = BUILD_UINT16(sbBuf[SBL_REQ_ADDR_LSB], sbBuf[SBL_REQ_ADDR_MSB]) + HAL_SBL_IMG_BEG;
      uint8 len = 1, rsp = SBL_SUCCESS;
      uint16 crc[2];
    
      switch (sbBuf[RPC_POS_CMD1])
      {
      case SBL_WRITE_CMD:
        if ((t16 % SBL_PAGE_SIZE) == 0)
        {
          HalFlashErase(t16 / SBL_PAGE_SIZE);
        }
        HalFlashWrite(t16, (sbBuf + SBL_REQ_DAT0), (SBL_RW_BUF_LEN / HAL_FLASH_WORD_SIZE));
        break;

    However, you shouldn't be doing this.  In any case I can't imagine that the entire SBL will be overwritten with 0xFF as it would break before you get to that point.

    The best way to debug this is to look at a logic capture of the entire sequence where this occurs. You will be able to see every command and response from the SBL.
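The wrap-around described above can be sketched in isolation. This is only an illustration, not the SBL source: `IMG_BEG_WORDS` and `PAGE_SIZE_WORDS` are hypothetical stand-ins for `HAL_SBL_IMG_BEG` and `SBL_PAGE_SIZE`, with values assumed purely so that one offset wraps to zero, and `sbl_addr_is_safe()` is a hypothetical guard that the excerpt above does not contain.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed values for illustration only; take the real ones from the SBL headers. */
#define IMG_BEG_WORDS   0x0200u  /* hypothetical stand-in for HAL_SBL_IMG_BEG */
#define PAGE_SIZE_WORDS 0x0200u  /* hypothetical stand-in for SBL_PAGE_SIZE */

/* Mirrors the 16-bit addition in sblProc(): the sum is truncated to 16 bits. */
uint16_t sbl_target_addr(uint16_t req_addr)
{
    return (uint16_t)(req_addr + IMG_BEG_WORDS);
}

/* Page number that the erase call would receive for a given request offset. */
uint16_t sbl_target_page(uint16_t req_addr)
{
    return (uint16_t)(sbl_target_addr(req_addr) / PAGE_SIZE_WORDS);
}

/* Hypothetical guard: true only if adding the image base cannot wrap past 0xFFFF. */
bool sbl_addr_is_safe(uint16_t req_addr)
{
    return req_addr <= (uint16_t)(0xFFFFu - IMG_BEG_WORDS);
}
```

With these assumed values, a request offset of 0xFE00 wraps to address 0 and page 0, which is where the boot code lives. A check like `sbl_addr_is_safe()` before the erase would reject that request.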

  • Hello Tim,


    In the corrupt image, the first 2 pages have been erased. I have attached the broken flash hex file for your reference.

    I can't see why the application would pass in an incorrect offset.

    This needs more investigation and I would appreciate if you would consider the possibility of some code in SBL overwriting itself even if the address offset provided is correct.

    I will investigate further today and add more results.

    Ankit

    img_broken.zip

  • Hello Tim,

    The HalFlashErase function sets the erase bit in FCTL but doesn't wait for it to be reset. Could that cause corruption? A write being called while the erase is in progress?

    void HalFlashErase(uint8 pg)
    {
      FADDRH = pg * (HAL_FLASH_PAGE_SIZE / HAL_FLASH_WORD_SIZE / 256);
      FCTL |= 0x01;
    }
  • My bad, I overlooked the following comment in the software User's Guide.

    If ERASE is also set to 1, a page erase of the whole page addressed by
    FADDRH[7:1] is performed before the write. Setting WRITE to 1 when ERASE is 1
    has no effect.

    Please ignore my last comment.
  • Hello Tim,

    After more testing today, I can confirm that my application won't send the address which could cause an overflow scenario. It crashes when I try injecting the overflow address as it exceeds the file size of the binary image to be read from.

    I did manually send a write command with an address value of 0xFE00, and that seems to have erased the SBL. But my application cannot produce this scenario.

    We are about to release this product but this issue puts it at a big risk. How could the first 2 pages (4K) of flash get erased? This is the big question.

    I would appreciate if TI tries to reproduce this bug.
  • Hello TI,

    I was hoping for some feedback on this thread. This is quite a major issue.

    I have checked the flash contents of the second failed module, and there is a small amount of corruption in the SBL area. The following hex dump shows the lines that mismatch.

    First instance of corruption right at the start:

    :100000000207C3020803FFFFFFFFFF02080B0C807B    - Corrupt

    :100000000207C3020803FFFFFFFFFF02080BFFFF09  - Good

    Second instance:

    :1002800014603F14607E1470030203800203B6EB17  - Corrupt

    :1002800014603F14607E1470030203B70203B6EBE0  - Good


    We bought the modules from Digikey. How likely is it that the flash gets corrupted on these modules?
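Records like the ones quoted above can be compared mechanically. The following is a minimal sketch, not part of any TI tool: it parses one Intel HEX record, verifies its checksum, and offers a check for whether a changed byte only had bits cleared (flash programming without an erase can only turn 1 bits into 0 bits). The function names `ihex_parse` and `only_cleared_bits` are made up for this example.

```c
#include <stddef.h>
#include <stdint.h>

/* Convert one hex digit to its value, or -1 if it is not a hex digit. */
static int hex_nibble(char c)
{
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    return -1;
}

/* Read two hex digits at s as one byte, or return -1 on a bad digit. */
static int hex_byte(const char *s)
{
    int hi = hex_nibble(s[0]);
    int lo = (hi < 0) ? -1 : hex_nibble(s[1]);
    return (lo < 0) ? -1 : ((hi << 4) | lo);
}

/*
 * Parse one Intel HEX record (":LLAAAATT<data>CC"), verify its checksum,
 * and copy the data bytes into out. Anything after the checksum is ignored.
 * Returns the data length, or -1 if the record is malformed, does not fit
 * in out, or fails the checksum.
 */
int ihex_parse(const char *line, uint8_t *out, size_t cap, uint16_t *addr)
{
    if (line == NULL || line[0] != ':') return -1;
    int len = hex_byte(line + 1);
    if (len < 0 || (size_t)len > cap) return -1;

    uint8_t sum = 0;
    uint8_t hdr[4];
    int total = 4 + len + 1;  /* header (4 bytes) + data + checksum */
    for (int i = 0; i < total; i++) {
        int b = hex_byte(line + 1 + 2 * i);
        if (b < 0) return -1;
        sum = (uint8_t)(sum + b);
        if (i < 4) hdr[i] = (uint8_t)b;
        else if (i < 4 + len) out[i - 4] = (uint8_t)b;
    }
    if (sum != 0) return -1;  /* all bytes including the checksum must sum to 0 */
    *addr = (uint16_t)((hdr[1] << 8) | hdr[2]);
    return len;
}

/* True if `after` could result from programming `before` without an erase. */
int only_cleared_bits(uint8_t before, uint8_t after)
{
    return (after & (uint8_t)~before) == 0;
}
```

Parsing the good and corrupt versions of a record and walking the two data arrays side by side shows which bytes differ, and `only_cleared_bits()` then tells whether each difference is consistent with a stray write rather than an erase.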

  • Have you analyzed the complete logic capture of UART communication to/from the SBL? This will allow you to see every command you are sending and every response. You can then compare this to the flash image.

    You can also try setting a breakpoint in HalFlashErase() to see if it is hit at any point that it shouldn't be.
  • Hello Tim,


    Once a unit is bricked, it is very hard to reproduce the issue again. A logic capture of the UART is not practical for us at the moment.

    What are you trying to establish with the logic capture? I know that the pc application won't send an overflow address that could cause the corruption.

    As far as setting the breakpoint on HalFlashErase goes, that would be great but the UART lines are a part of the CC debugger cable so I cannot debug SBL and transfer firmware at the same time.

    I repeat my earlier question: how likely is the CC2541 flash to get corrupted after a few writes? We have already had to discard around 2-3% of ICs during pre-programming because they failed flash verification. The ones that passed have now bricked. My previous comment shows that only a few bytes were corrupt.

    I have noticed this on 2 out of 200 tested modules at our end and these haven't been released yet.

    I think this is a major issue rather than just a forum support query, and it deserves more attention than a couple of lines of response every other day.

  • I don't have numbers but the chances of flash corruption are extremely unlikely. I can try to talk to our hardware team to get specifics. However, I highly doubt this is the case as we have never seen this.

    How do you know that the PC application won't send an overflow address? This could be verified by a UART capture. Even if you don't think you are sending one due to inspecting the code (which should be verified with UART), there could be noise or something else which is corrupting the UART lines. Have we reviewed your schematic? If not, please send me a personal message with your schematic attached.

    In terms of practicality, you could take a UART capture using a digital logic analyzer of every SBL you perform. If it breaks then you could compare it to a working one.

    I don't understand the issue with not being able to debug. Of course the UART pins and debug pins are not the same. Are you saying that you don't have physical access to the debug pins?

    Without capturing UART or using the debugger, I don't see any way to debug this.

    Just to clarify, by bricked you mean "not able to be programmed by the SBL" correct?

    You may also want to try locking the SBL flash pages for reading/writing/erasing. This would probably mask your issue but it shouldn't be occurring in the first place.
  • Hello Tim,

    This issue is not easily reproduced. I have programmed a unit over 500 times and haven't managed to get into this state. Right now I am loop programming the unit that had the 3-byte corruption mentioned in my earlier post. 50 loops so far and it hasn't corrupted. We had to lift that IC off the board and then reflash it using the TI Flash Programmer.

    Hence I don't think it's practical to capture all that UART logic data if this can't be reproduced over hundreds of iterations. But I will try to do this. Regarding the PC application, it would crash with a wrong address. Noise on the UART lines seems improbable but is possible. However, the corruption caused by an injected wrong address is different from the actual failures.

    By bricked I mean that they can't be programmed via the SBL.

    How can I just lock the SBL flash pages? Can I do this from the application running on chip itself?

    regards
    Ankit
  • You can do this by setting the relevant lock bits (see section 3.4.1 of the chip user's guide: www.ti.com/.../swru191).

    These reside at the end of flash starting 16 bytes from the end (see the linker file for more info).

    They can be set either in IAR under the "Texas Instruments" tab of the project options or by using flash programmer (see the guide: www.ti.com/.../swru069g.pdf)
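As a rough illustration of what such a lock-bit image might look like, the sketch below builds the 16 bytes for a chosen page range. Its assumptions should be checked against the user's guide before use: one bit per flash page, page n mapped to bit (n % 8) of byte (n / 8), a 0 bit meaning locked, and the topmost bit reserved for the debug lock. `build_lock_bits` is a hypothetical helper, not a TI API.

```c
#include <stdint.h>
#include <string.h>

#define LOCK_BYTES 16u  /* lock bits occupy the last 16 bytes of flash */

/*
 * Fill lock[] with a lock-bit image that protects `count` pages starting
 * at `first_page`; all other bits stay 1 (unlocked).
 * Assumed layout (verify against the user's guide): one bit per page,
 * page n is bit (n % 8) of byte (n / 8), 0 means locked, and the topmost
 * bit is left alone because it is assumed to be the debug lock.
 * Returns 0 on success, -1 if the range does not fit.
 */
int build_lock_bits(uint8_t lock[LOCK_BYTES], unsigned first_page, unsigned count)
{
    const unsigned max_pages = LOCK_BYTES * 8u - 1u;

    if (count > max_pages || first_page > max_pages - count) return -1;

    memset(lock, 0xFF, LOCK_BYTES);
    for (unsigned p = first_page; p < first_page + count; p++) {
        lock[p / 8u] &= (uint8_t)~(1u << (p % 8u));
    }
    return 0;
}
```

Under these assumptions, locking the first two pages clears the two lowest bits of the first lock byte (0xFC) and leaves the rest at 0xFF. The resulting bytes would then be entered through IAR or the flash programmer as described above.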
  • Thank you Tim
    Write protecting SBL flash pages is definitely a preventative measure.
  • Hello Tim,

    I have tried going through Digikey who are our suppliers to log an official support request with TI. We have spent over 8 weeks now and there hasn't been much progress apart from a suggestion about a brownout being the potential cause of flash corruption on boot. This suggestion has come from Digikey so we have had no success in getting TI's attention via that channel.


    Could you suggest how I can get TI involved in this investigation? 8 weeks is a long time and a lot could have been achieved if this was handled better.

    We could start with TI reviewing our design (based on the reference design) and pointing out any potential brownout sources.

    Looking forward to your response.

    regards

    Ankit

  • Hi,
    If you post your schematics and layout, we will review it and give you feedback. If you do not want to post them on this forum, send me a friend request and we can take the review offline. This forum is the best way for you to get help from us in TI, I am sorry if your case has not gotten the support you needed.
  • Hello CHS

    I have sent you a friend request. Once you accept it, I will send the schematics to you offline.

    regards
    Ankit