C6678 local reset

Is there any documentation that gives a complete description of what local reset does on the C6678?

All I have been able to find are vague comments scattered through several documents:

"The local reset can be used to reset a particular CorePac without resetting any other chip components."

"In response to a local reset, the L2 cache is left in its current operating mode. However, the entire contents of the cache are invalidated."

Other than invalidating the cache, what exactly does local reset do?

  • To add to my previous question:

    Section 2.2.3 in SPRUGV4b states:

    "When local reset is asserted, the internal memories (L1P, L1D, and L2) for the core are still accessible."

    This suggests that external memory is not accessible while local reset is asserted. What is the state of local memory when local reset is deasserted? Are the previous memory contents preserved?

    Later in the same section it states:

    "Set MDCTL[Y].LRSTZ to 0x1 to de-assert local reset. The C66x core immediately executes program instructions after reset is de-asserted. Note that the boot sequence does not re-occur unless there is a device-level reset. Execution of code previously in L2 begins execution."

    Why is the explicit mention of L2 there? Does execution not start from the address in BOOT_ADDR? Is that address constrained to be in L2?
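
The assert and de-assert steps quoted above come down to changing one bit of MDCTL. The following is a minimal sketch, assuming LRSTZ is bit 8 (as cited later in this thread from SPRUGV4B section 3.2.8) and that the low five bits hold the next-state field with 3 meaning "enable" (inferred from the 0x1F mask and the value 3 in the code posted further down). The macro and function names are hypothetical, and the actual register read and write are left to whatever PCIe access routine the host uses.

```c
#include <assert.h>
#include <stdint.h>

#define MDCTL_NEXT_MASK   0x1Fu      /* next-state field (assumed bits 4:0)     */
#define MDCTL_NEXT_ENABLE 0x03u      /* 3 = enable, as used in the thread's code */
#define MDCTL_LRSTZ       (1u << 8)  /* local reset, active low (assumed bit 8)  */

/* Return the MDCTL value that asserts local reset (LRSTZ = 0, NEXT = enable).
 * All other bits of the value read back are preserved. */
static inline uint32_t mdctl_assert_local_reset(uint32_t mdctl)
{
    uint32_t v = ((mdctl & ~MDCTL_NEXT_MASK) | MDCTL_NEXT_ENABLE) & ~MDCTL_LRSTZ;
    assert((v & MDCTL_LRSTZ) == 0);
    return v;
}

/* Return the MDCTL value that de-asserts local reset (LRSTZ = 1, NEXT = enable).
 * All other bits of the value read back are preserved. */
static inline uint32_t mdctl_deassert_local_reset(uint32_t mdctl)
{
    uint32_t v = (mdctl & ~MDCTL_NEXT_MASK) | MDCTL_NEXT_ENABLE | MDCTL_LRSTZ;
    assert((v & MDCTL_LRSTZ) != 0);
    return v;
}
```

A host would read MDCTL for the core's module, pass the value through one of these helpers, and write the result back. Whether a PTCMD transition must follow is one of the points debated later in the thread, so it is not shown here.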

  • From the lack of replies I'm concluding that TI does not have any documentation that is explicit about what local reset actually does.

    I am now stuck with the following problem:

    I have a C6678 board that is connected to the host PC using PCIe.

    After a power cycle, I can issue a local reset, load a trivial program (8 instructions) to 0x00800000 and start it running (by removing the local reset). The program simply increments a memory location and I can see that location changing by reading memory from the PC.

    I now do a local reset and reload the program. I can see it still works. This can be repeated many times.

    If I now run my real code, it works perfectly a few times, but eventually when I try to reload it (via local reset), it doesn't appear to start. With the C6678 in this state, my trivial incrementing code now no longer runs when I follow the procedure given above. Clearly, local reset is failing to reset something, and this results in local reset not jumping to the location in DSP_BOOT_ADDR.

    Power-cycling the DSP puts everything back and I can repeat the exercise with identical results.

    I can see all the relevant memory-mapped registers changing their values as the PC issues the local reset, loads the code, and removes reset. I can see that DSP_BOOT_ADDR has been set to 0x800000 (reads as 0x800001) and the code has been loaded correctly to 0x800000. Everything looks right, but the code does not execute.

    Because everything works after a power cycle, I conclude that something crucial has been altered and local reset doesn't put it back to a working value.

    What could possibly stop the DSP from responding to local reset (and yet leave it responding correctly to PCIe requests from the host)?

    I would be grateful if anyone would respond.

  • I'm seeing the exact same problem.  Using the first load to initialize some components and then trying a reset, followed by a new load, and pull it out of reset.  I see it work sometimes and other times nothing happens.  Toggling the reset has no effect.  It does seem that at this point if I reset, reload, and pull it out of reset a second time it tends to start running.  I simply repeat the second load and things are happy.  I can't figure out what isn't getting reset properly and why sometimes it will work but not others when the steps are always the same.

  • Peter Robertson said:

    If I now run my real code, it works perfectly a few times, but eventually when I try to reload it (via local reset), it doesn't appear to start. With the C6678 in this state, my trivial incrementing code now no longer runs when I follow the procedure given above. Clearly, local reset is failing to reset something, and this results in local reset not jumping to the location in DSP_BOOT_ADDR.

    Power-cycling the DSP puts everything back and I can repeat the exercise with identical results.

    I can see all the relevant memory-mapped registers changing their values as the PC issues the local reset, loads the code, and removes reset. I can see that DSP_BOOT_ADDR has been set to 0x800000 (reads as 0x800001) and the code has been loaded correctly to 0x800000. Everything looks right, but the code does not execute.

    I have been experiencing the same issue.

    I have found that even though I have, like you, put the DSP into local reset, written the proper DSP_BOOT_ADDR and successfully uploaded the new program, when I pull the DSP out of reset it begins executing the old program! It runs up to a certain point and then it crashes.

  • While it doesn't solve the problem, the fact that other users are having the same experience suggests that the issue is unlikely to be finger trouble.

    Why has nobody from TI responded with a link to the complete documentation showing exactly what local reset does? Should we conclude that there is no such documentation or that the real answer is 'nobody knows'?

    Does anyone from TI have a trivial example that shows how to make local reset actually do what the meagre documentation says it should?

    Will this post end up like so many others with no response from TI?

  • I would appreciate some response to show that someone in TI has actually read my questions and that they haven't fallen into the E2E black hole.

  • I think I found the solution to the problem. 

    The first EABI file I load, I do the following steps:

    1. put the PCIESS into supervisor mode;

    2. unlock the chip level regs;

    3. write to the DSP_BOOT_ADDR reg.

    4. I will then write to the boot magic address and generate the MSI interrupt;

    5. pull the PCIESS out of supervisor mode;

    When I load the second EABI file, I do the same steps above, minus step 4 but with the inclusion of toggling the local reset. 

    The fix is if I move step 5 above (pull the PCIESS out of supervisor mode) before step 4. Then when I load the second EABI file I see it run as it should. 

  • Thank you for your suggestion, but it does not solve my problem.

    1: The system is always in supervisor mode.

    2: The MSI interrupt and the boot magic address are not relevant; the on-chip bootloader isn't being accessed.

    I suspect that the problem could be resolved if someone in TI would supply a complete definition of what a local reset actually does. Am I being unreasonable in thinking that this should be expected in the processor's documentation?

    Has anybody from TI actually looked into this?

  • Never mind... it didn't really work for me either.

    By shifting that line of code I created a bug in my EABI loader that would write a 0x0 to the magic boot address on the first EABI load. In effect, the first EABI file never ran, and the second EABI file was actually the first one to run.

    So... again there is nothing wrong with either EABI file, as they can both run. There is nothing wrong with the loader as they can both be loaded and run. But I can not load them successively using the local reset.

    I plugged in the JTAG debugger this morning and saw that on the second EABI file after the local reset, the PC is jumping off to lala land. 

  • That's what I have deduced. Local reset is clearly not doing something critical and this results in the processor ending up jumping to somewhere essentially random.

    I have received a note from a contact in TI saying that someone "is handling your e2e post". I would have thought it to be common courtesy for that person to post here to that effect rather than leaving us guessing. Let's hope such a person actually exists and does not work to a geological timescale.

  • Peter,

    I apologize for the delay.

    If I understand your use case: you have a board with a C6678 plugged into a Linux PC, and you want the Linux PC (host) to be able to reset the C6678 and re-run code again and again without power cycling the PC?

    If that is the intent, we already have a software example in place for the C6678 EVM. For a quad or octal C6678 board the same idea would apply; I would check with Advantech to see if they have this test and example code.

    As you know, the C6678 has several types of reset: power-on reset, hard reset, soft reset and CPU local reset. Hard reset resets everything on the device except the PLLs, test, emulation logic, and reset isolation modules. Since PCIe doesn't support reset isolation, a hard reset resets the PCIe module as well, and all the configured PCIe registers (PCIe MMRs) are lost. Soft reset behaves like a hard reset except that the sticky bits of the PCIe MMRs are retained. In both the hard reset and soft reset cases the PC can no longer communicate with the PCIe card. So in order to reset the DSP while leaving PCIe untouched, we have to use the local reset. Otherwise the PCIe link is lost and you have to power cycle the PC, which is not ideal.

    Our example in BIOS MCSDK for PCIe local reset does the following:

    • Put all cores in reset via PSC
    • Disable all modules except PCIE and cores via PSC
    • Configure chip level registers DSP_BOOT_ADDRn and IPCGRn: here the header array converted from the DSP local reset example is loaded into each core via PCIE; the address of _c_int00 is then written to each DSP_BOOT_ADDRn; finally IPCGRn is written to jump start the DSP local reset example program, which simply polls the magic address for a secondary boot.
    • Enable all modules previously disabled via PSC
    • Pull all cores out of reset via PSC

    You can find example code here:

    http://software-dl.ti.com/sdoemb/sdoemb_public_sw/bios_mcsdk/latest/index_FDS.html

    Under: C:\ti\mcsdk_2_01_02_06\tools\boot_loader\examples\pcie\
    Docs folder: section 9.6 explains how the DSP local reset example works
    Linux code: \linux_host_loader\pciedemo.c, under #ifdef LOCAL_RESET; please check the function dspLocalReset().
    DSP code: pcieboot_localreset\src\pcieboot_localreset.c

    I believe that if you implement it the same way for your board (if it is not the C6678 EVM), it should work. If not, I need more details about your implementation; ideally you could attach your code so that I can compare it with ours and understand how you do the local reset and what exactly was implemented.

    Again, the code can be used directly on the C6678 EVM, as it has been tested there. For other boards, the same idea applies.

    best regards,

    David Zhou
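
The five-step sequence listed in the post above can be sketched as an ordered list of steps driven through a host-supplied hook. This is only an illustration of the ordering as written in the list, not the MCSDK implementation (that lives in dspLocalReset() in pciedemo.c and is not reproduced here). The step names, the `step_fn` hook type and the recording helper are all hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Steps of the reload sequence, in the order given in the list above. */
typedef enum {
    STEP_CORES_INTO_RESET,    /* put all cores in reset via PSC               */
    STEP_DISABLE_MODULES,     /* disable everything except PCIe and the cores */
    STEP_LOAD_IMAGE,          /* load the image into each core over PCIe      */
    STEP_WRITE_BOOT_ADDR,     /* write the entry point to DSP_BOOT_ADDRn      */
    STEP_WRITE_IPCGR,         /* write IPCGRn                                 */
    STEP_ENABLE_MODULES,      /* re-enable the modules disabled earlier       */
    STEP_CORES_OUT_OF_RESET,  /* pull all cores out of reset via PSC          */
    STEP_COUNT
} reload_step_t;

/* Hypothetical host hook: performs one step, returns 0 on success. */
typedef int (*step_fn)(reload_step_t step, void *ctx);

/* Run the steps in order and stop at the first failure.
 * Returns the number of steps that completed. */
static inline size_t run_local_reset_reload(step_fn do_step, void *ctx)
{
    size_t done = 0;
    assert(do_step != NULL);
    for (int s = 0; s < STEP_COUNT; s++) {
        if (do_step((reload_step_t)s, ctx) != 0)
            break;
        done++;
    }
    return done;
}

/* Example hook that only records the order in which steps were requested. */
typedef struct {
    reload_step_t log[STEP_COUNT];
    size_t count;
    int fail_at;              /* step to fail at, or -1 for no failure */
} step_log_t;

static inline int record_step(reload_step_t step, void *ctx)
{
    step_log_t *l = (step_log_t *)ctx;
    if ((int)step == l->fail_at)
        return -1;
    l->log[l->count++] = step;
    return 0;
}
```

Stopping at the first failure leaves the cores held in reset, which seems the safer state to diagnose from; a real loader would replace `record_step` with a function that performs the PSC and PCIe accesses.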

  • Thank you for responding.

    "you have a board with C6678 plugged in a Linux PC"

    No. I am using Windows XP, but this should be irrelevant.

    Your examples are not appropriate, because I am not using the on-chip bootloader for the second and subsequent executions. This is because to do so would mean that I would have to reserve a significant amount of internal memory; this is not acceptable.

    The initial power-cycle of the board will result in the PCIe being initialised correctly.

    My loader loads everything directly through PCIe, sets DSP_BOOT_ADDR0 and then signals a local reset. This works reliably until something unknown gets changed in the DSP. From then on, the local reset has no apparent effect.

    I am simply loading a trivial set of instructions to 0x800000 (the value I set in DSP_BOOT_ADDR0) and I can see them being loaded correctly (which indicates that PCIe reads and writes are still working) , but they are never executed. What could be preventing a local reset from executing from the address in DSP_BOOT_ADDR0 as it is documented to do?

    I still have not seen any documentation on exactly what local reset actually does.

    What does it do?


  • Peter,

    A local reset is a reset that is applied only to the CPU and/or its megamodule (any tightly coupled controllers delivered with the CPU). This reset does not affect the rest of the device. And it does not initiate boot process.

    Can you attach the code so I can reproduce the issue? How did you trigger the local reset? Please also let me know the hardware you are using, including part number and revision information. Which PCIe Windows driver are you using for this card? Is your Windows XP 32-bit or 64-bit?

    Thanks!

    best regards,

    David Zhou

  • "This reset does not affect the rest of the device. And it does not initiate boot process."

    I know, but it should initiate a branch to the address in DSP_BOOT_ADDR0 according to the documentation.

    The code is part of a fairly complex system, so I shall try to reproduce the problem with a small example I can send to you. For the time being, here is the essence of the reset code:

    unsigned int Reset1[] = {
       0x00000000,  //  Reset1:  nop
       0x0000002A,  //           mvkl 0x00800000, b0
       0x0000406A,  //           mvkh 0x00800000, b0
       0x0080A35A,  //           zero             b1
       0x008000D2,  //  xxx:     addk 1,          b1
       0x008002F6,  //           stw  b1,        *b0
       0x0004A120,  //           bnop xxx,        5
       0x00008000   //           nop  5
    };

    void Reset(unsigned int Disable)  {
       WriteDspRegisterUlong(PSC_MDCTL15, 3 | Disable);
       WriteDspRegisterUlong(PTCMD, 1<<8);
       do {ReadDspRegisterUlong(PTSTAT, X);} while (X&(1<<8));
       do {ReadDspRegisterUlong(PSC_MDSTAT15, X);} while ((X&0x1F) != 3);
    }

    Reset(0<<8);  // into reset
    WriteDspRegisterUlong(DSP_BOOT_ADDR0, 0x00800000);

    WriteDsp(0x10800000, (UINT8 *)Reset1, sizeof(Reset1));
    Reset(1<<8);  // out of reset

    I can then read 0x800000 and see it being incremented when the code is executed after a power cycle.
    This can be repeated (with variations of the code to ensure that the new code has been written) many times.

    If instead, I load my real code, it loads and executes correctly, but only once. Thereafter, the code above loads to 0x800000 but that code is never executed.

    The board is a DSPC-8681 from Advantech. I have no way of knowing the chip details without dismantling the board, something I cannot do as it is on loan. The PCIe driver is the one supplied by Advantech.

    My earlier comment about XP was a typo; I am running on a 64-bit Windows 7 PC.

    Regards,

    Peter
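
The code above writes the program through 0x10800000 while DSP_BOOT_ADDR0 holds 0x00800000. These appear to be the global and core-local views of the same core 0 L2 location. The sketch below mirrors the IS_CORE_LOCAL_ADDR and LOCAL_TO_GLOBAL_ADDR macros from the Advantech loader quoted later in this thread; the function names and the core-count constant are introduced here for illustration only.

```c
#include <assert.h>
#include <stdint.h>

#define C6678_NUM_CORES 8u   /* the C6678 has eight C66x cores */

/* A core-local address has a zero top byte (same test as IS_CORE_LOCAL_ADDR). */
static inline int is_core_local_addr(uint32_t addr)
{
    return (addr & 0xFF000000u) == 0;
}

/* Map a core-local address to the global alias for the given core.
 * Addresses that are already global are returned unchanged. */
static inline uint32_t local_to_global_addr(uint32_t addr, unsigned core_id)
{
    assert(core_id < C6678_NUM_CORES);
    if (!is_core_local_addr(addr))
        return addr;
    return addr | ((uint32_t)(core_id + 0x10u) << 24);
}
```

With this mapping, core 0's local 0x00800000 becomes 0x10800000, which matches the address passed to WriteDsp() above.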

  • Peter,

    Thank you for providing a piece of your code. I think there is a problem in the way you put the core into reset. Per the PSC user guide (literature number SPRUGV4B, November 2011), section 3.2.8, you need to set bit 8 of MDCTL to "0" to assert local reset. Your code didn't do this: WriteDspRegisterUlong(PSC_MDCTL15, 3 | Disable);

    Here is my suggestion:

    void Reset(unsigned int Disable)  {
        ReadDspRegisterUlong(PSC_MDCTL15, X);
        if (Disable == 0) { /* put core in reset */
            WriteDspRegisterUlong(PSC_MDCTL15, (X&(~0x1F) | 3)&(~0x100));
        } else { /* take core out of reset */
            WriteDspRegisterUlong(PSC_MDCTL15, (X&(~0x1F) | 3 | Disable));
        }

        /* Make sure no previous transition is in progress, just a safeguard */
        do {ReadDspRegisterUlong(PTSTAT, X);} while (X&(1<<8));

        /* start transition */
        WriteDspRegisterUlong(PTCMD, 1<<8);

        /* check transition finished */
        do {ReadDspRegisterUlong(PTSTAT, X);} while (X&(1<<8));
        do {ReadDspRegisterUlong(PSC_MDSTAT15, X);} while ((X&0x1F) != 3);
    }

    Please give this a try to see if it helps.

    You have a nice weekend!

    Best Regards,

    David Zhou

  • David,

    We have the exact same issue with our own board and device drivers under Windows.  Our reset code is like what you have above, but Peter's code *is* putting it into reset. Take a second look over what Peter posted and you will see it writes a 0 to bit 8 of the MDCTL.

    The only thing I am not doing, that Peter and TI are, is setting the PTCMD after writing to the LRST in the MDCTL. Section 2.2.3 (Local Reset) of SPRUGV4 does not list that as a necessary step. And indeed, we have verified that we can halt program execution and then restart it by just toggling the LRST.

    The problem is that when we put the core into local reset, load a new (second) EABI file, write the appropriate DSP_BOOT_ADDR, and then take the core out of local reset, the core behaves badly. The program counter runs off to lala land.

    Sometimes it almost looks like the core is trying to execute the first EABI file, like it is running old code from the L1P cache when we deassert the LRST. 

    Colin

  • David,

    Thank you for your suggestion, but let's look at your code:

        ReadDspRegisterUlong(PSC_MDCTL15, X);
        if (Disable == 0) {/* put core in reset */
           WriteDspRegisterUlong(PSC_MDCTL15, (X&(~0x1F) | 3)&(~0x100));
        } else { /*put core out of reset */
           WriteDspRegisterUlong(PSC_MDCTL15, (X&(~0x1F) | 3 | Disable));
        }

    (X&(~0x1F) | 3)&(~0x100) = (X&(~0x1F)&(~0x100) | 3) = (X&(~0x11F) | 3)

    According to the documentation, there are only three fields in the MDCTL registers that are not reserved. X&(~0x11F) clears all except RESETISO, the reset value of which is 0 and so our reset should clear this also. This means that all non-reserved fields in MDCTL will be set to zero, so we may as well clear the whole register. This makes reading MDCTL pointless.

    Your code now reduces to:

        if (Disable == 0) {/* put core in reset */
           WriteDspRegisterUlong(PSC_MDCTL15, (0&(~0x1F) | 3)&(~0x100));
        } else { /*put core out of reset */
           WriteDspRegisterUlong(PSC_MDCTL15, (0&(~0x1F) | 3 | Disable));
        }

    which simplifies to:

        if (Disable == 0) {/* put core in reset */
           WriteDspRegisterUlong(PSC_MDCTL15, 3);
        } else { /*put core out of reset */
           WriteDspRegisterUlong(PSC_MDCTL15, 3 | Disable);
        }

    which again simplifies to:

        WriteDspRegisterUlong(PSC_MDCTL15, 3 | Disable);

    which is what I had in the first place.

    (As an aside, almost all of the code from TI that I have examined can be similarly dramatically simplified and clarified by applying a little thought.)

    Colin,

    I had also considered the possibility that the cache was causing the problem, so my code also invalidates L1P, L1D and L2. This makes no difference, which is what I expected as I remember reading somewhere (very difficult to find as information is scattered over many TI documents) that local reset clears the caches.
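
The algebra in this post can be checked mechanically. The sketch below compares the value written by the read-modify-write version with the value written by the direct `3 | Disable` version, for a given MDCTL read-back value. The function names are hypothetical, and `Disable` is assumed to be either 0 or 0x100 as in the calls posted earlier.

```c
#include <assert.h>
#include <stdint.h>

/* Value written by the read-modify-write version; x is the MDCTL value read back. */
static inline uint32_t mdctl_rmw(uint32_t x, uint32_t disable)
{
    if (disable == 0)
        return ((x & ~0x1Fu) | 3u) & ~0x100u;   /* put core in reset */
    return (x & ~0x1Fu) | 3u | disable;          /* take core out of reset */
}

/* Value written by the direct version. */
static inline uint32_t mdctl_direct(uint32_t disable)
{
    return 3u | disable;
}

/* Returns 1 if both versions would write the same value for this read-back x. */
static inline int mdctl_versions_agree(uint32_t x, uint32_t disable)
{
    assert(disable == 0 || disable == 0x100u);
    return mdctl_rmw(x, disable) == mdctl_direct(disable);
}
```

The two versions agree whenever the value read back has no bits set outside 0x11F. They differ only if some other bit (RESETISO or a reserved bit) is set, in which case the read-modify-write version preserves it and the direct version clears it.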

  • Peter Robertson said:

    Colin,

    I had also considered the possibility that the cache was causing the problem, so my code also invalidates L1P, L1D and L2. This makes no difference, which is what I expected as I remember reading somewhere (very difficult to find as information is scattered over many TI documents) that local reset clears the caches.

    We tried to invalidate, too. We also tried disabling the L1P Cache but both still exhibited the same problem. 

    Another speculation was that the LRESET was causing a cache pre-load in the L1P. So we had our first EABI program exit to a NOP loop, then wrote the second EABI program to L2SRAM and toggled the LRESET. Same result.

    But often times it does look like the first EABI program is trying to execute on the second EABI program load. Like there is some hidden and very sticky L0 cache. 

  • Hi David,

    I have been trying to generate some code I could send to you, but the problem appears only after running an application that needs a full installation of our software. I am trying to find a way to cut things down so that I can send a simple example that demonstrates the problem.

  • It also requires a SPC board and associated drivers from Advantech, so I suspect you will not be able to reproduce the problem.

    The evidence Colin and I have found is suggesting strongly something strange happening with the cache.

  • Colin,

    Can you possibly send your project to me to demo the problem? I assume you are using DSPC-8681E too?

    best regards,

    David Zhou

  • dzhou said:

    Colin,

    Can you possibly send your project to me to demo the problem? I assume you are using DSPC-8681E too?

    best regards,

    David Zhou

    I will have to create something. The main project is for a customer and I can not distribute. 

    I am using our own custom board and device driver. 

  • Peter, Colin,

    I found that the Advantech DSPC-8681E has DSP local reset functionality provided in both the Windows driver and the Linux driver. The download page is: DSPC8681 driver link. The latest version for Windows is V0.7.8; the local reset is implemented in the do_dsp_local_reset() function in dsp_loader (Lightning_PCIE_0_7_8\Project\dsp_loader), which makes the API call into the Windows driver. The demo script is in Lightning_PCIE_0_7_8\examples\script\DSPC8681E\dspreset.bat.

    For the DSP local reset implementation, you can refer to the Linux source code V0.7 in Lightning_PCIE_0_7\dsp_loader\app. It is again done by do_dsp_local_reset() in dsp_loader.c:

    /*
     *
     * Copyright (C) 2011 Advantech Co., Ltd. - http://www.advantech.com.tw/
     *
     *  Redistribution and use in source and binary forms, with or without
     *  modification, are permitted provided that the following conditions
     *  are met:
     *
     *    Redistributions of source code must retain the above copyright
     *    notice, this list of conditions and the following disclaimer.
     *
     *    Redistributions in binary form must reproduce the above copyright
     *    notice, this list of conditions and the following disclaimer in the
     *    documentation and/or other materials provided with the
     *    distribution.
     *
     *    Neither the name of Advantech Co., Ltd. nor the names of
     *    its contributors may be used to endorse or promote products derived
     *    from this software without specific prior written permission.
     *
     *  THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
     *  "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
     *  LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
     *  A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
     *  OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
     *  SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
     *  LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
     *  DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
     *  THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
     *  (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
     *  OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
     *
     */
    
    #include <linux/stat.h>
    #include <linux/ioctl.h>
    
    #include <sys/mman.h>
    #include <sys/time.h>
    
    #include <fcntl.h>
    #include <stdio.h>
    #include <string.h>
    #include <stdlib.h>
    #include <unistd.h>
    
    #include "pcie_drv.h"
    #include "argparse.h"
    #include "bspversion.h"
    #include "ihexparser.h"
    #include "c6678_boot_defs.h"
    #include "pcieLocalReset.h"
    
    
    /* define this feature for time counting when loadbinary and savebinary */
    #define time_measure
    
    /* define DMA channel and param id */
    #define DMA_CHAN_RD                         2
    #define DMA_CHAN_WR                         3
    #define DMA_PARAM_START_NUM                 0
    #define TI667X_PCIE_MAX_IO_BUFFERS          2
    #define DSP_OB_REGION_SIZE                  0x400000
    
    /* transfer 16 MB at a time when reading or writing a binary file */
    #define BLOCK_BUF_SZ (0x1000000)
    
    /* Block size in bytes when r/w data between GPP and DSP via DSP CPU */
    #define DSP_LOAD_BLOCK_TRANSFER_SIZE        0x100      
    #define IS_CORE_LOCAL_ADDR(addr)            ((addr & 0xff000000) == 0)
    #define LOCAL_TO_GLOBAL_ADDR(addr, core_id) (addr | ((core_id + 0x10) << 24))
    
    #define BOOT_ENTRY_LOCATION_ADDR 0x87FFFC
    
    
    /*-------------------------------------------------------------------------
     * functions used by DSP load
     *-----------------------------------------------------------------------*/
    int dio_write_mem(unsigned dsp_id, unsigned addr, const unsigned char *buffer, size_t size)
    {
        int ret = 0;
    	
        ret = pcie_drv_dsp_write(dsp_id, addr, (unsigned char*) buffer, size);
        return ret;
    }
    
    
    int dio_read_mem(unsigned dsp_id, unsigned addr, const unsigned char *buffer, size_t size)
    {
        int ret = 0;
        ret = pcie_drv_dsp_read(dsp_id, addr, (unsigned char*) buffer, size);
    
        return ret;
    }
    
    
    int dio_write_mem_dma(unsigned dsp_id, unsigned addr, unsigned char *buffer, size_t size, pci_io_buf_info_t *dma_desc)
    {
        int ret = 0;
        unsigned int num_buffers;
        uint32_t transfer_buffers, transfer_size, remaining_size;
        int i;
        unsigned int offset, cpy_size, remaining_cpy_size;
        pci_io_buf_info_t host_dma_buf_desc;
    
        /* Calculate total number of host buffers required to fit the data */
        num_buffers = (size+dma_desc->buffer_size-1)/dma_desc->buffer_size;
        remaining_size = size;
        offset = 0;
    
        host_dma_buf_desc.buf_info = dma_desc->buf_info;
    
        while(num_buffers)
        {
            if(num_buffers > dma_desc->num_buffers)
            {
                transfer_buffers = dma_desc->num_buffers;
                transfer_size = transfer_buffers * dma_desc->buffer_size;
            }
            else
            {
                transfer_buffers = num_buffers;
                transfer_size = remaining_size;
            }
    
            /* Copy from buffer content into DMA buffers */
            remaining_cpy_size = transfer_size;
            for(i=0; i<transfer_buffers; i++)
            {
                cpy_size = (remaining_cpy_size < dma_desc->buffer_size) ? remaining_cpy_size : dma_desc->buffer_size;
                
                memcpy(host_dma_buf_desc.buf_info[i].virtAddr, (unsigned char *)buffer+offset, cpy_size);
                host_dma_buf_desc.buf_info[i].length = cpy_size;
    
                remaining_cpy_size -= cpy_size;
                offset+=cpy_size;
            }
            host_dma_buf_desc.num_buffers = transfer_buffers;
            host_dma_buf_desc.buffer_size = dma_desc->buffer_size;
    
            /* transfer data through DMA */
            pcie_drv_dma_write(dsp_id, addr, 
                &host_dma_buf_desc, transfer_size,
                DMA_CHAN_WR,
                DMA_PARAM_START_NUM+dma_desc->num_buffers);
    
            num_buffers -= transfer_buffers;
            remaining_size -=transfer_size;
            addr += transfer_size;
        }
        return ret;
    }
    
    
    int dio_read_mem_dma(unsigned dsp_id, unsigned addr, unsigned char *buffer, size_t size, pci_io_buf_info_t *dma_desc)
    {
        int ret = 0;
        unsigned int num_buffers;
        unsigned int transfer_buffers, transfer_size, remaining_size;
        int i;
        unsigned int offset, cpy_size, remaining_cpy_size;
        pci_io_buf_info_t host_dma_buf_desc;
    
        /* Calculate total number of host buffers required to fit the data */
        num_buffers = (size+dma_desc->buffer_size-1)/dma_desc->buffer_size;
        remaining_size = size;
        offset = 0;
    
        host_dma_buf_desc.buf_info = dma_desc->buf_info;
        
        while(num_buffers)
        {
            if(num_buffers > dma_desc->num_buffers)
            {
                transfer_buffers = dma_desc->num_buffers;
                transfer_size = transfer_buffers * dma_desc->buffer_size;
            }
            else
            {
                transfer_buffers = num_buffers;
                transfer_size = remaining_size;
            }
    
            host_dma_buf_desc.num_buffers = transfer_buffers;
            host_dma_buf_desc.buffer_size = dma_desc->buffer_size;
            
            /* read data through DMA */
            pcie_drv_dma_read(dsp_id, addr, &host_dma_buf_desc, transfer_size, DMA_CHAN_RD, DMA_PARAM_START_NUM);
    
            /* Copy from DMA buffers content into destination buffers */
            remaining_cpy_size = transfer_size;
            for(i=0; i<transfer_buffers; i++)
            {
                cpy_size = (remaining_cpy_size < dma_desc->buffer_size) ? remaining_cpy_size : dma_desc->buffer_size;
                
                memcpy((unsigned char *)buffer+offset, host_dma_buf_desc.buf_info[i].virtAddr, cpy_size);
    
                remaining_cpy_size -= cpy_size;
                offset+=cpy_size;
            }
            
            num_buffers -= transfer_buffers;
            remaining_size -=transfer_size;
            addr += transfer_size;
        }
        return ret;
    }
    
    
    int ihex_write_block(struct _ihex_parser* self, const unsigned char* block, unsigned address, unsigned length)
    {
        int ret;
        unsigned core_num;
    
        core_num = (self->user_value & 0xffff);
    
        if ((address & 0xff000000) == 0) // It points to 'per core' address space.
        {
            address |= ((core_num + 0x10) << 24);
        }
    
        ret = dio_write_mem(self->user_value >> 16, address, block, length);
    
        return 0;
    }
    
    
    enum
    {
        by_memcpy = 0,
        by_dma = 1
    };
    int load_binfile (unsigned dsp_id, const char* filename , unsigned start_addr, unsigned length, unsigned xfer_type)
    {
        int ret=0;
        char* block_buffer=NULL;
        FILE* fp;
        unsigned addr=start_addr;
        unsigned index=0;
        unsigned total_length=0;
        unsigned int finish=0;
        host_buf_info_t driver_dma_buf_info[TI667X_PCIE_MAX_IO_BUFFERS];
        pci_io_buf_info_t dev_dma_desc;
    
        memset(driver_dma_buf_info, 0, sizeof(driver_dma_buf_info));
        fp = fopen ( filename, "rb" );
        if ( !fp ) return -1;
    
        block_buffer = (char*) malloc ( BLOCK_BUF_SZ );
        if(block_buffer == NULL)
        {
            printf("Failed to alloc buffer for storing file block\n");
            fclose(fp);   /* do not leak the file handle on failure */
            return -1;
        }
    
        if(xfer_type == by_dma)
        {
                ret = pcie_drv_dma_mem_alloc(dsp_id, TI667X_PCIE_MAX_IO_BUFFERS, DSP_OB_REGION_SIZE, driver_dma_buf_info);
                if(ret != 0)
                {
                    free(block_buffer);
                    fclose(fp);   /* do not leak the file handle on failure */
                    return -1;
                }
                dev_dma_desc.num_buffers = TI667X_PCIE_MAX_IO_BUFFERS;
                dev_dma_desc.buffer_size = DSP_OB_REGION_SIZE;
                dev_dma_desc.buf_info = driver_dma_buf_info;
        }
    
        while ( !feof ( fp ) )
        {
            index = fread(block_buffer, 1, BLOCK_BUF_SZ, fp);
    
            if(index == 0)
                break;
    
            if((length !=0) && (total_length+index >= length))
            {
                index = length - total_length;
                finish = 1;
            }
    
            if(xfer_type == by_dma)
                dio_write_mem_dma(dsp_id, addr, block_buffer, index, &dev_dma_desc);
            else
                dio_write_mem(dsp_id, addr, block_buffer, index);
    
            total_length+=index;
            addr+=index;
    
            if(finish)
                break;
        }
    
        printf("Written to dsp %d bytes\n", total_length);
    
        if(xfer_type == by_dma)
            pcie_drv_dma_mem_free(dsp_id, driver_dma_buf_info);
        
        free ( block_buffer );
        fclose ( fp );
    
        return 0;
    }
    
    
    int save_binfile (unsigned dsp_id, const char* filename , unsigned start_addr, unsigned length, unsigned xfer_type)
    {
        char* block_buffer;
        FILE* fp;
        int ret=0;
        unsigned addr=start_addr;
        unsigned index=0;
        unsigned remaining_length=length;
        unsigned read_length;
        unsigned writed_length=0;
        host_buf_info_t driver_dma_buf_info[TI667X_PCIE_MAX_IO_BUFFERS];
        pci_io_buf_info_t dev_dma_desc;
    
        fp = fopen ( filename, "wb" );
        if ( !fp ) return -1;
    
        block_buffer = (char*) malloc ( BLOCK_BUF_SZ );
        if(block_buffer == NULL)
        {
            fclose(fp);   /* do not leak the open file on failure */
            return -1;
        }
    
        if(xfer_type == by_dma)
        {
                ret = pcie_drv_dma_mem_alloc(dsp_id, TI667X_PCIE_MAX_IO_BUFFERS, DSP_OB_REGION_SIZE, driver_dma_buf_info);
                if(ret != 0)
                {
                    free(block_buffer);
                    fclose(fp);   /* do not leak the open file on failure */
                    return -1;
                }
                dev_dma_desc.num_buffers = TI667X_PCIE_MAX_IO_BUFFERS;
                dev_dma_desc.buffer_size = DSP_OB_REGION_SIZE;
                dev_dma_desc.buf_info = driver_dma_buf_info;
        }
        
        do
        {
            read_length = (remaining_length >= BLOCK_BUF_SZ) ? BLOCK_BUF_SZ : remaining_length;
            if(xfer_type == by_dma)
                dio_read_mem_dma(dsp_id, addr, block_buffer, read_length, &dev_dma_desc);
            else
                dio_read_mem(dsp_id, addr, block_buffer, read_length);
    
            index = fwrite(block_buffer, 1, read_length, fp);
            if(index != read_length)
            {
                printf("ERROR: Failed to write file, expected %u bytes but wrote %u bytes\n", read_length, index);
                ret = -1;
                break;
            }
    
            writed_length+=index;
            addr+=read_length;
            remaining_length -= read_length;	
        }while(remaining_length != 0);
    
    
        printf("Saved from dsp %u bytes\n", writed_length);
    
        if(xfer_type == by_dma)
            pcie_drv_dma_mem_free(dsp_id, driver_dma_buf_info);
        
        free ( block_buffer );
        fclose ( fp );
    
        return ret;
    }
    
    
    /*-------------------------------------------------------------------------
     * functions used by DSP info
     *-----------------------------------------------------------------------*/
    int print_pci_info(unsigned int dsp_id)
    {
        dsp_dev_info_t dsp_info_list;
        struct pci_dev_info *pci_info;
        int index;	
        int ret = 0;
    
    
        ret = pcie_drv_get_dsp_dev_info(&dsp_info_list);
        if (ret != 0)
        {
            printf("ERROR: PCIe Drv failed to get PCI info of DSP\n");
            return -1;
        }
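        /* Reject chip ids beyond the number of DSPs the driver reported */
        if (dsp_id >= (unsigned int)dsp_info_list.dsp_amount)
        {
            printf("ERROR: DSP chip id %u is out of range\n", dsp_id);
            free(dsp_info_list.dsp_info);
            return -1;
        }
        pci_info = &dsp_info_list.dsp_info[dsp_id];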
    
        printf("==============================================\n");
        printf("Chip: %d\n",dsp_id);				
        printf("\tPCI Bridge %d\n", pci_info->bridge_pri);
        printf("\tPCI Bus Num: %d\n", pci_info->bus_number);		
        printf("\tVendor ID: 0x%04x\tDevice ID: 0x%04x\n", pci_info->vendor, pci_info->device);
        printf("\tSubsystem VendorID: 0x%04x\tSubsystem DevID: 0x%04x\n", pci_info->subsystem_vendor, pci_info->subsystem_device);
        printf("\tClass: 0x%08x\n", pci_info->class);	
        printf("\tHeader Type: %d\tIrq Pin: %d\n", pci_info->hdr_type, pci_info->pin);			
        printf("\tBAR Configuration:\n");
        printf("\tStart\t\t|\tLength\t\t|\tFlags\n");
        for (index = 0; index < 4; index++)
        {
            printf("\t0x%08lx\t|\t%08ld\t|\t0x%08lx\n",
                pci_info->bar_info[index].bar_start, pci_info->bar_info[index].bar_len, pci_info->bar_info[index].bar_flags);
        }
        printf("==============================================\n");
    
        free(dsp_info_list.dsp_info);
        return ret;	
    
    }
    
    int list_pci_info(void)
    {
        dsp_dev_info_t dsp_info_list;
        int index;
        int card_index = 0;
        unsigned char last_bridge_pri = 255;
        int ret = 0;
    
        ret = pcie_drv_get_dsp_dev_info(&dsp_info_list);
        if (ret != 0)
        {
            printf("ERROR: PCIe Drv failed to get PCI info of DSP\n");
            return -1;
        }
    
        for (index=0; index<dsp_info_list.dsp_amount; index++)
        {
            if(last_bridge_pri != dsp_info_list.dsp_info[index].bridge_pri)
            {
                card_index++;
                printf("Card %d:\n", card_index);
            }
            printf("\t[Chip %d] Device %04x\n", index, dsp_info_list.dsp_info[index].subsystem_device);
    
            last_bridge_pri = dsp_info_list.dsp_info[index].bridge_pri;
        }
    
        free(dsp_info_list.dsp_info);
        return ret;	
    }
    
    
    /*-------------------------------------------------------------------------
     * functions used by DSP reset
     *-----------------------------------------------------------------------*/
    /**
     *  @brief Function dsp_local_write() writes to DSP memory; core-local
     *         addresses are translated to the global address of core_id first
     *  @param[in]     dsp_id         DSP Chip ID
     *  @param[in]     core_id        Core ID within the chip specified by dsp_id
     *  @param[in]     addr           DSP Address to write
     *  @param[in]     buf            Source buffer pointer
     *  @param[in]     size           Size of the write in bytes
     *  @retval        0 for success, -1 for failure
     *  @pre  
     *  @post 
     */
    static int32_t dsp_local_write(int32_t dsp_id, int32_t core_id, uint32_t addr, uint8_t *buf, uint32_t size)
    {
        int32_t ret;
    
        if (IS_CORE_LOCAL_ADDR(addr)) // It points to 'per core' address space.
        {
            addr = LOCAL_TO_GLOBAL_ADDR(addr, core_id);
        }
    
        ret = pcie_drv_dsp_write(dsp_id, addr, (unsigned char *)buf, size);
    
        return(ret);	
    }
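    
    /*
     * Illustrative sketch, not part of the original listing. The macros
     * IS_CORE_LOCAL_ADDR and LOCAL_TO_GLOBAL_ADDR used above are defined
     * elsewhere and are not shown here. On the C6678, a CorePac's local
     * L2/L1P/L1D addresses (0x00800000 to 0x00FFFFFF) are visible to other
     * masters, such as PCIe, at 0x10000000 + (core << 24) + local address.
     * The hypothetical helper below shows that mapping; the exact range test
     * is an assumption and may differ from what the real macros check.
     */
    static uint32_t ex_local_to_global(uint32_t addr, uint32_t core_id)
    {
        /* Core-local window: L2 at 0x00800000 up to the end of L1D */
        if (addr >= 0x00800000u && addr < 0x01000000u)
            return 0x10000000u + (core_id << 24) + addr;

        /* Anything else (MSMC, DDR3, MMRs) is already a global address */
        return addr;
    }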
    
    
    /**
     *  @brief Function dsp_reg_read() Read register
     *  @param[in]     dsp_id         DSP Chip ID
     *  @param[in]     addr           DSP Register Address to read
     *  @retval        Register value
     *  @pre  
     *  @post 
     */
    static uint32_t dsp_reg_read(uint32_t dsp_id, uint32_t addr)
    {
        unsigned value = 0;   /* defined result even if the read fails */
        int32_t ret_value;
    
        ret_value = pcie_drv_dsp_read(dsp_id, addr, (unsigned char *)&value, 4); 
        if(ret_value != 0)
        {
            fprintf(stderr, "\n dsp_reg_read: ERROR Reading Address %x\n", addr);
        }
        return(value);
    }
    /**
     *  @brief Function dsp_reg_write() Write register
     *  @param[in]     dsp_id         DSP Chip ID
     *  @param[in]     addr           DSP Register Address to write
     *  @param[in]     value          Value to write to the register
     *  @retval        none
     *  @pre  
     *  @post 
     */
    
    static void dsp_reg_write(unsigned dsp_id, uint32_t addr, uint32_t value)
    {
        unsigned value_local;
        int32_t ret_value;
    
        value_local = value;
        ret_value = pcie_drv_dsp_write(dsp_id, addr, (unsigned char *)&value_local, 4);  
        if(ret_value != 0)
        {
            fprintf(stderr, "\n dsp_reg_write: ERROR Writing to Address %x\n", addr);
        }
    }
    
    /* ============================================================================
    *  @func   coreLocalReset
    *
    *  @desc   Assert or de-assert local reset for a particular CorePac
    *          (6678 Data Manual, section 7.4.4), initiated by LPSC MMRs.
    *          state == 0 asserts local reset; any other value de-asserts it.
    *
    *  @modif  None.
    *  ============================================================================
    */
    void coreLocalReset(uint32_t dsp_id, uint32_t pid, uint32_t mid, uint32_t state) 
    {
        uint32_t temp, counter = 0;
    
        temp =  dsp_reg_read(dsp_id, PSC_BASE_ADDRESS + MDCTL(mid));  
        if (state == 0)
        {
            /* Reset assert */
            temp = ((temp & ~0x1F) | PSC_ENABLE) & (~0x100);
        }
        else
        {
            /* Reset de-assert */
            temp = (temp & ~0x1F) | PSC_ENABLE | (1 << 8);
        }
    
        /* Assert/De-assert local reset */
        dsp_reg_write(dsp_id, PSC_BASE_ADDRESS + MDCTL(mid), temp);
    
        /* Wait until no previous transition is in progress */
        counter = 0;
        while (1)
        {
            temp = dsp_reg_read(dsp_id, PSC_BASE_ADDRESS + PTSTAT);
            if ((temp & (1 << pid)) == 0)
                break;
    
            usleep(1000);
            counter ++;
            if (counter > 10)
            {
                fprintf(stderr, "\ncoreLocalReset: Previous transition in progress dsp_id %d pid %d mid %d state: %d\n", dsp_id, pid, mid, state);
                break;
            }
        }
    
        dsp_reg_write(dsp_id, PSC_BASE_ADDRESS + PTCMD, (1 << pid)); 
    
        /* Wait for the current transition to finish */
        counter = 0;
        while (1)
        {
            temp =  dsp_reg_read(dsp_id, PSC_BASE_ADDRESS + PTSTAT);
            if ((temp & (1 << pid)) == 0)
                break;
    
            usleep(1000);
            counter ++;
            if (counter > 10)
            {
                fprintf(stderr, "\ncoreLocalReset: Current transition in progress dsp_id %d pid %d mid %d state: %d\n", dsp_id, pid, mid, state);
                break;
            }
        }
    
        /* Verifying state change */
        counter = 0;
        while (1)
        {
            temp =  dsp_reg_read(dsp_id, PSC_BASE_ADDRESS + MDSTAT(mid));
            if ((temp & 0x1F) == 3)
                break;
    
            usleep(1000);
            counter ++;
            if (counter > 10)
            {
                fprintf(stderr, "\ncoreLocalReset: MD stat for dsp_id %d pid %d mid %d state: %d timeout\n", dsp_id, pid, mid, state);
                break;
            }
        }
    
    }
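    
    /*
     * Illustrative sketch, not part of the original listing: the MDCTL value
     * that coreLocalReset() writes can be computed by a pure function, which
     * makes the bit layout easy to check on the host without touching any
     * hardware. It assumes NEXT is MDCTL bits 4:0 with 3 meaning "enable" (the
     * value coreLocalReset() later expects in MDSTAT) and LRSTZ is bit 8,
     * where 0 asserts local reset and 1 de-asserts it. The helper name and
     * the EX_ constants are hypothetical.
     */
    #define EX_MDCTL_NEXT_MASK   0x1Fu
    #define EX_MDCTL_NEXT_ENABLE 0x03u
    #define EX_MDCTL_LRSTZ       (1u << 8)
    
    static uint32_t ex_mdctl_local_reset_value(uint32_t mdctl, int deassert)
    {
        /* Keep the module enabled while local reset is toggled */
        mdctl = (mdctl & ~EX_MDCTL_NEXT_MASK) | EX_MDCTL_NEXT_ENABLE;
    
        if (deassert)
            mdctl |= EX_MDCTL_LRSTZ;    /* LRSTZ = 1: release local reset */
        else
            mdctl &= ~EX_MDCTL_LRSTZ;   /* LRSTZ = 0: hold the core in local reset */
    
        return mdctl;
    }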
    
    
    /* ============================================================================
    *  @func   setPDState
    *
    *  @desc   Set a new power state for the specified power domain in the power
    *          controller. Wait for the power transition to complete.
    *
    *      pid   -  power domain.
    *      state -  new state value to set (1 = ON; 0 = OFF )
    *
    *  @modif  None.
    *  ============================================================================
    */
    void setPDState(uint32_t dsp_id, uint32_t pid, uint32_t state)
    {
        uint32_t pdctl, temp, counter = 0;
    
        pdctl = dsp_reg_read(dsp_id, PSC_BASE_ADDRESS + PDCTL(pid));
    
        /* No previous transition in progress */
        counter = 0;
        while (1)
        {
            temp = dsp_reg_read(dsp_id, PSC_BASE_ADDRESS + PTSTAT);
            if ((temp & (1 << pid)) == 0)
                break;
    
            usleep(1000);
            counter ++;
            if (counter > 10)
            {
                fprintf(stderr, "\n setPDState: dsp_id %d: Previous transition in progress pid %d state: %d\n",dsp_id, pid, state);
                break;
            }
        }
        
        /* Set power domain control */
        dsp_reg_write(dsp_id,  PSC_BASE_ADDRESS + PDCTL(pid), ((pdctl & (~ 0x1)) | state)); 
    
        /* Start power transition by setting PTCMD GO to 1 */
        temp = dsp_reg_read(dsp_id, PSC_BASE_ADDRESS + PTCMD);
        dsp_reg_write(dsp_id,  PSC_BASE_ADDRESS + PTCMD, temp | (0x1 << pid)); 
    
        /* Current transition finished */
        counter = 0;
        while (1)
        {
            temp = dsp_reg_read(dsp_id, PSC_BASE_ADDRESS + PTSTAT);
            if ((temp & (1 << pid)) == 0)
                break;
    
            usleep(1000);
            counter ++;
            if (counter > 10)
            {
                fprintf(stderr, "\n setPDState: dsp_id %d: Current transition in progress pid %d state: %d\n", dsp_id, pid, state);
                break;
            }
        }
    
    }
    /* ============================================================================
    *  @func   setPscState
    *
    *  @desc   Set a new state for the specified module in a power controller
    *          domain. Wait for the transition to complete.
    *
    *      pid   -  power domain.
    *      mid   -  module id to use for module in the specified power domain
    *      state -  new state value to set (0 = RESET; 3 = ENABLE)
    *
    *  @modif  None.
    *  ============================================================================
    */
    void setPscState(uint32_t dsp_id, uint32_t pid, uint32_t mid, uint32_t state)
    {
        uint32_t mdctl, pdctl, temp, counter = 0;
    
        mdctl = dsp_reg_read(dsp_id, PSC_BASE_ADDRESS + MDCTL(mid));
        pdctl = dsp_reg_read(dsp_id, PSC_BASE_ADDRESS + PDCTL(pid));
    
        /* No previous transition in progress */
        counter = 0;
        while (1)
        {
            temp = dsp_reg_read(dsp_id, PSC_BASE_ADDRESS + PTSTAT);
            if ((temp & (1 << pid)) == 0)
                break;
    
            usleep(1000);
            counter ++;
            if (counter > 10)
            {
                fprintf(stderr, "\nsetPscState: dsp_id %d: Previous transition in progress pid %d mid %d state: %d\n",dsp_id, pid, mid, state);
                break;
            }
        }
    
        /* Set power domain control */
        dsp_reg_write(dsp_id,  PSC_BASE_ADDRESS + PDCTL(pid), (pdctl | 0x1)); 
    
        /* Set MDCTL NEXT to new state */
        mdctl = ((mdctl) & ~(0x1f)) | state;
        dsp_reg_write(dsp_id,  PSC_BASE_ADDRESS + MDCTL(mid), mdctl); 
    
        /* Start power transition by setting PTCMD GO to 1 */
        temp = dsp_reg_read(dsp_id, PSC_BASE_ADDRESS + PTCMD);
        dsp_reg_write(dsp_id,  PSC_BASE_ADDRESS + PTCMD, temp | (0x1 << pid)); 
    
        /* Current transition finished */
        counter = 0;
        while (1)
        {
            temp = dsp_reg_read(dsp_id, PSC_BASE_ADDRESS + PTSTAT);
            if ((temp & (1 << pid)) == 0)
                break;
    
            usleep(1000);
            counter ++;
            if (counter > 10)
            {
                fprintf(stderr, "\n setPscState: dsp_id %d: Current transition in progress pid %d mid %d state: %d\n", dsp_id, pid, mid, state);
                break;
            }
        }
    
        /* Verifying state change */
        counter = 0;
        while (1)
        {
            temp = dsp_reg_read(dsp_id, PSC_BASE_ADDRESS + MDSTAT(mid));
            if ((temp & 0x1F) == state)
                break;
    
            usleep(1000);
            counter ++;
            if (counter > 10)
            {
                fprintf(stderr, "\nsetPscState: dsp_id %d: MD stat for pid %d mid %d expected state: %d state: %d timeout\n", dsp_id, pid, mid, state, (temp & 0x1f));
                break;
            }
        }
    
    }
    /* ============================================================================
    *  @func   set_PA_state_to_reset
    *
    *  @desc   Set the PA PDSPs to reset state 
    *
    *      dsp_id   -  Dsp id
    *
    *  @modif  None.
    *  ============================================================================
    */
    void set_PA_state_to_reset(uint32_t dsp_id)
    {
        int i;
    
        /* Put each of the PDSPs into reset */
        for (i = 0; i < 6; i++)
        {
            dsp_reg_write(dsp_id,  CSL_PA_SS_CFG_REGS + PDSP_CONTROL_OFFSET(i), 0); 
        }
    
        /* Reset packet ID */
        dsp_reg_write(dsp_id,  CSL_PA_SS_CFG_REGS + PDSP_PKT_ID_SOFT_RESET_OFFSET, 1); 
    
        /* Reset LUT2 */
        dsp_reg_write(dsp_id,  CSL_PA_SS_CFG_REGS + PDSP_LUT2_SOFT_RESET_OFFSET, 1); 
    
        /* Reset statistics */
        dsp_reg_write(dsp_id,  CSL_PA_SS_CFG_REGS + PDSP_STATS_SOFT_RESET_OFFSET, 1); 
    
        /* Reset timers */
        for (i = 0; i < 6; i++)
            dsp_reg_write(dsp_id,  CSL_PA_SS_CFG_REGS + PDSP_TIMER_CNTRL_REG_OFFSET(i), 0); 
    
        usleep(100);
    }
    /* ============================================================================
    *  @func   set_SA_state_to_reset
    *
    *  @desc   Set the SA PDSPs to reset state 
    *
    *      dsp_id   -  Dsp id
    *
    *  @modif  None.
    *  ============================================================================
    */
    void set_SA_state_to_reset(uint32_t dsp_id)
    {
        int i;
    
        /* Disable and reset PDSPs */
        for (i = 0; i < 2; i++)
        {
            dsp_reg_write(dsp_id,  CSL_PA_SS_CFG_CP_ACE_CFG_REGS + PDSP_CONTROL_OFFSET(i), 0); 
        }
        usleep(100);
    }
    /* ============================================================================
    *  @func   setBootAddrIpcgr
    *
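    *  @desc   Unlock the KICK registers and write the boot entry point into
    *          DSP_BOOT_ADDR for the given core. The address must be 1 KB
    *          aligned. Returns 1 on success, 0 on failure. Despite the name,
    *          this function does not send an IPC interrupt.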
    *
    *  @modif  None.
    *  ============================================================================
    */
    uint32_t setBootAddrIpcgr(uint32_t dsp_id, uint32_t core, uint32_t addr)  
    {
    
        /* Unlock KICK0, KICK1 */
        dsp_reg_write(dsp_id,  CHIP_LEVEL_BASE_ADDRESS + KICK0, KICK0_UNLOCK); 
        dsp_reg_write(dsp_id,  CHIP_LEVEL_BASE_ADDRESS + KICK1, KICK1_UNLOCK); 
    
        /* The boot address must be 1 KB aligned: the low 10 bits must be 0 */
        if ((addr & 0x3ff) != 0)
        {
            fprintf(stderr, "\nsetBootAddrIpcgr: The address is not 1K aligned.\n");
            return 0;
        }
    
        dsp_reg_write(dsp_id,  CHIP_LEVEL_BASE_ADDRESS + DSP_BOOT_ADDR(core), addr); 
        
        usleep(1000); 
    
        return 1;
    }
    
    
    /* ============================================================================
    *  @func   byteTo32bits
    *
    *  @desc   Convert 4 bytes (most significant byte first) to a 32-bit word
    *
    *  @modif  None.
    *  ============================================================================
    */
    uint32_t byteTo32bits(uint8_t *pDspCode)
    {
        uint32_t i, temp;
    
        temp = *pDspCode++;
        for(i = 0; i < 3;i++)
        {
            temp <<= 8;
            temp |= *pDspCode++;
        }
        return(temp);
    }
    
    /* ============================================================================
    *  @func   pushData
    *
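    *  @desc   Parser function for DSP boot image array. The image starts with
    *          a 32-bit boot entry address, followed by records of
    *          {32-bit size, 32-bit start address, payload padded to 4 bytes},
    *          and ends with a size of 0. All words are most significant
    *          byte first.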
    *
    *  @modif  None.
    *  ============================================================================
    */
    void pushData(uint32_t dsp_id, uint8_t *pDspCode, uint8_t coreNum, uint32_t *bootEntryAddr)
    {
        uint32_t i, j, tempArray[DSP_LOAD_BLOCK_TRANSFER_SIZE/4];
        uint32_t size, section = 0, totalSize = 0;
        uint32_t count, remainder, startaddr;
    
        /* Get the boot entry address */
        *bootEntryAddr = byteTo32bits(pDspCode);
        pDspCode +=4;
    
        while(1)
        {
            /* Get the size */
            size = byteTo32bits(pDspCode);
            if(size == 0)
                break;
    
            if ((size/4)*4 != size)
            {
                size = ((size/4)+1)*4;
            }
    
            totalSize += size;
            section++;
            pDspCode += 4;
            startaddr = byteTo32bits(pDspCode);
    
            pDspCode+= 4;
    
            count = size/DSP_LOAD_BLOCK_TRANSFER_SIZE;
    
            remainder = size - count * DSP_LOAD_BLOCK_TRANSFER_SIZE;
    
            for(i = 0; i < count; i++)
            {
                for (j = 0; j < DSP_LOAD_BLOCK_TRANSFER_SIZE/4; j++)
                {
                    tempArray[j] = byteTo32bits(pDspCode);
                    pDspCode += 4;
                }
                /* Transfer boot tables to DSP */
                dsp_local_write(dsp_id, coreNum, startaddr, (uint8_t *)tempArray, DSP_LOAD_BLOCK_TRANSFER_SIZE); 
                startaddr += DSP_LOAD_BLOCK_TRANSFER_SIZE;
            }
    
            for (j = 0; j < remainder/4; j++)
            {
                tempArray[j] = byteTo32bits(pDspCode);
                pDspCode += 4;
            }
            /* Skip the trailing write when the section is a whole number of blocks */
            if (remainder != 0)
                dsp_local_write(dsp_id, coreNum, startaddr, (uint8_t *)tempArray, remainder); 
        }
    }
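    
    /*
     * Illustrative sketch, not part of the original listing: pushData() above
     * walks the boot image until it finds a zero size word and never checks
     * how long the array actually is. The hypothetical helper below walks the
     * same layout (entry word, then {size, address, padded payload} records,
     * then a zero size word) against an explicit length, so a truncated image
     * could be rejected before anything is written to the DSP. The name
     * ex_boot_image_check() and its return convention (0 = well formed,
     * -1 = truncated) are assumptions.
     */
    static int ex_boot_image_check(const uint8_t *img, size_t len, uint32_t *sections)
    {
        size_t pos = 4;                 /* skip the 4-byte entry address */
        uint32_t count = 0;
    
        if (img == NULL || len < 8)
            return -1;                  /* need entry word plus terminator */
    
        for (;;)
        {
            uint32_t size;
    
            if (len - pos < 4)
                return -1;              /* no room for a size word */
            size = ((uint32_t)img[pos] << 24) | ((uint32_t)img[pos + 1] << 16) |
                   ((uint32_t)img[pos + 2] << 8) | (uint32_t)img[pos + 3];
            pos += 4;
            if (size == 0)
                break;                  /* terminator record */
    
            if (size > UINT32_MAX - 3u)
                return -1;              /* padding would overflow */
            size = (size + 3u) & ~3u;   /* payload is padded to 4 bytes */
    
            if (len - pos < 4 || len - pos - 4 < size)
                return -1;              /* address word or payload truncated */
            pos += 4 + (size_t)size;
            count++;
        }
    
        if (sections != NULL)
            *sections = count;
        return 0;
    }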
    
    
    #define LOOP_CODE_LOCATION_ADDR 0xc000000
    #define IDLE_INSTRUCTION_CODE   0x0001E000
    int32_t downloadSimpleLoopCode(int32_t dsp_id)
    {
        int i;
        int ret = 0;
        uint32_t write_value;
    
        write_value = IDLE_INSTRUCTION_CODE;
    
        /* Write 8 Idle instructions */
        for(i=0; i< 8; i++)
        {
            ret = pcie_drv_dsp_write(dsp_id, (LOOP_CODE_LOCATION_ADDR+i*4), (uint8_t *)(&write_value)  , 4);
            if(ret != 0)
            {
                fprintf(stderr, "\nERROR: downloadSimpleLoopCode: dsp_id %d Failed writing loop code\n", dsp_id);
                return -1;
            }
        }
        return(ret);
    }
    
    
    /**
     *  @brief Function dio_put_dsp_in_reset() Puts DSP in reset state
     *  @param[in]     dsp_id           DSP Chip Id
     *  @retval        0 for success, -1 for failure
     *  @pre  
     *  @post 
     */
    int32_t dio_put_dsp_in_reset(int32_t dsp_id)
    {
        /* Local reset of all cores */
        coreLocalReset(dsp_id, PD8,  LPSC_C0_TIM0, LOC_RST_ASSERT);
        coreLocalReset(dsp_id, PD9,  LPSC_C1_TIM1, LOC_RST_ASSERT);
        coreLocalReset(dsp_id, PD10, LPSC_C2_TIM2, LOC_RST_ASSERT);
        coreLocalReset(dsp_id, PD11, LPSC_C3_TIM3, LOC_RST_ASSERT);
        coreLocalReset(dsp_id, PD12, LPSC_C4_TIM4, LOC_RST_ASSERT);
        coreLocalReset(dsp_id, PD13, LPSC_C5_TIM5, LOC_RST_ASSERT);
        coreLocalReset(dsp_id, PD14, LPSC_C6_TIM6, LOC_RST_ASSERT);
        coreLocalReset(dsp_id, PD15, LPSC_C7_TIM7, LOC_RST_ASSERT);
    
        /* Disable all other modules */
        setPscState(dsp_id, PD0, LPSC_MODRST0, PSC_SWRSTDISABLE);
        setPscState(dsp_id, PD0, LPSC_EMIF16_SPI, PSC_SWRSTDISABLE);
        setPscState(dsp_id, PD0, LPSC_TSIP, PSC_SWRSTDISABLE);
        setPscState(dsp_id, PD1, LPSC_DEBUG, PSC_SWRSTDISABLE);
        setPscState(dsp_id, PD1, LPSC_TETB_TRC, PSC_SWRSTDISABLE);
        setPscState(dsp_id, PD4, LPSC_SRIO, PSC_SWRSTDISABLE);
        setPscState(dsp_id, PD5, LPSC_HYPER, PSC_SWRSTDISABLE);
        setPscState(dsp_id, PD7, LPSC_MSMCRAM, PSC_SWRSTDISABLE);
    
        /* Put Netcp components in safe state for reset */
        set_SA_state_to_reset(dsp_id);
        set_PA_state_to_reset(dsp_id);
    
        setPscState(dsp_id, PD2, LPSC_SA, PSC_SWRSTDISABLE); 
        setPscState(dsp_id, PD2, LPSC_SGMII, PSC_SWRSTDISABLE);
        setPscState(dsp_id, PD2, LPSC_PA, PSC_SWRSTDISABLE);
        /* Turn off power domain for NETCP */
        setPDState(dsp_id, PD2, PSC_PD_OFF);
    
        return 0;
    }
    
    
    /**
     *  @brief Function dio_bring_dsp_out_reset() brings the dsp out of reset state
     *         The precompiled boot image is loaded to DSP, prior to bringing the dsp
     *         out of reset, and then DSP jumps to entry point.
     *  @param[in]     dsp_id           DSP Chip Id
     *  @retval        0 for success, -1 for failure
     *  @pre  
     *  @post 
     */
    int32_t dio_bring_dsp_out_reset(int32_t dsp_id)
    {
        int ret = 0;
        int i;
        uint32_t bootEntryAddr;
    
        /* Bring MSMCRAM out of reset to allow writing to MSMC */
        setPscState(dsp_id, PD7, LPSC_MSMCRAM, PSC_ENABLE);
    
        /*-------------------------------------------------------------------------
        * The following code is a workaround to flush the cache. Without it,
        * any dirty cache lines in L1D may corrupt the downloaded image.
        */
        ret = downloadSimpleLoopCode(dsp_id);
        if(ret !=0)
        {
            fprintf(stderr, "\nERROR: dio_bring_dsp_out_reset: dsp_id %d Failed download loop code",dsp_id);
            return(-1);
        }
        for (i = 0; i < TOTAL_NUM_CORES_PER_CHIP; i++)
        {
            if (setBootAddrIpcgr(dsp_id, i, LOOP_CODE_LOCATION_ADDR ) == 0)
            {
                fprintf(stderr, "\nERROR: dio_bring_dsp_out_reset: dsp_id %d, Core %d  set boot address failed !!! ", dsp_id, i);
                return(-1);
            }
        }   
        /* Enable required modules */
        setPscState(dsp_id, PD0, LPSC_MODRST0, PSC_ENABLE);
    
        /* Local out of reset of all cores */
        coreLocalReset(dsp_id, PD8,  LPSC_C0_TIM0, LOC_RST_DEASSERT);
        coreLocalReset(dsp_id, PD9,  LPSC_C1_TIM1, LOC_RST_DEASSERT);
        coreLocalReset(dsp_id, PD10, LPSC_C2_TIM2, LOC_RST_DEASSERT);
        coreLocalReset(dsp_id, PD11, LPSC_C3_TIM3, LOC_RST_DEASSERT);
        coreLocalReset(dsp_id, PD12, LPSC_C4_TIM4, LOC_RST_DEASSERT);
        coreLocalReset(dsp_id, PD13, LPSC_C5_TIM5, LOC_RST_DEASSERT);
        coreLocalReset(dsp_id, PD14, LPSC_C6_TIM6, LOC_RST_DEASSERT);
        coreLocalReset(dsp_id, PD15, LPSC_C7_TIM7, LOC_RST_DEASSERT);
    
        /* Local reset of all cores */
        coreLocalReset(dsp_id, PD8,  LPSC_C0_TIM0, LOC_RST_ASSERT);
        coreLocalReset(dsp_id, PD9,  LPSC_C1_TIM1, LOC_RST_ASSERT);
        coreLocalReset(dsp_id, PD10, LPSC_C2_TIM2, LOC_RST_ASSERT);
        coreLocalReset(dsp_id, PD11, LPSC_C3_TIM3, LOC_RST_ASSERT);
        coreLocalReset(dsp_id, PD12, LPSC_C4_TIM4, LOC_RST_ASSERT);
        coreLocalReset(dsp_id, PD13, LPSC_C5_TIM5, LOC_RST_ASSERT);
        coreLocalReset(dsp_id, PD14, LPSC_C6_TIM6, LOC_RST_ASSERT);
        coreLocalReset(dsp_id, PD15, LPSC_C7_TIM7, LOC_RST_ASSERT);
    
        /* Disable the enabled modules for cache flush*/
        setPscState(dsp_id, PD0, LPSC_MODRST0, PSC_SWRSTDISABLE);
        /*-------------------------------------------------------------------------*/
    
        
        for (i = 0; i < TOTAL_NUM_CORES_PER_CHIP; i++)
        {
            pushData(dsp_id, localResetCode, i, &bootEntryAddr);
            if (setBootAddrIpcgr(dsp_id, i, bootEntryAddr) == 0)
            {
                fprintf(stderr, "\nERROR: dio_bring_dsp_out_reset: dsp_id %d Core %d is not ready !!! ", dsp_id, i);
                return(-1);
            }
        }
        
    
        /* Enable all other modules */
        setPscState(dsp_id, PD0, LPSC_MODRST0, PSC_ENABLE);
        setPscState(dsp_id, PD0, LPSC_EMIF16_SPI, PSC_ENABLE);
        setPscState(dsp_id, PD0, LPSC_TSIP, PSC_ENABLE);
        setPscState(dsp_id, PD1, LPSC_DEBUG, PSC_ENABLE);
        setPscState(dsp_id, PD1, LPSC_TETB_TRC, PSC_ENABLE);
    
        /* Local out of reset of all cores */
        coreLocalReset(dsp_id, PD8,  LPSC_C0_TIM0, LOC_RST_DEASSERT);
        coreLocalReset(dsp_id, PD9,  LPSC_C1_TIM1, LOC_RST_DEASSERT);
        coreLocalReset(dsp_id, PD10, LPSC_C2_TIM2, LOC_RST_DEASSERT);
        coreLocalReset(dsp_id, PD11, LPSC_C3_TIM3, LOC_RST_DEASSERT);
        coreLocalReset(dsp_id, PD12, LPSC_C4_TIM4, LOC_RST_DEASSERT);
        coreLocalReset(dsp_id, PD13, LPSC_C5_TIM5, LOC_RST_DEASSERT);
        coreLocalReset(dsp_id, PD14, LPSC_C6_TIM6, LOC_RST_DEASSERT);
        coreLocalReset(dsp_id, PD15, LPSC_C7_TIM7, LOC_RST_DEASSERT);
    
        return 0;
    }
    
    
    /*-------------------------------------------------------------------------
     * main function entry 
     *-----------------------------------------------------------------------*/
    static int do_version(int argc, char** argv)
    {
        printf("%s\n", __SVN_VERSION);
    
        return 0;
    }
    
    
    static int do_dsp_load(int argc, char** argv)
    {
        unsigned values[3];
        int ret = 0;
        ihex_parser ihex;
    
    
        enum
        {
            chip = 0, core = 1, entry = 2
        };
    
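        /* argv[5] is the HEX file name, so at least 6 arguments are required */
        if (argc < 6)
        {
            printf("ERROR: missing HEX file name\n");
            return -1;
        }
    
        ihex_parser_init(&ihex);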
    
        ret = get_param_value_array(argc, argv, 2, 3, values);
        if (ret)
        {
            printf("ERROR: failed to get_param_value_array\n");
            return -1;
        }
    
        ret = pcie_drv_open();
        if (0 != ret)
        {
            printf("ERROR: Failed to open PCIe driver\n");
            return -1;
        }
    
    
        // Load HEX file to DSP memory
        printf("Load HEX image: %s to %d:%d, start address 0x%08x\n", argv[5], values[chip], values[core], values[entry]);
    
        ihex.write_block = ihex_write_block;
        ihex.user_value = (values[chip] << 16) | values[core];
    
        /* transfer hex file here */
        ret = ihex_parser_readfile(&ihex, argv[5]);
    
        ihex_parser_close(&ihex);
        if (ret)
        {
            fprintf(stderr, "ERROR: Load HEX error\n");
            pcie_drv_close();   /* release the driver on the error path too */
            return ret;
        }
        printf("Load HEX OK\n");
    
        ret = pcie_drv_dsp_set_entry_point(values[chip], values[core], values[entry]);
        if(values[core] == 0)
        {
            pcie_drv_set_ep_config(values[chip], PCIEDRV_PCIEEP_SET_INTA_SET);  //make a PCIe interrupt to trigger DSP
            usleep(10000);
            pcie_drv_set_ep_config(values[chip], PCIEDRV_PCIEEP_SET_INTA_CLR);  //clear PCIe interrupt
        }
    
        pcie_drv_close();
        return ret;
    }
    
    
    static int do_dsp_query(int argc, char** argv)
    {
        unsigned chip=0;
        int ret = 0;
    
        ret = pcie_drv_open();
        if (0 != ret)
        {
            printf("ERROR: Failed to open PCIe driver\n");
            return -1;
        }
    
        if ((strcmp(argv[2], "-l")==0) || (strcmp(argv[2], "list")==0))
        {
            ret=list_pci_info();
        }
        else
        {
            ret = get_param_value_array(argc, argv, 2, 1, &chip);
            if (ret)
            {
                printf("ERROR: failed to get_param_value_array\n");
                pcie_drv_close();
                return -1;				
            }
            ret=print_pci_info(chip);		
        }
    
        pcie_drv_close();
        return ret;
    
    }
    
    static int do_dsp_wmem(int argc, char** argv)
    {
        unsigned values[3];
        int ret = 0;
    
        enum
        {
            chip = 0, address = 1, data = 2
        };
    
        ret = get_param_value_array(argc, argv, 2, 3, values);
        if (ret)
        {
            printf("ERROR: failed to get_param_value_array\n");
            return -1;
        }
    
        ret = pcie_drv_open();
        if (0 != ret)
        {
            printf("ERROR: Failed to open PCIe driver\n");
            return -1;
        }
    
        ret = dio_write_mem(values[chip], values[address], (unsigned char *) &values[data], 4);
        if (ret)
        {
            printf("ERROR: failed to write memory to device\n");
            ret = -1;
        }
        pcie_drv_close();
        return ret;
    }
    
    
    static int do_dsp_rmem(int argc, char** argv)
    {
        unsigned values[2];
        int ret = 0;
        unsigned int buffer;
    
        enum
        {
            chip = 0, address = 1
        };
    
        ret = get_param_value_array(argc, argv, 2, 2, values);
        if (ret)
        {
            printf("ERROR: failed to get_param_value_array\n");
            return -1;
        }
    
        ret = pcie_drv_open();
        if (0 != ret)
        {
            printf("ERROR: Failed to open PCIe driver\n");
            return -1;
        }
        
        ret = dio_read_mem(values[chip], values[address], (unsigned char *) &buffer, 4);
        if (ret)
        {
            printf("ERROR: failed to read memory from device\n");
            pcie_drv_close();
            return -1;
        }
    
        printf("0x%08x\n", buffer);
    
        pcie_drv_close();
        return ret;
    }
    
    
    static int do_dsp_load_bin(int argc, char** argv)
    {
        unsigned values[5];
        int ret = 0;
        char dev_name[128];
        pcie_ioctl_t ioparam;
    
    #ifdef time_measure
        struct timeval time_measure1, time_measure2;
        unsigned int diff;
    #endif
    
        enum
        {
            chip = 0, entry = 1, size = 2 , xfer_type = 3
        };
    
    
        ret = get_param_value_array(argc, argv, 2, 4, values);
        if (ret)
        {
            printf("ERROR: failed to get_param_value_array\n");
            return -1;
        }
    
        ret = pcie_drv_open();
        if (0 != ret)
        {
            printf("ERROR: Failed to open PCIe driver\n");
            return -1;
        }
    
        // Load BIN file to DSP memory
        printf("Load Binary file: %s to DSP%d, start address 0x%08x, Size 0x%08x\n", argv[6], values[chip], values[entry], values[size]);
    
    
    #ifdef time_measure
        gettimeofday(&time_measure1, NULL);
    #endif
    	
        /* transfer the binary file to DSP memory */
        ret = load_binfile(values[chip], argv[6], values[entry], values[size], values[xfer_type]);
    
    #ifdef time_measure
        gettimeofday(&time_measure2, NULL);
        diff = (time_measure2.tv_sec - time_measure1.tv_sec) * 1000000
            + (time_measure2.tv_usec - time_measure1.tv_usec);
    
        printf("Time measured: %u us\n", diff);
    #endif
    
        if (ret)
            printf("ERROR: Load bin error\n");
        else
            printf("Load Binary OK\n");
    
        pcie_drv_close();
        return ret;
    }
    
    
    static int do_dsp_save_bin(int argc, char** argv)
    {
        unsigned values[5];
        int ret = 0;
        char dev_name[128];
        pcie_ioctl_t ioparam;
    #ifdef time_measure
        struct timeval time_measure1, time_measure2;
        unsigned int diff;
    #endif
    
        enum
        {
            chip = 0, entry = 1, size = 2, xfer_type = 3
        };
    
    
        ret = get_param_value_array(argc, argv, 2, 4, values);
        if (ret)
        {
            printf("ERROR: failed to get_param_value_array\n");
            return -1;
        }
    
        ret = pcie_drv_open();
        if (0 != ret)
        {
            printf("ERROR: Failed to open PCIe driver\n");
            return -1;
        }
    
        // Save DSP memory to BIN file
        printf("Save Binary file: %s from DSP %d, start address 0x%08x Size 0x%08x\n", argv[6], values[chip], values[entry], values[size]);
    
    #ifdef time_measure
        gettimeofday(&time_measure1, NULL);
    #endif
    
        /* transfer DSP memory to the binary file */
        ret = save_binfile(values[chip], argv[6], values[entry], values[size], values[xfer_type]);
    	
    #ifdef time_measure
        gettimeofday(&time_measure2, NULL);
        diff = (time_measure2.tv_sec - time_measure1.tv_sec) * 1000000
            + (time_measure2.tv_usec - time_measure1.tv_usec);
    
        printf("Time measured: %u us\n", diff);
    #endif
    
        if (ret)
            printf("ERROR: Save bin error\n");
        else
            printf("Save Binary OK\n");
    	
        pcie_drv_close();
        return ret;
    }
    
    
    static int do_dsp_local_reset(int argc, char** argv)
    {
        unsigned chip=0;
        int ret = 0;
        unsigned int execution_wait_count=0;
        unsigned int read_boot_entry_location_value[TOTAL_NUM_CORES_PER_CHIP];
        int i;
    
        ret = get_param_value_array(argc, argv, 2, 1, &chip);
        if (ret)
        {
            printf("ERROR: failed to get_param_value_array\n");
            return -1;
        }
    
        ret = pcie_drv_open();
        if (0 != ret)
        {
            printf("ERROR: Failed to open PCIe driver\n");
            return -1;
        }
    
        /* Put the DSP in reset; capture the status so the check below is meaningful */
        ret = dio_put_dsp_in_reset(chip);
        if (0 != ret)
        {
            printf("ERROR: Failed to put DSP into reset\n");
            goto err_reset;
        }
    
        /* Take the DSP out of reset */
        ret = dio_bring_dsp_out_reset(chip);
        if (0 != ret)
        {
            printf("ERROR: Failed to bring DSP out of reset\n");
            goto err_reset;
        }
    
    
        /* Check to see if reset is complete */
        while(1)
        {
            for(i=0; i< TOTAL_NUM_CORES_PER_CHIP; i++)
            {
                ret = pcie_drv_dsp_read(chip, ((0x10 + i) << 24) + BOOT_ENTRY_LOCATION_ADDR, (unsigned char *)&read_boot_entry_location_value[i], 4);
                if(ret != 0)
                {
                    printf("\nERROR: pciedrv_dsp_read failed\n");
                }
            } 
            for(i=0; i< TOTAL_NUM_CORES_PER_CHIP; i++)
            {
                if(read_boot_entry_location_value[i] != 0)
                    break;
            }
            execution_wait_count++;
            if(execution_wait_count > 1000) 
            {
                printf("ERROR: Reset code is not working: timed out\n");
                goto err_reset;
            }
            if(i == TOTAL_NUM_CORES_PER_CHIP)
                break;
            usleep(1000);
        };
       
        printf("\n Iterations waited for entry point to clear %d\n", execution_wait_count);
      
        pcie_drv_close();
    
        printf("Dsp %d:  DSP Reset success ! \n", chip);
        return ret;
    
    err_reset:
        printf("Dsp %d:  DSP Reset Fail ! \n", chip);
        pcie_drv_close();
        return -1;   
    
    }
    
    /** Print help information to console */
    static void do_help(const char *prog)
    {
        printf("%s %s\n", __BSP_VERSION, __SVN_VERSION);
        printf("%s query -l\n\tList all device info.\n", prog);	
        printf("%s query [chip id#(ID start from 0)]\n\tQuery chip# detail device info.\n", prog);	
        printf("%s load [chip id#(ID start from 0)] [core#0~7] [image entry point] [image file name (hex)]\n\tDownload and run program.\n", prog);
        printf("%s rmem [chip id#(ID start from 0)][DSP address]\n\tRead a 32bits-DWORD from DSP memory.\n", prog);
        printf("%s wmem [chip id#(ID start from 0)][DSP address] [DWORD value]\n\tWrite a 32bits-DWORD to DSP memory.\n", prog);
        printf("%s loadbinary [chip id#(ID start from 0)] [DSP address] [Number of Bytes (0: all)][type 0: memcpy, 1: DMA] [bin file name]\n\tLoad Binary file to DSP memory.\n", prog);
        printf("%s savebinary [chip id#(ID start from 0)] [DSP address] [Number of Bytes][type 0: memcpy, 1: DMA] [bin file name]\n\tSave data as a Binary file from DSP memory.\n", prog);
        printf("%s reset [chip id#(ID start from 0)]\n\tReset DSP chip.\n", prog);
        printf("%s version\n\tPrint version info.\n", prog);		
    }
    
    
    #define NUM_CMDS (8)
    static const arg_template_t defcmd[NUM_CMDS] = {
        { "version", 2, do_version},
        { "load", 6, do_dsp_load},
        { "query", 3, do_dsp_query},
        { "wmem", 5, do_dsp_wmem},
        { "rmem", 4, do_dsp_rmem},
        { "loadbinary", 7, do_dsp_load_bin},	
        { "savebinary", 7, do_dsp_save_bin},
        { "reset", 3, do_dsp_local_reset},
    };
    
    
    /*-------------------------------------------------------------------------
     * main entry point
     *-----------------------------------------------------------------------*/
    int main(int argc, char** argv)
    {
        int ret;
    
        if (parse_cmd_line(argc, argv, defcmd, NUM_CMDS, &ret) != 0)
        {
            do_help(argv[0]);
        }
    
        return ret;
    }
    
    
    The two major function calls are dio_put_dsp_in_reset() and dio_bring_dsp_out_reset(). In the second function you can find this comment: “The following code is a work around to flush the cache, without this Any dirty cache lines in L1D may cause corruption of the downloaded image”. It writes 8 idle instructions into MSMC, and the DSP cores are brought out of reset once to execute this code. Then the DSP cores are put into reset again, your real application code is loaded, and the cores are brought out of reset. So I think this is quite relevant to the issue you are seeing.

    Can you follow what was done in the Linux source code and port it into your Windows driver code (if you are not using Advantech’s Windows driver API) to see if it works for you?
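
    The ordering described above can be sketched as follows. This is only an illustration of the sequence, not Advantech's implementation: every function name here is hypothetical, and the stand-ins just record and print each step so the order can be checked without hardware.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical stand-ins for the real driver calls. They only record and
 * print each step so the ordering can be checked without hardware. */
enum step { ASSERT_RESET, LOAD_IDLE_STUB, RELEASE_RESET, LOAD_APPLICATION };

#define MAX_STEPS 8
static enum step step_log[MAX_STEPS];
static int step_count;

static int record(enum step s, const char *what)
{
    assert(step_count < MAX_STEPS);
    step_log[step_count++] = s;
    printf("step %d: %s\n", step_count, what);
    return 0; /* a real driver call would return its own status */
}

static int put_cores_in_reset(void)    { return record(ASSERT_RESET, "assert local reset"); }
static int load_idle_stub(void)        { return record(LOAD_IDLE_STUB, "write idle stub to MSMC"); }
static int bring_cores_out_reset(void) { return record(RELEASE_RESET, "release local reset"); }
static int load_application(void)      { return record(LOAD_APPLICATION, "load application image"); }

/* Run the idle stub once before loading the real image, stopping at the
 * first step that reports an error. */
static int reload_with_cache_flush(void)
{
    int ret;

    step_count = 0;
    if ((ret = put_cores_in_reset()) != 0)    return ret;
    if ((ret = load_idle_stub()) != 0)        return ret;
    if ((ret = bring_cores_out_reset()) != 0) return ret; /* idle stub runs */
    if ((ret = put_cores_in_reset()) != 0)    return ret;
    if ((ret = load_application()) != 0)      return ret;
    return bring_cores_out_reset();                       /* application runs */
}
```

    In a port to another driver, each stand-in would be replaced by that driver's own reset and memory-write calls.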

    Thanks!

    best regards,

    David Zhou

  • David,

    My code already explicitly invalidates all the caches from the PC, but I tried your suggestion anyway. As I expected, there is no change in the behaviour; code executes once, and then local reset does not execute any code at the boot address.

    I am trying to generate a cut-down version of my loader so you can load an application and observe this behaviour. I assume you have access to the Advantech board & software.

    Peter

  • Peter,

    Thanks. Yes, I have Advantech board here.

    regards,

    David

  • I now have a test program but have failed to get this board to add an attachment (clicking 'Click to add' just takes me to the top of the page). Please let me know how I can get the code to you.

    It runs the same test three times.

    Each test is:
    A: Reset, then load a NOP loop to 0x0C000000 and let it run.
    B: Reset, then load a loop to 0x800000 that increments either 0x800000 or
       0x800004 and lets it run.
    C: Display four words from 0x800000, five times.
    D: Reset, then load a Diamond application and let it run (generating no output).

    Here is the output:
    C:\Users\3L\Desktop>f:\TIload
    C6678 test loader
    Got map
    1 board located
    Root found
    Test 1
    First reset
    Second reset
    BOOT=10800001
    0000A5B8 0000002A 0000406A 0080A35A
    000104A8 0000002A 0000406A 0080A35A
    00014EBA 0000002A 0000406A 0080A35A
    00019EF5 0000002A 0000406A 0080A35A
    0001F52A 0000002A 0000406A 0080A35A
    Third reset
    Entry point set to 10800000
    Program running
    Done
    Test 2
    First reset
    Second reset
    BOOT=10800001
    00000000 00009945 0000406A 0080A35A
    00000000 00011A76 0000406A 0080A35A
    00000000 0001F8BA 0000406A 0080A35A
    00000000 0003AC26 0000406A 0080A35A
    00000000 0005295C 0000406A 0080A35A
    Third reset
    Entry point set to 10800000
    Program running
    Done
    Test 3
    First reset
    Second reset
    BOOT=10800001
    00000000 0000002A 0000406A 0080A35A
    00000000 0000002A 0000406A 0080A35A
    00000000 0000002A 0000406A 0080A35A
    00000000 0000002A 0000406A 0080A35A
    00000000 0000002A 0000406A 0080A35A
    Third reset
    Entry point set to 10800000
    Program running
    Done

    C:\Users\3L\Desktop>

    In tests 1 & 2 you can see location 0x800000 or 0x800004 being incremented,
    showing that the loaded code has been executed.
    In test 3, you see nothing is changing, showing the code is no longer being
    executed.

  • You can download the test program from here:

    ftp.3L.com/TrialDownload/TILoad.zip

    Peter

  • Peter,

    Much appreciated! No further action is required on your end. I have all the actions here.

    best regards,

    David Zhou

  • dzhou said:
    I found Advantech DSPC-8681E has DSP local reset functionality provided both in Windows driver and Linux driver. The download page is: DSPC8681 driver link. The latest version for windows is V0.7.8, the local reset has been  implemented in do_dsp_local_reset() function in dsp_loader(Lightning_PCIE_0_7_8\Project\dsp_loader), the API call into Windows driver. The demo script is in Lightning_PCIE_0_7_8\examples\script\DSPC8681E\dspreset.bat.

    David,

    It looks like this may work for us, we will have to test further to be sure. 

    They are resetting Module 0 but the TI C6678 says this "Cannot be disabled".  What special insight does Advantech have here that the rest of us do not?

    Also, what part of their code is correcting this issue and why? They load a program that is just IDLE instructions but I see Peter is loading one that does NOPs. Is that the make or break step? Obviously I could go through and comment out line by line and test, but that is tedious and this seems like a *very important* thing to know in order to successfully use the L2RST. 

    Also Peter, I don't think you can access the C66x CorePac registers (Cache invalidate) from the PCIESS. 

  • A question I would like answered is, does Advantech have access to usage information that does not appear in standard TI documentation? If so, why isn't that information in the standard documentation?

    "...but that is tedious..."

    That seems to be the name of the game when things are documented so poorly.

    "Also Peter, I don't think you can access the C66 Core PAc registers (Cache invalidate) from the PCIESS."

    Do you have a particular reason for thinking this? They're just memory-mapped registers after all and I haven't found anything that suggests why these registers might be inaccessible (although things are easy to miss when scattered throughout several documents).

  • Peter Robertson said:
    Do you have a particular reason for thinking this? They're just memory-mapped registers after all and I haven't found anything that suggests why these registers might be inaccessible (although things are easy to miss when scattered throughout several documents).

    The 0x01800000 region is a local address range within each CorePac, whose memory map is listed in Table 9-3 of the CorePac User Guide, sprugw0. 

    The tables in chapter 4 "System Interconnect" in the main datasheet list the available global access connections. From PCIESS to CorePac they only list CorePacN_SDMA (Slave DMA interface?) which I think can only access the CorePac SRAM. 

    Maybe there is a global address to access each CorePac's configuration registers but it isn't documented. 
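
    For comparison, the loader code earlier in the thread reaches each core's memory through a global alias built as ((0x10 + i) << 24) + address. A minimal sketch of that mapping is below. The helper name is hypothetical, and it deliberately rejects any local address with a non-zero top byte (which includes the 0x018xxxxx register range), since the thread does not establish that those registers have a global alias.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define CORES_PER_CHIP 8u /* the loader's help text lists core#0~7 */

/* Hypothetical helper: map a core-local address to the global alias the
 * loader uses, ((0x10 + core) << 24) + local.
 * Returns 0 on success, -1 if the core number or address is out of range. */
static int corepac_global_addr(unsigned core, uint32_t local, uint32_t *global)
{
    assert(global != NULL);

    /* Assumption: only addresses whose top byte is zero are aliased this way. */
    if (core >= CORES_PER_CHIP || (local >> 24) != 0)
        return -1;

    *global = ((uint32_t)(0x10u + core) << 24) + local;
    return 0;
}
```

    With this mapping, local address 0x00800000 on core 0 becomes 0x10800000, which matches the entry point printed by the test program.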

  • Peter, 

    Between every reset and load of a new program you need to first load the IDLE or NOP program, bring the DSP out of reset, put it back in reset and then load your new program. This appears to be working for us. I see in your code in test() you do the NOP program, then an increment program and then your appData program. If you put the NOP program between  your increment and appData I think things will work. 

    But the question still remains, why is the DSP able to successfully run IDLE or NOP commands but not other instructions in subsequent program loads with LRST?

  • Peter,

    Issue reproduced, now working on solution/root cause.

    regards,

    David

  • Peter,

    Thanks very much for providing the test case, we can reproduce the issue and investigate.

    We have an emulator connected to the DSP running the test, attached to both core 0 and core 1 for debug visibility. Core 0 was used to view some local addresses and the Program Counter when core 0 is not “held-in-reset”. Core 1 was used to view some global addresses.

    • We confirmed that DSP memory reads and writes through the PCIE Windows driver are correct.
    • In the third loop, Test(3), core 0 can’t be taken out of reset after it was put into reset at the beginning. That is why you found that the small assembly code Reset1() can’t run.
    • We tried many code changes in your void Test(int Which) function to follow exactly what the Advantech driver does, that is, the MSMCRAM and MODRST0 disable/enable sequence that flushes the cache. All other modules were turned off. It didn’t help.
    • We also tried the suggestion from Colin (e2e member) to load and execute IDLE loops in MSMC between your Reset0()/Reset1() and AppData(). It didn’t help either.
    • The reset always worked if we removed the load() of your AppData().
    • The reset always worked if we replaced your AppData() with our simple test application code.

     So it looks like AppData() causes the issue, and we don’t have any visibility into that code. AppData() may perform some invalid memory access that the DSP local reset does not clear, so the DSP finally stalls when it enters the third Test round.

     So I'd like to give you some hints on how to debug your AppData() code, assuming you are not able to provide that code.

    Before the test run, we checked the EVTFLAG0/1/2/3 registers (Table 9.3 of  http://www.ti.com/lit/ug/sprugw0c/sprugw0c.pdf); EVTFLAG3 is 0x1400_0000 initially. We found that Reset0()/Reset1() and our simple test application didn’t cause any issue. With your AppData(), after Test(1) (PC is in “IDLE” at 0x0080_0860), EVTFLAG3 has changed to 0x1420_0001. After Test(2), EVTFLAG3 has changed to 0x1420_4001. That is, events 96, 110 and 117 were set, which may give you some clue (see Table 9-2) for debugging what happened in your AppData().
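
    The event numbers follow from the bit positions: each EVTFLAGn register holds one bit per event, so bit k of EVTFLAG3 corresponds to event 96 + k. A small sketch that lists the events newly set between two register reads is shown below; the helper name is hypothetical and is not part of any TI library.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: list the events that are set in `after` but were not
 * set in `before`, for event flag register number `reg_index` (0..3).
 * Register n covers events 32*n .. 32*n + 31.
 * Writes the event numbers to out[] and returns how many were found. */
static int evtflag_new_events(unsigned reg_index, uint32_t before,
                              uint32_t after, unsigned out[32])
{
    uint32_t newly_set = after & ~before;
    int count = 0;

    assert(out != NULL);

    for (unsigned bit = 0; bit < 32; bit++) {
        if (newly_set & (UINT32_C(1) << bit))
            out[count++] = 32u * reg_index + bit;
    }
    return count;
}
```

    Applied to the values above, 0x1400_0000 to 0x1420_0001 yields events 96 and 117, and 0x1420_0001 to 0x1420_4001 yields event 110.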

    best regards,

    David Zhou

  • Hi Peter,

    Are you able to make some progress yet? Please let me know how I can further assist you on this issue.

    regards,
    David

  • Hi David,

    I am trying to find out what it is that my code is doing that is killing the C6678. This is taking a while as I do not have a working emulator. I shall let you know what I find.

    Whatever it is, I am surprised that TI is not at all bothered that a program can put a C6678 core into a state that a soft reset will not resolve. As soft reset is the only thing that a PCIe connection can do, it seems a serious design flaw in the system.

    Regards,

    Peter

  • I have traced the problem down to setting the PLL. Omitting this lets local reset work repeatedly.

    Unfortunately, I do not have time to characterise this further.

  • Peter,

    The PLL programming may be tricky. Did you copy the PLL programming code from somewhere else, or did you write it on your own? I'd highly recommend copying the implementation from TI's code, either from a GEL file or from source code in (e.g.) the intermediate bootloader.

    best regards,

    David Zhou