Errors in Big-endian mode and fprintf to a file on a C6678 EVM

We are testing some code on the EVM in big-endian mode. It all works fine in little-endian mode. We're using version 7.3.11 of the compiler and compiling for bare metal. Our code writes to files using fprintf, and those files naturally end up on the host. However, the output is full of nulls where we expect characters. If we fprintf to stderr, we get proper output.

With the big-endian code (and the 7.3.11 compiler), I frequently get nulls written to the files instead of data. I have added debug code at the point of writing to confirm that I am not writing nulls to the file myself.

For some reason, data is not getting flushed, or is being overwritten with nulls on the way to the host file system. The nulls are leading nulls in the file. In the case that is currently blocking me, there are exactly 30 nulls (0x00), which is precisely how many non-null characters I expect in that position. We do fclose() the file after writing.

I do not have a way to circumvent a faulty file system. If there is something about "fprintf" that is different for big-endian versus little-endian code, I am not aware of it. One of the outputs producing nulls is just

fprintf(fpout, " %d %d", 3, 0);

and this works just fine in little-endian code, and if I add a debug line and fprintf(stderr, " %d %d \n", 3, 0); I get the right output on the screen (nulls still end up in the file).

The only difference on the hardware is that SW3-1 is flipped, and all the code is compiled with the --big_endian flag.
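To quantify the symptom described above (a run of leading 0x00 bytes where text was expected), a small check can be run on the host against each generated file. This is a minimal sketch in portable C; `count_leading_nuls` is a hypothetical helper name and not part of any TI library.

```c
#include <assert.h>
#include <stdio.h>

/* Count the NUL (0x00) bytes at the start of a file.
 * Returns the count, or -1 if the file cannot be opened.
 * Hypothetical helper for illustration only. */
long count_leading_nuls(const char *path)
{
    FILE *fp;
    long count = 0;
    int c;

    assert(path != NULL);

    fp = fopen(path, "rb");          /* binary mode: no newline translation */
    if (fp == NULL)
        return -1;

    /* Stop at the first non-NUL byte or at end of file (EOF != 0). */
    while ((c = fgetc(fp)) == 0)
        count++;

    fclose(fp);
    return count;
}
```

A healthy text file should report 0. A count equal to the expected number of characters would match the failure reported here.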

  • We believe this to be a bug in the Big-Endian host-emulation library for the EVM. Why?

    From my engineer:

    An explicit fflush() after the fprintf() calls seems to be fixing something.

    In this particular case the EVM program exits immediately after the fclose(), so perhaps the issue has something to do with the end of the program interrupting the implicit fflush() of the fclose() call. Why that would only happen with BigEnd doesn't make sense, but perhaps somebody is testing the wrong bit in a flag collection or something.

    The only thing I have seen is that the DMA engines are incoherent with L2 when accessing DDR3. So DDR3 has nulls; if C is writing to DDR3 then the data gets stuck in the L2 and L1 caches and never flushes to DDR3.

    If the file write to the host depends on DMA in any way, then if fclose() fails to flush the cache to DDR3 before the DMA starts up, what will get written would be NULLs. Maybe my explicit fflush() invokes different code that gets the cached line into DDR3 before a DMA.

    This is speculation. My whole file is just 43 bytes long, so it can fit in a cache line.

  • Hello Philip,

    Is it possible for you to attach an example project that demonstrates the failure? That would help us debug it.

    regards,

    David

  • Our project is a code generator for numerical kernels that produces thousands of files, so sending you what we're working with is not practical. This is all done outside of CCS (thankfully). I'll check with my guys and see if this is reproducible with a simple sequence of fopen();fprintf();fclose(). But this code does work fine with LE; it's only BE that's the problem.
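A simple fopen();fprintf();fclose() sequence of the kind mentioned above could look like the following sketch. It checks every return value, adds the explicit fflush() discussed earlier in the thread, and reads the file back to compare. The function name `write_and_verify` and the error codes are assumptions made for illustration; only standard C library calls are used.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Write " %d %d" to a file, close it, then read it back and compare.
 * Returns 0 on success, or a negative code naming the step that failed.
 * Hypothetical helper for illustration only. */
int write_and_verify(const char *path)
{
    char expect[32], got[32];
    size_t n;
    FILE *fp;
    int len;

    assert(path != NULL);

    /* Build the expected text with the same format string. */
    len = sprintf(expect, " %d %d", 3, 0);

    fp = fopen(path, "w");
    if (fp == NULL)
        return -1;
    if (fprintf(fp, " %d %d", 3, 0) != len) { fclose(fp); return -2; }
    if (fflush(fp) != 0)                    { fclose(fp); return -3; }
    if (fclose(fp) != 0)
        return -4;

    fp = fopen(path, "rb");
    if (fp == NULL)
        return -5;
    n = fread(got, 1, sizeof got, fp);
    fclose(fp);

    /* Any NULs or missing bytes show up as a mismatch here. */
    if (n != (size_t)len || memcmp(got, expect, n) != 0)
        return -6;
    return 0;
}
```

Calling this in a loop over many distinct file names would come closer to the real workload, which generates thousands of small files in sequence.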

  • Hi all,

    Efforts at reproducing this bug with a standalone test case have failed. Thanks to Eric and David for putting time into this. Our code iteratively generates thousands of small output files on the host, which are then used as input to the next compile.

    So, the next question is about our cache setup configuration; your comments are welcome. From Tony:

    It occurs to me there may be a problem in my code that sets up the
    machine, in particular the code to configure the L1 and L2 caches and
    the MAR registers.

    I declare those addresses (in the 0x0184xxxx control register area) as
    volatile unsigned int*, and I read and write integer values to them.
    If the order of the bytes in an unsigned int are stored in reversed
    order by the big_endian hardware switch, do I have to write a
    different integer value in order to set the same bits in the hardware?
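One way to reason about Tony's question is to separate the integer value of a 32-bit word from the order of its bytes in memory. The sketch below runs on any host and uses an ordinary variable as a stand-in for a register. It assumes, and this should be confirmed against TI's documentation, that the control registers are accessed only as whole 32-bit words, in which case the same integer constant sets the same bits in either endian mode. All function names here are hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Returns 1 if this CPU stores the least significant byte first. */
int is_little_endian(void)
{
    uint32_t probe = 1;
    unsigned char first;

    memcpy(&first, &probe, 1);
    return first == 1;
}

/* OR the MAR-style bits (8 = prefetch, 1 = cacheable) into a 32-bit word
 * using a single word-wide read and write. The pointer is a stand-in for
 * a memory-mapped register; here it is tested against a plain variable. */
uint32_t set_mar_bits(volatile uint32_t *reg)
{
    assert(reg != NULL);
    *reg |= 0x00000009u;
    return *reg;
}

/* Byte-wise view of the same word: the index of the byte that holds
 * bits 0-7 is the only thing that changes with byte order. */
unsigned low_byte_index(void)
{
    return is_little_endian() ? 0u : 3u;
}
```

With word-wide access the value 0x00000009 is the same in both modes. A difference would only appear if the same storage were also read or written through a char or short pointer.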

  • FYI, fflush() as listed above did not solve the problem.

  • From customer e-mail:

    "I tried that (a small standalone program to write out a file and then read it back) and it did not reproduce the problem. The problem may be intermittent; on one occasion I crashed because a file written by an early program could not even be found by a subsequent program; when I restarted the install (without making any source changes) that problem did not reoccur."

    The problem is resolved.

    Regards, Eric

     

  • Sorry, the problem is not resolved, Eric.

    I created a standalone program that works differently
    in LE and BE modes.

    #include <stdio.h>
    #include <time.h>
    #include <c6x.h>
    
    /*---------------------------------------------------------------------------*/
    /* This program will initialize the C6678 configuration registers to utilize */
    /* the L1 Cache, L2 Cache, and fix the MAR (Memory Attribute Registers) to   */
    /* make all memory cacheable and pre-fetchable. Here is the layout of        */
    /* control registers, specific to the C6678. They are memory-mapped          */
    /* registers; write to the location and the internal register gets written.  */
    /* All are 32-bit registers.                                                 */
    /*                                                                           */
    /* Address    Function                 Value                                 */
    /* 0x01840000 Level 2 Cache Config     0x==== 0007 (all cache)               */
    /* 0x01845004 L2 Write-back invalidate 0x0000 0001 (flushes L2 cache)        */
    /* 0x01840040 Level 1 Cache Config     0x==== 0007 (all cache)               */
    /* 0x01845044 L1 Write-back invalidate 0x0000 0001 (flushes L1 cache)        */
    /* 0x01848000 16 reserved read-only MAR registers.                           */
    /* 0x01848040 240 MAR regs.       |=   0x0000 0009 (8=prefetch, 1=cacheable) */
    /* 0x018483FC Last of the MAR regs.                                          */
    /*            For MAR regs, bits 1-2 are reserved, so we leave them as-is.   */
    /* MAR reg 16 controls memory 1000 0000h to 10FF FFFFh,                      */
    /* MAR reg 17 controls memory 1100 0000h to 11FF FFFFh, etc,                 */
    /* MAR rg 255 controls memory FF00 0000h to FFFF FFFFh.                      */
    /* Notice that MAR16 holds range of L1 SRAM, DO NOT cache that page!         */
    /* L1 config 0--7 == disabled, 4K,8K,16K,32K,32K,32K, max (=32K)             */
    /* L2 config 0--7 == disabled, 32K,64K,128K,256K,512K,1M,max (=1M).          */
    /* We set our L2 to '5' because the link file, C6678_unified.cmd, maps our   */
    /* L2 space to just 512K.                                                    */
    /*                                                                           */
    /* Interrupt stuff:                                                          */
    /* 0x01800040 4 Event Clear Registers, to 0x0180004C.                        */
    /* 0x01800080 4 Event Mask Registers, to 0x0180008C.                         */
    /* 0x01800104 3 Interrupt Mux registers, to 0x0180010C.                      */
    /* ISTP is the Interrupt Service Table pointer.                              */
    /* ISTP & 0xFFFFFC00 is the base address, +0x01E0 is INT15 ISR (8 words).    */
    /* IER is an actual register (not to be confused with the EDMA channel       */
    /*     control memory-mapped IER registers) that controls whether interrupts */
    /*     4..15 are enabled.                                                    */
    /*                                                                           */
    /* Modified Oct 18: Set L1 to just 4K, so 28K could be used as fast ram.     */
    /* Modified Mar  4: Set L2 to just 256K, so 256K could be used as fast ram.  */
    /* Nov 29: Added code to report on mapping of EDMA3CC (DMA, QDMA transfers). */ 
    /*---------------------------------------------------------------------------*/
    int main(int argc, char **args)
    {
       TSCL=0;  /* Always start real-time clock, just in case timing. */
       volatile unsigned int *L1PCFG = (unsigned int *) (0x01840020);
       volatile unsigned int *L1CFG  = (unsigned int *) (0x01840040);
       volatile unsigned int *L2CFG  = (unsigned int *) (0x01840000);
       volatile unsigned int *MAR    = (unsigned int *) (0x01848040);
    // volatile unsigned int *EDMA3CC0 = (unsigned int*) (0x02700000ul);
    // volatile unsigned int *CC0_QEER = (unsigned int*) (0x02701084ul);
    // volatile unsigned int *CC0_EER  = (unsigned int*) (0x02701020ul);
    // volatile unsigned int *EDMA3CC1 = (unsigned int*) (0x02720000ul);
    // volatile unsigned int *EDMA3CC2 = (unsigned int*) (0x02740000ul);
    // volatile unsigned int *EVTMASKR = (unsigned int*) (0x01800080ul);
    // volatile unsigned int *INTMUXR  = (unsigned int*) (0x01800104ul);
    // volatile unsigned int *INTSVCT  = NULL;
       unsigned int i, v;
    
       v = *L1PCFG;                                 /* Get value. */
       printf("L1PCFG on entry: %08X.\n", v);       /* Report it. */
    // v &= 0xFFFFFFF8;                             /* Force final bits 000b. */
    // v |= 0x00000007;                             /* Force final bits 111b. */
    // *L1PCFG = v;                                 /* Configure L1P Cache. */
    
       v = *L1CFG;                                  /* Get value. */
       printf("L1CFG on entry: %08X.\n", v);        /* Report it. */
    // v &= 0xFFFFFFF8;                             /* Force final bits 000b. */
    // v |= 0x00000001;                             /* Force final bits 001b. */
    //                                              /*  (28K work, 4K cache). */
    // *L1CFG = v;                                  /* Configure L1 Cache. */
    
       v = *L2CFG;                                  /* Get value. */
       printf("L2CFG on entry: %08X.\n", v);        /* Report it. */
    // v &= 0xFFFFFFF8;                             /* Force final bits 000b. */
    // v |= 0x00000004;                             /* Force final bits 100b. */
    // *L2CFG = v;                                  /* Configure L2 Cache. */
    
       v = *MAR;                                    /* Get typical MAR value. */
       printf("MAR[16] on entry: %08X.\n", v);      /* Report it. */
    // MAR[0] = 0;                                  /* range of L1D SRAM! */ 
    // v = *MAR;                                    /* Read again. */
    // printf("MAR[16] Modified: %08X.\n", v);      /* Report it. */
    // v = 0x00000009;                              /* Value for MAR register. */
    // for (i=1; i<240; i++)                        /* For all other MAR regs, */
    //    MAR[i] |= v;                              /* Configure address range.*/
    
       printf("Cache Setup Complete: L1CFG=%08X, L2CFG=%08X.\n", *L1CFG, *L2CFG);
       fflush(stdout);
     
       /* Report on setup of interrupts. */
    // for (i=0; i<4; i++) 
    //    printf("EVTMASKR[%1i]=%08X.\n", i, EVTMASKR[i]);
    
    // for (i=0; i<3; i++) 
    //    printf("INTMUXR[%1i]=%08X.\n", i, INTMUXR[i]);
    
    // v = IER;                                  // get the interrupt Enable ptr.       
    // printf("IER=%08X.\n", v);                 // Report.
    
    // v = ISTP;                                 // get the interrupt service table ptr.
    // printf("ISTP=%08X.\n", v);                // Report.
    
    // v &= 0xFFFFFC00;                          // Clear low part.
    // v |= 0x000001E0;                          // Interrupt 15.
    // INTSVCT = (unsigned int*) (v);            // Recast.
    // for (i=0; i<8; i++)
    //    printf("ISR_15[%1i]=%08X.\n", i, INTSVCT[i]);
    
       /* Report on setup of CC0. */
    // printf("CC0 EER=%08X; CC0 QEER=%08X. (Event Enabled Registers).\n", 
    //    *CC0_EER, *CC0_QEER);
    
       /* Report on the contents of QDMA 0-7 mapping, for each channel. */
       /* Offset is 0x200 = 512, but in 4-byte ints, 128.               */
    // for (i=0; i<8; i++)
    //    printf("EDMA3CC0 QDMA(%1i)=0x%08X.\n", i,EDMA3CC0[128+i]);
    // for (i=0; i<8; i++)
    //    printf("EDMA3CC1 QDMA(%1i)=0x%08X.\n", i,EDMA3CC1[128+i]);
    // for (i=0; i<8; i++)
    //    printf("EDMA3CC2 QDMA(%1i)=0x%08X.\n", i,EDMA3CC2[128+i]);
    
       return(0);
    } /* END main */
    

    I modified the cache_setup.c program to change NOTHING (lines are
    commented out) but to just report the existing cache-control register
    contents. So it is nothing but reads and printf() statements.

    The Makefile now has two targets; cache_setup.out will compile the
    little-endian version of the code, BE_cache_setup.out will compile
    with the --big_endian flag.

    BE AWARE that both versions produce "cache_setup.out". That will not
    always be the case; it is just a hack until I know what it takes to
    get something working. Eventually, my plan is to make the
    "doconfig.sh" script different for little-endian and big-endian, and
    make it compile the appropriate cache_setup.out.

    For now, just try to run these in little-endian mode and big-endian mode.

    In LE, the printf() statements all work as expected.
    In BE, I get no output at all.

    I think that proves at least one bug in the file system, quite similar
    to one I get when the correct filename appears in the host directory
    but the file has zero bytes in it.

  • On our second development system, identical to the first, we are unable to duplicate this behavior. Either something is faulty with our EVM or our systems aren't exactly identical. In either case, this looks like an issue with our setup. We are proceeding with our work on the second machine and will return to this after our deliverables are met. I will post updates here as we get to the bottom of this. Thanks to TI for helping us figure this out so quickly.

  • This was resolved by moving the EVM to no-boot mode.
    ---
    I have my 6678 EVM in no-boot mode with switches SW3-SW6, pins 1-4, all in the “ON” position.
    Steps:
    1) $ source sourceme.sh
    2) $ make clean
    3) $ make all
    4) Use CCS to connect to the DSP; the GEL file will run automatically for initialization
    5) Use CCS to load and run cache_setup.out
    6) After 5) abort, use CCS to load and run the xdmvnk_test.out built with the above code change
    I didn’t see your issue. I highly suspect your settings caused the problem. Can you put the EVM card in no-boot mode, make sure that in your configuration file (ECM.ccxml) the GEL file is located at “../../emulation/boards/evmc6678l/gel/evmc6678l.gel”, and try again?
    EVM settings
    SW3 - ON OFF ON OFF ==> change to “ON ON ON ON”
    SW4 - ON ON  ON ON
    SW5 - ON ON  ON OFF ==> change to “ON ON ON ON”
    SW6 - ON ON  ON ON
    Below is my test log.
    Regards, Eric