How does debugging work?

Anonymous

Hi All,

 

I would like to ask a group of related questions on debugging.

 

What is happening when one clicks "debug"? In particular, what is happening when the program runs to and stops at a breakpoint?

 

The purpose of a breakpoint is to let the developer see the status of the "system" when execution has reached a particular line of code. But how broad is the definition of this "system"?

Take an EVM board as an example: in addition to the DSP CPU itself, there are many other peripherals on the board, including video and audio interfaces, added memory (DDR2 and flash), LEDs, etc. Even within the CPU itself there is still a difference between the most "central" part such as the arithmetic unit, an extended "core" part such as the megamodule (with cache, DMA, interrupt controller, extended memory controller), and other modules integrated within the chip such as the Video Processing Front End in a DaVinci chip.

All components, from the most central computational part of a CPU, to integrated enhancement modules, to EVM-added components, have their own internal registers and sometimes memory. After a system has booted and all peripheral modules have been properly initialized, with each cycle of the CPU clock the whole EVM board works like a huge factory comprising numerous workshops operating in synchronization, each according to its assigned schedule of tasks.

The executable code, stored in the .text section at the location specified in or determined by the linker, is only a very small portion compared with the grand scale of the assembly line. When we have proceeded to a particular line of code and stopped, what status do we have to record in order for a useful "debug" to work?

1. Do we have to "freeze" the whole factory? Do we have to stop the operation of units at all levels of the hierarchy, from top management (CPU) to the bottommost workshops, and order everyone to halt and hold their present posture? In this way an inspector would be able to examine the details of every single addressable unit, which is of course best for examination purposes.

2. Do we really shut down all machines? Can we shut down all machines? Can we make a running wheel stop instantly at its current angle? Can we issue the command to units at all levels of the hierarchy, at all distances from the center, at the same time? Imagine the work of the Roman empire: are communication and relay needed for distant provinces? Won't there be delay due to this?

3. And how much is the cost (procedure, time)? Is it affordable to do all these numerous steps just for observing the status of a single variable? How long does it take to restore the previous running state? And after all, considering the lag between different units, is a restoration possible?

 

I have never considered such questions in my programming experience before. Although DSP development brought all of them to the forefront, they actually also exist in the seemingly simple environment of Windows programming. Perhaps the only situation in which these considerations do not arise is pure algorithmic testing, where no I/O or other device coordination is involved.

Coming back to a DSP system, what is actually happening when CCS halts at a breakpoint?

1. I think the first principle is separation: everything that can only be changed by the execution of instruction code will remain unchanged during the halt at the breakpoint. This should include:

1) call stacks
2) the heap, on which allocation is done only by a small group of functions such as malloc()
3) most of the registers, either within or outside the DSP chip

But there could be complications in this case. For clarity, let's refer to "bits" rather than "registers". Some registers are provided to set the running mode of a device, much like buttons on a power plant control panel; some are intended to provide a feedback channel, much like a signal light in such a plant. If no instruction code is being executed when CCS halts at a breakpoint, those "buttons" will remain intact, but not necessarily the signals.
So if we view a register address containing a "signal" bit at different times (we first click "view memory" from the menu, key in an address and press "enter", then after an arbitrary wait key in the same address again or simply refresh its value), is it possible for us to see different bit values?
  

2. Item 1 considered stacks, heap and registers. What about memory? Even during a breakpoint halt I can still see the moving scene captured by the camera being continuously output to the connected display, which is apparently done by the video processing module within the chip. To achieve this, frames must first be written to a memory address and then retrieved by the back-end video module for output. In this situation, there must be at least one block of memory that is being written and read, with the accesses of course coordinated to avoid conflict.
If we attempt to view an address in this block during a breakpoint halt, what will happen? Do we get different values at different times? Are we allowed to read it or not?

I am still not able to figure out answers to these questions myself. Could anyone explain them to me?

And in which forum should these questions be asked? Breakpoints are used in CCS, but I also feel that they are intimately related to BIOS and other operating system concepts, and perhaps also to the compiler, though less strongly. I know that the policy is to allow only a single post in a single forum, so if this is not the right place, could the moderator help me move it to the most appropriate forum?

Sincerely,
Zheng
 

 

  • Zheng Zhao said:

    I would like to ask a group of related questions on debugging.

    I will comment on a few of your items below.

     

    Zheng Zhao said:

    What is happening when one clicks "debug"? In particular, what is happening when the programs runs to and stops at a breakpoint?

    I make a comment on this from the reference point of using Code Composer Studio.  Other IDEs may have slightly different behavior.  When you request the CCS IDE to debug, it will launch a session which will attempt to connect to the target CPU specified.  This connection, if made, will allow the CCS IDE to take control over the CPU and interrupt what it was doing prior to that point.  The CPU will be halted.

    When you have specified a breakpoint in the application code, the CCS IDE will perform some operations with the target CPU to set up triggers.  This could use special hardware/logic inside the CPU, or it could replace instructions in RAM with emulation stop instructions.  When the CPU encounters this "trigger", it will halt.  At that point, the CCS IDE will interact with the CPU to refresh any memory windows that are open, access data that you may be tracking via visualization tools, update the disassembly window, etc.  These refreshes consist of a series of transactions over the JTAG interface that force the CPU to perform memory reads, etc. and then send that data back to the IDE.
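
To make the instruction-replacement mechanism described above concrete, here is a minimal sketch in Python of how a software breakpoint can work in principle: the debugger saves the original instruction word, overwrites it with a stop opcode, and restores it when the breakpoint is cleared. This is a toy model only, not CCS or the real emulation logic. `ToyTarget`, `HALT_OPCODE`, and the "each word adds to an accumulator" instruction set are all assumptions made for illustration.

```python
# Toy model of a software breakpoint. Everything here is hypothetical:
# the opcode value and class name do not correspond to any real DSP or to CCS.

HALT_OPCODE = 0xFFFF  # assumed "emulation stop" instruction


class ToyTarget:
    def __init__(self, program):
        self.ram = list(program)   # instruction words held in RAM
        self.pc = 0                # program counter
        self.acc = 0               # a single accumulator register
        self.saved = {}            # address -> original instruction word

    def set_breakpoint(self, addr):
        # Save the original word, then patch in the stop instruction.
        self.saved[addr] = self.ram[addr]
        self.ram[addr] = HALT_OPCODE

    def clear_breakpoint(self, addr):
        # Restore the original instruction so it can execute normally.
        self.ram[addr] = self.saved.pop(addr)

    def run(self):
        # Each ordinary word simply adds its value to the accumulator.
        while self.pc < len(self.ram):
            word = self.ram[self.pc]
            if word == HALT_OPCODE:
                return "halted"
            self.acc += word
            self.pc += 1
        return "finished"


target = ToyTarget([1, 2, 3, 4])
target.set_breakpoint(2)
first = target.run()           # stops before the word at address 2 executes
acc_at_halt = target.acc
target.clear_breakpoint(2)
second = target.run()          # resumes from the same program counter
```

In this model the halt only stops the loop in `run()`. Nothing outside `ToyTarget` is affected, which mirrors the point that the rest of the board keeps operating.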

    As you go on to discuss below, other things in the system, which could be peripherals, other CPUs, FPGAs, etc., will likely still be operating as if nothing has happened.  The CCS IDE and JTAG emulation are associated only with that particular target CPU.

     

    Zheng Zhao said:

    The purpose of a breakpoint is to let the developer see the status of the "system" when execution has reached a particular line of code. But how broad is the definition of this "system"?

    The definition of the system is what is visible to the target CPU.  If the target CPU only has visibility into a peripheral or other logic device via memory-mapped registers, that is all you will be able to see via the IDE when a breakpoint is hit.

    Zheng Zhao said:

    Take an EVM board as an example: in addition to the DSP CPU itself, there are many other peripherals on the board, including video and audio interfaces, added memory (DDR2 and flash), LEDs, etc. Even within the CPU itself there is still a difference between the most "central" part such as the arithmetic unit, an extended "core" part such as the megamodule (with cache, DMA, interrupt controller, extended memory controller), and other modules integrated within the chip such as the Video Processing Front End in a DaVinci chip.

    All components, from the most central computational part of a CPU, to integrated enhancement modules, to EVM-added components, have their own internal registers and sometimes memory. After a system has booted and all peripheral modules have been properly initialized, with each cycle of the CPU clock the whole EVM board works like a huge factory comprising numerous workshops operating in synchronization, each according to its assigned schedule of tasks.

    In a general sense, I agree with your statements.  However, do keep in mind that some devices and/or systems may be operating off of completely separate and unrelated clock domains.  They are not necessarily tied to the target CPU clock.  This implies that some elements of the overall system are asynchronous to each other.

    Zheng Zhao said:

    The executable code, stored in the .text section at the location specified in or determined by the linker, is only a very small portion compared with the grand scale of the assembly line. When we have proceeded to a particular line of code and stopped, what status do we have to record in order for a useful "debug" to work?

    1. Do we have to "freeze" the whole factory? Do we have to stop the operation of units at all levels of the hierarchy, from top management (CPU) to the bottommost workshops, and order everyone to halt and hold their present posture? In this way an inspector would be able to examine the details of every single addressable unit, which is of course best for examination purposes.

    This is really something that you need to decide for your particular application.  What are you going to need access to in order to properly validate your application?  If "freezing" the whole factory is important, then I will say this: you will be challenged to accomplish it.  Once your application starts interacting with other things in the system which have "minds" of their own, you are not going to be able to realize this system freeze.

    Zheng Zhao said:

    2. Do we really shut down all machines? Can we shut down all machines? Can we make a running wheel stop instantly at its current angle? Can we issue the command to units at all levels of the hierarchy, at all distances from the center, at the same time? Imagine the work of the Roman empire: are communication and relay needed for distant provinces? Won't there be delay due to this?

    Precisely, and hence the reason why I stated that you will be challenged to freeze everything.  It is not realistic to accomplish this.  Therefore, your validation strategy for your application (hardware and software) will need to consider these uncontrollable aspects, and you will need to incorporate test methodologies to work through them.  It may force you to perform as much unit testing as possible, but ultimately you will need to add as many hooks as you can into the functional pieces to give you the visibility you need: things like logging routines, test points in hardware, etc.
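
As one example of such a logging hook, below is a minimal sketch of a fixed-size trace log that code can write to while it runs, so that recent events can be inspected after a halt instead of relying on a frozen system. It is written in Python for brevity. The `TraceLog` name, its capacity, and the event names are assumptions for illustration, and on a real target this would more likely be a small array in RAM.

```python
from collections import deque


class TraceLog:
    """Fixed-size event log: the oldest entries are dropped when it is full."""

    def __init__(self, capacity):
        self.entries = deque(maxlen=capacity)
        self.sequence = 0  # running count of every event ever logged

    def log(self, event, value):
        # Store a sequence number so gaps and ordering are visible later.
        self.entries.append((self.sequence, event, value))
        self.sequence += 1

    def dump(self):
        # Oldest to newest, as one would read it after a halt.
        return list(self.entries)


trace = TraceLog(capacity=3)
for sample in [0, 1, 0, 1, 1]:      # hypothetical status samples
    trace.log("status", sample)

recent = trace.dump()
```

Because each entry is recorded at the moment the event happens, the log stays meaningful even though peripherals continue to run while the CPU is halted.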

    Zheng Zhao said:

    3. And how much is the cost (procedure, time)? Is it affordable to do all these numerous steps just for observing the status of a single variable? How long does it take to restore the previous running state? And after all, considering the lag between different units, is a restoration possible?

    Great question.  There is no one answer to any of these.  You will need to decide what the cost is of hunting for the needle in the haystack later on in crunch time.  Generally speaking, if you are creating well defined modules (software and hardware) with clear interfaces, I think you will have a much better time tracking down issues.  But the more you can instrument your code, to give you that visibility when you need it, the better.  Plan ahead before you run.

    Zheng Zhao said:

    I have never considered such questions in my programming experience before. Although DSP development brought all of them to the forefront, they actually also exist in the seemingly simple environment of Windows programming. Perhaps the only situation in which these considerations do not arise is pure algorithmic testing, where no I/O or other device coordination is involved.

    Personally, I don't consider the questions you asked (which are excellent questions, and it is good that you are thinking about them) to be specific to DSP, aside from the standpoint that DSP is often associated with real-time processing.  These are good questions to consider with any system.

    Zheng Zhao said:

    Coming back to a DSP system, what is actually happening when CCS halts at a breakpoint?

    1. I think the first principle is separation: everything that can only be changed by the execution of instruction code will remain unchanged during the halt at the breakpoint. This should include:

    1) call stacks
    2) the heap, on which allocation is done only by a small group of functions such as malloc()
    3) most of the registers, either within or outside the DSP chip

    But there could be complications in this case. For clarity, let's refer to "bits" rather than "registers". Some registers are provided to set the running mode of a device, much like buttons on a power plant control panel; some are intended to provide a feedback channel, much like a signal light in such a plant. If no instruction code is being executed when CCS halts at a breakpoint, those "buttons" will remain intact, but not necessarily the signals.
    So if we view a register address containing a "signal" bit at different times (we first click "view memory" from the menu, key in an address and press "enter", then after an arbitrary wait key in the same address again or simply refresh its value), is it possible for us to see different bit values?

    I agree with your comments.  Yes, there is a possibility that values will change on you as you refresh (or update) the window.  If these are active signals that change state independently of code execution, you will likely see changes in the values when you refresh the window.
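
A small simulation illustrates the point. It is a toy model in Python, not a description of any particular device: `ToySystem`, the tick counts, and the toggle period are assumptions. The variable owned by the program stops changing once the CPU is halted, while the peripheral's status bit keeps toggling on its own schedule, so two reads taken at different times can differ.

```python
class ToySystem:
    def __init__(self, toggle_period):
        self.cpu_halted = False
        self.counter = 0              # changed only by executing code
        self.status_bit = 0           # changed by the peripheral itself
        self.toggle_period = toggle_period
        self.time = 0

    def tick(self, n=1):
        for _ in range(n):
            self.time += 1
            if not self.cpu_halted:
                self.counter += 1     # stands in for program execution
            if self.time % self.toggle_period == 0:
                self.status_bit ^= 1  # peripheral runs regardless of the halt


system = ToySystem(toggle_period=3)
system.tick(4)                 # program runs freely for a while
system.cpu_halted = True       # breakpoint hit

first_read = (system.counter, system.status_bit)
system.tick(3)                 # time passes while we look at the window
second_read = (system.counter, system.status_bit)
```

Here `counter` plays the role of a "button" (it holds its value during the halt) and `status_bit` plays the role of a "signal light".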

     

    Zheng Zhao said:

    2. Item 1 considered stacks, heap and registers. What about memory? Even during a breakpoint halt I can still see the moving scene captured by the camera being continuously output to the connected display, which is apparently done by the video processing module within the chip. To achieve this, frames must first be written to a memory address and then retrieved by the back-end video module for output. In this situation, there must be at least one block of memory that is being written and read, with the accesses of course coordinated to avoid conflict.
    If we attempt to view an address in this block during a breakpoint halt, what will happen? Do we get different values at different times? Are we allowed to read it or not?

    Yes, you will get different values at different times if those other subsystems are actively accessing memory. 
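
One consequence worth keeping in mind is that refreshing a memory window takes time, so a buffer that another module is writing can be captured partly old and partly new. The sketch below models this in Python under stated assumptions: `frame_buffer`, the one-word-per-step `writer`, and the relative read/write speeds are invented for illustration and do not describe how the VPFE or CCS actually schedule their accesses.

```python
def writer(buffer, frame_value):
    """Generator standing in for a hardware module: one word per step."""
    for i in range(len(buffer)):
        buffer[i] = frame_value
        yield


frame_buffer = [1] * 6              # currently holds "frame 1"
steps = writer(frame_buffer, 2)     # a module starts writing "frame 2"
next(steps)
next(steps)                         # two words are written before we look

snapshot = []
for i in range(len(frame_buffer)):
    snapshot.append(frame_buffer[i])  # debugger reads one word
    if i % 2 == 1:
        next(steps, None)             # writer advances once per two reads

for _ in steps:                     # let the writer finish the frame
    pass
later_snapshot = list(frame_buffer)
```

In this model the first snapshot mixes words from both frames, and a later snapshot of the same addresses shows only the new frame.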

  • Anonymous
    0 Anonymous in reply to BrandonAzbell

    Dear Brandon,

     

    I am very glad to get such detailed and informative answers from you. They are very helpful. Thanks very much.

    I would like to ask another related question: we have a "Reset" button in CCS which offers:

    1. CPU Reset
    2. System Reset
    3. Emulator Reset

    Regarding what we have just discussed: what is the scope of each of these options?

    1. Does CPU mean the arithmetic unit, or the megamodule, or everything within the TI DSP chip?
    2. Does System mean everything visible to the CPU, including all memory-mapped registers of peripheral devices? Does "System Reset" clear all contents of DDR2 memory to zero?
    3. I have never found "Emulator Reset" to be available. What is its functionality and purpose?

     

    Why are these resets provided in debugging? If in the middle of a run we reset any of 1, 2 or 3, do we lose all the previous state/history? If the breakpoint is at the 100th line of a total of 1000 lines of code, do we lose everything that resulted from code lines 0-99? And when we click the green "Run" button, do we start from "nothing" at code line 100 and, if there is no other breakpoint, run all the way down to code line 1000?

    If at the said new starting point, line 100, we have nothing left of the previous steps, do we have to re-initialize the CPU and peripherals? Is the subsequent run (lines 100 to 1000) still meaningful if we do not?

    And why is it that sometimes, after I choose "System Reset" and then click "Terminate All" to quit the debug session, if I try to start debugging again, CCS tells me that there is a "verification error" on the target and debugging cannot be started? Usually I have to power-cycle the EVM to resolve this.

     

     

    Sincerely,
    Zheng

  • Anonymous
    0 Anonymous in reply to BrandonAzbell

    Dear Brandon,

    Brandon Azbell said:

    In a general sense, I agree with your statements.  However, do keep in mind that some devices and/or systems may be operating off of completely separate and unrelated clock domains.  They are not necessarily tied to the target CPU clock.  This implies that some elements of the overall system are asynchronous to each other.

    What types of devices are such "exceptions"? Could you give me a few examples?

     

    Sincerely,
    Zheng

  • Quick examples that I can think of are analog-to-digital converters that may need to operate at a specific sampling frequency, or video encoders/decoders that operate at a different clock frequency than the one the CPU is using.

  • Anonymous
    0 Anonymous in reply to BrandonAzbell

    Dear Brandon,

    I encountered a strange problem when debugging a video program in which I need to identify the top/bottom field in a BT.656 stream. Although the DaVinci 6437 chip has a register bit (SYN_MODE.FLDSTAT) in its video processing front end (VPFE) that serves as the field indicator, at an interrupt breakpoint I observe irregular value changes rather than the expected regular toggling between 0 and 1. I was not able to find the cause of this problem, and this is what directly motivated me to think about the under-the-hood details of debugging. I have found that many of the points in your answer are relevant to the problem, but I was still not able to figure it out with these concepts. Could you have a look at this post (Several related VPFE questions)?

    Sincerely,
    Zheng

  • It looks like your other thread is being worked on.

  • Anonymous
    0 Anonymous in reply to BrandonAzbell

    Dear Brandon,

    I seem to have found a clue to this problem. I will see if I can work it out.

     

    Thanks,

    Zheng