TMS320C6672ACYP25 power consumption and heat issue

Other Parts Discussed in Thread: TMS320C6672

Hello,

We have been producing a product with the TMS320C6672 for a little over a year and have been happy with it, but we had an application where we needed the 1.25GHz version (TMS320C6672ACYP25). We recently did a run of boards with the new part and started by running our existing software, which runs at 1GHz.

With the same code, our current consumption doubled, and the part shuts down due to the excess heat. 

We haven't done a build since earlier this year.  Our guess is that maybe a change was made to the die, and our code might need to change as well, but we aren't sure.  Was there a change to the part recently?  Or is the 1.25GHz a totally different die? 

Based on the power calculator, we would expect a rise in current with increased speed, but didn't expect to see a jump just moving to the 1.25GHz part.  Is there something we are missing?

Thank you

  • We have run into the EXACT SAME PROBLEM.  We switched from 6672ACYP to 6672ACYP25 for the extra horsepower, but we have not touched the code yet to take advantage of it.  We brought our first six boards up early this week and encountered the extreme current / voltage issue.

    The variable core is drawing 4 Amps on the 1.25GHz version whereas it draws 2 Amps on the 1 GHz version as reported by the Fusion application.  And the current goes up dramatically if we allow the DSP to get hot. 

    We were running the previous version without a heat sink, had to put a heat sink on now as we measured the DSP at 220 degrees F with our thermal camera before our system manager FPGA ultimately shut down.

    Our board was respun between the versions, but with only minor changes.  I have scrubbed the BOM and the board to see if we had something installed wrong, but I was not able to find anything glaring.

    I also was told by Avnet sales that rolling back to the 1 GHz part is not an option, that these are the same parts, speed is just determined by yield. 

    We are stuck in the mud awaiting a response.

    Thanks,

    Joel.

  • Jamal, Joel,

    We apologize for the wait. Please note that the 1.25GHz parts use a higher cutoff screen than the 1GHz speed grades, so leakage is expected to be higher on the newer parts even when running at 1GHz. The power model handles this by selecting the larger leakage curve when the entered CPU frequency is > 1000MHz. Therefore, to get the 1GHz numbers for a 1.25GHz part, enter 1001MHz in the power model.
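The selection rule described in this reply can be pictured as a simple threshold on the entered frequency. This is only a sketch of the rule as stated here; the function name and the curve labels are hypothetical and are not taken from the TI power model spreadsheet.

```python
def select_leakage_curve(cpu_freq_mhz: float) -> str:
    """Pick a leakage curve the way the reply describes the power model.

    The returned labels are illustrative (hypothetical), not names used by the TI tool.
    """
    # Any entry strictly above 1000 MHz is treated as a 1.25 GHz speed grade part.
    if cpu_freq_mhz > 1000:
        return "1.25GHz-grade leakage curve"
    return "1GHz-grade leakage curve"


for freq in (1000, 1001, 1250):
    print(freq, "MHz ->", select_leakage_curve(freq))
```

Entering 1001 therefore switches curves while leaving the frequency-dependent part of the estimate essentially at its 1GHz value.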

    A few questions:

    1) What is the nature of code you are running (some benchmark? Custom code?)

    2) What is the expected CPU and peripheral loading for your use case? Does it heavily load the C66x cores?

    3) For your use case, what power did you measure on the 1GHz speed grade units? (Joel, you have stated your numbers). Could you create a table with old and new measurements on your device as well as what is expected by the power model? Screen shots of the power model will be best.

  • Hi Aditya,

    We didn't use the estimator when we started the project, so we don't have anything to compare it to, but just looking at the spreadsheet, going from 1000 to 1001, the current jumps by 1 Amp.  We are seeing a 2 Amp jump (from 2A to 4A).  All of that is on the core voltage.  Can you explain the large jump in current?  It doesn't really make sense.

    There is some custom code, obviously, but we started with the SDK.  Yes, we are using the DSP's quite a bit. 

    Thanks

     

     

  • Jamaal,

    The higher screen point could potentially explain the difference. Allow me to explain:

    The power model gives you the worst case power consumption after you have entered your desired utilization, frequency, temperature, etc. parameters.

    All 1.25GHz speed grade units shipped will have a lower power consumption than this worst case. But you can still expect differences among the devices where some consume more power than others due to differences in leakage power.

    It could very well be the case here that the 1GHz speed grade unit was at the lower end of its leakage spectrum (consuming say ~2W instead of ~3.2W) and the 1.25GHz unit is at the higher end of its spectrum (consuming say ~4W instead of ~4.2W). This would mean a 2A jump. But we need to verify that this is indeed the case.
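The arithmetic behind that 2A figure can be checked with I = P / V. The sketch below assumes a nominal 1 V core rail (an assumption; the actual core supply is variable) and uses only the example wattages quoted in this reply. The variable names are made up for illustration.

```python
CORE_VOLTAGE_V = 1.0  # assumption: nominal core rail; the real rail is variable


def current_a(power_w: float, voltage_v: float = CORE_VOLTAGE_V) -> float:
    """Convert power to supply current using I = P / V."""
    return power_w / voltage_v


# Example figures quoted in the reply (watts).
example_1ghz_w, worst_1ghz_w = 2.0, 3.2        # low-leakage 1GHz unit vs. model worst case
example_1p25ghz_w, worst_1p25ghz_w = 4.0, 4.2  # high-leakage 1.25GHz unit vs. model worst case

worst_case_delta_a = current_a(worst_1p25ghz_w) - current_a(worst_1ghz_w)
unit_to_unit_delta_a = current_a(example_1p25ghz_w) - current_a(example_1ghz_w)

print(f"worst case to worst case: {worst_case_delta_a:.1f} A")
print(f"low-leakage 1GHz unit to high-leakage 1.25GHz unit: {unit_to_unit_delta_a:.1f} A")
```

Under these assumptions the worst-case curves differ by about 1 A, while two units sitting at opposite ends of their leakage ranges differ by 2 A.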

    • Do you see the ~2A jump on multiple sets of 1.25G and 1G units? 
    • For 3-5 of the above units,  can you read the four 32-bit values at these locations?
      0x02620008 0x0262000C 0x02620010 0x02620014

    SIDE NOTE: your power supply sizing should be determined by this worst case after you have entered  the parameters.

     

  • There is not anything special about the SW.  As Jaamal pointed out, SW was developed following Keystone documents and the TI SDK.  I am having a difficult time accepting that the DSP with the new fabric requires 4A to run an infinite loop whereas the previous fabric requires 2A for the same loop. 

    After bootstrap, we simply run in a loop waiting to process commands from a host controller.  There is no real processing here.  When we do enter the mode where we are endlessly processing commands which include interfacing with the PCIE controller, an external USB controller via the EMIF bus, and running some math algorithms, we do not see a change in current.  The DSP draws the same amount of current as it does when we are idle.  We notice a 20% global current increase when we are hammering on the DDR3 to run an initial memory test after boot, but it settles back down and is stable after the test completes.  This is true on both versions of the DSP. 

    We are running on a single core at this point. 

    The engineer who is responsible for SW development, and who will be needed to read back these registers for us, left on vacation this morning and is out until next Tuesday.

    Joel.

  • Jamaal and Joel,

    I stated this as a "SIDE NOTE" earlier but it should really have been at the front and center of my response: TI will only guarantee that your units consume lower power than the power model for your use case. You are expected to design your power supplies based on the worst case generated by the power model.

    In that context, as long as your units meet this criterion, a 2A jump does not really mean anything other than differences in the leakage screening and process strength variations.

  • As I obviously failed to state because it is not an issue... our power supply is capable of 10A for the variable 1V core supply.  The issue that we have, which is in the subject line of this post, is that the DSP turns the extra 2A to heat and this heat must be dissipated or the chip overheats.  At 2A on the previous screen we don't need a heat sink.  At 4A we need a very large heat sink or a smaller heat sink plus a fan.

    Joel.

  • The overheating indicates that you may not have an adequately designed thermal solution. The thermal solution needs to take into account the worst case power dissipation for your use case, as derived from the power model.

    TI offers thermal design guidelines for Keystone devices. Please take a look at Thermal Design Guide for KeyStone Devices (SPRABI3)
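As a rough illustration of why the extra 2A matters thermally, the sketch below converts the 220 degrees F camera reading reported earlier in the thread to Celsius and estimates the total thermal resistance a cooling solution would need. The temperature limit and ambient value are hypothetical placeholders, not datasheet figures, and only the core rail power (4 A on an assumed 1 V rail) is counted; real limits and total device power should come from the datasheet and the power model.

```python
def fahrenheit_to_celsius(temp_f: float) -> float:
    """Convert degrees Fahrenheit to degrees Celsius."""
    return (temp_f - 32.0) * 5.0 / 9.0


def required_thermal_resistance(max_temp_c: float, ambient_c: float, power_w: float) -> float:
    """Largest total thermal resistance (deg C per W) that keeps the part at or below max_temp_c."""
    if power_w <= 0:
        raise ValueError("power_w must be positive")
    return (max_temp_c - ambient_c) / power_w


measured_c = fahrenheit_to_celsius(220.0)  # thermal camera reading from the thread
print(f"measured device temperature: {measured_c:.1f} C")

MAX_TEMP_C = 85.0  # hypothetical limit, NOT taken from the datasheet
AMBIENT_C = 40.0   # hypothetical ambient temperature

for core_current_a in (2.0, 4.0):
    core_power_w = core_current_a * 1.0  # assumed 1 V core rail; other rails ignored
    theta = required_thermal_resistance(MAX_TEMP_C, AMBIENT_C, core_power_w)
    print(f"{core_current_a:.0f} A core: needs at most {theta:.2f} C/W")
```

With these placeholder numbers, doubling the core current halves the allowable thermal resistance, which is consistent with a board that ran without a heat sink at 2A needing one at 4A.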

     

  • Clearly - this is my problem, not taking thermal conditions into account when we designed our product before the 1.25GHz version was even in the TI thought queue!  Thanks, but to be honest with you, you are not working towards finding the root cause of TI's problem.

  • Joel,

    From reading this thread, the 1.25GHz C6672's you have are operating as expected and within the power consumption limits TI has specified.  

    The power consumption spreadsheet provides the worst case power consumption numbers that would be seen for the device under the operating conditions configured in the power spreadsheet.  The power consumption spreadsheet is provided so that customers can develop an adequate power and thermal solution.  

    If the C6672's are not exceeding the power consumption spreadsheet, I'm not sure what 'root cause' you're after.

    It could very well be that your 1GHz units were more towards the nominal end of the process and were consuming a fair amount less power than the Power Consumption Spreadsheet indicates for the solution.

    The 1.25GHz units may have been towards the 'hotter' side of the process, but still within our specified power consumption limits.

    At this point, is there some area in which you feel that TI has failed to identify an issue?

  • Take your blinders off and get off the Nile (denial) - what semiconductor world are you coming from?  This is the same code running at the same clock frequency on two different versions of your part, which have the same part number, with just a 1.25GHz marking up in the corner of the part.  We are experiencing a current doubling and we should have expected that based on your power estimator spreadsheet????  My Avnet FAE even ran your spreadsheet and reported to us that we should have expected 1A worst case.  And worst case would come into play as the DSP approaches its maximum temperature, yet the current doubling appears when the part is ice cold.  That is one heck of a good power estimator program you have there if it is really +/-2x, in Amps.  Very useful indeed.

    I have built with several "lots" of the lower frequency part, including C6678s, over the two years that we have worked with it, and we don't see any unreasonable power consumption differences.

    If we were increasing the clock speed, or running it faster than the part was designed for, I could see a doubling in current.  What you are suggesting is that you see a current doubling across process technologies.  This might be the normal for TI, but I highly doubt most chip designers would live with this.  And I highly doubt that anyone would or could produce a part that consumes 2x current from lot to lot.

    At this point your answers, all denial, are of no use.   We have no immediate requirement for the higher speed the 1.25GHz version provides; it was simply changed as forward thinking (bad, bad forward thinking, obviously).  Avnet has provided the lower speed part, and the parts arrived today.  This is what we will go with on the next build.  Besides, it saves me $25 a part also.

    BTW, this forum was really useful, when we had a software question :)

    J.

      

  • Again, that Power Consumption Spreadsheet is for providing the worst case power numbers.  Using a delta difference between what you've measured before, especially on a different speedgrade of device and what you're measuring now, is not realistic.

    The majority of the power consumption at the hotter end of the process is from 'leakage current', which makes up the majority of the baseline power listed in the power consumption spreadsheet.  You'll see this value change significantly with temperature, and you'll also see a significant change when going from 1000MHz to 1001MHz, as that is going from a 1GHz device to a 1.25GHz device, which in general will need to have higher leakage to achieve the higher speeds.

    I highly recommend that you review the Power Consumption Summary Application Report that goes along with the Power Consumption Spreadsheet to fully understand the information it provides and how it should be used.

    If you care to provide the exact measurements (measured device temperature, current on all the C6672 supplies, and a filled-out power consumption spreadsheet), we are more than happy to look at this further.

    This is required for continued analysis.

    Best Regards,

    Chad

  • The results returned from the registers you requested read are:

    0x0262,0008 => 0x1100,7006

    0x0262,000C => 0x0404,1F69

    0x0262,0010 => 0xA000,0000

    0x0262,0014 => 0x37BE,0021

    We did not see these registers documented except "reserved".  Are they documented somewhere?

    J.
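For anyone collecting the same readback from several units, a small helper can turn the comma-grouped hex notation used in this post into plain integers for side-by-side comparison. This is a minimal sketch; the function names are hypothetical, and it makes no attempt to interpret the register contents, which the thread describes only as "reserved".

```python
def parse_grouped_hex(text: str) -> int:
    """Parse a value such as '0x0262,0008' into an int (prefix case does not matter)."""
    return int(text.strip().replace(",", ""), 16)


def parse_readback(dump: str) -> dict:
    """Map each address to its value for lines of the form 'ADDR => VALUE'."""
    registers = {}
    for line in dump.splitlines():
        if "=>" not in line:
            continue  # skip blank or unrelated lines
        address_text, value_text = line.split("=>")
        registers[parse_grouped_hex(address_text)] = parse_grouped_hex(value_text)
    return registers


# Values exactly as reported in the post above.
READBACK = """
0x0262,0008 => 0x1100,7006
0x0262,000C => 0x0404,1F69
0x0262,0010 => 0xA000,0000
0x0262,0014 => 0x37BE,0021
"""

registers = parse_readback(READBACK)
for address, value in registers.items():
    print(f"0x{address:08X} = 0x{value:08X}")
```

Running the same parser over dumps from a 1GHz unit and a 1.25GHz unit would give two dictionaries that can be compared key by key.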

  • Joel,

    Thanks for the information, we will take a look at this.

    That said, we will still need the following requested information: Measured Device Temp, Current of all the C6672 supplies, and Filled out power consumption spreadsheet.

    Best Regards,

    Chad