HEVC - Current Status, Roadmap, Metrics

Hello All,

I was wondering what the current status is on the HEVC (H.265) codec, what the hardware requirements are, and if there are published performance metrics.

It would be interesting to see how many H.265 streams can be run per core; of course this has to be delineated by stream parameters.

Thanks In Advance,
johnw

  • Johnw,

    Thanks for your interest in our products. Please contact your local TI sales representative and he/she will be more than happy to share this information with you.

  • Hello Roger,

    Hmmm, I am not sure who that person is anymore.  Can you give me suggestions?

    We can talk via msg off list maybe.

    Thanks,
    johnw

  • Johnw,

    If you do not remember his/her name, that is not a problem; please contact the distributor that is supporting you.

    Regards

  • Hi Johnw,

    We have an HEVC encoder and decoder for C66x cores:

    http://software-dl.ti.com/dsps/dsps_public_sw/codecs/C6678/HEVC_E/latest/index_FDS.html

    http://software-dl.ti.com/dsps/dsps_public_sw/codecs/C6678/HEVC_D/latest/index_FDS.html

    From the links above you can download the codec installers and datasheets. Please refer to the datasheets for the codecs' features and for cycle information in some typical use cases.

    Thank you,

    Paula

  • Hello Paula,

    Thanks for your answer.

    I have looked at the documentation that was published in Oct. 2014 - I was wondering if there are any other docs that show the encoding technique, and the number of cores used plus what the core utilization is like in the Oct. 2014 docs.

    I want to build a table of metrics so as to be able to glean the direction a design should go in.

    Also - regarding the 6678 cores - is the data in the Oct. 2014 doc(s) also applicable to the 6678 cores in the K2 processors?

    I saw that Advantech had an HEVC demo at a show this year - do you have any real performance metric numbers from that by any chance?

    I have e-mailed my contacts at Advantech but they are both out of the country right now I think.

    A nice thing that is happening right now is that the K2 pricing is dropping making it a very attractive solution - in case that is news to someone reading this.

    Thanks,
    johnw

  • Hi Johnw,

    The latest published doc is from Oct. 2014. To get an HEVC roadmap you would need to contact the TI business side. If you send me a private message with your company information and email, I can pass them to our business manager.

    About Advantech, I think they showed an HEVC 1080p60 transcoder demo at NAB and a 4Kp24 demo at IBC.

    On the other hand, yes, HEVC can run on the 6678 cores in K2. I also think Advantech has an HEVC demo on K2.

    thank you,

    Paula

  • Hello Paula,

    OK PM'ed you - hope that went through OK.

    As far as the benchmark goes, I am trying to determine how to calculate utilization - the MCPS measured is for the entire encode - correct?  And is that per processor (or core), or is that number an aggregate? Also, do I throw away the notion of MMACS when trying to calculate a utilization number based on the MCPS per encode, and just use the clock speed?

    Is the K2 architecture superscalar?  Since it is massively parallel - I guess it is - but I don't see that term jumping out at me at the moment.  If I have nothing more than cycles per second with which to calculate something like "only X% of processor Y (or core Y in processor Z) was used for the encoding sequence H265MP_ENC_001", and the processor is superscalar, then the calculation possibly isn't that straightforward.

    Thanks,
    johnw

  • Paula,

    Maybe I missed something - but when I downloaded the encoder - I expected to see a bunch of C source code - I didn't see that.

    Is the entire encoder source for HEVC included in the first link you gave me above?

    Thanks,
    johnw

  • Paula/TI,

    So, if we just use the metric of cycles per second on the DSPC-8682 - for the first profile, Config ID H265MP_ENC_001 - and take the 1 GHz clock rate with, say, 4 chips for an aggregate 4 GHz, then a simple (normalized) calculation gives 1230 / 4000 * 100 = 30.75% (average) - is that a real utilization?

    Can I put two more of the Airshow_p1920x1080_420p.yuv, YUV420, Random Access, IBB @ 8Mbps @ 60 frames per second streams onto those 4 6678's for a total utilization of 92.25% (theoretically)?  Maybe I am I/O bound or out of memory before I get there - but let's just say, for the sake of this calculation, that I am not memory or I/O bound - is that enough in the ballpark to make sense?

    Thanks,
    johnw

  • Hi John,

    - The MCPS figures are per core. So if you are using, say, 4 cores, the reported "Mega Cycles per second" is 1200, and your CPU clock is 1250 MHz, then you are real-time.

    - The HEVC encoder is multicore/multichip scalable. However, please keep in mind that MCPS per core, in a multichip/multicore scenario, does not necessarily extrapolate linearly. For example, if X cores are required for real-time 1080p30, then using Y*X cores with Y>1 will encode faster than real time; conversely, using Z*X cores with Z<1 won't be real time.

    The best way is to have a setup for measuring MCPS (the MCSDK video PCIe HEVC demos could help); different configurations and use cases will give you different MCPS requirements.

    http://processors.wiki.ti.com/index.php/MCSDK_VIDEO_2.1_PCIE_Demo_Guide

    http://software-dl.ti.com/sdoemb/sdoemb_public_sw/mcsdk_video/02_02_00_42/index_FDS.html

    thank you,

    Paula
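
The real-time rule in the reply above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names are hypothetical (they are not part of any TI tool), and 1200 MCPS and 1250 MHz are the example numbers from the reply, not measurements.

```python
# Sketch of the per-core real-time check described in the reply.
# All names here are hypothetical helpers, not a TI API.

def is_real_time(mcps_per_core: float, clock_mhz: float) -> bool:
    """A core keeps up if the cycles it needs per second fit in its clock."""
    return mcps_per_core <= clock_mhz


def headroom_mhz(mcps_per_core: float, clock_mhz: float) -> float:
    """Spare mega cycles per second on each core (negative means too slow)."""
    return clock_mhz - mcps_per_core


def speed_factor(mcps_per_core: float, clock_mhz: float) -> float:
    """Encode speed relative to real time, assuming the cycle cost is fixed."""
    return clock_mhz / mcps_per_core


print(is_real_time(1200, 1250))   # True
print(headroom_mhz(1200, 1250))   # 50
print(round(speed_factor(1200, 1250), 3))
```

Because the reported figure is per core, it is compared against one core's clock rather than against the summed clocks of all cores or chips in the configuration.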

     

     

  • Hi John, on ti.com we distribute the HEVC library as object code, with no source code (the package comes with a CCS and VC test application, FYI). For source code you would need to contact the TI business unit.

    thank you,

    Paula

  • Hi John,

    If you need 4 chips - 32 cores (datasheet H265MP_ENC_001 use case - RA, IBB, 1080p60, 8Mbps) and the reported Mega Cycles per second per core is 1230, then you will need to clock your DSPs @ 1250 MHz for real time, and your utilization would be ~98.4%.

    For 2 channels of the same above use case you would need all 8 DSPs from your DSPC-8682.

    thank you,

    Paula
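
As a rough sketch of the numbers in this reply: the per-core utilization is the reported MCPS divided by the core clock, and the channel count on a board is the number of DSPs divided by the DSPs each channel needs. The helper names below are hypothetical, and the sketch assumes 8 cores per C6678 and 8 DSPs on the DSPC-8682, as stated in the thread.

```python
# Hypothetical helpers reproducing the arithmetic in the reply above.

CORES_PER_C6678 = 8  # one C6678 chip has 8 C66x cores (per the thread)


def cores_per_channel(chips_per_channel: int) -> int:
    """Total cores used by one channel that spans several chips."""
    return chips_per_channel * CORES_PER_C6678


def per_core_utilization(mcps_per_core: float, clock_mhz: float) -> float:
    """Fraction of each core's cycles consumed by the encode."""
    return mcps_per_core / clock_mhz


def channels_per_board(dsps_on_board: int, chips_per_channel: int) -> int:
    """Whole channels that fit when each channel needs a fixed chip count."""
    return dsps_on_board // chips_per_channel


print(cores_per_channel(4))                         # 32
print(f"{per_core_utilization(1230, 1250):.1%}")    # 98.4%
print(channels_per_board(8, 4))                     # 2
```

This also shows why dividing 1230 by an aggregate 4000 MHz understates the load: the 1230 figure already applies to every one of the 32 cores.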

  • Hello Paula,

    Can you please give me the TI Business Unit contact information?

    Thanks,
    johnw

  • OK Paula - so for the sake of clarity - let's discuss multi-chip vs. multi-core as in Table 1 of the HEVC datasheet.

    So, multi-chip(4 Chips) means 8 * 4 = 32 cores - correct?
    Multi-core (8 cores) means - 1 chip - correct?

    I am wondering why some of the terminology is used the way it is for Table 1 - Configuration table.  In determining
    utilization, this is an important distinction as you know.

    Thanks,
    JohnW

  • Paula,

    So, if we switch gears here a bit - for the H.264 CODEC in the June 2014 datasheet (SPRS882) - if we look at H264HP_ENC_001 - which says single core - meaning 1/8th of the '6678 (correct?) - the average is 570 MCPS.

    So, if I run the clock for that device at 570 * 2 (plus some delta) - I can have a clock of ~ 1.2 GHz and get two streams of Shyam_p720x480_420p_8bit.yuv, YUV420, CAVLC, CBR, IPPP @ 2Mbps @ 30 frames per second, out of 1 core (or 1/8th) of a '6678 - assuming I am not I/O bound or memory restricted, correct?

    I realize I may have to develop a mux or arbiter to actually make that work - but again, I am trying to get a feel for real utilization metrics.

    Thank You,
    John W.
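
The estimate in this post can be written out as a small sketch. It assumes, as the post does, that the cycle cost of several streams on one core simply adds up and that there is no I/O, memory, or switching overhead; it also uses the average MCPS figure, so peaks may not fit. The function names are hypothetical.

```python
# Hypothetical sketch: how many streams of a given cost fit on one core.

def streams_per_core(mcps_per_stream: float, clock_mhz: float) -> int:
    """Whole streams that fit in one core's cycle budget (idealized)."""
    return int(clock_mhz // mcps_per_stream)


def spare_mhz(mcps_per_stream: float, clock_mhz: float) -> float:
    """Cycles per second (in MHz) left over after packing whole streams."""
    return clock_mhz - streams_per_core(mcps_per_stream, clock_mhz) * mcps_per_stream


# H264HP_ENC_001: 570 MCPS average on a single core, clocked at ~1.2 GHz.
print(streams_per_core(570, 1200))  # 2
print(spare_mhz(570, 1200))         # 60
```

The 60 MHz left over is the "delta" mentioned above, which would have to cover whatever mux or arbiter logic shares the core between the two streams.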

  • Paula,

    OK - I thought X was percent, Y was a core, and Z was the chip or 'processor' - so I got a little confused but I think I understand the point you are making.  

    The only thing I have at my disposal at the moment is the K2 EVM - and I have not tried to 'port' the demos over to that yet even though that will probably happen.   I have been talking to Advantech and am working on getting one of the PCIe boards.

    Thanks,
    John W.

  • Paula,

    Another question - how does one know what the 'magic' links are to get the latest 2.2 stuff?

    Thanks,
    John W.

  • Paula,

    And, in the quest for utilization, the (or at least one or perhaps the 'prime') metric, as far as stream parameters go, 
    is obviously frames per second; don't you agree?

    Thanks,
    johnw

  • Hi John, your H.264 encoder observations look good to me. About the prime metric, I guess it depends a little bit on your final application, but after fixing resolution and quality level for your use case, then yes, you can check the encode MCPS or FPS (roughly cpu_clock_Hz divided by the cycles needed per frame) as a metric of utilization. About the MCSDK video download link: from the main MCSDK video wiki you should get it at the bottom, in "Product download and updates", but it hasn't been updated. I will do it. Thanks for the catch.

    Paula
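
One way to read the FPS metric mentioned here, as a hedged sketch: convert the datasheet MCPS at its stated frame rate into cycles per frame, then divide the core clock by that cost. The function names are hypothetical, and the sketch assumes the cycle cost per frame does not change with clock speed.

```python
# Hypothetical sketch: estimate achievable frame rate from a datasheet figure.

def cycles_per_frame(mcps: float, datasheet_fps: float) -> float:
    """Average cycles one frame costs, from MCPS measured at datasheet_fps."""
    return mcps * 1e6 / datasheet_fps


def achievable_fps(mcps: float, datasheet_fps: float, clock_mhz: float) -> float:
    """Frames per second one core could sustain at the given clock."""
    return clock_mhz * 1e6 / cycles_per_frame(mcps, datasheet_fps)


# H264HP_ENC_001 from the thread: 570 MCPS at 30 fps, core clocked at 1.2 GHz.
print(cycles_per_frame(570, 30))
print(round(achievable_fps(570, 30, 1200), 2))
```

An estimate above twice the 30 fps target is consistent with fitting two such streams on one core.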

     

  • OK Paula - sure thing - thanks for updating the link.

    Best Regards,
    John W.

  • Hello Paula,

    In the quest for clean metrics and determining performance for the K2 family, the max clock speed of the C66x CorePacs for the K2 is 1.2 GHz and for the '6678 it is 1.4 GHz according to the datasheet.

    Since some of the metrics indicate a clock speed in excess of 1.2 GHz - can the K2 be run faster than that to
    achieve real-time performance?

    Looks like we're going to get the K2 EVM I have into a chassis with a PCIe bus to do some real testing in the near future.

    Thanks,
    John W.

  • Hi John, we currently clock the C6678, for testing and cycle profiling, at 1250 MHz, so you should be OK. If any use case needs more than 1200 MHz, we can adjust some encoder params (with minimal VQ effect) in order to fit the new cycle budget.

    thank you,

    Paula 

  • Paula,

    Well, if we look at H265MP_ENC_001 and H265MP_ENC_005, for instance, both of those exceed what real time would allow with a datasheet clock of 1.2 GHz.  So, if this was running inside the K2x, I suppose that could be adjusted - the overshoot appears slight - but as-is it would not be real time at 1.2 GHz, only close.

    Also, several of the cases exceed 1.2 GHz peak, as shown in Table 2 on page 2 of the datasheet.

    Would running the CorePacs faster mean also the clock rate for the ARM cores would have to be increased to keep the system timing OK?

    Thanks,
    John W.
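
To put a number on "slight", here is a minimal sketch of the cycle reduction an encode would need to fit a lower clock. It uses only the 1230 MCPS figure quoted earlier in this thread for H265MP_ENC_001 (the H265MP_ENC_005 figure is not quoted here, so it is left out), and the function names are hypothetical.

```python
# Hypothetical sketch: how far over budget is an encode at a given clock?

def overshoot(mcps_per_core: float, clock_mhz: float) -> float:
    """Fraction by which the required cycles exceed the clock (0 if they fit)."""
    return max(0.0, (mcps_per_core - clock_mhz) / clock_mhz)


def required_cycle_reduction(mcps_per_core: float, clock_mhz: float) -> float:
    """Fraction of encoder cycles that must be saved to become real-time."""
    if mcps_per_core <= clock_mhz:
        return 0.0
    return 1.0 - clock_mhz / mcps_per_core


# H265MP_ENC_001 at 1230 MCPS per core on a 1.2 GHz K2 CorePac.
print(f"{overshoot(1230, 1200):.1%}")                 # 2.5%
print(f"{required_cycle_reduction(1230, 1200):.2%}")  # 2.44%
```

Under these assumptions the encoder parameters would need to save roughly 2.4% of the cycles to reach real time at 1.2 GHz.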

  • Paula,

    Are the test streams publicly available?

    E.g. - Airshow_p720x480_420p.yuv, YUV420, Random Access, IBB @ 1Mbps @ 60 frames per second, and Airshow_p1280x720_420p.yuv, YUV420, Random Access, IBB @ 4Mbps @ 30 frames per second.

    Thanks,
    John W.

  • Hi John, the Airshow clips come as part of the MCSDK video package and the stand-alone test bench (CCS project).

    thanks,

    Paula

  • Paula,

    OK - but where is this one:

    Airshow_p720x480_420p.yuv

    I don't see the Shyam clips that are in the H.264 datasheet either.

    Thanks,
    John W.

  • John, I will check if we can share the test vector clips and get back to you.

    thank you,

    Paula

  • Hello Paula,

    That will be great.

    Thanks,
    John