HEVC encoder artifacts when running on more than 2 DSPs

Hi,

When I run the HEVC encoder (02.00.00.02 and 02.00.00.03) on 2 DSPs with multiple tiles, it produces correct output.

But when I run it on 3 DSPs with 3 tiles or on 4 DSPs with 4 tiles, the output has artifacts.

I use the same logic for the 2, 3 and 4 DSP modes. I also double-checked that the input/output buffers have the same addresses and contain the same data on all the DSPs.

Please see attached files: hevc_art.zip

Regards,

Andrey Lisnevich

  • Hi Andrey Lisnevich,

    Can you please send us the config file you are using to encode the streams in multi-tile mode?

    Thanks and Regards,
    Palachandra M V
  • Hi,

    The configuration:

    videnc2Params.encodingPreset               = IH265_ENCODINGPRESET_USER_DEFINED;
    videnc2Params.rateControlPreset             = IVIDEO_LOW_DELAY; // 1: LOW_DELAY (CBR),  2: STORAGE (VBR), 4: NONE,  5: USER_DEFINED
    videnc2Params.maxWidth                      = 1920;
    videnc2Params.maxHeight                     = 1080;
    videnc2Params.dataEndianness                = XDM_BYTE; // Not configurable
    videnc2Params.maxInterFrameInterval         = 1; // Max I to P frame distance. 1: no B frames, 2: one B frame, 3: two B frames, etc [1, 255]
    videnc2Params.maxBitRate                    = 600000; // In bits per second. Should be valid as per LEVEL limit
    videnc2Params.minBitRate                    = 600000; // In bits per second
    videnc2Params.inputChromaFormat             = XDM_YUV_420P; // Not configurable
    videnc2Params.inputContentType              = IVIDEO_PROGRESSIVE; // 0: PROGRESSIVE, 1: INTERLACED
    videnc2Params.operatingMode                 = IVIDEO_ENCODE_ONLY; // Not configurable
    videnc2Params.profile                       = IH265_MAIN_PROFILE; // TODO: change description
    videnc2Params.level                         = IH265_LEVEL_41; // TODO: change description
    videnc2Params.inputDataMode                 = IVIDEO_ENTIREFRAME; // Not configurable
    videnc2Params.outputDataMode                = IVIDEO_ENTIREFRAME; // Not configurable
    videnc2Params.numInputDataUnits             = 1;
    videnc2Params.numOutputDataUnits            = 1;
    
    int i;
    for (i = 0 ; i < IVIDEO_MAX_NUM_METADATA_PLANES; i++) {
    	videnc2Params.metadataType[i] = IVIDEO_METADATAPLANE_NONE;
    }
    
    // General HEVC encoder settings
    params.scalingMatrixPreset =    IH265_SCALINGMATRIXPRESET_DEFAULT; // Scaling matrix preset: 0: Default 1: User defined
    params.decRefreshType =         0; // Decoder refresh type: IDR or CDR [0 or 1].
    params.decRefreshInterval =     1; // Decoder refresh interval. [0,1]
    params.enableTransQuantBypass = 0; // Enable/Disable transform and quantisation bypass. [0,1]
    params.maxPoc =                 256; // Maximum allowed Picture Order Count. [16, 65535]
    params.enableTransformSkip =    0; // Enable/Disable transform skip. [0,1]
    params.maxIntraFrameInterval =  120; // Maximum intra frame interval. -1,[1, 2147483647]
    params.enableWPP =              0; // Enable/Disable WPP support. [0,1]
    params.maxNumRefFrames =        1; // Maximum allowed reference frame. [0,1]
    params.enableVirtualTile =      0; // Indicates whether encoding is happening with virtual tile enabled or not
    params.debugTraceLevel =        0; // No debug info
    params.lastNFramesToLog =       0; // No debug info
    params.enableIntraRDO =         1; // This parameter controls the RDO feature in intra frame.
    params.enableSAONonReference =  1; // This parameter controls the SAO for non reference frames.
    params.enableScaleDeadZone =    1; // This parameter controls the scale dead zone algo.
    
    // Rate Control behavior
    rateControlParams.rateControlParamsPreset = IH265_RATECONTROLPARAMS_USERDEFINED; // 0: Default, 1: User defined, 2: existing
    rateControlParams.rcAlgo =                  0; // Rate control algorithm used, 0: Variable bitrate, 1: Constant bitrate (low delay)
    rateControlParams.qpI =                     -1;    // Quantization parameter(0-51) for I-Slices. [qpMinI, qpMaxI]
    rateControlParams.qpMaxI =                  40; // Maximum QP for I frames. [1,51]
    rateControlParams.qpMinI =                  12; // Minimum QP for I frames. [1,51]
    rateControlParams.qpP =                     28;    // Quantization parameter(0-51) for P-Slices. [qpMinP, qpMaxP]
    rateControlParams.qpMaxP =                  51; // Maximum QP for P frames. [1,51]
    rateControlParams.qpMinP =                  12; // Minimum QP for P frames. [1,51]
    rateControlParams.qpOffsetB =               4;
    rateControlParams.qpMaxB =                  51;
    rateControlParams.qpMinB =                  12;
    rateControlParams.enableFrameSkip =         0;
    rateControlParams.enablePartialFrameSkip =  0;
    rateControlParams.qualityFactorIP =         0;
    rateControlParams.cbQPIndexOffset =         2;
    rateControlParams.crQPIndexOffset =         2;
    rateControlParams.initialBufferLevel =      1200000;
    rateControlParams.hrdBufferSize =           1200000;
    rateControlParams.enableHRDComplianceMode = 0;
    rateControlParams.maxFrameSkipCnt  =        0;
    rateControlParams.SubFrameRC =              1;
    rateControlParams.maxDeltaQP =              0;
    rateControlParams.enablePRC =               0;
    
    // Loop filtering operations
    loopFilterParams.loopFilterParamsPreset =        IH265_SLICECODINGPRESET_USERDEFINED;
    loopFilterParams.enableDeblockFilter =           1;
    loopFilterParams.enableSaoFilter =               1;
    loopFilterParams.enableLoopFilterSliceBoundary = 0;
    loopFilterParams.enableLoopFilterTileBoundary =  0;
    loopFilterParams.separateCbCrSAO =               0;
    loopFilterParams.offsetLoopFilterInPPSFlag =     0;
    loopFilterParams.offsetDeblockBetaDiv2 =         0;
    loopFilterParams.offsetDeblockTcDiv2 =           0;
    
    // GOP
    gopCntrlParams.gopCntrlParamsPreset = IH265_GOPCTRLPRESET_DEFAULT;
    
    // Slice coding
    sliceCodingParams.sliceCodingPreset =    IH265_SLICECODINGPRESET_USERDEFINED;
    sliceCodingParams.sliceCodingMode =      0;
    sliceCodingParams.sliceCodingArg =       0;
    sliceCodingParams.enableTiles =          1;
    sliceCodingParams.numTileColumns =       2;
    sliceCodingParams.numTileRows =          2;
    sliceCodingParams.enableDependentSlice = 0;
    
    // Intra prediction coding
    intraCodingParams.intraCodingPreset =          IH265_INTRACODINGPRESET_USERDEFINED;
    intraCodingParams.intraRefreshMethod =         0;
    intraCodingParams.intraRefreshRate =           0;
    intraCodingParams.constrainedIntraPredEnable = 0;
    intraCodingParams.enableStrongIntraSmoothing = 1;
    intraCodingParams.matchYCbCrIntraMode =        0;
    intraCodingParams.enableLumaIntra4x4Mode =     0;
    intraCodingParams.enableLumaIntra8x8Mode =     0;
    intraCodingParams.enableLumaIntra16x16Mode =   0;
    intraCodingParams.enableLumaIntra32x32Mode =   0;
    intraCodingParams.enableChromaIntra4x4Mode =   0;
    intraCodingParams.enableChromaIntra8x8Mode =   0;
    intraCodingParams.enableChromaIntra16x16Mode = 0;
    
    // Inter prediction coding
    interCodingParams.interCodingPreset =   IH265_INTERCODINGPRESET_USERDEFINED;
    interCodingParams.enableTmvp =          0;
    interCodingParams.searchRangeHorP =     144;
    interCodingParams.searchRangeVerP =     32;
    interCodingParams.searchRangeHorB =     144;
    interCodingParams.searchRangeVerB =     32;
    interCodingParams.interCodingBias =     0;
    interCodingParams.skipMVCodingBias =    0;
    interCodingParams.numMergeCandidates =  3;
    interCodingParams.enableBiPredMode =    0;
    interCodingParams.enableFastIntraAlgo = 1;
    
    // Video usability information
    vuiCodingParams.vuiCodingPreset =               IH265_VUICODINGPRESET_DEFAULT;
    vuiCodingParams.aspectRatioInfoPresentFlag =    1;
    vuiCodingParams.aspectRatioIdc =                IH265_ASPECTRATIO_EXTENDED;
    vuiCodingParams.videoSignalTypePresentFlag =    0;
    vuiCodingParams.videoFormat =                   0;
    vuiCodingParams.videoFullRangeFlag =            0;
    vuiCodingParams.colourDescriptionPresentFlag =  0;
    vuiCodingParams.colourPrimaries =               0;
    vuiCodingParams.transferCharacteristics =       0;
    vuiCodingParams.matrixCoefficients =            0;
    vuiCodingParams.timingInfoPresentFlag =         0;
    vuiCodingParams.enableVui             =         0;
    vuiCodingParams.frame_field_info_present_flag = 0;
    vuiCodingParams.hrdParamsPresentFlag =          0;
    
    // Supplemental enhancement information
    seiParams.enableSeiFlag = 0;
    
    // Coding tree block
    ctbCodingParams.maxCTBSize = 64;
    ctbCodingParams.maxCUDepth = 3;
    
    // Dynamic parameters
    dynamicParams.videnc2DynamicParams.forceFrame = IVIDEO_NA_FRAME; // -1: IVIDEO_NA_FRAME, 3: IVIDEO_IDR_FRAME
    dynamicParams.videnc2DynamicParams.generateHeader = XDM_ENCODE_AU; // 0: Encode entire access unit including headers, 1: Encode only header
    dynamicParams.videnc2DynamicParams.ignoreOutbufSizeFlag = XDAS_FALSE; // Non configurable
    dynamicParams.videnc2DynamicParams.inputWidth  = 1920;
    dynamicParams.videnc2DynamicParams.inputHeight = 1080;
    dynamicParams.videnc2DynamicParams.interFrameInterval = 1; // I to P frame distance. 1: no B frames, 2: one B frame, 3: two B frames, etc [1, 255]
    dynamicParams.videnc2DynamicParams.intraFrameInterval = 120; // The number of frames between two I frames. 0: IPPPP..., 1: IIII..., 2: IPIPIPIPI, 3: IPPIPPIPPI or IPBIPBIPBI, etc.
    dynamicParams.videnc2DynamicParams.mvAccuracy = IVIDENC2_MOTIONVECTOR_QUARTERPEL; // Motion vectors accuracy. 0: integer pel., 2: quarter pel.
    dynamicParams.videnc2DynamicParams.putDataFxn = NULL;
    dynamicParams.videnc2DynamicParams.putDataHandle = 0;
    dynamicParams.videnc2DynamicParams.getDataFxn = NULL;
    dynamicParams.videnc2DynamicParams.getDataHandle = 0;
    dynamicParams.videnc2DynamicParams.getBufferFxn = NULL;
    dynamicParams.videnc2DynamicParams.getBufferHandle = 0;
    dynamicParams.videnc2DynamicParams.refFrameRate = 25000;
    dynamicParams.videnc2DynamicParams.targetFrameRate = 25000;
    dynamicParams.videnc2DynamicParams.sampleAspectRatioWidth = 1;
    dynamicParams.videnc2DynamicParams.sampleAspectRatioHeight = 1;
    dynamicParams.videnc2DynamicParams.targetBitRate = 600000;
    dynamicParams.enableTransQuantBypass = 0;
    dynamicParams.enableTransformSkip = 0;
    dynamicParams.enableROI = 0; // Parameter controls the Region of interest feature.
    dynamicParams.writeSpsPpsHdr = IH265_WRITESPSPPSHDR_NONE;

    Regards,

    Andrey Lisnevich

  • Hi Andrey Lisnevich,

    - I tried encoding with the same config parameters on four chips; the encoder encodes without any artifacts.
    - Were you able to encode the stream in a multi-tile configuration with an encoder release earlier than 02.00.00.02?
    - Can you please send us the encoded output for the same config in the four-chip and single-chip configurations, using the same input YUV for both?

    Thanks and Regards
    Palachandra M V
    www.pathpartnertech.com
  • Hi Palachandra,

    I created a demo: drive.google.com/.../view

    It is very similar to the demo in my previous multi-DSP thread: e2e.ti.com/.../405339

    In the hevc_demo/README file you can find instructions on how to build and run it.
    In the hevc_demo/output_examples directory you can find the encoder output for different numbers of DSPs. In the 3 and 4 DSP modes you can see artifacts.
    The input is in hevc_demo/host/bin/input.yuv

    Regards,
    Andrey Lisnevich
  • Hi Andrey Lisnevich

    We are able to reproduce the artifact with the shared demo application. We are working to resolve the same.

    Thanks and Regards
    Palachandra M V
    www.pathpartnertech.com



  • Hi Andrey,

    I tried encoding input.yuv using the MCSDK and the demo setup configuration in 3-chip mode. Please find my observations below.

    - Using MCSDK
    1. The codec encodes without any artifacts.

    - Using Demo Setup
    1. Artifacts are observed in the encoded stream.
    2. In successive encoding runs, the positions of the artifacts are not the same.
    3. The artifacts are due to corruption of the bit stream in the encoded output.
    4. The corruption always starts at the beginning of Tile1. (In 3-chip mode there are 3 tiles: Tile0, Tile1, Tile2.)
    5. The corruption is caused by previously encoded bit patterns overlapping the actual data.

    I suspect this issue may be due to
    - Cache invalidation and write-back for key index, shared_mem_SwappedStream may not be happening as expected, or the output buffer may be overwritten on the host side.
       
    I have attached the bit streams encoded using MCSDK and the demo setup. Comparing the two, below are my findings:

    Instance 1:

    Bit stream corruption initially starts at location @addr0 = 0x00060AB9
    Actual Bit Pattern (DE 07 7C 83 61 .....) 564 Bytes
    Appearing Bit Pattern (B3 F3 A1 43 ED ......) 564 Bytes
    Same 564 bytes of data present in bit stream from @addr1 : 0x0005F406

    Instance 2:

    Bit stream corruption appears at location @addr2 = 0x0006357C
    Actual Bit Pattern (E1 0B F8 00 00 .....) 146 Bytes
    Appearing Bit Pattern (40 00 00 03 00 ......) 146 Bytes
    Same 146 bytes of data present in bit stream from @addr3 : 0x00060965
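
The two instances above share one signature: the bytes that appear at the corrupted location are an exact copy of a block written earlier in the same stream. A minimal sketch of how that check could be automated on a dumped stream is given below. The function name `find_earlier_copy` and its parameters are hypothetical and are not part of the codec or the demo.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: looks for an earlier, non-overlapping copy of the
 * block stream[pos .. pos+len) inside the same stream.
 * Returns the offset of the first such copy, or -1 if there is none.
 */
static long find_earlier_copy(const unsigned char *stream, size_t stream_len,
                              size_t pos, size_t len)
{
    size_t start;

    assert(stream != NULL);

    /* Reject empty blocks and blocks that run past the end of the stream. */
    if (len == 0 || pos > stream_len || len > stream_len - pos) {
        return -1;
    }

    /* Only consider candidates that end at or before the suspect block. */
    for (start = 0; start + len <= pos; ++start) {
        if (memcmp(stream + start, stream + pos, len) == 0) {
            return (long)start;
        }
    }
    return -1;
}
```

For Instance 1, the call would be `find_earlier_copy(stream, stream_len, 0x00060AB9, 564)`; if the finding holds, it should report a copy at 0x0005F406, provided no identical block occurs earlier still. A result of -1 would mean the suspect block is not stale data from earlier in the stream.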

    Can you please give your feedback on my findings?

    Thanks and Regards,
    Palachandra M V
    www.pathpartnertech.com

    3chip_arifacts.7z

  • Do you know why the issue isn't happening on 2 DSPs?

  • Hi Andrey,

    1. Because the bit-stream corruption is random in nature, it is difficult to say why artifacts are not observed in the 2-chip configuration.

    2. I made a few changes in the code to narrow down the cause of the bit-stream corruption:
    (change in codec):
    - I overwrote the actual bit stream data with a constant number based on chip ID (say, chip 0: 0x01, chip 1: 0x02, chip 2: 0x04).
    - After writing to the output buffer, I verified the first few bytes of data at the start of chip 1. Inside the codec the data is as expected, without corruption, but corruption is still observed at the start of chip 1 (chipID: 1) for a few frames in the out.265 file.
    - From this, it looks like the bit stream corruption may not be happening on the codec side.
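
The marker experiment in item 2 can be sketched roughly as follows. This is only an illustration of the idea; the helper names are hypothetical, and the mapping of chip ID to marker byte (1 shifted left by the chip ID) is inferred from the three example values in the post.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Marker byte per chip: chip 0 -> 0x01, chip 1 -> 0x02, chip 2 -> 0x04. */
static unsigned char chip_marker(unsigned chip_id)
{
    assert(chip_id < 8); /* one bit per chip fits in a byte */
    return (unsigned char)(1u << chip_id);
}

/* Overwrite one chip's part of the output buffer with its marker byte. */
static void fill_with_chip_marker(unsigned char *region, size_t len,
                                  unsigned chip_id)
{
    memset(region, chip_marker(chip_id), len);
}

/*
 * Return the offset of the first byte in the region that does not carry
 * the expected marker, or len if the whole region is clean.
 */
static size_t first_foreign_byte(const unsigned char *region, size_t len,
                                 unsigned chip_id)
{
    unsigned char marker = chip_marker(chip_id);
    size_t i;

    for (i = 0; i < len; ++i) {
        if (region[i] != marker) {
            return i;
        }
    }
    return len;
}
```

Running `first_foreign_byte` once on the codec side and once on the data received by the host would show on which side of the transfer a foreign byte first appears, and the value of that byte identifies the chip it came from.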

    3. To make sure the bit stream corruption is not happening on the codec side, I tried to dump the bit stream data into a file immediately after the process call.
    - The contents of context.outBufferDescriptor.descs[0].buf, sized by outArguments->bytesGenerated, were dumped into a file.
    - I was able to dump only a few frames due to a memory corruption issue.
    - Because of this, I was not able to narrow down the cause of the bit stream corruption.

    4. Can you please help me to dump the data of context.outBufferDescriptor.descs[0].buf into a file after the process call? This will help us narrow down the reason for the bit-stream corruption.

    5. The hevcdemo setup provided is very helpful for debugging. To give a free run after making minor changes, I need to load the program onto all the cores. As this takes a long time, it will be helpful if you suggest any changes to give for free run without loading the program in code composer studio; that way, I can switch to the hevcdemo setup whenever required.

    Thanks and Regards
    Palachandra M V
    www.pathpartnertech.com

  • Hi Palachandra,

    >> will be helpful if you suggest any changes to give for free run without loading the program in code composer studio
    To run the demo without CCS:

    1) To build a dsp.out that kicks the cores on startup, change this in app.cfg: Program.global.RUN_VIA_DEBUG = 0;

    2) To run the demo from the .out file without CCS, uncomment this in Demo.cxx: //dspManager.runImageOnAllDspCores(dspId, "dsp.out");
    and put dsp.out into the bin directory

    3) To reset the DSP before a run, uncomment the body of this function in DesktopLinuxSdkDspManagerImpl.cxx: void DesktopLinuxSdkDspManagerImpl::resetDsp(unsigned dspId)
    and put init.out into the bin directory

    >> Can you please help me to dump the data of context.outBufferDescriptor.descs[0].buf into a file after the process call

    Actually, the demo already does this: EncoderTask_outputEncodedData is called on the master core right after the process call. It sends the content of context->outBufferDescriptor.descs[0] to the HOST for dumping.

    Regards,
    Andrey Lisnevich

  • Hi Andrey,

    - Thank you for your update. I am able to encode the stream through the hevc_demo application in both modes (with and without loading the program in CCS).

    - The HEVC encoder artifacts were observed because a multichip barrier was missing for the multi-tile configuration. I have added the multichip barrier in the codec; with this change, artifacts are no longer observed in the encoded stream. I have attached the updated library. Can you please check encoding the stream with the updated library?

    Thanks and Regards
    Palachandra M V
    www.pathpartnertech.com

    h265venc_ti.7z

  • Hi Palachandra,

    I did a quick review. With this release I see no artifacts in the 2 and 4 DSP modes, but in 3 DSP mode with 3 tiles there are still some artifacts: drive.google.com/.../view


    Also, I see no difference between the following settings:

    numTileColumns = 1;
    numTileRows = 4;

    and

    numTileColumns = 4;
    numTileRows = 1;

    and

    numTileColumns = 2;
    numTileRows = 2;

    The same goes for 2 and 3 tiles. The encoder always splits the image in its own way:

    • in the case of 2 and 3 tiles, horizontally
    • in the case of 4 tiles, 2x2

    Regards,

    Andrey Lisnevich

  • Hi Andrey,

    >> Frame division when tile is enabled :
    1. In single-chip processing, numTileColumns can have a value in the range [1, Max CTU in Column] and numTileRows can have a value in the range [1, Max CTU in Row].
    2. In multichip processing, each frame is divided into as many tiles as there are chips.
    3. If the number of chips N is odd, then there will be N horizontal tiles. If N is even, then there will be two rows of tiles, each containing N/2 tiles.
    4. The config values for numTileColumns and numTileRows are not considered for encoding in the multichip, multi-tile scenario.
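
The multichip rule in items 2 to 4 can be written down as a small function. This is a sketch and not codec code: the `TileGrid` type and the function name are hypothetical, and the reading of "N horizontal tiles" as N rows by 1 column is an assumption, chosen because it matches the observation earlier in the thread that 2 and 3 tiles are split the same way.

```c
#include <assert.h>

/* Hypothetical type, not part of the TI codec interface. */
typedef struct {
    int rows;    /* number of tile rows    */
    int columns; /* number of tile columns */
} TileGrid;

/*
 * Tile grid used in multichip mode, following the rule described above:
 *   odd N  -> N tiles, assumed here to be N rows x 1 column
 *   even N -> 2 rows of N/2 tiles each
 * The numTileColumns / numTileRows config values are deliberately not
 * inputs, because the post states they are ignored in this mode.
 */
static TileGrid multichip_tile_grid(int num_chips)
{
    TileGrid grid;

    assert(num_chips >= 1);

    if (num_chips % 2 != 0) {
        grid.rows = num_chips;
        grid.columns = 1;
    } else {
        grid.rows = 2;
        grid.columns = num_chips / 2;
    }
    return grid;
}
```

Under these assumptions, 2 chips give 2x1, 3 chips give 3x1 and 4 chips give 2x2 (rows x columns), whatever numTileColumns and numTileRows are set to.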

    >> HEVC Encoder Artifacts :
    1. In the attached stream, I observe artifacts for all of the 2, 3 and 4 chip encodings.
    2. Can you please send me the configs for 2, 3 and 4 chip encoding where artifacts are observed, so we can reproduce them at our end?

    Thanks and Regards,
    Palachandra M V
    www.pathpartnertech.com

  • Hi Palachandra,

    I got the previous samples with my production app.

    Please see the samples I get with the demo app: new_hevc_output.zip

    There are no artifacts (at least visually) when running on 4 and 2 DSPs, but there are still artifacts on 3 DSPs.

    Use the current demo to reproduce.

    Regards,

    Andrey Lisnevich

  • Hi Andrey,

    The HEVC encoder artifacts observed when encoding with 3 chips were due to an incorrect offset calculation when fetching the reference region. I have corrected this. I am attaching the updated library; can you please check encoding the stream with it?

    Thanks and Regards
    Palachandra M V
    www.pathpartnertech.com

    2548.h265venc_ti.7z

  • Hi Andrey,

    Please let me know whether the artifact issue is resolved with the latest library.

    Thanks and Regards
    Palachandra M V
    www.pathpartnertech.com

  • Thanks Palachandra,

    It looks like this build fixes all the issues described above.
    But I am still in the process of testing the HEVC encoder.

    Regards,
    Andrey Lisnevich