The attached cmd file contains the following IMPORTANT NOTE:
/* ------------------------------------------ */
/* IMPORTANT NOTE: Splitting external memory */
/* into sections shown, grouping of functions */
/* and ordering inside groups is a crucial */
/* requirement for performance. Alteration */
/* in the above will lead to severe cache */
/* penalties and hence results in performace */
/* degradation. */
/* ------------------------------------------ */
So my questions are:
1. What does the IMPORTANT NOTE above mean? Could you explain in detail which optimization techniques are used in the cmd file?
2. I noticed that the cmd file contains lines like the following:
/* FracPelB */
.= align(0x8000); /* align to 32kb */
Why is the cmd file written this way?
3. What is the relationship between H264HPVENC_TI_cSectfracpelP and H264HPVENC_TI_MV_fractionalPel, H264HPVENC_TI_subPelRefinement, H264HPVENC_TI_sad_halfpel, and H264HPVENC_TI_sad_quartpel? Are these four functions subfunctions of H264HPVENC_TI_cSectfracpelP?
4. Continuing from Q2, how much does performance improve when the cmd file is written as shown in Q2?
5. Why does writing the cmd file as shown in Q2 improve the functions' performance?
/* ===========================================================================*/
/* Copyright (c) 2012 Texas Instruments, Incorporated. */
/* All Rights Reserved. */
/* ===========================================================================*/
/*============================================================================*/
/* Partial Linker Command File for H264HP encoder */
/* Purpose: To have hassel free integration of algorithm in system, system */
/* integrator should not worry about managing multiple sections of code and */
/* data. */
/* Define code section(s) with naming convention: .text:H264HPVENC_TI_cSectx */
/* Define data section(s) with naming convention: .const:H264HPVENC_TI_dSectx*/
/* Define const section(s) with naming convention: .far:H264HPVENC_TI_uSectx */
/* - Hide all symbols, export XDAIS functions, create namespaced sections */
/* - Create sections with optimised cache relative placement */
/*============================================================================*/

/* Make relocatable object */
-r

/* Hide all symbols in this partial link */
-h

/* Make XDAIS Functions and Tables Symbols Globally Visible */
-g IH264HPVENC_PARAMS
-g H264HPVENC_TI_DYNAMICPARAMS
-g H264HPVENC_TI_IALG /* module ID */
-g H264HPVENC_TI_activate /* activate */
-g H264HPVENC_TI_alloc /* algAlloc */
-g H264HPVENC_TI_deactivate /* deactivate */
-g H264HPVENC_TI_free /* free */
-g H264HPVENC_TI_initObj /* init */
-g H264HPVENC_TI_numAlloc /* numAlloc (NULL => IALG_DEFMEMRECS) */
-g H264HPVENC_TI_encode /* H264HP encode */
-g H264HPVENC_TI_control /* H264HP encode's control */
-g H264HPVENC_TI_init
-g H264HPVENC_TI_exit
-g H264HPVENC_TI_IH264HPVENC
-g H264HPVENC_TI_getResources
-g H264HPVENC_TI_numResources
-g H264HPVENC_TI_initResources
-g H264HPVENC_TI_reInitResources
-g H264HPVENC_TI_deInitResources
-g H264HPVENC_TI_activateRes
-g H264HPVENC_TI_activateAllRes
-g H264HPVENC_TI_deactivateRes
-g H264HPVENC_TI_deactivateAllRes
-g H264HPVENC_TI_IRES

SECTIONS
{
    /* ------------------------------------------ */
    /* IMPORTANT NOTE: Splitting external memory */
    /* into sections shown, grouping of functions */
    /* and ordering inside groups is a crucial */
    /* requirement for performance. Alteration */
    /* in the above will lead to severe cache */
    /* penalties and hence results in performace */
    /* degradation. */
    /* ------------------------------------------ */
    .const
    {
        *(.const)
    }

    .text:H264HPVENC_TI_cSectfracpelB
    {
        /* FracPelB */
        .= align(0x8000); /* align to 32kb */
        *(.text:H264HPVENC_TI_MV_fractionalPel_B)
        *(.text:H264HPVENC_TI_subPelRefinement_B)
        *(.text:H264HPVENC_TI_sad_halfpel_B)
        *(.text:H264HPVENC_TI_sad_quartpel_B)
    }

    .text:H264HPVENC_TI_cSectfracpelP
    {
        /* FracPelB */
        .= align(0x8000); /* align to 32kb */
        *(.text:H264HPVENC_TI_MV_fractionalPel)
        *(.text:H264HPVENC_TI_subPelRefinement)
        *(.text:H264HPVENC_TI_sad_halfpel)
        *(.text:H264HPVENC_TI_sad_quartpel)
    }

    .text:H264HPVENC_TI_cSectProcIntra8x8
    {
        /* Intra8x8 */
        .= align(0x8000); /* align to 32kb */
        *(.text:H264HPVENC_TI_process_intra8x8mb)
        *(.text:H264HPVENC_TI_predgen_errorgen_Intra8x8)
        *(.text:H264HPVENC_TI_filter_intra8x8_recon)
        *(.text:H264HPVENC_TI_trans_quant_Intra8x8)
        *(.text:H264HPVENC_TI_forward8x8)
        *(.text:H264HPVENC_TI_iquant_Intra8x8)
        *(.text:H264HPVENC_TI_idct_addpred_Intra8x8)
        *(.text:H264HPVENC_TI_inverse8x8)
    }

    .text:H264HPVENC_TI_cSectChromaRecon
    {
        /* ChromaRecon */
        .= align(0x8000); /* align to 32kb */
        *(.text:H264HPVENC_TI_GenChromaErrDC)
        *(.text:H264HPVENC_TI_GenChromaErrHor)
        *(.text:H264HPVENC_TI_GenChromaErrVer)
        *(.text:H264HPVENC_TI_trans_quant_chroma)
        *(.text:H264HPVENC_TI_rle_iq_chroma)
    }

    .text:H264HPVENC_TI_cSectpreMEframe
    {
        /* preMEframe */
        .= align(0x8000); /* align to 32kb */
        *(.text:H264HPVENC_TI_pre_meFrame)
        *(.text:H264HPVENC_TI_mad_hx16_skip2)
        *(.text:H264HPVENC_TI_find_best_sad2_avoid_partition)
        *(.text:H264HPVENC_TI_square_search)
        *(.text:H264HPVENC_TI_mad_NxN_Skip)
        *(.text:H264HPVENC_TI_find_best_sad2)
        *(.text:H264HPVENC_TI_PrepareSearchList1)
        *(.text:H264HPVENC_TI_CalcSadList1)
        *(.text:H264HPVENC_TI_CollectSadList1)
    }
}
First, you need an understanding of C6000 program cache and how it can affect performance. For that, please read the first part of this wiki article. You can stop when it starts discussing the cache layout tool clt6x. Whether you decide to use that tool has nothing to do with understanding this link command file.
This link command file is carefully tailored to place the functions of this H264 implementation so that conflicts in the program cache are minimized. Each output section groups functions that run together and starts on a 32 KB boundary. If you change the relative location of these functions, performance could get much worse because of cycles lost to cache conflicts.
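To make the idea of a cache conflict concrete, here is a rough sketch in Python (not something that runs on the DSP). It assumes a direct-mapped 32 KB program cache with 32-byte lines. The thread does not state the cache geometry; the 32 KB figure is only suggested by the "align to 32kb" comments. The function sizes and addresses are made up for illustration, and pack() and conflicting_lines() are hypothetical helper names. The sketch counts cache lines that two different memory blocks compete for, first for functions packed back to back from a 32 KB-aligned base, then for the same functions scattered at 32 KB multiples.

```python
# Assumed cache geometry (NOT stated in the thread): a direct-mapped
# 32 KB program cache with 32-byte lines.
CACHE_SIZE = 0x8000
LINE_SIZE = 32
NUM_LINES = CACHE_SIZE // LINE_SIZE  # 1024 line slots


def conflicting_lines(placements):
    """Count cache line slots that more than one memory block maps to.

    placements is a list of (start_address, size_in_bytes) tuples.
    In a direct-mapped cache, two blocks that map to the same slot
    evict each other whenever both are executed alternately.
    """
    owners = {}  # slot index -> set of distinct memory block numbers
    for start, size in placements:
        first = start // LINE_SIZE
        last = (start + size - 1) // LINE_SIZE
        for block in range(first, last + 1):
            owners.setdefault(block % NUM_LINES, set()).add(block)
    return sum(1 for blocks in owners.values() if len(blocks) > 1)


def pack(base, sizes):
    """Place functions back to back starting at base, the way input
    sections listed in order inside one output section are laid out."""
    placements = []
    addr = base
    for size in sizes:
        placements.append((addr, size))
        addr += size
    return placements


if __name__ == "__main__":
    # Hypothetical sizes for four functions of one group (20 KB total).
    sizes = [0x1800, 0x1400, 0x1000, 0x1400]

    # Grouped: contiguous from a 32 KB-aligned (hypothetical) base address.
    grouped = pack(0x80008000, sizes)
    # Scattered: each function starts a multiple of 32 KB after the first.
    scattered = [(0x80008000 + i * 0x8000, s) for i, s in enumerate(sizes)]

    print("grouped conflicts:  ", conflicting_lines(grouped))
    print("scattered conflicts:", conflicting_lines(scattered))
```

With these assumed numbers, the grouped layout reports 0 conflicting lines and the scattered layout reports 160. Under this model, a group that fits within 32 KB and starts on a 32 KB boundary cannot conflict with itself, which is one plausible reading of why the cmd file aligns each group and fixes the order inside it.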
Thanks and regards,