AER parameter questions

Hi

Firstly, I'd like to confirm how the coh_ratio_threshold parameter is used in practice. According to the API documentation, when the ratio is exceeded for a number of frames a 'coherence presence' is detected. The AER developer guide does not seem to describe these parameters and events in any more detail. 

What does a 'coherence presence' detection signify? Is this related to the coherence of the acoustic system (the echo path) being considered suitably linear and free of noise to allow convergence of the adaptive filter? Or is it related to something else, e.g. LMS misalignment?

In short, do we want to maximise the number of coherence events or minimise them?

Secondly, can you describe, beyond what the API description says, how the clip_scale_curve is applied? Is the input to the curve the double talk detection value? Presumably the input value is compared against the four thresholds (c1-c4), but what is then scaled by 4, 1, 1 or 0 depending on which range the input falls within?

Kind regards,

Simon

  • Hi Simon,

    Sorry again for my late response.

    1. Regarding coh_ratio_threshold. You want to minimize the number of coherence events when the echo path is stable, e.g. when the phone is not moving, and that means a higher value of this parameter. However, you want this parameter to be low enough to quickly detect an echo path change so that the phone can quickly adapt to the new echo path.

    2. Regarding clip_scale_curve. The input to the curve indicates the likelihood of double talk, and based on this input and the configured curve, the NLP applies the appropriate amount of clipping. The amount of clipping is controlled by a calculated clipper value, which is scaled by this curve. The clipper value itself depends on convergence and far-end signal power.

    Hope this helps.

    Regards,

    Jianzhong
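The trade-off in point 1 can be illustrated with a rough sketch. It is not the AER implementation: the function name, the `hold_frames` argument, and the rule that the ratio must exceed the threshold on consecutive frames are all assumptions made for illustration. Only the idea that an event is flagged once the ratio has exceeded `coh_ratio_threshold` for a number of frames comes from the API description quoted above.

```python
def detect_coherence_events(ratios, coh_ratio_threshold, hold_frames):
    """Return the frame indices at which a coherence event is flagged.

    Hypothetical model: an event is flagged when the per-frame ratio has
    exceeded coh_ratio_threshold on hold_frames consecutive frames.
    """
    events = []
    run = 0  # length of the current run of frames above the threshold
    for i, ratio in enumerate(ratios):
        if ratio > coh_ratio_threshold:
            run += 1
            if run == hold_frames:
                events.append(i)  # flag once per run
        else:
            run = 0
    return events


# Made-up per-frame ratios: a brief disturbance, then a sustained change.
ratios = [0.1, 0.6, 0.2, 0.9, 0.9, 0.9, 0.9, 0.3]

high = detect_coherence_events(ratios, coh_ratio_threshold=0.95, hold_frames=3)
low = detect_coherence_events(ratios, coh_ratio_threshold=0.5, hold_frames=3)
```

With the higher threshold no event is flagged on this data, which is the behaviour wanted while the echo path is stable. With the lower threshold the sustained rise is flagged, which is the behaviour wanted when the echo path really has changed.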

  • Hi Jianzhong

    Thanks for the reply. Just to confirm I understand these correctly:

    1. When the coherence ratio has a high value, then this signifies the echo path has changed? I.e. the LMS coefficients need to be adapted significantly when the threshold is exceeded.

    2. For the example curve c1=40, c2=64, c3=64, c4=128, the clipper will be scaled by 4 until c1 is exceeded, after which it will be scaled by 1. A value of 128 signifies a high likelihood of double talk, hence the scaling of 0 at the c4 threshold can be used to suppress centre clipping during double talk?

    Regards,

    Simon

  • Hi Simon,


    Yes, your understanding is correct.

    In addition, the scaling factor is a continuous function. It gradually changes from 4 to 1 between the first two points on the curve.

    Regards,

    Jianzhong
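As a rough sketch of the curve discussed in this thread, the scaling factor can be modelled as a piecewise-linear function of the double-talk likelihood. The thresholds c1-c4 and the scale values 4, 1, 1, 0 are taken from the posts above, and the gradual change from 4 to 1 between the first two points is confirmed in the reply. The function name, the linear ramp from 1 to 0 between c3 and c4, and the clamping outside the c1-c4 range are assumptions, not documented AER behaviour.

```python
def clip_scale(x, c=(40, 64, 64, 128), s=(4.0, 1.0, 1.0, 0.0)):
    """Hypothetical piecewise-linear clipper scale for double-talk likelihood x.

    c holds the thresholds c1..c4, s the scale applied at each threshold.
    """
    if x <= c[0]:
        return s[0]  # below c1: full scale of 4
    for k in range(3):
        lo, hi = c[k], c[k + 1]
        if x <= hi:
            if hi == lo:
                return s[k + 1]  # zero-width segment (c2 == c3)
            t = (x - lo) / (hi - lo)  # position within the segment, 0..1
            return s[k] + t * (s[k + 1] - s[k])
    return s[3]  # above c4: assumed to stay at 0


for x in (0, 40, 52, 64, 96, 128):
    print(x, clip_scale(x))
```

Under these assumptions the clipper value would be multiplied by the returned factor, so centre clipping is strongest when double talk is unlikely and is switched off completely once the input reaches c4.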

  • Thanks, that's a great help.

    Simon