AER not working with 8KHz Little Endian sample inputs

Hi,

I am working with aer_16_1_0_3. The test application worked fine with the supplied inputs fe_16k.pcm and ne_pink_16k.pcm, as described in the Installation & Verification document. Since my VoIP application runs at 8 kHz, I recorded the PCM files be_fe_s16_8k.pcm and be_ne_s16_8k.pcm (be: big endian; fe/ne: far end/near end; s16: signed 16-bit; 8k: 8 kHz) and changed the Sampling Rate bit field to "0" in the aersimcfg.txt file. The outputs (rxOut.pcm and txOut.pcm) played fine.

Following this, I disabled the endianness conversion logic that runs before/after the file read/write operations in aersim_fileIO.c and ran the same application with little-endian inputs. In this case the outputs (rxOut.pcm and txOut.pcm) were completely noisy.

Then I ran the same case (little-endian inputs) with the endianness conversion enabled in aersim_fileIO.c. In this case the speech in the outputs was audible but not up to the mark.
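
As an illustration of why byte order matters here, the following minimal Python sketch decodes the same bytes as big-endian and as little-endian signed 16-bit samples, and swaps the byte order of a buffer. It is not the code from aersim_fileIO.c; the function names are made up for this example.

```python
import struct


def decode_pcm16(raw, little_endian):
    """Decode raw bytes as signed 16-bit PCM samples in the given byte order."""
    if len(raw) % 2:
        raise ValueError("16-bit PCM needs an even number of bytes")
    fmt = ("<" if little_endian else ">") + str(len(raw) // 2) + "h"
    return list(struct.unpack(fmt, raw))


def swap_byte_order(raw):
    """Swap the two bytes of every 16-bit sample (big <-> little endian)."""
    if len(raw) % 2:
        raise ValueError("16-bit PCM needs an even number of bytes")
    swapped = bytearray(raw)
    swapped[0::2], swapped[1::2] = raw[1::2], raw[0::2]
    return bytes(swapped)


quiet_sample = struct.pack(">h", 3)  # the value 3 stored big endian: 00 03
print(decode_pcm16(quiet_sample, little_endian=False))  # [3]
print(decode_pcm16(quiet_sample, little_endian=True))   # [768]
```

A very quiet sample read with the wrong byte order becomes a much larger value, which is consistent with speech turning into noise when the conversion does not match the file.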

My Setup is:

OS: Linux

Evaluation Board: Sitara AM335x

Audio Type: PCM, Signed 16 bit, 8KHz, Stereo

Audio Player Used: Audacity

Attachment of IO files: 0677.le_IO_s16_8K.zip

Please let me know what is required or missing to make this work.

Regards,

G.Shricharan.

  • Hi,

    Here are my comments:

    1. The files you attached are still at 16kHz sampling rate.

    2. How did you change the code to disable the endianness conversion? Can you attach your code?

    Thanks,

    Jianzhong

  • Hi Jianzhong,

    Jianzhong: The files you attached are still at 16kHz sampling rate.

    Shricharan: The files attached to my initial post are little endian, 8 kHz, signed 16-bit, stereo. I cross-checked them; for clarity I have attached screenshots of the Audacity settings and waveforms I used for both the near-end and far-end input files. The same settings apply to rxOut.pcm and txOut.pcm.

    File Attachment:1538.ScreenShots.zip
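
One way such a disagreement can arise: raw .pcm files have no header, so the sampling rate and channel count are whatever the player is told. The arithmetic below (a sketch with a made-up function name) shows that the same byte count is equally consistent with 8 kHz stereo and 16 kHz mono.

```python
def pcm_duration_seconds(n_bytes, sample_rate, channels, bytes_per_sample=2):
    """Playback duration of headerless PCM under an assumed format."""
    return n_bytes / (sample_rate * channels * bytes_per_sample)


size = 320000  # example file size in bytes
print(pcm_duration_seconds(size, 8000, 2))   # 8 kHz stereo
print(pcm_duration_seconds(size, 16000, 1))  # 16 kHz mono: same duration
print(pcm_duration_seconds(size, 8000, 1))   # 8 kHz mono: twice as long
```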

    Jianzhong: How did you change the code to disable the endianness conversion? Can you attach your code?

    Shricharan: Please find attached the code with the endianness conversion changes. See the functions aerSimFwrite() and aerSimFread() at line numbers 133 and 59 respectively.

    File Attachment:4370.aersim_fileIO.c

  • Hi Shricharan,

    Thanks for the data and code. The modified code looks good.

    The cause of your problem is the stereo format of the input signal. AER works with single-channel data. Please change the data format to mono and try again.
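
A stereo recording can be reduced to a single channel offline before it is fed to the test application. The sketch below assumes interleaved left/right signed 16-bit samples and simply averages the two channels; the function name is made up, and if only one channel carries the microphone signal, keeping that channel alone may be the better choice.

```python
import struct


def stereo_to_mono(raw, little_endian=True):
    """Average interleaved left/right signed 16-bit samples into one channel."""
    if len(raw) % 4:
        raise ValueError("stereo 16-bit PCM needs a multiple of 4 bytes")
    order = "<" if little_endian else ">"
    samples = struct.unpack(order + str(len(raw) // 2) + "h", raw)
    # Even indices are the left channel, odd indices the right channel.
    mono = [(left + right) // 2 for left, right in zip(samples[0::2], samples[1::2])]
    return struct.pack(order + str(len(mono)) + "h", *mono)


stereo = struct.pack("<4h", 100, 200, -100, 100)  # two stereo frames
print(struct.unpack("<2h", stereo_to_mono(stereo)))  # (150, 0)
```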

    Regards,

    Jianzhong

  • Hi Jianzhong,

    Please find below the status of the points raised previously.

    Jianzhong: The cause of your problem is the stereo format of the input signal. AER works with single-channel data. Please change the data format to mono and try again.

    I changed the input format from stereo to mono, so the input signals (near end and far end) are now little endian, signed 16-bit, 8 kHz, mono. In spite of this, I ended up with undesired txOut and rxOut signals (noisy, or with data loss, I think). I have attached the IO files for clarity.

    LE_IO files:0068.IO_le_8Khz_mono.zip

    With this background I went back to re-check whether the big-endian format works with the input signal parameters big endian, signed 16-bit, 8 kHz, mono. For this I took a fresh copy of the code (the same version I mentioned previously). Here too I ended up with undesired outputs (data loss or a gain problem, I think). I have attached the IO files for clarity.

    BE_IO files:1145.IO_be_8Khz_mono.zip

    I am also attaching the 0550.aersimcfg.txt file for reference. In it I changed the second parameter to "zero" to run the application at 8 kHz.

    Please let me know, or direct me to, the exact steps for running the test application at 8 kHz, as it is crucial for my project.

    Jianzhong, thank you very much for your continued support.

    Regards,

    G.Shricharan

  • Hi G.Shricharan,

    Looking at the output, I think the problem is probably that AER didn't get a chance to converge. You can try the following:

    1. disable NLP, just have FE signal, observe the Tx output and see if it shows convergence.

    2. add 5 seconds of silence to the beginning of your NE signal and rerun the test.
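
The second suggestion can also be applied to an existing recording offline. Digital silence in signed 16-bit PCM is a run of zero bytes, so the byte order does not matter for the padding. A minimal sketch (the function name is made up):

```python
def prepend_silence(raw, seconds, sample_rate=8000, channels=1, bytes_per_sample=2):
    """Return raw PCM with `seconds` of zero-valued samples in front."""
    frames = int(round(seconds * sample_rate))
    return bytes(frames * channels * bytes_per_sample) + raw


speech = b"\x10\x00\x20\x00"            # two little-endian samples
padded = prepend_silence(speech, 5)      # 5 s at 8 kHz mono
print(len(padded) - len(speech))         # 80000 bytes of silence
```

To use it on a file, read the near-end recording in binary mode, pass the bytes through `prepend_silence`, and write the result to a new file.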

    Regards,

    Jianzhong

  • Hi Jianzhong

    Please find below the results of the actions you suggested.

    1. disable NLP, just have FE signal, observe the Tx output and see if it shows convergence.

    I reset the seventh parameter of the aersimcfg.txt file to "zero" so that only the FE signal is used, and modified "AER control bitfield 0" to 0x1883 to disable the NLP. When I ran the binary it ended with a segmentation fault and the message "End of far end input file reached". I then set "TX input file IO" to "one" and ran the binary again; it worked fine and produced rxOut.pcm and txOut.pcm.

    Observation: there is echo in the txOut.pcm file when NLP is disabled. I am attaching the input and output files for reference.

    IO files:1016.NLP_Disabled_IO_Files.zip Sorry for the audio quality of this recording.

    With these observations and results I moved on to your second comment, about convergence.

    2. add 5 seconds of silence to the beginning of your NE signal and rerun the test.

    For this I recorded the near-end data with 5-6 seconds of silence at irregular intervals in order to capture any echo clearly in txOut.pcm; the far-end data was recorded normally at 8 kHz, mono. Surprisingly, this gave a good txOut.pcm signal; thank you very much for your observation and help. While running the binary I also enabled the NLP, since I had observed that echo is present in txOut.pcm when NLP is disabled. I am attaching the IO files where the NLP is enabled and the "ne" file has a few seconds of silence.

    IO files:8322.NLP_Enabled_Added Silence.zip


    For cross-verification I took a fresh copy of the code and checked whether this convergence problem also occurs with 16 kHz big-endian inputs (the format supplied by TI).

    Observations:

    1. With a few seconds of silence in the near-end data, the convergence issue does not occur (this kind of input is provided by TI in the package).

    2. I recorded near-end and far-end inputs with a 16 kHz, big-endian, signed 16-bit configuration and observed that the txOut.pcm file is not correct (the waveform was the same as the one I reported yesterday): in some portions the data was suppressed or had no gain. In short, the convergence issue remains with 16 kHz big-endian data as well.

    Hence my question: in my project, or in a real-world scenario, how should this convergence be taken care of? Are there any steps in the tuning guide to minimise the convergence time, or can you point me to the relevant material?

    Once more, thank you very much for helping me reach a stable understanding: AER works fine with 8 kHz little-endian input, provided it is given sufficient time to converge.

    Regards,

    G.Shricharan

  • Hi G.Shricharan,

    Glad that you have obtained meaningful test results.

    Regarding your question about convergence, any type of adaptive filter needs some time to converge. During that time, there should be no near end signal. In real world conversation, it is very rare that people would keep talking simultaneously from the beginning. A simple "Hello" from far end would be enough to converge the adaptive filter for acceptable performance. Then during the conversation, convergence will keep improving whenever near end is silent. 
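
To make the idea of convergence concrete, here is a small self-contained simulation using a textbook NLMS (normalized least mean squares) adaptive filter. This is only a stand-in: the thread does not say which algorithm AER uses internally, and the echo path, step size and signal below are invented for the example. The near end is silent, so the microphone signal is pure echo.

```python
import math
import random


def nlms_residual(far_end, mic, taps, mu=0.5, eps=1e-6):
    """Run an NLMS echo canceller and return the residual (mic minus echo estimate)."""
    weights = [0.0] * taps   # current estimate of the echo path
    history = [0.0] * taps   # most recent far-end samples, newest first
    residual = []
    for far_sample, mic_sample in zip(far_end, mic):
        history = [far_sample] + history[:-1]
        echo_estimate = sum(w * x for w, x in zip(weights, history))
        error = mic_sample - echo_estimate
        energy = sum(x * x for x in history) + eps
        step = mu * error / energy
        weights = [w + step * x for w, x in zip(weights, history)]
        residual.append(error)
    return residual


def power_ratio_db(reference, processed):
    """Power of `reference` relative to `processed`, in dB (floored to avoid log of 0)."""
    p_ref = sum(s * s for s in reference)
    p_out = sum(s * s for s in processed)
    return 10.0 * math.log10(p_ref / max(p_out, 1e-20))


random.seed(0)
echo_path = [0.5, -0.3, 0.2, 0.1]  # invented 4-tap echo path
far_end = [random.uniform(-1.0, 1.0) for _ in range(4000)]
mic = [
    sum(h * far_end[n - k] for k, h in enumerate(echo_path) if n - k >= 0)
    for n in range(len(far_end))
]
residual = nlms_residual(far_end, mic, taps=8)
early_db = power_ratio_db(mic[:200], residual[:200])
late_db = power_ratio_db(mic[-200:], residual[-200:])
print(f"echo reduction, first 200 samples: {early_db:.1f} dB")
print(f"echo reduction, last 200 samples:  {late_db:.1f} dB")
```

The echo reduction is much larger at the end of the run than at the start, which is the convergence behaviour described above. A real room adds noise and a far longer echo path, so the figures from this idealized run say nothing about what AER achieves.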

    The convergence time depends on the tail length configured by the user. For a typical handset-like application, the tail length can be set to 20 msec or less, and convergence is very fast: AER can converge to steady state in a few hundred msec. For a typical hands-free application, the tail length should be set to 200 msec or more. In this case convergence will be slower, but it can still reach about 20 dB in a second, and then reach steady state (about 30 dB to 40 dB or more, depending on the enclosure and room) in 10 to 20 seconds.
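
For a sense of scale, the sketch below converts a tail length into a number of samples and a dB figure into an amplitude ratio. It assumes a plain time-domain filter with one coefficient per sample, which may not match AER's internal structure; the function names are made up.

```python
def tail_length_samples(tail_ms, sample_rate_hz):
    """Number of samples spanned by an echo tail of `tail_ms` milliseconds."""
    return int(round(tail_ms * sample_rate_hz / 1000))


def db_to_amplitude_ratio(db):
    """Amplitude ratio corresponding to a level difference in dB."""
    return 10.0 ** (db / 20.0)


print(tail_length_samples(20, 8000))    # handset-like tail at 8 kHz
print(tail_length_samples(200, 8000))   # hands-free tail at 8 kHz
print(db_to_amplitude_ratio(20))        # 20 dB as an amplitude ratio
```

A tail ten times longer means ten times as many coefficients to adapt under this assumption, which fits the slower convergence described for hands-free use.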

    Regards,

    Jianzhong




  • Hi Jianzhong

    I hope this post finds you in good health. Jianzhong, does AER support DTD (double-talk detection)?

    The background for this question is that, if the Far-End is not speaking then there is nothing to monitor at all (By Adaptive Algorithm). So how does the Adaptive Algorithm know when it should be working and when it shouldn’t be?

    Regards,

    G.Shricharan

  • Hi G. Shricharan,

    Yes, AER has double talk detection. When double talk is detected, adaptive filter will not update.
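
As a rough illustration of the principle, the sketch below implements the textbook Geigel detector: double talk is declared when the microphone level exceeds a fraction of the recent far-end peak, and an echo canceller would skip its filter update for that sample. This is not AER's detector, whose internals are not described in this thread, and the threshold here has no known relation to any AER parameter.

```python
from collections import deque


class GeigelDetector:
    """Textbook Geigel double-talk detector (illustrative only)."""

    def __init__(self, window, threshold=0.5):
        # Magnitudes of the most recent `window` far-end samples.
        self.recent_far = deque([0.0] * window, maxlen=window)
        # 0.5 assumes the echo path attenuates by at least 6 dB.
        self.threshold = threshold

    def is_double_talk(self, far_sample, mic_sample):
        """Return True if the adaptive filter should NOT be updated for this sample."""
        self.recent_far.append(abs(far_sample))
        return abs(mic_sample) > self.threshold * max(self.recent_far)


detector = GeigelDetector(window=4)
print(detector.is_double_talk(1.0, 0.3))  # False: mic level is plausible echo
print(detector.is_double_talk(1.0, 0.8))  # True: too loud to be echo alone
```

When the far end is silent, the recent far-end peak is zero, so any microphone activity is treated as near-end speech and adaptation is held off in this sketch.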

    Regards,

    Jianzhong

  • Hi Jianzhong,

    How can the DTD be toggled? Is there a control bit in the aersimcfg.txt file, or some other way to do this? Please let me know, or direct me to the relevant place in the documents.

    Regards,

    G.Shricharan

  • Hi G.Shricharan,

    The double talk detector is always running if NLP is enabled. It doesn't make sense to turn it off (or disable NLP) in normal operations.

    Which version of AER are you using? If you use 16.1 and later, there is a parameter (dt_thresh) that controls the sensitivity of double talk detector. It allows users to tune their specific application either favoring more duplex or favoring less echo. Documentation of that parameter is available in AER external API documentation and aer.h. 

    Regards,

    Jianzhong

  • Hi jianzhong,

    While we were using AER version 16_1_0_3 with SDK version ti-sdk-05.05.00, compiling and running AER on the Sitara AM335x worked fine (using the file system and toolchain from that SDK version). We have now migrated to the new SDK version ti-sdk-06.00.00, and while compiling AER for the Sitara AM335x with the toolchain from the latest SDK we encounter errors. I am attaching the log with the error messages.

    Note:

    We have modified the make file as mentioned in the AER_Installation_Verification.pdf but used the tool chain provided in the SDK version ti-sdk-06.00.00

    Log File:7268.AER_buidLog.txt

    Regards,

    G.Shricharan.

  • Hi G. Shricharan,

    Sorry for my late reply. After looking at your build log file, I think the problem may be due to the soft/hard floating point compiler option. AER 16.1.0.3 was compiled with the soft floating point option, but your executable is probably compiled with the hard floating point option.

    The latest AER release (17.0.0) has both soft and hard floating point libraries available at: http://software-dl.ti.com/libs/aer/latest/index_FDS.html. Please download AER for Cortex-A8 Hardfp Linux Installer and give it a try.

    Regards,

    Jianzhong

  • Hi Jianzhong,

    I have started working with AER 17_0_0_0 hard float, and I am using it with a standard SIP client. The target board is the Sitara AM335x. The AER demo (prebuilt binaries) worked fine, but when I integrated AER with my SIP client it started misbehaving.

    Misbehaviour 1: The sound quality is not good (I think this is because of a buffer size mismatch).

    Misbehaviour 2: The near-end data is transmitted only if I mute the far end.

    Please suggest a solution for the second misbehaviour.

    Regards,

    G.Shricharan

  • Hi Shricharan,

    The 2nd problem may be caused by no convergence. Please follow instructions in AER Quick Tuning Guide in <installation_root>\docs folder to get AER converged first and verify that before proceeding.

    Regards,

    Jianzhong

  • Hi jianzhong,

    I am working on implementing the suggestions from the tuning guide; meanwhile, I am sharing the current status (buffers) of AER integrated into a SIP client.

    I am attaching the buffers 0572.Am335xOutputs.zip, which were captured at the different stages explained below.

    sendinBuf.pcm : This is the buffer before passing to the AER; in short capture buffer at DUT

    rxinBuf.pcm : This is the buffer captured at the far end

    syncBuf.pcm rxout.pcm txout.pcm : These buffers are self explanatory

    Other properties of these buffers are

    Sampling Rate: 8 kHz

    Channels: Mono

    Endianness: Little Endian

    Encoding: Signed 16bit PCM

    Please let me know if any information is missing.

    Regards,

    G.Shricharan

  • Hi G. Shricharan,

    Your rxout and txout are both corrupted. I would suggest completely bypassing AER and recapturing rxout and txout to see if you get clean data. If the data is not clean, then you have a problem in your system. If the data is clean, it probably means the integration of AER has some problems.

    Regards,

    Jianzhong

  • Hi Jianzhong,

    Can you please help me with this problem? If the question is unclear, please let me know.


    I have been working on the AM335x Sitara EVM, and my project needs self-diagnosis of the audio captured from the mic through the speaker on the same board (we have reworked the EVM to add the mic and speaker). In the application, the user should only have to request the audio test; the software (which we still have to implement, but have no idea how) needs to check the quality of the audio and report the test result based on the statistics.

    Please give me some ideas on how to implement this self-diagnosis test for audio quality.

    Regards,

    G.Shricharan

  • Hi G. Shricharan,

    Audio quality can be analyzed by looking at AER debug information such as the maximal ERLE, near end noise power, NLP linear attenuation, etc. This debug information is documented in aer.h.

    When the user requests an audio quality report, you can call the AER API function aerGetPerformance() to get the above-mentioned information.

    Regards,

    Jianzhong

  • Hi Jianzhong,

    Currently the problem we are facing with AER is that the captured near-end data cannot be heard clearly at the far end.

    Observations:

    We found that the transmit-path sync buffer (which is used in the aerSendIn() API) is corrupting the near-end data before it is streamed onto the network.

    So we started digging into the receive-path components/APIs that create the sync buffer. We observed that the aerReceiveIn() API is corrupting its output buffer, sim_data_rx_out[ ] (or rxout[ ]), which is used by piuReceiveIn() to create the sync buffer, sim_data_rx_sync[ ].

    Attached are the buffer dumps of the far-end data (receive path files):

    1. farEnd_orginal.pcm--> before any aer related processing started at the Rx path
    2. aerRxIn.pcm-->this file is the output buffer dump of aerReceiveIn() API (or) dump of the sim_data_rx_out[ ] buffer

    Attachment: 8664.AER_RxPath_BufferDump.zip

    The characteristics of the above files to play are:

    1. Encoding Type--> Signed 16bit PCM

    2. Byte Order--> Little Endian               

    3. Channels--> Mono                             

    4. Sampling Rate--> 8000 Hz / 8 kHz    

    Question

    The chm document says that the precondition for the aerReceiveIn() API is: "Initial microphone and speaker analog gains must be set explicitly through aerControl(). Otherwise, aerReceiveIn() will process data as if AER is disabled."

    In our case, aer_ctl.gain_rx_analog = -16 and aer_ctl.gain_tx_analog = -32.

    Therefore AER being non-functional for this reason is ruled out. Please suggest how we can move further to make AER work in a real-time scenario.

    And please let me know if any further information is required.

    Regards,
    Shricharan

     

  • Hi Shricharan,

    I examined the Rx output data (aerRxIn.pcm) and saw every other 80 samples are all 0's. The playback at speaker wouldn't sound right. Did you notice that?
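
A dump can be scanned for this pattern automatically. The sketch below lists the frames that consist entirely of zero samples, assuming 16-bit samples and frames of 80 samples (10 ms at 8 kHz) aligned to the start of the file; the function name is made up. Because an all-zero sample is all-zero bytes in either byte order, no decoding is needed.

```python
def find_zero_frames(raw, frame_len=80, bytes_per_sample=2):
    """Return the indices of frames whose samples are all zero."""
    frame_bytes = frame_len * bytes_per_sample
    n_frames = len(raw) // frame_bytes
    return [
        index
        for index in range(n_frames)
        if not any(raw[index * frame_bytes:(index + 1) * frame_bytes])
    ]


# Synthetic dump: a frame of data followed by a frame of zeros, three times.
dump = (b"\x01\x00" * 80 + bytes(160)) * 3
print(find_zero_frames(dump))  # [1, 3, 5]
```

A regular pattern such as every second frame being zero points at the buffer handling around the API call rather than at the audio content itself.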

    To debug this problem, the first step is to disable AER and Rx EQ (and DRC if it is used). The input should then be passed to the output unchanged. If you don't get clean data at Rx out, that should be easy to debug. If you do get clean data when AER is disabled but corrupted data when AER is enabled, that means your memory management may have some problems. Make sure you properly place the buffers required by AER, without overlapping any.

    Regards,

    Jianzhong

  • Hi Jianzhong,


    Thanks for your observation that every other block of 80 samples is filled with 0's; we have taken care of that in our application and now pass the data accordingly.

    After that change we observed some fruitful results in the output buffer (near-end data), but the quality of the near-end data (which will be streamed over the network) is still not up to the mark. It contains components of the far-end data at a very low level.

    Attaching the buffers before AER processing and after AER processing:

    0383.aerEnabled.zip


    On observing this, we disabled AER from the config file and checked again; this works as expected, with full echo in the near-end data.

    Attaching the buffers before AER processing and after AER processing, with AER disabled from the config file:

    8081.aerDisabled.zip

    The characteristics of the above files to play are:

    1. Encoding Type--> Signed 16bit PCM

    2. Byte Order--> Little Endian               

    3. Channels--> Mono                             

    4. Sampling Rate--> 8000 Hz / 8 kHz    

    Please let me know if any information is not clear.

    Jianzhong, please suggest any steps to fine-tune the near-end buffer after AER processing.

    Regards,

    G.Shricharan

  • Hi Shricharan,

    Please follow AER Quick Tuning Guide, chapter 5, to make sure AER has convergence.

    You need to:

    1. disable NLP and CNG,

    2. keep near end silent, i.e. only echo goes to mic,

    3. send CSS or pink noise (provided in AER package) from far end,

    4. capture near end signal after AER Tx.
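
Once both signals are captured under these conditions, the amount of echo removed can be estimated offline as the ratio of microphone power to post-AER power, in dB. The sketch below is a generic calculation, not the procedure from the tuning guide; it assumes signed 16-bit PCM, the same time span in both captures, and made-up function names.

```python
import math
import struct


def pcm16_power(raw, little_endian=True):
    """Mean squared sample value of signed 16-bit PCM."""
    count = len(raw) // 2
    if count == 0:
        raise ValueError("no samples")
    fmt = ("<" if little_endian else ">") + str(count) + "h"
    samples = struct.unpack(fmt, raw[:2 * count])
    return sum(s * s for s in samples) / count


def echo_reduction_db(mic_raw, tx_raw, little_endian=True):
    """Power of the mic capture relative to the post-AER capture, in dB."""
    p_mic = pcm16_power(mic_raw, little_endian)
    p_tx = pcm16_power(tx_raw, little_endian)
    if p_mic == 0:
        raise ValueError("microphone capture is silent")
    if p_tx == 0:
        return math.inf
    return 10.0 * math.log10(p_mic / p_tx)


mic_capture = struct.pack("<4h", 1000, -1000, 1000, -1000)
tx_capture = struct.pack("<4h", 100, -100, 100, -100)
print(echo_reduction_db(mic_capture, tx_capture))  # 20.0
```

Computing the figure over successive one-second windows of the captures, rather than over the whole file, shows how it grows while the filter converges.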

    Please refer to the tuning guide for more detailed instructions.

    Regards,

    Jianzhong