Possible AIF2 bug?

Ran Yaniv

The AIF2 PD block counts samples via the Frame Message counter and increments the symbol counter based on received samples.

Upon losing synchronization, the Frame Message counter and the symbol counter freeze, because samples are no longer received. The counters advance again at the next frame boundary after synchronization is restored.

So far so good. However, in my opinion these counters should be reset to zero when synchronization is restored, rather than continuing from the state they were in before synchronization was lost.

Could someone from TI please comment if this is a bug in the AIF2 operation?

Regards,

Ran

over 14 years ago

Mike Lachmayr over 14 years ago

TI__Prodigy 10 points

Ran,

(I'm the designer of the AIF2 PD)

Thanks for your question. I'm not sure what you mean by losing sync in the description above.

OBSAI: For OBSAI use, we do expect SERDES bit errors to whack an OBSAI message header every so often. When that occurs, 4 samples go missing. AIF2 will patch the missing samples with zeros (in fact, it will patch up to 8 missing samples). Under this operation, sync is never lost. In the event that 12 or more sequential samples go missing, the PD will lose sync and reset the framing engine. On the next radio frame boundary (programmed by the AxC offsets), it will automatically resync.
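As a rough illustration of the thresholds described above, here is a minimal Python sketch. It is only a model of the behavior as stated in this post: the function name and the returned labels are hypothetical and do not correspond to any AIF2 register or TI API.

```python
# Illustrative model only; all names are hypothetical, not part of any TI API.
SAMPLES_PER_MESSAGE = 4   # samples lost when one OBSAI message header is corrupted
MAX_PATCHED_SAMPLES = 8   # the PD zero-fills gaps up to this many samples
SYNC_LOSS_SAMPLES = 12    # gaps of this size or more make the PD drop sync


def pd_gap_action(lost_messages: int) -> str:
    """Return the modeled PD reaction to a run of consecutive lost messages."""
    if lost_messages < 0:
        raise ValueError("lost_messages must be non-negative")
    missing = lost_messages * SAMPLES_PER_MESSAGE
    if missing == 0:
        return "ok"
    if missing <= MAX_PATCHED_SAMPLES:
        return "patch_with_zeros"
    # Gaps are multiples of 4 samples here, so anything above 8 is at least 12.
    assert missing >= SYNC_LOSS_SAMPLES
    return "lose_sync_and_reset_framing"


for n in range(4):
    print(n, pd_gap_action(n))
```

In this model one or two lost messages are patched, and a third consecutive lost message (12 samples) is what triggers the loss of sync and the resync on the next radio frame boundary.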

In a real BTS environment, only a board hot swap (illegal) can cause OBSAI messages to stop being sent for a given AxC. Yes, in that case, the AIF2 framing engine will wait forever for a continuation of samples from all the active AxCs. In fact, it is not possible to reset the framing engine via SW without a full HW reset of the entire AIF2. Regardless, whenever you do start sending traffic for that AxC again, it will most likely have a big discontinuity. The AIF2 PD will respond with the error recovery described in the paragraph above.

Bug? We are planning on adding an override (basically a reset) to the framing engine for future generations. I am not aware of any real use cases that need the feature, but in lab bring-up tests, customers have been frustrated by the current behavior. Basically, customers have done loopback tests and shut off the Egress channels prematurely, before the Ingress channels have shut down. I would like to understand real-life use cases where the current behavior causes inconvenience.

In CPRI operation, so long as the input link stays up and running, the PD will continue to extract data from the CPRI link. Is it a CPRI use case that you are experiencing? Is the PD hung after a CPRI link fails and then comes back?

Hopefully this is helpful. Please post more details and I'll try to answer your questions.

Thanks

Michael

Ran Yaniv over 14 years ago in reply to Mike Lachmayr

Intellectual 645 points

Hi Mike,

Thanks for your response.

Let me explain exactly what I am observing, and then also a use case where this behavior is problematic for me.

The use case is CPRI. Technically, when the RM first changes state to ST3 (SYNC), the sample and symbol counters in the PD start from 0 on the next frame boundary. If the link leaves ST3 (not necessarily for ST0), the counters pause until the next frame boundary after the state returns to ST3. However, the sample and symbol counters then continue from their last position, so the symbol index no longer wraps around to 0 on the frame boundary.

The problem is that the packet lengths coming out of the PD are then no longer aligned so that PdFrameMsgTc[0] is the length of the first packet on the frame boundary. The offset into the LUT is arbitrary and depends on the duration of the sync loss. Taking LTE as an example, the symbol lengths vary over the radio frame in a specific pattern, and the PdFrameMsgTc LUT is relied upon to deliver synchronized symbol lengths into the system.
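To make the LUT offset concrete, the following Python sketch models the freeze-and-resume behavior described above. It is a simplified model under stated assumptions, not the actual AIF2 logic: the `PdCounters` class is hypothetical, and the LUT simply holds LTE normal-CP symbol lengths at 30.72 Msps (2208 samples for the first symbol of a slot, 2192 for the other six), which may differ from how PdFrameMsgTc is really programmed.

```python
from dataclasses import dataclass

# Assumed LUT: 7 symbol lengths per slot, repeated for the 20 slots of a
# 10 ms radio frame (140 entries, 307200 samples in total).
LUT = ([2208] + [2192] * 6) * 20


@dataclass
class PdCounters:
    """Hypothetical model of the PD symbol and sample counters."""
    symbol: int = 0   # index into LUT
    sample: int = 0   # samples counted within the current symbol

    def receive(self, n: int) -> None:
        """Count n received samples, stepping through the LUT."""
        while n > 0:
            step = min(LUT[self.symbol] - self.sample, n)
            self.sample += step
            n -= step
            if self.sample == LUT[self.symbol]:
                self.sample = 0
                self.symbol = (self.symbol + 1) % len(LUT)

    def reset(self) -> None:
        """The requested behavior: restart from zero on resync."""
        self.symbol = 0
        self.sample = 0


pd = PdCounters()
pd.receive(sum(LUT))   # one full frame: counters wrap back to (0, 0)
pd.receive(5000)       # sync is lost 5000 samples into the next frame
# Counters freeze here and resume from this state at the next frame boundary.
print(pd.symbol, pd.sample)   # 2 600
```

In this model the counters resume at the frame boundary with symbol index 2 and 600 samples already counted, so the first packet after the boundary is not the one described by entry 0 of the LUT. Calling `reset()` at that point would restore the alignment.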

Now, obviously in a system where the AIF2 syncs just once and never loses sync (never leaves state ST3), this would not be an issue. However we observe the following scenario, and frustration is an adequate word...

* The DSP is reset after a link with the RRH has already been established. In this case, even though the RRH loses the CPRI signal from the DSP, it may still be transmitting a free-running signal to the DSP.

* Now, when the DSP is restarted and AIF2 is started, the RM will sync to this free-running signal coming from the RRH and reach state ST3.

* AIF2 will also start transmitting to the RRH. The RRH will lock to the signal coming from AIF2, realign its frame boundary, and change the alignment of the signal it is transmitting to the AIF2.

* The RM will then lose sync to the previous free-running signal and sync to the realigned signal from the RRH.

The result is that the PD sample and symbol counters are no longer aligned to the frame boundary, because of the early sync and the subsequent resync.
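The timeline above can be sketched as simple arithmetic. This is a hedged model, not measured behavior: times are in samples on a common clock, the function names are hypothetical, and a 307200-sample radio frame (10 ms at 30.72 Msps) is assumed. It only applies the rule described in this thread, that the counters start at the first frame boundary after the initial ST3, freeze when sync is lost, and resume unchanged at the first frame boundary after ST3 is restored.

```python
FRAME = 307_200  # assumed samples per radio frame


def next_boundary(t: int, phase: int, frame: int = FRAME) -> int:
    """First frame boundary at or after t, for boundaries at phase + k * frame."""
    return t + (phase - t) % frame


def pd_state_after_resync(t_sync1, phase_free, t_loss, t_sync2, phase_new,
                          frame=FRAME):
    """Return (resume_time, position_in_frame) of the modeled PD counters."""
    start = next_boundary(t_sync1, phase_free, frame)   # counters start at 0
    counted = max(0, t_loss - start)                    # samples before freeze
    resume = next_boundary(t_sync2, phase_new, frame)   # counters resume here
    return resume, counted % frame


# Early sync to the free-running signal, then resync to the realigned one.
resume, position = pd_state_after_resync(
    t_sync1=1_000, phase_free=5_000,
    t_loss=5_000 + FRAME + 12_345,
    t_sync2=400_000, phase_new=100,
)
print(resume, position)   # 614500 12345
```

In this model the counters stay aligned only if the number of samples counted before the sync loss happens to be a whole number of frames. Any other duration leaves a residual position at the new frame boundary.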

Thanks

Ran
