SN65HVD1781: Irregular communication faults (B line is not driven correctly)

Ronald Zwanenburg

Part Number: SN65HVD1781

Hello Everyone,

We are currently experiencing RS485 communication issues with one of our products. We have sent a scope to the machine to measure the A and B signals to determine the problem.

We have seen that the B signal is sometimes (almost) flat. The machine worked without problems for 2 months. After that, the problems started and now occur intermittently.

Some of the slaves are more prone to the problem than others. All cables have been checked and replaced, but this did not solve the problem.

We would very much like to find an explanation for this situation, but have a difficult time understanding it. Any ideas are very welcome!

Please see this document for more information.

Thank you,
Ronald Zwanenburg

over 4 years ago

Eric Schott1 over 4 years ago

TI__Guru 53515 points

Hi Ronald,

Thank you for sharing the documentation you have for this issue. This behavior looks interesting to me, especially the shift in the B line mid-packet ("Strange Event" indeed). Could you help with a few questions while I review possible fail cases?

Does the failure persist indefinitely or does it eventually revert back to normal?
If the suspected faulty node is replaced, is the issue in the system resolved?
I assume the current measurements were done by probing the bus connector (RS485_BAR_1). If not, please specify probe location for current data.
- Is it possible to probe the system at the failing node? Preferably directly to A and B terminals (transceiver side of series components). This would help identify if the source of this behavior is the transceiver or some of the passive components on the board.
If the transceiver on the faulty node is replaced, is the issue resolved? If so, please save the failed device as we may want to analyse the cause of failure.
Were scope shots taken during the lower Vcc test case? Or was this only based off of bus communication success? Based on the captured waveforms, it looks like communication is still possible (though less reliable) even during the fail case.
Based on the schematics, it looks like the transceivers are well protected, but I must ask - is there reason to suspect any significant ESD or surge event occurred in the system before the failure? User handling, lightning event, etc.?

Regards,
Eric Schott

Ronald Zwanenburg over 4 years ago in reply to Eric Schott1

Prodigy 10 points

Hi Eric,

Thank you for the quick and comprehensive response. Here is my response to the questions:

Eric Schott1 said:
Does the failure persist indefinitely or does it eventually revert back to normal?

This is the second machine where the problems occur. The first machine is stationed in Israel. There we replaced the cables and slave devices. The cables and devices were then sent to us for analysis. We have not been able to reproduce the problem. So yes, the problem will resolve itself after some time.

Eric Schott1 said:
If the suspected faulty node is replaced, is the issue in the system resolved?

The problem in Israel was indeed solved by replacing the nodes. The machine that is currently having problems is located in the Czech Republic. This machine works without a problem if no commands are sent to the faulty nodes.

Eric Schott1 said:
I assume the current measurements were done by probing the bus connector (RS485_BAR_1).

Your assumption is correct.

Eric Schott1 said:
Is it possible to probe the system at the failing node? Preferably directly to A and B terminals (transceiver side of series components). This would help identify if the source of this behavior is the transceiver or some of the passive components on the board.

Unfortunately, it is not possible to place probes on the pins of the IC. The PCB is cast in epoxy resin.

Eric Schott1 said:
If the transceiver on the faulty node is replaced, is the issue resolved? If so, please save the failed device as we may want to analyse the cause of failure.

I cannot say this with certainty. Unfortunately, a lot of effort has to be made to get to the chip. The circuit board is potted in a long u-shaped metal housing. Please let me know if you want to do any research. Then I will look at the possibilities of getting the chip from one of the faulty boards.

Eric Schott1 said:
Were scope shots taken during the lower Vcc test case? Or was this only based off of bus communication success? Based on the captured waveforms, it looks like communication is still possible (though less reliable) even during the fail case.

I'm not sure I understand the question correctly. But the scope images shown come from the machine in the Czech Republic. The replies sent by the slave were not received by the master.

During tests with a lower Vcc, communication worked reliably until the chip stopped working completely at 2.8 volts.

Eric Schott1 said:
Based on the schematics, it looks like the transceivers are well protected, but I must ask - is there reason to suspect any significant ESD or surge event occurred in the system before the failure? User handling, lightning event, etc.?

This is very unlikely. The devices are mounted in one machine. Only experienced technicians will disconnect the devices. Thunderstorms are possible, I'm going to ask how often these occur at these locations.

I appreciate the help!

Regards,
Ronald

Eric Schott1 over 4 years ago in reply to Ronald Zwanenburg

TI__Guru 53515 points

Hi Ronald,

Thank you for your clear answers.

Because it appears that the issue resolves itself when the setup is moved, it doesn't sound like this is a case of device damage. I suspect instead that some change in the system or its passive components creates a high-impedance path on the B line between the faulty node and the subsequent receiving nodes.

There seems to be some similarity between the open circuit test (page 10) and the results from the first waveform (page 2). The B line weakly follows the A signal, though at a much lower magnitude than the tested example. This is likely due to the attenuation of the signal throughout the system (cabling and series resistances). Because of this attenuation, I believe the disconnect is occurring at the remote faulty board. The source could be the board connector, the TBU becoming active, or the connection of series components (such as L9 and R5).
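To make the attenuation argument concrete, here is a rough sketch of how a high-impedance series path would shrink the B swing seen at the bus connector. It treats the B line as a simple resistive divider, which is a deliberate simplification and not a model of the real network. The function name and all values (1.5 V driver swing, 60 ohm load, 540 ohm fault resistance) are assumptions chosen for illustration, not measurements from this system.

```python
def b_line_swing(v_driver_swing, r_series_fault, r_load):
    """Estimate the B-line swing at the bus for a given series fault resistance.

    Simplified resistive divider (an assumption, not the real bus model):
    the driver swing is split between the fault resistance in series
    with the B line and the load seen on the bus side.
    """
    if r_series_fault < 0 or r_load <= 0:
        raise ValueError("resistances must be non-negative, load must be positive")
    return v_driver_swing * r_load / (r_load + r_series_fault)


# Hypothetical values for illustration only.
healthy = b_line_swing(1.5, 0.0, 60.0)    # no fault: full swing reaches the bus
faulty = b_line_swing(1.5, 540.0, 60.0)   # series fault: swing drops to a tenth
print(f"healthy: {healthy:.2f} V, faulty: {faulty:.2f} V")
```

With these assumed numbers, a few hundred ohms in series is already enough to make B look almost flat on a scope while A remains normal, which would be consistent with a disconnect at the connector, TBU, or series components.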

I would be interested to see what waveforms taken at the faulty node (close to the transceiver) look like during the bad communication. I suspect the magnitude at the transceiver pins will be consistent with a normal device, but we would see attenuation when probing the cable harness close to the board connector. Checking where on the board this attenuation occurs, or where the B line appears disconnected, would pinpoint the source of the issue. I'm not sure if such sampling would be possible in the field, however. For your debug testing, perhaps you will be able to recreate the issue by modifying the suspected board while monitoring the bus from the same place as measured in the field.
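If the scope captures can be exported as sample lists, a small script could flag the "flat B" events automatically while you modify the suspected board. The following is only a sketch: the function names, the 0.25 ratio threshold, and the sample values are assumptions for illustration and would need tuning against your real captures.

```python
def peak_to_peak(samples):
    """Return the peak-to-peak amplitude of a list of voltage samples."""
    if not samples:
        raise ValueError("samples must not be empty")
    return max(samples) - min(samples)


def b_line_looks_flat(a_samples, b_samples, ratio_threshold=0.25):
    """Flag a capture where B swings much less than A.

    ratio_threshold is an assumed cutoff: B is called flat when its
    peak-to-peak swing is below this fraction of the swing on A.
    """
    a_pp = peak_to_peak(a_samples)
    b_pp = peak_to_peak(b_samples)
    if a_pp == 0:
        return False  # no activity on A, so there is nothing to compare against
    return b_pp / a_pp < ratio_threshold


# Hypothetical captures in volts: a healthy packet and one with a weak B line.
a = [3.5, 1.5, 3.5, 1.5]
b_healthy = [1.5, 3.5, 1.5, 3.5]
b_weak = [2.4, 2.6, 2.4, 2.6]
print(b_line_looks_flat(a, b_healthy), b_line_looks_flat(a, b_weak))
```

Running the same check on captures from the bus connector and, where reachable, closer to the board would show at which point the B swing collapses.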

Let me know if this sounds plausible and how possible it would be to measure behaviors at the faulty board.

Regards,
Eric Schott
