Linux/AM3351: System time and interrupt error

Part Number: AM3351

Tool/software: Linux

1. We use the AM3351 chip with SDK version 'ti-processor-sdk-linux-am335x-evm-03.00.00.04' (kernel version: Linux 4.4.12). The RTC clock source on our board is internal. During long-run testing, after 30 hours or more, the system clock goes wrong, the serial console can no longer connect to the board, external socket connections are dropped, and the reboot command does not work. We can still connect to the board over telnet.

2. Using the date command, we found that the system time advances about 180 s from a certain value and then jumps back to that value, cycling back and forth indefinitely. I wrote a kernel module and an app to read xtime and found that xtime no longer updates and is stuck at a fixed value. Part of the log showing the time cycling within the 180 s window follows:

root@opera:~# date
Tue Feb 7 18:34:51 UTC 2017
root@opera:~# date
Tue Feb 7 18:34:55 UTC 2017
root@opera:~# date
Tue Feb 7 18:34:57 UTC 2017
root@opera:~# date
Tue Feb 7 18:35:00 UTC 2017
root@opera:~# date
Tue Feb 7 18:35:02 UTC 2017
root@opera:~# date
Tue Feb 7 18:35:04 UTC 2017
root@opera:~# date
Tue Feb 7 18:35:59 UTC 2017
root@opera:~# date
Tue Feb 7 18:36:18 UTC 2017
root@opera:~# date
Tue Feb 7 18:36:34 UTC 2017
root@opera:~# date
Tue Feb 7 18:36:55 UTC 2017
root@opera:~# date
Tue Feb 7 18:37:00 UTC 2017
root@opera:~# date
Tue Feb 7 18:37:03 UTC 2017
root@opera:~# date
Tue Feb 7 18:37:20 UTC 2017
root@opera:~# date
Tue Feb 7 18:34:23 UTC 2017
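
For anyone trying to quantify the jump in a log like the one above, here is a minimal sketch in Python. It assumes the output format of `date` shown in this post (UTC, English day and month names); the function name `find_backward_jumps` and the trimmed sample log are only illustrative.

```python
from datetime import datetime

# Format of the `date` output in the log above (assumed: always UTC, English names).
DATE_FORMAT = "%a %b %d %H:%M:%S UTC %Y"


def find_backward_jumps(lines):
    """Return (previous, current, delta_seconds) for every sample that is
    earlier than the sample taken just before it."""
    jumps = []
    previous = None
    for line in lines:
        line = line.strip()
        if not line or line.startswith("root@"):
            continue  # skip blank lines and shell prompts
        current = datetime.strptime(line, DATE_FORMAT)
        if previous is not None and current < previous:
            jumps.append((previous, current, (current - previous).total_seconds()))
        previous = current
    return jumps


# Last three samples from the log above.
log = """\
root@opera:~# date
Tue Feb 7 18:37:03 UTC 2017
root@opera:~# date
Tue Feb 7 18:37:20 UTC 2017
root@opera:~# date
Tue Feb 7 18:34:23 UTC 2017
"""

for prev, cur, delta in find_backward_jumps(log.splitlines()):
    print(f"{prev:%H:%M:%S} -> {cur:%H:%M:%S} ({delta:+.0f} s)")
```

On these samples it reports a single backward step of 177 s (18:37:20 to 18:34:23), which matches the roughly 180 s window described in item 2.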

3. I ran 'cat /proc/interrupts' many times to check the system interrupt status and found that the interrupt counts for 'gp_timer' and '44e09000.serial' never changed. Normally these two values increase constantly. The system interrupts are as follows:

root@opera:~# cat /proc/interrupts
           CPU0       
 16:    5315316      INTC  68 Level     gp_timer
 19:          1      INTC  78 Level     wkup_m3_txev
 20:       1176      INTC  12 Level     49000000.edma_ccint
 22:          0      INTC  14 Level     49000000.edma_ccerrint
 26:          0      INTC  96 Level     44e07000.gpio
 33:          0  44e07000.gpio   6 Edge      48060000.mmc cd
 59:          0      INTC  98 Level     gpio1_9
 92:          0      INTC  32 Level     gpio2_25
125:          0      INTC  62 Level     481ae000.gpio
158:       1281      INTC  72 Level     44e09000.serial
159:          4      INTC  70 Level     44e0b000.i2c
160:          0      INTC  30 Level     4819c000.i2c
161:         13      INTC  64 Level     mmc0
162:         11      INTC  28 Level     mmc1
164:          0      INTC  77 Level     wkup_m3
170:          0      INTC  75 Level     rtc0
171:          0      INTC  76 Level     rtc0
174:     170642      INTC  41 Level     4a100000.ethernet
175:       4649      INTC  42 Level     4a100000.ethernet
178:        573      INTC   4 Level     48080000.elm
179:          0      INTC 100 Level     gpmc
180:          0      INTC 109 Level     53100000.sham
184:          0      INTC 111 Level     48310000.rng
186:      12093      INTC  18 Level     musb-hdrc.0.auto
187:          8      INTC  19 Level     musb-hdrc.1.auto
188:          0      INTC  17 Level     47400000.dma-controller
Err:          0
root@opera:~# cat /proc/interrupts
           CPU0       
 16:    5315316      INTC  68 Level     gp_timer
 19:          1      INTC  78 Level     wkup_m3_txev
 20:       1176      INTC  12 Level     49000000.edma_ccint
 22:          0      INTC  14 Level     49000000.edma_ccerrint
 26:          0      INTC  96 Level     44e07000.gpio
 33:          0  44e07000.gpio   6 Edge      48060000.mmc cd
 59:          0      INTC  98 Level     gpio1_9
 92:          0      INTC  32 Level     gpio2_25
125:          0      INTC  62 Level     481ae000.gpio
158:       1281      INTC  72 Level     44e09000.serial
159:          4      INTC  70 Level     44e0b000.i2c
160:          0      INTC  30 Level     4819c000.i2c
161:         13      INTC  64 Level     mmc0
162:         11      INTC  28 Level     mmc1
164:          0      INTC  77 Level     wkup_m3
170:          0      INTC  75 Level     rtc0
171:          0      INTC  76 Level     rtc0
174:     170692      INTC  41 Level     4a100000.ethernet
175:       4653      INTC  42 Level     4a100000.ethernet
178:        573      INTC   4 Level     48080000.elm
179:          0      INTC 100 Level     gpmc
180:          0      INTC 109 Level     53100000.sham
184:          0      INTC 111 Level     48310000.rng
186:      12093      INTC  18 Level     musb-hdrc.0.auto
187:          8      INTC  19 Level     musb-hdrc.1.auto
188:          0      INTC  17 Level     47400000.dma-controller
Err:          0
So I suspect that the kernel interrupt system has a problem that prevents the system clock and console interrupts from being serviced, which in turn produces the symptoms described in item 1.
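
The comparison in item 3 can also be automated. The following is a rough sketch in Python, not a vetted diagnostic tool: it assumes a single-CPU `/proc/interrupts` layout like the dumps above (one count column), and the names `parse_interrupts` and `stalled_irqs` are made up for this example. The sample rows are trimmed from the two dumps above.

```python
def parse_interrupts(text):
    """Map IRQ label -> (count, name) for a single-CPU /proc/interrupts dump."""
    table = {}
    for line in text.splitlines():
        fields = line.split()
        # Data rows look like: "16:  5315316  INTC  68 Level  gp_timer".
        # The "CPU0" header and the "Err:" row have fewer fields and are skipped.
        if len(fields) < 6 or not fields[0].endswith(":") or not fields[1].isdigit():
            continue
        irq = fields[0].rstrip(":")
        name = " ".join(fields[5:])  # names may contain spaces, e.g. "48060000.mmc cd"
        table[irq] = (int(fields[1]), name)
    return table


def stalled_irqs(before, after, watch):
    """Names from `watch` whose count is identical in both snapshots."""
    first, second = parse_interrupts(before), parse_interrupts(after)
    return sorted(
        name
        for irq, (count, name) in first.items()
        if name in watch and irq in second and second[irq][0] == count
    )


FIRST = """\
           CPU0
 16:    5315316      INTC  68 Level     gp_timer
158:       1281      INTC  72 Level     44e09000.serial
174:     170642      INTC  41 Level     4a100000.ethernet
Err:          0
"""

SECOND = """\
           CPU0
 16:    5315316      INTC  68 Level     gp_timer
158:       1281      INTC  72 Level     44e09000.serial
174:     170692      INTC  41 Level     4a100000.ethernet
Err:          0
"""

watch = {"gp_timer", "44e09000.serial", "4a100000.ethernet"}
print(stalled_irqs(FIRST, SECOND, watch))
```

With these rows it lists `44e09000.serial` and `gp_timer` as stalled, while the Ethernet interrupt is still counting. On a board that is already in the failed state it is probably safer to capture the two snapshots by hand rather than with a timed delay, since another poster in this thread reports that `sleep 1` never returns in this state.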
4. The relevant DTB files are attached. Please help check whether the DTB configuration causes the system clock interrupt error.
Please help us solve this problem. Thanks very much!

  • Hi user4773977,

    The described issue is not a known one and needs deeper investigation. First, though, do you have an NTP configuration? If so, could you post your NTP settings?

    BR
    Tsvetolin Shulev
  • Hi Tsvetolin Shulev,
    We do not have an NTP configuration. We use an NTP client that we wrote ourselves. It connects to the NTP server over a socket, gets the time from it, and then uses the settimeofday function to write that time to the system clock. The client runs only once, after the system has started, so I don't think this issue has anything to do with it.
    In addition, when the issue occurs we cannot connect to the board over UART, but we can connect over telnet. After logging in to the board, we ran 'cat /proc/interrupts' to check the system interrupt status and found that the gp_timer interrupts had stopped. That is what makes the system time wrong. I wrote a kernel module and an app to read xtime and found that xtime no longer updates. I do not know what caused the gp_timer interrupts to stop.

    Things are urgent!
    Please help me!
    Thanks!
  • It looks like the INT system is wrong now!
  • Hi Brand,

    Is this issue solved?
    What do you mean by "INT system is wrong"? Any details?

    BR,
    Wayne

  • Hi Wayne,
    This issue is not solved.
    After logging in to the board, we ran 'cat /proc/interrupts' to check the system interrupt status and found that the gp_timer interrupts had stopped. That is what makes the system time wrong.
  • Hi All,

    I am observing something similar on our custom AM335x board. It does not happen often and is really hard to reproduce; basically we have to wait until certain parts of the system fail to respond.

    The main symptom we see when the device gets into this state is that it becomes unresponsive in certain time windows. During the 'unresponsive window', I cannot issue any commands at all. I have an ssh session open to the device, and during the 'responsive window' I am able to issue certain commands, for example the date command. I can issue the date command repeatedly until the system 'locks up'. After the system has locked up I can still type in commands, but I get no response over my ssh session. When the system transitions back into the 'responsive window', it immediately responds to the last command I issued, in this case the date command, and I see the time jump back 62 seconds from the previous date output.

    In addition, I see that the gp_timer interrupt count in /proc/interrupts NEVER increases, whereas on a system running normally it increases rapidly.

    It also appears as though the system timer has frozen: if I issue a sleep 1 command, it never exits. The serial console also appears to be dead in this state, and I am not able to log into the device with a separate ssh session.

    I am not sure whether I have missed another thread describing this issue; this is the only one I have found that roughly matches what I observe.

    Thanks in advance for any further insight