TDA4VM: GTC register not synced between HI & LO

Ming Zhong10

Intellectual 2235 points

Part Number: TDA4VM

Dear experts,

Our customer is seeing the following issue when reading the GTC from the A72:

When LO overflows, HI should increase by 1. But they find that when LO overflows, HI is not increased.

Do you have any suggested best practice for reading these two 32-bit registers?

Thanks & Best Regards!

over 5 years ago

0 subin li over 5 years ago

Expert 1050 points

Hi, experts

We found that at a certain moment HI increases by 1 but LO is not cleared, so when we read the GTC time, it is about 21 s ahead of the real time.

How can we solve this problem? If it cannot be solved, how long will this moment (HI increased by 1, but LO not cleared) last?

0 z over 5 years ago in reply to subin li

TI__Expert 4015 points

It seems to me that ZM and subin li are describing two different issues. I was not aware of this issue. Could you provide a register dump so that I can try to recreate the problem?

0 Ming Zhong10 over 5 years ago in reply to subin li

TI__Intellectual 2235 points

Subin,

Could you please give a piece of code & log to explain this issue?

I think if you can print the registers, it will help our expert to understand.

Thanks & Best Regards!

0 subin li over 5 years ago in reply to Ming Zhong10

Expert 1050 points

Our sample code is shown below:

#include <fcntl.h>
#include <stdint.h>
#include <sys/mman.h>

#define MAP_SIZE 4096UL
#define MAP_MASK (MAP_SIZE - 1)
#define GTC_BASE_REGISTER 0x00A90000

off_t gtc_target = GTC_BASE_REGISTER;

int fd = open("/dev/mem", O_RDWR | O_SYNC);

void *gtc_map_base = mmap(0, MAP_SIZE, PROT_READ | PROT_WRITE, MAP_SHARED, fd, gtc_target & ~MAP_MASK);

char *gtc_virt_addr = (char *) gtc_map_base + (gtc_target & MAP_MASK);

/* One 64-bit load at offset 0x8: LO (0x00A90008) and HI (0x00A9000C) together */
uint64_t read_result = *((volatile unsigned long *) gtc_virt_addr + 1);

At a certain moment, read_result is 0x01FFFFFFFD; when we quickly read it again, it is 0x02FFFFFFFD, but it should be 0x0200000000.

We think that at that moment HI (0x00A9000C) has increased by 1, but LO (0x00A90008) has not been cleared.

How can we solve this problem? If it cannot be solved, how long will this moment (HI increased by 1, but LO not cleared) last?

Thank you!

0 subin li over 5 years ago in reply to z

Expert 1050 points

ZM misunderstood our problem, please help to solve my issue. Thank you.

0 z over 5 years ago in reply to subin li

TI__Expert 4015 points

So far, I have not been able to replicate your error. What frequency is the GTC source clock? Additionally, what is the value in the GTC_CNTFID0 register? And after you read 0x02FFFFFFFD, do you eventually read 0x200000000 if you read from the register enough times?

One thing you could try doing differently is to read from the GTC_CNTCVS, which contains the same information in a read-only register. Since I can't recreate your issue, I don't know for sure if that will solve it.

-Zack

0 subin li over 5 years ago in reply to z

Expert 1050 points

Hi, Zack

About the GTC source clock: we didn't change the value of CTRLMMR_GTC_CLKSEL (bits 0-3), so it has its default value of 0, and we use MAIN_PLL3_HSDIV1_CLKOUT as the source clock, which runs at 200 MHz.

The value in the GTC_CNTFID0 register is 0.

After I read 0x02FFFFFFFD, I eventually read 0x200000000 (or 0x200000XXX) if I read from the register enough times. We want to know: when we get a wrong value (0x02FFFFFFFD), how long must we wait before we can get the correct value (0x200000XXX)?

The GTC_CNTCVS register you mentioned is GTC_CNTCVS_HI and GTC_CNTCVS_LO; I found that their values are the same as those at 0x00A90008 and 0x00A9000C.

Maybe you can use the test code shown below to replicate our error:

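uint64_t last_result = 0;
uint64_t read_result = 0;

while (1) {
    read_result = *((volatile unsigned long *) gtc_virt_addr + 1);
    /* abs() takes an int, so compare the unsigned 64-bit values directly. */
    uint64_t diff = read_result > last_result ? read_result - last_result : last_result - read_result;
    /* 200000000 ticks is 1 s at 200 MHz; skip the first pass, where last_result is still 0. */
    if (last_result != 0 && diff > 200000000) {
        printf("ERROR happen!\n");
    }
    last_result = read_result;
    usleep(1);
}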

0 z over 5 years ago in reply to subin li

TI__Expert 4015 points

I encountered the issue when I left it to run for several hours. In order to catch the failure, I had to poll the counter at just the right moment, so letting it run for a while brought out the failure.

I will continue to look into this to find a solution.

0 subin li over 5 years ago in reply to z

Expert 1050 points

Thank you, looking forward to hearing good news from you.

0 z over 5 years ago in reply to subin li

TI__Expert 4015 points

Subin,

There is not a way to sync the memory mapped registers for HI and LO on the GTC. Because the interface to the GTC counter is only 32 bits wide, any read of the counter will necessarily be two 32-bit reads. If those two reads are separated by a rollover of the LO register, then you will encounter an error.

I propose the following workaround: perform a second read of the timer counter, with the shortest possible delay, to detect the presence of a bad read. The second read should always be greater than the first read. If it is not, then the first read was invalid. There are a few caveats:

1) Reads of the GTC counter should always read LO first, followed by HI. The C code in your example is treating the counter as a single 64 bit number, but based on the sequence of values you provided, the assembled code is reading LO first, then HI.

2) The second read must occur within the rollover time of LO. In other words, the time between reads must be less than (2^32)/(GTC_CLK freq).

3) After reading the GTC counter twice, discard the second read and use the first read as your returned value.

Please let me know if this is an acceptable workaround for you, and if you have any more questions.
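The three caveats above can be sketched in C roughly as follows. This is only an illustration of the double-read check, not TI code: the accessor type `gtc_read32_fn`, the helper names, and the `fake_gtc` software counter (which advances on every register access so the check can be exercised off-target) are all assumptions. The offsets 0x08 and 0x0C correspond to the LO and HI addresses 0x00A90008 and 0x00A9000C mentioned earlier in the thread.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define GTC_CNTCV_LO 0x08u /* LO offset within the GTC block (0x00A90008) */
#define GTC_CNTCV_HI 0x0Cu /* HI offset within the GTC block (0x00A9000C) */

/* Hypothetical accessor type: returns the 32-bit register at `offset`. */
typedef uint32_t (*gtc_read32_fn)(void *ctx, uint32_t offset);

/* One raw read: LO first, then HI, as two separate 32-bit accesses. */
static uint64_t gtc_read_raw(gtc_read32_fn rd, void *ctx)
{
    uint32_t lo = rd(ctx, GTC_CNTCV_LO);
    uint32_t hi = rd(ctx, GTC_CNTCV_HI);
    return ((uint64_t)hi << 32) | lo;
}

/*
 * Double-read check: returns 0 and stores the FIRST read in *out when the
 * second read is greater; returns -1 when the first read must be discarded.
 * The second read is only used for the comparison and is never returned.
 */
static int gtc_read_checked(gtc_read32_fn rd, void *ctx, uint64_t *out)
{
    assert(out != NULL);
    uint64_t first = gtc_read_raw(rd, ctx);
    uint64_t second = gtc_read_raw(rd, ctx);
    if (second <= first)
        return -1;
    *out = first;
    return 0;
}

/*
 * Software stand-in for the counter (assumption, for off-target testing):
 * the count advances by `step` ticks after every register access.
 */
struct fake_gtc {
    uint64_t count;
    uint64_t step;
};

static uint32_t fake_gtc_read32(void *ctx, uint32_t offset)
{
    struct fake_gtc *g = ctx;
    uint32_t v = (offset == GTC_CNTCV_HI) ? (uint32_t)(g->count >> 32)
                                          : (uint32_t)g->count;
    g->count += g->step;
    return v;
}
```

On the target, the accessor would wrap a volatile 32-bit load from the mapped GTC page, and a caller would simply retry when `gtc_read_checked` returns -1.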

0 subin li over 5 years ago in reply to z

Expert 1050 points

Hi, Zack

Are you making progress on this issue?

If there is no good solution for the time being, can you first confirm for us how long this moment (HI increased by 1, but LO not cleared) will last?

Thank you!

0 z over 5 years ago in reply to subin li

TI__Expert 4015 points

Subin,

I'm not sure if you missed it, but I provided a response on the 4th of September. Please review that response, and let me know if that solution works for you.

Thanks,

Zack

0 subin li over 5 years ago in reply to z

Expert 1050 points

Hi, Zack

I think there are some problems about your solution.

1. How long after the first read should we start the second read? If the time between reads is too short, the values of the two readings are always the same or very close, so both readings may be wrong.

2. We think there are the following scenarios:

① first read: 0x1FFFFFF00 (TRUE)

usleep(XXX)

second read: 0x1FFFFFFFE (TRUE)

return: 0x1FFFFFFFE

② first read: 0x1FFFFFFFE (TRUE)

usleep(XXX)

second read: 0x2FFFFFFFF (FALSE)

usleep(XXX)

third read: 0x20000000E (TRUE)

return: 0x20000000E

③ first read: 0x2FFFFFFFE (FALSE)

usleep(XXX)

second read: 0x20000000E (TRUE)

usleep(XXX)

third read: 0x20000001E (TRUE)

return: 0x20000001E

Now we want to know the minimum usleep time XXX.

However, we cannot actually use usleep here, because each call causes a context switch, and calling it frequently would be inefficient.

So we are going to use a busy-wait function instead, because it does not cause a context switch.

That minimum wait time XXX is what we need to know.

0 z over 5 years ago in reply to subin li

TI__Expert 4015 points

Subin,

You said in your last post:

"If the time between reads is too short, the values of the two readings are always the same or very close."

Have you observed multiple consecutive reads with incorrect values? I have not observed this in my tests, and if that is what you are finding, then it runs contrary to my understanding of the root problem.

If that is the case, then I agree that there must be some minimum wait time between reads, and I will work to figure out what that delay must be.

0 subin li over 5 years ago in reply to z

Expert 1050 points

Hi, Zack

I mean if I read the time like this:

first read: 0x2FFFFFF0E

second read: 0x2FFFFFF0E

If we do not wait for a period of time between the two reads, the second read will be the same as or very close to the first read.

If the first read is TRUE, the second read will also be TRUE. If the first read is FALSE, the second read will also be FALSE.

So we have to know the minimum wait time between reads.

0 z over 5 years ago in reply to subin li

TI__Expert 4015 points

Is the above scenario with two false reads hypothetical, or observed? I don't think it's possible to get two incorrect readings in a row like that. The source of the error is that a rollover occurs on LO between reading LO and HI. The events happen in the following order:

1) You read LO

2) LO rolls over, and HI increments.

3) You read HI.

Now you've read an incremented HI value, but a LO value from before the rollover. Your code sticks them together as a single 64-bit value. This is what appears to be causing your false readings. However, two sequential reads can't both be false by this mechanism, as long as the time between them is significantly less than the rollover period of LO.

-Zack

0 subin li over 5 years ago in reply to z

Expert 1050 points

Hi, Zack

I think your idea is wrong.

Below is my test log:

read GTC: (last: 2796023698067) 0x28AFFFFD293

read GTC: (first: 2796023698067) 0x28AFFFFD293

usleep(1)

read GTC: (second: 2800318676976) 0x28BFFFFFFF0

read GTC: (third: 2800318676976) 0x28BFFFFFFF0

So if I read the registers without sleep, the values will be the same.

So we want to know the interval to use between register reads; the interval should guarantee that at least one reading is correct.

0 z over 5 years ago in reply to subin li

TI__Expert 4015 points

What core are you accessing this register from, and what speed is that core running at?

-Zack

0 subin li over 5 years ago in reply to z

Expert 1050 points

We are accessing this register from the A72, and it is running at 2 GHz.

0 z over 5 years ago in reply to subin li

TI__Expert 4015 points

Subin,

I reached out to the designer and he gave this comment:

this really isn’t the intended use of the GTC at least from the CPU core standpoint. The GTC is present in part to implement the System Counter function of the ARMv8 Generic Timer Architecture. As such, it distributes the time count to the ARM processors via a gray encoded System Timer Bus. The ARM cores each contain Processor Element timers that get their counter value from the GTC over the System Timer Bus. So I would expect for the A72 core to get the time by reading its local CNTPCT using a system control register read rather than trying to read from the GTC counter registers.

Is that a workable solution?
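For reference, a system-register read of the local counter on an AArch64 core might look like the sketch below. Two things here are assumptions rather than anything stated in the thread: the function name `read_system_counter` is made up, and the sketch reads the virtual counter CNTVCT_EL0 instead of CNTPCT, because Linux normally lets user space read the virtual counter while access to the physical one may be trapped. The non-AArch64 branch exists only so the example builds on a host machine and returns nanoseconds, not GTC ticks.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdint.h>
#include <time.h>

/* Read the per-core generic timer count with a single 64-bit access. */
static uint64_t read_system_counter(void)
{
#if defined(__aarch64__)
    uint64_t ticks;
    /* isb keeps the counter read from being executed early. */
    __asm__ volatile("isb\n\tmrs %0, cntvct_el0" : "=r"(ticks) : : "memory");
    return ticks;
#else
    /* Host-only fallback (assumption): CLOCK_MONOTONIC in nanoseconds. */
    struct timespec ts;
    int rc = clock_gettime(CLOCK_MONOTONIC, &ts);
    assert(rc == 0);
    (void)rc;
    return (uint64_t)ts.tv_sec * 1000000000ull + (uint64_t)ts.tv_nsec;
#endif
}
```

Because the value arrives in one 64-bit register read, there is no HI/LO pair to tear on this path.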

0 subin li over 5 years ago in reply to z

Expert 1050 points

Hi, Zack

We need to get the time not just from the A72, but from the other cores (R5, C66 and C7X) as well, and we found that only the GTC register can be used as a common source of time. You said that on the A72 we can read its local CNTPCT to get the time, but we cannot get the time in the same way on the other cores. If we use different methods to get the time on the other cores, the times we get will differ.

On the other hand, whatever the purpose of the GTC is, the problem we described earlier is still there, and so far we cannot solve it.

Now we just want to know how long the GTC register error will last.

0 z over 5 years ago in reply to subin li

TI__Expert 4015 points

Subin,

read GTC: (last: 2796023698067) 0x28AFFFFD293

read GTC: (first: 2796023698067) 0x28AFFFFD293

usleep(1)

read GTC: (second: 2800318676976) 0x28BFFFFFFF0

read GTC: (third: 2800318676976) 0x28BFFFFFFF0

I don't really understand the test log above. Are you reading in the order last, first, second, third?

"So if I read the registers without sleep, the values will be the same.

So we want to know the interval to use between register reads; the interval should guarantee that at least one reading is correct."

Could you fill a large array with counter values, without calling usleep between read_GTC? At least a big enough array to see a few transitions in GTC before the jump, while the jump is corrupting the read_GTC value, and after the jump. That way we can get a better idea of how fast the GTC register is being accessed and it will help answer your question.

-Zack
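A capture along the lines requested above could be sketched as follows: fill an array with back-to-back reads, with no sleep in between, then scan it afterwards for the first sample that goes backwards or jumps too far ahead. All names here (`gtc_read64_fn`, `gtc_capture`, `gtc_find_jump`, and the `replay` stand-in reader used for off-target testing) are hypothetical; on the target, the reader would wrap the 64-bit load from the mapped GTC page.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical reader type: returns one assembled 64-bit counter value. */
typedef uint64_t (*gtc_read64_fn)(void *ctx);

/* Fill buf with n back-to-back reads; nothing else happens in the loop. */
static void gtc_capture(gtc_read64_fn rd, void *ctx, uint64_t *buf, size_t n)
{
    assert(buf != NULL);
    for (size_t i = 0; i < n; i++)
        buf[i] = rd(ctx);
}

/*
 * Return the index of the first sample that is smaller than its predecessor
 * or more than max_step ticks ahead of it, or n if there is none.
 */
static size_t gtc_find_jump(const uint64_t *buf, size_t n, uint64_t max_step)
{
    for (size_t i = 1; i < n; i++) {
        if (buf[i] < buf[i - 1] || buf[i] - buf[i - 1] > max_step)
            return i;
    }
    return n;
}

/* Stand-in reader for off-target testing: replays a fixed list of values. */
struct replay {
    const uint64_t *values;
    size_t next;
};

static uint64_t replay_read(void *ctx)
{
    struct replay *r = ctx;
    return r->values[r->next++];
}
```

Printing the samples on either side of the index returned by `gtc_find_jump` shows how many consecutive reads are affected and how far apart in ticks successive reads are.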

0 subin li over 5 years ago in reply to z

Expert 1050 points

Hi, Zack

1. Yes, I'm reading in the order last, first, second, third.

The test log shows that I read twice and both values are 0x28AFFFFD293, which is correct.

But after usleep(1), I read twice again and both values are 0x28BFFFFFFF0, which is wrong (a big jump compared with 0x28AFFFFD293).

2. I think we have described the problem clearly.

We want to know how long this error window lasts, from a theoretical analysis point of view.

Our test is not completely reliable and depends on the actual environment.

0 subin li over 5 years ago in reply to z

Expert 1050 points

This problem has been delayed for a long time, please help us solve it as soon as possible, thank you.

0 z over 5 years ago in reply to subin li

TI__Expert 4015 points

The expected duration is 1 GTC clock cycle.
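To put numbers on this, the duration of one GTC clock cycle and the LO rollover period both follow directly from the source clock frequency. The helper names below are made up for illustration; the 200 MHz figure is the clock rate reported earlier in the thread.

```c
#include <assert.h>

/* Duration of one GTC tick in nanoseconds: 1e9 / f. */
static double gtc_tick_ns(double gtc_hz)
{
    assert(gtc_hz > 0.0);
    return 1e9 / gtc_hz;
}

/* Time until the 32-bit LO register wraps, in seconds: 2^32 / f. */
static double gtc_lo_rollover_seconds(double gtc_hz)
{
    assert(gtc_hz > 0.0);
    return 4294967296.0 / gtc_hz; /* 2^32 ticks */
}
```

At 200 MHz, `gtc_tick_ns(200e6)` gives 5 ns and `gtc_lo_rollover_seconds(200e6)` gives about 21.47 s, which matches the roughly 21 s error reported at the start of the thread.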
