HalCoGen/Hercules: EMIF-ASYNCH1/2/3

Sarah Weinberger

Expert 1915 points

Other Parts Discussed in Thread: HALCOGEN, RM48L952

Why is it that if we change setup for strobe for EMIF-ASYNCH1/2/3 output memory access, we never get a strobe of anything other than 26us?

over 9 years ago

0 Anthony F. Seely over 9 years ago

TI__Guru 68930 points

Hi Sarah,

Would you please elaborate a bit?

First, what do you mean specifically by a 'setup' of 26us? Is this the 'SETUP' phase of the ASYNC transaction that you refer to as the 26us setup, or just setup in a general sense meaning that the STROBE period stays 26us.

Also what are you seeing on the CS\ pin? In the Select Strobe mode - the chip select should have the timing of the strobe which would be 5 cycles.
These are cycles in terms of the EMIF CLOCK frequency on the "EMIF General" tab (VCLK3).

Your clock must be really really slow if you are seeing 26us = 5 cycles...

Anyway need some more information on what is going on.. a picture of what you are seeing from a scope would be great.

Thanks and Best Regards,
Anthony.

0 Sarah Weinberger over 9 years ago in reply to Anthony F. Seely

Expert 1915 points

Hi Anthony, thank you for the quick response.

I have to forward some of your questions to the EE. I worked with him over the weekend, so I have some of the answers, but he did the actual measurements. I was just behind the computer.

Our task was to test out the discrete out signals, what he called parallel 4xxx interface. We have 8 signals going out and 8 coming in, however we wanted to test the discrete out signals. The EE had me alternate 0x00 and 0xFF on the discrete lines every 50ms. I set up a 50ms timer and wiggled the discrete out signals.

Here is the code:

#define lpbyDiscreteIn  (uint8 *)0x60000000
#define lpbyDiscreteOut (uint8 *)0x64000000
static uint8 isDiscreteLow = FALSE;					  // SMW DEBUG --

void rtiNotification(uint32 notification)
{
/*  enter user code between the USER CODE BEGIN and USER CODE END. */
/* USER CODE BEGIN (9) */
	if (rtiNOTIFICATION_COMPARE0 == notification)
	{
		// Test: Discrete Out
		uint8 uValue2 = 0;
		uValue2 = (FALSE == isDiscreteLow ? 0x00 : 0xFF);
		isDiscreteLow = !isDiscreteLow;
		*lpbyDiscreteOut = uValue2;
	}
	else if (rtiNOTIFICATION_COMPARE1 == notification)
	{
	}
/* USER CODE END */
}

The EE tested the signals, however what he saw was 26us on any of the pins with a random other timing element on the oscilloscope. He had me change the values from the default to (2, 5, 3, 2, 5, 3, 1), however that had no effect on the timing that we saw.

Also, is there a nice HalCoGen #define, method, or whatnot to write and read from the discrete out lines rather than the crude #define that I used with lpbyDiscreteIn and lpbyDiscreteOut?

0 Anthony F. Seely over 9 years ago in reply to Sarah Weinberger

TI__Guru 68930 points

Sarah,

It likely depends then on the signals that were being probed. Probing the data lines will not give you a reliable measurement... and there is a bus hold feature that keeps the lines from floating between accesses.

You need to be probing the EMIF_nCS[2] and EMIF_nCS[3] lines to see the actual timing and STROBE period. When you have SELECT STROBE mode you should see the width of these (active low) signals exactly equal to the number of clock cycles that you have programmed for the STROBE period (possibly with a factor of +1, depending on whether HalCoGen corrects for this or not; normally a value of 0 in the register means 1 cycle, a value of 1 means 2 cycles, etc.).

Otherwise the other factor that might be in play is if you have the EMIF_nWAIT signal in use and your hardware is inserting a fixed delay.. you would also need to have nWAIT enabled in HalCoGen for this to occur though.

Thanks and Best Regards,
Anthony
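
As a rough sketch of the arithmetic in this reply: if the strobe field is encoded so that a register value of N gives N + 1 EMIF clock cycles (the possibility raised above, not something confirmed for HalCoGen in this thread), the expected chip-select width follows directly from the EMIF clock. The helper names and the plus_one switch are made up for illustration.

```c
#include <stdint.h>

/* Hypothetical helper: expected active-low nCS width in nanoseconds in
 * Select Strobe mode. Assumption: with plus_one != 0, a field value of N
 * means N + 1 cycles; pass plus_one = 0 if the tool already corrects this. */
static uint32_t strobe_width_ns(uint32_t strobe_field, uint32_t emif_clk_hz,
                                int plus_one)
{
    uint32_t cycles = strobe_field + (plus_one ? 1u : 0u);
    /* cycles * 1e9 / f, in 64-bit math so the product cannot overflow */
    return (uint32_t)(((uint64_t)cycles * 1000000000u) / emif_clk_hz);
}

/* Inverse check: which clock would make `cycles` cycles last `width_ns`? */
static uint32_t implied_clk_hz(uint32_t cycles, uint32_t width_ns)
{
    return (uint32_t)(((uint64_t)cycles * 1000000000u) / width_ns);
}
```

For example, implied_clk_hz(5, 26000) comes out at roughly 192 kHz, which is the "really really slow" clock that 26us = 5 cycles would imply.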

0 Sarah Weinberger over 9 years ago in reply to Anthony F. Seely

Expert 1915 points

Hi Anthony, we now get 5 pulses instead of one on the writes only, and they are approximately 800ns wide. The reads are correct. ASYNCH1 is read and ASYNCH2 is write. Purely as a test, we changed the code so that ASYNCH1 did the write, and the problem, namely the 5 pulses, traveled with the write.

Here are the addresses to which we read and write.

#define lpbyDiscreteIn  (uint8 *)0x60000000
#define lpbyDiscreteOut (uint8 *)0x64000000

Here is my test code that I have in a 100ms timer interrupt.

			// Test: Discrete In
			if (10 <= uDiscreteIn)
			{
				uint8 uValueIn = *lpbyDiscreteIn;
				char szValueIn[12];
				char szLine[81];
				*szLine = 0;
				itoa(uValueIn, szValueIn);
				strcat(szLine, "Discrete In = ");
				strcat(szLine, szValueIn);
				strcat(szLine, "\r\n");
				fifoenter(szLine);
				uDiscreteIn = 0;
			}
			else
				uDiscreteIn++;

			// Test: Discrete Out
			if (10 <= uDiscreteOut)
			{
//				uint8 uValue2 = 0;
//				uValue2 = (FALSE == isDiscreteLow ? 0x00 : 0xFF);
//				isDiscreteLow = !isDiscreteLow;
//				*lpbyDiscreteOut = uValue2;
				*lpbyDiscreteOut = (uint8)uDiscreteValue;
				if (255 < ++uDiscreteValue)
					uDiscreteValue = 0;
				uDiscreteOut = 0;
//				static int iBit = 0;
//				uint8 uValue = 1 << iBit;
//				*lpbyDiscreteOut = uValue;
//				iBit++;
//				if (8 <= iBit)
//					iBit = 0;
			}
			else
				uDiscreteOut++;

What we see on the scope are 5 pulses on the write side. When I read data back, we see 223, although upon startup the value builds up to that, so it might be 14, then 20, then 48, then 223, where it settles. The EE says that he only sees 5-pulses but was expecting one (1).

Here are the ASYNCH1 and 2 configuration screenshots. We tried different settings, even putting 1 for the write and read strobe values, but that did not help our problem.

I have a tested, straight-through 26-pin ribbon cable connecting the in to the out. I increment data from 0 to 255 and then repeat, writing it out on the discrete port and reading the values back, as seen in the code above. What I see is merely 223, and 5 pulses on the scope, with the EE expecting to see 1 pulse.

0 Anthony F. Seely over 9 years ago in reply to Sarah Weinberger

TI__Guru 68930 points

Hi Sarah,

Sorry I am getting a bit confused and need to clarify a couple points:

1) When measuring the pulse width - which pin are you measuring? Is it the chip select or is it another pin like WE or OE?

2) By '5 pulses' do you mean 1 pulse low of width 5 clocks? Or do you mean 5 distinct pulses low, with the pin returning high between each?
(and this is on the chip select pin again?)

3) 223 is a data bus value, correct? It's not 'setup of 2, strobe of 2, hold of 2'.. Just want to make sure of this.

A screen capture from a scope would be really helpful here if it's possible to grab one. I don't expect any issue other than the EMIF errata (just a post today about it - and you may need to read it too... e2e.ti.com/.../516918).

Thanks and Best Regards,
Anthony

0 Sarah Weinberger over 9 years ago in reply to Anthony F. Seely

Expert 1915 points

Hi Anthony,

(from the EE)

1. We are measuring at nCS3.

2. By 5-pulses, I mean 5 times nCS3 goes low with the low time being 600ns and the high time being 700ns.

3. That is correct and it is a data bus value, which we read and is in decimal.

Thanks,

Sarah

0 Anthony F. Seely over 9 years ago in reply to Sarah Weinberger

TI__Guru 68930 points

Hi Sarah,

If they are distinct pulses, then we should understand the data type of your variables lpbyDiscreteOut and lpbyDiscreteIn.
The code says they are uint8 *, so you should be seeing an assembly instruction 'LDRB' for the read and 'STRB' for the write.

If you can confirm that the assembly code being executed is an LDRB and an STRB (open the disassembly window and use the 'green' assembly step arrows in CCS), then maybe the next step is to check your MPU settings. If the EMIF is set up for 'Normal' type then it may be that something funny is going on with the CPU. I can't put my finger on it, but there's write buffering and the CPU probably reads ahead too when the memory is type 'Normal', so you want to make sure it's strongly ordered.

You are trying to implement an IEEE-488 interface (AKA GPIB or HPIB)? Or something else?

Thanks and Best Regards,
Anthony

EDIT: PS forgot to ask you EMIF clock frequency - in case there is a question about the strobe timing itself.. Was focused on the # of pulses but need to know the EMIF clock to confirm that the width is correct.
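
A minimal sketch of byte-wide accessors for the two ports, assuming the addresses from the defines earlier in the thread. The names are hypothetical and this is not a HalCoGen API. The volatile qualifier only tells the compiler to perform each access as written; how the CPU and bus treat the access is still governed by the MPU memory type discussed above.

```c
#include <stdint.h>

/* Addresses taken from the defines earlier in the thread. */
#define DISCRETE_IN_ADDR   0x60000000u
#define DISCRETE_OUT_ADDR  0x64000000u

/* volatile: the compiler may not cache, merge, or drop these accesses.
 * uint8_t keeps them byte-wide, matching the LDRB/STRB seen in the thread. */
typedef volatile uint8_t *discrete_port_t;

/* Hypothetical accessor: one byte read from the given port. */
static inline uint8_t discrete_read(discrete_port_t port)
{
    return *port;
}

/* Hypothetical accessor: one byte write to the given port. */
static inline void discrete_write(discrete_port_t port, uint8_t value)
{
    *port = value;
}
```

On the target, a write would look like discrete_write((discrete_port_t)DISCRETE_OUT_ADDR, 0xFF). Taking the port as a parameter also lets the same functions be exercised on a host against an ordinary variable.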

0 Sarah Weinberger over 9 years ago in reply to Anthony F. Seely

Expert 1915 points

Hi Anthony,

Here is the disassembly, which as you can see does use the LDRB for the read and STRB for the write.

182       				uint8 uValueIn = *lpbyDiscreteIn;
0003aeb0:   E3A0C206            mov        r12, #0x60000000
0003aeb4:   E5DCC000            ldrb       r12, [r12]
0003aeb8:   E5CDC004            strb       r12, [sp, #4]

203       				*lpbyDiscreteOut = (uint8)uDiscreteValue;
0003af2c:   E59FC790            ldr        r12, [pc, #0x790]
0003af30:   E3A00319            mov        r0, #0x64000000
0003af34:   E5DCC000            ldrb       r12, [r12]
0003af38:   E5C0C000            strb       r12, [r0]

We did not use the MPU, as we disabled that, but we enabled the MPU, as shown below, to no change, still the 5-pulses. The applicable region is 5, because the address range is 0x60000000 through 0x67FFFFFF.

Did you want another set of values on this tab?

>> You are trying to implement an IEEE-488 interface (AKA GPIB or HPIB)? Or something else?

We are outputting 8 parallel values and reading the same 8 parallel values back after looping them back via RS-422 8-bit parallel link.

>> PS forgot to ask you EMIF clock frequency - in case there is a question about the strobe timing itself.. Was focused on the # of pulses but need to know the EMIF clock to confirm that the width is correct.

10 MHz

Thanks,

Sarah
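
With the numbers reported so far (nCS3 low for about 600ns, high for about 700ns, EMIF clock 10 MHz), the measured widths can be turned back into clock cycles as a sanity check. This is only a sketch and the helper name is hypothetical. Reading the resulting 6 cycles as "strobe field 5 plus one" relies on the N + 1 encoding that was mentioned earlier only as a possibility.

```c
#include <stdint.h>

/* Convert a measured pulse width to EMIF clock cycles, rounded to nearest. */
static uint32_t width_to_cycles(uint32_t width_ns, uint32_t emif_clk_hz)
{
    uint64_t scaled = (uint64_t)width_ns * emif_clk_hz;   /* ns * Hz */
    return (uint32_t)((scaled + 500000000u) / 1000000000u);
}
```

At 10 MHz one cycle is 100ns, so width_to_cycles(600, 10000000) gives 6 and width_to_cycles(700, 10000000) gives 7.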

0 Joe Josn over 9 years ago in reply to Sarah Weinberger

Intellectual 680 points

Enable the MPU and configure your async memory region as device type to avoid extra writes, as Anthony mentioned. One thing to note is that _mpuInit_() needs to be called in your startup code to enable the MPU. I don't believe HalCoGen does that automatically with the checkbox in the MPU tab as you would expect. Make sure _mpuInit_() is called before your test code is executed.

0 Sarah Weinberger over 9 years ago in reply to Joe Josn

Expert 1915 points

Yes, I like the _mpuInit_() comment too, as I did not know that I have to call that, however in stepping through the debugger or just uploading code, code execution stops right at the _mpuInit_() call.

void doInitDiscrete()
{
	// Initialize ASYNC1, 0x60000000, for discrete input with order the same as old CCM; method ReadDiscrete().
	emif_ASYNC1Init();

	// Initialize ASYNC2, 0x64000000, for discrete output with order the same as old CCM; method WriteDiscrete().
	emif_ASYNC2Init();

	// Enable the MPU unit to avoid extra writes.
	_mpuInit_();
}

I call this function just before the "_enable_IRQ();" call.

I only enabled the EEPROM stuff and those settings look okay.

What could be the reason that I get hung up on that call, and how do I fix the problem?

0 Chester Gillon over 9 years ago in reply to Sarah Weinberger

Guru 92251 points

What could be the reason that I get hung up on that call, and how do I fix the problem?

The _mpuInit_ function writes to MPU co-processor registers, which requires the CPU to be in a "privileged state". If the CPU is in user (USR) mode, the _mpuInit_ function will generate an abort.

Prior to stepping over the _mpuInit_ function can you check in the CCS Target Status view what is the state of the CortexR4 CPU. In the following, albeit for a different CPU to yours, the information of interest is the "Status mode":

0 Joe Josn over 9 years ago in reply to Chester Gillon

Intellectual 680 points

If you tried running in privileged mode like Chester suggested and are still having issues, take a look at this app note: infocenter.arm.com/.../index.jsp .

This section is taken from AN296 at ARM info center : "None of the regions are defined or enabled after reset. Any access which lies outside a defined and enabled region when the MPU is enabled will cause an abort. Therefore, at least one region must be defined before enabling the MPU after reset. If the MPU is enabled and no regions are defined, the processor enters a state from which it is recoverable only by a further reset"

So, any memory that you access (R/W/X) should be defined and enabled before you enable mpu.

I believe HalCogen has a set of predefined regions with the correct settings by default for FLASH/RAM/Peripherals. You can enable all those regions plus the specialized region for EMIF.

0 Chester Gillon over 9 years ago in reply to Joe Josn

Guru 92251 points

Joe Josn said:
This section is taken from AN296 at ARM info center : "None of the regions are defined or enabled after reset. Any access which lies outside a defined and enabled region when the MPU is enabled will cause an abort. Therefore, at least one region must be defined before enabling the MPU after reset. If the MPU is enabled and no regions are defined, the processor enters a state from which it is recoverable only by a further reset"
So, any memory that you access (R/W/X) should be defined and enabled before you enable mpu.

Thank you for the information. The screenshot of the MPU settings for regions 1 to 5 that Sarah posted shows that only region 5, which is for the EMIF, is enabled. The MPU regions 2 (flash) and 3 (RAM) are not enabled.

Since HALCoGen leaves the CPU in system mode by default, I believe the problem is caused by disabling MPU access to FLASH/RAM/Peripherals.
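
To make the failure mode concrete, here is a simplified host-side model of the rule quoted from AN296: once the MPU is on, an access that no enabled region covers aborts. It is only a sketch. It ignores the background region, region priority, and permissions. Only the EMIF range is taken from the thread; the flash and RAM extents in the table are assumptions, and all names are hypothetical.

```c
#include <stdint.h>
#include <stdbool.h>
#include <stddef.h>

typedef struct {
    uint32_t base;      /* first address of the region */
    uint32_t last;      /* last address of the region (inclusive) */
    bool     enabled;
} mpu_region_model_t;

/* Illustrative table mirroring "only the EMIF region is enabled".
 * Flash and RAM extents are assumptions, not taken from the thread. */
static const mpu_region_model_t regions_only_emif[] = {
    { 0x00000000u, 0x003FFFFFu, false },  /* flash (assumed extent), disabled */
    { 0x08000000u, 0x0803FFFFu, false },  /* RAM (assumed extent), disabled */
    { 0x60000000u, 0x67FFFFFFu, true  },  /* EMIF async range, enabled */
};

/* True if some enabled region covers addr. In this simplified model an
 * uncovered access corresponds to an abort once the MPU is enabled. */
static bool mpu_model_covers(const mpu_region_model_t *r, size_t n,
                             uint32_t addr)
{
    for (size_t i = 0; i < n; i++) {
        if (r[i].enabled && addr >= r[i].base && addr <= r[i].last) {
            return true;
        }
    }
    return false;
}
```

With this table, an EMIF address such as 0x64000000 is covered, but an instruction fetch from flash or a stack access in RAM is not, which is consistent with execution stopping right at the call that enables the MPU.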

0 Sarah Weinberger over 9 years ago in reply to Chester Gillon

Expert 1915 points

Does that mean that I am in privileged mode?

Same issue.

0 Chester Gillon over 9 years ago in reply to Sarah Weinberger

Guru 92251 points

Sarah Weinberger said:
Does that mean that I am in privileged mode?

Yes, that means you are in privileged mode. i.e. the CortexR4 mode isn't the problem.
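
For reference, the mode reported by the debugger corresponds to the low five bits of the CPSR. The sketch below only decodes a CPSR value (for example one copied from the CCS register view); it does not read the register itself, and the function name is hypothetical. User mode is the only unprivileged mode; System mode is privileged.

```c
#include <stdint.h>
#include <stdbool.h>

#define CPSR_MODE_MASK 0x1Fu
#define CPSR_MODE_USR  0x10u   /* User: the only unprivileged mode */
#define CPSR_MODE_SVC  0x13u   /* Supervisor */
#define CPSR_MODE_SYS  0x1Fu   /* System: privileged, shares User registers */

/* True if the mode field of the given CPSR value is a privileged mode. */
static bool cpsr_is_privileged(uint32_t cpsr)
{
    return (cpsr & CPSR_MODE_MASK) != CPSR_MODE_USR;
}
```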

0 Anthony F. Seely over 9 years ago in reply to Chester Gillon

TI__Guru 68930 points

Sarah,

I didn't realize this was the same project as the interrupt project; so didn't look at the MPU settings.

I just created a new HalCoGen project for the RM57L and the MPU is enabled by default, as well as all of the regions up to and including region 7, plus Region 16, which covers the peripheral and system registers on top of that. This is a project created with HalCoGen 4.05.01.

Those settings should be useful for the RM57L without editing them, for the most part. And even Region 5 is set up to be STRONGLY ORDERED.

Does your HalCoGen install give different default settings ? Seems like most of your regions are disabled.

0 Sarah Weinberger over 9 years ago in reply to Chester Gillon

Expert 1915 points

I'll be the first person to say that I do not know what each HalCoGen/Hercules region means or which ones I should enable or to what value.

What does OINC mean?

I enabled regions 2 and 3, as shown as follows, and was able to go a couple of steps beyond the _mpuInit_() call; however, I still halt, just a bit later, so that is the area in which to make a positive change.

What further changes do I need to make?

0 Anthony F. Seely over 9 years ago in reply to Sarah Weinberger

TI__Guru 68930 points

Sarah,

It's a little complex because the MPU region type options support a much 'bigger' processor than the one you have. I think some of the options only make sense if you have a cluster of R5's sharing an L2 cache.

You can float the cursor over the drop down V (arrow) and it will at least decode the acronyms into words.

But the description of these options is all in the CPU documentation from ARM.

My HalCoGen project defaults to 'NORMAL_OIWTNOWA_NONSHARED' and the permission is 'PRIV_RO_USER_RO_EXEC' for the flash.

That makes sense. First, the Flash is normal memory - it's not sensitive to the size or number of reads (aside from pipeline effects..)

The OIWT and NOWA are cache policies related to writes.. but it doesn't matter for Flash because flash is read only.

Non-Shared has to do with the mutex primitives 'LDREX and STREX' and just means you will not be using this memory area for a mutex - which makes sense because it is read only.

The permission PRIV_RO_USER_RO_EXEC means you can read the flash in any mode, and you can execute instructions from it.
If you have memory that you want to make sure the CPU never tries to execute from (like the area on your EMIF that is IO) you would at least make sure that you have an option without the 'EXEC' bit set.

Back to the cache - the OIWT means Outer/Inner Write-Through. You can ignore OI because this device doesn't have two levels of cache. Write through is important. There is write through and write back. Write through means that when you write, if the data you write is in the cache, the cache gets updated but the value in the original memory (say L2SRAM) also gets updated at the same time. This may slow you down a bit but helps if you are using DMA because the DMA cannot read data from cache.

Write-back is faster because the writes only go to the main memory when there is a need to evict a line from cache. So it's a good type for your stack, global variables, etc. Anything that isn't read by a different bus master (DMA, EMAC, HTU, FTU..)

WA (write-allocate) is a policy that means if you don't have an address in the cache already because of a prior access, the first write itself will cause that line to be cached. NOWA means no write-allocate.

These options are more important for the L2SRAM region (covering 0x0800 0000 to 0x0803 FFFF) where you have the RW attribute.

The place to read all about this in detail is silver.arm.com/.../download.tm (ARM DDI0406C). You can start with Chapter B4 Protected Memory System Architecture (PMSA) which is what you have on the Cortex R. (the previous chapter VMSA is for cortex A so don't get confused).
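
The write-through versus write-back difference described above can be illustrated with a toy model of a single cache line that is assumed to be already present in the cache. This is a sketch for intuition only, with hypothetical names, and it does not model the actual cache of any Cortex-R device.

```c
#include <stdint.h>
#include <stdbool.h>

typedef struct {
    uint8_t memory;   /* value in main memory (what a DMA would read) */
    uint8_t cached;   /* value held in the single modelled cache line */
    bool    dirty;    /* cache line is newer than memory */
} one_line_cache_t;

/* CPU write that hits the cache line. */
static void cpu_write(one_line_cache_t *c, uint8_t v, bool write_through)
{
    c->cached = v;
    if (write_through) {
        c->memory = v;      /* memory is updated at the same time */
        c->dirty = false;
    } else {
        c->dirty = true;    /* write-back: memory updated only on eviction */
    }
}

/* Evicting a dirty line is what finally pushes the data to memory. */
static void evict(one_line_cache_t *c)
{
    if (c->dirty) {
        c->memory = c->cached;
        c->dirty = false;
    }
}

/* Another bus master such as the DMA reads memory, never the cache. */
static uint8_t dma_read(const one_line_cache_t *c)
{
    return c->memory;
}
```

In this model a DMA read right after a write-through write sees the new value, while after a write-back write it still sees the old value until the line is evicted.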

0 Anthony F. Seely over 9 years ago in reply to Anthony F. Seely

TI__Guru 68930 points

PS: bottom line here is I would recommend getting a clean HalCoGen project from 4.05.01 and copying the MPU settings into your current project.
To me it looks like the defaults from a clean 4.05.01 project would work for you - where you've got a lot of differences in the screenshot you posted..

0 Chester Gillon over 9 years ago in reply to Sarah Weinberger

Guru 92251 points

Sarah Weinberger said:
I enabled regions 2 and 3, as shown as follows, and was able to go a couple of steps beyond the _mpuInit_() call; however, I still halt, just a bit later, so that is the area in which to make a positive change.
What further changes do I need to make?

I think the problem is that the region(s) for the peripherals are still disabled, which will then cause an abort when the CPU attempts to access the peripherals.

I agree with Anthony's suggestion of starting by copying the MPU settings from a clean HALCoGen project.

I am not sure what the type should be set to for the EMIF Region 5 to solve your original problem, but I suggest repeating your tests with the default MPU settings and the MPU enabled (via the call to _mpuInit_).

0 Sarah Weinberger over 9 years ago in reply to Chester Gillon

Expert 1915 points

Hi Chester/Anthony,

I did create a clean HalCoGen project and set the R4-MPU-PMU configuration to that of a clean project. That resolved the _mpuInit_() issue and allowed my project to run with the MPU enabled. Thanks!

I then played with the 5 Region Types for Region 5 (my discrete in/out signals), as described earlier. One gave me 223 as a value read in, while the others gave me a sort of number increment, just not what I expected. The EE is not here today, will be on Saturday, so I cannot comment on the number of strobes. I can, however, on the values.

I split the results into multiple columns so that the table is not one hugely long pair of columns. If working properly, the value that I write out should be identical to what I read in.

Here is the region type that goes along with that.

Note: The tool-tip does not correspond to the actual values.

Here is the code that I have executing in a 100ms timer.

			// Test: Discrete In
			if (10 <= uDiscrete)
			{
				// Write out the value.
				*lpbyDiscreteOut = uDiscreteValue++;

				// Read the discrete value back in.
				uint8 uValueIn = *lpbyDiscreteIn;

				// Display the discrete value.
				char szValueIn[12];
				char szLine[81];
				strcpy(szLine, "Discrete: Read = ");
				itoa(uValueIn, szValueIn);
				strcat(szLine, szValueIn);
				strcat(szLine, ", Should Be = ");
				itoa(uDiscreteValue, szValueIn);
				strcat(szLine, szValueIn);
				strcat(szLine, "\r\n");
				fifoenter(szLine);

				// Rinse and repeat.
				uDiscrete = 0;
			}
			else
				uDiscrete++;

I allocated the value, uDiscrete, as a static at the top of the source file.

#define lpbyDiscreteIn (uint8 *)0x60000000
#define lpbyDiscreteOut (uint8 *)0x64000000

0 reads in as 223, 1 reads in as 0, 2 reads in as 17, then consistent until 16 reads in as 31, but then 17 reads in as 0. I had a gap, only because I started capturing not at zero and my scroll history was not set high enough for 18 through 72. Anyways, the pattern is that the numbers consistently increment, but then drop down or rise up (194 reads in as 209).

What setting or lack thereof causes the read to not equal the write, considering that I do one right after the other?

0 Chester Gillon over 9 years ago in reply to Sarah Weinberger

Guru 92251 points

Sarah Weinberger said:
What setting or lack thereof causes the read to not equal the write, considering that I do one right after the other?

I think one issue is caused by the following test code, which increments the uDiscreteValue variable after performing the write but before reporting the "Should Be" value:

				// Write out the value.
				*lpbyDiscreteOut = uDiscreteValue++;

				// Read the discrete value back in.
				uint8 uValueIn = *lpbyDiscreteIn;

				// Display the discrete value.
				char szValueIn[12];
				char szLine[81];
				strcpy(szLine, "Discrete: Read = ");
				itoa(uValueIn, szValueIn);
				strcat(szLine, szValueIn);
				strcat(szLine, ", Should Be = ");
				itoa(uDiscreteValue, szValueIn);
				strcat(szLine, szValueIn);
				strcat(szLine, "\r\n");
				fifoenter(szLine);

To avoid an off-by-one error, I think it should be:

				// Write out the value.
				*lpbyDiscreteOut = uDiscreteValue;

				// Read the discrete value back in.
				uint8 uValueIn = *lpbyDiscreteIn;

				// Display the discrete value.
				char szValueIn[12];
				char szLine[81];
				strcpy(szLine, "Discrete: Read = ");
				itoa(uValueIn, szValueIn);
				strcat(szLine, szValueIn);
				strcat(szLine, ", Should Be = ");
				itoa(uDiscreteValue, szValueIn);
				strcat(szLine, szValueIn);
				strcat(szLine, "\r\n");
				fifoenter(szLine);
				uDiscreteValue++;

However, that doesn't explain all the anomalies in the test results. Can you post a schematic of how the EMIF write data gets connected to the EMIF read data externally to the RM48L952, and/or a scope capture of the activity on the EMIF for a write and subsequent read?

0 Joe Josn over 9 years ago in reply to Chester Gillon

Intellectual 680 points

Did this test work before enabling mpu unit?

Spreadsheet shows data Bit 4 has issues!

Is D4 pin muxed? Inverted logic? or just logic timing issues?

Consult with your EE/FPGA guy to figure out the r/w timing requirements : Is Select Strobe causing issues? Did you try normal mode?
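
The bit-level comparison can be sketched as an XOR of each written value against the value read back, accumulated over all samples. The function name is hypothetical. The sample pairs suggested below come from the readings quoted earlier in the thread, with each logged "Should Be" value reduced by one on the assumption that the off-by-one in the test code applies to them.

```c
#include <stdint.h>
#include <stddef.h>

/* OR together written ^ read_back over all samples: every bit that ever
 * came back wrong is set in the result. */
static uint8_t mismatched_bits(const uint8_t *written,
                               const uint8_t *read_back, size_t n)
{
    uint8_t mask = 0;
    for (size_t i = 0; i < n; i++) {
        mask |= (uint8_t)(written[i] ^ read_back[i]);
    }
    return mask;
}
```

Feeding it written values {1, 15, 16, 193} and read values {17, 31, 0, 209} yields 0x10, that is, only data bit 4 differs in those four samples. A longer log may expose further bits.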

0 Sarah Weinberger over 9 years ago in reply to Joe Josn

Expert 1915 points

Great idea to look at binary values and compare and great catch!

Actually, bit 4 and bit 5 had issues.

I pulled the schematic for the old and new boards and sure enough bits 4 and 5 routed differently.

       Discrete In Pins   Discrete Out Pins
D4     15,16              13,14
D5     19,20              17,18

The other bits aligned perfectly. I had a new cable made that connected straight through with the exception of the pins above and presto!

The huge breakthrough was getting the MPU integrated and working. Creating a new sample project and copying over the default MPU settings, plus the comment about calling _mpuInit_() in my code, did the trick. I did not know that that function existed, let alone that I need to call it.

It is sad that I cannot accept 2 answers, because the issue had 2 problems.

Thanks for all the help!
