I want to generate a 5.000MHz output clock from a TMS320C5535 running at 50.000MHz. A 50MHz system clock implies that each instruction should take 20ns. However, when I use the following two instructions to toggle a GPIO pin ('T0' has output bit = '0' and 'T1' has output bit = '1'), I see a 60ns delay between the two edges. It's as if each instruction takes 3 clock cycles to complete!
MOV T0,port(#IOOUTDATA1)
MOV T1,port(#IOOUTDATA1)
If I add a 'NOP' between the two instructions, the delay increases to 80ns, i.e. by exactly one more 20ns cycle.
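To make the arithmetic behind these numbers explicit, here is a minimal sketch in C that converts a measured delay into CPU cycles and works out how many cycles separate successive pin toggles for a given output frequency. The helper names `delay_to_cycles` and `cycles_per_half_period` are my own for illustration and are not part of any TI library.

```c
/* Hypothetical helpers: they only restate the timing arithmetic
 * from the question and do not touch any hardware. */

/* Convert a measured delay in nanoseconds into CPU cycles at clock_hz,
 * rounded to the nearest whole cycle. */
static long delay_to_cycles(long delay_ns, long clock_hz)
{
    long long scaled = (long long)delay_ns * clock_hz;   /* ns * Hz */
    return (long)((scaled + 500000000LL) / 1000000000LL);
}

/* Number of CPU cycles between successive pin toggles needed to
 * produce a square wave of out_hz from a CPU clock of clock_hz.
 * One output period contains two toggles. */
static long cycles_per_half_period(long clock_hz, long out_hz)
{
    return clock_hz / (2 * out_hz);
}
```

With a 50MHz clock, `delay_to_cycles(60, 50000000L)` gives 3 and `delay_to_cycles(80, 50000000L)` gives 4, which matches the measurements with and without the NOP. `cycles_per_half_period(50000000L, 5000000L)` gives 5, so a 5MHz square wave needs one toggle every 5 cycles (100ns).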
How can I toggle an output pin and get it done in a single instruction cycle?