Paralleling two TMS320F28335 control cards

Other Parts Discussed in Thread: CONTROLSUITE

Hi,

I am trying to boost the performance of the F28335 and am wondering whether it is possible to physically connect two F28335 control cards in parallel. If so, will there be any issue with the voltage levels on the connecting pins? I am trying to reduce the computation time of my algorithm from 152 us to less than 100 us.

Algorithm: an optimal control problem. An external signal triggers the ADC to sample, and the ADC’s end-of-sequence interrupt starts a loop in which floating-point numbers are multiplied a certain number of times; at the end, the minimum of all the numbers in an array is found.

Example of a multiplication is as follows:

Polynomial[count] = temp_value - 0.5687949*(u[0][outLoop]*adc1 - u[1][outLoop]*adc2 + u[2][outLoop]*adc3 + u[3][outLoop]*adc4 + u[0][inLoop]*adc5 + u[1][inLoop]*adc6 - u[2][inLoop]*adc7 - u[3][inLoop]*adc8);

 

Here, outLoop and inLoop each range over [0, 5], i.e. 6 x 6 = 36 combinations altogether.
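
For reference, the computation described above can be sketched in plain C as follows. This is only a minimal sketch, not the poster's actual code: the function names `eval_polynomials` and `argmin`, the `adc[0..7]` array standing in for adc1..adc8, and the 4 x 6 shape of `u` are assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>

#define N_CHOICES 6                        /* outLoop and inLoop each run 0..5 */
#define N_COMBOS  (N_CHOICES * N_CHOICES)  /* 36 combinations */

/* Evaluate the polynomial for all 36 (outLoop, inLoop) pairs.
 * u is assumed to be a 4 x 6 coefficient table; adc[0..7] holds adc1..adc8. */
void eval_polynomials(float u[4][N_CHOICES], const float adc[8],
                      float temp_value, float polynomial[N_COMBOS])
{
    size_t count = 0;
    for (size_t outLoop = 0; outLoop < N_CHOICES; outLoop++) {
        for (size_t inLoop = 0; inLoop < N_CHOICES; inLoop++) {
            polynomial[count++] = temp_value - 0.5687949f *
                (u[0][outLoop] * adc[0] - u[1][outLoop] * adc[1]
               + u[2][outLoop] * adc[2] + u[3][outLoop] * adc[3]
               + u[0][inLoop]  * adc[4] + u[1][inLoop]  * adc[5]
               - u[2][inLoop]  * adc[6] - u[3][inLoop]  * adc[7]);
        }
    }
}

/* Return the index of the smallest element (the first one on ties). */
size_t argmin(const float *values, size_t n)
{
    assert(n > 0);
    size_t best = 0;
    for (size_t i = 1; i < n; i++) {
        if (values[i] < values[best]) {
            best = i;
        }
    }
    return best;
}
```

Calling `eval_polynomials` and then `argmin(polynomial, N_COMBOS)` from the ADC interrupt would give the index of the smallest of the 36 values.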

 

Possible solutions already tried, using Code Composer Studio v5:

1) Optimization of both the polynomial equation and the compiler's built-in optimizer.

2) IQ math library. In my case, it is slow in comparison to using the FPU + fastRTS; please comment on whether this is expected.

3) Currently in the process of using pointers; however, I am sceptical about a drastic improvement in performance.

I am running from RAM, as I am using the RAM linker file obtained from controlSUITE. I have also considered other TI devices with a faster processor, but those devices lack a built-in ADC. I think a built-in ADC is a big help, as it reduces the hassle of external circuitry.

Overclocking: I have found a little literature on overclocking the F28335. Please comment: what if we attach a heat sink to the device and modify the ADC registers to keep the ADC's clock below 12.5 MHz (to maintain linearity)? How far can we push the system clock, and in turn the FPU, beyond 150 MHz? It would be really nice to know some implications of driving the F28335’s FPU at the maximum possible clock.

 


Thanks

 

  •  Hi Riar.

    I once overclocked a 2809 by mistake: 200 MHz instead of 100 MHz. The processor was unstable. The problem with overclocking is not only overheating. The timings of all the F28335's on-chip parts will change, and you will need to spend time adjusting clock dividers (not only for the ADC; I think the flash is the main problem) to obtain stability. That time is better spent finding a faster implementation of your algorithm.

    The people at Intel and AMD keep the possibility of overclocking in mind when designing their CPUs. The people at TI assume the parts will be used more strictly within their ratings. So you can try overclocking, and it may be fine for lab research, but it is most probably not suitable for volume production.

     

     

  • Hi Roman

    Thanks for sharing your experience with the 2809 and your perspective on overclocking.

  • Can you save time by not re-computing the outer loop terms?...

    for(outLoop = 0; outLoop <= 5; outLoop++)
    {
        /* This sum of products depends only on outLoop, so compute it once per outer pass */
        outLoopSOP = u[0][outLoop]*adc1 - u[1][outLoop]*adc2 + u[2][outLoop]*adc3 + u[3][outLoop]*adc4;
        for(inLoop = 0; inLoop <= 5; inLoop++)
        {
            Polynomial[count] = temp_value - 0.5687949*(outLoopSOP + u[0][inLoop]*adc5 + u[1][inLoop]*adc6 - u[2][inLoop]*adc7 - u[3][inLoop]*adc8);
            count++;  /* next result slot (assumes count starts at 0) */
        }
    }
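
    Going one step further, the inLoop terms can be hoisted in the same way, so that only 12 partial sums of products are computed and each of the 36 results costs one add, one multiply and one subtract. The sketch below assumes the ADC readings do not change during the loop and that `u` is a 4 x 6 table; the names `eval_polynomials_hoisted`, `outerSOP` and `innerSOP` are hypothetical. Because it regroups the floating-point additions, the results can differ from the original expression in the last bits, which is also why a compiler is generally not free to make this change on its own under strict floating-point rules.

```c
#include <assert.h>
#include <stddef.h>

#define N_CHOICES 6  /* outLoop and inLoop each run 0..5 */

/* Fill polynomial[0..35] from 12 precomputed partial sums.
 * adc[0..7] stands in for adc1..adc8 (an assumption of this sketch). */
void eval_polynomials_hoisted(float u[4][N_CHOICES], const float adc[8],
                              float temp_value,
                              float polynomial[N_CHOICES * N_CHOICES])
{
    float outerSOP[N_CHOICES];  /* terms that depend only on outLoop */
    float innerSOP[N_CHOICES];  /* terms that depend only on inLoop  */
    size_t count = 0;

    assert(polynomial != NULL);

    /* 12 sums of products instead of recomputing them for all 36 pairs */
    for (size_t i = 0; i < N_CHOICES; i++) {
        outerSOP[i] = u[0][i] * adc[0] - u[1][i] * adc[1]
                    + u[2][i] * adc[2] + u[3][i] * adc[3];
        innerSOP[i] = u[0][i] * adc[4] + u[1][i] * adc[5]
                    - u[2][i] * adc[6] - u[3][i] * adc[7];
    }

    /* Each result now needs one add, one multiply and one subtract */
    for (size_t outLoop = 0; outLoop < N_CHOICES; outLoop++) {
        for (size_t inLoop = 0; inLoop < N_CHOICES; inLoop++) {
            polynomial[count++] = temp_value
                - 0.5687949f * (outerSOP[outLoop] + innerSOP[inLoop]);
        }
    }
}
```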

  • Thanks Devin,

    I think the compiler's optimizer is already taking care of it. I tried your suggested loop structure and still did not notice any change in the computation time.

  • To answer your original question, you shouldn't have any trouble connecting two control cards in parallel.  If you are using something like the experimenter's kit, just connect the grounds at one point, then connect the GPIO signals as needed.  

    Have you considered using a C28346 device?  This is a 300MHz C28x device, however there is no flash or ADC, so you would need external versions of both.  This setup might be slightly easier to manage than two F28335 devices. 

    As for overclocking, to reinforce what Roman has said, my guess is that for most F28335 devices the CPU and FPU will probably tolerate something like 10-15% overclocking under nominal conditions. The ADC and flash certainly will not tolerate significant overclocking, and you definitely won't be able to overclock the CPU by the 50% or more that you would need to get your algorithm below 100 us.
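
    The size of that gap can be checked with a little arithmetic. The sketch below assumes execution time scales inversely with the CPU clock, which ignores memory wait states and peripheral timing; the function names are hypothetical. With the figures from this thread (150 MHz, 152 us measured, 100 us target) it gives 228 MHz, i.e. 52% above the present clock.

```c
#include <assert.h>

/* Clock needed to reach a target execution time,
 * assuming time is inversely proportional to clock frequency. */
double required_clock_mhz(double clock_mhz, double current_us, double target_us)
{
    assert(target_us > 0.0);
    return clock_mhz * current_us / target_us;
}

/* Overclock expressed as a percentage above the present clock. */
double overclock_percent(double clock_mhz, double needed_mhz)
{
    assert(clock_mhz > 0.0);
    return 100.0 * (needed_mhz / clock_mhz - 1.0);
}
```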

  • Cheers Devin, I will look into it.

  • Hi Riar,

    There are also two other alternatives:

    1. You could try writing in pure assembly.

    2. You could try going with single-precision (16-bit) fixed point, if resolution is not critical. With hand-coded assembly using DMAC, things would improve a lot. I do not know if that would be enough, though.
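
    To illustrate the fixed-point idea, here is a minimal portable C sketch of a Q15 sum of products. It is not the DMAC instruction or hand-coded assembly, only the arithmetic such code would perform; it assumes the coefficients and ADC readings have been scaled into [-1, 1), and the names `float_to_q15` and `q15_dot` are hypothetical. It uses a 64-bit accumulator for simplicity, whereas an assembly implementation would have to manage accumulator headroom itself.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Convert a value in [-1, 1) to Q15 (15 fractional bits),
 * rounding to nearest and saturating at the ends of the range. */
int16_t float_to_q15(float x)
{
    float scaled = x * 32768.0f;
    if (scaled >= 32767.0f) {
        return INT16_MAX;
    }
    if (scaled <= -32768.0f) {
        return INT16_MIN;
    }
    return (int16_t)(scaled < 0.0f ? scaled - 0.5f : scaled + 0.5f);
}

/* Sum of products of two Q15 vectors, returned in Q15. */
int16_t q15_dot(const int16_t *a, const int16_t *b, size_t n)
{
    int64_t acc = 0;  /* wide accumulator so the Q30 products cannot overflow */

    assert(a != NULL && b != NULL);

    for (size_t i = 0; i < n; i++) {
        acc += (int32_t)a[i] * (int32_t)b[i];  /* Q15 * Q15 = Q30 */
    }

    acc /= 32768;  /* Q30 back to Q15, truncating toward zero */
    if (acc > INT16_MAX) {
        return INT16_MAX;
    }
    if (acc < INT16_MIN) {
        return INT16_MIN;
    }
    return (int16_t)acc;
}
```

    With 15 fractional bits the resolution is 1/32768, so whether this is acceptable depends on how closely the 36 candidate values are spaced.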

    Regards, Mitja