AM625: LPDDR4 access priority

Part Number: AM625
Other Parts Discussed in Thread: SYSCONFIG

Hi Team, 

Inquiry from customer: 

"

We use an AM62x with LPDDR4 memory and have an issue with memory access.

 

This is what happens (we think). Say a process running on one A53 core heavily loads the LPDDR4 memory. At the same time, a graphics process on another core is heavily reading from and writing to the LPDDR4 memory. The two processes, on different cores, want to access the LPDDR4 memory at the same time. In this case, we think that both cores have the same priority for memory access.

 

What we want is to set some kind of memory access prioritization, so that our main process on one core always has higher-priority access to the LPDDR4 memory than, for example, the GPU, the other cores, and peripherals. The reason is that we run a real-time application process on one A53 core. It is very important that this process does not get blocked or starved of access to the LPDDR4 memory, which we think happens when the GPU is rendering (drawing) on the screen.

 

Is it possible to prioritize memory access in this way?

 

We understand that there is a tool called the Resource Partitioning Tool, based on the SysConfig tool.

Is this something we can use to achieve what we want?"

BR,

-RT

  • Hello RT,

    I am reassigning your thread to the hardware owner to comment about whether there is a hardware capability to moderate DDR accesses from different cores & software hosts. Please ping the thread if you do not get a response within a couple of business days.

    Regards,

    Nick

  • Hi Nick, Team,

    Can I have an expert look at this please?

    Best regards,

    -RT

  • Hi Ryan, this is discussed in the QoS section 3.1.5 of the TRM.  You can assign different priorities based on the initiator of the transaction.  The DDR subsystem can then use these priorities to perform Class of Service arbitration (see section 9.1.3.1).  You should be able to set up this arbitration with the information in the TRM.  If you have further questions, you can post here.

    Regards,

    James
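To make the idea of "priority by initiator" concrete, here is a toy C model of strict-priority arbitration. It is not AM62x register code: the initiator labels, the priority values, the "lower value is served first" convention, and the `pick_next` helper are all hypothetical. The real QoS fields and the Class of Service mapping are defined in the TRM sections cited above.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model of a memory request tagged by its initiator. */
struct request {
    const char *initiator; /* illustrative label only */
    int priority;          /* assumption: lower value is served first */
    int pending;           /* 1 while the request is waiting */
};

/* Return the index of the pending request with the best priority,
 * or -1 if nothing is pending. Ties go to the lowest index. */
static int pick_next(const struct request *reqs, size_t n)
{
    int best = -1;

    assert(reqs != NULL);
    for (size_t i = 0; i < n; i++) {
        if (!reqs[i].pending)
            continue;
        if (best < 0 || reqs[i].priority < reqs[best].priority)
            best = (int)i;
    }
    return best;
}
```

With three pending requests tagged `gpu` (4), `rt_core` (1) and `other_cpu` (4), repeated calls to `pick_next` serve `rt_core` first and then the other two in index order. A real memory controller arbitrates on more than a single priority number, so treat this only as a mental model.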

  • Hello James

    Thanks for the answer.

    We are the customer Ryan talks about.

    What we want to achieve is to give one A53 core higher-priority access to the LPDDR4 memory than the others. According to you, we should be able to do this with the information in the TRM. We have read the TRM and are still a little lost as to how to do this. Some more guidance would be appreciated.

     

    The section “K3 Resource Partitioning Tool” in the online AM62 SDK documentation talks about QoS configuration, but lists it as “Not yet supported” in U-Boot and SBL.

    4.1.14. K3 Resource Partitioning Tool — Processor SDK AM62x Documentation

    Is this the tool we should use?

     

    Do you have some other example code that could point us in the right direction?

     

    Regards

    Magnus

  • Magnus

    As of now, there do not seem to be any examples in the SDK for this, and QoS support in the K3 Resource Partitioning Tool is a roadmap item for us.

    We plan to have something available as part of our SDK offering in SDK 9.0, in the July time frame.

    I understand your concerns, but I am curious whether you are actually seeing any issues or whether this is just future-proofing?

    Regards

    Mukul

  • Hello,

    This is one of our major issues right now.

    We have problems with our main application, which runs on one core and which we think is starved of LPDDR4 access by our graphics application running on another core.

    It is very important for us that our main application has higher-priority access to LPDDR4.

    As I understand it, SysConfig will support setting this up in the future, but not now. Correct?

    Then we need to set this up ourselves in SBL or U-Boot. Correct?

    /Magnus

  • Hi Magnus

    Thanks for sharing the additional background on this. If you have more details on the graphic applications, resource partitioning/cpu loading  and what works vs not - please feel free to share. 

    We need some more time to get you some more guidance on QoS register configuration - so give us a few more days to investigate and provide you guidance on this. 

    Regards

    Mukul 

  • Hello,

     

    I am gladly sharing. I have created a very simple use case that simulates exactly our problem.

    I have written 2 simple programs

    I have called them jitterTest and jitterQt

    * jitterTest (simulates our main application)

    Allocates memory, fills it with data, and then frees the memory again; this takes about 5 ms. Then it sleeps for 5 ms.

    A start time stamp is taken before the memory is allocated.

    A stop time stamp is taken after the memory is freed.

    The execution time is calculated, showing how long it took to fill the memory, and the max and min execution times are stored.

    After about 1 second, the max and min execution times are printed, together with the difference between them.

    This difference is what we call the jitter.

     

    * jitterQt (simulates our graphical interface)

    Simple Qt program that continuously draws 4 lines all over the display.
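The jitterTest bookkeeping described above (time one pass, track min and max over a reporting window, report max minus min) can be sketched in C as follows. The struct and function names are illustrative and do not come from the attached program; times are in microseconds, taken from `CLOCK_MONOTONIC` samples.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <time.h>

/* Running min/max of execution times, in microseconds. */
struct jitter_stats {
    long min_us;
    long max_us;
};

/* Call before each reporting window. */
static void jitter_reset(struct jitter_stats *s)
{
    assert(s != NULL);
    s->min_us = LONG_MAX;
    s->max_us = 0;
}

/* Elapsed time between two clock samples, in microseconds. */
static long elapsed_us(const struct timespec *start, const struct timespec *stop)
{
    long long ns = (long long)(stop->tv_sec - start->tv_sec) * 1000000000LL
                 + (stop->tv_nsec - start->tv_nsec);
    return (long)(ns / 1000);
}

/* Fold one measured execution time into the window. */
static void jitter_update(struct jitter_stats *s, long sample_us)
{
    if (sample_us < s->min_us)
        s->min_us = sample_us;
    if (sample_us > s->max_us)
        s->max_us = sample_us;
}

/* Jitter as defined in this thread: slowest pass minus fastest pass. */
static long jitter_us(const struct jitter_stats *s)
{
    return s->max_us - s->min_us;
}
```

Subtracting seconds and nanoseconds separately before scaling keeps the intermediate values small, which matters on targets where `long` is 32 bits.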

     

     

    When jitterTest runs without jitterQt running, the jitter is about 30 to 70 microseconds.

    78513  Execution time max=5684 min=5724 jitter=40
    79585  Execution time max=5682 min=5720 jitter=38
    80656  Execution time max=5683 min=5729 jitter=46
    81728  Execution time max=5681 min=5726 jitter=45
    82799  Execution time max=5683 min=5717 jitter=34
    83871  Execution time max=5684 min=5726 jitter=42
    84942  Execution time max=5679 min=5723 jitter=44
    86014  Execution time max=5684 min=5727 jitter=43
    87085  Execution time max=5682 min=5729 jitter=47
    88157  Execution time max=5683 min=5719 jitter=36
    89228  Execution time max=5683 min=5717 jitter=34
    90300  Execution time max=5681 min=5724 jitter=43
    91371  Execution time max=5680 min=5717 jitter=37
    92443  Execution time max=5682 min=5712 jitter=30
    93514  Execution time max=5683 min=5724 jitter=41
    94586  Execution time max=5683 min=5741 jitter=58
    

    When jitterTest runs with jitterQt running, the jitter is about 1000 to 2000 microseconds. With jitterQt running, jitterTest can take about 20 to 40 per cent longer to execute, which isn't good for us.

    133777  Execution time max=5697 min=6888 jitter=1191
    134885  Execution time max=5688 min=7499 jitter=1811
    135993  Execution time max=5698 min=6849 jitter=1151
    137098  Execution time max=5702 min=6736 jitter=1034
    138200  Execution time max=5691 min=6803 jitter=1112
    139309  Execution time max=5693 min=6956 jitter=1263
    140413  Execution time max=5698 min=7198 jitter=1500
    141519  Execution time max=5693 min=6889 jitter=1196
    142626  Execution time max=5690 min=6717 jitter=1027
    143739  Execution time max=5687 min=7535 jitter=1848
    144850  Execution time max=5690 min=7183 jitter=1493
    145955  Execution time max=5686 min=7082 jitter=1396
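As a rough way to turn log lines like these into a slowdown figure, the C sketch below parses one line and computes how much longer the slowest pass took than the fastest. The helper names are illustrative. Note that in the logs posted here the smaller value appears under `max=` and the larger under `min=`, so the parser orders the two values itself rather than trusting the labels.

```c
#include <assert.h>
#include <stdio.h>

/* Parse one jitterTest log line; returns 1 on success, 0 otherwise.
 * The two times are ordered here, so the result does not depend on
 * which one the log printed under "max=" or "min=". */
static int parse_jitter_line(const char *line, long *fastest_us, long *slowest_us)
{
    long tick, a, b, jitter;

    assert(line != NULL && fastest_us != NULL && slowest_us != NULL);
    if (sscanf(line, "%ld Execution time max=%ld min=%ld jitter=%ld",
               &tick, &a, &b, &jitter) != 4)
        return 0;
    *fastest_us = (a < b) ? a : b;
    *slowest_us = (a < b) ? b : a;
    return 1;
}

/* Extra time of the slowest pass relative to the fastest, in percent. */
static double slowdown_percent(long fastest_us, long slowest_us)
{
    assert(fastest_us > 0);
    return 100.0 * (double)(slowest_us - fastest_us) / (double)fastest_us;
}
```

For the line with 5688 and 7499 microseconds, this works out to roughly 32 per cent.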
    

    We have no problem dedicating one core to our main application (jitterTest in this simple use case).

    Is it possible to give one core higher-priority memory access, or to solve this in some other way? Any help is appreciated.
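For the "dedicate one core" part, a minimal sketch of pinning the calling thread to a single CPU on Linux could look like the following. It uses the glibc `sched_setaffinity` and `sched_getaffinity` wrappers; the helper names `pin_to_cpu` and `first_allowed_cpu` are made up for illustration. This only controls where the thread is scheduled and does not by itself change how DDR accesses are arbitrated.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <sched.h>

/* Pin the calling thread to one CPU.
 * Returns 0 on success, -1 on error (errno is set by the kernel). */
static int pin_to_cpu(int cpu)
{
    cpu_set_t set;

    assert(cpu >= 0 && cpu < CPU_SETSIZE);
    CPU_ZERO(&set);
    CPU_SET(cpu, &set);
    return sched_setaffinity(0, sizeof(set), &set);
}

/* First CPU the calling thread is currently allowed to run on, or -1. */
static int first_allowed_cpu(void)
{
    cpu_set_t set;

    if (sched_getaffinity(0, sizeof(set), &set) != 0)
        return -1;
    for (int cpu = 0; cpu < CPU_SETSIZE; cpu++) {
        if (CPU_ISSET(cpu, &set))
            return cpu;
    }
    return -1;
}
```

In a test like jitterTest, `pin_to_cpu()` would be called once at startup, before the measurement loop, with the core number chosen for the main application.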

    I have attached both my simple programs

    jitterTest

    //******************************************************************************
    // Filename: main.cpp
    // Description:
    //******************************************************************************
    
    //*****************
    //* Include files *
    //*****************
    #define _GNU_SOURCE
    #include <pthread.h>
    #include <stdio.h>
    #include <unistd.h>
    #include <semaphore.h>
    #include <errno.h>
    #include <time.h>
    #include <string.h>
    #include <fcntl.h>
    #include <sys/ioctl.h>
    #include <linux/gpio.h>
    #include <stdlib.h>
    
    //*********************
    //* Local definitions *
    //*********************
    #define MEMORY_SIZE 0x240000
    
    // Wrapped in do/while(0) so the macro expands to a single statement
    #define DEBUG_INFO(fmt, arg...) do { printf("%d  " fmt "\n", (int)hw_Get1msTicks(), ##arg); fflush(stdout); } while (0)
    
    // GPIO
    #define MCU_GPIO_0  "/dev/gpiochip0"
    #define MAIN_GPIO_0 "/dev/gpiochip1"
    #define MAIN_GPIO_1 "/dev/gpiochip2"
    static int fd_mcu;
    static struct gpiohandle_request rq_mcu_out;
    static struct gpiohandle_data data_mcu_out;
    
    static int verbose;
    static int gpio;
    
    struct timespec start_time;
    struct timespec stop_time;
    long time_microsec;
    long max_time_microsec;
    long min_time_microsec;
    
    
    //******************************************************************************
    // Name: hw_Get1msTicks
    //******************************************************************************
    int hw_Get1msTicks(void)
    {
      struct timespec tms;
      int ms;
    
      clock_gettime(CLOCK_MONOTONIC, &tms);
      ms = tms.tv_sec * 1000;
      ms += tms.tv_nsec/1000000;
      if (tms.tv_nsec % 1000000 >= 500000) {
        ++ms;
      }
    
      return(ms);
    }
    
    //******************************************************************************
    // Name: ParseArgV
    // Desc: Process command line options
    //******************************************************************************
    static int ParseArgV(int argc, char *argv[])
    {
      int Result;
      int opt;
    
      Result = 0;
      verbose = 0;
    
      // Process command line options; getopt() returns -1 once all options
      // are consumed, so a stray non-option argument cannot loop forever
      while((opt = getopt(argc, argv, "vg:")) != -1) {
        switch(opt) {
          case 'v':
            verbose++;
            break;
          case 'g':
            gpio = atoi(optarg);
            break;
          default:
            fprintf(stderr, "Invalid options, argument not valid\n");
            Result = 0xDEAD;
            break;
        }
      }
    
      return(Result);
    }
    
    //******************************************************************************
    // Name: print_sched_attr
    //******************************************************************************
    static void print_sched_attr(int policy, struct sched_param *param)
    {
      DEBUG_INFO("policy=%s, priority=%d\n",
                (policy == SCHED_FIFO)  ? "SCHED_FIFO" :
                (policy == SCHED_RR)    ? "SCHED_RR" :
                (policy == SCHED_OTHER) ? "SCHED_OTHER" :
                "???",
                param->sched_priority);
    }
    
    //******************************************************************************
    // Name: main
    //******************************************************************************
    int main(int argc, char *argv[])
    {
      int i, result, counter;
      struct sched_param param;   
      int policy;
      char *charPointer;
      long startTime;
      long stopTime;
    
      verbose = 0;
      gpio = -1;
      counter = 0;
      max_time_microsec = 0;
      min_time_microsec = 10000000;
    
      // Parse arguments
      if(ParseArgV(argc, argv) == 0xDEAD) {
        return(-1);
      }
    
      if(verbose >= 1) {
        DEBUG_INFO("jitterTest verbose=%d gpio=%d", verbose, gpio);
      }
    
      // Set thread name
      (void)pthread_setname_np(pthread_self(), "jitterTest");
    
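      // Set thread prio (SCHED_RR normally requires root or CAP_SYS_NICE)
      pthread_getschedparam(pthread_self(), &policy, &param);
      param.sched_priority = 20;
      policy = SCHED_RR;
      result = pthread_setschedparam(pthread_self(), policy, &param);
      if (result != 0) {
        // pthread functions return the error number instead of setting errno
        DEBUG_INFO("Unable to set scheduling policy: %s", strerror(result));
      }
      print_sched_attr(policy, &param);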
    
      // Handle GPIO
      if(gpio == 7 || gpio == 8) {
        fd_mcu = open(MCU_GPIO_0, O_RDONLY);
        if (fd_mcu < 0) {
          DEBUG_INFO("drvIO: Error unable to open %s: %s", MCU_GPIO_0, strerror(errno));
          return(-1);
        }
        // Request outputs for mcu 0
        rq_mcu_out.lineoffsets[0] = gpio;
        rq_mcu_out.flags = GPIOHANDLE_REQUEST_OUTPUT | GPIOHANDLE_REQUEST_OPEN_SOURCE;
        rq_mcu_out.lines = 1;
        data_mcu_out.values[0] = 0;
        result = ioctl(fd_mcu, GPIO_GET_LINEHANDLE_IOCTL, &rq_mcu_out);
        if (result == -1) {
          DEBUG_INFO("Unable to get line handle from ioctl : %s", strerror(errno));
          return(-1);
        }
        // Drive the line low initially
        result = ioctl(rq_mcu_out.fd, GPIOHANDLE_SET_LINE_VALUES_IOCTL, &data_mcu_out);
        if (result == -1) {
          DEBUG_INFO("Unable to set line value from ioctl : %s", strerror(errno));
        }
      }
    
      while(1) {
        // Set gpio to 1
        if(gpio == 7 || gpio == 8) {
          data_mcu_out.values[0] = 1;
          ioctl(rq_mcu_out.fd, GPIOHANDLE_SET_LINE_VALUES_IOCTL, &data_mcu_out);
        }
    
        clock_gettime(CLOCK_MONOTONIC, &start_time);
    
        // Do something with LPDDR4 RAM for about x ms
    
        // Alloc memory
        charPointer = (char *)malloc(MEMORY_SIZE);
        if(charPointer == NULL) {
          DEBUG_INFO("malloc of %d bytes failed", MEMORY_SIZE);
          return(-1);
        }
    
        if(verbose >= 2) {
          DEBUG_INFO("charPointer address %p", charPointer);
        }
    
        // Fill up memory
        for(i=0; i < MEMORY_SIZE; i++) {
          charPointer[i] = 0;
        }
    
        // Free memory
        free(charPointer);
    
        // Set gpio to 0
        if(gpio == 7 || gpio == 8) {
          data_mcu_out.values[0] = 0;
          ioctl(rq_mcu_out.fd, GPIOHANDLE_SET_LINE_VALUES_IOCTL, &data_mcu_out);
        }
    
        clock_gettime(CLOCK_MONOTONIC, &stop_time);
    
        // Check time
        startTime = start_time.tv_sec * 1000000000 + start_time.tv_nsec;
        stopTime = stop_time.tv_sec * 1000000000 + stop_time.tv_nsec;
        time_microsec = (stopTime - startTime) / 1000;
    
        if(time_microsec > max_time_microsec) {
          max_time_microsec = time_microsec;
        }
        if(time_microsec < min_time_microsec) {
          min_time_microsec = time_microsec;
        }
    
        counter++;
        if(counter >= 100) {
          counter = 0;
    
          // Arguments follow the order of the labels in the format string: max, min, jitter
          DEBUG_INFO("Execution time max=%ld min=%ld jitter=%ld", max_time_microsec, min_time_microsec, max_time_microsec - min_time_microsec);
    
          max_time_microsec = 0;
          min_time_microsec = 10000000;
        }
    
        // Wait for x ms
        usleep(5000);
      }
    
      return 0;
    }
    
    

    jitterQt

    //******************************************************************************
    // Filename: main.cpp
    // Description:
    //******************************************************************************
    
    //*****************
    //* Include files *
    //*****************
    #include <QApplication>
    #include <QTimer>
    #include <QPainter>
    #include <QGraphicsScene>
    #include <QGraphicsView>
    #include <QGraphicsLineItem>
    #include <QPen>
    #include <cstdlib>    // random()
    #include <pthread.h>  // pthread_setname_np(), pthread_setschedparam()
    
    //*********************
    //* Local definitions *
    //*********************
    #define LINES 4
    
    static QGraphicsLineItem *line[LINES];
    static QPen *pen;
    
    static bool leftOrRight;
    static int pos_x1[LINES];
    static int pos_y1[LINES];
    static int pos_x2[LINES];
    static int pos_y2[LINES];
    
    //******************************************************************************
    // Name: timeout50ms
    //******************************************************************************
    static void timeout50ms(void)
    {
      leftOrRight = !leftOrRight;
    
      for(int i = 0; i< LINES; i++) {
        if(leftOrRight) {
          pos_x1[i] = (random() % 1280) - 640;
          pos_y1[i] = (random() % 720) - 360;
        }
        else {
          pos_x2[i] = (random() % 1280) - 640;
          pos_y2[i] = (random() % 720) - 360;
        }
    
        line[i]->setLine(pos_x1[i], pos_y1[i], pos_x2[i], pos_y2[i]);
      }
    }
    
    //******************************************************************************
    // Name: main
    //******************************************************************************
    int main(int argc, char **argv)
    {
      struct sched_param param;
      int policy;
    
      QApplication app(argc, argv);
      QGraphicsScene scene;
      QGraphicsView view( &scene );
      QTimer timer;
    
      // Set thread name
      (void)pthread_setname_np(pthread_self(), "jitterQt");
    
      // Set thread prio
      pthread_getschedparam(pthread_self(), &policy, &param);
      param.sched_priority = 10;
      policy = SCHED_RR;
      pthread_setschedparam(pthread_self(), policy, &param);
    
      QObject::connect(&timer, &QTimer::timeout, timeout50ms);
      timer.start(50);
    
      pen = new QPen(Qt::green, 12, Qt::SolidLine, Qt::RoundCap);
    
      for(int i = 0; i< LINES; i++) {
        line[i] = scene.addLine(QLineF(-100 ,-100 ,100 , 100), *pen);
      }
    
      view.setRenderHints( QPainter::Antialiasing );
      view.showFullScreen();
    
      return app.exec();
    }
    

    Regards

    Magnus

  • Hello Magnus,

    I am trying to replicate the issue on my side and I just wanted to clarify a couple of things. For my setup, I am using an AM62 SK EVM running the latest SDK 8.06, with a 1920x1080 HDMI monitor connected to the EVM all the time. Upon boot, I launch your CPU test and I see jitter values around 100. Next, I launch the Qt test and I see a bump from ~100 to ~350. While the Qt test runs, I notice that almost 50% of a CPU core is being used by the Qt app. I am guessing that Qt Widgets uses the CPU for drawing instead of the GPU. As an experiment, I launched an OpenGL ES app (which uses the GPU for rendering) and the jitter value goes from ~350 to ~140. I have also attached a text file with my values.

    Questions:

    1. On my setup, I don't see the huge delta that you are observing and I am assuming you are using a custom board. By any chance, do you have a SK board where you can try the same setup?
    2. Is your primary goal to set the priority between two A53 cores or between the GPU and an A53 core? Depending on what is being used to draw (CPU or GPU), the main app could see a variance in the jitter.

    Regards,
    Krunal

    134617  Execution time max=23629 min=23944 jitter=315 <-------------------- Launch CPU mem test
    137486  Execution time max=23634 min=23771 jitter=137
    140355  Execution time max=23634 min=23711 jitter=77
    143223  Execution time max=23629 min=23749 jitter=120
    146092  Execution time max=23634 min=23709 jitter=75
    148961  Execution time max=23625 min=23739 jitter=114
    151829  Execution time max=23630 min=23726 jitter=96
    154698  Execution time max=23628 min=23735 jitter=107
    157567  Execution time max=23624 min=23713 jitter=89
    160435  Execution time max=23630 min=23719 jitter=89
    163304  Execution time max=23636 min=23730 jitter=94
    166173  Execution time max=23633 min=23717 jitter=84
    169042  Execution time max=23634 min=23751 jitter=117
    171911  Execution time max=23630 min=23738 jitter=108
    174779  Execution time max=23629 min=23722 jitter=93
    177648  Execution time max=23632 min=23729 jitter=97
    180517  Execution time max=23629 min=23739 jitter=110
    183385  Execution time max=23629 min=23745 jitter=116
    186254  Execution time max=23633 min=23722 jitter=89
    189123  Execution time max=23630 min=23739 jitter=109
    191991  Execution time max=23628 min=23702 jitter=74
    194860  Execution time max=23631 min=23706 jitter=75
    197768  Execution time max=23634 min=25291 jitter=1657 <-------------------- Launch Qt test in parallel of mem test
    200667  Execution time max=23672 min=25457 jitter=1785
    203557  Execution time max=23661 min=24628 jitter=967
    206447  Execution time max=23667 min=24320 jitter=653
    209335  Execution time max=23657 min=24037 jitter=380
    212223  Execution time max=23653 min=24083 jitter=430
    215111  Execution time max=23666 min=24077 jitter=411
    217999  Execution time max=23663 min=24163 jitter=500
    220886  Execution time max=23664 min=23969 jitter=305
    223775  Execution time max=23659 min=24159 jitter=500
    226662  Execution time max=23650 min=24064 jitter=414
    229547  Execution time max=23655 min=23972 jitter=317
    232435  Execution time max=23657 min=24075 jitter=418
    235324  Execution time max=23668 min=23998 jitter=330
    238214  Execution time max=23659 min=24003 jitter=344
    241100  Execution time max=23660 min=23981 jitter=321
    243987  Execution time max=23662 min=24000 jitter=338
    246875  Execution time max=23661 min=24049 jitter=388  <-------------------- Exit Qt test and Launch GPU test w/ mem test
    249748  Execution time max=23633 min=23960 jitter=327
    252627  Execution time max=23667 min=23982 jitter=315
    255503  Execution time max=23670 min=23845 jitter=175
    258379  Execution time max=23667 min=23810 jitter=143
    261254  Execution time max=23661 min=23803 jitter=142
    264129  Execution time max=23668 min=23785 jitter=117
    267004  Execution time max=23655 min=23783 jitter=128
    269880  Execution time max=23661 min=23837 jitter=176
    272756  Execution time max=23667 min=23798 jitter=131
    275632  Execution time max=23664 min=23894 jitter=230
    278507  Execution time max=23669 min=23798 jitter=129
    281382  Execution time max=23663 min=23797 jitter=134
    284257  Execution time max=23667 min=23783 jitter=116
    287133  Execution time max=23667 min=23880 jitter=213
    290008  Execution time max=23661 min=23772 jitter=111
    292884  Execution time max=23672 min=23777 jitter=105
    295760  Execution time max=23665 min=23814 jitter=149
    298636  Execution time max=23672 min=23852 jitter=180
    301511  Execution time max=23671 min=23832 jitter=161
    304386  Execution time max=23665 min=23807 jitter=142
    307261  Execution time max=23665 min=23792 jitter=127
    310137  Execution time max=23669 min=23808 jitter=139
    ^C
    root@am62xx-evm:~#
    

  • Hello,

    My test runs on our custom board. I have an AM62 SK EVM board and will run the tests on it to see the result there.

    Currently we use SDK 08.04.01.03 as the base for our SW, so that is what we run. We plan to step up to SDK 9 when it is released. As I understand it, that is a big step and is coming in July. Do you know when it is coming?

    I am running the Qt test on top of Weston, so I am not sure whether it uses the GPU or not.

    What we wish is to give one core, which we dedicate to our main application, higher-priority access to memory than both the GPU and the other cores. It is better for the graphics to lag a little than for the main application to jitter.

    I will test on the AM62 SK EVM board and get back with the result.

    According to htop, both jitterTest and (jitterQt + Weston) use about 50 per cent.

    93343  Execution time max=5690 min=6595 jitter=905
      1  [||||||                                          11.2%]   Tasks: 27, 5 thr; 2 running
      2  [||||||||||||||||                                29.8%]   Load average: 1.31 0.52 0.19
      3  [|||||||||||||||                                 27.5%]   Uptime: 00:01:36
      4  [||||||||||||||||||||||||||||||                  54.7%]
      Mem[|||||||||||||||||||||||||                   110M/423M]
      Swp[                                                0K/0K]
    
    CPU     PID USER      PRI  NI  VIRT   RES   SHR S CPU% MEM%   TIME+  Command
      4     390 root      -21   0  8600  4660  2312 S 53.8  1.1  0:20.58 jitterTest
      3     351 root       20   0  135M 30216 21836 S 35.4  7.0  0:20.30 weston --idle-time=0
      2     387 root      -11   0 76960 30832 28668 S 25.6  7.1  0:13.87 jitterQt
      4     386 root       20   0  135M 30216 21836 R  3.9  7.0  0:01.73 weston --idle-time=0
      1     391 root       20   0  4588  3148  2192 R  1.3  0.7  0:00.87 htop
      2       1 root       20   0  153M  6672  5064 S  0.0  1.5  0:03.75 /sbin/init
      3     169 rpc        20   0  3680  2180  1952 S  0.0  0.5  0:00.01 /usr/sbin/rpcbind -w -f
      3     170 root       20   0 17264  5008  4460 S  0.0  1.2  0:00.39 /lib/systemd/systemd-journald
      2     185 root       20   0 13004  3512  2580 S  0.0  0.8  0:00.87 /lib/systemd/systemd-udevd
      1     198 systemd-n  20   0 15892  4660  4208 S  0.0  1.1  0:00.26 /lib/systemd/systemd-networkd
      1     243 systemd-r  20   0  7124  3216  2920 S  0.0  0.7  0:00.21 /lib/systemd/systemd-resolved
      1     250 root       20   0  291M   764   624 S  0.0  0.2  0:16.88 /usr/sbin/rngd -f -r /dev/hwrng
      2     251 root       20   0  291M   764   624 S  0.0  0.2  0:16.83 /usr/sbin/rngd -f -r /dev/hwrng
      3     252 root       20   0  291M   764   624 S  0.0  0.2  0:16.85 /usr/sbin/rngd -f -r /dev/hwrng
      4     253 root       20   0  291M   764   624 S  0.0  0.2  0:16.85 /usr/sbin/rngd -f -r /dev/hwrng
      1     249 root       20   0  291M   764   624 S  0.0  0.2  1:07.52 /usr/sbin/rngd -f -r /dev/hwrng
      4     255 avahi      20   0  4828  2696  2392 S  0.0  0.6  0:00.06 avahi-daemon: running [md5board.local]
      4     256 root       20   0  2780   536   468 S  0.0  0.1  0:00.01 /sbin/klogd -n
      2     257 avahi      20   0  4700   240     0 S  0.0  0.1  0:00.00 avahi-daemon: chroot helper
      1     258 root       20   0  2780   520   452 S  0.0  0.1  0:00.00 /sbin/syslogd -n
      2     259 messagebu  20   0  4504  2772  2444 S  0.0  0.6  0:00.16 /usr/bin/dbus-daemon --system --address=systemd: --nofork
      4     260 root       20   0  8132  4660  4252 S  0.0  1.1  0:00.13 /usr/sbin/ofonod -n
      3     274 root       20   0  2036  1396  1288 S  0.0  0.3  0:00.00 /sbin/agetty -o -p -- \u --noclear tty1 linux
      1     320 root       20   0 14904  1748  1496 S  0.0  0.4  0:00.03 /bin/journalctl -efu vmac.service -u vmgc.service -u iqan
      2     326 root       20   0  4532  2504  2112 S  0.0  0.6  0:00.03 /bin/login --
      3     333 root       20   0  6988  3968  3564 S  0.0  0.9  0:00.14 /lib/systemd/systemd-logind
      4     343 root       20   0  3396  2496  2252 S  0.0  0.6  0:00.01 /bin/sh /usr/bin/runWeston
      1     357 root       20   0 14332  7696  6316 S  0.0  1.8  0:00.09 /usr/libexec/weston-keyboard
      2     358 root       20   0 13884  7712  6356 S  0.0  1.8  0:00.09 /usr/libexec/weston-desktop-shell
      2     369 root       20   0  8604  5644  4956 S  0.0  1.3  0:00.30 /lib/systemd/systemd --user
      3     375 root       20   0  4440  3588  2280 S  0.0  0.8  0:00.11 -sh
    81188  Execution time max=5695 min=7099 jitter=1404
    
    

    Regards

    Magnus

  • Hello

    I ran the test on my AM62 SK EVM board connected to Monitor via HDMI.

    I got this result. Using SDK 08.04.03.01.

    root@am62xx-evm:~# uname -a
    Linux am62xx-evm 5.10.140-g5e63ae91b2 #1 SMP PREEMPT Tue Sep 27 16:50:05 UTC 2022 aarch64 aarch64 aarch64 GNU/Linux
    root@am62xx-evm:~# jitterTest &
    [2] 1489
    2199333  policy=SCHED_RR, priority=20
    
    2200346  Execution time max=5066 min=9594 jitter=4528
    2201356  Execution time max=5067 min=5152 jitter=85
    2202366  Execution time max=5066 min=5222 jitter=156
    2203375  Execution time max=5065 min=5204 jitter=139
    2204385  Execution time max=5065 min=5170 jitter=105
    2205395  Execution time max=5066 min=5191 jitter=125
    2206405  Execution time max=5064 min=5190 jitter=126
    2207414  Execution time max=5066 min=5162 jitter=96
    2208424  Execution time max=5065 min=5187 jitter=122
    2209434  Execution time max=5067 min=5258 jitter=191
    2210444  Execution time max=5064 min=5184 jitter=120
    2211454  Execution time max=5064 min=5196 jitter=132
    2212464  Execution time max=5065 min=5181 jitter=116
    2213473  Execution time max=5065 min=5176 jitter=111
    2214483  Execution time max=5066 min=5149 jitter=83
    2215493  Execution time max=5065 min=5161 jitter=96
    2216503  Execution time max=5065 min=5195 jitter=130
    2217512  Execution time max=5065 min=5176 jitter=111
    2218522  Execution time max=5065 min=5169 jitter=104
    2219532  Execution time max=5065 min=5274 jitter=209
    2220542  Execution time max=5065 min=5177 jitter=112
    root@am62xx-evm:~# jitterQt &
    [3] 1490
    root@am62xx-evm:~# 2221554  Execution time max=5066 min=5296 jitter=230
    qt.qpa.wayland: "wl-shell" is a deprecated shell extension, prefer using "xdg-shell-v6" or "xdg-shell" if supported by the cN
    qt.qpa.wayland: Wayland does not support QWindow::requestActivate()
    2222718  Execution time max=5130 min=8977 jitter=3847
    2223804  Execution time max=5095 min=8704 jitter=3609
    2224902  Execution time max=5105 min=7122 jitter=2017
    2225966  Execution time max=5092 min=7743 jitter=2651
    2227031  Execution time max=5096 min=7704 jitter=2608
    2228112  Execution time max=5094 min=8326 jitter=3232
    2229191  Execution time max=5093 min=7948 jitter=2855
    2230259  Execution time max=5090 min=6617 jitter=1527
    2231329  Execution time max=5092 min=6883 jitter=1791
    2232371  Execution time max=5097 min=6380 jitter=1283
    2233433  Execution time max=5097 min=6920 jitter=1823
    2234496  Execution time max=5095 min=6912 jitter=1817
    2235543  Execution time max=5089 min=6296 jitter=1207
    2236600  Execution time max=5090 min=6690 jitter=1600
    2237650  Execution time max=5097 min=7017 jitter=1920
    2238691  Execution time max=5093 min=6239 jitter=1146
    2239736  Execution time max=5096 min=6398 jitter=1302
    root@am62xx-evm:~# kill 1490
    
    [3]-  Terminated              jitterQt
    root@am62xx-evm:~# 2240778  Execution time max=5088 min=5839 jitter=751
    2241790  Execution time max=5067 min=5613 jitter=546
    2242800  Execution time max=5065 min=5149 jitter=84
    2243810  Execution time max=5065 min=5165 jitter=100
    2244820  Execution time max=5065 min=5165 jitter=100
    2245830  Execution time max=5066 min=5173 jitter=107
    2246839  Execution time max=5066 min=5188 jitter=122
    2247849  Execution time max=5066 min=5180 jitter=114
    2248859  Execution time max=5065 min=5197 jitter=132
    root@am62xx-evm:~# kill 1489
    root@am62xx-evm:~# 

    Regards

    Magnus

  • Hi Magnus,

    Is Weston going to be used in your application? I ran the test without Weston (/etc/init.d/weston stop and app --platform eglfs). Also, could you try running a non-Qt based application, for example: /usr/bin/SGX/demos/DRM/OGLES2ChameleonMan.

    Regards,
    Krunal

  • Hello,

    We have not decided if Weston should be used or not. Most likely it will.

    Yes, I will run a test without Weston, using eglfs instead.

    I will also run a test with ChameleonMan instead of my jitterQt.

    I will get back with the results.

     

    I guess these tests will give us a hint about what is taking memory access away from the main application (jitterTest). However, I don't believe in trying to solve this issue by playing around with different graphics settings.

    In the end, I really think we need to set up prioritized access to the LPDDR4 memory.

    /Magnus

  • Here are the results from my last test.

    What I can see is that running with or without Weston makes no difference.

    I guess OGLES2ChameleonMan uses the GPU much more than jitterQt does, so it gives a better result.

    Without weston with eglfs:

    root@md5board:~# systemctl stop weston
    root@md5board:~# export QT_QPA_PLATFORM=eglfs
    root@md5board:~# jitterQt
    qt.qpa.input: xkbcommon not available, not performing key mapping
    ^C

    root@md5board:~#


    580342  Execution time max=5680 min=5723 jitter=43
    581413  Execution time max=5679 min=5717 jitter=38
    582484  Execution time max=5677 min=5715 jitter=38
    583555  Execution time max=5679 min=5711 jitter=32
    584625  Execution time max=5681 min=5719 jitter=38
    585696  Execution time max=5679 min=5708 jitter=29
    586767  Execution time max=5678 min=5727 jitter=49
    587837  Execution time max=5679 min=5730 jitter=51
    588908  Execution time max=5678 min=5738 jitter=60
    589979  Execution time max=5681 min=5715 jitter=34
    591049  Execution time max=5678 min=5714 jitter=36
    592120  Execution time max=5678 min=5724 jitter=46
    593191  Execution time max=5681 min=5713 jitter=32
    594295  Execution time max=5678 min=7076 jitter=1398		<- Start jitterQt without Weston with eglfs
    595394  Execution time max=5675 min=7404 jitter=1729
    596486  Execution time max=5678 min=6708 jitter=1030
    597574  Execution time max=5678 min=6571 jitter=893
    598666  Execution time max=5675 min=6358 jitter=683
    599753  Execution time max=5675 min=6386 jitter=711
    600849  Execution time max=5676 min=6622 jitter=946
    601940  Execution time max=5679 min=6430 jitter=751
    603029  Execution time max=5677 min=6411 jitter=734
    604121  Execution time max=5677 min=6408 jitter=731
    605209  Execution time max=5677 min=6244 jitter=567
    606297  Execution time max=5676 min=6232 jitter=556
    607388  Execution time max=5678 min=6579 jitter=901
    608476  Execution time max=5675 min=6436 jitter=761
    609565  Execution time max=5674 min=6283 jitter=609
    610656  Execution time max=5676 min=6372 jitter=696
    611748  Execution time max=5677 min=6422 jitter=745
    612838  Execution time max=5679 min=6541 jitter=862
    613931  Execution time max=5677 min=6495 jitter=818
    615018  Execution time max=5679 min=6309 jitter=630
    616108  Execution time max=5677 min=6421 jitter=744
    617197  Execution time max=5678 min=6417 jitter=739
    618286  Execution time max=5676 min=6337 jitter=661
    619376  Execution time max=5678 min=6336 jitter=658
    620459  Execution time max=5676 min=6408 jitter=732		<- Exit jitterQt
    621530  Execution time max=5677 min=5714 jitter=37
    622601  Execution time max=5677 min=5722 jitter=45
    623672  Execution time max=5678 min=5725 jitter=47
    624743  Execution time max=5681 min=5716 jitter=35
    625814  Execution time max=5680 min=5743 jitter=63
    626885  Execution time max=5680 min=5711 jitter=31
    627956  Execution time max=5680 min=5712 jitter=32
    629027  Execution time max=5680 min=5735 jitter=55
    630097  Execution time max=5679 min=5711 jitter=32
    631168  Execution time max=5680 min=5717 jitter=37
    632239  Execution time max=5678 min=5719 jitter=41
    633310  Execution time max=5681 min=5718 jitter=37
    634380  Execution time max=5679 min=5712 jitter=33

    OGLES2ChameleonMan with weston (no jitterQt)

    257112  Execution time max=5678 min=5752 jitter=74
    258183  Execution time max=5680 min=5721 jitter=41
    259254  Execution time max=5676 min=5726 jitter=50
    260326  Execution time max=5681 min=5752 jitter=71
    261396  Execution time max=5680 min=5720 jitter=40
    262468  Execution time max=5680 min=5730 jitter=50
    263539  Execution time max=5678 min=5747 jitter=69
    264610  Execution time max=5681 min=5729 jitter=48
    265681  Execution time max=5678 min=5718 jitter=40
    266752  Execution time max=5678 min=5734 jitter=56
    267825  Execution time max=5680 min=5874 jitter=194		<- Start OGLES2ChameleonMan with Weston
    268912  Execution time max=5692 min=6055 jitter=363
    269993  Execution time max=5685 min=6020 jitter=335
    271074  Execution time max=5685 min=5930 jitter=245
    272155  Execution time max=5697 min=5896 jitter=199
    273236  Execution time max=5694 min=5893 jitter=199
    274317  Execution time max=5701 min=5895 jitter=194
    275398  Execution time max=5694 min=5926 jitter=232
    276478  Execution time max=5690 min=5916 jitter=226
    277559  Execution time max=5692 min=5890 jitter=198
    278639  Execution time max=5694 min=5889 jitter=195
    279720  Execution time max=5689 min=5897 jitter=208
    280799  Execution time max=5689 min=5935 jitter=246
    281872  Execution time max=5680 min=5901 jitter=221		<- ExitOGLES2ChameleonMan
    282943  Execution time max=5678 min=5714 jitter=36
    284014  Execution time max=5680 min=5729 jitter=49
    285085  Execution time max=5680 min=5721 jitter=41
    286156  Execution time max=5680 min=5713 jitter=33
    287227  Execution time max=5678 min=5719 jitter=41
    288298  Execution time max=5678 min=5721 jitter=43
    289369  Execution time max=5679 min=5725 jitter=46
    290440  Execution time max=5680 min=5717 jitter=37
    291512  Execution time max=5682 min=5729 jitter=47

    OGLES2ChameleonMan without weston with eglfs (no jitterQt)

    925169  Execution time max=5681 min=5716 jitter=35
    926240  Execution time max=5678 min=5733 jitter=55
    927311  Execution time max=5678 min=5715 jitter=37
    928382  Execution time max=5679 min=5715 jitter=36
    929452  Execution time max=5678 min=5717 jitter=39
    930523  Execution time max=5678 min=5713 jitter=35
    931594  Execution time max=5681 min=5710 jitter=29
    932665  Execution time max=5681 min=5735 jitter=54
    933736  Execution time max=5681 min=5751 jitter=70
    934807  Execution time max=5681 min=5723 jitter=42
    935878  Execution time max=5680 min=5729 jitter=49
    936949  Execution time max=5680 min=5718 jitter=38
    938019  Execution time max=5677 min=5723 jitter=46
    939090  Execution time max=5677 min=5715 jitter=38
    940161  Execution time max=5679 min=5719 jitter=40
    941236  Execution time max=5681 min=5895 jitter=214		<- Start OGLES2ChameleonMan without Weston with eglfs
    942319  Execution time max=5691 min=6039 jitter=348
    943401  Execution time max=5687 min=5943 jitter=256
    944482  Execution time max=5688 min=5965 jitter=277
    945563  Execution time max=5687 min=5945 jitter=258
    946644  Execution time max=5686 min=5964 jitter=278
    947725  Execution time max=5687 min=5969 jitter=282
    948806  Execution time max=5688 min=5961 jitter=273
    949887  Execution time max=5687 min=5950 jitter=263
    950968  Execution time max=5683 min=5966 jitter=283
    952049  Execution time max=5684 min=5954 jitter=270
    953130  Execution time max=5685 min=5971 jitter=286
    954211  Execution time max=5686 min=5967 jitter=281
    955292  Execution time max=5689 min=5947 jitter=258
    956373  Execution time max=5690 min=5952 jitter=262
    957445  Execution time max=5681 min=5902 jitter=221		<- Exit OGLES2ChameleonMan 
    958516  Execution time max=5678 min=5713 jitter=35
    959586  Execution time max=5680 min=5709 jitter=29
    960657  Execution time max=5680 min=5712 jitter=32
    961728  Execution time max=5680 min=5713 jitter=33
    962799  Execution time max=5682 min=5713 jitter=31
    963870  Execution time max=5681 min=5729 jitter=48
    964941  Execution time max=5679 min=5713 jitter=34
    966012  Execution time max=5680 min=5720 jitter=40
    967083  Execution time max=5680 min=5719 jitter=39
    968154  Execution time max=5681 min=5724 jitter=4
    

    Regards

    Magnus

  • Hi Magnus,

    Also, is the above result with RT Linux or standard Linux? The reason I ask is that I am not seeing such a massive deviation on my setup.

    Regards,
    Krunal

  • Hello,

    I am not running RT linux SDK, I am running standard SDK.

    /Magnus

  • Hello,

    After reading the TRM a lot (especially chapter 3.1), I am confused about how to set up QoS (Quality of Service). Do you have any other information on how to do this? This chapter also led me to the DDRSS CoS.

    So I have been playing around with the DDRSS CoS (Class of Service) to see if I could improve my jitter problem.

    I have isolated core 3 and changed the VBUSM2AXI prio for core 3 to highest and the rest to lowest (I think). I made the change in the U-Boot function board_init(). I am running my jitterTest on core 3 and jitterQt on another core.

    Unfortunately, I didn’t see any improvements, but I am not sure I have done this correctly.
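
    For reference, the user-space half of such a setup (pinning the test to one core and requesting a real-time policy) can be sketched as below. This is a minimal sketch, assuming Linux and Python's standard `os` scheduling calls; `pin_to_core` is a hypothetical helper name, and isolating the core itself (for example through kernel boot arguments) is a separate step that is not shown.

```python
import os

def pin_to_core(core, rt_priority=None):
    """Pin the calling process to one CPU core and optionally request SCHED_RR.

    Returns the resulting affinity set. The real-time request is best-effort:
    it normally needs root privileges, so a PermissionError is reported
    instead of being raised.
    """
    allowed = os.sched_getaffinity(0)
    if core not in allowed:
        raise ValueError(f"core {core} is not in the allowed set {sorted(allowed)}")
    os.sched_setaffinity(0, {core})
    if rt_priority is not None:
        try:
            os.sched_setscheduler(0, os.SCHED_RR, os.sched_param(rt_priority))
        except PermissionError:
            print("SCHED_RR not permitted; continuing with the default policy")
    return os.sched_getaffinity(0)
```

    Calling `pin_to_core(3, rt_priority=20)` would mirror the core and the SCHED_RR priority that appear in this thread, assuming the process is allowed to run on core 3.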

    Printout from u-boot console

    board_init
    before change
    DDR16SS_V2A_CTL_REG + 0x00 1ef
    DDR16SS_V2A_CTL_REG + 0x04 0
    DDR16SS_V2A_CTL_REG + 0x08 0
    DDR16SS_V2A_CTL_REG + 0x0C 0
    DDR16SS_V2A_CTL_REG + 0x10 0
    DDR16SS_V2A_CTL_REG + 0x14 0
    DDR16SS_V2A_CTL_REG + 0x18 0
    DDR16SS_V2A_CTL_REG + 0x1C 0
    After change
    DDR16SS_V2A_CTL_REG + 0x00 1ef
    DDR16SS_V2A_CTL_REG + 0x04 30013
    DDR16SS_V2A_CTL_REG + 0x08 0
    DDR16SS_V2A_CTL_REG + 0x0C 0
    DDR16SS_V2A_CTL_REG + 0x10 77777777
    DDR16SS_V2A_CTL_REG + 0x14 0
    DDR16SS_V2A_CTL_REG + 0x18 77777777
    DDR16SS_V2A_CTL_REG + 0x1C 77777777
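
    To see at a glance which offsets the change touched, the two dumps above can be compared with a short script. This is only a sketch: it assumes the printed values are hexadecimal, and `parse_dump` and `changed_registers` are hypothetical helper names.

```python
import re

BEFORE = """\
DDR16SS_V2A_CTL_REG + 0x00 1ef
DDR16SS_V2A_CTL_REG + 0x04 0
DDR16SS_V2A_CTL_REG + 0x08 0
DDR16SS_V2A_CTL_REG + 0x0C 0
DDR16SS_V2A_CTL_REG + 0x10 0
DDR16SS_V2A_CTL_REG + 0x14 0
DDR16SS_V2A_CTL_REG + 0x18 0
DDR16SS_V2A_CTL_REG + 0x1C 0
"""

AFTER = """\
DDR16SS_V2A_CTL_REG + 0x00 1ef
DDR16SS_V2A_CTL_REG + 0x04 30013
DDR16SS_V2A_CTL_REG + 0x08 0
DDR16SS_V2A_CTL_REG + 0x0C 0
DDR16SS_V2A_CTL_REG + 0x10 77777777
DDR16SS_V2A_CTL_REG + 0x14 0
DDR16SS_V2A_CTL_REG + 0x18 77777777
DDR16SS_V2A_CTL_REG + 0x1C 77777777
"""

DUMP_LINE = re.compile(r"\s*\w+ \+ (0x[0-9A-Fa-f]+) ([0-9A-Fa-f]+)\s*$")

def parse_dump(text):
    """Map register offset -> value for lines like 'NAME + 0x04 30013'."""
    regs = {}
    for line in text.splitlines():
        m = DUMP_LINE.match(line)
        if m:
            # Assumption: both the offset and the value are printed in hex.
            regs[int(m.group(1), 16)] = int(m.group(2), 16)
    return regs

def changed_registers(before, after):
    """Return {offset: (old, new)} for offsets whose value differs."""
    b, a = parse_dump(before), parse_dump(after)
    return {off: (b[off], a[off]) for off in sorted(b) if off in a and b[off] != a[off]}

for off, (old, new) in changed_registers(BEFORE, AFTER).items():
    print(f"+0x{off:02X}: 0x{old:X} -> 0x{new:X}")
```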
    

    Regards

    Magnus

  • Hello Magnus,

    I have attached a patch that enables QoS in U-Boot; please keep in mind that I used SDK 08.06.00.42 for my testing. After applying the patch, there is a file called arch/arm/mach-k3/am62x/am62x_qos_data.c that holds the QoS settings. Here are the instructions for generating the file:

    Step 1: Download and install the latest Sysconfig: https://www.ti.com/tool/SYSCONFIG
    Step 2: In Linux, I launched the sysconfig gui using the following command "~/ti/sysconfig_1.16.1/sysconfig_gui.sh"
    Step 3: Open an existing Design by clicking "Browse" and navigate to the file -> /home/krunal/work/ti-processor-sdk-linux-am62xx-evm-08.06.00.42/board-support/k3-respart-tool/out/am62x-evm.syscfg
    Step 4: On the left hand side, click on Quality of Service section followed by Add (See image1.png)
    Step 5: Add device name of your choice and in my case I added SAM62_A53_512KB_WRAP_MAIN_0 (See image2.png)
    Step 5a: I changed the order_id to 8 so it takes a different path, and escalated the priority to 0
    Step 6: On the right hand side, click on the file am62x_qos_data.c and it will have the values you can plug into the Uboot file (See image3.png)
    Step 7: Compile Uboot and load the images on your SD card/boot media

    As I mentioned earlier, it's hard for me to reproduce the variance you are seeing on my setup. I did not test the jitter example; I just did a sanity check with devmem2. I read the register 0x45D20500 and the value changed from 0x0 to 0x80.
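
    One possible reading of the 0x80 readback is sketched below. The field positions are an assumption, not something stated in this thread: the sketch places the order ID in bits [7:4] and the priority in bits [14:12], which is at least consistent with order_id 8 and priority 0 from Step 5a. Check the TRM register description before relying on it; `encode_qos` and `decode_qos` are hypothetical helper names.

```python
# Assumed field positions (verify against the TRM register description).
ORDERID_SHIFT, ORDERID_MASK = 4, 0xF
EPRIORITY_SHIFT, EPRIORITY_MASK = 12, 0x7

def encode_qos(order_id, epriority):
    """Pack an order ID (0..15) and a priority (0..7) into one register value."""
    if not 0 <= order_id <= ORDERID_MASK:
        raise ValueError("order_id must be in 0..15")
    if not 0 <= epriority <= EPRIORITY_MASK:
        raise ValueError("epriority must be in 0..7")
    return (order_id << ORDERID_SHIFT) | (epriority << EPRIORITY_SHIFT)

def decode_qos(value):
    """Split a register value back into the two assumed fields."""
    return {
        "order_id": (value >> ORDERID_SHIFT) & ORDERID_MASK,
        "epriority": (value >> EPRIORITY_SHIFT) & EPRIORITY_MASK,
    }

print(hex(encode_qos(order_id=8, epriority=0)))
print(decode_qos(0x80))
```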

    Regards,
    Krunal

    https://e2e.ti.com/cfs-file/__key/communityserver-discussions-components-files/791/0001_2D00_Temp._2D00_QoS_2D00_patches_2D00_for_2D00_8.06_2D00_SDK.patch


  • Hello,

    Really good that I can change the QoS.

    I have applied the patch, compiled and tested. No difference in my test.

     

    I still don’t understand how to configure the QoS to achieve what I want (if it is possible).

     

    What is your change actually doing? Increasing Epriority to highest and using OrderId 8 (real-time path), but isn’t this for all A53 cores?

     

    What we want is that one core should have higher priority than other cores. Is it achievable?

    /Magnus

  • Hello,

    What is your change actually doing? Increasing Epriority to highest and using OrderId 8 (real-time path), but isn’t this for all A53 cores?

    >> The order ID is used as a mechanism to load-balance the traffic to DDR across two parallel paths. Transactions with order ID 0-7 share one path, while transactions with order ID 8-15 share a separate path. The priority information is used by CBASS for its arbitration decisions, which implement a typical priority-based round robin. Priority value 0x0 is the highest priority, while 0x7 is the lowest. By default, QoS has the priority value set to 0x7 (lowest priority). Yes, this applies to all the A53 cores.
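
    The behaviour described above can be illustrated with a toy model. This is a sketch for intuition only, not the actual interconnect implementation: `ddr_path` and `PriorityRoundRobin` are hypothetical names, and the rotation rule among equal priorities is an assumption about what "priority-based round robin" means here.

```python
def ddr_path(order_id):
    """Order IDs 0-7 share one path to DDR, order IDs 8-15 share the other."""
    if not 0 <= order_id <= 15:
        raise ValueError("order_id must be in 0..15")
    return 0 if order_id <= 7 else 1

class PriorityRoundRobin:
    """Grant the pending request with the best priority (0 = highest, 7 = lowest).

    Requests that tie on priority are served in rotating order, starting
    just after the initiator that was granted last.
    """

    def __init__(self, num_initiators):
        self.num_initiators = num_initiators
        self.last_granted = -1

    def grant(self, requests):
        """requests maps initiator index -> priority; returns the winner or None."""
        if not requests:
            return None
        best = min(requests.values())
        for step in range(1, self.num_initiators + 1):
            idx = (self.last_granted + step) % self.num_initiators
            if requests.get(idx) == best:
                self.last_granted = idx
                return idx
        return None

arbiter = PriorityRoundRobin(num_initiators=3)
# Initiator 1 has priority 0, so it wins over two priority-7 initiators.
print(arbiter.grant({0: 7, 1: 0, 2: 7}))
```

    In this model, raising one initiator to priority 0 only helps against initiators that arbitrate on the same path; traffic on the other order-ID path is not competing in that arbiter at all.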

    What we want is that one core should have higher priority than other cores. Is it achievable?

    >> Let me check internally and get back to you.

    Regards,
    Krunal

  • Hi Magnus,

    Based on my internal discussion, it's not possible to limit one A53 core versus another A53 core.

    Regards,
    Krunal

  • Hello

    Okay, then I know that limiting one A53 core vs another isn’t possible.

    One thing that I can’t get out of my head is why you don’t see the same “bad” result as I do. When I look at your printouts I see that you have an execution time of about 23700 us versus about 5700 us for mine. That is strange. I have attached my binaries. Is it possible for you to test with them?

    If your jitterQt doesn’t affect the result that much, you must have some setting that I don’t have.

    Are you running SDK 08.06.00.42?

    Regards

  • Hello,

    After reading the TRM even more I saw in chapter 3.1.5 Quality of Service (QoS)

     “Some of the modules such as DSS is able to adjust the priority of the transaction based on the system congestion condition. But majority of the transactions have static priority level set by the QoS block. But the priority setting through QoS block can be tuned to fit certain use case scenarios”

     

    So I started to read the DSS chapter and came across this.

     “12.9.1.4.1.6.6 DISPC DMA Priority Requests Control

    The register controls the priority level for DMA requests going out to the memory through the system interconnect. As explained in Section 12.9.1.4.1.6.5, DISPC DMA MFLAG Mechanism, the DISPC Initiator port generates a 1-bit MFLAG output signal to raise the priority of all requests made on that port, if any of its DMA buffers runs critically low (determined by a set of user programmable threshold values for each buffer).

    DSS uses the MFLAG signal from DISPC to set a 3-bit priority level output (Mpriority) for the Initiator port to either a low or high value (configurable in [2-0] PRI_LO and [5-3] PRI_HI register fields with optional values of 0~7) as follows:

    • When MFLAG = 0, the PRI_LO register field determines the value of the Mpriority output for the normal transactions.
    • When MFLAG = 1, the PRI_HI register field determines the value of the Mpriority output for the high-priority transactions.

    This Mpriority output directly drives the respective input of the system interconnect port, which corresponds to the DISPC DMA Initiator port.”
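
    The selection rule in the quoted TRM text can be written down as a small sketch. Only the field positions ([2-0] PRI_LO, [5-3] PRI_HI) and the MFLAG rule come from the quote; `mpriority` and `pack_priority_fields` are hypothetical helper names, and the register this value would be written to is not modelled.

```python
def mpriority(mflag, pri_lo, pri_hi):
    """Return the 3-bit priority driven on the port: PRI_HI if MFLAG is set, else PRI_LO."""
    return pri_hi if mflag else pri_lo

def pack_priority_fields(pri_lo, pri_hi):
    """Pack PRI_LO into bits [2:0] and PRI_HI into bits [5:3]."""
    for name, value in (("pri_lo", pri_lo), ("pri_hi", pri_hi)):
        if not 0 <= value <= 7:
            raise ValueError(f"{name} must be in 0..7")
    return (pri_hi << 3) | pri_lo

# With PRI_LO = 1 and PRI_HI = 0, a raised MFLAG moves requests from 1 to 0.
print(mpriority(mflag=0, pri_lo=1, pri_hi=0), mpriority(mflag=1, pri_lo=1, pri_hi=0))
# Setting both fields to 7 removes any escalation: the output is 7 either way.
print(hex(pack_priority_fields(pri_lo=7, pri_hi=7)))
```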

     

    After even more digging I found the file “drivers/gpu/drm/tidss/tidss_dispc.c” and changed

    cba_lo_pri = 7 and cba_hi_pri = 7, so the DSS module now has the lowest prio instead of 0 and 1, which are high prio. I thought this change would improve my problem, but there was no improvement.

     

    So far I have changed:

    Set up QoS to highest prio for the main A53 CPUs

    Increased A53 core 3 prio in DDRSS CoS (Class of Service) to highest (see previous reply)

    Isolated jitterTest to core 3

    Set DSS to lowest prio

     

    I expected these changes to improve my jitter issue, but they don’t.

    What have I missed? What more can I improve?  

     

    /Regards

    Magnus

  • Hi Magnus,

    Using your binary, I tested on my setup and here are the results:

    root@am62xx-evm:~/customer# ./jitterTest 
    6252237  policy=SCHED_RR, priority=20
    
    6253250  Execution time max=5077 min=9338 jitter=4261
    6254261  Execution time max=5082 min=5184 jitter=102
    6255273  Execution time max=5082 min=5196 jitter=114
    6256284  Execution time max=5082 min=5240 jitter=158
    6257295  Execution time max=5081 min=5210 jitter=129
    6258306  Execution time max=5081 min=5153 jitter=72
    6259318  Execution time max=5082 min=5371 jitter=289
    6260329  Execution time max=5081 min=5185 jitter=104
    6261340  Execution time max=5080 min=5171 jitter=91
    6262351  Execution time max=5080 min=5184 jitter=104
    6263362  Execution time max=5080 min=5185 jitter=105
    6264373  Execution time max=5080 min=5217 jitter=137
    6265385  Execution time max=5082 min=5207 jitter=125
    6266396  Execution time max=5079 min=5162 jitter=83
    6267407  Execution time max=5079 min=5212 jitter=133
    6268418  Execution time max=5083 min=5175 jitter=92
    6269430  Execution time max=5082 min=5400 jitter=318
    6270441  Execution time max=5080 min=5204 jitter=124
    6271452  Execution time max=5079 min=5237 jitter=158
    6272463  Execution time max=5081 min=5182 jitter=101
    6273474  Execution time max=5079 min=5189 jitter=110
    6274486  Execution time max=5080 min=5194 jitter=114
    6275497  Execution time max=5081 min=5177 jitter=96
    6276508  Execution time max=5080 min=5230 jitter=150
    6277519  Execution time max=5081 min=5182 jitter=101
    6278530  Execution time max=5082 min=5208 jitter=126
    6279541  Execution time max=5081 min=5217 jitter=136
    6280552  Execution time max=5081 min=5161 jitter=80
    6281564  Execution time max=5080 min=5206 jitter=126
    6282574  Execution time max=5079 min=5177 jitter=98
    6283586  Execution time max=5080 min=5215 jitter=135
    6284597  Execution time max=5081 min=5181 jitter=100
    6285607  Execution time max=5078 min=5184 jitter=106
    6286619  Execution time max=5081 min=5203 jitter=122
    6287630  Execution time max=5075 min=5179 jitter=104
    6288641  Execution time max=5081 min=5180 jitter=99
    6289652  Execution time max=5081 min=5219 jitter=138
    6290663  Execution time max=5082 min=5211 jitter=129
    6291674  Execution time max=5082 min=5160 jitter=78
    6292712  Execution time max=5071 min=7561 jitter=2490 <--- Start Qt test
    6293805  Execution time max=5068 min=7844 jitter=2776
    6294859  Execution time max=5069 min=7573 jitter=2504
    6295900  Execution time max=5068 min=7457 jitter=2389
    6296931  Execution time max=5067 min=7206 jitter=2139
    6297969  Execution time max=5069 min=7466 jitter=2397
    6299018  Execution time max=5067 min=7571 jitter=2504
    6300047  Execution time max=5069 min=5800 jitter=731
    6301077  Execution time max=5069 min=5751 jitter=682
    6302113  Execution time max=5067 min=6884 jitter=1817
    6303142  Execution time max=5066 min=5727 jitter=661
    6304171  Execution time max=5067 min=5928 jitter=861
    6305200  Execution time max=5068 min=6351 jitter=1283
    6306232  Execution time max=5067 min=5748 jitter=681
    6307261  Execution time max=5067 min=6127 jitter=1060
    6308289  Execution time max=5065 min=5750 jitter=685
    6309317  Execution time max=5067 min=5665 jitter=598
    6310349  Execution time max=5068 min=5795 jitter=727
    6311376  Execution time max=5069 min=5705 jitter=636
    6312404  Execution time max=5069 min=5891 jitter=822
    6313433  Execution time max=5068 min=5799 jitter=731
    6314458  Execution time max=5066 min=5640 jitter=574
    6315490  Execution time max=5067 min=6294 jitter=1227
    6316518  Execution time max=5069 min=5598 jitter=529
    6317545  Execution time max=5067 min=5741 jitter=674
    6318577  Execution time max=5069 min=6057 jitter=988
    6319604  Execution time max=5066 min=5891 jitter=825
    6320642  Execution time max=5070 min=6700 jitter=1630
    6321672  Execution time max=5069 min=5831 jitter=762
    6322700  Execution time max=5068 min=5684 jitter=616
    6323725  Execution time max=5067 min=5693 jitter=626
    6324751  Execution time max=5067 min=5632 jitter=565
    6325766  Execution time max=5070 min=5554 jitter=484
    6326777  Execution time max=5081 min=5173 jitter=92 <--- End Qt test
    6327788  Execution time max=5081 min=5214 jitter=133
    6328799  Execution time max=5082 min=5210 jitter=128
    6329810  Execution time max=5081 min=5173 jitter=92
    6330821  Execution time max=5082 min=5201 jitter=119
    6331832  Execution time max=5078 min=5206 jitter=128
    6332844  Execution time max=5080 min=5182 jitter=102
    6333854  Execution time max=5079 min=5196 jitter=117
    6334866  Execution time max=5081 min=5220 jitter=139
    6335877  Execution time max=5081 min=5168 jitter=87
    6336888  Execution time max=5080 min=5203 jitter=123
    6337899  Execution time max=5080 min=5184 jitter=104
    6338910  Execution time max=5079 min=5195 jitter=116
    6339922  Execution time max=5080 min=5157 jitter=77
    6340933  Execution time max=5078 min=5217 jitter=139
    6341944  Execution time max=5081 min=5172 jitter=91
    6342955  Execution time max=5080 min=5162 jitter=82
    6343966  Execution time max=5081 min=5226 jitter=145
    6344977  Execution time max=5079 min=5204 jitter=125
    6345988  Execution time max=5080 min=5173 jitter=93
    6346999  Execution time max=5080 min=5195 jitter=115
    6348011  Execution time max=5078 min=5192 jitter=114
    6349022  Execution time max=5080 min=5170 jitter=90
    6350033  Execution time max=5080 min=5173 jitter=93
    6351044  Execution time max=5081 min=5200 jitter=119
    6352055  Execution time max=5079 min=5206 jitter=127
    6353067  Execution time max=5082 min=5179 jitter=97
    6354078  Execution time max=5078 min=5172 jitter=94
    6355089  Execution time max=5078 min=5203 jitter=125
    6356100  Execution time max=5076 min=5240 jitter=164
    6357111  Execution time max=5079 min=5178 jitter=99
    6358122  Execution time max=5080 min=5193 jitter=113
    6359133  Execution time max=5081 min=5202 jitter=121
    6360145  Execution time max=5083 min=5179 jitter=96
    6361156  Execution time max=5079 min=5200 jitter=121
    6362167  Execution time max=5079 min=5183 jitter=104
    6363178  Execution time max=5079 min=5187 jitter=108
    6364189  Execution time max=5081 min=5171 jitter=90
    

    I am reviewing this internally with our team and I will get back to you by the middle of next week.

    Regards,
    Krunal

  • Hi Magnus,

    As an experiment, is it possible for you to try the same experiment on RT-Linux?

    Regards,
    Krunal

  • Hello,

    I have tested on Linux RT 08.06.00.42 with the same result as with the ordinary Linux SDK.

    I have run all tests on my custom board.

     

    Without any changes I see these results when running jitterTest and then starting either jitterQt or ChameleonMan:

     

    Without changes
    
    103492  Execution time max=5679 min=5710 jitter=31
    104563  Execution time max=5681 min=5722 jitter=41
    105633  Execution time max=5678 min=5716 jitter=38
    106704  Execution time max=5679 min=5718 jitter=39
    107775  Execution time max=5679 min=5719 jitter=40
    108846  Execution time max=5678 min=5712 jitter=34
    109917  Execution time max=5679 min=5720 jitter=41
    110988  Execution time max=5678 min=5713 jitter=35
    112086  Execution time max=5680 min=6872 jitter=1192	jitterQt start
    113204  Execution time max=5699 min=8497 jitter=2798
    114311  Execution time max=5690 min=6829 jitter=1139
    115413  Execution time max=5693 min=7907 jitter=2214
    116516  Execution time max=5704 min=6909 jitter=1205
    117615  Execution time max=5699 min=6722 jitter=1023
    118725  Execution time max=5710 min=7263 jitter=1553
    119831  Execution time max=5700 min=6899 jitter=1199
    120930  Execution time max=5690 min=6683 jitter=993
    122031  Execution time max=5698 min=6628 jitter=930
    123122  Execution time max=5694 min=6538 jitter=844
    124198  Execution time max=5680 min=5979 jitter=299	jitterQt stop
    125269  Execution time max=5681 min=5711 jitter=30
    126339  Execution time max=5678 min=5712 jitter=34
    127410  Execution time max=5681 min=5713 jitter=32
    128481  Execution time max=5680 min=5720 jitter=40
    129552  Execution time max=5681 min=5727 jitter=46
    130623  Execution time max=5679 min=5720 jitter=41
    131694  Execution time max=5679 min=5712 jitter=33
    
    
    522856  Execution time max=5677 min=5757 jitter=80
    523927  Execution time max=5679 min=5725 jitter=46
    524998  Execution time max=5679 min=5745 jitter=66
    526069  Execution time max=5677 min=5734 jitter=57
    527140  Execution time max=5679 min=5729 jitter=50
    528211  Execution time max=5679 min=5727 jitter=48
    529282  Execution time max=5678 min=5747 jitter=69
    530353  Execution time max=5678 min=5709 jitter=31
    531424  Execution time max=5679 min=5712 jitter=33
    532495  Execution time max=5678 min=5731 jitter=53
    533566  Execution time max=5679 min=5719 jitter=40
    534637  Execution time max=5681 min=5714 jitter=33
    535708  Execution time max=5680 min=5723 jitter=43
    536789  Execution time max=5682 min=6158 jitter=476	OGLES2ChameleonMan start
    537873  Execution time max=5713 min=5937 jitter=224
    538957  Execution time max=5704 min=5924 jitter=220
    540040  Execution time max=5709 min=5970 jitter=261
    541125  Execution time max=5710 min=6001 jitter=291
    542209  Execution time max=5713 min=5955 jitter=242
    543293  Execution time max=5710 min=5958 jitter=248
    544377  Execution time max=5698 min=5953 jitter=255
    545461  Execution time max=5711 min=6025 jitter=314
    546546  Execution time max=5713 min=6034 jitter=321
    547630  Execution time max=5716 min=5969 jitter=253
    548714  Execution time max=5706 min=5993 jitter=287
    549799  Execution time max=5704 min=6047 jitter=343
    550883  Execution time max=5712 min=6042 jitter=330
    551967  Execution time max=5708 min=5986 jitter=278
    553052  Execution time max=5720 min=6022 jitter=302
    554136  Execution time max=5715 min=5971 jitter=256
    555220  Execution time max=5711 min=5973 jitter=262
    556304  Execution time max=5708 min=6010 jitter=302
    557388  Execution time max=5709 min=6007 jitter=298
    558472  Execution time max=5705 min=5939 jitter=234
    559556  Execution time max=5714 min=5915 jitter=201
    560638  Execution time max=5693 min=6039 jitter=346	OGLES2ChameleonMan stop
    561709  Execution time max=5679 min=5733 jitter=54
    562780  Execution time max=5677 min=5713 jitter=36
    563851  Execution time max=5679 min=5712 jitter=33
    564921  Execution time max=5679 min=5715 jitter=36
    565992  Execution time max=5679 min=5719 jitter=40
    567063  Execution time max=5682 min=5709 jitter=27
    568134  Execution time max=5678 min=5706 jitter=28
    569205  Execution time max=5681 min=5762 jitter=81

     

    With these changes (if you want to see my SW changes, just tell me):

    • Isolated core 3 and ran jitterTest on it.
    • Increased A53 core 3 prio in DDRSS CoS (Class of Service) to highest (see previous reply)
    • Set up QoS to highest prio for the main A53 CPUs
    • Set DSS to lowest prio.

    What I can see is that with these changes the jitterTest result with ChameleonMan running improves a lot. I guess the “QoS to highest prio for main A53 cpus” change works and the A53 CPUs now have higher prio than the GPU.

     

    82995  Execution time max=5672 min=5693 jitter=21
    84065  Execution time max=5671 min=5697 jitter=26
    85134  Execution time max=5670 min=5716 jitter=46
    86203  Execution time max=5670 min=5709 jitter=39
    87273  Execution time max=5670 min=5703 jitter=33
    88342  Execution time max=5671 min=5697 jitter=26
    89412  Execution time max=5670 min=5689 jitter=19
    90481  Execution time max=5672 min=5687 jitter=15
    91551  Execution time max=5671 min=5693 jitter=22
    92620  Execution time max=5672 min=5696 jitter=24
    93716  Execution time max=5671 min=8002 jitter=2331	jitterQt start
    94827  Execution time max=5677 min=8307 jitter=2630
    95921  Execution time max=5677 min=7367 jitter=1690
    97011  Execution time max=5675 min=7626 jitter=1951
    98099  Execution time max=5675 min=6736 jitter=1061
    99186  Execution time max=5677 min=6525 jitter=848
    100276  Execution time max=5674 min=6673 jitter=999
    101370  Execution time max=5675 min=7505 jitter=1830
    102456  Execution time max=5676 min=6457 jitter=781
    103545  Execution time max=5674 min=6674 jitter=1000
    104633  Execution time max=5674 min=6488 jitter=814
    105720  Execution time max=5675 min=6314 jitter=639
    106811  Execution time max=5673 min=6665 jitter=992
    107898  Execution time max=5675 min=6612 jitter=937
    108987  Execution time max=5675 min=6558 jitter=883
    110073  Execution time max=5673 min=6930 jitter=1257	jitterQt stop
    111143  Execution time max=5669 min=5746 jitter=77
    112213  Execution time max=5671 min=5693 jitter=22
    113282  Execution time max=5671 min=5692 jitter=21
    114351  Execution time max=5671 min=5691 jitter=20
    115421  Execution time max=5672 min=5690 jitter=18
    116490  Execution time max=5671 min=5692 jitter=21
    117560  Execution time max=5670 min=5691 jitter=21
    118630  Execution time max=5672 min=5689 jitter=17
    119699  Execution time max=5670 min=5694 jitter=24
    120769  Execution time max=5672 min=5689 jitter=17
    121838  Execution time max=5671 min=5689 jitter=18
    ----
    178582  Execution time max=5673 min=5688 jitter=15
    179651  Execution time max=5671 min=5699 jitter=28
    180721  Execution time max=5671 min=5691 jitter=20
    181790  Execution time max=5671 min=5692 jitter=21
    182860  Execution time max=5670 min=5690 jitter=20
    183929  Execution time max=5670 min=5690 jitter=20
    184999  Execution time max=5671 min=5690 jitter=19
    186068  Execution time max=5670 min=5701 jitter=31
    187138  Execution time max=5671 min=5692 jitter=21
    188209  Execution time max=5671 min=5816 jitter=145	OGLES2ChameleonMan start
    189284  Execution time max=5681 min=5847 jitter=166
    190358  Execution time max=5682 min=5742 jitter=60
    191432  Execution time max=5681 min=5738 jitter=57
    192506  Execution time max=5680 min=5746 jitter=66
    193579  Execution time max=5680 min=5744 jitter=64
    194653  Execution time max=5682 min=5756 jitter=74
    195727  Execution time max=5679 min=5747 jitter=68
    196801  Execution time max=5683 min=5754 jitter=71
    197875  Execution time max=5681 min=5746 jitter=65
    198949  Execution time max=5681 min=5787 jitter=106
    200022  Execution time max=5680 min=5750 jitter=70
    201096  Execution time max=5681 min=5749 jitter=68
    202169  Execution time max=5676 min=5751 jitter=75
    203240  Execution time max=5670 min=5725 jitter=55	OGLES2ChameleonMan stop
    204310  Execution time max=5671 min=5689 jitter=18
    205379  Execution time max=5672 min=5693 jitter=21
    206449  Execution time max=5671 min=5691 jitter=20
    207518  Execution time max=5672 min=5689 jitter=17
    208587  Execution time max=5672 min=5696 jitter=24
    209657  Execution time max=5670 min=5692 jitter=22
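
    To put a number on "improves a lot", the jitter column can be averaged per run. The sketch below is only an illustration: it uses three ChameleonMan lines copied from the "without changes" log and three from the log above, and `parse_jitter` is a hypothetical helper name. A real comparison should use the full logs.

```python
import re
from statistics import mean

# Matches lines such as "537873  Execution time max=5713 min=5937 jitter=224".
# Note: in these logs the value labelled "min" is the larger of the two;
# the script only reads the jitter field and does not rely on the labels.
LINE_RE = re.compile(r"^\s*(\d+)\s+Execution time max=(\d+) min=(\d+) jitter=(\d+)")

def parse_jitter(log_text):
    """Return the jitter values, in order, from a block of jitterTest output."""
    values = []
    for line in log_text.splitlines():
        m = LINE_RE.match(line)
        if m:
            values.append(int(m.group(4)))
    return values

WITHOUT_CHANGES = """\
537873  Execution time max=5713 min=5937 jitter=224
538957  Execution time max=5704 min=5924 jitter=220
540040  Execution time max=5709 min=5970 jitter=261
"""

WITH_CHANGES = """\
190358  Execution time max=5682 min=5742 jitter=60
191432  Execution time max=5681 min=5738 jitter=57
192506  Execution time max=5680 min=5746 jitter=66
"""

print("mean jitter without changes:", mean(parse_jitter(WITHOUT_CHANGES)))
print("mean jitter with changes:   ", mean(parse_jitter(WITH_CHANGES)))
```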
    

     

    I have also tested starting 2 more jitterTest instances to see how the jitterTest on core 3 behaves.

    243883  Execution time max=5669 min=5692 jitter=23
    244952  Execution time max=5671 min=5692 jitter=21
    246022  Execution time max=5671 min=5694 jitter=23
    247092  Execution time max=5670 min=5694 jitter=24
    248161  Execution time max=5671 min=5708 jitter=37
    249230  Execution time max=5670 min=5694 jitter=24
    250300  Execution time max=5670 min=5712 jitter=42
    251369  Execution time max=5671 min=5700 jitter=29
    252442  Execution time max=5670 min=5891 jitter=221	another jitterTest start
    253522  Execution time max=5766 min=5812 jitter=46
    254600  Execution time max=5740 min=5787 jitter=47
    255676  Execution time max=5724 min=5773 jitter=49
    256750  Execution time max=5708 min=5746 jitter=38
    257821  Execution time max=5681 min=5734 jitter=53
    258891  Execution time max=5672 min=5705 jitter=33
    259963  Execution time max=5683 min=5737 jitter=54
    261039  Execution time max=5720 min=5786 jitter=66
    262118  Execution time max=5746 min=5814 jitter=68
    263198  Execution time max=5762 min=5806 jitter=44
    264276  Execution time max=5744 min=5782 jitter=38
    265352  Execution time max=5723 min=5774 jitter=51
    266425  Execution time max=5703 min=5785 jitter=82
    267497  Execution time max=5679 min=5721 jitter=42
    268566  Execution time max=5672 min=5702 jitter=30
    269642  Execution time max=5691 min=5928 jitter=237 	another jitterTest start
    270720  Execution time max=5729 min=5827 jitter=98
    271797  Execution time max=5724 min=5814 jitter=90
    272877  Execution time max=5731 min=5833 jitter=102
    273953  Execution time max=5718 min=5791 jitter=73
    275034  Execution time max=5752 min=5827 jitter=75
    276110  Execution time max=5720 min=5804 jitter=84
    277190  Execution time max=5730 min=5831 jitter=101
    278267  Execution time max=5721 min=5819 jitter=98
    279347  Execution time max=5721 min=5831 jitter=110
    280425  Execution time max=5719 min=5828 jitter=109
    281503  Execution time max=5725 min=5872 jitter=147
    282583  Execution time max=5741 min=5842 jitter=101
    283660  Execution time max=5722 min=5809 jitter=87
    284740  Execution time max=5758 min=5819 jitter=61
    285817  Execution time max=5722 min=5875 jitter=153
    286898  Execution time max=5745 min=5840 jitter=95
    287975  Execution time max=5714 min=5822 jitter=108
    289055  Execution time max=5734 min=5835 jitter=101
    290132  Execution time max=5718 min=5822 jitter=104
    291212  Execution time max=5710 min=5841 jitter=131	another jitterTest stop
    292287  Execution time max=5717 min=5798 jitter=81
    293366  Execution time max=5747 min=5805 jitter=58
    294446  Execution time max=5769 min=5820 jitter=51
    295525  Execution time max=5755 min=5795 jitter=40
    296595  Execution time max=5670 min=5776 jitter=106	another jitterTest stop
    297665  Execution time max=5671 min=5692 jitter=21
    298734  Execution time max=5671 min=5691 jitter=20
    299804  Execution time max=5671 min=5695 jitter=24
    300874  Execution time max=5671 min=5688 jitter=17
    301943  Execution time max=5671 min=5689 jitter=18
    303013  Execution time max=5672 min=5689 jitter=17
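
    In the log above, the jitter column equals the value labelled min minus the value labelled max (for example 5891 - 5670 = 221). A minimal C sketch for checking and summarising such a log is shown here; the struct and function names are hypothetical and not part of the original test program.

```c
#include <stdio.h>

/* One parsed log line. Field names follow the labels printed in the log,
 * even though the value labelled "max" is the smaller of the two. */
struct jitter_sample {
    long timestamp;
    long labelled_max;
    long labelled_min;
    long jitter;
};

/* Parse one log line; returns 1 on success, 0 if the line does not match.
 * Trailing annotations such as "another jitterTest start" are ignored. */
int parse_jitter_line(const char *line, struct jitter_sample *s)
{
    return sscanf(line, " %ld Execution time max=%ld min=%ld jitter=%ld",
                  &s->timestamp, &s->labelled_max,
                  &s->labelled_min, &s->jitter) == 4;
}

/* Returns 1 when the logged jitter equals labelled_min - labelled_max. */
int jitter_is_consistent(const struct jitter_sample *s)
{
    return s->jitter == s->labelled_min - s->labelled_max;
}

/* Read log lines from `in`; return the largest jitter seen, or -1 if no
 * line could be parsed. */
long max_jitter(FILE *in)
{
    char line[256];
    struct jitter_sample s;
    long worst = -1;

    while (fgets(line, sizeof(line), in) != NULL) {
        if (parse_jitter_line(line, &s) && s.jitter > worst)
            worst = s.jitter;
    }
    return worst;
}
```

    Calling max_jitter(stdin) on a saved log gives a single worst-case number per run, which makes runs with and without the GPU load easier to compare.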
    

    Regards

    Magnus

  • Hi Ryan, this is discussed in the QoS section 3.1.5 of the TRM.  You can assign different priorities based on the initiator of the transaction.  The DDR subsystem then can use this to perform Class of Service (see section 9.1.3.1) arbitration.  You should be able to set up this arbitration with the info that's in the TRM.  If you have further questions, you can post here.
    Increased A53 Core 3 prio in DDRSS CoS (Class Of Service) to highest (see previous reply)
    Okay, then I know that limiting one A53 core vs. another isn’t possible.

    A clarification: for normal cacheable memory, the entire A53 cluster, or more specifically the L2 cache, is one entity. Core-specific information or prioritization only applies to memory mapped as non-cacheable, sometimes called device type (i.e. intended for IO registers). This is a general property of ARM Cortex-A and shared coherent memory. Take a cache line eviction, for example: who would be the initiator? The last CPU to touch the line, or the CPU that read in the line that caused the write-back?

    Relative prioritization of the A53 cores (really the L2 cache) vs other initiators should still be possible.

    The reason for this is that on one A53 core we run a real time application process.

    I noticed you are not running PREEMPT_RT, or as TI calls it in the SDKs, RT Linux. I would suggest taking a look at https://wiki.linuxfoundation.org/realtime/start and https://www.linutronix.de/blog/A-Checklist-for-Real-Time-Applications-in-Linux . There are some pointers on locking memory allocations that might be relevant.
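
    As a rough illustration of the memory-locking pointers, here is a minimal C sketch of a typical start-up sequence for a real-time thread on Linux: lock all pages with mlockall(), pre-fault some stack, then switch to SCHED_FIFO. The names rt_setup, clamp_fifo_priority and PREFAULT_STACK_BYTES are hypothetical, the stack size is an assumed value, and pinning the thread to a core is not shown. It normally needs root or CAP_SYS_NICE/CAP_IPC_LOCK to succeed.

```c
#include <sched.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/* Amount of stack to pre-fault; an assumed value, tune per application. */
#define PREFAULT_STACK_BYTES (64 * 1024)

/* Clamp a requested priority into the valid SCHED_FIFO range. */
int clamp_fifo_priority(int prio)
{
    int lo = sched_get_priority_min(SCHED_FIFO);
    int hi = sched_get_priority_max(SCHED_FIFO);

    if (prio < lo)
        return lo;
    if (prio > hi)
        return hi;
    return prio;
}

/* Touch one byte per page so the stack pages are mapped (and, after
 * mlockall, locked) before the time-critical loop starts. */
static void prefault_stack(void)
{
    volatile unsigned char buf[PREFAULT_STACK_BYTES];
    long page = sysconf(_SC_PAGESIZE);
    size_t step = page > 0 ? (size_t)page : 4096;

    for (size_t i = 0; i < sizeof(buf); i += step)
        buf[i] = 0;
}

/* Lock current and future pages into RAM and make the calling thread
 * SCHED_FIFO. Returns 0 on success, -1 on the first failure. */
int rt_setup(int fifo_priority)
{
    struct sched_param sp;

    if (mlockall(MCL_CURRENT | MCL_FUTURE) != 0) {
        perror("mlockall");
        return -1;
    }
    prefault_stack();

    memset(&sp, 0, sizeof(sp));
    sp.sched_priority = clamp_fifo_priority(fifo_priority);
    if (sched_setscheduler(0, SCHED_FIFO, &sp) != 0) {
        perror("sched_setscheduler");
        return -1;
    }
    return 0;
}
```

    This only removes page faults and scheduling latency from the picture. It does not change how DDR bandwidth is shared with the GPU.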

      Pekka

  • Hello

    I have looked into Pekka Varis's comments.

    I am already running RT Linux and have gone through the checklist in the link.

    No improvements so far.

    Regards

    Magnus

  • The RT Linux comments were for guarantees between A53 applications. For the GPU, it looks like AM62x does have a bandwidth limiter (GPU_WS_BW_LIMITER3_REGS at address 0x30400000 for writes, GPU_RS_BW_LIMITER2_REGS at address 0x30401000 for reads). The TRM unfortunately does not have the register details, but they are the same as for the A53 bandwidth limiter in section 7.5.2 A53_RS_BW_LIMITER Registers.

    I'll file a documentation bug on the missing registers.

    My suggestion would be to try limiting the GPU bandwidth. The DDR CoS only works by picking from the commands in its queue.

      Pekka

  • Hello,

    From U-Boot I don’t have any problems reading these BW_LIMITER registers, but from within Linux I get crashes. I don’t know why.

    Also, there is not much info in the TRM about these registers and what they do. Is this a limiter of the CBASS? Is it possible to get some guidance on how to handle these registers?

    Magnus

  • There are two pairs of limiters (RS and WS), one for the A53 cluster and another for the GPU. In both pairs, RS is the read limiter and WS is the write limiter. Unfortunately, the section on the A53 limiters (section 7.5.2 A53_RS_BW_LIMITER) is all the documentation there is for now.

  • Ok,

    Is there any info on what to actually write to the registers to limit, for example, the GPU? Maybe some example code?

    My idea was to read out the registers while my application is running and then look at statistics, peak values and so on, but as soon as I read any BW_LIMITER register I get a crash. I don’t know why.

    Magnus

  • We are looking for an example code snippet. Reading the registers and even writing to them seems to work for me, although I don't have an example that results in bandwidth behavior changes. If you just run:

    devmem2 0x30403000 w

    you get a crash?
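
    For reference, a word read like the devmem2 command above can be sketched in C by mapping the physical page through /dev/mem. This is a minimal sketch, assuming /dev/mem access is permitted by the kernel configuration; the function names are hypothetical, and it does not explain or work around the crash reported in this thread.

```c
#include <fcntl.h>
#include <stdint.h>
#include <stdio.h>
#include <sys/mman.h>
#include <unistd.h>

/* Page-aligned base of a physical address (page_size must be a power of 2). */
uint64_t page_base(uint64_t phys, uint64_t page_size)
{
    return phys & ~(page_size - 1);
}

/* Offset of a physical address inside its page. */
uint64_t page_offset(uint64_t phys, uint64_t page_size)
{
    return phys & (page_size - 1);
}

/* Read one 32-bit word at physical address `phys` via /dev/mem.
 * Returns 0 on success, -1 on error (misaligned address, open or mmap
 * failure). Needs root. */
int read_phys32(uint64_t phys, uint32_t *value)
{
    long ps = sysconf(_SC_PAGESIZE);
    uint64_t page_size;
    void *map;
    int fd;

    if (ps <= 0 || (phys & 3u) != 0)
        return -1;
    page_size = (uint64_t)ps;

    fd = open("/dev/mem", O_RDONLY | O_SYNC);
    if (fd < 0) {
        perror("open /dev/mem");
        return -1;
    }

    map = mmap(NULL, (size_t)page_size, PROT_READ, MAP_SHARED, fd,
               (off_t)page_base(phys, page_size));
    close(fd); /* the mapping stays valid after the descriptor is closed */
    if (map == MAP_FAILED) {
        perror("mmap");
        return -1;
    }

    *value = *(const volatile uint32_t *)
                 ((const uint8_t *)map + page_offset(phys, page_size));
    munmap(map, (size_t)page_size);
    return 0;
}
```

    Calling read_phys32(0x30403000, &v) corresponds to the devmem2 read shown above.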

      Pekka