I've written a codec that takes a video frame as input, does some processing, and outputs a video frame. This frame I/O goes to regular (external) memory, I believe, and not the fast on-chip (L1?) memory, which I haven't studied yet. Regardless, it's fast enough. Platform: DM6467T running Arago Linux.
The processing involves referencing a third memory buffer that contains a lookup table. In theory I could use 24 bits to address a table of 16,777,216 bytes. However, the codec runs quite slowly in that case, presumably because the table is so large that most accesses go to slow memory.
When I back off my specs a little and use 18 bits to address the table instead, the table is only 262,144 bytes and my codec runs very fast. Adding just a single bit, doubling the table to 524,288 bytes, causes a big slowdown. So there's some threshold I'm crossing.
I'm not intentionally using L1D. And L1D is only 32K anyway, which is significantly smaller than even my faster 262,144 byte table.
Any clues as to what's going on here? Is Linux, or something else, doing caching for me? My speed is slightly dependent on the video frame data, so I may only be visiting a portion of the table at a time, and I might visit more of the larger table. But someone would have to be managing associative caching for me, because I'm not.
Where is the magic elf?
I'm getting ready to need a bigger table anyway, so I need to understand the elf!
Thanks,
Helmut