Hello,
I have a Shannon C6678 board and I am able to run some simple OpenMP codes on it. I want to look into the memory configuration: I want to place my data in different levels of memory and see what changes, for example if I put it in L2 cache versus bypassing L2 and putting it in shared memory. I would then like to measure the execution time for each case. I have searched but have not found a satisfactory way to proceed. As a starting point, leaving large data sets aside, let's start with small multiplication codes. Can anyone suggest something to start with?
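To make the question concrete, here is a rough sketch of the kind of test I have in mind. It is only an illustration under assumptions: the matrix size `N`, the helper names, and the section names `.my_l2` and `.my_msmc` are hypothetical, and the sections would still have to be mapped to L2 SRAM or shared memory in the linker command file. The `DATA_SECTION` pragma is only used when building with the TI compiler; timing uses the standard `clock()`, though on the device a cycle counter would probably give finer resolution.

```c
#include <assert.h>
#include <stdio.h>
#include <time.h>

#define N 16 /* small on purpose; the size is an arbitrary assumption */

/* With the TI compiler, DATA_SECTION places a symbol in a named section.
 * ".my_l2" is a hypothetical section name; swapping it for another
 * hypothetical name such as ".my_msmc" (mapped to shared memory in the
 * linker command file) is the change I would like to compare. */
#ifdef __TI_COMPILER_VERSION__
#pragma DATA_SECTION(a, ".my_l2")
#pragma DATA_SECTION(b, ".my_l2")
#pragma DATA_SECTION(c, ".my_l2")
#endif
float a[N][N];
float b[N][N];
float c[N][N];

/* a = identity, b[i][j] = i + j, so that c should end up equal to b. */
void fill_inputs(void)
{
    int i, j;
    for (i = 0; i < N; i++) {
        for (j = 0; j < N; j++) {
            a[i][j] = (i == j) ? 1.0f : 0.0f;
            b[i][j] = (float)(i + j);
        }
    }
}

/* Plain triple-loop matrix multiplication: c = a * b. */
void matmul(void)
{
    int i, j, k;
    for (i = 0; i < N; i++) {
        for (j = 0; j < N; j++) {
            float sum = 0.0f;
            for (k = 0; k < N; k++) {
                sum += a[i][k] * b[k][j];
            }
            c[i][j] = sum;
        }
    }
}

/* Returns the CPU time in seconds taken by `reps` multiplications. */
double time_matmul(int reps)
{
    int r;
    clock_t start = clock();
    for (r = 0; r < reps; r++) {
        matmul();
    }
    return (double)(clock() - start) / CLOCKS_PER_SEC;
}
```

The idea would be to call `fill_inputs()` once, then `time_matmul()` with some repetition count from `main`, print the result, and repeat the run with the arrays placed in a different section.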
Thanks.