I’m not big into New Year’s resolutions (I don’t see the point of waiting to do something you know you should do)… but that said, every few years I put myself on a one-month French fries moratorium to start off the year because I just can’t seem to summon the willpower at any other point in the year. So now, 17 days into the moratorium, I don’t see how I can possibly hold out another 2 weeks (especially in the land of burger joints that is Dallas).
So what exactly does this have to do with high performance computing? Well, colored by the ‘golden haze’ of my withdrawal, I offer up some predictions for the year ahead in high performance computing:
Happy New Year! And if you see me in the next two weeks please don’t ask me if I want fries with that…because, yes, actually I DO!!
The C6678 is an interesting and impressive piece of floating point hardware development. Speaking as a mathematical modeller who is trying to become a DSP developer, your toolchain and software documentation are still very hard work. I think HPC researchers wanting to use C6678s might find learning to develop code for them something of a shock...
It's partly culture, and partly tools, and partly that the device *is* complicated (e.g. the lack of automatic cache coherency). Of course, HPC people tend to be bright so they'll probably manage some impressive things nonetheless.
I was about to suggest that a full OpenMP and/or MPI stack for the C6678 that, for example, took care of starting tasks on cores and allocating tasks between cores might make things more familiar. I see that you're working on OpenMP - I'm pleased, although it'll be too late for our current project.
Hi Gordon,
Thanks for your comment. We're always striving to make working with our devices easier and we've made good progress on OpenMP. We have a really good example running an SGEMM kernel across all 8 cores on C6678 that you might find interesting. We're also putting together an MPI + OpenMP demo (still a work in progress) that will use MPI to distribute the job across multiple C6678s and OpenMP to parallelize the work on each individual DSP. If you're interested in learning more about these demos please let me know and I'll be happy to follow up with you.
- Arnon
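For readers who haven't used OpenMP, here is a rough idea of what splitting an SGEMM across cores can look like. This is only a minimal sketch in generic C, not the TI demo mentioned above: the function name `sgemm_rows` and the row-major layout are assumptions made for illustration, nothing here is tuned for the C6678, and the MPI layer that would hand out work to multiple devices is left out entirely.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not from any TI library):
 * computes C = alpha * A * B + beta * C with row-major storage.
 * A is m x k, B is k x n, C is m x n. */
void sgemm_rows(int m, int n, int k, float alpha, const float *A,
                const float *B, float beta, float *C)
{
    assert(A != NULL && B != NULL && C != NULL);

    /* Rows of C are independent of each other, so the outer loop can be
     * divided among threads without any locking. With a static schedule
     * each thread gets one contiguous chunk of rows. */
    #pragma omp parallel for schedule(static)
    for (int i = 0; i < m; ++i) {
        for (int j = 0; j < n; ++j) {
            float acc = 0.0f;
            for (int p = 0; p < k; ++p)
                acc += A[(size_t)i * k + p] * B[(size_t)p * n + j];
            C[(size_t)i * n + j] = alpha * acc + beta * C[(size_t)i * n + j];
        }
    }
}
```

If the compiler is run without OpenMP support the pragma is simply ignored and the loop runs serially, which makes it easy to check results before worrying about parallel performance.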