NEWS

Multicore Performance Bottleneck – Memory

  • April 3, 2013


Timely access to data is imperative for compute performance. Ideally, a processor should do useful work every clock cycle when there is work to be done, which requires near-instant access to data. In reality, processors operate faster than memory can be accessed, which is why cache memory is used to improve performance. Even cache memory is generally slower than the processor.

Adding multiple cores and hardware accelerators into the mix, all accessing the same memory, exacerbates this issue. It is therefore increasingly important to use the memory available in a given multicore platform efficiently, for example by choosing the right type of memory for the job and by considering cache alignment. A combination of globally shared, locally shared and local memories can help alleviate the memory bottleneck, but the more complex the memory architecture, the harder it is to use the memory well.

Just as importantly, efficient utilization of memory should not come at the expense of application portability. To accomplish both efficiency and portability, designers can use a “thin” abstraction layer that relieves the application from managing memory layout differences from one platform to another. Such an abstraction layer could be provided by tools, leaving the heavy lifting to the tool itself.

There is a great deal of opportunity for improvement in the area of memory.

Contributed by Sven Brehmer, President and CEO

If this is an area of interest/importance to you and you would like to share your thoughts and questions, please get in touch with Sven here.
