Difference: Xeonprocessor (8 vs. 9)

Revision 9 - 2014/03/05 - Main.WilliamFedus

Line: 1 to 1
 
META TOPICPARENT name="WebHome"
-- WilliamFedus - 2014/02/04
Line: 100 to 100
 We immediately see that this is the expected factor-of-16 slowdown relative to our vectorized code.

Scaling to Multiple Cores

Changed:
<
<
To utilize the full capability of our cores, we must run more than one thread on each core in order to execute the FMA calculation for each clock cycle. Here, we use the OpenMP? API for shared memory multiprocessing.
>
>
To utilize the full capability of our cores, we must run more than one thread on each core in order to execute the FMA calculation for each clock cycle. Here, we use the OpenMP API for shared memory multiprocessing.
 
Changed:
<
<
* OpenMP? Tutorials: Tutorial
>
>
* OpenMP Tutorials: Tutorial
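
As a rough illustration of the OpenMP approach described in this section, the sketch below spreads a multiply-add loop across threads. The function name `fma_sweep` and its parameters are hypothetical and do not come from the original page; whether the compiler emits actual FMA instructions depends on the target and the optimization flags.

```c
#include <stddef.h>

/* Hypothetical kernel (not from the original page):
 * applies y[i] = a * x[i] + y[i] to every element, `reps` times. */
void fma_sweep(float a, const float *x, float *y, size_t n, int reps)
{
    /* Split the outer loop across the available threads. If the code is
     * built without OpenMP support, the pragma is ignored and the loop
     * simply runs serially with the same result. */
    #pragma omp parallel for schedule(static)
    for (long i = 0; i < (long)n; ++i) {
        float acc = y[i];
        for (int r = 0; r < reps; ++r)
            acc = a * x[i] + acc;   /* one multiply-add per iteration */
        y[i] = acc;
    }
}
```

The number of threads is normally chosen at run time through the standard `OMP_NUM_THREADS` environment variable, so running more than one thread per core is a matter of setting it above the core count. The code must be compiled with the compiler's OpenMP option (for example `-fopenmp` with GCC) for the pragma to take effect.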
 

Vectorization

Useful guide for enabling compiler vectorization capability in the Intel compiler
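
For orientation, here is a minimal sketch of a loop written so that a compiler can vectorize it. The function `scale_add` and its arguments are hypothetical, chosen only for illustration; the linked guide remains the reference for Intel-specific options and reports.

```c
#include <stddef.h>

/* Hypothetical example (not from the original page):
 * out[i] = s * a[i] + b[i] for i in [0, n). */
void scale_add(float *restrict out, const float *restrict a,
               const float *restrict b, float s, size_t n)
{
    /* `restrict` promises the arrays do not overlap, which removes the
     * aliasing doubt that often blocks auto-vectorization. The OpenMP
     * `simd` pragma is a further hint; compilers built without OpenMP
     * SIMD support ignore it and the loop stays correct. */
    #pragma omp simd
    for (size_t i = 0; i < n; ++i)
        out[i] = s * a[i] + b[i];
}
```

The loop body has no dependence between iterations and a unit-stride access pattern, the two properties a vectorizer generally looks for.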
 