-''__builtin_prefetch'' moves data from memory into the cache before it is needed, so that later accesses to it are less likely to miss the cache
-Example from the [[GCC documentation>https://gcc.gnu.org/onlinedocs/gcc-5.3.0/gcc/Other-Builtins.html]]:
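 /* j is the prefetch distance: how many elements ahead to fetch */
 for (i = 0; i < n; i++)
   {
     a[i] = a[i] + b[i];
     __builtin_prefetch (&a[i+j], 1, 1);  /* a is written: rw = 1 */
     __builtin_prefetch (&b[i+j], 0, 1);  /* b is only read: rw = 0 */
     /* ... */
   }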
-''__builtin_prefetch'' takes three parameters; the last two are optional
-the first parameter is the address of the data to prefetch
-the second parameter (rw) is 0 or 1 and must be a compile-time constant
--0 prepares for a read (the default)
--1 prepares for a write
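-the third parameter (locality) is 0, 1, 2, or 3 and must be a compile-time constant
--0 means no temporal locality: the data need not be kept in the cache after the access
--1 and 2 mean a low and a moderate degree of temporal locality, respectively
--3 means a high degree of temporal locality: the data should be kept in all levels of cache if possible (the default)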
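The parameters described above can be put together in a small self-contained sketch. The function name add_arrays and the constant PREFETCH_DISTANCE are hypothetical, chosen only for illustration; a good prefetch distance depends on the hardware and should be measured rather than assumed. The bounds check is there because a prefetch of an invalid address does not fault, but the address expression itself must still be valid.

```c
#include <stddef.h>

/* Hypothetical tuning value: how many elements ahead to prefetch.
   Not taken from the GCC documentation; measure on the target machine. */
#define PREFETCH_DISTANCE 16

/* Illustrative helper: computes a[i] = a[i] + b[i] for i in [0, n). */
void add_arrays(int *a, const int *b, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        /* Only form addresses that lie inside the arrays. */
        if (i + PREFETCH_DISTANCE < n) {
            /* a will be written: rw = 1, low temporal locality = 1 */
            __builtin_prefetch(&a[i + PREFETCH_DISTANCE], 1, 1);
            /* b will only be read: rw = 0, low temporal locality = 1 */
            __builtin_prefetch(&b[i + PREFETCH_DISTANCE], 0, 1);
        }
        a[i] = a[i] + b[i];
    }
}
```

Calling __builtin_prefetch(p) with only the address is equivalent to __builtin_prefetch(p, 0, 3), since the defaults are rw = 0 and locality = 3. The prefetch is only a hint: the result of add_arrays is the same with or without it, and only the timing may differ.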

