### Author Topic: Array speeds  (Read 2235 times)

#### fuzzy70

• Community Developer
• Prof. Inline
•      • • Posts: 828
• Look left, Look right, LOOK OUT!! ##### Array speeds
« on: 2012-Feb-26 »
While reading an old programming book (2003 so not uber old) I got onto a section about memory access, cache access etc & found something I thought was interesting so decided to test it in GLB. The book is about game coding & the code is in C++ but it's the concepts I was reading it for. Most will prob not find it interesting or any use for it but I found it of interest Basically (no pun intended) it was saying about how accessing arrays in the wrong order can have an effect on speed & cache hits. His results had more dramatic differences then mine but that was probably down the the cpu's of the time & their respective cache sizes, however I did witness noticeable differences with my translated version of the code.

Code: (glbasic) [Select]
`LOCAL n%,n2%, i%, j%, k%, TimeStart, TimeEnd, TimeTaken, DummyVarn=250n2=10000DIM TestData3d[n][n][n]PRINT "Writing Data to the 3d array",0,0// Fill 3d array in Column orderTimeStart = GETTIMERALL()FOR k=0 TO n-1 FOR j=0 TO n-1 FOR i=0 TO n-1 TestData3d[i][j][k]=0 NEXT NEXTNEXTTimeEnd = GETTIMERALL()TimeTaken = TimeEnd - TimeStartPRINT "Column order took = "+TimeTaken,0,10// Fill 3d array in Row orderTimeStart = GETTIMERALL()FOR i=0 TO n-1 FOR j=0 TO n-1 FOR k=0 TO n-1 TestData3d[i][j][k]=0 NEXT NEXTNEXTTimeEnd = GETTIMERALL()TimeTaken = TimeEnd - TimeStartPRINT "Row order took = "+TimeTaken,0,20SHOWSCREENMOUSEWAITPRINT "Reading Data from the 3d array",0,0// Read 3d array in Column orderTimeStart = GETTIMERALL()FOR k=0 TO n-1 FOR j=0 TO n-1 FOR i=0 TO n-1 DummyVar=TestData3d[i][j][k] NEXT NEXTNEXTTimeEnd = GETTIMERALL()TimeTaken = TimeEnd - TimeStartPRINT "Column order took = "+TimeTaken,0,10// Read 3d array in Row orderTimeStart = GETTIMERALL()FOR i=0 TO n-1 FOR j=0 TO n-1 FOR k=0 TO n-1 DummyVar=TestData3d[i][j][k] NEXT NEXTNEXTTimeEnd = GETTIMERALL()TimeTaken = TimeEnd - TimeStartPRINT "Row order took = "+TimeTaken,0,20SHOWSCREENMOUSEWAIT// Clear 3d array from mem & create 2d oneDIM TestData3dDIM TestData2d[n2][n2]PRINT "Writing Data to the 2d array",0,0// Fill 2d array in Column orderTimeStart = GETTIMERALL()FOR j=0 TO n2-1 FOR i=0 TO n2-1 TestData2d[i][j]=0 NEXTNEXTTimeEnd = GETTIMERALL()TimeTaken = TimeEnd - TimeStartPRINT "Column order took = "+TimeTaken,0,10// Fill 2d array IN Row orderTimeStart = GETTIMERALL()FOR i=0 TO n2-1 FOR j=0 TO n2-1 TestData2d[i][j]=0 NEXTNEXTTimeEnd = GETTIMERALL()TimeTaken = TimeEnd - TimeStartPRINT "Row order took = "+TimeTaken,0,20SHOWSCREENMOUSEWAITPRINT "Reading Data from the 2d array",0,0// Read 2d array in Column orderTimeStart = GETTIMERALL()FOR j=0 TO n2-1 FOR i=0 TO n2-1 DummyVar=TestData2d[i][j] NEXTNEXTTimeEnd = GETTIMERALL()TimeTaken = TimeEnd - TimeStartPRINT "Column order took = "+TimeTaken,0,10// Read 2d array in Row orderTimeStart = GETTIMERALL()FOR i=0 TO n2-1 FOR j=0 TO n2-1 DummyVar=TestData2d[i][j] NEXTNEXTTimeEnd = GETTIMERALL()TimeTaken = TimeEnd - TimeStartPRINT "Row order took = "+TimeTaken,0,20SHOWSCREENMOUSEWAIT`
BTW originally they where separate tests I wrote but then lumped them into one prog & the results where still the same, also trying ints rather than floats produced the same results.

On my system filling the 3d array was on average twice as fast in column order then row order, while reading there was little in it but row order was slightly quicker. The difference does scale with the the size of the array for example the 2d fill with 10000x10000 elements was 3.5 times quicker in column order whereas with 5000x5000 elements around 2 times quicker.

How different the results would be on other platforms like iPhone etc I do not know as not having any of the other platforms I'm unable to test.

If you are filling any large arrays in your project during runtime (except at the initial start) it might be an idea to run this test on the device to see what works quicker if at all. Who knows you may gain a few millisecs that could be used elsewhere or give a boost to updates Lee
"Why don't you just make ten louder and make ten be the top number and make that a little louder?"
- "These go to eleven."

This Is Spinal Tap (1984)

#### hardyx

• Community Developer
• Dr. Type
•      • • Posts: 496 ##### Re: Array speeds
« Reply #1 on: 2012-Feb-27 »
On my system filling the 3d array was on average twice as fast in column order then row order, while reading there was little in it but row order was slightly quicker.
Sometimes the C compiler optimizes the loops with pointers. In many languages row data are contiguous in memory. When you use row order you are accesing the addresses: d, d+1, d+2, d+3, etc.  When you access using column order you are accessing the addresses: d, d+rowsize*1, d+rowsize*2, d+rowsize*3, etc. An addition is faster than a multiplication.

I think GLBasic makes the arrays putting first the external indexes and the last index is the most internal, like other languages. But now I don't know with the results of your test.
« Last Edit: 2012-Feb-27 by hardyx »

#### fuzzy70

• Community Developer
• Prof. Inline
•      • • Posts: 828
• Look left, Look right, LOOK OUT!! ##### Re: Array speeds
« Reply #2 on: 2012-Feb-27 »
This was his original code

Code: (glbasic) [Select]
`const int g_n = 250;float TestData[g_n][g_n][g_n];inline void column_ordered(){   for (int k=0; k<g_n; k++)      for (int j=0; j<g_n; j++)         for (int i=0; i<g_n; i++)            TestData[i][j][k] = 0.0f;}inline void row_ordered(){   for (int i=0; i<g_n; i++)      for (int j=0; j<g_n; j++)         for (int k=0; k<g_n; k++)            TestData[i][j][k] = 0.0f;}`
& his results

Column Ordered=2817 ms   Row Ordered=298 ms

There is no mention of the cpu used but prob P3 or an early P4, the compiler was Visual C++ .net.

His findings are like you said
Quote
In many languages row data are contiguous in memory
but as you can see mine were the opposite & column order was faster. How to convert the C++ into something I can compile & run in GCC or Visual C++ express that I have is beyond me at current as do not have that much knowledge of C++, just enough to pick some parts out Lee

"Why don't you just make ten louder and make ten be the top number and make that a little louder?"
- "These go to eleven."

This Is Spinal Tap (1984)

#### hardyx

• Community Developer
• Dr. Type
•      • • Posts: 496 ##### Re: Array speeds
« Reply #3 on: 2012-Feb-27 »
Check this code, you can compile in GCC or Visual C++ like console application.

Code: (glbasic) [Select]
`#include <stdio.h>#include <time.h>// put ordering functions here//.....inline clock_t GetTime() {   return clock();}int main(){  clock_t timeinit;  timeinit = GetTime();  column_ordered();  clock_t coltime = GetTime() - timeinit;  timeinit = GetTime();  row_ordered()  clock_t rowtime = GetTime() - timeinit;  printf("Columns time: %.3f tics\n", (double)coltime);  printf("Rows time: %.3f tics\n", (double)rowtime);  return 0;}`
P.D. I fixed the data definitions
« Last Edit: 2012-Feb-28 by hardyx »

#### fuzzy70

• Community Developer
• Prof. Inline
•      • • Posts: 828
• Look left, Look right, LOOK OUT!! ##### Re: Array speeds
« Reply #4 on: 2012-Feb-28 »
Thank you hardyx for that, very much appreciated as I would not been able to add the bits you posted without some serious headaches, like I said my C++ is far from great but I managed to get it to work by looking at the compile errors & hit & miss additions This was the code I ended up with

Code: (glbasic) [Select]
`#include <stdio.h>#include <time.h>// put ordering functions here//.....const int g_n = 250;float TestData[g_n][g_n][g_n];inline void column_ordered(){   for (int k=0; k<g_n; k++)      for (int j=0; j<g_n; j++)         for (int i=0; i<g_n; i++)            TestData[i][j][k] = 0.0f;}inline void row_ordered(){   for (int i=0; i<g_n; i++)      for (int j=0; j<g_n; j++)         for (int k=0; k<g_n; k++)            TestData[i][j][k] = 0.0f;}inline clock_t GetTime() {  return clock();}int main(){  float coltime;  float rowtime;  clock_t timeinit;  timeinit = GetTime();  column_ordered();  coltime = GetTime() - timeinit;  timeinit = GetTime();  row_ordered();//  clock_t  rowtime = GetTime() - timeinit;  printf("Columns time: %.3f tics\n", (double)coltime);  printf("Rows time: %.3f tics\n", (double)rowtime);  return 0;}`
I presume I done right as it compiles in GCC & executes,  also I take it that 1 Tic = 1ms?

The results where like the ones from the book in that rows where quicker than columns

Columns time: 378.000 tics
Rows time: 40.000 tics

GLB results
Columns time: 157.785
Row Time: 374.704

As you can see quite a different ratio between columns vs rows.
"Why don't you just make ten louder and make ten be the top number and make that a little louder?"
- "These go to eleven."

This Is Spinal Tap (1984)

#### dreamerman

• Global Moderator
• Dr. Type
•       • • Posts: 365 ##### Re: Array speeds
« Reply #5 on: 2012-Feb-28 »
Interesting topic, so just for comparison same code in FreeBasic:

Code: (glbasic) [Select]
`dim as double TimeStart, TimeEnd, TimeTaken, DummyVardim as integer n,n2, i, j, kn=250n2=10000reDIM as integer TestData3d(n,n,n)clssleep 1000PRINT "Writing Data to the 3d array"'' Fill 3d array in Column orderTimeStart = timer()FOR k=0 TO n-1 FOR j=0 TO n-1 FOR i=0 TO n-1 TestData3d(i,j,k)=0 NEXT NEXTNEXTTimeEnd = timer()TimeTaken = TimeEnd - TimeStartPRINT "Column order took = "+str(TimeTaken)'' Fill 3d array in Row orderTimeStart = timer()FOR i=0 TO n-1 FOR j=0 TO n-1 FOR k=0 TO n-1 TestData3d(i,j,k)=0 NEXT NEXTNEXTTimeEnd = timer()TimeTaken = TimeEnd - TimeStartPRINT "Row order took = "+str(TimeTaken)sleep 1000PRINT "Reading Data from the 3d array"'' Read 3d array in Column orderTimeStart = timer()FOR k=0 TO n-1 FOR j=0 TO n-1 FOR i=0 TO n-1 DummyVar=TestData3d(i,j,k) NEXT NEXTNEXTTimeEnd = timer()TimeTaken = TimeEnd - TimeStartPRINT "Column order took = "+str(TimeTaken)'' Read 3d array in Row orderTimeStart = timer()FOR i=0 TO n-1 FOR j=0 TO n-1 FOR k=0 TO n-1 DummyVar=TestData3d(i,j,k) NEXT NEXTNEXTTimeEnd = timer()TimeTaken = TimeEnd - TimeStartPRINT "Row order took = "+str(TimeTaken)'' Clear 3d array from mem & create 2d oneredim TestData3d(0,0,0)reDIM as integer TestData2d(n2,n2)sleep 1000PRINT "Writing Data to the 2d array"'' Fill 2d array in Column orderTimeStart = timer()FOR j=0 TO n2-1 FOR i=0 TO n2-1 TestData2d(i,j)=0 NEXTNEXTTimeEnd = timer()TimeTaken = TimeEnd - TimeStartPRINT "Column order took = "+str(TimeTaken)'' Fill 2d array IN Row orderTimeStart = timer()FOR i=0 TO n2-1 FOR j=0 TO n2-1 TestData2d(i,j)=0 NEXTNEXTTimeEnd = timer()TimeTaken = TimeEnd - TimeStartPRINT "Row order took = "+str(TimeTaken)sleep 1000PRINT "Reading Data from the 2d array"'' Read 2d array in Column orderTimeStart = timer()FOR j=0 TO n2-1 FOR i=0 TO n2-1 DummyVar=TestData2d(i,j) NEXTNEXTTimeEnd = timer()TimeTaken = TimeEnd - TimeStartPRINT "Column order took = "+str(TimeTaken)'' Read 2d array in Row orderTimeStart = timer()FOR i=0 TO n2-1 FOR j=0 TO n2-1 DummyVar=TestData2d(i,j) NEXTNEXTTimeEnd = timer()TimeTaken = TimeEnd - TimeStartPRINT "Row order took = "+str(TimeTaken)sleep`
And my results on P E2180 (repeated several times):
Code: (glbasic) [Select]
` GLB FBwriting to 3d arraycolumn order: 0.170 sec - 0.703 secrow order: 0.699 sec - 0.089 secreading from 3d arraycolum order: 0.137 sec - 0.65 secrow order: 0.120 sec - 0.086 secwriting to 2d arraycolumn order: 0.986 sec - 8.092 secrow order: 6.791 sec - 0.808 secreading from 2d arraycolumn order: 0.656 sec - 5.20 secrow order: 0.684 sec - 0.455 sec`
So basically in GLB is better to use column order Check my source code editor for GLBasic - link Update: 20.04.2020

#### hardyx

• Community Developer
• Dr. Type
•      • • Posts: 496 ##### Re: Array speeds
« Reply #6 on: 2012-Feb-28 »
In the C code the rows ordering is faster, because what I said about the contiguous memory. Is the expected result. Tics duration depends on the sytem, you can use 1/CLOCKS_PER_SEC constant to convert to seconds.

GLBasic results are different because arrays in GLB are dynamic structures (I think) and they may be ordered in memory in a different form than normal C arrays.
« Last Edit: 2012-Feb-28 by hardyx »

#### fuzzy70

• Community Developer
• Prof. Inline
•      • • Posts: 828
• Look left, Look right, LOOK OUT!! ##### Re: Array speeds
« Reply #7 on: 2012-Feb-28 »
I don't get it, why do you call it 'GLB Results', where everything you do in that last snippet is straight C?

Ocean

The GLB Results where from the GLB program at the start of the thread, I mentioned the speed difference but not the timings. my bad

Lee
"Why don't you just make ten louder and make ten be the top number and make that a little louder?"
- "These go to eleven."

This Is Spinal Tap (1984)