GLBasic forum
Codesnippets => Code Snippets => Topic started by: Qedo on 2011-Jun-09

new QSIN (QQSIN).
on my computer (notebook):
QQSIN = 230 ms
QSIN = 580 ms
SIN GLBasic= 1150 ms
Waiting for your timings.
[attachment deleted by admin]

Really well optimised! :good:

I use:
DIM qsin[360]; FOR x=0 TO 359; qsin[x]=SIN(x); NEXT
sinn=qsin[angle]
Can you add this to your code for testing speed?
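
Editor's note: as a rough illustration of the table idea above, here is a minimal sketch in Python rather than GLBasic, so that it can be run and checked anywhere. The names `SIN_TABLE` and `table_sin` are made up for this example, and the wrap-around for angles outside 0..359 is an addition; the forum snippet indexes the array directly.

```python
import math

# One entry per whole degree, 0..359 (360 entries in total).
SIN_TABLE = [math.sin(math.radians(d)) for d in range(360)]

def table_sin(angle_deg):
    """Look up the sine of an angle in degrees; the fraction is truncated."""
    # int() drops the fractional part, % 360 wraps any angle into 0..359.
    return SIN_TABLE[int(angle_deg) % 360]

print(table_sin(30), table_sin(30.9), table_sin(390))
```

The price of the speed is resolution: every angle between 30 and 31 degrees gets the same value, which is why the table versions discussed later in the thread use a finer step.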

M_QSIN (ampos code) = 380 ms
QQSIN = 230 ms
QSIN = 580 ms
SIN GLBasic= 1150 ms
Waiting for your timings.
[attachment deleted by admin]

Can I use that for distribution with the new SDK? Awesome work!

Don't forget about 'Lookuptable based SIN' by Ocean: http://www.glbasic.com/forum/index.php?topic=2855.msg20951#msg20951
on my pc:
m_qsin (Ampos) > 564ms
Sin GlBasic > 1772ms
QQSin > 488ms
QSin > 996ms
tSin (Ocean) > 538ms
QQSin isn't as precise as tSin, but for most calculations it would be ok.
Good job :]

feel free to use it :)

About accuracy: Draw a circle to test:
FUNCTION cirl: x, y, R
    LOCAL phi
    FOR phi = 0 TO 360 STEP 0.1
        SETPIXEL x + qsin(phi)*R, y + qsin(90+phi)*R, 0xffffff // sin(90+phi) = cos(phi)
    NEXT
ENDFUNCTION
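
Editor's note: the circle test can also be expressed as a number. The Python sketch below measures how far the plotted points stray from a perfect unit circle. Since the attachment with the original QSIN is gone, `approx_sin` is only a stand-in built from the parabola constants (1.2732, 0.4053, 0.775, 0.225) quoted later in this thread; the function names are invented for this example.

```python
import math

def approx_sin(deg):
    """Parabola-based sine approximation (stand-in for the attached QSIN)."""
    r = math.radians(deg % 360)          # wrap into 0..2*pi
    if r > math.pi:
        r -= 2 * math.pi                 # shift into -pi..pi
    y = 1.2732 * r - 0.4053 * r * abs(r)
    return 0.775 * y + 0.225 * y * abs(y)

def max_radius_error(steps=3600):
    """Largest deviation from radius 1 when a unit circle is drawn with approx_sin."""
    worst = 0.0
    for i in range(steps):
        phi = i * 360.0 / steps
        x = approx_sin(phi)
        y = approx_sin(phi + 90)         # sin(90 + phi) = cos(phi)
        worst = max(worst, abs(math.hypot(x, y) - 1.0))
    return worst

print(max_radius_error())
```

Multiply the result by the radius R in pixels to see whether the error would be visible on screen.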

On my machine the original SIN(x) is the fastest...

Cool function! QQSIN is fastest and draws a circle fine. It seems GLB SIN is faster only in debug.
My results (Win7):
(Debug OFF)
Time for 15300000 loops
M_QSIN time: 262.7019786 ms
SIN GLBasic time: 818.1101458 ms
new QQSIN time: 154.9435493 ms ***
old QSIN time: 326.0813655 ms
(Debug ON)
Time for 15300000 loops
M_QSIN time: 1819.38153 ms
SIN GLBasic time: 1046.089785 ms ***
new QQSIN time: 2216.111983 ms
old QSIN time: 2206.459872 ms

Oh you are right, it was in debug mode... in release mode QQSIN is the fastest.

Only just tried this now. Great going. :)

from the first project file..
// Start: Thursday, March 15, 2007
You've been holding this back for all this time? Anyway, pretty impressive math stuff, and I didn't know GLBasic can handle floating points of 35 decimals. I tried the updated QQSIN1.zip sources and here are my measurements:
desktop pc/windows, Pentium 4 CPU 3.20GHz
time for 15300000 loops
M_QSIN time: 460.5181186 ms
last value SIN GLBasic = 0.7071067812
SIN GLBasic time: 1469.167289 ms
last value SIN GLBasic = 0.7071067812
new QQSIN time: 360.3170347 ms
last value new QQSIN = 0.7077865858
old QSIN time: 838.0197293 ms
last value old QSIN = 0.7077653318
SIN GLBasic value 60.1 deg. = 0.8668967489
SIN GLBasic value 60.2 deg. = 0.8677654534
SIN GLBasic value 289.3 deg. = 0.9438009516
new QQSIN value 60.1 deg. = 0.875032347
new QQSIN value 60.2 deg. = 0.8683695847
new QQSIN value 289.3 deg. = 0.9441067712
And also tried it on my ipod, which is sloooooow in comparison
iPod 4th generation 8gb
time for 15300000 loops
M_QSIN time: 5265.441406ms
last value SIN GLBasic = .7071067691
SIN GLBasic time: 7395.682617ms
last value SIN GLBasic = .7071067691
new QQSIN time: 5005.396484 ms
last value new QQSIN = .7077865005
old QSIN time: 9279.095703ms
last value old QSIN = .7077654004
SIN GLBasic value 60.1 deg. = .8668967485
SIN GLBasic value 60.2 deg. = .8677654862
SIN GLBasic value 289.3 deg. = .9438010454
new QQSIN value 60.1 deg. = .8675031066
new QQSIN value 60.2 deg. = .8683695793
new QQSIN value 289.3 deg. = .9441066384

Wow, amazing function!
So, calculating the value each time is faster than a lookup table? Wouldn't have guessed that!
Interesting calculation, how did you figure that out?
Is there an equivalent COS function?

FUNCTION QQCOS: x
RETURN QQSIN(x+90)
ENDFUNCTION
Ciao

like always: It depends on ....
QSIN is faster as long as the angle values are small. But often I don't care about the size of the angle, e.g. when incrementing it by 1 in each loop for an endless rotation of a sprite. For large angle values the native SIN function from GLBasic is faster.
Example:
...
time3 = GETTIMERALL()
FOR i = 0 TO 200000
    z=SIN(i)
NEXT
diff3 = GETTIMERALL() - time3
counter =0
time4 = GETTIMERALL()
FOR i = 0 TO 200000
    za= QSIN (i)
NEXT
diff4 = GETTIMERALL() - time4
counter =0
time5 = GETTIMERALL()
FOR i = 0 TO 200000
    zs= QSIN_OLD (i)
NEXT
diff5 = GETTIMERALL() - time5
...
The bigger the angle values, the faster the native GLBasic SIN function is in comparison. You'll see the same effect with the precalculated SIN values (see the lookup sample from Ocean).
Btw: could be interesting to look how math functions could work:
http://stackoverflow.com/questions/345085/how-do-trigonometric-functions-work/345117#345117

@ Quentin
the new QQSIN for high (and low) angle.
Ciao
[attachment deleted by admin]

I tried to optimise QSIN/QQSIN and found this to be the fastest on any angle (at least on my machine):
FUNCTION QQSIN: x
    // Reduce any angle into the range 0..360
    IF x > 360
        DEC x, INTEGER( x / 360 ) * 360
    ELSEIF x < 0
        INC x, INTEGER( 1 - x / 360 ) * 360
    ENDIF
    // 0..180: parabola, then the 0.775/0.225 correction
    IF x < 180
        x = ( 0.02222144652331750907567718514381 - 0.00012346049003081125437778164739108 * x ) * x
        RETURN ( 0.775 + 0.225 * x ) * x
    ENDIF
    // 180..360: mirror the angle and flip the signs
    x = 180 - x
    x = ( 0.02222144652331750907567718514381 + 0.00012346049003081125437778164739108 * x ) * x
    RETURN ( 0.775 - 0.225 * x ) * x
ENDFUNCTION

Kanonet
You're right.
Your code is faster on average by 7%
well done
ciao

sorry Ocean, but I don't understand.
The angles outside 0-360 degrees are reduced to the first period but not converted to integer.
Can you give a practical example?

Not true ocean, i think you didnt understand the calculation.
But very interesting:
i put some DEBUGs in the code:
FUNCTION QQSIN: x
    DEBUG "start x: "+x
    IF x > 360
        DEC x, INTEGER( x / 360 ) * 360
    ELSEIF x < 0
        INC x, INTEGER( 1 - x / 360 ) * 360
    ENDIF
    DEBUG " calculated x: "+x
    IF x < 180
        x = ( 0.02222144652331750907567718514381 - 0.00012346049003081125437778164739108 * x ) * x
        DEBUG " sin(x): "+(( 0.775 + 0.225 * x ) * x)+CHR$(10)
        RETURN ( 0.775 + 0.225 * x ) * x
    ENDIF
    x = 180 - x
    x = ( 0.02222144652331750907567718514381 + 0.00012346049003081125437778164739108 * x ) * x
    DEBUG " sin(x): "+(( 0.775 - 0.225 * x ) * x)+CHR$(10)
    RETURN ( 0.775 - 0.225 * x ) * x
ENDFUNCTION
if i do:
QQSIN(60.3)
QQSIN(60.3+360)
QQSIN(60.3-360)
QQSIN(60.3+720)
QQSIN(60.3-720)
i get on debugger:
start x: 60.3 calculated x: 60.3 sin(x): 0.8691949954
start x: 420.3 calculated x: 60.3 sin(x): 0.8691949954
start x: -299.7 calculated x: 60.3 sin(x): 0.8691949954
start x: 780.3 calculated x: 60.3 sin(x): 0.8691949954
start x: -659.7 calculated x: 60.3 sin(x): 0.8691949954
so basically no difference.
But by coding DEBUG (QQSIN(60.3)-QQSIN(60.3+360))
i get: 1.110223025e-016
so suddenly there's a very, very small difference... but it's so small, i think it doesn't matter, 'cuz if you need high precision, you take the GLBasic SIN() and not one of the quicker functions.
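
Editor's note: the range-reduction step on its own is easy to check outside GLBasic. The following Python sketch mirrors the DEC/INC logic of the function above; `reduce_angle` is a name made up here, and Python's `int()` is assumed to behave like INTEGER() for the positive arguments that occur in both branches. It also hints at where a tiny residual difference can come from: 60.3 + 360 has to be rounded to a representable float before the reduction subtracts 360 again.

```python
def reduce_angle(x):
    """Mirror of the DEC/INC range reduction in the thread's QQSIN."""
    if x > 360:
        x -= int(x / 360) * 360        # x / 360 is positive, so int() acts like floor
    elif x < 0:
        x += int(1 - x / 360) * 360    # 1 - x / 360 is positive as well
    return x

for angle in (60.3, 60.3 + 360, 60.3 - 360, 60.3 + 720, 60.3 - 720):
    # repr() shows all digits, so any rounding residue becomes visible
    print(repr(angle), "->", repr(reduce_angle(angle)))
```

Any residue in the last digits is a property of binary floating point, not of the sine formula, so it would show up with any function that reduces the angle this way.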

I eliminated the integer cast in Kanonet's code (see below). It is now >10% faster (on my computer).
FUNCTION QQSIN1: x // Kanonet
    LOCAL xx%
    xx=x
    IF xx >= 360
        DEC x, ( xx / 360 ) * 360 // was DEC x, INTEGER( x / 360 ) * 360
    ELSEIF x < 0
        INC x, ( 1 - xx / 360 ) * 360 // was INC x, INTEGER( 1 - x / 360 ) * 360
    ENDIF
    IF x < 180
        x = ( 0.02222144652331750907567718514381 - 0.00012346049003081125437778164739108 * x ) * x
        RETURN ( 0.775 + 0.225 * x ) * x
    ENDIF
    x = 180 - x
    x = ( 0.02222144652331750907567718514381 + 0.00012346049003081125437778164739108 * x ) * x
    RETURN ( 0.775 - 0.225 * x ) * x
ENDFUNCTION
@Ocean
your code is very very fast (40% faster) congratulations.
Ingenious the & bAnd control in: return _cos_tbl_[ (int)(x * __d__) & __c__] ;
the other side of the coin is the increased use of memory of the lookup table.
Ciao

I am still amazed that there is a faster formula than just taking a value from a matrix... :O

Ocean's code uses a lookup table and takes a value from a matrix.
It's the fastest.
Ciao

Qedo, you forgot one float, it should be:
FUNCTION QQSIN: x // by Kitty Hello, Qedo, Kanonet
    LOCAL xx%=x
    IF xx > 360
        DEC x, INTEGER( xx / 360 ) * 360
    ELSEIF xx < 0
        INC x, INTEGER( 1 - xx / 360 ) * 360
    ENDIF
    IF x < 180
        x = ( 0.02222144652331750907567718514381 - 0.00012346049003081125437778164739108 * x ) * x
        RETURN ( 0.775 + 0.225 * x ) * x
    ENDIF
    x = 180 - x
    x = ( 0.02222144652331750907567718514381 + 0.00012346049003081125437778164739108 * x ) * x
    RETURN ( 0.775 - 0.225 * x ) * x
ENDFUNCTION
On my machine Ocean's code only needs 50%-75% of the time... but if you want to save the additional memory or don't want to use INLINE, i think our code is good too.

Kanonet, the float variable handles the negative angles between -1 and 0.
The integer xx would be truncated to 0 for them.
Also, the two INTEGER() casts aren't needed, because all the operands are integer variables.
I like "// by Kitty Hello, Qedo, Kanonet" :good:
ciao

@Ocean:
thanks for your explanation on platform compatibility, good to know, that it will work on every one. Btw. can users of the free version use INLINE?
I wait for your research on memory consumption, so we get more information for decisions.
@Qedo:
i don't think that truncating -0.5 to 0 is a problem, we still have 'good' precision. But what do you mean by "Also the two INTEGER() cast don't need because the whole operations are with integer variables"? We need INTEGER() because it's part of a modulo operation. Do you want to change something in that code, or is the current version the one in my last post?

Kanonet,
for the negative angles you are right.
for the uselessness of the INTEGER() try this code:
the results are the same
ciao
x=759.1
LOCAL xx%
xx=x
DEBUG ( xx / 360 ) * 360 + CHR$(10)
x=759.1
xx=x
DEBUG INTEGER( xx / 360 ) * 360 + CHR$(10)
END
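
Editor's note: the point of this test is that dividing one integer by another already discards the fraction, so a cast on top of it changes nothing. Here is the same check as a small Python sketch, assuming (as Qedo's result suggests) that GLBasic performs an integer division when both operands are integers. Python's `//` is used to model that; note that `//` rounds towards minus infinity, so the two lines are only guaranteed to agree for non-negative values.

```python
x = 759.1
xx = int(x)                        # like assigning to the integer variable xx%: 759

with_cast = int(xx / 360) * 360    # float division, then explicit truncation
without_cast = (xx // 360) * 360   # integer division truncates on its own

print(with_cast, without_cast)
```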
the code should be:
FUNCTION QQSIN: x // by Kitty Hello, Qedo, Kanonet
    LOCAL xx%=x
    IF xx > 360
        DEC x, ( xx / 360 ) * 360
    ELSEIF xx < 0
        INC x, ( 1 - xx / 360 ) * 360
    ENDIF
    IF x < 180
        x = ( 0.02222144652331750907567718514381 - 0.00012346049003081125437778164739108 * x ) * x
        RETURN ( 0.775 + 0.225 * x ) * x
    ENDIF
    x = 180 - x
    x = ( 0.02222144652331750907567718514381 + 0.00012346049003081125437778164739108 * x ) * x
    RETURN ( 0.775 - 0.225 * x ) * x
ENDFUNCTION

Oh you're right. It doesn't speed it up, but it looks nicer.^^
Btw on my machine QSIN is faster than tSIN for angles 0 to 360 with a stepping of 0.1. But in any other case tSIN is faster.

interesting and a bit odd, considering what few instructions are involved to retrieve a tSIN from the lookup table. What machine & OS are you using?
Sorry, forgot to answer you: Im running the machine, that you find in my signature, with Win7 Pro x64.

After a long time a new version, everything got calculated new, from scratch. It offers higher precision than previous one and slightly more speed (up to +10% compared to last version).
FUNCTION QQSIN: x // by Kanonet
    IF x<360 AND x>1
        IF x>180
            x = ( 1.523087104493429e-4 * x - 8.224670263788969e-2 ) * x + 11.10330492621077
            RETURN ( ( 1.111111111111111e-2 * x - 1.666666666666667e-1 ) * x + 1 ) * x - 1
        ENDIF
        x = ( 1.523087104493429e-4 * x - 2.741556799711035e-2 ) * x + 1.233700559869966
        RETURN ( ( 1.666666666666667e-1 - 1.111111111111111e-2 * x ) * x - 1 ) * x + 1
    ENDIF
    x=x*182.044444444444445
    LOCAL xx%=bAND(x, 65535)
    IF xx > 32767
        x = ( 4.595892692064137e-9 * xx - 4.517946286638592e-4 ) * xx + 11.10330492621077
        RETURN ( ( 1.111111111111111e-2 * x - 1.666666666666667e-1 ) * x + 1 ) * x - 1
    ENDIF
    x = ( 4.595892692064137e-9 * xx - 1.505982095546197e-4 ) * xx + 1.233700559869966
    RETURN ( ( 1.666666666666667e-1 - 1.111111111111111e-2 * x ) * x - 1 ) * x + 1
ENDFUNCTION
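
Editor's note: the post does not say where these constants come from. Numerically they match a Taylor expansion of the cosine around 90° (and around 270° for the upper half), so the following is offered as a plausible reading, not as the author's stated derivation. For 0 < x < 180, with x in degrees:

```latex
t = \frac{\pi}{180}\,x - \frac{\pi}{2}, \qquad
u = \frac{t^2}{2}
  = \underbrace{\tfrac{1}{2}\left(\tfrac{\pi}{180}\right)^2}_{1.5231\cdot 10^{-4}} x^2
  - \underbrace{\tfrac{\pi}{180}\cdot\tfrac{\pi}{2}}_{2.7416\cdot 10^{-2}}\, x
  + \underbrace{\tfrac{\pi^2}{8}}_{1.2337}

\sin x = \cos t \approx 1 - \frac{t^2}{2} + \frac{t^4}{24} - \frac{t^6}{720}
       = 1 - u + \frac{u^2}{6} - \frac{u^3}{90}
```

For 180 < x < 360 the same expansion is centred on 270°, using t = (π/180)·x − 3π/2 with the constant term 9π²/8 ≈ 11.1033 and sin x = −cos t, which explains the flipped signs in the first RETURN. Under this reading the approximation is most accurate near 90° and 270°; the first neglected term, t⁸/40320, is largest at 0° and 180°, where it is about 9·10⁻⁴.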

Thanks :booze:

My version QQSIN2.
Perhaps not the most precise but faster, almost 50%.
I attach a demo application.
Ciao
FUNCTION QQSIN2: x // by Qedo
    LOCAL xx%
    xx=x* 182.04166666666666666666666666667
    xx=bAND(xx , 65535)
    IF xx < 32767
        x = ( 0.00012206791406720535999456453271949 - 0.0000000037255156354636406577669912201935 * xx)* xx
        RETURN ( 0.775 + 0.225 * x ) * x
    ENDIF
    xx= 32767 - xx
    x = ( 0.00012206791406720535999456453271949 + 0.0000000037255156354636406577669912201935 * xx) * xx
    RETURN ( 0.775 - 0.225 * x ) * x
ENDFUNCTION
[attachment deleted by admin]

I must be doing something wrong or there is an incorrect setting somewhere as no matter what I do it is always 60fps :blink:
Lee

Fuzzy, maybe it's your hardware, or something? It runs at 560 fps on my laptop.
Qedo, 180°=32768 not 32767
Yes, my function was optimised for more accuracy at still good speed (on my machine it's 7 times faster than the original GLBasic SIN). If you want pure speed (and that's why we create these functions^^), then you need to stick with the old calculations (which are still based on Gernot's QSIN). How did you do your calculations, starting with Gernot's one again, or just adding to our last version? I did all calculations from Gernot's base again (to avoid adding up inaccuracies) and added some extra speed improvements.
Now finally QQSIN is 10 times faster than the original GLBasic SIN!
Of course more testing is always useful, and we optimised it so much that most new optimisations aren't necessarily useful on every machine. E.g. on my Windows tablet with its crappy AMD C-50 CPU, my accuracy function outruns every other function by far, even Ocean's C lookup table (which is pretty slow on that machine)! I have no idea why, because it shouldn't... (maybe it's 32-bit float vs 64-bit float?)
FUNCTION QQSIN: x // by Kitty Hello, Qedo, Kanonet
    LOCAL xx% = bAND( x * 182.04166666666666666666666666667 , 65535 )
    IF xx < 32768
        x = ( 5.790101276423631e-5 - 1.767113300842517e-9 * xx ) * xx
        RETURN ( 1.633843488304938 + x ) * x
    ENDIF
    xx = 32768 - xx
    x = ( 5.790101276423631e-5 + 1.767113300842517e-9 * xx ) * xx
    RETURN ( 1.633843488304938 - x ) * x
ENDFUNCTION

I found the culprit: in the driver settings, if I change vertical sync from "Application Controlled" to "Forced off" it runs faster. It must have changed after the last driver update, as I don't recall changing it myself.
DRAWRECT shows similar FPS in GLB Sin 211 fps & QQSin2 around 220 fps.
When I switch to Polyvectors, GLB sin 373 fps & QQSin 920 fps so around 2.4 times the speed up. The QQSin in your last post at 1010 fps so a slight speed up.
My machine specs are Core2Duo 2.13ghz, Nvidia 240GT & Win7 32bit
Lee

Of course more testing is always useful, and we optimised it so much that most new optimisations aren't necessarily useful on every machine. E.g. on my Windows tablet with its crappy AMD C-50 CPU, my accuracy function outruns every other function by far, even Ocean's C lookup table (which is pretty slow on that machine)! I have no idea why, because it shouldn't... (maybe it's 32-bit float vs 64-bit float?)
Seems a bit odd. Is the lookup table packed to 32 or 64 bit integers (depending on OS/processor) in the C code?
#pragma pack(push) /* current alignment to stack */
#pragma pack( integer_size ) /* set alignment to 32 or 64 bit integers depending on platform */
... C lookuptable here ...
#pragma pack(pop)
Correctly aligned tables are faster ....

@kanonet
How did you do your calculations, starting with Gernots one again, or just adding to our last version?
I started by Gernot and your code and in detail i have substituted the original numeric value in this way:
// 65535 = (2^16)-1
// 182.04166666666666666666666666667 = 65535/360
// 0.00012206791406720535999456453271949 = 0.02222144652331750907567718514381/182.04166666666666666666666666667
// 0.000000003725515635463640657766991220193 = 0.00012346049003081125437778164739108/(182.04166666666666666666666666667^2)
And you, how did you do your calculations?
How did you eliminate a multiplication?
Good
Ciao
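
Editor's note: the substitutions listed in this post can be re-checked with a few lines of arithmetic. The Python sketch below repeats them in double precision; the variable names are invented for the example, and the comparison values are simply the constants from QQSIN2 cut to double-precision length.

```python
import math

K = 65535 / 360                                   # scale factor used in QQSIN2
A = 0.02222144652331750907567718514381            # linear coefficient (degrees)
B = 0.00012346049003081125437778164739108         # quadratic coefficient (degrees)

a_scaled = A / K          # linear coefficient per table unit
b_scaled = B / K ** 2     # quadratic coefficient per table unit squared

print(K, a_scaled, b_scaled)
```

Folding the degree-to-unit scaling into the coefficients like this is what lets the function work directly on the integer xx without converting back to degrees first.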

@Qedo:
Oops, there you pointed me to an error in my code: it needs to be 65536/360, which is 182.044444444444444444...
So this is the code actually:
FUNCTION QQSIN: x // by Kitty Hello, Qedo, Kanonet
    LOCAL xx% = bAND( x * 1.820444444444444e2 , 65535 )
    IF xx < 32768
        x = ( 5.790101276423631e-5 - 1.767113300842517e-9 * xx ) * xx
        RETURN ( 1.633843488304938 + x ) * x
    ENDIF
    xx = 32768 - xx
    x = ( 5.790101276423631e-5 + 1.767113300842517e-9 * xx ) * xx
    RETURN ( 1.633843488304938 - x ) * x
ENDFUNCTION
To my calculations:
I went back to the original maths from Gernot and added my improvements. I did not add up to already transformed decimals, cuz this would lead into less precision, without winning speed. Here you see my full calculations (for x<180), including how i got rid of this multiplication:
x = (1.2732 * (xx*360./65536./57.296) - 0.4053 * (xx*360./65536./57.296) * (xx*360./65536./57.296) )*SQR(0.225)
RETURN (0.775/SQR(0.225) + x)*x
@Ketil:
I can't code C, so I can't answer your question. Maybe you're right and something in his C code isn't perfect, because if I write my own lookup-table-based sin function in GLBasic, it's almost as fast as the C one on my main machine and way faster on my tablet, but still my QQSIN is the fastest one on this machine (on my main machine Ocean's C one is the fastest - at least most of the time).
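
Editor's note: for readers who want to check the accuracy of this version without GLBasic, here is a Python port as a minimal sketch. It assumes that bAND() truncates its float argument towards zero before masking, which is what `int()` does in Python; the name `qqsin` and the test range are choices made for this example.

```python
import math

def qqsin(x):
    """Python port of the thread's final QQSIN (angle in degrees)."""
    # Scale so that 65536 units = 360 degrees, truncate, keep the low 16 bits.
    xx = int(x * 182.0444444444444) & 65535
    if xx < 32768:                       # 0..180 degrees
        y = (5.790101276423631e-5 - 1.767113300842517e-9 * xx) * xx
        return (1.633843488304938 + y) * y
    xx = 32768 - xx                      # 180..360 degrees, xx becomes negative
    y = (5.790101276423631e-5 + 1.767113300842517e-9 * xx) * xx
    return (1.633843488304938 - y) * y

# Largest absolute error against math.sin for -720..720 degrees in 0.1 steps.
worst = max(abs(qqsin(d / 10) - math.sin(math.radians(d / 10)))
            for d in range(-7200, 7201))
print(worst)
```

The 16-bit mask is what makes the range reduction almost free: it replaces the IF/DEC/INC logic of the earlier versions with a single bitwise AND, at the cost of quantising the angle to 1/65536 of a turn.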

Hmm more speed, i like this idea :)
Question though...what are the limitations for acceptable input value?
One more question...is this the fastest routine?
FUNCTION QQSIN: x // by Kitty Hello, Qedo, Kanonet
    LOCAL xx% = bAND( x * 1.820444444444444e2 , 65535 )
    IF xx < 32768
        x = ( 5.790101276423631e-5 - 1.767113300842517e-9 * xx ) * xx
        RETURN ( 1.633843488304938 + x ) * x
    ENDIF
    xx = 32768 - xx
    x = ( 5.790101276423631e-5 + 1.767113300842517e-9 * xx ) * xx
    RETURN ( 1.633843488304938 - x ) * x
ENDFUNCTION

There are no known limitations, even very big positive or negative values are ok (as long as they dont cause a buffer overflow). But keep in mind, that it is not as accurate as the original GLB SIN function. So if you need to calculate very small changes or want to add up many sin values, you better stick with the original SIN.
If you like the faster sin function, you may also like my libQMATH which contains faster calculations for many math functions. see here: http://www.kanonet.de/downloads/libqmath

thanks for the lib
why does atan have only one input and output in that lib, and glbasic has 2 in and out?

aside from atan..
this is in qatan2 causing an error:
RETURN IIF( x>180, x-180, x )

Yeah, ATAN normally only has one parameter and not two. If you want two parameters, then use ATAN2. It's called this way in most languages; GLBasic only offers ATAN2 but calls it ATAN. I decided to include both, so you can use what you need.
About your error:
I'm using V11 beta, and IIF is a new command in V11. I used this command without thinking about users with V10, sorry. To use it with V10, just replace this line with this:
IF x>180 THEN RETURN x-180
RETURN x
Or get new version from my website.

Hmm, how do i calculate this?
ATAN(X,Y)
I'm using this in the PE btw.. for missile guidance ;P

What do you mean? If you want to replace ATAN(x,y), just use my qATAN2(x,y). They do exactly the same. Or did i understand you wrong?

Ah ok i get it now :)