Does anyone know of a faster way to count the active bits in a byte, word or long than the following example method?
///////////////////////////////////////////////////////////////////////////////////////////////////////////////////////
// Count the 1s in an 8-bit number //
/////////////////////////////////////////////////////////////////////////////////////////////////////////////////////
FUNCTION Count8Bits: bit
LOCAL number
IF bit >= 128
DEC bit, 128
INC number
ENDIF
IF bit >= 64
DEC bit, 64
INC number
ENDIF
IF bit >= 32
DEC bit, 32
INC number
ENDIF
IF bit >= 16
DEC bit, 16
INC number
ENDIF
IF bit >= 8
DEC bit, 8
INC number
ENDIF
IF bit >= 4
DEC bit, 4
INC number
ENDIF
IF bit >= 2
DEC bit, 2
INC number
ENDIF
IF bit >= 1
DEC bit, 1
INC number
ENDIF
RETURN number
ENDFUNCTION
I'm an idiot. Should have thought about this for just a moment before throwing it out there. If I only need to do it in 8 bits (which happens to be the case) then keeping a look up table of 255 entries will be relatively quick & efficient. Feel free to laugh at this thread. :-[
Why not use bAND to compare the values and if you get a result of 1, it's true, if 0 , then it's false. Then just set a counter to add the values.
I don't make any guarantee that the code will work as intended as I don't have the machine I have GLBasic in front of me to test this out.
Since I'm also still learning GLBasic, please don't slap me too hard if I made any mistakes in my coding.
FUNCTION Count8Bits: bit
LOCAL number
IF bAND(bit,1) > 0
INC number
IF bAND(bit,2) > 0
INC number
IF bAND(bit,4) > 0
INC number
IF bAND(bit,8) > 0
INC number
IF bAND(bit,16) > 0
INC number
IF bAND(bit,32) > 0
INC number
IF bAND(bit,64) > 0
INC number
IF bAND(bit,128) > 0
INC number
RETURN number
ENDFUNCTION
Or combine it with the 'ASR(number, 1)' command and always check the first bit with bAND(number, 1).
Do this WHILE number <> 0.
Something like:
LOCAL number% = 125 // 0 1 1 1 1 1 0 1 <- Should give a count of 6
LOCAL count% = 0
WHILE number <> 0
IF bAND(number, 1) > 0 THEN INC count
number = ASR(number, 1)
WEND
Not sure if that will work, but in theory it should.
If it works, it should work with any size integer, not just limited to a byte or two.
[Edit]
It be a nice feature to add to GLBasic to have a temporary 'scratch pad' window that you can execute that is not part of your project! Then I could paste the above code just to see if it works, instead I'd have to close my project, create a new project, etc. Ha, nit-picking am I?
For the scratch-pad, I have a project called "_poo", that I Ctrl+A, Ctrl+V all the time.
I ran some tests, all using 32bit integers only and setting the number to measure always at 125. I made the different bit-count methods go 10^10 times each (so 10000000000 each) and measured how long it took them. Here are the somewhat unexpected results:-
Joint 1st place - Original code I posted with 8 IFs per bit count: 17017 milliseconds
Joint 1st place - XanthorXIII's 8 IFs bAND routine: 17036 (approximately identical)
2nd place - Lookup table: 42579 milliseconds
3rd place - Slydog's While Loop approach: 76624 milliseconds
A bit strange, no? :)
Well, mine being the slowest is no surprise, as it's in a while loop. I was going for simplicity.
But ya, the lookup table not being the fastest is a surprise, and being 2.5 times slower than the others.
Just curious if mine would be faster if you change this line:
IF bAND(number, 1) > 0 THEN INC count
to:
INC count, bAND(number, 1)
or
INC count, MOD(number, 2)
or
IF MOD(number, 2) = 1 THEN INC count
The bAND is a function call which takes extra processing cycles of course to run through but I'm pleased that it got close to what your code produced in speed. A mere 19 ms is probably not going to be missed by anyone I would think.
bAND is a bit slower when dealing with 64bit numbers. However, with 32bit integers bAND does not add CPU overhead. Nor do the ASL and ASR commands. For example, in tests I ran I found that if value and number are both 32bit integers:-
value = bAND(ASR(number, 8 ), 24)
is as quick as:-
value = number
This is the same for the iPhone too.
This is good news for me and the Line of Sight routine I'm working on and posting in the code snippets section. I can make it much more memory efficient by squeezing 4 8bit numbers into a single 32bit number and it won't be any slower in its calculations.