*by Carl Burch, Hendrix College, September 2012*

Using bit operators by Carl Burch is licensed under a Creative Commons Attribution-Share Alike 3.0 United States License.


1. Bit operators

1.1. Bitwise operators

1.2. Shift operators

1.3. Operator hierarchy

2. Bits as flags

3. Counting one bits

3.1. Two simple techniques

3.2. A clever technique

3.3. A cleverer technique

3.4. Comparing techniques


In this document, we'll study how a program can manipulate
computers' integer representation directly using *bit operators*.
This is sometimes pejoratively called
“bit twiddling,” but expert developers would
agree that mastering bit operators is a key skill
for proficient programmers.

1. Bit operators

We'll study the bit operators provided by C, though many other languages, including Python and Java, have copied the same symbols that C uses. The bit operators fall into two basic categories: the bitwise operators and the shift operators.

1.1. Bitwise operators

The most significant bit operators are the
**bitwise operators**, which perform
logic operations on all bits at once.
For example, the bitwise NOT operator `~`
(also called the **bitwise complement**)
takes the NOT of all the bits in an integer.
For example, if an **int** `b` holds the
value 00110111, then `~b`
flips all these bits to arrive at the value 11001000.

The other bitwise operators — bitwise AND `&`,
bitwise OR `|`, and bitwise XOR `^` —
work with two operands.
Each pairs corresponding bits in the two operands and
performs the logical operation on them.
For example, to compute the bitwise OR of 1100 and 1010,
we take the OR of the first bit from each number (1, 1),
followed by the OR of the second bits (1, 0),
then the third bits (0, 1), then the fourth bits (0, 0).
For each of these bit pairs, we include a 1 in the result if
the first bit *or* the second bit in the pair is 1 (or
if both are one). Thus, we end up with an output of 1110.

Similarly, the bitwise AND includes a 1 in the result if
the first bit *and* the second bit in the pair is 1;
the bitwise AND of 1100 and 1010 ends up being 1000.
And the bitwise XOR (short for *exclusive or*) includes a
1 if the first bit *or* the second bit is 1 (but
*excluding* the case that both bits are 1);
the bitwise XOR of 1100 and 1010 ends up being 0110.

The following illustrates these operators at work.

```c
int a = 0x23; /* 00100011 */
int b = 0x36; /* 00110110 */

printf("%x\n", ~a);    /* prints ffffffdc (the low byte is 11011100) */
printf("%x\n", a & b); /* prints 22 (hex for 00100010) */
printf("%x\n", a | b); /* prints 37 (hex for 00110111) */
printf("%x\n", a ^ b); /* prints 15 (hex for 00010101) */
```

1.2. Shift operators

C provides two **shift operators**:
the **left-shift operator** `<<`
and the **right-shift operator** `>>`.
These are binary operators, with the value to be shifted on the left
side of the operator, and the distance to shift the value on the right
side.

For example, the value of `5 << 2` is 20, since the
binary representation of 5 is 101_{(2)}, which we shift left
two spaces (and fill the empty spots with zeroes) to get
10100_{(2)} = 20_{(10)}.
Bits that go off the left end are simply
lost. (Many programmers would say they fall into the
“bit bucket” (or, more verbosely, the “great bit bucket in the
sky”), a mythical location where bits go when they
disappear.)

Notice that left-shifting by `n` bits is equivalent to
multiplying by 2^{n}, since we've essentially
multiplied the value of each 1 bit in the number by that much.
In the example of left-shifting 5 by 2, we get
5 ⋅ 2² = 20.

The right-shift operator works analogously: The value of
`20 >> 2` is 5, since 20's binary representation
is 10100_{(2)}, which shifted right two spots is
101_{(2)} = 5. Again, the bits pushed off the right
end simply fall into the bit bucket, so 22_{(10)} = 10110_{(2)} shifted
right twice is also 101_{(2)} = 5. Note that right-shifting is
similar to dividing by a power of 2 (and ignoring the remainder).

C doesn't specify how to handle right-shifting negative numbers.
There are two common alternatives.
In the first, called the **logical right shift**,
a right-shift of a negative number inserts zeroes
into the top end.
The second, the **arithmetic right shift**, also
does this for positive numbers and zero, but it inserts one bits
on the top end for a negative number.
Arithmetic right shifts are less intuitive,
but they correspond better to dividing by a power of 2, since
negative numbers remain negative:
If you take −24 and arithmetically right-shift it 2 places,
you get −6.
Because this is so useful,
programmers usually prefer it, and so
most compilers do an arithmetic right shift for `>>`.

(Java resolves this ambiguity by
adding a logical right-shift operator `>>>`
in addition to the arithmetic right-shift operator `>>`.
Thus if `c` holds 11…1101100 representing −20_{(10)},
a Java compiler would compute `c >> 2` to be 11…1111011
representing −5_{(10)},
but it would compute `c >>> 2` as 0011…1111011
representing 1073741819_{(10)}.)

1.3. Operator hierarchy

The bit operators fit into the operator precedence hierarchy as follows,
from highest precedence to lowest.

`~ ! + -` (unary operators)
`* / %`
`+ -`
`<< >>` (and, in Java, `>>>`)
`< > <= >=`
`== !=`
`&`
`^`
`|`
`&&`
`||`
`?:` (the conditional operator)
`= *= /= %= += -= <<= >>= &= ^= |=`

At the bottom level, you can see that the bitwise and shift operators
can be combined with the assignment operator, much like `+=`.

2. Bits as flags

So why would one want to use bitwise operators? One of the
most important applications is to use an integer to hold a set of
**flags**, where each bit corresponds to a different “fact.”
A prime example of this comes from the Linux file system,
where each file has a number called its **mode** that indicates
who can access the file.
If the 4's bit is set, then anybody can read the file; if the 2's
bit is set, then anybody can modify the file; if the 128's bit is
set, then the person who “owns” the file can modify it; and if
the 256's bit is set, then the person who “owns” the file can
read it. (The meanings for other bits are more complex.)

You might retrieve this integer in a program, but inevitably you will want to decode it. For example, what if our C program wants to check whether the owner has permission to modify a file? We would want to check whether the 128's bit is set. The most straightforward way to do this is to use the AND operator.

**if** ((`mode` & `128`) != `0`) {

By doing a bitwise AND with `128`, we are making it so
that only the 128's bit will remain — all other bits in
128 are 0, and anything ANDed with 0 is 0.
The result will be 0 if `mode`'s 128's bit is 0,
since all other bits become 0 when ANDed with 128;
and if `mode`'s 128's bit is 1, the result will be nonzero.
So our **if** statement compares the result of the bitwise AND with zero.

To a casual observer, the appearance of 128 here seems quite meaningless.
So the header files define constants that correspond
to each of the bits to allow us to refer to the bits in a less
obscure way. For owner write permission, the constant 128 is named
`S_IWUSR` — the `W` stands for write permission,
and `USR` stands for the user; the `S` and `I` are simply
to ensure that the constant's name is distinct from all other constants'
names. So we'd instead write the following.

**if** ((`mode` & `S_IWUSR`) != `0`) {

When we create a new file, we might want to indicate the permissions for this new file. For this, we would use the OR operator to combine several bits together.

`creat`(`"new_file.txt"`, `S_IWUSR` | `S_IRUSR` | `S_IROTH`);

(Yes, the function is named `creat`. Ken Thompson
was once asked if he would do anything
differently if he were to design UNIX again.
He responded, “I'd spell `creat` with an *e*.”)

This application of binary representation and bit operators
extends beyond just file permissions. Three additional examples:
An integer representation can indicate which of the
mouse buttons and modifier keys are down, so when a program is
responding to pressing a mouse button, it can retrieve a single number
indicating the current combination, like control-right-click or
shift-alt-left-click. Java's libraries also use flags for
indicating whether a font is plain, bold, italic, or bold italic.
And Python's `re` library uses flags
for options on how regular expressions should be interpreted.

3. Counting one bits

Let's now ponder a popular programmers' puzzler:
Write a function `countOnes`() that, given an
**int** value, returns the number of bits in that number that
are set to 1. For example, `countOnes(21)` should return
3, since 21's binary representation 10101_{(2)} has three
one bits in it. In fact, I visited Microsoft once interviewing for
an internship, and one interviewer spent our appointment discussing
various solutions to this question. (Many software companies like
to spend interviews gauging interest and mastery using problems
like this.)

3.1. Two simple techniques

The simplest technique is to right-shift the number 32 times, counting the number of times a one bit is shifted off the bottom.

```c
int countOnesA(int num) {
    int ret = 0;
    int cur = num;
    int i;
    for (i = 0; i < 32; i++) {
        ret += cur & 1;
        cur >>= 1;
    }
    return ret;
}
```

A similar alternative checks each bit of the number from right to left.

```c
int countOnesB(int num) {
    int ret = 0;
    int mask = 1;
    int i;
    for (i = 0; i < 32; i++) {
        if ((num & mask) != 0) ret++;
        mask <<= 1;
    }
    return ret;
}
```

Note the importance of the parentheses in the
**if** condition: If they were omitted, then the compiler would
interpret the expression as `num & (mask != 0)`,
since the operator hierarchy places `!=` above `&`.
Because C treats Boolean expressions as
integers, this would compile fine, but it would not be what is desired.
Many programmers always place parentheses around bitwise operations
because their precedence can result in such unexpected behavior.

3.2. A clever technique

In my interview at Microsoft, I composed essentially `countOnesB`(),
which impressed the interviewer well enough. But then he showed
me a different technique, which I have to admit is pretty clever.
It begins with the observation that,
when you subtract 1 from a number, all of the lowest bits change, up to and
including the lowest 1 bit; but the rest of the bits stay the same.
So if I do a bitwise AND of `n` with `n` − 1,
essentially I will remove the lowest one bit from `n`.

Once we observe this, we have only to write code that counts how many times we can remove the lowest one bit in this way before we reach a number with no 1 bits at all (i.e., 0).

```c
int countOnesC(int num) {
    int ret = 0;
    int cur = num;
    while (cur != 0) {
        cur &= cur - 1;
        ret++;
    }
    return ret;
}
```

3.3. A cleverer technique

Much later, I heard of yet another technique, which is more clever still.

```c
int countOnesD(int num) {
    int ret;
    ret = (num & 0x55555555) + ((num >> 1) & 0x55555555);
    ret = (ret & 0x33333333) + ((ret >> 2) & 0x33333333);
    ret = (ret & 0x0F0F0F0F) + ((ret >> 4) & 0x0F0F0F0F);
    ret = (ret & 0x00FF00FF) + ((ret >> 8) & 0x00FF00FF);
    ret = (ret & 0x0000FFFF) + ((ret >> 16) & 0x0000FFFF);
    return ret;
}
```

The first assignment statement begins with
`num & 0x55555555`. The appearance of `0x55555555` may
seem rather random, but recall that its binary
representation alternates between zeroes and ones: 01010101….
The bitwise AND of `num` with this number gives a version
of `num` where every other bit (the 2's bit, the 8's
bit, the 32's bit, …) has been removed and the remaining bits
(1's, 4's, 16's, …) are kept. We'll call this result `a`.

The right side of this first addition, which we'll call
`b`, involves right-shifting `num` once and then finding the bitwise AND
of this with `0x55555555`. The result includes all the bits omitted from `a`,
but shifted so that they line up with `a`'s bits,
in the 1's place, 4's place, 16's place, and so on.

This first addition then adds these two results `a`
and `b` together. The result, placed into `ret`, has each pair of bits
holding the sum of that pair of bits in the original value of `num`.
Let's look at an 8-bit example, 01110010.

```
num                    01110010
a = num & 0x55         01010000
b = (num >> 1) & 0x55  00010001
ret = a + b            01100001
```

The first two bits from `num` are 0 and 1,
and the first two bits of the result are 01, which is 0 + 1.
The next two bits from `num` are 1 and 1,
and the next two bits of the result are 10, which is 1 + 1.
And so on.

Notice what has happened here: In a single 32-bit addition operation, we have managed to perform 16 different bit-additions at the same time.

Now we proceed to add pairs of bits.
Finding `ret & 0x33333333` zeroes out every other pair of
bits, leaving only the bottom pair (1's and 2's places),
the third pair from the bottom (16's and 32's places), and so
on. And `(ret >> 2) & 0x33333333` finds all the remaining
pairs, shifted so they line up with the pairs kept by
`ret & 0x33333333`.
We now add these two groups of pairs together,
and we end up with each group of
*four* bits containing the sum of the two pairs previously
in those four bits. Let's continue with our example.

```
ret                    01100001
a = ret & 0x33         00100001
b = (ret >> 2) & 0x33  00010000
ret = a + b            00110001
```

Now the first four bits (0011) contain the sum of the first
pair of bits from the original `ret` (01) and the second pair (10);
and the second four bits (0001) contain the sum of the third pair (00) and
the fourth pair (01). In the original value of `num`,
three of the first four bits were 1, and now the first four
bits contain the number 3 in binary; likewise, the lower four
bits in the original `num` contained one 1 bit, and now
the lower four bits contain the number 1 in binary. This is not
a coincidence.

The `countOnesD`() function continues,
adding the groups of four bits together,
then the groups of eight bits together,
and finally the two groups of sixteen bits together.
At the end, we have the total number of 1 bits in the original number.

3.4. Comparing techniques

The `countOnesD`() technique is clever, but is it really faster? To
compare the efficiency of the techniques, we'll count
the total number of operations in each function, ignoring
assignments.

For `countOnesA`(), each iteration includes five operations: a less-than comparison for the **for** loop, a bitwise AND and an addition in the first line of the loop's body, a right-shift in the second line, and an addition as we increment `i` for the next iteration. With five operations per iteration for each of 32 bits, we have 32 ⋅ 5 = 160 operations.

For `countOnesB`(), each iteration includes five or six operations: a less-than comparison for the **for** loop, a bitwise AND and a not-equals comparison for the **if** statement, possibly an addition if we execute the **if** statement's body, a left-shift for the second line of the loop's body, and an addition as we increment `i` for the next iteration. We don't know how often the **if** statement's body will be executed, but let's estimate that half of the bits will be 1's. The number of operations then would be 16 ⋅ 5 + 16 ⋅ 6 = 176 operations.

For `countOnesC`(), each iteration includes four operations: a not-equals comparison for the **while** condition, a subtraction and a bitwise AND for the loop's first statement, and an addition as we increment `ret`. The number of iterations matches the number of 1 bits in the number, which we again estimate at 16 of the 32 bits. So there would be 16 ⋅ 4 = 64 operations.

Finally, for `countOnesD`(), each assignment statement involves four operations: two bitwise ANDs, a right-shift, and an addition. There are five assignment statements, for 5 ⋅ 4 = 20 operations.

Based on this analysis, we'd surmise that `countOnesD`() would be about three times faster than `countOnesC`(), which itself is almost three times faster than `countOnesA`() or `countOnesB`().

That's the theoretical approach, but you might wonder about the actual performance. I ran a test involving running each of these four functions across 131,072 random numbers, using a Linux computer with an Intel Core i5 750 processor [test code]. Here is the average time per function call.

`countOnesA`(): 68.8 ns
`countOnesB`(): 83.1 ns
`countOnesC`(): 16.0 ns
`countOnesD`(): 0.9 ns

While the times here don't match the exact ratios our theory
predicted, the conclusion remains the same:
The cleverness of `countOnesD`() pays off.