Basic Bit Manipulation

7,371 views

Transacting on Ethereum is extremely expensive, so sometimes, it makes sense to aim for more granular control over your code. One way to achieve this is to utilize bitwise operations, like in the bit twiddling techniques I've posted on X @fiveoutofnine:

These operations/tricks are pretty different from other types of programming, so it may be fairly esoteric. Although I tried to give an application example with each post, it's still hard to see where/how to apply them yourself. A good example of a project with more extensive applications is my on-chain chess engine, but the contracts are probably a bit hard to follow for beginners.

In this post, I attempt to explain some of these tricks and explain my thought process when applying them with a much simpler game: Tic-tac-toe.

Bitwise operators

Feel free to skip this section if you are already familiar with bitwise operators.

First, here are the basic bitwise operators used in the following Tic-tac-toe example: >>, <<, &, |, and ^.

>> (SHR)

Digits are moved, or shifted, to the right. Zeros replace the discarded bits.
Source

Shifting an integer $m$ right by $n$ bits results in $\left\lfloor\frac{m}{2^{n}}\right\rfloor$ .

<< (SHL)

Digits are moved, or shifted, to the left. Zeros replace the discarded bits.
Source

Shifting an integer $m$ left by $n$ bits results in $m\cdot 2^{n}$ .

& (AND)

For each pair of digits, compute 1 if both are 1; otherwise, compute 0.

Since & only "accepts" the bit if both numbers' bits are 1, you can construct a bitmask to select which bits of a number you want to read. For example, for some number $m$ , if we want to read every other bit, we can compute m & 1010...1010.

| (OR)

For each pair of digits, compute 1 if either is 1; otherwise, compute 0.

Since | "accepts" the bit as long as either numbers' bits are 1, you can write a number's bits into another number, as long as the first number's bits in the same position are all 0.

^ (XOR)

For each pair of digits, compute 1 if they differ. Otherwise, compute 0.

^ can be helpful in conjunction with & and | when overwriting a number's bits with another set of bits.

Try working through the examples and notes to gain some foundational intuition.

Numbers in this section with only 0s and 1s are in binary representation unless otherwise specified.

Overview of Tic-tac-toe

Before writing the contract, let's lay out the basic things required for a game of Tic-tac-toe. It's a 2-player game, and each player only has 1 type of move they can play (i.e. placing their respective symbol into an unoccupied square). When a move is inputted by a player, we have to validate a few things:

The player is 1 of the 2 players in the game.
It's the player's turn to play a move.

Next, we have to check that the move is legal. If it is, we apply the move.

Finally, we must check if the game has ended (i.e. either player has formed 3 consecutive dots).

Writing this out as pseudocode in a function called playMove, we have:

function playMove(position) {
    // Player validation
    if ("player is not one of the 2 in the game") revert Error();
    if ("it is not the player's turn") revert Error();

    // Move validation
    if ("the position has already been played in the board") revert Error();

    // Apply move.
    applyMove(position)

    // Game ending scenario
    if ("either player has formed 3 consecutive dots") {
        ("set the winner and end the game")
    }
}

Board/game state representation

Now that we have an idea of all the information to store, let's come up with a representation in code. Keep in mind that the design should allow for efficient computation: the code will read the data and compute with it, so operations such as accessing, decoding, and storing should be cheap. Once we achieve both, since the game has to be stored fully on-chain and storage costs are (generally) the most expensive operation on the EVM, we can look towards using as few bits as possible.

Base representation

A naive approach might look something like:

1
struct Game {
2
    uint8[9] board; // 0 = empty, 1 = player 0, 2 = player 1
3
    bool turn; // false = player 0, true = player 1
4
    bool hasGameEnded;
5
    bool playerZeroWon; // only set if `hasGameEnded` is true
6
    bool playerOneWon; // only set if `hasGameEnded` is true
7
    address playerZero;
8
    address playerOne;
9
}

If we were satisfied with this, we would make sure struct Game was tightly packed, then proceed.

However, uint8[9] board uses a lot of unnecessary reads/writes, and we don't need 8 bits to store data for a single position on the board. Since there are only 3 valid states, 2 bits are enough ( $2^2=4$ ). Next, bool turn can easily be 1 bit: 0 if false, 1 if true. Lastly, because bool hasGameEnded, bool playerZeroWon, and bool playerOneWon work in conjunction to express the state of the game (only 4 possibilities), we can combine them into 2 bits:

the game is ongoing (00),
player zero has won (10),
player one has won (11),
and game is a draw (01).

Altogether, we use just 21 bits (down from $8 \cdot 9 + 8 + 8 + 8 + 8 = 104$ when using the struct above) to represent everything:

┌─────────────────────────── First 2 * 9 = 18 bits denote the board
│                   ┌─────── Next 2 bits denote whether game is ongoing
├──────────────────┐├──┐┌─┬─ Last bit denotes whose turn it is to play
[000000000000000000][00][0]

Bitpacking games into `uint256`s instead of using structs

If we were to use a struct to store the game states, we'd have to use uint24 because it's the smallest unit that fits 21 bits, which would waste 3 bits. Instead, let's store the players' addresses in 2 other mappings and have a third mapping exclusively to store game states:

1
mapping(uint256 => address) public playerZeros;
2
mapping(uint256 => address) public playerOnes;
3
mapping(uint256 => uint256) public games;

This allows us to bitpack consecutive games' data together and store it slightly cheaper.

The idea is to reduce as many zero slot writes as possible because writing to zero slots costs 20k gas, whereas writing to nonzero slots only costs 5k gas.

We achieve this by bitpacking as many games' data into a uint256 (in this case, $\left\lfloor\frac{256}{21}\right\rfloor=12$ , so we can fit up to 12 games in 1 uint256). This way, only every $12n^{\mathrm{th}}$ game will cost 20k gas to store, and every other game will only cost 5k gas. Reading and writing remain efficient:

TicTacToe.sol

1
// Each `uint256` value has 12 games bitpacked into it.
2
mapping(uint256 => uint256) public gameStates;
3

4
// Each game is 21 bits, so the desired mask is
5
// `0b111111111111111111111 = 0x1FFFFF` (21 `1`s).
6
uint256 internal constant GAME_MASK = 0x1FFFFF;
7

8
function retrieveGame(uint256 _gameId) external view returns (uint256) {
9
    // Divide by 12 and take mod 12 because 12 games are bit packed into
10
    // each `uint256`.
11
    uint256 key = _gameId / 12;
12
    uint256 index = _gameId % 12;
13

14
    // Bit shift by `21 * index` to get to the desired game.
15
    return (gameStates[key] >> (21 * index)) & GAME_MASK;
16
}
17

18
function writeGame(uint256 _gameId, uint256 _gameState) external {
19
    uint256 key = _gameId / 12;
20
    uint256 index = _gameId % 12;
21

22
    // Read the entire `uint256` into memory first because it's cheaper to make all
23
    // modifications in memory before writing the final result to storage.
24
    uint256 gameUint256 = gameStates[key];
25

26
    // This process overwrites whatever exists in the desired 21 bit slot with
27
    // `_gameState` (see `XOR` in "Basic Bit Operations").
28
    gameUint256 &= type(uint256).max ^ (GAME_MASK << index);
29
    gameUint256 |= _gameState << (index * 21);
30

31
    // Write to storage.
32
    gameStates[key] = gameUint256;
33
}

Final representation

Now, we have 3 mappings to keep track of all games' data and players. With this set-up, storing each new game to storage costs roughly

$20000 + 20000 + \frac{20000 + 5000 \cdot 12}{13} \approx 46154\text{ gas}$

on average (2 zero slot writes for each of the players' addresses, and $\approx 6154$ gas for the game's state).

Since addresses are only 160 bits, we can remove a write to storage by bitpacking the game's state with one of the players' addresses. Let's arbitrarily pick player zero's address for this. Combined, we have:

1
// The first 21 bits are games' data, and the remaining 160 bits are player 0
2
// addresses.
3
mapping(uint256 => uint256) public playerZerosAndGames;
4
mapping(uint256 => address) public playerOnes;

Again, reading and writing remain efficient:

TicTacToe.sol

1
uint256 internal currentGameId;
2

3
function createNewGame(address _playerZero, address _playerOne) {
4
    uint256 gameId = currentGameId++;
5
    // The first 21 bits are the game. We don't need to set anything for this
6
    // because the all bits are 0 for the starting position.
7
    playerZerosAndGames[gameId] = uint256(uint160(_playerZero));
8
    playerOnes[gameId] = _playerOne;
9
}
10

11
function retrieveGame(uint256 _gameId) external view returns (uint256) {
12
    return playerZerosAndGames[_gameId] >> 160;
13
}
14

15
function writeGame(uint256 _gameId, uint256 _gameState) external {
16
    uint160 playerZero = uint160(playerZerosAndGames[_gameId]);
17
    // Bit shift `_gameState` left by 160 to make space for player 0's address.
18
    playerZerosAndGames[_gameId] = (_gameState << 160) | playerZero;
19
}

In this particular case, since we need a minimum of 2 storage slots per new game anyway (2 player addresses already exceed 2 32-byte words), cutting down on the number of bits used to represent the game is not necessary—all else equal, using 96 bits ( $256 - 160 = 96$ ) is just as efficient. However, I thought I'd describe the process here anyway, for when it might be applicable.

Fine-tuning the representation

For Tic-tac-toe, the representation above is good enough. But if it were a different application, there might be extra considerations to take. These are highly dependent on the situation, so here are some common patterns:

Frequently accessed data

The ordering of the first/last $n$ bits might help save on computation: they only require one of >> or & to access. Thus, if some information needs to be accessed more frequently than the others, it'd probably be wise to place it at the start/end.

Structuring data to remove expensive computations

The main example here is more precisely described as a "branchless optimization," but the more general pattern of using the value itself to compute the result can be quite useful for other situations too.

An expensive computation on the EVM (and most computers) are branch instructions. Keeping this in mind, it may be wise to structure your representation to remove them. For example, suppose a 3-bit value portion maps to 8 different outputs as follows:

1
if      (portion == 0) return  3;
2
else if (portion == 1) return  7;
3
else if (portion == 2) return 12;
4
else if (portion == 3) return  6;
5
else if (portion == 4) return 13;
6
else if (portion == 5) return  9;
7
else if (portion == 6) return  3;
8
else                   return 15;

Instead, we can use portion as a value itself as part of the computation! The following code block is equivalent to the series of if/else statements above:

1
//   value << wordSize * index
2
// ----------------------------
3
//   (  15 <<        4 *     7)
4
// | (   3 <<        4 *     6)
5
// | (   9 <<        4 *     5)
6
// | (  13 <<        4 *     4)
7
// | (   6 <<        4 *     3)
8
// | (  12 <<        4 *     2)
9
// | (   7 <<        4 *     1)
10
// | (   3 <<        4 *     0)
11
// = 0xF39D6C73.
12
//
13
// 15 is just 0b1111 (i.e. masks the 4 least significant bits).
14
// `portion << 2` is the same thing as `portion * 4`.
15
// We "multiply" by 4 because each word is 4 bits long.
16
return (0xF39D6C73 >> (portion << 2)) & 15;

For a real-world example, see this post.

On a related note, you can often structure your representation and assign values in a way that allows for efficient group conditional checks (i.e. form it in a way that makes it easy to categorize). Some simple examples:

if the value is even, perform x, otherwise perform y;
if the value is greater than m, perform x, otherwise perform y;
if the value is a multiple of 8, perform x, otherwise perform y;
if the value's 2nd to 5th bits are 010, perform x, otherwise perform y.

You can also combine one or more of these to form composite checks. For example, if the value is even and greater than m, perform x, otherwise perform y.

For heavy usage of this concept in a live deployment, see generateMoves from Chess.sol.

Pedantic bit packing: is 21 bits good enough?

You may have realized that we use 2 bits ( $2^2 = 4$ total possibilities) per position for the Tic-tac-toe example when there are only 3 possible states (empty, occupied by player 0, or occupied by player 1). Why not save 3 more bits ( $222222222_3$ is 15 bits long) by using a base 3 representation for the board instead? We can then use just 18 bits ( $15 + 3 = 18$ )—isn't this better?

Not really. Besides the point that we bitpack the game data with player 0's address, consider the following: 18 bits allows us to store 14 ( $\left\lfloor\frac{256}{18}\right\rfloor=14$ ) games per uint256, which means it costs $\frac{20000 + 5000\cdot 13}{14}\approx 6071$ gas on average to store a game's data. Comparatively, 21 bits allows us to store 12 games per uint256, or $\frac{20000 + 5000\cdot 11}{12} = 6520$ gas on average. Although 449 gas is "saved" on the storage side, decoding and encoding base 3 is considerably more expensive with many division and mod operations (vs almost entirely using bit operations)¹.

Valid reasons (maybe) to be that pedantic about the number of bits is that (1) it results in major gas savings, or (2) computation complexity for encoding/decoding is not an issue (e.g. it's only encoded/decoded in a view function). On-chain art is a category of projects/communities that are open to excessive efforts to compress storage, as well as a realm where it often makes sense to.

¹ We also later end up using the extra slot to make a slight optimization in checking whether a player won. See subsection Checking if a player won.

User validation

Recall that our player 0 is stored in mapping(uint256 => uint256) public playerZerosAndGames, where the first 160 bits represent player 0, and player 1 is stored in mapping(uint256 => address) public playerOnes. Thus, to check that msg.sender is either player 0 or player 1, we have:

1
address playerZero = address(uint160(playerZerosAndGames[_gameId]));
2
address playerOne = playerOnes[_gameId];
3

4
require(msg.sender == playerZero || msg.sender == playerOne);

Next, to check whether it's msg.sender's turn to play, recall that the last bit of the board denotes whose turn it is to play: 0 if it's player 0's turn, and 1 if it's player 1's turn. In bit operations: if board & 1 == 0, it is player 0's turn, and if board & 1 == 1, it is player 1's turn.

1
uint256 playerZeroAndGame = playerZerosAndGames[_gameId];
2
// Store it in memory to compute upon, then push it to storage later all at once.
3
// We access the game by bitshifting `playerZeroAndGame` to the right by 160.
4
uint256 inMemGame = playerZeroAndGame >> 160;
5
address playerZero = address(uint160(playerZeroAndGame));
6

7
// `inMemGame & 1` retrieves whose turn it is to play.
8
require(
9
    // `msg.sender` is player 0, and it is their turn.
10
    (msg.sender == playerZero && inMemGame & 1 == 0)
11
        // `msg.sender` is player 1, and it is their turn.
12
        || (msg.sender == playerOnes[_boardId] && inMemGame & 1 == 1)
13
);

Move validation

After we have verified that msg.sender is the correct address to play, we must verify whether the move they inputted is valid. There are 2 necessary checks for this:

the game is not over,
and the position they inputted must be empty on the board.

Recall that if the 3rd to 2nd last bits are 00, the game is still ongoing. Therefore, checking whether the game is not over is as simple as reading those 2 bits and checking if they equal 0:

1
// 3 is just 0b11 (i.e. masks 2 bits)
2
require((inMemGame >> 1) & 3 == 0);

Next, let _index denote the user's input, where they correspond to the following positions of the board:

Then, accessing the position corresponding to _index is just a matter of shifting to the correct position and, once again, masking 2 bits with 3 (0b11). This value must equal 0 (i.e. the square is empty) for the move to be valid:

1
// First, bit shift right by 3 to get the relevant bits (i.e. just the bits
2
// denoting the board). Then, we bit shift to the correct position by shifting
3
// right by `_index << 1` bits (note: each position takes up 2 bits).
4
require(((inMemGame >> 3) >> (_index << 1)) & 3 == 0);

Check game ending state

Intuition

There are 4 possible ways for a player to win:

form a horizontal line of 3,
form a vertical line of 3,
form a diagonal line of 3 from the bottom right corner to the top left corner,
and form a diagonal line of 3 from the bottom left corner to the top right corner.

If either player has satisfied one of those, the game is over. How do we determine that?

Well, for each winning pattern, we can select the bits that correspond to the squares in that pattern. Then, if these bits all represent squares played entirely by player 0 or entirely by player 1, the game is over.

Winning pattern bitmasks

Recall from earlier that we can select bits with the & operator and a mask. Let's denote

HORIZONTAL for rows of 3,
VERTICAL for columns of 3,
BR_TO_TL_DIAGONAL for the diagonal from the bottom right corner of the board to the top left corner,
and BL_TO_TR_DIAGONAL for the diagonal from the bottom left corner of the board to the top right corner.

The following code block shows how they're computed (remember each square is represented by 2 bits):

1
//     1. 00 00 00
2
//        00 00 00
3
//        11 11 11
4
//     => 0b111111 = 0x3F
5
uint256 internal constant HORIZONTAL = 0x3F;
6
//     2. 00 00 11
7
//        00 00 11
8
//        00 00 11
9
//     => 0b11000011000011 = 0x30C3
10
uint256 internal constant VERTICAL = 0x30C3;
11
//     3. 11 00 00
12
//        00 11 00
13
//        00 00 11
14
//     => 0b110000001100000011 = 0x30303
15
uint256 internal constant BR_TO_TL_DIAGONAL = 0x30303;
16
//     4. 00 00 11
17
//        00 11 00
18
//        11 -- --
19
//     => 0b1100110011 = 0x333
20
uint256 internal constant BL_TO_TR_DIAGONAL = 0x333;

Applying the bitmasks

The position we apply the mask on is also important. For example, if we apply the horizontal bit mask at the square of index 0, we (correctly) select the following squares marked by x:

Now, suppose we apply it at the square of index 1:

This is incorrect. The correct positions to apply them for each mask are shown below:

Bitmask	Applied at	Selected bits
`HORIZONTAL`


`VERTICAL`


`BR_TO_TL_DIAGONAL`
`BL_TO_TR_DIAGONAL`

Try working through them on paper!

Checking if a player won

For each selected square, the corresponding pair of bits in the mask are 0b11, or 3 in decimal representation. All other bits (i.e. those that aren't included by the mask) are 0. Then, if the selected portion denotes squares entirely played by either player, it should be a scalar multiple of the mask that was used:

If 0b01 (1 in decimal) represents a square played by player 0, and all the selected squares were played by player 0, the selected portion should equal $\frac{\mathrm{MASK}}{3}$ .
Similarly, if 0b10 (2 in decimal) represents a square played by player 1, and all the selected squares were played by player 1, the selected portion should equal $\frac{2\cdot\mathrm{MASK}}{3}$ .

This is fairly efficient to check, but we can do slightly better. If we use 0b00 for empty squares, 0b01 for player 0, and 0b10 for player 1, 0b11 represents nothing. What happens if we use 0b11 instead of 0b10 for player 1's moves? The selected portion should equal $\mathrm{MASK}$ , instead of $\frac{2\cdot\mathrm{MASK}}{3}$ ! This is more efficient to check, so let's go with this.

Implementation

x denotes the position the code is at, and - denotes bits that have been bitshifted off (and thus no longer relevant).

TicTacToe.sol

1
/// @return 0 if ongoing, 1 if player 0 has won, and 2 if player 1 has won.
2
function checkGameState(uint256 _board) internal pure returns (uint256) {
3
    // Last 4 bits denote information about whether the game is over and whose
4
    // turn it is to play, so we ignore them.
5
    _board >>= 4;
6
    // • • •
7
    // • • •
8
    // • • x
9

10
    uint256 gameState = checkGameStateHelper(_board, HORIZONTAL);
11
    if (gameState != 0) return gameState;
12

13
    gameState = checkGameStateHelper(_board, VERTICAL);
14
    if (gameState != 0) return gameState;
15

16
    gameState = checkGameStateHelper(_board, BR_TO_TL_DIAGONAL);
17
    if (gameState != 0) return gameState;
18

19
    _board >>= 2;
20
    // • • •
21
    // • • •
22
    // • x -
23
    gameState = checkGameStateHelper(_board, VERTICAL);
24
    if (gameState != 0) return gameState;
25

26
    _board >>= 2;
27
    // • • •
28
    // • • •
29
    // x - -
30
    gameState = checkGameStateHelper(_board, VERTICAL);
31
    if (gameState != 0) return gameState;
32
    gameState = checkGameStateHelper(_board, BL_TO_TR_DIAGONAL);
33
    if (gameState != 0) return gameState;
34

35
    _board >>= 2;
36
    // • • •
37
    // • • x
38
    // - - -
39
    gameState = checkGameStateHelper(_board, HORIZONTAL);
40
    if (gameState != 0) return gameState;
41

42
    _board >>= 6;
43
    // • • x
44
    // - - -
45
    // - - -
46
    gameState = checkGameStateHelper(_board, HORIZONTAL);
47
    if (gameState != 0) return gameState;
48

49
    return 0;
50
}
51

52
function checkGameStateHelper(uint256 _board, uint256 _mask)
53
    internal pure
54
    returns (uint256)
55
{
56
    return (_board & _mask) == _mask
57
        ? 2 // All squares are player 1's.
58
        : (_board & _mask) == (_mask / 3)
59
            ? 1 // All square are player 0's.
60
            : 0; // No win situation.
61
}

Check game ending state w/ while loop (bonus)

This section is just for fun (and has some cool bit twiddling patterns).

The code above looks kinda ugly/redundant. Let's use a while loop instead, so we don't have checkGameStateHelper so many times. We can do this by bitpacking the win-pattern-checking masks consecutively into a uint256 masks (in the order they are used), as well as bitpacking the number of bits the board is shifted by after each check into a uint256 boardShifts.

Then, starting at index 0, we can iterate through the board masks and apply the board shifts after each check to make our way through the entire board as follows:

Board	Bitmask	Check	Shift
	`HORIZONTAL`		0
	`VERTICAL`		0
	`BR_TO_TL_DIAGONAL`		2
	`VERTICAL`		2
	`VERTICAL`		0
	`BL_TO_TR_DIAGONAL`		2
	`HORIZONTAL`		6
	`HORIZONTAL`		0

Computing `masks` and `boardShifts`

Since the largest mask is 18 bits, and $9\cdot 18 < 256$ , we can safely reserve 18 bits for each "word" within uint256 masks:

\begin{align*} &\texttt{HORIZONTAL} \\ &\mathbin{|}(\texttt{VERTICAL} &\texttt{<< } &18\cdot 1)\\ &\mathbin{|}(\texttt{BR\_TO\_TL\_DIAGONAL} &\texttt{<< } &18\cdot 2)\\ &\mathbin{|}(\texttt{VERTICAL} &\texttt{<< } &18\cdot 3)\\ &\mathbin{|}(\texttt{VERTICAL} &\texttt{<< } &18\cdot 4)\\ &\mathbin{|}(\texttt{BL\_TO\_TR\_DIAGONAL} &\texttt{<< } &18\cdot 5)\\ &\mathbin{|}(\texttt{HORIZONTAL} &\texttt{<< } &18\cdot 6)\\ &\mathbin{|}(\texttt{HORIZONTAL} &\texttt{<< } &18\cdot 7)\\ &=\texttt{0xFC003F00CCC186106187030306184003F} \end{align*}

Next, since the largest shift is 6, we can safely use 3 bits (0b111 > 6) for each "word" within uint256 boardShifts:

\begin{align*} &0\\ &\mathbin{|}(0 \texttt{ << } 3\cdot 1)\\ &\mathbin{|}(2 \texttt{ << } 3\cdot 2)\\ &\mathbin{|}(2 \texttt{ << } 3\cdot 3)\\ &\mathbin{|}(0 \texttt{ << } 3\cdot 4)\\ &\mathbin{|}(2 \texttt{ << } 3\cdot 5)\\ &\mathbin{|}(6 \texttt{ << } 3\cdot 6)\\ &=\texttt{0x190480} \end{align*}

Implementation

TicTacToe.sol

1
function checkGameStateWithWhile(uint256 _board) internal pure returns (uint256) {
2
    // Last 4 bits denote information about whether the game is over and whose turn
3
    // it is toplay, so we ignore them.
4
    _board >>= 4;
5

6
    uint256 masks = 0xFC003F00CCC186106187030306184003F;
7

8
    // Since the largest shift is 6, we can safely reserve 3 bits (1 << 3 = 8 < 6) for each
9
    // shift:
10
    //                                      0
11
    //                                      | (0 << 3 * 1)
12
    //                                      | (2 << 3 * 2)
13
    //                                      | (2 << 3 * 3)
14
    //                                      | (0 << 3 * 4)
15
    //                                      | (2 << 3 * 5)
16
    //                                      | (6 << 3 * 6)
17
    //                                      = 0x190480
18
    uint256 boardShifts = 0x190480;
19

20
    while (masks != 0) {
21
        // 0x3FFFF = 0b11...11 (18 1's)
22
        uint256 mask = masks & 0x3FFFF;
23
        if (_board & mask == mask) return 2;
24
        else if (_board & mask == (mask / 3)) return 1;
25

26
        _board >>= (boardShifts & 7);
27
        masks >>= 18;
28
        boardShifts >>= 3;
29
    }
30

31
    return 0;
32
}

Conclusion

So many people see low-level solidity optimizations from people like @transmissions11 and think that it's the key to optimizing their project

but what they don't realize is that the real savings come from efficient and well-designed mechanisms and architectures
0xSubway

It's easy to get caught up in tactical tricks to make near-obsolete optimizations to your smart contracts. Most of the techniques I shared on X were never even intended to be "optimization tips," (bit twiddling can be quite useful outside of optimization), but I sort of ended up falling down that hole.

Of course, it's always fun to discover and share tricks like this! But many of them (especially bit twiddling) have extremely niche applications, so ask yourself whether it really makes sense to obfuscate your code for small optimizations before applying them.

Extra reading/resources

If you want to see some more advanced stuff, check out:

Lists
- Sean Anderson's Bit Twiddling Hacks
- The Aggregate Magic Algorithms
- HAKHEM
- Assembly Language Lab
Books
- Hacker's Delight 2nd Edition (Amazon, IPFS URI)
- Chapter 1: "Bit Wizardry" of "Matters Computational"
Other
- fiveoutofnine/on-chain-chess
- Fast Inverse Square Root — A Quake III Algorithm (a lot of people's first exposure to bit twiddling)
- Reciprocal Multiplication, a tutorial

If you found this helpful, consider following me on X @fiveoutofnine.