Bits, Math and Performance(?): (Not) transposing a 16x16 bitmatrix
