c - How to determine if a byte is null in a word -


i reading "strlen" source code glibc, , trick developers found speed read n bytes n size of long word, instead of reading 1 byte @ each iteration.

i assume long word has 4 bytes.

the tricky part every "chunk" of 4 bytes function reads can contain null byte, @ each iteration, function has check if there null byte in chunk. like

if (((longword - lomagic) & ~longword & himagic) != 0) { /* null byte found */ } 

where longword chunk of data , himagic , lowmagic magical values defined as:

himagic = 0x80808080l; lomagic = 0x01010101l; 

here comment thoses values

/* bits 31, 24, 16, , 8 of number zero.  call these bits  "holes."  note there hole left of  each byte, @ end:   bits:  01111110 11111110 11111110 11111111  bytes: aaaaaaaa bbbbbbbb cccccccc dddddddd   1-bits make sure carries propagate next 0-bit.  0-bits provide holes carries fall into.  */ 

how trick of finding null byte work?

from famous "bit twiddling hacks" page sean eron anderson, description of used in glibc implementation you're referring (anderson calls algorithm hasless(v, 1)):

the subexpression (v - 0x01010101ul), evaluates high bit set in byte whenever corresponding byte in v 0 or greater 0x80. sub-expression ~v & 0x80808080ul evaluates high bits set in bytes byte of v doesn't have high bit set (so byte less 0x80). finally, anding these 2 sub-expressions result high bits set bytes in v zero, since high bits set due value greater 0x80 in first sub-expression masked off second.

it appears comment(s) in glibc source confusing because doesn't apply code doing anymore - it's describing have been implementation of algorithm anderson describes before hasless(v, 1) algorithm described.


Comments

Popular posts from this blog

python - mat is not a numerical tuple : openCV error -

c# - MSAA finds controls UI Automation doesn't -

wordpress - .htaccess: RewriteRule: bad flag delimiters -