c - How to determine if a byte is null in a word -
i reading "strlen" source code glibc, , trick developers found speed read n bytes n size of long word, instead of reading 1 byte @ each iteration.
i assume long word has 4 bytes.
the tricky part every "chunk" of 4 bytes function reads can contain null byte, @ each iteration, function has check if there null byte in chunk. like
if (((longword - lomagic) & ~longword & himagic) != 0) { /* null byte found */ }   where longword chunk of data , himagic , lowmagic magical values defined as:
himagic = 0x80808080l; lomagic = 0x01010101l;   here comment thoses values
/* bits 31, 24, 16, , 8 of number zero.  call these bits  "holes."  note there hole left of  each byte, @ end:   bits:  01111110 11111110 11111110 11111111  bytes: aaaaaaaa bbbbbbbb cccccccc dddddddd   1-bits make sure carries propagate next 0-bit.  0-bits provide holes carries fall into.  */   how trick of finding null byte work?
from famous "bit twiddling hacks" page sean eron anderson, description of used in glibc implementation you're referring (anderson calls algorithm hasless(v, 1)): 
the subexpression
(v - 0x01010101ul), evaluates high bit set in byte whenever corresponding byte in v 0 or greater0x80. sub-expression~v & 0x80808080ulevaluates high bits set in bytes byte of v doesn't have high bit set (so byte less0x80). finally, anding these 2 sub-expressions result high bits set bytes in v zero, since high bits set due value greater0x80in first sub-expression masked off second.
it appears comment(s) in glibc source confusing because doesn't apply code doing anymore - it's describing have been implementation of algorithm anderson describes before hasless(v, 1) algorithm described.
Comments
Post a Comment