Determine which of a set of images has been recompressed/resaved the least


I'm working on a system for fuzzy image deduplication.

Right now, I have a functional system that can do large-scale phash fuzzy image searching and deduplication, via either DCT-based or gradient-based perceptual hashes.

However, while determining whether an image has been reduced in size programmatically is trivial, how can I determine which image is the parent of which?

Basically, if I have two images of the same resolution, and one is a resaved version of the other (either in a different format (JPG/PNG), or recompressed), how can I determine which one is the original in a reliable manner?

(Note: assume the metadata has been stripped from the images; I wish it were that simple.)

Bonus points if the solution is easy to implement in Python.

This isn't a positive answer, but I spent a while evaluating the use of average per-pixel entropy to determine whether it is a useful metric for determining how compressed an image is.
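For readers who want to try the metric themselves, here is a minimal sketch. It assumes that "average per-pixel entropy" means the Shannon entropy of the 8-bit grayscale value histogram, which may differ from the exact definition used in the evaluation. The function name `per_pixel_entropy` is hypothetical.

```python
import math
from collections import Counter


def per_pixel_entropy(pixels):
    """Shannon entropy, in bits per pixel, of a sequence of pixel values.

    Assumption: `pixels` is a flat iterable of integer intensities,
    for example the raw bytes of an 8-bit grayscale image.
    """
    counts = Counter(pixels)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no pixel data")
    # H = -sum(p * log2(p)) over every value that actually occurs.
    return -sum((n / total) * math.log2(n / total) for n in counts.values())
```

With Pillow installed, the raw bytes of a grayscale image can be passed in directly, for example `per_pixel_entropy(Image.open(path).convert("L").tobytes())`. A flat single-color image scores 0 bits and an image that uses all 256 levels equally scores 8 bits.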

I have a write-up here.

Some excerpts:

[Figure: variance in entropy across compression levels on the SIPI reference image database images.]
In retrospect, the x-axis should have been labeled "JPEG quality level". Higher numbers mean better quality.

While the per-pixel entropy does decline sharply at extremely aggressive compression levels, it does not otherwise vary in a way that correlates with the compression level.

This means that any attempt to compare two images by inspecting their entropy will have issues unless one knows exactly what compression level each image had been resaved at.
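A rough way to reproduce this kind of sweep is sketched below. It is not the code behind the plot: it assumes Pillow is available, works on a grayscale copy of the image, and uses hypothetical helper names (`entropy_bits`, `entropy_by_quality`) and an arbitrary set of quality levels.

```python
import io
import math

from PIL import Image


def entropy_bits(img):
    """Shannon entropy (bits per pixel) of the grayscale histogram."""
    hist = img.convert("L").histogram()  # 256 bin counts for mode "L"
    total = sum(hist)
    return -sum((c / total) * math.log2(c / total) for c in hist if c)


def entropy_by_quality(img, qualities=(10, 30, 50, 70, 90)):
    """Resave `img` as JPEG at each quality level and measure the entropy.

    Returns a dict mapping quality level -> entropy of the resaved image.
    """
    gray = img.convert("L")
    results = {}
    for q in qualities:
        buf = io.BytesIO()
        gray.save(buf, format="JPEG", quality=q)  # in-memory recompression
        buf.seek(0)
        with Image.open(buf) as resaved:
            results[q] = entropy_bits(resaved)
    return results
```

Calling `entropy_by_quality(Image.open(path))` on a handful of test images and comparing the resulting curves is one way to check whether the entropy rises, falls, or wanders as the quality level changes.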

