DPChallenge: A Digital Photography Contest
 

DPChallenge Forums >> Photography Discussion >> Finding Duplicates
02/18/2006 11:08:43 AM · #1
I've had an idea for a program for a long time but have never worked on it myself. I'm wondering if someone else may have produced a program like this:

Find duplicate images by reducing the image down to some kind of numerical "fingerprint" and then comparing the fingerprint values.

For example, it is possible to quickly determine a set of duplicate files by using the CRC-32 algorithm to store a 4-byte value for each file in a database. Then you just look for files that have the same CRC-32 value (there won't be very many, since there are about 4 billion possible values), and for those that do share a value, you can do a follow-up comparison to confirm that they truly are the same file.
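A rough sketch of that CRC-32 approach in Python, for illustration only. The function names (crc32_of_file, find_duplicate_files) are made up for this example, and an in-memory dictionary stands in for the database. It groups files by checksum first, then does the byte-for-byte follow-up comparison only within each group:

```python
import filecmp
import zlib
from collections import defaultdict
from pathlib import Path


def crc32_of_file(path, chunk_size=65536):
    """Return the CRC-32 of a file's contents as an unsigned 32-bit integer."""
    crc = 0
    with open(path, "rb") as f:
        # Read in chunks so large image files are never fully loaded in memory.
        while chunk := f.read(chunk_size):
            crc = zlib.crc32(chunk, crc)
    return crc & 0xFFFFFFFF


def find_duplicate_files(paths):
    """Return lists of paths whose contents are byte-for-byte identical."""
    # Step 1: bucket files by checksum (the stand-in for the database lookup).
    by_crc = defaultdict(list)
    for p in paths:
        by_crc[crc32_of_file(p)].append(Path(p))

    # Step 2: within each bucket, confirm matches with a full comparison,
    # because two different files can share the same CRC-32 value.
    groups = []
    for candidates in by_crc.values():
        remaining = candidates
        while len(remaining) > 1:
            first, rest = remaining[0], remaining[1:]
            same = [first] + [
                p for p in rest if filecmp.cmp(first, p, shallow=False)
            ]
            if len(same) > 1:
                groups.append(same)
            remaining = [p for p in rest if p not in same]
    return groups
```

This only catches exact copies of a file. A resized or re-saved copy of the same picture will have a different checksum, which is the gap the fingerprint idea is meant to fill.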

Well... in my "fingerprint idea", what I was thinking was to take an image, no matter how large, and reduce it (interpolate as necessary) down to a specific size. Then assign numbers to each quadrant of the image, based on luminosity and color. Store those values in a database. Then you could search for images that have similar values. If you find any, a manual check of the image could determine if they are the same image.

I'm thinking... this could be used to find copies of an image, whether the copy is a thumbnail, a jpeg, a tiff, a psd, etc. Because you're not comparing the content of the file, you're comparing a "summary" of what it looks like at a much smaller resolution.
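Here is a minimal sketch of what such a fingerprint might look like, under several assumptions that are not from the post: the image is already decoded into rows of (r, g, b) tuples (decoding a JPEG, TIFF, or PSD would need an imaging library and is not shown), the reduction is a plain block average down to a 4x4 grid, luminosity uses the common Rec. 601 weights, and the similarity threshold is an arbitrary placeholder. The names fingerprint, distance, and find_similar are hypothetical.

```python
def fingerprint(pixels, grid=4):
    """Reduce an image to grid*grid cells of (luma, r, g, b) averages.

    pixels is a list of rows, each row a list of (r, g, b) tuples in 0-255.
    The image must be at least `grid` pixels wide and tall.
    """
    height, width = len(pixels), len(pixels[0])
    cells = []
    for gy in range(grid):
        y0, y1 = gy * height // grid, (gy + 1) * height // grid
        for gx in range(grid):
            x0, x1 = gx * width // grid, (gx + 1) * width // grid
            block = [pixels[y][x] for y in range(y0, y1) for x in range(x0, x1)]
            n = len(block)
            r = sum(p[0] for p in block) / n
            g = sum(p[1] for p in block) / n
            b = sum(p[2] for p in block) / n
            # Rec. 601 luma weights; one assumed way to define "luminosity".
            luma = 0.299 * r + 0.587 * g + 0.114 * b
            cells.append((round(luma), round(r), round(g), round(b)))
    return tuple(cells)


def distance(fp_a, fp_b):
    """Mean absolute difference between two fingerprints of the same grid size."""
    diffs = [
        abs(a - b)
        for cell_a, cell_b in zip(fp_a, fp_b)
        for a, b in zip(cell_a, cell_b)
    ]
    return sum(diffs) / len(diffs)


def find_similar(target, database, threshold=8.0):
    """Return names in `database` whose fingerprint is within `threshold`.

    `database` is a dict of name -> fingerprint; the threshold is a guess
    and would need tuning against real images.
    """
    return sorted(
        name for name, fp in database.items() if distance(target, fp) <= threshold
    )
```

Because each cell is an average over a region rather than a function of the file's bytes, a thumbnail and the full-size original should land on nearly the same values. Anything find_similar returns is only a candidate; the manual check is still what decides whether two pictures are really the same.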

Has anyone seen such a program already?
02/18/2006 11:09:48 AM · #2
Another interesting application for this idea:

If you could upload an image to Google, have them compute the fingerprint value of your image, and then do a search of their own archives to help you find where your image may be in use!
02/18/2006 12:05:35 PM · #3
iMatch can find duplicate entries in your archive. It also allows you to draw a rough sketch of the image you're looking for and find those which look similar to your drawing.
I've seen your second idea at istock.com. Only for their database, of course.
02/18/2006 01:31:03 PM · #4
Originally posted by gloda:

iMatch can find duplicate entries in your archive. It also allows you to draw a rough sketch of the image you're looking for and find those which look similar to your drawing.


I couldn't find any mention of these capabilities in the online list of features. I'm looking at iMatch found at www.photools.com. Is that the right one?
02/19/2006 06:59:05 PM · #5
What purpose would this serve? Anti-theft? If so, I don't see how this would be any more beneficial than a watermark or any hidden coding using the jpg format. For this to work, it has to be based on something that can't be easily modified. Luminosity and color, like any other visual element, can easily be altered in most photo editing programs.
02/19/2006 07:31:34 PM · #6
I used to be responsible for information security for "a major corporation."

As of a couple of years ago, there was (very expensive - think USD X00,000 per copy) commercially available software that could do this. It reduces anything (document, photo, audio, video, etc.) to a "hash number" and scours the web for content with a "close" hash number. Close because the content could have been edited and, if so, would yield a slightly different hash number. This software comes out of the intelligence community and believe me, you DO NOT want to know more.

It is used by corporations to detect breach of privacy, security or fiduciary responsibility and provide evidence used to prosecute offenders.

However, it relies on knowing "in advance" what's important enough to hash and then search for.

There are other techniques for tagging "something important" so that it will "phone home" when opened on a network attached computer. When it phones home, it provides forensic data used to prosecute breach of security, privacy, and/or fiduciary responsibility. Again, you DO NOT want to know more.
02/19/2006 08:30:49 PM · #7
The other place you might find this already in use is in the systems used for biometric identification; fingerprint, iris pattern and facial recognition all must use something related to what you're proposing. Like the document-protection systems, you might not be able to find out that much about it outside the corporate or academic communities.

Here's a site I found recently by a guy who specializes in security and cryptography issues.
02/19/2006 08:38:20 PM · #8
reading this made my little brain hurt.
DPChallenge, and website content and design, Copyright © 2001-2025 Challenging Technologies, LLC.
All digital photo copyrights belong to the photographers and may not be used without permission.