Evaluating votes - another take - DPChallenge Forums

Canon 12x IS USM (S3 IS built-in Lens) -- 36-432mm (35mm EQ)

02/23/2005 06:30:27 PM · #1

I have been following the "Troll Voting" thread, and thought I'd put my thoughts in a new thread. I want to talk about Standard Deviation and how it relates to voting. For a few weeks now, I have taken various entries in the Challenges and input their scores into an Excel program I wrote to determine the standard deviation of their scores. Basically the standard deviation (SD) tells you how much, on average, votes deviate from the Mean (average) vote. The lower the number, the more tightly grouped the votes.

Thus, if all your votes (no matter how many) were exactly 5's, the mean would be 5.00 and the SD would be 0.00. On the other hand, if you had, say, 100 votes, of which 50 were 1's and 50 were 10's, the mean would be 5.5 and the SD would be 4.523. If those 100 votes were 50 5's and 50 6's, the mean would still be 5.5 but the SD would be 0.503.

If you have a nice bell-shaped curve of votes (from 1's to 10's: 1, 3, 6, 13, 27, 27, 13, 6, 3, 1), the mean would be (once again) 5.5 but the SD would be 1.624. Incidentally, this is very close to the SD that I have found in most of my tests.
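For anyone who wants to check these figures without Excel, here is a minimal Python sketch. It is not the spreadsheet described above; the function name `vote_stats` is made up for illustration, and it assumes the sample SD formula (dividing by n - 1), since that is the version that reproduces the numbers quoted in this post.

```python
import math

def vote_stats(counts):
    """Return (mean, sample SD) for a vote histogram.

    counts[i] is the number of votes for score i + 1 (scores 1..10).
    """
    n = sum(counts)
    mean = sum(score * c for score, c in enumerate(counts, start=1)) / n
    # Sum of squared deviations from the mean, weighted by vote count.
    ss = sum(c * (score - mean) ** 2 for score, c in enumerate(counts, start=1))
    # Assumption: sample SD (n - 1 in the denominator), which matches the
    # figures in the post; dividing by n would give slightly smaller values.
    return mean, math.sqrt(ss / (n - 1))

examples = {
    "50 ones, 50 tens": [50, 0, 0, 0, 0, 0, 0, 0, 0, 50],
    "50 fives, 50 sixes": [0, 0, 0, 0, 50, 50, 0, 0, 0, 0],
    "bell-shaped": [1, 3, 6, 13, 27, 27, 13, 6, 3, 1],
}

for name, counts in examples.items():
    mean, sd = vote_stats(counts)
    print(f"{name}: mean={mean:.2f} sd={sd:.3f}")
```

Run as written, the three cases come out at a mean of 5.50 with SDs of 4.523, 0.503 and 1.624 respectively.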

OK, unglaze your eyes. What the SD can tell you is just how much confidence you can have that your average score reflects "reality". When all your votes are exactly the same, you have an SD of 0.00 and you can be pretty confident that your score was fair. However, if you have an average vote of 5.5 and an SD of 4.523 (as in 50 1's and 50 10's) you can be much less confident that the 5.5 reflects reality. Most scores I've checked come in at around SD 1.6 or so.

So, bringing this back to the troll thread, I ran my test on the image in question, "Midnight Mist", and guess what? It had the highest SD of any that I have tested - 1.982! (The lowest I have tested is around 1.30.) By comparison, the Blue Ribbon had an SD of 1.635, and the Yellow an SD of 1.543.

So Blue and Yellow can be more confident that their Average score reflected "reality" than can "Midnight Mist".

Another image that I found with an abnormally high SD was [image not preserved], with an SD of 1.946.

I have more thoughts on this, but maybe I'll give it a rest and let it soak in.

Any thoughts?

capgal

Fujifilm FinePix S5000 Z

02/23/2005 06:36:57 PM · #2

very interesting...

GeneralE

Canon PowerShot S3 IS

02/23/2005 06:39:43 PM · #3

My main thought is that I seem to remember that being within 2.0 Standard Deviations was usually considered a "normal" distribution, and that therefore even the two most extreme examples you could find nevertheless reflect a "normal" voting pattern.

Message edited by author 2005-02-23 18:40:09.

peete

Nikon D70

02/23/2005 06:40:04 PM · #4

I had already thought of this, and it would be easy to add an SD to the results. It would be a good addition for those who know what it is, because it really does tell a lot. It's cool someone else thought of it too and was more organized than me (meaning they actually did the tests).

PhilipDyer

Canon EOS-5D Mark II

02/23/2005 06:46:57 PM · #5

Hi Bill,

I appreciate your research on voting standard deviations and think it's a valuable tool to evaluate how voters reacted to your photo. But I don't think that a higher standard deviation necessarily indicates any unfairness in voting. I think it probably more closely corresponds to how controversial a photo is (in technique or subject matter), which I think applies to the Midnight Bridge shot.

When I saw that photo I really liked it, but recognized immediately that some voters would not appreciate the seeming level of processing (whether there was any actual computer processing or not) or would not like the frame and would probably vote it down. I think it was more a matter of taste than of some troll attack. As someone mentioned in the other thread, the blue and red ribbon shots didn't receive '1' votes, so I think the low votes for this shot were mainly a matter of personal taste. I disagree with the taste of those who voted that way, but I think their votes are legitimate.

Thanks for your analysis, Bill! Now I'm going to have to go check out my past standard deviations to see if I've submitted any controversial shots. :-)

02/23/2005 06:54:58 PM · #6

Philip,

Notice that 1) I didn't say anything about "fairness" and 2) none of the photos I mentioned were mine.

I agree that the more "controversial" a photo is, the more likely it is to have a high SD. That said, it is hard for me to see what could be controversial in "Midnight Mist".

Message edited by author 2005-02-23 18:55:20.

Canon RF 24-70mm f/2.8 L IS

02/23/2005 07:05:23 PM · #7

Originally posted by jemison:

You'd have to ask the 10 that gave it a one why they did so in this case. I'm sure they had a reason and I'd love to hear from them. Maybe they just hate that bridge for some reason? Maybe something in the photo that most of us don't notice makes some people angry? Who knows, I'd love to hear from all 10 and see if they all had the same reason.
It's dangerous to apply normal distribution laws to human opinions because, although they may follow the normal distribution most of the time, there are going to be times when they don't even come close and you may end up scratching your head asking why...

kirbic

Canon EOS R5

02/23/2005 07:16:58 PM · #8

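OK, since someone brought up the statistical side of this... yes, the voting data is quite "normal", or "gaussian", in its distribution. That fact is very valuable. It is especially enlightening where shots that generate a wide distribution (large standard deviation) are concerned. Those shots often have distributions that "fall off of" one end or the other of the voting scale (or sometimes both ends!). The currently-discussed shot (2nd place in Bridges) is far from being the most controversial shot I've seen or studied; this:

earns that spot. The standard deviation is 3.33, far above the normally expected value of about 2.0. If we look at the voting histogram, it looks like this: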

Notice that votes are "bunched" at the bottom, and less obviously at the top (we'd expect fewer 10's than 9's based on a gaussian distribution). These bunched votes reflect the fact that the distribution is "chopped off": the votes that would have fallen beyond the ends of the scale just pile up at the ends.
Another way of looking at the data is:

This is a "normal cumulative plot". It can tell you one hell of a lot about the data. The more linear the plot the more gaussian the data. In this case, even though the histogram looks nothing like the classic "bell curve", the normal cumulative plot reveals that the underlying data is gaussian to a great extent. The inverse of the slope of this plot is the standard deviation, and the x-intercept is an analog for the average score (it reflects what would be the average score if the entire distribution were present and were perfectly gaussian).
Now that we've completed this little study, let me skulk off to plot one for a photo that generated very little difference of opinion... back in a few minutes...

Edit:
Here's a shot where voter opinion was very much in agreement:

Following are the histogram and normal cumulative plot for this one:

Notice that the distribution is scrunched up in the middle, with no tens and no ones (or twos, for that matter). The normal cumulative plot reflects this: its slope is much higher, and thus the standard deviation is much lower (1.22 for this shot). Make sense?

Message edited by author 2005-02-23 19:31:12.

Jozi

Canon EOS-400D Rebel XTi

Canon EF-S 18-55mm f/3.5-5.6

02/23/2005 07:18:43 PM · #9

I, personally, didn't care for the orange vs turquoise colouring, and it didn't "feel" like a midnight picture to me... I gave it a 5 and was surprised to see it with a ribbon (no disrespect meant to ZigZag)...

02/23/2005 07:21:14 PM · #10

I figured it out, maybe?

The title is "Midnight Mist." Since there is a sunset in the photo, obviously it is not midnight. Could that be enough to cause 10 people out of nearly 400 to give it a 1?
There are people here that put weight in the title.

Canon EF-S 18-55mm f/3.5-5.6

02/23/2005 07:22:49 PM · #11

Originally posted by Jozi:

I was composing my message before I read yours, but you support my theory in this post.

Jozi

Canon EOS-400D Rebel XTi

02/23/2005 07:26:58 PM · #12

I don't want to give the impression that I gave it a 5 because of the title. I cited my own preferences first. The title was secondary for me. But you're right, some people certainly put a LOT more stock in a title than I might. Funny thing is that pic (to me) looks neither like midnight, NOR misty. :)

scottwilson

Canon EF 28mm f/2.8

02/23/2005 07:52:18 PM · #13

Coming over from the other thread as per kirbic's request. The photo of the girl in the rather skimpy outfit is one that we would expect some number of people to be strongly offended by, so it is not too surprising that it got a large number of ones. Other kinds of photos that I would expect this of are religious photos and political photos.

A test for how normal the distribution is does not really test that much. For example, take a photo that has a normal distribution around a mean of 5, now move half the 5 votes to 4 and the other half to 6, and your test would still show a fairly normal distribution, but clearly the distribution is not what you would expect.

But even in the extreme case of the photo of the girl, the number of votes at 3 and 4 is more consistent with the number of 1 votes than it is for the photo of the bridge, and I would submit that the bridge does not even come close to being as controversial as the photo of the girl.

Canon RF 24-70mm f/2.8 L IS

02/23/2005 08:24:29 PM · #14

Originally posted by kirbic:

OK, since someone brought up the statistical side of this... yes, the voting data is quite "normal", or "gaussian", in its distribution. That fact is very valuable. It is especially enlightening where shots that generate a wide distribution (large standard deviation) are concerned. Those shots often have distributions that "fall off of" one end or the other of the voting scale (or sometimes both ends!). The currently-discussed shot (2nd place in Bridges) is far from being the most controversial shot I've seen or studied; this:

earns that spot. The standard deviation is 3.33, far above the normally expected value of about 2.0. If we look at the voting histogram, it looks like this:

Notice that votes are "bunched" at the bottom, and less obviously at the top (we'd expect fewer 10's than 9's based on a gaussian distribution). These bunched votes reflect the fact that the distribution is "chopped off": the votes that would have fallen beyond the ends of the scale just pile up at the ends.
Another way of looking at the data is:

This is a "normal cumulative plot". It can tell you one hell of a lot about the data. The more linear the plot the more gaussian the data. In this case, even though the histogram looks nothing like the classic "bell curve", the normal cumulative plot reveals that the underlying data is gaussian to a great extent. The inverse of the slope of this plot is the standard deviation, and the x-intercept is an analog for the average score (it reflects what would be the average score if the entire distribution were present and were perfectly gaussian).
Now that we've completed this little study, let me skulk off to plot one for a photo that generated very little difference of opinion... back in a few minutes...

Edit:
Here's a shot where voter opinion was very much in agreement:

Following are the histogram and normal cumulative plot for this one:

Notice that the distribution is scrunched up in the middle, with no tens and no ones (or twos, for that matter). The normal cumulative plot reflects this: its slope is much higher, and thus the standard deviation is much lower (1.22 for this shot). Make sense?

Kirbic,

I get an SD of 2.701 for the first shot, and 1.122 for the second. Could you please also explain why the "expected SD is 2.0"? If that were the case, you would expect to find it quite often, which you don't. In fact, from my calculations it is quite rare for a photo to have an SD of 1.9 or greater.

kirbic

Canon EOS R5

02/23/2005 08:31:35 PM · #15

The standard deviation that you calculate from the actual data is affected by the fact that the distribution is chopped off. The fact that the ends of the distribution are "shoved in" will always result in a standard deviation that is lower than it would be if the distribution continued outward to its tail.
The normal cumulative plot sorts this out by estimating the S.D. of the whole distribution from the part we can see. Assuming the distribution really is fairly gaussian, this is a very powerful tool. None of our sample distributions are perfectly gaussian, of course, but the technique is quite robust to reasonable departures from normality.
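To make the idea concrete, here is a rough Python sketch of one way such a fit could be done. It is not necessarily the exact procedure used for the plots above, which the thread does not spell out. The assumptions are mine: the cumulative fraction of votes is read at the upper edge of each score (1.5, 2.5, ... 9.5), converted to a z-value with the inverse normal CDF, and fitted with an ordinary least-squares line. All function names are hypothetical.

```python
from statistics import NormalDist

def fit_normal_cumulative(counts):
    """Estimate (mean, sd) of the underlying gaussian from a 1..10 histogram.

    counts[i] is the number (or fraction) of votes for score i + 1.
    Needs at least two usable cumulative points, or the fit is undefined.
    """
    total = sum(counts)
    xs, zs = [], []
    running = 0.0
    for score, c in enumerate(counts[:-1], start=1):
        running += c
        p = running / total
        if 0.0 < p < 1.0:            # inv_cdf is undefined at 0 and 1
            xs.append(score + 0.5)   # assumed upper edge of this score's bin
            zs.append(NormalDist().inv_cdf(p))
    # Ordinary least-squares line: z = intercept + slope * x.
    n = len(xs)
    x_bar, z_bar = sum(xs) / n, sum(zs) / n
    sxz = sum((x - x_bar) * (z - z_bar) for x, z in zip(xs, zs))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sxz / sxx
    intercept = z_bar - slope * x_bar
    # x-intercept is the mean estimate, inverse of the slope is the SD estimate.
    return -intercept / slope, 1.0 / slope

def clipped_gaussian_histogram(mean, sd):
    """Expected vote fractions when a gaussian opinion is forced onto 1..10.

    Everything below 1.5 piles up on score 1, everything above 9.5 on score 10.
    """
    dist = NormalDist(mean, sd)
    cdf = [0.0] + [dist.cdf(k + 0.5) for k in range(1, 10)] + [1.0]
    return [cdf[i + 1] - cdf[i] for i in range(10)]

def raw_sd(counts):
    """Plain (population) SD computed directly from the histogram."""
    total = sum(counts)
    mean = sum(s * c for s, c in enumerate(counts, start=1)) / total
    return (sum(c * (s - mean) ** 2 for s, c in enumerate(counts, start=1)) / total) ** 0.5

hist = clipped_gaussian_histogram(mean=5.5, sd=3.0)  # made-up "controversial" shot
est_mean, est_sd = fit_normal_cumulative(hist)
print(f"raw SD from the chopped-off histogram: {raw_sd(hist):.3f}")
print(f"fitted mean: {est_mean:.3f}, fitted SD: {est_sd:.3f}")
```

With this idealized input the raw SD comes out below the true 3.0, while the fit recovers the 5.5 and 3.0 that went in. Real vote counts are noisy and only roughly gaussian, so a fit on actual challenge data would be an estimate, not an exact recovery.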

utro

Canon EOS-300D Rebel

Tamron SP AF 28-75mm f/2.8 XR Di for Canon

02/23/2005 08:46:36 PM · #16

ok. I can't believe I spent so much time on this, but I went through the top 320 or so images. The first image after Zigzag's to get an equal number or more 1's than his was ranked 199. AND it didn't contain an actual physical bridge. It was one of the metaphorical ones. Between 199 and 320 I found 10 more images that matched or exceeded the number of 1's as zigzag's, NONE of them containing actual bridges.

And every one I looked at had around 318-340 votes.

From this I can assume two things:

1) these votes weren't given because of a disliking of the image itself, be it colors or title, etc. It just doesn't make sense. There were several other bridge photos that had saturated colors or were extremely edited that didn't garner that many ones. In fact, there were several other bridge photos that I would consider more controversial than this one.

2) I'm entirely insane.

Message edited by author 2005-02-23 20:49:09.

02/23/2005 09:15:44 PM · #17

I highly doubt someone looked at this image and was so threatened by it that they created 10 fake accounts and voted on at least 20 percent of the photos with different scores (to avoid the pattern hunter that drops votes that follow patterns) to get their 1 votes to count, and yet was not threatened by any other image in the top 200 enough to also give it ten 1's. I don't see anyone having that much time and that much fear of just one image. Seeing as no other photo in the top 200 has that many ones and the number 1 and 3 images have no 1 votes, ten separate trolls is easily shown not to be a statistically viable possibility.

It's possible that there is an error in the coding of the site and maybe the troll votes that get dropped off (the voting pattern thing) didn't get dropped off on this one image? Or another code error, or possibly D&/orL is having a big laugh over this... Possible but not likely.

It's possible Zigzag, scoring so well, convinced several friends of his on here to change their vote to a 1 during voting week just to screw with us all, or to try to prevent his image from winning because he dislikes the spotlight (or other reasons...). Possible but not likely.

It's also possible that, since this was such a great image, a lot of people clicked on it and voted on it. Since it was the highest rated image in the challenge (minus the 1 votes) and the thumbnail shows well, it clearly would be an eye catcher and draw people to click on it. It's pretty easy to see why it had the most votes. Of all the people that clicked on it and voted on it, is it possible that 10 people had a reason to strongly dislike the photo? Given the title thing, the over-sharpening, and the appearance of some Photoshop work on the colors, if just 3 people gave it a 1 for each of those reasons (and maybe other reasons), that explains the ten 1's. It's odd, it's weird, it doesn't match the statistical curves, but human behavior doesn't always follow the normal distribution, so my money is on this being just a fluke thing.