Author | Thread |
|
02/07/2007 05:01:17 PM · #26 |
|
|
02/07/2007 05:06:09 PM · #27 |
Originally posted by jaysonmc:
The probability only increases if you don't believe this is on a curve, and that everyone's votes/scores are random. This is not a "flipping a coin" mechanic at work here. |
No, it increases even if you believe it's on a curve.
Of course it isn't a completely random distribution, but within particular ranges, the distribution is effectively random. If you are always in the top bucket, having 5 people there with you, as opposed to 50, dramatically increases your probability of hitting the top 3 spots, rather than just being in the top 5%.
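A minimal sketch of that arithmetic, assuming the order inside the top bucket is uniformly random. The function name and the three-ribbon default are illustrative assumptions, not anything DPC publishes:

```python
from fractions import Fraction


def top3_chance(others_in_bucket: int, ribbons: int = 3) -> Fraction:
    """Chance of a ribbon when order inside the top bucket is uniformly random."""
    bucket_size = others_in_bucket + 1  # the others plus you
    # With fewer entries than ribbons, a ribbon is certain, so cap at 1.
    return min(Fraction(1), Fraction(ribbons, bucket_size))


small = top3_chance(5)   # 5 other people in the top bucket
large = top3_chance(50)  # 50 other people in the top bucket
print(f"5 others:  {small} = {float(small):.3f}")
print(f"50 others: {large} = {float(large):.3f}")
```

Under that assumption the chance is simply the number of ribbons divided by the bucket size, so shrinking the bucket raises it directly.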
|
|
|
02/07/2007 05:22:06 PM · #28 |
Originally posted by Gordon: Originally posted by jaysonmc:
The probability only increases if you don't believe this is on a curve, and that everyone's votes/scores are random. This is not a "flipping a coin" mechanic at work here. |
No, it increases even if you believe it's on a curve.
Of course it isn't a completely random distribution, but within particular ranges, the distribution is effectively random. If you are always in the top bucket, having 5 people there with you, as opposed to 50, dramatically increases your probability of hitting the top 3 spots, rather than just being in the top 5%. |
I think we are kind of agreeing. If you are in the top bucket, which is a small handful of people... However that top bucket always had the advantage to score a ribbon 95%... |
|
|
02/07/2007 05:27:42 PM · #29 |
Originally posted by Gordon: Originally posted by jaysonmc:
The probability only increases if you don't believe this is on a curve, and that everyone's votes/scores are random. This is not a "flipping a coin" mechanic at work here. |
No, it increases even if you believe it's on a curve.
Of course it isn't a completely random distribution, but within particular ranges, the distribution is effectively random. If you are always in the top bucket, having 5 people there with you, as opposed to 50, dramatically increases your probability of hitting the top 3 spots, rather than just being in the top 5%. |
I agree. In terms of "number of opponents", a smaller challenge is easier to ribbon in, ignoring the various "is this image good enough" arguments. Curves or random voting patterns don't change anything in the math. The weighted distributions only confuse the issue. |
|
|
02/07/2007 05:41:39 PM · #30 |
Originally posted by chimericvisions: Originally posted by Gordon: Originally posted by jaysonmc:
The probability only increases if you don't believe this is on a curve, and that everyone's votes/scores are random. This is not a "flipping a coin" mechanic at work here. |
No, it increases even if you believe it's on a curve.
Of course it isn't a completely random distribution, but within particular ranges, the distribution is effectively random. If you are always in the top bucket, having 5 people there with you, as opposed to 50, dramatically increases your probability of hitting the top 3 spots, rather than just being in the top 5%. |
I agree. In terms of "number of opponents", a smaller challenge is easier to ribbon in, ignoring the various "is this image good enough" arguments. Curves or random voting patterns don't change anything in the math. The weighted distributions only confuse the issue. |
Last thought, then I am done (where is that horse stick figure).
So if scalvert entered a challenge with only 50 people, would you expect him to have a "better chance" to get a brown ribbon? Do you think he is "more likely" to get the worst score?
I don't think people should come away with, "oh look only 50 people entered this challenge this means I will ribbon". Unless you know who is entered in the challenge it is not true.
It would be like saying I am in the Tour de France. I have never been in a bicycle race before. The field is only half as big this year as last. I have a "better chance" of coming in top 10 (beating all these other guys who do this every year).
And yes, I have no delusions. I don't think I will have a ribbon here at DPC unless it is a "fluke" shot.
Message edited by author 2007-02-07 17:42:51. |
|
|
02/07/2007 05:45:30 PM · #31 |
Originally posted by jaysonmc:
So if scalvert entered a challenge with only 50 people, would you expect him to have a "better chance" to get a brown ribbon? Do you think he is "more likely" to get the worst score?
|
Yes, statistically he is.
Originally posted by jaysonmc:
It would be like saying I am in the Tour de France. I have never been in a bicycle race before. The field is only half as big this year as last. I have a "better chance" of coming in top 10 (beating all these other guys who do this every year). |
Yes, statistically you do.
I've been trying to get a brown ribbon most of this year. I can't even get close. I get closer in challenges with fewer entries though.
Message edited by author 2007-02-07 17:47:35.
|
|
|
02/07/2007 05:46:31 PM · #32 |
Originally posted by mouten: ... I was wondering if you had more chance to ribbon by entering a small challenge (ie a challenge with few entries), which I read here and there.
So back to excel ;), and the answer seems to be no.
|
This graph has nothing to do with a submitter's probability of winning a ribbon.
The underlying assumption that there is a higher probability of winning a ribbon if the blue ribbon winner has a low score is wrong. There is no statistical logic in this graph whatsoever that supports that assumption.
All the chart tells you for sure is that the scores of blue ribbon winners have greater variability for challenges with low numbers of submissions than for challenges with high numbers of submissions. That's it.
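A rough simulation of that one point, offered only as a sketch. It assumes every entry's final score is an independent draw from the same bell curve, which is not how DPC scores actually arise. The mean of 5.3, the spread of 0.8, and the function name are made-up illustration values:

```python
import random
import statistics


def winner_score_spread(entries: int, trials: int = 2000, seed: int = 0) -> float:
    """Standard deviation of the winning (highest) score over simulated challenges."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    winners = [
        max(rng.gauss(5.3, 0.8) for _ in range(entries))  # hypothetical score model
        for _ in range(trials)
    ]
    return statistics.stdev(winners)


small = winner_score_spread(50)
large = winner_score_spread(400)
print(f"spread of winning score, 50 entries:  {small:.3f}")
print(f"spread of winning score, 400 entries: {large:.3f}")
```

In this toy model the winning score simply bounces around more when the field is small, even though no entrant's own odds of any given score have changed.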
|
|
|
02/07/2007 05:49:30 PM · #33 |
Originally posted by stdavidson:
All the chart tells you for sure is that the scores of blue ribbon winners have greater variability for challenges with low numbers of submissions than for challenges with high numbers of submissions. That's it. |
That's true. It is basically a triangular distribution, with much more variance in the smaller challenges.
|
|
|
02/07/2007 05:50:59 PM · #34 |
In my opinion the reason many don't enter these small challenges (e.g. School Days) is that the topics are just so boring. It's bad enough that you have to hit people over the head in regular challenges; in the smaller ones you have to apply extra force, and the results are static and silly images, great for the challenge but nothing much beyond it.
|
|
|
02/07/2007 05:55:37 PM · #35 |
Originally posted by Gordon: Originally posted by jaysonmc:
So if scalvert entered a challenge with only 50 people, would you expect him to have a "better chance" to get a brown ribbon? Do you think he is "more likely" to get the worst score?
|
Yes, statistically he is.
Originally posted by jaysonmc:
It would be like saying I am in the Tour de France. I have never been in a bicycle race before. The field is only half as big this year as last. I have a "better chance" of coming in top 10 (beating all these other guys who do this every year). |
Yes, statistically you do.
I've been trying to get a brown ribbon most of this year. I can't even get close. I get closer in challenges with fewer entries though. |
Last one. :)
So you are saying a person's p-value changes based on the number of entries?
Maybe I need to go take another statistics class, it's been years. :) |
|
|
02/07/2007 05:58:53 PM · #36 |
Originally posted by jaysonmc: I don't think people should come away with, "oh look only 50 people entered this challenge this means I will ribbon". Unless you know who is entered in the challenge it is not true. |
I disagree, but this is my last try. Note my placement in this series of entries in the Summer of 2004:
3 / 104
45 / 284
2 / 134
4 / 240
17 / 248
2 / 122
50 / 202
2 / 140
I certainly wasn't among the consistent top finishers at the time (3/31 top 10 finishes to that point), and I was still up against the likes of Heida, Kiwiness, and EddyG. The challenges with a low number of entries were nearly all conceptual topics or things that were difficult to portray in an appealing way. Thus, the REASON for the low number of entries is that the topic was tough. A reasonably decent photographer that can come up with a good idea will have a MUCH better chance of ribboning in a small challenge than he would in a challenge where lots of people have good ideas, regardless of who else is entering. |
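For anyone who wants to line those eight results up, a small sketch. It assumes a ribbon means a top-three finish; the variable names are just for illustration:

```python
# Placements quoted above, as (place, entries).
results = [(3, 104), (45, 284), (2, 134), (4, 240),
           (17, 248), (2, 122), (50, 202), (2, 140)]

RIBBON_PLACES = 3  # assumption: ribbons go to the top three places

ribbon_sizes = [entries for place, entries in results if place <= RIBBON_PLACES]
other_sizes = [entries for place, entries in results if place > RIBBON_PLACES]

for place, entries in results:
    percent = 100 * place / entries  # how far down the field the entry finished
    tag = "ribbon" if place <= RIBBON_PLACES else ""
    print(f"{place:>2} / {entries:<3}  top {percent:4.1f}%  {tag}")

print("entries in ribbon finishes:", sorted(ribbon_sizes))
print("entries in other finishes: ", sorted(other_sizes))
```

In this sample all four ribbon finishes came in challenges of 140 entries or fewer, and none of the four larger challenges produced one. Eight data points from one photographer prove nothing on their own, of course.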
|
|
02/08/2007 03:34:50 AM · #37 |
Originally posted by scalvert: The number of competitors is just a side effect of the challenge topic itself, which is the REAL critical element. |
Exactly the conclusion of my original post - said more efficiently.
There's no "free lunch" or statistical bias. So we can all focus on taking good photographs and forget any "tactical" decision. |
|
|
02/08/2007 04:08:42 AM · #38 |
I'm in agreement with Shannon that the Challenge theme is the major influence. Nonetheless, for argument's sake, let's look at the analysis here. My first thought was that a major dimension had been left out: that of time. The initial reasoning was sound, albeit subjective, namely that I don't think the world of the photo that has the highest score on this site. Now none of this is very empirical, but following this hunch I thought I would check all 8.5+ scores, which are curiously all in 'small' challenges; and what do you know, they were all submitted in the early days of the site. As a generalisation, the following variables (IMO) have changed over time:
- more individuals are involved in DPC
- more photographs are submitted per challenge
- each photograph receives more votes
- the voting dynamics have evolved
- photographer awareness of voting dynamics has evolved
Given that some people clearly do have the time - I would be interested to see how the distribution is influenced if you introduce a time dimension.
Interestingly the voting dynamics seem to be changing again. What is the probability that three shots in the last week have made it into the all-time top fifteen?
Message edited by author 2007-02-08 04:10:56.
|
|
|
02/08/2007 04:14:45 AM · #39 |
Originally posted by mouten: OK taking another 5min break after my previous statistical post on the past evolution of DPC challenges |
how many dpc stats have you got???
it'd be interesting to see some other extrapolations from that data ...
average blue ribbon score
average red ribbon score
average yellow ribbon score
average honorable mention (4th and 5th place) score
average 10th place score
stuff like that ... it'd create some interesting reading and benchmarks for people when they enter challenges.
|
|
|
02/08/2007 04:20:11 AM · #40 |
Originally posted by super-dave: how many dpc stats have you got???
it'd be interesting to see some other extrapolations from that data ...
|
Unfortunately I'm just using the same info that everyone has and that is posted on the site. For more in-depth statistical analysis you'll have to ask SC! |
|
|
02/08/2007 04:25:32 AM · #41 |
Originally posted by PaulE: I would be interested to see how the distribution is influenced if you introduce a time dimension. |
have a look at my other post touching precisely on the time dimension. //www.dpchallenge.com/forum.php?action=read&FORUM_THREAD_ID=541570
you'll see there is not so much "time" effect now that dpc is more mature (was true in the first 1/2 years). |
|
|
02/08/2007 04:33:00 AM · #42 |
I'm too stupid (and blonde) to understand your graphs... 
|
|
|
02/08/2007 04:46:03 AM · #43 |
Originally posted by stdavidson: This graph has nothing to do with a submitter's probability of winning a ribbon. |
You are correct; there is actually a shortcut taken in the reasoning. The fuller, more correct logic would be as follows:
Given a photographer "P" who has probability P(s) of reaching a given score s.
Given a challenge "C" where the probability of ribboning with a given score n is C(n).
Then the probability R that P ribbons in C is
R(P,C) = Sum(from k=1 to k=10) { P(k)*C(k) }
Take each element of the sum, that is P(k)*C(k).
This is the probability that P ribbons in C with a score of k: (probability that P gets a k) x (probability of ribboning in C with a score of k).
Is P(k)*C(k) higher in smaller challenges?
We can reasonably assume that the size of the challenge has no impact on P(k), i.e. that a given photograph by P will get the same score whatever the size. That may not be fully true, but it should be close enough.
The question becomes: how does C(k) change with the challenge size?
The graph in my OP is intended to illustrate precisely how C(k) varies with size, and it shows that C(k) is largely independent of the challenge size.
Therefore, if neither C(k) nor P(k) varies significantly with the challenge size, the same holds for R(P,C), i.e. the probability that P ribbons in C.
Now obviously the logic above assumes that P(k), i.e. the probability that P will get a score of k, does not depend on C.
Scalvert quite rightly pointed out that this is where the trick is. While C(k) does not depend on the size of C, P(k), and therefore R(P,C), depends on the nature (not the size) of the Challenge. To take an example given by Scalvert, it seems DrJones has a better chance of scoring high in a Nude Challenge than, say, in a fruit and veggie challenge.
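The sum R(P,C) can be sketched in a few lines. The two tables below are invented numbers purely to show the mechanics, not real DPC figures, and the function name is hypothetical:

```python
from typing import Mapping


def ribbon_probability(p_score: Mapping[int, float],
                       c_ribbon: Mapping[int, float]) -> float:
    """R(P,C) = sum over k = 1..10 of P(k) * C(k)."""
    return sum(p_score.get(k, 0.0) * c_ribbon.get(k, 0.0) for k in range(1, 11))


# Hypothetical photographer: chance of ending up with each (rounded) score.
p_score = {5: 0.50, 6: 0.30, 7: 0.15, 8: 0.05}
# Hypothetical challenge: chance that a given score is enough to ribbon.
c_ribbon = {7: 0.40, 8: 0.95, 9: 1.0, 10: 1.0}

r = ribbon_probability(p_score, c_ribbon)
print(f"R(P,C) = {r:.4f}")
```

Only the scores where both tables are non-zero contribute, which is why a change in P(k) caused by the topic moves R(P,C) even when C(k) stays put.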
Message edited by author 2007-02-08 04:58:18. |
|
|
02/08/2007 04:51:38 AM · #44 |
Originally posted by super-dave:
it'd be interesting to see some other extrapolations from that data ...
average blue ribbon score
stuff like that ... it'd create some interesting reading and benchmarks for people when they enter challenges. |
Just from the data in the challenge archives, put into Excel.
This is for all challenges :)
avg entries: 208
avg blue ribbon score: 7.3431
avg score: 5.3351
avg median score: 5.3148
avg brown ribbon score: 3.2192
avg # votes: 227
avg # comments: 13.45
I think the most interesting is the average score, since I hear so many people say that your average vote should be 5.5, when really this is the DPC average score.
Message edited by author 2007-02-08 04:52:14. |
|
|
02/08/2007 08:29:40 AM · #45 |
Originally posted by mouten: Originally posted by stdavidson: This graph has nothing to do with a submitter's probability of winning a ribbon. |
You are correct; there is actually a shortcut taken in the reasoning. The fuller, more correct logic would be as follows:
Given a photographer "P" who has probability P(s) of reaching a given score s.
Given a challenge "C" where the probability of ribboning with a given score n is C(n).
Then the probability R that P ribbons in C is
R(P,C) = Sum(from k=1 to k=10) { P(k)*C(k) } |
With all due respect, and I'm not a statistician, but it seems there are some flaws in this logic. It looks like R(P,C) is the probability of getting a single vote, where 'k' covers all the possibilities from 1 to 10, not a probability for an average winning score. The formula should calculate an average to match the ribbon values plotted in the original graph.
The formula should also be limited to a probability for getting a blue ribbon since this is all we have actual numbers for.
I'm likely getting this wrong, but...
For:
R(ave)=Probability a particular photographer wins a blue ribbon in a challenge. It is an average.
n=one vote
N=Total votes in a challenge
k=possible vote of 1 to 10
P(k)=Photographer's probability of getting a particular vote
C(k)=Probability a blue ribbon will get that particular vote
Then:
R(ave)= (Sum(from n=1 to n=N) { [R(P,C)](n) })/N
Where:
R(P,C) = Sum(from k=1 to k=10) { P(k)*C(k) }
As suggested by Scalvert, we could include a factor f(P), a value between 0 and 1 inclusive, that affects a particular photographer's ability to win a given challenge based on the challenge topic and the photographer's ability. For example, f(P) is 0 if the photographer does not enter the challenge. f(P) is 1 for challenges they enter that they are 'best' at. f(P) is < .5 for challenges they are not 'good' at and > .5 for challenges they are 'good' at.
In that case:
R(ave) = f(P)*((Sum(from n=1 to n=N) { [R(P,C)](n) })/N)
There you have it, all you have to do is keep your P(10) and your f(P) values at 1 and you will win a blue ribbon every time you enter.
What could be simpler? LOL!!!
|
|
|
02/08/2007 08:52:50 AM · #46 |
Originally posted by stdavidson: What could be simpler? |
DNMC = 1 |
|
|
02/08/2007 09:08:26 AM · #47 |
LOL!! ok I give in.
Originally posted by stdavidson: With all due respect, and I'm not a statistician, but it seems there are some flaws in this logic. Looks like R(P,C) is the probability to get a single vote where 'k' are all the possiblities from 1 to 10, not a probability for an average winning score. The formula should calculate an average to match the ribbon values plotted in the original graph.
The formula should also be limited to a probability for getting a blue ribbon since this is all we have actual numbers for.
I'm likely getting this wrong, but...
For:
R(ave)=Probability a particular photographer wins a blue ribbon in a challenge. It is an average.
n=one vote
N=Total votes in a challenge
k=possible vote of 1 to 10
P(k)=Photographer's probability of getting a particular vote
C(k)=Probability a blue ribbon will get that particular vote
Then:
R(ave)= (Sum(from n=1 to n=N) { [R(P,C)](n) })/N
Where:
R(P,C) = Sum(from k=1 to k=10) { P(k)*C(k) }
As suggested by Scalvert, we could include a factor f(P), a value between 0 and 1 inclusive, that affects a particular photographer's ability to win a given challenge based on the challenge topic and the photographer's ability. For example, f(P) is 0 if the photographer does not enter the challenge. f(P) is 1 for challenges they enter that they are 'best' at. f(P) is < .5 for challenges they are not 'good' at and > .5 for challenges they are 'good' at.
In that case:
R(ave) = f(P)*((Sum(from n=1 to n=N) { [R(P,C)](n) })/N)
There you have it, all you have to do is keep your P(10) and your f(P) values at 1 and you will win a blue ribbon every time you enter.
What could be simpler? LOL!!! |
|
|
|
02/08/2007 09:37:28 AM · #48 |
Originally posted by scalvert: Originally posted by stdavidson: What could be simpler? |
DNMC = 1 |
Oh yeah... I keep leaving out the photographer DNMC factor. :)
In that case:
R(ave) = DNMC(P)*f(P)*((Sum(from n=1 to n=N) { [R(P,C)](n) })/N)
The good news is that your DNMC(P) multiplier is always 1 if you don't enter. Unfortunately, it drops off rapidly from that if you actually enter, even in free study challenges. In some statistical models DNMC(P) is allowed to go negative.
Message edited by author 2007-02-08 09:41:13.
|
|
|
02/08/2007 11:17:14 AM · #49 |
Is there any, any, emphasis on a "TABLE" anywhere in this shot?
Yet it placed 18 out of 177.
DNMC = Does Not Mean Clearly anything.
Yes, there are more elements involved than just statistics. Many factors are not shown on charts. Many assumptions based on charts don't account for all elements, including change. I've heard of chaos theories. Sometimes things seem to go chaotically here.
We analyse statistics and reason things out based on other human factors that computers can't see. Topics can be more popular seasonally and regionally, and may be easier to do, depending on what Challenge Topic comes up when. Scalvert's comments contribute to the analysis of the data.
|
|
|
02/08/2007 12:07:28 PM · #50 |
So did we decide if it was easier or harder to get a brown ribbon in a smaller challenge, or not?
I think I've given up aiming for the brown ribbon. I can't get close, no matter how bad the entry. I managed to break the bottom 10% though. I think I'll leave it at that.
Unless someone can come up with a better formula...
|
|