
Feedback Requested on Normalizing Ratings


mottershead


We are about to deploy a new "Filter Photos by Rating" implementation. As with the current version, users can select different periods of time (from "Last 2 Days" to "All").

 

But what becomes obvious when the longer periods are selected is that most of the highest-rated photos are still recent photos. This is because of the rating inflation that has occurred over the past year. Last summer, the average rating was a little over 5, and it has increased steadily so that the average rating is now over 7. This means that the further in the past a photo received its ratings, the lower its average will tend to be. A photograph that would have appeared in the "Top 40" last year, with an 8.5 average, would be nowheres-ville today.

 

Thus, the stars of last summer, whose photos for the most part are still on the site and of course are just as good as ever, have more or less disappeared from view.

 

What I would like to do, at least for the filter function, is to normalize the ratings. For example, a rating that was given 12 months ago might be given a +2, a rating from 6 months ago might be given +1. It might be a little more fine-grained than this, because the inflation has been more or less continuous, starting from when the ratings were made public last August.

 

What are your thoughts?


If the idea of normalizing ratings were to be implemented, I would much rather see the most recent ones brought down rather than the older ones raised. If we raise the older ones, we effectively accept the compressed span of the ratings: instead of 1-10, we readjust everything closer to 5-10, and the average ends up around 7.5. We would be better off, I think, bringing the average back to 5 by lowering the latest ratings by about the same amounts discussed.

 

However, I think we need much more radical changes than this adjustment. Perhaps it is time to remove all the existing ratings and start over with a clean slate. This might work if we also took more care with preventing bogus accounts, group raising/lowering of ratings, retaliatory attacks, and "you scratch my back - I'll scratch yours" type ratings. We would also need a new, improved ratings guide, so people might have better guidance on how to rate objectively and usefully. Somewhere in here we might also consider purging all photos which have been on photo.net for more than 6 months with no ratings and no comments. Finally, I think it may also be time to limit the number of photos (or the amount of storage space) any one member may have on photo.net at one time. This might encourage people to clean out bad or poorly-received photos in order to replace them with better work. It would certainly reduce our storage bills, which would not hurt.

 

And what about letting only paying members rate photos? This might also help reduce the ratings problems, at least with bogus accounts.


Why on earth would you think deleting all photos without ratings would help? Some people upload photos to share but don't submit them for critique. The idea of normalizing the ratings really doesn't make much sense to me either. My personal opinion is that PN should reevaluate the ratings system altogether and figure out a new system. Just my $0.02

 

 

Brian


Oh Brian,

 

I made this suggestion before and I think the idea is great!

 

The way to go IMHO is to normalize in 2 ways (if this doesn't increase the workload on the database too much):

 

1. There are high raters and low raters. One person only gives ratings between 5 and 10, another between 3 and 10, and the next between 7 and 10. Determine the mean and standard deviation for each photo.netter and normalize them to a mean of 5 and a common standard deviation that spreads each person's entire rating spectrum out over 1 to 10.

 

2. As you said, ratings have gone up on average. For each week, determine the mean and standard deviation of all ratings given that week and normalize to a mean of 5 and a common standard deviation, just like above.

 

At first this sounds like a lot of calculation, especially for the first part, but instead of adding or subtracting a fixed value for each rating you just use the mean and standard deviation, feed them into a function that any statistics package should offer, and write the result into a new field which contains the corrected rating. I strongly suggest preserving the original ratings and offering the corrected value as an additional way of looking at things.
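For concreteness, here is a minimal sketch in Python of the week-by-week renormalization described above. The input layout (rating id, raw rating, week) and the target mean and standard deviation of 5 and 1.5 are only placeholders, not anything photo.net actually stores:

    from statistics import mean, stdev

    def normalize_by_week(ratings, target_mean=5.0, target_std=1.5):
        """ratings: iterable of (rating_id, raw_rating, week) tuples.
        Returns {rating_id: corrected_rating} using a per-week z-score."""
        by_week = {}
        for rating_id, raw, week in ratings:
            by_week.setdefault(week, []).append(raw)

        # Mean and spread of all ratings given in each week.
        stats = {w: (mean(vals), stdev(vals))
                 for w, vals in by_week.items() if len(vals) > 1}

        corrected = {}
        for rating_id, raw, week in ratings:
            m, s = stats.get(week, (target_mean, target_std))
            z = (raw - m) / s if s else 0.0      # distance from that week's average
            corrected[rating_id] = target_mean + z * target_std
        return corrected

The corrected values would go into a separate field, as suggested above, leaving the raw ratings untouched.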

 

The whole calculation could be done on a weekly basis or on a rolling basis as CPU and database server load permits. This is of most interest to people who are not so much interested in shooting stars who are kings for a week, but are instead looking for pictures or people with sustained good work. Those people do not need an updated list every hour. Once a week would be just fine.

 

And if you want to reduce workload, stop calculating the mutilated top-rated list every hour anyway; every 3, 6, 12 or even 24 hours would be absolutely enough.

 

Thank you very much for your effort.


I personally don't rate because the numbers don't have a lot of meaning. One person's 1 is another's 10. I uploaded a sunset photo that wasn't particularly good as an illustration to a thread. One person rated it a 9 for originality. While I appreciate the thought, I just don't see a rather average sunset photo being that original. My guess is most photographers have hundreds of them. :)

 

If you must attach numbers to photos, why not add words to the numbers? Make it a chart:

 

1 - The most horrible photo you've ever seen. Your cat could take a better one.

...

5 - Not terrible, but nothing especially appealing.

...

10 - Breathtaking. This photo is in the very top of those you have seen in your entire life.

 

Obviously you'd come up with your own words, but that's the general idea. It's still subjective in that one person's breathtaking is another's awful, but at least people would be agreeing on the scale in use.

 

Since this is for a "Filter Photos by Rating", I'd like to also suggest a "Filter Photos by Comments" option. I am far more interested in looking at photos with lots of comments because I learn from what people say about photos far more than a number they attach.


I'm thinking of doing the normalization only for the Top Photos function (i.e. Filter Photos by Rating). Ratings presented in folders, etc., would be the same as now, that is, the raw ratings.

 

And I probably would only use the time-normalized ratings for periods longer than 1 month in the filter function.

 

After my post above, I realized that I have to normalize the standard deviation as well. For example, for the last week, the highest rated photo is one by Emil Schildt with an average score of 9.56/9.56 over 16 ratings. A score like this would never have happened a year ago. But if I just lop off 2 points, which is the amount the average has moved up, and make Emil's photo 7.56/7.56, then I've whacked the photo too much, because with the inflation the ratings have also gotten more compressed into the high part of the range. Emil's photo is probably 2 standard deviations above average. A year ago the average was around 5 and the standard deviation was probably around 1.5 (I have to calculate it). Now the average is 7.5, but the standard deviation is probably 0.75; so if I want the average to be 5.5 and the standard deviation to be, say, 1.5, Emil's photo should be normalized to 8.5 -- in other words, a standout photo (which it is), but not stratospheric.
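As a rough sketch of that arithmetic in Python, using the guessed figures above (today's mean of about 7.5 and standard deviation of about 0.75, mapped onto a target of 5.5 and 1.5); with these guesses the exact z-score comes out nearer 2.7 than 2, so the precise result lands a bit above the rounded 8.5:

    def renormalize(score, current_mean=7.5, current_std=0.75,
                    target_mean=5.5, target_std=1.5):
        """Re-express a score from the current distribution on the target one."""
        z = (score - current_mean) / current_std   # standard deviations above today's average
        return target_mean + z * target_std

    print(renormalize(9.56))   # about 9.6 with the exact z of ~2.7;
                               # calling it "2 standard deviations" gives the 8.5 above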

 

Am I thinking about this the right way?

 

Now as for normalizing by rater as well as by time: how would I do both? How do I normalize the ratings of an extreme rater (low or high) whose ratings have remained extreme, even while they have moved up over time with the general trend?

 

And do I really need to do both? Doesn't it seem that, for the purpose of showing the top few hundred photos (by various criteria), which have quite a lot of ratings, the idiosyncrasies of different raters would come out in the wash?

 

By the way, I'm planning to bring back the full "top" list, perhaps based on normalized ratings, and calculated only once per day.


Hi Brian,

 

I have been one of the rare few who has said all along that the rating system was useful. I found it very handy for quickly culling out some of the very best work here on Photo.net. However, given the inflated ratings over the past year, I think the usefulness of the system is now gone. I don't think a statistical fix is in order here; it would simply be a band-aid for the real problem. There has been talk for some time about a new ratings system. Instead of expending energy on this, why don't the powers-that-be simply implement a new system? Then all work will be on an even playing field.


Brian, you're right. The rating inflation not only moved the averages up, but also compressed the distribution. Any normalization of the ratings will have to take that into account.

 

It would be interesting to see a graphical representation of the current distribution of ratings, next to one from some point in the past -- say, 6/1/2001 -- to compare it to. I'm sure it would effectively illustrate the reason that your suggestion is a good one.

 

BTW, I think that it would be too complicated to also try to "normalize" the ratings to take into account the different ratings approaches of different individuals. Also, unnecessary. Different people have different "scales" -- that's normal. The problem here was the overall upward drift of the entire distribution.


I agree, if you are going to do this, you need to re-adjust the current ones and decompress the ratings. However, this isn't exactly great either, because some photos do deserve their marks, high or otherwise, and lowering their ratings may be inappropriate. Also, within certain groups of raters the inflation is greater than in others, and still others do not inflate at all. It's a touchy spot and I'm not really sure you can fix it that easily. Hopefully this trend will stop, but nobody knows.

 

Perhaps ratings should be kept anonymous to everyone but (s)he who owns the photo. Or perhaps, if something comes up, like a person running through a portfolio leaving 1 and 1, the data would exist but nobody would know who left it without contacting the administrators and stating that there's been a problem/violation. It does seem that the ratings have gone up since the information on who left them was exposed.


The problem with the 5-10 system is that 7.5 is average.

 

My solution is to do away with the current ratings system and create a new three-pronged system to replace the current one:

 

1-10 Aesthetics

1-10 Phototechnical Qualities

1-10 Originality.

 

1-2 == snap shot

3-4 == Good snap shot

5-6 == Average, and that is ok.

7-8 == A Keeper

9-10 == Three Thumbs Up.

 

Then remove all current ratings, and have a ratings week on photo.net when everyone is asked to rate photos to re-establish the system.

 

The current system is both imprecise and inflated, with no ability to rate phototechnical qualities.


Brian,

 

Concerning the double normalization: if you only do the time-related normalization, that would be much, much better than doing nothing. On the other hand, a double normalization does not seem such an intractable problem to me; a seasoned statistician might have a solution for it. Given there is enough CPU and database server power, statistics could also be used to address another problem: a statistical test for a non-normal, e.g. bimodal, distribution of ratings could identify accounts that only ever rate 1/1 or 10/10.


Thanks everyone for your input. I'm going to release a new photo filtering feature today, based on the old rating system. This filtering system highlights the need for normalizing the ratings; but normalization is clearly a patch on the old system (like requiring comments for ratings below 5 was). I do plan on implementing normalization soon, but I don't want to delay the new filtering code any longer.

 

As many people have heard by now, we plan also to reform the rating system eventually -- in fact, I'd like to replace it with a "collaborative selection system". (Sorry, but I'm a software engineer as well as a photographer; I have to talk like this or else other software engineers won't pay attention.)

 

After a year of participating in, then observing, the rating system, I believe that we can't really design a good "rating" system. People need to be trained to a certain degree and accept a somewhat common discipline to "rate" photos consistently, and we don't have the means to conduct such training effectively, or the willingness to verify or impose the discipline.

 

But I think everybody can "select" photos they "like", and the only requirements for this are honesty and sincerity. (These are still not easy things to obtain on the Internet, although not as difficult as obtaining consistency and discipline.)

 

A collaborative selection system will meet the needs of the site for a way to identify photos that should be featured (as the current system does) and will also provide more understandable and meaningful feedback to photographers than the current rating system.

 

It is taking longer to implement than I thought it would, mostly because in my role as Editor-in-Chief I don't get a lot of time. Hence the patches on the old system while I'm doing the new one.

 

By the way, Bernhard, I didn't suppose a double-normalization was intractable -- I was just looking for guidance from any seasoned statisticians out there on how to go about it.


One more thought -- on the matter of normalizing the ratings distributions of each individual rater to account for "high raters" and "low raters": I'd suggest that this may not be a good idea.

 

Many photo.netters developed a practice of only rating photos they thought were among the better ones here. This became especially acute after ratings were made non-anonymous, and giving a photo a low rating risked retaliation. These folk were not necessarily "high raters" who were incapable of seeing anything bad; rather, they were just (knowingly) rating only the upper part of their "own" distribution. The photos that they might have been inclined to give lower ratings to, they simply did not rate. If you try to "normalize" the entire distribution of such a member's ratings, moving their lowest ratings even lower to reflect an assumption that they should be near the bottom of your assumed "normal" distribution, you will distort the intent of the rater's numbers.


Dave, I think you're right. Normalizing individuals' ratings is much more problematic, simply because many raters do not have a "normal" distribution, but essentially the high side of a normal distribution, suppressing the low side of it by personal decision. I'm wondering also what this phenomenon means for the time-based normalization.

"I'm wondering also what this phenomenon means for the time-based normalization." I don't think it would affect the validity of a normalization of ratings designed to account for the overall ratings inflation we've seen here. Say a member tended to rate only the photos s/he thought were in the top third or so, and just passed on the other ones. Maybe a year ago, such a member would have been giving a lot of 6's and 7's. A member who is currently tending to rate only the photos s/he thinks are in the top third or so is presumably handing out a lot of 7's and 8's and probably a fair number of 9's. Seems to me that a "time-based" normalization would still work to make the older ratings and recent ratings more equivalent, even though they are coming from members who are only rating what they view as the better photos.

"they were just (knowingly) rating only the upper part of their 'own' distribution."

I think Dave made a good point here. It is probably better not to attempt a per-user normalization now. But I still think it would be nice if there were a way to make it harder for fan clubs to rate each other high in the sky. Anyway, the new top-photos feature helps to overcome this a little bit too.


I agree. It's really too late to change the ratings. I agree with Dave that the majority of them have resulted from people only rating the better photos. I don't rate that often because I don't do that, so I do run into retaliation issues from time to time. I think the best thing to do would be to have a way of tracking whether a certain person leaves a lot of 1/1s or 10/10s, or at least something that is unusually far from the norm without any reason whatsoever. Either that, or we get rid of the number system now and have something that asks questions: the person selects an answer from a pull-down menu, and point values are applied to the answers.

 

Kind of like those silly rate your sex life things in magazines.


The current inflated ratings often mean exactly what they should mean: 'I think this image deserves greater exposure (no pun intended) by being placed on the two-day (or longer) list.' Anything else has little meaning. A '6/6' means "I don't like it all THAT much" and is mostly offered, as are most other middle-range ratings, with no technical critique. No critique, 'I like it / I don't like it', and 'I've never seen that before' is what the current criteria encourage.

 

Include 'photographic technique' as one of the criteria. It IS reasonable to promote photographs as opposed to conceptual art on this site. We should also somehow give some kind of tangible benefit to those photographers who are willing to take the time and risk ridicule or retaliation by offering a comment that has technical merit. Right now, they are few and far between...


I realize, reading this thread, and mostly Dave Nance's objection to the proposed "rater-based" normalization, that the problem is indeed more difficult than it seems.


Yet I can't agree with the idea of giving up on what we have here for something like an "I-like-it" system. Or at least not until everything has been tried to give a more interesting evaluation system a new life -- whatever it might be... No clue at this point what the ideal system would be, but I'm prepared to work on it if you'd like me to. Basically, there are thousands of possible rating systems, and none of us, I think, has explored them all. And logically, there HAS to be some sort of approximately good solution.


I can't agree with this sentence, Brian: "A collaborative selection system will meet the needs of the site for a way to identify photos that should be featured (as the current system does) and will also provide more understandable and meaningful feedback to photographers than the current rating system." Why? Because "I like it" means nothing at all to me unless you tell me HOW MUCH you like it... We don't need figures to evaluate an emotion level at all, that's agreed. But we need SOMETHING... or else I'd just read whatever written comments I get and forget about the rest...


So, I'll be looking for another solution -- another "patch" that would actually work... The goals are: 1) to be fair to the victims of the old system; 2) to prevent mutual admiration societies from forming and to reduce the effects of their past activities; 3) to prevent bogus accounts and silly low-raters from damaging the usefulness of the present rating system in the future; 4) to get as many people as possible to WRITE comments on pictures -- especially the ones they find weak; 5) to find a way to normalize WELL, taking into consideration the compression effect described above by other members.


1) First thing that comes to my mind for the 1st problem: AN AMNESTY! :-)) Let's simply DELETE all ratings below 5, and change the system to a 5-to-10 rating scale at the same time. (Or, if you prefer, "convert" the mathematical value of a 6 out of 10 to an 8 in the context of a 5-to-10 scale, etc...) [By the way, people would then in the future need to write a comment for a 5 or 6 to be valid options -- to preserve the idea of the previous change.]
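For illustration, the conversion in the parentheses is just a linear rescale of the 1-10 range onto 5-10; the rounding at the end is my own assumption:

    def rescale(rating, old_lo=1, old_hi=10, new_lo=5, new_hi=10):
        """Linearly map a rating from the old 1-10 scale onto the new 5-10 scale."""
        fraction = (rating - old_lo) / (old_hi - old_lo)
        return new_lo + fraction * (new_hi - new_lo)

    print(round(rescale(6)))   # a 6 out of 10 becomes roughly an 8 on the 5-to-10 scale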


Some people's averages are now unfairly brought down by low ratings which are simply revenge ratings and such. Of course, this measure should be applied to ALL the pictures on this site -- not only old ones.


Then there might be quite a number of funny bad pictures popping up on the top pages -- and some very good ones too! Make the top pages 500 photos instead of just 300, and for 1 month ask all members to go at least once a week through all the pictures they haven't rated yet in that list. Some will, some won't -- never mind -- but those who want good pictures at the top will act, and those who want THEIR OWN pictures at the top might act twice as much, but will then be traceable for nasty actions, if any.


OR you could make it compulsory to rate at least 250 shots in the top-rated during that month. Or 100, let's say...


Since the average rating on the site is now at 7++ anyway, that will bring the whole of Photo.net to adapt to a 5-10 system, which is only good, I think... Another good point is that a lot of pictures presently in the top-rated list which might not deserve to be there might be "rectified" downwards along the way. And nobody would have the time to carry out revenge ratings, because there will just be TOO MANY ratings during that period... :-) With all this done, wait and see what happens for a while before going to step 2.


If Photo.net has any interest in POSSIBLY going ahead with this step 1, please let me know, and I'll be preparing something more serious about step 2 in the next few days. I already have a few ideas, but they sometimes conflict right now, so I'd better do a little homework before going ahead...


OTHER COMMENTS: I agree with Jeremy Stein's entire post. I find Carl Smith's last post very valid too. I like the idea of a 3rd, technical rating -- or ideally 5 rating categories, including lighting and composition, if possible. I like the idea of a drop-down menu with key explanation sentences that would go along with a rating... I finally like the idea Carl suggested -- as Bob Atkins has in the past -- of dropping the highest and lowest ratings a picture gets... So, let's not give up too quickly on a normalized rating system... To be continued.


THANK YOU, Carl... for this sentence: " We should also somehow give some kind of tangible benefit to those photographers who are willing to take the time and risk ridicule or retaliation by offering a comment that has technical merit. Right now, they are few and far between..."


I believe I am one of them. Since I returned to Photo.net, I have decided to say boldly what I think, and have added all sorts of advice. Almost every rating I give comes with a comment. Retaliations? Plenty, of course... AND SOME VERY OBVIOUS ONES... Anyone who has a look at my folders will even find 5/5 ratings by people who insulted me publicly on another site, and who aren't even truly active here. And most of the rest of the 5s and 6s are from users with 0-upload folders, some of them registered in July 2002 -- possibly bogus accounts...


I can't even begin to imagine how many similar cases would have hit someone who speaks his mind, like Tony Dummett, and many more...


A few days ago, Carl, I decided to give up rating weak images. What you just wrote shows me that at least one person cares to carry on with integrity. Thanks to that, Carl, I'll now be considering whether I should restart my crusade or not... :-)


Till I read the above, I had come to the conclusion that at least 50% of the top-rated images were not deserving of their ratings, and I'm sure many would agree... but what's 1 rating alone against 20 mate-ratings?! This is a battle all the honest people on Photo.net must fight together, and then the site will look a lot nicer...


For your information, I've been BANNED from another site as a consequence of my insistent pointing out of obvious technical flaws in images... On PN, hats off to Brian and Jeremy, who seem to really be moving in the right direction. Kudos for keeping your minds open to improvements, and for pursuing the right goals...


Best regards.


Oops. I just accidentally posted these to the wrong thread. They belong here... I hope...

 

The only solution that I can think of to reduce the negative impact of this 'gang' approach will be unpopular, because it restricts the freedom of participants to do as they please. I would prefer not to have people vote who are not themselves qualified photographer/judges. No highly rated images -- no ability to rate. For everyone else, limit the number of rating opportunities per time period, per photographer being rated, etc. Limit similar images. It sounds like a police state to too many people, I suspect, but I, for one, would be willing to support the concept of merit.

 

Marc, thank you for your response. I have noticed you taking the time to provide constructive criticism. I'm not sure I'm as courageous as you think; I've actually limited my posts. There's a current image that I very much want to give a 1/1 to, for several reasons, but I'm hoping there's a better way to deal with this particular image.

 

 

I would like to hear other ideas on how to integrate critiques and ratings. As I've said, the current setup does not encourage technical opinions, at least not in this context.

 

 

I have a little experience judging photo competitions, and my approach is 'how does this image compare to others of its genre?' If you don't have first-hand experience shooting this style, it's hard to make an objective, informed rating. But how do you translate that into a database sort?


This is an impossible situation for two reasons:

 

1) No matter what numeric method you decide upon, someone (and likely everyone) is going to be unhappy and rant mercilessly about what a bunch of morons you are to use this method or that method.

 

2) The second problem is a technical one: you are assuming that the only driver of the change in average ratings is time, which is not necessarily true. The frequency of "old-standard" raters' ratings can change; some "old-standard" raters adjust their ratings and some don't; or some old-standard raters' averages go up not because they've changed their rating yardstick, but because they've gotten bored of rating and commenting on boring photos. The bottom line is that ratings can change over time for many reasons other than time itself, so assuming time is the only pertinent variable leads to incorrect normalization factors. Some of the thoughts here assume correlation = causation, which I learned in my econometrics classes (well, actually I learned it much earlier than that) is flawed logic. You'd really have to look at a multivariate regression to get it right, and I'm not sure that data is kept on all the potentially pertinent variables.
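As a minimal sketch of the kind of regression being suggested, assuming pandas and statsmodels are available and a hypothetical export of the ratings table with rating, rater_id and month columns (none of which photo.net necessarily keeps in this form); the month coefficients would then estimate the drift over time with rater effects held constant:

    import pandas as pd
    import statsmodels.formula.api as smf

    # Hypothetical export of the ratings table: one row per rating.
    ratings = pd.read_csv("ratings.csv")   # columns: rating, rater_id, month

    # Rating explained by when it was given and by who gave it; the month
    # coefficients estimate the drift over time net of rater effects.
    model = smf.ols("rating ~ C(month) + C(rater_id)", data=ratings).fit()
    print(model.params.filter(like="month"))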

 

I think a better solution is not to monkey with the numerical ratings; if people aren't happy with their "old-style" ratings, they can delete their photo and re-submit it! I would imagine that the drift in mean ratings could be correlated to the appropriate variables (but I doubt even the indefatigable Mr. Mottershead could tackle that one), and from there an appropriate normalization could be done, but then there'd be only 3 people on photo.net who understood the method, so you've still got everyone mad <grin>.

 

So if you can't normalize with a reasonable, defendable, and explainable degree of precision, don't try to force a fix; just let the system drift to a new mean on its own, and advise folks to re-submit if they lie awake nights worrying about the rating.

 

Now, for those folks (like me), that want to see the top 500 of the 378,141 photos on photo.net, I say add another feature to the date range to let someone choose between date ranges, and if you don't want it dynamic, just take a look and see when "inflation" started, and add a selection for "pre-inflation" or somesuch.

 

So, for those that complain their photo isn't rated high enough, resubmit. For those (like me) that are trying to find inspiring images, give them a new date range.

 

My two cents for the night.


I have another idea for the per-user normalization. Have every user who wants to rate (or to keep rating) rate a standardized set of 20-50 pictures ranging in quality from abysmal to stellar. Then you have a common point of reference for each individual's rating behaviour and can normalize according to that.
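A minimal sketch of how that calibration could work, assuming a hypothetical reference set whose scores are fixed in advance by the editors; each user's offset and spread relative to that set would then rescale their ordinary ratings:

    from statistics import mean, stdev

    # Hypothetical editor-assigned scores for the calibration set, keyed by photo id.
    REFERENCE = {101: 2.0, 102: 4.0, 103: 5.5, 104: 7.0, 105: 9.0}

    def calibration(user_ratings):
        """user_ratings maps calibration photo ids to the scores this user gave them.
        Returns (offset, scale) describing the user's personal rating habits."""
        ref = [REFERENCE[p] for p in user_ratings]
        own = [user_ratings[p] for p in user_ratings]
        s_own = stdev(own)
        scale = stdev(ref) / s_own if s_own else 1.0
        offset = mean(ref) - mean(own) * scale
        return offset, scale

    def corrected(rating, offset, scale):
        """Map one of this user's ordinary ratings onto the common reference scale."""
        return offset + rating * scale

    # Example: a "high rater" who scored the calibration set 6, 7, 8, 9, 10.
    off, sc = calibration({101: 6, 102: 7, 103: 8, 104: 9, 105: 10})
    print(round(corrected(9, off, sc), 1))   # this user's 9 maps to roughly a 7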

I've worked on a site that gives performance tests, where you answer a series of questions by rating where you think you fall on a scale from one to ten.

 

The scores settled down quite a bit when we added text descriptions to the numbers ("I do this sometimes", "Never", "Every day").

 

I will say I don't like the current system, as adjusting the scale to start with average does nothing but raise the average score, and some photos are truly below average.


Brian, I think the primary cause of inflation is the change in the way the choices are displayed. This is an interface issue. An interface is self-explanatory, whether you want it to be or not.


In the old system, it was a 1 to 10 scale.


In the new system, it is perceived as a 5 to 10 scale, with 5 being the lowest rating and 10 the highest. Nobody reads the fine print.


A partial remedy would be to display the 1 to 10 choices and ask for feedback afterwards if a rating lower than 5 is submitted. Perhaps the feedback should not be mandatory; just display the form and say "Please explain the low rating and how the picture could be improved..."

