Hmm... I planned on creating a separate thread, just for my suggestion... and reading this all, I probably should have.
Only Warp sorta responded to it, but not really, as he didn't really respond to anything particular, and mainly stated his own definition of technical rating...
Could anyone (particularly Warp) give some reasons why my idea shouldn't be implemented?
I'm not opposed to your idea per se. It could work. It's just that I wanted to point out my opinion about the technical rating, which you had the strongest objections against. I agree that as it is now, it doesn't perhaps work perfectly.
Reimplementing the rating system to remove one of the categories and changing the other from 0-10 to 0-100 (in practice) can have some minor technical issues.
How would the existing ratings be converted to the new ratings? (I'm assuming it wouldn't be a very popular decision to simply drop all the existing ratings.)
How should the web form be changed so that one could enter a value between 0 and 10 with one decimal of accuracy? An editable text field could work, but it might be slightly confusing to use, as well as error-prone. If the input is invalid, the server would have to do something about it. Maybe JavaScript could be used to enforce a correct value (but the server would still have to do something if an incorrect value is entered with JavaScript disabled).
Also, some people might like having two categories more than having just one. We should hear their opinion as well.
The current ratings could be transferred, counting the entertainment value for 2/3 and the technical rating for 1/3, and let that constitute the single new rating. People can then easily, by watching their rating list, change their ratings.
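To make that conversion concrete, here is a minimal sketch of the 2/3 and 1/3 weighting. The function name `combined_rating` and the rounding to one decimal are assumptions for illustration, not anything the site actually does.

```python
def combined_rating(entertainment: float, technical: float) -> float:
    """Merge two 0-10 ratings into a single one.

    Entertainment counts for 2/3 and technical for 1/3.
    Hypothetical helper; rounding to one decimal is an assumption.
    """
    for value in (entertainment, technical):
        if not 0 <= value <= 10:
            raise ValueError(f"rating out of range: {value}")
    return round((2 * entertainment + technical) / 3, 1)
```

With this, an old 8 entertainment / 4 technical pair would become 6.7, and a 6 / 9 pair would become 7.0.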
Yeah, this might be a small problem. I think the best way is indeed a field where values can be typed in. I think 7, 7.0 and 7,0 should all be valid ways of inputting it. A small line of explanation above could probably clarify it. This would also solve the currently unclear labels.
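As a rough idea of what the server-side check could look like, here is a sketch that accepts 7, 7.0 and 7,0 and rejects everything else. The name `parse_rating` and the choice to raise `ValueError` on bad input are assumptions; how the site would actually report the error is a separate question.

```python
import re
from decimal import Decimal

# One or two digits, optionally followed by "." or "," and exactly one decimal digit.
_RATING_PATTERN = re.compile(r"([0-9]{1,2})(?:[.,]([0-9]))?")


def parse_rating(text: str) -> Decimal:
    """Turn user input such as "7", "7.0" or "7,0" into a rating from 0.0 to 10.0.

    Hypothetical helper; raises ValueError for anything that is not a valid rating.
    """
    match = _RATING_PATTERN.fullmatch(text.strip())
    if match is None:
        raise ValueError(f"not a valid rating: {text!r}")
    whole = match.group(1)
    tenth = match.group(2) or "0"  # "7" is treated the same as "7.0"
    value = Decimal(f"{int(whole)}.{tenth}")
    if value > 10:
        raise ValueError(f"rating above 10: {text!r}")
    return value
```

The same check would still be needed even if JavaScript validates the field in the browser, since the browser-side check can be switched off.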
Yeah, some more opinions on this would definitely be good (although they mostly come after something is implemented). Either way, it doesn't need to be exactly my idea that gets implemented; I'm just giving my opinion on what I think the ideal system is. If some aspect (like being able to rate in decimals) were implemented, it could already be an improvement.
I give jimsfriend's first post 8/10 for entertainment and 4/10 for technical so I guess that's 6.7. If I'm ever in the mood to skim over the forums I will remember to return to it as it does cover most points here.
Baxter's idea I give 6/10 for entertainment and 9/10 for technical or 7.0 combined. Having them combined does make sense since most people use arbitrary scales anyway. Also I like the idea of decimals being used to make your own list more refined. For input you could have two drop-down lists or radio buttons or whatever, one for the unit and one for the decimal (if writing it in proves difficult).
Personally I still like the idea of boolean rating, as it is harder to skew averages and people are notoriously bad at expressing their feelings on a scale with more than 2 options (let alone 100) -- basically any movie you don't rate counts as haven't seen/don't like and the only option is to mark the movie as "I like it". That takes away making ordered preference lists but that could be implemented in a different way, maybe a way to order the movies you marked as liking but which does not affect the overall "rank" within the site (again, if I'm a bastard I'll rate movies I like 10 and movies I don't like 0 as this is the best way to skew the overall rating)
Personally I still like the idea of boolean rating
That would work well if thousands of people gave ratings to movies. However, the average amount of ratings for a movie is pretty low, so a 1-bit rating isn't very accurate nor descriptive (and would cause many movies to end up having the exact same overall rating value).
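To put a rough number on that, here is a small sketch counting how many different averages a movie can end up with for a given number of raters. The function name and the model (every rater picks one of a fixed set of evenly spaced values) are assumptions for illustration only.

```python
def distinct_averages(num_raters: int, num_levels: int) -> int:
    """Count the different averages that num_raters can produce.

    Each rater picks one of num_levels evenly spaced values:
    2 for like / no like, 11 for integers 0-10, 101 for 0.0-10.0 in steps of 0.1.
    The sum of the picks can be any whole number of steps from 0 to
    num_raters * (num_levels - 1), so that many plus one averages are possible.
    """
    if num_raters < 1 or num_levels < 2:
        raise ValueError("need at least one rater and two levels")
    return num_raters * (num_levels - 1) + 1


for levels in (2, 11, 101):
    print(levels, distinct_averages(5, levels))
```

With only 5 raters, a 1-bit rating allows 6 possible averages, the 0-10 integer scale allows 51, and a one-decimal scale allows 501, so ties are far more likely with the 1-bit system.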
I like the idea of having decimal points in the rating scales.
I do NOT like confining it to a single category. I would be open to more categories or to redefining the existing ones more precisely, however.
In general the ratings aren't as accurate as they could be, but they aren't completely unreliable. They serve a nice purpose and give the audience something to do. I would hate to see the rating system removed.
How about instead of removing rating categories, we add more of them? Here are a few ideas:
- Perfection: Does it look frame-perfect? Can you spot flaws or sloppy play? Does it look like what a tool-assisted run should look like? Is it up to the standards of this site?
- Technique&tools: Are advanced TAS techniques and/or tools used in the run? Are they used efficiently and with expertise? Are they abundant? Are they performed with style? Were unusual tools or techniques used? Think of it as if you were rating figure skating (ie. the important thing is not speed, but technique).
- Overall entertainment: Forgetting about techniques, tools and speed, is the run fun to watch? Is it boring? Is it too long? Is it a good game choice for TASing? Could it work as a music video? Would you recommend this as a first movie for a newcomer?
- Interest for gamers/speedrunners: Would the TAS be interesting for someone who has played the game? Were the game physics broken to smithereens in interesting ways? Does it show things about the game that you never knew were even possible? Are the abused game bugs interesting or amusing? How well did the author explain the game mechanics, how they were abused, his route selections, etc?
Just throwing some ideas.
Warp: I think that's even worse than the current system. If people want to consider all that, fine, but just come up with a single rating that fits everything you considered. As for your individual suggestions:
- perfection: you can only guess this, and it would give rating boosts to games that are easy to TAS
- Technique&tools: different TASes need different tools. It's hard to really know which techniques are used, unless you are really interested in the game. I don't think someone who just watched the TAS for fun would care a whole lot. They just want to watch a fun movie and give a rating.
- Overall entertainment: this is obviously important in whatever rating system is implemented.
- Interest for gamers/speedrunners: I have a hard time finding the right words for how random and irrelevant this rating would be.
==================================
As for my suggestion, I kinda see that no one seems to be in favor of a single rating... well, if there must be two ratings, it would still be good if:
- You can rate in decimals
- Maybe have the option not to give a technical rating
- Have a clearly stated definition of what the technical rating is supposed to be at the side of the rating system
Even though right after the technical rating it says "(how close it is to perfection)", no one really agrees on what it actually means, and everyone gives it their own meaning. This might not be a problem to many, but it can be used to argue against people who say they are interested to see what technical rating people are giving.
I don't really understand how it can be used to argue against people who are just interested. The fact that every person defines the scale for themselves is not only true, it's fundamentally unchangeable. Which is why we shouldn't really argue about it, let alone try to change it in some way. What we should do is accept it and reduce the importance sometimes attached to it. Victory comes at the moment you decide it doesn't really matter.
Baxter wrote:
Assuming that most people do view the technical rating as a measure of how close the movie is to perfection, is it a good thing for this to influence the final score?
As good as any other beside entertainment. You can always take this "easy" game and say "this is optimal as hell", and likely be right. If that matters in some way, it can also matter as a part of the resulting score.
Baxter wrote:
Assuming that most people do view the technical rating as a measure of how close the movie is to perfection, how does one know how close a particular movie comes to perfection? The truth is, you have no way of determining that until you actually TAS the game yourself and notice how many frames you can save. Neither the tricks used, the amount of rerecords used, the author, nor whatever else you could possibly learn about the movie by watching it will tell you.
Duh?.. Let me tell you more: even if you do attempt TASing it, there's no way to tell you're doing what's best. Maybe you're overlooking as much as the original author. No way to tell.
That means that you (not you in particular) shouldn't pretend to assess the actual quality of optimization, but come to terms with the fact that you're assessing how optimal it looks. Which is neither bad nor wrong; TASes are a form of entertainment, entertainment is subjective, assessment is thus bound to be subjective. Nothing else needed.
Baxter wrote:
One can also see a very strong correspondence in some cases to people being entertained by a TAS, and the technical rating they give, even if this strictly shouldn't be the case.
That has more to do with people being stupid and not seeing/understanding the difference. Nothing new here, unfortunately.
Baxter wrote:
People might disagree with entertainment counting for 2/3 and technical rating for 1/3. Some people might find entertainment more important than that, or less important. People might also consider other things besides these two. Would it not be better for each person to consider whatever he finds important, weigh it as he sees fit, and compile it into a single rating?
Either too complex or potentially abusable. Hardcoded values won't be ideal, but they'll give a solid, robust base.
Baxter wrote:
People might want to rate higher than a 8, but wouldn't quite give it a 9 (or want to be between some other numbers). Some people consider a 10 to be a perfect score, and are reluctant to hand it out, but it's the only option if something is worth more than a 9. Being able to give ratings like a 8.2 or a 9.3 would solve this problem.
I don't think such accuracy is needed, either. For one, I'd be just fine with 0—20 system (via .5 decimals), but wouldn't be sad if the current 0—10 one stayed, either.
Baxter wrote:
Being able to rate a 8.2 or 9.3 (or whatever) will also enable you to list the TASes you've rated better by rating. This way, you will truly get a list of TASes you like best to TASes you 'like' worst. The current system doesn't produce this kind of list for two reasons: 1) You can only rate integers, and many movies will get the same rating, even if you like one movie a little better than the other. 2) The technical rating will give boosts to some TASes, even though you don't like them as much... this will especially be the case for the 'easy' games I mentioned earlier.
It won't be ideal at all. For one, I like certain aspects in one kind of TASes, but different ones in another. I value them differently. So far I can express it by giving different tech and entertainment ratings to represent that value. With a single value they'll be equalized, which I would be less comfortable with.
The current system is far from ideal in this case, it may even not be better, but it's not really worse either.
Baxter wrote:
The labels the current integers have "slightly above average" and so on are very confusing, and might not represent what people think. It doesn't matter if one person gives his movies an average rating of a 5, while some other gives them an average rating of an 8, as long as their own list is consistent. I don't think these labels are needed.
These are only guidelines; you shouldn't worry much about them because people aren't taking them literally either.
Warp wrote:
I have always had the opinion that people understand the technical rating wrong. It was not what I had in mind when we created the voting system.
But that's so human-like! I can't believe you were expecting everyone to take it as you wanted them to. Though maybe you weren't, but then you shouldn't feel bad about it.
Warp wrote:
Does it perform heavy luck manipulation? If so, does it do it to its great advantage? Is it "cool"? Does it zip through walls? Is the zipping performed with good style and technique? Does it "look good"? What kind of tools were used to make the run? Was lua scripting used to aid in making the run? Was a bot written to create part of the run? Was the game disassembled in order to understand how the rng works? That kind of things.
It also looks like a mishmash of subjective and/or shabby criteria that have about the same probability of being precise or misleading. Not to say any other criterion (except "was it bruteforced") isn't in this case, but still. I won't go into elaboration on how faulty they can be, because you can ask these questions to yourself just as well.
Warp wrote:
Even a frame-perfect run may deserve a low technical score if it doesn't show advanced and well-executed techniques. Perhaps the game in question just doesn't lend itself to awesome techniques, but then it's simply a poor game choice.
Makes very little sense. You're basically suggesting that some games will enjoy the usage of a full scale, but not the other games. And this is such an awesome favoritism that we could start presetting ratings for certain games just because they do or don't lend themselves to something. Let's start with SMB series. (Or wait, let's not, unless you want people to die.)
Warp wrote:
They want the technical score to be a pure optimal-frames/used-frames score, and nothing else. Interpreting it like that makes the whole technical score kind of moot and uninteresting. It has no value. It doesn't say anything.
It has. It gives you an idea of what people think of it. Wake up! The peers are telling you what they think! That's a valuable info that you could go and do something with. For example, prove them wrong. :D
Warp wrote:
Removing the stars is one of the symptoms of the current Political Correctness trend here, and I still heavily oppose the decision.
Current trend? Heh, as if. :)
Baxter wrote:
If people want to consider all that, fine, but just come up with a single rating that fits everything you considered.
Yeah, except coming up with a single value and trying to explain it to yourself is much harder. Try it. Why does this movie deserve 9.2 and not 9.3? What about another pair of 9.2-9.3 movies? Same kind of a difference? Or not?
A single rating is almost as close to being completely arbitrary as possible. Maybe it's a good thing, maybe not. After all, it's what most are striving for, so basically we're giving them a toy to fulfill their "lowly desires". :)
Warp wrote:
Edit: I think I understand now: It's my avatar, isn't it? It makes me look angry.
Even though right after the technical rating it says "(how close it is to perfection)", no one really agrees on what it actually means, and everyone gives it their own meaning. This might not be a problem to many, but it can be used to argue against people who say they are interested to see what technical rating people are giving.
I don't really understand how it can be used to argue against people who are just interested. The fact that every person defines the scale for themselves is not only true, it's fundamentally unchangeable. Which is why we shouldn't really argue about it, let alone try to change it in some way. What we should do is accept it and reduce the importance sometimes attached to it. Victory comes at the moment you decide it doesn't really matter.
My point was that if you say "removing the technical rating will remove some info (other people's ratings on the subject) I'd be interested to see", you should still note that you won't get consistent stats, since everyone is basically rating something else under the title "technical rating".
moozooh wrote:
Baxter wrote:
Assuming that most people do view the technical rating as a measure of how close the movie is to perfection, is it a good thing for this to influence the final score?
As good as any other beside entertainment. You can always take this "easy" game and say "this is optimal as hell", and likely be right. If that matters in some way, it can also matter as a part of the resulting score.
To me, it doesn't really matter. How close a movie is to perfection doesn't matter that much, since it can't be determined anyway... what would matter more to me is the effort the author put in, and his inventiveness.
moozooh wrote:
Baxter wrote:
Assuming that most people do view the technical rating as a measure of how close the movie is to perfection, how does one know how close a particular movie comes to perfection? The truth is, you have no way of determining that until you actually TAS the game yourself and notice how many frames you can save. Neither the tricks used, the amount of rerecords used, the author, nor whatever else you could possibly learn about the movie by watching it will tell you.
Duh?.. Let me tell you more: even if you do attempt TASing it, there's no way to tell you're doing what's best. Maybe you're overlooking as much as the original author. No way to tell.
This is true, but only strengthens my point.
moozooh wrote:
That means that you (not you in particular) shouldn't pretend to assess the actual quality of optimization, but come to terms with the fact that you're assessing how optimal it looks. Which is neither bad nor wrong; TASes are a form of entertainment, entertainment is subjective, assessment is thus bound to be subjective. Nothing else needed.
This way, it seems like it is all just one big subjective assessment (which it probably is), but all the more reason to combine them into a single rating. How optimal it looks sounds like it is very, very much related to the entertainment value.
moozooh wrote:
Baxter wrote:
People might disagree with entertainment counting for 2/3 and technical rating for 1/3. Some people might find entertainment more important than that, or less important. People might also consider other things besides these two. Would it not be better for each person to consider whatever he finds important, weigh it as he sees fit, and compile it into a single rating?
Either too complex or potentially abusable. Hardcoded values won't be ideal, but they'll give a solid, robust base.
Well, you could implement a scrollbar which sets what percentage each rating counts for in that movie... but it seems better to just place the scrollbar in your head and combine it all into a single value.
moozooh wrote:
Baxter wrote:
People might want to rate higher than a 8, but wouldn't quite give it a 9 (or want to be between some other numbers). Some people consider a 10 to be a perfect score, and are reluctant to hand it out, but it's the only option if something is worth more than a 9. Being able to give ratings like a 8.2 or a 9.3 would solve this problem.
I don't think such accuracy is needed, either. For one, I'd be just fine with 0—20 system (via .5 decimals), but wouldn't be sad if the current 0—10 one stayed, either.
I have a hard time understanding why someone could be against it. If you personally like to only rate integers, then you are still free to do so. There are a lot of people who would like to give more accurate ratings though, why not let them?
moozooh wrote:
Baxter wrote:
Being able to rate a 8.2 or 9.3 (or whatever) will also enable you to list the TASes you've rated better by rating. This way, you will truly get a list of TASes you like best to TASes you 'like' worst. The current system doesn't produce this kind of list for two reasons: 1) You can only rate integers, and many movies will get the same rating, even if you like one movie a little better than the other. 2) The technical rating will give boosts to some TASes, even though you don't like them as much... this will especially be the case for the 'easy' games I mentioned earlier.
It won't be ideal at all. For one, I like certain aspects in one kind of TASes, but different ones in another. I value them differently. So far I can express it by giving different tech and entertainment ratings to represent that value. With a single value they'll be equalized, which I would be less comfortable with.
In the end, your calculated average of both ratings is what matters anyway. If you really want to explain your ratings, you should write a review of the TAS or something; I don't see how just one more rating makes such a big difference... a forum post tells more anyway. This also makes me wonder: if you want to be so precise about what matters, then how come you don't want to be more precise by rating in decimals?
moozooh wrote:
Baxter wrote:
The labels the current integers have "slightly above average" and so on are very confusing, and might not represent what people think. It doesn't matter if one person gives his movies an average rating of a 5, while some other gives them an average rating of an 8, as long as their own list is consistent. I don't think these labels are needed.
These are only guidelines; you shouldn't worry much about them because people aren't taking them literally either.
The fact that people aren't paying attention to them anyway isn't really a good argument for keeping them, is it?
moozooh wrote:
Baxter wrote:
If people want to consider all that, fine, but just come up with a single rating that fits everything you considered.
Yeah, except coming up with a single value and trying to explain it to yourself is much harder. Try it. Why does this movie deserve 9.2 and not 9.3? What about another pair of 9.2-9.3 movies? Same kind of a difference? Or not?
If you rate enough movies, you will get quite a large list of your ratings. When watching your list, you can determine if you like a certain TAS better or worse than others, or equally well. If you like it nearly equally well as some other TAS, but just a tiny bit less, you might want to give it a 9.2 instead of the 9.3 the other TAS got.
moozooh wrote:
A single rating is almost as close to being completely arbitrary as possible. Maybe it's a good thing, maybe not. After all, it's what most are striving for, so basically we're giving them a toy to fulfill their "lowly desires". :)
Well, coming up with a list in the end that truly goes from the TASes you like best to the TASes you like least doesn't seem very arbitrary to me. If you want to elaborate on a certain rating, there are always the forums, and I don't see how a single extra rating makes it that much less arbitrary (especially since it's just as subjective as the entertainment rating, like you stated)... I think with being able to rate decimal numbers, it's less arbitrary than the current system.
- perfection: you can only guess this, and it would give rating boosts to games that are easy to TAS
Rating a run expresses your opinion of that run. It's not even intended to be an accurate measurement of something. Its only purpose is so that people can express how much they value the run. Averaging all the ratings gives a notion of what is the average opinion of that run.
moozooh wrote:
Warp wrote:
Even a frame-perfect run may deserve a low technical score if it doesn't show advanced and well-executed techniques. Perhaps the game in question just doesn't lend itself to awesome techniques, but then it's simply a poor game choice.
Makes very little sense. You're basically suggesting that some games will enjoy the usage of a full scale, but not the other games.
Exactly how is that different from the entertainment score? Some games do enjoy a better reception in the entertainment side than others. Is that "favoritism" or somehow unfair? No, it's just the opinion of people.
I don't see the technical rating being any different: It also expresses the opinion of people. Some games just aren't up to good technical ratings, in the exact same way as some games are not very entertaining. I see nothing strange or unfair here.
Hmm, if the voting topic would turn out to be in favor of somehow implementing ratings, this topic too is pretty important. I think it's obvious that my initial plan won't ever get enough support. I'll give 3 suggestions, which I think are very reasonable, and should be able to get some support:
1) Being able to give decimal ratings (like 7.8, 5.3, 9.4 etc). Some people might say that they don't think such accuracy is needed... but I know quite a lot of people would also like to be able to give more precise votes, and the people who don't want to do that can just stay with voting integers.
2) A better description of how the technical rating is intended to be interpreted. The current "(how close it is to perfection)" does not seem to do this well at all.
3) It is somewhat hard to view the rating lists of other users. http://tasvideos.org/rating.exe/my/ This page shows a list of the top 15 raters; it would be nice if their names linked to their rating lists, for instance.
1) As long as it is feasible technologically I don't see how that could be a problem. No one's stopping people from still giving only integral ratings.
2) Yes please. How about "Ignoring entertainment, how 'good' is this TAS?" Replace "good" with precise or some better words.
3) I agree as well. In case people don't know, you can go to the forum profile of a user and click the "Details..." link at the bottom to see their ratings, or you can replace "my" in http://tasvideos.org/rating.exe/my/ with a username. Which is not obvious at all :)
I'm no 'player' but I agree with all 3 of Baxter's suggestions above.
For decimals, how about having 2 drop-down menus for each score?
Looking sorta like:
[9].[5]
Just an idea to avoid having a hundred things to scroll through. Folk who want to use integers can use the left drop-down menu only, if they wish.
Assuming the left menu contained '0,1,2,3,4,5,6,7,8,9,10' and the right one '0,1,2,3,4,5,6,7,8,9', a score of 10.5 (or any number up to 10.9) could potentially be given, so this'd have to be capped.
I can only guess as to how tricky this is.
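For what it's worth, the capping itself could be sketched roughly like this. The name `combine_dropdowns` is made up, and working in whole tenths is just one assumed way to avoid rounding surprises.

```python
def combine_dropdowns(unit: int, tenth: int) -> float:
    """Combine the left (0-10) and right (0-9) drop-down values into one score.

    Hypothetical helper: anything above 10.0 (such as 10 and 5) is capped at 10.0.
    """
    if not 0 <= unit <= 10:
        raise ValueError(f"unit out of range: {unit}")
    if not 0 <= tenth <= 9:
        raise ValueError(f"tenth out of range: {tenth}")
    # Work in whole tenths so the cap is a simple integer comparison.
    tenths = min(unit * 10 + tenth, 100)
    return tenths / 10
```

Choosing 9 and 5 gives 9.5, while choosing 10 and 5 is capped to 10.0 instead of producing 10.5.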
For ideas of how to better explain the technical rating, how about, "The technical precision of this movie (how technically impressive it seems)"?
Just a small modification, but I don't think a massive change is needed.
I'm just some random guy. Don't let my words get you riled - I have my opinions but they're only mine.
For decimals, how about having 2 drop-down menus for each score?
Looking sorta like:
[9].[5]
Just an idea to avoid having a hundred things to scroll through. Folk who want to use integers can use the left drop-down menu only, if they wish.
I suppose it shouldn't be very hard to convert the current code to use the decimal. The good thing is that existing ratings don't need to be changed. Bisqwit will have to give a green light on this, though. (After all, it's his database.) I don't remember if the database stores the ratings in integers or decimals, but I assume it wouldn't be a hard thing to convert it.
I honestly cannot fathom why in the world anybody could possibly have a use for ratings outside of integral 1-10. Seriously.
What defines 8? What about 8.1?
Even then, what about 10? What if something comes along that's even better than that movie you rated 10?
What if something comes along that's even better than that movie you rated 10?
Your face would get so red. And then the world implodes.
Really though, I don't see how people would be able to rate accurately with a 1-100 scale, and I think it would end up making many go +/- 0.1-0.3 depending on their mood, the game or other ratings. Anything beyond 1-20 seems pretty redundant to me, though I don't see any great harm in at least making the option to rate all the way up to 100 possible.
I honestly cannot fathom why in the world anybody could possibly have a use for ratings outside of integral 1-10. Seriously.
Cpadolf wrote:
Really though, I don't see how people would be able to rate accurately with a 1-100 scale, and I think it would end up making many go +/- 0.1-0.3 depending on their mood, the game or other ratings.
What defines 8? What about 8.1?
I already somewhat responded to this:
Baxter wrote:
moozooh wrote:
Why does this movie deserve 9.2 and not 9.3? What about another pair of 9.2-9.3 movies? Same kind of a difference? Or not?
If you rate enough movies, you will get quite a large list of your ratings. When watching your list, you can determine if you like a certain TAS better or worse than others, or equally well. If you like it nearly equally well as some other TAS, but just a tiny bit less, you might want to give it a 9.2 instead of the 9.3 the other TAS got.
You basically would want to list them in a way that you get a rating list in the end that truly reflects your thoughts, and being able to rate decimals will make it possible to make small distinctions between different movies, if you like one movie slightly better, and want it to be slightly higher in your rating list.
Votes that take into account other ratings you gave are perfectly fine. You need to give consistent votes to all movies you rated, which kinda makes this necessary.
Xkeeper wrote:
Even then, what about 10? What if something comes along that's even better than that movie you rated 10?
If someone is really bothered by this, just don't rate a 10. You will have the same problem with only rating integers, but worse. If you decide not to give 10-ratings, with decimals you could still give ratings in the 9.1-9.9 range, if you think a movie is worth more than a 9.
Answered here for the most part, feel free to inquire about the rest if I missed something.
Warp wrote:
Exactly how is that different from the entertainment score? Some games do enjoy a better reception in the entertainment side than others. Is that "favoritism" or somehow unfair? No, it's just the opinion of people.
I don't see the technical rating being any different: It also expresses the opinion of people. Some games just aren't up to good technical ratings, in the exact same way as some games are not very entertaining. I see nothing strange or unfair here.
I'll tell you the difference.
"Entertainment" is a "pure" category, it should not be justified additionally because it's essentially telling how you feel about any given movie overall, no justification required. It's for the large part influenced by the game itself. This is obvious, and the authors can't really do much about it. Well, in fact they kinda do something about it; particularly, choosing games that they like more, or those the audience likes more.
"Technical score" is a speculative category, aimed towards assessing technical quality, and usually (and in some cases highly) influenced by experience with playing or TASing games, general or specific knowledge in various areas, and other things. Naturally, this should gauge the author's performance on the game without putting irrelevant stuff in the equation. Arbitrarily limiting the mark despite the performance quality will place a stigma of disregard on it. Well, actually, if some people do this, it's fine, because they have others with different appraisal criteria to counterbalance it. But predefining the limits universally, aside from being somewhat unfair to the player, will move it closer to the "pure-subjective" entertainment domain. Which, as Baxter says, strengthens his point. In my opinion.