I think these "Super Metroid frame war" comparisons made by Comicalflop are largely invalid, because the last time a new SM submission didn't improve the previous by more than 30 seconds was this.
The last few years, we're mainly talking 1+ minute of improvement with every iteration. That can hardly be called frame wars.
So yeah, let's drop it and stop brewing the shit to take from one argument to another.
Hehe, yeah, that's what I call a great religion. :D
I was actually contemplating Buddhism several years ago, but decided on Hedonism instead, haha. Almost a complete opposite in practice.
Also, a snippet from our PRIVATE TREEHOUSE IRC channel. Some more food for thought.
<mmbossman> heh, you missed my argument with Zurreco yesterday
<moozooh> i'm pretty sure i did.
<mmbossman> I was arguing that 5 is a good average, but only if you take into account all the submissions
<mmbossman> the gruefood would make up the lower half of the rating scale if they were rated
<mmbossman> (and it appears they will soon be)
<mmbossman> it just happens that most of our publications are "above average" submissions
<moozooh> well, that could work, and then 20 point scale would be justified.
<mmbossman> so they tend to get rated above 5
<moozooh> yes, exactly.
<moozooh> it's a problem rooted in site's positioning.
<moozooh> "we don't publish subpar movies"
<mmbossman> yeah
<moozooh> so technically pretty much all of them are above-par.
<moozooh> which cuts the scale in two from the beginning.
<mmbossman> yup
<moozooh> at least the technical.
<moozooh> leading to further inflation.
<moozooh> then there are people who rate only 10-15 movies, most of them with 8-10, contributing to the mess.
<mmbossman> yeah, those ones I can't argue with you against
<moozooh> there are dozens of them, more than the "normal" raters in fact.
<jimsfriend> well yeah, because they only rate movies they watch, and only watch movies they expect to like in advance
<moozooh> heh, yeah. that is understandable, but at the same time it's a huge blow to statistics.
<mmbossman> I've somewhat given up on trying to get great statistics
<moozooh> which makes it hard to tell if unified rating index was a good (or at least well-thought out) idea in the first place, if what most people need is in fact a personal list not influenced by others.
<mmbossman> but hopefully the new voting system will be workable
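To put rough numbers on the inflation described in the log, here is a minimal sketch. Everything in it is made up for illustration: the 21 evenly spread "true" quality values, the "above 5 gets published" cutoff, and the "selective rater only rates 8 and up" cutoff are assumptions, not site data.

```python
from statistics import mean

# Hypothetical "true" quality of 21 submissions on a 0-10 scale: 0.0, 0.5, ..., 10.0.
submissions = [i / 2 for i in range(21)]

# Assumption: only above-par submissions get published ("we don't publish subpar movies").
published = [q for q in submissions if q > 5]

# Assumption: a selective rater only watches, and so only rates, movies they expect to like.
selective_rated = [q for q in published if q >= 8]

avg_all = mean(submissions)            # average over everything, gruefood included
avg_published = mean(published)        # average over publications only
avg_selective = mean(selective_rated)  # average of what a selective rater actually rates

print(avg_all, avg_published, avg_selective)
```

With these assumed cutoffs, 5 really is the average over all submissions, but each filter pushes the observed average further up the scale, which is the effect being argued about.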
Responding to Baxter's post from elsewhere. (Ha! Now try locating that post yourself!)
Of course I will, but the main point is that it'll still instigate people towards making a different placement of movies (it will happen because it comes from human nature, and it will be impossible to counteract it), which totally defeats the purpose of assigning statistically counted indexes.
You know what would be the solution to your system? Replacing ratings with ranks. A list of n movies, where the top one is rank #1, and the bottom is rank #n. Then it all comes together perfectly, and what's more important, doesn't interfere with absolute systems. Because what your system is trying to do is bringing that relative component into what's indexed as absolute by the statistics engine.
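A minimal sketch of what I mean by ranks, assuming a personal list is just a mapping from movie to mark. The function name, the movie names, and the tie-breaking by name are all hypothetical choices for the example.

```python
def to_ranks(ratings):
    """Turn {movie: mark} into {movie: rank}, where rank 1 is the highest mark.

    Ties are broken alphabetically by movie name (an arbitrary choice for this sketch).
    """
    ordered = sorted(ratings, key=lambda movie: (-ratings[movie], movie))
    return {movie: position for position, movie in enumerate(ordered, start=1)}


# Two hypothetical raters: one squeezes everything into 8-10, the other uses the full scale.
inflated = {"run_a": 9.9, "run_b": 9.5, "run_c": 8.0}
full_scale = {"run_a": 8.0, "run_b": 5.0, "run_c": 1.0}

print(to_ranks(inflated))
print(to_ranks(full_scale))
```

Both raters end up with the same rank list even though their marks look nothing alike, which is why a relative list can live next to an absolute rating without interfering with it.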
Now you're contradicting yourself. You say you don't need labels. Why do you not spread the marks evenly so that there's no huge spikes? Because you think 5 is way too low for most of the movies you rate? Maybe that is the problem?
Yes, that's pretty stupid. With nothing other than 0.1 decimals added to the current system, we'll see 2-3 10.0 movies, a dozen 9.9s, another dozen or two 9.8s… With nearly no decimals used below ~8.0, shifting the focus of the scale from 6-10 to 8-10. It's not that hard to predict because it follows almost directly from the currently observable rating inflation. EDIT: Also see Acheron86's post above, which can be used for illustrating this.
To conclude this discussion (I think I've already said everything I wanted to say on the matter), your proposition works well for a personal list (albeit in a different realization), but is worse for statistics than the currently used one.
Answered here for the most part, feel free to inquire about the rest if I missed something.
I'll tell you the difference.
"Entertainment" is a "pure" category, it should not need additional justification because it's essentially telling how you feel about any given movie overall, no justification required. It's for the most part influenced by the game itself. This is obvious, and the authors can't really do much about it. Well, in fact they kinda do something about it; particularly, choosing games that they like more, or those the audience likes more.
"Technical score" is a speculative category, aimed towards assessing technical quality, and usually (and in some cases highly) influenced by experience with playing or TASing games, general or specific knowledge in various areas, and other things. Naturally, this should gauge the author's performance on the game without putting irrelevant stuff in the equation. Arbitrarily limiting the mark despite the performance quality will place a stigma of disregard on it. Well, actually, if some people do this, it's fine, because they have others with different appraisal criteria to counterbalance it. But predefining the limits universally, aside from being somewhat unfair to the player, will move it closer to the "pure-subjective" entertainment domain. Which, as Baxter says, strengthens his point. In my opinion.
I'll tell you a story which may or may not be helpful as an answer to this (and some of the earlier, about rating precision and single criterion) point.
Some time ago, when I was a preteen boy, I was walking with my father's friend, and jumped around. He asked me what height I would be afraid of jumping from. I thought a little, and said 2 meters. "So 199 cm would be fine then?" "No… I guess." "198?.." And so on. The interesting thing about that question was that precision was absolutely unneeded in such a case because we don't evaluate subjective things like that precisely. Seriously, we don't even care about that kind of precision, and feel differently about it depending on the circumstances. And because of that, any normal person would not be able to tell jumping from 198 cm and from 199 cm apart. They could, however, speculate and justify.
Yes, speculation and justification. The subjective, speculative criteria of appraising movies ("how optimal it is", "how rewatchable it is", etc.) will never be precise enough to warrant small differences (there's no "I will rewatch this movie voluntarily twice, and this one five times"), and there will be many movies you will like differently but won't be able to tell which of them you like more. And some of them, I'm sure, will be fundamentally incomparable to each other.
Naturally, you could go with infinite precision like "I like this run at 10.0, and this one at 9.999999999999". You sense something wrong here? I do, because you essentially liked both the same (or absolutely insignificantly different; to the point of having to forcibly justify this difference to yourself using irrelevant criteria), but you want one of them to end up below the other in your list. Which is the main thing I dislike about your system: the added precision is geared towards a certain end result.
This is somewhat similar to the Last.fm phenomenon, where a lot of people (a significant portion at that) would change their listening habits when the system is gathering statistics, to represent their tastes better by giving certain artists/tracks arbitrary positions on the charts. Which is an end-result oriented approach as well.
The reason I always prefer a ~10-20 step scale is that I can definitely tell the steps apart. The labels serve the same purpose: to tell the steps apart and set referential points for those who are unsure how to formulate their own. It's the same as jumping from the heights of 20, 40, 60, etc., cm, where you can physically feel the difference between each step, and eventually, empirically come to the one you will be afraid of jumping from. And if I'm afraid of jumping from 200 cm but not 180, naming an arbitrary number of 187 here won't make anything better, clearer, or more precise. It will tell just about nothing, because it will stay in the grey area.
Hope that was clear.
More like, the faulty robot has a freedom of choice to break itself (by losing balance and falling, for instance) here, and you'll be fixing it.
Feel free to add metaphors.
This problem can be solved by a nonlinear scale. Make 3 the average, 2 "bad", 1 "unassisted player could probably do better", and 0 "unassisted player could definitely do better", or something along those lines. It also touches on the problem of decimals by giving a more precise scale in the part where it's needed.
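Roughly, as a sketch: the labels for 0 to 3 are the ones suggested above, while the 0-10 integer range and the catch-all "above average" label for everything over 3 are assumptions made only to have something runnable.

```python
# Labels for the low end of the scale, as suggested above.
SCALE_LABELS = {
    0: "unassisted player could definitely do better",
    1: "unassisted player could probably do better",
    2: "bad",
    3: "average",
}
SCALE_MAX = 10  # assumption: the scale still tops out at 10
AVERAGE = 3     # the proposed midpoint of the nonlinear scale


def describe(score):
    """Return the label for an integer score; marks above AVERAGE share one placeholder label."""
    if not 0 <= score <= SCALE_MAX:
        raise ValueError(f"score must be between 0 and {SCALE_MAX}")
    return SCALE_LABELS.get(score, "above average")


# Steps available for telling above-average movies apart.
steps_above_average_nonlinear = SCALE_MAX - AVERAGE  # with 3 as the average
steps_above_average_linear = SCALE_MAX - 5           # with 5 as the average

print(describe(1), "|", describe(7))
print(steps_above_average_nonlinear, steps_above_average_linear)
```

With 3 as the average there are 7 whole steps left for the above-average movies instead of 5, so the extra resolution sits where most published movies actually end up.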
Basically, this. Unless you're really well-versed in the history of videogame consoles, you'll have to look up what the 4th generation is (the fact that it encompasses both the Genesis and the 32x makes it even more counterintuitive). Hell, I've been using emulators for almost a decade, and I can't be bothered checking the generations.
Short version: if you want to replace a clear text with something that isn't immediately clear, you might as well not replace it.
Because the votes have been reset due to the addition of a new option.
http://en.wikipedia.org/wiki/Asch_conformity_experiments
Good read there. Not necessarily the same as what happens during submission voting, but can give an insight on social behavior that doesn't seem rational.
Voted for "No poll, but enable rating for submissions like it is for publications". There are ways to make it neat and orderly, and it will make sense, too. If it uses the "entertainment/technical" system like our publication rating, it would also be good if the entertainment rating were locked upon voting, while the tech rating could still be modified (in case a person would like to change their opinion as a result of a discussion in the submission thread). It will be something new and useful.
Second favorite option was "No poll. But add "post <vote_type> post" buttons to where the poll was". It's potentially less useful than the above, especially for judges, but it can formulate a different approach to discussing submissions, hopefully moving from monosyllabic posts to rationalistic discussion and more constructive feedback.
It's not "who" that is of utmost interest, but "why". Note that such a problem is near-impossible to solve if voting is kept anonymous.
Another thing you can do is return the poll but make it possible for every user to view each other's votes. Can someone add it to the option list? I'm not choosing this option, but I guess it should be there, too.
"4th generation" is definitely not an option, for an obvious reason. "Sega Megadrive / Genesis / CD / 32x" is somewhat long, but it's the only viable option I see.
Ahem… what?
I don't really understand how it can be used to argue against people who are just interested. The fact that every person defines the scale for themselves is not only true, it's fundamentally unchangeable. Which is why we shouldn't really argue about it, let alone try to change it in some way. What we should do is accept it and relieve the importance sometimes associated with it. Victory comes at the moment you decide it doesn't really matter.
As good as any other beside entertainment. You can always take this "easy" game and say "this is optimal as hell", and likely be right. If that matters in some way, it can also matter as a part of the resulting score.
Duh?.. Let me tell you more: even if you do attempt TASing it, there's no way to tell you're doing what's best. Maybe you're overlooking as much as the original author. No way to tell.
That means that you (not you in particular) shouldn't pretend to assess the actual quality of optimization, but come to terms with the fact that you're assessing how optimal it looks. Which is neither bad nor wrong; TASes are a form of entertainment, entertainment is subjective, assessment is thus bound to be subjective. Nothing else needed.
That has more to do with people being stupid and not seeing/understanding the difference. Nothing new here, unfortunately.
Either too complex or potentially abusable. Hardcoded values won't be ideal, but they'll give a solid, robust base.
I don't think such accuracy is needed, either. For one, I'd be just fine with a 0-20 system (via .5 decimals), but wouldn't be sad if the current 0-10 one stayed, either.
It won't be ideal at all. For one, I like certain aspects in one kind of TASes, but different ones in another. I value them differently. So far I can express it by giving different tech and entertainment ratings to represent that value. With a single value they'll be equalized, which I would be less comfortable with.
The current system is far from ideal in this case, it may even not be better, but it's not really worse either.
These are only guidelines; you shouldn't worry much about them because people aren't taking them literally either.
But that's so human-like! I can't believe you were expecting everyone to take it as you wanted them to. Though maybe you weren't, but then you shouldn't feel bad about it.
It also looks like a mishmash of subjective and/or shabby criteria that have about the same probability of being precise or misleading. Not to say any other criterion (except "was it bruteforced") isn't in this case, but still. I won't go into elaboration on how faulty they can be, because you can ask these questions to yourself just as well.
Makes very little sense. You're basically suggesting that some games will enjoy the usage of a full scale, but not the other games. And this is such an awesome favoritism that we could start presetting ratings for certain games just because they do or don't lend themselves to something. Let's start with SMB series. (Or wait, let's not, unless you want people to die.)
It has. It gives you an idea of what people think of it. Wake up! The peers are telling you what they think! That's valuable info that you could go and do something with. For example, prove them wrong. :D
Current trend? Heh, as if. :)
Yeah, except coming up with a single value and trying to explain it to yourself is much harder. Try it. Why does this movie deserve 9.2 and not 9.3? What about another pair of 9.2-9.3 movies? Same kind of a difference? Or not?
A single rating is almost as close to being completely arbitrary as possible. Maybe it's a good thing, maybe not. After all, it's what most are striving for, so basically we're giving them a toy to fulfill their "lowly desires". :)
I don't think I've ever agreed with BoltR as much as on this very page before.
On topic of peer pressure and such, people should not be afraid of voicing their opinion. If anything, it's bullies who must be smacked, not the watchers giving feedback.
It's not so much in the context of a rule by this point, as it is in the context of a TAS being KO'd by an unassisted run competing under the same rules, and staying like that for years. This is something that shouldn't happen. Moreover, it's not like the published any% is flawless in respects other than route/major glitching; it has its own share of all kinds of other inaccuracies as well. If you know/remember, it was originally meant to be a test run, which ended up being submitted and published due to overwhelming feedback. In retrospect, I can't say that was a bad decision, but the fact that it's so outdated and still published in the any% category definitely doesn't benefit the site in any way. Popular games need their content refreshed, and it's a good idea for a cutting-edge site like TASVideos to stay up to date with that.
Is it? Consider that no-one would bother downloading it via BitTorrent at this point, given that it's been uploaded to every possible video hosting service, and that's about the only advantage its published status has now for those who are lazy or ignorant enough to not know about/search for faster alternatives. There are all the ways to watch it for those who really want it, but otherwise Bloobiebla's movie has all that + more + better.
Check the amount of pages on SDA's Zelda topics, and the amount of views on any given OoT trick uploaded on YouTube. The crowd following OoT tricking/sequence breaking is huge, and there's a high chance any interested user would check some of the more popular recently uploaded YT videos before watching Guano's run.
The exact reason why it should be obsoleted now is that OoT is a very popular game. The published run aims for speed (as such, it has to beat unassisted records in the comparable categories), yet it is dramatically outperformed by newly discovered knowledge, thus lowering the expectations from the site in general ("meh, I've seen better on YouTube" and so on). So far the only reason it hasn't been obsoleted is that there hasn't been a worthy replacement, but now there is.
I'd say it wouldn't be so important with a game like, say, Jaws, which is rather obscure.
Here.
AKA decided that was wrong, so he removed such notes… twice. Whatever, I've given up on trying to reason with him.
That run is deprecated, it surprises zero OoT experts since by this point it can be beaten twice as fast even in real time, and thus it should be obsoleted ASAP, even if it takes another category to do that. Remind me, were we looking for reasons to obsolete it, or for reasons not to obsolete it?