Post subject: Publication Rating Guidelines??
nymx
He/Him
Editor, Judge, Expert player (2672)
Location: South Pole, True Land Down Under
Joined: 11/14/2014
Posts: 991
After a lengthy discussion with Aran;Jaeger (EternisedDragon), I decided to create a thread for something that has confused me for a while. The question is: How do you rate the technical quality of a publication? There are many different points to bring up and ED has commented on a number of good ideas. Since I don't see any guidelines...what do you think the rules of rating should be? As for Entertainment, that pretty much goes as a personal preference, but for technical ability, I would like to rate movies with a common ground of understanding. So far, my understanding is the level of effort it took to produce...but in fact, it could be that I don't see any frames to be found (meaning a 10). So...should guidelines be created? Let's discuss this and see what comes of it.
I recently discovered that if you haven't reached a level of frustration with TASing any game, then you haven't done your due diligence. ---- SOYZA: Are you playing a game? NYMX: I'm not playing a game, I'm TASing. SOYZA: Oh...so its not a game...Its for real? ---- Anybody got a Quantum computer I can borrow for 20 minutes? Nevermind...eien's 64 core machine will do. :) ---- BOTing will be the end of all games. --NYMX
Masterjun
He/Him
Site Developer, Expert player (2124)
🇩🇪 Germany
Joined: 10/12/2010
Posts: 1187
For reference, here is the link to the current guidelines.
Warning: Might glitch to credits I will finish this ACE soon as possible (or will I?)
Site Admin, Skilled player (1247)
Joined: 4/17/2010
Posts: 11766
Warning: When making decisions, I try to collect as much data as possible before actually deciding. I try to abstract away and see the principles behind real world events and people's opinions. I try to generalize them and turn into something clear and reusable. I hate depending on unpredictable and having to make lottery guesses. Any problem can be solved by systems thinking and acting.
Player (27)
Location: Amsterdam
Joined: 8/29/2011
Posts: 1206
Someone with database access should make a scatter plot of entertainment rating against technical rating for all published runs. I strongly suspect that they have a direct linear correlation. The best comparison I can come up with is this: famous dating site OkCupid used to let its users rate other profiles on "looks" and "personality", and it turns out that people overwhelmingly use more-or-less the same rating for both.
Banned User
🇫🇮 Finland
Joined: 3/10/2004
Posts: 7698
I think much of the confusion about "technical rating" stems from that word, "technical". This is only my personal opinion, of course, but I never envisioned that rating to be something objective, something that can be measured with hard numbers and mathematically proven. In that sense the word "technical" gives a very misleading impression. I would say that "technical rating" is as subjective as "entertainment rating". I don't see any problem in it being someone's completely subjective opinion. Absolute objectivity is not required. If you personally feel that the TAS is technically impressive, by whatever qualities you think are relevant, you are fully entitled to your opinion and your rating. Don't be fooled by that, should I say, technical-sounding word "technical". Another common misconception is that "technical quality" refers solely to how frame-perfect the TAS is. In other words, if the run can most probably not be improved anymore, or by any significant amount, then it should get a perfect technical score. I completely disagree with that. I don't think technical rating should measure frame perfection (and not only because it's something that can never be proven exactly). It can be part of it, but only a small part. As the guidelines say, in the same way as not all games lend themselves to a perfect entertainment rating, no matter what the TASer does, likewise not all games lend themselves to a perfect technical rating, no matter what the TASer does. The TAS could be frame-perfect, and still deserve a 3 in technical quality, for the mere reason that the game is so simple and straightforward that it just doesn't allow the TASer to show any technical prowess in its making. While entertainment rating ought to be how entertained you were by watching the run, technical rating ought to be a (somewhat subjective) measure of how much work and effort was put into making it.
And this is not measured (solely) by the number of hours put into it, but by the amount of other kind of work. With some games a lot more work like this is just not possible, while with others entire new technologies have been developed to make a faster run. The recent NES Arkanoid warpless run is a perfect example of a TAS that deserves a very high technical rating. The author developed an entire simulator in order to be able to brute-force an optimal path programmatically. This is the kind of hard work I'm talking about. It's not just the perfection of the end result, but the path there, the technical work that had to be put in order to achieve this. The impressiveness of that work. Of course you are free to interpret "technical rating" as you wish, but this is how I would see it.
Post subject: Technical ratings are a failure, delete them
adelikat
He/Him
Emulator Coder, Site Developer, Site Owner, Expert player (3576)
🇺🇸 United States
Joined: 11/3/2004
Posts: 4759
I just wanted to say that Radiant is very likely correct. Also, Warp is correct. I think what technical was supposed to be isn't generally the way people use it. I did some statistics and I have the 2nd most ratings next to Arc (I'd be curious what you think here, Arc). What I can tell you with my 4930 ratings is that technical is not valuable. I see two things: either they are linear like Radiant suggests, or they are the "consolation prize" for rating something low in entertainment. Either way, they add complexity without value. My opinion is that we should wipe technical from our rating system. What we can do is calculate the average as ((Ent * 2) + Tech) / 3 and then replace entertainment ratings with that value, then drop the technical values. Then the UI is a single value, which I think more people will participate in. Also, I propose that we limit the decimals to .5 on the movie module, and let users go to the main rating page if they want to add nuance to their rating. I'd be prepared to take on this work in the near future, if people agree with this idea.
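For illustration only, the merge proposed above could be sketched roughly like this. The function names are hypothetical and not taken from the site's code, and rounding half-steps upward is an assumption; the post only says decimals would be limited to .5. The parentheses matter: entertainment is counted twice, technical once, and the sum is divided by 3.

```python
import math


def merged_rating(entertainment: float, technical: float) -> float:
    """Weighted average: entertainment counts twice, technical once.

    Hypothetical helper, not the site's actual implementation.
    """
    return (entertainment * 2 + technical) / 3


def round_to_half(value: float) -> float:
    """Round to the nearest 0.5 step, with ties going up (an assumption)."""
    return math.floor(value * 2 + 0.5) / 2


if __name__ == "__main__":
    # A movie rated 6.0 for entertainment and 9.0 for technical quality.
    general = merged_rating(6.0, 9.0)
    print(general, round_to_half(general))
```

Without the outer parentheses the expression would be read as Ent * 2 + (Tech / 3), which can exceed the 10-point scale.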
It's hard to look this good. My TAS projects
Post subject: Re: Technical ratings are a failure, delete them
Noxxa
They/Them
Moderator, Expert player (4239)
🇳🇱 Netherlands
Joined: 8/14/2009
Posts: 4117
adelikat wrote:
I just wanted to say that Radiant is very likely correct. Also, Warp is correct. I think what technical was supposed to be isn't generally the way people use it. I did some statistics and I have the 2nd most ratings next to Arc (I'd be curious what you think here, Arc). What I can tell you with my 4930 ratings is that technical is not valuable. I see two things: either they are linear like Radiant suggests, or they are the "consolation prize" for rating something low in entertainment. Either way, they add complexity without value. My opinion is that we should wipe technical from our rating system. What we can do is calculate the average as ((Ent * 2) + Tech) / 3 and then replace entertainment ratings with that value, then drop the technical values. Then the UI is a single value, which I think more people will participate in. Also, I propose that we limit the decimals to .5 on the movie module, and let users go to the main rating page if they want to add nuance to their rating. I'd be prepared to take on this work in the near future, if people agree with this idea.
I fully agree with this. Merge existing entertainment/tech ratings to a general rating on a 2/3-1/3 ratio (like how general ratings are calculated now), axe technical rating, and treat entertainment ratings as a general rating instead. For the movie module, ideally you can give it a rating with about 0.5-level precision in a single click (like is common for 5-star systems on review sites nowadays), but just reducing the basic option count is a good compromise to start with. It would be great if we really can get some solid work started on this. Having a better and more user-friendly rating system would be a great benefit for our site system.
http://www.youtube.com/Noxxa <dwangoAC> This is a TAS (...). Not suitable for all audiences. May cause undesirable side-effects. May contain emulator abuse. Emulator may be abusive. This product contains glitches known to the state of California to cause egg defects. <Masterjun> I'm just a guy arranging bits in a sequence which could potentially amuse other people looking at these bits <adelikat> In Oregon Trail, I sacrificed my own family to save time. In Star trek, I killed helpless comrades in escape pods to save time. Here, I kill my allies to save time. I think I need help.
Site Admin, Skilled player (1247)
Joined: 4/17/2010
Posts: 11766
I think IMDB's system is the best. https://www.imdb.com/title/tt0111161/ Press the Rate This star to make the 10-star scale appear, you pick one whole star, done! It shows up as n.m, but you don't have to invent arbitrary meanings for fractional points! Just like with actual movies, you lump every factor together and rate the entire thing at once. Also just like with movies you'll know anything above 8 is worth taking a look. I don't know why we should reinvent the wheel here and pretend we know the world better than those 2 million people who feel they understand this movie rating system.
adelikat wrote:
I just wanted to say that Radiant is very likely correct.
I wanted to learn how to examine the DB and actually check if this is how everything is, but never got around to it. Could someone with DB access figure this out?
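A check like this could be sketched as follows, assuming the ratings can be exported as one (entertainment, technical) pair per rating. The export format and the sample numbers below are made up purely for illustration; they are not real site data. The sketch computes the Pearson correlation coefficient, where a value close to 1 would support the "direct linear correlation" suspicion.

```python
from math import sqrt


def pearson(xs: list[float], ys: list[float]) -> float:
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / sqrt(var_x * var_y)


if __name__ == "__main__":
    # Placeholder pairs (entertainment, technical); NOT real ratings.
    sample = [(8.0, 8.5), (3.0, 5.0), (6.5, 7.0), (9.0, 9.5), (4.0, 6.0)]
    ent = [e for e, _ in sample]
    tech = [t for _, t in sample]
    print(f"Pearson r = {pearson(ent, tech):.3f}")
```

The same pairs could be fed to any plotting tool for the scatter plot Radiant asked for.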
Warning: When making decisions, I try to collect as much data as possible before actually deciding. I try to abstract away and see the principles behind real world events and people's opinions. I try to generalize them and turn into something clear and reusable. I hate depending on unpredictable and having to make lottery guesses. Any problem can be solved by systems thinking and acting.
Post subject: Re: Technical ratings are a failure, delete them
Player (27)
Location: Amsterdam
Joined: 8/29/2011
Posts: 1206
adelikat wrote:
I'd be prepared to take on this work in the near future, if people agree with this idea.
I do agree. Great to hear that something may finally change on this!
Emulator Coder
Location: In his lab studying psychology to find new ways to torture TASers and forumers
Joined: 3/9/2004
Posts: 4588
Warp wrote:
The recent NES Arkanoid warpless run is a perfect example of a TAS that deserves a very high technical rating. The author developed an entire simulator in order to be able to brute-force an optimal path programmatically. This is the kind of hard work I'm talking about. It's not just the perfection of the end result, but the path there, the technical work that had to be put in order to achieve this. The impressiveness of that work. Of course you are free to interpret "technical rating" as you wish, but this is how I would see it.
I agree with everything you wrote except this last section here. With this particular TAS it is easy to determine what work was put in, because he showed it. Most often for a TAS we only have the results. An author can claim they put in all kinds of work, but we do not see the work. Also despite the amount of effort they put in, other people may be able to achieve better results with less effort. Therefore when rating, I don't care about what effort was put in, I care about the results seen. I look for qualifications within the movie itself that show technicalities, such as complicated routing (which is not present in many run right games), get exactly enough powerups to complete the level as fast as possible, nothing unnecessary (which is not present in games without powerups), and so on. Some of the best games have several things to consider for how technical things need to get for best speed, others are simpler and don't. At the end of the day, for a game which shows off precision in many different ways, I'll give a high technical rating. For say a card game where of the 300,000 possibilities you brute forced the most optimal one, I'll give a low technical rating because you only showed off one area of precision and optimality.
Warning: Opinions expressed by Nach or others in this post do not necessarily reflect the views, opinions, or position of Nach himself on the matter(s) being discussed therein.
Memory
She/Her
Site Admin, Skilled player (1609)
Location: Dumpster
Joined: 3/20/2014
Posts: 1796
I am absolutely in favor of adelikat's suggestion.
[16:36:31] <Mothrayas> I have to say this argument about robot drug usage is a lot more fun than whatever else we have been doing in the past two+ hours
[16:08:10] <BenLubar> a TAS is just the limit of a segmented speedrun as the segment length approaches zero
EZGames69
He/They
Publisher, Reviewer, Expert player (5032)
Joined: 5/29/2017
Posts: 2791
I also am in favor. Make it as approachable as possible.
[14:15] <feos> WinDOES what DOSn't 12:33:44 PM <Mothrayas> "I got an oof with my game!" Mothrayas Today at 12:22: <Colin> thank you for supporting noble causes such as my feet MemoryTAS Today at 11:55 AM: you wouldn't know beauty if it slapped you in the face with a giant fish [Today at 4:51 PM] Mothrayas: although if you like your own tweets that's the online equivalent of sniffing your own farts and probably tells a lot about you as a person MemoryTAS Today at 7:01 PM: But I exert big staff energy honestly lol Samsara Today at 1:20 PM: wouldn't ACE in a real life TAS just stand for Actually Cease Existing
Post subject: Re: Technical ratings are a failure, delete them
Arc
Editor, Experienced player (894)
Location: Arizona
Joined: 3/8/2004
Posts: 536
adelikat wrote:
(I'd be curious what you think here, Arc).
[4032] NES Super Mario Bros. 3 "game end glitch" by Masterjun & ais523 in 00:00.78 is the most extreme example of low entertainment and high tech. The technical rating allows the rater to express that although this movie is not enjoyable to watch, the rater respects the diligence of the author to produce a movie that demonstrates expert-level knowledge of the game. Although I think that the technical rating has value, I think that it is more important for the site to get more people to interact with the rating mechanism. Thus, it should be (1) as easy as possible to use (single-click star rating, as suggested) and (2) as prominent as possible (similar to what I suggested three years ago with this crude graphic).
Banned User
🇫🇮 Finland
Joined: 3/10/2004
Posts: 7698
Nach wrote:
Therefore when rating, I don't care about what effort was put in, I care about the results seen. I look for qualifications within the movie itself that show technicalities, such as complicated routing (which is not present in many run right games), get exactly enough powerups to complete the level as fast as possible, nothing unnecessary (which is not present in games without powerups), and so on.
I think Arc gave the perfect counter-argument to this:
Arc wrote:
[4032] NES Super Mario Bros. 3 "game end glitch" by Masterjun & ais523 in 00:00.78 is the most extreme example of low entertainment and high tech. The technical rating allows the rater to express that although this movie is not enjoyable to watch, the rater respects the diligence of the author to produce a movie that demonstrates expert-level knowledge of the game
Ok, perhaps "counter-argument" is too strong of a word, but I couldn't think of a better one. I must admit I'm perhaps a bit biased in this entire subject because it was actually me who originally came up with the two rating categories (and contributed to the implementation of the backend rating system). I was actually inspired by the rating system of the (now long-defunct) Internet Raytracing Competition, which had three ratings for people to vote on submissions: Artistic, technical, and "concept, originality, interpretation of the theme". (No need to wonder where I got that word, "technical", from.) (In fact, back in the day IRTC saw quite similar problems with this rating system, the main one being cross-contamination of categories: An impressive-looking image would generally tend to get high ratings on all three categories regardless of whether it actually deserved it or not. Likewise a bad-looking image would often get low scores on all categories even if it was undeserving on some of them, although the cross-contamination problem was not as bad with these images as with the impressive-looking ones.) I'm also the one to fully blame for the 100-value rating system. In retrospect I was perhaps being too eager to give users so much needless nuance in their votes. I would definitely drop the decimal value if I were to go back in time and redo the thing. While I understand perfectly, and I'm completely sympathetic to the idea of dropping the two different voting categories and merging them into one, a much simpler one, and I think that it's technically speaking (hah!) a good idea, I can't help but be bothered a bit by the possibility of distinguishing between the two being removed. As Arc puts it, when we get an absolute extreme case like the 0.78-second SMB3 TAS, how are people supposed to vote on it? Perhaps a bit ironically, with a unified singular voting system people would actually be voting almost solely on the technical merits of the run.
But perhaps that's ok. Perhaps it's ok to interpret the singular rating differently depending on the TAS, rather than try to force a universal set-in-stone meaning to it. This is, after all, very subjective, and that's just fine.
Emulator Coder
Location: In his lab studying psychology to find new ways to torture TASers and forumers
Joined: 3/9/2004
Posts: 4588
Warp wrote:
Nach wrote:
Therefore when rating, I don't care about what effort was put in, I care about the results seen. I look for qualifications within the movie itself that show technicalities, such as complicated routing (which is not present in many run right games), get exactly enough powerups to complete the level as fast as possible, nothing unnecessary (which is not present in games without powerups), and so on.
I think Arc gave the perfect counter-argument to this:
Arc wrote:
[4032] NES Super Mario Bros. 3 "game end glitch" by Masterjun & ais523 in 00:00.78 is the most extreme example of low entertainment and high tech. The technical rating allows the rater to express that although this movie is not enjoyable to watch, the rater respects the diligence of the author to produce a movie that demonstrates expert-level knowledge of the game
Ok, perhaps "counter-argument" is too strong of a word, but I couldn't think of a better one.
I don't see that as a counter-point, because in my opinion, that movie does not get a lot of tech points. Sure it excels superbly at hacking a part of the game and jumping to the end in less than a second. But it doesn't plan a route through the game's levels. It doesn't manage powerups. It doesn't beat any of the bosses with high precision attacks at the earliest frame. It doesn't show off perfect jumping to get 99 lives. Etc... I could perhaps give that run 4 maybe 5 points for the one area it excelled in. But I have no further points to award to it, because it didn't do anything else. Therefore, under the system that I currently use, that run would get a low technical rating.
Warning: Opinions expressed by Nach or others in this post do not necessarily reflect the views, opinions, or position of Nach himself on the matter(s) being discussed therein.