If TASVideos claims that a TAS beats the game when it really doesn't, then that's no better than endorsing cheating. Eliminating that is the most important benefit in my opinion.
Re-judging should in general be very fast too, so I'm not seeing any issue here. Only the one aspect of the movie that was brought up by the claimant would need to be re-evaluated, not the entire movie again. If the claimant is required to bring the necessary evidence to support his claim, verification should be straightforward. The only case I can think of where a claim would take more than 10 minutes to process is one where a rule change must be considered, in which case it would be beneficial to everyone to have the matter settled as soon as possible for future TASes anyway.
It's also possible to mitigate a flood of reports in ways that are not possible for regular TAS submissions. For example, Twin Galaxies requires a specific level of reputation in order to open a dispute. Other possibilities include limiting the number of concurrent open disputes to one per user as well as one per movie, and banning users who make too many false or incomplete claims.
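To make these mitigation rules a bit more concrete, here is a minimal sketch of how such a gate could work. Everything in it is an assumption for illustration: the `DisputeGate` class, the `MIN_REPUTATION` and `MAX_REJECTED_CLAIMS` thresholds, and the method names are hypothetical and do not describe any existing TASVideos or Twin Galaxies system.

```python
from dataclasses import dataclass, field
from typing import Dict, Tuple

# Hypothetical thresholds, picked only for illustration.
MIN_REPUTATION = 100
MAX_REJECTED_CLAIMS = 3


@dataclass
class DisputeGate:
    """Decides whether a user may open a dispute against a movie."""
    open_by_user: Dict[str, int] = field(default_factory=dict)   # user -> movie id
    open_by_movie: Dict[int, str] = field(default_factory=dict)  # movie id -> user
    rejected: Dict[str, int] = field(default_factory=dict)       # user -> rejected claims

    def can_open(self, user: str, reputation: int, movie_id: int) -> Tuple[bool, str]:
        # Users with too many false or incomplete claims are shut out.
        if self.rejected.get(user, 0) >= MAX_REJECTED_CLAIMS:
            return False, "too many rejected claims"
        # A minimum reputation is required to open a dispute.
        if reputation < MIN_REPUTATION:
            return False, "insufficient reputation"
        # At most one concurrent open dispute per user...
        if user in self.open_by_user:
            return False, "user already has an open dispute"
        # ...and at most one per movie.
        if movie_id in self.open_by_movie:
            return False, "movie already has an open dispute"
        return True, "ok"

    def open_dispute(self, user: str, reputation: int, movie_id: int) -> bool:
        ok, _reason = self.can_open(user, reputation, movie_id)
        if ok:
            self.open_by_user[user] = movie_id
            self.open_by_movie[movie_id] = user
        return ok

    def close_dispute(self, user: str, upheld: bool) -> None:
        # Free both slots; count the claim against the user if it was rejected.
        movie_id = self.open_by_user.pop(user)
        del self.open_by_movie[movie_id]
        if not upheld:
            self.rejected[user] = self.rejected.get(user, 0) + 1
```

The checks are ordered so that the cheapest and most decisive rule (a banned claimant) is applied first; the actual thresholds would of course be a policy decision.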
Submissions: 0, because they've already been rejected.
Publications: 0 initially, then at most 1 per manual claim.
There was nothing to comment about. You described the very problem I want to fix, and mentioned that you don't have an issue with it. That's fine, but that's just your opinion. The fact remains that some of the current publications are deceptive to the general public, and I've been fooled myself several times.
Joined: 4/17/2010
Posts: 11475
Location: Lake Chargoggagoggmanchauggagoggchaubunagungamaugg
First, "beating the game" is fundamentally moot for games without an ending, so it's not like we have a run of Mega Man that beats half of the levels and we say that it beats the game. We made up rules for ending-less games, and those weren't fully applied; there's no official claim involved.
Second, the only other example of unbeaten games is single level movies. Those were allowed by the former rules, and again, no one is claiming such movies beat the game. They were just banned later.
Third, I don't know what relation any of this has with "endorsing cheating". Why "endorsing cheating" exactly, and not "stealing cookies", "fighting aliens", or "cooking pasta"?
This means the same movie we rejudged a week ago can be brought up regarding some other aspect, repeatedly. Which, in turn, means that we'll have to rejudge each movie several times now, in addition to after every rule change. Everyone's dream.
I don't know where you got "10 minutes" from, and I don't understand how rule changes are going to be worked around.
Warning: When making decisions, I try to collect as much data as possible before actually deciding. I try to abstract away and see the principles behind real world events and people's opinions. I try to generalize them and turn into something clear and reusable. I hate depending on unpredictable and having to make lottery guesses. Any problem can be solved by systems thinking and acting.
Incorrect. As I linked in my original post, there was a Final Fantasy VI run that was published even though it ended in a softlock.
Am I that bad with analogies? Alright then, allow me to clarify.
TASVideos publishes a TAS. The information in the publication is false. The error remains forever without any correction. So, if a cheater makes a fake TAS on purpose to get notoriety and it gets published somehow, his cheat will remain on the site forever, at best as an obsoleted movie.
Because of this, it's hard to use TASVideos as a resource for theoretical best times and record progression, because the integrity of its leaderboard is jeopardized. Simple as that.
I already covered how to prevent this in the mitigation techniques of my last reply.
My 10-minute estimate is based on requiring the analysis to be performed by the claimant in advance, and rejecting claims that cannot be directly verified by a judge. It should be as simple as following a series of steps. I believe that's about 5 minutes to read, execute the steps and confirm the claim, and 5 minutes to update the site. Obviously I'm not counting passive steps such as running a TAS up to a certain point. In any case, even if my estimate is wrong, my point is that it should be a quick and straightforward process.
As for rule changes, there's nothing to be worked around. Newly-banned emulators are easy to detect and flagging such movies could be automated, and as for the rest it's very unlikely to affect many movies so it will simply wait until someone finds and reports the problem.
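As a rough illustration of the automated part, here is a minimal sketch that flags movies recorded on a newly-banned emulator version. It assumes each publication record carries an emulator name and version; the `Movie` fields, the `flag_banned` helper and the emulator names in the usage note are hypothetical placeholders, not the site's real data model.

```python
from dataclasses import dataclass
from typing import Iterable, List, Set, Tuple


@dataclass(frozen=True)
class Movie:
    """Hypothetical publication record; field names are assumptions."""
    movie_id: int
    emulator: str
    version: str


def flag_banned(movies: Iterable[Movie], banned: Set[Tuple[str, str]]) -> List[int]:
    """Return the ids of movies whose (emulator, version) pair is banned.

    `banned` is expected to hold lowercase emulator names, because the
    comparison lowercases each movie's emulator name before the lookup.
    """
    return [
        m.movie_id
        for m in movies
        if (m.emulator.lower(), m.version) in banned
    ]
```

For example, calling `flag_banned(movies, {("exampleemu", "0.9")})` on a list of records would return only the ids recorded on that placeholder emulator version, which could then be queued for a warning instead of being re-judged by hand.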
...that's assuming a judge is too incompetent to not know the difference and failing to properly follow the Judging Guidelines by not checking every detail of the submission prior to rendering judgement.
SmashManiac wrote:
As I linked in my original post, there was a Final Fantasy VI run that was published even though it ended in a softlock.
Alright. This is an example of thorough research that is often needed. See, the fact that it wasn't thorough enough means the human factor is likely to cause this kind of error. To overcome it, research needs to be done with more dedication. And this is sometimes just not happening, be it judging, rejudging, or rerejudging. And it takes tons of time.
SmashManiac wrote:
TASVideos publishes a TAS. The information in the publication is false. The error remains forever without any correction. So, if a cheater makes a fake TAS on purpose to get notoriety and it gets published somehow, his cheat will remain on the site forever, at best as an obsoleted movie.
It's clear now, thanks. The critical difference is that mistakes in judging don't work as precedents, so if a cheater does the same deliberately, it will still most likely be found during the process. As I said, it depends on dedication, but the rules still apply.
SmashManiac wrote:
Because of this, it's hard to use TASVideos as a resource for theoretical best times and record progression, because the integrity of its leaderboard is jeopardized. Simple as that.
This part is harder though. It's obviously impossible to always make perfect judgments, and the same can be said about everything we do here. Yes, when the errors are uncovered, it looks like we're getting closer to being 100% reliable, but it doesn't mean such an error won't happen a week later. Like I said, it's the human factor. One needs to realize that there's no absolute reliability here. The question is when such a movie gets obsoleted by a proper version. In my eyes, this is the only real thing we should be discussing, and we should only think in terms of encouraging such revamps.
SmashManiac wrote:
I already covered how to prevent this in the mitigation techniques of my last reply.
You can't prevent it. If the claimant only addresses one aspect, there can be a lot more. We either find and resolve them while we're at it, or another person brings them up later. The former takes tons of time we simply do not have. The latter takes the same time, just spread around. And both claims need to be verified anyway.
SmashManiac wrote:
My 10-minute estimate is based on requiring the analysis to be performed by the claimant in advance, and rejecting claims that cannot be directly verified by a judge. It should be as simple as following a series of steps. I believe that's about 5 minutes to read, execute the steps and confirm the claim, and 5 minutes to update the site. Obviously I'm not counting passive steps such as running a TAS up to a certain point. In any case, even if my estimate is wrong, my point is that it should be a quick and straightforward process.
Verification of all the aspects is the most time-consuming part. There can be tons of aspects, and verifying them may require all sorts of extensive work not everyone can do. There are also controversial cases when we need to talk to other judges, to users, to the author, to hardware people, to emulator authors. And there are cases when policy depends on the decision. There's no way to take shortcuts, otherwise they will catch us later.
...that's assuming a judge is too incompetent to not know the difference and failing to properly follow the Judging Guidelines by not checking every detail of the submission prior to rendering judgement.
Judges are humans. Judging is difficult. Perfect judging is impossible due to the sheer number of factors to take into consideration. Honestly, the fact that there have been so few mistakes over the years just shows what an amazing job the TASVideos staff is doing. This is not said often enough and has to be commended.
feos wrote:
Alright. This is an example of thorough research that is often needed. See, the fact that it wasn't thorough enough means the human factor is likely to cause this kind of error. To overcome it, research needs to be done with more dedication. And this is sometimes just not happening, be it judging, rejudging, or rerejudging. And it takes tons of time.
I might be missing something, but I believe the claimant should always be able to provide all the necessary research in advance, so judges don't have to redo the work and only need to validate it. It would be an immediate rejection otherwise.
feos wrote:
You can't prevent it. If the claimant only addresses one aspect, there can be a lot more. We either find and resolve them while we're at it, or another person brings them up later. The former takes tons of time we simply do not have. The latter takes the same time, just spread around. And both claims need to be verified anyway.
The latter would only take the same time if all aspects of a TAS would eventually be claimed, which I believe is extremely unlikely.
Also, I should note that something similar is already happening during the obsoletion process, where the previous movies have to be re-evaluated when a new TAS is being considered to obsolete another TAS through non-conventional arguments. Obviously the scope is much more limited than my suggestion, but all I'm saying is that it's already happening to some extent.
All that said, I see your point about rejudging multiple aspects of the same movie through multiple claims being tedious. A possible solution might be, once a warning has been published, to ignore other claims for the same movie, as they're unnecessary for the sake of integrity. That would limit the scope of the problem further. As for the rest, I think it's an acceptable risk, but that's solely based on my experience so I can't prove it.
feos wrote:
There are also controversial cases when we need to talk to other judges, to users, to the author, to hardware people, to emulator authors. And there are cases when policy depends on the decision. There's no way to take shortcuts, otherwise they will catch us later.
I have to trust you on that one. I didn't really think that a technical argument would require independent expert validation, but you're probably right. I'm not sure how frequently this would occur, but I'd hope such cases would be reviewed earlier rather than later anyway.
SmashManiac wrote:
I might be missing something, but I believe the claimant should always be able to provide all the necessary research in advance, so judges don't have to redo the work and only need to validate it. It would be an immediate rejection otherwise.
Research presented might contain mistakes, missing aspects, or even deliberate lies. But in general, it's more common that a non-judge makes a mistake a judge wouldn't make. It helps if some info is already provided, but it does nothing critical that would remove the need to do independent research in order to confirm (or expand, or deny) it.
SmashManiac wrote:
The latter would only take the same time if all aspects of a TAS would eventually be claimed, which I believe is extremely unlikely.
Say there's an SMB3 run that uses a game end glitch of some very complicated nature, and on top of that, a few other features. There's a limited number of aspects to that run, but there's no limit to which of them one can miss or ignore. If you miss/ignore 1/2 of them, and then notice 1/4 more, there's still 1/4 unnoticed. You can notice it later, never, or right away, depending on how thorough your research is. There's literally no way to reduce the number of aspects a given run has. So you either uncover and handle them all at once, or one by one, or only a few at first and others later, or never. The overall time it takes to examine them is the same, it's just either concatenated or broken down. I'd argue that doing all at once takes a bit less time because you don't have to recall all the info that's known from the past each time you return to some old case.
SmashManiac wrote:
Also, I should note that something similar is already happening during the obsoletion process, where the previous movies have to be re-evaluated when a new TAS is being considered to obsolete another TAS through non-conventional arguments. Obviously the scope is much more limited than my suggestion, but all I'm saying is that it's already happening to some extent.
It's not about scope, it's about actual new submissions at hand. We examine the movies they are trying to obsolete to determine how valid this obsoletion is. If the previous movie was mistakenly accepted, and the new one tries to use this mistake as a precedent, it will be found out (hopefully).
SmashManiac wrote:
All that said, I see your point about rejudging multiple aspects of the same movie through multiple claims being tedious. A possible solution might be, once a warning has been published, to ignore other claims for the same movie, as they're unnecessary for the sake of integrity. That would limit the scope of the problem further. As for the rest, I think it's an acceptable risk, but that's solely based on my experience so I can't prove it.
The only way it can work without unnecessarily increasing the workload for the judges is consulting about examples in the Ask a judge thread: "If it is true that there are later loops in Joust that have increased difficulty and unique content, then the current movie was mistakenly accepted, is that correct? Because I just found out there are such loops after the movie ends." So the judge would only need to confirm whether the take itself is correct or not, which means, if one has enough evidence, then this can be posted in the recently created thread dedicated to listing such mistaken judgments. This does help with attempting to obsolete such movies. So it's a positive scenario that is also easy to accomplish. Flagging the movies that have been judged wrong, on the other hand, is just not going to work.
I think the idea was to unpublish the few TASes that might still exist that are clearly and blatantly against the rules (such as not completing the game, or being known to use faulty emulation). I have no idea whether any such TASes exist in a published (non-obsolete) state.
Somewhat relevant: for collecting examples for my Wiki: MESHUGGAH/ForbiddenTechniques, I skimmed through the rejected and cancelled judgement notes at lightning speed. That alone probably took a week, and understanding the full background of each submission would take many more man-hours.
TL;DR: it would take a lot of time to recheck all submissions one by one.
PhD in TASing 🎓 speedrun enthusiast ❤🚷🔥 white hat hacker ▓ black box tester ░ censorships and rules...