Post subject: Whole site torrent ball?
Joined: 4/21/2006
Posts: 5
I'd like to help seed torrents that need it, but I don't want to have to download them all one by one. Perhaps this has been discussed before, but has anyone thought of creating a big ZIP file with all the torrents? I certainly have the drive space spare, it's just a matter of getting the files onto my system. My understanding of the btfriend.py script is that it only juggles torrents you already have, just like Azureus would. Maybe that's incorrect?
Former player
Joined: 8/1/2004
Posts: 2687
Location: Seattle, WA
The problem with making one zip for the whole site is that it would have to be altered every time a movie is obsoleted or published.
hi nitrodon streamline: cyn-chine
Player (206)
Joined: 5/29/2004
Posts: 5712
I thought BTFriend went for any file that was needed.
put yourself in my rocketpack if that poochie is one outrageous dude
Joined: 4/21/2006
Posts: 5
Zurreco wrote:
The problem with making one zip for the whole site is that it would have to be altered every time a movie is obsoleted or published.
True, it would have to be updated periodically, but not constantly. Plus, it's not too hard to write a cron job/Scheduled Task to zip files.
Bag of Magic Food wrote:
I thought BTFriend went for any file that was needed.
Even if it is, you still have the problem that volunteer seeders need to get the content before the data is needed by peers. E.g. if someone's trying to download a file, and there's only one person who has the data to seed, a bunch of other people joining that torrent in order to "help" will actually just distract the one seeder, and the end result for the peer will be the same. Right?
Player (206)
Joined: 5/29/2004
Posts: 5712
True... But at least it'll make things easier IN THE FUTURE!
put yourself in my rocketpack if that poochie is one outrageous dude
Former player
Joined: 3/30/2004
Posts: 1354
Location: Heather's imagination
Ideally those peers will be sharing with one another and the seeder will never need to repeat any information no matter how many clients are connected.
someone is out there who will like you. take off your mask so they can find you faster. I support the new Nekketsu Kouha Kunio-kun.
Joined: 4/21/2006
Posts: 5
Boco wrote:
Ideally those peers will be sharing with one another and the seeder will never need to repeat any information no matter how many clients are connected.
Right, but there's no real advantage to having those peers, is there? If we assume that either the seeder or downloader is going at max speed (or, likely, both) then those peers just change the route of the data, not the speed at which it's delivered.
I came up with a way to solve my own question, and thankfully no one on the server side has to do anything. I was able to use wget to download the whole set of torrents, which I can now start downloading/seeding. Here's the command I used:
wget --user-agent="Firefox" --no-clobber --wait 4 --no-directories --recursive --level=1 --accept .torrent http://bisqwit.iki.fi/nesvideos/movies.cgi
The user-agent is there because the site has protections against mirroring which block it if a user-agent of wget is reported. I think those protections are there to prevent rival sites from downloading everything, so I felt OK bypassing them.
The --wait 4 is crucial: it makes each download four seconds apart, instead of one after the next, so I don't hammer the server. This spreads the download out over about 20 minutes. The movies.cgi page is a page listing every torrent on the site.
If any admins don't like what I've done, feel free to delete this post and send a privmsg. Otherwise I'll probably update my torrent set every couple of weeks and seed away. (For Windows users, wget is available through Cygwin. For Mac users, Fink. For Linux, it's usually built in.)
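The spacing that --wait 4 provides can be sketched in Python as well. This is only an illustration, not the command used above: the names `fetch`, `fetch_politely`, and the `HONEST_USER_AGENT` string are made up for this example, and the sketch identifies itself honestly instead of reporting "Firefox".

```python
import time
import urllib.request
from typing import Callable, Dict, Iterable, Optional

# Hypothetical identifier for this sketch; it does not pretend to be a browser.
HONEST_USER_AGENT = "torrent-list-fetcher/0.1 (example script)"


def fetch(url: str, user_agent: str = HONEST_USER_AGENT) -> bytes:
    """Download one URL, sending the given User-Agent header."""
    request = urllib.request.Request(url, headers={"User-Agent": user_agent})
    with urllib.request.urlopen(request) as response:
        return response.read()


def fetch_politely(
    urls: Iterable[str],
    wait_seconds: float = 4.0,
    fetch_one: Optional[Callable[[str], bytes]] = None,
    sleep: Callable[[float], None] = time.sleep,
) -> Dict[str, bytes]:
    """Fetch each URL in turn, pausing between requests (like wget --wait)."""
    if fetch_one is None:
        fetch_one = fetch
    results: Dict[str, bytes] = {}
    for index, url in enumerate(urls):
        if index > 0:
            # Pause before every request except the first one.
            sleep(wait_seconds)
        results[url] = fetch_one(url)
    return results
```

Calling `fetch_politely(list_of_torrent_urls)` would download the files one at a time with a four-second gap. The `fetch_one` and `sleep` parameters exist only so the pacing can be tried out without touching a real server.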
Editor, Active player (297)
Joined: 3/8/2004
Posts: 7469
Location: Arzareth
duozmo wrote:
Right, but there's no real advantage to having those peers, is there? If we assume that either the seeder or downloader is going at max speed (or, likely, both) then those peers just change the route of the data, not the speed at which it's delivered.
Oh yes there is.
Assume there are 5 people downloading. Assume the seed can upload 100 kB/s at maximum. Assume the file is 100 MB. It will therefore take 1000 seconds for the seed to upload 1 copy of the file. If the people only download from the seed, it will take 5000 seconds for all of the 5 people to get a full copy of the file.
Now assume that each of those 5 people can upload at 100 kB/s maximum to other people. Now, it will still take 1000 seconds for the seed to upload 1 full copy of the file. However, whenever the seed sends a piece of the file to one peer, that one peer passes it on to the second peer, and so on, all while the seed continues to upload more and more of the file, distributed, to those people. It will take 1000 seconds for all of the 5 peers to get a full copy of the file.
That's why BitTorrent is so useful. Oh, and this is also explained here btw: http://tasvideos.org/WhyBittorent.html
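The arithmetic in this example can be checked with a short sketch. It uses the standard idealized lower bound for peer-to-peer distribution time (the slower of "the seed uploads one full copy" and "the whole swarm's upload capacity delivers every copy"), and it assumes download speeds are never the bottleneck and that 100 MB means 100,000 kB. The function names are made up for illustration.

```python
from typing import Sequence


def client_server_time(file_kb: float, seed_up_kb_per_s: float, downloaders: int) -> float:
    """Seconds needed when every downloader gets the whole file from the seed."""
    return downloaders * file_kb / seed_up_kb_per_s


def swarm_lower_bound(
    file_kb: float, seed_up_kb_per_s: float, peer_up_kb_per_s: Sequence[float]
) -> float:
    """Idealized minimum seconds when peers also upload to each other.

    Assumes download capacity is unlimited and pieces are shared perfectly.
    """
    downloaders = len(peer_up_kb_per_s)
    # The seed must push out at least one complete copy.
    seed_limit = file_kb / seed_up_kb_per_s
    # All copies together must fit through the swarm's total upload capacity.
    total_upload = seed_up_kb_per_s + sum(peer_up_kb_per_s)
    aggregate_limit = downloaders * file_kb / total_upload
    return max(seed_limit, aggregate_limit)


FILE_KB = 100_000  # 100 MB
SEED_UP = 100      # kB/s

print(client_server_time(FILE_KB, SEED_UP, 5))         # 5000.0
print(swarm_lower_bound(FILE_KB, SEED_UP, [100] * 5))  # 1000.0
```

With these numbers the seed's own upload is the limiting term, which is why the swarm finishes in about the time it takes the seed to send a single copy.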
Post subject: Re: mirroring
Editor, Active player (297)
Joined: 3/8/2004
Posts: 7469
Location: Arzareth
duozmo wrote:
The user-agent is there because the site has protections against mirroring which block it if a user-agent of wget is reported.
False. I am not blocking wget. wget is a very well-behaved, useful download program. However, I am generally forbidding attempts to mirror my site (see below for exception). The reason has nothing to do with "rival sites". My reasons are explained here: http://senseis.xmp.net/?WhyMirroringIsBad
I am going to block your "Firefox" useragent from now on. I appreciate honest user-agents much more than faked ones. If you make it impossible for me to distinguish a faked useragent from an allowed one, I will IP-ban instead.
I repeat:
- I am not blocking wget
- Your attempt to download those torrent files was well-written and OK, except for the faked user-agent, which is not allowed.
[Edit: Oops. It appears I'm indeed blocking wget, for the autogenerated pages. I will have to reconsider this one.]
Joined: 4/21/2006
Posts: 5
Bisqwit wrote:
It will take 1000 seconds for all of the 5 peers to get a full copy of the file.
Ok, I think we agree. In my case, I'm talking about a scenario where there's just one guy who wants the file, and there's only one seeder providing original data. Just like you say, even though there's 4 other people in the swarm, that one guy who wants the data is still going to get it in 1000 seconds. I can see the advantage to having extra ("helper") peers when you have more than one person wanting the file. They would provide redundancy and, save some drastic download/upload speed imbalances, would therefore help the transfer along.
As for wget, all I can say is that when I tried without the custom user-agent string I got error 404 (which is usually file not found) with the comment "Mirroring forbidden." Altering the user-agent string cleared that error. Anyway I just came to volunteer some bandwidth, not cause a fuss. I'll respect your choice not to have mirrors on the seeds and delete my torrent archive.
Edit: I think we posted/edited at the same time. I'll leave this alone for tonight but FYI, I just chose that user-agent ("Firefox") so it would show up in your logs as an obvious sore-thumb and you could filter it out. Any user agent string is fine with me.
Editor, Active player (297)
Joined: 3/8/2004
Posts: 7469
Location: Arzareth
duozmo wrote:
I'll respect your choice not to have mirrors on the seeds and delete my torrent archive.
It is not about forbidding mirrors of the seeds. The point is that downloading dynamically generated HTML pages from the site is an utter waste of resources. You can get the list of torrent files from this resource instead: http://tasvideos.org/data.xml And this one does not block wget.
Editor, Active player (297)
Joined: 3/8/2004
Posts: 7469
Location: Arzareth
duozmo wrote:
In my case, I'm talking about a scenario where there's just one guy who wants the file, and there's only one seeder providing original data. Just like you say, even though there's 4 other people in the swarm, that one guy who wants the data is still going to get it in 1000 seconds.
What are the 4 other people doing then, if they aren't seeding, and they're not wanting the data either?
Banned User
Joined: 12/23/2004
Posts: 1850
Bisqwit wrote:
duozmo wrote:
In my case, I'm talking about a scenario where there's just one guy who wants the file, and there's only one seeder providing original data. Just like you say, even though there's 4 other people in the swarm, that one guy who wants the data is still going to get it in 1000 seconds.
What are the 4 other people doing then, if they aren't seeding, and they're not wanting the data either?
Taking up space.
Perma-banned
Joined: 4/21/2006
Posts: 5
Bisqwit wrote:
What are the 4 other people doing then, if they aren't seeding, and they're not wanting the data either?
If what Bag of Magic Food said is right, that btfriend.py will download any torrent that needs seeding, then it would be those other 4 people who are running btfriend.py.
Since the data.xml file is XML, wget can't operate on it directly. I put together a simple sed command to extract the torrent filenames, but in a test on a few of them I found one of them was obsolete (redirect to OldMovies.html). The XML file does have obsoleted data at the top but I don't really have the time/interest/knowledge to write an XML parser that intelligently chooses which torrents to download. Downloading everything would waste resources, which I would guess you wouldn't want.
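Pulling torrent names out of an XML file does not need a hand-written parser; Python's standard library can do it. The sketch below is schema-agnostic because the actual layout of data.xml is not shown in this thread: it collects any attribute value or element text ending in ".torrent". The `skip` hook, the `obsoleted_by` attribute, and the sample document are all hypothetical, included only to show where an "is this movie obsolete?" rule could plug in.

```python
import xml.etree.ElementTree as ET
from typing import Callable, List


def torrent_names(
    xml_text: str,
    skip: Callable[[ET.Element], bool] = lambda element: False,
) -> List[str]:
    """Return every value ending in '.torrent', in document order.

    Any element for which skip(element) is true is ignored together with
    everything nested inside it.
    """
    found: List[str] = []

    def walk(element: ET.Element) -> None:
        if skip(element):
            return  # prune this whole subtree
        candidates = list(element.attrib.values())
        if element.text:
            candidates.append(element.text)
        for value in candidates:
            value = value.strip()
            if value.endswith(".torrent") and value not in found:
                found.append(value)
        for child in element:
            walk(child)

    walk(ET.fromstring(xml_text))
    return found


# Made-up sample; the real data.xml may be structured quite differently.
SAMPLE = (
    "<movies>"
    '<movie id="1"><file name="current.torrent"/></movie>'
    '<movie id="2" obsoleted_by="1"><file name="old.torrent"/></movie>'
    "</movies>"
)

print(torrent_names(SAMPLE))
print(torrent_names(SAMPLE, skip=lambda e: "obsoleted_by" in e.attrib))
```

The first call lists both files; the second drops the one nested under the element marked as obsoleted. Whatever marks a movie as obsolete in the real file would have to replace the hypothetical `obsoleted_by` check.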
LSK
Joined: 4/17/2006
Posts: 159
We could have packs of videos 1-100, 101-200, etc. That might work well.
SXL
Joined: 2/7/2005
Posts: 571
the problem of removing obsoleted movies from the torrent would remain. the tracker owner might as well provide one torrent for the whole collection, updated by himself.
I never sleep, 'cause sleep is the cousin of death - NAS