Discuss Scratch
- Discussion Forums
- » Advanced Topics
- » Forum Archive
- blob8108
-
Scratcher
1000+ posts
Forum Archive
I made a temporary mirror of the now-broken forum archives:
http://scratchforums.blob8108.net/forums/
All existing links work: given any link to an old forum page, simply replace “archive.scratch.mit.edu” with “scratchforums.blob8108.net”.
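For anyone with a pile of old links to convert, the hostname swap could be sketched in Python roughly like this. The function name `to_mirror_url` is hypothetical and not part of the mirror; only the two hostnames come from the post.

```python
from urllib.parse import urlsplit, urlunsplit

ARCHIVE_HOST = "archive.scratch.mit.edu"
MIRROR_HOST = "scratchforums.blob8108.net"


def to_mirror_url(url: str) -> str:
    """Swap the archive hostname for the mirror hostname; keep path and query."""
    parts = urlsplit(url)
    if parts.hostname != ARCHIVE_HOST:
        return url  # not an archive link, so leave it untouched
    return urlunsplit(parts._replace(netloc=MIRROR_HOST))
```

Links that point anywhere other than the old archive are returned unchanged.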
I don't know how long I'll keep it online. But for now, enjoy!
________________________
PS. If you like, you can make your computer use my mirror instead of the archives. You can do this by adding the following line to your `/etc/hosts` file:
188.226.155.121 archive.scratch.mit.edu
Then when you visit http://archive.scratch.mit.edu/, you'll really be visiting my mirror.
You'll have to remember to undo this when I turn off my server or the ST fix the official archive.
Last edited by blob8108 (May 8, 2014 19:09:33)
- djdolphin
-
Scratcher
1000+ posts
Forum Archive
Gah, you beat me to this. I've been trying to upload 3 gigabytes of files for months. I have a copy of the archive running locally on my computer. Anyway, great job!
Last edited by djdolphin (May 8, 2014 22:16:44)
- blob8108
-
Scratcher
1000+ posts
Forum Archive
“Anyway, great job!”
Thanks!
“I've been trying to upload 3 gigabytes of files for months.”
The hard part is getting things like post redirects to work properly. I'm serving most things as static files, with a bit of Python to do post redirects (so links like viewtopic.php?pid=… still work). The post -> topic mapping is looked up in a database. You really don't want to store each of the 1.4 million posts as a static file. (Since each file takes up at minimum the filesystem's block allocation size – which is usually 1024 bytes.)
The topic pages still take up most of the space (~4 GB). I could probably reduce it further by storing those in a database, too, but that gets messy.
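A minimal sketch of what such a post redirect lookup might look like, assuming an SQLite database. The table name `posts`, its columns, the target URL layout, and the demo row are all made up for illustration; the actual mirror's code and schema are not shown in this thread.

```python
import sqlite3
from typing import Optional


def redirect_target(conn: sqlite3.Connection, pid: int) -> Optional[str]:
    """Return the topic URL a post ID should redirect to, or None if unknown."""
    row = conn.execute(
        "SELECT topic_id, page FROM posts WHERE post_id = ?", (pid,)
    ).fetchone()
    if row is None:
        return None
    topic_id, page = row
    # Assumed URL layout; the real archive's layout may differ.
    return f"viewtopic.php?id={topic_id}&p={page}#p{pid}"


def make_demo_db() -> sqlite3.Connection:
    """Build an in-memory database holding one made-up row."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE posts ("
        "post_id INTEGER PRIMARY KEY, topic_id INTEGER, page INTEGER)"
    )
    conn.execute("INSERT INTO posts VALUES (?, ?, ?)", (42, 7, 3))
    return conn
```

A web handler would call `redirect_target` with the `pid` query parameter and answer with an HTTP redirect, or a 404 when it returns `None`.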
- blob8108
-
Scratcher
1000+ posts
Forum Archive
“Wow, cool! Where are you hosting it?”
On a $5/mo server from http://www.digitalocean.com. They were giving away free $10 codes…
- blob8108
-
Scratcher
1000+ posts
Forum Archive
“Or, you could just use the Wayback Machine…”
Sadly, most of the forums aren't in their archive.
- derpmeup
-
Scratcher
1000+ posts
Forum Archive
“Or, you could just use the Wayback Machine…”
The Wayback Machine doesn't have every topic archived.

- jji7skyline
-
Scratcher
1000+ posts
Forum Archive
Thanks so much for doing this, it's invaluable! I hope this stays up forever!
- blob8108
-
Scratcher
1000+ posts
Forum Archive
“Thanks so much for doing this, it's invaluable! I hope this stays up forever!”
Heh
Hosting is a little expensive for a penniless student… I need to sort out giving the ST a copy that works, but making a proper static one is a little tricky 
- comp09
-
Scratcher
1000+ posts
Forum Archive
“Thanks so much for doing this, it's invaluable! I hope this stays up forever!”
“Heh. Hosting is a little expensive for a penniless student… I need to sort out giving the ST a copy that works, but making a proper static one is a little tricky”
Are you okay with me pointing my wget request machine-gun at your server to static-ize the website?
- djdolphin
-
Scratcher
1000+ posts
Forum Archive
“Thanks so much for doing this, it's invaluable! I hope this stays up forever!”
“Heh. Hosting is a little expensive for a penniless student… I need to sort out giving the ST a copy that works, but making a proper static one is a little tricky”
“Are you okay with me pointing my wget request machine-gun at your server to static-ize the website?”
Isn't it already static?
- comp09
-
Scratcher
1000+ posts
Forum Archive
“Are you okay with me pointing my wget request machine-gun at your server to static-ize the website?”
“Isn't it already static?”
“The hard part is getting things like post redirects to work properly. I'm serving most things as static files, with a bit of Python to do post redirects (so links like viewtopic.php?pid=… still work). The post -> topic mapping is looked up in a database. You really don't want to store each of the 1.4 million posts as a static file. (Since each file takes up at minimum the filesystem's block allocation size – which is usually 1024 bytes.)”
“The topic pages still take up most of the space (~4 GB). I could probably reduce it further by storing those in a database, too, but that gets messy.”
Static-izing the website would make it easier to host on GitHub Pages, though. The post id redirections could be done with a bit of JavaScript hackery.
As a side note, could blob8108 possibly make the database/files available for us? You could 7-zip everything up and upload it to GitHub releases…
Last edited by comp09 (Feb. 27, 2015 21:50:14)
- blob8108
-
Scratcher
1000+ posts
Forum Archive
“The post id redirections could be done with a bit of JavaScript hackery.”
Looked into that. Not feasible.
- nXIII
-
Scratcher
1000+ posts
Forum Archive
“The post id redirections could be done with a bit of JavaScript hackery.”
“Looked into that. Not feasible.”
Couldn't you generate a global post ID → topic ID/page JSON file and then just have viewtopic.php redirect if it gets a post ID?
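A rough sketch of the single-file idea, written in Python for illustration (in the browser the lookup half would be JavaScript). The function names and the `{"post_id": [topic_id, page]}` layout are assumptions, not an existing format.

```python
import json


def build_global_index(rows):
    """Serialize (post_id, topic_id, page) rows into one compact JSON object."""
    # JSON object keys must be strings, so post IDs are stringified.
    index = {str(pid): [topic, page] for pid, topic, page in rows}
    return json.dumps(index, separators=(",", ":"))


def lookup(index_json, pid):
    """Return (topic_id, page) for a post ID, or None if it is not indexed."""
    index = json.loads(index_json)
    entry = index.get(str(pid))
    return tuple(entry) if entry is not None else None
```

Each entry costs roughly 20 bytes in this layout, so an index of 1.4 million posts would likely run to tens of megabytes that every visitor following an old post link has to download.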
Last edited by nXIII (Feb. 28, 2015 01:26:12)
- comp09
-
Scratcher
1000+ posts
Forum Archive
“The post id redirections could be done with a bit of JavaScript hackery.”
“Looked into that. Not feasible.”
You could generate a couple thousand files, each with 1000 topic/page numbers in it. Each file should only be several kilobytes.
For example, to find where the post 1,556,664 is, a script could request the file 1556.txt, which would contain something like the following:
180110,2
162848,1
184942,5
108709,4
111153,7
181590,11
114507,9
117521,1
124810,6
114089,8
...
Sounds feasible to me.
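The sharding scheme above could be sketched like this, in Python for illustration (the browser side would be JavaScript fetching the file). It assumes that line N of `K.txt` describes post `K * 1000 + N` and that missing posts leave a blank line; neither detail is settled in this thread, and the function names are hypothetical.

```python
from pathlib import Path

SHARD_SIZE = 1000  # posts per file, as suggested above


def write_shards(mapping, out_dir):
    """Write one text file per 1000 post IDs.

    mapping: dict of post_id -> (topic_id, page).
    """
    out_dir = Path(out_dir)
    shards = {}
    for pid, (topic, page) in mapping.items():
        shards.setdefault(pid // SHARD_SIZE, {})[pid % SHARD_SIZE] = f"{topic},{page}"
    for shard, lines in shards.items():
        # Assumed layout: line N holds post (shard * 1000 + N); gaps stay blank.
        body = "\n".join(lines.get(i, "") for i in range(SHARD_SIZE))
        (out_dir / f"{shard}.txt").write_text(body + "\n")


def lookup(pid, out_dir):
    """Return (topic_id, page) for a post ID, or None if it is not indexed."""
    path = Path(out_dir) / f"{pid // SHARD_SIZE}.txt"
    if not path.exists():
        return None
    line = path.read_text().split("\n")[pid % SHARD_SIZE]
    if not line:
        return None
    topic, page = line.split(",")
    return int(topic), int(page)
```

With this layout, post 1,556,664 lives on line 664 of 1556.txt, so a client only ever downloads one small file per redirect.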
Last edited by comp09 (Feb. 28, 2015 02:26:59)