TheCreatorOfUnTV
Scratcher
1000+ posts

We need a replacement for ScratchDB.

redspacecat wrote:

TheCreatorOfUnTV wrote:

PaperMarioFan2024 wrote:

TheCreatorOfUnTV wrote:

Please no!
Why? ScratchDB is no longer receiving enough updates for the servers to run smoothly.
Because some illegal/extremely-offensive/otherwise-not-allowed projects will get archived.
Correct me if I'm wrong, but I don't think ScratchDB ever archived project content, just data about them.
What did it archive, then?
o97doge
Scratcher
500+ posts

TheCreatorOfUnTV wrote:

(#81)

redspacecat wrote:

TheCreatorOfUnTV wrote:

PaperMarioFan2024 wrote:

TheCreatorOfUnTV wrote:

Please no!
Why? ScratchDB is no longer receiving enough updates for the servers to run smoothly.
Because some illegal/extremely-offensive/otherwise-not-allowed projects will get archived.
Correct me if I'm wrong, but I don't think ScratchDB ever archived project content, just data about them.
What did it archive, then?
Mostly forum posts.
ajskateboarder
Scratcher
1000+ posts

TheCreatorOfUnTV wrote:

redspacecat wrote:

TheCreatorOfUnTV wrote:

PaperMarioFan2024 wrote:

TheCreatorOfUnTV wrote:

Please no!
Why? ScratchDB is no longer receiving enough updates for the servers to run smoothly.
Because some illegal/extremely-offensive/otherwise-not-allowed projects will get archived.
Correct me if I'm wrong, but I don't think ScratchDB ever archived project content, just data about them.
What did it archive, then?
Information about the projects, like the title, description, etc. And I believe that data was updated now and then (not as quickly as Scratch, obviously) to match moderation.
ajskateboarder
Scratcher
1000+ posts

Bump

But honestly, if we can't make a combined effort on a single database, then why should we even bother anymore?
Redstone1080
Scratcher
1000+ posts

elip100 wrote:

The Snazzle team is working on something similar called Voyager. I have no Idea what the status of it is though.
It's pretty much in development limbo/hell. That's the case with all of Snazzle, and to be honest, I'm thinking about leaving Snazzle development to the rest of the team that's working on it. Though even they haven't been very active lately.

Last edited by Redstone1080 (Aug. 27, 2024 20:05:52)

gilbert_given_189
Scratcher
1000+ posts

ajskateboarder wrote:

Bump

But honestly, if we can't make a combined effort on a single database, then why should we even bother anymore?
When I said I'd like a decentralized scraper, I didn't mean something like this… Dozens of scrapers working independently of each other… at full speed…

I guess for now, we have two options:
  • have all the scrapers agree on a single database server, merge all their databases there, and redirect their queries to that shared database instead of their own; or
  • reduce the rate of the scrapers and make the scrapers interconnect with each other under a set protocol, which is close to my suggestion of a decentralized scraper.

Whether the scrapers would care to collaborate with each other, I don't know.
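
The first option can be sketched roughly as below. This is only a minimal illustration, assuming each scraper can export its posts as (post_id, author, content, scraped_at) rows. The table layout and the open_shared_db / merge_posts names are made up for the example and are not how ScratchDB or any existing scraper actually works.

```python
import sqlite3


def open_shared_db(path=":memory:"):
    """Open (or create) the shared database. The schema is hypothetical."""
    db = sqlite3.connect(path)
    db.execute(
        "CREATE TABLE IF NOT EXISTS posts ("
        " post_id INTEGER PRIMARY KEY,"
        " author TEXT NOT NULL,"
        " content TEXT NOT NULL,"
        " scraped_at INTEGER NOT NULL)"
    )
    return db


def merge_posts(db, rows):
    """Merge one scraper's export into the shared database.

    rows is an iterable of (post_id, author, content, scraped_at).
    A post is written only if it is new or was scraped more recently
    than the stored copy. Returns the number of rows written.
    """
    written = 0
    for post_id, author, content, scraped_at in rows:
        existing = db.execute(
            "SELECT scraped_at FROM posts WHERE post_id = ?", (post_id,)
        ).fetchone()
        if existing is not None and existing[0] >= scraped_at:
            continue  # the stored copy is at least as fresh, keep it
        db.execute(
            "INSERT OR REPLACE INTO posts (post_id, author, content, scraped_at)"
            " VALUES (?, ?, ?, ?)",
            (post_id, author, content, scraped_at),
        )
        written += 1
    db.commit()
    return written
```

Merging the same export twice changes nothing, so scrapers could resend their data without creating duplicates.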

Last edited by gilbert_given_189 (Aug. 27, 2024 22:46:21)

BigNate469
Scratcher
1000+ posts

gilbert_given_189 wrote:

ajskateboarder wrote:

Bump

But honestly, if we can't make a combined effort on a single database, then why should we even bother anymore?
When I said I'd like a decentralized scraper, I didn't mean something like this… Dozens of scrapers working independently of each other… at full speed…

I guess for now, we have two options:
  • have all the scrapers agree on a single database server, merge all their databases there, and redirect their queries to that shared database instead of their own; or
  • reduce the rate of the scrapers and make the scrapers interconnect with each other under a set protocol, which is close to my suggestion of a decentralized scraper.

Whether the scrapers would care to collaborate with each other, I don't know.
^^^

This needs to happen in some way or another.

For a while there I was considering making a peer-to-peer scraper, but the issue was that the only way I could think of to do it was via a browser extension, which we obviously can't advertise on Scratch. It would only be useful if a lot of the more active forumers were using it, so just me using it wouldn't be particularly useful; it would just cover the forum pages I've visited.
davidtheplatform
Scratcher
500+ posts

Why do we need to have decentralized scrapers? One person with a computer that stays on all the time and a reliable internet connection is more than enough to both scrape new posts and host the database*.

This topic isn't going to be relevant for much longer. ScratchDB can search for posts and knows about most posts (it knows about #87). I'd assume this is the main use of ScratchDB. The things it used to do but doesn't now are:
  • getting individual posts
  • ranking users
  • non-forum stuff
  • letting things that aren't Ocular access it
Lefty said power to the normal server would be restored in mid-August at the earliest. It's now late August, so presumably it will be back up soon-ish. I would avoid scraping posts from the forums in an automated way (i.e., making a forum viewer is fine, since it doesn't cause more load than just using the forums normally).

*I have a database with some Scratch posts at mercury.davidtheplatform.eu.org. If you have already scraped posts, I could put them there and make them searchable. (This is NOT an invitation to scrape more posts; I would not recommend running any automated forum scrapers.)
gilbert_given_189
Scratcher
1000+ posts

davidtheplatform wrote:

Why do we need to have decentralized scrapers? One person with a computer that stays on all the time and a reliable internet connection is more than enough to both scrape new posts and host the database*.

(pruned)
It was a silly idea I devised when ScratchDB was offline, inspired by people insisting on making the database/scraper open source (which decentralization could help with immensely). But now that it is back online in some capacity, I'd say we would be better off sticking to tradition by using the closed-source ScratchDB, rather than some new, open-source, decentralized database. (After all, to this day we still haven't figured out when the year of Linux on the desktop will be.)

Still, we have to deal with the scrapers made during ScratchDB's offline period. Either make them work together with ScratchDB, or decommission them. (Preferably decommission them.)

Last edited by gilbert_given_189 (Aug. 28, 2024 12:09:21)

UserFriend-
Scratcher
31 posts

gilbert_given_189 wrote:

have all the scrapers agree on a single database server, merge all their databases there, and redirect their queries to that shared database instead of their own; or
This is what “MeowStore” can do. There is a clear division between what the scraper is and what the database is.
i_eat_coffee
Scratcher
1000+ posts

UserFriend- wrote:

gilbert_given_189 wrote:

have all the scrapers agree on a single database server, merge all their databases there, and redirect their queries to that shared database instead of their own; or
This is what “MeowStore” can do. There is a clear division between what the scraper is and what the database is.
i just genuinely don't understand what this post means but i also just woke up because it is like 7 am so im not thinking straight
but what is meowstore?
and scraping is the process of using bots to get data from multiple webpages to create a database, i.e. using a script to scan the last X topics in the Y forum and individually read every post from the scanned topics in order to add them to a database
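
As a rough sketch of that loop (not any real scraper's code): list_recent_topics and list_posts are hypothetical callables standing in for whatever actually fetches and parses the forum pages, and the pause between topics is an assumption added to keep the load on the site low.

```python
import time


def scrape_forum(list_recent_topics, list_posts, topic_limit,
                 delay_seconds=1.0, sleep=time.sleep):
    """Scan the last `topic_limit` topics and collect every post in them.

    list_recent_topics(n) -> iterable of topic ids (hypothetical fetcher)
    list_posts(topic_id)  -> iterable of dicts with "id", "author", "content"
    Returns a dict keyed by post id, which plays the role of the database.
    """
    database = {}
    for topic_id in list_recent_topics(topic_limit):
        for post in list_posts(topic_id):
            database[post["id"]] = {
                "topic": topic_id,
                "author": post["author"],
                "content": post["content"],
            }
        sleep(delay_seconds)  # pause after each topic to limit request rate
    return database


if __name__ == "__main__":
    # Fake in-memory "forum" so the sketch runs without touching any website.
    fake_forum = {
        10: [{"id": 1, "author": "a", "content": "first"},
             {"id": 2, "author": "b", "content": "second"}],
        11: [{"id": 3, "author": "c", "content": "third"}],
    }
    result = scrape_forum(
        lambda n: list(fake_forum)[:n],
        lambda topic_id: fake_forum[topic_id],
        topic_limit=2,
        delay_seconds=0,
    )
    print(sorted(result))
```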

Last edited by i_eat_coffee (Aug. 29, 2024 05:42:29)

A-MARIO-PLAYER
Scratcher
1000+ posts

gilbert_given_189 wrote:

(#89)
(After all, to this day we still haven't figured out when would be the year of Linux on the desktop)
it will be in 2025, when m**rosoft forces data collection on its users for their dumb ai model (no offense to copilot) and people switch to linux to avoid getting their data collected by that evil company. is collecting data all because of ai even legal?

Last edited by A-MARIO-PLAYER (Aug. 29, 2024 11:30:30)

gilbert_given_189
Scratcher
1000+ posts

A-MARIO-PLAYER wrote:

gilbert_given_189 wrote:

(#89)
(After all, to this day we still haven't figured out when would be the year of Linux on the desktop)
it will be in 2025, when m**rosoft forces data collection on its users for their dumb ai model (no offense to copilot) and people switch to linux to avoid getting their data collected by that evil company. is collecting data all because of ai even legal?
We'll see about that.

Now, could we please stay on-topic?

Last edited by gilbert_given_189 (Sept. 1, 2024 11:32:03)

WindowsAdmin
Scratcher
1000+ posts

A-MARIO-PLAYER wrote:

gilbert_given_189 wrote:

(#89)
(After all, to this day we still haven't figured out when would be the year of Linux on the desktop)
it will be in 2025, when m**rosoft forces data collection on its users for their dumb ai model (no offense to copilot) and people switch to linux to avoid getting their data collected by that evil company. is collecting data all because of ai even legal?
Kid named windows 10 LTSC
BigNate469
Scratcher
1000+ posts

A-MARIO-PLAYER wrote:

gilbert_given_189 wrote:

(#89)
(After all, to this day we still haven't figured out when would be the year of Linux on the desktop)
it will be in 2025, when m**rosoft forces data collection on its users for their dumb ai model (no offense to copilot) and people switch to linux to avoid getting their data collected by that evil company. is collecting data all because of ai even legal?
How have we turned a topic about a forum scraper into Windows vs. Linux?
gilbert_given_189
Scratcher
1000+ posts

Alright, let's get back on track.

ScratchDB is online now (in some capacity). What should we do before closing this topic?

Last edited by gilbert_given_189 (Sept. 1, 2024 12:41:03)

A-MARIO-PLAYER
Scratcher
1000+ posts

gilbert_given_189 wrote:

(#96)
Alright, let's get back on track.

ScratchDB is online now. What should we do before closing this topic?
uhh scratchdb is still down for me
gilbert_given_189
Scratcher
1000+ posts

A-MARIO-PLAYER wrote:

gilbert_given_189 wrote:

(#96)
Alright, let's get back on track.

ScratchDB is online now. What should we do before closing this topic?
uhh scratchdb is still down for me
I'd like to amend my statement: ScratchDB is online now (in some capacity).
medians
Scratcher
1000+ posts

gilbert_given_189 wrote:

Alright, let's get back on track.

ScratchDB is online now (in some capacity). What should we do before closing this topic?
Though only in some capacity. Also, postpercent
DifferentDance8
Scratcher
1000+ posts

gilbert_given_189 wrote:

A-MARIO-PLAYER wrote:

gilbert_given_189 wrote:

(#96)
Alright, let's get back on track.

ScratchDB is online now. What should we do before closing this topic?
uhh scratchdb is still down for me
I'd like to amend my statement: ScratchDB is online now (in some capacity)
“In some capacity”? That's overstating it quite a bit.
Literally the only thing that used ScratchDB and still works is Ocular; ScratchStats is still broken.
