kccuber
Scratcher
1000+ posts

Automatic project copy detector

Uhh, did anyone suggest storing the original project creator's username in the project as an MD5 hash (or some other text hash)?

for example airplanedodge (one of my alts): 5e1c97942245f46423a0141204afa996
Or double-hash it, with MD5 as the first layer and SHA-256 as the second.

for example the same “airplanedodge” user would be 5e1c97942245f46423a0141204afa996 (with md5) and 26d9e56eed0000d843136f66e57fa38d115e270783e03a5b33f7b237bedce7d0 (with sha-256 over the md5)
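
A minimal sketch of the double-hash idea above, in Python. The post doesn't say whether the SHA-256 layer runs over the MD5 hex string or over the raw MD5 bytes, so this assumes the hex string encoded as ASCII, and assumes the username is encoded as UTF-8. With different choices the output would not match the values quoted above. The function names are made up for illustration.

```python
import hashlib


def md5_hex(text: str) -> str:
    """First layer: MD5 of the username, as a 32-character hex string."""
    return hashlib.md5(text.encode("utf-8")).hexdigest()


def double_hash(username: str) -> str:
    """Second layer: SHA-256 over the MD5 hex string (an assumption, see above)."""
    inner = md5_hex(username)
    return hashlib.sha256(inner.encode("ascii")).hexdigest()


if __name__ == "__main__":
    name = "airplanedodge"
    print("md5:           ", md5_hex(name))
    print("sha-256 of md5:", double_hash(name))
```

Both layers are deterministic, so the same username always gives the same tag. That is what makes the tag comparable across projects.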

The project also wouldn't say which hash was used.

Alternatively, it could use a totally unknown hash that nobody knows.

Edit: I literally knocked off the original topic lol. This only covers the username, though (it checks which username the stored hash matches).

For remixes, it should add another “username” value with the remixer's username and the original creator's project ID hashed into that value.

Last edited by kccuber (March 23, 2021 14:10:46)

Futurebot5
Scratcher
1000+ posts

Automatic project copy detector

This wouldn't be very helpful, people could just make a sprite with the hide block and bypass it.
46009361
Scratcher
1000+ posts

Automatic project copy detector

Futurebot5 wrote:

This wouldn't be very helpful, people could just make a sprite with the hide block and bypass it.
But then, if people look on the inside, the copy isn't “exact” anymore.
kccuber
Scratcher
1000+ posts

Automatic project copy detector

46009361 wrote:

Futurebot5 wrote:

This wouldn't be very helpful, people could just make a sprite with the hide block and bypass it.
But then, if people look on the inside, the copy isn't “exact” anymore.
That's where my “username hashing” idea comes in. Since the username of the original creator is stuck in the file forever, it would say that you have copied someone's project, as there is no way someone would bother extracting and editing the project.json and then rezipping it back. Chances are, they might not even know the hashing method used! (since i said about double-hashing - sha-256 over md5)
Obviously, let the ST pick the hash used because if they implement it and someone reads this thread…
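
For context, here is a rough sketch of what reading such a tag out of an .sb3 could look like, given that the file is a zip archive with a project.json inside (as mentioned above). The `creatorTag` field and its place under `meta` are purely hypothetical; nothing like it exists in the real format as far as this thread says. `build_sb3` is only a helper to make the example self-contained.

```python
import io
import json
import zipfile
from typing import Optional

TAG_KEY = "creatorTag"  # hypothetical field name, not part of the real sb3 format


def build_sb3(project: dict) -> bytes:
    """Helper for the example: pack a project dict into an in-memory zip."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr("project.json", json.dumps(project))
    return buffer.getvalue()


def read_creator_tag(sb3_bytes: bytes) -> Optional[str]:
    """Return the hypothetical creator tag from project.json, or None if absent."""
    with zipfile.ZipFile(io.BytesIO(sb3_bytes)) as archive:
        project = json.loads(archive.read("project.json"))
    return project.get("meta", {}).get(TAG_KEY)


if __name__ == "__main__":
    tagged = build_sb3({"targets": [], "meta": {TAG_KEY: "5e1c97942245f46423a0141204afa996"}})
    print(read_creator_tag(tagged))
```

Whether a tag like this would actually stay in the file depends on how many people bother to unzip and edit it, which is the assumption the idea rests on.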

Last edited by kccuber (March 23, 2021 16:25:28)

airplanedodge
Scratcher
1000+ posts

Automatic project copy detector

b u m p
46009361
Scratcher
1000+ posts

Automatic project copy detector

airplanedodge wrote:

b u m p
Thank you for bringing up my post, but that wasn't the post I expected after I looked in my messages!
airplanedodge
Scratcher
1000+ posts

Automatic project copy detector

46009361 wrote:

airplanedodge wrote:

b u m p
Thank you for bringing up my post, but that wasn't the post I expected after I looked in my messages!
look a couple posts above, my main user posted an idea here
46009361
Scratcher
1000+ posts

Automatic project copy detector

Bump.
airplanedodge
Scratcher
1000+ posts

Automatic project copy detector

kccuber wrote:

46009361 wrote:

Futurebot5 wrote:

This wouldn't be very helpful, people could just make a sprite with the hide block and bypass it.
But then, if people look on the inside, the copy isn't “exact” anymore.
That's where my “username hashing” idea comes in. Since the username of the original creator is stuck in the file forever, it would say that you have copied someone's project, as there is no way someone would bother extracting and editing the project.json and then rezipping it back. Chances are, they might not even know the hashing method used! (since i said about double-hashing - sha-256 over md5)
Obviously, let the ST pick the hash used because if they implement it and someone reads this thread…
stop ignoring me (this is my main user)
46009361
Scratcher
1000+ posts

Automatic project copy detector

airplanedodge wrote:

kccuber wrote:

46009361 wrote:

Futurebot5 wrote:

This wouldn't be very helpful, people could just make a sprite with the hide block and bypass it.
But then, if people look on the inside, the copy isn't “exact” anymore.
That's where my “username hashing” idea comes in. Since the username of the original creator is stuck in the file forever, it would say that you have copied someone's project, as there is no way someone would bother extracting and editing the project.json and then rezipping it back. Chances are, they might not even know the hashing method used! (since i said about double-hashing - sha-256 over md5)
Obviously, let the ST pick the hash used because if they implement it and someone reads this thread…
stop ignoring me (this is my main user)
I wasn't trying to ignore you. I still think the detector wouldn't focus on the username (I got my username changed at some point) but rather on the user ID at api.scratch.mit.edu.

How did I get topic ID 373222 when I made this suggestion, anyway?

Last edited by 46009361 (April 3, 2021 03:34:55)

airplanedodge
Scratcher
1000+ posts

Automatic project copy detector

46009361 wrote:

airplanedodge wrote:

kccuber wrote:

46009361 wrote:

Futurebot5 wrote:

This wouldn't be very helpful, people could just make a sprite with the hide block and bypass it.
But then, if people look on the inside, the copy isn't “exact” anymore.
That's where my “username hashing” idea comes in. Since the username of the original creator is stuck in the file forever, it would say that you have copied someone's project, as there is no way someone would bother extracting and editing the project.json and then rezipping it back. Chances are, they might not even know the hashing method used! (since i said about double-hashing - sha-256 over md5)
Obviously, let the ST pick the hash used because if they implement it and someone reads this thread…
stop ignoring me (this is my main user)
I wasn't trying to ignore you. I still think the detector wouldn't focus on the username (I got my username changed at some point) but rather on the user ID at api.scratch.mit.edu.

How did I get topic ID 373222 when I made this suggestion, anyway?
OK then, maybe a hash over the user ID?
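
A small sketch of why the user ID would work better than the username here: a hash of the username changes when the account is renamed, while a hash of the numeric ID does not. The ID and usernames below are made-up example values, and SHA-256 is just an assumed choice of hash.

```python
import hashlib


def owner_tag(value) -> str:
    """Hash any identifier (username or numeric ID) to a SHA-256 hex string."""
    return hashlib.sha256(str(value).encode("utf-8")).hexdigest()


if __name__ == "__main__":
    user_id = 12345                      # made-up ID; stays the same after a rename
    old_name, new_name = "old_username", "new_username"

    # Tags based on the username no longer match after the rename...
    print(owner_tag(old_name) == owner_tag(new_name))   # False
    # ...while tags based on the ID are unaffected.
    print(owner_tag(user_id) == owner_tag(user_id))     # True
```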
awesome_guy6856
Scratcher
100+ posts

Automatic project copy detector

Copied projects are not a big issue. To loop over millions of projects to catch copycats? That’s a ridiculous sacrifice for a tiny payoff.
GunesKing
Scratcher
100+ posts

Automatic project copy detector

Automatic project copy detector? Well, there's one thing you need to understand that works against your suggestion:
What if it is a collab?
Some Scratchers remix each other's projects as part of a collab, and sometimes the remix is identical to the original. It's copied so that each collaborator gets some credit for their code and art.
Also, reporting can help. The ST will figure it out in time.
So……
That might go too far.
46009361
Scratcher
1000+ posts

Automatic project copy detector

GunesKing wrote:

Automatic project copy detector? Well, there's one thing you need to understand that works against your suggestion:
What if it is a collab?
Some Scratchers remix each other's projects as part of a collab, and sometimes the remix is identical to the original. It's copied so that each collaborator gets some credit for their code and art.
Also, reporting can help. The ST will figure it out in time.
So……
That might go too far.
Good point; I wasn't thinking of that. Yes, because the remix is downloaded and then uploaded into a new project by the original user who started the collaboration. Even if the detector caught it, they could just ask the original owner to approve or reject it before it is shared on the other profile (with a different link and project ID).
kccuber
Scratcher
1000+ posts

Automatic project copy detector

awesome_guy6856 wrote:

Copied projects are not a big issue. To loop over millions of projects to catch copycats? That’s a ridiculous sacrifice for a tiny payoff.
Loop over the project that is uploaded to Scratch from an sb3, not EVERY SINGLE FILE!
Queer_Royalty
Scratcher
1000+ posts

Automatic project copy detector

This would mean that it takes like 60 seconds or more to share a project, and the report button can just be used.
kccuber
Scratcher
1000+ posts

Automatic project copy detector

46009361 wrote:

GunesKing wrote:

Automatic project copy detector? Well, there's one thing you need to understand that works against your suggestion:
What if it is a collab?
Some Scratchers remix each other's projects as part of a collab, and sometimes the remix is identical to the original. It's copied so that each collaborator gets some credit for their code and art.
Also, reporting can help. The ST will figure it out in time.
So……
That might go too far.
Good point; I wasn't thinking of that. Yes, because the remix is downloaded and then uploaded into a new project by the original user who started the collaboration. Even if the detector caught it, they could just ask the original owner to approve or reject it before it is shared on the other profile (with a different link and project ID).
My idea specifically says it will turn the project into a remix.
redddddddddman
Scratcher
100+ posts

Automatic project copy detector

46009361 wrote:

(To clarify, this isn't exactly a “duplicate.”)
We want to be able to check if a project is an exact copy of the project before the “share” button executes its normal command. They may take up load on the servers, but that's okay. Only the code of the project matters; not the title, instructions, Notes and Credits, comments, thumbnail, remixes, studios, and data on cloud variables. The code must be in the same place, you cannot have it detect with cleaned up blocks vs. not cleaned up blocks.
Here's how I want to check for copies automatically on Scratch's servers (for new shared projects):
First, it checks if the project exactly matches the default project that you start with. If so, it shares it right away.
Otherwise, it checks if it's a remix. If so, it goes through each project in the remix box (the thanks box) and compares the MD5 checksum of that project's SB3 file to the one you want to share. If any match, it presents a warning to the user customized for that project.
Lastly, if no warning was presented or it's an original project, it checks every shared project, from the first one to the last. If it finds an exact copy that was published earlier by a different user or email address, it presents the warning.
Projects already shared before the release should be taken down by the same method. The project that was shared first, counting from the first time it was shared (if unshared and reshared), according to each person's What's Happening? section, should stay up, as its creator owns the rights to it. This applies to reshared projects too. This is to prevent someone from sharing a project, waiting until one of their real-life enemies downloads the SB3 file or remixes it, and then unsharing it before the other person (uploads and) shares the copy.
Edit: My 100th post!
What about remix trees?
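
The checking steps in the quoted suggestion could be sketched roughly like this in Python. This is only an illustration under several assumptions: projects are compared by the MD5 of their raw bytes, the "check every shared project" step is swapped for a dictionary lookup keyed by digest (instead of a scan from the first project to the last), and only the owner name is compared, not the email address. All names (`Shared`, `check_before_share`, and so on) are hypothetical.

```python
import hashlib
from dataclasses import dataclass
from typing import Dict, Iterable, Optional


@dataclass(frozen=True)
class Shared:
    """An already shared project (assumed to be the earliest one with this digest)."""
    project_id: int
    owner: str


def md5_of(data: bytes) -> str:
    return hashlib.md5(data).hexdigest()


def check_before_share(
    data: bytes,
    owner: str,
    default_digest: str,
    remix_parent_digests: Iterable[str],
    index: Dict[str, Shared],
) -> Optional[str]:
    """Return a warning message, or None if the project can be shared right away."""
    digest = md5_of(data)

    # Step 1: the untouched default project is shared right away.
    if digest == default_digest:
        return None

    # Step 2: for a remix, compare against the projects it was remixed from.
    if digest in set(remix_parent_digests):
        return "exact copy of a project in the remix box"

    # Step 3: look for an identical project shared earlier by someone else.
    match = index.get(digest)
    if match is not None and match.owner != owner:
        return f"exact copy of project {match.project_id}"
    return None


if __name__ == "__main__":
    default = md5_of(b"default project bytes")
    index = {md5_of(b"cool game"): Shared(project_id=1, owner="alice")}
    print(check_before_share(b"cool game", "bob", default, [], index))
    print(check_before_share(b"something new", "bob", default, [], index))
```

With an index like this, the check for one new project is a single lookup, so the cost depends on how the index is built and stored rather than on looping over every shared project each time.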
lm1996
Scratcher
1000+ posts

Automatic project copy detector

redddddddddman wrote:

What about remix trees?
What about them? And don't quote the OP.
awesome_guy6856
Scratcher
100+ posts

Automatic project copy detector

kccuber wrote:

awesome_guy6856 wrote:

Copied projects are not a big issue. To loop over millions of projects to catch copycats? That’s a ridiculous sacrifice for a tiny payoff.
Loop over the project that is uploaded to Scratch from an sb3, not EVERY SINGLE FILE!

You're still overestimating how fast a computer is. That's still going to make uploading projects to scratch from sb3 extremely slow, taking hours or even days. Copycats aren't even a big problem, regardless. The report button exists for a reason.
