
Scratchtheguy1
Scratcher
500+ posts

Voice Recognition

Support, I would like the block to look like this:

() spoken?:: #076585 boolean

It has a bad word filter built-in


It will also detect special programs that bypass filters.
medians
Scratcher
1000+ posts

Bringing this topic up.
jmdzti_0-0
Scratcher
1000+ posts

Scratchtheguy1 wrote:

Support, I would like the block to look like this:

() spoken?:: #076585 boolean

It has a bad word filter built-in


It will also detect special programs that bypass filters.
I don’t think we need a bad word filter for something the user themselves says.
medians
Scratcher
1000+ posts

jmdzti_0-0 wrote:

Scratchtheguy1 wrote:

Support, I would like the block to look like this:

() spoken?:: #076585 boolean

It has a bad word filter built-in


It will also detect special programs that bypass filters.
I don’t think we need a bad word filter for something the user themselves says.
We could also prevent what the user says from being saved to the cloud, if the concern is bad words ending up there. Scratch can already do this, because they already do it for the Video Sensing extension and for the Face Sensing extension that is planned to be added.
Also, other extensions can already be used for blocking inappropriate words.
BigNate469
Scratcher
1000+ posts

Scratchtheguy1 wrote:

Support, I would like the block to look like this:

() spoken?:: #076585 boolean

It has a bad word filter built-in
First of all, that's a privacy issue if it's constantly listening. You could have a project that just cycles through a list of commonly used words, builds a transcript out of the ones it detects, and sends that transcript to who knows where using cloud variables and an external server. Though, as @medians said, this can be stopped by disabling cloud variables in projects using this block.

Second, it's incredibly inefficient and would either lag the webpage (if the processing is done locally) or likely cost the ST a decent sum of money (if it's done on someone's servers, like AWS), because it would have to be constantly analyzing the input audio.

Scratchtheguy1 wrote:

It will also detect special programs that bypass filters.
How?

I think a better solution would be a pair of blocks:
[start v] recording from microphone :: #076585
[stop v] recording from microphone :: #076585

(transcript of microphone recording :: #076585) //would output any text detected in the last few seconds of audio
Here, the audio is only processed on demand, and only the last few seconds of audio are ever saved or processed. The speech from the recording can't be processed until the recording is stopped (until then, the transcript block returns an empty string). All three blocks could also be made mandatory yield points, to prevent the same issues that the previous block had (you would have to wait at least a frame between reads of the audio).

Additionally, a disclaimer similar to those on projects using the username block and the face sensing blocks would appear before the green flag is clicked, and the browser's dialog about allowing Scratch to use your microphone would appear (the JavaScript APIs that would make this work require it).
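
To make the recording behaviour described above a bit more concrete, here is a rough sketch in Python of the state the three blocks would share. It is only a toy model, not how Scratch would actually implement it: the class name `MicrophoneRecorder`, the `feed` method, the `max_chunks` limit (standing in for "the last few seconds") and the `fake_transcribe` helper are all made up for illustration, and a real version would sit on top of the browser's microphone and speech APIs instead.

```python
from collections import deque
from typing import Callable, Deque, List


class MicrophoneRecorder:
    """Toy model of the start/stop/transcript blocks (all names hypothetical)."""

    def __init__(self, transcribe: Callable[[List[bytes]], str], max_chunks: int = 5) -> None:
        # transcribe is injected, so no speech engine is assumed here.
        self._transcribe = transcribe
        # A bounded deque silently drops the oldest chunk once it is full,
        # so only the most recent audio is ever kept.
        self._chunks: Deque[bytes] = deque(maxlen=max_chunks)
        self._recording = False

    def start(self) -> None:
        """'start recording from microphone': discard old audio and begin."""
        self._chunks.clear()
        self._recording = True

    def feed(self, chunk: bytes) -> None:
        """Called by the (assumed) audio source; ignored unless recording."""
        if self._recording:
            self._chunks.append(chunk)

    def stop(self) -> None:
        """'stop recording from microphone'."""
        self._recording = False

    def transcript(self) -> str:
        """'transcript of microphone recording': processed only on demand."""
        if self._recording or not self._chunks:
            # Nothing is processed while the recording is still running.
            return ""
        return self._transcribe(list(self._chunks))


def fake_transcribe(chunks: List[bytes]) -> str:
    # Stand-in for real speech recognition: treats each chunk as one word.
    return " ".join(chunk.decode() for chunk in chunks)
```

With `max_chunks=3`, feeding four chunks and then stopping leaves only the last three available to the transcript, and asking for the transcript before stopping just gives an empty string.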

Last edited by BigNate469 (Aug. 8, 2025 15:36:17)

MousePotato1234
Scratcher
30 posts

jmdzti_0-0 wrote:

Scratchtheguy1 wrote:

Support, I would like the block to look like this:

() spoken?:: #076585 boolean

It has a bad word filter built-in


It will also detect special programs that bypass filters.
I don’t think we need a bad word filter for something the user themselves says.
Well, what if a user says something that sounds a little like an expletive but is not? I support a bad word filter.
jmdzti_0-0
Scratcher
1000+ posts

BigNate469 wrote:

First of all, that's a privacy issue if it's constantly listening. You could have a project that just cycles through a list of commonly used words, builds a transcript out of the ones it detects, and sends that transcript to who knows where using cloud variables and an external server. Though, as @medians said, this can be stopped by disabling cloud variables in projects using this block.
no, it requires permission from the user, and everything happens in the browser, and then it’s sent. have you read like literally the third post at all?

plus, transcribed speech is not PII at all. at most it could be used to make chatrooms, which are already not allowed.

Last edited by jmdzti_0-0 (Aug. 8, 2025 20:49:25)
