
RyRyWinner
Scratcher
9 posts

Optical Character Recognition (OCR) - Help with special issues!

I've been working on an optical character recognition (OCR) program that uses a camera program I coded a little while back. Before you can understand the problems, you need to understand how my solution works.


HOW IT WORKS:
-The program takes a binary image (pure black and white) of a portion of the screen, 480 px wide and 90 px tall.
-The sprite goes to the far left and slowly moves right until it touches black. It then goes to the bottom of the screen and moves up until it touches black again.
-It flood-fills from that black pixel, grabbing every black pixel that touches it (including at the corners) or that sits within a set distance above it (to catch things like the dots on i's and j's), to find where the character starts and ends.
-It resizes that character to exactly 10*16 pixels.
-It compares the resized character against every stored instance of the first character in the dataset, and for every matching pixel it adds 1 to that character's score. It does this for every character in the dataset to find which one matches best.
-Whichever character has the most pixel matches gets added to the output. The program then goes back to find the next character, flood-fills it, and so on until it has done this for every character.
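
For anyone who finds text code easier to follow than a list of steps, here is a rough Python sketch of the pipeline above. It is not the actual Scratch project, just an approximation under some assumptions: the image is a list of rows where 1 means black and 0 means white, the resize is nearest-neighbour, and there is one template per character. The function names (flood_fill_bbox, resize, score, classify) are made up for illustration, and the "look a bit above for dots" rule is left out to keep it short.

```python
from collections import deque

W, H = 10, 16  # target glyph size from the post (10 wide, 16 tall)

def flood_fill_bbox(img, start_x, start_y):
    """Bounding box (left, top, right, bottom) of the black pixels
    connected to the start pixel, counting corner contact."""
    height, width = len(img), len(img[0])
    seen = {(start_x, start_y)}
    queue = deque([(start_x, start_y)])
    left = right = start_x
    top = bottom = start_y
    while queue:
        x, y = queue.popleft()
        left, right = min(left, x), max(right, x)
        top, bottom = min(top, y), max(bottom, y)
        # Visit all 8 neighbours (edges and corners).
        for dx in (-1, 0, 1):
            for dy in (-1, 0, 1):
                nx, ny = x + dx, y + dy
                if (0 <= nx < width and 0 <= ny < height
                        and img[ny][nx] == 1 and (nx, ny) not in seen):
                    seen.add((nx, ny))
                    queue.append((nx, ny))
    return left, top, right, bottom

def resize(img, bbox, out_w=W, out_h=H):
    """Nearest-neighbour resize of the bounding-box region to out_w x out_h."""
    left, top, right, bottom = bbox
    src_w = right - left + 1
    src_h = bottom - top + 1
    return [[img[top + (r * src_h) // out_h][left + (c * src_w) // out_w]
             for c in range(out_w)]
            for r in range(out_h)]

def score(glyph, template):
    """Number of pixel positions where glyph and template agree."""
    return sum(1
               for r in range(len(glyph))
               for c in range(len(glyph[0]))
               if glyph[r][c] == template[r][c])

def classify(glyph, templates):
    """Return the character whose template has the most matching pixels."""
    return max(templates, key=lambda ch: score(glyph, templates[ch]))
```

A call would look like classify(resize(img, flood_fill_bbox(img, x, y)), templates), where templates is a dictionary mapping each character to its 16-row, 10-column grid.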


THE PROBLEM:
-If a letter IS a rectangle, like I, l, |, _, -, or 1, it just resizes to a solid black 10*16 block, so multiple characters end up with the same data.
-If a letter is just like another one but stretched (e.g. lowercase o vs uppercase O), it will see them the exact same way.
-If a letter is the exact same as another letter but just located somewhere else, like , and ‘, they end up looking the exact same when resized.
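
To make the problems concrete, here is a small Python sketch (again, not the Scratch code, and assuming a nearest-neighbour resize). The first part shows a tall bar and a wide bar turning into identical data. The second function, shape_features, is only a hypothetical idea for what extra numbers could be stored next to the pixels: the width, height, and position of the character's bounding box are known before resizing and are thrown away by it. The 90 px line height is taken from the capture size above, and the example bounding boxes are invented.

```python
def resize_nearest(glyph, out_w=10, out_h=16):
    """Nearest-neighbour resize of a 0/1 grid to out_w x out_h."""
    src_h, src_w = len(glyph), len(glyph[0])
    return [[glyph[(r * src_h) // out_h][(c * src_w) // out_w]
             for c in range(out_w)]
            for r in range(out_h)]

# A tall thin bar (like l or |) and a short wide bar (like _ or -).
vertical_bar = [[1] * 2 for _ in range(14)]
horizontal_bar = [[1] * 9 for _ in range(2)]

# Both become a solid block, so their pixel data is identical.
same_pixels = resize_nearest(vertical_bar) == resize_nearest(horizontal_bar)

def shape_features(bbox, line_height=90):
    """Hypothetical extra features taken from the bounding box
    (left, top, right, bottom) before the resize discards them."""
    left, top, right, bottom = bbox
    width = right - left + 1
    height = bottom - top + 1
    return {
        "aspect": width / height,            # separates | from _
        "rel_height": height / line_height,  # separates o from O
        "rel_top": top / line_height,        # separates , from an apostrophe
    }

# Invented boxes: same size, one near the bottom of the line, one near the top.
comma = shape_features((100, 70, 103, 79))
apostrophe = shape_features((100, 20, 103, 29))
```

Here same_pixels comes out True, while comma and apostrophe share the same aspect and rel_height but differ in rel_top.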


Resizing each letter was the best way I found to make them all fit the same dataset, but it ended up only working for telling different shapes apart, not for other deciding factors like size and position. I kind of knew this would be a problem going into it, but I’m really hoping that you guys are smart enough to help me find a solution. I also thought about doing something more like what current software does, where it cuts a letter into shapes and recognizes it based on those, but I literally have no idea where I would start if I did that… so ahm, yeah.





(P.S. If you have any long lists of similar fonts that all neatly show the exact same character set, I could really use them for training data lol)

Last edited by RyRyWinner (Dec. 2, 2025 16:18:49)
