18,410 Common, Non-Profane English Words

by maxers
Instructions

Hi! Here's a list of 18,410 non-profane English words, deemed most common by how often they occur in Google searches, which is why there are words like "ferguson" and "ebay" on the list. It's not perfect, but I think it's pretty good, although some of the words at the bottom get a bit obscure.
Feel free to use this for a safe chat engine (with credit, please). Unfortunately, all the words (including proper nouns) are in lowercase.
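
For anyone building a safe chat engine on top of the list, here is a minimal Python sketch of one way a whitelist check could work. It is not part of this project: the function name is_message_allowed and the tiny sample set are made up for illustration, and it assumes that only runs of letters count as words. Since the list is all lowercase, the message is lowercased before comparing.

```python
import re


def is_message_allowed(message, allowed_words):
    """Return True only if every word in the message is on the allowed list."""
    # Assumption: a "word" is any run of letters a-z; digits and punctuation split words.
    # The word list is all lowercase, so lowercase the message first.
    tokens = re.findall(r"[a-z]+", message.lower())
    return all(token in allowed_words for token in tokens)


# Tiny made-up sample standing in for the full 18,410-word list.
allowed = {"hello", "ebay", "ferguson"}
print(is_message_allowed("Hello, eBay!", allowed))
print(is_message_allowed("hello stranger", allowed))
```

Storing the list as a set keeps each lookup fast even with thousands of words. Note that a message with no letters at all passes this check, so a real chat engine would probably want to handle that case separately.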

While some of the words have offensive double meanings, and some of the words refer to body parts, I think I did a good job of removing words that are commonly found offensive. If you find a word that you think I should remove from the list, please comment and tell me.

I found a list of the 20,000 most Google-searched words on GitHub. I took the list, but found that it had many duplicate words in it. I used a Scratch script, running in turbo mode, to remove all the duplicates (click "See Inside" to see it; the whole processing took about 20 to 25 minutes). I then used another list of bad words and common blacklist workarounds to pull out the bad words that remained in the duplicate-free list. Since the list of bad words included words that I did not consider offensive, like "beer" and "gay," I judged each flagged word myself and manually removed only the ones I considered offensive.
This full process took about 1-2 hours. I hope you appreciate it!
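
The same two steps can be sketched in a few lines of Python. This is only an illustration of the idea, not the Scratch script inside the project: the function names dedupe_keep_order and split_by_blocklist and the sample words are made up, and the blocklist step only flags words, since the final decision was made by hand.

```python
def dedupe_keep_order(words):
    """Drop repeated words while keeping the first occurrence of each."""
    seen = set()
    unique = []
    for word in words:
        if word not in seen:
            seen.add(word)
            unique.append(word)
    return unique


def split_by_blocklist(words, blocklist):
    """Split words into (clean, flagged). Flagged words still need manual review."""
    blocked = set(blocklist)
    clean = [w for w in words if w not in blocked]
    flagged = [w for w in words if w in blocked]
    return clean, flagged


# Tiny made-up sample, not the real word list or the real blocklist.
sample = ["the", "of", "ebay", "the", "beer", "of", "ferguson"]
unique = dedupe_keep_order(sample)
clean, flagged = split_by_blocklist(unique, ["beer", "someblockedword"])
print(unique)
print(clean)
print(flagged)
```

Here "beer" ends up in the flagged list rather than being deleted outright, which leaves room for putting it back after a manual look.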

Notes and Credits

Word list from the user "first20hours" on GitHub:
https://github.com/first20hours/google-10000-english
List of profane/offensive words to remove, from Front Gate Media (WARNING! VERY PROFANE AND OFFENSIVE!!!):
http://www.frontgatemedia.com/a-list-of-723-bad-words-to-blacklist-and-how-to-use-facebooks-moderation-tool/

Shared: 8 Jun 2015 Modified: 7 Dec 2016