Wordle — What are some good starting words?

Gaurav Neema
2 min readFeb 7, 2022

--

Using Python, nltk

Wordle — https://www.powerlanguage.co.uk/wordle/

You might have probably heard of Wordle, which is a word-guessing game. If you’ve played Wordle before, you’ll know that the most important part of solving the puzzle is starting with a good word. In this article, we will use word analysis to determine the best first word to use when solving Wordle.

Assumptions…

First of all, we won’t be looking into the source code of Wordle. Thus we assume that we don’t know the vocabulary of Wordle.

Requirements

  • Python3
  • pip install nltk
  • pip install english-words

Let’s do it!

  1. Get all common 5-letter words
from english_words import english_words_set
from nltk import ngrams
from itertools import islice
_5_letter_words = set(w for w in english_words_set if len(w) == 5)
print("No. of 5-letter words: ",len(_5_letter_words))

No. of 5-letter words: 3210

2. Create frequency distribution of 2-grams of 3210 5-letter words

freq = {}
for word in _5_letter_words:
_2_grams_word = ngrams(list(word), 2)
for x in _2_grams_word:
_2_gram = ''.join(x).lower()
if _2_gram in freq:
freq[_2_gram] += 1
else:
freq[_2_gram] = 1

3. Sort frequency distribution and get top 12 most occurring 2-grams

freq_sorted = {k: v for k, v in sorted(freq.items(), key=lambda item: item[1], reverse=True)}
freq_sorted_top_12 = {k: freq_sorted[k] for k in list(freq_sorted)[:12]}
print(freq_sorted_top_12)

{‘er’: 195, ‘an’: 177, ‘in’: 170, ‘ar’: 169, ‘al’: 168, ‘ra’: 157, ‘le’: 149, ‘st’: 149, ‘re’: 146, ‘ch’: 140, ‘la’: 136, ‘on’: 134}

4. Find some good words. The words that contain more than 2 top 12 2-grams.

good_words = []for word in _5_letter_words:
word = word.lower()
matching_grams = 0
_2_grams_word = ngrams(list(word), 2)
for x in _2_grams_word:
if("".join(x) in freq_sorted_top_12):
matching_grams+=1
if(matching_grams > 2):
good_words.append(word)
print(good_words)

[‘lares’, ‘allan’, ‘glare’, ‘stare’, ‘blare’, ‘alarm’, ‘ranch’, ‘alert’, ‘saran’, ‘stale’, ‘larch’, ‘clare’, ‘flare’, ‘clara’]

The above list contains some good words to start with.

Link to Google Collab Notebook.

Originally published at https://neemagaurav1996.github.io on February 7, 2022.

--

--

No responses yet