I recently played the online game called context. I was intrigued by how it worked, so I created a tiny clone that can be found here.
To build my clone, I used an AI model to generate an embedding for every word and stored the results in a JSON file. With that, I could load the file in JavaScript and compute the cosine similarity between a randomly chosen word and the rest of the vocabulary.
I grabbed a random list of words from here.
The code to generate the embeddings is really small.
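    import json

    from sentence_transformers import SentenceTransformer

    model = SentenceTransformer('all-MiniLM-L6-v2')

    # Read one word per line, skipping blank lines (such as a trailing newline).
    with open('./vocabulary.txt', 'r') as file:
        words = [line.strip() for line in file if line.strip()]

    # Encode every word; the result has shape (number of words, embedding size).
    embeddings = model.encode(words)
    print(len(words), embeddings.shape)

    # Store (word, vector) pairs; JSON writes each pair as a two-element array.
    vocab = list(zip(words, embeddings.tolist()))
    with open('./embeddings.json', 'w') as file:
        json.dump(vocab, file, indent=4)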
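On the JavaScript side, the file written by that script is a JSON array of [word, vector] pairs, because zip produces pairs and json.dump serializes each one as an array. Here is a minimal sketch of turning that text into a lookup table. The helper name parseVocabulary and its validation rule are assumptions for illustration, not part of the original code.

```javascript
// Hypothetical helper: converts the text of embeddings.json into a Map
// from word to embedding vector. Name and validation are assumptions.
function parseVocabulary(jsonText) {
  const pairs = JSON.parse(jsonText); // [[word, [n1, n2, ...]], ...]
  const vocabulary = new Map();
  for (const [word, vector] of pairs) {
    // Defensive: skip entries whose word is missing or blank.
    if (typeof word !== "string" || word.trim() === "") continue;
    vocabulary.set(word, vector);
  }
  return vocabulary;
}

// Tiny inline sample; real vectors from all-MiniLM-L6-v2 have 384 numbers.
const sample = '[["cat", [0.1, 0.2]], ["dog", [0.3, 0.4]], ["", [0, 0]]]';
const vocabulary = parseVocabulary(sample);
console.log(vocabulary.size, vocabulary.get("dog"));
```

In the browser the text could come from fetch, and in Node from fs.readFileSync. Either way, the parsing step stays the same.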
I decided to write my own cosine similarity function, which needs two helpers. The first one computes the magnitude of a vector:
    function magnitude(x) {
      let sumOfSquares = 0;
      for (let i = 0; i < x.length; i++) {
        sumOfSquares += x[i] * x[i];
      }
      return Math.sqrt(sumOfSquares);
    }
And a function for the dot product:
    function dotProduct(a, b) {
      // Named `sum` so the local variable does not shadow the function name.
      let sum = 0;
      for (let i = 0; i < a.length; i++) {
        sum += a[i] * b[i];
      }
      return sum;
    }
Now, with the two parts, I can create a cosine similarity function of my own:
    function cosineSimilarity(a, b) {
      const dotProd = dotProduct(a, b);
      const magA = magnitude(a);
      const magB = magnitude(b);
      // A zero vector has no direction, so treat the similarity as 0.
      if (magA === 0 || magB === 0) {
        return 0;
      }
      return dotProd / (magA * magB);
    }
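With a similarity function available, the ranking the game depends on can be sketched as follows. This is only an illustration: rankVocabulary is a hypothetical helper, the vectors are invented two-dimensional toys, and the cosine function is a compact restatement so the block runs on its own.

```javascript
// Compact cosine similarity, repeated here so the sketch is self-contained.
function cosine(a, b) {
  let dot = 0;
  let magA = 0;
  let magB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    magA += a[i] * a[i];
    magB += b[i] * b[i];
  }
  if (magA === 0 || magB === 0) return 0;
  return dot / Math.sqrt(magA * magB);
}

// Hypothetical helper: returns the vocabulary sorted from most to least
// similar to the secret word, so the secret itself comes first.
function rankVocabulary(vocab, secretWord) {
  const secret = vocab.find(([word]) => word === secretWord);
  if (!secret) throw new Error(`Unknown word: ${secretWord}`);
  return vocab
    .map(([word, vector]) => ({ word, score: cosine(secret[1], vector) }))
    .sort((x, y) => y.score - x.score);
}

// Toy 2-dimensional vectors, invented for illustration only.
const toyVocab = [
  ["north", [0, 1]],
  ["east", [1, 0]],
  ["northeast", [1, 1]],
];
const ranking = rankVocabulary(toyVocab, "north");
console.log(ranking.map((entry) => entry.word));
// [ 'north', 'northeast', 'east' ]
```

Sorting once per secret word means every later guess only needs a position lookup instead of a fresh similarity computation.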
Finally, to serve the small web app, I went with a tiny server written in plain Node.js with no dependencies. I avoided dependencies because I might revisit this project at some point, and I do not want to fight outdated packages in three years.
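For readers who have not written one, a dependency-free server can be as small as the sketch below. It is not the server from this project: the file name index.html, the route table, and port 3000 are assumptions, and only modules that ship with Node are used.

```javascript
const http = require("node:http");
const fs = require("node:fs");
const path = require("node:path");

// Assumed layout: the only files this sketch is willing to send.
const ROUTES = new Map([
  ["/", { file: "index.html", type: "text/html; charset=utf-8" }],
  ["/embeddings.json", { file: "embeddings.json", type: "application/json" }],
]);

// Serves the files listed in ROUTES and answers 404 for everything else.
function handleRequest(req, res) {
  const pathname = req.url.split("?")[0]; // ignore any query string
  const entry = ROUTES.get(pathname);
  if (!entry) {
    res.writeHead(404, { "Content-Type": "text/plain; charset=utf-8" });
    res.end("Not found");
    return;
  }
  fs.readFile(path.join(__dirname, entry.file), (err, data) => {
    if (err) {
      res.writeHead(500, { "Content-Type": "text/plain; charset=utf-8" });
      res.end("Could not read file");
      return;
    }
    res.writeHead(200, { "Content-Type": entry.type });
    res.end(data);
  });
}

// Call startServer() to listen; the port number is an arbitrary choice.
function startServer(port = 3000) {
  return http.createServer(handleRequest).listen(port, () => {
    console.log(`Listening on http://localhost:${port}`);
  });
}
```

A fixed route table keeps the handler from ever building file paths out of user input, which is one less thing to get wrong in a server this small.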
I stuck to a single HTML file for the frontend, with the JavaScript inline. It checks whether the current word is valid and re-orders the guessed words every time a new word is entered.
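The check-and-reorder logic described above might look roughly like this. The names createGame and submitGuess are hypothetical, as is the assumption that the vocabulary is all lowercase and already sorted from most to least similar to the secret word.

```javascript
// Hypothetical game state: `ranks` maps each vocabulary word to its position
// (1 = the secret word) and `guesses` holds the words tried so far.
function createGame(rankedWords) {
  const ranks = new Map(rankedWords.map((word, index) => [word, index + 1]));
  return { ranks, guesses: [] };
}

// Validates a guess, records it, and keeps the guesses sorted by rank.
function submitGuess(game, rawInput) {
  // Assumes the vocabulary is stored in lowercase.
  const word = rawInput.trim().toLowerCase();
  if (!game.ranks.has(word)) {
    return { ok: false, reason: "not in vocabulary" };
  }
  if (game.guesses.some((guess) => guess.word === word)) {
    return { ok: false, reason: "already guessed" };
  }
  const rank = game.ranks.get(word);
  game.guesses.push({ word, rank });
  game.guesses.sort((a, b) => a.rank - b.rank); // best guesses first
  return { ok: true, solved: rank === 1 };
}

const game = createGame(["north", "northeast", "east"]);
submitGuess(game, "east");
submitGuess(game, " Northeast ");
console.log(game.guesses.map((guess) => `${guess.rank}. ${guess.word}`));
// [ '2. northeast', '3. east' ]
```

Keeping the state in a plain object makes it easy to re-render the list from game.guesses after every submission.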
For a weekend project, I really liked how everything turned out. For now, I think it is a solid version 0.1,
and maybe I could add some small features to make it more comfortable to play.