Creating a tiny clone of contexto.me: contextito

TLDR:

I recently played the online game Contexto. I was intrigued by how it works, so I built a tiny clone that can be found through here.

Creating embeddings

To build the clone, I used an AI model that generates embeddings and stored the results in a JSON file. I could then load that file in JavaScript and compute the cosine similarity between a randomly chosen word and the rest of the vocabulary.

I grabbed a random list of words from here.

The code to generate the embeddings is really small.

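import json

from sentence_transformers import SentenceTransformer

# Small general-purpose sentence embedding model
model = SentenceTransformer('all-MiniLM-L6-v2')

# Read one word per line, skipping blank lines (e.g. after a trailing newline)
with open('./vocabulary.txt', 'r') as file:
    words = [line.strip() for line in file if line.strip()]

# One embedding vector per word
embeddings = model.encode(words)

print(len(words), embeddings.shape)

# Pair each word with its vector as a plain list so it can be serialized
vocab = list(zip(words, embeddings.tolist()))

with open('./embeddings.json', 'w') as file:
    json.dump(vocab, file, indent=4)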
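
Since the script dumps a list of (word, vector) pairs, the JSON file ends up as an array of two-element arrays. A minimal sketch of reading that shape on the JavaScript side could look like this; `parseVocabulary` is a hypothetical helper name, and the sample string is hand-written, with two values per vector instead of the 384 that all-MiniLM-L6-v2 produces.

```javascript
// Turn the JSON text written by the Python script into a Map of word -> vector.
// Assumes the text holds an array of [word, embedding] pairs.
function parseVocabulary(jsonText) {
  const pairs = JSON.parse(jsonText);
  const vocabulary = new Map();
  for (const [word, vector] of pairs) {
    vocabulary.set(word, vector);
  }
  return vocabulary;
}

// Tiny hand-written sample in the same shape as embeddings.json.
const sample = '[["cat", [0.1, 0.2]], ["dog", [0.3, 0.4]]]';
const vocabulary = parseVocabulary(sample);
console.log(vocabulary.size, vocabulary.get("dog"));
```

In Node, the same helper could be fed the real file with something like `parseVocabulary(fs.readFileSync("./embeddings.json", "utf8"))`.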
	    

Server

I decided to write the cosine similarity myself, which takes two helper functions. The first one computes the magnitude of a vector:

function magnitude(x) {
	let sumOfSquares = 0;

	for (let i = 0; i < x.length; i++) {
		sumOfSquares += x[i] * x[i];
	}
	return Math.sqrt(sumOfSquares);
}
	    

And a function for the dot product:

function dotProduct(a, b) {
	let sum = 0;
	for (let i = 0; i < a.length; i++) {
		sum += a[i] * b[i];
	}
	return sum;
}
	    

Now, with the two parts, I can create a cosine similarity function of my own:

function cosineSimilarity(a, b) {
	const dotProd = dotProduct(a, b);

	const magA = magnitude(a);
	const magB = magnitude(b);

	if (magA === 0 || magB === 0) {
		return 0;
	} else {
		return dotProd / (magA * magB);
	}
}
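
With a similarity function in place, ranking the vocabulary against the secret word is mostly a map and a sort. The sketch below is one way to do it, not necessarily how contextito does it: `rankVocabulary` is a hypothetical name, the vocabulary is assumed to be an array of [word, vector] pairs, and the 2D vectors are made up for illustration. The similarity function is restated compactly so the snippet runs on its own.

```javascript
// Compact restatement of cosine similarity so this sketch is self-contained.
function cosineSimilarity(a, b) {
  let dot = 0, sumA = 0, sumB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    sumA += a[i] * a[i];
    sumB += b[i] * b[i];
  }
  const denominator = Math.sqrt(sumA) * Math.sqrt(sumB);
  return denominator === 0 ? 0 : dot / denominator;
}

// Rank every word by similarity to the secret word, most similar first.
// `vocabulary` is assumed to be an array of [word, vector] pairs.
function rankVocabulary(secretWord, vocabulary) {
  const secret = vocabulary.find(([word]) => word === secretWord);
  if (!secret) {
    throw new Error(`"${secretWord}" is not in the vocabulary`);
  }
  return vocabulary
    .map(([word, vector]) => ({ word, score: cosineSimilarity(secret[1], vector) }))
    .sort((x, y) => y.score - x.score)
    .map((entry, index) => ({ ...entry, rank: index + 1 }));
}

// Made-up 2D vectors, for illustration only.
const toyVocabulary = [
  ["car", [0, 1]],
  ["kitten", [0.9, 0.1]],
  ["cat", [1, 0]],
];
const ranking = rankVocabulary("cat", toyVocabulary);
console.log(ranking.map((entry) => entry.word)); // [ 'cat', 'kitten', 'car' ]
```

The secret word always lands at rank 1 because its similarity with itself is the maximum, so a guess with rank 1 means the game is solved.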
	    

Finally, to serve the small web app, I went with a tiny server written in plain Node.js with no dependencies. I chose no dependencies because I might revisit this project at some point, and I do not want to fight dependencies in three years.
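
For reference, a dependency-free server built on Node's built-in `http` module can be quite short. The following is only a sketch of the idea, not the actual contextito server: the `PAGE` string, the `route` and `startServer` helpers, and the single `/` route are all assumptions made for illustration.

```javascript
const http = require("node:http");

// Hypothetical page content; a real project would serve its own HTML file.
const PAGE = "<!doctype html><title>contextito</title><p>Guess the word</p>";

// Decide what to send back for a given method and URL path.
function route(method, path) {
  if (method === "GET" && path === "/") {
    return { status: 200, type: "text/html; charset=utf-8", body: PAGE };
  }
  return { status: 404, type: "text/plain; charset=utf-8", body: "Not found" };
}

// Request handler that relies only on Node's standard library.
function handleRequest(req, res) {
  const { pathname } = new URL(req.url, "http://localhost");
  const { status, type, body } = route(req.method, pathname);
  res.writeHead(status, { "Content-Type": type });
  res.end(body);
}

// Nothing starts listening on load; call startServer(3000) to run it.
function startServer(port) {
  return http.createServer(handleRequest).listen(port);
}
```

Keeping `route` separate from the handler makes the routing logic easy to check without opening a socket.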

Frontend

I stuck to a single HTML file for the frontend, with inline JavaScript that checks whether the current word is valid and re-orders the words every time a new word is entered.
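
The check-and-re-order step can be sketched without any DOM code. This is an illustration under a few assumptions rather than the code from the page: `addGuess` is a hypothetical name, `ranks` is assumed to be a Map from word to rank (rank 1 being the secret word), and the vocabulary is assumed to be lowercase.

```javascript
// Validate a guess and return the guess list re-ordered by rank, best first.
// `ranks` is assumed to be a Map of word -> rank, where rank 1 is the secret word.
function addGuess(guesses, rawWord, ranks) {
  const word = rawWord.trim().toLowerCase();
  if (!ranks.has(word)) {
    return { ok: false, reason: "unknown word", guesses };
  }
  if (guesses.some((guess) => guess.word === word)) {
    return { ok: false, reason: "already guessed", guesses };
  }
  const updated = [...guesses, { word, rank: ranks.get(word) }];
  updated.sort((a, b) => a.rank - b.rank);
  return { ok: true, solved: ranks.get(word) === 1, guesses: updated };
}

// Hypothetical ranks for a secret word "cat".
const ranks = new Map([["cat", 1], ["kitten", 2], ["car", 3]]);
let state = addGuess([], "car", ranks);
state = addGuess(state.guesses, " Kitten ", ranks);
console.log(state.guesses.map((guess) => guess.word)); // [ 'kitten', 'car' ]
```

Returning a new list instead of mutating the old one keeps the rendering step simple: the page can redraw from whatever `guesses` comes back.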

Conclusion

For a weekend project, I really like how everything turned out. I think it is a solid version 0.1 for now, and I may add some small features to make it more comfortable to use.
