I did a bad thing but it came good in the end. Let me explain.
I updated the transformers Python package and noticed that one of my Streamlit apps, RALTS, returned an error:
`NameError: name 'nn' is not defined`

To cut a very long story short, my version of PyTorch was too old for the latest version of transformers, but I was stuck on 2.2.2 and couldn't get access to a later version. Then I found out about the wild intricacies of x86_64 and AArch64 and spent many hours messing around with package installations.
Needless to say, I did some very unadvisable things and now I can’t use transformers anymore. Boo. That meant RALTS couldn’t work in its current form since it relied on a package called PolyFuzz and specifically its use of SentenceTransformers. So I had two options:
- Try again to get transformers working so everything could go back to normal
- Remove PolyFuzz altogether and just not have the functionality to compare found topics and my existing blog tags.
Or there was a mystery third option!

You see, PolyFuzz also lets you use a custom model for fuzzy matching and like anyone obsessed with embeddings, I had Ollama installed and access to embeddings models. I already had a quantized version of nomic-embed-text installed but had a look for any other models and found all-minilm which was by… SBERT aka SentenceTransformers! What’s more, it was half the size of the all-MiniLM-L6-v2 model which I’d been using until now.
Unfortunately, I was stuck on how to get it all to work as I'd relied on PolyFuzz's existing code, and now I had to work out how to generate the matches myself before I could make the dataframe to show them. This would be something an LLM could likely do in a matter of seconds with minimal-to-no prompt engineering (or whatever people call it these days). But I didn't succumb and took my time, and breaks, until I finally got it. And here's what I came up with:
```python
import numpy as np
import ollama
import pandas as pd
from polyfuzz.models import BaseMatcher


def cosine_similarity(a, b):
    # My own simple cosine similarity (not scikit-learn's, btw!)
    return np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))


class OllamaModel(BaseMatcher):
    def match(self, from_list, to_list, **kwargs):
        embeddings_from = [np.array(embed) for embed in ollama.embed(model='all-minilm:latest', input=from_list)['embeddings']]
        embeddings_to = [np.array(embed) for embed in ollama.embed(model='all-minilm:latest', input=to_list)['embeddings']]
        # Calculate distances
        matches = [[cosine_similarity(from_vector, to_vector) for to_vector in embeddings_to] for from_vector in embeddings_from]
        # Get best matches
        mappings = [to_list[index] for index in np.argmax(matches, axis=1)]
        scores = np.max(matches, axis=1)
        # Prepare dataframe
        matches = pd.DataFrame({'From': from_list,
                                'To': mappings,
                                'Similarity': scores})
        return matches
```

The gist is that I used all-minilm via Ollama to generate embeddings for two lists of strings (one to map from and one to map against). From there, I used a simple cosine similarity function (not scikit-learn's, btw!) to calculate all the distances, then copied PolyFuzz's code from the docs to find the best matches and put them into a Pandas dataframe.
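Stripped of the Ollama call, the matching logic is easy to sanity-check on its own. Here's a minimal sketch with made-up toy vectors standing in for real embeddings (the words and numbers are purely illustrative, not actual model output):

```python
import numpy as np
import pandas as pd

def cosine_similarity(a, b):
    # Simple cosine similarity between two vectors
    return np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))

# Hypothetical inputs: pretend these 2D vectors are embeddings
from_list = ['cat', 'dog']
to_list = ['feline', 'canine']
embeddings_from = [np.array([1.0, 0.1]), np.array([0.1, 1.0])]
embeddings_to = [np.array([0.9, 0.2]), np.array([0.2, 0.9])]

# All pairwise similarities, then pick the best match per 'from' string
matches = [[cosine_similarity(f, t) for t in embeddings_to] for f in embeddings_from]
mappings = [to_list[i] for i in np.argmax(matches, axis=1)]
scores = np.max(matches, axis=1)

df = pd.DataFrame({'From': from_list, 'To': mappings, 'Similarity': scores})
print(df)
```

With these toy vectors, 'cat' maps to 'feline' and 'dog' to 'canine', each with a similarity close to 1, which is exactly the shape of dataframe the `match` method returns.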
After testing, the scores came out great and the matching felt noticeably quicker (although I can't benchmark it properly since I obviously don't have transformers to compare against 😢). It also meant I didn't have to rely on HuggingFace as I could just use local Ollama models.
While none of this was intended when I updated one (1) package, it made me realise that I could figure out a coding problem without needing the help of a chatbot and, sometimes, when life gives you lemons, you can make leaner embeddings out of them.
🍋 🍋 🍋