Mastering TF-IDF: A Gamified Journey!

How computers “read” and make sense of text is a fascinating field. One of the most fundamental techniques for identifying the important keywords in a document, relative to a collection of documents, is TF-IDF (Term Frequency-Inverse Document Frequency). I recently embarked on a gamified learning challenge to demystify TF-IDF by breaking it down into its core components. This […]
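
To make "important relative to a collection of documents" concrete, here is a minimal sketch in plain Python. It assumes one common textbook variant (term frequency as count divided by document length, inverse document frequency as the unsmoothed log of N over document frequency) and naive whitespace tokenization; the full post may use a different weighting, and the function name `tf_idf` and the sample sentences are made up for illustration.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Return one {term: score} dict per document (unsmoothed textbook variant)."""
    # Naive tokenization: lowercase and split on whitespace (an assumption).
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)

    # Document frequency: in how many documents does each term appear?
    df = Counter(term for tokens in tokenized for term in set(tokens))

    scores = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens)
        scores.append({
            # TF = share of this document taken up by the term
            # IDF = log(N / df); a term found in every document scores 0
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return scores


# Hypothetical toy corpus
docs = ["the cat sat", "the dog barked", "the cat purred"]
scores = tf_idf(docs)
```

In this toy corpus "the" appears in every document, so its score is zero, while "sat" (unique to the first document) outranks "cat" (shared by two documents). Library implementations typically add smoothing and normalization, so their numbers will differ from this sketch.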

k-Nearest Neighbors

KNN is the friendly neighborhood algorithm that believes you are defined by the company you keep. It doesn’t try to learn a complex rule; it just looks around and goes with the consensus of its closest, most similar peers. In layman’s terms, Story 1: Imagine you just moved into a new neighborhood, and you […]
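
The "go with the consensus of your closest peers" idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions: Euclidean distance, a plain majority vote, and a hypothetical `knn_predict` helper with made-up toy data, none of which come from the post itself.

```python
import math
from collections import Counter


def knn_predict(train, query, k=3):
    """Predict a label for `query` by majority vote of its k nearest points.

    `train` is a list of (point, label) pairs; points are coordinate tuples.
    """
    # Sort the training points by Euclidean distance to the query, keep k.
    neighbors = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    # Each neighbor casts one vote for its own label.
    votes = Counter(label for _, label in neighbors)
    return votes.most_common(1)[0][0]


# Hypothetical toy data: two small clusters labeled "a" and "b"
train = [
    ((1, 1), "a"), ((1, 2), "a"), ((2, 1), "a"),
    ((8, 8), "b"), ((9, 8), "b"),
]
near_a = knn_predict(train, (1.5, 1.5))
near_b = knn_predict(train, (8.5, 8.0))
```

Note that there is no training step: the "model" is just the stored data, and all the work happens at prediction time. The choice of `k` and of the distance measure are the main knobs, and features usually need to be on comparable scales for the distances to mean anything.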

Building a Practical Retrieval-Augmented Personal Assistant (RAG)

Everything discussed here is available in my GitHub repository. Built with Ollama, LangChain, and Chroma. 1. Why I Built This: Large Language Models are powerful, but they hallucinate and have no access to your private knowledge. We set out to build a small, local Retrieval-Augmented Generation (RAG) assistant that: 2. What Is RAG (In Plain […]

Data Mining Essentials

Whether you’re preparing for a quiz or just brushing up on fundamentals, this guide distills the key concepts from Data Mining into bite-sized, memorable chunks. Let’s dive in! Understanding Machine Learning Tasks. Regression vs. Classification: Know Your Output. The fundamental distinction in supervised learning comes down to what you’re predicting: Pro tip: If someone asks […]