Building a Minimal LangChain + Ollama Demo

This post walks through a tiny, practical example that uses LangChain primitives and a local Ollama server to run prompt-based summarization. The project is intentionally small so you can learn the core ideas quickly: extracting prompts, initializing an LLM client, composing a chain, and running it locally. Repository used in this post: llangchain-tutorial. What you’ll […]
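The chain idea the excerpt names (prompt → model → parser) can be sketched without any dependencies. This is an illustrative stub, not the post's actual LangChain code: `fake_llm` stands in for the Ollama-backed model, and the function names are hypothetical.

```python
# Dependency-free sketch of the chain pattern: prompt template ->
# model call -> output parser. The model call is stubbed; the real
# project would send the prompt to a local Ollama server instead.

def make_prompt(text: str) -> str:
    # Prompt "template": interpolate the input into an instruction.
    return f"Summarize the following text in one sentence:\n\n{text}"

def fake_llm(prompt: str) -> str:
    # Stand-in for the local model; a real client would POST the
    # prompt to Ollama and return its completion.
    return "SUMMARY: " + prompt.splitlines()[-1][:60]

def parse_output(raw: str) -> str:
    # Output parser: strip the model's leading tag.
    return raw.removeprefix("SUMMARY: ").strip()

def chain(text: str) -> str:
    # Compose the three steps, mirroring prompt | llm | parser.
    return parse_output(fake_llm(make_prompt(text)))

print(chain("LangChain composes prompts, models, and parsers."))
```

The composition in `chain` is the whole point: each stage is a plain function of the previous stage's output, which is exactly what makes chains easy to test and swap.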

Matrix Multiplication to GPUs

If you are like me, you must have stumbled on this question: why is matrix multiplication so significant, especially in the AI/ML universe? How did I arrive at this question? When I first dove into Machine Learning (ML) and Deep Learning, I hit a wall. Terms like “backpropagation,” “Transformer models,” and “convolutional layers” flew […]
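To make the operation in the title concrete, here is a minimal pure-Python matrix multiply: each output entry is the dot product of a row of A and a column of B. The inner loops are fully independent of one another, which is precisely the parallelism GPUs exploit.

```python
# Naive matrix multiply: C[i][j] = sum over k of A[i][k] * B[k][j].
# Every (i, j) cell can be computed independently -- the property
# that makes the operation a natural fit for GPU parallelism.

def matmul(A, B):
    rows, inner, cols = len(A), len(B), len(B[0])
    assert len(A[0]) == inner, "inner dimensions must match"
    C = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            for k in range(inner):
                C[i][j] += A[i][k] * B[k][j]
    return C

A = [[1, 2], [3, 4]]
B = [[5, 6], [7, 8]]
print(matmul(A, B))  # [[19, 22], [43, 50]]
```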

The Core Divide

At the heart of machine learning, algorithms learn from data. The key distinction between supervised and unsupervised learning lies in the type of data used for training. Supervised Learning: Learning with a Teacher. Supervised learning algorithms are trained on a labeled dataset: for every input, there is a known, correct output (the […]
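The divide can be shown on the same four numbers. A rough sketch, not a real ML library: the supervised half learns a decision threshold from labeled points, while the unsupervised half must discover the two groups on its own (a hand-rolled 1-D k-means step).

```python
# Supervised: every x comes with a known, correct label y.
labeled = [(1.0, "low"), (2.0, "low"), (8.0, "high"), (9.0, "high")]
lows = [x for x, y in labeled if y == "low"]
highs = [x for x, y in labeled if y == "high"]
threshold = (max(lows) + min(highs)) / 2  # midpoint between classes

def predict(x):
    # Classify a new point using the learned threshold.
    return "high" if x > threshold else "low"

print(predict(7.5))  # "high"

# Unsupervised: the same points with no labels; the algorithm must
# find structure (two clusters) by itself.
points = [1.0, 2.0, 8.0, 9.0]
c1, c2 = min(points), max(points)  # initial center guesses
for _ in range(5):                 # a few refinement passes
    g1 = [p for p in points if abs(p - c1) <= abs(p - c2)]
    g2 = [p for p in points if abs(p - c1) > abs(p - c2)]
    c1, c2 = sum(g1) / len(g1), sum(g2) / len(g2)

print(sorted([c1, c2]))  # cluster centers: [1.5, 8.5]
```

Note the asymmetry: the supervised model can be graded against known answers, while the clusters have no “correct” labels to check against.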

Beyond the `DataFrame`: How Parquet and Arrow Turbocharge PySpark 🚀

In my last post, we explored the divide between Pandas (single machine) and PySpark (distributed computing). The conclusion: for massive datasets, PySpark is the clear winner. But simply choosing PySpark isn’t the end of the optimization journey. If PySpark is the engine for big data, then Apache Parquet and Apache Arrow are the high-octane fuel […]
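The core reason Parquet (and Arrow's in-memory layout) speeds up analytical scans can be illustrated without Spark at all. This is a conceptual contrast in plain Python, not actual Parquet code; real Parquet adds compression, encodings, and on-disk metadata on top of the columnar idea.

```python
# Row-oriented vs column-oriented layouts, side by side.

rows = [
    {"id": 1, "name": "a", "score": 0.9},
    {"id": 2, "name": "b", "score": 0.7},
    {"id": 3, "name": "c", "score": 0.4},
]

# Row layout: reading one field still touches every whole record.
scores_from_rows = [r["score"] for r in rows]

# Column layout: each column is stored contiguously, so a query that
# needs only "score" reads just that one array. This column pruning
# is what makes columnar formats cheap for analytical scans.
columns = {
    "id": [1, 2, 3],
    "name": ["a", "b", "c"],
    "score": [0.9, 0.7, 0.4],
}
scores_from_columns = columns["score"]

print(scores_from_rows == scores_from_columns)  # True
```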