-
Deploying self-hosted LLMs in Prod

A practical walkthrough of deploying a self-hosted LLM on AWS EKS: dedicated GPU node groups, taints and tolerations for workload isolation, separate inference and application services, model-weight caching via volumes, and sizing guidance for VRAM, concurrency, and warm GPU capacity.
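The taint-and-toleration isolation that summary mentions can be sketched as a minimal pod spec. This is an illustrative assumption, not taken from the post: the `node-group: gpu` label, the pod name, and the `vllm/vllm-openai` image are placeholders, and the `nvidia.com/gpu` taint key is simply a common convention for GPU node pools.

```yaml
# Sketch: a pod that tolerates a GPU node group's taint, so only
# inference workloads (not app pods) land on the expensive nodes.
# All names here are illustrative assumptions.
apiVersion: v1
kind: Pod
metadata:
  name: llm-inference
spec:
  tolerations:
    - key: "nvidia.com/gpu"        # matches the taint applied to the GPU node group
      operator: "Exists"
      effect: "NoSchedule"
  nodeSelector:
    node-group: gpu                # assumed label on the dedicated GPU nodes
  containers:
    - name: inference
      image: vllm/vllm-openai:latest   # example inference server image
      resources:
        limits:
          nvidia.com/gpu: 1        # request one GPU via the device plugin
```

App pods without the toleration are repelled by the `NoSchedule` taint, which is what keeps the GPU nodes reserved for inference.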
-
Mobile App Agentic Patterns

I’ve been building mobile apps that leverage LLMs for almost four years, and the hardest lessons didn’t come from prompts—they came from the phone itself. Mobile copilots feel inevitable: we already live in our messaging apps, and text is the most natural UI we have. But building copilots on a device that’s both highly privileged…
-
Reference Architecture: An Agentic CLI Application

While writing the previous post in this series, I hit a very unglamorous problem: my laptop disk was full. Not “almost full”. Full enough that everything started to feel brittle. I did what I always do: open a couple of folders, run a few du commands, check caches, look for the usual suspects, delete stuff,…
