LLM - Run locally using Ollama

What is Ollama?
  1. Ollama is a simple tool that lets you download and run large language models on your own computer, without needing a powerful server.
  2. Under the hood it uses llama.cpp, an optimized C/C++ inference engine, to run models directly on your machine.
What is installed?
  1. Gemma 3 - Google's open-weights model, built from the same research and technology as its much larger Gemini and PaLM 2 models, which were trained on significantly larger datasets and give the family a broad understanding of the world.
  2. Note, Google keeps its larger models as hosted services: PaLM 2, for example, powered Bard (the chatbot) and is offered through Vertex AI, while the smaller, more efficient Gemma models are the ones released for local use.
How do I install Ollama?
  1. Navigate to https://ollama.com/download, and download the version for your operating system
  2. Run the setup program, and follow the prompts
  3. Run Ollama from the command prompt or PowerShell: "ollama run gemma3" (a sketch after this list shows the same call through Ollama's local API)
  4. Enter "I am trying to learn French. I am a complete beginner. Please have a conversation with me to teach me French"
  5. We are using a free, open-source project to replicate features that paid applications charge for
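The same conversation can be driven from code once the model has been pulled. Below is a minimal sketch, assuming the Ollama server is running on its default port (11434), the requests package is installed (pip install requests), and gemma3 is the model pulled in step 3:

import requests

# Ollama exposes a local REST API on port 11434 by default.
OLLAMA_URL = "http://localhost:11434/api/chat"

payload = {
    "model": "gemma3",  # the model pulled in step 3
    "messages": [
        {
            "role": "user",
            "content": (
                "I am trying to learn French. I am a complete beginner. "
                "Please have a conversation with me to teach me French."
            ),
        }
    ],
    "stream": False,  # ask for one complete JSON response instead of a token stream
}

response = requests.post(OLLAMA_URL, json=payload, timeout=300)
response.raise_for_status()

# The assistant's reply is returned under message.content in the JSON body.
print(response.json()["message"]["content"])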

Different versions of the models can be browsed at ollama.com/library

  1. It is good to experiment with them and find which use cases they best support for your problem (a comparison sketch follows this list)
  2. This is a critical skill for an LLM engineer
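To make that experimentation concrete, the same prompt can be sent to several locally installed models and the answers compared side by side. A rough sketch under the same assumptions as the earlier one; the model tags below are only illustrative examples, and each must be pulled first (for example "ollama pull llama3.2"):

import requests

OLLAMA_URL = "http://localhost:11434/api/generate"

# Example model tags - check ollama.com/library for current names,
# and pull each one locally before running this script.
MODELS = ["gemma3", "llama3.2", "qwen2.5", "phi3"]

PROMPT = "Explain the French phrase 'je voudrais' to a complete beginner."

for model in MODELS:
    payload = {"model": model, "prompt": PROMPT, "stream": False}
    reply = requests.post(OLLAMA_URL, json=payload, timeout=300).json()
    print(f"--- {model} ---")
    print(reply["response"][:400])  # first few hundred characters, enough for a quick comparison
    print()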
Key takeaway
  • Demonstrated use case: building a free, local AI language tutor (Spanish/French) that replicates commercial app features at no cost. This immediate application shows how quickly open-source LLMs can be turned into commercially viable solutions; a minimal tutor loop is sketched below.
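A tutor needs to remember the conversation, so each exchange is appended to the message history before the next request. A minimal interactive loop, again a sketch assuming the local Ollama server and the gemma3 model used above:

import requests

OLLAMA_URL = "http://localhost:11434/api/chat"
MODEL = "gemma3"

# Seed the conversation with the tutoring instruction, then keep appending turns
# so the model can refer back to earlier exchanges.
history = [
    {"role": "system", "content": "You are a patient French tutor for a complete beginner."}
]

while True:
    user_text = input("You: ")
    if user_text.strip().lower() in {"quit", "exit"}:
        break
    history.append({"role": "user", "content": user_text})
    reply = requests.post(
        OLLAMA_URL,
        json={"model": MODEL, "messages": history, "stream": False},
        timeout=300,
    ).json()["message"]
    history.append(reply)  # keep the assistant's turn so the tutor has memory
    print("Tutor:", reply["content"])

Sending the full history with every request is the simplest design; for very long sessions the history may eventually exceed the model's context window and would need trimming or summarising.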

Strategic Implications & High-Level Insights

  • Democratization of AI Capabilities: Deploying advanced LLMs locally, without cloud dependencies or licensing fees, lowers barriers for rapid prototyping and internal innovation.

  • Cost Efficiency: Open-source models can replace paid AI services, enabling organizations to internalize capabilities and reduce recurring SaaS costs.

  • Model Selection as a Core Competency: Experiment early with multiple models (Meta's Llama, Google's Gemma, Alibaba's Qwen, Microsoft's Phi) to identify the best fit for specific tasks; this is a critical skill for an LLM engineer and is crucial for maximizing ROI and performance in enterprise AI deployments.

  • Performance Considerations: Hardware architecture (e.g., Apple M1 vs. PC emulation) significantly affects LLM performance; larger models may run noticeably slower on certain hardware, which should guide infrastructure investment decisions. A rough tokens-per-second benchmark is sketched below.
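One way to ground that decision is to measure generation speed directly. When streaming is turned off, Ollama's responses include eval_count (tokens generated) and eval_duration (generation time, in nanoseconds), from which a rough tokens-per-second figure can be derived. A sketch, with the same assumptions as the earlier examples:

import requests

OLLAMA_URL = "http://localhost:11434/api/generate"

payload = {
    "model": "gemma3",
    "prompt": "Summarise, in three sentences, how to greet someone politely in French.",
    "stream": False,
}

stats = requests.post(OLLAMA_URL, json=payload, timeout=600).json()

# eval_count = tokens generated, eval_duration = generation time in nanoseconds.
tokens = stats["eval_count"]
seconds = stats["eval_duration"] / 1e9
print(f"{tokens} tokens in {seconds:.1f}s -> {tokens / seconds:.1f} tokens/sec")

Running the same script with different models and on different machines gives a simple, comparable figure for sizing decisions.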



 
