Fine-tuning is the practice of training a pre-trained LLM on a smaller, domain-specific dataset so it performs better on a narrow task. The base model's weights are nudged to specialise without losing the broad capabilities it already has.
In 2026, the more common practical alternative for most enterprise use cases is retrieval-augmented generation (RAG) + prompt engineering. Fine-tuning still wins when the model needs to learn a structured output format, a private vocabulary, or a tone/style that prompting cannot reliably enforce.
Major providers (Anthropic, OpenAI, Google) offer hosted fine-tuning workflows. Open-weight options (Llama, Mistral, Qwen) extend the playbook for teams that need on-premise deployment or full weight control. The choice is downstream of: where the data must live, what the latency budget is, and whether the team has ML engineering depth.