Instruct Templates
Get more out of your model of choice.
These recommendations are informed and heuristic-based suggestions.
THEY ARE NOT ABSOLUTE FACTS.
Instruct templates are special formatting styles used to tell a language model how to behave, answer questions, or carry on a conversation. Think of them like a "starter guide" or "cheat sheet" for the AI. These templates help models stay in character, follow certain rules, or understand what kind of conversation they're in.
Why It’s Important to Use the Right Instruct Template?
Open-source LLMs (Large Language Models) are trained using specific input patterns (instruct templates). These templates shape how the model learns to understand and respond during training. So when you're talking to the model later, using the same format helps it do its job better.
ELI5 Break Down:
Imagine you trained a dog using hand signals, like waving your hand to mean “sit.” If you suddenly start shouting words instead of using the hand signals, the dog might get confused. 🐶
Language models are similar. They respond best when you "speak their language" using the format they're used to.
TL;DR
Alpaca, Mistral: Minimalist, focused on high-level traits.
Vicuna: Conversational, USER/ASSISTANT format.
Llama-3, ChatML, Gemma2: Hierarchical structures (JSON, XML tags).
Command-R: Direct, task-oriented instruction.
Metharme: YAML for detailed nesting.
Where to Look/Ask
Alpaca
Alpaca prefers straightforward instructions and can work well with minimal formatting like Markdown.
Vicuna
Vicuna uses a conversational format where roles like "USER" and "ASSISTANT" are clearly delineated.
Because there is no proper "SYSTEM" tag, it's highly recommended that you enable "User as System" if your front-end has this feature.
Example:
Llama-3
Llama-3 follows a system-message format similar to OpenAI's ChatML. Place the system directive at the top.
XML-style tags will also work if you don't overdo nesting.
Examples:
Command-R
Command-R models, tuned for explicit task completion, respond well to clear role definitions.
ChatML
ChatML is a markup language that uses structured inputs.
Use tags to delineate and define roles.
Mistral
Mistral performs best with direct and simple formatting, similar to Alpaca or Vicuna. (Mistral Documentation)
Original prompt structure:
Emulated character sheet:
Gemma2
Gemma2 often prefers tag-based or JSON-like structures for clear role delineation.
Or:
Metharme
Metharme benefits from YAML-style structured prompts or simple system headers.
Additional Resources
© 2024 by SopakcoSauce. Except as otherwise noted, the content of this page is licensed under CC BY-NC-SA 4.0
Last updated