Instruct Templates

Get more out of your model of choice.

Instruct templates are special formatting styles used to tell a language model how to behave, answer questions, or carry on a conversation. Think of them like a "starter guide" or "cheat sheet" for the AI. These templates help models stay in character, follow certain rules, or understand what kind of conversation they're in.

Why It’s Important to Use the Right Instruct Template?

Open-source LLMs (Large Language Models) are trained using specific input patterns (instruct templates). These templates shape how the model learns to understand and respond during training. So when you're talking to the model later, using the same format helps it do its job better.

ELI5 Break Down:

Imagine you trained a dog using hand signals, like waving your hand to mean “sit.” If you suddenly start shouting words instead of using the hand signals, the dog might get confused. 🐶

Language models are similar. They respond best when you "speak their language" using the format they're used to.

TL;DR

  • Alpaca, Mistral: Minimalist, focused on high-level traits.

  • Vicuna: Conversational, USER/ASSISTANT format.

  • Llama-3, ChatML, Gemma2: Hierarchical structures (JSON, XML tags).

  • Command-R: Direct, task-oriented instruction.

  • Metharme: YAML for detailed nesting.

Where to Look/Ask

1

README.txt

2

Instruct.json

3

Config.json

4

Communities

Hugging Face, SillyTavern, Wyvern, Chub, etc...


Alpaca

  • Alpaca prefers straightforward instructions and can work well with minimal formatting like Markdown.


Vicuna

  • Vicuna uses a conversational format where roles like "USER" and "ASSISTANT" are clearly delineated.

  • Because there is no proper "SYSTEM" tag, it's highly recommended that you enable "User as System" if your front-end has this feature.

  • Example:


Llama-3

  • Llama-3 follows a system-message format similar to OpenAI's ChatML. Place the system directive at the top.

  • XML-style tags will also work if you don't overdo nesting.

  • Examples:


Command-R

  • Command-R models, tuned for explicit task completion, respond well to clear role definitions.


ChatML

  • ChatML is a markup language that uses structured inputs.

  • Use tags to delineate and define roles.


Mistral

  • Mistral performs best with direct and simple formatting, similar to Alpaca or Vicuna. (Mistral Documentation)

  • Original prompt structure:

  • Emulated character sheet:


Gemma2

  • Gemma2 often prefers tag-based or JSON-like structures for clear role delineation.

Or:


Metharme

  • Metharme benefits from YAML-style structured prompts or simple system headers.


Additional Resources


© 2024 by SopakcoSauce. Except as otherwise noted, the content of this page is licensed under CC BY-NC-SA 4.0

Last updated