Model Authoring
Browse 17 Model Authoring modes for AI coding agents — production-grounded, cited, installable. Part of the VIBE library.
chat-template-expert-mode
Author and debug Jinja2 chat_template strings in HF tokenizer_config.json — ChatML, Llama 3, Qwen, Gemma, Mistral, plus tools / function calling
distil-mini-model-expert-mode
Author small distilled models for shipping: choose a teacher, design the distillation recipe, evaluate on real prompts before publishing, and quantize to GGUF to reduce footprint
embedding-model-publish-expert-mode
Publish embedding models — sentence-transformers config, modules.json, 1_Pooling, MTEB submission, Matryoshka dims, embedding-specific model card
gguf-conversion-expert-mode
Convert HF safetensors to GGUF with convert_hf_to_gguf.py — handle vocab, tied embeddings, sharded checkpoints, and produce reproducible F16/BF16 + quantize pipelines
gguf-multimodal-mmproj-expert-mode
Author multimodal GGUF — mmproj projector files, llama-mtmd-cli, llama-server multimodal endpoint, with LLaVA / MiniCPM-V / InternVL / Qwen2-VL / Gemma 3
lora-adapter-publish-expert-mode
Package and publish LoRA adapters — HF Hub layout, vLLM dynamic loading, llama.cpp LoRA GGUF, Ollama ADAPTER directive, Replicate Cog
mlx-converter-expert-mode
Convert HF safetensors models to MLX format, quantize to 4-bit / 8-bit, publish to mlx-community on HF Hub for Apple Silicon serving
model-card-publish-expert-mode
Author HF model cards — README.md frontmatter (license, library_name, base_model, datasets, language, pipeline_tag, tags), eval results, intended use, training attribution
ollama-library-publisher-expert-mode
Publish models to ollama.com/library — namespace setup, ollama push, signing keys, quant tags, parameter-size tags, model card README authoring
ollama-modelfile-expert-mode
Author production Modelfiles with FROM, PARAMETER, TEMPLATE, SYSTEM, ADAPTER, MESSAGE, and LICENSE directives for Llama 3, Qwen, Phi, and Gemma
ollama-multimodal-modelfile-expert-mode
Author Ollama Modelfiles for vision models — llava, llama3.2-vision, MiniCPM-V — with mmproj projector handling and image-token templates
prompt-template-marketplace-expert-mode
Share and version prompt templates — LangChain Hub, Langfuse, dotprompt, OpenAI Playground exports, promptfoo configs — with deprecation patterns
quantization-format-expert-mode
Pick between GGUF K/IQ quants, AWQ, GPTQ, bitsandbytes NF4, EXL2, MLX 4-bit, NVFP4 — decision matrix by hardware and serving stack
safetensors-expert-mode
Author and inspect safetensors files — header layout, sharding via model.safetensors.index.json, mmap loading, and PEFT adapter format
structured-output-expert-mode
Constrained generation across stacks — Outlines, lm-format-enforcer, llama.cpp GBNF, OpenAI json_schema, vLLM guided_json, Instructor — with a decision matrix
system-prompt-engineering-expert-mode
Author durable system prompts — persona, capability scoping, refusal patterns, output format directives, jailbreak hardening, prompt caching, dynamic injection
tokenizer-engineering-expert-mode
Train tokenizers from scratch with HF tokenizers — BPE / SentencePiece / WordPiece — extend vocab for new languages or code, and add chat / special tokens