# Quick Start
This page walks through the most common MMIRAGE workflows step by step.
## Text-Only: Reformatting a Dataset
Suppose you have a JSONL dataset where each sample looks like:
```json
{
  "conversations": [
    {"role": "user", "content": "Describe the image"},
    {"role": "assistant", "content": "This is a badly formatted answer"}
  ],
  "modalities": ["<the images>"]
}
```
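If you need to produce a file in this shape yourself, JSONL is simply one JSON object per line. A minimal Python sketch (the modality entry is a placeholder, not a real file):

```python
import json

# One sample in the shape shown above: a "conversations" list of
# role/content turns plus a "modalities" list.
samples = [
    {
        "conversations": [
            {"role": "user", "content": "Describe the image"},
            {"role": "assistant", "content": "This is a badly formatted answer"},
        ],
        "modalities": ["<the images>"],  # placeholder, as in the sample above
    }
]

# JSONL: serialize each sample as a single line.
with open("dataset.jsonl", "w", encoding="utf-8") as f:
    for sample in samples:
        f.write(json.dumps(sample) + "\n")
```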
Suppose you want to use an LLM to reformat the assistant's answer as Markdown. Create a YAML config:
```yaml
processors:
  - type: llm
    server_args:
      model_path: Qwen/Qwen3-8B
      tp_size: 4
      trust_remote_code: true
    default_sampling_params:
      temperature: 0.1
      top_p: 1.0
      max_new_tokens: 384

loading_params:
  state_dir: /path/to/state
  datasets:
    - path: /path/to/dataset.jsonl
      type: JSONL
  output_dir: /path/to/output/shards
  num_shards: 4
  shard_id: "$SLURM_ARRAY_TASK_ID"
  batch_size: 64

processing_params:
  inputs:
    - name: assistant_answer
      key: conversations[1].content
    - name: user_prompt
      key: conversations[0].content
    - name: modalities
      key: modalities
  outputs:
    - name: formatted_answer
      type: llm
      output_type: plain
      prompt: |
        Reformat the answer in a markdown format without adding anything else:
        {{ assistant_answer }}
  remove_columns: false
  output_schema:
    conversations:
      - role: user
        content: "{{ user_prompt }}"
      - role: assistant
        content: "{{ formatted_answer }}"
    modalities: "{{ modalities }}"

execution_params:
  mode: local
  retry: false
  merge: false
```
Then run:

```shell
mmirage run --config configs/my_config.yaml
```
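The `key` fields in `inputs` address nested values with a dotted path plus list indices (e.g. `conversations[1].content`). How MMIRAGE resolves these internally is not shown here; purely as an illustration, such a path can be resolved in a few lines of Python:

```python
import re

def resolve(sample, path):
    """Resolve a dotted path with optional [i] indices, e.g. 'conversations[1].content'.

    Illustrative sketch only -- not MMIRAGE's internal implementation.
    """
    value = sample
    for part in path.split("."):
        # Split 'conversations[1]' into the key name and its list indices.
        m = re.match(r"(\w+)((?:\[\d+\])*)$", part)
        key, indices = m.group(1), re.findall(r"\[(\d+)\]", m.group(2))
        value = value[key]
        for i in indices:
            value = value[int(i)]
    return value

sample = {
    "conversations": [
        {"role": "user", "content": "Describe the image"},
        {"role": "assistant", "content": "This is a badly formatted answer"},
    ]
}
print(resolve(sample, "conversations[1].content"))  # the assistant turn's content
```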
## Multimodal: Processing Images with a VLM
For vision-language tasks, add image inputs and specify a `chat_template`:
```yaml
processors:
  - type: llm
    server_args:
      model_path: Qwen/Qwen2-VL-7B-Instruct
      tp_size: 4
      trust_remote_code: true
      chat_template: qwen2-vl
    default_sampling_params:
      temperature: 0.1
      top_p: 0.95
      max_new_tokens: 768

loading_params:
  datasets:
    - path: /path/to/image/dataset
      type: loadable
  output_dir: /path/to/output/shards
  image_base_path: /path/to/images
  num_shards: 4
  shard_id: "$SLURM_ARRAY_TASK_ID"
  batch_size: 16

processing_params:
  inputs:
    - name: image
      key: image_path
      type: image
    - name: question
      key: question
  outputs:
    - name: answer
      type: llm
      output_type: plain
      prompt: |
        Answer this question about the image:
        {{ question }}
  output_schema:
    question: "{{ question }}"
    answer: "{{ answer }}"

execution_params:
  mode: local
  retry: true
  merge: true
```
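Both configs split work across shards via `num_shards` and `shard_id` (here read from `$SLURM_ARRAY_TASK_ID`). The exact assignment MMIRAGE uses is not documented on this page; one plausible scheme, sketched here as an assumption, is a simple round-robin split:

```python
def shard(samples, num_shards, shard_id):
    """Round-robin split: sample i goes to shard i % num_shards.

    Illustrative assumption about how num_shards/shard_id partition a dataset.
    """
    return [s for i, s in enumerate(samples) if i % num_shards == shard_id]

samples = list(range(10))  # stand-in for dataset rows
print(shard(samples, num_shards=4, shard_id=0))  # [0, 4, 8]
```

With `num_shards: 4`, launching the job as a SLURM array of size 4 gives each array task (`shard_id` 0 through 3) a disjoint slice of the dataset.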
## Running on SLURM
Set `execution_params.mode: slurm` and provide SLURM-specific parameters:
```yaml
execution_params:
  mode: slurm
  account: my_account
  job_name: mmirage-job
  nodes: 1
  ntasks_per_node: 1
  gpus: 4
  cpus_per_task: 64
  time_limit: "11:59:59"
  retry: true
  merge: true
  max_retries: 3
```
Then submit:

```shell
mmirage run --config configs/slurm_config.yaml
```
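With `merge: true`, MMIRAGE combines the per-shard outputs for you. If you ever need to merge JSONL shards by hand, concatenating the files is sufficient; a small sketch (file names and the glob pattern here are hypothetical):

```python
import glob

def merge_shards(pattern, out_path):
    """Concatenate matching JSONL shard files into one file, one record per line."""
    with open(out_path, "w", encoding="utf-8") as out:
        for path in sorted(glob.glob(pattern)):
            with open(path, encoding="utf-8") as f:
                for line in f:
                    if line.strip():  # skip blank lines between records
                        out.write(line.rstrip("\n") + "\n")
```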
See CLI Reference for all available subcommands and Configuration Reference for a full parameter guide.