Configuration
DocAsk uses YAML configuration files stored in configs/, plus project-specific configuration files generated under data/projects/.
There are two levels of configuration:
configs/ → global DocAsk settings
data/projects/<project_name>/ → generated configuration and corpus for a selected target project
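As a rough illustration of the two levels, the sketch below builds both locations with pathlib and picks the generated project-level file when it exists. The helper names and the fallback order are assumptions made for this example, not documented DocAsk behavior.

```python
from pathlib import Path

# Hypothetical helpers: names and fallback order are assumptions for illustration.
GLOBAL_CONFIG_DIR = Path("configs")
PROJECTS_DIR = Path("data") / "projects"


def project_dir(project_name: str, root: Path = Path(".")) -> Path:
    """Return data/projects/<project_name>/ under the given root."""
    return root / PROJECTS_DIR / project_name


def resolve_project_config(project_name: str, root: Path = Path(".")) -> Path:
    """Prefer the generated project-level file, else the global default."""
    generated = project_dir(project_name, root) / "project_config.yaml"
    if generated.is_file():
        return generated
    return root / GLOBAL_CONFIG_DIR / "project_config.yaml"
```

Calling `resolve_project_config("mmore")` from the DocAsk root would return the file under data/projects/mmore/ once it has been generated, and configs/project_config.yaml otherwise.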
app_config.yaml
This file stores application-level settings.
Example:
app_title: DocAsk
app_subtitle: Ask questions about a project's documentation
show_sources: true
default_top_k: 5
project_profile: mmore
llm:
  provider: qwen
  model_name: Qwen/Qwen3-1.7B
  max_new_tokens: 512
  temperature: 0.0
  enable_thinking: false
Main fields
| Field | Meaning |
|---|---|
| app_title | Title used by the app. |
| app_subtitle | Subtitle used by the app. |
| show_sources | Whether sources should be shown by default. |
| default_top_k | Default number of retrieved sources. |
| project_profile | Project-specific behavior profile, for example mmore. |
| llm.provider | LLM provider used by DocAsk. |
| llm.model_name | Model name used by the provider. |
| llm.max_new_tokens | Maximum number of generated tokens. |
| llm.temperature | Generation temperature. |
| llm.enable_thinking | Whether to enable Qwen thinking mode. |
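The fields above map naturally onto typed settings objects. The following is a minimal sketch, assuming the YAML file has already been parsed into a plain dict (for instance with PyYAML's `yaml.safe_load`). The class and function names are hypothetical, and the default values simply mirror the example file rather than any defaults DocAsk itself defines.

```python
from dataclasses import dataclass, field
from typing import Any


@dataclass
class LLMSettings:
    # Defaults copied from the example file above (an assumption, not DocAsk's own defaults).
    provider: str = "qwen"
    model_name: str = "Qwen/Qwen3-1.7B"
    max_new_tokens: int = 512
    temperature: float = 0.0
    enable_thinking: bool = False


@dataclass
class AppSettings:
    app_title: str = "DocAsk"
    app_subtitle: str = "Ask questions about a project's documentation"
    show_sources: bool = True
    default_top_k: int = 5
    project_profile: str = "mmore"
    llm: LLMSettings = field(default_factory=LLMSettings)


def app_settings_from_dict(raw: dict[str, Any]) -> AppSettings:
    """Build AppSettings from a parsed mapping; missing keys keep the defaults."""
    raw = dict(raw)  # copy so the caller's mapping is not mutated
    llm = LLMSettings(**(raw.pop("llm", None) or {}))
    # Unknown keys raise TypeError, which surfaces typos in field names early.
    return AppSettings(llm=llm, **raw)
```

For example, `app_settings_from_dict({"default_top_k": 3})` keeps every other field at the values shown in the example file.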
project_config.yaml
This file describes the target project being indexed.
The default configuration is:
configs/project_config.yaml
When using the Streamlit interface, DocAsk also generates a project-specific config:
data/projects/<project_name>/project_config.yaml
Example for MMORE:
project_name: mmore
package_name: mmore
repo_path: /absolute/path/to/mmore
docs_path: /absolute/path/to/mmore/docs/source
code_path: /absolute/path/to/mmore/src/mmore
include_yaml_configs: true
yaml_config_paths:
  - /absolute/path/to/mmore/examples
  - /absolute/path/to/mmore/production-config
include_repo_structure: true
repo_structure_max_depth: 6
Main fields
| Field | Meaning |
|---|---|
| project_name | Project name stored in metadata and used for project folders. |
| package_name | Python package prefix used to build full module names. |
| repo_path | Root folder of the target repository. |
| docs_path | Folder containing Markdown or reStructuredText docs. |
| code_path | Python source folder used for docstring extraction. |
| include_yaml_configs | Whether to include YAML files in the corpus. |
| yaml_config_paths | Folders scanned for YAML configuration files. |
| include_repo_structure | Whether to add a synthetic repository tree document. |
| repo_structure_max_depth | Maximum depth of the generated repository tree. |
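A quick sanity check on a parsed project configuration can catch common mistakes before indexing. The sketch below is illustrative only: the function name, the set of fields treated as required, and the rule that paths must be absolute are assumptions inferred from the example, not checks that DocAsk is known to perform.

```python
from pathlib import PurePosixPath
from typing import Any

# Assumed for illustration: which fields are mandatory and which hold paths.
REQUIRED_FIELDS = ("project_name", "package_name", "repo_path", "docs_path", "code_path")
PATH_FIELDS = ("repo_path", "docs_path", "code_path")


def check_project_config(cfg: dict[str, Any]) -> list[str]:
    """Return human-readable problems; an empty list means none were found."""
    problems = [f"missing field: {name}" for name in REQUIRED_FIELDS if not cfg.get(name)]
    for name in PATH_FIELDS:
        value = cfg.get(name)
        # POSIX-style paths are assumed, as in the example above.
        if value and not PurePosixPath(value).is_absolute():
            problems.append(f"{name} is not an absolute path: {value}")
    if cfg.get("include_yaml_configs") and not cfg.get("yaml_config_paths"):
        problems.append("include_yaml_configs is true but yaml_config_paths is empty")
    depth = cfg.get("repo_structure_max_depth", 0)
    if cfg.get("include_repo_structure") and (not isinstance(depth, int) or depth < 1):
        problems.append("repo_structure_max_depth should be a positive integer")
    return problems
```

Returning a list of messages rather than raising on the first error lets a UI such as the Streamlit app show all problems at once.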
indexing_config.yaml
This file controls MMORE indexing defaults.
include_markdown: true
include_code_docstrings: true
include_signatures: true
include_code_snippets: false
chunk_size: 1200
chunk_overlap: 150
top_k: 5
retrieval_backend: simple
collection_name: mmore_docs
mmore_index_config_path: configs/mmore_index_config.yaml
Some fields, such as chunk_size and chunk_overlap, are kept for future chunking improvements. The current prototype mostly relies on Markdown sections and extracted code documentation records.
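To make the meaning of chunk_size and chunk_overlap concrete, here is a minimal sketch of a sliding-window splitter. It is not the prototype's chunking logic, which the text above describes as section-based. The function name is hypothetical, and measuring both parameters in characters is an assumption, since they could equally be interpreted as token counts.

```python
def chunk_text(text: str, chunk_size: int = 1200, chunk_overlap: int = 150) -> list[str]:
    """Split text into windows of chunk_size characters, each sharing
    chunk_overlap characters with the previous window."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must satisfy 0 <= chunk_overlap < chunk_size")
    step = chunk_size - chunk_overlap  # distance between consecutive window starts
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # this window already reaches the end of the text
    return chunks
```

With the values from the file, consecutive windows start 1050 characters apart, so a 3000-character document would yield three chunks.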
mmore_index_config.yaml
This file is passed to MMORE when building the index.
indexer:
  dense_model:
    model_name: sentence-transformers/all-MiniLM-L6-v2
    is_multimodal: false
  sparse_model:
    model_name: splade
    is_multimodal: false
  db:
    uri: ./data/indexes/mmore/proc_demo.db
    name: my_db
collection_name: mmore_docs
documents_path: data/processed/mmore_corpus.jsonl
mmore_retriever_config.yaml
This file configures MMORE retrieval.
db:
  uri: ./data/indexes/mmore/proc_demo.db
  name: my_db
hybrid_search_weight: 0.5
k: 5
collection_name: mmore_docs
use_web: false
reranker_model_name: null
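The index and retriever files share the database URI, database name, and collection name, and retrieval presumably only finds documents when these agree. The sketch below compares the two parsed files. The function name is hypothetical, and it assumes that db is nested under indexer in the index file while collection_name is a top-level key in both files.

```python
from typing import Any


def check_index_retriever_match(
    index_cfg: dict[str, Any], retriever_cfg: dict[str, Any]
) -> list[str]:
    """Compare the settings both files share and describe any mismatch."""
    # Assumed layout: index file nests db under indexer; retriever file has db at top level.
    index_db = (index_cfg.get("indexer") or {}).get("db") or {}
    retriever_db = retriever_cfg.get("db") or {}
    mismatches = []
    for key in ("uri", "name"):
        if index_db.get(key) != retriever_db.get(key):
            mismatches.append(
                f"db.{key}: {index_db.get(key)!r} != {retriever_db.get(key)!r}"
            )
    if index_cfg.get("collection_name") != retriever_cfg.get("collection_name"):
        mismatches.append(
            "collection_name: "
            f"{index_cfg.get('collection_name')!r} != {retriever_cfg.get('collection_name')!r}"
        )
    return mismatches
```

Running such a check after editing either file is a cheap way to notice, for example, a renamed collection that the retriever still points to under its old name.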
data/app_state.json
The Streamlit app persists the latest local UI state in:
data/app_state.json
It can contain:
{
  "project_name": "mmore",
  "project_path": "/path/to/mmore",
  "corpus_path": "/path/to/docask/data/projects/mmore/corpus.jsonl",
  "project_config_path": "/path/to/docask/data/projects/mmore/project_config.yaml",
  "backend": "simple",
  "top_k": 5,
  "use_llm": true,
  "show_sources": true,
  "show_full_sources": false,
  "show_debug": false
}
This file is machine-specific and should normally be ignored by Git.
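Because the state file is local and may be absent on a fresh checkout, code reading it should tolerate a missing or corrupt file. The following is a minimal sketch using only the standard library. The function names and the choice of default values (taken from the example above) are assumptions, not DocAsk's actual persistence code.

```python
import json
from pathlib import Path
from typing import Any

# Defaults taken from the example above; treating them as defaults is an assumption.
DEFAULT_STATE: dict[str, Any] = {
    "backend": "simple",
    "top_k": 5,
    "use_llm": True,
    "show_sources": True,
    "show_full_sources": False,
    "show_debug": False,
}


def load_app_state(path: Path = Path("data/app_state.json")) -> dict[str, Any]:
    """Return saved state merged over defaults; fall back to defaults on any read error."""
    try:
        saved = json.loads(path.read_text(encoding="utf-8"))
    except (OSError, ValueError):  # missing file, unreadable file, or invalid JSON
        saved = {}
    if not isinstance(saved, dict):
        saved = {}
    return {**DEFAULT_STATE, **saved}


def save_app_state(state: dict[str, Any], path: Path = Path("data/app_state.json")) -> None:
    """Write the state as indented JSON, creating the parent folder if needed."""
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(state, indent=2), encoding="utf-8")
```

Merging saved values over defaults means a state file written by an older version, lacking newer keys, still loads cleanly.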