# Configuration

## TextIndexConfig

`TextIndexConfig` controls all aspects of index behavior: tokenization, scoring, semantic embeddings, fusion, and post-processing. Every parameter is optional with a sensible default. The configuration is split into a small number of top-level standalone fields plus a set of nested option blocks.

```gcl
var _index = TextIndex<String> {
    config: TextIndexConfig {}  // every field has a default
};
```

For a fully-formed configuration tuned for a typical use case, prefer the static factory presets — see [`presets.md`](./presets.md):

```gcl
var _index = TextIndex<String> { config: TextIndexConfig::keyword() };
```

### Top-Level Layout

| Field | Type | Notes |
|-------|------|-------|
| `embed` | `function?` | Embedding function for semantic search |
| `synonyms` | `Map<String, Array<String>>?` | Synonym expansion map |
| `fields` | `Array<FieldConfig>?` | BM25F multi-field configurations |
| `deduplicateContent` | `bool?` | SHA-256 content deduplication (default `false`) |
| `fuzzyMaxTextLength` | `int?` | Document length cap for fuzzy matching (default `500`) |
| `usePhonetic` | `bool?` | Build phonetic index (default `false`) |
| `tokenization` | `TokenizationOptions?` | Tokenizer + character map + normalization |
| `stopWords` | `StopWordOptions?` | Stop word handling |
| `bm25` | `BM25Options?` | BM25 scoring parameters |
| `fusion` | `FusionOptions?` | Hybrid score fusion (RRF / linear, weights) |
| `typoTolerance` | `TypoOptions?` | Automatic typo tolerance for BM25 |
| `edgeNgram` | `EdgeNgramOptions?` | Prefix-search edge n-gram index |
| `shortCircuit` | `ShortCircuitOptions?` | Skip-other-modes optimization |
| `diversify` | `DiversifyOptions?` | MMR diversity re-ranking |
| `chunking` | `ChunkingOptions?` | Chunking for long-document semantic search |
| `dfr` | `DFROptions?` | DFR scoring parameters |
| `lmDirichlet` | `LMDirichletOptions?` | LM-Dirichlet smoothing |
| `highlight` | `HighlightOptions?` | Snippet/highlight markup tags |

---

### Tokenization (`tokenization: TokenizationOptions`)

Control how raw text is split into indexable terms.

| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `separators` | `Array<String>?` | `[" "]` | Token separators (whitespace by default) |
| `minTermLength` | `int?` | `2` | Minimum term length to index |
| `maxTermLength` | `int?` | `100` | Maximum term length to index |
| `filterNumericTerms` | `bool?` | `true` | Skip purely numeric tokens |
| `caseFold` | `bool?` | `true` | Apply Unicode case folding (lowercasing) |
| `stripPunctuation` | `bool?` | `true` | Remove punctuation from terms |
| `stemming` | `bool?` | `false` | Apply Porter stemmer |
| `useDefaultCharMap` | `bool?` | `true` | Apply built-in 170+ Unicode-to-ASCII mappings |
| `charMap` | `Map<String, String>?` | -- | Custom character mappings (merged with defaults) |
| `normOptions` | `NormOptions?` | -- | Advanced pre-normalization (HTML, URLs, accents, etc.) |

#### Example

```gcl
config: TextIndexConfig {
    tokenization: TokenizationOptions {
        separators: [",", ";"],
        minTermLength: 1,
        filterNumericTerms: false,
        caseFold: false,
        stemming: true
    }
}
```

#### Custom Character Map

```gcl
var customMap = Map<String, String> {};
customMap.set("€", "EUR");
customMap.set("$", "USD");
customMap.set("&", "and");

config: TextIndexConfig {
    tokenization: TokenizationOptions {
        useDefaultCharMap: true,
        charMap: customMap
    }
}
```

#### Built-in Character Mapping Categories

- Hyphens / dashes -> `-`
- Smart quotes -> `"` or `'`
- Ligatures -> expanded (ﬁ -> fi, œ -> oe)
- Math symbols -> text equivalents
- Fractions -> 1/2, 1/3, etc.
- Fullwidth characters -> standard ASCII
- Cyrillic / Greek look-alikes -> Latin

---

### Stop Words (`stopWords: StopWordOptions`)

Control which high-frequency terms are excluded from the index to improve relevance and reduce index size.

| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `mode` | `StopWordMode?` | `none` | Stop word strategy |
| `language` | `TextSearchLanguage?` | `en` | Language for default stop words |
| `custom` | `Array<String>?` | -- | Custom stop word list (used with `StopWordMode::custom`) |
| `autoThreshold` | `float?` | `0.85` | Auto stop word threshold (0.0-1.0) |

#### Stop Word Modes

| Mode | Description | Use Case |
|------|-------------|----------|
| `StopWordMode::none` | No stop words -- index all terms | Technical docs, code search |
| `StopWordMode::default` | Language-specific stop word list | General text search |
| `StopWordMode::auto` | Auto-detect by document frequency | Domain-specific corpora |
| `StopWordMode::custom` | User-provided list | Specialized vocabularies |

#### Supported Languages (33)

`ar` `bg` `ca` `cs` `da` `de` `el` `en` `es` `fa` `fi` `fr` `gu` `he` `hi` `hu` `id` `it` `ja` `ko` `ms` `nl` `no` `pl` `pt` `ro` `ru` `sk` `sv` `tr` `uk` `vi` `zh`

#### Examples

```gcl
// German stop words
stopWords: StopWordOptions {
    mode: StopWordMode::default,
    language: TextSearchLanguage::de
}

// Auto-detect (terms in >70% of documents)
stopWords: StopWordOptions {
    mode: StopWordMode::auto,
    autoThreshold: 0.7
}

// Custom list
stopWords: StopWordOptions {
    mode: StopWordMode::custom,
    custom: ["custom", "stopword", "list"]
}
```
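
To illustrate what `StopWordMode::auto` is described as doing, the following Python sketch flags terms whose document frequency exceeds a threshold. It is an illustration only, not the library's implementation: the function name `auto_stop_words` and the sample documents are made up, and the strict `>` comparison is an assumption based on the ">70% of documents" comment above.

```python
from collections import Counter

def auto_stop_words(docs: list[list[str]], threshold: float = 0.85) -> set[str]:
    """Return terms that occur in more than `threshold` of all documents."""
    if not docs:
        return set()
    doc_freq = Counter()
    for tokens in docs:
        doc_freq.update(set(tokens))  # count each term once per document
    return {term for term, count in doc_freq.items() if count / len(docs) > threshold}

# Hypothetical tokenized corpus: "the" occurs in 3 of 4 documents (75%).
docs = [
    ["the", "cat", "sat"],
    ["the", "dog", "ran"],
    ["the", "cat", "ran"],
    ["a", "bird", "flew"],
]
print(auto_stop_words(docs, threshold=0.7))
print(auto_stop_words(docs, threshold=0.85))
```

With the lower threshold, "the" is treated as a stop word; with the default of 0.85 nothing in this tiny corpus qualifies.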

---

### BM25 Parameters (`bm25: BM25Options`)

Tune the BM25 probabilistic ranking model for your document collection.

| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `k1` | `float?` | `1.5` | Term frequency saturation (0.0-3.0) |
| `b` | `float?` | `0.75` | Length normalization (0.0-1.0) |
| `variant` | `BM25Variant?` | `lucene` | BM25 algorithm variant |
| `delta` | `float?` | `0.5` | BM25+ delta parameter (for `plus` variant) |

#### BM25 Variants

| Variant | IDF Formula | Characteristics |
|---------|-------------|-----------------|
| `lucene` | `log(1 + (N-df+0.5)/(df+0.5))` | Always positive, smooth -- recommended default |
| `plus` | `log((N+1)/df)` | Adds delta term for long documents |
| `bm25l` | `log((N+1)/(df+0.5))` | Modified length normalization |
| `atire` | `log(N/df)` | Simple ratio |
| `robertson` | `log((N-df+0.5)/(df+0.5))` | Can be negative for common terms |
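
To make the differences between the variants concrete, the IDF formulas in the table can be evaluated side by side. The sketch below uses Python purely as a calculator; it is not the library's code, and the function name `idf` and the corpus sizes are hypothetical.

```python
import math

def idf(variant: str, n_docs: int, df: int) -> float:
    """Evaluate the IDF formula listed for each variant in the table above."""
    if variant == "lucene":
        return math.log(1 + (n_docs - df + 0.5) / (df + 0.5))
    if variant == "plus":
        return math.log((n_docs + 1) / df)
    if variant == "bm25l":
        return math.log((n_docs + 1) / (df + 0.5))
    if variant == "atire":
        return math.log(n_docs / df)
    if variant == "robertson":
        return math.log((n_docs - df + 0.5) / (df + 0.5))
    raise ValueError(f"unknown variant: {variant}")

# Hypothetical corpus of 1000 documents: a rare term (df=10) and a common one (df=900).
for name in ("lucene", "plus", "bm25l", "atire", "robertson"):
    print(f"{name:10s} rare={idf(name, 1000, 10):.3f} common={idf(name, 1000, 900):.3f}")
```

For the common term, only `robertson` goes negative (the term appears in more than half of the documents), which is the caveat noted in the table.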

#### Parameter Tuning

**k1 (Term Frequency Saturation)**

- **Lower (0.5-1.2):** Less emphasis on high TF -- good for noisy data
- **Default (1.5):** Balanced
- **Higher (2.0-3.0):** More emphasis on high TF -- good for long documents

**b (Length Normalization)**

- **Lower (0.0-0.5):** Less penalty for long documents
- **Default (0.75):** Balanced
- **Higher (0.8-1.0):** More penalty for long documents
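
The effect of `k1` and `b` is easier to see on the textbook BM25 term-frequency component. The Python sketch below assumes that textbook form (the individual variants differ in details) and omits the IDF factor; `bm25_tf` is a hypothetical helper name, not part of the library.

```python
def bm25_tf(tf: float, doc_len: float, avg_len: float,
            k1: float = 1.5, b: float = 0.75) -> float:
    """Textbook BM25 term-frequency component, without the IDF factor."""
    length_norm = 1.0 - b + b * (doc_len / avg_len)
    return tf * (k1 + 1.0) / (tf + k1 * length_norm)

# Saturation: the score grows with tf but never reaches k1 + 1.
for tf in (1, 2, 5, 20):
    print(tf, round(bm25_tf(tf, doc_len=100, avg_len=100), 3))

# Length normalization: with b = 0 the document length has no effect,
# with b = 1 a document twice the average length is penalized the most.
for b in (0.0, 0.75, 1.0):
    print(b, round(bm25_tf(3, doc_len=200, avg_len=100, b=b), 3))
```

A lower `k1` makes the curve flatten sooner, so repeated occurrences add less; a higher `b` lowers the score of longer-than-average documents more strongly.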

#### Examples

```gcl
// Short documents (tweets, product titles)
bm25: BM25Options { k1: 1.2, b: 0.3 }

// Long documents (articles, papers)
bm25: BM25Options { k1: 1.8, b: 0.9, variant: BM25Variant::plus, delta: 0.5 }

// Verbose documents with repetition
bm25: BM25Options { k1: 1.0, b: 0.5 }
```

---

### Semantic Search (top-level `embed`)

Configure a user-provided embedding function for vector similarity search.

| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `embed` | `function?` | -- | User-provided embedding function: `fn(text: String): Tensor` |

#### Example

```gcl
// Define your embedding function (`model` is assumed to be defined elsewhere)
fn my_embed(text: String): Tensor {
    return ai::embed(text, model);
}

// Pass it in the config
config: TextIndexConfig {
    embed: my_embed
}

// Or use pre-computed vectors with add_batch()
var entries = Array<TextEntry> {};
entries.add(TextEntry { key: "doc text", value: "doc1", vector: precomputedTensor });
index.add_batch(entries);
```

---

### Text Chunking (`chunking: ChunkingOptions`)

Split long documents into overlapping chunks for semantic search and RAG pipelines.

| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `strategy` | `ChunkStrategy?` | `none` | Chunking algorithm |
| `size` | `int?` | `256` | Target chunk size (words) |
| `overlap` | `int?` | `50` | Overlapping words between chunks |

#### Chunk Strategies

| Strategy | Description | Best For |
|----------|-------------|----------|
| `none` | No chunking (full documents) | Short documents |
| `fixed` | Fixed word count | Uniform processing |
| `sentence` | Split on `.!?` boundaries | Natural language |
| `paragraph` | Split on double newlines | Structured text |
| `recursive` | Adaptive (para -> sent -> fixed) | Mixed content |
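
The interaction between `size` and `overlap` can be sketched for the `fixed` strategy as follows. This is a Python illustration under the assumption that chunks advance by `size - overlap` words; the helper name `fixed_chunks` is hypothetical and the library's actual splitting may differ.

```python
def fixed_chunks(text: str, size: int = 256, overlap: int = 50) -> list[str]:
    """Split `text` into chunks of up to `size` words that share `overlap` words."""
    if size <= 0 or not 0 <= overlap < size:
        raise ValueError("require size > 0 and 0 <= overlap < size")
    words = text.split()
    step = size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + size]))
        if start + size >= len(words):
            break  # this chunk already reaches the end of the text
    return chunks

# Ten placeholder words, chunks of 4 words with 1 word of overlap.
text = " ".join(f"w{i}" for i in range(10))
for chunk in fixed_chunks(text, size=4, overlap=1):
    print(chunk)
```

Each chunk starts with the last word of the previous one, so context that straddles a boundary is still present in at least one chunk.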

#### Examples

```gcl
// Short documents (<500 words)
chunking: ChunkingOptions { strategy: ChunkStrategy::none }

// Medium documents (500-2000 words)
chunking: ChunkingOptions {
    strategy: ChunkStrategy::sentence,
    size: 128,
    overlap: 20
}

// Long documents (>2000 words)
chunking: ChunkingOptions {
    strategy: ChunkStrategy::recursive,
    size: 256,
    overlap: 50
}

// Technical documents (code, logs)
chunking: ChunkingOptions {
    strategy: ChunkStrategy::fixed,
    size: 512,
    overlap: 100
}
```

---

### Fusion & Weights (`fusion: FusionOptions`)

Configure how multiple search modes combine their scores in hybrid search. Per-mode weights live inside a `Map<SearchMode, float>`.

| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `method` | `FusionMethod?` | `rrf` | Score fusion algorithm |
| `normalization` | `Normalization?` | `minmax` | Normalization for linear fusion |
| `weights` | `Map<SearchMode, float>?` | per-mode defaults | Per-mode fusion weights |
| `rrf` | `RRFOptions?` | -- | RRF tuning parameters |

#### Default Per-Mode Weights

Used for any mode that `weights` does not set:

`bm25=0.4`, `semantic=0.6`, `fuzzy=0.2`, `exact=0.3`, `proximity=0.0`, `boolean=0.3`, `phrase=0.5`, `prefix=0.3`, `wildcard=0.2`, `dfr=0.3`, `lm_dirichlet=0.3`, `phonetic=0.3`, `quorum=0.3`, `span=0.3`.

#### `RRFOptions`

| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `k` | `int?` | `60` | RRF smoothing constant |
| `topRankBonus` | `bool?` | `true` | Enable the score bonus for #1-ranked results |
| `topBonus` | `float?` | `0.05` | Score bonus for #1 ranked results |
| `nearTopBonus` | `float?` | `0.02` | Score bonus for near-top results |
| `nearTopCutoff` | `int?` | `2` | Max rank for near-top bonus |

#### Weight Tuning Examples

```gcl
// Favor BM25 (keyword precision)
var w = Map<SearchMode, float> {};
w.set(SearchMode::bm25, 0.7);
w.set(SearchMode::semantic, 0.3);

config: TextIndexConfig {
    fusion: FusionOptions { method: FusionMethod::rrf, weights: w }
}
```

```gcl
// Favor semantic (conceptual search)
var w = Map<SearchMode, float> {};
w.set(SearchMode::bm25, 0.3);
w.set(SearchMode::semantic, 0.7);

config: TextIndexConfig {
    fusion: FusionOptions { method: FusionMethod::rrf, weights: w }
}
```

```gcl
// Balanced multi-mode
var w = Map<SearchMode, float> {};
w.set(SearchMode::bm25, 0.5);
w.set(SearchMode::semantic, 0.5);
w.set(SearchMode::fuzzy, 0.2);
w.set(SearchMode::exact, 0.3);

config: TextIndexConfig {
    fusion: FusionOptions { weights: w }
}
```

#### RRF Tuning

```gcl
// Higher k = more emphasis on lower ranks (smoother)
fusion: FusionOptions {
    method: FusionMethod::rrf,
    rrf: RRFOptions { k: 100 }
}

// Lower k = more emphasis on top ranks (sharper)
fusion: FusionOptions {
    method: FusionMethod::rrf,
    rrf: RRFOptions { k: 20, topRankBonus: true }
}
```
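
The role of `k` is easiest to see in the standard Reciprocal Rank Fusion formula, where each mode contributes `1 / (k + rank)` per document. The Python sketch below assumes per-mode weights are applied as simple multipliers and leaves out the `topBonus` / `nearTopBonus` adjustments; `rrf_fuse` and the sample rankings are hypothetical, not the library's implementation.

```python
def rrf_fuse(rankings: dict[str, list[str]], weights: dict[str, float],
             k: int = 60) -> list[tuple[str, float]]:
    """Weighted RRF: each mode adds weight / (k + rank), with rank starting at 1."""
    scores: dict[str, float] = {}
    for mode, ranked_docs in rankings.items():
        weight = weights.get(mode, 1.0)
        for rank, doc in enumerate(ranked_docs, start=1):
            scores[doc] = scores.get(doc, 0.0) + weight / (k + rank)
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)

# Hypothetical per-mode result lists, best match first.
rankings = {
    "bm25": ["d1", "d2", "d3"],
    "semantic": ["d2", "d3", "d1"],
}
weights = {"bm25": 0.4, "semantic": 0.6}

for doc, score in rrf_fuse(rankings, weights, k=60):
    print(doc, round(score, 5))
```

Because the contribution of rank 1 relative to rank 2 is `(k + 2) / (k + 1)`, a small `k` such as 20 separates top ranks more sharply than a large `k` such as 100.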

---

### Optimization (`shortCircuit: ShortCircuitOptions`, `deduplicateContent`)

Performance tuning for large indices and high-throughput query workloads.

| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `shortCircuit.enabled` | `bool?` | `true` | Skip the remaining modes when BM25 has a strong signal |
| `shortCircuit.minScore` | `float?` | `0.85` | Min normalized score of the top result to trigger short-circuit |
| `shortCircuit.minGap` | `float?` | `0.15` | Min score gap between the #1 and #2 results |
| `deduplicateContent` | `bool?` | `false` | Skip duplicate documents (SHA-256 hash) |

#### Examples

```gcl
// Aggressive (faster, may miss some results)
shortCircuit: ShortCircuitOptions {
    enabled: true,
    minScore: 0.8,
    minGap: 0.1
}

// Conservative (slower, more comprehensive)
shortCircuit: ShortCircuitOptions {
    enabled: true,
    minScore: 0.95,
    minGap: 0.3
}

// Disabled
shortCircuit: ShortCircuitOptions { enabled: false }
```
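
How `minScore` and `minGap` interact can be sketched as a simple predicate. This Python snippet is an assumption about the decision rule based on the parameter descriptions above (a strong top score and a clear lead over the runner-up); `should_short_circuit` is a hypothetical name, and treating a missing second result as score 0 is also an assumption.

```python
def should_short_circuit(scores: list[float], min_score: float = 0.85,
                         min_gap: float = 0.15) -> bool:
    """`scores` are normalized BM25 scores sorted in descending order."""
    if not scores:
        return False
    top = scores[0]
    runner_up = scores[1] if len(scores) > 1 else 0.0
    return top >= min_score and (top - runner_up) >= min_gap

print(should_short_circuit([0.9, 0.6]))               # strong and clearly ahead
print(should_short_circuit([0.9, 0.85]))              # strong but not clearly ahead
print(should_short_circuit([0.9, 0.6], 0.95, 0.3))    # conservative thresholds
```

Lower thresholds trigger the shortcut more often (faster, less thorough); higher thresholds keep the other modes running for all but the most clear-cut queries.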

---

### Diversity (`diversify: DiversifyOptions`)

Maximal Marginal Relevance (MMR) re-ranking to reduce redundancy in results.

| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `enabled` | `bool?` | `false` | Enable MMR diversity re-ranking |
| `lambda` | `float?` | `0.7` | MMR lambda (0.0 = max diversity, 1.0 = pure relevance) |

#### Lambda Tuning

- **0.0:** Maximum diversity, minimum relevance
- **0.3:** High diversity (news feeds, recommendations)
- **0.5:** Balanced (research papers, general search)
- **0.8:** High relevance (question answering)
- **1.0:** Maximum relevance, no diversity
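
The lambda trade-off follows from the standard greedy MMR rule: at each step, pick the candidate that maximizes `lambda * relevance - (1 - lambda) * max_similarity_to_selected`. The Python sketch below illustrates that rule only; the function `mmr`, the toy relevance scores, and the similarity function are hypothetical and do not reflect the library's internals.

```python
from typing import Callable

def mmr(relevance: dict[str, float], similarity: Callable[[str, str], float],
        k: int, lam: float = 0.7) -> list[str]:
    """Greedy Maximal Marginal Relevance selection of up to `k` documents."""
    selected: list[str] = []
    candidates = set(relevance)
    while candidates and len(selected) < k:
        def mmr_score(doc: str) -> float:
            redundancy = max((similarity(doc, s) for s in selected), default=0.0)
            return lam * relevance[doc] - (1.0 - lam) * redundancy
        best = max(sorted(candidates), key=mmr_score)  # sorted() keeps ties deterministic
        selected.append(best)
        candidates.remove(best)
    return selected

# Toy data: "a2" is a near-duplicate of "a"; "b" is less relevant but different.
relevance = {"a": 0.9, "a2": 0.85, "b": 0.6}

def toy_similarity(x: str, y: str) -> float:
    return 0.95 if {x, y} == {"a", "a2"} else 0.1

print(mmr(relevance, toy_similarity, k=2, lam=1.0))
print(mmr(relevance, toy_similarity, k=2, lam=0.3))
```

With `lam=1.0` the near-duplicate takes the second slot; with `lam=0.3` the redundancy penalty lets the less relevant but distinct document in instead.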

#### Examples

```gcl
// High diversity (e.g., news, recommendations)
diversify: DiversifyOptions { enabled: true, lambda: 0.3 }

// Balanced (e.g., research papers)
diversify: DiversifyOptions { enabled: true, lambda: 0.5 }

// High relevance (e.g., question answering)
diversify: DiversifyOptions { enabled: true, lambda: 0.8 }
```

---

### Highlighting (`highlight: HighlightOptions`)

Customize how matched terms are wrapped in search results and snippets.

| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `preTag` | `String?` | `<em>` | Tag before matched term |
| `postTag` | `String?` | `</em>` | Tag after matched term |

#### Examples

```gcl
// HTML bold
highlight: HighlightOptions { preTag: "<strong>", postTag: "</strong>" }

// Markdown
highlight: HighlightOptions { preTag: "**", postTag: "**" }

// Custom CSS class
highlight: HighlightOptions { preTag: "<span class=\"highlight\">", postTag: "</span>" }

// ANSI terminal colors (red); each sequence must start with the ESC control character (0x1B)
highlight: HighlightOptions { preTag: "[31m", postTag: "[0m" }
```

---

### Synonyms and Fields (top-level)

Stemming lives under `tokenization.stemming`. Synonyms and BM25F fields are exposed at the top level of `TextIndexConfig`. Meilisearch-style ranking rules are not driven from this config — see `RankingRulesEngine::apply()` in [Function Scoring & Curation](./function-scoring.md).

| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `synonyms` | `Map<String, Array<String>>?` | -- | Synonym expansion map |
| `fields` | `Array<FieldConfig>?` | -- | BM25F field configurations |

#### Synonyms

```gcl
var synMap = Map<String, Array<String>> {};
synMap.set("ml", ["machine learning", "artificial intelligence"]);
synMap.set("nlp", ["natural language processing", "text analysis"]);
synMap.set("nn", ["neural network", "deep learning"]);

config: TextIndexConfig { synonyms: synMap }

// Query "ml algorithms" expands to:
// "ml machine learning artificial intelligence algorithms"
```
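
The expansion shown in the comment above can be reproduced with a few lines of Python. This is a sketch of the behavior as documented, not the library's code; `expand_query` is a hypothetical helper, and splitting the query on whitespace is an assumption.

```python
def expand_query(query: str, synonyms: dict[str, list[str]]) -> str:
    """Insert each term's synonyms directly after the term itself."""
    expanded: list[str] = []
    for term in query.split():
        expanded.append(term)
        expanded.extend(synonyms.get(term, []))
    return " ".join(expanded)

synonyms = {
    "ml": ["machine learning", "artificial intelligence"],
    "nlp": ["natural language processing", "text analysis"],
}
print(expand_query("ml algorithms", synonyms))
```

Terms without an entry in the map pass through unchanged, so the map only needs to list the abbreviations or variants you care about.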

#### BM25F Fields

`FieldConfig.f` is a typed `field` reference (the same mechanism used by
`SlidingWindow<T>` / `TimeWindow<T>` in `std`) that points at a `String` field
of your document type. Field references are checked at compile time, so a typo
becomes a compile error instead of a silent runtime miss. Omit `fields:`
entirely to let `add_fields` auto-discover every `String` / `String?` field on
`T` with weight 1.0.

```gcl
type Article { title: String; body: String; tags: String?; }

var fields = Array<FieldConfig> {};
fields.add(FieldConfig { f: Article::title, weight: 3.0, fieldB: 0.3 });
fields.add(FieldConfig { f: Article::body,  weight: 1.0, fieldB: 0.75 });
fields.add(FieldConfig { f: Article::tags,  weight: 2.0, fieldB: 0.0 });

config: TextIndexConfig { fields: fields }
```

---

### Typo Tolerance (`typoTolerance: TypoOptions`)

Automatic typo tolerance for BM25 queries. It applies to `search()` when enabled here or per query via `SearchOptions.typoTolerance`.

| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `enabled` | `bool?` | `false` | Enable automatic typo tolerance |
| `minWordLength` | `int?` | `4` | Min word length to apply typo tolerance (shorter words get 0 typos) |
| `maxEdits1` | `int?` | `1` | Max typos for words 5-8 chars long |
| `maxEdits2` | `int?` | `2` | Max typos for words 9+ chars long |

```gcl
typoTolerance: TypoOptions {
    enabled: true,
    minWordLength: 4,
    maxEdits1: 1,
    maxEdits2: 2
}
```
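
The length-based edit budget described in the table can be sketched as below. This Python snippet is an interpretation, not the library's logic: `max_typos` is a hypothetical helper, and it assumes that any word from `minWordLength` up to 8 characters gets `maxEdits1` (the table only lists 5-8 explicitly).

```python
def max_typos(word: str, min_word_length: int = 4,
              max_edits1: int = 1, max_edits2: int = 2) -> int:
    """Return the number of typos tolerated for `word`, based on its length."""
    length = len(word)
    if length < min_word_length:
        return 0            # short words must match exactly
    if length >= 9:
        return max_edits2   # long words tolerate the most edits
    return max_edits1       # assumption: everything in between gets max_edits1

for word in ("cat", "search", "tokenization"):
    print(word, max_typos(word))
```

Keeping short words exact avoids spurious matches, since a single edit changes a large fraction of a three-letter word.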

---

### Edge N-Grams (`edgeNgram: EdgeNgramOptions`)

Build a prefix-search edge n-gram index for O(1) prefix lookups.

| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `enabled` | `bool?` | `false` | Enable edge n-gram indexing |
| `min` | `int?` | `2` | Min prefix length to index |
| `max` | `int?` | `20` | Max prefix length to index |

```gcl
edgeNgram: EdgeNgramOptions { enabled: true, min: 2, max: 20 }
```
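
For illustration, an edge n-gram index can be thought of as a map from every prefix of length `min` to `max` to the terms that start with it. The Python sketch below shows that idea with hypothetical helper names (`edge_ngrams`, `build_prefix_index`); the library's actual data structure is not documented here.

```python
def edge_ngrams(term: str, min_len: int = 2, max_len: int = 20) -> list[str]:
    """Return the prefixes of `term` with lengths from `min_len` to `max_len`."""
    upper = min(len(term), max_len)
    return [term[:n] for n in range(min_len, upper + 1)]

def build_prefix_index(terms: list[str], min_len: int = 2,
                       max_len: int = 20) -> dict[str, set[str]]:
    """Map each edge n-gram to the set of terms that start with it."""
    index: dict[str, set[str]] = {}
    for term in terms:
        for gram in edge_ngrams(term, min_len, max_len):
            index.setdefault(gram, set()).add(term)
    return index

index = build_prefix_index(["search", "seal", "index"])
print(edge_ngrams("search"))
print(sorted(index["sea"]))
```

A prefix query then becomes a single dictionary lookup, at the cost of storing one entry per prefix length for every term.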

---

### DFR (`dfr: DFROptions`)

Configure the Divergence From Randomness (DFR) scoring model, an information-theoretic alternative to BM25.

| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `basicModel` | `DFRBasicModel?` | `G` | Basic information model (G, In, Ine, IF) |
| `afterEffect` | `DFRAfterEffect?` | `Laplace` | After-effect normalization (Laplace, Bernoulli) |
| `normalization` | `DFRNormalization?` | `H2` | Length normalization (H1, H2, H3, Z) |

```gcl
dfr: DFROptions {
    basicModel: DFRBasicModel::In,
    afterEffect: DFRAfterEffect::Bernoulli,
    normalization: DFRNormalization::H2
}
```

---

### Language Model (`lmDirichlet: LMDirichletOptions`)

Configure Language Model scoring with Dirichlet smoothing.

| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `mu` | `float?` | `2000` | Dirichlet prior smoothing parameter (higher = more smoothing) |

```gcl
lmDirichlet: LMDirichletOptions { mu: 2000.0 }
```
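
The effect of `mu` can be seen in the standard Dirichlet-smoothed estimate `p(term | doc) = (tf + mu * p(term | collection)) / (doc_len + mu)`. The Python sketch below evaluates that textbook formula for a single term; `lm_dirichlet_term` and the sample numbers are hypothetical, and how the library combines per-term scores is not shown.

```python
import math

def lm_dirichlet_term(tf: int, doc_len: int, collection_prob: float,
                      mu: float = 2000.0) -> float:
    """Log of the Dirichlet-smoothed probability of a term given a document."""
    return math.log((tf + mu * collection_prob) / (doc_len + mu))

# Hypothetical term: 5 occurrences in a 100-word document, 0.1% of the collection.
for mu in (200.0, 2000.0, 20000.0):
    prob = math.exp(lm_dirichlet_term(tf=5, doc_len=100, collection_prob=0.001, mu=mu))
    print(mu, round(prob, 5))
```

As `mu` grows, the estimate moves away from the document's own frequency (5/100) toward the collection-wide probability, which is what "more smoothing" means in the table.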

---

### Phonetic Search (`usePhonetic`)

| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `usePhonetic` | `bool?` | `false` | Build phonetic index (Double Metaphone) at `build()` time |

---

### Fuzzy Search Limits (`fuzzyMaxTextLength`)

| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `fuzzyMaxTextLength` | `int?` | `500` | Max document text length for document-level fuzzy matching |

---

### Advanced Normalization (`tokenization.normOptions: NormOptions`)

Pre-processing pipeline applied before the standard normalize step (CharMap -> casefold -> whitespace collapse). Each flag enables an independent normalization stage.

| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `stripAccents` | `bool?` | `false` | Remove diacritics (é -> e) |
| `stripControlChars` | `bool?` | `false` | Remove ASCII 0-31 |
| `stripHtmlTags` | `bool?` | `false` | Remove `<tag>` elements |
| `decodeHtmlEntities` | `bool?` | `false` | Decode `&amp;` -> `&` |
| `stripUrls` | `bool?` | `false` | Remove `http://` and `www.` links |
| `stripEmails` | `bool?` | `false` | Remove email addresses |
| `normalizeQuotes` | `bool?` | `false` | Smart quotes -> ASCII |
| `normalizeLineBreaks` | `bool?` | `false` | `\r\n` -> `\n` |
| `normalizeRepeatingChars` | `bool?` | `false` | Limit consecutive repeating characters |
| `maxRepeat` | `int?` | `3` | Max consecutive chars (requires `normalizeRepeatingChars`) |
| `rejoinHyphenatedWords` | `bool?` | `false` | "data- base" -> "database" |

#### Example

```gcl
config: TextIndexConfig {
    tokenization: TokenizationOptions {
        normOptions: NormOptions {
            // Web scraping
            stripHtmlTags: true,
            decodeHtmlEntities: true,
            stripUrls: true,

            // Social media
            normalizeRepeatingChars: true,
            maxRepeat: 3,  // "hellooooo" -> "hellooo"

            // Multilingual
            stripAccents: true,  // "café" -> "cafe"

            // Clean text
            stripControlChars: true,
            normalizeLineBreaks: true,
            normalizeQuotes: true
        }
    }
}
```
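
Two of these stages are simple enough to sketch in a few lines. The Python below shows one plausible reading of `stripAccents` (Unicode decomposition, then dropping combining marks) and of `normalizeRepeatingChars` with `maxRepeat`; the helper names are hypothetical and the library may implement these stages differently.

```python
import re
import unicodedata

def strip_accents(text: str) -> str:
    """Decompose characters (NFD) and drop the combining marks."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))

def limit_repeats(text: str, max_repeat: int = 3) -> str:
    """Collapse runs of one character longer than `max_repeat` down to `max_repeat`."""
    pattern = re.compile(r"(.)\1{%d,}" % max_repeat)
    return pattern.sub(lambda m: m.group(1) * max_repeat, text)

print(strip_accents("café"))
print(limit_repeats("hellooooo", max_repeat=3))
```

Under this reading, `maxRepeat: 3` shortens a run of five identical letters to three and leaves shorter runs untouched.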

---

## SearchOptions

Override `TextIndexConfig` defaults on a per-query basis. All parameters are optional. Pass to `index.search(query, k, options)`.

| Parameter | Type | Description |
|-----------|------|-------------|
| `modes` | `Array<SearchMode>?` | Which search modes to run (single = direct dispatch, multiple = fuse) |
| `weights` | `Map<SearchMode, float>?` | Per-mode weights (overrides `config.fusion.weights`) |
| `fusionMethod` | `FusionMethod?` | Override fusion method |
| `normalization` | `Normalization?` | Override normalization method |
| `rrf_k` | `int?` | Override RRF k parameter |
| `fuzzy` | `FuzzyOptions?` | Per-engine fuzzy parameters (used when `SearchMode::fuzzy` is in `modes`) |
| `phrase` | `PhraseOptions?` | Per-engine phrase parameters |
| `proximity` | `ProximityOptions?` | Per-engine proximity parameters |
| `typoTolerance` | `bool?` | Enable typo tolerance in BM25 mode |
| `minScore` | `float?` | Filter results below threshold |
| `diversify` | `bool?` | Override diversity setting |
| `diversityLambda` | `float?` | Override MMR lambda (0.0 = max diversity, 1.0 = pure relevance) |
| `offset` | `int?` | Skip first N results for pagination (default: 0) |
| `proximityFilter` | `bool?` | Discard docs where no query term pair appears within `proximity.distance` |
| `filter` | `Array<String>?` | Restrict the search to a subset of document keys |
| `termBoosts` | `Array<TermBoost>?` | Per-term boost multipliers for BM25 scoring |
| `quorumMinMatch` | `int?` | Minimum match count for quorum queries (default: 1) |

> Function scoring, curation, and Meilisearch-style ranking rules are applied as standalone post-processing helpers (`FunctionScoreEngine::apply()`, `CurationHelper::apply_curation()`, `RankingRulesEngine::apply()`) rather than via `SearchOptions`. See [Function Scoring & Curation](./function-scoring.md).

### Per-engine option blocks

#### `FuzzyOptions`

| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `maxEdits` | `int?` | `2` | Maximum Levenshtein edit distance |
| `mode` | `FuzzyMode?` | `key` | `key` = whole-document matching, `term` = per-token vocabulary matching |
| `maxTextLength` | `int?` | from `config.fuzzyMaxTextLength` | Skip docs whose text exceeds this length |

#### `PhraseOptions`

| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `slop` | `int?` | `0` | Maximum positional deviation between query terms (0 = exact phrase) |

#### `ProximityOptions`

| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `distance` | `int?` | `5` | Maximum token distance between the two terms |

---

### Examples

#### High-Precision Query

```gcl
var w = Map<SearchMode, float> {};
w.set(SearchMode::bm25, 0.9);
w.set(SearchMode::fuzzy, 0.1);

var _options = SearchOptions {
    weights: w,
    minScore: 0.5
};
```

#### Broad Semantic Query

```gcl
var w = Map<SearchMode, float> {};
w.set(SearchMode::bm25, 0.2);
w.set(SearchMode::semantic, 0.8);

var _options = SearchOptions { weights: w };
```

#### Exact Match Only

```gcl
var modes = Array<SearchMode> {};
modes.add(SearchMode::exact);

var _options = SearchOptions { modes: modes };
```

#### Diverse Results

```gcl
var _options = SearchOptions {
    diversify: true,
    diversityLambda: 0.3
};
```

#### Paginated Results (Page 2, 10 Per Page)

```gcl
// Skip the first 10 results; pass k = 10 to search() to get a page of 10
var _options = SearchOptions { offset: 10 };
```

#### Phrase Search with Slop

```gcl
var modes = Array<SearchMode> {};
modes.add(SearchMode::phrase);

var _options = SearchOptions {
    modes: modes,
    phrase: PhraseOptions { slop: 2 }
};
```

#### Fuzzy with Custom Edit Distance

```gcl
var modes = Array<SearchMode> {};
modes.add(SearchMode::fuzzy);

var _options = SearchOptions {
    modes: modes,
    fuzzy: FuzzyOptions { maxEdits: 1, mode: FuzzyMode::term }
};
```
