# Faceted Search & Aggregations

This document covers two features for analytical search: **Faceted Search** (grouping results by field values or numeric ranges) and **Aggregations** (computing metrics and histograms over search results).

Both features read directly from the typed document stored on each `IndexEntry.value`, so documents must be indexed via `add_fields(value: T)` (or `add_batch` of typed values). There is no separate "field data" store. Field references are typed: every facet, metric, and histogram request names a `field` of the document type, and that reference is resolved at compile time.

## Setting Up Field Data

Define a document type and index it with `add_fields`:

```gcl
type Product {
    title: String;
    description: String;
    category: String;
    brand: String;
    price: float;
    rating: float;
}

var fieldConfigs = Array<FieldConfig> {};
fieldConfigs.add(FieldConfig { f: Product::title,       weight: 3.0 });
fieldConfigs.add(FieldConfig { f: Product::description, weight: 1.0 });

var index = TextIndex<Product> {
    config: TextIndexConfig {
        fields: fieldConfigs,
        stopWords: StopWordOptions { mode: StopWordMode::default }
    }
};

index.add_fields(Product {
    title: "Gaming Laptop 15 inch",
    description: "High-performance laptop for gaming",
    category: "electronics",
    brand: "BrandX",
    price: 1299.99,
    rating: 4.5
});
index.add_fields(Product {
    title: "Office Laptop 14 inch",
    description: "Lightweight laptop for business use",
    category: "electronics",
    brand: "BrandY",
    price: 899.00,
    rating: 4.2
});
index.add_fields(Product {
    title: "Wireless Mouse",
    description: "Ergonomic wireless mouse",
    category: "accessories",
    brand: "BrandX",
    price: 49.99,
    rating: 4.0
});

index.build();
```

## Faceted Search

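The `search_faceted()` method takes a query, a result limit, and a list of facet requests, and returns the ranked results together with facet counts. It supports term facets and numeric range facets, a per-facet cap on the number of term values returned (`maxTerms`), and it reports term and numeric facets in separate maps (`termFacets` and `numericFacets`).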

### FacetRequest

Each facet request specifies a typed field reference, facet type, and type-specific options:

```gcl
@volatile
type FacetRequest {
    f: field;                            // Typed reference to the field to facet on
    facetType: FacetType?;               // term (default) or numericRange
    ranges: Array<NumericRangeBucket>?;  // Range buckets for numericRange facets
    maxTerms: int?;                      // Max term values to return (default: 10)
}
```

### FacetType

| Type | Description |
|------|-------------|
| `term` | Count documents by distinct string field values |
| `numericRange` | Count documents falling into specified numeric ranges |

### Term Facets

Term facets count documents by distinct values in a `String` field. Results are sorted by count in descending order.

```gcl
var facetRequests = Array<FacetRequest> {};

// Get top 5 categories
facetRequests.add(FacetRequest {
    f: Product::category,
    facetType: FacetType::term,
    maxTerms: 5
});

// Get top 10 brands
facetRequests.add(FacetRequest {
    f: Product::brand,
    facetType: FacetType::term,
    maxTerms: 10
});

var result = index.search_faceted("laptop", 10, facetRequests);

// Access term facets — keyed by typed `field` ref.
var brandFacets = result.termFacets.get(Product::brand);
if (brandFacets != null) {
    for (var i = 0; i < brandFacets.size(); i++) {
        var tc = brandFacets[i];
        info("${tc.value}: ${tc.count}");
    }
}
```
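
To make the counting and ordering rules concrete, here is a small Python model of a term facet. It is only a sketch of the semantics described above, not the library's implementation: `term_facet` is a hypothetical helper, and the alphabetical tie-break is an assumption, since the text only specifies descending count. The sample inputs assume the query matches the two laptop documents.

```python
from collections import Counter


def term_facet(values, max_terms=10):
    """Count distinct string values, most frequent first, keeping at most max_terms."""
    counts = Counter(v for v in values if v is not None)
    # Sort by count descending; ties are broken alphabetically (an assumption).
    ordered = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return ordered[:max_terms]


# Field values of the two sample documents assumed to match "laptop".
brands = ["BrandX", "BrandY"]
categories = ["electronics", "electronics"]

print(term_facet(brands))                   # [('BrandX', 1), ('BrandY', 1)]
print(term_facet(categories, max_terms=5))  # [('electronics', 2)]
```

Note that `max_terms` truncates the list after sorting, so a low cap hides rare values rather than frequent ones.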

### TermCount Type

```gcl
@volatile
type TermCount {
    value: String;   // Facet value
    count: int;      // Number of matching documents with this value
}
```

### Numeric Range Facets

Numeric range facets count documents whose field values fall within defined ranges. Each range is defined by a `NumericRangeBucket` with a human-readable label, an inclusive lower bound, and an exclusive upper bound.

```gcl
var facetRequests = Array<FacetRequest> {};

facetRequests.add(FacetRequest {
    f: Product::price,
    facetType: FacetType::numericRange,
    ranges: [
        NumericRangeBucket { label: "budget",    from: 0.0,    to: 100.0 },
        NumericRangeBucket { label: "mid-range", from: 100.0,  to: 500.0 },
        NumericRangeBucket { label: "premium",   from: 500.0,  to: 2000.0 },
        NumericRangeBucket { label: "luxury",    from: 2000.0, to: null }
    ]
});

var result = index.search_faceted("laptop", 10, facetRequests);

var priceFacets = result.numericFacets.get(Product::price);
if (priceFacets != null) {
    for (var i = 0; i < priceFacets.size(); i++) {
        var bc = priceFacets[i];
        info("${bc.label}: ${bc.count} products");
    }
}
```

### NumericRangeBucket and NumericBucketCount Types

```gcl
@volatile
type NumericRangeBucket {
    label: String;    // Human-readable label (e.g., "cheap", "0-100")
    from: float?;     // Lower bound (inclusive). Null means -infinity.
    to: float?;       // Upper bound (exclusive). Null means +infinity.
}

@volatile
type NumericBucketCount {
    label: String;    // Label from the bucket definition
    count: int;       // Number of documents in this range
}
```
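
The half-open interval rule (inclusive `from`, exclusive `to`, null meaning unbounded) can be modelled in a few lines of Python. This is a sketch of the documented semantics only: `Bucket`, `in_bucket`, and `range_facet` are hypothetical names, and testing each bucket independently (so overlapping buckets could both count a value) is an assumption.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Bucket:
    label: str
    lo: Optional[float]  # inclusive lower bound; None means -infinity
    hi: Optional[float]  # exclusive upper bound; None means +infinity


def in_bucket(value, b):
    """True when lo <= value < hi, treating None as an open end."""
    return (b.lo is None or value >= b.lo) and (b.hi is None or value < b.hi)


def range_facet(values, buckets):
    # Assumption: every bucket is tested independently of the others.
    return [(b.label, sum(1 for v in values if in_bucket(v, b))) for b in buckets]


buckets = [
    Bucket("budget", 0.0, 100.0),
    Bucket("mid-range", 100.0, 500.0),
    Bucket("premium", 500.0, 2000.0),
    Bucket("luxury", 2000.0, None),
]

print(range_facet([1299.99, 899.00], buckets))
# [('budget', 0), ('mid-range', 0), ('premium', 2), ('luxury', 0)]
print(range_facet([100.0], buckets))
# [('budget', 0), ('mid-range', 1), ('premium', 0), ('luxury', 0)]
```

The second call shows why adjacent buckets can share a boundary value without double counting: 100.0 is excluded from "budget" and included in "mid-range".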

### AdvancedFacetedResult Type

```gcl
@volatile
type AdvancedFacetedResult {
    results: Array<TextResult>;                              // Ranked search results
    termFacets: Map<field, Array<TermCount>>;                // Term facet results keyed by typed field ref
    numericFacets: Map<field, Array<NumericBucketCount>>;    // Numeric range facet results keyed by typed field ref
}
```

### Combined Term and Numeric Facets

You can mix both facet types in a single request:

```gcl
var facetRequests = Array<FacetRequest> {};

// Term facet for categories
facetRequests.add(FacetRequest {
    f: Product::category,
    facetType: FacetType::term,
    maxTerms: 10
});

// Term facet for brands
facetRequests.add(FacetRequest {
    f: Product::brand,
    facetType: FacetType::term,
    maxTerms: 5
});

// Numeric range facet for price
facetRequests.add(FacetRequest {
    f: Product::price,
    facetType: FacetType::numericRange,
    ranges: [
        NumericRangeBucket { label: "under-50",  from: 0.0,   to: 50.0 },
        NumericRangeBucket { label: "50-200",    from: 50.0,  to: 200.0 },
        NumericRangeBucket { label: "200-1000",  from: 200.0, to: 1000.0 },
        NumericRangeBucket { label: "1000-plus", from: 1000.0, to: null }
    ]
});

// Numeric range facet for rating
facetRequests.add(FacetRequest {
    f: Product::rating,
    facetType: FacetType::numericRange,
    ranges: [
        NumericRangeBucket { label: "low",       from: 0.0, to: 3.0 },
        NumericRangeBucket { label: "good",      from: 3.0, to: 4.0 },
        NumericRangeBucket { label: "excellent", from: 4.0, to: 5.1 }  // upper bound is exclusive; 5.1 keeps a 5.0 rating in this bucket
    ]
});

var _result = index.search_faceted("laptop", 20, facetRequests);
```

## Metric Aggregations

The `AggregationEngine` computes statistical metrics over the numeric field values of matched documents in a single pass over the results. Numeric fields (`float`/`int`) are read directly off `TextResult.value`; a read of a non-numeric field returns null, and that document is skipped.

### MetricType

| Metric | Description |
|--------|-------------|
| `sum` | Sum of all numeric values |
| `avg` | Arithmetic mean |
| `min` | Minimum value |
| `max` | Maximum value |
| `cardinality` | Count of distinct values |

### Computing Metrics

```gcl
// First, run a search to get matching documents
var results = index.search_bm25("laptop", 50);

// Define metric aggregations
var metrics = Array<MetricAggregation> {};
metrics.add(MetricAggregation { f: Product::price,  metric: MetricType::avg });
metrics.add(MetricAggregation { f: Product::price,  metric: MetricType::min });
metrics.add(MetricAggregation { f: Product::price,  metric: MetricType::max });
metrics.add(MetricAggregation { f: Product::price,  metric: MetricType::sum });
metrics.add(MetricAggregation { f: Product::rating, metric: MetricType::cardinality });

// Compute metrics over the search results (reads typed values off result.value).
var metricResults = AggregationEngine::compute_metrics(results, metrics);

for (var i = 0; i < metricResults.size(); i++) {
    var mr = metricResults[i];
    info("${mr.f.name()} ${mr.metric}: ${mr.value}");
}
```

### MetricAggregation and MetricResult Types

```gcl
@volatile
type MetricAggregation {
    f: field;             // Typed reference to the numeric field to aggregate
    metric: MetricType;   // Metric to compute
}

@volatile
type MetricResult {
    f: field;             // Typed reference to the field aggregated
    metric: MetricType;   // Metric computed
    value: float;         // Computed value
}
```

For `cardinality`, values are deduplicated by their stringified numeric form, so two values count as distinct only if they stringify differently.
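
As an illustration of how all five metrics can come out of one pass, the following Python sketch keeps running totals and a set of stringified values. It models the behavior described in this section and is not the engine's actual code: `compute_metrics` here is a hypothetical standalone function, and returning `None` for an input with no numeric values is an assumption.

```python
def compute_metrics(values):
    """Return sum, avg, min, max and cardinality in one pass, or None if nothing is numeric."""
    total = 0.0
    count = 0
    lowest = None
    highest = None
    seen = set()  # stringified values, used for cardinality
    for v in values:
        if v is None:
            continue  # null reads are skipped
        x = float(v)
        total += x
        count += 1
        lowest = x if lowest is None else min(lowest, x)
        highest = x if highest is None else max(highest, x)
        seen.add(str(x))
    if count == 0:
        return None  # assumption: no numeric input yields no result
    return {
        "sum": total,
        "avg": total / count,
        "min": lowest,
        "max": highest,
        "cardinality": float(len(seen)),
    }


ratings = [4.5, 4.2, None, 4.5]
m = compute_metrics(ratings)
print(m["min"], m["max"], m["cardinality"])  # 4.2 4.5 2.0
```

Cardinality is reported as a float in this sketch to mirror `MetricResult.value`, which is declared as `float`.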

## Histogram Aggregations

Histogram aggregations partition numeric field values into fixed-width buckets and count documents in each bucket.

### Computing Histograms

```gcl
var results = index.search_bm25("laptop", 50);

var histograms = Array<HistogramAggregation> {};
histograms.add(HistogramAggregation {
    f: Product::price,
    interval: 200.0,      // $200 buckets
    minValue: 0.0,        // Start at $0
    maxValue: 2000.0      // End at $2000
});

var histResults = AggregationEngine::compute_histograms(results, histograms);

for (var i = 0; i < histResults.size(); i++) {
    var hr = histResults[i];
    info("Histogram for ${hr.f.name()}:");
    for (var j = 0; j < hr.buckets.size(); j++) {
        var b = hr.buckets[j];
        info("  [${b.from} - ${b.to}): ${b.count} documents");
    }
}
```

### Auto-Detected Bounds

If `minValue` and `maxValue` are omitted, the histogram engine auto-detects them from the actual field values in the result set:

```gcl
histograms.add(HistogramAggregation {
    f: Product::rating,
    interval: 0.5
    // minValue and maxValue auto-detected from data
});
```
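
The fixed-width bucketing can be sketched in Python as follows. This is a model under stated assumptions rather than the engine's implementation: `histogram` is a hypothetical function, values outside `[min, max]` are assumed to be dropped, a value equal to the maximum is assumed to fold into the last bucket, and auto-detected bounds are assumed to be the raw minimum and maximum of the data.

```python
import math


def histogram(values, interval, min_value=None, max_value=None):
    """Return (from, to, count) tuples for fixed-width buckets of size `interval`."""
    nums = [v for v in values if v is not None]
    if not nums or interval <= 0:
        return []
    # Auto-detect bounds from the data when they are not given.
    lo = min(nums) if min_value is None else min_value
    hi = max(nums) if max_value is None else max_value
    n = max(1, math.ceil((hi - lo) / interval))
    counts = [0] * n
    for v in nums:
        if v < lo or v > hi:
            continue  # assumption: out-of-range values are dropped
        # Assumption: a value equal to hi lands in the last bucket.
        i = min(int((v - lo) // interval), n - 1)
        counts[i] += 1
    return [(lo + i * interval, lo + (i + 1) * interval, counts[i]) for i in range(n)]


# Prices of the two laptop documents, with $200 buckets from $0 to $2000.
for start, end, n in histogram([1299.99, 899.00], 200.0, 0.0, 2000.0):
    if n:
        print(f"[{start} - {end}): {n}")
# [800.0 - 1000.0): 1
# [1200.0 - 1400.0): 1
```

With explicit bounds the bucket edges are stable across queries, which makes histograms from different searches comparable; auto-detected bounds shift with the result set.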

### HistogramAggregation, HistogramBucket, and HistogramResult Types

```gcl
@volatile
type HistogramAggregation {
    f: field;            // Typed reference to the numeric field to aggregate
    interval: float;     // Bucket width
    minValue: float?;    // Min value (auto-detected if null)
    maxValue: float?;    // Max value (auto-detected if null)
}

@volatile
type HistogramBucket {
    from: float;         // Lower bound (inclusive)
    to: float;           // Upper bound (exclusive)
    count: int;          // Number of documents in this bucket
}

@volatile
type HistogramResult {
    f: field;                          // Typed reference to the field aggregated
    buckets: Array<HistogramBucket>;   // Histogram buckets
}
```

## Combined Aggregation Request

The `AggregationRequest` and `AggregatedSearchResult` types bundle metrics and histograms into a single request/response pair:

```gcl
@volatile
type AggregationRequest {
    metrics: Array<MetricAggregation>?;
    histograms: Array<HistogramAggregation>?;
}

@volatile
type AggregatedSearchResult {
    results: Array<TextResult>;
    metricResults: Array<MetricResult>?;
    histogramResults: Array<HistogramResult>?;
}
```

### Full Aggregation Example

```gcl
// Search the indexed text fields (title and description)
var results = index.search_bm25("laptop", 100);

// Metrics
var metrics = Array<MetricAggregation> {};
metrics.add(MetricAggregation { f: Product::price,    metric: MetricType::avg });
metrics.add(MetricAggregation { f: Product::price,    metric: MetricType::min });
metrics.add(MetricAggregation { f: Product::price,    metric: MetricType::max });
metrics.add(MetricAggregation { f: Product::rating,   metric: MetricType::avg });
metrics.add(MetricAggregation { f: Product::price,    metric: MetricType::cardinality });
// Note: every metric (including cardinality) reads `field_get(doc, f.offset()) as float?`,
// so the target field must be numeric. For string-valued cardinality, use a term facet
// instead: its number of buckets equals the distinct-value count, provided `maxTerms` is
// at least that large.

// Histograms
var histograms = Array<HistogramAggregation> {};
histograms.add(HistogramAggregation { f: Product::price,  interval: 100.0, minValue: 0.0, maxValue: 2000.0 });
histograms.add(HistogramAggregation { f: Product::rating, interval: 0.5,   minValue: 1.0, maxValue: 5.0 });

// Compute both; each call reads typed values off result.value.
var metricResults = AggregationEngine::compute_metrics(results, metrics);
var histResults = AggregationEngine::compute_histograms(results, histograms);
```
