These notes assume Hive 4.

Optimizer

Hive has two separate cardinality estimation systems that operate at different stages of query compilation.

  1. Calcite-Level Statistics (HiveRelMdSelectivity, HiveRelMdRowCount, etc.)
    • Package: org.apache.hadoop.hive.ql.optimizer.calcite.stats
    • Works on: Calcite RelNodes (logical plan)
    • When used: During the Cost-Based Optimization (CBO) phase
    • Enabled by: hive.cbo.enable = true
  2. Operator-Level Statistics (StatsRulesProcFactory.java)
    • Package: org.apache.hadoop.hive.ql.optimizer.stats.annotation
    • Works on: Hive operator tree (physical plan)
    • When used: After physical plan generation (Tez compilation phase)
    • Entry point: AnnotateWithStatistics transform
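
Both systems ultimately answer the same question: how many rows does each step of the plan produce? The sketch below illustrates the textbook building block behind that question, where a filter's output row count is the input row count scaled by a selectivity. It is not Hive's code. The class and method names (CardinalitySketch, estimateFilterRows, conjunctionSelectivity) are hypothetical, and the independence assumption for ANDed predicates is a common simplification rather than a statement of what HiveRelMdSelectivity or StatsRulesProcFactory actually do.

```java
// Illustrative sketch only: a generic selectivity-based row-count estimate.
// None of these names exist in Hive; they are made up for this example.
public class CardinalitySketch {

    /**
     * Estimated output rows of a filter: input rows scaled by a
     * selectivity in [0, 1], rounded to the nearest whole row.
     */
    public static long estimateFilterRows(long inputRows, double selectivity) {
        if (inputRows < 0) {
            throw new IllegalArgumentException("inputRows must be non-negative");
        }
        if (selectivity < 0.0 || selectivity > 1.0) {
            throw new IllegalArgumentException("selectivity must be in [0, 1]");
        }
        return Math.round(inputRows * selectivity);
    }

    /**
     * Combined selectivity of ANDed predicates, ASSUMING the predicates
     * are statistically independent (product of individual selectivities).
     */
    public static double conjunctionSelectivity(double... selectivities) {
        double combined = 1.0;
        for (double s : selectivities) {
            combined *= s;
        }
        return combined;
    }

    public static void main(String[] args) {
        // Two predicates, each assumed to keep half of the rows.
        double selectivity = conjunctionSelectivity(0.5, 0.5);
        System.out.println(estimateFilterRows(1000L, selectivity));
    }
}
```

In this simplified picture, the two systems differ mainly in what the estimate is attached to: a Calcite RelNode in the first case, a Hive operator in the second.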

Operator-level statistics

Links