Comprehensive Metrics

Go beyond simple click-through rates. Understand search performance with industry-standard IR metrics and interactive visualizations.

Quantifying Search Relevance

Metrics are calculated against a ground truth, which the platform calls "Expected Results". You can maintain one or more Expected Results sets and evaluate against all of them or any subset. A set is typically a CSV file you provide that maps each query to a list of known relevant document URLs. The platform uses this data to calculate a suite of metrics automatically for every test run.
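
To make the idea of a query-to-URL mapping concrete, here is a minimal sketch of reading such a file. The column names `query` and `url`, the one-row-per-pair layout, and the function name `load_expected_results` are assumptions for illustration; the platform's actual CSV format may differ.

```python
import csv
import io
from collections import defaultdict

# Hypothetical layout: one row per (query, relevant URL) pair.
SAMPLE_CSV = """query,url
running shoes,https://example.com/shoes/trail
running shoes,https://example.com/shoes/road
return policy,https://example.com/help/returns
"""

def load_expected_results(csv_text):
    """Group the known relevant URLs by query."""
    expected = defaultdict(set)
    for row in csv.DictReader(io.StringIO(csv_text)):
        expected[row["query"].strip()].add(row["url"].strip())
    return dict(expected)

expected = load_expected_results(SAMPLE_CSV)
```

Storing each query's relevant URLs as a set keeps the membership checks used by the metrics cheap.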

Key Metrics Supported

Precision@K

Measures the exactness of the results: the proportion of the top K results that are relevant. A higher score means more of the results shown at the top of the page are relevant.
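
The calculation can be sketched in a few lines. This is an illustrative implementation with binary relevance, not the platform's own code; the function name and the sample paths are made up for the example.

```python
def precision_at_k(retrieved, relevant, k):
    """Fraction of the top-k retrieved URLs that are in the relevant set."""
    if k <= 0:
        raise ValueError("k must be a positive integer")
    hits = sum(1 for url in retrieved[:k] if url in relevant)
    # Dividing by k (not by the number of results returned) penalizes
    # result lists that are shorter than k.
    return hits / k

retrieved = ["/a", "/b", "/c", "/d", "/e"]  # ranked results for one query
relevant = {"/a", "/c", "/f"}               # Expected Results for that query

p_at_3 = precision_at_k(retrieved, relevant, 3)  # 2 of the top 3 are relevant
p_at_5 = precision_at_k(retrieved, relevant, 5)  # 2 of the top 5 are relevant
```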

Recall

Measures the completeness of the results, indicating the fraction of ALL relevant documents that were successfully retrieved.
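
As a minimal sketch under the same binary-relevance assumption, recall divides the number of relevant documents retrieved by the total number of relevant documents. Returning 0.0 for a query with no Expected Results is a choice made for this example, not documented platform behavior.

```python
def recall(retrieved, relevant):
    """Fraction of all relevant URLs that appear anywhere in the results."""
    if not relevant:
        return 0.0  # assumption: undefined recall is reported as 0.0
    found = set(relevant).intersection(retrieved)
    return len(found) / len(relevant)

retrieved = ["/a", "/b", "/c", "/d", "/e"]  # ranked results for one query
relevant = {"/a", "/c", "/f"}               # "/f" was never retrieved

recall_score = recall(retrieved, relevant)  # 2 of 3 relevant URLs found
```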

nDCG@K (Normalized Discounted Cumulative Gain)

A rank-aware metric that evaluates the quality of the ranking. It rewards relevant documents that appear higher in the search results and normalizes the score against an ideal ordering, so values fall between 0 and 1 and can be compared across queries.
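
One common formulation, sketched here with binary relevance (gain 1 for a relevant URL, 0 otherwise) and a log2 rank discount, shows how position affects the score. The platform may use graded relevance or a different gain function; the function names are illustrative, and the sketch assumes each URL appears at most once in the results.

```python
import math

def dcg_at_k(gains, k):
    """Discounted cumulative gain: sum of gain / log2(rank + 1), ranks from 1."""
    return sum(g / math.log2(rank + 1)
               for rank, g in enumerate(gains[:k], start=1))

def ndcg_at_k(retrieved, relevant, k):
    """DCG of the actual ranking divided by the DCG of an ideal ranking."""
    gains = [1.0 if url in relevant else 0.0 for url in retrieved]
    # Ideal ranking: every relevant document placed at the top.
    ideal = [1.0] * min(len(relevant), k)
    idcg = dcg_at_k(ideal, k)
    if idcg == 0:
        return 0.0  # assumption: no relevant documents yields a score of 0.0
    return dcg_at_k(gains, k) / idcg

relevant = {"/a", "/c"}
ndcg_good = ndcg_at_k(["/a", "/c", "/b"], relevant, 3)  # relevant docs first
ndcg_poor = ndcg_at_k(["/b", "/a", "/c"], relevant, 3)  # relevant docs pushed down
```

Both rankings contain the same documents, so precision and recall are identical; only nDCG distinguishes them.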

Mean Reciprocal Rank (MRR)

Focuses on how quickly the *first* correct answer is found. It's the average of the reciprocal ranks of the first relevant document for a set of queries.
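
A short sketch of that averaging follows. Treating a query with no relevant result as contributing 0 is a common convention and an assumption here; the function names and sample data are illustrative.

```python
def reciprocal_rank(retrieved, relevant):
    """1 / rank of the first relevant result, or 0.0 if none is found."""
    for rank, url in enumerate(retrieved, start=1):
        if url in relevant:
            return 1.0 / rank
    return 0.0

def mean_reciprocal_rank(runs):
    """Average reciprocal rank over (retrieved, relevant) pairs, one per query."""
    runs = list(runs)
    if not runs:
        return 0.0
    return sum(reciprocal_rank(ret, rel) for ret, rel in runs) / len(runs)

runs = [
    (["/a", "/b", "/c"], {"/a"}),  # first relevant at rank 1 -> 1.0
    (["/x", "/b", "/c"], {"/b"}),  # first relevant at rank 2 -> 0.5
    (["/x", "/y", "/z"], {"/q"}),  # nothing relevant         -> 0.0
]
mrr = mean_reciprocal_rank(runs)   # (1.0 + 0.5 + 0.0) / 3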

Overlap & Diversity

Analyze the similarity between result sets from different configurations using the Jaccard index and other overlap metrics. You can also measure result diversity by counting unique domains and titles.
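
The Jaccard index is the size of the intersection of two result sets divided by the size of their union. The sketch below computes it alongside a simple unique-domain count; the function names, the sample URLs, and the choice to score two empty result sets as 1.0 are assumptions for illustration.

```python
from urllib.parse import urlparse

def jaccard(results_a, results_b):
    """|A intersect B| / |A union B| for two result lists, ignoring rank."""
    a, b = set(results_a), set(results_b)
    if not a and not b:
        return 1.0  # assumption: two empty result sets count as identical
    return len(a & b) / len(a | b)

def unique_domains(urls):
    """Number of distinct host names among the result URLs."""
    return len({urlparse(u).netloc.lower() for u in urls})

config_a = [
    "https://docs.example.com/install",
    "https://docs.example.com/upgrade",
    "https://blog.example.com/release",
]
config_b = [
    "https://docs.example.com/upgrade",
    "https://blog.example.com/release",
    "https://forum.example.com/t/42",
]

overlap = jaccard(config_a, config_b)  # 2 shared URLs out of 4 distinct URLs
domains_a = unique_domains(config_a)
domains_b = unique_domains(config_b)
```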

Statistical Significance

The platform automatically runs pairwise statistical tests (such as the Wilcoxon signed-rank test) to determine whether the difference in performance between two configurations is statistically significant, helping you avoid decisions based on random noise.
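
To illustrate what a Wilcoxon signed-rank test does with per-query scores, here is a self-contained sketch that computes an exact two-sided p-value by enumerating every sign assignment. It is only practical for a small number of queries (roughly 20 or fewer); in practice a library routine such as `scipy.stats.wilcoxon` is the usual choice. The function names and the per-query scores are invented for the example, and the platform's own test procedure is not specified here.

```python
from itertools import product

def signed_ranks(diffs):
    """Drop zero differences, then rank the rest by absolute value (ties averaged)."""
    ordered = sorted((d for d in diffs if d != 0), key=abs)
    ranks = [0.0] * len(ordered)
    i = 0
    while i < len(ordered):
        j = i
        while j + 1 < len(ordered) and abs(ordered[j + 1]) == abs(ordered[i]):
            j += 1
        avg = (i + j) / 2 + 1  # average of the 1-based ranks i+1 .. j+1
        for t in range(i, j + 1):
            ranks[t] = avg
        i = j + 1
    return ordered, ranks

def wilcoxon_exact(scores_a, scores_b):
    """Return (statistic, two-sided exact p-value) for paired per-query scores."""
    diffs = [a - b for a, b in zip(scores_a, scores_b)]
    ordered, ranks = signed_ranks(diffs)
    n = len(ranks)
    if n == 0:
        return 0.0, 1.0  # no differences at all: nothing to distinguish
    total = sum(ranks)
    w_plus = sum(r for d, r in zip(ordered, ranks) if d > 0)
    observed = min(w_plus, total - w_plus)
    # Under the null hypothesis each difference is equally likely to be
    # positive or negative, so all 2**n sign patterns are equally probable.
    extreme = 0
    for signs in product((0, 1), repeat=n):
        w = sum(r for s, r in zip(signs, ranks) if s)
        if min(w, total - w) <= observed:
            extreme += 1
    return observed, extreme / 2 ** n

# Hypothetical per-query nDCG scores for two configurations (same 8 queries).
ndcg_config_a = [0.82, 0.75, 0.91, 0.66, 0.88, 0.79, 0.93, 0.71]
ndcg_config_b = [0.80, 0.70, 0.83, 0.55, 0.73, 0.60, 0.70, 0.43]

statistic, p_value = wilcoxon_exact(ndcg_config_a, ndcg_config_b)
```

Because configuration A scores higher on every one of the 8 queries, only 2 of the 256 sign patterns are as extreme as the observed one, which gives a small p-value.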


Powerful Features to Drive Better Search

TestMySearch provides a complete toolkit for search analysis and optimization.