2. For virtual-assessor.html
This page describes a feature, so WebPage markup is used.
HTML
Virtual Search Assessor - TestMySearch

3. For query-analysis.html
This page describes a feature, so WebPage markup is used.
HTML
Query Analysis & Clustering - TestMySearch

Query Analysis & Clustering

Discover hidden patterns in user search behavior by automatically grouping similar queries.

From Raw Queries to Actionable Insights

A raw query log is full of spelling variations, typos, and different phrasings of the same intent. Our query clustering feature cuts through the noise by grouping near-identical queries, such as "i phone" and "iphone," or "mangetout" and "mange tout", without having to compare every query against every other one.

The Clustering Pipeline

1. Bigram Transformation

Each query is broken down into a set of bigrams (overlapping pairs of characters). For example, "apple" becomes {"ap", "pp", "pl", "le"}. Because a typo or a stray space changes only a few bigrams, two spellings of the same query still share most of their set.
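The bigram step can be sketched in a few lines of Python. This is a minimal illustration, not the product's actual code: the function names `char_bigrams` and `jaccard` are hypothetical, and lowercasing the query first is an assumption the page does not state.

```python
def char_bigrams(query: str) -> set:
    """Return the set of overlapping character pairs in a query."""
    text = query.lower()  # assumption: case is ignored
    return {text[i:i + 2] for i in range(len(text) - 1)}


def jaccard(a: set, b: set) -> float:
    """Share of bigrams the two sets have in common (0.0 to 1.0)."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


apple = char_bigrams("apple")            # {"ap", "pp", "pl", "le"}
overlap = jaccard(char_bigrams("iphone"), char_bigrams("i phone"))
```

Here "iphone" and "i phone" share four of seven distinct bigrams, so the extra space lowers the overlap without eliminating it.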

2. MinHash Fingerprinting

A compact "fingerprint" is calculated for each set of bigrams using the MinHash algorithm. The more bigrams two queries share, the more positions their fingerprints have in common, so similarity can be estimated from the fingerprints alone.
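As a rough sketch of how a MinHash fingerprint can be built, the example below keeps, for each of several seeded hash functions, the smallest hash value seen across the bigram set. The fingerprint length of 64, the use of `hashlib.blake2b` as the hash family, and the function names are all assumptions made for illustration; the page does not describe these details.

```python
import hashlib

NUM_HASHES = 64  # assumed fingerprint length


def _seeded_hash(token: str, seed: int) -> int:
    """Deterministic 64-bit hash of a token under a given seed."""
    data = f"{seed}:{token}".encode("utf-8")
    return int.from_bytes(hashlib.blake2b(data, digest_size=8).digest(), "big")


def minhash_signature(tokens: set, num_hashes: int = NUM_HASHES) -> list:
    """For each seed, keep the minimum hash value over all tokens."""
    if not tokens:
        raise ValueError("cannot fingerprint an empty token set")
    return [min(_seeded_hash(t, seed) for t in tokens)
            for seed in range(num_hashes)]


def estimated_similarity(sig_a: list, sig_b: list) -> float:
    """Fraction of matching positions, an estimate of Jaccard similarity."""
    matches = sum(a == b for a, b in zip(sig_a, sig_b))
    return matches / len(sig_a)


sig_apple = minhash_signature({"ap", "pp", "pl", "le"})
sig_other = minhash_signature({"xy", "zz"})
```

A longer fingerprint gives a steadier similarity estimate at the cost of more hashing per query.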

3. Locality-Sensitive Hashing (LSH)

To avoid comparing every query to every other query, we use LSH. This technique places similar fingerprints into the same "buckets" with high probability, so only queries that share a bucket need to be compared directly, which dramatically speeds up the process. In effect, it works as an approximate nearest-neighbor search specialized for this task.
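One common way to bucket MinHash fingerprints is the banding scheme sketched below: each fingerprint is cut into equal-sized bands, and two queries become a candidate pair if any band matches exactly. Whether TestMySearch uses banding is an assumption, as are the function name `lsh_candidate_pairs`, the `bands` parameter, and the toy fingerprints.

```python
from collections import defaultdict
from itertools import combinations


def lsh_candidate_pairs(signatures: dict, bands: int) -> set:
    """Return pairs of keys whose fingerprints agree on at least one band.

    signatures maps each query to an equal-length list of integers.
    """
    if not signatures:
        return set()
    sig_len = len(next(iter(signatures.values())))
    if sig_len % bands != 0:
        raise ValueError("fingerprint length must be divisible by bands")
    rows = sig_len // bands

    # Bucket key = (band index, band contents); identical bands collide.
    buckets = defaultdict(set)
    for query, sig in signatures.items():
        for b in range(bands):
            band = tuple(sig[b * rows:(b + 1) * rows])
            buckets[(b, band)].add(query)

    pairs = set()
    for members in buckets.values():
        pairs.update(combinations(sorted(members), 2))
    return pairs


# Toy 4-value fingerprints, made up for the example.
toy = {"iphone": [1, 2, 3, 4], "i phone": [1, 2, 9, 9], "mangetout": [7, 7, 7, 7]}
candidates = lsh_candidate_pairs(toy, bands=2)
```

More bands with fewer rows each catch weaker similarities but produce more candidates to verify; fewer, wider bands do the opposite.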

4. Final Clustering

The system then runs a more precise (but slower) fuzzy matching comparison only on the small groups of candidates identified by LSH. The result is a clean set of clusters, each containing similar queries.
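The final step might look roughly like the sketch below, which verifies each candidate pair with a slower string comparison and merges the pairs that pass. The page does not name the fuzzy matcher or the merge strategy, so `difflib.SequenceMatcher`, the 0.8 threshold, the union-find merge, and the name `cluster_queries` are all stand-in assumptions.

```python
from difflib import SequenceMatcher


def cluster_queries(queries: list, candidate_pairs: list, threshold: float = 0.8) -> list:
    """Merge candidate pairs that pass a fuzzy check into clusters.

    Every query named in candidate_pairs must also appear in queries.
    """
    parent = {q: q for q in queries}

    def find(q):
        # Follow parent links to the cluster root, shortening the path.
        while parent[q] != q:
            parent[q] = parent[parent[q]]
            q = parent[q]
        return q

    for a, b in candidate_pairs:
        # The precise (slower) comparison runs only on LSH candidates.
        if SequenceMatcher(None, a, b).ratio() >= threshold:
            parent[find(a)] = find(b)

    clusters = {}
    for q in queries:
        clusters.setdefault(find(q), []).append(q)
    return sorted(sorted(c) for c in clusters.values())


queries = ["iphone", "i phone", "mangetout", "mange tout", "laptop"]
pairs = [("iphone", "i phone"), ("mangetout", "mange tout"), ("iphone", "laptop")]
clusters = cluster_queries(queries, pairs)
```

In this example the pair ("iphone", "laptop") fails the fuzzy check, so "laptop" stays in a cluster of its own.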

Powerful Features to Drive Better Search

TestMySearch provides a complete toolkit for search analysis and optimization.