Motivation
Clustering algorithms such as K-Means are unsupervised learning methods: they discover structure in data without predefined labels. While these methods are effective at grouping observations, they often suffer from a major limitation: lack of interpretability.
In high-dimensional settings, it is difficult to explain why a particular data point was assigned to one cluster rather than another. Visual tools like scatter plots become ineffective, and even when visualisation is possible, the reasoning behind cluster boundaries remains unclear.
Decision Trees, on the other hand, are supervised learning models that are inherently interpretable and transparent. Although they cannot perform clustering on their own, they can be used after clustering to explain the results.
Core Idea
The key idea is to convert an unsupervised problem into a supervised one:
- Apply a clustering algorithm (e.g., K-Means) to an unlabeled dataset.
- Treat the resulting cluster assignments as labels.
- Train a Decision Tree classifier using the original features as predictors and the cluster labels as the target.
- Use the trained tree to explain the logic behind cluster assignments.
This approach does not replace clustering. Instead, it provides a post-hoc explanation layer.
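The four steps above can be sketched as follows. This is a minimal illustration, assuming scikit-learn and NumPy are available; the synthetic dataset, the choice of three clusters, and the depth limit of 3 are all assumptions made for the example, not values prescribed by the method.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.tree import DecisionTreeClassifier

# Synthetic, unlabeled data (illustrative only): 300 points, 5 features.
# The ground-truth blob labels are discarded on purpose.
X, _ = make_blobs(n_samples=300, n_features=5, centers=3, random_state=0)

# Step 1: cluster the unlabeled data.
kmeans = KMeans(n_clusters=3, n_init=10, random_state=0)
cluster_labels = kmeans.fit_predict(X)

# Steps 2-3: treat the cluster assignments as labels and fit a shallow tree
# on the original features. max_depth=3 is an assumed readability limit.
tree = DecisionTreeClassifier(max_depth=3, random_state=0)
tree.fit(X, cluster_labels)

# Step 4: measure how closely the tree reproduces the assignments.
# This is agreement with K-Means on the training data, not a quality
# score for the clustering itself.
fidelity = tree.score(X, cluster_labels)
print(f"Tree reproduces {fidelity:.1%} of the cluster assignments")
```

The clustering result is never modified here: the tree only reads `cluster_labels` and tries to reproduce them with threshold rules.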
Why Decision Trees Are Suitable for Cluster Explanation
Decision Trees offer several advantages in this context:
- They generate explicit decision rules based on feature thresholds.
- Each split tests a single feature, so the rules stay readable in high-dimensional data as long as the tree is kept shallow.
- They naturally express nonlinear boundaries and feature interactions.
- They allow practitioners to describe clusters in human-readable terms, such as: “Observations with humidity below a certain threshold and windspeed above another threshold belong to Cluster A.”
In essence, Decision Trees translate abstract cluster geometry into logical rules.
Workflow Overview
Step 1: Clustering to Generate Labels
A clustering algorithm (such as K-Means) is applied to the dataset to discover groups based on feature similarity. The algorithm assigns each observation to a cluster, producing a categorical label for every data point.
At this stage:
- The clustering focuses purely on distance and structure.
- No interpretability is guaranteed.
Step 2: Supervised Learning with a Decision Tree
The cluster labels are then used as the target variable in a supervised learning setup.
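A Decision Tree classifier is trained to predict cluster membership from the original features. The tree's accuracy measures how faithfully it reproduces the clustering. An unrestricted tree can fit almost any labelling of the training data, so the informative case is a shallow tree that achieves high accuracy, ideally on held-out data. When it does, it indicates that:
- The cluster boundaries can be approximated well by axis-aligned thresholds on the features.
- The decision rules learned by the tree effectively approximate the clustering logic.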
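Training accuracy on its own can flatter the tree, because a deep enough tree can memorise nearly any set of labels. One way to check fidelity more carefully is to score the tree on held-out folds at several depths. The sketch below assumes scikit-learn; the dataset, the four clusters, the candidate depths, and the name `fidelity_by_depth` are hypothetical choices for illustration.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

# Synthetic, unlabeled data (illustrative only).
X, _ = make_blobs(n_samples=400, n_features=6, centers=4, random_state=1)
cluster_labels = KMeans(n_clusters=4, n_init=10, random_state=1).fit_predict(X)

# Held-out agreement between tree and clustering, per depth limit.
fidelity_by_depth = {}
for depth in (1, 2, 3, 4):
    tree = DecisionTreeClassifier(max_depth=depth, random_state=1)
    # 5-fold cross-validated accuracy against the cluster labels.
    scores = cross_val_score(tree, X, cluster_labels, cv=5)
    fidelity_by_depth[depth] = scores.mean()

for depth, fidelity in fidelity_by_depth.items():
    print(f"max_depth={depth}: held-out fidelity {fidelity:.3f}")
```

A depth-1 tree has only two leaves, so it cannot name all four clusters and its fidelity is necessarily below 1. The smallest depth at which fidelity levels off is a reasonable candidate for the explanation tree.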
Step 3: Interpreting the Tree
Once trained, the Decision Tree can be examined to understand:
- Which features are most important for separating clusters.
- The sequence of decisions that lead to each cluster.
- The regions of the feature space corresponding to each cluster.
Often, only a small subset of features is sufficient to explain the clustering, even when the original data have many dimensions.
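To illustrate how a few features can carry the whole explanation, the sketch below builds a synthetic dataset in which only two columns separate the groups and the remaining three are low-variance noise. It assumes scikit-learn and NumPy. The feature names (`humidity`, `windspeed`, `noise_a`, and so on), the group centres, and the noise scales are all hypothetical. K-Means is run on unscaled features to keep the example short; in practice features are often standardised first.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.tree import DecisionTreeClassifier, export_text

rng = np.random.default_rng(0)

# Two informative columns: three well-separated groups (hypothetical values).
centers = np.array([[20.0, 5.0], [60.0, 5.0], [60.0, 25.0]])
informative = np.vstack(
    [center + rng.normal(scale=2.0, size=(100, 2)) for center in centers]
)
# Three uninformative columns with small variance.
noise = rng.normal(scale=0.5, size=(300, 3))
X = np.hstack([informative, noise])
feature_names = ["humidity", "windspeed", "noise_a", "noise_b", "noise_c"]

cluster_labels = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(X)
tree = DecisionTreeClassifier(max_depth=3, random_state=0).fit(X, cluster_labels)

# Impurity-based importance of each feature for separating the clusters.
importances = dict(zip(feature_names, tree.feature_importances_))
for name, value in importances.items():
    print(f"{name}: {value:.3f}")

# Text rendering of the sequence of decisions leading to each cluster.
rules = export_text(tree, feature_names=feature_names)
print(rules)
```

Impurity-based importances describe this particular fitted tree, so they are best read as a summary of the explanation rather than a definitive ranking of the features.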
Conceptual Interpretation
Each leaf node in the Decision Tree represents:
- A subset of observations,
- Defined by a set of feature-based conditions,
- Assigned to a specific cluster.
Thus, clusters can be described as rule-based segments, such as:
- “Low humidity and low windspeed”
- “Moderate humidity with high windspeed”
These descriptions are far more accessible than distance-based explanations.
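The leaf-level view can be made concrete by walking the fitted tree and collecting the conditions on the path to each leaf. The sketch below assumes scikit-learn and relies on the arrays exposed by the fitted estimator's `tree_` attribute. The helper `leaf_rules`, the feature names, and the synthetic data are hypothetical and exist only for this example.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.tree import DecisionTreeClassifier

# Synthetic two-feature data with hypothetical feature names.
X, _ = make_blobs(n_samples=300, n_features=2, centers=3, random_state=2)
feature_names = ["humidity", "windspeed"]

cluster_labels = KMeans(n_clusters=3, n_init=10, random_state=2).fit_predict(X)
tree = DecisionTreeClassifier(max_depth=2, random_state=2).fit(X, cluster_labels)


def leaf_rules(fitted_tree, names):
    """Map each leaf id to (conditions, predicted cluster, training sample count)."""
    t = fitted_tree.tree_
    rules = {}

    def walk(node, conditions):
        if t.children_left[node] == -1:  # -1 marks a leaf node
            cluster = fitted_tree.classes_[np.argmax(t.value[node][0])]
            rules[int(node)] = (conditions, int(cluster), int(t.n_node_samples[node]))
            return
        name = names[t.feature[node]]
        threshold = t.threshold[node]
        # Left branch: feature <= threshold; right branch: feature > threshold.
        walk(t.children_left[node], conditions + [f"{name} <= {threshold:.2f}"])
        walk(t.children_right[node], conditions + [f"{name} > {threshold:.2f}"])

    walk(0, [])
    return rules


segments = leaf_rules(tree, feature_names)
for leaf, (conditions, cluster, count) in segments.items():
    print(f"Leaf {leaf}: {' and '.join(conditions)} -> cluster {cluster} ({count} samples)")
```

Each printed line is one rule-based segment: a conjunction of threshold conditions, the cluster the leaf predicts, and how many training observations fall into it. Several leaves may map to the same cluster, in which case that cluster is described by the union of their rules.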
Important Clarification
This approach should not be misunderstood as rediscovering clusters using a Decision Tree.
- The clustering algorithm performs the actual discovery of structure.
- The Decision Tree explains, rather than replaces, the clustering results.
The tree is a descriptive surrogate for the clustering, not a clustering method in its own right.
Practical Value
Using Decision Trees for cluster explanation is especially useful when:
- Stakeholders require transparent reasoning.
- Models must be auditable or explainable.
- High-dimensional data make direct visualisation impractical.
- Clusters need to be translated into business rules or policies.
Key Takeaways
- Clustering alone often lacks interpretability.
- Decision Trees can be trained on cluster labels to explain assignments.
- This creates a bridge between unsupervised discovery and interpretable modeling.
- The resulting rules provide clear, actionable insights into cluster structure.
