Efficient Clustering of High-Dimensional Data Sets with Application to Reference Matching
- Canopies can be applied to many domains and used with a variety of clustering approaches, including Greedy Agglomerative Clustering, K-means, and Expectation-Maximization.
- We never calculate the distance between two points that do not appear together in any canopy; in effect, we assume their distance to be infinite.
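
To make the second point concrete, here is a minimal sketch in Python, not the paper's own code. It assumes the usual two-threshold canopy construction (a loose and a tight threshold applied to a cheap distance). The names `cheap_distance`, `build_canopies`, `membership_of`, and `canopy_distance`, the choice of Manhattan distance as the cheap metric, and Euclidean distance as the expensive one are all illustrative assumptions.

```python
import math
import random


def cheap_distance(a, b):
    # Assumed cheap metric for illustration: Manhattan distance.
    return sum(abs(x - y) for x, y in zip(a, b))


def build_canopies(points, t_loose, t_tight, seed=0):
    """Group point indices into possibly overlapping canopies."""
    if t_loose < t_tight:
        raise ValueError("t_loose must be at least t_tight")
    rng = random.Random(seed)
    remaining = set(range(len(points)))  # points that may still become centers
    canopies = []
    while remaining:
        center = rng.choice(sorted(remaining))
        # Every point within the loose threshold joins this canopy.
        canopy = {
            i for i in range(len(points))
            if cheap_distance(points[center], points[i]) <= t_loose
        }
        canopies.append(canopy)
        # Points within the tight threshold can no longer start a canopy.
        remaining -= {
            i for i in canopy
            if cheap_distance(points[center], points[i]) <= t_tight
        }
    return canopies


def membership_of(canopies, n_points):
    # For each point, the set of canopy ids it belongs to.
    membership = [set() for _ in range(n_points)]
    for cid, canopy in enumerate(canopies):
        for i in canopy:
            membership[i].add(cid)
    return membership


def canopy_distance(i, j, points, membership):
    # Points that share no canopy are treated as infinitely far apart,
    # so the expensive metric is never evaluated for them.
    if membership[i].isdisjoint(membership[j]):
        return math.inf
    return math.dist(points[i], points[j])  # assumed expensive metric
```

A clustering algorithm such as K-means or Greedy Agglomerative Clustering could then call `canopy_distance` wherever it would otherwise call the expensive metric directly, so pairs of points in different canopies cost only a set-intersection check.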