Friday, December 28, 2012

Efficient Clustering of High-Dimensional Data Sets with Application to Reference Matching 


- Canopies can be applied to many domains and used with a variety of clustering approaches, including Greedy Agglomerative Clustering, K-means, and Expectation-Maximization.

- We never calculate the distance between two points that do not appear together in any canopy; in effect, we assume their distance to be infinite.
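
The note above can be sketched in code. This is only a minimal illustration, not the paper's implementation: the function names, the choice of Manhattan distance as the cheap metric, Euclidean distance as the expensive one, and the threshold names t1 (loose) and t2 (tight) are all assumptions made for the example.

```python
import math

def cheap_distance(a, b):
    # Assumption: Manhattan distance stands in for the inexpensive metric.
    return sum(abs(x - y) for x, y in zip(a, b))

def exact_distance(a, b):
    # Assumption: Euclidean distance stands in for the expensive metric.
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def build_canopies(points, t1, t2):
    """Group point indices into (possibly overlapping) canopies.

    t1 is the loose threshold and t2 the tight one, with t1 >= t2 >= 0.
    """
    if not (t1 >= t2 >= 0):
        raise ValueError("expected t1 >= t2 >= 0")
    remaining = list(range(len(points)))
    canopies = []
    while remaining:
        center = remaining[0]
        # Every point within t1 of the center joins this canopy.
        canopy = {i for i in range(len(points))
                  if cheap_distance(points[center], points[i]) <= t1}
        canopies.append(canopy)
        # Points within t2 of the center can no longer start a new canopy.
        remaining = [i for i in remaining
                     if cheap_distance(points[center], points[i]) > t2]
    return canopies

def canopy_distance(i, j, points, canopies):
    # The expensive distance is computed only when i and j share a canopy;
    # otherwise the pair is treated as infinitely far apart.
    if any(i in c and j in c for c in canopies):
        return exact_distance(points[i], points[j])
    return math.inf

points = [(0, 0), (1, 0), (10, 10), (11, 10)]
canopies = build_canopies(points, t1=3, t2=1.5)
```

With these made-up points, the first two land in one canopy and the last two in another, so canopy_distance(0, 2, points, canopies) returns infinity without ever calling the expensive metric.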

