Clustering by Passing Messages Between Data Points. Brendan Frey and Delbert Dueck, Science Jan 2007. This paper firstly caught my eye as a pure CS paper in Science! They propose a method of clustering that uses message passing. In this framework each node get as input messages from each other node how much they think that the particular node is an exemplar (or best representative of the cluster) in return, each node puts out the message to each other node how much it is available to be a representative exemplar. This type of method appears to work independently of how many initial clusters are picked (as in k-means) and also has a fast runtime. In all its a very promising approach, that is simple in theory. However it has some caveats: the number of clusters in the final result can seem to be modulated by altering the "self-availability" parameter which seems odd, since they are trying to bill this as a method insensitive to initial cluster choice. Also all of their comparisons are to k-mean clustering which isn't always the best...why they didn't compare to spectral clustering and/or hierarchical clustering is beyond me.
Also the last figure shows how they can use clustering to identify the most accessible and connected airports in the US. I'm not sure if its their method of the dataset but some obvious airports such as O'hare (Chicago), Houston and Newark are not there! Something definitely seems fishy.
Either way the software is available at http://www.psi.toronto.edu/affinitypopogation
Labels: airports, clustering, message passing
0 Comments:
Post a Comment
<< Home