Informal goal - given n points [web pages images, genome fragments, etc] classify into "coherent groups"
People in machine learning must have heard of unsupervised learning.
Assumption
1) As input, given a (dis) similarity measure
2 symmetric
examples - euclidean distance, genome similarity
goal - same clusters => nearby
People in machine learning must have heard of unsupervised learning.
Assumption
1) As input, given a (dis) similarity measure
2 symmetric
examples - euclidean distance, genome similarity
goal - same clusters => nearby
So, coherent group has smaller distance.
Objective function to do clustering - there are many objective functions, but we are reading max-spacing k-clusterings.
Max spacing k-clustering
Assume we know k = number of clusters desired
[In practice, can experiment with a range of values]
Call points p and q seperated if they are assigned to different clusters.
Problem statement - given a distance d and k, compute the k-clustering with maximum spacing.
greedy algorithm
pseudocode
0 comments:
Post a Comment