comparison python/ml.py @ 843:5dc7a507353e

updated to learn prototypes
author Nicolas Saunier <nicolas.saunier@polymtl.ca>
date Wed, 13 Jul 2016 23:45:47 -0400
parents 52aa03260f03
children 8e8ec4ece66e
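
The step this changeset adds to the prototype selection loop can be sketched in isolation. The block below is a minimal, hedged illustration and not the code of ml.py: the function name select_prototypes, its snake_case parameters, and the rng argument are hypothetical. It also assumes that similarities not yet computed are stored as -1 and that the non-random ordering sorts instances by decreasing length, since neither appears in the lines shown in the comparison.

```python
import numpy as np

def select_prototypes(instances, similarity_func, min_similarity,
                      random_initialization=False, rng=None):
    """Hypothetical sketch of the prototype selection step.

    Returns the list of prototype indices and the (partially filled)
    similarity matrix.
    """
    n = len(instances)
    # Assumption: a negative entry means "not computed yet".
    similarities = -np.ones((n, n))
    if random_initialization:
        rng = np.random.default_rng() if rng is None else rng
        indices = [int(i) for i in rng.permutation(n)]
    else:
        # Assumption: longest instances are visited first.
        indices = sorted(range(n), key=lambda i: len(instances[i]), reverse=True)

    prototype_indices = [indices[0]]
    for i in indices[1:]:
        # Compute missing similarities to the current prototypes only.
        for j in prototype_indices:
            if similarities[i][j] < 0:
                similarities[i][j] = similarity_func(instances[i], instances[j])
                similarities[j][i] = similarities[i][j]
        row = similarities[i][prototype_indices]
        if row.max() < min_similarity:
            # Different enough from every prototype: becomes a new one.
            prototype_indices.append(i)
        elif random_initialization:
            # Step added in this revision: with a random visiting order,
            # replace the closest prototype if instance i is longer.
            label = int(row.argmax())
            if len(instances[prototype_indices[label]]) < len(instances[i]):
                prototype_indices[label] = i
    return prototype_indices, similarities
```

With a toy similarity such as "1.0 if both strings share their first character, else 0.0" and min_similarity set to 0.5, the random variant should end with the longest string of each group as its prototype, whatever the visiting order.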
--- a/python/ml.py	842:75530d8c0090
+++ b/python/ml.py	843:5dc7a507353e
@@ -143,11 +143,13 @@
 
     if an instance is different enough (<minSimilarity),
     it will become a new prototype.
     Non-prototype instances will be assigned to an existing prototype
     if minClusterSize is not None, the clusters will be refined by removing iteratively the smallest clusters
-    and reassigning all elements in the cluster until no cluster is smaller than minClusterSize'''
+    and reassigning all elements in the cluster until no cluster is smaller than minClusterSize
+
+    TODO: at each step, optimize the prototype as the most similar in its current cluster (can be done easily if similarities are already computed)'''
 
     # sort instances based on length
     indices = range(len(instances))
     if randomInitialization:
         indices = np.random.permutation(indices)
@@ -167,11 +169,15 @@
         for j in prototypeIndices:
             if similarities[i][j] < 0:
                 similarities[i][j] = similarityFunc(instances[i], instances[j])
                 similarities[j][i] = similarities[i][j]
         if similarities[i][prototypeIndices].max() < minSimilarity:
             prototypeIndices.append(i)
+        elif randomInitialization: # replace prototype by current instance i if longer
+            label = similarities[i][prototypeIndices].argmax()
+            if len(instances[prototypeIndices[label]]) < len(instances[i]):
+                prototypeIndices[label] = i
 
     # assignment
     indices = [i for i in range(similarities.shape[0]) if i not in prototypeIndices]
     assign = True
     while assign: