Mercurial Hosting > traffic-intelligence
comparison python/ml.py @ 843:5dc7a507353e
updated to learn prototypes
author | Nicolas Saunier <nicolas.saunier@polymtl.ca> |
---|---|
date | Wed, 13 Jul 2016 23:45:47 -0400 |
parents | 52aa03260f03 |
children | 8e8ec4ece66e |
842:75530d8c0090 | 843:5dc7a507353e |
---|---|
143 | 143 |
144 if an instance is different enough (<minSimilarity), | 144 if an instance is different enough (<minSimilarity), |
145 it will become a new prototype. | 145 it will become a new prototype. |
146 Non-prototype instances will be assigned to an existing prototype | 146 Non-prototype instances will be assigned to an existing prototype |
147 if minClusterSize is not None, the clusters will be refined by removing iteratively the smallest clusters | 147 if minClusterSize is not None, the clusters will be refined by removing iteratively the smallest clusters |
148 and reassigning all elements in the cluster until no cluster is smaller than minClusterSize''' | 148 and reassigning all elements in the cluster until no cluster is smaller than minClusterSize |
| | 149 |
| | 150 TODO: at each step, optimize the prototype as the most similar in its current cluster (can be done easily if similarities are already computed)''' |
149 | 151 |
150 # sort instances based on length | 152 # sort instances based on length |
151 indices = range(len(instances)) | 153 indices = range(len(instances)) |
152 if randomInitialization: | 154 if randomInitialization: |
153 indices = np.random.permutation(indices) | 155 indices = np.random.permutation(indices) |
167 for j in prototypeIndices: | 169 for j in prototypeIndices: |
168 if similarities[i][j] < 0: | 170 if similarities[i][j] < 0: |
169 similarities[i][j] = similarityFunc(instances[i], instances[j]) | 171 similarities[i][j] = similarityFunc(instances[i], instances[j]) |
170 similarities[j][i] = similarities[i][j] | 172 similarities[j][i] = similarities[i][j] |
171 if similarities[i][prototypeIndices].max() < minSimilarity: | 173 if similarities[i][prototypeIndices].max() < minSimilarity: |
172 prototypeIndices.append(i) | 174 prototypeIndices.append(i) |
| | 175 elif randomInitialization: # replace prototype by current instance i if longer |
| | 176 label = similarities[i][prototypeIndices].argmax() |
| | 177 if len(instances[prototypeIndices[label]]) < len(instances[i]): |
| | 178 prototypeIndices[label] = i |
173 | 179 |
174 # assignment | 180 # assignment |
175 indices = [i for i in range(similarities.shape[0]) if i not in prototypeIndices] | 181 indices = [i for i in range(similarities.shape[0]) if i not in prototypeIndices] |
176 assign = True | 182 assign = True |
177 while assign: | 183 while assign: |
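
The prototype-selection step touched by revision 843 can be sketched as a standalone function. This is a minimal illustration, not the code in `python/ml.py`: the function name `select_prototypes`, its snake_case parameters, the `seed` argument, and the handling of the first instance are assumptions, since the comparison does not show the lines that initialise `prototypeIndices` or sort the instances. Only the lazy similarity caching, the `minSimilarity` test, and the new `elif randomInitialization` branch follow the lines shown above.

```python
import numpy as np


def select_prototypes(instances, similarity_func, min_similarity,
                      random_initialization=False, seed=None):
    """Hypothetical helper mirroring the selection loop shown in the diff."""
    n = len(instances)
    # -1 marks a pairwise similarity that has not been computed yet
    similarities = -np.ones((n, n))
    if random_initialization:
        rng = np.random.default_rng(seed)
        indices = [int(i) for i in rng.permutation(n)]
    else:
        # assumption: instances are visited from longest to shortest
        indices = sorted(range(n), key=lambda i: len(instances[i]), reverse=True)

    # assumption: the first visited instance seeds the prototype list
    prototype_indices = [indices[0]]
    for i in indices[1:]:
        # compute only the similarities to current prototypes that are missing
        for j in prototype_indices:
            if similarities[i][j] < 0:
                similarities[i][j] = similarity_func(instances[i], instances[j])
                similarities[j][i] = similarities[i][j]
        to_prototypes = similarities[i][prototype_indices]
        if to_prototypes.max() < min_similarity:
            # different enough from every prototype: becomes a new one
            prototype_indices.append(i)
        elif random_initialization:
            # branch added in 843: swap in instance i if it is longer
            # than the prototype it is most similar to
            label = to_prototypes.argmax()
            if len(instances[prototype_indices[label]]) < len(instances[i]):
                prototype_indices[label] = i
    return prototype_indices, similarities
```

With a random visiting order the first prototypes can be short instances, which is presumably why the new branch replaces a prototype by a longer instance from the same neighbourhood; with length-sorted order the branch is never taken.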