annotate trafficintelligence/ml.py @ 1222:69b531c7a061

added methods to reset trajectories and change object coordinates (including features)
author Nicolas Saunier <nicolas.saunier@polymtl.ca>
date Tue, 20 Jun 2023 15:42:19 -0400
parents 75a6ad604cc5
children ab4c72b9475c
Ignore whitespace changes - Everywhere: Within whitespace: At end of lines:
rev   line source
183
ed944ff45e8c first simple clustering algorithm implementation
Nicolas Saunier <nicolas.saunier@polymtl.ca>
parents:
diff changeset
1 #! /usr/bin/env python
ed944ff45e8c first simple clustering algorithm implementation
Nicolas Saunier <nicolas.saunier@polymtl.ca>
parents:
diff changeset
2 '''Libraries for machine learning algorithms'''
ed944ff45e8c first simple clustering algorithm implementation
Nicolas Saunier <nicolas.saunier@polymtl.ca>
parents:
diff changeset
3
786
1f2b2d1f4fbf added script and code to learn POIs
Nicolas Saunier <nicolas.saunier@polymtl.ca>
parents: 738
diff changeset
4 from os import path
1f2b2d1f4fbf added script and code to learn POIs
Nicolas Saunier <nicolas.saunier@polymtl.ca>
parents: 738
diff changeset
5 from random import shuffle
1f2b2d1f4fbf added script and code to learn POIs
Nicolas Saunier <nicolas.saunier@polymtl.ca>
parents: 738
diff changeset
6 from copy import copy, deepcopy
1f2b2d1f4fbf added script and code to learn POIs
Nicolas Saunier <nicolas.saunier@polymtl.ca>
parents: 738
diff changeset
7
308
8bafd054cda4 Added a function to compute LCSS distance between two indcators
Mohamed Gomaa
parents: 184
diff changeset
8 import numpy as np
786
1f2b2d1f4fbf added script and code to learn POIs
Nicolas Saunier <nicolas.saunier@polymtl.ca>
parents: 738
diff changeset
9 from matplotlib.pylab import text
1f2b2d1f4fbf added script and code to learn POIs
Nicolas Saunier <nicolas.saunier@polymtl.ca>
parents: 738
diff changeset
10 import matplotlib as mpl
1f2b2d1f4fbf added script and code to learn POIs
Nicolas Saunier <nicolas.saunier@polymtl.ca>
parents: 738
diff changeset
11 import matplotlib.pyplot as plt
1f2b2d1f4fbf added script and code to learn POIs
Nicolas Saunier <nicolas.saunier@polymtl.ca>
parents: 738
diff changeset
12 from scipy.cluster.vq import kmeans, whiten, vq
1f2b2d1f4fbf added script and code to learn POIs
Nicolas Saunier <nicolas.saunier@polymtl.ca>
parents: 738
diff changeset
13 from sklearn import mixture
978
184f1dd307f9 corrected print and exception statements for Python 3
Nicolas Saunier <nicolas.saunier@polymtl.ca>
parents: 961
diff changeset
14 try:
184f1dd307f9 corrected print and exception statements for Python 3
Nicolas Saunier <nicolas.saunier@polymtl.ca>
parents: 961
diff changeset
15 import cv2
184f1dd307f9 corrected print and exception statements for Python 3
Nicolas Saunier <nicolas.saunier@polymtl.ca>
parents: 961
diff changeset
16 opencvAvailable = True
184f1dd307f9 corrected print and exception statements for Python 3
Nicolas Saunier <nicolas.saunier@polymtl.ca>
parents: 961
diff changeset
17 except ImportError:
184f1dd307f9 corrected print and exception statements for Python 3
Nicolas Saunier <nicolas.saunier@polymtl.ca>
parents: 961
diff changeset
18 print('OpenCV library could not be loaded (video replay functions will not be available)') # TODO change to logging module
184f1dd307f9 corrected print and exception statements for Python 3
Nicolas Saunier <nicolas.saunier@polymtl.ca>
parents: 961
diff changeset
19 opencvAvailable = False
308
8bafd054cda4 Added a function to compute LCSS distance between two indcators
Mohamed Gomaa
parents: 184
diff changeset
20
1028
cc5cb04b04b0 major update using the trafficintelligence package name and install through pip
Nicolas Saunier <nicolas.saunier@polymtl.ca>
parents: 1025
diff changeset
21 from trafficintelligence import utils
786
1f2b2d1f4fbf added script and code to learn POIs
Nicolas Saunier <nicolas.saunier@polymtl.ca>
parents: 738
diff changeset
22
1f2b2d1f4fbf added script and code to learn POIs
Nicolas Saunier <nicolas.saunier@polymtl.ca>
parents: 738
diff changeset
23 #####################
1f2b2d1f4fbf added script and code to learn POIs
Nicolas Saunier <nicolas.saunier@polymtl.ca>
parents: 738
diff changeset
24 # OpenCV ML models
1f2b2d1f4fbf added script and code to learn POIs
Nicolas Saunier <nicolas.saunier@polymtl.ca>
parents: 738
diff changeset
25 #####################
183
ed944ff45e8c first simple clustering algorithm implementation
Nicolas Saunier <nicolas.saunier@polymtl.ca>
parents:
diff changeset
26
961
ec1682ed999f added computation of confusion matrix and improved default parameter for block normalization for SVM classification
Nicolas Saunier <nicolas.saunier@polymtl.ca>
parents: 953
diff changeset
27 def computeConfusionMatrix(model, samples, responses):
993
e8eabef7857c update to OpenCV3 for python
Nicolas Saunier <nicolas.saunier@polymtl.ca>
parents: 980
diff changeset
28 '''computes the confusion matrix of the classifier (model)
e8eabef7857c update to OpenCV3 for python
Nicolas Saunier <nicolas.saunier@polymtl.ca>
parents: 980
diff changeset
29
e8eabef7857c update to OpenCV3 for python
Nicolas Saunier <nicolas.saunier@polymtl.ca>
parents: 980
diff changeset
30 samples should be n samples by m variables'''
961
ec1682ed999f added computation of confusion matrix and improved default parameter for block normalization for SVM classification
Nicolas Saunier <nicolas.saunier@polymtl.ca>
parents: 953
diff changeset
31 classifications = {}
993
e8eabef7857c update to OpenCV3 for python
Nicolas Saunier <nicolas.saunier@polymtl.ca>
parents: 980
diff changeset
32 predictions = model.predict(samples)
e8eabef7857c update to OpenCV3 for python
Nicolas Saunier <nicolas.saunier@polymtl.ca>
parents: 980
diff changeset
33 for predicted, y in zip(predictions, responses):
961
ec1682ed999f added computation of confusion matrix and improved default parameter for block normalization for SVM classification
Nicolas Saunier <nicolas.saunier@polymtl.ca>
parents: 953
diff changeset
34 classifications[(y, predicted)] = classifications.get((y, predicted), 0)+1
ec1682ed999f added computation of confusion matrix and improved default parameter for block normalization for SVM classification
Nicolas Saunier <nicolas.saunier@polymtl.ca>
parents: 953
diff changeset
35 return classifications
ec1682ed999f added computation of confusion matrix and improved default parameter for block normalization for SVM classification
Nicolas Saunier <nicolas.saunier@polymtl.ca>
parents: 953
diff changeset
36
978
184f1dd307f9 corrected print and exception statements for Python 3
Nicolas Saunier <nicolas.saunier@polymtl.ca>
parents: 961
diff changeset
37 if opencvAvailable:
993
e8eabef7857c update to OpenCV3 for python
Nicolas Saunier <nicolas.saunier@polymtl.ca>
parents: 980
diff changeset
38 class SVM(object):
978
184f1dd307f9 corrected print and exception statements for Python 3
Nicolas Saunier <nicolas.saunier@polymtl.ca>
parents: 961
diff changeset
39 '''wrapper for OpenCV SimpleVectorMachine algorithm'''
993
e8eabef7857c update to OpenCV3 for python
Nicolas Saunier <nicolas.saunier@polymtl.ca>
parents: 980
diff changeset
40 def __init__(self, svmType = cv2.ml.SVM_C_SVC, kernelType = cv2.ml.SVM_RBF, degree = 0, gamma = 1, coef0 = 0, Cvalue = 1, nu = 0, p = 0):
e8eabef7857c update to OpenCV3 for python
Nicolas Saunier <nicolas.saunier@polymtl.ca>
parents: 980
diff changeset
41 self.model = cv2.ml.SVM_create()
e8eabef7857c update to OpenCV3 for python
Nicolas Saunier <nicolas.saunier@polymtl.ca>
parents: 980
diff changeset
42 self.model.setType(svmType)
e8eabef7857c update to OpenCV3 for python
Nicolas Saunier <nicolas.saunier@polymtl.ca>
parents: 980
diff changeset
43 self.model.setKernel(kernelType)
e8eabef7857c update to OpenCV3 for python
Nicolas Saunier <nicolas.saunier@polymtl.ca>
parents: 980
diff changeset
44 self.model.setDegree(degree)
e8eabef7857c update to OpenCV3 for python
Nicolas Saunier <nicolas.saunier@polymtl.ca>
parents: 980
diff changeset
45 self.model.setGamma(gamma)
e8eabef7857c update to OpenCV3 for python
Nicolas Saunier <nicolas.saunier@polymtl.ca>
parents: 980
diff changeset
46 self.model.setCoef0(coef0)
e8eabef7857c update to OpenCV3 for python
Nicolas Saunier <nicolas.saunier@polymtl.ca>
parents: 980
diff changeset
47 self.model.setC(Cvalue)
e8eabef7857c update to OpenCV3 for python
Nicolas Saunier <nicolas.saunier@polymtl.ca>
parents: 980
diff changeset
48 self.model.setNu(nu)
e8eabef7857c update to OpenCV3 for python
Nicolas Saunier <nicolas.saunier@polymtl.ca>
parents: 980
diff changeset
49 self.model.setP(p)
380
adfd4f70ee1d added SVM
Nicolas Saunier <nicolas.saunier@polymtl.ca>
parents: 312
diff changeset
50
993
e8eabef7857c update to OpenCV3 for python
Nicolas Saunier <nicolas.saunier@polymtl.ca>
parents: 980
diff changeset
51 def save(self, filename):
e8eabef7857c update to OpenCV3 for python
Nicolas Saunier <nicolas.saunier@polymtl.ca>
parents: 980
diff changeset
52 self.model.save(filename)
e8eabef7857c update to OpenCV3 for python
Nicolas Saunier <nicolas.saunier@polymtl.ca>
parents: 980
diff changeset
53
e8eabef7857c update to OpenCV3 for python
Nicolas Saunier <nicolas.saunier@polymtl.ca>
parents: 980
diff changeset
54 def train(self, samples, layout, responses, computePerformance = False):
e8eabef7857c update to OpenCV3 for python
Nicolas Saunier <nicolas.saunier@polymtl.ca>
parents: 980
diff changeset
55 self.model.train(samples, layout, responses)
978
184f1dd307f9 corrected print and exception statements for Python 3
Nicolas Saunier <nicolas.saunier@polymtl.ca>
parents: 961
diff changeset
56 if computePerformance:
184f1dd307f9 corrected print and exception statements for Python 3
Nicolas Saunier <nicolas.saunier@polymtl.ca>
parents: 961
diff changeset
57 return computeConfusionMatrix(self, samples, responses)
380
adfd4f70ee1d added SVM
Nicolas Saunier <nicolas.saunier@polymtl.ca>
parents: 312
diff changeset
58
978
184f1dd307f9 corrected print and exception statements for Python 3
Nicolas Saunier <nicolas.saunier@polymtl.ca>
parents: 961
diff changeset
59 def predict(self, hog):
993
e8eabef7857c update to OpenCV3 for python
Nicolas Saunier <nicolas.saunier@polymtl.ca>
parents: 980
diff changeset
60 retval, predictions = self.model.predict(hog)
e8eabef7857c update to OpenCV3 for python
Nicolas Saunier <nicolas.saunier@polymtl.ca>
parents: 980
diff changeset
61 if hog.shape[0] == 1:
e8eabef7857c update to OpenCV3 for python
Nicolas Saunier <nicolas.saunier@polymtl.ca>
parents: 980
diff changeset
62 return predictions[0][0]
e8eabef7857c update to OpenCV3 for python
Nicolas Saunier <nicolas.saunier@polymtl.ca>
parents: 980
diff changeset
63 else:
e8eabef7857c update to OpenCV3 for python
Nicolas Saunier <nicolas.saunier@polymtl.ca>
parents: 980
diff changeset
64 return np.asarray(predictions, dtype = np.int).ravel().tolist()
380
adfd4f70ee1d added SVM
Nicolas Saunier <nicolas.saunier@polymtl.ca>
parents: 312
diff changeset
65
993
e8eabef7857c update to OpenCV3 for python
Nicolas Saunier <nicolas.saunier@polymtl.ca>
parents: 980
diff changeset
66 def SVM_load(filename):
e8eabef7857c update to OpenCV3 for python
Nicolas Saunier <nicolas.saunier@polymtl.ca>
parents: 980
diff changeset
67 if path.exists(filename):
e8eabef7857c update to OpenCV3 for python
Nicolas Saunier <nicolas.saunier@polymtl.ca>
parents: 980
diff changeset
68 svm = SVM()
e8eabef7857c update to OpenCV3 for python
Nicolas Saunier <nicolas.saunier@polymtl.ca>
parents: 980
diff changeset
69 svm.model = cv2.ml.SVM_load(filename)
e8eabef7857c update to OpenCV3 for python
Nicolas Saunier <nicolas.saunier@polymtl.ca>
parents: 980
diff changeset
70 return svm
e8eabef7857c update to OpenCV3 for python
Nicolas Saunier <nicolas.saunier@polymtl.ca>
parents: 980
diff changeset
71 else:
e8eabef7857c update to OpenCV3 for python
Nicolas Saunier <nicolas.saunier@polymtl.ca>
parents: 980
diff changeset
72 print('Provided filename {} does not exist: model not loaded!'.format(filename))
e8eabef7857c update to OpenCV3 for python
Nicolas Saunier <nicolas.saunier@polymtl.ca>
parents: 980
diff changeset
73
786
1f2b2d1f4fbf added script and code to learn POIs
Nicolas Saunier <nicolas.saunier@polymtl.ca>
parents: 738
diff changeset
74 #####################
1f2b2d1f4fbf added script and code to learn POIs
Nicolas Saunier <nicolas.saunier@polymtl.ca>
parents: 738
diff changeset
75 # Clustering
1f2b2d1f4fbf added script and code to learn POIs
Nicolas Saunier <nicolas.saunier@polymtl.ca>
parents: 738
diff changeset
76 #####################
1f2b2d1f4fbf added script and code to learn POIs
Nicolas Saunier <nicolas.saunier@polymtl.ca>
parents: 738
diff changeset
77
665
15e244d2a1b5 corrected bug with circular import for VideoFilenameAddable, moved to base module
Nicolas Saunier <nicolas.saunier@polymtl.ca>
parents: 636
diff changeset
78 class Centroid(object):
184
d70e9b36889c initial work on flow vectors and clustering algorithms
Nicolas Saunier <nicolas.saunier@polymtl.ca>
parents: 183
diff changeset
79 'Wrapper around instances to add a counter'
d70e9b36889c initial work on flow vectors and clustering algorithms
Nicolas Saunier <nicolas.saunier@polymtl.ca>
parents: 183
diff changeset
80
d70e9b36889c initial work on flow vectors and clustering algorithms
Nicolas Saunier <nicolas.saunier@polymtl.ca>
parents: 183
diff changeset
81 def __init__(self, instance, nInstances = 1):
d70e9b36889c initial work on flow vectors and clustering algorithms
Nicolas Saunier <nicolas.saunier@polymtl.ca>
parents: 183
diff changeset
82 self.instance = instance
d70e9b36889c initial work on flow vectors and clustering algorithms
Nicolas Saunier <nicolas.saunier@polymtl.ca>
parents: 183
diff changeset
83 self.nInstances = nInstances
d70e9b36889c initial work on flow vectors and clustering algorithms
Nicolas Saunier <nicolas.saunier@polymtl.ca>
parents: 183
diff changeset
84
d70e9b36889c initial work on flow vectors and clustering algorithms
Nicolas Saunier <nicolas.saunier@polymtl.ca>
parents: 183
diff changeset
85 # def similar(instance2):
d70e9b36889c initial work on flow vectors and clustering algorithms
Nicolas Saunier <nicolas.saunier@polymtl.ca>
parents: 183
diff changeset
86 # return self.instance.similar(instance2)
d70e9b36889c initial work on flow vectors and clustering algorithms
Nicolas Saunier <nicolas.saunier@polymtl.ca>
parents: 183
diff changeset
87
d70e9b36889c initial work on flow vectors and clustering algorithms
Nicolas Saunier <nicolas.saunier@polymtl.ca>
parents: 183
diff changeset
88 def add(self, instance2):
d70e9b36889c initial work on flow vectors and clustering algorithms
Nicolas Saunier <nicolas.saunier@polymtl.ca>
parents: 183
diff changeset
89 self.instance = self.instance.multiply(self.nInstances)+instance2
d70e9b36889c initial work on flow vectors and clustering algorithms
Nicolas Saunier <nicolas.saunier@polymtl.ca>
parents: 183
diff changeset
90 self.nInstances += 1
d70e9b36889c initial work on flow vectors and clustering algorithms
Nicolas Saunier <nicolas.saunier@polymtl.ca>
parents: 183
diff changeset
91 self.instance = self.instance.multiply(1/float(self.nInstances))
d70e9b36889c initial work on flow vectors and clustering algorithms
Nicolas Saunier <nicolas.saunier@polymtl.ca>
parents: 183
diff changeset
92
d70e9b36889c initial work on flow vectors and clustering algorithms
Nicolas Saunier <nicolas.saunier@polymtl.ca>
parents: 183
diff changeset
93 def average(c):
d70e9b36889c initial work on flow vectors and clustering algorithms
Nicolas Saunier <nicolas.saunier@polymtl.ca>
parents: 183
diff changeset
94 inst = self.instance.multiply(self.nInstances)+c.instance.multiply(instance.nInstances)
d70e9b36889c initial work on flow vectors and clustering algorithms
Nicolas Saunier <nicolas.saunier@polymtl.ca>
parents: 183
diff changeset
95 inst.multiply(1/(self.nInstances+instance.nInstances))
d70e9b36889c initial work on flow vectors and clustering algorithms
Nicolas Saunier <nicolas.saunier@polymtl.ca>
parents: 183
diff changeset
96 return Centroid(inst, self.nInstances+instance.nInstances)
d70e9b36889c initial work on flow vectors and clustering algorithms
Nicolas Saunier <nicolas.saunier@polymtl.ca>
parents: 183
diff changeset
97
515
727e3c529519 renamed all draw functions to plot for consistency
Nicolas Saunier <nicolas.saunier@polymtl.ca>
parents: 501
diff changeset
98 def plot(self, options = ''):
727e3c529519 renamed all draw functions to plot for consistency
Nicolas Saunier <nicolas.saunier@polymtl.ca>
parents: 501
diff changeset
99 self.instance.plot(options)
184
d70e9b36889c initial work on flow vectors and clustering algorithms
Nicolas Saunier <nicolas.saunier@polymtl.ca>
parents: 183
diff changeset
100 text(self.instance.position.x+1, self.instance.position.y+1, str(self.nInstances))
d70e9b36889c initial work on flow vectors and clustering algorithms
Nicolas Saunier <nicolas.saunier@polymtl.ca>
parents: 183
diff changeset
101
386
Nicolas Saunier <nicolas.saunier@polymtl.ca>
parents: 382
diff changeset
102 def kMedoids(similarityMatrix, initialCentroids = None, k = None):
Nicolas Saunier <nicolas.saunier@polymtl.ca>
parents: 382
diff changeset
103 '''Algorithm that clusters any dataset based on a similarity matrix
Nicolas Saunier <nicolas.saunier@polymtl.ca>
parents: 382
diff changeset
104 Either the initialCentroids or k are passed'''
Nicolas Saunier <nicolas.saunier@polymtl.ca>
parents: 382
diff changeset
105 pass
184
d70e9b36889c initial work on flow vectors and clustering algorithms
Nicolas Saunier <nicolas.saunier@polymtl.ca>
parents: 183
diff changeset
106
526
21bdeb29f855 corrected bug in initialization of lists and loading trajectories from vissim files
Nicolas Saunier <nicolas.saunier@polymtl.ca>
parents: 515
diff changeset
107 def assignCluster(data, similarFunc, initialCentroids = None, shuffleData = True):
183
ed944ff45e8c first simple clustering algorithm implementation
Nicolas Saunier <nicolas.saunier@polymtl.ca>
parents:
diff changeset
108 '''k-means algorithm with similarity function
184
d70e9b36889c initial work on flow vectors and clustering algorithms
Nicolas Saunier <nicolas.saunier@polymtl.ca>
parents: 183
diff changeset
109 Two instances should be in the same cluster if the sameCluster function returns true for two instances. It is supposed that the average centroid of a set of instances can be computed, using the function.
183
ed944ff45e8c first simple clustering algorithm implementation
Nicolas Saunier <nicolas.saunier@polymtl.ca>
parents:
diff changeset
110 The number of clusters will be determined accordingly
ed944ff45e8c first simple clustering algorithm implementation
Nicolas Saunier <nicolas.saunier@polymtl.ca>
parents:
diff changeset
111
ed944ff45e8c first simple clustering algorithm implementation
Nicolas Saunier <nicolas.saunier@polymtl.ca>
parents:
diff changeset
112 data: list of instances
184
d70e9b36889c initial work on flow vectors and clustering algorithms
Nicolas Saunier <nicolas.saunier@polymtl.ca>
parents: 183
diff changeset
113 averageCentroid: '''
d70e9b36889c initial work on flow vectors and clustering algorithms
Nicolas Saunier <nicolas.saunier@polymtl.ca>
parents: 183
diff changeset
114 localdata = copy(data) # shallow copy to avoid modifying data
382
ba813f148ade development for clustering
Nicolas Saunier <nicolas.saunier@polymtl.ca>
parents: 380
diff changeset
115 if shuffleData:
ba813f148ade development for clustering
Nicolas Saunier <nicolas.saunier@polymtl.ca>
parents: 380
diff changeset
116 shuffle(localdata)
636
3058e00887bc removed all issues because of tests with None, using is instead of == or !=
Nicolas Saunier <nicolas.saunier@polymtl.ca>
parents: 563
diff changeset
117 if initialCentroids is None:
526
21bdeb29f855 corrected bug in initialization of lists and loading trajectories from vissim files
Nicolas Saunier <nicolas.saunier@polymtl.ca>
parents: 515
diff changeset
118 centroids = [Centroid(localdata[0])]
21bdeb29f855 corrected bug in initialization of lists and loading trajectories from vissim files
Nicolas Saunier <nicolas.saunier@polymtl.ca>
parents: 515
diff changeset
119 else:
184
d70e9b36889c initial work on flow vectors and clustering algorithms
Nicolas Saunier <nicolas.saunier@polymtl.ca>
parents: 183
diff changeset
120 centroids = deepcopy(initialCentroids)
d70e9b36889c initial work on flow vectors and clustering algorithms
Nicolas Saunier <nicolas.saunier@polymtl.ca>
parents: 183
diff changeset
121 for instance in localdata[1:]:
183
ed944ff45e8c first simple clustering algorithm implementation
Nicolas Saunier <nicolas.saunier@polymtl.ca>
parents:
diff changeset
122 i = 0
382
ba813f148ade development for clustering
Nicolas Saunier <nicolas.saunier@polymtl.ca>
parents: 380
diff changeset
123 while i<len(centroids) and not similarFunc(centroids[i].instance, instance):
183
ed944ff45e8c first simple clustering algorithm implementation
Nicolas Saunier <nicolas.saunier@polymtl.ca>
parents:
diff changeset
124 i += 1
ed944ff45e8c first simple clustering algorithm implementation
Nicolas Saunier <nicolas.saunier@polymtl.ca>
parents:
diff changeset
125 if i == len(centroids):
184
d70e9b36889c initial work on flow vectors and clustering algorithms
Nicolas Saunier <nicolas.saunier@polymtl.ca>
parents: 183
diff changeset
126 centroids.append(Centroid(instance))
183
ed944ff45e8c first simple clustering algorithm implementation
Nicolas Saunier <nicolas.saunier@polymtl.ca>
parents:
diff changeset
127 else:
184
d70e9b36889c initial work on flow vectors and clustering algorithms
Nicolas Saunier <nicolas.saunier@polymtl.ca>
parents: 183
diff changeset
128 centroids[i].add(instance)
183
ed944ff45e8c first simple clustering algorithm implementation
Nicolas Saunier <nicolas.saunier@polymtl.ca>
parents:
diff changeset
129
ed944ff45e8c first simple clustering algorithm implementation
Nicolas Saunier <nicolas.saunier@polymtl.ca>
parents:
diff changeset
130 return centroids
308
8bafd054cda4 Added a function to compute LCSS distance between two indcators
Mohamed Gomaa
parents: 184
diff changeset
131
382
ba813f148ade development for clustering
Nicolas Saunier <nicolas.saunier@polymtl.ca>
parents: 380
diff changeset
132 # TODO recompute centroids for each cluster: instance that minimizes some measure to all other elements
ba813f148ade development for clustering
Nicolas Saunier <nicolas.saunier@polymtl.ca>
parents: 380
diff changeset
133
293
ee3302528cdc rearranged new code by Paul (works now)
Nicolas Saunier <nicolas.saunier@polymtl.ca>
parents: 285
diff changeset
134 def spectralClustering(similarityMatrix, k, iter=20):
907
9fd7b18f75b4 re arranged motion pattern learning
Nicolas Saunier <nicolas.saunier@polymtl.ca>
parents: 878
diff changeset
135 '''Spectral Clustering algorithm'''
9fd7b18f75b4 re arranged motion pattern learning
Nicolas Saunier <nicolas.saunier@polymtl.ca>
parents: 878
diff changeset
136 n = len(similarityMatrix)
9fd7b18f75b4 re arranged motion pattern learning
Nicolas Saunier <nicolas.saunier@polymtl.ca>
parents: 878
diff changeset
137 # create Laplacian matrix
9fd7b18f75b4 re arranged motion pattern learning
Nicolas Saunier <nicolas.saunier@polymtl.ca>
parents: 878
diff changeset
138 rowsum = np.sum(similarityMatrix,axis=0)
9fd7b18f75b4 re arranged motion pattern learning
Nicolas Saunier <nicolas.saunier@polymtl.ca>
parents: 878
diff changeset
139 D = np.diag(1 / np.sqrt(rowsum))
9fd7b18f75b4 re arranged motion pattern learning
Nicolas Saunier <nicolas.saunier@polymtl.ca>
parents: 878
diff changeset
140 I = np.identity(n)
9fd7b18f75b4 re arranged motion pattern learning
Nicolas Saunier <nicolas.saunier@polymtl.ca>
parents: 878
diff changeset
141 L = I - np.dot(D,np.dot(similarityMatrix,D))
9fd7b18f75b4 re arranged motion pattern learning
Nicolas Saunier <nicolas.saunier@polymtl.ca>
parents: 878
diff changeset
142 # compute eigenvectors of L
9fd7b18f75b4 re arranged motion pattern learning
Nicolas Saunier <nicolas.saunier@polymtl.ca>
parents: 878
diff changeset
143 U,sigma,V = np.linalg.svd(L)
9fd7b18f75b4 re arranged motion pattern learning
Nicolas Saunier <nicolas.saunier@polymtl.ca>
parents: 878
diff changeset
144 # create feature vector from k first eigenvectors
9fd7b18f75b4 re arranged motion pattern learning
Nicolas Saunier <nicolas.saunier@polymtl.ca>
parents: 878
diff changeset
145 # by stacking eigenvectors as columns
9fd7b18f75b4 re arranged motion pattern learning
Nicolas Saunier <nicolas.saunier@polymtl.ca>
parents: 878
diff changeset
146 features = np.array(V[:k]).T
9fd7b18f75b4 re arranged motion pattern learning
Nicolas Saunier <nicolas.saunier@polymtl.ca>
parents: 878
diff changeset
147 # k-means
9fd7b18f75b4 re arranged motion pattern learning
Nicolas Saunier <nicolas.saunier@polymtl.ca>
parents: 878
diff changeset
148 features = whiten(features)
9fd7b18f75b4 re arranged motion pattern learning
Nicolas Saunier <nicolas.saunier@polymtl.ca>
parents: 878
diff changeset
149 centroids,distortion = kmeans(features,k, iter)
9fd7b18f75b4 re arranged motion pattern learning
Nicolas Saunier <nicolas.saunier@polymtl.ca>
parents: 878
diff changeset
150 code,distance = vq(features,centroids) # code starting from 0 (represent first cluster) to k-1 (last cluster)
9fd7b18f75b4 re arranged motion pattern learning
Nicolas Saunier <nicolas.saunier@polymtl.ca>
parents: 878
diff changeset
151 return code,sigma
563
39de5c532559 place holder functions
Nicolas Saunier <nicolas.saunier@polymtl.ca>
parents: 526
diff changeset
152
1044
75a6ad604cc5 work on motion patterns
Nicolas Saunier <nicolas.saunier@polymtl.ca>
parents: 1033
diff changeset
153 def assignToPrototypeClusters(instances, initialPrototypeIndices, similarities, minSimilarity, similarityFunc, minClusterSize = 0):
907
9fd7b18f75b4 re arranged motion pattern learning
Nicolas Saunier <nicolas.saunier@polymtl.ca>
parents: 878
diff changeset
154 '''Assigns instances to prototypes
980
23f98ebb113f first tests for clustering algo
Nicolas Saunier <nicolas.saunier@polymtl.ca>
parents: 978
diff changeset
155 if minClusterSize is not 0, the clusters will be refined by removing iteratively the smallest clusters
23f98ebb113f first tests for clustering algo
Nicolas Saunier <nicolas.saunier@polymtl.ca>
parents: 978
diff changeset
156 and reassigning all elements in the cluster until no cluster is smaller than minClusterSize
23f98ebb113f first tests for clustering algo
Nicolas Saunier <nicolas.saunier@polymtl.ca>
parents: 978
diff changeset
157
23f98ebb113f first tests for clustering algo
Nicolas Saunier <nicolas.saunier@polymtl.ca>
parents: 978
diff changeset
158 labels are indices in the prototypeIndices'''
1044
75a6ad604cc5 work on motion patterns
Nicolas Saunier <nicolas.saunier@polymtl.ca>
parents: 1033
diff changeset
159 prototypeIndices = copy(initialPrototypeIndices)
907
9fd7b18f75b4 re arranged motion pattern learning
Nicolas Saunier <nicolas.saunier@polymtl.ca>
parents: 878
diff changeset
160 indices = [i for i in range(len(instances)) if i not in prototypeIndices]
9fd7b18f75b4 re arranged motion pattern learning
Nicolas Saunier <nicolas.saunier@polymtl.ca>
parents: 878
diff changeset
161 labels = [-1]*len(instances)
9fd7b18f75b4 re arranged motion pattern learning
Nicolas Saunier <nicolas.saunier@polymtl.ca>
parents: 878
diff changeset
162 assign = True
9fd7b18f75b4 re arranged motion pattern learning
Nicolas Saunier <nicolas.saunier@polymtl.ca>
parents: 878
diff changeset
163 while assign:
9fd7b18f75b4 re arranged motion pattern learning
Nicolas Saunier <nicolas.saunier@polymtl.ca>
parents: 878
diff changeset
164 for i in prototypeIndices:
9fd7b18f75b4 re arranged motion pattern learning
Nicolas Saunier <nicolas.saunier@polymtl.ca>
parents: 878
diff changeset
165 labels[i] = i
9fd7b18f75b4 re arranged motion pattern learning
Nicolas Saunier <nicolas.saunier@polymtl.ca>
parents: 878
diff changeset
166 for i in indices:
980
23f98ebb113f first tests for clustering algo
Nicolas Saunier <nicolas.saunier@polymtl.ca>
parents: 978
diff changeset
167 for j in prototypeIndices:
23f98ebb113f first tests for clustering algo
Nicolas Saunier <nicolas.saunier@polymtl.ca>
parents: 978
diff changeset
168 if similarities[i][j] < 0:
23f98ebb113f first tests for clustering algo
Nicolas Saunier <nicolas.saunier@polymtl.ca>
parents: 978
diff changeset
169 similarities[i][j] = similarityFunc(instances[i], instances[j])
23f98ebb113f first tests for clustering algo
Nicolas Saunier <nicolas.saunier@polymtl.ca>
parents: 978
diff changeset
170 similarities[j][i] = similarities[i][j]
23f98ebb113f first tests for clustering algo
Nicolas Saunier <nicolas.saunier@polymtl.ca>
parents: 978
diff changeset
171 label = similarities[i][prototypeIndices].argmax()
23f98ebb113f first tests for clustering algo
Nicolas Saunier <nicolas.saunier@polymtl.ca>
parents: 978
diff changeset
172 if similarities[i][prototypeIndices[label]] >= minSimilarity:
23f98ebb113f first tests for clustering algo
Nicolas Saunier <nicolas.saunier@polymtl.ca>
parents: 978
diff changeset
173 labels[i] = prototypeIndices[label]
907
9fd7b18f75b4 re arranged motion pattern learning
Nicolas Saunier <nicolas.saunier@polymtl.ca>
parents: 878
diff changeset
174 else:
9fd7b18f75b4 re arranged motion pattern learning
Nicolas Saunier <nicolas.saunier@polymtl.ca>
parents: 878
diff changeset
175 labels[i] = -1 # outlier
9fd7b18f75b4 re arranged motion pattern learning
Nicolas Saunier <nicolas.saunier@polymtl.ca>
parents: 878
diff changeset
176 clusterSizes = {i: sum(np.array(labels) == i) for i in prototypeIndices}
9fd7b18f75b4 re arranged motion pattern learning
Nicolas Saunier <nicolas.saunier@polymtl.ca>
parents: 878
diff changeset
177 smallestClusterIndex = min(clusterSizes, key = clusterSizes.get)
9fd7b18f75b4 re arranged motion pattern learning
Nicolas Saunier <nicolas.saunier@polymtl.ca>
parents: 878
diff changeset
178 assign = (clusterSizes[smallestClusterIndex] < minClusterSize)
9fd7b18f75b4 re arranged motion pattern learning
Nicolas Saunier <nicolas.saunier@polymtl.ca>
parents: 878
diff changeset
179 if assign:
9fd7b18f75b4 re arranged motion pattern learning
Nicolas Saunier <nicolas.saunier@polymtl.ca>
parents: 878
diff changeset
180 prototypeIndices.remove(smallestClusterIndex)
9fd7b18f75b4 re arranged motion pattern learning
Nicolas Saunier <nicolas.saunier@polymtl.ca>
parents: 878
diff changeset
181 indices = [i for i in range(similarities.shape[0]) if labels[i] == smallestClusterIndex]
9fd7b18f75b4 re arranged motion pattern learning
Nicolas Saunier <nicolas.saunier@polymtl.ca>
parents: 878
diff changeset
182 return prototypeIndices, labels
980
23f98ebb113f first tests for clustering algo
Nicolas Saunier <nicolas.saunier@polymtl.ca>
parents: 978
diff changeset
183
1044
75a6ad604cc5 work on motion patterns
Nicolas Saunier <nicolas.saunier@polymtl.ca>
parents: 1033
diff changeset
184 def prototypeCluster(instances, similarities, minSimilarity, similarityFunc, optimizeCentroid = False, randomInitialization = False, initialPrototypeIndices = None):
731
b02431a8234c made prototypecluster generic, in ml module, and added randominitialization
Nicolas Saunier <nicolas.saunier@polymtl.ca>
parents: 680
diff changeset
185 '''Finds exemplar (prototype) instance that represent each cluster
952
a9b2beef0db4 loading and assigning motion patterns works
Nicolas Saunier <nicolas.saunier@polymtl.ca>
parents: 949
diff changeset
186 Returns the prototype indices (in the instances list)
731
b02431a8234c made prototypecluster generic, in ml module, and added randominitialization
Nicolas Saunier <nicolas.saunier@polymtl.ca>
parents: 680
diff changeset
187
980
23f98ebb113f first tests for clustering algo
Nicolas Saunier <nicolas.saunier@polymtl.ca>
parents: 978
diff changeset
188 the elements in the instances list must have a length (method __len__), or one can use the optimizeCentroid
735
0e875a7f5759 modified prototypeCluster algorithm to enforce similarity when re-assigning and to compute only the necessary similarities
Nicolas Saunier <nicolas.saunier@polymtl.ca>
parents: 734
diff changeset
189 the positions in the instances list corresponds to the similarities
0e875a7f5759 modified prototypeCluster algorithm to enforce similarity when re-assigning and to compute only the necessary similarities
Nicolas Saunier <nicolas.saunier@polymtl.ca>
parents: 734
diff changeset
190 if similarityFunc is provided, the similarities are calculated as needed (this is faster) if not in similarities (negative if not computed)
0e875a7f5759 modified prototypeCluster algorithm to enforce similarity when re-assigning and to compute only the necessary similarities
Nicolas Saunier <nicolas.saunier@polymtl.ca>
parents: 734
diff changeset
191 similarities must still be allocated with the right size
731
b02431a8234c made prototypecluster generic, in ml module, and added randominitialization
Nicolas Saunier <nicolas.saunier@polymtl.ca>
parents: 680
diff changeset
192
b02431a8234c made prototypecluster generic, in ml module, and added randominitialization
Nicolas Saunier <nicolas.saunier@polymtl.ca>
parents: 680
diff changeset
193 if an instance is different enough (<minSimilarity),
b02431a8234c made prototypecluster generic, in ml module, and added randominitialization
Nicolas Saunier <nicolas.saunier@polymtl.ca>
parents: 680
diff changeset
194 it will become a new prototype.
b02431a8234c made prototypecluster generic, in ml module, and added randominitialization
Nicolas Saunier <nicolas.saunier@polymtl.ca>
parents: 680
diff changeset
195 Non-prototype instances will be assigned to an existing prototype
843
5dc7a507353e updated to learn prototypes
Nicolas Saunier <nicolas.saunier@polymtl.ca>
parents: 807
diff changeset
196
908
b297525b2cbf added options to the prototype cluster algorithm, work in progress
Nicolas Saunier <nicolas.saunier@polymtl.ca>
parents: 907
diff changeset
197 if optimizeCentroid is True, each time an element is added, we recompute the centroid trajectory as the most similar to all in the cluster
731
b02431a8234c made prototypecluster generic, in ml module, and added randominitialization
Nicolas Saunier <nicolas.saunier@polymtl.ca>
parents: 680
diff changeset
198
949
d6c1c05d11f5 modified multithreading at the interaction level for safety computations
Nicolas Saunier <nicolas.saunier@polymtl.ca>
parents: 917
diff changeset
199 initialPrototypeIndices are indices in instances
d6c1c05d11f5 modified multithreading at the interaction level for safety computations
Nicolas Saunier <nicolas.saunier@polymtl.ca>
parents: 917
diff changeset
200
908
b297525b2cbf added options to the prototype cluster algorithm, work in progress
Nicolas Saunier <nicolas.saunier@polymtl.ca>
parents: 907
diff changeset
201 TODO: check how similarity evolves in clusters'''
878
8e8ec4ece66e minor + bug corrected in motion pattern learning
Nicolas Saunier <nicolas.saunier@polymtl.ca>
parents: 843
diff changeset
202 if len(instances) == 0:
8e8ec4ece66e minor + bug corrected in motion pattern learning
Nicolas Saunier <nicolas.saunier@polymtl.ca>
parents: 843
diff changeset
203 print('no instances to cluster (empty list)')
8e8ec4ece66e minor + bug corrected in motion pattern learning
Nicolas Saunier <nicolas.saunier@polymtl.ca>
parents: 843
diff changeset
204 return None
8e8ec4ece66e minor + bug corrected in motion pattern learning
Nicolas Saunier <nicolas.saunier@polymtl.ca>
parents: 843
diff changeset
205
908
b297525b2cbf added options to the prototype cluster algorithm, work in progress
Nicolas Saunier <nicolas.saunier@polymtl.ca>
parents: 907
diff changeset
206 # sort instances based on length
1033
8ffb3ae9f3d2 work in progress
Nicolas Saunier <nicolas.saunier@polymtl.ca>
parents: 1028
diff changeset
207 indices = list(range(len(instances)))
908
b297525b2cbf added options to the prototype cluster algorithm, work in progress
Nicolas Saunier <nicolas.saunier@polymtl.ca>
parents: 907
diff changeset
208 if randomInitialization or optimizeCentroid:
980
23f98ebb113f first tests for clustering algo
Nicolas Saunier <nicolas.saunier@polymtl.ca>
parents: 978
diff changeset
209 indices = np.random.permutation(indices).tolist()
731
b02431a8234c made prototypecluster generic, in ml module, and added randominitialization
Nicolas Saunier <nicolas.saunier@polymtl.ca>
parents: 680
diff changeset
210 else:
1033
8ffb3ae9f3d2 work in progress
Nicolas Saunier <nicolas.saunier@polymtl.ca>
parents: 1028
diff changeset
211 indices.sort(key=lambda i: len(instances[i]))
980
23f98ebb113f first tests for clustering algo
Nicolas Saunier <nicolas.saunier@polymtl.ca>
parents: 978
diff changeset
212 # initialize clusters
908
b297525b2cbf added options to the prototype cluster algorithm, work in progress
Nicolas Saunier <nicolas.saunier@polymtl.ca>
parents: 907
diff changeset
213 clusters = []
907
9fd7b18f75b4 re arranged motion pattern learning
Nicolas Saunier <nicolas.saunier@polymtl.ca>
parents: 878
diff changeset
214 if initialPrototypeIndices is None:
9fd7b18f75b4 re arranged motion pattern learning
Nicolas Saunier <nicolas.saunier@polymtl.ca>
parents: 878
diff changeset
215 prototypeIndices = [indices[0]]
9fd7b18f75b4 re arranged motion pattern learning
Nicolas Saunier <nicolas.saunier@polymtl.ca>
parents: 878
diff changeset
216 else:
908
b297525b2cbf added options to the prototype cluster algorithm, work in progress
Nicolas Saunier <nicolas.saunier@polymtl.ca>
parents: 907
diff changeset
217 prototypeIndices = initialPrototypeIndices # think of the format: if indices, have to be in instances
b297525b2cbf added options to the prototype cluster algorithm, work in progress
Nicolas Saunier <nicolas.saunier@polymtl.ca>
parents: 907
diff changeset
218 for i in prototypeIndices:
b297525b2cbf added options to the prototype cluster algorithm, work in progress
Nicolas Saunier <nicolas.saunier@polymtl.ca>
parents: 907
diff changeset
219 clusters.append([i])
949
d6c1c05d11f5 modified multithreading at the interaction level for safety computations
Nicolas Saunier <nicolas.saunier@polymtl.ca>
parents: 917
diff changeset
220 indices.remove(i)
980
23f98ebb113f first tests for clustering algo
Nicolas Saunier <nicolas.saunier@polymtl.ca>
parents: 978
diff changeset
221 # go through all instances
949
d6c1c05d11f5 modified multithreading at the interaction level for safety computations
Nicolas Saunier <nicolas.saunier@polymtl.ca>
parents: 917
diff changeset
222 for i in indices:
908
b297525b2cbf added options to the prototype cluster algorithm, work in progress
Nicolas Saunier <nicolas.saunier@polymtl.ca>
parents: 907
diff changeset
223 for j in prototypeIndices:
b297525b2cbf added options to the prototype cluster algorithm, work in progress
Nicolas Saunier <nicolas.saunier@polymtl.ca>
parents: 907
diff changeset
224 if similarities[i][j] < 0:
b297525b2cbf added options to the prototype cluster algorithm, work in progress
Nicolas Saunier <nicolas.saunier@polymtl.ca>
parents: 907
diff changeset
225 similarities[i][j] = similarityFunc(instances[i], instances[j])
b297525b2cbf added options to the prototype cluster algorithm, work in progress
Nicolas Saunier <nicolas.saunier@polymtl.ca>
parents: 907
diff changeset
226 similarities[j][i] = similarities[i][j]
980
23f98ebb113f first tests for clustering algo
Nicolas Saunier <nicolas.saunier@polymtl.ca>
parents: 978
diff changeset
227 label = similarities[i][prototypeIndices].argmax() # index in prototypeIndices
908
b297525b2cbf added options to the prototype cluster algorithm, work in progress
Nicolas Saunier <nicolas.saunier@polymtl.ca>
parents: 907
diff changeset
228 if similarities[i][prototypeIndices[label]] < minSimilarity:
843
5dc7a507353e updated to learn prototypes
Nicolas Saunier <nicolas.saunier@polymtl.ca>
parents: 807
diff changeset
229 prototypeIndices.append(i)
908
b297525b2cbf added options to the prototype cluster algorithm, work in progress
Nicolas Saunier <nicolas.saunier@polymtl.ca>
parents: 907
diff changeset
230 clusters.append([])
b297525b2cbf added options to the prototype cluster algorithm, work in progress
Nicolas Saunier <nicolas.saunier@polymtl.ca>
parents: 907
diff changeset
231 else:
b297525b2cbf added options to the prototype cluster algorithm, work in progress
Nicolas Saunier <nicolas.saunier@polymtl.ca>
parents: 907
diff changeset
232 clusters[label].append(i)
b297525b2cbf added options to the prototype cluster algorithm, work in progress
Nicolas Saunier <nicolas.saunier@polymtl.ca>
parents: 907
diff changeset
233 if optimizeCentroid:
b297525b2cbf added options to the prototype cluster algorithm, work in progress
Nicolas Saunier <nicolas.saunier@polymtl.ca>
parents: 907
diff changeset
234 if len(clusters[label]) >= 2: # no point if only one element in cluster
b297525b2cbf added options to the prototype cluster algorithm, work in progress
Nicolas Saunier <nicolas.saunier@polymtl.ca>
parents: 907
diff changeset
235 for j in clusters[label][:-1]:
b297525b2cbf added options to the prototype cluster algorithm, work in progress
Nicolas Saunier <nicolas.saunier@polymtl.ca>
parents: 907
diff changeset
236 if similarities[i][j] < 0:
b297525b2cbf added options to the prototype cluster algorithm, work in progress
Nicolas Saunier <nicolas.saunier@polymtl.ca>
parents: 907
diff changeset
237 similarities[i][j] = similarityFunc(instances[i], instances[j])
b297525b2cbf added options to the prototype cluster algorithm, work in progress
Nicolas Saunier <nicolas.saunier@polymtl.ca>
parents: 907
diff changeset
238 similarities[j][i] = similarities[i][j]
b297525b2cbf added options to the prototype cluster algorithm, work in progress
Nicolas Saunier <nicolas.saunier@polymtl.ca>
parents: 907
diff changeset
239 clusterIndices = clusters[label]
b297525b2cbf added options to the prototype cluster algorithm, work in progress
Nicolas Saunier <nicolas.saunier@polymtl.ca>
parents: 907
diff changeset
240 clusterSimilarities = similarities[clusterIndices][:,clusterIndices]
b297525b2cbf added options to the prototype cluster algorithm, work in progress
Nicolas Saunier <nicolas.saunier@polymtl.ca>
parents: 907
diff changeset
241 newCentroidIdx = clusterIndices[clusterSimilarities.sum(0).argmax()]
b297525b2cbf added options to the prototype cluster algorithm, work in progress
Nicolas Saunier <nicolas.saunier@polymtl.ca>
parents: 907
diff changeset
242 if prototypeIndices[label] != newCentroidIdx:
b297525b2cbf added options to the prototype cluster algorithm, work in progress
Nicolas Saunier <nicolas.saunier@polymtl.ca>
parents: 907
diff changeset
243 prototypeIndices[label] = newCentroidIdx
952
a9b2beef0db4 loading and assigning motion patterns works
Nicolas Saunier <nicolas.saunier@polymtl.ca>
parents: 949
diff changeset
244 elif len(instances[prototypeIndices[label]]) < len(instances[i]): # replace prototype by current instance i if longer # otherwise, possible to test if randomInitialization or initialPrototypes is not None
a9b2beef0db4 loading and assigning motion patterns works
Nicolas Saunier <nicolas.saunier@polymtl.ca>
parents: 949
diff changeset
245 prototypeIndices[label] = i
a9b2beef0db4 loading and assigning motion patterns works
Nicolas Saunier <nicolas.saunier@polymtl.ca>
parents: 949
diff changeset
246 return prototypeIndices
738
2472b4d59aea small function
Nicolas Saunier <nicolas.saunier@polymtl.ca>
parents: 735
diff changeset
247
2472b4d59aea small function
Nicolas Saunier <nicolas.saunier@polymtl.ca>
parents: 735
diff changeset
248 def computeClusterSizes(labels, prototypeIndices, outlierIndex = -1):
2472b4d59aea small function
Nicolas Saunier <nicolas.saunier@polymtl.ca>
parents: 735
diff changeset
249 clusterSizes = {i: sum(np.array(labels) == i) for i in prototypeIndices}
786
1f2b2d1f4fbf added script and code to learn POIs
Nicolas Saunier <nicolas.saunier@polymtl.ca>
parents: 738
diff changeset
250 clusterSizes['outlier'] = sum(np.array(labels) == outlierIndex)
738
2472b4d59aea small function
Nicolas Saunier <nicolas.saunier@polymtl.ca>
parents: 735
diff changeset
251 return clusterSizes
786
1f2b2d1f4fbf added script and code to learn POIs
Nicolas Saunier <nicolas.saunier@polymtl.ca>
parents: 738
diff changeset
252
908
b297525b2cbf added options to the prototype cluster algorithm, work in progress
Nicolas Saunier <nicolas.saunier@polymtl.ca>
parents: 907
diff changeset
253 def computeClusterStatistics(labels, prototypeIndices, instances, similarities, similarityFunc, clusters = None, outlierIndex = -1):
b297525b2cbf added options to the prototype cluster algorithm, work in progress
Nicolas Saunier <nicolas.saunier@polymtl.ca>
parents: 907
diff changeset
254 if clusters is None:
b297525b2cbf added options to the prototype cluster algorithm, work in progress
Nicolas Saunier <nicolas.saunier@polymtl.ca>
parents: 907
diff changeset
255 clusters = {protoId:[] for protoId in prototypeIndices+[-1]}
b297525b2cbf added options to the prototype cluster algorithm, work in progress
Nicolas Saunier <nicolas.saunier@polymtl.ca>
parents: 907
diff changeset
256 for i,l in enumerate(labels):
b297525b2cbf added options to the prototype cluster algorithm, work in progress
Nicolas Saunier <nicolas.saunier@polymtl.ca>
parents: 907
diff changeset
257 clusters[l].append(i)
b297525b2cbf added options to the prototype cluster algorithm, work in progress
Nicolas Saunier <nicolas.saunier@polymtl.ca>
parents: 907
diff changeset
258 clusters = [clusters[protoId] for protoId in prototypeIndices]
b297525b2cbf added options to the prototype cluster algorithm, work in progress
Nicolas Saunier <nicolas.saunier@polymtl.ca>
parents: 907
diff changeset
259 for i, cluster in enumerate(clusters):
b297525b2cbf added options to the prototype cluster algorithm, work in progress
Nicolas Saunier <nicolas.saunier@polymtl.ca>
parents: 907
diff changeset
260 n = len(cluster)
b297525b2cbf added options to the prototype cluster algorithm, work in progress
Nicolas Saunier <nicolas.saunier@polymtl.ca>
parents: 907
diff changeset
261 print('cluster {}: {} elements'.format(prototypeIndices[i], n))
b297525b2cbf added options to the prototype cluster algorithm, work in progress
Nicolas Saunier <nicolas.saunier@polymtl.ca>
parents: 907
diff changeset
262 if n >=2:
b297525b2cbf added options to the prototype cluster algorithm, work in progress
Nicolas Saunier <nicolas.saunier@polymtl.ca>
parents: 907
diff changeset
263 for j,k in enumerate(cluster):
b297525b2cbf added options to the prototype cluster algorithm, work in progress
Nicolas Saunier <nicolas.saunier@polymtl.ca>
parents: 907
diff changeset
264 for l in cluster[:j]:
b297525b2cbf added options to the prototype cluster algorithm, work in progress
Nicolas Saunier <nicolas.saunier@polymtl.ca>
parents: 907
diff changeset
265 if similarities[k][l] < 0:
b297525b2cbf added options to the prototype cluster algorithm, work in progress
Nicolas Saunier <nicolas.saunier@polymtl.ca>
parents: 907
diff changeset
266 similarities[k][l] = similarityFunc(instances[k], instances[l])
b297525b2cbf added options to the prototype cluster algorithm, work in progress
Nicolas Saunier <nicolas.saunier@polymtl.ca>
parents: 907
diff changeset
267 similarities[l][k] = similarities[k][l]
b297525b2cbf added options to the prototype cluster algorithm, work in progress
Nicolas Saunier <nicolas.saunier@polymtl.ca>
parents: 907
diff changeset
268 print('Mean similarity to prototype: {}'.format((similarities[prototypeIndices[i]][cluster].sum()+1)/(n-1)))
b297525b2cbf added options to the prototype cluster algorithm, work in progress
Nicolas Saunier <nicolas.saunier@polymtl.ca>
parents: 907
diff changeset
269 print('Mean overall similarity: {}'.format((similarities[cluster][:,cluster].sum()+n)/(n*(n-1))))
915
13434f5017dd work to save trajectory assignment to origin and destinations
Nicolas Saunier <nicolas.saunier@polymtl.ca>
parents: 914
diff changeset
270
786
1f2b2d1f4fbf added script and code to learn POIs
Nicolas Saunier <nicolas.saunier@polymtl.ca>
parents: 738
diff changeset
271 # Gaussian Mixture Models
917
89cc05867c4c reorg and work in progress
Nicolas Saunier <nicolas.saunier@polymtl.ca>
parents: 916
diff changeset
272 def plotGMM(mean, covariance, gmmId, fig, color, alpha = 0.3):
916
7345f0d51faa added display of paths
Nicolas Saunier <nicolas.saunier@polymtl.ca>
parents: 915
diff changeset
273 v, w = np.linalg.eigh(covariance)
7345f0d51faa added display of paths
Nicolas Saunier <nicolas.saunier@polymtl.ca>
parents: 915
diff changeset
274 angle = 180*np.arctan2(w[0][1], w[0][0])/np.pi
7345f0d51faa added display of paths
Nicolas Saunier <nicolas.saunier@polymtl.ca>
parents: 915
diff changeset
275 v *= 4
7345f0d51faa added display of paths
Nicolas Saunier <nicolas.saunier@polymtl.ca>
parents: 915
diff changeset
276 ell = mpl.patches.Ellipse(mean, v[0], v[1], 180+angle, color=color)
7345f0d51faa added display of paths
Nicolas Saunier <nicolas.saunier@polymtl.ca>
parents: 915
diff changeset
277 ell.set_clip_box(fig.bbox)
7345f0d51faa added display of paths
Nicolas Saunier <nicolas.saunier@polymtl.ca>
parents: 915
diff changeset
278 ell.set_alpha(alpha)
7345f0d51faa added display of paths
Nicolas Saunier <nicolas.saunier@polymtl.ca>
parents: 915
diff changeset
279 fig.axes[0].add_artist(ell)
7345f0d51faa added display of paths
Nicolas Saunier <nicolas.saunier@polymtl.ca>
parents: 915
diff changeset
280 plt.plot([mean[0]], [mean[1]], 'x'+color)
917
89cc05867c4c reorg and work in progress
Nicolas Saunier <nicolas.saunier@polymtl.ca>
parents: 916
diff changeset
281 plt.annotate(str(gmmId), xy=(mean[0]+1, mean[1]+1))
916
7345f0d51faa added display of paths
Nicolas Saunier <nicolas.saunier@polymtl.ca>
parents: 915
diff changeset
282
915
13434f5017dd work to save trajectory assignment to origin and destinations
Nicolas Saunier <nicolas.saunier@polymtl.ca>
parents: 914
diff changeset
283 def plotGMMClusters(model, labels = None, dataset = None, fig = None, colors = utils.colors, nUnitsPerPixel = 1., alpha = 0.3):
786
1f2b2d1f4fbf added script and code to learn POIs
Nicolas Saunier <nicolas.saunier@polymtl.ca>
parents: 738
diff changeset
284 '''plot the ellipse corresponding to the Gaussians
1f2b2d1f4fbf added script and code to learn POIs
Nicolas Saunier <nicolas.saunier@polymtl.ca>
parents: 738
diff changeset
285 and the predicted classes of the instances in the dataset'''
787
0a428b449b80 improved script to display over world image
Nicolas Saunier <nicolas.saunier@polymtl.ca>
parents: 786
diff changeset
286 if fig is None:
0a428b449b80 improved script to display over world image
Nicolas Saunier <nicolas.saunier@polymtl.ca>
parents: 786
diff changeset
287 fig = plt.figure()
916
7345f0d51faa added display of paths
Nicolas Saunier <nicolas.saunier@polymtl.ca>
parents: 915
diff changeset
288 if len(fig.get_axes()) == 0:
7345f0d51faa added display of paths
Nicolas Saunier <nicolas.saunier@polymtl.ca>
parents: 915
diff changeset
289 fig.add_subplot(111)
998
933670761a57 updated code to python 3 (tests pass and scripts run, but non-executed parts of code are probably still not correct)
Nicolas Saunier <nicolas.saunier@polymtl.ca>
parents: 993
diff changeset
290 for i in range(model.n_components):
805
180b6b0231c0 added saving/loading points of interests
Nicolas Saunier <nicolas.saunier@polymtl.ca>
parents: 791
diff changeset
291 mean = model.means_[i]/nUnitsPerPixel
914
f228fd649644 corrected bugs in learn-pois.py
Nicolas Saunier <nicolas.saunier@polymtl.ca>
parents: 913
diff changeset
292 covariance = model.covariances_[i]/nUnitsPerPixel
915
13434f5017dd work to save trajectory assignment to origin and destinations
Nicolas Saunier <nicolas.saunier@polymtl.ca>
parents: 914
diff changeset
293 # plot points
786
1f2b2d1f4fbf added script and code to learn POIs
Nicolas Saunier <nicolas.saunier@polymtl.ca>
parents: 738
diff changeset
294 if dataset is not None:
915
13434f5017dd work to save trajectory assignment to origin and destinations
Nicolas Saunier <nicolas.saunier@polymtl.ca>
parents: 914
diff changeset
295 tmpDataset = dataset/nUnitsPerPixel
787
0a428b449b80 improved script to display over world image
Nicolas Saunier <nicolas.saunier@polymtl.ca>
parents: 786
diff changeset
296 plt.scatter(tmpDataset[labels == i, 0], tmpDataset[labels == i, 1], .8, color=colors[i])
916
7345f0d51faa added display of paths
Nicolas Saunier <nicolas.saunier@polymtl.ca>
parents: 915
diff changeset
297 # plot an ellipse to show the Gaussian component
7345f0d51faa added display of paths
Nicolas Saunier <nicolas.saunier@polymtl.ca>
parents: 915
diff changeset
298 plotGMM(mean, covariance, i, fig, colors[i], alpha)
915
13434f5017dd work to save trajectory assignment to origin and destinations
Nicolas Saunier <nicolas.saunier@polymtl.ca>
parents: 914
diff changeset
299 if dataset is None: # to address issues without points, the axes limits are not redrawn
13434f5017dd work to save trajectory assignment to origin and destinations
Nicolas Saunier <nicolas.saunier@polymtl.ca>
parents: 914
diff changeset
300 minima = model.means_.min(0)
13434f5017dd work to save trajectory assignment to origin and destinations
Nicolas Saunier <nicolas.saunier@polymtl.ca>
parents: 914
diff changeset
301 maxima = model.means_.max(0)
13434f5017dd work to save trajectory assignment to origin and destinations
Nicolas Saunier <nicolas.saunier@polymtl.ca>
parents: 914
diff changeset
302 xwidth = 0.5*(maxima[0]-minima[0])
13434f5017dd work to save trajectory assignment to origin and destinations
Nicolas Saunier <nicolas.saunier@polymtl.ca>
parents: 914
diff changeset
303 ywidth = 0.5*(maxima[1]-minima[1])
13434f5017dd work to save trajectory assignment to origin and destinations
Nicolas Saunier <nicolas.saunier@polymtl.ca>
parents: 914
diff changeset
304 plt.xlim(minima[0]-xwidth,maxima[0]+xwidth)
13434f5017dd work to save trajectory assignment to origin and destinations
Nicolas Saunier <nicolas.saunier@polymtl.ca>
parents: 914
diff changeset
305 plt.ylim(minima[1]-ywidth,maxima[1]+ywidth)
980
23f98ebb113f first tests for clustering algo
Nicolas Saunier <nicolas.saunier@polymtl.ca>
parents: 978
diff changeset
306
23f98ebb113f first tests for clustering algo
Nicolas Saunier <nicolas.saunier@polymtl.ca>
parents: 978
diff changeset
307 if __name__ == "__main__":
23f98ebb113f first tests for clustering algo
Nicolas Saunier <nicolas.saunier@polymtl.ca>
parents: 978
diff changeset
308 import doctest
23f98ebb113f first tests for clustering algo
Nicolas Saunier <nicolas.saunier@polymtl.ca>
parents: 978
diff changeset
309 import unittest
23f98ebb113f first tests for clustering algo
Nicolas Saunier <nicolas.saunier@polymtl.ca>
parents: 978
diff changeset
310 suite = doctest.DocFileSuite('tests/ml.txt')
23f98ebb113f first tests for clustering algo
Nicolas Saunier <nicolas.saunier@polymtl.ca>
parents: 978
diff changeset
311 unittest.TextTestRunner().run(suite)
23f98ebb113f first tests for clustering algo
Nicolas Saunier <nicolas.saunier@polymtl.ca>
parents: 978
diff changeset
312 # #doctest.testmod()
23f98ebb113f first tests for clustering algo
Nicolas Saunier <nicolas.saunier@polymtl.ca>
parents: 978
diff changeset
313 # #doctest.testfile("example.txt")