In this exercise we implement the K-means clustering algorithm and apply it to compress an image; in the second part, we use principal component analysis (PCA) to find a low-dimensional representation of face images.
K-means Clustering
Implementing K-means
The K-means algorithm is a method for automatically clustering similar data samples together. The intuition behind K-means is an iterative process: it starts from a guess for the initial centroids, then refines this guess by repeatedly assigning every example to its closest centroid and recomputing the centroids from those assignments. The concrete steps:
1. Randomly initialize K centroids.
2. Iterate: assign each sample to its closest centroid.
3. Compute the mean of the points assigned to each centroid and use it as the new centroid.
4. Return to step 2 until the procedure converges to a final set of means.

In practice, the K-means algorithm is usually run several times with different random initializations; one way to choose among the resulting solutions is to pick the one with the lowest value of the cost function (the distortion), whose formula is given below.
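For reference, the distortion mentioned above is the K-means cost function:

$$J\left(c^{(1)},\dots,c^{(m)},\mu_{1},\dots,\mu_{K}\right)=\frac{1}{m}\sum_{i=1}^{m}\left\lVert x^{(i)}-\mu_{c^{(i)}}\right\rVert^{2}$$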
Finding closest centroids

For every example i we set

$$c^{(i)} := \arg\min_{j}\left\lVert x^{(i)}-\mu_{j}\right\rVert^{2}$$

Computing centroid means

For every centroid k we set

$$\mu_{k} := \frac{1}{\lvert C_{k}\rvert}\sum_{i\in C_{k}} x^{(i)}$$

i.e. each new centroid is reset to the mean of all the samples assigned to it.
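A tiny worked example of one assignment step and one update step (the numbers are made up for illustration; this is not part of the exercise code):

```python
import numpy as np

pts = np.array([[1.0, 1.0], [1.2, 0.8], [5.0, 5.0]])
mu = np.array([[1.0, 1.0], [4.0, 4.0]])

# assignment step: each point goes to its nearest centroid
c = np.argmin(((pts[:, None, :] - mu[None, :, :]) ** 2).sum(-1), axis=1)
print(c)       # -> [0 0 1]

# update step: each centroid becomes the mean of its assigned points
mu_new = np.array([pts[c == k].mean(axis=0) for k in range(2)])
print(mu_new)  # -> [[1.1 0.9] [5.  5. ]]
```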
K-means on example dataset
```python
import numpy as np
import matplotlib.pyplot as plt
from scipy.io import loadmat

def find_closest_centroids(X, centroids):
    """Assign each sample in X to the index of its closest centroid."""
    idx = np.zeros((len(X), 1))
    K = len(centroids)
    t = [0 for _ in range(K)]
    for i in range(len(X)):
        for j in range(K):
            temp = centroids[j, :] - X[i, :]
            t[j] = temp.dot(temp)   # squared distance to centroid j
        idx[i] = t.index(min(t))    # 0-based cluster index, consistent with compute_centroids and plotData below
    return idx

mat = loadmat('ex7data2.mat')
X = mat['X']
init_centroids = np.array([[3, 3], [6, 2], [8, 5]])
idx = find_closest_centroids(X, init_centroids)
print(idx[0:3])   # -> [[0.] [2.] [1.]] (the 1-based MATLAB exercise reports 1 3 2)
```

Next, compute the mean of the samples assigned to each centroid to obtain the new centroids:
```python
def compute_centroids(X, idx):
    """Recompute each centroid as the mean of the samples assigned to it."""
    K = len(np.unique(idx))   # assumes every cluster received at least one sample
    m, n = X.shape
    temp = np.zeros((K, n))    # running sum of the samples in each cluster
    count = np.zeros((K, 1))   # number of samples in each cluster
    for i in range(m):
        for j in range(K):
            if idx[i] == j:
                temp[j, :] = temp[j, :] + X[i, :]
                count[j] = count[j] + 1
    centroids = temp / count
    return centroids
```

Plot the clustering process:
```python
def plotData(X, centroids, idx=None):
    """
    Visualize the data, coloring each cluster differently.
    idx: the cluster assignment of every sample from the final iteration
    centroids: the history of the centroid positions over the iterations
    """
    colors = ['b', 'g', 'gold', 'darkorange', 'salmon', 'olivedrab',
              'maroon', 'navy', 'sienna', 'tomato', 'lightgray',
              'gainsboro', 'coral', 'aliceblue', 'dimgray', 'mintcream']
    assert len(centroids[0]) <= len(colors), 'not enough colors'

    subX = []  # samples grouped by cluster
    if idx is not None:
        for i in range(centroids[0].shape[0]):
            x_i = X[idx[:, 0] == i]
            subX.append(x_i)
    else:
        subX = [X]  # wrap X in a one-element list so the loop below handles both cases

    # plot each cluster's points with its own color
    plt.figure(figsize=(8, 5))
    for i in range(len(subX)):
        xx = subX[i]
        plt.scatter(xx[:, 0], xx[:, 1], c=colors[i], label='Cluster %d' % i)
    plt.legend()
    plt.grid(True)
    plt.xlabel('x1', fontsize=14)
    plt.ylabel('x2', fontsize=14)
    plt.title('Plot of X Points', fontsize=16)

    # trace how each centroid moved across the iterations
    xx, yy = [], []
    for centroid in centroids:
        xx.append(centroid[:, 0])
        yy.append(centroid[:, 1])
    plt.plot(xx, yy, 'rx--', markersize=8)

plotData(X, [init_centroids])
```
```python
plt.show()

def run_k_means(X, centroids, max_iters):
    """Run K-means for max_iters iterations, recording the centroid history."""
    centroids_all = [centroids]
    centroid_i = centroids
    for i in range(max_iters):
        idx = find_closest_centroids(X, centroid_i)
        centroid_i = compute_centroids(X, idx)
        centroids_all.append(centroid_i)
    return idx, centroids_all

idx, centroids_all = run_k_means(X, init_centroids, 20)
plotData(X, centroids_all, idx)
plt.show()
```

Random initialization
Shuffle all the data and take the first K samples as the centroids; in other words, randomly pick K samples as the initial centroids.
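A minimal sketch of keeping the lowest-distortion run among several random restarts (the helper names `distortion` and `best_of_n_runs` are mine, not from the exercise; `run_k_means` is defined above):

```python
import numpy as np

def distortion(X, idx, centroids):
    """Mean squared distance from every sample to its assigned centroid."""
    assigned = centroids[idx[:, 0].astype(int)]
    return np.mean(np.sum((X - assigned) ** 2, axis=1))

def best_of_n_runs(X, K, n_init=10, max_iters=10):
    """Run K-means n_init times from random starts; keep the lowest-distortion run."""
    best_cost, best = np.inf, None
    for _ in range(n_init):
        # shuffle the row indices and take the first K samples as initial centroids
        init = X[np.random.permutation(len(X))[:K]]
        idx, centroids_all = run_k_means(X, init, max_iters)
        cost = distortion(X, idx, centroids_all[-1])
        if cost < best_cost:
            best_cost, best = cost, (idx, centroids_all)
    return best
```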
The code below picks the centroids by sampling K rows of X directly:

```python
def initCentroids(X, K):
    """Randomly pick K samples of X as the initial centroids."""
    m, n = X.shape
    idx = np.random.choice(m, K, replace=False)  # without replacement, so the centroids are distinct
    centroids = X[idx]
    return centroids

# run K-means a few times from different random initializations
for i in range(3):
    centroids = initCentroids(X, 3)
    idx, centroids_all = run_k_means(X, centroids, 10)
    plotData(X, centroids_all, idx)
```

Image compression with K-means
Read in the picture, a 128×128 RGB-encoded image:
```python
from skimage import io

A = io.imread('bird_small.png')
print(A.shape)   # (128, 128, 3)
plt.imshow(A)
A = A / 255.     # scale the pixel values into [0, 1]
```

K-means on pixels

Treating every pixel as a point in 3-D RGB space, we cluster the pixels into K = 16 colors; each pixel can then be stored as a 4-bit index into the 16-color palette instead of 24 bits of raw RGB.
```python
X = A.reshape(-1, 3)   # one row per pixel
K = 16
centroids = initCentroids(X, K)
idx, centroids_all = run_k_means(X, centroids, 10)

# rebuild the image: replace each pixel by its cluster's centroid color
img = np.zeros(X.shape)
centroids = centroids_all[-1]
for i in range(len(centroids)):
    img[idx[:, 0] == i] = centroids[i]
img = img.reshape((128, 128, 3))

fig, axes = plt.subplots(1, 2, figsize=(12, 6))
axes[0].imshow(A)
axes[1].imshow(img)
```

Principal Component Analysis
Example Dataset
```python
mat = loadmat('ex7data1.mat')
X = mat['X']
print(X.shape)
plt.scatter(X[:, 0], X[:, 1], facecolors='none', edgecolors='b')
plt.show()
```

Implementing PCA
PCA consists of two steps: first compute the covariance matrix of the data, then use SVD to obtain its eigenvectors, which are the principal components. Normalize the features beforehand so that every feature lies in the same range.
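In symbols, with m the number of examples (this matches the pca function below; the columns of U are the principal components and the entries of S their variances):

$$\Sigma = \frac{1}{m} X^{T} X, \qquad U, S, V = \mathrm{svd}(\Sigma)$$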
```python
def feature_normalize(X):
    """Zero-center each feature and scale it by its standard deviation."""
    means = X.mean(axis=0)
    stds = X.std(axis=0, ddof=1)
    X_norm = (X - means) / stds
    return X_norm, means, stds

def pca(X):
    """Covariance matrix of X, then SVD; the columns of U are the principal components."""
    sigma = (X.T @ X) / len(X)
    U, S, V = np.linalg.svd(sigma)
    return U, S, V

X_norm, means, stds = feature_normalize(X)
U, S, V = pca(X_norm)
print(U[:, 0])   # the exercise expects roughly [-0.707, -0.707]
```
```python
plt.figure(figsize=(7, 5))
plt.scatter(X[:, 0], X[:, 1], facecolors='none', edgecolors='b')
# the columns of U are the principal directions; scale them by S for display
plt.plot([means[0], means[0] + 1.5 * S[0] * U[0, 0]],
         [means[1], means[1] + 1.5 * S[0] * U[1, 0]],
         c='r', linewidth=3, label='First Principal Component')
plt.plot([means[0], means[0] + 1.5 * S[1] * U[0, 1]],
         [means[1], means[1] + 1.5 * S[1] * U[1, 1]],
         c='g', linewidth=3, label='Second Principal Component')
plt.grid()
plt.axis('equal')
plt.legend()
```

Dimensionality Reduction with PCA
Projecting the data onto the principal components
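In the row-vector layout used by the code below, projecting onto the first K principal components and mapping back are:

$$Z = X\, U_{:,\,1:K}, \qquad X_{\mathrm{rec}} = Z\, U_{:,\,1:K}^{T}$$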
```python
def project_data(X, U, K):
    """Project X onto the first K principal components."""
    Z = X @ U[:, :K]
    return Z

def recover_data(Z, U, K):
    """Map projected data back to the original space (an approximation)."""
    X_rec = Z @ U[:, :K].T
    return X_rec
```

Reconstructing an approximation of the data
```python
Z = project_data(X_norm, U, 1)   # project onto the first principal component (K = 1)
X_rec = recover_data(Z, U, 1)
X_rec[0]
```

Visualizing the projections
```python
plt.figure(figsize=(7, 5))
plt.axis('equal')
plt.scatter(X_norm[:, 0], X_norm[:, 1], s=30, facecolors='none',
            edgecolors='b', label='Original Data Points')
plt.scatter(X_rec[:, 0], X_rec[:, 1], s=30, facecolors='none',
            edgecolors='r', label='PCA Reduced Data Points')
plt.title('Example Dataset: Reduced Dimension Points Shown', fontsize=14)
plt.xlabel('x1 [Feature Normalized]', fontsize=14)
plt.ylabel('x2 [Feature Normalized]', fontsize=14)
plt.grid(True)
# connect each point to its projection; the first argument holds the
# x coordinates and the second the y coordinates
for x in range(X_norm.shape[0]):
    plt.plot([X_norm[x, 0], X_rec[x, 0]], [X_norm[x, 1], X_rec[x, 1]], 'k--')
plt.legend()
```

Face Image Dataset
```python
mat = loadmat('ex7faces.mat')
X = mat['X']
print(X.shape)   # (5000, 1024): each row is a 32x32 face image

def display_data(X, row, col):
    """Show row*col images from X in a grid; each row of X is one 32x32 image."""
    fig, axs = plt.subplots(row, col, figsize=(8, 8))
    for r in range(row):
        for c in range(col):
            axs[r][c].imshow(X[r * col + c].reshape(32, 32).T, cmap='Greys_r')
            axs[r][c].set_xticks([])
            axs[r][c].set_yticks([])

display_data(X, 10, 10)
```

PCA on Faces
```python
X_norm, means, stds = feature_normalize(X)
U, S, V = pca(X_norm)
display_data(U[:, :36].T, 6, 6)   # the first 36 principal components ("eigenfaces")
```

Dimensionality Reduction
```python
z = project_data(X_norm, U, K=36)
X_rec = recover_data(z, U, K=36)
display_data(X_rec, 10, 10)
```
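As a quick optional check, not in the original post: the S returned by pca tells you how much variance the 36 retained components capture.

```python
# fraction of the total variance captured by the first 36 components
retained = S[:36].sum() / S.sum()
print('variance retained with K = 36: %.4f' % retained)
```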