Functional PCA vs Artificial Neural Networks
2021
Abstract
This report focuses on a comparison of a few different classification methods applied to the Fashion-MNIST image dataset. The methods are artificial neural networks, functional principal component analysis, and principal component analysis. We use two types of neural networks, CNN and FNN: the former is specialized for images, while the latter can be applied to many kinds of datasets at the cost of reduced accuracy. Functional principal component analysis (FPCA) is an extension of principal component analysis (PCA), which concerns dimension reduction of high-dimensional data. FPCA expresses the data in the form of functions, which enables further dimension reduction when the functions represent the data efficiently. The parameters of most of the methods are chosen by cross-validation. Cross-validation is applied so that a model does not become biased towards a particular part of the data, which allows us to draw fair conclusions. The results show that the neur...
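As background for the parameter selection described above, the sketch below illustrates K-fold cross-validation. It is a minimal sketch, not the report's actual pipeline: the classifier, the fold count K = 5, and the scikit-learn dependency are all illustrative assumptions.

import numpy as np
from sklearn.model_selection import KFold
from sklearn.linear_model import LogisticRegression

def cv_accuracy(X, y, K=5):
    """Mean validation accuracy over K folds (illustrative sketch)."""
    kf = KFold(n_splits=K, shuffle=True, random_state=0)
    scores = []
    for train_idx, val_idx in kf.split(X):
        clf = LogisticRegression(max_iter=1000)  # placeholder classifier
        clf.fit(X[train_idx], y[train_idx])
        scores.append(clf.score(X[val_idx], y[val_idx]))
    return np.mean(scores)

Because every observation serves exactly once as validation data, the averaged score is not tied to one particular split of the data.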
[ ]: model_layers = {}

     model_layers['FNN_modell_1'] = [
         keras.layers.Flatten(input_shape=(28, 28)),
         keras.layers.Dense(128, activation='relu'),
         keras.layers.Dense(10, activation='softmax')]

     model_layers['FNN_modell_2'] = [
         keras.layers.Flatten(input_shape=(28, 28)),
         keras.layers.Dense(88, activation='relu'),
         keras.layers.Dense(40, activation='relu'),
         keras.layers.Dense(10, activation='softmax')]

     model_layers['FNN_modell_3'] = [
         keras.layers.Flatten(input_shape=(28, 28)),
         keras.layers.Dense(44, activation='relu'),
         keras.layers.Dense(44, activation='relu'),
         keras.layers.Dense(40, activation='relu'),
         keras.layers.Dense(10, activation='softmax')]

     model_layers['FNN_modell_4'] = [
         keras.layers.Flatten(input_shape=(28, 28)),
         keras.layers.Dense(22, activation='relu'),
         keras.layers.Dense(22, activation='relu'),
         keras.layers.Dense(22, activation='relu'),
         keras.layers.Dense(22, activation='relu'),
         keras.layers.Dense(20, activation='relu'),
         keras.layers.Dense(20, activation='relu'),
         keras.layers.Dense(10, activation='softmax')]

     model_layers['FNN_modell_5'] = [
         keras.layers.Flatten(input_shape=(28, 28)),
         keras.layers.Dense(60, activation='relu'),
         keras.layers.Dense(28, activation='relu'),
         keras.layers.Dense(40, activation='relu'),
         keras.layers.Dense(10, activation='softmax')]

     model_layers['FNN_modell_6'] = [
         keras.layers.Flatten(input_shape=(28, 28)),
         keras.layers.Dense(70, activation='relu'),
         keras.layers.Dense(9, activation='relu'),
         keras.layers.Dense(9, activation='relu'),
         keras.layers.Dense(40, activation='relu'),
         keras.layers.Dense(10, activation='softmax')]

     model_layers['CNN_model_1'] = [
         tf.keras.layers.Conv2D(32, (3, 3), padding='same',
                                activation=tf.nn.relu, input_shape=(28, 28, 1)),
         tf.keras.layers.MaxPooling2D((2, 2), strides=2),
         tf.keras.layers.Conv2D(64, (3, 3), padding='same', activation=tf.nn.relu),
         tf.keras.layers.MaxPooling2D((2, 2), strides=2),
         tf.keras.layers.Flatten(),
         tf.keras.layers.Dense(128, activation=tf.nn.relu),
         tf.keras.layers.Dense(10, activation=tf.nn.softmax)]

     model_layers['CNN_model_2'] = [
         tf.keras.layers.Conv2D(32, (3, 3), padding='same',
                                activation=tf.nn.relu, input_shape=(28, 28, 1)),
         tf.keras.layers.MaxPooling2D((2, 2), strides=2),
         tf.keras.layers.Conv2D(64, (3, 3), padding='same', activation=tf.nn.relu),
         tf.keras.layers.MaxPooling2D((2, 2), strides=2),
         tf.keras.layers.Flatten(),
         tf.keras.layers.Dense(50, activation=tf.nn.relu),
         tf.keras.layers.Dense(10, activation=tf.nn.softmax)]

     model_layers['CNN_model_3'] = [
         tf.keras.layers.Conv2D(32, (3, 3), padding='same',
                                activation=tf.nn.relu, input_shape=(28, 28, 1)),
         tf.keras.layers.MaxPooling2D((2, 2), strides=2),
         tf.keras.layers.Conv2D(64, (3, 3), padding='same', activation=tf.nn.relu),
         tf.keras.layers.MaxPooling2D((2, 2), strides=2),
         tf.keras.layers.Flatten(),
         tf.keras.layers.Dense(300, activation=tf.nn.relu),
         tf.keras.layers.Dense(10, activation=tf.nn.softmax)]

     model_layers['CNN_model_4'] = [
         tf.keras.layers.Conv2D(32, (3, 3), padding='same',
                                activation=tf.nn.relu, input_shape=(28, 28, 1)),
         tf.keras.layers.MaxPooling2D((2, 2), strides=2),
         tf.keras.layers.Conv2D(64, (3, 3), padding='same', activation=tf.nn.relu),
         tf.keras.layers.MaxPooling2D((2, 2), strides=2),
         tf.keras.layers.Flatten(),
         tf.keras.layers.Dense(450, activation=tf.nn.relu),
         tf.keras.layers.Dense(10, activation=tf.nn.softmax)]

     model_layers['CNN_model_5'] = [
         tf.keras.layers.Conv2D(32, (3, 3), padding='same',
                                activation=tf.nn.relu, input_shape=(28, 28, 1)),
         tf.keras.layers.MaxPooling2D((2, 2), strides=2),
         tf.keras.layers.Conv2D(64, (3, 3), padding='same', activation=tf.nn.relu),
         tf.keras.layers.MaxPooling2D((2, 2), strides=2),
         tf.keras.layers.Flatten(),
         tf.keras.layers.Dense(512, activation=tf.nn.relu),
         tf.keras.layers.Dense(10, activation=tf.nn.softmax)]

     model_layers['CNN_model_6'] = [
         tf.keras.layers.Conv2D(32, (3, 3), padding='same',
                                activation=tf.nn.relu, input_shape=(28, 28, 1)),
         tf.keras.layers.MaxPooling2D((2, 2), strides=2),
         tf.keras.layers.Conv2D(64, (3, 3), padding='same', activation=tf.nn.relu),
         tf.keras.layers.MaxPooling2D((2, 2), strides=2),
         tf.keras.layers.Flatten(),
         # The dense layers of CNN_model_6 were lost in extraction; an output
         # layer is assumed here so that the definition stays valid.
         tf.keras.layers.Dense(10, activation=tf.nn.softmax)]

Training output from the cross-validation runs, as extracted:

Accuracy with CNN_model_5: 98.65%, validation accuracy: 96.85%
Accuracy with CNN_model_5: 98.79%, validation accuracy: 97.03%
Accuracy with CNN_model_5: 99.05%, validation accuracy: 98.10%
Accuracy with CNN_model_5: 98.89%, validation accuracy: 98.70%
Accuracy with CNN_model_5: 99.13%, validation accuracy: 98.45%
Accuracy with CNN_model_5: 99.12%, validation accuracy: 99.00%
Accuracy with CNN_model_5: 99.34%, validation accuracy: 98.85%
Accuracy with CNN_model_6: 94.27%, validation accuracy: 90.53%
Accuracy with CNN_model_6: 96.83%, validation accuracy: 93.12%
Accuracy with CNN_model_6: 97.83%, validation accuracy: 94.70%
Accuracy with CNN_model_6: 98.22%, validation accuracy: 96.02%
Accuracy with CNN_model_6: 98.04%, validation accuracy: 96.75%
Accuracy with CNN_model_6: 98.34%, validation accuracy: 96.82%
Accuracy with CNN_model_6: 98.58%, validation accuracy: 98.12%
Accuracy with CNN_model_6: 98.64%, validation accuracy: 98.03%
Accuracy with CNN_model_6: 99.28%, validation accuracy: 97.52%
Accuracy with CNN_model_6: 97.94%, validation accuracy: 97.65%
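The per-fold lines above were evidently printed by a cross-validation loop over `model_layers` whose cell was lost in extraction. The sketch below is one plausible reconstruction, not the report's actual code: the fold count `K`, the epoch count, the data variables `x_train`/`y_train`, and the use of `keras.models.clone_model` to reinitialise weights each fold are all assumptions, and the CNN models would additionally need input reshaped to (28, 28, 1).

from sklearn.model_selection import KFold

K = 10  # assumed fold count
history_FNN = {}
for model_name, layers in model_layers.items():
    template = keras.Sequential(layers)
    runs = []
    for train_idx, val_idx in KFold(n_splits=K, shuffle=True).split(x_train):
        model = keras.models.clone_model(template)  # fresh weights each fold
        model.compile(optimizer='adam',
                      loss='sparse_categorical_crossentropy',
                      metrics=['accuracy'])
        h = model.fit(x_train[train_idx], y_train[train_idx],
                      validation_data=(x_train[val_idx], y_train[val_idx]),
                      epochs=10, verbose=0)
        runs.append(h)
        print("Accuracy with {}: {:.2f}%, validation accuracy: {:.2f}%".format(
            model_name,
            100 * h.history['accuracy'][-1],
            100 * h.history['val_accuracy'][-1]))
    history_FNN[model_name] = runs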
CNN_model_6 CV-accuracy: 97.80%, validation accuracy: 95.92%

[ ]: for model_name, h in history_FNN.items():
         mean_accuracy = 0
         mean_val_accuracy = 0
         for CV_run in h:
             mean_accuracy += CV_run.history['accuracy'][-1] / K
             mean_val_accuracy += CV_run.history['val_accuracy'][-1] / K
         print("{} CV-accuracy: {:.2f}%, validation accuracy: {:.2f}%"
               .format(model_name, 100 * mean_accuracy, 100 * mean_val_accuracy))
Best model for FNN and CNN on the whole dataset (no cross-validation)

[ ]: x_train = train_images
     x_test = test_images
     y_train = train_labels
     y_test = test_labels

     FNN_modell_4 = keras.Sequential([
         keras.layers.Flatten(input_shape=(28, 28)),
         keras.layers.Dense(22, activation='relu'),
         keras.layers.Dense(22, activation='relu'),
         keras.layers.Dense(22, activation='relu'),
         keras.layers.Dense(22, activation='relu'),
         keras.layers.Dense(20, activation='relu'),
         keras.layers.Dense(20, activation='relu'),
         keras.layers.Dense(10, activation='softmax')])

     FNN_modell_4.compile(optimizer='adam',
                          loss='sparse_categorical_crossentropy',
                          metrics=['accuracy'])

     # The training call was lost in extraction; a fit such as
     # FNN_modell_4.fit(x_train, y_train, epochs=...) presumably ran here.
     print("Test accuracy is", FNN_modell_4.evaluate(test_images, test_labels,
                                                     verbose=2)[1])
Inspection of principal components

[ ]: pca = PCA(svd_solver='full')  # PCA from sklearn.decomposition, imported in an earlier cell
     pca.fit(x_class)
     print(pca.components_.shape)
     print(pca.explained_variance_[0:10])

(784, 784)
[12.09439907  4.58872545  3.74873471  1.99979476  1.31679144  1.04802688
  0.88884489  0.79067383  0.64199964  0.62015006]

[ ]: components_to_view = 3
     fig, ax = plt.subplots(nrows=1, ncols=components_to_view)
     for i in range(components_to_view):
         ax[i].imshow(pca.components_[i, :].reshape(28, 28), cmap='gray')
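A natural follow-up to the explained variances above is to ask how many components are needed to retain most of the variance. A minimal sketch using the fitted `pca` object; the 95 % threshold is illustrative, not a value taken from the report:

# Cumulative fraction of total variance captured by the first k components.
cum_var = np.cumsum(pca.explained_variance_ratio_)
k95 = int(np.argmax(cum_var >= 0.95)) + 1  # smallest k reaching 95 %
print("Components needed for 95% of the variance:", k95)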
[ ]: x_cov = np.cov(np.transpose(x_class))
     print(x_cov.shape)
     w, v = np.linalg.eig(x_cov)
     print(w[0:10])

     components_to_view = 3
     fig, ax = plt.subplots(nrows=1, ncols=components_to_view)
     for i in range(components_to_view):
         ax[i].imshow(np.real(v[:, i]).reshape(28, 28), cmap='gray')
[ ]: # Truncated in extraction; apparently a commented-out sanity check comparing
     # the covariance eigenvectors with the sklearn PCA components, roughly:
     # print(np.real(v[:, i]) + pca.components_[i, :])
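Since eigenvectors are only determined up to sign, such a comparison should allow for a sign flip. A hedged sketch of the check the fragment seems to aim at (note that np.linalg.eig does not sort its eigenvalues, so in general the components should first be matched by eigenvalue; here the leading ones are assumed to coincide, as the printouts above suggest):

# Compare covariance eigenvectors with PCA components up to sign.
for i in range(3):
    e = np.real(v[:, i])
    c = pca.components_[i, :]
    diff = min(np.linalg.norm(e - c), np.linalg.norm(e + c))
    print("component", i, "difference up to sign:", diff)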
Visualization of PCA

[1]: import matplotlib.pyplot as plt
     import numpy as np
[ ]: # From: https://gist.github.com/WetHat/1d6cd0f7309535311a539b42cccca89c
     import matplotlib.pyplot as plt
     from mpl_toolkits.mplot3d.proj3d import proj_transform
     from mpl_toolkits.mplot3d.axes3d import Axes3D
     from matplotlib.patches import FancyArrowPatch

     class Arrow3D(FancyArrowPatch):
         def __init__(self, x, y, z, dx, dy, dz, *args, **kwargs):
             super().__init__((0, 0), (0, 0), *args, **kwargs)
             self._xyz = (x, y, z)
             self._dxdydz = (dx, dy, dz)

         def draw(self, renderer):
             x1, y1, z1 = self._xyz
             dx, dy, dz = self._dxdydz
             x2, y2, z2 = (x1 + dx, y1 + dy, z1 + dz)
             xs, ys, zs = proj_transform((x1, x2), (y1, y2), (z1, z2), self.axes.M)
             self.set_positions((xs[0], ys[0]), (xs[1], ys[1]))
             super().draw(renderer)

     def _arrow3D(ax, x, y, z, dx, dy, dz, *args, **kwargs):
         '''Add a 3d arrow to an `Axes3D` instance.'''
         arrow = Arrow3D(x, y, z, dx, dy, dz, *args, **kwargs)
         ax.add_artist(arrow)

     setattr(Axes3D, 'arrow3D', _arrow3D)

     fig = plt.figure()
     ax = fig.add_subplot(111, projection='3d')
     ax.set_xlim(0, 2)
     ax.arrow3D(0, 0, 0, 1, 1, 1, mutation_scale=20,
                arrowstyle="-|>", linestyle='dashed')
     ax.arrow3D(1, 0, 0, 1, 1, 1, mutation_scale=20, ec='green', fc='red')
     ax.set_title('3D Arrows Demo')
[ ]: numberOfSamples = 400
     samplesInPCA = np.random.normal(0, 1, [400, 3])
     variances = [np.sqrt(10), np.sqrt(3), np.sqrt(0.3)]
     samplesInPCA = samplesInPCA * variances

     PCAvectors = np.zeros([3, 3])
     PCAvectors[:, 0] = [1, 1, 1]
     PCAvectors[:, 1] = [-1, 0, 1]
     PCAvectors[:, 2] = [1, -2, 1]
     for i in range(3):
         PCAvectors[:, i] = PCAvectors[:, i] / np.linalg.norm(PCAvectors[:, i])

     print(samplesInPCA)
     print(PCAvectors)

[[11.10538602 -1.36922283  0.29064907]
 [ 1.09415543  1.16436695  0.42073884]
 [ 0.66096259  2.61611859  0.70585629]
 ...
 [ 2.76468565  2.40635393  0.37058987]
 [ 1.43916281 -1.41260973  0.13176086]
 [ 2.3120071
(remaining output truncated in extraction)
[ ]: centerPosition = np.transpose(np.array([1, 1, 1], ndmin=2))
     samplesPositions = np.matmul(PCAvectors, np.transpose(samplesInPCA)) + centerPosition

     fig = plt.figure()
     ax = fig.gca(projection='3d')
     ax.view_init(20, 25)

     # The variable name on the first line below was lost in extraction;
     # 'u' is assumed, matching the commented-out quiver call.
     u = PCAvectors[0, :] * variances
     v = PCAvectors[1, :] * variances
     w = PCAvectors[2, :] * variances
     # ax.quiver(np.ones(3), np.ones(3), np.ones(3), u, v, w,
     #           arrow_length_ratio=0.3, length=3, color="orange")

     ax.set_xlim3d(-3, 3)
     ax.set_ylim3d(-3, 3)
     ax.set_zlim3d(-3, 3)
     ax.set_xlabel("x", fontsize=13)
     ax.set_ylabel("y", fontsize=13)
     ax.set_zlabel("z", fontsize=13)
     ax.set_xticks([-2, 0, 2])
     ax.set_yticks([-2, 0, 2])
     ax.set_zticks([-2, 0, 2])

     for i in range(3):
         scale_factor = 2.5
         dx = PCAvectors[0, i] * scale_factor * variances[i]
         dy = PCAvectors[1, i] * scale_factor * variances[i]
         dz = PCAvectors[2, i] * scale_factor * variances[i]
         ax.arrow3D(centerPosition[0, 0], centerPosition[1, 0], centerPosition[2, 0],
                    dx, dy, dz, mutation_scale=15, color="black")

     ax.scatter(samplesPositions[0, :], samplesPositions[1, :],
                samplesPositions[2, :], alpha=0.4)

     textpos1 = centerPosition[:, 0] + PCAvectors[:, 0] * 8
     ax.text(textpos1[0], textpos1[1], textpos1[2] - 1.2, "PK 1",
             fontsize=15, color="black")
[ ]: fig = plt.figure()
     ax = fig.gca()
     ax.scatter(samplesInPCA[:, 0], samplesInPCA[:, 1], alpha=.4)
     ax.arrow(0, 0, variances[0], 0, width=0.1, color='black', head_width=0.6)
     ax.arrow(0, 0, 0, variances[1], width=0.1, color='black', head_width=0.6)
     ax.set_xlabel("PK 1", fontsize=15)
     ax.set_ylabel("PK 2", fontsize=15)
     plt.savefig("PCA_2D.pdf")
[ ]: numberOfSamples = 400
     samplesInPCA = np.random.normal(0, 1, [2, 400, 3])
     # Note: 'variances' must be a (2, 3) array and 'centerPosition' a (3, 2)
     # array here (see the printed shapes below); the cells redefining them
     # were lost in extraction.
     samplesInPCA = samplesInPCA * np.expand_dims(variances, axis=1)

     PCAvectors = np.zeros([2, 3, 3])
     PCAvectors[0, :, 0] = [1, 1, 1]
     PCAvectors[0, :, 1] = [-1, 0, 1]
     PCAvectors[0, :, 2] = [1, -2, 1]
     PCAvectors[1, :, 0] = [1, -3, 1]
     PCAvectors[1, :, 1] = [1, 1, 2]
     PCAvectors[1, :, 2] = np.cross(PCAvectors[1, :, 0], PCAvectors[1, :, 1])

     samplesPositions = np.zeros([2, 3, numberOfSamples])
     print(samplesPositions.shape)
     print(variances.shape)
     print(centerPosition.shape)
     for j in range(2):
         samplesPositions[j, :, :] = (np.matmul(PCAvectors[j, :, :],
                                                np.transpose(samplesInPCA[j, :, :]))
                                      + np.expand_dims(centerPosition[:, j], 1))

     fig = plt.figure()
     ax = fig.gca(projection='3d')
     width = 5.5
     ax.set_xlim3d(-width, width)
     ax.set_ylim3d(-width, width)
     ax.set_zlim3d(-width, width)
     ax.set_xlabel("x", fontsize=13)
     ax.set_ylabel("y", fontsize=13)

     # The scatter/label calls that defined 'txt' were lost in extraction;
     # 'txt' is presumably a list of ax.text(...) handles, and PathEffects
     # refers to matplotlib.patheffects (import matplotlib.patheffects as PathEffects).
     for t in txt:
         t.set_path_effects([PathEffects.withStroke(linewidth=1.5,
                                                    foreground='black')])

     print(samplesPositions.shape)
     plt.savefig("PCA_3D_two_classes.pdf", bbox_inches='tight')

(2, 3, 400)
(2, 3)
(3, 2)
(2, 3, 400)
[ ]: fig = plt.figure()
     ax = fig.gca()
     print((samplesPositions[1, :, :] - np.expand_dims(centerPosition[:, 0], 1)).shape)

     # Project the second class's samples onto the first class's principal
     # components (closing brackets reconstructed; they were lost in extraction).
     otherClassSamples = np.matmul(np.transpose(PCAvectors[0, :, :]),
                                   samplesPositions[1, :, :]
                                   - np.expand_dims(centerPosition[:, 0], 1))
     print(otherClassSamples.shape)

     ax.scatter(samplesInPCA[0, :, 0], samplesInPCA[0, :, 1], alpha=.45)
     ax.scatter(otherClassSamples[0, :], otherClassSamples[1, :], alpha=.4)
     # 'colors' was defined in a lost cell; plausibly the default color cycle,
     # e.g. colors = plt.rcParams['axes.prop_cycle'].by_key()['color'].
     ax.arrow(0, 0, variances[0, 0], 0, width=0.4, color=colors[0],
              head_width=1, ec='black')
     ax.arrow(0, 0, 0, variances[0, 1], width=0.4, color=colors[0],
              head_width=1, ec='black')
     ax.set_xlabel("PK 1", fontsize=18)
     ax.set_ylabel("PK 2", fontsize=18)
     ax.tick_params(axis='both', labelsize=16)
     plt.savefig("PCA_2D_two_classes_first.pdf", bbox_inches='tight')

(3, 400)
(3, 400)
[ ]: fig = plt.figure()
     ax = fig.gca()
     print((samplesPositions[1, :, :] - np.expand_dims(centerPosition[:, 0], 1)).shape)

     # Conversely, project the first class's samples onto the second class's
     # principal components (closing brackets reconstructed as above).
     otherClassSamples = np.matmul(np.transpose(PCAvectors[1, :, :]),
                                   samplesPositions[0, :, :]
                                   - np.expand_dims(centerPosition[:, 1], 1))
     print(otherClassSamples.shape)

     ax.scatter(otherClassSamples[0, :], otherClassSamples[1, :], alpha=.45)
     ax.scatter(samplesInPCA[1, :, 0], samplesInPCA[1, :, 1], alpha=.35)
     ax.arrow(0, 0, variances[1, 0], 0, width=0.4, color=colors[1],
              head_width=1, ec='black')
     ax.arrow(0, 0, 0, variances[1, 1], width=0.4, color=colors[1],
              head_width=1, ec='black')
     ax.set_xlabel("PK 1", fontsize=18)
     ax.set_ylabel("PK 2", fontsize=18)
     ax.tick_params(axis='both', labelsize=16)
     plt.savefig("PCA_2D_two_classes_second.pdf", bbox_inches='tight')

(3, 400)
(3, 400)