Questions tagged [pca]

Principal component analysis (PCA) is a statistical technique for dimension reduction often used in clustering or factor analysis. Given any number of explanatory or causal variables, PCA constructs new variables (principal components), each a linear combination of the originals, and ranks them by how much of the variation in the data they explain. It is this property that allows PCA to be used for dimension reduction, i.e. to keep the few components that capture most of the variation in a large set of possible influences.
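
As a rough illustration of the description above, here is a minimal NumPy sketch of PCA computed through the singular value decomposition of the centered data. The function name `pca_svd`, the toy data, and the choice of two components are assumptions made for this example; it is one common way to compute PCA, not the only one.

```python
import numpy as np


def pca_svd(X, n_components):
    """Hypothetical helper: PCA of X (n_samples x n_features) via SVD."""
    # Center each feature; PCA is defined on mean-centered data.
    X_centered = X - X.mean(axis=0)
    # Rows of Vt are the principal directions, ordered by singular value.
    U, S, Vt = np.linalg.svd(X_centered, full_matrices=False)
    components = Vt[:n_components]
    # Variance explained by each direction, as a fraction of the total.
    explained_variance = (S ** 2) / (X.shape[0] - 1)
    ratio = explained_variance / explained_variance.sum()
    # Scores: coordinates of each sample in the reduced space.
    scores = X_centered @ components.T
    return scores, components, ratio[:n_components]


# Toy data (assumed): 200 samples, 3 features, driven mostly by one latent factor.
rng = np.random.default_rng(0)
latent = rng.normal(size=(200, 1))
X = np.hstack([3.0 * latent, 1.5 * latent, np.zeros((200, 1))])
X = X + rng.normal(scale=0.1, size=X.shape)

scores, components, ratio = pca_svd(X, n_components=2)
print(scores.shape)  # shape of the reduced data
print(ratio)         # fraction of variance explained by each kept component
```

The sign of each component is arbitrary, so two implementations can return mirrored plots of the same data while agreeing on the explained variance.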

0
votes
0answers
17 views

Images dimension reduction

I have 3000 images, each of shape (200,180,3). I want to reduce their dimension, so I use PCA; I want to do it first without using any library. The first problem I came ...
0
votes
0answers
25 views

PCA - Plotting individual distance to principal component

I am doing a PCA analysis with the CRAN iris dataset. I wonder how I can create the following plot: I want to select the first principal component and for this component I want to plot the distance ...
0
votes
1answer
25 views

Different PCA plots

I was trying to learn PCA (using the iris dataset) with Python and I got some results, so I wanted to test the results in R to make sure they were good. When I checked the results, it gave me a mirror ...
0
votes
0answers
25 views

Fuzzy c-means clustering and evaluation methods

I am trying to use fuzzy c-means clustering over my data. I would like to show only cluster n = 2. I have tried this code and it works, but I am having a problem if I modify it to print only cluster 2. ...
0
votes
1answer
29 views

How can I compute T2 Hotelling after PCA?

I need to compute Hotelling T2 and SPE (Q) after the PCA analysis. I did it using the pca function from library mdatools, but I see the PCs computed are different from the ones computed by prcomp or ...
-6
votes
0answers
35 views

Principal Component Analysis Code in Python

I need help coding PCA in Python. I've tried some code, but the result in Python is different from the result in Minitab.
0
votes
1answer
12 views

n-components doesn't seem to truncate the number of components calculated

I'm trying to perform Kernel Principal Component Analysis (KPCA) on a large data set that I will want to find the pre-image of after removal of the low energy/high entropy components. I would have ...
0
votes
3answers
31 views

scikit-learn PCA for image dataset

I am trying to perform PCA on an image dataset with 100,000 images each of size 224x224x3. I was hoping to project the images into a space of dimension 1000 (or somewhere around that). I am doing ...
0
votes
0answers
15 views

How can I extract the decoder part from an autoencoder, using syntax similar to my example?

I built an autoencoder, and I'm trying to extract the decoding part so I can visualize 'eigenfaces', by giving the hidden layer (or the input layer in the decoder) input that assigns 1 to a single ...
0
votes
1answer
26 views

prcomp “Error in colMeans(x, na.rm = TRUE) : 'x' must be numeric”

I'm brand new to R and I'm presently trying to create a PCA plot for a project. I created tables of my data in Excel and then saved them as a .csv file, which I declared as a variable as follows: > ...
1
vote
1answer
38 views

noise reduction using python regarding other people's sound as noise

I want to use Python to process an audio file so that only my voice is recognized. For example, I say "forward" to a Raspberry Pi car. It will go straight but other people who speak "forward" ...
0
votes
0answers
29 views

How to generate scatter plot from the vector values in each row for PCA?

I have successfully created label and feature vectors and am able to apply pca analysis on it but what happens is that the column generated is of datatype vector and each row is a vector. How do I ...
0
votes
0answers
35 views

Useless SVM model, is my data useless or am I using libsvm wrong?

I am trying to use the CBIS-DDSM dataset to classify malignant or benign breast tumours with PCA and SVM. However, my results are astonishingly bad, and I have been working my head off for the last week, ...
0
votes
0answers
19 views

problem of misalignment in Houghlines detection with python

[Hough Lines detected in red][1] Hi, I'm working on an app and I need to extract these bands using Hough, but I have some problems while extracting them, like misalignment of the detected lines. Do ...
0
votes
0answers
29 views

ValueError: Wrong number of items passed 6, placement implies 11, I don't get this

I tried to reduce the dimensionality of a data frame with PCA, but when I run my program it shows two errors. Does pandas have inner attributes? How can I fix this? ds = pd.read_csv('forestfires.csv')...
0
votes
0answers
21 views

Positioning Multivariate Data into 2-dimensional Space (with PCA)

I have multidimensional data. (11 columns - attributes , 150K rows - number of data). It is slightly sparse-alike data, for example, which means one datum has numeric values like (0, 0, 6.5, 0, 0, 7.5,...
2
votes
0answers
21 views

Why do I get “weka.attributeSelection.PrincipalComponents: No attributes!” while running WEKA PCA in Java?

I have to make a visualization of a 4 dimension dataset (plus the class attribute). To this purpose, I want to run PCA on the whole dataset. Since I don't have to do any machine learning on the ...
0
votes
1answer
16 views

How to get a low dimensional rank of non-negative factorization matrix

I have a big matrix X = numpy.random.rand(1000, 1000) using sklearn.decomposition I factorized the matrix such as: from sklearn.decomposition import NMF model = NMF(n_components=1, init='random', ...
1
vote
1answer
23 views

Truncated SVD is taking a lot of time

I'm trying to reduce the dimension of a data set by computing what can be the best n_components using truncated SVD, but it's taking a lot of time. from sklearn.decomposition import TruncatedSVD pca = ...
1
vote
2answers
40 views

principal components of PCA

I came across this question in datacamp.com: Below are three scatter plots of the same point cloud. Each scatter plot shows a different set of axes (in red). In which of the plots could the axes ...
1
vote
1answer
25 views

scikit learn PCA - transform results

I have a timeseries of first differences onto which I apply PCA using scikit to get the first PC # data is a timeseries of first differences pca = PCA(n_components=1) pca.fit(data) pc1_trans = pca....
1
vote
1answer
38 views

Principal component analysis in matlab?

I have a training set with the size of (size(X_Training)=122 x 125937). 122 is the number of features and 125937 is the sample size. From my little understanding, PCA is useful when you want to ...
1
vote
1answer
27 views

Biplots for Functional Principal Component Scores

I'm trying to get the biplots between two F. principal components (or harmonics). I provide an example from fda package doc. to solve the riddle: library(fda) #BASIS FUNCTIONS daybasis65 <- create....
-1
votes
0answers
13 views

How to obtain the output graph from the dataset

May I know how to modify my Python program so that I obtain the same result as in the image file? import numpy as np import pandas as pd df_wine = pd.read_csv('https://archive....
0
votes
1answer
33 views

How to select a subset of a dataframe using a variable dynamically

I have an R dataframe with 300 columns. I have done Principal Component Analysis and grabbed the top 110 columns that explain the variability of dataset. How do we pass the 110 column names list to an ...
1
vote
1answer
32 views

What do the differences mean between pyspark SVD Eigenvectors vs. PCA Eigenvectors?

I'm using the SVD and PCA functions in (pyspark) mllib (Spark 2.2.0) as described in this link: https://spark.apache.org/docs/2.2.0/mllib-dimensionality-reduction.html Suppose we are given the ...
2
votes
0answers
32 views

How to use principal components selected by PCR in further SVR analysis?

I want to use principal component regression to find essential components and then extract those components to apply further SVR analysis, but I got some problems when doing this. First try, I follow ...
3
votes
2answers
33 views

Applying PCA to one sample

I am currently working on an image recognition project with machine learning. The train set has 1600 images with size 300x300, so 90000 features per image. To speed up training, I apply PCA with ...
0
votes
1answer
37 views

Applying PCA to a covariance matrix

I am having some difficulty understanding some steps in a procedure. They take coordinate data, find the covariance matrix, apply PCA, then extract the standard deviation from the square root of each ...
0
votes
3answers
59 views

How to calculate covariance matrix of data frame

I have read a data frame of sensor data using the pandas read_fwf function. I need to find the covariance matrix of the resulting 928991 x 8 matrix. Eventually, I want to find eigenvectors and eigenvalues, using ...
0
votes
1answer
46 views

What is the meaning of 'components' in principal component regression?

I'm learning principal component regression and I don't understand the result I get from PCR method. My goal of using PCR is to reduce the number of predictors. For example: library(caret) # Load ...
0
votes
0answers
6 views

Prediction on test data using logistic regression coming negative and more than 1 also

Logistic regression model prediction on test data is negative as well as more than one. Whereas probabilities range from [0,1]. I have scaled data (both train and test ) using standard scaler and ...
0
votes
1answer
44 views

How to use pca function in MATLAB to select effective features? [duplicate]

I'm new to PCA and after some research I found that with the PCA algorithm we can select the most effective features. I just wanted to use the pca function (in MATLAB) to select the best features to ...
0
votes
0answers
27 views

How to deal with memory error while using make_meshgrid()

I am trying to visualise SVM classification results using Matplotlib and Scikit-learn, how to handle MemoryError ?! For my example, I have a small dataset, a table X of 100 examples and 10 features (...
0
votes
0answers
9 views

When I execute KernelPCA I get a LinAlgError

When I execute the KernelPCA code on the Kaggle House Prices prediction data, it returns this error: SVD did not converge in Linear Least Squares from sklearn.decomposition import KernelPCA from sklearn....
0
votes
0answers
12 views

glm.pcr throws error referencing matrix/vector

PostDF <- read.csv("https://raw.githubusercontent.com/thistleknot/Capstone-577/master/output/V7221-greaterEqual-10-filtered.csv", header=TRUE, sep=",")[,-1,drop=FALSE] x <- PostDF[,-1, drop=...
0
votes
0answers
23 views

The relation of eigenvalue and PCs in PCA [migrated]

In R, I got the result of PCA and eigenvalues and vectors and three eigenvalues above 1 were checked. If so, is it valid data from PCA results to PC1 ~ 3? Here is my eigen values and vectors, eigen(...
0
votes
1answer
38 views

PCA for KNN: preprocess parameter in caret

I am conducting knn regression on my data, and would like to: a) cross-validate through repeatedcv to find an optimal k; b) when building knn model, using PCA at 90% level threshold to reduce ...
-2
votes
0answers
30 views

How to reduce a 1D word2vec vector's dimensionality with PCA?

Suppose I have the following word embedding vector: vec = np.array([1,2,3,4,5,6,7]) What is the correct way of reducing the dimensionality of this vector from a 7 dimensional vector to a 2 ...
-2
votes
0answers
21 views

What is explanation behind the linear combination in PCA?

I have used a Principal Component Analysis on a panel dataset in R, since I'm new to both, I am unable to understand why the summation of each principal component across variables is not 1 or if it ...
0
votes
0answers
29 views

To derive Principal component in functional PCA

I am trying to get the way to find eigenfunction from eigenvalues. As far as I understand from reading, I have to multiply eigenvalues by something to get principal components in functional PCA. But ...
0
votes
1answer
17 views

How to define dimensions in fviz_cluster with PAM data?

I have a data frame which is divided into samples in rows and variables in columns. Upon doing a PCA: df.pca <- PCA(df, graph = FALSE, ncp = Inf) df.coord <- data.frame(df.pca$ind$coord) ...
2
votes
0answers
53 views

Negative eigenvalues in PCA

I've a matrix x (1000*25) that contains random floats in the interval (-5,5). nFeatures=25 and nPoints=1000. I'm using this code to find the eigenvalues of the covariance matrix, but I'm getting ...
0
votes
0answers
22 views

Number of principal components in SVM parameter tuning vs. final evaluation

I am using PCA to reduce the dimensions of my data (50 samples x 32767 features) before feeding it to an SVM. I am using the following cross-validation scheme for tuning parameters of the SVM kernel, ...
0
votes
1answer
42 views

Relationship between input variables and principal components in PCA

Here is the result of PCA. RC1 and RC3 can be interpreted in terms of which variables are related, but RC2 cannot be interpreted. When the eigenvalues are checked, the number of factors is 3. But can there really ...
-2
votes
1answer
44 views

Applying PCA on a specific column of a pandas Dataframe

I'm trying to reduce the number of features of a dataset of images so that cosine similarity computes faster. I have a pandas dataframe that has the following structure ["url", "cluster_id", "...
0
votes
1answer
54 views

How to plot a circle for each point scatter plot while each has particular radius size

I have a pandas dataframe with a distance matrix, and I use PCA to do the dimension reduction. The dataframe of this distance matrix has a label and a size for each point. How can I make each scattered point ...
-2
votes
1answer
33 views

After using kmeans(): how to determine which point belongs to which group?

I am running a kmeans clustering to identify labeled data. I ran pca and then kmeans and got the following plot using ggbiplot: Now, how can I determine which point belongs to which group in table ...
1
vote
0answers
32 views

PCA preprocess parameter in caret's train function

I am conducting knn regression on my data, and would like to: a) cross-validate through repeatedcv to find an optimal k; b) when building knn model, using PCA at 90% level threshold to reduce ...
2
votes
0answers
16 views

PCA or Linear Discriminant Analysis ? Classification Problem in QoS

I'm working on a classification problem related to the marking of ip/tcp packet, the classes are Best Effort and Non Best Effort; I'm using Python language. I have selected these features: Protocol, ...
