Shared by 天水一色 http://blog.sciencenet.cn/u/lincystar

MATLAB principal component analysis (PCA) warning

11773 views 2015-8-14 22:43 | Personal category: Research notes | System category: Research notes | pca, matlab, warning

warning message:

Columns of X are linearly dependent to within machine precision.
Using only the first # components to compute TSQUARED. (# is a number)

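Meaning: the columns of X are linearly dependent up to floating-point precision, so only the first # principal components are used to compute TSQUARED.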
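A quick way to check whether a data matrix is in this situation is to compare the numerical rank of the centered data with its number of columns. The following is only a minimal sketch in base MATLAB: the sizes, the variable names, and the way the dependent columns are built are assumptions made for illustration, and `princomp` itself is not called.

```matlab
% Illustrative sketch: 7 feature columns, of which only 5 are independent.
rng(0);                               % fix the seed so the run is repeatable
N = 50;                               % number of observations (assumed)
A = randn(N, 5);                      % 5 independent feature columns
B = [A(:,1) + A(:,2), 3*A(:,4)];      % 2 extra columns, exact linear combinations of A
X = [A, B];                           % N-by-7 data matrix

Xc = bsxfun(@minus, X, mean(X));      % center each column, as PCA does
r  = rank(Xc);                        % numerical rank of the centered data

fprintf('columns: %d, numerical rank: %d\n', size(X, 2), r);
```

If the reported rank is smaller than the number of columns, the data is rank-deficient, and one would expect the warning to name that rank as the number of components used for TSQUARED.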

Explanation of the warning (from: http://stackoverflow.com/questions/27997736/matlab-warns-columns-of-x-are-linearly-dependent-to-within-machine-precision):

Question:



When I used the function "princomp" in Matlab to reduce the dimensions of features, it warned: "Columns of X are linearly dependent to within machine precision. Using only the first 320 components to compute TSQUARED".

What does it mean? The original dimension of the features is 324. I would be very grateful if somebody could answer my question.

Explanation: (the warning does not affect the computed results; it is only a notice that the data contain linear dependencies)

For a more graphic interpretation of this warning, imagine your data being 3-dimensional instead of 324-dimensional. These would be points in space. The output of your function princomp should be the principal axes of an ellipsoid that aligns well with your data. The equivalent warning, "Using only the first 2 components", would mean: your data points lie on a plane (up to numerical error), so your ellipsoid really is a flat ellipse. As PCA is usually used for dimensionality reduction, this isn't really that worrying. It just means that your original data is not 324-dimensional, but really only 320-dimensional, yet resides in R^324.

You would get the same warning using this random data:

N = 100;
X = [10*rand(N,1), 2*rand(N,1), zeros(N,1)];   % third column is all zeros, so the points lie in a plane
X_centered = bsxfun(@minus, X, mean(X));       % subtract the column means
[coeff, score, latent, tsquare] = princomp(X_centered);
plot3(X_centered(:,1), X_centered(:,2), X_centered(:,3), '.');

[Figure: Random points]

coeff(:,1) will be approximately [1;0;0] (up to sign) and latent(1) will be the largest value, as the data is spread most along the x-axis. The second vector coeff(:,2) will be approximately [0;1;0] (again up to sign), while latent(2) will be quite a bit smaller than latent(1), as the second most important direction is the y-axis, along which the data is less spread out. The remaining vectors are chosen to be orthonormal to the ones already found. (In our simple case the only possibility is [0;0;1], and latent(3) will be zero, as the data is flat.) [Bear in mind that the principal components will always be orthogonal to each other.]
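The behaviour described above can be reproduced without `princomp`. The sketch below swaps in an eigendecomposition of the covariance matrix so that it runs in base MATLAB. The rank tolerance `tol` and the T^2 formula (sum of squared scores divided by their variances) are assumptions based on the usual definitions, not a copy of `princomp`'s internals.

```matlab
% Illustrative sketch: PCA of flat 3-D data via the covariance matrix.
rng(0);                                        % fix the seed so the run is repeatable
N = 100;
X = [10*rand(N,1), 2*rand(N,1), zeros(N,1)];   % same construction as the example above
Xc = bsxfun(@minus, X, mean(X));               % center each column

[V, D] = eig(cov(Xc));                         % eigenvectors and eigenvalues of the covariance
[latent, order] = sort(diag(D), 'descend');    % component variances, largest first
coeff = V(:, order);                           % principal axes (signs are arbitrary)
score = Xc * coeff;                            % data expressed in principal-axis coordinates

% Count components whose variance exceeds a rank tolerance (assumed threshold).
tol = max(size(Xc)) * eps(latent(1));
r = sum(latent > tol);

% Hotelling's T^2 from the first r components only. Including the third
% component would divide by latent(3) = 0 and give Inf or NaN.
tsquared = sum(bsxfun(@rdivide, score(:, 1:r).^2, latent(1:r)'), 2);

disp(abs(coeff));                              % close to the identity matrix
disp(latent');                                 % last entry is (numerically) zero
fprintf('components used for T^2: %d\n', r);
```

Restricting the sum to the first r components is what keeps TSQUARED finite for flat data, which is presumably why the warning mentions TSQUARED specifically and not coeff, score, or latent.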




https://blog.sciencenet.cn/blog-237238-913141.html
