Dimensionality reduction
Revision as of 18:35, 24 March 2020
Dimensionality reduction is one of the main applications of unsupervised learning. It can be understood as the process of reducing the number of random variables under consideration by obtaining a set of principal variables.[1] High dimensionality has many costs, including redundant and irrelevant features that degrade the performance of some algorithms, difficulty in interpretation and visualization, and computations that become infeasible.[2]
Components
Dimensionality reduction can be divided into two components or subcategories[3]:
- Feature selection: Consists of finding a subset of the original variables that is sufficient to model the problem. There are usually three approaches[4]:
  - Wrappers
  - Filters
  - Embedded
- Feature extraction: Used to map data from a high-dimensional space to a lower-dimensional space[4].
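The difference between the two components can be illustrated with a minimal sketch. It assumes NumPy is available; the function names `select_features` and `extract_features`, the random data, and the choice of columns are hypothetical and only serve the illustration. Selection keeps some original columns unchanged, while extraction builds new variables, here by projecting onto principal directions.

```python
import numpy as np

def select_features(X, keep):
    """Feature selection: keep a subset of the original columns unchanged."""
    return X[:, keep]

def extract_features(X, k):
    """Feature extraction: build k new variables by projecting the centered
    data onto its top-k principal directions (PCA computed via SVD)."""
    Xc = X - X.mean(axis=0)                      # center each column
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:k].T                         # coordinates in the new basis

# Illustrative data: 100 samples, 5 variables (assumed, not from the article).
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5))

X_sel = select_features(X, [0, 3])   # columns 0 and 3 of the original data
X_ext = extract_features(X, 2)       # two new, derived variables

print(X_sel.shape, X_ext.shape)
```

Both results have two columns, but only the selected columns can be read directly as original variables; the extracted ones are linear combinations of all five.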
Algorithms
Some of the most common dimensionality reduction algorithms in machine learning are[1]:
- Principal component analysis (PCA)
- Kernel principal component analysis (Kernel PCA)
- Locally linear embedding (LLE)
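As a rough illustration, the three algorithms listed above can be tried side by side. This sketch assumes scikit-learn and NumPy are installed; the random input, the RBF kernel, the neighbor count, and the target dimension of 2 are arbitrary choices made for the example, not recommendations from the article.

```python
import numpy as np
from sklearn.decomposition import PCA, KernelPCA
from sklearn.manifold import LocallyLinearEmbedding

# Illustrative data: 200 samples in 10 dimensions (assumed for the example).
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 10))

reducers = {
    "PCA": PCA(n_components=2),
    "Kernel PCA": KernelPCA(n_components=2, kernel="rbf"),
    "LLE": LocallyLinearEmbedding(n_components=2, n_neighbors=10, random_state=0),
}

# Each estimator exposes fit_transform, which returns the reduced data.
embeddings = {name: reducer.fit_transform(X) for name, reducer in reducers.items()}

for name, Z in embeddings.items():
    print(name, Z.shape)
```

PCA is linear, while Kernel PCA and LLE can capture nonlinear structure, so on real data the three embeddings will generally differ.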
Methods
Some common methods to perform dimensionality reduction are[4]:
- Missing values
- Low variance
- Decision trees
- Random forest
- High correlation
- Backward feature elimination
- Factor analysis
- Principal component analysis (PCA)
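Three of the methods above (missing values, low variance, and high correlation) are simple filters, and one possible reading of them can be sketched in plain Python. The function `filter_features`, its thresholds, and the example columns are all hypothetical: the article does not specify how these filters are defined, so this is only one way they are commonly applied.

```python
from math import sqrt
from statistics import mean, pvariance

def pearson(a, b):
    """Pearson correlation of two equally long numeric lists."""
    ma, mb = mean(a), mean(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    norm = sqrt(sum((x - ma) ** 2 for x in a) * sum((y - mb) ** 2 for y in b))
    return cov / norm if norm else 0.0

def filter_features(columns, max_missing=0.4, min_variance=1e-3, max_corr=0.95):
    """Return the names of columns that pass all three filters.
    Missing entries are represented by None; thresholds are assumed values."""
    kept = {}
    for name, col in columns.items():
        observed = [v for v in col if v is not None]
        # Missing values filter: drop columns with too many gaps.
        if 1 - len(observed) / len(col) > max_missing or len(observed) < 2:
            continue
        # Low variance filter: drop (nearly) constant columns.
        if pvariance(observed) < min_variance:
            continue
        kept[name] = col

    selected = []
    for name, col in kept.items():
        redundant = False
        for other in selected:
            # Compare only rows where both columns are observed.
            pairs = [(x, y) for x, y in zip(col, kept[other])
                     if x is not None and y is not None]
            xs, ys = [p[0] for p in pairs], [p[1] for p in pairs]
            # High correlation filter: drop a column that duplicates a kept one.
            if len(pairs) >= 2 and abs(pearson(xs, ys)) > max_corr:
                redundant = True
                break
        if not redundant:
            selected.append(name)
    return selected

columns = {
    "height_cm": [170, 165, 180, 175, 160],
    "height_in": [66.9, 65.0, 70.9, 68.9, 63.0],    # same quantity, other unit
    "constant": [1, 1, 1, 1, 1],                     # zero variance
    "mostly_missing": [None, None, None, 2.0, None], # 80% missing
    "weight_kg": [65, 80, 72, 90, 55],
}

print(filter_features(columns))  # prints ['height_cm', 'weight_kg']
```

The order of the input columns matters for the correlation filter: of two highly correlated columns, the one seen first is kept.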