Not all algorithms are equal. We differentiate supervised learning from unsupervised learning to set you up for success as you start solving specific types of machine learning problems.
Supervised learning is the most common type of machine learning. A supervised learning problem always has a target variable \(y\), also called the "dependent" or "response" variable, and input variables \(x\). It's called supervised because algorithms learn from a labeled training data set, where the known values of the target variable guide the learning. Learn more about test and training sets in one of our old posts. Two examples of supervised learning problems are regression, where the target is a numeric value, and classification, where the target is a category.
- Regression - Predict vehicle miles per gallon (MPG) from data containing input variables like vehicle make, model, and year manufactured. Your target variable is vehicle miles per gallon.
- Classification - Predict whether a student is admitted to a university from data containing input variables like standardized test score, high school, and whether they have a relative who is an alum. Your target variable is Yes/No (whether they are admitted or not).
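To make the two problem types concrete, here is a minimal sketch in plain Python. Everything in it is an assumption for illustration: the data values are made up, each problem uses a single input variable (year manufactured for regression, test score for classification), and the helper names `fit_line` and `nearest_neighbor_label` are hypothetical. Least squares and 1-nearest-neighbor are just two simple supervised methods, picked here because they fit in a few lines.

```python
# Toy labeled data: every example has an input x AND a known target y.
# All numbers below are made up for illustration.

def fit_line(xs, ys):
    """Ordinary least squares with one input: y is approximately slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def nearest_neighbor_label(train_xs, train_ys, x_new):
    """1-nearest-neighbor: copy the label of the closest training example."""
    closest = min(range(len(train_xs)), key=lambda i: abs(train_xs[i] - x_new))
    return train_ys[closest]


# Regression: the target (MPG) is a number.
years = [2000, 2005, 2010, 2015, 2020]   # input variable x
mpg = [20.0, 22.0, 24.0, 26.0, 28.0]     # target variable y
slope, intercept = fit_line(years, mpg)
predicted_mpg = slope * 2012 + intercept  # prediction for an unseen year

# Classification: the target (admitted or not) is a category.
scores = [900, 1000, 1100, 1300, 1400, 1500]        # input variable x
admitted = ["No", "No", "No", "Yes", "Yes", "Yes"]  # target variable y
high_score_prediction = nearest_neighbor_label(scores, admitted, 1380)
low_score_prediction = nearest_neighbor_label(scores, admitted, 1020)

print(predicted_mpg, high_score_prediction, low_score_prediction)
```

In both cases the algorithm only works because the training data carries the answers (`mpg` and `admitted`). That is the "supervision".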
In unsupervised learning there is no target variable \(y\), only input variables \(x\), so the training data set is unlabeled. Instead of learning to predict a known answer, algorithms look for patterns and structure in the inputs themselves. An example of an unsupervised learning problem is clustering.
- Clustering - Recognize communities of people with similar interests on social media. The input variables could be age, location, and page likes.
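As a rough illustration of clustering, the sketch below runs k-means, one common clustering algorithm, in plain Python. The data is made up, the two numeric inputs (age and weekly page likes) are assumptions chosen to echo the example above, and the `kmeans` helper and its fixed starting centroids are hypothetical simplifications. A real project would normally scale the inputs and use a library implementation.

```python
def squared_distance(p, q):
    """Squared Euclidean distance between two points of equal dimension."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans(points, centroids, iterations=10):
    """Basic k-means: alternate between assigning points and moving centroids."""
    for _ in range(iterations):
        # Assignment step: each point joins the cluster of its nearest centroid.
        labels = [
            min(range(len(centroids)), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        # Update step: each centroid moves to the mean of its assigned points.
        new_centroids = []
        for c in range(len(centroids)):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                new_centroids.append(tuple(sum(dim) / len(members) for dim in zip(*members)))
            else:
                new_centroids.append(centroids[c])  # keep an empty cluster's centroid in place
        centroids = new_centroids
    return labels, centroids


# Unlabeled data: (age, page likes per week). There is no target variable y.
users = [(18, 40), (20, 42), (22, 38), (45, 5), (50, 8), (48, 6)]

# Start from two of the data points; this choice is arbitrary for the sketch.
labels, centroids = kmeans(users, centroids=[users[0], users[3]])
print(labels)
print(centroids)
```

Note that the cluster numbers (0 and 1) carry no meaning of their own. The algorithm only groups similar users, and it is up to you to interpret what each group represents.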