ML Notes

Machine learning is a form of applied statistics with increased emphasis on the use of computers to statistically estimate complicated functions and a decreased emphasis on proving confidence intervals around these functions (Goodfellow 2016).

Forms of Learning

  • Scenarios

  • Problems for which existing solutions require a lot of fine-tuning or long lists of rules.

  • Complex problems for which using a traditional approach yields no good solution.

  • Fluctuating environments: ML systems can adapt to new data.

  • Getting insights about complex problems and large amounts of data.

Applications

  • Analyzing images of products on a production line to automatically classify them

    • image classification ⇒ convolutional neural networks (CNN)
  • Detecting tumors in brain scans

    • semantic segmentation ⇒ CNN
  • Automatically classifying news articles

    • natural language processing (NLP) ⇒ recurrent neural networks (RNN), CNNs, Transformers
  • Automatically flagging offensive comments on discussion forums

    • NLP ⇒ RNNs, CNNs, Transformers
  • Summarizing long documents automatically

    • NLP ⇒ text summarization ⇒ RNNs, CNNs, Transformers
  • Creating a chatbot or a personal assistant

    • NLP ⇒ natural language understanding, question-answering modules
  • Forecasting company revenue based on several performance metrics

    • Regression ⇒ Linear Regression, Polynomial Regression, regression SVM, regression Random Forest, Artificial Neural Network

    • To factor in sequences of past performance metrics ⇒ RNNs, CNNs, or Transformers

  • Reacting to voice commands

    • speech recognition ⇒ RNNs, CNNs, or Transformers
  • Detecting credit card fraud

    • anomaly detection
  • Segmenting clients based on purchases in order to design a different marketing strategy for each segment

    • clustering
  • Representing a complex, high-dimensional dataset in a clear and insightful diagram

    • data visualization ⇒ dimensionality reduction
  • Recommending a product that a client might be interested in, based on past purchases

    • recommender system ⇒ feed past purchases to an artificial neural network
  • Building an intelligent bot for a game

    • Reinforcement Learning

Types of ML Systems

Human Supervision

Supervised Learning

  • The agent observes input-output pairs and learns a function that maps from input to output.
  • Labeled data
  • Classification: predicting a discrete class label
  • Regression: predicting a target numeric value given a set of features called predictors
  • Common algorithms
    • k-Nearest Neighbors
    • Linear Regression
    • Logistic Regression
    • Support Vector Machines (SVMs)
    • Decision Trees and Random Forests
    • Neural Networks
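To make the two supervised tasks concrete, here is a minimal sketch in plain Python: a k-Nearest Neighbors classifier (predicting a label) and a one-variable least-squares fit (predicting a numeric value). The function names `knn_classify` and `fit_line` and the toy fruit data are assumptions made up for illustration; they are not taken from any library or from the cited sources.

```python
from collections import Counter


def knn_classify(train_x, train_y, query, k=3):
    """Classification: majority vote among the k labeled points closest to query."""
    by_distance = sorted(
        zip(train_x, train_y),
        key=lambda pair: sum((a - b) ** 2 for a, b in zip(pair[0], query)),
    )
    votes = Counter(label for _, label in by_distance[:k])
    return votes.most_common(1)[0][0]


def fit_line(xs, ys):
    """Regression: least-squares slope and intercept for a single predictor."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    return slope, mean_y - slope * mean_x


# Hypothetical labeled data: (weight in grams, diameter in cm) -> fruit name
fruit_x = [(150, 7.0), (170, 7.5), (140, 6.8), (10, 2.0), (12, 2.2), (9, 1.9)]
fruit_y = ["apple", "apple", "apple", "grape", "grape", "grape"]
label = knn_classify(fruit_x, fruit_y, query=(160, 7.2))

# Hypothetical numeric targets that lie on the line y = 2x + 1
slope, intercept = fit_line([1, 2, 3, 4], [3, 5, 7, 9])
```

In both cases the training data carries the answer (a label or a number) for every input, which is what makes the setting supervised.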

Unsupervised Learning

  • The agent learns patterns in the input without any explicit feedback
  • Unlabeled data
  • In unsupervised learning, given a training set without labeled outputs, one must construct a sufficient model of the data.

Algorithms
  • Clustering

    • K-means
      • Input: a training set of unlabeled points and the number of clusters k
  1. Randomly initialize the cluster centroids.
  2. For each point, find the closest cluster centroid and assign the point to that cluster.
  3. Move each cluster centroid to the mean of the points assigned to its cluster.
  4. Repeat steps 2 and 3 until convergence.

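K-means is guaranteed to converge. To show this, we define a distortion function:

$$J(c, \mu) = \sum_{i=1}^{m} \lVert x^{(i)} - \mu_{c^{(i)}} \rVert^2$$

where $c^{(i)}$ is the cluster assigned to point $x^{(i)}$ and $\mu_j$ is the centroid of cluster $j$. K-means is coordinate descent on $J$: step 2 minimizes $J$ over the assignments $c$ with $\mu$ held fixed, and step 3 minimizes $J$ over the centroids $\mu$ with $c$ held fixed. Since $J$ never increases and is bounded below by zero, the algorithm converges, though possibly to a local rather than a global minimum.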
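The four steps above, together with the distortion J used in the convergence argument, can be sketched in plain Python as follows. This is a minimal illustration under stated assumptions, not a reference implementation: centroids are initialized from randomly chosen data points, an empty cluster simply keeps its old centroid, and the names `k_means`, `distortion`, and the toy `points` are made up for this example.

```python
import random


def squared_distance(p, q):
    return sum((a - b) ** 2 for a, b in zip(p, q))


def distortion(points, centroids, labels):
    """J: sum of squared distances from each point to its assigned centroid."""
    return sum(squared_distance(p, centroids[c]) for p, c in zip(points, labels))


def k_means(points, k, max_iters=100, seed=0):
    rng = random.Random(seed)
    # Step 1: random initialization (assumption: pick k distinct data points).
    centroids = [list(p) for p in rng.sample(points, k)]
    labels, history = None, []
    for _ in range(max_iters):
        # Step 2: assign every point to its closest centroid.
        new_labels = [
            min(range(k), key=lambda j: squared_distance(p, centroids[j]))
            for p in points
        ]
        # Step 3: move each centroid to the mean of its assigned points.
        for j in range(k):
            members = [p for p, c in zip(points, new_labels) if c == j]
            if members:  # an empty cluster keeps its previous centroid
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
        history.append(distortion(points, centroids, new_labels))
        # Step 4: stop once the assignments no longer change.
        if new_labels == labels:
            break
        labels = new_labels
    return centroids, labels, history


# Two hypothetical, well-separated groups of 2-D points
points = [(1.0, 1.0), (1.5, 2.0), (1.0, 1.5), (8.0, 8.0), (9.0, 8.5), (8.5, 9.0)]
centroids, labels, history = k_means(points, k=2)
```

The `history` list records J after every pass, so the claim that the distortion never increases can be checked directly on any run.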

    • DBSCAN

    • Hierarchical Cluster Analysis (HCA)

  • Anomaly detection and novelty detection

    • One-class SVM

    • Isolation Forest

  • Visualization and dimensionality reduction

    • Principal Component Analysis (PCA)

    • Kernel PCA

    • Locally Linear Embedding (LLE)

    • t-Distributed Stochastic Neighbor Embedding (t-SNE)

    TIP: It’s often a good idea to reduce the dimension of your training data using a dimensionality reduction algorithm before feeding it to another ML algorithm (such as a supervised learning algorithm). It will run faster, take up less disk and memory space, and in some cases may also perform better.
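As a rough illustration of the tip, the sketch below reduces 2-D points to a single coordinate by projecting them onto their first principal component, which is the core idea of PCA. It assumes plain Python with power iteration standing in for a library PCA routine; the helper names `first_principal_component` and `project` and the sample `data` are hypothetical. Power iteration also assumes the data has nonzero variance and that the starting vector is not orthogonal to the top direction.

```python
import math


def first_principal_component(rows, iters=200):
    """Return (feature means, unit vector along the direction of largest variance)."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]
    # Sample covariance matrix of the centered data.
    cov = [
        [sum(r[a] * r[b] for r in centered) / (n - 1) for b in range(d)]
        for a in range(d)
    ]
    # Power iteration: repeatedly apply cov and renormalize.
    v = [1.0] * d
    for _ in range(iters):
        w = [sum(cov[a][b] * v[b] for b in range(d)) for a in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return means, v


def project(rows, means, direction):
    """Map each row to its 1-D coordinate along direction."""
    return [
        sum((r[j] - means[j]) * direction[j] for j in range(len(direction)))
        for r in rows
    ]


# Hypothetical 2-D data lying close to the line y = 2x
data = [(1.0, 2.1), (2.0, 3.9), (3.0, 6.2), (4.0, 7.8), (5.0, 10.1)]
means, direction = first_principal_component(data)
reduced = project(data, means, direction)  # one number per point instead of two
```

The `reduced` values could then be fed to a downstream learner in place of the original two features, at half the storage cost.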

  • Association rule learning

    • Apriori

    • Eclat

Semisupervised Learning

  • Deep Belief Networks ⇒ restricted Boltzmann machines (unsupervised) stacked on top of each other ⇒ fine-tuned using supervised learning techniques.

Reinforcement Learning

  • The agent learns from a series of reinforcements: rewards and punishments.
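A small sketch of learning from rewards and punishments, assuming tabular Q-learning (one common reinforcement learning algorithm, chosen here for illustration) on a made-up five-cell corridor. The environment, the reward values, and all names (`step`, `q_learning`, `greedy_policy`) are hypothetical: reaching the right end gives a reward of +1, reaching the left end gives a punishment of -1.

```python
import random

N_STATES = 5        # corridor cells 0..4; cells 0 and 4 end the episode
ACTIONS = (-1, +1)  # index 0 moves left, index 1 moves right
START = 2


def step(state, action):
    """Hypothetical environment: returns (next_state, reward, done)."""
    nxt = state + action
    if nxt == N_STATES - 1:
        return nxt, 1.0, True   # reward at the right end
    if nxt == 0:
        return nxt, -1.0, True  # punishment at the left end
    return nxt, 0.0, False


def q_learning(episodes=1000, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]  # q[state][action index]
    for _ in range(episodes):
        state = START
        for _step in range(100):  # safety cap on episode length
            # Epsilon-greedy choice; ties are broken at random.
            if rng.random() < epsilon or q[state][0] == q[state][1]:
                a = rng.randrange(2)
            else:
                a = 0 if q[state][0] > q[state][1] else 1
            nxt, reward, done = step(state, ACTIONS[a])
            # Q-learning update toward reward plus discounted best next value.
            target = reward if done else reward + gamma * max(q[nxt])
            q[state][a] += alpha * (target - q[state][a])
            state = nxt
            if done:
                break
    return q


def greedy_policy(q):
    """Best known action in each non-terminal cell (ties resolve to 'right')."""
    return {
        s: ("left" if q[s][0] > q[s][1] else "right")
        for s in range(1, N_STATES - 1)
    }


q_table = q_learning()
policy = greedy_policy(q_table)
```

No one tells the agent which action is correct; it only sees the reward signal, and the learned `policy` reflects which moves led to better outcomes.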

Feature Engineering

Learning Incrementally

Online Learning

Batch Learning

Comparing new data points to known data points

Instance-based Learning

Model-based Learning

References

Géron 2019

Goodfellow 2016