What is Machine Learning?

Back in 1959, Arthur Samuel, a pioneer in computer gaming and artificial intelligence, defined machine learning as “the field of study that gives computers the ability to learn without being explicitly programmed.” Yes, here we go again: artificial intelligence is full of vague definitions such as this one. Tom Mitchell offered a clearer and more creative definition in his famous book “Machine Learning”, published in 1997: “a computer program is said to learn from experience E, with respect to some task T and some performance measure P, if its performance on T, as measured by P, improves with experience E.”

Think of a computer playing chess against you, for example. We can say that the computer is learning from games (the experience E) with respect to playing chess (the task T) and a measure of performance P (which could be the ratio of games won by the computer to total games played) if its performance at playing chess, as measured by P, improves as it plays more and more games.
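To make the E/T/P framing concrete, here is a minimal sketch in Python. It is not a chess program: the task, the hidden cutoff, the examples, and all function names are made up for illustration. The task T is deciding whether a number is at or above a hidden cutoff, the experience E is the set of labeled examples seen so far, and the performance P is accuracy on a fixed set of test numbers.

```python
# Toy illustration of Mitchell's E/T/P framing (all names and numbers are hypothetical).
# Task T: decide whether a number is at or above a hidden cutoff.
# Experience E: the labeled examples seen so far.
# Performance P: accuracy on a fixed set of test numbers.

HIDDEN_CUTOFF = 50  # unknown to the learner; used only to label data


def true_label(x):
    """Ground truth used to label examples and to score predictions."""
    return x >= HIDDEN_CUTOFF


def learn_cutoff(examples):
    """Estimate the cutoff as the midpoint between the largest negative
    example and the smallest positive example seen so far."""
    largest_negative = max(x for x, label in examples if not label)
    smallest_positive = min(x for x, label in examples if label)
    return (largest_negative + smallest_positive) / 2


def accuracy(cutoff, test_points):
    """Fraction of test points that the learned cutoff classifies correctly."""
    correct = sum((x >= cutoff) == true_label(x) for x in test_points)
    return correct / len(test_points)


experience = [(10, False), (60, True), (30, False),
              (80, True), (48, False), (52, True)]
test_points = range(100)

scores = []
for n in (2, 4, 6):  # let the learner see more and more experience
    cutoff = learn_cutoff(experience[:n])
    scores.append(accuracy(cutoff, test_points))
    print(f"{n} examples -> estimated cutoff {cutoff}, accuracy {scores[-1]:.2f}")
```

With this particular data, the accuracy goes from 0.85 to 0.95 to 1.00 as the learner sees 2, 4, and then 6 examples. That is exactly what the definition asks for: performance on T, as measured by P, improves with E.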

Machine Learning (Sarcasm!)

Applications of Machine Learning

The list could be very long, but here are my favorites:

  • Predicting which movies a user would like to watch (as applied by Netflix).
  • Identifying which customers (customer segments) are most likely to respond to a marketing campaign or special offer.
  • Face recognition.
  • Predicting housing prices for real estate business.
  • Predicting stock returns.
  • Self-driving vehicles.
  • Predicting the likelihood of a severe reaction to a medicine based on blood analysis.

In general, machine learning can be applied in a very wide range of fields, from insurance, banking, and healthcare to robotics, marketing, and OCR.

You might have noticed that several of the applications above start with “predicting”. That is no accident: prediction is at the core of machine learning. Real-life prediction problems fall into two categories:

  1. Regression: we call the prediction problem a “regression problem” when we are predicting a numerical value, such as the price of a house given its area and number of bedrooms.
  2. Classification: we call the prediction problem a “classification problem” when we are predicting the outcome as a class. Think of a computer program whose job is to classify the messages you receive as Spam or Not-Spam and to filter out the spam ones automatically.
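The two categories can be sketched side by side in plain Python. Everything below is an assumption made for illustration: the numbers are invented, the regression uses a simple least-squares line on house area only, and the classifier is a nearest-centroid rule on a single made-up feature (the number of suspicious words in a message). Real spam filters and price models are far more involved.

```python
# Regression: predict a number (a price) from one feature (the house area).
def fit_line(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    return slope, intercept


areas = [50, 80, 100, 120]     # square meters (made-up data)
prices = [150, 240, 300, 360]  # thousands (made-up data)
slope, intercept = fit_line(areas, prices)
predicted_price = slope * 90 + intercept  # a numerical prediction


# Classification: predict a class (spam / not spam) from one feature.
def fit_centroids(values, labels):
    """Compute the average feature value of each class."""
    centroids = {}
    for label in set(labels):
        members = [v for v, lab in zip(values, labels) if lab == label]
        centroids[label] = sum(members) / len(members)
    return centroids


def classify(value, centroids):
    """Return the class whose average is closest to the given value."""
    return min(centroids, key=lambda label: abs(value - centroids[label]))


suspicious_word_counts = [5, 7, 6, 0, 1, 2]  # made-up data
message_labels = ["spam", "spam", "spam", "not spam", "not spam", "not spam"]
centroids = fit_centroids(suspicious_word_counts, message_labels)

print("Predicted price for 90 square meters:", predicted_price)
print("Message with 4 suspicious words:", classify(4, centroids))
print("Message with 2 suspicious words:", classify(2, centroids))
```

The difference is in the kind of output: the regression returns a number on a continuous scale, while the classifier returns one label from a fixed set.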

Another, less common goal of machine learning is inference: studying the relationships between the predictor variables and the response variable(s) of the data, i.e. how changing one of the predictor variables affects the response.

A Myth About Machine Learning

Some people think negatively of machine learning, or of artificial intelligence in general. They know from experience how complicated life is, and they doubt that any human, regardless of how smart they are or what tools they use, could “predict” anything about the future of any meaningful situation.

What they miss is that machine learning does not aim for perfect results. I agree that no machine learning algorithm can provide correct predictions at all times; that may be the dream of machine learning practitioners, but it is too good to be true. Machine learning just aims for results that are close enough to the real ones, and that proximity is what makes it useful.
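One common way to put a number on “close enough” is the mean absolute error: the average distance between the predicted values and the real ones. The sketch below is only an illustration. The prices are invented, and the choice of this particular metric is my assumption rather than something the argument above depends on.

```python
def mean_absolute_error(actual, predicted):
    """Average absolute difference between real and predicted values."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


actual_prices = [200, 310, 250]     # what really happened (made-up data)
predicted_prices = [210, 300, 255]  # what a model predicted (made-up data)

mae = mean_absolute_error(actual_prices, predicted_prices)
print(f"On average, the predictions are off by {mae:.2f}")
```

None of the three predictions is exactly right, yet being off by about 8 on prices in the hundreds may well be good enough to act on. Whether it is depends on the application.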

If you can think of other myths, I would be happy if you shared them in the comments below.

Types of Machine Learning

There are two major types of machine learning. As you read the definitions below, keep in mind that the distinction comes from the nature of the data we have at hand:

Supervised Learning

This type of learning is clearly demonstrated by Tom Mitchell's definition, which we saw above. Learning from experience is the main motto of this type of learning. And what is experience here? It is the historical data we have, in which the outcome of each observation is already known, and which we pass to our machine learning algorithm to come up with a model that predicts some property of new, unseen data. So, with some past observations of a process, we can predict the outcome of an unseen future observation of that process. There will be upcoming posts completely dedicated to supervised learning, as it will constitute most of what I talk about in this blog.

Supervised Learning

Unsupervised Learning

As per Wikipedia:

the problem of unsupervised learning is that of trying to find hidden structure in unlabeled data. Since the examples given to the learner are unlabeled, there is no error or reward signal to evaluate a potential solution.

So unsupervised learning is more of an exploratory journey through unlabeled data. Think of it as studying the similarities and differences, or the relations and structure, of some data based on its various characteristics. This is why cluster analysis is such a useful technique in unsupervised learning. Note that cluster analysis is not an algorithm in itself; it is a task that can be carried out by many different algorithms.

Clustering in Unsupervised Learning (Source: scikit-learn.org)

Cluster analysis (or clustering) separates the data into distinct groups of similar observations. The figure above shows 3 different groups after the cluster analysis is done, but when you first start examining such data, it looks uniform, with no clear distinctions whatsoever.
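As a rough sketch of how one such algorithm works, here is a tiny one-dimensional version of k-means, one of the many algorithms that can perform cluster analysis. The data, the starting centroids, and the function name are all made up for illustration, and this is not the code behind the figure above. Notice that no labels are passed in: the groups emerge from the values alone.

```python
def kmeans_1d(points, initial_centroids, max_iters=100):
    """Group 1-D points around centroids by repeating two steps:
    assign each point to its nearest centroid, then move each centroid
    to the mean of the points assigned to it."""
    centroids = list(initial_centroids)
    labels = []
    for _ in range(max_iters):
        # Assignment step: index of the nearest centroid for each point.
        labels = [
            min(range(len(centroids)), key=lambda j: abs(p - centroids[j]))
            for p in points
        ]
        # Update step: each centroid becomes the mean of its members.
        new_centroids = []
        for j in range(len(centroids)):
            members = [p for p, lab in zip(points, labels) if lab == j]
            # Keep the old centroid if no point was assigned to it.
            new_centroids.append(sum(members) / len(members) if members else centroids[j])
        if new_centroids == centroids:  # nothing moved, so we have converged
            break
        centroids = new_centroids
    return centroids, labels


points = [1.0, 1.5, 2.0, 8.0, 8.5, 9.0, 15.0, 15.5, 16.0]  # made-up, unlabeled data
final_centroids, cluster_labels = kmeans_1d(points, initial_centroids=[1.0, 8.0, 15.0])

print("Centroids:", final_centroids)
print("Cluster of each point:", cluster_labels)
```

With this data, the points settle into three groups centered at 1.5, 8.5, and 15.5. In practice the starting centroids are usually chosen at random and the number of groups has to be decided by the analyst, so results on real data are rarely this tidy.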

I might dedicate some posts to giving you a deeper understanding of the concepts and algorithms behind unsupervised learning, but as I mentioned above, most of this blog's content will be devoted to supervised learning. This bias comes from the fact that most real-life machine learning applications belong to supervised learning.

Semi-supervised Learning

There are some cases where the problem at hand belongs to both supervised and unsupervised learning, typically because only part of the data is labeled, and a mix of the two is needed to come up with useful predictions. This is where the name “semi-supervised” comes from.

Conclusion

As you will see, you can do a lot of cool things with machine learning that would be very hard to do with traditional programming.

Day after day, machine learning is getting more popular, and we now hear of job positions such as “Machine Learning Engineer” or “Data Scientist” that we had never heard of before.

Be ready! The market, and especially the most successful and reputable companies like Google and Facebook, can never seem to hire enough people for their data science departments. Even if you are not seeking such a job in the near future, I think that learning machine learning is never a waste of time, so keep an eye on the upcoming posts.