Ensemble learning explained
There are many instances when a single machine learning model simply isn’t good enough – you need to use multiple models. Here is what an ensemble framework looks like:
In everyday life we use a form of ensemble learning for our daily decision making. For example, suppose you are applying to a university to enrol on an ensemble machine learning course. How do you decide whether it is the right choice or not? There are always multiple things that will inform your decision:
- Other students’ reviews: Students may provide information such as whether this course is useful for improving skill sets, information about the course curriculum, the practical sessions, and so on. But as the students are not fully aware of the course’s details (obviously, that’s why they are taking that course) and also because they cannot suggest any other competitive course, you cannot rely on their suggestion solely. However, you know that their suggestions helped you in the past to choose previous courses. Let’s say they are correct 60% of the time.
- Study counselors: You can get information regarding other competitive courses. They also know which universities have experts for the domain, so they can help you out to choose one course over other ML courses. Let’s assume that these counselors are correct 60% of the time.
- Career counselors: Why do you want to take this course? Of course, for a better job, so a career counselor can tell you about the current requirements related to this skill set. They can tell you whether this course will help you enhance your career. Keep in mind that career counselors are right most of the time, say 40%.
- Social media: Yes it helps! Here, you can join many discussion forums, find many suggestions, and pros and cons of the course. You can get suggestions from a large audience and they can perform a very critical role in your decision. This can show you in which region of the country there is more demand for a certain course, or where it is not considered an important skill. It may be correct, say, 40% of the time.
- Placement officer: A placement officer is a person who takes care of job placements; so they better know which specific companies need employees for this domain. Also they will give you assurance of getting a decent job after completion of this course. And trust me, 70% of the time you will go with their review, because they will help you in getting a job!
None of these things alone are going to help you come to a final answer. But by combining each of these different components, you’ll probably come to a decision. Let’s take a look at how you can make a decision on whether you should go with this course or not. Let’s quickly analyze the scenario. As all the experts are from an independent system, we can get a very high accuracy rate as follows:
1-60%*60%*40%*40%*70% 1-0.040 96%
Can you see what we have got by the combined decision? I think it is more than you think. Can we further improve it? Yes we can; for that, we have to take suggestions from more sources, such as course faculties, a company’s workers, and so on.
The preceding example is based on an assumption that the suggestions from all the sources are independent. Well, in a practical scenario, this is not possible. If we are talking about the same domain, more or less there will be a correlation between the suggestions. Suppose we choose six sources but all are students of that course. Then we cannot reach the correct decision with high confidence; this is where the power of ensembles comes into the picture, where you have multiple predictions and you combine all of them to get a high-confidence prediction. Let’s enter the world of ensembles.
When to use ensemble learning
There are many reasons to go for ensembles, as each model of the group is usually based on algorithms, some of which are very simple and less computation intensive, but some may be quite complex and more computation intensive. For any production environment, accuracy and computation time are equally important. A system with higher accuracy but one that is unimplementable for real-time applications is of no use. However, a simple algorithm may lack in accuracy and may not fit onto the data properly; in those cases, we have to make a compromise between accuracy and computation time. This compromise can be minimized if we use many weak learners to get a combined confidence index out of them, which may help us to implement such a system for real-time applications with very high accuracy.
These are the main instanced when you should use ensemble learning:
- The dataset is too large or too small: When a dataset is too large and so it cannot be trained by a single model, we can create a small subset of data to train different models. At the end, we can choose the average of all as the final prediction. Similarly, when a dataset is too small to train a single model, we can use bootstrap methods to create random subsamples of data to train the models.
- Complex (nonlinear) data: Most of the time, a real-world dataset is a nonlinear dataset, where a single model cannot define the class boundary clearly. This is known as underfitting of the model. In such cases, we can use more than one model to train different subsets of the data and average out the result at the end to predict distinct boundaries.
- High confidence: When we train multiple classifiers on the training dataset and get mostly correlated output, it ensures a high prediction rate. Consider a case of classification where most of our classifiers predict the same class for an instance; in such cases, interprets ensemble system having high confidence on its decision.
As with just about everything in the machine learning world, the key thing is to select the right model for the data you have at your disposal and the questions you’re trying to answer.
If you want to learn more about ensemble learning, explore it in depth in Ensemble Machine Learning from which this post has been taken.
Find more machine learning eBooks and videos.
Explore deep learning eBooks and videos.