Alongside “artificial intelligence” and “virtual reality”, “machine learning” (ML for short) is probably one of the most hyped technologies of the past few years. And, at least in my opinion, rightly so. However, ML was not invented recently: the term was coined in 1959 by Arthur Samuel, a pioneer in the field, who described it as follows:
“Machine learning is a field of study that gives computers the ability to learn without being explicitly programmed.”
What is ML?
ML is a computational, or rather statistical, method that uses experience to improve performance on a task or to make accurate predictions. The ‘experience’ in this case is past data that we can access and that is often labelled or categorized (by humans). The quality of the predictions depends on the accuracy and the amount of that data.
In classical statistical modelling, we collect data, clean it, and use the cleaned data set to test a hypothesis and make predictions. Such models are based on static algorithms that the programmer chooses and implements up front. In ML, on the other hand, we don’t preselect a model and feed data into it. Instead, the data determines which analytic technique (or algorithm) is best suited to the task at hand. The algorithm “learns” from the tagged data, extracts “knowledge” from it, and can then make predictions about new data that we need to tag, categorize, or interpret.
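To illustrate the difference between an explicitly programmed rule and one that is learned from tagged data, here is a minimal sketch in plain Python. The vehicle lengths, the function names, and the midpoint-between-means approach are all assumptions made up for this illustration; real ML libraries use far more capable algorithms.

```python
# Toy data, invented for illustration: vehicle length in metres plus a human-provided label.
labelled = [
    (3.9, "car"), (4.5, "car"), (5.1, "car"),
    (37.2, "space shuttle"), (36.8, "space shuttle"), (37.5, "space shuttle"),
]

def hand_coded_rule(length_m):
    """Explicitly programmed: the programmer picked the threshold 20.0 by hand."""
    return "car" if length_m < 20.0 else "space shuttle"

def learn_threshold(examples):
    """Learned from data: put the threshold halfway between the two class means."""
    cars = [x for x, label in examples if label == "car"]
    shuttles = [x for x, label in examples if label == "space shuttle"]
    return (sum(cars) / len(cars) + sum(shuttles) / len(shuttles)) / 2

# "Training": the threshold comes from the labelled examples, not from the programmer.
threshold = learn_threshold(labelled)

def learned_rule(length_m):
    """Predict a label for new, untagged data using the learned threshold."""
    return "car" if length_m < threshold else "space shuttle"

print(learned_rule(4.2))   # car
print(learned_rule(35.0))  # space shuttle
```

If the labelled examples change, the learned threshold changes with them, while the hand-coded rule stays the same until someone edits the code.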
How does the data tagging work?
In short, there are three types of ML that differ in how the data is labelled/tagged, and, hence, how the computer can interpret and use it:
- Supervised ML: Here, the algorithm is trained on data that has been carefully labelled by humans. For example, when we want to teach the computer to distinguish between a car and a space shuttle, we label each photo as either ‘car’ or ‘space shuttle’. The ML algorithm learns the difference and can then classify a new photo and estimate how likely it is to be a ‘car’ or a ‘space shuttle’.
- Unsupervised ML: When a dataset does not have any labels/tags, the algorithm cannot use them to understand the data. Instead, it takes the raw, unlabelled dataset and tries to find inherent characteristics and clusters of similar items. As you can imagine, the results here can be much harder to interpret and understand.
- Reinforcement ML: This approach is similar to unsupervised ML in that the training data is also unlabelled. However, the outcomes (i.e. the predictions) are graded and thus effectively labelled. For example, if the ML algorithm predicts a photo of a Tesla to be a space shuttle, the user can tell the algorithm that this is wrong. This positive or negative grade creates a feedback loop that improves the ML algorithm.
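To make the three types more concrete, here is a minimal sketch in plain Python. Everything in it is an assumption for illustration: the (length, height) features stand in for photos, the numbers are invented, and the three techniques (nearest centroid, 2-means clustering, and a simple graded-guess loop) are just small stand-ins for what real systems do.

```python
import random

# Hypothetical features per "photo": (length in metres, height in metres). Invented values.
photos = [(4.0, 1.5), (4.6, 1.4), (5.0, 1.8), (37.0, 17.0), (36.5, 17.5), (37.4, 16.8)]
labels = ["car", "car", "car", "space shuttle", "space shuttle", "space shuttle"]

def dist2(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def mean(points):
    """Coordinate-wise mean of a non-empty list of points."""
    return tuple(sum(c) / len(points) for c in zip(*points))

# 1) Supervised: labels are given, so we learn one centroid per label.
def fit_centroids(points, point_labels):
    groups = {}
    for p, label in zip(points, point_labels):
        groups.setdefault(label, []).append(p)
    return {label: mean(ps) for label, ps in groups.items()}

def classify(centroids, point):
    """Assign the label whose centroid is closest to the new point."""
    return min(centroids, key=lambda label: dist2(centroids[label], point))

# 2) Unsupervised: no labels, so we only look for two clusters of similar points.
def two_means(points, iterations=10):
    centers = [points[0], points[-1]]  # naive initialisation, good enough for this toy data
    clusters = [[], []]
    for _ in range(iterations):
        clusters = [[], []]
        for p in points:
            idx = 0 if dist2(p, centers[0]) <= dist2(p, centers[1]) else 1
            clusters[idx].append(p)
        centers = [mean(c) if c else centers[i] for i, c in enumerate(clusters)]
    return clusters

# 3) Reinforcement-style feedback: each guess is graded +1 (right) or -1 (wrong).
def learn_from_feedback(true_label, options, rounds=200, epsilon=0.1, seed=0):
    rng = random.Random(seed)
    score = {o: 0.0 for o in options}
    for _ in range(rounds):
        if rng.random() < epsilon:
            guess = rng.choice(options)                   # occasionally explore
        else:
            guess = max(options, key=lambda o: score[o])  # otherwise take the best so far
        score[guess] += 1 if guess == true_label else -1  # the user's grade
    return max(options, key=lambda o: score[o])

centroids = fit_centroids(photos, labels)
print(classify(centroids, (4.3, 1.6)))                       # car
clusters = two_means(photos)
print([len(c) for c in clusters])                            # [3, 3]
print(learn_from_feedback("car", ["space shuttle", "car"]))  # car
```

Note that the clustering step finds two groups but cannot name them. Telling which cluster is the cars and which is the space shuttles still requires a human, which is why unsupervised results are harder to interpret.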
From predicting your next purchase to learning what you paint
There are hundreds of different use cases for machine learning, and I’ve collected a few of them:
- Microsoft Cognitive Services to help developers categorize/understand images/videos, read people’s emotions, translate text, etc.
- Google Autodraw to predict what you’re drawing
- IBM Watson to classify natural language or translate from speech to text
- Tesla’s (and others’) self-driving cars
- soon your project?
- and much more!
What does machine learning mean for my business?
Today, machine learning has found its way into almost every area of life. This includes finance, where algorithms automatically trade stocks; bots, where algorithms take care of customer support (to a certain degree); health, where it helps you better understand and improve your personal well-being; fraud detection, where it identifies fraudulent credit card use; and many, many more.
It is never too late to start. Do you have labelled or unlabelled data from your business that you want to learn more from, or that you want to use to train a model and run it as a web service that automatically tags and analyzes new data? Contact me! I’d love to learn more about your idea and help you out!