The future doesn’t invent itself: business leaders are the ones who make it. The global machine learning (ML) market is booming, with more and more businesses adopting this technology in their daily operations. No wonder ML is regarded as one of the major forces driving innovation today. The reasons why ML is so popular include but are not limited to:
- advancements in ML algorithms;
- growth in storage capacity.
So far, it’s one of the most active and promising areas of artificial intelligence (AI). In this article, we are going to discuss the phenomenon of automated ML and compare some of the popular ML tools for you to choose the best for your business.
What Is AutoML?
Automated Machine Learning, also referred to as Automated ML or AutoML, is an emerging technology that automates machine learning tasks, accelerates the model-building process, helps data scientists focus on higher-value work, and improves the accuracy of ML models. AutoML aims to automate parts of the data science workflow and to drive data-driven decision-making.
Basically, automated machine learning is the automated practice of algorithm selection, hyperparameter optimization, iterative modeling, and model evaluation. This technology doesn’t aim to replace data scientists but rather to free them from repetitive tasks.
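The loop described above (pick an algorithm, try its hyperparameters, iterate, evaluate) can be sketched in a few lines of plain Python. This is only a toy illustration under stated assumptions: the dataset, the two "algorithms" (`fit_stump`, `fit_knn`), and the `SEARCH_SPACE` dictionary are all hypothetical names invented for this example and do not come from any AutoML library.

```python
import random

def make_data(n, seed):
    """Toy 1-D binary classification: label is 1 when x > 0.6, with 10% label noise."""
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        x = rng.random()
        y = int(x > 0.6)
        if rng.random() < 0.1:
            y = 1 - y  # flip the label to simulate noise
        data.append((x, y))
    return data

def fit_stump(train, threshold):
    """Threshold rule; the threshold is a hyperparameter, so nothing is learned from `train`."""
    return lambda x: int(x > threshold)

def fit_knn(train, k):
    """k-nearest-neighbours classifier with majority vote; k is the hyperparameter."""
    def predict(x):
        nearest = sorted(train, key=lambda point: abs(point[0] - x))[:k]
        votes = sum(label for _, label in nearest)
        return int(votes * 2 > k)
    return predict

def accuracy(model, data):
    return sum(model(x) == y for x, y in data) / len(data)

# Hypothetical search space: algorithm name -> (fit function, hyperparameter values to try).
SEARCH_SPACE = {
    "stump": (fit_stump, [0.2, 0.4, 0.6, 0.8]),
    "knn": (fit_knn, [1, 5, 15]),
}

def auto_select(train, valid):
    """Try every algorithm/hyperparameter pair and keep the best validation score."""
    best = None
    for name, (fit, values) in SEARCH_SPACE.items():
        for value in values:
            score = accuracy(fit(train, value), valid)
            if best is None or score > best[0]:
                best = (score, name, value)
    return best

train = make_data(200, seed=0)
valid = make_data(100, seed=1)
best_score, best_name, best_value = auto_select(train, valid)
print(best_name, best_value, round(best_score, 3))
```

Real AutoML systems replace the exhaustive double loop with smarter search strategies and much larger search spaces, but the selection-by-validation-score principle is the same.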
AutoML is in growing demand because of its features:
- Usability: providing machine learning as a tool to non-machine learning experts;
- Productivity: increasing the productivity of machine learning engineers;
- Performance: finding better machine learning models.
With the emergence of AutoML, a new era of R&D and business app development has started. AutoML is about generating solutions without compromising on accuracy, making ML more accessible, reducing the need for human expertise, and improving model performance in general.
AutoML’s major advantages are the following:
- Democratization: it makes ML features accessible to non-experts;
- Error reduction: it prevents possible faults caused by human intervention;
- Efficiency: it automates repetitive tasks;
- Optimization: it tunes hyperparameters automatically;
- Management: it helps manage the model’s future use, and more.
AutoML Market Review
The estimated AutoML market revenue reached $270 million in 2019. Experts forecast that it will reach $14.5 billion by 2030. These figures make it clear that the influence of AutoML will continue to grow.
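To put the two figures in perspective, one can compute the annual growth rate they imply. The snippet below is a back-of-the-envelope sketch that assumes smooth compound growth over the 11 years between 2019 and 2030; the function name `implied_cagr` is made up for this example, and the result is derived arithmetic, not a number from the forecast itself.

```python
def implied_cagr(start_value, end_value, years):
    """Compound annual growth rate that turns start_value into end_value over `years` years."""
    return (end_value / start_value) ** (1 / years) - 1

# Figures from the text, in millions of US dollars.
revenue_2019 = 270
forecast_2030 = 14_500

cagr = implied_cagr(revenue_2019, forecast_2030, years=2030 - 2019)
print(f"Implied annual growth: {cagr:.1%}")  # Implied annual growth: 43.6%
```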
Problems in AutoML and How to Avoid Them
AutoML has had considerable success in putting AI advances into practice. Still, the implementation process has room for improvement, and the question of the interplay between data, models, and humans arises.
Firstly, AutoML engineers find it difficult to process unstructured and semi-structured data. The next point worth mentioning is that the optimization goals of modern AutoML frameworks are not constant, so there is no way to judge effectiveness before the final results are presented.
Furthermore, it’s difficult to implement automated ML and obtain trustworthy outcomes because conditions change quickly. The AutoML applications currently on the market can run only one ML model program, for example, PyTorch.
Another challenge that should be mentioned is making a model explainable. Partly, it’s even a question of personal judgment. The solution found may not meet the final users’ expectations. Organizations have to work on developing the standards related to understandable, consistent ML.
And lastly, organizations face a lack of regulation, standards, and legal support when it comes to the privacy and security of AutoML. Modern technical solutions should be applied to different scenarios.
Model selection and automated hyperparameter optimization, also known as tuning, are AutoML’s most valuable features. They require the use of various techniques.
Deep Learning and Neural Networks
One type of ML is modeled on the way human neurons respond to triggers and interact with other neurons by sending them signals. Such a system, which can contain millions of nodes, is called a neural network. It can deal with complex problems by splitting them into smaller tasks.
For example, the neural network that is in charge of recognizing dogs might have a layer of nodes determining whether the object is furry. Another layer may look for tails or legs or color patterns. This complicated system develops automatically through constant training with thousands of examples.
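The layered idea can be illustrated with a tiny forward pass. The sketch below is a toy under stated assumptions: the three input features (fur, tail, legs), the layer sizes, and all weights are hand-picked for illustration rather than learned from examples, which is what a real network would do.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def dense(inputs, weights, biases):
    """One fully connected layer: each node squashes a weighted sum of its inputs."""
    return [
        sigmoid(sum(w * x for w, x in zip(row, inputs)) + b)
        for row, b in zip(weights, biases)
    ]

# Hand-picked (not trained) weights. Input features: [fur, tail, legs].
HIDDEN_W = [
    [4.0, 0.0, 0.0],  # hidden node 0 reacts to fur
    [0.0, 3.0, 3.0],  # hidden node 1 reacts to tail and legs
]
HIDDEN_B = [-2.0, -3.0]
OUTPUT_W = [[5.0, 5.0]]  # the output node combines both hidden nodes
OUTPUT_B = [-6.0]

def dog_score(features):
    """Return a value between 0 and 1; higher means 'more dog-like'."""
    hidden = dense(features, HIDDEN_W, HIDDEN_B)
    return dense(hidden, OUTPUT_W, OUTPUT_B)[0]

furry_with_tail = [1.0, 1.0, 1.0]
smooth_no_tail = [0.0, 0.0, 0.0]
print(dog_score(furry_with_tail), dog_score(smooth_no_tail))
```

In training, the weights would be adjusted automatically from labeled examples instead of being written by hand.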
Neural networks are good in environments that are:
- highly complex;
- constantly changing.
Machine learning is a fundamental change in computing. We have passed the times when the enormous amount of data collected from different sources could be processed manually. Now, manual processing seems almost impossible and entirely ineffective. Unlike traditional software programs, neural networks are scalable: new layers can be added without changing the underlying approach.
For neural networks and deep learning, the basis of AutoML is Neural Architecture Search (NAS), a set of algorithms that take a dataset as input and select the most relevant architecture and hyperparameters. The model is tuned automatically, so these algorithms can take over much of the manual design work that ML developers would otherwise do.
Meta-learning, or so-called learning to learn, is the ability of ML approaches to apply experience gained on earlier datasets and tasks to new ones. As a result, they learn from previous outputs, become more effective, and handle new tasks much faster. Machine learning algorithms learn from historical data.
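A heavily simplified sketch of the architecture-search idea follows. It is a toy under stated assumptions: the only architectural choice is the width of a single hidden layer, the search is exhaustive over four candidates, and the data is synthetic. Production NAS methods explore far larger spaces with reinforcement learning, evolutionary, or gradient-based strategies. All names (`train_network`, `CANDIDATE_WIDTHS`, and so on) are hypothetical.

```python
import math
import random

def make_xor_data(n, seed):
    """Points in the unit square, labelled 1 when exactly one coordinate exceeds 0.5."""
    rng = random.Random(seed)
    points = [(rng.random(), rng.random()) for _ in range(n)]
    return [((x1, x2), int((x1 > 0.5) != (x2 > 0.5))) for x1, x2 in points]

def sigmoid(z):
    z = max(-60.0, min(60.0, z))  # clamp to avoid overflow in exp
    return 1.0 / (1.0 + math.exp(-z))

def train_network(width, train, epochs=100, lr=0.2, seed=0):
    """Train a one-hidden-layer network of the given width with plain SGD."""
    rng = random.Random(seed)
    w1 = [[rng.uniform(-1, 1) for _ in range(2)] for _ in range(width)]
    b1 = [0.0] * width
    w2 = [rng.uniform(-1, 1) for _ in range(width)]
    b2 = 0.0

    def forward(x):
        hidden = [math.tanh(w1[j][0] * x[0] + w1[j][1] * x[1] + b1[j]) for j in range(width)]
        out = sigmoid(sum(w2[j] * hidden[j] for j in range(width)) + b2)
        return hidden, out

    for _ in range(epochs):
        for x, y in train:
            hidden, out = forward(x)
            d_out = out - y  # cross-entropy gradient w.r.t. the output pre-activation
            for j in range(width):
                # Backpropagate through tanh using the weight value before its update.
                d_hidden = d_out * w2[j] * (1.0 - hidden[j] ** 2)
                w2[j] -= lr * d_out * hidden[j]
                w1[j][0] -= lr * d_hidden * x[0]
                w1[j][1] -= lr * d_hidden * x[1]
                b1[j] -= lr * d_hidden
            b2 -= lr * d_out

    return lambda x: int(forward(x)[1] > 0.5)

def accuracy(predict, data):
    return sum(predict(x) == y for x, y in data) / len(data)

CANDIDATE_WIDTHS = [1, 2, 4, 8]  # the (tiny) architecture search space

def search_architecture(train, valid):
    """Train every candidate architecture and keep the best validation accuracy."""
    results = {w: accuracy(train_network(w, train), valid) for w in CANDIDATE_WIDTHS}
    best_width = max(results, key=results.get)
    return best_width, results

train = make_xor_data(200, seed=0)
valid = make_xor_data(100, seed=1)
best_width, results = search_architecture(train, valid)
print(best_width, results)
```

Candidates are scored on held-out validation data rather than on the training data, so an architecture is not rewarded merely for memorizing its training set.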
AutoML consolidates best AI practices to make data science progressively accessible and simultaneously reduce time spent on generating value. There are many tasks at which machine learning is far better than human beings. Each industry is utilizing machine learning in different ways to take advantage of this cutting-edge technology. So what are the most innovative AutoML applications?
Fraud detection is one of the most fundamental applications of ML. The future of retail cannot exist without online shopping. With the growth of the eCommerce industry and the increased number of people using credit cards as a payment method, credit card fraud is becoming the most common type of identity theft.
The problem is aggravated by the emergence of new payment channels, including smartphones, different wallets, UPI, etc. The US ranked first worldwide in cases of credit card fraud and in money lost to illegal transactions. According to the Nilson Report, the losses were estimated at $27.85 billion in 2018. The numbers are striking.
Another application of AutoML is translation. The best-known application of ML in automated translation is Google’s GNMT (Google Neural Machine Translation). Fluency and accuracy are achieved through Natural Language Processing (POS tagging, named entity recognition, and chunking).
AI plays a great role in the healthcare industry, and in medical diagnosis management in particular. Whether it’s examining critical medical parameters, forecasting disease progression from the extracted information, or planning and supporting treatment, ML holds the key to effectively automating regular, manual, and tedious workloads.
Machine Learning methodologies are additionally utilized to de-risk and accelerate clinical trials.
If you’ve ever used the Uber app, you’ve been using ML as well. Uber’s application automatically detects a client’s location and suggests a destination based on their previous trips (an ML calculation based on historic trip data).
Speaking of transportation, Tesla, a pioneer of self-driving cars (which need little or no human involvement), is also worth mentioning. Its AI has run on hardware from NVIDIA and relies on an unsupervised learning algorithm.
Comparison of AutoML Tools
Automated machine learning still has a long way to go, with many great AutoML frameworks only now emerging. These tools are designed to make using ML algorithms simple and to enable data scientists and ML engineers to build scalable machine learning models. Generally, AutoML tools must be able to build models with a wide range of algorithms (decision trees, neural nets, etc.) and provide a refined model to the end user.
Kaggle’s State of Data Science and Machine Learning 2020 survey shows that the adoption of AutoML tools (especially open-source ones) grew in 2020 compared to 2019.
It should be mentioned that there is an obvious difference between open-source and enterprise solutions for AutoML. Open-source solutions mostly automate algorithm selection and hyperparameter tuning, whereas enterprise solutions can do much more: data preparation and ingestion, column type detection, column intent detection, featurization, meta-learning, transfer learning, model selection, hyperparameter optimization, pipeline selection, and so on. In addition, the results achieved with open-source solutions are generally worse than those achieved with enterprise solutions.
Machine learning companies are investing in the research and development of ML to bring AI closer to consumers. Let’s briefly describe the use and performance of several of the most common AutoML tools available in the market (in random order).
PyTorch is an ML library, computing framework, and scripting language. PyTorch can be used on cloud platforms. It uses its Autograd module for automatic differentiation when building neural networks. It helps in creating computational graphs and is easy to use because of the hybrid front-end.
Auto-sklearn was created by researchers at the University of Freiburg. It is an open-source Python package developed on GitHub and built on top of the scikit-learn machine learning library.
The two main features of this toolkit are the automatic search for an appropriate learning algorithm for a new ML dataset and hyperparameter optimization. This tool offers out-of-the-box supervised ML, splitting preprocessing into data preprocessing and feature preprocessing. Auto-sklearn 2.0 is already available. More about the technology can be found in the NeurIPS 2015 paper.
MLBox is a leading AutoML Python library with many useful features. It’s a framework that handles data preparation, model selection, and hyperparameter search.
Amazon Lex is powered by the same deep learning technology as Alexa. It’s a fully managed AI service whose goal is to build conversational interfaces for applications.
Amazon Lex solves the problem of speech recognition and language understanding. It helps build virtual contact center agents and IVR, automate informational responses, improve productivity with application bots, and design chatbots. Its key features include natural conversations, builder productivity, and AWS service integrations.
TPOT (Tree-based Pipeline Optimization Tool) is one of the first open-source methods presented by the AutoML software community in the USA. It represents ML pipelines as trees and optimizes them using genetic algorithms. TPOT is built on the Python-based scikit-learn library and uses its classifiers. A large number of pipeline combinations are evaluated to find the most suitable one for the dataset.
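The genetic search idea can be sketched in plain Python. This is a toy illustration, not TPOT’s API: pipelines here are flat dictionaries (an optional scaling step plus a k-NN classifier) rather than trees, the data is synthetic, and every name (`fitness`, `evolve`, `K_CHOICES`, and so on) is made up for this example.

```python
import random

def make_data(n, seed):
    """Feature 0 carries the label; feature 1 is large-scale noise that hurts unscaled k-NN."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        label = rng.randint(0, 1)
        rows.append(((label + rng.gauss(0, 0.4), rng.gauss(0, 50.0)), label))
    return rows

def feature_stds(rows):
    stds = []
    for i in range(2):
        values = [x[i] for x, _ in rows]
        mean = sum(values) / len(values)
        stds.append((sum((v - mean) ** 2 for v in values) / len(values)) ** 0.5)
    return stds

def fitness(genes, train, valid):
    """Validation accuracy of the pipeline: optional scaling step, then k-NN."""
    divisors = feature_stds(train) if genes["scale"] else [1.0, 1.0]
    def dist(a, b):
        return sum(((a[i] - b[i]) / divisors[i]) ** 2 for i in range(2))
    correct = 0
    for x, y in valid:
        nearest = sorted(train, key=lambda row: dist(row[0], x))[: genes["k"]]
        votes = sum(label for _, label in nearest)
        correct += int((votes * 2 > genes["k"]) == bool(y))
    return correct / len(valid)

K_CHOICES = [1, 3, 5, 7, 9, 11]

def random_genes(rng):
    return {"scale": rng.random() < 0.5, "k": rng.choice(K_CHOICES)}

def crossover(a, b, rng):
    """Each gene of the child comes from one of the two parents."""
    return {key: rng.choice([a[key], b[key]]) for key in a}

def mutate(genes, rng, rate=0.3):
    child = dict(genes)
    if rng.random() < rate:
        child["scale"] = not child["scale"]
    if rng.random() < rate:
        child["k"] = rng.choice(K_CHOICES)
    return child

def evolve(train, valid, generations=5, pop_size=6, seed=0):
    rng = random.Random(seed)
    population = [random_genes(rng) for _ in range(pop_size)]
    history = []  # best fitness seen at the start of each generation
    for _ in range(generations):
        ranked = sorted(population, key=lambda g: fitness(g, train, valid), reverse=True)
        history.append(fitness(ranked[0], train, valid))
        parents = ranked[: pop_size // 2]  # selection: the fitter half survives unchanged
        children = [
            mutate(crossover(rng.choice(parents), rng.choice(parents), rng), rng)
            for _ in range(pop_size - len(parents))
        ]
        population = parents + children
    best = max(population, key=lambda g: fitness(g, train, valid))
    return best, fitness(best, train, valid), history

train = make_data(120, seed=0)
valid = make_data(60, seed=1)
best, best_score, history = evolve(train, valid)
print(best, best_score, history)
```

Because the fitter half of each generation survives unchanged, the best score can never decrease from one generation to the next.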
H2O AutoML was designed in response to the high demand for ML expertise. H2O AutoML software supports traditional ML models and neural networks. Its main application area is the automation of the ML workflow. Models are automatically trained and tuned within a user-specified time limit.
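The idea of tuning within a user-specified time limit can be sketched as a random search that stops at a deadline. This is a minimal illustration and not H2O’s API: `evaluate` is a hypothetical stand-in for training a model and returning its validation score, and the two hyperparameters are invented for the example.

```python
import random
import time

def evaluate(config):
    """Hypothetical stand-in for 'train a model and return its validation score'."""
    time.sleep(0.001)  # pretend that training takes a moment
    return 1.0 - (config["learning_rate"] - 0.1) ** 2 - 0.001 * abs(config["depth"] - 6)

def search_within_budget(max_seconds, seed=0):
    """Keep sampling configurations until the time budget runs out; return the best one."""
    rng = random.Random(seed)
    deadline = time.monotonic() + max_seconds
    best_config, best_score, trials = None, float("-inf"), 0
    while time.monotonic() < deadline:
        config = {
            "learning_rate": rng.uniform(0.001, 0.5),
            "depth": rng.randint(1, 12),
        }
        score = evaluate(config)
        trials += 1
        if score > best_score:
            best_config, best_score = config, score
    return best_config, best_score, trials

start = time.monotonic()
best_config, best_score, trials = search_within_budget(max_seconds=0.2)
elapsed = time.monotonic() - start
print(trials, best_config, round(best_score, 4))
```

A longer budget simply allows more trials, so the user trades waiting time for a better chance of finding a strong configuration.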
Another tool that is worth mentioning is AutoKeras. It was developed by DATA Lab at Texas A&M University. This is an open-source technology that uses the deep learning library Keras.
AutoKeras builds on a recent variant of Neural Architecture Search, ENAS (Efficient Neural Architecture Search), and applies network morphism: it keeps the network’s functionality while changing its architecture, guided by Bayesian optimization. This approach leads to a more efficient neural network search. Applying AutoKeras algorithms requires minimal ML expertise.
DataRobot is an AutoML tool that deals with predictive analytics. It helps data science experts build and integrate accurate predictive models and coordinates time frames with other software.
DataRobot has a constantly growing library of recently developed algorithms and provides access to prototypes for dataset preparation and feature selection.
The first version of BigML AutoML helps automate the complete ML pipeline, not only the model selection. In addition, it is quite simple to use. BigML’s AutoML performs three main operations: feature generation, feature selection, and model selection.
BigML facilitates unlimited predictive applications across industries, including aerospace, automotive, energy, entertainment, financial services, food, healthcare, IoT, pharmaceutical, transportation, telecommunications, and more.
Google Cloud AutoML
Google developed the Google Cloud AutoML tool using an approach called reinforcement learning. Here, AutoML behaves like a controller, which further develops the child ML model.
Google’s platform is a part of Vertex AI. It can automatically train models on structured (tabular) data as well as on visual (image and video) and textual data. Besides, Google’s AutoML offers interesting advanced configuration options that can be used to potentially simplify existing ML workflows.
Auto-WEKA builds on WEKA (Waikato Environment for Knowledge Analysis), which originated in New Zealand, and was released in 2013. The second version of the technology, Auto-WEKA 2.0, was released in 2017. The most common application case is tabular data (a table with rows and columns).
Auto-WEKA combines algorithm selection with hyperparameter optimization. Model selection can be customized. The application aims to maximize overall performance and offers a good indicator of productivity.
Summing It Up
Machine learning is probably one of the most influential technologies these days. It can transform a business and automate numerous operations. McKinsey surveyed 120 AI companies and found that only 12% of them leverage ML projects. The remaining 88% are still in the experimental phase.
In this article, we’ve discussed automated machine learning, its major benefits, applications, and tools. A deeper dive into these powerful frameworks will broaden their usage, diversify your skillset, and boost efficiency.