Effortless Machine Learning Automation With Autogluon and Autoviz

Discover how Autogluon and Autoviz automate model building and data visualization, making complex machine learning tasks faster and simpler

Amit Kulkarni
AI Advances


In this blog, we will cover the following topics:

  • Introduction
  • What is Autogluon?
    - Installation and usage
  • What is Autoviz?
    - Installation and usage
  • Applying these tools on a use-case
  • Conclusion & FAQ

Introduction

Automation is revolutionizing data science by streamlining processes and increasing efficiency. This shift not only boosts productivity but also allows individuals from diverse backgrounds to participate effectively. Marketing analysts can use automated tools to understand customer behavior and predict future trends, allowing them to focus on strategic decision-making. In healthcare, administrators can use intuitive visualization tools to uncover correlations between patient data and treatment outcomes, enabling informed decisions that enhance patient care and optimize resource allocation.

Autogluon and Autoviz are two automation tools that stand out for their simplicity and effectiveness. They reduce complex modeling and data analysis tasks to a few lines of code, allowing users of all skill levels to extract valuable insights. By democratizing access to advanced data science techniques, they empower individuals from diverse backgrounds to harness data for decision-making and innovation.

What is Autogluon?

Autogluon is an open-source AutoML library that automates the work of building machine learning models, so both beginners and experts can get strong results without extensive expertise. It automates tasks like model selection, hyperparameter tuning, and training, reducing time and effort. For tabular data, it trains several model families (for example gradient-boosted trees, random forests, and neural networks), combines them into an ensemble, and ranks them on a leaderboard, minimizing manual intervention. It builds on established libraries such as PyTorch, LightGBM, XGBoost, CatBoost, and scikit-learn, and it works directly with pandas DataFrames, so it fits into existing Python workflows. Autogluon democratizes access to advanced analytical tools, promoting innovation in various industries and academic disciplines.

Need for tools like Autogluon

Tools like Autogluon are essential because they significantly reduce the barriers to entry for leveraging machine learning in various applications.

  1. Automation of Complex Tasks: Autogluon automates complex machine learning tasks like model selection, hyperparameter tuning, and feature engineering, saving time and effort for data scientists and developers, and allowing focus on higher-level problem-solving.
  2. Accessible to Non-Experts: Autogluon provides a user-friendly interface for machine learning, democratizing the field and enabling a wider range of professionals and students to utilize its capabilities effectively.
  3. Optimization of Performance: Autogluon’s intelligent algorithms optimize machine learning models and hyperparameters, enhancing performance, accuracy, and data insights by automatically selecting the best models for a given dataset.
  4. Flexibility and Compatibility: Autogluon builds on popular machine learning libraries such as PyTorch, LightGBM, XGBoost, CatBoost, and scikit-learn, and it accepts pandas DataFrames, allowing users to utilize their existing knowledge and infrastructure.
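
To make "model selection" concrete, here is a small sketch of the manual loop that AutoML tools take off your hands: train a few candidate models, score each with cross-validation, and keep the best. It uses scikit-learn on synthetic data and only illustrates the idea; it is not Autogluon's actual procedure. The helper name `compare_models` and the two candidate models are my own choices for this example.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score


def compare_models(X, y, cv=3):
    """Return {model name: mean cross-validated ROC AUC} for a few candidates."""
    candidates = {
        "logistic_regression": LogisticRegression(max_iter=1000),
        "random_forest": RandomForestClassifier(n_estimators=50, random_state=0),
    }
    return {
        name: cross_val_score(model, X, y, cv=cv, scoring="roc_auc").mean()
        for name, model in candidates.items()
    }


# Synthetic binary classification data, for illustration only
X, y = make_classification(n_samples=300, n_features=8, random_state=0)
scores = compare_models(X, y)
best_model = max(scores, key=scores.get)  # name of the highest-scoring candidate
```

With only two candidates this is easy to write by hand; the effort grows quickly once more model families, hyperparameter tuning, and ensembling are added, which is the part Autogluon automates.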

What is Autoviz?

Autoviz is an open-source Python library that automates data visualization for machine learning. With a single function call, users can generate insightful visualizations from their datasets without extensive coding or visualization expertise. Autoviz helps users explore and understand their data, uncovering patterns, trends, and relationships that may not be apparent from the raw data alone. It automates the creation of charts, graphs, and plots, which speeds up data analysis and supports decision-making. Because it works directly on pandas DataFrames or data files and runs inside Jupyter notebooks, it fits easily into an existing Python workflow.

Need for tools like Autoviz

Tools like Autoviz are essential in data analysis and machine learning because they significantly simplify the data visualization process. Here’s how Autoviz helps:

  1. Efficiency: Autoviz automates data visualization, saving time and effort for data scientists and analysts. Users can generate a variety of charts and graphs with a single function call, allowing them to focus on interpreting insights rather than the mechanics of visualization.
  2. Insight Generation: By generating a range of charts, graphs, and plots, Autoviz gives users a more intuitive understanding of their data and helps them uncover hidden patterns and trends, which improves decision-making and problem-solving.
  3. Accessibility: Autoviz lets people create informative visualizations without extensive training or coding knowledge, opening up data analysis to a wider range of professionals and students.
  4. Communication: Autoviz produces clear visualizations that convey complex information, helping users present their findings and improving understanding among stakeholders.

Data loading

We will use the data from the Kaggle competition Playground Series, Season 4, Episode 1: Binary Classification with a Bank Churn Dataset. Detailed definitions of each variable can be found in the Kaggle dataset description.
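
The code later in this post refers to `df_train`, `df_test`, and `target`. A minimal loading sketch with pandas could look like the following. The file names `train.csv` and `test.csv` and the helper `load_churn_data` are assumptions for illustration (point them at wherever you saved the Kaggle files); `Exited` is the label column in this competition's data.

```python
import pandas as pd


def load_churn_data(train_path="train.csv", test_path="test.csv"):
    """Read the competition CSV files into pandas DataFrames.

    The default paths are placeholders, not paths taken from this post.
    """
    df_train = pd.read_csv(train_path)
    df_test = pd.read_csv(test_path)
    return df_train, df_test


# Name of the label column in the bank churn dataset
target = "Exited"
```

Calling `df_train, df_test = load_churn_data()` followed by `df_train.head()` gives a quick first look at the columns.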

Data Exploration and processing

In the blog Boost Your ML Model’s Performance with Ensemble Modeling, we walked through the step-by-step process of building an ensemble model. As part of it, we explored the data and visualized the features, missing values, and so on. This time, we will use Autoviz to explore and visualize the data, and then automate the model-building process with Autogluon. Remember, we did both of these tasks manually in the previous blog.

Installing Autoviz

Open the Anaconda or VS Code terminal and install the library using the pip command.

pip install autoviz

----------------------------------------------------------
OUTPUT:
Successfully installed autoviz-0.1.806.......

How to use Autoviz?

Once the installation is complete, we import the library, instantiate it, and call it on the training DataFrame. That’s it; now let Autoviz do its magic for us.

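from autoviz.AutoViz_Class import AutoViz_Class
# Notebook magic: render the charts inline
%matplotlib inline

AV = AutoViz_Class()
# Pass the DataFrame directly (filename stays empty) and analyze every row and column
AV.AutoViz(
    filename='', dfte=df_train, depVar=target, verbose=1,
    max_rows_analyzed=df_train.shape[0],
    max_cols_analyzed=df_train.shape[1],
)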

[Figures: sample charts generated by Autoviz. Source: Author]

Autoviz generated a series of charts covering the numerical and categorical variables. It also carried out an outlier analysis and recommended removing or capping the outliers. I have not shown all the visuals above, but the output is very comprehensive.
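
If you decide to follow the capping recommendation, one common approach is to clip each numeric column to the interquartile-range (IQR) fences. The sketch below shows that idea with pandas. It is my own illustration rather than the rule Autoviz applies internally; the helper `cap_outliers_iqr`, the multiplier `k=1.5`, and the toy `Age` values are assumptions.

```python
import pandas as pd


def cap_outliers_iqr(df, columns, k=1.5):
    """Return a copy of df with the given numeric columns clipped to
    the range [Q1 - k * IQR, Q3 + k * IQR]."""
    capped = df.copy()
    for col in columns:
        q1 = capped[col].quantile(0.25)
        q3 = capped[col].quantile(0.75)
        iqr = q3 - q1
        capped[col] = capped[col].clip(lower=q1 - k * iqr, upper=q3 + k * iqr)
    return capped


# Toy data with one extreme value, for illustration only
demo = pd.DataFrame({"Age": [25, 30, 35, 40, 45, 120]})
capped_demo = cap_outliers_iqr(demo, ["Age"])
```

On the real data this would be called as, for example, `cap_outliers_iqr(df_train, ["Age", "CreditScore"])`. Whether to cap at all is a judgment call, since tree-based models are often robust to outliers.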

Model building

As with data exploration in the previous section, we will now automate the model-building process using Autogluon.

Installing Autogluon

pip install autogluon

The installation might take a while. Once it is done, we import the library, wrap the train and test data as TabularDataset objects, and start the AutoML build.

from autogluon.tabular import TabularDataset, TabularPredictor
import warnings
warnings.filterwarnings("ignore")

# Wrap the pandas DataFrames for Autogluon
train = TabularDataset(df_train)
test = TabularDataset(df_test)

# Binary classification on the target column, scored by ROC AUC
automl = TabularPredictor(label=target, problem_type='binary',
                          eval_metric='roc_auc')
automl.fit(train)

Here is the final output. It shows the leaderboard with all the models.

Source: Author | Final results from Autogluon
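
The leaderboard can also be retrieved as a pandas DataFrame with `automl.leaderboard()`, which makes it easy to inspect in code. The sketch below assumes the fitted `automl` predictor from the previous step. The helper name `top_models` is hypothetical, and `model` and `score_val` are the leaderboard column names as I understand them from the Autogluon documentation.

```python
def top_models(predictor, n=5):
    """Return the n best rows of a fitted predictor's leaderboard.

    `predictor` is expected to be a fitted TabularPredictor, whose
    leaderboard() method returns a pandas DataFrame with one row per model.
    """
    board = predictor.leaderboard()
    # Higher score_val is better (validation ROC AUC in this post)
    board = board.sort_values("score_val", ascending=False)
    return board[["model", "score_val"]].head(n)
```

Calling `top_models(automl)` then shows the strongest models and their validation scores.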

The next step is to generate predictions for the test data.

predictions = automl.predict_proba(test)
predictions
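
For a binary problem, `predict_proba` returns one probability column per class, so a Kaggle submission only needs the positive-class column. The sketch below is one way to build it with pandas. The helper `make_submission`, the output file name, and the assumptions that the positive class is labeled `1` and that the test data has an `id` column are mine, so check them against your data.

```python
import pandas as pd


def make_submission(proba, ids, positive_class=1, target="Exited"):
    """Build a two-column frame: row id and P(target == positive_class).

    proba: DataFrame of class probabilities, one column per class label.
    ids:   id values of the test rows, in the same order as proba.
    """
    return pd.DataFrame({
        "id": list(ids),
        target: proba[positive_class].to_numpy(),
    })
```

With the objects from this post, that would be `submission = make_submission(predictions, df_test["id"])` followed by `submission.to_csv("submission.csv", index=False)`.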

In the previous blogs, Boost Your ML Model’s Performance with Ensemble Modeling and Fine-Tuning Models: Optuna for Hyperparameter Optimization, we wrote quite a bit of Python to explore the data, build the baseline models, use Optuna to find the best parameters, and combine the tuned models into an ensemble for the final prediction. With automation tools, we covered the same workflow in a few lines of code: all we did was specify the train and test data, define the target variable, and leave the heavy lifting to the tools.

Conclusion

In conclusion, tools like Autogluon and Autoviz are revolutionizing data science by simplifying complex tasks and making data analysis more accessible. Whatever your background, these user-friendly tools can help you uncover valuable insights and make informed decisions in your field. You might be surprised at what you discover!

I hope you liked the article and found it helpful.


FAQs

Q1: Can Autogluon handle large datasets?
A1: Yes, Autogluon can work with large datasets, though training time and memory use grow with data size. Options such as the time_limit and presets arguments of fit help keep runs manageable.

Q2: Does Autogluon require extensive programming knowledge?
A2: Not at all! Autogluon’s high-level Python API lets users with minimal programming experience train and evaluate models in a few lines of code.

Q3: Is Autogluon compatible with cloud computing platforms?
A3: Absolutely! Autogluon can be seamlessly integrated with various cloud computing services, enhancing its scalability and accessibility for users working with cloud-based data infrastructures.

Q4: Can Autoviz handle different types of data?
A4: Yes, Autoviz is versatile and can work with various types of data, including numerical, categorical, and time-series data.

Q5: Is Autoviz suitable for exploratory data analysis (EDA)?
A5: Absolutely! Autoviz is an excellent tool for exploratory data analysis, allowing users to generate informative visualizations to understand their data better quickly.

Q6: Does Autoviz support interactive visualization features?
A6: Yes, Autoviz can produce interactive charts, for example Bokeh-based output selected through its chart_format option, allowing users to explore their data dynamically and gain deeper insights.
