Use Lux and Python to Automatically Create EDA Visualizations

Automatically creating interactive visualizations for exploratory data analysis is a lot easier than you think. Here’s how to use Lux and Python to make it happen.

Written by Abdishakur Hassan
Published on May 6, 2022

Data visualization and exploration are critical tasks in data science. However, producing even a single visualization can take a lot of time, code and tinkering.

What if you had an intelligent tool that automatically suggests relevant and aesthetically beautiful data visualizations to enable you to discover and explore your data quickly?

I’m not talking about suggesting a single bar chart or a couple of visuals here. I’m talking about one line of code that returns interactive visualizations, which you can then filter further by the columns in your data set.

Enter Lux: a Python API for Intelligent Visual Discovery.

Why Use Lux for EDA?

Lux is a Python library that facilitates fast and easy data exploration by automating the visualization and data analysis process. When you print a data frame in a Jupyter notebook, Lux recommends a set of visualizations that highlight interesting trends and patterns in the data set. The visualizations appear in an interactive widget that lets you quickly browse through large collections of charts and make sense of your data.

 

Data Discovery With Lux

You first need to install Lux in your environment, which you can do by running either of these two commands:

pip install lux-api
conda install -c conda-forge lux-api

If you are using Jupyter Notebook, activate the notebook extension:

jupyter nbextension install --py luxwidget
jupyter nbextension enable --py luxwidget

For JupyterLab users, run the following to activate the lab extension:

jupyter labextension install @jupyter-widgets/jupyterlab-manager
jupyter labextension install luxwidget
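
Before opening a notebook, it can help to confirm that the packages are visible to your Python environment. The following is a minimal sketch using only the standard library. The function name lux_status is made up for illustration, and the distribution name lux-widget for the luxwidget module is an assumption based on how the package is published on PyPI.

```python
from importlib import metadata, util


def lux_status():
    """Report whether Lux and its widget package can be found.

    Hypothetical helper, not part of Lux itself.
    """
    status = {}
    # (distribution name on PyPI, importable module name)
    # "lux-widget" is assumed to be the distribution behind luxwidget.
    for dist, module in [("lux-api", "lux"), ("lux-widget", "luxwidget")]:
        if util.find_spec(module) is None:
            status[dist] = "not installed"
            continue
        try:
            status[dist] = metadata.version(dist)
        except metadata.PackageNotFoundError:
            status[dist] = "importable, version unknown"
    return status


print(lux_status())
```

If either entry reports "not installed", rerun the install command in the same environment that your Jupyter kernel uses.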

Now that you’ve installed Lux and have your Jupyter extensions, let’s look at the basics of Lux for data discovery.


 

Basic Functionality

To get the interactive data discovery tool and its recommended visuals, read your data with Pandas, import Lux and display the data frame by putting its name on the last line of a notebook cell.

import pandas as pd
import lux
df = pd.read_csv("https://raw.githubusercontent.com/plotly/datasets/master/gapminderDataFiveYear.csv", parse_dates=["year"])
df

The data frame displays just as it would in plain Pandas, with one addition: a “Toggle Pandas/Lux” button. Click this button to get interactive data visualization suggestions.

You now have several suggested exploratory data visualizations at your fingertips without writing any code in Matplotlib, Seaborn, Plotly or any other Python data visualization library.

Depending on the data, Lux visualizes correlations, distributions and occurrences. If you have time or geospatial data, you’ll also get temporal and geographic data visualization suggestions.
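
Whether a column counts as time data depends partly on how Pandas loads it, which is presumably why the read_csv call above passes parse_dates=["year"]. The sketch below uses a few illustrative rows shaped like the data set (not the real file) to show the effect on the column type. It only demonstrates the Pandas side; how Lux uses that type is an assumption here.

```python
import io

import pandas as pd

# Illustrative sample rows only; values are rounded and the columns are a subset.
csv_text = """country,continent,year,lifeExp,gdpPercap
Afghanistan,Asia,1952,28.8,779.4
Afghanistan,Asia,1957,30.3,820.9
Albania,Europe,1952,55.2,1601.1
"""

# Without parse_dates, "year" is read as a plain integer column.
plain = pd.read_csv(io.StringIO(csv_text))

# With parse_dates, Pandas converts "year" to a datetime column.
parsed = pd.read_csv(io.StringIO(csv_text), parse_dates=["year"])

print(plain["year"].dtype)
print(parsed["year"].dtype)
```

If a date column still shows up as a number or a string, converting it with pd.to_datetime before displaying the data frame is a reasonable first thing to try.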

Of the auto EDA tools and libraries I’ve reviewed, Lux is the only one that also incorporates geospatial data.


 

Intent and Filtering

I’m sure you’re now wondering what more you can do with this fantastic tool. Lux is flexible and versatile: you can steer its recommendations by declaring an intent, and filter the data at the same time.

Let’s see what you can do with intent functionality. Instead of just taking what Lux throws at you, you’re free to choose which feature(s) you want to explore in your data.

If we want to examine, for example, the GDP per capita column (gdpPercap) in the data set, we can assign it to the data frame’s intent attribute.

df.intent = ["gdpPercap"]
df

Lux will take the intent feature and produce more immediately relevant data visualizations. Instead of the general-purpose overview, you now get specific recommendations that combine your chosen feature with additional features from the data set.

Here you have the intent feature visualizations on the left and several suggested visualizations, including maps and time-series data, on the right. We can also see the filters in the next tab, where we have selected visuals connected with sub-features. For example, you can see a specific year with GDP visuals alongside a particular continent.

If you already know which filter you want, you can include it in the intent as well.

df.intent = ["gdpPercap","continent=Europe"]
df
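
The intent in both examples is a plain Python list of strings: column names, plus "column=value" entries for filters. As a rough sketch of that pattern, the hypothetical helper below (build_intent is not part of Lux) assembles such a list from a set of columns and equality filters.

```python
def build_intent(attributes, filters=None):
    """Build a Lux-style intent list (hypothetical helper, not part of Lux).

    attributes: column names to explore, e.g. ["gdpPercap"]
    filters: optional mapping of column -> value, e.g. {"continent": "Europe"}
    """
    intent = list(attributes)
    for column, value in (filters or {}).items():
        # Filters use the same "column=value" string form shown above.
        intent.append(f"{column}={value}")
    return intent


intent = build_intent(["gdpPercap"], {"continent": "Europe"})
print(intent)  # ['gdpPercap', 'continent=Europe']

# In a notebook with Lux loaded, you would then assign and display it:
# df.intent = intent
# df
# df.clear_intent()  # returns to the default recommendations
```

Building the list in code like this is mostly useful when you want to loop over several columns or filter values and compare the resulting recommendations.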

 

Exporting Visuals

Exporting these visuals is easy with Lux. Select one or more visuals in the widget and hit the export button. This creates a list of the visuals you’ve chosen. In this example, I’ll show you how to export a single visualization.

Now that you’ve exported your visualization, you can access the exported visualization and manipulate it however you like. You can also export the code behind the data visualization easily.
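
As a rough sketch of what that access can look like, the helper below reads the exported list from the data frame and asks one visualization for its Altair source code. It assumes your installed Lux version exposes df.exported and a to_altair() method on each visualization, as the Lux documentation describes. The function name exported_chart_code is made up for illustration.

```python
def exported_chart_code(df, index=0):
    """Return Altair source code for one exported Lux visualization.

    Hypothetical helper. Assumes df.exported holds the visuals selected
    with the widget's export button, and that each one has to_altair().
    Returns None when nothing has been exported at that position.
    """
    exported = df.exported
    if len(exported) <= index:
        return None
    return exported[index].to_altair()


# In a notebook, after clicking the export button in the widget:
# print(exported_chart_code(df))
```

The returned string is ordinary plotting code, so you can paste it into a new cell and adjust titles, colors or axes by hand.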


 

Final Thoughts

With Lux, you can speed up the EDA process in data science and tailor it to your intent. I love that it offers a geospatial data visualization option, although that option is limited because Lux can’t treat coordinates as geographic features. I also love its easy-to-use API, its flexibility and its integration with the Python data science ecosystem. Most of all, it saves time and energy: you can quickly develop relevant EDA visualizations without writing a large body of code.
