What is Data Visualization
Data visualization is the process of representing data graphically. It typically relies on commonly accepted charts and components, such as graphs, maps, and other infographics. By turning patterns in the data into a meaningful visual, it helps people gain insights that are hard to see in rows of numbers in a spreadsheet.
Why Data Visualization Is Important and Its Potential Benefits
Data visualization gives meaning to data by helping people identify patterns. In B2B SaaS, these patterns help the underlying business in several ways, including:
- Improve products - by using data visualization, we can understand patterns in product usage, adoption, and up-and-coming trends. For example, we could use a funnel chart to visualize the sign-up process and notice a steep drop-off at a certain point in onboarding, which lets the product team focus on solving that specific, data-identified problem.
- Increase sales - by using data visualization, sales and business development teams can understand how to increase sales and close more deals. For example, a team could plot closed-lost reasons in a bar graph over time, see which objections are becoming more critical, and prioritize better objection handling in the sales process.
- Improve customer relationships - data visualization can show what makes customers happy, so a team can invest further in those experiences, and what makes customers unhappy, so it can improve those areas. For example, a customer success team could send out a net promoter score (NPS) survey and plot the results on a graph to track customer satisfaction over time. Customers who give low scores could then be segmented and their stated reasons reviewed to guide improvements.
- Cut costs - by identifying patterns in team spend using data visualization (for example, graphing spend by category), a finance team can make cost-cutting recommendations that improve the overall financial health of an organization.
- Detect anomalies and negative trends - data visualization is also useful for detecting anomalies and negative trends. For example, a scatterplot can reveal both trends and outliers: it shows where data points cluster closely together and which points fall far from those clusters. A company could use this for bot detection on website visits by plotting the number of clicks against session length and looking for visitors that differ greatly from the general trend. For trend spotting, a scatterplot of average session length over time shows whether engagement is changing. If session length is dropping, recent changes to the website may have hurt customer engagement.
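The bot-detection idea above can also be checked numerically before anything is plotted. The following is a minimal sketch, not a production detector: the sample sessions, the `flag_outliers` name, and the 3.5 cutoff are all illustrative assumptions. It computes clicks per minute for each visitor and flags any visitor whose rate sits far from the median, using a modified z-score based on the median absolute deviation (MAD).

```python
from statistics import median

# Hypothetical sample data: (visitor_id, clicks, session_length_seconds).
SESSIONS = [
    ("v1", 12, 180),
    ("v2", 8, 120),
    ("v3", 15, 300),
    ("v4", 20, 240),
    ("v5", 9, 180),
    ("v6", 300, 60),  # very many clicks in a very short session
    ("v7", 10, 150),
]


def flag_outliers(sessions, threshold=3.5):
    """Return IDs of visitors whose click rate is far from the median rate.

    Uses the modified z-score 0.6745 * (x - median) / MAD. The 3.5 threshold
    is a common rule of thumb, assumed here for illustration. Session lengths
    are assumed to be greater than zero.
    """
    rates = {vid: clicks / (seconds / 60) for vid, clicks, seconds in sessions}
    center = median(rates.values())
    mad = median(abs(rate - center) for rate in rates.values())
    if mad == 0:
        # At least half the rates equal the median, so the score is undefined.
        return []
    return [
        vid
        for vid, rate in rates.items()
        if abs(0.6745 * (rate - center) / mad) > threshold
    ]


if __name__ == "__main__":
    print(flag_outliers(SESSIONS))
```

A median-based score is used here instead of a mean and standard deviation because a single extreme visitor can inflate the standard deviation enough to hide itself in a small sample. The flagged visitors are the same points that would sit far from the main cluster on the scatterplot.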
Who Implements Data Visualization Solutions
- Developers - use modern data visualization tools to create graphical representations of underlying data
- Business analysts - take business data and draw meaningful insights from it to improve business operations
- Product managers - find patterns within the noise of the data in a product to best understand how to improve features and which features to build in the future
- Founders - use data visualization to make decisions about corporate direction, oftentimes driven by the most important Key Performance Indicators (KPIs) identified during the planning process. When those KPIs appear to be trending down, founders will typically make changes in an organization.
- Data scientists - use data visualization to draw insights from their projects and help the team make better decisions.
Types of Data Visualization Charts and Components
- Pie Charts: A circular chart divided into sectors, each representing a proportion of the whole. Each slice of the pie is proportional to the quantity it represents.
- Bar Charts: A chart with rectangular bars where the length of each bar is proportional to the value or frequency of the category it represents. Bars can be displayed horizontally or vertically.
- Line Charts: A type of chart which displays information as a series of data points called 'markers' connected by straight line segments. It is often used to visualize trends over time.
- KPI Chart (Key Performance Indicator Chart): This chart displays a single key performance indicator (i.e., a number or value) with clear visual elements such as color or size to communicate whether the indicator is within a desired range.
- KPI Trend Chart: Similar to a KPI chart, but this version also shows the performance indicator's trend over time, usually through a line or bar graph overlaying the KPI.
- Data Table: A grid that displays information in rows and columns, making it easy to compare different data points. Tables often include sorting and filtering functionalities.
- Area Chart: Similar to a line chart, but the area below the line is filled with color or shading. It emphasizes the magnitude of values over time or other categories.
- Funnel Chart: A chart that represents stages in a process and shows the progressive reduction of data through these stages. It's shaped like a funnel, wider at the top.
- Box Plot Chart: A graphical representation of the distribution of data based on a five-number summary: minimum, first quartile, median, third quartile, and maximum.
- Maps: Visual representations of geographical data, showing locations, regions, and spatial patterns. They can be simple outlines or detailed with multiple data layers.
- Sankey Chart: A flow diagram in which the width of the arrows is proportional to the flow rate of the data. It shows how data moves from one set of items to another.
- Scatterplot: A type of plot that uses Cartesian coordinates to display values of typically two variables for a set of data, showing the relationship between them.
- Heat Map: A graphical representation of data where values in a matrix are represented as colors. It's useful for visualizing the concentration of values.
- Pivot Table: A table of statistics that summarizes the data of a more extensive table. It allows rearranging and grouping of selected columns and rows to synthesize information.
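To make the box plot entry above concrete, the five-number summary it is drawn from can be computed with Python's standard library. This is a minimal sketch under stated assumptions: the `five_number_summary` name and the sample session lengths are hypothetical, and the quartiles use the "inclusive" method of `statistics.quantiles`. Other quartile conventions exist and can give slightly different values.

```python
from statistics import quantiles


def five_number_summary(values):
    """Return (minimum, first quartile, median, third quartile, maximum)."""
    if len(values) < 2:
        raise ValueError("need at least two data points")
    # quantiles() with n=4 returns the three quartile cut points.
    # method="inclusive" assumes the data already contains the most
    # extreme values of interest; this choice is an assumption.
    q1, med, q3 = quantiles(values, n=4, method="inclusive")
    return min(values), q1, med, q3, max(values)


# Hypothetical session lengths in minutes.
session_minutes = [7, 2, 9, 4, 5, 3, 8, 6, 10]

if __name__ == "__main__":
    print(five_number_summary(session_minutes))
```

In a box plot, the box spans the first to third quartile, the line inside it marks the median, and the whiskers extend toward the minimum and maximum.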
Best Practices for Data Visualization
- Ensure that the data you visualize is meaningful. It is possible to find correlations between variables that are unrelated, and as the saying goes, correlation does not imply causation.
- Make sure to pick the right chart for your data visualization needs. A great insight can be lost if the chart type chosen to represent it is a poor fit. For example, a KPI trend chart may represent a KPI much better than a KPI chart showing a single number.
- Make sure the chart or graphic is readable. It is important that the colors, scale, and text make the graphic digestible. Additionally, multi-dimensional charts can be visually difficult to understand at times.
- Make sure to be fair and honest with how data is represented. For example, suppose two students got a 98 and 99 on a test. By most grading scales, this would mean they both got an A+. However, if you graphed the students’ names on the x-axis and the scores on a y-axis starting at 97, it would look like one student far outperformed the other, which would not be accurate.
- Make sure the chart has as much interactivity as it needs and no more. Drilldowns, click-throughs, views of the underlying data, hover states, and more can be valuable in a chart, but overwhelming the viewer with options can dilute an otherwise great graphical representation.
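The truncated-axis example with the two test scores can be quantified with a few lines of arithmetic. The sketch below is illustrative only (the `apparent_ratio` function is a hypothetical helper, not part of any charting library). It compares how tall the two bars would be drawn relative to each other when the y-axis starts at 97 versus at 0.

```python
def apparent_ratio(smaller, larger, axis_start=0.0):
    """Ratio of drawn bar heights for two values on an axis starting at axis_start."""
    if axis_start >= smaller:
        raise ValueError("axis_start must be below both values")
    return (larger - axis_start) / (smaller - axis_start)


if __name__ == "__main__":
    # Axis starting at 97: the bars have heights 2 and 1.
    print(apparent_ratio(98, 99, axis_start=97))
    # Axis starting at 0: the bars have heights 99 and 98.
    print(apparent_ratio(98, 99, axis_start=0))
```

With the axis starting at 97, the second bar is drawn twice as tall as the first, even though the underlying scores differ by about 1%. Starting the axis at 0 keeps the drawn heights proportional to the values.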
Python Data Visualization
- Highcharts - a JavaScript charting library for dashboards and visualizations, usable from Python, React, or Vue.js through wrappers
- Matplotlib - a comprehensive library for visualizations in Python
- Seaborn - based on Matplotlib, with a higher-level interface for visualization
- Plotnine - describes itself as a ‘Grammar of Graphics for Python’, allowing you to compose elements together
- Plotly - a Python library for interactive charts, from the company behind the Dash framework for low-code data apps
- Geoplotlib - a Python library for maps and geolocation data
- Gleam - a Python library for interactive web visualizations
- Bokeh - a Python library for interactive visualizations in browsers
- Pygal - a Python library for easy visualizations in a few lines of code
- Missingno - a Python library for visualizing missing or incomplete data in a dataset
- Leather - a Python library for quick visualizations
- Altair - a declarative visualization library for Python
- Folium - a Python library that builds interactive maps with Leaflet.js
Embedded Analytics Data Visualization
- Explo - The best customer-facing analytics offering
- Tableau - Established BI tool, primarily for internal BI
- Power BI - Microsoft BI tool, largely for data analysts
- Looker - Established BI tool, owned by Google
- Mode - Internal and embedded analytics provider
- Metabase - Open-source internal and embedded analytics provider
- Sisense - BI tool for business analysts
- ThoughtSpot - AI-powered BI tool
Where Can You Find More About Data Visualization
There are many great resources for learning about data visualization, such as:
- Northeastern Program
- MIT Program
- Data Viz Catalogue
- Coursera Data Visualization Course
- Udacity Course
- Explo - The leading embedded analytics provider
When Will Data Visualization Become Easier to Implement
It already has! With many great Python libraries and a new wave of embedded analytics providers, data visualization keeps getting easier to implement. Additionally, AI has opened up new workflows in the data visualization space: you can feed data into a system, chat casually with an AI, and get meaningful visualizations. The rest of the 2020s should be an exciting time for data visualization. This is only the beginning!