Interpreting big data can sometimes feel like reading oceans of tea leaves. Only a select few can extract meaning from abstruse data, and for most business users, taking action becomes impossible when it's hard even to understand what they're seeing. In the context of the Industrial Internet, we focus a lot on smart machines, clouds, and analytics, but the end purpose of all that technology is to deliver actionable information. The problem is that the bigger big data gets, the harder it becomes to translate into the conventional visualizations, like charts and graphs, that most people are familiar with.

Fortunately, there's a lot of research going into developing and exploring new kinds of visualization tools for big data.

One breakthrough, which has changed the way geographic information system (GIS) data can be mapped out in real time, is the Massively Parallel Database (MapD), developed in 2012 by graduate student Todd Mostak. MapD was designed to solve a concrete problem: mapping tweets about Middle Eastern politics during the Arab Spring. Plenty of data was available, but even the best database tools couldn't visualize it in real time. MapD solved this problem by harnessing the power of parallel computing, leveraging modular, inexpensive, and scalable multicore CPUs and GPUs to process data and generate maps rapidly. With demand for real-time data analysis ever increasing, solutions like MapD are proving vital in presenting that information visually.
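To make the parallel-computing idea concrete, here is a minimal Python sketch of the general technique, not MapD's actual code (which targets GPU hardware): split the points across worker processes, have each worker bin its chunk into a density grid, and merge the partial grids into a single heatmap. The names here (`bin_chunk`, `render_heatmap`, the 512-cell `GRID`) are illustrative choices of our own.

```python
# Minimal sketch of data-parallel map rendering (not MapD's implementation):
# split points across workers, bin each chunk in parallel, merge the grids.
from concurrent.futures import ProcessPoolExecutor
import numpy as np

GRID = 512  # hypothetical output resolution, cells per side

def bin_chunk(points):
    """Aggregate one chunk of (lon, lat) points into a GRID x GRID density grid."""
    lon, lat = points[:, 0], points[:, 1]
    x = ((lon + 180.0) / 360.0 * (GRID - 1)).astype(int)
    y = ((lat + 90.0) / 180.0 * (GRID - 1)).astype(int)
    grid = np.zeros((GRID, GRID), dtype=np.int64)
    np.add.at(grid, (y, x), 1)  # scatter-add each point into its cell
    return grid

def render_heatmap(points, workers=8):
    chunks = np.array_split(points, workers)
    with ProcessPoolExecutor(max_workers=workers) as pool:
        partials = pool.map(bin_chunk, chunks)
    return sum(partials)  # merging partial grids is a cheap elementwise add

if __name__ == "__main__":
    pts = np.column_stack([np.random.uniform(-180, 180, 1_000_000),
                           np.random.uniform(-90, 90, 1_000_000)])
    heat = render_heatmap(pts)
    print(heat.sum())  # all one million points accounted for
```

Because the merge step is a cheap elementwise add, throughput grows almost linearly with the number of cores, which is what lets this style of system keep pace with a live data stream.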

If time is one side of the big data visualization story, space is the other. Or, more accurately, scale. Lots of data generally means there will be degrees of granularity in the analyses: you can zoom out to see high-level trends, or you can drill down into the fine details of whatever you're trying to measure. Static visualizations strain under that sheer volume of data and are poor at supporting those degrees of granularity. New visual interfaces like Stanford University's imMens scale to big data by summarizing it (aggregating and sampling it) at various levels, which provides not only visual scalability but speed as well.
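A toy illustration of that summarize-at-every-level idea, sketched in Python under our own assumptions rather than drawn from the imMens codebase: precompute binned aggregates at several resolutions, so that any view, coarse or fine, is answered from a small summary instead of a rescan of the raw rows.

```python
# Toy multi-resolution aggregation (our sketch, not imMens' actual API):
# build a pyramid of histograms, each half the resolution of the last.
import numpy as np

def build_pyramid(values, base_bins=4096, levels=4):
    """Return a list of histograms from finest to coarsest, plus bin edges."""
    counts, edges = np.histogram(values, bins=base_bins)
    pyramid = [counts]
    for _ in range(levels - 1):
        counts = counts.reshape(-1, 2).sum(axis=1)  # merge adjacent bins
        pyramid.append(counts)
    return pyramid, edges

values = np.random.normal(size=10_000_000)
pyramid, edges = build_pyramid(values)

# Zoomed out, the coarsest level (512 bins) shows the high-level trend;
# zoomed in, the finest level (4096 bins) exposes detail. Neither view
# touches the ten million raw values again.
print([len(level) for level in pyramid])  # [4096, 2048, 1024, 512]
```

The pyramid is built once, so every subsequent pan or zoom costs only a lookup into a few thousand precomputed bins, which is where both the visual scalability and the speed come from.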

Another example, designed for even more granular scales, is ScalaR. ScalaR provides an interactive, intuitive visual data map by predicting what data users will access next and pre-loading that data into memory. It makes these predictions in several ways, including modeling user interactions within the system, determining which data types are similar, and observing what kinds of statistics users seek out. The result is a fast, flexible system for mapping out data sets small and large.
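ScalaR's actual prediction machinery combines the strategies above; what follows is only a hypothetical Python sketch of the first one, modeling user interactions, with class and method names of our own invention. A real system would also prefetch asynchronously rather than inline, as done here for brevity.

```python
# Hypothetical sketch of interaction-based prefetching (not ScalaR's API):
# learn which tile users tend to request after the current one, and keep
# the most likely successors loaded in memory ahead of time.
from collections import Counter, defaultdict

class Prefetcher:
    def __init__(self, fetch, ahead=2):
        self.fetch = fetch                       # function: tile_id -> data
        self.ahead = ahead                       # how many tiles to preload
        self.transitions = defaultdict(Counter)  # tile -> Counter of next tiles
        self.cache = {}
        self.prev = None

    def request(self, tile_id):
        # Record the observed transition to refine future predictions.
        if self.prev is not None:
            self.transitions[self.prev][tile_id] += 1
        self.prev = tile_id
        if tile_id in self.cache:
            data = self.cache.pop(tile_id)  # prefetch hit: already in memory
        else:
            data = self.fetch(tile_id)      # miss: fetch on demand
        # Preload the tiles most often requested after this one.
        for nxt, _ in self.transitions[tile_id].most_common(self.ahead):
            self.cache.setdefault(nxt, self.fetch(nxt))
        return data

# Usage: wrap a slow loader, then serve a trace of user interactions.
pf = Prefetcher(fetch=lambda t: f"<tile {t}>")
for tile in [0, 1, 2, 1, 2, 3, 2, 3]:
    pf.request(tile)
```

Even this crude Markov-style model captures the intuition: interactive exploration is highly predictable (users mostly pan to adjacent regions or zoom one level), so a modest amount of speculative loading can make a large data map feel instantaneous.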

In addition to these cutting-edge research projects aimed at finding new ways of representing big data, existing data visualization tools are also improving. Data visualization software is getting cheaper and more intuitive, and thanks to technologies like touch interfaces, HTML5, JavaScript, and gamification, it's possible to design more natural user experiences. As a result, visualization tools are getting into the hands of more and more workers and decision makers, paving the way for a robust Industrial Internet that will be anything but the domain of high-tech diviners.

About the author

Suhas Sreedhar

Strategic Writer at GT Nexus