Deep Tech and Data Science

Why visualize data?

In this article, we look at how complex structures and high-dimensional data can be visualized in a comprehensible way.

The relevance of data to a company's success has increased rapidly over the last five years. In contemporary industries, data has become an integral part of a company's products and services. Most companies now rely on the evaluation of specific data to optimize processes and deliver the best possible product to their customers. Examining data can also help to identify new business opportunities and develop business models, which in turn can create new revenue streams.

To do this, one has to identify and understand the correlations between the individual data points and translate these correlations into comprehensible terms. These terms can then be used to improve or drive the product or service at hand. The issue is that these correlations are becoming increasingly difficult to observe, to understand and to communicate. This is due to:

the sheer volume of the data
the increasing complexity of datasets

Individual data points have steadily grown in complexity, since the number of behaviors or interactions that a provider can monitor has also increased significantly. In theory, one data point can have any number “n” of attributes that can be evaluated. This makes it very difficult to draw valuable conclusions from a given dataset. Data scientists have adapted to this change by using sophisticated data extraction and plotting algorithms, along with cloud computing and AI, to crunch through incredibly large datasets in the hope of extracting usable conclusions. Yet even once these insights have been discovered, it remains extremely hard to visualize the underlying dependencies and correlations.

Now, one could ask: why visualize data at all?

Humans are an audio-visual species, and a great deal of the information we receive about our surroundings comes from the light our eyes capture. Processing and analyzing visual stimuli is therefore natural to us, and visualizing a piece of information helps us understand it better. This applies to anything from something as basic as the appearance of a specific object to high-dimensional datasets.

Visualization also helps to engage people with a given subject. This can easily be demonstrated by observing children: explaining something to a child in words alone can be extremely tedious, whereas a visual medium such as a film or an interactive application often holds their attention far better.

So how do we visualize data?

One of the more traditional techniques used to visualize correlations between attributes in a dataset is a simple two-dimensional graph. In such a graph, two attributes of the dataset are assigned to the two axes, and each data point is plotted at the position given by its values for those two attributes. The issue with this approach quickly becomes visible: one is extremely limited by the “two-dimensionality” of the visualization medium. In theory, we would need an “n”-dimensional format (where n is the number of attributes) to accurately depict a high-dimensional dataset. This is not feasible, since humans can visually perceive only three spatial dimensions.
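To make this limitation concrete, here is a minimal Python sketch (standard library only) that counts how many separate two-axis graphs would be needed to show every pair of attributes once. The function name and the attribute names are hypothetical and chosen purely for illustration.

```python
from itertools import combinations


def axis_pairs(attributes):
    """Return every unordered pair of attributes, one pair per 2D graph."""
    return list(combinations(attributes, 2))


# Hypothetical attributes of a single data point (illustration only).
attributes = ["age", "session_length", "purchases", "clicks", "region"]

pairs = axis_pairs(attributes)
print(len(pairs))  # 5 attributes -> 10 pairwise graphs
for x_attr, y_attr in pairs[:3]:
    print(f"x = {x_attr}, y = {y_attr}")
```

The number of pairs grows as n(n-1)/2, so a dataset with 20 attributes would already need 190 separate graphs, and none of them would show more than two attributes at a time.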

So how does one approach this problem?

The common view is that, before we attempt anything like multidimensional perception, we should first take full advantage of all the dimensions we can perceive. For thousands of years, people have been projecting three-dimensional scenes onto two-dimensional surfaces. This form of visualization sits at the core of our modern civilization, and yet some might still deem it unnatural or non-intuitive. This does not mean that three-dimensional visualization has not been practiced: sculpting materials so that they resemble a real counterpart has been around for nearly as long as two-dimensional depiction.

The issue with this conventional practice is that it usually requires more time, resources and craftsmanship than simply drawing some lines on a flat surface. It is still possible to depict 3D content on a 2D interface, but this limits the way the user can interact with the virtual space and forces them to use non-intuitive input devices such as a keyboard or a mouse. Only in recent years, with VR/AR technologies on the rise, has it become possible not only to produce three-dimensional content efficiently but also to experience that content in a three-dimensional space.

This also gave birth to one of the more fascinating fields of interdisciplinary research, the field of Immersive Analytics. Immersive Analytics is an emerging research thrust investigating how new interaction and display technologies can be used to support analytical reasoning and decision-making.

The goal of Immersive Analytics is to provide multi-sensory interfaces that support collaboration and thus allow users to immerse themselves in a way that supports real-world analytics tasks. This can be achieved through the use of current technologies such as large touch surfaces, immersive virtual and augmented reality environments, sensor devices, and other, rapidly evolving, natural user interface devices.

Despite the field's young age, a few major companies are already involved in it. IBM has been pioneering data visualization with its software suite IBM Watson Studio. In the last two years, it has been integrating Immersive Analytics (IA) solutions like the Watson IoT Digital Twin, which lets teams collaborate on the whole life cycle of an IoT device. Once built, these devices exist in virtual space and can be monitored to see whether the physical object performs and operates according to the intended digital design. This process allows for better communication between departments and effectively closes the feedback loop between design and operations. IBM has also been active in building AR-based IA solutions, like the award-winning IBM Immersive Insights application. IBM Immersive Insights brings AR 3D visualizations to data science tools, improving the user experience, data exploration, and the data analysis process. The software is used via an AR headset and is primarily targeted at data scientists.

Future of human-machine interactions and benefits of Immersive Data Visualization

Whether you like it or not, AR and VR technologies will not be going away any time soon. Despite their failure so far to break into the mainstream as a consumer product, we believe it is only a matter of time before AR/VR headsets become a market standard and replace computer screens in households. The reason is simply that the quality of human-machine interactions is drastically improved by giving the user the option to interact with digital content intuitively in a three-dimensional virtual space. Put simply, interactions in a VR or AR environment map much more closely to the interactions users are accustomed to in their day-to-day lives. A click becomes a touch or a pointing gesture, dragging becomes picking up and moving, hovering becomes a gaze interaction, and so on. Paired with rapid advances in real-time computer graphics rendering, this will lead to a human-machine interface in which users can experience life-like virtual environments and interact effortlessly and intuitively with digital content. The applications of tomorrow will therefore no longer follow the design and UX guidelines that we have perfected over decades of producing conventional software. This makes any research on human-machine interactions in VR valuable for designing and understanding those applications.

If this premise alone is not enough to justify the study of these human-machine interactions, there are plenty of studies that examine the benefits of interacting with complex digital content in VR compared with conventional visualization methods. One such study, conducted by the University of Bath, evaluated users' open-ended exploration of multi-dimensional datasets in VR and compared the results with those of traditional 2D visualizations. The researchers found no overall difference in task workload between traditional visualization methods and visualizations in VR. However, they did find differences in the accuracy and depth of the insights that users gained. They suggest that users feel more satisfied and successful when using VR data exploration tools, and they confirm the potential of VR as an engaging medium for visual data analytics. As for why, several reasons were suggested.

Example of a study testing immersive vs. traditional visualization techniques (Source: Institute of Informatics, Federal University of Rio Grande do Sul)

First, virtual reality is fully interactive, meaning that users have no option other than to engage with their virtual environment. This leads to users being fully engaged with, and immersed in, the displayed information. As explained above, VR lets users interact more intuitively with the visualized data points, so there is no need for a complicated UI. In addition, VR gives the user a point of view inside the data, and the added perception of scale can lead to a better understanding. As a result, the user can interact with objects at different levels of detail: at one end, they can interact with the dataset as a whole; at the other, they can zoom in and interact with each individual data point. Finally, the possibility of mapping the attributes of a high-dimensional dataset to multiple axes during exploration could lead to new insights into statistical relationships.
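The idea of mapping attributes to multiple axes can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the method used in any of the studies or products mentioned: the channel list (three spatial axes plus size and hue), the function names and the sample rows are all hypothetical.

```python
# Visual channels a 3D scene could offer; this list is an assumption.
CHANNELS = ["x", "y", "z", "size", "hue"]


def normalize(values):
    """Min-max scale a list of numbers to the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def to_visual_channels(rows, attributes):
    """Assign each attribute to one visual channel, scaled to [0, 1]."""
    if len(attributes) > len(CHANNELS):
        raise ValueError("more attributes than available channels")
    columns = {
        channel: normalize([row[attr] for row in rows])
        for channel, attr in zip(CHANNELS, attributes)
    }
    return [
        {channel: column[i] for channel, column in columns.items()}
        for i in range(len(rows))
    ]


# Hypothetical sample data (illustration only).
rows = [
    {"age": 20, "income": 1000, "visits": 1, "spend": 10},
    {"age": 40, "income": 3000, "visits": 5, "spend": 50},
    {"age": 30, "income": 2000, "visits": 3, "spend": 30},
]
points = to_visual_channels(rows, ["age", "income", "visits", "spend"])
print(points[2])  # {'x': 0.5, 'y': 0.5, 'z': 0.5, 'size': 0.5}
```

Each data point now carries four attributes at once, where a flat graph could carry two. A renderer would still have to turn these normalized values into positions, sizes and colors, and the number of usable channels remains small compared with a truly high-dimensional dataset.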

All of these findings suggest that VR data visualization may help data professionals to understand and analyze their data more deeply. It may also help individuals with no data-related background to understand complex datasets and structures.

Header image by Markus Spiske