Creation of a big data visual analysis platform to detect unknown operational problems

  • Access to a green field market: scientific data visualization
  • Roll-out opportunity on multiple sectors where big data is key
  • Technological foundation to develop efficient machine-learning solutions
  • Strategic investment optimized by the direct acquisition of the business
Alain de Rouvray, ESI Group’s Chairman and CEO, states: “Picviz Labs’ technology constitutes a remarkable and exceptionally clever innovation which strengthens operational intelligence by visually identifying important and hidden issues. Beyond the highly promising cyber-security market, energized by SaaS/Cloud deployments and the surge of hacker attacks, Picviz Labs is poised to accelerate ESI Group’s expansion into a large number of new verticals. This movement is expected to include addressing the interests of non-industrial companies and the health sciences by identifying previously invisible correlations that could help improve predictability and thereby reinforce the innovative potential of numerical modeling. For ESI Group’s core market, Picviz Labs’ visual solutions add a powerful capability to smartly navigate the ocean of data generated by increasingly large and complex digital models, and to reveal the wealth of knowledge that is deeply buried and consequently often undetected and wasted.”

Founded in 2010 and currently employing five people, the French company Picviz Labs has built and delivered a solution “to detect the unknown” through massive data inspection. Its big data visual analytics technology, provided through a unique and fast data rendering engine, enables customers to detect unexpected problems. Capitalizing on big data’s potential, Picviz Labs offers a new conception of operational intelligence by reversing the methodology of classical analytic tools based on query engines.

Access to a green field market: scientific data visualization

The visual analytics and scientific data visualization market is predicted to expand considerably in the coming years, reaching a potential of several billion dollars. This tremendous demand is notably linked to the arrival of the smart, digital factory: a new approach that revolutionizes product development processes by dematerializing the physical prototype and boosting process flexibility to reduce development timeframes. Adapting to this new framework, engineers and technicians will be able to detect and anticipate operational issues or inefficiencies while the product is still under development. Leveraging this demand for operational intelligence, Picviz Labs offers a data visualization platform able to integrate and analyze very large quantities of data of any kind in order to interactively highlight outliers and unexpected behavior.

Roll-out opportunity on multiple sectors where big data is key

Picviz Labs’ technology aims to meet the exponential demand that could emerge from all verticals where big data can generate strong added value for security intelligence. Already collaborating with prestigious international groups such as BNP PARIBAS, La Poste, BULL (ATOS) and Thales, as well as with government agencies, Picviz Labs has established a solid and acknowledged reputation in many fields for its data visualization offering.

The company will notably position ESI Group to address industry’s growing demand for cyber-security to protect data and prevent cyber-attacks or data-theft attempts, and can thereby help ESI roll out its virtual prototyping offering beyond the traditional industrial verticals. In addition to this specific complementarity, the acquisition is also a unique opportunity for Picviz Labs to accelerate its expansion in the vast and fast-growing security intelligence market by leveraging ESI Group’s distribution infrastructure and its portfolio of important global industrial players.

Technological foundation to develop efficient machine-learning solutions

Capitalizing on Picviz Labs’ big-data visual analysis capability, ESI Group plans to develop a disruptive machine-learning solution dedicated to industrial clients whose product development processes deploy more and more computer-controlled systems, including embedded electronic systems and robots on the product line. By offering industrialists the opportunity to improve the artificial intelligence of their machines, which learn from past data and automatically refine their algorithms, ESI Group’s strategy is to deliver strong value-added services such as predictive maintenance, quality assurance, component-replacement planning, supply chain management and logistics.

Strategic investment optimized by the direct acquisition of the business

ESI acquired all assets of Picviz Labs, including the know-how behind its powerful big-data visualization solution and its strong commercial portfolio in the cyber-security market, where the continuation of its existing activity could generate substantial value for ESI Group.

The operation, which also includes the integration of the highly qualified team, holds strong commercial and technological synergies that will facilitate ESI Group’s entry into the cyber-security market and strengthen the value-added visualization services offered to its existing industrial market.

Philippe Saadé, Picviz Labs’ Chairman and CEO, states: “We are particularly enthusiastic about joining ESI Group. We share the same ambition of improving the democratization of high-value-added analysis solutions, formerly restricted to a limited number of highly qualified engineers. Like ESI, we are convinced that the digital factory’s success will rely on industry’s ability to implement collaborative processes based on efficient visual interfaces that are more likely to support better human decisions.”

This is an artistic representation of 3-D mapping of the chemistry and microbes of the human skin.

Data reveals diversity in molecular and microbial composition, as well as prevalence of personal hygiene products

Researchers at the University of California, San Diego Skaggs School of Pharmacy and Pharmaceutical Sciences used information collected from hundreds of skin swabs to produce three-dimensional maps of molecular and microbial variations across the body. These maps provide a baseline for future studies of the interplay between the molecules that make up our skin, the microbes that live on us, our personal hygiene routines and other environmental factors. The study, published March 30 in the Proceedings of the National Academy of Sciences, may help further our understanding of the skin's role in human health and disease.

 "This is the first study of its kind to characterize the surface distribution of skin molecules and pair that data with microbial diversity," said senior author Pieter Dorrestein, PhD, professor of pharmacology in the UC San Diego Skaggs School of Pharmacy. "Previous studies were limited to select areas of the skin, rather than the whole body, and examined skin chemistry and microbial populations separately."

To sample human skin nearly in its entirety, Dorrestein and team swabbed 400 different body sites of two healthy adult volunteers, one male and one female, who had not bathed, shampooed or moisturized for three days. They used a technique called mass spectrometry to determine the molecular and chemical composition of the samples. They also sequenced microbial DNA in the samples to identify the bacterial species present and map their locations across the body. The team then used MATLAB software to construct 3D models that illustrated the data for each sampling spot.
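
The article does not include the team's MATLAB code. Purely as an illustration of the general idea, the hedged Python sketch below colors 3D sampling coordinates by a per-site measurement; the coordinates, the signal values and all names are invented for this example and are not taken from the study.

```python
# Illustrative sketch only (the study used MATLAB; this is not the authors' code).
# Render one hypothetical per-site signal on made-up 3D sampling coordinates.
import numpy as np
import matplotlib.pyplot as plt
from mpl_toolkits.mplot3d import Axes3D  # noqa: F401  (ensures the 3D projection is registered)

rng = np.random.default_rng(0)
n_sites = 400                              # the study swabbed ~400 sites per volunteer
coords = rng.normal(size=(n_sites, 3))     # placeholder 3D body-surface coordinates
signal = rng.random(n_sites)               # placeholder abundance of one molecular feature

fig = plt.figure()
ax = fig.add_subplot(projection="3d")
points = ax.scatter(coords[:, 0], coords[:, 1], coords[:, 2],
                    c=signal, cmap="viridis", s=25)
fig.colorbar(points, ax=ax, label="relative abundance (hypothetical feature)")
ax.set_title("Per-site signal mapped onto sampling coordinates")
plt.show()
```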

Despite the three-day moratorium on personal hygiene products, the most abundant molecular features in the skin swabs still came from hygiene and beauty products, such as sunscreen. According to the researchers, this finding suggests that 3D skin maps may be able to detect both current and past behaviors and environmental exposures. The study also demonstrates that human skin is not just made up of molecules derived from human or bacterial cells. Rather, the external environment (such as plastics found in clothing), diet, and hygiene and beauty products also contribute to the skin's chemical composition. The maps now allow these factors to be taken into account and correlated with local microbial communities.

"This is a starting point for future investigations into the many factors that help us maintain, or alter, the human skin ecosystem -- things like personal hygiene and beauty practices -- and how those variations influence our health and susceptibility to disease," Dorrestein said.

Study examines use of 'Exhibit' tools in creating interactive data visualizations

In 2007, members of the Haystack Group in MIT's Computer Science and Artificial Intelligence Laboratory released a set of Web development tools called "Exhibit." Exhibit lets novices quickly put together interactive data visualizations, such as maps with sortable data embedded in them; sortable tables that automatically pull in updated data from other sites; and sortable displays of linked thumbnail images.

In April, at the Association for Computing Machinery's Conference on Human Factors in Computing Systems, Haystack members will present an in-depth study of the ways in which Exhibit has been used — with ramifications for the design of data-visualization tools; data-management software, such as spreadsheets; and Web-authoring software, such as content management systems.

The study also indicates ways in which websites could better gauge the effectiveness of the visualizations they publish. "Imagine if The New York Times was able to track how well you understood a visualization, or how you used it, rather than simply how much time you spent on it," says Ted Benson, a graduate student in electrical engineering and computer science and co-author of the new paper, along with professor of computer science and engineering David Karger. "That could help them design more engaging data displays and maybe even help uncover new stories in the data you didn't know were there."

In their study, Benson and Karger performed a series of successively more tightly focused analyses. First, they examined the design decisions that characterize 1,897 pages built using Exhibit — "Exhibits," in the application's parlance. Then they studied the automatically generated access logs of the 100 most popular Exhibit sites. The authors of 24 of those sites also allowed the researchers to install software that tracked the individual mouse clicks executed by site visitors — 200,000 interactions in all. Finally, Benson and Karger interviewed the developers of 12 Exhibit sites about their experiences with the tool.

Untapped market

Karger believes the fact that so many people — scientists posting research findings, administrators of commercial websites, journalists — have gravitated to Exhibit is telling in itself.

"There are 1,900 websites that have chosen to build an Exhibit," Karger says, "which is actually a pretty remarkable stretch given that this is a research project with no technical support and no decent documentation. In my mind, what that says is that there is a need out there that is not being met. I believe the need centers on achieving full authorial control over the design of your interactive visualizations without having to become a programmer."

The new paper, Karger adds, is an attempt to investigate Exhibit's utility more rigorously. Exhibit is a "declarative" language, like HTML, not an "imperative" language, like Java, Karger explains. That means that programs written in Exhibit simply describe how existing classes of graphical elements will be deployed on screen and which data sets they'll draw from. Exhibit doesn't enable the programmer to create new functions from scratch.

That limits its versatility but, Karger argues, makes it much easier to use. The same goes for another aspect of Exhibit's design: An Exhibit page, or multiple pages on the same site, can feature different visualizations of the same data. But the data must be stored in a single location, which each of the visualizations accesses independently. Visualizations can't refer to each other.

In combination, these design decisions mean that novices can quickly build their own pages simply by cutting and pasting other people's code. They just need to change the names of the data files the code refers to — and they don't need to worry about broken links to other visualizations.

The numbers speak

The new study offers some strong evidence that this is exactly what Exhibit users do. The data that Exhibit pages display can be stored in a variety of formats, including Excel spreadsheets and comma-separated text. But 69 percent of Exhibit sites instead use the more obscure JavaScript Object Notation format, or JSON.

Several interview subjects explained that JSON was the format in which data were stored in most of the examples on the Exhibit website — and to produce their sites, they had simply cut and pasted code from existing Exhibits. The prevalence of JSON suggests that many other Exhibit users are doing the same thing.
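
As a rough illustration of what that cut-and-paste workflow implies, the hedged sketch below converts a spreadsheet exported as CSV into a JSON file with the "items" list used in Exhibit's example data files. The file names and column contents are hypothetical, and the exact schema of any particular Exhibit may differ.

```python
# Hedged sketch: turn a CSV export of a spreadsheet into an Exhibit-style
# JSON data file ({"items": [...]}). File names are placeholders.
import csv
import json

def csv_to_exhibit_json(csv_path: str, json_path: str) -> None:
    with open(csv_path, newline="", encoding="utf-8") as src:
        items = list(csv.DictReader(src))           # one dict per spreadsheet row
    with open(json_path, "w", encoding="utf-8") as dst:
        json.dump({"items": items}, dst, indent=2)

csv_to_exhibit_json("people.csv", "people.json")
```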

Exhibit's declarative design also made it easy to analyze users' interactions with Exhibit visualizations. Since every mouse click invokes an existing computational module, rather than executing a new computation from scratch, describing usage patterns is simply a matter of logging which modules are invoked when.
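
The paper's actual logging pipeline is not described here; a minimal sketch of the idea, assuming a hypothetical tab-separated log of (timestamp, site, module) records, is simply to count how often each module is invoked.

```python
# Hedged sketch: summarize interaction logs by counting which Exhibit
# modules were invoked. The log format below is hypothetical.
from collections import Counter

def module_usage(log_lines):
    counts = Counter()
    for line in log_lines:
        _timestamp, _site, module = line.rstrip("\n").split("\t")
        counts[module] += 1
    return counts

sample_log = [
    "2013-01-05T12:00:01\texample.org\tfacet-filter",
    "2013-01-05T12:00:07\texample.org\tmap-view",
    "2013-01-05T12:00:09\texample.org\tfacet-filter",
]
print(module_usage(sample_log))   # Counter({'facet-filter': 2, 'map-view': 1})
```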

One characteristic of Exhibit sites that surprised Benson and Karger: While most developers used spreadsheets to create their data, their visualizations often exploited more complex relationships among data than spreadsheets are intended to handle. Some 32 percent of Exhibits used "multivalued tables," in which a single slot — the equivalent of a cell in an Excel file — contained more than one value. Twenty-seven percent used "graphs," which capture relationships among data elements, such as which members of a user's social network are also linked to each other.

The researchers conclude that, since it seems natural even to novice Web developers to organize their data in these more sophisticated ways, spreadsheet designers should offer tools that make it easier for them. Exhibit users found ad hoc techniques for representing more complex data structures in spreadsheets, but in the process, they gave up some of the spreadsheets' core functionality. For instance, an Excel user can represent a multivalued table by entering comma-separated lists in a single cell, but those lists aren't sortable, as spreadsheet data is intended to be.
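
To make that trade-off concrete, here is a small hedged sketch (column names invented, not drawn from the study) of how a comma-separated multivalued cell can be expanded into one row per value, restoring the sorting that the packed string gives up.

```python
# Hedged sketch: a cell packed with "red, green" is a single opaque string;
# expanding it into one row per value makes the values sortable again.
rows = [
    {"item": "Widget A", "tags": "red, green"},
    {"item": "Widget B", "tags": "blue"},
]

expanded = [
    {"item": row["item"], "tag": tag.strip()}
    for row in rows
    for tag in row["tags"].split(",")
]

for record in sorted(expanded, key=lambda r: r["tag"]):
    print(record)
```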

Computer analysis verifies authenticity of Jackson Pollock's drip paintings

Abstract expressionist painter Jackson Pollock was perhaps most famous for his "drip painting" technique. His legacy, however, is plagued by fake "Pollocks" and even experts often have trouble distinguishing the genuine from the counterfeit. Now, a machine vision approach described in a forthcoming issue of International Journal of Arts and Technology has demonstrated 93 percent accuracy in spotting true Pollocks.

Lior Shamir of Lawrence Technological University in Michigan, USA, was intrigued by the revolutionary artistic style of dripping paint on a horizontal canvas and has turned to computational methods to characterize the low-level numerical differences between original Pollock drip paintings and drip paintings done by others attempting to mimic this signature style. A scan of a given painting is analyzed and 4,024 numerical image descriptors are extracted, Shamir explains. Among these descriptors are fractals formed by the movement of the dripping paint and features such as Zernike polynomials and Haralick textures.
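
Shamir's ImageClassifier itself is publicly available (see the link at the end of this article). The short Python sketch below is not that tool; it only illustrates the same descriptor-then-classifier idea on a tiny scale, using Haralick textures and Zernike moments from the mahotas library with a scikit-learn classifier. File names, labels and feature choices are placeholders.

```python
# Hedged sketch (not Shamir's 4,024-descriptor ImageClassifier): compute a few
# texture/shape descriptors per scanned painting and train a simple classifier.
import numpy as np
import mahotas as mh
from PIL import Image
from sklearn.ensemble import RandomForestClassifier

def describe(path):
    img = np.array(Image.open(path).convert("L"))            # greyscale scan
    haralick = mh.features.haralick(img).mean(axis=0)        # 13 texture statistics
    zernike = mh.features.zernike_moments(img, radius=min(img.shape) // 2)
    return np.concatenate([haralick, zernike])

# Placeholder training scans labelled 1 (genuine Pollock) or 0 (imitation)
paths = ["pollock_01.png", "imitation_01.png"]
labels = [1, 0]

features = np.array([describe(p) for p in paths])
model = RandomForestClassifier(n_estimators=200, random_state=0).fit(features, labels)
print(model.predict([describe("unattributed_scan.png")]))
```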

"The human perception of visual art is a complex cognitive task that involves different processing centers in the brain," Shamir explains. "The work of Jackson Pollock showed unique physiological and neurological human responses to Pollock's drip paintings." But the human eye is limited in its perception of the specific physical qualities of a painting. A computer, on the other hand, can quantify details pixel by pixel once a painting has been digitized and "see" details and patterns that we do not consciously detect.

Shamir's analysis demonstrates that although any amateur might imagine they could copy Pollock's work, it is indeed unique and his signature style gave rise to specific features and textures that Pollock pretenders have repeatedly failed to emulate accurately. Shamir points out that his software is publicly available and could be used to analyze the work of other artists in verifying authenticity or revealing the fakes.

Shamir, L. (2015) 'What makes a Pollock Pollock: a machine vision approach', Int. J. Arts and Technology, Vol. 8, No. 1, pp.1-10.

The software is available here: http://vfacstaff.ltu.edu/lshamir/downloads/ImageClassifier

ZENBU, a new, freely available bioinformatics tool developed at the RIKEN Center for Life Science Technology in Japan, enables researchers to quickly and easily integrate, visualize and compare large amounts of genomic information resulting from large-scale, next-generation sequencing experiments.

Next-generation sequencing has revolutionized functional genomics, with protocols such as RNA-seq, ChIP-seq and CAGE being used widely around the world. The power of these techniques lies in the fact that they enable the genome-wide discovery of transcripts and transcription factor binding sites, which is key to understanding the molecular mechanisms underlying cell function in healthy and diseased individuals and the development of diseases like cancer. The integration of data from multiple experiments is an important aspect of the interpretation of results; however, the growing number of datasets generated makes a thorough comparison and analysis of results cumbersome.

In a report published today in the journal Nature Biotechnology, Jessica Severin and colleagues describe the development of ZENBU, a tool that combines a genome browser with data analysis and a linked expression view, to facilitate the interactive visualization and comparison of results from large numbers of next-generation sequencing datasets. The key difference between ZENBU and previous tools is the ability to dynamically combine thousands of experimental datasets in an interactive visualization environment through linked genome location and expression signal views. This allows scientists to compare their own experiments against the over 6000 ENCODE and FANTOM consortium datasets currently loaded into the system, thus enabling them to discover new and interesting biological mechanisms. The tool is designed to integrate millions of experiments/datasets of any kind (RNA-seq, ChIP-seq or CAGE), hence its name: zenbu means 'all' or 'everything' in Japanese.

ZENBU is freely available for use on the web and for installation in individual laboratories, and all ZENBU sites are connected and continuously share data. The tool can be accessed or downloaded from http://fantom.gsc.riken.jp/zenbu/.

"By distributing the data and servers we encourage scientists to load and share their published data to help build a comprehensive resource to further advance research efforts and collaborations around the world," explain the authors.
