Russian mathematicians find gold in big data

Russian mathematicians and geophysicists have made a standard technique for ore prospecting several times more effective. Their findings are reported in Geophysical Journal International, one of the most respected scientific periodicals on computational geophysics.

The controlled-source electromagnetic method, known as CSEM, dates back to the mid-20th century. It involves deploying grounded electrodes that inject an oscillating electric current into the Earth. The electromagnetic field is then measured on the surface. The resulting data enable mapping the electrical resistivity of the subsurface rock by solving what is known as an inverse problem. This is useful because a low resistivity suggests the presence of metal ore. A considerable limitation of CSEM, which has restricted its scope of application, is its high demand for computing resources.

Now, a research group led by Michael Zhdanov from the Applied Computational Geophysics Lab at the Moscow Institute of Physics and Technology has created a numerical method that makes the calculations feasible for modern supercomputers. (Image credit: @tsarcyanide/MIPT Press Office)

"Solving the inverse problem involves calculating -- thousands of times -- the electromagnetic field from a given distribution of electric current," said paper co-author Mikhail Malovichko of Skoltech and the MIPT Applied Computational Geophysics Lab. "We have proposed a new numerical method that speeds up the forward-problem calculation for alternating current severalfold, thus making the inverse problem tractable on modern supercomputers."
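The computational pattern the quote describes can be shown in miniature. The toy Python sketch below illustrates inversion-by-repeated-forward-solves in general, not the authors' method: the "forward problem" is a made-up 2x2 linear operator, and a gradient-descent inversion calls it once per iteration. Real CSEM forward solves are large 3D electromagnetic simulations, which is why speeding them up is what makes the inverse problem tractable.

```python
# Toy sketch of inversion by repeated forward modeling -- an
# illustration of the general pattern, not the authors' method.
# The forward operator here is a made-up 2x2 linear map.

def forward(model):
    """Hypothetical forward operator: model parameters -> predicted data."""
    a, b = model
    return [2.0 * a + 1.0 * b, 1.0 * a + 3.0 * b]

def invert(observed, steps=500, lr=0.05):
    """Gradient descent on the data misfit; one forward solve per step."""
    model = [0.0, 0.0]
    forward_solves = 0
    for _ in range(steps):
        predicted = forward(model)
        forward_solves += 1
        residual = [p - o for p, o in zip(predicted, observed)]
        # Gradient of 0.5 * ||F(m) - d||^2 for the linear operator above
        # (the operator's transpose applied to the residual).
        grad = [2.0 * residual[0] + 1.0 * residual[1],
                1.0 * residual[0] + 3.0 * residual[1]]
        model = [m - lr * g for m, g in zip(model, grad)]
    return model, forward_solves

data = forward([1.0, 2.0])          # synthetic "measurements"
recovered, n_solves = invert(data)  # recovers ~[1.0, 2.0]
print(recovered, n_solves)
```

Even this trivial inversion consumed hundreds of forward solves; in a realistic 3D setting each solve is itself a supercomputer-scale computation, which is the bottleneck a faster forward method attacks.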

However, to use the algorithm for prospecting, it first needs to be verified using precise data on real ore deposits. Highly reliable reference data are supplied by the most expensive geological prospecting technique there is -- exploration drilling.

Fortunately, such data turned out to be available on the Sukhoi Log gold deposit, 900 kilometers northeast of Irkutsk, Russia. Discovered in the 1960s, the deposit is one of the largest worldwide. That said, the precious metal concentration in the rock is fairly low. For this reason, Sukhoi Log was thoroughly scrutinized to enable extracting ore only where it is economically viable.

"The Soviet Union spent an immense amount of money to drill more than 800 boreholes in an endeavor whose economic feasibility was not subject to any checks," said study co-author Andrei Tarasov, who is an associate professor at the Department of Geophysics, St. Petersburg State University. "This makes Sukhoi Log the ideal place for testing newly developed geological surveying techniques by comparing their predictions with the precise data available from drilling."

By processing the large arrays of available data, the MIPT-Skoltech team created a detailed 3D map of the area and tested the new algorithm's ability to solve the inverse problem in CSEM. The new model enables prospectors to make do with as few exploratory boreholes as possible: drilling is employed only to verify model predictions.

The technique developed by the Russian researchers is applicable for searching for other kinds of ores, including copper-nickel, volcanogenic massive sulfide, and polymetallic deposits.

AI, big data predict which research will influence future medical treatments

An artificial intelligence/machine learning model to predict which scientific advances are likely to eventually translate to the clinic has been developed by Ian Hutchins and colleagues in the Office of Portfolio Analysis (OPA), a team led by George Santangelo at the National Institutes of Health (NIH). This work, described in a Meta-Research article published October 10 in the open-access journal PLOS Biology, aims to decrease the sometimes decades-long interval between scientific discovery and clinical application. The method determines the likelihood that a research article will be cited by a future clinical trial or guideline, an early indicator of translational progress.

Hutchins and colleagues have quantified these predictions, which are highly accurate with as little as two years of post-publication data, as a novel metric called "Approximate Potential to Translate" (APT). APT values can be used by researchers and decision-makers to focus attention on areas of science that have strong signatures of translational potential. Although numbers alone should never be a substitute for evaluation by human experts, the APT metric has the potential to accelerate biomedical progress as one component of data-driven decision-making.

Image caption: This image depicts the co-citation network of seminal fundamental publications that led to the clinical development of cancer immunotherapy treatments. Large dots (center) represent the most influential clinical trials that formed part of the evidence base for FDA approval of these treatments. Heat mapping indicates the extent to which the research was human-focused; at the extremes, each green dot represents a fundamental research publication and each red dot a publication describing human research. The network was generated using open-access data from the new modules of the iCite webtool described in two new articles from Hutchins and colleagues. (Image credit: Ian Hutchins and George Santangelo)

The model that computes APT values makes predictions based upon the content of research articles and the articles that cite them. A long-standing barrier to research and development of metrics like APT is that such citation data has remained hidden behind proprietary, restrictive, and often costly licensing agreements. To disrupt this impediment to the scientific community, to increase transparency, and to facilitate reproducibility, OPA has aggregated citation data from publicly available resources to create an open citation collection (NIH-OCC), the details of which appear in a Community Page article in the same issue of PLOS Biology. The NIH-OCC comprises over 420 million citation links at present and will be updated monthly as citations continue to accumulate. For publications since 2010, the NIH-OCC is already more comprehensive than leading proprietary sources of citation data.

Citation data from the NIH-OCC are used to calculate both APT values and Relative Citation Ratios (RCRs). The latter, a measure of scientific influence at the article level, normalized for the field of study and time since publication, was developed previously by Santangelo's team at NIH, and has already been widely adopted in both the scientific and evaluator communities. Upon publication of these two articles, APT values and the NIH-OCC will be freely and publicly available as new components of the iCite webtool that will continue as the primary source of RCR data (https://icite.od.nih.gov/). The OPA team encourages the use of iCite to improve research assessment and decision-making that can contribute to optimizing the scientific enterprise.
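As a rough illustration of what "normalized for the field of study and time since publication" can mean, the sketch below compares an article's citation rate to an assumed field-average rate. The formula shape and numbers are expository assumptions, not NIH's actual RCR algorithm, which benchmarks each article against its co-citation network.

```python
# Expository sketch only: the real RCR is computed differently,
# benchmarking each article against its co-citation neighborhood.

def rcr_like(citations, years_since_pub, field_citations_per_year):
    """An article's citations per year, relative to its field's rate."""
    article_rate = citations / years_since_pub
    return article_rate / field_citations_per_year

# 30 citations in 5 years, in a field averaging 3 citations/year:
ratio = rcr_like(30, 5, 3.0)
print(ratio)  # 2.0 -> cited at twice the field-normalized expectation
```

The point of any such normalization is that a raw citation count of 30 means very different things in a small, slow-citing field than in a large, fast-citing one.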

Latest artificial intelligence research from China in Big Data

China is among the leaders in the rapidly advancing artificial intelligence field, and its broad range of cutting-edge research expertise is on display in this special issue on "Artificial Intelligence in China" of Big Data, a peer-reviewed journal from Mary Ann Liebert, Inc., publishers. The special issue is freely available on the Big Data website through July 18, 2019.

Co-Guest Editors Weiping Zhang, Ph.D., of Zhejiang University (China), and Mohit Kumar, Ph.D., of Rostock University (Germany), organized the unique and timely collection of articles in this special issue.

Featured in the special issue is the article entitled "Abnormal Data Region Discrimination and Cross-Monitoring Points Historical Correlation Repair of Water Intake Data," coauthored by Huifeng Xue, Xi'an University of Technology and China Academy of Aerospace System Scientific and Engineering (Beijing), Qiaoyun Liu, Xi'an University of Technology, Junjie Hou, China Academy of Aerospace System Scientific and Engineering, and Yi Wan, Ministry of Water Resources (Beijing). The researchers analyze the characteristics of abnormal data distribution and show how the data from current monitoring points do not maximally correlate with historical data from corresponding points. They use sample data from recent years to demonstrate that application of the Abnormal Data Region Discrimination algorithm and the Cross-Monitoring Points Historical Correlation Repair method can correctly identify the abnormal data region and repair the abnormal data.
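The general two-step idea, flagging readings that break from a point's own history and then repairing them using a correlated neighboring monitoring point, can be sketched as follows. All function names, thresholds, and data here are hypothetical illustrations, not the paper's algorithms.

```python
# Hypothetical sketch of the flag-then-repair idea; thresholds and
# data are invented, and this is not the paper's algorithm.

def flag_abnormal(series, window=3, threshold=2.0):
    """Mark values that jump more than `threshold` times the recent mean."""
    flags = []
    for i, v in enumerate(series):
        hist = series[max(0, i - window):i]
        if hist:
            mean = sum(hist) / len(hist)
            flags.append(abs(v - mean) > threshold * mean)
        else:
            flags.append(False)  # no history yet, cannot judge
    return flags

def repair(series, neighbor, flags, ratio):
    """Replace flagged values with the correlated neighbor's reading,
    scaled by the historical ratio between the two points."""
    return [neighbor[i] * ratio if bad else v
            for i, (v, bad) in enumerate(zip(series, flags))]

intake = [10.0, 11.0, 10.5, 95.0, 10.8]    # 95.0 is a sensor glitch
neighbor = [20.0, 22.0, 21.0, 21.4, 21.6]  # correlated point, ~2x scale
flags = flag_abnormal(intake)
repaired = repair(intake, neighbor, flags, ratio=0.5)
print(repaired)
```

The key assumption a cross-point repair makes is that the historical relationship between the two monitoring points (here a fixed ratio) still holds during the anomaly.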

Yao Yu and Junhui Zhao, East China Jiaotong University (Nanchang), and Wu Lenan, Southeast University (Nanjing), collaborated on the article entitled "Multiple Targets Tracking with Big Data-Based Measurement for Extended Binary Phase Shift Keying Transceiver." The researchers propose using Doppler measurements of target velocity in combination with target range information to improve the ability to detect multiple targets accurately in a noisy environment with an extended binary phase shift keying (EBPSK) transmit-receive system, a high-resolution radar tracking system. In a simulated experiment, they show significant enhancement in the tracking performance of the big Doppler data association method. The target velocity measurements support the likelihood of the EBPSK transceiver-generated information, helping to distinguish actual targets from false targets or clutter measurements.
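Why a Doppler (radial-velocity) measurement helps alongside range can be illustrated with a toy association step. This is a generic nearest-neighbor sketch with invented numbers, not the paper's big-data association method: a clutter return can sit at nearly the same range as a tracked target, but it rarely matches the target's velocity as well.

```python
# Generic toy sketch, not the paper's method: associate a detection
# to a track using a joint (range, velocity) distance, so clutter at
# the right range but wrong velocity is rejected.

def association_distance(track, detection, range_scale=10.0, vel_scale=2.0):
    """Normalized squared distance in (range, velocity) space."""
    dr = (track["range"] - detection["range"]) / range_scale
    dv = (track["velocity"] - detection["velocity"]) / vel_scale
    return dr * dr + dv * dv

track = {"range": 1000.0, "velocity": 50.0}
detections = [
    {"range": 1002.0, "velocity": 49.5},  # true target return
    {"range": 1001.0, "velocity": 0.0},   # clutter at a similar range
]
best = min(detections, key=lambda d: association_distance(track, d))
print(best)  # the detection whose velocity also matches the track
```

With range alone the clutter return (closer in range) would have won the association; the velocity term is what disambiguates it.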

Big Data Editor-in-Chief Zoran Obradovic, Ph.D., Carnell Professor of Data Analytics, Temple University, (Philadelphia, PA) states: "China's spending on research has increased 8-fold since 2000. The overall results of this increased research activity are evident in many fields and are particularly impressive in the area of Artificial Intelligence and Big Data. This special issue provides an excellent opportunity to read about a range of ongoing AI-related developments across multiple big data-related fields in China."