Invited Talk 1: Deep-Learning: Investigating Feed-Forward Deep Neural Networks for Modeling High Throughput Chemical Bioactivity Data
Speaker: Dr. Jun (Luke) Huan
In recent years, research in Artificial Neural Networks (ANNs) has resurged,
now under the Deep-Learning umbrella, and grown extremely popular due to
major breakthroughs in methodological and computing capabilities.
Deep-Learning methods are part of representation-learning algorithms that
attempt to extract and organize discriminative information from the data.
Recently reported success of DL techniques in crowd-sourced QSARs and
predictive toxicology competitions has showcased these methods as powerful
tools for drug-discovery and toxicology research. Nevertheless, reported
applications of Deep Learning techniques for modeling complex bioactivity
data for small molecules remain limited.
Short Bio: Dr. Jun (Luke) Huan is a Professor in the Department of Electrical Engineering and Computer Science at the University of Kansas. He directs the Data Science and Computational Life Sciences Laboratory at the KU Information and Telecommunication Technology Center (ITTC). He holds courtesy appointments at the KU Bioinformatics Center and the KU Bioengineering Program, and a visiting professorship from GlaxoSmithKline plc. Dr. Huan received his Ph.D. in Computer Science from the University of North Carolina. Dr. Huan works on data science, machine learning, data mining, big data, and interdisciplinary topics including bioinformatics. He has published more than 100 peer-reviewed papers in leading conferences and journals and has graduated more than ten graduate students, including seven PhDs. Dr. Huan serves on the editorial boards of several international journals, including the Springer Journal of Big Data, Elsevier Journal of Big Data Research, and the International Journal of Data Mining and Bioinformatics. He regularly serves on the program committees of top-tier international conferences on machine learning, data mining, big data, and bioinformatics. Dr. Huan's research is recognized internationally. He was a recipient of the prestigious National Science Foundation Faculty Early Career Development Award in 2009. His group won the Best Student Paper Award at the IEEE International Conference on Data Mining in 2011 and the Best Paper Award (runner-up) at the ACM International Conference on Information and Knowledge Management in 2009. His work has appeared in mass media outlets including Science Daily, R&D magazine, and EurekAlert (sponsored by AAAS). Dr. Huan's research has been supported by NSF, NIH, DoD, and the University of Kansas. Starting January 2016, Dr. Huan serves as a Program Director at NSF in its Intelligent and Information Division in the Computer and Information Science and Engineering Directorate.
Invited Talk 2: Networks and Models for the Integrated Analysis of Multi Omics data
Speaker: Dr. Sun Kim
These days, genome-wide measurements of genetic and epigenetic events, a.k.a. omics data, are routinely produced; epigenetics refers to the control mechanisms of genetic events, as epi- means ‘on’ or ‘upon’. As a result, a huge amount of omics data measured from different genetic and epigenetic events is available. For example, the amount of data at The Cancer Genome Atlas (TCGA) alone exceeded 2.5 petabytes as of October 2016. Unfortunately, the dimensionality of omics data is huge, typically tens of thousands to hundreds of thousands, or even millions, while the number of samples is limited, typically a few to thousands. Thus mining genetic and epigenetic data measured in different phenotype conditions is a very challenging problem: small data sets in extremely high dimensions. Furthermore, all genetic and epigenetic events are inter-related. Thus it is necessary to perform integrated analysis of omics data sets of different types, which is even more challenging. To address these technical challenges, the bioinformatics community has used virtually all known network-based analysis techniques, including recently developed deep neural networks. My group has been pursuing network-based integrated analysis of omics data at three different levels. First, we have been investigating computational methods for associating different genetic and epigenetic events, which can be viewed as methods for defining edges in the network. Second, we have been developing methods for mining sub-networks along the phenotype and time dimensions. Third, we have recently begun to investigate the use of deep learning techniques for the integrated analysis of omics data.
An important goal of our research is to combine network analysis and deep learning techniques to construct models or draw maps of cancer cells at multiple levels such as genomic mutations, gene activation/suppressions, epigenetic events including DNA methylation, histone modifications, and miRNA interference, biological pathways, and finally at the whole cell level including tumor heterogeneity and clonal evolution.
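The first level described above, defining network edges by associating different measurements, can be illustrated with a minimal sketch. It assumes, purely for illustration and not as the speaker's method, that association is measured by Pearson correlation across samples and that an edge is kept when the absolute correlation reaches a hypothetical threshold. The feature names, values, and the `association_edges` helper are made up for this example.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)


def association_edges(profiles, threshold=0.8):
    """Return (feature_a, feature_b, r) for every feature pair with |r| >= threshold.

    `threshold` is an illustrative assumption, not a recommended value.
    """
    edges = []
    for a, b in combinations(sorted(profiles), 2):
        r = pearson(profiles[a], profiles[b])
        if abs(r) >= threshold:
            edges.append((a, b, r))
    return edges


# Toy data (invented): each feature is measured in the same five samples.
profiles = {
    "geneA_expression": [1.0, 2.0, 3.0, 4.0, 5.0],
    "geneA_promoter_methylation": [5.0, 4.1, 2.9, 2.0, 1.0],
    "geneB_expression": [2.0, 1.0, 2.0, 1.0, 2.0],
}
edges = association_edges(profiles)
for a, b, r in edges:
    print(f"{a} -- {b}: r = {r:.2f}")
```

On this toy data only the expression/methylation pair of gene A passes the threshold, with a strongly negative correlation. With far more features than samples, as the abstract notes, many pairs would pass such a threshold by chance, so a real analysis would need some form of multiple-testing control.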
Short Bio: Dr. Sun Kim is a professor and the director of the Bioinformatics Institute at Seoul National University.
Invited Talk 3: High Performance Computational Biology and Drug Design on TianHe Supercomputers
Speaker: Dr. Shaoliang Peng
Extremely powerful computers are needed to help scientists to handle high
performance computational biology and drug design problems. The world’s
largest genomics institute, BGI, currently generates 6 TB of data each day. The
European Bioinformatics Institute (EBI) in Hinxton currently stores 20
petabytes (1 petabyte is 10^15 bytes) of data and back-ups about genes,
proteins and small molecules. TianHe supercomputers can speed up
computational biology and drug design processing. In 2013, 2014, and 2015,
Tianhe-2 topped the TOP500 list of fastest supercomputers in the world. Many
well-known bioinformatics and drug design software packages (BWA, DOCK,
SOAP3-dp, SOAPdenovo, SOAPsnp, etc.) have been developed or are running on TH-2.
Short Bio: Dr. Shaoliang Peng is a professor at the National University of Defense Technology (NUDT, Changsha, China) and an adjunct professor at BGI. He was a visiting scholar at the CS Department of City University of Hong Kong (CityU) from 2007 to 2008 and at BGI Hong Kong from 2013 to 2014. His research interests are high performance computing, bioinformatics, virtual screening, and biology simulation. He has participated in many keystone projects in China, such as the TianHe supercomputers. He twice won the Gold Award of the Parallel Application Challenge Competition (PAC 2014 and 2015), and his work has been reported by CCTV 1, ScienceNet, China Science and Technology News, and the 2015 Top 10 News of Hunan Province of China (1. Human Whole Genome Re-sequencing Analysis Software Pipeline, 2. mD3DOCKxb: largest high throughput molecular docking platform). He was also a finalist in the 2015 IEEE International Scalable Computing Challenge (SCALE 2015). He has published 3 books and over 50 papers in ISC 2015, ACM/IEEE Transactions, Nature Communications, Cell AJHG, BMC Bioinformatics, IPDPS, and SCIENCE CHINA. His software has been downloaded more than 50,000 times. He is an Executive Editor or Associate Editor of several international journals (Interdisciplinary Sciences: Computational Life Sciences, IJCSE, IJHPCN, and IJES). Moreover, he is the PI of several key projects, including 973, 863, and National Natural Science Foundation of China (NSFC) projects.
Invited Talk 4: Multi-Omic Approaches for Liver Cancer Biomarker Discovery
Speaker: Dr. Habtom W. Ressom
Omic technologies offer the opportunity to characterize liver cancer at various molecular levels. In particular, characterizing the association of biomolecules such as metabolites and glycoproteins with liver cancer is a promising strategy to discover clinically relevant biomarkers. Metabolites are molecular fingerprints of what cells do at a particular point in time; they can reveal early signs of cancers when the chances for cure are highest. Also, the analysis of protein glycosylation is relevant to liver pathology because of the major influence of this organ on the homeostasis of blood glycoproteins. This talk will focus on the application of multi-omic approaches to identify biomarkers for early detection of liver cancer in patients with liver cirrhosis. Specifically, I will present transcriptomic, proteomic, glycomic/glycoproteomic, and metabolomic (TPGM) studies we conducted by analyzing samples from HCC cases and cirrhotic controls using multiple omic platforms such as next generation sequencing, liquid chromatography-mass spectrometry (LC-MS), and gas chromatography-mass spectrometry (GC-MS). In addition to candidate biomarkers discovered by evaluating the changes in the levels of transcripts, proteins, glycans, and metabolites between HCC cases and cirrhotic controls, I will present network-based methods we developed for integrative analysis of multi-omic data to identify aberrant pathway/network activities and biomarkers for early detection of liver cancer.
Short Bio: Dr. Ressom is a Professor of Oncology at Georgetown University Medical Center (GUMC). His research focuses on using multi-omic approaches for liver cancer biomarker discovery. His laboratory collects biospecimens from human research participants, designs workflows for multi-omic studies, and develops computational methods for omic data analysis. Dr. Ressom is the Director of GUMC’s Genomics and Epigenomics Shared Resource (GESR), which provides various services including next generation sequencing, SNP genotyping, copy number variation analysis, DNA methylation analysis, and mRNA/miRNA expression analysis.
Invited Talk 5: Semi-Hypothesis Guided Exploratory Analysis for Biomedical Applications
Speaker: Dr. Chi-Ren Shyu
Medical research and clinical trials are often based on hypotheses that were observed from clinical practice with noticeable evidence. Forming clinically significant hypotheses will greatly benefit the success of clinical research and ensure both external and internal validity of the trial. In this talk, I will introduce a knowledge discovery approach that automatically identifies populations of subjects with commonly occurring comorbidities, genotypes, and phenotypes that present statistically high contrast between populations. Because most medical researchers want to target a confined set of medical problems (for example, hypertension and diabetes rather than all chronic diseases), the approach takes a set of selected attributes of interest and expands knowledge discovery from that initial set. The computational approach consists of a forward floating search method for population selection, a hierarchical frequent pattern mining tree to efficiently handle dense associations, contrast mining for identifying actionable plans, and an accumulated contrast (ac-)index for ranking mining results for biomedical researchers. I will present the exploratory analysis process and results from the Simons Simplex Collection (SSC) by the Simons Foundation Autism Research Initiative (SFARI), which comprises data representing 11,560 individuals from 2,591 families. Putative autism subtypes were explored by partitioning families based on demographics and autism phenotypes. An extended contrast mining procedure identified genetic combinations showing preferential association for one of the contrasted subgroups, emphasizing combinations novel to the autistic proband within each family tree. Potential for other biomedical applications will also be discussed.
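The core idea of contrast mining, finding attribute combinations that are common in one population but rare in another, can be sketched as follows. This is a simplified illustration under stated assumptions, not the speaker's pipeline: it enumerates combinations by brute force instead of using a forward floating search or a frequent pattern tree, it scores a pattern by the plain difference in support rather than the ac-index, and the subject data, `min_support`, and `max_size` values are invented.

```python
from itertools import combinations


def support(pattern, subjects):
    """Fraction of subjects whose attribute set contains every item in `pattern`."""
    if not subjects:
        return 0.0
    return sum(1 for s in subjects if pattern <= s) / len(subjects)


def contrast_patterns(group_a, group_b, max_size=2, min_support=0.5):
    """Rank attribute combinations by the support difference between two groups.

    A pattern is kept only if it is frequent (support >= min_support) in at
    least one group. Brute-force enumeration: suitable for toy data only.
    """
    items = sorted(set().union(*group_a, *group_b))
    results = []
    for size in range(1, max_size + 1):
        for combo in combinations(items, size):
            pattern = frozenset(combo)
            sup_a = support(pattern, group_a)
            sup_b = support(pattern, group_b)
            if max(sup_a, sup_b) >= min_support:
                results.append((combo, sup_a, sup_b, abs(sup_a - sup_b)))
    # Largest contrast first; ties broken alphabetically for a stable order.
    results.sort(key=lambda row: (-row[3], row[0]))
    return results


# Invented toy populations: each subject is a set of observed attributes.
group_a = [
    {"hypertension", "diabetes"},
    {"hypertension", "diabetes", "obesity"},
    {"hypertension", "diabetes"},
    {"obesity"},
]
group_b = [
    {"hypertension"},
    {"obesity"},
    {"hypertension", "obesity"},
    {"diabetes"},
]
ranked = contrast_patterns(group_a, group_b)
for combo, sup_a, sup_b, contrast in ranked:
    print(f"{combo}: A={sup_a:.2f} B={sup_b:.2f} contrast={contrast:.2f}")
```

In this toy data the combination of diabetes and hypertension ranks first: it appears in three of four subjects in group A and in none in group B, even though each attribute alone also occurs in group B. That gap between single attributes and their combination is what makes combination-level contrast worth mining.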
Short Bio: Dr. Chi-Ren Shyu is the director of the University of Missouri Informatics Institute. He holds the Paul K. and Dianne Shumaker Endowed Professorship of Biomedical Informatics. He received his Ph.D. in Electrical and Computer Engineering from Purdue University. Since joining the University of Missouri-Columbia in 2000, Shyu has received several awards, including the National Science Foundation CAREER award, the Engineering Faculty Research Award, the Engineering Teaching Excellence Award, the 2014 University of Missouri Faculty Interdisciplinary Entrepreneurial Award, and the 2016 UM System President’s Leadership Award. He actively serves the international research community, which includes organizing the IEEE HealthCom 2011 conference in Columbia as general chair and co-chairing the technical program committees of the Second IEEE BigMM2016 and IEEE BIBE2016. He will serve as the general chair for IEEE BIBM 2017 in Kansas City, Missouri, USA. Dr. Shyu also leads an interdisciplinary team of 22 researchers from veterinary medicine, engineering, and human medicine to train doctoral students through the NIH BD2K’s T32 Biomedical Big Data Science program (2016-2021) to tackle One Health Big Data challenges – translating discoveries from animal models to human health. His research interests include massive data analytics, biomedical informatics, mHealth and eHealth, visual knowledge reasoning, and search engine design. Project sponsors, in addition to the NSF, include the National Institutes of Health, the National Library of Medicine, the U.S. Department of Education, and other for-profit and nonprofit organizations.
Invited Talk 6: Computational tools for studying gene regulation in the 3-dimensional genome
Speaker: Dr. Kai Tan
Determining the 3-dimensional structure of the genome and its impact on gene expression has been a long-standing question in cell biology. Recent developments in mapping technologies for chromatin interactions have led to a rapid increase in this kind of interaction data, revealing a hierarchical organization of the 3D genome, from large compartments spanning multiple chromosomes, to mega-base-sized topologically associated chromatin domains, to individual long-range chromatin loops mediating enhancer-promoter interactions. With the explosion of chromatin interaction data, there is a pressing need for analytical tools. In this talk, I will describe two computational algorithms for analyzing chromatin interaction data at different scales. I will first present a fast algorithm for identifying large-scale, hierarchical chromatin domains. I will demonstrate how the algorithm enables studies of chromatin subdomains in gene regulation. Accurate knowledge of enhancer-promoter interactions is a prerequisite to understanding the regulatory output of enhancers. I will present an algorithm for predicting enhancer-promoter interactions by integrating genomic, transcriptomic, and epigenomic data. Using data from multiple human cell types, I will demonstrate how the algorithm can help decipher the mechanisms underlying enhancer-promoter communication.
Short Bio: Dr. Kai Tan is an associate professor in the Departments of Pediatrics, Genetics, Cell and Developmental Biology at the University of Pennsylvania and Children’s Hospital of Philadelphia. He received his PhD degree in computational biology from Washington University in Saint Louis, followed by postdoctoral training in systems biology at the University of California San Diego. Dr. Tan’s research focuses on understanding gene regulatory networks in normal and disease development. His laboratory has developed a number of algorithms for modeling and analyzing gene regulatory networks. He serves on the editorial boards of PLoS Computational Biology and BMC Genomics. He is a member of the Genomics, Computational Biology and Technology study section of the NIH. He has served on the organization and program committees of international conferences including BIBM, ISMB, and RECOMB.
Invited Talk 7: Clinical application of precision medicine: Zhongshan Hospital Strategy
Speaker: Dr. Xiangdong Wang
Tomorrow’s genome medicine in lung cancer should focus more on the homogeneity and heterogeneity of lung cancer, which play an important role in the development of drug resistance, genetic complexity, as well as confusion and difficulty of early diagnosis and therapy. Chromosome positioning and repositioning may contribute to the sensitivity of lung cancer cells to therapy, the heterogeneity associated with drug resistance, and the mechanism of lung carcinogenesis. The CCCTC-binding factor plays critical roles in genome topology and function, increased risk of carcinogenicity, and potential of lung cancer-specific mediations. Chromosome repositioning in lung cancer can be regulated by the CCCTC-binding factor. Single-cell gene sequencing, as part of genome medicine, has received special attention in lung cancer as a way to understand mechanical phenotypes, single-cell biology, heterogeneity, and chromosome positioning and function of single lung cancer cells. We are the first to propose developing an intelligent single-cell robot of human cells to integrate systems information on molecules, genes, proteins, organelles, membranes, architectures, signals, and functions. It can be a powerful automatic system to assist clinicians in decision-making, molecular understanding, risk analysis, and prognosis prediction.
Short Bio: Xiangdong Wang, MD, PhD, is a Distinguished Professor of Medicine at Fudan University, Director of Shanghai Institute of Clinical Bioinformatics, Executive Director of Clinical Science Institute of Fudan University Zhongshan Hospital, Director of Fudan University Center of Clinical Bioinformatics, Deputy Director of Shanghai Respiratory Research Institute, and visiting professor at King’s College London. His main research is focused on clinical bioinformatics, disease-specific biomarkers, lung chronic diseases, cancer immunology, and molecular & cellular therapies.
In addition, Dr. Wang serves as the Executive Vice President of the International Society for Translational Medicine, Chairman of the Executive Committee of the International Society for Translational Medicine, Deputy President of the Chinese National Professional Society of Insurance & Health, a senior advisor of the Chinese Medical Doctor Association, and Director of the National Program of Doctor-Pharmaceutist Communication. Dr. Wang has been appointed as Principal Scientist, Global Disease Advisor, Medical Monitor and Director, and Chairman of the Board of Directors in a number of pharmaceutical companies, e.g. Astra Draco, AstraZeneca, PPT, and CatheWill. He was a professor of Molecular Bioscience at North Carolina State University, a professor of Clinical Bioinformatics at Lund University, and an active member of the American Thoracic Society International Health Committee, USA.
He serves as Editor-in-Chief of Cell Biology and Toxicology (IF=2.84) and co-Editor-in-Chief of Clinical & Translational Medicine; Editor of Serial Book: Translational Bioinformatics; Asian Editor of Journal of Cellular Molecular Medicine (IF=4.99); Section Editor of Disease Biomarkers of Journal of Translational Medicine (IF=3.68); Associate Editor of Expert Review of Clinical Pharmacology (IF=2.48); and an editorial board member of international journals, e.g. American Journal of Pulmonary Critical Care Medicine (IF=13) and American Journal of Cellular & Molecular Biology (IF=5). He is the author of more than 200 scientific publications with an impact factor of about 600, a citation number of about 5372, an h-index of 41, an i10-index of 138, and a cited journal impact factor of about 5000.
Invited Talk 8: An Algorithmic-Information Calculus for Reprogramming Biological Networks
Speaker: Dr. Hector Zenil
Despite extensive attempts to characterize systems and networks using metrics drawn from traditional statistics, Shannon entropy, and graph theory, understanding systems and networks well enough to reveal their causal mechanisms without making too many unjustified assumptions remains one of the greatest challenges in complexity science and in science in general, especially beyond traditional statistics and so-called machine learning. Knowing the causal mechanisms that govern a system allows not only the prediction of the system’s behaviour but also the manipulation and controlled reprogramming of the system. Here we introduce a formal interventional calculus based upon universal principles drawn from the theory of computability and algorithmic probability, thereby enabling better approaches to the question of causal discovery. By performing sequences of fully controlled perturbations, changes in the algorithmic content of a system can be classified according to their shift towards or away from algorithmic randomness, thereby inducing a ranking of the system’s elements. This spectral dimension unmasks an algorithmic separation between components conditioned upon the perturbations and endows us with a suite of powerful parameter-free algorithms to reprogram the system’s underlying program. The predictive and explanatory power of these novel conceptual tools is introduced, and numerical experiments are illustrated on various types of networks. We show how the algorithmic content of a network is connected to its possible dynamics and how the instant variation of the sensitivity, depth, and number of attractors in a network is accessible by an analysis of its algorithmic information landscape. The results demonstrate how to unveil causal mechanisms to infer essential properties, including the dynamics of evolving networks.
We introduce measures and methods for system reprogrammability even with no, or limited, access to the system’s kinetic equations or probability distributions. We expect this interventional calculus to be broadly applicable for predictive causal interventions, and we anticipate it to be instrumental in the challenge of causal discovery in science from complex data.
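The perturbation idea in the abstract (remove one element at a time, measure how the system's information content shifts, and rank elements by that shift) can be sketched in a few lines. The abstract does not specify an estimator, so this sketch swaps in lossless compression (zlib) as a crude, computable upper-bound proxy. Compression captures only statistical regularities, so it stands in for, and is not, the algorithmic-probability approach the talk describes. The toy network and all function names are invented for illustration.

```python
import zlib


def adjacency_string(n, edges):
    """Row-major adjacency matrix of an undirected graph as a '0'/'1' byte string."""
    rows = [[ord("0")] * n for _ in range(n)]
    for u, v in edges:
        rows[u][v] = rows[v][u] = ord("1")
    return bytes(cell for row in rows for cell in row)


def complexity_proxy(n, edges):
    """Compressed length in bytes: a rough stand-in for algorithmic complexity."""
    return len(zlib.compress(adjacency_string(n, edges), 9))


def rank_edges_by_perturbation(n, edges):
    """Delete each edge in turn and record the change in the complexity proxy.

    Negative shift: removing the edge makes the network more compressible.
    Positive shift: removing the edge makes it less compressible.
    Returns (shift, edge) pairs sorted from most negative to most positive.
    """
    edges = list(edges)
    baseline = complexity_proxy(n, edges)
    shifts = []
    for edge in edges:
        remaining = [e for e in edges if e != edge]
        shifts.append((complexity_proxy(n, remaining) - baseline, edge))
    return sorted(shifts)


# Invented toy network: a regular ring of 16 nodes plus one irregular chord.
n = 16
ring = [(i, (i + 1) % n) for i in range(n)]
network = ring + [(2, 11)]
ranking = rank_edges_by_perturbation(n, network)
for shift, edge in ranking:
    print(f"remove {edge}: shift = {shift:+d} bytes")
```

On a network this small the compressed lengths differ by only a few bytes, so the resulting ranking is illustrative at best. The point of the sketch is the shape of the procedure: a baseline estimate, one controlled deletion per element, and a signed shift that orders the elements.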