IEEE Big Data 2018 Accepted Papers

1. Big Data Foundations

Regular Papers
Paper ID Title Authors
BigD357 Linear Models with Many Cores and CPUs: A Stochastic Atomic Update Scheme Edward Raff and Jared Sylvester
BigD409 Best-Choice Edge Grafting for Efficient Structure Learning of Markov Random Fields Walid Chaabene and Bert Huang
BigD504 Semi-supervised Deep Representation Learning for Multi-View Problems Vahid Noroozi, Lei Zheng, Sara Bahaadini, Sihong Xie, Weixiang Shao, and Philip S. Yu
BigD545 Projection-SVM: Distributed Kernel Support Vector Machine for Big Data using Subspace Partitioning Dinesh Singh and Krishna Mohan C
BigD564 Detecting Latent Structure Uncertainty with Structural Entropy So Hirai and Kenji Yamanishi
BigD580 Time Series Classification Using a Neural Network Ensemble Soukaina Filali Boubrahimi and Rafal Angryk
BigD602 Hybridization of Active Learning and Data Programming for Labeling Large Industrial Datasets Mona Nashaat, Aindrila Ghosh, Shaikh Quader, Chad Marston, Jean-Francois Puget, and James Miller
BigD717 DANN: Incorporating Prior Domain Knowledge into Model Training Nikhil Muralidhar, Mohammad Raihanul Islam, Manish Marwah, Anuj Karpatne, and Naren Ramakrishnan
Short Papers
BigD366 Efficient Dimensionality Reduction for Sparse Binary Data Rameshwar Pratap, Raghav Kulkarni, and Ishan Sohony
BigD369 Effective Outlier Detection based on Bayesian Network and Proximity Sha Lu, Lin Liu, Jiuyong Li, and Thuc Duy Le
BigD405 Hash-Grams On Many-Cores and Skewed Distributions Edward Raff and Mark McLean
BigD442 Securing Behavior-based Opinion Spam Detection Shuaijun Ge, Guixiang Ma, Sihong Xie, and Philip S. Yu
BigD451 AdaDIF: Adaptive Diffusions for Efficient Semi-supervised Learning over Graphs Dimitris Berberidis, Athanasios Nikolakopoulos, and Georgios B. Giannakis
BigD482 Source Free Domain Adaptation Using an Off-the-Shelf Classifier Arun Reddy Nelakurthi, Ross Maciejewski, and Jingrui He
BigD501 Modeling Road Traffic Dynamics Using Big Data Fan Yang, Alina Vereshchaka, and Wen Dong
BigD536 Scalable Bottom-up Subspace Clustering using FP-Trees for High Dimensional Data Tuan Doan, Jianzhong Qi, Sutharshan Rajasegarar, and Christopher Leckie
BigD565 Biomedical Data Classification using Random Projection Ensembles Sotiris Tasoulis, Aristidis Vrahatis, Spiros Georgakopoulos, and Vassilis Plagianakos
BigD581 Representation Learning for Question Classification via Topic Sparse Autoencoder and Entity Embedding Dingcheng Li, Jingyuan Zhang, and Ping Li
BigD699 Scaling up Inference in MLNs with Spark Maminur Islam, Khan Mohammad Al Farabi, Somdeb Sarkhel, and Deepak Venugopal
BigD701 Queryable Compression on Time-Evolving Social Networks with Streaming Michael Nelson, Sridhar Radhakrishnan, and Chandra Sekharan
BigD708 Topological approaches to skin disease image analysis Yu-Min Chung, Chuan-Shen Hu, Austin Lawson, and Clifford Smyth
BigD730 DeepFP: A Deep Learning Framework For User Fingerprinting via Mobile Motion Sensors Sara Amini, Vahid Noroozi, Sara Bahaadini, Philip S. Yu, and Chris Kanich

2. Big Data Infrastructure

Regular Papers
BigD234 An Empirical Analysis on Expressibility of Vertex Centric Graph Processing Paradigm Siyuan Liu and Arijit Khan
BigD239 ARCHIE: Data Analysis Acceleration with Array Caching in Hierarchical Storage Bin Dong, Teng Wang, Houjun Tang, Quincey Koziol, Kesheng Wu, and Suren Byna
BigD294 Column Cache: Buffer Cache for Columnar Storage on HDFS Takeshi Yoshimura, Tatsuhiro Chiba, and Hiroshi Horii
BigD336 Online Density Estimation over Streaming Data: A Local Adaptive Solution Zhong Chen, Zhide Fang, Jiabin Zhao, Wei Fan, Andrea Edwards, and Kun Zhang
BigD350 Practical Cross Program Memoization with KeyChain Craig Mustard and Alexandra Fedorova
BigD397 Learning-based Automatic Parameter Tuning for Big Data Analytics Frameworks Liang Bao, Xin Liu, and Weizhao Chen
BigD403 Dynamic and Transparent Memory Sharing for Accelerating Big Data Analytics Workloads in Virtualized Cloud Wenqi Cao and Ling Liu
BigD431 Scalable Manifold Learning for Big Data with Apache Spark Frank Schoeneman and Jaroslaw Zola
BigD598 Mira: Sharing Resources for Distributed Analytics at Small Timescales Michael Kaufmann, Kornilios Kourtis, Adrian Schuepbach, and Martina Zitterbart
BigD671 A Method-Level Test Generation Framework for Debugging Big Data Applications Huadong Feng, Jaganmohan Chandrasekaran, Yu Lei, Raghu Kacker, and D. Richard Kuhn
BigD738 A Reinforcement Learning Based Resource Management Approach for Time-critical Workloads in Distributed Computing Environment Zixia Liu, Hong Zhang, Bingbing Rao, and Liqiang Wang
Short Papers
BigD233 Dilemma between Naive or Costly: Technique of Resembling Data Processing Workloads for Datacenter Flash Storage Janki Bhimani, Rajinikanth Pandurangan, Ningfang Mi, and Vijay Balakrishnan
BigD252 Serverless Big Data Processing using Matrix Multiplication as Example Sebastian Werner, Jörn Kuhlenkamp, Markus Klems, Johannes Müller, and Stefan Tai
BigD262 Versatile Communication Optimization for Deep Learning by Modularized Parameter Server Po-Yen Wu, Pangfeng Liu, and Jan-Jan Wu
BigD291 Analyzing Alibaba's Co-located Datacenter Workloads Yue Cheng, Ali Anwar, and Xuejing Duan
BigD317 Communication Model for Parallel Iterative Stream Processing Sachini Jayasekara, Xunyun Liu, Shanika Karunasekera, and Aaron Harwood
BigD394 OverSketch: Approximate Matrix Multiplication for the Cloud Vipul Gupta, Shusen Wang, Thomas Courtade, and Kannan Ramchandran
BigD459 POSUM: A Portfolio Scheduler for MapReduce Workloads Maria Voinea, Alexandru Uta, and Alexandru Iosup
BigD470 Experimental Characterizations and Analysis of Deep Learning Frameworks Yanzhao Wu, Wenqi Cao, Semih Sahin, and Ling Liu
BigD563 Scalable Distributed Top-k Join Queries in Topic-Based Pub/Sub Systems Nikos Zacheilas, Dimitris Dedousis, and Vana Kalogeraki
BigD604 XOS: An Application-Defined Operating System for Data Center Servers Chen Zheng, Lei Wang, Sally A. McKee, Jianfeng Zhan, and Lixin Zhang
BigD617 Cluster-based Data Reduction for Persistent Homology Anindya Moitra, Nick Malott, and Philip Wilsey
BigD645 GeoMatch: Efficient Large-Scale Map Matching on Apache Spark Ayman Zeidan, Eemil Lagerspetz, Kai Zhao, Petteri Nurmi, Sasu Tarkoma, and Huy Vo
BigD682 Parallel DBSCAN Algorithm Using a Data Partitioning Strategy with Spark Implementation Dianwei Han
BigD689 Sync-on-the-fly: A Parallel Framework for Gradient Descent Algorithms on Transient Resources Guoyi Zhao, Lixin Gao, and David Irwin
BigD711 GreenDataFlow: Minimizing the Energy Footprint of Global Data Movement Zulkar Nine, Luigi Di Tacchio, Asif Imran, Tevfik Kosar, Fatih Bulut, and Jinho Hwang
BigD719 ThousandSunny: A Large-Scale Neural Network Training System For Online Advertising Quanchang Qi, Guangming Lu, Jun Zhang, Lichun Yang, and Haishan Liu
BigD735 Spark-uDAPL: Cost-Saving Big Data Analytics on Microsoft Azure Cloud with RDMA Networks Xiaoyi Lu, Dipti Shankar, Haiyang Shi, and Dhabaleswar K. (DK) Panda

3. Big Data Management

Regular Papers
BigD309 Concept-Driven Load Shedding: Reducing Size and Error of Voluminous and Variable Data Streams Nikos R. Katsipoulakis, Alexandros Labrinidis, and Panos K. Chrysanthis
BigD318 HYPE: Massive Hypergraph Partitioning with Neighborhood Expansion Christian Mayer, Ruben Mayer, Sukanya Bhowmik, Lukas Epple, and Kurt Rothermel
BigD358 Accelerating a Distributed CPD Algorithm for Large Dense, Skewed Tensors Kareem Aggour, Alex Gittens, and Bülent Yener
BigD379 Explaining Aggregates for Exploratory Analytics Fotis Savva, Christos Anagnostopoulos, and Peter Triantafillou
BigD396 Optimizing Lossy Compression with Adjacent Snapshots for N-body Simulation Data Sihuan Li, Sheng Di, Xin Liang, Zizhong Chen, and Franck Cappello
BigD475 Error-Controlled Lossy Compression Optimized for High Compression Ratios of Scientific Datasets Xin Liang, Sheng Di, Dingwen Tao, Sihuan Li, Shaomeng Li, Hanqi Guo, Zizhong Chen, and Franck Cappello
BigD481 Truth Inference on Sparse Crowdsourcing Data with Local Differential Privacy Haipei Sun, Boxiang Dong, Wendy Hui Wang, Ting Yu, and Zhan Qin
BigD529 Cloud based Real-Time and Low Latency Scientific Event Analysis Chen Yang, Zhihui Du, and Xiaofeng Meng
BigD573 Influence Maximization in Evolving Multi-Campaign Environments Iouliana Litou and Vana Kalogeraki
BigD611 Alleviating I/O Inefficiencies to Enable Effective Model Training Over Voluminous, High-Dimensional Datasets Daniel Rammer, Walid Budgaga, Thilina Buddhika, Shrideep Pallickara, and Sangmi Lee Pallickara
Short Papers
BigD287 Steering Top-k Influencers in Dynamic Graphs via Local Updates Vijaya Krishna Yalavarthi and Arijit Khan
BigD290 Distributed Execution of Spatial SQL Queries Konstantinos Giannousis, Konstantina Bereta, Nikolaos Karalis, and Manolis Koubarakis
BigD325 Efficient Processing of Probabilistic Single and Batch Reachability Queries in Large and Evolving Spatiotemporal Contact Networks Zohreh Raghebi and Farnoush Banaei-Kashani
BigD351 FairGAN: Fairness-aware Generative Adversarial Networks Depeng Xu, Shuhan Yuan, Lu Zhang, and Xintao Wu
BigD365 Aggregation of Linked Data: a case study in the cultural heritage domain Nuno Freire, Enno Meijers, René Voorburg, Roland Cornelissen, Antoine Isaac, and Sjors de Valk
BigD381 Integrated Real-Time Data Stream Analysis and Sketch-Based Video Retrieval in Team Sports Lukas Probst, Fabian Rauschenbach, Heiko Schuldt, Philipp Seidenschwarz, and Martin Rumo
BigD384 A Universal Namespace Approach to Support Metadata Management and Efficient Data Convergence of HPC and Cloud Scientific Workflows Hsing-bung Chen
BigD473 FastTopK: A Fast Top-K Trajectory Similarity Query Processing Algorithm for GPUs Hamza Mustafa, Eleazar Leal, and Le Gruenwald
BigD621 Optimized Storing of Workflow Outputs through Mining Association Rules Debasish Chakroborti, Manishankar Mondal, Banani Roy, Chanchal K. Roy, and Kevin A. Schneider
BigD627 A Survey on Trajectory Data Management for Hybrid Transactional and Analytical Workloads Keven Richly
BigD680 Aion: It's Never too Late in Event-Time Streams Sérgio Esteves, Gianmarco De Francisci Morales, Rodrigo Rodrigues, Marco Serafini, and Luís Veiga
BigD741 Dynamic Online Performance Optimization in Streaming Data Compression Kade Gibson, Dongeun Lee, Jaesik Choi, and Alexander Sim

4. Big Data Search and Mining

Regular Papers
BigD236 Benchmarking API Costs of Network Sampling Strategies Michele Coscia and Luca Rossi
BigD240 Using Smart Card Data to Model Commuters' Response Upon Unexpected Train Delays Xiancai Tian and Baihua Zheng
BigD302 Optimal k-Nearest-Neighbor Query Processing via Multiple Lower Bound Approximations Christian Beecks and Max Berrendorf
BigD305 Differentially Private Semi-Supervised Learning With Known Class Priors Anh Pham and Jing Xi
BigD307 Revisiting Exact kNN Query Processing with Probabilistic Data Space Transformations Atoshum Samuel Cahsai, Christos Anagnostopoulos, Nikos Ntarmos, and Peter Triantafillou
BigD328 Scalable Construction of Text Indexes with Thrill Timo Bingmann, Simon Gog, and Florian Kurpicz
BigD334 AURORA: Auditing PageRank on Large Graphs Jian Kang, Meijia Wang, Nan Cao, Yinglong Xia, Wei Fan, and Hanghang Tong
BigD356 Adaptive Data Pruning for Support Vector Machines Yasuhiro Fujiwara, Junya Arai, Sekitoshi Kanai, Yasutoshi Ida, and Naonori Ueda
BigD364 An Efficient System for Subgraph Discovery Aparna Joshi, Yu Zhang, Petko Bogdanov, and Jeong-Hyon Hwang
BigD414 ImVerde: Vertex-Diminished Random Walk for Learning Imbalanced Network Representation Jun Wu, Jingrui He, and Yongming Liu
BigD423 Efficient Discovery of Weighted Frequent Itemsets in Very Large Transactional Databases: A Re-visit Rage Uday Kiran
BigD425 On Learning Psycholinguistics Tools for English-based Creole Languages using Social Media Data Pei-Chi Lo and Ee-Peng Lim
BigD437 Automated Extraction of Personal Knowledge from Smartphone Push Notifications Yuanchun Li, Ziyue Yang, Yao Guo, Xiangqun Chen, Yuvraj Agarwal, and Jason Hong
BigD440 Semi-supervised Multi-instance Learning for Flu Shot Adverse Event Detection Junxiang Wang, Liang Zhao, and Yanfang Ye
BigD457 Candidate List Maintenance in High Utility Sequential Pattern Mining Scott Buffett
BigD469 ParIS: The Next Destination for Fast Data Series Indexing and Query Answering Botao Peng, Themis Palpanas, and Panagiota Fatourou
BigD477 One-Shot Learning on Attributed Sequences Zhongfang Zhuang, Xiangnan Kong, Elke Rundensteiner, Aditya Arora, and Jihane Zouaoui
BigD483 A Data-Centric Approach for Image Scene Localization Abdullah Alfarrarjeh, Seon Ho Kim, Shivnesh Rajan, Akshay Deshmukh, and Cyrus Shahabi
BigD498 Learning Multiclassifiers with Predictive Features Varied along with Data Distribution Xuan-Hong Dang, Omid Askarisichani, and Ambuj K. Singh
BigD508 FauxBuster: A Content-free Fauxtography Detector Using Social Media Comments Daniel Zhang, Lanyu Shang, Biao Geng, Shuyue Lai, Ke Li, Hongmin Zhu, Md Tanvir Amin, and Dong Wang
BigD509 Lifelong Memory Networks with Knowledge Learning from Big Data for Aspect Sentiment Classification Shuai Wang, Guangyi Lv, Sahisnu Mazumder, Geli Fei, and Bing Liu
BigD542 Hot Spot Analysis for Big Trajectory Data Panagiotis Nikitopoulos, Aris-Iakovos Paraskevopoulos, Christos Doulkeridis, Nikos Pelekis, and Yannis Theodoridis
BigD566 PER: A Probabilistic Attentional Model for Personalized Text Recommendations Lei Zheng, Yixue Wang, Lifang He, Sihong Xie, Fengjiao Wang, and Philip S. Yu
BigD569 Fast and Accurate Mining of Node Importance in Trajectory Networks Tilemachos Pechlivanoglou and Manos Papagelis
BigD571 Fast Bag-Of-Words Candidate Selection in Content-Based Instance Retrieval Systems Michal Siedlaczek, Qi Wang, Yen-Yu Chen, and Torsten Suel
BigD600 BigSR: real-time expressive RDF stream reasoning on modern Big Data platforms Xiangnan Ren, Olivier Curé, Hubert Naacke, and Guohui Xiao
BigD624 Constructing Influence Trees from Temporal Sequence of Retweets: An Analytical Approach Ayan Kumar Bhowmick, G. Sai Bharath Chandra, Yogesh Singh, and Bivas Mitra
BigD626 StreamGuard: A Bayesian Network Approach to Copyright Infringement Detection Problem in Large-scale Live Video Sharing Systems Daniel Zhang, Lixing Song, Qi Li, Yang Zhang, and Dong Wang
BigD658 Influence Maximization in Social Networks With Non-Target Constraints Madhavan Padmanabhan, Naresh Somisetty, Samik Basu, and A Pavan
BigD662 A Multi-Criteria Experimental Ranking of Distributed SPARQL Evaluators Damien Graux, Louis Jachiet, Pierre Genevès, and Nabil Layaïda
BigD665 Mining top-k Popular Datasets via a Deep Generative Model Uchenna Akujuobi, Ke Sun, and Xiangliang Zhang
BigD722 Fusion of Terrain Information and Mobile Phone Location Data for Flood Area Detection in Rural Areas Takahiro Yabe, Kota Tsubouchi, and Yoshihide Sekimoto
BigD740 Fast Clustering with Flexible Balance Constraints Hongfu Liu, Ziming Huang, Yun Fu, Qi Chen, Mingqin Li, and Lintao Zhang
BigD753 Improved Dynamic Memory Network for Dialogue Act Classification with Adversarial Training Yao Wan, Wenqiang Yan, Jianwei Gao, Zhou Zhao, Jian Wu, and Philip S. Yu
BigD758 A Sketch-Based Naive Bayes Algorithms for Evolving Data Streams Maroua Bahri, Silviu Maniu, and Albert Bifet
Short Papers
BigD276 Efficient Principal Subspace Projection of Streaming Data Through Fast Similarity Matching Andrea Giovannucci, Victor Minden, Cengiz Pehlevan, and Dmitri Chklovskii
BigD293 Identifying Pros and Cons of Product Aspects Based on Customer Reviews Ebad Ahmadzadeh and Philip Chan
BigD301 Dynamic Network Embeddings: From Random Walks to Temporal Random Walks Giang Nguyen, John Boaz Lee, Ryan Rossi, Nesreen Ahmed, Eunyee Koh, and Sungchul Kim
BigD323 Efficient Triangles Estimation in Network Streams Based on Edge Sampling Roohollah Etemadi and Jianguo Lu
BigD330 StageMap: Extracting and Summarizing Progression Stages in Event Sequences Yuanzhe Chen, Abishek Puri, Linping Yuan, and Huamin Qu
BigD339 Speed Accuracy Trade-off for Pedestrian and Vehicle Detection using Localized Big Data Yeongro Yun, Youngseok Park, Chanhee Woo, and Sejoon Lim
BigD345 The content correlation of multiple streaming edges Michel de Rougemont and Guillaume Vimont
BigD383 Learning Fast and Slow - A Unified Batch/Stream Framework Jacob Montiel, Albert Bifet, Viktor Losing, Jesse Read, and Talel Abdessalem
BigD385 Top-N-Rank: A Truncated List-wise Ranking Approach for Large-scale Top-N Recommendation Junjie Liang, Jinlong Hu, Shoubin Dong, and Vasant Honavar
BigD386 Dynamic Partition Forest: An Efficient and Distributed Indexing Scheme for Similarity Search based on Hashing Yangdi Lu, Yang Bo, Wenbo He, and Amir Nabatchian
BigD390 Monitoring the shape of weather, soundscapes, and dynamical systems: a new statistic for dimension-driven data analysis on large data sets Henry Kvinge, Elin Farnell, Michael Kirby, and Chris Peterson
BigD393 Clustering-Driven and Dynamically Diversified Ensemble for Drifting Data Streams Lukasz Korycki and Bartosz Krawczyk
BigD395 Predicting computational reproducibility of data analysis pipelines in large population studies using collaborative filtering Soudabeh Barghi, Lalet Scaria, Ali Salari, and Tristan Glatard
BigD436 Detecting Highly Overlapping Community Structure by Model-based Maximal Clique Expansion Said Jabbour, Nizar Mhadhbi, Badran Raddaoui, and Lakhdar Sais
BigD444 Improving Query Execution Performance in Big Data using Cuckoo Filter Sharafat Ibn Mollah Mosharraf and Muhammad Abdullah Adnan
BigD461 CAM: A Combined Attention Model for Natural Language Inference Amit Gajbhiye, Sardar Jaf, Noura Al Moubayed, Steven Bradley, and A. Stephen McGough
BigD480 Local Partition in Rich Graphs Scott Freitas, Nan Cao, Yinglong Xia, Duen Horng Chau, and Hanghang Tong
BigD484 All-in-One Urban Mobility Mapping Application with Optional Routing Capabilities Rebekah Thompson, Jose Stovall, Daniel Velasquez, Viswa Sri Rupa Anne, Alex Samoylov, and Mina Sartipi
BigD497 Multi-Attribute Topic Feature Construction for Social Media-based Prediction Alex Morales, Nupoor Gandhi, Man-pui Sally Chan, Sophie Lohmann, Travis Sanchez, Lyle Ungar, Dolores Albaracin, and Chengxiang Zhai
BigD499 Context-Aware Deep Sequence Learning with Multi-View Factor Pooling for Time Series Classification Sreyasee Das Bhattacharjee, William J. Tolone, Ashish Mahabal, Mohammed Elshambakey, Isaac Cho, and S. G. Djorgovski
BigD530 DLA: a Distributed, Location-based and Apriori-based Algorithm for Biological Sequence Pattern Mining Eirini Stamoulakatou, Andrea Gulino, and Pietro Pinoli
BigD533 Motif-Preserving Dynamic Local Graph Cut Dawei Zhou, Jingrui He, Hasan Davulcu, and Ross Maciejewski
BigD537 AnySC: Anytime Set-wise Classification of Variable Speed Data Streams Jagat Sesh Challa, Poonam Goyal, Vijay M Giri, Dhananjay Mantri, and Navneet Goyal
BigD576 Pseudo-Inverse Linear Discriminants for Highly Imbalanced Big Datasets Daqi Gao, Jingguang Zhang, and Jiamin Song
BigD583 Correlated Anomaly Detection from Large Streaming Data Zheng Chen, Xinli Yu, Yuan Lin, Xiaohua Hu, and Erjia Yan
BigD605 TSM2: Optimizing Tall-and-Skinny Matrix-Matrix Multiplication on GPUs Jieyang Chen, Nan Xiong, Xin Liang, Dingwen Tao, Sihuan Li, Kaiming Ouyang, Kai Zhao, Nathan DeBardeleben, Qiang Guan, and Zizhong Chen
BigD614 SynthNotes: A Generator Framework for High-volume, High-fidelity Synthetic Mental Health Notes Edmon Begoli, Kris Brown, Sudarshan Srinivas, and Suzanne Tamang
BigD633 Spatio-Temporal Attention based recurrent neural network for next POI prediction Basmah Altaf, Lu Yu, and Xiangliang Zhang
BigD644 An Application of Storage-Optimal MatDot Codes for Coded Matrix Multiplication: Fast k-Nearest Neighbors Estimation Utsav Sheth, Sanghamitra Dutta, Malhar Chaudhari, Haewon Jeong, Yaoqing Yang, Jukka Kohonen, Teemu Roos, and Pulkit Grover
BigD666 PACAS: Privacy-Aware, Data Cleaning-as-a-Service Yu Huang, Mostafa Milani, and Fei Chiang
BigD669 Scalable K-Core Decomposition for Static Graphs Using a Dynamic Graph Data Structure Alok Tripathy, Fred Hohman, Duen Horng Chau, and Oded Green
BigD697 Deep Learning for Predicting Dynamic Uncertain Opinions in Network Data Xujiang Zhao, Feng Chen, and Jin-Hee Cho
BigD698 Density-aware Local Siamese Autoencoder Network Embedding with Autoencoder Graph Clustering Yang Zhou, Amnay Amimeur, Chao Jiang, Dejing Dou, Ruoming Jin, and Pengwei Wang
BigD703 Exploring Size-Speed Trade-Offs In Static Index Pruning Juan Rodriguez and Torsten Suel
BigD731 Enumerating Top-k Quasi-Cliques Seyed-Vahid Sanei-Mehri, Apurba Das, and Srikanta Tirthapura
BigD751 Context Aware Recommender System for Large Scaled Flash Sale Sites Wanying Ding, Ran Xu, Ying Ding, Yue Zhang, and Chuanjiang Luo

5. Big Data Security & Privacy

Regular Papers
BigD238 Do Bitcoin Users Really Care About Anonymity? A Graph Analysis on Bitcoin Transaction Graphs Anil Gaihre, Yan Luo, and Hang Liu
BigD247 Distributed Machine Learning Meets Blockchain: A Decentralized, Secure, and Privacy-preserving Realization Xuhui Chen, Jinlong Ji, Changqing Luo, Weixian Liao, and Pan Li
BigD296 Benchmarking Anomaly Detection Algorithms in an Industrial Context: Dealing with Scarce Labels and Multiple Positive Types David Renaudie, Maria A. Zuluaga, and Rodrigo Acuna-Agost
BigD314 A Unified Unsupervised Gaussian Mixture Variational Autoencoder for High Dimensional Outlier Detection Weixian Liao, Yifan Guo, Xuhui Chen, and Pan Li
BigD388 PrivacyZone: A novel approach to protecting location privacy of mobile users Emre Yigitoglu, Mehmet Emre Gursoy, Ling Liu, Margaret Loper, Bhuvan Bamba, and Kisung Lee
BigD531 There goes Wally: Anonymously sharing your location gives you away Apostolos Pyrgelis, Nicolas Kourtellis, Ilias Leontiadis, Joan Serra, and Claudio Soriente
BigD534 Actionable Objective Optimization for Suspicious Behavior Detection on Large Bipartite Graphs Tong Zhao, Matthew Malir, and Meng Jiang
BigD538 Phishing URL Detection with Oversampling based on Text Generative Adversarial Networks Ankesh Anand, Kshitij Gorde, Joel Moniz, Noseong Park, Tanmoy Chakraborty, and Bei-Tseng Chu
BigD677 GCI: A Transfer Learning Approach for Detecting Cheats of Computer Game Bo Dong, Md Shihabul Islam, Swarup Chandra, Latifur Khan, and Bhavani Thuraisingham
Short Papers
BigD217 Novel anomaly detection and classification schemes for Machine-to-Machine uplink Akshay Kumar, Ahmed Abdelhadi, and Charles Clancy
BigD255 Algorithmic Reputation Michael Katell
BigD346 Toward End-to-End Deception Detection in Videos Hamid Karimi, Jiliang Tang, and Yanen Li
BigD443 Learning Light-Weight Edge-Deployable Privacy Models Yeon-sup Lim, Mudhakar Srivatsa, Supriyo Chakraborty, and Ian Taylor
BigD446 Automated Generation and Selection of Interpretable Features for Enterprise Security Jiayi Duan, Ziheng Zeng, Alina Oprea, and Shobha Vasudevan
BigD523 Graph Mining-based Trust Evaluation Mechanism with Multidimensional Features for Large-scale Heterogeneous Threat Intelligence Yali Gao, Xiaoyong Li, Jirui Li, Yunquan Gao, and Ning Guo
BigD589 CVExplorer: Multidimensional Visualization for Common Vulnerabilities and Exposures Vung Pham and Tommy Dang
BigD630 dynamicMF: A Matrix Factorization Approach to Monitor Resource Usage in High Performance Computing Niyazi Sorkunlu, Duc Thanh Anh Luong, and Varun Chandola
BigD667 An Integrated Knowledge Graph to Automate GDPR and PCI DSS Compliance Lavanya Elluri, Ankur Nagar, and Karuna Pande Joshi

6. Big Data Applications

Regular Papers
BigD278 A Bayesian Approach to Residential Property Valuation Based on Built Environment and House Characteristics Zhicheng Liu, Shuai Yan, Jun Cao, Tanhua Jin, Jiabo Tang, Junyan Yang, and Qiao Wang
BigD285 Realtime Robustification of Interdependent Networks under Cascading Attacks Zhen Chen, Hanghang Tong, and Lei Ying
BigD292 Market Abnormality Period Detection via Co-movement Attention Model Yue Wang, Chenwei Zhang, Shen Wang, Philip S. Yu, Lu Bai, and Lixin Cui
BigD298 Optimizing Taxi Carpool Policies via Reinforcement Learning and Spatio-Temporal Mining Ishan Jindal, Zhiwei (Tony) Qin, Xuewen Chen, Matthew Nokleby, and Jieping Ye
BigD353 Large-Scale Validation of Hypothesis Generation Systems via Candidate Ranking Justin Sybrandt, Michael Shtutman, and Ilya Safro
BigD354 Are Abstracts Enough for Hypothesis Generation? Justin Sybrandt, Angelo Carrabba, Alexander Herzog, and Ilya Safro
BigD371 Enabling of Predictive Maintenance in the Brownfield through Low-Cost Sensors, an IIoT-Architecture and Machine Learning Patrick Strauß, René Wöstmann, Markus Schmitz, and Jochen Deuse
BigD387 Integrating the University of São Paulo Security Mobile App to the Electronic Monitoring System João Eduardo Ferreira, José Antônio Visintin, Jun Okamoto Jr., Mauro Cesar Bernardes, Adriano Paterlini, Alexander Csóka Roque, and Moisés Ramalho Miguel
BigD401 IL-Net: Using Expert Knowledge to Guide the Design of a Furcated Neural Networks Khushmeen Sakloth, Wesley Beckner, Jim Pfaendtner, and Garrett Goh
BigD402 Dynamic Prediction of ICU Mortality Risk Using Domain Adaptation Tiago Alves, Alberto Laender, Adriano Veloso, and Nivio Ziviani
BigD406 Two Birds with One Network: Unifying Event Prediction and Time-to-failure Modeling Karan Aggarwal, Onur Atan, Ahmed Farahat, Chi Zhang, Kosta Ristovski, and Chetan Gupta
BigD445 Transfer learning for time series classification Hassan Ismail Fawaz, Germain Forestier, Jonathan Weber, Lhassane Idoumghar, and Pierre-Alain Muller
BigD456 A Structured Learning Approach with Neural Conditional Random Fields for Sleep Staging Karan Aggarwal, Swaraj Khadanga, Shafiq Joty, Louis Kazaglis, and Jaideep Srivastava
BigD476 RiskSens: A Multi-view Learning Approach to Identifying Risky Traffic Locations in Intelligent Transportation Systems Using Social and Remote Sensing Yang Zhang, Yiwen Lu, Daniel Zhang, Lanyu Shang, and Dong Wang
BigD495 Exploiting Knowledge Graph to Improve Text-based Prediction Shan Jiang, Chengxiang Zhai, and Qiaozhu Mei
BigD526 A Minimax Approach for Classification with Big-data Krishnan Raghavan, Jagannathan Sarangapani, and VA Samaranayake
BigD528 Mining Illegal Insider Trading of Stocks: A Proactive Approach Sheikh Rabiul Islam, Sheikh Khaled Ghafoor, and William Eberle
BigD547 Profiling Driver Behavior for Personalized Insurance Pricing and Maximal Profit Bing He, Dian Zhang, Siyuan Liu, Hao Liu, Dawei Han, and Lionel M. Ni
BigD590 Inferring Housing Demand based on Express Delivery Data Qingyang Li, Zhiwen Yu, Bin Guo, and Xinjiang Lu
BigD593 Knowledge-guided Bayesian Support Vector Machine for High-Dimensional Data with Application to Analysis of Genomics Data Wenli Sun, Changgee Chang, Yize Zhao, and Qi Long
BigD643 Situation-Based Interaction Learning for Personality Prediction on Facebook Lei Zhang, Liang Zhao, Xuchao Zhang, Wenmo Kong, Zitong Sheng, and Chang-Tien Lu
BigD653 Technology Enablers for Big Data, Multi-Stage Analysis in Medical Image Processing Shunxing Bao, Prasanna Parvathaneni, Yuankai Huo, Yogesh Barve, Andrew Plassard, Yuang Yao, Hongyang Sun, Ilwoo Lyu, David Zald, Bennett Landman, and Aniruddha Gokhale
BigD655 The unbanked and poverty: Predicting area-level socio-economic status from M-Money transactions Gregor Engelmann, James Goulding, and Gavin Smith
BigD685 An Unsupervised Learning Based Approach for Mining Attribute Based Access Control Policies Leila Karimi and James Joshi
BigD721 Time-Aware Subgroup Matrix Decomposition: Imputing Missing Data Using Forecasting Events Xi Yang and Min Chi
BigD760 Learning Informative and Private Representations via Generative Adversarial Networks Tsung-Yen Yang, Christopher Brinton, Prateek Mittal, Mung Chiang, and Andrew Lan
Short Papers
BigD245 Predicting Perceived Cycling Safety Levels Using Open and Crowdsourced Data Jiahui Wu, Lingzi Hong, and Vanessa Frias-Martinez
BigD248 A Longitudinal Social Network Clustering Method Based on Tie Strength Zhiyong Zhang, Mao Ye, Yijie Huang, and Nan Sun
BigD263 Personalized heart failure severity estimates using passive smartphone data Ayse Cakmak, Erik Reinertsen, Herman Taylor, Amit Shah, and Gari Clifford
BigD280 Data-driven Blockbuster Planning on Online Movie Knowledge Library Ye Liu, Jiawei Zhang, Chenwei Zhang, and Philip S. Yu
BigD310 Deep Convolutional Neural Networks for Log Event Classification on Distributed Cluster Systems Rui Ren, Jiechao Cheng, Yan Yin, Jianfeng Zhan, and Lei Wang
BigD321 Social-Media aided Hyperlocal Help-Network Matching & Routing during Emergencies Dheeraj Kumar, Takahiro Yabe, and Satish Ukkusuri
BigD337 Session Expert: a Lightweight Conference Session Recommender System Jinfeng Yi, Qi Lei, Junchi Yan, and Wei Sun
BigD362 Visual Reasoning of Feature Attribution with Deep Recurrent Neural Networks Chuan Wang, Takeshi Onishi, Keiichi Nemoto, and Kwan-Liu Ma
BigD380 Entropy-Isomap: Manifold Learning for High-dimensional Dynamic Processes Frank Schoeneman, Varun Chandola, Nils Napp, Olga Wodo, and Jaroslaw Zola
BigD389 A Hierarchical Framework for Timely Freeway Accident Detection and Localization Yasitha Warahena Liyanage, Charalampos Chelmis, and Daphney-Stavroula Zois
BigD465 Predicting Individual-Level Call Arrival from Online Account Customer Activity Somayeh Moazeni
BigD494 A Subspace Pre-learning Approach to Fast High-Accuracy Machine Learning of Large XOR PUFs with Component-Differential Challenges Ahmad O. Aseeri, Yu Zhuang, and Mohammed Saeed Alkatheiri
BigD506 Scalable Classification of Univariate and Multivariate Time Series Saeed Karimi-Bidhendi, Faramarz Munshi, and Ashfaq Munshi
BigD535 Short-term local weather forecast using dense weather station by deep neural network Kazuo Yonekura, Hitoshi Hattori, and Taiji Suzuki
BigD568 NetClips: A Framework for Video Analytics in Sports Broadcast Masoumeh Izadi, Shangjing Wu, Aiden Chia, and Bernard Cheng
BigD594 Defining an Alert Mechanism for Detecting likely threats to National Security Pedro Cardenas Canto, Georgios Theodoropoulos, and Boguslaw Obara
BigD654 Distributed Learning of Deep Sparse Neural Networks for High-dimensional Classification Shweta Garg, Krishnan Raghavan, Jagannathan Sarangapani, and Samaranayake V.A.
BigD659 Twitter Health Surveillance (THS) System Manuel Rodriguez-Martinez and Cristian Garzon-Alfonso
BigD746 Land Cover Classification at the Wildland Urban Interface using High-Resolution Satellite Imagery and Deep Learning Mai H. Nguyen, Jessica Block, Daniel Crawl, Vincent Siu, Akshit Bhatnagar, Federico Rodriguez, Alison Kwan, Namrita Baru, and Ilkay Altintas
BigD755 Distributed Reverse DNS Geolocation Ovidiu Dan, Vaibhav Parikh, and Brian D. Davison

Industry and Government Program

Regular Papers
Paper IDTitleAuthors
N205 Relational Similarity Machines (RSM): A Similarity-based Learning Framework for Graphs Ryan Rossi, Nesreen Ahmed, Rong Zhou, and Hoda Eldardiry
N209 Bridging the Gap between Big Data System Software Stack and Applications: A Case Study of Semiconductor Wafer Fabrication Foundries Hung-Chang Hsiao
N210 CUImage: A Neverending Learning Platform on a Convolutional Knowledge Graph of Billion Web Images Ping Luo, Zhanglin Peng, Lingyun Wu, and Jiamin Ren
N211 Learning to Simplify Distributed Systems Management Christopher Streiffer, Ramya Raghavendra, Theophilus Benson, and Mudhakar Srivatsa
N213 Learning Effective Embeddings for Machine Generated Emails with Applications to Email Category Prediction Yu Sun, Lluis Garcia-Pueyo, James Wendt, Marc Najork, and Andrei Broder
N217 Scheduling Large-scale Distributed Training via Reinforcement Learning Zhanglin Peng, Jiamin Ren, Ruimao Zhang, Lingyun Wu, Xinjiang Wang, and Ping Luo
N220 Parallel Polyglot Query Processing on Heterogeneous Cloud Data Stores with LeanXcale Boyan Kolev, Oleksandra Levchenko, Esther Pacitti, Patrick Valduriez, Ricardo Vilaca, Rui Goncalves, Ricardo Jimenez-Peris, and Pavlos Kranas
N221 STIPA: A Memory Efficient Technique for Interval Pattern Discovery Amit Kumar and Dhaval Patel
N226 AISTAR: An Intelligent System for Online IT Ticket Automation Recommendation Qing Wang, Chunqiu Zeng, S. S. Iyengar, Larisa Shwartz, Genady Ya Grabarnik, and Tao Li
N228 Character Recognition by Deep Learning: An Enterprise Solution Khaled Bouaziz, Thiagarajan Ramakrishnan, Srinivasan Raghavan, Kyle Grove, Awny Al-Omari, and Choudur Lakshminarayan
N229 Build and Execution Environment (BEE): an Encapsulated Environment Enabling HPC Applications Running Everywhere Jieyang Chen, Qiang Guan, Xin Liang, Paul Bryant, Patricia Grubel, Allen McPherson, Li-Ta Lo, Timothy Randles, Zizhong Chen, and James Ahrens
N237 Predicting Age & Gender of Mobile Users at Scale - A Distributed Machine Learning Approach Kajanan Sangaralingam, Nisha Verma, Aravind Ravi, Anindya Datta, and Varun Chugh
N241 High-Throughput Adaptive Data Virtualization via Context-Aware Query Routing Amirhossein Aleyasen, Mohamed Soliman, Lyublena Antova, Florian Mike Waas, and Marianne Winslett
N243 Efficient Super Resolution for Large-Scale Images using Attentional GAN Brooke Cowan, Xinxin Li, Shervin Minaee, and Harsh Nilesh Pathak
N244 ANNOTATE: orgANizing uNstructured cOntenTs viA Topic labEls Deepak Ajwani, Bilyana Taneva, Sourav Dutta, Pat Nicholson, Ghasem Nobari, and Alessandra Sala
N256 Reacting to Variations in Product Demand: An Application for Conversion Rate (CR) Prediction in Sponsored Search Marcello Tallis and Pranjul Yadav
N257 A Smart System for Selection of Optimal Product Images in E-Commerce Abon Chaudhuri, Paolo Messina, Samrat Kokkula, Aditya Subramanian, Abhinandan Krishnan, Shreyansh Gandhi, Alessandro Magnani, and Venkatesh Kandaswamy
N259 Finding Data Should be Easier than Finding Oil Evgeny Kharlamov, Martin Skjaeveland, Theofilos Mailis, Ernesto Jimenez-Ruiz, Guohui Xiao, Ahmet Soylu, Hallstein Lie, and Arild Waaler
N265 Data models for service failure prediction in supply-chain networks Monika Sharma, Tristan Glatard, Eric Gelinas, Mariam Tagmouti, and Brigitte Jaumard
Short Papers
N208 Focusing on the Big Picture: Insights into an End-to-End Systems Approach to Deep Learning for Satellite Imagery Ritwik Gupta, Carson Sestili, Javier Vazquez-Trejo, and Matthew Gaston
N212 A Generic and Scalable Pipeline for Large-Scale Analytics of Continuous Operational Aircraft Engine Data Florent Forest, Jérôme Lacaille, Mustapha Lebbah, and Hanene Azzag
N214 Large Scale Open Source Video Recommender Tool Using Metadata Surrogates George Mathew, Steven Smith, and John Passarelli
N218 Distributed NoSQL Data Stores: Performance Analysis and a Case Study Abdeltawab Hendawi, Jayant Gupta, Liu Jiayi, Ankur Teredesai, Ramakrishnan Naveen, Shah Mohak, and Mohamed Ali
N222 Using Real-World Store Data for Foot Traffic Forecasting Soheila Abrishami and Piyush Kumar
N225 Root Cause Detection using Dynamic Dependency Graphs from Time Series Data Syed Yousaf Shah, Xuan-Hong Dang, and Petros Zerfos
N227 A Complete Data Science Work-flow For Insurance Field Mohammed Ghesmoune, Hanane Azzag, Mustapha Lebbah, Salima Benbernou, Mourad Ouziri, and Tarn Duong
N233 In situ TensorView: In situ Visualization of Convolutional Neural Networks Xinyu Chen, Qiang Guan, Li-Ta Lo, Simon Su, Zhengyong Ren, James Ahrens, and Trilce Estrada
N234 Performance Prediction using Neural Network and Confidence Intervals: a Gas Turbine application. Silvia Cisotto and Randa Herzallah
N235 Spatio-temporal prediction of crimes using network analytic approach Saroj Dash, Ilya Safro, and Ravisutha Srinivasamurthy
N236 Predicting Individual Level Consumer Brand Preferences Using Persistent Mobility Patterns Aravind Ravi and Kajanan Sangaralingam
N240 Big Data Streaming Analytics for QoE Monitoring in Mobile Networks: A Practical Approach Diego F. Rueda, Dahyr Vergara, and David Reniz
N242 A Deterministic Self-Organizing Map Approach and its Application on Satellite Data based Cloud Type Classification Wenbin Zhang, Jianwu Wang, Daeho Jin, Lazaros Oreopoulos, and Zhibo Zhang
N245 E-commerce Product Query Classification Using Implicit User's Feedback from Clicks Yiu-Chang Lin and Ankur Datta
N246 Explainable Text Classification in Legal Document Review: A Case Study of Explainable Predictive Coding Rishi Chhatwal, Peter Gronvall, Nathaniel Huber-Fliflet, Robert Keeling, Jianping Zhang, and Haozhen Zhao
N247 Augmenting Software Project Managers with Predictions from Machine Learning Kalyan Veeramachaneni and Benjamin Schreck
N248 Acquire, adapt, and anticipate: continuous learning to block malicious domains Ignacio Arnaldo and Kalyan Veeramachaneni
N249 A Batched Multi-Armed Bandit Approach to Dynamic News Headline Testing Yizhi Mao, Miao Chen, Abhinav Wagle, Junwei Pan, Michael Natkovich, and Don Matheson
N250 Identifying Distracted and Drowsy Drivers Sujay Yadawadkar, Brian Mayer, Sanket Lokegaonkar, Mohammad Raihanul Islam, Miao Song, Mike Mollenhauer, and Naren Ramakrishnan
N253 Performance Implications of Big Data in Scalable Deep Learning: On the Importance of Bandwidth and Caching Miro Hodak, David Ellison, Peter Seidel, and Ajay Dholakia
N254 ChieF : A Change Pattern based Interpretable Failure Analyzer Dhaval Patel, Lam Nguyen, Akshay Rangamani, Shrey Shrivastava, and Jayant Kalagnanam
N255 NetDP: An Industrial-Scale Distributed Network Representation Framework for Default Prediction in Ant Credit Pay Jianbin Lin, Zhiqiang Zhang, Jun Zhou, Xiaolong Li, Jingli Fang, Yanming Fang, Quan Yu, and Yuan Qi
N261 Towards Semantic Simplification of Analytical Workflows at Siemens (Extended Abstract) Evgeny Kharlamov, Gulnar Mehdi, Ognjen Savkovic, Guohui Xiao, Steffen Lamparter, Arild Waaler, and Ian Horrocks
N264 I4TSPS: a Visual-Interactive Web System for Industrial Time Series Pre-processing Kevin Villalobos, Jon Vadillo, Borja Diez, Borja Calvo, and Arantza Illarramendi