Main Conference
Regular Paper
|
Chin-Chi Hsu, Perng-Hwa Kung, Mi-Yen Yeh, Shou-De Lin, and Phillip B. Gibbons, Bandwidth-Efficient Distributed k-Nearest-Neighbor Search with Dynamic Time Warping |
Liang Zhao, Feng Chen, Chang-Tien Lu, and Naren Ramakrishnan, Dynamic theme tracking in Twitter |
Sean Massung and Chengxiang Zhai, SyntacticDiff: Operator-Based Transformation for Comparative Text Mining |
Suprio Ray, Angela Demke Brown, Nick Koudas, Rolando Blanco, and Anil Goel, Parallel In-Memory Trajectory-based Spatiotemporal Topological Join |
Tri Kurniawan Wijaya, Matteo Vasirani, Samuel Humeau, and Karl Aberer, Cluster-based Aggregate Forecasting for Residential Electricity Demand using Smart Meter Data |
Bin Dong, Suren Byna, and Kesheng Wu, Spatially Clustered Join on Heterogeneous Scientific Data Sets |
Yixian Zheng, Wenchao Wu, Huamin Qu, Chunyan Ma, and Lionel M. Ni, Visual Analysis of Bi-directional Movement Behavior |
Toyotaro Suzumura, ScaleGraph 2: A Library for Billion-Scale Graph Analytics |
Yuncheng Li and Jiebo Luo, User-Curated Image Collections: Modeling and Recommendation |
Maria Malik and Houman Homayoun, System and Architecture Level Characterization of Big Data Applications on Big and Little Core Server Architectures |
Chung-Yi Li, Wei-Lun Su, Todd G. McKenzie, Fu-Chun Hsu, Shou-De Lin, Phillip B. Gibbons, and Jane Yung-jen Hsu, Recommending Missing Sensor Values |
Cheng-Te Li, Yu-Jen Lin, and Mi-Yen Yeh, The Roles of Network Communities in Social Information Diffusion |
Wang Ke, Guo Ping, and Luo A-Li, Angular Quantization Based Affinity Propagation Clustering and its Application to Astronomical Big Spectra Data |
Ashwin Lall, Data Streaming Algorithms for the Kolmogorov-Smirnov Test |
Masayo Ota, Huy Vo, Claudio T. Silva, and Juliana Freire, A Scalable Approach for Data-Driven Taxi Ride-Sharing Simulation |
Yibo Yao and Lawrence Holder, Scalable Classification for Large Dynamic Networks |
Jilong Kuang, Daniel Waddington, and Changhui Lin, A Fast and Scalable Time Series Traffic Generator |
Ruslan Mavlyutov and Philippe Cudré-Mauroux, CINTIA: a Distributed, Low-Latency Index for Big Interval Data |
Katayoun Neshatpour, Maria Malik, Mohammad Ali Ghodrat, and Houman Homayoun, Energy-Efficient Acceleration of Big Data Analytics Applications Using FPGAs |
Lorenz Fischer and Abraham Bernstein, Workload Scheduling in Distributed Stream Processors using Graph Partitioning |
Arghya Kusum Das, Seung-Jong Park, Jaeki Hong, and Wooseok Chang, Evaluating Different Distributed-Cyber-Infrastructure for Data and Compute Intensive Scientific Application |
Vincenzo Gulisano, Yiannis Nikolakopoulos, Marina Papatriantafilou, and Philippas Tsigas, ScaleJoin: a Deterministic, Disjoint-Parallel and Skew-Resilient Stream Join |
Yang Wang and Kwan-Liu Ma, Revealing the Fog-of-War: A Visualization-directed, Uncertainty-aware Approach for Exploring High-dimensional Data |
Jilong Xue, Zhi Yang, Shian Hou, and Yafei Dai, When Computing Meets Heterogeneous Cluster: Workload Assignment in Graph Computation |
Bokai Cao, Francine Chen, Dhiraj Joshi, and Philip S. Yu, Inferring Crowd-Sourced Venues for Tweets |
Vasilis Efthymiou, Kostas Stefanidis, and Vassilis Christophides, Big Data Entity Resolution: From Highly to Somehow Similar Entity Descriptions in the Web |
Huanhuan Wu, James Cheng, Yi Lu, Yiping Ke, Yuzhen Huang, Da Yan, and Hejun Wu, Core Decomposition in Large Temporal Graphs |
Vasilis Efthymiou, George Papadakis, George Papastefanatos, Kostas Stefanidis, and Themis Palpanas, Parallel Meta-blocking: Realizing Scalable Entity Resolution over Large, Heterogeneous Data |
Ioanna Filippidou and Yiannis Kotidis, Online and On-demand Partitioning of Streaming Graphs |
Christos Anagnostopoulos and Peter Triantafillou, Learning to Accurately COUNT with Query-Driven Predictive Analytics |
Jason H.D. Cho, Yanen Li, Roxana Girju, and Chengxiang Zhai, Recommending Forum Posts to Designated Experts |
Mark Gates, Hartwig Anzt, Jakub Kurzak, and Jack Dongarra, Accelerating Collaborative Filtering Using Concepts from High Performance Computing |
Eldon Carman, Vassilis Tsotras, Till Westmann, Vinayak Borkar, and Michael Carey, A Scalable Parallel XQuery Processor |
Wei Xie, Feida Zhu, Siyuan Liu, and Ke Wang, Modelling Cascades Over Time in Microblogs |
Guoxin Liu, Haiying Shen, and Haoyu Wang, Computing Load Aware and Long-View Load Balancing for Cluster Storage Systems |
Desheng Zhang, Ruobing Jiang, Shuai Wang, Yanmin Zhu, Bo Yang, Tian He, Jian Cao, and Fan Zhang, EveryoneCounts: Data-Driven Digital Advertising based on Uncertain Demand Models in Metro Networks |
Eser Kandogan, Mary Roth, Peter Schwarz, Joshua Hui, Ignacio Terrizzano, Christina Christodoulakis, and Renee Miller, LabBook: Metadata-driven Social Collaborative Data Analysis |
Liang Zhao, WenZhan Song, and Xiaojing Ye, Fast Decentralized Gradient Descent Method and Applications to In-situ Seismic Tomography |
Yasser Salem, Jun Hong, and Weiru Liu, CSFinder: A Cold-Start Friend Finder in Large-Scale Social Networks |
Nam-Luc Tran, Thomas Peel, and Sabri Skhiri, Distributed Frank-Wolfe under Pipelined Stale Synchronous Parallelism |
Michele Bertoni, Stefano Ceri, Abdulrahman Kaitoua, and Pietro Pinoli, Evaluating Cloud Frameworks on Genomic Applications |
Chenxi Qiu, Haiying Shen, and Liuhua Chen, Towards Green Cloud Computing: Demand Allocation and Pricing Policies for Cloud Service Brokerage |
Hien To, Seon Ho Kim, and Cyrus Shahabi, Effectively Crowdsourcing the Acquisition and Analysis of Visual Data for Disaster Response |
Zhen Chen, Hanghang Tong, and Lei Ying, Full Diffusion History Reconstruction in Networks |
Demetris Trihinas, George Pallis, and Marios Dikaiakos, AdaM: an Adaptive Monitoring Framework for Sampling and Filtering on IoT Devices |
Zhao Zhang, Kyle Barbary, Frank Austin Nothaft, Evan Sparks, Oliver Zahn, Michael J. Franklin, David A. Patterson, and Saul Perlmutter, Scientific Computing Meets Big Data Technology: An Astronomy Use Case |
Nikos Zacheilas, Vana Kalogeraki, Nikolas Zygouras, Nikolaos Panagiotou, and Dimitrios Gunopulos, Elastic Complex Event Processing exploiting Prediction |
Huseyin Ulusoy, Murat Kantarcioglu, and Erman Pattuk, TrustMR: Computation Integrity Assurance system for MapReduce |
Suchismit Mahapatra and Varun Chandola, Modeling Graphs Using a Mixture of Kronecker Models |
Huseyin Ulusoy, Murat Kantarcioglu, Erman Pattuk, and Lalana Kagal, AccountableMR: Toward Accountable MapReduce systems |
Stephen Bonner, A. Stephen McGough, Ibad Kureshi, John Brennan, Georgios Theodoropoulos, Laura Moss, David Corsar, and Grigoris Antoniou, Data Quality Assessment and Anomaly Detection Via Map / Reduce and Linked Data: A Case Study in the Medical Domain |
Tian Guo, Jean-Paul Calbimonte, Hao Zhuang, and Karl Aberer, SigCO: Mining Significant Correlations via a Distributed Real-time Computation Engine |
Eleazar Leal, Le Gruenwald, Jianting Zhang, and Simin You, TKSimGPU: A Parallel Top-K Trajectory Similarity Query Processing Algorithm for GPGPUs and Multicore CPUs |
Michael Nalisnik, David Gutman, Jun Kong, and Lee Cooper, An Interactive Learning Framework for Scalable Classification of Pathology Images |
Yen-Kai Wang, Wei-Ming Chen, Cheng-Te Li, and Shou-De Lin, Identifying Smallest Unique Subgraphs in a Heterogeneous Social Network |
Jiejun Xu and Tsai-Ching Lu, Toward Precise User-Topic Alignment in Online Social Media |
Xi Yang, Ning Liu, Bo Feng, Xian-He Sun, and Shujia Zhou, PortHadoop: Support Direct HPC Data Processing in Hadoop |
John Canny, Huasha Zhao, Ye Chen, Jiangchang Mao, and Bobby Jaros, Machine Learning at the Limit |
Masahiko Itoh, Daisaku Yokoyama, Masashi Toyoda, and Masaru Kitsuregawa, Visual Interface for Exploring Caution Spots from Vehicle Recorder Big Data |
Nusrat Islam, Md. Wasi-ur-Rahman, Xiaoyi Lu, Dipti Shankar, and Dhabaleswar K. Panda, Performance Characterization and Acceleration of In-Memory File Systems for Hadoop and Spark Applications on HPC Clusters |
Serafettin Tasci and Murat Demirbas, PANOPTICON: A lock broker architecture for scalable transactions in the datacenter |
Bogdan Simion, Daniel Ilha, Suprio Ray, Leslie Barron, Angela Demke Brown, and Ryan Johnson, Slingshot: A Modular Framework for Designing Data Processing Systems |
Short Paper
|
Amir Bahmani and Frank Mueller, ACURDION: An Adaptive Clustering-based Algorithm for Tracing Large-scale MPI Applications |
Max Watson, Time Maps: A Tool for Visualizing Many Discrete Events Across Multiple Timescales |
Dongfang Zhao, Nagapramod Mandagere, Gabriel Alatorre, Mohamed Mohamed, and Heiko Ludwig, Diego/P: Toward Locality-aware Scheduling for Containerized Cloud Services |
Min Du and Feifei Li, ATOM: Automated Tracking, Orchestration, and Monitoring of Resource Usage in Infrastructure as a Service Systems |
Xugang Ye, Zijie Qi, and Jingjing Li, Learning Relevance from Click Data via Neural Network based Similarity Models |
I. Stephen Choi, Weiqing Yang, and Yang-Suk Kee, Early Experience with Optimizing I/O Performance Using High-Performance SSDs for In-Memory Cluster Computing |
Inho Cho, Soya Park, Sejun Park, Dongsu Han, and Jinwoo Shin, Large-scale Parallel Combinatorial Optimization through Belief Propagation |
Anand Tripathi and BhagavathiDhass Thirunavukarasu, A Transaction Model for Management of Replicated Data with Multiple Consistency Levels |
Yu Wang and Jiebo Luo, America Tweets China: A Fine-Grained Analysis of the State and Individual Characteristics Regarding Attitudes towards China |
Yu Jin, Joseph JaJa, Rong Chen, and Edward Herskovits, A Data-Driven Approach to Extract Connectivity Structures from Diffusion Tensor Imaging Data |
Georgios Chatzigeorgakidis, Sophia Karagiorgou, Spiros Athanasiou, and Spiros Skiadopoulos, A MapReduce Based k-NN Joins Probabilistic Classifier |
Alessandro Lulli, Thibault Debatty, Laura Ricci, Matteo Dell'Amico, and Pietro Michiardi, Scalable k-NN based text clustering |
Yuwen Chen, Jian Cao, Shanshan Feng, and Yudong Tan, An Ensemble Learning Based Approach for Building Airfare Forecast Service |
Mack Sweeney, Jaime Lester, and Huzefa Rangwala, Next-Term Student Grade Prediction |
Sofia Apreleva and Alejandro Cantarero, Predicting the Location of Users on Twitter from Low Density Graphs |
Chad Steed, Margaret Drouhard, Justin Beaver, Joshua Pyle, and Paul Bogen, Matisse: A Visual Analytics System for Exploring Emotion Trends in Social Media Text Streams |
Philip S. Yu and Sihong Xie, Robust Crowd Bias Correction via Dual Knowledge Transfer from Multiple Overlapping Sources |
Dongyao Wu, Sherif Sakr, Liming Zhu, and Qinghua Lu, Composable and Efficient Functional Big Data Processing Framework |
Salvador Aguinaga, Aditya Nambiar, Zuozhu Liu, and Tim Weninger, Concept Hierarchies and Human Navigation |
Hyunjoo Kim, Sriganesh Madhvanath, and Tong Sun, Hybrid Active Learning for Non-stationary Streaming Data with Asynchronous Labeling |
Jianting Zhang, Simin You, and Le Gruenwald, Quadtree-Based Lightweight Data Compression for Large-Scale Geospatial Rasters on Multi-Core CPUs |
Srikant Padala, Dinesh Kumar, Arun Raj, and Janakiram Dharanipragada, Octopus: A Multi-tenant Scheduler for Graphlab |
Deepika Lalwani, Somayajulu D. V. L. N., and Radha Krishna Pisipati, A Community Driven Social Recommendation System |
Ruben Tous, Anastasios Gounaris, Carlos Tripiana, Jordi Torres, Sergi Girona, Eduard Ayguadé, Jesús Labarta, Yolanda Becerra, David Carrera, and Mateo Valero, Spark Deployment and Performance Evaluation on the MareNostrum Supercomputer |
Elias Alevizos, Alexander Artikis, Kostas Patroumpas, Marios Vodas, Yannis Theodoridis, and Nikos Pelekis, How not to drown in a sea of information: An event recognition approach |
Zhenhua Chen, Jielong Xu, Jian Tang, Kevin Kwiat, and Charles Kamhoua, G-Storm: GPU-enabled High-throughput Online Data Processing in Storm |
Orcun Yildiz, Shadi Ibrahim, Tran Anh Phuong, and Gabriel Antoniu, Chronos: Failure-Aware Scheduling in Shared Hadoop Clusters |
Yongfeng Zhang, Task-based Recommendation on a Web-Scale |
Xiaowei Jia, Aosen Wang, Xiaoyi Li, Guangxu Xun, Wenyao Xu, and Aidong Zhang, Multi-modal Learning for Video Recommendation based on Mobile Application Usage |
Jiaoyan Chen, Huajun Chen, Daning Hu, Yalin Zhou, and Ming Wu, Smog Disaster Forecasting using Social Web Data and Physical Sensor Data |
Roee Ebenstein and Gagan Agrawal, DSDQuery DSI - Querying Scientific Data Repositories with Structured Operators |
Xiaoyi Li, Xiaowei Jia, and Aidong Zhang, Improving EEG Feature Learning via Synchronized Facial Video |
Muyi Liu and Michael Gribskov, MMC-Margin: Identification of Maximum Frequent Subgraphs by Metropolis Monte Carlo Sampling |
Smruti Padhy, Greg Jansen, Jay Alameda, Edgar Black, Liana Diesendruck, Mike Dietze, Praveen Kumar, Rob Kooper, Jong Lee, Riu Liu, Ricard Marciano, Luigi Marini, Dave Mattson, Barbara Minsker, Chris Navarro, Marcus Slavenas, William Sullivan, Jason Votava, and Kenton McHenry, Brown Dog: Leveraging Everything Towards Autocuration |
Afsin Akdogan, Saratchandra Indrakanti, Ugur Demiryurek, and Cyrus Shahabi, Cost-Efficient Partitioning of Spatial Data on Cloud |
Kamalika Das, Kanishka Bhaduri, Bryan Matthews, and Nikunj Oza, Large scale support vector regression for aviation safety |
Yue Wang, Ke Wang, Ada Wai-Chee Fu, and Raymond Sin-Kwok Wong, KeyLabel Algorithm for Keyword Search in Large Graphs |
Enric Junqué de Fortuny, Theodoros Evgeniou, David Martens, and Foster Provost, Iteratively Refining SVMs |
Harish Bhat, Nitesh Kumar, and Garnet Vaz, Towards Scalable Quantile Regression Trees |
Dapeng Dong and John Herbert, Record-aware Two-level Compression for Big Textual Data Analysis Acceleration |
Zhen Xie and Sencun Zhu, You Can Promote, But You Can’t Hide: Large-Scale Abused App Detection in Mobile App Stores |
Kosuke Nakabasami, Toshiyuki Amagasa, Salman Shaikh, Franck Gass, and Hiroyuki Kitagawa, An Architecture for Stream OLAP Exploiting SPE and OLAP Engine |
Chung-Hsien Yu, Dong Luo, Wei Ding, Joseph Cohen, David Small, and Shafiqul Islam, Spatio-Temporal Asynchronous Co-Occurrence Pattern for Big Climate Data towards Long-Lead Flood Prediction |
Wei Xie, Jiang Zhou, Mark Reyes, Jason Noble, and Yong Chen, Two-Mode Data Distribution Scheme for Heterogeneous Storage in Data Centers |
Teng Li, Jian Tang, and Jielong Xu, A Predictive Scheduling Framework for Fast and Distributed Stream Data Processing |
Anthony Kleerekoper, Michael Pappas, Mikel Lujan, Gavin Brown, and Adam Pocock, A Scalable Implementation of Information Theoretic Feature Selection for High Dimensional Data |
Lorenzo Gabrielli, Barbara Furletti, Roberto Trasarti, Fosca Giannotti, and Dino Pedreschi, City users’ classification with mobile phone data |
Anas Abu-Doleh and Umit Catalyurek, Spaler: Spark and GraphX based de novo genome assembler |
Pouria Pirzadeh, Michael Carey, and Till Westmann, BigFUN: A Performance Study of Big Data Management System Functionality |
S M Faisal, G. Tziantzioulis, A. M. Gok, S. Parthasarathy, N. Hardavellas, and S. Ogrenci-Memik, Edge Importance Identification for Energy Efficient Graph Processing |
Florin Schimbinschi, Xuan Vinh Nguyen, James Bailey, Chris Leckie, Hai Vu, and Ramamohanarao Kotagiri, Traffic Forecasting In Complex Urban Networks: Leveraging Big Data and Machine Learning |
Kilho Shin, Tetsuji Kuboyama, Takako Hashimoto, and Dave Shepard, Super-CWC and Super-LCC: Super Fast Feature Selection Algorithms |
Tonglin Li, Ke Wang, Shiva Srivastava, Dongfang Zhao, Kan Qiao, Iman Sadooghi, Xiaobing Zhou, and Ioan Raicu, A Flexible QoS Fortified Distributed Key-Value Storage System for the Cloud |
Mahdi Ebrahimi, Aravind Mohan, Shiyong Lu, and Robert Reynolds, TPS: A Task Placement Strategy for Big Data Workflows |
Keira Zhou, Jack Wadden, Jeffrey Fox, Ke Wang, Donald Brown, and Kevin Skadron, Regular Expression Acceleration on the Micron Automata Processor: Brill Tagging as a Case Study |
Karla Caballero Barajas and Ram Akella, Prediction of Physiological Subsystem Failure and its Impact in the prediction of Patient Mortality |
Yuqing Zhu, Yilei Wang, and Fan Wang, Improving Transaction Processing Performance By Consensus Reduction |
Dipti Shankar, Xiaoyi Lu, Md. Wasi-ur-Rahman, Nusrat Islam, and Dhabaleswar K. Panda, Benchmarking Key-Value Stores on High-Performance Storage and Interconnects for Web-Scale Workloads |
Luca Pappalardo, Dino Pedreschi, and Fosca Giannotti, Human Mobility and Economic Development |
Roberto Tardío Olmos, Alejandro Maté Morga, and Juan Carlos Trujillo Mondéjar, An Iterative Methodology for Big Data Management, Analysis and Visualization |
Robert P. Trevino, Steve A. Kawamoto, Thomas J. Lamkin, and Huan Liu, Cell Analytics in Compound Hit Selection of Bacterial Inhibitors |
Don Libes, Seungjun Shin, and Jungyub Woo, Considerations and Recommendations for Data Contributions for Predictive Analytics for Manufacturing |
Padmashree Ravindra, HyeongSik Kim, and Kemafor Anyanwu, Rewriting Complex SPARQL Analytical Queries for Efficient Cloud-based Processing |
Li-Yung Ho, Fei Shao, Jan-Jan Wu, and Pangfeng Liu, Efficient Distributed Maximum Matching for Solving the Container Exchange Problem in the Maritime Industry |
Industry and Government Program
Regular Paper
|
Xiuqiang He, Wenyuan Dai, Guoxiang Cao, Huyang Sun, Mingxuan Yuan, and Qiang Yang, Mining Target Users for Online Marketing based on App Store Data |
Ahmed Metwally, Jia-Yu Pan, Minh Doan, and Christos Faloutsos, Scalable Community Discovery from Multi-Faceted Graphs |
Ernesto Diaz-Aviles, Fabio Pinelli, Karol Lynch, Zubair Nabi, Yiannis Gkoufas, Eric Bouillet, Francesco Calabrese, Eoin Coughlan, Peter Holland, and Jason Salzwedel, Towards Real-time Customer Experience Prediction for Telecommunication Operators |
I. Stephen Choi, Weiqing Yang, and Yang-Suk Kee, Early Experience with Optimizing I/O Performance Using High-Performance SSDs for In-Memory Cluster Computing |
Hyunsik Choi, Yong In Lee, Jongyoung Park, Kangho Roh, and Kwanghyun La, An Evaluation of Alternative Shared-nothing Architecture for Analytical Processing Systems |
Anjan Goswami, Wei Han, Zhenrui Wang, and Angela Jiang, Controlled Experiments for Decision-Making in e-Commerce Search |
Jenny Williams, Paul Cuddihy, Justin McHugh, Kareem Aggour, and Arvind Menon, Semantics for Big Data Access & Integration: Improving Industrial Equipment Design through Increased Data Usability |
Laura Rettig, Mourad Khayati, Michal Piorkowski, and Philippe Cudré-Mauroux, Online Anomaly Detection over Big Data Streams |
Aungon Nag Radon, Ke Wang, Uwe Glaesser, Hans Wehn, and Andrew Westwell-Roper, Contextual Verification for False Alarm Reduction in Maritime Anomaly Detection |
Tanay Saha, Mohammad Hasan, Chandler Burgess, Md Ahsan Habib, and Jeff Johnson, Batch Mode Active Learning for Technology-Assisted Review |
Mayank Kejriwal, Qiaoling Liu, Ferosh Jacob, and Faizan Javed, A Pipeline for Extracting and Deduplicating Domain-Specific Knowledge Bases |
Fang-Hsiang Su, Manas Somaiya, Shrish Mishra, and Rajyashree Mukherjee, EXOS: EXpansion On Session for Enhancing Effectiveness of Query Auto-Completion |
Gergely Acs, Jagdish Prasad Achara, and Claude Castelluccia, Probabilistic km-anonymity (Efficient Anonymization of Large Set-Valued Datasets) |
Sauptik Dhar, Congrui Yi, Naveen Ramakrishnan, and Mohak Shah, ADMM based Scalable Machine Learning on Spark |
Dapeng Dong and John Herbert, Record-aware Compression for Big Textual Data Analysis Acceleration |
Alekh Jindal, Samuel Madden, Malú Castellanos, and Meichun Hsu, Graph Analytics using Vertica Relational Database |
Andre Luckow, Ken Kennedy, Fabian Manhard, Emil Djerekarov, Bennie Vorster, and Amy Apon, Automotive Big Data: Applications, Workloads and Infrastructures |
Goktug Cinar, Jeffrey Thompson, and Soundar Srinivasan, Cost-sensitive optimization of automated inspection |
Nicolas Poggi, Josep Lluís Berral, David Carrera, Aaron Call, Rob Reinauer, Nikola Vujic, Daron Green, José Blakeley, and Fabrizio Gagliardi, From Performance Profiling to Predictive Analytics while Evaluating Hadoop Cost-Efficiency in ALOJA |
Mohammed Korayem, Camilo Ortiz, Khalifeh AlJadda, and Trey Grainger, Query Sense Disambiguation Leveraging Large Scale User Behavioral Data |
Viet Ha-Thuc, Ganesh Venkataraman, Mario Rodriguez, Shakti Sinha, Senthil Sundaram, and Lin Guo, Personalized Expertise Search at LinkedIn |
Vinay Deolalikar, How Valuable is Your Data? A Quantitative Approach using Data Mining |
Kang Li, Vinay Deolalikar, and Neeraj Pradhan, Mining Lifestyle Personas at Scale in E-commerce |
Petros Zerfos, Hangu Yeo, Brent Paulovicks, and Vadim Sheinin, SDFS: Secure Distributed File System for Data-at-Rest Security for Hadoop-as-a-Service |
Sreenivas Sukumar, Open Research Challenges with Big Data - A Data-Scientists Perspective |
Hamed Yaghoubi Shahir, Uwe Glässer, Amir Yaghoubi Shahir, and Hans Wehn, Maritime Situation Analysis Framework: Vessel Interaction Classification and Anomaly Detection |
Levente Klein, Fernando Marianno, Conrad Albrecht, Marcus Freitag, Siyuan Lu, Nigel Hinds, Hendrik Hamann, and Sergio Bermudez, PAIRS: A scalable geo-spatial data analytics platform |
Jayasimha Katukuri, Tolga Konik, Rajyashree Mukherjee, and Santanu Kolay, Post-Purchase Recommendations in Large-scale Online Marketplaces |
Hong-Han Shuai, Chih-Ya Shen, Hsiang-Chun Hsu, De-Nian Yang, Chung-Kuang Chou, Jihg-Hong Lin, and Ming-Syan Chen, Revenue Maximization for Telecommunications Company with Social Viral Marketing |
Short Paper
|
Stephanie Rosenthal, Scott McMillan, and Matthew Gaston, Developer Toolchains for Large-Scale Analytics: Two Case Studies |
Harshal Godhia, Bibek Panda, Swarnim Narayan, and Ramakrishna Vadakattu, Enterprise Subscription Churn Prediction |
Joshua Seeger, Aron Culotta, Jason Keller, Patrick van Kessel, and Michael Jugovich, Data Deidentification in Medical Transcriptions using Regular Expressions and Machine Learning |
Qinlong Luo, Meng Zhao, Faizan Javed, and Ferosh Jacob, Macau: Large-Scale Skill Sense Disambiguation in the Online Recruitment Domain |
Wei Yi Liu, Morris H. Hsiao, and Shih Yao Dai, DNA Analysis with MapReduce |
Chaitali Gupta, Ranjan Sinha, and Yong Zhang, Eagle: User Profile-based Anomaly Detection in Hadoop Clusters |
Manuel Diaz-Granados, Javier Diaz-Montes, and Manish Parashar, Investigating Insurance Fraud using Social Media |
Luca Cazzanti, Leonardo Millefiori, and Gianfranco Arcieri, A Document-based Data Model for Large Scale Computational Maritime Situational Awareness |
Jhao-Yin Li, Mi-Yen Yeh, Ming-Syan Chen, and Jihg-Hong Lin, Modeling Social Influences from Call Records and Mobile Web Browsing Histories (Extended Abstract) |
Christian Seebode, Matthias Ort, Peter Hufnagl, and Christian R. A. Regenbrecht, Next Generation Biobanks |