
 



IEEE BigData 2014 Program Schedule                                                                                    

Washington DC
USA
Oct 27-30, 2014

Program

 

 October 27, 2014
 October 28, 2014
 October 29, 2014
 October 30, 2014

 


 

Keynote Lecture
Main conference regular paper: 25 minutes (about 20 minutes for talk and 5 minutes for Q and A)
Main conference short paper: 15 minutes (about 11 minutes for talk and 4 minutes for Q and A)

 


 

 

26-Oct

15:30-20:00

Registration

Venue:

Ballroom Foyer and Ballroom Coatroom

 

 

27-Oct

07:30-18:00

Registration

Venue:

Ballroom Foyer and Ballroom Coatroom

10:00-10:20 and 15:20-15:40


Coffee Break at Meeting Room Foyer


08:00-18:30

 


Sessions


Session Chair


Venue

Special session I

From Data to Insight: Big Data and Analytics for Smart Manufacturing Systems

Sudarsan Rachuri

Ronay AK


JUDICIARY SUITE

Full-day event

Doctoral Consortium

Jingrui He

Patuxent

Full-day workshop

 

 

#2  

 

 

The 2nd Workshop on Scalable Machine Learning: Theory and Applications

 

Zenglin Xu

 

Waterford

#3  

1st International Workshop on High Performance Big Graph Data Management, Analysis, and Mining

Fengguang Song

Lalique

#8   

 

The 2nd International Workshop of BigData in Bioinformatics and Healthcare Informatics

Jun Huan

Haverford

#13  

First Hands-On Workshop on Leveraging High Performance Computing Resources for Managing Large Datasets

Ritu Arora

Baccarat

#18   

 

Large Scale Data Analytics in Transportation and Railway Infrastructure

Nii Attoh-Okine

CARTIER/TIFFANY

 

#20  

Big Humanities Data

Mark Hedges

CABINET SUITE

#22  

IEEE NIST Big Data PWG Workshop on Big Data: Challenges, Practices and Technologies

Nancy Grady

Embassy


08:00-12:00

 


Sessions


Session Chair


Venue

Morning  workshop

 

 

#1 

 

Scholarly Big Data: Challenges & Issues

 

Ingemar J. Cox

 

Diplomat

#15  

Workshop on Advances in Software and Hardware for Big Data to Knowledge Discovery (ASH)

Weijia Xu

Ambassador

#17  

Big Data in Computational Epidemiology

Jiangzhuo Chen

Severn

#19  

2nd Workshop on Scalable Cloud Data Management

Felix Gessert

Susquehanna

 

#21 

Complexity for Big Data

Guozhu Dong

Potomac


13:30-18:30

 


Sessions


Session Chair


Venue


Special session II


Big Data Representation and Processing in Data Science


T.Y. Lin



Susquehanna

Tutorial

Big Data Stream Mining

Albert Bifet

Severn

Afternoon workshop

 

 

#11

 

CASK-14 :  1st International Workshop on Collaborative methodologies to Accelerate Scientific Knowledge discovery in big data

 

Chen Jin

 

Diplomat

#16 

IEEE Big Data Workshop on Semantics for Big Data on the Internet of Things (SemBIoT 2014)

Kemafor Ogan

Ambassador

 

 

 

 

 

 

28-Oct

07:30-18:00

Registration

Venue:

Ballroom Foyer and Ballroom Coatroom

08:30-08:45

Opening and Welcoming Speech

Conference Co-Chairs:

Charu Aggarwal, Nick Cercone, Vasant Honavar

Program Co-Chairs:

Jimmy Lin, Jian Pei

Industry Program Co-Chairs:

Wo Chang, Raghunath Nambiar

BigData Steering Committee Chair:

Xiaohua Tony Hu (Drexel University)

Venue:

CRYSTAL BALLROOM

08:45-09:45

Session Chair:  Jian Pei

Keynote Speech 1:  Never-Ending Language Learning

Tom Mitchell - E. Fredkin University Professor, Machine Learning Department, Carnegie Mellon University

Venue:

CRYSTAL BALLROOM


09:45-10:00


Coffee Break at Meeting Room Foyer

Poster session setup and display: Meeting Room Foyer

10:00-12:30

S 1: Visual analytics, time, and space
Session Chair: Arash Jalal Zadeh Fard
Venue: CABINET SUITE

S 2: Cloud computing and systems (1)
Session Chair: Amy Apon
Venue: DIPLOMAT/AMBASSADOR

S 3: Graphs and networks
Session Chair: Luke Huan
Venue: JUDICIARY SUITE

Tutorial: Big ML Software for Modern ML Algorithms
Session Chair: Qirong Ho, Eric Xing
Venue: EMBASSY/PATUXENT


12:30-14:00


Lunch provided by the conference at BALLROOM FOYER (Seating inside the Crystal Ballroom)

Poster session setup and display: Meeting Room Foyer

14:00-16:05

L 1: Graphs and networks (1)
Session Chair: Conrad S. Tucker
Venue: CABINET SUITE

L 2: Scalable systems
Session Chair: Weijia Xu
Venue: DIPLOMAT/AMBASSADOR

L 3: Storage
Session Chair: Steven Y. Ko
Venue: JUDICIARY SUITE

I&G 1: Industry & Government
Session Chair: Wo Chang
Venue: EMBASSY/PATUXENT


16:05-16:20


Coffee Break at Meeting Room Foyer


16:20-18:00

L 4: Image processing
Session Chair: Lin-Ching Chang
Venue: CABINET SUITE

L 5: Data streams and time series
Session Chair: Bo Luo
Venue: DIPLOMAT/AMBASSADOR

L 6: Regression and machine learning
Session Chair: Jiang Zheng
Venue: JUDICIARY SUITE

I&G 2: Industry & Government
Session Chair: Raghunath Nambiar
Venue: EMBASSY/PATUXENT


19:00-20:30

Banquet

Venue:

CRYSTAL BALLROOM

 

 

29-Oct

07:30-18:00

Registration

Venue:

Ballroom Foyer, Ballroom Coatroom

08:30-09:30

Session Chair:  Vasant Honavar

Keynote Speech 2: Smart Data - How you and I will exploit Big Data for personalized digital health and many other activities

Amit Sheth, LexisNexis Ohio Eminent Scholar, Kno.e.sis - Wright State University

Venue:

CRYSTAL BALLROOM


09:30-10:00


Coffee Break at Meeting Room Foyer

Poster session setup and display: Meeting Room Foyer

10:00-12:30

Panel with Program Directors: Dr. Chaitanya Baru (NSF), Dr. Yuan Liu (NIH), Dr. David Kuehn (DoT), Dr. Tsengdar Lee (NASA), Dr. Sudarsan Rachuri (NIST), Mr. Matti Vakkuri (DIGILE):

Big Data Challenges and Opportunities

Tutorial

Large-scale Heterogeneous Learning in Big Data Analytics

Session Chair

 

Xiaohua Tony Hu

Jun Huan

Venue:

CRYSTAL BALLROOM

OLD GEORGETOWN


12:30-14:00


Lunch provided by conference at BALLROOM FOYER (Seating inside the Crystal Ballroom)

 

Poster session setup and display: Meeting Room Foyer

14:00-16:05

L 7: Distributed systems
Session Chair: Yicheng Tu
Venue: OLD GEORGETOWN

L 8: Visualization/bioinformatics
Session Chair: Saumyadipta Pyne
Venue: CABINET SUITE

L 9: Cloud computing
Session Chair: Ada Fu
Venue: JUDICIARY SUITE

I&G 3: Industry & Government
Session Chair: Raghunath Nambiar
Venue: DIPLOMAT/AMBASSADOR


16:05-16:20


Coffee Break at Meeting Room Foyer

16:20-18:00

L 10: Privacy and security
Session Chair: Christoph Schommer
Venue: OLD GEORGETOWN

L 11: Graphs and networks (2)
Session Chair: Hao Howie Huang
Venue: CABINET SUITE

I&G 4: Industry & Government
Session Chair: Wo Chang
Venue: DIPLOMAT/AMBASSADOR

Tutorial: Big Data Benchmarking
Session Chair: Chaitan Baru, Tilmann Rabl
Venue: JUDICIARY SUITE

 

 

 

 

 

30-Oct

07:30-18:00

Registration

Venue:

Ballroom Foyer, Ballroom Coatroom

08:30-09:30

Session Chair:  Jimmy Lin

Keynote Speech 3:  Addressing Human Bottlenecks in Big Data

 

Joseph M. Hellerstein, Chancellor's Professor of Computer Science, University of California, Berkeley and Trifacta

Venue:

CRYSTAL BALLROOM

 

09:30-10:00

 

Coffee Break at Meeting Room Foyer

Poster session setup and display: Meeting Room Foyer

10:00-12:30

S 4: Cloud computing and systems (2)
Session Chair: Feng Luo
Venue: JUDICIARY SUITE

S 5: Applications
Session Chair: Mathias Johanson
Venue: OLD GEORGETOWN

S 6: Data mining and learning
Session Chair: Xiaohua Tony Hu
Venue: CABINET SUITE


08:00-12:00

 


Sessions


Session Chair


Venue

Morning  workshop

 

 

#6 

 

The Second Workshop on Distributed Storage Systems and Coding for Big Data

 

 

Bing Zhu

 

Diplomat

#7  

First IEEE International Workshop on Big Data Security and Privacy (BDSP 2014)

 

Tyrone W A Grandison

Ambassador


13:30-18:30

 


Sessions


Session Chair


Venue

Afternoon workshop

 

 

#9

 

Solar Astronomy Big Data (SABiD) – 1st Workshop on Management, Search and Mining of Massive Repositories of Solar Astronomy Data

 

 

  Rafal Angryk

 

Diplomat

 

 

 

 

 

 

 

Keynote Speeches: 3

 

Keynote 1:


Title:  
Never-Ending Language Learning


Speaker:

Tom Mitchell - E. Fredkin University Professor, Machine Learning Department, Carnegie Mellon University


Abstract:

We will never really understand learning until we can build machines that learn many different things, over years, and become better learners over time. We describe our research to build a Never-Ending Language Learner (NELL) that runs 24 hours per day, forever, learning to read the web. Each day NELL extracts (reads) more facts from the web, into its growing knowledge base of beliefs. Each day NELL also learns to read better than the day before. NELL has been running 24 hours/day for over four years now. The result so far is a collection of 70 million interconnected beliefs (e.g., servedWith(coffee, applePie)) that NELL is considering at different levels of confidence, along with millions of learned phrasings, morphological features, and web page structures that NELL uses to extract beliefs from the web. NELL is also learning to reason over its extracted knowledge, and to automatically extend its ontology. Track NELL's progress at http://rtw.ml.cmu.edu, or follow it on Twitter at @CMUNELL.

 

Short Bio:

Tom M. Mitchell founded and chairs the Machine Learning Department at Carnegie Mellon University, where he is the E. Fredkin University Professor. His research uses machine learning to develop computers that are learning to read the web, and uses brain imaging to study how the human brain understands what it reads. Mitchell is a member of the U.S. National Academy of Engineering, a Fellow of the American Association for the Advancement of Science (AAAS), and a Fellow and Past President of the Association for the Advancement of Artificial Intelligence (AAAI). He believes the field of machine learning will be the fastest growing branch of computer science during the 21st century.

 

 

 

 

Keynote 2:


Title: 
Smart Data - How you and I will exploit Big Data for personalized digital health and many other activities


Speaker:

Amit Sheth, LexisNexis Ohio Eminent Scholar, Kno.e.sis - Wright State University


Abstract:

Big Data has captured a lot of interest in industry, with the emphasis on the challenges of the four Vs of Big Data: Volume, Variety, Velocity, and Veracity, and their applications to drive value for businesses. Recently, there has been rapid growth in situations where a big data challenge relates to making individually relevant decisions. A key example is personalized digital health, which relates to making better decisions about our health, fitness, and well-being. Consider, for instance, understanding the reasons for and avoiding an asthma attack based on Big Data in the form of personal health signals (e.g., physiological data measured by devices/sensors or Internet of Things around humans, on the humans, and inside/within the humans), public health signals (e.g., information coming from the healthcare system such as hospital admissions), and population health signals (such as Tweets by people related to asthma occurrences and allergens, Web services providing pollen and smog information). However, no individual has the ability to process all these data without the help of appropriate technology, and each human has a different set of relevant data!

 

In this talk, I will describe Smart Data that is realized by extracting value from Big Data, to benefit not just large companies but each individual. If my child is an asthma patient, for all the data relevant to my child with the four V-challenges, what I care about is simply, “How is her current health, and what is the risk of having an asthma attack in her current situation (now and today), especially if that risk has changed?” As I will show, Smart Data that gives such personalized and actionable information will need to utilize metadata, use domain specific knowledge, employ semantics and intelligent processing, and go beyond traditional reliance on ML and NLP. I will motivate the need for a synergistic combination of techniques similar to the close interworking of the top brain and the bottom brain in the cognitive models.

 

For harnessing Volume, I will discuss the concept of Semantic Perception, that is, how to convert massive amounts of data into information, meaning, and insight useful for human decision-making. For dealing with Variety, I will discuss experience in using agreement represented in the form of ontologies, domain models, or vocabularies, to support semantic interoperability and integration. For Velocity, I will discuss somewhat more recent work on Continuous Semantics, which seeks to use dynamically created models of new objects, concepts, and relationships, using them to better understand new cues in the data that capture rapidly evolving events and situations.

 

Smart Data applications in development at Kno.e.sis come from the domains of personalized health, energy, disaster response, and smart city. I will present examples from a couple of these.


Short Bio:

Amit P. Sheth (http://knoesis.org/amit) is an educator, researcher, and entrepreneur. He is the LexisNexis Eminent Scholar and founder/executive director of the Ohio Center of Excellence in Knowledge-enabled Computing (Kno.e.sis) at Wright State University. Kno.e.sis conducts research in social/sensor/semantic data and Web 3.0 with real-world applications and multidisciplinary solutions for translational research, healthcare and life sciences, cognitive science, material sciences, and others. Kno.e.sis' activities have resulted in Wright State University being recognized as a top organization in the world for World Wide Web research impact. Prof. Sheth is one of the top authors in Computer Science, World Wide Web, and databases (cf: Microsoft Academic Search; Google H-index). His research has led to several commercial products, many real-world applications, and two earlier companies, with two more in early stages of development. One of these was Taalee/Voquette/Semagix, which was likely the first company (founded in 1999) that developed Semantic Web enabled search and analysis, and semantic application development platforms.

 

 

 

 

Keynote 3:


Title:
Addressing Human Bottlenecks in Big Data


Speaker:

Joseph M. Hellerstein, Chancellor's Professor of Computer Science, University of California, Berkeley and Trifacta 


Abstract:

We live in an era when compute is cheap, data is plentiful, and system software is being given away for free.  Today, the critical bottlenecks in data-driven organizations are human bottlenecks, measured in the costs of software developers, IT professionals, and data analysts.  How can computer science remain relevant in this context?  The Big Data ecosystem presents two archetypal settings for answering this question: NoSQL distributed databases, and analytics on Hadoop.

 

In the case of NoSQL, developers are being asked to build parallel programs for global-scale systems that cannot even guarantee the consistency of a single register of memory.  How can this possibly be made to work?  I’ll talk about what we have seen in the wild in user deployments, and what we’ve learned from developers and their design patterns.  Then I’ll present theoretical results—the CALM Theorem—that shed light on what’s possible here, and what requires more expensive tools for coordination on top of the typical NoSQL offerings.  Finally, I will highlight some new approaches to writing and testing software—exemplified by the Bloom language—that can help developers of distributed software avoid expensive coordination when possible, and have the coordination logic synthesized for them automatically when necessary.

 

In the Hadoop context, the key bottlenecks lie with data analysts and data engineers, who are routinely asked to work with data that cannot possibly be loaded into tools for statistical analytics or visualization.  Instead, they have to engage in time-consuming data “wrangling”—to try and figure out what’s in their data, whip it into a rectangular shape for analysis, and figure out how to clean and integrate it for use.  I’ll discuss what we heard talking with data analysts in both academic interviews and commercial engagements.  Then I’ll talk about how techniques from human-computer interaction, machine learning, and database systems can be brought together to address this human bottleneck, as exemplified by our work on various systems including the Data Wrangler project and Trifacta's platform for data transformation.


Short Bio:

Joseph M. Hellerstein is a Chancellor's Professor of Computer Science at the University of California, Berkeley, whose research focuses on data-centric systems and the way they drive computing. A Fellow of the ACM, his work has been recognized via awards including an Alfred P. Sloan Research Fellowship, MIT Technology Review's TR10 and TR100 lists, Fortune Magazine's "Smartest in Tech" list, and three ACM-SIGMOD "Test of Time" awards. In 2012, Joe co-founded Trifacta, Inc. (http://www.trifacta.com/), where he currently serves as Chief Strategy Officer.

 

 

 

 

 

Conference Paper Presentations

 

L 1: Graphs and networks (1)

Regular

BigD210 "4S: Learning to Estimate Pairwise Distances in Large Graphs"
Maria Christoforaki and Torsten Suel

Regular

BigD304 "Geotagging One Hundred Million Twitter Accounts with Total Variation Minimization"
Ryan Compton, David Jurgens, and David Allen

Regular

BigD357 "GRAPHiQL: A Graph Intuitive Query Language for Relational Databases"
Alekh Jindal and Samuel Madden

Regular

BigD395 "PULP: Scalable Multi-Objective Multi-Constraint Partitioning for Small-World Networks"
George Slota, Siva Rajamanickam, and Kamesh Madduri

Regular

BigD436 "Synergistic Partitioning in Multiple Large Scale Social Networks"
Songchang Jin, Jiawei Zhang, Philip S. Yu, Shuqiang Yang, and Aiping Li

 

L 2: Scalable systems

Regular

BigD216 "FusionFS: Toward Supporting Data-Intensive Scientific Applications on Extreme-Scale High-Performance Computing Systems"
Dongfang Zhao, Zhao Zhang, Xiaobing Zhou, Tonglin Li, Ke Wang, Dries Kimpe, Philip Carns, Rob Ross, and Ioan Raicu

Regular

BigD253 "Sparse computation for large-scale data mining"
Dorit S. Hochbaum and Philipp Baumann

Regular

BigD306 "BASIC: an Alternative to BASE for Large-Scale Data Management System"
Lengdong Wu, Li-Yan Yuan, and Jia-Huai You

Regular

BigD336 "Facilitating Twitter Data Analytics: Platform, Language, and Functionality"
Ke Tao, Claudia Hauff, Geert-Jan Houben, Fabian Abel, and Guido Wachsmuth

Regular

BigD444 "Large-scale Distributed Sorting for GPU-based Heterogeneous Supercomputers"
Hideyuki Shamoto, Koichi Shirahata, Aleksandr Drozd, Hitoshi Sato, and Satoshi Matsuoka

 

L 3: Storage 

Regular

BigD271 "BurstMem: A High-Performance Burst Buffer System for Scientific Applications"
Teng Wang, Sarp Oral, Yandong Wang, Brad Settlemyer, Scott Atchley, and Weikuan Yu

Regular

BigD313 "Meeting Predictable Buffer Limits in the Parallel Execution of Event Processing Operators"
Ruben Mayer, Boris Koldehofe, and Kurt Rothermel

Regular

BigD398 "Effective Caching Techniques for Accelerating Pattern Matching Queries"
Arash Fard, Satya Manda, Lakshmish Ramaswamy, and John Miller

Regular

BigD407 "Provenance-Based Object Storage Prediction Scheme for Scientific Big Data Applications"
Dong Dai, Yong Chen, Dries Kimpe, and Rob Ross

Regular

BigD215 "Virtual Chunks: On Supporting Random Accesses to Scientific Data in Compressible Storage Systems"
Dongfang Zhao, Jian Yin, Kan Qiao, and Ioan Raicu

 

L 4: Image processing

Regular

BigD316 "Metadata Extraction and Correction for Large-Scale Traffic Surveillance Videos"
Xiaomeng Zhao, Huadong Ma, Haitao Zhang, Yi Tang, and Guangping Fu

Regular

BigD360 "Structure Recognition from High Resolution Images of Ceramic Composites"
Daniela Ushizima, Talita Perciano, Harinarayan Krishnan, Burlen Loring, Hrishikesh Bale, Dilworth Parkinson, and James Sethian

Regular

BigD379 "Evaluating Density-based Motion for Big Data Visual Analytics"
Ronak Etemadpour, Paul Murray, and Angus Forbes

Regular

BigD421 "Locating Visual Storm Signatures from Satellite Images"
Yu Zhang, Stephen Wistar, Jose A. Piedra-Fernández, Jia Li, Michael Steinberg, and James Z. Wang

 

L 5: Data streams and time series

Regular

BigD234 "Distributed Adaptive Model Rules for Mining Big Data Streams"
Anh Thu Vu, Gianmarco De Francisci Morales, Joao Gama, and Albert Bifet

Regular

BigD382 "Interpretable Streaming Regression Models with Local Performance Guarantees"
Ulf Johansson, Cecilia Sönströd, and Henrik Linusson

Regular

BigD451 "Performance Modeling in CUDA Streams - A Means for High-Throughput Data Processing"
Hao Li, Di Yu, Anand Kumar, and Yicheng Tu

Regular

BigD445 "TRISTAN: Real-Time Analytics on Massive Time Series Using Sparse Dictionary Compression"
Alice Marascu, Pascal Pompey, Eric Bouillet, Michael Wurst, Olivier Verscheure, Martin Grund, and Philippe Cudre-Mauroux

 

L 6: Regression and machine learning 

Regular

BigD402 "Predicting Glaucoma Progression using Multi-task Learning with Heterogeneous Features"
Shigeru Maya, Kai Morino, and Kenji Yamanishi

Regular

BigD283 "Examination of Data, Rule Generation and Detection of Phishing URLs using Online Logistic Regression"
Mohammed Nazim Feroz and Susan Mengel

Regular

BigD454 "Large-scale Logistic Regression and Linear Support Vector Machines Using Spark"
Chieh-Yen Lin, Cheng-Hao Tsai, Ching-Pei Lee, and Chih-Jen Lin

Regular

BigD465 "BayesWipe: A Multimodal System for Data Cleaning and Consistent Query Answering on Structured Data"
Sushovan De, Yuheng Hu, Yi Chen, and Subbarao Kambhampati

 

L 7: Distributed systems 

Regular

BigD318 "Partial Rollback-based Scheduling on In-memory Transactional Data Grids"
Junwhan Kim

Regular

BigD337 "Main Memory Evaluation of Recursive Queries on Multicore Machines"
Mohan Yang and Carlo Zaniolo

Regular

BigD391 "Distributed Algorithms for k-truss Decomposition"
Ming-Syan Chen, Pei-Ling Chen, and Chung-Kuang Chou

Regular

BigD434 "Parallel Breadth First Search on GPU Clusters"
Zhisong Fu, Harish Dasari, Martin Berzins, and Bryan Thompson

Regular

BigD471 "Optimizing Load Balancing and Data-Locality with Data-aware Scheduling"
Ke Wang, Xiaobing Zhou, Tonglin Li, Dongfang Zhao, Michael Lang, and Ioan Raicu

 

L 8: Visualization/bioinformatics

Regular

BigD258 "Topic Similarity Networks: Visual Analytics for Large Document Sets"
Arun Maiya

Regular

BigD303 "Web-based Visual Analytics for Extreme Scale Climate Science"
Chad Steed, Katherine Evans, John Harney, Brian Jewell, Galen Shipman, Brian Smith, Peter Thornton, and Dean Williams

Regular

BigD338 "Visual Fusion of Mega-City Big Data: An Application to Traffic and Tweets Data Analysis of Metro Passengers"
Masahiko Itoh, Daisaku Yokoyama, Masashi Toyoda, Yoshimitsu Tomita, Satoshi Kawamura, and Masaru Kitsuregawa

Regular

BigD277 "Random Projection Based Clustering for Population Genomics"
Sotiris Tasoulis, Lu Cheng, Niko Välimäki, Nicholas Croucher, Simon Harris, William Hanage, Teemu Roos, and Jukka Corander

Regular

BigD460 "Identification of SNP Interactions Using Data-Parallel Primitives on GPUs"
Can Altinigneli, Bettina Konte, Dan Rujescu, Christian Boehm, and Claudia Plant

 

L 9: Cloud computing 

Regular

BigD380 "Combining Hadoop and GPU to Preprocess Large Affymetrix Microarray Data"
Sufeng Niu, Guangyu Yang, Nilim Sarma, Melissa Smith, Pradip Srimani, and Feng Luo

Regular

BigD423 "Detecting and Identifying System Changes in the Cloud via Discovery by Example"
Hao Chen, Sastry Duri, Vasanth Bala, Nilton Bila, Canturk Isci, and Ayse Coskun

Regular

BigD426 "PigOut: Making Multiple Hadoop Clusters to Work Together"
Kyungho Jeon, Sharath Chandrashekhara, Feng Shen, Shikhar Mehra, Oliver Kennedy, and Steven Ko

Regular

BigD432 "Accurate and Efficient Selection of the Best Consumption Prediction Method in Smart Grids"
Marc Frincu, Charalampos Chelmis, Muhammad Noor, and Viktor Prasanna

Regular

BigD244 "E-Sketch: Gathering Large-scale Energy Consumption Data Based on Consumption Patterns"
Zhichuan Huang, Hongyao Luo, David Skoda, Ting Zhu, and Yu Gu

 

L 10: Privacy and security 

Regular

BigD260 "Hierarchical Management of Large-Scale Malware Data"
Lee Kellogg, Brian Ruttenberg, Alison O'Connor, Michael Howard, and Avi Pfeffer

Regular

BigD294 "MR-TRIAGE: Scalable Multi-Criteria Clustering for Big Data Security Intelligence Applications"
Yun Shen and Olivier Thonnard

Regular

BigD383 "Using Data Content to Assist Access Control for Large-Scale Content-Centric Databases"
Wenrong Zeng, Yuhao Yang, and Bo Luo

 

 

 

L 11: Graphs and networks (2)

Regular

BigD301 "Efficient Breadth-First Search on a Heterogeneous Processor"
Mayank Daga, Mark Nutter, and Mitesh Meswani

Regular

BigD419 "Clique Guided Community Detection"
Diana Palsetia, Mostofa Patwary, William Hendrix, Ankit Agrawal, and Alok Choudhary

Regular

BigD441 "Increasing the Veracity of Event Detection on Social Media Networks Through User Trust Modeling"
Todd Bodnar, Conrad Tucker, Kenneth Hopkinson, and Sven Bilén

Regular

BigD455 "NVM-based Hybrid BFS with Memory Efficient Data Structure"
Keita Iwabuchi, Hitoshi Sato, Yuichiro Yasui, Katsuki Fujisawa, and Satoshi Matsuoka

 

 

 

 

I&G: Industry & Government (1)

Regular

N211

Spatial Computations over Terabyte-Sized Images on Hadoop Platforms

Peter Bajcsy, Phuong Nguyen, Antoine Vandecreme, and Mary Brady

Regular

N223

Astro: A Predictive Model for Anomaly Detection and Feedback-based Scheduling on Hadoop

Chaitali Gupta, Mayank Bansal, Tzu-Cheng Chuang, Ranjan Sinha, and Sami Ben-romdhane

Regular

N222

ALOJA: a Systematic Study of Hadoop Deployment Variables to Enable Automated Characterization of Cost-Effectiveness

Nicolas Poggi, David Carrera, Aaron Call, Rob Reinauer, Nikola Vujic, Daron Green, José Blakeley, Sergio Mendoza, Yolanda Becerra, Jordi Torres, Eduard Ayguadé, and Jesús Labarta

Regular

N217

Lightweight Approximate Top-k for Distributed Settings

Vinay Deolalikar and Kave Eshghi

Regular

N230

Recommending Similar Items in Large-scale Online Marketplaces

Jayasimha Reddy Katukuri, Tolga Konik, Rajyashree Mukherjee, and Santanu Kolay

 

I&G: Industry & Government (2)

Regular

N216

Crowdsourced Query Augmentation through Semantic Discovery of Domain-specific Jargon

Khalifeh Aljadda, Mohammed Korayem, Trey Grainger, and Chris Russell

Regular

N224

Heterogeneous Stream Processing for Disaster Detection and Alarming

Francois Schnitzler, Thomas Liebig, Shie Mannor, Gustavo Souto, Sebastian Bothe, and Hendrik Stange

Regular

N201

Recall Estimation for Rare Topic Retrieval from Large Corpuses

Praveen Bommannavar, Alek Kolcz, and Anand Rajaraman

Regular

N236

Identifying top Chinese network buzzwords from social media big data set based on time-distribution features

Yongli Tang, Tingting He, Bo Li, and Xiaohua Hu

Regular

N218

Query Revision During Cluster Based Search on Large Unstructured Corpora

Vinay Deolalikar

 

I&G: Industry & Government (3)

Regular

N213

A Scalable and Efficient Community Detection Algorithm

Dhaval C. Lunagariya, Somayajulu D.V.L.N., and Radha Krishna P.

Regular

N202

Future Directions of Humans in Big Data Research

Celeste Lyn Paul, Chris Argenta, William Elm, and Alex Endert

Regular

N228

An Initial Study of Predictive Machine Learning Analytics on Large Volumes of Historical Data for Power System Applications

Jiang Zheng and Aldo Dagnino

Regular

N207

In Unity There is Strength: Showcasing a Unified Big Data Platform with MapReduce Over both Object and File Storage

Renu Tewari, Dean Hildebrand, and Rui Zhang

Regular

N203

Bridging High Velocity and High Volume Industrial Big Data Through Distributed In-Memory Storage & Analytics

Jenny Weisenberg Williams, Kareem Aggour, John Interrante, Justin McHugh, and Eric Pool

 

I&G: Industry & Government (4)

Regular

N232

Big Data Predictive Analytics for Proactive Semiconductor Equipment Maintenance

Sathyan Munirathinam

Regular

N219

Automating Data Integration with HiperFuse

Eric Huang, Andres Quiroz, and Luca Ceriani

Regular

N215

Explore Efficient Data Organization for Large Scale Graph Analytics and Storage

Yinglong Xia, Ilie Tanasa, Lifeng Nai, Wei Tan, Yanbin Liu, Jason Crawford, and Ching-Yung Lin

Regular

N209

Increasing the Accessibility to Big Data Systems via a Common Services API

Rohan Malcolm, Cherrelle Morrison, Tyrone Grandison, Sean Thorpe, Kimron Christie, Akim Wallace, Damian Green, Julian Jarrett, and Arnett Campbell

 

 

S 1: Visual analytics, time, and space

Short

BigD204 "The Role of Visual Analysis in the Regulation of Electronic Order Book Markets"
Mark Paddrik, Richard Haynes, Andrew Todd, William Scherer, and Peter Beling

Short

BigD217 "Preferences over Time"
Noriaki Kawamae

Short

BigD227 "Online Temporal-Spatial Analysis for Detection of Critical Events in Cyber-Physical Systems"
Magnus Almgren, Olaf Landsiedel, Marina Papatriantafilou, and Zhang Fu

Short

BigD252 "In-Situ Visualization and Computational Steering for Large-Scale Simulation of Turbulent Flows in Complex Geometries"
Hong Yi, Michel Rasquin, Jun Fang, and Igor Bolotnov

Short

BigD288 "Large-Scale Network Traffic Monitoring with DBStream, a System for Rolling Big Data Analysis"
Arian Bär, Alessandro Finamore, Pedro Casas, Lukasz Golab, and Marco Mellia

Short

BigD387 "Immersive and collaborative data visualization using virtual reality platforms"
Ciro Donalek, S.G. Djorgovski, Scott Davidoff, Alex Cioc, Anwell Wang, Giuseppe Longo, Jeffrey S. Norris, Jerry Zhang, Elizabeth Lawler, and Stacy Yeh

Short

BigD411 "On Scaling Time Dependent Shortest Path Computations for Dynamic Traffic Assignment"
Amit Gupta, Weijia Xu, Kenneth Perrine, Dennis Bell, and Natalia Ruiz-Juri

Short

BigD413 "High Volume Geospatial Mapping for Internet-of-Vehicle Solutions with In-Memory Map-Reduce Processing"
Tao Zhong, Kshitij Doshi, Gang Deng, Xiaoming Yang, and Hegao Zhang

Short

BigD431 "The Adaptive Projection Forest: Using Adjustable Exclusion and Parallelism in Metric Space Indexes"
Lee Thompson, Weijia Xu, and Daniel Miranker

Short

BigD440 "Low Complexity Sensing for Big Spatio-Temporal Data"
Dongeun Lee and Jaesik Choi

 

S 2: Cloud computing and systems (1)

Short

BigD242 "Scheduling MapReduce Tasks on Virtual MapReduce Clusters from a Tenant’s Perspective"
Jia-Chun Lin, Ming-Chang Lee, and Ramin Yahyapour

Short

BigD311 "Minimizing Data Movement through Query Transformation"
Patrick Leyshock, David Maier, and Kristin Tufte

Short

BigD364 "Automated Workload-aware Elasticity of NoSQL Clusters in the Cloud"
Evie Kassela, Christina Boumpouka, Ioannis Konstantinou, and Nectarios Koziris

Short

BigD384 "Multilevel Partitioning of Large Unstructured Grids"
Oyindamola Akande and Philip Rhodes

Short

BigD392 "On the Performance of MapReduce: A Stochastic Approach"
Sarker Ahmed and Dmitri Loguinov

Short

BigD428 "VENU: Orchestrating SSDs in Hadoop Storage"
Krish K.R., M. Safdar Iqbal, and Ali Butt

Short

BigD438 "In-Memory I/O and Replication for HDFS with Memcached: Early Experiences"
Nusrat Islam, Xiaoyi Lu, Md. Rahman, Raghunath Rajachandrasekar, and Dhabaleswar Panda

Short