IEEE BigData 2014 Program Schedule
Washington DC, USA
Oct 27-30, 2014
Program
|
|
• October 27, 2014
• October 28, 2014
• October 29, 2014
• October 30, 2014
|
|
|
|
Keynote Lecture
Main conference regular paper: 25 minutes (about 20 minutes for talk and
5 minutes for Q and A)
Main conference short paper: 15 minutes (about 11 minutes for talk and 4
minutes for Q and A)
|
|
|
|
26-Oct
|
15:30-20:00
|
Registration
|
Venue:
|
Ballroom Foyer and Ballroom Coatroom
|
27-Oct
|
07:30-18:00
|
Registration
|
Venue:
|
Ballroom Foyer and Ballroom Coatroom
|
10:00-10:20 and 15:20-15:40
|
Coffee Break at Meeting Room Foyer
|
08:00-18:30
|
|
Sessions
|
Session Chair
|
Venue
|
Special session I
|
From
Data to Insight: Big Data and Analytics for Smart Manufacturing Systems
|
Sudarsan Rachuri, Ronay Ak
|
JUDICIARY SUITE
|
Full day event
|
Doctoral Consortium
|
Jingrui He
|
Patuxent
|
Full-day workshop
|
#2
|
The
2nd Workshop on Scalable Machine Learning: Theory and Applications
|
Zenglin Xu
|
Waterford
|
#3
|
1st International Workshop on High
Performance Big Graph Data Management, Analysis, and Mining
|
Fengguang
Song
|
Lalique
|
#8
|
The
2nd International Workshop of BigData in Bioinformatics and Healthcare
Informatics
|
Jun Huan
|
Haverford
|
#13
|
First
Hands-On Workshop on Leveraging High Performance Computing Resources
for Managing Large Datasets
|
Ritu Arora
|
Baccarat
|
#18
|
Large
Scale Data Analytics in Transportation and Railway Infrastructure
|
Nii Attoh-Okine
|
CARTIER/TIFFANY
|
#20
|
Big
Humanities Data
|
Mark Hedges
|
CABINET
SUITE
|
#22
|
IEEE
NIST Big Data PWG Workshop on Big Data: Challenges, Practices and
Technologies
|
Nancy Grady
|
Embassy
|
08:00-12:00
|
|
Sessions
|
Session Chair
|
Venue
|
Morning workshop
|
#1
|
Scholarly Big Data: Challenges &
Issues
|
Ingemar
J. Cox
|
Diplomat
|
#15
|
Workshop on Advances in Software and
Hardware for Big Data to Knowledge Discovery (ASH)
|
Weijia
Xu
|
Ambassador
|
#17
|
Big Data in Computational Epidemiology
|
Jiangzhuo
Chen
|
Severn
|
#19
|
2nd Workshop on Scalable Cloud Data
Management
|
Felix
Gessert
|
Susquehanna
|
|
#21
|
Complexity for Big Data
|
Guozhu
Dong
|
Potomac
|
13:30-18:30
|
|
Sessions
|
Session Chair
|
Venue
|
Special session II
|
Big Data Representation and Processing in Data Science
|
T.Y. Lin
|
Susquehanna
|
Tutorial
|
Big Data Stream Mining
|
Albert Bifet
|
Severn
|
Afternoon
workshop
|
#11
|
CASK-14 : 1st International Workshop on
Collaborative methodologies to Accelerate Scientific Knowledge
discovery in big data
|
Chen
Jin
|
Diplomat
|
#16
|
IEEE Big Data Workshop on Semantics
for Big Data on the Internet of Things (SemBIoT 2014)
|
Kemafor
Ogan
|
Ambassador
|
28-Oct
|
|
07:30-18:00
|
Registration
|
Venue:
|
Ballroom
Foyer and Ballroom Coatroom
|
08:30-08:45
|
Opening and Welcoming Speech
Conference Co-Chairs:
Charu Aggarwal, Nick Cercone, Vasant Honavar
Program Co-Chairs:
Jimmy Lin, Jian Pei
Industry Program Co-Chairs:
Wo Chang, Raghunath Nambiar
BigData Steering Committee Chair:
Xiaohua Tony Hu (Drexel University)
|
Venue:
|
CRYSTAL BALLROOM
|
08:45-09:45
|
Session Chair: Jian Pei
Keynote Speech 1: Never-Ending Language Learning
Tom Mitchell - E. Fredkin University
Professor, Machine Learning Department, Carnegie Mellon University
|
Venue:
|
CRYSTAL BALLROOM
|
09:45-10:00
|
Coffee Break at Meeting Room Foyer
Poster session
setup and display: Meeting Room Foyer
|
10:00-12:30
|
S 1
Visual analytics, time, and space
|
S 2
Cloud computing and systems (1)
|
S 3
Graphs and networks
|
Tutorial
Big ML Software for Modern ML
Algorithms
|
Session Chair
|
Arash Jalal Zadeh Fard
|
Amy Apon
|
Luke Huan
|
Qirong Ho, Eric Xing
|
Venue
|
CABINET SUITE
|
DIPLOMAT/AMBASSADOR
|
JUDICIARY SUITE
|
EMBASSY/PATUXENT
|
12:30-14:00
|
Lunch provided by the conference at BALLROOM FOYER (Seating inside the Crystal Ballroom)
Poster session setup and display: Meeting Room Foyer
|
|
|
14:00-16:05
|
L 1
Graphs and networks (1)
|
L 2
Scalable systems
|
L 3
Storage
|
I&G 1
Industry & Government
|
Session
Chair
|
Conrad
S. Tucker
|
Weijia
Xu
|
Steven
Y. Ko
|
Wo
Chang
|
Venue:
|
CABINET SUITE
|
DIPLOMAT/AMBASSADOR
|
JUDICIARY SUITE
|
EMBASSY/PATUXENT
|
16:05-16:20
|
Coffee Break at Meeting
Room Foyer
|
16:20-18:00
|
L 4
Image processing
|
L 5
Data streams and time series
|
L 6
Regression and machine learning
|
I&G 2
Industry & Government
|
Session
Chair
|
Lin-Ching
Chang
|
Bo
Luo
|
Jiang
Zheng
|
Raghunath
Nambiar
|
Venue:
|
CABINET
SUITE
|
DIPLOMAT/AMBASSADOR
|
JUDICIARY
SUITE
|
EMBASSY/PATUXENT
|
19:00-20:30
|
Banquet
|
Venue:
|
CRYSTAL BALLROOM
|
29-Oct
|
|
07:30-18:00
|
Registration
|
Venue:
|
Ballroom Foyer, Ballroom Coatroom
|
08:30-09:30
|
Session Chair: Vasant Honavar
Keynote Speech 2: Smart Data - How you and I will exploit Big Data for
personalized digital health and many other activities
Amit Sheth, LexisNexis Ohio Eminent Scholar, Kno.e.sis - Wright State
University
|
Venue:
|
CRYSTAL BALLROOM
|
09:30-10:00
|
Coffee Break at Meeting
Room Foyer
Poster session setup and display: Meeting Room Foyer
|
10:00-12:30
|
Panel with Program Directors: Dr. Chaitanya Baru (NSF), Dr. Yuan Liu (NIH),
Dr. David Kuehn (DoT), Dr. Tsengdar Lee (NASA), Dr. Sudarsan Rachuri (NIST),
Mr. Matti Vakkuri (DIGILE):
Big Data Challenges and Opportunities
|
Tutorial
Large-scale
Heterogeneous Learning in Big Data Analytics
|
Session Chair
|
Xiaohua Tony Hu
|
Jun Huan
|
Venue:
|
CRYSTAL
BALLROOM
|
OLD
GEORGETOWN
|
12:30-14:00
|
Lunch provided by the conference at BALLROOM FOYER (Seating inside the Crystal Ballroom)
|
|
Poster session setup and display: Meeting Room Foyer
|
14:00-16:05
|
L 7
Distributed
systems
|
L 8
Visualization/bioinformatics
|
L 9
Cloud
computing
|
I&G 3
Industry
& Government
|
Session Chair
|
Yicheng Tu
|
Saumyadipta Pyne
|
Ada Fu
|
Raghunath Nambiar
|
Venue:
|
OLD
GEORGETOWN
|
CABINET
SUITE
|
JUDICIARY
SUITE
|
DIPLOMAT/AMBASSADOR
|
16:05-16:20
|
Coffee Break at Meeting
Room Foyer
|
16:20-18:00
|
L 10
Privacy and
security
|
L 11
Graphs and
networks (2)
|
I&G 4
Industry
& Government
|
Tutorial
Big Data
Benchmarking
|
Session
Chair
|
Christoph
Schommer
|
Hao Howie
Huang
|
Wo Chang
|
Chaitan
Baru, Tilmann Rabl
|
Venue:
|
OLD
GEORGETOWN
|
CABINET
SUITE
|
DIPLOMAT/AMBASSADOR
|
JUDICIARY
SUITE
|
30-Oct
|
|
07:30-18:00
|
Registration
|
Venue:
|
Ballroom
Foyer, Ballroom Coatroom
|
08:30-09:30
|
Session Chair: Jimmy Lin
Keynote Speech 3: Addressing Human Bottlenecks in Big Data
Joseph M. Hellerstein, Chancellor's Professor of Computer Science, University of
California, Berkeley and Trifacta
|
Venue:
|
CRYSTAL
BALLROOM
|
09:30-10:00
|
Coffee Break
at Meeting Room Foyer
Poster session setup and display: Meeting Room Foyer
|
10:00-12:30
|
S 4
Cloud
computing and systems (2)
|
S 5
Applications
|
S 6
Data mining
and learning
|
Session
Chair
|
Feng Luo
|
Mathias
Johanson
|
Xiaohua Tony
Hu
|
Venue:
|
JUDICIARY
SUITE
|
OLD
GEORGETOWN
|
CABINET
SUITE
|
08:00-12:00
|
|
Sessions
|
Session Chair
|
Venue
|
Morning workshop
|
#6
|
The Second Workshop on Distributed
Storage Systems and Coding for Big Data
|
Bing
Zhu
|
Diplomat
|
#7
|
First IEEE International Workshop on
Big Data Security and Privacy (BDSP 2014)
|
Tyrone
W A Grandison
|
Ambassador
|
13:30-18:30
|
|
Sessions
|
Session Chair
|
Venue
|
Afternoon
workshop
|
#9
|
Solar Astronomy Big Data (SABiD) – 1st
Workshop on Management, Search and Mining of Massive Repositories of
Solar Astronomy Data
|
Rafal Angryk
|
Diplomat
|
Keynote 1:
Title: Never-Ending Language Learning
Speaker:
Tom Mitchell - E. Fredkin University Professor, Machine Learning
Department, Carnegie Mellon University
Abstract:
We will never really understand learning until we can build machines that
learn many different things, over years, and become better learners over
time. We describe our research to build a Never-Ending Language Learner
(NELL) that runs 24 hours per day, forever, learning to read the web.
Each day NELL extracts (reads) more facts from the web, into its
growing knowledge base of beliefs. Each day NELL also learns to read
better than the day before. NELL has been running 24 hours/day for over
four years now. The result so far is a collection of 70 million
interconnected beliefs (e.g., servedWith(coffee, applePie)) that NELL is
considering at different levels of confidence, along with millions of
learned phrasings, morphological features, and web page structures that
NELL uses to extract beliefs from the web. NELL is also learning to
reason over its extracted knowledge, and to automatically extend its
ontology. Track NELL's progress at http://rtw.ml.cmu.edu, or follow it
on Twitter at @CMUNELL.
Short Bio:
Tom M. Mitchell founded and chairs the Machine Learning Department at
Carnegie Mellon University, where he is the E. Fredkin University
Professor. His research uses machine learning to develop computers that
are learning to read the web, and uses brain imaging to study how the
human brain understands what it reads. Mitchell is a member of the U.S.
National Academy of Engineering, a Fellow of the American Association
for the Advancement of Science (AAAS), and a Fellow and Past President
of the Association for the Advancement of Artificial Intelligence
(AAAI). He believes the field of machine learning will be the fastest
growing branch of computer science during the 21st century.
Keynote 2:
Title: Smart Data - How you and I will exploit Big Data for
personalized digital health and many other activities
Speaker:
Amit Sheth, LexisNexis Ohio Eminent Scholar, Kno.e.sis - Wright State
University
Abstract:
Big Data has captured a lot of interest in industry, with the emphasis on
the challenges of the four Vs of Big Data: Volume, Variety, Velocity,
and Veracity, and their applications to drive value for
businesses. Recently, there has been
rapid growth in situations where a big data challenge relates to making
individually relevant decisions.
A key example is personalized digital health, which relates to
making better decisions about our health, fitness, and well-being. Consider, for instance, understanding
the reasons for and avoiding an asthma attack based on Big Data in the
form of personal health signals (e.g., physiological data measured by
devices/sensors or Internet of Things around humans, on the humans, and
inside/within the humans), public health signals (e.g., information
coming from the healthcare system such as hospital admissions), and
population health signals (such as Tweets by people related to asthma
occurrences and allergens, Web services providing pollen and smog
information). However, no
individual has the ability to process all these data without the help
of appropriate technology, and each human has a different set of relevant
data!
In this talk, I will describe Smart Data that is realized by extracting
value from Big Data, to benefit not just large companies but each
individual. If my child is an asthma patient, for all the data relevant
to my child with the four V-challenges, what I care about is simply,
“How is her current health, and what is the risk of having an asthma
attack in her current situation (now and today), especially if that
risk has changed?” As I will show, Smart Data that gives such
personalized and actionable information will need to utilize metadata,
use domain specific knowledge, employ semantics and intelligent
processing, and go beyond traditional reliance on ML and NLP. I will motivate the need for a
synergistic combination of techniques similar to the close interworking
of the top brain and the bottom brain in the cognitive models.
For harnessing volume, I will discuss the concept of Semantic Perception,
that is, how to convert massive amounts of data into information,
meaning, and insight useful for human decision-making. For dealing with
Variety, I will discuss experience in using agreement represented in
the form of ontologies, domain models, or vocabularies, to support
semantic interoperability and integration. For Velocity, I will discuss somewhat
more recent work on Continuous Semantics, which seeks to use
dynamically created models of new objects, concepts, and relationships,
using them to better understand new cues in the data that capture
rapidly evolving events and situations.
Smart Data applications in development at Kno.e.sis come from the domains of
personalized health, energy, disaster response, and smart city. I will
present examples from a couple of these.
Short Bio:
Amit P. Sheth (http://knoesis.org/amit) is an educator, researcher, and
entrepreneur. He is the LexisNexis Eminent Scholar and
founder/executive director of the Ohio Center of Excellence in Knowledge-enabled
Computing (Kno.e.sis) at Wright State University. Kno.e.sis conducts
research in social/sensor/semantic data and Web 3.0 with real-world
applications and multidisciplinary solutions for translational
research, healthcare and life sciences, cognitive science, material
sciences, and others. Kno.e.sis' activities have resulted in Wright
State University being recognized as a top organization in the world for
research impact on the World Wide Web. Prof. Sheth is one of the top authors in
Computer Science, World Wide Web, and databases (cf. Microsoft Academic
Search; Google H-index). His research has led to several commercial
products, many real-world applications, and two earlier companies with
two more in early stages of development. One of these was Taalee/Voquette/Semagix,
which was likely the first company (founded in 1999) that developed
Semantic Web enabled search and analysis, and semantic application
development platforms.
Keynote 3:
Title: Addressing Human Bottlenecks in Big Data
Speaker:
Joseph M. Hellerstein, Chancellor's Professor of Computer Science, University
of California, Berkeley and Trifacta
Abstract:
We live in an era when compute is cheap, data is plentiful, and system
software is being given away for free.
Today, the critical bottlenecks in data-driven organizations are
human bottlenecks, measured in the costs of software developers, IT
professionals, and data analysts.
How can computer science remain relevant in this context? The Big Data ecosystem presents two
archetypal settings for answering this question: NoSQL distributed
databases, and analytics on Hadoop.
In the case of NoSQL, developers are being asked to build parallel
programs for global-scale systems that cannot even guarantee the
consistency of a single register of memory. How can this possibly be made to
work? I’ll talk about what we
have seen in the wild in user deployments, and what we’ve learned from
developers and their design patterns.
Then I’ll present theoretical results—the CALM Theorem—that shed
light on what’s possible here, and what requires more expensive tools
for coordination on top of the typical NoSQL offerings. Finally, I will highlight some new
approaches to writing and testing software—exemplified by the Bloom
language—that can help developers of distributed software avoid
expensive coordination when possible, and have the coordination logic
synthesized for them automatically when necessary.
In the Hadoop context, the key bottlenecks lie with data analysts and data
engineers, who are routinely asked to work with data that cannot
possibly be loaded into tools for statistical analytics or
visualization. Instead, they
have to engage in time-consuming data “wrangling”—to try and figure out
what’s in their data, whip it into a rectangular shape for analysis,
and figure out how to clean and integrate it for use. I’ll discuss what we heard talking
with data analysts in both academic interviews and commercial
engagements. Then I’ll talk
about how techniques from human-computer interaction, machine learning,
and database systems can be brought together to address this human
bottleneck, as exemplified by our work on various systems including the
Data Wrangler project and Trifacta's platform for data transformation.
Short Bio:
Joseph M. Hellerstein is a Chancellor's Professor of
Computer Science at the University of California, Berkeley, whose
research focuses on data-centric systems and the way they drive
computing. A Fellow of the ACM, his work has been recognized via awards
including an Alfred P. Sloan Research Fellowship, MIT Technology
Review's TR10 and TR100 lists, Fortune Magazine's "Smartest in
Tech" list, and three ACM-SIGMOD "Test of Time"
awards. In 2012, Joe co-founded
Trifacta, Inc (http://www.trifacta.com/),
where he currently serves as Chief Strategy Officer.
|
Conference Paper Presentations
|
L 1:
Graphs and networks (1)
|
Regular
|
BigD210 "4S:
Learning to Estimate Pairwise Distances in Large Graphs"
Maria Christoforaki and Torsten Suel
|
Regular
|
BigD304 "Geotagging
One Hundred Million Twitter Accounts with Total Variation
Minimization"
Ryan Compton, David Jurgens, and David Allen
|
Regular
|
BigD357
"GRAPHiQL: A Graph Intuitive Query Language for Relational
Databases"
Alekh Jindal and Samuel Madden
|
Regular
|
BigD395
"PULP: Scalable Multi-Objective Multi-Constraint Partitioning for
Small-World Networks"
George Slota, Siva Rajamanickam, and Kamesh Madduri
|
Regular
|
BigD436
"Synergistic Partitioning in Multiple Large Scale Social
Networks"
Songchang Jin, Jiawei Zhang, Philip S. Yu, Shuqiang Yang, and Aiping Li
|
L 2:
Scalable systems
|
Regular
|
BigD216
"FusionFS: Toward Supporting Data-Intensive Scientific
Applications on Extreme-Scale High-Performance Computing Systems"
Dongfang Zhao, Zhao Zhang, Xiaobing Zhou, Tonglin Li, Ke Wang, Dries
Kimpe, Philip Carns, Rob Ross, and Ioan Raicu
|
Regular
|
BigD253
"Sparse computation for large-scale data mining"
Dorit S. Hochbaum and Philipp Baumann
|
Regular
|
BigD306 "BASIC:
an Alternative to BASE for Large-Scale Data Management System"
Lengdong Wu, Li-Yan Yuan, and Jia-Huai You
|
Regular
|
BigD336
"Facilitating Twitter Data Analytics: Platform, Language, and
Functionality"
Ke Tao, Claudia Hauff, Geert-Jan Houben, Fabian Abel, and Guido
Wachsmuth
|
Regular
|
BigD444
"Large-scale Distributed Sorting for GPU-based Heterogeneous
Supercomputers"
Hideyuki Shamoto, Koichi Shirahata, Aleksandr Drozd, Hitoshi Sato, and
Satoshi Matsuoka
|
L 3:
Storage
|
Regular
|
BigD271
"BurstMem: A High-Performance Burst Buffer System for Scientific
Applications"
Teng Wang, Sarp Oral, Yandong Wang, Brad Settlemyer, Scott Atchley, and
Weikuan Yu
|
Regular
|
BigD313
"Meeting Predictable Buffer Limits in the Parallel Execution of
Event Processing Operators"
Ruben Mayer, Boris Koldehofe, and Kurt Rothermel
|
Regular
|
BigD398
"Effective Caching Techniques for Accelerating Pattern Matching Queries"
Arash Fard, Satya Manda, Lakshmish Ramaswamy, and John Miller
|
Regular
|
BigD407
"Provenance-Based Object Storage Prediction Scheme for Scientific
Big Data Applications"
Dong Dai, Yong Chen, Dries Kimpe, and Rob Ross
|
Regular
|
BigD215
"Virtual Chunks: On Supporting Random Accesses to Scientific Data in
Compressible Storage Systems"
Dongfang Zhao, Jian Yin, Kan Qiao, and Ioan Raicu
|
L 4: Image
processing
|
Regular
|
BigD316
"Metadata Extraction and Correction for Large-Scale Traffic Surveillance
Videos"
Xiaomeng Zhao, Huadong Ma, Haitao Zhang, Yi Tang, and Guangping Fu
|
Regular
|
BigD360
"Structure Recognition from High Resolution Images of Ceramic Composites"
Daniela Ushizima, Talita Perciano, Harinarayan Krishnan, Burlen Loring,
Hrishikesh Bale, Dilworth Parkinson, and James Sethian
|
Regular
|
BigD379
"Evaluating Density-based Motion for Big Data Visual Analytics"
Ronak Etemadpour, Paul Murray, and Angus Forbes
|
Regular
|
BigD421
"Locating Visual Storm Signatures from Satellite Images"
Yu Zhang, Stephen Wistar, Jose A. Piedra-Fernández, Jia Li, Michael
Steinberg, and James Z. Wang
|
L 5:
Data streams and time series
|
Regular
|
BigD234
"Distributed Adaptive Model Rules for Mining Big Data
Streams"
Anh Thu Vu, Gianmarco De Francisci Morales, Joao Gama, and Albert Bifet
|
Regular
|
BigD382
"Interpretable Streaming Regression Models with Local Performance
Guarantees"
Ulf Johansson, Cecilia Sönströd, and Henrik Linusson
|
Regular
|
BigD451
"Performance Modeling in CUDA Streams - A Means for
High-Throughput Data Processing"
Hao Li, Di Yu, Anand Kumar, and Yicheng Tu
|
Regular
|
BigD445
"TRISTAN: Real-Time Analytics on Massive Time Series Using Sparse
Dictionary Compression"
Alice Marascu, Pascal Pompey, Eric Bouillet, Michael Wurst, Olivier
Verscheure, Martin Grund, and Philippe Cudre-Mauroux
|
L 6:
Regression and machine learning
|
Regular
|
BigD402
"Predicting Glaucoma Progression using Multi-task Learning with Heterogeneous
Features"
Shigeru Maya, Kai Morino, and Kenji Yamanishi
|
Regular
|
BigD283
"Examination of Data, Rule Generation and Detection of Phishing
URLs using Online Logistic Regression"
Mohammed Nazim Feroz and Susan Mengel
|
Regular
|
BigD454
"Large-scale Logistic Regression and Linear Support Vector
Machines Using Spark"
Chieh-Yen Lin, Cheng-Hao Tsai, Ching-Pei Lee, and Chih-Jen Lin
|
Regular
|
BigD465
"BayesWipe: A Multimodal System for Data Cleaning and Consistent Query
Answering on Structured Data"
Sushovan De, Yuheng Hu, Yi Chen, and Subbarao Kambhampati
|
L 7:
Distributed systems
|
Regular
|
BigD318
"Partial Rollback-based Scheduling on In-memory Transactional Data
Grids"
Junwhan Kim
|
Regular
|
BigD337 "Main
Memory Evaluation of Recursive Queries on Multicore Machines"
Mohan Yang and Carlo Zaniolo
|
Regular
|
BigD391
"Distributed Algorithms for k-truss Decomposition"
Ming-Syan Chen, Pei-Ling Chen, and Chung-Kuang Chou
|
Regular
|
BigD434 "Parallel
Breadth First Search on GPU Clusters"
Zhisong Fu, Harish Dasari, Martin Berzins, and Bryan Thompson
|
Regular
|
BigD471
"Optimizing Load Balancing and Data-Locality with Data-aware
Scheduling"
Ke Wang, Xiaobing Zhou, Tonglin Li, Dongfang Zhao, Michael Lang, and
Ioan Raicu
|
L 8: Visualization/bioinformatics
|
Regular
|
BigD258
"Topic Similarity Networks: Visual Analytics for Large Document Sets"
Arun Maiya
|
Regular
|
BigD303
"Web-based Visual Analytics for Extreme Scale Climate Science"
Chad Steed, Katherine Evans, John Harney, Brian Jewell, Galen Shipman,
Brian Smith, Peter Thornton, and Dean Williams
|
Regular
|
BigD338
"Visual Fusion of Mega-City Big Data:
An Application to Traffic and Tweets Data Analysis of Metro Passengers"
Masahiko Itoh, Daisaku Yokoyama, Masashi Toyoda, Yoshimitsu Tomita,
Satoshi Kawamura, and Masaru Kitsuregawa
|
Regular
|
BigD277
"Random Projection Based Clustering for Population Genomics"
Sotiris Tasoulis, Lu Cheng, Niko Välimäki, Nicholas Croucher, Simon
Harris, William Hanage, Teemu Roos, and Jukka Corander
|
Regular
|
BigD460
"Identification of SNP Interactions Using Data-Parallel Primitives on
GPUs"
Can Altinigneli, Bettina Konte, Dan Rujescu, Christian Boehm, and
Claudia Plant
|
L 9:
Cloud computing
|
Regular
|
BigD380
"Combining Hadoop and GPU to Preprocess Large Affymetrix
Microarray Data"
Sufeng Niu, Guangyu Yang, Nilim Sarma, Melissa Smith, Pradip Srimani,
and Feng Luo
|
Regular
|
BigD423
"Detecting and Identifying System Changes in the Cloud via
Discovery by Example"
Hao Chen, Sastry Duri, Vasanth Bala, Nilton Bila, Canturk Isci, and Ayse
Coskun
|
Regular
|
BigD426
"PigOut: Making Multiple Hadoop Clusters to Work Together"
Kyungho Jeon, Sharath Chandrashekhara, Feng Shen, Shikhar Mehra, Oliver
Kennedy, and Steven Ko
|
Regular
|
BigD432
"Accurate and Efficient Selection of the Best Consumption
Prediction Method in Smart Grids"
Marc Frincu, Charalampos Chelmis, Muhammad Noor, and Viktor Prasanna
|
Regular
|
BigD244
"E-Sketch: Gathering Large-scale Energy Consumption Data Based on
Consumption Patterns"
Zhichuan Huang, Hongyao Luo, David Skoda, Ting Zhu, and Yu Gu
|
L 10:
Privacy and security
|
Regular
|
BigD260
"Hierarchical Management of Large-Scale Malware Data"
Lee Kellogg, Brian Ruttenberg, Alison O'Connor, Michael Howard, and Avi
Pfeffer
|
Regular
|
BigD294
"MR-TRIAGE: Scalable Multi-Criteria Clustering for Big Data
Security Intelligence Applications"
Yun Shen and Olivier Thonnard
|
Regular
|
BigD383 "Using Data Content to
Assist Access Control for Large-Scale Content-Centric Databases"
Wenrong Zeng, Yuhao Yang, and Bo Luo
|
L 11:
Graphs and networks (2)
|
Regular
|
BigD301
"Efficient Breadth-First Search on a Heterogeneous Processor"
Mayank Daga, Mark Nutter, and Mitesh Meswani
|
Regular
|
BigD419 "Clique
Guided Community Detection"
Diana Palsetia, Mostofa Patwary, William Hendrix, Ankit Agrawal, and
Alok Choudhary
|
Regular
|
BigD441 "Increasing the Veracity
of Event Detection on Social Media Networks Through User Trust
Modeling"
Todd Bodnar, Conrad Tucker, Kenneth Hopkinson, and Sven Bilén
|
Regular
|
BigD455
"NVM-based Hybrid BFS with Memory Efficient Data Structure"
Keita Iwabuchi, Hitoshi Sato, Yuichiro Yasui, Katsuki Fujisawa, and
Satoshi Matsuoka
|
I&G 1:
Industry & Government
|
Regular
|
N211
Spatial Computations over Terabyte-Sized Images on
Hadoop Platforms
Peter Bajcsy, Phuong Nguyen, Antoine Vandecreme, and
Mary Brady
|
Regular
|
N223
Astro: A Predictive Model for Anomaly Detection and
Feedback-based Scheduling on Hadoop
Chaitali Gupta, Mayank Bansal, Tzu-Cheng Chuang,
Ranjan Sinha, and Sami Ben-romdhane
|
Regular
|
N222
ALOJA: a Systematic Study of Hadoop Deployment
Variables to Enable Automated Characterization of Cost-Effectiveness
Nicolas Poggi, David Carrera, Aaron Call, Rob
Reinauer, Nikola Vujic, Daron Green, José Blakeley, Sergio Mendoza,
Yolanda Becerra, Jordi Torres, Eduard Ayguadé, and Jesús Labarta
|
Regular
|
N217
Lightweight Approximate Top-k for Distributed
Settings
Vinay Deolalikar and Kave Eshghi
|
Regular
|
N230
Recommending Similar Items in Large-scale Online
Marketplaces
Jayasimha Reddy Katukuri, Tolga Konik, Rajyashree
Mukherjee, and Santanu Kolay
|
I&G 2:
Industry & Government
|
Regular
|
N216
Crowdsourced Query Augmentation through Semantic
Discovery of Domain-specific Jargon
Khalifeh Aljadda, Mohammed Korayem, Trey Grainger,
and Chris Russell
|
Regular
|
N224
Heterogeneous Stream Processing for Disaster
Detection and Alarming
Francois Schnitzler, Thomas Liebig, Shie Mannor,
Gustavo Souto, Sebastian Bothe, and Hendrik Stange
|
Regular
|
N201
Recall Estimation for Rare Topic Retrieval from
Large Corpuses
Praveen Bommannavar, Alek Kolcz, and Anand Rajaraman
|
Regular
|
N236
Identifying top Chinese network buzzwords from
social media big data set based on time-distribution features
Yongli Tang, Tingting He, Bo Li, and Xiaohua Hu
|
Regular
|
N218
Query Revision During Cluster Based Search on Large
Unstructured Corpora
Vinay Deolalikar
|
I&G 3:
Industry & Government
|
Regular
|
N213
A Scalable and Efficient Community Detection
Algorithm
Dhaval C. Lunagariya, Somayajulu D.V.L.N., and Radha
Krishna P.
|
Regular
|
N202
Future Directions of Humans in Big Data Research
Celeste Lyn Paul, Chris Argenta, William Elm, and
Alex Endert
|
Regular
|
N228
An Initial Study of Predictive Machine Learning
Analytics on Large Volumes of Historical Data for Power System
Applications
Jiang Zheng and Aldo Dagnino
|
Regular
|
N207
In Unity There is Strength: Showcasing a Unified Big
Data Platform with MapReduce Over both Object and File Storage
Renu Tewari, Dean Hildebrand, and Rui Zhang
|
Regular
|
N203
Bridging High Velocity and High Volume Industrial
Big Data Through Distributed In-Memory Storage & Analytics
Jenny Weisenberg Williams, Kareem Aggour, John
Interrante, Justin McHugh, and Eric Pool
|
I&G 4:
Industry & Government
|
Regular
|
N232
Big Data Predictive Analytics for Proactive
Semiconductor Equipment Maintenance
Sathyan Munirathinam
|
Regular
|
N219
Automating Data Integration with HiperFuse
Eric Huang, Andres Quiroz, and Luca Ceriani
|
Regular
|
N215
Explore Efficient Data Organization for Large Scale
Graph Analytics and Storage
Yinglong Xia, Ilie Tanasa, Lifeng Nai, Wei Tan,
Yanbin Liu, Jason Crawford, and Ching-Yung Lin
|
Regular
|
N209
Increasing the Accessibility to Big Data Systems via
a Common Services API
Rohan Malcolm, Cherrelle Morrison, Tyrone Grandison,
Sean Thorpe, Kimron Christie, Akim Wallace, Damian Green, Julian
Jarrett, and Arnett Campbell
|
S 1:
Visual analytics, time, and space
|
Short
|
BigD204 "The
Role of Visual Analysis in the Regulation of Electronic Order Book
Markets"
Mark Paddrik, Richard Haynes, Andrew Todd, William Scherer, and Peter
Beling
|
Short
|
BigD217
"Preferences over Time"
Noriaki Kawamae
|
Short
|
BigD227
"Online Temporal-Spatial Analysis for Detection of Critical Events
in Cyber-Physical Systems"
Magnus Almgren, Olaf Landsiedel, Marina Papatriantafilou, and Zhang Fu
|
Short
|
BigD252
"In-Situ Visualization and Computational Steering for Large-Scale
Simulation of Turbulent Flows in Complex Geometries"
Hong Yi, Michel Rasquin, Jun Fang, and Igor Bolotnov
|
Short
|
BigD288
"Large-Scale Network Traffic Monitoring with DBStream, a System
for Rolling Big Data Analysis"
Arian Bär, Alessandro Finamore, Pedro Casas, Lukasz Golab, and Marco
Mellia
|
Short
|
BigD387 "Immersive
and collaborative data visualization using virtual reality
platforms"
Ciro Donalek, S.G. Djorgovski, Scott Davidoff, Alex Cioc, Anwell Wang,
Giuseppe Longo, Jeffrey S. Norris, Jerry Zhang, Elizabeth Lawler, and
Stacy Yeh
|
Short
|
BigD411 "On Scaling
Time Dependent Shortest Path Computations for Dynamic Traffic
Assignment"
Amit Gupta, Weijia Xu, Kenneth Perrine, Dennis Bell, and Natalia
Ruiz-Juri
|
Short
|
BigD413 "High
Volume Geospatial Mapping for Internet-of-Vehicle Solutions with
In-Memory Map-Reduce Processing"
Tao Zhong, Kshitij Doshi, Gang Deng, Xiaoming Yang, and Hegao Zhang
|
Short
|
BigD431 "The
Adaptive Projection Forest: Using Adjustable Exclusion and Parallelism
in Metric Space Indexes"
Lee Thompson, Weijia Xu, and Daniel Miranker
|
Short
|
BigD440 "Low
Complexity Sensing for Big Spatio-Temporal Data"
Dongeun Lee and Jaesik Choi
|
S 2:
Cloud computing and systems (1)
|
Short
|
BigD242 "Scheduling
MapReduce Tasks on Virtual MapReduce Clusters from a Tenant’s
Perspective"
Jia-Chun Lin, Ming-Chang Lee, and Ramin Yahyapour
|
Short
|
BigD311
"Minimizing Data Movement through Query Transformation"
Patrick Leyshock, David Maier, and Kristin Tufte
|
Short
|
BigD364
"Automated Workload-aware Elasticity of NoSQL Clusters in the
Cloud"
Evie Kassela, Christina Boumpouka, Ioannis Konstantinou, and Nectarios
Koziris
|
Short
|
BigD384
"Multilevel Partitioning of Large Unstructured Grids"
Oyindamola Akande and Philip Rhodes
|
Short
|
BigD392 "On
the Performance of MapReduce: A Stochastic Approach"
Sarker Ahmed and Dmitri Loguinov
|
Short
|
BigD428
"VENU: Orchestrating SSDs in Hadoop Storage"
Krish K.R., M. Safdar Iqbal, and Ali Butt
|
Short
|
BigD438
"In-Memory I/O and Replication for HDFS with Memcached: Early
Experiences"
Nusrat Islam, Xiaoyi Lu, Md. Rahman, Raghunath Rajachandrasekar, and
Dhabaleswar Panda
|
Short
|
BigD448
"Scaling Up Prioritized Grammar Enumeration for Scientific
Discovery in the Cloud"
Tony Worm and Kenneth Chiu
|
Short
|
BigD469
"In-advance Data Analytics for Reducing Time to Discovery"
Jialin Liu, Yin Lu, and Yong Chen
|