IEEE BigData 2013
Program Schedule
Hyatt Regency Santa Clara
CA, USA
Oct 6-9, 2013
Program
|
|
• October 5, 2013
• October 6, 2013
• October 7, 2013
• October 8, 2013
• October 9, 2013
|
|
|
|
Keynote Lecture: 60 minutes (about 45 minutes for talk and
15 minutes for Q and A)
Main conference regular paper: 25 minutes (about 20 minutes for talk and
5 minutes for Q and A)
Main conference short paper: 20 minutes (about 16 minutes for talk and 4
minutes for Q and A)
|
|
|
|
5-Oct
|
|
17:00-20:00
|
Registration:
Ballroom E Foyer
|
|
|
6-Oct
|
|
07:30-18:00
|
Registration:
Hotel Lobby West
|
Venue:
|
Ballroom AB (Ba-AB), Ballroom C (Ba-C), Ballroom D (Ba-D), Ballroom E (Ba-E),
Ballroom F (Ba-F), Ballroom G (Ba-G), Ballroom H (Ba-H)
|
08:30-12:10
|
Workshop
5
The First
Workshop on Big Data Visualization
|
Workshop
6
Big Data and Science:
Infrastructure and Services
|
Workshop
7
Scalable Machine
Learning: Theory and Applications
|
Workshop
8
Big Data in
Bioinformatics and Health Informatics
|
Workshop
12
Knowledge
management and Big Data Analytics
|
Workshop
9
Scholarly Big
Data: Challenges & issues
|
Tutorial 1
Online Learning for
Big Data Analytics (8-10am)
|
Tutorial 2
Large-Scale
Click-stream and transaction log mining in practice
(10:20am-12:20pm)
|
Session
Chairs
|
Kwan-Liu
Ma
|
Shane Canon
|
Haiqin Yang
|
Juan Huan et al.
|
Qing Liu
|
Ingemar J. Cox
|
|
|
Venue:
|
Ba-AB
|
Ba-C
|
Ba-D
|
Ba-E
|
Ba-F
|
Ba-G
|
Ba-H
|
Ba-H
|
|
Coffee
break: 10:00-10:20 Foyer
|
12:10-13:30
|
Lunch
on your own
|
13:30-18:00
|
Workshop
5
The First
Workshop on Big Data Visualization
|
Workshop
6
Big Data and
Science: Infrastructure and Services
|
Workshop
7
Scalable
Machine Learning: Theory and Applications
|
Workshop
8
Big Data in
Bioinformatics and Health Informatics
|
Workshop
12
Knowledge
management and Big Data Analytics
|
Workshop
9
Scholarly Big
Data: Challenges & issues
|
Workshop
10
Scalable
Cloud Data Management
|
Session
Chairs
|
Kwan-Liu
Ma
|
Shane
Canon
|
Haiqin
Yang
|
Juan
Huan
et al.
|
Qing
Liu
|
Ingemar
J. Cox
|
Norbert
Ritter
|
|
Venue:
|
Ba-AB
|
Ba-C
|
Ba-D
|
Ba-E
|
Ba-F
|
Ba-G
|
Ba-H
|
|
|
Coffee
break: 15:40-16:00 Foyer
|
7-Oct
|
|
07:30-18:00
|
Registration:
Hotel Lobby West
|
Venue:
|
Ballroom
AB (Ba-AB), Ballroom C (Ba-C), Ballroom D (Ba-D), Ballroom E (Ba-E),
Ballroom F (Ba-F)
|
08:10-08:25
|
Opening
and Welcoming Speech
Conference
Co-Chairs:
T.Y. Lin, Vijay Raghavan,
Benjamin Wah
Program Co-Chairs:
Ricardo Baeza-Yates, Geoffrey Fox,
Cyrus Shahabi, Matthew Smith, Qiang Yang
Industry Program co-Chairs:
Rayid Ghani, Wei Han, Ronny Lempel, Raghunath
Nambiar
BigData
Steering Committee Chair:
Xiaohua
Tony Hu (Drexel University)
|
Venue:
|
Ba-AB
|
08:25-09:25
|
Session
Chair:
Geoffrey Fox
Keynote
Lecture 1:
The Berkeley Data Analytics Stack: Present and Future
Prof.
Mike Franklin, AMP
Lab, UC Berkeley, USA
|
Venue:
|
Ba-AB
|
09:25-09:45
|
Coffee
Break: Foyer
Poster session setup:
Ballroom Foyer
|
09:45-12:00
|
Session
AB1
Algorithms
and Systems for Big Data Search
|
Session
C1
Cloud/Grid/Stream
Computing for Big Data
|
Session
D1
Complex Big
Data Applications
|
Workshop
1
Distributed
Storage Systems and Coding for Bigdata
|
Workshop
3
Workshop on
Big Data and Society
|
Session Chair
|
Umit
Catalyurek
|
Natasha
Balac
|
Qunzhi
Zhou
|
Hui
Li et al.
|
Yike
Guo et al.
|
Venue
|
Ba-AB
|
Ba-C
|
Ba-D
|
Ba-E
|
Ba-F
|
12:00-13:20
|
Lunch
on your own
|
|
Poster session setup:
Ballroom Foyer
|
13:20-15:20
|
Session AB2
Algorithms
and Systems for Big Data Search
|
Session C2
High
Performance/ Parallel Computing Platforms for Big Data
|
Session D2
Complex Big
Data Applications
|
Workshop
1
Distributed Storage
Systems and Coding for Bigdata
|
Workshop
3
Workshop on
Big Data and Society
|
Session
Chair
|
Michael
Goodrich
|
Eugen
Feller
|
Saumyadipta
Pyne
|
Hui
Li et al.
|
Yike
Guo et al.
|
Venue:
|
Ba-AB
|
Ba-C
|
Ba-D
|
Ba-E
|
Ba-F
|
15:20-15:40
|
Coffee
|
15:40-17:40
|
Session AB3
Big Data
Search Architectures, Scalability and Efficiency
|
Session C3
High
Performance/ Parallel Computing Platforms for Big Data
|
Session D3
Complex Big
Data Applications
|
Workshop
1
Distributed
Storage Systems and Coding for Bigdata
|
Workshop
3
Workshop on
Big Data and Society
|
Session
Chair
|
Peter
Sanders
|
Toshimori
Honjo
|
En-hui
Yang
|
Hui
Li et al.
|
Yike
Guo et al.
|
Venue:
|
Ba-AB
|
Ba-C
|
Ba-D
|
Ba-E
|
Ba-F
|
18:30-20:30
|
Banquet:
Santa
Clara Ballroom
|
8-Oct
|
|
08:00-18:00
|
Registration:
Hotel Lobby West
|
Venue:
|
Ballroom
AB (Ba-AB), Ballroom C (Ba-C), Ballroom D (Ba-D), Ballroom E (Ba-E), Ballroom
F (Ba-F)
|
8:30-9:30
|
Session
Chair: Cyrus
Shahabi
Keynote
Lecture 2:
Using Crowdsourcing for Data Analytics
Prof.
Hector Garcia-Molina, Stanford
University, USA
|
Venue:
|
Ba-AB
|
9:30-9:50
|
Coffee
Break: Ballroom
Foyer
|
9:50-12:00
|
Session
AB4
Large-scale
Recommendation Systems and Social Media Systems
|
Session
C4
Energy-efficient
Computing for Big Data
|
Session
D4
Data
Preservation, Information Integration and Heterogeneous and Multi-structured
Data Integration
|
Workshop 2
Big Data and
the Humanities
|
Workshop
4
BPOE 2013
|
Session Chair
|
Noriaki
Kawamae
|
Leonardo
Bautista
|
Yong
Chen
|
Mark
Hedges et al.
|
Jianfeng
Zhan et al.
|
Venue:
|
Ba-AB
|
Ba-C
|
Ba-D
|
Ba-E
|
Ba-F
|
12:00-13:20
|
Lunch
provided by conference: TERRA
COURTYARD
|
13:20-15:20
|
Session
AB5
Link and
Graph Mining
|
Session
C5
New
Computational Models for Big Data
|
Session
D5
Spatiotemporal
and Stream Data Management, Scientific Data Management
|
Workshop 2
Big Data and
the Humanities
|
Workshop
4
BPOE 2013
|
Session
Chair:
|
Qi
Liao
|
Denis
Shestakov
|
Frank
Dehne
|
Mark
Hedges et al.
|
Jianfeng
Zhan et al.
|
Venue:
|
Ba-AB
|
Ba-C
|
Ba-D
|
Ba-E
|
Ba-F
|
15:20-15:40
|
Coffee
Break: Ballroom
Foyer
|
15:40-17:40
|
Session AB6
Link and
Graph Mining, Mobility and Big Data
|
Session C6
Novel
Theoretical Models for Big Data
|
Session D6
Scientific
Data Management
|
Workshop 2
Big Data and
the Humanities
|
Workshop
4
BPOE 2013
|
Session
Chair
|
Abhirup
Chakraborty
|
Weijia
Xu
|
Andreas
Rauber
|
Mark
Hedges et al.
|
Jianfeng
Zhan et al.
|
Venue:
|
Ba-AB
|
Ba-C
|
Ba-D
|
Ba-E
|
Ba-F
|
9-Oct
|
|
08:00-15:00
|
Registration:
|
Venue:
|
Ballroom
AB (Ba-AB), Ballroom C (Ba-C), Ballroom D (Ba-D), Ballroom E (Ba-E),
Ballroom F (Ba-F)
|
08:30-09:30
|
Session
Chair: T.Y. Lin
Keynote
Lecture 3: Security
– A Big Question for Big Data
Prof. Roger Schell,
University
of Southern California, USA
|
Venue:
|
Ba-AB
|
09:30-09:50
|
Coffee Break: Ballroom Foyer
|
9:50-12:00
|
Key
Issues in Big Data Research Panel
|
Session
E1
Industry and Government
Program
|
Session
D7
Database
Management Challenges: Architecture, Storage, User Interfaces
|
Workshop
11
Big Data and
Smarter Cities
|
Session
E2
Industry and
Government Program
|
Session Chair
|
T.Y. Lin
|
Avigdor
Gal
|
Mihajlo
Grbovic
|
Sambit
Sahu
|
Nikos
Papailiou
|
Venue:
|
Ba-AB
|
Ba-C
|
Ba-D
|
Ba-E
|
Ba-F
|
12:00-13:25
|
Lunch
provided by conference: TERRA COURTYARD
|
13:30-14:30
Session
Chair: Raghunath
Nambiar
Keynote
Lecture 4: Key
Usage Patterns for Apache Hadoop in the Enterprise
Dr. Amr Awadallah, CTO,
Cloudera, USA
Venue: Ba-AB
|
Session
Chair
Venue:
|
Workshop
11
Big Data and
Smarter Cities
Sambit
Sahu
Ba-C
|
Session
AB7
Privacy
Preserving Big Data Collection/Analytics, Threat Detection using Big
Data Analytics
Simon
Chan
Ba-F
|
14:30-14:50
|
Coffee Break:
|
Ballroom
Foyer
|
|
|
|
14:40-16:50
Session
Chair
|
Big Data Funding Program Panel: Challenges and
Opportunities
Vijay
Raghavan
|
|
|
Workshop 11
Big Data and
Smarter Cities
Sambit Sahu
|
|
Venue:
|
Ba-AB
|
|
|
Ba-C
|
Keynote 1:
Title: The
Berkeley Data Analytics Stack: Present and Future
Speaker:
Prof.
Mike Franklin, UC Berkeley, USA
Abstract:
The Berkeley AMPLab
was founded on the idea that the challenges of emerging Big Data
applications require a new approach to analytics systems. Launching in
early 2011, the project set out to rethink the traditional analytics
stack, breaking down technical and intellectual barriers that had arisen
during decades of evolutionary development. The vision of the lab is to
seamlessly integrate the three main resources available for making sense
of data at scale: Algorithms (such as machine learning and statistical
techniques), Machines (in the form of scalable clusters and elastic cloud
computing), and People (both individually as analysts and en masse, as
with crowdsourced human computation). To pursue this goal, we assembled a
research team with diverse interests across computer science, forged
relationships with domain experts on campus and elsewhere, and obtained
the support of leading industry partners and major government sponsors.
The lab is realizing its ideas through the development of a
freely-available Open Source software stack called BDAS: the Berkeley
Data Analytics Stack. In the nearly three years the lab has been in
operation, we've released major components of BDAS. Several of these
components have gained significant traction in industry and elsewhere:
the Mesos cluster resource manager, the Spark in-memory computation
framework, and the Shark query processing system. BDAS shows up
prominently in many industry discussions of the future of the Big Data
analytics ecosystem - a rare degree of impact for an ongoing academic
project. Given this initial success, the lab is continuing on its
research path, moving "up the stack" to better integrate and
support deep machine learning and to make people a full-fledged resource
for making sense of Big Data.
In this talk, I'll first outline the motivation and insights behind our
research approach and describe how we have organized to address the
cross-disciplinary nature of Big Data challenges. I will then describe
the current state of BDAS with an emphasis on the key components listed
above and will address our current efforts on machine learning
scalability and ease of use, and hybrid human/computer processing.
Finally I will present our current views of how all the pieces will fit
together to form a system that can adaptively bring the right resources
to bear on a given data-driven question to meet time, cost and quality
requirements throughout the analytics lifecycle.
Short Bio:
Michael Franklin is
the Thomas M. Siebel Professor of Computer Science at UC Berkeley, where
he also serves as Director of the Algorithms, Machines and People Lab
(AMPLab). The Berkeley AMPLab is a collaboration of over 60 researchers
supported by Founding Sponsors Amazon Web Services, Google, and SAP,
along with 17 other leading companies, the DARPA XData program, and an
NSF Expeditions in Computing award. The latter was announced as part of
the Obama Administration's Big Data research initiative in 2012. His
research interests include large-scale data management and analytics,
data integration, and hybrid human/computer data processing systems. He
was founder and CTO of Truviso, a real-time data analytics company
acquired by Cisco Systems in 2012. He is an ACM Fellow and two-time
winner of the ACM SIGMOD Test of Time Award (2013 and 2004). He also
recently received the Best Paper awards at ICDE 2013 and NSDI 2012, a
"Best of VLDB 2012" selection, Best Demo awards at SIGMOD 2012
and VLDB 2011 and the Outstanding Advisor Award from the Computer Science
Graduate Student Association at Berkeley. He is a committee member on the
U.S. National Academy of Sciences study on Analysis of Massive Data and a
Transportation Research Board committee on long-term data stewardship.
Prof. Franklin received his Ph.D. in Computer Science from the University
of Wisconsin-Madison in 1993.
Keynote 2:
Title: Using
Crowdsourcing for Data Analytics
Speaker:
Prof.
Hector Garcia-Molina, Stanford
University, USA
Abstract:
It may sound
contradictory to use humans to analyze big data, since humans cannot
process huge amounts of data, may be error prone and are relatively slow.
However, humans can do certain tasks much better than machines, e.g.,
tasks that involve image analysis or natural language.
In this talk I will discuss how humans can be judiciously used to improve
data analytics by cleansing, clustering and filtering critical data. I
will also briefly describe ongoing work at our Stanford InfoLab in this
area.
Short Bio:
Hector Garcia-Molina
is the Leonard Bosack and Sandra Lerner Professor in the Departments of
Computer Science and Electrical Engineering at Stanford University,
Stanford, California. He was the chairman of the Computer Science
Department from January 2001 to December 2004. From 1997 to 2001 he was a
member of the President's Information Technology Advisory Committee (PITAC).
From 1979 to 1991 he was on the faculty of the Computer Science
Department at Princeton University, Princeton, New Jersey. He received a
BS in electrical engineering from the Instituto Tecnologico de Monterrey,
Mexico, in 1974. From Stanford University, Stanford, California, he
received an MS in electrical engineering in 1975 and a PhD in computer
science in 1979. He holds an honorary PhD from ETH Zurich (2007).
Garcia-Molina is a Fellow of the Association for Computing Machinery and
of the American Academy of Arts and Sciences; is a member of the National
Academy of Engineering; received the 1999 ACM SIGMOD Innovations Award;
is a Venture Advisor for Onset Ventures, and is a member of the Board of
Directors of Oracle.
Keynote 3:
Title:
Security – A Big Question for Big Data
Speaker:
Prof.
Roger Schell, University of
Southern California, USA
Abstract:
Big data implies
performing computation and database operations for massive amounts of
data, remotely from the data owner’s enterprise. Since a key value
proposition of big data is access to data from multiple and diverse
domains, security and privacy will play a very important role in big data
research and technology. The limitations of standard IT security
practices are well-known, making the ability of attackers to use software
subversion to insert malicious software into applications and operating
systems a serious and growing threat whose adverse impact is intensified
by big data. So, a big question is what security and privacy technology is
adequate for controlled assured sharing for efficient direct access to big data.
Making effective use of big data requires access from any domain to data
in that domain, or any other domain it is authorized to access. Several
decades of trusted systems developments have produced a rich set of
proven concepts for verifiable protection to substantially cope with
determined adversaries, but this technology has largely been marginalized
as “overkill” and vendors do not widely offer it. This talk will discuss
pivotal choices for big data to leverage this mature security and privacy
technology, while identifying remaining research challenges.
Short Bio:
Dr. Roger R. Schell
recently joined USC/ISI supporting their Masters of Cyber Security degree
program. He is internationally recognized for originating several key
modern security design and evaluation techniques, and he holds patents in
cryptography, authentication and trusted workstation. For more than a
decade he has been co-founder and President of Aesec Corporation, a
start-up company providing verifiably secure platforms. Previously Dr.
Schell was co-founder and vice president for Gemini Computers, Inc.,
where he directed development of their highly secure (what NSA called
“Class A1”) commercial product, the Gemini Multiprocessing Secure
Operating System (GEMSOS). He was also the founding Deputy Director of
NSA’s National Computer Security Center. He has been referred to as the
"father" of the Trusted Computer System Evaluation Criteria
(the "Orange Book"). Dr. Schell is a retired USAF Colonel. He
received a Ph.D. in Computer Science from MIT, an M.S.E.E. from Washington
State, and a B.S.E.E. from Montana State. The NIST and NSA have
recognized Dr. Schell with the National Computer System Security Award.
In 2012 he was inducted into the inaugural class of the National Cyber
Security Hall of Fame.
Keynote 4:
Title: Key
Usage Patterns for Apache Hadoop in the Enterprise
Speaker:
Dr. Amr Awadallah, CTO, Cloudera,
USA
Abstract:
Advances in
computing capabilities are palpably evident throughout many industries,
manifested by unprecedented, large-scale data integration and inferencing.
Branded as “big-data” in many cases, the question of whether such
techniques can leverage advances in biomedicine and clinical practice is
obvious. High-throughput clinical analytics, synthesizing genomic and
clinical attributes of a particular patient, portends predictive models
that can directly influence clinical care decisions. However, to make
this widely shared vision practical and scalable, barriers attributable
to data heterogeneity dominate. Methods and strategies to increase the
comparability and consistency of healthcare related data will be
discussed.
Short Bio:
Before co-founding
Cloudera in 2008, Amr (@awadallah) was an Entrepreneur-in-Residence at
Accel Partners. Prior to joining Accel he served as Vice President of
Product Intelligence Engineering at Yahoo!, and ran one of the very
first organizations to use Hadoop for data analysis and business
intelligence. Amr joined Yahoo! after they acquired his first startup,
VivaSmart, in July of 2000. Amr holds Bachelor’s and Master’s degrees
in Electrical Engineering from Cairo University, Egypt, and a Doctorate
in Electrical Engineering from Stanford University.
Conference Paper Presentations
|
Session
AB1: Algorithms and Systems for Big Data Search
|
Regular
|
BigD220 "4S:
Scalable Subspace Search Scheme"
Hoang Vu Nguyen, Emmanuel Müller, and Klemens Böhm
|
Regular
|
BigD254
"Computing Betweenness Centrality in External Memory"
Lars Arge, Michael Goodrich, and Freek van Walderveen
|
Regular
|
BigD282
"NUMA-optimized Parallel Breadth-first Search on Multicore
Single-node System"
Yuichiro Yasui, Katsuki Fujisawa, and Kazushige Goto
|
Short
|
BigD248 "A
Distributed Tree Data Structure For Real-Time OLAP On Cloud
Architectures"
Frank Dehne, Quan Kong, Andrew Rau-Chaplin, Hamidreza Zaboli, and
Rebecca Zhou
|
Short
|
BigD323
"Group-Scheme: A Universal SIMD-based Compression Scheme"
Xudong Zhang, Xin Zhao, Dongdong Shan, and Hongfei Yan
|
Session
C1: Cloud/Grid/Stream Computing for Big Data
|
Short
|
BigD287 "On
the Performance and Energy Efficiency of Hadoop Deployment Models"
Eugen Feller, Lavanya Ramakrishnan, and Christine Morin
|
Short
|
BigD355
"Scalable and Robust Key Group Size Estimation For Reducer Load
Balancing in MapReduce"
Wei Yan, Yuan Xue, and Bradley Malin
|
Short
|
BigD359
"Robot: An Efficient Model For Big Data Storage Systems Based On
Erasure Coding"
Chao Yin, Jianzong Wang, Changsheng Xie, Jiguang Wan, and Changlin Long
|
Short
|
BigD450
"Towards Hybrid Online On-Demand Querying of Realtime Data with
Stateful Complex Event Processing"
Qunzhi Zhou, Yogesh Simmhan, and Viktor Prasanna
|
Short
|
BigD453
"DDSN: Duplicate Detection to Reduce Both Storage and Bandwidth
Consumption"
Jiaran Zhang, Xiaohui Yu, and Liwei Lin
|
Short
|
BigD414 "An
Infrastructure for Automating Large-scale Performance Studies and Data
Processing"
Deepal Jayasinghe, Josh Kimball, Tao Zhu, Siddharth Choudhary, and
Calton Pu
|
Session
D1: Complex Big Data Applications
|
Regular
|
BigD288 "The
BTWorld Use Case for Big Data Analytics: Description, MapReduce Logical
Workflow, and Empirical Evaluation"
Tim Hegeman, Bogdan Ghit, Mihai Capotă, Jan Hidders, Dick Epema, and
Alexandru Iosup
|
Regular
|
BigD311
"Modeling Heterogeneous Time Series Dynamics to Profile Big Sensor
Data in Complex Physical Systems"
Bin Liu
|
Regular
|
BigD332
"Efficiently Extracting Frequent Subgraphs using MapReduce"
Wei Lu, Gang Chen, Anthony Tung, and Feng Zhao
|
Regular
|
BigD342 "Opinion mining with word
order"
Noriaki Kawamae
|
Short
|
BigD252 "HIG
– An In-memory Database Platform Enabling Real-time Analyses of Genome
Data"
Matthieu-P. Schapranow and Hasso Plattner
|
Session E1: Industry
and Government Program
|
Regular
|
N203 "Terabyte-sized Image Computations on Hadoop Cluster Platforms"
Peter Bajcsy, Antoine Vandecreme, Julien Amelot, Phuong Nguyen, Joe
Chalfoun, and Mary Brady
|
Regular
|
N206 "A Fast and Scalable Method for Threat Detection in Large-scale DNS Logs"
Ron Begleiter, Yuval Elovici, Yona Hollander, Ori Mendelson, Lior Rokach,
and Roi Saltzman
|
Regular
|
N207 "Hourglass: a Library for Incremental Processing on Hadoop"
Matthew Hayes and Sam Shah
|
Regular
|
N209 "Correlation-based Performance Analysis for Full-System MapReduce
Optimization"
Qi Guo, Yan Li, Tao Liu, Kun Wang, Guancheng Chen, Xiaoming Bao, and
Wentao Tang
|
Regular
|
N217 "Large Scale Ad Latency Analysis"
Mihajlo Grbovic, Jon Malkin, and Hirakendu Das
|
Session
AB2: Algorithms and Systems for Big Data Search
|
Regular
|
BigD315 "A
Distributed Vertex-Centric Approach for Pattern Matching in Massive
Graphs"
Arash Fard, M. Usman Nisar, Lakshmish Ramaswamy, John A. Miller, and
Matthew Saltz
|
Regular
|
BigD318 "Fast
Scalable Selection Algorithms for Large Scale Data"
Lee Thompson, Weijia Xu, and Daniel Miranker
|
Regular
|
BigD410
"Distributed Confidence-Weighted Classification on MapReduce"
Nemanja Djuric, Mihajlo Grbovic, and Slobodan Vucetic
|
Short
|
BigD350 "A
Streaming Partitioning Approach to Processing Large Scale Distributed
Graph Datasets"
Rui Wang and Kenneth Chiu
|
Short
|
BigD402
"Scaling Concurrency of Personalized Semantic Search over Large
RDF Data"
Haizhou Fu, Hyeongsik Kim, and Kemafor Anyanwu
|
Session
C2: High Performance/Parallel Computing Platforms for Big Data
|
Regular
|
BigD279
"HFSP: Size-based Scheduling for Hadoop"
Mario Pastorelli, Antonio Barbuzzi, Damiano Carra, Matteo Dell'Amico, and
Pietro Michiardi
|
Regular
|
BigD314 "An
Evaluation Study of BigData Frameworks for Graph Processing"
Benedikt Elser and Alberto Montresor
|
Short
|
BigD225
"Hardware acceleration of Hadoop MapReduce"
Toshimori Honjo and Kazuki Oikawa
|
Short
|
BigD339
"Algebraic Dataflows for Big Data Analysis"
Jonas Dias, Eduardo Ogasawara, Daniel de Oliveira, Fabio Porto, Patrick
Valduriez, and Marta Mattoso
|
Short
|
BigD360
"Multilevel Active Storage for Big Data Applications in High
Performance Computing"
Chao Chen and Yong Chen
|
Session
D2: Complex Big Data Applications
|
Regular
|
BigD341
"Explaining the Product Range Effect in Purchase Data"
Diego Pennacchioli, Michele Coscia, Salvatore Rinzivillo, Dino
Pedreschi, and Fosca Giannotti
|
Regular
|
BigD372
"Parallel Deterministic Annealing Clustering and its Application
to LC-MS Data Analysis"
Geoffrey Fox, D. R. Mani, and Saumyadipta Pyne
|
Regular
|
BigD378
"Terabyte-scale image similarity search: experience and best
practice"
Diana Moise, Denis Shestakov, Gylfi Gudmundsson, and Laurent Amsaleg
|
Short
|
BigD266
"Real-time streaming mobility analytics"
Andras Garzo, Csaba Sidlo, Daniel Tahara, Erik Wyatt, and Andras Bencur
|
Short
|
BigD320
"QuPARA: Query-Driven Large-Scale Portfolio Aggregate Risk
Analysis on MapReduce"
Andrew Rau-Chaplin, Blesson Varghese, Duane Wilson, Zhimin Yao, and
Norbert Zeh
|
Session E2: Industry
and Government Program
|
Regular
|
N218 "Accelerating semantic graph databases on commodity clusters"
Alessandro Morari, Vito Giovanni Castellana, Oreste Villa, David
Haglin, John Feo, Jesse Weaver, and Antonino Tumeo
|
Regular
|
N219 "Practical Distributed Classification using the Alternating Direction
Method of Multipliers Algorithm"
Peter Lubell-Doughtie and Jon Sondag
|
Regular
|
N225 "Scaling Deep Social Feeds at Pinterest"
Varun Sharma
|
Regular
|
N226 "Big Data Analytics on High Velocity Streams: A Case Study"
Thibaud Chardonnens, Philippe Cudre-Mauroux, Martin Grund, and Benoit
Perroud
|
Session
AB3: Big Data Search
Architectures, Scalability and Efficiency
|
Regular
|
BigD260 "A
Parallel Computing Platform for Training Large Scale Neural
Networks"
Rong Gu, Furao Shen, and Yihua Huang
|
Regular
|
BigD330 "An
NML-based Model Selection Criterion for General Relational Data
Modeling"
Yoshiki Sakai and Kenji Yamanishi
|
Regular
|
BigD411
"Scalable Context-Aware Role Mining with MapReduce"
Zhiwei Yu, Raymond Wong, and Chi-Hung Chi
|
Short
|
BigD297
"Sparse Poisson Coding for High Dimensional Document
Clustering"
Chenxia Wu, Haiqin Yang, Jianke Zhu, Jiemi Zhang, Irwin King, and
Michael R. Lyu
|
Short
|
BigD465
"Parallel Subgroup Discovery on Computing Clusters -- First
Results"
Daniel Trabold and Henrik Grosskreutz
|
Session
C3: High Performance/Parallel Computing Platforms for Big Data
|
Regular
|
BigD331
"Storing and manipulating environmental big data with JASMIN"
Bryan Lawrence, Victoria Bennett, Jonathan Churchill, Martin Juckes,
Philip Kershaw, Stephen Pascoe, Sam Pepler, Matt Pritchard, and Ag
Stephens
|
Regular
|
BigD455
"Locality-driven High-level I/O Aggregation for Processing
Scientific Datasets"
Jialin Liu, Bradly Crysler, and Yong Chen
|
Short
|
BigD363 "GPU Accelerated
Item-Based Collaborative Filtering for Big-Data Applications"
Chandima Hewa Nadungodage, Yuni Xia, John Lee, Myungcheol Lee, and
Choon Seo Park
|
Short
|
BigD427
"Kylin: An Efficient and Scalable Graph Data Processing
System"
Li-Yung Ho, Tsung-Han Li, Jan-Jan Wu, and Pangfeng Liu
|
Short
|
BigD454 "A
Reconfigurable Computing Architecture for Semantic Information
Filtering"
Aalap Tripathy, Ka Chon Ieong, Atish Patra, and Rabi Mahapatra
|
Session
D3: Complex Big Data Applications
|
Regular
|
BigD437
"Demand Response Targeting Using Big Data Analytics"
Jungsuk Kwac and Ram Rajagopal
|
Regular
|
BigD353
"Large-scale Predictive Analytics for Real Time Energy
Management"
Natasha Balac, Tamara Sipes, Nicole Wolter, Kenneth Nunes, Robert
Sinkovits, and Homa Karimabadi
|
Short
|
BigD405 "Constructing User
Profiles from Social Media Data"
Mauricio Hernandez, Kirsten Hildrum, Prateek Jain, Chitra Venkatramani,
Rohit Wagle, Bogdan Alexe, and Ioana Roxana Stanoi
|
Short
|
BigD431
"CloudRS: An Error Correction Algorithm of High-Throughput
Sequencing Data based on Scalable Framework"
Chien-Chih Chen, Yu-Jung Chang, Wei-Chun Chung, Der-Tsai Lee, and
Jan-Ming Ho
|
Short
|
BigD444 "Building
dynamic thermal profiles of energy consumption for individuals and
neighborhoods"
Adrian Albert and Ram Rajagopal
|
Session
AB4: Large-scale Recommendation Systems and Social Media
Systems
|
Regular
|
BigD211
"Continuous Hyperparameter Optimization for Large-scale
Recommender Systems"
Simon Chan, Philip Treleaven, and Licia Capra
|
Regular
|
BigD334
"Parallel Matrix Factorization for Binary Response"
Rajiv Khanna, Liang Zhang, Deepak Agarwal, and Bee-Chung Chen
|
Regular
|
BigD400
"CallCab: A Unified Recommendation System for Carpooling and
Regular Taxicab Services"
Desheng Zhang and Tian He
|
Short
|
BigD361
"Scalable Distributed Event Detection for Twitter"
Richard McCreadie, Craig Macdonald, Iadh Ounis, Miles Osborne, and
Sasa Petrovic
|
Short
|
BigD233
"Massively Scalable Near Duplicate Detection in Streams of
Documents using MDSH"
Paul Bogen, Christopher Symons, Amber McKenzie, Robert Patton, and Rob
Gillen
|
Session
C4: Energy-efficient Computing for Big Data
|
Regular
|
BigD345
"Efficient Gear-shifting for a Power-proportional Distributed
Data-placement Method"
Hieu Hanh Le, Satoshi Hikida, and Haruo Yokota
|
Regular
|
BigD413
"Building a Generic Platform for Big Sensor Data Application"
Chun-Hsiang Lee, David Birch, Chao Wu, Dilshan Silva, Orestis Tsinalis,
Yang Li, Shulin Yan, Moustafa Ghanem, and Yike Guo
|
Regular
|
BigD354
"Agrios: A Hybrid Approach to Big Array Analytics"
Patrick Leyshock, David Maier, and Kristin Tufte
|
Short
|
BigD215
"clusiVAT: A Mixed Visual/Numerical Clustering Algorithm for Big
Data"
Dheeraj Kumar, James Bezdek, Sutharshan Rajasegarar,
Marimuthu Palaniswami, Christopher Leckie, and Timothy Havens
|
Short
|
BigD298
"Feliss: Flexible distributed computing framework with
light-weight checkpointing"
Takuya Araki, Kazuyo Narita, and Hiroshi Tamano
|
Session
D4: Data Preservation, Information Integration and
Heterogeneous and Multi-structured Data Integration
|
Regular
|
BigD253
"CORE: Cross-Object Redundancy for Efficient Data Repair in
Storage Systems"
Kyumars Sheykh Esmaili, Lluis Pamies Juarez, and Anwitaman Datta
|
Regular
|
BigD217
"Iteration Aware Prefetching For Unstructured Grids"
Oyindamola Akande and Philip Rhodes
|
Short
|
BigD278
"Scalable Data Citation in Dynamic, Large Databases: Model and
Reference Implementation"
Stefan Pröll and Andreas Rauber
|
Short
|
BigD344
"Self-Adaptive Event Recognition for Intelligent Transport
Management"
Alexander Artikis, Matthias Weidlich, Avigdor Gal, Vana Kalogeraki, and
Dimitrios Gunopoulos
|
Short
|
BigD375
"Robust Crowdsourced Learning"
Zhiquan Liu, Luo Luo, and Wu-Jun Li
|
Session
AB5: Link and Graph Mining
|
Regular
|
BigD267 "Self-Tuned
Kernel Spectral Clustering for Large Scale Networks"
Raghvendra Mall, Rocco Langone, and Johan Suykens
|
Regular
|
BigD403
"Top-K aggregation over a Large Graph Using Shared-Nothing
Systems"
Abhirup Chakraborty
|
Short
|
BigD241 "Incremental
Algorithms for Network Management and Analysis based on Closeness
Centrality"
Ahmet Erdem Sariyuce, Kamer Kaya, Erik Saule, and Umit V. Catalyurek
|
Short
|
BigD247
"Classification of Big Velocity Data via Cross-Domain Canonical
Correlation Analysis"
Bo Zhang and Zhongzhi Shi
|
Short
|
BigD212
"Elver: Recommending Facebook Pages in Cold Start Situation
Without Content Features"
Yusheng Xie, Alok Choudhary, Zhengzhang Chen, and Ankit Agrawal
|
Session
C5: New Computational Models for Big Data
|
Regular
|
BigD399
"Map-Based Graph Analysis on MapReduce"
Upa Gupta and Leonidas Fegaras
|
Regular
|
BigD216
"P-DOT: A Model of Computation for Big Data"
Tao Luo, Yin Liao, Yunquan Zhang, and Guoliang Chen
|
Short
|
BigD285
"Optimizing the MapReduce Framework on Intel Xeon Phi
Coprocessor"
Mian Lu, Lei Zhang, Huynh Phung Huynh, Zhongliang Ong, Yun Liang,
Bingsheng He, Rick Siow Mong Goh, and Richard Huynh
|
Short
|
BigD289
"Optimizing Throughput on Guaranteed-Bandwidth WAN Networks for the
Large Synoptic Survey Telescope (LSST)"
Mike Freemon
|
Short
|
BigD390
"GPU-Accelerated Adaptive Compression Framework for Genomics
Data"
Guixin Guo, Shuang Qiu, Mian Lu, BingQiang Wang, Lin Fang, and Simon See
|
Session
D5: Spatiotemporal and Stream Data Management, Scientific
Data Management
|
Regular
|
BigD423
"Spatio-temporal Indexing in Non-relational Distributed
Databases"
Anthony Fox, Chris Eichelberger, James Hughes, and Skylar Lyon
|
Regular
|
BigD245 "Measuring
Inter-Site Engagement"
Elad Yom-Tov, Mounia Lalmas, Ricardo Baeza-Yates, Georges Dupret,
Janette Lehmann, and Pinar Donmez
|
Regular
|
BigD312
"Direct QR factorizations for tall-and-skinny matrices in
MapReduce architectures"
Austin Benson, David Gleich, and James Demmel
|
Short
|
BigD243
"Scientific Discovery through Weighted Sampling"
Lefteris Sidirourgos, Martin Kersten, and Peter Boncz
|
Short
|
BigD294 "On the Use of Shared
Storage in Shared-Nothing Environments"
Krishnaraj Ravindranathan, Aleksander Khasymski, Guanying Wang, Ali
Butt, and Gaurav Makkar
|
Session
AB6: Link and Graph Mining, Mobility and Big Data
|
Short
|
BigD335
"Efficient Large Graph Pattern Mining for Big Data in the
Cloud"
Chun-Chieh Chen, Kuan-Wei Lee, Chih-Chieh Chang, De-Nian Yang, and
Ming-Syan Chen
|
Short
|
BigD417 "A
Hypergraph-Partitioned Vertex Programming Approach for Large-scale
Consensus Optimization"
Hui Miao, Xiangyang Liu, Bert Huang, and Lise Getoor
|
Short
|
BigD366
"Analysis of GSM calls data for understanding user mobility
behavior"
Chiara Renso, Barbara Furletti, Lorenzo Gabrielli, and Salvatore
Rinzivillo
|
Short
|
BigD448 "A Higher-Order Data
Flow Model for Heterogeneous Big Data"
Simon Price and Peter Flach
|
Short
|
BigD284
"DL-MPI: Enabling Data Locality Computation for MPI-based
Data-Intensive Applications"
Jiangling Yin, Andrew Foran, and Jun Wang
|
Short
|
BigD308 "Fast
OLAP Query Execution in Main Memory on Large Data in a Cluster"
Martin Weidner, Jonathan Dees, and Peter Sanders
|
Session
C6: Novel Theoretical Models for Big Data
|
Regular
|
BigD358
"Communication Efficient Algorithms for Fundamental Big Data
Problems"
Peter Sanders, Ingo Müller, and Sebastian Schlag
|
Regular
|
BigD244
"On-Line Learning Gossip Algorithm in Multi-Agent Systems with
Local Decision Rules"
Stephan Clemencon, Pascal Bianchi, Gemma Morral, and Jeremie Jakubowicz
|
Short
|
BigD229
"Transparent Composite Model For Large Scale Image/Video Processing"
Enhui Yang and Xiang Yu
|
Short
|
BigD319
"Elastic Algorithms for Guaranteeing Quality Monotonicity in Big
Data Mining"
Rui Han, Lei Nie, Moustafa M. Ghanem, and Yike Guo
|
Session
D6: Scientific Data Management
|
Regular
|
BigD338 "Adaptive
File Management for Scientific Workflows on the Azure Cloud"
Radu Tudoran, Alexandru Costan, Ramin Rad Rezai, Goetz Brasche, and
Gabriel Antoniu
|
Regular
|
BigD407
"Model-View Sensor Data Management in the Cloud"
Tian Guo, Thanasis G. Papaioannou, and Karl Aberer
|
Short
|
BigD373
"Using Pattern-Models to Guide SSD Deployment for Big Data in HPC
systems"
Junjie Chen, Yong Chen, and Philip C. Roth
|
Short
|
BigD365
"Improving Floating Point Compression through Binary Masks"
Leonardo Bautista Gomez and Franck Cappello
|
Short
|
BigD445
"Segmented Analysis for Reducing Data Movement"
Jialin Liu, Surendra Byna, and Yong Chen
|
Session
AB7: Privacy Preserving Big Data Collection/Analytics, Threat
Detection using Big Data Analytics
|
Regular
|
BigD269
"DP-WHERE: Differentially Private Modeling of Human Mobility"
Darakhshan Mir, Sibren Isaacman, Ramón Cáceres, Margaret Martonosi, and
Rebecca Wright
|
Regular
|
BigD305 "Malicious
URLs Filtering - A Big Data Application"
Min-Sheng Lin, Chien-Yi Chiu, Yuh-Jye Lee, and Hsing-Kuo Pao
|
Regular
|
BigD328
"Zero-Knowledge Private Graph Summarization"
Maryam Shoaran, Alex Thomo, and Jens Weber
|
Short
|
BigD230 "Scalable
Network Traffic Visualization Using Compressed Graphs"
Lei Shi, Qi Liao, and Xiaohua Sun
|
Short
|
BigD391
"Breaking the Arc: Risk Control for Big Data"
Duncan Hodges and Sadie Creese
|
Session
D7: Database Management Challenges: Architecture, Storage,
User Interfaces
|
Regular
|
BigD249 "A
Selective Checkpointing Mechanism for Query Plans in a Parallel
Database System"
Ting Chen and Kenjiro Taura
|
Regular
|
BigD270 "H2RDF+:
High-performance Distributed Joins over Large-scale RDF Graphs"
Nikolaos Papailiou, Ioannis Konstantinou, Dimitrios Tsoumakos,
Panagiotis Karras, and Nectarios Koziris
|
Short
|
BigD447 "Knowledge
Cubes - A Proposal for Scalable and Semantically-Guided Management of
Big Data"
Amgad Madkour, Walid Aref, and Saleh Basalamah
|
Workshop
1: Distributed Storage Systems and Coding for Big Data
|
|
1
|
S1209 "The
Code Rebalancing Problem for a Storage-Flexible Data Center Network"
Iryna Andriyanova, Alan Jule and Emina Soljanin
|
2
|
S1211 "suvfs:
A virtual file system in userspace that supports large files"
Wasim Ahmad Bhat and S.M.K. Quadri
|
3
|
S1213 "Reliability
of Erasure Coded Storage Systems: A Geometric Approach"
Antonio Campello and Vinay Vaishampayan
|
4
|
S1210
"Distributed Storage Evaluation on a Three-Wide Inter-Data Center
Deployment"
Yih-Farn Chen, Scott Daniels, Marios Hadjieleftheriou, Pingkai Liu,
Chao Tian and Vinay Vaishampayan
|
5
|
S1201
"Paired-Replicas with Constant Repair Time: Loss Functions and
Memorylessness"
Vinay Deolalikar
|
6
|
S1202
"Efficient Updates in Cross-Object Erasure-Coded Storage
Systems"
Kyumars Sheykh Esmaili, Aatish Chiniah and Anwitaman Datta
|
7
|
S1208
"Construction of Exact-BASIC Codes for Distributed Storage Systems
at the MSR Point"
Hanxu Hou, Kenneth W. Shum and Hui Li
|
8
|
S1205 "Minimum
Storage BASIC Codes: A System Perspective"
Xianxia Huang, Hui Li, Tai Zhou, Yumeng Zhang, Han Guo, Hanxu Hou, Huayu Zhang, Kai
Pan and Kai Lei
|
9
|
S1207
"Layout-Aware I/O Scheduling for Terabits Data Movement"
Youngjae Kim, Scott Atchley, Geoffroy R. Vallee and Galen M. Shipman
|
|
Date
|
October 7, 2013
|
Location
|
Ballroom E
|
Time
|
Schedule
|
9:00-9:15
|
Plenary
|
9:15-10:00
|
Invited Talk
|
10:00-11:00
|
S1208 "Construction
of Exact-BASIC Codes for Distributed Storage Systems at the MSR
Point"
S1213 "Reliability of Erasure Coded Storage Systems: A Geometric
Approach"
S1209 "The Code Rebalancing Problem for a Storage-Flexible Data
Center Network"
|
11:00-11:20
|
Coffee Time
|
11:20-12:00
|
S1202
"Efficient Updates in Cross-Object Erasure-Coded Storage
Systems"
S1211 "suvfs: A virtual file system in userspace that supports
large files"
|
12:00-14:00
|
Lunch on your own
|
14:00-14:30
|
Invited Talk:
S1205 "Minimum Storage BASIC Codes: A System Perspective"
|
14:30-15:30
|
S1210
"Distributed Storage Evaluation on a Three-Wide Inter-Data Center
Deployment"
S1201 "Paired-Replicas with Constant Repair Time: Loss Functions
and Memorylessness"
S1207 "Layout-Aware I/O Scheduling for Terabits Data
Movement"
|
Workshop
2: Big Data and the Humanities
|
|
1
|
S2203
"Robustness of emotion extraction from 20th century English books"
Alberto Acerbi, Vasileios Lampos and Alexander Bentley
|
2
|
S2228
"VisualPage: Towards Large Scale Analysis of Nineteenth-Century
Print Culture"
Neal Audenaert and Natalie Houston
|
3
|
S2210 "Back
to our Data – Experiments with NoSQL Technologies in the
Humanities"
Tobias Blanke, Michael Bryant and Mark Hedges
|
4
|
S2234 "The
Human Face of Crowdsourcing: A Citizen-led Crowdsourcing Case
Study"
Sheryl Grant, Kristan Shawgo, Richard Marciano, Jeff Heard and
Priscilla Ndiaye
|
5
|
S2219
"Visualization and Rhetoric: Key Concerns for Utilizing Big Data
in Humanities"
Kathleen Kerr, Bernice Hausman, Samah Gad and Waqas Javed
|
6
|
S2224
"Humanities 'Big Data': Myths, challenges, and lessons"
Amalia S. Levi
|
7
|
S2229 "Digging
into Human Rights Violations: Data Modeling Collective Memory"
Ben Miller, Ayush Shrestha, Jason Derby, Jennifer Olive, Fuxin Li,
Yanjun Zhao and Karthikeyan Umapathy
|
8
|
S2231 "The
Royal Birth of 2013: Analysing and Visualising Public Sentiment in the
UK Using Twitter"
Vu Dung Nguyen, Blesson Varghese and Adam Barker
|
9
|
S2221
"Bibliographic Records as Humanities Big Data"
Andrew Prescott
|
10
|
S2209
"Customising Geoparsing and Georeferencing for Historical
Texts"
C.J. Rupp, Paul Rayson, Alistair Baron, Christopher Donaldson, Ian
Gregory, Andrew Hardie and Patricia Murrieta-Flores
|
11
|
S2208 "A
Concept of Generic Workspace for Big Data Processing in
Humanities"
Jedrzej Rybicki, Benedikt von St. Vieth and Daniel Mallmann
|
12
|
S2220 "From
Assets to Stories via the Google Cultural Institute Platform"
William Seales, Steve Crossan, Mark Yoshitake and Sertan Girgin
|
13
|
S2223 "The
Curious Identity of Michael Field and its Implications for Humanities
Research with the Semantic Web"
Susan Brown and John Simpson
|
14
|
S2222
"Infectious Texts: Modeling Text Reuse in Nineteenth-Century
Newspapers"
David Smith, Ryan Cordell and Elizabeth Maddock Dillon
|
15
|
S2204
"Mapping Mutable Genres in Structurally Complex Volumes"
Ted Underwood, Michael Black, Loretta Auvil and Boris Capitanu
|
16
|
S2214 "CKM: A
Shared Visual Analytical Tool for Large-Scale Analysis of Audio-Video
Interviews"
Lu Xiao, Yan Luo and Steven High
|
17
|
S2218 "A Case
Study on Entity Resolution for Distant Processing of Big Humanities
Data"
Weijia Xu, Maria Esteva, Jessica Trelogan and Todd Swinson
|
|
|
|
Date
|
October 8, 2013
|
Location
|
Ballroom E
|
Time
|
Schedule
|
9:30-9:50
|
Coffee Time
|
9:50-12:00
|
S2222 "Infectious Texts: Modeling Text Reuse in
Nineteenth-Century Newspapers"
S2204 "Mapping Mutable Genres in
Structurally Complex Volumes"
S2231 "The Royal Birth of 2013:
Analysing and Visualising Public Sentiment in the UK Using
Twitter"
S2228 "VisualPage: Towards Large
Scale Analysis of Nineteenth-Century Print Culture"
S2218 "A Case Study on Entity
Resolution for Distant Processing of Big Humanities Data"
S2208 "A Concept of Generic
Workspace for Big Data Processing in Humanities"
|
12:00-13:20
|
Lunch (not
provided)
|
13:20-15:20
|
S2229
"Digging into Human Rights Violations: Data Modeling Collective
Memory"
S2214 "CKM: A Shared Visual Analytical Tool for Large-Scale
Analysis of Audio-Video Interviews"
S2219 "Visualization and Rhetoric: Key Concerns for Utilizing Big
Data in Humanities"
S2221 "Bibliographic Records as Humanities Big Data"
S2210 "Back to our Data – Experiments with NoSQL Technologies in
the Humanities"
S2209 "Customising Geoparsing and Georeferencing for Historical
Texts"
S2234 "The Human Face of Crowdsourcing: A Citizen-led
Crowdsourcing Case Study"
S2224 "Humanities 'Big Data': Myths, challenges, and lessons"
S2203 "Robustness of emotion extraction from 20th century English
books"
|
15:20-15:40
|
Coffee Time
|
15:40-17:40
|
S2223 "The
Curious Identity of Michael Field and its Implications for Humanities
Research with the Semantic Web"
S2220 "From Assets to Stories via the Google Cultural Institute
Platform"
|
Workshop
3: Workshop
on Big Data and Society
-- Data Economy, Real-Time Mining and Analytics, Mining
Techniques for Online and Customer Service in the Big Data Era
|
|
1
|
S6207
"Enterprise Pre-Sales Forums: A Preliminary Study of Metadata and
Content"
Vinay Deolalikar
|
2
|
S6212
"Advancing value creation and value capture in data-intensive
contexts"
Roman Ferrando-Llopis, David Lopez-Berzosa and Catherine Mulligan
|
3
|
S6203 "A Cloud
Service for the Evaluation of Company's Financial Health Using
XBRL-based Financial Statements"
Wen-Chiao Hsu, Jyun-Yao Huang, Chi-Hao Chen, Chien-Yu Su, Hsiao-Chen
Shih, Tzu-Ya Liao and I-En Liao
|
4
|
S6209
"Real-Time Data Analysis in ClowdFlows"
Janez Kranjc, Vid Podpečan and Nada Lavrač
|
5
|
S6210 "ma3tch
- privacy and knowledge - dynamic networked collective
intelligence"
Udo Kroon
|
6
|
S6202
"Business Model Canvas Perspective on Big Data Applications"
Fatma Canan Pembe Muhtaroglu, Seniz Demir, Murat Obali and Canan Girgin
|
7
|
S6215
"Understanding the value of (Big) data"
Pantelis Koutroumpis and Aija Leiponen
|
8
|
S6214
"OpenFridge: A Platform for Data Economy for Energy Efficiency
Data"
Slobodanka Dana Kathrin Tomic and Anna Fensel
|
9
|
S6201 "A
Study of Innovation Network Database Construction by Using Big Data and
An Enterprise Strategy Model"
Zhou Wen, Ye Shu Tao and Lu Xiao Long
|
10
|
S6213
"Enhanced User Data Privacy with Pay-by-Data Model"
Chao Wu and Yike Guo
|
11
|
S6206 "Query
Optimization over a Heterogeneously Distributed Scientific
Database"
Helen Xiang
|
12
|
S6204
"Enterprise Data Economy: A Hadoop-Driven Model and Strategy"
Wuheng Luo
|
|
|
|
Date
|
October 7, 2013
|
Location
|
Ballroom F
|
Time
|
Schedule
|
8:00-8:40
|
Registration
(Hotel Lobby West)
|
8:40-9:35
|
Invited talk:
What's around the corner in social commerce? (Jaideep Srivastava)
|
9:35-10:00
|
S6212 "Advancing
value creation and value capture in data-intensive contexts"
|
10:00-10:25
|
Coffee Time
|
10:25-12:30
|
S6202
"Business Model Canvas Perspective on Big Data Applications"
S6203 "A Cloud Service for the Evaluation of Company's Financial Health
Using XBRL-based Financial Statements"
S6207 "Enterprise Pre-Sales Forums: A Preliminary Study of
Metadata and Content"
S6204 "Enterprise Data Economy: A Hadoop-Driven Model and
Strategy"
S6213 "Enhanced User Data Privacy with Pay-by-Data Model"
|
12:30-13:30
|
Lunch on your own
|
13:30-14:25
|
Invited talk:
Large Scale Mining and Modeling of Telecommunication Carrier's Big Data
(Wei Fan)
|
14:25-16:05
|
S6206 "Query
Optimization over a Heterogeneously Distributed Scientific
Database"
S6210 "ma3tch - privacy and knowledge - dynamic networked
collective intelligence"
S6201 "A Study of Innovation Network Database Construction by
Using Big Data and An Enterprise Strategy Model"
S6209 "Real-Time Data Analysis in ClowdFlows"
|
16:05-16:30
|
Coffee Time
|
16:30-17:25
|
S6215
"Understanding the value of (Big) data"
S6214 "OpenFridge: A Platform for Data Economy for Energy
Efficiency Data"
|
Workshop
4: The
First Workshop on Benchmarks, Performance Optimization, and Emerging hardware
of Big Data Systems and Applications (BPOE 2013)
|
|
1
|
S7210
"Optimizing a MapReduce Module of Preprocessing High-Throughput
DNA Sequencing Data"
Wei-Chun Chung, Yu-Jung Chang, Chien-Chih Chen, Der-Tsai Lee and
Jan-Ming Ho
|
2
|
BigD370 "Hash
in a Flash: Hash Tables for Flash Devices"
Tyler Clemons, S M Faisal, Shirish Tatikonda, Charu Aggarwal and
Srinivasan Parthasarathy
|
3
|
S7202 "Memory
system characterization of Big Data workloads"
Martin Dimitrov, Karthik Kumar, Patrick Lu and Vish Viswanathan
|
4
|
S7211
"Performance Evaluation of R with Intel Xeon Phi Coprocessor"
Yaakoub El-Khamra, Niall Gaffney, David Walling, Eric Wernert, Weijia
Xu and Hui Zhang
|
5
|
S7216 "The Implications
from Benchmarking Three Big Data Systems"
Quan Jing, Shi Yingjie, Zhao Ming and Wei Yang
|
6
|
S7205 "A
Performance Evaluation of Hive for Scientific Data Management"
Taoying Liu, Jing Liu, Hong Liu and Wei Li
|
7
|
S7214
"Evaluating Task Scheduling in Hadoop-based Cloud Systems"
Shengyuan Liu, Jungang Xu, Zongzhen Liu and Xu Liu
|
8
|
BigD397
"Efficient Near-Duplicate Document Detection using FPGAs"
Xi Luo, Walid Najjar and Vagelis Hristidis
|
9
|
BigD389 "Workload-Aware
Aggregate Maintenance in Columnar In-Memory Databases"
Stephan Müller, Lars Butzmann, Stefan Klauck and Hasso Plattner
|
10
|
S7206
"Virtualization I/O Optimization Based on Shared Memory"
Fengfeng Ning, Chuliang Weng and Yuan Luo
|
11
|
S7209 "An
Ensemble MIC-based Approach for Performance Diagnosis in Big Data
Platform"
Chen Pengfei, Qi Yong, Li Xinyi and Li Su
|
12
|
S7207 "A
Reconfigurable Stream Compression Hardware based on Static
Symbol-Lookup Table"
Shinichi Yamagiwa and Hiroshi Sakamoto
|
13
|
S7201
"NativeTask: A Hadoop Compatible Framework for High
Performance"
Dong Yang, Xiang Zhong, Dong Yan, Fangqin Dai, Xusen Yin, Cheng Lian,
Zhongliang Zhu, Weihua Jiang and Gansha Wu
|
14
|
S7212 "On
Mixing High-Speed Updates and In-Memory Queries: A Big-Data
Architecture for Real-time Analytics"
Tao Zhong, Kshitij Doshi, Xi Tang, Ting Lou, Zhongyan Lu and Hong Li
|
15
|
S7215 "AxPUE:
Application Level Metrics for Power Usage Effectiveness in Data Centers"
Runlin Zhou, Yingjie Shi and Chunge Zhu
|
16
|
S7217 "A
Characterization of Big Data Benchmarks"
Wen Xiong and Zhibin Yu
|
|
Date
|
October 8, 2013
|
Location
|
Ballroom F
|
Time
|
Schedule
|
9:00-12:00
|
Opening remarks: Jianfeng
Zhan and Weijia Xu
Session one:
Performance optimization of big data systems (Session Chair: Xiaoyi Lu,
OSU)
BigD389 "Workload-Aware Aggregate Maintenance in Columnar
In-Memory Databases"
S7201 "NativeTask: A Hadoop Compatible Framework for High Performance"
S7210 "Optimizing a MapReduce Module of Preprocessing
High-Throughput DNA Sequencing Data"
S7212 "On Mixing High-Speed Updates and In-Memory Queries: A
Big-Data Architecture for Real-time Analytics"
S7206 "Virtualization I/O Optimization Based on Shared
Memory"
S7205 "A Performance Evaluation of Hive for Scientific Data
Management"
|
12:00-13:20
|
Lunch
|
13:20-15:20
|
Session two: Big
Data Benchmarks and Workload characterization (Session Chair: Jianfeng
Zhan, ICT, CAS)
Invited Talk: TBD
S7202 "Memory system characterization of Big Data workloads"
S7211 "Performance Evaluation of R with Intel Xeon Phi
Coprocessor"
S7214 "Evaluating Task Scheduling in Hadoop-based Cloud
Systems"
S7217 "A Characterization of Big Data Benchmarks"
S7209 "An Ensemble MIC-based Approach for Performance Diagnosis in
Big Data Platform"
|
15:20-15:40
|
Break
|
15:40-17:40
|
Session two: Big
Data Benchmarks and Workload characterization (Session Chair: Jianfeng
Zhan, ICT, CAS)
Invited Talk: TBD
S7207 "A Reconfigurable Stream Compression Hardware based on
Static Symbol-Lookup Table"
BigD397 "Efficient Near-Duplicate Document Detection using
FPGAs"
BigD370 "Hash in a Flash: Hash Tables for Flash Devices"
S7216 "The Implications from Benchmarking Three Big Data
Systems"
S7215 "AxPUE: Application Level Metrics for Power Usage
Effectiveness in Data Centers"
Closing remarks (Weijia Xu and Jianfeng Zhan)
|
Workshop
5: The
First Workshop on Big Data Visualization
|
|
1
|
S9209 "Dynamic
Reduction of Query Result Sets for Interactive Visualization"
Leilani Battle, Michael Stonebraker and Remco Chang
|
2
|
S9211
"Overplotting: Unified solutions under Abstract Rendering"
Joseph Cottam, Andrew Lumsdaine and Peter Wang
|
3
|
S9205
"Typograph: Multiscale Spatial Exploration of Text Documents"
Alexander Endert, Russ Burtner, Nick Cramer, Ralph Perko, Shawn Hampton
and Kristin Cook
|
4
|
S9204
"VisReduce: Fast and responsive incremental information visualization
of large datasets"
Jean-Francois Im, Felix Giguere Villegas and Michael J. McGuffin
|
5
|
S9208 "A
System for Large-Scale Visualization of Streaming Doppler Data"
Peter Kristof, Bedrich Benes, Carol X. Song and Lan Zhao
|
6
|
S9210 "Visualization
of Streaming Data: Observing Change and Context in Information
Visualization Techniques"
Milos Krstajic and Daniel A. Keim
|
7
|
S9202
"CompactMap: A Mental Map Preserving Visual Interface for
Streaming Text Data"
Xiaotong Liu, Yifan Hu, Stephen North and Han-Wei Shen
|
8
|
S9207
"Egocentric Storylines for Visual Analysis of Large Dynamic
Graphs"
Chris W. Muelder, Tarik Crnovrsanin, Arnaud Sallaberry and Kwan-Liu Ma
|
9
|
S9206
"GPU-Accelerated Incremental Correlation Clustering of Large Data
in the Cloud with Visual Feedback"
Eric Papenhausen, Bing Wang, Sungsoo Ha, Alla Zelenyuk, Dan Imre and
Klaus Mueller
|
10
|
S9201
"Visualization of Big SPH Simulations via Compressed Octree
Grids"
Florian Reichl, Marc Treib and Rüdiger Westermann
|
11
|
S9203 "A
Novel Visual Analysis Approach for Clustering Large-Scale Social
Data"
Zhangye Wang, Juanxia Zhou, Wei Chen, Chang Chen, Jiyuan Liao and Ross
Maciejewski
|
12
|
S9212 "DriveSense:
Contextual Handling of Large-scale Route Map Data for the
Automobile"
Frederik Wiehr, Vidya Setlur and Alark Joshi
|
|
Date
|
October 6, 2013
|
Location
|
Ballroom AB
|
Time
|
Schedule
|
8:00-8:40
|
Opening
|
8:40-9:30
|
Keynote: "Big
Picture" Mixed-Initiative Visual Analytics of Big Data, Michelle
Zhou, IBM Research
|
9:30-10:00
|
Invited talk: Data
Intensive Visualization and Analysis of Numerically Intensive
Applications, Chris Mitchell, Los Alamos National Laboratory
|
10:00-10:30
|
Coffee Time
|
10:30-12:00
|
Text Data
S9210 "Visualization of Streaming Data: Observing Change and
Context in Information Visualization Techniques"
S9202 "CompactMap: A Mental Map Preserving Visual Interface for
Streaming Text Data"
S9205 "Typograph: Multiscale Spatial Exploration of Text
Documents"
|
12:00-13:30
|
Lunch
|
13:30-14:30
|
Rendering
S9211 "Overplotting: Unified solutions under Abstract Rendering"
S9212 "DriveSense: Contextual Handling of Large-scale Route Map Data
for the Automobile"
|
14:30-15:30
|
Visual Analysis
S9203 "A Novel Visual Analysis Approach for Clustering Large-Scale
Social Data"
S9207 "Egocentric Storylines for Visual Analysis of Large Dynamic
Graphs"
|
15:30-16:00
|
Coffee Time
|
16:00-17:30
|
Scientific Data
S9201 "Visualization of Big SPH Simulations via Compressed Octree
Grids"
S9208 "A System for Large-Scale Visualization of Streaming Doppler
Data"
S9209 "Dynamic Reduction of Query Result Sets for Interactive
Visualization"
|
17:30-18:30
|
Fast, Incremental
Visualization
S9206 "GPU-Accelerated Incremental Correlation Clustering of Large
Data in the Cloud with Visual Feedback"
S9204 "VisReduce: Fast and responsive incremental information
visualization of large datasets"
|
Workshop
6: Big
Data and Science: Infrastructure and Services
|
|
1
|
SC210 "A big
data analytics framework for scientific data management"
Sandro Fiore, Cosimo Palazzo, Alessandro D'Anca, Ian Foster, Dean
Williams and Giovanni Aloisio
|
2
|
BigD337
"Searching Inter-disciplinary Scientific Big Data based on Latent
Correlation Analysis"
Eloy Gonzales, Bun Theang Ong and Koji Zettsu
|
3
|
SC209
"Complete Storm Identification Algorithms from Big Raw Rainfall Data
Using MapReduce Framework"
Kulsawasd Jitkajornwanich, Upa Gupta, Sakthi Kumaran Shanmuganathan,
Ramez Elmasri, Leonidas Fegaras and John McEnery
|
4
|
BigD409 "A
Scalable Data Analysis Platform for Metagenomics"
Wei Tang, Jared Wilkening, Narayan Desai, Wolfgang Gerlach, Andreas
Wilke and Folker Meyer
|
5
|
BigD426
"Rethinking Data Management for Big Data Scientific
Workflows"
Karan Vahi, Mats Rynge, Gideon Juve, Rajiv Mayani and Ewa Deelman
|
6
|
BigD377 "SciFlow:
A Dataflow-Driven Model Architecture for Scientific Computing using
Hadoop"
Pengfei Xuan, Yueli Zheng, Sapna Sarupria and Amy Apon
|
7
|
SC204
"perfSONAR: On-board Diagnostics for Big Data"
Jason Zurawski, Sowmya Balasubramanian, Aaron Brown, Ezra Kissel,
Andrew Lake, Martin Swany, Brian Tierney and Matt Zekauskas
|
|
Date
|
October 6, 2013
|
Location
|
Ballroom C
|
Time
|
Schedule
|
9:00
|
Introduction
|
9:10-10:00
|
Keynote: Noel
Gorelick (Google) - Google Earth Engine
|
10:00-10:20
|
Break
|
10:20-12:00
|
SC204
"perfSONAR: On-board Diagnostics for Big Data"
BigD426 "Rethinking Data Management for Big Data Scientific
Workflows"
BigD377 "SciFlow: A Dataflow-Driven Model Architecture for
Scientific Computing using Hadoop"
SC210 "A big data analytics framework for scientific data
management"
|
12:00-13:30
|
Lunch
|
13:30-14:15
|
SC209
"Complete Storm Identification Algorithms from Big Raw Rainfall
Data Using MapReduce Framework"
BigD337 "Searching Inter-disciplinary Scientific Big Data based on
Latent Correlation Analysis"
BigD409 "A Scalable Data Analysis Platform for Metagenomics"
|
14:15-14:45
|
Lightning Talks (1)
Tom Plunket - Analyzing Cancer-Genome Relationships
Eugen Feller - TBD
|
14:45-15:00
|
Break
|
15:00-15:45
|
Invited Speaker:
Dula Parkinson (Lawrence Berkeley National Laboratory)
"Web interfaces and High-Performance Computing: Solutions to Data
Management, Processing, and Analysis Challenges at the Advanced Light
Source X-ray Facility"
|
15:45-16:00
|
Break
|
16:00-17:00
|
Lightning Talks (2)
Yong Chen - Fast Data Analysis with Integrated Statistical Metadata in
Scientific Datasets
Chaitanya Baru - Lessons Learned from Gordon
Discussion
Close Out
|
Workshop
7: Scalable
Machine Learning: Theory and Applications
|
|
1
|
BigD351
"Assessment of Dimensionality Reduction Based on Communication
Channel Model; Application to Immersive Information Visualization"
Mohammadreza Babaee, Mihai Datcu and Gerhard Rigoll
|
2
|
BigD227
"Hierarchical Feature Learning from Sensorial Data by Spherical
Clustering"
Bonny Banerjee and Jayanta Dutta
|
3
|
BigD436 "Efficient
Learning from Explanation of Prediction Errors in Streaming Data"
Bonny Banerjee and Jayanta Dutta
|
4
|
SD211
"Distributed Pivot Clustering with Arbitrary Distance
Functions"
L. Karl Branting
|
5
|
SD202
"Nearest Neighbor Classification Using Bottom-k Sketches"
Søren Dahlgaard and Christian Igel
|
6
|
SD206
"Feature Selection Strategies for Classifying High Dimensional
Astronomical Data Sets"
Ciro Donalek, Arun Kumar, Ashish Mahabal, S. George Djorgovski, Andrew
Drake, Matthew Graham, Sajeet Philip, Thomas Fuchs and Michael
|
7
|
SD217 "How
Data Partitioning Strategies and Subset Size Influence the Performance
of an Ensemble?"
Majed Farrash and Wenjia Wang
|
8
|
SD207 "Fast
Change Point Detection for Electricity Market Analysis"
William Gu, Jaesik Choi, Ming Gu, Horst Simon and Kesheng Wu
|
9
|
SD209 "A
Novel Integrated Method for Human Multiplex Protein Subcellular
Localization Prediction"
Hong Gu and Junzhe Cao
|
10
|
BigD435
"Learning from Multiple Data Sets with Different Missing
Attributes and Privacy Policies: Parallel Distributed Fuzzy
Genetics-Based Machine Learning Approach"
Hisao Ishibuchi, Masakazu Yamane and Yusuke Nojima
|
11
|
SD214 "Data
Chaos: An Entropy based MapReduce Framework for Scalable Learning"
Jiaoyan Chen, Huajun Chen, Xi Chen, Guozhou Zheng and Zhaohui Wu
|
12
|
SD218
"Exploring Sketches for Probability Estimation with Sublinear
Memory"
Anthony Kleerekoper, Mikel Lujan and Gavin Brown
|
13
|
SD216
"Agglomerative Co-Clustering for Synonymous Phrases Based on
Common Effects and Influences"
Koji Kumanami, Kazuhiro Seki and Kuniaki Uehara
|
14
|
SD201
"Leveraging Memory Mapping for Fast and Scalable Graph Computation
on a PC"
Zhiyuan Lin, Duen Horng Chau and U Kang
|
15
|
SD220
"Scalable Sentiment Classification for Big Data Analysis Using
Naïve Bayes Classifier"
Bingwei Liu, Erik Blasch, Yu Chen, Dan Shen and Genshe Chen
|
16
|
SD221
"Meta-learning for Large Scale Machine Learning with
MapReduce"
Xuan Liu, Xiaoguang Wang, Stan Matwin and Nathalie Japkowicz
|
17
|
SD223
"Frequent Itemset Mining for Big Data"
Sandy Moens, Emin Aksehirli and Bart Goethals
|
18
|
BigD317
"Evaluating Parallel Logistic Regression Models"
Haoruo Peng, Ding Liang and Cyrus Choi
|
19
|
SD203
"Approximate triangle counting algorithms on Multi-cores"
Mahmudur Rahman and Mohammad Al Hasan
|
20
|
SD212 "Tree Labeled
LDA: A Hierarchical Model for Web Summaries"
Anton Slutsky, Xiaohua Hu and Yuan An
|
21
|
SD205
"Nearest Neighbour Regression Outperforms Model-based Prediction
of Specific Star Formation Rate"
Kristoffer Stensbo-Smidt, Christian Igel, Andrew Zirm and Kim
Steenstrup Pedersen
|
22
|
BigD304
"MapReduce Implementation of Variational Bayesian Probabilistic
Matrix Factorization Algorithm"
Naveen Tewari, Hari Koduvely, Sarbendu Guha, Arun Yadav and Gladbin
David
|
23
|
BigD394 "A
Unified Framework for Predicting Attributes and Links in Social
Networks"
Xusen Yin, Bin Wu and Xiuqin Lin
|
24
|
SD219
"Scalable Approximation of Kernel Fuzzy c-Means"
Zijian Zhang and Timothy Havens
|
25
|
SD204 "Large-scale
Restricted Boltzmann Machines on Single GPU"
Yun Zhu, Yanqing Zhang and Yi Pan
|
|
Date
|
October 6, 2013
|
Location
|
Ballroom D
|
Time
|
Schedule
|
8:30-9:20
|
Opening Remarks:
Prof. Irwin King
|
9:20-9:35
|
Invited talk: Alex
Smola, Carnegie Mellon University
|
9:20-9:50
|
SD204
"Large-scale Restricted Boltzmann Machines on Single GPU"
BigD304 "MapReduce Implementation of Variational Bayesian
Probabilistic Matrix Factorization Algorithm"
|
9:50-10:30
|
Break and Poster
Session
|
10:30-11:15
|
Invited talk:
Joseph Gonzalez, University of California, Berkeley
|
11:15-12:00
|
SD217 "How
Data Partitioning Strategies and Subset Size Influence the Performance
of an Ensemble?"
SD201 "Leveraging Memory Mapping for Fast and Scalable Graph
Computation on a PC"
SD203 "Approximate triangle counting algorithms on Multi-cores"
|
12:00-13:30
|
Lunch
|
13:30-14:00
|
Poster Session
|
14:00-14:45
|
Invited talk:
Mikhail Bilenko, Microsoft Research
|
14:45-15:30
|
SD221
"Meta-learning for Large Scale Machine Learning with
MapReduce"
BigD436 "Efficient Learning from Explanation of Prediction Errors
in Streaming Data"
SD202 "Nearest Neighbor Classification Using Bottom-k
Sketches"
|
15:30-16:00
|
Break and Poster Session
|
16:00-16:45
|
Invited talk: Alek
Kolcz, Twitter Inc.
|
16:45-17:00
|
SD212 "Tree
Labeled LDA: A Hierarchical Model for Web Summaries"
|
17:00-18:00
|
Panel Discussion
|
Workshop
8: Big
Data in Bioinformatics and Health Informatics
|
|
1
|
SE207 "Lung
Transplant Outcome Prediction using UNOS Data"
Ankit Agrawal, Reda Al-Bahrani, Mark Russo, Jaishankar Raman and Alok
Choudhary
|
2
|
SE206 "Colon
cancer survival prediction using ensemble data mining on SEER
data"
Reda Al-Bahrani, Ankit Agrawal and Alok Choudhary
|
3
|
SE204 "A Look
at Challenges and Opportunities of Big Data Analytics in
Healthcare"
Ruchie Bhardwaj, Adhiraaj Sethi, Rajesh Vargheese and Raghunath Nambiar
|
4
|
SE201
"Multidimensional Analysis of Fetal Growth Curves"
Mario Bochicchio, Lucia Vaira, Antonella Longo, Antonio Malvasi and
Andrea Tinelli
|
5
|
BigD296 "OWL
Reasoning over Big Biomedical Data"
Xi Chen, Huajun Chen, Ningyu Zhang, Jiaoyan Chen and Zhaohui Wu
|
6
|
SE205 "KUChemBio:
A database of computational chemical biology data sets hosted at the
University of Kansas"
Aaron Smalter Hall and Jun Huan
|
7
|
SE202
"Parallel and Memory-efficient Burrows-Wheeler Transform"
Shinya Hayashi and Kenjiro Taura
|
8
|
SE209 "Content-based
Assessment of the Credibility of Online Healthcare Information"
Meeyoung Park, Hariprasad Sampathkumar, Bo Luo and Xue-wen Chen
|
9
|
BigD382 "BIG
DATA Infrastructures for Pharmaceutical Research"
Christian Seebode, Matthias Ort, Martin Peuker and Christian
Regenbrecht
|
10
|
SE208 "Big
Data Solutions for Predicting Risk-of-Readmission for Congestive Heart
Failure Patients"
Kiyana Zolfaghar, Naren Meadem, Ankur Teredesai, Senjuti Basu Roy,
Si-Chi Chin and Brian Muckian
|
|
Date
|
October 6, 2013
|
Location
|
Ballroom E
|
Time
|
Schedule
|
8:30-8:40
|
Introduction
|
8:40-9:40
|
Keynote talk: Dr.
Belinda Seto, Deputy Director, NIBIB, NIH
|
9:40-10:00
|
SE201
"Multidimensional Analysis of Fetal Growth Curves"
|
10:00-10:20
|
Coffee Time
|
10:20-11:05
|
Invited talk: Dr.
Ida Sim, UC San Francisco
|
11:05-11:25
|
SE208 "Big
Data Solutions for Predicting Risk-of-Readmission for Congestive Heart
Failure Patients"
|
11:25-12:10
|
Health Informatics
Panel
|
12:10-13:30
|
Lunch Break
|
13:30-14:15
|
Invited talk: Dr.
Mark Musen, Stanford University
"Big Data for Big Data: Metadata to Manage Access and Analysis of
Large Biomedical Datasets"
|
14:15-15:40
|
SE206 "Colon
cancer survival prediction using ensemble data mining on SEER data"
BigD296 "OWL Reasoning over Big Biomedical Data"
SE207 "Lung Transplant Outcome Prediction using UNOS Data"
SE209 "Content-based Assessment of the Credibility of Online
Healthcare Information"
|
15:40-16:00
|
Coffee Time
|
16:00-17:10
|
SE202
"Parallel and Memory-efficient Burrows-Wheeler Transform"
SE205 "KUChemBio: A database of computational chemical biology
data sets hosted at the University of Kansas"
SE204 "A Look at Challenges and Opportunities of Big Data
Analytics in Healthcare"
BigD382 "BIG DATA Infrastructures for Pharmaceutical
Research"
An industrial presentation from Dr. Shipeng Yu, Siemens
|
17:10-17:55
|
Bioinformatics
Panel
|
Workshop
9: Scholarly
Big Data: Challenges & issues
|
|
1
|
PID2931527
"The Microsoft Academic Search Challenges at KDD Cup 2013"
Martine De Cock, Senjuti Basu Roy, Swapna Savvana, Vani Mandava, Brian
Dalessandro, Claudia Perlich, William Cukierski and Ben Hamner
|
2
|
sbd
"Bibliometric-enhanced Retrieval Models for Big Scholarly
Information Systems"
Philipp Mayr and Peter Mutschke
|
3
|
PID2929325
"Academic Publishing as a Social Media Paradigm"
Michael E. Payne, Linh B. Ngo and Amy W. Apon
|
4
|
SF201_9005
"Big Spatial Data Mining"
Wang Shuliang, Ding Gangyi and Zhong Ming
|
|
Date
|
October 6, 2013
|
Location
|
Ballroom G
|
Time
|
Schedule
|
8:00-8:30
|
Registration
|
8:30-8:45
|
Welcome
|
8:45-10:00
|
PID2931527 "The
Microsoft Academic Search Challenges at KDD Cup 2013"
|
10:45-11:10
|
sbd
"Bibliometric-enhanced Retrieval Models for Big Scholarly
Information Systems"
|
11:10-11:35
|
SF201_9005
"Big Spatial Data Mining"
|
11:35-12:00
|
PID2929325 "Academic
Publishing as a Social Media Paradigm"
|
12:00-13:30
|
Lunch
|
13:30-15:40
|
Discussion and
breakout sessions
|
15:40-16:00
|
Coffee Time
|
16:00-16:20
|
Closing
|
Workshop
10: Scalable
Cloud Data Management
|
|
1
|
SG206
"Modeling and Querying Data in NoSQL Databases"
Karamjit Kaur and Rinkle Rani
|
2
|
SG201
"Elastic Data Partitioning for Cloud-based SQL Processing
Systems"
Lipyeow Lim
|
3
|
SG203
"Parallel SECONDO: Practical and Efficient Mobility Data Processing
in the Cloud"
Jiamin Lu and Ralf Hartmut Güting
|
4
|
SG204
"Index-Based Join Operations in Hive"
Mahsa Mofidpoor, Nematollaah Shiri and T. Radhakrishnan
|
5
|
SG205 "SLA
data management criteria"
Katerina Stamou, Verena Kantere and Jean-Henry Morin
|
|
Date
|
October 6, 2013
|
Location
|
Ballroom H
|
Time
|
Schedule
|
13:30-14:30
|
Invited talk:
Keynote by Peter Bailis
|
14:30-14:40
|
Break
|
14:40-15:40
|
Session I
SG206 "Modeling and Querying Data in NoSQL Databases"
SG201 "Elastic Data Partitioning for Cloud-based SQL Processing
Systems"
|
15:40-16:00
|
Break
|
16:00-18:00
|
Session II
SG203 "Parallel SECONDO: Practical and Efficient Mobility Data Processing
in the Cloud"
SG204 "Index-Based Join Operations in Hive"
SG205 "SLA data management criteria"
|
Workshop
11: Big Data and Smarter Cities
|
|
1
|
SI205 "Fast
Solution of Load Shedding Problems via a Sequence of Linear Programs"
Harish S. Bhat, Garnet J. Vaz, Juan C. Meza
|
2
|
SI202 "Alarm
Prediction in Large-Scale Sensor Networks - A Case Study in
Railroad"
Hongfei Li, Buyue Qian, Dhaivat Parikh, Arun Hampapur
|
3
|
BigD419 "MiSTRAL:
An Architecture for Low-Latency Analytics on Massive Time Series"
Alice Marascu, Pascal Pompey, Eric Bouillet, Olivier Verscheure,
Michael Wurst, Martin Grund, Philippe Cudre-Mauroux
|
4
|
SI204 "Yellow
Cabs as Red Corpuscles"
Tim Savage, Huy Vo
|
5
|
BigD440
"Scalable Prediction of Energy Consumption using Incremental Time
series Clustering"
Yogesh Simmhan, Muhammad Usman Noor
|
6
|
SI207 "A Big
Data Driven Model for Taxi Drivers' Airport Pick-up Decisions in New
York City"
Anil Yazici, Camille Kamga, Abhishek Singhal
|
|
Date
|
October 9, 2013
|
Location
|
Ballroom E
|
Time
|
Schedule
|
Workshop
12: Knowledge management and Big Data Analytics
|
|
1
|
SJ213 "Managing
Massive Graphs in Relational DBMS"
Ruiwen Chen
|
2
|
SJ211 "A
Distributed Approach for Graph-Oriented Multidimensional Analysis"
Benoît Denis, Amine Ghrab, Sabri Skhiri
|
3
|
SJ206 "Constructing
E-Tourism Platform Based on Service Value Broker: A Knowledge
Management"
Yucong Duan, Jinpeng Wei, Ajay Kattepur, Wencai Du
|
4
|
SJ202 "ADraw:
A novel social network visualization tool with attribute-based layout
and coloring"
Zhenwen Wang, Weidong Xiao, Bin Ge, Hao Xu
|
5
|
SJ207
"IntegrityMR: Integrity Assurance Framework for Big Data Analytics
and Management Applications"
Yongzhi Wang, Jinpeng Wei, Mudhakar Srivatsa, Yucong Duan, Wencai Du
|
6
|
SJ212 "Local Join
Optimization over a Heterogeneously Distributed Scientific
Database"
Helen Xiang
|
7
|
SJ205
"Core-based Community Evolution in Mobile Social Networks"
Hao Xu, Weidong Xiao, Daquan Tang, Jiuyang Tang
|
8
|
BigD333 "Super-sequence
Frequent Pattern Mining on Sequential Dataset"
Xinran Yu, Turgay Korkmaz
|
9
|
SJ203
"Exploring Big Data in Small Forms: A Multi-layered Knowledge
Extraction of Social Networks"
Yun Wei Zhao, Willem-Jan van den Heuvel, Xiaojun Ye
|
10
|
SJ204
"Provenance Comparison for Large-Scale Knowledge Discovery"
Xiang Zhao, Bin Ge, Jiuyang Tang, Weidong Xiao, Haichuan Shang
|
|
Date
|
October 6, 2013
|
Location
|
Ballroom F
|
Time
|
Schedule
|
8:30-8:50
|
Introduction (Chi-Hung
Chi)
|
8:50-9:50
|
SJ203
"Exploring Big Data in Small Forms: A Multi-layered Knowledge
Extraction of Social Networks"
SJ205 "Core-based Community Evolution in Mobile Social
Networks"
|
9:50-10:20
|
Coffee Time
|
10:20-11:50
|
SJ211 "A Distributed
Approach for Graph-Oriented Multidimensional Analysis"
SJ204 "Provenance Comparison for Large-Scale Knowledge
Discovery"
SJ213 "Managing Massive Graphs in Relational DBMS"
|
11:50-13:30
|
Lunch
|
13:30-15:30
|
SJ202 "ADraw:
A novel social network visualization tool with attribute-based layout
and coloring"
SJ206 "Constructing E-Tourism Platform Based on Service Value
Broker: A Knowledge Management"
SJ207 "IntegrityMR: Integrity Assurance Framework for Big Data
Analytics and Management Applications"
SJ212 "Local Join Optimization over a Heterogeneously Distributed
Scientific Database"
|
15:30-16:00
|
Coffee Time
|
16:00-16:30
|
BigD333
"Super-sequence Frequent Pattern Mining on Sequential
Dataset"
|
Posters
|
1
|
P201
"Re-projection of Terabyte-Sized Images"
Peter Bajcsy, Antoine Vandecreme, Mary Brady
|
2
|
P207 "Tile
Based Visual Analytics for Twitter Big Data Exploratory Analysis"
Daniel Cheng, Peter Schretlen, Nathan Kronenfeld, Neil Bozowsky, William
Wright
|
3
|
P216
"Optimizing Queries over Semantically Integrated Datasets on
MapReduce Platforms"
HyeongSik Kim, Kemafor Anyanwu
|
4
|
P211 "Secure
Decoupled Linkage (SDLink) System for Building a Social Genome"
Hye-Chung Kum, Ashok Krishnamurthy, Darshana Pathak, Michael Reiter,
Stanley Ahalt
|
5
|
P206 "Risk
Adjustment of Patient Expenditures: A Big Data Analytics Approach"
Lin Li, Saeed Bagheri, Helena Goote, Asif Hasan, Gregg Hazard
|
6
|
P214
"Parallel Auto-encoder for Efficient Outlier Detection"
Yunlong Ma, Peng Zhang, Yanan Cao, Li Guo
|
7
|
mohPID29228727
"New
Factors for Identifying Influential Bloggers"
Teng-Sheng Moh, SivaNaga Prasad Shola
|
8
|
P205 "A Scalable
Infrastructure of Interactive Evolutionary Computation to Evolve
Services Online with Data"
Masaharu Munetomo, Shitaro Bando
|
9
|
P215 "Big
Data for Business Managers - Bridging the gap between Potential and
Value"
Anmol Rajpurohit
|
10
|
tsumoto_PID2930649
"Granularity-based
Temporal Data Mining in Hospital Information System"
Shusaku Tsumoto, Shoji Hirano, Haruko Iwata
|
11
|
P212
"Observation of Matthew Effects in Sina Weibo Microblogger"
Mengmeng Yang, Yi Zhou, Qu Zhou, Kai Chen, Jianhua He, Xiaokang Yang
|
12
|
P213 "A
framework of Spatial Co-location Mining on MapReduce"
Jin Soung Yoo, Douglas Boulware
|
13
|
P210 "Access
Control for Big Data using Data Content"
Wenrong Zeng, Yuhao Yang, Bo Luo
|
TUTORIAL 1: Online
Learning for Big Data Analytics
Presenters:
Irwin King, Michael R. Lyu and Haiqin Yang,
Department of Computer Science & Engineering
The Chinese University of Hong Kong
Summary:
Big data has ushered in
a new era: science, engineering, and technology produce increasingly
large data streams every day, reaching petabyte and exabyte scales.
Moreover, massive data capturing human activity are online and
available to analyze and to build business models for providing
personalized services in commerce. Learning from big data is a new topic
that expands the area of machine learning. Many new learning techniques
need to be developed to increase the effectiveness and efficiency of
learning from the data. Among them, online learning, which we have
investigated in depth for several years, is one of the most promising
techniques for learning from big data.
The tutorial will investigate several important components of online
learning techniques for big data. First, a brief introduction to the
basic concepts of big data and big data analytics will be given, together
with the basic concepts of different learning paradigms and online
learning, to give an overall map of the techniques developed in this area.
Second, the connection between online learning techniques and big data
will be addressed. Third, some motivating examples will be presented to
illustrate the promise of online learning techniques. Fourth, we will
present different online learning techniques for non-sparse learning
models, sparse learning models, unsupervised learning models, etc. Some
hands-on demos may be given in the tutorial.
The tutorial will conclude by summarizing and reflecting on the trends
of online learning techniques for big data, trends that may reshape this
exciting and dynamic research area, which is worthy of more
detailed investigation for many years to come.
Content:
1. Introduction
1.1 Basic concept of big data and big data analytics
1.2 Basic concept of online learning and its applications
2. Online Learning Algorithms
2.1 Perceptron
2.2 Online non-sparse learning
2.3 Online sparse learning
2.4 Online unsupervised learning
3. Discussion and Q & A
Short Bio.
Prof. King's
Profile
Prof. King's
research interests include machine learning, social computing, web
intelligence, data mining, and multimedia information processing. In
these research areas, he has over 210 technical publications in journals
and conferences. In addition, he has contributed over 20 book chapters
and edited volumes. Moreover, Prof. King has over 30 research and applied
grants. One notable patented system he has developed is the VeriGuide
System, previously known as the CUPIDE (Chinese University Plagiarism
IDentification Engine) system, which detects similar sentences and
performs readability analysis of text-based documents in both English and
in Chinese to promote academic integrity and honesty.
Prof. King is the Book Series Editor for "Social Media and Social
Computing" with Taylor and Francis (CRC Press). He is also an
Associate Editor of the ACM Transactions on Knowledge Discovery from Data
(ACM TKDD) and a former Associate Editor of the IEEE Transactions on
Neural Networks (TNN) and IEEE Computational Intelligence Magazine (CIM).
He is a member of the Editorial Board of the Open Information Systems
Journal, Journal of Nonlinear Analysis and Applied Mathematics, and
Neural Information Processing Letters and Reviews Journal (NIP-LR). He
has also served as Special Issue Guest Editor for Neurocomputing,
International Journal of Intelligent Computing and Cybernetics (IJICC),
Journal of Intelligent Information Systems (JIIS), and International
Journal of Computational Intelligent Research (IJCIR). He is a senior
member of IEEE and a member of ACM, International Neural Network Society
(INNS), and Asian Pacific Neural Network Assembly (APNNA). Currently, he
is serving the Neural Network Technical Committee (NNTC) and the Data
Mining Technical Committee under the IEEE Computational Intelligence
Society (formerly the IEEE Neural Network Society). He is also a member
of the Board of Governors of INNS and a Vice-President and Governing
Board Member of APNNA. He also serves INNS as the Vice-President for
Membership in the Board of Governors.
Prof. King is an associate dean of engineering faculty and a professor at
the Department of Computer Science and Engineering, The Chinese
University of Hong Kong. He received his B.Sc. degree in Engineering and
Applied Science from the California Institute of Technology, Pasadena, and
his M.Sc. and Ph.D. degrees in Computer Science from the University of
Southern California, Los Angeles.
Prof. Lyu's Profile
Prof.
Lyu's research interests include software reliability engineering,
distributed systems, fault-tolerant computing, web technologies, mobile
networks, digital video library, multimedia processing, and video
searching and delivery. He has participated in more than 30 industrial
projects in these areas, and helped to develop many commercial systems
and software tools. He has been frequently invited as a keynote or
tutorial speaker to conferences and workshops in U.S., Europe, and Asia.
Prof. Lyu has published over 400 refereed journal and conference papers
in his research areas. He initiated the first International Symposium on
Software Reliability Engineering (ISSRE) in 1990. He was the Program
Chair for ISSRE'96, Program co-Chair for WWW10, General Chair for
ISSRE'2001, General co-Chair for PRDC'2005, and has served in program
committees for many conferences. He is the editor for two book volumes:
Software Fault Tolerance, published by Wiley in 1995 and the Handbook of
Software Reliability Engineering, published by IEEE and McGraw-Hill in
1996. These books have received an overwhelming response from both academia
and industry. He was an Associate Editor of IEEE Transactions on
Reliability, IEEE Transactions on Knowledge and Data Engineering, and
Journal of Information Science and Engineering. He is currently on the
editorial board of Wiley Software Testing, Verification and Reliability
Journal. He was elected to IEEE Fellow (2004) and AAAS Fellow (2007) for
his contributions to software reliability engineering and software fault
tolerance. He was also named Croucher Senior Research Fellow in 2008 and
IEEE Reliability Society Engineer of the Year in 2010.
Prof. Lyu is currently a Professor in the Computer Science and
Engineering department of the Chinese University of Hong Kong. He
received his B.S. in Electrical Engineering from National Taiwan
University, his M.S. in Computer Science from University of California,
Santa Barbara, and his Ph.D. in Computer Science from University of
California, Los Angeles.
Dr. Yang's
Profile
Dr. Haiqin Yang's
research interests include machine learning, data mining, and financial
engineering. In these areas, he has over 30 technical publications in
journals (JMLR, IEEE TNN, Neurocomputing, IEEE BME, IEEE SMC) and
conferences (ICML, CIKM, IJCNN, ICONIP, etc.). In addition, he has
written two books and four book chapters, and has been granted seven
patents. He has served as a reviewer for many journals and on program
committees for many conferences, e.g., CIKM, ACML, IEEE BigData 2013, and
IEEE BDSE 2013. He has also received many awards, including the "First
Prize" postgraduate paper award in the IEEE Hong Kong Section 2010, the
PCCW Foundation Scholarship, and The Global Scholarship Programme for
Research Excellence. Dr. Yang is currently a Postdoctoral Fellow at The
Chinese University of Hong Kong. He received his B.S. degree in Computer
Science and Technology from Nanjing University, and his M.Phil. and Ph.D.
degrees in Computer Science and Engineering from The Chinese University of
Hong Kong.
TUTORIAL 2: Large-Scale Click-stream
and transaction log mining in practice
Presenters: Uwe Mayer, Nish Parikh, Gyanit Singh
eBay
This tutorial will
summarize state-of-the-art approaches in the growing area of large scale
click-stream mining. It will give an opportunity to data scientists,
researchers and engineers with diverse backgrounds to familiarize themselves
with practical platforms, approaches and tools for extracting actionable
insights and building products from big and diverse data
sources. The organizers will accomplish this goal using three
real-life stories from the field (large scale data initiatives at eBay –
one of the world's largest e-commerce platforms). The tutorial will
feature transaction mining, behavior log mining and time-series
mining. We will talk about building robust recommendation systems
over MapReduce clusters (query suggestions, shipping fee
recommendations). The talk will also cover topics such as removing user
bias from data, using heuristics to make intractable algorithms
practical, and appropriate de-noising and normalization of diverse
data sets. The audience is expected to be familiar with MapReduce
(preferably Hadoop) and to be working or
grappling with data problems. Some basic background in algorithms and
statistics would be beneficial.
Content:
We will present the
tutorial through real applications built at eBay. We will present three
case studies.
· Shipping Recommendation System
· Mining large-scale temporal dynamics with Hadoop
· Query Suggestions at scale with Hadoop
Short Bio.
Uwe Mayer (http://labs.ebay.com/people/uwe-mayer/)
Prior to joining
eBay, Uwe Mayer was a senior research scientist at Yahoo, and was a
director of Analytic Science at FICO. He has been a professor of
mathematics at universities in both the U.S. and in Germany.
Uwe received his MA and PhD in mathematics from the University of Utah
where he was a Fulbright scholar, with an extended research stay at the
Institute for Advanced Studies at Princeton. He carried out his
undergraduate studies with a double major in Mathematics and Computer
Sciences in Germany. Bringing his academic career full circle from
computer sciences to mathematics back to computers, Uwe also has
co-advised a PhD student in data mining at the University of California,
San Diego, and has published in several data mining/machine learning
conferences including KDD.
Nish Parikh
(http://labs.ebay.com/people/nish-parikh/)
Nish Parikh joined eBay Research Labs in
February 2008 and currently is the Head of Data Sciences Research. At
eBay Research Labs, he leads efforts on query analysis, recommender
systems and large-scale data processing from a data science perspective.
Prior to joining eBay Research Labs he was part of the team that launched
eBay's Next Generation Search Engine Voyager which supported near
real-time indexing of products and served billions of search queries
every week. Prior to joining eBay, Nish received an M.S. in Computer
Science from University of Southern California and a B.S. in Electrical
Engineering from Gujarat University where he was awarded a gold medal for
academic excellence. Nish has published in premier conferences such as
SIGIR, KDD, WWW, CIKM and WSDM. In addition to the research community
engagement, Nish is a frequent speaker in industry and big data forums
such as the Hadoop Summit and XLDB.
Gyanit Singh
(http://labs.ebay.com/people/gyanit-singh/)
Gyanit Singh is a Research Scientist at
eBay Research Labs. His research interests are in large scale data
mining, query log mining and large scale data platforms. At eBay he has
worked on problems like query suggestion and recovery from null search.
He has also worked on an in-house MapReduce data platform called Mobius.
Prior to joining eBay, Gyanit completed his master's in Computer Science
at the University of Washington, Seattle. Before that he was at the Indian
Institute of Technology, Delhi, pursuing his bachelor's in Computer Science.
Gyanit has published in premier conferences such as SIGIR, WWW,
APPROX-RANDOM and WSDM. In addition to the research community engagement,
Gyanit is a frequent speaker in industry and big data forums such as the
Hadoop Summit, Hadoop World, ACM Data Mining Camp, and Bay Area Search
Forum.
|
|
|
|
Panel: "Key Issues in Big Data"
|
Panelists:
(1) Dr. Roger R. Schell, USC
(2) Dr. Amr Awadallah, Cloudera, Inc.
(3) Dr. Peter G. Neumann, SRI
(4) Dr. Tomoyuki Higuchi
(5) Dr. Sylvia Osborn, University of Western Ontario
(6) Dr. Justin Zhan, North Carolina A&T State University
(7) Dr. T. Y. Lin, San Jose State University (Chair)
Bios of Panelists
 |
Dr. Roger R. Schell recently joined USC/ISI supporting their Masters of Cyber Security degree program. He is internationally recognized for originating several key modern security design and evaluation techniques, and he holds patents in cryptography, authentication and trusted workstations. For more than a decade he has been co-founder and President of Aesec Corporation, a start-up company providing verifiably secure platforms. Previously Dr. Schell was co-founder and vice president for Gemini Computers, Inc., where he directed development of their highly secure (what NSA called "Class A1") commercial product, the Gemini Multiprocessing Secure Operating System (GEMSOS). He was also the founding Deputy Director of NSA's National Computer Security Center. He has been referred to as the "father" of the Trusted Computer System Evaluation Criteria (the "Orange Book"). Dr. Schell is a retired USAF Colonel. He received a Ph.D. in Computer Science from MIT, an M.S.E.E. from Washington State, and a B.S.E.E. from Montana State. NIST and NSA have recognized Dr. Schell with the National Computer System Security Award. In 2012 he was inducted into the inaugural class of the National Cyber Security Hall of Fame.
|
 |
Dr. Amr Awadallah is the CTO and co-founder of Cloudera, Inc. Before co-founding Cloudera in 2008, Amr (@awadallah) was an Entrepreneur-in-Residence at Accel Partners. Prior to joining Accel he served as Vice President of Product Intelligence Engineering at Yahoo!, and ran one of the very first organizations to use Hadoop for data analysis and business intelligence. Amr joined Yahoo! after they acquired his first startup, VivaSmart, in July of 2000. Amr holds Bachelor's and Master's degrees in Electrical Engineering from Cairo University, Egypt, and a Doctorate in Electrical Engineering from Stanford University. |
 |
Peter G. Neumann (Neumann@CSL.sri.com) has doctorates from Harvard and Darmstadt. After 10 years at Bell Labs in Murray Hill, New Jersey, in the 1960s, during which he was heavily involved in the Multics development jointly with MIT and Honeywell, he has been in SRI's Computer Science Lab since September 1971 -- where he is a Senior Principal Scientist. He is concerned with computer systems and networks, trustworthiness/dependability, high assurance, security, reliability, survivability, safety, and many risks-related issues such as election-system integrity, crypto applications and policies, health care, social implications, and human needs -- especially those including privacy. He is currently PI on two DARPA projects: clean-slate trustworthy hosts for the CRASH program with new hardware and new software, and clean-slate networking for the Mission-oriented Resilient Clouds program. He moderates the ACM Risks Forum (http://www.risks.org), has been responsible for CACM's Inside Risks columns, monthly from 1990 to 2007 and tri-annually since then, chairs the ACM Committee on Computers and Public Policy, and has chaired the National Committee for Voting Integrity (http://www.votingintegrity.org) -- which is about to be disbanded in lieu of many other efforts. He created ACM SIGSOFT's Software Engineering Notes in 1976, was its editor for 19 years, and still contributes the RISKS section. He is on the editorial board of IEEE Security and Privacy. He has participated in four studies for the National Academies of Science: Multilevel Data Management Security (1982), Computers at Risk (1991), Cryptography's Role in Securing the Information Society (1996), and Improving Cybersecurity for the 21st Century: Rationalizing the Agenda (2007). His 1995 book, Computer-Related Risks, is still timely. He is a Fellow of the ACM, IEEE, and AAAS, and is also an SRI Fellow.
He received the National Computer System Security Award in 2002, the ACM SIGSAC Outstanding Contributions Award in 2005, and the Computing Research Association Distinguished Service Award in 2013. In 2012, he was elected to the newly created National Cybersecurity Hall of Fame as one of the first set of inductees. He is a member of the U.S. Government Accountability Office Executive Council on Information Management and Technology, and vestigially the California Office of Privacy Protection advisory council (although that group has been dormant due to the CA budget crunch). He co-founded People For Internet Responsibility (PFIR, http://www.PFIR.org). He has taught courses at Darmstadt, Stanford, U.C. Berkeley, and the University of Maryland. See his website (http://www.csl.sri.com/neumann) for testimonies for the U.S. Senate and House and California state Senate and Legislature, papers, bibliography, further background, etc. See also the Illustrative Risks annotated index of earlier risks incidents, which is more or less up-to-date regarding items relating to election integrity. |
 |
Tomoyuki Higuchi has been Director-General of The Institute of Statistical Mathematics (ISM) and an Executive Director of the Research Organization of Information and Systems (ROIS) since April 2011. He completed his Ph.D. in Geophysics, Faculty of Science, at the University of Tokyo in 1989. Since joining ISM in 1989, he has consistently worked on developing statistical modeling grounded in real-world problems, and has made outstanding contributions to applied research in Bayesian modeling, in particular sequential data assimilation. He is a member of the International Statistical Institute (ISI) and the American Geophysical Union (AGU). |
 |
Sylvia Osborn received her PhD in Computer Science from the University of Waterloo. Since 1977, she has been a faculty member in the Computer Science Department at the University of Western Ontario in London, Ontario, Canada. She is the author of numerous research papers, starting in the database field with dependency theory and object-oriented databases. More recently she has been active in research into role-based access control, including comparison of access control models, administration of access control, and delegation. Recently, she has been focusing on the integration of privacy issues with access control, and how the consideration of privacy of individuals' data does or does not differ from access control. |
 |
Dr. Justin Zhan is the director of ILAB, an Interdisciplinary Research Institute at North Carolina A&T State University. He is a faculty member at the Department of Computer Science, College of Engineering, North Carolina A&T State University. He has previously been a faculty member at Carnegie Mellon University and National Center for the Protection of Financial Infrastructure in South Dakota State. His research interests include Big Data, Information Assurance, Social Computing, and Health Science. He is a steering chair of IEEE International Conference on Social Computing (SocialCom) and IEEE International Conference on Privacy, Security, Risk and Trust (PASSAT). He is currently an editor-in-chief of International Journal of Privacy, Security and Integrity, International Journal of Social Computing and Cyber-Physical Systems, and managing editor of SCIENCE journal. He has served as a conference general chair, a program chair, a publicity chair, a workshop chair, or a program committee member for 200 international conferences and an editor-in-chief, an editor, an associate editor, a guest editor, an editorial advisory board member, or an editorial board member for 30 journals. In recent years, he has published extensively in peer-reviewed journals and conferences. His research has been funded by NSF, DoD, NIH, NSA, etc. |
 |
Tsau Young (T.Y.) Lin received his PhD in Mathematics from Yale University. He is a Professor of Computer Science at San Jose State University and a fellow in the Berkeley Initiative in Soft Computing, University of California. He is the President of the International Granular Computing Society and the Founding President of the International Rough Set Society. He is one of the Editors-in-Chief of the International Journal of Granular Computing, Rough Sets and Intelligent Systems. He has served in various roles for reputable international journals and conferences. His interests include data/text/web mining, data security and granular/rough/soft computing. He received best contribution awards from ICDM01 and the International Rough Set Society (2005), a best service award from IEEE/WIC/ACM WI-IAT 2007, and a pioneer award from GrC 2008. |
Panel: "Big Data Projects Funding: Challenges and Opportunities"
|
Panelists:
(1) Wo Chang, Digital Data Advisor, NIST Information Technology Laboratory
(2) Vasant Honavar, Frymoyer Chair Professor of IST, Penn State Univ.
(3) David Kuehn, Program Manager, Federal Highway Administration
(4) Piyush Mehrotra, Division Chief, NASA Advanced Supercomputing (NAS) Division
Moderator: Vijay Raghavan, UL Lafayette (raghavan@louisiana.edu)
Panel Statement:
Big data and data analytics are among the hottest IT themes in both academia and industry worldwide. At the same time, many governmental agencies are in the midst of great financial uncertainties and significant budget cuts. In this panel, the panelists will present their points of view on funding and federal initiatives for research on Big Data. The discussion will draw on a diverse set of experiences and viewpoints, since the panel consists not only of current and former PDs from several agencies, but also of other experts knowledgeable about various alliances that are being forged in order to face big data challenges and to creatively fund such projects.
Panelists may share controversial points of view and provocative positions on issues, including but not limited to those above, during the panel presentations and discussion.
The following structure will be followed:
(1) Welcome, panel mechanics for discussion and Q & A, introduction of panel members
(2) Presentations from panelists (10-12 min. each, including any quick questions/comments)
(3) Moderator-directed panel Q & A
(4) Questions from the audience and open discussion
Bios of Panelists
 |
Mr. Wo Chang, Digital Data Advisor for the NIST Information Technology Laboratory (ITL), Chair of ISO/IEC JTC/1 SC29 WG11 (MPEG) Multimedia Preservation AHG
Mr. Wo Chang is Digital Data Advisor for the NIST Information Technology Laboratory (ITL). His responsibilities include, but are not limited to, promoting a vital and growing Big Data community at NIST with external stakeholders in commercial, academic, and government sectors. Mr. Chang currently chairs the ISO/IEC JTC/1 SC29 WG11 (MPEG) Multimedia Preservation AHG. Prior to joining the ITL Office, Mr. Chang was manager of the Digital Media Group in ITL, and his duties included overseeing several key projects including digital data, long-term preservation and management of EHRs, motion image quality, and multimedia standards. In the past, Mr. Chang was the Deputy Chair for the US National Body for MPEG (INCITS L3.1) and chaired several other key projects for MPEG, including MPQF, MAF, MPEG-7 Profiles and Levels, and co-chaired the JPEG Search project. Mr. Chang was one of the original members of the W3C's SMIL WG and developed one of the SMIL reference software implementations. Furthermore, Mr. Chang also participated in the HL7 and ISO/IEC TC215 for health informatics and IETF for the protocols development of SIP, RTP/RTCP, RTSP, and RSVP. Mr. Chang's research interests include digital data preservation, cloud computing, big data analytics, content metadata description, digital file formats, multimedia synchronization, and Internet protocols.
|
 |
Dr. Vasant Honavar received his Ph.D. in Computer Science and Cognitive Science in 1990 from the University of Wisconsin Madison, specializing in Artificial Intelligence. From 1990 to 2013, he served on the faculty of Computer Science and of Bioinformatics and Computational Biology at Iowa State University (ISU). At ISU, he directed the Artificial Intelligence Research Laboratory (which he founded in 1990) and the Center for Computational Intelligence, Learning & Discovery (which he founded in 2005) and served as the associate chair (2001-2003) and chair (2003-2005) of the ISU Bioinformatics and Computational Biology Graduate Program, which he helped establish in 1999 with support from an Integrative Graduate Education and Research Training (IGERT) award.
Honavar served as a program director in the Information and Intelligent Systems Division of the Computer and Information Sciences and Engineering directorate of the National Science Foundation (NSF) during 2010-2013 while maintaining his research program at ISU. He led the Big Data Science and Engineering Program, established the NSF-OFR collaboration in Computational and Information Processing Approaches to and Infrastructure in support of, Financial Research and Analysis and Management, contributed to Smart and Connected Health, Information Integration and Informatics, Expeditions in Computing, Science of Learning Centers, Integrative Graduate Education and Research Training, Computing Research Infrastructure Programs. In September 2013, Honavar joined the faculty of Penn State University where he will serve as the Frymoyer Chair Professor of Information Science and Technology and lead new research and educational initiatives in Data Sciences and contribute to initiatives in Life Sciences.
Honavar's current research and teaching interests include Artificial Intelligence, Machine Learning, Bioinformatics, Big Data Analytics, Computational Molecular Biology, Data Mining, Discovery Informatics, Information Integration, Knowledge Representation and Inference, Semantic Technologies, Social Informatics, Security Informatics, and Health Informatics. Honavar has served, or currently serves, on the editorial boards of several journals and program committees of several major research conferences in these areas.
Honavar has led research projects funded by NSF, NIH, and USDA that have resulted in foundational research contributions (documented in over 250 peer-reviewed publications) in algorithms for constructing predictive models from sequence, image, text, multi-relational, graph-structured data; Scalable algorithms for building predictive models from large, distributed, semantically disparate data (big data); Knowledge base and service federation; Representing and reasoning about preferences; and applications in bioinformatics, computational biology, immunoinformatics, energy informatics, health informatics and social informatics.
Honavar received the Iowa Board of Regents Award for Faculty Excellence in 2007, the Iowa State University College of Liberal Arts and Sciences Award for Research Excellence in 2008, and the Iowa State University Margaret Ellen White Graduate Faculty Award in 2011. Honavar received the NSF Director’s Award for Superior Accomplishment in 2013 for his leadership of the NSF Big Data Program. However, he considers the 28 Ph.D. students whom he has mentored and trained during his academic career to be his proudest accomplishment.
|
 |
David Kuehn is the Program Manager for the Federal Highway Administration (FHWA) Exploratory Advanced Research Program. The Program Manager serves as the senior advisor to agency leadership on the communication and coordination of exploratory advanced research activities and fosters partnerships with other Federal agencies, national scientific societies and organizations, and the academic community in support of the Program. The program focuses on longer-term, higher-risk research with the potential for transformational improvements to the transportation system. David entered federal service as a Presidential Management Fellow. Before working at the federal level, David worked in local government and as a consultant in southern California. He holds a Master of Public Administration from the University of Southern California and a B.A. from the University of California, Irvine, and is a member of the American Institute of Certified Planners (AICP). |
 |
As Division Chief of the NASA Advanced Supercomputing (NAS) Division, Dr. Piyush Mehrotra oversees the full range of high-performance computing services for NASA's premier supercomputing center. The Division focuses on the advanced computing needs of NASA scientists and engineers, including in the areas of accelerator technologies, collaborative computing, cloud computing, data analytics, and quantum computing. Dr. Mehrotra has over 30 years of R&D experience in parallel programming languages, including compilers and runtime systems for shared- and distributed-memory systems, and in middleware infrastructure for grid environments. Recently, his research has focused on performance characterization, benchmarking, and effective utilization of parallel systems, including HPC clouds. He has published over 100 articles in journals and conferences, edited two books, and served as editor for several issues of international computer science journals. |
|