IEEE BigData 2013 Program Schedule

Hyatt Regency Santa Clara
CA, USA
Oct 6-9, 2013

Program

 


 


 

Keynote Lecture: 60 minutes (about 45 minutes for talk and 15 minutes for Q and A)
Main conference regular paper: 25 minutes (about 20 minutes for talk and 5 minutes for Q and A)
Main conference short paper: 20 minutes (about 16 minutes for talk and 4 minutes for Q and A)

 


 

5-Oct

 

17:00-20:00

Registration: Ballroom E Foyer

 

 

6-Oct

 

07:30-18:00

Registration: Hotel Lobby West

Venue:

Ballroom AB (Ba-AB), Ballroom C (Ba-C), Ballroom D (Ba-D), Ballroom E (Ba-E),

Ballroom F (Ba-F), Ballroom G (Ba-G), Ballroom H (Ba-H)

08:30-12:10

Workshop

5

 

The First Workshop on Big Data Visualization

Workshop

6

Big Data and Science: Infrastructure and Services

Workshop

7

 

Scalable Machine Learning: Theory and Applications

  Workshop

8

 

Big Data in Bioinformatics and Health Informatics

Workshop

12

 

Knowledge management and Big Data Analytics

Workshop

9

 

Scholarly Big Data: Challenges & issues

Tutorial 1

 

 

Online Learning for Big Data Analytics  (8-10am)

 

 

Tutorial 2

 

 

Large-Scale Click-stream and transaction log mining in practice

(10:20am-12:20pm)

Session Chairs

Kwan-Liu Ma

Shane Canon

Haiqin Yang

  Juan Huan et al.

Qing Liu

  Ingemar J. Cox

 

 

Venue:

Ba-AB

 

Ba-C

Ba-D

 

Ba-E

 

Ba-F

Ba-G

Ba-H

Ba-H

 

Coffee break:     10:00-10:20   Foyer

12:10-13:30

Lunch on your own

13:30-18:00

Workshop

5

 

The First Workshop on Big Data Visualization

Workshop

6

 

Big Data and Science: Infrastructure and Services

Workshop

7

 

Scalable Machine Learning: Theory and Applications

Workshop

8

 

Big Data in Bioinformatics and Health Informatics

Workshop

12

 

Knowledge management and Big Data Analytics

Workshop

9

 

Scholarly Big Data: Challenges & issues

Workshop 10

 

Scalable Cloud Data Management

 

   

 

 

 

 

 

 

 

 

 

Session Chairs

Kwan-Liu Ma

Shane Canon

Haiqin Yang

Juan Huan et al.

Qing Liu

Ingemar J. Cox

Norbert Ritter

        

Venue:

Ba-AB

Ba-C

Ba-D

Ba-E

 

Ba-F

Ba-G

Ba-H

 

 

Coffee break:     15:40-16:00   Foyer

 

7-Oct

 

07:30-18:00

 

Registration: Hotel Lobby West

Venue:

Ballroom AB (Ba-AB), Ballroom C (Ba-C), Ballroom D (Ba-D), Ballroom E (Ba-E), Ballroom F (Ba-F)

08:10-08:25

Opening and Welcoming Speech

Conference Co-Chairs:

T.Y. Lin, Vijay Raghavan, Benjamin Wah

Program Co-Chairs:

Ricardo Baeza-Yates, Geoffrey Fox, Cyrus Shahabi, Matthew Smith, Qiang Yang

Industry Program Co-Chairs:

Rayid Ghani, Wei Han, Ronny Lempel, Raghunath Nambiar

BigData Steering Committee Chair:  

Xiaohua Tony Hu (Drexel University)

Venue:

Ba-AB

08:25-09:25

Session Chair:  Geoffrey Fox

Keynote Lecture 1:  The Berkeley Data Analytics Stack: Present and Future

 

Prof. Mike Franklin, AMP Lab, UC Berkeley, USA

Venue:

Ba-AB

09:25-09:45

Coffee Break: Foyer

Poster session setup: Ballroom Foyer

09:45-12:00

Session AB1

 

Algorithms and Systems for Big Data Search

 

Session C1

 

Cloud/Grid/Stream Computing for Big Data

Session D1

 

Complex Big Data Applications

Workshop 1

 

Distributed Storage Systems and Coding for Bigdata

Workshop 3

 

Workshop on Big Data and Society

Session Chair

Umit Catalyurek

Natasha Balac

Qunzhi Zhou

Hui Li et al.

Yike Guo et al.

 

Venue

Ba-AB

Ba-C

Ba-D

Ba-E

Ba-F

12:00-13:20

Lunch on your own

 

 Poster session setup: Ballroom  Foyer

13:20-15:20

Session AB2

 

Algorithms and Systems for Big Data Search

 

Session C2

 

High Performance/ Parallel Computing Platforms for Big Data

 

Session D2

 

Complex Big Data Applications

Workshop 1

 

Distributed Storage Systems and Coding for Bigdata

Workshop 3

 

Workshop on Big Data and Society

Session Chair

Michael Goodrich

Eugen Feller

Saumyadipta Pyne

Hui Li et al.

Yike Guo et al.

 

Venue:

Ba-AB

Ba-C

Ba-D

Ba-E

Ba-F

15:20-15:40

Coffee

 

 

 

 

15:40-17:40

Session AB3

 

Big Data Search Architectures, Scalability and Efficiency

 

Session C3

 

High Performance/ Parallel Computing Platforms for Big Data

Session D3

 

Complex Big Data Applications

Workshop 1

 

Distributed Storage Systems and Coding for Bigdata

Workshop 3

 

Workshop on Big Data and Society

Session Chair

Peter Sanders

Toshimori

En-hui Yang

Hui Li et al.

Yike Guo et al.

 

Venue:

Ba-AB

Ba-C

Ba-D

Ba-E

Ba-F

18:30-20:30

Banquet:

Santa Clara Ballroom

 

 

 

 

 

 

 

 

8-Oct

 

08:00-18:00

Registration: Hotel Lobby West

Venue:

Ballroom AB (Ba-AB), Ballroom C (Ba-C), Ballroom D (Ba-D), Ballroom E (Ba-E), Ballroom F (Ba-F)

8:30-9:30

Session Chair:   Cyrus Shahabi

Keynote Lecture 2:  Using Crowdsourcing for Data Analytics

Prof. Hector Garcia-Molina, Stanford University, USA

Venue:

Ba-AB

9:30-9:50

Coffee Break: Ballroom Foyer

9:50-12:00

Session AB4

 

Large-scale Recommendation Systems and Social Media Systems

 

Session C4

 

Energy-efficient Computing for Big Data

Session D4

 

Data Preservation, Information Integration and Heterogeneous and Multi-structured Data Integration

 

Workshop 2

 

Big Data and the Humanities

Workshop 4

 

BPOE 2013

Session Chair

Noriaki Kawamae

Leonardo Bautista

Yong Chen

Mark Hedges et al.

Jianfeng Zhan et al.

Venue:

Ba-AB

Ba-C

Ba-D

Ba-E

Ba-F

12:00-13:20

Lunch provided by conference: TERRA COURTYARD

13:20-15:20

Session AB5

 

Link and Graph Mining

Session C5

 

New Computational Models for Big Data

Session D5

 

Spatiotemporal and Stream Data Management, Scientific Data Management

 

Workshop 2

 

Big Data and the Humanities

Workshop 4

 

BPOE 2013

Session Chair:

Qi Liao

Shestakov Denis

Frank Dehne

Mark Hedges et al.

Jianfeng Zhan et al.

Venue:

Ba-AB

Ba-C

Ba-D

Ba-E

Ba-F

15:20-15:40

Coffee Break: Ballroom Foyer

15:40-17:40

Session AB6

 

Link and Graph Mining, Mobility and Big Data

 

Session C6

 

Novel Theoretical Models for Big Data

Session D6

 

Scientific Data Management

Workshop 2

 

Big Data and the Humanities

Workshop 4

 

BPOE 2013

Session Chair

Abhirup Chakraborty

Weijia Xu

Andreas Rauber

Mark Hedges et al.

Jianfeng Zhan et al.

Venue:

Ba-AB

Ba-C

Ba-D

Ba-E

Ba-F

 

9-Oct

 

08:00-15:00

Registration:

Venue:

Ballroom AB (Ba-AB), Ballroom C (Ba-C), Ballroom D (Ba-D), Ballroom E (Ba-E), Ballroom F (Ba-F)

08:30-09:30

Session Chair: T.Y. Lin

Keynote Lecture 3:  Security – A Big Question for Big Data

 

Prof. Roger Schell, University of Southern California, USA

 

Venue:

Ba-AB

09:30-09:50

Coffee Break: Ballroom Foyer

9:50-12:00

Key Issues in Big Data Research Panel

Session E1

 

Industry and Government Program

Session D7

 

Database Management Challenges: Architecture, Storage, User Interfaces

 

Workshop 11

 

Big Data and Smarter Cities

Session E2

 

Industry and Government Program

Session Chair

T.Y. Lin

Avigdor Gal

Mihajlo Grbovic

Sambit Sahu

Nikos Papailiou

Venue:

Ba-AB

Ba-C

Ba-D

Ba-E

Ba-F

12:00-13:25

Lunch provided by conference: TERRA COURTYARD

13:30-14:30

Session Chair: Raghunath Nambiar

Keynote Lecture 4:  Key Usage Patterns for Apache Hadoop in the Enterprise

Dr. Amr Awadallah, CTO, Cloudera, USA

 

Venue: Ba-AB

 

 

 

 

 

Session Chair

Venue:

Workshop 11

 

Big Data and Smarter Cities

 

 

Sambit Sahu

Ba-C

Session AB7

 

Privacy Preserving Big Data Collection/Analytics, Threat Detection using Big Data Analytics

 

Simon Chan

Ba-F

14:30-14:50 

Coffee Break:

Ballroom Foyer

 

 

 

14:40-16:50

 

 

 

 

 

Session Chair  

Big Data Funding Program Panel: Challenges and Opportunities

 

 

 

Vijay Raghavan

 

 

 Workshop 11

 

Big Data and Smarter Cities

 


Sambit Sahu

 

 

Venue:

Ba-AB

Ba-C

 

 

 

 

 

 

 

 

 

Keynote Lectures: 4


Keynote 1:

 

Title:  The Berkeley Data Analytics Stack: Present and Future

 

Speaker:

Prof. Mike Franklin, UC Berkeley, USA

 

Abstract:

The Berkeley AMPLab was founded on the idea that the challenges of emerging Big Data applications require a new approach to analytics systems. Launching in early 2011, the project set out to rethink the traditional analytics stack, breaking down technical and intellectual barriers that had arisen during decades of evolutionary development. The vision of the lab is to seamlessly integrate the three main resources available for making sense of data at scale: Algorithms (such as machine learning and statistical techniques), Machines (in the form of scalable clusters and elastic cloud computing), and People (both individually as analysts and en masse, as with crowdsourced human computation). To pursue this goal, we assembled a research team with diverse interests across computer science, forged relationships with domain experts on campus and elsewhere, and obtained the support of leading industry partners and major government sponsors. The lab is realizing its ideas through the development of a freely-available Open Source software stack called BDAS: the Berkeley Data Analytics Stack. In the nearly three years the lab has been in operation, we've released major components of BDAS. Several of these components have gained significant traction in industry and elsewhere: the Mesos cluster resource manager, the Spark in-memory computation framework, and the Shark query processing system. BDAS shows up prominently in many industry discussions of the future of the Big Data analytics ecosystem - a rare degree of impact for an ongoing academic project. Given this initial success, the lab is continuing on its research path, moving "up the stack" to better integrate and support deep machine learning and to make people a full-fledged resource for making sense of Big Data.

In this talk, I'll first outline the motivation and insights behind our research approach and describe how we have organized to address the cross-disciplinary nature of Big Data challenges. I will then describe the current state of BDAS with an emphasis on the key components listed above and will address our current efforts on machine learning scalability and ease of use, and hybrid human/computer processing. Finally I will present our current views of how all the pieces will fit together to form a system that can adaptively bring the right resources to bear on a given data-driven question to meet time, cost and quality requirements throughout the analytics lifecycle.

 

Short Bio:

Michael Franklin is the Thomas M. Siebel Professor of Computer Science at UC Berkeley, where he also serves as Director of the Algorithms, Machines and People Lab (AMPLab). The Berkeley AMPLab is a collaboration of over 60 researchers supported by Founding Sponsors Amazon Web Services, Google, and SAP, along with 17 other leading companies, the DARPA XData program, and an NSF Expeditions in Computing award. The latter was announced as part of the Obama Administration's Big Data research initiative in 2012. His research interests include large-scale data management and analytics, data integration, and hybrid human/computer data processing systems. He was founder and CTO of Truviso, a real-time data analytics company acquired by Cisco Systems in 2012. He is an ACM Fellow and two-time winner of the ACM SIGMOD Test of Time Award (2013 and 2004). He also recently received the Best Paper awards at ICDE 2013 and NSDI 2012, a "Best of VLDB 2012" selection, Best Demo awards at SIGMOD 2012 and VLDB 2011 and the Outstanding Advisor Award from the Computer Science Graduate Student Association at Berkeley. He is a committee member on the U.S. National Academy of Sciences study on Analysis of Massive Data and a Transportation Research Board committee on long-term data stewardship. Prof. Franklin received his Ph.D. in Computer Science from the University of Wisconsin-Madison in 1993.

 

Keynote 2:

Title: Using Crowdsourcing for Data Analytics

Speaker:

Prof. Hector Garcia-Molina, Stanford University, USA

 

Abstract:

It may sound contradictory to use humans to analyze big data, since humans cannot process huge amounts of data, may be error prone and are relatively slow. However, humans can do certain tasks much better than machines, e.g., tasks that involve image analysis or natural language.

In this talk I will discuss how humans can be judiciously used to improve data analytics by cleansing, clustering and filtering critical data. I will also briefly describe ongoing work at our Stanford InfoLab in this area.

Short Bio:

Hector Garcia-Molina is the Leonard Bosack and Sandra Lerner Professor in the Departments of Computer Science and Electrical Engineering at Stanford University, Stanford, California. He was the chairman of the Computer Science Department from January 2001 to December 2004. From 1997 to 2001 he was a member of the President's Information Technology Advisory Committee (PITAC). From 1979 to 1991 he was on the faculty of the Computer Science Department at Princeton University, Princeton, New Jersey. He received a BS in electrical engineering from the Instituto Tecnologico de Monterrey, Mexico, in 1974. From Stanford University, Stanford, California, he received an MS in electrical engineering in 1975 and a PhD in computer science in 1979. He holds an honorary PhD from ETH Zurich (2007). Garcia-Molina is a Fellow of the Association for Computing Machinery and of the American Academy of Arts and Sciences; is a member of the National Academy of Engineering; received the 1999 ACM SIGMOD Innovations Award; is a Venture Advisor for Onset Ventures, and is a member of the Board of Directors of Oracle.

 

 

Keynote 3:


Title: Security – A Big Question for Big Data

 

Speaker:

Prof. Roger Schell, University of Southern California, USA

 

Abstract:

Big data implies performing computation and database operations for massive amounts of data, remotely from the data owner’s enterprise. Since a key value proposition of big data is access to data from multiple and diverse domains, security and privacy will play a very important role in big data research and technology. The limitations of standard IT security practices are well-known, making the ability of attackers to use software subversion to insert malicious software into applications and operating systems a serious and growing threat whose adverse impact is intensified by big data. So, a big question is what security and privacy technology is adequate for controlled assured sharing for efficient direct access to big data. Making effective use of big data requires access from any domain to data in that domain, or any other domain it is authorized to access. Several decades of trusted systems developments have produced a rich set of proven concepts for verifiable protection to substantially cope with determined adversaries, but this technology has largely been marginalized as “overkill” and vendors do not widely offer it. This talk will discuss pivotal choices for big data to leverage this mature security and privacy technology, while identifying remaining research challenges.


Short Bio:

Dr. Roger R. Schell recently joined USC/ISI supporting their Masters of Cyber Security degree program. He is internationally recognized for originating several key modern security design and evaluation techniques, and he holds patents in cryptography, authentication and trusted workstation. For more than a decade he has been co-founder and President of Aesec Corporation, a start-up company providing verifiably secure platforms. Previously Dr. Schell was co-founder and vice president for Gemini Computers, Inc., where he directed development of their highly secure (what NSA called “Class A1”) commercial product, the Gemini Multiprocessing Secure Operating System (GEMSOS). He was also the founding Deputy Director of NSA’s National Computer Security Center. He has been referred to as the "father" of the Trusted Computer System Evaluation Criteria (the "Orange Book"). Dr. Schell is a retired USAF Colonel. He received a Ph.D. in Computer Science from MIT, an M.S.E.E. from Washington State, and a B.S.E.E. from Montana State. The NIST and NSA have recognized Dr. Schell with the National Computer System Security Award. In 2012 he was inducted into the inaugural class of the National Cyber Security Hall of Fame.

 

Keynote 4:

Title: Key Usage Patterns for Apache Hadoop in the Enterprise

Speaker:

Dr. Amr Awadallah, CTO, Cloudera, USA

 

Abstract:

Advances in computing capabilities are palpably evident throughout many industries manifest by unprecedented, large-scale data integration and inferencing. Branded as “big-data” in many cases, the question of whether such techniques can leverage advances in biomedicine and clinical practice are obvious. High-throughput clinical analytics, synthesizing genomic and clinical attributes of a particular patient, portends predictive models that can directly influence clinical care decisions. However, to make this widely shared vision practical and scalable, barriers attributable to data heterogeneity dominate. Methods and strategies to increase the comparability and consistency of healthcare related data will be discussed.


Short Bio:

Before co-founding Cloudera in 2008, Amr (@awadallah) was an Entrepreneur-in-Residence at Accel Partners. Prior to joining Accel he served as Vice President of Product Intelligence Engineering at Yahoo!, and ran one of the very first organizations to use Hadoop for data analysis and business intelligence. Amr joined Yahoo after they acquired his first startup, VivaSmart, in July of 2000. Amr holds Bachelor’s and Master’s degrees in Electrical Engineering from Cairo University, Egypt, and a Doctorate in Electrical Engineering from Stanford University.

 

 

Conference Paper Presentations

 

Session AB1: Algorithms and Systems for Big Data Search

Regular

BigD220 "4S: Scalable Subspace Search Scheme"
Hoang Vu Nguyen, Emmanuel Müller, and Klemens Böhm

Regular

BigD254 "Computing Betweenness Centrality in External Memory"
Lars Arge, Michael Goodrich, and Freek van Walderveen

Regular

BigD282 "NUMA-optimized Parallel Breadth-first Search on Multicore Single-node System"
Yuichiro Yasui, Katsuki Fujisawa, and Kazushige Goto

Short

BigD248 "A Distributed Tree Data Structure For Real-Time OLAP On Cloud Architectures"
Frank Dehne, Quan Kong, Andrew Rau-Chaplin, Hamidreza Zaboli, and Rebecca Zhou

Short

BigD323 "Group-Scheme: A Universal SIMD-based Compression Scheme"
Xudong Zhang, Xin Zhao, Dongdong Shan, and Hongfei Yan

 

Session C1: Cloud/Grid/Stream Computing for Big Data

Short

BigD287 "On the Performance and Energy Efficiency of Hadoop Deployment Models"
Eugen Feller, Lavanya Ramakrishnan, and Christine Morin

Short

BigD355 "Scalable and Robust Key Group Size Estimation For Reducer Load Balancing in MapReduce"
Wei Yan, Yuan Xue, and Bradley Malin

Short

BigD359 "Robot: An Efficient Model For Big Data Storage Systems Based On Erasure Coding"
Chao Yin, Jianzong Wang, Changsheng Xie, Jiguang Wan, and Changlin Long

Short

BigD450 "Towards Hybrid Online On-Demand Querying of Realtime Data with Stateful Complex Event Processing"
Qunzhi Zhou, Yogesh Simmhan, and Viktor Prasanna

Short

BigD453 "DDSN: Duplicate Detection to Reduce Both Storage and Bandwidth Consumption"
Jiaran Zhang, Xiaohui Yu, and Liwei Lin

Short

BigD414 "An Infrastructure for Automating Large-scale Performance Studies and Data Processing"
Deepal Jayasinghe, Josh Kimball, Tao Zhu, Siddharth Choudhary, and Calton Pu

 

Session D1: Complex Big Data Applications 

Regular

BigD288 "The BTWorld Use Case for Big Data Analytics: Description, MapReduce Logical Workflow, and Empirical Evaluation"
Tim Hegeman, Bogdan Ghit, Mihai Capotă, Jan Hidders, Dick Epema, and Alexandru Iosup

Regular

BigD311 "Modeling Heterogeneous Time Series Dynamics to Profile Big Sensor Data in Complex Physical Systems"
Bin Liu

Regular

BigD332 "Efficiently Extracting Frequent Subgraphs using MapReduce"
Wei Lu, Gang Chen, Anthony Tung, and Feng Zhao

Regular

BigD342 "Opinion mining with word order"
Noriaki Kawamae

Short

BigD252 "HIG – An In-memory Database Platform Enabling Real-time Analyses of Genome Data"
Matthieu-P. Schapranow and Hasso Plattner

 

Session E1: Industry and Government Program

Regular

N203 "Terabyte-sized Image Computations on Hadoop Cluster Platforms"
Peter Bajcsy, Antoine Vandecreme, Julien Amelot, Phuong Nguyen, Joe Chalfoun, and Mary Brady

Regular

N206 "A Fast and Scalable Method for Threat Detection in Large-scale DNS Logs"
Ron Begleiter, Yuval Elovici, Yona Hollander, Ori Mendelson, Lior Rokach, and Roi Saltzman

Regular

N207 "Hourglass: a Library for Incremental Processing on Hadoop"
Matthew Hayes and Sam Shah

Regular

N209 "Correlation-based Performance Analysis for Full-System MapReduce Optimization"
Qi Guo, Yan Li, Tao Liu, Kun Wang, Guancheng Chen, Xiaoming Bao, and Wentao Tang

Regular

N217 "Large Scale Ad Latency Analysis"
Mihajlo Grbovic, Jon Malkin, and Hirakendu Das

 

Session AB2: Algorithms and Systems for Big Data Search

Regular

BigD315 "A Distributed Vertex-Centric Approach for Pattern Matching in Massive Graphs"
Arash Fard, M. Usman Nisar, Lakshmish Ramaswamy, John A. Miller, and Matthew Saltz

Regular

BigD318 "Fast Scalable Selection Algorithms for Large Scale Data"
Lee Thompson, Weijia Xu, and Daniel Miranker

Regular

BigD410 "Distributed Confidence-Weighted Classification on MapReduce"
Nemanja Djuric, Mihajlo Grbovic, and Slobodan Vucetic

Short

BigD350 "A Streaming Partitioning Approach to Processing Large Scale Distributed Graph Datasets"
Rui Wang and Kenneth Chiu

Short

BigD402 "Scaling Concurrency of Personalized Semantic Search over Large RDF Data"
Haizhou Fu, Hyeongsik Kim, and Kemafor Anyanwu

 

Session C2: High Performance/Parallel Computing  Platforms for Big Data

Regular

BigD279 "HFSP: Size-based Scheduling for Hadoop"
Mario Pastorelli, Antonio Barbuzzi, Damiano Carra, Matteo Dell'Amico, and Pietro Michiardi

Regular

BigD314 "An Evaluation Study of BigData Frameworks for Graph Processing"
Benedikt Elser and Alberto Montresor

Short

BigD225 "Hardware acceleration of Hadoop MapReduce"
Toshimori Honjo and Kazuki Oikawa

Short

BigD339 "Algebraic Dataflows for Big Data Analysis"
Jonas Dias, Eduardo Ogasawara, Daniel de Oliveira, Fabio Porto, Patrick Valduriez, and Marta Mattoso

Short

BigD360 "Multilevel Active Storage for Big Data Applications in High Performance Computing"
Chao Chen and Yong Chen

 

Session D2: Complex Big Data Applications 

Regular

BigD341 "Explaining the Product Range Effect in Purchase Data"
Diego Pennacchioli, Michele Coscia, Salvatore Rinzivillo, Dino Pedreschi, and Fosca Giannotti

Regular

BigD372 "Parallel Deterministic Annealing Clustering and its Application to LC-MS Data Analysis"
Geoffrey Fox, D. R. Mani, and Saumyadipta Pyne

Regular

BigD378 "Terabyte-scale image similarity search: experience and best practice"
Diana Moise, Denis Shestakov, Gylfi Gudmundsson, and Laurent Amsaleg

Short

BigD266 "Real-time streaming mobility analytics"
Andras Garzo, Csaba Sidlo, Daniel Tahara, Erik Wyatt, and Andras Bencur

Short

BigD320 "QuPARA: Query-Driven Large-Scale Portfolio Aggregate Risk Analysis on MapReduce"
Andrew Rau-Chaplin, Blesson Varghese, Duane Wilson, Zhimin Yao, and Norbert Zeh

 

Session E2: Industry and Government Program

Regular

N218 "Accelerating semantic graph databases on commodity clusters"
Alessandro Morari, Vito Giovanni Castellana, Oreste Villa, David Haglin, John Feo, Jesse Weaver, and Antonino Tumeo

Regular

N219 "Practical Distributed Classification using the Alternating Direction Method of Multipliers Algorithm"
Peter Lubell-Doughtie and Jon Sondag

Regular

N225 "Scaling Deep Social Feeds at Pinterest"
Varun Sharma

Regular

N226 "Big Data Analytics on High Velocity Streams: A Case Study"
Thibaud Chardonnens, Philippe Cudre-Mauroux, Martin Grund, and Benoit Perroud

 

Session AB3: Big Data Search  Architectures, Scalability and Efficiency

Regular

BigD260 "A Parallel Computing Platform for Training Large Scale Neural Networks"
Rong Gu, Furao Shen, and Yihua Huang

Regular

BigD330 "An NML-based Model Selection Criterion for General Relational Data Modeling"
Yoshiki Sakai and Kenji Yamanishi

Regular

BigD411 "Scalable Context-Aware Role Mining with MapReduce"
Zhiwei Yu, Raymond Wong, and Chi-Hung Chi

Short

BigD297 "Sparse Poisson Coding for High Dimensional Document Clustering"
Chenxia Wu, Haiqin Yang, Jianke Zhu, Jiemi Zhang, Irwin King, and Michael R. Lyu

Short

BigD465 "Parallel Subgroup Discovery on Computing Clusters -- First Results"
Daniel Trabold and Henrik Grosskreutz

 

Session C3: High Performance/Parallel Computing  Platforms for Big Data

Regular

BigD331 "Storing and manipulating environmental big data with JASMIN"
Bryan Lawrence, Victoria Bennett, Jonathan Churchill, Martin Juckes, Philip Kershaw, Stephen Pascoe, Sam Pepler, Matt Pritchard, and Ag Stephens

Regular

BigD455 "Locality-driven High-level I/O Aggregation for Processing Scientific Datasets"
Jialin Liu, Bradly Crysler, and Yong Chen

Short

BigD363 "GPU Accelerated Item-Based Collaborative Filtering for Big-Data Applications"
Chandima Hewa Nadungodage, Yuni Xia, John Lee, Myungcheol Lee, and Choon Seo Park

Short

BigD427 "Kylin: An Efficient and Scalable Graph Data Processing System"
Li-Yung Ho, Tsung-Han Li, Jan-Jan Wu, and Pangfeng Liu

Short

BigD454 "A Reconfigurable Computing Architecture for Semantic Information Filtering"
Aalap Tripathy, Ka Chon Ieong, Atish Patra, and Rabi Mahapatra

 

Session D3: Complex Big Data Applications

Regular

BigD437 "Demand Response Targeting Using Big Data Analytics"
Jungsuk Kwac and Ram Rajagopal

Regular

BigD353 "Large-scale Predictive Analytics for Real Time Energy Management"
Natasha Balac, Tamara Sipes, Nicole Wolter, Kenneth Nunes, Robert Sinkovits, and Homa Karimabadi

Short

BigD405 "Constructing User Profiles from Social Media Data"
Mauricio Hernandez, Kirsten Hildrum, Prateek Jain, Chitra Venkatramani, Rohit Wagle, Bogdan Alexe, and Ioana Roxana Stanoi

Short

BigD431 "CloudRS: An Error Correction Algorithm of High-Throughput Sequencing Data based on Scalable Framework"
Chien-Chih Chen, Yu-Jung Chang, Wei-Chun Chung, Der-Tsai Lee, and Jan-Ming Ho

Short

BigD444 "Building dynamic thermal profiles of energy consumption for individuals and neighborhoods"
Adrian Albert and Ram Rajagopal

 

Session AB4: Large-scale Recommendation Systems and Social Media Systems

Regular

BigD211 "Continuous Hyperparameter Optimization for Large-scale Recommender Systems"
Simon Chan, Philip Treleaven, and Licia Capra

Regular

BigD334 "Parallel Matrix Factorization for Binary Response"
Rajiv Khanna, Liang Zhang, Deepak Agarwal, and Bee-Chung Chen

Regular

BigD400 "CallCab: A Unified Recommendation System for Carpooling and Regular Taxicab Services"
Desheng Zhang and Tian He

Short

BigD361 "Scalable Distributed Event Detection for Twitter"
Richard McCreadie, Craig Macdonald, Iadh Ounis, Miles Osborne, and Sasa Petrovic

Short

BigD233 "Massively Scalable Near Duplicate Detection in Streams of Documents using MDSH"
Paul Bogen, Christopher Symons, Amber McKenzie, Robert Patton, and Rob Gillen

 

Session C4: Energy-efficient Computing for Big Data

Regular

BigD345 "Efficient Gear-shifting for a Power-proportional Distributed Data-placement Method"
Hieu Hanh Le, Satoshi Hikida, and Haruo Yokota

Regular

BigD413 "Building a Generic Platform for Big Sensor Data Application"
Chun-Hsiang Lee, David Birch, Chao Wu, Dilshan Silva, Orestis Tsinalis, Yang Li, Shulin Yan, Moustafa Ghanem, and Yike Guo

Regular

BigD354 "Agrios: A Hybrid Approach to Big Array Analytics"
Patrick Leyshock, David Maier, and Kristin Tufte

Short

BigD215 "clusiVAT: A Mixed Visual/Numerical Clustering Algorithm for Big Data"
Dheeraj Kumar, James Bezdek, Sutharshan Rajasegarar, Marimuthu Palaniswami, Christopher Leckie, and Timothy Havens

Short

BigD298 "Feliss: Flexible distributed computing framework with light-weight checkpointing"
Takuya Araki, Kazuyo Narita, and Hiroshi Tamano

 

Session D4: Data Preservation, Information Integration and Heterogeneous and Multi-structured Data Integration

Regular

BigD253 "CORE: Cross-Object Redundancy for Efficient Data Repair in Storage Systems"
Kyumars Sheykh Esmaili, Lluis Pamies Juarez, and Anwitaman Datta

Regular

BigD217 "Iteration Aware Prefetching For Unstructured Grids"
Oyindamola Akande and Philip Rhodes

Short

BigD278 "Scalable Data Citation in Dynamic, Large Databases: Model and Reference Implementation"
Stefan Pröll and Andreas Rauber

Short

BigD344 "Self-Adaptive Event Recognition for Intelligent Transport Management"
Alexander Artikis, Matthias Weidlich, Avigdor Gal, Vana Kalogeraki, and Dimitrios Gunopoulos

Short

BigD375 "Robust Crowdsourced Learning"
Zhiquan Liu, Luo Luo, and Wu-Jun Li

 

Session AB5: Link and Graph Mining

Regular

BigD267 "Self-Tuned Kernel Spectral Clustering for Large Scale Networks"
Raghvendra Mall, Rocco Langone, and Johan Suykens

Regular

BigD403 "Top-K aggregation over a Large Graph Using Shared-Nothing Systems"
Abhirup Chakraborty

Short

BigD241 "Incremental Algorithms for Network Management and Analysis based on Closeness Centrality"
Ahmet Erdem Sariyuce, Kamer Kaya, Erik Saule, and Umit V. Catalyurek

Short

BigD247 "Classification of Big Velocity Data via Cross-Domain Canonical Correlation Analysis"
Bo Zhang and Zhongzhi Shi

Short

BigD212 "Elver: Recommending Facebook Pages in Cold Start Situation Without Content Features"
Yusheng Xie, Alok Choudhary, Zhengzhang Chen, and Ankit Agrawal

 

Session C5: New Computational Models for Big Data

Regular

BigD399 "Map-Based Graph Analysis on MapReduce"
Upa Gupta and Leonidas Fegaras

Regular

BigD216 "P-DOT: A Model of Computation for Big Data"
Tao Luo, Yin Liao, Yunquan Zhang, and Guoliang Chen

Short

BigD285 "Optimizing the MapReduce Framework on Intel Xeon Phi Coprocessor"
Mian Lu, Lei Zhang, Huynh Phung Huynh, Zhongliang Ong, Yun Liang, Bingsheng He, Rick Siow Mong Goh, and Richard Huynh

Short

BigD289 "Optimizing Throughput on Guaranteed-Bandwidth WAN Networks for the Large Synoptic Survey Telescope (LSST)"
Mike Freemon

Short

BigD390 "GPU-Accelerated Adaptive Compression Framework for Genomics Data"
Guixin Guo, Shuang Qiu, Mian Lu, BingQiang Wang, Lin Fang, and Simon See

 

Session D5: Spatiotemporal and Stream Data Management, Scientific Data Management

Regular

BigD423 "Spatio-temporal Indexing in Non-relational Distributed Databases"
Anthony Fox, Chris Eichelberger, James Hughes, and Skylar Lyon

Regular

BigD245 "Measuring Inter-Site Engagement"
Elad Yom-Tov, Mounia Lalmas, Ricardo Baeza-Yates, Georges Dupret, Janette Lehmann, and Pinar Donmez

Regular

BigD312 "Direct QR factorizations for tall-and-skinny matrices in MapReduce architectures"
Austin Benson, David Gleich, and James Demmel

Short

BigD243 "Scientific Discovery through Weighted Sampling"
Lefteris Sidirourgos, Martin Kersten, and Peter Boncz

Short

BigD294 "On the Use of Shared Storage in Shared-Nothing Environments"
Krishnaraj Ravindranathan, Aleksander Khasymski, Guanying Wang, Ali Butt, and Gaurav Makkar

 

Session AB6: Link and Graph Mining, Mobility and Big Data

Short

BigD335 "Efficient Large Graph Pattern Mining for Big Data in the Cloud"
Chun-Chieh Chen, Kuan-Wei Lee, Chih-Chieh Chang, De-Nian Yang, and Ming-Syan Chen

Short

BigD417 "A Hypergraph-Partitioned Vertex Programming Approach for Large-scale Consensus Optimization"
Hui Miao, Xiangyang Liu, Bert Huang, and Lise Getoor

Short

BigD366 "Analysis of GSM calls data for understanding user mobility behavior"
Chiara Renso, Barbara Furletti, Lorenzo Gabrielli, and Salvatore Rinzivillo

Short

BigD448 "A Higher-Order Data Flow Model for Heterogeneous Big Data"
Simon Price and Peter Flach

Short

BigD284 "DL-MPI: Enabling Data Locality Computation for MPI-based Data-Intensive Applications"
Jiangling Yin, Andrew Foran, and Jun Wang

Short

BigD308 "Fast OLAP Query Execution in Main Memory on Large Data in a Cluster"
Martin Weidner, Jonathan Dees, and Peter Sanders

 

Session C6: Novel Theoretical Models for Big Data

Regular

BigD358 "Communication Efficient Algorithms for Fundamental Big Data Problems"
Peter Sanders, Ingo Müller, and Sebastian Schlag

Regular

BigD244 "On-Line Learning Gossip Algorithm in Multi-Agent Systems with Local Decision Rules"
Stephan Clemencon, Pascal Bianchi, Gemma Morral, and Jeremie Jakubowicz

Short

BigD229 "Transparent Composite Model For Large Scale Image/Video Processing"
Enhui Yang and Xiang Yu

Short

BigD319 "Elastic Algorithms for Guaranteeing Quality Monotonicity in Big Data Mining"
Rui Han, Lei Nie, Moustafa M. Ghanem, and Yike Guo

 

Session D6: Scientific Data Management

Regular

BigD338 "Adaptive File Management for Scientific Workflows on the Azure Cloud"
Radu Tudoran, Alexandru Costan, Ramin Rad Rezai, Goetz Brasche, and Gabriel Antoniu

Regular

BigD407 "Model-View Sensor Data Management in the Cloud"
Tian Guo, Thanasis G. Papaioannou, and Karl Aberer

Short

BigD373 "Using Pattern-Models to Guide SSD Deployment for Big Data in HPC systems"
Junjie Chen, Yong Chen, and Philip C. Roth

Short

BigD365 "Improving Floating Point Compression through Binary Masks"
Leonardo Bautista Gomez and Franck Cappello

Short

BigD445 "Segmented Analysis for Reducing Data Movement"
Jialin Liu, Surendra Byna, and Yong Chen

 

Session AB7: Privacy Preserving Big Data Collection/Analytics, Threat Detection using Big Data Analytics

Regular

BigD269 "DP-WHERE: Differentially Private Modeling of Human Mobility"
Darakhshan Mir, Sibren Isaacman, Ramón Cáceres, Margaret Martonosi, and Rebecca Wright

Regular

BigD305 "Malicious URLs Filtering - A Big Data Application"
Min-Sheng Lin, Chien-Yi Chiu, Yuh-Jye Lee, and Hsing-Kuo Pao

Regular

BigD328 "Zero-Knowledge Private Graph Summarization"
Maryam Shoaran, Alex Thomo, and Jens Weber

Short

BigD230 "Scalable Network Traffic Visualization Using Compressed Graphs"
Lei Shi, Qi Liao, and Xiaohua Sun

Short

BigD391 "Breaking the Arc: Risk Control for Big Data"
Duncan Hodges and Sadie Creese

 

Session D7: Database Management Challenges: Architecture, Storage, User Interfaces

Regular

BigD249 "A Selective Checkpointing Mechanism for Query Plans in a Parallel Database System"
Ting Chen and Kenjiro Taura

Regular

BigD270 "H2RDF+: High-performance Distributed Joins over Large-scale RDF Graphs"
Nikolaos Papailiou, Ioannis Konstantinou, Dimitrios Tsoumakos, Panagiotis Karras, and Nectarios Koziris

Short

BigD447 "Knowledge Cubes - A Proposal for Scalable and Semantically-Guided Management of Big Data"
Amgad Madkour, Walid Aref, and Saleh Basalamah

 

 

 

Workshops

 

Workshop 1: Distributed Storage Systems and Coding for Big Data

Paper List

1

S1209 "The Code Rebalancing Problem for a Storage-Flexible Data Center Network"
Iryna Andriyanova, Alan Jule and Emina Soljanin

2

S1211 "suvfs: A virtual file system in userspace that supports large files"
Wasim Ahmad Bhat and S.M.K. Quadri

3

S1213 "Reliability of Erasure Coded Storage Systems: A Geometric Approach"
Antonio Campello and Vinay Vaishampayan

4

S1210 "Distributed Storage Evaluation on a Three-Wide Inter-Data Center Deployment"
Yih-Farn Chen, Scott Daniels, Marios Hadjieleftheriou, Pingkai Liu, Chao Tian and Vinay Vaishampayan

5

S1201 "Paired-Replicas with Constant Repair Time: Loss Functions and Memorylessness"
Vinay Deolalikar

6

S1202 "Efficient Updates in Cross-Object Erasure-Coded Storage Systems"
Kyumars Sheykh Esmaili, Aatish Chiniah and Anwitaman Datta

7

S1208 "Construction of Exact-BASIC Codes for Distributed Storage Systems at the MSR Point"
Hanxu Hou, Kenneth W. Shum and Hui Li

8

S1205 "Minimum Storage BASIC Codes: A System Perspective"
Xianxia Huang, Hui Li, Tai Zhou, Yumeng Zhang, Han Guo, Hanxu Hou, Huayu Zhang, Kai Pan and Kai Lei

9

S1207 "Layout-Aware I/O Scheduling for Terabits Data Movement"
Youngjae Kim, Scott Atchley, Geoffroy R. Vallee and Galen M. Shipman

Schedule

Date

October 7, 2013

Location

Ballroom E

Time

Schedule

9:00-9:15

Plenary

9:15-10:00

Invited Talk

10:00-11:00

S1208 "Construction of Exact-BASIC Codes for Distributed Storage Systems at the MSR Point"
S1213 "Reliability of Erasure Coded Storage Systems: A Geometric Approach"
S1209 "The Code Rebalancing Problem for a Storage-Flexible Data Center Network"

11:00-11:20

Coffee Time

11:20-12:00

S1202 "Efficient Updates in Cross-Object Erasure-Coded Storage Systems"
S1211 "suvfs: A virtual file system in userspace that supports large files"

12:00-14:00

Lunch on your own

14:00-14:30

Invited Talk: S1205 "Minimum Storage BASIC Codes: A System Perspective"

14:30-15:30

S1210 "Distributed Storage Evaluation on a Three-Wide Inter-Data Center Deployment"
S1201 "Paired-Replicas with Constant Repair Time: Loss Functions and Memorylessness"
S1207 "Layout-Aware I/O Scheduling for Terabits Data Movement"

 

Workshop 2: Big Data and the Humanities

Paper List

1

S2203 "Robustness of emotion extraction from 20th century English books"
Alberto Acerbi, Vasileios Lampos and Alexander Bentley

2

S2228 "VisualPage: Towards Large Scale Analysis of Nineteenth-Century Print Culture"
Neal Audenaert and Natalie Houston

3

S2210 "Back to our Data – Experiments with NoSQL Technologies in the Humanities"
Tobias Blanke, Michael Bryant and Mark Hedges

4

S2234 "The Human Face of Crowdsourcing: A Citizen-led Crowdsourcing Case Study"
Sheryl Grant, Kristan Shawgo, Richard Marciano, Jeff Heard and Priscilla Ndiaye

5

S2219 "Visualization and Rhetoric: Key Concerns for Utilizing Big Data in Humanities"
Kathleen Kerr, Bernice Hausman, Samah Gad and Waqas Javed

6

S2224 "Humanities 'Big Data': Myths, challenges, and lessons"
Amalia S. Levi

7

S2229 "Digging into Human Rights Violations: Data Modeling Collective Memory"
Ben Miller, Ayush Shrestha, Jason Derby, Jennifer Olive, Fuxin Li, Yanjun Zhao and Karthikeyan Umapathy

8

S2231 "The Royal Birth of 2013: Analysing and Visualising Public Sentiment in the UK Using Twitter"
Vu Dung Nguyen, Blesson Varghese and Adam Barker

9

S2221 "Bibliographic Records as Humanities Big Data"
Andrew Prescott

10

S2209 "Customising Geoparsing and Georeferencing for Historical Texts"
C.J. Rupp, Paul Rayson, Alistair Baron, Christopher Donaldson, Ian Gregory, Andrew Hardie and Patricia Murrieta-Flores

11

S2208 "A Concept of Generic Workspace for Big Data Processing in Humanities"
Jedrzej Rybicki, Benedikt von St. Vieth and Daniel Mallmann

12

S2220 "From Assets to Stories via the Google Cultural Institute Platform"
William Seales, Steve Crossan, Mark Yoshitake and Sertan Girgin

13

S2223 "The Curious Identity of Michael Field and its Implications for Humanities Research with the Semantic Web"
Susan Brown and John Simpson

14

S2222 "Infectious Texts: Modeling Text Reuse in Nineteenth-Century Newspapers"
David Smith, Ryan Cordell and Elizabeth Maddock Dillon

15

S2204 "Mapping Mutable Genres in Structurally Complex Volumes"
Ted Underwood, Michael Black, Loretta Auvil and Boris Capitanu

16

S2214 "CKM: A Shared Visual Analytical Tool for Large-Scale Analysis of Audio-Video Interviews"
Lu Xiao, Yan Luo and Steven High

17

S2218 "A Case Study on Entity Resolution for Distant Processing of Big Humanities Data"
Weijia Xu, Maria Esteva, Jessica Trelogan and Todd Swinson

Schedule

Date

October 8, 2013

Location

Ballroom E

Time

Schedule

9:30-9:50

Coffee Time

9:50-12:00

S2222 "Infectious Texts: Modeling Text Reuse in Nineteenth-Century Newspapers"
S2204 "Mapping Mutable Genres in Structurally Complex Volumes"
S2231 "The Royal Birth of 2013: Analysing and Visualising Public Sentiment in the UK Using Twitter"
S2228 "VisualPage: Towards Large Scale Analysis of Nineteenth-Century Print Culture"
S2218 "A Case Study on Entity Resolution for Distant Processing of Big Humanities Data"
S2208 "A Concept of Generic Workspace for Big Data Processing in Humanities"

12:00-13:20

Lunch (not provided)

13:20-15:20

S2229 "Digging into Human Rights Violations: Data Modeling Collective Memory"
S2214 "CKM: A Shared Visual Analytical Tool for Large-Scale Analysis of Audio-Video Interviews"
S2219 "Visualization and Rhetoric: Key Concerns for Utilizing Big Data in Humanities"
S2221 "Bibliographic Records as Humanities Big Data"
S2210 "Back to our Data – Experiments with NoSQL Technologies in the Humanities"
S2209 "Customising Geoparsing and Georeferencing for Historical Texts"
S2234 "The Human Face of Crowdsourcing: A Citizen-led Crowdsourcing Case Study"
S2224 "Humanities 'Big Data': Myths, challenges, and lessons"
S2203 "Robustness of emotion extraction from 20th century English books"

15:20-15:40

Coffee Time

15:40-17:40

S2223 "The Curious Identity of Michael Field and its Implications for Humanities Research with the Semantic Web"
S2220 "From Assets to Stories via the Google Cultural Institute Platform"

 

Workshop 3: Workshop on Big Data and Society
       -- Data Economy, Real-Time Mining and Analytics, Mining Techniques for Online and Customer Service in Big Data Era

Paper List

1

S6207 "Enterprise Pre-Sales Forums: A Preliminary Study of Metadata and Content"
Vinay Deolalikar

2

S6212 "Advancing value creation and value capture in data-intensive contexts"
Roman Ferrando-Llopis, David Lopez-Berzosa and Catherine Mulligan

3

S6203 "A Cloud Service for the Evaluation of Company's Financial Health Using XBRL-based Financial Statements"
Wen-Chiao Hsu, Jyun-Yao Huang, Chi-Hao Chen, Chien-Yu Su, Hsiao-Chen Shih, Tzu-Ya Liao and I-En Liao

4

S6209 "Real-Time Data Analysis in ClowdFlows"
Janez Kranjc, Vid Podpečan and Nada Lavrač

5

S6210 "ma3tch - privacy and knowledge - dynamic networked collective intelligence"
Udo Kroon

6

S6202 "Business Model Canvas Perspective on Big Data Applications"
Fatma Canan Pembe Muhtaroglu, Seniz Demir, Murat Obali and Canan Girgin

7

S6215 "Understanding the value of (Big) data"
Koutroumpis Pantelis and Leiponen Aija

8

S6214 "OpenFridge: A Platform for Data Economy for Energy Efficiency Data"
Slobodanka Dana Kathrin Tomic and Anna Fensel

9

S6201 "A Study of Innovation Network Database Construction by Using Big Data and An Enterprise Strategy Model"
Zhou Wen, Ye Shu Tao and Lu Xiao Long

10

S6213 "Enhanced User Data Privacy with Pay-by-Data Model"
Chao Wu and Yike Guo

11

S6206 "Query Optimization over a Heterogeneously Distributed Scientific Database"
Helen Xiang

12

S6204 "Enterprise Data Economy: A Hadoop-Driven Model and Strategy"
Wuheng Luo

Schedule

Date

October 7, 2013

Location

Ballroom F

Time

Schedule

8:00-8:40

Registration (Hotel Lobby West)

8:40-9:35

Invited talk: What's around the corner in social commerce? (Jaideep Srivastava)

9:35-10:00

S6212 "Advancing value creation and value capture in data-intensive contexts"

10:00-10:25

Coffee Time

10:25-12:30

S6202 "Business Model Canvas Perspective on Big Data Applications"
S6203 "A Cloud Service for the Evaluation of Company's Financial Health Using XBRL-based Financial Statements"
S6207 "Enterprise Pre-Sales Forums: A Preliminary Study of Metadata and Content"
S6204 "Enterprise Data Economy: A Hadoop-Driven Model and Strategy"
S6213 "Enhanced User Data Privacy with Pay-by-Data Model"

12:30-13:30

Lunch on your own

13:30-14:25

Invited talk: Large Scale Mining and Modeling of Telecommunication Carrier's Big Data (Wei Fan)

14:25-16:05

S6206 "Query Optimization over a Heterogeneously Distributed Scientific Database"
S6210 "ma3tch - privacy and knowledge - dynamic networked collective intelligence"
S6201 "A Study of Innovation Network Database Construction by Using Big Data and An Enterprise Strategy Model"
S6209 "Real-Time Data Analysis in ClowdFlows"

16:05-16:30

Coffee Time

16:30-17:25

S6215 "Understanding the value of (Big) data"
S6214 "OpenFridge: A Platform for Data Economy for Energy Efficiency Data"

 

Workshop 4: The First Workshop on Benchmarks, Performance Optimization, and Emerging Hardware of Big Data Systems and Applications (BPOE 2013)

Paper List

1

S7210 "Optimizing a MapReduce Module of Preprocessing High-Throughput DNA Sequencing Data"
Wei-Chun Chung, Yu-Jung Chang, Chien-Chih Chen, Der-Tsai Lee and Jan-Ming Ho

2

BigD370 "Hash in a Flash: Hash Tables for Flash Devices"
Tyler Clemons, S M Faisal, Shirish Tatikonda, Charu Aggarwal and Srinivasan Parthasarathy

3

S7202 "Memory system characterization of Big Data workloads"
Martin Dimitrov, Karthik Kumar, Patrick Lu and Vish Viswanathan

4

S7211 "Performance Evaluation of R with Intel Xeon Phi Coprocessor"
Yaakoub El-Khamra, Niall Gaffney, David Walling, Eric Wernert, Weijia Xu and Hui Zhang

5

S7216 "The Implications from Benchmarking Three Big Data Systems"
Quan Jing, Shi Yingjie, Zhao Ming and Wei Yang

6

S7205 "A Performance Evaluation of Hive for Scientific Data Management"
Taoying Liu, Jing Liu, Hong Liu and Wei Li

7

S7214 "Evaluating Task Scheduling in Hadoop-based Cloud Systems"
Shengyuan Liu, Jungang Xu, Zongzhen Liu and Xu Liu

8

BigD397 "Efficient Near-Duplicate Document Detection using FPGAs"
Xi Luo, Walid Najjar and Vagelis Hristidis

9

BigD389 "Workload-Aware Aggregate Maintenance in Columnar In-Memory Databases"
Stephan Müller, Lars Butzmann, Stefan Klauck and Hasso Plattner

10

S7206 "Virtualization I/O Optimization Based on Shared Memory"
Fengfeng Ning, Chuliang Weng and Yuan Luo

11

S7209 "An Ensemble MIC-based Approach for Performance Diagnosis in Big Data Platform"
Chen Pengfei, Qi Yong, Li Xinyi and Li Su

12

S7207 "A Reconfigurable Stream Compression Hardware based on Static Symbol-Lookup Table"
Shinichi Yamagiwa and Hiroshi Sakamoto

13

S7201 "NativeTask: A Hadoop Compatible Framework for High Performance"
Dong Yang, Xiang Zhong, Dong Yan, Fangqin Dai, Xusen Yin, Cheng Lian, Zhongliang Zhu, Weihua Jiang and Gansha Wu

14

S7212 "On Mixing High-Speed Updates and In-Memory Queries: A Big-Data Architecture for Real-time Analytics"
Tao Zhong, Kshitij Doshi, Xi Tang, Ting Lou, Zhongyan Lu and Hong Li

15

S7215 "AxPUE: Application Level Metrics for Power Usage Effectiveness in Data Centers"
Runlin Zhou, Yingjie Shi and Chunge Zhu

16

S7217 "A Characterization of Big Data Benchmarks"
Wen Xiong and Zhibin Yu

Schedule

Date

October 8, 2013

Location

Ballroom F

Time

Schedule

9:00-12:00

Opening remarks: Jianfeng Zhan and Weijia Xu

Session one: Performance optimization of big data systems (Session Chair: Xiaoyi Lu, OSU)
BigD389 "Workload-Aware Aggregate Maintenance in Columnar In-Memory Databases"
S7201 "NativeTask: A Hadoop Compatible Framework for High Performance"
S7210 "Optimizing a MapReduce Module of Preprocessing High-Throughput DNA Sequencing Data"
S7212 "On Mixing High-Speed Updates and In-Memory Queries: A Big-Data Architecture for Real-time Analytics"
S7206 "Virtualization I/O Optimization Based on Shared Memory"
S7205 "A Performance Evaluation of Hive for Scientific Data Management"

12:00-13:20

Lunch

13:20-15:20

Session two: Big Data Benchmarks and Workload characterization (Session Chair: Jianfeng Zhan, ICT, CAS)
Invited Talk: TBD
S7202 "Memory system characterization of Big Data workloads"
S7211 "Performance Evaluation of R with Intel Xeon Phi Coprocessor"
S7214 "Evaluating Task Scheduling in Hadoop-based Cloud Systems"
S7217 "A Characterization of Big Data Benchmarks"
S7209 "An Ensemble MIC-based Approach for Performance Diagnosis in Big Data Platform"

15:20-15:40

Break

15:40-17:40

Session two: Big Data Benchmarks and Workload characterization (Session Chair: Jianfeng Zhan, ICT, CAS)
Invited Talk: TBD
S7207 "A Reconfigurable Stream Compression Hardware based on Static Symbol-Lookup Table"
BigD397 "Efficient Near-Duplicate Document Detection using FPGAs"
BigD370 "Hash in a Flash: Hash Tables for Flash Devices"
S7216 "The Implications from Benchmarking Three Big Data Systems"
S7215 "AxPUE: Application Level Metrics for Power Usage Effectiveness in Data Centers"

Closing remarks (Weijia Xu and Jianfeng Zhan)

 

Workshop 5: The First Workshop on Big Data Visualization

Paper List

1

S9209 "Dynamic Reduction of Query Result Sets for Interactive Visualization"
Leilani Battle, Michael Stonebraker and Remco Chang

2

S9211 "Overplotting: Unified solutions under Abstract Rendering"
Joseph Cottam, Andrew Lumsdaine and Peter Wang

3

S9205 "Typograph: Multiscale Spatial Exploration of Text Documents"
Alexander Endert, Russ Burtner, Nick Cramer, Ralph Perko, Shawn Hampton and Kristin Cook

4

S9204 "VisReduce: Fast and responsive incremental information visualization of large datasets"
Jean-Francois Im, Felix Giguere Villegas and Michael J. McGuffin

5

S9208 "A System for Large-Scale Visualization of Streaming Doppler Data"
Peter Kristof, Bedrich Benes, Carol X. Song and Lan Zhao

6

S9210 "Visualization of Streaming Data: Observing Change and Context in Information Visualization Techniques"
Milos Krstajic and Daniel A. Keim

7

S9202 "CompactMap: A Mental Map Preserving Visual Interface for Streaming Text Data"
Xiaotong Liu, Yifan Hu, Stephen North and Han-Wei Shen

8

S9207 "Egocentric Storylines for Visual Analysis of Large Dynamic Graphs"
Chris W. Muelder, Tarik Crnovrsanin, Arnaud Sallaberry and Kwan-Liu Ma

9

S9206 "GPU-Accelerated Incremental Correlation Clustering of Large Data in the Cloud with Visual Feedback"
Eric Papenhausen, Bing Wang, Sungsoo Ha, Alla Zelenyuk, Dan Imre and Klaus Mueller

10

S9201 "Visualization of Big SPH Simulations via Compressed Octree Grids"
Florian Reichl, Marc Treib and Rüdiger Westermann

11

S9203 "A Novel Visual Analysis Approach for Clustering Large-Scale Social Data"
Zhangye Wang, Juanxia Zhou, Wei Chen, Chang Chen, Jiyuan Liao and Ross Maciejewski

12

S9212 "DriveSense: Contextual Handling of Large-scale Route Map Data for the Automobile"
Frederik Wiehr, Vidya Setlur and Alark Joshi

Schedule

Date

October 6, 2013

Location

Ballroom AB

Time

Schedule

8:00-8:40

Opening

8:40-9:30

Keynote: "Big Picture" Mixed-Initiative Visual Analytics of Big Data, Michelle Zhou, IBM Research

9:30-10:00

Invited talk: Data Intensive Visualization and Analysis of Numerically Intensive Applications, Chris Mitchell, Los Alamos National Laboratory

10:00-10:30

Coffee Time

10:30-12:00

Text Data
S9210 "Visualization of Streaming Data: Observing Change and Context in Information Visualization Techniques"
S9202 "CompactMap: A Mental Map Preserving Visual Interface for Streaming Text Data"
S9205 "Typograph: Multiscale Spatial Exploration of Text Documents"

12:00-13:30

Lunch

13:30-14:30

Rendering
S9211 "Overplotting: Unified solutions under Abstract Rendering"
S9212 "DriveSense: Contextual Handling of Large-scale Route Map Data for the Automobile"

14:30-15:30

Visual Analysis
S9203 "A Novel Visual Analysis Approach for Clustering Large-Scale Social Data"
S9207 "Egocentric Storylines for Visual Analysis of Large Dynamic Graphs"

15:30-16:00

Coffee Time

16:00-17:30

Scientific Data
S9201 "Visualization of Big SPH Simulations via Compressed Octree Grids"
S9208 "A System for Large-Scale Visualization of Streaming Doppler Data"
S9209 "Dynamic Reduction of Query Result Sets for Interactive Visualization"

17:30-18:30

Fast, Incremental Visualization
S9206 "GPU-Accelerated Incremental Correlation Clustering of Large Data in the Cloud with Visual Feedback"
S9204 "VisReduce: Fast and responsive incremental information visualization of large datasets"

 

Workshop 6: Big Data and Science: Infrastructure and Services

Paper List

1

SC210 "A big data analytics framework for scientific data management"
Sandro Fiore, Cosimo Palazzo, Alessandro D'Anca, Ian Foster, Dean Williams and Giovanni Aloisio