News & Events

Xintong Zhao Presents at ACM/IEEE JCDL 2020 Workshop

On August 5th, Xintong Zhao, a doctoral student at CCI/Drexel University and MRC research assistant (RA), presented her research at the ACM/IEEE JCDL 2020 (Association for Computing Machinery/IEEE Joint Conference on Digital Libraries) workshop, “Organizing Big Data, Information, and Knowledge.”

Xintong Zhao, CCI/Drexel University, doctoral student and MRC Research Assistant

Zhao’s presentation, “Scholarly Big Data: Computational Approaches to Semantic Labeling in Materials Science,” draws on research she is conducting in collaboration with team members of the NSF-supported Harnessing the Data Revolution (HDR) initiative, Accelerating the Discovery of Electronic Materials through Human-Computer Active Search. Zhao’s research examines computation and semantic labeling for scholarly big data in materials science. She reported on a baseline comparative analysis she led, comparing ontology-based automatic indexing using the Helping Interdisciplinary Vocabulary Engineering (HIVE-4-MAT) application against the MATScholar system, which uses named entity recognition (NER) supported by a recurrent neural network (RNN). [presentation slides]

Webinar: Computational Archival Science: A Paradigm Shift Across the Data

What: Computational Archival Science: A Paradigm Shift Across the Data [Virtual]
When: July 6, 12:00 pm – 1:30 pm EDT
Participation: free/open to all
Registration: required for the Zoom link; register @: link

The Emerging Technologies, Big Data & Archives series and the Archival Education and Research Initiative (AERI) will co-host a webinar on computing the archives, led by Richard Marciano (AICollaboratory, University of Maryland) and Jane Greenberg (Metadata Research Center, Drexel University). The session will cover graduate student datathon participation using the Legacy of Slavery data from the Maryland State Archives, as well as AI/machine learning applied to the WWII FDR Presidential Library diaries. The session will also highlight the connection to Drexel’s LEADS program.

Graduate students include:

  • Rajesh Gnanasekaran (UMD)
  • Alexis Hill (UMD)
  • Phillip Nicholas (UMD)
  • Lori Perine (UMD)
  • Sonia Pascua (Drexel, 2019 LEADS Fellow)
  • Hanlin Zhang (UNC, 2019 LEADS Fellow)

The webinar is co-sponsored by CLIR, co-hosted by the Oklahoma State University Emerging Technologies & Creativity Research Lab, and led by postdoctoral fellows Rebecca Y. Bayeck (Schomburg Center for Research in Black Culture) and Azure Stewart (New York University), as part of CLIR’s “Emerging Technologies, Big Data & Archives” series.

Metadata Mixer: Quarantine Catch-Up

On Thursday, June 11th, the Metadata Research Center hosted a “Metadata Mixer: Quarantine Catch-Up.” Participants shared presentations about their research and work accomplishments over the Spring term, as well as goals for the summer. Available presentation slides can be viewed below.


  • Sam Grabus: Historical subject representation & the “Long S” [Slides]
  • Vishal Deo/Prateek Goel: Exploratory Analysis on DataSar [Slides]
  • Steve Dilliplane: Nature from Afar
  • Jeremy Leipzig: Dichotomous Keys [Slides]
  • Xintong Zhao: Accelerating Materials Discovery Using Information Extraction [Slides]

MRC’s Sam Grabus wins 2020 LITA/Ex Libris Student Writing Award

Information Science Doctoral Candidate Sam Grabus was awarded the 2020 LITA/Ex Libris Student Writing Award for her paper, “Evaluating the Impact of the Long S upon 18th-Century Encyclopedia Britannica Automatic Subject Metadata Generation Results.” The paper reports on a comparative study of subject metadata generated both before and after the correction of the historical Long S in the 3rd edition of the Encyclopedia Britannica. The HIVE tool was used to automatically generate the subject metadata. Descriptive statistics were applied, and visualizations produced from the results were also examined to identify trends related to encyclopedia entry length.

The paper will be published in the September issue of LITA’s Open Access peer-reviewed journal, Information Technology and Libraries. Read more in the American Library Association Press Release.


Alyson Gamble: Two years of Work!

Karen Boyd, the 2018 LEADS Fellow who worked on the HSP project last year, and I presented our work at the LEADS Forum on January 24. 
Here it is as a GIF:

Alyson Gamble, Karen Boyd, LEADS Presentation

And here it is as a PDF:
In case the GIF within a GIF didn’t work, here’s a quick example of editing in OpenRefine:

Alyson Gamble LEADS OpenRefine

Karen and I will be presenting together again at Code4Lib. Our talk, “Cupper and Leecher, Tinman and Shrimp Fiend: Data Science Tools for Examining Historical Occupation Data,” will be held on March 9. You can read our abstract here:

I’ll post our Code4Lib slides after we give the talk. 
LEADS has been a wonderful experience, and I’m glad to be able to talk about it to others. Hopefully the lessons from Karen’s and my experiences will inform another year of work on the HSP project. There’s a lot more to do, but the end results will be useful for a wide audience.

Alyson Gamble Pronouns: They/Them/Theirs
Doctoral Student, Simmons University

LOVE Data Week Metadata Mixer: Dr. Chaomei Chen

The Metadata Research Center will host its first Metadata Mixer of 2020 on Tuesday, February 11th, in celebration of Love Data Week.

Date: Tuesday, February 11th
Time: 12-1pm
Location: 3675 Market Street, Philadelphia, PA
Room: 10th floor, room 1056
Presenter: Dr. Chaomei Chen, CCI
Topic: CiteSpace: Research Metrics and Analytics
Abstract: CiteSpace is a visual analytic and science mapping tool for visualizing various trends and patterns in the literature of scholarly publications across a wide range of research disciplines. CiteSpace is designed to produce interactive visualizations of various networks and facilitate systematic scientometric reviews. I will introduce the key concepts and theories behind CiteSpace with exemplars of studies enabled by CiteSpace and demonstrate the core workflows of working with CiteSpace.

* CiteSpace is a freely available Java application:

MRC Celebrating Data Science and Libraries

The Metadata Research Center and Drexel CCI are hosting two events in January to celebrate data science and libraries. Both events are free and open to the public. Space for these events is limited, so register soon!

1) Data Science Foundations Carpentry:

What: Data Science Foundations Carpentry
Date: January 23rd, 2020
Time: 8am-4:30pm
Location: Hagerty Library
Drexel University
3141 Chestnut Street,
Philadelphia, PA
Room: L33
Registration: Link
Event Page: View Here

The workshop will cover:

  • GitHub basics
  • OpenRefine
  • APIs
  • Best data practices
  • Jargon-busting

2) LEADS-4-NDP Forum:

What: LEADS-4-NDP Forum
Date: January 24th, 2020
Time: Coffee & registration: 8:30-9:30am; Forum: 9:30am-3pm
Location: 3675 Market St, 
Drexel University
Philadelphia, PA
Room: Quorum, floor 2, Room Qu4b
Forum agenda: PDF
Registration: Link
Event Page: View Here

The Forum will celebrate the success of LEADS and chart the way forward for continuing this program with doctoral students and early- to mid-career frontline information professionals.

Metadata Research Center: Data Science Foundations Carpentry

The Metadata Research Center is hosting a Data Science Foundations Carpentry on January 23rd, 2020 at Drexel University’s College of Computing and Informatics.

Date: January 23rd, 2020
Time: 8am-4:30pm
Location: 3675 Market Street, Philadelphia, PA
Room: TBA
Registration Link:
Event Page: View Here

The workshop will cover:

  • GitHub basics
  • OpenRefine
  • APIs
  • Best data practices
  • Jargon-busting
