LEADS: Getting Started
Bridget Disney, California Digital Library
My LEADS project is at the California Digital Library (CDL), where I am working with mentor John Kunze and fellow participant Hanlin Zhang. On June 8th, the LEADS fellows attended a three-day data science bootcamp in Philadelphia. It was a great opportunity to meet the LEADS staff and the other students. What an amazing group! I’m sure that we will learn from each other and collaborate on projects in the future. We learned a lot from the professors, who introduced us to the basic concepts of data science in some depth. It was helpful to have a complete overview of everything from metadata to text processing to visualization.
LEADS-4-NDP Data Science Boot Camp
At the CDL, I’ll be working on YAMZ (http://yamz.net), which stands for Yet Another Metadata Zoo. The tagline on the website bills it as “A crowdsourced metadata dictionary. Search for terms, upvote useful ones.” This platform is used by those developing and sharing controlled vocabularies. The software is written in Python and uses a PostgreSQL database.
I spent the first week hopelessly trying to feel my way around and set up the environment for YAMZ. I have never used Python and am excited to get the chance to learn it. It looks like there are two choices of operating system for this project: Mac and Ubuntu, a Unix-like operating system that can run on a desktop. I elected to give the Mac a try. I started using a Macintosh two years ago just to see how it worked, and now I love it so much that there’s no turning back! However, while installing the components, I have run into a few obstacles. Hopefully, I’ll be able to work through those.
Perusing the documentation, I see there is an article about scoring metadictionary terms (Patton, 2014, Community-based scoring of metadictionary terms) that might be helpful. Also, Hanlin sent me a link to get me started with GitHub (https://help.github.com/en/articles/connecting-to-github-with-ssh). So now I have some reading to do!