2021-09-22

Date

Attendees

  • Katie
  • Beth
  • Stephanie
  • Nancy
  • Michael
  • Mary
  • Brittney
  • Paloma

Recording

Zoom recording and chat

Theme

  • Assessment of linked data projects

Agenda

Discussion items

Item | Who | Notes
SPARQL updates | Paloma / Michael
  • Paloma - The query to retrieve all items with references that have an HRC Finding Aid URL as a value now works. See the slides from the last meeting for the updated query.

  • Michael was not able to work on it since the last meeting (busy start to the fall).
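
For reference, a query of this general shape can be sketched as below. This is not the query from the slides. It is a minimal sketch that assumes the finding aid URL is recorded as a reference URL (P854) on an "archives at" (P485) statement; the function name and the URL prefix are hypothetical placeholders, and the prefixes used (p:, pr:, prov:, wikibase:, bd:) are the ones predefined by the Wikidata Query Service.

```python
def build_reference_url_query(url_prefix: str, claim_property: str = "P485") -> str:
    """Build a SPARQL query for items whose statements cite a reference URL
    that starts with url_prefix.

    claim_property defaults to P485 ("archives at"); that choice is an
    assumption, not something confirmed by the meeting notes.
    """
    return f"""
SELECT DISTINCT ?item ?itemLabel ?refUrl WHERE {{
  ?item p:{claim_property} ?statement .
  ?statement prov:wasDerivedFrom ?reference .
  ?reference pr:P854 ?refUrl .
  FILTER(STRSTARTS(STR(?refUrl), "{url_prefix}"))
  SERVICE wikibase:label {{ bd:serviceParam wikibase:language "en" . }}
}}
""".strip()


# Placeholder prefix; replace with the real finding aid base URL.
query = build_reference_url_query("https://example.org/findingaid/")
print(query)
```

The printed text can be pasted into the Wikidata Query Service. Restricting the search to one property keeps the query narrower than scanning the references of every statement.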

Assessment | all
  • Review of the Google Doc “Linked Data Assessment Brainstorming Document”.

  • Mary Aycock demonstrated her work on showing the value of identity management of faculty through Wikidata

    • Python script to search faculty names in Google. Using the results, she developed and applied a ranking system

    • Ranking by type of hit:

      • Faculty pages (Tx State) = 3

      • List of research (ResearchGate) = 2

      • Salaries = 1

      • Unrelated = 0

    • Ranking by its position on the page:

      • Where does the right identity fall on the google results?

      • If displayed first, it gets a 10, and so on down through 1

    • Comments:
      • Very interesting approach. Was it taxing on your time? Is it scalable? Not really; it takes a lot of manual work

      • Common names get more hits? Maybe use ORCID iD?

      • Is Google customizing your search results? See TED talk by Eli Pariser: Beware online “filter bubbles”

        • Google is probably using IP address to geolocate your results

        • You can customize a Google search by websites?

      • Great effort to try to approach this programmatically

      • You can use an incognito or private window; however, you can’t obfuscate an IP address.

      • Noticed Knowledge windows popping up now too

    • She is willing to share the script
      • Time to start a repository of shared resources but where?!

      • UT enterprise GitHub: can’t be used to collaborate.

      • Personal GitHub? Our group’s Wiki?

    • Google Analytics?

      • Michael uses it to track usage statistics of the GeoData portal.

        • He receives monthly alerts via email, then uses a script to find them in his inbox and retrieve the important data

      • Mandy uses it as a web administrator. She has access to their website, so unsure how this might apply to Mary’s project.

    • Mary also suggests use of Altmetrics approaches?
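
The two-part ranking described above could be sketched as follows. This is not Mary's script. The category names and function names are hypothetical, and two details are assumptions because the notes do not state them: results past the tenth position score 0, and the two scores are combined by simple addition.

```python
# Scores by type of hit, as listed in the notes (category names are illustrative).
TYPE_SCORES = {
    "faculty_page": 3,
    "research_list": 2,
    "salary": 1,
    "unrelated": 0,
}


def position_score(position: int) -> int:
    """Score a 1-based result position: first gets 10, tenth gets 1.

    Positions beyond the tenth score 0 (an assumption).
    """
    if position < 1:
        raise ValueError("position is 1-based")
    return max(0, 11 - position)


def score_hit(hit_type: str, position: int) -> int:
    """Combine the type score and position score by addition (an assumption)."""
    return TYPE_SCORES[hit_type] + position_score(position)


# A faculty page shown as the first result: 3 + 10.
print(score_hit("faculty_page", 1))  # 13
```

Classifying each hit into a category is the manual step the group flagged as hard to scale; the sketch only covers the arithmetic once a hit has been classified.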

  • Qualitative questions

    • Architecture keeps a local copy of the data they are contributing to Wikidata, which gives some backup security, but also means that the data exists in multiple places and tiers

      • Texas Data Repository holds the original data (no local DB for local authorities)

      • Michael - Still figuring out a programmatic approach to contribute data to Wikidata and keep the data in sync. Interest also in assessing workflows

    • Ransom Center has a local DB (SQL) for authority records. Through the Wikidata project, they establish links between the Wikidata item ("archives at", Collection ID, FA link) and the local record (Wikidata URI added to the record).
    • Assessing workflows and sustainability (Michael)
      • Where does the data editing take place? And is that part sustainable? Software and expertise requirements? Editing in multiple places? Or moving edits in one direction? How do we do this and then how to assess this?
    • Got a good start on process documentation (Josh). Modifying datasets is being discussed. Mindful of sustainability (colleagues coming and going, workloads). E.g., version tracking on the Texas Data Repository to assist versioning in geosystems

      • Documentation on data publishing is a good foundation.

      • We all have similar challenges in this area.

    • Does using Wikidata add value to our internal workflows?
      • Interview project participants and stakeholders about their workflows
      • How has this changed workflows? For example, are you using Wikidata instead of something else? Are you adding Wikidata entry creation/editing, etc. to workflows? At what points and to what end?
    • Qualitative questions require further work. Maybe good to focus on quantitative questions first?

      • Would be good to start with defining a rubric/metrics
        • May not apply to every project; let’s start gathering and documenting
      • Document projects we are working on, and what we are tracking on these projects or what we aspire to track

      • Document metrics

        • How do you measure it?

        • What is the goal of the metric?

        • How do we get this data?

        • How do we share it?

File sharing/editing | all
  • This has come up several times now. Should we start using Google Drive so members outside the UT community can also contribute / edit?
  • Discussed Box vs. Google, and decided to shift to Google
  • There is an old Google Drive called UT Metadata Group from an early iteration of this group that we are going to start using. Everyone on the call has been added (Melanie still needs to be added - pending preferred Gmail account).
  • Potential to add documentation from other UT metadata-related groups (e.g. Metadata-Centric Meet-up group)
Topic planning for the fall semester | all
  • Top survey results:

    1. Assessing impact of Wikidata projects
      • Group still working on this
    2. Batch update methods
      • Architecture is interested in this topic as they are investigating solutions to update FA URLs in Wikidata due to the migration of TARO FAs to a new server (launching next week). Architecture will try to present on this in October
    3. Charting linked data efforts to identify experts, collaborators, and common goals
      • Will start covering this topic while discussing how to best document ongoing Wikidata projects on campus
  • For next meeting:

    • Architecture’s attempt at batch editing for new TARO URLs

    • Discuss how to organize the projects Google Sheet (Michael - current fields are just suggestions)
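
As a starting point for the batch-editing discussion, the URL rewrite step could be sketched as below. The old and new prefixes and the item IDs are placeholders, not real TARO addresses or Wikidata items, and the sketch assumes the migration changes only the URL prefix. It prepares a list of proposed changes for review and does not write anything to Wikidata.

```python
def plan_url_updates(rows, old_prefix, new_prefix):
    """Return (item_id, old_url, new_url) for each URL starting with old_prefix.

    rows is an iterable of (item_id, url) pairs, for example exported from a
    query result. URLs that do not match old_prefix are left out of the plan.
    """
    plan = []
    for item_id, url in rows:
        if url.startswith(old_prefix):
            new_url = new_prefix + url[len(old_prefix):]
            plan.append((item_id, url, new_url))
    return plan


# Hypothetical sample data for illustration only.
rows = [
    ("Q_EXAMPLE_1", "https://old.example.org/taro/abc.html"),
    ("Q_EXAMPLE_2", "https://other.example.org/xyz"),
]
plan = plan_url_updates(rows, "https://old.example.org/taro/", "https://new.example.org/")
for change in plan:
    print(change)
```

Keeping the old URL next to the new one in each row makes it easier to spot-check the mapping before any batch edit is run.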

Action items

  • Start thinking about how to organize the documentation of projects and metrics on the Google Sheet (all)