Hyacinth Name reconciliation subproject

From Columbia Wikibase Test
Revision as of 19:27, 23 September 2020 by TrMendenhall (talk | contribs) (Created guidelines for the name reconciliation subproject)

Steps for working the name reconciliation subproject

  1. Import a project with a name set into OpenRefine
    1. Click here to download the name sets (CUL login required). These name sets consist of temporary terms from the Hyacinth Name and Hyacinth Subject-Name controlled vocabularies. An LCNAF reconciliation job has already been run on these items, returning either potential matches or no matches. The objectives are to 1) manually confirm correct matches and 2) consider running additional reconciliation jobs against other controlled vocabularies, such as Wikidata or ULAN. Together, the name sets contain more than 45,000 terms.
    2. Check potentially matched items in the name sets
      1. Set up
        1. Facet by reconciliation candidate score [see fig. 1 and fig. 2]
        2. Uncheck “blank” -- Only “numeric” should be checked [fig. 2]
      2. Follow usual practices for reconciliation QA in OpenRefine, with the following modifications
        1. Preferred label: For reasons unclear to me, the preferred label returned from VIAF for a match is often not the current preferred label (MARC 100) of the term in the LCNAF. If you suspect that this is the case, please do one of the following:
          1. Put the preferred label into the PrefLabel column [preferred method]
          2. Put an “X” in the PrefLabel column. This serves as a flag that further recon/QA work is needed [time-saving method]. See fig. 3.
          3. NOTE: All label values have been “backed up” in the HyacinthLabel column. So if the value in the “value” column is not the preferred LCNAF label, but the label of the match is correct, there is no need to do anything: just click the double checkmark to “match all.” The LCNAF preferred label will be pulled from the reconciliation result in a later phase of this project.
        2. No correct match returned: Click “create new item” and carry on. After you get through your set, you can try running a reconciliation on these “new” terms in a different vocabulary, such as ISNI (or another source available via VIAF), ULAN (CUL internal wiki), or Wikidata (CUL internal wiki)
  2. Unmatched items: Review these to whatever extent you see fit. You could try running alternate reconciliations against vocabularies like ISNI (or another source available via VIAF), ULAN (CUL internal wiki), or Wikidata (CUL internal wiki). You could also batch this work with the “new” items created under “No correct match returned” above.
  3. If you notice anything odd, please add a note in the “Problem” column. Separate multiple notes with semicolons.
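
For anyone who prefers to triage an exported name set outside OpenRefine, the split between potentially matched and unmatched rows, and the semicolon convention for the “Problem” column, could be sketched roughly as follows. This is only an illustration under stated assumptions: the column name best_score is hypothetical (an OpenRefine export carries a score only if you first add a column for it), the sample rows are invented, and the helper names are not part of the project.

```python
import csv
import io


def has_numeric_score(value):
    """Return True if the score cell holds a number rather than a blank."""
    try:
        float(value)
        return True
    except (TypeError, ValueError):
        return False


def split_by_score(rows, score_column="best_score"):
    """Split rows into (candidates, unmatched).

    Mirrors the "numeric" vs. "blank" choices of the score facet.
    The column name is an assumption, not something OpenRefine exports by default.
    """
    candidates = [r for r in rows if has_numeric_score(r.get(score_column))]
    unmatched = [r for r in rows if not has_numeric_score(r.get(score_column))]
    return candidates, unmatched


def add_problem_note(existing, note):
    """Append a note to a semicolon-delimited "Problem" cell value."""
    notes = [n.strip() for n in (existing or "").split(";") if n.strip()]
    notes.append(note.strip())
    return "; ".join(notes)


# Invented sample data standing in for an exported name set.
sample = io.StringIO(
    "value,best_score,Problem\n"
    "Example Name A,98.5,\n"
    "Example Name B,,\n"
)
rows = list(csv.DictReader(sample))
candidates, unmatched = split_by_score(rows)

# Flag something odd on the first candidate row.
candidates[0]["Problem"] = add_problem_note(candidates[0]["Problem"], "possible duplicate")
```

Rows in candidates correspond to what the “numeric” facet choice shows; rows in unmatched are the ones you might batch for an alternate reconciliation.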
  • Fig. 1: How to add a facet based on reconciliation score
  • Fig. 2: How to set up the reconciliation facet for ease of navigation
  • Fig. 3: Use of the PrefLabel column. Note that the LCNAF matches for Adedeji and Albee both end in periods (full stops), for no discernible reason. Thus, before clicking to confirm the match, I've copied over the correct version from the NAF record, which in these cases happens to correspond to the existing Hyacinth label. For Afetinan and Aitken, the Hyacinth label is incorrect, but the match returned from LCNAF via VIAF is completely correct, with no trailing punctuation or other discrepancies. In such cases you can leave PrefLabel blank: the Hyacinth label is retained in the HyacinthLabel column, and the preferred label can be extracted from the reconciliation data after the match is clicked. If you are in a rush, just put an "X" in the PrefLabel column when the correct match has an incorrect label. The problems with Conciliator usually involve punctuation, special characters, and capitalization. Don't go out of your way to double-check each item; use the PrefLabel column only when the suggested match appears correct but the preferred label supplied by the Conciliator reconciliation service has problems.
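
The kinds of label discrepancy described in fig. 3 (trailing punctuation, special characters, capitalization) can be told apart from genuinely different labels with a simple comparison. The sketch below is a rough aid, not part of the project workflow: the function names are hypothetical and the example labels are invented. Trailing periods are sometimes legitimate in LCNAF headings (for example after an initial), so a "cosmetic" result is a prompt to look at the NAF record, not a verdict.

```python
import unicodedata


def normalize(label):
    """Reduce a label to a form that ignores cosmetic differences.

    Composes Unicode characters (NFC), strips surrounding whitespace and
    trailing punctuation, and ignores case.
    """
    text = unicodedata.normalize("NFC", label).strip()
    text = text.rstrip(".,;: ")
    return text.casefold()


def classify_label(candidate_label, reference_label):
    """Return "same", "cosmetic", or "different" for a pair of labels."""
    if candidate_label == reference_label:
        return "same"
    if normalize(candidate_label) == normalize(reference_label):
        return "cosmetic"
    return "different"


# Invented examples; not taken from the name sets.
examples = [
    ("Doe, Jane.", "Doe, Jane"),       # trailing period only
    ("DOE, JANE", "Doe, Jane"),        # capitalization only
    ("Cafe\u0301, A", "Caf\u00e9, A"),  # decomposed vs. precomposed accent
    ("Doe, John", "Doe, Jane"),        # genuinely different
]
results = [classify_label(a, b) for a, b in examples]
```

A "cosmetic" result is the situation where the PrefLabel column (or an "X") is appropriate; "different" suggests the match itself needs a closer look.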



[Metadata Working Group > WbStack name reconciliation subproject > Screen Shot 2020-09-11 at 2.09.31 PM.png]



[Metadata Working Group > WbStack name reconciliation subproject > Screen Shot 2020-09-11 at 4.04.13 PM.png]