NC Cardinal Support and Staff Education


11.1. Annual Deduplication Process

Last Updated 03/09/2026


What is the Annual Deduplication Process?



The Annual Deduplication Process is a two-stage procedure that NC Cardinal runs to keep cataloging consistent across its consortium of libraries. It begins with the Waves clean up, which corrects bibliographic records that have a missing or incorrect format icon. A deduplication script then compares records of the same format on fields such as title, author, and ISBN to identify duplicates, and scores them to select a lead bib. Catalogers review the flagged records to confirm which should be merged. The timeline alternates between reviews by the Cataloging Interest Group and processing by Mobius: the Waves clean up runs first, followed by the deduplication itself, over several weeks. Catalogers can continue their normal work while deduplication runs without losing data, and a deleted bib that the process prefers as the lead record is restored so that everything can be merged onto it.


*Last updated 02/18/2026

The annual deduplication process is a two-stage process. It begins with a cleanup of bib records that either do not have a format icon or have the wrong one applied. The process of identifying these records and applying the correct icon is called the Waves clean up. This clean up examines every bibliographic record in the catalog.

After the Waves clean up process is complete, a deduplication script compares records by format type. By comparing data in the record, such as title, author, ISBN, and other identifiers, the script determines whether there are duplicate records with the same data. After duplicate records are identified, they are scored for quality using a variety of criteria. The record with the highest score is marked as the lead bib record.
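The matching and scoring described above can be illustrated with a minimal Python sketch. It is not the actual deduplication script: the field names, the `fingerprint` normalization, and the scoring criteria in `quality_score` are all assumptions made for illustration.

```python
import re
from collections import defaultdict

def normalize(text):
    """Lowercase and collapse punctuation so minor differences do not block a match."""
    return re.sub(r"[^a-z0-9]+", " ", (text or "").lower()).strip()

def fingerprint(bib):
    """Hypothetical match key: format plus normalized title, author, and ISBN."""
    isbn = normalize(bib["isbn"]).replace(" ", "")
    return (bib["format"], normalize(bib["title"]), normalize(bib["author"]), isbn)

def quality_score(bib):
    """Hypothetical criteria: one point per populated field, one per subject heading."""
    populated = sum(1 for field in ("title", "author", "isbn") if bib[field])
    return populated + len(bib["subjects"])

def find_duplicates(bibs):
    """Group bibs by fingerprint and return (lead, subordinates) for each duplicate set."""
    groups = defaultdict(list)
    for bib in bibs:
        groups[fingerprint(bib)].append(bib)
    results = []
    for group in groups.values():
        if len(group) < 2:
            continue  # no duplicate for this fingerprint
        ranked = sorted(group, key=quality_score, reverse=True)
        results.append((ranked[0], ranked[1:]))
    return results

sample_bibs = [
    {"id": 1, "format": "book", "title": "Dune", "author": "Herbert, Frank",
     "isbn": "9780441013593", "subjects": []},
    {"id": 2, "format": "book", "title": "DUNE.", "author": "Herbert, Frank",
     "isbn": "978-0-441-01359-3", "subjects": ["Science fiction"]},
    {"id": 3, "format": "dvd", "title": "Dune", "author": "", "isbn": "", "subjects": []},
]

for lead, subordinates in find_duplicates(sample_bibs):
    print("lead:", lead["id"], "subordinates:", [s["id"] for s in subordinates])
```

Because the format is part of the key, the DVD in the sample is never compared with the two book records, which mirrors the comparison by format type described above.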

Records are placed in a spreadsheet for examination by catalogers, who review the two bibs identified in each line item to make sure that they are for the same material (content and format) and to determine whether the “lead” bib identified is actually the better record. After the review is complete, the script is run: items on duplicate records are merged onto the lead record, and the subordinate records are deleted.
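As a rough illustration of the review step, the sketch below writes candidate pairs to a CSV sheet and then reads back only the merges a cataloger approved. The column names (`lead_bib`, `sub_bib`, `same_material`, `swap_lead`) and the `y` convention are hypothetical; the actual spreadsheet layout may differ.

```python
import csv
import io

COLUMNS = ["lead_bib", "sub_bib", "same_material", "swap_lead"]

def write_review_sheet(pairs, handle):
    """Write one row per (lead, subordinate) pair, leaving the review columns blank."""
    writer = csv.writer(handle)
    writer.writerow(COLUMNS)
    for lead_id, sub_id in pairs:
        writer.writerow([lead_id, sub_id, "", ""])

def approved_merges(handle):
    """Return (lead, subordinate) ID pairs that the cataloger confirmed."""
    merges = []
    for row in csv.DictReader(handle):
        if row["same_material"].strip().lower() != "y":
            continue  # not the same content and format, so do not merge
        lead, sub = int(row["lead_bib"]), int(row["sub_bib"])
        if row["swap_lead"].strip().lower() == "y":
            lead, sub = sub, lead  # cataloger judged the other bib to be better
        merges.append((lead, sub))
    return merges

# A sheet after review: the second pair is rejected, the third has its lead swapped.
reviewed = io.StringIO(
    "lead_bib,sub_bib,same_material,swap_lead\n"
    "101,102,y,\n"
    "103,104,n,\n"
    "105,106,Y,y\n"
)
print(approved_merges(reviewed))
```

Keeping the decision in explicit columns means the merge step only acts on rows a person has confirmed.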

Deduplication Process Outline

  1. Waves clean up for each format
  2. Voting system to identify “Automatic” versus “Needs Humans”
  3. Fingerprints matched
  4. Bibs scored for record quality
  5. Highest-scoring bib marked as lead bib
  6. Catalogers examine records and confirm choice of bibs to merge and lead bib
  7. Less complete records merged onto lead bib

Approximate Timeline for Project

Cataloging Interest Group (~2 weeks)

  • Review Waves clean up Auto sheets

Mobius (~2 weeks; may take as little as a day depending on the number of bibs and the parameters used)

  • Run Waves clean up process to update format icons
  • Generate list of what would be deduped

Cataloging Interest Group (~2 weeks)

  • Review list of what would be merged (to be provided by Mobius)

Mobius (~1 month; may take as little as a day depending on the number of bibs and the parameters used)

  • Run deduplication process

FAQ

Deleted bibs may be included in the deduplication process.  How does this affect the merging of records, if at all?

The deduplication process may prefer a deleted bib over a non-deleted bib. If it does, it takes every step needed to restore the deleted bib and move everything to it. The resurrected bib will be searchable and sound in the catalog.
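A simplified sketch of that behavior, assuming each bib is a plain dictionary with a `deleted` flag and a list of `items` (these structures are illustrative and are not Evergreen's actual data model):

```python
def merge_onto_lead(lead, subordinates):
    """Move items from subordinate bibs onto the lead, restoring the lead if needed."""
    if lead["deleted"]:
        # A deleted bib chosen as lead is brought back before anything is moved to it.
        lead["deleted"] = False
    for sub in subordinates:
        lead["items"].extend(sub["items"])
        sub["items"] = []
        sub["deleted"] = True  # subordinate records are deleted after the merge
    return lead

lead_bib = {"id": 1, "deleted": True, "items": []}
sub_bib = {"id": 2, "deleted": False, "items": ["item-A", "item-B"]}

merge_onto_lead(lead_bib, [sub_bib])
print(lead_bib)
print(sub_bib)
```

After the call, the formerly deleted lead is active and holds both items, while the subordinate is empty and marked deleted.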

While the deduplication process is running, can catalogers still edit bibs, merge bibs, import bibs, etc.?

Yes. Unlike the quarterly authorities update, during which many cataloging functions must cease lest they be overwritten or undone by the update, the deduplication process does not interfere with cataloging. Catalogers can continue to perform their normal duties without fear of wasted time or lost work.




NC Cardinal is supported by the Institute of Museum and Library Services under the provisions of the federal Library Services and Technology Act (LSTA), as administered by the State Library of North Carolina, a division of the Department of Natural and Cultural Resources.