Last Updated 03/09/2026
The Annual Deduplication Process is a two-stage procedure that NC Cardinal runs to keep cataloging consistent across its consortium of libraries. It begins with the Waves clean up, which corrects bibliographic records that have no format icon or the wrong one. A deduplication script then compares fields such as title, author, and ISBN to identify duplicate records and scores them to pick a lead record. Catalogers review the flagged records to confirm that they should be merged and that the lead record is the better one. The timeline alternates between review periods for the Cataloging Interest Group and processing by Mobius, each lasting from about two weeks to a month. Catalogers can continue their normal work while deduplication runs without losing data, and a deleted bib may be restored if the script prefers it as the lead record.
*Last updated 02/18/2026
The annual deduplication process has two stages. The first is a cleanup of bib records that either have no format icon or have the wrong one applied. Identifying these records and applying the correct icon is called the Waves clean up, and it looks at every bibliographic record in the catalog.
After the Waves clean up is complete, a deduplication script compares records by format type. By comparing data in the record, such as title, author, ISBN, and other identifiers, the script determines whether records are duplicates of one another. Once duplicates are identified, each record is scored for quality using a variety of criteria, and the record with the highest score is marked as the lead bib record.
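The matching and scoring steps described above can be sketched roughly as follows. This is only an illustration, not the actual script: the `Bib` fields, the `normalize` rule, and the single scoring criterion in `quality_score` are hypothetical stand-ins, since the real script's identifiers and scoring criteria are not listed on this page.

```python
from collections import defaultdict
from dataclasses import dataclass
import re


@dataclass
class Bib:
    # Hypothetical, simplified bib record; real records carry far more data.
    bib_id: int
    record_format: str   # e.g. "book", "dvd" (the format icon)
    title: str
    author: str
    isbn: str
    field_count: int = 0  # assumed stand-in for record completeness


def normalize(text: str) -> str:
    # Lowercase and keep only letters and digits, so punctuation and
    # spacing differences do not hide a match.
    return re.sub(r"[^a-z0-9]", "", text.lower())


def match_key(bib: Bib) -> tuple:
    # Including the format in the key means records are only compared
    # with other records of the same format type.
    return (
        bib.record_format,
        normalize(bib.title),
        normalize(bib.author),
        normalize(bib.isbn),
    )


def quality_score(bib: Bib) -> int:
    # Assumed criterion for illustration only: a fuller record scores higher.
    return bib.field_count


def find_duplicates(bibs):
    """Return a list of (lead, subordinates) for each group of duplicates."""
    groups = defaultdict(list)
    for bib in bibs:
        groups[match_key(bib)].append(bib)

    results = []
    for group in groups.values():
        if len(group) < 2:
            continue  # no duplicate found for this record
        ranked = sorted(group, key=quality_score, reverse=True)
        results.append((ranked[0], ranked[1:]))
    return results
```

In this sketch, two "book" records for the same title, author, and ISBN land in one group even if one has a hyphenated ISBN, while a "dvd" record with the same data stays separate. That separation is why the format icons need to be correct before deduplication runs.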
Records are placed in a spreadsheet for examination by catalogers, who review the two bibs identified in each line item to make sure that they are for the same material (content and format) and to determine whether the “lead” bib identified is actually the better record. After the review is complete, the script is run: items on the duplicate records are merged onto the lead record, and the subordinate records are deleted.
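The review-then-merge step can be pictured with a small sketch like the one below. It is an assumption-laden illustration rather than the real script: the row keys (`lead`, `subordinate`, `approved`) and the in-memory structures `items_by_bib` and `deleted` are hypothetical, standing in for the review spreadsheet and the catalog database.

```python
def apply_reviewed_merges(items_by_bib, deleted, review_rows):
    """Merge items for every line item that a cataloger approved.

    items_by_bib: dict mapping bib id -> list of item barcodes (hypothetical)
    deleted:      set of bib ids that are marked deleted (hypothetical)
    review_rows:  iterable of dicts with keys "lead", "subordinate", "approved"
    """
    for row in review_rows:
        if not row["approved"]:
            continue  # cataloger decided these two bibs are not duplicates

        lead, sub = row["lead"], row["subordinate"]

        # Move every item from the subordinate bib onto the lead bib.
        moved = items_by_bib.get(sub, [])
        items_by_bib.setdefault(lead, []).extend(moved)
        items_by_bib[sub] = []

        # The subordinate bib is deleted once it no longer holds items.
        deleted.add(sub)

    return items_by_bib, deleted
```

Line items that a cataloger rejects are simply skipped, so both bibs and their items stay as they were.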
Cataloging Interest Group (~2 weeks)
Mobius (~2 weeks; may take as little as a day depending on the number of bibs and the parameters used)
Cataloging Interest Group (~2 weeks)
Mobius (~1 month; may take as little as a day depending on the number of bibs and the parameters used)
Deleted bibs may be included in the deduplication process. How does this affect the merging of records, if at all?
The deduplication process may prefer a deleted bib over a non-deleted bib. If it does, it will do everything necessary to bring the deleted bib back to life and move everything to it. The resurrected bib will be searchable and sound in the catalog.
While the deduplication process is running, can catalogers still edit bibs, merge bibs, import bibs, etc.?
Yes. Unlike the quarterly authorities update, during which many cataloging functions must cease lest they be overwritten or undone by the update, catalogers can continue to perform their normal cataloging duties without fear of wasted time or lost work.