Probabilistic linkage can clean up immunization information systems
New strategies improve data accuracy in the Michigan Care Improvement Registry (MCIR), the state's immunization information system.
Temporary names such as Babygirl or BoyA are sometimes submitted to MCIR with vaccinations that are administered before the child has a legal name. A second record, missing that crucial vaccination information, is often entered once the baby is officially named, thereby introducing a record with incomplete vaccination history into the database. Each year, this may occur for thousands of births in Michigan, leading to inaccurate duplicate records.
But a collaborative team of researchers from Michigan Medicine and the Division of Immunization at the Michigan Department of Health and Human Services tested a new way to alleviate this problem. In a paper published in Vaccine, the team demonstrates that software-assisted deduplication using probabilistic linkage can effectively match temporary names to permanent records, enabling subsequent record reconciliation to improve the accuracy and completeness of vaccination history.
According to the paper's lead author, Hannah Peng, MPH, a lead statistician at Michigan Medicine and the Susan B. Meister Child Health Evaluation and Research Center (CHEAR), probabilistic linkage is the key.
"Because these temporary records don't include the child’s actual name, you can't find an exact match for them," Peng said. "Typically, when we link people together, we do an exact match on different demographic fields. Probabilistic linkage allows us to use the same sorts of demographic details, but instead of requiring an exact match, it gives us a score indicating the level of confidence with which the two records are in fact the same person, even if there are some differences in there."
Without a valid name to use for matching, Peng said, the team relies on other demographic fields such as birthdate, address, and responsible party.
"It's a relative measure, and so higher scores indicate that it's more likely to be the same person and lower scores indicate that it's less likely to be the same person," Peng said. "There's also an aspect of manual review involved to determine where you're comfortable drawing that line of when to call it a match versus not a match."
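The scoring Peng describes can be illustrated with a minimal sketch. It uses a common textbook formulation of probabilistic linkage (Fellegi-Sunter-style log-likelihood weights); the article does not say which software, fields weights, or thresholds the team used, so the field names, the m/u probabilities, and the two cutoffs below are all hypothetical values chosen for illustration.

```python
import math

# Hypothetical per-field probabilities (NOT from the paper):
#   m = P(field agrees | records are the same child)
#   u = P(field agrees | records are different children)
FIELD_PARAMS = {
    "birthdate": (0.98, 0.003),
    "address": (0.85, 0.01),
    "responsible_party": (0.90, 0.005),
}

def match_score(rec_a, rec_b, params=FIELD_PARAMS):
    """Sum log2 agreement/disagreement weights over the compared fields."""
    score = 0.0
    for field, (m, u) in params.items():
        a, b = rec_a.get(field), rec_b.get(field)
        if a is None or b is None:
            continue  # a missing value contributes no evidence either way
        if a == b:
            score += math.log2(m / u)              # agreement raises the score
        else:
            score += math.log2((1 - m) / (1 - u))  # disagreement lowers it
    return score

def classify(score, lower=0.0, upper=10.0):
    """Hypothetical cutoffs; the middle band goes to manual review."""
    if score >= upper:
        return "match"
    if score <= lower:
        return "non-match"
    return "manual review"

# Illustrative records only. The name field is deliberately not compared,
# because the temporary record has no usable name.
temp = {"name": "Babygirl", "birthdate": "2021-03-04",
        "address": "12 Elm St", "responsible_party": "J. Doe"}
full = {"name": "Ava Doe", "birthdate": "2021-03-04",
        "address": "12 Elm St", "responsible_party": "J. Doe"}
partial = {"name": "Mia Roe", "birthdate": "2021-03-04",
           "address": "9 Oak Ave", "responsible_party": "K. Roe"}
other = {"name": "Liam Poe", "birthdate": "2020-11-20",
         "address": "40 Pine Rd", "responsible_party": "S. Poe"}

for candidate in (full, partial, other):
    s = match_score(temp, candidate)
    print(candidate["name"], round(s, 2), classify(s))
```

The score is relative, as Peng notes: only the ordering of candidates and the position of the cutoffs matter, and in practice the cutoffs would be tuned by manually reviewing pairs near the boundary.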
Results in the paper support the approach. Of the 16,806 temporary-name records submitted to MCIR for births between Jan. 1, 2020, and Dec. 31, 2023, 9,803 (58%) were linked to a legal-name record using probabilistic linkage. Among the subset of 8,548 children age 19–35 months who were assigned a temporary name at birth, childhood-vaccine-series completion coverage increased from 1.5% to 39.6% after reconciliation of vaccination histories. Statewide, childhood-vaccine-series coverage among the 162,666 children age 19–35 months increased from 66.2% to 68.2% after reconciliation.
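As a reader's consistency check on those figures, the short calculation below converts the reported percentages into approximate child counts. It uses only numbers quoted in this article; because the published percentages are rounded, the resulting counts are estimates and are not figures from the paper.

```python
# Figures quoted in the article
temp_records = 16_806
linked = 9_803
link_rate = linked / temp_records  # roughly 0.58

# Temporary-name subset, age 19-35 months
subset_n = 8_548
subset_before, subset_after = 0.015, 0.396
subset_gain = round(subset_n * (subset_after - subset_before))

# Statewide cohort, age 19-35 months
state_n = 162_666
state_before, state_after = 0.662, 0.682
state_gain = round(state_n * (state_after - state_before))

print(f"link rate: {link_rate:.1%}")
print(f"approx. additional children series-complete (subset): {subset_gain}")
print(f"approx. additional children series-complete (statewide): {state_gain}")
```

Both estimates land near 3,250 children, which is what one would expect if the statewide two-point increase comes from the reconciled temporary-name records.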
"It's important information to share because this is an issue that a lot of immunization registries face. These temporary names can get entered and if they are not detected, they may persist in the database as a duplicate person for a long time," Peng said. "This is an effective strategy to find those records and link them back to the appropriate permanent-name records, which enables us to deduplicate the database and make sure the data is more accurate and complete."
Kevin J. Dombkowski, DrPH, MS, a CHEAR faculty investigator, is the senior author on the paper.