Scrapbook Analysis

A Case Study in AI Consulting

Scrapbook Analysis - Doug Peterson

Executive Overview

As Head of R&D at Digital Transitions, I led a consulting engagement that applied artificial intelligence to generating metadata for scrapbooks at the National Geographic Society. Scrapbooks are difficult to digitize and describe because of their multi-layered, mixed-media composition and their personal, often idiosyncratic organization. Our custom AI solution achieved:

  • Content Segmentation: Automatically identified and separated images from captions

  • Efficiency Improvement: Reduced metadata creation time by 65% compared to manual approaches

  • Enhanced Discovery: Created navigable digital structures for previously impenetrable historical artifacts, including location mapping

Segmenting the photos and their captions was a straightforward but important precursor to more advanced parsing.

Our Approach

  • Collection Assessment: Evaluated physical characteristics and metadata needs

  • AI Strategy: Developed a custom approach for multi-layered, mixed-media materials

  • Multi-level Processing: Created segmentation, classification, and relationship mapping

Client Challenge

The National Geographic Society's research library faced critical issues with its historical scrapbook collection:

  • 500 scrapbooks containing 75,000+ items spanning 1870-1950 remained largely undescribed

  • Multi-layered pages with complex attachments defied traditional cataloging approaches

  • Deteriorating materials required careful handling and preservation considerations

  • Manual item-level processing estimated at 9,000+ staff hours

Solution Development

  • Segmentation: Developed algorithms to identify individual items within composite pages

  • OCR: Applied Google Vision OCR with in-house developed adaptations for typewritten materials

  • Named-Entity Extraction: Identified place names and automatically geocoded them to GPS coordinates

  • Visual QC: Generated maps of those GPS coordinates, sorted by time, for client QC
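
The place-name step above can be illustrated with a minimal Python sketch. It is not the project's implementation: the source does not say which entity-extraction or geocoding service was used, so this sketch swaps in a tiny hand-built gazetteer lookup. The `GAZETTEER` table, its approximate coordinates, and the `extract_places` function are all hypothetical names introduced for illustration.

```python
import re

# Hypothetical mini-gazetteer (assumption): approximate (latitude, longitude)
# pairs for illustration only, not the project's geocoding source.
GAZETTEER = {
    "Cusco": (-13.53, -71.97),
    "Ollantaytambo": (-13.26, -72.26),
    "Machu Picchu": (-13.16, -72.55),
}

def extract_places(ocr_text, gazetteer=GAZETTEER):
    """Return (name, (lat, lon)) for each gazetteer place found in the text,
    ordered by where the name first appears."""
    hits = []
    for name, coords in gazetteer.items():
        # Whole-word, case-insensitive match against the OCR output.
        match = re.search(r"\b" + re.escape(name) + r"\b", ocr_text, flags=re.IGNORECASE)
        if match:
            hits.append((match.start(), name, coords))
    return [(name, coords) for _, name, coords in sorted(hits)]

caption = "Camp near Ollantaytambo, two days from Cusco"
places = extract_places(caption)
```

A lookup table like this cannot resolve ambiguous names on its own, which is one reason a visual QC step downstream is useful.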

GPS coordinates extracted from place names correlated strongly with known expedition routes to Machu Picchu, showing clear temporal clustering and path formation. Visualizing the points in date order helped identify outliers caused by place-name ambiguity.
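
One way to automate part of that outlier check is sketched below. This is an assumption about how such a check could work, not the method used in the project, where QC was visual. Points are sorted by date, and a point is flagged when it lies far from every chronologically nearby point. The function names, the 500 km threshold, the two-point window, and the sample dates and coordinates are all hypothetical.

```python
from datetime import date
from math import radians, sin, cos, asin, sqrt

def haversine_km(a, b):
    """Great-circle distance in kilometres between two (lat, lon) pairs in degrees."""
    lat1, lon1 = map(radians, a)
    lat2, lon2 = map(radians, b)
    h = sin((lat2 - lat1) / 2) ** 2 + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2
    return 2 * 6371.0 * asin(sqrt(h))

def flag_outliers(points, max_km=500.0):
    """points: list of (date, name, (lat, lon)).
    Flag a point when even its closest chronological neighbour (up to two
    points on either side) is more than max_km away."""
    ordered = sorted(points, key=lambda p: p[0])
    flagged = []
    for i, (_, name, coords) in enumerate(ordered):
        window = [ordered[j][2] for j in range(i - 2, i + 3)
                  if j != i and 0 <= j < len(ordered)]
        if window and min(haversine_km(coords, w) for w in window) > max_km:
            flagged.append(name)
    return flagged

# Made-up sample: "San Miguel" stands in for an ambiguous name that was
# geocoded to the wrong country (approximate coordinates, illustration only).
sample = [
    (date(1912, 6, 1), "Cusco", (-13.53, -71.97)),
    (date(1912, 6, 3), "San Miguel", (13.48, -88.18)),
    (date(1912, 6, 5), "Ollantaytambo", (-13.26, -72.26)),
    (date(1912, 6, 8), "Machu Picchu", (-13.16, -72.55)),
]
suspects = flag_outliers(sample)  # ["San Miguel"]
```

Comparing against a small window rather than only the immediate neighbours keeps a valid point from being flagged just because it sits next to a mis-geocoded one. A flagged name would still need human review, since a genuine long-distance leg of a journey would also trip the threshold.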

Key Results

  • 65% reduction in processing time compared to traditional approaches

  • 320% increase in research usage during the first year

Client Outcome

"Before this project, our scrapbooks were essentially locked boxes of information. Researchers would occasionally browse them but rarely found what they were looking for. Now, these materials are among our most frequently requested items and have already spawned two scholarly publications." – Director of Special Collections, National Geographic Society

I presented this project in a Short Course at the IS&T Archiving 2024 conference, co-taught with the client.

Success Factors

  • Targeted a specific, high-value challenge with measurable ROI

  • Tailored implementation to unique cultural heritage requirements

  • Balanced accuracy needs with practical considerations for different content types

  • Addressed both descriptive needs and preservation challenges

  • Delivered significant ROI by transforming underutilized materials into highly requested resources
