
Western Waters Digital Library
The Columbia River Basin in Oregon

Scanning Specifications

This is a summary of scanning specifications and file characteristics for materials digitized as part of the University of Oregon Libraries contribution to the Western Waters Digital Library. The following information is broken down by format, providing information for photographic materials, maps & aerial photographs, documents that are primarily textual, and audio/visual materials.

More detailed procedure and workflow documentation is kept for each material type. The following is simply a summary of the characteristics of files retained by the University of Oregon Libraries. For photographic materials, four versions of each image are stored on the Libraries' Mass Storage Unit (MSU), on the CONTENTdm server ("boundless"), or on both. The MSU houses two master TIFFs for each image as well as a Presentation JPEG; CONTENTdm holds the Presentation JPEG and a thumbnail:

  • The Archived Tiff (an uncompressed and uncorrected raw scan)
  • The Production Tiff (an edited, cropped, color-corrected or otherwise altered version of the Archived Tiff from which all subsequent files are generated; adjustment layers are retained in this file)
  • The Presentation JPEG (the resized and compressed version of the Production Tiff, saved as a JPEG and loaded into CONTENTdm, as well as stored on the MSU).
  • The Thumbnail JPEG (created by CONTENTdm and stored only in CONTENTdm).

Specific details of the Archived and Production Tiffs are included in the metadata for a given image, in the following format:

Disposition -- File Name -- Bit Depth, Color Profile -- W x H pixel dimensions -- Resolution -- File Size //
e.g.:
Archived tiff -- PH035_9572.tif -- 16 bit, Adobe RGB (1998) -- 6484 x 4048 pixels -- 1200 dpi -- 157,497,792 bytes //
Production tiff -- PH035_9572jc.tif -- 16 bit, Adobe RGB (1998) -- 6484 x 4048 pixels -- 1200 dpi -- 472,468,652 bytes //
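
As an illustration of the record format above, the following Python sketch splits one metadata line into its six fields. The `TiffRecord` class and `parse_record` function are hypothetical names chosen for this example; they are not part of the Libraries' tooling, and the sketch assumes every record follows the sample layout exactly.

```python
import re
from dataclasses import dataclass


@dataclass
class TiffRecord:
    """One 'Disposition -- File Name -- ...' record (hypothetical structure)."""
    disposition: str
    file_name: str
    bit_depth: int
    color_profile: str
    width_px: int
    height_px: int
    dpi: int
    size_bytes: int


def parse_record(line: str) -> TiffRecord:
    # Drop the trailing "//" terminator, then split on the " -- " separator.
    text = line.strip()
    if text.endswith("//"):
        text = text[:-2]
    fields = [field.strip() for field in text.split(" -- ")]
    if len(fields) != 6:
        raise ValueError(f"expected 6 fields, found {len(fields)}")
    disposition, file_name, depth_profile, dims, resolution, size = fields

    # "16 bit, Adobe RGB (1998)" -> bit depth and color profile
    depth, profile = depth_profile.split(",", 1)
    # "6484 x 4048 pixels" -> width and height
    width, height = (int(n) for n in re.findall(r"\d+", dims))

    return TiffRecord(
        disposition=disposition,
        file_name=file_name,
        bit_depth=int(depth.split()[0]),
        color_profile=profile.strip(),
        width_px=width,
        height_px=height,
        dpi=int(resolution.split()[0]),
        size_bytes=int(size.split()[0].replace(",", "")),
    )


sample = ("Archived tiff -- PH035_9572.tif -- 16 bit, Adobe RGB (1998) -- "
          "6484 x 4048 pixels -- 1200 dpi -- 157,497,792 bytes //")
record = parse_record(sample)
print(record.file_name, record.width_px, record.height_px, record.dpi)
```

A parsed record makes simple consistency checks possible, for example comparing the stated file size against width x height x samples per pixel x bytes per sample for an uncompressed scan.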

Within a given material type, technical specifications vary with the intended use, physical characteristics and/or condition of the original source document. Images are scanned at varying resolutions depending on the physical dimensions of the source. Within textual documents, pages that are primarily text receive different post-production treatment from pages that consist of a single image. The workflow for color images within textual documents includes an additional level of color management, image enhancement and quality control. Larger-format fold-out maps and detailed images are not resized before JPEG compression, allowing them to take advantage of the Zoom and Pan features offered by CONTENTdm.

Pictorial Materials - Procedures and Characteristics

All images share the following characteristics:

  • Archived Tiff:
    • No Layers
    • Bit Depth = 16
  • Production Tiff:
    • Layers Retained
    • Bit Depth = 16
  • Presentation JPEG:
    • Flattened, no layers
    • Bit Depth = 8
    • Resolution = 125 pixels per inch
    • Larger pixel dimension reduced to 875 pixels, without re-sampling image.
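
As a rough illustration of the 875-pixel rule for Presentation JPEGs, the Python sketch below computes target pixel dimensions from the dimensions of a master file. The function name `presentation_size` and the decision to leave already-small images unchanged are assumptions made for this example, not part of the documented workflow. The sketch only does the arithmetic and does not touch image data.

```python
def presentation_size(width_px: int, height_px: int, long_edge: int = 875):
    """Return (width, height) with the larger dimension reduced to long_edge.

    Assumption: images whose larger dimension is already at or below
    long_edge are returned unchanged (the summary only describes reducing).
    """
    larger = max(width_px, height_px)
    if larger <= long_edge:
        return width_px, height_px
    if width_px >= height_px:
        # Landscape or square: pin the width, scale the height to match.
        return long_edge, round(height_px * long_edge / width_px)
    # Portrait: pin the height, scale the width to match.
    return round(width_px * long_edge / height_px), long_edge


# The 6484 x 4048 sample master would become an 875 x 546 presentation image.
print(presentation_size(6484, 4048))
```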

Other characteristics of files vary, according to the following criteria:

  • Black and White images:
    • Archived and Production TIFFs are 16 bit grayscale with a color profile of Gray Gamma 2.2
    • Presentation JPEGs are 8 bit RGB with a color profile of sRGB IEC61966-2.1. These are still black and white photographs, but they are converted to RGB files with 8 bits per sample and 3 samples per pixel so that all Presentation JPEGs share a consistent profile.
  • Color images:
    • Archived and Production TIFFs are 16 bit RGB with a color profile of Adobe RGB (1998)
    • Presentation JPEGs are 8 bit RGB with a color profile of sRGB IEC61966-2.1.
  • Resolution varies depending on the size of the image source:
    • 5 x 7 inches or larger: 600 dpi
    • 4 x 5 inches and smaller: 1200 dpi
    • 2 x 3 inches and smaller: 2400 dpi
    • 35mm slides, or 1 x 1.5 inches or smaller: 4000 dpi
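
The size-to-resolution rule above could be sketched in Python as follows. The function name `scan_dpi` is hypothetical. The list does not say what happens to sources between 4 x 5 and 5 x 7 inches, so this sketch assumes they are scanned at 1200 dpi; that assumption is at least consistent with the sample record earlier in this summary (6484 x 4048 pixels at 1200 dpi, roughly 5.4 x 3.4 inches), but it is not stated in the source.

```python
def scan_dpi(width_in: float, height_in: float) -> int:
    """Pick a scanning resolution from the source's physical size in inches."""
    # Orientation should not matter, so compare short and long sides.
    short_side, long_side = sorted((width_in, height_in))

    def fits_within(max_short: float, max_long: float) -> bool:
        return short_side <= max_short and long_side <= max_long

    if short_side >= 5 and long_side >= 7:
        return 600    # 5 x 7 inches or larger
    if fits_within(1, 1.5):
        return 4000   # 35mm slides, or 1 x 1.5 inches or smaller
    if fits_within(2, 3):
        return 2400   # 2 x 3 inches and smaller
    # 4 x 5 inches and smaller; ASSUMPTION: also used for anything
    # between 4 x 5 and 5 x 7, which the list leaves unspecified.
    return 1200


print(scan_dpi(8, 10), scan_dpi(4, 5), scan_dpi(2, 3), scan_dpi(1, 1.5))
```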

Textual Materials - Procedures and Characteristics

Unless there is color imagery on the page, page images of textual materials are scanned in 16 bit grayscale at 400 dpi. There is generally no production copy of text pages, although a second tiff is generated at 8 bit grayscale for use with Optical Character Recognition (OCR) software.

All images share the following characteristics:

  • Archived Tiff:
    • No Layers
    • Bit Depth = 16
    • Resolution = 400 pixels per inch
  • OCR Tiff:
    • No Layers
    • Bit Depth = 8
    • Resolution = 400 pixels per inch
  • Presentation JPEG:
    • Flattened, no layers
    • Bit Depth = 8
    • Resolution = 400 pixels per inch

Other characteristics of files vary, according to the following criteria:

  • Production TIFFs are generated for pages containing color illustrations
    • Archived and Production TIFFs are 16 bit RGB with a color profile of Adobe RGB (1998)
    • Presentation JPEGs are 8 bit RGB with a color profile of sRGB IEC61966-2.1.
  • Presentation JPEG image width is generally resized to 875 pixels so that the end product is readable at varying screen resolutions.
    • This is not the case for large format maps and other fold out images, which retain their original dimensions. This allows greater detail to be examined using CONTENTdm's Zoom and Pan features.

Maps and Aerial Photographs - Procedures and Characteristics

  • Aerial Photographs are generally scanned as 16 bit grayscale at 600 dpi.
  • Georeferenced Tiffs are generated from these scans.
  • TFW (TIFF world) files are kept with the georeferenced Tiffs.
  • Archived Tiffs are kept of the raw scans prior to georeferencing
  • Production Tiffs are not stored.
  • Presentation JPEGs are 8 bit RGB with a color profile of sRGB IEC61966-2.1.
  • Presentation JPEGs are not resized and retain a resolution of 600 dpi.
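
For readers unfamiliar with TFW files, the Python sketch below shows how the six values in a world file map a pixel position to map coordinates. It assumes the conventional six-line world file layout (A, D, B, E, C, F, one value per line); the function names and the sample values are hypothetical and are not taken from the Libraries' files.

```python
def parse_world_file(text: str):
    """Read the six world file parameters in file order: A, D, B, E, C, F."""
    values = [float(token) for token in text.split()]
    if len(values) != 6:
        raise ValueError(f"expected 6 values, found {len(values)}")
    return tuple(values)


def pixel_to_map(params, col: float, row: float):
    """Apply the affine transform to a pixel (column, row) position."""
    a, d, b, e, c, f = params
    # A and E are the pixel sizes in x and y (E is normally negative because
    # rows count downward); B and D are rotation terms; C and F are the map
    # coordinates of the center of the upper-left pixel.
    x = a * col + b * row + c
    y = d * col + e * row + f
    return x, y


# Hypothetical world file: 2-unit pixels, no rotation.
sample_tfw = "2.0\n0.0\n0.0\n-2.0\n500000.0\n4900000.0\n"
params = parse_world_file(sample_tfw)
print(pixel_to_map(params, 10, 20))
```

The world file carries only the transform, not the coordinate system, so the projection used during georeferencing would need to be recorded elsewhere.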

 

 


Last Modified: August 4, 2008
Comments and questions to Digital Collections Administrator
Metadata Services and Digital Projects, University of Oregon Libraries