National Institute of Standards and Technology

Toward Web Accessible Reference Measurements of Image Similarity

Summary

Comparison is at the heart of all content-based retrieval approaches. We have investigated how to make comparisons of image files computationally scalable in a computing cloud. Figure 1 shows example results of comparing two images using multiple comparison metrics.

Figure 1. Web-based framework for executing the image comparisons.

There is a need to understand how biological conclusions vary with the choice of a similarity metric, and with the software quality and parameters of the similarity computations. Figure 2 illustrates how numerical similarity values vary with the choice of an image descriptor and similarity measure.

Figure 2. Numerical values showing the variability of similarity values depending on the choice of image descriptor and similarity measure.

Description

The goal of our effort is to advance high-throughput, high-confidence image comparisons that can be accessed from any stationary or mobile computing device. We address the lack of reference implementations of image similarity metrics that would be:

  • presented according to a developed taxonomy of the similarity computations,
  • validated regularly by pre-configured tests,
  • accessed from any geographic location, and
  • computed over a large collection of images and in response to a time-varying number of computational requests.

    We have explored a process that compares the contents of one file with the contents of another and returns a value, either a distance or a level of similarity. From these proximity values, we can rank the files in a database by similarity of content and return the top N closest results to carry out a search. Such functionality is useful not only in content management systems but also in archives: to detect duplicates or slightly modified versions of the same file; to group and organize large digital collections in an automated manner; and to detect differences in files that have gone through file format conversion.
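The ranking step described above can be sketched as follows. This is a minimal illustration, not the project's implementation: the descriptor vectors, the file names, and the `top_n_closest` helper are hypothetical, and Euclidean distance stands in for whichever proximity measure is actually chosen.

```python
import heapq
from typing import Callable, Dict, List, Sequence, Tuple

Descriptor = Sequence[float]


def euclidean_distance(a: Descriptor, b: Descriptor) -> float:
    """Distance between two descriptors of equal length (smaller = more similar)."""
    if len(a) != len(b):
        raise ValueError("descriptors must have the same length")
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5


def top_n_closest(
    query: Descriptor,
    database: Dict[str, Descriptor],
    n: int,
    distance: Callable[[Descriptor, Descriptor], float] = euclidean_distance,
) -> List[Tuple[str, float]]:
    """Return the n database entries closest to the query, nearest first."""
    scored = ((name, distance(query, desc)) for name, desc in database.items())
    return heapq.nsmallest(n, scored, key=lambda item: item[1])


# Hypothetical descriptors, e.g. three-bin histograms of three files.
database = {
    "scan_a.tif": [0.0, 0.0, 1.0],
    "scan_b.tif": [0.0, 1.0, 0.0],
    "scan_c.tif": [0.1, 0.0, 0.9],
}
results = top_n_closest([0.0, 0.0, 1.0], database, n=2)
for name, dist in results:
    print(f"{name}: {dist:.4f}")
```

A distance of zero for a file indicates an exact match at the descriptor level, which is the signal a duplicate detector would look for; near-zero distances flag slightly modified versions.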

    Our approach is based on organizing and evaluating image similarity metrics according to several existing surveys of image similarities. Each similarity metric is represented by a triplet consisting of an image loader and color space representation, an image descriptor, and a proximity measure. The proximity measures are grouped into those that operate on histogram descriptors, contiguous image segments, clusters of image pixels, or raw pixel values. This classification of individual computations and their sub-categories allows us to build a simple tree taxonomy encapsulating image loading/representation, image characterization, and comparison, and to map the taxonomy onto intuitive web interfaces. Next, we leveraged Versus, an NCSA open source project that uses the web as a participation platform (Web 2.0) and provides Representational State Transfer (REST) web services for managing data and executing pair-wise comparisons through programmatic interfaces.
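To make the triplet idea concrete, the sketch below composes a color space representation, an image descriptor, and a proximity measure into one comparison. It does not use the Versus API; the class `SimilarityMetric`, the helper functions, and the choice of a four-bin grayscale histogram with an L1 distance are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence, Tuple

RGBImage = List[List[Tuple[int, int, int]]]
GrayImage = List[List[float]]


def to_grayscale(image: RGBImage) -> GrayImage:
    """Color space representation: unweighted mean of R, G, B (illustrative choice)."""
    return [[sum(pixel) / 3.0 for pixel in row] for row in image]


def gray_histogram(image: GrayImage, bins: int = 4) -> List[float]:
    """Image descriptor: normalized histogram of intensities in [0, 255]."""
    counts = [0] * bins
    total = 0
    for row in image:
        for value in row:
            counts[min(int(value * bins / 256), bins - 1)] += 1
            total += 1
    if total == 0:
        raise ValueError("image has no pixels")
    return [count / total for count in counts]


def l1_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Proximity measure operating on histogram descriptors."""
    return sum(abs(x - y) for x, y in zip(a, b))


@dataclass(frozen=True)
class SimilarityMetric:
    """One leaf of the taxonomy: (representation, descriptor, measure)."""
    representation: Callable[[RGBImage], GrayImage]
    descriptor: Callable[[GrayImage], List[float]]
    measure: Callable[[Sequence[float], Sequence[float]], float]

    def compare(self, first: RGBImage, second: RGBImage) -> float:
        d1 = self.descriptor(self.representation(first))
        d2 = self.descriptor(self.representation(second))
        return self.measure(d1, d2)


metric = SimilarityMetric(to_grayscale, gray_histogram, l1_distance)
black = [[(0, 0, 0)] * 2 for _ in range(2)]
white = [[(255, 255, 255)] * 2 for _ in range(2)]
print(metric.compare(black, black), metric.compare(black, white))
```

Keeping the three stages as separate, swappable callables mirrors the tree taxonomy: replacing any one element of the triplet yields a different metric without touching the other two.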

    We designed and integrated 40 similarity metrics into the Versus framework. These implementations were thoroughly validated in two ways: by comparing computed outputs against numerically predicted outputs for given inputs, and by processing synthetic input images with known expected image outputs. The validation test suite can be executed regularly to verify the correctness of the similarity computations. Next, we deployed the web-accessible image similarity computations on several virtual machines that form a computational cloud. The browser interface enables access to the computational resources from mobile devices that have sensing and data exchange capabilities. Our current prototype web interface has been optimized to accommodate a tree-based taxonomy of similarity metrics and to propagate relevant information from the server to the web browser interface (e.g., definitions of metrics or error messages).
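A validation test of the kind described above might look like the following sketch. It is not the project's test suite: the synthetic image generator, the two-bin histogram, the histogram intersection measure, and the test cases are hypothetical, chosen because their expected values can be predicted by hand.

```python
import math
from typing import List, Sequence


def synthetic_image(width: int, height: int, white_columns: int) -> List[List[int]]:
    """Binary synthetic image: the leftmost `white_columns` columns are 255, the rest 0."""
    return [[255 if x < white_columns else 0 for x in range(width)]
            for _ in range(height)]


def two_bin_histogram(image: List[List[int]]) -> List[float]:
    """Fractions of dark (< 128) and bright (>= 128) pixels."""
    pixels = [value for row in image for value in row]
    bright = sum(1 for value in pixels if value >= 128)
    return [(len(pixels) - bright) / len(pixels), bright / len(pixels)]


def histogram_intersection(a: Sequence[float], b: Sequence[float]) -> float:
    """Similarity in [0, 1] for normalized histograms; 1 means identical."""
    return sum(min(x, y) for x, y in zip(a, b))


# (white columns in image 1, white columns in image 2, predicted similarity)
CASES = [
    (0, 0, 1.0),  # identical all-black images
    (4, 0, 0.5),  # half-white vs. all-black
    (8, 0, 0.0),  # all-white vs. all-black
    (2, 6, 0.5),  # 25% white vs. 75% white
]


def run_validation(width: int = 8, height: int = 8, tolerance: float = 1e-12) -> List[str]:
    """Return a list of failure messages; an empty list means all cases passed."""
    failures = []
    for w1, w2, expected in CASES:
        h1 = two_bin_histogram(synthetic_image(width, height, w1))
        h2 = two_bin_histogram(synthetic_image(width, height, w2))
        actual = histogram_intersection(h1, h2)
        if not math.isclose(actual, expected, abs_tol=tolerance):
            failures.append(f"case ({w1}, {w2}): expected {expected}, got {actual}")
    return failures


print(run_validation() or "all validation cases passed")
```

Because the cases are pre-configured and deterministic, a suite like this can be rerun on a schedule or after each deployment to detect regressions in the similarity computations.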

    Major Accomplishments

    The paper: Peter Bajcsy, Ben Long, Antoine Vandecreme, Joe Chalfoun, Paul Khouri-Saba, and Mary Brady, “Toward Web Accessible Reference Measurements of Image Similarity,” CYTO 2012: ISAC’s XXVII Congress, Leipzig, Germany, June 23-27, 2012.

    Luigi Marini, Peter Bajcsy, Devin Bonnie, Antoine Vandecreme, Rob Kooper, Benjamin Long, Michal Ondrejcek, Paul Khouri Saba, Joe Chalfoun, Kenton McHenry, “Versus: A Framework for General Content-Based Comparisons,” The 8th IEEE International Conference on eScience (eScience 2012), Chicago, Illinois, 8-12 October 2012 (poster).

    Lead Organizational Unit:

    ITL

    Staff:

    ITL-Software and Systems Division
    Information Systems Group

    Date created: April 10, 2014