Query by Image Client

Ankit Patel
Mark Tonkelowitz


Introduction

In CS265, we developed a system for retrieving similar images from a database based on user sketches. A Java test client was developed for running experiments, but it was not submitted as part of our work for CS265. We have improved the test client and are submitting it as our CS275 final project. (If you are more interested in our experimental tests and results, please refer to our paper.)

Files Submitted

  • AboutDialog.java
  • ColorLookup.java
  • FreqNode.java
  • frmMain.java
  • ImageCanvas.java
  • PaintQuery.java
  • queryClient.java
  • QuitDialog.java
  • TSHNode.java
  • TSNode.java
  • TSNode2.java
  • TSNode3.java
  • WaveletTransform.java
  • WTResult.java
  • org.mm.MySQL database drivers (precompiled in org folder)

Platform

Developed on WebGain VisualCafe 4.1 Standard, a freely available Java IDE for Windows.

How to Compile

You must use JDK version 1.2 or higher in order to successfully compile.

Unzip all the files to a folder and run: javac *.java

Usage and System Requirements

In order for the client to run, you must have a MySQL database server and a webserver preconfigured. Currently, these applications are hosted on 140.247.135.145. The program will not run if this machine cannot be accessed over the network. Although this program has been tested successfully outside of Harvard, there have been some problems with it running in Maxwell Dworkin.

To invoke the client: java queryClient
(if you get an out-of-memory error, try java -Xms64m -Xmx128m queryClient)

User Interface

A screenshot of the Query by Image Client is shown below.

The UI consists of several useful tools, a custom configurable color palette, several standard fonts, etc.

Tools

The Tools panel is where the user can choose the drawing mechanism, specify the line thickness, change the canvas size, and select an image search method. The following drawing tools have been implemented:

  • Freehand
  • Line
  • (Filled) Rectangle
  • (Filled) Oval
  • (Filled) Polygon
  • (Filled) Curve
  • Light/Heavy Spray
  • Text

A user can move the last object drawn simply by pressing the arrow keys. Holding down SHIFT while doing this moves the object further.

Color Palette

The Color Palette consists of 24 chosen colors. It can be modified with the following operations:

  • Scroll: shift position of colors by 1 space, useful for mixing
  • Add: add a new custom-made color to the palette
  • Mix: change the palette to show a gradient between the 1st and last color
  • Save: save the current palette (to memory)
  • Restore: restore any saved palettes

Fonts

The fonts panel enables the user to add text to their sketch, using one of several fonts, sizes, and styles. The following fonts are supported:

  • Helvetica
  • Times Roman
  • Courier
  • Dialog
  • Dingbats

Along with the following styles:

  • Bold
  • Italic

And sizes can range from 4 to 255 (maximum may be device-dependent).

Buttons

The button panel consists of 7 buttons, which cover the important operations a user can perform:

  • Clear: clears the drawing canvas to whatever color is currently chosen on the palette
  • Undo: undo the last action
  • Redo: redo the last undone action
  • Search: submit the user's query image and search through the database for the Top 25 matches
  • Tool: switch to Tools panel
  • Palette: switch to Palette panel
  • Font: switch to Fonts panel

For more information on the graphical tools and the algorithms that implement them, see the Code Design section below.

Theory

Some research was needed to develop our image querying metrics. Our two major sources of information were a paper on "Fast Multiresolution Image Querying" written in 1995 [1], and Prof. Steven J. Gortler's PhD thesis on "Wavelet Methods for Computer Graphics" [5]. From these papers and some other related work (see References), we developed a scheme that combines the strengths of wavelet decomposition with those of color histograms, to form the hybrid algorithm described in the Code Design section below.

Wavelets

Using mathematical basis functions known as wavelets, and more specifically the Haar wavelet basis, we decompose images into a finite tuple of coefficients that can then be used for image comparison purposes. Wavelet comparison methods are very powerful and accurate in terms of shape/texture information; however, they are not robust to distortions in the query image. Without delving further into the mathematics of wavelets, we list some of the properties of the Haar basis that are important for our purposes:

  • multiresolution property: ability to extract image information at any scale or resolution
  • hierarchical representation: allows for economical representation of a function (constant areas of little detail in an image will have very small coefficients, which can thus be removed with no significant loss)
  • efficient coefficient calculation: Haar decomposition is cheap in terms of computation time
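
As an illustration of these properties, here is a minimal sketch of a one-dimensional Haar decomposition by repeated averaging and differencing. It is not taken from WaveletTransform.java: the class name HaarSketch, the method name decompose, and the unnormalized (divide by 2) form are assumptions made for the example, and the input length is assumed to be a power of two. A 2D decomposition can be built by applying the same step to the rows and then the columns of an image.

```java
/** Hypothetical helper for illustration; not part of the submitted files. */
final class HaarSketch {

    /**
     * Full 1D Haar decomposition (unnormalized averaging/differencing).
     * Result layout: [overall average, coarsest detail, ..., finest details].
     */
    static double[] decompose(double[] signal) {
        int n = signal.length;
        if (n == 0 || (n & (n - 1)) != 0) {
            throw new IllegalArgumentException("length must be a power of two");
        }
        double[] out = signal.clone();
        double[] tmp = new double[n];
        // Each pass halves the resolution of the active prefix:
        // pairwise averages go to the front, pairwise differences behind them.
        for (int len = n; len > 1; len /= 2) {
            int half = len / 2;
            for (int i = 0; i < half; i++) {
                tmp[i] = (out[2 * i] + out[2 * i + 1]) / 2.0;
                tmp[half + i] = (out[2 * i] - out[2 * i + 1]) / 2.0;
            }
            System.arraycopy(tmp, 0, out, 0, len);
        }
        return out;
    }
}
```

For the input {9, 7, 3, 5} this yields {6, 2, 1, -1}: one overall average followed by detail coefficients from coarse to fine. A constant input such as {4, 4, 4, 4} yields {4, 0, 0, 0}, which shows why flat regions cost almost nothing to represent.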

Color Histograms

The second method of image retrieval is based on the idea of color histograms. A color histogram of an image is simply a record of the frequencies of colors present in that image. Comparison based on color histograms is robust with respect to distortions in the query image, but it performs poorly at capturing shape/texture similarity.
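
A minimal sketch of the idea follows, assuming each pixel has already been mapped to a palette index. The names HistogramSketch, histogram, and l1Distance are hypothetical, and normalizing the counts to fractions is an assumption made so that images of different sizes can be compared. The L1 difference is the comparison named in the Code Design section.

```java
/** Hypothetical helper for illustration; not part of the submitted files. */
final class HistogramSketch {

    /** Fraction of pixels that fall on each palette index (fractions sum to 1). */
    static double[] histogram(int[] paletteIndices, int paletteSize) {
        double[] freq = new double[paletteSize];
        for (int index : paletteIndices) {
            freq[index] += 1.0;
        }
        if (paletteIndices.length > 0) {
            for (int i = 0; i < paletteSize; i++) {
                freq[i] /= paletteIndices.length;
            }
        }
        return freq;
    }

    /** L1 difference: sum of absolute per-bin differences; 0 means identical histograms. */
    static double l1Distance(double[] a, double[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("histograms must have the same number of bins");
        }
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            sum += Math.abs(a[i] - b[i]);
        }
        return sum;
    }
}
```

Because only frequencies are recorded, two images whose pixels are the same colors in different positions have distance 0. This is the source of both the robustness to distortion and the weakness on shape.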

For more information and theoretical background on wavelets, color histograms, our hybrid scheme, and related work, please see our research paper entitled WHASSUPP: A Novel Approach to Query-by-Sketch Using Wavelet Coefficients and Color Histograms.

Code Design

The Query by Image Client consists of two main components: a paint program and an image querying facility.

Paint Program

The paint program is based on the work of Brian Prentice (original source code can be found here). We have made major modifications to the interface and have replaced his palette with a palette of 24 colors that are very important to the image querying facility. We have also commented his code, converted it to an application, and updated it to the Java 1.1 event model.

For the most part, the Paint program uses the standard Java methods, implemented by the java.awt.Graphics class, to draw shapes, lines, text, etc. However, to allow the user the ability to undo/redo multiple actions, there exists an internal data structure that serves as a log for the user's actions. Light/Heavy spray is implemented by selecting 25 points uniformly distributed around a center point (where the mouse was clicked), and drawing dots/circles there. Thick lines and shapes are simulated by overlapping circles, which result in a smoother texture than other alternatives. Filling closed shapes/curves is implemented by a recursive flood-fill algorithm that is horizontally iterative and vertically recursive. The Curve tool utilizes Bezier interpolation between control points, simulating a curve by a polygon with many, many vertices.
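
The fill strategy described above (iterate along a row, recurse between rows) can be sketched as follows. This is an illustration under stated assumptions, not the code in PaintQuery.java: the class FloodFillSketch, the int[][] grid indexed as [row][column], and 4-connectivity are all choices made for the example, and the grid is assumed to be rectangular.

```java
/** Hypothetical helper for illustration; not part of the submitted files. */
final class FloodFillSketch {

    /** Fills the 4-connected region containing (x, y) with newColor. */
    static void fill(int[][] grid, int x, int y, int newColor) {
        if (y < 0 || y >= grid.length || x < 0 || x >= grid[y].length) {
            return;
        }
        int target = grid[y][x];
        if (target == newColor) {
            return; // nothing to do, and recursing would never terminate
        }
        fillRun(grid, x, y, target, newColor);
    }

    /** Precondition: grid[y][x] == target. */
    private static void fillRun(int[][] grid, int x, int y, int target, int newColor) {
        int[] row = grid[y];
        // Horizontal part: iterate left and right to find the whole run, then paint it.
        int left = x;
        while (left > 0 && row[left - 1] == target) {
            left--;
        }
        int right = x;
        while (right < row.length - 1 && row[right + 1] == target) {
            right++;
        }
        for (int i = left; i <= right; i++) {
            row[i] = newColor;
        }
        // Vertical part: recurse into unfilled pixels directly above and below the run.
        for (int i = left; i <= right; i++) {
            if (y > 0 && grid[y - 1][i] == target) {
                fillRun(grid, i, y - 1, target, newColor);
            }
            if (y < grid.length - 1 && grid[y + 1][i] == target) {
                fillRun(grid, i, y + 1, target, newColor);
            }
        }
    }
}
```

Painting a whole run before recursing keeps the recursion depth tied to the number of runs rather than the number of pixels, which is the usual reason for preferring this layout over a purely recursive four-way fill.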

Another feature is the ability to read in an image of any format from the File-Open menu. After the image is loaded, one can draw over it, modify it, etc. before processing a query.

Image Querying Facility

The image querying facility allows a user to match their sketch to images in a database. A user can use one of 5 image querying metrics:

  1. Wavelet only - a sketch is decomposed in the Y, I, and Q color channels using a Haar basis decomposition. The 60 most significant coefficients of each channel and their (x, y) positions are kept and quantized to +/- 1. These quantized coefficients are compared to precomputed signatures for images in the database. A scoring and weighting system devised by Jacobs et al is used to rank the target images [1].
  2. Histogram only - first we bucket YIQ space into 32x32x32 buckets. Then we map the colors of the sketch into these buckets. Next, we map these buckets to a 24 color fixed palette and record the frequency of each color in the sketch. Finally, a simple L1 differencing scheme is used to compare the histogram of the sketch to that of every image in the database.
  3. Naive hybrid - this scheme takes the result sets from 1 & 2 and returns them one by one, alternating between the two.
  4. Average Rank hybrid - this scheme assigns two relative ranks (between 0 and 1) to each image, one for each method. Then, the overall rank is simply the average of the two.
  5. Hybrid (the subject of our CS265 paper) - This scheme was sparked by the observation that when a method returns a relatively high score, it is usually accurate, and we would like to trust it more than the other method. We therefore give relative importance to each method based on its own score on an image. In other words, given two standardized, normalized scores s1 and s2 for an image I, the calculated overall score is s = s1(s1/(s1 + s2)) + s2(s2/(s1 + s2)). This simplifies to s = (s1^2 + s2^2)/(s1 + s2). Checking the extreme cases, s = 1 when both scores equal 1, and also when one score equals 1 and the other equals 0, as we desire. The scores are each standardized by assuming a normal distribution of scores (Central Limit Theorem of Statistics with n = 1000 images).
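
The two score-combining rules in items 4 and 5 can be sketched as below. The class and method names are hypothetical, the inputs are assumed to already be normalized to the range 0 to 1, and returning 0 when both scores are 0 is an assumption made to avoid dividing by zero.

```java
/** Hypothetical helper for illustration; not part of the submitted files. */
final class HybridScoreSketch {

    /** Average Rank hybrid: plain mean of the two relative ranks. */
    static double averageRank(double r1, double r2) {
        return (r1 + r2) / 2.0;
    }

    /**
     * Hybrid: each score is weighted by its own share of the total,
     * s = s1 * (s1 / (s1 + s2)) + s2 * (s2 / (s1 + s2)) = (s1^2 + s2^2) / (s1 + s2).
     */
    static double hybrid(double s1, double s2) {
        double sum = s1 + s2;
        if (sum == 0.0) {
            return 0.0; // assumption: two zero scores combine to zero
        }
        return (s1 * s1 + s2 * s2) / sum;
    }
}
```

With scores 0.9 and 0.1, the plain average is 0.5 while the hybrid score is about 0.82, so the confident method dominates. When the two scores agree, both rules return that shared value.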

Additional Utilities

Color Processor

This utility serves to create a mapping of colors from general YIQ space to a specific palette. Initially, we "bucketed" off YIQ space into 32 buckets per channel. Later, we found that this equal distribution led to accuracy problems, so we decided to use a fixed color palette. Our choice of 24 colors was based on which colors are the most useful in drawing most images (for example, we omitted neon green and kept pure red). Hence, ColorProcessor changed, and it now calculates "buckets" (or rather, equivalence classes of colors in YIQ) by using Euclidean distance. More formally, the color Red is in an equivalence class consisting of all colors c = (y, i, q) for which distance(c, Red) is no larger than the distance from c to any other palette color. In this way, every color in YIQ space is put into 1 of 24 buckets, one per palette color, and the distribution created is NOT uniform: the Black equivalence class may have many more colors than the Red, for example. This use of an unequal color mapping improved accuracy significantly, not to mention the boost in performance achieved by quantizing YIQ space into 24 "chunks." Indeed, less is more.
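
The nearest-color mapping can be sketched as follows. This is not the submitted ColorProcessor: the class PaletteSketch and its methods are hypothetical, the RGB-to-YIQ conversion uses the commonly cited NTSC coefficients (the exact matrix used by the project is not stated here), and ties are broken in favor of the lower palette index as an arbitrary choice.

```java
/** Hypothetical helper for illustration; not part of the submitted files. */
final class PaletteSketch {

    /** Converts 8-bit RGB to YIQ using the commonly cited NTSC coefficients. */
    static double[] rgbToYiq(int r, int g, int b) {
        double y = 0.299 * r + 0.587 * g + 0.114 * b;
        double i = 0.596 * r - 0.274 * g - 0.322 * b;
        double q = 0.211 * r - 0.523 * g + 0.312 * b;
        return new double[] {y, i, q};
    }

    /** Index of the palette entry closest to the given color in YIQ space. */
    static int nearestIndex(double[] yiq, double[][] paletteYiq) {
        if (paletteYiq.length == 0) {
            throw new IllegalArgumentException("palette must not be empty");
        }
        int best = 0;
        double bestDist = Double.POSITIVE_INFINITY;
        for (int k = 0; k < paletteYiq.length; k++) {
            double dy = yiq[0] - paletteYiq[k][0];
            double di = yiq[1] - paletteYiq[k][1];
            double dq = yiq[2] - paletteYiq[k][2];
            // Squared distance is enough for comparison; the square root preserves order.
            double dist = dy * dy + di * di + dq * dq;
            if (dist < bestDist) {
                bestDist = dist;
                best = k;
            }
        }
        return best;
    }
}
```

With a three-entry palette of black, white, and pure red, the color RGB (200, 30, 30) maps to the red entry, since it is far closer to red in YIQ space than to either black or white.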

ImageInputter (or ImageProcessor)

This utility does the image pre-processing that is needed to enable image search via the methods described above. The ImageProcessor decomposes each image in the database into a color histogram where the "bins" are the 24 colors in our fixed palette. It also performs a 2D Haar basis wavelet decomposition on each image, and truncates the 16,384 coefficients per channel to only 60 (which, surprisingly, is optimal for search accuracy, as Jacobs et al. discovered [1]). More about the Haar basis is explained in the Theory section above and in our paper.
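
The truncation step, together with the quantization to +/- 1 described for the Wavelet only metric, can be sketched as below. The names are hypothetical and the layout is an assumption: the coefficients of one channel are taken as a flat array, with the array index standing in for the (x, y) position. Any special handling of the overall-average coefficient is left out of this sketch.

```java
import java.util.Arrays;
import java.util.Comparator;

/** Hypothetical helper for illustration; not part of the submitted files. */
final class SignatureSketch {

    /**
     * Keeps the `keep` largest-magnitude coefficients and quantizes them to +1 or -1.
     * All other positions (and exact zeros) become 0 in the returned signature.
     */
    static int[] truncateAndQuantize(double[] coeffs, int keep) {
        Integer[] order = new Integer[coeffs.length];
        for (int i = 0; i < order.length; i++) {
            order[i] = i;
        }
        // Sort positions by descending magnitude; the sort is stable, so ties keep index order.
        Arrays.sort(order, Comparator.comparingDouble((Integer i) -> -Math.abs(coeffs[i])));

        int[] signature = new int[coeffs.length];
        int limit = Math.min(keep, coeffs.length);
        for (int k = 0; k < limit; k++) {
            int i = order[k];
            if (coeffs[i] > 0) {
                signature[i] = 1;
            } else if (coeffs[i] < 0) {
                signature[i] = -1;
            }
        }
        return signature;
    }
}
```

For example, keeping 3 of the coefficients {0.1, -5.0, 3.0, 0.0, -0.2, 4.0} gives the signature {0, -1, 1, 0, 0, 1}. For the setup described above, `keep` would be 60 and the array would hold the 16,384 coefficients of one channel.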

What We Learned

We learned a lot about image decomposition and comparison techniques. Wavelets are great for discerning the shape/texture information in an image. The Haar basis was especially useful for our purposes because of its strength in discerning coarse textures (most sketches have little detail). Color Histograms are great when objects in the sketch are distorted with respect to their positions in the target image. Also, we learned that using a different color space (YIQ instead of RGB) leads to increased search accuracy. Euclidean distance in RGB space does not correspond well to human perception.

As for developing a paint program, we learned how graphical drawing tools are implemented and how the algorithms behind drawing more complex 2D shapes work. Also, we now understand how undo/redo capabilities can be implemented in paint programs and what data structures/formats are needed to enable such a nifty feature.

For a more in-depth discussion regarding the performance of our search algorithms, and additional observations, see Sections 4 & 5 of our paper.

References

[1] C. E. Jacobs, A. Finkelstein, D. H. Salesin, Fast Multiresolution Image Querying, Proceedings of SIGGRAPH 95, in Computer Graphics Proceedings, Annual Conference Series, pp. 277-286, August 1995.

[2] James Z. Wang, Gio Wiederhold, Oscar Firschein, Sha Xin Wei, "Content-based image indexing and searching using Daubechies' wavelets," International Journal of Digital Libraries (IJODL), 1(4):311-328, Springer-Verlag, 1998.

[3] James Z. Wang, Gio Wiederhold, Oscar Firschein, "System for screening objectionable images using Daubechies' wavelets and color histograms," Interactive Distributed Multimedia Systems and Telecommunication Services, Ralf Steinmetz and Lars C. Wolf (eds.), Proceedings of IDMS, 20-30, Springer-Verlag LNCS 1309, ACM, Darmstadt, Germany, September 1997.

[4] A. Natsev, R. Rastogi, K. Shim. WALRUS: A Similarity Retrieval Algorithm for Image Databases. Technical report, Bell Laboratories, Murray Hill, 1998.

[5] S. J. Gortler, Wavelet Methods for Computer Graphics, PhD thesis, Princeton University.

[6] A. L. Graps, An Introduction to Wavelets, IEEE Computational Science and Engineering, Volume 2, Number 2, Summer 1995, pp. 50-61.