PDS_VERSION_ID = PDS3
RECORD_TYPE = STREAM
OBJECT = TEXT
  PUBLICATION_DATE = 2005-10-28
  NOTE = "Description of contents of MI RDR EXTRAS directory"
END_OBJECT = TEXT
END

MER Microscopic Imager Merged Focal Sections and Anaglyph Stereo Images

by Ken Herkenhoff

This document describes the data in the EXTRAS directory on the MER Microscopic Imager Science Calibrated Image RDR volumes. Files in the EXTRAS directory are included in the archive for the user's convenience, but they are not required to be compliant with PDS standards. Hence there are no PDS labels for these files. The data consist of merged focal section images and anaglyph stereo images in TIFF and JPEG formats, organized into subdirectories by sol number. These data were generated by the MI Team at the U.S. Geological Survey, Flagstaff, Arizona. The remainder of this document describes how these images were processed.

Over most targets observed by the MI, stacks of 3, 5 or 7 images were acquired to ensure that every part of the scene was in focus in at least one image. These images were taken at different distances from the target and hence have different scales. When possible, these "focal sections" were merged into a single image that shows all parts of the target in good focus. High-quality focal section merges could be produced only when the entire surface of the MI target was observed in good focus in at least one of the images in the stack. Therefore, focal section merges could not be generated for all MI observations. Images that were completely out of focus have no depth information, and images with moving shadows confused the software to varying degrees. In general, any sharp boundary within a scene that varied from image to image would result in artifacts in the output. In many cases, imaging the MI shadow was unavoidable.
The focal merging concept is straightforward: out-of-focus images of a scene have less high spatial frequency information than in-focus images of the same scene. The actual amount of high frequency information varies across the scene and among scenes; there is no threshold that distinguishes in-focus from out-of-focus. For each neighborhood (any region several pixels across) within a scene, each of the images was considered. The image that had the largest high frequency component was judged to be the best-focus image for that neighborhood. The 3-dimensional position was then defined based on the pixel coordinates for the neighborhood and the known depth to best focus. The 3-dimensional position was refined by a polynomial fit in depth as described below, resulting in depth resolution finer than the sampling interval.

MI EDRs were used as the input to the focal section merging process, which made use of IDL software created by Mark Lemmon (Texas A&M). The raw digital numbers (DN) were divided by exposure time, resulting in somewhat uniform brightness levels despite possible small changes in exposure time due to auto-exposure. Without any reprojection, the full 1024 x 1024 image was passed through a simple high-pass filter by dividing the image by a smoothed version of the image (11 x 11 pixel boxcar average) and subtracting unity. The high-pass image was the basis for determining depth. Dividing the image by the smoothed image eliminated effects of large scale illumination variations, but did not eliminate effects of moving shadow edges. When possible, images were obtained entirely in shadow. For many images, this was not possible, and the moving shadow caused local artifacts in the subsequent processing.

Next, tie points were selected manually on features that were obvious in multiple images.
Each selected tie point was refined using a local feature matching algorithm so that the relative position of features in the different images was known to at least one-pixel accuracy. The relative positions of the tie points were used to determine the relative altitudes of the MI in the various images (through variation in image scale) and any twist around the optical axis (through image rotation) caused by the fact that the IDD has 5 degrees of freedom, not 6. Generally, the IDD motion included very little twist. The full 6-degree-of-freedom position and orientation of the MI was actually determined from the tie points, to correct small errors in the actual vs. planned position of the IDD turret. All images were then rescaled with bi-cubic convolution interpolation such that tie points were aligned with their locations in the top image of the stack (or a different user-selected image). The rescaling was done separately for both the set of raw (DN/sec) images and high-pass images.

The processing proceeded in two stages. A first approximation of depth was determined by simply looking up, for each pixel in the aligned images, which image had the largest absolute magnitude of high-frequency component. Pixels were not compared to other pixels, as the true high-frequency component varied across the scene. But for a specific pixel, the image in which the local high frequency component was maximized was taken to be the image in which that pixel was in focus. The depth for each pixel was then set to be the altitude corresponding to the in-focus image (the constant offset for pupil-to-target distance for best focus was ignored, as all depth information was treated as relative within the same scene). At the end of this stage, an in-focus image was created by assembling the best-focus raw image value (DN/sec) for all pixels into a new image.
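The high-pass filter and the first-stage depth estimate described above can be sketched in a few lines. This is an illustrative Python sketch only, not the IDL software used by the MI Team. The function names (boxcar, high_pass, first_stage), the list-of-lists image layout, and the handling of image edges (averaging over the part of the window that falls inside the image) are assumptions made for the example; the source does not specify them. The images are assumed to be already aligned and to have positive DN/sec values.

```python
def boxcar(img, size=11):
    """Mean over a size x size window. Near the edges the window is
    clipped to the image (an assumption; edge handling is not documented)."""
    h, w = len(img), len(img[0])
    r = size // 2
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            rows = range(max(0, y - r), min(h, y + r + 1))
            cols = range(max(0, x - r), min(w, x + r + 1))
            vals = [img[j][i] for j in rows for i in cols]
            out[y][x] = sum(vals) / len(vals)
    return out


def high_pass(img, size=11):
    """Image divided by its boxcar-smoothed version, minus unity.
    Assumes every smoothed value is positive (no division by zero)."""
    smooth = boxcar(img, size)
    return [[img[y][x] / smooth[y][x] - 1.0 for x in range(len(img[0]))]
            for y in range(len(img))]


def first_stage(raw_stack, hp_stack, altitudes):
    """For each pixel, pick the image whose high-pass value has the
    largest absolute magnitude; return (depth map, merged image)."""
    h, w = len(raw_stack[0]), len(raw_stack[0][0])
    depth = [[0.0] * w for _ in range(h)]
    merged = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            best = max(range(len(hp_stack)),
                       key=lambda k: abs(hp_stack[k][y][x]))
            depth[y][x] = altitudes[best]       # relative altitude of best focus
            merged[y][x] = raw_stack[best][y][x]  # best-focus DN/sec value
    return depth, merged
```

Each pixel is compared only with the same pixel in the other images of the stack, never with its neighbors, which matches the description above. A practical implementation would use an array library rather than nested loops for 1024 x 1024 images.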
The first stage resulted in a depth map at the resolution of IDD motion, typically with 3 to 7 different altitudes per focal series, separated by 3 mm steps. The amount of high frequency information was generally a smooth function of altitude (i.e., the depth of field is fairly well sampled). The second stage increased the depth resolution by going back to each pixel and performing a second order polynomial fit to find the altitude of best focus. Because each pixel in the high pass filtered version had some information from neighboring pixels and because of inherent noise in the process, a 5x5 pixel median filter was used to eliminate outliers, and the depth map was smoothed with a 15x15 pixel boxcar average. The final horizontal resolution in the depth map is therefore near 15 pixels, or 0.46 mm. Repeat image sequences of the same target suggest depth repeatability is ~1/5 the step size, or ~0.6 mm. Finally, the in-focus image was updated by polynomial interpolation of the raw images in depth to the best-focus position.

The primary output of this procedure was a merged image that is, in principle, all in focus. This image was saved as a TIFF file named "<id>_raw.tif", where <id> is a text identifier with the following format: R###_target, where R is the rover (A for Spirit, B for Opportunity), ### is the 3-digit sol number, and target is the target name.

The first ancillary product was a depth map, saved as "<id>_dem.tif" and "<id>_dem.txt". The text file contains scaling factors that convert the 0-255 range within the TIFF file to elevations in mm (with an arbitrary zero point). The depth map was then used to project the image into synthetic left and right eye views, archived as "<id>_[RL].jpg". The projection was a simple shift of pixel values left or right depending on depth. The magnitude of the shift was dynamic, such that the full range of depths in the image was displayed.
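The pixel-shift projection described above can be illustrated with a short Python sketch. It is not the MI Team's code, and several details are assumptions because the source does not give them: the function name stereo_views, the hypothetical max_shift parameter (largest shift in pixels), the linear mapping of the image's own depth range onto 0..max_shift, the convention that larger depth-map values are closer to the camera, and the rule that pixels vacated by a shift keep their original values.

```python
def stereo_views(image, depth, max_shift=8):
    """Build synthetic left/right views by shifting pixels horizontally.
    The shift scales with the depth range of this image, so the full
    range of depths is always mapped onto 0..max_shift pixels."""
    h, w = len(image), len(image[0])
    flat = [d for row in depth for d in row]
    dmin, dmax = min(flat), max(flat)
    span = (dmax - dmin) or 1.0          # avoid dividing by zero on flat scenes
    left = [row[:] for row in image]     # unfilled pixels keep original values
    right = [row[:] for row in image]
    for y in range(h):
        # Visit low points first so that higher (nearer) points overwrite them.
        order = sorted(range(w), key=lambda x: depth[y][x])
        for x in order:
            s = round(max_shift * (depth[y][x] - dmin) / span)
            if 0 <= x + s < w:
                left[y][x + s] = image[y][x]   # nearer points move right in the left view
            if 0 <= x - s < w:
                right[y][x - s] = image[y][x]  # and move left in the right view
    return left, right
```

Because the shift is normalized to each image's own depth range, a scene with 1 mm of relief and a scene with 10 mm of relief receive the same maximum disparity, so the apparent relief is not comparable from one product to the next.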
The projection resulted in variations in vertical exaggeration, so the DEM files should be used rather than the anaglyphs for quantitative assessment of topography. The right and left eye views were then combined into an anaglyph for stereo viewing, saved with the same identifier prefix and the suffix "_ana.jpg".
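As an illustration of the final step, the Python sketch below combines two grayscale views into a color anaglyph. The source does not state which color channels were used; the red-cyan convention shown here (left view in the red channel, right view in green and blue) is a common choice and is an assumption, as is the function name anaglyph.

```python
def anaglyph(left, right):
    """Combine grayscale left/right views into rows of (R, G, B) tuples.
    Assumed red-cyan convention: red from the left view, green and blue
    from the right view."""
    if len(left) != len(right) or any(
            len(a) != len(b) for a, b in zip(left, right)):
        raise ValueError("left and right views must have the same shape")
    return [[(l, r, r) for l, r in zip(lrow, rrow)]
            for lrow, rrow in zip(left, right)]
```

Where the two views agree, the output pixel is neutral gray; color fringes appear only where the depth-dependent shift displaced a feature between the views.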