2MASS Spring 1999 Explanatory Supplement: Data Processing

2. Data and Basic Reductions

2.1. Image Data

The 2MASS survey strategy is to map the sky with overlapping strips, or tiles, each approximately 6° long and 8.5’ wide, using three 256×256 NICMOS (HgCdTe) arrays (one per band) with 2" pixels. The data are acquired efficiently with a freeze-frame scanning technique (detailed in Beichman et al. 1998), such that every piece of sky is observed six times at 1.3 sec of integration per sample. Sub-pixel dithering between samples minimizes the deleterious effects of under-sampling. Frames are optimally combined to form "coadd" images of 512×1024 pixels with resampled 1" pixels. The coadd image is more commonly referred to as the 2MASS "atlas image". Atlas images have ~10% overlap along the in-scan (declination) axis to minimize incompleteness of large galaxies. Thus, each 6° scan comprises approximately 23 coadd images. The atlas image is the basic data product from which galaxies and other extended sources are detected, characterized and extracted into the 2MASS database. In addition to the full coadd images, small sub-sections of the atlas images (referred to as "postage stamp" images) are extracted for each extended source.
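The coadd count quoted above can be checked with a little arithmetic. The sketch below assumes that the 1024-pixel axis of the coadd lies along the in-scan direction and that the overlap is exactly 10%; the function names are illustrative only and are not part of the 2MASS pipeline.

```python
def coadd_width_arcmin(width_pix=512, pixel_scale_arcsec=1.0):
    """Cross-scan width of one coadd, in arcminutes."""
    return width_pix * pixel_scale_arcsec / 60.0


def coadds_per_scan(scan_length_deg=6.0, coadd_length_pix=1024,
                    pixel_scale_arcsec=1.0, overlap_fraction=0.10):
    """Approximate number of overlapping coadds needed to tile one scan.

    Assumption: the long (1024-pixel) axis is the in-scan axis, and each
    coadd advances by (1 - overlap_fraction) of its own length.
    """
    scan_arcsec = scan_length_deg * 3600.0
    coadd_arcsec = coadd_length_pix * pixel_scale_arcsec
    step_arcsec = coadd_arcsec * (1.0 - overlap_fraction)
    # The first coadd covers its full length; each later one adds one step.
    return 1.0 + (scan_arcsec - coadd_arcsec) / step_arcsec


print(f"coadd width : {coadd_width_arcmin():.2f} arcmin")
print(f"coadds/scan : {coadds_per_scan():.1f}")
```

Under these assumptions the width comes out close to the 8.5’ tile width and the count close to the 23 coadds per scan stated in the text.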

2.2. Pipeline Reductions Overview

High-level data reductions include linearity correction, dark-frame subtraction and pixel-to-pixel gain correction (i.e., flat-field correction), both of which are employed in a non-standard fashion to accommodate the data set unique to the 2MASS survey (see 2MAPPS Functional Design Document 1996; Beichman et al. 1998). Further pipeline reductions include frame-to-frame offset determination, simple background subtraction, source detection, atmospheric ‘seeing’ and point spread function (PSF) characterization, stellar photometry, band merging, artifact removal, accurate position reconstruction, and photometric calibration. The source detection step (see below) is vital to both point source processing and extended source processing. The extended source processing occurs at the end of the 2MASS data reduction pipeline. The main objective of the 2MASS extended source processor (referred to as GALWORKS) is to parameterize source detections and determine which sources are "extended", or resolved, with respect to the PSF. Consequently, one of the many vital operations for successful star-galaxy discrimination is the accurate measurement of the PSF.

2.3. Source Detection

The primary 2MASS source detection procedure is designed to locate both point sources (e.g., stars) and extended sources (e.g., galaxies). The detection thresholds are chosen to ensure complete detection of galaxies brighter than the level-1 specification, K~13.5, J~15, over a wide range in surface brightness. For fainter low surface brightness galaxies the completeness steadily falls off with flux, hence a separate detection step is carried out to find these objects.

The detection algorithm is closely modeled after the DAOPHOT FIND algorithm (Stetson 1991), which was devised to find stars over a wide range of stellar number density. Each coadded image is convolved with a 4 arcsec FWHM Gaussian over a 13-pixel sub-array averaged to zero. The resulting zero-sum filtered image is thresholded at ~3 times the estimated noise level for the initial coadded image, with detections corresponding to each central maximum within a thresholded region. A rough position and flux are estimated from the corrected (convolved image) centroid. The detection list is then fed to a PSF characterization task (see section 2.4) and finally to a PSF profile-fitting photometry processor, where positions and integrated fluxes are refined. The detection thresholds (3σ) correspond to J~16 for point sources; thus for average surface brightness galaxies (total flux versus integrated flux in an R~3" central region), the thresholds are about 0.5 mag brighter, well beyond the extended source requirements. Extended sources are identified from this inclusive detection source list (see Appendix).
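The filtering and thresholding steps can be sketched in a few lines of plain Python. This is a toy illustration under stated assumptions, not the 2MASS or DAOPHOT implementation: the function names are hypothetical, the coadd scale is taken as 1" per pixel (so 4" FWHM is 4 pixels), the kernel is left unnormalized, and the threshold is applied directly to the filtered image because the text does not say how the noise estimate is propagated through the filter.

```python
import math


def zero_sum_gaussian_kernel(fwhm_pix=4.0, size=13):
    """Gaussian sampled on a size x size grid, with its mean subtracted
    so that the kernel sums to zero (flat backgrounds filter to ~0)."""
    sigma = fwhm_pix / (2.0 * math.sqrt(2.0 * math.log(2.0)))
    half = size // 2
    k = [[math.exp(-((x - half) ** 2 + (y - half) ** 2) / (2.0 * sigma ** 2))
          for x in range(size)] for y in range(size)]
    mean = sum(map(sum, k)) / (size * size)
    return [[v - mean for v in row] for row in k]


def filter_image(image, kernel):
    """Correlate image with kernel. Pixels too close to the border to
    hold the full kernel footprint are left at 0."""
    ny, nx = len(image), len(image[0])
    size = len(kernel)
    half = size // 2
    out = [[0.0] * nx for _ in range(ny)]
    for y in range(half, ny - half):
        for x in range(half, nx - half):
            out[y][x] = sum(kernel[j][i] * image[y + j - half][x + i - half]
                            for j in range(size) for i in range(size))
    return out


def detect(image, noise, nsigma=3.0):
    """Return (row, col) of every strict local maximum of the filtered
    image that exceeds nsigma * noise (simplified threshold, see lead-in)."""
    filt = filter_image(image, zero_sum_gaussian_kernel())
    threshold = nsigma * noise
    peaks = []
    for y in range(1, len(filt) - 1):
        for x in range(1, len(filt[0]) - 1):
            v = filt[y][x]
            if v <= threshold:
                continue
            neighbours = [filt[y + dy][x + dx]
                          for dy in (-1, 0, 1) for dx in (-1, 0, 1)
                          if (dy, dx) != (0, 0)]
            if v > max(neighbours):
                peaks.append((y, x))
    return peaks


# Toy image: flat background plus one Gaussian "star" at row 15, col 17.
image = [[100.0 + 50.0 * math.exp(-((x - 17) ** 2 + (y - 15) ** 2) / (2 * 1.7 ** 2))
          for x in range(31)] for y in range(31)]
print(detect(image, noise=1.0))
```

Because the kernel sums to zero, the flat background contributes essentially nothing to the filtered image, and only the compact source survives the threshold.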

2.4. A Generalized Point Spread Function

The first order of business in distinguishing extended sources, including galaxies and Galactic nebulae, from point sources (stars) is to accurately characterize the point spread function (PSF). The unique shape of the PSF derives from a combination of factors: the optics, the large 2" pixels (frame images), the dithering pattern of the six samples that comprise the coadd, the focus, the sampling/convolution algorithm used to generate the coadds, and the atmospheric seeing. As such, the 2MASS PSF corresponding to frame-coadded images is not well fit by a Gaussian function. Fortunately, it is adequately characterized by a generalized exponential function (see below) out to a radius of ~2 × FWHM, which is all that is required for star-galaxy discrimination.

The 2MASS PSF typically varies on time scales of ~minutes due to two effects: atmospheric "seeing" and variable telescope focus (thermally driven). The 2MASS telescopes are designed to be mostly free of afocal PSFs (under most conditions), but experience has shown that 2MASS images can be slightly out of focus during periods of rapid change in the air temperature, conditions that generally occur only during the hottest summer months. Out-of-focus images have the undesirable property of possessing elongated PSFs. Fortunately, under typical observing conditions for the survey, the PSFs are symmetrically round throughout the focal plane. That leaves the atmospheric seeing as the primary driver of the radial size of the PSF. Given the long exposure time per sample (1.3 sec) and the six-sample coaddition (with optimal dithering to produce round PSFs), seeing changes result in a symmetric ‘puffing’ in and out of the resultant coadd PSF. We can represent the image PSF with the generalized radially symmetric exponential of the form:

$$ f(r) = f_0 \exp\left[ -\left( \frac{r}{\alpha} \right)^{1/\beta} \right] $$

where f₀ is the central surface brightness, r is the radius in arcsec, and α and β are free parameters. This versatile function not only describes the 2MASS PSF, but is also used to characterize the radial profiles of galaxies, from disk-dominated spirals (β close to unity) to ellipsoidal galaxies (β ~ 4, de Vaucouleurs law). The scale length, α, and the modifier, β, are generally quite correlated, so we combine them into a single "shape parameter," α × β. This parameter is a powerful discriminant: galaxies tend to have larger values of both α and β than stars; thus, the product of the exponential fitting parameters amplifies the difference between point sources and extended sources.
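As a minimal sketch, the profile and the shape parameter can be evaluated as follows. The code assumes the functional form f₀ exp[−(r/scale)^(1/modifier)], which reduces to a pure exponential when the modifier is 1 and to an r^(1/4) law when it is 4, as the text describes. In the code the scale length and modifier are named `alpha` and `beta`; the function names and the numerical values are illustrative only, not measured 2MASS values.

```python
import math


def generalized_exponential(r, f0, alpha, beta):
    """Surface brightness at radius r (arcsec).

    Assumed form: f0 * exp(-(r / alpha) ** (1 / beta)), where alpha is
    the scale length (arcsec) and beta is the modifier.
    """
    return f0 * math.exp(-((r / alpha) ** (1.0 / beta)))


def shape_parameter(alpha, beta):
    """Combined 'shape' used for star-galaxy separation: alpha * beta."""
    return alpha * beta


# Hypothetical fit results, chosen only to illustrate the trend.
star = {"alpha": 1.0, "beta": 1.0}
galaxy = {"alpha": 2.5, "beta": 2.0}

for name, fit in (("star", star), ("galaxy", galaxy)):
    profile = [generalized_exponential(r, 1.0, **fit) for r in (0.0, 2.0, 4.0)]
    print(name, shape_parameter(**fit), [round(v, 3) for v in profile])
```

Multiplying the two parameters widens the gap between the two classes whenever both are larger for galaxies, which is the amplification the text refers to.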

Our ability to track the seeing on short time scales depends on the density of stars. The more stars available to measure a statistically meaningful value of the "shape," the higher the frequency of seeing changes that can be tracked. The stars in question must be isolated sources, free of contamination from neighboring stars and fainter background stars. A reasonable shape value can be derived from a minimum of about 10 stars. Consequently, for low stellar density regions (say the north Galactic pole, with ~300 stars per deg² brighter than 14th mag at K), the seeing is tracked on time scales of about 20 seconds, and for high density regions (>10⁴ stars per deg²), it is tracked on time scales of a few seconds. Experience has shown that the seeing can indeed change significantly on time scales as short as a few seconds (see below).
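A rough order-of-magnitude check of these tracking time scales is sketched below. It assumes a scan width of 8.5’ and a scan rate of about 1° per minute (inferred from a 6° scan taking roughly 6 minutes), and it counts every star as usable. Both the rate and the function name are assumptions made for illustration; real fields yield fewer usable stars because of the isolation requirement, so the results are lower bounds.

```python
def seconds_to_collect(n_stars, density_per_deg2,
                       scan_width_arcmin=8.5, scan_rate_deg_per_min=1.0):
    """Time needed to sweep up n_stars at a given surface density.

    Assumptions: constant scan rate and every star usable for the
    shape measurement (no isolation or brightness cuts).
    """
    width_deg = scan_width_arcmin / 60.0
    rate_deg_per_sec = scan_rate_deg_per_min / 60.0
    area_rate = width_deg * rate_deg_per_sec  # deg^2 swept per second
    return n_stars / (density_per_deg2 * area_rate)


for density in (300.0, 1.0e4):
    t = seconds_to_collect(10, density)
    print(f"{density:8.0f} stars/deg^2 -> {t:5.1f} s for 10 stars")
```

With these inputs the low-density case gives roughly 14 seconds, the same order as the ~20 seconds quoted above. The high-density case gives less than a second, so the few-second figure quoted for crowded fields presumably reflects the smaller fraction of truly isolated stars there.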

The mean "shape" is determined from an ensemble of isolated stars spatially clustered along the 2MASS in-scan direction. The sample population must be free of extended sources (galaxies) and double stars to be a meaningful measure of the PSF. We perform a robust separation of isolated stars from the larger population (of the spatially correlated sample) with an iterative selection method that is bootstrapped from the lower quartile of the total population histogram. Since isolated stars have an inherently smaller "shape" value than extended sources (or double stars), the lower quartile (25%) is dominated by isolated stars and, conversely, the upper quartile by galaxies. Hence, the lower quartile of the distribution serves as a good first guess at the actual mean shape value of isolated stars. The idea is to exclude truly extended sources from the stellar shape determination. Once the lower quartile is identified, we iteratively search a restricted range in the histogram to arrive at a stable and robust estimate of the true mean shape value for isolated stars. The initial restricted range corresponds to −3σ to +2σ about the lower quartile, where σ is the scatter in the "shape" value. In the first iteration we use an a priori determination of σ. For each iteration thereafter, we set hard limits of ±2σ. The final "shape" value corresponds to the median (50th percentile) of the restricted histogram sample, and the final σ to the rms scatter, or standard deviation, of that sample.
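The iterative selection can be sketched as follows. This is one reading of the procedure rather than the pipeline code: the function name, the iteration cap, the stopping rule, and the choice to re-estimate the scatter from the restricted sample at each pass are assumptions, and the input values are synthetic.

```python
import statistics


def stellar_shape(shapes, sigma_prior, max_iter=10):
    """Estimate the stellar 'shape' ridge value from a mixed sample.

    1. Seed with the lower quartile of all shape values.
    2. Keep values within -3*sigma_prior .. +2*sigma_prior of the seed.
    3. Iterate: median and rms of the kept sample, then re-select all
       values within +/- 2 sigma of that median, until the selection
       stops changing (assumed stopping rule).
    Returns (median, rms scatter, number of values kept).
    """
    q1 = statistics.quantiles(shapes, n=4)[0]
    lo, hi = q1 - 3.0 * sigma_prior, q1 + 2.0 * sigma_prior
    sample = [s for s in shapes if lo <= s <= hi]
    if len(sample) < 3:
        raise ValueError("too few sources near the lower quartile")
    for _ in range(max_iter):
        center = statistics.median(sample)
        sigma = statistics.pstdev(sample)
        new_sample = [s for s in shapes if abs(s - center) <= 2.0 * sigma]
        if len(new_sample) < 3 or new_sample == sample:
            break
        sample = new_sample
    return statistics.median(sample), statistics.pstdev(sample), len(sample)


# Synthetic example: 33 "stars" near shape 1.0 plus 5 "galaxies".
stars = [1.0 + 0.01 * k for k in range(-5, 6)] * 3
galaxies = [1.4, 1.6, 1.8, 2.0, 2.5]
center, sigma, n_used = stellar_shape(stars + galaxies, sigma_prior=0.05)
print(f"ridge = {center:.3f}, sigma = {sigma:.3f}, stars used = {n_used}")
```

In this toy case the galaxies fall outside the first restricted range, so they never enter the median or the scatter. Repeating the estimate in successive groups of stars along the scan would give one ridge value per scan position.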

In this way we build a "stellar ridgeline" of shape values as a function of scan position (or simply, scan coordinate). Two very different examples are illustrated in Figures 1 and 2. The plots show the median "shape" values (large triangles) along the scan. Extracted sources (including stars and galaxies) are denoted with small points. The corresponding FWHM of the PSF (fit with a Gaussian) is also shown to give some idea of the scale in arcseconds. In Figure 1 we show the resultant ridgeline for a scan passing through the Hercules cluster of galaxies. The stellar number density is not large (the Galactic latitude of Hercules is about 30°), but there are still plenty of isolated stars easily separated from extended sources (galaxies are located above the mean "shape" ridge). The seeing is fairly stable in each band throughout the 6° scan (Δtime ~ 6 minutes). The same cannot be said for the second case, Figure 2, which demonstrates both poor seeing conditions and very rapid changes in the seeing. Fortunately, the stellar density is rather high in this field, 4000 stars per deg², and the rapid seeing excursions are, for the most part, sufficiently tracked. Scans for which the seeing is poorly tracked or the absolute value of the mean scan seeing is greater than 1.3" (PSF FWHM > 4") are considered low quality data and are in most cases rescheduled for re-observation. The stellar ridgelines are used in the extended source processing to separate point sources from ‘resolved’ sources.