V. Catalog Generation

3. Source Selection

Flux Overestimation vs. Signal-to-Noise Ratio

i. Single Band Thresholding

A simulation using purely Gaussian statistics was run to derive the approximate magnitude of the flux overestimation bias as a function of the photometric uncertainty, σ. A total of 11,000 sources were generated from a population with a log N / log S slope of -1.5 and a cutoff at a true signal-to-noise ratio of SNR_true = 0.5. This should provide accurate results for observed SNR > ~3.5. Because the noise was assumed constant in this simulation, SNR was used as a direct proxy for flux. (Multiplying all SNR values by σ gives the flux in flux units, rather than SNR units.)
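As an illustration only, a simulation of this kind might be sketched as follows. The sketch assumes that the slope of -1.5 refers to the cumulative counts N(>S), so that true SNR values can be drawn by inverse-transform sampling, and that the noise is Gaussian with unit variance in SNR units. The function name `simulate_sources` and the seed are hypothetical and are not taken from the original analysis.

```python
import random
import statistics

def simulate_sources(n=11000, slope=1.5, snr_min=0.5, seed=1):
    """Return (true_snr, observed_snr) lists for n simulated sources.

    Assumption: cumulative counts N(>S) scale as S**(-slope), cut off
    below snr_min. Noise is Gaussian with sigma = 1 in SNR units.
    """
    rng = random.Random(seed)
    true_snr, observed_snr = [], []
    for _ in range(n):
        u = 1.0 - rng.random()              # uniform in (0, 1], avoids u == 0
        s = snr_min * u ** (-1.0 / slope)   # inverse-transform power-law draw
        true_snr.append(s)
        observed_snr.append(s + rng.gauss(0.0, 1.0))
    return true_snr, observed_snr

true_snr, observed_snr = simulate_sources()
print("median true SNR:", round(statistics.median(true_snr), 2))
print("sources observed above SNR = 7:", sum(o > 7 for o in observed_snr))
```

Under these assumptions the median true SNR is 0.5 × 2^(1/1.5) ≈ 0.79, and the expected number of sources with true SNR above 7 is 11,000 × 14^-1.5 ≈ 210; any single realization scatters around those values.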

The measured flux vs. the true flux are shown for all sources in Figure 1 and for sources with SNR_true between 2 and 10 in Figure 2. These figures show:

the expected Gaussian distribution, when cut at constant values of true flux;
a distinctly non-Gaussian distribution, when cut at constant values of measured flux; and
a significant number of sources observed up to SNR = 7 that actually have much lower true flux.

Figure 1 Figure 2

A histogram of the ratio of observed flux to true flux in Figure 3 shows a clear asymmetry, even in the SNR 6--7 bin, with 22% of the sources having a flux ratio above 1.3 vs. none of the sources having a flux ratio below 0.7. Even if all the sources had the highest theoretical flux uncertainties at SNR=6, the lower edge of that bin, only 3.6% of the sources should be in either tail.
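The 3.6% figure can be checked with a short calculation. This is a sketch assuming purely Gaussian errors, with a 1σ fractional uncertainty of 1/6 at SNR = 6; the helper name `gaussian_tail` is illustrative.

```python
import math

def gaussian_tail(n_sigma):
    """One-sided probability of a Gaussian deviate exceeding n_sigma."""
    return 0.5 * math.erfc(n_sigma / math.sqrt(2.0))

fractional_error = 1.0 / 6.0        # 1 sigma as a fraction of the flux at SNR = 6
n_sigma = 0.3 / fractional_error    # a 30% deviation expressed in sigma
print(round(n_sigma, 2), round(100 * gaussian_tail(n_sigma), 1))
```

A 30% deviation at SNR = 6 corresponds to 1.8σ, and the one-sided Gaussian tail beyond 1.8σ contains about 3.6% of the sources.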

Taking the ratio of the median measured flux to the median true flux in flux bins, and subtracting 1, gives the median flux overestimation as a function of SNR, shown in Figure 4. A clear, large mean flux overestimation of well over 5% exists below SNR = 7. Furthermore, a smooth fit to the data shows that the flux overestimation in the SNR 7--8 bin is around 5%.

Another way to look at the flux overestimation is to compute the mean flux overestimation, as a fraction of the quoted error, shown in Figure 5. The mean flux overestimation is a full 50% of the quoted flux error for SNR = 6--7, making the quoted flux errors a seriously deficient measure of the true flux accuracy.
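The binned median overestimation might be computed as in the sketch below. It is not the original analysis code: it assumes the same cumulative power-law population and unit Gaussian noise as the simulation described above, and it uses a larger sample (100,000 sources rather than 11,000) purely to reduce the scatter in each bin. The function name and seed are hypothetical.

```python
import random
import statistics

def median_overestimation(n=100000, lo=4.0, hi=5.0, slope=1.5,
                          snr_min=0.5, seed=2):
    """Median(observed) / median(true) - 1 for sources observed in [lo, hi)."""
    rng = random.Random(seed)
    obs_in_bin, true_in_bin = [], []
    for _ in range(n):
        s = snr_min * (1.0 - rng.random()) ** (-1.0 / slope)  # true SNR
        o = s + rng.gauss(0.0, 1.0)                           # observed SNR
        if lo <= o < hi:
            obs_in_bin.append(o)
            true_in_bin.append(s)
    return statistics.median(obs_in_bin) / statistics.median(true_in_bin) - 1.0

for lo in (4, 6, 8, 10):
    print(lo, lo + 1, round(median_overestimation(lo=lo, hi=lo + 1), 3))
```

The overestimation should fall steadily as the observed SNR bin increases, since fewer of the sources in a bright bin are faint sources boosted by noise.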

The derived log dN / log S in Figure 6 shows the expected excess of sources, starting at SNR ~ 6 (see Figure 7).

This simulation gives minimum values for the flux overestimation. All the sources of non-Gaussian noise only increase the actual flux overestimation.

Figure 3

Figure 4 Figure 5

Figure 6 Figure 7

ii. Single Band vs. Multiband Thresholding

The above simulation was used to produce results for sources with the colors of galaxies. It was assumed that every single source had the following colors: J-H = 0.7 and H-K_s = 0.4. This is the "best case" for adding extra sources to the Catalog by a multiband rule. Normalizing to J, this translates into a typical SNR ratio of 0.69 for H/J and 0.55 for K_s/J.

The reason this is a "best case" can be seen by considering the other extreme of the bluest stellar colors: J-H = 0.2 and H-K_s = 0.05. Again, normalizing to J, this translates into a typical SNR ratio of 0.43 for H/J and 0.29 for K_s/J. With such low SNR ratios at H and K_s, it is much less likely that one of those bands will exceed any given threshold.

The simulation above was used to create the J population of sources, and then the "observed" H and K_s fluxes for those sources.

Although this may sound like it creates a bias in the simulation, the simulation procedure is actually exactly symmetric between the bands, since all sources have the same colors. For example, one can think of the process to generate a single source as simply generating a point in a "mythical SNR space" not connected to any band, and then scaling that mythical SNR space to the actual SNR of J, H and K_s, separately, using the fixed colors.

The derived log dN / log S, in Figure 8, shows the expected excess of sources starting at SNR ~ 6 in all bands. The number of sources is converging to almost the same level, independent of band at low SNR, since the number of observed SNR = 1 sources is dominated by sources boosted in flux by noise. Note that half of the simulated sources have SNR= 0.50 -- 0.79 at J and lower SNR at H and K_s, resulting in a slight excess of J sources, relative to H and K_s. If the simulation had gone down to SNR of 0.01 at J, the number of sources found at SNR = 1 would have been nearly identical in every band.

Figure 8

Sources were selected for the "catalog" using two rules:

SNR > 7 in a single band, resulting in 215 sources, all with J > 7; and
SNR > 6 in at least two bands. This added 9 sources:
- 1 with J, H, and K_s all > 6;
- 5 with J, H > 6;
- 2 with J, K_s > 6; and
- 1 with H, K_s > 6.

The nine added sources represent an increase of (4 ± 1)% in the number of sources in the "catalog." The reason for such a small number of sources is that a source with SNR = 7 at J has SNR = 4.8 at H and 3.8 at K_s. A source with SNR = 6 at J has SNR = 4.1 at H and 3.3 at K_s.
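The arithmetic behind these numbers can be reproduced with a few lines. The sketch uses the SNR ratios quoted above for the assumed galaxy colors and takes the uncertainty on the increase to be a simple Poisson (square root of N) error, which is an assumption about how the ± 1% was derived. The names `SNR_RATIO` and `band_snrs` are illustrative.

```python
import math

# SNR ratios relative to J quoted in the text for the assumed galaxy colors.
SNR_RATIO = {"J": 1.0, "H": 0.69, "Ks": 0.55}

def band_snrs(j_snr):
    """Scale a J-band SNR to the expected SNR in each band."""
    return {band: j_snr * ratio for band, ratio in SNR_RATIO.items()}

for j_snr in (7.0, 6.0):
    print(j_snr, {b: f"{v:.2f}" for b, v in band_snrs(j_snr).items()})

# Fractional increase from the multiband rule, with an assumed Poisson error.
added, single_band = 9, 215
increase = added / single_band
error = math.sqrt(added) / single_band
print(f"{100 * increase:.0f}% +/- {100 * error:.0f}%")
```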

One can immediately see the source of the flux overestimation problem, detailed below at H and K_s, if a multiband threshold is used. If one uses only a single band threshold, sources are selected primarily, if not entirely, at J, and the H and K_s measurements are simply "carried along" and are unbiased. However, the additional sources selected from the multiband threshold have a serious flux overestimation problem. Only sources which have fluxes boosted by noise above SNR = 6, from their true fluxes of 4--5 at H and 3--4 at K_s, pass this multiband threshold.

Furthermore, note that the amount of flux overestimation using a multiband threshold depends on the intrinsic flux of sources at those bands, relative to a single band threshold. For example, for sources with highest SNR at J, the J threshold implies H fluxes of SNR = 4--5. Imposing a lower threshold at H (which is what is effectively done by the multiband threshold), the flux overestimation must be ~ 6 / (4--5), or 20--50%. Most of the sources will be at the lower threshold of 20%, since it is harder for noise to boost a source from 4σ to 6σ than from 5σ to 6σ. In the same way at K_s, the flux overestimation must be ~ 6 / (3--4), or 50--100%, with most of the sources at 50%.
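The ranges quoted here follow directly from the ratio of the threshold to the true SNR. A minimal sketch, with a hypothetical helper name:

```python
def minimum_overestimation(threshold_snr, true_snr):
    """Smallest fractional flux excess needed for a source to reach the threshold."""
    return threshold_snr / true_snr - 1.0

# True SNR ranges implied at H and K_s by a J-selected source (from the text).
for band, true_range in (("H", (5.0, 4.0)), ("Ks", (4.0, 3.0))):
    low, high = (minimum_overestimation(6.0, s) for s in true_range)
    print(f"{band}: {100 * low:.0f}% to {100 * high:.0f}%")
```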

Figure 9 shows the J flux bias vs. J true flux, Figure 10 shows the H flux bias vs. H, and Figure 11 shows the K_s flux bias vs. K_s, where the flux bias is defined as the ratio of the observed flux to the true flux. In these figures, the sources selected from the multi-band rule are shown separately, with only the H thresholded sources shown on the H plot, and similarly for K_s. (In other words, the J diagram shows in a separate color only the eight additional sources which passed the SNR = 6 threshold at J, the H diagram shows the seven additional sources which passed the SNR = 6 threshold at H, and the K_s diagram shows the four additional sources above SNR = 6 at K_s.) The figures demonstrate:

the expected J flux bias vs. J true flux, due to thresholding (see below);
the "filling in" of sources "missing" from the catalog, due to the single band thresholding alone;
the expected non-bias in the fluxes in bands that were "carried along" by thresholding in another band; and,
the expected severe bias in the fluxes in bands subjected to multi-band thresholding involving a given band.

Figure 9 Figure 10 Figure 11

The thresholding J bias is simple to understand. At J_true = 7, only sources with positive noise excursions are allowed into the catalog by the single band threshold. Hence, there must be a ~1σ high bias in observed fluxes at whatever threshold is picked for the catalog. At about 2σ above the threshold, or SNR ~ 9, this thresholding bias disappears. Below the threshold, the bias gets more severe. The observed flux must be ~50% high at SNR_true ~ 7/1.5 = 4.7, and a factor of two high at SNR_true = 7/2 = 3.5, as observed.
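One way to make this argument quantitative is to compute the mean of a Gaussian truncated at the threshold. This is a sketch under the assumption of unit-variance Gaussian noise in SNR units; it is offered as an illustration, not as the calculation used for the figures, and the function name is hypothetical.

```python
import math

def mean_observed_above_threshold(true_snr, threshold):
    """Mean observed SNR of sources with the given true SNR that pass the cut.

    Uses the mean of a unit-variance Gaussian truncated below at `threshold`.
    """
    a = threshold - true_snr
    pdf = math.exp(-0.5 * a * a) / math.sqrt(2.0 * math.pi)
    survival = 0.5 * math.erfc(a / math.sqrt(2.0))   # P(noise > a)
    return true_snr + pdf / survival

for true_snr in (9.0, 7.0, 4.7, 3.5):
    observed = mean_observed_above_threshold(true_snr, 7.0)
    print(true_snr, round(observed - true_snr, 2), round(observed / true_snr, 2))
```

Under these assumptions the mean excess is about 0.8σ for sources exactly at the threshold and a few hundredths of σ at SNR_true = 9, while the ratio of observed to true flux is roughly 1.5 at SNR_true = 4.7 and roughly 2 at SNR_true = 3.5.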

This bias is well known (see below for further discussion). If the noise distribution in a survey is understood, this bias can be statistically corrected. Furthermore, this bias is negligible above SNR = 10, so those sources can be used with confidence.

The single "outlier" point in all three figures above is a source with true fluxes of (3.1, 2.2 and 1.7σ) at (J, H and K_s), and observed fluxes of (7.2, 0.8 and 2.0σ). It is actually only an "outlier" at J, having a 4.1σ fluctuation upward. At H and K_s, the fluctuations are -1.4σ and +0.3σ. It looks like an outlier in the H and K_s diagrams only because the source is much weaker than the others, and hence its flux ratios at H and K_s have large uncertainties.

The "flip side" of the flux bias is, of course, the "missing" sources which were observed to fall below the threshold. Anything that is done to put fainter sources into the catalog will partially fill in some of the missing sources, such as observed here.

For sources selected by the single band rule, which essentially means a J selection for all sources outside highly-extincted areas, both H and K_s show an unbiased flux distribution down to the lowest fluxes for which there is a large number of sources, SNR ~ 4 at H and ~3 at K_s. (Recall that below those levels the uncertainty grows rapidly, and hence there is no real constraint on the flux bias below those levels in this simulation.)

It is another story altogether for sources selected by the multiband rule. The mean flux bias at H is almost 20%, and at K_s is almost 50%, just as expected from the theoretical analysis above.

A multiband rule was not used for catalog source selection, because:

A multiband rule creates a flux bias at H and K_s, where none existed using the single band rule.
The flux bias of the added sources is severe: 20 - 50% in this example, and higher for sources with bluer colors.
Having a multiband rule complicates the use of the catalog and makes it harder to derive corrections for any biases that result from what was done.

Since only 4% more sources were added as a result of the multiband rule, the extra completeness is not worth the additional biases in the catalog. Note the statement above that "sources above SNR = 10 can be used with confidence." A corollary is that the "carry-along" bands can be used with confidence only as long as a multiband rule is not used.

iii. Catalog Selection By SNR, Not Flux Limits

The natural unit of any survey, such as 2MASS, is the noise (justified below). Magnitude or flux limits are fundamental measures for a source. But it is counterproductive, and often dangerous, if one uses a flux limit to select sources for entry into a catalog.

The definition of what the noise actually is in any survey must, of course, be carefully considered. However, for the moment, assume that one understands the noise in a given survey, and that its value is readily calculable.

The basic reason that catalog selection should be by SNR is that the value of any experimental data depends strongly on its SNR. It is generally accepted that for any individual source, nothing less than a 95 or 99% confidence limit should be used to quote a meaningful number, corresponding to 2--3 Gaussian σ. It is also generally accepted that for any survey containing lots of sources, no source that does not have at least one measurement with flux above 5--6σ should enter a catalog of results from that survey.

The first reason behind the higher threshold for a survey is the reliability of the catalog based on the survey. The reliability depends directly on the number of volume elements over which one searches for sources in a survey. For example, if the survey makes independent estimates for every 4´´ × 4´´ area of sky, there are 3.3×10^10 volume elements in the sky. Even under Gaussian statistics, rarely attained in any survey, a +5σ fluctuation happens with a probability of 3×10^-7 for each volume element, and hence there would be 10,000 false sources in a catalog containing sources down to 5σ.
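The estimate of 10,000 false sources can be reproduced as follows. The sketch assumes the whole sky is covered and that every 4´´ × 4´´ cell is an independent Gaussian trial; the variable names are illustrative.

```python
import math

ARCSEC2_PER_DEG2 = 3600.0 ** 2
sky_deg2 = 4.0 * math.pi * (180.0 / math.pi) ** 2      # whole sky, about 41,253 deg^2
elements = sky_deg2 * ARCSEC2_PER_DEG2 / (4.0 * 4.0)   # number of 4" x 4" cells

p_5sigma = 0.5 * math.erfc(5.0 / math.sqrt(2.0))       # one-sided Gaussian tail
false_sources = elements * p_5sigma

print(f"{elements:.2e} elements, P(>5 sigma) = {p_5sigma:.1e}, "
      f"{false_sources:.0f} expected false sources")
```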

In practice, there is a significant non-Gaussian tail in nearly all surveys, which could easily produce 10--100 times more false sources above 5σ, which would be 100,000 to 1 million false sources.

There are at least three other reasons why catalogs should not use a threshold below 5--6σ for a source to enter the catalog:

The actual accuracy of the photometric measurement is much lower than is commonly associated with a "5σ measurement";
there is a severe flux overestimation problem; and
the quoted photometric error does not well describe the actual error distribution, which is severely non-Gaussian for fainter sources.

The basic problems here derive from the usual astronomical case that the number of sources increases rapidly as flux decreases.

Catalogs can and should contain fluxes that are carried along from other bands that are well below 5--6σ. Those fluxes are largely unbiased, compared to the bands that have passed a threshold to enter the catalog. It is the act of thresholding that produces the bias, so non-thresholded bands will not have the bias discussed here.

Consider all sources reported to be at a given flux of nσ in a survey. They will consist of sources whose true flux is nσ and whose realized measurement error is zero, of sources whose true flux is (n+m)σ and whose realized measurement error is -mσ, and of sources whose true flux is (n-m)σ and whose realized measurement error is +mσ. When n is large, this population smearing matters little, since the ratio of the true fluxes of the sources with realized measurement errors of -mσ and +mσ is (n+m) / (n-m) ~ 1 + 2m/n. Taking m=1, the ratio is 1 + 2/n, which is 1.2 for n=10 (the non-approximated value is 1.22). Note that the ratio approaches 1 as n increases, but becomes quite large as n decreases.

In nearly all cases, the number of sources varies with flux as flux to the -1 or -1.5 power. Hence, the ratio of the number of true sources at (n-m)σ to the number at (n+m)σ is {(n+m)/(n-m)}^{(1 to 1.5)}. For n=10 and m=1, this ratio is 1.22--1.35. For m=2, the ratio is 1.5--1.8.

Hence, even at 10σ, the error distribution is not symmetric: there are 20--30% more sources whose true flux is 9σ than those whose true flux is 11σ, and 50--80% more sources whose true flux is 8σ than those whose true flux is 12σ. Note that as n increases, this effect vanishes, and hence can be largely ignored for a catalog which contains only sources brighter than SNR = 10.

However, as one goes below 10σ, note what happens to that ratio of the true fluxes of these sources. At 5σ, the ratio is 6/4 -- 7/3 = 1.5 to 2.3, for m=1 to 2. Hence, the observed 5σ sources come from sources whose true flux varies typically by 50%, compared to the variation of 20% or less for the sources with SNR above 10.
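The ratios quoted in this argument can be tabulated with a short script. It is a sketch of the arithmetic only, with hypothetical function names, and it assumes the source counts scale as a pure power law of the flux with index 1 to 1.5.

```python
def true_flux_ratio(n, m):
    """Ratio (n+m)/(n-m) of the true fluxes feeding an observed n-sigma source."""
    return (n + m) / (n - m)

def number_ratio(n, m, index):
    """Relative number of true sources at n-m versus n+m for counts ~ flux**(-index)."""
    return true_flux_ratio(n, m) ** index

# Columns: n, m, flux ratio, number ratio for index 1, number ratio for index 1.5
for n in (10, 5):
    for m in (1, 2):
        print(n, m, round(true_flux_ratio(n, m), 2),
              round(number_ratio(n, m, 1.0), 2),
              round(number_ratio(n, m, 1.5), 2))
```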

Worse, in this set of sources whose flux is claimed to be 5σ, there are a lot more sources whose true flux is actually 3--4σ than whose true flux is actually 6--7σ. The ratio of the number of sources whose true flux is 3σ, compared to the number of sources whose true flux is 5σ, is (5/3)^{1,1.5} = 1.7 -- 2.2. This gives rise to a large number of problems:

There is a severe flux overestimation problem for sources below 5--6σ. A simple model shows that, under one set of assumptions, the set of sources with quoted flux of 5σ actually have true fluxes of ~4.5σ (the actual number depends strongly on a number of factors, so this value should only be taken as showing a significant difference from 5σ). The flux overestimation quickly gets worse at lower SNR. See the IRAS Faint Source Catalog Explanatory Supplement for a quantitative discussion of the flux overestimation problem.
Note that the quoted photometric error does not well describe the actual error distribution. The quoted photometric error usually derives solely from the measurement error, which would be 20% at 5σ (1σ divided by 5σ). The actual photometric 1σ error is larger, because the tail to low actual flux values has been significantly increased.
Furthermore, the distribution of actual photometric errors is quite different from a Gaussian, even if the measurement error is Gaussian! This is due to the larger number of true sources at fainter flux levels, and would not be true if the number of sources vs. flux was constant.
All of these problems result in a low accuracy of the photometric measurement for these sources, much lower than what is commonly associated with a "5σ measurement."

All in all, this makes a catalog of sources almost begging to be misused by most astronomers, if one includes sources fainter than 5--6σ in all bands. (Again, catalogs should contain measurements down to ~3--4σ in bands other than the band thresholded above 5--6σ.) Worse, because most of the sources in any catalog are the faintest sources in the catalog, such problems quickly adulterate any catalog that contains such faint sources.

This situation for a survey is completely different from the usual case of a measurement for an individual source. If an astronomer already has a source known from another survey, and wishes simply to obtain a flux measurement for that source, a 5σ measurement is quite good. Concerns about flux overestimation play a much smaller role, since the source has already been selected. In other words, a random source from a survey measured at 5σ is likely to be a significantly fainter source, whereas a 5σ measurement for a pre-detected source is likely to be an unbiased estimate of the source's flux. This is why the measurements in bands carried along after thresholding in another band also are valid measurements, not subject to this thresholding bias.

Even a 5σ measurement of a previously known source at a different wavelength can easily be biased upward by ~1σ, if care is not taken in how the flux measurement is done. The bias arises if one attempts to measure the flux by integrating until one gets a 5σ measurement. For example, if the true flux of the source is 5 mJy, it is more likely that, as the measurement converges, a 5σ measurement of 6 mJy is obtained (which requires the uncertainty to be 1.2 mJy) before a 5σ measurement of 5 mJy or 4 mJy. This bias can be avoided by setting the integration time by some means other than achieving exactly a specified signal-to-noise ratio.
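The effect of integrating until a target SNR is reached can be illustrated with a toy simulation. Everything in this sketch is an assumption made for illustration: each sample has Gaussian noise of 5 mJy, the uncertainty of the running mean falls as the square root of the number of samples, and the observer stops at the first moment the running mean reaches 5σ. The size of the resulting bias depends on these choices.

```python
import random

def integrate_until_snr(true_flux=5.0, noise_per_sample=5.0, target_snr=5.0,
                        rng=random, max_samples=100000):
    """Average samples until the running mean reaches target_snr; return that mean."""
    total = 0.0
    for k in range(1, max_samples + 1):
        total += rng.gauss(true_flux, noise_per_sample)
        mean = total / k
        sigma = noise_per_sample / k ** 0.5   # uncertainty of the running mean
        if mean / sigma >= target_snr:
            return mean
    return total / max_samples                # safety net; not expected to be reached

rng = random.Random(3)
results = [integrate_until_snr(rng=rng) for _ in range(2000)]
print("true flux 5.0 mJy, mean reported flux:",
      round(sum(results) / len(results), 2))
```

In this toy model the reported fluxes average above the true 5 mJy, because an upward noise excursion ends the integration early while a downward excursion prolongs it. Fixing the number of samples in advance removes the effect.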

If one selects a 5σ detection in a given band, one can argue that the source is "known," and that the measurements in the other bands are unbiased. However, this is true only if one does not apply a cut-off; if one requires that those measurements also pass a threshold of some sort, they will also be biased. This makes sense if one considers a requirement that a 2-band detection be 5σ in each of two bands. The biases cannot change if one sorts first on the first band and then on the second band, compared to the reverse.

Hence, special care must be taken putting sources into a catalog based on two bands satisfying a lower SNR threshold than source selection based on a single band. These sources are likely to have significantly increased flux overestimation.

The proper treatment of sources with no measurements above 5--6σ is to place them in a "Reject File." Astronomers can use that "Reject File" to obtain quite valid measurements of individual sources. Also, intrepid users who understand statistics and who can carefully evaluate the actual error statistics of a catalog may be able to use the population of sources below 5--6σ and reach meaningful results. However, the history of such analysis is sordid, and hence, making such users state that they analyzed the "Reject File" serves as a warning to casual readers of their papers.

There is also a huge advantage to selecting sources by SNR. If a flux threshold is used to select sources, that flux threshold has to be set high enough, so that these problems do not exist in the least sensitive 2MASS scans. A SNR threshold takes advantage of the fact that many 2MASS scans are more sensitive than the worst scans, and releases a lot more useful sources to the community.
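The difference between the two selection rules can be shown with a toy example. The source list, the noise values, and the function names below are hypothetical; the flux limit is assumed to be set by the noisiest scan, as described above.

```python
def select_by_snr(sources, snr_threshold):
    """Keep sources whose flux / noise meets the SNR threshold."""
    return [s for s in sources if s["flux"] / s["noise"] >= snr_threshold]

def select_by_flux(sources, snr_threshold):
    """Keep sources above a single flux limit set by the noisiest scan."""
    flux_limit = snr_threshold * max(s["noise"] for s in sources)
    return [s for s in sources if s["flux"] >= flux_limit]

# Hypothetical sources measured in scans of different noise (arbitrary units).
sources = [
    {"name": "a", "flux": 8.0, "noise": 1.0},    # sensitive scan, SNR = 8
    {"name": "b", "flux": 8.0, "noise": 1.5},    # noisy scan, SNR = 5.3
    {"name": "c", "flux": 12.0, "noise": 1.5},   # noisy scan, SNR = 8
]
print([s["name"] for s in select_by_snr(sources, 7.0)])
print([s["name"] for s in select_by_flux(sources, 7.0)])
```

The SNR rule keeps source "a", which is well measured in a sensitive scan, whereas the flux limit derived from the noisiest scan discards it.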

How To Select Sources By SNR

The SNR is the flux divided by the noise. The flux of a source is well defined. The flux of a point source is the PSF-fit flux in most cases, and the aperture flux in a few cases. Hence, the only concern is how to calculate the noise.

The prescription is simple: outside of high source density areas, the noise is directly related to the measured background. An extensive analysis of the noise shows that the typical residual between the measured background-removed noise and that predicted solely by the background is 0.02 DN. Hence, a robust estimate of the noise exists outside of high source density areas. In high source density areas, the measured Atlas Image noise can be used, either before or after background subtraction.

Of course, there are additional sources of noise: (1) meteor streaks, which can occur by themselves, on top of faint point sources (making a 1σ true source into something much more), and can cross another meteor streak; (2) electronic noise; (3) airglow noise (both of which can interfere with each other); (4) cosmic ray hits, hot pixels, pixels of varying noise; and, (5) dead pixels that can occur anywhere on or in the background of a source.

None of these noise sources show up in gross measurements of the noise of an entire Atlas Image, because they affect only a few pixels. Hence, these noise sources are more properly considered as a cause of unreliable sources. The distribution of these unreliable sources gives rise to a strong non-Gaussian tail of sources that extends well beyond 5σ, where the σ is calculated from the background, which is the same as the noise calculated from an individual Atlas Image.

The proper way to handle the presence of these noise sources is to empirically study the reliability of the catalog as a function of SNR. It may well be that one has to set the threshold well above 5--6σ, in order to meet the point source reliability requirement. Thus, for point sources, the SNR threshold should be set by the reliability requirement.

The reliability requirement in the 2MASS Level 1 Specifications was placed on sources in the last half-magnitude bin above the SNR = 10 limit. The intent was that the faintest sources in the Catalog would satisfy that reliability requirement. In particular, note that one does not want searches for sources without optical identifications to be overwhelmed by false sources.

Because the "faintest sources in the Catalog" is not an easily quantifiable requirement, it is traditional to place the reliability specification at the SNR = 10 limit. The fundamental reason that the reliability standard must apply to all SNR levels in the Catalog is that little attention is generally paid to the SNR of any given source in a catalog. Putting sources into the Catalog that have significantly lower reliability than the SNR = 10 sources is the fastest route to adulterating the quality of the Catalog. In summary, a SNR threshold is mandatory for an optimal catalog.

[Last update: 1999 January 29, by T. Chester]
