GNU Astronomy Utilities



2.10.4 Contiguous outliers

When source of the outlier covers more than one element, and its flux is close to the noise level, not all of its elements will be clipped: because of noise, some of its elements will remain un-clipped; and thus affecting the output.

Examples of this were created and thoroughly discussed in previous sections with \(\sigma\)-clipping and MAD-clipping (see Sigma clipping and MAD clipping). \(\sigma\)-clipping had good purity (very few non-outliers were clipped) but at the cost of bad completeness (many outliers remained). MAD-clipping was the opposite: it masked many outliers (good completeness), but at the cost of clipping many pixels that shouldn’t have been clipped (bad purity).

Fortunately there is a good way to benefit from the best of both worlds. Recall that in the numbers image of the MAD-clipping output, the wrongly clipped pixels were randomly distributed are barely connected. On the other hand, those that covered the circle were nicely connected, with unclipped pixels scattered within it. Therefore, using their spatial distribution, we can improve the completeness (not have any “holes” within the masked circle) and purity (remove the false clips). This is done through the madclip-maskfilled operator:

  1. MAD-clipping is applied (\(\sigma\)-clipping is also possible, but less effective).
  2. A binary image is created for each input: any outlying pixel of each input is set to 1 (foreground); the rest are set to 0 (background). Mathematical morphology operators are then used in preparation to filling the holes (to close the boundary of the contiguous outlier):
    • For 2D images (were each pixel has 8 neighbors) the foreground pixels are dilated with a “connectivity” of 1 (only the nearest neighbors: 4-connectivity in a 2D image).
    • For 1D arrays (where each element only has two neighbors), the foreground is eroded. This is necessary because the next step (where the holes are filled), two pixels that have been clipped purely due to noise with a large distance between them can wrongly mask a very large range of the input data.
  3. Any background pixel that is fully surrounded by the foreground (or a “hole”) is filled (given a value of 1: becoming a foreground pixel).
  4. One step of 8-connected opening (an erosion, followed by a dilation) is applied to remove (set to background) any non-contiguous foreground pixel of each input.
  5. All the foreground pixels of the binary images are set to NaN in the respective input data (that the binary image corresponds to).

Let’s have a look at the output of this process with the first command below. Note that because madclip-maskfilled returns its main input operands back to the stack, we need to call Arithmetic with --writeall (which will produce a multi-HDU output file). With the second, open the output in DS9:

$ astarithmetic build/in-*.fits 9 4.5 0.01 madclip-maskfilled \
                -g1 --writeall --output=inputs-masked.fits

$ astscript-fits-view inputs-masked.fits

In the small “Cube” window, you will see that 9 HDUs are present in inputs-masked.fits. Click on the “Next” button to see each input. When you get to the last (9th) HDU, you will notice that the circle has been masked there (it is filled with NaN values). Now that all the contiguous outlier(s) of the inputs are masked, we can use more pure stacking operators (like \(\sigma\)-clipping) to remove any strong, single-pixel outlier:

$ astarithmetic build/in-*.fits 9 4.5 0.01 madclip-maskfilled \
                9 5 0.1 sigclip-mean \
                -g1 --writeall --output=stack-good.fits

$ astscript-fits-view stack-good.fits --ds9scale=minmax

You see a clean noisy stack in the first HDU (note that we used the \(\sigma\)-clipped mean here; which was strongly affected by outliers as we saw before in Sigma clipping). When you go to the next HDU, you see that over the circle only 8 images were used and that there are no “holes” there. But the operator that was most affected by outliers was the standard deviation. To test it, repeat the first command above and use sigclip-std instead of sigclip-mean and have a look at the output: again, you don’t see any footprint of the circle.

Of course, if the potential outlier signal can become weaker, there are some solutions: use more inputs if you can (to decrease the noise), or decrease the multiple MAD in the madclip-maskfilled call above: it will decrease your purity, but to some level, this may be fine (depends on your usage of the stack).