The Dark Energy Camera (DECam) Community Pipeline (CP) is an automated, high-performance processing system that applies the best instrumental calibrations as currently understood. During the first year of camera operation, improvements were made as the scientists and engineers learned more about the characteristics of the instrument and explored calibration algorithms. This document describes the various calibrations for version 3 of the CP and its minor updates (V3.x.x). The version may be found under the PLVER keyword of the CP data products.
The challenge for a document of this kind is providing a useful amount of detail without becoming a dense system manual. The goal of this document is to address the ultimate science users -- principal investigators and archival researchers -- with sufficient detail to understand the scientific pedigree of the CP data products. This document is provided in several versions of increasing detail, so one may start with an overview and move to greater detail as desired.
Another challenge for presentation is the organization of the many calibrations. As a pipeline, the calibrations proceed in a relatively linear fashion. So, though some calibrations could be done in different orders with different advantages and disadvantages, this document follows the V3.x.x calibration flow.
In addition to the calibration descriptions the end-user needs to understand the types, nature, and structures of the CP data products. Therefore, there is a section on the data products.
This document, the pipeline, and its data products are still evolving. Comments about this document and the data products (both good and bad) are welcome. Please send mail to the help desk at email@example.com. Other reference documentation, including the NOAO Data Handbook, may be found at ast.noao.edu/data/docs.
A correction for electronic amplifier bias, generally known as "overscan subtraction", is made using overscan readout values. There are two amplifiers per CCD, each with their own overscan values, and they are handled independently. A single bias value is subtracted from the part of the image line corresponding to a particular amplifier. Each line is treated independently; i.e. there is no smoothing of the overscan values across lines.
The overscan subtraction uses the overscan values provided by the data acquisition system. Line-by-line overscan is used because a bias jump occurs due to the simultaneous readout of the smaller ancillary CCDs. A statistical rejection of overscan values is used to eliminate possible transient and cosmic ray effects. After the overscan values have been used the image data is trimmed to remove them.
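The line-by-line overscan subtraction described above can be sketched as follows. This is a minimal illustration, not the CP's actual implementation; the function name, the column slices, and the simple sigma-clip rejection are all assumptions.

```python
import numpy as np

def overscan_subtract(raw, data_cols, overscan_cols, sigma=3.0):
    """Line-by-line overscan subtraction for one amplifier (illustrative).

    raw           : 2-D array (lines x columns) from one amplifier
    data_cols     : slice selecting the imaging columns
    overscan_cols : slice selecting the overscan columns
    Returns the trimmed, bias-subtracted imaging section.
    """
    data = raw[:, data_cols].astype(float)
    over = raw[:, overscan_cols].astype(float)
    for i in range(over.shape[0]):
        line = over[i]
        # Statistically reject outlying overscan values (transients, cosmic rays).
        med, std = np.median(line), np.std(line)
        keep = np.abs(line - med) <= sigma * std
        bias = line[keep].mean() if keep.any() else med
        # Each line is treated independently: no smoothing across lines.
        data[i] -= bias
    # The overscan region is trimmed away; only imaging columns are returned.
    return data
```

The overscan columns never appear in the returned array, mirroring the trimming performed after the overscan values have been used.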
Crosstalk is an effect where the signal level in a pixel being read by one amplifier affects the signal level in another amplifier. The effect can either increase or decrease the measured/recorded value from the other amplifier. The CP computes a correction for every pixel in an amplifier caused by every other amplifier, though it is implemented so that if there is no effect between a pair of amplifiers no correction is computed. The correction is essentially an empirical model that has been determined by the instrument scientists and engineers. The model provides a linear correction up to a threshold and then a non-linear correction up to the limit of the digital converter.
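The piecewise linear/non-linear correction might be sketched as below. This is a guess at the functional form only: the actual CP model is an empirical fit by the instrument team, and the quadratic high-signal term, function name, and coefficients here are assumptions.

```python
import numpy as np

def crosstalk_correction(victim, source, c_linear, threshold, c_nonlinear=0.0):
    """Subtract the crosstalk imprint of `source` from `victim` (illustrative).

    Linear coefficient below `threshold`; an extra quadratic term
    (assumed form) models the non-linear regime above it, up to the
    limit of the digital converter.
    """
    src = source.astype(float)
    corr = c_linear * src
    high = src > threshold
    excess = src[high] - threshold
    corr[high] += c_nonlinear * excess**2  # non-linear regime (assumed quadratic)
    return victim - corr
```

If the coefficient pair for two amplifiers is zero, the correction is identically zero, matching the CP's shortcut of skipping unaffected amplifier pairs.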
Saturation is essentially the point where the accumulated charge in the CCDs can no longer be properly calibrated. In the CP this is defined to be pixels which exceed a threshold determined by the instrument scientists. The threshold is different for each amplifier. The CP sets the most current values in the image headers and then adds a saturation bit (bit 2) in the data quality maps for those pixels exceeding that value.
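Flagging saturation in the data quality map amounts to a bitwise OR against a threshold mask, as in the sketch below. We assume here that "bit 2" means the value 2**2 = 4; the function name and the per-amplifier threshold argument are illustrative.

```python
import numpy as np

SATURATE_BIT = 1 << 2  # assuming "bit 2" denotes the value 4

def flag_saturation(image, dqmask, saturation_level):
    """Set the saturation bit for pixels exceeding the amplifier's
    saturation threshold (a per-amplifier value from the instrument
    scientists). Modifies `dqmask` in place."""
    saturated = image >= saturation_level
    dqmask[saturated] |= SATURATE_BIT
    return dqmask
```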
A bad pixel calibration map is defined external to the CP. The identified bad pixels are added to the exposure bad pixel maps. Bad pixels are replaced by linear interpolation from neighboring columns when only a single bad pixel is spanned. The weight of the interpolated pixel is reduced by a factor of 2.
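The single-pixel interpolation and weight reduction can be sketched as follows; the function name and the simple per-pixel loop are illustrative, not the CP implementation.

```python
import numpy as np

def interp_single_bad_columns(image, weight, badmask):
    """Replace isolated bad pixels by linear interpolation from the
    neighboring columns, reducing their weight by a factor of 2.

    Only the 'single bad pixel spanned' case is handled: both the
    left and right neighbors must be good. Arrays modified in place.
    """
    ny, nx = image.shape
    for y in range(ny):
        for x in range(1, nx - 1):
            if badmask[y, x] and not badmask[y, x - 1] and not badmask[y, x + 1]:
                # Linear interpolation across a single bad pixel.
                image[y, x] = 0.5 * (image[y, x - 1] + image[y, x + 1])
                weight[y, x] *= 0.5  # interpolated value counts for half
    return image, weight
```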
Bias calibration consists of subtracting a master bias calibration. Master biases are created by the pipeline by combining many individual bias exposures (usually) taken during the night.
The CP does not do a dark count calibration and dark count exposures are ignored on input.
The DECam CCDs exhibit varying amounts of non-linearity at both low and high count levels. The instrument team provides a look-up table for each amplifier to correct this.
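Applying a look-up-table linearity correction is just an indexed lookup per pixel, as sketched below. The table format is an assumption; the real CP tables come from the instrument team's FITS file, one per amplifier.

```python
import numpy as np

def linearize(image, lut):
    """Apply a linearity correction via a look-up table (illustrative).

    `lut` maps each possible raw ADU value (used as an index) to its
    corrected value.
    """
    raw = np.clip(image.astype(int), 0, len(lut) - 1)
    return lut[raw]
```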
The principal, pixel-level gain calibration consists of dividing by a master dome flat field. Master dome flat fields are created by the pipeline from dome flat exposures (usually) taken during the same night. These master dome flat fields are, themselves, corrected for large scale effects; namely, camera reflections and differences in response between the dome lamp illumination and a typical dark sky. The corrections are derived from stellar photometry of a calibration field obtained during special calibration programs. These calibrations are called star flat fields or "starflats".
The starflats account for camera reflection in the dome flat fields so as to not bias the on-sky observations. This does not remove camera reflection light from these sky observations. This is handled later as part of removing background patterns. The starflats also do not account for differences in illumination response from the time and conditions of the starflat observations nor the coarseness of the photometric sampling. This is also treated later as part of a sky illumination calibration.
A couple of filters (z and Y) show an interference pattern, generally called a "fringe pattern", caused by narrow night sky lines. This is removed by subtracting a fringe pattern calibration. This calibration is produced externally from data taken during special calibration programs. The pattern has been determined to be sufficiently stable over time to use such a calibration. The calibration pattern is scaled by the median of the exposure and then subtracted.
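The scale-and-subtract step reduces to a couple of lines, sketched here under the assumption that the template is normalized with zero mean (as described for the master fringe templates later in this document); the function name is illustrative.

```python
import numpy as np

def defringe(image, fringe_template):
    """Scale a zero-mean fringe template by the median of the exposure
    and subtract it (sketch of the z/Y-band fringe correction)."""
    scale = np.median(image)
    return image - scale * fringe_template
```

Because the template has zero mean, the subtraction removes the interference pattern without changing the overall sky level of the exposure.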
Saturation in bright stars and galaxies produces trails along columns due to an effect called "bleeding" or "blooming". A related effect is edge bleeding from the serial register which affects lines near the serial register side of the image when there is a very bright star near the serial register. An algorithm is used to identify pixels affected by these effects and add them to the data quality maps. In addition the pixels are replaced by linear interpolation across the bad data.
Astrometric calibration consists of refining the world coordinate system (WCS) function based on matching sources in an exposure to an astrometric reference catalog (V3.x.x: 2MASS). The sources in each CCD are cataloged and then all the catalogs are combined and matched to the astrometric reference catalog. The matched sources are then used to refine the terms of the TPV WCS function. This astrometric solution is stored in the CCD image headers of the instrumentally calibrated data products. Note that there are also data products where the images have been remapped to a simple tangent plane world coordinate system.
Evaluation of the TPV astrometric solution (in the non-remapped data products) is available in several common astronomical software packages; e.g. IRAF and DS9.
Cosmic ray events are identified and the affected pixels are added to the data quality and weight maps. The single exposure identification is based on finding pixels that are significantly brighter than nearest neighbor pixels; i.e. have "sharp" edges. This algorithm has been tuned to avoid identifying the cores of unresolved sources. As part of this, exposures with very good image quality (FWHM < 3.3 pixels) are not searched. Note that cosmic rays are also identified at a later stage when multiple exposures of the same field are part of the dataset.
A characterization of the magnitude zero point is derived by comparison to common sources in a photometric reference catalog (V3.0: USNO-B1). This is obtained by creating a source catalog of instrumental magnitudes and matching it against a reference catalog. The mean magnitude difference in the matches yields the zero point. This zero point is useful both for users and for stacking multiple exposures.
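The zero-point estimate is simply a robust mean of the magnitude differences over the matched sources. The sketch below assumes the matching has already been done and adds a sigma-clip that the text does not specify; the function name is illustrative.

```python
import numpy as np

def zero_point(inst_mags, ref_mags, sigma=3.0):
    """Photometric zero point from matched sources: the (sigma-clipped)
    mean difference between reference and instrumental magnitudes."""
    diff = np.asarray(ref_mags) - np.asarray(inst_mags)
    med, std = np.median(diff), np.std(diff)
    # Clip outliers (mismatches, variables) before taking the mean.
    keep = np.abs(diff - med) <= sigma * std if std > 0 else np.ones(diff.size, bool)
    return diff[keep].mean()
```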
Sky pattern removal attempts to make the sky background across an exposure uniform on large scales. This applies to the non-remapped exposures from which remapped images and stacks are subsequently produced. The first pattern to remove is a camera reflection which appears as a large image of the pupil; hence it is often termed a "pupil ghost". The second pattern is a gradient across the field. By gradient we do not mean just a plane but a structure that might be like a vignetting based on the orientation of the dome slit, strong light sources like the moon, or differential atmospheric transmission. It is realized that some of these are actually response variations rather than additive background, but in V3.x.x all sources of low spatial frequency variation are treated as additive background.
The patterns are extracted from a special, low resolution, full field sky image. The sky in each CCD is carefully measured from histograms of blocks of pixels. Each block becomes a single pixel in a low resolution image. The values for all CCDs are put together into a simple full field image that preserves the geometry of the focal plane and includes masks where there is no data.
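The block-collapse of one CCD into the low-resolution sky image can be sketched as below. The CP measures each block's sky from a histogram; a plain median is used here as a stand-in, and the function name and block size are illustrative.

```python
import numpy as np

def block_sky(ccd, block=128):
    """Collapse one CCD into a low-resolution sky image: each
    block x block region becomes a single sky estimate (a median
    here; the CP carefully measures block histograms)."""
    ny, nx = ccd.shape
    out = np.empty((ny // block, nx // block))
    for j in range(out.shape[0]):
        for i in range(out.shape[1]):
            out[j, i] = np.median(
                ccd[j * block:(j + 1) * block, i * block:(i + 1) * block])
    return out
```

The low-resolution images for all CCDs are then placed into a single mosaic preserving the focal plane geometry, with masked pixels where there is no data.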
The pupil pattern is modeled by fitting in polar coordinates about the center of the pattern over a range of radii. Outside of these radii the pattern is assumed to be absent. A polynomial is fit across each radius bin. The inner and outer rings define a background where the contribution of the pattern is taken to go to zero. The order of the polynomial for the background rings is high enough to track background gradients and the background at each point within the ring is a linear interpolation between the inner and outer fit values at a given azimuth. For the azimuthal fits of the background subtracted rings in the pattern a constant is currently used. The constant values at each radius are then fit by a polynomial. These choices produce a ring fit with no azimuthal structure. This pattern is subtracted from the low resolution sky image and the actual exposure data.
After removal of a pupil pattern the focal plane sky image is median filtered with a window that is larger than a single CCD. This filtering is able to remove sky patterns such as produced by moonlight through the dome slit while not being unduly influenced by remaining amplifier gain variations. The background is adjusted to a zero mean and then subtracted from the full exposure data. This results in final data that has a count level matching the exposure but with the reflection and background structure removed; i.e. a nearly uniform background.
This correction makes the ensemble response to sky light, after removing sky patterns due to camera reflections and illumination gradients, uniform across the field of view. This is done by combining all exposures of 60 seconds or more, in the same filter and over a small set of nights (usually a run), using source masks and medians to eliminate source light. This "dark sky illumination" calibration is divided into each individual exposure.
The pipeline operator reviews the calibrations to ensure there were enough exposures with good dithering and minimal impact of sources. If the operator accepts the calibration it is applied to the exposures. If the calibration is rejected then a successful calibration from data as close in time as possible is applied.
A photometric point to be aware of is that the various gain calibrations on the unremapped individual exposures assume the areas of the pixels projected on the sky are the same. This is a small effect in DECam (< 1% at the extreme edge) but, depending on how one performs the photometry, this may be a factor one should consider. The remapping and subsequent coadded stacks explicitly make the pixel areas the same (apart from the very small tangent plane projection) and so fixed aperture photometry would be as accurate as the gain calibration allows.
The astrometric calibration allows remapping the data to any desired projection and sampling. Note that if the astrometric calibration failed (keyword WCSCAL) then remapping and coadding are not done on the exposure, though a data product will still be available without an accurate astrometric calibration. The two reasons for remapping are 1) to provide data where instrument distortions have been removed, particularly pixel size variations on the sky which can affect photometry programs, and 2) to register images for stacking. The CP resamples (i.e. interpolates) each exposure to a standard tangent plane projection with north up, east to the left, and each pixel being 0.27 arc seconds square.
The tangent point is selected from a "healpix" grid on the sky and consideration is given to assigning overlapping exposures to the same tangent point. The reason for this is so that exposures with the same CP standard tangent plane projection can be combined without further interpolation. The algorithm that assigns exposures with different nearest grid points to the same grid point because they overlap requires a significant overlap (roughly 30% or more). Programs that shift exposures by nearly a full DECam field of view will produce tangent point groupings (and hence coadd stacks) which are separate.
The remapping is one where right ascension decreases with image column and declination increases with image line. In other words, when an image is displayed in the usual way with pixel (1,1) at the lower left, north is up and east is left. The pixel size is set to 0.27 arc seconds in each dimension. Note that in a tangent plane projection these criteria are only exact at the tangent point, which is not necessarily at the center of the field. This means that the celestial axes will depart from alignment with the image rows and columns and the pixel sizes will vary. The former alignment effect is most noticeable at declinations near the celestial pole and the latter pixel size variation is negligibly small for the DECam field of view.
The interpolation method is reflected in the image headers (keyword REDAMPT1) and is a Lanczos interpolator in V3.x.x and earlier.
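For reference, the Lanczos kernel and a one-dimensional interpolation with it are sketched below (the 2-D remapping applies the kernel separably along each axis). The kernel order `a=3` and the normalization are assumptions; the function names are illustrative.

```python
import numpy as np

def lanczos_kernel(x, a=3):
    """Lanczos-a kernel: sinc(x) * sinc(x/a) for |x| < a, else 0."""
    x = np.asarray(x, dtype=float)
    out = np.sinc(x) * np.sinc(x / a)
    out[np.abs(x) >= a] = 0.0
    return out

def lanczos_interp(samples, x, a=3):
    """Interpolate 1-D `samples` at fractional position `x` using the
    2a nearest samples, with the kernel weights renormalized to sum
    to one near the array edges."""
    i0 = int(np.floor(x))
    total, norm = 0.0, 0.0
    for i in range(i0 - a + 1, i0 + a + 1):
        if 0 <= i < len(samples):
            w = float(lanczos_kernel(np.array([x - i]), a)[0])
            total += w * samples[i]
            norm += w
    return total / norm
```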
The pipeline masks "transient" sources in overlapping exposures. These masked sources are then excluded from coadds of the exposures. Transient sources include cosmic rays, satellite trails, and asteroids. In summary, anything which is significantly different in one exposure compared to a median of the exposures. However, the algorithm purposely is insensitive to differences within galaxies and stars. It also becomes statistically insensitive when there are very few exposures in regions of overlap.
The algorithm works as follows. A first stack is created as a median of the high spatial frequency subtracted remapped exposures. Then the median is subtracted from each individual exposure and a detection program is run to detect things above background. The footprint of each detection, which includes a small amount of boundary growth, is then added to the data quality maps and weight map (as zero weight) for the exposure. Finally all the exposures are stacked again as an average with all masked pixels excluded.
There is one detail to note. This algorithm does not include any PSF matching. This means that small changes in the seeing can produce residuals at the cores of bright stars or galaxy cores. To avoid inappropriate masking of these, the instrumental flux of detected residual sources is compared to the flux in the median image using the same footprint. The residual source flux is required to be greater than 70% of the median flux. So for blank sky all statistically meaningful transients are detected but in regions of stars and galaxies only truly significant sources are found.
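The whole transient-masking loop can be sketched as follows. This condenses the median stack, residual detection, the 70% flux test, and the final masked average into one function; the detection here is a simple per-pixel threshold with a robust sigma, whereas the CP runs a real detection program with footprint growth, so treat every name and threshold as an assumption.

```python
import numpy as np

def mask_transients(exposures, weights, nsigma=5.0, flux_ratio=0.7):
    """Sketch of multi-exposure transient rejection:
    1. median-stack the registered exposures,
    2. subtract the median from each exposure,
    3. flag residual detections, but only those whose residual flux
       exceeds `flux_ratio` (70%) of the median-image flux at the
       same pixels (guards against seeing-induced core residuals),
    4. average the exposures with flagged pixels given zero weight.
    """
    cube = np.array(exposures, dtype=float)
    median = np.median(cube, axis=0)
    for k, exp in enumerate(cube):
        resid = exp - median
        # Robust sigma via the median absolute deviation, floored at 1 ADU.
        sigma = 1.4826 * np.median(np.abs(resid - np.median(resid)))
        detected = resid > nsigma * max(sigma, 1.0)
        # Flux test: residual must dominate the median flux there.
        strong = detected & (resid > flux_ratio * np.abs(median))
        weights[k][strong] = 0.0
    wsum = np.sum(weights, axis=0)
    stack = np.sum(cube * weights, axis=0) / np.maximum(wsum, 1e-9)
    return stack, weights
```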
The CP combines exposures which overlap significantly. The overlapping exposures are determined during the remapping step described above: exposures are considered overlapping when they have the same tangent point. As noted there, exposure patterns that move fields by roughly a DECam field of view will generally produce separate coadds and not one very large coadd.
The overlapping exposures are divided up into multiple stacks by the following criteria. Exposures are grouped by exposure time with groups for very short (t < 10s), short (10s <= t < 60s), medium (60s <= t < 300s), and long (t >= 300s) exposures. If a group has more than 50 exposures it is divided into the smallest number of subgroups, all of which have fewer than 50 exposures.
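The subgroup splitting rule can be sketched as below; the function name and even-sized chunking are illustrative (the text only constrains the number and maximum size of the subgroups).

```python
import math

def split_groups(exposure_ids, max_size=50):
    """Divide an exposure-time group into the smallest number of
    subgroups that each contain fewer than `max_size` exposures."""
    n = len(exposure_ids)
    if n <= max_size:
        return [list(exposure_ids)] if n else []
    # Smallest subgroup count such that every subgroup has < max_size.
    nsub = math.ceil(n / (max_size - 1))
    size = math.ceil(n / nsub)
    return [list(exposure_ids[i:i + size]) for i in range(0, n, size)]
```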
Of the exposures that are identified for a stack, another set of criteria is used to exclude outlier exposures which can degrade the image quality. These criteria are based on outlier statistics, meaning exposures which depart significantly from the typical values. The quantities considered are unusually large relative flux scaling (e.g. low magnitude zero points due to bad transparency or short exposure time), poor seeing, and high sky brightness. In addition, at most half of the exposures may be rejected; otherwise all exposures are used.
There are two stacks produced for each set of exposures satisfying the criteria above. One has no background adjustments beyond those applied to the non-remapped data as described previously. The other subtracts an additional background from each CCD. This is a higher spatial frequency filtering which can produce a better matching of overlapping CCDs but can also subtract light from sources with extended light of 10% or more of a CCD. Because it is difficult for the pipeline to decide which is appropriate for a dataset, the two versions are produced from which the investigator can choose.
A common question is why a coadd does not include data over a long run or multiple runs. This is because the pipeline works on blocks of consecutive or nearly consecutive nights. Only data within a block of nights are available for defining a field. Long runs are sometimes broken up into multiple blocks due to disk space limitations. Programs that have assigned nights that are disjoint by many days (normally considered as different runs) are processed separately.
In order not to compromise dithered data where good CCDs overlap poor or bad CCDs, those CCDs are excluded. In addition to those excluded earlier from the single, non-remapped exposures (e.g. 61/N30 and 2/S30 post Nov. 2013), others (e.g. 31/S7) may also be excluded.
The coadds are done using the data quality maps generated earlier in the pipeline including the masking of the multi-exposure transient detections.
The CP maintains a Calibration Library of files which are applied during processing. The selection of a calibration file is based on the date of the exposure being calibrated. Some calibration files change infrequently and some are derived frequently from calibration or on-sky exposures taken by the observers. The processing metadata added to the data products provides the names of the calibration files used and, in principle, all calibrations files may be obtained from the archive.
The crosstalk coefficient file is a text file that provides the coefficients for the crosstalk correction described earlier. It is indexed by pairs of affected and source CCD amplifiers. This file is derived by a calibration team looking at potentially affected pixels for bright sources in a variety of exposures. It may be updated periodically as needed.
A text file is provided by instrument scientists containing the saturation level for each amplifier. This is considered the best estimate and overrides values in the raw exposure headers.
The instrument scientists provide a linearity coefficient FITS file with tables of linearity coefficients for each CCD.
A text file is provided by instrument scientists containing the keywords required for the TPV world coordinate system (WCS). Besides the structural keywords this also provides the initial coefficients for each CCD. These keywords and coefficients override values in the raw exposure headers.
The instrument scientists provide a map of the known bad pixels for each CCD. This information is used to populate the initial data quality maps for each exposure. These files will be periodically updated as needed.
All bias exposures from a night -- independent of proposal and excluding any subject to the voltage turn on transient -- are processed into a single master bias calibration. The first steps are the same as previously described for science exposures; namely, electronic bias, crosstalk, saturation masking, and bad pixel masking and interpolation. After these calibrations the exposures are combined, in CCD pixel space, by averaging the pixel values with lowest and highest values excluded. A weight map is also produced for error propagation.
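The min/max-excluded average used for the combination can be sketched as below (the same combining rule is described for the master dome flats). The function name and the simple count-based weight are illustrative; the CP's weight map is a proper error propagation.

```python
import numpy as np

def minmax_combine(exposures):
    """Combine calibration exposures per pixel by averaging after
    excluding the lowest and highest values (needs >= 3 exposures)."""
    cube = np.sort(np.array(exposures, dtype=float), axis=0)
    trimmed = cube[1:-1]  # drop the per-pixel min and max
    master = trimmed.mean(axis=0)
    # Placeholder weight: number of frames contributing to each pixel.
    weight = np.full(master.shape, float(trimmed.shape[0]))
    return master, weight
```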
The master bias calibrations are subject to review by the operator who may redo the processing with problem exposures eliminated or simply reject the master bias for the night, in which case a nearby night will be used for the science exposures.
All dome flat exposures from a night -- independent of proposal, excluding any subject to the voltage turn on transient, and grouped by filter -- are processed into a single master dome flat calibration. The first steps are the same as previously described for science exposures; namely, electronic bias, crosstalk, saturation masking, bad pixel masking and interpolation, master bias subtraction, and linearity. Each CCD is then scaled by the average of the central section ([500:1500,1500:2500]) with bad pixels identified by the current instrument bad pixel map excluded. The pixels are then combined, in CCD pixel space, by averaging the pixel values with lowest and highest values excluded. A weight map for error propagation is also produced.
The master dome flat calibrations are subject to review by the operator who may redo the processing with problem exposures eliminated or simply reject the master dome flat for the night, in which case a nearby night will be used for the science exposures.
Note that each CCD is normalized to one within the central section. This means application of the dome flat does not correct for the relative response differences between CCDs. That calibration is provided by the star flat master calibration.
The star flat calibrations, one per filter, are master calibrations created outside of the CP. Each is produced from many dithered exposures of a modestly dense stellar field taken as part of a separate calibration program. These exposures are processed through the normalized dome flat calibration. The logic is that the dithering produces many instances of the same star over the detector. Spatial variations from the average instrumental magnitude for that star provide a measure of the relative response differences between those sampled points. By combining a large number of stars, a flat field map is produced which makes the instrumental magnitudes, and hence the response of the camera, uniform across the field of view.
Templates of the fringe pattern, one for z and one for Y, are quite stable. Therefore, these are derived periodically outside of the CP from many exposures of sparse fields. The exposures are combined to exclude sources. The master stack is then filtered to extract the fringe pattern with a mean of zero. During science exposure calibration the template is scaled and subtracted.
Illumination calibrations, grouped by filter, are derived from run datasets when the exposures are sufficiently numerous, dithered, and unaffected by large sources. This provides a gain correction to the more static star flat calibration. The calibration consists of images for each CCD with flat field pixel values. The values are generally spatially smoothed. When an illumination calibration is derived, and approved by the operator, it enters the CP calibration library. It may then be used for the individual exposures from the same dataset or by other datasets for which an illumination correction cannot be derived.
The CP data products are available from the NOAO Science Archive. Calibrations are non-proprietary while most science data has an 18 month proprietary period from the time the raw data was taken.
There are currently four classes of data products: calibrations, instrumentally calibrated single exposures (non-remapped and remapped), and stacks (two versions of background subtraction). Each of these consists of the basic image flux data plus various ancillary files associated with the flux data. The current ancillary file types are data quality, weight, and coverage/exposure maps. It is possible additional data products will be added in the future.
There are currently 17 types of files. The file types are identified by three type values: the observation type, processing type, and product type. The table below shows the various combinations. Investigators may need only some of the types depending on the science or analysis applications. However, it is highly recommended that they consult either the data quality maps or weight maps to interpret whether values in the image data are actually scientifically useful.
The product type has five values -- "image/image1" for flux data, "dqmask" for data quality maps, "wtmap" for weight maps, and "expmap" for exposure or coverage maps. The principal type of scientific interest is the image type while the others provide ancillary data about the flux pixels.
The processing type has four values -- "MasterCal" for master calibrations of various observation types, "InstCal" for the instrumentally calibrated but not resampled single exposures, "Resampled" for the resampled single exposures, and "Stacked" for stacked or coadded data from multiple exposures of the same pointing.
The observation type has five values -- "zero" for zero second or bias master calibrations, "dome flat" for master dome flat calibrations, "fringecor" for master fringe templates, "illumcor" for master illumination corrections, and "object" for on-sky science data.
One thing to mention is that not all combinations are produced. In particular data quality maps are not produced for master calibrations and exposure maps are only applicable to stacked data products.
All data products are packaged as compressed, multi-extension FITS (MEF) files. The extensions are either for each CCD or for tiles of a stacked data product. The compressed format is discussed further below.
As noted in the calibration discussion, the overscan and other non-imaging regions are removed and do not appear in any of the data products.
The FITS files are compressed for storing in the archive and making it faster to transfer electronically. The image and weight map product types use a lossy tile compression method which is now widely used by data providers in astronomy and supported by standard software.
The data quality and exposure map data product type, which is inherently integer, is compressed with a lossless run-length or pixel list method. If uncompressed it becomes a fully populated integer image. However, if left compressed and, instead, simply renamed to drop the ".fz" extension then this format can be used directly by IRAF applications. This can save significant disk space and provides access to the suite of IRAF tools designed to use bad pixel maps.
The exposure or coverage map type can generally be represented as integer seconds. If for some reason it cannot then it is mapped to integers using the scale parameters. It is then compressed in the same way as the data quality map; i.e., using "plio" compression. Therefore, it can also be expanded or used as a native IRAF format by dropping the ".fz" extension.
The data products described above are archived in the NOAO Science Archive (NSA) using FITS compression. Investigators may then either use software that directly understands the compressed format or use standard tools for uncompressing to more classic FITS files. The image and weight data products use lossy tile compression which provides roughly a six-fold compression that is important for storage and data transport. This type of astronomically specialized compression has been carefully studied and has become a de facto standard. Without compression the NSA would become prohibitively expensive for NOAO and the ability to provide individual remapped versions and two versions of stacks to help observers easily optimize stacks would not be possible.