GNU Astronomy Utilities



2.1.15 Working with catalogs (estimating colors)

In the previous step we generated catalogs of objects and clumps over our dataset (see Segmentation and making a catalog). The catalogs are available in the two extensions of the single FITS file39. Let’s see the extensions and their basic properties with the Fits program:

$ astfits  cat/xdf-f160w.fits              # Extension information

Let’s inspect the table in each extension with Gnuastro’s Table program (see Table). We should have used -hOBJECTS and -hCLUMPS instead of -h1 and -h2 respectively. The numbers are just used here to convey that both names or numbers are possible, in the next commands, we will just use names.

$ asttable cat/xdf-f160w.fits -h1 --info   # Objects catalog info.
$ asttable cat/xdf-f160w.fits -h1          # Objects catalog columns.
$ asttable cat/xdf-f160w.fits -h2 -i       # Clumps catalog info.
$ asttable cat/xdf-f160w.fits -h2          # Clumps catalog columns.

As you see above, when given a specific table (file name and extension), Table will print the full contents of all the columns. To see the basic metadata about each column (for example, name, units and comments), simply append a --info (or -i) to the command.

To print the contents of special column(s), just give the column number(s) (counting from 1) or the column name(s) (if they have one) to the --column (or -c) option. For example, if you just want the magnitude and signal-to-noise ratio of the clumps (in the clumps catalog), you can get it with any of the following commands

$ asttable cat/xdf-f160w.fits -hCLUMPS --column=5,6
$ asttable cat/xdf-f160w.fits -hCLUMPS -c5,SN
$ asttable cat/xdf-f160w.fits -hCLUMPS -c5         -c6
$ asttable cat/xdf-f160w.fits -hCLUMPS -cMAGNITUDE -cSN

Similar to HDUs, when the columns have names, always use the name: it is so common to mis-write numbers or forget the order later! Using column names instead of numbers has many advantages:

  1. You do not have to worry about the order of columns in the table.
  2. It acts as a documentation in the script.
  3. Column meta-data (including a name) are not just limited to FITS tables and can also be used in plain text tables, see Gnuastro text table format.

Table also has tools to limit the displayed rows. For example, with the first command below only rows with a magnitude in the range of 29 to 30 will be shown. With the second command, you can further limit the displayed rows to rows with an S/N larger than 10 (a range between 10 to infinity). You can further sort the output rows, only show the top (or bottom) N rows, etc., see Table for more.

$ asttable cat/xdf-f160w.fits -hCLUMPS --range=MAGNITUDE,28:29
$ asttable cat/xdf-f160w.fits -hCLUMPS \
           --range=MAGNITUDE,28:29 --range=SN,10:inf

Now that you are comfortable in viewing table columns and rows, let’s look into merging columns of multiple tables into one table (which is necessary for measuring the color of the clumps). Since cat/xdf-f160w.fits and cat/xdf-f105w-on-f160w-lab.fits have exactly the same number of rows and the rows correspond to the same clump, let’s merge them to have one table with magnitudes in both filters.

We can merge columns with the --catcolumnfile option like below. You give this option a file name (which is assumed to be a table that has the same number of rows as the main input), and all the table’s columns will be concatenated/appended to the main table. Now, try it out with the commands below. We will first look at the metadata of the first table (only the CLUMPS extension). With the second command, we will concatenate the two tables and write them in, two-in-one.fits and finally, we will check the new catalog’s metadata.

$ asttable cat/xdf-f160w.fits -i -hCLUMPS
$ asttable cat/xdf-f160w.fits -hCLUMPS --output=two-in-one.fits \
           --catcolumnfile=cat/xdf-f125w-on-f160w-lab.fits \
           --catcolumnhdu=CLUMPS
$ asttable two-in-one.fits -i

By comparing the two metadata, we see that both tables have the same number of rows. But what might have attracted your attention more, is that two-in-one.fits has double the number of columns (as expected, after all, you merged both tables into one file, and did not ask for any specific column). In fact you can concatenate any number of other tables in one command, for example:

$ asttable cat/xdf-f160w.fits -hCLUMPS --output=three-in-one.fits \
           --catcolumnfile=cat/xdf-f125w-on-f160w-lab.fits \
           --catcolumnfile=cat/xdf-f105w-on-f160w-lab.fits \
           --catcolumnhdu=CLUMPS --catcolumnhdu=CLUMPS
$ asttable three-in-one.fits -i

As you see, to avoid confusion in column names, Table has intentionally appended a -1 to the column names of the first concatenated table if the column names are already present in the original table. For example, we have the original RA column, and another one called RA-1). Similarly a -2 has been added for the columns of the second concatenated table.

However, this example clearly shows a problem with this full concatenation: some columns are identical (for example, HOST_OBJ_ID and HOST_OBJ_ID-1), or not needed (for example, RA-1 and DEC-1 which are not necessary here). In such cases, you can use --catcolumns to only concatenate certain columns, not the whole table. For example, this command:

$ asttable cat/xdf-f160w.fits -hCLUMPS --output=two-in-one-2.fits \
           --catcolumnfile=cat/xdf-f125w-on-f160w-lab.fits \
           --catcolumnhdu=CLUMPS --catcolumns=MAGNITUDE
$ asttable two-in-one-2.fits -i

You see that we have now only appended the MAGNITUDE column of cat/xdf-f125w-on-f160w-lab.fits. This is what we needed to be able to later subtract the magnitudes. Let’s go ahead and add the F105W magnitudes also with the command below. Note how we need to call --catcolumnhdu once for every table that should be appended, but we only call --catcolumn once (assuming all the tables that should be appended have this column).

$ asttable cat/xdf-f160w.fits -hCLUMPS --output=three-in-one-2.fits \
           --catcolumnfile=cat/xdf-f125w-on-f160w-lab.fits \
           --catcolumnfile=cat/xdf-f105w-on-f160w-lab.fits \
           --catcolumnhdu=CLUMPS --catcolumnhdu=CLUMPS \
           --catcolumns=MAGNITUDE
$ asttable three-in-one-2.fits -i

But we are not finished yet! There is a very big problem: it is not immediately clear which one of MAGNITUDE, MAGNITUDE-1 or MAGNITUDE-2 columns belong to which filter! Right now, you know this because you just ran this command. But in one hour, you’ll start doubting yourself and will be forced to go through your command history, trying to figure out if you added F105W first, or F125W. You should never torture your future-self (or your colleagues) like this! So, let’s rename these confusing columns in the matched catalog.

Fortunately, with the --colmetadata option, you can correct the column metadata of the final table (just before it is written). It takes four values: 1) the original column name or number, 2) the new column name, 3) the column unit and 4) the column comments. Since the comments are usually human-friendly sentences and contain space characters, you should put them in double quotations like below. For example, by adding three calls of this option to the previous command, we write the filter name in the magnitude column name and description.

$ asttable cat/xdf-f160w.fits -hCLUMPS --output=three-in-one-3.fits \
        --catcolumnfile=cat/xdf-f125w-on-f160w-lab.fits \
        --catcolumnfile=cat/xdf-f105w-on-f160w-lab.fits \
        --catcolumnhdu=CLUMPS --catcolumnhdu=CLUMPS \
        --catcolumns=MAGNITUDE \
        --colmetadata=MAGNITUDE,MAG-F160W,log,"Magnitude in F160W." \
        --colmetadata=MAGNITUDE-1,MAG-F125W,log,"Magnitude in F125W." \
        --colmetadata=MAGNITUDE-2,MAG-F105W,log,"Magnitude in F105W."
$ asttable three-in-one-3.fits -i

We now have all three magnitudes in one table and can start doing arithmetic on them (to estimate colors, which are just a subtraction of magnitudes). To use column arithmetic, simply call the column selection option (--column or -c), put the value in single quotations and start the value with arith (followed by a space) like the example below. Column arithmetic uses the same “reverse polish notation” as the Arithmetic program (see Reverse polish notation), with almost all the same operators (see Arithmetic operators), and some column-specific operators (that are not available for images). In column-arithmetic, you can identify columns by number (prefixed with a $) or name, for more see Column arithmetic.

So let’s estimate one color from three-in-one-3.fits using column arithmetic. All the commands below will produce the same output, try them each and focus on the differences. Note that column arithmetic can be mixed with other ways to choose output columns (the -c option).

$ asttable three-in-one-3.fits -ocolor-cat.fits \
           -c1,2,3,4,'arith $5 $7 -'

$ asttable three-in-one-3.fits -ocolor-cat.fits \
           -c1,2,RA,DEC,'arith MAG-F125W MAG-F160W -'

$ asttable three-in-one-3.fits -ocolor-cat.fits -c1,2 \
           -cRA,DEC --column='arith MAG-F105W MAG-F160W -'

This example again highlights the important point on using column names: if you do not know the commands before, you have no way of making sense of the first command: what is in column 5 and 7? why not subtract columns 3 and 4 from each other? Do you see how cryptic the first one is? Then look at the last one: even if you have no idea how this table was created, you immediately understand the desired operation. When you have column names, please use them. If your table does not have column names, give them names with the --colmetadata (described above) as you are creating them. But how about the metadata for the column you just created with column arithmetic? Have a look at the column metadata of the table produced above:

$ asttable color-cat.fits -i

The name of the column produced by arithmetic column is ARITH_1! This is natural: Arithmetic has no idea what the modified column is! You could have multiplied two columns, or done much more complex transformations with many columns. Metadata cannot be set automatically, your (the human) input is necessary. To add metadata, you can use --colmetadata like before:

$ asttable three-in-one-3.fits -ocolor-cat.fits -c1,2,RA,DEC \
         --column='arith MAG-F105W MAG-F160W -' \
         --colmetadata=ARITH_1,F105W-F160W,log,"Magnitude difference"
$ asttable color-cat.fits -i

Sometimes, because of a particular way of storing data, you might need to take all input columns. If there are many columns (for example hundreds!), listing them (like above) will become annoying, buggy and time-consuming. In such cases, you can give -c_all. Upon execution, _all will be replaced with a comma-separated list of all the input columns. This allows you to add new columns easily, without having to worry about the number of input columns that you want anyway. A lower-level but more customizable method is to use the seq (sequence) command with the -s (separator) option set to ','). For example, if you have 216 columns and only want to return columns 1 and 2 as well as all the columns between 12 to 58 (inclusive), you can use the command below:

$ asttable table.fits -c1,2,$(seq -s',' 12 58)

We are now ready to make our final table. We want it to have the magnitudes in all three filters, as well as the three possible colors. Recall that by convention in astronomy colors are defined by subtracting the bluer magnitude from the redder magnitude. In this way a larger color value corresponds to a redder object. So from the three magnitudes, we can produce three colors (as shown below). Also, because this is the final table we are creating here and want to use it later, we will store it in cat/ and we will also give it a clear name and use the --range option to only print columns with a signal-to-noise ratio (SN column, from the F160W filter) above 5.

$ asttable three-in-one-3.fits --range=SN,5,inf -c1,2,RA,DEC,SN \
         -cMAG-F160W,MAG-F125W,MAG-F105W \
         -c'arith MAG-F125W MAG-F160W -' \
         -c'arith MAG-F105W MAG-F125W -' \
         -c'arith MAG-F105W MAG-F160W -' \
         --colmetadata=SN,SN-F160W,ratio,"F160W signal to noise ratio" \
         --colmetadata=ARITH_1,F125W-F160W,log,"Color F125W-F160W." \
         --colmetadata=ARITH_2,F105W-F125W,log,"Color F105W-F125W." \
         --colmetadata=ARITH_3,F105W-F160W,log,"Color F105W-F160W." \
         --output=cat/mags-with-color.fits
$ asttable cat/mags-with-color.fits -i

The table now has all the columns we need and it has the proper metadata to let us safely use it later (without frustrating over column orders!) or passing it to colleagues.

Let’s finish this section of the tutorial with a useful tip on modifying column metadata. Above, updating/changing column metadata was done with the --colmetadata in the same command that produced the newly created Table file. But in many situations, the table is already made and you just want to update the metadata of one column. In such cases using --colmetadata is over-kill (wasting CPU/RAM energy or time if the table is large) because it will load the full table data and metadata into memory, just change the metadata and write it back into a file.

In scenarios when the table’s data does not need to be changed and you just want to set or update the metadata, it is much more efficient to use basic FITS keyword editing. For example, in the FITS standard, column names are stored in the TTYPE header keywords, so let’s have a look:

$ asttable two-in-one.fits -i
$ astfits two-in-one.fits -h1 | grep TTYPE

Changing/updating the column names is as easy as updating the values to these keywords. You do not need to touch the actual data! With the command below, we will just update the MAGNITUDE and MAGNITUDE-1 columns (which are respectively stored in the TTYPE5 and TTYPE11 keywords) by modifying the keyword values and checking the effect by listing the column metadata again:

$ astfits two-in-one.fits -h1 \
          --update=TTYPE5,MAG-F160W \
          --update=TTYPE11,MAG-F125W
$ asttable two-in-one.fits -i

You can see that the column names have indeed been changed without touching any of the data. You can do the same for the column units or comments by modifying the keywords starting with TUNIT or TCOMM.

Generally, Gnuastro’s table is a very useful program in data analysis and what you have seen so far is just the tip of the iceberg. But to avoid making the tutorial even longer, we will stop reviewing the features here, for more, please see Table. Before continuing, let’s just delete all the temporary FITS tables we placed in the top project directory:

rm *.fits

Footnotes

(39)

MakeCatalog can also output plain text tables. However, in the plain text format you can only have one table per file. Therefore, if you also request measurements on clumps, two plain text tables will be created (suffixed with _o.txt and _c.txt).