O3a ATLAS DATA FORMAT

This document describes the data format used for the release of O3a Atlas.

The release distributes its data in a Mappable Vector Library (MVL) file.

Every MVL file has the following layout:

[PREAMBLE] .. [VECTOR] .. [ [VECTOR] ... ]  [POSTAMBLE]

The PREAMBLE starts with the letters MVL0, which identify the file as MVL format, and contains information on data alignment and endianness. For this release the data is aligned on 64-byte boundaries and is little-endian.
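
As a quick sanity check before any further parsing, the signature can be tested directly. The sketch below is in Python purely for illustration (the release itself ships R examples), and the function name has_mvl_signature is hypothetical. It only inspects the first four bytes, because the positions of the alignment and endianness fields inside the PREAMBLE are not described here.

```python
import io

MVL_SIGNATURE = b"MVL0"  # the four letters that open every MVL file


def has_mvl_signature(stream) -> bool:
    """Return True if the binary stream begins with the MVL0 signature."""
    stream.seek(0)
    return stream.read(len(MVL_SIGNATURE)) == MVL_SIGNATURE


# In-memory stand-ins for a real file opened with open(path, "rb").
print(has_mvl_signature(io.BytesIO(b"MVL0" + bytes(60))))  # True
print(has_mvl_signature(io.BytesIO(b"not an MVL file")))   # False
```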

The PREAMBLE is followed by VECTORs that carry the data, separated by optional padding. Each VECTOR has the following structure:

[HEADER] [DATA]

The 64-byte HEADER describes the type and length of DATA and contains an optional 64-bit offset pointing to metadata:

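typedef struct {
    LIBMVL_OFFSET64 length;    /* number of elements in DATA */
    int type;                  /* element type of DATA */
    int reserved[11];          /* reserved; brings the header to 64 bytes */
    LIBMVL_OFFSET64 metadata;  /* optional offset of the metadata vector */
} LIBMVL_VECTOR_HEADER;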

The DATA is a plain array of length elements of the given type, where length is the value stored in the HEADER. MVL supports integer and floating-point types, character strings, and a special LIBMVL_OFFSET64 type, which is an unsigned 64-bit integer.
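
To make the HEADER layout concrete, here is a minimal Python sketch that decodes the 64 bytes with the standard struct module. It assumes a little-endian file and a 4-byte C int, which gives 8 + 4 + 44 + 8 = 64 bytes with no padding. The names VectorHeader and parse_vector_header are hypothetical, and the type code 7 in the demo is an arbitrary placeholder rather than a real MVL type.

```python
import struct
from typing import NamedTuple

# Little-endian, no padding: uint64 length, int32 type,
# 11 x int32 reserved, uint64 metadata (assumes a 4-byte C int).
HEADER_FORMAT = "<Qi11iQ"
HEADER_SIZE = struct.calcsize(HEADER_FORMAT)


class VectorHeader(NamedTuple):
    length: int      # number of elements in DATA
    type_code: int   # element type of DATA
    metadata: int    # offset of the metadata vector


def parse_vector_header(buf, offset: int = 0) -> VectorHeader:
    """Decode the HEADER that starts at `offset` bytes into `buf`."""
    fields = struct.unpack_from(HEADER_FORMAT, buf, offset)
    return VectorHeader(fields[0], fields[1], fields[-1])


# Synthetic header for illustration: 3 elements, placeholder type code 7.
raw = struct.pack(HEADER_FORMAT, 3, 7, *([0] * 11), 0)
header = parse_vector_header(raw)
print(HEADER_SIZE, header.length, header.type_code, header.metadata)  # 64 3 7 0
```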

The offset from the beginning of the file to the start of a vector's HEADER uniquely identifies that vector within an MVL file. An array of offsets therefore acts as a list of vectors.

This makes it possible to store data of arbitrary complexity in an MVL file, whether a set of tables similar to an SQL database, a tree-like structure, or something more complex.
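
The idea of a list of vectors can be illustrated with a toy in-memory buffer, sketched here in Python. Everything in it is an assumption made for the demonstration: the type codes TOY_DOUBLE and TOY_OFFSET64 are placeholders and not the real libMVL values, the helper names are hypothetical, and the toy buffer omits the PREAMBLE, POSTAMBLE, and alignment padding.

```python
import struct

HEADER_FORMAT = "<Qi11iQ"                     # 64-byte little-endian HEADER
HEADER_SIZE = struct.calcsize(HEADER_FORMAT)

# Placeholder type codes for this toy buffer only; NOT the real libMVL codes.
TOY_DOUBLE, TOY_OFFSET64 = 1, 2
ELEMENT_FORMAT = {TOY_DOUBLE: "d", TOY_OFFSET64: "Q"}


def append_vector(buf: bytearray, type_code: int, values) -> int:
    """Append HEADER + DATA to buf and return the offset of the HEADER."""
    offset = len(buf)
    buf += struct.pack(HEADER_FORMAT, len(values), type_code, *([0] * 11), 0)
    buf += struct.pack(f"<{len(values)}{ELEMENT_FORMAT[type_code]}", *values)
    return offset


def read_vector(buf, offset: int) -> list:
    """Read the vector whose HEADER starts at the given offset."""
    length, type_code, *_ = struct.unpack_from(HEADER_FORMAT, buf, offset)
    fmt = f"<{length}{ELEMENT_FORMAT[type_code]}"
    return list(struct.unpack_from(fmt, buf, offset + HEADER_SIZE))


buf = bytearray()
first = append_vector(buf, TOY_DOUBLE, [1.0, 2.0])
second = append_vector(buf, TOY_DOUBLE, [3.5])
listing = append_vector(buf, TOY_OFFSET64, [first, second])  # list of vectors

# Following each stored offset recovers the vectors it points to.
vectors = [read_vector(buf, off) for off in read_vector(buf, listing)]
print(vectors)  # [[1.0, 2.0], [3.5]]
```

Nesting such offset vectors inside one another is what allows tables and tree-like structures to be built from the same primitive.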

The metadata offset associates additional information with a vector. A common use is to attach a vector of "names" of the same length as the original vector, which can then be used to retrieve data symbolically.

To retrieve data one needs to know the offsets. They can either be provided externally (see the table at the end of this document) or be retrieved from other vectors in the MVL file.

The POSTAMBLE stores an offset to a directory, which is a named vector of offsets that serves as the top-level directory of objects in the MVL file.

The low-level libMVL library is written in C and provides functions to write data into an MVL file, access existing MVL files, parse metadata, and perform database functions. See the code and examples at:

https://github.com/volodya31415/libMVL

The higher-level RMVL package integrates MVL files into R, allowing easy analysis of the data:

https://cran.r-project.org/package=RMVL

The examples provided with the data use RMVL. The R computing environment is open source:

https://www.r-project.org/

The RMVL package (like other R packages, such as inline) is installed from within R with the command:

install.packages("RMVL")

The data included in O3a_atlas consists of two tables (in R terminology, "data.frames"), "parameters" and "skymaps", whose offsets are recorded in the top-level directory:

> parameters[1:5,]
   label idx first_bin band_start band_stop resolution
1      0   1   7200000   500.0033   500.050 0.01499920
2      1   2   7200000   500.0033   500.050 0.01499920
3     10   3   7200648   500.0483   500.095 0.01499785
4    100   4   7206480   500.4533   500.500 0.01498571
5   1000   5   7264800   504.5033   504.550 0.01486542

> skymaps[1:5,]
            ul      ul_circ       ul_avg      snr snr_frequency        ra       dec stage idx
1 1.575584e-25 5.762704e-26 1.269117e-25 20.15175      500.0304 3.1415927 -1.570796    02   1
2 1.575584e-25 5.949139e-26 1.328293e-25 21.54693      500.0301 0.2243995 -1.557815    02   1
3 1.575584e-25 5.949139e-26 1.328293e-25 21.54693      500.0301 0.0000000 -1.544833    02   1
4 1.575584e-25 5.949139e-26 1.328293e-25 21.54693      500.0301 0.2855993 -1.544833    02   1
5 1.575584e-25 5.949139e-26 1.328293e-25 21.54693      500.0301 0.5711986 -1.544833    02   1

These two tables share a column "idx" that ties corresponding records together. Thus all entries shown above for the skymaps table have idx=1 and, reading line 1 of the parameters table, the frequency band for these entries spans 500.0033 Hz to 500.0500 Hz.
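
The join on idx can be written out explicitly. The following Python sketch is illustrative only: it copies a few rows from the two listings above into plain Python containers, whereas a real analysis would read the columns from the MVL file. The names bands_by_idx, skymap_rows, and band_for are hypothetical.

```python
# Rows copied from the parameters listing: idx -> (band_start, band_stop) in Hz.
bands_by_idx = {
    1: (500.0033, 500.050),
    2: (500.0033, 500.050),
    3: (500.0483, 500.095),
    4: (500.4533, 500.500),
    5: (504.5033, 504.550),
}

# Two rows from the skymaps listing, reduced to (ra, dec, idx).
skymap_rows = [
    (3.1415927, -1.570796, 1),
    (0.2243995, -1.557815, 1),
]


def band_for(skymap_row):
    """Return (band_start, band_stop) of the parameters record tied to this row."""
    idx = skymap_row[2]
    return bands_by_idx[idx]


for row in skymap_rows:
    band_start, band_stop = band_for(row)
    print(f"ra={row[0]} dec={row[1]} idx={row[2]}: {band_start} Hz to {band_stop} Hz")
```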

"parameter" table columns description

"skymaps" table columns description

WARNING: the column ul_avg stores values of the population average proxy, computed using the same formula as in prior Falcon searches. We have tested that a population of signals injected with h0=ul_avg is recovered 95% of the time in clean data, and 90% of the time in highly contaminated data, such as the region with violin modes. However, this testing was only performed for population average proxy values computed as the maximum over the entire sky. Testing at the granularity of the atlas requires a large injection campaign, which has not been performed.

Each entry in the parameters table corresponds to a region of the parameter space that was analyzed on its own. The regions are created by selecting one of the overlapping frequency bands and one of 10 slices of the sky.

The atlas data is constructed in such a way that, if you are interested in a particular sky location and frequency, you need to find the entry in the skymaps table that is closest in spherical distance and whose frequency band covers the frequency of interest. No corrections are needed for sky position mismatch, as it has already been accounted for.
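
The lookup just described can be sketched in Python as follows. This is a small illustration under stated assumptions, not the procedure used to build the atlas: it assumes ra and dec are in radians (consistent with the values in the listings), uses the haversine formula for spherical distance, and works on a handful of rows copied from the listings. The function names are hypothetical, and a plain Python scan like this would be far too slow for the full skymaps table.

```python
import math


def angular_distance(ra1, dec1, ra2, dec2):
    """Great-circle distance in radians between two sky positions (haversine)."""
    sin_ddec = math.sin((dec2 - dec1) / 2.0)
    sin_dra = math.sin((ra2 - ra1) / 2.0)
    h = sin_ddec ** 2 + math.cos(dec1) * math.cos(dec2) * sin_dra ** 2
    return 2.0 * math.asin(min(1.0, math.sqrt(h)))


def lookup(parameters, skymaps, ra, dec, frequency):
    """Closest skymaps row among those whose band covers `frequency`, or None."""
    covering = {p["idx"] for p in parameters
                if p["band_start"] <= frequency <= p["band_stop"]}
    candidates = [s for s in skymaps if s["idx"] in covering]
    if not candidates:
        return None
    return min(candidates,
               key=lambda s: angular_distance(ra, dec, s["ra"], s["dec"]))


# A few rows copied from the listings in this document.
parameters = [
    {"idx": 1, "band_start": 500.0033, "band_stop": 500.050},
    {"idx": 3, "band_start": 500.0483, "band_stop": 500.095},
]
skymaps = [
    {"ra": 3.1415927, "dec": -1.570796, "idx": 1, "ul": 1.575584e-25},
    {"ra": 0.2243995, "dec": -1.557815, "idx": 1, "ul": 1.575584e-25},
    {"ra": 0.0000000, "dec": -1.544833, "idx": 1, "ul": 1.575584e-25},
    {"ra": 0.2855993, "dec": -1.544833, "idx": 1, "ul": 1.575584e-25},
    {"ra": 0.5711986, "dec": -1.544833, "idx": 1, "ul": 1.575584e-25},
]

best = lookup(parameters, skymaps, ra=0.28, dec=-1.545, frequency=500.03)
print(best["ra"], best["dec"], best["ul"])
```

Because the frequency bands overlap, more than one parameters record can cover a given frequency. The sketch keeps every covering record as a candidate and lets the distance criterion pick among them.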

Data layout in O3a_atlas.mvl file

                        path          type     length      vector_offset
1                /parameters    data.frame          6 0x0000000000685e40
2          /parameters/label string_vector     111150 0x00000000001db6e0
3            /parameters/idx        double     111150 0x00000000002b49c0
4      /parameters/first_bin         int32     111150 0x000000000038dca0
5     /parameters/band_start        double     111150 0x00000000003fa6c0
6      /parameters/band_stop        double     111150 0x00000000004d39a0
7     /parameters/resolution        double     111150 0x00000000005acc80
8                   /skymaps    data.frame          9 0x00000011c2253b60
9                /skymaps/ul         float 1860117611 0x0000000000685ec0
10          /skymaps/ul_circ         float 1860117611 0x00000001bbe4f0c0
11           /skymaps/ul_avg         float 1860117611 0x00000003776182c0
12              /skymaps/snr         float 1860117611 0x0000000532de14c0
13    /skymaps/snr_frequency        double 1860117611 0x00000006ee5aa6c0
14               /skymaps/ra         float 1860117611 0x0000000a6553ca60
15              /skymaps/dec         float 1860117611 0x0000000c20d05c60
16            /skymaps/stage         uint8 1860117611 0x0000000ddc4cee60
17              /skymaps/idx        double 1860117611 0x0000000e4b2c1320
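
Given the offsets in the table above, an individual element can be read without loading a whole column, which matters for the skymaps columns with nearly two billion entries. The Python sketch below assumes that DATA begins immediately after the 64-byte HEADER, that the file is little-endian, and that "float" and "double" denote 4-byte and 8-byte IEEE values. The helper names are hypothetical, and the sketch does not cover string_vector or data.frame entries.

```python
import io
import struct

HEADER_SIZE = 64  # bytes occupied by the vector HEADER

# Type name as shown in the table -> (struct format code, element size in bytes).
ELEMENT = {"uint8": ("B", 1), "int32": ("i", 4), "float": ("f", 4), "double": ("d", 8)}


def element_offset(vector_offset: int, type_name: str, index: int) -> int:
    """File offset of element `index` (0-based) of the vector at `vector_offset`."""
    _, size = ELEMENT[type_name]
    return vector_offset + HEADER_SIZE + index * size


def read_element(stream, vector_offset: int, type_name: str, index: int):
    """Seek to one element of a vector and decode it as little-endian."""
    code, size = ELEMENT[type_name]
    stream.seek(element_offset(vector_offset, type_name, index))
    return struct.unpack("<" + code, stream.read(size))[0]


# First element of /skymaps/ul, using the vector_offset from the table.
print(hex(element_offset(0x0000000000685EC0, "float", 0)))  # 0x685f00

# Synthetic stand-in for a file: 10 filler bytes, a zeroed HEADER, three floats.
fake = io.BytesIO(bytes(10) + bytes(HEADER_SIZE) + struct.pack("<3f", 1.5, 2.5, 3.5))
print(read_element(fake, 10, "float", 2))  # 3.5
```

With the real file, the same call would take a stream obtained from open("O3a_atlas.mvl", "rb"). Note that index is 0-based here, so row 1 of the R listings corresponds to index 0.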

Vector types: