This document describes the data format used for the release of O3a Atlas.
The release uses Mappable Vector Library (MVL) files to distribute data.
Every MVL file has the following architecture:
[PREAMBLE] .. [VECTOR] .. [ [VECTOR] ... ] [POSTAMBLE]
The PREAMBLE starts with the letters MVL0, which identify the file as MVL format, and contains information on data alignment and endianness. For this release the data is aligned on a 64-byte boundary and is little-endian.
This is followed by VECTORs that carry data separated by optional padding. Each VECTOR has the following structure:
[HEADER] [DATA]
The 64-byte HEADER describes type and length of DATA and contains an optional 64-bit offset pointing to metadata:
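typedef struct {
    LIBMVL_OFFSET64 length;    /* number of elements in DATA */
    int type;                  /* element type of DATA */
    int reserved[11];          /* reserved */
    LIBMVL_OFFSET64 metadata;  /* optional offset to metadata */
} LIBMVL_VECTOR_HEADER;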
The DATA is a plain array of "length" elements of the given type. MVL supports integer and floating-point types, character strings and a special LIBMVL_OFFSET64 type, which is an unsigned 64-bit integer.
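As an illustration of the header layout, the sketch below decodes a 64-byte header with Python's struct module. It assumes a 4-byte C int and the little-endian byte order stated for this release. The helper parse_vector_header and the type code used in the synthetic example are hypothetical and are not part of libMVL.

```python
import struct

# Little-endian layout of LIBMVL_VECTOR_HEADER as listed above:
# 64-bit length, 32-bit type, 11 reserved 32-bit ints, 64-bit metadata offset.
# Assumption: the C "int" fields are 4 bytes wide.
HEADER_FORMAT = "<Qi11iQ"
HEADER_SIZE = struct.calcsize(HEADER_FORMAT)  # 8 + 4 + 44 + 8 bytes


def parse_vector_header(raw):
    """Decode one vector header into a dict (hypothetical helper)."""
    if len(raw) != HEADER_SIZE:
        raise ValueError("expected %d bytes, got %d" % (HEADER_SIZE, len(raw)))
    fields = struct.unpack(HEADER_FORMAT, raw)
    return {"length": fields[0], "type": fields[1], "metadata": fields[-1]}


# Round trip on a synthetic header. The type code 5 is an arbitrary
# placeholder, not a documented libMVL constant.
example = struct.pack(HEADER_FORMAT, 111150, 5, *([0] * 11), 0x685E40)
header = parse_vector_header(example)
print(header)
```

The reserved fields are skipped on purpose, since this document does not describe their contents.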
The offset from the beginning of the file to the start of its header uniquely identifies a vector stored in an MVL file. An array of offsets then acts as a list of vectors.
This makes it possible to store data of arbitrary complexity in an MVL file, whether a set of tables similar to an SQL database, a tree-like structure, or a more complex data structure.
A metadata offset associates additional information with a vector. A common use is to add a vector of "names" of the same length as the original vector, which can be used to retrieve data symbolically.
In order to retrieve data one needs to know offsets. They can be either provided externally (see table at the end of this document), or retrieved from other vectors in the MVL file.
The POSTAMBLE stores an offset to a directory, which is a named vector of offsets that serves as the top-level directory of objects in the MVL file.
The low-level libMVL library is written in C and provides functions to write data into MVL files, access existing MVL files, parse metadata and perform database functions. See code and examples at:
https://github.com/volodya31415/libMVL
A higher level RMVL package for R integrates MVL files into R allowing easy analysis of data:
https://cran.r-project.org/package=RMVL
The examples provided with the data use RMVL. The R computing environment is open source.
The RMVL package (and other R packages such as inline) is installed from within R with the command:
install.packages("RMVL")
The data included in O3a_atlas consists of two tables (or, in R terminology, "data.frames"), "parameters" and "skymaps", whose offsets are recorded in the top-level directory:
> parameters[1:5,]
  label idx first_bin band_start band_stop resolution
1     0   1   7200000   500.0033   500.050 0.01499920
2     1   2   7200000   500.0033   500.050 0.01499920
3    10   3   7200648   500.0483   500.095 0.01499785
4   100   4   7206480   500.4533   500.500 0.01498571
5  1000   5   7264800   504.5033   504.550 0.01486542
> skymaps[1:5,]
            ul      ul_circ       ul_avg      snr snr_frequency        ra       dec stage idx
1 1.575584e-25 5.762704e-26 1.269117e-25 20.15175      500.0304 3.1415927 -1.570796    02   1
2 1.575584e-25 5.949139e-26 1.328293e-25 21.54693      500.0301 0.2243995 -1.557815    02   1
3 1.575584e-25 5.949139e-26 1.328293e-25 21.54693      500.0301 0.0000000 -1.544833    02   1
4 1.575584e-25 5.949139e-26 1.328293e-25 21.54693      500.0301 0.2855993 -1.544833    02   1
5 1.575584e-25 5.949139e-26 1.328293e-25 21.54693      500.0301 0.5711986 -1.544833    02   1
These two tables share a column "idx" that ties corresponding records together. Thus all skymaps entries shown above have idx=1 and, reading line 1 of the parameters table, the frequency band for these entries spans 500.0033 Hz to 500.0500 Hz.
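The idx relationship can be sketched as a simple keyed lookup. The example below is only an illustration on the sample rows printed above, held in plain Python containers; the function name attach_band is hypothetical, and a real analysis would read the columns from the MVL file (for example with RMVL).

```python
# parameters sample rows: idx -> (band_start, band_stop), frequencies in Hz
parameters = {
    1: (500.0033, 500.050),
    2: (500.0033, 500.050),
    3: (500.0483, 500.095),
    4: (500.4533, 500.500),
    5: (504.5033, 504.550),
}

# skymaps sample rows: (ul, ra, dec, idx), angles in radians
skymaps = [
    (1.575584e-25, 3.1415927, -1.570796, 1),
    (1.575584e-25, 0.2243995, -1.557815, 1),
    (1.575584e-25, 0.0000000, -1.544833, 1),
    (1.575584e-25, 0.2855993, -1.544833, 1),
    (1.575584e-25, 0.5711986, -1.544833, 1),
]


def attach_band(skymap_rows, parameter_rows):
    """Add band_start and band_stop to each skymaps row via its idx."""
    joined = []
    for ul, ra, dec, idx in skymap_rows:
        band_start, band_stop = parameter_rows[idx]
        joined.append({"ul": ul, "ra": ra, "dec": dec, "idx": idx,
                       "band_start": band_start, "band_stop": band_stop})
    return joined


joined = attach_band(skymaps, parameters)
for row in joined:
    print(row["ra"], row["dec"], row["band_start"], row["band_stop"])
```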
"parameters" table columns description
label
- internal string labelling the parameter record
idx
- index identifying corresponding records of the parameters and skymaps tables
first_bin
- starting bin of the analyzed band, in units of 1/14400 Hz
band_start
- first valid frequency of the analyzed band
band_stop
- first frequency beyond the analyzed band
resolution
- internal parameter, in radians, that controls the spacing of the sky grid
"skymaps" table columns description
ul
- worst-case 95% confidence level upper limit
ul_circ
- worst-case 95% confidence level upper limit on circularly polarized signals
ul_avg
- population average proxy. PLEASE READ WARNING BELOW
snr
- maximum signal-to-noise ratio in the analyzed band
snr_frequency
- frequency at which the SNR maximum was achieved
ra
- Right Ascension in radians, J2000
dec
- Declination in radians, J2000
stage
- which stage analyzed the data (either 1 or 2)
idx
- index identifying corresponding records of the parameters and skymaps tables
WARNING: the column ul_avg stores values of the population average proxy computed using the same formula as prior Falcon searches. We have tested that a population of signals injected with h0=ul_avg is recovered 95% of the time in clean data, and 90% of the time in highly contaminated data, such as the region with violin modes. However, this testing was only performed for population average proxy values computed as the maximum over the entire sky. Testing at the granularity of the atlas requires a large injection campaign, which has not been performed.
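Because first_bin is expressed in units of 1/14400 Hz, converting it to a frequency is a single division. The short sketch below applies this to the first_bin values from the sample parameters rows; the function name first_bin_to_hz is hypothetical.

```python
BINS_PER_HZ = 14400  # first_bin is given in units of 1/14400 Hz


def first_bin_to_hz(first_bin):
    """Convert a first_bin value to a frequency in Hz."""
    return first_bin / BINS_PER_HZ


# first_bin values taken from the sample parameters rows
for first_bin in (7200000, 7200648, 7206480, 7264800):
    print(first_bin, first_bin_to_hz(first_bin))
```

For example, first_bin=7200000 corresponds to 500 Hz.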
Each entry in the parameters table corresponds to a region of the parameter space that was analyzed on its own. The regions are created by selecting one of the overlapping frequency bands and one of 10 slices of the sky.
The atlas data is constructed so that, if you are interested in a particular sky location and frequency, you need to find the entry in the skymaps table that is closest in spherical distance and whose frequency band covers the frequency of interest. No corrections are needed for sky position mismatch, as it has already been accounted for.
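The lookup described above can be sketched as follows. This is a minimal illustration on the sample rows from this document, not the procedure used to build the atlas: it assumes the band is the half-open interval from band_start up to (but excluding) band_stop, uses the haversine formula for spherical distance, and scans a small in-memory list. The function names angular_distance and lookup are hypothetical.

```python
import math

# parameters sample rows: (idx, band_start, band_stop), frequencies in Hz
PARAMETERS = [
    (1, 500.0033, 500.050),
    (2, 500.0033, 500.050),
    (3, 500.0483, 500.095),
]

# skymaps sample rows: (ul, ra, dec, idx), angles in radians
SKYMAPS = [
    (1.575584e-25, 3.1415927, -1.570796, 1),
    (1.575584e-25, 0.2243995, -1.557815, 1),
    (1.575584e-25, 0.0000000, -1.544833, 1),
    (1.575584e-25, 0.2855993, -1.544833, 1),
    (1.575584e-25, 0.5711986, -1.544833, 1),
]


def angular_distance(ra1, dec1, ra2, dec2):
    """Great-circle separation in radians (haversine formula)."""
    sin_ddec = math.sin((dec2 - dec1) / 2.0)
    sin_dra = math.sin((ra2 - ra1) / 2.0)
    h = sin_ddec ** 2 + math.cos(dec1) * math.cos(dec2) * sin_dra ** 2
    return 2.0 * math.asin(min(1.0, math.sqrt(h)))


def lookup(frequency, ra, dec, parameters=PARAMETERS, skymaps=SKYMAPS):
    """Return the closest skymaps row whose band covers frequency, or None."""
    # idx values of all bands that contain the frequency of interest
    covering = {idx for idx, start, stop in parameters
                if start <= frequency < stop}
    candidates = [row for row in skymaps if row[3] in covering]
    if not candidates:
        return None
    return min(candidates,
               key=lambda row: angular_distance(ra, dec, row[1], row[2]))


best = lookup(500.03, 0.28, -1.545)
print(best)
```

Since the frequency bands overlap, several bands may cover the same frequency. This sketch returns only the single closest row among all of them; depending on the use case one may prefer to inspect every covering band.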
   path                    type              length vector_offset
1  /parameters             data.frame             6 0x0000000000685e40
2  /parameters/label       string_vector     111150 0x00000000001db6e0
3  /parameters/idx         double            111150 0x00000000002b49c0
4  /parameters/first_bin   int32             111150 0x000000000038dca0
5  /parameters/band_start  double            111150 0x00000000003fa6c0
6  /parameters/band_stop   double            111150 0x00000000004d39a0
7  /parameters/resolution  double            111150 0x00000000005acc80
8  /skymaps                data.frame             9 0x00000011c2253b60
9  /skymaps/ul             float         1860117611 0x0000000000685ec0
10 /skymaps/ul_circ        float         1860117611 0x00000001bbe4f0c0
11 /skymaps/ul_avg         float         1860117611 0x00000003776182c0
12 /skymaps/snr            float         1860117611 0x0000000532de14c0
13 /skymaps/snr_frequency  double        1860117611 0x00000006ee5aa6c0
14 /skymaps/ra             float         1860117611 0x0000000a6553ca60
15 /skymaps/dec            float         1860117611 0x0000000c20d05c60
16 /skymaps/stage          uint8         1860117611 0x0000000ddc4cee60
17 /skymaps/idx            double        1860117611 0x0000000e4b2c1320
Vector types:
data.frame
- 64-bit unsigned integer (LIBMVL_OFFSET64)
string_vector
- strings stored using the libMVL LIBMVL_PACKED_LIST64 type, see functions mvl_packed_list_get_entry and mvl_packed_list_get_entry_bytelength, or use RMVL
float
- 4-byte floating point number
double
- 8-byte floating point number
uint8
- 1-byte unsigned integer
int32
- 4-byte signed integer
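Given a vector_offset and a type from the table above, a slice of a numeric vector can be read without libMVL by seeking past the 64-byte header. The sketch below is an illustration under stated assumptions: little-endian data as described for this release, the element sizes listed under "Vector types", and a hypothetical helper read_vector_slice. It is exercised on a synthetic in-memory buffer, not on the actual atlas file, and it does not handle string_vector or data.frame entries.

```python
import io
import struct

HEADER_SIZE = 64  # size of the vector header described in this document

# struct codes for the numeric element types listed above
ELEMENT_CODES = {"uint8": "B", "int32": "i", "float": "f", "double": "d"}


def read_vector_slice(stream, vector_offset, element_type, start=0, count=None):
    """Read count elements beginning at index start from one vector."""
    code = ELEMENT_CODES[element_type]
    item_size = struct.calcsize("<" + code)
    stream.seek(vector_offset)
    # The first header field is the 64-bit element count.
    (length,) = struct.unpack("<Q", stream.read(8))
    if count is None:
        count = length - start
    if start < 0 or count < 0 or start + count > length:
        raise IndexError("requested slice lies outside the vector")
    # DATA begins immediately after the header.
    stream.seek(vector_offset + HEADER_SIZE + start * item_size)
    raw = stream.read(count * item_size)
    return list(struct.unpack("<%d%s" % (count, code), raw))


# Synthetic stand-in for a file: 16 filler bytes, one header, four floats.
values = [1.5, 2.5, 3.5, 4.5]
fake_header = struct.pack("<Qi11iQ", len(values), 0, *([0] * 11), 0)
fake_file = io.BytesIO(b"\0" * 16 + fake_header + struct.pack("<4f", *values))

print(read_vector_slice(fake_file, 16, "float", start=1, count=2))
```

With the released file opened in binary mode, passing vector_offset=0x0000000000685ec0 and element_type "float" would address /skymaps/ul. Reading in slices matters here because the skymaps columns hold 1860117611 elements each.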