# Finding GBM Data

A natural question is: "Where do I find the data I need?" You're in luck, because this tutorial shows you how to find it. GBM data are hosted publicly on the HEASARC FTP server via the Fermi Science Support Center, and they are stored in a consistent directory structure. Instead of making you navigate a winding maze of FTP directories, we provide a couple of classes built to retrieve the data you want. First, you need to decide if you want trigger data (say, from a GRB) or continuous data. Let's start with trigger data, and assume you know the trigger number you're interested in:

In [1]:
# the datafinder class for triggers
from gbm.finder import TriggerFtp

# initialize the Trigger data finder with a trigger number
trig_finder = TriggerFtp('190114873')
trig_finder.num_files

122

We don't really care about the directory structure; we just want the data, and this quickly gets us to the directory we need. There are 122 files associated with this trigger. Say we want CSPEC data. Is CSPEC available?

In [2]:
trig_finder.ls_cspec()

['glg_cspec_b0_bn190114873_v00.pha',
 'glg_cspec_b1_bn190114873_v00.pha',
 'glg_cspec_n0_bn190114873_v00.pha',
 'glg_cspec_n1_bn190114873_v00.pha',
 'glg_cspec_n2_bn190114873_v00.pha',
 'glg_cspec_n3_bn190114873_v00.pha',
 'glg_cspec_n4_bn190114873_v00.pha',
 'glg_cspec_n5_bn190114873_v00.pha',
 'glg_cspec_n6_bn190114873_v00.pha',
 'glg_cspec_n7_bn190114873_v00.pha',
 'glg_cspec_n8_bn190114873_v00.pha',
 'glg_cspec_n9_bn190114873_v00.pha',
 'glg_cspec_na_bn190114873_v00.pha',
 'glg_cspec_nb_bn190114873_v00.pha']

Great! There's a full complement of CSPEC data: one file for each of the 12 NaI detectors (n0 through nb) and the 2 BGO detectors (b0 and b1). How about responses for the CSPEC data?

In [3]:
trig_finder.ls_rsp(cspec=True, ctime=False)

['glg_cspec_b0_bn190114873_v02.rsp',
 'glg_cspec_b1_bn190114873_v02.rsp',
 'glg_cspec_n0_bn190114873_v02.rsp',
 'glg_cspec_n1_bn190114873_v02.rsp',
 'glg_cspec_n2_bn190114873_v02.rsp',
 'glg_cspec_n3_bn190114873_v02.rsp',
 'glg_cspec_n4_bn190114873_v02.rsp',
 'glg_cspec_n5_bn190114873_v02.rsp',
 'glg_cspec_n6_bn190114873_v02.rsp',
 'glg_cspec_n7_bn190114873_v02.rsp',
 'glg_cspec_n8_bn190114873_v02.rsp',
 'glg_cspec_n9_bn190114873_v02.rsp',
 'glg_cspec_na_bn190114873_v02.rsp',
 'glg_cspec_nb_bn190114873_v02.rsp']

Again, we can list all of the relevant files.  Are there any quicklook lightcurve plots?

In [4]:
trig_finder.ls_lightcurve()

['glg_lc_chan12_bn190114873_v00.pdf',
 'glg_lc_chan34_bn190114873_v00.pdf',
 'glg_lc_chan567_bn190114873_v00.pdf',
 'glg_lc_chantot_bn190114873_v00.pdf',
 'glg_lc_tot_bn190114873_v00.pdf']
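
The file names listed so far follow a regular pattern, so they are easy to take apart when you need to sort or match them (for example, pairing a CSPEC file with its response by detector). The sketch below is a minimal example that assumes the pattern inferred from the listings above; the regular expression and the helper name `parse_trigger_filename` are illustrative and not part of the Data Tools.

```python
import re

# Pattern inferred from the listings above (an assumption, not an official spec):
# glg_<type>_<detector or channel group>_bn<trigger number>_v<version>.<extension>
TRIGGER_FILE = re.compile(
    r"^glg_(?P<dtype>[a-z]+)_(?P<det>[a-z0-9]+)"
    r"_bn(?P<trigger>\d{9})_v(?P<version>\d{2})\.(?P<ext>\w+)$"
)

def parse_trigger_filename(name):
    """Hypothetical helper: split a trigger file name into labelled parts."""
    match = TRIGGER_FILE.match(name)
    if match is None:
        raise ValueError(f"unrecognized file name: {name}")
    return match.groupdict()

print(parse_trigger_filename("glg_cspec_n3_bn190114873_v00.pha"))
# {'dtype': 'cspec', 'det': 'n3', 'trigger': '190114873', 'version': '00', 'ext': 'pha'}
```

Files that do not fit the assumed pattern raise a `ValueError` instead of being silently misparsed.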

What if we want to move on to another trigger?  You don't have to create a new `TriggerFtp` object; you can just call `set_trigger()`:

In [5]:
# change trigger
trig_finder.set_trigger('170817529')
trig_finder.num_files

128
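
The trigger numbers used above also tell you roughly when each trigger happened: the first six digits are the UTC date (`yymmdd`) and the last three are the fraction of the day in thousandths. The sketch below decodes that convention with the standard library; the helper name `trigger_number_to_utc` is hypothetical and not part of the Data Tools, and the result is only good to about a minute and a half, since one thousandth of a day is 86.4 s.

```python
from datetime import datetime, timedelta

def trigger_number_to_utc(trigger):
    """Hypothetical helper: approximate UTC for a 'yymmddfff' trigger number."""
    day = datetime.strptime(trigger[:6], "%y%m%d")   # calendar date at 00:00 UTC
    fraction = int(trigger[6:]) / 1000.0             # thousandths of a day
    return day + timedelta(days=fraction)

# GRB 170817A: lands at about 12:41 UTC on 2017-08-17
print(trigger_number_to_utc("170817529"))
```

Treat the decoded time as a rough guide only; the precise trigger time is stored in the catalog and in the data files.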

Of course, you don't just want to list the files in a directory; you want to download them.  Let's download all the catalog files for GRB 170817A:

In [6]:
trig_finder.get_cat_files('./')



Now we want some continuous data.  There aren't any trigger numbers for continuous data. Continuous CTIME and CSPEC are available in files that cover a whole day (in UTC), and TTE are offered in hourly files.  To find the data you need, instead of a trigger number, you need to specify a time:

In [7]:
# the datafinder class for continuous data
from gbm.finder import ContinuousFtp

# initialize the continuous data finder with a time (Fermi MET, UTC, or GPS)
cont_finder = ContinuousFtp(met=587683338.0)
cont_finder.num_files

379

That's a whole lotta files in this directory. Most of them are TTE: since the end of 2012, there is one TTE file per hour for each detector.  Let's just list the CTIME that's available:

In [8]:
# list ctime data covering this time
cont_finder.ls_ctime()

['glg_ctime_b0_190816_v00.pha',
 'glg_ctime_b1_190816_v00.pha',
 'glg_ctime_n0_190816_v00.pha',
 'glg_ctime_n1_190816_v00.pha',
 'glg_ctime_n2_190816_v00.pha',
 'glg_ctime_n3_190816_v00.pha',
 'glg_ctime_n4_190816_v00.pha',
 'glg_ctime_n5_190816_v00.pha',
 'glg_ctime_n6_190816_v00.pha',
 'glg_ctime_n7_190816_v00.pha',
 'glg_ctime_n8_190816_v00.pha',
 'glg_ctime_n9_190816_v00.pha',
 'glg_ctime_na_190816_v00.pha',
 'glg_ctime_nb_190816_v00.pha']

Now let's list the available TTE for this time.  This will only list the TTE files in the directory that cover the relevant time:

In [9]:
# list hourly TTE data covering this time
cont_finder.ls_tte()

['glg_tte_b0_190816_21z_v00.fit.gz',
 'glg_tte_b1_190816_21z_v00.fit.gz',
 'glg_tte_n0_190816_21z_v00.fit.gz',
 'glg_tte_n1_190816_21z_v00.fit.gz',
 'glg_tte_n2_190816_21z_v00.fit.gz',
 'glg_tte_n3_190816_21z_v00.fit.gz',
 'glg_tte_n4_190816_21z_v00.fit.gz',
 'glg_tte_n5_190816_21z_v00.fit.gz',
 'glg_tte_n6_190816_21z_v00.fit.gz',
 'glg_tte_n7_190816_21z_v00.fit.gz',
 'glg_tte_n8_190816_21z_v00.fit.gz',
 'glg_tte_n9_190816_21z_v00.fit.gz',
 'glg_tte_na_190816_21z_v00.fit.gz',
 'glg_tte_nb_190816_21z_v00.fit.gz']
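
The daily and hourly file names above carry the UTC date (`190816`) and hour (`21z`) that contain the requested MET. The sketch below reproduces those stamps by hand. It assumes that Fermi MET counts seconds from 2001-01-01 00:00:00 UTC and includes leap seconds; the epoch, the leap-second list, and the helper names are additions for illustration and are not taken from the Data Tools. Times within a second or so of a leap second are not handled exactly.

```python
from datetime import datetime, timedelta

MET_EPOCH = datetime(2001, 1, 1)  # assumed MET zero point (UTC)

# First UTC instant after each leap second inserted since the epoch
LEAP_SECONDS = [datetime(2006, 1, 1), datetime(2009, 1, 1), datetime(2012, 7, 1),
                datetime(2015, 7, 1), datetime(2017, 1, 1)]

def met_to_utc(met):
    """Hypothetical helper: convert Fermi MET (seconds) to a UTC datetime."""
    naive = MET_EPOCH + timedelta(seconds=met)
    leaps = sum(1 for t in LEAP_SECONDS if naive >= t)
    return naive - timedelta(seconds=leaps)

def daily_stamp(met):
    """Date stamp as it appears in the daily CTIME/CSPEC file names."""
    return met_to_utc(met).strftime("%y%m%d")

def hourly_stamp(met):
    """Date and hour stamp as it appears in the hourly TTE file names."""
    return met_to_utc(met).strftime("%y%m%d_%Hz")

print(met_to_utc(587683338.0))    # 2019-08-16 21:22:13
print(daily_stamp(587683338.0))   # 190816
print(hourly_stamp(587683338.0))  # 190816_21z
```

In practice the finder does this lookup for you; the sketch is only meant to show why a single time maps to one daily file and one hourly file per detector.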

Similar to the trigger finder, you can use the same object to search at different times:

In [10]:
# change the time of interest
cont_finder.set_time(utc='2017-08-17T12:41:06.47')

Now how about downloading the position history file for this time:

In [11]:
cont_finder.get_poshist('./')



---
## Searching the GBM Catalogs

The HEASARC also hosts two catalogs that are of interest here: a Trigger Catalog that contains information about every GBM trigger, and a Burst Catalog that contains standard analysis of every triggered GRB.  HEASARC provides a way to search these catalogs online through their Browse interface, but we offer a way to do it in Python through the Data Tools.

Let's look at the trigger catalog first:

In [12]:
from gbm.finder import TriggerCatalog

trigcat = TriggerCatalog()
trigcat.num_rows

Downloading Catalog from HEASARC via w3query.pl...
Finished in 9 s


7662

Depending on your connection, initialization may take a few seconds.  You can see what columns are available in the catalog:

In [13]:
trigcat.columns

array(['version', 'trigger_name', 'name', 'ra', 'dec', 'trigger_time',
       'trigger_type', 'reliability', 'adc_high', 'adc_low', 'bii',
       'channel_high', 'channel_low', 'dec_scx', 'dec_scz',
       'detector_mask', 'end_time', 'error_radius', 'geo_lat', 'geo_long',
       'lii', 'localization_source', 'phi', 'ra_scx', 'ra_scz', 'theta',
       'time', 'trigger_algorithm', 'trigger_timescale'], dtype='<U19')

You can also return the range of values for a given column:

In [14]:
# error_radius is the statistical localization radius in degrees
trigcat.column_range('error_radius')

(0.0, 93.54)

If you only care about specific columns in the table, you can return a numpy record array with only those columns.  Let's return a table with the trigger name and time for every trigger:

In [15]:
trigcat.get_table(columns=('trigger_name', 'trigger_time'))

rec.array([('bn120403857', '2012-04-03 20:33:58.493'),
           ('bn140912846', '2014-09-12 20:18:03.669'),
           ('bn120227725', '2012-02-27 17:24:41.054'), ...,
           ('bn090813174', '2009-08-13 04:10:42.593'),
           ('bn110201399', '2011-02-01 09:35:10.251'),
           ('bn150705660', '2015-07-05 15:50:18.845')],
          dtype=[('trigger_name', '<U23'), ('trigger_time', '<U23')])

Importantly, we can make slices of the catalog based on conditionals.  Let's only select triggers with localization radii between 1.1 and 10 degrees:

In [16]:
sliced_trigcat = trigcat.slice('error_radius', lo=1.1, hi=10.0)
sliced_trigcat.num_rows

2301

In [17]:
sliced_trigcat.get_table(columns=('trigger_name', 'trigger_time'))

rec.array([('bn120227725', '2012-02-27 17:24:41.054'),
           ('bn141205018', '2014-12-05 00:25:29.813'),
           ('bn170116238', '2017-01-16 05:43:15.259'), ...,
           ('bn180826785', '2018-08-26 18:50:50.345'),
           ('bn091012783', '2009-10-12 18:47:02.770'),
           ('bn180304259', '2018-03-04 06:12:47.267')],
          dtype=[('trigger_name', '<U23'), ('trigger_time', '<U23')])

You can also slice on multiple conditionals simultaneously.  Select everything that has a localization radius between 1.1 and 10 degrees *and* happened on or after January 1, 2019:

In [18]:
# perform a row slice based on multiple conditionals that can span more than one column
sliced_trigcat2 = trigcat.slices([('error_radius', 1.1, 10.0), ('trigger_time', '2019-01-01 00:00:00', None)])
sliced_trigcat2.num_rows

385
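
If it helps to picture what a multi-condition slice does, the same selection can be written as boolean masks on a numpy structured array. This is only a sketch of the idea, not necessarily how `slices()` is implemented: the rows are made up for illustration, the bounds are assumed to be inclusive, and the time condition relies on the fact that fixed-format timestamps sort correctly as plain strings.

```python
import numpy as np

# Made-up rows for illustration only; these are not real catalog entries
rows = [("bn000000001", "2018-12-31 23:59:59.000", 0.5),
        ("bn000000002", "2019-01-01 00:00:00.000", 1.1),
        ("bn000000003", "2019-03-10 08:15:30.250", 4.2),
        ("bn000000004", "2019-06-21 17:45:00.750", 12.0)]
dtype = [("trigger_name", "U23"), ("trigger_time", "U23"), ("error_radius", "f8")]
table = np.array(rows, dtype=dtype)

# One mask per condition; a bound of None would simply be left out
in_radius = (table["error_radius"] >= 1.1) & (table["error_radius"] <= 10.0)
recent = table["trigger_time"] >= "2019-01-01 00:00:00"

# A row is kept only if it satisfies every condition
selected = table[in_radius & recent]
print(selected["trigger_name"])   # ['bn000000002' 'bn000000003']
```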

In [19]:
sliced_trigcat2.get_table(columns=('trigger_name', 'trigger_time', 'error_radius'))

rec.array([('bn100328141', '2019-01-02 06:11:31.125', 4.82  ),
           ('bn160119072', '2019-01-17 08:50:43.596', 6.31  ),
           ('bn090403314', '2019-01-18 22:29:49.932', 9.7   ),
           ('bn110517902', '2019-01-19 05:59:03.575', 8.3   ),
           ('bn090529564', '2019-01-29 12:55:42.675', 1.5   ),
           ('bn160718975', '2019-01-31 02:36:33.938', 1.82  ),
           ('bn170614486', '2019-01-31 23:08:35.673', 1.86  ),
           ('bn110117364', '2019-02-01 06:03:28.818', 9.63  ),
           ('bn201008443', '2019-02-02 05:36:55.718', 3.33  ),
           ('bn140630505', '2019-02-02 15:13:12.939', 2.24  ),
           ('bn120118898', '2019-02-03 23:24:11.710', 7.17  ),
           ('bn110806043', '2019-02-04 15:02:50.914', 8.4833),
           ('bn080805496', '2019-02-05 22:31:15.658', 5.6   ),
           ('bn101223834', '2019-02-15 18:31:22.475', 4.34  ),
           ('bn121220311', '2019-02-17 04:31:26.137', 8.3   ),
           ('bn110710954', '2019-02-18 07:54:33.343', 3

You'll notice in the table listing that there are multiple datatypes.  This is an improvement over the scripts provided by HEASARC, because HEASARC provides no metadata to tell you the datatype of each column.  Our catalog classes have automatic type detection, so you don't have to worry about converting strings to ints or floats.
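
To give a feel for what automatic type detection involves, here is a minimal sketch that tries progressively looser conversions on a column of strings. It is an illustration of the general idea only and is not the catalog classes' actual implementation; the helper name `detect_type` is hypothetical.

```python
def detect_type(values):
    """Hypothetical helper: convert a column of strings to int, else float, else leave as str."""
    for caster in (int, float):
        try:
            # Succeeds only if every value in the column converts cleanly
            return [caster(v) for v in values]
        except ValueError:
            continue
    return list(values)

print(detect_type(["1", "2", "3"]))     # [1, 2, 3]
print(detect_type(["4.82", "6.31"]))    # [4.82, 6.31]
print(detect_type(["bn120227725"]))     # ['bn120227725']
```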

We can also connect to the burst catalog in the same way we connected to the trigger catalog:

In [20]:
from gbm.finder import BurstCatalog
burstcat = BurstCatalog()
burstcat.num_rows

Downloading Catalog from HEASARC via w3query.pl...
Finished in 61 s


3047

Again, this may take a while (about a minute in the run above), largely because of how the HEASARC Perl API works.  One word about the Burst Catalog before you get overwhelmed: it has *a lot* of columns.  It contains basically every parameter for every standard spectral model that is fit, for both a time-integrated spectrum and the spectrum at the peak flux.  There is also T90, T50, flux, and fluence information on different timescales and energy ranges.  All in all, there are ***306*** different columns.  Gasp.

In [21]:
burstcat.columns

array(['name', 'ra', 'dec', 'trigger_time', 't90', 't90_error',
       't90_start', 'fluence', 'fluence_error', 'flux_1024',
       'flux_1024_error', 'flux_1024_time', 'flux_64', 'flux_64_error',
       'flnc_band_ampl', 'flnc_band_ampl_pos_err',
       'flnc_band_ampl_neg_err', 'flnc_band_epeak',
       'flnc_band_epeak_pos_err', 'flnc_band_epeak_neg_err',
       'flnc_band_alpha', 'flnc_band_alpha_pos_err',
       'flnc_band_alpha_neg_err', 'flnc_band_beta',
       'flnc_band_beta_pos_err', 'flnc_band_beta_neg_err',
       'flnc_spectrum_start', 'flnc_spectrum_stop',
       'pflx_best_fitting_model', 'pflx_best_model_redchisq',
       'flnc_best_fitting_model', 'flnc_best_model_redchisq',
       'actual_1024ms_interval', 'actual_256ms_interval',
       'actual_64ms_interval', 'back_interval_high_start',
       'back_interval_high_stop', 'back_interval_low_start',
       'back_interval_low_stop', 'bcat_detector_mask', 'bcatalog', 'bii',
       'duration_energy_high', 'duration_energy

Good luck.  
Everything that we demoed for the trigger catalog can also be done with the Burst Catalog.
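
With this many columns, it can help to filter the column names before asking for a table. The sketch below works on a short list of names copied from the output above instead of a live catalog, and picks out the Band-function parameters of the fluence spectrum while skipping their error columns. The naming pattern (`flnc_band_` prefix, `_err` suffix) is inferred from that listing.

```python
# A few column names copied from the burst catalog listing above
columns = ['t90', 't90_error', 'flux_1024', 'flux_1024_error',
           'flnc_band_ampl', 'flnc_band_ampl_pos_err', 'flnc_band_ampl_neg_err',
           'flnc_band_epeak', 'flnc_band_epeak_pos_err', 'flnc_band_epeak_neg_err',
           'flnc_band_alpha', 'flnc_band_alpha_pos_err', 'flnc_band_alpha_neg_err',
           'flnc_band_beta', 'flnc_band_beta_pos_err', 'flnc_band_beta_neg_err',
           'pflx_best_fitting_model', 'flnc_best_fitting_model']

# Keep the Band parameters for the fluence spectrum, without the error columns
band_params = [c for c in columns
               if c.startswith('flnc_band_') and not c.endswith('_err')]
print(band_params)
# ['flnc_band_ampl', 'flnc_band_epeak', 'flnc_band_alpha', 'flnc_band_beta']
```

With a live catalog, the same filter could be applied to `burstcat.columns`, and the resulting names (plus `'name'`) passed as a tuple to `get_table(columns=...)`.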

[Next](./PhaExport.ipynb), we will put together some of what we've learned to reduce some GBM data and export it.