pycbc.workflow package

Submodules

pycbc.workflow.coincidence module

This module is responsible for setting up the coincidence stage of pycbc workflows. For details about this module and its capabilities see here: https://ldas-jobs.ligo.caltech.edu/~cbc/docs/pycbc/coincidence.html

class pycbc.workflow.coincidence.CensorForeground(cp, name, universe=None, ifos=None, out_dir=None, tags=None)[source]

Bases: pycbc.workflow.core.Executable

current_retention_level = 3
class pycbc.workflow.coincidence.MergeExecutable(cp, name, universe=None, ifos=None, out_dir=None, tags=None)[source]

Bases: pycbc.workflow.core.Executable

current_retention_level = 3
class pycbc.workflow.coincidence.PyCBCBank2HDFExecutable(cp, name, universe=None, ifos=None, out_dir=None, tags=None)[source]

Bases: pycbc.workflow.core.Executable

Converts xml tmpltbank to hdf format

create_node(bank_file)[source]

Default node constructor.

This is usually overridden by subclasses of Executable.

current_retention_level = 3
class pycbc.workflow.coincidence.PyCBCCombineStatmap(cp, name, universe=None, ifos=None, out_dir=None, tags=None)[source]

Bases: pycbc.workflow.core.Executable

Combine coincs over different bins and apply trials factor

create_node(statmap_files, tags=None)[source]

Default node constructor.

This is usually overridden by subclasses of Executable.

current_retention_level = 3
class pycbc.workflow.coincidence.PyCBCDistributeBackgroundBins(cp, name, universe=None, ifos=None, out_dir=None, tags=None)[source]

Bases: pycbc.workflow.core.Executable

Distribute coinc files among different background bins

create_node(coinc_files, bank_file, background_bins, tags=None)[source]

Default node constructor.

This is usually overridden by subclasses of Executable.

current_retention_level = 2
class pycbc.workflow.coincidence.PyCBCFindCoincExecutable(cp, name, universe=None, ifos=None, out_dir=None, tags=None)[source]

Bases: pycbc.workflow.core.Executable

Find coinc triggers using a folded interval method

create_node(trig_files, bank_file, stat_files, veto_file, veto_name, template_str, tags=None)[source]

Default node constructor.

This is usually overridden by subclasses of Executable.

current_retention_level = 2
file_input_options = ['--statistic-files']
class pycbc.workflow.coincidence.PyCBCFindMultiifoCoincExecutable(cp, name, universe=None, ifos=None, out_dir=None, tags=None)[source]

Bases: pycbc.workflow.core.Executable

Find coinc triggers using a folded interval method

create_node(trig_files, bank_file, stat_files, veto_file, veto_name, template_str, pivot_ifo, fixed_ifo, tags=None)[source]

Default node constructor.

This is usually overridden by subclasses of Executable.

current_retention_level = 2
file_input_options = ['--statistic-files']
class pycbc.workflow.coincidence.PyCBCFitByTemplateExecutable(cp, name, universe=None, ifos=None, out_dir=None, tags=None)[source]

Bases: pycbc.workflow.core.Executable

Calculates values that describe the background distribution template by template

create_node(trig_file, bank_file, veto_file, veto_name)[source]

Default node constructor.

This is usually overridden by subclasses of Executable.

current_retention_level = 3
class pycbc.workflow.coincidence.PyCBCFitOverParamExecutable(cp, name, universe=None, ifos=None, out_dir=None, tags=None)[source]

Bases: pycbc.workflow.core.Executable

Smooths the background distribution parameters over a continuous parameter

create_node(raw_fit_file, bank_file)[source]

Default node constructor.

This is usually overridden by subclasses of Executable.

current_retention_level = 3
class pycbc.workflow.coincidence.PyCBCHDFInjFindExecutable(cp, name, universe=None, ifos=None, out_dir=None, tags=None)[source]

Bases: pycbc.workflow.core.Executable

Find injections in the hdf files output

create_node(inj_coinc_file, inj_xml_file, veto_file, veto_name, tags=None)[source]

Default node constructor.

This is usually overridden by subclasses of Executable.

current_retention_level = 3
class pycbc.workflow.coincidence.PyCBCMultiifoAddStatmap(cp, name, universe=None, ifos=None, out_dir=None, tags=None)[source]

Bases: pycbc.workflow.coincidence.PyCBCMultiifoCombineStatmap

Combine statmap files and add FARs over different coinc types

create_node(statmap_files, background_files, tags=None)[source]

Default node constructor.

This is usually overridden by subclasses of Executable.

current_retention_level = 3
class pycbc.workflow.coincidence.PyCBCMultiifoCombineStatmap(cp, name, universe=None, ifos=None, out_dir=None, tags=None)[source]

Bases: pycbc.workflow.coincidence.PyCBCCombineStatmap

Combine coincs over different coinc types and apply trials factor

current_retention_level = 3
class pycbc.workflow.coincidence.PyCBCMultiifoExcludeZerolag(cp, name, universe=None, ifos=None, out_dir=None, tags=None)[source]

Bases: pycbc.workflow.core.Executable

Remove times of zerolag coincidences of all types from exclusive background

create_node(statmap_file, other_statmap_files, tags=None)[source]

Default node constructor.

This is usually overridden by subclasses of Executable.

current_retention_level = 3
class pycbc.workflow.coincidence.PyCBCMultiifoStatMapExecutable(cp, name, universe=None, ifos=None, out_dir=None, tags=None)[source]

Bases: pycbc.workflow.core.Executable

Calculate FAP, IFAR, etc

create_node(coinc_files, ifos, tags=None)[source]

Default node constructor.

This is usually overridden by subclasses of Executable.

current_retention_level = 3
class pycbc.workflow.coincidence.PyCBCMultiifoStatMapInjExecutable(cp, name, universe=None, ifos=None, out_dir=None, tags=None)[source]

Bases: pycbc.workflow.core.Executable

Calculate FAP, IFAR, etc

create_node(zerolag, full_data, injfull, fullinj, ifos, tags=None)[source]

Default node constructor.

This is usually overridden by subclasses of Executable.

current_retention_level = 3
class pycbc.workflow.coincidence.PyCBCStatMapExecutable(cp, name, universe=None, ifos=None, out_dir=None, tags=None)[source]

Bases: pycbc.workflow.core.Executable

Calculate FAP, IFAR, etc

create_node(coinc_files, tags=None)[source]

Default node constructor.

This is usually overridden by subclasses of Executable.

current_retention_level = 3
class pycbc.workflow.coincidence.PyCBCStatMapInjExecutable(cp, name, universe=None, ifos=None, out_dir=None, tags=None)[source]

Bases: pycbc.workflow.core.Executable

Calculate FAP, IFAR, etc for injections

create_node(zerolag, full_data, injfull, fullinj, tags=None)[source]

Default node constructor.

This is usually overridden by subclasses of Executable.

current_retention_level = 3
class pycbc.workflow.coincidence.PyCBCTrig2HDFExecutable(cp, name, universe=None, ifos=None, out_dir=None, tags=None)[source]

Bases: pycbc.workflow.core.Executable

Converts xml triggers to hdf format, grouped by template hash

create_node(trig_files, bank_file)[source]

Default node constructor.

This is usually overridden by subclasses of Executable.

current_retention_level = 3
pycbc.workflow.coincidence.convert_bank_to_hdf(workflow, xmlbank, out_dir, tags=None)[source]

Return the template bank in hdf format

pycbc.workflow.coincidence.convert_trig_to_hdf(workflow, hdfbank, xml_trigger_files, out_dir, tags=None)[source]

Return the list of hdf5 trigger file outputs

pycbc.workflow.coincidence.find_injections_in_hdf_coinc(workflow, inj_coinc_file, inj_xml_file, veto_file, veto_name, out_dir, tags=None)[source]
pycbc.workflow.coincidence.get_ordered_ifo_list(ifocomb, ifo_ids)[source]

This function sorts the combination of ifos (ifocomb) based on the given precedence list (the ifo_ids dictionary) and returns the first ifo as pivot, the second ifo as fixed, and the ordered list joined as a string.
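
For illustration, a minimal sketch of a call follows; the precedence dictionary and its ordering convention here are hypothetical, since in a real workflow they come from the configuration:

from pycbc.workflow.coincidence import get_ordered_ifo_list

# Hypothetical precedence mapping: lower value = higher precedence.
ifo_ids = {'H1': 0, 'L1': 1, 'V1': 2}
pivot, fixed, ordered = get_ordered_ifo_list(['L1', 'H1'], ifo_ids)
# pivot is the first ifo in precedence order, fixed is the second,
# and ordered is the sorted combination joined as a string, e.g. 'H1L1'.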

pycbc.workflow.coincidence.make_foreground_censored_veto(workflow, bg_file, veto_file, veto_name, censored_name, out_dir, tags=None)[source]
pycbc.workflow.coincidence.merge_single_detector_hdf_files(workflow, bank_file, trigger_files, out_dir, tags=None)[source]
pycbc.workflow.coincidence.rerank_coinc_followup(workflow, statmap_file, bank_file, out_dir, tags, injection_file=None, ranking_file=None)[source]
pycbc.workflow.coincidence.select_files_by_ifo_combination(ifocomb, insps)[source]

This function selects single-detector files (‘insps’) for a given ifo combination

pycbc.workflow.coincidence.setup_background_bins(workflow, coinc_files, bank_file, out_dir, tags=None)[source]
pycbc.workflow.coincidence.setup_background_bins_inj(workflow, coinc_files, background_file, bank_file, out_dir, tags=None)[source]
pycbc.workflow.coincidence.setup_interval_coinc(workflow, hdfbank, trig_files, stat_files, veto_files, veto_names, out_dir, tags=None)[source]

This function sets up exact match coincidence and background estimation using a folded interval technique.

pycbc.workflow.coincidence.setup_interval_coinc_inj(workflow, hdfbank, full_data_trig_files, inj_trig_files, stat_files, background_file, veto_file, veto_name, out_dir, tags=None)[source]

This function sets up exact match coincidence and background estimation using a folded interval technique.

pycbc.workflow.coincidence.setup_multiifo_combine_statmap(workflow, final_bg_file_list, bg_file_list, out_dir, tags=None)[source]

Combine the multiifo statmap files into one background file

pycbc.workflow.coincidence.setup_multiifo_exclude_zerolag(workflow, statmap_file, other_statmap_files, out_dir, ifos, tags=None)[source]

Exclude single triggers close to zerolag triggers from forming any background events

pycbc.workflow.coincidence.setup_multiifo_interval_coinc(workflow, hdfbank, trig_files, stat_files, veto_files, veto_names, out_dir, pivot_ifo, fixed_ifo, tags=None)[source]

This function sets up exact match multiifo coincidence

pycbc.workflow.coincidence.setup_multiifo_interval_coinc_inj(workflow, hdfbank, full_data_trig_files, inj_trig_files, stat_files, background_file, veto_file, veto_name, out_dir, pivot_ifo, fixed_ifo, tags=None)[source]

This function sets up exact match multiifo coincidence for injections

pycbc.workflow.coincidence.setup_multiifo_statmap(workflow, ifos, coinc_files, out_dir, tags=None)[source]
pycbc.workflow.coincidence.setup_multiifo_statmap_inj(workflow, ifos, coinc_files, background_file, out_dir, tags=None)[source]
pycbc.workflow.coincidence.setup_simple_statmap(workflow, coinc_files, out_dir, tags=None)[source]
pycbc.workflow.coincidence.setup_simple_statmap_inj(workflow, coinc_files, background_file, out_dir, tags=None)[source]
pycbc.workflow.coincidence.setup_statmap(workflow, coinc_files, bank_file, out_dir, tags=None)[source]
pycbc.workflow.coincidence.setup_statmap_inj(workflow, coinc_files, background_file, bank_file, out_dir, tags=None)[source]
pycbc.workflow.coincidence.setup_trigger_fitting(workflow, insps, hdfbank, veto_file, veto_name)[source]

pycbc.workflow.configparser_test module

pycbc.workflow.configparser_test.add_options_to_section(cp, section, items, preserve_orig_file=False, overwrite_options=False)[source]

Add a set of options and values to a section of a ConfigParser object. Will throw an error if any of the options being added already exist; this behaviour can be overridden if desired.

Parameters:
  • cp (The ConfigParser class) –
  • section (string) – The name of the section to add options+values to
  • items (list of tuples) – Each tuple contains (at [0]) the option and (at [1]) the value to add to the section of the ini file
  • preserve_orig_file (Boolean, optional) – By default the input ConfigParser object will be modified in place. If this is set deepcopy will be used and the input will be preserved. Default = False
  • overwrite_options (Boolean, optional) – By default this function will throw a ValueError if an option exists both in the original section of the ConfigParser and in the provided items. If this is set to True, the options+values given in items will replace the original values. Default = False
Returns:

cp

Return type:

The ConfigParser class
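
A minimal usage sketch, assuming cp is an existing ConfigParser instance; the section and option names below are hypothetical:

from pycbc.workflow.configparser_test import add_options_to_section

# Add two options to [inspiral], replacing any existing values.
cp = add_options_to_section(cp, 'inspiral',
                            [('segment-length', '256'),
                             ('low-frequency-cutoff', '30')],
                            overwrite_options=True)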

pycbc.workflow.configparser_test.check_duplicate_options(cp, section1, section2, raise_error=False)[source]

Check for duplicate options in two sections, section1 and section2. Will return a list of the duplicate options.

Parameters:
  • cp (The ConfigParser class) –
  • section1 (string) – The name of the first section to compare
  • section2 (string) – The name of the second section to compare
  • raise_error (Boolean, optional) – If True, raise an error if duplicates are present. Default = False
Returns:

duplicate – List of duplicate options

Return type:

List

pycbc.workflow.configparser_test.interpolate_string(testString, cp, section)[source]

Take a string and replace all examples of ExtendedInterpolation formatting within the string with the exact value.

For values like ${example} this is replaced with the value that corresponds to the option called example *in the same section*

For values like ${common|example} this is replaced with the value that corresponds to the option example in the section [common]. Note that the python3 config parser uses ${common:example}, but python2.7 interprets the : in the same way as an =, which breaks things.

Nested interpolation is not supported here.

Parameters:
  • testString (String) – The string to parse and interpolate
  • cp (ConfigParser) – The ConfigParser object to look for the interpolation strings within
  • section (String) – The current section of the ConfigParser object
Returns:

testString – Interpolated string

Return type:

String
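
A sketch of both interpolation forms described above, assuming cp is an existing ConfigParser; the section and option names are hypothetical:

from pycbc.workflow.configparser_test import interpolate_string

# Assume cp contains [common] with gps-start = 1126051217, and that
# [inspiral] also defines an option called example.
same_section = interpolate_string('${example}', cp, 'inspiral')
cross_section = interpolate_string('${common|gps-start}', cp, 'inspiral')
# cross_section is now '1126051217'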

pycbc.workflow.configparser_test.parse_workflow_ini_file(cpFile, parsed_filepath=None)[source]

Read a .ini file in, parse it as described in the documentation linked to above, and return the parsed ini file.

Parameters:
  • cpFile (The path to a .ini file to be read in) –
  • parsed_filepath (path, optional) – If provided, the .ini file, after parsing, will be written to this location
Returns:

cp

Return type:

The parsed ConfigParser class containing the read in .ini file

pycbc.workflow.configparser_test.perform_extended_interpolation(cp, preserve_orig_file=False)[source]

Filter through an ini file and replace all examples of ExtendedInterpolation formatting with the exact value. For values like ${example} this is replaced with the value that corresponds to the option called example *in the same section*

For values like ${common|example} this is replaced with the value that corresponds to the option example in the section [common]. Note that the python3 config parser uses ${common:example}, but python2.7 interprets the : in the same way as an =, which breaks things.

Nested interpolation is not supported here.

Parameters:
  • cp (ConfigParser object) –
  • preserve_orig_file (Boolean, optional) – By default the input ConfigParser object will be modified in place. If this is set deepcopy will be used and the input will be preserved. Default = False
Returns:

cp

Return type:

parsed ConfigParser object

pycbc.workflow.configparser_test.read_ini_file(cpFile)[source]

Read a .ini file and return it as a ConfigParser class. This function does none of the parsing/combining of sections. It simply reads the file and returns it unedited

Parameters:cpFile (The path to a .ini file to be read in) –
Returns:cp
Return type:The ConfigParser class containing the read in .ini file
pycbc.workflow.configparser_test.sanity_check_subsections(cp)[source]

This function goes through the ConfigParser and checks that any options given in the [SECTION_NAME] section are not also given in any [SECTION_NAME-SUBSECTION] sections.

Parameters:cp (The ConfigParser class) –
Returns:
Return type:None
pycbc.workflow.configparser_test.split_multi_sections(cp, preserve_orig_file=False)[source]

Parses through a supplied ConfigParser object and splits any sections labelled with an “&” sign (e.g. [inspiral&tmpltbank]) into [inspiral] and [tmpltbank] sections. If these individual sections already exist they will be appended to. If an option exists in both the [inspiral] and [inspiral&tmpltbank] sections an error will be thrown.

Parameters:
  • cp (The ConfigParser class) –
  • preserve_orig_file (Boolean, optional) – By default the input ConfigParser object will be modified in place. If this is set deepcopy will be used and the input will be preserved. Default = False
Returns:

cp

Return type:

The ConfigParser class
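
A sketch of the splitting behaviour, assuming cp was read from an ini file containing an ampersand-joined section; the option name is hypothetical:

from pycbc.workflow.configparser_test import split_multi_sections

# The input ini contained:
#   [inspiral&tmpltbank]
#   low-frequency-cutoff = 30
cp = split_multi_sections(cp)
# Both sections now carry the option:
cp.get('inspiral', 'low-frequency-cutoff')    # '30'
cp.get('tmpltbank', 'low-frequency-cutoff')   # '30'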

pycbc.workflow.configuration module

This module provides a wrapper to the ConfigParser utilities for pycbc workflow construction. This module is described in the page here: https://ldas-jobs.ligo.caltech.edu/~cbc/docs/pycbc/ahope/initialization_inifile.html

class pycbc.workflow.configuration.WorkflowConfigParser(configFiles=None, overrideTuples=None, parsedFilePath=None, deleteTuples=None, copy_to_cwd=False)[source]

Bases: glue.pipeline.DeepCopyableConfigParser

This is a sub-class of glue.pipeline.DeepCopyableConfigParser, which lets us add a few additional helper features that are useful in workflows.

add_options_to_section(section, items, overwrite_options=False)[source]

Add a set of options and values to a section of a ConfigParser object. Will throw an error if any of the options being added already exist; this behaviour can be overridden if desired.

Parameters:
  • section (string) – The name of the section to add options+values to
  • items (list of tuples) – Each tuple contains (at [0]) the option and (at [1]) the value to add to the section of the ini file
  • overwrite_options (Boolean, optional) – By default this function will throw a ValueError if an option exists both in the original section of the ConfigParser and in the provided items. If this is set to True, the options+values given in items will replace the original values. Default = False
check_duplicate_options(section1, section2, raise_error=False)[source]

Check for duplicate options in two sections, section1 and section2. Will return a list of the duplicate options.

Parameters:
  • section1 (string) – The name of the first section to compare
  • section2 (string) – The name of the second section to compare
  • raise_error (Boolean, optional (default=False)) – If True, raise an error if duplicates are present.
Returns:

duplicates – List of duplicate options

Return type:

List

classmethod from_cli(opts)[source]

Initialize the config parser using options parsed from the command line.

The parsed options opts must include options provided by add_workflow_command_line_group().

Parameters:opts (argparse.Namespace) – The command line arguments parsed by argparse
get_cli_option(section, option_name, **kwds)[source]

Return option using CLI action parsing

Parameters:
  • section (str) – The section in which to find the option to parse
  • option_name (str) – Name of the option to parse from the config file
  • kwds (keywords) – Additional keywords are passed directly to the argument parser.
Returns:

The parsed value for this option

Return type:

value

get_opt_tag(section, option, tag)[source]

Convenience function accessing get_opt_tags() for a single tag: see documentation for that function. NB calling get_opt_tags() directly is preferred for simplicity.

Parameters:
  • self (ConfigParser object) – The ConfigParser object (automatically passed when this is appended to the ConfigParser class)
  • section (string) – The section of the ConfigParser object to read
  • option (string) – The ConfigParser option to look for
  • tag (string) – The name of the subsection to look in, if not found in [section]
Returns:

The value of the options being searched for

Return type:

string

get_opt_tags(section, option, tags)[source]

Supplement to ConfigParser.ConfigParser.get(). This will search for an option in [section] and, if it doesn’t find it, will also try in [section-tag] for every value of tag in tags. Will raise a ConfigParser.Error if it cannot find a value.

Parameters:
  • self (ConfigParser object) – The ConfigParser object (automatically passed when this is appended to the ConfigParser class)
  • section (string) – The section of the ConfigParser object to read
  • option (string) – The ConfigParser option to look for
  • tags (list of strings) – The name of subsections to look in, if not found in [section]
Returns:

The value of the options being searched for

Return type:

string
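
A usage sketch, assuming cp is a WorkflowConfigParser; the section, option, and tag names are hypothetical:

# Looks in [inspiral] first, then falls back to [inspiral-bns].
value = cp.get_opt_tags('inspiral', 'segment-length', ['bns'])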

get_subsections(section_name)[source]

Return a list of subsections for the given section name

has_option_tag(section, option, tag)[source]

Convenience function accessing has_option_tags() for a single tag: see documentation for that function. NB calling has_option_tags() directly is preferred for simplicity.

Parameters:
  • self (ConfigParser object) – The ConfigParser object (automatically passed when this is appended to the ConfigParser class)
  • section (string) – The section of the ConfigParser object to read
  • option (string) – The ConfigParser option to look for
  • tag (string) – The name of the subsection to look in, if not found in [section]
Returns:

Is the option in the section or [section-tag]

Return type:

Boolean

has_option_tags(section, option, tags)[source]

Supplement to ConfigParser.ConfigParser.has_option(). This will search for an option in [section] and, if it doesn’t find it, will also try in [section-tag] for each value in tags. Returns True if the option is found and False if not.

Parameters:
  • self (ConfigParser object) – The ConfigParser object (automatically passed when this is appended to the ConfigParser class)
  • section (string) – The section of the ConfigParser object to read
  • option (string) – The ConfigParser option to look for
  • tags (list of strings) – The names of the subsection to look in, if not found in [section]
Returns:

Is the option in the section or [section-tag] (for tag in tags)

Return type:

Boolean

interpolate_exe(testString)[source]

Replace testString with a path to an executable based on the format.

If this looks like

${which:lalapps_tmpltbank}

it will return the equivalent of which(lalapps_tmpltbank)

Otherwise it will return an unchanged string.

Parameters:testString (string) – The input string
Returns:newString – The output string.
Return type:string
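
A sketch of the two possible outcomes, assuming cp is a WorkflowConfigParser:

# Resolved via the system path, as with which(lalapps_tmpltbank):
exe_path = cp.interpolate_exe('${which:lalapps_tmpltbank}')
# Anything else comes back unchanged:
unchanged = cp.interpolate_exe('/usr/bin/env')
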
interpolate_string(testString, section)[source]

Take a string and replace all examples of ExtendedInterpolation formatting within the string with the exact value.

For values like ${example} this is replaced with the value that corresponds to the option called example *in the same section*

For values like ${common|example} this is replaced with the value that corresponds to the option example in the section [common]. Note that the python3 config parser uses ${common:example}, but python2.7 interprets the : in the same way as an =, which breaks things.

Nested interpolation is not supported here.

Parameters:
  • testString (String) – The string to parse and interpolate
  • section (String) – The current section of the ConfigParser object
Returns:

testString – Interpolated string

Return type:

String

perform_exe_expansion()[source]

This function will look through the executables section of the ConfigParser object and replace any values using macros with full paths.

Any values that look like

${which:lalapps_tmpltbank}

will be replaced with the equivalent of which(lalapps_tmpltbank)

Otherwise values will be unchanged.

perform_extended_interpolation()[source]

Filter through an ini file and replace all examples of ExtendedInterpolation formatting with the exact value. For values like ${example} this is replaced with the value that corresponds to the option called example *in the same section*

For values like ${common|example} this is replaced with the value that corresponds to the option example in the section [common]. Note that the python3 config parser uses ${common:example}, but python2.7 interprets the : in the same way as an =, which breaks things.

Nested interpolation is not supported here.

populate_shared_sections()[source]

Parse the [sharedoptions] section of the ini file.

That section should contain entries according to:

  • massparams = inspiral, tmpltbank
  • dataparams = tmpltbank

This will result in all options in [sharedoptions-massparams] being copied into the [inspiral] and [tmpltbank] sections and the options in [sharedoptions-dataparams] being copied into [tmpltbank]. In the case of duplicates an error will be raised.
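
A sketch of the ini layout this method acts on, with a hypothetical shared option shown in the comments:

# The ini file would contain something like:
#   [sharedoptions]
#   massparams = inspiral, tmpltbank
#   [sharedoptions-massparams]
#   min-mass = 1.0
cp.populate_shared_sections()
# [inspiral] and [tmpltbank] would now both contain min-mass = 1.0.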

read_ini_file(cpFile)[source]

Read a .ini file and return it as a ConfigParser class. This function does none of the parsing/combining of sections. It simply reads the file and returns it unedited

Stub awaiting more functionality - see configparser_test.py

Parameters:cpFile (Path to .ini file, or list of paths) – The path(s) to a .ini file to be read in
Returns:cp – The ConfigParser class containing the read in .ini file
Return type:ConfigParser
resolve_file_url(test_string)[source]

Replace test_string with a path to an executable based on the format.

If this looks like

${which:lalapps_tmpltbank}

it will return the equivalent of which(lalapps_tmpltbank)

Otherwise it will return an unchanged string.

Parameters:test_string (string) – The input string
Returns:new_string – The output string.
Return type:string
resolve_urls()[source]

This function will look through all sections of the ConfigParser object and replace any URLs that are given the resolve magic flag with a path on the local drive.

Specifically for any values that look like

${resolve:https://git.ligo.org/detchar/SOME_GATING_FILE.txt}

the value will be replaced with the output of resolve_url(URL)

Otherwise values will be unchanged.

sanity_check_subsections()[source]

This function goes through the ConfigParser and checks that any options given in the [SECTION_NAME] section are not also given in any [SECTION_NAME-SUBSECTION] sections.

section_to_cli(section, skip_opts=None)[source]

Converts a section into a command-line string.

For example:

[section_name]
foo =
bar = 10

yields: ‘--foo --bar 10’.

Parameters:
  • section (str) – The name of the section to convert.
  • skip_opts (list, optional) – List of options to skip. Default (None) results in all options in the section being converted.
Returns:

The options as a command-line string.

Return type:

str
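
A sketch matching the example above, assuming cp is a WorkflowConfigParser holding that section:

# Convert [section_name] to a command-line string, skipping 'foo'.
cli_string = cp.section_to_cli('section_name', skip_opts=['foo'])
# cli_string is now '--bar 10'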

split_multi_sections()[source]

Parses through the WorkflowConfigParser instance and splits any sections labelled with an “&” sign (e.g. [inspiral&tmpltbank]) into [inspiral] and [tmpltbank] sections. If these individual sections already exist they will be appended to. If an option exists in both the [inspiral] and [inspiral&tmpltbank] sections an error will be thrown.

pycbc.workflow.configuration.add_workflow_command_line_group(parser)[source]

The standard way of initializing a ConfigParser object in workflow will be to do it from the command line. This is done by giving a

--local-config-files filea.ini fileb.ini filec.ini

command. You can also set config file override commands on the command line. This will be most useful when setting (for example) start and end times, or active ifos. This is done by

--config-overrides section1:option1:value1 section2:option2:value2 …

This can also be given as

--config-overrides section1:option1

where the value will be left as ‘’.

To remove a configuration option, use the command line argument

--config-delete section1:option1

which will delete option1 from [section1] or

--config-delete section1

to delete all of the options in [section1]

Deletes are implemented before overrides.

This function returns an argparse OptionGroup to ensure these options are parsed correctly and can then be sent directly to initialize a WorkflowConfigParser.

Parameters:parser (argparse.ArgumentParser instance) – The initialized argparse instance to add the workflow option group to.
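
A minimal sketch of this standard initialization path:

import argparse

from pycbc.workflow.configuration import (WorkflowConfigParser,
                                          add_workflow_command_line_group)

parser = argparse.ArgumentParser()
add_workflow_command_line_group(parser)
# e.g. invoked with: --local-config-files filea.ini fileb.ini
opts = parser.parse_args()
cp = WorkflowConfigParser.from_cli(opts)
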
pycbc.workflow.configuration.istext(s, text_characters=None, threshold=0.3)[source]

Determines if the string is a set of binary data or a text file. This is done by checking if a large proportion of characters are > 0x7E (0x7F is <DEL> and unprintable) or low-bit control codes; in other words, things that you wouldn’t see (often) in a text file. (ASCII past 0x7F might appear, but rarely.)

Code modified from https://www.safaribooksonline.com/library/view/python-cookbook-2nd/0596007973/ch01s12.html

pycbc.workflow.configuration.resolve_url(url, directory=None, permissions=None, copy_to_cwd=True)[source]

Resolves a URL to a local file, and returns the path to that file.

If a URL is given, the file will be copied to the current working directory. If a local file path is given, the file will only be copied to the current working directory if copy_to_cwd is True (the default).
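
A usage sketch; the URL below is hypothetical:

from pycbc.workflow.configuration import resolve_url

# Copies the remote file into the current working directory and
# returns the local path.
local_path = resolve_url('https://example.org/SOME_GATING_FILE.txt')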

pycbc.workflow.core module

This module provides the worker functions and classes that are used when creating a workflow. For details about the workflow module see here: https://ldas-jobs.ligo.caltech.edu/~cbc/docs/pycbc/ahope.html

exception pycbc.workflow.core.CalledProcessErrorMod(returncode, cmd, errFile=None, outFile=None, cmdFile=None)[source]

Bases: exceptions.Exception

This exception is raised when subprocess.call returns a non-zero exit code and checking has been requested. This should not be accessed by the user; it is used only within make_external_call.

class pycbc.workflow.core.ContentHandler(document, start_handlers={})[source]

Bases: glue.ligolw.ligolw.LIGOLWContentHandler

startColumn(parent, attrs)
startStream(parent, attrs, __orig_startStream=<unbound method ContentHandler.startStream>)
startTable(parent, attrs, __orig_startTable=<unbound method ContentHandler.startTable>)
class pycbc.workflow.core.Executable(cp, name, universe=None, ifos=None, out_dir=None, tags=None)[source]

Bases: pycbc.workflow.pegasus_workflow.Executable

ALL_TRIGGERS = 2
FINAL_RESULT = 4
INTERMEDIATE_PRODUCT = 1
KEEP_BUT_RAISE_WARNING = 5
MERGED_TRIGGERS = 3
add_ini_opts(cp, sec)[source]

Add job-specific options from configuration file.

Parameters:
  • cp (ConfigParser object) – The ConfigParser object holding the workflow configuration settings
  • sec (string) – The section containing options for this job.
add_ini_profile(cp, sec)[source]

Add profile from configuration file.

Parameters:
  • cp (ConfigParser object) – The ConfigParser object holding the workflow configuration settings
  • sec (string) – The section containing options for this job.
add_opt(opt, value=None)[source]

Add option to job.

Parameters:
  • opt (string) – Name of option (e.g. --output-file-format)
  • value (string, (default=None)) – The value for the option (no value if set to None).
create_node()[source]

Default node constructor.

This is usually overridden by subclasses of Executable.

current_retention_level = 5
file_input_options = []
get_opt(opt)[source]

Get value of option from configuration file

Parameters:opt (string) – Name of option (e.g. output-file-format)
Returns:value – The value for the option. Returns None if option not present.
Return type:string
has_opt(opt)[source]

Check if option is present in configuration file

Parameters:opt (string) – Name of option (e.g. output-file-format)
ifo

Return the ifo.

If only one ifo in the ifo list this will be that ifo. Otherwise an error is raised.

update_current_retention_level(value)[source]

Set a new value for the current retention level.

This updates the value of self.retain_files for an updated value of the retention level.

Parameters:value (int) – The new value to use for the retention level.
update_current_tags(tags)[source]

Set a new set of tags for this executable.

Update the set of tags that this job will use. This updates the default file naming and shared options. It will not update the pegasus profiles, which belong to the executable and cannot be different for different nodes.

Parameters:tags (list) – The new list of tags to consider.
update_output_directory(out_dir=None)[source]

Update the default output directory for output files.

Parameters:out_dir (string (optional, default=None)) – If provided use this as the output directory. Else choose this automatically from the tags.
class pycbc.workflow.core.File(ifos, exe_name, segs, file_url=None, extension=None, directory=None, tags=None, store_file=True, use_tmp_subdirs=False)[source]

Bases: pycbc.workflow.pegasus_workflow.File

This class holds the details of an individual output file. This file may be pre-supplied, generated from within the workflow command line script, or generated within the workflow. The important stuff is:

  • The ifo that the File is valid for
  • The time span that the File is valid for
  • A short description of what the file is
  • The extension that the file should have
  • The url where the file should be located

An example of initiating this class:

>>> c = File("H1", "INSPIRAL_S6LOWMASS", segments.segment(815901601, 815902001), file_url="file://localhost/home/spxiwh/H1-INSPIRAL_S6LOWMASS-815901601-400.xml.gz")

another where the file url is generated from the inputs:

>>> c = File("H1", "INSPIRAL_S6LOWMASS", segments.segment(815901601, 815902001), directory="/home/spxiwh", extension="xml.gz")

add_metadata(key, value)[source]

Add arbitrary metadata to this file

cache_entry

Returns a CacheEntry instance for File.

ifo

If only one ifo in the ifo_list this will be that ifo. Otherwise an error is raised.

segment

If only one segment in the segmentlist this will be that segment. Otherwise an error is raised.

class pycbc.workflow.core.FileList[source]

Bases: list

This class holds a list of File objects. It inherits from the built-in list class, but also adds a number of features. ONLY pycbc.workflow.File instances should be within a FileList instance.

categorize_by_attr(attribute)[source]

Function to categorize a FileList by a File object attribute (e.g. ‘segment’, ‘ifo’, ‘description’).

Parameters:attribute (string) – File object attribute to categorize FileList
Returns:
  • keys (list) – A list of values for an attribute
  • groups (list) – A list of FileLists
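
A usage sketch based on the return values above, assuming file_list is an existing FileList:

# Group the list by detector; keys and groups align index-by-index.
keys, groups = file_list.categorize_by_attr('ifo')
for ifo, sub_list in zip(keys, groups):
    print(ifo, len(sub_list))
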
convert_to_lal_cache()[source]

Return all files in this object as a glue.lal.Cache object

dump(filename)[source]

Output this FileList to a pickle file

entry_class

alias of File

find_all_output_in_range(ifo, currSeg, useSplitLists=False)[source]

Return all files that overlap the specified segment.

find_output(ifo, time)[source]

Returns one File most appropriate at the given time/time range.

Return one File that covers the given time, or is most appropriate for the supplied time range.

Parameters:
  • ifo (string) – Name of the ifo (or ifos) that the file should be valid for.
  • time (int/float/LIGOGPStime or tuple containing two values) – If an int/float/LIGOGPStime (or similar way of specifying one time) is given, return the File corresponding to the time. This calls self.find_output_at_time(ifo,time). If a tuple of two values is given, return the File that is most appropriate for the time range given. This calls self.find_output_in_range
Returns:

pycbc_file – The File that corresponds to the time or time range

Return type:

pycbc.workflow.File instance

find_output_at_time(ifo, time)[source]

Return File that covers the given time.

Parameters:
  • ifo (string) – Name of the ifo (or ifos) that the File should correspond to
  • time (int/float/LIGOGPStime) – Return the File that covers the supplied time. If no File covers the time this will return None.
Returns:

The Files that correspond to the time.

Return type:

list of File classes

find_output_in_range(ifo, start, end)[source]

Return the File that is most appropriate for the supplied time range. That is, the File whose coverage time has the largest overlap with the supplied time range. If no Files overlap the supplied time window, will return None.

Parameters:
  • ifo (string) – Name of the ifo (or ifos) that the File should correspond to
  • start (int/float/LIGOGPStime) – The start of the time range of interest.
  • end (int/float/LIGOGPStime) – The end of the time range of interest
Returns:

The File that is most appropriate for the time range

Return type:

File class

find_output_with_ifo(ifo)[source]

Find all files that have ifo = ifo

find_output_with_tag(tag)[source]

Find all files that have tag in self.tags

find_output_without_tag(tag)[source]

Find all files that do not have tag in self.tags

find_outputs_in_range(ifo, current_segment, useSplitLists=False)[source]

Return the list of Files that is most appropriate for the supplied time range. That is, the Files whose coverage time has the largest overlap with the supplied time range.

Parameters:
  • ifo (string) – Name of the ifo (or ifos) that the File should correspond to
  • current_segment (glue.segment.segment) – The segment of time that files must intersect.
Returns:

The list of Files that are most appropriate for the time range

Return type:

FileList class

get_times_covered_by_files()[source]

Find the coalesced intersection of the segments of all files in the list.

classmethod load(filename)[source]

Load a FileList from a pickle file

to_file_object(name, out_dir)[source]

Dump to a pickle file and return a File object reference for this list

Parameters:
  • name (str) – An identifier of this file. Needs to be unique.
  • out_dir (path) – path to place this file
Returns:

file

Return type:

File

class pycbc.workflow.core.Node(executable)[source]

Bases: pycbc.workflow.pegasus_workflow.Node

add_multiifo_input_list_opt(opt, inputs)[source]

Add an option that determines a list of inputs from multiple detectors. Files will be supplied as --opt ifo1:input1 ifo2:input2 ...

add_multiifo_output_list_opt(opt, outputs)[source]

Add an option that determines a list of outputs from multiple detectors. Files will be supplied as --opt ifo1:output1 ifo2:output2 ...

get_command_line()[source]
new_multiifo_output_list_opt(opt, ifos, analysis_time, extension, tags=None, store_file=None, use_tmp_subdirs=False)[source]

Add an option that determines a list of outputs from multiple detectors. Files will be supplied as --opt ifo1:output1 ifo2:output2 ... File names are created internally from the provided extension and analysis time.

new_output_file_opt(valid_seg, extension, option_name, tags=None, store_file=None, use_tmp_subdirs=False)[source]

This function will create a workflow.File object corresponding to the given information and then add that file as output of this node.

Parameters:
  • valid_seg (ligo.segments.segment) – The time span over which the job is valid for.
  • extension (string) – The extension to be used at the end of the filename. E.g. ‘.xml’ or ‘.sqlite’.
  • option_name (string) – The option that is used when setting this job as output. For e.g. ‘output-name’ or ‘output-file’, whatever is appropriate for the current executable.
  • tags (list of strings, (optional, default=[])) – These tags will be added to the list of tags already associated with the job. They can be used to uniquely identify this output file.
  • store_file (Boolean, (optional, default=True)) – This file is to be added to the output mapper and will be stored in the specified output location if True. If False the file will be removed when no longer needed in the workflow.
output_file

If only one output file return it. Otherwise raise an exception.

output_files
class pycbc.workflow.core.SegFile(ifo_list, description, valid_segment, segment_dict=None, seg_summ_dict=None, **kwargs)[source]

Bases: pycbc.workflow.core.File

This class inherits from the File class, and is designed to store workflow output files containing a segment dict. This is identical in usage to File except for an additional kwarg for holding the segment dictionary, if it is known at workflow run time.

classmethod from_multi_segment_list(description, segmentlists, names, ifos, seg_summ_lists=None, **kwargs)[source]

Initialize a SegFile object from a list of segmentlists.

Parameters:
  • description (string (required)) – See File.__init__
  • segmentlists (List of ligo.segments.segmentslist) – List of segment lists that will be stored in this file.
  • names (List of str) – List of names of the segment lists to be stored in the file.
  • ifos (list of str) – List of ifos of the segment lists to be stored in this file.
  • seg_summ_lists (ligo.segments.segmentslist (OPTIONAL)) – Specify the segment_summary segmentlists that go along with the segmentlists. Default=None, in this case segment_summary is taken from the valid_segment of the SegFile class.
classmethod from_segment_list(description, segmentlist, name, ifo, seg_summ_list=None, **kwargs)[source]

Initialize a SegFile object from a segmentlist.

Parameters:
  • description (string (required)) – See File.__init__
  • segmentlist (ligo.segments.segmentslist) – The segment list that will be stored in this file.
  • name (str) – The name of the segment lists to be stored in the file.
  • ifo (str) – The ifo of the segment lists to be stored in this file.
  • seg_summ_list (ligo.segments.segmentslist (OPTIONAL)) – Specify the segment_summary segmentlist that goes along with the segmentlist. Default=None, in this case segment_summary is taken from the valid_segment of the SegFile class.
classmethod from_segment_list_dict(description, segmentlistdict, ifo_list=None, valid_segment=None, file_exists=False, seg_summ_dict=None, **kwargs)[source]

Initialize a SegFile object from a segmentlistdict.

Parameters:
  • description (string (required)) – See File.__init__
  • segmentlistdict (ligo.segments.segmentslistdict) – See SegFile.__init__
  • ifo_list (string or list (optional)) – See File.__init__, if not given a list of all ifos in the segmentlistdict object will be used
  • valid_segment (ligo.segments.segment or ligo.segments.segmentlist) – See File.__init__, if not given the extent of all segments in the segmentlistdict is used.
  • file_exists (boolean (default = False)) – If provided and set to True it is assumed that this file already exists on disk and so there is no need to write again.
  • seg_summ_dict (ligo.segments.segmentslistdict) – Optional. See SegFile.__init__.
classmethod from_segment_xml(xml_file, **kwargs)[source]

Read a ligo.segments.segmentlist from a file object containing an xml segment table.

Parameters:xml_file (file object) – file object for segment xml file
parse_segdict_key(key)[source]

Return ifo and name from the segdict key.

remove_short_sci_segs(minSegLength)[source]

Function to remove all science segments shorter than a specific length. Also updates the file on disk to remove these segments.

Parameters:minSegLength (int) – Minimum length of science segments. Segments shorter than this will be removed.
return_union_seglist()[source]
to_segment_xml(override_file_if_exists=False)[source]

Write the segment list in self.segmentList to self.storage_path.

class pycbc.workflow.core.Workflow(args, name)[source]

Bases: pycbc.workflow.pegasus_workflow.Workflow

This class manages a pycbc workflow. It provides convenience functions for finding input files using time and keywords. It can also generate cache files from the inputs.

execute_node(node, verbatim_exe=False)[source]

Execute this node immediately on the local machine

output_map
save(filename=None, output_map_path=None, transformation_catalog_path=None, staging_site=None)[source]

Write this workflow to a DAX file

save_config(fname, output_dir, cp=None)[source]

Writes configuration file to disk and returns a pycbc.workflow.File instance for the configuration file.

Parameters:
  • fname (string) – The filename of the configuration file written to disk.
  • output_dir (string) – The directory where the file is written to disk.
  • cp (ConfigParser object) – The ConfigParser object to write. If None then uses self.cp.
Returns:

The FileList object with the configuration file.

Return type:

FileList

static set_job_properties(job, output_map_file, transformation_catalog_file, staging_site=None)[source]
staging_site
transformation_catalog
pycbc.workflow.core.check_output(*popenargs, **kwargs)[source]
pycbc.workflow.core.check_output_error_and_retcode(*popenargs, **kwargs)[source]

This function is used to obtain the stdout of a command. It is only used internally; we recommend using the make_external_call command if you want to call external executables.

pycbc.workflow.core.get_full_analysis_chunk(science_segs)[source]

Function to find the first and last time point contained in the science segments and return a single segment spanning that full time.

Parameters:science_segs (ifo-keyed dictionary of ligo.segments.segmentlist instances) – The list of times that are being analysed in this workflow.
Returns:fullSegment – The segment spanning the first and last time point contained in science_segs.
Return type:ligo.segments.segment
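
A sketch with two detectors, using ligo.segments to build the input (the GPS times are placeholders):

from ligo import segments
from pycbc.workflow.core import get_full_analysis_chunk

science_segs = {
    'H1': segments.segmentlist([segments.segment(100, 200)]),
    'L1': segments.segmentlist([segments.segment(150, 300)]),
}
full_segment = get_full_analysis_chunk(science_segs)
# full_segment spans segment(100, 300)
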
pycbc.workflow.core.get_random_label()[source]

Get a random label string to use when clustering jobs.

pycbc.workflow.core.is_condor_exec(exe_path)[source]

Determine if an executable is condor-compiled

Parameters:exe_path (str) – The executable path
Returns:truth_value – Return True if the exe is condor compiled, False otherwise.
Return type:boolean
pycbc.workflow.core.make_analysis_dir(path)[source]

Make the analysis directory path, any parent directories that don’t already exist, and the ‘logs’ subdirectory of path.

pycbc.workflow.core.make_external_call(cmdList, out_dir=None, out_basename='external_call', shell=False, fail_on_error=True)[source]

Use this to make an external call using the python subprocess module. See the subprocess documentation for more details of how this works. http://docs.python.org/2/library/subprocess.html

Parameters:
  • cmdList (list of strings) – This list of strings contains the command to be run. See the subprocess documentation for more details.
  • out_dir (string) – If given, stdout and stderr will be redirected to os.path.join(out_dir, out_basename) + ".out" and ".err" respectively. If not given, stdout and stderr will not be recorded.
  • out_basename (string) – The value of out_basename used to construct the file names used to store stderr and stdout. See out_dir for more information.
  • shell (boolean, default=False) – This value will be given as the shell kwarg to the subprocess call. WARNING See the subprocess documentation for details on this Kwarg including a warning about a serious security exploit. Do not use this unless you are sure it is necessary and safe.
  • fail_on_error (boolean, default=True) – If set to True an exception will be raised if the external command does not return a code of 0. If set to False such failures will be ignored. Stderr and stdout can be stored in either case using the out_dir and out_basename options.
Returns:

exitCode – The code returned by the process.

Return type:

int
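
A minimal sketch of a logged external call:

from pycbc.workflow.core import make_external_call

# stdout and stderr are written to logs/list_files.out and .err
exit_code = make_external_call(['ls', '-l'], out_dir='logs',
                               out_basename='list_files')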

pycbc.workflow.datafind module

This module is responsible for querying a datafind server to determine the availability of the data that the code is attempting to run on. It also performs a number of tests and can act on these as described below. Full documentation for this function can be found here: https://ldas-jobs.ligo.caltech.edu/~cbc/docs/pycbc/ahope/datafind.html

class pycbc.workflow.datafind.ContentHandler(document, start_handlers={})[source]

Bases: glue.ligolw.ligolw.LIGOLWContentHandler

startColumn(parent, attrs)
startStream(parent, attrs, __orig_startStream=<unbound method ContentHandler.startStream>)
startTable(parent, attrs, __orig_startTable=<unbound method ContentHandler.startTable>)
pycbc.workflow.datafind.convert_cachelist_to_filelist(datafindcache_list)[source]

Take as input a list of glue.lal.Cache objects and return a pycbc FileList containing all frames within those caches.

Parameters:datafindcache_list (list of glue.lal.Cache objects) – The list of cache files to convert.
Returns:datafind_filelist – The list of frame files.
Return type:FileList of frame File objects
pycbc.workflow.datafind.datafind_keep_unique_backups(backup_outs, orig_outs)[source]

This function will take a list of backup datafind files, presumably obtained by querying a remote datafind server, e.g. CIT, and compare these against a list of original datafind files, presumably obtained by querying the local datafind server. Only the datafind files in the backup list that do not appear in the original list are returned. This allows us to use only files that are missing from the local cluster.

Parameters:
  • backup_outs (FileList) – List of datafind files from the remote datafind server.
  • orig_outs (FileList) – List of datafind files from the local datafind server.
Returns:

List of datafind files in backup_outs and not in orig_outs.

Return type:

FileList

pycbc.workflow.datafind.get_missing_segs_from_frame_file_cache(datafindcaches)[source]

This function will use os.path.isfile to determine if all the frame files returned by the local datafind server actually exist on the disk. This can then be used to update the science times if needed.

Parameters:datafindcaches (OutGroupList) – List of all the datafind output files.
Returns:
  • missingFrameSegs (Dict. of ifo keyed glue.segment.segmentlist instances) – The times corresponding to missing frames found in datafindOuts.
  • missingFrames (Dict. of ifo keyed lal.Cache instances) – The list of missing frames
pycbc.workflow.datafind.get_science_segs_from_datafind_outs(datafindcaches)[source]

This function will calculate the science segments that are covered in the OutGroupList containing the frame files returned by various calls to the datafind server. This can then be used to check whether this list covers what it is expected to cover.

Parameters:datafindcaches (OutGroupList) – List of all the datafind output files.
Returns:newScienceSegs – The times covered by the frames found in datafindOuts.
Return type:Dictionary of ifo keyed glue.segment.segmentlist instances
pycbc.workflow.datafind.get_segment_summary_times(scienceFile, segmentName)[source]

This function will find the times for which the segment_summary is set for the flag given by segmentName.

Parameters:
  • scienceFile (SegFile) – The segment file that we want to use to determine this.
  • segmentName (string) – The DQ flag to search for times in the segment_summary table.
Returns:

summSegList – The times that are covered in the segment summary table.

Return type:

ligo.segments.segmentlist

pycbc.workflow.datafind.log_datafind_command(observatory, frameType, startTime, endTime, outputDir, **dfKwargs)[source]

This function will write an equivalent gw_data_find command to disk that can be used to debug why the internal datafind module is not working.

pycbc.workflow.datafind.run_datafind_instance(cp, outputDir, connection, observatory, frameType, startTime, endTime, ifo, tags=None)[source]

This function will query the datafind server once to find frames between the specified times for the specified frame type and observatory.

Parameters:
  • cp (ConfigParser instance) – Source for any kwargs that should be sent to the datafind module
  • outputDir (path) – Output cache files will be written here. We also write the commands for reproducing what is done in this function to this directory.
  • connection (datafind connection object) – Initialized through the gwdatafind module, this is the open connection to the datafind server.
  • observatory (string) – The observatory to query frames for. Ex. ‘H’, ‘L’ or ‘V’. NB: not ‘H1’, ‘L1’, ‘V1’ which denote interferometers.
  • frameType (string) – The frame type to query for.
  • startTime (int) – Integer start time to query the datafind server for frames.
  • endTime (int) – Integer end time to query the datafind server for frames.
  • ifo (string) – The interferometer to use for naming output. Ex. ‘H1’, ‘L1’, ‘V1’. Maybe this could be merged with the observatory string, but this could cause issues if running on old ‘H2’ and ‘H1’ data.
  • tags (list of string, optional (default=None)) – Use this to specify tags. This can be used if this module is being called more than once to give call-specific configuration (by setting options in [workflow-datafind-${TAG}] rather than [workflow-datafind]). This is also used to tag the Files returned by the class to uniquify the Files and uniquify the actual filename. FIXME: Filenames may not be unique with current codes!
Returns:

  • dfCache (glue.lal.Cache instance) – The glue.lal.Cache representation of the call to the datafind server and the returned frame files.
  • cacheFile (pycbc.workflow.core.File) – Cache file listing all of the datafind output files for use later in the pipeline.

pycbc.workflow.datafind.setup_datafind_from_pregenerated_lcf_files(cp, ifos, outputDir, tags=None)[source]

This function is used if you want to run with pregenerated lcf frame cache files.

Parameters:
  • cp (ConfigParser.ConfigParser instance) – This contains a representation of the information stored within the workflow configuration files
  • ifos (list of ifo strings) – List of ifos to get pregenerated files for.
  • outputDir (path) – All output files written by datafind processes will be written to this directory. Currently this sub-module writes no output.
  • tags (list of strings, optional (default=None)) – Use this to specify tags. This can be used if this module is being called more than once to give call-specific configuration (by setting options in [workflow-datafind-${TAG}] rather than [workflow-datafind]). This is also used to tag the Files returned by the class to uniquify the Files and uniquify the actual filename.
Returns:

  • datafindcaches (list of glue.lal.Cache instances) – The glue.lal.Cache representations of the various calls to the datafind server and the returned frame files.
  • datafindOuts (pycbc.workflow.core.FileList) – List of all the datafind output files for use later in the pipeline.

pycbc.workflow.datafind.setup_datafind_runtime_cache_multi_calls_perifo(cp, scienceSegs, outputDir, tags=None)[source]

This function uses the gwdatafind library to obtain the location of all the frame files that will be needed to cover the analysis of the data given in scienceSegs. This function will not check if the returned frames cover the whole time requested; such sanity checks are done in the pycbc.workflow.setup_datafind_workflow entry function. As opposed to setup_datafind_runtime_single_call_perifo this call will make one call to the datafind server for every science segment. This function will return a list of output files that correspond to the cache .lcf files that are produced, which list the locations of all frame files. This will cause problems with pegasus, which expects to know about all input files (i.e. the frame files themselves).

Parameters:
  • cp (ConfigParser.ConfigParser instance) – This contains a representation of the information stored within the workflow configuration files
  • scienceSegs (Dictionary of ifo keyed glue.segment.segmentlist instances) – This contains the times that the workflow is expected to analyse.
  • outputDir (path) – All output files written by datafind processes will be written to this directory.
  • tags (list of strings, optional (default=None)) – Use this to specify tags. This can be used if this module is being called more than once to give call-specific configuration (by setting options in [workflow-datafind-${TAG}] rather than [workflow-datafind]). This is also used to tag the Files returned by the class to uniquify the Files and uniquify the actual filename. FIXME: Filenames may not be unique with current codes!
Returns:

  • datafindcaches (list of glue.lal.Cache instances) – The glue.lal.Cache representations of the various calls to the datafind server and the returned frame files.
  • datafindOuts (pycbc.workflow.core.FileList) – List of all the datafind output files for use later in the pipeline.

pycbc.workflow.datafind.setup_datafind_runtime_cache_single_call_perifo(cp, scienceSegs, outputDir, tags=None)[source]

This function uses the gwdatafind library to obtain the location of all the frame files that will be needed to cover the analysis of the data given in scienceSegs. This function will not check if the returned frames cover the whole time requested; such sanity checks are done in the pycbc.workflow.setup_datafind_workflow entry function. As opposed to setup_datafind_runtime_generated this call will make only one call to datafind per ifo, spanning the whole time. This function will return a list of output files that correspond to the cache .lcf files that are produced, which list the locations of all frame files. This will cause problems with pegasus, which expects to know about all input files (i.e. the frame files themselves).

Parameters:
  • cp (ConfigParser.ConfigParser instance) – This contains a representation of the information stored within the workflow configuration files
  • scienceSegs (Dictionary of ifo keyed glue.segment.segmentlist instances) – This contains the times that the workflow is expected to analyse.
  • outputDir (path) – All output files written by datafind processes will be written to this directory.
  • tags (list of strings, optional (default=None)) – Use this to specify tags. This can be used if this module is being called more than once to give call-specific configuration (by setting options in [workflow-datafind-${TAG}] rather than [workflow-datafind]). This is also used to tag the Files returned by the class, making both the Files and the actual filenames unique. FIXME: Filenames may not be unique with current codes!
Returns:

  • datafindcaches (list of glue.lal.Cache instances) – The glue.lal.Cache representations of the various calls to the datafind server and the returned frame files.
  • datafindOuts (pycbc.workflow.core.FileList) – List of all the datafind output files for use later in the pipeline.

pycbc.workflow.datafind.setup_datafind_runtime_frames_multi_calls_perifo(cp, scienceSegs, outputDir, tags=None)[source]

This function uses the gwdatafind library to obtain the location of all the frame files that will be needed to cover the analysis of the data given in scienceSegs. This function will not check if the returned frames cover the whole time requested; such sanity checks are done in the pycbc.workflow.setup_datafind_workflow entry function. As opposed to setup_datafind_runtime_single_call_perifo, this call will make one call to the datafind server for every science segment. This function will return a list of files corresponding to the individual frames returned by the datafind query. This will allow pegasus to more easily identify all the files used as input, but may cause problems for codes that need to take frame cache files as input.

Parameters:
  • cp (ConfigParser.ConfigParser instance) – This contains a representation of the information stored within the workflow configuration files
  • scienceSegs (Dictionary of ifo keyed glue.segment.segmentlist instances) – This contains the times that the workflow is expected to analyse.
  • outputDir (path) – All output files written by datafind processes will be written to this directory.
  • tags (list of strings, optional (default=None)) – Use this to specify tags. This can be used if this module is being called more than once to give call-specific configuration (by setting options in [workflow-datafind-${TAG}] rather than [workflow-datafind]). This is also used to tag the Files returned by the class, making both the Files and the actual filenames unique. FIXME: Filenames may not be unique with current codes!
Returns:

  • datafindcaches (list of glue.lal.Cache instances) – The glue.lal.Cache representations of the various calls to the datafind server and the returned frame files.
  • datafindOuts (pycbc.workflow.core.FileList) – List of all the datafind output files for use later in the pipeline.

pycbc.workflow.datafind.setup_datafind_runtime_frames_single_call_perifo(cp, scienceSegs, outputDir, tags=None)[source]

This function uses the gwdatafind library to obtain the location of all the frame files that will be needed to cover the analysis of the data given in scienceSegs. This function will not check if the returned frames cover the whole time requested; such sanity checks are done in the pycbc.workflow.setup_datafind_workflow entry function. As opposed to setup_datafind_runtime_generated, this call will make only one call to the datafind server per ifo, spanning the whole time. This function will return a list of files corresponding to the individual frames returned by the datafind query. This will allow pegasus to more easily identify all the files used as input, but may cause problems for codes that need to take frame cache files as input.

Parameters:
  • cp (ConfigParser.ConfigParser instance) – This contains a representation of the information stored within the workflow configuration files
  • scienceSegs (Dictionary of ifo keyed glue.segment.segmentlist instances) – This contains the times that the workflow is expected to analyse.
  • outputDir (path) – All output files written by datafind processes will be written to this directory.
  • tags (list of strings, optional (default=None)) – Use this to specify tags. This can be used if this module is being called more than once to give call-specific configuration (by setting options in [workflow-datafind-${TAG}] rather than [workflow-datafind]). This is also used to tag the Files returned by the class, making both the Files and the actual filenames unique. FIXME: Filenames may not be unique with current codes!
Returns:

  • datafindcaches (list of glue.lal.Cache instances) – The glue.lal.Cache representations of the various calls to the datafind server and the returned frame files.
  • datafindOuts (pycbc.workflow.core.FileList) – List of all the datafind output files for use later in the pipeline.

pycbc.workflow.datafind.setup_datafind_server_connection(cp, tags=None)[source]

This function is responsible for setting up the connection with the datafind server.

Parameters:cp (pycbc.workflow.configuration.WorkflowConfigParser) – The memory representation of the ConfigParser
Returns:The open connection to the datafind server.
Return type:connection
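
For orientation, the kind of query that such a connection serves can be reproduced with a standalone gwdatafind call. This is a minimal sketch only: the frame type and host below are illustrative placeholders, whereas the real server and frame types are read from the workflow configuration file.

# Minimal sketch of a datafind query; the frame type and host are
# placeholders, not values taken from any real configuration.
from gwdatafind import find_urls

urls = find_urls("L", "L1_HOFT_C00",        # observatory, frame type
                 1126259446, 1126259478,    # GPS start and end times
                 host="datafind.ligo.org")  # datafind server
print(urls)  # list of file:// URLs locating the frame files
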
pycbc.workflow.datafind.setup_datafind_workflow(workflow, scienceSegs, outputDir, seg_file=None, tags=None)[source]

Set up the datafind section of the workflow. This section is responsible for generating, or setting up the workflow to generate, a list of files that record the location of the frame files needed to perform the analysis. There could be multiple options here: the datafind jobs could be done at run time or could be put into a dag. The subsequent jobs will know what was done here from the OutFileList containing the datafind jobs (and the Dagman nodes, if appropriate). For now the only implemented option is to generate the datafind files at runtime. This module can also check whether the frame files actually exist, check whether the obtained segments line up with the original ones, and update the science segments to reflect missing data files.

Parameters:
  • workflow (pycbc.workflow.core.Workflow) – The workflow class that stores the jobs that will be run.
  • scienceSegs (Dictionary of ifo keyed glue.segment.segmentlist instances) – This contains the times that the workflow is expected to analyse.
  • outputDir (path) – All output files written by datafind processes will be written to this directory.
  • seg_file (SegFile, optional (default=None)) – The file returned by get_science_segments containing the science segments and the associated segment_summary. This will be used for the segment_summary test and is required if, and only if, performing that test.
  • tags (list of strings, optional (default=None)) – Use this to specify tags. This can be used if this module is being called more than once to give call-specific configuration (by setting options in [workflow-datafind-${TAG}] rather than [workflow-datafind]). This is also used to tag the Files returned by the class, making both the Files and the actual filenames unique. FIXME: Filenames may not be unique with current codes!
Returns:

  • datafindOuts (OutGroupList) – List of all the datafind output files for use later in the pipeline.
  • sci_avlble_file (SegFile) – SegFile containing the analysable time after checks in the datafind module are applied to the input segment list. For production runs this is expected to be equal to the input segment list.
  • scienceSegs (Dictionary of ifo keyed glue.segment.segmentlist instances) – This contains the times that the workflow is expected to analyse. If the updateSegmentTimes kwarg is given this will be updated to reflect any instances of missing data.
  • sci_avlble_name (string) – The name with which the analysable time is stored in the sci_avlble_file.
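
A minimal usage sketch, assuming a Workflow instance, ifo-keyed science segments, and a science segment file are already in hand from earlier stages (the variable names wflow, sci_segs and sci_seg_file are illustrative, not part of the API):

from pycbc.workflow import datafind

# 'wflow', 'sci_segs' and 'sci_seg_file' are assumed to come from the
# workflow-generation and segment-query stages that run before datafind.
datafind_files, sci_avlble_file, sci_segs, sci_avlble_name = \
    datafind.setup_datafind_workflow(wflow, sci_segs, "datafind",
                                     seg_file=sci_seg_file)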

pycbc.workflow.grb_utils module

This library code contains functions and classes that are used in the generation of pygrb workflows. For details about pycbc.workflow see here: http://pycbc.org/pycbc/latest/html/workflow.html

pycbc.workflow.grb_utils.get_coh_PTF_files(cp, ifos, run_dir, bank_veto=False, summary_files=False)[source]

Retrieve files needed to run coh_PTF jobs within a PyGRB workflow

Parameters:
  • cp (pycbc.workflow.configuration.WorkflowConfigParser object) – The parsed configuration options of a pycbc.workflow.core.Workflow.
  • ifos (str) – String containing the analysis interferometer IDs.
  • run_dir (str) – The run directory, destination for retrieved files.
  • bank_veto (Boolean) – If true, will retrieve the bank_veto_bank.xml file.
  • summary_files (Boolean) – If true, will retrieve the summary page style files.
Returns:

  • file_list (pycbc.workflow.FileList object) – A FileList containing the retrieved files.

pycbc.workflow.grb_utils.get_ipn_sky_files(workflow, file_url, tags=None)[source]

Retrieve the sky point files for searching over the IPN error box and populating it with injections.

Parameters:
  • workflow (pycbc.workflow.core.Workflow) – An instanced class that manages the constructed workflow.
  • file_url (string) – The URL of the IPN sky points file.
  • tags (list of strings) – If given these tags are used to uniquely name and identify output files that would be produced in multiple calls to this function.
Returns:

sky_points_file – File object representing the IPN sky points file.

Return type:

pycbc.workflow.core.File

pycbc.workflow.grb_utils.get_sky_grid_scale(sky_error, sigma_sys=6.8359)[source]

Calculate a suitable 3-sigma radius for the search patch, incorporating the Fermi GBM systematic if necessary.

pycbc.workflow.grb_utils.make_exttrig_file(cp, ifos, sci_seg, out_dir)[source]

Make an ExtTrig xml file containing information on the external trigger

Parameters:
  • cp (pycbc.workflow.configuration.WorkflowConfigParser object) – The parsed configuration options of a pycbc.workflow.core.Workflow.
  • ifos (str) – String containing the analysis interferometer IDs.
  • sci_seg (ligo.segments.segment) – The science segment for the analysis run.
  • out_dir (str) – The output directory, destination for the xml file.
Returns:

  • xml_file (pycbc.workflow.File object) – The xml file with external trigger information.

pycbc.workflow.grb_utils.make_gating_node(workflow, datafind_files, outdir=None, tags=None)[source]

Generate jobs for autogating the data for PyGRB runs.

Parameters:
  • workflow (pycbc.workflow.core.Workflow) – An instanced class that manages the constructed workflow.
  • datafind_files (pycbc.workflow.core.FileList) – A FileList containing the frame files to be gated.
  • outdir (string) – Path of the output directory
  • tags (list of strings) – If given these tags are used to uniquely name and identify output files that would be produced in multiple calls to this function.
Returns:

  • condition_strain_nodes (list) – List containing the pycbc.workflow.core.Node objects representing the autogating jobs.
  • condition_strain_outs (pycbc.workflow.core.FileList) – FileList containing the pycbc.workflow.core.File objects representing the gated frame files.

pycbc.workflow.grb_utils.set_grb_start_end(cp, start, end)[source]

Function to update the analysis boundaries as the workflow is generated

Parameters:
  • cp (pycbc.workflow.configuration.WorkflowConfigParser object) – The parsed configuration options of a pycbc.workflow.core.Workflow.
  • start (int) – The start of the workflow analysis time.
  • end (int) – The end of the workflow analysis time.
Returns:

  • cp (pycbc.workflow.configuration.WorkflowConfigParser object) – The modified WorkflowConfigParser object.
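
A minimal usage sketch, assuming cp is the workflow’s WorkflowConfigParser and the GPS boundaries have already been derived from the external trigger time (the values below are illustrative):

from pycbc.workflow.grb_utils import set_grb_start_end

# Illustrative GPS boundaries; real values come from the trigger time
# and the configured analysis window.
cp = set_grb_start_end(cp, 1126259262, 1126259662)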

pycbc.workflow.inference_followups module

Module that contains functions for setting up the inference workflow.

pycbc.workflow.inference_followups.create_fits_file(workflow, inference_file, output_dir, name='create_fits_file', analysis_seg=None, tags=None)[source]

Sets up job to create fits files from some given samples files.

Parameters:
  • workflow (pycbc.workflow.Workflow) – The workflow instance we are populating
  • inference_file (pycbc.workflow.File) – The file with posterior samples.
  • output_dir (str) – The directory to store result plots and files.
  • name (str, optional) – The name in the [executables] section of the configuration file to use, and the section to read for additional arguments to pass to the executable. Default is create_fits_file.
  • analysis_seg (ligo.segments.Segment, optional) – The segment this job encompasses. If None then use the total analysis time from the workflow.
  • tags (list, optional) – Tags to add to the inference executables.
Returns:

A list of output files.

Return type:

pycbc.workflow.FileList

pycbc.workflow.inference_followups.create_posterior_files(workflow, samples_files, output_dir, parameters=None, name='extract_posterior', analysis_seg=None, tags=None)[source]

Sets up job to create posterior files from some given samples files.

Parameters:
  • workflow (pycbc.workflow.Workflow) – The workflow instance we are populating
  • samples_files (str or list of str) – One or more files to extract the posterior samples from.
  • output_dir (str) – The directory to store result plots and files.
  • parameters (list, optional) – A list of the parameters to extract, and (optionally) a name for them to be mapped to. This is passed to the program’s --parameters argument.
  • name (str, optional) – The name in the [executables] section of the configuration file to use, and the section to read for additional arguments to pass to the executable. Default is extract_posterior.
  • analysis_seg (ligo.segments.Segment, optional) – The segment this job encompasses. If None then use the total analysis time from the workflow.
  • tags (list, optional) – Tags to add to the inference executables.
Returns:

A list of output files.

Return type:

pycbc.workflow.FileList

pycbc.workflow.inference_followups.get_diagnostic_plots(workflow)[source]

Determines what diagnostic plots to create based on workflow.

The plots to create are based on which executables are specified in the workflow’s config file. A list of strings is returned giving the diagnostic plots to create. This list may contain:

  • samples: For MCMC samplers, a plot of the sample chains as a function of iteration. This will be created if plot_samples is in the executables section.
  • acceptance_rate: For MCMC samplers, a plot of the acceptance rate. This will be created if plot_acceptance_rate is in the executables section.
Returns:List of names of diagnostic plots.
Return type:list
pycbc.workflow.inference_followups.get_plot_group(cp, section_tag)[source]

Gets plotting groups from [workflow-section_tag].

pycbc.workflow.inference_followups.get_posterior_params(cp, section='workflow-posterior_params')[source]

Gets the posterior parameters from the given config file.

The posterior parameters are read from the given section. Parameters should be specified as OUTPUT = [INPUT], where OUTPUT is what the parameter should be named in the posterior file and INPUT is the (function of) parameter(s) to read from the samples file. If no INPUT is provided, the INPUT name will be assumed to be the same as the OUTPUT. Example:

[workflow-posterior_params]
mass1 = primary_mass(mass1, mass2)
mass2 = secondary_mass(mass1, mass2)
distance =
Parameters:
  • cp (pycbc.workflow.configuration.WorkflowConfigParser) – The config parser to read the parameters from.
  • section (str, optional) – The section to read the parameters from. Default is workflow-posterior_params.
Returns:

List of strings giving INPUT:OUTPUT. This can be passed as the parameters argument to create_posterior_files().

Return type:

list
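
Putting this together with create_posterior_files(), a sketch of the typical flow; the workflow, cp, and samples_files objects are assumed to exist from earlier in the workflow script:

from pycbc.workflow import inference_followups as inff

# cp contains the [workflow-posterior_params] section shown above.
params = inff.get_posterior_params(cp)
posterior_files = inff.create_posterior_files(
    workflow, samples_files, "posterior_files", parameters=params)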

pycbc.workflow.inference_followups.make_diagnostic_plots(workflow, diagnostics, samples_file, label, rdir, tags=None)[source]

Makes diagnostic plots.

Diagnostic plots are sampler-specific plots that provide information on how the sampler performed. All diagnostic plots use the output file produced by pycbc_inference as their input. Diagnostic plots are added to the results directory rdir/NAME, where NAME is the name of the diagnostic given in diagnostics.

Parameters:
  • workflow (pycbc.workflow.core.Workflow) – The workflow to add the plotting jobs to.
  • diagnostics (list of str) – The names of the diagnostic plots to create. See get_diagnostic_plots() for recognized names.
  • samples_file ((list of) pycbc.workflow.File) – One or more samples files with which to create the diagnostic plots. If a list of files is provided, a diagnostic plot for each file will be created.
  • label (str) – Event label for the diagnostic plots.
  • rdir (pycbc.results.layout.SectionNumber) – Results directory layout.
  • tags (list of str, optional) – Additional tags to add to the file names.
Returns:

Dictionary of diagnostic name -> list of files giving the plots that will be created.

Return type:

dict
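
A sketch of how this pairs with get_diagnostic_plots(); the workflow, samples_file, and rdir objects are assumed inputs and the event label "EVENT1" is illustrative:

from pycbc.workflow import inference_followups as inff

# The diagnostics to create are inferred from the [executables] section.
diagnostics = inff.get_diagnostic_plots(workflow)
diagnostic_plots = inff.make_diagnostic_plots(
    workflow, diagnostics, samples_file, "EVENT1", rdir)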

pycbc.workflow.inference_followups.make_inference_acceptance_rate_plot(workflow, inference_file, output_dir, name='plot_acceptance_rate', analysis_seg=None, tags=None)[source]

Sets up a plot of the acceptance rate (for MCMC samplers).

Parameters:
  • workflow (pycbc.workflow.Workflow) – The core workflow instance we are populating
  • inference_file (pycbc.workflow.File) – The file with posterior samples.
  • output_dir (str) – The directory to store result plots and files.
  • name (str, optional) – The name in the [executables] section of the configuration file to use, and the section to read for additional arguments to pass to the executable. Default is plot_acceptance_rate.
  • analysis_seg (ligo.segments.Segment, optional) – The segment this job encompasses. If None then use the total analysis time from the workflow.
  • tags (list, optional) – Tags to add to the inference executables.
Returns:

A list of output files.

Return type:

pycbc.workflow.FileList

pycbc.workflow.inference_followups.make_inference_inj_plots(workflow, inference_files, output_dir, parameters, name='inference_recovery', analysis_seg=None, tags=None)[source]

Sets up the recovered versus injected parameter plot in the workflow.

Parameters:
  • workflow (pycbc.workflow.Workflow) – The core workflow instance we are populating
  • inference_files (pycbc.workflow.FileList) – The files with posterior samples.
  • output_dir (str) – The directory to store result plots and files.
  • parameters (list) – A list of parameters. Each parameter gets its own plot.
  • name (str) – The name in the [executables] section of the configuration file to use.
  • analysis_seg (ligo.segments.Segment, optional) – The segment this job encompasses. If None then use the total analysis time from the workflow.
  • tags (list, optional) – Tags to add to the inference executables.
Returns:

A list of result and output files.

Return type:

pycbc.workflow.FileList

pycbc.workflow.inference_followups.make_inference_plot(workflow, input_file, output_dir, name, analysis_seg=None, tags=None, input_file_opt='input-file', output_file_extension='.png', add_to_workflow=False)[source]

Boiler-plate function for creating a standard plotting job.

Parameters:
  • workflow (pycbc.workflow.Workflow) – The core workflow instance we are populating
  • input_file (pycbc.workflow.File) – The file used for the input.
  • output_dir (str) – The directory to store result plots.
  • name (str) – The name in the [executables] section of the configuration file to use.
  • analysis_seg (ligo.segments.Segment, optional) – The segment this job encompasses. If None then use the total analysis time from the workflow.
  • tags (list, optional) – Tags to add to the inference executables.
  • input_file_opt (str, optional) – The name of the input-file option used by the executable. Default is input-file.
  • output_file_extension (str, optional) – What file type to create. Default is .png.
  • add_to_workflow (bool, optional) – If True, the node will be added to the workflow before being returned. This means that no options may be added to the node afterward. Default is False.
Returns:

The job node for creating the plot.

Return type:

pycbc.workflow.plotting.PlotExecutable

pycbc.workflow.inference_followups.make_inference_posterior_plot(workflow, inference_file, output_dir, parameters=None, plot_prior_from_file=None, name='plot_posterior', analysis_seg=None, tags=None)[source]

Sets up the corner plot of the posteriors in the workflow.

Parameters:
  • workflow (pycbc.workflow.Workflow) – The core workflow instance we are populating
  • inference_file (pycbc.workflow.File) – The file with posterior samples.
  • output_dir (str) – The directory to store result plots and files.
  • parameters (list or str) – The parameters to plot.
  • plot_prior_from_file (str, optional) – Plot the prior from the given config file on the 1D marginal plots.
  • name (str, optional) – The name in the [executables] section of the configuration file to use, and the section to read for additional arguments to pass to the executable. Default is plot_posterior.
  • analysis_seg (ligo.segments.Segment, optional) – The segment this job encompasses. If None then use the total analysis time from the workflow.
  • tags (list, optional) – Tags to add to the inference executables.
Returns:

A list of output files.

Return type:

pycbc.workflow.FileList

pycbc.workflow.inference_followups.make_inference_prior_plot(workflow, config_file, output_dir, name='plot_prior', analysis_seg=None, tags=None)[source]

Sets up the corner plot of the priors in the workflow.

Parameters:
  • workflow (pycbc.workflow.Workflow) – The core workflow instance we are populating
  • config_file (pycbc.workflow.File) – The WorkflowConfigParser-parsable inference configuration file.
  • output_dir (str) – The directory to store result plots and files.
  • name (str) – The name in the [executables] section of the configuration file to use, and the section to read for additional arguments to pass to the executable. Default is plot_prior.
  • analysis_seg (ligo.segments.Segment, optional) – The segment this job encompasses. If None then use the total analysis time from the workflow.
  • tags (list, optional) – Tags to add to the inference executables.
Returns:

A list of the output files.

Return type:

pycbc.workflow.FileList

pycbc.workflow.inference_followups.make_inference_samples_plot(workflow, inference_file, output_dir, name='plot_samples', analysis_seg=None, tags=None)[source]

Sets up a plot of the samples versus iteration (for MCMC samplers).

Parameters:
  • workflow (pycbc.workflow.Workflow) – The core workflow instance we are populating
  • inference_file (pycbc.workflow.File) – The file with posterior samples.
  • output_dir (str) – The directory to store result plots and files.
  • name (str, optional) – The name in the [executables] section of the configuration file to use, and the section to read for additional arguments to pass to the executable. Default is plot_samples.
  • analysis_seg (ligo.segments.Segment, optional) – The segment this job encompasses. If None then use the total analysis time from the workflow.
  • tags (list, optional) – Tags to add to the inference executables.
Returns:

A list of output files.

Return type:

pycbc.workflow.FileList

pycbc.workflow.inference_followups.make_inference_skymap(workflow, fits_file, output_dir, name='plot_skymap', analysis_seg=None, tags=None)[source]

Sets up the skymap plot.

Parameters:
  • workflow (pycbc.workflow.Workflow) – The core workflow instance we are populating
  • fits_file (pycbc.workflow.File) – The fits file with the sky location.
  • output_dir (str) – The directory to store result plots and files.
  • name (str, optional) – The name in the [executables] section of the configuration file to use, and the section to read for additional arguments to pass to the executable. Default is plot_skymap.
  • analysis_seg (ligo.segments.Segment, optional) – The segment this job encompasses. If None then use the total analysis time from the workflow.
  • tags (list, optional) – Tags to add to the inference executables.
Returns:

A list of result and output files.

Return type:

pycbc.workflow.FileList

pycbc.workflow.inference_followups.make_inference_summary_table(workflow, inference_file, output_dir, parameters=None, print_metadata=None, name='table_summary', analysis_seg=None, tags=None)[source]

Sets up the html table summarizing parameter estimates.

Parameters:
  • workflow (pycbc.workflow.Workflow) – The core workflow instance we are populating
  • inference_file (pycbc.workflow.File) – The file with posterior samples.
  • output_dir (str) – The directory to store result plots and files.
  • parameters (list or str) – A list or string of parameters to generate the table for. If a string is provided, separate parameters should be space or new-line separated.
  • print_metadata (list or str) – A list or string of metadata parameters to print. Syntax is the same as for parameters.
  • name (str, optional) – The name in the [executables] section of the configuration file to use, and the section to read for additional arguments to pass to the executable. Default is table_summary.
  • analysis_seg (ligo.segments.Segment, optional) – The segment this job encompasses. If None then use the total analysis time from the workflow.
  • tags (list, optional) – Tags to add to the inference executables.
Returns:

A list of output files.

Return type:

pycbc.workflow.FileList

pycbc.workflow.inference_followups.make_posterior_workflow(workflow, samples_files, config_file, label, rdir, posterior_file_dir='posterior_files', tags=None)[source]

Adds jobs to a workflow that make a posterior file and subsequent plots.

The parameters to be written to the posterior file are read from the [workflow-posterior_params] section of the workflow’s config file; see get_posterior_params() for details.

Except for prior plots (which use the given inference config file), all subsequent jobs use the posterior file, and so may use the parameters provided in [workflow-posterior_params]. The following are created:

  • Summary table: an html table created using the table_summary executable. The parameters to print in the table are retrieved from the table-params option in the [workflow-summary_table] section. Metadata may also be printed by adding a print-metadata option to that section.
  • Summary posterior plots: a collection of posterior plots to include in the summary page, after the summary table. The parameters to plot are read from [workflow-summary_plots]. Parameters should be grouped together by providing plot-group-NAME = PARAM1[:LABEL1] PARAM2[:LABEL2] in that section, where NAME is a unique name for each group. One posterior plot will be created for each plot group. For clarity, only one or two parameters should be plotted in each summary group, but this is not enforced. Settings for the plotting executable are read from the plot_posterior_summary section; likewise, the executable used is read from plot_posterior_summary in the [executables] section.
  • Sky maps: if both create_fits_file and plot_skymap are listed in the [executables] section, then a .fits file and sky map plot will be produced. The sky map plot will be included in the summary plots. You must be running in a python 3 environment to create these.
  • Prior plots: plots of the prior will be created using the plot_prior executable. By default, all of the variable parameters will be plotted. The prior plots are added to priors/LABEL/ in the results directory, where LABEL is the given label.
  • Posterior plots: additional posterior plots are created using the plot_posterior executable. The parameters to plot are read from [workflow-plot_params] section. As with the summary posterior plots, parameters are grouped together by providing plot-group-NAME options in that section. A posterior plot will be created for each group, and added to the posteriors/LABEL/ directory. Plot settings are read from the [plot_posterior] section; this is kept separate from the posterior summary so that different settings can be used. For example, you may want to make a density plot for the summary plots, but a scatter plot colored by SNR for the posterior plots.
Parameters:
  • samples_files (pycbc.workflow.core.FileList) – List of samples files to combine into a single posterior file.
  • config_file (pycbc.workflow.File) – The inference configuration file used to generate the samples file(s). This is needed to make plots of the prior.
  • label (str) – Unique label for the plots. Used in file names.
  • rdir (pycbc.results.layout.SectionNumber) – The results directory to save the plots to.
  • posterior_file_dir (str, optional) – The name of the directory to save the posterior file to. Default is posterior_files.
  • tags (list of str, optional) – Additional tags to add to the file names.
Returns:

  • posterior_file (pycbc.workflow.File) – The posterior file that was created.
  • summary_files (list) – List of files to go on the summary results page.
  • prior_plots (list) – List of prior plots that will be created. These will be saved to priors/LABEL/ in the results directory, where LABEL is the provided label.
  • posterior_plots (list) – List of posterior plots that will be created. These will be saved to posteriors/LABEL/ in the results directory.
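
A usage sketch, assuming one call per analysed event and that the samples files, inference configuration file, and results-directory layout already exist (the names samples_files, inference_config, rdir and the label "EVENT1" are illustrative):

from pycbc.workflow.inference_followups import make_posterior_workflow

(posterior_file, summary_files,
 prior_plots, posterior_plots) = make_posterior_workflow(
    workflow, samples_files, inference_config, "EVENT1", rdir)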

pycbc.workflow.injection module

This module is responsible for setting up the part of a pycbc workflow that will generate the injection files to be used for assessing the workflow’s ability to detect predicted signals. (In ihope parlance, this sets up the inspinj jobs). Full documentation for this module can be found here: https://ldas-jobs.ligo.caltech.edu/~cbc/docs/pycbc/NOTYETCREATED.html

pycbc.workflow.injection.compute_inj_optimal_snr(workflow, inj_file, precalc_psd_files, out_dir, tags=None)[source]

Set up a job for computing optimal SNRs of a sim_inspiral file.

pycbc.workflow.injection.cut_distant_injections(workflow, inj_file, out_dir, tags=None)[source]

Set up a job for removing injections that are too distant to be seen

pycbc.workflow.injection.setup_injection_workflow(workflow, output_dir=None, inj_section_name='injections', exttrig_file=None, tags=None)[source]

This function is the gateway for setting up injection-generation jobs in a workflow. It should be possible for this function to support a number of different ways/codes that could be used for doing this; however, as this will presumably stay as a single call to a single code (which need not be inspinj), there are currently no subfunctions in this module.

Parameters:
  • workflow (pycbc.workflow.core.Workflow) – The Workflow instance that the coincidence jobs will be added to.
  • output_dir (path) – The directory in which injection files will be stored.
  • inj_section_name (string (optional, default='injections')) – The string that corresponds to the option describing the exe location in the [executables] section of the .ini file and that corresponds to the section (and sub-sections) giving the options that will be given to the code at run time.
  • tags (list of strings (optional, default = [])) – A list of the tagging strings that will be used for all jobs created by this call to the workflow. This will be used in output names.
Returns:

  • inj_files (pycbc.workflow.core.FileList) – The list of injection files created by this call.
  • inj_tags (list of strings) – The tag corresponding to each injection file and used to uniquely identify them. The FileList class contains functions to search based on tags.
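
A minimal usage sketch; the returned tags can then be used to pair each injection set with a matched-filter run (the workflow object is an assumed input):

from pycbc.workflow.injection import setup_injection_workflow

inj_files, inj_tags = setup_injection_workflow(workflow,
                                               output_dir="inj_files")
for inj_file, tag in zip(inj_files, inj_tags):
    # each injection set typically drives its own matched-filter pass
    print(tag, inj_file)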

pycbc.workflow.injection.veto_injections(workflow, inj_file, veto_file, veto_name, out_dir, tags=None)[source]

pycbc.workflow.jobsetup module

This library code contains functions and classes that are used to set up and add jobs/nodes to a pycbc workflow. For details about pycbc.workflow see: https://ldas-jobs.ligo.caltech.edu/~cbc/docs/pycbc/ahope.html

class pycbc.workflow.jobsetup.ComputeDurationsExecutable(cp, exe_name, universe=None, ifo=None, out_dir=None, tags=None)[source]

Bases: pycbc.workflow.jobsetup.SQLInOutExecutable

The class responsible for making jobs for pycbc_compute_durations.

create_node(job_segment, input_file, summary_xml_file)[source]

Default node constructor.

This is usually overridden by subclasses of Executable.

current_retention_level = 1
class pycbc.workflow.jobsetup.ExtractToXMLExecutable(cp, exe_name, universe=None, ifo=None, out_dir=None, tags=None)[source]

Bases: pycbc.workflow.core.Executable

This class is responsible for running ligolw_sqlite jobs that will take an SQL file and dump it back to XML.

create_node(job_segment, input_file)[source]

Default node constructor.

This is usually overridden by subclasses of Executable.

current_retention_level = 1
class pycbc.workflow.jobsetup.GstlalFarfromsnrchisqhistExecutable(cp, exe_name, universe=None, ifo=None, out_dir=None, tags=None)[source]

Bases: pycbc.workflow.core.Executable

The class responsible for running the gstlal far from chisq hist jobs

create_node(job_segment, non_inj_db, marg_input_file, inj_database=None, write_background_bins=False)[source]

Default node constructor.

This is usually overridden by subclasses of Executable.

current_retention_level = 4
class pycbc.workflow.jobsetup.GstlalMarginalizeLikelihoodExecutable(cp, exe_name, universe=None, ifo=None, out_dir=None, tags=None)[source]

Bases: pycbc.workflow.core.Executable

The class responsible for running the gstlal marginalize_likelihood jobs

create_node(job_segment, input_file)[source]

Default node constructor.

This is usually overridden by subclasses of Executable.

current_retention_level = 4
class pycbc.workflow.jobsetup.GstlalPlotBackground(cp, exe_name, universe=None, ifo=None, out_dir=None, tags=None)[source]

Bases: pycbc.workflow.core.Executable

The class responsible for running gstlal_plot_background

create_node(non_inj_db, likelihood_file)[source]

Default node constructor.

This is usually overridden by subclasses of Executable.

current_retention_level = 4
class pycbc.workflow.jobsetup.GstlalPlotSensitivity(cp, exe_name, universe=None, ifo=None, out_dir=None, tags=None)[source]

Bases: pycbc.workflow.core.Executable

The class responsible for running gstlal_plot_sensitivity

create_node(non_inj_db, injection_dbs)[source]

Default node constructor.

This is usually overridden by subclasses of Executable.

current_retention_level = 4
class pycbc.workflow.jobsetup.GstlalPlotSummary(cp, exe_name, universe=None, ifo=None, out_dir=None, tags=None)[source]

Bases: pycbc.workflow.core.Executable

The class responsible for running gstlal_plot_summary

create_node(non_inj_db, injection_dbs)[source]

Default node constructor.

This is usually overridden by subclasses of Executable.

current_retention_level = 4
class pycbc.workflow.jobsetup.GstlalSummaryPage(cp, exe_name, universe=None, ifo=None, out_dir=None, tags=None)[source]

Bases: pycbc.workflow.core.Executable

The class responsible for running gstlal_inspiral_summary_page

create_and_add_node(workflow, parent_nodes)[source]
current_retention_level = 4
class pycbc.workflow.jobsetup.InspinjfindExecutable(cp, exe_name, universe=None, ifo=None, out_dir=None, tags=None)[source]

Bases: pycbc.workflow.core.Executable

The class responsible for running jobs with pycbc_inspinjfind

create_node(job_segment, input_file)[source]

Default node constructor.

This is usually overridden by subclasses of Executable.

current_retention_level = 1
class pycbc.workflow.jobsetup.JobSegmenter(data_lengths, valid_chunks, valid_lengths, curr_seg, curr_exe_class, compatibility_mode=False)[source]

Bases: object

This class is used when running sngl_ifo_job_setup to determine what times should be analysed by each job and what data is needed.

get_data_times_for_job(num_job)[source]

Get the data that this job will read in.

get_data_times_for_job_legacy(num_job)[source]

Get the data that this job will need to read in.

get_data_times_for_job_workflow(num_job)[source]

Get the data that this job will need to read in.

get_valid_times_for_job(num_job, allow_overlap=True)[source]

Get the times for which this job is valid.

get_valid_times_for_job_legacy(num_job)[source]

Get the times for which the job num_job will be valid, using the method used in inspiral hipe.

get_valid_times_for_job_workflow(num_job, allow_overlap=True)[source]

Get the times for which the job num_job will be valid, using workflow’s method.

pick_tile_size(seg_size, data_lengths, valid_chunks, valid_lengths)[source]

Choose the job tile size based on the science segment length

class pycbc.workflow.jobsetup.LalappsInspinjExecutable(cp, name, universe=None, ifos=None, out_dir=None, tags=None)[source]

Bases: pycbc.workflow.core.Executable

The class used to create jobs for the lalapps_inspinj Executable.

create_node(segment, exttrig_file=None, tags=None)[source]

Default node constructor.

This is usually overridden by subclasses of Executable.

current_retention_level = 4
class pycbc.workflow.jobsetup.LigoLWCombineSegsExecutable(cp, name, universe=None, ifos=None, out_dir=None, tags=None)[source]

Bases: pycbc.workflow.core.Executable

This class is used to create nodes for the ligolw_combine_segments Executable

create_node(valid_seg, veto_files, segment_name)[source]

Default node constructor.

This is usually overridden by subclasses of Executable.

current_retention_level = 4
class pycbc.workflow.jobsetup.LigolwAddExecutable(*args, **kwargs)[source]

Bases: pycbc.workflow.core.Executable

The class used to create nodes for the ligolw_add Executable.

create_node(jobSegment, input_files, output=None, use_tmp_subdirs=True, tags=None)[source]

Default node constructor.

This is usually overridden by subclasses of Executable.

current_retention_level = 1
class pycbc.workflow.jobsetup.LigolwCBCAlignTotalSpinExecutable(cp, exe_name, universe=None, ifos=None, out_dir=None, tags=None)[source]

Bases: pycbc.workflow.core.Executable

The class used to create jobs for the ligolw_cbc_align_total_spin executable.

create_node(parent, segment, tags=None)[source]

Default node constructor.

This is usually overridden by subclasses of Executable.

current_retention_level = 3
class pycbc.workflow.jobsetup.LigolwCBCJitterSkylocExecutable(cp, exe_name, universe=None, ifos=None, out_dir=None, tags=None)[source]

Bases: pycbc.workflow.core.Executable

The class used to create jobs for the ligolw_cbc_skyloc_jitter executable.

create_node(parent, segment, tags=None)[source]

Default node constructor.

This is usually overridden by subclasses of Executable.

current_retention_level = 3
class pycbc.workflow.jobsetup.LigolwSSthincaExecutable(cp, exe_name, universe=None, ifo=None, out_dir=None, dqVetoName=None, tags=None)[source]

Bases: pycbc.workflow.core.Executable

The class responsible for making jobs for ligolw_sstinca.

create_node(jobSegment, coincSegment, inputFile, tags=None, write_likelihood=False)[source]

Default node constructor.

This is usually overridden by subclasses of Executable.

current_retention_level = 3
class pycbc.workflow.jobsetup.PyCBCInspiralExecutable(cp, exe_name, ifo=None, out_dir=None, injection_file=None, tags=None)[source]

Bases: pycbc.workflow.core.Executable

The class used to create jobs for pycbc_inspiral Executable.

create_node(data_seg, valid_seg, parent=None, dfParents=None, tags=None)[source]

Default node constructor.

This is usually overridden by subclasses of Executable.

current_retention_level = 2
file_input_options = ['--gating-file']
get_valid_times()[source]

Determine possible dimensions of needed input and valid output

zero_pad_data_extend(job_data_seg, curr_seg)[source]

When using zero padding, all data is analysable, but the setup functions must include the padding data where it is available so that we are not zero-padding in the middle of science segments. This function takes a job_data_seg that has been chosen for a particular node and extends it by segment-start-pad and segment-end-pad if that data is available.
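
The padding arithmetic can be pictured with plain ligo.segments operations. This is an illustrative re-derivation under assumed pad lengths, not the method’s actual code:

from ligo.segments import segment

# Extend a job's data segment by assumed start/end pads (8 s each here),
# but never beyond the enclosing science segment curr_seg.
start_pad, end_pad = 8, 8
job_data_seg = segment(1000, 3000)
curr_seg = segment(900, 3004)
extended = segment(max(curr_seg[0], job_data_seg[0] - start_pad),
                   min(curr_seg[1], job_data_seg[1] + end_pad))
print(extended)  # 992 to 3004: full pad at the start, truncated at the end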

class pycbc.workflow.jobsetup.PyCBCMultiInspiralExecutable(cp, name, universe=None, ifo=None, injection_file=None, gate_files=None, out_dir=None, tags=None)[source]

Bases: pycbc.workflow.core.Executable

The class responsible for setting up jobs for the pycbc_multi_inspiral executable.

create_node(data_seg, valid_seg, parent=None, inj_file=None, dfParents=None, bankVetoBank=None, ipn_file=None, slide=None, tags=None)[source]

Default node constructor.

This is usually overridden by subclasses of Executable.

current_retention_level = 2
file_input_options = ['--gating-file']
get_valid_times()[source]
class pycbc.workflow.jobsetup.PyCBCTmpltbankExecutable(cp, exe_name, ifo=None, out_dir=None, tags=None, write_psd=False, psd_files=None)[source]

Bases: pycbc.workflow.core.Executable

The class used to create jobs for the pycbc_geom_nonspin_bank Executable and any other Executables using the same command line option groups.

create_nodata_node(valid_seg, tags=None)[source]

A simplified version of create_node that creates a node that does not need to read in data.

Parameters:valid_seg (glue.segment) – The segment over which to declare the node valid. Usually this would be the duration of the analysis.
Returns:node – The instance corresponding to the created node.
Return type:pycbc.workflow.core.Node
create_node(data_seg, valid_seg, parent=None, dfParents=None, tags=None)[source]

Default node constructor.

This is usually overridden by subclasses of Executable.

current_retention_level = 3
get_valid_times()[source]
class pycbc.workflow.jobsetup.PycbcCalculateFarExecutable(cp, exe_name, universe=None, ifo=None, out_dir=None, tags=None)[source]

Bases: pycbc.workflow.jobsetup.SQLInOutExecutable

The class responsible for making jobs for the FAR calculation code. This subclass only raises the default retention level.

current_retention_level = 4
class pycbc.workflow.jobsetup.PycbcCalculateLikelihoodExecutable(cp, exe_name, universe=None, ifo=None, out_dir=None, tags=None)[source]

Bases: pycbc.workflow.core.Executable

The class responsible for running the pycbc_calculate_likelihood executable which is part 4 of 4 of the gstlal_inspiral_calc_likelihood functionality

create_node(job_segment, trigger_file, likelihood_file, horizon_dist_file)[source]

Default node constructor.

This is usually overridden by subclasses of Executable.

current_retention_level = 4
class pycbc.workflow.jobsetup.PycbcCombineLikelihoodExecutable(cp, exe_name, universe=None, ifo=None, out_dir=None, tags=None)[source]

Bases: pycbc.workflow.core.Executable

The class responsible for running the pycbc_combine_likelihood executable which is part 2 of 4 of the gstlal_inspiral_calc_likelihood functionality

create_node(job_segment, likelihood_files, horizon_dist_file)[source]

Default node constructor.

This is usually overridden by subclasses of Executable.

current_retention_level = 1
class pycbc.workflow.jobsetup.PycbcConditionStrainExecutable(cp, exe_name, ifo=None, out_dir=None, universe=None, tags=None)[source]

Bases: pycbc.workflow.core.Executable

The class responsible for creating jobs for pycbc_condition_strain.

create_node(input_files, tags=None)[source]

Default node constructor.

This is usually overridden by subclasses of Executable.

current_retention_level = 2
class pycbc.workflow.jobsetup.PycbcCreateInjectionsExecutable(cp, exe_name, ifo=None, out_dir=None, universe=None, tags=None)[source]

Bases: pycbc.workflow.core.Executable

The class responsible for creating jobs for pycbc_create_injections.

create_node(config_file=None, seed=None, tags=None)[source]

Set up a CondorDagmanNode class to run pycbc_create_injections.

Parameters:
  • config_file (pycbc.workflow.core.File) – A pycbc.workflow.core.File pointing to the configuration file to be used with the --config-files option.
  • seed (int) – Seed to use for generating injections.
  • tags (list) – A list of tags to include in filenames.
Returns:

node – The node to run the job.

Return type:

pycbc.workflow.core.Node

current_retention_level = 2
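
A usage sketch, assuming cp and workflow exist, that create_injections is defined in the [executables] section, and that the workflow object exposes an add_node method as pycbc workflow scripts commonly use (inj_config is an illustrative File object):

from pycbc.workflow.jobsetup import PycbcCreateInjectionsExecutable

exe = PycbcCreateInjectionsExecutable(cp, "create_injections",
                                      out_dir="inj_files")
node = exe.create_node(config_file=inj_config, seed=123)
workflow.add_node(node)
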
class pycbc.workflow.jobsetup.PycbcDarkVsBrightInjectionsExecutable(cp, exe_name, universe=None, ifos=None, out_dir=None, tags=None)[source]

Bases: pycbc.workflow.core.Executable

The class used to create jobs for the pycbc_dark_vs_bright_injections Executable.

create_node(parent, segment, tags=None)[source]

Default node constructor.

This is usually overridden by subclasses of Executable.

current_retention_level = 4
class pycbc.workflow.jobsetup.PycbcGenerateRankingDataExecutable(cp, exe_name, universe=None, ifo=None, out_dir=None, tags=None)[source]

Bases: pycbc.workflow.core.Executable

The class responsible for running the pycbc_gen_ranking_data executable which is part 3 of 4 of the gstlal_inspiral_calc_likelihood functionality

create_node(job_segment, likelihood_file, horizon_dist_file)[source]

Default node constructor.

This is usually overridden by subclasses of Executable.

current_retention_level = 1
class pycbc.workflow.jobsetup.PycbcInferenceExecutable(cp, exe_name, ifos=None, out_dir=None, universe=None, tags=None)[source]

Bases: pycbc.workflow.core.Executable

The class responsible for creating jobs for pycbc_inference.

create_node(config_file, seed=None, tags=None, analysis_time=None)[source]

Set up a CondorDagmanNode class to run pycbc_inference.

Parameters:
  • config_file (pycbc.workflow.core.File) – A pycbc.workflow.core.File pointing to the inference configuration file, to be used with the --config-files option.
  • seed (int) – An int to be used with the --seed option.
  • tags (list) – A list of tags to include in filenames.
Returns:

node – The node to run the job.

Return type:

pycbc.workflow.core.Node

current_retention_level = 2
class pycbc.workflow.jobsetup.PycbcPickleHorizonDistsExecutable(cp, exe_name, universe=None, ifo=None, out_dir=None, tags=None)[source]

Bases: pycbc.workflow.core.Executable

The class responsible for running the pycbc_pickle_horizon_distances executable which is part 1 of 4 of the gstlal_inspiral_calc_likelihood functionality

create_node(job_segment, trigger_files)[source]

Default node constructor.

This is usually overridden by subclasses of Executable.

current_retention_level = 1
class pycbc.workflow.jobsetup.PycbcSplitBankExecutable(cp, exe_name, num_banks, ifo=None, out_dir=None, universe=None)[source]

Bases: pycbc.workflow.core.Executable

The class responsible for creating jobs for pycbc_hdf5_splitbank.

create_node(bank, tags=None)[source]

Set up a CondorDagmanNode class to run splitbank code

Parameters:bank (pycbc.workflow.core.File) – The File containing the template bank to be split
Returns:node – The node to run the job
Return type:pycbc.workflow.core.Node
current_retention_level = 2
extension = '.hdf'
class pycbc.workflow.jobsetup.PycbcSplitBankXmlExecutable(cp, exe_name, num_banks, ifo=None, out_dir=None, universe=None)[source]

Bases: pycbc.workflow.jobsetup.PycbcSplitBankExecutable

Subclass responsible for creating jobs for pycbc_splitbank.

extension = '.xml.gz'
class pycbc.workflow.jobsetup.PycbcSplitInspinjExecutable(cp, exe_name, num_splits, universe=None, ifo=None, out_dir=None)[source]

Bases: pycbc.workflow.core.Executable

The class responsible for running the pycbc_split_inspinj executable

create_node(parent, tags=None)[source]

Default node constructor.

This is usually overridden by subclasses of Executable.

current_retention_level = 1
class pycbc.workflow.jobsetup.PycbcSqliteSimplifyExecutable(cp, exe_name, universe=None, ifo=None, out_dir=None, tags=None)[source]

Bases: pycbc.workflow.core.Executable

The class responsible for making jobs for pycbc_sqlite_simplify.

create_node(job_segment, inputFiles, injFile=None, injString=None, workflow=None)[source]

Default node constructor.

This is usually overridden by subclasses of Executable.

current_retention_level = 1
class pycbc.workflow.jobsetup.PycbcTimeslidesExecutable(cp, name, universe=None, ifos=None, out_dir=None, tags=None)[source]

Bases: pycbc.workflow.core.Executable

The class used to create jobs for the pycbc_timeslides Executable.

create_node(segment)[source]

Default node constructor.

This is usually overridden by subclasses of Executable.

current_retention_level = 4
class pycbc.workflow.jobsetup.SQLInOutExecutable(cp, exe_name, universe=None, ifo=None, out_dir=None, tags=None)[source]

Bases: pycbc.workflow.core.Executable

The class responsible for making jobs for SQL codes taking one input and one output.

create_node(job_segment, input_file)[source]

Default node constructor.

This is usually overridden by subclasses of Executable.

current_retention_level = 2
pycbc.workflow.jobsetup.identify_needed_data(curr_exe_job, link_job_instance=None)[source]

This function will identify the length of data that a specific executable needs to analyse and what part of that data is valid (i.e. inspiral doesn’t analyse the first or last 64+8s of data it reads in).

In addition you can supply a second job instance to “link” to, which will ensure that the two jobs have a one-to-one correspondence (i.e. one template bank per matched-filter job) and that the corresponding jobs will be “valid” at the same times.

Parameters:
  • curr_exe_job (Job) – An instance of the Job class that has a get_valid_times method.
  • link_job_instance (Job instance, optional) – Coordinate the valid times with another executable.
Returns:

  • dataLength (float) – The amount of data (in seconds) that each instance of the job must read in.
  • valid_chunk (glue.segment.segment) – The times within dataLength for which that job’s output can be valid (i.e. for inspiral this is (72, dataLength-72) since, for a standard setup, the inspiral job cannot look for triggers in the first or last 72 seconds of the data read in).
  • valid_length (float) – The maximum length of data each job can be valid for. If not using link_job_instance this is abs(valid_segment), but it can be smaller than that if, for example, the linked job only analyses a small amount of data.
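
A minimal usage sketch; inspiral_job is assumed to be an already-constructed Executable instance (for example a PyCBCInspiralExecutable) with a get_valid_times method:

from pycbc.workflow.jobsetup import identify_needed_data

data_length, valid_chunk, valid_length = identify_needed_data(inspiral_job)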

pycbc.workflow.jobsetup.int_gps_time_to_str(t)[source]

Takes an integer GPS time, either given as int or lal.LIGOTimeGPS, and converts it to a string. If a LIGOTimeGPS with nonzero decimal part is given, raises a ValueError.
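
For example:

from pycbc.workflow.jobsetup import int_gps_time_to_str

print(int_gps_time_to_str(1126259462))  # '1126259462'
# A lal.LIGOTimeGPS with a nonzero nanosecond part raises ValueError.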

pycbc.workflow.jobsetup.multi_ifo_coherent_job_setup(workflow, out_files, curr_exe_job, science_segs, datafind_outs, output_dir, parents=None, slide_dict=None, tags=None)[source]

Method for setting up coherent inspiral jobs.

pycbc.workflow.jobsetup.select_generic_executable(workflow, exe_tag)[source]

Returns a class that is appropriate for setting up jobs to run executables having specific tags in the workflow config. Executables should not be “specialized” jobs fitting into one of the select_XXX_class functions above, i.e. not a matched filter or template bank job, which require extra setup.

Parameters:
  • workflow (pycbc.workflow.core.Workflow) – The Workflow instance.
  • exe_tag (string) – The name of the config section storing options for this executable and the option giving the executable path in the [executables] section.
Returns:

exe_class – Instances of the class (‘jobs’) must have a method job.create_node()

Return type:

Sub-class of pycbc.workflow.core.Executable that holds utility functions appropriate for the given executable

pycbc.workflow.jobsetup.select_matchedfilter_class(curr_exe)[source]

This function returns a class that is appropriate for setting up matched-filtering jobs within workflow.

Parameters:curr_exe (string) – The name of the matched filter executable to be used.
Returns:exe_class – Instances of the class (‘jobs’) must have methods job.create_node() and job.get_valid_times(ifo, )
Return type:Sub-class of pycbc.workflow.core.Executable that holds utility functions appropriate for the given executable
pycbc.workflow.jobsetup.select_tmpltbank_class(curr_exe)[source]

This function returns a class that is appropriate for setting up template bank jobs within workflow.

Parameters:curr_exe (string) – The name of the executable to be used for generating template banks.
Returns:exe_class – Instances of the class (‘jobs’) must have methods job.create_node() and job.get_valid_times(ifo, )
Return type:Sub-class of pycbc.workflow.core.Executable that holds utility functions appropriate for the given executable
pycbc.workflow.jobsetup.sngl_ifo_job_setup(workflow, ifo, out_files, curr_exe_job, science_segs, datafind_outs, parents=None, link_job_instance=None, allow_overlap=True, compatibility_mode=True)[source]

This function sets up a set of single ifo jobs. A basic overview of how this works is as follows:

  • (1) Identify the length of data that each job needs to read in, and what part of that data the job is valid for.
  • START LOOPING OVER SCIENCE SEGMENTS
  • (2) Identify how many jobs are needed (if any) to cover the given science segment and the time shift between jobs. If no jobs are needed, continue.
  • START LOOPING OVER JOBS
  • (3) Identify the time that the given job should produce valid output (i.e. inspiral triggers) over.
  • (4) Identify the data range that the job will need to read in to produce the aforementioned valid output.
  • (5) Identify all parents/inputs of the job.
  • (6) Add the job to the workflow.
  • END LOOPING OVER JOBS
  • END LOOPING OVER SCIENCE SEGMENTS
Parameters:
  • workflow (pycbc.workflow.core.Workflow) – An instance of the Workflow class that manages the constructed workflow.
  • ifo (string) – The name of the ifo to set up the jobs for
  • out_files (pycbc.workflow.core.FileList) – The FileList containing the list of jobs. Jobs will be appended to this list, and it does not need to be empty when supplied.
  • curr_exe_job (Job) – An instance of the Job class that has a get_valid_times method.
  • science_segs (ligo.segments.segmentlist) – The list of times that the jobs should cover
  • datafind_outs (pycbc.workflow.core.FileList) – The file list containing the datafind files.
  • parents (pycbc.workflow.core.FileList (optional, kwarg, default=None)) – The FileList containing the list of jobs that are parents to the one being set up.
  • link_job_instance (Job instance (optional),) – Coordinate the valid times with another Executable.
  • allow_overlap (boolean (optional, kwarg, default = True)) – If this is set the times that jobs are valid for will be allowed to overlap. This may be desired for template banks which may have some overlap in the times they cover. This may not be desired for inspiral jobs, where you probably want triggers recorded by jobs to not overlap at all.
  • compatibility_mode (boolean (optional, kwarg, default = True)) – If given, the jobs will be tiled using the same method as inspiral hipe. This requires that link_job_instance is also given. If not given, workflow’s methods are used.
Returns:

out_files – A list of the files that will be generated by this step in the workflow.

Return type:

pycbc.workflow.core.FileList
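
A sketch of the typical per-ifo loop around this function; the workflow, per-ifo job instances (exe_jobs), science segments (sci_segs) and datafind files are assumed to come from the earlier stages:

from pycbc.workflow.core import FileList
from pycbc.workflow.jobsetup import sngl_ifo_job_setup

inspiral_files = FileList([])
for ifo in sci_segs:
    sngl_ifo_job_setup(workflow, ifo, inspiral_files, exe_jobs[ifo],
                       sci_segs[ifo], datafind_files)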

pycbc.workflow.matched_filter module

This module is responsible for setting up the matched-filtering stage of workflows. For details about this module and its capabilities see here: https://ldas-jobs.ligo.caltech.edu/~cbc/docs/pycbc/NOTYETCREATED.html

pycbc.workflow.matched_filter.setup_matchedfltr_dax_generated(workflow, science_segs, datafind_outs, tmplt_banks, output_dir, injection_file=None, tags=None, link_to_tmpltbank=False, compatibility_mode=False)[source]

Setup matched-filter jobs that are generated as part of the workflow. This module can support any matched-filter code that is similar in principle to lalapps_inspiral, but for new codes some additions are needed to define Executable and Job sub-classes (see jobutils.py).

Parameters:
  • workflow (pycbc.workflow.core.Workflow) – The Workflow instance that the coincidence jobs will be added to.
  • science_segs (ifo-keyed dictionary of ligo.segments.segmentlist instances) – The list of times that are being analysed in this workflow.
  • datafind_outs (pycbc.workflow.core.FileList) – A FileList of the datafind files that are needed to obtain the data used in the analysis.
  • tmplt_banks (pycbc.workflow.core.FileList) – A FileList of the template bank files that will serve as input in this stage.
  • output_dir (path) – The directory in which output will be stored.
  • injection_file (pycbc.workflow.core.File, optional (default=None)) – If given, the simulation file will be sent to these jobs on the command line. If not given, no file will be sent.
  • tags (list of strings (optional, default = [])) – A list of the tagging strings that will be used for all jobs created by this call to the workflow. An example might be [‘BNSINJECTIONS’] or [‘NOINJECTIONANALYSIS’]. This will be used in output names.
  • link_to_tmpltbank (boolean, optional (default=False)) – If this option is given, the job valid_times will be altered so that there will be one inspiral file for every template bank and they will cover the same time span. Note that this option must also be given during template bank generation to be meaningful.
Returns:

inspiral_outs – A list of output files written by this stage. This will not contain any intermediate products produced within this stage of the workflow. If you require access to any intermediate products produced at this stage you can call the various sub-functions directly.

Return type:

pycbc.workflow.core.FileList

pycbc.workflow.matched_filter.setup_matchedfltr_dax_generated_multi(workflow, science_segs, datafind_outs, tmplt_banks, output_dir, injection_file=None, tags=None, link_to_tmpltbank=False, compatibility_mode=False)[source]

Setup matched-filter jobs that are generated as part of the workflow in which a single job reads in and generates triggers over multiple ifos. This module can support any matched-filter code that is similar in principle to pycbc_multi_inspiral or lalapps_coh_PTF_inspiral, but for new codes some additions are needed to define Executable and Job sub-classes (see jobutils.py).

Parameters:
  • workflow (pycbc.workflow.core.Workflow) – The Workflow instance that the coincidence jobs will be added to.
  • science_segs (ifo-keyed dictionary of ligo.segments.segmentlist instances) – The list of times that are being analysed in this workflow.
  • datafind_outs (pycbc.workflow.core.FileList) – A FileList of the datafind files that are needed to obtain the data used in the analysis.
  • tmplt_banks (pycbc.workflow.core.FileList) – A FileList of the template bank files that will serve as input in this stage.
  • output_dir (path) – The directory in which output will be stored.
  • injection_file (pycbc.workflow.core.File, optional (default=None)) – If given, the file containing the injections (simulations) to be supplied to these jobs on the command line. If not given, no injection file will be sent.
  • tags (list of strings (optional, default = [])) – A list of the tagging strings that will be used for all jobs created by this call to the workflow. An example might be [‘BNSINJECTIONS’] or [‘NOINJECTIONANALYSIS’]. This will be used in output names.
Returns:

inspiral_outs – A list of output files written by this stage. This will not contain any intermediate products produced within this stage of the workflow. If you require access to any intermediate products produced at this stage you can call the various sub-functions directly.

Return type:

pycbc.workflow.core.FileList

pycbc.workflow.matched_filter.setup_matchedfltr_workflow(workflow, science_segs, datafind_outs, tmplt_banks, output_dir=None, injection_file=None, tags=None)[source]

This function aims to be the gateway for setting up a set of matched-filter jobs in a workflow. This function is intended to support multiple different ways/codes that could be used for doing this. For now the only supported sub-module is one that runs the matched-filtering by setting up a series of matched-filtering jobs, from one executable, to create matched-filter triggers covering the full range of science times for which there is data and a template bank file.

Parameters:
  • workflow (pycbc.workflow.core.Workflow) – The workflow instance that the coincidence jobs will be added to.
  • science_segs (ifo-keyed dictionary of ligo.segments.segmentlist instances) – The list of times that are being analysed in this workflow.
  • datafind_outs (pycbc.workflow.core.FileList) – A FileList of the datafind files that are needed to obtain the data used in the analysis.
  • tmplt_banks (pycbc.workflow.core.FileList) – A FileList of the template bank files that will serve as input in this stage.
  • output_dir (path) – The directory in which output will be stored.
  • injection_file (pycbc.workflow.core.File, optional (default=None)) – If given, the file containing the injections (simulations) to be supplied to these jobs on the command line. If not given, no injection file will be sent.
  • tags (list of strings (optional, default = [])) – A list of the tagging strings that will be used for all jobs created by this call to the workflow. An example might be [‘BNSINJECTIONS’] or [‘NOINJECTIONANALYSIS’]. This will be used in output names.
Returns:

inspiral_outs – A list of output files written by this stage. This will not contain any intermediate products produced within this stage of the workflow. If you require access to any intermediate products produced at this stage you can call the various sub-functions directly.

Return type:

pycbc.workflow.core.FileList
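
As an illustration, a minimal call to this gateway might look like the sketch below. The workflow, science_segs, datafind_files and bank_files objects are assumed to come from the earlier datafind, segment and template bank stages; the tag name is hypothetical.

    from pycbc.workflow.matched_filter import setup_matchedfltr_workflow

    # Sketch only: all inputs are assumed to be produced by earlier stages.
    insp_files = setup_matchedfltr_workflow(
        workflow,               # pycbc.workflow.core.Workflow
        science_segs,           # ifo-keyed dict of ligo.segments.segmentlist
        datafind_files,         # FileList of datafind outputs
        bank_files,             # FileList of template banks
        output_dir='matchedfilter',
        tags=['FULL_DATA'])     # hypothetical tag, used in output names
    # insp_files is a pycbc.workflow.core.FileList of trigger files.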

pycbc.workflow.minifollowups module

class pycbc.workflow.minifollowups.PlotQScanExecutable(cp, name, universe=None, ifos=None, out_dir=None, tags=None)[source]

Bases: pycbc.workflow.plotting.PlotExecutable

Class used to create workflow.Executable instances for the pycbc_plot_qscan executable. Basically inherits directly from PlotExecutable but adds the file_input_options.

file_input_options = ['--gating-file']
class pycbc.workflow.minifollowups.SingleTemplateExecutable(cp, name, universe=None, ifos=None, out_dir=None, tags=None)[source]

Bases: pycbc.workflow.plotting.PlotExecutable

Class used to create workflow.Executable instances for the pycbc_single_template executable. Basically inherits directly from PlotExecutable but adds the file_input_options.

file_input_options = ['--gating-file']
class pycbc.workflow.minifollowups.SingleTimeFreqExecutable(cp, name, universe=None, ifos=None, out_dir=None, tags=None)[source]

Bases: pycbc.workflow.plotting.PlotExecutable

Class used to create workflow.Executable instances for the pycbc_plot_singles_timefreq executable. Basically inherits directly from PlotExecutable but adds the file_input_options.

file_input_options = ['--gating-file']
pycbc.workflow.minifollowups.create_noop_node()[source]

Creates a noop node that can be added to a DAX, doing nothing. This is needed because a minifollowups dax that contains no triggers would contain no jobs and be invalid; adding a single noop node ensures such daxes will actually run. Adding more than one noop node to a workflow will cause a failure.

pycbc.workflow.minifollowups.grouper(iterable, n, fillvalue=None)[source]

Group an iterable into tuples of length n, padding the final tuple with fillvalue.
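
The behaviour described matches the standard itertools “grouper” recipe; the sketch below is an equivalent implementation, not necessarily the exact pycbc code.

    from itertools import zip_longest

    def grouper(iterable, n, fillvalue=None):
        # grouper('ABCDEFG', 3, 'x') -> ('A','B','C') ('D','E','F') ('G','x','x')
        args = [iter(iterable)] * n
        return zip_longest(*args, fillvalue=fillvalue)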

pycbc.workflow.minifollowups.make_coinc_info(workflow, singles, bank, coinc, out_dir, n_loudest=None, trig_id=None, file_substring=None, tags=None)[source]
pycbc.workflow.minifollowups.make_inj_info(workflow, injection_file, injection_index, num, out_dir, tags=None)[source]
pycbc.workflow.minifollowups.make_plot_waveform_plot(workflow, params, out_dir, ifos, exclude=None, require=None, tags=None)[source]

Add plot_waveform jobs to the workflow.

pycbc.workflow.minifollowups.make_qscan_plot(workflow, ifo, trig_time, out_dir, injection_file=None, data_segments=None, time_window=100, tags=None)[source]

Generate a make_qscan node and add it to workflow.

This function generates a single node of the qscan plotting executable and adds it to the current workflow. Parent/child relationships are set by the input/output files automatically.

Parameters:
  • workflow (pycbc.workflow.core.Workflow) – The workflow class that stores the jobs that will be run.
  • ifo (str) – Which interferometer are we using?
  • trig_time (int) – The time of the trigger being followed up.
  • out_dir (str) – Location of directory to output to
  • injection_file (pycbc.workflow.File (optional, default=None)) – If given, add the injections in the file to strain before making the plot.
  • data_segments (ligo.segments.segmentlist (optional, default=None)) – The list of segments for which data exists and can be read in. If given, the start/end times given to the job will be adjusted if [trig_time - time_window, trig_time + time_window] does not completely lie within a valid data segment. A ValueError will be raised if the trig_time is not within a valid segment, or if it is not possible to find 2*time_window (plus the padding) of continuous data around the trigger. This must be coalesced.
  • time_window (int (optional, default=100)) – The amount of data (not including padding) that will be read in by the job. The default value of 100s should be fine for most cases.
  • tags (list (optional, default=None)) – List of tags to add to the created nodes, which determine file naming.
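
For example, a single qscan followup job might be added with a call along these lines; the trigger time, output directory and tag are illustrative, and workflow and data_segs are assumed to exist already.

    from pycbc.workflow.minifollowups import make_qscan_plot

    make_qscan_plot(workflow, 'H1', 1126259462, 'plots/qscan',
                    data_segments=data_segs,  # coalesced segmentlist of readable data
                    time_window=100,
                    tags=['FOLLOWUP1'])       # hypothetical tag
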
pycbc.workflow.minifollowups.make_single_template_plots(workflow, segs, data_read_name, analyzed_name, params, out_dir, inj_file=None, exclude=None, require=None, tags=None, params_str=None, use_exact_inj_params=False)[source]

Function for creating jobs to run the pycbc_single_template code and to run the associated plotting code pycbc_single_template_plots and add these jobs to the workflow.

Parameters:
  • workflow (workflow.Workflow instance) – The pycbc.workflow.Workflow instance to add these jobs to.
  • segs (workflow.File instance) – The pycbc.workflow.File instance that points to the XML file containing the segment lists of data read in and data analyzed.
  • data_read_name (str) – The name of the segmentlist containing the data read in by each inspiral job in the segs file.
  • analyzed_name (str) – The name of the segmentlist containing the data analyzed by each inspiral job in the segs file.
  • params (dictionary) – A dictionary containing the parameters of the template to be used. params[ifo+’end_time’] is required for all ifos in workflow.ifos. If use_exact_inj_params is False then values for [mass1, mass2, spin1z, spin2z] must also be supplied. Precessing templates additionally require [spin1x, spin1y, spin2x, spin2y, inclination], as well as u_vals or u_vals_+ifo for all ifos; u_vals is the ratio between h_+ and h_x used when constructing h(t) = (h_+ * u_vals) + h_x. A concrete example is sketched below.
  • out_dir (str) – Directory in which to store the output files.
  • inj_file (workflow.File (optional, default=None)) – If given send this injection file to the job so that injections are made into the data.
  • exclude (list (optional, default=None)) – If given, then when considering which subsections in the ini file to parse for options to add to single_template_plot, only use subsections that do not match strings in this list.
  • require (list (optional, default=None)) – If given, then when considering which subsections in the ini file to parse for options to add to single_template_plot, only use subsections matching strings in this list.
  • tags (list (optional, default=None)) – Add this list of tags to all jobs.
  • params_str (str (optional, default=None)) – If given add this string to plot title and caption to describe the template that was used.
  • use_exact_inj_params (boolean (optional, default=False)) – If True do not use masses and spins listed in the params dictionary but instead use the injection closest to the filter time as a template.
Returns:

output_files – The list of workflow.Files created in this function.

Return type:

workflow.FileList
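
To make the params requirement concrete, a minimal dictionary for an aligned-spin template in an H1L1 workflow could look like the sketch below; all numerical values and segment-list names are illustrative only.

    from pycbc.workflow.minifollowups import make_single_template_plots

    # params[ifo + 'end_time'] is required for every ifo in workflow.ifos.
    params = {'H1end_time': 1126259462.40,
              'L1end_time': 1126259462.41,
              'mass1': 1.4, 'mass2': 1.35,   # needed if use_exact_inj_params=False
              'spin1z': 0.02, 'spin2z': -0.01}

    files = make_single_template_plots(workflow, insp_segs_file, 'DATA_READ',
                                       'DATA_ANALYZED', params,
                                       'plots/single_template')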

pycbc.workflow.minifollowups.make_singles_timefreq(workflow, single, bank_file, trig_time, out_dir, veto_file=None, time_window=10, data_segments=None, tags=None)[source]

Generate a singles_timefreq node and add it to workflow.

This function generates a single node of the singles_timefreq executable and adds it to the current workflow. Parent/child relationships are set by the input/output files automatically.

Parameters:
  • workflow (pycbc.workflow.core.Workflow) – The workflow class that stores the jobs that will be run.
  • single (pycbc.workflow.core.File instance) – The File object storing the single-detector triggers to followup.
  • bank_file (pycbc.workflow.core.File instance) – The File object storing the template bank.
  • trig_time (int) – The time of the trigger being followed up.
  • out_dir (str) – Location of directory to output to
  • veto_file (pycbc.workflow.core.File (optional, default=None)) – If given use this file to veto triggers to determine the loudest event. FIXME: Veto files should be provided a definer argument and not just assume that all segments should be read.
  • time_window (int (optional, default=10)) – The amount of data (not including padding) that will be read in by the singles_timefreq job. The default value of 10s should be fine for most cases.
  • data_segments (ligo.segments.segmentlist (optional, default=None)) – The list of segments for which data exists and can be read in. If given the start/end times given to singles_timefreq will be adjusted if [trig_time - time_window, trig_time + time_window] does not completely lie within a valid data segment. A ValueError will be raised if the trig_time is not within a valid segment, or if it is not possible to find 2*time_window (plus the padding) of continuous data around the trigger. This must be coalesced.
  • tags (list (optional, default=None)) – List of tags to add to the created nodes, which determine file naming.
pycbc.workflow.minifollowups.make_skipped_html(workflow, skipped_data, out_dir, tags)[source]

Make an html snippet from the list of skipped background coincidences

pycbc.workflow.minifollowups.make_sngl_ifo(workflow, sngl_file, bank_file, trigger_id, out_dir, ifo, tags=None)[source]

Set up a job to create a single-detector html summary snippet for one ifo.

pycbc.workflow.minifollowups.make_trigger_timeseries(workflow, singles, ifo_times, out_dir, special_tids=None, exclude=None, require=None, tags=None)[source]
pycbc.workflow.minifollowups.setup_foreground_minifollowups(workflow, coinc_file, single_triggers, tmpltbank_file, insp_segs, insp_data_name, insp_anal_name, dax_output, out_dir, tags=None)[source]

Create plots that follow up the Nth loudest coincident triggers from a statmap-produced HDF file.

Parameters:
  • workflow (pycbc.workflow.Workflow) – The core workflow instance we are populating
  • coinc_file (pycbc.workflow.File) – The statmap-produced HDF file containing the coincident triggers to follow up.
  • single_triggers (list of pycbc.workflow.File) – A list containing the file objects associated with the merged single detector trigger files for each ifo.
  • tmpltbank_file (pycbc.workflow.File) – The file object pointing to the HDF format template bank
  • insp_segs (SegFile) – The segment file containing the data read and analyzed by each inspiral job.
  • insp_data_name (str) – The name of the segmentlist storing data read.
  • insp_anal_name (str) – The name of the segmentlist storing data analyzed.
  • out_dir (path) – The directory to store minifollowups result plots and files
  • tags ({None, optional}) – Tags to add to the minifollowups executables
Returns:

layout – A list of tuples which specify the displayed file layout for the minifollowups plots.

Return type:

list
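
A sketch of a typical call follows, with all inputs assumed to come from earlier workflow stages and all names illustrative.

    from pycbc.workflow.minifollowups import setup_foreground_minifollowups

    layout = setup_foreground_minifollowups(
        workflow, statmap_file, sngl_trig_files, bank_file,
        insp_segs_file, 'DATA_READ', 'DATA_ANALYZED',
        'daxes', 'plots/followup', tags=['FOREGROUND'])
    # layout is a list of tuples describing how the followup plots
    # should be arranged on the results page.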

pycbc.workflow.minifollowups.setup_injection_minifollowups(workflow, injection_file, inj_xml_file, single_triggers, tmpltbank_file, insp_segs, insp_data_name, insp_anal_name, dax_output, out_dir, tags=None)[source]

Create plots that follow up the closest missed injections

Parameters:
  • workflow (pycbc.workflow.Workflow) – The core workflow instance we are populating
  • injection_file (pycbc.workflow.File) – The HDF file containing the found/missed injections.
  • inj_xml_file (pycbc.workflow.File) – The XML file defining the injections.
  • single_triggers (list of pycbc.workflow.File) – A list containing the file objects associated with the merged single detector trigger files for each ifo.
  • tmpltbank_file (pycbc.workflow.File) – The file object pointing to the HDF format template bank
  • insp_segs (SegFile) – The segment file containing the data read by each inspiral job.
  • insp_data_name (str) – The name of the segmentlist storing data read.
  • insp_anal_name (str) – The name of the segmentlist storing data analyzed.
  • out_dir (path) – The directory to store minifollowups result plots and files
  • tags ({None, optional}) – Tags to add to the minifollowups executables
Returns:

layout – A list of tuples which specify the displayed file layout for the minifollowups plots.

Return type:

list

pycbc.workflow.minifollowups.setup_single_det_minifollowups(workflow, single_trig_file, tmpltbank_file, insp_segs, insp_data_name, insp_anal_name, dax_output, out_dir, veto_file=None, veto_segment_name=None, statfiles=None, tags=None)[source]

Create plots that follow up the Nth loudest clustered single detector triggers from a merged single detector trigger HDF file.

Parameters:
  • workflow (pycbc.workflow.Workflow) – The core workflow instance we are populating
  • single_trig_file (pycbc.workflow.File) – The File class holding the single detector triggers.
  • tmpltbank_file (pycbc.workflow.File) – The file object pointing to the HDF format template bank
  • insp_segs (SegFile) – The segment file containing the data read by each inspiral job.
  • insp_data_name (str) – The name of the segmentlist storing data read.
  • insp_anal_name (str) – The name of the segmentlist storing data analyzed.
  • out_dir (path) – The directory to store minifollowups result plots and files
  • statfiles (FileList (optional, default=None)) – Supplementary files necessary for computing the single-detector statistic.
  • tags ({None, optional}) – Tags to add to the minifollowups executables
Returns:

layout – A list of tuples which specify the displayed file layout for the minifollowups plots.

Return type:

list

pycbc.workflow.pegasus_workflow module

This module provides thin wrappers around Pegasus.DAX3 functionality that provide additional abstraction and argument handling.

class pycbc.workflow.pegasus_workflow.DataStorage(name)[source]

Bases: object

A workflow representation of a place to store and read data from.

The abstract representation of a place to store and read data from. This can include files, databases, or remote connections. This object is used as a handle to pass between functions, and as a way to logically represent the ordering of operations on the physical data.

class pycbc.workflow.pegasus_workflow.Database(name)[source]

Bases: pycbc.workflow.pegasus_workflow.DataStorage

class pycbc.workflow.pegasus_workflow.Executable(name, namespace=None, os='linux', arch='x86_64', installed=True, version=None, container=None)[source]

Bases: pycbc.workflow.pegasus_workflow.ProfileShortcuts

The workflow representation of an Executable

add_pfn(url, site='local')[source]
add_profile(namespace, key, value, force=False)[source]

Add profile information to this executable

clear_pfns()[source]
get_pfn(site='local')[source]
id = 0
insert_into_dax(dax)[source]
class pycbc.workflow.pegasus_workflow.File(name)[source]

Bases: pycbc.workflow.pegasus_workflow.DataStorage, Pegasus.DAX3.File

The workflow representation of a physical file

An object that represents a file from the perspective of setting up a workflow. The file may or may not exist at the time of workflow generation. If it does, this is represented by containing a physical file name (PFN). A storage path is also available to indicate the desired final destination of this file.

dax_repr

Return the dax representation of a File.

classmethod from_path(path)[source]

Takes a path and returns a File object with the path as the PFN.

has_pfn(url, site=None)[source]

Wrapper of the pegasus hasPFN function, that allows it to be called outside of specific pegasus functions.

insert_into_dax(dax)[source]
output_map_str()[source]
class pycbc.workflow.pegasus_workflow.Node(executable)[source]

Bases: pycbc.workflow.pegasus_workflow.ProfileShortcuts

add_arg(arg)[source]

Add an argument

add_input_arg(inp)[source]

Add an input as an argument

add_input_list_opt(opt, inputs)[source]

Add an option that determines a list of inputs

add_input_opt(opt, inp)[source]

Add an option that determines an input

add_list_opt(opt, values)[source]

Add an option with a list of non-file parameters.

add_opt(opt, value=None)[source]

Add an option

add_output_arg(out)[source]

Add an output as an argument

add_output_list_opt(opt, outputs)[source]

Add an option that determines a list of outputs

add_output_opt(opt, out)[source]

Add an option that determines an output

add_profile(namespace, key, value, force=False)[source]

Add profile information to this node at the DAX level

add_raw_arg(arg)[source]

Add an argument to the command line of this job, but do NOT add white space between arguments. Whitespace can be added manually with ' ' if needed.

new_output_file_opt(opt, name)[source]

Add an option and return a new file handle

class pycbc.workflow.pegasus_workflow.ProfileShortcuts[source]

Bases: object

Container of common methods for setting pegasus profile information on Executables and nodes. This class expects to be inherited from and for an add_profile method to be implemented.

set_category(category)[source]
set_memory(size)[source]

Set the amount of memory that is required in megabytes

set_num_cpus(number)[source]
set_num_retries(number)[source]
set_priority(priority)[source]
set_storage(size)[source]

Set the amount of storage required in megabytes

set_universe(universe)[source]
class pycbc.workflow.pegasus_workflow.Workflow(name='my_workflow')[source]

Bases: object

add_node(node)[source]

Add a node to this workflow

This function adds nodes to the workflow. It also determines parent/child relations from the DataStorage inputs to this job.

Parameters:node (pycbc.workflow.pegasus_workflow.Node) – A node that should be executed as part of this workflow.
add_workflow(workflow)[source]

Add a sub-workflow to this workflow

This function adds a sub-workflow of the Workflow class to this workflow. Parent/child relationships are determined by data dependencies.

Parameters:workflow (Workflow instance) – The sub-workflow to add to this one
save(filename=None, tc=None)[source]

Write this workflow to a DAX file
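
Putting the pieces of this module together, a minimal workflow could be built as in the sketch below; the executable name, PFN and file names are all hypothetical, and only methods documented above are used.

    from pycbc.workflow import pegasus_workflow as wdax

    exe = wdax.Executable('pycbc_example')
    exe.add_pfn('file:///usr/bin/pycbc_example')   # where the executable lives

    node = wdax.Node(exe)
    node.add_opt('--verbose')
    node.add_input_opt('--input-file', wdax.File.from_path('/data/in.hdf'))
    node.add_output_opt('--output-file', wdax.File('out.hdf'))

    wf = wdax.Workflow('my_workflow')
    wf.add_node(node)              # parent/child relations follow file inputs
    wf.save('my_workflow.dax')     # write the DAX file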

pycbc.workflow.plotting module

This module is responsible for setting up plotting jobs. https://ldas-jobs.ligo.caltech.edu/~cbc/docs/pycbc/NOTYETCREATED.html

class pycbc.workflow.plotting.PlotExecutable(cp, name, universe=None, ifos=None, out_dir=None, tags=None)[source]

Bases: pycbc.workflow.core.Executable

plot executable

create_node()[source]

Default node constructor.

This is usually overridden by subclasses of Executable.

current_retention_level = 4
pycbc.workflow.plotting.excludestr(tags, substr)[source]
pycbc.workflow.plotting.make_binned_hist(workflow, trig_file, veto_file, veto_name, out_dir, bank_file, exclude=None, require=None, tags=None)[source]
pycbc.workflow.plotting.make_coinc_snrchi_plot(workflow, inj_file, inj_trig, stat_file, trig_file, out_dir, exclude=None, require=None, tags=None)[source]
pycbc.workflow.plotting.make_foreground_table(workflow, trig_file, bank_file, out_dir, singles=None, extension='.html', tags=None, hierarchical_level=None)[source]
pycbc.workflow.plotting.make_foundmissed_plot(workflow, inj_file, out_dir, exclude=None, require=None, tags=None)[source]
pycbc.workflow.plotting.make_gating_plot(workflow, insp_files, out_dir, tags=None)[source]
pycbc.workflow.plotting.make_ifar_plot(workflow, trigger_file, out_dir, tags=None, hierarchical_level=None, executable='page_ifar')[source]

Creates a node in the workflow for plotting cumulative histogram of IFAR values.

pycbc.workflow.plotting.make_inj_table(workflow, inj_file, out_dir, missed=False, singles=None, tags=None)[source]
pycbc.workflow.plotting.make_range_plot(workflow, psd_files, out_dir, exclude=None, require=None, tags=None)[source]
pycbc.workflow.plotting.make_results_web_page(workflow, results_dir, explicit_dependencies=None)[source]
pycbc.workflow.plotting.make_seg_plot(workflow, seg_files, out_dir, seg_names=None, tags=None)[source]

Creates a node in the workflow for plotting science and veto segments.

pycbc.workflow.plotting.make_seg_table(workflow, seg_files, seg_names, out_dir, tags=None, title_text=None, description=None)[source]

Creates a node in the workflow for writing the segment summary table. Returns a File instance for the output file.

pycbc.workflow.plotting.make_segments_plot(workflow, seg_files, out_dir, tags=None)[source]
pycbc.workflow.plotting.make_sensitivity_plot(workflow, inj_file, out_dir, exclude=None, require=None, tags=None)[source]
pycbc.workflow.plotting.make_single_hist(workflow, trig_file, veto_file, veto_name, out_dir, bank_file=None, exclude=None, require=None, tags=None)[source]
pycbc.workflow.plotting.make_singles_plot(workflow, trig_files, bank_file, veto_file, veto_name, out_dir, exclude=None, require=None, tags=None)[source]
pycbc.workflow.plotting.make_snrchi_plot(workflow, trig_files, veto_file, veto_name, out_dir, exclude=None, require=None, tags=None)[source]
pycbc.workflow.plotting.make_snrifar_plot(workflow, bg_file, out_dir, closed_box=False, cumulative=True, tags=None, hierarchical_level=None)[source]
pycbc.workflow.plotting.make_snrratehist_plot(workflow, bg_file, out_dir, closed_box=False, tags=None, hierarchical_level=None)[source]
pycbc.workflow.plotting.make_spectrum_plot(workflow, psd_files, out_dir, tags=None, hdf_group=None, precalc_psd_files=None)[source]
pycbc.workflow.plotting.make_template_plot(workflow, bank_file, out_dir, tags=None)[source]
pycbc.workflow.plotting.make_throughput_plot(workflow, insp_files, out_dir, tags=None)[source]
pycbc.workflow.plotting.make_veto_table(workflow, out_dir, vetodef_file=None, tags=None)[source]

Creates a node in the workflow for writing the veto_definer table. Returns a File instance for the output file.

pycbc.workflow.plotting.requirestr(tags, substr)[source]

pycbc.workflow.psd module

This module is responsible for setting up PSD-related jobs in workflows.

pycbc.workflow.psd.make_psd_file(workflow, frame_files, segment_file, segment_name, out_dir, tags=None)[source]
pycbc.workflow.psd.make_average_psd(workflow, psd_files, out_dir, tags=None, output_fmt='.txt')[source]
pycbc.workflow.psd.setup_psd_calculate(workflow, frame_files, ifo, segments, segment_name, out_dir, tags=None)[source]
pycbc.workflow.psd.merge_psds(workflow, files, ifo, out_dir, tags=None)[source]

pycbc.workflow.psdfiles module

This module is responsible for setting up the psd files used by CBC workflows.

pycbc.workflow.psdfiles.setup_psd_pregenerated(workflow, tags=None)[source]

Setup CBC workflow to use pregenerated psd files. The file given in cp.get('workflow', 'pregenerated-psd-file-(ifo)') will be used as the --psd-file argument to geom_nonspinbank, geom_aligned_bank and pycbc_plot_psd_file.

Parameters:
  • workflow (pycbc.workflow.core.Workflow) – An instanced class that manages the constructed workflow.
  • tags (list of strings) – If given these tags are used to uniquely name and identify output files that would be produced in multiple calls to this function.
Returns:

psd_files – The FileList holding the PSD files

Return type:

pycbc.workflow.core.FileList
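
For instance, the configuration consumed by this function might look like the following snippet. The paths are illustrative, and the exact casing of the ifo part of the option name is an assumption based on the '(ifo)' template quoted above.

    [workflow]
    pregenerated-psd-file-h1 = /path/to/H1-PSD.txt
    pregenerated-psd-file-l1 = /path/to/L1-PSD.txt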

pycbc.workflow.psdfiles.setup_psd_workflow(workflow, science_segs, datafind_outs, output_dir=None, tags=None)[source]

Setup static psd section of CBC workflow. At present this only supports pregenerated psd files, in the future these could be created within the workflow.

Parameters:
  • workflow (pycbc.workflow.core.Workflow) – An instanced class that manages the constructed workflow.
  • science_segs (Keyed dictionary of glue.segmentlist objects) – scienceSegs[ifo] holds the science segments to be analysed for each ifo.
  • datafind_outs (pycbc.workflow.core.FileList) – The file list containing the datafind files.
  • output_dir (path string) – The directory where data products will be placed.
  • tags (list of strings) – If given these tags are used to uniquely name and identify output files that would be produced in multiple calls to this function.
Returns:

psd_files – The FileList holding the psd files, 0 or 1 per ifo

Return type:

pycbc.workflow.core.FileList

pycbc.workflow.segment module

This module is responsible for setting up the segment generation stage of workflows. For details about this module and its capabilities see here: https://ldas-jobs.ligo.caltech.edu/~cbc/docs/pycbc/ahope/segments.html

pycbc.workflow.segment.add_cumulative_files(workflow, output_file, input_files, out_dir, execute_now=False, tags=None)[source]

Function to combine a set of segment files into a single one. This function will not merge the segment lists but keep each separate.

Parameters:
  • workflow (pycbc.workflow.core.Workflow) – An instance of the Workflow class that manages the workflow.
  • output_file (pycbc.workflow.core.File) – The output file object
  • input_files (pycbc.workflow.core.FileList) – The list of input segment files
  • out_dir (path) – The directory to write output to.
  • execute_now (boolean, optional) – If true, jobs are executed immediately. If false, they are added to the workflow to be run later.
  • tags (list of strings, optional) – A list of strings that is used to identify this job
pycbc.workflow.segment.cat_to_veto_def_cat(val)[source]

Convert a category character to the corresponding value in the veto definer file.

Parameters:val (single character string) – The input category character
Returns:pipedown_str (str) – The pipedown equivalent notation that can be passed to programs that expect this definition.
pycbc.workflow.segment.create_segs_from_cats_job(cp, out_dir, ifo_string, tags=None)[source]

This function creates the CondorDAGJob that will be used to run ligolw_segments_from_cats as part of the workflow

Parameters:
  • cp (pycbc.workflow.configuration.WorkflowConfigParser) – The in-memory representation of the configuration (.ini) files
  • out_dir (path) – Directory in which to put output files
  • ifo_string (string) – String containing all active ifos, e.g. “H1L1V1”
  • tags (list of strings, optional (default=None)) – Use this to specify one or more tags. This can be used if this module is being called more than once to give call specific configuration (by setting options in [workflow-datafind-${TAG}] rather than [workflow-datafind]). This is also used to tag the Files returned by the class to uniqueify the Files and uniqueify the actual filename. FIXME: Filenames may not be unique with current codes!
Returns:

job – The Job instance that will run segments_from_cats jobs

Return type:

Job instance

pycbc.workflow.segment.file_needs_generating(file_path, cp, tags=None)[source]

This function tests the file location and determines whether the file should be generated now or whether an error should be raised. This uses the generate_segment_files variable, global to this module, which is described above and in the documentation.

Parameters:
  • file_path (path) – Location of file to check
  • cp (ConfigParser) – The associated ConfigParser from which the segments-generate-segment-files variable is returned. It is recommended for most applications to use the default option by leaving segments-generate-segment-files blank, which will regenerate all segment files at runtime. Only use this facility if you need it. Choices are:
      • ‘always’ (default): all files will be generated even if they already exist.
      • ‘if_not_present’: files will be generated if they do not already exist; pre-existing files will be read in and used.
      • ‘error_on_duplicate’: files will be generated if they do not already exist; pre-existing files will raise a failure.
      • ‘never’: pre-existing files will be read in and used; if no file exists the code will fail.
Returns:

1 = Generate the file. 0 = File already exists, use it. Other cases will raise an error.

Return type:

int
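
In use, the integer return value typically drives a generate-or-reuse decision, as in this sketch; the path is illustrative and cp is the workflow's ConfigParser.

    from pycbc.workflow.segment import file_needs_generating

    seg_path = 'segments/H1-SCIENCE_SEGMENTS.xml'
    if file_needs_generating(seg_path, cp):
        pass  # run or schedule the job that writes seg_path
    else:
        pass  # read in and reuse the pre-existing file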

pycbc.workflow.segment.find_playground_segments(segs)[source]

Finds playground time in a list of segments.

Playground segments include the first 600s of every 6370s stride starting at GPS time 729273613.

Parameters:segs (segmentfilelist) – A segmentfilelist in which to find playground segments.
Returns:outlist – A segmentfilelist containing all playground segments during the input segmentfilelist (i.e. segs).
Return type:segmentfilelist
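
The playground rule quoted above (first 600 s of every 6370 s stride from GPS 729273613) can be written down directly. The sketch below illustrates that rule using ligo.segments; it is not the pycbc implementation itself.

    from ligo import segments

    PG_START, PG_STRIDE, PG_LENGTH = 729273613, 6370, 600

    def playground_overlapping(span_start, span_end):
        """Playground segments intersected with [span_start, span_end)."""
        out = segments.segmentlist()
        n = max(0, (span_start - PG_START) // PG_STRIDE)
        t = PG_START + n * PG_STRIDE
        while t < span_end:
            out.append(segments.segment(t, t + PG_LENGTH))
            t += PG_STRIDE
        return out & segments.segmentlist([segments.segment(span_start, span_end)])
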
pycbc.workflow.segment.generate_triggered_segment(workflow, out_dir, sciencesegs)[source]
pycbc.workflow.segment.get_analyzable_segments(workflow, sci_segs, cat_files, out_dir, tags=None)[source]

Get the analyzable segments after applying ini specified vetoes and any other restrictions on the science segs, e.g. a minimum segment length, or demanding that only coincident segments are analysed.

Parameters:
  • workflow (Workflow object) – Instance of the workflow object
  • sci_segs (Ifo-keyed dictionary of glue.segmentlists) – The science segments for each ifo to which the vetoes, or any other restriction, will be applied.
  • cat_files (FileList of SegFiles) – The category veto files generated by get_veto_segs
  • out_dir (path) – Location to store output files
  • tags (list of strings) – Used to retrieve subsections of the ini file for configuration options.
Returns:

  • sci_ok_seg_file (workflow.core.SegFile instance) – The segment file combined from all ifos containing the analyzable science segments.
  • sci_ok_segs (Ifo keyed dict of ligo.segments.segmentlist instances) – The analyzable science segs for each ifo, keyed by ifo
  • sci_ok_seg_name (str) – The name with which analyzable science segs are stored in the output XML file.

pycbc.workflow.segment.get_cumulative_segs(workflow, categories, seg_files_list, out_dir, tags=None, execute_now=False, segment_name=None)[source]

Function to generate one of the cumulative, multi-detector segment files as part of the workflow.

Parameters:
  • workflow (pycbc.workflow.core.Workflow) – An instance of the Workflow class that manages the workflow.
  • categories (list of ints) – The veto categories to include in this cumulative veto.
  • seg_files_list (list of SegFiles) – The list of segment files to be used as input for combining.
  • out_dir (path) – The directory to write output to.
  • tags (list of strings, optional) – A list of strings that is used to identify this job
  • execute_now (boolean, optional) – If true, jobs are executed immediately. If false, they are added to the workflow to be run later.
  • segment_name (str) – The name of the combined, cumulative segments in the output file.
pycbc.workflow.segment.get_cumulative_veto_group_files(workflow, option, cat_files, out_dir, execute_now=True, tags=None)[source]

Get the cumulative veto files that define the different backgrounds we want to analyze, defined by groups of vetos.

Parameters:
  • workflow (Workflow object) – Instance of the workflow object
  • option (str) – ini file option to use to get the veto groups
  • cat_files (FileList of SegFiles) – The category veto files generated by get_veto_segs
  • out_dir (path) – Location to store output files
  • execute_now (Boolean) – If true, outputs are generated at runtime; otherwise jobs are added to the workflow and outputs are generated then.
  • tags (list of strings) – Used to retrieve subsections of the ini file for configuration options.
Returns:

  • seg_files (workflow.core.FileList instance) – The cumulative segment files for each veto group.
  • names (list of strings) – The segment names for the corresponding seg_file
  • cat_files (workflow.core.FileList instance) – The list of individual category veto files

pycbc.workflow.segment.get_files_for_vetoes(workflow, out_dir, runtime_names=None, in_workflow_names=None, tags=None)[source]

Get the various sets of veto segments that will be used in this analysis.

Parameters:
  • workflow (Workflow object) – Instance of the workflow object
  • out_dir (path) – Location to store output files
  • runtime_names (list) – Veto category groups with these names in the [workflow-segment] section of the ini file will be generated now.
  • in_workflow_names (list) – Veto category groups with these names in the [workflow-segment] section of the ini file will be generated in the workflow. If a veto category appears here and in runtime_names, it will be generated now.
  • tags (list of strings) – Used to retrieve subsections of the ini file for configuration options.
Returns:

veto_seg_files – List of veto segment files generated

Return type:

FileList

pycbc.workflow.segment.get_sci_segs_for_ifo(ifo, cp, start_time, end_time, out_dir, tags=None)[source]

Obtain science segments for the selected ifo

Parameters:
  • ifo (string) – The string describing the ifo to obtain science times for.
  • start_time (gps time (either int/LIGOTimeGPS)) – The time at which to begin searching for segments.
  • end_time (gps time (either int/LIGOTimeGPS)) – The time at which to stop searching for segments.
  • out_dir (path) – The directory in which output will be stored.
  • tags (list of strings, optional (default=None)) – Use this to specify one or more tags. This can be used if this module is being called more than once to give call specific configuration (by setting options in [workflow-datafind-${TAG}] rather than [workflow-datafind]). This is also used to tag the Files returned by the class to uniqueify the Files and uniqueify the actual filename.
Returns:

  • sci_segs (ligo.segments.segmentlist) – The segmentlist generated by this call
  • sci_xml_file (pycbc.workflow.core.SegFile) – The workflow File object corresponding to this science segments file.
  • out_sci_seg_name (string) – The name of the output segment list in the output XML file.

pycbc.workflow.segment.get_science_segments(workflow, out_dir, tags=None)[source]

Get the analyzable segments after applying ini specified vetoes.

Parameters:
  • workflow (Workflow object) – Instance of the workflow object
  • out_dir (path) – Location to store output files
  • tags (list of strings) – Used to retrieve subsections of the ini file for configuration options.
Returns:

  • sci_seg_file (workflow.core.SegFile instance) – The segment file combined from all ifos containing the science segments.
  • sci_segs (Ifo keyed dict of ligo.segments.segmentlist instances) – The science segs for each ifo, keyed by ifo
  • sci_seg_name (str) – The name with which science segs are stored in the output XML file.

pycbc.workflow.segment.get_segments_file(workflow, name, option_name, out_dir)[source]

Get cumulative segments from option name syntax for each ifo.

Use the syntax of the configparser string to define the resulting segment_file, e.g. option_name = +up_flag1,+up_flag2,+up_flag3,-down_flag1,-down_flag2. Each ifo may have a different string, and each is stored separately in the file. Flags which add time must precede flags which subtract time. An example configuration entry is sketched below.

Parameters:
  • workflow (pycbc.workflow.Workflow) –
  • name (string) – Name of the segment list being created
  • option_name (str) – Name of option in the associated config parser to get the flag list
Returns:

seg_file – SegFile instance that points to the segment xml file on disk.

Return type:

pycbc.workflow.SegFile
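
As an illustration of the option syntax described above, a configuration entry could read as follows. The section name, option name and flag names are all assumptions; only the +/- syntax comes from the docstring.

    [workflow-segments]
    segments-science = +H1:GOOD_DATA_FLAG,+H1:CALIBRATED_FLAG,-H1:BAD_DATA_FLAG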

pycbc.workflow.segment.get_triggered_coherent_segment(workflow, sciencesegs)[source]

Construct the coherent network on and off source segments. Can switch to construction of segments for a single IFO search when coherent segments are insufficient for a search.

Parameters:
  • workflow (pycbc.workflow.core.Workflow) – The workflow instance that the calculated segments belong to.
  • sciencesegs (dict) – Dictionary of all science segments within analysis time.
Returns:

  • onsource (ligo.segments.segmentlistdict) – A dictionary containing the on source segments for network IFOs
  • offsource (ligo.segments.segmentlistdict) – A dictionary containing the off source segments for network IFOs

pycbc.workflow.segment.get_veto_segs(workflow, ifo, category, start_time, end_time, out_dir, veto_gen_job, tags=None, execute_now=False)[source]

Obtain veto segments for the selected ifo and veto category and add the job to generate this to the workflow.

Parameters:
  • workflow (pycbc.workflow.core.Workflow) – An instance of the Workflow class that manages the workflow.
  • ifo (string) – The string describing the ifo to generate vetoes for.
  • category (int) – The veto category to generate vetoes for.
  • start_time (gps time (either int/LIGOTimeGPS)) – The time at which to begin searching for segments.
  • end_time (gps time (either int/LIGOTimeGPS)) – The time at which to stop searching for segments.
  • out_dir (path) – The directory in which output will be stored.
  • veto_gen_job (Job) – The veto generation Job class that will be used to create the Node.
  • tags (list of strings, optional (default=None)) – Use this to specify one or more tags. This can be used if this module is being called more than once to give call specific configuration (by setting options in [workflow-datafind-${TAG}] rather than [workflow-datafind]). This is also used to tag the Files returned by the class to uniqueify the Files and uniqueify the actual filename. FIXME: Filenames may not be unique with current codes!
  • execute_now (boolean, optional) – If true, jobs are executed immediately. If false, they are added to the workflow to be run later.
Returns:

veto_def_file – The workflow File object corresponding to this DQ veto file.

Return type:

pycbc.workflow.core.SegFile

pycbc.workflow.segment.parse_cat_ini_opt(cat_str)[source]

Parse a cat str from the ini file into a list of sets

pycbc.workflow.segment.save_veto_definer(cp, out_dir, tags=None)[source]

Retrieve the veto definer file and save it locally

Parameters:
  • cp (ConfigParser instance) –
  • out_dir (path) –
  • tags (list of strings) – Used to retrieve subsections of the ini file for configuration options.
pycbc.workflow.segment.setup_segment_gen_mixed(workflow, veto_categories, out_dir, maxVetoAtRunTime, tag=None, generate_coincident_segs=True)[source]

This function will generate veto files for each ifo and for each veto category. It can generate these vetoes at run-time or in the workflow (or do some at run-time and some in the workflow). However, the CAT_1 vetoes and science time must be generated at run time as they are needed to plan the workflow. CATs 2 and higher may be needed for other workflow construction. It can also combine these files to create a set of cumulative, multi-detector veto files, which can be used in ligolw_thinca and in pipedown. Again these can be created at run time or within the workflow.

Parameters:
  • workflow (pycbc.workflow.core.Workflow) – The Workflow instance that the coincidence jobs will be added to. This instance also contains the ifos for which to attempt to obtain segments for this analysis and the start and end times to search for segments over.
  • veto_categories (list of ints) – List of veto categories to generate segments for. If this stops being integers, this can be changed here.
  • out_dir (path) – The directory in which output will be stored.
  • maxVetoAtRunTime (int) – Generate veto files at run time up to this category. Veto categories beyond this in veto_categories will be generated in the workflow. If we move to a model where veto categories are not explicitly cumulative, this will be rethought.
  • tag (string, optional (default=None)) – Use this to specify a tag. This can be used if this module is being called more than once to give call specific configuration (by setting options in [workflow-datafind-${TAG}] rather than [workflow-datafind]). This is also used to tag the Files returned by the class to uniqueify the Files and uniqueify the actual filename. FIXME: Filenames may not be unique with current codes!
  • generate_coincident_segs (boolean, optional (default = True)) – If given this module will generate a set of coincident, cumulative veto files that can be used with ligolw_thinca and pipedown.
Returns:

segFilesList – These are representations of the various segment files that were constructed at this stage of the workflow and may be needed at later stages of the analysis (e.g. for performing DQ vetoes). If the file was generated at run-time the segment lists contained within these files will be an attribute of the instance. (If it will be generated in the workflow it will not be because I am not psychic).

Return type:

dictionary of pycbc.workflow.core.SegFile instances

pycbc.workflow.segment.setup_segment_generation(workflow, out_dir, tag=None)[source]

This function is the gateway for setting up the segment generation steps in a workflow. It is designed to be able to support multiple ways of obtaining these segments and to combine/edit such files as necessary for analysis. The current modules have the capability to generate files at runtime or to generate files that are not needed for workflow generation within the workflow.

Parameters:
  • workflow (pycbc.workflow.core.Workflow) – The workflow instance that the coincidence jobs will be added to. This instance also contains the ifos for which to attempt to obtain segments for this analysis and the start and end times to search for segments over.
  • out_dir (path) – The directory in which output will be stored.
  • tag (string, optional (default=None)) – Use this to specify a tag. This can be used if this module is being called more than once to give call specific configuration (by setting options in [workflow-datafind-${TAG}] rather than [workflow-datafind]). This is also used to tag the Files returned by the class to uniqueify the Files and uniqueify the actual filename. FIXME: Filenames may not be unique with current codes!
Returns:

  • segsToAnalyse (dictionary of ifo-keyed glue.segment.segmentlist instances) – This will contain the times that your code should analyse. By default this is science time - CAT_1 vetoes. (This default could be changed if desired)
  • segFilesList (pycbc.workflow.core.FileList of SegFile instances) – These are representations of the various segment files that were constructed at this stage of the workflow and may be needed at later stages of the analysis (e.g. for performing DQ vetoes). If the file was generated at run-time the segment lists contained within these files will be an attribute of the instance. (If it will be generated in the workflow it will not be because I am not psychic).
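
A sketch of calling this gateway and inspecting what comes back; workflow is assumed to be an already-constructed pycbc.workflow.core.Workflow and the output directory is illustrative.

    from pycbc.workflow.segment import setup_segment_generation

    segs_to_analyse, seg_files = setup_segment_generation(workflow, 'segments')
    for ifo, segs in segs_to_analyse.items():
        print(ifo, abs(segs), 'seconds analysable')  # abs() sums segment durations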

pycbc.workflow.splittable module

This module is responsible for setting up the splitting output files stage of workflows. For details about this module and its capabilities see here: https://ldas-jobs.ligo.caltech.edu/~cbc/docs/pycbc/NOTYETCREATED.html

pycbc.workflow.splittable.select_splitfilejob_instance(curr_exe)[source]

This function returns an instance of the class that is appropriate for splitting an output file up within the workflow (e.g. splitbank).

Parameters:
  • curr_exe (string) – The name of the Executable that is being used.
Returns:

exe class – The class that holds the utility functions appropriate for the given Executable. This class must provide an exe_class.create_job() method, and the job returned by it must provide a job.create_node() method.

Return type:

sub-class of pycbc.workflow.core.Executable

pycbc.workflow.splittable.setup_splittable_dax_generated(workflow, input_tables, out_dir, tags)[source]

Function for setting up the splitting jobs as part of the workflow.

Parameters:
  • workflow (pycbc.workflow.core.Workflow) – The Workflow instance that the splitting jobs will be added to.
  • input_tables (pycbc.workflow.core.FileList) – The input files to be split up.
  • out_dir (path) – The directory in which output will be stored.
  • tags (list of strings) – The tags used to name and identify the output files.
Returns:

split_table_outs – The list of split up files as output from this job.

Return type:

pycbc.workflow.core.FileList

pycbc.workflow.splittable.setup_splittable_workflow(workflow, input_tables, out_dir=None, tags=None)[source]

This function aims to be the gateway for code that is responsible for taking some input file containing some table, and splitting into multiple files containing different parts of that table. For now the only supported operation is using lalapps_splitbank to split a template bank xml file into multiple template bank xml files.

Parameters:
  • workflow (pycbc.workflow.core.Workflow) – The Workflow instance that the splitting jobs will be added to.
  • input_tables (pycbc.workflow.core.FileList) – The input files to be split up.
  • out_dir (path, optional (default=None)) – The directory in which output will be stored.
  • tags (list of strings, optional (default=None)) – The tags used to name and identify the output files.
Returns:

split_table_outs – The list of split up files as output from this job.

Return type:

pycbc.workflow.core.FileList

pycbc.workflow.tmpltbank module

This module is responsible for setting up the template bank stage of CBC workflows. For details about this module and its capabilities see here: https://ldas-jobs.ligo.caltech.edu/~cbc/docs/pycbc/ahope/template_bank.html

pycbc.workflow.tmpltbank.setup_tmpltbank_dax_generated(workflow, science_segs, datafind_outs, output_dir, tags=None, link_to_matchedfltr=True, compatibility_mode=False, psd_files=None)[source]

Setup template bank jobs that are generated as part of the CBC workflow. This function will add numerous jobs to the CBC workflow using configuration options from the .ini file. The following executables are currently supported:

  • lalapps_tmpltbank
  • pycbc_geom_nonspin_bank
Parameters:
  • workflow (pycbc.workflow.core.Workflow) – An instanced class that manages the constructed workflow.
  • science_segs (Keyed dictionary of glue.segmentlist objects) – scienceSegs[ifo] holds the science segments to be analysed for each ifo.
  • datafind_outs (pycbc.workflow.core.FileList) – The file list containing the datafind files.
  • output_dir (path string) – The directory where data products will be placed.
  • tags (list of strings) – If given these tags are used to uniquely name and identify output files that would be produced in multiple calls to this function.
  • link_to_matchedfltr (boolean, optional (default=True)) – If this option is given, the job valid_times will be altered so that there will be one inspiral file for every template bank and they will cover the same time span. Note that this option must also be given during matched-filter generation to be meaningful.
  • psd_files (pycbc.workflow.core.FileList) – The file list containing predefined PSDs, if provided.
Returns:

tmplt_banks – The FileList holding the details of all the template bank jobs.

Return type:

pycbc.workflow.core.FileList

pycbc.workflow.tmpltbank.setup_tmpltbank_pregenerated(workflow, tags=None)[source]

Setup CBC workflow to use a pregenerated template bank. The bank given in cp.get('workflow', 'pregenerated-template-bank') will be used as the input file for all matched-filtering jobs. If this option is present, workflow will assume that it should be used and not generate template banks within the workflow.

Parameters:
  • workflow (pycbc.workflow.core.Workflow) – An instanced class that manages the constructed workflow.
  • tags (list of strings) – If given these tags are used to uniquely name and identify output files that would be produced in multiple calls to this function.
Returns:

tmplt_banks – The FileList holding the details of the template bank.

Return type:

pycbc.workflow.core.FileList
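
The corresponding configuration is a single option in the [workflow] section, as quoted above; the path is illustrative.

    [workflow]
    pregenerated-template-bank = /path/to/H1L1-TMPLTBANK.xml.gz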

pycbc.workflow.tmpltbank.setup_tmpltbank_without_frames(workflow, output_dir, tags=None, independent_ifos=False, psd_files=None)[source]

Setup CBC workflow to use a template bank (or banks) that are generated in the workflow, but do not use the data to estimate a PSD, and therefore do not vary over the duration of the workflow. This can either generate one bank that is valid for all ifos at all times, or multiple banks that are valid only for a single ifo at all times (one bank per ifo).

Parameters:
  • workflow (pycbc.workflow.core.Workflow) – An instanced class that manages the constructed workflow.
  • output_dir (path string) – The directory where the template bank outputs will be placed.
  • tags (list of strings) – If given these tags are used to uniquely name and identify output files that would be produced in multiple calls to this function.
  • independent_ifos (Boolean, optional (default=False)) – If given this will produce one template bank per ifo. If not given there will be one template bank to cover all ifos.
  • psd_files (pycbc.workflow.core.FileList) – The file list containing predefined PSDs, if provided.
Returns:

tmplt_banks – The FileList holding the details of the template bank(s).

Return type:

pycbc.workflow.core.FileList

pycbc.workflow.tmpltbank.setup_tmpltbank_workflow(workflow, science_segs, datafind_outs, output_dir=None, psd_files=None, tags=None, return_format=None)[source]

Setup template bank section of CBC workflow. This function is responsible for deciding which of the various template bank workflow generation utilities should be used.

Parameters:
  • workflow (pycbc.workflow.core.Workflow) – An instanced class that manages the constructed workflow.
  • science_segs (Keyed dictionary of glue.segmentlist objects) – scienceSegs[ifo] holds the science segments to be analysed for each ifo.
  • datafind_outs (pycbc.workflow.core.FileList) – The file list containing the datafind files.
  • output_dir (path string) – The directory where data products will be placed.
  • psd_files (pycbc.workflow.core.FileList) – The file list containing predefined PSDs, if provided.
  • tags (list of strings) – If given these tags are used to uniquely name and identify output files that would be produced in multiple calls to this function.
Returns:

tmplt_banks – The FileList holding the details of all the template bank jobs.

Return type:

pycbc.workflow.core.FileList

Module contents

This package provides the utilities to construct an inspiral workflow for performing a coincident CBC matched-filter analysis on gravitational-wave interferometer data.