pycbc.workflow package

Submodules

pycbc.workflow.coincidence module

This module is responsible for setting up the coincidence stage of pycbc workflows. For details about this module and its capabilities see here: https://ldas-jobs.ligo.caltech.edu/~cbc/docs/pycbc/coincidence.html
class pycbc.workflow.coincidence.CensorForeground(cp, name, universe=None, ifos=None, out_dir=None, tags=None)
    Bases: pycbc.workflow.core.Executable
    current_retention_level = 3

class pycbc.workflow.coincidence.MergeExecutable(cp, name, universe=None, ifos=None, out_dir=None, tags=None)
    Bases: pycbc.workflow.core.Executable
    current_retention_level = 3

class pycbc.workflow.coincidence.PyCBCBank2HDFExecutable(cp, name, universe=None, ifos=None, out_dir=None, tags=None)
    Bases: pycbc.workflow.core.Executable
    Converts an xml tmpltbank to hdf format.
    create_node(bank_file)
        Default node constructor. This is usually overridden by subclasses of Executable.
    current_retention_level = 3

class pycbc.workflow.coincidence.PyCBCCombineStatmap(cp, name, universe=None, ifos=None, out_dir=None, tags=None)
    Bases: pycbc.workflow.core.Executable
    Combine coincs over different bins and apply the trials factor.
    create_node(statmap_files, tags=None)
        Default node constructor. This is usually overridden by subclasses of Executable.
    current_retention_level = 3

class pycbc.workflow.coincidence.PyCBCDistributeBackgroundBins(cp, name, universe=None, ifos=None, out_dir=None, tags=None)
    Bases: pycbc.workflow.core.Executable
    Distribute coinc files among different background bins.
    create_node(coinc_files, bank_file, background_bins, tags=None)
        Default node constructor. This is usually overridden by subclasses of Executable.
    current_retention_level = 2

class pycbc.workflow.coincidence.PyCBCFindCoincExecutable(cp, name, universe=None, ifos=None, out_dir=None, tags=None)
    Bases: pycbc.workflow.core.Executable
    Find coinc triggers using a folded interval method.
    create_node(trig_files, bank_file, stat_files, veto_file, veto_name, template_str, tags=None)
        Default node constructor. This is usually overridden by subclasses of Executable.
    current_retention_level = 2
    file_input_options = ['--statistic-files']

class pycbc.workflow.coincidence.PyCBCFindMultiifoCoincExecutable(cp, name, universe=None, ifos=None, out_dir=None, tags=None)
    Bases: pycbc.workflow.core.Executable
    Find coinc triggers using a folded interval method.
    create_node(trig_files, bank_file, stat_files, veto_file, veto_name, template_str, pivot_ifo, fixed_ifo, tags=None)
        Default node constructor. This is usually overridden by subclasses of Executable.
    current_retention_level = 2
    file_input_options = ['--statistic-files']

class pycbc.workflow.coincidence.PyCBCFitByTemplateExecutable(cp, name, universe=None, ifos=None, out_dir=None, tags=None)
    Bases: pycbc.workflow.core.Executable
    Calculates values that describe the background distribution, template by template.
    create_node(trig_file, bank_file, veto_file, veto_name)
        Default node constructor. This is usually overridden by subclasses of Executable.
    current_retention_level = 3

class pycbc.workflow.coincidence.PyCBCFitOverParamExecutable(cp, name, universe=None, ifos=None, out_dir=None, tags=None)
    Bases: pycbc.workflow.core.Executable
    Smooths the background distribution parameters over a continuous parameter.
    create_node(raw_fit_file, bank_file)
        Default node constructor. This is usually overridden by subclasses of Executable.
    current_retention_level = 3

class pycbc.workflow.coincidence.PyCBCHDFInjFindExecutable(cp, name, universe=None, ifos=None, out_dir=None, tags=None)
    Bases: pycbc.workflow.core.Executable
    Find injections in the hdf file outputs.
    create_node(inj_coinc_file, inj_xml_file, veto_file, veto_name, tags=None)
        Default node constructor. This is usually overridden by subclasses of Executable.
    current_retention_level = 3

class pycbc.workflow.coincidence.PyCBCMultiifoAddStatmap(cp, name, universe=None, ifos=None, out_dir=None, tags=None)
    Bases: pycbc.workflow.coincidence.PyCBCMultiifoCombineStatmap
    Combine statmap files and add FARs over different coinc types.
    create_node(statmap_files, background_files, tags=None)
        Default node constructor. This is usually overridden by subclasses of Executable.
    current_retention_level = 3

class pycbc.workflow.coincidence.PyCBCMultiifoCombineStatmap(cp, name, universe=None, ifos=None, out_dir=None, tags=None)
    Bases: pycbc.workflow.coincidence.PyCBCCombineStatmap
    Combine coincs over different coinc types and apply the trials factor.
    current_retention_level = 3

class pycbc.workflow.coincidence.PyCBCMultiifoExcludeZerolag(cp, name, universe=None, ifos=None, out_dir=None, tags=None)
    Bases: pycbc.workflow.core.Executable
    Remove times of zerolag coincidences of all types from the exclusive background.
    create_node(statmap_file, other_statmap_files, tags=None)
        Default node constructor. This is usually overridden by subclasses of Executable.
    current_retention_level = 3

class pycbc.workflow.coincidence.PyCBCMultiifoStatMapExecutable(cp, name, universe=None, ifos=None, out_dir=None, tags=None)
    Bases: pycbc.workflow.core.Executable
    Calculate FAP, IFAR, etc.
    create_node(coinc_files, ifos, tags=None)
        Default node constructor. This is usually overridden by subclasses of Executable.
    current_retention_level = 3

class pycbc.workflow.coincidence.PyCBCMultiifoStatMapInjExecutable(cp, name, universe=None, ifos=None, out_dir=None, tags=None)
    Bases: pycbc.workflow.core.Executable
    Calculate FAP, IFAR, etc.
    create_node(zerolag, full_data, injfull, fullinj, ifos, tags=None)
        Default node constructor. This is usually overridden by subclasses of Executable.
    current_retention_level = 3

class pycbc.workflow.coincidence.PyCBCStatMapExecutable(cp, name, universe=None, ifos=None, out_dir=None, tags=None)
    Bases: pycbc.workflow.core.Executable
    Calculate FAP, IFAR, etc.
    create_node(coinc_files, tags=None)
        Default node constructor. This is usually overridden by subclasses of Executable.
    current_retention_level = 3

class pycbc.workflow.coincidence.PyCBCStatMapInjExecutable(cp, name, universe=None, ifos=None, out_dir=None, tags=None)
    Bases: pycbc.workflow.core.Executable
    Calculate FAP, IFAR, etc. for injections.
    create_node(zerolag, full_data, injfull, fullinj, tags=None)
        Default node constructor. This is usually overridden by subclasses of Executable.
    current_retention_level = 3

class pycbc.workflow.coincidence.PyCBCTrig2HDFExecutable(cp, name, universe=None, ifos=None, out_dir=None, tags=None)
    Bases: pycbc.workflow.core.Executable
    Converts xml triggers to hdf format, grouped by template hash.
    create_node(trig_files, bank_file)
        Default node constructor. This is usually overridden by subclasses of Executable.
    current_retention_level = 3

pycbc.workflow.coincidence.convert_bank_to_hdf(workflow, xmlbank, out_dir, tags=None)
    Return the template bank in hdf format.

pycbc.workflow.coincidence.convert_trig_to_hdf(workflow, hdfbank, xml_trigger_files, out_dir, tags=None)
    Return the list of hdf5 trigger file outputs.

pycbc.workflow.coincidence.find_injections_in_hdf_coinc(workflow, inj_coinc_file, inj_xml_file, veto_file, veto_name, out_dir, tags=None)

pycbc.workflow.coincidence.get_ordered_ifo_list(ifocomb, ifo_ids)
    This function sorts the combination of ifos (ifocomb) based on the given precedence list (the ifo_ids dictionary) and returns the first ifo as pivot, the second ifo as fixed, and the ordered list joined as a string.
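The sorting behaviour described above can be sketched as follows. This is a minimal illustration, not the library code; the example precedence ranks and ifo names are assumptions:

```python
def get_ordered_ifo_list(ifocomb, ifo_ids):
    """Sort an ifo combination by precedence and pick pivot/fixed ifos.

    ifocomb : iterable of ifo names, e.g. ('L1', 'H1')
    ifo_ids : dict mapping ifo name -> precedence rank (lower = higher priority)
    """
    # Order the combination by the precedence rank of each ifo
    ordered = sorted(ifocomb, key=lambda ifo: ifo_ids[ifo])
    pivot_ifo = ordered[0]   # highest-precedence ifo
    fixed_ifo = ordered[1]   # second-highest-precedence ifo
    return pivot_ifo, fixed_ifo, ''.join(ordered)

# Example (hypothetical precedence: H1 before L1 before V1)
pivot, fixed, combo = get_ordered_ifo_list(
    ('V1', 'L1', 'H1'), {'H1': 0, 'L1': 1, 'V1': 2})
```

Here `combo` is the ordered combination joined as one string, as the docstring describes.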
pycbc.workflow.coincidence.make_foreground_censored_veto(workflow, bg_file, veto_file, veto_name, censored_name, out_dir, tags=None)

pycbc.workflow.coincidence.merge_single_detector_hdf_files(workflow, bank_file, trigger_files, out_dir, tags=None)

pycbc.workflow.coincidence.rerank_coinc_followup(workflow, statmap_file, bank_file, out_dir, tags, injection_file=None, ranking_file=None)

pycbc.workflow.coincidence.select_files_by_ifo_combination(ifocomb, insps)
    This function selects single-detector files ('insps') for a given ifo combination.

pycbc.workflow.coincidence.setup_background_bins(workflow, coinc_files, bank_file, out_dir, tags=None)

pycbc.workflow.coincidence.setup_background_bins_inj(workflow, coinc_files, background_file, bank_file, out_dir, tags=None)

pycbc.workflow.coincidence.setup_interval_coinc(workflow, hdfbank, trig_files, stat_files, veto_files, veto_names, out_dir, tags=None)
    This function sets up exact-match coincidence and background estimation using a folded interval technique.

pycbc.workflow.coincidence.setup_interval_coinc_inj(workflow, hdfbank, full_data_trig_files, inj_trig_files, stat_files, background_file, veto_file, veto_name, out_dir, tags=None)
    This function sets up exact-match coincidence and background estimation using a folded interval technique.

pycbc.workflow.coincidence.setup_multiifo_combine_statmap(workflow, final_bg_file_list, bg_file_list, out_dir, tags=None)
    Combine the multiifo statmap files into one background file.

pycbc.workflow.coincidence.setup_multiifo_exclude_zerolag(workflow, statmap_file, other_statmap_files, out_dir, ifos, tags=None)
    Exclude single triggers close to zerolag triggers from forming any background events.

pycbc.workflow.coincidence.setup_multiifo_interval_coinc(workflow, hdfbank, trig_files, stat_files, veto_files, veto_names, out_dir, pivot_ifo, fixed_ifo, tags=None)
    This function sets up exact-match multiifo coincidence.

pycbc.workflow.coincidence.setup_multiifo_interval_coinc_inj(workflow, hdfbank, full_data_trig_files, inj_trig_files, stat_files, background_file, veto_file, veto_name, out_dir, pivot_ifo, fixed_ifo, tags=None)
    This function sets up exact-match multiifo coincidence for injections.

pycbc.workflow.coincidence.setup_multiifo_statmap(workflow, ifos, coinc_files, out_dir, tags=None)

pycbc.workflow.coincidence.setup_multiifo_statmap_inj(workflow, ifos, coinc_files, background_file, out_dir, tags=None)

pycbc.workflow.coincidence.setup_simple_statmap_inj(workflow, coinc_files, background_file, out_dir, tags=None)

pycbc.workflow.coincidence.setup_statmap(workflow, coinc_files, bank_file, out_dir, tags=None)
pycbc.workflow.configparser_test module

pycbc.workflow.configparser_test.add_options_to_section(cp, section, items, preserve_orig_file=False, overwrite_options=False)
    Add a set of options and values to a section of a ConfigParser object. Will raise an error if any of the options being added already exist; this behaviour can be overridden if desired.
    Parameters:
        - cp (The ConfigParser class) –
        - section (string) – The name of the section to add options+values to
        - items (list of tuples) – Each tuple contains (at [0]) the option and (at [1]) the value to add to the section of the ini file
        - preserve_orig_file (Boolean, optional) – By default the input ConfigParser object will be modified in place. If this is set, deepcopy will be used and the input will be preserved. Default = False
        - overwrite_options (Boolean, optional) – By default this function will raise a ValueError if an option exists in both the original section in the ConfigParser and in the provided items. If set to True, the options+values given in items will replace the original values. Default = False
    Returns: cp
    Return type: The ConfigParser class
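The documented behaviour (refuse to clobber an existing option unless overwrite_options is set) can be sketched with the standard-library configparser; this is an illustration under those assumptions, not the pycbc implementation, and the section/option names are made up:

```python
import configparser

def add_options_to_section(cp, section, items, overwrite_options=False):
    """Add (option, value) pairs to a section, raising ValueError on an
    existing option unless overwrite_options is True (sketch)."""
    for option, value in items:
        if cp.has_option(section, option) and not overwrite_options:
            raise ValueError(
                "Option %s already exists in [%s]" % (option, section))
        cp.set(section, option, value)
    return cp

cp = configparser.ConfigParser()
cp.add_section('inspiral')
cp.set('inspiral', 'segment-length', '256')
# Adding a new option succeeds; re-adding 'segment-length' would raise
add_options_to_section(cp, 'inspiral', [('low-frequency-cutoff', '30')])
```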
pycbc.workflow.configparser_test.check_duplicate_options(cp, section1, section2, raise_error=False)
    Check for duplicate options in two sections, section1 and section2. Will return True if there are duplicate options and False if not.
    Parameters:
        - cp (The ConfigParser class) –
        - section1 (string) – The name of the first section to compare
        - section2 (string) – The name of the second section to compare
        - raise_error (Boolean, optional) – If True, raise an error if duplicates are present. Default = False
    Returns: duplicate – List of duplicate options
    Return type: List
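The duplicate check amounts to a set intersection of the two sections' option names. A minimal sketch with the standard-library configparser (not the pycbc code; the example sections are assumptions):

```python
import configparser

def check_duplicate_options(cp, section1, section2, raise_error=False):
    """Return the sorted list of options present in both sections,
    optionally raising if any are found (sketch)."""
    duplicates = sorted(set(cp.options(section1)) & set(cp.options(section2)))
    if duplicates and raise_error:
        raise ValueError("Duplicate options %s in [%s] and [%s]"
                         % (duplicates, section1, section2))
    return duplicates

cp = configparser.ConfigParser()
cp.read_dict({'inspiral': {'pad-data': '8', 'sample-rate': '4096'},
              'tmpltbank': {'pad-data': '8'}})
```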
pycbc.workflow.configparser_test.interpolate_string(testString, cp, section)
    Take a string and replace all examples of ExtendedInterpolation formatting within the string with the exact value.
    For values like ${example} this is replaced with the value that corresponds to the option called example *in the same section*.
    For values like ${common|example} this is replaced with the value that corresponds to the option example in the section [common]. Note that in the python3 config parser this is ${common:example}, but python2.7 interprets the : the same as a = and this breaks things.
    Nested interpolation is not supported here.
    Parameters:
        - testString (String) – The string to parse and interpolate
        - cp (ConfigParser) – The ConfigParser object to look for the interpolation strings within
        - section (String) – The current section of the ConfigParser object
    Returns: testString – Interpolated string
    Return type: String
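The two substitution forms above (${example} in the current section, ${common|example} in a named section) can be sketched with a regular expression over a plain dict of values. This is an illustration of the documented substitution rules, not the library function, and the example sections and options are made up:

```python
import re

def interpolate_string(test_string, values, section):
    """Replace ${opt} and ${sec|opt} tokens using a nested dict
    {section: {option: value}} (sketch; no nested interpolation)."""
    def repl(match):
        token = match.group(1)
        if '|' in token:
            # ${common|example}: look up 'example' in section 'common'
            sec, opt = token.split('|', 1)
        else:
            # ${example}: look up 'example' in the current section
            sec, opt = section, token
        return values[sec][opt]
    return re.sub(r'\$\{([^}]+)\}', repl, test_string)

values = {'workflow': {'start': '1126051217'},
          'common': {'frame-type': 'H1_HOFT'}}
out = interpolate_string('gps ${start} type ${common|frame-type}',
                         values, 'workflow')
```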
pycbc.workflow.configparser_test.parse_workflow_ini_file(cpFile, parsed_filepath=None)
    Read a .ini file in, parse it as described in the documentation linked to above, and return the parsed ini file.
    Parameters:
        - cpFile (string) – The path to a .ini file to be read in
        - parsed_filepath (string, optional) – If provided, the .ini file, after parsing, will be written to this location
    Returns: cp
    Return type: The parsed ConfigParser class containing the read-in .ini file

pycbc.workflow.configparser_test.perform_extended_interpolation(cp, preserve_orig_file=False)
    Filter through an ini file and replace all examples of ExtendedInterpolation formatting with the exact value. For values like ${example} this is replaced with the value that corresponds to the option called example *in the same section*.
    For values like ${common|example} this is replaced with the value that corresponds to the option example in the section [common]. Note that in the python3 config parser this is ${common:example}, but python2.7 interprets the : the same as a = and this breaks things.
    Nested interpolation is not supported here.
    Parameters:
        - cp (ConfigParser object) –
        - preserve_orig_file (Boolean, optional) – By default the input ConfigParser object will be modified in place. If this is set, deepcopy will be used and the input will be preserved. Default = False
    Returns: cp
    Return type: parsed ConfigParser object

pycbc.workflow.configparser_test.read_ini_file(cpFile)
    Read a .ini file and return it as a ConfigParser class. This function does none of the parsing/combining of sections. It simply reads the file and returns it unedited.
    Parameters: cpFile (string) – The path to a .ini file to be read in
    Returns: cp
    Return type: The ConfigParser class containing the read-in .ini file

pycbc.workflow.configparser_test.sanity_check_subsections(cp)
    This function goes through the ConfigParser and checks that any options given in the [SECTION_NAME] section are not also given in any [SECTION_NAME-SUBSECTION] sections.
    Parameters: cp (The ConfigParser class) –
    Returns: None
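The check described above can be sketched by intersecting each base section's options with every matching subsection's options. A minimal illustration with the standard-library configparser (not the pycbc code; section and option names are assumptions):

```python
import configparser

def sanity_check_subsections(cp):
    """Raise ValueError if an option in [NAME] also appears in any
    [NAME-SUBSECTION] section (sketch of the documented check)."""
    for section in cp.sections():
        if '-' in section:
            continue  # only treat dash-free sections as base sections
        for other in cp.sections():
            if not other.startswith(section + '-'):
                continue
            clash = set(cp.options(section)) & set(cp.options(other))
            if clash:
                raise ValueError("Option(s) %s defined in both [%s] and [%s]"
                                 % (sorted(clash), section, other))

cp = configparser.ConfigParser()
cp.read_dict({'inspiral': {'pad-data': '8'},
              'inspiral-h1': {'channel-name': 'H1:GDS-CALIB_STRAIN'}})
sanity_check_subsections(cp)  # no clash, so this passes silently
```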
pycbc.workflow.configparser_test.split_multi_sections(cp, preserve_orig_file=False)
    Parse through a supplied ConfigParser object and split any sections labelled with an "&" sign (e.g. [inspiral&tmpltbank]) into [inspiral] and [tmpltbank] sections. If these individual sections already exist they will be appended to. If an option exists in both the [inspiral] and [inspiral&tmpltbank] sections an error will be raised.
    Parameters:
        - cp (The ConfigParser class) –
        - preserve_orig_file (Boolean, optional) – By default the input ConfigParser object will be modified in place. If this is set, deepcopy will be used and the input will be preserved. Default = False
    Returns: cp
    Return type: The ConfigParser class
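The splitting of an [inspiral&tmpltbank] section into its components, with merge-and-clash checking, can be sketched as follows. This is an illustration of the documented behaviour with the standard-library configparser, not the pycbc implementation:

```python
import configparser

def split_multi_sections(cp):
    """Split [a&b] sections into [a] and [b], merging into existing
    sections and raising on option clashes (sketch)."""
    for section in list(cp.sections()):
        if '&' not in section:
            continue
        for target in section.split('&'):
            if not cp.has_section(target):
                cp.add_section(target)
            for option, value in cp.items(section):
                if cp.has_option(target, option):
                    raise ValueError("Option %s exists in both [%s] and [%s]"
                                     % (option, target, section))
                cp.set(target, option, value)
        cp.remove_section(section)
    return cp

cp = configparser.ConfigParser()
cp.read_dict({'inspiral&tmpltbank': {'low-frequency-cutoff': '30'},
              'inspiral': {'pad-data': '8'}})
split_multi_sections(cp)
```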
pycbc.workflow.configuration module

This module provides a wrapper to the ConfigParser utilities for pycbc workflow construction. This module is described in the page here: https://ldas-jobs.ligo.caltech.edu/~cbc/docs/pycbc/ahope/initialization_inifile.html

class pycbc.workflow.configuration.WorkflowConfigParser(configFiles=None, overrideTuples=None, parsedFilePath=None, deleteTuples=None, copy_to_cwd=False)
    Bases: glue.pipeline.DeepCopyableConfigParser
    This is a sub-class of glue.pipeline.DeepCopyableConfigParser, which lets us add a few additional helper features that are useful in workflows.

    add_options_to_section(section, items, overwrite_options=False)
        Add a set of options and values to a section of a ConfigParser object. Will raise an error if any of the options being added already exist; this behaviour can be overridden if desired.
        Parameters:
            - section (string) – The name of the section to add options+values to
            - items (list of tuples) – Each tuple contains (at [0]) the option and (at [1]) the value to add to the section of the ini file
            - overwrite_options (Boolean, optional) – By default this function will raise a ValueError if an option exists in both the original section in the ConfigParser and in the provided items. If set to True, the options+values given in items will replace the original values. Default = False

    check_duplicate_options(section1, section2, raise_error=False)
        Check for duplicate options in two sections, section1 and section2. Will return a list of the duplicate options.
        Parameters:
            - section1 (string) – The name of the first section to compare
            - section2 (string) – The name of the second section to compare
            - raise_error (Boolean, optional (default=False)) – If True, raise an error if duplicates are present.
        Returns: duplicates – List of duplicate options
        Return type: List

    classmethod from_cli(opts)
        Initialize the config parser using options parsed from the command line. The parsed options opts must include the options provided by add_workflow_command_line_group().
        Parameters: opts (argparse.Namespace) – The command line arguments parsed by argparse

    get_cli_option(section, option_name, **kwds)
        Return an option using CLI action parsing.
        Returns: value – The parsed value for this option

    get_opt_tag(section, option, tag)
        Convenience function accessing get_opt_tags() for a single tag: see the documentation for that function. NB calling get_opt_tags() directly is preferred for simplicity.
        Parameters:
            - self (ConfigParser object) – The ConfigParser object (automatically passed when this is appended to the ConfigParser class)
            - section (string) – The section of the ConfigParser object to read
            - option (string) – The ConfigParser option to look for
            - tag (string) – The name of the subsection to look in, if not found in [section]
        Returns: The value of the option being searched for
        Return type: string

    get_opt_tags(section, option, tags)
        Supplement to ConfigParser.ConfigParser.get(). This will search for an option in [section] and, if it doesn't find it, will also try in [section-tag] for every value of tag in tags. Will raise a ConfigParser.Error if it cannot find a value.
        Parameters:
            - self (ConfigParser object) – The ConfigParser object (automatically passed when this is appended to the ConfigParser class)
            - section (string) – The section of the ConfigParser object to read
            - option (string) – The ConfigParser option to look for
            - tags (list of strings) – The names of subsections to look in, if not found in [section]
        Returns: The value of the option being searched for
        Return type: string
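The [section] then [section-tag] fallback search that get_opt_tags describes can be sketched with the standard-library configparser; a minimal illustration under those assumptions (example sections, options, and tag names are made up):

```python
import configparser

def get_opt_tags(cp, section, option, tags):
    """Look up option in [section], then fall back to [section-tag]
    for each tag; raise NoOptionError if nowhere found (sketch)."""
    if cp.has_option(section, option):
        return cp.get(section, option)
    for tag in tags:
        subsection = '%s-%s' % (section, tag)
        if cp.has_section(subsection) and cp.has_option(subsection, option):
            return cp.get(subsection, option)
    raise configparser.NoOptionError(option, section)

cp = configparser.ConfigParser()
cp.read_dict({'inspiral': {'pad-data': '8'},
              'inspiral-full_data': {'approximant': 'SPAtmplt'}})
```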
    has_option_tag(section, option, tag)
        Convenience function accessing has_option_tags() for a single tag: see the documentation for that function. NB calling has_option_tags() directly is preferred for simplicity.
        Parameters:
            - self (ConfigParser object) – The ConfigParser object (automatically passed when this is appended to the ConfigParser class)
            - section (string) – The section of the ConfigParser object to read
            - option (string) – The ConfigParser option to look for
            - tag (string) – The name of the subsection to look in, if not found in [section]
        Returns: Is the option in the section or [section-tag]
        Return type: Boolean

    has_option_tags(section, option, tags)
        Supplement to ConfigParser.ConfigParser.has_option(). This will search for an option in [section] and, if it doesn't find it, will also try in [section-tag] for each value in tags. Returns True if the option is found and False if not.
        Parameters:
            - self (ConfigParser object) – The ConfigParser object (automatically passed when this is appended to the ConfigParser class)
            - section (string) – The section of the ConfigParser object to read
            - option (string) – The ConfigParser option to look for
            - tags (list of strings) – The names of the subsections to look in, if not found in [section]
        Returns: Is the option in the section or [section-tag] (for tag in tags)
        Return type: Boolean

    interpolate_exe(testString)
        Replace testString with a path to an executable based on the format.
        If this looks like ${which:lalapps_tmpltbank} it will return the equivalent of which(lalapps_tmpltbank). Otherwise it will return the string unchanged.
        Parameters: testString (string) – The input string
        Returns: newString – The output string
        Return type: string
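The ${which:...} expansion described above can be sketched as follows. This is an illustration, not the library code; the which_fn parameter is an assumption added so the PATH lookup can be faked for testing (the real function would use something like shutil.which):

```python
import shutil

def interpolate_exe(test_string, which_fn=shutil.which):
    """Expand '${which:prog}' to the resolved path of prog; leave any
    other string (or an unresolvable program) unchanged (sketch)."""
    if test_string.startswith('${which:') and test_string.endswith('}'):
        prog = test_string[len('${which:'):-1]
        path = which_fn(prog)
        if path is not None:
            return path
    return test_string

# A fake 'which' table stands in for the real PATH lookup:
fake_which = {'lalapps_tmpltbank': '/usr/bin/lalapps_tmpltbank'}.get
```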
    interpolate_string(testString, section)
        Take a string and replace all examples of ExtendedInterpolation formatting within the string with the exact value.
        For values like ${example} this is replaced with the value that corresponds to the option called example *in the same section*.
        For values like ${common|example} this is replaced with the value that corresponds to the option example in the section [common]. Note that in the python3 config parser this is ${common:example}, but python2.7 interprets the : the same as a = and this breaks things.
        Nested interpolation is not supported here.
        Parameters:
            - testString (String) – The string to parse and interpolate
            - section (String) – The current section of the ConfigParser object
        Returns: testString – Interpolated string
        Return type: String

    perform_exe_expansion()
        This function will look through the executables section of the ConfigParser object and replace any values using macros with full paths.
        Any values that look like ${which:lalapps_tmpltbank} will be replaced with the equivalent of which(lalapps_tmpltbank). Other values will be unchanged.

    perform_extended_interpolation()
        Filter through an ini file and replace all examples of ExtendedInterpolation formatting with the exact value. For values like ${example} this is replaced with the value that corresponds to the option called example *in the same section*.
        For values like ${common|example} this is replaced with the value that corresponds to the option example in the section [common]. Note that in the python3 config parser this is ${common:example}, but python2.7 interprets the : the same as a = and this breaks things.
        Nested interpolation is not supported here.

        Parse the [sharedoptions] section of the ini file. That section should contain entries according to:
            - massparams = inspiral, tmpltbank
            - dataparams = tmpltbank
        This will result in all options in [sharedoptions-massparams] being copied into the [inspiral] and [tmpltbank] sections and the options in [sharedoptions-dataparams] being copied into [tmpltbank]. In the case of duplicates an error will be raised.

    read_ini_file(cpFile)
        Read a .ini file and return it as a ConfigParser class. This function does none of the parsing/combining of sections. It simply reads the file and returns it unedited.
        Stub awaiting more functionality – see configparser_test.py
        Parameters: cpFile (Path to .ini file, or list of paths) – The path(s) to a .ini file to be read in
        Returns: cp – The ConfigParser class containing the read-in .ini file
        Return type: ConfigParser

    resolve_file_url(test_string)
        Replace test_string with a path to an executable based on the format.
        If this looks like ${which:lalapps_tmpltbank} it will return the equivalent of which(lalapps_tmpltbank). Otherwise it will return the string unchanged.
        Parameters: test_string (string) – The input string
        Returns: new_string – The output string
        Return type: string

    resolve_urls()
        This function will look through all sections of the ConfigParser object and replace any URLs that are given the resolve magic flag with a path on the local drive.
        Specifically, any values that look like ${resolve:https://git.ligo.org/detchar/SOME_GATING_FILE.txt} will be replaced with the output of resolve_url(URL). Other values will be unchanged.

    sanity_check_subsections()
        This function goes through the ConfigParser and checks that any options given in the [SECTION_NAME] section are not also given in any [SECTION_NAME-SUBSECTION] sections.

    section_to_cli(section, skip_opts=None)
        Converts a section into a command-line string. For example:
            [section_name]
            foo =
            bar = 10
        yields: '--foo --bar 10'.
        Returns: The options as a command-line string
        Return type: string
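The conversion in the example above (empty values become bare flags, non-empty values become '--opt value') can be sketched with the standard-library configparser. An illustration of the documented conversion, not the pycbc implementation:

```python
import configparser

def section_to_cli(cp, section, skip_opts=None):
    """Turn one config section into a command-line string: options with
    empty values become bare flags, others become '--opt value' (sketch)."""
    skip_opts = skip_opts or []
    parts = []
    for opt in cp.options(section):
        if opt in skip_opts:
            continue
        value = cp.get(section, opt)
        parts.append('--' + opt)
        if value:
            parts.append(value)
    return ' '.join(parts)

cp = configparser.ConfigParser()
cp.read_dict({'section_name': {'foo': '', 'bar': '10'}})
cli = section_to_cli(cp, 'section_name')
```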
    split_multi_sections()
        Parse through the WorkflowConfigParser instance and split any sections labelled with an "&" sign (e.g. [inspiral&tmpltbank]) into [inspiral] and [tmpltbank] sections. If these individual sections already exist they will be appended to. If an option exists in both the [inspiral] and [inspiral&tmpltbank] sections an error will be raised.

pycbc.workflow.configuration.add_workflow_command_line_group(parser)
    The standard way of initializing a ConfigParser object in workflow is to do it from the command line. This is done by giving a
        --local-config-files filea.ini fileb.ini filec.ini
    command. You can also set config-file override commands on the command line. This will be most useful when setting (for example) start and end times, or active ifos. This is done by
        --config-overrides section1:option1:value1 section2:option2:value2 ...
    This can also be given as
        --config-overrides section1:option1
    where the value will be left as ''.
    To remove a configuration option, use the command line argument
        --config-delete section1:option1
    which will delete option1 from [section1], or
        --config-delete section1
    to delete all of the options in [section1].
    Deletes are implemented before overrides.
    This function returns an argparse OptionGroup to ensure these options are parsed correctly and can then be sent directly to initialize a WorkflowConfigParser.
    Parameters: parser (argparse.ArgumentParser instance) – The initialized argparse instance to add the workflow option group to.
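The argument group and the section:option:value override syntax described above can be sketched with argparse. This is an illustration of the documented interface, not the pycbc code; the exact flag behaviours and the override-splitting helper are assumptions:

```python
import argparse

def add_workflow_command_line_group(parser):
    """Add a configuration option group with the three documented
    arguments (sketch)."""
    group = parser.add_argument_group('Configuration')
    group.add_argument('--local-config-files', nargs='+', default=[],
                       help='List of .ini files to parse')
    group.add_argument('--config-overrides', nargs='*', default=[],
                       help='section:option:value entries (value optional)')
    group.add_argument('--config-delete', nargs='*', default=[],
                       help='section[:option] entries to delete')

parser = argparse.ArgumentParser()
add_workflow_command_line_group(parser)
opts = parser.parse_args(['--local-config-files', 'workflow.ini',
                          '--config-overrides',
                          'workflow:start-time:1126051217',
                          'workflow:verbose'])

# Split each override into (section, option, value); a missing value
# is left as '' per the documentation above.
overrides = []
for entry in opts.config_overrides:
    fields = entry.split(':')
    section, option = fields[0], fields[1]
    value = fields[2] if len(fields) > 2 else ''
    overrides.append((section, option, value))
```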
pycbc.workflow.configuration.istext(s, text_characters=None, threshold=0.3)
    Determines whether the string is binary data or text. This is done by checking whether a large proportion of characters are greater than 0x7E (0x7F is <DEL> and unprintable) or are low-bit control codes, i.e. things that you wouldn't see (often) in a text file. (ASCII past 0x7F might appear, but rarely.)
    Code modified from https://www.safaribooksonline.com/library/view/python-cookbook-2nd/0596007973/ch01s12.html
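The heuristic described above (fraction of non-printable bytes against a threshold) can be sketched as follows. This is a minimal reimplementation of the idea, not the pycbc or cookbook code; the exact printable set and NUL handling are assumptions:

```python
def istext(s, text_characters=None, threshold=0.3):
    """Classify a byte string as text (True) or binary (False) by the
    fraction of bytes outside a printable set (sketch)."""
    if text_characters is None:
        # Printable ASCII plus common whitespace/control characters
        text_characters = bytes(range(0x20, 0x7F)) + b'\n\r\t\b'
    if not s:
        return True       # empty input counts as text
    if b'\x00' in s:
        return False      # NUL bytes strongly suggest binary data
    nontext = sum(byte not in text_characters for byte in s)
    return nontext / len(s) <= threshold

is_text = istext(b'[workflow]\nstart-time = 1126051217\n')
is_binary = istext(b'\x00\x01\x02\xff' * 8)
```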
pycbc.workflow.configuration.resolve_url(url, directory=None, permissions=None, copy_to_cwd=True)
    Resolves a URL to a local file, and returns the path to that file.
    If a URL is given, the file will be copied to the current working directory. If a local file path is given, the file will only be copied to the current working directory if copy_to_cwd is True (the default).
pycbc.workflow.core module

This module provides the worker functions and classes that are used when creating a workflow. For details about the workflow module see here: https://ldas-jobs.ligo.caltech.edu/~cbc/docs/pycbc/ahope.html

exception pycbc.workflow.core.CalledProcessErrorMod(returncode, cmd, errFile=None, outFile=None, cmdFile=None)
    Bases: exceptions.Exception
    This exception is raised when subprocess.call returns a non-zero exit code and checking has been requested. It should not be accessed by the user; it is used only within make_external_call.
class pycbc.workflow.core.ContentHandler(document, start_handlers={})
    Bases: glue.ligolw.ligolw.LIGOLWContentHandler
    startColumn(parent, attrs)
    startStream(parent, attrs, __orig_startStream=<unbound method ContentHandler.startStream>)
    startTable(parent, attrs, __orig_startTable=<unbound method ContentHandler.startTable>)
class pycbc.workflow.core.Executable(cp, name, universe=None, ifos=None, out_dir=None, tags=None)
    Bases: pycbc.workflow.pegasus_workflow.Executable
    ALL_TRIGGERS = 2
    FINAL_RESULT = 4
    INTERMEDIATE_PRODUCT = 1
    KEEP_BUT_RAISE_WARNING = 5
    MERGED_TRIGGERS = 3

    add_ini_opts(cp, sec)
        Add job-specific options from the configuration file.
        Parameters:
            - cp (ConfigParser object) – The ConfigParser object holding the workflow configuration settings
            - sec (string) – The section containing options for this job
    add_ini_profile(cp, sec)
        Add a profile from the configuration file.
        Parameters:
            - cp (ConfigParser object) – The ConfigParser object holding the workflow configuration settings
            - sec (string) – The section containing options for this job
    add_opt(opt, value=None)
        Add an option to the job.
        Parameters:
            - opt (string) – Name of the option (e.g. --output-file-format)
            - value (string, default=None) – The value for the option (no value if set to None)
    create_node()
        Default node constructor. This is usually overridden by subclasses of Executable.
    current_retention_level = 5
    file_input_options = []
    get_opt(opt)
        Get the value of an option from the configuration file.
        Parameters: opt (string) – Name of the option (e.g. output-file-format)
        Returns: value – The value for the option. Returns None if the option is not present.
        Return type: string
    has_opt(opt)
        Check whether an option is present in the configuration file.
        Parameters: opt (string) – Name of the option (e.g. output-file-format)
    ifo
        Return the ifo.
        If there is only one ifo in the ifo list this will be that ifo. Otherwise an error is raised.
    update_current_retention_level(value)
        Set a new value for the current retention level. This updates the value of self.retain_files for the updated value of the retention level.
        Parameters: value (int) – The new value to use for the retention level.

        Set a new set of tags for this executable. Update the set of tags that this job will use. This updates default file naming and shared options. It will not update the pegasus profile, which belongs to the executable and cannot differ between nodes.
        Parameters: tags (list) – The new list of tags to consider.
class pycbc.workflow.core.File(ifos, exe_name, segs, file_url=None, extension=None, directory=None, tags=None, store_file=True, use_tmp_subdirs=False)
    Bases: pycbc.workflow.pegasus_workflow.File
    This class holds the details of an individual output file. This file(s) may be pre-supplied, generated from within the workflow command line script, or generated within the workflow. The important stuff is:
    - The ifo that the File is valid for
    - The time span that the File is valid for
    - A short description of what the file is
    - The extension that the file should have
    - The url where the file should be located
    An example of initiating this class:
    >> c = File("H1", "INSPIRAL_S6LOWMASS", segments.segment(815901601, 815902001), file_url="file://localhost/home/spxiwh/H1-INSPIRAL_S6LOWMASS-815901601-400.xml.gz")
    another where the file url is generated from the inputs:
    >> c = File("H1", "INSPIRAL_S6LOWMASS", segments.segment(815901601, 815902001), directory="/home/spxiwh", extension="xml.gz")

    cache_entry
        Returns a CacheEntry instance for the File.
    ifo
        If there is only one ifo in the ifo_list this will be that ifo. Otherwise an error is raised.
    segment
        If there is only one segment in the segmentlist this will be that segment. Otherwise an error is raised.
-
class
pycbc.workflow.core.
FileList
[source]¶ Bases:
list
This class holds a list of File objects. It inherits from the built-in list class, but also allows a number of features. ONLY pycbc.workflow.File instances should be within a FileList instance.
-
categorize_by_attr
(attribute)[source]¶ Function to categorize a FileList by a File object attribute (e.g. 'segment', 'ifo', 'description').
Parameters: attribute (string) – File object attribute to categorize FileList Returns: - keys (list) – A list of values for an attribute
- groups (list) – A list of FileLists
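The grouping behaviour described above can be sketched in plain Python; the `SimpleNamespace` stand-ins and the helper name are illustrative, not the real pycbc `File`/`FileList` API:

```python
from collections import OrderedDict
from types import SimpleNamespace

def categorize_by_attr(files, attribute):
    """Group file-like objects by one attribute; return (keys, groups)."""
    groups = OrderedDict()
    for f in files:
        groups.setdefault(getattr(f, attribute), []).append(f)
    return list(groups.keys()), list(groups.values())

# Illustrative stand-ins for workflow File objects
files = [SimpleNamespace(ifo="H1", description="INSPIRAL"),
         SimpleNamespace(ifo="L1", description="INSPIRAL"),
         SimpleNamespace(ifo="H1", description="TMPLTBANK")]

keys, groups = categorize_by_attr(files, "ifo")
# keys -> ["H1", "L1"]; groups[0] holds both H1 files
```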
-
find_all_output_in_range
(ifo, currSeg, useSplitLists=False)[source]¶ Return all files that overlap the specified segment.
-
find_output
(ifo, time)[source]¶ Returns one File most appropriate at the given time/time range.
Return one File that covers the given time, or is most appropriate for the supplied time range.
Parameters: - ifo (string) – Name of the ifo (or ifos) that the file should be valid for.
- time (int/float/LIGOGPStime or tuple containing two values) – If an int/float/LIGOGPStime (or similar way of specifying one time) is given, return the File corresponding to that time. This calls self.find_output_at_time(ifo, time). If a tuple of two values is given, return the File that is most appropriate for that time range. This calls self.find_output_in_range
Returns: pycbc_file – The File that corresponds to the time or time range
Return type: pycbc.workflow.File instance
-
find_output_at_time
(ifo, time)[source]¶ Return File that covers the given time.
Parameters: - ifo (string) – Name of the ifo (or ifos) that the File should correspond to
- time (int/float/LIGOGPStime) – Return the Files that covers the supplied time. If no File covers the time this will return None.
Returns: The Files that correspond to the time.
Return type: list of File classes
-
find_output_in_range
(ifo, start, end)[source]¶ Return the File that is most appropriate for the supplied time range. That is, the File whose coverage time has the largest overlap with the supplied time range. If no Files overlap the supplied time window, will return None.
Parameters: - ifo (string) – Name of the ifo (or ifos) that the File should correspond to
- start (int/float/LIGOGPStime) – The start of the time range of interest.
- end (int/float/LIGOGPStime) – The end of the time range of interest
Returns: The File that is most appropriate for the time range
Return type: File class
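The "largest overlap" selection documented for find_output_at_time and find_output_in_range can be sketched with plain dicts and (start, end) tuples standing in for File objects and segments; this is an illustrative sketch, not the pycbc implementation:

```python
def overlap(seg_a, seg_b):
    """Length of the intersection of two (start, end) segments."""
    return max(0, min(seg_a[1], seg_b[1]) - max(seg_a[0], seg_b[0]))

def find_output_at_time(files, ifo, time):
    """Files for `ifo` whose segment covers `time`; None if none do."""
    matches = [f for f in files
               if f["ifo"] == ifo and f["seg"][0] <= time < f["seg"][1]]
    return matches or None

def find_output_in_range(files, ifo, start, end):
    """File for `ifo` with the largest overlap with [start, end); None if no overlap."""
    candidates = [f for f in files if f["ifo"] == ifo]
    best = max(candidates,
               key=lambda f: overlap(f["seg"], (start, end)),
               default=None)
    if best is None or overlap(best["seg"], (start, end)) == 0:
        return None
    return best

files = [{"ifo": "H1", "seg": (100, 200)},
         {"ifo": "H1", "seg": (200, 400)}]
# A range covering 180-320 overlaps the second file most (120 s vs 20 s)
```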
-
find_outputs_in_range
(ifo, current_segment, useSplitLists=False)[source]¶ Return the list of Files that is most appropriate for the supplied time range. That is, the Files whose coverage time has the largest overlap with the supplied time range.
Parameters: - ifo (string) – Name of the ifo (or ifos) that the File should correspond to
- current_segment (glue.segment.segment) – The segment of time that files must intersect.
Returns: The list of Files that are most appropriate for the time range
Return type: FileList class
-
-
class
pycbc.workflow.core.
Node
(executable)[source]¶ Bases:
pycbc.workflow.pegasus_workflow.Node
-
add_multiifo_input_list_opt
(opt, inputs)[source]¶ Add an option that determines a list of inputs from multiple detectors. Files will be supplied as --opt ifo1:input1 ifo2:input2 ...
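The --opt ifo1:input1 ifo2:input2 format described here can be illustrated with a small helper (the helper name is hypothetical, not part of pycbc):

```python
def multiifo_list_opt(opt, inputs):
    """Build '--opt ifo1:path1 ifo2:path2 ...' from (ifo, path) pairs.
    A sketch of the documented format, not the pycbc implementation."""
    parts = ["--" + opt]
    parts.extend("%s:%s" % (ifo, path) for ifo, path in inputs)
    return " ".join(parts)

# multiifo_list_opt("statmap-files", [("H1", "h1.hdf"), ("L1", "l1.hdf")])
# -> "--statmap-files H1:h1.hdf L1:l1.hdf"
```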
-
add_multiifo_output_list_opt
(opt, outputs)[source]¶ Add an option that determines a list of outputs from multiple detectors. Files will be supplied as --opt ifo1:output1 ifo2:output2 ...
-
new_multiifo_output_list_opt
(opt, ifos, analysis_time, extension, tags=None, store_file=None, use_tmp_subdirs=False)[source]¶ Add an option that determines a list of outputs from multiple detectors. Files will be supplied as --opt ifo1:output1 ifo2:output2 ... File names are created internally from the provided extension and analysis time.
-
new_output_file_opt
(valid_seg, extension, option_name, tags=None, store_file=None, use_tmp_subdirs=False)[source]¶ This function will create a workflow.File object corresponding to the given information and then add that file as output of this node.
Parameters: - valid_seg (ligo.segments.segment) – The time span over which the job is valid.
- extension (string) – The extension to be used at the end of the filename. E.g. '.xml' or '.sqlite'.
- option_name (string) – The option that is used when setting this job as output, e.g. 'output-name' or 'output-file', whatever is appropriate for the current executable.
- tags (list of strings, (optional, default=[])) – These tags will be added to the list of tags already associated with the job. They can be used to uniquely identify this output file.
- store_file (Boolean, (optional, default=True)) – If True, this file is added to the output mapper and will be stored in the specified output location. If False, the file will be removed when no longer needed in the workflow.
-
output_file
¶ If only one output file return it. Otherwise raise an exception.
-
output_files
¶
-
-
class
pycbc.workflow.core.
SegFile
(ifo_list, description, valid_segment, segment_dict=None, seg_summ_dict=None, **kwargs)[source]¶ Bases:
pycbc.workflow.core.File
This class inherits from the File class, and is designed to store workflow output files containing a segment dict. This is identical in usage to File except for an additional kwarg for holding the segment dictionary, if it is known at workflow run time.
-
classmethod
from_multi_segment_list
(description, segmentlists, names, ifos, seg_summ_lists=None, **kwargs)[source]¶ Initialize a SegFile object from a list of segmentlists.
Parameters: - description (string (required)) – See File.__init__
- segmentlists (list of ligo.segments.segmentlist) – List of segment lists that will be stored in this file.
- names (List of str) – List of names of the segment lists to be stored in the file.
- ifos (list of str) – List of ifos of the segment lists to be stored in this file.
- seg_summ_lists (list of ligo.segments.segmentlist, optional) – Specify the segment_summary segmentlists that go along with the segmentlists. Default=None; in this case segment_summary is taken from the valid_segment of the SegFile class.
-
classmethod
from_segment_list
(description, segmentlist, name, ifo, seg_summ_list=None, **kwargs)[source]¶ Initialize a SegFile object from a segmentlist.
Parameters: - description (string (required)) – See File.__init__
- segmentlist (ligo.segments.segmentlist) – The segment list that will be stored in this file.
- name (str) – The name of the segment list to be stored in the file.
- ifo (str) – The ifo of the segment list to be stored in this file.
- seg_summ_list (ligo.segments.segmentlist, optional) – Specify the segment_summary segmentlist that goes along with the segmentlist. Default=None; in this case segment_summary is taken from the valid_segment of the SegFile class.
-
classmethod
from_segment_list_dict
(description, segmentlistdict, ifo_list=None, valid_segment=None, file_exists=False, seg_summ_dict=None, **kwargs)[source]¶ Initialize a SegFile object from a segmentlistdict.
Parameters: - description (string (required)) – See File.__init__
- segmentlistdict (ligo.segments.segmentlistdict) – See SegFile.__init__
- ifo_list (string or list (optional)) – See File.__init__, if not given a list of all ifos in the segmentlistdict object will be used
- valid_segment (ligo.segments.segment or ligo.segments.segmentlist) – See File.__init__, if not given the extent of all segments in the segmentlistdict is used.
- file_exists (boolean (default = False)) – If provided and set to True it is assumed that this file already exists on disk and so there is no need to write again.
- seg_summ_dict (ligo.segments.segmentlistdict, optional) – See SegFile.__init__.
-
classmethod
from_segment_xml
(xml_file, **kwargs)[source]¶ Read a ligo.segments.segmentlist from a file object containing an XML segment table.
Parameters: xml_file (file object) – file object for segment xml file
-
-
class
pycbc.workflow.core.
Workflow
(args, name)[source]¶ Bases:
pycbc.workflow.pegasus_workflow.Workflow
This class manages a pycbc workflow. It provides convenience functions for finding input files using time and keywords. It can also generate cache files from the inputs.
-
output_map
¶
-
save
(filename=None, output_map_path=None, transformation_catalog_path=None, staging_site=None)[source]¶ Write this workflow to a DAX file
-
save_config
(fname, output_dir, cp=None)[source]¶ Writes configuration file to disk and returns a pycbc.workflow.File instance for the configuration file.
Parameters: - fname (string) – The filename of the configuration file written to disk.
- output_dir (string) – The directory where the file is written to disk.
- cp (ConfigParser object) – The ConfigParser object to write. If None then uses self.cp.
Returns: The FileList object with the configuration file.
Return type: FileList
-
static
set_job_properties
(job, output_map_file, transformation_catalog_file, staging_site=None)[source]¶
-
staging_site
¶
-
transformation_catalog
¶
-
-
pycbc.workflow.core.
check_output_error_and_retcode
(*popenargs, **kwargs)[source]¶ This function is used to obtain the stdout of a command. It is only used internally; we recommend using make_external_call if you want to call external executables.
-
pycbc.workflow.core.
get_full_analysis_chunk
(science_segs)[source]¶ Function to find the first and last time point contained in the science segments and return a single segment spanning that full time.
Parameters: science_segs (ifo-keyed dictionary of ligo.segments.segmentlist instances) – The list of times that are being analysed in this workflow. Returns: fullSegment – The segment spanning the first and last time point contained in science_segs. Return type: ligo.segments.segment
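A minimal sketch of this first-to-last span calculation, using plain (start, end) tuples in place of ligo.segments objects:

```python
def get_full_analysis_chunk(science_segs):
    """Span from the earliest start to the latest end over an
    ifo-keyed dict of [(start, end), ...] lists (plain tuples stand
    in for ligo.segments objects in this sketch)."""
    starts = [seg[0] for segs in science_segs.values() for seg in segs]
    ends = [seg[1] for segs in science_segs.values() for seg in segs]
    return (min(starts), max(ends))

science_segs = {"H1": [(100, 200), (300, 500)],
                "L1": [(150, 450)]}
# get_full_analysis_chunk(science_segs) -> (100, 500)
```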
-
pycbc.workflow.core.
get_random_label
()[source]¶ Get a random label string to use when clustering jobs.
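One plausible way to generate such a label (an illustrative scheme; the real implementation may differ):

```python
import uuid

def get_random_label():
    """Sketch: eight random hex characters, upper-cased, suitable as a
    clustering label. Not necessarily the scheme pycbc uses."""
    return uuid.uuid4().hex[:8].upper()
```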
-
pycbc.workflow.core.
is_condor_exec
(exe_path)[source]¶ Determine if an executable is condor-compiled
Parameters: exe_path (str) – The executable path Returns: truth_value – Return True if the exe is condor compiled, False otherwise. Return type: boolean
-
pycbc.workflow.core.
make_analysis_dir
(path)[source]¶ Make the analysis directory path, any parent directories that don’t already exist, and the ‘logs’ subdirectory of path.
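A minimal sketch of this behaviour, assuming standard os.makedirs semantics (the real function may handle errors or permissions differently):

```python
import os

def make_analysis_dir(path):
    """Create `path`, any missing parents, and its 'logs' subdirectory.
    A sketch of the documented behaviour, not the pycbc source."""
    if path is not None:
        os.makedirs(os.path.join(path, "logs"), exist_ok=True)
```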
-
pycbc.workflow.core.
make_external_call
(cmdList, out_dir=None, out_basename='external_call', shell=False, fail_on_error=True)[source]¶ Use this to make an external call using the python subprocess module. See the subprocess documentation for more details of how this works. http://docs.python.org/2/library/subprocess.html
Parameters: - cmdList (list of strings) – This list of strings contains the command to be run. See the subprocess documentation for more details.
- out_dir (string) – If given, the stdout and stderr will be redirected to os.path.join(out_dir, out_basename + [".err", ".out"]). If not given the stdout and stderr will not be recorded.
- out_basename (string) – The value of out_basename used to construct the file names used to store stderr and stdout. See out_dir for more information.
- shell (boolean, default=False) – This value will be given as the shell kwarg to the subprocess call. WARNING: see the subprocess documentation for details on this kwarg, including a warning about a serious security exploit. Do not use this unless you are sure it is necessary and safe.
- fail_on_error (boolean, default=True) – If set to True an exception will be raised if the external command does not return a code of 0. If set to False such failures will be ignored. Stderr and stdout can be stored in either case using the out_dir and out_basename options.
Returns: exitCode – The code returned by the process.
Return type: int
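The documented behaviour (optional stdout/stderr capture, optional failure on a nonzero return code) can be sketched as follows; this is an illustrative sketch, not the pycbc implementation:

```python
import os
import subprocess

def make_external_call(cmd_list, out_dir=None, out_basename="external_call",
                       fail_on_error=True):
    """Run a command, optionally capturing stdout/stderr to
    out_dir/out_basename{.out,.err}, and raise on a nonzero return
    code when fail_on_error is set. Sketch only."""
    stdout = stderr = None
    if out_dir is not None:
        base = os.path.join(out_dir, out_basename)
        stdout = open(base + ".out", "w")
        stderr = open(base + ".err", "w")
    try:
        ret = subprocess.call(cmd_list, stdout=stdout, stderr=stderr)
    finally:
        for handle in (stdout, stderr):
            if handle is not None:
                handle.close()
    if fail_on_error and ret != 0:
        raise RuntimeError("%s failed with return code %d" % (cmd_list, ret))
    return ret
```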
pycbc.workflow.datafind module¶
This module is responsible for querying a datafind server to determine the availability of the data that the code is attempting to run on. It also performs a number of tests and can act on these as described below. Full documentation for this function can be found here: https://ldas-jobs.ligo.caltech.edu/~cbc/docs/pycbc/ahope/datafind.html
-
class
pycbc.workflow.datafind.
ContentHandler
(document, start_handlers={})[source]¶ Bases:
glue.ligolw.ligolw.LIGOLWContentHandler
-
startColumn
(parent, attrs)¶
-
startStream
(parent, attrs, __orig_startStream=<unbound method ContentHandler.startStream>)¶
-
startTable
(parent, attrs, __orig_startTable=<unbound method ContentHandler.startTable>)¶
-
-
pycbc.workflow.datafind.
convert_cachelist_to_filelist
(datafindcache_list)[source]¶ Take as input a list of glue.lal.Cache objects and return a pycbc FileList containing all frames within those caches.
Parameters: datafindcache_list (list of glue.lal.Cache objects) – The list of cache files to convert. Returns: datafind_filelist – The list of frame files. Return type: FileList of frame File objects
-
pycbc.workflow.datafind.
datafind_keep_unique_backups
(backup_outs, orig_outs)[source]¶ This function will take a list of backup datafind files, presumably obtained by querying a remote datafind server, e.g. CIT, and compares these against a list of original datafind files, presumably obtained by querying the local datafind server. Only the datafind files in the backup list that do not appear in the original list are returned. This allows us to use only files that are missing from the local cluster.
Parameters: - backup_outs (FileList) – List of backup datafind files.
- orig_outs (FileList) – List of original datafind files.
Returns: List of datafind files in backup_outs and not in orig_outs.
Return type: FileList
-
pycbc.workflow.datafind.
get_missing_segs_from_frame_file_cache
(datafindcaches)[source]¶ This function will use os.path.isfile to determine if all the frame files returned by the local datafind server actually exist on the disk. This can then be used to update the science times if needed.
Parameters: datafindcaches (OutGroupList) – List of all the datafind output files. Returns: - missingFrameSegs (Dict. of ifo keyed glue.segment.segmentlist instances) – The times corresponding to missing frames found in datafindOuts.
- missingFrames (Dict. of ifo keyed lal.Cache instances) – The list of missing frames
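The existence check at the heart of this function can be sketched with plain path lists (illustrative; the real function works on cache objects and returns segment lists):

```python
import os

def find_missing_frames(frame_paths):
    """Partition frame-file paths into (present, missing) using
    os.path.isfile, as the docstring above describes."""
    present = [p for p in frame_paths if os.path.isfile(p)]
    missing = [p for p in frame_paths if not os.path.isfile(p)]
    return present, missing
```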
-
pycbc.workflow.datafind.
get_science_segs_from_datafind_outs
(datafindcaches)[source]¶ This function will calculate the science segments that are covered in the OutGroupList containing the frame files returned by various calls to the datafind server. This can then be used to check whether this list covers what it is expected to cover.
Parameters: datafindcaches (OutGroupList) – List of all the datafind output files. Returns: newScienceSegs – The times covered by the frames found in datafindOuts. Return type: Dictionary of ifo keyed glue.segment.segmentlist instances
-
pycbc.workflow.datafind.
get_segment_summary_times
(scienceFile, segmentName)[source]¶ This function will find the times for which the segment_summary is set for the flag given by segmentName.
Parameters: - scienceFile (SegFile) – The segment file that we want to use to determine this.
- segmentName (string) – The DQ flag to search for times in the segment_summary table.
Returns: summSegList – The times that are covered in the segment summary table.
Return type: ligo.segments.segmentlist
-
pycbc.workflow.datafind.
log_datafind_command
(observatory, frameType, startTime, endTime, outputDir, **dfKwargs)[source]¶ This function will write an equivalent gw_data_find command to disk that can be used to debug why the internal datafind module is not working.
-
pycbc.workflow.datafind.
run_datafind_instance
(cp, outputDir, connection, observatory, frameType, startTime, endTime, ifo, tags=None)[source]¶ This function will query the datafind server once to find frames between the specified times for the specified frame type and observatory.
Parameters: - cp (ConfigParser instance) – Source for any kwargs that should be sent to the datafind module
- outputDir (path) – Output cache files will be written here. We also write the commands for reproducing what is done in this function to this directory.
- connection (datafind connection object) – Initialized through the gwdatafind module, this is the open connection to the datafind server.
- observatory (string) – The observatory to query frames for. Ex. ‘H’, ‘L’ or ‘V’. NB: not ‘H1’, ‘L1’, ‘V1’ which denote interferometers.
- frameType (string) – The frame type to query for.
- startTime (int) – Integer start time to query the datafind server for frames.
- endTime (int) – Integer end time to query the datafind server for frames.
- ifo (string) – The interferometer to use for naming output. Ex. ‘H1’, ‘L1’, ‘V1’. Maybe this could be merged with the observatory string, but this could cause issues if running on old ‘H2’ and ‘H1’ data.
- tags (list of string, optional (default=None)) – Use this to specify tags. This can be used if this module is being called more than once to give call specific configuration (by setting options in [workflow-datafind-${TAG}] rather than [workflow-datafind]). This is also used to tag the Files returned by the class to uniqueify the Files and uniquify the actual filename. FIXME: Filenames may not be unique with current codes!
Returns: - dfCache (glue.lal.Cache instance) – The glue.lal.Cache representation of the call to the datafind server and the returned frame files.
- cacheFile (pycbc.workflow.core.File) – Cache file listing all of the datafind output files for use later in the pipeline.
-
pycbc.workflow.datafind.
setup_datafind_from_pregenerated_lcf_files
(cp, ifos, outputDir, tags=None)[source]¶ This function is used if you want to run with pregenerated lcf frame cache files.
Parameters: - cp (ConfigParser.ConfigParser instance) – This contains a representation of the information stored within the workflow configuration files
- ifos (list of ifo strings) – List of ifos to get pregenerated files for.
- outputDir (path) – All output files written by datafind processes will be written to this directory. Currently this sub-module writes no output.
- tags (list of strings, optional (default=None)) – Use this to specify tags. This can be used if this module is being called more than once to give call specific configuration (by setting options in [workflow-datafind-${TAG}] rather than [workflow-datafind]). This is also used to tag the Files returned by the class to uniqueify the Files and uniqueify the actual filename.
Returns: - datafindcaches (list of glue.lal.Cache instances) – The glue.lal.Cache representations of the various calls to the datafind server and the returned frame files.
- datafindOuts (pycbc.workflow.core.FileList) – List of all the datafind output files for use later in the pipeline.
-
pycbc.workflow.datafind.
setup_datafind_runtime_cache_multi_calls_perifo
(cp, scienceSegs, outputDir, tags=None)[source]¶ This function uses the gwdatafind library to obtain the location of all the frame files that will be needed to cover the analysis of the data given in scienceSegs. This function will not check if the returned frames cover the whole time requested; such sanity checks are done in the pycbc.workflow.setup_datafind_workflow entry function. As opposed to setup_datafind_runtime_single_call_perifo, this function will make one call to the datafind server for every science segment. It will return a list of output files that correspond to the cache .lcf files that are produced, which list the locations of all frame files. This will cause problems with pegasus, which expects to know about all input files (i.e. the frame files themselves).
Parameters: - cp (ConfigParser.ConfigParser instance) – This contains a representation of the information stored within the workflow configuration files
- scienceSegs (Dictionary of ifo keyed glue.segment.segmentlist instances) – This contains the times that the workflow is expected to analyse.
- outputDir (path) – All output files written by datafind processes will be written to this directory.
- tags (list of strings, optional (default=None)) – Use this to specify tags. This can be used if this module is being called more than once to give call specific configuration (by setting options in [workflow-datafind-${TAG}] rather than [workflow-datafind]). This is also used to tag the Files returned by the class to uniqueify the Files and uniqueify the actual filename. FIXME: Filenames may not be unique with current codes!
Returns: - datafindcaches (list of glue.lal.Cache instances) – The glue.lal.Cache representations of the various calls to the datafind server and the returned frame files.
- datafindOuts (pycbc.workflow.core.FileList) – List of all the datafind output files for use later in the pipeline.
-
pycbc.workflow.datafind.
setup_datafind_runtime_cache_single_call_perifo
(cp, scienceSegs, outputDir, tags=None)[source]¶ This function uses the gwdatafind library to obtain the location of all the frame files that will be needed to cover the analysis of the data given in scienceSegs. This function will not check if the returned frames cover the whole time requested; such sanity checks are done in the pycbc.workflow.setup_datafind_workflow entry function. As opposed to setup_datafind_runtime_generated, this function will make only one call to datafind per ifo, spanning the whole time. It will return a list of output files that correspond to the cache .lcf files that are produced, which list the locations of all frame files. This will cause problems with pegasus, which expects to know about all input files (i.e. the frame files themselves).
Parameters: - cp (ConfigParser.ConfigParser instance) – This contains a representation of the information stored within the workflow configuration files
- scienceSegs (Dictionary of ifo keyed glue.segment.segmentlist instances) – This contains the times that the workflow is expected to analyse.
- outputDir (path) – All output files written by datafind processes will be written to this directory.
- tags (list of strings, optional (default=None)) – Use this to specify tags. This can be used if this module is being called more than once to give call specific configuration (by setting options in [workflow-datafind-${TAG}] rather than [workflow-datafind]). This is also used to tag the Files returned by the class to uniqueify the Files and uniqueify the actual filename. FIXME: Filenames may not be unique with current codes!
Returns: - datafindcaches (list of glue.lal.Cache instances) – The glue.lal.Cache representations of the various calls to the datafind server and the returned frame files.
- datafindOuts (pycbc.workflow.core.FileList) – List of all the datafind output files for use later in the pipeline.
-
pycbc.workflow.datafind.
setup_datafind_runtime_frames_multi_calls_perifo
(cp, scienceSegs, outputDir, tags=None)[source]¶ This function uses the gwdatafind library to obtain the location of all the frame files that will be needed to cover the analysis of the data given in scienceSegs. This function will not check if the returned frames cover the whole time requested; such sanity checks are done in the pycbc.workflow.setup_datafind_workflow entry function. As opposed to setup_datafind_runtime_single_call_perifo, this function will make one call to the datafind server for every science segment. It will return a list of files corresponding to the individual frames returned by the datafind query. This will allow pegasus to more easily identify all the files used as input, but may cause problems for codes that need to take frame cache files as input.
Parameters: - cp (ConfigParser.ConfigParser instance) – This contains a representation of the information stored within the workflow configuration files
- scienceSegs (Dictionary of ifo keyed glue.segment.segmentlist instances) – This contains the times that the workflow is expected to analyse.
- outputDir (path) – All output files written by datafind processes will be written to this directory.
- tags (list of strings, optional (default=None)) – Use this to specify tags. This can be used if this module is being called more than once to give call specific configuration (by setting options in [workflow-datafind-${TAG}] rather than [workflow-datafind]). This is also used to tag the Files returned by the class to uniqueify the Files and uniqueify the actual filename. FIXME: Filenames may not be unique with current codes!
Returns: - datafindcaches (list of glue.lal.Cache instances) – The glue.lal.Cache representations of the various calls to the datafind server and the returned frame files.
- datafindOuts (pycbc.workflow.core.FileList) – List of all the datafind output files for use later in the pipeline.
-
pycbc.workflow.datafind.
setup_datafind_runtime_frames_single_call_perifo
(cp, scienceSegs, outputDir, tags=None)[source]¶ This function uses the gwdatafind library to obtain the location of all the frame files that will be needed to cover the analysis of the data given in scienceSegs. This function will not check if the returned frames cover the whole time requested; such sanity checks are done in the pycbc.workflow.setup_datafind_workflow entry function. As opposed to setup_datafind_runtime_generated, this function will make only one call to datafind per ifo, spanning the whole time. It will return a list of files corresponding to the individual frames returned by the datafind query. This will allow pegasus to more easily identify all the files used as input, but may cause problems for codes that need to take frame cache files as input.
Parameters: - cp (ConfigParser.ConfigParser instance) – This contains a representation of the information stored within the workflow configuration files
- scienceSegs (Dictionary of ifo keyed glue.segment.segmentlist instances) – This contains the times that the workflow is expected to analyse.
- outputDir (path) – All output files written by datafind processes will be written to this directory.
- tags (list of strings, optional (default=None)) – Use this to specify tags. This can be used if this module is being called more than once to give call specific configuration (by setting options in [workflow-datafind-${TAG}] rather than [workflow-datafind]). This is also used to tag the Files returned by the class to uniqueify the Files and uniqueify the actual filename. FIXME: Filenames may not be unique with current codes!
Returns: - datafindcaches (list of glue.lal.Cache instances) – The glue.lal.Cache representations of the various calls to the datafind server and the returned frame files.
- datafindOuts (pycbc.workflow.core.FileList) – List of all the datafind output files for use later in the pipeline.
-
pycbc.workflow.datafind.
setup_datafind_server_connection
(cp, tags=None)[source]¶ This function is responsible for setting up the connection with the datafind server.
Parameters: cp (pycbc.workflow.configuration.WorkflowConfigParser) – The memory representation of the ConfigParser Returns: The open connection to the datafind server. Return type: connection
-
pycbc.workflow.datafind.
setup_datafind_workflow
(workflow, scienceSegs, outputDir, seg_file=None, tags=None)[source]¶ Set up the datafind section of the workflow. This section is responsible for generating, or setting up the workflow to generate, a list of files that record the location of the frame files needed to perform the analysis. There could be multiple options here: the datafind jobs could be done at run time or could be put into a dag. The subsequent jobs will know what was done here from the OutFileList containing the datafind jobs (and the Dagman nodes, if appropriate). For now the only implemented option is to generate the datafind files at runtime. This module can also check if the frame files actually exist, check whether the obtained segments line up with the original ones, and update the science segments to reflect missing data files.
Parameters: - workflow (pycbc.workflow.core.Workflow) – The workflow class that stores the jobs that will be run.
- scienceSegs (Dictionary of ifo keyed glue.segment.segmentlist instances) – This contains the times that the workflow is expected to analyse.
- outputDir (path) – All output files written by datafind processes will be written to this directory.
- seg_file (SegFile, optional (default=None)) – The file returned by get_science_segments containing the science segments and the associated segment_summary. This will be used for the segment_summary test and is required if, and only if, performing that test.
- tags (list of string, optional (default=None)) – Use this to specify tags. This can be used if this module is being called more than once to give call specific configuration (by setting options in [workflow-datafind-${TAG}] rather than [workflow-datafind]). This is also used to tag the Files returned by the class to uniqueify the Files and uniqueify the actual filename. FIXME: Filenames may not be unique with current codes!
Returns: - datafindOuts (OutGroupList) – List of all the datafind output files for use later in the pipeline.
- sci_avlble_file (SegFile) – SegFile containing the analysable time after checks in the datafind module are applied to the input segment list. For production runs this is expected to be equal to the input segment list.
- scienceSegs (Dictionary of ifo keyed glue.segment.segmentlist instances) – This contains the times that the workflow is expected to analyse. If the updateSegmentTimes kwarg is given this will be updated to reflect any instances of missing data.
- sci_avlble_name (string) – The name with which the analysable time is stored in the sci_avlble_file.
pycbc.workflow.grb_utils module¶
This library code contains functions and classes that are used in the generation of pygrb workflows. For details about pycbc.workflow see here: http://pycbc.org/pycbc/latest/html/workflow.html
-
pycbc.workflow.grb_utils.
get_coh_PTF_files
(cp, ifos, run_dir, bank_veto=False, summary_files=False)[source]¶ Retrieve files needed to run coh_PTF jobs within a PyGRB workflow
Parameters: - cp (pycbc.workflow.configuration.WorkflowConfigParser object) – The parsed configuration options of a pycbc.workflow.core.Workflow.
- ifos (str) – String containing the analysis interferometer IDs.
- run_dir (str) – The run directory, destination for retrieved files.
- bank_veto (Boolean) – If true, will retrieve the bank_veto_bank.xml file.
- summary_files (Boolean) – If true, will retrieve the summary page style files.
Returns: file_list (pycbc.workflow.FileList object) – A FileList containing the retrieved files.
-
pycbc.workflow.grb_utils.
get_ipn_sky_files
(workflow, file_url, tags=None)[source]¶ Retrieve the sky point files for searching over the IPN error box and populating it with injections.
Parameters: - workflow (pycbc.workflow.core.Workflow) – An instanced class that manages the constructed workflow.
- file_url (string) – The URL of the IPN sky points file.
- tags (list of strings) – If given these tags are used to uniquely name and identify output files that would be produced in multiple calls to this function.
Returns: sky_points_file – File object representing the IPN sky points file.
Return type: pycbc.workflow.core.File
-
pycbc.workflow.grb_utils.
get_sky_grid_scale
(sky_error, sigma_sys=6.8359)[source]¶ Calculate suitable 3-sigma radius of the search patch, incorporating Fermi GBM systematic if necessary.
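If the statistical sky error and the systematic term are assumed to combine in quadrature (an assumption here; the docstring does not state the formula), the calculation reduces to:

```python
import math

def get_sky_grid_scale(sky_error, sigma_sys=6.8359):
    """Hypothetical sketch: combine the statistical sky error with the
    Fermi GBM systematic in quadrature. The actual pycbc formula may
    include additional scaling."""
    return math.sqrt(sky_error ** 2 + sigma_sys ** 2)
```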
-
pycbc.workflow.grb_utils.
make_exttrig_file
(cp, ifos, sci_seg, out_dir)[source]¶ Make an ExtTrig xml file containing information on the external trigger
Parameters: - cp (pycbc.workflow.configuration.WorkflowConfigParser object) – The parsed configuration options of a pycbc.workflow.core.Workflow.
- ifos (str) – String containing the analysis interferometer IDs.
- sci_seg (ligo.segments.segment) – The science segment for the analysis run.
- out_dir (str) – The output directory, destination for the xml file.
Returns: xml_file (pycbc.workflow.File object) – The xml file with external trigger information.
-
pycbc.workflow.grb_utils.
make_gating_node
(workflow, datafind_files, outdir=None, tags=None)[source]¶ Generate jobs for autogating the data for PyGRB runs.
Parameters: - workflow (pycbc.workflow.core.Workflow) – An instanced class that manages the constructed workflow.
- datafind_files (pycbc.workflow.core.FileList) – A FileList containing the frame files to be gated.
- outdir (string) – Path of the output directory
- tags (list of strings) – If given these tags are used to uniquely name and identify output files that would be produced in multiple calls to this function.
Returns: - condition_strain_nodes (list) – List containing the pycbc.workflow.core.Node objects representing the autogating jobs.
- condition_strain_outs (pycbc.workflow.core.FileList) – FileList containing the pycbc.workflow.core.File objects representing the gated frame files.
pycbc.workflow.inference_followups module¶
Module that contains functions for setting up the inference workflow.
-
pycbc.workflow.inference_followups.
create_fits_file
(workflow, inference_file, output_dir, name='create_fits_file', analysis_seg=None, tags=None)[source]¶ Sets up job to create fits files from some given samples files.
Parameters: - workflow (pycbc.workflow.Workflow) – The workflow instance we are populating
- inference_file (pycbc.workflow.File) – The file with posterior samples.
- output_dir (str) – The directory to store result plots and files.
- name (str, optional) – The name in the [executables] section of the configuration file
to use, and the section to read for additional arguments to pass to
the executable. Default is
create_fits_file
. - analysis_segs (ligo.segments.Segment, optional) – The segment this job encompasses. If None then use the total analysis time from the workflow.
- tags (list, optional) – Tags to add to the inference executables.
Returns: A list of output files.
Return type: pycbc.workflow.FileList
-
pycbc.workflow.inference_followups.
create_posterior_files
(workflow, samples_files, output_dir, parameters=None, name='extract_posterior', analysis_seg=None, tags=None)[source]¶ Sets up job to create posterior files from some given samples files.
Parameters: - workflow (pycbc.workflow.Workflow) – The workflow instance we are populating
- samples_files (str or list of str) – One or more files to extract the posterior samples from.
- output_dir (str) – The directory to store result plots and files.
- parameters (list, optional) – A list of the parameters to extract, and (optionally) a name for them
to be mapped to. This is passed to the program’s
--parameters
argument. - name (str, optional) – The name in the [executables] section of the configuration file
to use, and the section to read for additional arguments to pass to
the executable. Default is
extract_posterior
. - analysis_segs (ligo.segments.Segment, optional) – The segment this job encompasses. If None then use the total analysis time from the workflow.
- tags (list, optional) – Tags to add to the inference executables.
Returns: A list of output files.
Return type: pycbc.workflow.FileList
-
pycbc.workflow.inference_followups.
get_diagnostic_plots
(workflow)[source]¶ Determines what diagnostic plots to create based on workflow.
The plots to create are based on what executables are specified in the workflow’s config file. A list of strings is returned giving the diagnostic plots to create. This list may contain:
samples: For MCMC samplers, a plot of the sample chains as a function of iteration. This will be created if plot_samples is in the executables section.
acceptance_rate: For MCMC samplers, a plot of the acceptance rate. This will be created if plot_acceptance_rate is in the executables section.
Returns: List of names of diagnostic plots. Return type: list
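For example, enabling both diagnostics might look like the following [executables] entries (the executable paths and the ${which:...} macro are illustrative, assuming the usual pycbc workflow config conventions):

```ini
[executables]
; with both entries present, get_diagnostic_plots(workflow)
; should include both 'samples' and 'acceptance_rate'
plot_samples = ${which:pycbc_inference_plot_samples}
plot_acceptance_rate = ${which:pycbc_inference_plot_acceptance_rate}
```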
-
pycbc.workflow.inference_followups.
get_plot_group
(cp, section_tag)[source]¶ Gets plotting groups from
[workflow-section_tag]
.
-
pycbc.workflow.inference_followups.
get_posterior_params
(cp, section='workflow-posterior_params')[source]¶ Gets the posterior parameters from the given config file.
The posterior parameters are read from the given
section
. Parameters should be specified asOUTPUT = [INPUT]
, whereOUTPUT
is what the parameter should be named in the posterior file andINPUT
is the (function of) parameter(s) to read from the samples file. If noINPUT
is provided, theINPUT
name will assumed to be the same as theOUTPUT
. Example:
[workflow-posterior_params]
mass1 = primary_mass(mass1, mass2)
mass2 = secondary_mass(mass1, mass2)
distance =
Parameters: - cp (pycbc.workflow.configuration.WorkflowConfigParser) – Config parser to read.
- section (str, optional) – The name of the section to load the parameters from. Default is
workflow-posterior_params
.
Returns: List of strings giving
INPUT:OUTPUT
. This can be passed as theparameters
argument tocreate_posterior_files()
.Return type:
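As an unofficial illustration of the documented mapping (the helper below is hypothetical, not part of pycbc), the OUTPUT = [INPUT] pairs translate to INPUT:OUTPUT strings, with an empty INPUT defaulting to the OUTPUT name:

```python
def posterior_param_strings(pairs):
    """Hypothetical helper mirroring the documented behaviour of
    get_posterior_params: map (OUTPUT, INPUT) config pairs to the
    'INPUT:OUTPUT' strings accepted by create_posterior_files()."""
    strings = []
    for output, inp in pairs:
        inp = inp.strip() or output  # empty INPUT defaults to OUTPUT
        strings.append('{}:{}'.format(inp, output))
    return strings

# With the example section from the docstring:
pairs = [('mass1', 'primary_mass(mass1, mass2)'),
         ('mass2', 'secondary_mass(mass1, mass2)'),
         ('distance', '')]
```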
-
pycbc.workflow.inference_followups.
make_diagnostic_plots
(workflow, diagnostics, samples_file, label, rdir, tags=None)[source]¶ Makes diagnostic plots.
Diagnostic plots are sampler-specific plots that provide information on how the sampler performed. All diagnostic plots use the output file produced by
pycbc_inference
as their input. Diagnostic plots are added to the results directoryrdir/NAME
whereNAME
is the name of the diagnostic given indiagnostics
.Parameters: - workflow (pycbc.workflow.core.Workflow) – The workflow to add the plotting jobs to.
- diagnostics (list of str) – The names of the diagnostic plots to create. See
get_diagnostic_plots()
for recognized names. - samples_file ((list of) pycbc.workflow.File) – One or more samples files with which to create the diagnostic plots. If a list of files is provided, a diagnostic plot for each file will be created.
- label (str) – Event label for the diagnostic plots.
- rdir (pycbc.results.layout.SectionNumber) – Results directory layout.
- tags (list of str, optional) – Additional tags to add to the file names.
Returns: Dictionary of diagnostic name -> list of files giving the plots that will be created.
Return type:
-
pycbc.workflow.inference_followups.
make_inference_acceptance_rate_plot
(workflow, inference_file, output_dir, name='plot_acceptance_rate', analysis_seg=None, tags=None)[source]¶ Sets up a plot of the acceptance rate (for MCMC samplers).
Parameters: - workflow (pycbc.workflow.Workflow) – The core workflow instance we are populating
- inference_file (pycbc.workflow.File) – The file with posterior samples.
- output_dir (str) – The directory to store result plots and files.
- name (str, optional) – The name in the [executables] section of the configuration file
to use, and the section to read for additional arguments to pass to
the executable. Default is
plot_acceptance_rate
. - analysis_segs (ligo.segments.Segment, optional) – The segment this job encompasses. If None then use the total analysis time from the workflow.
- tags (list, optional) – Tags to add to the inference executables.
Returns: A list of output files.
Return type: pycbc.workflow.FileList
-
pycbc.workflow.inference_followups.
make_inference_inj_plots
(workflow, inference_files, output_dir, parameters, name='inference_recovery', analysis_seg=None, tags=None)[source]¶ Sets up the recovered versus injected parameter plot in the workflow.
Parameters: - workflow (pycbc.workflow.Workflow) – The core workflow instance we are populating
- inference_files (pycbc.workflow.FileList) – The files with posterior samples.
- output_dir (str) – The directory to store result plots and files.
- parameters (list) – A
list
of parameters. Each parameter gets its own plot. - name (str) – The name in the [executables] section of the configuration file to use.
- analysis_segs ({None, ligo.segments.Segment}) – The segment this job encompasses. If None then use the total analysis time from the workflow.
- tags ({None, optional}) – Tags to add to the inference executables.
Returns: A list of result and output files.
Return type: pycbc.workflow.FileList
-
pycbc.workflow.inference_followups.
make_inference_plot
(workflow, input_file, output_dir, name, analysis_seg=None, tags=None, input_file_opt='input-file', output_file_extension='.png', add_to_workflow=False)[source]¶ Boiler-plate function for creating a standard plotting job.
Parameters: - workflow (pycbc.workflow.Workflow) – The core workflow instance we are populating
- input_file (pycbc.workflow.File) – The file used for the input.
- output_dir (str) – The directory to store result plots.
- name (str) – The name in the [executables] section of the configuration file to use.
- analysis_segs (ligo.segments.Segment, optional) – The segment this job encompasses. If None then use the total analysis time from the workflow.
- tags (list, optional) – Tags to add to the inference executables.
- input_file_opt (str, optional) – The name of the input-file option used by the executable. Default
is
input-file
. - output_file_extension (str, optional) – What file type to create. Default is
.png
. - add_to_workflow (bool, optional) – If True, the node will be added to the workflow before being returned.
This means that no options may be added to the node afterward.
Default is
False
.
Returns: The job node for creating the plot.
Return type:
-
pycbc.workflow.inference_followups.
make_inference_posterior_plot
(workflow, inference_file, output_dir, parameters=None, plot_prior_from_file=None, name='plot_posterior', analysis_seg=None, tags=None)[source]¶ Sets up the corner plot of the posteriors in the workflow.
Parameters: - workflow (pycbc.workflow.Workflow) – The core workflow instance we are populating
- inference_file (pycbc.workflow.File) – The file with posterior samples.
- output_dir (str) – The directory to store result plots and files.
- parameters (list or str) – The parameters to plot.
- plot_prior_from_file (str, optional) – Plot the prior from the given config file on the 1D marginal plots.
- name (str, optional) – The name in the [executables] section of the configuration file
to use, and the section to read for additional arguments to pass to
the executable. Default is
plot_posterior
. - analysis_segs (ligo.segments.Segment, optional) – The segment this job encompasses. If None then use the total analysis time from the workflow.
- tags (list, optional) – Tags to add to the inference executables.
Returns: A list of output files.
Return type: pycbc.workflow.FileList
-
pycbc.workflow.inference_followups.
make_inference_prior_plot
(workflow, config_file, output_dir, name='plot_prior', analysis_seg=None, tags=None)[source]¶ Sets up the corner plot of the priors in the workflow.
Parameters: - workflow (pycbc.workflow.Workflow) – The core workflow instance we are populating
- config_file (pycbc.workflow.File) – The WorkflowConfigParser-parsable inference configuration file.
- output_dir (str) – The directory to store result plots and files.
- name (str) – The name in the [executables] section of the configuration file
to use, and the section to read for additional arguments to pass to
the executable. Default is
plot_prior
. - analysis_segs (ligo.segments.Segment, optional) – The segment this job encompasses. If None then use the total analysis time from the workflow.
- tags (list, optional) – Tags to add to the inference executables.
Returns: A list of the output files.
Return type: pycbc.workflow.FileList
-
pycbc.workflow.inference_followups.
make_inference_samples_plot
(workflow, inference_file, output_dir, name='plot_samples', analysis_seg=None, tags=None)[source]¶ Sets up a plot of the samples versus iteration (for MCMC samplers).
Parameters: - workflow (pycbc.workflow.Workflow) – The core workflow instance we are populating
- inference_file (pycbc.workflow.File) – The file with posterior samples.
- output_dir (str) – The directory to store result plots and files.
- name (str, optional) – The name in the [executables] section of the configuration file
to use, and the section to read for additional arguments to pass to
the executable. Default is
plot_samples
. - analysis_segs (ligo.segments.Segment, optional) – The segment this job encompasses. If None then use the total analysis time from the workflow.
- tags (list, optional) – Tags to add to the inference executables.
Returns: A list of output files.
Return type: pycbc.workflow.FileList
-
pycbc.workflow.inference_followups.
make_inference_skymap
(workflow, fits_file, output_dir, name='plot_skymap', analysis_seg=None, tags=None)[source]¶ Sets up the skymap plot.
Parameters: - workflow (pycbc.workflow.Workflow) – The core workflow instance we are populating
- fits_file (pycbc.workflow.File) – The fits file with the sky location.
- output_dir (str) – The directory to store result plots and files.
- name (str, optional) – The name in the [executables] section of the configuration file
to use, and the section to read for additional arguments to pass to
the executable. Default is
plot_skymap
. - analysis_segs (ligo.segments.Segment, optional) – The segment this job encompasses. If None then use the total analysis time from the workflow.
- tags (list, optional) – Tags to add to the inference executables.
Returns: A list of result and output files.
Return type: pycbc.workflow.FileList
-
pycbc.workflow.inference_followups.
make_inference_summary_table
(workflow, inference_file, output_dir, parameters=None, print_metadata=None, name='table_summary', analysis_seg=None, tags=None)[source]¶ Sets up the html table summarizing parameter estimates.
Parameters: - workflow (pycbc.workflow.Workflow) – The core workflow instance we are populating
- inference_file (pycbc.workflow.File) – The file with posterior samples.
- output_dir (str) – The directory to store result plots and files.
- parameters (list or str) – A list or string of parameters to generate the table for. If a string is provided, separate parameters should be space or new-line separated.
- print_metadata (list or str) – A list or string of metadata parameters to print. Syntax is the same
as for
parameters
. - name (str, optional) – The name in the [executables] section of the configuration file
to use, and the section to read for additional arguments to pass to
the executable. Default is
table_summary
. - analysis_segs (ligo.segments.Segment, optional) – The segment this job encompasses. If None then use the total analysis time from the workflow.
- tags (list, optional) – Tags to add to the inference executables.
Returns: A list of output files.
Return type: pycbc.workflow.FileList
-
pycbc.workflow.inference_followups.
make_posterior_workflow
(workflow, samples_files, config_file, label, rdir, posterior_file_dir='posterior_files', tags=None)[source]¶ Adds jobs to a workflow that make a posterior file and subsequent plots.
The parameters to be written to the posterior file are read from the
[workflow-posterior_params]
section of the workflow’s config file; seeget_posterior_params()
for details.Except for prior plots (which use the given inference config file), all subsequent jobs use the posterior file, and so may use the parameters provided in
[workflow-posterior_params]
. The following are created:- Summary table: an html table created using the
table_summary
executable. The parameters to print in the table are retrieved from thetable-params
option in the[workflow-summary_table]
section. Metadata may also be printed by adding aprint-metadata
option to that section. - Summary posterior plots: a collection of posterior plots to include
in the summary page, after the summary table. The parameters to plot
are read from
[workflow-summary_plots]
. Parameters should be grouped together by providingplot-group-NAME = PARAM1[:LABEL1] PARAM2[:LABEL2]
in that section, whereNAME
is a unique name for each group. One posterior plot will be created for each plot group. For clarity, only one or two parameters should be plotted in each summary group, but this is not enforced. Settings for the plotting executable are read from theplot_posterior_summary
section; likewise, the executable used is read fromplot_posterior_summary
in the[executables]
section. - Sky maps: if both
create_fits_file
andplot_skymap
are listed in the[executables]
section, then a.fits
file and sky map plot will be produced. The sky map plot will be included in the summary plots. You must be running in a Python 3 environment to create these. - Prior plots: plots of the prior will be created using the
plot_prior
executable. By default, all of the variable parameters will be plotted. The prior plots are added to priors/LABEL/
in the results directory, whereLABEL
is the givenlabel
. - Posterior plots: additional posterior plots are created using the
plot_posterior
executable. The parameters to plot are read from[workflow-plot_params]
section. As with the summary posterior plots, parameters are grouped together by providingplot-group-NAME
options in that section. A posterior plot will be created for each group, and added to theposteriors/LABEL/
directory. Plot settings are read from the[plot_posterior]
section; this is kept separate from the posterior summary so that different settings can be used. For example, you may want to make a density plot for the summary plots, but a scatter plot colored by SNR for the posterior plots.
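Collecting the section names described above, a minimal sketch of the relevant config sections might look like this (the parameter names and plot groupings are illustrative only):

```ini
[workflow-summary_table]
table-params = mass1 mass2 distance
print-metadata = approximant

[workflow-summary_plots]
; one posterior plot is created per plot-group-NAME option
plot-group-masses = mass1:m_1 mass2:m_2

[workflow-plot_params]
plot-group-extrinsic = distance inclination
```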
Parameters: - samples_files (pycbc.workflow.core.FileList) – List of samples files to combine into a single posterior file.
- config_file (pycbc.workflow.File) – The inference configuration file used to generate the samples file(s). This is needed to make plots of the prior.
- label (str) – Unique label for the plots. Used in file names.
- rdir (pycbc.results.layout.SectionNumber) – The results directory to save the plots to.
- posterior_file_dir (str, optional) – The name of the directory to save the posterior file to. Default is
posterior_files
. - tags (list of str, optional) – Additional tags to add to the file names.
Returns: - posterior_file (pycbc.workflow.File) – The posterior file that was created.
- summary_files (list) – List of files to go on the summary results page.
- prior_plots (list) – List of prior plots that will be created. These will be saved to
priors/LABEL/
in the results directory, whereLABEL
is the provided label. - posterior_plots (list) – List of posterior plots that will be created. These will be saved to
posteriors/LABEL/
in the results directory.
- Summary table: an html table created using the
pycbc.workflow.injection module¶
This module is responsible for setting up the part of a pycbc workflow that will generate the injection files to be used for assessing the workflow’s ability to detect predicted signals. (In ihope parlance, this sets up the inspinj jobs). Full documentation for this module can be found here: https://ldas-jobs.ligo.caltech.edu/~cbc/docs/pycbc/NOTYETCREATED.html
-
pycbc.workflow.injection.
compute_inj_optimal_snr
(workflow, inj_file, precalc_psd_files, out_dir, tags=None)[source]¶ Set up a job for computing optimal SNRs of a sim_inspiral file.
-
pycbc.workflow.injection.
cut_distant_injections
(workflow, inj_file, out_dir, tags=None)[source]¶ Set up a job for removing injections that are too distant to be seen
-
pycbc.workflow.injection.
setup_injection_workflow
(workflow, output_dir=None, inj_section_name='injections', exttrig_file=None, tags=None)[source]¶ This function is the gateway for setting up injection-generation jobs in a workflow. It should be possible for this function to support a number of different ways/codes that could be used for doing this; however, as this will presumably remain a single call to a single code (which need not be inspinj), there are currently no subfunctions in this module.
Parameters: - workflow (pycbc.workflow.core.Workflow) – The Workflow instance that the coincidence jobs will be added to.
- output_dir (path) – The directory in which injection files will be stored.
- inj_section_name (string (optional, default='injections')) – The string that corresponds to the option describing the exe location in the [executables] section of the .ini file and that corresponds to the section (and sub-sections) giving the options that will be given to the code at run time.
- tags (list of strings (optional, default = [])) – A list of the tagging strings that will be used for all jobs created by this call to the workflow. This will be used in output names.
Returns: - inj_files (pycbc.workflow.core.FileList) – The list of injection files created by this call.
- inj_tags (list of strings) – The tag corresponding to each injection file and used to uniquely identify them. The FileList class contains functions to search based on tags.
pycbc.workflow.jobsetup module¶
This library code contains functions and classes that are used to set up and add jobs/nodes to a pycbc workflow. For details about pycbc.workflow see: https://ldas-jobs.ligo.caltech.edu/~cbc/docs/pycbc/ahope.html
-
class
pycbc.workflow.jobsetup.
ComputeDurationsExecutable
(cp, exe_name, universe=None, ifo=None, out_dir=None, tags=None)[source]¶ Bases:
pycbc.workflow.jobsetup.SQLInOutExecutable
The class responsible for making jobs for pycbc_compute_durations.
-
create_node
(job_segment, input_file, summary_xml_file)[source]¶ Default node constructor.
This is usually overridden by subclasses of Executable.
-
current_retention_level
= 1¶
-
-
class
pycbc.workflow.jobsetup.
ExtractToXMLExecutable
(cp, exe_name, universe=None, ifo=None, out_dir=None, tags=None)[source]¶ Bases:
pycbc.workflow.core.Executable
This class is responsible for running ligolw_sqlite jobs that will take an SQL file and dump it back to XML.
-
create_node
(job_segment, input_file)[source]¶ Default node constructor.
This is usually overridden by subclasses of Executable.
-
current_retention_level
= 1¶
-
-
class
pycbc.workflow.jobsetup.
GstlalFarfromsnrchisqhistExecutable
(cp, exe_name, universe=None, ifo=None, out_dir=None, tags=None)[source]¶ Bases:
pycbc.workflow.core.Executable
The class responsible for running the gstlal far from chisq hist jobs
-
create_node
(job_segment, non_inj_db, marg_input_file, inj_database=None, write_background_bins=False)[source]¶ Default node constructor.
This is usually overridden by subclasses of Executable.
-
current_retention_level
= 4¶
-
-
class
pycbc.workflow.jobsetup.
GstlalMarginalizeLikelihoodExecutable
(cp, exe_name, universe=None, ifo=None, out_dir=None, tags=None)[source]¶ Bases:
pycbc.workflow.core.Executable
The class responsible for running the gstlal marginalize_likelihood jobs
-
create_node
(job_segment, input_file)[source]¶ Default node constructor.
This is usually overridden by subclasses of Executable.
-
current_retention_level
= 4¶
-
-
class
pycbc.workflow.jobsetup.
GstlalPlotBackground
(cp, exe_name, universe=None, ifo=None, out_dir=None, tags=None)[source]¶ Bases:
pycbc.workflow.core.Executable
The class responsible for running gstlal_plot_background
-
create_node
(non_inj_db, likelihood_file)[source]¶ Default node constructor.
This is usually overridden by subclasses of Executable.
-
current_retention_level
= 4¶
-
-
class
pycbc.workflow.jobsetup.
GstlalPlotSensitivity
(cp, exe_name, universe=None, ifo=None, out_dir=None, tags=None)[source]¶ Bases:
pycbc.workflow.core.Executable
The class responsible for running gstlal_plot_sensitivity
-
create_node
(non_inj_db, injection_dbs)[source]¶ Default node constructor.
This is usually overridden by subclasses of Executable.
-
current_retention_level
= 4¶
-
-
class
pycbc.workflow.jobsetup.
GstlalPlotSummary
(cp, exe_name, universe=None, ifo=None, out_dir=None, tags=None)[source]¶ Bases:
pycbc.workflow.core.Executable
The class responsible for running gstlal_plot_summary
-
create_node
(non_inj_db, injection_dbs)[source]¶ Default node constructor.
This is usually overridden by subclasses of Executable.
-
current_retention_level
= 4¶
-
-
class
pycbc.workflow.jobsetup.
GstlalSummaryPage
(cp, exe_name, universe=None, ifo=None, out_dir=None, tags=None)[source]¶ Bases:
pycbc.workflow.core.Executable
The class responsible for running gstlal_inspiral_summary_page
-
current_retention_level
= 4¶
-
-
class
pycbc.workflow.jobsetup.
InspinjfindExecutable
(cp, exe_name, universe=None, ifo=None, out_dir=None, tags=None)[source]¶ Bases:
pycbc.workflow.core.Executable
The class responsible for running jobs with pycbc_inspinjfind
-
create_node
(job_segment, input_file)[source]¶ Default node constructor.
This is usually overridden by subclasses of Executable.
-
current_retention_level
= 1¶
-
-
class
pycbc.workflow.jobsetup.
JobSegmenter
(data_lengths, valid_chunks, valid_lengths, curr_seg, curr_exe_class, compatibility_mode=False)[source]¶ Bases:
object
This class is used when running sngl_ifo_job_setup to determine what times should be analysed by each job and what data is needed.
-
get_valid_times_for_job
(num_job, allow_overlap=True)[source]¶ Get the times for which this job is valid.
-
get_valid_times_for_job_legacy
(num_job)[source]¶ Get the times for which the job num_job will be valid, using the method used in inspiral hipe.
-
-
class
pycbc.workflow.jobsetup.
LalappsInspinjExecutable
(cp, name, universe=None, ifos=None, out_dir=None, tags=None)[source]¶ Bases:
pycbc.workflow.core.Executable
The class used to create jobs for the lalapps_inspinj Executable.
-
create_node
(segment, exttrig_file=None, tags=None)[source]¶ Default node constructor.
This is usually overridden by subclasses of Executable.
-
current_retention_level
= 4¶
-
-
class
pycbc.workflow.jobsetup.
LigoLWCombineSegsExecutable
(cp, name, universe=None, ifos=None, out_dir=None, tags=None)[source]¶ Bases:
pycbc.workflow.core.Executable
This class is used to create nodes for the ligolw_combine_segments Executable
-
create_node
(valid_seg, veto_files, segment_name)[source]¶ Default node constructor.
This is usually overridden by subclasses of Executable.
-
current_retention_level
= 4¶
-
-
class
pycbc.workflow.jobsetup.
LigolwAddExecutable
(*args, **kwargs)[source]¶ Bases:
pycbc.workflow.core.Executable
The class used to create nodes for the ligolw_add Executable.
-
create_node
(jobSegment, input_files, output=None, use_tmp_subdirs=True, tags=None)[source]¶ Default node constructor.
This is usually overridden by subclasses of Executable.
-
current_retention_level
= 1¶
-
-
class
pycbc.workflow.jobsetup.
LigolwCBCAlignTotalSpinExecutable
(cp, exe_name, universe=None, ifos=None, out_dir=None, tags=None)[source]¶ Bases:
pycbc.workflow.core.Executable
The class used to create jobs for the ligolw_cbc_align_total_spin executable.
-
create_node
(parent, segment, tags=None)[source]¶ Default node constructor.
This is usually overridden by subclasses of Executable.
-
current_retention_level
= 3¶
-
-
class
pycbc.workflow.jobsetup.
LigolwCBCJitterSkylocExecutable
(cp, exe_name, universe=None, ifos=None, out_dir=None, tags=None)[source]¶ Bases:
pycbc.workflow.core.Executable
The class used to create jobs for the ligolw_cbc_skyloc_jitter executable.
-
create_node
(parent, segment, tags=None)[source]¶ Default node constructor.
This is usually overridden by subclasses of Executable.
-
current_retention_level
= 3¶
-
-
class
pycbc.workflow.jobsetup.
LigolwSSthincaExecutable
(cp, exe_name, universe=None, ifo=None, out_dir=None, dqVetoName=None, tags=None)[source]¶ Bases:
pycbc.workflow.core.Executable
The class responsible for making jobs for ligolw_sstinca.
-
create_node
(jobSegment, coincSegment, inputFile, tags=None, write_likelihood=False)[source]¶ Default node constructor.
This is usually overridden by subclasses of Executable.
-
current_retention_level
= 3¶
-
-
class
pycbc.workflow.jobsetup.
PyCBCInspiralExecutable
(cp, exe_name, ifo=None, out_dir=None, injection_file=None, tags=None)[source]¶ Bases:
pycbc.workflow.core.Executable
The class used to create jobs for pycbc_inspiral Executable.
-
create_node
(data_seg, valid_seg, parent=None, dfParents=None, tags=None)[source]¶ Default node constructor.
This is usually overridden by subclasses of Executable.
-
current_retention_level
= 2¶
-
file_input_options
= ['--gating-file']¶
-
zero_pad_data_extend
(job_data_seg, curr_seg)[source]¶ When using zero padding, all data is analysable, but the setup functions must include the padding data where it is available so that we are not zero-padding in the middle of science segments. This function takes a job_data_seg, that is chosen for a particular node and extends it with segment-start-pad and segment-end-pad if that data is available.
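The clipping logic described can be sketched as follows. This is a standalone illustration using plain (start, end) tuples rather than pycbc's segment objects, and the helper name is hypothetical:

```python
def extend_with_available_padding(job_seg, science_seg, start_pad, end_pad):
    """Widen job_seg by the configured start/end pads, but only where that
    data actually exists inside the science segment, so that no zero
    padding is introduced in the middle of a science segment."""
    new_start = max(job_seg[0] - start_pad, science_seg[0])
    new_end = min(job_seg[1] + end_pad, science_seg[1])
    return (new_start, new_end)
```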
-
-
class
pycbc.workflow.jobsetup.
PyCBCMultiInspiralExecutable
(cp, name, universe=None, ifo=None, injection_file=None, gate_files=None, out_dir=None, tags=None)[source]¶ Bases:
pycbc.workflow.core.Executable
The class responsible for setting up jobs for the pycbc_multi_inspiral executable.
-
create_node
(data_seg, valid_seg, parent=None, inj_file=None, dfParents=None, bankVetoBank=None, ipn_file=None, slide=None, tags=None)[source]¶ Default node constructor.
This is usually overridden by subclasses of Executable.
-
current_retention_level
= 2¶
-
file_input_options
= ['--gating-file']¶
-
-
class
pycbc.workflow.jobsetup.
PyCBCTmpltbankExecutable
(cp, exe_name, ifo=None, out_dir=None, tags=None, write_psd=False, psd_files=None)[source]¶ Bases:
pycbc.workflow.core.Executable
The class used to create jobs for pycbc_geom_nonspin_bank Executable and any other Executables using the same command line option groups.
-
create_nodata_node
(valid_seg, tags=None)[source]¶ A simplified version of create_node that creates a node that does not need to read in data.
Parameters: valid_seg (glue.segment) – The segment over which to declare the node valid. Usually this would be the duration of the analysis.
Returns: node – The instance corresponding to the created node.
Return type: pycbc.workflow.core.Node
-
create_node
(data_seg, valid_seg, parent=None, dfParents=None, tags=None)[source]¶ Default node constructor.
This is usually overridden by subclasses of Executable.
-
current_retention_level
= 3¶
-
-
class
pycbc.workflow.jobsetup.
PycbcCalculateFarExecutable
(cp, exe_name, universe=None, ifo=None, out_dir=None, tags=None)[source]¶ Bases:
pycbc.workflow.jobsetup.SQLInOutExecutable
The class responsible for making jobs for the FAR calculation code. It differs from SQLInOutExecutable only in raising the default retention level.
-
current_retention_level
= 4¶
-
-
class
pycbc.workflow.jobsetup.
PycbcCalculateLikelihoodExecutable
(cp, exe_name, universe=None, ifo=None, out_dir=None, tags=None)[source]¶ Bases:
pycbc.workflow.core.Executable
The class responsible for running the pycbc_calculate_likelihood executable which is part 4 of 4 of the gstlal_inspiral_calc_likelihood functionality
-
create_node
(job_segment, trigger_file, likelihood_file, horizon_dist_file)[source]¶ Default node constructor.
This is usually overridden by subclasses of Executable.
-
current_retention_level
= 4¶
-
-
class
pycbc.workflow.jobsetup.
PycbcCombineLikelihoodExecutable
(cp, exe_name, universe=None, ifo=None, out_dir=None, tags=None)[source]¶ Bases:
pycbc.workflow.core.Executable
The class responsible for running the pycbc_combine_likelihood executable which is part 2 of 4 of the gstlal_inspiral_calc_likelihood functionality
-
create_node
(job_segment, likelihood_files, horizon_dist_file)[source]¶ Default node constructor.
This is usually overridden by subclasses of Executable.
-
current_retention_level
= 1¶
-
-
class
pycbc.workflow.jobsetup.
PycbcConditionStrainExecutable
(cp, exe_name, ifo=None, out_dir=None, universe=None, tags=None)[source]¶ Bases:
pycbc.workflow.core.Executable
The class responsible for creating jobs for pycbc_condition_strain.
-
create_node
(input_files, tags=None)[source]¶ Default node constructor.
This is usually overridden by subclasses of Executable.
-
current_retention_level
= 2¶
-
-
class
pycbc.workflow.jobsetup.
PycbcCreateInjectionsExecutable
(cp, exe_name, ifo=None, out_dir=None, universe=None, tags=None)[source]¶ Bases:
pycbc.workflow.core.Executable
The class responsible for creating jobs for
pycbc_create_injections
.-
create_node
(config_file=None, seed=None, tags=None)[source]¶ Set up a CondorDagmanNode class to run
pycbc_create_injections
.Parameters: - config_file (pycbc.workflow.core.File) – A
pycbc.workflow.core.File
for the configuration file to be used with the --config-files
option. - seed (int) – Seed to use for generating injections.
- tags (list) – A list of tags to include in filenames.
Returns: node – The node to run the job.
Return type: pycbc.workflow.core.Node
-
current_retention_level
= 2¶
-
-
class
pycbc.workflow.jobsetup.
PycbcDarkVsBrightInjectionsExecutable
(cp, exe_name, universe=None, ifos=None, out_dir=None, tags=None)[source]¶ Bases:
pycbc.workflow.core.Executable
The class used to create jobs for the pycbc_dark_vs_bright_injections Executable.
-
create_node
(parent, segment, tags=None)[source]¶ Default node constructor.
This is usually overridden by subclasses of Executable.
-
current_retention_level
= 4¶
-
-
class
pycbc.workflow.jobsetup.
PycbcGenerateRankingDataExecutable
(cp, exe_name, universe=None, ifo=None, out_dir=None, tags=None)[source]¶ Bases:
pycbc.workflow.core.Executable
The class responsible for running the pycbc_gen_ranking_data executable which is part 3 of 4 of the gstlal_inspiral_calc_likelihood functionality
-
create_node
(job_segment, likelihood_file, horizon_dist_file)[source]¶ Default node constructor.
This is usually overridden by subclasses of Executable.
-
current_retention_level
= 1¶
-
-
class
pycbc.workflow.jobsetup.
PycbcInferenceExecutable
(cp, exe_name, ifos=None, out_dir=None, universe=None, tags=None)[source]¶ Bases:
pycbc.workflow.core.Executable
The class responsible for creating jobs for
pycbc_inference
.-
create_node
(config_file, seed=None, tags=None, analysis_time=None)[source]¶ Set up a CondorDagmanNode class to run
pycbc_inference
.Parameters: - config_file (pycbc.workflow.core.File) – A
pycbc.workflow.core.File
for the inference configuration file to be used with the --config-files
option. - seed (int) – An
int
to be used with the --seed
option. - tags (list) – A list of tags to include in filenames.
Returns: node – The node to run the job.
Return type: pycbc.workflow.core.Node
-
current_retention_level
= 2¶
-
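All of the Executable subclasses in this module follow the same pattern: the subclass sets a retention level and overrides create_node to wire up the job's options, inputs and outputs. A minimal self-contained sketch of that pattern, using illustrative stand-in classes rather than the real pycbc.workflow.core API:

```python
# Stand-in sketch of the Executable/create_node pattern used throughout
# this module. Class and attribute names are illustrative, not the real
# pycbc.workflow.core API.

class Node:
    def __init__(self, executable):
        self.executable = executable
        self.opts = []

    def add_opt(self, flag, value):
        # Record a command-line option for the job.
        self.opts.append((flag, value))

class Executable:
    current_retention_level = 2  # how long this job's output is kept

    def __init__(self, name):
        self.name = name

    def create_node(self):
        # Default node constructor; subclasses override this to attach
        # their specific inputs, outputs and options.
        return Node(self)

class InferenceExecutable(Executable):
    def create_node(self, config_file, seed=None):
        node = super().create_node()
        node.add_opt('--config-files', config_file)
        if seed is not None:
            node.add_opt('--seed', seed)
        return node

node = InferenceExecutable('inference').create_node('inference.ini', seed=12)
```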
-
class
pycbc.workflow.jobsetup.
PycbcPickleHorizonDistsExecutable
(cp, exe_name, universe=None, ifo=None, out_dir=None, tags=None)[source]¶ Bases:
pycbc.workflow.core.Executable
The class responsible for running the pycbc_pickle_horizon_distances executable which is part 1 of 4 of the gstlal_inspiral_calc_likelihood functionality
-
create_node
(job_segment, trigger_files)[source]¶ Default node constructor.
This is usually overridden by subclasses of Executable.
-
current_retention_level
= 1¶
-
-
class
pycbc.workflow.jobsetup.
PycbcSplitBankExecutable
(cp, exe_name, num_banks, ifo=None, out_dir=None, universe=None)[source]¶ Bases:
pycbc.workflow.core.Executable
The class responsible for creating jobs for pycbc_hdf5_splitbank.
-
create_node
(bank, tags=None)[source]¶ Set up a CondorDagmanNode class to run splitbank code
Parameters: bank (pycbc.workflow.core.File) – The File containing the template bank to be split Returns: node – The node to run the job Return type: pycbc.workflow.core.Node
-
current_retention_level
= 2¶
-
extension
= '.hdf'¶
-
-
class
pycbc.workflow.jobsetup.
PycbcSplitBankXmlExecutable
(cp, exe_name, num_banks, ifo=None, out_dir=None, universe=None)[source]¶ Bases:
pycbc.workflow.jobsetup.PycbcSplitBankExecutable
Subclass responsible for creating jobs for pycbc_splitbank.
-
extension
= '.xml.gz'¶
-
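PycbcSplitBankXmlExecutable differs from its parent only in the extension class attribute, which selects the suffix of the split output files. A small stand-in sketch of that override pattern (names illustrative, not the real implementation):

```python
# Illustrative sketch: subclassing only to change the output suffix,
# as PycbcSplitBankXmlExecutable does with its `extension` attribute.

class SplitBank:
    extension = '.hdf'

    def output_name(self, tag, index):
        # Build an output file name using the class-level suffix.
        return '%s-%03d%s' % (tag, index, self.extension)

class SplitBankXml(SplitBank):
    extension = '.xml.gz'  # only the suffix changes

hdf_name = SplitBank().output_name('BANK', 1)
xml_name = SplitBankXml().output_name('BANK', 1)
```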
-
class
pycbc.workflow.jobsetup.
PycbcSplitInspinjExecutable
(cp, exe_name, num_splits, universe=None, ifo=None, out_dir=None)[source]¶ Bases:
pycbc.workflow.core.Executable
The class responsible for running the pycbc_split_inspinj executable
-
create_node
(parent, tags=None)[source]¶ Default node constructor.
This is usually overridden by subclasses of Executable.
-
current_retention_level
= 1¶
-
-
class
pycbc.workflow.jobsetup.
PycbcSqliteSimplifyExecutable
(cp, exe_name, universe=None, ifo=None, out_dir=None, tags=None)[source]¶ Bases:
pycbc.workflow.core.Executable
The class responsible for making jobs for pycbc_sqlite_simplify.
-
create_node
(job_segment, inputFiles, injFile=None, injString=None, workflow=None)[source]¶ Default node constructor.
This is usually overridden by subclasses of Executable.
-
current_retention_level
= 1¶
-
-
class
pycbc.workflow.jobsetup.
PycbcTimeslidesExecutable
(cp, name, universe=None, ifos=None, out_dir=None, tags=None)[source]¶ Bases:
pycbc.workflow.core.Executable
The class used to create jobs for the pycbc_timeslides Executable.
-
create_node
(segment)[source]¶ Default node constructor.
This is usually overridden by subclasses of Executable.
-
current_retention_level
= 4¶
-
-
class
pycbc.workflow.jobsetup.
SQLInOutExecutable
(cp, exe_name, universe=None, ifo=None, out_dir=None, tags=None)[source]¶ Bases:
pycbc.workflow.core.Executable
The class responsible for making jobs for SQL codes taking one input and one output.
-
create_node
(job_segment, input_file)[source]¶ Default node constructor.
This is usually overridden by subclasses of Executable.
-
current_retention_level
= 2¶
-
-
pycbc.workflow.jobsetup.
identify_needed_data
(curr_exe_job, link_job_instance=None)[source]¶ This function will identify the length of data that a specific executable needs to analyse and what part of that data is valid (ie. inspiral doesn’t analyse the first or last 64+8s of data it reads in).
In addition you can supply a second job instance to “link” to, which will ensure that the two jobs will have a one-to-one correspondence (ie. one template bank per one matched-filter job) and the corresponding jobs will be “valid” at the same times.
Parameters: - curr_exe_job (Job) – An instance of the Job class that has a get_valid_times method.
- link_job_instance (Job instance (optional),) – Coordinate the valid times with another executable.
Returns: - dataLength (float) – The amount of data (in seconds) that each instance of the job must read in.
- valid_chunk (glue.segment.segment) – The times within dataLength for which that job's output can be valid (i.e. for inspiral this is (72, dataLength-72), as in a standard setup the inspiral job cannot look for triggers in the first or last 72 seconds of the data read in).
- valid_length (float) – The maximum length of data each job can be valid for. If not using link_job_instance this is abs(valid_segment), but it can be smaller than that if the linked job only analyses a small amount of data, for example.
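The inspiral example in the docstring reduces to simple interval arithmetic; a hedged sketch with illustrative padding values (real jobs obtain these from their get_valid_times method):

```python
def needed_data(data_length, start_pad, end_pad):
    """Return the interval (relative to the start of the data read in)
    for which a job's output is valid, plus its length. The padding
    values here are illustrative; real jobs report their own."""
    valid_chunk = (start_pad, data_length - end_pad)
    valid_length = valid_chunk[1] - valid_chunk[0]
    return valid_chunk, valid_length

# The inspiral example from the docstring: 72 s unusable at each end.
chunk, length = needed_data(2048, 72, 72)
```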
-
pycbc.workflow.jobsetup.
int_gps_time_to_str
(t)[source]¶ Takes an integer GPS time, either given as int or lal.LIGOTimeGPS, and converts it to a string. If a LIGOTimeGPS with nonzero decimal part is given, raises a ValueError.
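A self-contained approximation of the described behaviour; here a (gpsSeconds, gpsNanoSeconds) tuple stands in for lal.LIGOTimeGPS, which the real function accepts:

```python
def int_gps_time_to_str(t):
    """Convert an integer GPS time to a string, raising ValueError if
    the time has a nonzero decimal part. A (seconds, nanoseconds) pair
    stands in for lal.LIGOTimeGPS in this sketch."""
    if isinstance(t, int):
        return str(t)
    seconds, nanoseconds = t
    if nanoseconds != 0:
        raise ValueError("GPS time has a nonzero decimal part: %r" % (t,))
    return str(seconds)
```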
-
pycbc.workflow.jobsetup.
multi_ifo_coherent_job_setup
(workflow, out_files, curr_exe_job, science_segs, datafind_outs, output_dir, parents=None, slide_dict=None, tags=None)[source]¶ Method for setting up coherent inspiral jobs.
-
pycbc.workflow.jobsetup.
select_generic_executable
(workflow, exe_tag)[source]¶ Returns a class that is appropriate for setting up jobs to run executables having specific tags in the workflow config. Executables should not be “specialized” jobs fitting into one of the select_XXX_class functions above, i.e. not a matched filter or template bank job, which require extra setup.
Parameters: - workflow (pycbc.workflow.core.Workflow) – The Workflow instance.
- exe_tag (string) – The name of the config section storing options for this executable and the option giving the executable path in the [executables] section.
Returns: exe_class – Sub-class of pycbc.workflow.core.Executable that holds utility functions appropriate for the given executable. Instances of the class (‘jobs’) must have a method job.create_node().
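In essence this is a lookup from the executable's name to an Executable subclass, with a generic fallback; a sketch using hypothetical class names (the real function inspects the path configured in the [executables] section):

```python
class Executable:
    pass

class LigolwAddExecutable(Executable):
    pass

# Hypothetical mapping from executable tag to job class; the real
# function derives this from the workflow configuration.
_EXE_CLASSES = {
    'llwadd': LigolwAddExecutable,
}

def select_generic_executable(exe_tag):
    # Codes with no specialised class fall back to the plain Executable.
    return _EXE_CLASSES.get(exe_tag, Executable)
```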
-
pycbc.workflow.jobsetup.
select_matchedfilter_class
(curr_exe)[source]¶ This function returns a class that is appropriate for setting up matched-filtering jobs within workflow.
Parameters: curr_exe (string) – The name of the matched filter executable to be used. Returns: exe_class – Sub-class of pycbc.workflow.core.Executable that holds utility functions appropriate for the given executable. Instances of the class (‘jobs’) must have methods job.create_node() and job.get_valid_times(ifo).
-
pycbc.workflow.jobsetup.
select_tmpltbank_class
(curr_exe)[source]¶ This function returns a class that is appropriate for setting up template bank jobs within workflow.
Parameters: curr_exe (string) – The name of the executable to be used for generating template banks. Returns: exe_class – Sub-class of pycbc.workflow.core.Executable that holds utility functions appropriate for the given executable. Instances of the class (‘jobs’) must have methods job.create_node() and job.get_valid_times(ifo).
-
pycbc.workflow.jobsetup.
sngl_ifo_job_setup
(workflow, ifo, out_files, curr_exe_job, science_segs, datafind_outs, parents=None, link_job_instance=None, allow_overlap=True, compatibility_mode=True)[source]¶ This function sets up a set of single ifo jobs. A basic overview of how this works is as follows:
- (1) Identify the length of data that each job needs to read in, and what part of that data the job is valid for.
- START LOOPING OVER SCIENCE SEGMENTS
- (2) Identify how many jobs are needed (if any) to cover the given science segment and the time shift between jobs. If no jobs continue.
- START LOOPING OVER JOBS
- (3) Identify the time that the given job should produce valid output (ie. inspiral triggers) over.
- (4) Identify the data range that the job will need to read in to produce the aforementioned valid output.
- Identify all parents/inputs of the job.
- Add the job to the workflow
- END LOOPING OVER JOBS
- END LOOPING OVER SCIENCE SEGMENTS
Parameters: - workflow (pycbc.workflow.core.Workflow) – An instance of the Workflow class that manages the constructed workflow.
- ifo (string) – The name of the ifo to set up the jobs for
- out_files (pycbc.workflow.core.FileList) – The FileList containing the list of jobs. Jobs will be appended to this list, and it does not need to be empty when supplied.
- curr_exe_job (Job) – An instance of the Job class that has a get_valid_times method.
- science_segs (ligo.segments.segmentlist) – The list of times that the jobs should cover
- datafind_outs (pycbc.workflow.core.FileList) – The file list containing the datafind files.
- parents (pycbc.workflow.core.FileList (optional, kwarg, default=None)) – The FileList containing the list of jobs that are parents to the one being set up.
- link_job_instance (Job instance (optional),) – Coordinate the valid times with another Executable.
- allow_overlap (boolean (optional, kwarg, default = True)) – If this is set the times that jobs are valid for will be allowed to overlap. This may be desired for template banks which may have some overlap in the times they cover. This may not be desired for inspiral jobs, where you probably want triggers recorded by jobs to not overlap at all.
- compatibility_mode (boolean (optional, kwarg, default = False)) – If given the jobs will be tiled in the same method as used in inspiral hipe. This requires that link_job_instance is also given. If not given workflow’s methods are used.
Returns: out_files – A list of the files that will be generated by this step in the workflow.
Return type:
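Step (2) above, counting and placing the jobs that cover a science segment, can be sketched as follows. This is a simplified model (uniform jobs, equal padding, no link_job_instance), not the actual tiling code:

```python
import math

def tile_segment(seg_start, seg_end, data_length, valid_length):
    """Return the data-read start times of the jobs needed to cover
    [seg_start, seg_end) with valid output. Simplified model: every
    job reads data_length seconds and is valid for valid_length of it."""
    science = seg_end - seg_start
    if science < data_length:
        # Segment too short to run even one job.
        return []
    # Number of jobs whose valid chunks can cover the segment.
    n_jobs = int(math.ceil((science - data_length) / float(valid_length))) + 1
    if n_jobs == 1:
        return [seg_start]
    # Spread the data reads evenly across the segment; consecutive
    # valid chunks then overlap slightly rather than leaving gaps.
    stride = (science - data_length) / float(n_jobs - 1)
    return [seg_start + i * stride for i in range(n_jobs)]

starts = tile_segment(0, 4000, 2048, 1904)
```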
pycbc.workflow.matched_filter module¶
This module is responsible for setting up the matched-filtering stage of workflows. For details about this module and its capabilities see here: https://ldas-jobs.ligo.caltech.edu/~cbc/docs/pycbc/NOTYETCREATED.html
-
pycbc.workflow.matched_filter.
setup_matchedfltr_dax_generated
(workflow, science_segs, datafind_outs, tmplt_banks, output_dir, injection_file=None, tags=None, link_to_tmpltbank=False, compatibility_mode=False)[source]¶ Setup matched-filter jobs that are generated as part of the workflow. This module can support any matched-filter code that is similar in principle to lalapps_inspiral, but for new codes some additions are needed to define Executable and Job sub-classes (see jobutils.py).
Parameters: - workflow (pycbc.workflow.core.Workflow) – The Workflow instance that the coincidence jobs will be added to.
- science_segs (ifo-keyed dictionary of ligo.segments.segmentlist instances) – The list of times that are being analysed in this workflow.
- datafind_outs (pycbc.workflow.core.FileList) – A FileList of the datafind files that are needed to obtain the data used in the analysis.
- tmplt_banks (pycbc.workflow.core.FileList) – A FileList of the template bank files that will serve as input in this stage.
- output_dir (path) – The directory in which output will be stored.
- injection_file (pycbc.workflow.core.File, optional (default=None)) – If given the file containing the simulation file to be sent to these jobs on the command line. If not given no file will be sent.
- tags (list of strings (optional, default = [])) – A list of the tagging strings that will be used for all jobs created by this call to the workflow. An example might be [‘BNSINJECTIONS’] or [‘NOINJECTIONANALYSIS’]. This will be used in output names.
- link_to_tmpltbank (boolean, optional (default=True)) – If this option is given, the job valid_times will be altered so that there will be one inspiral file for every template bank and they will cover the same time span. Note that this option must also be given during template bank generation to be meaningful.
Returns: inspiral_outs – A list of output files written by this stage. This will not contain any intermediate products produced within this stage of the workflow. If you require access to any intermediate products produced at this stage you can call the various sub-functions directly.
Return type:
-
pycbc.workflow.matched_filter.
setup_matchedfltr_dax_generated_multi
(workflow, science_segs, datafind_outs, tmplt_banks, output_dir, injection_file=None, tags=None, link_to_tmpltbank=False, compatibility_mode=False)[source]¶ Setup matched-filter jobs that are generated as part of the workflow in which a single job reads in and generates triggers over multiple ifos. This module can support any matched-filter code that is similar in principle to pycbc_multi_inspiral or lalapps_coh_PTF_inspiral, but for new codes some additions are needed to define Executable and Job sub-classes (see jobutils.py).
Parameters: - workflow (pycbc.workflow.core.Workflow) – The Workflow instance that the coincidence jobs will be added to.
- science_segs (ifo-keyed dictionary of ligo.segments.segmentlist instances) – The list of times that are being analysed in this workflow.
- datafind_outs (pycbc.workflow.core.FileList) – A FileList of the datafind files that are needed to obtain the data used in the analysis.
- tmplt_banks (pycbc.workflow.core.FileList) – A FileList of the template bank files that will serve as input in this stage.
- output_dir (path) – The directory in which output will be stored.
- injection_file (pycbc.workflow.core.File, optional (default=None)) – If given the file containing the simulation file to be sent to these jobs on the command line. If not given no file will be sent.
- tags (list of strings (optional, default = [])) – A list of the tagging strings that will be used for all jobs created by this call to the workflow. An example might be [‘BNSINJECTIONS’] or [‘NOINJECTIONANALYSIS’]. This will be used in output names.
Returns: inspiral_outs – A list of output files written by this stage. This will not contain any intermediate products produced within this stage of the workflow. If you require access to any intermediate products produced at this stage you can call the various sub-functions directly.
Return type:
-
pycbc.workflow.matched_filter.
setup_matchedfltr_workflow
(workflow, science_segs, datafind_outs, tmplt_banks, output_dir=None, injection_file=None, tags=None)[source]¶ This function aims to be the gateway for setting up a set of matched-filter jobs in a workflow. This function is intended to support multiple different ways/codes that could be used for doing this. For now the only supported sub-module is one that runs the matched-filtering by setting up a series of matched-filtering jobs, from one executable, to create matched-filter triggers covering the full range of science times for which there is data and a template bank file.
Parameters: - Workflow (pycbc.workflow.core.Workflow) – The workflow instance that the coincidence jobs will be added to.
- science_segs (ifo-keyed dictionary of ligo.segments.segmentlist instances) – The list of times that are being analysed in this workflow.
- datafind_outs (pycbc.workflow.core.FileList) – A FileList of the datafind files that are needed to obtain the data used in the analysis.
- tmplt_banks (pycbc.workflow.core.FileList) – A FileList of the template bank files that will serve as input in this stage.
- output_dir (path) – The directory in which output will be stored.
- injection_file (pycbc.workflow.core.File, optional (default=None)) – If given the file containing the simulation file to be sent to these jobs on the command line. If not given no file will be sent.
- tags (list of strings (optional, default = [])) – A list of the tagging strings that will be used for all jobs created by this call to the workflow. An example might be [‘BNSINJECTIONS’] or [‘NOINJECTIONANALYSIS’]. This will be used in output names.
Returns: inspiral_outs – A list of output files written by this stage. This will not contain any intermediate products produced within this stage of the workflow. If you require access to any intermediate products produced at this stage you can call the various sub-functions directly.
Return type:
pycbc.workflow.minifollowups module¶
-
class
pycbc.workflow.minifollowups.
PlotQScanExecutable
(cp, name, universe=None, ifos=None, out_dir=None, tags=None)[source]¶ Bases:
pycbc.workflow.plotting.PlotExecutable
Class used to create workflow.Executable instances for the pycbc_plot_qscan executable. Inherits directly from PlotExecutable but adds the file_input_options.
-
file_input_options
= ['--gating-file']¶
-
-
class
pycbc.workflow.minifollowups.
SingleTemplateExecutable
(cp, name, universe=None, ifos=None, out_dir=None, tags=None)[source]¶ Bases:
pycbc.workflow.plotting.PlotExecutable
Class used to create workflow.Executable instances for the pycbc_single_template executable. Inherits directly from PlotExecutable but adds the file_input_options.
-
file_input_options
= ['--gating-file']¶
-
-
class
pycbc.workflow.minifollowups.
SingleTimeFreqExecutable
(cp, name, universe=None, ifos=None, out_dir=None, tags=None)[source]¶ Bases:
pycbc.workflow.plotting.PlotExecutable
Class used to create workflow.Executable instances for the pycbc_plot_singles_timefreq executable. Inherits directly from PlotExecutable but adds the file_input_options.
-
file_input_options
= ['--gating-file']¶
-
-
pycbc.workflow.minifollowups.
create_noop_node
()[source]¶ Creates a noop node that can be added to a DAX and does nothing. This is needed because a minifollowups dax containing no triggers would otherwise contain no jobs and be invalid; adding a single noop node ensures that such daxes will actually run. Adding a noop node to a workflow more than once will cause a failure.
-
pycbc.workflow.minifollowups.
grouper
(iterable, n, fillvalue=None)[source]¶ Group an iterable into fixed-length tuples of size n, padding the last group with fillvalue.
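This is the classic itertools grouper recipe; a self-contained equivalent:

```python
from itertools import zip_longest

def grouper(iterable, n, fillvalue=None):
    """Collect data into fixed-length chunks, e.g.
    grouper('ABCDEFG', 3, 'x') -> ('A','B','C'), ('D','E','F'), ('G','x','x')."""
    # n references to the same iterator: zip_longest pulls n items at a
    # time, padding the final incomplete group with fillvalue.
    args = [iter(iterable)] * n
    return zip_longest(*args, fillvalue=fillvalue)

groups = list(grouper('ABCDEFG', 3, 'x'))
```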
-
pycbc.workflow.minifollowups.
make_coinc_info
(workflow, singles, bank, coinc, out_dir, n_loudest=None, trig_id=None, file_substring=None, tags=None)[source]¶
-
pycbc.workflow.minifollowups.
make_inj_info
(workflow, injection_file, injection_index, num, out_dir, tags=None)[source]¶
-
pycbc.workflow.minifollowups.
make_plot_waveform_plot
(workflow, params, out_dir, ifos, exclude=None, require=None, tags=None)[source]¶ Add plot_waveform jobs to the workflow.
-
pycbc.workflow.minifollowups.
make_qscan_plot
(workflow, ifo, trig_time, out_dir, injection_file=None, data_segments=None, time_window=100, tags=None)[source]¶ Generate a make_qscan node and add it to workflow.
This function generates a single node of the qscan executable and adds it to the current workflow. Parent/child relationships are set by the input/output files automatically.
Parameters: - workflow (pycbc.workflow.core.Workflow) – The workflow class that stores the jobs that will be run.
- ifo (str) – Which interferometer are we using?
- trig_time (int) – The time of the trigger being followed up.
- out_dir (str) – Location of directory to output to
- injection_file (pycbc.workflow.File (optional, default=None)) – If given, add the injections in the file to strain before making the plot.
- data_segments (ligo.segments.segmentlist (optional, default=None)) – The list of segments for which data exists and can be read in. If given the start/end times given to singles_timefreq will be adjusted if [trig_time - time_window, trig_time + time_window] does not completely lie within a valid data segment. A ValueError will be raised if the trig_time is not within a valid segment, or if it is not possible to find 2*time_window (plus the padding) of continuous data around the trigger. This must be coalesced.
- time_window (int (optional, default=100)) – The amount of data (not including padding) that will be read in by the singles_timefreq job. The default value of 100s should be fine for most cases.
- tags (list (optional, default=None)) – List of tags to add to the created nodes, which determine file naming.
-
pycbc.workflow.minifollowups.
make_single_template_plots
(workflow, segs, data_read_name, analyzed_name, params, out_dir, inj_file=None, exclude=None, require=None, tags=None, params_str=None, use_exact_inj_params=False)[source]¶ Function for creating jobs to run the pycbc_single_template code and to run the associated plotting code pycbc_single_template_plots and add these jobs to the workflow.
Parameters: - workflow (workflow.Workflow instance) – The pycbc.workflow.Workflow instance to add these jobs to.
- segs (workflow.File instance) – The pycbc.workflow.File instance that points to the XML file containing the segment lists of data read in and data analyzed.
- data_read_name (str) – The name of the segmentlist containing the data read in by each inspiral job in the segs file.
- analyzed_name (str) – The name of the segmentlist containing the data analyzed by each inspiral job in the segs file.
- params (dictionary) – A dictionary containing the parameters of the template to be used. params[ifo+’end_time’] is required for all ifos in workflow.ifos. If use_exact_inj_params is False then values for [mass1, mass2, spin1z, spin2z] must also be supplied. For precessing templates one also needs [spin1x, spin1y, spin2x, spin2y, inclination]; additionally for precession one must supply u_vals, or u_vals_+ifo for all ifos. u_vals is the ratio between h_+ and h_x used when constructing h(t): h(t) = (h_+ * u_vals) + h_x.
- out_dir (str) – Directory in which to store the output files.
- inj_file (workflow.File (optional, default=None)) – If given send this injection file to the job so that injections are made into the data.
- exclude (list (optional, default=None)) – If given, then when considering which subsections in the ini file to parse for options to add to single_template_plot, only use subsections that do not match strings in this list.
- require (list (optional, default=None)) – If given, then when considering which subsections in the ini file to parse for options to add to single_template_plot, only use subsections matching strings in this list.
- tags (list (optional, default=None)) – Add this list of tags to all jobs.
- params_str (str (optional, default=None)) – If given add this string to plot title and caption to describe the template that was used.
- use_exact_inj_params (boolean (optional, default=False)) – If True do not use masses and spins listed in the params dictionary but instead use the injection closest to the filter time as a template.
Returns: output_files – The list of workflow.Files created in this function.
Return type: workflow.FileList
-
pycbc.workflow.minifollowups.
make_singles_timefreq
(workflow, single, bank_file, trig_time, out_dir, veto_file=None, time_window=10, data_segments=None, tags=None)[source]¶ Generate a singles_timefreq node and add it to workflow.
This function generates a single node of the singles_timefreq executable and adds it to the current workflow. Parent/child relationships are set by the input/output files automatically.
Parameters: - workflow (pycbc.workflow.core.Workflow) – The workflow class that stores the jobs that will be run.
- single (pycbc.workflow.core.File instance) – The File object storing the single-detector triggers to followup.
- bank_file (pycbc.workflow.core.File instance) – The File object storing the template bank.
- trig_time (int) – The time of the trigger being followed up.
- out_dir (str) – Location of directory to output to
- veto_file (pycbc.workflow.core.File (optional, default=None)) – If given use this file to veto triggers to determine the loudest event. FIXME: Veto files should be provided a definer argument and not just assume that all segments should be read.
- time_window (int (optional, default=10)) – The amount of data (not including padding) that will be read in by the singles_timefreq job. The default value of 10s should be fine for most cases.
- data_segments (ligo.segments.segmentlist (optional, default=None)) – The list of segments for which data exists and can be read in. If given the start/end times given to singles_timefreq will be adjusted if [trig_time - time_window, trig_time + time_window] does not completely lie within a valid data segment. A ValueError will be raised if the trig_time is not within a valid segment, or if it is not possible to find 2*time_window (plus the padding) of continuous data around the trigger. This must be coalesced.
- tags (list (optional, default=None)) – List of tags to add to the created nodes, which determine file naming.
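The start/end adjustment described for data_segments (the same logic is described for make_qscan_plot above) can be sketched as clamping the requested window inside the data segment containing the trigger. Illustrative code, not the actual pycbc implementation:

```python
def adjust_window(trig_time, time_window, segments):
    """Shift [trig_time - w, trig_time + w] so it lies inside the
    (coalesced) data segment containing trig_time. segments is a list
    of (start, end) pairs. Raises ValueError in the cases the
    docstring describes; padding is ignored in this sketch."""
    for seg_start, seg_end in segments:
        if seg_start <= trig_time < seg_end:
            if seg_end - seg_start < 2 * time_window:
                raise ValueError("not enough continuous data around trigger")
            # Clamp the window to the segment, keeping its full length.
            start = max(seg_start, trig_time - time_window)
            end = start + 2 * time_window
            if end > seg_end:
                end = seg_end
                start = end - 2 * time_window
            return start, end
    raise ValueError("trig_time not within a valid data segment")
```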
-
pycbc.workflow.minifollowups.
make_skipped_html
(workflow, skipped_data, out_dir, tags)[source]¶ Make a html snippet from the list of skipped background coincidences
-
pycbc.workflow.minifollowups.
make_sngl_ifo
(workflow, sngl_file, bank_file, trigger_id, out_dir, ifo, tags=None)[source]¶ Set up a job to create a single-detector (sngl ifo) HTML summary snippet.
-
pycbc.workflow.minifollowups.
make_trigger_timeseries
(workflow, singles, ifo_times, out_dir, special_tids=None, exclude=None, require=None, tags=None)[source]¶
-
pycbc.workflow.minifollowups.
setup_foreground_minifollowups
(workflow, coinc_file, single_triggers, tmpltbank_file, insp_segs, insp_data_name, insp_anal_name, dax_output, out_dir, tags=None)[source]¶ Create plots that followup the Nth loudest coincident trigger from a statmap produced HDF file.
Parameters: - workflow (pycbc.workflow.Workflow) – The core workflow instance we are populating
- coinc_file –
- single_triggers (list of pycbc.workflow.File) – A list containing the file objects associated with the merged single detector trigger files for each ifo.
- tmpltbank_file (pycbc.workflow.File) – The file object pointing to the HDF format template bank
- insp_segs (SegFile) – The segment file containing the data read and analyzed by each inspiral job.
- insp_data_name (str) – The name of the segmentlist storing data read.
- insp_anal_name (str) – The name of the segmentlist storing data analyzed.
- out_dir (path) – The directory to store minifollowups result plots and files
- tags ({None, optional}) – Tags to add to the minifollowups executables
Returns: layout – A list of tuples which specify the displayed file layout for the minifollowups plots.
Return type:
-
pycbc.workflow.minifollowups.
setup_injection_minifollowups
(workflow, injection_file, inj_xml_file, single_triggers, tmpltbank_file, insp_segs, insp_data_name, insp_anal_name, dax_output, out_dir, tags=None)[source]¶ Create plots that followup the closest missed injections
Parameters: - workflow (pycbc.workflow.Workflow) – The core workflow instance we are populating
- coinc_file –
- single_triggers (list of pycbc.workflow.File) – A list containing the file objects associated with the merged single detector trigger files for each ifo.
- tmpltbank_file (pycbc.workflow.File) – The file object pointing to the HDF format template bank
- insp_segs (SegFile) – The segment file containing the data read by each inspiral job.
- insp_data_name (str) – The name of the segmentlist storing data read.
- insp_anal_name (str) – The name of the segmentlist storing data analyzed.
- out_dir (path) – The directory to store minifollowups result plots and files
- tags ({None, optional}) – Tags to add to the minifollowups executables
Returns: layout – A list of tuples which specify the displayed file layout for the minifollowups plots.
Return type:
-
pycbc.workflow.minifollowups.
setup_single_det_minifollowups
(workflow, single_trig_file, tmpltbank_file, insp_segs, insp_data_name, insp_anal_name, dax_output, out_dir, veto_file=None, veto_segment_name=None, statfiles=None, tags=None)[source]¶ Create plots that followup the Nth loudest clustered single detector triggers from a merged single detector trigger HDF file.
Parameters: - workflow (pycbc.workflow.Workflow) – The core workflow instance we are populating
- single_trig_file (pycbc.workflow.File) – The File class holding the single detector triggers.
- tmpltbank_file (pycbc.workflow.File) – The file object pointing to the HDF format template bank
- insp_segs (SegFile) – The segment file containing the data read by each inspiral job.
- insp_data_name (str) – The name of the segmentlist storing data read.
- insp_anal_name (str) – The name of the segmentlist storing data analyzed.
- out_dir (path) – The directory to store minifollowups result plots and files
- statfiles (FileList (optional, default=None)) – Supplementary files necessary for computing the single-detector statistic.
- tags ({None, optional}) – Tags to add to the minifollowups executables
Returns: layout – A list of tuples which specify the displayed file layout for the minifollowups plots.
Return type:
pycbc.workflow.pegasus_workflow module¶
This module provides thin wrappers around Pegasus.DAX3 functionality that provides additional abstraction and argument handling.
-
class
pycbc.workflow.pegasus_workflow.
DataStorage
(name)[source]¶ Bases:
object
A workflow representation of a place to store and read data from.
The abstract representation of a place to store and read data from. This can include files, databases, or remote connections. This object is used as a handle to pass between functions, and as a way to logically represent the ordering of operations on the physical data.
-
class
pycbc.workflow.pegasus_workflow.
Executable
(name, namespace=None, os='linux', arch='x86_64', installed=True, version=None, container=None)[source]¶ Bases:
pycbc.workflow.pegasus_workflow.ProfileShortcuts
The workflow representation of an Executable
-
id
= 0¶
-
-
class
pycbc.workflow.pegasus_workflow.
File
(name)[source]¶ Bases:
pycbc.workflow.pegasus_workflow.DataStorage
,Pegasus.DAX3.File
The workflow representation of a physical file
An object that represents a file from the perspective of setting up a workflow. The file may or may not exist at the time of workflow generation. If it does, this is represented by containing a physical file name (PFN). A storage path is also available to indicate the desired final destination of this file.
-
dax_repr
¶ Return the dax representation of a File.
-
classmethod
from_path
(path)[source]¶ Takes a path and returns a File object with the path as the PFN.
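A minimal stand-in sketch of the File abstraction and from_path (the real class also inherits from Pegasus.DAX3.File, and attribute names here are illustrative):

```python
import os

class File:
    """A workflow file that may or may not exist at workflow-generation
    time. If it exists, a physical file name (PFN) records where to
    find it."""

    def __init__(self, name):
        self.name = name   # logical file name used within the workflow
        self.pfns = []     # physical locations, if known

    def add_pfn(self, url):
        self.pfns.append(url)

    @classmethod
    def from_path(cls, path):
        # Use the basename as the logical name and the full path as
        # the PFN, as the docstring above describes.
        fil = cls(os.path.basename(path))
        fil.add_pfn(os.path.abspath(path))
        return fil

f = File.from_path('/tmp/H1-BANK.hdf')
```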
-
-
class
pycbc.workflow.pegasus_workflow.
Node
(executable)[source]¶ Bases:
pycbc.workflow.pegasus_workflow.ProfileShortcuts
-
add_profile
(namespace, key, value, force=False)[source]¶ Add profile information to this node at the DAX level
-
-
class
pycbc.workflow.pegasus_workflow.
ProfileShortcuts
[source]¶ Bases:
object
Container of common methods for setting pegasus profile information on Executables and nodes. This class expects to be inherited from and for an add_profile method to be implemented.
-
class
pycbc.workflow.pegasus_workflow.
Workflow
(name='my_workflow')[source]¶ Bases:
object
-
add_node
(node)[source]¶ Add a node to this workflow
This function adds nodes to the workflow. It also determines parent/child relations from the DataStorage inputs to this job.
Parameters: node (pycbc.workflow.pegasus_workflow.Node) – A node that should be executed as part of this workflow.
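The parent/child inference described above can be sketched with a toy model: when a node consumes a file that an earlier node produced, the producer becomes a parent. The class and attribute names below are illustrative only, not the actual pycbc/Pegasus API.

```python
# Toy sketch of dependency inference from data inputs/outputs.
# Names (ToyNode, ToyWorkflow) are hypothetical, not pycbc classes.

class ToyNode:
    def __init__(self, name, inputs=(), outputs=()):
        self.name = name
        self.inputs = set(inputs)
        self.outputs = set(outputs)

class ToyWorkflow:
    def __init__(self):
        self._producer = {}   # file name -> node that creates it
        self.edges = []       # (parent_name, child_name) pairs

    def add_node(self, node):
        # Any input produced by an earlier node creates a dependency edge.
        for f in node.inputs:
            if f in self._producer:
                self.edges.append((self._producer[f].name, node.name))
        for f in node.outputs:
            self._producer[f] = node
```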
-
pycbc.workflow.plotting module¶
This module is responsible for setting up plotting jobs. https://ldas-jobs.ligo.caltech.edu/~cbc/docs/pycbc/NOTYETCREATED.html
-
class
pycbc.workflow.plotting.
PlotExecutable
(cp, name, universe=None, ifos=None, out_dir=None, tags=None)[source]¶ Bases:
pycbc.workflow.core.Executable
plot executable
-
create_node
()[source]¶ Default node constructor.
This is usually overridden by subclasses of Executable.
-
current_retention_level
= 4¶
-
-
pycbc.workflow.plotting.
make_binned_hist
(workflow, trig_file, veto_file, veto_name, out_dir, bank_file, exclude=None, require=None, tags=None)[source]¶
-
pycbc.workflow.plotting.
make_coinc_snrchi_plot
(workflow, inj_file, inj_trig, stat_file, trig_file, out_dir, exclude=None, require=None, tags=None)[source]¶
-
pycbc.workflow.plotting.
make_foreground_table
(workflow, trig_file, bank_file, out_dir, singles=None, extension='.html', tags=None, hierarchical_level=None)[source]¶
-
pycbc.workflow.plotting.
make_foundmissed_plot
(workflow, inj_file, out_dir, exclude=None, require=None, tags=None)[source]¶
-
pycbc.workflow.plotting.
make_ifar_plot
(workflow, trigger_file, out_dir, tags=None, hierarchical_level=None, executable='page_ifar')[source]¶ Creates a node in the workflow for plotting cumulative histogram of IFAR values.
-
pycbc.workflow.plotting.
make_inj_table
(workflow, inj_file, out_dir, missed=False, singles=None, tags=None)[source]¶
-
pycbc.workflow.plotting.
make_range_plot
(workflow, psd_files, out_dir, exclude=None, require=None, tags=None)[source]¶
-
pycbc.workflow.plotting.
make_results_web_page
(workflow, results_dir, explicit_dependencies=None)[source]¶
-
pycbc.workflow.plotting.
make_seg_plot
(workflow, seg_files, out_dir, seg_names=None, tags=None)[source]¶ Creates a node in the workflow for plotting science and veto segments.
-
pycbc.workflow.plotting.
make_seg_table
(workflow, seg_files, seg_names, out_dir, tags=None, title_text=None, description=None)[source]¶ Creates a node in the workflow for writing the segment summary table. Returns a File instance for the output file.
-
pycbc.workflow.plotting.
make_sensitivity_plot
(workflow, inj_file, out_dir, exclude=None, require=None, tags=None)[source]¶
-
pycbc.workflow.plotting.
make_single_hist
(workflow, trig_file, veto_file, veto_name, out_dir, bank_file=None, exclude=None, require=None, tags=None)[source]¶
-
pycbc.workflow.plotting.
make_singles_plot
(workflow, trig_files, bank_file, veto_file, veto_name, out_dir, exclude=None, require=None, tags=None)[source]¶
-
pycbc.workflow.plotting.
make_snrchi_plot
(workflow, trig_files, veto_file, veto_name, out_dir, exclude=None, require=None, tags=None)[source]¶
-
pycbc.workflow.plotting.
make_snrifar_plot
(workflow, bg_file, out_dir, closed_box=False, cumulative=True, tags=None, hierarchical_level=None)[source]¶
-
pycbc.workflow.plotting.
make_snrratehist_plot
(workflow, bg_file, out_dir, closed_box=False, tags=None, hierarchical_level=None)[source]¶
-
pycbc.workflow.plotting.
make_spectrum_plot
(workflow, psd_files, out_dir, tags=None, hdf_group=None, precalc_psd_files=None)[source]¶
pycbc.workflow.psd module¶
This module is responsible for setting up PSD-related jobs in workflows.
-
pycbc.workflow.psd.
make_psd_file
(workflow, frame_files, segment_file, segment_name, out_dir, tags=None)[source]¶
-
pycbc.workflow.psd.
make_average_psd
(workflow, psd_files, out_dir, tags=None, output_fmt='.txt')[source]¶
pycbc.workflow.psdfiles module¶
This module is responsible for setting up the psd files used by CBC workflows.
-
pycbc.workflow.psdfiles.
setup_psd_pregenerated
(workflow, tags=None)[source]¶ Setup CBC workflow to use pregenerated psd files. The file given in cp.get(‘workflow’,’pregenerated-psd-file-(ifo)’) will be used as the --psd-file argument to geom_nonspinbank, geom_aligned_bank and pycbc_plot_psd_file.
Parameters: - workflow (pycbc.workflow.core.Workflow) – An instanced class that manages the constructed workflow.
- tags (list of strings) – If given these tags are used to uniquely name and identify output files that would be produced in multiple calls to this function.
Returns: psd_files – The FileList holding the psd files
Return type:
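As a rough sketch of how the per-ifo option lookup described above might work: the section and option-name pattern follow the docstring, but this helper is hypothetical, not the pycbc implementation.

```python
# Illustrative lookup of per-ifo pregenerated PSD paths from an ini file.
# The function name and the lower-casing of ifo names are assumptions.
import configparser

def get_pregenerated_psd_paths(cp, ifos):
    """Return a dict mapping each ifo to its configured PSD path."""
    paths = {}
    for ifo in ifos:
        opt = 'pregenerated-psd-file-%s' % ifo.lower()
        paths[ifo] = cp.get('workflow', opt)
    return paths

cp = configparser.ConfigParser()
cp.read_string("""
[workflow]
pregenerated-psd-file-h1 = /path/to/h1_psd.txt
pregenerated-psd-file-l1 = /path/to/l1_psd.txt
""")
```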
-
pycbc.workflow.psdfiles.
setup_psd_workflow
(workflow, science_segs, datafind_outs, output_dir=None, tags=None)[source]¶ Setup static psd section of CBC workflow. At present this only supports pregenerated psd files, in the future these could be created within the workflow.
Parameters: - workflow (pycbc.workflow.core.Workflow) – An instanced class that manages the constructed workflow.
- science_segs (Keyed dictionary of glue.segmentlist objects) – scienceSegs[ifo] holds the science segments to be analysed for each ifo.
- datafind_outs (pycbc.workflow.core.FileList) – The file list containing the datafind files.
- output_dir (path string) – The directory where data products will be placed.
- tags (list of strings) – If given these tags are used to uniquely name and identify output files that would be produced in multiple calls to this function.
Returns: psd_files – The FileList holding the psd files, 0 or 1 per ifo
Return type:
pycbc.workflow.segment module¶
This module is responsible for setting up the segment generation stage of workflows. For details about this module and its capabilities see here: https://ldas-jobs.ligo.caltech.edu/~cbc/docs/pycbc/ahope/segments.html
-
pycbc.workflow.segment.
add_cumulative_files
(workflow, output_file, input_files, out_dir, execute_now=False, tags=None)[source]¶ Function to combine a set of segment files into a single one. This function will not merge the segment lists but keep each separate.
Parameters: - workflow (pycbc.workflow.core.Workflow) – An instance of the Workflow class that manages the workflow.
- output_file (pycbc.workflow.core.File) – The output file object
- input_files (pycbc.workflow.core.FileList) – The list of input segment files
- out_dir (path) – The directory to write output to.
- execute_now (boolean, optional) – If true, jobs are executed immediately. If false, they are added to the workflow to be run later.
- tags (list of strings, optional) – A list of strings that is used to identify this job
-
pycbc.workflow.segment.
cat_to_veto_def_cat
(val)[source]¶ Convert a category character to the corresponding value in the veto definer file.
Parameters: str (single character string) – The input category character Returns: pipedown_str (str) – The pipedown equivalent notation that can be passed to programs that expect this definition.
-
pycbc.workflow.segment.
create_segs_from_cats_job
(cp, out_dir, ifo_string, tags=None)[source]¶ This function creates the CondorDAGJob that will be used to run ligolw_segments_from_cats as part of the workflow
Parameters: - cp (pycbc.workflow.configuration.WorkflowConfigParser) – The in-memory representation of the configuration (.ini) files
- out_dir (path) – Directory in which to put output files
- ifo_string (string) – String containing all active ifos, i.e. “H1L1V1”
- tag (list of strings, optional (default=None)) – Use this to specify a tag(s). This can be used if this module is being called more than once to give call specific configuration (by setting options in [workflow-datafind-${TAG}] rather than [workflow-datafind]). This is also used to tag the Files returned by the class to uniqueify the Files and uniqueify the actual filename. FIXME: Filenames may not be unique with current codes!
Returns: job – The Job instance that will run segments_from_cats jobs
Return type: Job instance
-
pycbc.workflow.segment.
file_needs_generating
(file_path, cp, tags=None)[source]¶ This function tests the file location and determines whether the file should be generated now or whether an error should be raised. This uses the generate_segment_files variable, global to this module, which is described above and in the documentation.
Parameters: - file_path (path) – Location of file to check
- cp (ConfigParser) – The associated ConfigParser from which the segments-generate-segment-files variable is returned. It is recommended for most applications to use the default option by leaving segments-generate-segment-files blank, which will regenerate all segment files at runtime. Only use this facility if you need it. Choices are * ‘always’ : DEFAULT: All files will be generated even if they already exist. * ‘if_not_present’: Files will be generated if they do not already exist. Pre-existing files will be read in and used. * ‘error_on_duplicate’: Files will be generated if they do not already exist. Pre-existing files will raise a failure. * ‘never’: Pre-existing files will be read in and used. If no file exists the code will fail.
Returns: 1 = Generate the file. 0 = File already exists, use it. Other cases will raise an error.
Return type:
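The four choices above amount to a simple decision table. A minimal sketch follows, with the mode strings taken from the docstring and the return convention (1 = generate, 0 = reuse) as described; the error types raised here are assumptions, and this is not the pycbc implementation.

```python
# Sketch of the segments-generate-segment-files decision logic.
import os

def file_needs_generating(file_path, mode='always'):
    """Return 1 if the file should be generated, 0 if it should be reused."""
    exists = os.path.isfile(file_path)
    if mode == 'always':
        # DEFAULT: regenerate even if the file already exists.
        return 1
    if mode == 'if_not_present':
        return 0 if exists else 1
    if mode == 'error_on_duplicate':
        if exists:
            raise ValueError("File %s already exists." % file_path)
        return 1
    if mode == 'never':
        if not exists:
            raise ValueError("File %s does not exist." % file_path)
        return 0
    raise ValueError("Unknown mode %r" % mode)
```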
-
pycbc.workflow.segment.
find_playground_segments
(segs)[source]¶ Finds playground time in a list of segments.
Playground segments include the first 600s of every 6370s stride starting at GPS time 729273613.
Parameters: segs (segmentfilelist) – A segmentfilelist in which to find playground segments. Returns: outlist – A segmentfilelist with all playground segments during the input segmentfilelist (i.e. segs). Return type: segmentfilelist
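The playground convention above (the first 600 s of every 6370 s stride starting at GPS 729273613) can be sketched in pure Python. The real function operates on segment lists; the helper below is illustrative only and works on a single (start, end) interval.

```python
# Pure-Python sketch of the S-era playground convention described above.
PLAYGROUND_START = 729273613
PLAYGROUND_STRIDE = 6370
PLAYGROUND_LENGTH = 600

def playground_in_interval(seg_start, seg_end):
    """Return playground segments, as (start, end) tuples, that
    intersect the interval [seg_start, seg_end)."""
    out = []
    # Index of the first stride whose window could overlap the interval.
    k = (seg_start - PLAYGROUND_START) // PLAYGROUND_STRIDE
    while True:
        p_start = PLAYGROUND_START + k * PLAYGROUND_STRIDE
        p_end = p_start + PLAYGROUND_LENGTH
        if p_start >= seg_end:
            break
        # Clip the playground window to the query interval.
        lo, hi = max(p_start, seg_start), min(p_end, seg_end)
        if lo < hi:
            out.append((lo, hi))
        k += 1
    return out
```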
-
pycbc.workflow.segment.
get_analyzable_segments
(workflow, sci_segs, cat_files, out_dir, tags=None)[source]¶ Get the analyzable segments after applying ini specified vetoes and any other restrictions on the science segs, e.g. a minimum segment length, or demanding that only coincident segments are analysed.
Parameters: - workflow (Workflow object) – Instance of the workflow object
- sci_segs (Ifo-keyed dictionary of glue.segmentlists) – The science segments for each ifo to which the vetoes, or any other restriction, will be applied.
- cat_files (FileList of SegFiles) – The category veto files generated by get_veto_segs
- out_dir (path) – Location to store output files
- tags (list of strings) – Used to retrieve subsections of the ini file for configuration options.
Returns: - sci_ok_seg_file (workflow.core.SegFile instance) – The segment file combined from all ifos containing the analyzable science segments.
- sci_ok_segs (Ifo keyed dict of ligo.segments.segmentlist instances) – The analyzable science segs for each ifo, keyed by ifo
- sci_ok_seg_name (str) – The name with which analyzable science segs are stored in the output XML file.
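The kinds of restrictions mentioned above (a minimum segment length, keeping only coincident time) can be illustrated with plain (start, end) tuples. The real code uses ligo.segments objects and ini-file options; the helpers below are hypothetical sketches.

```python
# Illustrative segment restrictions; names are assumptions, not pycbc API.

def apply_min_length(segs, min_len):
    """Drop segments shorter than min_len seconds."""
    return [(s, e) for s, e in segs if e - s >= min_len]

def coincident_times(segs_a, segs_b):
    """Intersection of two segment lists: times covered by both ifos."""
    out = []
    for sa, ea in segs_a:
        for sb, eb in segs_b:
            lo, hi = max(sa, sb), min(ea, eb)
            if lo < hi:
                out.append((lo, hi))
    return out
```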
-
pycbc.workflow.segment.
get_cumulative_segs
(workflow, categories, seg_files_list, out_dir, tags=None, execute_now=False, segment_name=None)[source]¶ Function to generate one of the cumulative, multi-detector segment files as part of the workflow.
Parameters: - workflow (pycbc.workflow.core.Workflow) – An instance of the Workflow class that manages the workflow.
- categories (int) – The veto categories to include in this cumulative veto.
- seg_files_list (list of SegFiles) – The list of segment files to be used as input for combining.
- out_dir (path) – The directory to write output to.
- tags (list of strings, optional) – A list of strings that is used to identify this job
- execute_now (boolean, optional) – If true, jobs are executed immediately. If false, they are added to the workflow to be run later.
- segment_name (str) – The name of the combined, cumulative segments in the output file.
-
pycbc.workflow.segment.
get_cumulative_veto_group_files
(workflow, option, cat_files, out_dir, execute_now=True, tags=None)[source]¶ Get the cumulative veto files that define the different backgrounds we want to analyze, defined by groups of vetoes.
Parameters: - workflow (Workflow object) – Instance of the workflow object
- option (str) – ini file option to use to get the veto groups
- cat_files (FileList of SegFiles) – The category veto files generated by get_veto_segs
- out_dir (path) – Location to store output files
- execute_now (Boolean) – If true outputs are generated at runtime. Else jobs go into the workflow and are generated then.
- tags (list of strings) – Used to retrieve subsections of the ini file for configuration options.
Returns: - seg_files (workflow.core.FileList instance) – The cumulative segment files for each veto group.
- names (list of strings) – The segment names for the corresponding seg_file
- cat_files (workflow.core.FileList instance) – The list of individual category veto files
-
pycbc.workflow.segment.
get_files_for_vetoes
(workflow, out_dir, runtime_names=None, in_workflow_names=None, tags=None)[source]¶ Get the various sets of veto segments that will be used in this analysis.
Parameters: - workflow (Workflow object) – Instance of the workflow object
- out_dir (path) – Location to store output files
- runtime_names (list) – Veto category groups with these names in the [workflow-segment] section of the ini file will be generated now.
- in_workflow_names (list) – Veto category groups with these names in the [workflow-segment] section of the ini file will be generated in the workflow. If a veto category appears here and in runtime_names, it will be generated now.
- tags (list of strings) – Used to retrieve subsections of the ini file for configuration options.
Returns: veto_seg_files – List of veto segment files generated
Return type:
-
pycbc.workflow.segment.
get_sci_segs_for_ifo
(ifo, cp, start_time, end_time, out_dir, tags=None)[source]¶ Obtain science segments for the selected ifo
Parameters: - ifo (string) – The string describing the ifo to obtain science times for.
- start_time (gps time (either int/LIGOTimeGPS)) – The time at which to begin searching for segments.
- end_time (gps time (either int/LIGOTimeGPS)) – The time at which to stop searching for segments.
- out_dir (path) – The directory in which output will be stored.
- tag (string, optional (default=None)) – Use this to specify a tag. This can be used if this module is being called more than once to give call specific configuration (by setting options in [workflow-datafind-${TAG}] rather than [workflow-datafind]). This is also used to tag the Files returned by the class to uniqueify the Files and uniqueify the actual filename.
Returns: - sci_segs (ligo.segments.segmentlist) – The segmentlist generated by this call
- sci_xml_file (pycbc.workflow.core.SegFile) – The workflow File object corresponding to this science segments file.
- out_sci_seg_name (string) – The name of the output segment list in the output XML file.
-
pycbc.workflow.segment.
get_science_segments
(workflow, out_dir, tags=None)[source]¶ Get the analyzable segments after applying ini specified vetoes.
Parameters: - workflow (Workflow object) – Instance of the workflow object
- out_dir (path) – Location to store output files
- tags (list of strings) – Used to retrieve subsections of the ini file for configuration options.
Returns: - sci_seg_file (workflow.core.SegFile instance) – The segment file combined from all ifos containing the science segments.
- sci_segs (Ifo keyed dict of ligo.segments.segmentlist instances) – The science segs for each ifo, keyed by ifo
- sci_seg_name (str) – The name with which science segs are stored in the output XML file.
-
pycbc.workflow.segment.
get_segments_file
(workflow, name, option_name, out_dir)[source]¶ Get cumulative segments from option name syntax for each ifo.
Use the configparser string syntax to define the resulting segment_file, e.g. option_name = +up_flag1,+up_flag2,+up_flag3,-down_flag1,-down_flag2 Each ifo may have a different string, and each is stored separately in the file. Flags which add time must precede flags which subtract time.
Parameters: - workflow (pycbc.workflow.Workflow) –
- name (string) – Name of the segment list being created
- option_name (str) – Name of option in the associated config parser to get the flag list
Returns: seg_file – SegFile instance that points to the segment xml file on disk.
Return type: pycbc.workflow.SegFile
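The “+flag,-flag” combination rule described above — ‘+’ flags add (union) their segments, ‘-’ flags subtract theirs, with all additions applied before subtractions — can be sketched with plain tuples. The function name and the integer-second representation below are simplifying assumptions; the real code works with ligo.segments objects and DQ flag queries.

```python
# Hypothetical sketch of combining +/- flag segments, assuming integer
# GPS seconds for simplicity.

def combine_flag_segments(option_value, flag_segs):
    """option_value: e.g. '+FLAG_A,+FLAG_B,-FLAG_C'
    flag_segs: dict mapping flag name -> list of (start, end) tuples."""
    add, subtract = [], []
    for token in option_value.split(','):
        token = token.strip()
        (add if token.startswith('+') else subtract).append(token[1:])
    # Union of all added flags, as a set of covered seconds.
    times = set()
    for name in add:
        for s, e in flag_segs[name]:
            times.update(range(s, e))
    # Subtractions are applied after all additions.
    for name in subtract:
        for s, e in flag_segs[name]:
            times.difference_update(range(s, e))
    # Re-assemble contiguous runs of seconds into segments.
    out, run = [], None
    for t in sorted(times):
        if run and t == run[1]:
            run[1] = t + 1
        else:
            if run:
                out.append(tuple(run))
            run = [t, t + 1]
    if run:
        out.append(tuple(run))
    return out
```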
-
pycbc.workflow.segment.
get_triggered_coherent_segment
(workflow, sciencesegs)[source]¶ Construct the coherent network on and off source segments. Can switch to construction of segments for a single IFO search when coherent segments are insufficient for a search.
Parameters: - workflow (pycbc.workflow.core.Workflow) – The workflow instance that the calculated segments belong to.
- sciencesegs (dict) – Dictionary of all science segments within analysis time.
Returns: - onsource (ligo.segments.segmentlistdict) – A dictionary containing the on source segments for network IFOs
- offsource (ligo.segments.segmentlistdict) – A dictionary containing the off source segments for network IFOs
-
pycbc.workflow.segment.
get_veto_segs
(workflow, ifo, category, start_time, end_time, out_dir, veto_gen_job, tags=None, execute_now=False)[source]¶ Obtain veto segments for the selected ifo and veto category and add the job to generate this to the workflow.
Parameters: - workflow (pycbc.workflow.core.Workflow) – An instance of the Workflow class that manages the workflow.
- ifo (string) – The string describing the ifo to generate vetoes for.
- category (int) – The veto category to generate vetoes for.
- start_time (gps time (either int/LIGOTimeGPS)) – The time at which to begin searching for segments.
- end_time (gps time (either int/LIGOTimeGPS)) – The time at which to stop searching for segments.
- out_dir (path) – The directory in which output will be stored.
- vetoGenJob (Job) – The veto generation Job class that will be used to create the Node.
- tag (string, optional (default=None)) – Use this to specify a tag. This can be used if this module is being called more than once to give call specific configuration (by setting options in [workflow-datafind-${TAG}] rather than [workflow-datafind]). This is also used to tag the Files returned by the class to uniqueify the Files and uniqueify the actual filename. FIXME: Filenames may not be unique with current codes!
- execute_now (boolean, optional) – If true, jobs are executed immediately. If false, they are added to the workflow to be run later.
Returns: veto_def_file – The workflow File object corresponding to this DQ veto file.
Return type:
-
pycbc.workflow.segment.
parse_cat_ini_opt
(cat_str)[source]¶ Parse a cat str from the ini file into a list of sets
-
pycbc.workflow.segment.
save_veto_definer
(cp, out_dir, tags=None)[source]¶ Retrieve the veto definer file and save it locally
Parameters: - cp (ConfigParser instance) –
- out_dir (path) –
- tags (list of strings) – Used to retrieve subsections of the ini file for configuration options.
-
pycbc.workflow.segment.
setup_segment_gen_mixed
(workflow, veto_categories, out_dir, maxVetoAtRunTime, tag=None, generate_coincident_segs=True)[source]¶ This function will generate veto files for each ifo and for each veto category. It can generate these vetoes at run-time or in the workflow (or do some at run-time and some in the workflow). However, the CAT_1 vetoes and science time must be generated at run time as they are needed to plan the workflow. CATs 2 and higher may be needed for other workflow construction. It can also combine these files to create a set of cumulative, multi-detector veto files, which can be used in ligolw_thinca and in pipedown. Again these can be created at run time or within the workflow.
Parameters: - workflow (pycbc.workflow.core.Workflow) – The Workflow instance that the coincidence jobs will be added to. This instance also contains the ifos for which to attempt to obtain segments for this analysis and the start and end times to search for segments over.
- veto_categories (list of ints) – List of veto categories to generate segments for. If this stops being integers, this can be changed here.
- out_dir (path) – The directory in which output will be stored.
- maxVetoAtRunTime (int) – Generate veto files at run time up to this category. Veto categories beyond this in veto_categories will be generated in the workflow. If we move to a model where veto categories are not explicitly cumulative, this will be rethought.
- tag (string, optional (default=None)) – Use this to specify a tag. This can be used if this module is being called more than once to give call specific configuration (by setting options in [workflow-datafind-${TAG}] rather than [workflow-datafind]). This is also used to tag the Files returned by the class to uniqueify the Files and uniqueify the actual filename. FIXME: Filenames may not be unique with current codes!
- generate_coincident_segs (boolean, optional (default = True)) – If given this module will generate a set of coincident, cumulative veto files that can be used with ligolw_thinca and pipedown.
Returns: segFilesList – These are representations of the various segment files that were constructed at this stage of the workflow and may be needed at later stages of the analysis (e.g. for performing DQ vetoes). If the file was generated at run-time the segment lists contained within these files will be an attribute of the instance. (If it will be generated in the workflow it will not be because I am not psychic).
Return type: dictionary of pycbc.workflow.core.SegFile instances
-
pycbc.workflow.segment.
setup_segment_generation
(workflow, out_dir, tag=None)[source]¶ This function is the gateway for setting up the segment generation steps in a workflow. It is designed to be able to support multiple ways of obtaining these segments and to combine/edit such files as necessary for analysis. The current modules have the capability to generate files at runtime or to generate files that are not needed for workflow generation within the workflow.
Parameters: - workflow (pycbc.workflow.core.Workflow) – The workflow instance that the coincidence jobs will be added to. This instance also contains the ifos for which to attempt to obtain segments for this analysis and the start and end times to search for segments over.
- out_dir (path) – The directory in which output will be stored.
- tag (string, optional (default=None)) – Use this to specify a tag. This can be used if this module is being called more than once to give call specific configuration (by setting options in [workflow-datafind-${TAG}] rather than [workflow-datafind]). This is also used to tag the Files returned by the class to uniqueify the Files and uniqueify the actual filename. FIXME: Filenames may not be unique with current codes!
Returns: - segsToAnalyse (dictionary of ifo-keyed glue.segment.segmentlist instances) – This will contain the times that your code should analyse. By default this is science time - CAT_1 vetoes. (This default could be changed if desired)
- segFilesList (pycbc.workflow.core.FileList of SegFile instances) – These are representations of the various segment files that were constructed at this stage of the workflow and may be needed at later stages of the analysis (e.g. for performing DQ vetoes). If the file was generated at run-time the segment lists contained within these files will be an attribute of the instance. (If it will be generated in the workflow it will not be because I am not psychic).
pycbc.workflow.splittable module¶
This module is responsible for setting up the splitting output files stage of workflows. For details about this module and its capabilities see here: https://ldas-jobs.ligo.caltech.edu/~cbc/docs/pycbc/NOTYETCREATED.html
-
pycbc.workflow.splittable.
select_splitfilejob_instance
(curr_exe)[source]¶ This function returns an instance of the class that is appropriate for splitting an output file up within a workflow (e.g. splitbank).
Parameters: - curr_exe (string) – The name of the Executable that is being used.
- curr_section (string) – The name of the section storing options for this executable
Returns: exe class – The class that holds the utility functions appropriate for the given Executable. This class must contain * exe_class.create_job() and the job returned by this must contain * job.create_node()
Return type: sub-class of pycbc.workflow.core.Executable
-
pycbc.workflow.splittable.
setup_splittable_dax_generated
(workflow, input_tables, out_dir, tags)[source]¶ Function for setting up the splitting jobs as part of the workflow.
Parameters: - workflow (pycbc.workflow.core.Workflow) – The Workflow instance that the jobs will be added to.
- input_tables (pycbc.workflow.core.FileList) – The input files to be split up.
- out_dir (path) – The directory in which output will be written.
Returns: split_table_outs – The list of split up files as output from this job.
Return type:
-
pycbc.workflow.splittable.
setup_splittable_workflow
(workflow, input_tables, out_dir=None, tags=None)[source]¶ This function aims to be the gateway for code that is responsible for taking some input file containing some table, and splitting it into multiple files containing different parts of that table. For now the only supported operation is using lalapps_splitbank to split a template bank xml file into multiple template bank xml files.
Parameters: - workflow (pycbc.workflow.core.Workflow) – The Workflow instance that the jobs will be added to.
- input_tables (pycbc.workflow.core.FileList) – The input files to be split up.
- out_dir (path) – The directory in which output will be written.
Returns: split_table_outs – The list of split up files as output from this job.
Return type:
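The splitting operation itself is conceptually simple: divide one table's rows into N near-equal contiguous chunks. The sketch below illustrates that partitioning in plain Python; the real jobs instead invoke external executables such as lalapps_splitbank on XML files.

```python
# Illustrative row partitioning; the function name is an assumption.

def split_table(rows, num_parts):
    """Split `rows` into `num_parts` contiguous chunks of near-equal size."""
    base, extra = divmod(len(rows), num_parts)
    out, start = [], 0
    for i in range(num_parts):
        # The first `extra` chunks each take one additional row.
        size = base + (1 if i < extra else 0)
        out.append(rows[start:start + size])
        start += size
    return out
```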
pycbc.workflow.tmpltbank module¶
This module is responsible for setting up the template bank stage of CBC workflows. For details about this module and its capabilities see here: https://ldas-jobs.ligo.caltech.edu/~cbc/docs/pycbc/ahope/template_bank.html
-
pycbc.workflow.tmpltbank.
setup_tmpltbank_dax_generated
(workflow, science_segs, datafind_outs, output_dir, tags=None, link_to_matchedfltr=True, compatibility_mode=False, psd_files=None)[source]¶ Setup template bank jobs that are generated as part of the CBC workflow. This function will add numerous jobs to the CBC workflow using configuration options from the .ini file. The following executables are currently supported:
- lalapps_tmpltbank
- pycbc_geom_nonspin_bank
Parameters: - workflow (pycbc.workflow.core.Workflow) – An instanced class that manages the constructed workflow.
- science_segs (Keyed dictionary of glue.segmentlist objects) – scienceSegs[ifo] holds the science segments to be analysed for each ifo.
- datafind_outs (pycbc.workflow.core.FileList) – The file list containing the datafind files.
- output_dir (path string) – The directory where data products will be placed.
- tags (list of strings) – If given these tags are used to uniquely name and identify output files that would be produced in multiple calls to this function.
- link_to_matchedfltr (boolean, optional (default=True)) – If this option is given, the job valid_times will be altered so that there will be one inspiral file for every template bank and they will cover the same time span. Note that this option must also be given during matched-filter generation to be meaningful.
- psd_files (pycbc.workflow.core.FileList) – The file list containing predefined PSDs, if provided.
Returns: tmplt_banks – The FileList holding the details of all the template bank jobs.
Return type:
-
pycbc.workflow.tmpltbank.
setup_tmpltbank_pregenerated
(workflow, tags=None)[source]¶ Setup CBC workflow to use a pregenerated template bank. The bank given in cp.get(‘workflow’,’pregenerated-template-bank’) will be used as the input file for all matched-filtering jobs. If this option is present, workflow will assume that it should be used and not generate template banks within the workflow.
Parameters: - workflow (pycbc.workflow.core.Workflow) – An instanced class that manages the constructed workflow.
- tags (list of strings) – If given these tags are used to uniquely name and identify output files that would be produced in multiple calls to this function.
Returns: tmplt_banks – The FileList holding the details of the template bank.
Return type:
-
pycbc.workflow.tmpltbank.
setup_tmpltbank_without_frames
(workflow, output_dir, tags=None, independent_ifos=False, psd_files=None)[source]¶ Setup CBC workflow to use a template bank (or banks) that are generated in the workflow, but do not use the data to estimate a PSD, and therefore do not vary over the duration of the workflow. This can either generate one bank that is valid for all ifos at all times, or multiple banks that are valid only for a single ifo at all times (one bank per ifo).
Parameters: - workflow (pycbc.workflow.core.Workflow) – An instanced class that manages the constructed workflow.
- output_dir (path string) – The directory where the template bank outputs will be placed.
- tags (list of strings) – If given these tags are used to uniquely name and identify output files that would be produced in multiple calls to this function.
- independent_ifos (Boolean, optional (default=False)) – If given this will produce one template bank per ifo. If not given there will be one template bank to cover all ifos.
- psd_files (pycbc.workflow.core.FileList) – The file list containing predefined PSDs, if provided.
Returns: tmplt_banks – The FileList holding the details of the template bank(s).
Return type:
-
pycbc.workflow.tmpltbank.
setup_tmpltbank_workflow
(workflow, science_segs, datafind_outs, output_dir=None, psd_files=None, tags=None, return_format=None)[source]¶ Setup template bank section of CBC workflow. This function is responsible for deciding which of the various template bank workflow generation utilities should be used.
Parameters: - workflow (pycbc.workflow.core.Workflow) – An instanced class that manages the constructed workflow.
- science_segs (Keyed dictionary of glue.segmentlist objects) – scienceSegs[ifo] holds the science segments to be analysed for each ifo.
- datafind_outs (pycbc.workflow.core.FileList) – The file list containing the datafind files.
- output_dir (path string) – The directory where data products will be placed.
- psd_files (pycbc.workflow.core.FileList) – The file list containing predefined PSDs, if provided.
- tags (list of strings) – If given these tags are used to uniquely name and identify output files that would be produced in multiple calls to this function.
Returns: tmplt_banks – The FileList holding the details of all the template bank jobs.
Return type:
Module contents¶
This package provides the utilities to construct an inspiral workflow for performing a coincident CBC matched-filter analysis on gravitational-wave interferometer data.