Saboteurs - Reference manual

Logical methods

logical_methods.find_logical_saboteurs(groups, failed_groups)

Identify bad and suspicious elements from group failure data.

Parameters:

groups (dict) – A dict {group_name: [elements in that group]}.
failed_groups (list) – A list [group_name_1, group_name_2, …] of the names of all groups that experimentally failed.

Returns:

A dictionary {'saboteurs': [...], 'suspicious': [...]}, where suspicious is the list of all elements that do not appear in any successful group, and saboteurs is the list of suspicious elements that are also the only suspicious element in at least one group.

Return type:

dict
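
The rule stated under "Returns" can be sketched in plain Python. This is only an illustration of the description above, not the library's actual implementation; the function name sketch_find_logical_saboteurs and the sample data are hypothetical, and the sketch assumes that saboteurs also remain in the suspicious list, as the description suggests.

```python
def sketch_find_logical_saboteurs(groups, failed_groups):
    """Illustrative re-statement of the documented rule (not the library code)."""
    failed = set(failed_groups)
    # Any element seen in at least one successful group is cleared.
    cleared = {
        element
        for name, elements in groups.items()
        if name not in failed
        for element in elements
    }
    all_elements = {e for elements in groups.values() for e in elements}
    suspicious = all_elements - cleared
    saboteurs = set()
    for name, elements in groups.items():
        suspects_here = {e for e in elements if e in suspicious}
        # A lone suspect in a failed group must be responsible for the failure.
        if name in failed and len(suspects_here) == 1:
            saboteurs.update(suspects_here)
    return {"saboteurs": sorted(saboteurs), "suspicious": sorted(suspicious)}


# Hypothetical data: g1 succeeded, g2 and g3 failed.
example_groups = {"g1": ["A", "B"], "g2": ["B", "C"], "g3": ["C", "D"]}
example_result = sketch_find_logical_saboteurs(example_groups, ["g2", "g3"])
```

Here C is the only element of g2 not cleared by a successful group, so it is a saboteur, while D can only be called suspicious.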

logical_methods.design_test_batch(possible_groups, max_saboteurs=1)

Select a subset of the groups that enables identification of bad elements.

Parameters:

possible_groups (dict) – A dictionary {group_name: [elements in that group]}.
max_saboteurs (int) – The maximum number of potential bad elements among all elements in all the groups. A bad element is an element which will make every group that contains it “fail”.

Returns:

selected_groups, error – A tuple consisting of:

- 'selected_groups', a dictionary of groups carefully selected from possible_groups, where keys are group names and values are [elements in group]. The selected groups are such that knowing which of them failed or succeeded is enough information to identify all bad elements in the original possible_groups set.
- 'error', a string explaining what went wrong if the selection failed, otherwise None. If there is an error, selected_groups is an empty list, [].

Return type:

tuple
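
The property required of the selected groups (the pass/fail outcomes suffice to identify the bad elements) can be checked by brute force. The sketch below is not the selection algorithm used by design_test_batch, which this manual does not describe; the helpers failure_pattern and batch_is_identifying are hypothetical names introduced for illustration. It assumes that a group fails exactly when it contains at least one bad element.

```python
from itertools import combinations


def failure_pattern(groups, bad_elements):
    """Names of the groups that would fail if exactly `bad_elements` were bad."""
    bad = set(bad_elements)
    return frozenset(name for name, elements in groups.items() if bad & set(elements))


def batch_is_identifying(groups, max_saboteurs=1):
    """True if every set of at most `max_saboteurs` bad elements yields a unique failure pattern."""
    elements = sorted({e for members in groups.values() for e in members})
    seen_patterns = set()
    for size in range(max_saboteurs + 1):
        for bad in combinations(elements, size):
            pattern = failure_pattern(groups, bad)
            if pattern in seen_patterns:
                # Two different scenarios would look identical experimentally.
                return False
            seen_patterns.add(pattern)
    return True


# Hypothetical batch: each element appears in a distinct pair of groups.
triangle = {"g1": ["A", "B"], "g2": ["B", "C"], "g3": ["A", "C"]}
one_bad = batch_is_identifying(triangle, max_saboteurs=1)
two_bad = batch_is_identifying(triangle, max_saboteurs=2)
```

With at most one bad element, this batch is sufficient. With two, any pair of bad elements makes all three groups fail, so the pairs cannot be told apart.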

logical_methods.plot_batch(groups, ax=None)

Plot a diagram of all groups and the elements they contain.

The groups parameter is a dict {group_name: [elements in the group]}. The ax parameter is a Matplotlib Axes object on which to plot. If none is provided, a new ax is created and returned at the end.

logical_methods.generate_batch_report(groups, target='@memory', group_naming='group', plot_format='pdf')

Generate a report with a CSV file and a plot describing a batch of groups.

Parameters:

groups (OrderedDict) – An (ordered) dict {group_name: [elements in the group]}.
target (str) – Path to a folder or to a zip file, or “@memory” to return the raw data of a zip file containing the report.
group_naming (str) – Word that will replace “group” in the report, e.g. “assembly”, “team”, etc.
plot_format (str) – Format of the plot (pdf, png, jpeg, etc).
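
To make the “@memory” target and the group_naming parameter more concrete, here is a minimal standard-library sketch of building a zip archive in memory and returning its raw bytes. It is not the library's report generator: the function name sketch_batch_report_zip, the file name batch.csv, and the CSV layout are assumptions made for illustration, and no plot is produced.

```python
import csv
import io
import zipfile


def sketch_batch_report_zip(groups, group_naming="group"):
    """Return the raw bytes of a zip holding one CSV that lists the groups."""
    csv_buffer = io.StringIO()
    writer = csv.writer(csv_buffer)
    # The header uses the custom word (e.g. "assembly") instead of "group".
    writer.writerow([group_naming, "elements"])
    for name, elements in groups.items():
        writer.writerow([name, " ".join(elements)])
    zip_buffer = io.BytesIO()
    with zipfile.ZipFile(zip_buffer, "w") as archive:
        # "batch.csv" is a hypothetical file name.
        archive.writestr("batch.csv", csv_buffer.getvalue())
    return zip_buffer.getvalue()


zip_data = sketch_batch_report_zip({"g1": ["A", "B"]}, group_naming="assembly")
with zipfile.ZipFile(io.BytesIO(zip_data)) as archive:
    csv_text = archive.read("batch.csv").decode()
```

The returned bytes can be written to disk or sent over a network without ever creating a temporary file.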

Statistical methods

statistical_methods.find_statistical_saboteurs(groups_data, pvalue_threshold=0.1, effect_threshold=0, max_significant_members=10)

Return statistics on possible bad elements in the data.

Parameters:

groups_data – Result of csv_to_groups_data().
pvalue_threshold – Only failure-associated elements with a p-value below this threshold will be included in the final statistics.

statistical_methods.statistics_report(analysis_results, outfile, replacements=())

Produce a PDF report from the results of find_statistical_saboteurs().

Parameters:

analysis_results (dict) – The result of saboteurs.find_statistical_saboteurs.
outfile (str or file-like object) – Path to the final PDF file, or file-like object, or ‘@memory’ to return binary data of the PDF report.
replacements (list of tuples) – A list of the form [("text_to_replace", "text_replacing"), ...].
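
The shape of the replacements argument can be illustrated with a small sketch. How statistics_report applies the pairs internally is not stated here; the hypothetical helper apply_replacements below assumes plain substring substitution, applied in list order.

```python
def apply_replacements(text, replacements=()):
    """Apply each (text_to_replace, text_replacing) pair in order."""
    for old, new in replacements:
        text = text.replace(old, new)
    return text


# Hypothetical report sentence, reworded for a DNA assembly context.
reworded = apply_replacements(
    "Bad members found in 3 groups",
    [("members", "parts"), ("groups", "assemblies")],
)
```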

Tools

tools.csv_to_groups_data(csv_string=None)

Read a CSV to get the data to feed to find_statistical_saboteurs() or find_logical_saboteurs().

See examples of such a file in the code repository: https://github.com/Edinburgh-Genome-Foundry/saboteurs/

Returns:

groups, failed_groups (tuple) – Returned for datasheets intended for logical saboteur finding.

groups_data (dict) – Returned for datasheets intended for statistical saboteur finding. The data is of the form:

    {
        "Exp. 1": {
            "exp_id": "Exp. 1",
            "attempts": 10,
            "failures": 7,
            "members": ["Alice", "Bob"],
        },
        "Exp. 2": { etc...
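
As a quick way to get familiar with this structure, the sketch below walks a groups_data dict and computes a naive per-member failure rate. This is not the analysis performed by find_statistical_saboteurs; the helper failure_rates_by_member and the sample data are hypothetical.

```python
from collections import defaultdict


def failure_rates_by_member(groups_data):
    """Pool attempts and failures over every experiment each member took part in."""
    totals = defaultdict(lambda: {"attempts": 0, "failures": 0})
    for record in groups_data.values():
        for member in record["members"]:
            totals[member]["attempts"] += record["attempts"]
            totals[member]["failures"] += record["failures"]
    return {
        member: counts["failures"] / counts["attempts"]
        for member, counts in totals.items()
        if counts["attempts"] > 0
    }


# Hypothetical data in the documented form.
sample_data = {
    "Exp. 1": {"exp_id": "Exp. 1", "attempts": 10, "failures": 7, "members": ["Alice", "Bob"]},
    "Exp. 2": {"exp_id": "Exp. 2", "attempts": 10, "failures": 1, "members": ["Bob", "Carol"]},
}
rates = failure_rates_by_member(sample_data)
```

Such pooled rates ignore confounding between members who often appear together, which is presumably why a statistical method with p-values is offered instead.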