BandWitch Reference manual¶
DigestionProblem¶
-
bandwitch.DigestionProblem.
DigestionProblem
¶
-
class
bandwitch.DigestionProblem.
IdealDigestionsProblem
(enzymes, ladder, sequences, min_bands=3, max_bands=7, border_tolerance=0.1, topology='default_to_linear', max_enzymes_per_digestion=1, relative_migration_precision=0.1)[source]¶ Find ideal digestion(s) to validate constructs.
Other digestion problems subclass this problem and implement a computation of coverages which depends on the problem.
- Parameters
- sequences
An (ordered) dictionary of the form {sequence_name: sequence} where the sequence is an ATGC string
- enzymes
List of the names of the enzymes to consider, e.g. ['EcoRI', 'XbaI'].
- ladder
A Ladder object representing the ladder used for migrations.
- linear
True for linear sequences, False for circular sequences.
- max_enzymes_per_digestion
Maximum number of enzymes that can go in a single digestion. Experimentally, 1 works best, but you can try 2, or 3 in desperate situations. It is unclear whether more enzymes will work.
- relative_migration_error
Variance of the bands measured during the migration, given as a proportion of the total migration span (difference between the migration of the ladder’s smallest and largest bands).
-
class
bandwitch.DigestionProblem.
SeparatingDigestionsProblem
(enzymes, ladder, sequences=None, categories=None, topology='default_to_linear', max_enzymes_per_digestion=1, min_discrepancy='auto', relative_migration_precision=0.1)[source]¶ Problem: find best digestion(s) to identify constructs.
Given a set of constructs (possibly from a combinatorial assembly), find a list of digestions such that every pair of constructs has different migration patterns for at least one digestion of the list.
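The selection described above can be read as a set-cover problem: each digestion "covers" the construct pairs it separates, and the goal is to cover every pair. The sketch below uses a greedy heuristic on a hypothetical `separated_pairs` dictionary. Both the data layout and the greedy strategy are assumptions made for illustration, not necessarily what the library does internally.

```python
from itertools import combinations


def greedy_separating_digestions(separated_pairs, constructs):
    """Greedily pick digestions until every construct pair is separated.

    separated_pairs is a hypothetical dict {digestion: set of frozensets},
    listing the construct pairs that each digestion can tell apart.
    Returns a list of digestions, or None if some pair cannot be separated.
    """
    uncovered = {frozenset(pair) for pair in combinations(constructs, 2)}
    selected = []
    while uncovered:
        # Pick the digestion separating the most still-unseparated pairs.
        best = max(
            separated_pairs,
            key=lambda digestion: len(separated_pairs[digestion] & uncovered),
        )
        gain = separated_pairs[best] & uncovered
        if not gain:
            return None  # no remaining digestion separates the leftover pairs
        selected.append(best)
        uncovered -= gain
    return selected


constructs = ["A", "B", "C"]
separated_pairs = {
    ("EcoRI",): {frozenset("AB")},
    ("XbaI",): {frozenset("AC"), frozenset("BC")},
    ("MfeI",): {frozenset("AB"), frozenset("AC")},
}
selection = greedy_separating_digestions(separated_pairs, constructs)
```

A greedy choice is simple and usually close to optimal on small inputs, but it does not guarantee the smallest possible list of digestions.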
-
plot_distances_map
(digestions, ax=None, target_file=None)[source]¶ Plot how well the digestions separate each construct pair.
- Parameters
- digestions
A list of digestions, e.g. [('EcoRV',), ('XbaI', 'MfeI')].
- ax
A Matplotlib ax on which to plot. If none is provided, one is created and returned at the end.
- target_file
The name of the (PNG, SVG, JPEG…) file in which to write the plot.
- Returns
- axes
The axes of the generated figure (if a target file is written to, the figure is closed and None is returned instead).
-
-
bandwitch.DigestionProblem.
SetCoverProblem
¶
Clone Observations¶
Note: the classes in this module have a complicated organization, mostly due to the history of this module and the heterogeneity of the sources of data necessary for clone validation. It may get better in the future.
Bands Observations¶
-
class
bandwitch.ClonesObservations.
BandsObservation
(name, bands, ladder, migration_image=None)[source]¶ One observation of a bands pattern.
- Parameters
- name
Name of the observation (used for the top label in plots)
- bands
A BandPattern object
- ladder
A BandPattern object representing a ladder.
- migration_image
Optional RGB array (HxWx3) representing the gel “image” (which will be displayed on the side of the plot).
-
static
from_aati_fa_archive
(archive_path, min_rfu_size_ratio=0.3, ignore_bands_under=None, direction='column')[source]¶ Return a dictionary of all band observations in AATI output files.
- Parameters
- archive_path
A path to a ZIP file containing all files output by the AATI fragment analyzer.
- min_rfu_size_ratio
Cut-off ratio to filter-out bands whose intensity is below some threshold. The higher the value, the more bands will be filtered out.
- Returns
- A dictionary {'A1': BandsObservation(), 'A2': ...} containing the measured pattern information for a whole 96-well microplate.
-
patterns_discrepancy
(other_bands, relative_tolerance=0.1, min_band_cutoff=None, max_band_cutoff=None)[source]¶ Return the maximal discrepancy between two band patterns.
The discrepancy is defined as the largest distance between a band in one pattern and the closest band in the other pattern.
- Parameters
- other_bands
A list of bands (integers) to be compared with the current bands
- relative_tolerance
Tolerance, as a ratio of the full ladder span. If set to 0.1, the discrepancy will have a value of 1 when a band’s nearest correspondent in the other pattern is more than 10% of the ladder span apart.
- min_band_cutoff
Discrepancies involving at least one band below this minimal band size will be ignored. By default, it will be set to the smallest band size in the ladder.
- max_band_cutoff
Discrepancies involving at least one band above this maximal band size will be ignored. By default, it will be set to the largest band size in the ladder.
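To make the definition of the discrepancy concrete, here is a minimal sketch. It assumes that distances are measured directly on band sizes and divided by `relative_tolerance` times the ladder span. The library may work on migration distances instead, and the cutoff parameters are left out. The function name and arguments are hypothetical, and both patterns are assumed to be non-empty.

```python
def max_discrepancy(bands_a, bands_b, ladder_min, ladder_max,
                    relative_tolerance=0.1):
    """Largest nearest-band distance between two patterns, in tolerance units.

    Assumption: distances are plain differences of band sizes, normalized by
    relative_tolerance * (ladder span). Both patterns must be non-empty.
    """
    span = ladder_max - ladder_min

    def nearest_distance(band, others):
        # Distance from one band to the closest band of the other pattern.
        return min(abs(band - other) for other in others)

    distances = [nearest_distance(band, bands_b) for band in bands_a]
    distances += [nearest_distance(band, bands_a) for band in bands_b]
    return max(distances) / (relative_tolerance * span)


observed = [500, 1500]
expected = [520, 1400, 3000]
discrepancy = max_discrepancy(observed, expected, ladder_min=100, ladder_max=4000)
```

In this example the expected 3000 bp band has no observed counterpart closer than 1500 bp away, so it alone determines the result, which is well above 1.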
Clone¶
-
class
bandwitch.ClonesObservations.
Clone
(name, digestions, construct_id=None)[source]¶ Gather all the information necessary to validate a clone.
- Parameters
- name
Name of the clone. Could be for instance a microplate well.
- digestions
A dictionary {digestion: CloneObservation} where digestion is of the form ('EcoRI', 'BamHI').
- construct_id
ID of the construct to be validated. This is used to group clones by construct in the validation reports.
-
validate_bands
(bands_by_digestion, relative_tolerance=0.1, min_band_cutoff=None, max_band_cutoff=None)[source]¶ Return a validation result (comparison of observed and expected).
The result is a CloneValidation object.
- Parameters
- bands_by_digestion
A dictionary {digestion: [bands]} where digestion is of the form ('EcoRI', 'BamHI'), and bands is a list of band sizes.
- relative_tolerance
Tolerance, as a ratio of the full ladder span. If set to 0.1, the discrepancy will have a value of 1 when a band’s nearest correspondent in the other pattern is more than 10% of the ladder span apart.
- min_band_cutoff
Discrepancies involving at least one band below this minimal band size will be ignored. By default, it will be set to the smallest band size in the ladder.
- max_band_cutoff
Discrepancies involving at least one band above this maximal band size will be ignored. By default, it will be set to the largest band size in the ladder.
Clone Observations¶
-
class
bandwitch.ClonesObservations.
ClonesObservations
(clones, constructs_records, partial_cutters=())[source]¶ All the useful information for a collection of clones to be validated.
- Parameters
- clones
Either a list of Clones (each with a unique name) or a dictionary {name: Clone}.
- constructs_records
A dictionary {construct_name: biopython_record} indicating the sequence of the different constructs. For each construct, set the attribute construct.linear = False if the construct is circular.
-
get_clone_digestion_bands
(clone, construct_id=None)[source]¶ Return {digestion: bands} for all digestions of this clone.
-
get_digestion_bands_for_construct
(construct_id, digestion)[source]¶ Return the bands resulting from the digestion of the construct.
This function enables some memoization (no digestion is computed twice).
- Parameters
- construct_id
ID of the construct as it appears in this object’s constructs_records.
- digestion
For instance ('BamHI', 'XbaI').
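The memoization mentioned above can be pictured as a dictionary keyed by (construct_id, digestion). The sketch below is only an illustration of the idea: the class, its attributes, and the `fake_digest` function are hypothetical stand-ins and are not part of the library.

```python
class MemoizedDigestions:
    """Cache digestion results so that each (construct, digestion) pair
    is computed at most once. Hypothetical helper, for illustration only."""

    def __init__(self, compute_bands):
        self._compute_bands = compute_bands  # callable(construct_id, digestion)
        self._cache = {}
        self.computations = 0  # number of real (non-cached) computations

    def get_bands(self, construct_id, digestion):
        key = (construct_id, tuple(digestion))
        if key not in self._cache:
            self.computations += 1
            self._cache[key] = self._compute_bands(construct_id, digestion)
        return self._cache[key]


def fake_digest(construct_id, digestion):
    # Stand-in for a real digestion; returns an arbitrary list of band sizes.
    return [100 * len(construct_id) * len(digestion)]


cache = MemoizedDigestions(fake_digest)
first = cache.get_bands("construct_1", ("BamHI", "XbaI"))
second = cache.get_bands("construct_1", ("BamHI", "XbaI"))  # served from cache
```

Converting the digestion to a tuple makes the key hashable even if the caller passes a list.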
-
identify_all_clones
(relative_tolerance=0.05, min_band_cutoff=None, max_band_cutoff=None)[source]¶ Return {clone: {construct_id: CloneValidation}} for all clones.
- Parameters
- relative_tolerance
Tolerance, as a ratio of the full ladder span. If set to 0.1, the discrepancy will have a value of 1 when a band’s nearest correspondent in the other pattern is more than 10% of the ladder span apart.
- min_band_cutoff
Discrepancies involving at least one band below this minimal band size will be ignored. By default, it will be set to the smallest band size in the ladder.
- max_band_cutoff
Discrepancies involving at least one band above this maximal band size will be ignored. By default, it will be set to the largest band size in the ladder.
-
static
identify_bad_parts
(validations, constructs_parts, constructs_records=None, report_target=None, extra_failures=None)[source]¶ Identifies parts associated with failure in the validations.
Uses the Saboteurs library: https://github.com/Edinburgh-Genome-Foundry/saboteurs
- Parameters
- validations
validations results
- constructs_parts
Either a dict {construct_id: [list, of, part, names]} or a function (biopython_record => [list, of, part, names])
- report_target
Can be a path to a file or a file-like object where to write a PDF report. Can also be “@memory”, in which case the raw binary PDF data is returned.
- Returns
- analysis, pdf_data
Where analysis is the result of the sabotage analysis (see the saboteurs library), and pdf_data is None unless report_target is set to “@memory” (see above).
-
partial_digests_analysis
(relative_tolerance=0.05)[source]¶ Compute good clones under different partial digest assumptions.
Returns a dictionary {partial: {'valid_clones': 60, 'label': 'x'}} where, for a given scenario, partial is a tuple of all enzymes considered to have partial activity in this scenario (it defines the scenario), valid_clones is the number of good clones under this assumption, and label is a string representation of all enzymes involved, with partial-activity enzymes in parentheses.
This result can be fed to the ClonesObservations.plot_partial_digests_analysis method for plotting.
-
plot_all_validations_patterns
(validations, target=None, per_digestion_discrepancy=False)[source]¶ Plot a graphic report of the gel validation.
The report displays the patterns, with green and red backgrounds depending on whether they passed the validation test.
- Parameters
- target_file
File object or file path where to save the figure. If provided, the function returns None; otherwise it returns the axes array of the final figure.
- relative_tolerance
Relative error tolerated on each band for the observed patterns to be considered similar to the expected patterns.
- min_band_cutoff
Bands with a size below this value will not be considered
- max_band_cutoff
Bands with a size above this value will not be considered
-
static
plot_partial_digests_analysis
(analysis_results, ax=None)[source]¶ Plot partial digests analysis results.
- Parameters
- analysis_results
Results from ClonesObservations.partial_digests_analysis.
- ax
A Matplotlib ax. If none, one is created and returned at the end.
-
plot_validations_plate_map
(validations, target=None, ax=None)[source]¶ Plot a map of the plate with passing/failing wells in green/red.
-
validate_all_clones
(relative_tolerance=0.05, min_band_cutoff=None, max_band_cutoff=None)[source]¶ Return {clone: CloneValidation} for all clones.
-
validations_summary
(validations, sort_clones_by_score=True)[source]¶ Return {construct_id: [CloneValidation, ...]}.
Each construct is mapped to a list of the validations of all clones associated with that construct, sorted from the best-scoring to the least-scoring.
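The grouping and sorting described above can be sketched as follows. The `Validation` named tuple is a hypothetical stand-in for CloneValidation, and the convention that a higher score is better is an assumption made for this example.

```python
from collections import defaultdict, namedtuple

# Hypothetical stand-in for CloneValidation, with only the fields needed here.
Validation = namedtuple("Validation", ["clone_name", "construct_id", "score"])


def summarize(validations, sort_clones_by_score=True):
    """Group {clone_name: Validation} by construct, best score first.

    Assumption: a higher score means a better validation.
    """
    summary = defaultdict(list)
    for validation in validations.values():
        summary[validation.construct_id].append(validation)
    if sort_clones_by_score:
        for group in summary.values():
            group.sort(key=lambda validation: validation.score, reverse=True)
    return dict(summary)


validations = {
    "A1": Validation("A1", "construct_1", 0.4),
    "A2": Validation("A2", "construct_1", 0.9),
    "B1": Validation("B1", "construct_2", 0.7),
}
summary = summarize(validations)
best_clone_for_construct_1 = summary["construct_1"][0].clone_name
```

With this layout, the first element of each list is the clone most worth picking for that construct.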
-
write_identification_report
(target_file=None, relative_tolerance=0.05, min_band_cutoff=None, max_band_cutoff=None)[source]¶ Plot a graphic report of the gel validation.
The report displays the patterns, with green and red backgrounds depending on whether they passed the validation test.
- Parameters
- target_file
File object or file path where to save the figure. If provided, the function returns None; otherwise it returns the axes array of the final figure.
- relative_tolerance
Relative error tolerated on each band for the observed patterns to be considered similar to the expected patterns.
- min_band_cutoff
Bands with a size below this value will not be considered
- max_band_cutoff
Bands with a size above this value will not be considered
Ladder¶
Bands Predictions¶
Module for digestion pattern prediction.
-
bandwitch.bands_predictions.
compute_sequence_digestions_migrations
(sequences_digestions, ladder)[source]¶ Add a ‘migration’ field to the data in sequences_digestions.
sequences_digestions is a dict {seq: digestions_data} where digestions_data is a result of predict_sequence_digestions, and ladder is a Ladder object.
-
bandwitch.bands_predictions.
predict_digestion_bands
(sequence, enzymes, linear=True, partial_cutters=())[source]¶ Return the band sizes from digestion by all enzymes at once.
Returns a list of band sizes sorted from smallest to largest.
- Parameters
- sequence
Sequence to be digested. Either a string (“ATGC…”) or a BioPython Seq
- enzymes
list of all enzymes placed at the same time in the digestion mix e.g. [“EcoRI”, “BamHI”]
- linear
True if the DNA fragment is linearized, False if it is circular
- partial_cutters
List of enzymes that are tired or inhibited and would therefore randomly miss some sites, thus creating extra bands in the final digest.
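The difference between linear and circular digestion comes down to how cut positions are turned into fragment lengths. The sketch below works on a hypothetical list of cut positions and ignores partial cutters. How the positions are obtained (for instance with a restriction-analysis library) is outside its scope, and the function name is an assumption made for this example.

```python
def bands_from_cuts(sequence_length, cuts, linear=True):
    """Return sorted fragment sizes for the given cut positions.

    cuts is a list of 0-based cut positions along the sequence
    (hypothetical input; partial digestion is not modeled here).
    """
    cuts = sorted(set(cuts))
    if not cuts:
        return [sequence_length]  # uncut molecule: a single band
    if linear:
        # The two ends of a linear molecule act as extra fragment boundaries.
        edges = [0] + cuts + [sequence_length]
        bands = [end - start for start, end in zip(edges, edges[1:])]
    else:
        # On a circle, the fragment after the last cut wraps to the first cut.
        bands = [end - start for start, end in zip(cuts, cuts[1:])]
        bands.append(sequence_length - cuts[-1] + cuts[0])
    return sorted(band for band in bands if band > 0)


linear_bands = bands_from_cuts(1000, [200, 700], linear=True)
circular_bands = bands_from_cuts(1000, [200, 700], linear=False)
```

Two cuts give three fragments on a linear molecule but only two on a circular one, which is why the topology of each construct matters for the predicted pattern.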
-
bandwitch.bands_predictions.
predict_sequence_digestions
(sequence, enzymes, linear=True, max_enzymes_per_digestion=1)[source]¶ Return a dict giving the band size patterns for all possible digestions.
The digestions, double-digestions, etc. are listed and for each the sequence band sizes are computed.
The result is of the form {digestion: {'cuts': [], 'bands': []}} where digestion is a tuple of enzyme names, e.g. ('EcoRI', 'XbaI'), ‘cuts’ is a list of cut locations, and ‘bands’ is a list of band sizes.
- Parameters
- sequence
The sequence to be digested
- enzymes
List of all enzymes to be considered
- max_enzymes_per_digestion
Maximum number of enzymes allowed in one digestion
- bands_to_migration
Function associating a migration distance to a band size. If provided, each digestion will have a
'migration'
field (list of migration distances) in addition to ‘cuts’ and ‘bands’.
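The set of "all possible digestions" for a given max_enzymes_per_digestion can be enumerated with enzyme combinations of increasing size. This is a minimal sketch of that enumeration only. The helper name is hypothetical, and the library may order or filter digestions differently.

```python
from itertools import combinations


def list_digestions(enzymes, max_enzymes_per_digestion=1):
    """List every digestion (tuple of enzyme names) using 1 to
    max_enzymes_per_digestion enzymes. Hypothetical helper."""
    return [
        digestion
        for size in range(1, max_enzymes_per_digestion + 1)
        for digestion in combinations(enzymes, size)
    ]


digestions = list_digestions(["EcoRI", "XbaI", "MfeI"], max_enzymes_per_digestion=2)
```

The number of digestions grows quickly with the number of enzymes and the maximum digestion size, so small values of max_enzymes_per_digestion keep the search manageable.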