BandWitch Reference manual

DigestionProblem

graph TD
SC[SetCoverProblem] --> DP[DigestionProblem]
DP --> SDP[SeparatingDigestionProblem]
DP --> IDP[IdealDigestionProblem]
bandwitch.DigestionProblem.DigestionProblem

alias of bandwitch.DigestionProblem.DigestionProblem

class bandwitch.DigestionProblem.IdealDigestionsProblem(enzymes, ladder, sequences, min_bands=3, max_bands=7, border_tolerance=0.1, topology='default_to_linear', max_enzymes_per_digestion=1, relative_migration_precision=0.1)[source]

Find ideal digestion(s) to validate constructs.

Other digestion problems subclass this problem and implement a computation of coverages which depends on the problem.

Parameters
sequences

An (ordered) dictionary of the form {sequence_name: sequence} where the sequence is an ATGC string

enzymes

List of the names of the enzymes to consider, e.g. ['EcoRI', 'XbaI'].

ladder

A Ladder object representing the ladder used for migrations.

linear

True for linear sequences, False for circular sequences.

max_enzymes_per_digestion

Maximal number of enzymes that can go in a single digestion. Experimentally, 1 works best, but you can try 2, or 3 in desperate situations. It is unclear whether digestions with more enzymes will work.

relative_migration_precision

Variance of the bands measured during the migration, given as a proportion of the total migration span (difference between the migration of the ladder’s smallest and largest bands).

migration_score(band_migrations)[source]

Score the well-numbering and well-separation of all bands.

If some bands are too high or too low, or the number of bands is out of bounds, return 0. Else, return the minimal distance between two consecutive bands.
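
The scoring rule above can be sketched in a few lines of Python. This is an illustrative sketch only, not BandWitch's implementation: the function name migration_score_sketch and the explicit min_migration and max_migration bounds are assumptions made for the example (in the library the acceptable range presumably depends on the ladder and on border_tolerance).

```python
def migration_score_sketch(band_migrations, min_bands=3, max_bands=7,
                           min_migration=0.0, max_migration=1.0):
    """Return 0 for an unacceptable pattern, else the smallest gap between bands.

    Hypothetical helper: the migration bounds are passed explicitly here,
    which is an assumption made for illustration.
    """
    migrations = sorted(band_migrations)
    # Reject patterns with too few or too many bands.
    if not (min_bands <= len(migrations) <= max_bands):
        return 0
    # Reject patterns with a band too high or too low on the gel.
    if migrations[0] < min_migration or migrations[-1] > max_migration:
        return 0
    # Smallest distance between two consecutive bands.
    # A single band has no gap, and this sketch scores it 0.
    gaps = (b - a for a, b in zip(migrations, migrations[1:]))
    return min(gaps, default=0)


well_separated = migration_score_sketch([0.1, 0.4, 0.9])
too_few_bands = migration_score_sketch([0.1, 0.4])
```

A higher score means the closest pair of bands is further apart, so the pattern is easier to read on a gel.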

class bandwitch.DigestionProblem.SeparatingDigestionsProblem(enzymes, ladder, sequences=None, categories=None, topology='default_to_linear', max_enzymes_per_digestion=1, min_discrepancy='auto', relative_migration_precision=0.1)[source]

Problem: find best digestion(s) to identify constructs.

Provided a set of constructs (possibly from a combinatorial assembly), find a list of digestions such that all pairs of constructs have different migration patterns for at least one digestion of the list.
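
Since this class derives from a set cover problem, the selection of digestions can be pictured as covering every pair of constructs with at least one digestion that separates it. The sketch below uses a greedy heuristic, which is an assumption chosen for illustration and not necessarily the strategy BandWitch uses. The function name and the separated_pairs input (which pairs each digestion can tell apart) are hypothetical.

```python
from itertools import combinations


def greedy_separating_digestions(separated_pairs, constructs):
    """Pick digestions until every pair of constructs is separated.

    separated_pairs is a hypothetical dict {digestion: set of frozenset pairs}
    listing the construct pairs that each digestion can tell apart.
    Returns a list of digestions, or None if some pair cannot be separated.
    """
    uncovered = {frozenset(pair) for pair in combinations(constructs, 2)}
    selected = []
    while uncovered:
        # Choose the digestion separating the most still-unseparated pairs.
        best = max(
            separated_pairs,
            key=lambda d: len(separated_pairs[d] & uncovered),
            default=None,
        )
        gained = separated_pairs[best] & uncovered if best is not None else set()
        if not gained:
            return None  # no digestion separates the remaining pairs
        selected.append(best)
        uncovered -= gained
    return selected


pairs = {
    ("EcoRI",): {frozenset(("A", "B")), frozenset(("A", "C"))},
    ("XbaI",): {frozenset(("B", "C"))},
    ("BamHI",): {frozenset(("A", "B"))},
}
selection = greedy_separating_digestions(pairs, ["A", "B", "C"])
```

Here the first pick separates A from both B and C, and a second digestion is needed only for the pair (B, C).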

plot_distances_map(digestions, ax=None, target_file=None)[source]

Plot how well the digestions separate each construct pair.

Parameters
digestions

A list of digestions, e.g. [('EcoRV',), ('XbaI', 'MfeI')].

ax

A matplotlib ax on which to plot. If None is provided, one is created and returned at the end.

target_file

The name of the (PNG, SVG, JPEG…) file in which to write the plot.

Returns
axes

The axes of the generated figure (if a target file is written to, the figure is closed and None is returned instead).

bandwitch.DigestionProblem.SetCoverProblem

alias of bandwitch.DigestionProblem.SetCoverProblem

Clone Observations

graph TD;
aati[AATI fragment analysis file] -- parser --> bobs[BandsObservation-s];
bobs --> clone[Clone-s]
bobs --> clone
digestions[digestion enzymes infos] -- parser --> clone
clone --> ClonesObservations
clone --> ClonesObservations
cs[constructs sequences] --> ClonesObservations
ClonesObservations --pdf generator--> reports
style aati stroke:none, fill: none
style digestions stroke:none, fill: none
style cs stroke:none, fill: none
style reports stroke:none, fill: none

Note: the classes in this module have a complicated organization, mostly due to the history of this module and the heterogeneity of the sources of data necessary for clone validation. It may get better in the future.

Bands Observations

class bandwitch.ClonesObservations.BandsObservation(name, bands, ladder, migration_image=None)[source]

One observation of a bands pattern.

Parameters
name

Name of the observation (used for the top label in plots)

bands

A BandPattern object

ladder

A BandPattern object representing a ladder.

migration_image

Optional RGB array (HxWx3) representing the gel “image” (which will be displayed on the side of the plot).

static from_aati_fa_archive(archive_path, min_rfu_size_ratio=0.3, ignore_bands_under=None, direction='column')[source]

Return a dictionary of all band observations in AATI output files.

Parameters
archive_path

A path to a ZIP file containing all files output by the AATI fragment analyzer.

min_rfu_size_ratio

Cut-off ratio to filter-out bands whose intensity is below some threshold. The higher the value, the more bands will be filtered out.

Returns
A dictionary {'A1': BandsObservation(), 'A2': ...} containing the measured pattern information for a whole 96-well microplate.

patterns_discrepancy(other_bands, relative_tolerance=0.1, min_band_cutoff=None, max_band_cutoff=None)[source]

Return the maximal discrepancy between two band patterns.

The discrepancy is defined as the largest distance between a band in one pattern and the closest band in the other pattern.

Parameters
other_bands

A list of bands (integers) to be compared with the current bands

relative_tolerance

Tolerance, as a ratio of the full ladder span. If =0.1, then the discrepancy will have a value of 1 when a band’s nearest correspondent in the other pattern is more than 10% of the ladder span apart.

min_band_cutoff

Discrepancies involving at least one band below this minimal band size will be ignored. By default, it will be set to the smallest band size in the ladder.

max_band_cutoff

Discrepancies involving at least one band above this maximal band size will be ignored. By default, it will be set to the largest band size in the ladder.
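
The definition of the discrepancy can be illustrated with a short sketch. This is not the library's code: the function name is hypothetical, the band cutoffs are left out, and it assumes that both patterns are already given as migration distances and that the result is normalised so that 1 corresponds to relative_tolerance times the ladder span.

```python
def patterns_discrepancy_sketch(migrations_a, migrations_b, ladder_span,
                                relative_tolerance=0.1):
    """Largest nearest-neighbour distance between two patterns.

    Hypothetical helper: inputs are migration distances, and the result is
    expressed in units of (relative_tolerance * ladder_span). Both choices
    are assumptions made for illustration.
    """
    if not migrations_a or not migrations_b:
        raise ValueError("both patterns need at least one band")

    def one_way(source, target):
        # For each band in source, distance to the closest band in target;
        # keep the worst (largest) of these distances.
        return max(min(abs(s - t) for t in target) for s in source)

    # Take the worst case in both directions so that an extra band in
    # either pattern is detected.
    largest = max(one_way(migrations_a, migrations_b),
                  one_way(migrations_b, migrations_a))
    return largest / (relative_tolerance * ladder_span)


observed = [10, 50]
expected = [12, 50, 80]
discrepancy = patterns_discrepancy_sketch(observed, expected, ladder_span=100)
```

In this example the expected band at 80 has no observed counterpart closer than 30 units, which dominates the result.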

to_bandwagon_bandpattern(background_color=None, label='auto')[source]

Return a pattern version for the plotting library Bandwagon.

If label is left to ‘auto’, it will be the pattern’s name.

Clone

class bandwitch.ClonesObservations.Clone(name, digestions, construct_id=None)[source]

Gather all information necessary to validate a clone.

Parameters
name

Name of the clone. Could be for instance a microplate well.

digestions

A dictionary {digestion: CloneObservation} where digestion is of the form ('EcoRI', 'BamHI').

construct_id

ID of the construct to be validated. This is used to group clones by construct in the validation reports.

validate_bands(bands_by_digestion, relative_tolerance=0.1, min_band_cutoff=None, max_band_cutoff=None)[source]

Return the validation result (comparison of observed and expected bands).

The result is a CloneValidation object.

Parameters
bands_by_digestion

A dictionary {digestion: [bands]} where digestion is of the form ('EcoRI', 'BamHI'), and bands is a list of band sizes.

relative_tolerance

Tolerance, as a ratio of the full ladder span. If =0.1, then the discrepancy will have a value of 1 when a band’s nearest correspondent in the other pattern is more than 10% of the ladder span apart.

min_band_cutoff

Discrepancies involving at least one band below this minimal band size will be ignored. By default, it will be set to the smallest band size in the ladder.

max_band_cutoff

Discrepancies involving at least one band above this maximal band size will be ignored. By default, it will be set to the largest band size in the ladder.

CloneValidation

bandwitch.ClonesObservations.CloneValidation

alias of bandwitch.ClonesObservations.CloneValidation

Clone Observations

class bandwitch.ClonesObservations.ClonesObservations(clones, constructs_records, partial_cutters=())[source]

All useful information for a collection of clones to be validated.

Parameters
clones

Either a list of Clones (each with a unique name) or a dictionary {name: Clone}.

constructs_records

A dictionary {construct_name: biopython_record} indicating the sequence of the different constructs. For each construct, set the attribute construct.linear = False if the construct is circular.

get_clone_digestion_bands(clone, construct_id=None)[source]

Return {digestion: bands} for all digestions of this clone.

get_digestion_bands_for_construct(construct_id, digestion)[source]

Return the bands resulting from the digestion of the construct.

This function enables some memoization (no digestion is computed twice).

Parameters
construct_id

ID of the construct as it appears in this object’s constructs_records.

digestion

For instance ('BamHI', 'XbaI')

identify_all_clones(relative_tolerance=0.05, min_band_cutoff=None, max_band_cutoff=None)[source]

Return {clone: {construct_id: CloneValidation}} for all clones.

Parameters
relative_tolerance

Tolerance, as a ratio of the full ladder span. If =0.1, then the discrepancy will have a value of 1 when a band’s nearest correspondent in the other pattern is more than 10% of the ladder span apart.

min_band_cutoff

Discrepancies involving at least one band below this minimal band size will be ignored. By default, it will be set to the smallest band size in the ladder.

max_band_cutoff

Discrepancies involving at least one band above this maximal band size will be ignored. By default, it will be set to the largest band size in the ladder.

static identify_bad_parts(validations, constructs_parts, constructs_records=None, report_target=None, extra_failures=None)[source]

Identifies parts associated with failure in the validations.

Uses the Saboteurs library: https://github.com/Edinburgh-Genome-Foundry/saboteurs

Parameters
validations

validations results

constructs_parts

Either a dict {construct_id: [list, of, part, names]} or a function (biopython_record => [list, of, part, names])

report_target

Can be a path to a file or a file-like object where to write a PDF report. Can also be “@memory”, in which case the raw binary PDF data is returned.

Returns
analysis, pdf_data

Where analysis is the result of sabotage analysis (see the saboteurs library), and pdf_data is None unless report_target is set to “@memory” (see above).

partial_digests_analysis(relative_tolerance=0.05)[source]

Compute good clones under different partial digest assumptions.

Returns a dictionary {partial: {'valid_clones': 60, 'label': 'x'}}. For a given scenario, partial is a tuple of all enzymes considered to have partial activity in this scenario (it defines the scenario), valid_clones is the number of good clones under this assumption, and label is a string representation of all enzymes involved, with partial-activity enzymes in parentheses.

This result can be fed to ClonesObservations’s .plot_partial_digests_analysis method for plotting.
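
To make the notion of a scenario concrete, the sketch below enumerates every subset of enzymes that could be assumed to have partial activity and builds a label for each. It is an illustration only: the function name is hypothetical, and both the choice to enumerate all subsets and the exact label format (the " + " separator) are assumptions, not the library's documented behaviour.

```python
from itertools import combinations


def partial_digest_scenarios(enzymes):
    """Yield (partial, label) for every subset of enzymes assumed partial.

    Hypothetical helper: partial is a tuple of enzyme names, and label lists
    all enzymes with the partial ones wrapped in parentheses.
    """
    for size in range(len(enzymes) + 1):
        for partial in combinations(enzymes, size):
            label = " + ".join(
                "(%s)" % enzyme if enzyme in partial else enzyme
                for enzyme in enzymes
            )
            yield partial, label


scenarios = list(partial_digest_scenarios(["EcoRI", "BamHI"]))
```

The first scenario has an empty tuple, meaning that every enzyme is assumed to cut completely.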

plot_all_validations_patterns(validations, target=None, per_digestion_discrepancy=False)[source]

Plot a graphic report of the gel validation.

The report displays the patterns, with green and red backgrounds depending on whether they passed the validation test.

Parameters
target

File object or file path where to save the figure. If provided, the function returns None; otherwise it returns the axes array of the final figure.

relative_tolerance

Relative error tolerated on each band for the observed patterns to be considered similar to the expected patterns.

min_band_cutoff

Bands with a size below this value will not be considered

max_band_cutoff

Bands with a size above this value will not be considered

static plot_partial_digests_analysis(analysis_results, ax=None)[source]

Plot partial digests analysis results.

Parameters
analysis_results

Results from ClonesObservations.partial_digests_analysis.

ax

A Matplotlib ax. If None, one is created and returned at the end.

plot_validations_plate_map(validations, target=None, ax=None)[source]

Plot a map of the plate with passing/failing wells in green/red.

validate_all_clones(relative_tolerance=0.05, min_band_cutoff=None, max_band_cutoff=None)[source]

Return {clone: CloneValidation} for all clones.

validations_summary(validations, sort_clones_by_score=True)[source]

Return {construct_id: [CloneValidation, ...]}.

To each construct corresponds a list of the validation of all clones associated with that construct, from the best-scoring to the least-scoring.

write_identification_report(target_file=None, relative_tolerance=0.05, min_band_cutoff=None, max_band_cutoff=None)[source]

Plot a graphic report of the gel validation.

The report displays the patterns, with green and red backgrounds depending on whether they passed the validation test.

Parameters
target_file

File object or file path where to save the figure. If provided, the function returns None; otherwise it returns the axes array of the final figure.

relative_tolerance

Relative error tolerated on each band for the observed patterns to be considered similar to the expected patterns.

min_band_cutoff

Bands with a size below this value will not be considered

max_band_cutoff

Bands with a size above this value will not be considered

Ladder

class bandwitch.Ladder.Ladder(bands, name=None, infos=None)[source]

Class to represent gel ladders. These ladders serve as a scale for plotting any other gel simulation.

Parameters
bands

A dictionary of the form {dna_size: migration distance}.

dna_size_to_migration(dna_sizes)[source]

Return the migration distances for the given DNA sizes.
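
One way to picture the conversion from DNA size to migration distance is interpolation between the ladder's known bands. The sketch below assumes linear interpolation on the logarithm of the size and clamps values outside the ladder range. Both choices, and the function name, are assumptions for illustration and may differ from what the Ladder class actually does.

```python
import math
from bisect import bisect_left


def dna_size_to_migration_sketch(ladder_bands, dna_size):
    """Interpolate a migration distance from {dna_size: migration} bands.

    Hypothetical helper: interpolates linearly in log(size) and clamps
    sizes outside the ladder range to the nearest ladder band.
    """
    sizes = sorted(ladder_bands)
    log_sizes = [math.log(size) for size in sizes]
    x = math.log(dna_size)
    # Clamp to the ladder range: this sketch does not extrapolate.
    if x <= log_sizes[0]:
        return ladder_bands[sizes[0]]
    if x >= log_sizes[-1]:
        return ladder_bands[sizes[-1]]
    # Locate the two ladder bands surrounding the requested size.
    i = bisect_left(log_sizes, x)
    x0, x1 = log_sizes[i - 1], log_sizes[i]
    y0, y1 = ladder_bands[sizes[i - 1]], ladder_bands[sizes[i]]
    return y0 + (y1 - y0) * (x - x0) / (x1 - x0)


# Made-up ladder for the example: larger fragments migrate less far.
example_ladder = {100: 200.0, 1000: 100.0, 10000: 0.0}
migrations = [dna_size_to_migration_sketch(example_ladder, size)
              for size in (100, 1000, 5000)]
```

The ladder values here are invented for the example and do not correspond to any commercial ladder.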

Bands Predictions

graph TD
pcomp[_compute_digestion_bands] --> pr[predict_digestion_bands]
pcomp --> pds[predict_sequences_digestions]

Module for digestion pattern prediction.

bandwitch.bands_predictions.compute_sequence_digestions_migrations(sequences_digestions, ladder)[source]

Add a ‘migration’ field to the data in sequences_digestions.

sequences_digestions is a dict {seq: digestions_data} where digestions_data is a result of predict_sequence_digestions, and ladder is a Ladder object.

bandwitch.bands_predictions.predict_digestion_bands(sequence, enzymes, linear=True, partial_cutters=())[source]

Return the band sizes from digestion by all enzymes at once.

Returns a list of band sizes sorted from smallest to largest.

Parameters
sequence

Sequence to be digested. Either a string (“ATGC…”) or a BioPython Seq

enzymes

List of all enzymes placed at the same time in the digestion mix, e.g. [“EcoRI”, “BamHI”].

linear

True if the DNA fragment is linearized, False if it is circular

partial_cutters

List of enzymes that are tired or inhibited and would randomly miss some sites, thus creating extra bands in the final digest.
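
The effect of the linear parameter on band sizes can be shown with a small sketch that turns cut positions into fragment lengths. It covers only complete digestions (no partial cutters), takes the cut positions as given instead of searching for enzyme sites, and uses a hypothetical function name. It is not the code behind predict_digestion_bands.

```python
def bands_from_cuts(sequence_length, cuts, linear=True):
    """Return sorted fragment sizes for a complete digestion at the given cuts.

    Hypothetical helper: cuts is a list of cut positions (0-based offsets
    into the sequence) that are assumed to be already known.
    """
    cuts = sorted(set(cuts))
    if not cuts:
        return [sequence_length]  # uncut molecule: a single band
    if linear:
        # Linear: fragments run between the two ends and every cut.
        edges = [0] + cuts + [sequence_length]
        bands = [b - a for a, b in zip(edges, edges[1:])]
    else:
        # Circular: each fragment runs from one cut to the next, and the
        # last fragment wraps around the origin back to the first cut.
        bands = [b - a for a, b in zip(cuts, cuts[1:])]
        bands.append(sequence_length - cuts[-1] + cuts[0])
    # Drop empty fragments (e.g. a cut exactly at a linear end).
    return sorted(band for band in bands if band > 0)


linear_bands = bands_from_cuts(1000, [200, 700], linear=True)
circular_bands = bands_from_cuts(1000, [200, 700], linear=False)
```

With the same two cuts, the linear molecule yields three fragments while the circular one yields two, because the fragment spanning the origin stays in one piece.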

bandwitch.bands_predictions.predict_sequence_digestions(sequence, enzymes, linear=True, max_enzymes_per_digestion=1)[source]

Return a dict giving bands sizes pattern for all possible digestions.

The digestions, double-digestions, etc. are listed and for each the sequence band sizes are computed.

The result is of the form {digestion: {'cuts': [], 'bands': []}}, where digestion is a tuple of enzyme names, e.g. ('EcoRI', 'XbaI'), ‘cuts’ is a list of cut locations, and ‘bands’ is a list of band sizes.

Parameters
sequence

The sequence to be digested

enzymes

List of all enzymes to be considered

max_enzymes_per_digestion

Maximum number of enzymes allowed in one digestion

bands_to_migration

Function associating a migration distance to a band size. If provided, each digestion will have a 'migration' field (list of migration distances) in addition to ‘cuts’ and ‘bands’.
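
The set of "all possible digestions" for a given max_enzymes_per_digestion can be pictured as every combination of enzymes up to that size. The sketch below shows this enumeration with itertools.combinations. The function name is hypothetical and the ordering of the result is an assumption, not a description of the library's internals.

```python
from itertools import combinations


def list_digestions(enzymes, max_enzymes_per_digestion=1):
    """Return every digestion (tuple of enzyme names) up to the given size.

    Hypothetical helper: single digestions come first, then double
    digestions, and so on.
    """
    return [
        digestion
        for n_enzymes in range(1, max_enzymes_per_digestion + 1)
        for digestion in combinations(enzymes, n_enzymes)
    ]


digestions = list_digestions(["EcoRI", "XbaI", "BamHI"],
                             max_enzymes_per_digestion=2)
```

Each tuple in the result has the same form as the digestion keys described above, e.g. ('EcoRI', 'XbaI').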