BandWitch Reference manual

DigestionProblem

        graph TD
  SC[SetCoverProblem] --> DP[DigestionProblem]
  DP --> SDP[SeparatingDigestionProblem]
  DP --> IDP[IdealDigestionProblem]
    
bandwitch.DigestionProblem.DigestionProblem

alias of the module bandwitch.DigestionProblem.DigestionProblem

class bandwitch.DigestionProblem.IdealDigestionsProblem(enzymes, ladder, sequences, min_bands=3, max_bands=7, border_tolerance=0.1, topology='default_to_linear', max_enzymes_per_digestion=1, relative_migration_precision=0.1)[source]

Find ideal digestion(s) to validate constructs.

Other digestion problems subclass this problem and implement a computation of coverages which depends on the problem.

Parameters:
  • sequences – An (ordered) dictionary of the form {sequence_name: sequence} where the sequence is an ATGC string.

  • enzymes – List of the names of the enzymes to consider, e.g. ['EcoRI', 'XbaI'].

  • ladder – A Ladder object representing the ladder used for migrations.

  • linear – True for linear sequences, False for circular sequences.

  • max_enzymes_per_digestion – Maximal number of enzymes that can go in a single digestion. Experimentally the best is 1, but you can try 2, or 3 in desperate situations. Not sure if more enzymes will work.

  • relative_migration_precision – Variance of the bands measured during the migration, given as a proportion of the total migration span (difference between the migration of the ladder’s smallest and largest bands).

migration_score(band_migrations)[source]

Score the well-numbering and well-separation of all bands.

If some bands are too high or too low, or the number of bands is out of bounds, return 0. Else, return the minimal distance between two consecutive bands.
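The scoring rule above can be sketched in a few lines. This is a minimal illustration, not the library's implementation: the function name is hypothetical, migrations are assumed to be normalized so that 0 and 1 correspond to the two extreme ladder bands, and border_tolerance is assumed to define the forbidden margins at both ends.

```python
def migration_score_sketch(band_migrations, min_bands=3, max_bands=7,
                           border_tolerance=0.1):
    """Return 0 for an unusable pattern, else the smallest gap between bands.

    Assumption: migrations are normalized to the 0..1 ladder span.
    """
    bands = sorted(band_migrations)
    # Reject patterns with too few or too many bands.
    if not (min_bands <= len(bands) <= max_bands):
        return 0
    # Reject patterns with a band too close to either end of the gel.
    if bands[0] < border_tolerance or bands[-1] > 1 - border_tolerance:
        return 0
    # A single band has no neighbour to be separated from (sketch choice).
    if len(bands) < 2:
        return 0
    # Score is the minimal distance between two consecutive bands.
    return min(b - a for a, b in zip(bands, bands[1:]))
```

A pattern such as [0.2, 0.5, 0.8] scores about 0.3, while [0.05, 0.5, 0.8] scores 0 because its first band falls in the lower margin.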

class bandwitch.DigestionProblem.SeparatingDigestionsProblem(enzymes, ladder, sequences=None, categories=None, topology='default_to_linear', max_enzymes_per_digestion=1, min_discrepancy='auto', relative_migration_precision=0.1)[source]

Problem: find best digestion(s) to identify constructs.

Provided a set of constructs (possibly from a combinatorial assembly), find a list of digestions such that every pair of constructs has different migration patterns for at least one digestion of the list.

Parameters:
  • enzymes – List of the names of the enzymes to consider, e.g. ['EcoRI', 'XbaI'].

  • ladder – A Ladder object representing the ladder used for migrations.

  • sequences – An (ordered) dictionary of the form {sequence_name: sequence} where the sequence is an ATGC string.

  • topology

  • max_enzymes_per_digestion – Maximum number of enzymes that can go in a single digestion. Experimentally the best is 1, but you can try 2, or 3 in desperate situations. Not sure if more enzymes will work.

  • min_discrepancy

  • relative_migration_precision – Variance of the bands measured during the migration, given as a proportion of the total migration span (difference between the migration of the ladder’s smallest and largest bands).
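Since the class diagram derives the digestion problems from a set cover problem, the selection of digestions can be pictured as covering all construct pairs. The sketch below uses a plain greedy heuristic, which is an assumption made for illustration and may differ from the solver used by the library. The function name and the coverage data are hypothetical.

```python
from itertools import combinations


def greedy_separating_digestions(constructs, separated_pairs):
    """Pick digestions until every pair of constructs is told apart.

    separated_pairs maps a digestion (tuple of enzyme names) to the set of
    construct pairs (frozensets) it distinguishes. Greedy heuristic only.
    """
    uncovered = {frozenset(pair) for pair in combinations(constructs, 2)}
    selected = []
    while uncovered:
        # How many still-uncovered pairs would each digestion separate?
        gains = {d: pairs & uncovered for d, pairs in separated_pairs.items()}
        best = max(gains, key=lambda d: len(gains[d]), default=None)
        if best is None or not gains[best]:
            return None  # some pairs cannot be separated by any digestion
        selected.append(best)
        uncovered -= gains[best]
    return selected


# Hypothetical coverage data for three constructs A, B and C.
coverage = {
    ("EcoRI",): {frozenset({"A", "B"}), frozenset({"A", "C"})},
    ("XbaI",): {frozenset({"B", "C"})},
    ("BamHI",): {frozenset({"A", "B"})},
}
solution = greedy_separating_digestions(["A", "B", "C"], coverage)
```

Here the EcoRI digestion is chosen first because it separates two pairs, and XbaI is added to separate the remaining pair.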

plot_distances_map(digestions, ax=None, target_file=None)[source]

Plot how well the digestions separate each construct pair.

Parameters:
  • digestions – A list of digestions, e.g. [('EcoRV',), ('XbaI', 'MfeI')].

  • ax – A matplotlib ax on which to plot. If none is provided, one is created and returned at the end.

  • target_file – The name of the (PNG, SVG, JPEG…) file in which to write the plot.

Returns:

The axes of the generated figure (if a target file is written to, the figure is closed and None is returned instead).

Return type:

axes

bandwitch.DigestionProblem.SetCoverProblem

alias of the module bandwitch.DigestionProblem.SetCoverProblem

Clone Observations

        graph TD;
  aati[AATI fragment analysis file] -- parser --> bobs[BandsObservation-s];
  bobs --> clone[Clone-s]
  bobs --> clone
  digestions[digestion enzymes infos] -- parser --> clone

  clone --> ClonesObservations
  clone --> ClonesObservations
  cs[constructs sequences] --> ClonesObservations
  ClonesObservations --pdf generator--> reports
  style aati stroke:none, fill: none
  style digestions stroke:none, fill: none
  style cs stroke:none, fill: none
  style reports stroke:none, fill: none
    

Note: the classes in this module have a complicated organization, mostly due to the history of this module and the heterogeneity of the sources of data necessary for clone validation. It may get better in the future.

Bands Observations

class bandwitch.ClonesObservations.BandsObservation(name, bands, ladder, migration_image=None)[source]

One observation of a bands pattern.

Parameters:
  • name (str) – Name of the observation (used for the top label in plots).

  • bands (BandPattern) – A BandPattern object.

  • ladder (BandPattern) – A BandPattern object representing a ladder.

  • migration_image – Optional RGB array (HxWx3) representing the gel “image” (which will be displayed on the side of the plot).

static from_aati_fa_archive(archive_path, min_rfu_size_ratio=0.3, ignore_bands_under=None, direction='column')[source]

Return a dictionary of all band observations in AATI output files.

Parameters:
  • archive_path (str) – A path to a ZIP file containing all files output by the AATI fragment analyzer.

  • min_rfu_size_ratio (float) – Cut-off ratio to filter out bands whose intensity is below some threshold. The higher the value, the more bands will be filtered out.

Returns:

A dictionary {'A1': BandsObservation(), 'A2': ...} containing the measured pattern information for a whole 96-well microplate.

Return type:

dict

patterns_discrepancy(other_bands, relative_tolerance=0.1, min_band_cutoff=None, max_band_cutoff=None)[source]

Return the maximal discrepancy between two band patterns.

The discrepancy is defined as the largest distance between a band in one pattern and the closest band in the other pattern.

Parameters:
  • other_bands (list of int) – A list of bands (integers) to be compared with the current bands.

  • relative_tolerance (float) – Tolerance, as a ratio of the full ladder span. If =0.1, then the discrepancy will have a value of 1 when a band’s nearest correspondent in the other pattern is more than 10% of the ladder span apart.

  • min_band_cutoff (int, optional) – Discrepancies involving at least one band below this minimal band size will be ignored. By default, it will be set to the smallest band size in the ladder.

  • max_band_cutoff (int, optional) – Discrepancies involving at least one band above this maximal band size will be ignored. By default, it will be set to the largest band size in the ladder.
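The discrepancy definition above can be illustrated as follows. This is a sketch under stated assumptions, not the library's code: the function name is hypothetical, distances are taken directly on the values passed in, the band cutoffs are left out, and the result is normalized so that it reaches 1 when a band's nearest counterpart is exactly relative_tolerance times the ladder span away.

```python
def patterns_discrepancy_sketch(bands_1, bands_2, ladder_span,
                                relative_tolerance=0.1):
    """Largest nearest-neighbour distance between two band patterns.

    The distance is divided by relative_tolerance * ladder_span, so values
    of 1 and above mean "outside the tolerance" (sketch convention).
    """
    if not bands_1 and not bands_2:
        return 0.0
    if not bands_1 or not bands_2:
        return float("inf")  # nothing to match the existing bands against

    def worst_match(pattern, reference):
        # For each band, distance to its closest band in the other pattern.
        return max(min(abs(b - r) for r in reference) for b in pattern)

    largest = max(worst_match(bands_1, bands_2),
                  worst_match(bands_2, bands_1))
    return largest / (relative_tolerance * ladder_span)
```

With a ladder span of 100, the patterns [10, 50] and [12, 50] differ by at most 2 units, which gives a discrepancy of about 0.2.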

to_bandwagon_bandpattern(background_color=None, label='auto')[source]

Return a pattern version for the plotting library BandWagon.

If label is left as ‘auto’, it will be the pattern’s name.

Clone

class bandwitch.ClonesObservations.Clone(name, digestions, construct_id=None)[source]

Gather all information necessary to validate a clone.

Parameters:
  • name (str) – Name of the clone. Could be for instance a microplate well.

  • digestions (dict) – A dictionary {digestion: BandsObservation} where digestion is of the form ('EcoRI', 'BamHI').

  • construct_id (str) – ID of the construct to be validated. This is used to group clones by construct in the validation reports.

validate_bands(bands_by_digestion, relative_tolerance=0.1, min_band_cutoff=None, max_band_cutoff=None)[source]

Return a validation result (comparison of observed and expected).

The result is a CloneValidation object.

Parameters:
  • bands_by_digestion (dict) – A dictionary {digestion: [bands]} where digestion is of the form ('EcoRI', 'BamHI'), and bands is a list of band sizes.

  • relative_tolerance (float) – Tolerance, as a ratio of the full ladder span. If =0.1, then the discrepancy will have a value of 1 when a band’s nearest correspondent in the other pattern is more than 10% of the ladder span apart.

  • min_band_cutoff (int, optional) – Discrepancies involving at least one band below this minimal band size will be ignored. By default, it will be set to the smallest band size in the ladder.

  • max_band_cutoff (int, optional) – Discrepancies involving at least one band above this maximal band size will be ignored. By default, it will be set to the largest band size in the ladder.

CloneValidation

bandwitch.ClonesObservations.CloneValidation

alias of the module bandwitch.ClonesObservations.CloneValidation

Clone Observations

class bandwitch.ClonesObservations.ClonesObservations(clones, constructs_records, partial_cutters=())[source]

All useful information for a collection of clones to be validated.

Parameters:
  • clones (list or dict) – Either a list of Clones (each with a unique name) or a dictionary {name: Clone}.

  • constructs_records (dict) – A dictionary {construct_name: biopython_record} indicating the sequence of the different constructs. For each construct, set the attribute construct.linear = False if the construct is circular.

get_clone_digestion_bands(clone, construct_id=None)[source]

Return {digestion: bands} for all digestions of this clone.

get_digestion_bands_for_construct(construct_id, digestion)[source]

Return the bands resulting from the digestion of the construct.

This function enables some memoization (no digestion is computed twice).

Parameters:
  • construct_id (str) – ID of the construct as it appears in this object’s constructs_records.

  • digestion (tuple) – For instance ('BamHI', 'XbaI').
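The memoization mentioned above amounts to caching results by (construct, digestion) key. The sketch below shows the idea with a hypothetical helper class and a stand-in digestion function; none of these names come from the library.

```python
class DigestionBandsCache:
    """Compute each (construct, digestion) band list once, then reuse it."""

    def __init__(self, compute_bands):
        self.compute_bands = compute_bands  # hypothetical callable
        self._cache = {}
        self.computations = 0  # counts actual (non-cached) computations

    def get(self, construct_id, digestion):
        key = (construct_id, tuple(digestion))
        if key not in self._cache:
            self.computations += 1
            self._cache[key] = self.compute_bands(construct_id, digestion)
        return self._cache[key]


def fake_digest(construct_id, digestion):
    """Stand-in for a real digestion computation (made-up band sizes)."""
    return [100 * len(digestion), 1000]


cache = DigestionBandsCache(fake_digest)
first = cache.get("construct_1", ("BamHI", "XbaI"))
second = cache.get("construct_1", ("BamHI", "XbaI"))
```

The second call returns the stored list without calling fake_digest again, so cache.computations stays at 1.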

identify_all_clones(relative_tolerance=0.05, min_band_cutoff=None, max_band_cutoff=None)[source]

Return {clone: {construct_id: CloneValidation}} for all clones.

Parameters:
  • relative_tolerance (float) – Tolerance, as a ratio of the full ladder span. If =0.1, then the discrepancy will have a value of 1 when a band’s nearest correspondent in the other pattern is more than 10% of the ladder span apart.

  • min_band_cutoff (int) – Discrepancies involving at least one band below this minimal band size will be ignored. By default, it will be set to the smallest band size in the ladder.

  • max_band_cutoff (int) – Discrepancies involving at least one band above this maximal band size will be ignored. By default, it will be set to the largest band size in the ladder.

static identify_bad_parts(validations, constructs_parts, constructs_records=None, report_target=None, extra_failures=None)[source]

Identifies parts associated with failure in the validations.

Uses the Saboteurs package: https://github.com/Edinburgh-Genome-Foundry/saboteurs

Parameters:
  • validations – Validation results.

  • constructs_parts – Either a dict {construct_id: [list, of, part, names]} or a function (biopython_record => [list, of, part, names]).

  • report_target – Can be a path to file or file-like object where to write a PDF report. Can also be “@memory”, in which case the raw binary PDF data is returned.

Returns:

A tuple (analysis, pdf_data), where analysis is the result of sabotage analysis (see the saboteurs library), and pdf_data is None unless report_target is set to “@memory” (see above).

Return type:

analysis, pdf_data

partial_digests_analysis(relative_tolerance=0.05)[source]

Compute good clones under different partial digest assumptions.

Returns a dictionary {partial: {'valid_clones': 60, 'label': 'x'}}. For a given scenario, partial is a tuple of all enzymes considered to have partial activity in this scenario (it defines the scenario), valid_clones is the number of good clones under this assumption, and label is a string representation of all enzymes involved, with partial-activity enzymes in parentheses.

This result can be fed to ClonesObservations’s .plot_partial_digests_analysis method for plotting.
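To make the scenario keys and labels more concrete, here is a small sketch that enumerates scenarios and builds their labels. It assumes one scenario per subset of the partial cutters and a " + " separator in labels; both are illustrative guesses, and the function name is hypothetical.

```python
from itertools import combinations


def partial_digest_scenarios(enzymes, partial_cutters):
    """Return {partial: label}, one entry per subset of partial_cutters.

    Enzymes assumed to be partially active are wrapped in parentheses.
    """
    scenarios = {}
    for size in range(len(partial_cutters) + 1):
        for partial in combinations(partial_cutters, size):
            label = " + ".join(
                f"({enzyme})" if enzyme in partial else enzyme
                for enzyme in enzymes
            )
            scenarios[partial] = label
    return scenarios


scenarios = partial_digest_scenarios(["EcoRI", "BamHI"], ["BamHI"])
```

The empty tuple stands for the scenario in which every enzyme is fully active.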

plot_all_validations_patterns(validations, target=None, per_digestion_discrepancy=False)[source]

Plot a graphic report of the gel validation.

The report displays the patterns, with green and red backgrounds depending on whether they passed the validation test.

Parameters:
  • validations – Validation results.

  • target – File path.

  • per_digestion_discrepancy – If True, each pattern for each clone will have an indication of how close it is to the expected pattern. If False, and the validation involves several digestions, the biggest discrepancy among all patterns is shown.

static plot_partial_digests_analysis(analysis_results, ax=None)[source]

Plot partial digest analysis results.

Parameters:
  • analysis_results – Results from ClonesObservations.partial_digests_analysis.

  • ax – A Matplotlib ax. If None, one is created and returned at the end.

plot_validations_plate_map(validations, target=None, ax=None)[source]

Plot a map of the plate with passing/failing wells in green/red.

validate_all_clones(relative_tolerance=0.05, min_band_cutoff=None, max_band_cutoff=None)[source]

Return {clone: CloneValidation} for all clones.

validations_summary(validations, sort_clones_by_score=True)[source]

Return {construct_id: [CloneValidation, ...]}.

To each construct corresponds a list of the validation of all clones associated with that construct, from the best-scoring to the least-scoring.
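The grouping described above can be sketched with plain dictionaries. In this illustration each validation is reduced to a (construct_id, score) pair, a simplified stand-in for CloneValidation objects, and a higher score is assumed to be better; the function name is hypothetical.

```python
from collections import defaultdict


def summarize_validations(validations, sort_clones_by_score=True):
    """Group validation results by construct.

    validations maps a clone name to a (construct_id, score) pair.
    Returns {construct_id: [(clone_name, score), ...]}.
    """
    summary = defaultdict(list)
    for clone_name, (construct_id, score) in validations.items():
        summary[construct_id].append((clone_name, score))
    if sort_clones_by_score:
        # Best-scoring clones first (assumes higher score means better).
        for results in summary.values():
            results.sort(key=lambda item: item[1], reverse=True)
    return dict(summary)


# Hypothetical validation results for three clones.
validations = {
    "A1": ("construct_1", 0.2),
    "A2": ("construct_1", 0.9),
    "B1": ("construct_2", 0.5),
}
summary = summarize_validations(validations)
```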

write_identification_report(target_file=None, relative_tolerance=0.05, min_band_cutoff=None, max_band_cutoff=None)[source]

Plot a graphic report of the gel validation.

The report displays the patterns, with green and red backgrounds depending on whether they passed the validation test.

Parameters:
  • target_file – File object or file path where to save the figure. If provided, the function returns None; otherwise it returns the axes array of the final figure.

  • relative_tolerance – Relative error tolerated on each band for the observed patterns to be considered similar to the expected patterns.

  • min_band_cutoff – Bands with a size below this value will not be considered.

  • max_band_cutoff – Bands with a size above this value will not be considered.

Ladder

class bandwitch.Ladder.Ladder(bands, name=None, infos=None)[source]

Class to represent gel ladders. These ladders serve as a scale for plotting any other gel simulation.

Parameters:

bands – A dictionary of the form {dna_size: migration distance}.

dna_size_to_migration(dna_sizes)[source]

Return the migration distances for the given DNA sizes.
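One way to picture the conversion from DNA size to migration distance is interpolation between ladder bands. The sketch below assumes that migration varies linearly with the logarithm of the size between two neighbouring bands and clamps sizes outside the ladder; this model, the function name, and the ladder values are assumptions, not a description of the Ladder class internals.

```python
import math
from bisect import bisect_left


def dna_size_to_migration_sketch(ladder_bands, dna_size):
    """Interpolate a migration distance from {dna_size: migration} bands.

    Assumption: log-linear interpolation between neighbouring ladder bands.
    """
    sizes = sorted(ladder_bands)
    # Clamp sizes that fall outside the ladder range.
    if dna_size <= sizes[0]:
        return ladder_bands[sizes[0]]
    if dna_size >= sizes[-1]:
        return ladder_bands[sizes[-1]]
    # Find the two ladder bands surrounding dna_size.
    i = bisect_left(sizes, dna_size)
    lo, hi = sizes[i - 1], sizes[i]
    t = (math.log(dna_size) - math.log(lo)) / (math.log(hi) - math.log(lo))
    return ladder_bands[lo] + t * (ladder_bands[hi] - ladder_bands[lo])


# Hypothetical ladder: larger fragments migrate less.
ladder = {100: 200.0, 1000: 100.0, 10000: 0.0}
```

With this ladder, a fragment whose size is the geometric mean of 100 and 1000 lands halfway between the two corresponding migration distances.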

Bands Predictions

        graph TD
  pcomp[_compute_digestion_bands] --> pr[predict_digestion_bands]
  pcomp --> pds[predict_sequence_digestions]
    

Module for digestion pattern prediction.

bandwitch.bands_predictions.compute_sequence_digestions_migrations(sequences_digestions, ladder)[source]

Add a ‘migration’ field to the data in sequences_digestions.

sequences_digestions is a dict {seq: digestions_data} where digestions_data is a result of predict_sequence_digestions, and ladder is a Ladder object.
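The effect on the data structure can be sketched as follows. Instead of a Ladder object, this illustration takes a hypothetical size_to_migration callable, and the function name and example data are made up.

```python
def add_migration_fields(sequences_digestions, size_to_migration):
    """Add a 'migration' list next to each 'bands' list, in place."""
    for digestions_data in sequences_digestions.values():
        for data in digestions_data.values():
            data["migration"] = [
                size_to_migration(size) for size in data["bands"]
            ]
    return sequences_digestions


# Hypothetical data in the {seq: {digestion: {...}}} shape described above.
predictions = {
    "construct_1": {("EcoRI",): {"cuts": [200], "bands": [200, 800]}},
}
add_migration_fields(predictions, size_to_migration=lambda size: 1000 / size)
```

After the call, each digestion entry holds 'cuts', 'bands' and 'migration' side by side.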

bandwitch.bands_predictions.predict_digestion_bands(sequence, enzymes, linear=True, partial_cutters=())[source]

Return the band sizes from digestion by all enzymes at once.

Returns a list of band sizes sorted from smallest to largest.

Parameters:
  • sequence – Sequence to be digested. Either a string ("ATGC…") or a Biopython Seq.

  • enzymes – List of all enzymes placed at the same time in the digestion mix, e.g. ['EcoRI', 'BamHI'].

  • linear – True if the DNA fragment is linearized, False if it is circular.

  • partial_cutters – List of enzymes that, being tired or inhibited, would randomly miss some sites, thus creating extra bands in the final digest.
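The role of the linear parameter can be illustrated by computing fragment sizes from known cut positions. The sketch below is a simplified model with a hypothetical function name; it takes cut positions as input rather than locating enzyme sites, and it ignores partial cutters.

```python
def bands_from_cuts(sequence_length, cuts, linear=True):
    """Return sorted fragment sizes given cut positions (0-based indices).

    A linear molecule cut at n sites gives n + 1 fragments; a circular one
    gives n fragments, one of which wraps around the origin.
    """
    cuts = sorted(set(cuts))
    if not cuts:
        return [sequence_length]  # uncut molecule: a single band
    # Fragments lying between two consecutive cuts.
    inner = [b - a for a, b in zip(cuts, cuts[1:])]
    if linear:
        # Two separate end fragments.
        ends = [cuts[0], sequence_length - cuts[-1]]
    else:
        # Both ends are joined into a single fragment.
        ends = [cuts[0] + sequence_length - cuts[-1]]
    return sorted(size for size in inner + ends if size > 0)
```

For a 1000 bp sequence cut at positions 200 and 700, the linear case yields three bands (200, 300 and 500) while the circular case yields two bands of 500.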

bandwitch.bands_predictions.predict_sequence_digestions(sequence, enzymes, linear=True, max_enzymes_per_digestion=1)[source]

Return a dict giving bands sizes pattern for all possible digestions.

The digestions, double-digestions, etc. are listed and for each the sequence band sizes are computed.

The result is of the form {digestion: {'cuts': [], 'bands': []}}, where digestion is a tuple of enzyme names, e.g. ('EcoRI', 'XbaI'), ‘cuts’ is a list of cut locations, and ‘bands’ is a list of band sizes.

Parameters:
  • sequence – The sequence to be digested.

  • enzymes – List of all enzymes to be considered.

  • max_enzymes_per_digestion – Maximum number of enzymes allowed in one digestion.

  • bands_to_migration – Function associating a migration distance to a band size. If provided, each digestion will have a 'migration' field (list of migration distances) in addition to ‘cuts’ and ‘bands’.
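The listing of digestions, double-digestions, etc. can be pictured as enumerating enzyme combinations up to max_enzymes_per_digestion. The sketch below uses itertools.combinations for this; the function name is hypothetical and the ordering of the result is a choice made for this example.

```python
from itertools import combinations


def list_digestions(enzymes, max_enzymes_per_digestion=1):
    """Return every enzyme combination of size 1..max_enzymes_per_digestion."""
    return [
        digestion
        for size in range(1, max_enzymes_per_digestion + 1)
        for digestion in combinations(enzymes, size)
    ]


digestions = list_digestions(["EcoRI", "XbaI", "MfeI"], 2)
```

With three enzymes and at most two per digestion, this gives three single digestions followed by three double digestions, each as a tuple of enzyme names.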