Built-in Specifications

Summary

AllowPrimer

Enforce various specifications for enabling primers at the location.

AvoidBlastMatches

Enforce that the sequence has no BLAST matches with a given database.

AvoidMatches

Enforce that the sequence has no matches longer than N in a given index.

AvoidChanges

Specify that some locations of the sequence should not be changed.

AvoidHairpins

Avoid Hairpin patterns as defined by the IDT guidelines.

AvoidHeterodimerization

Avoid that the (sub)sequence anneals with other primers.

AvoidPattern

Enforce that the given pattern is absent in the sequence.

AvoidStopCodons

Do not introduce any new stop codon in that frame.

AvoidRareCodons

Avoid the use of codons with low frequency.

CodonOptimize

Codon-optimize a coding sequence using a user-selected method.

MaximizeCAI

Codon-optimize a coding sequence for a given species.

MatchTargetCodonUsage

Codon-optimize a sequence so it has the same codon usage as a target.

HarmonizeRCA

Codon-Harmonize a native sequence for a new host (Claassens method).

EnforceGCContent

Specification on the local or global proportion of G/C nucleotides.

EnforceMeltingTemperature

Ensure that the subsequence’s Tm is in a certain segment/target.

EnforcePatternOccurence

Enforce a number of occurrences of the given pattern in the sequence.

EnforceRegionsCompatibility

Ensure that different subregions satisfy compatibility constraints.

EnforceSequence

Enforces a (possibly degenerate) sequence at some location.

EnforceTranslation

Enforce a specific amino-acid sequence translation.

SequenceLengthBounds

Checks that the sequence length is between bounds.

UniquifyAllKmers

Avoid sub-sequence of length k with homologies elsewhere.

API Details

AllowPrimer

class dnachisel.builtin_specifications.AllowPrimer(location=None, tmin=50, tmax=70, max_homology_length=6, avoid_heterodim_with=None, max_heterodim_tm=5, avoided_repeats=2, 5, 3, 4, 4, 3)[source]

Enforce various specifications for enabling primers at the location.

This is useful for making sure that you will be able to conduct a PCR or sanger sequencing with a primer annealing at a particular location of the sequence.

Shorthand for annotations: “primer”.

Parameters
location

The exact location where the primer will anneal, i.e. the subsequence under this location will be the sequence

tmin, tmax

Minimum and maximum acceptable melting temperatures in Celcius, for instance 55 and 70. When the used as an optimization objective the “target” will be (tmin + tmax)/2.

max_homology_length

Maximal length of any homology between the subsequence at that location and anywhere else in the whole sequence.

avoid_heterodim_with

List of ATGC strings representing external (primer) sequences with which the optimised location should have no annealing.

max_heterodim_tm

Max melting temperature of the heterodimerization between this subsegment and the other sequences in “avoid_heterodim_with”

avoided_repeats

Properties of the repeated patterns avoided. List of pairs (K, N) meaning “avoid K-mers repeated N times in a row”.

EnforceMeltingTemperature

class dnachisel.builtin_specifications.EnforceMeltingTemperature(mini=None, maxi=None, target=None, location=None, boost=1.0)[source]

Ensure that the subsequence’s Tm is in a certain segment/target.

Shorthand for annotations: “tm”.

Parameters
mini, maxi

Minimum and maximum acceptable melting temperatures in Celcius, for instance 55 and 70. A “target” can be provided instead when using this specification as an optimization objective.

target

Target melting temperature. Will be overridden by (mini+maxi)/2 if these are provided. The “target” parameter is only practical when the spec is used as an optimization objective.

location

Location of the subsequence whose melting temperature is considered. Can be None if the whole sequence is to be considered.

boost

Multiplicator for this specification’s score when used in a multi-objective optimization.

AvoidMatches

class dnachisel.builtin_specifications.AvoidMatches(match_length, bowtie_index=None, sequences=None, mismatches=0, location=None, boost=1)[source]

Enforce that the sequence has no matches longer than N in a given index.

This specification can be used to ensure that a sequence has no matches with a given organism, or a set of sequences, which can be useful to create orthogonal sequences or primer-friendly regions.

This specification uses Bowtie in the background and requires Bowtie installed on your machine (it can be as simple as apt install bowtie on Ubuntu).

It allows you to specify the match_length such that no subsegment of size match_length or more has any homology in the given bowtie index (which can be built from genomes using e.g. the genome_collector library). An homology can mean either perfect similarity, or up to 3 mismatches.

Parameters
bowtie_index

Path to a local bowtie2 index.

match_length

Length of the matches to avoid

mismatches

Number of single-nucleotide mismatches allowed inside each match. Only 0-3 mismatches is supported (this is what Bowtie supports.)

location

Location of the sequence on which the specification applies

Examples

Here is how you automatically get a Bowtie index link with Genome Collector (it will build the index if it is not yet on your machine), and use the result to create an AvoidMatches specification:

>>> from genome_collector import GenomeCollection
>>> collection = GenomeCollection()
>>> e_coli = collection.get_taxid_bowtie_index_path(511145, version="1")
>>> spec = AvoidMatches(bowtie_index=e_coli, match_length=15, mismatches=1)

AvoidBlastMatches

class dnachisel.builtin_specifications.AvoidBlastMatches(blast_db=None, sequences=None, word_size=4, perc_identity=100, num_alignments=100000, num_threads=3, min_align_length=20, ungapped=True, e_value=1e+80, culling_limit=1, location=None)[source]

Enforce that the sequence has no BLAST matches with a given database.

WARNING: try using AvoidMatches instead, it is much better!!

Uses NCBI Blast+. Only local BLAST is supported/tested as for now

Parameters
blast_db

Path to a local BLAST database. These databases can be obtained with NCBI’s makeblastdb. Omit the extension, e.g. ecoli_db/ecoli_db.

word_size

Word size used by the BLAST algorithm

perc_identity

Minimal percentage of identity for BLAST matches. 100 means that only perfect matches are considered.

num_alignments

Number alignments

num_threads

Number of threads/CPU cores to use for the BLAST algorithm.

min_align_length

Minimal length that an alignment should have to be considered.

AvoidChanges

class dnachisel.builtin_specifications.AvoidChanges(max_edits=0, max_edits_percent=None, location=None, indices=None, target_sequence=None, boost=1.0)[source]

Specify that some locations of the sequence should not be changed.

Shorthand for annotations: “change”.

Parameters
location

Location object indicating the position of the segment that must be left unchanged. Alternatively, indices can be provided. If neither is provided, the assumed location is the whole sequence.

indices

List of indices that must be left unchanged.

target_sequence

At the moment, this is rather an internal variable. Do not use unless you’re not afraid of side effects.

AvoidHairpins

class dnachisel.builtin_specifications.AvoidHairpins(stem_size=20, hairpin_window=200, location=None, boost=1.0)[source]

Avoid Hairpin patterns as defined by the IDT guidelines.

A hairpin is defined by a sequence segment which has a reverse complement “nearby” in a given window.

Parameters
stem_size

Size of the stem of a hairpin, i.e. the length of the sequence which should have a reverse complement nearby to be considered a hairpin.

hairpin_window

The window in which the stem’s reverse complement should be searched for.

boost

Multiplicative factor, importance of this objective in a multi-objective optimization.

UniquifyAllKmers

class dnachisel.builtin_specifications.UniquifyAllKmers(k, reference=None, location=None, include_reverse_complement=True, boost=1.0, localization_data=None)[source]

Avoid sub-sequence of length k with homologies elsewhere.

NOTE: For sequences with subsequences appearing more than 2 times, the specification may not work as a problem constraint, but will work as a problem optimization objective.

You can define a location L and an reference L* (by default they are both the full sequence)

>>>          [=== L ===]
>>>
>>>       [=========== L* ==========]
>>>
>>> --------- Sequence --------------------------

This Specification class specifies that “No sub-sequence in L of length above k has more than 1 occurence in L*”.

Some specific cases

  • L = L* = Sequence. In this case the full sequence will have only unique kmers above a certain size (no self-homology).

  • L < L*, L* = Sequence. The segment L will have no self-homology and no homology to the rest of the sequence above a certain size. But there can be self-homologies elsewhere in the sequence.

  • L = L*. segment L will have no self-homology.

Parameters
k

Minimal length of sequences to be considered repeats

reference

The default None indicates that the specification’s location should have no homologies anywhere in the whole sequence. If reference=”here”, then the specification’s location should have no homology inside that same location. Reference can also be any location of the sequence that the specification’s location should have no homologies with.

location

Segment of the sequence in which to look for repeats. If None, repeats are searched in the full sequence.

include_reverse_complement

If True, the sequence repeats are also searched for in the reverse complement of the sequence (or sub sequence if location is not None).

Examples

>>> from dnachisel import *
>>> sequence = random_dna_sequence(50000)
>>> constraint= UniquifyAllKmers(10, include_reverse_complement=True)
>>> problem = DnaOptimizationProblem(sequence, constraints= [constraint])
>>> print (problem.constraints_summary())

AvoidPattern

class dnachisel.builtin_specifications.AvoidPattern(pattern=None, location=None, strand='from_location', boost=1.0)[source]

Enforce that the given pattern is absent in the sequence.

Shorthand for annotations: “no”.

Parameters
pattern

A SequencePattern or DnaNotationPattern

location

Location of the DNA segment on which to enforce the pattern e.g. Location(10, 45, 1). For patterns which are not palindromic, the strand matters! Use +1 for eliminating the pattern on the +1 strand only, -1 for eliminating the pattern on the -1 strand, and 0 for eliminating the pattern on both strands. Default None enforces on the whole sequence.

strand

Alternative way to set the strand, meant to be used in two cases only: (1) in a Genbank annotation by setting strand=both to indicate that the pattern should be avoided on both strands (otherwise, only the feature’s strand will be considered). (2) if you want to create a specification without preset location, but with a set strand: AvoidPattern('BsmBI_site', strand=1). The default ‘from_location’ uses the strand specified in location, or if that is None, it sets both strands.

AvoidStopCodons

class dnachisel.builtin_specifications.AvoidStopCodons(genetic_table='Standard', location=None, boost=1.0)[source]

Do not introduce any new stop codon in that frame.

This can be used for research purposes, to avoid breaking a reading frame when editing it with quasi-synonymous mutations.

AvoidRareCodons

class dnachisel.builtin_specifications.AvoidRareCodons(min_frequency, species=None, codon_usage_table=None, location=None, boost=1.0)[source]

Avoid the use of codons with low frequency.

This can be seen as a “mild” form of codon optimization where only rare codons (which slow down protein synthesis) are considered.

WARNING: Make sure to always use this specification with EnforceTranslation to preserve the amino-acid sequence.

Shorthand for annotations: “no_rare_codons”.

Parameters
min_frequency

Minimal frequency accepted for a given codon.

species

Name or TaxID of the species for which to optimize the sequence. A custom codon_usage_table can be provided instead (or in addition, for species names whose codon usage table cannot be imported).

codon_usage_table

Optional codon usage table of the species for which the sequence will be codon-optimized, which can be provided instead of species. A dict of the form {'*': {"TGA": 0.112, "TAA": 0.68}, 'K': ...} giving the RSCU table (relative usage of each codon). See parameter species above.

location

Either a DnaChisel Location or a tuple of the form (start, end, strand) or just (start, end), with strand defaulting to +1, indicating the position of the gene to codon-optimize. If not provided, the whole sequence is considered as the gene. The location should have a length that is a multiple of 3. The location strand is either 1 if the gene is encoded on the (+) strand, or -1 for antisense.

boost

Score multiplicator (=weight) for when the specification is used as an optimization objective alongside competing objectives.

Codon Optimization Specifications

dnachisel.builtin_specifications.CodonOptimize(species=None, method='use_best_codon', location=None, codon_usage_table=None, original_species=None, original_codon_usage_table=None, boost=1.0)[source]

Codon-optimize a coding sequence using a user-selected method.

This pseudo-specification is actually a function which returns an instance of another specification class depending on the selected “method”:

  • For method=”use_best_codon”, every codon will be replaced by the “best” (i.e. most frequent) synonymous codon in the target organism. This is equivalent to Codon Adaptation Index (CAI) optimization.

  • For method=”match_codon_usage”, the final sequence’s codon usage will match as much as possible the codon usage profile of the target species (this method is used throughout the literature, see for instance Hale and Thomson 1998).

  • For method=”harmonize_rca”, Each codon will be replaced by a synonymous codon whose usage in the target organism matches the usage of the original codon in its host organism (as per Claassens 2017).

Parameters
species

Species for which the sequence will be codon-optimized. Either a TaxID (this requires a web connection as the corresponding table will be downloaded from the internet) or the name of the species to codon-optimize for (the name must be supported by python_codon_tables e.g. e_coli, s_cerevisiae, h_sapiens, c_elegans, b_subtilis, d_melanogaster). Note that a codon_usage_table can be provided instead, or even in addition, for species whose codon usage table cannot be auto-imported.

method

Either ‘use_best_codon’, ‘match_codon_usage’, or ‘harmonize_rca’ (see above for details)

location

Either a DnaChisel Location or a tuple of the form (start, end, strand) or just (start, end), with strand defaulting to +1, indicating the position of the gene to codon-optimize. If not provided, the whole sequence is considered as the gene. The location should have a length that is a multiple of 3. The location strand is either 1 if the gene is encoded on the (+) strand, or -1 for antisense.

codon_usage_table

Optional codon usage table of the species for which the sequence will be codon-optimized, which can be provided instead of species. A dict of the form {'*': {"TGA": 0.112, "TAA": 0.68}, 'K': ...} giving the RSCU table (relative usage of each codon). See parameter species above.

original_species

When the method is ‘harmonize_rca’, this is the native species of the original coding sequence. Same characteristics as parameter species above.

original_codon_usage_table

Optional codon usage table of the original sequence’s native species. A dict of the form {'*': {"TGA": 0.112, "TAA": 0.68}, 'K': ...} giving the codon usage table.

References

Claassens et. al., Improving heterologous membrane protein production in Escherichia coli by combining transcriptional tuning and codon usage algorithms. PLOS One, 2017

Hale and Thompson, Codon Optimization of the Gene Encoding a Domain from Human Type 1 Neurofibromin Protein… Protein Expression and Purification 1998.

class dnachisel.builtin_specifications.MaximizeCAI(species=None, location=None, codon_usage_table=None, boost=1.0)[source]

Codon-optimize a coding sequence for a given species. Maximizes the CAI.

To be precise, the score computed by this specification is N*log(CAI) where N is the number of codons. Maximizing this score also maximizes the CAI.

Index (CAI). For a sequence with N codons, the CAI is the geometric mean of the Relative Codon Adaptiveness (RCA) of the different codons. The RCA of a codon is (f_i/fmax_i) were fi is the frequency of an oligo in the codon usage table, and fmax is the maximal frequency of the synonymous codons.

So N*log(CAI) = sum_i ( log(f_i) - log(fmax_i) )

This score is between -inf. and 0 (0 meaning a perfectly optimal sequence).

Parameters
species

Species for which the sequence will be codon-optimized. Either a TaxID (this requires a web connection as the corresponding table will be downloaded from the internet) or the name of the species to codon-optimize for (the name must be supported by python_codon_tables e.g. e_coli, s_cerevisiae, h_sapiens, c_elegans, b_subtilis, d_melanogaster). Note that a codon_usage_table can be provided instead, or even in addition, for species whose codon usage table cannot be auto-imported.

location

Either a DnaChisel Location or a tuple of the form (start, end, strand) or just (start, end), with strand defaulting to +1, indicating the position of the gene to codon-optimize. If not provided, the whole sequence is considered as the gene. The location should have a length that is a multiple of 3. The location strand is either 1 if the gene is encoded on the (+) strand, or -1 for antisense.

codon_usage_table

A dict of the form {'*': {"TGA": 0.112, "TAA": 0.68}, 'K': ...} giving the RSCU table (relative usage of each codon). Only provide if no species parameter was provided.

boost

Score multiplicator (=weight) for when the specification is used as an optimization objective alongside competing objectives.

Examples

>>> objective = MaximizeCAI(
>>>     species = "E. coli",
>>>     location = (150, 300), # coordinates of a gene
>>>     strand = -1
>>> )
class dnachisel.builtin_specifications.MatchTargetCodonUsage(species=None, location=None, codon_usage_table=None, boost=1.0)[source]

Codon-optimize a sequence so it has the same codon usage as a target.

The objective minimized here is the sum of the discrepancies, over every possible triplet ATG, CCG, etc. between the codon frequency of this triplet in the sequence, and its frequency in the target organism.

This method has had several names through the ages. It may have been first proposed by Hale and Thompson, 1998. It is called Individual Codon Usage Optimization in Chung 2012, Global CAI Harmonization in Mignon 2018, and Codon Harmonization in Jayaral 2005. We didn’t call it “harmonization” in DNA Chisel to avoid any confusion with the now more common host-to-target codon harmonization. See DnaChisel’s HarmonizeRCA class for Codon Harmonization.

Parameters
species

Species for which the sequence will be codon-optimized. Either a TaxID (this requires a web connection as the corresponding table will be downloaded from the internet) or the name of the species to codon-optimize for (the name must be supported by python_codon_tables e.g. e_coli, s_cerevisiae, h_sapiens, c_elegans, b_subtilis, d_melanogaster). Note that a codon_usage_table can be provided instead, or even in addition, for species whose codon usage table cannot be auto-imported.

location

Either a DnaChisel Location or a tuple of the form (start, end, strand) or just (start, end), with strand defaulting to +1, indicating the position of the gene to codon-optimize. If not provided, the whole sequence is considered as the gene. The location should have a length that is a multiple of 3. The location strand is either 1 if the gene is encoded on the (+) strand, or -1 for antisense.

codon_usage_table

A dict of the form {'*': {"TGA": 0.112, "TAA": 0.68}, 'K': ...} giving the RSCU table (relative usage of each codon). Only provide if no species parameter was provided.

boost

Score multiplicator (=weight) for when the specification is used as an optimization objective alongside competing objectives.

References

Hale and Thompson, Codon Optimization of the Gene Encoding a Domain from Human Type 1 Neurofibromin Protein… Protein Expression and Purification 1998.

Jayaraj et. al. GeMS: an advanced software package for designing synthetic genes, Nucleic Acids Research, 2005

Mignon et. al. Codon harmonization – going beyond the speed limit for protein expression. FEBS Lett, 2018

Chung BK, Lee DY. Computational codon optimization of synthetic gene for protein expression. BMC Syst Biol. 2012

class dnachisel.builtin_specifications.HarmonizeRCA(species=None, codon_usage_table=None, original_species=None, original_codon_usage_table=None, location=None, boost=1)[source]

Codon-Harmonize a native sequence for a new host (Claassens method).

This specification will optimize a Sequence 1 from Host 1 into a Sequence 2 for target Host 2.

In simple, rare Host 1 codons will be replaced by rare Host 2 codons, and high-frequency Host 1 codons will get replaced by codons that are high-frequency in Host 2.

In more specific, each codon along Sequence 1 gets replaced by the codon whose Relative Codon Adaptiveness (RCA) in Host 2 is the closest from the RCA of the original codon in Host 1. A codon’s RCA in a given organism is defined by f/fmax where f is the codon’s frequency in the organism and fmax is the highest frequency of all synonymous codons.

The minimized quantity is sum_i abs(RCA(c_i, H1) - RCA(c’_i, H2)) where c_i, c’_i represent the i-th codon before and after optimization

This method is taken from Claassens 2017, where they simplify a previous algorithm (Angov 2008), which was much more complicated as it involved predicting “ribosome pausing” sites in the sequence.

Warning: always use with an EnforceTranslation constraint.

Parameters
species

Name or TaxID of the species for which to optimize the sequence. A custom codon_usage_table can be provided instead (or in addition, for species names whose codon usage table cannot be imported).

codon_usage_table

Optional - can be provided instead of species. A dict of the form {'*': {"TGA": 0.112, "TAA": 0.68}, 'K': ...} giving the RSCU table (relative usage of each codon).

original_species

Name or TaxID of the species the original sequence was taken from. This information will be used to spot codons which are supposed to be rare or common. A codon_usage_table can be provided instead (or in addition, for species names whose codon usage table cannot be imported).

original_codon_usage_table

A dict of the form {'*': {"TGA": 0.112, "TAA": 0.68}, 'K': ...} giving the RSCU table (relative usage of each codon).

location

Location on which the specification applies

boost

Score multiplicator (=weight) for when the specification is used as an optimization objective alongside competing objectives.

References

Claassens et. al., Improving heterologous membrane protein production in Escherichia coli by combining transcriptional tuning and codon usage algorithms. PLOS One, 2017

EnforceGCContent

class dnachisel.builtin_specifications.EnforceGCContent(mini=0, maxi=1.0, target=None, window=None, location=None, boost=1.0)[source]

Specification on the local or global proportion of G/C nucleotides.

Shorthand for annotations: “gc”.

Parameters
mini

Minimal proportion of G-C (e.g. 0.35)

maxi

Maximal proportion of G-C (e.g. 0.75)

target

Target proportion of GC (e.g. 0.4), which can be used instead of mini and maxi when using the specification as an optimization objective.

window

Length of the sliding window, in nucleotides, for local GC content. If not provided, the global GC content of the whole sequence is considered

location

Location objet indicating that the Specification only applies to a subsegment of the sequence. Make sure it is bigger than window if both parameters are provided

Examples

>>> # Enforce global GC content between 40 and 70 percent.
>>> Specification = GCContentSpecification(0.4, 0.7)
>>> # Enforce 30-80 percent local GC content over 50-nucleotides windows
>>> Specification = GCContentSpecification(0.3, 0.8, window=50)

EnforcePatternOccurence

class dnachisel.builtin_specifications.EnforcePatternOccurence(pattern=None, occurences=1, location=None, strand='from_location', center=True, boost=1.0)[source]

Enforce a number of occurrences of the given pattern in the sequence.

Shorthand for annotations: “insert” (although this specification can be used to both insert new occurences of a pattern, or destroy supernumerary patterns)

Parameters
pattern

A SequencePattern or DnaNotationPattern or a string such as “AATTG”, “BsmBI_site”, etc.

occurences

Desired number of occurrences of the pattern.

location

Location of the DNA segment on which to enforce the pattern e.g. Location(10, 45, 1). Default None means the whole sequence.

center

If True, new inserted patterns will prioritize locations at the center of the specification’s location. Else the insertion will happen at the beginning of the location.

strand

Alternative way to set the strand, meant to be used in two cases only: (1) in a Genbank annotation by setting strand='both' to indicate that the pattern could be on both strands (otherwise, only the feature’s strand will be considered). (2) if you want to create a specification without preset location, but with a set strand: EnforcePatternOccurence('BsmBI_site', strand=1). The default ‘from_location’ uses the strand specified in location, or if that is None, it sets both strands.

EnforceRegionsCompatibility

class dnachisel.builtin_specifications.EnforceRegionsCompatibility(locations, compatibility_condition, condition_label='', boost=1.0)[source]

Ensure that different subregions satisfy compatibility constraints.

EnforceSequence

class dnachisel.builtin_specifications.EnforceSequence(sequence=None, location=None, boost=1.0)[source]

Enforces a (possibly degenerate) sequence at some location.

Shorthand for annotations: “sequence”.

Parameters
sequence

An ATGC string representing the wanted sequence, possibly degenerated, for instance ATTCGCGTYTTKWNAA

location

Location of the DNA segment on which to enforce the pattern e.g. Location(10, 45, 1) or simply (10, 45, 1)

EnforceTranslation

class dnachisel.builtin_specifications.EnforceTranslation(genetic_table='default', start_codon=None, translation=None, location=None, boost=1.0)[source]

Enforce a specific amino-acid sequence translation.

This class enforces the standard translation, but it is also possible to change the class’ codons_sequences and codons_translations dictionaries for more exotic kind of translations

Shorthand for annotations: “cds”.

Parameters
location

Either a DnaChisel Location or a tuple of the form (start, end, strand) or just (start, end), with strand defaulting to +1, indicating the position of the gene to codon-optimize. If not provided, the whole sequence is considered as the gene. The location should have a length that is a multiple of 3. The location strand is either 1 if the gene is encoded on the (+) strand, or -1 for antisense.

genetic_table

Either “Standard”, “Bacterial”, or any other Biopython genetic table name (see dnachisel.biotools.CODON_TABLE_NAMES for a list of accepted names).

start_codon

Signals that the first codon is a start codon and provides a policy for changing it. It is very important that this parameter be set when dealing with full coding sequences in organisms with different start codons, e.g. GTG in bacteria. If it is not set, then a GTG (start codon or Valine) could be changed into GTA (also valine but NOT a start codon). Can be False (the first codon is not considered a start codon, but can still naturally code for methionine), keep (freezes the original sub-sequence at this location, or a codon e.g. ATG or GTG or ["ATG", "GTG"].

translation

String representing the protein sequence that the DNA segment should translate to, eg. “MKY…LL*” (“*” stands for stop codon). Can be omitted if the sequence initially encodes the right sequence.

boost

Score multiplicator (=weight) for when the specification is used as an optimization objective alongside competing objectives.

SequenceLengthBounds

class dnachisel.builtin_specifications.SequenceLengthBounds(min_length=0, max_length=None)[source]

Checks that the sequence length is between bounds.

Quite an uncommon specification as it can’t really be solved or optimized. But practical as part of a list of constraints to verify.

Parameters
min_length

Minimal allowed sequence length in nucleotides

max_length

Maximal allowed sequence length in nucleotides. None means no bound.