API Reference
BiopythonTranslator
- class dna_features_viewer.BiopythonTranslator(features_filters=(), features_properties=None)[source]
A translator from SeqRecords to dna_features_viewer GraphicRecord.
This can be subclassed to create custom “themes” (see the example
custom_biopython_translator.py
in the docs).This class is meant to be customized by subclassing and changing the methods (
compute_feature_label
, etc.) and/or the attributes (default_feature_color
etc).- default_feature_color = "#7245dc"
- graphic_record_parameters
Dictionary containing keyword arguments that will be passed to the (Circular)GraphicRecord constructor.
- ignored_features_types
A list or tuple of strings indicating all the feature types that should always be ignored (i.e. not included in the graphic record) by the translator.
- label_fields
This list of strings provides the order in which the different attributes of a Genbank feature will be considered, when automatically determining the feature label. For instance if the list is [“label”, “source”, “locus_tag”] and a feature has no label but has a “source”, the “source” will be displayed in the plots.
- Parameters:
features_filters – List of filters (some_biopython_feature) => True/False. Only features passing all the filters are kept. This only works if you haven’t redefined
compute_filtered_features
.features_properties – A function (feature)=> properties_dict.
- compute_feature_color(feature)[source]
Compute a color for this feature.
If the feature has a
color
qualifier it will be used. Otherwise, the classe’sdefault_feature_color
is used.To change the behaviour, create a subclass of
BiopythonTranslator
and overwrite this method.
- compute_feature_label_link_color(feature)[source]
Compute the color of the line linking the label to its feature.
GraphicRecord
- class dna_features_viewer.GraphicRecord(sequence_length=None, sequence=None, features=(), feature_level_height=1, first_index=0, plots_indexing='biopython', labels_spacing=8, ticks_resolution='auto')[source]
Set of Genetic Features of a same DNA sequence, to be plotted together.
- Parameters:
sequence_length – Length of the DNA sequence, in number of nucleotides.
features – list of GraphicalFeature objects.
feature_level_height – Width in inches of one “level” for feature arrows.
annotation_height – Width in inches of one “level” for feature annotations.
first_index – Indicates the first index to plot in case the sequence is actually a subsequence of a larger one. For instance, if the Graphic record represents the segment (400, 420) of a sequence, we will have
first_index=400
andsequence_length=20
.plots_indexing – Indicates which standard to use to show nucleotide indices in the plots. If ‘biopython’, the standard python indexing is used (starting at 0). If ‘genbank’, the indexing follows the Genbank standard (starting at 1).
labels_spacing – Number of pixels that will “pad” every labels to force some horizontal space between two labels or between a label and the borders of a feature.
ticks_resolution – Leave to “auto” for an auto-selected number of ticks on the ruler, or set to e.g. 50 for a tick every 50 nucleotide.
- default_font_family
Default font to use for a feature that doesn’t declare a font.
- default_ruler_color
Default ruler color to use when no color is given at plot() time.
- default_box_color
Default box color for non-inline annotations. If set to None, no boxes will be drawn unless the features declare a box_color. If “auto”, a color (clearer version of the feature’s color) will be computed, for all features also declaring their box_color as “auto”.
- default_elevate_outline_annotations
Value to use for elevate_outline_annotations when no specific value is given at
graphic_record.plot(...)
time. Set to true to have all text annotations appears above all features, or false else.
- annotate_feature(ax, feature, level, inline=False, max_label_length=50, max_line_length=30, padding=0, indicate_strand_in_label=False)
Create a Matplotlib Text with the feature’s label.
The x-coordinates of the text are determined by the feature’s x_center while the y-coordinates are determined by the level.
The text is horizontally and vertically centered.
The Arrow points in the direction of the feature’s strand. If the feature has no direction (strand==0), the returned patch will simply be a rectangle.
The x-coordinates of the patch are determined by the feature’s start and end while the y-coordinates are determined by the level.
- autoselect_label_color(background_color)
Autselect a color for the label font.
In the current method the label will be black on clear backrgounds, and white on dark backgrounds.
- bokeh_feature_patch(start, end, strand, figure_width=5, width=0.4, level=0, arrow_width_inches=0.05, **kwargs)
Return a dict with points coordinates of a Bokeh Feature arrow.
- Parameters:
start
end
strand
- coordinates_in_plot(x, level)[source]
Convert a sequence position and height level into a (x, y) position.
- determine_annotation_height(levels)[source]
By default the ideal annotation level height is the same as the feature_level_height.
- finalize_ax(ax, features_levels, annotations_max_level, auto_figure_height=False, ideal_yspan=None, annotations_are_elevated=True)
Prettify the figure with some last changes.
Changes include redefining y-bounds and figure height.
- Parameters:
ax – ax on which the record was plotted.
features_levels
annotations_max_level – Number indicating to the method the maximum height for an annotation, so the method can set ymax accordingly.
auto_figure_height – If true, the figure height will be automatically re-set to a nice value (counting ~0.4 inch per level in the figure).
ideal_yspan – if provided, can help the method select a better ymax to make sure all constraints fit.
- initialize_ax(ax, draw_line, with_ruler, ruler_color=None)
Initialize the ax: remove axis, draw a horizontal line, etc.
- Parameters:
draw_line – True/False to draw the horizontal line or not.
with_ruler – True/False to draw the indices indicators along the line.
- place_annotation(feature, ax, level, annotate_inline, max_line_length, max_label_length, indicate_strand_in_label=False)
Place an annotation in the figure.
Decide on inline vs. outline.
- Parameters:
feature – Graphic feature to place in the figure.
ax – Matplotlib ax in which to place the feature.
level – level at which the annotation should be placed.
annotate_inline – If true, the plotter will attempt to annotate inline, and fall back to outline annotation.
max_line_length – If an annotation label’s length exceeds this number the label will wrap over several lines.
max_label_length – If an annotation label’s length exceeds this number the label will be cut with an ellipsis (…).
- :paramIf an annotation label’s length exceeds this number the label will
be cut with an ellipsis (…).
- Parameters:
indicate_strand_in_label – If True, then the label will be represented as “<= label” or “label =>” with an arrow representing the strand.
- plot(ax=None, figure_width=8, draw_line=True, with_ruler=True, ruler_color=None, plot_sequence=False, annotate_inline=True, max_label_length=50, max_line_length=30, level_offset=0, strand_in_label_threshold='default', elevate_outline_annotations='default', x_lim=None, figure_height=None, sequence_params=None)
Plot all the features in the same Matplotlib ax.
- Parameters:
ax – The Matplotlib ax on which to plot the graphic record. If None is provided, a new figure and ax is generated, the ax is returned at the end.
figure_width – Width of the figure (only if no ax was provided and a new figure is created) in inches.
draw_line – If True, a base line representing the sequence will be drawn.
with_ruler – If true, the sequence indices will be indicated at regular intervals.
ruler_color – Ruler color.
plot_sequence – If True and the graphic record has a “sequence” attribute set, the sequence will be displayed below the base line.
annotate_inline – If true, some feature labels will be displayed inside their corresponding feature if there is sufficient space.
max_label_length – If an annotation label’s length exceeds this number the label will be cut with an ellipsis (…).
max_line_length – If an annotation label’s length exceeds this number the label will wrap over several lines.
level_offset – All features and annotations will be pushed up by “level_offset”. Can be useful when plotting several sets of features successively on a same ax.
strand_in_label_pixel_threshold – Number N such that, when provided, every feature with a graphical width in pixels below N will have its strand indicated in the label by an a left/right arrow.
elevate_outline_annotations – If true, every text annotation will be above every feature. If false, text annotations will be as close as possible to the features.
x_lim – Horizontal axis limits to be set at the end.
figure_height – Figure height.
sequence_params – parameters for plot_sequence.
- plot_feature(ax, feature, level, linewidth=1.0)
Create an Arrow Matplotlib patch with the feature’s coordinates.
The Arrow points in the direction of the feature’s strand. If the feature has no direction (strand==0), the returned patch will simply be a rectangle.
The x-coordinates of the patch are determined by the feature’s start and end while the y-coordinates are determined by the level.
- plot_on_multiple_lines(n_lines=None, nucl_per_line=None, plot_sequence=False, figure_width='auto', translation_params=None, **plot_params)
Plot the features on different lines (one Matplotlib ax per line).
- Parameters:
n_lines – Number of lines on which the record will be plotted. A number of nucleotides per line can be provided instead (see below).
nucl_per_line – Number of nucleotides to be represented on every line (determines the number of lines
n_lines
).plot_sequence – Whether to plot the nucleotide sequence on each line.
figure_width – Width of the figure in inches. Leave to auto for a width of either 10 (if not sequence is plotted) or 0.15*nucl_per_line inches (if a sequence is plotted).
translation_params – Parameters for sequence translation. By default (
None
), it does not plot a translated sequence.**plot_params – Parameters from
graphic_record.plot()
to be used in the plotting of the individual lines. This includesdraw_line
,with_ruler
,annotate_inline
,plot_sequence
,evelate_outline_annotations
,strand_in_label_pixel_threshold
.
- Returns:
The matplotlib figure and axes generated.
- Return type:
figure, axes
- plot_on_multiple_pages(pdf_target, n_lines=None, nucl_per_line=None, lines_per_page=5, figure_width='auto', translation_params=None, **plot_params)
Plot the features on different lines on different pages of a PDF.
This function returns None
- Parameters:
pdf_target – Either a path to a PDF, or a file(-like) handle.
n_lines – Number of lines on which the record will be plotted. A number of nucleotides per line can be provided instead (see below).
nucl_per_line – Number of nucleotides to be represented on every line (determines the number of lines
n_lines
).lines_per_page – Number of lines on each page.
plot_sequence – Whether to plot the nucleotide sequence on each line.
figure_width – Width of the figure in inches. Leave to auto for a width of either 10 (if not sequence is plotted) or 0.15*nucl_per_line inches (if a sequence is plotted).
translation_params – Parameters for sequence translation. By default (
None
), it does not plot a translated sequence.**plot_params – Parameters from
graphic_record.plot()
to be used in the plotting of the individual lines. This includesdraw_line
,with_ruler
,annotate_inline
,plot_sequence
,evelate_outline_annotations
,strand_in_label_pixel_threshold
.
- plot_sequence(ax, location=None, y_offset=1, fontdict=None, guides_intensity=0)
Plot a sequence of nucleotides at the bottom of the plot.
- Parameters:
ax – Which axes the translation should be plotted to.
location – location of the segment to translate, either (start, end) or (start, end, strand).
y_offset – Number of text levels under the plot’s base line where to draw the nucleotides. Should be 1 if the nucleotide sequence is to be plotted directly under the main line.
fontdict – Matplotlib fontdict for the text, e.g.
{'size': 11, 'weight':'bold'}
background – tuple (color1, color2) of alternate colors to plot behind each nucleotide position to guide vision. Leave to None for no background.
guides_intensity – Intensity of the vertical guides marking the different nucleotides (0 = no guides).
- plot_translation(ax, location=None, y_offset=2, fontdict=None, guides_intensity=0.5, translation=None, long_form_translation=True)
Plot a sequence of amino-acids at the bottom of the plot.
- Parameters:
ax – Which axes the translation should be plotted to.
location – location of the segment to translate (start, end).
y_offset – Number of text levels under the plot’s base line where to draw the amino acid names. Should be 2 if the nucleotide sequence is also plotted at level 1.
fontdict – Matplotlib fontdict for the text, e.g.
{'size': 11, 'weight':'bold'}
background – tuple (color1, color2) of alternate colors to plot behind each amino acid position to guide vision. Leave to None for no background.
translation – Sequence of amino acids either as a string
'MAKG...'
or as a list['Met', 'Ala', ...]
- plot_with_bokeh(figure_width=5, figure_height='auto', tools='auto')
Plot the graphic record using Bokeh.
Examples
>>>
- property span
Return the display span (start, end) accounting for first_index.
- split_overflowing_features_circularly()[source]
Split the features that overflow over the edge for circular constructs (inplace).
- to_biopython_record(sequence)[source]
Export to Biopython record.
Examples
>>> from Bio import SeqIO >>> gr_record = GraphicRecord(features=features, sequence_length=len(seq), >>> sequence=seq) >>> bio_record = gr_record.to_biopython_record() >>> with open("example.gb", "w+") as f: >>> SeqIO.write(record, f, "genbank")
CircularGraphicRecord
- class dna_features_viewer.CircularGraphicRecord(sequence_length, features, top_position=0, feature_level_height=0.2, annotation_height='auto', labels_spacing=12, **kw)[source]
Set of Genetic Features of a same DNA sequence, to be plotted together.
- Parameters:
sequence_length – Length of the DNA sequence, in number of nucleotides.
features – list of GraphicalFeature objects.
top_position – The index in the sequence that will end up at the top of the circle.
feature_level_height – Width in inches of one “level” for feature arrows.
annotation_height – Width in inches of one “level” for feature annotations.
labels_spacing – Distance in basepairs to keep between labels to avoid “quasi-collisions”.
**kw – Other keyword arguments - do not use, these parameters are allowed for making it easier to use GraphicRecord and CircularGraphicRecord interchangeably.
- coordinates_in_plot(position, level)[source]
Convert a sequence position and height level to (x, y) coordinates.
- determine_annotation_height(max_annotations_level)[source]
Auto-select the annotations height.
Annotation height is 0.2 at most, or else whatever will make the figure a 5*radius tall rectangle where the circular plasmid occupies the bottom-2 5th and the annotations occupy the top-3 5th.
- finalize_ax(ax, features_levels, annotations_max_level, auto_figure_height=False, ideal_yspan=None, annotations_are_elevated=True)[source]
Final display range and figure dimension tweakings.
- initialize_ax(ax, draw_line, with_ruler)[source]
Initialize the ax with a circular line, sets limits, aspect etc.
GraphicFeature
- class dna_features_viewer.GraphicFeature(start=None, end=None, strand=None, label=None, color='#000080', thickness=14, linewidth=1.0, linecolor='#000000', fontdict=None, html=None, open_left=False, open_right=False, box_linewidth=1, box_color='auto', legend_text=None, label_link_color='black', **data)[source]
Genetic Feature to be plotted.
- Parameters:
start – Coordinates of the feature in the final sequence.
end – Coordinates of the feature in the final sequence.
strand – Directionality of the feature. can be +1/-1/0 for direct sense, anti-sense, or no directionality.
label – Short descriptive text associated and plotted with the feature.
color – Color of the feature, any Matplotlib-compatible format is accepted, such as “white”, “w”, “#ffffff”, (1,1,1), etc.
linecolor – Color of the feature’s border, any Matplotlib-compatible format is accepted, such as “white”, “w”, “#ffffff”, (1,1,1), etc.
linewidth – Width of the line (=edge) surrounding the graphic feature, in pixels.
thickness – Vertical span of the feature.
box_color – Color of the label box. Set to None for no box around the label. Leave to “auto” for a box color that is a lightened version of the feature’s color.
data – Any other keyword is kept into the feature.data[] dictionary.
fontdict – A Matplotlib fontdict for the font to be used in the label, e.g.
size=11
,weight='bold'
,family='Helvetica'
, etc.open_left – Set to True if this feature does not end on the right or left because it is a cropped version of a bigger feature.
open_right – Set to True if this feature does not end on the right or left because it is a cropped version of a bigger feature.
box_linewidth – Width of the line delimiting the text box when the annotation is outside the graphic feature. Set to 0 for no box borders.
box_color – Background color of the annotation’s text box. If left to “auto” the color will be a lighter version of the feature’s color.
label_link_color – Color of the line linking the text annotation to its respective graphic feature. Set to auto for the line to automatically be a darker version of the feature’s color.
- crop(window)[source]
Return a the fragment of the feature that is in the window.
If there is no overlap between the feature location and the window, None is returned.
- static from_biopython_feature(feature, **props)[source]
Create a GraphicalFeature from a Biopython.Feature object.
- property length
Return the length of the feature (end-start)
- property x_center
Return the x-center of the feature, (start+end)/2
Biotools
- dna_features_viewer.biotools.annotate_biopython_record(seqrecord, location='full', feature_type='misc_feature', margin=0, **qualifiers)[source]
Add a feature to a Biopython SeqRecord.
- Parameters:
seqrecord – The biopython seqrecord to be annotated.
location – Either (start, end) or (start, end, strand). (strand defaults to +1)
feature_type – The type associated with the feature.
margin – Number of extra bases added on each side of the given location.
qualifiers – Dictionary that will be the Biopython feature’s qualifiers attribute.
- dna_features_viewer.biotools.complement(dna_sequence)[source]
Return the complement of the DNA sequence.
For instance
complement("ATGCCG")
returns"TACGGC"
.Uses BioPython for speed.
- dna_features_viewer.biotools.extract_graphical_translation(sequence, location, long_form=False)[source]
Return a string of the “graphical” translation of a sequence’s subsegment.
Here “graphical” means that the amino acid sequence is always given left-to-right, as it will appear under the sequence in the plot. This matters when the location is on the -1 strand. In this case, the amino-acids are determined by (part of) the reverse-complement of the sequence, however the sequence returned will be the mirror of the translated sequence, as this is the left-to-right order in which the codons corresponding to the amino-acids appear in the sequence.
- Parameters:
sequence – An “ATGC” string.
location – Either (start, end) or (start, end, strand), with strand in (0, 1, -1).
long_form – if True, a list of 3-letter amino acid representations is returned instead ([‘Ala’, ‘Ser’, …]).
- dna_features_viewer.biotools.find_narrowest_text_wrap(text, max_line_length)[source]
Wrap the text into a multi-line text minimizing the longest line length.
This is done by first wrapping the text using max_line_length, then attempt new wraps by iteratively decreasing the line_length, as long as the number of lines stays the same as with max_line_length.
- dna_features_viewer.biotools.load_record(path, filetype=None)[source]
Load a Genbank file.
- Parameters:
path – Path to record.
filetype – Filetype; one of “genbank” or “gff”. Default None infers from extension.