API Reference¶
BiopythonTranslator¶
-
class
dna_features_viewer.
BiopythonTranslator
(features_filters=(), features_properties=None)[source]¶ A translator from SeqRecords to dna_features_viewer GraphicRecord.
This can be subclassed to create custom “themes” (see the example
custom_biopython_translator.py
in the docs).This class is meant to be customized by subclassing and changing the methods (
compute_feature_label
, etc.) and/or the attributes (default_feature_color
etc).- Parameters
features_filters
List of filters (some_biopython_feature) => True/False. Only features passing all the filters are kept. This only works if you haven’t redefined
compute_filtered_features
features_properties
A function (feature)=> properties_dict
Attributes
default_feature_color = “#7245dc”
graphic_record_parameters
Dictionnary containing keyword arguments that will be passed to the (Circular)GraphicRecord constructor
ignored_features_types
A list or tuple of strings indicating all the feature types that should always be ignored (i.e. not included in the graphic record) by the translator
label_fields
This list of strings provides the order in which the different attributes of a Genbank feature will be considered, when automatically determining the feature label. For instance if the list is [“label”, “source”, “locus_tag”] and a feature has no label but has a “source”, the “source” will be displayed in the plots.
-
compute_feature_color
(feature)[source]¶ Compute a color for this feature.
If the feature has a
color
qualifier it will be used. Otherwise, the classe’sdefault_feature_color
is used.To change the behaviour, create a subclass of
BiopythonTranslator
and overwrite this method.
-
compute_feature_label_link_color
(feature)[source]¶ Compute the color of the line linking the label to its feature.
GraphicRecord¶
-
class
dna_features_viewer.
GraphicRecord
(sequence_length=None, sequence=None, features=(), feature_level_height=1, first_index=0, plots_indexing='biopython', labels_spacing=8, ticks_resolution='auto')[source]¶ Set of Genetic Features of a same DNA sequence, to be plotted together.
- Parameters
sequence_length
Length of the DNA sequence, in number of nucleotides
features
list of GraphicalFeature objects.
feature_level_height
Width in inches of one “level” for feature arrows.
annotation_height
Width in inches of one “level” for feature annotations.
first_index
Indicates the first index to plot in case the sequence is actually a subsequence of a larger one. For instance, if the Graphic record represents the segment (400, 420) of a sequence, we will have
first_index=400
andsequence_length=20
.plots_indexing
Indicates which standard to use to show nucleotide indices in the plots. If ‘biopython’, the standard python indexing is used (starting at 0). If ‘genbank’, the indexing follows the Genbank standard (starting at 1).
labels_spacing
Number of pixels that will “pad” every labels to force some horizontal space between two labels or between a label and the borders of a feature.
ticks_resolution
Leave to “auto” for an auto-selected number of ticks on the ruler, or set to e.g. 50 for a tick every 50 nucleotide.
Attributes
default_font_family
Default font to use for a feature that doesn’t declare a font. default_ruler_color Default ruler color to use when no color is given at plot() time. default_box_color Default box color for non-inline annotations. If set to None, no boxes will be drawn unless the features declare a box_color. If “auto”, a color (clearer version of the feature’s color) will be computed, for all features also declaring their box_color as “auto”. default_elevate_outline_annotations Value to use for elevate_outline_annotations when no specific value is given at
graphic_record.plot(...)
time. Set to true to have all text annotations appears above all features, or false else.-
annotate_feature
(ax, feature, level, inline=False, max_label_length=50, max_line_length=30, padding=0, indicate_strand_in_label=False)¶ Create a Matplotlib Text with the feature’s label.
The x-coordinates of the text are determined by the feature’s x_center while the y-coordinates are determined by the level.
The text is horizontally and vertically centered.
The Arrow points in the direction of the feature’s strand. If the feature has no direction (strand==0), the returned patch will simply be a rectangle.
The x-coordinates of the patch are determined by the feature’s start and end while the y-coordinates are determined by the level
-
autoselect_label_color
(background_color)¶ Autselect a color for the label font.
In the current method the label will be black on clear backrgounds, and white on dark backgrounds.
-
bokeh_feature_patch
(start, end, strand, figure_width=5, width=0.4, level=0, arrow_width_inches=0.05, **kwargs)¶ Return a dict with points coordinates of a Bokeh Feature arrow.
- Parameters
start, end, strand
-
coordinates_in_plot
(x, level)[source]¶ Convert a sequence position and height level into a (x, y) position.
-
determine_annotation_height
(levels)[source]¶ By default the ideal annotation level height is the same as the feature_level_height.
-
finalize_ax
(ax, features_levels, annotations_max_level, auto_figure_height=False, ideal_yspan=None, annotations_are_elevated=True)¶ Prettify the figure with some last changes.
Changes include redefining y-bounds and figure height.
- Parameters
ax
ax on which the record was plotted
features_levels
annotations_max_level
Number indicating to the method the maximum height for an annotation, so the method can set ymax accordingly
auto_figure_height
If true, the figure’height will be automatically re-set to a nice value (counting ~0.4inch per level in the figure).
ideal_yspan
if provided, can help the method select a better ymax to make sure all constraints fit.
-
initialize_ax
(ax, draw_line, with_ruler, ruler_color=None)¶ Initialize the ax: remove axis, draw a horizontal line, etc.
- Parameters
draw_line
True/False to draw the horizontal line or not.
with_ruler
True/False to draw the indices indicators along the line.
-
place_annotation
(feature, ax, level, annotate_inline, max_line_length, max_label_length, indicate_strand_in_label=False)¶ “Place an annotation in the figure. Decide on inline vs. outline.
- Parameters
feature
Graphic feature to place in the figure
ax
Matplotlib ax in which to place the feature.
level
level at which the annotation should be placed
annotate_inline
If true, the plotter will attempt to annotate inline, and fall back to outline annotation.
max_line_length
If an annotation label’s length exceeds this number the label will wrap over several lines.
max_label_length,
If an annotation label’s length exceeds this number the label will be cut with an ellipsis (…).
indicate_strand_in_label
If True, then the label will be represented as “<= label” or “label =>” with an arrow representing the strand.
-
plot
(ax=None, figure_width=8, draw_line=True, with_ruler=True, ruler_color=None, plot_sequence=False, annotate_inline=True, max_label_length=50, max_line_length=30, level_offset=0, strand_in_label_threshold='default', elevate_outline_annotations='default', x_lim=None, figure_height=None, sequence_params=None)¶ Plot all the features in the same Matplotlib ax
- Parameters
ax
The Matplotlib ax on which to plot the graphic record. If None is provided, a new figure and ax is generated, the ax is returned at the end.
figure_width
Width of the figure (only if no ax was provided and a new figure is created) in inches.
draw_line
If True, a base line representing the sequence will be drawn.
with_ruler
If true, the sequence indices will be indicated at regular intervals.
plot_sequence
If True and the graphic record has a “sequence” attribute set, the sequence will be displayed below the base line.
annotate_inline
If true, some feature labels will be displayed inside their corresponding feature if there is sufficient space.
level_offset
All features and annotations will be pushed up by “level_offset”. Can be useful when plotting several sets of features successively on a same ax.
elevate_outline_annotations
If true, every text annotation will be above every feature. If false, text annotations will be as close as possible to the features.
strand_in_label_pixel_threshold
Number N such that, when provided, every feature with a graphical width in pixels below N will have its strand indicated in the label by an a left/right arrow
x_lim
Horizontal axis limits to be set at the end.
sequence_params
parameters for plot_sequence
-
plot_feature
(ax, feature, level, linewidth=1.0)¶ Create an Arrow Matplotlib patch with the feature’s coordinates.
The Arrow points in the direction of the feature’s strand. If the feature has no direction (strand==0), the returned patch will simply be a rectangle.
The x-coordinates of the patch are determined by the feature’s start and end while the y-coordinates are determined by the level
-
plot_on_multiple_lines
(n_lines=None, nucl_per_line=None, plot_sequence=False, figure_width='auto', **plot_params)¶ Plot the features on different lines (one Matplotlib ax per line)
- Parameters
n_lines
Number of lines on which the record will be plotted. A number of nucleotides per line can be provided instead (see below).
nucl_per_line
Number of nucleotides to be represented on every line (determines the number of lines
n_lines
).plot_sequence
Whether to plot the nucleotide sequence on each line
figure_width
Width of the figure in inches. Leave to auto for a width of either 10 (if not sequence is plotted) or 0.15*nucl_per_line inches (if a sequence is plotted).
**plot_params
Parameters from
graphic_record.plot()
to be used in the plotting of the individual lines. This includesdraw_line
,with_ruler
,annotate_inline
,plot_sequence
,evelate_outline_annotations
,strand_in_label_pixel_threshold
- Returns
figure, axes
The matplotlib figure and axes generated.
-
plot_on_multiple_pages
(pdf_target, n_lines=None, nucl_per_line=None, lines_per_page=5, figure_width='auto', **plot_params)¶ Plot the features on different lines on different pages of a PDF.
This function returns None
- Parameters
pdf_target
Either a path to a PDF, or a file(-like) handle.
n_lines
Number of lines on which the record will be plotted. A number of nucleotides per line can be provided instead (see below).
nucl_per_line
Number of nucleotides to be represented on every line (determines the number of lines
n_lines
).lines_per_page
Number of lines on each page
plot_sequence
Whether to plot the nucleotide sequence on each line
figure_width
Width of the figure in inches. Leave to auto for a width of either 10 (if not sequence is plotted) or 0.15*nucl_per_line inches (if a sequence is plotted).
**plot_params
Parameters from
graphic_record.plot()
to be used in the plotting of the individual lines. This includesdraw_line
,with_ruler
,annotate_inline
,plot_sequence
,evelate_outline_annotations
,strand_in_label_pixel_threshold
-
plot_sequence
(ax, location=None, y_offset=1, fontdict=None, guides_intensity=0)¶ Plot a sequence of nucleotides at the bottom of the plot.
- Parameters
ax
Which axes the translation should be plotted to
location
location of the segment to translate, either (start, end) or (start, end, strand)
y_offset
Number of text levels under the plot’s base line where to draw the nucleotides. Should be 1 if the nucleotide sequence is to be plotted directly under the main line.
fontdict
Matplotlib fontdict for the text, e.g.
{'size': 11, 'weight':'bold'}
background
tuple (color1, color2) of alternate colors to plot behind each nucleotide position to guide vision. Leave to None for no background.
guides_intensity
Intensity of the vertical guides marking the different nucleotides (0 = no guides)
-
plot_translation
(ax, location=None, y_offset=2, fontdict=None, guides_intensity=0.5, translation=None, long_form_translation=True)¶ Plot a sequence of amino-acids at the bottom of the plot.
- Parameters
ax
Which axes the translation should be plotted to
location
location of the segment to translate (start, end)
y_offset
Number of text levels under the plot’s base line where to draw the amino acid names. Should be 2 if the nucleotide sequence is also plotted at level 1.
fontdict
Matplotlib fontdict for the text, e.g.
{'size': 11, 'weight':'bold'}
background
tuple (color1, color2) of alternate colors to plot behind each amino acid position to guide vision. Leave to None for no background.
translation
Sequence of amino acids either as a string
'MAKG...'
or as a list['Met', 'Ala', ...]
-
plot_with_bokeh
(figure_width=5, figure_height='auto')¶ Plot the graphic record using Bokeh.
Examples
>>>
-
property
span
¶ Return the display span (start, end) accounting for first_index.
CircularGraphicRecord¶
-
class
dna_features_viewer.
CircularGraphicRecord
(sequence_length, features, top_position=0, feature_level_height=0.2, annotation_height='auto', labels_spacing=12, **kw)[source]¶ Set of Genetic Features of a same DNA sequence, to be plotted together.
- Parameters
sequence_length
Length of the DNA sequence, in number of nucleotides
features
list of GraphicalFeature objects.
top_position
The index in the sequence that will end up at the top of the circle
feature_level_height
Width in inches of one “level” for feature arrows.
annotation_height
Width in inches of one “level” for feature annotations.
labels_spacing
Distance in basepairs to keep between labels to avoid “quasi-collisions”
**kw
Other keyword arguments - do not use, these parameters are allowed for making it easier to use GraphicRecord and CircularGraphicRecord interchangeably.
-
coordinates_in_plot
(position, level)[source]¶ Convert a sequence position and height level to (x, y) coordinates.
-
determine_annotation_height
(max_annotations_level)[source]¶ Auto-select the annotations height.
Annotation height is 0.2 at most, or else whatever will make the figure a 5*radius tall rectangle where the circular plasmid occupies the bottom-2 5th and the annotations occupy the top-3 5th.
-
finalize_ax
(ax, features_levels, annotations_max_level, auto_figure_height=False, ideal_yspan=None, annotations_are_elevated=True)[source]¶ Final display range and figure dimension tweakings.
-
initialize_ax
(ax, draw_line, with_ruler)[source]¶ Initialize the ax with a circular line, sets limits, aspect etc.
GraphicFeature¶
-
class
dna_features_viewer.
GraphicFeature
(start=None, end=None, strand=None, label=None, color='#000080', thickness=14, linewidth=1.0, linecolor='#000000', fontdict=None, html=None, open_left=False, open_right=False, box_linewidth=1, box_color='auto', legend_text=None, label_link_color='black', **data)[source]¶ Genetic Feature to be plotted.
- Parameters
start, end
Coordinates of the feature in the final sequence.
strand
Directionality of the feature. can be +1/-1/0 for direct sense, anti-sense, or no directionality.
label
Short descriptive text associated and plotted with the feature
color
Color of the feature, any Matplotlib-compatible format is accepted, such as “white”, “w”, “#ffffff”, (1,1,1), etc.
linecolor
Color of the feature’s border, any Matplotlib-compatible format is accepted, such as “white”, “w”, “#ffffff”, (1,1,1), etc.
linewidth
Width of the line (=edge) surrounding the graphic feature, in pixels.
thickness
Vertical span of the feature
box_color
Color of the label box. Set to None for no box around the label. Leave to “auto” for a box color that is a lightened version of the feature’s color.
data
Any other keyword is kept into the feature.data[] dictionary.
fontdict
A Matplotlib fontdict for the font to be used in the label, e.g.
size=11
,weight='bold'
,family='Helvetica'
, etc.open_left, open_right
Set to True if this feature does not end on the right or left because it is a cropped version of a bigger feature.
box_linewidth
Width of the line delimiting the text box when the annotation is outside the graphic feature. Set to 0 for no box borders
box_color
Background color of the annotation’s text box. If left to “auto” the color will be a lighter version of the feature’s color.
label_link_color
Color of the line linking the text annotation to its respective graphic feature. Set to auto for the line to automatically be a darker version of the feature’s color.
-
crop
(window)[source]¶ Return a the fragment of the feature that is in the window.
If there is no overlap between the feature location and the window, None is returned.
-
static
from_biopython_feature
(feature, **props)[source]¶ Create a GraphicalFeature from a Biopython.Feature object.
-
property
length
¶ Return the length of the feature (end-start)
-
property
x_center
¶ Return the x-center of the feature, (start+end)/2
Biotools¶
-
dna_features_viewer.biotools.
annotate_biopython_record
(seqrecord, location='full', feature_type='misc_feature', margin=0, **qualifiers)[source]¶ Add a feature to a Biopython SeqRecord.
- Parameters
seqrecord
The biopython seqrecord to be annotated.
location
Either (start, end) or (start, end, strand). (strand defaults to +1)
feature_type
The type associated with the feature
margin
Number of extra bases added on each side of the given location.
qualifiers
Dictionnary that will be the Biopython feature’s qualifiers attribute.
-
dna_features_viewer.biotools.
complement
(dna_sequence)[source]¶ Return the complement of the DNA sequence.
For instance
complement("ATGCCG")
returns"TACGGC"
.Uses BioPython for speed.
-
dna_features_viewer.biotools.
extract_graphical_translation
(sequence, location, long_form=False)[source]¶ Return a string of the “graphical” translation of a sequence’s subsegment.
Here “graphical” means that the amino acid sequence is always given left-to-right, as it will appear under the sequence in the plot. This matters when the location is on the -1 strand. In this case, the amino-acids are determined by (part of) the reverse-complement of the sequence, however the sequence returned will be the mirror of the translated sequence, as this is the left-to-right order in which the codons corresponding to the amino-acids appear in the sequence.
- Parameters
sequence
An “ATGC” string.
location
Either (start, end) or (start, end, strand), with strand in (0, 1, -1).
long_form
if True, a list of 3-letter amino acid representations is returned instead ([‘Ala’, ‘Ser’, …]).
-
dna_features_viewer.biotools.
find_narrowest_text_wrap
(text, max_line_length)[source]¶ Wrap the text into a multi-line text minimizing the longest line length.
This is done by first wrapping the text using max_line_length, then attempt new wraps by iteratively decreasing the line_length, as long as the number of lines stays the same as with max_line_length.