API reference

class crazydoc.StyleObserver

Generic class for observing style-based annotations in sequences.

The provided subclasses each observe one particular type of DNA sequence annotation, such as the highlight color, bold text, underlines, etc.

class crazydoc.CrazydocParser(observers)

Parser to translate MS Word docx documents to Biopython records.

Parameters:

observers (list) – A list of either crazydoc.Observer objects or observer names in highlight_color, font_color, bold, italic, upper_case, lower_case, underline.

Examples

>>> parser = CrazydocParser(['highlight_color', 'font_color'])
>>> biopython_records = parser.parse_doc_file('example.docx')
parse_doc_file(filepath=None, doc=None, is_protein=False)

Return a list of records, 1 for each sequence contained in the docx.

Parameters:
  • filepath (str, optional) – A path to a docx file.

  • doc (Document, optional) – A python-docx Document object, which can be provided instead of the file path.

  • is_protein (bool, optional) – True if the sequences are protein sequences (default: False).

class crazydoc.CrazydocSketcher

Unavailable class: install DnaFeaturesViewer.

Try: pip install dna_features_viewer

crazydoc.write_crazydoc(seqs, qualifier, path, formats=[[WD_COLOR_INDEX.GREEN, WD_COLOR_INDEX.GRAY_25, WD_COLOR_INDEX.TEAL, WD_COLOR_INDEX.VIOLET, WD_COLOR_INDEX.RED, WD_COLOR_INDEX.GRAY_50, WD_COLOR_INDEX.DARK_YELLOW, WD_COLOR_INDEX.DARK_RED, WD_COLOR_INDEX.DARK_BLUE], [(248, 118, 109), (183, 159, 0), (0, 186, 56), (0, 191, 196), (97, 156, 255), (245, 100, 227)], ['bold'], ['underline'], ['italic']])

Write one or more SeqRecords annotated with features to a new document.

WILL OVERWRITE EXISTING DOCUMENTS WITHOUT WARNING!

Parameters:
  • seqs (list) – List of SeqRecords to write out.

  • qualifier (str) – The key to use for feature names from .qualifiers.

  • path (str) – The path to write the document to. Will overwrite existing documents!

  • formats (list, optional) – List of format groups to use on features.

Examples

A custom list of docx formats can be supplied to the formats argument. If formats is not declared a default palette is used.

>>> from docx.shared import RGBColor
>>> from docx.enum.text import WD_COLOR
>>> custom_formats = [
        [
            WD_COLOR.PINK,
            WD_COLOR.TEAL,
            WD_COLOR.YELLOW,
            WD_COLOR.GRAY_50,
            WD_COLOR.VIOLET,
            WD_COLOR.TURQUOISE,
        ],
        [
            RGBColor(0xFF, 0x00, 0x00),
            RGBColor(0x00, 0xFF, 0x00),
            RGBColor(0x00, 0x00, 0xFF),
        ],
        ["bold"],
        ["underline"],
        ["italic"],
    ]
>>> write_crazydoc(seq, 'product', 'custom_colours.docx', formats=custom_formats)
crazydoc.write_crazyseq(seqrec, qualifier, doc, formats=[[WD_COLOR_INDEX.GREEN, WD_COLOR_INDEX.GRAY_25, WD_COLOR_INDEX.TEAL, WD_COLOR_INDEX.VIOLET, WD_COLOR_INDEX.RED, WD_COLOR_INDEX.GRAY_50, WD_COLOR_INDEX.DARK_YELLOW, WD_COLOR_INDEX.DARK_RED, WD_COLOR_INDEX.DARK_BLUE], [(248, 118, 109), (183, 159, 0), (0, 186, 56), (0, 191, 196), (97, 156, 255), (245, 100, 227)], ['bold'], ['underline'], ['italic']])

Write a single sequence with annotated features to an existing document element. If you want crazydoc to handle document creation, use write_crazydoc() even for single sequences.

Parameters:
  • seqrec (SeqRecord) – The sequence to write, with features.

  • qualifier (str) – The key of the .qualifier to use as feature names.

  • doc (docx document) – The docx document element to write to.

  • formats (list, optional) – List of format groups to use on features.

crazydoc.make_groups(position_features)

Make groups where no overlapping features share a group.

Parameters:

position_features (list of lists) – Each sublist represents a single position in the sequence and lists all the features that position is part of.

Returns:

Each sublist contains all features in a single group. Groups can use a single type of formatting, e.g., highlighting or font color.

Return type:

list of lists

crazydoc.records_to_genbank(records, path='.', extension=None, is_protein=False)