String Functions Documentation
This page describes the functions in string_functions.py
and their usage.
convert_dbxref_to_dict(string)
Takes a string from the Dbxref additional fields and parses it into a dict.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
string | str | String in the Dbxref format, as below. | required |
Returns:
Type | Description |
---|---|
dict | The string unpacked into a dict of key:value pairs, as below. |
Examples:
>>> convert_dbxref_to_dict('Ensembl:123104,UniProtKB:Q7D56')
{'Ensembl': '123104', 'UniProtKB': 'Q7D56'}
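The parsing described above can be sketched in a few lines. This is a minimal illustration, not the code in string_functions.py: the name `parse_dbxref` is hypothetical, and it assumes entries are comma-separated with a colon between the database name and the accession, as in the example.

```python
def parse_dbxref(string):
    """Hypothetical sketch of Dbxref parsing; not the library's implementation."""
    result = {}
    for entry in string.split(','):
        entry = entry.strip()
        if not entry:
            # Skip empty entries, such as those left by a trailing comma.
            continue
        # Split on the first colon only, so accessions containing ':' stay whole.
        database, _, accession = entry.partition(':')
        result[database] = accession
    return result


print(parse_dbxref('Ensembl:123104,UniProtKB:Q7D56'))
```

Splitting on the first colon only is a design choice for accessions that themselves contain a colon; whether the real function does the same is not stated in this page.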
convert_dict_to_fields_gff(dictionary)
Takes a dict and converts it to a string for the additional fields section of GFF format.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
dictionary | dict | A dict of key:value pairs, as below. | required |
Returns:
Type | Description |
---|---|
str | String in GFF additional fields format, as below. |
Examples:
>>> convert_dict_to_fields_gff({'gene_name': 'Hh', 'gene_id': '123'})
'gene_name=Hh;gene_id=123'
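One way to produce this format is a single join over the dict items. The sketch below is an assumption about the approach, not the library's code; `dict_to_gff_fields` is a hypothetical name, and values are assumed to need no escaping.

```python
def dict_to_gff_fields(dictionary):
    """Hypothetical sketch; assumes keys and values need no escaping."""
    # GFF additional fields are 'key=value' pairs joined by ';'.
    return ';'.join(f'{key}={value}' for key, value in dictionary.items())


print(dict_to_gff_fields({'gene_name': 'Hh', 'gene_id': '123'}))
```

The output order follows the dict's insertion order, which Python guarantees from version 3.7 onward.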
convert_dict_to_fields_gtf(dictionary)
Takes a dict and converts it to a string for the additional fields section of GTF format.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
dictionary | dict | A dict of key:value pairs, as below. | required |
Returns:
Type | Description |
---|---|
str | String in GTF additional fields format, as below. |
Examples:
>>> convert_dict_to_fields_gtf({'gene_name': 'Hh', 'gene_id': '123'})
'gene_name "Hh"; gene_id "123"'
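The GTF variant differs only in its separators: a space between key and value, double quotes around the value, and '; ' between pairs. A minimal sketch, assuming no terminal ';' is written (as in the example above) and using the hypothetical name `dict_to_gtf_fields`:

```python
def dict_to_gtf_fields(dictionary):
    """Hypothetical sketch; assumes values contain no double quotes."""
    # GTF additional fields are 'key "value"' pairs joined by '; '.
    return '; '.join(f'{key} "{value}"' for key, value in dictionary.items())


print(dict_to_gtf_fields({'gene_name': 'Hh', 'gene_id': '123'}))
```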
convert_fields_to_dict_gff(string)
Takes a string from the GFF additional fields (the ninth column, index 8 when counting from zero) and parses it into a dict.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
string | str | String in the GFF additional fields format, as below. | required |
Returns:
Type | Description |
---|---|
dict | The string unpacked into a dict of key:value pairs, as below. |
Examples:
>>> convert_fields_to_dict_gff('gene_name=Hh;gene_id=123')
{'gene_name': 'Hh', 'gene_id': '123'}
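The reverse direction can be sketched by splitting on ';' and then on the first '='. This is an illustration under the assumption that values contain no ';' characters; `gff_fields_to_dict` is a hypothetical name, not the library's function.

```python
def gff_fields_to_dict(string):
    """Hypothetical sketch; assumes values contain no ';' characters."""
    result = {}
    for field in string.split(';'):
        field = field.strip()
        if not field:
            # Ignore empty fields, e.g. after a trailing ';'.
            continue
        # Split on the first '=' only, so values containing '=' stay whole.
        key, _, value = field.partition('=')
        result[key] = value
    return result


print(gff_fields_to_dict('gene_name=Hh;gene_id=123'))
```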
convert_fields_to_dict_gtf(string)
Takes a string from the GTF additional fields (the ninth column, index 8 when counting from zero) and parses it into a dict.
Tolerates strings with and without a terminal ';' separator.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
string | str | String in the GTF additional fields format, as below. | required |
Returns:
Type | Description |
---|---|
dict | The string unpacked into a dict of key:value pairs, as below. |
Examples:
>>> convert_fields_to_dict_gtf('gene_name "Hh"; gene_id "123"')
{'gene_name': 'Hh', 'gene_id': '123'}
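Tolerating an optional terminal ';' falls out naturally if empty fields are skipped after splitting. The sketch below shows one way to do this; `gtf_fields_to_dict` is a hypothetical name, and the code assumes quoted values contain no ';' characters.

```python
def gtf_fields_to_dict(string):
    """Hypothetical sketch; assumes quoted values contain no ';' characters."""
    result = {}
    for field in string.split(';'):
        field = field.strip()
        if not field:
            # A terminal ';' leaves an empty final field; skip it.
            continue
        # The key ends at the first space; the rest is the quoted value.
        key, _, value = field.partition(' ')
        result[key] = value.strip().strip('"')
    return result


# Both forms, with and without the terminal ';', give the same dict.
print(gtf_fields_to_dict('gene_name "Hh"; gene_id "123"'))
print(gtf_fields_to_dict('gene_name "Hh"; gene_id "123";'))
```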
make_gene_list(lst, filename)
Writes a gene list file from a list of genes and a string filename or path.
The file contains the genes, one per line.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
lst | list | List of gene ids to be saved. | required |
filename | str | Name or path of the destination file. | required |
Examples:
>>> make_gene_list(['Hh', 'bcd', 'nos'], 'fly_genes.txt')
>>> with open('fly_genes.txt', 'r') as file:
...     for line in file:
...         print(line.strip())
Hh
bcd
nos
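Writing one gene per line needs only a loop over the list. The following is a sketch rather than the library's code: `write_gene_list` is a hypothetical name, and the demo writes to a temporary directory so that it leaves no files behind.

```python
import os
import tempfile


def write_gene_list(lst, filename):
    """Hypothetical sketch; writes one gene id per line."""
    with open(filename, 'w') as handle:
        for gene in lst:
            handle.write(f'{gene}\n')


# Write to a temporary directory, then read the file back to check it.
with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, 'fly_genes.txt')
    write_gene_list(['Hh', 'bcd', 'nos'], path)
    with open(path) as handle:
        genes = handle.read().splitlines()

print(genes)
```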
prefixify(species)
Converts a string of format "Genus_species" to "Gspe" species prefix format.
Note
If the species string lacks an underscore, the entire original string is returned.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
species | str | Name of species as "Genus_species". | required |
Returns:
Type | Description |
---|---|
str | Species prefix, e.g. "Gspe". |
Examples:
>>> prefixify('Genus_species')
'Gspe'
>>> prefixify('MmusDrerXlae')
'MmusDrerXlae'
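The two examples suggest a simple rule: the first letter of the genus followed by the first three letters of the species name. The sketch below encodes that reading, which is an inference from the examples rather than a documented rule; `make_prefix` is a hypothetical name.

```python
def make_prefix(species):
    """Hypothetical sketch; the 1 + 3 letter rule is inferred from the examples."""
    if '_' not in species:
        # No underscore: return the original string unchanged.
        return species
    genus, _, epithet = species.partition('_')
    # One letter from the genus, three from the species name.
    return genus[:1] + epithet[:3]


print(make_prefix('Genus_species'))
print(make_prefix('MmusDrerXlae'))
```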