TranSMART loader¶
This package contains classes that represent the core domain objects stored in the TranSMART platform, an open source data sharing and analytics platform for translational biomedical research.
It also provides a utility that writes such objects to tab-separated files that can be loaded into a TranSMART database using the transmart-copy tool.
⚠️ Note: this is a very preliminary version, still under development. Issues can be reported at https://github.com/thehyve/python_transmart_loader/issues.
Installation and usage¶
To install transmart_loader, do:
pip install transmart-loader
or from sources:
git clone https://github.com/thehyve/python_transmart_loader.git
cd python_transmart_loader
pip install .
Usage¶
Usage examples can be found in these projects:
- fhir2transmart: a tool that translates core HL7 FHIR resources to the TranSMART data model.
- ontology2transmart: a tool that translates ontologies available from DIMDI to TranSMART ontologies.
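As a rough illustration of how the classes in this package fit together, the sketch below builds a one-patient collection and writes it with TransmartCopyWriter. It assumes the constructor signatures listed in the API reference of this document; other releases of the package may differ. All values (study ID, concept code, concept path, the sex value, the mapping source) are made-up placeholders, and the function name write_example_collection is hypothetical. The import is deferred into the function so that the snippet can be read without the package installed.

```python
def write_example_collection(output_dir: str) -> None:
    """Build a tiny collection and write it in transmart-copy format.

    Assumes the signatures shown in the API reference; all values are
    made-up placeholders.
    """
    # Imported lazily: requires `pip install transmart-loader`.
    from transmart_loader.copy_writer import TransmartCopyWriter
    from transmart_loader.transmart import (
        Concept, ConceptNode, DataCollection, NumericalValue, Observation,
        Patient, PatientMapping, Study, StudyNode, TreeNode, TrialVisit,
        ValueType)

    study = Study('EXAMPLE', 'Example study')
    trial_visit = TrialVisit(study, 'Baseline')
    concept = Concept('EX:AGE', 'Age', '\\Example\\Age\\', ValueType.Numeric)
    patient = Patient('P1', 'female', [PatientMapping('SUBJ_ID', 'P1')])

    # A numerical observation without a visit or dates; date values are
    # avoided here because of the known issue with date serialisation.
    observation = Observation(
        patient, concept, None, trial_visit, None, None, NumericalValue(42.0))

    # Ontology: a folder containing a study node with one concept node.
    root = TreeNode('Example')
    study_node = StudyNode(study)
    study_node.add_child(ConceptNode(concept))
    root.add_child(study_node)

    collection = DataCollection(
        concepts=[concept],
        studies=[study],
        trial_visits=[trial_visit],
        visits=[],
        ontology=[root],
        patients=[patient],
        observations=[observation])

    writer = TransmartCopyWriter(output_dir)
    writer.write_collection(collection)
```

According to the API reference, the output directory must either not exist yet or be empty, so calling the function twice with the same directory is expected to fail.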
Documentation¶
Full documentation of the package is available at Read the Docs.
Known issues¶
- Date values are not yet serialised correctly (see the FIXME on visit_observation in the transmart_loader.copy_writer module).
Development¶
For a quick reference on software development, we refer to the software guide checklist.
Python versions¶
This repository is set up with Python version 3.6.
Add or remove Python versions based on project requirements. The guide contains more information about Python versions and writing Python 2 and 3 compatible code.
Package management and dependencies¶
This project uses pip for installing dependencies and package management.
- Dependencies should be added to setup.py in the install_requires list.
Testing and code coverage¶
- Tests are in the tests folder.
- The tests folder contains:
  - A test that checks whether files for transmart-copy are generated for fake data (file: test_transmart_loader)
  - A test that checks whether your code conforms to the Python style guide (PEP 8) (file: test_lint.py)
- The testing framework used is PyTest.
- Tests can be run with python setup.py test
Documentation¶
- Documentation should be put in the docs folder.
- To generate HTML documentation, run python setup.py build_sphinx
Coding style conventions and code quality¶
- Check your code style with prospector
- You may need to run pip install .[dev] first, to install the required dependencies.
License¶
Copyright (c) 2019 The Hyve B.V.
The TranSMART loader is licensed under the MIT License. See the file LICENSE.
Credits¶
This package was created with Cookiecutter and the NLeSC/python-template.
API Reference¶
transmart_loader package¶
Documentation about TranSMART loader
Submodules¶
transmart_loader.collection_validator module¶
-
class
transmart_loader.collection_validator.
CollectionValidator
¶ Bases:
transmart_loader.collection_visitor.CollectionVisitor
Validation class for TranSMART data collections.
-
static
validate
(collection: transmart_loader.transmart.DataCollection)¶
-
visit_concept
(concept: transmart_loader.transmart.Concept) → None¶
-
visit_node
(node: transmart_loader.transmart.TreeNode) → None¶
-
visit_observation
(observation: transmart_loader.transmart.Observation) → None¶
-
visit_patient
(patient: transmart_loader.transmart.Patient) → None¶
-
visit_study
(study: transmart_loader.transmart.Study) → None¶
-
visit_trial_visit
(trial_visit: transmart_loader.transmart.TrialVisit) → None¶
-
visit_visit
(visit: transmart_loader.transmart.Visit) → None¶
transmart_loader.collection_visitor module¶
-
class
transmart_loader.collection_visitor.
CollectionVisitor
¶ Bases:
object
Visitor class for TranSMART data collections
-
visit
(collection: Optional[transmart_loader.transmart.DataCollection]) → None¶
-
visit_concept
(concept: transmart_loader.transmart.Concept) → None¶
-
visit_node
(node: transmart_loader.transmart.TreeNode) → None¶
-
visit_observation
(observation: transmart_loader.transmart.Observation) → None¶
-
visit_patient
(patient: transmart_loader.transmart.Patient) → None¶
-
visit_study
(study: transmart_loader.transmart.Study) → None¶
-
visit_trial_visit
(trial_visit: transmart_loader.transmart.TrialVisit) → None¶
-
visit_visit
(visit: transmart_loader.transmart.Visit) → None¶
-
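The visitor interface above can be illustrated with a small self-contained sketch. It does not use the package itself: StubPatient, StubObservation, StubCollection and SketchVisitor are hypothetical stand-ins, and the assumption that visit walks the collection and calls one hook per entity is inferred from the method names, not confirmed by the source.

```python
from typing import List, Optional


class StubPatient:
    """Stand-in for a patient entity (not the package's class)."""

    def __init__(self, identifier: str) -> None:
        self.identifier = identifier


class StubObservation:
    """Stand-in for an observation entity (not the package's class)."""

    def __init__(self, patient: StubPatient, value: float) -> None:
        self.patient = patient
        self.value = value


class StubCollection:
    """Stand-in for a data collection with two entity types."""

    def __init__(self, patients: List[StubPatient],
                 observations: List[StubObservation]) -> None:
        self.patients = patients
        self.observations = observations


class SketchVisitor:
    """Base visitor: every hook is a no-op unless overridden."""

    def visit_patient(self, patient: StubPatient) -> None:
        pass

    def visit_observation(self, observation: StubObservation) -> None:
        pass

    def visit(self, collection: Optional[StubCollection]) -> None:
        # Assumed behaviour: ignore None, otherwise call one hook per entity.
        if collection is None:
            return
        for patient in collection.patients:
            self.visit_patient(patient)
        for observation in collection.observations:
            self.visit_observation(observation)


class CountingVisitor(SketchVisitor):
    """Example subclass that counts the entities it visits."""

    def __init__(self) -> None:
        self.counts = {'patients': 0, 'observations': 0}

    def visit_patient(self, patient: StubPatient) -> None:
        self.counts['patients'] += 1

    def visit_observation(self, observation: StubObservation) -> None:
        self.counts['observations'] += 1


def count_entities(collection: Optional[StubCollection]) -> dict:
    """Run a CountingVisitor over the collection and return its counts."""
    visitor = CountingVisitor()
    visitor.visit(collection)
    return visitor.counts
```

A validator or a writer would follow the same shape: subclass the visitor and override only the hooks it needs.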
transmart_loader.console module¶
-
class
transmart_loader.console.
Console
¶ Bases:
object
A helper class for displaying messages on the console (stderr).
-
Black
= '\x1b[30m'¶
-
BlackBackground
= '\x1b[40m'¶
-
Blue
= '\x1b[94m'¶
-
Green
= '\x1b[92m'¶
-
GreenBackground
= '\x1b[42m'¶
-
Grey
= '\x1b[37m'¶
-
Red
= '\x1b[91m'¶
-
RedBackground
= '\x1b[41m'¶
-
Reset
= '\x1b[0m'¶
-
Yellow
= '\x1b[93m'¶
-
YellowBackground
= '\x1b[103m'¶
-
static
error
(message)¶
-
static
info
(message)¶
-
static
success
(message)¶
-
static
title
(title)¶
-
static
warning
(message)¶
-
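The escape sequences listed above are standard ANSI colour codes. The following sketch shows how such codes could be used to print coloured messages to stderr. Which colour the package uses for which message type is not stated in this document, so the pairing below (yellow for warnings, red for errors) is an assumption, as are the function names and the optional stream parameter.

```python
import sys
from typing import Optional, TextIO

# Escape codes as listed for the Console class.
RED = '\x1b[91m'
YELLOW = '\x1b[93m'
RESET = '\x1b[0m'


def colourise(message: str, colour: str) -> str:
    """Wrap a message in a colour code and reset the colour afterwards."""
    return f'{colour}{message}{RESET}'


def print_warning(message: str, stream: Optional[TextIO] = None) -> None:
    """Print a yellow message to the given stream (stderr by default)."""
    target = stream if stream is not None else sys.stderr
    print(colourise(message, YELLOW), file=target)


def print_error(message: str, stream: Optional[TextIO] = None) -> None:
    """Print a red message to the given stream (stderr by default)."""
    target = stream if stream is not None else sys.stderr
    print(colourise(message, RED), file=target)
```

Writing to stderr keeps status messages separate from any data a tool writes to stdout.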
transmart_loader.copy_writer module¶
-
class
transmart_loader.copy_writer.
TransmartCopyWriter
(output_dir: str)¶ Bases:
transmart_loader.collection_visitor.CollectionVisitor
Writes TranSMART data collections to a folder with files that can be loaded into a TranSMART database using transmart-copy.
-
concepts_header
= ['concept_cd', 'concept_path', 'name_char']¶
-
dimensions_header
= ['id', 'name', 'modifier_code', 'value_type']¶
-
init_writers
() → None¶ Creates files and initialises writers for the output files in transmart-copy format.
-
observations_header
= ['encounter_num', 'patient_num', 'concept_cd', 'provider_id', 'start_date', 'end_date', 'modifier_cd', 'instance_num', 'trial_visit_num', 'valtype_cd', 'tval_char', 'nval_num', 'observation_blob']¶
-
patient_mappings_header
= ['patient_ide', 'patient_ide_source', 'patient_num']¶
-
patients_header
= ['patient_num', 'sex_cd']¶
-
prepare_output_dir
() → None¶ Creates an output directory if it does not exist. Fails if the output directory exists and is not empty.
-
studies_header
= ['study_num', 'study_id', 'secure_obj_token']¶
-
study_dimensions_header
= ['study_id', 'dimension_description_id']¶
-
tree_nodes_header
= ['c_hlevel', 'c_fullname', 'c_name', 'c_visualattributes', 'c_basecode', 'c_facttablecolumn', 'c_tablename', 'c_columnname', 'c_columndatatype', 'c_operator', 'c_dimcode', 'secure_obj_token']¶
-
trial_visits_header
= ['trial_visit_num', 'study_num', 'rel_time_unit_cd', 'rel_time_num', 'rel_time_label']¶
-
value_type_codes
= {<ValueType.Numeric: 1>: 'N', <ValueType.Categorical: 2>: 'T', <ValueType.Date: 4>: 'D', <ValueType.Text: 3>: 'B'}¶
-
visit_concept
(concept: transmart_loader.transmart.Concept) → None¶ Serialises a Concept entity to a TSV file.
Parameters: concept – the Concept entity
-
visit_node
(node: transmart_loader.transmart.TreeNode) → None¶
-
visit_observation
(observation: transmart_loader.transmart.Observation) → None¶ Serialises an Observation entity to a TSV file.
FIXME: fix date value serialisation
Parameters: observation – the Observation entity
-
visit_patient
(patient: transmart_loader.transmart.Patient) → None¶ Serialises a Patient entity and related PatientMapping entities to TSV files.
Parameters: patient – the Patient entity
-
visit_study
(study: transmart_loader.transmart.Study) → None¶ Serialises a Study entity to a TSV file.
Parameters: study – the Study entity
-
visit_tree_node
(node: transmart_loader.transmart.TreeNode, level=0, parent_path='\\')¶ Serialises a TreeNode entity and its children to a TSV file.
Parameters: - node – the TreeNode entity
- level – the hierarchy level of the node
- parent_path – the path of the parent node.
-
visit_trial_visit
(trial_visit: transmart_loader.transmart.TrialVisit) → None¶ Serialises a TrialVisit entity to a TSV file.
Parameters: trial_visit – the TrialVisit entity
-
visit_visit
(visit: transmart_loader.transmart.Visit) → None¶ Serialises a Visit entity to a TSV file. NB: this requires all patient visits to be cleared before loading new visits for the patient.
Parameters: visit – the Visit entity
-
visits_header
= ['encounter_num', 'patient_num', 'active_status_cd', 'start_date', 'end_date', 'inout_cd', 'location_cd', 'location_path', 'length_of_stay', 'visit_blob']¶
-
write_collection
(collection: transmart_loader.transmart.DataCollection) → None¶
-
write_dimension
(dimension: transmart_loader.transmart.Dimension) → None¶ Serialises a Dimension entity to a TSV file.
Parameters: dimension – the Dimension entity
-
write_dimensions
() → None¶ Writes dimension metadata and links all studies to the dimensions.
-
write_study_dimensions
(study_index)¶
-
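The behaviour described for prepare_output_dir (create the directory if it is missing, fail if it exists and is not empty) could be sketched as follows. This is not the package's implementation: the function name is hypothetical and the built-in exception types stand in for whatever the package actually raises.

```python
import os


def prepare_output_dir_sketch(output_dir: str) -> None:
    """Create output_dir if missing; raise if it exists and is not empty."""
    if not os.path.exists(output_dir):
        os.makedirs(output_dir)
        return
    if not os.path.isdir(output_dir):
        # The path exists but is a regular file (assumed to be an error).
        raise NotADirectoryError(output_dir)
    if os.listdir(output_dir):
        raise FileExistsError(
            'Output directory is not empty: {}'.format(output_dir))
```

Refusing to write into a non-empty directory avoids mixing files from different runs in one transmart-copy input folder.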
-
class
transmart_loader.copy_writer.
VisualAttribute
¶ Bases:
enum.Enum
Visual attribute of an ontology node
-
Categorical
= 8¶
-
Container
= 3¶
-
Date
= 7¶
-
Folder
= 2¶
-
Leaf
= 1¶
-
Numerical
= 5¶
-
Study
= 4¶
-
Text
= 6¶
-
-
transmart_loader.copy_writer.
format_date
(value: Optional[datetime.date]) → Optional[str]¶
-
transmart_loader.copy_writer.
get_concept_node_row
(node: transmart_loader.transmart.ConceptNode, level, node_path)¶
-
transmart_loader.copy_writer.
get_folder_node_row
(node: transmart_loader.transmart.TreeNode, level, node_path)¶
-
transmart_loader.copy_writer.
get_study_node_row
(node: transmart_loader.transmart.StudyNode, level, node_path)¶
transmart_loader.csv_types module¶
transmart_loader.loader_exception module¶
-
exception
transmart_loader.loader_exception.
LoaderException
¶ Bases:
Exception
transmart_loader.transmart module¶
-
class
transmart_loader.transmart.
CategoricalValue
(value: str)¶ Bases:
transmart_loader.transmart.Value
A categorical value
-
value
()¶
-
value_type
()¶
-
-
class
transmart_loader.transmart.
Concept
(concept_code: str, name: str, concept_path: str, value_type: transmart_loader.transmart.ValueType)¶ Bases:
object
Concepts to classify observations
-
class
transmart_loader.transmart.
ConceptNode
(concept: transmart_loader.transmart.Concept)¶ Bases:
transmart_loader.transmart.TreeNode
Concept node
-
class
transmart_loader.transmart.
DataCollection
(concepts: Iterable[transmart_loader.transmart.Concept], studies: Iterable[transmart_loader.transmart.Study], trial_visits: Iterable[transmart_loader.transmart.TrialVisit], visits: Iterable[transmart_loader.transmart.Visit], ontology: Iterable[transmart_loader.transmart.TreeNode], patients: Iterable[transmart_loader.transmart.Patient], observations: Iterable[transmart_loader.transmart.Observation])¶ Bases:
object
A data collection that can be loaded into TranSMART
-
class
transmart_loader.transmart.
DateValue
(value: datetime.date)¶ Bases:
transmart_loader.transmart.Value
A date value
-
value
()¶
-
value_type
()¶
-
-
class
transmart_loader.transmart.
Dimension
(name: str, modifier_code: Optional[str] = None, value_type: Optional[transmart_loader.transmart.ValueType] = None)¶ Bases:
object
Dimension metadata
-
class
transmart_loader.transmart.
NumericalValue
(value: float)¶ Bases:
transmart_loader.transmart.Value
A numerical value
-
value
()¶
-
value_type
()¶
-
-
class
transmart_loader.transmart.
Observation
(patient: transmart_loader.transmart.Patient, concept: transmart_loader.transmart.Concept, visit: Optional[transmart_loader.transmart.Visit], trial_visit: transmart_loader.transmart.TrialVisit, start_date: Optional[datetime.date], end_date: Optional[datetime.date], value: transmart_loader.transmart.Value)¶ Bases:
object
Data about an observed event or an attribute of a patient
-
class
transmart_loader.transmart.
ObservationMetadata
¶ Bases:
object
Metadata about an observation
-
class
transmart_loader.transmart.
Patient
(identifier: str, sex: str, mappings: Sequence[transmart_loader.transmart.PatientMapping])¶ Bases:
object
Patient properties
-
class
transmart_loader.transmart.
PatientMapping
(source: str, identifier: str)¶ Bases:
object
Patient identifiers
-
class
transmart_loader.transmart.
Study
(study_id: str, name: str)¶ Bases:
object
-
class
transmart_loader.transmart.
StudyNode
(study: transmart_loader.transmart.Study)¶ Bases:
transmart_loader.transmart.TreeNode
Study node
-
class
transmart_loader.transmart.
TextValue
(value: str)¶ Bases:
transmart_loader.transmart.Value
A text value
-
value
()¶
-
value_type
()¶
-
-
class
transmart_loader.transmart.
TreeNode
(name: str)¶ Bases:
object
Ontology node
-
add_child
(child: transmart_loader.transmart.TreeNode)¶
-
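To illustrate how an ontology built with add_child might be flattened into rows with a hierarchy level and a full path (compare the level and parent_path parameters of visit_tree_node in the copy writer), here is a self-contained sketch. SketchNode and walk are hypothetical names, and the backslash-delimited path format with a trailing separator is an assumption based on the default parent_path of '\\'.

```python
from typing import Iterator, List, Tuple


class SketchNode:
    """Stand-in for an ontology node with a name and child nodes."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.children: List['SketchNode'] = []

    def add_child(self, child: 'SketchNode') -> None:
        self.children.append(child)


def walk(node: SketchNode, level: int = 0,
         parent_path: str = '\\') -> Iterator[Tuple[int, str, str]]:
    """Yield (level, full path, name) for a node and all its descendants."""
    # Assumed path format: parent path + node name + trailing backslash.
    path = parent_path + node.name + '\\'
    yield level, path, node.name
    for child in node.children:
        # Children are one level deeper and extend the current path.
        yield from walk(child, level + 1, path)
```

For a root node "Example" with a child "Demographics" that has a child "Age", the walk yields the three nodes in depth-first order at levels 0, 1 and 2.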
-
class
transmart_loader.transmart.
TrialVisit
(study: transmart_loader.transmart.Study, rel_time_label: str, rel_time_unit: Optional[str] = None, rel_time: Optional[int] = None)¶ Bases:
object
Trial visit
-
class
transmart_loader.transmart.
ValueType
¶ Bases:
enum.Enum
Type of an observed value
-
Categorical
= 2¶
-
Date
= 4¶
-
Numeric
= 1¶
-
Text
= 3¶
-
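The value types above correspond to the single-letter codes listed under value_type_codes in the copy writer. The sketch below mirrors that mapping with a stand-in enum so that it runs without the package; ValueTypeSketch and value_type_code are hypothetical names.

```python
from enum import Enum


class ValueTypeSketch(Enum):
    """Stand-in mirroring the documented ValueType members."""
    Numeric = 1
    Categorical = 2
    Text = 3
    Date = 4


# Codes as listed for TransmartCopyWriter.value_type_codes.
VALUE_TYPE_CODES = {
    ValueTypeSketch.Numeric: 'N',
    ValueTypeSketch.Categorical: 'T',
    ValueTypeSketch.Date: 'D',
    ValueTypeSketch.Text: 'B',
}


def value_type_code(value_type: ValueTypeSketch) -> str:
    """Return the single-letter code for a value type."""
    return VALUE_TYPE_CODES[value_type]
```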
-
class
transmart_loader.transmart.
Visit
(patient: transmart_loader.transmart.Patient, identifier: str, active_status: Optional[str], start_date: Optional[datetime.date], end_date: Optional[datetime.date], inout: Optional[str], location: Optional[str], length_of_stay: Optional[int])¶ Bases:
object
Patient visit
transmart_loader.tsv_writer module¶
-
class
transmart_loader.tsv_writer.
TsvWriter
(path: str)¶ Bases:
transmart_loader.csv_types.CsvWriter
Tab-separated values writer. Creates a new file when initialised and fails when the file already exists.
-
close
() → None¶
-
writerow
(row: Sequence[Any]) → None¶
-
writerows
(rows: Sequence[Sequence[Any]]) → None¶
-
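A writer with the behaviour described for TsvWriter (create a new file on initialisation, fail if the file already exists) can be sketched with the standard csv module and the exclusive-creation file mode. This is an illustration, not the package's code: the class name ExclusiveTsvWriter is hypothetical, and the choice of UTF-8 and '\n' line endings is an assumption.

```python
import csv
from typing import Any, Sequence


class ExclusiveTsvWriter:
    """Tab-separated writer that refuses to overwrite an existing file."""

    def __init__(self, path: str) -> None:
        # Mode 'x' creates the file and raises FileExistsError if it exists.
        self._file = open(path, 'x', newline='', encoding='utf-8')
        self._writer = csv.writer(
            self._file, delimiter='\t', lineterminator='\n')

    def writerow(self, row: Sequence[Any]) -> None:
        self._writer.writerow(row)

    def writerows(self, rows: Sequence[Sequence[Any]]) -> None:
        self._writer.writerows(rows)

    def close(self) -> None:
        self._file.close()
```

Opening the file in exclusive mode makes the existence check and the creation a single step, so no separate check for an existing file is needed.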