Environment and setup
Phenofhy is designed to run in the Our Future Health (OFH) trusted research environment (TRE) with DNAnexus tooling inside a JupyterLab notebook.
Prerequisites
- Working knowledge of Python is required, along with an understanding of how to launch and run analyses in the Our Future Health DNAnexus TRE using JupyterLab.
- For reference, the DNAnexus Learning Resources page provides an overview of resources for getting up to speed with the DNAnexus TRE, the dx toolkit, and working with phenotypic data in JupyterLab.
Requirements
- An active OFH TRE project on the DNAnexus platform and working knowledge of how to use JupyterLab (see Introduction to Jupyterlab).
- The DX_PROJECT_CONTEXT_ID environment variable must be set (it is configured automatically in the TRE).
- It is recommended that you configure a config.json file in /mnt/project/helpers with file IDs and base paths (see Installation).
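The requirements above can be checked from a notebook cell before running any analysis. The following is a minimal sketch and not part of Phenofhy: check_setup is a hypothetical helper, and it only verifies that config.json exists and parses as a JSON object, because the keys the file should contain are not listed in this section.

```python
import json
import os
from pathlib import Path

# Location recommended in the requirements list above.
CONFIG_PATH = Path("/mnt/project/helpers/config.json")


def check_setup(config_path=CONFIG_PATH, environ=None):
    """Hypothetical helper: return a list of problems (empty list = all good)."""
    environ = os.environ if environ is None else environ
    problems = []

    # The TRE sets this variable automatically; it is usually absent elsewhere.
    if not environ.get("DX_PROJECT_CONTEXT_ID"):
        problems.append("DX_PROJECT_CONTEXT_ID is not set")

    config_path = Path(config_path)
    if not config_path.is_file():
        problems.append(f"{config_path} not found")
        return problems

    try:
        with config_path.open() as fh:
            config = json.load(fh)
    except json.JSONDecodeError as exc:
        problems.append(f"{config_path} is not valid JSON: {exc}")
    else:
        # Only the overall shape is checked; key names are project-specific.
        if not isinstance(config, dict):
            problems.append(f"{config_path} should contain a JSON object")

    return problems


if __name__ == "__main__":
    for problem in check_setup():
        print("Setup issue:", problem)
```

An empty list means both the project context and the config file were found. The config file is only recommended, so a missing file may be acceptable in some workflows.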
Metadata files
Phenofhy can automatically create the metadata files that accompany OFH data (codings, data_dictionary, entity_dictionary). The same files can also be created with dx (see the documentation provided by UK Biobank).
The helper load.metadata() downloads them into ./metadata if missing.
from phenofhy import load
meta = load.metadata()
# meta["codings"], meta["data_dictionary"], meta["entity_dictionary"]
Local testing outside TRE
Phenofhy is primarily intended for use inside the OFH TRE, but parts of the package can be developed and tested locally using simulated data.
In particular, the simulation utilities let you generate OFH-like sample dataframes for debugging and pipeline prototyping without querying DNAnexus.
TRE-specific functionality (for example, dx-dependent extraction and project-context dataset access) still requires a valid OFH TRE environment.
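One way to keep a notebook runnable in both settings is to guard the dx-dependent step and fall back to stand-in data elsewhere. The sketch below rests on assumptions: running_in_tre is a heuristic based on DX_PROJECT_CONTEXT_ID, and toy_participants is a hypothetical placeholder with invented column names. It is not Phenofhy's simulation utility, which should be used in its place for realistic OFH-like data.

```python
import os
import random


def running_in_tre(environ=None):
    """Heuristic only: the TRE sets DX_PROJECT_CONTEXT_ID automatically."""
    environ = os.environ if environ is None else environ
    return bool(environ.get("DX_PROJECT_CONTEXT_ID"))


def toy_participants(n, seed=0):
    """Hypothetical stand-in rows; field names are invented for illustration."""
    rng = random.Random(seed)  # seeded so local test runs are reproducible
    return [
        {
            "pid": f"P{i:04d}",
            "age": rng.randint(18, 90),
            "sex": rng.choice(["F", "M"]),
        }
        for i in range(n)
    ]


def load_participants(extract_from_tre, n_local=100, environ=None):
    """Call the real extractor inside the TRE, otherwise return toy rows."""
    if running_in_tre(environ):
        return extract_from_tre()
    return toy_participants(n_local)
```

Passing the extractor in as a callable means the dx-dependent code is never invoked on a local machine, so downstream steps can be prototyped without access to DNAnexus.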