Environment and setup

Phenofhy is designed to run in the Our Future Health (OFH) trusted research environment (TRE) with DNAnexus tooling inside a JupyterLab notebook.

Prerequisites

  • Working knowledge of Python is required, along with an understanding of how to launch and run analyses in the Our Future Health DNAnexus TRE using JupyterLab.
  • For reference, the DNAnexus Learning Resources page gives an overview of resources for getting up to speed with the DNAnexus TRE, the dx toolkit, and working with phenotypic data in JupyterLab.

Requirements

  • An active OFH TRE project on the DNAnexus platform and working knowledge of how to use JupyterLab (see Introduction to Jupyterlab).
  • DX_PROJECT_CONTEXT_ID set (already configured automatically in the TRE).
  • It is recommended that you configure a config.json file in /mnt/project/helpers with file IDs and base paths (see Installation).
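
The requirements above can be checked from a notebook cell before starting an analysis. The following is a minimal sketch and not part of Phenofhy: check_environment is a hypothetical helper, and it only assumes that the file is named config.json inside /mnt/project/helpers as described above. It does not assume any particular keys inside the file.

```python
import json
import os
from pathlib import Path


def check_environment(config_dir="/mnt/project/helpers", environ=None):
    """Report whether the project context and config.json are available.

    Hypothetical helper for illustration only; not part of Phenofhy.
    """
    environ = os.environ if environ is None else environ
    config_path = Path(config_dir) / "config.json"

    report = {
        # None when the variable is not set (for example, outside the TRE).
        "project_id": environ.get("DX_PROJECT_CONTEXT_ID"),
        "config_path": str(config_path),
        "config_found": config_path.is_file(),
        "config_keys": [],
    }

    if report["config_found"]:
        with config_path.open() as fh:
            config = json.load(fh)
        # Only the top-level key names are listed; they depend on your setup.
        if isinstance(config, dict):
            report["config_keys"] = sorted(config)

    return report
```

Calling check_environment() with no arguments reads the real environment. A missing project ID or config file shows up as None or False in the returned dictionary rather than raising an error, which makes the check safe to run anywhere.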

Metadata files

Phenofhy can automatically create the metadata files (codings, data_dictionary, entity_dictionary) that accompany the OFH dataset. The same files can also be created with the dx toolkit (see the documentation provided by UK Biobank).

The helper load.metadata() downloads them into ./metadata if missing.

python
from phenofhy import load

meta = load.metadata()
# meta["codings"], meta["data_dictionary"], meta["entity_dictionary"]

Local testing outside TRE

Phenofhy is primarily intended for use inside the OFH TRE, but parts of the package can be developed and tested locally using simulated data.

In particular, the simulation utilities let you generate OFH-like sample dataframes for debugging and pipeline prototyping without querying DNAnexus.

TRE-specific functionality (for example, dx-dependent extraction and project-context dataset access) still requires access to a valid OFH TRE.
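
One way to keep a notebook or script usable in both settings is to branch on whether a project context is present. The sketch below is an illustration, not Phenofhy's own mechanism: in_tre and choose_data_source are hypothetical names, and the code assumes that DX_PROJECT_CONTEXT_ID is unset on a local machine. Phenofhy's simulation utilities are not called here because their interface is not described in this section.

```python
import os


def in_tre(environ=None):
    """Return True when DX_PROJECT_CONTEXT_ID is set and non-empty.

    Assumption: the variable is absent when running outside the TRE.
    """
    environ = os.environ if environ is None else environ
    return bool(environ.get("DX_PROJECT_CONTEXT_ID"))


def choose_data_source(environ=None):
    """Pick "tre" for dx-dependent steps, otherwise "simulated"."""
    return "tre" if in_tre(environ) else "simulated"
```

A pipeline can call choose_data_source() once at the top and route dx-dependent extraction to the "tre" branch and simulated dataframes to the "simulated" branch, so the TRE-only steps are not attempted on a local machine.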