Utils

DNAnexus helpers and file utilities.

get_dataset_id()

```python
phenofhy.utils.get_dataset_id(project=None, full=True)
```

Return the DNAnexus dataset identifier.

Parameters

  project: str | None
    DNAnexus project ID. Defaults to the DX_PROJECT_CONTEXT_ID environment variable.
  full: bool
    If True, return "{project}:{record_id}". If False, return project ID only.

Returns

  out: str
    Dataset identifier string.

Raises

  RuntimeError: Exception
    If no project is given and DX_PROJECT_CONTEXT_ID is not set.
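
The two return formats and the error case can be sketched without a DNAnexus connection. This is a minimal illustration, not the library's implementation: `build_dataset_id` is a hypothetical helper, and it takes the record ID as an argument, whereas `get_dataset_id()` looks the record up itself.

```python
import os


def build_dataset_id(record_id, project=None, full=True):
    """Hypothetical helper mirroring the documented return formats."""
    # Fall back to the environment variable named in the parameter docs.
    project = project or os.environ.get("DX_PROJECT_CONTEXT_ID")
    if not project:
        raise RuntimeError("No project given and DX_PROJECT_CONTEXT_ID is not set")
    # full=True gives "{project}:{record_id}"; full=False gives the project ID only.
    return f"{project}:{record_id}" if full else project


# Placeholder IDs, for illustration only.
print(build_dataset_id("record-yyyy", project="project-xxxx"))
print(build_dataset_id("record-yyyy", project="project-xxxx", full=False))
```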

connect_to_dataset()

```python
phenofhy.utils.connect_to_dataset(cohort_record_id="")
```

Connect to a DNAnexus dataset or cohort.

Parameters

  cohort_record_id: str
    Optional cohort record ID. If empty, loads the first dataset.

Returns

  out: dxdata.Dataset | dxdata.Cohort
    Loaded DNAnexus dataset or cohort.

Raises

  RuntimeError: Exception
    If DX_PROJECT_CONTEXT_ID is not set.

find_latest_dx_file_id()

```python
phenofhy.utils.find_latest_dx_file_id(name_pattern, folder=None, project=None)
```

Find the latest DNAnexus file ID matching a name pattern.

Parameters

  name_pattern: str
    Filename or glob pattern.
  folder: str | None
    Optional DNAnexus folder path.
  project: str | None
    Optional DNAnexus project ID.

Returns

  out: str
    File ID string (e.g., "file-xxxx").

Raises

  FileNotFoundError: Exception
    If no file is found.
  subprocess.CalledProcessError: Exception
    If dx command fails.
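
As a rough picture of the selection step, the sketch below filters candidate files by a glob pattern and keeps the newest one. It rests on assumptions: the record fields (`id`, `name`, `created`) and the helper `pick_latest` are hypothetical, and "latest" is taken to mean the most recent creation time, which this page does not state.

```python
from fnmatch import fnmatch


def pick_latest(records, name_pattern):
    """Return the ID of the newest record whose name matches the pattern."""
    matches = [r for r in records if fnmatch(r["name"], name_pattern)]
    if not matches:
        raise FileNotFoundError(f"No file matches {name_pattern!r}")
    # Assumption: "latest" means the largest creation timestamp.
    return max(matches, key=lambda r: r["created"])["id"]


# Made-up records standing in for a DNAnexus file listing.
records = [
    {"id": "file-aaaa", "name": "results_v1.csv", "created": 1700000000},
    {"id": "file-bbbb", "name": "results_v2.csv", "created": 1700500000},
    {"id": "file-cccc", "name": "notes.txt", "created": 1701000000},
]
print(pick_latest(records, "results_*.csv"))
```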

download_files()

```python
phenofhy.utils.download_files(files)
```

Download files from DNAnexus.

Parameters

  files: tuple | list
    A (file_id, output_path) tuple or list of such tuples.

Raises

  ValueError: Exception
    If input format is invalid.
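
The accepted shapes of `files` can be illustrated with a small validation sketch. `normalize_downloads` is a hypothetical helper written for this page; the exact checks `download_files()` performs are not documented here, so treat the rules below as one plausible reading.

```python
def normalize_downloads(files):
    """Turn a single (file_id, output_path) tuple or a list of them into a list."""
    # A bare tuple is treated as one download request.
    if isinstance(files, tuple):
        files = [files]
    if not isinstance(files, list):
        raise ValueError("files must be a tuple or a list of tuples")
    for item in files:
        if not (isinstance(item, tuple) and len(item) == 2):
            raise ValueError(f"expected (file_id, output_path), got {item!r}")
    return files


# Placeholder file IDs and paths, for illustration only.
print(normalize_downloads(("file-xxxx", "data/table.csv")))
print(normalize_downloads([("file-xxxx", "a.csv"), ("file-yyyy", "b.csv")]))
```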

upload_files()

```python
phenofhy.utils.upload_files(files, dx_target="results")
```

Upload files to DNAnexus.

Parameters

  files: str | list
    File path, list of paths, or list of (path, dx_folder) tuples.
  dx_target: str
    Default DNAnexus folder.

Raises

  ValueError: Exception
    If input format is invalid.
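
One way to read the three input forms is that each entry ends up paired with a destination folder, with `dx_target` filling in where none is given. The sketch below shows that reading; `normalize_uploads` is a hypothetical helper, not part of `phenofhy.utils`.

```python
def normalize_uploads(files, dx_target="results"):
    """Return a list of (path, dx_folder) pairs from any documented input form."""
    # A single path string becomes a one-element list.
    if isinstance(files, str):
        files = [files]
    if not isinstance(files, list):
        raise ValueError("files must be a path or a list")
    pairs = []
    for item in files:
        if isinstance(item, str):
            # Plain paths go to the default folder.
            pairs.append((item, dx_target))
        elif isinstance(item, tuple) and len(item) == 2:
            # Explicit (path, dx_folder) tuples keep their own folder.
            pairs.append((item[0], item[1]))
        else:
            raise ValueError(f"unsupported entry: {item!r}")
    return pairs


# Placeholder paths and folders, for illustration only.
print(normalize_uploads("summary.csv"))
print(normalize_uploads(["a.csv", ("b.csv", "archive")]))
```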

upload_folders()

```python
phenofhy.utils.upload_folders(folders, dx_target="results")
```

Upload one or more folders to DNAnexus.

Parameters

  folders: str | list
    Folder path, list of paths, or list of (path, dx_target) tuples.
  dx_target: str
    Default DNAnexus folder.

Raises

  ValueError: Exception
    If input format is invalid.

load_file()

```python
phenofhy.utils.load_file(file_path)
```

Load a file by extension into a Python object.

Parameters

  file_path: str | pathlib.Path
    Path to the file.

Returns

  out: dict | str | pandas.DataFrame | None
    Loaded object based on file extension, or None if unsupported.

Raises

  Exception: Exception
    If reading the file fails.
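
Loading by extension can be pictured as a simple dispatch on the file suffix. The sketch below is an assumption-laden illustration: the page does not list which extensions `load_file()` supports, so the `.json`, `.txt`, and `.csv` mapping and the helper name `load_by_extension` are hypothetical, chosen to match the documented return types.

```python
import json
from pathlib import Path


def load_by_extension(file_path):
    """Pick a loader from the file suffix; return None for unknown suffixes."""
    path = Path(file_path)
    suffix = path.suffix.lower()
    if suffix == ".json":
        with path.open(encoding="utf-8") as fh:
            return json.load(fh)
    if suffix == ".txt":
        return path.read_text(encoding="utf-8")
    if suffix == ".csv":
        # Imported lazily so JSON and text loading work without pandas installed.
        import pandas as pd

        return pd.read_csv(path)
    # Unsupported extension: signal it with None rather than raising.
    return None
```

Read errors (a missing file, malformed JSON) propagate to the caller in this sketch, which is consistent with the Raises entry above.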

Example

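```python
from phenofhy import utils

# With no arguments, the project comes from DX_PROJECT_CONTEXT_ID and
# the result has the form "{project}:{record_id}".
dataset_id = utils.get_dataset_id()
```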