Utils
DNAnexus helpers and file utilities.
get_dataset_id()
phenofhy.utils.get_dataset_id(project=None, full=True)
Return the DNAnexus dataset identifier.
Parameters
project: str | None
DNAnexus project ID. Defaults to DX_PROJECT_CONTEXT_ID.
full: bool
If True, return "{project}:{record_id}". If False, return project ID only.
Returns
out: str
Dataset identifier string.
Raises
RuntimeError: Exception
If no project is passed and DX_PROJECT_CONTEXT_ID is not set.
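The effect of the full flag on the return value can be illustrated with a minimal sketch. This is not the library's implementation: build_dataset_id is a hypothetical helper, and the record ID is passed in explicitly because this page does not describe how get_dataset_id() discovers it.

```python
import os


def build_dataset_id(record_id, project=None, full=True):
    """Hypothetical helper that mirrors the documented return format."""
    # Fall back to the environment variable named in the parameter description.
    project = project or os.environ.get("DX_PROJECT_CONTEXT_ID")
    if not project:
        raise RuntimeError("No project given and DX_PROJECT_CONTEXT_ID is not set")
    # full=True -> "{project}:{record_id}"; full=False -> project ID only.
    return f"{project}:{record_id}" if full else project
```

The IDs used with this sketch (for example "project-123" and "record-abc") are placeholders, not real DNAnexus identifiers.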
connect_to_dataset()
phenofhy.utils.connect_to_dataset(cohort_record_id="")
Connect to a DNAnexus dataset or cohort.
Parameters
cohort_record_id: str
Optional cohort record ID. If empty, loads the first dataset.
Returns
out: dxdata.Dataset | dxdata.Cohort
Loaded DNAnexus dataset or cohort.
Raises
RuntimeError: Exception
If DX_PROJECT_CONTEXT_ID is not set.
find_latest_dx_file_id()
phenofhy.utils.find_latest_dx_file_id(name_pattern, folder=None, project=None)
Find the latest DNAnexus file ID matching a name pattern.
Parameters
name_pattern: str
Filename or glob pattern.
folder: str | None
Optional DNAnexus folder path.
project: str | None
Optional DNAnexus project ID.
Returns
out: str
File ID string (e.g., "file-xxxx").
Raises
FileNotFoundError: Exception
If no file is found.
subprocess.CalledProcessError: Exception
If dx command fails.
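The selection step can be sketched without calling dx at all. The sketch below assumes that "latest" means the most recent creation time, which this page does not state, and it works on a hypothetical list of records with "id", "name" and "created" keys rather than on real dx output.

```python
from fnmatch import fnmatchcase


def pick_latest_file_id(records, name_pattern):
    """Hypothetical helper: newest record whose name matches a glob pattern."""
    # Keep only records whose filename matches the pattern (case-sensitive).
    matches = [r for r in records if fnmatchcase(r["name"], name_pattern)]
    if not matches:
        raise FileNotFoundError(f"No file matching {name_pattern!r}")
    # Assumption: "latest" is the largest creation timestamp.
    return max(matches, key=lambda r: r["created"])["id"]
```

An exact filename also works as a pattern, since a name without wildcard characters matches only itself.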
download_files()
phenofhy.utils.download_files(files)
Download files from DNAnexus.
Parameters
files: tuple | list
A (file_id, output_path) tuple or list of such tuples.
Raises
ValueError: Exception
If input format is invalid.
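The two accepted input shapes can be reduced to a single list of pairs before any download happens. The following is a sketch of that validation only, under the assumption that anything other than a 2-tuple or a list of 2-tuples counts as invalid; normalize_download_spec is a hypothetical name, not part of phenofhy.

```python
def normalize_download_spec(files):
    """Hypothetical helper: coerce input to a list of (file_id, output_path)."""
    # A single (file_id, output_path) tuple becomes a one-element list.
    if isinstance(files, tuple):
        files = [files]
    if not isinstance(files, list):
        raise ValueError("files must be a tuple or a list of tuples")
    for item in files:
        # Every entry must be a pair; anything else is rejected.
        if not (isinstance(item, tuple) and len(item) == 2):
            raise ValueError(f"Expected (file_id, output_path), got {item!r}")
    return list(files)
```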
upload_files()
phenofhy.utils.upload_files(files, dx_target="results")
Upload files to DNAnexus.
Parameters
files: str | list
File path, list of paths, or list of (path, dx_folder) tuples.
dx_target: str
Default DNAnexus folder.
Raises
ValueError: Exception
If input format is invalid.
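How the default folder interacts with per-file tuples can be shown with a small sketch. It assumes that dx_target applies only to entries that do not carry their own folder, which is a reading of the parameter descriptions rather than confirmed behaviour; normalize_upload_spec is a hypothetical name.

```python
def normalize_upload_spec(files, dx_target="results"):
    """Hypothetical helper: coerce input to a list of (path, dx_folder)."""
    # A single path string becomes a one-element list.
    if isinstance(files, str):
        files = [files]
    if not isinstance(files, list):
        raise ValueError("files must be a path or a list")
    pairs = []
    for item in files:
        if isinstance(item, str):
            # Bare paths fall back to the default folder.
            pairs.append((item, dx_target))
        elif isinstance(item, tuple) and len(item) == 2:
            # Tuples carry their own destination folder.
            pairs.append((item[0], item[1]))
        else:
            raise ValueError(f"Unsupported entry: {item!r}")
    return pairs
```

Mixing bare paths and tuples in one list is accepted by this sketch; whether the library itself allows that is not stated here.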
upload_folders()
phenofhy.utils.upload_folders(folders, dx_target="results")
Upload one or more folders to DNAnexus.
Parameters
folders: str | list
Folder path, list of paths, or list of (path, dx_target) tuples.
dx_target: str
Default DNAnexus folder.
Raises
ValueError: Exception
If input format is invalid.
load_file()
phenofhy.utils.load_file(file_path)
Load a file by extension into a Python object.
Parameters
file_path: str | pathlib.Path
Path to the file.
Returns
out: dict | str | pandas.DataFrame | None
Loaded object based on file extension, or None if unsupported.
Raises
Exception: Exception
If reading the file fails.
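Dispatching on the file extension might look like the sketch below. The extension-to-type mapping is an assumption: this page lists the possible return types but not which extensions produce them. Here .json yields a dict and .txt yields a str; the pandas.DataFrame branch is left out so the sketch needs only the standard library.

```python
import json
from pathlib import Path


def load_by_extension(file_path):
    """Hypothetical stand-in for load_file(); the mapping is assumed."""
    path = Path(file_path)
    suffix = path.suffix.lower()
    if suffix == ".json":
        # Assumed mapping: JSON documents are parsed into Python objects.
        with path.open(encoding="utf-8") as fh:
            return json.load(fh)
    if suffix == ".txt":
        # Assumed mapping: plain text is returned as a single string.
        return path.read_text(encoding="utf-8")
    # Unsupported extensions return None, as the Returns entry describes.
    return None
```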
Example
from phenofhy import utils
dataset_id = utils.get_dataset_id()