PSI/J Executors

PSI/J provides a unified job submission interface for many schedulers. Tierkreis is largely compatible with the psij Python package and follows a similar design philosophy:

  1. Define a job's resources

  2. Declare the executable you want to run

  3. Instantiate the orchestrator

  4. Run the job

  5. Retrieve the results
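
The five steps can be sketched directly against the psij package. This is a minimal illustration, assuming psij is installed and that its "local" executor is available. The function name run_hello_job and the /bin/echo command are made up for the example and are not part of Tierkreis.

```python
import tempfile
from pathlib import Path


def run_hello_job() -> str:
    """Run `/bin/echo hello` through PSI/J's local executor and return its stdout."""
    # Imported lazily so this module still loads where psij is not installed.
    from psij import Job, JobExecutor, JobSpec, JobState, ResourceSpecV1

    with tempfile.TemporaryDirectory() as tmp:
        stdout_path = Path(tmp) / "hello.out"

        # 1. Define the job's resources.
        resources = ResourceSpecV1(node_count=1)

        # 2. Declare the executable you want to run.
        spec = JobSpec(
            executable="/bin/echo",
            arguments=["hello"],
            resources=resources,
            stdout_path=stdout_path,
        )

        # 3. Instantiate the orchestrating component, here a psij JobExecutor.
        #    "local" runs on this machine; a cluster would use a name such as "pbs".
        executor = JobExecutor.get_instance("local")

        # 4. Run the job.
        job = Job(spec)
        executor.submit(job)

        # 5. Retrieve the results once the job reaches a final state.
        status = job.wait()
        if status is None or status.state != JobState.COMPLETED:
            raise RuntimeError(f"job did not complete: {status}")
        return stdout_path.read_text()
```

Calling run_hello_job() on a machine with psij installed should return whatever /bin/echo wrote to its standard output.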

Using the PSI/J executor as a drop-in replacement

Assume we have a graph that we want to run on a cluster, e.g. using the PBS submission system.

def mpi_graph() -> Workflow:
    builder = Graph(outputs_type=TKR[str])
    mpi_result = builder.task(mpi_task())
    return builder.finish_with_outputs(mpi_result)

Here mpi_task() is a stub for an arbitrary task that takes no inputs and returns a string (None -> str). To run it on a cluster, you have to provide a resource definition. With Tierkreis alone, you would use a JobSpec together with a ResourceSpec:

def pbs_spec() -> JobSpec:
    return JobSpec(
        job_name="test_job",
        account="pbsuser",
        command=("/home/pbsuser/.local/bin/uv run /mpi_worker/main.py"),
        resource=ResourceSpec(nodes=2, memory_gb=None, gpus_per_node=None),
        walltime="00:15:00",
        mpi=MpiSpec(max_proc_per_node=1),
        extra_scheduler_args={"-l place=scatter": None},
    )

This spec is then passed to a scheduler-specific executor:

executor = PBSExecutor(
    spec=pbs_spec(),
    registry_path=None,
    logs_path=storage.logs_path,
    command="qsub",
)

In principle, you can use the same specification for the PSIJExecutor.

Important

tierkreis.JobSpec and psij.JobSpec are not congruent. For more information, see the Caveats section.

executor = PSIJExecutor(
    spec=pbs_spec(),
    launchers_path=None,
    logs_path=storage.logs_path,
    psij_executor="pbs",
)

Here the psij_executor argument will be forwarded to a psij.JobExecutor without validation.

Converting between jobs

Tierkreis uses tierkreis.JobSpec as its canonical job description, so on its own it never requires a psij.JobSpec. Still, there are two conversion functions, spec_to_psij and psij_to_spec.

They try to match the arguments as closely as possible, but they are not inverses of each other: in general psij_to_spec(spec_to_psij(X)) != X, especially when extra_scheduler_args or psij.JobAttributes.custom_args are set.
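
The toy model below illustrates why such a round trip can fail. It does not use the real Tierkreis or psij classes: ToyTierkreisSpec, ToyPsijSpec and both conversion functions are hypothetical stand-ins, and the lossy step (a value-less flag stored as an empty string) is an assumption chosen only to show the effect.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class ToyTierkreisSpec:
    """Hypothetical stand-in for tierkreis.JobSpec (two fields only)."""
    job_name: str
    extra_scheduler_args: Dict[str, Optional[str]] = field(default_factory=dict)


@dataclass
class ToyPsijSpec:
    """Hypothetical stand-in for psij.JobSpec with its custom arguments."""
    name: str
    custom_args: Dict[str, str] = field(default_factory=dict)


def toy_spec_to_psij(spec: ToyTierkreisSpec) -> ToyPsijSpec:
    # Assumed lossy step: flags without a value (None) become empty strings.
    custom = {k: "" if v is None else v for k, v in spec.extra_scheduler_args.items()}
    return ToyPsijSpec(name=spec.job_name, custom_args=custom)


def toy_psij_to_spec(spec: ToyPsijSpec) -> ToyTierkreisSpec:
    # The reverse direction cannot tell "" apart from an original None.
    return ToyTierkreisSpec(job_name=spec.name, extra_scheduler_args=dict(spec.custom_args))


original = ToyTierkreisSpec("test_job", {"-l place=scatter": None})
round_tripped = toy_psij_to_spec(toy_spec_to_psij(original))
print(round_tripped == original)  # False: the None value came back as ""
```

A spec without extra scheduler arguments survives the toy round trip unchanged, which matches the observation that the mismatch shows up mainly when such arguments are set.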

Caveats

For most arguments, psij.JobSpec and tierkreis.JobSpec, and their respective executors, overlap. The known cases where they differ are listed below.

Executors

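The main differences are:

  • Tierkreis uses scheduler-specific executor classes (e.g. PBSExecutor), whereas the PSIJExecutor selects the scheduler by name through psij_executor.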

  • You cannot provide a path to the scheduler command binary for psij. The command has to be in $PATH.

  • The launchers_path argument has no effect on the PSIJExecutor. We recommend using fully qualified paths for commands, e.g. /home/pbsuser/.local/bin/uv run /mpi_worker/main.py (for running on the compute nodes).

  • psij uses an internal script to launch jobs, so a job may not launch identically to the way it does under the corresponding Tierkreis executor.
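
Because psij needs the scheduler commands on $PATH, it can help to check for them on the submission host before creating the executor. The helper below is a small sketch using only the standard library. The name missing_from_path is made up for this example, and the PBS command names in the usage note are an assumption about what a PBS installation provides.

```python
import shutil
from typing import Iterable, List


def missing_from_path(commands: Iterable[str]) -> List[str]:
    """Return the commands that cannot be resolved via $PATH."""
    # shutil.which returns None when no executable of that name is found.
    return [cmd for cmd in commands if shutil.which(cmd) is None]
```

For example, missing_from_path(["qsub", "qstat", "qdel"]) returns an empty list when all three commands are resolvable, and otherwise names the ones that are not.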

Specs

  • Tierkreis specifies memory in GB, psij in bytes (covered by the conversion)

  • psij doesn’t support containerization

  • psij uses mpirun as a launcher and automatically adds placement; Tierkreis uses mpiexec, and users need to add placement as an additional argument

  • extra_scheduler_args are mostly incompatible with psij.JobAttributes.custom_args. Conversion makes a best-effort attempt but can’t account for renaming.
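
The memory unit difference amounts to a simple scaling, sketched below. This is not the Tierkreis conversion code: memory_gb_to_bytes is a hypothetical helper, and the factor of 1024**3 bytes per GB is an assumption, since the text above does not say whether a decimal (10**9) or binary factor is used.

```python
from typing import Optional

# Assumption: 1 GB is treated as 1024**3 bytes; swap in 10**9 for decimal units.
BYTES_PER_GB = 1024**3


def memory_gb_to_bytes(memory_gb: Optional[float]) -> Optional[int]:
    """Scale a memory request in GB to bytes, passing None through unchanged."""
    if memory_gb is None:
        # No memory request, as in ResourceSpec(memory_gb=None).
        return None
    if memory_gb < 0:
        raise ValueError("memory_gb must be non-negative")
    return int(memory_gb * BYTES_PER_GB)
```

Passing None through keeps an unset memory request unset on the psij side instead of turning it into a request for zero bytes.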