{ "cells": [ { "cell_type": "markdown", "id": "e039fe77", "metadata": {}, "source": [ "# Non-python workers, Multiple Executors\n", "\n", "Tierkreis is easiest to use with Python workers.\n", "Still, it supports arbitrary workers that provide an executable to run.\n", "In this example we will look at using a shell script as a worker.\n", "Conceptually, this also works with other programming languages, for example if you want to use a legacy HPC application.\n", "\n", "To make this work we need to introduce three changes:\n", "1. (Optional) Provide an interface definition using a type spec. This allows Tierkreis to generate API stubs that can be used during graph construction.\n", "2. Rewrite or wrap the script so that it follows the Tierkreis contract. In short, we have to change how we read inputs and write outputs.\n", "3. Use a suitable executor. There are two flavors, which we will discuss in a bit.\n", "\n", "\n", "## Preliminaries\n", "\n", "For this example we're going to use the auth worker, which already has some predefined functionality.\n", "Make sure to familiarize yourself with its contents.\n", "As the second worker we're going to use a shell script with the following contents, which is contained in the `openssl_worker`.\n", "\n", "```bash\n", "#!/usr/bin/env bash\n", "pk_file=$1\n", "passphrase=$2\n", "numbits=$3\n", "\n", "openssl genrsa -out $pk_file -aes128 -passout \"file:$passphrase\" $numbits\n", "openssl rsa -in $pk_file -passin \"file:$passphrase\" -pubout -out public-out\n", "```\n", "\n", "## Defining the interface\n", "To describe the worker's interface we can write a definition in the TypeSpec DSL.\n", "\n", "```\n", "@portmapping\n", "model Outputs {\n", "    private_key: bytes\n", "    public_key: bytes\n", "}\n", "\n", "interface openssl_worker {\n", "    genrsa (\n", "        numbits: int,\n", "        passphrase: bytes\n", "    ): Outputs\n", "}\n", "\n", "```\n", "It defines two outputs, `private_key` and `public_key`, produced from the inputs `passphrase` and `numbits`.\n", "The most generic type 
for such values is bytes, as we will be reading from and writing to files directly.\n", "Now we can generate the stubs using the CLI `tkr init stubs` or from Python." ] }, { "cell_type": "code", "execution_count": null, "id": "631b6c35", "metadata": {}, "outputs": [], "source": [ "%pip install tierkreis" ] }, { "cell_type": "code", "execution_count": null, "id": "738b90d8", "metadata": {}, "outputs": [], "source": [ "from pathlib import Path\n", "from tierkreis.namespace import Namespace\n", "\n", "\n", "if __name__ == \"__main__\":\n", "    tsp_path = (\n", "        Path().parent / \"example_workers\" / \"openssl_worker\" / \"src\" / \"schema.tsp\"\n", "    )\n", "    namespace = Namespace.from_spec_file(tsp_path)\n", "    namespace.write_stubs(tsp_path.parent / \"api\" / \"stubs.py\")" ] }, { "cell_type": "markdown", "id": "2ea26516", "metadata": {}, "source": [ "## Adapting the script to the tierkreis contract\n", "\n", "Tierkreis writes intermediate values to storage and expects outputs to be written there as well.\n", "To make this less verbose, Tierkreis exports the corresponding locations as environment variables with the following schema.\n", "For an input, e.g. `numbits`, there will be a variable `$input_numbits_file` (in general `$input_<name>_file`), which you can read in your script as `numbits=$(cat $input_numbits_file)`.\n", "Analogously, for each output, e.g. `private_key`, there will be a variable `$output_private_key_file` (in general `$output_<name>_file`), which you can write to in your script, e.g. with `tee`.\n", "This is convenient if your script already reads from and writes to files.\n", "If you instead want to use the values directly, you can also set up the executors (we will see this later) to pass the values in the variables. 
\n", "This is only available for inputs, whose variables then have the form `$input_<name>_value`.\n", "\n", "Adding these changes to the script above yields:\n", "```bash\n", "numbits=$(cat $input_numbits_file)\n", "openssl genrsa -out $output_private_key_file -aes128 -passout \"file:$input_passphrase_file\" $numbits\n", "openssl rsa -in $output_private_key_file -passin \"file:$input_passphrase_file\" -pubout -out $output_public_key_file\n", "```\n", "\n", "## Building a graph using the script\n", "We're going to build a graph that checks whether we successfully signed a message with a generated private key.\n" ] }, { "cell_type": "code", "execution_count": null, "id": "3e527fb9", "metadata": {}, "outputs": [], "source": [ "from tierkreis.builder import GraphBuilder\n", "from tierkreis.models import TKR, EmptyModel\n", "\n", "from auth_worker import sign, verify\n", "from openssl_worker import genrsa, Outputs\n", "\n", "\n", "def signing_graph():\n", "    g = GraphBuilder(EmptyModel, TKR[bool])\n", "    message = g.const(\"dummymessage\")\n", "    passphrase = g.const(b\"dummypassphrase\")\n", "\n", "    key_pair: Outputs = g.task(genrsa(g.const(4096), passphrase))\n", "    private_key: TKR[bytes] = key_pair.private_key\n", "    public_key: TKR[bytes] = key_pair.public_key\n", "\n", "    signing_result = g.task(sign(private_key, passphrase, message)).hex_signature\n", "    verification_result = g.task(verify(public_key, signing_result, message))\n", "    g.outputs(verification_result)\n", "\n", "    return g" ] }, { "cell_type": "markdown", "id": "47131bbc", "metadata": {}, "source": [ "We can now run this graph.\n", "Before we continue, we have to set up the storage." 
] }, { "cell_type": "code", "execution_count": null, "id": "fac74824", "metadata": {}, "outputs": [], "source": [ "from uuid import UUID\n", "from tierkreis.storage import FileStorage\n", "\n", "storage = FileStorage(UUID(int=105))\n", "storage.clean_graph_files()" ] }, { "cell_type": "markdown", "id": "0dfc9d75", "metadata": {}, "source": [ "## Setting up the correct executors\n", "\n", "In the graph above, we use two types of workers: Python and shell.\n", "This means we also need to set up the executors accordingly.\n", "For the Python workers we can use the `UvExecutor` as before, while for shell scripts there is the `ShellExecutor`.\n", "Since we can only provide a single executor to the run, we have to combine them using the `MultipleExecutor`.\n" ] }, { "cell_type": "code", "execution_count": null, "id": "08a787e7", "metadata": {}, "outputs": [], "source": [ "from tierkreis.executor import MultipleExecutor, UvExecutor, ShellExecutor\n", "\n", "registry_path = Path().parent / \"example_workers\"\n", "uv = UvExecutor(registry_path, storage.logs_path)\n", "shell = ShellExecutor(\n", "    registry_path, storage.workflow_dir\n", ")  # export_values=True enables passing values via env vars\n", "executor = MultipleExecutor(uv, {\"shell\": shell}, {\"openssl_worker\": \"shell\"})" ] }, { "cell_type": "markdown", "id": "21ede17f", "metadata": {}, "source": [ "The `MultipleExecutor` takes a default executor `uv`, a dictionary of named executors `{\"shell\": shell}`, and a mapping from worker name to executor name `{\"openssl_worker\": \"shell\"}`.\n", "With the executor defined, we can now run the graph." 
] }, { "cell_type": "code", "execution_count": null, "id": "9acb00bb", "metadata": {}, "outputs": [], "source": [ "from tierkreis.storage import read_outputs\n", "from tierkreis import run_graph\n", "\n", "run_graph(storage, executor, signing_graph().get_data(), {})\n", "is_verified = read_outputs(signing_graph().get_data(), storage)\n", "print(is_verified)" ] }, { "cell_type": "markdown", "id": "ae3fccb1", "metadata": {}, "source": [ "## Running simple scripts\n", "The above example is the most common way of running scripts with multiple inputs and outputs.\n", "There is an even simpler way if your script meets the following conditions:\n", "- It has a single input which it reads from `stdin`\n", "- It has a single output which it writes to `stdout`\n", "\n", "For example, the standard `tee` command does exactly that.\n", "For such scripts, we can use the `script` function in conjunction with the `StdInOut` executor." ] }, { "cell_type": "code", "execution_count": null, "id": "cc0fef99", "metadata": {}, "outputs": [], "source": [ "from tierkreis.builder import script\n", "from tierkreis.controller.executor.stdinout import StdInOut\n", "\n", "\n", "def stdinout_graph():\n", "    g = GraphBuilder(EmptyModel, TKR[bytes])\n", "    message = g.const(b\"dummymessage\")\n", "    output = g.task(script(\"tee\", message))\n", "\n", "    g.outputs(output)\n", "    return g\n", "\n", "\n", "storage.clean_graph_files()\n", "stdinout = StdInOut(registry_path, storage.workflow_dir)\n", "run_graph(storage, stdinout, stdinout_graph().get_data(), {})\n", "out = read_outputs(stdinout_graph().get_data(), storage)\n", "print(out)" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.13.11" } }, "nbformat": 4, "nbformat_minor": 5 
}