{ "cells": [ { "cell_type": "markdown", "id": "52132a14", "metadata": { "collapsed": false, "jupyter": { "outputs_hidden": false } }, "source": [ "# Lesson 1: A first workflow\n", "\n", "```{note}\n", "Have you already installed everything you need? If you're\n", "not sure, check the \"Before you begin\" section of the\n", "[introduction](../index.md).\n", "```\n", "\n", "In this lesson, we'll construct a very simple workflow that\n", "reads three numbers from its inputs and outputs their sum. It's\n", "nearly trivial, but it will demonstrate the essentials of working\n", "with Tierkreis:\n", "1. Defining a graph through its inputs and outputs\n", "2. Constructing the computation\n", "   - Using the inputs\n", "   - Using simple nodes\n", "   - Using built-in functionality\n", "3. Checking what we've done with the visualizer\n", "4. Running the workflow\n", "\n", "Let's get started!" ] }, { "cell_type": "markdown", "id": "86dd44d1-0a31-48c8-935b-9de1dd482f41", "metadata": {}, "source": [ "## Setting up the first graph\n", "\n", "Workflows are defined by their *graphs*, which describe how the inputs produce the outputs. The first step in defining a graph is to specify those inputs and outputs, which means knowing their names and their *types*.\n", "\n", "Types are optional in Tierkreis, but since compile-time errors are cheap and run-time errors are expensive, we strongly recommend using them everywhere.\n", "\n", "```{note}\n", "In order to keep a clear separation between the types used in the Tierkreis graph and the types already present in the Python\n", "language, we wrap the former with `TKR`. In other words, the `TKR[A]` wrapper type indicates that an edge in the graph contains a value\n", "of type `A`. More on this in the [core concepts](../tutorial/core_concepts.md#types).\n", "```\n", "\n", "In general, a graph can have any number of inputs or outputs. 
In this example, we have three inputs, so our first step is to create a Python object to hold them. We always use a `NamedTuple` for this. There's nothing special about the name `InParams`; it's just intended to be descriptive.\n", "\n", "Since we have only one output, and we already know that its type is `TKR[float]`, we don't need to define anything for it." ] }, { "cell_type": "code", "execution_count": null, "id": "5783b41e-ea9b-41c4-8ce7-406971f40908", "metadata": {}, "outputs": [], "source": [ "from typing import NamedTuple\n", "from tierkreis.models import TKR\n", "\n", "\n", "class InParams(NamedTuple):\n", "    a: TKR[float]\n", "    b: TKR[float]\n", "    c: TKR[float]" ] }, { "cell_type": "markdown", "id": "6ac7c4f3-107f-4675-a96d-f9a4b39f50dd", "metadata": {}, "source": [ "Graphs are built using a `Graph`, which needs to be instantiated with the types of its inputs and outputs." ] }, { "cell_type": "code", "execution_count": null, "id": "c15164f5", "metadata": { "collapsed": false, "jupyter": { "outputs_hidden": false } }, "outputs": [], "source": [ "from tierkreis.builder import Graph\n", "\n", "g = Graph(InParams, TKR[float])" ] }, { "cell_type": "markdown", "id": "226d8c71", "metadata": { "collapsed": false, "jupyter": { "outputs_hidden": false } }, "source": [ "## Building the Graph\n", "\n", "Initially the graph doesn't do anything. To change this, we need to add some nodes to the graph. There are several kinds of nodes (inputs and outputs are nodes, for example), but the most useful are called *Tasks*. Roughly speaking, a task is any kind of computation. Most tasks are done by *Workers*, but for basic things, like adding numbers together, we can rely on *built-ins*. 
Tierkreis has many built-ins (see the [API docs](#tierkreis.builtins.main)), but the only one we need today is floating-point addition, `add`.\n" ] }, { "cell_type": "code", "execution_count": null, "id": "ad221389-f2d8-46cf-8644-27d1fdad6938", "metadata": {}, "outputs": [], "source": [ "from tierkreis.builtins import add" ] }, { "cell_type": "markdown", "id": "1546d4c8-d6bd-4533-9b9f-9f44ab8ae232", "metadata": {}, "source": [ "Next we're going to construct the graph by calling the builder functions.\n", "Each function adds a node to the graph; the arguments are the input edges to the node, and the return values are its output edges. Notice that `inputs` is a special node created when we instantiated `g`, and the names of the inputs are the ones we chose when defining the `InParams` class earlier." ] }, { "cell_type": "code", "execution_count": null, "id": "1f891725", "metadata": { "collapsed": false, "jupyter": { "outputs_hidden": false } }, "outputs": [], "source": [ "x = g.task(add(g.inputs.a, g.inputs.b))\n", "y = g.task(add(x, g.inputs.c))" ] }, { "cell_type": "markdown", "id": "a1279db9", "metadata": { "collapsed": false, "jupyter": { "outputs_hidden": false } }, "source": [ "Finally, we have to say which edges of our graph will be the outputs. In our example there's only one.\n", "Once you call `finish_with_outputs`, the graph can be run as a `Workflow`.\n", "This call also indicates that the graph won't be changed later." ] }, { "cell_type": "code", "execution_count": null, "id": "bfdf37b9", "metadata": { "collapsed": false, "jupyter": { "outputs_hidden": false } }, "outputs": [], "source": [ "workflow = g.finish_with_outputs(y)" ] }, { "cell_type": "markdown", "id": "12670b4a-af60-404f-9e3f-cb2145c2472c", "metadata": {}, "source": [ "Congratulations - you've built your first workflow! But what does it look like?" 
] }, { "cell_type": "markdown", "id": "90e724f1", "metadata": { "collapsed": false, "jupyter": { "outputs_hidden": false } }, "source": [ "## Using the visualizer\n", "\n", "Tierkreis comes with an additional library to keep track of your workflows. Its main use is to observe a running workflow, but you can also use it to examine graphs that you are currently constructing.\n", "```{info}\n", "If you're running this from the tierkreis repository, you need to set up the frontend once by running `just prod`.\n", "```\n", "\n", "The visualizer runs a local web application in the same process. To stop it, press `Ctrl+C`." ] }, { "cell_type": "code", "execution_count": null, "id": "e3c19537", "metadata": { "collapsed": false, "jupyter": { "outputs_hidden": false } }, "outputs": [], "source": [ "%%script false --no-raise-error\n", "from tierkreis_visualization.visualize_graph import visualize_graph\n", "\n", "visualize_graph(g)" ] }, { "cell_type": "markdown", "id": "740edc44-bb21-47c8-8624-e8cb1346bf50", "metadata": {}, "source": [ "Opening the web interface at `localhost:8000` will show the landing page with the workflow overview.\n", "![Landing Page](../_static/first_graph_overview.png)\n", "\n", "After selecting the `tmp` workflow, you will see a representation of the graph you just created.\n", "![Graph](../_static/first_graph.png)\n", "\n", "It shows the three input nodes `a`, `b` and `c`, the two task nodes `builtins.add`, and an output with value `null` because the workflow hasn't run yet.\n", "For the same reason, all the nodes are depicted in white, which means they haven't been started yet.\n", "\n", "To learn more about the visualizer, see [this page](../tutorial/visualization.md)." ] }, { "cell_type": "markdown", "id": "9f86ba49", "metadata": { "collapsed": false, "jupyter": { "outputs_hidden": false } }, "source": [ "## Running the workflow\n", "\n", "Now that we have made the workflow and checked that it looks as we expect, it's time to run it.\n", "\n", 
"Tierkreis can run in a lot of complex configurations, but for this tutorial we will be running it locally. It's pretty simple, but there will be some code that we won't explain here. To run a general Tierkreis workflow we need to set up:\n", "\n", "- a way to store and share input and output values (the 'storage' interface)\n", "- a way to run tasks (the 'executor' interface)\n", "\n", "For this example we use the `FileStorage` that is provided by the Tierkreis library itself.\n", "The inputs and outputs will be stored in a directory on disk.\n", "(By default the files are stored in `~/.tierkreis/checkpoints/`, where `` is a `UUID` identifying the workflow.)" ] }, { "cell_type": "code", "execution_count": null, "id": "48846186", "metadata": { "collapsed": false, "jupyter": { "outputs_hidden": false } }, "outputs": [], "source": [ "from uuid import UUID\n", "\n", "\n", "from tierkreis.storage import FileStorage\n", "\n", "storage = FileStorage(workflow_id=UUID(int=12345), name=\"Hello World Graph\")" ] }, { "cell_type": "markdown", "id": "ac3c1225", "metadata": { "collapsed": false, "jupyter": { "outputs_hidden": false } }, "source": [ "If we have already run this example then there will already be files at this directory in the storage.\n", "If we want to reuse the directory then run" ] }, { "cell_type": "code", "execution_count": null, "id": "b4757171", "metadata": { "collapsed": false, "jupyter": { "outputs_hidden": false } }, "outputs": [], "source": [ "storage.clean_graph_files()" ] }, { "cell_type": "markdown", "id": "ec90fb03", "metadata": { "collapsed": false, "jupyter": { "outputs_hidden": false } }, "source": [ "to get a fresh area to work in.\n", "\n", "Since we are just using the Tierkreis built-in tasks the executor will not actually be called.\n", "As a placeholder we create a simple `ShellExecutor`, also provided by the Tierkreis library, which can run bash scripts in a specified directory. 
In this case we can pass `None` as its `registry_path`.\n" ] }, { "cell_type": "code", "execution_count": null, "id": "925fbce0", "metadata": { "collapsed": false, "jupyter": { "outputs_hidden": false } }, "outputs": [], "source": [ "from tierkreis.executor import ShellExecutor\n", "\n", "executor = ShellExecutor(registry_path=None, workflow_dir=storage.workflow_dir)" ] }, { "cell_type": "markdown", "id": "bb1a9769", "metadata": { "collapsed": false, "jupyter": { "outputs_hidden": false } }, "source": [ "As the penultimate step, we need to provide the workflow inputs as a dictionary, which we get from the input class.\n", "If the inputs are not provided, the workflow will encounter an error." ] }, { "cell_type": "code", "execution_count": null, "id": "ecc65977", "metadata": { "collapsed": false, "jupyter": { "outputs_hidden": false } }, "outputs": [], "source": [ "inputs = InParams(0, 0.25, 0.5)._asdict()" ] }, { "cell_type": "markdown", "id": "66845ea8", "metadata": { "collapsed": false, "jupyter": { "outputs_hidden": false } }, "source": [ "\n", "With the storage and executor specified and the inputs set, we can now run the graph using `run_graph`." ] }, { "cell_type": "code", "execution_count": null, "id": "ca96cb3d", "metadata": { "collapsed": false, "jupyter": { "outputs_hidden": false } }, "outputs": [], "source": [ "from tierkreis.controller import run_graph\n", "from tierkreis.storage import read_outputs\n", "\n", "run_graph(storage, executor, workflow, inputs)\n", "result = read_outputs(workflow, storage)\n", "print(result)" ] } ], "metadata": { "kernelspec": { "display_name": "tierkreis", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.13.11" }, "name": "first_graph.ipynb" }, "nbformat": 4, "nbformat_minor": 5 }