In part 1 we uncovered some of the features of our neural nature and how these may be used to our advantage in absorbing and understanding new information. In this part, we’ll make use of these tools to try and understand how thinking about a pipeline as a graph can be helpful.
You’ll remember from part 1 our notion of `data`: data is at the heart of digital asset management. Without it, there would be nothing to manage.
So, let’s establish some common ground on what exactly `data` is.
The Location of Data
What I have bestowed upon you is indeed the location of data. This particular set of data resides on a server somewhere and consists of files within a hierarchy of folders.
That’s right. Data in our domain is all about hierarchies; files within a hierarchy, to be more precise, and all of our tools for producing film, games, tv and web content deal with files. So for now, let’s determine that a `file` is the data we manage.
Referring back to our notion of a pipeline, where `data` travels, transforms and arrives, what we are really saying is that a `file` travels, transforms and arrives.
A Hero’s Journey
But where does the `file` go, and what exactly is its transformation?
Let’s refer once again to part 1, where our modeler, Bob, transformed artwork into a 3d model and sent it back out.
What really happened was that Bob was given a `jpeg` and produced an `obj`.
Now let’s assume Bob works strictly by the book; we may then refer to Bob’s input as a pre-condition and his output as a post-condition.
For Bob to perform his service, the incoming `file` will have to be served in accordance with a previously agreed-to contract; in this case, the input will have to arrive in the form of a `jpeg`. In return, Bob has agreed to output his `file` as an `obj`. If the `file` isn’t delivered to Bob in the form of a `jpeg`, then Bob will be unable to guarantee the delivery of an `obj`, if at all.
This notion is referred to as Design by Contract, and it is the same methodology applied to the design of an API; indeed, whether directly or indirectly, it is universal across pipeline design in all industries.
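Bob’s contract can be sketched as a function whose pre- and post-conditions are checked explicitly. This is only an illustration of the idea; the function name and paths are made up, and a real pipeline would do actual work where the comment sits.

```python
from pathlib import Path

def model(artwork: Path) -> Path:
    """Bob's service, sketched with an explicit contract.

    Pre-condition:  the input arrives as a jpeg.
    Post-condition: the output leaves as an obj.
    """
    # Pre-condition: refuse any input that violates the contract.
    assert artwork.suffix == ".jpeg", "contract violated: expected a jpeg"

    # ...the actual modelling work would happen here...
    output = artwork.with_suffix(".obj")

    # Post-condition: guarantee the promised output format.
    assert output.suffix == ".obj", "contract violated: promised an obj"
    return output
```

If the pre-condition fails, the contract is broken and Bob makes no promises; that is exactly the "unable to guarantee the delivery" case above.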
Now that Bob has produced an output, we may locate his output with a `path`, and a `path` may come in multiple flavours.
Bob is a Maya artist and as such the transformation took place within this application.
Additionally, the task given to Bob is located within an Asana project.
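So the same piece of work is located in at least three places at once: on disk, inside Maya, and in Asana. A sketch of what those flavours of `path` might look like; every concrete value below is made up for illustration.

```python
# Three "flavours" of path, each locating the same work
# in a different system (all values are hypothetical):
locations = {
    "filesystem": "/server/projects/hero/bob_model_v001.obj",
    "maya": "|hero|geometry|bob_model",           # a path to a node inside Maya
    "asana": "https://app.asana.com/0/1234/5678", # a path to a task
}

for flavour, path in locations.items():
    print(f"{flavour:>10}: {path}")
```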
But wait. Didn’t we just declare that all `data` is files? If the previous locations are indeed locations of data, we will have to expand our understanding of what `data` really means. Let’s start by establishing some common ground on what it is exactly that we need our pipeline to do.
You’ll recall from part 1 that we defined our goal for Pipi as<div class=sidenote>Output may be altered up-stream and sent down-stream with little or no effort.</div>
But what does that mean, really?
Bob finished and delivered his output a while ago and it is now being transformed by another artist.
During transformation, it occurs to the recipient that:
“We need change”
But `data` was never designed to be passed up-stream. Bob signed a contract clearly stating that he would receive a `jpeg` and output an `obj`. The recipient artist, on the other hand, receives an `obj` and outputs something else.
What is it exactly that Bob will receive? Does anyone know?
In part 1, I briefly touched upon the model of a graph, more specifically the Directed Acyclic Graph, and how it may be helpful to us in thinking about a pipeline.<div class=sidenote>The DAG refers to a processing element as a `vertex` and a connection as an `edge`. These keywords already occupy space in our brains relating to 3d geometry, so I will instead refer to these by the more familiar terms `node` and `link` respectively.</div>
It would seem as though this is where this model falls apart.
The DAG is designed so that each `node` must finish its computation prior to outputting any new information; so how can information flow from one artist to the next if at any point in time that information may reverse direction?
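To make the model concrete, here is a minimal DAG of our artists sketched with Python’s standard-library `graphlib`. The node names are our hypothetical artists, not anything from a real pipeline; the point is that a processing order only exists because the graph is acyclic.

```python
from graphlib import TopologicalSorter  # stdlib, Python 3.9+

# Each key is a node; its value is the set of nodes it depends on.
links = {
    "bob": set(),        # Bob receives his jpeg from outside the graph
    "john": {"bob"},     # John receives Bob's obj
    "render": {"john"},
}

# A topological order exists only because the graph is acyclic:
# information flows strictly one way, never back up-stream.
order = list(TopologicalSorter(links).static_order())
print(order)  # ['bob', 'john', 'render']
```

Add a link from `render` back to `bob` and `static_order()` raises a `CycleError`; the model simply has no answer for information that reverses direction.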
In an ideal world, Bob would instantly output no more and no less than what is required of him. But unfortunately for Bob, we live in the real world, which is messy, and in our messy real world we must expect the unexpected. We must plan for change.
Responding to Change
What we have just witnessed is called the waterfall model.
You may notice its correlation to the terms I’ve been using so far, up-stream and down-stream, and indeed that is no coincidence.
The waterfall model was designed for an ideal world – but as just discussed, our world is far from ideal.
The Ideal Graph
How do you design a graph capable of passing data up-stream?
The answer is, you don’t.
In February of 2001, a group of 17 developers got together to write The Agile Manifesto. The manifesto contains four values, one of which is of particular interest to us.
- Responding to change over following a plan
An Agile Graph
Let’s have a look at how we can re-shape our thinking about a graph for production with the agile manifesto in mind.
Let’s define a `branch` as a sequence of `nodes`. In order for change to find its place within a graph, we must reduce the time taken to get from input to output in any `branch` of our graph; in short, less latency means more room for change.
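To make "latency" concrete: a branch’s input-to-output time is simply the sum of its nodes’ processing times. The node names and the numbers below are made up for illustration.

```python
# A hypothetical branch: a sequence of nodes, each with a
# (made up) processing time in hours.
branch = [("model", 5.0), ("texture", 3.0), ("rig", 2.0)]

# The branch's latency is the time from its input to its output.
latency = sum(hours for _, hours in branch)
print(latency)  # 10.0
```

Shave time off any node in the sequence and the whole branch turns work around faster, leaving more room for change to travel through it.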
Traditionally (i.e. in the waterfall model), the output of each `branch` must be completed prior to being sent back out. In a digital asset management pipeline, however, this may not be the most efficient way to go.
Consider Bob’s recipient, John.
John is dependent on the output of Bob, and John has signed a contract specifying that he will receive an `obj` and output something else in turn.
In a waterfall-enabled environment, John would not receive any information until the output of Bob is final.
But as we know, final is but a mirage.
What we would like to have happen, is for Bob to output partially finished information while still processing, so that John could begin his processing as soon as possible.<div class=sidenote>We want *partially finished* information while *still processing*</div>
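One way to picture partially finished information is a node that emits intermediate versions while still processing. A minimal sketch using a Python generator, with made-up version names; this illustrates the idea only, not how any particular pipeline implements it.

```python
def bob():
    """Yield increasingly finished versions of the model,
    instead of a single final delivery."""
    for version in ("blockout", "sculpt", "final"):
        yield f"model_{version}.obj"

def john(deliveries):
    """Begin processing each delivery the moment it arrives."""
    for obj in deliveries:
        print(f"john starts working with {obj}")

# John starts on the blockout long before the final obj exists.
john(bob())
```

Because `bob()` is lazy, John consumes each version as it is produced; nothing waits for "final", which, as noted above, is but a mirage anyway.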
The Holy Grail
[Working with a pipeline is like] rebuilding a plane mid-flight – Steve Lavietes of Imageworks from SIGGRAPH University 2013
This is where pipelines earn their reputation for being colossal, ever-changing and hard to manage. This is where pipelines, and their architects, are truly put to the test.
In the next part, we’ll dissect this last statement further and see how, and if, a pipeline really is all that colossal.
Stay tuned, thanks and see you in a bit.
- part 1
- Directed Acyclic Graph
- Waterfall model
- Data Upstream
- Pipeline illustration
- Design by contract
- URL UML
- Uniform Resource Locator
- Object Aggregation
- The Agile Manifesto