Abstract Factory Blog

Pipi Object Model pt. 2

27 Mar 2014

In part 1 we uncovered some of the features of our neural nature and how these may be used to our advantage in absorbing and understanding new information. In this part, we’ll make use of these tools to try and understand how thinking about a pipeline as a graph can be helpful.

Data

You’ll remember from part 1 our notion of data

Well, data is at the heart of digital asset management. Without it, there would be nothing to manage.

So, let’s establish some common ground on what exactly data is.

The location data

/server/project/sequence/shot/instance/cache/data.abc

What I have bestowed upon you is indeed the location of data. This particular set of data resides on a server somewhere and consists of the cache for an instance of a shot of a sequence of a project.
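To make the hierarchy explicit, here is a minimal sketch that reads the path one level at a time. The level names are simply taken from the example path; they are an assumption for illustration, not a schema that Pipi prescribes.

```python
from pathlib import PurePosixPath

# Hypothetical level names, read off the example path above.
LEVELS = ("server", "project", "sequence", "shot", "instance", "cache")


def describe(path):
    """Map each directory of `path` to a level name, plus the file itself."""
    parts = PurePosixPath(path).parts[1:]  # drop the leading "/"
    *directories, filename = parts
    if len(directories) != len(LEVELS):
        raise ValueError(
            f"expected {len(LEVELS)} directories, got {len(directories)}"
        )
    described = dict(zip(LEVELS, directories))
    described["file"] = filename
    return described


location = describe("/server/project/sequence/shot/instance/cache/data.abc")
```

Each key in `location` is one step down the hierarchy, with the file at the very bottom.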

Data Hierarchy

That’s right. Data in our domain is all about hierarchies; files within a hierarchy, to be more precise. All of our tools for producing film, games, tv, web content and so on deal with files. So for now, let’s determine that a file = data = file.

Referring back to our notion of a pipeline, where data travels, transforms and arrives; what we are really saying is that a file travels, transforms and arrives.

A Hero’s Journey

But where does the file go, and what exactly is its transformation?

Let’s refer yet again back to part 1, where our modeler, Bob, transformed artwork into a 3d model and sent it back out.

What really happened was that Bob was given a jpeg and produced an obj.

Data Transformation

Now let’s assume Bob is strictly by the book; we may then refer to Bob’s input as a pre-condition and his output as a post-condition.

For Bob to perform his service, the incoming file will have to be served in accordance with a previously agreed-to contract – in this case, the input will have to arrive in the form of a jpeg. In return, Bob has agreed to output his file as an obj.

If the file isn’t delivered to Bob in the form of a jpeg, then Bob cannot guarantee the delivery of an obj, if he can deliver anything at all.

This notion is referred to as Design by contract. It is the same methodology that is applied to the design of an API and, whether directly or indirectly, it shows up across pipeline design in all industries.
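Bob’s contract can be sketched in a few lines. This is only an illustration under stated assumptions: the `with_contract` decorator, the `ContractError` exception and the artwork file name are all made up for the example, and checking the file suffix stands in for whatever validation a real pipeline would perform.

```python
from pathlib import PurePosixPath


class ContractError(Exception):
    """Raised when a pre- or post-condition is violated."""


def with_contract(accepts, produces):
    """Wrap a file-to-file function with a suffix check on input and output."""
    def decorate(func):
        def wrapper(source):
            # Pre-condition: the incoming file must have the agreed format.
            if PurePosixPath(source).suffix != accepts:
                raise ContractError(
                    f"pre-condition: expected {accepts}, got {source}"
                )
            result = func(source)
            # Post-condition: the outgoing file must have the agreed format.
            if PurePosixPath(result).suffix != produces:
                raise ContractError(
                    f"post-condition: expected {produces}, got {result}"
                )
            return result
        return wrapper
    return decorate


@with_contract(accepts=".jpeg", produces=".obj")
def bob(artwork):
    # Stand-in for the modelling work: only the file name changes here.
    return str(PurePosixPath(artwork).with_suffix(".obj"))
```

Handing `bob` a jpeg returns the path of an obj; handing him anything else raises `ContractError` before any work is attempted.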

Data Tracking

Now that Bob has produced an output, we may locate it by means of a path, and a path may come in multiple flavours.

/server/project01/hero/bobs_model.obj

Bob is a Maya artist and as such the transformation took place within this application.

|Hero|L_arm_GRP|L_elbow_PLY

Additionally, the task given to Bob is located within an Asana project.

https://app.asana.com/0/846062819581/1050536261774
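The three locations above look different but can be told apart, and taken apart, mechanically. The sketch below is a rough heuristic of my own for the three examples shown, not part of Maya, Asana or any other tool; the function names `flavour` and `components` are hypothetical.

```python
from urllib.parse import urlparse


def flavour(location):
    """Guess which kind of location a string is (a rough heuristic)."""
    if urlparse(location).scheme in ("http", "https"):
        return "url"
    if location.startswith("|"):
        return "maya"  # the Maya path above separates its parts with "|"
    if location.startswith("/"):
        return "file"
    return "unknown"


def components(location):
    """Split a location into its parts, whatever the flavour."""
    kind = flavour(location)
    if kind == "url":
        parsed = urlparse(location)
        return [parsed.netloc] + [p for p in parsed.path.split("/") if p]
    separator = "|" if kind == "maya" else "/"
    return [p for p in location.split(separator) if p]
```

For example, `components("|Hero|L_arm_GRP|L_elbow_PLY")` gives `["Hero", "L_arm_GRP", "L_elbow_PLY"]`, and the file path and the URL split up in the same manner.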

Conflict

But wait... Didn’t we just declare that data = file = data?

If the previous locations are indeed locations of data, we will have to expand on our understanding of what data really means. Let’s start by establishing some common ground on what exactly it is that we need our pipeline to do.

Common Ground

You’ll recall from part 1 that we defined our goal for Pipi as<div class=sidenote>Output may be altered up-stream and sent down-stream with little or no effort.</div>

But what does that mean, really?

Imagine Bob.

Bob finished and delivered his output a while ago and it is now being transformed by another artist.

During transformation, it occurs to the recipient that:

“We need change”

Data Upstream

Data was never designed to be passed up-stream. Bob signed a contract clearly stating that he would receive a jpeg and output an obj. The recipient artist, on the other hand, receives an obj and outputs something else.

What is it exactly that Bob will receive? Does anyone know?

Acyclic Failure

In part 1, I briefly touched upon the model of a graph, more specifically a Directed Acyclic Graph, and how it may be helpful to us in thinking about a pipeline.<div class=sidenote>The DAG refers to a processing element as a `vertex` and a connection as an `edge`. These keywords already occupy space in our brains relating to 3d geometry, so I will instead refer to these by the more familiar terms `node` and `link` respectively.</div>

It would seem as though this is where this model falls apart.

The DAG is designed so that each node must finish its computation prior to outputting any new information – so how can information flow from one artist to the next if at any point in time that information may reverse direction?
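To see the problem in miniature, here is a small sketch using the `graphlib` module from the Python standard library (Python 3.9 and later). The node names `artwork` and `recipient` are placeholders for Bob’s neighbours, and the graph itself is an assumption made up for the example.

```python
from graphlib import CycleError, TopologicalSorter

# Each node maps to the set of nodes it depends on (its incoming links).
pipeline = {
    "bob": {"artwork"},
    "recipient": {"bob"},
}

# A valid processing order exists as long as the graph stays acyclic.
order = list(TopologicalSorter(pipeline).static_order())

# A change request travelling up-stream makes Bob depend on his own recipient.
with_feedback = {**pipeline, "bob": {"artwork", "recipient"}}

try:
    list(TopologicalSorter(with_feedback).static_order())
    has_cycle = False
except CycleError:
    has_cycle = True
```

The first graph sorts into `artwork`, `bob`, `recipient`. The second has no valid order at all, since the link back up-stream turns it into a cycle.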

In an ideal world, Bob would instantly output no more and no less than what is required from him. Unfortunately for Bob, we live in the real world, which is messy, and in our messy real world we must expect the unexpected. We must plan for change.

Responding to Change

What we have just witnessed is called the waterfall model.

Waterfall model

You may notice its correlation to the terms I’ve been using so far – flow, up-stream and down-stream and indeed that is no coincidence.

The waterfall model was designed for an ideal world – but as just discussed, our world is far from ideal.

The Ideal Graph

How do you design a graph capable of passing data up-stream?

The answer is, you don’t.

In February of 2001, a group of 17 developers got together to write The Agile Manifesto. In this manifesto there were four values, one of which is of particular interest to us: responding to change over following a plan.

An Agile Graph

Let’s have a look at how we can re-shape our thinking about a graph for production with the agile manifesto in mind.

Let’s define a branch as a sequence of nodes. In order for change to find its place within a graph, we must reduce the time taken to get from input to output in any branch of our graph; in short, less latency means more room for change.

Traditionally (i.e. in the waterfall model), the output of each branch must be completed prior to being sent back out. In a digital asset management pipeline, however, this may not be the most efficient way to go.

Consider Bob’s recipient, John.

John is dependent on the output of Bob and John has signed a contract specifying that he will receive an obj and output an mb.

In a waterfall-enabled environment, John would not be receiving any information until the output of Bob is final.

But as we know, final is but a mirage.

What we would like to have happen is for Bob to output partially finished information while still processing, so that John can begin his processing as soon as possible.<div class=sidenote>We want *partially finished* information while *still processing*</div>
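One way to picture this is with generators, where each node hands over a version as soon as it exists instead of waiting for the last one. This is a toy sketch: the stage names and file names are invented, and the two functions take turns rather than truly running in parallel, so it models the order of events and nothing more.

```python
events = []


def bob(stages):
    """Yield a new version of the model after each stage of work."""
    for version, stage in enumerate(stages, start=1):
        events.append(f"bob outputs v{version} ({stage})")
        yield f"hero_v{version:03d}.obj"  # hypothetical file naming


def john(models):
    """Start on every version as it arrives, honouring obj in and mb out."""
    for obj in models:
        events.append(f"john starts on {obj}")
        yield obj.replace(".obj", ".mb")


outputs = list(john(bob(["blockout", "detail", "cleanup"])))
```

Reading `events` afterwards shows the two artists interleaved: John starts on `hero_v001.obj` before Bob has produced his second version.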

The Holy Grail

[Working with a pipeline is like] rebuilding a plane mid-flight – Steve Lavietes of Imageworks from SIGGRAPH University 2013

This is where pipelines get their reputation for being colossal, ever-changing and hard to manage. This is where pipelines, and their architects, are truly put to the test.

In the next part, we’ll dissect this last statement further and see whether a pipeline really is all that colossal.

Stay tuned, thanks and see you in a bit.

Marcus

Links
- part 1
- Directed Acyclic Graph
- Waterfall model
- Data Upstream
- Pipeline illustration
- API
- Design by contract
- URL UML
- Uniform Resource Locator
- Object Aggregation
- The Agile Manifesto