1.3. Orc: Orchestrating services

Cor is a pure declarative language. It has no state, since variables are bound at most once and cannot be reassigned. Evaluation of an expression results in at most one value. It cannot communicate with the outside world except by producing a value. The full Orc language transcends these limitations by incorporating the orchestration of external services. We introduce the term site to denote an external service which can be called from an Orc program.

As in Cor, an Orc program is an expression; Orc expressions are built up recursively from smaller expressions. Orc is a superset of Cor, i.e., all Cor expressions are also Orc expressions. Orc expressions are executed, rather than evaluated; an execution may call external services and publish some number of values (possibly zero). Different executions of the same expression may have completely different behaviors; they may call different services, may receive different responses from the same site, and may publish different values. Expressions in the functional subset, though, will display the same behavior in all executions.

In the following sections we discuss the features of Orc. First, we discuss how Orc communicates with external services. Then we introduce Orc's concurrency combinators, which combine expressions into larger orchestrations and manage concurrent executions. We have already discussed the functional subset of Orc in our coverage of Cor, but we reprise some of those topics; some Cor constructs exhibit new behaviors in the concurrent, stateful context of Orc.

The following figure summarizes the syntax of Orc as an extension of the syntax of Cor. The original Cor grammar rules are abbreviated by ellipses (...).

Table 1.3. Basic Syntax of Orc

D::= ... Declaration  
 | site X = address   site declaration
C::= ... Constant  
 | signal   signal value
E::=... Expression  
 | stop   silent expression
 |E >P> E  sequential combinator
 |E | E  parallel combinator
 |E <P< E  pruning combinator
 |E ; E  otherwise combinator

1.3.1. Communicating with external services

An Orc expression may be a site call. Sites are called using the same syntax as a function call, but with a slightly different meaning. Sites are introduced and bound to variables by a special declaration.

1.3.1.1. Calling a site

Suppose that the variable Google is bound to a site which invokes the Google search engine service in "I'm Feeling Lucky" mode. A call to Google looks just like a function call. Calling Google requests the URL of the top result for the given search term.

{- Get the top search result for "computation orchestration" -}
Google("computation orchestration")
Once the Google search service determines the top result, it sends a response. The site call then publishes that response. Note that the service might not respond: Google's servers might be down, the network might be down, or the search might yield no result URL.

A site call sends only a single request to an external service and receives at most one response. These restrictions have two important consequences. First, all of the information needed for the call must be present before contacting the service. Thus, site calls are strict; all arguments must be bound before the call can proceed. If any argument is silent, the call never occurs. Second, a site call publishes at most one value, since at most one response is received.

A call to a site has exactly one of the following effects:

  1. The site returns a value, called its response.
  2. The site communicates that it will never respond to the call; we say that the call has halted
  3. The site neither returns a value nor indicates that the call has halted; we say that the call is pending.

In the last two cases, the site call is said to be silent. However, unlike a silent expression in Cor, a silent site call in Orc might perform some meaningful computation or communication; silence does not necessarily indicate an error. Since halted site calls and pending site calls are both silent, they cannot usually be distinguished from each other; only the otherwise combinator can tell the difference.

A site is a value, just like an integer or a list. It may be bound to a variable, passed as an argument, or published by an execution, just like any other value.

{-
  Create a search site from a search engine URL,
  bind the variable Search to that site,
  then use that site to search for a term.
-}
val Search = SearchEngine("http://www.google.com/")
Search("first class value")

A site is sometimes called only for its effect on the external world; its return value is irrelevant. Many sites which do not need to return a meaningful value will instead return a signal: a special value which carries no information (analogous to the unit value () in ML). The signal value can be written as signal within Orc programs.

{- 
  Use the 'println' site to print a string, followed by
  a newline, to an output console.
  The return value of this site call is a signal.
-}
println("Hello, World!")

1.3.1.2. Declaring a site

A site declaration makes some service available as a site and binds it to a variable. The service might be an object in the host language (e.g. a class instance in Java), or an external service on the web which is accessed through some protocol like SOAP or REST, or even a primitive operation like addition. Many useful sites are already defined in the Orc standard library, documented in Appendix B. For now, we present a simple type of site declaration: using an object in the host language as a site.

The following example instantiates a Java object to be used as a site in an Orc program, specifically a Java object which provides an asynchronous buffer service. The declaration uses a fully qualified Java class name to find and load a class, and creates an instance of that class to provide the service.

{-  Define the Buffer site  -}
site Buffer = orc.lib.state.Buffer

1.3.2. The concurrency combinators of Orc

Orc has four combinators: parallel, sequential, pruning, and otherwise. A combinator forms an expression from two component expressions. Each combinator captures a different aspect of concurrency. Syntactically, the combinators are written infix, and have lower precedence than operators, but higher precedence than conditionals or declarations.

1.3.2.1. The parallel combinator

Orc's simplest combinator is |, the parallel combinator. Orc executes the expression F | G, where F and G are Orc expressions, by executing F and G concurrently. Whenever F or G communicates with a service or publishes a value, F | G does so as well.

-- Publish 1 and 2 in parallel  
1 | 1+1

{- 
  Access two search sites, Google and Yahoo, in parallel.

  Publish any results they return.
  
  Since each call may publish a value, the expression
  may publish up to two values.
-}  
Google("cupcake") | Yahoo("cupcake")

The parallel combinator is fully associative: (F | G) | H and F | (G | H) and F | G | H are all equivalent.

It is also commutative: F | G is equivalent to G | F.

-- Publish 1, 2, and 3 in parallel  
1+0 | 1+1 | 1+2

1.3.2.2. The sequential combinator

Now that we have expressions which publish multiple values, what can we do with those publications? The sequential combinator, written F >x> G, combines the expression F, which may publish some values, with another expression G, which will use the values as they are published; x transmits the values from F to G.

The execution of F >x> G starts by executing F. Whenever F publishes a value, a new copy of G is executed in parallel with F (and with any previous copies of G); in that copy of G, variable x is bound to the value published by F. Values published by copies of G are published by the whole expression, but the values published by F are not published by the whole expression; they are consumed by the variable binding.

-- Publish 1 and 2 in parallel  
(0 | 1) >n> n+1

-- Publish 3 and 4 in parallel  
2 >n> (n+1 | n+2)

-- Publish 0, 1, 2 and 3 in parallel  
(0 | 2) >n> (n | n+1)

-- Prepend the site name to each published search result
-- The cat site concatenates any number of arguments into one string  
  Google("cupcake") >s> cat("Google: ", s)
| Yahoo("cupcake") >s> cat("Yahoo: ", s)

The sequential combinator may be written as F >P> G, where P is a pattern instead of just a variable name. Any value published by F is matched against the pattern P. If this match is successful, a new copy of G is started with all of the bindings from the match. Otherwise, the published value is simply ignored, and no new copy of G is executed.

-- Publish 3, 6, and 9 in arbitrary order.
(3,6,9)  >(x,y,z)>  ( x | y | z )

-- Filter out values of the form (_,false)
( (4,true) | (5,false) | (6,true) )  >(x,true)> x
-- Publishes 4 and 6 

We may also omit the variable entirely, writing >> . This is equivalent to using a wildcard pattern: >_>

We may want to execute an expression just for its effects, and hide all of its publications. We can do this using >> together with the special expression stop, which is always silent.

{- 
  Print two strings to the console,
  but don't publish the return values of the calls.
-}
( println("goodbye") | println("world") ) >> stop

The sequential combinator is right associative: F >x> G >y> H is equivalent to F >x> (G >y> H). It has higher precedence than the parallel combinator: F >x> G | H is equivalent to (F >x> G ) | H.

The right associativity of the sequential combinator makes it easy to bind variables in sequence and use them together.

{- 
  Publish the cross product of {1,2} and {3,4}:
  (1,3), (1,4), (2,3), and (2,4).
-}
(1 | 2) >x> (3 | 4) >y> (x,y)

1.3.2.3. The pruning combinator

The pruning combinator, written F <x< G, allows us to block a computation waiting for a result, or terminate a computation. The execution of F <x< G starts by executing F and G in parallel. Whenever F publishes a value, that value is published by the entire execution. When G publishes its first value, that value is bound to x in F, and then the execution of G is immediately terminated. A terminated expression cannot call any sites or publish any values.

During the execution of F, any part of the execution that depends on x will be suspended until x is bound (to the first value published by G). If G never publishes a value, that part of the execution is suspended forever.

-- Publish either 5 or 6, but not both
x+2 <x< (3 | 4)

-- Query Google and Yahoo for a search result
-- Print out the result that arrives first; ignore the other result
println(result) <result< ( Google("cupcake") | Yahoo("cupcake") )

Though a terminated execution may not make any new calls, the calls that it has already made will continue normally; their responses are simply ignored. This may have surprising consequences when a call has side effects, as in the following example.

{- 
  This example actually prints both "true" and "false" to the
  console, regardless of which call responds first.
-}
stop <x< println("true") | println("false")
Both of the println calls are initiated before either one of them publishes a value and terminates the expression. Once the expression is terminated, no new calls occur, but the other println call still proceeds and still has the effect of printing its message to the console.

The pruning combinator may include a full pattern P instead of just a variable name. Any value published by G is matched against the pattern P. If this match is successful, then G terminates and all of the bindings of the pattern P are made in F. Otherwise, the published value is simply ignored and G continues to execute.

-- Publish either 9 or 25, but not 16.
x*x <(x,true)< ( (3,true) | (4,false) | (5,true) )

Note that even if (4,false) is published before (3,true) or (5,true), it is ignored. The right side continues to execute and will publish one of (3,true) or (5,true).

The pruning combinator is left associative: F <x< G <y< H is equivalent to (F <x< G) <y< H. It has lower precedence than the parallel combinator: F <x< G | H is equivalent to F <x< (G | H).

1.3.2.4. The otherwise combinator

Orc has a fourth concurrency combinator: the otherwise combinator, written F ; G. The execution of F ; G proceeds as follows. First, F is executed. If F completes, and has not published any values, then G executes. If F did publish one or more values, then G is ignored. The publications of F ; G are those of F if F publishes, or those of G otherwise.

We determine when an expression completes using the following rules.

  • A Cor expression completes when it is fully evaluated; if it is silent, it completes immediately.
  • A site call completes when it has published a value or halted.
  • F | G completes when its subexpressions F and G have both completed.
  • F >x> G completes when F has completed and all instantiated copies of G have completed.
  • F <x< G completes when F has completed, and G has either completed or published a value. Also, if G completes without publishing a value, then all expressions in F which use x also complete, since they will never be able to proceed.
  • F ; G completes either when F has published a value and subsequently completed, or when F and G have both completed.
  • stop completes immediately.

The otherwise combinator is fully associative, so F ; G ; H and (F ; G) ; H and F ; (G ; H) are all equivalent. It has lower precedence than the other three combinators.

The otherwise combinator was not present in the original formulation of the Orc concurrency calculus; it has been added to support computation and iteration over strictly finite data. Sequential programs conflate the concept of producing a value with the concept of completion. Orc separates these two concepts; variable binding combinators like >x> and <x< handle values, whereas ; detects the completion of an execution.

1.3.3. Revisiting Cor expressions

Some Cor expressions have new behaviors in the context of Orc, due to the introduction of concurrency and of sites.

1.3.3.1. Operators

The arithmetic, logical, and comparison operators are actually calls to sites, simply written in infix style with the expected operator symbols. For example, 2+3 is actually (+)(2,3), where (+) is a primitive site provided by the language itself. All of the operators can be used directly as sites in this way; the name of the site is the operator enclosed by parentheses, e.g. (**), (>=), etc. Negation (unary minus) is named (0-).

1.3.3.2. Conditionals

The conditional expression if E then F else G is actually a derived form based on a site named if. The if site takes a boolean argument; it returns a signal if that argument is true, or remains silent if the argument is false.

if E then F else G is equivalent to ( if(b) >> F | if(~b) >> G) <b< E.

When the else branch of a conditional is unneeded, we can write if F then G, with no else branch. This is equivalent to if(E) >> F.

1.3.3.3.  val

The declaration val x = G, followed by expression F, is actually just a different way of writing the expression F <x< G. Thus, val shares all of the behavior of the pruning combinator, which we have already described. (This is also true when a pattern is used instead of variable name x).

1.3.3.4. Nesting Orc expressions

The execution of an Orc expression may publish many values. What does such an expression mean in a context where only one value is expected? For example, what does 2 + (3 | 4) publish?

The specific contexts in which we are interested are as follows (where E is any Orc expression):

E op E operand
if E then ... conditional test
X( ... , E , ... ) call argument
( ... , E , ... ) tuple element
[ ... , E , ... ] list element

Whenever an Orc expression appears in such a context, it executes until it publishes its first value, and then it is terminated. The published value is used in the context as if it were the result of evaluating the expression.

-- Publish either 5 or 6
2 + (3 | 4)

-- Publish exactly one of 0, 1, 2 or 3
(0 | 2) + (0 | 1)

To be precise, whenever an Orc expression appears in such a context, it is treated as if it was on the right side of a pruning combinator, using a fresh variable name to fill in the hole. Thus, C[E] (where E is the Orc expression and C is the context) is equivalent to the expression C[x] <x< E.

1.3.3.5. Functions

The body of a function in Orc may be any Orc expression; thus, function bodies in Orc are executed rather than evaluated, and may engage in communication and publish multiple values.

A function call in Orc, as in Cor, binds the values of its actual parameters to its formal parameters, and then executes the function body with those bindings. Whenever the function body publishes a value, the function call publishes that value. Thus, unlike a site call, or a pure functional Cor call, an Orc function call may publish many values.

-- Publish all integers in the interval 1..n, in arbitrary order. 
def range(n) = if (n > 0) then (n | range(n-1)) else stop 

-- Publish 1, 2, and 3 in arbitrary order.
range(3)

In the context of Orc, function calls are not strict. When a function call executes, it begins to execute the function body immediately, and also executes the argument expressions in parallel. When an argument expression publishes a value, it is terminated, and the corresponding formal parameter is bound to that value in the execution of the function body. Any part of the function body which uses a formal parameter that has not yet been bound suspends until that parameter is bound to a value.

1.3.4. Time

Orc is designed to communicate with the external world, and one of the most important characteristics of the external world is the passage of time. Orc implicitly interacts with the passage of time by calling external services which take time to respond. However, Orc can also explicitly wait for some amount of time, using the special site Rtimer.

The site Rtimer is a relative timer. It takes as an argument a number of milliseconds to wait. It waits for exactly that amount of time, and then responds with a signal.

-- Print "red", wait for 3 seconds (3000 ms), and then print "green" 
println("red") >> Rtimer(3000) >> println("green") >> stop

The following example defines a metronome, which publishes a signal once every t milliseconds, indefinitely.

def metronome(t) = signal | Rtimer(t) >> metronome(t)

We can also use Rtimer together with the pruning combinator to enforce a timeout.

{-
  Publish the result of a Google search.
  If it takes more than 5 seconds, time out.
-}
result 
  <result< ( Google("impatience") 
           | Rtimer(5000) >> "Search timed out.")

We present many more examples of programming techniques using real time in Chapter 2.