Orc User Guide v1.0.0


Table of Contents

Introduction
1. The Orc Programming Language
1.1. Introduction
1.2. Cor: A Functional Subset
1.2.1. Constants
1.2.2. Operators
1.2.3. Conditionals
1.2.4. Variables
1.2.5. Data Structures
1.2.6. Patterns
1.2.7. Functions
1.2.8. Comments
1.3. Orc: Orchestrating services
1.3.1. Communicating with external services
1.3.2. The concurrency combinators of Orc
1.3.3. Revisiting Cor expressions
1.3.4. Time
1.4. Advanced Features of Orc
1.4.1. Special call forms
1.4.2. Extensions to pattern matching
1.4.3. Datatypes
1.4.4. New forms of declarations
2. Programming Methodology
2.1. Syntactic and Stylistic Conventions
2.1.1. Parallel combinator
2.1.2. Sequential combinator
2.1.3. Pruning combinator
2.1.4. Declarations
2.2. Programming Idioms
2.2.1. Channels
2.2.2. Lists
2.2.3. Streams
2.2.4. Arrays
2.2.5. Mutable Storage
2.2.6. Loops
2.2.7. Parallel Matching
2.2.8. Fork-join
2.2.9. Sequential Fork-Join
2.2.10. Priority Poll
2.2.11. Parallel Or
2.2.12. Timeout
2.2.13. Priority
2.2.14. Metronome
2.2.15. Routing
2.2.16. Interruption
2.2.17. Lifting
2.2.18. Fold
2.3. Larger Examples
2.3.1. Dining Philosophers
2.3.2. Hygienic Dining Philosophers
2.3.3. Readers-Writers
2.3.4. Quicksort
2.3.5. Meeting Scheduler
3. Accessing and Creating External Services
3.1. Introduction
3.2. class Sites
3.2.1. Dot Operator
3.2.2. Direct Calls
3.2.3. Pattern Matching
3.2.4. Method Resolution
3.2.5. Orc values in Java
3.2.6. Java Values in Orc
3.2.7. Cooperative Scheduling and Concurrency
3.3. site Sites
3.3.1. Fundamentals
3.3.2. More Site Classes
3.3.3. Tutorial Example
3.4. Web Services
3.4.1. Introduction
3.4.2. Downloading and Running Examples
3.4.3. Protocols Supported
3.4.4. REST
4. Type Checking
4.1. Simple Typing
4.1.1. Values and Operators
4.1.2. Top and Bot
4.1.3. Tuples
4.1.4. Combinators
4.1.5. Conditionals
4.1.6. Defined Functions
4.1.7. Ascriptions
4.1.8. Aliasing
4.1.9. Datatypes
4.1.10. Interacting with External Services
4.1.11. Assertions
4.2. Polymorphism
4.2.1. Parametric types
4.2.2. Parametric functions
4.2.3. Generic calls
4.2.4. Polymorphic aliases
4.2.5. Polymorphic datatypes
4.3. Interacting with Java
4.4. Syntax of Typed Orc
A. Complete Syntax of Orc
B. Standard Library
B.1. Overview
B.2. Types and Notation
B.3. Reference
B.3.1. core.inc: Fundamental sites and operators.
B.3.2. state.inc: General-purpose supplemental data structures.
B.3.3. idioms.inc: Higher-order Orc programming idioms.
B.3.4. list.inc: Operations on lists.
B.3.5. reflect.inc: Metalanguage operations.
B.3.6. text.inc: Operations on strings.
B.3.7. time.inc: Real and logical time.
B.3.8. util.inc: Miscellaneous utility functions.
C. Experimental Features
C.1. Exceptions in Orc
C.1.1. Catching Java exceptions
C.1.2. Catching Orc Engine Errors
C.1.3. Exceptions in Practice
C.1.4. Typechecking Exceptions

List of Tables

1.1. Syntax of the Functional Subset of Orc (Cor)
1.2. Operators of Cor
1.3. Basic Syntax of Orc
1.4. Advanced Syntax of Orc
4.1. Typed Orc
A.1. Complete Syntax of Orc

Introduction

Orc is a programming language designed to make distributed and concurrent programs simple and intuitive to write. Orc expresses orchestration, a type of structured concurrency. It emphasizes the flow of control and gives a global view of a concurrent system. Orc is well-suited for task orchestration, a form of concurrent programming with applications in workflow, business process management, and web service orchestration. Orc provides constructs to orchestrate the concurrent invocation of services while managing time-outs, priorities, and failures of services or communication. To learn more about Orc and run your own Orc programs, visit the website: http://orc.csres.utexas.edu/ .

Unless otherwise noted, all material in this document pertains to the Orc language implementation version 1.0.0 .

Chapter 1. The Orc Programming Language

1.1. Introduction

This chapter describes the Orc programming language in three steps. In Section 1.2, we discuss a small subset of Orc called Cor. Cor is a pure functional language, which has no features for concurrency, has no state, and does not communicate with external services. Cor introduces us to the parts of Orc that are most familiar from existing programming languages, such as arithmetic operations, variables, conditionals, and functions.

In Section 1.3, we consider Orc itself, which in addition to Cor, comprises external services and combinators for concurrent orchestration of those services. We show how Orc interacts with these external services, how the combinators can be used to build up complex orchestrations from simple base expressions, and how the functional constructs of Cor take on new, subtler behaviors in the concurrent context of Orc.

In Section 1.4, we discuss some additional features of Orc that extend the basic syntax. These are useful for creating large-scale Orc programs, but they are not essential to the understanding of the language.

1.2. Cor: A Functional Subset

In this section we introduce Cor, a pure functional subset of the Orc language. Users of functional programming languages such as Haskell and ML will already be familiar with many of the key concepts of Cor.

A Cor program is an expression. Cor expressions are built up recursively from smaller expressions. Cor evaluates an expression to reduce it to some simple value which cannot be evaluated further, for example a list of numbers or a Boolean truth value. This value is called the result of the expression.

In the following subsections we introduce the concepts of Cor. First, we talk about simple constants, such as numbers and truth values, and the operations that we can perform on those values. Then we introduce conditionals (if then else). Then we introduce variables and binding, as a way to give a name to the value of an expression. After that, we talk about constructing data structures, and examining those structures using patterns. Lastly, we introduce functions.

Table 1.1 describes the syntax of Cor. Each part of the syntax is explained in a subsequent section.

Table 1.1. Syntax of the Functional Subset of Orc (Cor)

E::=  Expression  
  C  constant value
 |X  variable
 | ( E , ... , E )   tuple
 | [ E , ... , E ]   list
 |X( E , ... , E )   function call
 |E op E  operator
 | if E then E else E   conditional
 | lambda ( P , ... , P ) = E  closure
 |D E  scoped declaration
C::= Boolean | Number | String Constant  
X::= Identifier Identifier  
D::=  Declaration  
   val P = E  value declaration
 | def X( P , ... , P ) = E  function declaration
P::=  Pattern  
  X  variable
 |C  constant
 | _   wildcard
 | ( P , ... , P )   tuple
 | [ P , ... , P ]   list

Where relevant, syntactic constructs are ordered by precedence, from highest to lowest. For example, among expressions, calls have higher precedence than operators, which in turn have higher precedence than conditionals.

1.2.1. Constants

The simplest expression one can write is a constant. It evaluates trivially to that constant value.

Cor has three types of constants, and thus for the moment three types of values:

  • Boolean: true and false
  • Number: 5, -1, 2.71828, ...
  • String: "orc", "ceci n'est pas une |"

1.2.2. Operators

Cor has a standard set of arithmetic, logical, and comparison operators. As in most other programming languages, they are written in the usual infix style. They have Java-like operator precedence, which can be overridden by adding parentheses.

Examples

  • 1+2 evaluates to 3.
  • (98+2)*17 evaluates to 1700.
  • 4 = 20 / 5 evaluates to true.
  • 3-5 >= 5-3 evaluates to false.
  • true && (false || true) evaluates to true.
  • "leap" + "frog" evaluates to "leapfrog".

Here is the full set of operators that Cor supports:

Table 1.2. Operators of Cor

ArithmeticComparisonLogicalString
+ addition = equality && logical and + concatenation
- subtraction /= inequality || logical or 
* multiplication < less than ~ logical not 
/ division > greater than  
% modulus <= less than or equal  
** exponent >= greater than or equal  

There is also a unary negation operator, written -, for example -(2 ** 5).

The = operator can compare values of any type. Values of different type are always unequal; for example, 10 = true evaluates to false.

Numbers with no decimal part, such as 3, are treated as integers. Arithmetic operators with two integer arguments will perform an integer operation and return an integer result; for example, 5 / 2 performs integer division and evaluates to 2. However, if either argument to an operator has a decimal part (even if it is trivial, as in 3.0), the other argument will be promoted, and a decimal operation will be peformed. For example, 5 / 2.0 and 5.0 / 2 both perform decimal division and evaluate to 2.5.

1.2.2.1. Silent Expression

There are situations where an expression evaluation is stuck, because it is attempting to perform some impossible operation and cannot compute a value. In that case, the expression is silent. An expression is also silent if it depends on the result of a silent subexpression. For example, the following expressions are silent: 10/0, 6 + false, 3 + 1/0, 4 + true = 5.

Cor is a dynamically typed language. A Cor implementation does not statically check the type correctness of an expression; instead, an expression with a type error is simply silent when it is evaluated.

Warning

Silent expressions can produce side effects. For example, on encountering a type error, Orc will print an error message to the console. However the expression containing the type error will not publish a value, and in this respect it is silent.

1.2.3. Conditionals

A conditional expression in Cor is of the form if E then F else G. Its meaning is similar to that in other languages: the value of the expression is the value of F if and only if E evaluates to true, or the value of G if and only if E evaluates to false. Note that G is not evaluated at all if E evaluates to true, and similarly F is not evaluated at all if E evaluates to false. Thus, for example, evaluation of if true then 2+3 else 1/0 does not evaluate 1/0; it only evaluates 2+3.

Unlike other languages, expressions in Cor may be silent. If E is silent, then the entire expression is silent. And if E evaluates to true but F is silent (or if E evaluates to false and G is silent) then the expression is silent.

Note that conditionals have lower precedence than any of the operators. For example, if false then 1 else 2 + 3 is equivalent to if false then 1 else (2 + 3), not (if false then 1 else 2) + 3.

The behavior of conditionals is summarized by the following table (v denotes a value).

EFG if E then F else G
true v - v
true silent-silent
false - v v
false -silentsilent
silent--silent

Examples

  • if true then 4 else 5 evaluates to 4.
  • if 2 < 3 && 5 < 4 then "blue" else "green" evaluates to "green".
  • if true || "fish" then "yes" else "no" is silent.
  • if false || false then 4+5 else 4+true is silent.
  • if 0 < 5 then 0/5 else 5/0 evaluates to 0.

1.2.4. Variables

A variable can be bound to a value. A declaration binds one or more variables to values. The simplest form of declaration is val, which evaluates an expression and binds its result to a variable. Declarations follow the rules of lexical scoping.

val x = 1 + 2
val y = x + x
These declarations bind variable x to 3 and variable y to 6.

If the expression on the right side of a val is silent, then the variable is not bound, but evaluation of other declarations and expressions continues. If an evaluated expression depends on that variable, that expression is silent.

val x = 1/0
val y = 4+5
if false then x else y
Evaluation of the declaration val y = 4+5 and the expression if false then x else y may continue even though x is not bound. The expression evaluates to 9.

1.2.5. Data Structures

Cor supports two basic data structures, tuples and lists.

1.2.5.1. Tuples

A tuple expression is a comma-separated sequence of at least two expressions, enclosed by parentheses. Each expression is evaluated; the value of the whole tuple expression is a tuple containing each of these values in order. If any of the expressions is silent, then the whole tuple expression is silent.

Examples

  • (1+2, 7) evaluates to (3,7).
  • ("true" + "false", true || false, true && false) evaluates to ("truefalse", true, false).
  • (2/2, 2/1, 2/0) is silent.

1.2.5.2. Lists

A list expression is a comma-separated sequence of expressions enclosed by square brackets. It may be of any length, including zero. Each expression is evaluated; the value of the whole list expression is a list containing each of these values in order. If any of the expressions is silent, then the whole list expression is silent.

Examples

  • [1,2+3] evaluates to [1,5].
  • [true && true] evaluates to [true].
  • [] evaluates vacuously to [], the empty list.
  • [5, 5 + true, 5] is silent.

There is also a concatenation (cons) operation on lists, written F:G, where F and G are expressions. Its result is a new list whose first element is the value of F and whose remaining elements are the list value of G. The : operator is right associative, so F:G:H is F:(G:H).

Examples

  • (1+3):[2+5,6] evaluates to [4,7,6].
  • 2:2:5:[] evaluates to [2,2,5].
  • Suppose t is bound to [3,5]. Then 1:t evaluates to [1,3,5].
  • 2:3 is silent, because 3 is not a list.

1.2.6. Patterns

We have seen how to construct data structures. But how do we examine them, and use them? We use patterns.

A pattern is a powerful way to bind variables. When writing val declarations, instead of just binding one variable, we can replace the variable name with a more complex pattern that follows the structure of the value, and matches its components. A pattern's structure is called its shape; a pattern may take the shape of any structured value. A pattern can hierarchically match a value, going deep into its structure. It can also bind an entire structure to a variable.

Examples

  • val (x,y) = (2+3,2*3) binds x to 5 and y to 6.
  • val [a,b,c] = ["one", "two", "three"] binds a to "one", b to "two", and c to "three".
  • val ((a,b),c) = ((1, true), (2, false)) binds a to 1, b to true, and c to (2,false).

Patterns are linear; that is, a pattern may mention a variable name at most once. For example, (x,y,x) is not a valid pattern.

Note that a pattern may fail to match a value, if it does not have the same shape as that value. In a val declaration, this has the same effect as evaluating a silent expression. No variable in the pattern is bound, and if any one of those variables is later evaluated, it is silent.

It is often useful to ignore parts of the value that are not relevant. We can use the wildcard pattern, written _, to do this; it matches any shape and binds no variables.

Examples

  • val (x,_,_) = (1,(2,2),[3,3,3]) binds x to 1.
  • val [[_,x],[_,y]] = [[1,3],[2,4]] binds x to 3 and y to 4.

1.2.7. Functions

Like most other programming languages, Cor provides the capability to define functions, which are expressions that have a defined name, and have some number of parameters. Functions are declared using the keyword def, in the following way.

def add(x,y) = x+y
The expression on the right of the = is called the body of the function.

After defining the function, we can call it. A call looks just like the left side of the declaration except that the variable names (the formal parameters) have been replaced by expressions (the actual parameters).

To evaluate a call, we treat it as a sequence of val declarations associating the formal parameters with the actual parameters, followed by the body of the function.

{- Evaluation of add(1+2,3+4) -}
val x = 1+2
val y = 3+4
x+y

Examples

  • add(10,10*10) evaluates to 110.
  • add(add(5,3),5) evaluates to 13.

Notice that the evaluation strategy of functions allows a call to proceed even if some of the actual parameters are silent, so long as the values of those actual parameters are not used in the evaluation of the body.

def cond(b,x,y) = if b then x else y
cond(true, 3, 5/0)
This evaluates to 3 even though 5/0 is silent, because y is not needed.

A function definition or call may have zero arguments, in which case we write () for the arguments.

def Zero() = 0

1.2.7.1. Recursion

Functions can be recursive; that is, the name of a function may be used in its own body.

def sumto(n) = if n < 1 then 0 else n + Sumto(n-1)
Then, sumto(5) evaluates to 15.

Mutual recursion is also supported.

def even(n) = 
  if (n > 0) then odd(n-1)
  else if (n < 0) then odd(n+1)
  else true
def odd(n) = 
  if (n > 0) then even(n-1)
  else if (n < 0) then even(n+1)
  else false
There is no special keyword for mutual recursion; any contiguous sequence of function declarations is assumed to be mutually recursive.

1.2.7.2. Closures

Functions are actually values, just like any other value. Defining a function creates a special value called a closure; the name of the function is a variable and its bound value is the closure. Thus, a closure can be put into a data structure, or bound to some other variable, just like any other value.

def a(x) = x-3
def b(y) = y*4
val funs = (a,b)

Like any other value, a closure can be passed as an argument to another function. This means that Cor supports higher-order functions.

def onetwosum(f) = f(1) + f(2)
def triple(x) = x * 3
Then, onetwosum(triple) is triple(1) + triple(2), which is 1 * 3 + 2 * 3 which evaluates to 9.

Since all declarations (including function declarations) in Cor are lexically scoped, these closures are lexical closures. This means that when a closure is created, if the body of the function contains any variables other than the formal parameters, the bindings for those variables are stored in the closure. Then, when the closure is called, the evaluation of the function body uses those stored variable bindings.

1.2.7.3. Lambda

Sometimes one would like to create a closure directly, without bothering to give it a name. There is a special keyword lambda for this purpose. By writing a function definition without the keyword def and replacing the function name with the keyword lambda, that definition becomes an expression which evaluates to a closure.

def onetwosum(f) = f(1) + f(2)

onetwosum( lambda(x) = x * 3 )
{- 
  identical to:
  def triple(x) = x * 3
  onetwosum(triple)
-}
Then, onetwosum( lambda(x) = x * 3 ) evaluates to 9.

Since a function defined using lambda has no name, it is not possible to define a recursive function in this way. Only def can create a recursive function.

1.2.7.4. Clauses

The combination of functions and pattern matching offers a powerful capability: clausal definition of functions. We can define expressions which execute different code depending on the structure of their arguments.

Here's an example.

def sum([]) = 0
def sum(h:t) = h + sum(t)
sum(l) publishes the sum of the numbers in the list l. It has two clauses: one which matches the empty list, and one which matches any nonempty list. If its argument is an empty list, it returns 0, the appropriate sum for an empty list. If the argument is a nonempty list, it adds the first element of that list to the sum of all of the other elements. In this way, it recursively finds the sum of the list.

A function may have multiple clauses, each of which has a sequence of patterns to match each argument, and a body expression. Naturally, all clauses of a function must have the same number of arguments. Any contiguous sequence of definitions with the same name and different arguments is interpreted as a clausal definition, where each individual declaration is a clause of the larger function.

When the function is called, the clauses are tried in the order in which they appear until a match is found. If no clause matches, the call remains silent.

We allow a new form of pattern which is very useful in clausal definition of functions: a constant pattern. A constant pattern is a match only for the same constant value. We can use this to define the "base case" of a recursive function in a straightforward way.

{- Fibonacci numbers -}
def fib(0) = 1
def fib(1) = 1
def fib(n) = if (n < 0) then 0 else fib(n-1) + fib(n-2)
This definition of the Fibonacci function is straightforward, but slow, due to the repeated work in recursive calls to fib. We can define a linear-time version, again with the help of pattern matching:
{- Alternate definition of the Fibonacci function -}

{- A helper function: find the pair (Fibonacci(n-1), Fibonacci(n)) -}
def H(0) = (1,1)
def H(n) = 
  val (x,y) = H(n-1)
  (y,x+y)

def fib(n) = 
  if (n < 0) then 0 
  else (
    val (x,_) = H(n) 
    x
  )
     
    

As a more complex example of matching, consider the following function which takes a list argument and returns a new list containing only the first n elements of the argument list.

def take(0,_) = []
def take(n,h:t) = if (n > 0) then h:(take(n-1,t)) else []

Mutual recursion and clausal definitions are allowed to occur together. Here are two functions, stutter and mutter, which are each mutually recursive, and each have multiple clauses. stutter(l) returns l with every odd element repeated. mutter(l) returns l with every even element repeated.

def stutter([]) = []
def stutter(h:t) = h:h:mutter(t)
def mutter([]) = []
def mutter(h:t) = h:stutter(t)
stutter([1,2,3]) evaluates to [1,1,2,3,3].

Clauses of mutually recursive functions may also be interleaved, to make them easier to read.

def even(0) = true
def odd(0) = false
def even(n) = odd(if n > 0 then n-1 else n+1)
def odd(n) = even(if n > 0 then n-1 else n+1)

1.2.8. Comments

Cor has two kinds of comments.

A line which begins with two dashes (--), preceded only by whitespace, is a single line comment. The region from the two dashes to the next encountered newline, inclusive, is ignored.

-- This is a single line comment.
	 -- This is also a single line comment.

Multi-line comments are enclosed by matching braces of the form {- -}. Multi-line comments may be nested. They may appear anywhere, even in the middle of an expression.

{- 
   This is a
   multiline comment.
-}
   
{- Multiline comments {- can be nested -} -}

{- They may appear anywhere, -} 
1 + {- even in the middle of an expression. -} 2 + 3

1.3. Orc: Orchestrating services

Cor is a pure declarative language. It has no state, since variables are bound at most once and cannot be reassigned. Evaluation of an expression results in at most one value. It cannot communicate with the outside world except by producing a value. The full Orc language transcends these limitations by incorporating the orchestration of external services. We introduce the term site to denote an external service which can be called from an Orc program.

As in Cor, an Orc program is an expression; Orc expressions are built up recursively from smaller expressions. Orc is a superset of Cor, i.e., all Cor expressions are also Orc expressions. Orc expressions are executed, rather than evaluated; an execution may call external services and publish some number of values (possibly zero). Different executions of the same expression may have completely different behaviors; they may call different services, may receive different responses from the same site, and may publish different values. Expressions in the functional subset, though, will display the same behavior in all executions.

In the following sections we discuss the features of Orc. First, we discuss how Orc communicates with external services. Then we introduce Orc's concurrency combinators, which combine expressions into larger orchestrations and manage concurrent executions. We have already discussed the functional subset of Orc in our coverage of Cor, but we reprise some of those topics; some Cor constructs exhibit new behaviors in the concurrent, stateful context of Orc.

The following figure summarizes the syntax of Orc as an extension of the syntax of Cor. The original Cor grammar rules are abbreviated by ellipses (...).

Table 1.3. Basic Syntax of Orc

D::= ... Declaration  
 | site X = address   site declaration
C::= ... Constant  
 | signal   signal value
E::=... Expression  
 | stop   silent expression
 |E >P> E  sequential combinator
 |E | E  parallel combinator
 |E <P< E  pruning combinator
 |E ; E  otherwise combinator

1.3.1. Communicating with external services

An Orc expression may be a site call. Sites are called using the same syntax as a function call, but with a slightly different meaning. Sites are introduced and bound to variables by a special declaration.

1.3.1.1. Calling a site

Suppose that the variable Google is bound to a site which invokes the Google search engine service in "I'm Feeling Lucky" mode. A call to Google looks just like a function call. Calling Google requests the URL of the top result for the given search term.

{- Get the top search result for "computation orchestration" -}
Google("computation orchestration")
Once the Google search service determines the top result, it sends a response. The site call then publishes that response. Note that the service might not respond: Google's servers might be down, the network might be down, or the search might yield no result URL.

A site call sends only a single request to an external service and receives at most one response. These restrictions have two important consequences. First, all of the information needed for the call must be present before contacting the service. Thus, site calls are strict; all arguments must be bound before the call can proceed. If any argument is silent, the call never occurs. Second, a site call publishes at most one value, since at most one response is received.

A call to a site has exactly one of the following effects:

  1. The site returns a value, called its response.
  2. The site communicates that it will never respond to the call; we say that the call has halted
  3. The site neither returns a value nor indicates that the call has halted; we say that the call is pending.

In the last two cases, the site call is said to be silent. However, unlike a silent expression in Cor, a silent site call in Orc might perform some meaningful computation or communication; silence does not necessarily indicate an error. Since halted site calls and pending site calls are both silent, they cannot usually be distinguished from each other; only the otherwise combinator can tell the difference.

A site is a value, just like an integer or a list. It may be bound to a variable, passed as an argument, or published by an execution, just like any other value.

{-
  Create a search site from a search engine URL,
  bind the variable Search to that site,
  then use that site to search for a term.
-}
val Search = SearchEngine("http://www.google.com/")
Search("first class value")

A site is sometimes called only for its effect on the external world; its return value is irrelevant. Many sites which do not need to return a meaningful value will instead return a signal: a special value which carries no information (analogous to the unit value () in ML). The signal value can be written as signal within Orc programs.

{- 
  Use the 'println' site to print a string, followed by
  a newline, to an output console.
  The return value of this site call is a signal.
-}
println("Hello, World!")

1.3.1.2. Declaring a site

A site declaration makes some service available as a site and binds it to a variable. The service might be an object in the host language (e.g. a class instance in Java), or an external service on the web which is accessed through some protocol like SOAP or REST, or even a primitive operation like addition. Many useful sites are already defined in the Orc standard library, documented in Appendix B. For now, we present a simple type of site declaration: using an object in the host language as a site.

The following example instantiates a Java object to be used as a site in an Orc program, specifically a Java object which provides an asynchronous buffer service. The declaration uses a fully qualified Java class name to find and load a class, and creates an instance of that class to provide the service.

{-  Define the Buffer site  -}
site Buffer = orc.lib.state.Buffer

1.3.2. The concurrency combinators of Orc

Orc has four combinators: parallel, sequential, pruning, and otherwise. A combinator forms an expression from two component expressions. Each combinator captures a different aspect of concurrency. Syntactically, the combinators are written infix, and have lower precedence than operators, but higher precedence than conditionals or declarations.

1.3.2.1. The parallel combinator

Orc's simplest combinator is |, the parallel combinator. Orc executes the expression F | G, where F and G are Orc expressions, by executing F and G concurrently. Whenever F or G communicates with a service or publishes a value, F | G does so as well.

-- Publish 1 and 2 in parallel  
1 | 1+1

{- 
  Access two search sites, Google and Yahoo, in parallel.

  Publish any results they return.
  
  Since each call may publish a value, the expression
  may publish up to two values.
-}  
Google("cupcake") | Yahoo("cupcake")

The parallel combinator is fully associative: (F | G) | H and F | (G | H) and F | G | H are all equivalent.

It is also commutative: F | G is equivalent to G | F.

-- Publish 1, 2, and 3 in parallel  
1+0 | 1+1 | 1+2

1.3.2.2. The sequential combinator

Now that we have expressions which publish multiple values, what can we do with those publications? The sequential combinator, written F >x> G, combines the expression F, which may publish some values, with another expression G, which will use the values as they are published; x transmits the values from F to G.

The execution of F >x> G starts by executing F. Whenever F publishes a value, a new copy of G is executed in parallel with F (and with any previous copies of G); in that copy of G, variable x is bound to the value published by F. Values published by copies of G are published by the whole expression, but the values published by F are not published by the whole expression; they are consumed by the variable binding.

-- Publish 1 and 2 in parallel  
(0 | 1) >n> n+1

-- Publish 3 and 4 in parallel  
2 >n> (n+1 | n+2)

-- Publish 0, 1, 2 and 3 in parallel  
(0 | 2) >n> (n | n+1)

-- Prepend the site name to each published search result
-- The cat site concatenates any number of arguments into one string  
  Google("cupcake") >s> cat("Google: ", s)
| Yahoo("cupcake") >s> cat("Yahoo: ", s)

The sequential combinator may be written as F >P> G, where P is a pattern instead of just a variable name. Any value published by F is matched against the pattern P. If this match is successful, a new copy of G is started with all of the bindings from the match. Otherwise, the published value is simply ignored, and no new copy of G is executed.

-- Publish 3, 6, and 9 in arbitrary order.
(3,6,9)  >(x,y,z)>  ( x | y | z )

-- Filter out values of the form (_,false)
( (4,true) | (5,false) | (6,true) )  >(x,true)> x
-- Publishes 4 and 6 

We may also omit the variable entirely, writing >> . This is equivalent to using a wildcard pattern: >_>

We may want to execute an expression just for its effects, and hide all of its publications. We can do this using >> together with the special expression stop, which is always silent.

{- 
  Print two strings to the console,
  but don't publish the return values of the calls.
-}
( println("goodbye") | println("world") ) >> stop

The sequential combinator is right associative: F >x> G >y> H is equivalent to F >x> (G >y> H). It has higher precedence than the parallel combinator: F >x> G | H is equivalent to (F >x> G ) | H.

The right associativity of the sequential combinator makes it easy to bind variables in sequence and use them together.

{- 
  Publish the cross product of {1,2} and {3,4}:
  (1,3), (1,4), (2,3), and (2,4).
-}
(1 | 2) >x> (3 | 4) >y> (x,y)

1.3.2.3. The pruning combinator

The pruning combinator, written F <x< G, allows us to block a computation waiting for a result, or terminate a computation. The execution of F <x< G starts by executing F and G in parallel. Whenever F publishes a value, that value is published by the entire execution. When G publishes its first value, that value is bound to x in F, and then the execution of G is immediately terminated. A terminated expression cannot call any sites or publish any values.

During the execution of F, any part of the execution that depends on x will be suspended until x is bound (to the first value published by G). If G never publishes a value, that part of the execution is suspended forever.

-- Publish either 5 or 6, but not both
x+2 <x< (3 | 4)

-- Query Google and Yahoo for a search result
-- Print out the result that arrives first; ignore the other result
println(result) <result< ( Google("cupcake") | Yahoo("cupcake") )

Though a terminated execution may not make any new calls, the calls that it has already made will continue normally; their responses are simply ignored. This may have surprising consequences when a call has side effects, as in the following example.

{- 
  This example actually prints both "true" and "false" to the
  console, regardless of which call responds first.
-}
stop <x< println("true") | println("false")
Both of the println calls are initiated before either one of them publishes a value and terminates the expression. Once the expression is terminated, no new calls occur, but the other println call still proceeds and still has the effect of printing its message to the console.

The pruning combinator may include a full pattern P instead of just a variable name. Any value published by G is matched against the pattern P. If this match is successful, then G terminates and all of the bindings of the pattern P are made in F. Otherwise, the published value is simply ignored and G continues to execute.

-- Publish either 9 or 25, but not 16.
x*x <(x,true)< ( (3,true) | (4,false) | (5,true) )

Note that even if (4,false) is published before (3,true) or (5,true), it is ignored. The right side continues to execute and will publish one of (3,true) or (5,true).

The pruning combinator is left associative: F <x< G <y< H is equivalent to (F <x< G) <y< H. It has lower precedence than the parallel combinator: F <x< G | H is equivalent to F <x< (G | H).

1.3.2.4. The otherwise combinator

Orc has a fourth concurrency combinator: the otherwise combinator, written F ; G. The execution of F ; G proceeds as follows. First, F is executed. If F completes, and has not published any values, then G executes. If F did publish one or more values, then G is ignored. The publications of F ; G are those of F if F publishes, or those of G otherwise.

We determine when an expression completes using the following rules.

  • A Cor expression completes when it is fully evaluated; if it is silent, it completes immediately.
  • A site call completes when it has published a value or halted.
  • F | G completes when its subexpressions F and G have both completed.
  • F >x> G completes when F has completed and all instantiated copies of G have completed.
  • F <x< G completes when F has completed, and G has either completed or published a value. Also, if G completes without publishing a value, then all expressions in F which use x also complete, since they will never be able to proceed.
  • F ; G completes either when F has published a value and subsequently completed, or when F and G have both completed.
  • stop completes immediately.

The otherwise combinator is fully associative, so F ; G ; H and (F ; G) ; H and F ; (G ; H) are all equivalent. It has lower precedence than the other three combinators.

The otherwise combinator was not present in the original formulation of the Orc concurrency calculus; it has been added to support computation and iteration over strictly finite data. Sequential programs conflate the concept of producing a value with the concept of completion. Orc separates these two concepts; variable binding combinators like >x> and <x< handle values, whereas ; detects the completion of an execution.

1.3.3. Revisiting Cor expressions

Some Cor expressions have new behaviors in the context of Orc, due to the introduction of concurrency and of sites.

1.3.3.1. Operators

The arithmetic, logical, and comparison operators are actually calls to sites, simply written in infix style with the expected operator symbols. For example, 2+3 is actually (+)(2,3), where (+) is a primitive site provided by the language itself. All of the operators can be used directly as sites in this way; the name of the site is the operator enclosed by parentheses, e.g. (**), (>=), etc. Negation (unary minus) is named (0-).

1.3.3.2. Conditionals

The conditional expression if E then F else G is actually a derived form based on a site named if. The if site takes a boolean argument; it returns a signal if that argument is true, or remains silent if the argument is false.

if E then F else G is equivalent to ( if(b) >> F | if(~b) >> G) <b< E.

When the else branch of a conditional is unneeded, we can write if F then G, with no else branch. This is equivalent to if(E) >> F.

1.3.3.3.  val

The declaration val x = G, followed by expression F, is actually just a different way of writing the expression F <x< G. Thus, val shares all of the behavior of the pruning combinator, which we have already described. (This is also true when a pattern is used instead of variable name x).

1.3.3.4. Nesting Orc expressions

The execution of an Orc expression may publish many values. What does such an expression mean in a context where only one value is expected? For example, what does 2 + (3 | 4) publish?

The specific contexts in which we are interested are as follows (where E is any Orc expression):

E op E operand
if E then ... conditional test
X( ... , E , ... ) call argument
( ... , E , ... ) tuple element
[ ... , E , ... ] list element

Whenever an Orc expression appears in such a context, it executes until it publishes its first value, and then it is terminated. The published value is used in the context as if it were the result of evaluating the expression.

-- Publish either 5 or 6
2 + (3 | 4)

-- Publish exactly one of 0, 1, 2 or 3
(0 | 2) + (0 | 1)

To be precise, whenever an Orc expression appears in such a context, it is treated as if it was on the right side of a pruning combinator, using a fresh variable name to fill in the hole. Thus, C[E] (where E is the Orc expression and C is the context) is equivalent to the expression C[x] <x< E.

1.3.3.5. Functions

The body of a function in Orc may be any Orc expression; thus, function bodies in Orc are executed rather than evaluated, and may engage in communication and publish multiple values.

A function call in Orc, as in Cor, binds the values of its actual parameters to its formal parameters, and then executes the function body with those bindings. Whenever the function body publishes a value, the function call publishes that value. Thus, unlike a site call, or a pure functional Cor call, an Orc function call may publish many values.

-- Publish all integers in the interval 1..n, in arbitrary order. 
def range(n) = if (n > 0) then (n | range(n-1)) else stop 

-- Publish 1, 2, and 3 in arbitrary order.
range(3)

In the context of Orc, function calls are not strict. When a function call executes, it begins to execute the function body immediately, and also executes the argument expressions in parallel. When an argument expression publishes a value, it is terminated, and the corresponding formal parameter is bound to that value in the execution of the function body. Any part of the function body which uses a formal parameter that has not yet been bound suspends until that parameter is bound to a value.

1.3.4. Time

Orc is designed to communicate with the external world, and one of the most important characteristics of the external world is the passage of time. Orc implicitly interacts with the passage of time by calling external services which take time to respond. However, Orc can also explicitly wait for some amount of time, using the special site Rtimer.

The site Rtimer is a relative timer. It takes as an argument a number of milliseconds to wait. It waits for exactly that amount of time, and then responds with a signal.

-- Print "red", wait for 3 seconds (3000 ms), and then print "green" 
println("red") >> Rtimer(3000) >> println("green") >> stop

The following example defines a metronome, which publishes a signal once every t milliseconds, indefinitely.

def metronome(t) = signal | Rtimer(t) >> metronome(t)

We can also use Rtimer together with the pruning combinator to enforce a timeout.

{-
  Publish the result of a Google search.
  If it takes more than 5 seconds, time out.
-}
result 
  <result< ( Google("impatience") 
           | Rtimer(5000) >> "Search timed out.")

We present many more examples of programming techniques using real time in Chapter 2.

1.4. Advanced Features of Orc

In this section we introduce some advanced features of Orc. These include curried function definitions and curried calls, writing an arbitrary expression as the target of a call, a special syntax for writing calls in an object-oriented style, extensions to pattern matching, and new forms of declarations, and ML-style datatypes.

Table 1.4. Advanced Syntax of Orc

E::=... Expression  
 |E G+  generalized call
G::=  Argument group  
   ( E , ... , E )   curried arguments
 | . field   field access
P::=... Pattern  
 |P as X  as pattern
 | =X  equality pattern
 |X( P , ... , P )   datatype pattern
D::= ... Declaration  
 | class X = classname   class declaration
 | type X = DT | ... | DT  datatype declaration
 | include " filename "   inclusion
DT::= X(_, ... ,_) Datatype  

1.4.1. Special call forms

1.4.1.1. The . notation

In many object-oriented programming languages, one calls a method or accesses a field of an object using the dot operator; for example, obj.m() calls the method m of the object obj.

There is a special kind of site call in Orc which serves a similar purpose. One may write x.msg, for any identifiers x and msg. This treats the value bound to x as a site, and calls it with a special message value msg. If the site understands the message msg (for example, if x is bound to a Java object with a field called msg), the site interprets the message and responds with some appropriate value. If the site does not understand the message sent to it, it does not respond, and no publication occurs. If x cannot be interpreted as a site, no call is made.

Typically this capability is used so that sites may be syntactically treated like objects, with multiple methods and fields. For example, a channel c might understand the messages get and put, to get values from and put values on that channel, respectively. Such calls would be written c.get(), or c.put(6).

A call such as c.put(6) actually occurs in two steps. First c.put sends the message put to the site c; this publishes a site whose only purpose is to put values on the channel. Next, that site is called on the argument 6, sending 6 on the channel. Readers familiar with functional programming will recognize this technique as currying.

1.4.1.2. Currying

It is sometimes useful to stage the arguments to a function; that is, rather than writing a function on two arguments, one instead writes a function on one argument which returns a function taking the second argument and performing the remainder of the evaluation.

This technique is known as currying and it is common in functional programming languages. We can write curried functions using closures. Suppose we want to define a curried addition function on two arguments, and later apply that function to the arguments 3 and 4. We could write such a program in the following way:

def Sum(a) = ( lambda(b) = a+b )  
val f = Sum(3)
f(4)
This defines a function Sum which, given an argument a, creates a function which will take an argument b and add a to b. It then creates the function which adds 3 to its argument, binds that to f, and then invokes f on 4 to yield 3+4.

When defining a curried function, we have abstracted it in two steps, and when applying it we have written two separate calls. However, this is verbose and not very clear. Orc has a special syntax for curried function definitions and curried applications that will simplify both of these steps. Function definitions may have multiple argument sequences; they are enclosed in parentheses and concatenated. Curried function calls chain together multiple applications in a similar way. Here is the previous program, written in this simplified syntax:

def Sum(a)(b) = a+b
Sum(3)(4)
Naturally, this syntax is backwards compatible; e.g. both of the following programs are also equivalent:
def Sum(a) = ( lambda(b) = a+b )  
Sum(3)(4)
def Sum(a)(b) = a+b  
val f = Sum(3)
f(4)

Warning

Clauses of a function may be defined with both curried arguments and patterns in arguments. However, the combination of these two features can produce unintuitive behavior. Consider the following clausal, curried function definition:

def sum(x)(0) = x
def sum(x)(y) = x + y
sum(2)(3)
One might expect sum(2)(3) to evaluate to 5. However, this is not the case; instead sum(2)(3) remains silent due to a pattern matching failure. This is because the compiler always expands curried definitions before considering patterns, so the above program is in fact equivalent to:
def sum(x) = ( lambda(0) = x )
def sum(x) = ( lambda(y) = x + y )
sum(2)(3)
Now the problem becomes apparent. The first clause will always match when the function is applied to its first argument, and then only one clause is available when the second argument is available: the clause with succeeds only on 0.

Therefore, use caution when writing function definitions that use both currying and clauses.

1.4.2. Extensions to pattern matching

1.4.2.1. As pattern

In taking apart a value with a pattern, it is often useful to capture intermediate results.

val (a,b) = ((1,2),(3,4))
val (ax,ay) = a
val (bx,by) = b
We can use the as keyword to simplify this program fragment, giving a name to an entire sub-pattern. Here is an equivalent version of the above code.
val ((ax,ay) as a, (bx,by) as b) = ((1,2),(3,4))

1.4.2.2. Equality pattern

When a variable occurs in a pattern, it is bound to the value being matched against that pattern. What if instead we would like to use the value of a bound variable as a test pattern, in a way similar to a literal value like 0?

We can use an equality pattern to do this. An equality pattern is a variable name preceded by =. Consider the following definition, which compares its argument to the value bound to variable x, returning true if they are equal and false otherwise.

def isx(=x) = true
def isx(_) = false

The equality pattern becomes much more useful when it is embedded within other patterns. For example, consider this definition withz. If one of the arguments passed to it is equal to z, then the function returns its other argument. Otherwise it remains silent.

def withz(=z,y) = y
def withz(x,=z) = x

1.4.3. Datatypes

We have seen Orc's predefined data structures: tuples and lists. Orc also provides the capability for programmers to define their own data structures, using a feature adopted from the ML/Haskell language family called datatypes (also called variants or tagged sums).

Datatypes are defined using the type declaration:

type Tree = Node(_,_,_) | Empty()

This declaration defines two new sites named Node and Empty. Node takes three arguments, and publishes a tagged value wrapping those arguments. Empty takes no arguments and does the same.

Once we have created these tagged values, we use a new pattern called a datatype pattern to match them and unwrap the arguments:

type Tree = Node(_,_,_) | Empty()
{- Build up a small binary tree -}
val l = Node(Empty(), 0, Empty())
val r = Node(Empty(), 2, Empty())
val t = Node(l,1,r)

{- And then match it to extract its contents -}
t >Node(l,j,r)>
l >Node(_,i,_)>
r >Node(_,k,_)>
( i | j | k )

One pair of datatypes is so commonly used that it is already predefined in the standard library: Some(_) and None(). These are used as return values for calls that need to distinguish between successfully returning a value (Some(v)), and successfully completing but having no meaningful value to return (None()). For example, a lookup function might return Some(result) if it found a result, or return None() if it successfully performed the lookup but found no suitable result.

1.4.4. New forms of declarations

1.4.4.1.  class declaration

When Orc is run on top of an object-oriented programming language, classes from that language may be used as sites in Orc itself, via the class declaration.

{- Use the String class from Java's standard library as a site -}
class String = java.lang.String
val s = String("foo")
s.concat("bar")

This program binds the variable String to Java's String class. When it is called as a site, it constructs a new instance of String, passing the given arguments to the constructor.

This instance of String is a Java object; its methods are called and its fields are accessed using the dot (.) notation, just as one would expect in Java. For complete details of how Orc interacts with Java, see Java Integration.

1.4.4.2.  include declaration

It is often convenient to group related declarations into units that can be shared between programs. The include declaration offers a simple way to do this. It names a source file containing a sequence of Orc declarations; those declarations are incorporated into the program as if they had textually replaced the include declaration. An included file may itself contain include declarations.

{- Contents of fold.inc -}
def foldl(f,[],s) = s
def foldl(f,h:t,s) = foldl(f,t,f(h,s))

def foldr(f,l,s) = foldl(f,rev(l),s)

{- This is the same as inserting the contents of fold.inc here -}
include "fold.inc"

def sum(L) = foldl(lambda(a,b) = a+b, L, 0)

sum([1,2,3])

Note that these declarations still obey the rules of lexical scope. Also, Orc does not detect shared declarations; if the same file is included twice, its declarations occur twice.

Chapter 2. Programming Methodology

In Chapter 1, we described the syntax and semantics of the Orc language. Now, we turn our attention to how the language is used in practice, with guidelines on style and programming methodology, including a number of common concurrency patterns.

2.1. Syntactic and Stylistic Conventions

In this section we suggest some syntactic conventions for writing Orc programs. None of these conventions are required by the parser; newlines are used only to disambiguate certain corner cases in parsing, and other whitespace is ignored. However, following programming convention helps to improve the readability of programs, so that the programmer's intent is more readily apparent.

2.1.1. Parallel combinator

When the combined expressions are small, write them all on one line.

F | G | H
Note that we do not need parentheses here, since | is fully associative and commutative.

When the combined expressions are large enough to take up a full line, write one expression per line, with each subsequent expression aligned with the first and preceded by |. Indent the first expression to improve readability.

  long expression 
| long expression
| long expression

A sequence of parallel expressions often form the left hand side of a sequential combinator. Since the sequential combinator has higher precedence, use parentheses to group the combined parallel expressions together.

( expression 
| expression
) >x> 
another expression

2.1.2. Sequential combinator

When the combined expressions are small, write a cascade of sequential combinators all on the same line.

F >x> G >y> H

Remember that sequential is right associative; in this example, x is bound in both G and H, and y is bound in H.

When the combined expressions are large enough to take up a full line, write one expression per line; each line ends with the combinator which binds the publications produced by that line.

long expression  >x>  
long expression  >y> 
long expression

For very long-running expressions, or expressions that span multiple lines, write the combinators on separate lines, indented, between each expression.

very long expression 
  >x>  
very long expression 
  >y> 
very long expression

2.1.3. Pruning combinator

When the combined expressions are small, write them on the same line:

F <x< G
When multiple pruning combinators are used to bind multiple variables (especially when the scoped expression is long), start each line with a combinator, aligned and indented, and continue with the expression.
long expression 
  <x< G
  <y< H

The pruning combinator is not often written in its explicit form in Orc programs. Instead, the val declaration is often more convenient, since it is semantically equivalent and mentions the variable x before its use in scope, rather than after.

val x = G
val y = H
long expression 

Additionally, when the variable is used in only one place, and the expression is small, it is often easier to use a nested expression. For example,

val x = G
val y = H
M(x,y)
is equivalent to
M(G,H)

Sometimes, we use the pruning combinator simply for its capability to terminate expressions and get a single publication; binding a variable is irrelevant. This is a special case of nested expressions. We use the identity site let to put the expression in the context of a function call.

For example,

x <x< F | G | H
is equivalent to
let(F | G | H)

The translation uses a pruning combinator, but we don't need to write the combinator, name an irrelevant variable, or worry about precedence (since the expression is enclosed in parentheses as part of the call).

2.1.4. Declarations

Declarations should always end with a newline.

def add(x,y) = x + y
val z = 7
add(z,z)
While the parser does not require a newline to end a declaration, it uses the newline to disambiguate certain corner cases in parsing, such as function application.

When the body of a declaration spans multiple lines, start the body on a new line after the = symbol, and indent the entire body.

def f(x,y) =
    declaration
    declaration
    body expression

Apply this style recursively; if a def appears within a def, indent its contents even further.

def f(x,y) =
    declaration
    def helper(z) =
    	declaration in helper
    	declaration in helper
    	body of helper
    declaration
    body expression

2.2. Programming Idioms

In this section we give Orc implementations of some standard idioms from concurrent and functional programming. Despite the austerity of Orc's four combinators, we are able to encode a variety of idioms straightforwardly.

2.2.1. Channels

Orc has no communication primitives like pi-calculus channels[1] or Erlang mailboxes[2]. Instead, it makes use of sites to create channels of communication.

The most frequently used of these sites is Buffer. When called, it publishes a new asynchronous FIFO channel. That channel is a site with two methods: get and put. The call c.get() takes the first value from channel c and publishes it, or blocks waiting for a value if none is available. The call c.put(v) puts v as the last item of c and publishes a signal.

A channel may be closed to indicate that it will not be sent any more values. If the channel c is closed, c.put(v) always halts (without modifying the state of the channel), and c.get() halts once c becomes empty. The channel c may be closed by calling either c.close(), which returns a signal once c becomes empty, or c.closenb(), which returns a signal immediately.

2.2.2. Lists

In the section on Cor, we were introduced to lists: how to construct them, and how to match them against patterns. While it is certainly feasible to write a specific function with an appropriate pattern match every time we want to access a list, it is helpful to have a handful of common operations on lists and reuse them.

One of the most common uses for a list is to send each of its elements through a sequential combinator. Since the list itself is a single value, we want to walk through the list and publish each one of its elements in parallel as a value. The library function each does exactly that.

Suppose we want to send the message invite to each email address in the list inviteList:

each(inviteList) >address> Email(address, invite)

Orc also adopts many of the list idioms of functional programming. The Orc library contains definitions for most of the standard list functions, such as map and fold. Many of the list functions internally take advantage of concurrency to make use of any available parallelism; for example, the map function dispatches all of the mapped calls concurrently, and assembles the result list once they all return using a fork-join.

2.2.3. Streams

Sometimes a source of data is not explicitly represented by a list or other data structure. Instead, it is made available through a site, which returns the values one at a time, each time it is called. We call such a site a stream. It is analogous to an iterator in a language like Java. Functions can also be used as streams, though typically they will not be pure functions, and should only return one value. A call to a stream may halt, to indicate that the end of the data has been reached, and no more values will become available. It is often useful to detect the end of a stream using the otherwise combinator.

Streams are common enough in Orc programming that there is a library function to take all of the available publications from a stream; it is called repeat, and it is analogous to each for lists.

def repeat(f) = f() >x> (x | repeat(f))

The repeat function calls the site or function f with no arguments, publishes its return value, and recurses to query for more values. repeat should be used with sites or functions that block until a value is available. Notice that if any call to f halts, then repeat(f) consequently halts.

For example, it is very easy to treat a channel c as a stream, reading any values put on the channel as they become available:

repeat(c.get)

2.2.4. Arrays

While lists are a very useful data structure, they are not indexed; it is not possible to get the nth element of a list in constant time. However, this capability is often needed in practice, so the Orc standard library provides a function IArray to create immutable arrays. Once initialized, an immutable array cannot be rewritten; it can only be read.

IArray takes two arguments: an array size, and a function to initialize the array. The function takes the index being initialized as an argument (indices start at 0), and publishes the value to be stored at that array position. Here are a few examples:

{- Create an array of 10 elements; element i is the ith power of 2 -}
IArray(10, lambda(i) = 2 ** i)
{- Create an array of 5 elements; each element is a newly created buffer -}
IArray(5, lambda(_) = Buffer())

The array is used like a function; the call A(i) returns the ith element of the array A. A call with an out-of-bounds index halts.

{- Create an array of 3 buffers -}
val A = IArray(10, lambda(_) = Buffer())

{- Send true on the 0th channel
   and listen for a value on the 0th channel. -}
A(0).put(true) | A(0).get()

Since arrays are accessed by index, there is a library function specifically designed to make programming with indices easier. The function upto(n) publishes all of the numbers from 0 to n-1 simultaneously; thus, it is very easy to access all of the elements of an array simultaneously. Suppose we have an array A of n email addresses and would like to send the message m to each one.

upto(n) >i> A(i) >address> Email(address, m)

2.2.5. Mutable Storage

Variables in Orc are immutable. There is no assignment operator, and there is no way to change the value of a bound variable. However, it is often useful to have mutable state when writing certain algorithms. The Orc library contains two sites that offer simple mutable storage: Ref and Cell.

The Ref site creates rewritable reference cells.

val r = Ref(0)
println(r.read()) >>
r.write(2) >> 
println(r.read()) >>
stop

These are very similar to ML's ref cells. r.write(v) stores the value v in the reference r, overwriting any previous value, and publishes a signal. r.read() publishes the current value stored in r.

However, unlike in ML, a reference cell can be left initially empty by calling Ref with no arguments. A read operation on an empty cell blocks until the cell is written.

{- Create a cell, and wait 1 second before initializing it. -}
{- The read operation blocks until the write occurs. -}
val r = Ref()
r.read() | Rtimer(1000) >> r.write(1) >> stop

The Orc library also offers write-once reference cells, using the Cell site. A write-once cell has no initial value. Read operations block until the cell has been written. A write operation succeeds only if the cell is empty; subsequent write operations simply halt.

{- Create a cell, try to write to it twice, and read it -}
{- The read will block until a write occurs
   and only one write will succeed. -}
val r = Cell()
  Rtimer(1000) >> r.write(2) >> println("Wrote 2") >> stop
| Rtimer(1000) >> r.write(3) >> println("Wrote 3") >> stop
| r.read()
Write-once cells are very useful for concurrent programming, and they are often safer than rewritable reference cells, since the value cannot be changed once it has been written. The use of write-once cells for concurrent programming is not a new idea; they have been studied extensively in the context of the Oz programming language.

A word of caution: References, cells, and other mutable objects may be accessed concurrently by many different parts of an Orc program, so race conditions may arise.

2.2.5.1. Reference Syntax

Orc provides syntactic sugar for reading and writing mutable storage:

  • x? is equivalent to x.read(). This operator is of equal precedence with the dot operator and function application, so you can write things like x.y?.v?. This operator is very similar to the C languages's * operator, but is postfix instead of prefix.
  • x := y is equivalent to x.write(y). This operator has higher precedence than the concurrency combinators and if/then/else, but lower precedence than any of the other operators.

Here is a previous example rewritten using this syntactic sugar:

{- Create a cell, try to write to it twice, and read it -}
{- The read will block until a write occurs
   and only one write will succeed. -}
val r = Cell()
  Rtimer(1000) >> r := 2 >> println("Wrote 2") >> stop
| Rtimer(1000) >> r := 3 >> println("Wrote 3") >> stop
| r?

2.2.6. Loops

Orc does not have any explicit looping constructs. Most of the time, where a loop might be used in other languages, Orc programs use one of two strategies:

  1. When the iterations of the loops can occur in parallel, write an expression that expands the data into a sequence of publications, and use a sequential operator to do something for each publication. This is the strategy that uses functions like each, repeat, and upto.
  2. When the iterations of the loops must occur in sequence, write a tail recursive function that iterates over the data. Any loop can be rewritten as a tail recursion. Typically the data of interest is in a list, so one of the standard list functions, such as foldl, applies. The library also defines a function while, which handles many of the common use cases of while loops.

2.2.7. Parallel Matching

Matching a value against multiple patterns, as we have seen it so far, is a linear process, and requires a def whose clauses have patterns in their argument lists. Such a match is linear; each pattern is tried in order until one succeeds.

What if we want to match a value against multiple patterns in parallel, executing every clause that succeeds? Fortunately, this is very easy to do in Orc. Suppose we have an expression F which publishes pairs of integers, and we want to publish a signal for each 3 that occurs.

We could write:

F >(x,y)>
  ( if(x=3) >> signal
  | if(y=3) >> signal ) 

But there is a more general alternative:
F >x>
  ( x >(3,_)> signal
  | x >(_,3)> signal ) 

The interesting case is the pair (3,3), which is counted twice because both patterns match it in parallel.

This parallel matching technique is sometimes used as an alternative to pattern matching using function clauses, but only when the patterns are mutually exclusive.

For example,

def helper([]) = 0
def helper([_]) = 1
def helper(_:_:_) = 2
helper([4,6])
is equivalent to
[4,6] >x>
  x >[]> 0
| x >[_]> 1
| x >_:_:_> 2
whereas
def helper([]) = 0
def helper([_]) = 1
def helper(_) = 2
helper([5])
is not equivalent to
[5] >x>
  x >[]> 0
| x >[_]> 1
| x >_> 2
because the clauses are not mutually exclusive. Function clauses must attempt to match in linear order, whereas this expression matches all of the patterns in parallel. Here, it will match [5] two different ways, publishing both 1 and 2.

2.2.8. Fork-join

One of the most common concurrent idioms is a fork-join: run two processes concurrently, and wait for a result from each one. This is very easy to express in Orc. Whenever we write a val declaration, the process computing that value runs in parallel with the rest of the program. So if we write two val declarations, and then form a tuple of their results, this performs a fork-join.

val x = F
val y = G
(x,y)

Fork-joins are a fundamental part of all Orc programs, since they are created by all nested expression translations. In fact, the fork-join we wrote above could be expressed even more simply as just:

(F,G)

2.2.8.1. Example: Machine initialization

In Orc programs, we often use fork-join and recursion together to dispatch many tasks in parallel and wait for all of them to complete. Suppose that given a machine m, calling m.init() initializes m and then publishes a signal when initialization is complete. The function initAll initializes a list of machines.

def initAll([]) = signal
def initAll(m:ms) = ( m.init() , initAll(ms) ) >> signal

For each machine, we fork-join the initialization of that machine (m.init()) with the initialization of the remaining machines (initAll(ms)). Thus, all of the initializations proceed in parallel, and the function returns a signal only when every machine in the list has completed its initialization.

Note that if some machine fails to initialize, and does not return a signal, then the initialization procedure will never complete.

2.2.8.2. Example: Simple parallel auction

We can also use a recursive fork-join to obtain a value, rather than just signaling completion. Suppose we have a list of bidders in a sealed-bid, single-round auction. Calling b.ask() requests a bid from the bidder b. We want to ask for one bid from each bidder, and then return the highest bid. The function auction performs such an auction for a list of bidders (max finds the maximum of its arguments):

def auction([]) = 0
def auction(b:bs) = max(b.ask(), auction(bs))

Note that all bidders are called simultaneously. Also note that if some bidder fails to return a bid, then the auction will never complete. Later we will see a different solution that addresses the issue of non-termination.

2.2.8.3. Example: Barrier synchronization

Consider an expression of the following form, where F and G are expressions and M and N are sites:

M()  >x>  F | N()  >y>  G

Suppose we would like to synchronize F and G, so that both start executing at the same time, after both M() and N() respond. This is easily done using the fork-join idiom. In the following, we assume that x does not occur free in G, nor y in F.

( M() , N() )  >(x,y)>  ( F | G )

2.2.9. Sequential Fork-Join

Previous sections illustrate how Orc can use the fork-join idiom to process a fixed set of expressions or a list of values. Suppose that instead we wish to process all the publications of an expression F, and once this processing is complete, execute some expression G. For example, F publishes the contents of a text file, one line at a time, and we wish to print each line to the console using the site println, then publish a signal after all lines have been printed.

Sequential composition alone is not sufficient, because we have no way to detect when all of the lines have been processed. A recursive fork-join solution would require that the lines be stored in a traversable data structure like a list, rather than streamed as publications from F. A better solution uses the ; combinator to detect when processing is complete:

F >x> println(x) >> stop ; signal

Since ; only evaluates its right side if the left side does not publish, we suppress the publications on the left side using stop. Here, we assume that we can detect when F halts. If, for example, F is publishing the lines of the file as it receives them over a socket, and the sending party never closes the socket, then F never halts and no signal is published.

2.2.10. Priority Poll

The otherwise combinator is also useful for trying alternatives in sequence. Consider an expression of the form F0 ; F1 ; F2 ; .... If Fi does not publish and halts, then Fi+1 is executed. We can think of the Fi's as a series of alternatives that are explored until a publication occurs.

Suppose that we would like to poll a list of buffers for available data. The list of buffers is ordered by priority. The first buffer in the list has the highest priority, so it is polled first. If it has no data, then the next buffer is polled, and so on.

Here is a function which polls a prioritized list of buffers in this way. It publishes the first item that it finds, removing it from the originating buffer. If all buffers are empty, the function halts. We use the getnb ("get non-blocking") method of the buffer, which retrieves the first available item if there is one, and halts otherwise.

def priorityPoll([]) = stop
def priorityPoll(b:bs) = b.getnb() ; priorityPoll(bs)

2.2.11. Parallel Or

``Parallel or'' is a classic idiom of parallel programming. The ``parallel or'' operation executes two expressions F and G in parallel, each of which may publish a single boolean, and returns the disjunction of their publications as soon as possible. If one of the expressions publishes true, then the disjunction is true, so it is not necessary to wait for the other expression to publish a value. This holds even if one of the expressions is silent.

The ``parallel or'' of expressions F and G may be expressed in Orc as follows:

let(
    val a = F
    val b = G

      (a || b)
    | if(a) >> true 
    | if(b) >> true 
)

The expression (a || b) waits for both a and b to become available and then publishes their disjunction. However if either a or b is true we can publish true immediately regardless of whether the other variable is available. Therefore we run if(a) >> true and if(b) >> true in parallel to wait for either variable to become true and immediately publish the result true. Since more than one of these expressions may publish true, the surrounding let(...) is necessary to select and publish only the first result.

2.2.12. Timeout

Timeout, the ability to execute an expression for at most a specified amount of time, is an essential ingredient of fault-tolerant and distributed programming. Orc accomplishes timeout using pruning and the Rtimer site. The following program runs F for at most one second, publishing its result if available and the value 0 otherwise.

let( F | Rtimer(1000) >> 0 )

2.2.12.1. Auction with timeout

In the auction example given previously, the auction may never complete if one of the bidders does not respond. We can add a timeout so that a bidder has at most 8 seconds to provide a bid:

def auction([]) = 0
def auction(b:bs) = 
  val bid = b.ask() | Rtimer(8000) >> 0
  max(bid, auction(bs))

This version of the auction is guaranteed to complete within 8 seconds.

2.2.12.2. Detecting timeout

Sometimes, rather than just yielding a default value, we would like to determine whether an expression has timed out, and if so, perform some other computation. To detect the timeout, we pair the result of the original expression with true and the result of the timer with false. Thus, if the expression does time out, then we can distinguish that case using the boolean value.

Here, we run expression F with a time limit t. If it publishes within the time limit, we bind its result to r and execute G. Otherwise, we execute H.

val (r, b) = (F, true) | (Rtimer(t), false)
if b then G else H
Instead of using a boolean and conditional, we could use pattern matching:
val s = Some(F) | Rtimer(t) >> None()
  s >Some(r)> G
| s >None()>  H

2.2.13. Priority

We can use a timer to give a window of priority to one computation over another. In this example, we run expressions F and G concurrently. For one second, F has priority; F's result is published immediately, but G's result is held until the time interval has elapsed. If neither F nor G publishes a result within one second, then the first result from either is published.

val x = F
val y = G
let( y | Rtimer(1000) >> x )

2.2.14. Metronome

A timer can be used to execute an expression repeatedly at regular intervals, for example to poll a service. Recall the definition of metronome from the previous chapter:

def metronome(t) = signal | Rtimer(t) >> metronome()

The following example publishes "tick" once per second and "tock" once per second after an initial half-second delay. The publications alternate: "tick tock tick tock ...". Note that this program is not defined recursively; the recursion is entirely contained within metronome.

  metronome(1000) >> "tick"
| Rtimer(500) >> metronome(1000) >> "tock"

2.2.15. Routing

The Orc combinators restrict the passing of values among their component expressions. However, some programs will require greater flexibility. For example, F <x< G provides F with the first publication of G, but what if F needs the first n publications of G? In cases like this we use channels or other stateful sites to redirect or store publications. We call this technique routing because it involves routing values from one execution to another.

2.2.15.1. Generalizing Termination

The pruning combinator terminates an expression after it publishes its first value. We have already seen how to use pruning just for its termination capability, without binding a variable, using the let site. Now, we use routing to terminate an expression under different conditions, not just when it publishes a value; it may publish many values, or none, before being terminated.

Our implementation strategy is to route the publications of the expression through a channel, so that we can put the expression inside a pruning combinator and still see its publications without those publications terminating the expression.

2.2.15.1.1. Enhanced Timeout

As a simple demonstration of this concept, we construct a more powerful form of timeout: allow an expression to execute, publishing arbitrarily many values (not just one), until a time limit is reached.

val c = Buffer()
repeat(c.get) << 
    F >x> c.put(x) >> stop 
  | Rtimer(1000) >> c.closenb()

This program allows F to execute for one second and then terminates it. Each value published by F is routed through channel c so that it does not terminate F. After one second, Rtimer(1000) responds, triggering the call c.closenb(). The call c.closenb() closes c and publishes a signal, terminating F. The library function repeat is used to repeatedly take and publish values from c until it is closed.

2.2.15.1.2. Test Pruning

We can also decide to terminate based on the values published. This expression executes F until it publishes a negative number, and then terminates it:

val c = Buffer()
repeat(c.get) << 
  F >x> 
    (if x >= 0 
        then c.put(x) >> stop
        else c.closenb())

Each value published by F is tested. If it is non-negative, it is placed on channel c (silently) and read by repeat(c.get). If it is negative, the channel is closed, publishing a signal and causing the termination of F.

2.2.15.1.3. Interrupt

We can use routing to interrupt an expression based on a signal from elsewhere in the program. We set up the expression like a timeout, but instead of waiting for a timer, we wait for the semaphore done to be released. Any call to done.release will terminate the expression (because it will cause done.acquire() to publish), but otherwise F executes as normal and may publish any number of values.

val c = Buffer()
val done = Semaphore(0)
repeat(c.get) <<
    F >x> c.put(x) >> stop
  | done.acquire() >> c.closenb()
2.2.15.1.4. Publication Limit

We can limit an expression to n publications, rather than just one. Here is an expression which executes F until it publishes 5 values, and then terminates it.

val c = Buffer()
val done = Semaphore(0)
def allow(0) = done.release() >> stop
def allow(n) = c.get() >x> (x | allow(n-1))
allow(5) << 
    F >x> c.put(x) >> stop 
  | done.acquire() >> c.closenb()

We use the auxiliary function allow to get only the first 5 publications from the channel c. When no more publications are allowed, allow uses the interrupt idiom to halt F and close c.

2.2.15.2. Non-Terminating Pruning

We can use routing to create a modified version of the pruning combinator. As in F <x< G, we'll run F and G in parallel and make the first value published by G available to F. However instead of terminating G after it publishes a value, we will continue running it, ignoring its remaining publications.

val r = Ref()
(F <x< r.read()) | (G >x> r.write(x))

2.2.15.3. Publication-Agnostic Otherwise

We can also use routing to create a modified version of the otherwise combinator. We'll run F until it halts, and then run G, regardless of whether F published any values or not.

val c = Buffer()
repeat(c.get) | (F >x> c.put(x) >> stop ; c.close() >> G)  
We use c.close() instead of the more common c.closenb() to ensure that G does not execute until all the publications of F have been routed. Recall that c.close() does not return until c is empty.

2.2.16. Interruption

We can write a function interruptible that implements the interrupt idiom to execute any function in an interruptible way. interruptible(g) calls the function g, which is assumed to take no arguments, and silences its publications. It immediately publishes another function, which we can call at any time to terminate the execution of g. For simplicity, we assume that g itself publishes no values.

Here is a naive implementation that doesn't quite work:

def interruptible(f) =
  val done = Semaphore(0)
  done.release
    << f() >> stop 
     | done.acquire() >> c.closenb()

{- wrong! -}
val stopper = interruptible(g)
...

The function interruptible is correct, but the way it is used causes a strange error. The function g executes, but is always immediately terminated! This happens because the val declaration which binds stopper also kills all of the remaining computation in interruptible(g), including the execution of g itself.

The solution is to bind the variable differently:

def interruptible(f) =
  val done = Semaphore(0)
  done.release
    << f() >> stop 
     | done.acquire() >> c.closenb()

interruptible(g) >stopper>
...

This idiom, wherein a function publishes some value that can be used to monitor or control its execution, arises occasionally in Orc programming. When using this idiom, always remember to avoid terminating that execution accidentally. Since Orc is a structured concurrent language, every process is contained with some other process; kill the containing process, and the contained processes die too.

2.2.17. Lifting

It is often useful to explicitly lift an execution, so that it is in some sense protected from being terminated. We can do this by running a "lifter" process, to which we can send functions that will be executed by the lifter and thus will not be terminated unless the lifter itself is terminated. Such a lifter is written by creating a channel, and running a loop which listens for functions to be sent on the channel and executes those functions as they arrive. The lifter publishes only the put method for the channel; the loop itself publishes no values, since the values published by the lifted functions are silenced. Here, we write such a lifter, and then use it to protect a function call from a timeout.
def lifter() =
  val c = Buffer()
  def loop() = c.get() >f> ( f() >> stop | loop() )
  c.put | loop()

def delayedPrint() = Rtimer(1500) >> println("Delayed 1.5 seconds")

lifter() >lift>
(
println("Running...") 
  << Rtimer(1000) | delayedPrint() | lift(delayedPrint)
)

The timeout stops the execution of delayedPrint(), so it does not print a result. However, the lifted execution of delayedPrint does succeed, since it is executing within the loop of lifter(), unaffected by the timeout.

2.2.18. Fold

We consider various concurrent implementations of the classic "list fold" function from functional programming:

def fold(_, [x])  = x
def fold(f, x:xs) = f(x, fold(xs))

This is a seedless fold (sometimes called fold1) which requires that the list be nonempty and uses its first element as a seed. This implementation is short-circuiting --- it may finish early if the reduction operator f does not use its second argument --- but it is not concurrent; no two calls to f can proceed in parallel. However, if f is associative, we can overcome this restriction and implement fold concurrently. If f is also commutative, we can further increase concurrency.

2.2.18.1. Associative Fold

We first consider the case when the reduction operator is associative. We define afold(f,xs) where f is a binary associative function and xs is a non-empty list. The implementation iteratively reduces xs to a single value. Each step of the iteration applies the auxiliary function step, which halves the size of xs by reducing disjoint pairs of adjacent items.

def afold(_, [x]) = x
def afold(f, xs) =
  def step([]) = []
  def step([x]) = [x]
  def step(x:y:xs) = f(x,y):step(xs)
  afold(f, step(xs))

Notice that f(x,y):step(xs) is an implicit fork-join. Thus, the call f(x,y) executes in parallel with the recursive call step(xs). As a result, all calls to f execute concurrently within each iteration of afold.

2.2.18.2. Associative, Commutative Fold

We can make the implementation even more concurrent when the fold operator is both associative and commutative. We define cfold(f,xs), where f is a associative and commutative binary function and xs is a non-empty list. The implementation initially copies all list items into a buffer in arbitrary order using the auxiliary function xfer, counting the total number of items copied. The auxiliary function combine repeatedly pulls pairs of items from the buffer, reduces them, and places the result back in the buffer. Each pair of items is reduced in parallel as they become available. The last item in the buffer is the result of the overall fold.

def cfold(f, xs) =
  val c = Buffer()
  
  def xfer([])    = 0
  def xfer(x:xs)  = c.put(x) >> stop | xfer(xs)+1

  def combine(0) = stop
  def combine(1) =  c.get()
  def combine(m) =  c.get() >x> c.get() >y> 
                    ( c.put(f(x,y)) >> stop | combine(m-1))

  xfer(xs) >n> combine(n)

2.3. Larger Examples

In this section we show a few larger Orc programs to demonstrate programming techniques. There are many more such examples available at the Orc website, on the community wiki.

2.3.1. Dining Philosophers

The dining philosophers problem is a well known and intensely studied problem in concurrent programming. Five philosophers sit around a circular table. Each philosopher has two forks that she shares with her neighbors (giving five forks in total). Philosophers think until they become hungry. A hungry philosopher picks up both forks, one at a time, eats, puts down both forks, and then resumes thinking. Without further refinement, this scenario allows deadlock; if all philosophers become hungry and pick up their left-hand forks simultaneously, no philosopher will be able to pick up her right-hand fork to eat. Lehmann and Rabin's solution[3], which we implement, requires that each philosopher pick up her forks in a random order. If the second fork is not immediately available, the philosopher must set down both forks and try again. While livelock is still possible if all philosophers take forks in the same order, randomization makes this possibility vanishingly unlikely.

def shuffle(a,b) = if (random(2) = 1) then (a,b) else (b,a)

def take((a,b)) =    
  a.acquire() >> b.acquirenb() ;
  a.release() >> take(shuffle(a,b))
    
def drop(a,b) = (a.release(), b.release()) >> signal

def phil(n,a,b) =
  def thinking() = 
    println(n + " thinking") >> 
    if (random(10) < 9)
      then Rtimer(random(1000))
      else stop
  def hungry() = take((a,b))
  def eating() = 
    println(n + " eating") >> 
    Rtimer(random(1000)) >> 
    println(n + " done eating") >> 
    drop(a,b)
  thinking() >> hungry() >> eating() >> phil(n,a,b)

def philosophers(1,a,b) = phil(1,a,b)
def philosophers(n,a,b) =
  val c = Semaphore(1)
  philosophers(n-1,a,c) | phil(n,c,b)

val fork = Semaphore(1)
philosophers(5,fork,fork)

The phil function simulates a single philosopher. It takes as arguments two binary semaphores representing the philosopher's forks, and calls the thinking, hungry, and eating functions in a continuous loop. A thinking philosopher waits for a random amount of time, with a 10% chance of thinking forever. A hungry philosopher uses the take function to acquire two forks. An eating philosopher waits for a random time interval and then uses the drop function to relinquish ownership of her forks.

Calling take(a,b) attempts to acquire a pair of forks (a,b) in two steps: wait for fork a to become available, then immediately attempt to acquire fork b. The call b.acquirenb() either acquires b and responds immediately, or halts if b is not available. If b is acquired, signal success; otherwise, release a, and then try again, randomly changing the order in which the forks are acquired using the auxiliary function shuffle.

The function call philosophers(n,a,b) recursively creates a chain of n philosophers, bounded by fork a on the left and b on the right. The goal expression of the program calls philosophers to create a chain of five philosophers bounded on the left and right by the same fork; hence, a ring.

This Orc solution has several nice properties. The overall structure of the program is functional, with each behavior encapsulated in its own function, making the program easy to understand and modify. Mutable state is isolated to the "fork" semaphores and associated take and get functions, simplifying the implementation of the philosophers. The program never manipulates threads explicitly, but instead expresses relationships between activities using Orc's combinators.

2.3.2. Hygienic Dining Philosophers

Here we implement a different solution to the Dining Philosophers problem, described in "The Drinking Philosophers Problem", by K. M. Chandy and J. Misra. Briefly, this algorithm efficiently and fairly solves the dining philosophers problem for philosophers connected in an arbitrary graph (as opposed to a simple ring). The algorithm works by augmenting each fork with a clean/dirty state. Initially, all forks are dirty. A philosopher is only obliged to relinquish a fork to its neighbor if the fork is dirty. On receiving a fork, the philosopher cleans it. On eating, the philosopher dirties all forks. For full details of the algorithm, consult the original paper.

{-
Start a philosopher actor; never publishes.
Messages sent between philosophers include:
- ("fork", p): philosopher p relinquishes the fork
- ("request", p): philosopher p requests the fork
- ("rumble", p): sent by a philosopher to itself when it should
  become hungry

name: identify this process in status messages
mbox: our mailbox; the "address" of this philosopher is mbox.put
missing: set of neighboring philosophers holding our forks
-}
def philosopher(name, mbox, missing) =
  {- deferred requests for forks -}
  val deferred = Buffer()
  {- forks we hold which are clean -}
  val clean = Set()

  def sendFork(p) =
    {- remember that we no longer hold the fork -}
    missing.add(p) >>
    p(("fork", mbox.put))
 
  def requestFork(p) =
    p(("request", mbox.put))
  
  {- Start a timer which will tell us when we're hungry. -}
  def digesting() =
      println(name + " thinking") >>
      thinking()
    | Rtimer(random(1000)) >>
      mbox.put(("rumble", mbox.put)) >>
      stop

  {- Wait to become hungry -}
  def thinking() =
    def on(("rumble", _)) =
      println(name + " hungry") >>
      map(requestFork, missing) >>
      hungry()
    def on(("request", p)) =
      sendFork(p) >> thinking()
    on(mbox.get())

  {- Eat once we receive all forks -}
  def hungry() =
    def on(("fork", p)) =
      missing.remove(p) >>
      clean.add(p) >>
      if missing.isEmpty()
      then println(name + " eating") >> eating()
      else hungry()
    def on(("request", p)) =
      if clean.contains(p)
      then deferred.put(p) >> hungry()
      else sendFork(p) >> requestFork(p) >> hungry()
    on(mbox.get())

  {- Dirty forks, process deferred requests, then digest -}
  def eating() =
    clean.clear() >>
    Rtimer(random(1000)) >>
    map(sendFork, deferred.getAll()) >>
    digesting()

  {- All philosophers start out digesting -}
  digesting()

{-
Create an NxN 4-connected grid of philosophers.  Each philosopher
holds the fork for the connections below and to the right (so the
top left philosopher holds both its forks).
-}
def philosophers(n) =
  {- A set with 1 item -}
  def Set1(item) = Set() >s> s.add(item) >> s
  {- A set with 2 items -}
  def Set2(i1, i2) = Set() >s> s.add(i1) >> s.add(i2) >> s

  {- create an NxN matrix of mailboxes -}
  val cs = uncurry(IArray(n, lambda (_) = IArray(n, ignore(Buffer))))

  {- create the first row of philosophers -}
  philosopher((0,0), cs(0,0), Set())
  | for(1, n) >j>
    philosopher((0,j), cs(0,j), Set1(cs(0,j-1).put))

  {- create remaining rows -}
  | for(1, n) >i> (
      philosopher((i,0), cs(i,0), Set1(cs(i-1,0).put))
      | for(1, n) >j>
        philosopher((i,j), cs(i,j), Set2(cs(i-1,j).put, cs(i,j-1).put))
    )

{- Simulate a 3x3 grid of philosophers for 10 seconds -}
let(
  philosophers(3)
  | Rtimer(10000)
) >> "HALTED"

Our implementation is based on the actor model of concurrency. An actor is a state machine which reacts to messages. On receiving a message, an actor can send asynchronous messages to other actors, change its state, or create new actors. Each actor is single-threaded and processes messages sequentially, which makes some concurrent programs easier to reason about and avoids explicit locking. Erlang is one popular language based on the actor model.

Orc emulates the actor model very naturally. In Orc, an actor is an Orc thread of execution, together with a Buffer which serves as a mailbox. To send a message to an actor, you place it in the actor's mailbox, and to receive a message, the actor gets the next item from the mailbox. The internal states of the actor are represented by functions: while an actor's thread of execution is evaluating a function, it is considered to be in the corresponding state. Because Orc implements tail-call optimization, state transitions can be encoded as function calls without running out of stack space.

In this program, a philosopher is implemented by an actor with three primary states: eating, thinking, and hungry. An additional transient state, digesting, is used to start a timer which will trigger the state change from thinking to hungry. Each state is implemented by a function which reads a message from the mailbox, selects the appropriate action using pattern matching, performs the action, and finally transitions to the next state (possibly the same as the current state) by calling the corresponding function.

Forks are never represented explicitly. Instead each philosopher identifies a fork with the "address" (sending end of a mailbox) of the neighbor who shares the fork. Every message sent includes the sender's address. Therefore when a philosopher receives a request for a fork, it knows who requested it and therefore which fork to relinquish. Likewise when a philosopher receives a fork, it knows who sent it and therefore which fork was received.

2.3.3. Readers-Writers

Here we present an Orc solution to the readers-writers problem. Briefly, the readers-writers problem involves concurrent access to a mutable resource. Multiple readers can access the resource concurrently, but writers must have exclusive access. When readers and writers conflict, different solutions may resolve the conflict in favor of one or the other, or fairly. In the following solution, when a writer tries to acquire the lock, current readers are allowed to finish but new readers are postponed until after the writer finishes. Lock requests are granted in the order received, guaranteeing fairness. Normally, such a service would be provided to Orc programs by a site, but it is educational to see how it can be implemented directly in Orc.

-- Queue of lock requests
val m = Buffer()
-- Count of active readers/writers
val c = Counter()

{-- Process requests in sequence --}
def process() =
  -- Grant read request
  def grant((false,s)) = c.inc() >> s.release()
  -- Grant write request
  def grant((true,s)) =
    c.onZero() >> c.inc() >> s.release() >> c.onZero()
  -- Goal expression of process()
  m.get() >r> grant(r) >> process()

{-- Acquire the lock: argument is "true" if writing --}
def acquire(write) =
  val s = Semaphore(0)
  m.put((write, s)) >> s.acquire()

{-- Release the lock --}
def release() = c.dec()

-------------------------------------------------

{-- These definitions are for testing only --}
def reader(start) = Rtimer(start) >>
  acquire(false) >> println("START READ") >>
  Rtimer(1000) >> println("END READ") >>
  release() >> stop
def writer(start) = Rtimer(start) >>
  acquire(true) >> println("START WRITE") >>
  Rtimer(1000) >> println("END WRITE") >>
  release() >> stop

let(
    process()  {- Output:     -}
  | reader(10) {- START READ  -}
  | reader(20) {- START READ  -}
               {- END READ    -}
               {- END READ    -}
  | writer(30) {- START WRITE -}
               {- END WRITE   -}
  | reader(40) {- START READ  -}
  | reader(50) {- START READ  -}
               {- END READ    -}
               {- END READ    -}
  -- halt after the last reader finishes
  | Rtimer(60) >> acquire(true)
)

The lock receives requests over the channel m and processes them sequentially with the function grant. Each request includes a boolean flag which is true for write requests and false for read requests, and a Semaphore which the requester blocks on. The lock grants access by releasing the semaphore, unblocking the requester.

The counter c tracks the number of readers or writers currently holding the lock. Whenever the lock is granted, grant increments c, and when the lock is released, c is decremented. To ensure that a writer has exclusive access, grant waits for the c to become zero before granting the lock to the writer, and then waits for c to become zero again before granting any more requests.

2.3.4. Quicksort

The original quicksort algorithm [4] was designed for efficient execution on a uniprocessor. Encoding it as a functional program typically ignores its efficient rearrangement of the elements of an array. Further, no known implementation highlights its concurrent aspects. The following program attempts to overcome these two limitations. The program is mostly functional in its structure, though it manipulates the array elements in place. We encode parts of the algorithm as concurrent activities where sequentiality is unneeded.

The following listing gives the implementation of the quicksort function which sorts the array a in place. The auxiliary function sort sorts the subarray given by indices s through t by calling part to partition the subarray and then recursively sorting the partitions.

def quicksort(a) =

  def swap(x, y) = a(x)? >z> a(x) := a(y)? >> a(y) := z

  def part(p, s, t) =
    def lr(i) = if i < t && a(i)? <= p then lr(i+1) else i
    def rl(i) = if a(i)? > p then rl(i-1) else i

    (lr(s), rl(t)) >(s', t')>
    ( if (s' + 1 < t') >> swap(s', t') >> part(p, s'+1, t'-1)
    | if (s' + 1 = t') >> swap(s', t') >> s'
    | if (s' + 1 > t') >> t'
    )

  def sort(s, t) =
     if s >= t then signal
     else part(a(s)?, s+1, t) >m>
          swap(m, s) >>
          (sort(s, m-1), sort(m+1, t)) >>
          signal

  sort(0, a.length()-1)

The function part partitions the subarray given by indices s through t into two partitions, one containing values less than or equal to p and the other containing values > p. The last index of the lower partition is returned. The value at a(s-1) is assumed to be less than or equal to p --- this is satisfied by choosing p = a(s-1)? initially. To create the partitions, part calls two auxiliary functions lr and rl concurrently. These functions scan from the left and right of the subarray respectively, looking for out-of-place elements. Once two such elements have been found, they are swapped using the auxiliary function swap, and then the unscanned portion of the subarray is partitioned further. Partitioning is complete when the entire subarray has been scanned.

This program uses the syntactic sugar x? for x.read() and x := y for x.write(y). Also note that the expression a(i) returns a reference to the element of array a at index i, counting from 0.

2.3.5. Meeting Scheduler

Orc makes very few assumptions about the behaviors of services it uses. Therefore it is straightforward to write programs which interact with human agents and network services. This makes Orc especially suitable for encoding workflows, the coordination of multiple activities involving multiple participants. The following program illustrates a simple workflow for scheduling a business meeting. Given a list of people and a date range, the program asks each person when they are available for a meeting. It then combines all the responses, selects a meeting time which is acceptable to everyone, and notifies everyone of the selected time.

include "net.inc"
val during = Interval(LocalDate(2009, 9, 10),
                      LocalDate(2009, 10, 17))
val invitees = ["john@example.com", "jane@example.com"]

def invite(invitee) =
  Form() >f>
  f.addPart(DateTimeRangesField("times",
    "When are you available for a meeting?", during, 9, 17)) >>
  f.addPart(Button("submit", "Submit")) >>
  SendForm(f) >receiver>
  SendMail(invitee, "Meeting Request", receiver.getURL()) >>
  receiver.get() >response>
  response.get("times")

def notify([]) =
  each(invitees) >invitee>
  SendMail(invitee, "Meeting Request Failed",
                    "No meeting time found.")
def notify(first:_) =
  each(invitees) >invitee>
  SendMail(invitee, "Meeting Request Succeeded",
                    first.getStart())

map(invite, invitees) >responses>
afold(lambda (a,b) = a.intersect(b), responses) >times>
notify(times)

This program begins with declarations of during (the date range for the proposed meeting) and invitees (the list of people to invite represented by email addresses).

The invite function obtains possible meeting times from a given invitee, as follows. First it uses library sites (Form, DateTimeRangesField, Button, and SendForm) to construct a web form which may be used to submit possible meeting times. Then it emails the URL of this form to the invitee and blocks waiting for a response. When the invitee receives the email, he or she will use a web browser to visit the URL, complete the form, and submit it. The corresponding execution of invite receives the response in the variable response and extracts the chosen meeting times.

The notify function takes a list of possible meeting times, selects the first meeting time in the list, and emails everyone with this time. If the list of possible meeting times is empty, it emails everyone indicating that no meeting time was found.

The goal expression of the program uses the library function map to apply notify to each invitee and collect the responses in a list. It then uses the library function afold to intersect all of the responses. The result is a set of meeting times which are acceptable to everyone. Finally, notify is called to select one of these times and notify everyone of the result.

This program may be extended to add more sophisticated features, such as a quorum (to select a meeting as soon as some subset of invitees responds) or timeouts (to remind invitees if they don't respond in a timely manner). These modifications are local and do not affect the overall structure of the program. For complete details, see examples on our website.



[1] R. Milner. Communicating and Mobile Systems: the π-Calculus. Cambridge University Press, May 1999.

[2] J. Armstrong, R. Virding, C. Wikstr¨om, and M. Williams. Concurrent programming in ERLANG (2nd ed.). Prentice Hall International (UK) Ltd., Hertfordshire, UK, UK, 1996.

[3] D. J. Lehmann and M. O. Rabin. On the advantages of free choice: A symmetric and fully distributed solution to the dining philosophers problem. In POPL, pages 133–138, 1981.

[4] C. A. R. Hoare. Partition: Algorithm 63, Quicksort: Algorithm 64, and Find: Algorithm 65. Communications of the ACM, 4(7):321–322, 1961.

Chapter 3. Accessing and Creating External Services

3.1. Introduction

There are two primary ways to create sites which can be used in Orc:

  1. Implement sites as regular Java classes. Orc programs can import and use these classes directly with the class declaration. This approach is easy for anyone already familiar with Java, and such sites are straightforward to share between Orc and Java code. However such sites are limited in how they can interact with the Orc engine.
  2. Implement sites using Orc's low-level site API. Orc programs can import and use these sites with the site declaration. This approach provides full access to the features of the Orc engine. However such sites are difficult to use from Java code.

External services (including web services) are handled using the Proxy pattern. A site implemented in Java (using one of the two techniques above) must act as the local proxy for the service, translating Orc site calls into the appropriate requests and translating responses into site return values, halts, or errors.

3.2.  class Sites

Java classes can be imported into Orc as sites using the class declaration. Imported classes must be in the classpath of the JVM running the Orc interpreter. The following sections describe in detail how such imported classes behave in Orc programs.

3.2.1. Dot Operator

x.member, where x evaluates to a Java class or object, is evaluated as follows:

  • If x has one or more methods named member, a "method handle" site is returned which may be called like any other Orc site. When a method handle is actually called with arguments, the appropriate Java method is selected and called depending on the number and type of arguments, as described in Method Resolution below.
  • Otherwise, if x has a field named member, the object's field is returned, encapsulated in a Ref object. The Ref object has read and write methods which are used to get and set the value of the field.

Note that no distinction is made between static and non-static members; it is an error to reference a non-static member through a class, but this does not change how members are resolved. Note also that if a field shares a name with one or more methods, there is no way to access the field directly.

The following (rather useless) example illustrates how the dot operator can be used to access both static and non-static methods and fields:

{- bind Integer to a Java class -}
class Integer = java.lang.Integer

{- call a static method -}
val i = Integer.decode("5")
{- read a field -}
val m = Integer.MIN_VALUE.read()
{- write a field -}
Integer.MIN_VALUE.write(5) >>
{- call a non-static method -}
i.toString()

3.2.2. Direct Calls

When x evaluates to a Java object (but not a Java class), the syntax x(...) is equivalent to x.apply(...).

When x evaluates to a Java class, the syntax x(...) calls the class's constructor. In case of overloaded constructors, the appropriate constructor is chosen based on the number and types of arguments as described in Method Resolution.

3.2.3. Pattern Matching

If C is bound to a Java class, it can be used as a pattern. A pattern C(x) matches any Java object of that class or any of its subclasses. The variable x is simply bound to the object again; thus the matcher is just a partial identity function.

3.2.4. Method Resolution

When a method handle is called, the actual Java method called is chosen based on the runtime types of the arguments, as follows:

  1. If only one method has the appropriate number of arguments, that method is called.
  2. Otherwise, each method taking the appropriate number of arguments is tested for type compatibility as follows, and the first matching method is called.
    1. Every argument is compared to the corresponding formal parameter type as follows. All arguments must match for the method to match.
      1. If the argument is null, then the argument matches
      2. If the formal parameter type is primitive (int, char, float, ...) and the argument is an instance of a wrapper class, then the argument is unboxed (unwrapped) and coerced to the type of the formal parameter according to Java's standard rules for implicit widening coercions.
      3. If the formal parameter type is a primitive numeric type and the argument is an instance of BigDecimal, the argument is implicitly narrowed to the formal parameter type.
      4. If the formal parameter type is a primitive integral type and the argument is an instance of BigInteger, the argument is implicitly narrowed to the formal parameter type.
      5. Otherwise, the argument must be a subtype of the formal parameter type.

The reason for the unusual implicit narrowing of BigDecimal and BigInteger is that Orc numeric literals have these types, and it would be awkward to have to perform an explicit conversion every time such a value is passed to a Java method expecting a primitive.

Currently we do not implement specificity rules for choosing the best matching method; the first matching method (according to some unspecified order) is chosen. Note also that we do not support varargs methods explicitly, but instead varargs may be passed as an array of the appropriate type.

3.2.5. Orc values in Java

Orc values are implemented by Java objects, so in general any Orc value may be passed to a site implemented in Java. Standard Orc values have the following Java types:

string
java.lang.String
boolean
java.lang.Boolean
number
java.lang.Number
tuple
orc.runtime.values.TupleValue
list
orc.runtime.values.ListValue
function
orc.runtime.values.Closure

Currently it is not possible to call Orc functions from Java code.

site
orc.runtime.values.Site

Currently it is not possible to directly call Orc sites from Java code. However if you are implementing a site yourself, you may provide methods which can be called from Java code to invoke the behavior of the site.

3.2.6. Java Values in Orc

Java objects may be used directly as values anywhere in an Orc program. Primitive Java values cannot be used directly in an Orc program, but are automatically boxed (and unboxed) as necessary.

When both arguments of an arithmetic or comparison operator are Java numeric types, the arguments are implicitly coerced to the widest of the two argument types. "Widest" is defined by the following relation, where ">" means "is wider than": BigDecimal > Double > Float > BigInteger > Long > Integer > Short > Byte

3.2.7. Cooperative Scheduling and Concurrency

3.2.7.1. Overview

In order to support massive concurrency efficiently in Java, Orc uses cooperative threading. Orc programs are broken into discrete steps which are executed by a fixed-size thread pool. This approach works for Orc expressions, but the internals of an Orc site written in Java cannot be easily broken down. So Orc has two choices when it needs to call a local Java site:

  • Run the site call in a new thread. This means site calls never unnecessarily block the Orc engine, but it introduces the potentially-unnecessary overhead of creating and context-switching threads.
  • Run the site call in the same thread as the Orc engine, which is efficient and conserves thread resources but may block the engine from making further progress until the site call finishes.

Fortunately, in practice most Orc site calls take a short, deterministic amount of time to complete. Of those that don't, it's usually only because they are blocked waiting for some external event, like another site call. In this case, instead of blocking the site can return control to the Orc engine, asking to be run again when the external event occurs. In typical Java applications this kind of non-blocking behavior is implemented using callbacks (or in its most general form, continuation passing). Unfortunately this creates convoluted and verbose code: what if you need to block in the middle of a for loop, for example?

Kilim resolves this issue by allowing programmers to write code in a natural blocking style and then rewriting the bytecode to perform a form of CPS conversion, so that instead of blocking the code actually returns control to a scheduler. Orc supports Kilim, allowing site authors to write sites with internal concurrency and blocking behavior which don't use Java threads and cooperate with the Orc engine.

3.2.7.2. Kilim Fundamentals

For a full introduction to Kilim, see A Thread of One's Own, by Sriram Srinivasan. In the following, we will cover just enough to get you started writing Orc sites using Kilim.

Kilim introduces three key primitives:

Pausable

This checked exception marks methods which may block. You should never catch this exception, and you should always declare it even if you have already declared throws Exception or throws Throwable. In all other respects it follows the normal rules for checked exceptions: an override can only throw it if the superclass method throws it, and you must throw it if you call any method which throws it.

Originally Kilim used annotations to mark pausable methods. It turned out that javac would sometimes manipulate code in ways which violate the invariants regarding use of the annotation. Annotations also have the disadvantage that the invariants on their use are not checked automatically by tools like Eclipse.

Mailbox

A multiple-producer, single-consumer queue used to communicate between tasks. The most important methods are put(Object), which places an item in the mailbox, and get(), which blocks until the mailbox is non-empty and then returns the first item.

Task
Kilim analogue of a Thread. Every Pausable method must be run within a task. To create and run a task:
new Task() {
    public void execute() throws Pausable {
        // do some stuff
    }
}.start();

Together, these primitives allow you to write highly-concurrent Java programs in a familiar threaded style. Under the hood, Kilim can schedule any number of concurrent tasks on a fixed number of physical threads, providing better scalability and performance than possible with Java threads.

When writing Orc sites you rarely need to use tasks explicitly because every Orc site call is implicitly run inside a Kilim task if necessary. Therefore all you need to do is mark Pausable methods. Here's a short example of a buffer site written using Kilim Mailboxes:

public class KilimBuffer<V> {
	private LinkedList<Mailbox<V>> waiters =
		new LinkedList<Mailbox<V>>();
	private LinkedList<V> buffer = new LinkedList<V>();
	public synchronized void put(V o) {
		Mailbox<V> waiter = waiters.poll();
		if (waiter != null) waiter.putnb(o);
		else buffer.add(o);
	}
	public synchronized V get() throws Pausable {
		V out = buffer.poll();
		if (out != null) return out;
		else {
			Mailbox<V> waiter = new Mailbox<V>();
			waiters.add(waiter);
			return waiter.get();
		}
	}
}

Classes using Kilim can be imported like other Java classes using the Orc class declaration. For example, we can use the KilimBuffer class defined above:

class Buffer = orc.lib.state.KilimBuffer
val b = Buffer()
  Rtimer(1000) >> b.put("1 second later") >> null
| b.get()

Warning

The version of Kilim bundled with Orc does not support calls to pausable methods in constructor arguments. For example, this does not work:
new Foo(bar.pausable());
The workaround is to use a temporary variable:
Object tmp = bar.pausable();
new Foo(tmp);

3.2.7.3. When you must block

Sometimes blocking is unavoidable, for example if a site must perform blocking IO. For such cases, Orc provides a utility method orc.runtime.Kilim#runThreaded(Callable) which farms work out to a thread pool. The advantage of doing this over spawning your own thread is that there is no chance of using too many threads; if no thread is available, the method pauses until one becomes available. The disadvantage is that if you have too many Java methods which must communicate with each other concurrently and can't use Mailboxes, there won't be enough threads for them all and execution will deadlock. This situation is sufficiently rare that runThreaded is usually the correct approach.

3.2.7.4. Compiling Kilim Sites

Code which uses Kilim must be processed with the Kilim "weaver" after compiling and before running. The weaver is bundled with the Orc source code, or may be downloaded separately from the Kilim project site.

To process compiled classes, run kilim.tools.Weaver. If your .class files are in the build/ directory, you would run the weaver like this: java -cp kilim.jar:build/ kilim.tools.Weaver -d build/ build/

The Orc source distribution includes an ant kilim task (in build.xml) which weaves compiled classes in the build directory. If you are compiling Orc in Eclipse, an Eclipse "Builder" is included which does this automatically whenever any source files change.

3.3.  site Sites

3.3.1. Fundamentals

Orc sites imported with the site declaration are implemented as Java classes which extend orc.runtime.sites.Site. Here we will summarize the most important features of this and related classes; for a detailed description, refer to the Javadoc.

Sites must implement a single method: callSite(Args args, Token token). The Orc engine calls this method whenever the site is called by the Orc program.

The args argument (of type Args) is used to get the arguments which were passed to the site call by the Orc program. It has the following important methods:

getArg(int n)
returns the value of the (n+1)th argument, as an Object. For example, to get the value of the first argument, call args.getArg(0).
intArg(int n), stringArg(int n), boolArg(int n), etc.
returns the value of the (n+1)th argument cast to the appropriate type. If the argument is not the appropriate type, throws an ArgumentTypeMismatchException.

The token argument (of type Token) is a sort of callback which is used by the site to return a value or signal that it has halted. It has the following important methods:

resume(Object value)
return the value value from the site call. For example, to return the number 5, call token.resume(5). To publish a signal without returning any specific value, call token.resume().
error(TokenException problem)
halt without publishing a value and report an error. This method is an appropriate way to report Java exceptions encountered during a site call. Alternatively, a site can simply throw a checked exception from the callSite method; the engine will call token.error on its behalf. To convert a Java exception to the necessary TokenException type, wrap it in an instance of orc.error.runtime.JavaException.
die()
halt, without publishing a value or reporting an error. For example, the built-in site if uses this method to halt when called with a false argument.

The callSite method must return control to its caller within a short, deterministic amount of time. To implement a site which blocks or waits for some condition, callSite must save the token and return without calling any of the above methods. Then when it is time for the site call to return, call the appropriate method on the token.

3.3.2. More Site Classes

3.3.2.1. EvalSite

Many sites are deterministic, always returning a value immediately when called. Such sites can be implemented easily by extending orc.runtime.sites.EvalSite and implementing the method evaluate(Args args). This method should read the arguments from args as usual, and then return the value to be returned from the site call, or throw a checked exception to indicate an error.

3.3.2.2. ThreadedSite

Some sites need to perform blocking IO or use other Java blocking methods like Object#wait(). This cannot be done directly in the callSite or EvalSite#evaluate methods because that might block the Orc engine. Instead, extend orc.runtime.sites.ThreadedSite and implement the method evaluate(Args args). This method should read the arguments from args as usual, and then return the value to be returned from the site call, or throw a checked exception to indicate an error. Your evaluate method will be automatically run asynchronously in a new thread so that it does not interfere with the Orc engine.

Beware that the Orc engine may be configured to only allow a fixed number of site threads. If too many sites need to use threads simultaneously, some will block until threads become available.

3.3.2.3. DotSite

Recall that the notation x.msg corresponds to calling the site x with the special message value msg. To implement a site which responds to such messages, extend orc.runtime.sites.DotSite.

In your site's constructor, call the method addMember(String message, Object value) once for each message you wish the site to respond to. The call addMember(m, v) instructs the site to return the value v when it is called with the message m.

Any object can be used as a member value, even another site. Anonymous inner classes extending Site are often used as member values which behave like methods of the enclosing DotSite.

3.3.3. Tutorial Example

We will implement a simplified version of Orc's Buffer site, called ExampleBuffer. The goal will be to be able to run this Orc program:

-- Import the site definition
site ExampleBuffer = ExampleBuffer
-- Create a new buffer site
val b = ExampleBuffer()
-- wait for a value in the buffer
b.get()
-- put a value in the buffer
| b.put(3) >> stop
-- Publishes: 3

The first step is to create ExampleBuffer.java. The ExampleBuffer site is actually a discovery site: all it does when called is create and return a reference to a new site (the buffer instance).

public class ExampleBuffer extends EvalSite {
  public Object evaluate(Args args) {
    return new ExampleBufferInstance();
  }
}

Next we must implement ExampleBufferInstance, which defines the behavior of an individual buffer. We can put this class either in its own file or in ExampleBuffer.java, since that is the only class which refers to it directly. How does an instance behave? If we write b.get(), that means that b responds to the message get with a site which we call to get a value from the buffer. So we'll extend DotSite, telling it to respond to the message "get" with a GetMethod site, and likewise for "put". We'll use a linked list to store the actual contents of the buffer.

class ExampleBufferInstance extends DotSite {
  private LinkedList<Object> contents = new LinkedList<Object>();
  public ExampleBufferInstance() {
    addMember("get", new GetMethod());
    addMember("put", new PutMethod());
  }
}

The GetMethod site is implemented as an inner class so it has easy access to the contents of the buffer. In the implementation we first check if there is an item in the buffer. If so, we return it. Otherwise, we must store the token in a queue to be notified when the next item is put in the buffer. We synchronize on the outer object to protect against concurrent modification.

private LinkedList<Token> waiters = new LinkedList<Token>();
private class GetMethod extends Site {
  public void callSite(Args args, Token token) {
    synchronized (ExampleBufferInstance.this) {
      if (!contents.isEmpty()) {
        token.resume(contents.removeFirst());
      } else {
        waiters.add(token);
      } 
    }
  }
}

PutMethod is even simpler since it doesn't need to block. If there is a token waiting for an item, we give it the item. Otherwise we put the item in the buffer. We synchronize on the outer object to protect against concurrent modification. When done we return a dummy "signal" value.

private class PutMethod extends EvalSite {
  public Object evaluate(Args args) {
    Object item = args.getArg(0);
    synchronized (ExampleBufferInstance.this) { 
      if (!waiters.isEmpty()) {
        waiters.removeFirst().resume(item);
      } else {
        contents.add(item);
      } 
    }
    return signal();
  }
}

Putting it all together, here is the entire implementation:

import java.util.LinkedList;

import orc.runtime.sites.EvalSite;
import orc.runtime.sites.DotSite;
import orc.runtime.sites.Site;
import orc.runtime.Token;

public class ExampleBuffer extends EvalSite {
  public Object evaluate(Args args) {
    return new ExampleBufferInstance();
  }
}

class ExampleBufferInstance extends DotSite {
  private LinkedList<Object> contents = new LinkedList<Object>();
  private LinkedList<Token> waiters = new LinkedList<Token>();
    
  private class GetMethod extends Site {
    public void callSite(Args args, Token token) {
      synchronized (ExampleBufferInstance.this) {
        if (!contents.isEmpty()) {
          token.resume(contents.removeFirst());
        } else {
          waiters.add(token);
        } 
      }
    }
  }

  private class PutMethod extends EvalSite {
    public Object evaluate(Args args) {
      Object item = args.getArg(0);
      synchronized (ExampleBufferInstance.this) { 
        if (!waiters.isEmpty()) {
          waiters.removeFirst().resume(item);
        } else {
          contents.add(item);
        } 
      }
      return signal();
    }
  }

  public ExampleBufferInstance() {
    addMember("get", new GetMethod());
    addMember("put", new PutMethod());
  }
}

3.4. Web Services

3.4.1. Introduction

While we believe Orc is an excellent language for web service scripting, currently the library support for such tasks is at a proof-of-concept level. The web service libraries are not bundled with the core Orc distribution, do not have stable APIs, and are not officially documented. However this section should provide you with enough information to get started.

3.4.2. Downloading and Running Examples

All Orc sites related to web services are bundled in a separate "OrcSites" library. As with the core Orc distribution, you can either download this library as a prepackaged JAR or check out the source code from the "OrcSites" module in version control. We generally recommend the latter approach, since the source code provides several examples which can be used as a basis for creating your own web service sites.

Within the OrcSites source code you will find:

  • Java source code for web service sites: src/orc/lib/net/
  • Orc source code for programs using web services: examples/

Several of the examples require you to create .properties files and place them in your classpath. The simplest way to do this is to make sure the examples/ directory is in your classpath, and place the necessary .properties files there.

3.4.3. Protocols Supported

Web services use a variety of protocols, so there are a variety of ways to contact them. All of them boil down to creating a Java proxy site for the service and calling that. Previous sections explain how to implement such Java sites which can be called in Orc. Because web services tend to use blocking I/O, Java wrappers make frequent use of ThreadedSite and Kilim to ensure that web service calls don't block the Orc engine.

3.4.3.1. Java APIs

Some web services provide Java APIs specifically for the service. For example, Google Calendar. In these cases we just use the Java API from Orc, either directly or via a small Java wrapper which simplifies the interface.

OrcSites includes class orc.lib.net.GoogleCalendar as an example of this type of web service.

3.4.3.2. SOAP RPC

OrcSites includes a generic site orc.lib.net.Webservice which allows you to connect to any SOAP RPC (specifically rpc/encoded) service, without writing Java wrappers. Instead Apache Axis is used to generate Java wrappers on-demand.

http://www.xmethods.net is a good place to find examples of SOAP RPC services:

  1. Find the RPC "Style" service which does what you want.
  2. Click on the name of the service.
  3. Click on "View RPC Profile" for a summary of the methods available.
  4. Construct instances of the service by passing the WSDL URL to the Webservice site. E.g. Webservice("http://site.com/wsdl").
  5. Call methods on the service as you would with any Java object. The JAX-RPC specification has complete details on how SOAP operations and data types are represented in Java. Beware: method names always begin with a lower-case letter, even if the corresponding operation does not.

Example:

site orc.lib.net.Webservice
{-
Find documentation of this service at:
http://www.xmethods.net/ve2/WSDLRPCView.po?
key=uuid:BF3EFCDD-FCD4-8867-3AAC-068985E7CB89
-}
val service = Webservice(
  "http://www.ebob42.com/cgi-bin/"
  + "Romulan.exe/wsdl/IRoman")
service.intToRoman(451)

3.4.4. REST

Many services use ad-hoc REST/XML protocols. Unfortunately, there is no REST equivalent to WSDL, so there's no way to automatically generate an API for REST services. We're not trying to solve this problem with Orc, but when the web services community reaches some kind of consensus, Orc will support it.

For now you must write a Java wrapper for each service which handles marshalling and unmarshalling the data. There's no reason such marshalling code couldn't be written in Orc, calling low-level HTTP and XML libraries directly, but there would be no advantage to doing so, so we let each language play to its strengths. OrcSites includes utility classes to assist with submitting requests and parsing responses.

OrcSites includes the following examples of this type of service:

  • class orc.lib.net.Upcoming
  • site orc.lib.net.TrueRandom
  • site orc.lib.net.YahooSpellFactory

Chapter 4. Type Checking

The Orc language, as it is described in Chapter 1, is dynamically typed. If an operation occurs at runtime which is not type-correct, the call which attempted that operation becomes silent, and produces a type error that is reported on the console.

Orc also has an optional static typechecker, which will guarantee that a program is free of type errors before the program is run. For every expression in the program, the typechecker tries to find the types of values that it could publish, and checks that all such types are consistent. It performs a limited form of type inference, so it can discover many types automatically; however, the programmer must provide additional type information for defined functions, and in a few other specific cases.

The typechecker uses the local type inference algorithm described by Pierce and Turner in their paper, Local Type Inference. It extends this algorithm with polymorphic type constructors (e.g. List and Buffer), user-defined datatypes (which may also be polymorphic), and a typing strategy for external services. The typechecker supports both parametric polymorphism (generics) and inclusion polymorphism (subtyping), though it does not currently implement the extended version of the algorithm which allows both of them simultaneously (bounded polymorphism).

The typechecker is disabled by default, though typed syntax is still permitted (and types are still checked for syntax errors) even when the typechecker is not used. It may be enabled as a project property in the Eclipse plugin, or by using the -typecheck switch on the command line.

If the typechecker can verify that a program is correctly typed, it will display the message

... :: T

on the console. The symbol :: means "has type", and T is the type of any value that the program might publish.

4.1. Simple Typing

In this section we will see how Orc programs are typechecked, starting with simple expressions like constants and arithmetic operations, and working up to more interesting cases such as defined functions, the concurrency combinators, and calls to external services.

4.1.1. Values and Operators

Each of the Orc constants has the expected type:

  • true , false :: Boolean
  • ... -1, 0, 1 ... :: Integer
  • ... "orc" , "ceci n'est pas une |" ... :: String
  • ... 0.001, -1.5, 2.71828 ... :: Number
The expression signal has its own unique type, Signal.

Orc allows subtyping: some types are included in other types. For example, Integer is a subtype of Number, because all Integers are also Numbers.

Each of the primitive operators typechecks in the obvious way; for example && requires two Boolean operands and returns a Boolean result. Some of the arithmetic operators also support ad-hoc polymorphism; for example, a + expression with two Integer operands has type Integer, but with one or more Number operands it instead has type Number.

4.1.2.  Top and Bot

There are two other important types: Top and Bot.

Top is the universal type; it is the type of any value, and all other types are subtypes of it.

Bot is the empty type; no value has type Bot. It is a subtype of all other types. Expressions with type Bot are expressions that the typechecker can verify will never publish any values. In particular, stop will never publish, so it has type Bot.

Bot has an interesting status in Orc. In other typed languages, if an expression has type Bot, this usually indicates a guaranteed error, infinite loop, or other failure to return a value. Since sequential programming rarely involves subexpressions that are guaranteed never to return, Bot is usually just a curiosity or a formal artifact of the type system, and indeed many static type systems do not have a Bot type at all. In Orc, however, Bot is very useful, since it is frequently the case that Orc expressions are written to carry out ongoing concurrent activities but never publish any values, and the type system can use the type Bot to indicate that no publications will ever be seen from such expressions.

4.1.3. Tuples

The type of a tuple is a tuple of the types of each of its elements.

  • (3+3, true) :: (Integer, Boolean)
  • ("orc", (2, "orc")) :: (String, (Integer, String))

4.1.4. Combinators

4.1.4.1. Parallel

The type of F | G is the join of the types of F and G.

A common supertype of two types S and T is any type U such that S and T are both subtypes of U. The join J is the least common supertype of S and T, that is, J is a common supertype, and J is also a subtype of every other common supertype.

The most common case is the join of a type T with itself. Then the join is just T. This happens when both parallel branches publish the same type of values.

  • 3+4 | 5 :: Integer

If S is a subtype of T, the join of S and T is T. This happens when one branch is publishing a less specific type of value. To be safe, the whole expression then has that less specific type.

  • 3 | 3.1 :: Number

As a special case, the join of any type T with Bot is T. This occurs when one branch publishes values and the other one never publishes. Thus, it is safe to use the type of values from the first branch, since the second branch contributes no values at all.

  • "all quiet on the western front" | EasternFront() >> stop :: String

In any other case, the typechecker will try to infer the join. However, it may not always find the least supertype, and in some cases there may not even be a useful common supertype. In these cases it will default to the Top type, which is a supertype of all other types, and is therefore always a safe join, if not a very useful one.

  • 3 | "three" :: Top

4.1.4.2. Sequential

The type of F >x> G is the type of G using the assumption x :: T, where T is the type of F.

If a pattern is used instead of a variable, the structure of the pattern must match the structure of the type of F, and the types for each variable bound in G are taken from the corresponding pieces of the type of F. For example, in the expression F >(x,b)> G, where F :: (Integer, Boolean), G is typechecked with the assumptions x :: Integer and b :: Boolean.

4.1.4.3. Pruning

The type of F <x< G is the type of F using the assumption x :: T, where T is the type of G.

When a declaration val x = F occurs, the expression in scope for the declaration is typechecked using the assumption x :: T, where T is the type of F.

Patterns in the pruning combinator and in val declarations are typechecked in the same way as in the sequential combinator.

4.1.4.4. Otherwise

The type of F ; G is the join of the types of F and G, as described for the parallel combinator.

4.1.5. Conditionals

In the conditional expression if E then F else G, the expression E must have type Boolean, and the type of the whole expression is the join of the types of F and G, as described for the parallel combinator.

4.1.6. Defined Functions

Though the typechecker can infer the types of many expressions without additional information from the programmer, there are some cases where the algorithm does need assistance. In particular, whenever a function is defined (using the def keyword), the programmer must also include a signature for that function, supplying its argument types and return type.

A signature immediately precedes a function definition, and is written as follows:

def magsq(Number, Number) :: Number
def magsq(x,y) = x*x + y*y
Argument types replace the arguments, :: replaces =, and the return type (the type of the body expression) replaces the body expression. If the function is not recursive, then the return type is optional, because the typechecker can infer it from the body expression without additional information. So, in this case, we could have just written:
def magsq(Number, Number)
def magsq(x,y) = x*x + y*y

It is also possible to include the argument types and return type directly in a definition, rather than as a separate signature. The signature and definition above could be written together as:

def magsq(x :: Number, y :: Number) :: Number = x*x + y*y
An argument x with type T is replaced by x :: T in the argument list. The return type is inserted between the argument list and the = sign. This form is not typically used for functions with multiple clauses or pattern matches, though it is allowed in those cases so long as only one clause has the type information.

lambda expressions are a special case: the type information must be included directly in this way, since there is no way to declare the signature separately. Specifying the return type of a lambda is always optional, since it is not possible for an anonymous function to make a recursive call. Written as a typed lambda function, the magsq definition looks like:

lambda (x :: Number, y :: Number) = x*x + y*y
When the typechecker has enough information to check the type of a lambda against a given function type, rather than needing to infer it, the argument types of the lambda are also optional. This occurs, for example, when a lambda is given as an argument to a defined higher-order function.

Once defined, functions are values, and thus have a special type of their own. The type of a function is written very much like a signature, but using the lambda keyword instead of def and the function name.

  • magsq, defined earlier, has type lambda (Number, Number) :: Number
  • lambda (x :: Integer) = x+1 has type lambda (Integer) :: Integer
  • lambda () = signal has type lambda () :: Top
  • lambda (a :: Boolean, b :: Boolean) = (b,a) has type lambda (Boolean,Boolean) :: (Boolean,Boolean)

4.1.7. Ascriptions

In addition to annotation defined functions with a signature, the typechecker can also accept programmer-supplied type information for any expression. An ascription, written E :: T, asks the typechecker to verify that expression E has type T.

Normally, the typechecker would simply find the type of E by inference, but in certain situations it is easier to check a given type. For example, the typechecker may not be able to infer the correct join type for a parallel combinator, but it is always able to check that both branches are subtypes of an already provided type. Furthermore, adding extra type information makes it easier to pinpoint the source of a typechecking failure.

Ascriptions are also allowed in patterns; P :: T is a valid pattern, instructing the typechecker to verify that the value fragment bound by the pattern P will always be of type T.

4.1.8. Aliasing

When adding function signatures and ascriptions, it is often helpful to have simpler names for complex types that frequently occur. In the same way that variable binding can be used to give a name to a complex value, there is a type declaration that binds a complex type to a type variable. This is called aliasing, and it is accomplished using the type definition.
-- A rectangle is defined by its lower left and upper right coordinates.
type Rect = ((Number, Number), (Number, Number))

-- Transpose such a rectangle, rotating it around an x=y axis
def flip(Rect) :: Rect
def flip( ((a, b), (c, d)) ) = ((a, b), (d, c))
While it does not change the behavior of the typing algorithm, aliasing can substantially improve readability and make the intention of one's code much clearer.

4.1.9. Datatypes

We have already seen user-defined datatypes in the untyped Orc language. In the typed language, they are declared in the same way, except that the slots in the datatype, which previously were always _, now contain the actual types that go in those slots. As an example, here is a datatype for geometric shapes.
{- 
  Rect(w,h): A rectangle with width w and height h. 
  Circle(r): A circle with radius r.
  RegularPolygon(n,l): A regular polygon with n sides, each of length l.
-}
type Shape = Rect(Number, Number) | Circle(Number) | RegularPolygon(Integer, Number)
We will see how to declare generic datatypes, like Tree or Option, in the next section.

4.1.10. Interacting with External Services

In addition to verifying the correctness of code written within Orc itself, the typechecker must also ensure that external services are used correctly in the context of the Orc program.

When an external site is made available to Orc with the site declaration, its type is also made available to the Orc typechecker. The type of a site is itself treated like a service; it is passed the types of its arguments, and responds with a return type for those arguments. Thus, a site call can typecheck in ways not possible for function calls or other expressions. For example, sites can support ad-hoc polymorphism. Also, sites can respond to messages sent using the dot operator. In fact, Orc has no native record type, because the dot operator only applies to sites, so the type of the site can interpret the message internally and determine the correct type to return, if any.

Additionally, a site can introduce new types into Orc. For example, the Semaphore site responds with new instances of semaphores. And just as sites can be declared using site, these types can be declared using type. Typically they are represented as Java classes, just as sites are. As an example, here is a program which declares the Semaphore type, and then defines a function which "toggles" a pair of semaphores by acquiring the first and releasing the second.

type Semaphore = orc.lib.state.types.SemaphoreType

def toggle(Semaphore, Semaphore) :: Top
def toggle(s, t) = s.acquire() >> t.release()
Notice that it is not possible to typecheck toggle without first giving a name to the Semaphore type, because it is used in the signature of toggle.

type definitions occur alongside site definitions in the standard library, so the types returned by many of the library functions are already available.

4.1.11. Assertions

While the typechecker can be helpful, it will not accept programs that are not safe according to its algorithm, which can be burdensome when the programmer knows that an expression will have a certain type but the typechecker cannot verify it.

Since the typechecker is optional, it can always be turned off in these cases. But this is often too drastic a solution: typechecking difficulties often arise from small segments of a much larger program, and the rest of the program still benefits from typechecking.

Fortunately, the typechecker can be selectively disabled for parts of a program. For this purpose, the typechecker supports a special operation called an assertion, written E :!: T, where E is an expression and T is its asserted type. An assertion is used like an ascription, but rather than verifying that E has type T, the typechecker instead assumes that E has type T, without inspecting E at all. Thus, the programmer can supply the correct type T without being restricted by the typechecking algorithm.

This feature should be used sparingly, with the knowledge that it does compromise the integrity of the typechecking algorithm. If the supplied type is wrong, runtime type errors could propagate to any part of the program that depends on that type. Assertions are useful for rapid prototyping, but they are not recommended for production code.

4.2. Polymorphism

In this section we examine a powerful feature of Orc's typechecker: parametric polymorphism. This allows us to define types that take other types as parameters; the same theory underlies Java's generics.

4.2.1. Parametric types

What is the type of [1,2,3]? It is a list, containing Integers. What about [true, true]? Again a list, but it contains Booleans instead. It would be very cumbersome to have separate types such as IntegerList and BooleanList for each of these possibilities, and in fact infinitely many such types are possible. What we would like instead is one List type, with a parameter which gives us the types contained in the list.

Such a type is called a parametric, or generic, type. It contains two parts: a type operator, such as List, followed by a sequence of paramemters enclosed in brackets [...]. For example, [1,2,3] :: List[Integer], and [true,true] :: List[Boolean].

Lists are not the only parametric type. The standard library includes other parametric types, such as Option, Buffer, and Cell. Type operators, such as List, are declared using the same type declarations as any other type. For example, the declaration

type Buffer = orc.lib.state.types.BufferType
is part of the standard library; it declares the Buffer type operator.

4.2.2. Parametric functions

How do we write functions that use parametric types? Consider the following definition of the append function, which appends two lists:

def append[T](List[T], List[T]) :: List[T]
def append([], l) = l
def append(h::t, l) = h::append(t,l)
The function append has a type argument, T, in its signature. The type T is the type of elements in the lists that we are appending. Notice that both argument lists must contain the same type of elements. The resulting list contains elements of that same type.

4.2.3. Generic calls

When calling the append function, in addition to providing its normal arguments, we must also provide its type argument:

append[Integer]([1,2,3], [4,5])
However, it would be very burdensome and verbose to provide type arguments to all such calls. Fortunately, in most cases, the type checker can infer the correct type arguments, in the same way that it infers the correct type for many expressions without any additional information. So in this case, we can simply write:
append([1,2,3], [4,5])
and the typechecker infers that the parameter T is Integer, since both argument lists are of type List[Integer]. For a more thorough explanation of how this inference occurs, please refer to Pierce and Turner's paper.

Inference of type arguments will always fail on certain kinds of calls, because the typechecker does not have enough information to infer the correct type. The most common case is a site call which constructs a parametric type without taking any arguments. For example:

val b = Buffer()
will never typecheck, since there is no way for the typechecker to know what type of elements the buffer should contain. In other languages such as ML, the typechecker might be able to infer this information from the rest of the program, but Orc's typechecker is based on local type inference, which must find the information locally, such as from the types of the arguments. So, to construct a buffer that will contain Numbers, a type parameter must be given:
val b = Buffer[Number]()

4.2.4. Polymorphic aliases

A type alias may have type parameters. For example, here is a type alias for triples:

type Triple[T] = (T,T,T)
Triple is now a type operator that takes one parameter. Any occurrence of Triple[T] is equivalent to (T,T,T).

4.2.5. Polymorphic datatypes

Datatypes can also have type parameters. This allows a programmer to define very useful generic data structures. For example, this is necessary to define a reasonable datatype for trees:

type Tree[T] = Node(Tree[T], Tree[T], T) | Leaf()
Notice that the datatype is recursive, and the type parameter T is passed recursively to Tree to define the type of the subtrees.

4.3. Interacting with Java

When a Java class is made accessible to Orc using the class declaration, the Orc typechecker interacts with the Java type system to make Java types available in Orc. This feature is experimental and does not allow access to all of Java's typing features (for example, interfaces are not fully supported), but it does allow the typechecker to handle many of the common cases.

The type returned by the constructor of a class has the same name in Orc as the class. It is the Orc type of Java objects of that class.

class File = java.io.File
File("test.orc") :: File

If a class declaration binds a generic Java class, that type shows up in Orc as a parametric type, and its constructor takes type arguments. For example:

class TreeSet = java.util.TreeSet
TreeSet[String]() :: TreeSet[String]

There is a correspondence between Orc's primitive types and the equivalent classes in Java; they are interchangeable. For example, a call expecting a java.lang.Integer can be given an Orc Integer.

Subtyping between Java object types is determined by Java's subtyping relation: if one class is a subclass of the other, then one type will be a subtype of the other. The typechecker does not implement support a full join operation for Java types; it will not find the least common ancestor of two classes as the join.

4.4. Syntax of Typed Orc

The following figure summarizes the syntax of typed Orc as an extension of untyped Orc. The original Orc grammar rules are abbreviated by ellipses (...).

Table 4.1. Typed Orc

T::= Type  
  X  Type variable
 | Integer | Boolean | String | Number | Signal | Top | Bot   Ground type
 | ( T , ... , T )   Tuple type
 | lambda [ X , ... , X ] ( T , ... , T ) :: T  Function type
 |X[ T , ... , T ]   Type application
D::= ... Declaration  
 | type X = classname   type import
 | type X[ X , ... , X ] = T  type alias
 | type X[ X , ... , X ] = TC | ... | TC   datatype declaration (typed)
 | def X[ X , ... , X ] ( T , ... , T ) :: T  function signature
TC::= X( T , ... , T ) Constructor (typed)  
E::=... Expression  
 |E :: T  type ascription
 |E :!: T  type assertion
 | lambda [ T , ... , T ] ( P , ... , P ) :: T = E   closure (typed)
G::=... Argument group  
   [ T , ... , T ] ( E , ... , E )   type parameters plus arguments
P::=... Pattern  
 |P :: T  type ascription

Appendix A. Complete Syntax of Orc

Table A.1. Complete Syntax of Orc

E::=  Expression  
  C  constant value
 |X  variable
 | stop   silent expression
 | ( E , ... , E )   tuple
 | [ E , ... , E ]   list
 |E G+  call
 | op E  prefix operator
 |E op E  infix operator
 |E >P> E  sequential combinator
 |E | E  parallel combinator
 |E <P< E  pruning combinator
 |E ; E  otherwise combinator
 | lambda ( P , ... , P ) = E  closure (untyped)
 | lambda [ T , ... , T ] ( P , ... , P ) :: T = E   closure (typed)
 | if E then E else E   conditional
 |D E  scoped declaration
 |E :: T  type ascription
 |E :!: T  type assertion
G::=  Argument group  
   ( E , ... , E )   arguments
 | [ T , ... , T ] ( E , ... , E )   type parameters plus arguments
 | . field   field access
 | ?   dereference
C::= Boolean | Number | String | signal | null Constant  
X::= identifier Variable  
D::=  Declaration  
   val P = E  value declaration
 | site X = address   site declaration
 | class X = classname   class declaration
 | include " filename "   inclusion
 | def X( P , ... , P ) = E  function declaration
 | def X[ X , ... , X ] ( T , ... , T ) :: T  function signature
 | type X = classname   type import
 | type X[ X , ... , X ] = T  type alias
 | type X = UC | ... | UC   datatype declaration (untyped)
 | type X[ X , ... , X ] = TC | ... | TC   datatype declaration (typed)
UC::= X(_, ... ,_) Constructor (untyped)  
TC::= X( T , ... , T ) Constructor (typed)  
P::=  Pattern  
  X  variable
 |C  constant
 | _   wildcard
 |X ( P , ... , P )   datatype pattern
 | ( P , ... , P )   tuple pattern
 | [ P , ... , P ]   list pattern
 |P : P  cons pattern
 |P as X  as pattern
 | =X  equality pattern
 |P :: T  type ascription
T::= Type  
  X  Type variable
 | Integer | Boolean | String | Number | Signal | Top | Bot   Ground type
 | ( T , ... , T )   Tuple type
 | lambda [ X , ... , X ] ( T , ... , T ) :: T  Function type
 |X[ T , ... , T ]   Type application

Where relevant, syntactic constructs are ordered by precedence, from highest to lowest. For example, among expressions, calls have higher precedence than any of the combinators, which in turn have higher precedence than conditionals.

Appendix B. Standard Library

B.1. Overview

The standard library is a set of declarations implicitly available to all Orc programs. In this section we give an informal description of the standard library, including the type of each declaration and a short explanation of its use.

Orc programs are expected to rely on the host language and environment for all but the most essential sites. For example, in the Java implementation of Orc, the entire Java standard library is available to Orc programs via class declarations. Therefore the Orc standard library aims only to provide convenience for the most common Orc idioms, not the complete set of features needed for general-purpose programming.

B.2. Types and Notation

The standard library is fully compatible with the static type checker; all library declarations have associated type declarations, which also serve as helpful documentation.

The documentation of library functions uses special notation for parametric types that have dot-accessible members. Member names are written in the form Type.member, e.g. Foo.get refers to the get member of an object of type Foo. The object type can include type variables which are referenced by the member type, so for example site Buffer[A].get() :: A means that when the get method is called on a Buffer holding an arbitrary element type A, it will return a value of the same type.

B.3. Reference

B.3.1. core.inc: Fundamental sites and operators.

Fundamental sites and operators.

These declarations include both prefix and infix sites (operators). For consistency, all declarations are written in prefix form, with the site name followed by the operands. When the site name is surrounded in parentheses, as in (+), it denotes an infix operator.

For a more complete description of the built-in operators and their syntax, see the Operators section of the User Guide.

let

site let() :: Top

When applied to no arguments, return a signal.

let

site let[A](A) :: A

When applied to a single argument, return that argument (behaving as the identity function).

let

site let[A, ...](A, ...) :: (A, ...)

When applied to two or more arguments, return the arguments in a tuple.

if

site if(Boolean) :: Top

Fail silently if the argument is false. Otherwise return a signal.

Example:

-- Publishes: "Always publishes"
  if(false) >> "Never publishes"
| if(true) >> "Always publishes"

error

site error(String) :: Bot

Halt with the given error message.

Example, using error to implement assertions:

def assert(b) =
  if b then signal else error("assertion failed")

-- Fail with the error message: "assertion failed"
assert(false)

(+)

site (+)(Number, Number) :: Number

a+b returns the sum of a and b.

(-)

site (-)(Number, Number) :: Number

a-b returns the value of a minus the value of b.

(0-)

site (0-)(Number) :: Number

Return the additive inverse of the argument. When this site appears as an operator, it is written in prefix form without the zero, i.e. (-a)

(*)

site (*)(Number, Number) :: Number

a*b returns the product of a and b.

(**)

site (**)(Number, Number) :: Number

a ** b returns ab , i.e. a raised to the bth power.

(/)

site (/)(Number, Number) :: Number

a/b returns a divided by b. If both arguments have integral types, (/) performs integral division, rounding towards zero. Otherwise, it performs floating-point division. If b=0, a/b halts with an error.

Example:

  7/3   -- publishes 2
| 7/3.0 -- publishes 2.333...

(%)

site (%)(Number, Number) :: Number

a%b computes the remainder of a/b. If a and b have integral types, then the remainder is given by the expression a - (a/b)*b. For a full description, see the Java Language Specification, 3rd edition.

(<)

site (<)(Top, Top) :: Boolean

a < b returns true if a is less than b, and false otherwise.

(<=)

site (<=)(Top, Top) :: Boolean

a <= b returns true if a is less than or equal to b, and false otherwise.

(>)

site (>)(Top, Top) :: Boolean

a > b returns true if a is greater than b, and false otherwise.

(>=)

site (>=)(Top, Top) :: Boolean

a >= b returns true if a is greater than or equal to b, and false otherwise.

(=)

site (=)(Top, Top) :: Boolean

a = b returns true if a is equal to b, and false otherwise. The precise definition of "equal" depends on the values being compared, but always obeys the rule that if two values are considered equal, then one may be substituted locally for the other without affecting the behavior of the program.

Two values with the same object identity are always considered equal. In addition, Cor constant values and data structures are considered equal if their contents are equal. Other types are free to implement their own equality relationship provided it conforms to the rules given here.

Note that although values of different types may be compared with =, the substitutability principle requires that such values are always considered inequal, i.e. the comparison will return false.

(/=)

site (/=)(Top, Top) :: Boolean

a/=b returns false if a=b, and true otherwise.

(~)

site (~)(Boolean) :: Boolean

Return the logical negation of the argument.

(&&)

site (&&)(Boolean, Boolean) :: Boolean

Return the logical conjunction of the arguments. This is not a short-circuiting operator; both arguments must be evaluated and available before the result is computed.

(||)

site (||)(Boolean, Boolean) :: Boolean

Return the logical disjunction of the arguments. This is not a short-circuiting operator; both arguments must be evaluated and available before the result is computed.

(:)

site (:)[A](A, List[A]) :: List[A]

The list a:b is formed by prepending the element a to the list b.

Example:

-- Publishes: (3, [4, 5])
3:4:5:[] >x:xs> (x,xs)

In patterns, the (:) deconstructor can be applied to a variety of list-like values such as Arrays and Java Iterables, in which case it returns the first element of the list-like value, and a new list-like value (not necessarily of the same type as the original list-like value) representing the tail. Modifying the structure of the original value (e.g. adding an element to an Iterable) may render old "tail"s unusable, so you should refrain from modifying a value while you are deconstructing it. This feature is highly experimental and will probably change in future versions of the implementation.

abs

def abs(Number) :: Number

Return the absolute value of the argument.

Implementation. 

def abs(Number) :: Number
def abs(x) = if x < 0 then -x else x

signum

def signum(Number) :: Number

signum(a) returns -1 if a<0, 1 if a>0, and 0 if a=0.

Implementation. 

def signum(Number) :: Number
def signum(x) =
  if x < 0 then -1
  else if x > 0 then 1
  else 0

min

def min[A](A,A) :: A

Return the lesser of the arguments. If the arguments are equal, return the first argument.

Implementation. 

def min[A](A,A) :: A
def min(x,y) = if y < x then y else x

max

def max[A](A,A) :: A

Return the greater of the arguments. If the arguments are equal, return the second argument.

Implementation. 

def max[A](A,A) :: A
def max(x,y) = if x > y then x else y

floor

site floor(Number) :: Integer

Return the greatest integer less than or equal to this number.

ceil

site ceil(Number) :: Integer

Return the least integer greater than or equal to this number.

B.3.2. state.inc: General-purpose supplemental data structures.

General-purpose supplemental data structures.

Some

site Some[A](A) :: Option[A]

An optional value which is available. This site may also be used in a pattern.

Example:

-- Publishes: (3,4)
Some((3,4)) >s> (
    s >Some((x,y))> (x,y)
  | s >None()> signal
)

None

site None[A]() :: Option[A]

An optional value which is not available. This site may also be used in a pattern.

Semaphore

site Semaphore(Integer) :: Semaphore

Return a semaphore with the given value. The semaphore maintains the invariant that its value is always non-negative.

An example using a semaphore as a lock for a critical section:

-- Prints:
-- Entering critical section
-- Leaving critical section
val lock = Semaphore(1)
lock.acquire() >>
println("Entering critical section") >>
println("Leaving critical section") >>
lock.release()

acquire

site Semaphore.acquire() :: Top

If the semaphore's value is greater than 0, decrement the semaphore and return a signal. If the semaphore's value is 0, block until it becomes greater than 0.

acquirenb

site Semaphore.acquirenb() :: Top

If the semaphore's value is greater than 0, decrement the semaphore and return a signal. If the semaphore's value is 0, halt.

release

site Semaphore.release() :: Top

If any calls to acquire are blocked, allow the oldest such call to return. Otherwise, increment the value of the semaphore. This may increment the value beyond that with which the semaphore was constructed.

snoop

site Semaphore.snoop() :: Top

If any calls to acquire are blocked, return a signal. Otherwise, block until some call to acquire blocks.

snoopnb

site Semaphore.snoopnb() :: Top

If any calls to acquire are blocked, return a signal. Otherwise, halt.

Buffer

site Buffer[A]() :: Buffer[A]

Create a new buffer (FIFO channel) of unlimited size. A buffer supports get, put and close operations.

A buffer may be either empty or non-empty, and either open or closed. When empty and open, calls to get block. When empty and closed, calls to get halt. When closed, calls to put halt. In all other cases, calls return normally.

Example:

-- Publishes: 10
val b = Buffer()
  Rtimer(1000) >> b.put(10) >> stop
| b.get()

get

site Buffer[A].get() :: A

Get an item from the buffer. If the buffer is open and no items are available, block until one becomes available. If the buffer is closed and no items are available, halt.

getnb

site Buffer[A].getnb() :: A

Get an item from the buffer. If no items are available, halt.

put

site Buffer[A].put(A) :: Top

Put an item in the buffer. If the buffer is closed, halt.

close

site Buffer[A].close() :: Top

Close the buffer and block until it is empty. This has the effect of immediately causing any blocked calls to get to halt. In addition, any subsequent calls to put will halt, and once the buffer becomes empty, any subsequent calls to get will halt.

closenb

site Buffer[A].closenb() :: Top

Close the buffer and return a signal immediately. This has the effect of immediately causing any blocked calls to get to halt. In addition, any subsequent calls to put will halt, and once the buffer becomes empty, any subsequent calls to get will halt.

isClosed

site Buffer[A].isClosed() :: Boolean

If the buffer is currently closed, return true, otherwise return false.

getAll

site Buffer[A].getAll() :: List[A]

Get all of the items currently in the buffer, emptying the buffer and returning a list of the items in the order they were added. If there are no items in the buffer, return an empty list.

BoundedBuffer

site BoundedBuffer[A](Integer) :: BoundedBuffer[A]

Create a new buffer (FIFO channel) with the given number of slots. Putting an item into the buffer fills a slot, and getting an item opens a slot. A buffer with zero slots is equivalent to a synchronous channel.

A bounded buffer may be empty, partly filled, or full, and either open or closed. When empty and open, calls to get block. When empty and closed, calls to get halt. When full and open, calls to put block. When closed, calls to put halt. In all other cases, calls return normally.

Example:

-- Publishes: "Put 1" "Got 1" "Put 2" "Got 2"
val c = BoundedBuffer(1)
  c.put(1) >> "Put " + 1
| c.put(2) >> "Put " + 2
| Rtimer(1000) >> (
    c.get() >n> "Got " + n
  | c.get() >n> "Got " + n
  )

get

site BoundedBuffer[A].get() :: A

Get an item from the buffer. If the buffer is open and no items are available, block until one becomes available. If the buffer is closed and no items are available, halt.

getnb

site BoundedBuffer[A].getnb() :: A

Get an item from the buffer. If no items are available, halt.

put

site BoundedBuffer[A].put(A) :: Top

Put an item in the buffer. If no slots are open, block until one becomes open. If the buffer is closed, halt.

putnb

site BoundedBuffer[A].putnb(A) :: Top

Put an item in the buffer. If no slots are open, halt. If the buffer is closed, halt.

close

site BoundedBuffer[A].close() :: Top

Close the buffer and block until it is empty. This has the effect of immediately causing any blocked calls to get to halt. In addition, any subsequent calls to put will halt, and once the buffer becomes empty, any subsequent calls to get will halt. Note that any blocked calls to put initiated prior to closing the buffer may still be allowed to return as usual.

closenb

site BoundedBuffer[A].closenb() :: Top

Close the buffer and return a signal immediately. This has the effect of immediately causing any blocked calls to get to halt. In addition, any subsequent calls to put will halt, and once the buffer becomes empty, any subsequent calls to get will halt. Note that any blocked calls to put initiated prior to closing the buffer may still be allowed to return as usual.

isClosed

site BoundedBuffer[A].isClosed() :: Boolean

If the buffer is currently closed, return true, otherwise return false.

getOpen

site BoundedBuffer[A].getOpen() :: Integer

Return the number of open slots in the buffer. Because of concurrency this value may become out-of-date so it should only be used for debugging or statistical measurements.

getBound

site BoundedBuffer[A].getBound() :: Integer

Return the total number of slots (open or filled) in the buffer.

getAll

site BoundedBuffer[A].getAll() :: [A]

Get all of the items currently in the buffer or waiting to be added, emptying the buffer and returning a list of the items in the order they were added. If there are no items in the buffer or waiting to be added, return an empty list.

SyncChannel

site SyncChannel[A]() :: SyncChannel[A]

Create a synchronous channel, or rendezvous.

Example:

-- Publish: 10
val c = SyncChannel()
  c.put(10)
| Rtimer(1000) >> c.get()

get

site SyncChannel[A].get() :: A

Receive an item over the channel. If no sender is available, block until one becomes available.

put

site SyncChannel[A].put(A) :: Top

Send an item over the channel. If no receiver is available, block until one becomes available.

Cell

site Cell[A]() :: Cell[A]

Create a write-once storage location.

Example:

-- Publishes: 5 5
val c = Cell()
  c.write(5) >> c.read()
| Rtimer(1) >> ( c.write(10) ; c.read() )

read

site Cell[A].read() :: A

Read a value from the cell. If the cell does not yet have a value, block until it receives one.

readnb

site Cell[A].readnb() :: A

Read a value from the cell. If the cell does not yet have a value, halt.

write

site Cell[A].write() :: Top

Write a value to the cell. If the cell already has a value, halt.

Ref

site Ref[A]() :: Ref[A]

Create a rewritable storage location without an initial value.

Example:

val r = Ref()
Rtimer(1000) >> r := 5 >> stop
| println(r?) >>
  r := 10 >>
  println(r?) >>
  stop

Ref

site Ref[A](A) :: Ref[A]

Create a rewritable storage location initialized to the provided value.

read

site Ref[A].read() :: A

Read the value of the ref. If the ref does not yet have a value, block until it receives one.

readnb

site Ref[A].readnb() :: A

Read the value of the ref. If the ref does not yet have a value, halt.

write

site Ref[A].write(A) :: Top

Write a value to the ref.

(?)

def (?)[A](Ref[A]) :: A

Get the value held by a reference. x? is equivalent to x.read().

Implementation. 

def (?)[A](Ref[A]) :: A
def (?)(r) = r.read()

(:=)

def (:=)[A](Ref[A], A) :: Top

Set the value held by a reference. x := y is equivalent to x.write(y).

Implementation. 

def (:=)[A](Ref[A], A) :: Top
def (:=)(r,v) = r.write(v)

swap

def swap[A](Ref[A], Ref[A]) :: Top

Swap the values in two references.

Implementation. 

def swap[A](Ref[A], Ref[A]) :: Top
def swap(r,s) = (r?,s?) >(rval,sval)> (r := sval, s := rval) >> signal

Array

site Array[A](Integer) :: Array[A]

Create a new native array of the given size. The array is initialized to contain nulls.

The resulting array can be called directly with an index, as if its type were lambda (Integer) :: Ref[A]. In this case, it returns a Ref pointing to the element of the array specified by an index, counting from 0. Changes to the array are reflected immediately in the ref and visa versa.

Simple example:

-- Publishes: 3
val a = Array(1)
a(0) := 3 >>
a(0)?

More complex example:

-- Publishes: 0 1 2
val a = Array(3)
for(0, a.length()) >i>
a(i) := i >>
stop
; a(0)? | a(1)? | a(2)?

Array

site Array[A](Integer, String) :: Array[A]

Create a new primitive array of the given size with the given primitive type. The initial values in the array depend on the primitive type: for numeric types, it is 0; for booleans, false; for chars, the character with codepoint 0.

The element type of the array should be the appropriate wrapper type for the given primitive type, although a typechecker may not be able to verify this. This constructor is only necessary when interfacing with certain Java libraries; most programs will just use the Array(Integer) constructor.

get

site Array[A].get(Integer) :: A

Get the element of the array given by the index, counting from 0. a.get(i) is equivalent to a(i)?.

set

site Array[A].set(Integer, A) :: Top

Set the element of the array given by the index, counting from 0. a.set(i,v) is equivalent to a(i) := v.

slice

site Array[A].slice(Integer, Integer) :: Array[A]

Return a copy of the portion of the array with indices covered by the given half-open range. The result array is still indexed counting from 0.

length

site Array[A].length() :: Integer

Return the size of the array.

fill

site Array[A].fill(A) :: Top

Set every element of the array to the given value. The given value is not copied, but is shared by every element of the array, so for example a.fill(Semaphore(1)) would allow you to access the same semaphore from every element a.

This method is primarily useful to initialize or reset an array to a constant value, for example:

-- Publishes: 0 0 0
val a = Array(3)
a.fill(0) >> each(a)

IArray

def IArray[A](Integer, lambda (Integer) :: A)(Integer) :: A

The call IArray(n,f), where n is a natural number and f a total function over natural numbers, creates and returns a partial, pre-computed version of f restricted to the range (0, n-1). If f halts on any number in this range, the call to IArray will halt.

The user may also think of the call as returning an array whose ith element is f(i).

This function provides a simple form of memoisation; we avoid recomputing the value of f(i) by storing the result in an array.

Example:

val a = IArray(5, fib)
-- Publishes the 4th number of the fibonnaci sequence: 5
a(3)

Implementation. 

def IArray[A](Integer, lambda (Integer) :: A)(Integer) :: A
def IArray(n, f) =
  val a = Array[A](n)
  def fill(Integer, lambda (Integer) :: A) :: Top
  def fill(i, f) =
    if i < 0 then signal
    else (a.set(i, f(i)), fill(i-1, f)) >> signal
  fill(n-1, f) >> a.get

Set

site Set[A]() :: Set[A]

Construct an empty mutable set. The set considers two values a and b to be the same if and only if a=b. This site conforms to the Java interface java.util.Set, except that it obeys Orc rules for equality of elements rather than Java rules.

add

site Set[A].add(A) :: Boolean

Add a value to the set, returning true if the set did not already contain the value, and false otherwise.

remove

site Set[A].remove(Top) :: Boolean

Remove a value from the set, returning true if the set contained the value, and false otherwise.

contains

site Set[A].contains(Top) :: Boolean

Return true if the set contains the given value, and false otherwise.

isEmpty

site Set[A].isEmpty() :: Boolean

Return true if the set contains no values.

clear

site Set[A].clear() :: Top

Remove all values from the set.

size

site Set[A].size() :: Integer

Return the number of unique values currently contained in the set.

Map

site Map[K,V]() :: Map[K,V]

Construct an empty mutable map from keys to values. Each key contained in the map is associated with exactly one value. The mapping considers two keys a and b to be the same if and only if a=b. This site conforms to the Java interface java.util.Map, except that it obeys Orc rules for equality of keys rather than Java rules.

put

site Map[K,V].put(K, V) :: V

map.put(k,v) associates the value v with the key k in map, such that map.get(k) returns v. Return the value previously associated with the key, if any, otherwise return Null().

get

site Map[K,V].get(K) :: V

Return the value currently associated with the given key, if any, otherwise return Null().

remove

site Map[K,V].remove(Top) :: V

Remove the given key from the map. Return the value previously associated with the key, if any, otherwise return Null().

containsKey

site Map[K,V].containsKey(Top) :: Boolean

Return true if the map contains the given key, and false otherwise.

isEmpty

site Map[K,V].isEmpty() :: Boolean

Return true if the map contains no keys.

clear

site Map[K,V].clear() :: Top

Remove all keys from the map.

size

site Map[K,V].size() :: Integer

Return the number of unique keys currently contained in the map.

Counter

site Counter(Integer) :: Counter

Create a new counter initialized to the given value.

Counter

site Counter() :: Counter

Create a new counter initialized to zero.

inc

site Counter.inc() :: Top

Increment the counter.

dec

site Counter.dec() :: Top

If the counter is already at zero, halt. Otherwise, decrement the counter and return a signal.

onZero

site Counter.onZero() :: Top

If the counter is at zero, return a signal. Otherwise block until the counter reaches zero.

value

site Counter.value() :: Integer

Return the current value of the counter.

Example:

-- Publishes five signals
val c = Counter(5)
repeat(c.dec)

Dictionary

site Dictionary() :: Dictionary

Create a new dictionary (a mutable map from field names to values), initially empty. The first time each field of the dictionary is accessed (using dot notation), the dictionary creates and returns a new empty Ref which will also be returned on subsequent accesses of the same field. Dictionaries allow you to easily create object-like data structures.

Example:

-- Prints: 1 2
val d = Dictionary()
  println(d.one.read()) >>
  println(d.two.read()) >>
  stop
| d.one.write(1) >>
  d.two.write(2) >>
  stop

Here is the same example rewritten using Orc's reference syntax to improve clarity:

-- Prints: 1 2
val d = Dictionary()
  println(d.one?) >>
  println(d.two?) >>
  stop
| d.one := 1 >>
  d.two := 2 >>
  stop

To create a multi-level dictionary, you must explicitly create sub-dictionaries for each field. For example:

-- Prints: 2
val d = Dictionary()
d.one := Dictionary() >>
d.one?.two := 2 >>
println(d.one?.two?) >>
stop

Note that you cannot write d.one.two: because d.one is a reference to a dictionary, and not simply a dictionary, you must dereference before accessing its fields, as in d.one? >x> x.two. For readers familiar with the C language, this is the same reason you must write s->field instead of s.field when s is a pointer to a struct.

Record

site Record(String, A, String, B, ...) :: Record[A, B, ...]

Create a new record (an immutable map from field names to values). Arguments are consumed in pairs; the first argument of each pair is the key, and the second is the value for that key.

To access the value in record r for key "x", use the syntax r.x. For example:

-- Publishes: 1
val r = Record(
  "one", 1,
  "two", 2)
r.one

fst

def fst[A,B]((A,B)) :: A

Return the first element of a pair.

Implementation. 

def fst[A,B]((A,B)) :: A
def fst((x,_)) = x

snd

def snd[A,B]((A,B)) :: B

Return the second element of a pair.

Implementation. 

def snd[A,B]((A,B)) :: B
def snd((_,y)) = y

Interval

site