Overview#

The main purpose of deparallelization is to remove pruning and parallel combinators to produce an expression which uses only sequencing and "otherwise" (which can be interpreted efficiently or compiled into bytecode). The goal is to deparallelize first-order expressions written in Cor + | + >>.

Theory#

An expression is contained if it calls only contained sites. A contained site is any site whose observed behavior is not affected by, and cannot affect, other site calls. Some examples of contained sites:

  • + is contained: obviously
  • if is contained: contained sites are allowed to halt
  • / is contained: contained sites may trigger an error
  • println is contained: although it prints to stdout, this side-effect cannot affect the behavior of the program
  • random is contained: although it is non-functional, any relationship to concurrent calls cannot be observed
  • x := 3 is not contained: the side-effect of updating x may be observed
  • x? is not contained: the behavior depends on writes to x
  • Rtimer is not contained: the behavior depends on a shared clock (this case shows the definition of containment needs clarification)
The predicate contained[] is defined by:
contained[M(x)] = M is contained
contained[f <x< g] = contained[f] and contained[g]
contained[f >x> g] = contained[f] and contained[g]
contained[f ; g] = contained[f] and contained[g]
contained[f | g] = contained[f] and contained[g]

An expression is pure if it calls only pure sites. A pure site is any site whose publication is determined only by the values of its arguments, and has no observable behavior aside from publication (or lack thereof). This corresponds to the mathematical notion of a function. All pure expressions are contained, but not all contained expressions are pure:

  • / is not pure because it may print an error message
  • println is not pure because it prints to stdout
  • random is not pure because its publications vary even with the same arguments
  • if is pure: pure sites are allowed to halt (without error)
The predicate pure[] is defined by:
pure[M(x)] = M is pure
pure[f <x< g] = pure[f] and pure[g]
pure[f >x> g] = pure[f] and pure[g]
pure[f ; g] = pure[f] and pure[g]
pure[f | g] = pure[f] and pure[g]

An expression is sequential (maybe "deterministic" is a better term?) if it does not use the parallel combinator. The predicate seq[] is defined by:

seq[M(x)] = true
seq[f <x< g] = seq[f]
seq[f >x> g] = seq[f] and seq[g]
seq[f ; g] = seq[f] and seq[g]
seq[f | g] = false

An expression is safe if it publishes at least one value. The predicate safe[] is defined by:

safe[M(x), E] = E[x] and M is a safe site
safe[f <x< g, E] = safe[f, E+x:safe[g, E]]
safe[f >x> g, E] = safe[f] and safe[g, E+x:true]
safe[f ; g, E] = safe[f, E] or safe[g, E]
safe[f | g, E] = safe[f, E] or safe[g, E]

Here's a more refined definition of safe[] which evaluates to a statement of predicate logic; if the statement is a theorem (= true) then the expression is safe. This definition is necessary, for example, to recognize that "if then else" will always execute either the then branch or the else branch.

value[M(x), V] = fresh symbol
value[~a, V] = not V[a]
value[a && b, V] = V[a] and V[b]
value[a || b, V] = V[a] or V[b]
value[f >x> g, V] = value[g, V+x:value[f, V]]
value[f <x< g, V] = value[f, V+x:value[g, V]]
value[f | g, V] = value[f, V] or value[g, V]
value[f ; g, V] = value[f, V] or value[g, V]

safe[if(x), P, V] = P[x] and V[x]
safe[M(x), P, V] = P[x] and (M is a safe site)
safe[f <x< g, P, V] = safe[f, P+x:safe[g, P], V+x:value[g, V]]
safe[f >x> g, P, V] = safe[f, P, V] and safe[g, P+x:true, V+x:value[f, V]]
safe[f ; g, P, V] = safe[f, P, V] or safe[g, P, V]
safe[f | g, P, V] = safe[f, P, V] or safe[g, P, V]
(When evaluating | and ;, should we take advantage of safety to prove that one branch can never publish?)

Now we will enumerate the identities which we will use to deparallelize expressions. First, some standard identities:

    (f | g) >> h
<=> f >> h | g >> h

    (f >> g) <x< h
<=> (f <x< h) >> g
if g does not refer to x

<=> (f | g) <x< h
    (f <x< h) | g
if g does not refer to x

<=> (f | g) <x< h
    f | (g <x< h)
if f does not refer to x

<=> (f ; g) <x< h
    (f <x< h) ; g
if g does not refer to x

Define a transformation O[] on contained expressions:

O[M(...)] = M(...)
 O[f | g] = O[f] ; O[g]
 O[f ; g] = O[f] ; O[g]
O[f >> g] = f >> O[g]
O[f << g] = O[f] << g

This transformation obeys the identity:

    f << g
<=> f << O[g]
if g is contained

Some new identities:

    f <x< h
<=> f
if h is pure
and f does not refer to x

    (f | g) <x< h
<=> (f <x< h) | (g <x< h)
if h is pure
(is this necessary/desirable?)

    (f ; g) <x< h
<=> (f <x< h) ; (g <x< h)
if h is pure
(is this necessary/desirable?)

    f <x< h
<=> h >x> f
if h is contained, safe and sequential

    f ; g
<=> f
if f is safe

    (f >> g) | h
<=> f >> (g | h)
if f is safe and sequential

    (f >> g) <x< h
<=> f >> (g <x< h)
if (h is pure or (f is safe and sequential))
and f does not refer to x

    (f_0 | f_1 | ... | f_n) <x< g
<=> g >x> (f_0 | f_1 | ... | f_n)
if g is sequential
and every f_i has the form "M(x)" or "M(x) >> g"
(can we generalize this better?)

We can apply these identities in a bottom-up fashion to:

  1. pull parallel combinators up (until they can be removed by O[])
  2. push pruning combinators down (to the point where the variable is used)
  3. remove parallel combinators (to make expressions sequential)
  4. remove pruning combinators (to deparallelize)

Limitations#

What are some interesting expressions we cannot deparallelize?

Nested | and ;#

x <x< ((f | g) ; h) >> j

If f succeeds and j fails, then we must still try g, but we must not try h; there is no way to express this without |, and therefore no way to remove the pruning combinator. Fortunately, Cor + | + >> expressions, lacking ";", cannot have this form[1].


[#1] This assumes that if b then f else g is encoded as if(b) >> f | if(~b) >> g rather than if(b) >> f ; if(~b) >> g.

Add new attachment

Only authorized users are allowed to upload new attachments.
« This page (revision-1) was last changed on 17-Jan-2009 17:01 by AdrianQuark