2011-12-30

Y Combinator




Fixed-point combinator

In computer science, a fixed-point combinator (or fixpoint combinator ) is a higher-order function that computes a fixed point of other functions. A fixed point of a function f is a value x such that x = f(x). For example, 0 and 1 are fixed points of the function f(x) = x, because 0 = 0 and 1 = 1. Whereas a fixed-point of a first-order function (a function on "simple" values such as integers) is a first-order value, a fixed point of a higher-order function f is another function p such that p = f(p). A fixed-point combinator, then, is a function g which produces such a fixed point p for any function f:

g( f) = p, where p = f( p)

or, alternatively:

g( f) = f( g( f)).

Because fixed-point combinators are higher-order functions, their history is intimately related to the development of lambda calculus. One well-known fixed-point combinator in the untyped lambda calculus is Haskell Curry's Y = λf·(λx·f (x x)) (λx·f (x x)). The name of this combinator is sometimes incorrectly used to refer to any fixed-point combinator. The untyped lambda calculus however, contains an infinitude of fixed-point combinators. Fixed-point combinators do not necessarily exist in more restrictive models of computation. For instance, they do not exist in simply typed lambda calculus.

In programming languages that support anonymous functions, fixed-point combinators allow the definition and use of anonymous recursive functions, i.e. without having to bind such functions to identifiers. In this setting, the use of fixed-point combinators is sometimes called anonymous recursion.

How it works

Ignoring for now the question whether fixed-point combinators even exist (to be addressed in the next section), we illustrate how a function satisfying the fixed-point combinator equation works. To easily trace the computation steps, we choose untyped lambda calculus as our programming language. The (computational) equality from the equation that defines a fixed-point combinator corresponds to beta reduction in lambda calculus.

As a concrete example of a fixed-point combinator applied to a function, we use the standard recursive mathematical equation that defines the factorial function

fact( n) = if n = 0 then 1 else n * fact( n − 1)

We can express a single step of this recursion in lambda calculus as the lambda abstraction

 F = λf. λn. IFTHENELSE (ISZERO n) 1 (MULT n (f (PRED n))) 

using the usual encoding for booleans, and Church encoding for numerals. The functions in CAPITALS above can all be defined as lambda abstractions with their intuitive meaning; see lambda calculus for their precise definitions. Intuitively, f in F is a place-holder argument for the factorial function itself.

We now investigate what happens when a fixed-point combinator FIX is applied to F and the resulting abstraction, which we hope would be the factorial function, is in turn applied to a (Church-encoded) numeral.

 (FIX F) n = (F (FIX F)) n --- expanded the defining equation of FIX = (λx. IFTHENELSE (ISZERO x) 1 (MULT x ((FIX F) (PRED x)))) n --- expanded the first F = IFTHENELSE (ISZERO n) 1 (MULT n ((FIX F) (PRED n))) --- applied abstraction to n. 

If we now abbreviate FIX F as FACT, we recognize that any application of the abstraction FACT to a Church numeral calculates its factorial

 FACT n = IFTHENELSE (ISZERO n) 1 (MULT n (FACT (PRED n))). 

Recall that in lambda calculus all abstractions are anonymous, and that the names we give to some of them are only syntactic sugar. Therefore, it's meaningless in this context to contrast FACT as a recursive function with F as somehow being "not recursive". What fixed point operators really buy us here is the ability to solve recursive equations. That is, we can ask the question in reverse: does there exist a lambda abstraction that satisfies the equation of FACT? The answer is yes, and we have a "mechanical" procedure for producing such an abstraction: simply define F as above, and then use any fixed point combinator FIX to obtain FACT as FIX F.

In the practice of programming, substituting FACT at a call site by the expression FIX F is sometimes called anonymous recursion. In the lambda abstraction F, FACT is represented by the bound variable f, which corresponds to an argument in a programming language, therefore F need not be bound to an identifier. F however is a higher-order function because the argument f is itself called as a function, so the programming language must allow passing functions as arguments. It must also allow function literals because FIX F is an expression rather than an identifier. In practice, there are further limitations imposed on FIX, depending on the evaluation strategy employed by the programming environment and the type checker. These are described in the implementation section.

Existence of fixed-point combinators

In certain mathematical formalizations of computation, such as the untyped lambda calculus and combinatory logic, every expression can be considered a higher-order function. In these formalizations, the existence of a fixed-point combinator means that every function has at least one fixed point; a function may have more than one distinct fixed point.

In some other systems, for example in the simply typed lambda calculus, a well-typed fixed-point combinator cannot be written. In those systems any support for recursion must be explicitly added to the language. In still others, such as the simply-typed lambda calculus extended with recursive types, fixed-point operators can be written, but the type of a "useful" fixed-point operator (one whose application always returns) may be restricted. In polymorphic lambda calculus (System F) a polymorphic fixed-point combinator has type ∀a.(a→a)→a, where a is a type variable.

Y combinator

One well-known (and perhaps the simplest) fixed-point combinator in the untyped lambda calculus is called the Y combinator. It was discovered by Haskell B. Curry, and is defined as:

Y = λf.(λx.f (x x)) (λx.f (x x))

We can see that this function acts as a fixed-point combinator by applying it to an example function g and rewriting:

Y g = (λf . (λx . f (x x)) (λx . f (x x))) g (by definition of Y)
= (λx . g (x x)) (λx . g (x x)) (β-reduction of λf: applied main function to g)
= (λy . g (y y)) (λx . g (x x)) (α-conversion: renamed bound variable)
= g ((λx . g (x x)) (λx . g (x x))) (β-reduction of λy: applied left function to right function)
= g (Y g) (by second equality)

In programming practice, the Y combinator is useful only in those languages that provide a call-by-name evaluation strategy, since (Y g) diverges (for any g) in call-by-value settings.

Example in Scheme

 (define Y (lambda (f) ((lambda (recur) (f (lambda arg (apply (recur recur) arg)))) (lambda (recur) (f (lambda arg (apply (recur recur) arg))))))) 

Factorial definition using Y Combinator

 (define fact (Y (lambda (f) (lambda (n) (if (<= n 0) 1 (* n (f (- n 1)))))))) 

Example in Common Lisp

 (defun y (func) ((lambda (x) (funcall x x)) (lambda (y) (funcall func (lambda (&rest args) (apply (funcall y y) args)))))) 

Factorial program using the Y combinator

 (funcall (y (lambda (f) (lambda (x) (if (<= x 1) 1 (* x (funcall f (1- x))))))) 5) => 120 

Other fixed-point combinators

In untyped lambda calculus fixed-point combinators are not especially rare. In fact there are infinitely many of them. In 2005 Mayer Goldberg showed that the set of fixed-point combinators of untyped lambda calculus is recursively enumerable.

A version of the Y combinator that can be used in call-by-value (applicative-order) evaluation is given by η-expansion of part of the ordinary Y combinator:

Z = λf. (λx. f (λy. x x y)) (λx. f (λy. x x y))

Here is an example of this in Python:

 >>> Z = lambda f: (lambda x: f(lambda *args: x(x)(*args)))(lambda x: f(lambda *args: x(x)(*args))) >>> fact = lambda f: lambda x: 1 if x == 0 else x * f(x-1) >>> Z(fact)(5) 120 

The Y combinator can be expressed in the SKI-calculus as

Y = S (K (S I I)) (S (S (K S) K) (K (S I I)))
Y' = S S K (S (K (S S (S (S S K)))) K)

which corresponds to the lambda expression

Y' = (λx. λy. x y x) (λy. λx. y (x y x))

This fixed-point combinator is simpler than the Y combinator, and β-reduces into the Y combinator; it is sometimes cited as the Y combinator itself:

X = λf.(λx.x x) (λx.f (x x))

Another common fixed point combinator is the Turing fixed-point combinator (named after its discoverer, Alan Turing):

Θ = (λx. λy. (y (x x y))) (λx. λy. (y (x x y)))

It also has a simple call-by-value form:

Θ v = (λx. λy. (y (λz. x x y z))) (λx. λy. (y (λz. x x y z)))

Some fixed point combinators, such as this one (constructed by Jan Willem Klop) are useful chiefly for amusement:

Yk = (L L L L L L L L L L L L L L L L L L L L L L L L L L)

where:

L = λabcdefghijklmnopqstuvwxyzr. (r (t h i s i s a f i x e d p o i n t c o m b i n a t o r))

Strictly non-standard fixed-point combinators

In untyped lambda calculus there are terms that have the same Böhm tree as a fixed-point combinator, that is they have the same infinite extension λx.x(x(x ... )). These are called non-standard fixed-point combinators. Evidently, any fixed-point combinator is also a non-standard one, but not all non-standard fixed-point combinators are fixed-point combinators because some of them fail to satisfy the equation that defines the "standard" ones. These strange combinators are called strictly non-standard fixed-point combinators; an example is the following combinator N = BM(B(BM)B), where B = λxyz.x(yz) and M = λx.xx. The set of non-standard fixed-point combinators is not recursively enumerable.

Implementing fixed-point combinators

In a language that supports lazy evaluation, like in Haskell, it is possible to define a fixed-point combinator using the defining equation of the fixed-point combinator. A programming language of this kind effectively solves the fixed-point combinator equation by means of its own (lazy) recursion mechanism. This implementation of a fixed-point combinator in Haskell is sometimes referred to as defining the Y combinator in Haskell. This is incorrect because the actual Y combinator is rejected by the Haskell type checker (but see the following section for a small modification of Y using a recursive type which works). The listing below shows the implementation of a fixed-point combinator in Haskell that exploits Haskell's ability to solve the fixed-point combinator equation. This fixed-point combinator is traditionally called fix in Haskell. Observe that fix is a polymorphic fixed-point combinator (c.f. the discussion in previous section on System F); its type is only shown for clarity—it can be inferred in Haskell. The definition is followed by some usage examples.

 fix :: (a -> a) -> a fix f = f (fix f) fix (const 9) -- const is a function that returns its first parameter and ignores the second; -- this evaluates to 9 factabs fact 0 = 1 -- factabs is F from our lambda calculus example factabs fact x = x * fact (x-1) (fix factabs) 5 -- evaluates to 120 

In the example above, the application of fix does not loop infinitely, because of lazy evaluation; e.g., in the expansion of fix (const 9) as (const 9)(fix f) , the subexpression fix f is not evaluated. In contrast, the definition of fix from Haskell loops forever when applied in a strict programming language, because the argument to f is expanded beforehand, yielding an infinite call sequence f (f ... (fix f) ... )) , which causes a stack overflow in most implementations.

In a strict language like OCaml, we can avoid the infinite recursion problem by forcing the use of a closure. The strict version of fix shall have the type ∀a.∀b.((a→b)→(a→b))→(a→b). In other words, it works only on a function which itself takes and returns a function. For example, the following OCaml code implements a strict version of fix :

 let rec fix f x = f (fix f) x (* note the extra x *) let factabs fact = function (* factabs now has extra level of lambda abstraction *) 0 -> 1 | x -> x * fact (x-1) let _ = (fix factabs) 5 (* evaluates to "120" *) 

The same idea can be used to implement a (monomorphic) fixed combinator in strict languages that support inner classes inside methods (called local inner classes in Java), which are used as 'poor man's closures' in this case. Even when such classes may be anonymous, as in the case of Java, the syntax is still cumbersome. Java code. Function objects, e.g. in C++, simplify the calling syntax, but they still have to be generated, preferably using a helper function such as boost::bind. C++ code.

Example of encoding via recursive types

In programming languages that support recursive types (see System Fω), it is possible to type the Y combinator by appropriately accounting for the recursion at the type level. The need to self-apply the variable x can be managed using a type (Rec a), which is defined so as to be isomorphic to (Rec a -> a).

For example, in the following Haskell code, we have In and out being the names of the two directions of the isomorphism, with types:

 In :: (Rec a -> a) -> Rec a out :: Rec a -> (Rec a -> a) 

which lets us write:

 newtype Rec a = In { out :: Rec a -> a } y :: (a -> a) -> a y = \f -> (\x -> f (out x x)) (In (\x -> f (out x x))) 

Or equivalently in OCaml:

 type 'a recc = In of ('a recc -> 'a) let out (In x) = x let y f = (fun x a -> f (out x x) a) (In (fun x a -> f (out x x) a)) 

Anonymous recursion by other means

Although fixed point combinators are the standard solution for allowing a function not bound to an identifier to call itself, some languages like JavaScript provide a syntactical construct which allows anonymous functions to refer to themselves. For example, in Javascript one can write the following, although it has been removed in the strict mode of the latest standard edition.

 function(x) { return x === 0 ? 1 : x * arguments.callee(x-1); } 

See also

Notes

References

References

External links


Retrieved from : http://en.wikipedia.org/w/index.php?title=Fixed-point_combinator&oldid=465380705