User:Humbugde/Evaluation order

In computer science, an evaluation strategy is a set of (usually deterministic) rules for evaluating expressions in a programming language. Emphasis is typically placed on functions or operators: an evaluation strategy defines when and in what order the arguments to a function are evaluated, when they are substituted into the function, and what form that substitution takes. The lambda calculus, a formal system for the study of functions, has often been used to model evaluation strategies, where they are usually called reduction strategies. Evaluation strategies divide into two basic groups, strict and non-strict, based on how arguments to a function are handled. A language may combine several evaluation strategies; for example, C++ combines call-by-value with call-by-reference. Most languages that are predominantly strict use some form of non-strict evaluation for boolean expressions and if-statements.

Theoretical considerations[edit]

Practical questions[edit]

Parameter passing modes[edit]

Strict evaluation[edit]

In strict evaluation, the arguments to a function are always evaluated completely before the function is applied.

Under Church encoding, eager evaluation of operators maps to strict evaluation of functions; for this reason, strict evaluation is sometimes called "eager". Most existing programming languages use strict evaluation for functions.

Applicative order[edit]

Applicative order (or leftmost innermost) evaluation refers to an evaluation strategy in which the arguments of a function are evaluated from left to right in a post-order traversal of reducible expressions (redexes). Unlike call-by-value, applicative order evaluation reduces terms within a function body as much as possible before the function is applied.

Call by value[edit]

Call-by-value evaluation is the most common evaluation strategy, used in languages as different as C and Scheme. In call-by-value, the argument expression is evaluated, and the resulting value is bound to the corresponding variable in the function (frequently by copying the value into a new memory region). If the function or procedure is able to assign values to its parameters, only its local copy is assigned — that is, anything passed into a function call is unchanged in the caller's scope when the function returns.

Call-by-value is not a single evaluation strategy, but rather the family of evaluation strategies in which a function's argument is evaluated before being passed to the function. While many programming languages (such as Eiffel and Java) that use call-by-value evaluate function arguments left-to-right, some evaluate functions and their arguments right-to-left, and others (such as Scheme, OCaml and C) leave the order unspecified (though they generally require implementations to be consistent).

The term "call-by-value" is sometimes problematic, as the value implied is not the value of the variable as understood by the ordinary meaning of value, but an implementation-specific reference to the value. The term "call-by-value where the value is a reference" is common (but should not be understood as being call-by-reference). Thus the behaviour of call-by-value Java or Visual Basic and call-by-value C or Pascal are significantly different: in C or Pascal, calling a function with a large structure as an argument will cause the entire structure to be copied, potentially causing serious performance degradation, and mutations to the structure are invisible to the caller. However, in Java or Visual Basic only the reference to the structure is copied, which is fast, and mutations to the structure are visible to the caller. (See also call-by-sharing....)

Call by reference[edit]

In call-by-reference evaluation, a function receives an implicit reference to the argument, rather than a copy of its value. This typically means that the function can modify the argument- something that will be seen by its caller. Call-by-reference therefore has the advantage of greater time- and space-efficiency (since arguments do not need to be copied), as well as the potential for greater communication between a function and its caller (since the function can return information using its reference arguments), but the disadvantage that a function must often take special steps to "protect" values it wishes to pass to other functions.

Many languages support call-by-reference in some form or another, but comparatively few use it as a default; Perl and Visual Basic are two that do, though Visual Basic also offers a special syntax for call-by-value parameters. A few languages, such as C++ and REALbasic, default to call-by-value, but offer special syntax for call-by-reference parameters. C++ additionally offers call-by-reference-to-const. In purely functional languages there is typically no semantic difference between the two strategies (since their data structures are immutable, so there is no possibility for a function to modify any of its arguments), so they are typically described as call-by-value even though implementations frequently use call-by-reference internally for the efficiency benefits.

Even among languages that don't exactly support call-by-reference, many, including C and ML, support explicit references (objects that refer to other objects), such as pointers (objects representing the memory addresses of other objects), and these can be used to effect or simulate call-by-reference (but with the complication that a function's caller must explicitly generate the reference to supply as an argument).

Call by sharing[edit]

Also known as "call by object" or "call by object-sharing" is an evaluation strategy first named by Barbara Liskov et al. for the language CLU in 1974^[1]. It is used by languages such as Python^[2], Iota, Java (for object references)^[3], Ruby, Scheme, OCaml, AppleScript, and many other languages. However, the term "call by sharing" is not in common use; the terminology is inconsistent across different sources. For example, in the Java community, they say that Java is pass-by-value, whereas in the Ruby community, they say that Ruby is pass-by-reference, even though the two languages exhibit the same semantics. Call-by-sharing implies that values in the language are based on objects rather than primitive types.

The semantics of call-by-sharing differ from call-by-reference in that assignments to function arguments within the function aren't visible to the caller (unlike by-reference semantics) ^{[citation needed]}. However since the function has access to the same object as the caller (no copy is made), mutations to those objects within the function are visible to the caller, which differs from call-by-value semantics.

Although this term has widespread usage in the Python community, identical semantics in other languages such as Java and Visual Basic are often described as call-by-value, where the value is implied to be a reference to the object.

Call by copy-restore[edit]

Call-by-copy-restore, call-by-value-result or call-by-value-return (as termed in the Fortran community) is a special case of call-by-reference where the provided reference is unique to the caller. If a parameter to a function call is a reference that might be accessible by another thread of execution, its contents are copied to a new reference that is not; when the function call returns, the updated contents of this new reference are copied back to the original reference ("restored").

The semantics of call-by-copy-restore also differ from those of call-by-reference where two or more function arguments alias one another; that is, point to the same variable in the caller's environment. Under call-by-reference, writing to one will affect the other; call-by-copy-restore avoids this by giving the function distinct copies, but leaves the result in the caller's environment undefined (depending on which of the aliased arguments is copied back first).

When the reference is passed to the callee uninitialized, this evaluation strategy may be called call-by-result.

Partial evaluation[edit]

In partial evaluation, evaluation may continue into the body of a function that has not been applied. Any sub-expressions that do not contain unbound variables are evaluated, and function applications whose argument values are known may be reduced. In the presence of side-effects, complete partial evaluation may produce unintended results; for this reason, systems that support partial evaluation tend to do so only for "pure" expressions (expressions without side-effects) within functions.

Non-strict evaluation[edit]

In non-strict evaluation, arguments to a function are not evaluated unless they are actually used in the evaluation of the function body.

Under Church encoding, lazy evaluation of operators maps to non-strict evaluation of functions; for this reason, non-strict evaluation is often referred to as "lazy". Boolean expressions in many languages use lazy evaluation; in this context it is often called short circuiting. Conditional expressions also usually use lazy evaluation, albeit for different reasons.

Normal order[edit]

Normal-order (or leftmost outermost) evaluation is the evaluation strategy where the outermost redex is always reduced, applying functions before evaluating function arguments. It differs from call-by-name in that call-by-name does not evaluate inside the body of an unapplied function^{[clarification needed]}.

Call by name[edit]

In call-by-name evaluation, the arguments to functions are not evaluated at all — rather, function arguments are substituted directly into the function body using capture-avoiding substitution. If the argument is not used in the evaluation of the function, it is never evaluated; if the argument is used several times, it is re-evaluated each time. (See Jensen's Device.)

Call-by-name evaluation can be preferable over call-by-value evaluation because call-by-name evaluation always yields a value when a value exists, whereas call-by-value may not terminate if the function's argument is a non-terminating computation that is not needed to evaluate the function. Opponents of call-by-name claim that it is significantly slower when the function argument is used, and that in practice this is almost always the case as a mechanism such as a thunk is needed.

Call by need[edit]

Call-by-need is a memoized version of call-by-name where, if the function argument is evaluated, that value is stored for subsequent uses. In a "pure" (effect-free) setting, this produces the same results as call-by-name; when the function argument is used two or more times, call-by-need is almost always faster.

Because evaluation of expressions may happen arbitrarily far into a computation, languages using call-by-need generally do not support computational effects (such as mutation) except through the use of monads and uniqueness types. This eliminates any unexpected behavior from variables whose values change prior to their delayed evaluation.

This is a kind of lazy evaluation.

Haskell is the most well-known language that uses call-by-need evaluation.

R also uses a form of call-by-need.

Call by macro expansion[edit]

Call-by-macro-expansion is similar to call-by-name, but uses textual substitution rather than capture-avoiding substitution. With uncautious use, macro substitution may result in variable capture and lead to undesired behavior. Hygienic macros avoid this problem by checking for and replacing shadowed variables that are not parameters.

Nondeterministic strategies[edit]

Full β-reduction[edit]

Under full β-reduction, any function application may be reduced (substituting the function's argument into the function using capture-avoiding substitution) at any time. This may be done even within the body of an unapplied function.

Call by future[edit]

Call-by-future (or parallel call-by-name) is like call-by-need, except that the function's argument may be evaluated in parallel with the function body (rather than only if used). The two threads of execution synchronize when the argument is needed in the evaluation of the function body; if the argument is never used, the argument thread may be killed.

Optimistic evaluation[edit]

Optimistic evaluation is another variant of call-by-need in which the function's argument is partially evaluated for some amount of time (which may be adjusted at runtime), after which evaluation is aborted and the function is applied using call-by-need. This approach avoids some of the runtime expense of call-by-need, while still retaining the desired termination characteristics.

References[edit]

Harold Abelson and Gerald Jay Sussman. Structure and Interpretation of Computer Programs, Second Edition. MIT Press, 1996. ISBN 0-262-01153-0.
Clem Baker-Finch, Clem, David King, Jon Hall, and Phil Trinder. "An Operational Semantics for Parallel Call-by-Need", Research report 99/1. Faculty of Mathematics & Computing, The Open University, 1999.
Robert Ennals and Simon Peyton Jones. "Optimistic Evaluation: a fast evaluation strategy for non-strict programs", in ICFP'03. ACM Press, 2003.
Bertram Ludäscher. CSE 130 lecture notes. January 24, 2001.
Pierce, Benjamin C. (2002). Types and Programming Languages. MIT Press. ISBN 0-262-16209-1.
P. Sestoft. "Demonstrating Lambda Calculus Reduction", in T. Mogensen, D. Schmidt, I. H. Sudburough (editors): The Essence of Computation: Complexity, Analysis, Transformation. Essays Dedicated to Neil D. Jones. Lecture Notes in Computer Science 2566. Springer-Verlag, 2002. Pages 420-435. ISBN 3-540-00326-6

Category:Programming language topics Category:Programming evaluation

[1] ttp://www.lcs.mit.edu/publications/pubs/pdf/MIT-LCS-TR-225.pdf

[2] ttp://effbot.org/zone/call-by-object.htm

[3] ttp://www.cs.cornell.edu/courses/cs412/2001sp/iota/iota.html

[1]

[2]

[3]