click below
click below
Normal Size Small Size show me how
COMP2212
Programming Language Concepts
| Term | Definition |
|---|---|
| Names | also referred to as identifiers; essential in programming; used to identify virtual entities that we manipulate in programs |
| Name design choices | case sensitivity, restricted/fixed length, lexical rules (what characters they must/can't contain), clashing with reserved words, conventions (e.g camel case) |
| Variables | reference to a memory location whose contents may change; a placeholder for a value of some possibly complex type |
| Functional language variables | can store closures o arbitrary higher-order types ((int -> int) -> int); this is also a semantic concept |
| Memory location information | where variables do not refer to explicit memory locations, some languages allow you to obtain this; e.g "&var" in C takes variables to pointers (address in virtual memory) |
| Aliasing | two variables pointing to the same memory location |
| Attributes of a variable | name, address, value, type, extent, scope |
| Binding | an association between an entity and some attribute (e.g between a variable and its type) |
| Static binding | occurs before execution (compile time) and remains unchanged throughout execution |
| Dynamic binding | first occurs during execution (runtime) or changes during execution (runtime) |
| Allocation | the binding of a variable to its address; can be static or dynamic |
| Deallocation | the unbinding of a variable to its address; a largely dynamic concept |
| Extent | the time between a variables allocation and deallocation; when we can access the variable using that address |
| Static variables | aka global variables; bound to a memory location at initialisation time |
| Stack-dynamic variables | aka local variables; memory is allocated from a runtime stack and bound when a declaration is executed and deallocated when the procedure block it is contained in returns |
| Explicit heap-dynamic variables | nameless abstract memory locations that are allocated/deallocated by explicit runtime commands by the programmer |
| Implicit heap-dynamic variables | memory in heap is allocated when variables are assigned to values; it is deallocated and reallocated with re-assignment; error prone and inefficient |
| Static type binding | type declaration or type inference |
| Type declaration | most commonly used approach; a variable is introduced with an explicit type and possibly an initial value |
| Type inference | no types in variable declarations; the type is inferred from the usage of the variable or by following a fixed naming scheme |
| Dynamic type binding | typically occurs as a variable is assigned to a value at runtime; can change during execution; commonly used in scripting languages (e.g JavaScript); slower; advantageous in readability and coding convenience |
| Lifetime | aka extent; the periods of execution time when a variable is bound to a particular location storing a meaningful value; is a semantic concept and depends on execution model |
| Static variable extent | whole program execution |
| Static-dynamic variable extent | particular stack frame or procedure call |
| Explicit heap-dynamic variable extent | from explicit allocation to explicit deallocation (garbage collection and memory leak) |
| Implicit heap-dynamic variable extent | from implicit allocation to implicit deallocation (values may persist in memory but addresses are freed) |
| Scope | the part of the code in which a variable can be referenced; where the variable's name is meaningful |
| Garbage collectors | based on the principle that a variable's scope affects its extent; a no-longer referenceable value may be considered meaningless |
| Local variables | declared within a program block; the block is the scope |
| Static variables | whole program scope; except when they are temporarily hidden by a locally scoped variable with the same name |
| Lexical scope | where scope is aligned to statically determined areas of source code; lexical concept (not semantic) |
| Dynamic scope | not really scope; determined at runtime; depends on control flow; a variable is in dynamic scope if its name is meaningful within the bindings of the current call stack |
| Dynamic scope uncommon | flies in the face of referential transparency |
| Referential transparency | we can safely replace that piece of code with the value it computes and vice-versa, anywhere where that piece is used, without changing the meaning or result of our program |
| Syntax | refers to the structure of statements in a program; typically follows a grammar based on certain lexical symbols |
| Semantics | refers to the meaning of programs and how programs execute |
| Interpreter/Compiler role | to transform syntax into semantics |
| BNF | standard for defining grammars of programming languages; most programming languages are expressible using a CFG; acts as a metalanguage for defining languages |
| Non-terminals | represent different states of determining whether a string is accepted by the grammar |
| Terminal | represent the actual symbols of that appear in the strings accepted by the grammar |
| Parse trees | a representation of a a derivation of a string in a BNF grammar; each node shows a rule being used |
| Syntax to execution | step 1- lexing, step 2- parsing |
| Lexing | translating the particular symbols or characters in the string that make up the terminals of the grammar into tokens |
| Parsing | translating the sequence of tokens that make up the input string into a tree; it must follow the rules of the grammar and build a tree representing the derivation |
| Abstract syntax trees | used to remove unnecessary detail regarding how a term was parsed; retains the structure of the code but abstracts away the syntax |
| Ambiguous grammars | a grammar G where there exists a string s for which there exist two or more different parse trees for s using rules of the G; generally considered a bad thing |
| Resolving ambiguous grammars | parentheses (effective but impacts readability; operator precedence |
| Parentheses | used to "reset" precedence |
| Associativity | affects some operators; swap order in grammar rules; left recursive grammars don't work well with recursive descent |
| Lexeme | a pattern of characters in the input string that are considered meaningful in the given programming language |
| Token | a lexeme that has been identified and tagged with a meaning and possibly including a value |
| Scanner | used to identify lexemes using pattern matching with "maximal munch" (tries to identify the biggest possible token) |
| Evaluator | used to create tokens by analysing lexemes, tag them, and identify any associated value (in Haskell, the read function is very helpful for this) |
| Lexer generator | a software tool that, given an input that describes the lexemes and what tokens they produce, generates code to perform lexical analysis |
| Top down (recursive descent) parsing | builds parse tree from the root down; LL grammars (reads left to right, forms leftmost derivations) |
| Bottom up parsing | builds parse tree from leaves up; LR Parsers (reads left to right, forms rightmost derivations); more complicated but can handle a wider range of grammars |
| Shift-reduce | used by LR parsers; token stream scanned and forest of partial subtrees are built from bottom up; shift - read next input token; reduce - apply a completed grammar rule to join subtrees together; continues until single tree is formed |
| Parser generator | a software tool that, given an input that describes the rules of your grammar and what actions to perform when each rule is used, will generate code that performs parsing |
| Correctness | can mean many things: software testing, formal methods, theorem provers, hoare logic, algebraic specification, static analysis, denotational semantics, model checking, runtime monitoring, type systems |
| Type system | tractable syntactic method for proving the absence of certain program behaviours by classifying program phrases according to the kinds of values they compute |
| Types | abstract descriptions of programs - we can study correctness properties of interest by abstracting away all low level detail; precise descriptions of program behaviours; - we can use mathematical tools to formalise and check interesting properties |
| Type System uses | guarantee absence of certain behaviours, enforce higher-level modularity properties; enforce disciplined programming |
| Absence of behaviour examples | application of an arithmetic expression to wrong kind of data; existence of a method or field when invoked on a particular object; array lookups of the correct dimension |
| Higher-level modularity examples | maintain integrity of data abstractions; check for violation of information hiding |
| Type systems info | form the backbone of module based languages for large scale composition; types are the interfaces between modules; encourages abstract design; types are also a form of documentation |
| Strongly typed | languages check use of data to prevent program errors e.g accessing private data from corrupting memory, from crashing the machine, etc |
| Weakly typed | Languages do not always prevent errors but use types for other compilation purposes such as memory purposes |
| Static typing | refers to the point at which type checking occurs - compile time; use an approximation of the runtime types of values; statically determining control flow is undecidable; can avoid costly runtime errors; appropriate where types are used for memory layout |
| Dynamic typing | refers to type checking that is delayed until runtime; exact types can be used; allow variables to change the type of data they store; common for scripting languages and web programming; late error detection |
| Strong typing requirement | when an object is passed from a calling function to a called function to a called function, its type must be compatible with they type declared in the called function; intended types must be declared or inferred |
| Weak typing requirement | attempt to implicitly coerce data to be of the required type |
| Weak dynamically typed languages | PHP, Perl, Javascript, Lua |
| Strong dynamically typed languages | Python, Lisp, Scheme |
| Weak statically typed languages | C, C++ |
| Strong statically typed languages | Ada, OCaml, Haskell, Java, C# |
| Strongly typed languages insist | all program variable are manifestly typed; some allow type inference (the compiler automatically works out suitable types); guarantees a certain amount of correctness |
| determining strong or weak typing | way the compiler makes use of annotations that indicate the type of data; compiler checks that the program doesn't attempt to consume data of the wrong type; to check well-typedness, need to check abstract syntax trees |
| Type checking | traverse the types of the programs abstract syntax tree, each subtree (and node) has a type |
| well typed AST | every program fragment (E) needs to be given a type (T) to decide whether the AST is well typed |
| Typing relation | ⊢ E : T - need to define this for every possible program E; define a relation over the set of all programs - prove using rules; smallest relation that satisfies the set of rules |
| Type derivation rules | if the relation holds for the things above the line then the relation holds for the things below the line as well |
| Prove typing relation holds | rules must be formed into a tree such that the leaf nodes of the tree have no premises (nothing above the line) |
| Inductively defined sets | a set where the elements are constructed by a finite number of applications of a given set of rules for creating more complicated objects from simpler ones |
| Set of all programs of a language | inductively defined by the operations of the language; just need to define it on each program construct of the language |
| Type environments | to give a type to expression E, we need to have assumptions about the types of the free variables in E; formally - a mapping from variable names to types; written e.g - x : Int, y : Bool, z : Int, etc |
| lambda calculus syntax | λ x -> x can also be written λx.x |
| syntax directed rules | a set of inference rules S defined over programs used to define an inductive relation R that whenever a program E holds in R then there is a unique rule in S that justifies this; this unique rule is determined by the syntactic operator at the root of E |
| Structure of type derivation trees | for a syntax directed set of type rules we see that the structure of derivation trees matches the structure of the AST of the program we are deriving a type for |
| Inversion | the ability to infer types of subprograms from the type of the whole program - essentially by reading type rules from bottom to top |
| explicit argument type declaration | type checking is easy; common practice; but language is verbose - burden to programmer |
| type checking vs type inference | type inference is algorithmically more complicated than type checking; type inference for languages with implicit types |
| type variables | symbolic variables that represent unknown or unconstrained type |
| unification | to obtain an actual type for the program we need to solve the constraints; need to find a substitution of type variables such that all of the constraints hold (for implicit typing) |
| structured types | programs must specify behaviour in terms of the shape of the data they manipulate; catch common programming errors where data being passed to a function or library call is in a different format to what is expected |
| the unit type | has exactly one value = () ; type of java methods that take no parameters; can be used to suspend the evaluation of an expression until a later point in the computation |
| type rule for unit values | ⊢() : unit ; it has no premises (nothing above the line) |
| thunking | wrapping an expression with a function of unit type; a value can be unthunked as well |
| pairs and tuples | common structured typed; given two pieces of data of types T and U we can form a piece of data of shape T x U; this can be generalised for arbitrary size tuples too |
| projections | "destructors" of structured types; for example fst and snd operations on a pair type; for generalised n-tuples we would need n projection functions |
| record types | collections of n elements of data indexed by integers; use labels to index the items |
| selection | destructor for record types; we write R.Ii to select the field labelled Ii from record R |
| lists | constructor is cons (often written :: ); destructor operations are head and tail |
| arrays | aren't really a structural type; have no constructor to form elements and can't be pattern matched |
| sum types | a type that represents the possibility of data of either type T or type U being present; usually written T + U; only destructor is pattern matching (case or match) |
| injections | constructors of sum types; written inl or inr (left or right value) |
| variant types | a generalisation of sum types; written as <I1 : T1, I2 : T2, ... , Ii : Ti>; constructors for the values are injections with labels <Ii = E> |
| enumerations | using a variant type for its labels; e.g a type for days of the week; values will be of the form < label = () > ; this type is simply sugar for this structure; values of the types are simply the labels |
| option types | e.g maybe values in haskell; a variant type with a nothing field (of implicit unit type) and a Just field of some other type |
| subtyping | helps when type systems force us to be too specific; e.g a function that returns the length of an array doesn't care what type the elements of the array are; we need generic, polymorphic types |
| generics/polymorphism | supported by many programming languages; allow subtyping |
| polymorphic | "many shaped"; functions of this type may be applied to many different types of data; different types (parametric, ad hoc) - we care about subtype |
| structural subtyping | look at capabilities of types; T is a subtype of U if every operation that can be performed on U can also be performed on T |
| nominal subtyping | explicitly declare what types we want to be subtypes of others and then make sure that any operations valid on a supertype are valid on the subtype; approach taken in OOP languages via inheritance |
| subsumption | property of subtyping: if T is a subtype of U then every value of type T can also be considered as a value of type U; relies on subtyping relation T <: U between types; same rule for both structural and nominal subtyping |