Basics
Warning: This document is incomplete. It has not been proofread (style, correct links, mistranslation, etc.). Also, Erg's syntax may be change destructively during version 0.*, and the documentation may not have been updated accordingly. Please be aware of this beforehand. If you find any errors in this document, please report then to here form or GitHub repo. We would appreciate your suggestions.
This document describes the basic syntax of Erg. If you already have experience with languages such as Python, please refer to the quick tour for an overview. There is also standard API docs, type definition and internal docs for contributors. If you need a detailed explanation of the syntax or Erg's internal architecture, please refer to those documents.
Hello, World!
First, let's do "Hello World".
print!("Hello, World!")
This is almost identical to Python and other languages in the same family. The most striking feature is the !
, the meaning of which will be explained later.
In Erg, parentheses ()
can be omitted unless there is some confusion in interpretation.
The omission of parentheses is similar to Ruby, but it is not possible to omit parentheses that can be interpreted in more than one way.
print! "Hello, World!" # OK
print! "Hello,", "World!" # OK
print!() # OK
print! # OK, but this does not mean to call, simply to get `print!` as a callable object
print! f x # OK, interpreted as `print!(f(x))`
print!(f(x, y)) # OK
print! f(x, y) # OK
print! f(x, g y) # OK
print! f x, y # NG, can be taken to mean either `print!(f(x), y)` or `print!(f(x, y))`
print!(f x, y) # NG, can be taken to mean either `print!(f(x), y)` or `print!(f(x, y))`
print! f(x, g y, z) # NG, can be taken to mean either `print!(x, g(y), z)` or `print!(x, g(y, z))`
Scripts
Erg code is called a script. Scripts can be saved and executed in file format (.er).
REPL/File Execution
To start REPL, simply type:
> erg
>
mark is a prompt, just type erg
.
Then the REPL should start.
> erg
Starting the REPL server...
Connecting to the REPL server...
Erg interpreter 0.2.4 (tags/?:, 2022/08/17 0:55:12.95) on x86_64/windows
>>>
Or you can compile from a file.
> 'print! "hello, world!"' >> hello.er
> erg hello.er
hello, world!
Comments
The code after #
is ignored as a comment. Use this to explain the intent of the code or to temporarily disable the code.
# Comment
# `#` and after are ignored until a new line is inserted
#[
Multi-line comment
Everything from `#[` to `]#` is a comment.
]#
Documentation Comments
'''...'''
is a documentation comment. Note that unlike Python, it is defined outside of any class or function.
'''
PI is a constant that is the ratio of the circumference of a circle to its diameter.
'''
PI = 3.141592653589793
'''
This function returns twice the given number.
'''
twice x = x * 2
print! twice.__doc__
# This function returns twice the given number.
'''
Documentation comments for the entire class
'''
C = Class {x = Int}
'''
Method documentation comments
'''
.method self = ...
You can specify the language of the document by writing the language code immediately after the '''
. The Erg Language Server will then display documents in the Markdown format for each language version (The default language is English).
See here for registered language codes.
'''
Answer to the Ultimate Question of Life, the Universe, and Everything.
cf. https://www.google.co.jp/search?q=answer+to+life+the+universe+and+everything
'''
'''japanese
生命、宇宙、そして全てについての究極の謎への答え
参照: https://www.google.co.jp/search?q=answer+to+life+the+universe+and+everything
'''
ANSWER = 42
Also, if you specify erg
, it will be displayed as Erg's sample code.
'''
the identity function, does nothing but returns the argument
'''
'''erg
assert id(1) == 1
assert id("a") == "a"
'''
id x = x
Expressions, separators
A script is a series of expressions. An expression is something that can be calculated or evaluated, and in Erg almost everything is an expression.
Each expression is separated by a separator - either a new line or a semicolon ;
-.
Erg scripts are basically evaluated from left to right, top to bottom.
n = 1 # assignment expression
f x, y = x + y # function definition
f(1, 2) # function-call expression
1 + 1 # operator-call expression
f(1, 2); 1 + 1
As shown below, there is a syntax called instant block that takes the last expression evaluated in the block as the value of the variable.
This differs from a function with no arguments, which does not add ()
. Note that instant blocks are evaluated only once on the fly.
i =
x = 1
x + 1
assert i == 2
This cannot be accomplished with a semicolon (;
).
i = (x = 1; x + 1) # SyntaxError: cannot use `;` in parentheses
Indentation
Erg, like Python, uses indentation to represent blocks. There are three operators (special forms) that trigger the start of a block: =
, ->
, and =>
(In addition, :
and |
, although not operators, also produce indentation). The meanings of each are described later.
f x, y =
x + y
for! 0..9, i =>
print!
for! 0..9, i =>
print! i; print! i
ans = match x:
0 -> "zero"
_: 0..9 -> "1 dight"
_: 10..99 -> "2 dights"
_ -> "unknown"
If a line is too long, it can be broken using \
.
# this does not means `x + y + z`, but means `x; +y; +z`
x
+ y
+ z
# this means `x + y + z`
x \
+ y \
+ z
Literal
Basic Literals
Int Literal
0, -0, 1, -1, 2, -2, 3, -3, ...
Ratio Literal
0.00, -0.0, 0.1, 400.104, ...
Note that the Ratio type is different from the Float type; the API is the same, but there are differences in the accuracy and efficiency of the calculation results.
If a Ratio
literal has an integer or decimal part of 0
, you can omit the 0
.
assert 1.0 == 1.
assert 0.5 == .5
Note: This function
assert
was used to show that1.0
and1.
are equal. Subsequent documents may useassert
to indicate that the results are equal.
Str Literal
Any Unicode-representable string can be used.
Unlike Python, quotation marks cannot be enclosed in '
. If you want to use "
in a string, use \"
.
"", "a", "abc", "111", "1# 3f2-3*8$", "こんにちは", "السَّلَامُ عَلَيْكُمْ", ...
\{..}
allows you to embed expressions in strings. This is called string interpolation.
If you want to output \{..}
itself, use \\{..}
.
assert "1 + 1 is 2" == "\{1} + \{1} is \{1+1}"
Documentation comments are also treated as string literals, so string interpolation can be used. This is expanded at compile time. You will be warned if you embed an expression that cannot be determined at compile time.
PI = 3.14159265358979323
'''
S(r) = 4 × \{PI} × r^2
'''
sphere_surface r = 4 * PI * r ** 2
Exponential Literal
This is a literal representing exponential notation often used in academic calculations. It is an instance of type Ratio
.
The notation is the same as in Python.
1e-34, 0.4e-10, 2.455+e5, 245e5, 25E5, ...
assert 1e-10 == 0.0000000001
Compound Literals
Each of these literals has its own documentation describing them separately, so please refer to that documentation for details.
Array Literal
[], [1], [1, 2, 3], ["1", "2",], ...
Tuple Literal
(), (1, 2, 3), (1, "hello", True), ...
Dict Literal
{:}, {"one": 1}, {"one": 1, "two": 2}, {"1": 1, "2": 2}, {1: "1", 2: True, "three": [1]}, ...
Record Literal
{=}, {one = 1}, {one = 1; two = 2}, {.name = "John"; .age = 12}, {.name = Str; .age = Nat}, ...
Set Literal
{}, {1}, {1, 2, 3}, {"1", "2", "1"}, ...
As a difference from Array
literals, duplicate elements are removed in Set
.
assert {1, 2, 1} == {1, 2}
What looks like a literal but isn't
Boolean Object
True
and False
are simply constant objects of type Bool
.
True, False
None Object
None
is a singleton object of type NoneType
.
None
Range Object
Unlike Python's range
, it can treat not only Int
but also any object of type that allows comparisons (subtype of Ord
, e.g. Str
, Ratio
, etc.).
assert 0..10 in 5
assert 0..<10 notin 10
assert 0..9 == 0..<10
assert (0..5).to_set() == {1, 2, 3, 4, 5}
assert "a" in "a".."z"
Float Object
assert 0.0f64 == 0
assert 0.0f32 == 0.0f64
This is a Ratio
object multiplied by f64
, a Float 64
unit object. You can also use f32
to indicate Float 32
.
They are basically performant than Ratio
, but may introduce errors.
assert 0.1 + 0.2 == 0.3
assert 0.1f64 + 0.2f64 != 0.3f64 # Oops!
Complex Object
1+2Im, 0.4-1.2Im, 0Im, Im
A Complex
object is simply an arithmetic combination of an imaginary unit object, Im
.
*-less multiplication
In Erg, you can omit the *
to indicate multiplication as long as there is no confusion in interpretation. However, the combined strength of the operators is set stronger than *
.
# same as `assert (1*m) / (1*s) == 1*(m/s)`
assert 1m / 1s == 1 (m/s)
Variable
Variables are a type of algebra; algebra in Erg - sometimes simply referred to as variable if there is no confusion - refers to the feature to name objects and make them referable from elsewhere in the code.
A variable is defined as follows.
The n
part is called the variable name (or identifier), =
is the assignment operator, and the 1
part is the assigned value.
n = 1
The n
defined in this way can thereafter be used as a variable to denote the integer object 1
. This system is called assignment (or binding).
We have just said that 1
is an object. We will discuss what an object is later, but for now we will assume that it is something that can be assigned to, i.e., on the right side of the assignment operator (=
, etc.).
If you want to specify the "type" of a variable, do the following. The type is roughly the set to which an object belongs, as will be explained later.
Here we specify that n
is a natural number (Nat
) type.
n: Nat = 1
Note that, unlike other languages, multiple assignments are not allowed.
# NG
l1 = l2 = [1, 2, 3] # SyntaxError: multiple assignment not allowed
# OK
l1 = [1, 2, 3]
l2 = l1.clone()
It is also not possible to reassign to a variable. The syntax that can be used instead, to hold mutable states, are described later.
i = 1
i = i + 1 # AssignError: cannot assign twice
You can define a variable with the same name in the inner scope, but you are only covering it over, not destructively rewriting its value. If you go back to the outer scope, the value will return as well. Note that this is a different behavior than the Python "statement" scope. This kind of functionality is generally referred to as shadowing. However, unlike shadowing in other languages, you cannot shadow in the same scope.
x = 0
# x = 1 # AssignError: cannot assign twice
if x.is_zero(), do:
x = 1 # different from outer x with same name
assert x == 1
assert x == 0
The following may seem possible at first glance, but it is still not possible. This is a design decision, not a technical constraint.
x = 0
if x.is_zero(), do:
x = x + 1 # NameError: cannot define variables refer to variables with the same name
assert x == 1
assert x == 0
Constants
Constants are also a type of algebra. If you start an identifier with a capital letter, it is treated as a constant. They are called constants because once defined, they do not change.
The N
part is called the constant name (or identifier). Otherwise, it is the same as a variable.
N = 0
if True, do:
N = 1 # AssignError: constants cannot be shadowed
pass()
Constants are immutable beyond the defined scope. They cannot be shadowed. Because of this property, constants can be used in pattern matching. Pattern matching is explained later.
For example, constants are used for mathematical constants, information about external resources, and other immutable values.
It is common practice to use all-caps (style in which all letters are capitalized) for identifiers of objects other than types.
PI = 3.141592653589793
URL = "https://example.com"
CHOICES = ["a", "b", "c"]
PI = 3.141592653589793
match! x:
PI => print! "π"
other => print! "other"
The above code prints π
when x
is 3.141592653589793
. If x
is changed to any other number, it prints other
.
Some objects cannot be bound as constants. Mutable objects, for example. Mutable objects are objects whose states can be changed, as described in detail later. This is because of the rule that only constant expressions can be assigned to constants. Constant expressions are also discussed later.
X = 1 # OK
X = !1 # TypeError: cannot define Int! object as a constant
Delete an Variable
You can delete an variable by using the Del
function. All other variables that depend on the variable (that is, that refer directly to the value of the variable) are also removed.
x = 1
y = 2
z = 3
f a = x + a
assert f(2) == 3
Del x
Del y, z
f(2) # NameError: f is not defined (deleted in line 6)
Note that Del
can only delete variables defined in the user-defined module. Built-in constants such as True
cannot be deleted.
Del True # TypeError: cannot delete built-in constants
Del print! # TypeError: cannot delete built-in variables
Appendix: Assignment and Equivalence
Note that x == a
is not necessarily true when x = a
. An example is Float.NaN
. This is the formal specification of floating-point numbers as defined by IEEE 754.
x = Float.NaN
assert x != NaN
assert x != x
There are other objects for which no equivalence relation is defined in the first place.
f = x -> x**2 + 2x + 1
g = x -> (x + 1)**2
f == g # TypeError: cannot compare function objects
C = Class {i: Int}
D = Class {i: Int}
C == D # TypeError: cannot compare class objects
Strictly speaking, =
does not assign the right-hand side value directly to the left-hand side identifier.
In the case of function and class objects, "modification" such as giving variable name information to the object is performed. However, this is not the case for structural types.
f x = x
print! f # <function f>
g x = x + 1
print! g # <function g>
C = Class {i: Int}
print! C # <class C>
Declaration
Declaration is the syntax for specifying the type of variable to be used. Declarations can be made anywhere in the code, but declarations alone do not refer to the variables. They must be initialized. After the assignment, the declaration can be checked to ensure that the type is compatible with the object to which it is assigned.
i: Int
# Can be declared at the same time as the assignment, like i: Int = 2
i = 2
i: Num
i: Nat
i: -2..2
i: {2}
Declaration after assignment is similar to type checking by assert
, but has the feature that it is checked at compile time.
Type checking by assert
at runtime can be checked for "may be type Foo", but type checking by :
at compile time is strict: if the type is not determined to be "type Foo", it will not pass the check and an error will occur.
i = (-1..10).sample!
assert i in Nat # this may pass
i: Int # this will pass
i: Nat # this will not pass (-1 is not an element of Nat)
Functions can be declared in 2 different ways.
f: (x: Int, y: Int) -> Int
f: (Int, Int) -> Int
If you declare the argument names explicitly, a type error will result if the names are different at definition time. If you want to give the argument names arbitrary names, you can declare them in the second way. In that case, only the method name and its type will be seen by type checking.
T = Trait {
.f = (x: Int, y: Int): Int
}
C = Class()
C|<: T|.
f(a: Int, b: Int): Int = ... # TypeError: `.f` must be type of `(x: Int, y: Int) -> Int`, not `(a: Int, b: Int) -> Int`
Function
A function is a block that takes an "argument", processes it, and returns it as a "return value". It is defined as follows.
add x, y = x + y
# or
add(x, y) = x + y
The names specified after a function name are called parameters.
In contrast, the objects passed to a function are called arguments.
The function add
is a function that takes x
and y
as parameters and returns the sum of them, x + y
.
The defined function can be called (applied/invoked) as follows.
add 1, 2
# or
add(1, 2)
Colon application style
Functions are invoked like f x, y, ...
, but if there are too many arguments for a single line, they can be applied using :
(colon).
f some_long_name_variable_1 + some_long_name_variable_2, some_long_name_variable_3 * some_long_name_variable_4
f some_long_name_variable_1 + some_long_name_variable_2:
some_long_name_variable_3 * some_long_name_variable_4
Codes above mean the same thing. This style is also useful when using if
functions, for example.
result = if Bool.sample!():
do:
log "True was chosen"
1
do:
log "False was chosen"
0
After :
, no code other than comments may be written, and must always be on a new line.
Also, you cannot use :
immediately after a function. Only do
and do!
can do this.
# NG
f:
x
y
# Ok
f(
x,
y
)
Keyword Arguments
If a function is defined with a large number of parameters, there is a danger of passing the arguments in the wrong order. In such cases, it is safe to call the function using keyword arguments.
f x, y, z, w, v, u: Int = ...
The functions defined above have many arguments and are arranged in a confusing order. You should not create such a function, but you may encounter such code when using code written by others. Therefore, we use keyword arguments. If you use keyword arguments, the values are passed from the name to the correct argument, even if they are in the wrong order.
f u := 6, v := 5, w:= 4, x := 1, y := 2, z := 3
Default parameters
Default parameters are used when some parameters are mostly fixed and you want to be able to omit them.
Default parameters are specified by :=
(default-assign operator). If base
is not specified, assign math.E
to base
.
math_log x: Ratio, base := math.E = ...
assert math_log(100, 10) == 2
assert math_log(100) == math_log(100, math.E)
Note that there is a distinction between specifying no argument and assigning None
.
p! x := 0 = print!
p!(2) # 2
p!() # 0
p!(None) # None
Can also be used with type specification and patterns.
math_log x, base: Ratio := math.E = ...
f [x, y] := [1, 2] = ...
However, within the default arguments, it is not possible to call the procedures (described later) or assign mutable objects.
f x := p! 1 = ... # NG
Also, the argument just defined cannot be used as the value passed to the default argument.
f x := 1, y := x = ... # NG
Variable-length arguments
The log
function, which outputs a log (record) of its arguments, can take any number of arguments.
log "Hello", "World", "!" # Hello World !
To define such a function, add *
to a parameter. This way, the function receives arguments as a variable-length array.
f *x =
for x, i ->
log i
# x == [1, 2, 3, 4, 5]
f 1, 2, 3, 4, 5
Function definition with multiple patterns
fib n: Nat =
match n:
0 -> 0
1 -> 1
n -> fib(n - 1) + fib(n - 2)
Functions like the one above, where match
appears directly under the definition, can be rewritten as follows.
fib 0 = 0
fib 1 = 1
fib(n: Nat): Nat = fib(n - 1) + fib(n - 2)
Note that a function definition with multiple patterns is not so-called overloading (multiple definition); a function has only a single definition. In the example above, n
must be of the same type as 0
or 1
. Also, as with match
, pattern matching is done from top to bottom.
If instances of different classes are mixed, the last definition must specify that the function argument is of type Or
.
f "aa" = ...
f 1 = ...
# `f x = ... ` is invalid
f x: Int or Str = ...
Also, like match
, it must also be exhaustive.
fib 0 = 0
fib 1 = 1
# PatternError: pattern of fib's parameter is not exhaustive
However, it can be made exhaustive by explicitly specifying the type using the refinement type described later.
fib: 0..1 -> 0..1
fib 0 = 0
fib 1 = 1
# OK
Recursive functions
A recursive function is a function that includes itself in its definition.
As a simple example, let us define a function factorial
that performs a factorial calculation. Factorial is a computation that "multiplies all positive numbers less than or equal to".
The factorial of 5 is 5*4*3*2*1 == 120
.
factorial 0 = 1
factorial 1 = 1
factorial(n: Nat): Nat = n * factorial(n - 1)
First, from the definition of factorial, the factorial of 0 and 1 are both 1.
In turn, the factorial of 2 is 2*1 == 2
, the factorial of 3 is 3*2*1 == 6
, and the factorial of 4 is 4*3*2*1 == 24
.
If we look closely, we can see that the factorial of a number n is the factorial of the preceding number n-1 multiplied by n.
Putting this into code, we get n * factorial(n - 1)
.
Since the definition of factorial
contains itself, factorial
is a recursive function.
As a reminder, if you do not add a type specification, it is inferred like this.
factorial: |T <: Sub(Int, T) and Mul(Int, Int) and Eq(Int)| T -> Int
factorial 0 = 1
factorial 1 = 1
factorial n = n * factorial(n - 1)
However, even if you can reason about it, you should explicitly specify the type of the recursive function. In the example above, a code like factorial(-1)
would work, but
factorial(-1) == -1 * factorial(-2) == -1 * -2 * factorial(-3) == ...
and this computation does not stop. Recursive functions must carefully define the range of values or you may end up in an infinite loop. So the type specification also helps to avoid accepting unexpected values.
High-order functions
A higher-order function is a function that takes a function as its parameter or return value. For example, a higher-order function that takes a function as its parameter can be written as follows
arg_f = i -> log i
higher_f(x: (Int -> NoneType)) = x 10
higher_f arg_f # 10
Of coursers, it is possible to take return value as a function.
add(x): (Int -> Int) = y -> x + y
add_ten = add(10) # y -> 10 + y
add_hundred = add(100) # y -> 100 + y
assert add_ten(1) == 11
assert add_hundred(1) == 101
By taking functions as parameters and return values in this way, more flexible expressions can be defined with functions.
Compile-time functions
A function name begins with an uppercase letter to indicate a compile-time function. User-defined compile-time functions must have all arguments as constants and must specify their types. Compile-time functions are limited in what they can do. Only constant expressions can be used in compile-time functions, i.e., only some operators (such as quadrature, comparison, and type construction operations) and compile-time functions. Arguments to be passed must also be constant expressions. In return, the advantage is that the computation can be done at compile time.
Add(X, Y: Nat): Nat = X + Y
assert Add(1, 2) == 3
Factorial 0 = 1
Factorial(X: Nat): Nat = X * Factorial(X - 1)
assert Factorial(10) == 3628800
math = import "math"
Sin X = math.sin X # ConstantError: this function is not computable at compile time
Compile-time functions are also used in polymorphic type definitions.
Option T: Type = T or NoneType
Option: Type -> Type
Compile-time function parameters must have different names from any constants already defined. If the names are the same, it will be interpreted as a constant pattern.
# Int is not a parameter but a constant (type Int)
K Int = None
Appendix: Function Comparison
Erg does not define ==
for functions. This is because there is no structural equivalence algorithm for functions in general.
f = x: Int -> (x + 1)**2
g = x: Int -> x**2 + 2x + 1
assert f == g # TypeError: cannot compare functions
Although f
and g
always return the same result, it is extremely difficult to make that determination. We have to teach algebra to the compiler.
So Erg gives up on function comparisons entirely, and (x -> x) == (x -> x)
also results in a compile error. This is a different specification from Python and should be noted.
# Python, weird example
f = lambda x: x
assert f == f
assert (lambda x: x) ! = (lambda x: x)
Appendix2: ()-completion
f x: Object = ...
# will be completed to
f(x: Object) = ...
f a
# will be completed to
f(a)
f a, b # TypeError: f() takes 1 positional argument but 2 were given
f(a, b) # TypeError: f() takes 1 positional argument but 2 were given
f((a, b)) # OK
The function type T -> U
is actually the syntax sugar of (T,) -> U
.
Built-in functions
if
if
is a function that changes processing depending on a condition.
result: Option Int = if! Bool.sample!(), do:
log "True was chosen"
1
print! result # None (or 1)
.sample!()
returns a random set of values. If the return value is true, print! "True"
is executed.
You can also specify what to do if the condition is false; the second do block is called the else block.
result: Nat = if Bool.sample!():
do:
log "True was chosen"
1
do:
log "False was chosen"
0
print! result # 1 (or 0)
If the process is a single line, you can omit indentation.
result = if Bool.sample!():
do 1
do 0
for
You can use for
to write a repeating process.
match_s(ss: Iterator(Str), pat: Pattern): Option Str =
for ss, s ->
if pat.match(s).is_some():
break s
operator
Operators are symbols that represent operations. Operands are things to the (left) right of an operator.
Operators are a kind of function, and thus are themselves first-class objects that can be bound to variables. When binding, it is necessary to enclose it with ``.
For +
(and -
), there are both unary and binary operators, so _+_
(binary operation)/+_
(unary operation ) must be specified.
add = `+` # SyntaxError: specify `_+_` or `+_`
add = `_+_`
assert add(1, 2) == 3
assert add("a", "b") == "ab"
mul = `*` # OK, this is binary only
assert mul(1, 2) == 2
Some fundamental operators, called special forms, cannot be bound.
def = `=` # SyntaxError: cannot bind `=` operator, this is a special form
# NG: def x, 1
function = `->` # SyntaxError: cannot bind `->` operator, this is a special form
# NG: function x, x + 1
Side effects and procedures
We have been neglecting to explain the meaning of the !
, but now its meaning will finally be revealed. This !
indicates that this object is a "procedure" with a "side-effect". A procedure is a function with a side-effect.
f x = print! x # EffectError: functions cannot be assigned objects with side effects
# hint: change the name to 'f!'
The above code will result in a compile error. This is because you are using a procedure in a function. In such a case, you must define it as a procedure.
p! x = print! x
p!
, q!
, ... are typical variable names for procedures.
Procedures defined in this way also cannot be used within a function, so side-effects are completely isolated.
Methods
Functions and procedures each can be methods. Functional methods can only take immutable references to self
, while procedural methods can take mutable references to self
.
The self
is a special parameter, which in the context of a method refers to the calling object itself. The reference self
cannot be assigned to any other variable.
C!.
method ref self =
x = self # OwnershipError: cannot move out 'self'
x
Procedural methods can also take ownership of self
. Remove ref
or ref!
from the method definition.
n = 1
s = n.into(Str) # '1'
n # ValueError: n was moved by .into (line 2)
Only one procedural methods can have a mutable reference at any given time. In addition, while a mutable reference is taken, no more mutable reference can be taken from the original object. In this sense, ref!
causes a side-effect on self
.
Note, however, that it is possible to create (immutable/mutable) references from mutable references. This allows recursion and print!
of self
in procedural methods.
T -> T # OK (move)
T -> Ref T # OK (move)
T => Ref! T # OK (only once)
Ref T -> T # NG
Ref T -> Ref T # OK
Ref T => Ref!
T -> Ref T # NG
T -> Ref T # OK
T => Ref!
Appendix: Strict definition of side-effects
The rules for whether a code has a side-effect or not are not immediately understandable.
Until you can understand them, we recommend that you leave it to the compiler to define them as functions for the time being, and if an error occurs, add !
to treat them as procedures.
However, for those who want to understand the exact specifications of the language, the following is a more detailed explanation of side-effects.
First, it must be stated that the equivalence of return values is irrelevant with respect to side effects in Erg.
There are procedures that for any given x
will result in p!(x) == p!(x)
(e.g. always return None
), and there are functions that will result in f(x) ! = f(x)
.
An example of the former is print!
, and an example of the latter is the following function.
nan _ = Float.NaN
assert nan(1) ! = nan(1)
There are also objects, such as classes, for which equivalence determination itself is not possible.
T = Structural {i = Int}
U = Structural {i = Int}
assert T == U
C = Class {i = Int}
D = Class {i = Int}
assert C == D # TypeError: cannot compare classes
Back to the point: the precise definition of "side-effect" in Erg is
- Accessing mutable external information.
"External" generally refers to the outer scope; computer resources that Erg cannot touch and pre-/post-execution information are not included in "external". "Access" includes reading as well as writing.
As an example, consider the print!
procedure. At first glance, print!
does not seem to rewrite any variables. But if it were a function, it could rewrite outer variables, for example, with code like this:
camera = import "some_camera_module"
ocr = import "some_ocr_module"
n = 0
_ =
f x = print x # Suppose we could use print as a function
f(3.141592)
cam = camera.new() # camera faces PC display
image = cam.shot!()
n = ocr.read_num(image) # n = 3.141592
Think of the camera
module as an external library providing an API for a certain camera product, and ocr
as a library for OCR (optical character recognition).
The direct side-effect is caused by cam.shot!()
, but obviously that information is leaked from f
. Therefore, print!
cannot be a function by nature.
Nevertheless, there may be cases where you want to temporarily check a value in a function and do not want to add !
in the related function just for that purpose. In such cases, the log
function can be used.
log
prints the value after the entire code has been executed. In this way, side-effects are not propagated.
log "this will be printed after execution"
print! "this will be printed immediately"
# this will be printed immediately
# this will be printed after execution
If there is no feedback to the program, or in other words, if no external object can use the internal information, then the "leakage" of the information may be allowed. It is only necessary that the information not be "propagated".
Procedures
Procedures mean the functions that allow side-effect.
Please refer to Function basically usage or definition.
Add !
to a function name to define it.
proc!(x: Int!, y: Int!) =
for! 0..x, i =>
for 0..y, j =>
print! i, j
Procedures are necessary when dealing with mutable objects, but having a mutable object as an argument does not necessarily make it a procedure. Here is a function takes a mutable object (not procedure).
peek_str s: Str! = log s
make_proc(x!: (Int => Int)): (Int => Int) = y => x! y
p! = make_proc(x => x)
print! p! 1 # 1
Also, procedures and functions are related by proc :> func
.
Therefore, it is possible to define functions in procedure.
However, note that the reverse is not possible.
proc!(x: Int!) = y -> log x, y # OK
func(x: Int) = y => print! x, y # NG
Binding
Procedures can manipulate mutable variables that are out of scope.
x = ! 0
proc! () =
x.inc! ()
proc! ()
assert x == 1
In this case, 'proc!' has the following type.
proc!: {| x: Int! |} () => ()
`{| x: Int! |} The ' part is called the bind column and represents the variable and its type that the procedure operates on. Binding columns are derived automatically, so you don't need to write them explicitly. Note that normal procedures can only manipulate predetermined external variables. This means that variables passed in arguments cannot be rewritten. If you want to do something like that, you need to use procedural methods. Procedural methods can rewrite 'self'.
C! N = Class {arr = [Int; N]!}
C!.
new() = Self! (0)::__new__ {arr = ![]}
C! (N).
# push!: {|self: C!( N) ~> C! (N+1)|} (self: RefMut(C!( N)), x: Int) => NoneType
push! ref! self, x = self.arr.push! (x)
# pop!: {|self: C!( N) ~> C! (N-1)|} (self: RefMut(C!( N))) => Int
pop! ref! self = self.arr.pop! ()
c = C!. new()
c.push! (1)
assert c.pop! () == 1
Built-in procedure
id!
Returns the unique identification number of the object.
Although in pure Erg semantics no difference can be found between objects with the same structure, in practice objects have different locations in memory.
id!
returns a number representing this position.
Array
Arrays are the most basic collection (aggregate). A collection is an object that can hold multiple objects inside it.
a = [1, 2, 3]
a: [Int; 3] # Type specification: number after semicolon is the number of elements
# Can be omitted if the number of elements is not known
a: [Int]
mut_a = [!1, !2, !3]
mut_a[0].inc!()
assert mut_a == [2, 2, 3]
As a rule, arrays cannot contain objects of different types.
[1, "a"] # TypeError: 1st element is Int, but 2nd element is Str
However, you can bypass the restriction by explicitly specifying the type like this.
[1: Int or Str, "a"]
Slice
An array can also have multiple values taken out at once. This is called slicing.
l = [1, 2, 3, 4]
# Same as l[1:3] in Python
assert l[1.. <3] == [2, 3]
assert l[1..2] == [2, 3]
# Same as l[1]
assert l[1..1] == [2]
# Same as l[::2] in Python
assert l[..].step(2) == [2, 4]
The object obtained by slicing is an (immutable) copy to an array.
print! Typeof l[1..2] # [Int; 4]
Dict
Dict is a collection of key/value pairs.
ids = {"Alice": 145, "Bob": 214, "Charlie": 301}
assert ids["Alice"] == 145
The key does not have to be a string if it is a Hashable
object.
# deprecated to use a range object as a key (confused with slice)
r = {1..3: "1~3", 4..6: "4~6", 7..9: "7~9"}
assert r[1..3] == "1~3"
l = {[]: "empty", [1]: "1"}
assert l[[]] == "empty"
Order does not matter for Dict. It also cannot have duplicate elements. In this respect, Dict is similar to Set. You could say that a Dict is a Set with values.
{"Alice": 145, "Bob": 214, "Charlie": 301} == {"Alice": 145, "Charlie": 301, "Bob": 214}
When generating a dict from a dict literal, it is checked for duplicate keys. Any duplicates will result in a compile error.
{"Alice": 145, "Alice": 1} # KeyError: Duplicate key "Alice"
Empty Dict is created with {:}
. Note that {}
denotes an empty set.
mut_dict = !{:}
mut_dict.insert! "Alice", 145
mut_dict.insert! "Bob", 214
assert mut_dict["Alice"] == 145
Heterogeneous Dict
There need not be a single key/value type. Such a dictionary is called a _heterogenous dict.
d: {Str: Int, Int: Str} = {"a": 1, 1: "a"}
assert d["a"] == 1
assert d[1] == "a"
However, it is not possible to assign values of the same type to keys of different types, or values of different types to keys of the same type. In such cases, use the type Or instead.
invalid1 = {1: "a", "a": "b"}
invalid2 = {1: "a", 2: 2}
# Erg type inference does not infer Or type, so type specification is required
valid1: {Int or Str: Str} = {1: "a", "a": "b"}
valid2: {Int: Int or Str} = {1: "a", 2: 2}
Use with type ascription
The format x: y
in {}
is preferentially interpreted as a dictionary key/value pair.
If you want to use it as a type ascription, you must enclose it in ()
.
x = "a"
{(x: Str): 1}
Subscript (index access)
[]
is different from normal methods.
a = [!1, !2]
a[0].inc!()
assert a == [2, 2]
Recall that the return value of a subroutine cannot be a reference.
The type of a[0]
here should clearly be Ref!(Int!)
(the type of a[0]
depends on the context).
So []
is actually part of a special syntax, just like .
. Unlike Python, it cannot be overloaded.
It is also not possible to reproduce the behavior of []
in a method.
C = Class {i = Int!}
C.steal(self) =
self::i
C.get(ref self) =
self::i # TypeError: `self::i` is `Int!` (require ownership) but `get` doesn't own `self`
# OK (assigning)
c = C.new({i = 1})
i = c.steal()
i.inc!()
assert i == 2
# or (own_do!)
own_do! C.new({i = 1}).steal(), i => i.inc!()
# NG
C.new({i = 1}).steal().inc!() # OwnershipWarning: `C.new({i = 1}).steal()` is not owned by anyone
# hint: assign to a variable or use `uwn_do!`
Also, []
can be disowned, but the element is not shifted.
a = [!1, !2]
i = a[0]
i.inc!()
assert a[1] == 2
a[0] # OwnershipError: `a[0]` is moved to `i`
Tuple
Tuples are similar to arrays, but can hold objects of different types. Such a collection is called an unequal collection. In contrast, homogeneous collections include arrays, sets, etc.
t = (1, True, "a")
(i, b, s) = t
assert(i == 1 and b == True and s == "a")
The tuple t
can retrieve the nth element in the form t.n
; note that unlike Python, it is not t[n]
.
This is because accessing tuple elements is more like an attribute (the existence of the element is checked at compile time, and the type can change depending on n
) than a method (an array's []
is a method).
assert t.0 == 1
assert t.1 == True
assert t.2 == "a"
Parentheses ()
are optional when not nested.
t = 1, True, "a"
i, b, s = t
Tuples can hold objects of different types, so they cannot be iterated like arrays.
t: ({1}, {2}, {3}) = (1, 2, 3)
(1, 2, 3).iter().map(x -> x + 1) # TypeError: type ({1}, {2}, {3}) has no method `.iter()`
# If all types are the same, they can be represented by `(T; n)` like arrays, but this still does not allow iteration
t: (Int; 3) = (1, 2, 3)
assert (Int; 3) == (Int, Int, Int)
However, nonhomogeneous collections (such as tuples) can be converted to homogeneous collections (such as arrays) by upcasting, intersecting, and so on. This is called equalization.
(Int, Bool, Str) can be [T; 3] where T :> Int, T :> Bool, T :> Str
t: (Int, Bool, Str) = (1, True, "a") # non-homogenous
a: [Int or Bool or Str; 3] = [1, True, "a"] # homogenous
_a: [Show; 3] = [1, True, "a"] # homogenous
_a.iter().map(x -> log x) # OK
t.try_into([Show; 3])? .iter().map(x -> log x) # OK
Unit
A tuple with zero elements is called a unit. A unit is a value, but also refers to its own type.
unit = ()
(): ()
Unit is a superclass of all tuples.
() :> (Int; 0)
() :> (Str; 0)
() :> (Int, Str)
...
The use of this object is for procedures with no arguments and no return value, etc. Erg subroutines must have arguments and a return value. However, in some cases, such as a procedure, there may be no meaningful arguments or return value, only side effects. In such cases, we use units as "meaningless, formal values.
p!() =.
# `print!` does not return a meaningful value
print! "Hello, world!"
p!: () => () # The parameter part is part of the syntax, not a tuple
However, Python tends to use None
instead of units in such cases.
In Erg, you should use ()
when you are sure from the beginning that the operation will not return a meaningful value, such as in a procedure, and return None
when there is a possibility that the operation will fail and you will get nothing, such as when retrieving an element.
Record
A record is a collection that combines the properties of a Dict accessed by key and a tuple whose access is inspected at compile time. If you know JavaScript, think of it as a (more enhanced) kind of object literal notation.
john = {.name = "John"; .age = 21}
assert john.name == "John"
assert john.age == 21
assert john in {.name = Str; .age = Nat}
john["name"] # Error: john is not subscribable
The .name
and .age
parts are called attributes, and the "John"
and 21
parts are called attribute values.
The difference from JavaScript object literals is that they are not accessible as strings. That is, attributes are not just strings.
This is because access to the value is determined at compile-time, and because dictionaries and records are different things. In other words, {"name": "John"}
is a Dict and {name = "John"}
is a record.
So how should we use dictionaries and records?
In general, we recommend using records. Records have the advantages of being checked at compile-time for the existence of elements and of being able to specify _visibility.
Specifying visibility is equivalent to specifying public/private in Java and other languages. For details, see visibility for details.
a = {x = 1; .y = x + 1}
assert a.y == 2
a.x # AttributeError: x is private
# Hint: declare as `.x`.
The above example may seem strange to someone familiar with JavaScript, but simply declaring x
makes it inaccessible from the outside. .
. `.
You can also explicitly specify the type of an attribute.
anonymous = {
.name: Option! Str = "Jane Doe"
.age = 20
}
anonymous.name.set! "John Doe"
A record can also have the method.
o = {
.i = !0
.inc! ref! self = self.i.inc!()
}
assert o.i == 0
o.inc!()
assert o.i == 1
There is a notable syntax with respect to records. When all the attribute values of a record are classes (not structural types), the record itself behaves as a type with its own attributes as required attributes. Such a type is called a record type. See the section Record for more details.
# record
john = {.name = "John"}
# record type
john: {.name = Str}
Named = {.name = Str}
john: Named
greet! n: Named =
print! "Hello, I am \{n.name}"
john # "Hello, I am John" print!
Named.name # Str
Deconstructing a record
Records can be deconstructed as follows.
record = {x = 1; y = 2}
{x = a; y = b} = record
assert a == 1
assert b == 2
point = {x = 2; y = 3; z = 4}
match point:
{x = 0; y = 0; z = 0} -> "origin"
{x = _; y = 0; z = 0} -> "on the x axis"
{x = 0; ...} -> "x = 0"
{x = x; y = y; z = z} -> "(\{x}, \{y}, \{z})"
x = ...
can also be abbreviated to x
when there is a variable with the same name as the attribute, for example, x = x
or x = .x
to x
, and .x = .x
or .x = x
to .x
.
However, when there is only one attribute, it must be followed by ;
to distinguish it from a set.
x = 1
y = 2
xy = {x; y}
a = 1
b = 2
ab = {.a; .b}
assert ab.a == 1
assert ab.b == 2
record = {x;}
tuple = {x}
assert tuple.1 == 1
This syntax can be used to deconstructed a record and assign it to a variable.
# same as `{x = x; y = y} = xy`
{x; y} = xy
assert x == 1
assert y == 2
# same as `{.a = a; .b = b} = ab`
{a; b} = ab
assert a == 1
assert b == 2
Empty Record
An empty record is represented by {=}
. An empty record is also its own class, like Unit.
empty_record = {=}
empty_record: {=}
# Object: Type = {=}
empty_record: Object
empty_record: Structural {=}
{x = 3; y = 5}: Structural {=}
An empty record is different from an empty Dict {:}
or empty set {}
. In particular, note that it is the opposite of {}
in meaning (in Python, {}
is an empty dictionary, while in Erg it is !{:}
in Erg).
As an enumerated type, {}
is an empty type that contains nothing in its elements. The Never
type is a classification of this type.
Conversely, the record class {=}
has no required instance attribute, so all objects are elements of it. An Object
is an alias of this.
An Object
(a patch of Object
) is an element of . __sizeof__
and other very basic provided methods.
AnyPatch = Patch Structural {=}
. __sizeof__ self = ...
.clone self = ...
...
Never = Class {}
Note that no other type or class can be structurally equivalent to the {}
, Never
type, and it is an error if the user defines a type with {}
, Class {}
on the right side.
This means that, for example, 1..10 or -10. -1
, but 1..10 and -10... -1
. -1
when it should be 1..10 or -10...-1
, for example.
Also, if you define a type (such as Int and Str
) that results in a composition Object
, you will be warned to simply set it to Object
.
Instant Block
Erg has another syntax, instant block, which simply returns the last value evaluated. Attributes cannot be retained.
x =
x = 1
y = x + 1
y ** 3
assert x == 8
y =
.x = 1 # SyntaxError: cannot define an attribute in an entity block
Data Class
A bare record (a record generated by a record literal) must be defined directly in the instance if you try to implement a method on its own. This is inefficient, and as the number of attributes increases, error messages and the like become difficult to see and use.
john = {
name = "John Smith"
age = !20
.greet! ref self = print! "Hello, my name is \{self::name} and I am \{self::age} years old."
.inc_age! ref! self = self::age.update! x -> x + 1
}
print! john + 1
# TypeError: + is not implemented for {name = Str; age = Int; .greet! = Ref(Self). () => None; inc_age! = Ref! () => None}, Int
So, in such a case, you can inherit a record class. Such a class is called a data class. This is described in class.
Person = Inherit {name = Str; age = Nat}
Person.
greet! ref self = print! "Hello, my name is \{self::name} and I am \{self::age} years old."
inc_age! ref! self = self::age.update! x -> x + 1
john = Person.new {name = "John Smith"; age = 20}
print! john + 1
# TypeError: + is not implemented for Person, Int
Set
A set represents a collection, which is structurally a duplicate, unordered array.
assert Set.from([1, 2, 3, 2, 1]) == {1, 2, 3}
assert {1, 2} == {1, 1, 2} # duplicates are automatically removed
assert {1, 2} == {2, 1}
Sets can be declared by specifying type and length.
a: {Int; 3} = {0, 1, 2} # OK
b: {Int; 3} = {0, 0, 0} # NG, Duplicates are deleted, and the length changes.
# TypeError: the type of b is mismatched
# expected: Set(Int, 3)
# but found: Set({0, }, 1)
In addition, only objects that implement the Eq
trait can be elements of the Set.
Therefore, it is not possible to use the Set elements such as a Float.
d = {0.0, 1.0} # NG
#
# 1│ d = {0.0, 1.0}
# ^^^^^^^^
# TypeError: the type of _ is mismatched:
# expected: Eq(Float)
# but found: {0.0, 1.0, }
Sets can perform set operations.
assert 1 in {1, 2, 3}
assert not 1 in {}
assert {1} or {2} == {1, 2}
assert {1, 2} and {2, 3} == {2}
assert {1, 2} not {2} == {1}
A set is a homogeneous collection. In order for objects of different classes to coexist, they must be homogenized.
s: {Int or Str} = {"a", 1, "b", -1}
Sets as types
Sets can also be treated as types. Such types are called Enum types.
i: {1, 2, 3} = 1
assert i in {1, 2, 3}
Elements of the set are directly elements of the type. Note that the sets themselves are different.
mut_set = {1, 2, 3}.into {Int; !3}
mut_set.insert!(4)
types
Types are a very important feature in Erg, so we have a dedicated section. Please see there.
Erg's Type System
The following is a brief description of Erg's type system. Details are explained in other sections.
How to define
One of the unique features of Erg is that there is not much difference in syntax between (normal) variable, function (subroutine), and type (Kind) definitions. All are defined according to the syntax of normal variable and function definitions.
f i: Int = i + 1
f # <function f>
f(1) # 2
f.method self = ... # SyntaxError: cannot define a method to a subroutine
T I: Int = {...}
T # <kind 'T'>
T(1) # Type T(1)
T.method self = ...
D = Class {private = Int; .public = Int}
D # <class 'D'>
o1 = {private = 1; .public = 2} # o1 is an object that does not belong to any class
o2 = D.new {private = 1; .public = 2} # o2 is an instance of D
o2 = D.new {.public = 2} # InitializationError: class 'D' requires attribute 'private'(: Int) but not defined
Classification
All objects in Erg are strongly typed.
The top-level type is {=}
, which implements __repr__
, __hash__
, clone
, etc. (not required methods, and these attributes cannot be overridden).
Erg's type system incorporates structural subtyping (SST). The types typed by this system are called Structural types.
There are three major types of structural types: Attributive (attribute type), Refinement (refinement type), and Algebraic (algebraic type).
Record | Enum | Interval | Union | Intersection | Diff | |
---|---|---|---|---|---|---|
kind | Attributive | Refinement | Refinement | Algebraic | Algebraic | Algebraic |
generator | record | set | range operator | or operator | and operator | not operator |
Nominal subtyping (NST) can also be used, and the conversion of an SST type to an NST type is called nominalization of the type. The resulting type is called a nominal type. In Erg, the nominal types are classes and traits. When we simply say class/trait, we often mean record class/trait.
Type | Abstraction | Subtyping procedure | |
---|---|---|---|
NST | NominalType | Trait | Inheritance |
SST | StructuralType | Structural Trait | (Implicit) |
The type for the entire nominal type (NominalType
) and the type for the entire structural type (StructuralType
) are subtypes of the type for the entire type (Type
).
Erg can pass arguments (type arguments) to the type definition. An Option
, Array
, etc. with type arguments are called a polynomial kind. These are not themselves types, but they become types by applying arguments. Types such as Int
, Str
, etc., which have no arguments, are called simple types (scalar types).
A type can be regarded as a set, and there is an inclusion relation. For example, Num
contains Add
, Sub
, etc., and Int
contains Nat
.
The upper class of all classes is Object == Class {:}
and the lower class of all types is Never == Class {}
. This is described below.
Types
A type like Array T
can be regarded as a function of type Type -> Type
that takes type T
as an argument and returns type Array T
(also called Kind in type theory). Types like Array T
are specifically called polymorphic types, and Array
itself is called unary Kind.
The type of a function whose argument and return types are known is denoted as (T, U) -> V
. If you want to specify an entire two-argument function of the same type, you can use |T| (T, T) -> T
, and if you want to specify an entire N-argument function, you can use Func N
. However, the Func N
type has no information about the number of arguments or their types, so all return values are of type Obj
when called.
The Proc
type is denoted as () => Int
and so on. Also, the name of the Proc
type instance must end with !
at the end.
A Method
type is a function/procedure whose first argument is the object self
to which it belongs (by reference). For dependent types, you can also specify the type of yourself after the method is applied. This is T!(!N)
type and T!(N ~> N-1). () => Int
and so on.
Erg's array (Array) is what Python calls a list. [Int; 3]
is an array class that contains three objects of type Int
.
Note:
(Type; N)
is both a type and a value, so it can be used like this.Types = (Int, Str, Bool) for! Types, T => print! T # Int Str Bool a: Types = (1, "aaa", True)
pop|T, N|(l: [T; N]): ([T; N-1], T) =
[*l, last] = l
(l, last)
lpop|T, N|(l: [T; N]): (T, [T; N-1]) =
[first, *l] = l
(first, l)
A type ends with !
can be rewritten internal structure. For example, the [T; !N]
class is a dynamic array.
To create an object of type T!
from an object of type T
, use the unary operator !
.
i: Int! = !1
i.update! i -> i + 1
assert i == 2
arr = [1, 2, 3]
arr.push! 4 # ImplError:
mut_arr = [1, 2, 3].into [Int; !3]
mut_arr.push4
assert mut_arr == [1, 2, 3, 4].
Type Definitions
Types are defined as follows.
Point2D = {.x = Int; .y = Int}
Note that if .
is omitted from a variable, it becomes a private variable used within the type. However, this is also a required attribute.
Since types are also objects, there are attributes on the types themselves. Such attributes are called type attributes. In the case of a class, they are also called class attributes.
Data type
As mentioned earlier, a "type" in Erg roughly means a set of objects.
The following is a definition of the Add
type, which requires +
(the middle operator). R, O
are the so-called type parameters, which can be a true type (class) such as Int
or Str
. In other languages, type parameters are given a special notation (generics, templates, etc.), but in Erg they can be defined just like normal parameters.
Type parameters can also be used for types other than type objects. For example, the array type [Int; 3]
is a syntax sugar for Array Int, 3
. If the type implementations overlap, the user must explicitly choose one.
Add R = Trait {
.AddO = Type
. `_+_` = Self.(R) -> Self.AddO
}
._+_
is an abbreviation for Add._+_
. The prefix operator .+_
is a method of type Num
.
Num = Add and Sub and Mul and Eq
NumImpl = Patch Num
NumImpl.
`+_`(self): Self = self
...
Polymorphic types can be treated like functions. They can be monomorphic by specifying them as Mul Int, Str
, etc. (in many cases, they are inferred with real arguments without specifying them).
1 + 1
`_+_` 1, 1
Nat.`_+_` 1, 1
Int.`_+_` 1, 1
The top four lines return the same result (to be exact, the bottom one returns Int
), but it is common to use the top one.
Ratio.`_+_`(1, 1)
will return 2.0
without error.
This is because Int <: Ratio
, so 1
is downcast to Ratio
.
But this is not cast.
i = 1
if i: # TypeError: i: Int cannot be cast to Bool, use Int.is_zero() instead.
log "a"
log "b"
This is because Bool <: Int
(True == 1
, False == 0
). Casts to subtypes generally require validation.
Type Inference System
Erg uses static duck typing, so there is little need to explicitly specify the type.
f x, y = x + y
In the case of the code above, the type with +
, i.e., Add
is automatically inferred; Erg first infers the smallest type. If f 0, 1
, it will infer f x: {0}, y: {1}
, if n: Nat; f n, 1
, it will infer f x: Nat, y: {1}
. After minimization, the type is increased until an implementation is found. In the case of {0}, {1}
, Nat
is monomorphic to Nat
since Nat
is the smallest type with a +
implementation.
If {0}, {-1}
, it is monomorphic to Int
since it does not match Nat
. If there is no relationship between subtypes and supertypes, the one with the lowest concentration (number of instances) (or even fewer arguments in the case of polymorphic types) is tried first.
{0}
and {1}
are enumerated types that are partial types such as Int
and Nat
.
Enumerated types, for example, can be given names and request/implementation methods. In namespaces that have access to that type, objects that satisfy the request can use the implementation method.
Binary = Patch {0, 1}
Binary.
# self contains an instance. In this example, either 0 or 1.
# If you want to rewrite self, you must append ! must be added to the type name and method name.
is_zero(self) = match self:
0 -> True
1 -> False # You can also use _ -> False
is_one(self) = not self.is_zero()
to_bool(self) = match self:
0 -> False
1 -> True
Thereafter, the code 0.to_bool()
is possible (although 0 as Bool == False
is defined built-in).
Here is an example of a type that can actually rewrite self
as shown in the code.
Binary! = Patch {0, 1}!
Binary!
switch! ref! self = match! self:
0 => self = 1
1 => self = 0
b = !1
b.switch!()
print! b # => 0
Structure type (anonymous type)
Binary = {0, 1}
Binary
in the above code is a type whose elements are 0
and 1
. It is also a subtype of the Int
type, which has both 0
and 1
.
An object like {}
is itself a type and can be used with or without assignment to a variable as above.
Such types are called structural types. When we want to emphasize its use as the latter in contrast to a class (named type), it is also called an unnamed type. A structural type such as {0, 1}
is called an enumerated type, and there are also interval types, record types, and so on.
Type Identity
The following cannot be specified. For example, you cannot specify Int
and Int
and Int
and Int
and Int
and Int
.
For example, Int
and Str
are both Add
, but Int
and Str
cannot be added.
add l: Add, r: Add =
l + r # TypeError: there is no implementation of `_+_`: |T, U <: Add| (T, U) -> <Failure>
Also, the types A
and B
below are not considered the same type. However, the type O
is considered to match.
... |R1, R2, O, A <: Add(R1, O); B <: Add(R2, O)|
Basic syntax for types
Type specification
In Erg, the type of a variable can be specified after :
as follows. This can be done at the same time as an assignment.
i: Int # Declare the variable i to be of type Int
i: Int = 1
j = 1 # type specification can be omitted
For simple variable assignments, most type specifications can be omitted. Type specifications are more useful when defining subroutines and types.
# Type specification for parameters
f x, y: Array Int = ...
T X, Y: Array Int = ...
Note that in the above case, x, y
are both Array Int
.
# The value of a capital variable must be a constant expression
f X: Int = X
Alternatively, if you don't need complete information about the type argument, you can omit it with _
.
g v: [T; _] = ...
Note, however, _
at a type specification implies Object
.
f x: _, y: Int = x + y # TypeError: + is not implemented between Object and Int
Type ascription
Erg can explicitly indicate the type of any expression as well as variables. This syntax is called type ascription.
Variable type ascription = type specification.
x = 1: Nat
f("a": Str)
f("a"): Int
"a": Nat # TypeError:
Subtype specification
In addition to the :
(type declaration operator), Erg also allows you to specify the relationship between types by using <:
(partial type declaration operator).
The left side of <:
can only specify a class. Use Subtypeof
or similar operators to compare structural types.
This is also often used when defining subroutines or types, rather than simply specifying variables.
# Subtype specification of an argument
f X <: T = ...
# Subtype specification of the required attribute (.Iterator attribute is required to be a subtype of type Iterator)
Iterable T = Trait {
.Iterator = {Iterator} # {Iterator} == {I: Type | I <: Iterator}
.iter = Self.() -> Self.Iterator T
...
}
You can also use a subtype specification when defining a class to statically check whether the class is a subtype of the specified type.
# Class C is a subtype of Show
C = Class Object, Impl := Show
C.show self = ... # Show's required attributes.
You can also specify a subtype only in specific cases.
K T: Eq
K Int <: Show and Eq
K T = Class Object
K(T).
`==` self, other = ...
K(Int).
show self = ...
Subtype specification is recommended when implementing structural types. This is because, due to the nature of structural subtyping, typo or type specification errors will not cause errors when implementing required attributes.
C = Class Object
C.shoe self = ... # Show is not implemented due to Typo (it is considered just a unique method).
Attribute definitions
Attributes can be defined for traits and classes only in modules.
C = Class()
C.pub_attr = "this is public"
C::private_attr = "this is private"
c = C.new()
assert c.pub_attr == "this is public"
The syntax for defining a batch definition is called a batch definition, in which a newline is added after C.
or C::
and the definitions are grouped together below the indentation.
C = Class()
C.pub1 = ...
C.pub2 = ...
C::priv1 = ...
C::priv2 = ...
# is equivalent to
C = Class()
C.
pub1 = ...
C. pub2 = ...
C::
priv1 = ...
priv2 = ...
Aliasing
Types can be aliased. This allows long types, such as record types, to be shortened.
Id = Int
Point3D = {x = Int; y = Int; z = Int}
IorS = Int or Str
Vector = Array Int
Also, when displaying errors, the compiler will use aliases for composite types (in the above example, right-hand-side types other than the first) if they are defined.
However, only one alias of the same type is allowed per module, and multiple aliases will result in a warning. This means that types with different purposes should be defined as separate types. The purpose is also to prevent adding aliases on top of types that already have aliases.
Id = Int
UserId = Int # TypeWarning: duplicate aliases: Id and UserId
Ids = Array Id
Ints = Array Int # TypeWarning: duplicate aliases: Isd and Ints
IorS = Int or Str
IorSorB = IorS or Bool
IorSorB_ = Int or Str or Bool # TypeWarning: duplicate aliases: IorSorB and IorSorB_
Point2D = {x = Int; y = Int}
Point3D = {.... Point2D; z = Int}
Point = {x = Int; y = Int; z = Int} # TypeWarning: duplicate aliases: Point3D and Point
Trait
Traits are nominal types that add a type attribute requirement to record types. It is similar to the Abstract Base Class (ABC) in Python, but it has the feature of being able to perform algebraic operations.
Traits are used when you want to identify different classes. Examples of builtin traits are Eq
and Add
.
Eq
requires ==
to be implemented. Add
requires the implementation of +
(in-place).
So any class that implements these can be (partially) identified as a subtype of trait.
As an example, let's define a Norm
trait that computes the norm (length) of a vector.
Norm = Trait {.norm = (self: Self) -> Int}
Note that traits can only be declared, not implemented. Traits can be "implemented" for a class as follows:
Point2D = Class {.x = Int; .y = Int}
Point2D|<: Norm|.
Norm self = self.x**2 + self.y**2
Point3D = Class {.x = Int; .y = Int; .z = Int}
Point3D|<: Norm|.
norm self = self.x**2 + self.y**2 + self.z**2
Since Point2D
and Point3D
implement Norm
, they can be identified as types with the .norm
method.
norm x: Norm = x.norm()
assert norm(Point2D.new({x = 1; y = 2})) == 5
assert norm(Point3D.new({x = 1; y = 2; z = 3})) == 14
Error if the required attributes are not implemented.
Point3D = Class {.x = Int; .y = Int; .z = Int}
Point3D|<: Norm|.
foo self = 1
One of the nice things about traits is that you can define methods on them in Patch (described later).
@Attach NotEqual
Eq = Trait {. `==` = (self: Self, other: Self) -> Bool}
NotEq = Patch Eq
NotEq.
`! =` self, other = not self.`==` other
With the NotEq
patch, all classes that implement Eq
will automatically implement !=
.
Trait operations
Traits, like structural types, can apply operations such as composition, substitution, and elimination (e.g. T and U
). The resulting trait is called an instant trait.
T = Trait {.x = Int}
U = Trait {.y = Int}
V = Trait {.x = Int; y: Int}
assert Structural(T and U) == Structural V
assert Structural(V not U) == Structural T
W = Trait {.x = Ratio}
assert Structural(W) ! = Structural(T)
assert Structural(W) == Structural(T.replace {.x = Ratio})
Trait is also a type, so it can be used for normal type specification.
points: [Norm; 2] = [Point2D::new(1, 2), Point2D::new(3, 4)]
assert points.iter().map(x -> x.norm()).collect(Array) == [5, 25].
Trait inclusion
Subsume
allows you to define a trait that contains a certain trait as a supertype. This is called the subsumption of a trait.
In the example below, BinAddSub
subsumes BinAdd
and BinSub
.
This corresponds to Inheritance in a class, but unlike Inheritance, multiple base types can be combined using and
. Traits that are partially excluded by not
are also allowed.
Add R = Trait {
.Output = Type
. `_+_` = (self: Self, R) -> Self.Output
}
Sub R = Trait {
.Output = Type
. `_-_` = (self: Self, R) -> Self.Output
}
BinAddSub = Subsume Add(Self) and Sub(Self)
Structural Traits
Traits can be structured. This way, there is no need to explicitly declare the implementation, this is a feature that is called duck typing in Python.
SAdd = Structural Trait {
. `_+_` = Self.(Self) -> Self
}
# |A <: SAdd| cannot be omitted
add|A <: SAdd| x, y: A = x.`_+_` y
C = Class {i = Int}
C.
new i = Self.__new__ {i;}
# this is an __implicit__ implementation of SAdd
`_+_` self, other: Self = Self.new {i = self::i + other::i}
assert add(C.new(1), C.new(2)) == C.new(3)
Regular trait, i.e. nominal traits cannot be used simply by implementing a request method, but must be explicitly declared to have been implemented.
In the following example, add
cannot be used with an argument of type C
because there is no explicit declaration of implementation. It must be C = Class {i = Int}, Impl := Add
.
Add = Trait {
.`_+_` = (self: Self, Self) -> Self
}
# |A <: Add| can be omitted
add|A <: Add| x, y: A = x.`_+_` y
C = Class {i = Int}
C.
new i = Self.__new__ {i;}
`_+_` self, other: Self = Self.new {i = self::i + other::i}
add C.new(1), C.new(2) # TypeError: C is not a subclass of Add
# hint: inherit or patch 'Add'
Structural traits do not need to be declared for this implementation, but instead type inference does not work. Type specification is required for use.
Polymorphic Traits
Traits can take parameters. This is the same as for polymorphic types.
Mapper T: Type = Trait {
.mapIter = {Iterator}
.map = (self: Self, T -> U) -> Self.MapIter U
}
# ArrayIterator <: Mapper
# ArrayIterator.MapIter == ArrayMapper
# [1, 2, 3].iter(): ArrayIterator Int
# [1, 2, 3].iter().map(x -> "\{x}"): ArrayMapper Str
assert [1, 2, 3].iter().map(x -> "\{x}").collect(Array) == ["1", "2", "3"].
Override in Trait
Derived traits can override the type definitions of the base trait. In this case, the type of the overriding method must be a subtype of the base method type.
# `Self.(R) -> O` is a subtype of ``Self.(R) -> O or Panic
Div R, O: Type = Trait {
. `/` = (self: Self, R) -> O or Panic
}
SafeDiv R, O = Subsume Div, {
@Override
. `/` = (self: Self, R) -> O
}
Implementing and resolving duplicate traits in the API
The actual definitions of Add
, Sub
, and Mul
look like this.
Add R = Trait {
.Output = Type
. `_+_` = (self: Self, R) -> .Output
}
Sub R = Trait {
.Output = Type
. `_-_` = (self: Self, R) -> .Output
}
Mul R = Trait {
.Output = Type
. `*` = (self: Self, R) -> .Output
}
.Output
is duplicated. If you want to implement these multiple traits at the same time, specify the following.
P = Class {.x = Int; .y = Int}
# P|Self <: Add(P)| can be abbreviated to P|<: Add(P)|
P|Self <: Add(P)|.
Output = P
`_+_` self, other = P.new {.x = self.x + other.x; .y = self.y + other.y}
P|Self <: Mul(Int)|.
Output = P
`*` self, other = P.new {.x = self.x * other; .y = self.y * other}
Duplicate APIs implemented in this way are almost always type inferred when used, but can also be resolved by explicitly specifying the type with ||
.
print! P.Output # TypeError: ambiguous type
print! P|<: Mul(Int)|.Output # <class 'P'>
Appendix: Differences from Rust traits
Erg's trait is faithful to the one proposed by Schärli et al.. In order to allow algebraic operations, traits are designed to be unable to have method implementations directory, but can be patched if necessary.
Class
A class in Erg is roughly a type that can create its own elements (instances). Here is an example of a simple class.
Person = Class {.name = Str; .age = Nat}
# If `.new` is not defined, then Erg will create `Person.new = Person::__new__`
Person.
new name, age = Self::__new__ {.name = name; .age = age}
john = Person.new "John Smith", 25
print! john # <Person object>
print! classof(john) # Person
The type given to Class
(normally a record type) is called the requirement type (in this case {.name = Str; .age = Nat}
).
Instances can be created with <Class name>::__new__ {<attribute name> = <value>; ...}
can be created with.
{.name = "John Smith"; .age = 25}
is just a record, but it is converted to a Person
instance by passing Person.new
.
The subroutine that creates such an instance is called a constructor.
In the class above, the .new
method is defined so that field names, etc. can be omitted.
Note that the following definition without line breaks will result in a syntax error.
Person.new name, age = ... # SyntaxError: cannot define attributes directly on an object
Newtype notation
You can define a class C
by C = Class T
for a non-record type T
. This is a short-hand notation, which is equivalent to C = Class {base = T}
.
This is to simplify the definition of the so-called "new type pattern".
Also, the constructor __new__
/new
can be passed directly to a T
type object without wrapping it in a record.
Id = Class {base = Int}
i = Id.new {base = 1}
# ↓
Id = Class Int
i = Id.new 1
Instance and class attributes
In Python and other languages, instance attributes are often defined on the block side as follows, but note that such writing has a different meaning in Erg.
# Python
class Person:
name: str
age: int
# In Erg, this notation implies the declaration of a class attribute (not an instance attribute)
Person = Class()
Person.
name: Str
age: Int
# Erg code for the Python code above
Person = Class {
.name = Str
.age = Nat
}
Element attributes (attributes defined in a record) and type attributes (also called instance/class attributes, especially in the case of classes) are completely different things. Type attributes are attributes of the type itself. An element of a type refers to a type attribute when it does not have the desired attribute in itself. An element attribute is a unique attribute directly possessed by the element. Why is this distinction made? If all attributes were element attributes, it would be inefficient to duplicate and initialize all attributes when the object is created. In addition, dividing the attributes in this way clarifies roles such as "this attribute is shared" and "this attribute is held separately".
The example below illustrates this. The attribute species
is common to all instances, so it is more natural to use it as a class attribute. However, the attribute name
should be an instance attribute because each instance should have it individually.
Person = Class {name = Str}
Person::
species = "human"
Person.
describe() =
log "species: \{species}"
greet self =
log "Hello, My name is \{self::name}."
Person.describe() # species: human
Person.greet() # TypeError: unbound method Person.greet needs an argument
john = Person.new {name = "John"}
john.describe() # species: human
john.greet() # Hello, My name is John.
alice = Person.new {name = "Alice"}
alice.describe() # species: human
alice.greet() # Hello, My name is Alice.
Incidentally, if an instance attribute and a type attribute have the same name and the same type, a compile error occurs. This is to avoid confusion.
C = Class {.i = Int}
C.i = 1 # AttributeError: `.i` is already defined in instance fields
Class, Type
The question "What is the type of 1
?" requires a slightly longer answer.
There is only one class Nat
from which 1
belongs. The class to which an object belongs can be obtained by classof(obj)
or obj.__class__
.
In contrast, there are countless types to which 1
belongs. Examples are {1}, {0, 1}, 0..12, Nat, Int, Num
.
The reason an object can belong to more than one type is that Erg has a subtype system. Nat
is a subtype of Int
and Int
is a subtype of Num
.
Difference from structural types
We said that a class is a type that can generate its own elements, but that is not a strict description. In fact, a record type + patch can do the same thing.
Person = {.name = Str; .age = Nat}
PersonImpl = Patch Person
PersonImpl.
new name, age = {.name; .age}
john = Person.new("John Smith", 25)
There are four advantages to using classes. The first is that the constructor is validity checked, the second is that it is more performant, the third is that you can use notational subtypes (NSTs), and the fourth is that you can inherit and override.
We saw earlier that record type + patch can also define a constructor (of sorts), but this is of course not a legitimate constructor. This is of course not a legitimate constructor, because it can return a completely unrelated object even if it calls itself .new
. In the case of a class, .new
is statically checked to see if it produces an object that satisfies the requirements.
~
Type checking for classes is simply a matter of checking the object's . __class__
attribute of the object. So it is fast to check if an object belongs to a type.
~
Erg enables NSTs in classes; the advantages of NSTs include robustness. When writing large programs, it is often the case that the structure of an object is coincidentally matched.
Dog = {.name = Str; .age = Nat}
DogImpl = Patch Dog
DogImpl.
bark = log "Yelp!"
...
Person = {.name = Str; .age = Nat}
PersonImpl = Patch Person
PersonImpl.
greet self = log "Hello, my name is \{self.name}."
john = {.name = "John Smith"; .age = 20}
john.bark() # "Yelp!"
The structure of Dog
and Person
is exactly the same, but it is obviously nonsense to allow animals to greet and humans to bark.
The former is impossible, so it is safer to make it inapplicable. In such cases, it is better to use classes.
Dog = Class {.name = Str; .age = Nat}
Dog.bark = log "Yelp!"
...
Person = Class {.name = Str; .age = Nat}
Person.greet self = log "Hello, my name is \{self.name}."
john = Person.new {.name = "John Smith"; .age = 20}
john.bark() # TypeError: `Person` object has no method `.bark`.
Another feature is that the type attributes added by the patch are virtual and are not held as entities by the implementing class.
That is, T.x
, T.bar
are objects that can be accessed (compile-time bound) by types compatible with {i = Int}
, and are not defined in {i = Int}
or C
.
In contrast, class attributes are held by the class itself. Therefore, they cannot be accessed by classes that are not in an inheritance relationship, even if they have the same structure.
C = Class {i = Int}
C.
foo self = ...
print! dir(C) # ["foo", ...].
T = Patch {i = Int}
T.
x = 1
bar self = ...
print! dir(T) # ["bar", "x", ...].
assert T.x == 1
assert {i = 1}.x == 1
print! T.bar # <function bar>
{i = Int}.bar # TypeError: Record({i = Int}) has no method `.bar`.
C.bar # TypeError: C has no method `.bar` print!
print! {i = 1}.bar # <method bar>
C.new({i = 1}).bar # <method bar>
Difference from Data Class
There are two types of classes: regular classes, which are generated with Class(record)
, and data classes, which are generated with Inherit(record)
.
The data class inherits the functionality of the record class and has features such as decomposition assignment, ==
and hash
implemented by default, etc. On the other hand, the data class has its own equivalence relation and format display.
On the other hand, if you want to define your own equivalence relations or formatting displays, you should use the normal class.
C = Class {i = Int}
c = C.new {i = 1}
d = C.new {i = 2}
print! c # <C object>
c == d # TypeError: `==` is not implemented for `C`
D = Inherit {i = Int}
e = D::{i = 1} # same as `e = D.new {i = 1}`
f = D::{i = 2}
print! e # D(i=1)
assert e ! = f
Enum Class
To facilitate defining classes of type Or
, an Enum
is provided.
X = Class()
Y = Class()
XorY = Enum X, Y
Each type can be accessed as XorY.X
, XorY.Y
and the constructor can be obtained as X.new |> XorY.new
.
x1 = XorY.new X.new()
x2 = (X.new |> XorY.new())()
x3 = (Y.new |> XorY.new())()
assert x1 == x2
assert x1 != x3
Class Relationships
A class is a subtype of a requirement type. methods (including patch methods) of the requirement type can be used in the class.
T = Trait {.foo = Foo}
C = Class(... , impl: T)
C.
foo = foo
bar x = ...
assert C < T
assert C.foo == foo
assert not T < C
assert T.foo == Foo
Inheritance
Inheritance allows you to define a new class that adds functionality or specialization to an existing class. Inheritance is similar to inclusion in a trait. The inherited class becomes a subtype of the original class.
Inherit
is a function that defines an inherited class. The type on the left side is called subclass and the argument type of Inherit
on the right side is called superclass.
NewInt = Inherit Int
NewInt.
plus1 self = self + 1
assert NewInt.new(1).plus1() == 2
assert NewInt.new(1) + NewInt.new(1) == 2
Unlike Python, classes are non-inheritable (final) by default.
If you want the newly defined class to be inheritable, you must give it the Inheritable
decorator.
You can specify an optional argument Additional
to allow the class to have additional instance attributes, but only if the class is a value class. However, you cannot add instance attributes if the class is a value class.
@Inheritable
Person = Class {name = Str}
Student = Inherit Person, Additional := {id = Int}
john = Person.new {name = "John"}
alice = Student.new {name = "Alice", id = 123}
MailAddress = Inherit Str, Additional := {owner = Str} # TypeError: instance variables cannot be added to a value class
Inheritance of Enumerated Classes
Or type can also be inherited. In this case, you can remove any of the choices (multiple choices are possible with or
) by specifying the optional argument Excluding
.
No additional choices can be added. The class to which you add an option is not a subtype of the original class.
Number = Class Int or Float or Complex
Number
.abs(self): Float =
match self:
i: Int -> i.abs().into Float
f: Float -> f.abs()
c: Complex -> c.abs().into Float
# c: Complex cannot appear in match choices
RealNumber = Inherit Number, Excluding: Complex
Similarly, refinement type can also be specified.
Months = Class 0..12
MonthsNot31Days = Inherit Months, Excluding: {1, 3, 5, 7, 8, 10, 12}
StrMoreThan3 = Class StrWithLen N | N >= 3
StrMoreThan4 = Inherit StrMoreThan3, Excluding: StrWithLen N | N == 3
Overriding
The class is the same as the patch in that new methods can be added to the original type, but the class can be further "overridden".
This overriding is called override. To override, three conditions must be met.
First, the override must have an Override
decorator because by default it will result in an error.
In addition, the override cannot change the type of the method. It must be a subtype of the original type.
And if you override a method that is referenced by another method, you must also override all referenced methods.
Why is this condition necessary? It is because overriding does not merely change the behavior of one method, but may affect the behavior of another method.
Let's start with the first condition. This condition is to prevent "accidental overrides.
In other words, the Override
decorator must be used to prevent the name of a newly defined method in a derived class from conflicting with the name of the base class.
Next, consider the second condition. This is for type consistency. Since the derived class is a subtype of the base class, its behavior must also be compatible with that of the base class.
Finally, consider the third condition. This condition is unique to Erg and not often found in other object-oriented languages, again for safety. Let's look at what could go wrong if this were not the case.
# Bad example
@Inheritable
Base! = Class {x = Int!}
Base!.
f! ref! self =
print! self::x
self.g!()
g! ref! self = self::x.update! x -> x + 1
Inherited! = Inherit Base!
Inherited!.
@Override
g! ref! self = self.f!() # InfiniteRecursionWarning: This code falls into an infinite loop
# OverrideError: method `.g` is referenced by `.f` but not overridden
In the inherited class Inherited!
, the .g!
method is overridden to transfer processing to .f!
. However, the .f!
method in the base class transfers its processing to .g!
, resulting in an infinite loop. .f
was a problem-free method in the Base!
class, but it was used in an unexpected way by the override, and it was broken.
Erg has built this rule into the specification.
# OK.
@Inheritable
Base! = Class {x = Int!}
Base!.
f! ref! self =
print! self::x
self.g!()
g! ref! self = self::x.update! x -> x + 1
Inherited! = Inherit Base!
Inherited!.
@Override
f! ref! self =
print! self::x
self::x.update! x -> x + 1
@Override
g! ref! self = self.f!()
However, this specification does not completely solve the override problem. However, this specification does not completely solve the override problem, since the compiler cannot detect if the override fixes the problem. It is the responsibility of the programmer creating the derived class to correct the effects of the override. Whenever possible, try to define an alias method.
Replacing Traits (or what looks like it)
Although it is not possible to replace traits at inheritance time, there are examples that appear to do so.
For example, Int
, a subtype of Real
(which implements Add()
), appears to reimplement Add()
.
Int = Class ... , Impl := Add() and ...
But in fact Add()
in Real
stands for Add(Real, Real)
, and in Int
it is just overwritten by Add(Int, Int)
.
They are two different traits (Add
is a covariate, so Add(Real, Real) :> Add(Int, Int)
).
Multiple Inheritance
Erg does not allow intersection, diff, and complement between normal classes.
Int and Str # TypeError: cannot unite classes
This rule prevents inheritance from multiple classes, i.e., multiple inheritance.
IntAndStr = Inherit Int and Str # SyntaxError: multiple inheritance of classes is not allowed
However, multiple inherited Python classes can be used.
Multi-layer (multi-level) Inheritance
Erg inheritance also prohibits multi-layer inheritance. That is, you cannot define a class that inherits from another class.
Inheritable classes that inherit from an Object
may exceptionally inherit.
Also in this case, Python's multi-layered inherited classes can be used.
Rewriting Inherited Attributes
Erg does not allow rewriting the attributes inherited from the base class. This has two implications.
The first is an update operation on the inherited source class attribute. It cannot be reassigned, nor can it be updated by the .update!
method, for example.
Overriding is different from rewriting because it is an operation to override with a more specialized method. Overrides must also be replaced by compatible types.
@Inheritable
Base! = Class {.pub = !Int; pri = !Int}
Base!
var = !1
inc_pub! ref! self = self.pub.update! p -> p + 1
Inherited! = Inherit Base!
Inherited!
var.update! v -> v + 1
# TypeError: can't update base class variables
@Override
inc_pub! ref! self = self.pub + 1
# OverrideError: `.inc_pub!` must be subtype of `Self! () => ()`
The second is an update operation on the (variable) instance attribute of the inherited source. This is also prohibited. Instance attributes of the base class may only be updated from methods provided by the base class. Regardless of the visibility of the attribute, it cannot be updated directly. However, they can be read.
@Inheritable
Base! = Class {.pub = !Int; pri = !Int}
Base!.
inc_pub! ref! self = self.pub.update! p -> p + 1
inc_pri! ref! self = self::pri.update! p -> p + 1
self = self.pub.update!
Inherited!.
# OK
add2_pub! ref! self =
self.inc_pub!()
self.inc_pub!()
# NG, `Child` cannot touch `self.pub` and `self::pri`.
add2_pub! ref! self =
self.pub.update! p -> p + 2
After all, Erg inheritance can only add new attributes and override base class methods.
Usage of Inheritance
While inheritance is a powerful feature when used correctly, it also has the drawback that it tends to complicate class dependencies, especially when multiple or multi-layer inheritance is used. Complicated dependencies can reduce code maintainability. The reason Erg prohibits multiple and multi-layer inheritance is to reduce this risk, and the class patch feature was introduced to reduce the complexity of dependencies while retaining the "add functionality" aspect of inheritance.
So, conversely, where should inheritance be used? One indicator is when "semantic subtypes of the base class are desired.
Erg allows the type system to automatically do part of the subtype determination (e.g., Nat, where Int is greater than or equal to 0).
However, for example, it is difficult to create a "string type representing a valid e-mail address" relying solely on Erg's type system. You should probably perform validation on a normal string. Then, we would like to add some kind of "warrant" to the string object that has passed validation. That is the equivalent of downcasting to an inherited class. Downcasting a Str object
to ValidMailAddressStr
is a one-to-one correspondence with validating that the string is in the correct email address format.
ValidMailAddressStr = Inherit Str
ValidMailAddressStr.
init s: Str =
validate s # mail-address validation
Self.new s
s1 = "invalid mail address"
s2 = "foo@gmail.com"
_ = ValidMailAddressStr.init s1 # panic: invalid mail address
valid = ValidMailAddressStr.init s2
valid: ValidMailAddressStr # assurance that it is in the correct email address format
Another indicator is when you want to achieve a nominal polymorphism.
For example, the greet!
procedure defined below will accept any object of type Named
.
But obviously it is wrong to apply a Dog
type object. So we will use the Person
class for the argument type.
This way, only Person
objects, classes that inherit from them, and Student
objects will be accepted as arguments.
This is more conservative and avoids unnecessarily assuming too much responsibility.
Named = {name = Str; ...}
Dog = Class {name = Str; breed = Str}
Person = Class {name = Str}
Student = Inherit Person, additional: {id = Int}
structural_greet! person: Named =
print! "Hello, my name is \{person::name}."
greet! person: Person =
print! "Hello, my name is \{person::name}."
max = Dog.new {name = "Max", breed = "Labrador"}
john = Person.new {name = "John"}
alice = Student.new {name = "Alice", id = 123}
structural_greet! max # Hello, my name is Max.
structural_greet! john # Hello, my name is John.
greet! alice # Hello, my name is Alice.
greet! max # TypeError:
Nominal Subtyping vs. Structural Subtyping
Months = 0..12
# NST
MonthsClass = Class Months
MonthsClass.
name self =
match self:
1 -> "january"
2 -> "february"
3 -> "march"
...
# SST
MonthsImpl = Patch Months
MonthsImpl.
name self =
match self:
1 -> "January"
2 -> "February"
3 -> "March"
...
assert 12 in Months
assert 2.name() == "February"
assert not 12 in MonthsClass
assert MonthsClass.new(12) in MonthsClass
# It can use structural types, even though wrapped in a class.
assert MonthsClass.new(12) in Months
# If both exist, class methods take priority.
assert MonthsClass.new(2).name() == "february"
In The End, Which Should I Use, NST or SST?
If you cannot decide which one to use, our recommendation is NST. SST requires abstraction skills to write code that does not break down in any use case. Good abstraction can lead to high productivity, but wrong abstraction (commonality by appearances) can lead to counterproductive results. (NSTs can reduce this risk by deliberately keeping abstraction to a minimum. If you are not a library implementor, it is not a bad idea to code only with NSTs.
Patch
Erg does not allow modification of existing types and classes. This means, it is not possible to define additional methods in a class, nor to perform specialization (a language feature that monomorphizes a polymorphically declared type and defines a dedicated method, as in C++). However, there are many situations where you may want to add feature to an existing type or class, and there is a function called "patching" that allows you to do this.
StrReverse = Patch Str
StrReverse.
reverse self = self.iter().rev().collect(Str)
assert "abc".reverse() == "cba"
The name of the patch should be a straightforward description of the primary functionality to be added.
This way, objects of the type being patched (Str
) can use the methods of the patch (StrReverse
).
In fact, built-in method .reverse
is not a method of Str
, but a method added to StrRReverse
.
However, patch methods have lower precedence than methods of the nominal type (class/trait) and cannot override methods of existing types.
StrangeInt = Patch Int
StrangeInt.
`_+_` = Int.`_-_` # AssignError: . `_+_` is already defined in Int
If you want to override, you must inherit from the class. However, it is basically recommended not to override and to define a method with a different name. Overriding is not very easy to do because of some safety restrictions.
StrangeInt = Inherit Int
StrangeInt.
# Overriding methods must be given Override decorators.
# In addition, you need to override all Int methods that depend on Int.`_+_`.
@Override
`_+_` = Super.`_-_` # OverrideError: Int.`_+_` is referenced by ... , so these methods must also be overridden
Selecting Patches
Patches can be defined for a single type, and can be grouped together.
# foo.er
StrReverse = Patch(Str)
StrReverse.
reverse self = ...
StrMultiReplace = Patch(Str)
StrMultiReverse.
multi_replace self, pattern_and_targets: [(Pattern, Str)] = ...
StrToCamelCase = Patch(Str)
StrToCamelCase.
to_camel_case self = ...
StrToKebabCase = Patch(Str)
StrToKebabCase.
to_kebab_case self = ...
StrBoosterPack = StrReverse and StrMultiReplace and StrToCamelCase and StrToKebabCase
StrBoosterPack = StrReverse and StrMultiReplace and StrToCamelCase and StrToKebabCase
{StrBoosterPack;} = import "foo"
assert "abc".reverse() == "cba"
assert "abc".multi_replace([("a", "A"), ("b", "B")]) == "ABc"
assert "to camel case".to_camel_case() == "toCamelCase"
assert "to kebab case".to_kebab_case() == "to-kebab-case"
If multiple patches are defined, some of them may result in duplicate implementations.
# foo.er
StrReverse = Patch(Str)
StrReverse.
reverse self = ...
# more efficient implementation
StrReverseMk2 = Patch(Str)
StrReverseMk2.
reverse self = ...
"hello".reverse() # PatchSelectionError: multiple choices of `.reverse`: StrReverse, StrReverseMk2
In such a case, you can make it unique by using the related function form instead of the method form.
assert StrReverseMk2.reverse("hello") == "olleh"
You can also make it unique by selectively importing.
{StrReverseMk2;} = import "foo"
assert "hello".reverse() == "olleh"
Glue Patch
Patches can also relate types to each other. The StrReverse
patch relates Str
and Reverse
.
Such a patch is called a glue patch.
Because Str
is a built-in type, a glue patch is necessary for users to retrofit traits.
Reverse = Trait {
.reverse = Self.() -> Self
}
StrReverse = Patch Str, Impl := Reverse
StrReverse.
reverse self =
self.iter().rev().collect(Str)
Only one glue patch can be defined per type/trait pair. This is because if multiple glue patches were "visible" at the same time, it would not be possible to uniquely determine which implementation to choose. However, you can swap patches when moving to another scope (module).
NumericStr = Inherit Str
NumericStr.
...
NumStrRev = Patch NumericStr, Impl := Reverse
NumStrRev.
...
# DuplicatePatchError: NumericStr is already associated with `Reverse`
# hint: `Str` (superclass of `NumericStr`) is associated with `Reverse` by `StrReverse`
Appendix: Relationship to Rust's Trait
Erg patches are the equivalent of Rust's (retrofitted) impl
blocks.
#![allow(unused)] fn main() { // Rust trait Reverse { fn reverse(self) -> Self; } impl Reverse for String { fn reverse(self) -> Self { self.chars().rev().collect() } } }
You could say that Rust's traits are features of Erg's traits and patches. This makes Rust's traits sound more convenient, but that is not necessarily the case.
# Erg
Reverse = Trait {
.reverse = Self.() -> Self
}
StrReverse = Patch(Str, Impl := Reverse)
StrReverse.
reverse self =
self.iter().rev().collect(Str)
Because the impl
block is objectized as a patch in Erg, selective inclusion is possible when importing from other modules. As a side-effect, it also allows implementation of external traits to external structures.
Also, syntaxes such as dyn trait
and impl trait
are no longer required by the structure type.
# Erg
reversible: [Reverse; 2] = [[1, 2, 3], "hello"]
iter|T|(i: Iterable T): Iterator T = i.iter()
#![allow(unused)] fn main() { // Rust let reversible: [Box<dyn Reverse>; 2] = [Box::new([1, 2, 3]), Box::new("hello")]; fn iter<I>(i: I) -> impl Iterator<Item = I::Item> where I: IntoIterator { i.into_iter() } }
For-All Patch
A patch can be defined not only for one specific type, but also for "function types in general" and so on.
In this case, the term to which the degree of freedom is to be given is given as an argument (in the case below, T: Type
). A patch defined in this way is called a for-all patch.
As you can see, a for-all patch is precisely a function that returns a patch, but it can also be considered a patch in its own right.
FnType T: Type = Patch(T -> T)
FnType(T).
type = T
assert (Int -> Int).type == Int
Structural Patch
In addition, patches can be defined for any type that satisfies a certain structure. However, this has a lower priority than nominal patches and class methods.
Careful design should be used when defining structural patches, as some properties are lost by extension, such as the following.
# This should not be `Structural`
Norm = Structural Patch {x = Int; y = Int}
Norm.
norm self = self::x**2 + self::y**2
Point2D = Class {x = Int; y = Int}
assert Point2D.new({x = 1; y = 2}).norm() == 5
Point3D = Class {x = Int; y = Int; z = Int}
assert Point3D.new({x = 1; y = 2; z = 3}).norm() == 14 # AssertionError:
Value Type
Value types are Erg built-in types that can be evaluated at compile time, specifically:
Value = (
Int
or Nat
or Ratio
or Float
or Complex
or Bool
or Str
or NoneType
or Array Const
or Tuple Const
or Set Const
or ConstFunc(Const, _)
or ConstProc(Const, _)
or ConstMethod(Const, _)
)
Value-type objects, constants, and compile-time subroutines applied to them are called constant expressions.
1, 1.0, 1+2im, True, None, "aaa", [1, 2, 3], Fib(12)
Be careful with subroutines. Subroutines may or may not be value types. Since the substance of a subroutine is just a pointer, it can be treated as a value 1, but when compiling something that is not a subroutine cannot be used in a constant context. is not a value type because it doesn't make much sense.
Types classified as value types may be added in the future.
1 The term "value type" in Erg differs from the definition in other languages. There is no concept of memory within pure Erg semantics, and it is incorrect to state that it is a value type because it is placed on the stack, or that it is not a value type because it is actually a pointer. A value type only means that it is a Value
type or its subtypes. ↩
Attributive Type
Attribute types are types that contain Record and Dataclass, Patch, Module, etc. Types belonging to attribute types are not value types.
Record Type Composite
It is possible to flatten Record types composited.
For example, {... {.name = Str; .age = Nat}; ... {.name = Str; .id = Nat}}
becomes {.name = Str; .age = Nat; .id = Nat}
.
Interval Type
The most basic use of Range
objects is as iterator.
for! 0..9, i =>
print! i
Note that unlike Python, it includes a end number.
However, this is not only use for the Range
objects. It can also be used the type. Such a type is called the Interval type.
i: 0..10 = 2
The Nat
type is equivalent to 0..<Inf
and, Int
and Ratio
are equivalent to -Inf<..<Inf
,
0..<Inf
can also be written 0.._
. _
means any instance of Int
type.
Since it is can also be used as iterator, it can be specified in reverse order, such as 10..0
, however <..
, ..<
and <..<
cannot be reversed.
a = 0..10 # OK
b = 0..<10 # OK
c = 10..0 # OK
d = 10<..0 # Syntax error
e = 10..<0 # Syntax error
f = 10<..<0 # Syntax error
A Range operator can be used for non-numeric types, as long as they are Ord
immutable types.
Alphabet = "A".."z"
Enumerative Type
Enum types generated by Set. Enum types can be used as-is with type specifications, but further methods can be defined by classifying them into classes or defining patches.
A partially typed system with an enumerated type is called an enumerated partially typed.
Bool = {True, False}
Status = {"ok", "error"}
Since 1..7
can be rewritten as {1, 2, 3, 4, 5, 6, 7}
, so when element is finite, the Enum types essentially equivalent the Range types.
Binary! = Class {0, 1}!.
invert! ref! self =
if! self == 0:
do!:
self.set! 1
do!:
self.set! 0
b = Binary!.new !0
b.invert!()
Incidentally, Erg's Enum types are a concept that encompasses enumerative types common in other languages.
#![allow(unused)] fn main() { // Rust enum Status { Ok, Error } }
# Erg
Status = {"Ok", "Error"}
The difference with Rust is that it uses a structural subtype(SST).
#![allow(unused)] fn main() { // There is no relationship between Status and ExtraStatus. enum Status { Ok, Error } enum ExtraStatus { Ok, Error, Unknown } // Methods can be implemented impl Status { // ... } impl ExtraStatus { // ... } }
# Status > ExtraStatus, and elements of Status can use methods of ExtraStatus.
Status = Trait {"Ok", "Error"}
# ...
ExtraStatus = Trait {"Ok", "Error", "Unknown"}
# ...
Methods can also be added by patching.
Use the or
operator to explicitly indicate inclusion or to add a choice to an existing Enum type.
ExtraStatus = Status or {"Unknown"}
An enumerated type in which all classes to which an element belongs are identical is called a homogenous enumerated type.
By default, a class whose requirement type is an homogeneous enumerated type can be treated as a subclass of the class to which the element belongs.
If you do not wish to do so, you can make it a wrapper class.
Abc = Class {"A", "B", "C"}
Abc.new("A").is_uppercase()
OpaqueAbc = Class {inner = {"A", "B", "C"}}.
new inner: {"A", "B", "C"} = Self.new {inner;}
OpaqueAbc.new("A").is_uppercase() # TypeError
Refinement Type
Refinement type is a type constrained by a predicate expression. Enumeration types and interval types are syntax sugar of refinement types.
The standard form of a refinement type is {Elem: Type | (Pred)*}
. This means that the type is a type whose elements are Elem
satisfying Pred
.
The type that can be used for the refinement type is Value type only.
Nat = 0.. _
Odd = {N: Int | N % 2 == 1}
Char = StrWithLen 1
# StrWithLen 1 == {_: StrWithLen N | N == 1}
[Int; 3] == {_: Array Int, N | N == 3}
Array3OrMore == {A: Array _, N | N >= 3}
When there are multiple preds, they can be separated by ;
or and
or or
. ;
and and
mean the same thing.
The elements of Odd
are 1, 3, 5, 7, 9, ...
.
It is called a refinement type because it is a type whose elements are part of an existing type as if it were a refinement.
The Pred
is called a (left-hand side) predicate expression. Like assignment expressions, it does not return a meaningful value, and only a pattern can be placed on the left-hand side.
That is, expressions such as X**2 - 5X + 6 == 0
cannot be used as refinement-type predicate expressions. In this respect, it differs from a right-hand-side predicate expression.
{X: Int | X**2 - 5X + 6 == 0} # SyntaxError: the predicate form is invalid. Only names can be on the left-hand side
If you know how to solve quadratic equations, you would expect the above refinement form to be equivalent to {2, 3}
.
However, the Erg compiler has very little knowledge of algebra, so it cannot solve the predicate on the right.
Subtyping rules for refinement types
All refinement types are subtypes of the type specified in the Type
part.
{I: Int | I <= 0} <: Int
Otherwise, the current Erg has a subtyping type rule for integer comparisons.
{I: Int | I <= 5} <: {I: Int | I <= 0}
Smart Cast
It's nice that you defined Odd
, but as it is, it doesn't look like it can be used much outside of literals. To promote an odd number in a normal Int
object to Odd
, i.e., to downcast an Int
to Odd
, you need to pass the constructor of Odd
.
For refinement types, the normal constructor .new
may panic, and there is an auxiliary constructor called .try_new
that returns a Result
type.
i = Odd.new (0..10).sample!() # i: Odd (or Panic)
It can also be used as a type specification in match
.
# i: 0..10
i = (0..10).sample!
match i:
o: Odd ->
log "i: Odd"
n: Nat -> # 0..10 < Nat
log "i: Nat"
However, Erg cannot currently make sub-decisions such as Even
because it was not Odd
, etc.
Enumerated, Interval and Refinement Types
The enumerative/interval types introduced before are syntax sugar of the refinement type.
{a, b, ...}
is {I: Typeof(a) | I == a or I == b or ... }
, and a..b
is desugarized to {I: Typeof(a) | I >= a and I <= b}
.
{1, 2} == {I: Int | I == 1 or I == 2}
1..10 == {I: Int | I >= 1 and I <= 10}
1... <10 == {I: Int | I >= 1 and I < 10}
Refinement pattern
Just as _: {X}
can be rewritten as X
(constant pattern), _: {X: T | Pred}
can be rewritten as X: T | Pred
.
# method `.m` is defined for arrays of length 3 or greater
Array(T, N | N >= 3)
.m(&self) = ...
Algebraic type
Algebraic types are types that are generated by operating types by treating them like algebra. Operations handled by them include Union, Intersection, Diff, Complement, and so on. Normal classes can only perform Union, and other operations will result in a type error.
Union
Union types can give multiple possibilities for types. As the name suggests, they are generated by the or
operator.
A typical Union is the Option
type. The Option
type is a T or NoneType
patch type, primarily representing values that may fail.
IntOrStr = Int or Str
assert dict.get("some key") in (Int or NoneType)
Option T = T or NoneType
Note that Union types are commutative but not associative. That is, X or Y or Z
is (X or Y) or Z
, not X or (Y or Z)
.
Allowing this would result in, for example, Int or Option(Str)
, Option(Int) or Str
and Option(Int or Str)
being of the same type.
Intersection
Intersection types are got by combining types with the and
operation.
Num = Add and Sub and Mul and Eq
As mentioned above, normal classes cannot be combined with the and
operation. This is because instances belong to only one class.
Diff
Diff types are got by not
operation.
It is better to use and not
as a closer notation to English text, but it is recommended to use just not
because it fits better alongside and
and or
.
CompleteNum = Add and Sub and Mul and Div and Eq and Ord
Num = CompleteNum not Div not Ord
True = Bool not {False}
OneTwoThree = {1, 2, 3, 4, 5, 6} - {4, 5, 6, 7, 8, 9, 10}
Complement
Complement types is got by the not
operation, which is a unary operation. The not T
type is a shorthand for {=} not T
.
Intersection with type not T
is equivalent to Diff, and Diff with type not T
is equivalent to Intersection.
However, this way of writing is not recommended.
# the simplest definition of the non-zero number type
NonZero = Not {0}
# deprecated styles
{True} == Bool and not {False} # 1 == 2 + - 1
Bool == {True} not not {False} # 2 == 1 - -1
True Algebraic type
There are two algebraic types: apparent algebraic types that can be simplified and true algebraic types that cannot be further simplified.
The "apparent algebraic types" include or
and and
of Enum, Interval, and the Record types.
These are not true algebraic types because they are simplified, and using them as type specifiers will result in a Warning; to eliminate the Warning, you must either simplify them or define their types.
assert {1, 2, 3} or {2, 3} == {1, 2, 3}
assert {1, 2, 3} and {2, 3} == {2, 3}
assert -2..-1 or 1..2 == {-2, -1, 1, 2}
i: {1, 2} or {3, 4} = 1 # TypeWarning: {1, 2} or {3, 4} can be simplified to {1, 2, 3, 4}
p: {x = Int, ...} and {y = Int; ...} = {x = 1; y = 2; z = 3}
# TypeWaring: {x = Int, ...} and {y = Int; ...} can be simplified to {x = Int; y = Int; ...}
Point1D = {x = Int; ...}
Point2D = Point1D and {y = Int; ...} # == {x = Int; y = Int; ...}
q: Point2D = {x = 1; y = 2; z = 3}
True algebraic types include the types Or
and And
. Classes such as or
between classes are of type Or
.
assert Int or Str == Or(Int, Str)
assert Int and Marker == And(Int, Marker)
Diff, Complement types are not true algebraic types because they can always be simplified.
dependent type
Dependent types are a feature that can be said to be the biggest feature of Erg. A dependent type is a type that takes a value as an argument. Ordinary polymorphic types can take only types as arguments, but dependent types relax that restriction.
Dependent types are equivalent to [T; N]
(Array(T, N)
).
This type is determined not only by the content type T
but also by the number of contents N
. N
contains an object of type Nat
.
a1 = [1, 2, 3]
assert a1 in [Nat; 3]
a2 = [4, 5, 6, 7]
assert a1 in [Nat; 4]
assert a1 + a2 in [Nat; 7]
If the type object passed in the function argument is related to the return type, write:
narray: |N: Nat| {N} -> [{N}; N]
narray(N: Nat): [N; N] = [N; N]
assert array(3) == [3, 3, 3]
When defining a dependent type, all type arguments must be constants.
Dependent types themselves exist in existing languages, but Erg has the feature of defining procedural methods on dependent types.
x=1
f x =
print! f::x, module::x
# The Phantom type has an attribute called Phantom whose value is the same as the type argument
T X: Int = Class Impl := Phantom X
T(X).
x self = self::Phantom
T(1).x() # 1
Type arguments of mutable dependent types can be transitioned by method application.
Transition specification is done with ~>
.
# Note that `Id` is an immutable type and cannot be transitioned
VM!(State: {"stopped", "running"}! := _, Id: Nat := _) = Class(..., Impl := Phantom! State)
VM!().
# Variables that do not change can be omitted by passing `_`.
start! ref! self("stopped" ~> "running") =
self.initialize_something!()
self::set_phantom!("running")
# You can also cut out by type argument (only in the module where it's defined)
VM!.new() = VM!(!"stopped", 1).new()
VM!("running" ~> "running").stop!ref!self =
self.close_something!()
self::set_phantom!("stopped")
vm = VM!.new()
vm.start!()
vm.stop!()
vm.stop!() # TypeError: VM!(!"stopped", 1) doesn't have .stop!()
# hint: VM!(!"running", 1) has .stop!()
You can also embed or inherit existing types to create dependent types.
MyArray(T, N) = Inherit[T; N]
# The type of self: Self(T, N) changes in conjunction with .array
MyStruct!(T, N: Nat!) = Class {.array: [T; !N]}
Type Variable, quantified type
A type variable is a variable used, for example, to specify the type of subroutine arguments, and its type is arbitrary (not monomorphic).
First, as motivation for introducing type variables, consider the id
function, which returns input as is.
id x: Int = x
The id
function that returns the input as is is defined for the type Int
, but this function can obviously be defined for any type.
Let's use Object
for the largest class.
id x: Object = x
i = id 1
s = id "foo"
b = id True
Sure, it now accepts arbitrary types, but there is one problem: the return type is expanded to Object
. The return type is expanded to Object
.
I would like to see the return type Int
if the input is of type Int
, and Str
if it is of type Str
.
print! id 1 # <Object object>
id(1) + 1 # TypeError: cannot add `Object` and `Int
To ensure that the type of the input is the same as the type of the return value, use a type variable.
Type variables are declared in ||
(type variable list).
id|T: Type| x: T = x
assert id(1) == 1
assert id("foo") == "foo"
assert id(True) == True
This is called the universal quantification (universalization) of the function. There are minor differences, but it corresponds to the function called generics in other languages. A universalized function is called a polymorphic function. Defining a polymorphic function is like defining a function of the same form for all types (Erg prohibits overloading, so the code below cannot really be written).
id|T: Type| x: T = x
# pseudo code
id x: Int = x
id x: Str = x
id x: Bool = x
id x: Ratio = x
id x: NoneType = x
...
Also, the type variable T
can be inferred to be of type Type
since it is used in the type specification. So |T: Type|
can simply be abbreviated to |T|
.
You can also omit |T, N| foo: [T; N]
if it can be inferred to be other than a type object (T: Type, N: Nat
).
You can also provide constraints if the type is too large for an arbitrary type. Constraints also have advantages, for example, a subtype specification allows certain methods to be used.
# T <: Add
# => T is a subclass of Add
# => can do addition
add|T <: Add| l: T, r: T = l + r
In this example, T
is required to be a subclass of type Add
, and the actual types of l
and r
to be assigned must be the same.
In this case, T
is satisfied by Int
, Ratio
, etc. So, the addition of Int
and Str
, for example, is not defined and is therefore rejected.
You can also type it like this.
f|
Y, Z: Type
X <: Add Y, O1
O1 <: Add Z, O2
O2 <: Add X, _
| x: X, y: Y, z: Z =
x + y + z + x
If the annotation list is long, you may want to pre-declare it.
f: |Y, Z: Type, X <: Add(Y, O1), O1 <: Add(Z, O2), O2 <: Add(X, O3)| (X, Y, Z) -> O3
f|X, Y, Z| x: X, y: Y, z: Z =
x + y + z + x
Unlike many languages with generics, all declared type variables must be used either in the temporary argument list (the x: X, y: Y, z: Z
part) or in the arguments of other type variables.
This is a requirement from Erg's language design that all type variables are inferrable from real arguments.
So information that cannot be inferred, such as the return type, is passed from real arguments; Erg allows types to be passed from real arguments.
Iterator T = Trait {
# Passing return types from arguments.
# .collect: |K: Type -> Type| (self: Self, {K}) -> K(T)
.collect(self, K: Type -> Type): K(T) = ...
...
}
it = [1, 2, 3].iter().map i -> i + 1
it.collect(Array) # [2, 3, 4].
Type variables can only be declared during ||
. However, once declared, they can be used anywhere until they exit scope.
f|X|(x: X): () =
y: X = x.clone()
log X.__name__
log X
f 1
# Int
# <class Int>
You can also explicitly monophasize at the time of use as follows
f: Int -> Int = id|Int|
In that case, the specified type takes precedence over the type of the actual argument (failure to match will result in a type error that the type of the actual argument is wrong). That is, if the actual object passed can be converted to the specified type, it will be converted; otherwise, a compile error will result.
assert id(1) == 1
assert id|Int|(1) in Int
assert id|Ratio|(1) in Ratio
# You can also use keyword arguments
assert id|T: Int|(1) == 1
id|Int|("str") # TypeError: id|Int| is type `Int -> Int` but got Str
When this syntax is batting against comprehensions, you need to enclose it in ()
.
# {id|Int| x | x <- 1..10} would be interpreted as {id | ...} will be interpreted as.
{(id|Int| x) | x <- 1..10}
A type variable cannot be declared with the same name as a type that already exists. This is because all type variables are constants.
I: Type
# ↓ invalid type variable, already exists
f|I: Type| ... = ...
Type arguments in method definitions
Type arguments on the left-hand side are treated as bound variables by default.
K(T: Type, N: Nat) = ...
K(T, N).
foo(x) = ...
Using another type variable name will result in a warning.
K(T: Type, N: Nat) = ...
K(U, M). # Warning: K's type variable names are 'T' and 'N'
foo(x) = ...
Constants are the same in all namespaces since their definition, so of course they cannot be used for type variable names.
N = 1
K(N: Nat) = ... # NameError: N is already defined
L(M: Nat) = ...
# Defined only if M == N == 1
L(N).
foo(self, x) = ...
# defined for any M: Nat
L(M).
.bar(self, x) = ...
You cannot have multiple definitions for each type argument, but you can define methods with the same name because there is no relationship between dependent types that are not assigned type arguments (non-primitive-kind) and dependent types that are assigned (primitive-kind).
K(I: Int) = ...
K.
# K is not a true type (atomic Kind), so we cannot define a method
# This is not a method (more like a static method)
foo(x) = ...
K(0).
foo(self, x): Nat = ...
For-all types
The id
function defined in the previous section is a function that can be of any type. So what is the type of the id
function itself?
print! classof(id) # |T: Type| T -> T
We get a type |T: Type| T -> T
. This is called a closed universal quantified type/universal type, which is ['a. ...]'
in ML, and forall t. ...
in Haskell. Why the adjective "closed" is used is discussed below.
There is a restriction on the closed universal quantified type: only subroutine types can be made universal quantified, i.e., only subroutine types can be placed in the left clause. But this is sufficient, since subroutines are the most basic control structure in Erg, so when we say "I want to handle arbitrary X," i.e., I want a subroutine that can handle arbitrary X. So, the quantified type has the same meaning as the polymorphic function type. From now on, this type is basically called polymorphic function type.
Like anonymous functions, polymorphic types have arbitrary type variable names, but they all have the same value.
assert (|T: Type| T -> T) == (|U: Type| U -> U)
The equality is satisfied when there is an alpha equivalence, as in the lambda calculus. Since there are some restrictions on operations on types, equivalence determination is always possible (if we don't consider the stoppage property).
Subtyping of Polymorphic Function Types
A polymorphic function type can be any function type. This means that there is a subtype relationship with any function type. Let's look at this relationship in detail.
A type in which the type variable is defined on the left-hand side and used on the right-hand side, such as OpenFn T: Type = T -> T
, is called an open universal type.
In contrast, a type in which type variables are defined and used on the right-hand side, such as ClosedFn = |T: Type| T -> T
, is called a closed universal type.
An open universal type is a supertype of all isomorphic "true" types. In contrast, a closed universal type is a subtype of all isomorphic true types.
(|T: Type| T -> T) < (Int -> Int) < (T -> T)
You may remember that closed ones are smaller/open ones are larger. But why is this so? For a better understanding, let's consider an instance of each.
# id: |T: Type| T -> T
id|T|(x: T): T = x
# iid: Int -> Int
iid(x: Int): Int = x
# return arbitrary function as is
id_arbitrary_fn|T|(f1: T -> T): (T -> T) = f
# id_arbitrary_fn(id) == id
# id_arbitrary_fn(iid) == iid
# return the poly correlation number as it is
id_poly_fn(f2: (|T| T -> T)): (|T| T -> T) = f
# id_poly_fn(id) == id
id_poly_fn(iid) # TypeError
# Return Int type function as is
id_int_fn(f3: Int -> Int): (Int -> Int) = f
# id_int_fn(id) == id|Int|
# id_int_fn(iid) == iid
Since id
, which is of type |T: Type| T -> T
, can be assigned to a parameter f3
of type Int -> Int
, we may consider (|T| T -> T) < (Int -> Int)
.
Conversely, iid
, which is of type Int -> Int
, cannot be assigned to parameter f2
of type (|T| T -> T)
, but it can be assigned to parameter f1
of type T -> T
, so (Int -> Int) < (T -> T)
.
Therefore, it is indeed (|T| T -> T) < (Int -> Int) < (T -> T)
.
Quantified Types and Dependent Types
What is the relationship between dependent types and quantified types (polymorphic function types) and what is the difference between them? We can say that a dependent type is a type that takes arguments, and an quantified type is a type that gives arbitrariness to the arguments.
The important point is that there are no type arguments in the closed, polymorphic type itself. For example, the polymorphic function type |T| T -> T
is a type that takes a polymorphic function only, and its definition is closed. You cannot define methods, etc. using its type argument T
.
In Erg, the type itself is also a value, so types that take arguments, such as function types, will probably be dependent types. In other words, polymorphic function types are both a quantified type and a dependent type.
PolyFn = Patch(|T| T -> T)
PolyFn.
type self = T # NameError: cannot find 'T'
DepFn T = Patch(T -> T)
DepFn.
type self =
log "by DepFn"
T
assert (Int -> Int).type() == Int # by DepFn
assert DepFn(Int).type() == Int # by DepFn
Subtyping
In Erg, class inclusion can be determined with the comparison operators <
, >
.
Nat < Int
Int < Object
1... _ < Nat
{1, 2} > {1}
{=} > {x = Int}
{I: Int | I >= 1} < {I: Int | I >= 0}
Note that this has a different meaning than the <:
operator. It declares that the class on the left-hand side is a subtype of the type on the right-hand side, and is meaningful only at compile-time.
C <: T # T: StructuralType
f|D <: E| ...
assert F < G
You can also specify Self <: Add
for a polymorphic subtype specification, for example Self(R, O) <: Add(R, O)
.
Structural types and class type relationships
Structural types are types for structural typing and are considered to be the same object if they have the same structure.
T = Structural {i = Int}
U = Structural {i = Int}
assert T == U
t: T = {i = 1}
assert t in T
assert t in U
In contrast, classes are types for notational typing and cannot be compared structurally to types and instances.
C = Class {i = Int}
D = Class {i = Int}
assert C == D # TypeError: cannot compare classes
c = C.new {i = 1}
assert c in C
assert not c in D
Subtyping of subroutines
Arguments and return values of subroutines take only a single class. In other words, you cannot directly specify a structural type or a trait as the type of a function. It must be specified as "a single class that is a subtype of that type" using the partial type specification.
# OK
f1 x, y: Int = x + y
# NG
f2 x, y: Add = x + y
# OK
# A is some concrete class
f3<A <: Add> x, y: A = x + y
Type inference in subroutines also follows this rule. When a variable in a subroutine has an unspecified type, the compiler first checks to see if it is an instance of one of the classes, and if not, looks for a match in the scope of the trait. If it still cannot find one, a compile error occurs. This error can be resolved by using a structural type, but since inferring an anonymous type may have unintended consequences for the programmer, it is designed to be explicitly specified by the programmer with Structural
.
Class upcasting
i: Int
i as (Int or Str)
i as (1..10)
i as {I: Int | I >= 0}
Cast
Upcasting
Because Python is a language that uses duck typing, there is no concept of casting. There is no need to upcast, and there is essentially no downcasting.
However, Erg is statically typed, so there are times when casting must be done.
A simple example is 1 + 2.0
: the +
(Int, Ratio), or Int(<: Add(Ratio, Ratio)) operation is not defined in the Erg language specification. This is because Int <: Ratio
, so 1 is upcast to 1.0, an instance of Ratio.
~~The Erg extended bytecode adds type information to BINARY_ADD, in which case the type information is Ratio-Ratio. In this case, the BINARY_ADD instruction does the casting of Int, so no special instruction specifying the cast is inserted. So, for example, even if you override a method in a child class, if you specify the parent as the type, type coercion is performed and the method is executed in the parent's method (name modification is performed at compile time to refer to the parent's method). The compiler only performs type coercion validation and name modification. The runtime does not cast objects (currently. Cast instructions may be implemented for execution optimization). ~~
@Inheritable
Parent = Class()
Parent.
greet!() = print! "Hello from Parent"
Child = Inherit Parent
Child.
# Override requires Override decorator
@Override
greet!() = print! "Hello from Child"
greet! p: Parent = p.greet!()
parent = Parent.new()
child = Child.new()
parent # "Hello from Parent" greet!
child # "Hello from Parent"
This behavior does not create an incompatibility with Python. In the first place, Python does not specify the type of a variable, so that all variables are typed as type variables, so to speak. Since type variables choose the smallest type they can fit, the same behavior as in Python is achieved if you do not specify a type in Erg.
@Inheritable
Parent = Class()
Parent.
greet!() = print! "Hello from Parent"
Child = Inherit Parent
Child.
greet!() = print! "Hello from Child" Child.
greet! some = some.greet!()
parent = Parent.new()
child = Child.new()
parent # "Hello from Parent" greet!
child # "Hello from Child"
You can also use .from
and .into
, which are automatically implemented for types that are inherited from each other.
assert 1 == 1.0
assert Ratio.from(1) == 1.0
assert 1.into<Ratio>() == 1.0
Downcasting
Since downcasting is generally unsafe and the conversion method is non-trivial, we instead implement TryFrom.try_from
.
IntTryFromFloat = Patch Int
IntTryFromFloat.
try_from r: Float =
if r.ceil() == r:
then: r.ceil()
else: Error "conversion failed".
Mutable Type
Warning: The information in this section is old and contains some errors.
By default all types in Erg are immutable, i.e. their internal state cannot be updated.
But you can of course also define mutable types. Variable types are declared with !
.
Person! = Class({name = Str; age = Nat!})
Person!.
greet! ref! self = print! "Hello, my name is \{self::name}. I am \{self::age}."
inc_age!ref!self = self::name.update!old -> old + 1
To be precise, a type whose base type is a mutable type, or a composite type containing mutable types, must have a !
at the end of the type name. Types without !
can exist in the same namespace and are treated as separate types.
In the example above, the .age
attribute is mutable and the .name
attribute is immutable. If even one attribute is mutable, the whole is mutable.
Mutable types can define procedural methods that rewrite instances, but having procedural methods does not necessarily make them mutable. For example, the array type [T; N]
implements a sample!
method that randomly selects an element, but of course does not destructively modify the array.
Destructive operations on mutable objects are primarily done via the .update!
method. The .update!
method is a higher-order procedure that updates self
by applying the function f
.
i = !1
i.update! old -> old + 1
assert i == 2
The .set!
method simply discards the old content and replaces it with the new value. .set!x = .update!_ -> x.
i = !1
i.set! 2
assert i == 2
The .freeze_map
method operates on values unchanged.
a = [1, 2, 3].into [Nat; !3]
x = a.freeze_map a: [Nat; 3] -> a.iter().map(i -> i + 1).filter(i -> i % 2 == 0).collect(Array)
In a polymorphic immutable type the type argument T
of the type is implicitly assumed to be immutable.
# ImmutType < Type
KT: ImmutType = Class ...
K!T: Type = Class ...
In the standard library, mutable (...)!
types are often based on immutable (...)
types. However, T!
and T
types have no special linguistic relationship, and may not be constructed as such 1 .
From the above explanation, mutable types include not only those that are themselves mutable, but also those whose internal types are mutable.
Types such as {x: Int!}
and [Int!; 3]
are internal mutable types where the object inside is mutable and the instance itself is not mutable.
Cell! T
Mutable types are already available for Int
and arrays, but how can we create mutable types for general immutable types? For example, in the case of {x = Int; y = Int}
, corresponding mutable type is {x = Int!; y = Int!}
, etc. But how did Int!
made from Int
?
Erg provides Cell!
type for such cases.
This type is like a box for storing immutable types. This corresponds to what is called a reference (ref) in ML and other languages.
IntOrStr = Inr or Str
IntOrStr! = Cell! IntOrStr
x = IntOrStr!.new 1
assert x is! 1 # `Int or Str` cannot compare with `Int` directly, so use `is!` (this compares object IDs) instead of `==`.
x.set! "a"
assert x is! "a"
An important property is that Cell! T
is a subtype of T
. Therefore, an object of type Cell! T
can use all the methods of type T
.
# definition of `Int!`
Int! = Cell! Int
...
i = !1
assert i == 1 # `i` is casted to `Int`
1 It is intentional that T!
and T
types have no special linguistic relationship. It's a design. If there is a relationship, for example, if the T
/T!
type exists in the namespace, it will not be possible to introduce the T!
/T
type from another module. Also, the mutable type is not uniquely defined for the immutable type. Given the definition T = (U, V)
, the possible variable subtypes of T!
are (U!, V)
and (U, V!)
. ↩
Type Bound
Type bounds add conditions to type specifications. A function that realizes this is a guard (guard clause). This feature is available for function signatures, anonymous function signatures, as well as refinement types. Guards are written after the return type.
Predicate
You can specify the condition that the variable satisfies with an expression (predicate expression) that returns Bool
.
Only value objects and operators can be used. Compile-time functions may be supported in future versions.
f a: [T; N] | T, N, N > 5 = ...
g a: [T; N | N > 5] | T, N = ...
Odd = {I: Int | I % 2 == 1}
R2Plus = {(L, R) | L, R: Ratio; L > 0 and R > 0}
GeneralizedOdd = {I | U; I <: Div(Nat, U); I % 2 == 0}
Compound Type
Tuple Type
(), (X,), (X, Y), (X, Y, Z), ...
Tuples have a subtype rule for length as well as type inside.
For any Tuple T
, U
, the following holds.
* T <: () (unit rule)
* forall N in 0..<Len(T) (Len(T) <= Len(U)), U.N == T.N => U <: T (oblivion rule)
For example, (Int, Str, Bool) <: (Int, Str)
.
However, these rules do not apply to the tuple-like part of a Function type, because this part is not really the tuples.
(Int, Int) -> Int !<: (Int,) -> Int
In addition, return values of Unit types can be ignored, but return values of other tuple types cannot be ignored.
Array Type
[], [X; 0], [X; 1], [X; 2], ..., [X; _] == [X]
The same subtype rules exist for arrays as for tuples.
* T <: [] (unit rule)
* forall N in 0..<Len(T) (Len(T) <= Len(U)), U[N] == T[N] => U <: T (oblivion rule)
Arrays like the one below are not valid types. It is an intentional design to emphasize that the elements of the array are homogenized.
[Int, Str]
Because of this, detailed information about each element is lost. To preserve this, refinement types can be used.
a = [1, "a"]: {A: [Int or Str; 2] | A[0] == Int}
a[0]: Int
Set Type
{}, {X; _}, ...
Set types have length information, but mostly useless. This is because duplicate elements are eliminated in sets, but duplicate elements cannot generally be determined at compile time. In the first place, the length of the information is not very meaningful in a Set.
{}
is the empty set, a subtype of all types. Note that {X}
is not a set type, but a type that contains only one constant X
.
Dict Type
{:}, {X: Y}, {X: Y, Z: W}, ...
All dict types are subtypes of Dict K, V
. {X: Y} <: Dict X, Y
and {X: Y, Z: W} <: Dict X or Z, Y or W
.
Record Type
{=}, {i = Int}, {i = Int; j = Int}, {.i = Int; .j = Int}, ...
A private record type is a super type of a public record type.
e.g. {.i = Int} <: {i = Int}
Function Type
() -> ()
Int -> Int
(Int, Str) -> Bool
# named parameter
(x: Int, y: Int) -> Int
# default parameter
(x := Int, y := Int) -> Int
# variable-length parameter
(*objs: Obj) -> Str
(Int, Ref Str!) -> Int
# qualified parameter
|T: Type|(x: T) -> T
# qualified parameter with default type
|T: Type|(x: T := NoneType) -> T # |T: Type|(x: T := X, y: T := Y) -> T (X != Y) is invalid
Bound Method Type
Int.() -> Int
Int.(other: Int) -> Int
# e.g. 1.__add__: Int.(Int) -> Int
The type C.(T) -> U
is a subtype of T -> U
. They are almost the same, but C.(T) -> U
is the type of a method whose receiver type is C
, and the receiver is accessible via an attribute __self__
.
In the following, we will discuss more advanced type systems. Beginners do not have to read all the sections.
Generalized Algebraic Data Types (GADTs)
Erg can create Generalized Algebraic Data Types (GADTs) by classifying Or types.
Nil T = Class(Impl := Phantom T)
Cons T = Class {head = T; rest = List T}, Impl := Unpack
List T: Type = Class(Nil T or Cons T)
List.
nil|T|() = Self(T).new Nil(T).new()
cons head, rest | T = Self(T).new Cons(T).new(head, rest)
head self = match self:
{head; ...}: Cons_ -> head
_: Nil -> panic "empty list"
{nil; cons} = List
print! cons(1, cons(2, nil())).head() # 1
print! nil.head() # RuntimeError: "empty list"
The reason we say List.nil|T|() = ...
instead of List(T).nil() = ...
is that we don't need to specify the type when using it.
i = List.nil()
_: List Int = cons 1, i
The List T
defined here is GADTs, but it's a naive implementation and doesn't show the true value of GADTs.
For example, the .head
method above will throw a runtime error if the body is empty, but this check can be done at compile time.
List: (Type, {"Empty", "Nonempty"}) -> Type
List T, "Empty" = Class(Impl := Phantom T)
List T, "Nonempty" = Class {head = T; rest = List(T, _)}, Impl := Unpack
List.
nil|T|() = Self(T, "Empty").new Nil(T).new()
cons head, rest | T = Self(T, "Nonempty").new {head; rest}
List(T, "Nonempty").
head {head; ...} = head
{nil; cons} = List
print! cons(1, cons(2, nil())).head() # 1
print! nil().head() # TypeError
An example of GADTs that is often explained on the street is a list that can judge whether the contents are empty or not by type as above. Erg can be further refined to define a list with length.
List: (Type, Nat) -> Type
List T, 0 = Class(Impl := Phantom T)
List T, N = Class {head = T; rest = List(T, N-1)}, Impl := Unpack
List.
nil|T|() = Self(T, 0).new Nil(T).new()
cons head, rest | T, N = Self(T, N).new {head; rest}
List(_, N | N >= 1).
head {head; ...} = head
List(_, N | N >= 2).
pair {head = first; rest = {head = second; ...}} = [first, second]
{nil; cons} = List
print! cons(1, cons(2, nil)).pair() # [1, 2]
print! cons(1, nil).pair() # TypeError
print! cons(1, nil).head() # 1
print! nil. head() # TypeError
Function type with default arguments
First, let's look at an example of using default arguments.
f: (Int, Int, z := Int) -> Int
f(x, y, z := 0) = x + y + z
g: (Int, Int, z := Int, w := Int) -> Int
g(x, y, z := 0, w := 1) = x + y + z + w
fold: ((Int, Int) -> Int, [Int], acc := Int) -> Int
fold(f, [], acc) = acc
fold(f, arr, acc := 0) = fold(f, arr[1..], f(acc, arr[0]))
assert fold(f, [1, 2, 3]) == 6
assert fold(g, [1, 2, 3]) == 8
Arguments after :=
are default arguments.
The subtyping rules are as follows.
((X, y := Y) -> Z) <: (X -> Z)
((X, y := Y, ...) -> Z) <: ((X, ...) -> Z)
The first means that a function with default arguments can be identified with a function without. The second means that any default argument can be omitted.
Type erasure
Type erasure is the process of setting a type argument to _
and deliberately discarding its information. Type erasure is a feature of many polymorphic languages, but in the context of Erg's syntax, it is more accurate to call it type argument erasure.
The most common example of a type that has been type-erased is [T, _]
. Arrays are not always known their length at compile-time. For example, sys.argv
, which refers to command line arguments, is of type [Str, _]
. Since Erg's compiler has no way of knowing the length of command line arguments, information about their length must be given up.
However, a type that has been type-erased becomes a supertype of a type that has not been (e.g. [T; N] <: [T; _]
), so it can accept more objects.
Objects of type [T; N]
can of course use methods of type [T; _]
, but the N
information is erased after use. If the length does not change, then it is possible to use [T; N]
in the signature. If the length remains the same, it must be indicated by a signature.
# Functions that are guaranteed to not change the length of the array (e.g., sort)
f: [T; N] -> [T; N] # functions that do not (f: [T; N])
# functions that do not (e.g. filter)
g: [T; n] -> [T; _]
If you use _
in the type specification itself, the type is upcast to Object
.
For non-type type arguments (Int, Bool, etc.), the parameter with _
will be undefined.
i: _ # i: Object
[_; _] == [Object; _] == Array
Type erasure is not the same as omitting a type specification. Once the type argument information has been erased, it will not be returned unless you assert it again.
implicit = (1..5).iter().map(i -> i * 2).to_arr()
explicit = (1..5).iter().map(i -> i * 2).into(Array(Nat))
In Rust, this corresponds to the following code.
#![allow(unused)] fn main() { let partial = (1..6).iter().map(|i| i * 2).collect::<Vec<_>>(); }
Erg does not allow partial omission of types, but uses higher-order kind polymorphism instead.
# collect is a higher-order Kind method that takes Kind
hk = (1..5).iter().map(i -> i * 2).collect(Array)
hk: Array(Int)
Existential type
If there is a for-all type corresponding to ∀, it is natural to assume that there is an existential type corresponding to ∃. Existential types are not difficult. You already know the existential type, just not consciously aware of it as such.
T: Trait
f x: T = ...
The trait T
above is used as the existential type.
In contrast, T
in the lower case is only a trait, and X
is a for-all type.
f|X <: T| x: X = ...
In fact, the existential type is replaced by a for-all type. So why is there such a thing as an existential type? First of all, as we saw above, existential types do not involve type variables, which simplifies type specification. Also, since the type variable can be removed, it is possible to construct a type that would have rank 2 or higher if it were an all-presumptive type.
show_map f: (|T| T -> T), arr: [Show; _] =
arr.map x ->
y = f x
log y
y
However, as you can see, the existential type forgets or expands the original type, so if you do not want to expand the return type, you must use the for-all type. Conversely, types that are only taken as arguments and are not relevant to the return value may be written as existential types.
# id(1): I want it to be Int
id|T|(x: T): T = x
# |S <: Show|(s: S) -> () is redundant
show(s: Show): () = log s
By the way, a class is not called an existential type. A class is not called an existential type, because its elemental objects are predefined. Existential type means any type that satisfies a certain trait, and it is not the place to know what type is actually assigned.
Function type with keyword arguments
h(f) = f(y: 1, x: 2)
h: |T: Type|((y: Int, x: Int) -> T) -> T
The subtyping rules for functions with keyword arguments are as follows.
((x: T, y: U) -> V) <: ((T, U) -> V) # x, y are arbitrary keyword parameters
((y: U, x: T) -> V) <: ((x: T, y: U) -> V)
((x: T, y: U) -> V) <: ((y: U, x: T) -> V)
This means that keyword arguments can be deleted or replaced.
But you can't do both at the same time.
That is, you cannot cast (x: T, y: U) -> V
to (U, T) -> V
.
Note that keyword arguments are attached only to top-level tuples, and not to arrays or nested tuples.
Valid: [T, U] -> V
Invalid: [x: T, y: U] -> V
Valid: (x: T, ys: (U,)) -> V
Invalid: (x: T, ys: (y: U,)) -> V
Kind
Everything is typed in Erg. Types themselves are no exception. kind represents the "type of type". For example, Int
belongs to Type
, just as 1
belongs to Int
. Type
is the simplest kind, the atomic kind. In type-theoretic notation, Type
corresponds to *
.
In the concept of kind, what is practically important is one or more kinds (multinomial kind). One-term kind, for example Option
, belongs to it. A unary kind is represented as Type -> Type
1. A container such as Array
or Option
is specifically a polynomial kind that takes a type as an argument.
As the notation Type -> Type
indicates, Option
is actually a function that receives a type T
and returns a type Option T
. However, since this function is not a function in the usual sense, it is usually called the unary kind.
Note that ->
itself, which is an anonymous function operator, can also be seen as a kind when it receives a type and returns a type.
Also note that a kind that is not an atomic kind is not a type. Just as -1
is a number but -
is not, Option Int
is a type but Option
is not. Option
etc. are sometimes called type constructors.
assert not Option in Type
assert Option in Type -> Type
So code like the following will result in an error:
In Erg, methods can only be defined in atomic kinds, and the name self
cannot be used anywhere other than the first argument of a method.
# K is an unary kind
K: Type -> Type
K T = Class ...
K.
foo x = ... # OK, this is like a so-called static method
bar self, x = ... # TypeError: cannot define a method to a non-type object
K(T).
baz self, x = ... # OK
Examples of binary or higher kinds are {T: U}
(: (Type, Type) -> Type
), (T, U, V)
(: (Type, Type, Type) - > Type
), ... and so on.
There is also a zero-term kind () -> Type
. This is sometimes equated with an atomic kind in type theory, but is distinguished in Erg. An example is Class
.
Nil = Class()
Containment of kind
There is also a partial type relation, or rather a partial kind relation, between multinomial kinds.
K T = ...
L = Inherit K
L<: K
That is, for any T
, if L T <: K T
, then L <: K
, and vice versa.
∀T. L T <: K T <=> L <: K
higher kind
There is also a higher-order kind. This is a kind of the same concept as a higher-order function, a kind that receives a kind itself. (Type -> Type) -> Type
is a higher kind. Let's define an object that belongs to a higher kind.
IntContainerOf K: Type -> Type = K Int
assert IntContainerOf Option == Option Int
assert IntContainerOf Result == Result Int
assert IntContainerOf in (Type -> Type) -> Type
The bound variables of a polynomial kind are usually denoted as K, L, ..., where K is K for Kind.
set kind
In type theory, there is the concept of a record. This is almost the same as the Erg record 2.
# This is a record, and it corresponds to what is called a record in type theory
{x = 1; y = 2}
When all record values were types, it was a kind of type called a record type.
assert {x = 1; y = 2} in {x = Int; y = Int}
A record type types a record. A good guesser might have thought that there should be a "record kind" to type the record type. Actually it exists.
log Typeof {x = Int; y = Int} # {{x = Int; y = Int}}
A type like {{x = Int; y = Int}}
is a record kind. This is not a special notation. It is simply an enumeration type that has only {x = Int; y = Int}
as an element.
Point = {x = Int; y = Int}
Pointy = {Point}
An important property of record kind is that if T: |T|
and U <: T
then U: |T|
.
This is also evident from the fact that enums are actually syntactic sugar for refinement types.
# {c} == {X: T | X == c} for normal objects, but
# Equality may not be defined for types, so |T| == {X | X <: T}
{Point} == {P | P <: Point}
U <: T
in type constraints is actually syntactic sugar for U: |T|
.
A kind that is a set of such types is commonly called a set kind. Setkind also appears in the Iterator pattern.
Iterable T = Trait {
.Iterator = {Iterator}
.iter = (self: Self) -> Self.Iterator T
}
Type inference for polynomial kinds
Container K: Type -> Type, T: Type = Patch K(T, T)
Container (K).
f self = ...
Option T: Type = Patch T or NoneType
Option(T).
f self = ...
Fn T: Type = Patch T -> T
Fn(T).
f self = ...
Fn2 T, U: Type = Patch T -> U
Fn2(T, U).
f self = ...
(Int -> Int).f() # which one is selected?
In the example above, which patch would the method f
choose?
Naively, Fn T
seems to be chosen, but Fn2 T, U
is also possible, Option T
includes T
as it is, so any type is applicable, Container K , T
also matches `->`(Int, Int)
, i.e. Container(`->`, Int)
as ```Int -> Int`. So all four patches above are possible options.
In this case, patches are selected according to the following priority criteria.
- Any
K(T)
(e.g.T or NoneType
) preferentially matchesType -> Type
overType
. - Any
K(T, U)
(e.g.T -> U
) matches(Type, Type) -> Type
preferentially overType
. *Similar criteria apply for kind of 3 or more. - The one that requires fewer type variables to replace is chosen. For example,
Int -> Int
isT -> T
rather thanK(T, T)
(replacement type variables: K, T) orT -> U
(replacement type variables: T, U). `(replacement type variable: T) is matched preferentially. - If the number of replacements is also the same, an error is given as being unselectable.
1 In type theory notation, *=>*
↩
2 There are subtle differences such as visibility. ↩
Marker Trait
Marker traits are traits without required attributes. That is, you can Impl without implementing any method. It seems meaningless without the required attribute, but since the information that it belongs to the trait is registered, you can use the patch method or get special treatment by the compiler.
All marker traits are subsumed by the Marker
trait.
Light
provided as standard is a kind of marker trait.
Light = Subsume Marker
Person = Class {.name = Str; .age = Nat} and Light
M = Subsume Marker
MarkedInt = Inherit Int, Impl := M
i = MarkedInt.new(2)
assert i + 1 == 2
assert i in M
Marker classes can also be excluded with the Excluding
argument.
NInt = Inherit MarkedInt, Impl := N, Excluding: M
Mutable Structure Type
The T!
type is described as a box type that can be replaced by any T
type object.
Particle!State: {"base", "excited"}! = Class(... Impl := Phantom State)
Particle!
# This method moves the State from "base" to "excited".
apply_electric_field!(ref! self("base" ~> "excited"), field: Vector) = ...
The T!
type can replace data, but it cannot change its structure.
More like the behavior of a real program, it cannot change its size (on the heap). Such a type is called an immutable structure (mutable) type.
In fact, there are data structures that cannot be represented by invariant structure types.
For example, a Mutable-length array. The [T; N]!
type can contain objects of any [T; N]
, but cannot be replaced by objects of type [T; N+1]
and so on.
In other words, the length cannot be changed. To change the length, the structure of the type itself must be changed.
This is achieved by Mutable structure (mutable) types.
v = [Str; !0].new()
v.push! "Hello"
v: [Str; !1].
For mutable structure types, Mutable type arguments are marked with !
. In the above case, the type [Str; !0]
can be changed to [Str; !1]
and so on. That is, the length can be changed.
Incidentally, the [T; !N]
type is the sugar-coated syntax of the ArrayWithLength!(T, !N)
type.
Mutable structure types can of course be user-defined. Note, however, that there are some differences from invariant structure types in terms of the construction method.
Nil T = Class(Impl := Phantom T)
List T, !0 = Inherit Nil T
List T, N: Nat! = Class {head = T; rest = List(T, !N-1)}
List(T, !N).
push! ref! self(N ~> N+1, ...), head: T =
self.update! old -> Self.new {head; old}
Newtype pattern
Here is the Erg version of the newtype pattern commonly used in Rust.
Erg allows type aliases to be defined as follows, but they only refer to the same type.
UserId = Int
So, for example, if you have a specification that a number of type UserId
must be a positive 8-digit number, you can put in 10
or -1
, because it is the same as type Int
. If you set it to Nat
, -1
can be rejected, but the nature of an 8-digit number cannot be expressed by Erg's type system alone.
Also, for example, when designing a database system, suppose there are several types of IDs: user IDs, product IDs, product IDs, and user IDs. If the number of ID types increases, such as user IDs, product IDs, order IDs, etc., a bug may occur in which different types of IDs are passed to different functions. Even if user IDs and product IDs are structurally equivalent, they are semantically different.
The newtype pattern is a good design pattern for such cases.
UserId = Class {id = Nat}
UserId.
new id: Nat =
assert id.dights().len() == 8, else: "UserId must be a positive number with length 8"
UserId::__new__ {id;}
i = UserId.new(10000000)
print! i # <__main__.UserId object>
i + UserId.new(10000001) # TypeError: + is not implemented between `UserId` and `UserId
The constructor guarantees the pre-condition of an 8-digit number.
The UserId
loses all the methods that Nat
has, so you have to redefine the necessary operations each time.
If the cost of redefinition is not worth it, it is better to use inheritance. On the other hand, there are cases where the loss of methods is desirable, so choose the appropriate method depending on the situation.
Overloading
Erg does not support ad hoc polymorphism. That is, multiple definitions of functions and Kinds (overloading) are not possible. However, you can reproduce the overloading behavior by using a combination of a trait and a patch.
You can use traits instead of trait classes, but then all types that implement .add1
will be covered.
Add1 = Trait {
.add1: Self.() -> Self
}
IntAdd1 = Patch Int, Impl := Add1
IntAdd1.
add1 self = self + 1
RatioAdd1 = Patch Ratio, Impl := Add1
RatioAdd1.
add1 self = self + 1.0
add1|X <: Add1| x: X = x.add1()
assert add1(1) == 2
assert add1(1.0) == 2.0
Such a polymorphism by accepting all subtypes of a type is called subtyping polymorphism.
If the process is exactly the same for each type, it can be written as below. The above is used when the behavior changes from class to class (but the return type is the same). A polymorphism that uses type arguments is called parametric polymorphism. Parametric polymorphism is often used in conjunction with subtyping, as shown below, in which case it is a combination of parametric and subtyping polymorphism.
add1|T <: Int or Str| x: T = x + 1
assert add1(1) == 2
assert add1(1.0) == 2.0
Also, overloading of types with different numbers of arguments can be reproduced with default arguments.
C = Class {.x = Int; .y = Int}
C.
new(x, y := 0) = Self::__new__ {.x; .y}
assert C.new(0, 0) == C.new(0)
Erg takes the stance that you cannot define a function that behaves completely differently, such as having a different type depending on the number of arguments, but if the behavior is different to begin with, it should be named differently.
In conclusion, Erg prohibits overloading and adopts subtyping plus parametric polymorphism for the following reasons.
First, overloaded functions are distributed in their definitions. This makes it difficult to report the cause of an error when it occurs. Also, importing a subroutine may change the behavior of already defined subroutines.
{id;} = import "foo"
...
id x: Int = x
...
id x: Ratio = x
...
id "str" # TypeError: id is not implemented for Str
# But... But... where did this error come from?
Second, it is incompatible with default arguments. When a function with default arguments is overloaded, there is a problem with which one takes precedence.
f x: Int = ...
f(x: Int, y := 0) = ...
f(1) # which is chosen?
Furthermore, it is incompatible with the declaration.
The declaration f: Num -> Num
cannot specify which definition it refers to. This is because Int -> Ratio
and Ratio -> Int
are not inclusive.
f: Num -> Num
f(x: Int): Ratio = ...
f(x: Ratio): Int = ...
And the grammar is inconsistent: Erg prohibits variable reassignment, but the overloaded grammar looks like reassignment. Nor can it be replaced by an anonymous function.
# same as `f = x -> body`
f x = body
# same as... what?
f x: Int = x
f x: Ratio = x
Phantom class
Phantom types are marker traits that exist only to provide annotations to the compiler. As a usage of phantom types, let's look at the structure of a list.
Nil = Class()
List T, 0 = Inherit Nil
List T, N: Nat = Class {head = T; rest = List(T, N-1)}
This code results in an error.
3 | List T, 0 = Inherit Nil
^^^
TypeConstructionError: since Nil does not have a parameter T, it is not possible to construct List(T, 0) with Nil
hint: use 'Phantom' trait to consume T
This error is a complaint that T
cannot be type inferred when List(_, 0).new Nil.new()
is used.
In such a case, whatever the T
type is, it must be consumed on the right-hand side. A type of size zero, such as a tuple of length zero, is convenient because it has no runtime overhead.
Nil T = Class((T; 0))
List T, 0 = Inherit Nil T
List T, N: Nat = Class {head = T; rest = List(T, N-1)}
This code passes compilation. But it's a little tricky to understand the intent, and it can't be used except when the type argument is a type.
In such a case, a phantom type is just what you need. A phantom type is a generalized type of size 0.
Nil T = Class(Impl := Phantom T)
List T, 0 = Inherit Nil T
List T, N: Nat = Class {head = T; rest = List(T, N-1)}
nil = Nil(Int).new()
assert nil.__size__ == 0
Phantom
holds the type T
. But in fact the size of the Phantom T
type is 0 and does not hold an object of type T
.
Also, Phantom
can consume arbitrary type arguments in addition to its type. In the following example, Phantom
holds a type argument called State
, which is a subtype object of Str
.
Again, State
is a fake type variable that does not appear in the object's entity.
VM! State: {"stopped", "running"}! = Class(... State)
VM!("stopped").
start ref! self("stopped" ~> "running") =
self.do_something!()
self::set_phantom!("running"))
The state
is updated via the update_phantom!
or set_phantom!
methods.
This is the method provided by the standard patch for Phantom!
(the variable version of Phantom
), and its usage is the same as the variable update!
and set!
.
Projection Type
A projection type represents a type such as Self.AddO
in the following code.
Add R = Trait {
. `_+_` = Self, R -> Self.AddO
.AddO = Type
}
AddForInt = Patch(Int, Impl := Add Int)
AddForInt.
AddO = Int
The type Add(R)
can be said to be a type that defines addition with some object. Since the method should be a type attribute, the +
type declaration should be written below the indentation.
The mise-en-scène of the Add
type is the declaration .AddO = Type
, and the entity of the .AddO
type, which is a projective type, is held by a type that is a subtype of Add
. For example, Int.AddO = Int
, Odd.AddO = Even
.
assert Int < Add
assert Int.AddO == Int
assert Odd < Add
assert Odd.AddO == Even
Quantified Dependent Type
Erg has quantified and dependent types. Then naturally, it is possible to create a type that combines the two. That is the quantified dependent type.
NonNullStr = |N: Nat| StrWithLen N | N ! = 0 # same as {S | N: Nat; S: StrWithLen N; N ! = 0}
NonEmptyArray = |N: Nat| [_; N | N > 0] # same as {A | N: Nat; A: Array(_, N); N > 0}
The standard form of quantified dependent types are K(A, ... | Pred)
. K
is a type constructor, A, B
are type arguments, and Pred
is a conditional expression.
Quantified dependent types as left-hand side values can only define methods in the same module as the original type.
K A: Nat = Class ...
K(A).
...
K(A | A >= 1).
method ref! self(A ~> A+1) = ...
Quantified dependent types as right-hand side values require that the type variable to be used be declared in the type variable list (||
).
# T is a concrete type
a: |N: Nat| [T; N | N > 1]
Shared Reference
Shared references are one of those language features that must be handled with care. In TypeScript, for example, the following code will pass type checking.
class NormalMember {}
class VIPMember extends NormalMember {}
let vip_area: VIPMember[] = []
let normal_area: NormalMember[] = vip_area
normal_area.push(new NormalMember())
console.log(vip_area) # [NormalMember]
A NormalMember has entered the vip_area. It is an obvious bug, however what went wrong?
The cause is the shared reference denatured. The normal_area
is created by copying the vip_area
, but in doing so the type has changed.
But VIPMember
inherits from NormalMember
, so VIPMember[] <: NormalMember[]
, and this is not a problem.
The relation VIPMember[] <: NormalMember[]
is fine for immutable objects. However, if you perform a destructive operation like the one above, there will be a breakdown.
In Erg, such code is played back due to the ownership system.
NormalMember = Class()
VIPMember = Class()
vip_area = [].into [VIPMember; !_]
normal_area: [NormalMember; !_] = vip_area
normal_area.push!(NormalMember.new())
log vip_area # OwnershipError: `vip_room` was moved to `normal_room`
However, it can be inconvenient for an object to be owned by only one place.
For this reason, Erg has a type SharedCell!T!
, which represents a shared state.
$p1 = SharedCell!.new(!1)
$p2 = $p1.mirror!()
$p3 = SharedCell!.new(!1)
# If $p1 == $p2, a comparison of the content type Int!
assert $p1 == $p2
assert $p1 == $p3
# Check if $p1 and $p2 point to the same thing with `.addr!`.
assert $p1.addr!() == $p2.addr!()
assert $p1.addr!() != $p3.addr!()
$p1.add! 1
assert $p1 == 2
assert $p2 == 2
assert $p3 == 1
Objects of type SharedCell!
must be prefixed with $
. Also, by their nature, they cannot be constants.
The SharedCell! T!
type is also a subtype of T!
and can call methods of type T!
. The only methods specific to the SharedCell!T!
type are .addr!
, .mirror!
and .try_take
.
An important fact is that SharedCell! T!
is non-variant, i.e., no inclusions are defined for different type arguments.
$vip_area = SharedCell!.new([].into [VIPMember; !_])
$normal_area: SharedCell!([NormalMember; !_]) = $vip_area.mirror!() # TypeError: expected SharedCell!([NormalMember; !_]), but got SharedCell!([VIPMember; !_])
# hint: SharedCell!(T) is non-variant, which means it cannot have a supertype or a subtype.
However, the following code have not problem. In the last line, it's the VIPMember
argument that has been typed converted.
$normal_area = SharedCell!.new([].into [NormalMember; !_])
$normal_area.push!(NormalMember.new()) # OK
$normal_area.push!(VIPMember.new()) # OK
Special types (Self, Super)
Self
represents its own type. You can just use it as an alias, but note that the meaning changes in derived types (refers to the own type).
@Inheritable
C = Class()
C.
new_self() = Self. new()
new_c() = C.new()
D = Inherit C
classof D. new_self() # D
classof D. new_c() # C
Super
represents the type of the base class. The method itself refers to the base class, but the instance uses its own type.
@Inheritable
C = Class()
D = Inherit(C)
D.
new_super() = Super.new()
new_c() = C.new()
classof D. new_super() # D
classof D. new_c() # C
special type variables
Self
and Super
can be used as type variables in structured types and traits. This refers to classes that are subtypes of that type. That is, Self
in type T
means Self <: T
.
Add R = Trait {
.AddO = Type
.`_+_`: Self, R -> Self.AddO
}
ClosedAdd = Subsume Add(Self)
ClosedAddForInt = Patch(Int, Impl := ClosedAdd)
ClosedAddForInt.
AddO = Int
assert 1 in Add(Int, Int)
assert 1 in ClosedAdd
assert Int < Add(Int, Int)
assert Int < ClosedAdd
Typeof, classof
Typeof
is a function that can peek into Erg's type inference system, and its behavior is complex.
assert Typeof(1) == {I: Int | I == 1}
i: 1..3 or 5..10 = ...
assert Typeof(i) == {I: Int | (I >= 1 and I <= 3) or (I >= 5 and I <= 10)}
C = Class {i = Int}
I = C. new {i = 1}
assert Typeof(I) == {X: C | X == I}
J: C = ...
assert Typeof(J) == {i = Int}
assert {X: C | X == I} < C and C <= {i = Int}
The Typeof
function returns the derived type, not the class of the object.
So for instance I: C
of class C = Class T
, Typeof(I) == T
.
A value class does not have a corresponding record type. To solve this problem, value classes are supposed to be record types that have a __valueclass_tag__
attribute.
Note that you cannot access this attribute, nor can you define a __valueclass_tag__
attribute on a user-defined type.
i: Int = ...
assert Typeof(i) == {__valueclass_tag__ = Phantom Int}
s: Str = ...
assert Typeof(s) == {__valueclass_tag__ = Phantom Str}
Typeof
outputs only structured types. I explained that structured types include attribute types, refinement types, and (true) algebraic types.
These are independent types (inference precedence exists) and inference conflicts do not occur.
Attribute types and algebraic types can span multiple classes, while refinement types are subtypes of a single class.
Erg infers object types as refinement types as much as possible, and when that is not possible, expands refinement base classes to structured types (see below).
structured
All classes can be converted to derived types. This is called structuring. The structured type of a class can be obtained with the Structure
function.
If a class is defined with C = Class T
(all classes are defined in this form) then Structure(C) == T
.
C = Class {i = Int}
assert Structure(C) == {i = Int}
D = Inherit C
assert Structure(D) == {i = Int}
Nat = Class {I: Int | I >= 0}
assert Structure(Nat) == {I: Int | I >= 0}
Option T = Class (T or NoneType)
assert Structure(Option Int) == Or(Int, NoneType)
assert Structure(Option) # TypeError: only monomorphized types can be structured
# You can't actually define a record with __valueclass_tag__, but conceptually
assert Structure(Int) == {__valueclass_tag__ = Phantom Int}
assert Structure(Str) == {__valueclass_tag__ = Phantom Str}
assert Structure((Nat, Nat)) == {__valueclass_tag__ = Phantom(Tuple(Nat, Nat))}
assert Structure(Nat -> Nat) == {__valueclass_tag__ = Phantom(Func(Nat, Nat))}
# Marker classes are also record types with __valueclass_tag__
M = Inherit Marker
assert Structure(M) == {__valueclass_tag__ = Phantom M}
D = Inherit(C and M)
assert Structure(D) == {i = Int; __valueclass_tag__ = Phantom M}
E = Inherit(Int and M)
assert Structure(E) == {__valueclass_tag__ = Phantom(And(Int, M))}
F = Inherit(E not M)
assert Structure(F) == {__valueclass_tag__ = Phantom Int}
Variance
Erg can subtype polymorphic types, but there are some caveats.
First, consider the inclusion relation of ordinary polymorphic types. In general, there is a container K
and a type A, B
to which it assigns, and when A < B
, K A < K B
.
For example, Option Int < Option Object
. Therefore, methods defined in Option Object
can also be used in Option Int
.
Consider the typical polymorphic type Array!(T)
.
Note that this time it's not Array!(T, N)
because we don't care about the number of elements.
Now, the Array!(T)
type has methods called .push!
and .pop!
, which mean adding and removing elements, respectively. Here is the type:
Array.push!: Self(T).(T) => NoneType Array.pop!: Self(T).() => T
As can be intuitively understood,
Array!(Object).push!(s)
is OK whens: Str
(just upcastStr
toObject
)- When
o: Object
,Array!(Str).push!(o)
is NG Array!(Object).pop!().into(Str)
is NGArray!(Str).pop!().into(Object)
is OK
is. In terms of the type system, this is
- (Self(Object).(Object) => NoneType) < (Self(Str).(Str) => NoneType)
- (Self(Str).() => Str) < (Self(Object).() => Object)
means
The former may seem strange. Even though Str < Object
, the inclusion relation is reversed in the function that takes it as an argument.
In type theory, such a relation (the type relation of .push!
) is called contravariant, and vice versa, the type relation of .pop!
is called covariant.
In other words, function types are contravariant with respect to their argument types and covariant with respect to their return types.
It sounds complicated, but as we saw earlier, it's a reasonable rule if you apply it to an actual example.
If you still don't quite get it, consider the following.
One of Erg's design principles is "large input types, small output types". This is precisely the case for function mutability. Looking at the rules above, the larger the input type, the smaller the overall type. This is because general-purpose functions are clearly rarer than special-purpose functions. And the smaller the output type, the smaller the whole.
As a result, the above policy is equivalent to saying "minimize the type of the function".
Non-variance
Erg has another modification. It is non-variance.
This is a modification that built-in types such as SharedCell! T!
have.
This means that for two types T! = U!
, even if there is an inclusion relation, it means that you cannot cast between SharedCell! T!
and SharedCell! U!
.
This is because SharedCell! T!
is a shared reference. See shared references for details.
Mutated generic type
A universal type variable can specify its upper and lower bounds.
|A <: T| K(A)
|B :> T| K(B)
In the type variable list, the variant specification of the type variable is performed. In the above variant specification, the type variable A
is declared to be any subclass of type T
and the type variable B
is declared to be any superclass of type T
.
In this case, T
is also called the upper type for A
and the lower type for B
.
Mutation specifications can also overlap.
# U<A<T
|A<: T, A :> U| ...
Here is an example of code that uses a variable specification.
show|S <: Show| s: S = log s
Nil T = Class(Impl = Phantom T)
Cons T = Class(Nil T or List T)
List T = Class {head = T; rest = Cons T}
List(T).
push|U <: T|(self, x: U): List T = Self. new {head = x; rest = self}
upcast(self, U :> T): List U = self
Change specification
The List T
example is tricky, so let's go into a little more detail.
To understand the code above, you need to know about polymorphic type degeneration. Variance is discussed in detail in this section, but for now we need three facts:
- Ordinary polymorphic types, such as
List T
, are covariant withT
(List U > List T
whenU > T
) - The function
T -> U
is contravariant with respect to the argument typeT
((S -> U) < (T -> U)
whenS > T
) - Function
T -> U
is covariant with return typeU
((T -> U) > (T -> S)
whenU > S
)
For example, List Int
can be upcast to List Object
and Obj -> Obj
can be upcast to Int -> Obj
.
Now let's consider what happens if we omit the variable specification of the method.
...
List T = Class {head = T; rest = Cons T}
List(T).
# List T can be pushed U if T > U
push|U|(self, x: U): List T = Self. new {head = x; rest = self}
# List T can be List U if T < U
upcast(self, U): List U = self
Even in this case, the Erg compiler does a good job of inferring the upper and lower types of U
.
Note, however, that the Erg compiler doesn't understand the semantics of methods. The compiler simply infers and derives type relationships mechanically according to how variables and type variables are used.
As written in the comments, the type U
put in the head
of List T
is a subclass of T
(T: Int
, such as Nat
). That is, it is inferred as U <: T
. This constraint changes the argument type of .push{U}
upcast (List(T), U) -> List(T) to (List(T), T) -> List(T)
( e.g. disallow List(Int).push{Object}
). Note, however, that the U <: T
constraint does not alter the type containment of the function. The fact that (List(Int), Object) -> List(Int) to (List(Int), Int) -> List(Int)
does not change, just in .push
method It means that the cast cannot be performed.
Similarly, a cast from List T
to List U
is possible subject to the constraint U :> T
, so the variation specification is inferred. This constraint changes the return type of .upcast(U)
to upcast List(T) -> List(T) to List(T) -> List(T)
(e.g. List(Object) .upcast(Int)
) is prohibited.
Now let's see what happens if we allow this upcast. Let's invert the denaturation designation.
...
List T = Class {head = T; rest = Cons T}
List(T).
push|U :> T|(self, x: U): List T = Self. new {head = x; rest = self}
upcast(self, U :> T): List U = self
# TypeWarning: `U` in the `.push` cannot take anything other than `U == T`. Replace `U` with `T`.
# TypeWarning: `U` in the `.upcast` cannot take anything other than `U == T`. Replace `U` with `T`.
Both the constraint U <: T
and the modification specification U :> T
are satisfied only when U == T
. So this designation doesn't make much sense.
Only "upcasts such that U == T
" = "upcasts that do not change where U
" are actually allowed.
Appendix: Modification of user-defined types
Mutations of user-defined types are immutable by default. However, you can also specify mutability with the Inputs/Outputs
marker trait.
If you specify Inputs(T)
, the type is contravariant with respect to T
.
If you specify Outputs(T)
, the type is covariant with respect to T
.
K T = Class(...)
assert not K(Str) <= K(Object)
assert not K(Str) >= K(Object)
InputStream T = Class ..., Impl := Inputs(T)
# A stream that accepts Objects can also be considered to accept Strs
assert InputStream(Str) > InputStream(Object)
OutputStream T = Class ..., Impl := Outputs(T)
# A stream that outputs a Str can also be considered to output an Object
assert OutputStream(Str) < OutputStream(Object)
Type Widening
For example, define the polycorrelation coefficient as follows.
ids|T|(x: T, y: T) = x, y
There's nothing wrong with assigning a pair of instances of the same class. When you assign an instance pair of another class that has a containment relationship, it is upcast to the larger one and becomes the same type. Also, it is easy to understand that an error will occur if another class that is not in the containment relationship is assigned.
assert ids(1, 2) == (1, 2)
assert ids(1, 2.0) == (1.0, 2.0)
ids(1, "a") # TypeError
Now, what about types that have different derived types?
i: Int or Str
j: Int or NoneType
ids(i, j) # ?
Before explaining this, we have to focus on the fact that Erg's type system doesn't actually look at (runtime) classes.
1: {__valueclass_tag__ = Phantom Int}
2: {__valueclass_tag__ = Phantom Int}
2.0: {__valueclass_tag__ = Phantom Ratio}
"a": {__valueclass_tag__ = Phantom Str}
ids(1, 2): {__valueclass_tag__ = Phantom Int} and {__valueclass_tag__ = Phantom Int} == {__valueclass_tag__ = Phantom Int}
ids(1, 2.0): {__valueclass_tag__ = Phantom Int} and {__valueclass_tag__ = Phantom Ratio} == {__valueclass_tag__ = Phantom Ratio} # Int < Ratio
ids(1, "a"): {__valueclass_tag__ = Phantom Int} and {__valueclass_tag__ = Phantom Str} == Never # TypeError
I don't see the class because it may not be seen exactly, because in Erg the class of an object belongs to runtime information.
For example, the class of an Int or Str
type object is either Int
or Str
, but you can only know which one it is by executing it.
Of course, the class of an object of type Int
is defined as Int
, but in this case as well, what is visible from the type system is the structural type {__valueclass_tag__ = Int}
of Int
.
Now let's go back to another structured type example. In conclusion, the above code will result in a TypeError as the type does not match. However, if you do type expansion with type annotations, compilation will pass.
i: Int or Str
j: Int or NoneType
ids(i, j) # TypeError: types of i and j not matched
# hint: try type widening (e.g. ids<Int or Str or NoneType>)
ids<Int or Str or NoneType>(i, j) # OK
A and B
have the following possibilities.
A and B == A
: whenA <: B
orA == B
.A and B == B
: whenA :> B
orA == B
.A and B == {}
: when!(A :> B)
and!(A <: B)
.
A or B
has the following possibilities.
A or B == A
: whenA :> B
orA == B
.A or B == B
: whenA <: B
orA == B
.A or B
is irreducible (independent types): if!(A :> B)
and!(A <: B)
.
Type widening in subroutine definitions
Erg defaults to an error if return types do not match.
parse_to_int s: Str =
if not s.is_numeric():
do parse_to_int::return error("not numeric")
... # return Int object
# TypeError: mismatched types of return values
# 3 | do parse_to_int::return error("not numeric")
# └─ Error
# 4 | ...
# └ Int
In order to solve this, it is necessary to explicitly specify the return type as Or type.
parse_to_int(s: Str): Int or Error =
if not s.is_numeric():
do parse_to_int::return error("not numeric")
... # return Int object
This is by design so that you don't unintentionally mix a subroutine's return type with another type.
However, if the return value type option is a type with an inclusion relationship such as Int
or Nat
, it will be aligned to the larger one.
Iterator
An iterator is an object used to retrieve elements of a container.
for! 0..9, i =>
print! i
This code prints the numbers 0 through 9.
Each number (=Int object) is assigned to i
and the following operation (=print! i
) is executed. This kind of repetitive execution is called iteration.
Now let's look at the type signature of the for!
procedure.
for!: |T: Type, I <: Iterable T| (I, T => None) => None
The first argument seems to accept an object of type Iterable
.
Iterable
is a type with .Iterator
attribute, .iter
method in the request method.
Iterable T = Trait {
.Iterator = {Iterator}
.iter = (self: Self) -> Self.Iterator T
}
The type {Iterator}
of the .Iterator
attribute is so-called set-kind (kind is described here).
assert [1, 2, 3] in Iterable(Int)
assert 1..3 in Iterable(Int)
assert [1, 2, 3].Iterator == ArrayIterator
assert (1..3).Iterator == RangeIterator
log [1, 2, 3].iter() # <ArrayIterator object
log (1..3).iter() # <RangeIterator object>
Both ArrayIterator
and RangeIterator
are classes that implement Iterator
and exist only to give Array
and Range
iteration functions.
Such a design pattern is called companion class 1.
And the IteratorImpl
patch is the core of the iteration functionality. Iterator
requires only one .next
method, IteratorImpl
provides dozens of methods indeed. ArrayIterator
and RangeIterator
can use the implementation method of IteratorImpl
just by implementing the .next
method. For this convenience, the standard library implements a number of iterators.
classDiagram
class Array~T~ {
...
iter() ArrayIterator~T~
}
class Range~T~ {
...
iter() RangeIterator~T~
}
class Iterable~T~ {
<<trait>>
iter() Iterator~T~
}
Iterable~T~ <|.. Array~T~: Impl
Iterable~T~ <|.. Range~T~: Impl
class ArrayIterator~T~ {
array: Array~T~
next() T
}
class RangeIterator~T~ {
range: Range~T~
next() T
}
class Iterator~T~ {
<<trait>>
next() T
}
Iterator~T~ <|.. ArrayIterator~T~: Impl
Iterator~T~ <|.. RangeIterator~T~: Impl
Array <-- ArrayIterator
Range <-- RangeIterator
Types such as Iterable
that provide an interface for handling traits (in this case Iterator
) in a static dispatch yet unified manner are called companion class adapters.
1 There doesn't seem to be a uniform name for this pattern, but in Rust, there is [companion struct pattern]( https://gist.github.com/qnighy/be99c2ece6f3f4b1248608a04e104b38# :~:text=%E3%82%8F%E3%82%8C%E3%81%A6%E3%81%84%E3%82%8B%E3%80%82-,companion%20struct,-%E3%83%A1%E3%82%BD%E3%83%83%E3%83%89%E3%81%A8%E3%80%81%E3 %81%9D%E3%81%AE), and was named after it. ↩
Mutability
As we have already seen, all Erg variables are immutable. However, Erg objects have the concept of mutability. Take the following code as an example.
a = [1, 2, 3]
a = a + [4, 5, 6]
print! a # [1, 2, 3, 4, 5, 6]
The above code cannot actually be executed by Erg. This is because it is not reassignable.
This code can be executed.
b = ![1, 2, 3]
b.concat! [4, 5, 6]
print! b # [1, 2, 3, 4, 5, 6]
The final result of a, b
looks the same, but their meanings are very different.
Although a
is a variable that represents an array of Nat
, the objects pointed to in the first and second lines are different. The name a
is the same, but the contents are different.
a = [1, 2, 3]
print! id! a # 0x000002A798DFE940
_a = a + [4, 5, 6]
print! id! _a # 0x000002A798DFE980
The id!
procedure returns the address in memory where the object resides.
b
is a Nat
"dynamic" array. The content of the object changes, but the variables point to the same thing.
b = ![1, 2, 3]
print! id! b # 0x000002A798DFE220
b.concat! [4, 5, 6]
print! id! b # 0x000002A798DFE220
i = !0
if! True. do!
do! i.inc!() # or i.add!(1)
do pass
print! i # 1
!
is a special operator called the mutation operator. It makes immutable objects mutable.
The behavior of objects marked with !
can be customized.
Point = Class {.x = Int; .y = Int}
# In this case .x is made mutable and .y remains immutable
Point! = Class {.x = Int!; .y = Int}
Point!.
inc_x! ref!(self) = self.x.update! x -> x + 1
p = Point!.new {.x = !0; .y = 0}
p.inc_x!()
print! p.x # 1
Constant
Unlike variables, constants point to the same thing in all scopes.
Constants are declared with the =
operator.
PI = 3.141592653589
match! x:
PI => print! "this is pi"
Constants are identical in all scopes below the global and cannot be overwritten. Therefore, they cannot be redefined by =
. This restriction allows it to be used in pattern matching.
The reason why True
and False
can be used in pattern matching is because they are constants.
Also, constants always point to immutable objects. Types such as Str!
cannot be constants.
All built-in types are constants because they should be determined at compile time. Types that are not constants can be generated, but they cannot be used to specify a type and can only be used like a simple record. Conversely, a type is a record whose contents are determined at compile time.
Variable, Name, Identifier, Symbol
Let's clear up a few terms related to variables in Erg.
A Variable is a mechanism to give an object a Name so that it can be reused (or point to that Name). Identifier is a grammar element that specifies a variable. A symbol is a grammatical element, a token, that represents a name.
Only non-symbolic characters are symbols, and symbols are not called symbols, although they can be identifiers as operators.
For example, x
is an identifier and a symbol. x.y
is also an identifier, but it is not a symbol. x
and y
are symbols.
And even if x
were not tied to any object, x
would still be a Symbol and an Identifier, but it would not be called a Variable.
Identifiers of the form x.y
are called Field Accessors.
Identifiers of the form x[y]
are called Subscript Accessors.
The difference between a variable and an identifier is that if we are talking about a variable in the sense of Erg's grammatical theory, the two are in effect the same. In C, types and functions cannot be assigned to variables; int and main are identifiers, but not variables (strictly speaking, they can be assigned, but there are restrictions). However, in Erg, "everything is an object". Not only functions and types, but even operators can be assigned to variables.
Ownership system
Since Erg is a language that uses Python as the host language, the method of memory management depends on the Python implementation. But semantically Erg's memory management is different from Python's. A notable difference is in the ownership system and the prohibition of circular references.
Ownership
Erg has an ownership system inspired by Rust. Rust's ownership system is generally considered esoteric, but Erg's is simplified to be intuitive. In Erg, mutable objects are owned and cannot be referenced after ownership is lost.
v = [1, 2, 3].into [Int; !3]
push! vec, x =
vec.push!(x)
vec
# The contents of v ([1, 2, 3]) are owned by w
w = push! v, 4
print! v # error: v was moved
print!w # [1, 2, 3, 4]
Ownership transfer occurs, for example, when an object is passed to a subroutine. If you want to still have ownership after giving it away, you'll need to clone, freeze, or borrow. However, as will be explained later, there are limited situations in which it can be borrowed.
replication
Duplicate an object and transfer its ownership. It does this by applying the .clone
method to the actual arguments.
The duplicated object is exactly the same as the original, but independent of each other and unaffected by changes.
Duplication is equivalent to Python's deep copy, and since it recreates the same object entirely, the computation and memory costs are generally higher than freezing and borrowing. A subroutine that needs to duplicate an object is said to be an "argument consuming" subroutine.
capitalize s: Str!=
s. capitalize!()
s
s1 = !"hello"
s2 = capitalize s1.clone()
log s2, s1 # !"HELLO hello"
freeze
We take advantage of the fact that immutable objects can be referenced from multiple places and convert mutable objects to immutable objects.
This is called freezing. Freezing is used, for example, when creating an iterator from a mutable array.
Since you can't create an iterator directly from a mutable array, convert it to an immutable array.
If you don't want to destroy the array, use the .freeze_map
method.
# Compute the sum of the values produced by the iterator
sum|T <: Add + HasUnit| i: Iterator T = ...
x = [1, 2, 3].into [Int; !3]
x.push!(4)
i = x.iter()
assert sum(i) == 10
y # y can still be touched
borrow
Borrowing is cheaper than duplicating or freezing. Borrowing can be done in the following simple cases:
peek_str ref(s: Str!) =
log s
s = !"hello"
peek_str s
A borrowed value is called a reference to the original object. You can "sublease" the reference to another subroutine, but you cannot consume it because you are only borrowing it.
steal_str ref(s: Str!) =
# Since the log function only borrows the arguments, it can be sub-leased
log s
# error because the discard function consumes arguments
discard s # OwnershipError: cannot consume a borrowed value
# hint: use `clone` method
steal_str ref(s: Str!) =
# This is no good either (= consumes the right side)
x = s # OwnershipError: cannot consume a borrowed value
x
Erg's references are more restrictive than Rust's. References are first-class objects in the language, but cannot be created explicitly, they can only be specified as argument passing via ref
/ref!
.
This means that you cannot stuff references into arrays or create classes with references as attributes.
However, such restrictions are a natural specification in languages without references in the first place, and they are not so inconvenient.
circular references
Erg is designed to prevent unintentional memory leaks, and will issue an error if the memory checker detects a circular reference. In most cases, this error can be resolved with a weak reference Weak
. However, since it is not possible to generate objects with circular structures such as cyclic graphs, we plan to implement an API that can generate circular references as unsafe operations.
Visibility
Erg variables have the concept of visibility.
All the variables we've seen so far are called private variables. This is an externally invisible variable.
For example, a private variable defined in the foo
module cannot be referenced by another module.
# foo.er
x = "this is an invisible variable"
# bar.er
foo = import "foo"
foo.x # AttributeError: Module 'foo' has no attribute 'x' ('x' is private)
On the other hand, there are also public variables, which can be referenced from the outside.
Public variables are defined with .
.
# foo.er
.x = "this is a visible variable"
# bar.er
foo = import "foo"
assert foo.x == "this is a visible variable"
You don't need to add anything to private variables, but you can also add ::
or self::
(Self::
for types etc.) to indicate that they are private. increase. It can also be module::
if it is a module.
::x = "this is an invisible variable"
assert ::x == x
assert self ::x == ::x
assert module::x == ::x
In the context of purely sequential execution, private variables are almost synonymous with local variables. It can be referenced from the inner scope.
::x = "this is a private variable"
y =
x + 1 # exactly module::x
By using ::
, you can distinguish variables with the same name within the scope.
Specify the scope of the variable you want to refer to on the left. Specify module
for the top level.
If not specified, the innermost variable is referenced as usual.
::x = 0
assert x == 0
y =
::x = 1
assert x == 1
z =
::x = 2
assert ::x == 2
assert z::x == 2
assert y::x == 1
assert module::x == 0
In the anonymous subroutine scope, self
specifies its own scope.
x = 0
f = x ->
log module::x, self::x
f1# 0 1
::
is also responsible for accessing private instance attributes.
x = 0
C = Class {x = Int}
C.
# Top-level x is referenced (warning to use module::x)
f1 self = x
# instance attribute x is referenced
f2 self = self::x
Visibility in external modules
A class defined in one module can actually define methods from an external module.
# foo.er
.Foo = Class()
# bar.er
{Foo;} = import "foo"
Foo::
private self = pass
Foo.
public self = self::private()
.f() =
foo = Foo.new()
foo.public()
foo::private() # AttributeError
However, both of those methods are only available within that module.
Private methods defined externally are visible to methods of the Foo
class only within the defining module.
Public methods are exposed outside the class, but not outside the module.
# baz.er
{Foo;} = import "foo"
foo = Foo.new()
foo.public() # AttributeError: 'Foo' has no attribute 'public' ('public' is defined in module 'bar')
Also, methods cannot be defined in the type to be re-exported. This is to avoid confusion about methods being found or not found depending on the module they are imported from.
# bar.er
{.Foo;} = import "foo"
.Foo::
private self = pass # Error
Foo.
public self = self::private() # Error
If you want to do something like this, define a patch.
# bar.er
{Foo;} = import "foo"
FooImpl = Patch Foo
FooImpl :=:
private self = pass
Foo Impl.
public self = self::private()
# baz.er
{Foo;} = import "foo"
{FooImpl;} = import "bar"
foo = Foo.new()
foo.public()
restricted public variables
Variable visibility is not limited to complete public/private. You can also publish with restrictions.
Add .[]
and specify "the identifier of a namespace 1 you want allow accessing" in it.
In the example below, .[.record]
is only exposed within the .record
namespace, and .[module]
is only exposed within the module.
# foo.er
.record = {
.a = {
.[.record]x = 0
.[module]y = 0
.z = 0
}
_ = .a.x # OK
_ = .a.y # OK
_ = .a.z # OK
}
func x =
_ = .record.a.x # VisibilityError
_ = .record.a.y # OK
_ = .record.a.z # OK
None
_ = .record.a.x # VisibilityError
_ = .record.a.y # OK
_ = .record.a.z # OK
foo = import "foo"
_ = foo.record.a.x # VisibilityError
_ = foo.record.a.y # VisibilityError
_ = foo.record.a.z # OK
Multiple namespaces can be specified by separating them with commas.
.[.record, func]x = 0
By the way, private attributes of a class are not accessible to subclasses.
C = Class {i = Int}
D = Inherit C
D.
f self = self::i # VisibilityError
If you want to be able to access from some subclass, specify as follows.
C = Class {.[D]i = Int}
D = Inherit C
D.
f self = self.i
If you want to expose it to all subclasses, use .[<:Self]
.
This is the equivalent of protected
in other languages.
C = Class {.[<: C]i = Int}
1 In Erg, a namespace refers to a set of correspondences between names and objects. Modules, functions, classes, records, and the identifiers of instant scopes are equated with namespaces. Since records, classes, and functions can be created without binding them to identifiers, they essentially create an anonymous namespace. However, when bound to an identifier, it is overwritten with the same name as the identifier. ↩
Naming convention
If you want to use a variable as a constant expression, make sure it starts with a capital letter. Two or more letters may be lowercase.
i: Option Type = Int
match i:
t: Type -> log "type"
None -> log "None"
Objects with side effects always end with !
. Procedures and procedural methods, and mutable types.
However, the Proc
type itself is not mutable.
# Callable == Func or Proc
c: Callable = print!
match c:
p! -> log "proc" # `: Proc` can be omitted since it is self-explanatory
f -> log "func"
If you want to expose an attribute to the outside world, define it with .
at the beginning. If you don't put .
at the beginning, it will be private. To avoid confusion, they cannot coexist within the same scope.
o = {x = 1; .x = 2} # SyntaxError: private and public variables with the same name cannot coexist
Literal Identifiers
The above rule can be circumvented by enclosing the string in single quotes (''). That is, procedural objects can also be assigned without !
. However, in this case, even if the value is a constant expression, it is not considered a constant.
A character string enclosed in single quotes like this is called a literal identifier.
This is used when calling APIs (FFI) of other languages such as Python.
bar! = pyimport("foo").'bar'
Identifiers that are also valid in Erg do not need to be enclosed in ''.
Furthermore, literal identifiers can contain both symbols and spaces, so strings that cannot normally be used as identifiers can be used as identifiers.
'∂/∂t' y
'test 1: pass x to y'()
Anonymous function
Anonymous functions are a syntax for creating function objects on the fly without naming them.
# `->` is an anonymous function operator
# same as `f x, y = x + y`
f = (x, y) -> x + y
# same as `g(x, y: Int): Int = x + y`
g = (x, y: Int): Int -> x + y
You can omit the ()
if there is only one argument.
assert [1, 2, 3].map_collect(i -> i + 1) == [2, 3, 4]
assert ((i, j) -> [i, j])(1, 2) == [1, 2]
In the case below it is 0..9, (i -> ...)
and not (0..9, i) -> ...
.
->
takes only one argument on the left side. Multiple arguments are received as a single tuple.
for 0..9, i: Int ->
...
In anonymous functions, there is a difference in parsing due to whitespace.
# In this case, interpreted as `T(() -> Int)`
i: T() -> Int
# in this case it is interpreted as (U()) -> Int
k: U() -> Int
Anonymous functions can be used without arguments.
# `=>` is an anonymous procedure operator
p! = () => print! "`p!` was called"
# `() ->`, `() =>` have syntax sugar `do`, `do!`
# p! = do! print! "`p!` was called"
p!() # `p!` was called
No-argument functions can be used for lazy initialization.
time = import "time"
date = import "datetime"
now = if! True:
do!:
time. sleep! 1000
date.now!()
do date.new("1970", "1", "1", "00", "00")
You can also type and pattern match. Because of this, the match
function is mostly implemented with the power of anonymous functions.
Anonymous functions given as arguments to the match
function are tried in order from the top. So, you should describe the special cases at the top and the more general cases at the bottom. If you get the order wrong, the compiler will issue a warning (if possible).
n = (Complex or Ratio or Int).sample!()
i = matchn:
PI -> PI # if equal to constant PI
For (i: 1..10) -> i # Int from 1 to 10
(i: Int) -> i # Int
(c: Complex) -> c.real() # For Complex. Int < Complex, but can fallback
_ -> panic "cannot convert to Int" # If none of the above apply. match must cover all patterns
Error handling is also generally done using ?
or match
.
res: ParseResult Int
matchres:
i: Int -> i
err: Error -> panic err.msg
res2: Result Int, Error
match res2:
ok: Not Error -> log Type of ok
err: Error -> panic err.msg
Anonymous polymorphic functions
# same as id|T| x: T = x
id = |T| x: T -> x
Subroutine Signatures
Func
some_func(x: T, y: U) -> V
some_func: (T, U) -> V
Proc
some_proc!(x: T, y: U) => V
some_proc!: (T, U) => V
Func Method
The method type cannot be specified externally with Self
.
.some_method(self, x: T, y: U) => ()
# Self.(T, U) => () takes ownership of self
.some_method: (Ref(Self), T, U) => ()
Proc Method (dependent)
In the following, assume that the type T!
takes the type argument N: Nat
. To specify it externally, use a type variable.
K!: Nat -> Type
# ~> indicates the state of the type argument before and after application (in this case, self must be a variable reference)
K!(N).some_method!: (Ref!(K! N ~> N+X), X: Nat) => ()
As a note, the type of .some_method
is |N, X: Nat| (Ref!(K! N ~> N+X), {X}) => ()
.
For methods that do not have ref!
, i.e., are deprived of ownership after application, the type argument transition (~>
) cannot be used.
If ownership is taken, it is as follows.
# If you don't use N, you can omit it with _.
# .some_method!: |N, X: Nat| (T!(N), {X}) => T!(N+X)
.some_method!|N, X: Nat|(self: T!(N), X: Nat) => T!(N+X)
Operator
It can be defined as a normal function by enclosing it with ``.
Neuter alphabetic operators such as and
and or
can be defined as neuter operators by enclosing them with ``.
and(x, y, z) = x and y and z
`_+_`(x: Foo, y: Foo) = x.a + y.a
`-_`(x: Foo) = Foo.new(-x.a)
Closure
Erg subroutines have a feature called a "closure" that captures external variables.
outer = 1
f x = outer + x
assert f(1) == 2
As with immutable objects, mutable objects can also be captured.
sum = !0
for! 1..10, i =>
sum.add!i
assert sum == 45
p!x=
sum.add!x
p!(1)
assert sum == 46
Note, however, that functions cannot capture mutable objects. If a mutable object can be referenced in a function, you can write code like the following.
# !!! This code actually gives an error !!!
i = !0
f x = i + x
assert f 1 == 1
i.add! 1
assert f 1 == 2
The function should return the same value for the same arguments, but the assumption is broken.
Note that i
is evaluated only at call time.
Call .clone
if you want the contents of the mutable object at the time the function was defined.
i = !0
immut_i = i.clone().freeze()
fx = immut_i + x
assert f 1 == 1
i.add! 1
assert f 1 == 1
avoid mutable state, functional programming
# Erg
sum = !0
for! 1..10, i =>
sum.add!i
assert sum == 45
The equivalent program above can be written in Python as follows:
# Python
sum = 0
for i in range(1, 10):
sum += i
assert sum == 45
However, Erg recommends a simpler notation. Instead of carrying around state using subroutines and mutable objects, use a style of localizing state using functions. This is called functional programming.
# Functional style
sum = (1..10).sum()
assert sum == 45
The code above gives exactly the same result as before, but you can see that this one is much simpler.
The fold
function can be used to do more than sum.
fold
is an iterator method that executes the argument f
for each iteration.
The initial value of the counter that accumulates results is specified in init
and accumulated in acc
.
# start with 0, result will
sum = (1..10).fold(init: 0, f: (acc, i) -> acc + i)
assert sum == 45
Erg is designed to be a natural succinct description of programming with immutable objects.
Module
Erg allows you to think of the file itself as a single record. This is called a module.
# foo.er
.i = 1
# Defining the foo module is almost the same as defining this record
foo = {.i = 1}
# bar.er
foo = import "foo"
print! foo # <module 'foo'>
assert foo.i == 1
Since module types are also record types, deconstruction assignment is possible.
For modules, you can omit the trailing ...
.
# same as {sin; cos; ...} = import "math"
{sin; cos} = import "math"
Module Visibility
Directories as well as files can be modules.
However, by default Erg does not recognize directories as Erg modules. To have it recognized, create a file named __init__.er
.
__init__.er
is similar to __init__.py
in Python.
└─┬ bar
└─ __init__.er
Now the bar
directory is recognized as a module. If the only file in bar
is __init__.er
, there is not much point in having a directory structure, but it is useful if you want to bundle several modules into a single module. For example:
└─┬ bar
├─ __init__.er
├─ baz.er
└─ qux.er
From outside the bar
directory, you can use like the following.
bar = import "bar"
bar.baz.p!()
bar.qux.p!()
__init__.er
is not just a marker that makes a directory as a module, it also controls the visibility of the module.
# __init__.er
# `. /` points to the current directory. It can be omitted
.baz = import ". /baz"
qux = import ". /qux"
.f x =
.baz.f ...
.g x =
qux.f ...
When you import a bar
module from outside, the baz
module will be accessible, but the qux
module will not.
circular dependencies
Erg allows you to define circular dependencies between modules.
# foo.er
bar = import "bar"
print! bar.g 1
.f x = x
# bar.er
foo = import "foo"
print! foo.f 1
.g x = x
However, variables created by procedure calls cannot be defined in circular reference modules. This is because Erg rearranges the order of definitions according to dependencies.
# foo.er
bar = import "bar"
print! bar.x
.x = g!(1) # ModuleError: variables created by procedure calls cannot be defined in circular reference modules
# bar.er
foo = import "foo"
print! foo.x
.x = 0
In addition, An Erg module that is an entry point (i.e., a module that __name__ == "__main__"
) cannot be the subject of circular references.
Object
All data that can be assigned to a variable. The attributes of the Object
class are as follows.
.__repr__
: Returns a (non-rich) string representation of the object.__sizeof__
: Returns the size of the object (including heap allocation).__dir__
: Returns a list of object attributes.__hash__
: returns the hash value of the object.__getattribute__
: Get and return an attribute of an object.clone
: Creates and returns a clone of an object (with an independent entity in memory).copy
: Returns a copy of the object (pointing to the same thing in memory)
Record
An object generated by a record literal ({attr = value; ...}
).
This object has basic methods such as .clone
and .__sizeof__
.
obj = {.x = 1}
assert obj.x == 1
obj2 = {*x; .y = 2}
assert obj2.x == 1 and obj2.y == 2
Attribute
An object associated with an object. In particular, a subroutine attribute that takes self (self
) as its implicit first argument is called a method.
# note that there is no `.` in private_attr
record = {.public_attr = j; private_attr = 2; .method = self -> self.i + 1}
record. public_attr == 2
record.private_attr # AttributeError: private_attr is private
assert record.method() == 3
Element
An object belonging to a particular type (e.g. 1
is an element of type Int
). All objects are at least elements of type {=}
.
Elements of classes are sometimes called instances.
Subroutine
Indicates an object that is an instance of a function or procedure (including methods). The class representing a subroutine is Subroutine
.
An object that implements .__call__
is more commonly called a Callable
.
Callable
An object that implements .__call__
. It is also the superclass of Subroutine
.
Type
An object that defines requirement attributes and commonizes objects.
There are two main types: Polymorphic Type and Monomorphic Type. Typical monomorphic types are Int
, Str
, etc., and polymorphic types are Option Int
, [Int; 3]
, etc.
Furthermore, a type that defines a method that changes the state of an object is called a Mutable type, and it is necessary to add !
to the variable attribute (e.g. dynamic array: [T; !_]
) .
Class
A type that has .__new__
, .__init__
methods, etc. Implement class-based object orientation.
Function
A subroutine that has read permission for external variables (excluding static variables) but does not have read/write permission for external variables. In other words, it has no external side effects. Erg functions are defined differently than Python's because they do not allow side effects.
Procedure
It has read and self
permissions for external variables, read/write permissions for static variables, and is allowed to use all subroutines. It can have external side effects.
Method
A subroutine that implicitly takes self
as the first argument. It is a different type than a simple function/procedure.
Entity
Objects that are not subroutines and types.
Monomorphic entities (1
, "a"
, etc.) are also called value objects, polymorphic entities ([1, 2, 3], {"a": 1}
) are also called container objects .
pattern matching, refutable
Patterns available in Erg
variable pattern
# basic assignments
i = 1
# with type
i: Int = 1
# with anonymous type
i: {1, 2, 3} = 2
# functions
fn x = x + 1
# equals
fn x: Add(Int) = x + 1
# (anonymous) function
fn = x -> x + 1
fn: Int -> Int = x -> x + 1
# higher-order type
a: [Int; 4] = [0, 1, 2, 3]
# or
a: Array Int, 4 = [0, 1, 2, 3]
Literal patterns
# Raise a TypeError if `i` cannot be determined to be 1 at compile time.
# omit `_: {1} = i`
1 = i
# simple pattern matching
match x:
1 -> "1"
2 -> "2"
_ -> "other"
# fibonacci function
fib0 = 0
fib1 = 1
fibn: Nat = fibn-1 + fibn-2
constant pattern
cond=False
match! cond:
True => print! "cond is True"
_ => print! "cond is False"
PI = 3.141592653589793
E = 2.718281828459045
num = PI
name = match num:
PI -> "pi"
E -> "e"
_ -> "unnamed"
Refinement pattern
# these two are the same
Array(T, N: {N | N >= 3})
Array(T, N | N >= 3)
f M, N | M >= 0, N >= 1 = ...
f(1, 0) # TypeError: N (2nd parameter) must be 1 or more
discard (wildcard) pattern
_ = 1
_: Int = 1
zero_ = 0
right(_, r) = r
If not constrained by context, _
is of type Obj
.
Variable length patterns
It is used in combination with the tuple/array/record pattern described later.
[i,...j] = [1, 2, 3, 4]
assert j == [2, 3, 4]
first|T|(fst: T, ...rest: T) = fst
assert first(1, 2, 3) == 1
Tuple pattern
(i, j) = (1, 2)
((k, l), _) = ((1, 2), (3, 4))
# If not nested, () can be omitted (1, 2 are treated as (1, 2))
m, n = 1, 2
f(x, y) = ...
array pattern
[i, j] = [1, 2]
[[k, l], _] = [[1, 2], [3, 4]]
length[] = 0
length[_, ...rest] = 1 + lengthrest
record pattern
record = {i = 1; j = 2; k = 3}
{j; ...} = record # i, k will be freed
{sin; cos; tan; ...} = import "math"
{*} = import "math" # import all
person = {name = "John Smith"; age = 20}
age = match person:
{name = "Alice"; _} -> 7
{_; age} -> age
f {x: Int; y: Int} = ...
Data class pattern
Point = Inherit {x = Int; y = Int}
p = Point::{x = 1; y = 2}
Point::{x; y} = p
Nil T = Class Impl := Phantom T
Cons T = Inherit {head = T; rest = List T}
List T = Enum Nil(T), Cons(T)
List T.
first self =
match self:
Cons::{head; ...} -> x
_ -> ...
second self =
match self:
Cons::{rest=Cons::{head; ...}; ...} -> head
_ -> ...
enumeration pattern
*Actually, it's just an enumeration type
match x:
i: {1, 2} -> "one or two: \{i}"
_ -> "other"
range pattern
*Actually, it is just an interval type.
# 0 < i < 1
i: 0<..<1 = 0.5
# 1 < j <= 2
_: {[I, J] | I, J: 1<..2} = [1, 2]
# 1 <= i <= 5
match i
i: 1..5 -> ...
Things that aren't patterns, things that can't be patterned
A pattern is something that can be uniquely specified. In this respect pattern matching differs from ordinary conditional branching.
Condition specifications are not unique. For example, to check if the number n
is even, the orthodox is n % 2 == 0
, but you can also write (n / 2).round() == n / 2
.
A non-unique form is not trivial whether it works correctly or is equivalent to another condition.
set
There is no set pattern. Because the set has no way to uniquely retrieve the elements. You can retrieve them by iterator, but the order is not guaranteed.
Comprehension
Array with [expr | (name <- iterable)+ (predicate)*]
,
set with {expr | (name <- iterable)+ (predicate)*}
,
You can create a Dict with {key: value | (name <- iterable)+ (predicate)*}
.
The first part of the clauses separated by |
is called the layout clause (location clause), the second part is called the bind clause (binding clause), and the third part is called the guard clause (conditional clause).
A guard clause can be omitted, but a bind clause cannot be omitted, and a guard clause cannot precede a bind clause.
Comprehension example
# the layout clause is i
# bind clause is i <- [0, 1, 2]
assert [i | i <- [0, 1, 2]] == [0, 1, 2]
# layout clause is i / 2
# bind clause is i <- 0..2
assert [i/2 | i <- 0..2] == [0.0, 0.5, 1.0]
# layout clause is (i, j)
# bind clause i <- 0..2, j <- 0..2
# guard clause is (i + j) % 2 == 0
assert [(i, j) | i <- 0..2; j <- 0..2; (i + j) % 2 == 0] == [(0, 0), (0, 2), (1, 1), (2, 0), (2, 2)]
assert {i % 2 | i <- 0..9} == {0, 1}
assert {k: v | k <- ["a", "b"]; v <- [1, 2]} == {"a": 1, "b": 2}
Erg comprehensions are inspired by Haskell, but with some differences. For Haskell list comprehensions, the order of variables makes a difference in the result, but in Erg it doesn't matter.
-- Haskell
[(i, j) | i <- [1..3], j <- [3..5]] == [(1,3),(1,4),(1,5),(2 ,3),(2,4),(2,5),(3,3),(3,4),(3,5)]
[(i, j) | j <- [3..5], i <- [1..3]] == [(1,3),(2,3),(3,3),(1 ,4),(2,4),(3,4),(1,5),(2,5),(3,5)]
# Erg
assert [(i, j) | i <- 1..<3; j <- 3..<5] == [(i, j) | j <- 3..<5; i <- 1.. <3]
This specification is the same as that of Python.
# Python
assert [(i, j) for i in range(1, 3) for j in range(3, 5)] == [(i, j) for j in range(3, 5) for i in range(1, 3)]
Refinement type
Similar to comprehensions are refinement types. A refinement type is a type (enumerated type) created in the form {Name: Type | Predicate}
.
In the case of the refinement type, only one Name can be specified and the layout cannot be specified (however, multiple values can be handled if it is a tuple type), and the Predicate can be calculated at compile time, that is, only a constant expression can be specified.
Nat = {I: Int | I >= 0}
# If the predicate expression is only and, it can be replaced with ;
# Nat2D = {(I, J): (Int, Int) | I >= 0; J >= 0}
Nat2D = {(I, J): (Int, Int) | I >= 0 and J >= 0}
Spread assignment
In a decomposing assignment, putting ...
in front of a variable expands all remaining elements into that variable. This is called expansion assignment.
[x, *y] = [1, 2, 3]
assert x == 1
assert y == [2, 3]
x, *y = (1, 2, 3)
assert x == 1
assert y == (2, 3)
Extract assignment
Extraction assignment is a convenient syntax for localizing specific attributes within a module or record.
{sin; cos; tan} = import "math"
After that, you can use sin, cos, tan
locally.
You can do the same with records.
record = {x = 1; y = 2}
{x; y} = record
If you want to expand all, use {*} = record
. It is open
in OCaml.
record = {x = 1; y = 2}
{*} = records
assert x == 1 and y == 2
Decorator (modifier)
Decorators are used to add or demonstrate a particular state or behavior to a type or function. The syntax of the decorator is as follows.
@deco
X = ...
You can have multiple decorators as long as they don't conflict.
A decorator is not a special object, it's just a one-argument function. The decorator is equivalent to the following pseudocode.
X = ...
X = deco(X)
Erg doesn't allow reassignment of variables, so code like the one above won't work.
For simple variables it's the same as X = deco(...)
, but for instant blocks and subroutines you can't do that, so you need a decorator.
@deco
f x =
y = ...
x + y
# You can also prevent the code from becoming horizontal
@LongNameDeco1
@LongNameDeco2
C = Class...
Below are some frequently used built-in decorators.
Inheritable
Indicates that the defining type is an inheritable class. If you specify "public"
for the argument scope
, it will be possible to inherit even the class of the external module. By default it is "private"
and cannot be inherited externally.
Final
Make the method non-overridable. Adding it to a class makes it a non-inheritable class, but since it's the default it doesn't make sense.
Override
Used when overriding attributes. By default, Erg will throw an error if you try to define the same attribute as the base class.
Impl
Indicates that the argument trait is implemented.
Add = Trait {
.`_+_` = Self.(Self) -> Self
}
Sub = Trait {
.`_-_` = Self.(Self) -> Self
}
C = Class({i = Int}, Impl := Add and Sub)
C.
@Impl Add
`_+_` self, other = C.new {i = self::i + other::i}
@Impl Sub
`_-_` self, other = C.new {i = self::i - other::}
Attach
Specifies the attachment patch that comes with the trait by default. This allows you to reproduce the same behavior as Rust traits.
# foo.er
Add R = Trait {
.AddO = Type
.`_+_` = Self.(R) -> Self.AddO
}
@Attach AddForInt, AddForOdd
ClosedAdd = Subsume Add(Self)
AddForInt = Patch(Int, Impl := ClosedAdd)
AddForInt.AddO = Int
AddForOdd = Patch(Odd, Impl := ClosedAdd)
AddForOdd.AddO = Even
This will automatically apply the attachment patch when importing traits from other modules.
# Originally, IntIsBinAdd and OddIsBinAdd should be imported at the same time, but if it's an attachment patch, you can omit it
{BinAdd; ...} = import "foo"
assert Int. AddO == Int
assert Odd.AddO == Even
Internally it's just attached using the trait's .attach
method. Conflicts can be removed with the trait's .detach
method.
@Attach X
T = Trait ...
assert X in T. attaches
U = T.detach(X).attach(Y)
assert X not in U. attaches
assert Y in U. attaches
Deprecated
Indicates that the variable specification is obsolete and deprecated.
Test
Indicates that this is a test subroutine. Test subroutines are run with the erg test
command.
error handling system
Mainly use Result type. In Erg, an error occurs if you throw away an Error type object (not supported at the top level).
Exceptions, interop with Python
Erg does not have an exception mechanism (Exception). When importing a Python function
- Set return value to
T or Error
type T or Panic
type (may cause runtime error)
There are two options, pyimport
defaults to the latter. If you want to import as the former, use
Specify Error
in pyimport
exception_type
(exception_type: {Error, Panic}
).
Exceptions and Result types
The Result
type represents values that may be errors. Error handling with Result
is superior to the exception mechanism in several ways.
First of all, it's obvious from the type definition that the subroutine might throw an error, and it's also obvious when you actually use it.
# Python
try:
x = foo().bar()
y = baz()
qux()
except e:
print(e)
In the above example, it is not possible to tell from this code alone which function raised the exception. Even going back to the function definition, it's hard to tell if the function throws an exception.
# Erg
try!:
do!:
x = foo!()?.bar()
y = baz!()
qux!()?
e =>
print! e
On the other hand, in this example we can see that foo!
and qux!
can raise an error.
Precisely y
could also be of type Result
, but you'll have to deal with it eventually to use the value inside.
The benefits of using the Result
type don't stop there. The Result
type is also thread-safe. This means that error information can be (easily) passed between parallel executions.
Context
Since the Error
/Result
type alone does not cause side effects, unlike exceptions, it cannot have information such as the sending location (Context), but if you use the .context
method, you can put information in the Error
object. can be added. The .context
method is a type of method that consumes the Error
object itself and creates a new Error
object. They are chainable and can hold multiple contexts.
f() =
todo() \
.context "to be implemented in ver 1.2" \
.context "and more hints ..."
f()
# Error: not implemented yet
# hint: to be implemented in ver 1.2
# hint: and more hints ...
Note that Error
attributes such as .msg
and .kind
are not secondary, so they are not context and cannot be overridden as they were originally created.
Stack trace
The Result
type is often used in other languages because of its convenience, but it has the disadvantage of making it difficult to understand the source of an error compared to the exception mechanism.
Therefore, in Erg, the Error
object has an attribute called .stack
, and reproduces a pseudo-exception mechanism-like stack trace.
.stack
is an array of caller objects. Each time an Error object is returned
(including by ?
) it pushes its calling subroutine onto the .stack
.
And if it is ?
ed or .unwrap
ed in a context where return
is not possible, it will panic with a traceback.
f x =
...
y = foo.try_some(x)?
...
g x =
y = f(x)?
...
i = g(1)?
# Traceback (most recent call first):
# ...
# Foo.try_some, line 10, file "foo.er"
# 10 | y = foo.try_some(x)?
# module::f, line 23, file "foo.er"
# 23 | y = f(x)?
# module::g, line 40, file "foo.er"
# 40 | i = g(1)?
# Error: ...
Panic
Erg also has a mechanism for dealing with unrecoverable errors called panicing. An unrecoverable error is an error caused by an external factor such as a software/hardware malfunction, an error so fatal that it makes no sense to continue executing the code, or an error unexpected by the programmer. Etc. If this happens, the program will be terminated immediately, because the programmer's efforts cannot restore normal operation. This is called "panicing".
Panic is done with the panic
function.
panic "something went wrong!"
Pipeline operator
Pipeline operators are used like this:
assert f(g(x)) == (x |> g |> f())
assert f(g(x, y)) == (x |> g(y) |> f())
In other words, the order Callable(object)
can be changed to object |> Callable()
.
The pipeline operator can also be used on methods. For methods, object.method(args)
changes to object |>.method(args)
.
It looks like just more |>
, but since the bond strength is low, you may be able to reduce the amount of ()
.
rand = -1.0..1.0 |>.sample!()
log rand # 0.2597...
1+1*2 |>.times do log("a", end := "") # aaa
evens = 1..100 |>.iter() |>.filter i -> i % 2 == 0 |>.collect Array
# When implemented without the pipeline operator,
_evens = (1..100).iter().filter(i -> i % 2 == 0).collect(Array)
# or
__evens = 1..100 \
.iter() \
.filter i -> i % 2 == 0 \
.collect Array
Integration with Python
Export to Python
When the Erg script is compiled, a .pyc file is generated, which can simply be imported as a Python module. However, variables set to private on the Erg side cannot be accessed from Python.
# foo.er
.public = "this is a public variable"
private = "this is a private variable"
erg --compile foo.er
import foo
print(foo.public)
print(foo.private) # AttributeError:
import from Python
By default, all objects imported from Python are of type Object
. Since no comparison is possible with this type, it is necessary to narrow down the type.
Type specification in the standard library
All APIs in the Python standard library are type-specified by the Erg development team.
time = pyimport "time"
time.sleep! 1
Type specification for user scripts
Type hints on the Python side are ignored. Create a foo.d.er
file that types the Python foo
module.
# foo.py
X = ...
def bar(x):
...
def baz():
...
class C:
...
...
# foo.d.er
.X: Int
.bar!: Int => Int
.foo! = baz!: () => Int # aliasing
.C!: Class
No syntax other than declarations and definitions (aliasing) are allowed in d.er
.
Note that all Python functions can only be registered as procedures, and all classes as variable classes.
foo = pyimport "foo"
assert foo.bar!(1) in Int
This ensures type safety by performing type checking at runtime. The checking mechanism generally works as follows.
decl_proc proc!: Proc, T =
x =>
assert x in T.Input
y = proc!(x)
assert y in T.Output
y
This is a runtime overhead, so a project to statically type analyze Python scripts with Erg's type system is underway.
Package System
Erg packages can be roughly classified into the app package, which is the application, and the lib package, which is the library.
The entry point of the app package is src/app.er
. The main
function defined in app.er
is executed.
The entry point for the lib package is src/lib.er
. Importing a package is equivalent to importing lib.er
.
A package has a sub-structure called a module, which in Erg is an Erg file or directory composed of Erg files. External Erg files/directories are manipulatable objects as module objects.
In order for a directory to be recognized as a module, a __init__.er
file must be placed in the directory.
This is similar to __init__.py
in Python.
As an example, consider the following directory structure.
└─┬ ./src
├─ app.er
├─ foo.er
└─┬ bar
├─ __init__.er
├─ baz.er
└─ qux.er
In app.er
you can import foo
and bar
modules. The bar
directory can be recognized as a module because of the __init__.er
file.
A foo
module is a module consisting of files, and a bar
module is a module consisting of directories. The bar
module also contains baz
and qux
modules.
This module is simply an attribute of the bar
module, and can be accessed from app.er
as follows
# app.er
foo = import "foo"
bar = import "bar"
baz = bar.baz
# or `baz = import "bar/baz"`
main args =
...
Note that the delimiter for accessing submodules is /
. This is because a file name like bar.baz.er
is possible.
However, such filenames are discouraged, because in Erg, the identifier immediately preceding the .er
, the prefix, is meaningful.
For example, a module for testing. A file ending with .test.er
is a (white box) test module, which executes a subroutine decorated with @Test
when the test is run.
└─┬ . /src
├─ app.er
├─ foo.er
└─ foo.test.er
# app.er
foo = import "foo"
main args =
...
Also, modules that are not reimported in __init__.er
are private modules and can only be accessed by modules in the same directory.
└─┬
├─ foo.er
└─┬ bar
├─ __init__.er
├─ baz.er
└─ qux.er
# __init__.py
.qux = import "qux" # this is public
# foo.er
bar = import "bar"
bar.qux
bar.baz # AttributeError: module 'baz' is private
# qux.er
baz = import "baz"
Generator
Generators are special procedures that use the yield!
procedure in a block.
g!() =
yield! 1
yield! 2
yield! 3
yield!
is a procedure defined in a block of subroutines that calls self!.yield!
. Like return
, it returns the value passed to it as a return value, but it has the feature of saving the current execution state of the block and executing it from the beginning when it is called again.
A generator is both a procedure and an iterator; a Python generator is a function that creates an iterator, while Erg iterates directly. Procedures themselves are generally not mutable objects (no !
), but a generator is a mutable object because its own contents can change with each execution.
# Generator!
g!: Generator!((), Int)
assert g!() == 1
assert g!() == 2
assert g!() == 3
A Python-style generator can be defined as follows.
make_g() = () =>
yield! 1
yield! 2
yield! 3
make_g: () => Generator!
The Grammar of Erg (ver 0.1.0, provisional)
special_op ::= '=' | '->' | '=>' | '.' | ',' | ':' | '::' | '|>' | '&'
separator ::= ';' | '\n'
escape ::= '\'
comment_marker ::= '#'
reserved_symbol ::= special_op | separator | comment_marker
number ::= [0-9]
first_last_dight ::= number
dight ::= number | '_'
bin_dight ::= [0-1]
oct_dight ::= [0-8]
hex_dight ::= [0-9]
| [a-f]
| [A-F]
int ::= first_last_dight
| first_last_dight dight* first_last_dight
| '0' ('b' | 'B') binary_dight+
| '0' ('o' | 'O') octa_dight+
| '0' ('x' | 'X') hex_dight+
ratio ::= '.' dight* first_last_dight
| first_last_dight dight* '.' dight* first_last_dight
bool ::= 'True' | 'False'
none ::= 'None'
ellipsis ::= 'Ellipsis'
not_implemented ::= 'NotImplemented'
parenthesis ::= '(' | ')'
bracket ::= '{' | '}'
square_bracket ::= '[' | ']'
enclosure ::= parenthesis | bracket | square_bracket
infix_op ::= '+' | '-' | '*' | '/' | '//' | '**'
| '%' | '&&' | '||' | '^^' | '<' | '<=' | '>' | '>='
| 'and' | 'or' | 'is' | 'as' | 'isnot' | 'in' | 'notin' | 'dot' | 'cross'
prefix_op ::= '+' | '-' | '*' | '**' | '..' | '..<' | '~' | '&' | '!'
postfix_op ::= '?' | '..' | '<..'
operator ::= infix_op | prefix_op | postfix_op
char ::= /* ... */
str ::= '\"' char* '\"
symbol_head ::= /* char except dight */
symbol ::= symbol_head /* char except (reserved_symbol | operator | escape | ' ') */
subscript ::= accessor '[' expr ']'
attr ::= accessor '.' symbol
accessor ::= symbol | attr | subscript
literal ::= int | ratio | str | bool | none | ellipsis | not_implemented
pos_arg ::= expr
kw_arg ::= symbol ':' expr
arg ::= pos_arg | kw_arg
enc_args ::= pos_arg (',' pos_arg)* ','?
args ::= '()' | '(' arg (',' arg)* ','? ')' | arg (',' arg)*
var_pattern ::= accessor | `...` accessor | '[' single_patterns ']'
var_decl_opt_t = var_pattern (':' type)?
var_decl = var_pattern ':' type
param_pattern ::= symbol | `...` symbol | literal | '[' param_patterns ']'
param_decl_opt_t = param_pattern (':' type)?
param_decl = param_pattern ':' type
params_opt_t ::= '()' (':' type)?
| '(' param_decl_opt_t (',' param_decl_opt_t)* ','? ')' (':' type)?
| param_decl_opt_t (',' param_decl_opt_t)*
params ::= '()' ':' type
| '(' param_decl (',' param_decl)* ','? ')' ':' type
subr_decl ::= accessor params
subr_decl_opt_t ::= accessor params_opt_t
decl ::= var_decl | subr_decl
decl_opt_t = var_decl_opt_t | subr_decl_opt_t
body ::= expr | indent line+ dedent
def ::= ('@' decorator '\n')* decl_opt_t '=' body
call ::= accessor args | accessor call
decorator ::= call
lambda_func ::= params_opt_t '->' body
lambda_proc ::= params_opt_t '=>' body
lambda ::= lambda_func | lambda_proc
normal_array ::= '[' enc_args ']'
array_comprehension ::= '[' expr | (generator)+ ']'
array ::= normal_array | array_comprehension
record ::= '{' '=' '}'
| '{' def (';' def)* ';'? '}'
set ::= '{' '}'
| '{' expr (',' expr)* ','? '}'
dict ::= '{' ':' '}'
| '{' expr ':' expr (',' expr ':' expr)* ','? '}'
tuple ::= '(' ')'
| '(' expr (',' expr)* ','? ')'
indent ::= /* ... */
expr ::= accessor | literal
| prefix | infix | postfix
| array | record | set | dict | tuple
| call | def | lambda
line ::= expr separator+
program ::= expr? | (line | comment)*
index
See here for APIs not in this index.
Also, see here for terminology.
symbol
- ! → side effect
- !-type → mutable type
- ? → error handling
- # → Str
- $ → shared
- %
- &
- &&
- ′ (single quote)
- " (double quote)
- () → Tuple
- *
- + (prefix) → operator
- +_ → + (prefix)
- + (infix) → operator
- + (infix) → Trait
- ,
- − (prefix)
- −_ → − (prefix)
- − (infix) → operator
- − (infix) → Trait
- −> → anonymous function
- . → Visibility
- .. → closed range operator
- ..< → right-open range operator
- ...
- ... → Extract assignment
- ... → Variable-length arguments
- /
- :
- : → Colon application style
- : → Type ascription
- : → Keyword arguments
- :: → private variable modifier
- := → default parameters
- ;
- <
- <: → Subtype specification
- <<
- <=
- <.. → left-open range operator
- <..< → open range operator
- = → Variable
- ==
- => → anonymous procedure operator
- >
- >>
- >=
- @ → decorator
- [] → Array
- \ → Escaping
- ^
- ^^
- _ → Type erasure
- _+_ → + (infix)
- _-_ → − (infix)
- `` (back quote)
- {}
- {=} → Type System
- |
- || → Type variable list
- ~
alphabet
A
- [Add]
- alias
- Aliasing
- For all types
- algebraic type
- [And]
- [and]
- anonymous function
- anonymous type → Type system
- Array
- assert
- Attach
- attribute
- Attribute definitions
- Attribute type
B
C
- Cast
- Comments
- Complex object
- Compile-time functions
- circular references
- Class
- Class relationship
- Class upcasting
- Colon application style
- Closure
- Compound literals
- Complement
- Comprehension
- constant
- Constants
- Context
D
- Data type
- Declaration
- decorator
- Default parameters
- Del
- Dependent type
- Deprecated
- Dict
- Diff
- distinct
- Downcasting
E
- Empty record
- Enum class
- Enum type
- Enumerated, Interval and Refinement types
- error handling
- Existential type
- Exponential literal
- Extract assignment
F
- False → Boolean object
- Float object
- for
- For-All patch
- freeze
- Function
- Function definition with multiple patterns
G
H
I
- id
- if
- import
- impl
- in
- Indention
- Instant block
- Instance/class attributes
- inheritable
- inheritance
- Int
- Integration with Python
- Interval Type
- Intersection
- Iterator
J
K
L
- lambda → anonymous function
- let-polymorphism → [rank 1 polymorphism]
- Literal Identifiers
M
- match
- Marker trait
- Method
- Modifier → decorator
- module
- Multiple inheritance
- Multi-layer (multi-level) Inheritance
- Mutable type
- Mutable structure type
- Mutability
N
- Nat
- Never
- New type
- Heterogeneous Dict
- None → [None Object]
- None Object
- Nominal Subtyping → Class
- Not
- not
O
- Object
- [Option]
- [Or]
- [or]
- [Ord]
- ownership system
- Overloading
- Overriding
- Override in trait
P
- Panic
- Patch
- Pattern match
- Phantom class
- pipeline operator
- Predicate
- print!
- Procedures
- Projection type
- Python → Integration with Python
Q
R
- Range Object
- ref
- ref!
- Record
- Recursive functions
- Refinement pattern
- Refinement type
- replication
- Replacing traits
- Result → error handling
S
- Script
- self
- Self
- Shared reference
- side-effect
- Smart cast
- Spread assignment
- special type variables
- Stack trace
- Structure type
- Structural patch
- Structural trait
- Structural subtyping
- Structural types and class type relationships
- Str
- Subtyping
- Subtyping of subroutines
- Subtype specification
- Subtyping of polymorphic function types
- Subroutine signatures
T
- Test
- Traits
- Trait inclusion
- True → Boolean object
- True algebraic type
- Type
- type
- Type arguments in method definitions
- Type bound
- Type definitions
- Type erasure
- Type inference system
- Type specification
- Type system
- Type widening
- Tuple
U
V
W
- while!
X
Y
Z
Quick Tour
The documentation below syntax
is written with the aim of being understandable even for programming beginners.
For those who have already mastered languages such as Python, Rust, Haskell, etc., it may be a bit verbose.
So, here's an overview of the Erg grammar. Please think that the parts not mentioned are the same as Python.
Basic calculation
Erg has a strict type. However, types are automatically casting if subtypes due to the flexibility provided by classes and traits (see API for details).
In addition, different types can be calculated for each other as long as the type is a numeric type.
a = 1 # 1: Nat
b = a - 10 # -9: Int
c = b / 2 # -4.5: Float
d = c * 0 # -0.0: Float
e = f // 2 # 0: Nat
If you do not want to allow unexpected type widening, you can specify the type at declaration time to detect them as errors at compile time.
a = 1
b: Int = a / 2
# error message
Error[#0047]: File <stdin>, line 1, in <module>
2│ b: Int = a / 2
^
TypeError: the type of b is mismatched:
expected: Int
but found: Float
Boolean type
True
and False
are singletons of the Boolean type, but they can also be cast to the Int type.
Therefore, they can be compared if they are of type Int, but comparisons with other types will result in an error.
True == 1 # OK
False == 0 # OK
True == 1.0 # NG
False == 0.0 # NG
True == "a" # NG
Variables, constants
Variables are defined with =
. As with Haskell, variables once defined cannot be changed. However, it can be shadowed in another scope.
i = 0
if True:
i = 1
assert i == 0
Anything starting with an uppercase letter is a constant. Only things that can be computed at compile time can be constants. Also, a constant is identical in all scopes since its definition. This property allows constants to be used in pattern matching.
PI = 3.141592653589793
match random.random!(0..10):
PI ->
log "You get PI, it's a miracle!"
declaration
Unlike Python, only the variable type can be declared first. Of course, the declared type and the type of the object actually assigned to must be compatible.
i: Int
i = 10
Functions
You can define it just like in Haskell.
fib 0 = 0
fib 1 = 1
fib n = fib(n - 1) + fib(n - 2)
An anonymous function can be defined like this:
i -> i + 1
assert [1, 2, 3].map(i -> i + 1).to_arr() == [2, 3, 4]
operator
The Erg-specific operators are:
mutating operator (!)
It's like ref
in Ocaml.
i = !0
i.update! x -> x + 1
assert i == 1
Procedures
Subroutines with side effects are called procedures and are marked with !
.
Functions are subroutines that do not have side effects (pure).
You cannot call procedures in functions. This explicitly isolates side effects.
print! 1 # 1
generic function (polycorrelation)
id|T|(x: T): T = x
id(1): Int
id("a"): Str
Records
You can use the equivalent of records in ML-like languages (or object literals in JS).
p = {x = 1; y = 2}
assert p.x == 1
Ownership
Ergs are owned by mutable objects (objects mutated with the !
operator) and cannot be rewritten from multiple places.
i = !0
j = i
assert j == 0
i# MoveError
Immutable objects, on the other hand, can be referenced from multiple places.
Visibility
Prefixing a variable with .
makes it a public variable and allows it to be referenced from external modules.
# foo.er
.x = 1
y = 1
foo = import "foo"
assert foo.x == 1
foo.y # VisibilityError
Pattern matching
Variable pattern
# basic assignments
i = 1
# with type
i: Int = 1
# functions
fn x = x + 1
fn: Int -> Int = x -> x + 1
Literal patterns
# if `i` cannot be determined to be 1 at compile time, TypeError occurs.
# shorthand of `_: {1} = i`
1 = i
# simple pattern matching
match x:
1 -> "1"
2 -> "2"
_ -> "other"
# fibonacci function
fib 0 = 0
fib 1 = 1
fib n: Nat = fibn-1 + fibn-2
Constant pattern
PI = 3.141592653589793
E = 2.718281828459045
num = PI
name = match num:
PI -> "pi"
E -> "e"
_ -> "unnamed"
Discard (wildcard) pattern
_ = 1
_: Int = 1
right(_, r) = r
Variable length patterns
Used in combination with the tuple/array/record pattern described later.
[i, *j] = [1, 2, 3, 4]
assert j == [2, 3, 4]
first|T|(fst: T, *rest: T) = fst
assert first(1, 2, 3) == 1
Tuple pattern
(i, j) = (1, 2)
((k, l), _) = ((1, 2), (3, 4))
# If not nested, () can be omitted (1, 2 are treated as (1, 2))
m, n = 1, 2
Array pattern
length [] = 0
length [_, *rest] = 1 + length rest
Record pattern
{sin; cos; tan} = import "math"
{*} = import "math" # import all
person = {name = "John Smith"; age = 20}
age = match person:
{name = "Alice"; _} -> 7
{_; age} -> age
Data class pattern
Point = Inherit {x = Int; y = Int}
p = Point::{x = 1; y = 2}
Point::{x; y} = p
Comprehensions
odds = [i | i <- 1..100; i % 2 == 0]
Class
Erg does not support multiple inheritance.
Classes are non-inheritable by default, and you must define inheritable classes with the Inheritable
decorator.
@Inheritable
Point2D = Class {x = Int; y = Int}
Point3D = Inherit Point2D, Base := {x = Int; y = Int; z = Int}
Trait
They are similar to Rust traits, but in a more literal sense, allowing composition and decoupling, and treating attributes and methods as equals. Also, it does not involve implementation.
XY = Trait {x = Int; y = Int}
Z = Trait {z = Int}
XYZ = XY and Z
Show = Trait {show: Self.() -> Str}
@Impl XYZ, Show
Point = Class {x = Int; y = Int; z = Int}
Point.
...
Patch
You can retrofit a class or trait with an implementation.
Invert = Patch Bool
Invert.
invert self = not self
assert False.invert()
Refinement types
A predicate expression can be type-restricted.
Nat = {I: Int | I >= 0}
parametric type with value (dependent type)
a: [Int; 3]
b: [Int; 4]
a + b: [Int; 7]