Kawa: Multi-dimensional Arrays

Multi-dimensional Arrays

Arrays are heterogeneous data structures that generaize vectors to multiple indexes or dimensions. Instead of a single integer index, there are multiple indexes: An index is a vector of integers; the length of a valid index sequence is the rank or the number of dimensions of an array.

Kawa multi-dimensional arrays follows the by SRFI-25 specification, with additions from Racket’s math.array package and other sources.

An array whose rank is 1, and where the (single) lower bound is 0 is a sequence. Furthermore, if such an array is simple (not created by share-array) it will be implemented using a <vector>. Uniform vectors and strings are also arrays in Kawa.

A rank-0 array has a single value. It is essentially a box for that value. Functions that require arrays may treat non-arrays as a rank-0 array containing that value.

An array of rank 2 is frequently called a matrix.

Note that Kawa arrays are distinct from Java (native) arrays. The latter is a simpler one-dimensional vector-like data structure, which is used to implement Kawa arrays and vectors.

Procedure: array? obj

Returns #t if obj is an array, otherwise returns #f.

Array shape

The shape of an array consists of bounds for each index.

The lower bound b and the upper bound e of a dimension are exact integers with (<= b e). A valid index along the dimension is an exact integer i that satisfies both (<= b i) and (< i e). The length of the array along the dimension is the difference (- e b). The size of an array is the product of the lengths of its dimensions.

There is no separate data type for a shape. The canonical representation for a shape (a canonical shape) is a rank-2 array where the first index is the dimension (zero-based), and the second index is 0 or 1: Elements (i 0) and (i 1) are respectively the lower bound and upper bound of dimension i.

For convenience, the procedures that require a shape can accept a shape-specifier, as if converted by the procedure ->shape. For example (array-reshape array shape) is equivalent to (array-reshape array (->shape shape)).

Procedure: ->shape specifier

Convert the shape specifier specifier to a canonical shape. The specifier must be either a canonical shape, or vector with one element for each dimension, as described below. We use as examples a 2*3 array with lower bounds 0 and a 3*4 array with lower bounds 1.

A vector of simple integers. Each integer e is an upper bound, with a zero lower bound. Equivalent to the range [0 <: e].

A specifier for the first examples is #(2 3), and the second is not expressible.

A vector of lists of length 2. The first element of each list is the lower bound, and the second is the upper bound.

Examples: #((0 2) (0 3)) and #((1 3) (1 4)).

A vector of simple ranges, one for each dimension, all of who are bounded (finite), consist of integer values, and have a step of 1. Each range, which is usually written as [b <: e], expresses the bounds of the corresponding dimension For the first example you can use [[0 <: 2] [0 <=: 2]]; for the second you can use [[1 <: 3] [1 size: 4]].

A vector consisting of a mix of integers, length-2 lists, and ranges.

Examples: #(2 (0 3)) and ['(1 3) [1 size: 4]].

A canonical shape: A rank-2 array S whose own shape is [r 2]. For each dimension k (where (<= k 0) and (< k r)), the lower bound b_k is (S k 0), and the upper bound e_k is (S k 1).

Examples: #2a((0 2) (0 3)) and #2a((1 3) (1 4)).

Procedure: shape bound ...

Returns a shape. The sequence bound ... must consist of an even number of exact integers that are pairwise not decreasing. Each pair gives the lower and upper bound of a dimension. If the shape is used to specify the dimensions of an array and bound ... is the sequence b0 e0 ... bk ek ... of n pairs of bounds, then a valid index to the array is any sequence j0 ... jk ... of n exact integers where each jk satisfies (<= bk jk) and (< jk ek).

The shape of a d-dimensional array is a d * 2 array where the element at k 0 contains the lower bound for an index along dimension k and the element at k 1 contains the corresponding upper bound, where k satisfies (<= 0 k) and (< k d).

(shape @bounds) is equivalent to: (array [2 (/ (length bounds) 2)] @bounds)

Procedure: array-shape array

Return the shape of array in the canonical (r 2) form. It is an error to attempt to modify the shape array.

Procedure: array-rank array

Returns the number of dimensions of array.
(array-rank
  (make-array (shape 1 2 3 4)))
Returns 2.

Procedure: array-start array k

Returns the lower bound (inclusive) for the index along dimension k. This is most commonly 0.

Procedure: array-end array k

Returns the upper bound for the index along dimension k. The bound is exclusive - i.e. the first integer higher than the last legal index.

Procedure: array-size array

Return the total number of elements of array. This is the product of (- (array-end array k) (array-start array k)) for every valid k.

Array types

Type: array

Type: arrayN

Type: array[etype]

Type: arrayN[etype]

The type array matches all array values. The type arrayN, where N is an integer matches array of rank N. For example array2 matches rank-2 array - i.e. matrixes.

You can optionally specify the element type etype. This can be a primitive type. For example a array2[double] is a rank-2 array whose elements are double values.

Array literals and printing

An array literal starts with # followed by its rank, followed by a tag that describes the underlying vector (by default a), optionally followed by information about its shape, and finally followed by the cells, organized into dimensions using parentheses.

For example, #2a((11 12 13) (21 22 23)) is a rank-2 array (a matrix) whose shape is [2 3] or equivalently [[0 <: 2] [0 <: 3]]. It is pretty-printed as:

╔#2a:2:3═╗
║11│12│13║
╟──┼──┼──╢
║21│22│23║
╚══╧══╧══╝

array-literal ::= array-literal-header datum
array-literal-header ::= # rank vectag array-bound^*
array-bound ::= [@lower]:length | @lower
vectag ::= a | uniform-tag

The vectag specifies the type of the elements of the array.

Following the vectag you can optionally include information about the shape: For each dimension you can optionally specify the lower bounds (after the character "@"), followed by the length of the dimension (after the character ":"). The shape information is required if a lower bound is non-zero, or any length is zero.

The datum contains the elements in a nested-list format: a rank-1 array (i.e. vector) uses a single list, a rank-2 array uses a list-of-lists, and so on. The elements are in lexicographic order.

A uniform u32 array of rank 2 with index ranges 2..3 and 3..4:

#2u32@2@3((1 2) (2 3))

This syntax follows Common Lisp with Guile extensions. (Note that Guile prints rank-0 arrays with an extra set of parentheses. Kawa follows Common Lisp in not doing so.)

When an array is printed with the write function, the result is an array-literal. Printing with display formats the array in a rectangular grid using the format-array procedure. (Note that format-array is only used when the output is in column 0, because Kawa has very limited support for printing rectangles.)

Procedure: format-array value [port] [element-format]

Produce a nice “pretty” display for value, which is usually an array.

If port is an output port, the formatted output is written into that port. Otherwise, port must be a boolean (one of #t or #f). If the port is #t, output is to the (current-output-port). If the port is #f or no port is specified, the output is returned as a string. If the port is specified and is #t or an output-port, the result of the format-array procedure is unspecified. (This convention matches that of the format procedure.)

The top line includes an array-literal-header. The lower bound are only printed if non-zero. The dimension lengths are printed if there is room, or if one of them is zero.

#|kawa:34|# (! arr (array [[1 <=: 2] [1 <=: 3]]
#|.....35|#   #2a((1 2) (3 4)) 9 #2a((3 4) (5 6))
#|.....36|#   [42 43] #2a:1:3((8 7 6)) #2a((90 91) (100 101))))
#|kawa:37|# arr
╔#2a@1:2@1:3════╤═════════╗
║#2a═╗  │      9│#2a═╗    ║
║║1│2║  │       │║3│4║    ║
║╟─┼─╢  │       │╟─┼─╢    ║
║║3│4║  │       │║5│6║    ║
║╚═╧═╝  │       │╚═╧═╝    ║
╟───────┼───────┼─────────╢
║╔#1a:2╗│#2a:1:3│╔#2a:2:2╗║
║║42│43║│║8│7│6║│║ 90│ 91║║
║╚══╧══╝│╚═╧═╧═╝│╟───┼───╢║
║       │       │║100│101║║
║       │       │╚═══╧═══╝║
╚═══════╧═══════╧═════════╝

If element-format is specified, it is a format string used for format each non-array:

#|kawa:38|# (format-array arr "~4,2f")
╔#2a@1:2@1:3══╤════════════════╤═══════════════╗
║╔#2a:2:2══╗  │            9.00│╔#2a:2:2══╗    ║
║║1.00│2.00║  │                │║3.00│4.00║    ║
║╟────┼────╢  │                │╟────┼────╢    ║
║║3.00│4.00║  │                │║5.00│6.00║    ║
║╚════╧════╝  │                │╚════╧════╝    ║
╟─────────────┼────────────────┼───────────────╢
║╔#1a:2╤═════╗│╔#2a:1:3══╤════╗│╔#2a:2:2══════╗║
║║42.00│43.00║│║8.00│7.00│6.00║│║ 90.00│ 91.00║║
║╚═════╧═════╝│╚════╧════╧════╝│╟──────┼──────╢║
║             │                │║100.00│101.00║║
║             │                │╚══════╧══════╝║
╚═════════════╧════════════════╧═══════════════╝

If the rank is more than 2, then each “layer” is printed separated by double lines.

#|kawa:42|# (array-reshape [1 <=: 24] [3 2 4])
╔#3a:3:2:4══╗
║ 1│ 2│ 3│ 4║
╟──┼──┼──┼──╢
║ 5│ 6│ 7│ 8║
╠══╪══╪══╪══╣
║ 9│10│11│12║
╟──┼──┼──┼──╢
║13│14│15│16║
╠══╪══╪══╪══╣
║17│18│19│20║
╟──┼──┼──┼──╢
║21│22│23│24║
╚══╧══╧══╧══╝

Array construction

Array indexing

If you “call” an array as it it were a function, it is equivalent to using array-index-ref, which is generalization of array-ref. For example, given a rank-2 array arr with integer indexes i and j, the following all get the element of arr at index [i j].

(arr i j)
(array-index-ref arr i j)
(array-ref arr i j)
(array-ref arr [i j])

Using function-call notation or array-index-ref (but not plain array-ref) you can do generalized APL-style slicing and indirect indexing. See under array-index-ref for examples.

Procedure: array-ref array k ...

Procedure: array-ref array index

Returns the contents of the element of array at index k .... The sequence k ... must be a valid index to array. In the second form, index must be either a vector (a 0-based 1-dimensional array) containing k ....
(array-ref (array [2 3]
              'uno 'dos 'tres
              'cuatro 'cinco 'seis)
   1 0)
Returns cuatro.
(let ((a (array (shape 4 7 1 2) 3 1 4)))
   (list (array-ref a 4 1)
         (array-ref a (vector 5 1))
         (array-ref a (array (shape 0 2)
                         6 1))))
Returns (3 1 4).

Procedure: array-index-ref array index ...

Generalized APL-style array indexing, where each index can be either an array or an integer.

If each index is an integer, then the result is the same as array-ref.

Otherwise, the result is an immutable array whose rank is the sum of the ranks of each index. An integer is treated as rank-0 array.

If marr is the result of (array-index-ref arr M₁ M₂ ...) then:
(marr i₁₁ i₁₂ ... i₂₁ i₂₂ ...)
is defined as:
(arr (M₁ i₁₁ i₁₂ ...) (M₂ i₂₁ i₂₂ ...) ...)
Each M_k gets as many indexes as its rank. If M_k is an integer, then it we use it directly without any indexing.

Here are some examples, starting with simple indexing.
#|kawa:1|# (define arr (array #2a((1 4) (0 4))
#|.....2|#                    10 11 12 13 20 21 22 23 30 31 32 33))
#|kawa:3|# arr
╔#2a@1:3:4══╗
║10│11│12│13║
╟──┼──┼──┼──╢
║20│21│22│23║
╟──┼──┼──┼──╢
║30│31│32│33║
╚══╧══╧══╧══╝
#|kawa:4|# (arr 2 3)
23
If one index is a vector and the rest are scalar integers, then the result is a vector:
#|kawa:5|# (arr 2 [3 1])
#(23 21)
You can select a “sub-matrix” when all indexes are vectors:
#|kawa:6|# (arr [2 1] [3 1 3])
╔#2a:2:3═╗
║23│21│23║
╟──┼──┼──╢
║13│11│13║
╚══╧══╧══╝
Using ranges for index vectors selects a rectangular sub-matrix.
#|kawa:7|# (arr [1 <: 3] [1 <: 4])
╔#2a:2:3═╗
║11│12│13║
╟──┼──┼──╢
║21│22│23║
╚══╧══╧══╝
You can add new dimensions:
#|kawa:8|# (arr [2 1] #2a((3 1) (3 2)))
#3a╤══╗
║23│21║
╟──┼──╢
║23│22║
╠══╪══╣
║13│11║
╟──┼──╢
║13│12║
╚══╧══╝
The pseudo-range [<:] can be used to select all the indexes along a dimension. To select row 2 (1-origin):
#|kawa:9|# (arr 2 [<:])
#(20 21 22 23)
To reverse the order use [>:]:
#|kawa:10|# (arr 2 [>:])
#(23 22 21 20)
To select column 3:
#|kawa:11|# (arr [<:] 3)
#(13 23 33)
If you actually want a column matrix (i.e. with shape [3 1] you can place the index in a single-element vector:
#|kawa:12|# (arr [<:] [3])
#2a╗
║13║
╟──╢
║23║
╟──╢
║33║
╚══╝
To expand that column to 5 colums you can repeat the column index:
#|kawa:13|# [3 by: 0 size: 5]
#(3 3 3 3 3)
#|kawa:14|# (arr [<:] [3 by: 0 size: 5])
╔#2a:3:5═╤══╤══╗
║13│13│13│13│13║
╟──┼──┼──┼──┼──╢
║23│23│23│23│23║
╟──┼──┼──┼──┼──╢
║33│33│33│33│33║
╚══╧══╧══╧══╧══╝

Modifying arrays

You can use set! to modify one or multiple elements. To modify a single element:

(set! (arr index ...) new-value)

is equivalent to

(array-set! arr index ... new-value)

You can set a slice (or all of the elements). In that case:

(set! (arr index ...) new-array)

is equivalent to:

(array-copy! (array-index-share arr index ...) new-array)

Procedure: array-set! array k ... obj

Procedure: array-set! array index obj

Stores obj in the element of array at index k .... Returns the void value. The sequence k ... must be a valid index to array. In the second form, index must be either a vector or a 0-based 1-dimensional array containing k ....
(let ((a (make-array
            (shape 4 5 4 5 4 5))))
   (array-set! a 4 4 4 "huuhkaja")
   (array-ref a 4 4 4))
Returns "huuhkaja".

Compatibility: SRFI-47, Guile and Scheme-48 have array-set! with a different argument order.

Procedure: array-copy! dst src

Compatibility: Guile has a array-copy! with the reversed argument order.

Procedure: array-fill! array value

Set all the values array to value.

Transformations and views

A view or transform of an array is an array a₂ whose elements come from some other array a₁, given some transform function T that maps a₂ indexes to a₁ indexes. Specifically (array-ref a₂ indexes) is (array-ref a₁ (T indexes)). Modifying a₂ causes a₁ to be modified; modifying a₁ may modify a₂ (depending on the transform function). The shape of a₂ is in generally different than that of a₁.

Procedure: array-transform array shape transform

This is a general mechanism for creating a view. The result is a new array with the given shape. Accessing this new array is implemented by calling the transform function on the index vector, which must return a new index vector valid for indexing the original array. Here is an example (using the same arr as in the array-index-ref example):
#|kawa:1|# (define arr (array #2a((1 4) (0 4))
#|.....2|#                    10 11 12 13 20 21 22 23 30 31 32 33))
#|kawa:14|# (array-transform arr #2a((0 3) (1 3) (0 2))
#|.....15|#   (lambda (ix) (let ((i (ix 0)) (j (ix 1)) (k (ix 2)))
#|.....16|#                  [(+ i 1)
#|.....17|#                   (+ (* 2 (- j 1)) k)])))
#3a:3@1:2:2
║10│11║
╟──┼──╢
║12│13║
╠══╪══╣
║20│21║
╟──┼──╢
║22│23║
╠══╪══╣
║30│31║
╟──┼──╢
║32│33║
╚══╧══╝
The array-transform is generalization of share-array, in that it does not require the transform to be affine. Also note the different calling conversions for the tranform: array-transform takes a single argument (a vector of indexes), and returns a single result (a vector of indexes); share-array takes one argument for each index, and returns one value for each index. The difference is historical.

Procedure: array-index-share array index ...

This does the same generalized APL-style indexing as array-index-ref. However, the resulting array is a modifiable view into the argument array.

Procedure: array-reshape array shape

Creates a new array narray of the given shape, such that (array->vector array) and (array->vector narray) are equivalent. I.e. the i’th element in row-major-order of narray is the i’th element in row-major-order of array. Hence (array-size narray) (as specied from the shape) must be equal to (array-size array). The resulting narray is a view such that modifying array also modifies narray and vice versa.

Procedure: share-array array shape proc

Returns a new array of shape shape that shares elements of array through proc. The procedure proc must implement an affine function that returns indices of array when given indices of the array returned by share-array. The array does not retain a reference to shape.
(define i_4
   (let* ((i (make-array
                (shape 0 4 0 4)
                0))
          (d (share-array i
                (shape 0 4)
                (lambda (k)
                   (values k k)))))
      (do ((k 0 (+ k 1)))
          ((= k 4))
         (array-set! d k 1))
      i))
Note: the affinity requirement for proc means that each value must be a sum of multiples of the arguments passed to proc, plus a constant.

Implementation note: arrays have to maintain an internal index mapping from indices k1 ... kd to a single index into a backing vector; the composition of this mapping and proc can be recognised as (+ n0 (* n1 k1) ... (* nd kd)) by setting each index in turn to 1 and others to 0, and all to 0 for the constant term; the composition can then be compiled away, together with any complexity that the user introduced in their procedure.

Here is an example where the array is a uniform vector:
(share-array
  (f64vector 1.0 2.0 3.0 4.0 5.0 6.0)
  (shape 0 2 0 3)
  (lambda (i j) (+ (* 2 i) j)))
  ⇒  #2f64((1.0 2.0 3.0) (4.0 5.0 6.0))

Procedure: array-flatten array

Procedure: array->vector array

Return a vector consisting of the elements of the array in row-major-order.

The result of array-flatten is fresh (mutable) copy, not a view. The result of array->vector is a view: If array is mutable, then modifying array changes the flattened result and vice versa.

If array is “simple”, array->vector returns the original vector. Specifically, if vec is a vector then:
(eq? vec (array->vector (array-reshape vec shape)))

Miscellaneous

Procedure: format-array value [element-format]