The nodes of GreyCat

GreyCat is an implementation of a graph structure. One of its core concepts is therefore the Node: nodes constitute the elements of the graph.


Node definition

fn main() {
    //Luxembourg object here is in RAM only, not persisted to disk
    var luxembourg = Country {
        name: "Luxembourg",
        phoneCode: 352,
        location: geo::new(49.8153, 6.1296),
    };

    //In order to store it in the graph, we need to wrap it in a node<Country>
    var n_lux = node<Country>::new(luxembourg);

    println(n_lux);  //prints the reference / ID in the graph
}

{"_type":"core.node","ref":"0100000000000000"}

Congratulations! You created your first node in the GreyCat graph.

A Node within GreyCat comprises three primary elements as depicted in the illustration below.

[Illustration: anatomy of a GreyCat Node]

A Node can be likened to a pointer to a memory zone. Represented as a pill on the left of the illustration, it is a very lightweight element, a label, used as a handle for some content. It is a kind of unique address, or reference, of a container in the graph.

The Container is the zone in the graph with which the Node is associated. This association is unique and can never be changed. Containers can host anything, from literal values to complex objects, as presented in the illustration.

From a Node, the content of its associated Container can be retrieved using the star (*) prefix, as illustrated by the text above the dotted line. Here again, the mechanism is close to that of pointers.
At resolution time, if the associated content is not yet available in memory, it will be retrieved from storage. The operation can therefore be costly. Our advice: resolve only when strictly necessary, and when working in a loop, resolve once before the loop rather than on every iteration.

Nodes (or, more exactly, the Containers they are associated with) are typed. The type is specified between the < and > symbols. The illustration specifies that the node is “A node of T”, meaning that, upon resolving its content, you can expect to receive an Object of type T. If you want to store a ‘foo’ String in a node of the graph, you would likely create it with var stringNode = node<String>::new("foo");. Resolving the node later would return the value “foo”.
var nodeContent = *stringNode;
Assert::equals("foo", nodeContent);

Objects in GreyCat can be contained by *one* and only *one* container. If you assign an object that is already contained to another node, it will be removed from its original container.
The node cannot be retrieved from the content of the container. In other words, resolution is not bidirectional: you can get the content from a node, but you cannot get a node from the content.
fn main() {
    var luxembourg = Country {name: "Luxembourg", phoneCode: 352, location: geo::new(49.8153, 6.1296)};
    var n_lux = node<Country>::new(luxembourg);

    println(n_lux);
    println(n_lux.resolve().name);
    println(n_lux->name);
}

The arrow notation node->attribute

The arrow notation can be handy in some places, but it has to be used carefully. It is a shortcut for a resolution (*) followed by an access (.). If you need to access several attributes of the node, it is better to resolve it once and then use the dot notation.
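The cost difference between repeated arrow accesses and a single resolution can be illustrated outside GreyCat. The sketch below is a plain Python model, not GreyCat code: `NodeModel`, `resolve` and `arrow` are hypothetical names, and a simple counter stands in for the potentially costly trip to storage.

```python
from types import SimpleNamespace

class NodeModel:
    """Hypothetical stand-in for a node; counts how often it is resolved."""

    def __init__(self, content):
        self._content = content
        self.resolutions = 0

    def resolve(self):
        # Models `*node`: in GreyCat this may hit storage, so each call is counted
        self.resolutions += 1
        return self._content

    def arrow(self, attribute):
        # Models `node->attribute`: one resolution followed by one access
        return getattr(self.resolve(), attribute)

country = SimpleNamespace(name="Luxembourg", phoneCode=352)

# Arrow notation for each attribute: one resolution per access
via_arrow = NodeModel(country)
arrow_values = (via_arrow.arrow("name"), via_arrow.arrow("phoneCode"))

# Resolve once, then use plain dot access on the resolved object
via_dot = NodeModel(country)
resolved = via_dot.resolve()
dot_values = (resolved.name, resolved.phoneCode)
```

In this model both approaches read the same values, but the arrow variant resolves the node once per attribute while the resolve-then-dot variant resolves it a single time.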

Operators Summary

| Operator | Description |
|----------|-------------|
| . | Dot operator, to access fields, attributes and functions on an instance |
| :: | Static access operator, to access static attributes and static functions |
| -> | Arrow operator, traverses the graph to access an attribute or a function |

Object (heavy) vs Node (light)

What’s the difference between:

type City {
    name: String;
    country: Country;
}

and

type City {
    name: String;
    country: node<Country>;
}

And which modeling is better?

The difference between:

type City {
    name: String;
    // Here, every city instance embeds a full country object.
    // So if the country object is heavy and contains 1000 attributes, the city object will be heavier,
    // since it will contain all 1000 country attributes plus the city name attribute.
    // Two cities can't share the same country object, since every city object embeds its own copy.
    country: Country; //For example Country{ name: String, location: geo::new(6.45, 20.45), ...}
}

and:

type City {
    name: String;
    // Here, the country attribute is an ID, just a 64-bit ID
    // Several cities can point to the same country if they hold the same node ref
    // Even if the Country object contains 1000 attributes, only a 64-bit ID is stored in the city object
    // Using a node makes the object lighter: loading a city does not automatically load the country instance
    country: node<Country>; //For example Ref00000001
}

Most of the time, it is better to split objects into separate nodes when they evolve independently.
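The contrast between the two modelings can be sketched with a plain Python model. This is not GreyCat code: the `store` dictionary is a hypothetical stand-in for the graph, the integer key stands in for a node reference, and the class and field names are chosen for illustration only.

```python
import copy
from dataclasses import dataclass

@dataclass
class Country:
    name: str
    phone_code: int

@dataclass
class CityEmbedded:
    name: str
    country: Country      # full object embedded in the city

@dataclass
class CityByRef:
    name: str
    country_ref: int      # only an ID is stored in the city

luxembourg = Country("Luxembourg", 352)

# Embedded modeling: every city carries its own copy of the country
city_a = CityEmbedded("Luxembourg City", copy.deepcopy(luxembourg))
city_b = CityEmbedded("Esch-sur-Alzette", copy.deepcopy(luxembourg))
city_a.country.phone_code = 0   # arbitrary edit, invisible to city_b

# Reference modeling: a hypothetical store maps an ID to the single country object
store = {1: Country("Luxembourg", 352)}
ref_a = CityByRef("Luxembourg City", 1)
ref_b = CityByRef("Esch-sur-Alzette", 1)
store[ref_a.country_ref].phone_code = 0   # the same edit, seen by both cities
```

With embedded objects, a change made through one city never reaches the other copies. With references, both cities observe the change because they point to the same stored object.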


The indexes

A great part of the usage of a graph is to navigate its elements and find the ones of interest. It is therefore of tremendous importance to have efficient structures to organize the data in the graph. Indexes are meant exactly for that. They are specialized kinds of nodes, optimized for the indexing of content. There exist several kinds of indexes depending on the nature of the key used to index elements.

nodeTime

nodeTime indexes elements in the natural order of time. It is the ideal structure to store and manipulate time series. For instance, a nodeTime can store the evolution of a temperature over time.

[Illustration: nodeTime]

One particularity of nodeTime is that it takes into account the continuous nature of the time scale. It is therefore entirely possible in GreyCat to ask for the resolution of an element in between two records. In other words, nodeTime returns the value indexed at the closest time that is previous or equal to the time you requested. This mechanism makes it very easy to align various time series and create datasets where all times are aligned.
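The "previous or equal" resolution can be modeled outside GreyCat. The sketch below is a plain Python model, not GreyCat code: `TimeSeriesModel`, `set_at` and `resolve_at` are hypothetical names, integers stand in for timestamps, and a sorted list with binary search stands in for whatever structure GreyCat uses internally. Returning `None` before the first record is an assumption of this model.

```python
from bisect import bisect_right, insort

class TimeSeriesModel:
    """Toy model of previous-or-equal lookup (hypothetical, not GreyCat's API)."""

    def __init__(self):
        self._times = []    # sorted timestamps
        self._values = {}   # timestamp -> value

    def set_at(self, t, value):
        if t not in self._values:
            insort(self._times, t)   # keep timestamps sorted
        self._values[t] = value

    def resolve_at(self, t):
        # Position of the first recorded time strictly greater than t
        i = bisect_right(self._times, t)
        if i == 0:
            return None              # assumption: nothing recorded at or before t
        return self._values[self._times[i - 1]]

temperature = TimeSeriesModel()
temperature.set_at(10, 20.5)
temperature.set_at(20, 21.0)

between_records = temperature.resolve_at(15)   # falls back to the record at time 10
exact_record = temperature.resolve_at(20)      # exact match at time 20
```

Aligning two series then amounts to resolving both at the same list of timestamps, since each lookup falls back to the latest known value.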

A nodeTime can be navigated with a simple for loop as follows.

var myTemperature = nodeTime<float>::new();
[...] //insertion of points
for(t: time, temp: float in myTemperature) {
    println("Temperature was ${temp} at time ${t}");
}

The loop above goes through the entire time series, from its first record to its last, in time order. Filters can be applied using brackets to restrict the time period within which records are navigated:
for(t: time, temp: float in myTemperature[fromTime..toTime])

nodeList

nodeList indexes elements in the natural order of integers. In this case, the index key is a 64-bit integer value.

[Illustration: nodeList]

In contrast with nodeTime, the integer scale is discrete, so you must have the exact index to retrieve a specific value. However, nodeList lets you get the first and last elements and their respective indexes. It is also naturally possible to use for loops to navigate the index.

var myStock = nodeList<Palette>::new();
[...] //insertion of points
for(position: int, content: Palette in myStock) {
    println("Stock location ${position} contains palette ${content}");
}

The loop above goes through all the positions present in the index, from its first value to its last, in the natural order of integers. Filters can be applied using brackets to restrict the range of integers to be considered. Note that the for loop only iterates over values that exist in the index; it does not execute the loop body with null for every other integer in the range:
for(position: int, content: Palette in myStock[54..78])
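The behavior of a range filter on a sparse index can be sketched in plain Python. This is a model, not GreyCat code: `iterate_range` is a hypothetical helper, a dictionary stands in for the nodeList, and treating both bounds as inclusive is an assumption of this sketch rather than a statement about GreyCat's range semantics.

```python
def iterate_range(index, start, end):
    """Yield (position, content) for positions that exist within [start, end].

    Assumption: both bounds are inclusive in this model.
    """
    for position in sorted(index):
        if start <= position <= end:
            yield position, index[position]

# Sparse index: only four positions hold a value
stock = {12: "pallet-A", 54: "pallet-B", 60: "pallet-C", 90: "pallet-D"}

# Only positions 54 and 60 exist in the requested range;
# the other integers between 54 and 78 are skipped, not visited with null
visited = list(iterate_range(stock, 54, 78))
```

The loop body runs twice here, once per existing entry, rather than once per integer in the range.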

nodeGeo

nodeGeo indexes elements by geographical position (latitude and longitude). This comes in handy when assets need to be located on a territory.

[Illustration: nodeGeo]

As with the other indexes, the content of a nodeGeo can be explored with a for loop.

var myBuildings = nodeGeo<node<Building>>::new();
[...] //insertion of points
for(position: geo, building: node<Building> in myBuildings) {
    println("My building ${building->name} is located at ${position}");
}

nodeIndex

nodeIndex indexes elements with other kinds of keys, that is, keys that are not time, int, or geo. The key type has to be specified, and the value of the key is hashed to a 64-bit unsigned value. It is mostly used to index elements by a String.

[Illustration: nodeIndex]

Navigation in this index is also achieved with a for loop going through all elements. There being no meaningful natural order, filters are not available.

var collaboratorsByName = nodeIndex<String, node<Person>>::new();
[...] //insertion of points
for(name: String, collabNode: node<Person> in collaboratorsByName) {
    println("${name} started with us on ${collabNode->start_date}");
}

Containment and multi-references

There are circumstances in which an element needs to be indexed or referenced in several places in the graph, to create relationships, sub-groups, handy shortcuts, etc. In that case, as soon as an element needs to be referenced (and/or indexed) at least twice, you should:

  1. Put this element in a dedicated node (container)
  2. Use the node to index/reference the content.

[Illustration: one node referenced from two indexes]

In this illustration, we want to index the element of type T twice: by its name attribute and by its id attribute. We therefore put this element in its own container, through a node we name johnNode, and reference this node in both our indexes.

var t_by_id = nodeList<node<T>>::new();
var t_by_name = nodeIndex<String, node<T>>::new();

var johnNode = node<T>::new(T{id: 25473, name: "John"});
t_by_id.set(johnNode->id, johnNode);
t_by_name.set(johnNode->name, johnNode);
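The effect of referencing one node from two indexes can be modeled in plain Python. This sketch is not GreyCat code: `NodeModel` is a hypothetical stand-in for a node, dictionaries stand in for the two indexes, and the `team` field is an extra attribute invented for the example.

```python
class NodeModel:
    """Hypothetical stand-in for a node: a handle to exactly one container."""

    def __init__(self, content):
        self._content = content

    def resolve(self):
        return self._content

john_node = NodeModel({"id": 25473, "name": "John"})

# Resolve once, then register the same node handle in both indexes
john = john_node.resolve()
t_by_id = {john["id"]: john_node}
t_by_name = {john["name"]: john_node}

# An update made through one index...
t_by_id[25473].resolve()["team"] = "research"

# ...is visible through the other, because both hold the same node
team_seen_by_name = t_by_name["John"].resolve()["team"]
```

Because both indexes store the handle rather than the object itself, the content lives in a single container and never needs to be duplicated or kept in sync.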