Concurrency
GreyCat tasks are sequential, but they can spawn and await jobs to create concurrent computation units.
A Job is a handle to a function and its arguments. Creating a Job does nothing on its own; it is just a regular GreyCat object.
Calling the await function pauses the current task until the jobs passed as arguments have completed.
fn long_computation(max: int): int {
  var count = 0;
  for (var i = 0; i < max; i++) {
    count++;
  }
  return count;
}

fn main() {
  // Define the Job objects by providing the function pointer and its arguments
  var jobs = Array<Job> {
    Job { function: project::long_computation, arguments: [100_000] },
    Job { function: project::long_computation, arguments: [100_000] }
  };
  // Blocks execution until all jobs complete, or fails early if an error occurs
  await(jobs);
  for (_, job in jobs) {
    // Access the result of each job
    var result = job.result();
  }
}
Only jobs spawned from a function executed as a task will run in parallel. Jobs created outside of a task context run sequentially.
Jobs inherit any modifications the parent task made to nodes before they were spawned. Conversely, the parent task aggregates all modifications made by the jobs right after the await call. The GreyCat Task and Job model can thus be compared to a sequence of fork-and-join steps.
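To make the fork-and-join flow concrete, here is a minimal sketch; the Config and Result types and the compute function are hypothetical, and reading through the arrow operator is assumed to work symmetrically to writing.
type Config {
  factor: int;
}

type Result {
  value: int;
}

var config: node<Config>;

fn compute(input: int): node<Result> {
  // Fork: the job inherits the parent's write (factor = 2) made before spawning
  var value = input * config->factor;
  // Each job writes only to its own freshly created node
  return node<Result> { Result { value: value } };
}

fn main() {
  config = node<Config> { Config { factor: 1 } };
  // Modification made by the parent task before the fork: jobs will see it
  config->factor = 2;
  var jobs = Array<Job> {
    Job { function: project::compute, arguments: [10] },
    Job { function: project::compute, arguments: [20] }
  };
  // Join: right after await, the parent aggregates the jobs' modifications
  await(jobs);
}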
Handling Errors in Jobs
await will throw if at least one job raised an exception; you can try/catch it and iterate over the jobs' results to find the error.
The following snippet demonstrates how to handle a scenario where you don't want the whole execution to fail if one Job fails.
fn main() {
  var jobs = Array<Job> {
    Job { function: foo, arguments: [10_s, false] },
    Job { function: foo, arguments: [1_s, true] }
  };
  try {
    await(jobs);
  } catch (err) {
    for (i, job in jobs) {
      var res = job.result();
      if (res is Error) {
        println("Job ${i} failed");
      } else {
        println("Job ${i} finished");
      }
    }
  }
}

fn foo(a: duration, fail: bool) {
  Runtime::sleep(a);
  if (fail) {
    throw "Failed";
  }
}
Parallel Writes (Concurrent Writing)
You can write to the graph in parallel; the only caveat is that you may not write to the same node from several jobs at once, since a built-in protection prevents it. In a future update this limitation may be relaxed with an automatic best-effort merge.
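To illustrate the caveat, the sketch below has two jobs writing to one shared node, which is exactly the pattern the built-in protection forbids; the Counter type and increment function are hypothetical.
type Counter {
  value: int;
}

var shared: node<Counter>;

fn increment() {
  // Both jobs target the same node: the built-in protection forbids this
  shared->value = shared->value + 1;
}

fn main() {
  shared = node<Counter> { Counter { value: 0 } };
  var jobs = Array<Job> {
    Job { function: project::increment },
    Job { function: project::increment }
  };
  await(jobs); // expected to fail: two jobs wrote to the same node in parallel
}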
The following code snippet shows an example of creating independent nodes in parallel and inserting them into a global index at the end. This model becomes very powerful when used to import large amounts of data from separate files, or to execute heavy computations.
var sensor_list: nodeList<node<Sensor>>;

fn main() {
  var jobs = Array<Job> {};
  jobs.add(Job { function: project::import });
  jobs.add(Job { function: project::import });
  await(jobs);
  // After the join, aggregate the jobs' results into the global index
  for (_, job in jobs) {
    var sensors = job.result();
    for (_, sensor: node<Sensor> in sensors) {
      sensor_list.add(sensor);
    }
  }
}

fn import(): Array<node<Sensor>> {
  // Each job creates its own independent nodes, so no write conflict occurs
  var sensors = Array<node<Sensor>> {};
  for (var i = 0; i < 10; i++) {
    var sensor = node<Sensor> { Sensor { history: nodeTime<int> {} } };
    sensors.add(sensor);
  }
  return sensors;
}
Limitations
Take the following snippet as an example: it will raise the error message "wrong state before await, variable contains an object stored in a node".
type Foo {
  status: String;
}

fn task(foo: node<Foo>) {
  var resolved_foo = foo.resolve();
  // ... awaiting jobs here ...
  resolved_foo.status = "Done"; // error: resolved_foo may be outdated after the await
}
When execution reaches an await point, the current function scope is serialized.
This means that any object resolved from a node before an await may become outdated after resumption, as the node’s content might have changed in the meantime.
Accessing such outdated values is unsafe and will result in an error.
Scope serialization occurs because the current execution context is discarded and later reconstructed from the serialized data upon resuming execution after the await.
Leverage the arrow operator if you need to modify an object inside a node, or set the variable to null before the await (resolved_foo = null); both methods work.
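As a sketch, here are both workarounds applied to the snippet above; the function names are illustrative and the awaited jobs are still elided.
fn task_with_arrow(foo: node<Foo>) {
  // ... awaiting jobs here ...
  // The arrow operator writes through the node directly, so no resolved
  // object is kept alive across the await point
  foo->status = "Done";
}

fn task_with_null(foo: node<Foo>) {
  var resolved_foo = foo.resolve();
  // ... use resolved_foo before the await ...
  resolved_foo = null; // drop the resolved object before awaiting
  // ... awaiting jobs here ...
  foo->status = "Done";
}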