Organisation: | Copyright (C) 2016-2024 EDF R&D |
---|---|
Contact: | olivier (dot) boudeville (at) edf (dot) fr |
Author: | Olivier Boudeville |
Creation Date: | Tuesday, January 3, 2017 |
Lastly updated: | Tuesday, February 13, 2024 |
Version: | 2.4.7 |
Status: | Stable |
Website: | |
Dedication: | For the Sim-Diasca model integrators. |
Abstract: | This document describes how the coupling of models and simulations can be done with Sim-Diasca. Four coupling schemes are more specifically discussed.
These coupling schemes account for the most common options, which range from co-simulation to pure simulation. |
Table of Contents
When having to perform simulations, the usual design process involves:
However in some cases the last two steps cannot be performed in that order, notably because some models may predate the simulation effort at hand.
In this case we have to deal with pre-existing models, and the challenge shifts from writing them to coupling them, so that, even though these models are already written, they can nevertheless be integrated - at least to some extent - into the new simulation of interest.
In other words, the models involved in the vast majority of Sim-Diasca simulations shall be purposely Erlang-based and shall derive, directly or not, from the class_Actor mother class provided by the engine.
We suppose, in the context of this document, that this most direct "native" approach cannot be taken, inducing the need to perform some kind of coupling instead.
Sim-Diasca primarily focuses on the usual simulation design process described above, where the engine is first selected, and then the models are written for that engine.
This typically corresponds to the City-Example simulation case [1], which focuses on urban simulation by itself (i.e. on the models themselves).
[1] | This case is distributed with the free software version of Sim-Diasca, and can be found in the mock-simulators/city-example directory. |
Another example case, the Sustainable-Cities [2] one, is also a fairly classical simulation, yet it focuses on its boundaries (i.e. its input and output) in the context of its integration with a third-party platform; no actual coupling is involved here either.
[2] | This case does not belong to the free software version of Sim-Diasca; it is located in the sustainable-cities/sustainable-cities-case directory. |
A third simulation case, nicknamed MUG [3], focuses on the subject of interest here, i.e. the coupling of a set of models that were not all written for Sim-Diasca (they were written independently, and their coupling is thus an afterthought).
[3] | This case does not belong to the free software version of Sim-Diasca (it is located in the sustainable-cities/mug-case directory of the internal version). |
As a result these models were not specifically in line with the engine's expectations at various levels; from the most problematic to the least:
[4] | Please refer to the Sim-Diasca Dataflow HOWTO for more information. |
Additionally some models were delivered "as they were", as black boxes (hence with no information at all about their inner workings and no possibility of being modified either).
The challenge was therefore to devise a sufficiently generic interoperability scheme enabling the coupling of models from any origin, ironing out their multi-level differences while minimizing the dependency of models on one another, or on the engine itself; the coupling architecture discussed here is an answer to these needs.
Whether or not a case relies on coupling, the overall simulation is to be driven by the engine, notably by its time management service. As a consequence, from the engine's point of view, all the scheduled elements are expected to be ultimately (Sim-Diasca) actors.
For some models, either not developed yet or for which having an alternate implementation is relevant, the best option for the project may be to have them implemented directly in the target simulation environment. Then by design they will be well integrated with the engine.
However, probably more often than not, such a complete integration will not be performed, for various reasons: a model may have been delivered as a black box only, or another form of implementation is preferred, or any kind of port for this model would require too much rewriting work.
So, in these cases, we also need to include models that have not been designed according to any prior federating scheme, implying that no specific coupling consideration or engine integration was taken into account when they were first designed. We therefore have to provide technical measures ensuring that these models will nevertheless fit, a posteriori, in the overall Sim-Diasca scheme.
To perform these adaptations, we will resort to a range of coupling schemes that will make these pre-existing models behave, from the point of view of the engine, as if they were legit Sim-Diasca models.
Technically, this means that engine-specific adapters, implemented in Erlang, will wrap each exogenous model and, far beyond the question of the programming languages, will ensure that the resulting overall model complies with the appropriate synchronization and exchange contracts, so that that model can be seamlessly integrated in the simulation among the other models, regardless of their nature.
Such a simulation, where the models accommodate the engine, can alternatively be seen as a co-simulation, as discussed in the next section.
Let's start with an excerpt of the Wikipedia article about co-simulation:
In co-simulation the different subsystems which form a coupled problem are modeled and simulated in a distributed manner. Hence, the modeling is done on the subsystem level without having the coupled problem in mind. Furthermore, the coupled simulation is carried out by running the subsystems in a black-box manner. During the simulation the subsystems will exchange data.
In this context, the architecture of the coupling case discussed here can also be seen as Sim-Diasca being the master [5] engine, with each of the third-party slave models corresponding to a coupled simulation (not to an individual model anymore), a simulation evaluated by any engine that this slave embeds.
[5] | A term used also in FMI parlance, for Co-Simulation, as opposed to a Model-Exchange integration. |
Mixing and matching models that are either natively developed for the engine or that are merely adapted for it somewhat blurs the frontier between a simulation engine and a coupling engine (i.e. a master).
Should there be only coupled models (as opposed to native ones), we could rely on a pure co-simulation master, an example of which is MECSYCO; if, additionally, all coupled models were FMUs, DACCOSIM could also be used.
We will see in the next sections that, for the coupling cases considered here, more diverse schemes than mere black boxes are to be managed.
We have thus seen that there is a kind of continuum between "pure co-simulations" (i.e. with only coupled simulations involved) and "pure simulations" (i.e. with only engine-native models involved).
More precisely, for any given model, four main coupling schemes (noted CS) can be seen as punctuating this continuum; from the least integrated with the engine to the most, one may rely on:
Each of these coupling schemes will be detailed below.
Of course none of these four modes of operation strictly prevails over the others; they all have pros and cons, so any choice in this matter is the consequence of an architectural trade-off depending heavily on the model, the engine and the objectives of the project.
Besides the coupling potential (i.e. the possibility of including that model in numerous interactions, so that the added value expected from the coupling can be fully obtained), other metrics may be of interest in order to select a scheme:
Note
A key point is that this choice of coupling scheme is model-level, not simulation-level: a given simulation may involve various models, each relying on a coupling scheme that best suits its nature and needs.
In the next section, the various coupling schemes are detailed, so that one can select the best approach when having to integrate a model in one's simulation.
These descriptions focus on the coupled model itself, knowing that its environment ("the rest of the simulation") is mostly made of the engine and of the other model instances.
The red arrows denote the control flow, i.e. which component drives which one, in what order.
The green arrows denote the information stream, i.e. the paths taken by the data conveying that information, from a component to another.
As in most cases a pre-existing model cannot be integrated "as is" in the simulation, an adapter (shown as a blue, filled rectangle) is generally required. From the engine's point of view, such an adapter is an actor like the others, and is treated as such; the engine is then unaware that there is a third-party model underneath.
This adapter applies to an instance of the model considered here (depicted in these diagrams as the plot of a curve), which is usually meant to interact with other model instances (be they of the same model or not), represented by white rectangles enclosed in blue.
Finally, the overall simulation is driven by the simulation engine (namely Sim-Diasca, shown here as a light-blue component with arrows suggesting multiple paths), in charge of the coordination of all model instances (actors) and of their exchanges.
Let's then discuss each coupling scheme in turn.
In this setting, we have a pre-existing model, possibly a raw binary standalone executable, on the mode of operation of which we have little to no information, except:
So this model is originally expected to be run (in an appropriate environment) with a command-line akin to:
$ ./my-simulator.exe --option_1 --option_2=42 --first-input-file=foo.dat --first-output-file=bar.dat
In this scheme we then create a corresponding black-box adapter, in the form of a Sim-Diasca actor, comprising mostly:
Both translators are typically direct parts of the adapter, and may use helper libraries to perform their conversion of information format, especially if a common data model has been defined underneath.
Variations of this scheme exist; notably:
In all cases we consider that it is the responsibility of the adapter to collect and propagate on behalf of the black box the relevant information from and to the simulation; for that, the adapter has to behave like a standard (Sim-Diasca) actor (using actor messages for that, respecting the scheduling rules, being stateful, etc.).
In any case, as shown in the diagram below, an evaluation of the model results then in the following sequence of actions:
Of course multiple of these sequences of actions (one per model instance) may run concurrently during the evaluation of a time-step.
Usually this implies that one operating-system process (typically, a UNIX process) has to be spawned at each time-step (for the black box executable), and that at least two files are to be written and then immediately read, parsed and discarded (to account for the communication between the black box and its two translators).
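As an illustration, such a per-timestep invocation could be sketched as follows, in Python; the executable name, its command-line options and the key=value file format are all hypothetical, merely echoing the example above:

```python
import subprocess
import tempfile
import os

def run_black_box(sim_inputs, command=("./my-simulator.exe",)):
    """Drive one black-box evaluation: translate the simulation inputs
    into the model's (hypothetical) file format, spawn one OS process,
    then translate its output file back; both files are then discarded."""
    in_path = tempfile.mktemp(suffix=".dat")
    out_path = tempfile.mktemp(suffix=".dat")
    try:
        # Input translator: simulation state -> model-specific input file.
        with open(in_path, "w") as f:
            for key, value in sim_inputs.items():
                f.write(f"{key}={value}\n")
        # One operating-system process spawned per time-step:
        subprocess.run([*command,
                        f"--first-input-file={in_path}",
                        f"--first-output-file={out_path}"],
                       check=True)
        # Output translator: model-specific output file -> simulation state.
        outputs = {}
        with open(out_path) as f:
            for line in f:
                key, _, value = line.strip().partition("=")
                outputs[key] = value
        return outputs
    finally:
        # Both exchange files are immediately discarded after use.
        for path in (in_path, out_path):
            if os.path.exists(path):
                os.remove(path)
```

The adapter actor would call such a helper once per scheduled time-step, then propagate the parsed outputs to the rest of the simulation through actor messages.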
Relying on a black box is probably the most basic form of coupling; it remains often quite loose, yet many projects may find this approach sufficient.
Pros:
Cons:
Some co-simulation approaches make heavy use of this black-box coupling scheme; among the lessons learned, the following points were identified:
In this coupling scheme, one starts with a pre-existing model taken as it is, yet its inner logic used to fetch its inputs and transmit its outputs is replaced by calls to a coupling API, in order to perform the same operations, yet this time through the engine.
This scheme, while looking interesting on paper, conceals pitfalls of its own, as it reverses the intended control flow between the engine (meant to drive) and the models (meant to be driven).
Indeed, once triggered, each model instance may, thanks to the coupling APIs at its disposal, act quite freely upon the simulation, i.e. mostly by itself and with little to no control from the engine - whereas it is up to the latter to organise, synchronise and control the accesses of the numerous actors involved.
Typically, such a scheme would naturally lead to the adapted models needing to freely perform indiscriminate, blocking read/write operations onto the rest of the simulation (notably onto other model instances), in spite of a technical context that relies on asynchronous, synchronised message passing to provide parallelism.
Granting such a freedom to models would be done at the expense of the coupling APIs, which would be either very limiting or, more probably, extremely tricky to develop: these APIs would have to hide the fact that underneath each of their call from the model, the engine would have to make the overall logical time seamlessly progress and perform simulation-level synchronisation.
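To illustrate the difficulty, the following Python sketch (all names invented) uses generators to make the required inversion of control visible: each seemingly blocking read or write performed by the model actually suspends it, so that the engine can advance logical time and synchronise before serving the request:

```python
def model_logic(api):
    """Hypothetical adapted model: it believes it performs plain blocking
    calls, whereas each `yield` actually suspends it for the engine."""
    demand = yield ("read", "power_demand")   # "blocking" read
    yield ("write", "heat_output", demand * 0.8)  # "blocking" write

class ToyEngine:
    """Hypothetical scheduler: it resumes the model step by step, serving
    each read/write request only after logical time has progressed."""
    def __init__(self):
        self.tick = 0
        self.values = {"power_demand": 100.0}

    def run(self, model):
        gen = model(self)
        request = next(gen)  # run the model up to its first request
        try:
            while True:
                self.tick += 1  # logical time progresses between requests
                if request[0] == "read":
                    request = gen.send(self.values[request[1]])
                else:  # "write"
                    _, key, value = request
                    self.values[key] = value
                    request = gen.send(None)
        except StopIteration:
            return self.values
```

What is trivial here becomes intricate with many concurrent actors: the coupling API must hide this whole suspension/resumption machinery behind each of its calls, which is precisely the difficulty described above.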
Another option would be to reimplement in the target language the whole applicative protocol that rules a Sim-Diasca actor, as it is. This would result in, semantically, offering the same API, yet in a translated form (e.g. in Python or Java).
Doing so would involve significant effort on the side of the coupling architecture in order to support each new language, and writing the corresponding models would not be easier than using the classical, native route.
In a parallel context, with an arbitrary number of model instances performing intricate, simultaneous interactions, having the engine being simultaneously driven by each of these actors would be considerably more complex than the default, opposite mode of operation.
However, if emulating a full-blown Sim-Diasca actor from another language may prove difficult, there are settings where by design less leeway is granted to at least some actors, in which case the integration of third-party code and the development of relevant APIs could be significantly eased.
A typical use case is the one of the dataflow, where the dataflow actors are quite constrained in terms of interactions (which have to be mediated through input and output ports). As a result, a CS2-level coupling API could be considered in that case; please refer to the A Focus on the Special Case of the Dataflow section for that.
More generally, it shall be noted that in this CS2 scheme, for each of these overall logical adapted models, there are:
In any case, as shown in the diagram below, an evaluation of the model would then result in the following sequence of actions:
Pros:
Cons:
This coupling scheme may superficially look similar to the previous one; it is actually quite different. Indeed, in this scheme the model appears to be mostly made of the adapter itself, since most of its key elements, from the state to the interactions, are directly mediated by the adapter; however the actual computations are - at least to some elected extent - deferred to the actual, embedded model, which can be seen here mostly as a domain-specific library, i.e. as a set of exposed functions.
So the role of the adapter is here to:
By moving the centre of gravity (with regard to control, state, interactions) from the embedded model to the adapter, the tricky part of the coupling, which is the synchronisation with the rest of the simulation, can be secured directly and a lot more easily; moreover the degree of integration can be finely tuned.
Indeed, embedded models could initially expose only a very limited set of functions (possibly just one in the simplest cases, a function that could be named for example process or compute) to perform their actual operations. Over time, the services they offer might be subdivided into finer and finer pieces, for better control and selective interactions [6]. The granularity of the computations exposed by the embedded model can thus be freely adjusted to accommodate the interactions needed by a simulation; this can also be seen as finding a balance between the operations delegated to the model and the ones directly taken in charge by the adapter - knowing that being able to change their respective share over time could be very convenient.
[6] | For example in some cases the precise set of needed data cannot be determined from the very start of the model's evaluation at a given timestep; hence, instead of first collecting inputs, then fully performing the processing, then collecting the outputs, it could be more effective, thanks to the adapter, to intersperse more finely the computations with the relevant data exchanges. |
In some cases it may be contemplated that the state of that actor is split in two parts:
Then the public state may be minimized in favour of the private one, which remains native, in terms of programming language, to that model.
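A minimal Python sketch of this adapter-driven scheme, with purely hypothetical names, could look as follows; the embedded model keeps its private state in its native form and exposes a single compute entry point, while the adapter owns the public state exchanged with the rest of the simulation:

```python
class StockLibrary:
    """Hypothetical embedded model: its private state is kept in its own,
    native representation and is never exposed to the simulation."""
    def __init__(self, initial_level):
        self._level = initial_level  # private state

    def compute(self, inflow, outflow):
        # The single, coarse-grained entry point mentioned above; it could
        # later be subdivided into finer services.
        self._level += inflow - outflow
        return self._level

class StockAdapter:
    """Adapter-driven actor: it owns the public state exchanged with the
    other actors, and defers the actual computation to the embedded model."""
    def __init__(self, initial_level):
        self.public_state = {"level": initial_level}
        self._model = StockLibrary(initial_level)

    def act(self, inflow, outflow):
        # Collect inputs, delegate the computation, publish the result.
        self.public_state["level"] = self._model.compute(inflow, outflow)
        return self.public_state["level"]
```

Here the synchronisation with the rest of the simulation is entirely handled on the adapter side (the act method being the place where actor messages would be exchanged), which is precisely what makes this scheme easier to secure.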
The series of operations for this coupling scheme is illustrated in the next diagram:
Pros:
Mixed:
Cons:
Note
This coupling scheme and the previous one may be seen as special cases of a more general scheme, where an applicative protocol, relying on any kind of IPC, may couple the logical processes that correspond respectively to the adapter and to the model in relations that are more complex than having a master and a slave.
However considering the adapter and the model as peers does not seem to grant specific benefits, and surely leads to a more complex design, hence we retained here only strict master/slave relations, with CS2 and CS3.
This coupling scheme is the most straightforward one: like for most simulations, a model here is specifically designed according to the engine that is supposed to evaluate it.
As such, the model is fully compliant with the framework, and is perfectly integrated with the rest of the simulation, which often leads to the best controllability and runtime performances (significantly better than the ones obtained with co-simulation).
Of course developing a model specific to Sim-Diasca has a cost; however integrating a third-party model requires efforts as well, and they may be significant.
The series of operations involved in this coupling scheme is simple, as it corresponds exactly to the normal mode of operation of the engine, as shown in the next diagram:
Pros:
Cons:
Some simulations significantly deviate from the usual multi-agent, disaggregated scheme often seen in the simulation of complex systems, and are best expressed according to alternate paradigms.
One of these paradigms is the dataflow [7] architecture, where operations are implemented by blocks that are interlinked in a rather static overall assembly.
[7] | Please refer to the Sim-Diasca Dataflow HOWTO for more information. |
This architecture, involving fixed, typed routes delimited by input and output ports, is more constrained than the usual multi-agent approach, and as such offers coupling perspectives of its own.
Indeed, even if of course the targeted dataflow may involve only Sim-Diasca actors, having to dispatch the corresponding computations to well-defined blocks opens the possibility of having some of these blocks implemented as third-party components, with their own technical conventions and languages.
We therefore identified specifically two main coupling opportunities pertaining to the dataflow approach, discussed and contrasted below: having library-based blocks, or blocks using a foreign dataflow API.
This architecture simply corresponds to CS3 (i.e. the Adapter-Driven Model coupling scheme) once applied to the dataflow special case.
More precisely, in this approach a given third-party model is structured as a domain-specific component offering a set of pre-implemented computations, of relevant granularity. On top of this "expert library", a Sim-Diasca dataflow adapter is then defined, in order to obtain from it an actual dataflow block.
The purpose of this adapter is to drive that embedded domain library in the context of the underlying dataflow, triggering the services it offers according to a relevant logic, feeding it with the proper information and fetching from it the relevant outputs.
Taking the example of a hypothetical model of a heat pump, the third-party library may split related computations into, say, 5 functions allowing respectively to determine:
These domain services exposed by the library, which are often at least loosely coupled, might then be federated by a relevant Sim-Diasca adapter, in charge of:
[8] | The logic of a block may exceed a plain, unconditional series of steps; in the general case an adapter-level algorithm is to be defined (e.g. with stateful logic, conditional sections, loops, etc.). |
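For illustration purposes only, assuming invented function names for such a heat pump library (the actual decomposition being project-specific), the adapter logic federating these services through input and output ports could be sketched in Python as:

```python
class HeatPumpLibrary:
    """Hypothetical third-party 'expert library': a few loosely coupled
    domain functions (names and formulas invented for illustration)."""
    def coefficient_of_performance(self, source_temp, target_temp):
        # Idealised, Carnot-style COP; for illustration only.
        return (target_temp + 273.15) / max(target_temp - source_temp, 1.0)

    def electrical_power(self, heat_demand, cop):
        return heat_demand / cop

class HeatPumpBlock:
    """Dataflow adapter: it reads its input ports, drives the embedded
    library according to the block's own algorithm [8], and fills its
    output ports accordingly."""
    def __init__(self):
        self.input_ports = {"source_temp": None, "target_temp": None,
                            "heat_demand": None}
        self.output_ports = {"electrical_power": None}
        self._lib = HeatPumpLibrary()

    def activate(self):
        # Triggered by the dataflow once all input ports have been set.
        ins = self.input_ports
        cop = self._lib.coefficient_of_performance(ins["source_temp"],
                                                   ins["target_temp"])
        self.output_ports["electrical_power"] = \
            self._lib.electrical_power(ins["heat_demand"], cop)
```

The library remains usable on its own (for tests, or for other platforms), while all dataflow-specific concerns stay confined to the block.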
Of course a bridge must be created between the third-party library and the adapter. As in the general case different, non-Erlang programming languages have to be accommodated, please refer to the Interworking with Other Programming Languages section for the corresponding technical details.
An advantage of basing a block on a third-party library augmented with an adapter is the clean separation between the two: the domain-specific part is embodied by the library while the technical, engine-specific, dataflow-compliant part lies in the adapter.
Doing so makes it possible to define an autonomous library that can be used in multiple contexts, the one with the Sim-Diasca dataflow adapter being only one of them.
For example, on top of the same library, separate, re-usable tests can be defined, adapters to other platforms can be devised, etc., while the library can still be used directly, in ad hoc developments.
Another coupling approach can be considered in such a dataflow context: any component can be safely integrated provided that it respects the laws and rules applying to the dataflow at hand.
For that a dataflow API must have been defined, and be implemented in the programming language that has been chosen in order to develop the model.
The model would then take the form of a dataflow block, and directly use the various constructs provided by the API, such as input and output ports, that are implemented by the coupling architecture - all of that being developed in one's language of choice.
Using such a foreign dataflow API allows one to stick to one's favorite language (e.g. Python or Java), sparing the need to use Erlang - provided, of course, that this API has already been implemented in that particular language.
A drawback is that this approach results in a version of the model that is dedicated to this coupling architecture, and thus cannot be readily used in other contexts.
The choice between dataflow coupling options is to be made per-block: in a given simulation, native dataflow blocks (i.e. based on classical Sim-Diasca actors), library-based ones (based on Sim-Diasca dataflow adapters) and blocks relying on one of the foreign dataflow APIs can safely coexist, as they obey the same conventions.
So, for a given model, how should it best be integrated as a computation block in a target dataflow?
If a clean separation between the domain expertise and the software development is sought, then the library approach shall be considered, especially if technical support is available in order to couple the model (i.e. to write the dataflow adapter); this would offer much potential for model re-use.
Conversely, if developing in Erlang is a problem, and if, for a target language, the dataflow API is already available, then this route should be favored, especially if the model is to be written for this sole context.
Note
As mentioned, a different coupling scheme can be chosen for each model.
Is this model to be included in a dataflow approach? If yes, refer to Choosing a Dataflow Option.
If no, is this model delivered as a black box, or is it overly complex?
(as a result, due to its limitations, we currently see no use case where CS2 should really be specifically recommended)
Often a separate process, run by the target programming language, is spawned. The following components are then used:
So that Java code can interact with Erlang code, one should use Jinterface.
Parallel execution and bidirectional control and exchange can be implemented.
The adaptation bridge here is the thin Java layer added on top of the adapted model so that it can emulate the behaviour of an Erlang node (hence relying on the primitives offered by Jinterface in order to define a mailbox, send and receive messages, create Erlang terms, etc.).
The purpose of this bridge is mostly to route the requests emanating from the adapter so that their Java implementation is executed and their results are propagated back.
This can be done either thanks to ErlPort (probably the best choice), or to Py-Interface.
Parallel execution and bidirectional control and exchange can be implemented.
Links known to exist:
This topic is also addressed in the Erlang FAQ and in the Interoperability Tutorial User's Guide.
Some interworking solutions are quite low-level; they induce little overhead (e.g. for linked-in drivers or for NIFs, no extra operating system process is needed), yet they may adversely affect the Erlang VM (should they crash, leak memory, block their associated scheduler or fail to terminate), while others offer higher-level, more integrated solutions (for example to transform adequately and transparently the datastructures being exchanged).
In these latter cases, care must be taken about scalability. For example, depending on the number of integrated instances that are involved, too many system processes might be created or too many file descriptors may be opened, as for each port one process is spawned while one file descriptor is used for reading and one is used for writing.
Custom protocols over IPC (e.g. files, pipes, UNIX sockets, Internet ones, etc.), based on the exchange of binary messages, can also always be devised. This would show a priori little advantage over any of the already available standard protocols, yet would allow to add support for virtually any language or exchange channel. Extending ErlPort could be another option.
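As an example of such a custom protocol, the following Python sketch frames binary messages with a 4-byte big-endian length prefix, a convention that matches the {packet, 4} option of Erlang ports and sockets:

```python
import struct

def send_message(stream, payload: bytes):
    """Frame one binary message with a 4-byte big-endian length prefix,
    as an Erlang peer configured with {packet, 4} would expect."""
    stream.write(struct.pack(">I", len(payload)) + payload)

def receive_message(stream):
    """Read back one length-prefixed message from the stream; returns
    None if the peer closed the channel."""
    header = stream.read(4)
    if len(header) < 4:
        return None
    (length,) = struct.unpack(">I", header)
    return stream.read(length)
```

Such framing can be layered over any byte-oriented channel (files, pipes, UNIX or Internet sockets), the payload encoding itself (e.g. a serialisation format) being then free to vary per project.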