Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Preamble #32

Open
lhstrh opened this issue Sep 18, 2019 · 33 comments
Open

Preamble #32

lhstrh opened this issue Sep 18, 2019 · 33 comments

Comments

@lhstrh
Copy link
Member

lhstrh commented Sep 18, 2019

Currently, preamble blocks can defined within the scope of a reactor definition. As such, it looks more like a constructor. It seems odd to call it preamble. Moreover, if we want inheritance in LF, it seems perfectly reasonable to have constructors, even if the target language doesn't have them. We had the keyword constructor before, and we've reverted to preamble at some point, but I don't recall why. It doesn't seem like a satisfactory solution to me. To me, preamble is something that should have global scope, which is something we're not even allowing currently.

Concrete syntax proposal:

  • give preamble global scope (we could potentially call it global)
  • change current "local" preamble into a target language code block delimited by {= and =} but without the preamble keyword; the functionality would remain unchanged for the C target; it be useful for defining class functions in OO languages
  • add a constructor {= ... =} block to reactor definitions; this would map to regular constructors in OO languages and, if used, would have the same functionality in C as the current preamble provides
@MattEWeber
Copy link
Contributor

I've been experimenting with protocol buffers in reactors, and in C it's necessary to #include a header file which is separately compiled from the protocol buffer definition. It seems to me "preamble" is actually a pretty good name for the code block containing reactor specific #includes. What sort of usage do you have in mind for a "constructor" block that couldn't be achieved by using a reaction triggered at time 0?

I'm confused by some of your proposals: Looking at generated C code for reactors with a preamble, the preamble seems to already have global scope. And, eclipse tells me I've made a mistake if I don't write the preamble as a code block delimited by {= and =}. Am I missing something?

@lhstrh
Copy link
Member Author

lhstrh commented Sep 19, 2019

I realize that the C code generator is currently set up that way, but my reasoning is that allowing reactors to freely append to a global preamble is a bad idea because it breaks composability; if two reactors happen to declare the same variable or function we're already screwed.

From a language design perspective it is an odd choice to have something that is part of a class definition (let's say reactor definitions are classes; we use them that way, anyway) to be featured in global scope. In C we can mitigate this problem by leveraging file scope, but we're not currently using it (i.e., all imported code is inlined). In JS there are other scoping rules we can leverage.

That said, for C, do we really need anything else beyond #includes? If we want to write code that is accessible by all reactors in file scope, I would say that putting them in a .h file is the cleanest way of allowing that, anyway. So perhaps we can narrow preamble down to a declaration of target-language imports? If header files are written properly, redundant include statements are idempotent, so...

On second thought, I actually rescind the first point of the proposal (considering we have imports and hierarchy, that would break compositionality just as easily but in a slightly different way).

With regard to the constructor issue, I agree with you that it could be done with an initial reaction. Come to think of it, in the C code generator we're essentially already synthesizing a constructor to populate the self struct based on the initialization syntax we provide. Having a state keyword, however, begs the question why we don't have mechanism for adding methods to the self struct. That would equip us with a nice OO scoping mechanism. Perhaps that, together with target-language imports could obviate preamble altogether?

@edwardalee
Copy link
Collaborator

edwardalee commented Sep 19, 2019 via email

@edwardalee
Copy link
Collaborator

edwardalee commented Sep 19, 2019 via email

@lhstrh
Copy link
Member Author

lhstrh commented Sep 19, 2019

We use file scope in C, but how can we do it in JS? We don't have file scope there. I guess my point is that we're essentially doing OO, but doing it in a half-baked way. I'm trying to find out whether this is an intentional feature of the language or lack of time spent working out the details. I've been assuming it has been the latter, and I'd like to devote some time to working on parts that I think need improvement.

Also, I'm trying to be conservative and make sure what we introduce in LF is truly necessary, not excess. I'm thinking we might be able to do with something less powerful and more manageable than preamble. Is there currently any use of preamble that would not be covered by target-language imports + allowing methods in the self struct? This would map beautifully to languages that support OO natively, and we know how to do it in C as well.

@lhstrh
Copy link
Member Author

lhstrh commented Sep 19, 2019

We've discussed initialize {= … =} before, and I like it. Unless someone objects I'll add it to the grammar in the current revision that I'm working on.

@edwardalee
Copy link
Collaborator

edwardalee commented Sep 19, 2019 via email

@edwardalee
Copy link
Collaborator

edwardalee commented Sep 19, 2019 via email

@lhstrh
Copy link
Member Author

lhstrh commented Sep 19, 2019

Not sure. I think you're right about it not being so easy. I think it might turn out extremely verbose or requiring the parsing the C code to peal out the function name, args, and types.

There is a book about OO in C, written by Axel-Tobias Schreiner that I expect holds the answers: https://www.cs.rit.edu/~ats/books/ooc.pdf

@lhstrh
Copy link
Member Author

lhstrh commented Sep 19, 2019

Something like this could work:
method {= int =} id {= int arg1 =} {= return arg1; =}
We would have to generate a fun decl:
int id(int arg1) { return arg1; };
And put in the self struct an entry:
(*id) (int arg1);
After setting the pointer, we could then call it like:
self->id(1)

Things get rather ugly with these many delimiters. But the same is already true for reactor parameters and initialization syntax... I'm still tempted to suggest a strategy where we parse the target code so we can make things look more fluid...

@edwardalee
Copy link
Collaborator

edwardalee commented Sep 19, 2019 via email

@lhstrh
Copy link
Member Author

lhstrh commented Sep 19, 2019

Perhaps. But the issue does come into play with inheritance. Suppose reactor B extends reactor A. Some reaction in a A uses a function X declared in its preamble. B wants to override X. How does it do that? Or maybe A adds a reaction that also would like to invoke X. How does it do that? Or maybe A, unaware of the implementation specifics of B, introduces X in its preamble; this would cause a clash (this could be avoided by using file scope, but it would pose a problem for my second question regarding the sharing of preamble functions within the inheritance hierarchy). There are lots of subtleties here...

@lhstrh lhstrh changed the title Preamble vs. Constructor Preamble Sep 19, 2019
@cmnrd
Copy link
Collaborator

cmnrd commented Nov 14, 2019

While working on the C++ target, I noticed anothe problem with the preamble (which prevented me from implementing the Preamble.lf test).

Unlike the C target, the C++ target produces multiple output files. This is one main file (main.cc) and a header and a source file for each reactor (<reactor_name>.hh and <reactor_name>.cc). Now the preamble goes to the top of the header file as it might contain includes that are required for the declaration of the reactor. This causes problems if the preamble conatins more than just includes, e.g. a function definition as in Preamble.lf. Since this function definition goes to the header and this header included in at least two files (main.cc and <reactor_name>.cc) the function will be defined in two translation units causing a linker error.

While I think the preamble is importan for imports and other things that require file scope, I don't think it should contain function definitions or other code. One way to deal with this issue in the current version of the C++ backend, is to simply avoid placing code in the preamble and to define an external liberay with then is included in the preamble.

@edwardalee
Copy link
Collaborator

edwardalee commented Nov 14, 2019 via email

@lhstrh
Copy link
Member Author

lhstrh commented Nov 20, 2019

After some more discussion, we've come up with the following proposal:

  • remove the preamble statement from reactor definitions
  • augment import to also allow the inclusion of target code files; the code generator will produce target code inclusion/import statements accordingly
  • add a header statement in LF in file scope, to list verbatim code that shall be put at the top to the generated target file or in the header file (if there is one)

The rationale behind eliminating class scope inclusion of verbatim target code is the following: soon as we have support for inheritance, achieving such inclusions will be possible by extending an extended Reactor base class. The idea is that for object-oriented targets, reactors can either extend LF reactor definitions or target code reactor definitions that extend the Reactor base class. An LF reactor definition that does not explicitly extends anything is assumed to extend the Reactor base class.

Side note: the header statement would be a convenience, not a necessity (one could simply produce and reference a header file in order to incorporate additional target code).

@edwardalee
Copy link
Collaborator

edwardalee commented Nov 20, 2019 via email

@lhstrh
Copy link
Member Author

lhstrh commented Nov 20, 2019

We agree on the last point (inheritance).

I think our misunderstanding in part stems from the difference in what we mean by file scope. For the C++ target, each reactor is put in a separate file. For the C target, on other other hand, everything is put in one file. If we keep preamble as a file-level verbatim inclusion, this should have the same meaning for both the C++ and C code; right now it doesn't. Which approach do we take?

That said, if we can have imports in the reactor scope (which I see the need for if each reactor is in its own file), why do we still need verbatim inclusion?

@cmnrd
Copy link
Collaborator

cmnrd commented Nov 21, 2019

It seems like I am missing some context here, but I'll try to add my view.

I think the problem with the current preamble is not the missing distinction between imports and code, but rather the distinction between the public interface of an reactor and its private interface. By public interface I mean everything that is required
to instantiate and use a reactor (for instance the declaration of a fancy type in C/C++). By private interface I mean everything that is required to implement the reactors functionality (for instance a specific library function.

In C++ the public interface goes to the header. This could be various includes but it is actually not limited to includes. It might also make sense to allow inserting code directly (for instance to define a type that the reactor uses). The private interface goes to the source file and may include headers or define functions that are required for the reaction definitions but not for the public interface.

I think what exactly is part of the private interface and what is part of the public interface is somewhat language dependent. For some languages it might not even make sense to distinguish them. However, I think that this is a concept that is found in many languages in one way or another.

So in conclusion my proposal is to leave the preamble at reactor level, but to extend it with the public or private qualifier. I think this would also play nice with inheritance. A derived reactor would inherit the public interface of its base but not it's private interface. We could even think about introducing a protected preamble, that is not public but is inherited.

@cmnrd
Copy link
Collaborator

cmnrd commented Feb 12, 2020

After talking to Marten on the phone about this I want to revisit this discussion. I think that Lingua Franca should not try to understand target language imports. This can be immensely difficult. For instance, C, C++, and python support conditional imports that might check for different platforms or libraries. Moreover, I think that there would not be much of a benefit if Lingua Franca actually understands and manages target code imports. Target language compilers are much better at doing this.

I propose to leave the preamble as is. To me it is quite clear what it does and I am also perfectly fine with the name. Since, C and C++ distinguish declaration and definition, and place them in different files, this also needs to be reflected in the preamble. Therefore, I want to revisit the approach of public and private preamble which I already proposed above and underline it with the code snippet below. The basic idea is to distinguish types that are visible on the interface of an reactor (public preamble) from types, functions, etc. that are only used within an reactor (private preamble).

target Cpp;

reactor PreambleTest {
  public preamble {=
    // Public preamble should contain everything that is required to use a
    // reactor. This includes everything that is used on the reactor interface,
    // i.e., types used on ports.

    // The public preamble goes to the header (PreambleTest.hh).

    struct Point {
      int x;
      int y;
    };
  =}

  private preamble {=
    // Private preamble should contain everything that is required for the
    // definition of a reactor, i.e., everything that is used within reaction
    // bodies. For instance, the code below includes some header and
    // defines a helper function.

    // The private preamble goes to the source file (PreambleTest.cc).

    #include <iostream>
    #include <string>
    #include <sstream>

    std::string point_to_string(const Point& p) {
      std::stringstream ss;
      ss << '(' << p.x << ',' << p.y << ')';
      return ss.str();
    }
  =}

  input point:Point;
  reaction(point) {=
      std::cout << "Received point" << point_to_string(*point.get()) << '\n';
  =}
}

Since not all languages distinguish declaration and definition, the distinction between private and public preamble is not required for all target languages. We could make the private/public prefix of the preamble keyword optional and validate the correct usage based on the selected target language. In the code snippet above, the preambles are part of the reactor definition. In order to avoid code duplication. We could also allow the definition of public and private preambles at file level. Then, for the C++ target, the public and private preamble should go to there own header and source file. We could support these options exclusively or we could even allow combined usage.

In case we later support inheritance of reactors, we can also think about introducing a protected preamble.

What do you think?

@edwardalee
Copy link
Collaborator

I like this proposal. An interesting further step for the C and C++ targets would be to parse the public preamble for typedefs and automatically generate a Protobuf (or alternative) specification for structs or other typedefs defined therein.

We still have the problem, however, that we need to be able to specify compiler and linker options, e.g. to link to separately compiled .cc file for which we #include the .hh file. What should be the syntax for that?

In the C target, we still have the problem that preamble code has file scope rather than reactor scope. We should fix this by generating separate .c and .h files for each reactor and then generating a makefile.

@cmnrd
Copy link
Collaborator

cmnrd commented Feb 13, 2020

Regarding the compiler and linker options, I don't think that specifying the bare options is sufficient for larger projects. I think we need a way to hook into the target build system. In the C++ target, I added a target property which allows to specify a cmake file which is automatically included by the cmake build system. This provides full configurability of the build process including searching for and linking to external libraries as well as control over compiler options.

I don't think that there is a syntax that is general enough to rule them all. The compilers for different languages and there build systems are just too diverse.

@lhstrh
Copy link
Member Author

lhstrh commented Feb 24, 2020

I've change the grammar to implement Christian's proposal (43843e8), but the respective code generators still need to be updated to pay attention to the public/private visibility modifiers and generate code accordingly. This activity should be lumped in with the task of using .h files in the C generator: I've created a separate ticket for this: #108
Note that this change also permits preambles at the LF file scope level. While it is now legal to write these, they are currently still ignored by the code generator.

I've made the default visibility private. Please let me know if you think the default should be public instead.

@cmnrd
Copy link
Collaborator

cmnrd commented Mar 31, 2020

I actually think the default visibility should be NONE or something similar indicating that the visibilty was not set. The validator then can enforce for C++ that either public or private is given (and I assume later for C as well) and enforce for other targets that no visibility is given.

By looking at the grammar, I also noticed that it is possible now to have an arbitrary number of preambles. While this can be dealt with in code generation, I dont't think it should be encouraged. Do you think the validator should restrict the preambles to at moste one of each visibility on file scope and on reactor scope?

cmnrd added a commit that referenced this issue Apr 9, 2020
cmnrd added a commit that referenced this issue Apr 9, 2020
In C++, this produces an error if no visibility is given.
In other targets, this produces a warning if a visibility is given.

See #32
cmnrd added a commit that referenced this issue Apr 9, 2020
In C++, this produces an error if no visibility is given.
In other targets, this produces a warning if a visibility is given.

See #32
@cmnrd
Copy link
Collaborator

cmnrd commented Apr 9, 2020

I made a small adjustment to the grammar to set default visibility to NONE. This allos the validator to check whether the visibility was set in the LF code or not. I also added a check that produces an error for the C++ target when no visibility is given. For all other tragets, it produces a warning stating that the visibility has no meaning in for this target.

cmnrd added a commit that referenced this issue Apr 9, 2020
This ensures that on imports the public preamble of the imported file
becomes included in the public preamble of the importing file.

See #32
@cmnrd
Copy link
Collaborator

cmnrd commented Apr 9, 2020

While implementing more C++ test, I found an issue with the way file scope preambles work now in C++.

The public preamble of an reactor A is visible to A itself and all othe reactors that instantiate A. The public preamble on top of an LF file is visible to all reactors contained within the file and all other reactors that instantiate one of those reactors. For instance in this file:

// preamble.lf
target Cpp;

public preamble {=
  struct Foo { /* */ };
=}

reactor A {=
=}

reactor B {=
=}

The definition of Foo is visible to both reactors A and B. If we now import this file, it is not so clear who is supposed to see the public preamble of preamble.lf. Consider this file:

// import.lf
target Cpp;
import preamble.lf;
reactor C {
}
reeactor D {
  a = new A();
} 

In the current implementation. D sees the definition of Foo because it instantiates A and A's preamble becomes automatically included. C, however, does not see Foo although it is defined in the imported file preamble.lf. To me this seems odd. I think the public preamble of an imported file should automatically become part of the public preamble of the importing flle so that boath C and D see the definition of Foo.

I put this here mainly as a note and as something to consider for the preamble mechanims in other targets. I think it also shows that there is still room for discussions on the semantics of preambles.

I will push a change in a second that properly includes the preambles of imported files.

@lhstrh
Copy link
Member Author

lhstrh commented Jul 11, 2020

As we're transitioning into a class-path-based import system, this (unresolved) issue has become relevant again. We also have several other unresolved issues that tie in with this one, such as #194 #149 #132 #108.

The mechanism currently active in the master branch simply parses URIs, finds them in the file system, and recursively adds them to the resource set. All imported .lf files will have all their reactors code generated (regardless of whether they are instantiated).

We're replacing this with a mechanism (see the new_import branch) that makes everything in the directory of the compiled source (standalone mode) or workspace (integrated mode) -- as well as everything found in the LF_CLASSPATH environment variable -- visible through a global scope provider. The order in which reactor classes are generated is based on a topological sort of their instantiations (which, indeed, cannot be circular). Unused classes are not code generated.

The problem that I ran into is the following: we have tests in the test suite the declare C types inside of the preamble of one reactor and go on to use that data type in another reactor that is defined in the same file (or even a different file, through an import). If the topological sort determines that the class definition with the preamble will get code generated after the one without the preamble, we get a compile error. If another reactor imports the reactor without the preamble, but not the reactor with the preamble, we get a compile error. If it imports both, you might get a compile error. If you compile again, the error may disappear (because the topological sort has no unique solution). You get the picture.

In the C generator, these preambles that are declared in the scope of a particular reactor are simply printed before the reactor's definition (i.e., the associated structs, reaction functions, etc.) but not limited in scope because subsequent reactor definitions can access the preamble of the classes defined before it, which leads to the issues described above. What is the point of associating a preamble a with a particular reactor if we have no mechanism to limit its visibility to that particular reactor? Why isn't a preamble something that gets printed at the beginning of a file, as its name suggests? Coincidentally, we also have a file-level preamble, which (or so it looks like) gets ignored by the C generator. Wouldn't file-level preambles (where the public modifier designates it's contents to be put in a header file, and the private modifiers designates it's contents to be printed in the C file) suffice?

Assuming that we would only have such file-level preambles, then what would it mean to import a reactor that comes from a file that has a preamble. Clearly, the imported reactor might depend on the preamble code, so we'll have to include it as well. Should imported reactors be spliced into a single file (as is now done in CGenerator), or should we generate separate files for reactors that originate from a different .lf files and #include them? If we put them in the same file, we might create name clashes between private preambles of imported reactors, which strikes me as problematic (after all, private implies that such interference would not be possible).

Note that we plan to also allow foreign imports (i.e., non-LF sources, such as .proto files); should we allow to import .h files too, as an alternative way of defining a public preamble? We might have to, because if we simply #include the header file in the private preamble, then where should it be stored (it would have to be a path relative to the src-gen directory)?

I feel like we've come to a point where we need to announce a feature freeze in order to jointly address these issues, as they've been lingering unresolved for too long. To address them properly will require a fair amount of work. I realize that these are not exciting topics from a research perspective, but I think that the way we address (or fail to address) these matters could really impact LF in the longer term.

@edwardalee
Copy link
Collaborator

My gut feeling is that importing .h files is the right thing to do. The .h file can be written using the usual trick with #define so that it is harmless to import it more than once.

@lhstrh
Copy link
Member Author

lhstrh commented Jul 13, 2020

Yes, importing .h files makes sense to me as well. This doesn't answer all the open questions though.

  • Do we remove support for the reactor-scoped preamble?
  • Do we add support for public/private preambles in C (i.e., put the public preamble + declarations in a generated header file + put the contents of each imported .lf file in a separate .c file)

@edwardalee
Copy link
Collaborator

Arguably, if we have a good include mechanism, then we could do away with preambles altogether. This might be a better solution since it seems hard to give preambles useful meaning that is common across targets. I think that elaborating them with public and private modifiers, if anything, just further muddies the water. We could then think about instead adding a truly object-oriented mechanism, like a "method" keyword to complete "reaction".

I suggest we support including .c, .cc, .h, and .hh files, put in a mechanism in the CGenerator that ensures that they never get #included more than once, and redo the tests that rely on these. I.e., let'd try to purge the tests of preambles and see where the friction arises.

@lhstrh
Copy link
Member Author

lhstrh commented Jul 13, 2020

You might be right. But I think there are some tricky corner cases to consider, even if we do away with preambles altogether.

Let's say I have a file foo.lf that has a reactor Foo. It also has a #include "foo_stuff.h" that provides it with some things that the implementation of Foo requires. Now assume we have a bar.lf that has a #include "bar_stuff.h", and it also imports Foo. Bar needs bar_stuff.h and Foo needs foo_stuff.h but putting them in the same file scope could raise spurious conflicts between foo_stuff.h and bar_stuff.h. What I think should happen is that the code generator should generate the following files: foo.h, foo.c, bar.h, and bar.c, where foo.c includes foo_stuff.h and bar.c includes foo.h and bar_stuff.h. This is a very different code generation approach from what we currently follow in CGenerator (i.e., dump everything in one file).

If we want a robust import mechanism, we also need a more advanced code generation approach, more akin to what CppGenerator does. We'll have to generate Makefiles, etc. While I think this is a dreadful task, I think it's important to do this. I can't do it alone, though, and I don't think it is a good idea to make these changes amid the amount of code churn that we've seen in the recent weeks.

@edwardalee
Copy link
Collaborator

Unfortunately, in your example, I don't think putting things into separate files will be so easy, at least not all the time. Suppose that Bar imports Foo because reactor Bar extends reactor Foo rather than just instantiating it. The simple mechanism we derived for inheritance is incompatible, I think, with putting these into separate files.

I suggest we proceed with support import of .h, .c, etc., without undertaking the (substantial) redesign required to put things into separate files. Yes, there is a possibility for collisions, but there are easy workarounds (move stuff out of .lf files and into .c files that you link against). Moreover, if we later undertake the project to put things into separate files, this can be done in a backward compatible way.

@lhstrh
Copy link
Member Author

lhstrh commented Jul 13, 2020

I agree that conflicts cannot be avoided between preambles or imports of reactor classes that are in an inheritance relationship (at least not with the primitive inheritance mechanism that we currently have), but I see inheritance as a rather intentional mixing of two classes. I would be more willing to accept conflicts that arise from such mixing as "legitimate," as opposed to a conflict that is the accidental result of interfacing two components the internals of which are intended to exist in isolation of one another.

That said, if we want support for preambles in C (either file-scope or reactor-scope), I see no other reasonable option than putting generated code from different .lf files into separate .c files (like CppGenerator does). If we don't want to do this now, then I think we need to disallow them in the validator for .lf files with a C target for the time being.

@edwardalee
Copy link
Collaborator

I honestly don't see why we have to disallow them. There are many ways that users can create C files that won't compile, and we would just be eliminating one of those, and not necessarily one of the more problematic ones. And the price seems kind of high. You can't define shared functions in a .lf file...

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants