Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Major Refactoring of API to a clean OOP interface #10

Open
mrh1997 opened this issue Jun 8, 2015 · 12 comments
Open

Major Refactoring of API to a clean OOP interface #10

mrh1997 opened this issue Jun 8, 2015 · 12 comments

Comments

@mrh1997
Copy link
Contributor

mrh1997 commented Jun 8, 2015

As noted in Issues #5, #6 and #8 the interface of version 0.2.0 has some shortcomings in the API which shall be cleaned up. PR #9 did already a careful rework that keeps compatibility. But as decided in #8 the interface shall be reworked completely.


This issue is a continuation of the discussion started in #8 (comment) and superseeds #8. Regarding the annotiation in this comment:

  1. BaseType is for simple, uncomposed types. I.e. int, unsigned char, float (What do you mean by "is not passed one in your example"? It is used in the example multiple times; i.e. https://gist.github.com/mrh1997/876070be3763733f849e#file-typesystem_proposal-py-L155)
  2. I also think that the parser should not be initialized globally. Will change this in my part of the project.
  3. Of course we need to be able to parse multiple headers (my fault). But I propose the parser not to store CLibInterface, but the parse method to accept multiple files (see modified GIST https://gist.github.com/mrh1997/876070be3763733f849e). This way we ensure the single resposibility principle:
    • the CLibInterface is the python object model of all public header files of the corresponding c library
    • the CParser object knows about the compiler details which may vary from compiler to compiler (see below).
  4. My intention for the naming of "CLibInterface" was to show, that this object corresponds to a c library and not to a c header (one CLibInterface per libary, even if the library has multiple headers). How about: "CLibHeaders"?
  5. Using the parser-rework branch is fine. How about releasing the current release as 0.2.0 (which is intentionally compatible to 0.1.0) and start a new, incompatible major version (0.3.0) when merging the parser-rework branch?

Regarding the CParser object, I have one more idea:

An application could support multiple compilers by simply instantiating multiple CParsers with slightly different different grammars. Currently the grammars may only vary in type_quals, but in future we could even implement more complex differences. I would propose to provide a submodule for every standard compiler (visual studio, gcc, ...) that only provide a single object: a prebuild CParser for the corresponding compiler.


Regarding cooperation:

I propose to change the unittests of the backend (test_ctypes.py): Currently the tests require CParser which is not a clean test design. A better approach would be to separate the header-file-model (see GIST) completely into a separate file and the backend test only rely on this file (not on cparser.py).

This way the test would be a "real" unit test, testing only a single unit (the backend; not parser+backend).

If you agree with me, I would start with a first PR, where I only add a independant file with the header-file-model. Then you and me could work independently.

@MatthieuDartiailh
Copy link
Owner

  1. Sorry I completely misread that sorry.
  2. Cool.
  3. The one case I would like to also see covered is the possibility to pass an existing CLibInterface object to a Parser (or several) which contains necessary definitions (such as windows headers or the like), those definitions could be stored in the newly produced CLibInterface (in an attribute named 'externals' for examples).
  4. If we bundle all definitions in a single object then I agree on CLibInterface
  5. I think I will rather go with a 0.11 version for this time and next one will be 0.2.

I like the idea of prebuilt parsers, let's do this.

I agree with you about making the backends tests independent of the parser. Don't bother about writing the CLibInterface for the header I use for the tests I will do it myself while working on the backends, and once I will be done I will also add a test checking that the parser does indeed produce the same result as the handwritten CLibInterface.

Lets rather start with the type system (in which you will have to include a Var and MacroVar object for completeness). As soon as this is in place I will start working on the backends.

@mrh1997
Copy link
Contributor Author

mrh1997 commented Jun 8, 2015

  1. Agree with you. For simplicity I Would propose, that every Parser contains a single CLibInterface, that acts as template and is copied in every call to parse()

Regarding MacroVar: I added a MacroDef() class and .macros attribute already (see gist). Did you mean this object or did you mean an additiinal class (meant for what?).

Regarding Var: My intention was to handle this via the .globals Attribute of the CLibInterface:

All global objs (functions and vars) are stored in the .globals Attribute. If a globale Entry points to a FunctionType, it is a Funktion, otherwise a variable.

@MatthieuDartiailh
Copy link
Owner

  1. Save if we end up with a compelling use case it should work.

Looking at the gist I was under the impression that this was reserved to macro function. Having two class one for macro functions and one for macro values might make things easier to understand.

My point is what do you do with a global entry referring to a lets say a structure, where do you store the members values ? The idea behind Var would be to store the type and alongside the value.

@mrh1997
Copy link
Contributor Author

mrh1997 commented Jun 8, 2015

What do you mean with compelling use case?

I can create two classes for macros if you prefer, but then I propose to derive one from the other, to avoid duplication oft common code.

Why do we need a value? Am I overlooking sth.? If I am correct the interface Definition only needs a Name and a type for vars (respektive respective key and value oPointe to a obale attr). Values Come into the game if you load a library. But the values of the library instances variables are currntly represented by ctype module objects.

If a variable points to a structure, the idea is, that it shall look like:

assert p.globals == { 'varname': TypeRef('struct structname') }
assert p.types == { 'struct structname': StructType(...) }

@MatthieuDartiailh
Copy link
Owner

Sorry about that I started my answer with something than re-wrote it ...
My idea that there might be use cases where it is desirable to include definition from multiple other library (windows header + user common library header) when parsing some headers. Such a case would require to be able to update the CLibInterface object hold by the parser.

My idea for the macros is that in the case of a function we simply keep what is needed to perform the preprocessing steps, in the case of values the user can query the actual value (as a python object) but maybe this could be reserved to the backend. (I definitively agree that those should have a common base class).

I was considering the ugly case of global variables declared in a header file (which may happen in single header project). Something like :

int var = 1;

This is ugly and should be avoided but I am worried that this exists nonetheless.

@mrh1997
Copy link
Contributor Author

mrh1997 commented Jun 9, 2015

  1. I see your point. But I think the goal should be to work with "constant" CParser objects (otherwise we could not provide them as globals in the submodules for standard compilers). How about adding a .derive(std_clibintf) function to the parser class, that allows to create a (shallow) copy of the parser with different "standard include definitions"?
  2. I think it could be possible to merge both cases into a single class, without dirty "if is_macro_func" checks in the macro class:
    • macro class provides a parameterlist attribute which is a list of strings (it is empty if we have a macro value and if we have a macro function without params).
    • macro class provides a function .value([p1, p2, p3, ...]) which is returning a string, where the parameters p1, ..., pn are replaced.
    • To provide a possibility for the application code to check if this is a function or value an additional attribute .is_macro_func could be provided (But this differnentiation is not needed by the code of the macro object)
    • All further code needed for differentiation (like additional parenthesis on funcs) are parser specific and should not be in the responsibility of a macro object.
  3. In the case you outlined the value is not needed by the backend to provide its functionality (let me know if I am wrong). If you want the information for completeness, I suggest to introduce an additional attribute .initial_vals to CLibInterface

@MatthieuDartiailh
Copy link
Owner

  1. I am fine with the derive method.
  2. Why not... I would prefer macro function to have a format method rather than value. As macro definition don't have parameters it would be easy to just make macrofunc a subclass. Creating two classes is IMHO cleaner (and not much more work).
  3. I guess you are right this is just a stupid corner case ... as adding the initial_vals should be easy I propose to do it nonetheless.

@mrh1997
Copy link
Contributor Author

mrh1997 commented Jun 10, 2015

OK.
I will implement the module as discussed above...

mrh1997 pushed a commit to mrh1997/pyclibrary that referenced this issue Jun 11, 2015
…artiailh#10 (comment).

Currently only c_model.py and test_c_model.py where added (and a yet empty class for the reworked parser in c_parser.py), without implications on existing code. Thus the next steps can be done independently relying on this model.
@mrh1997
Copy link
Contributor Author

mrh1997 commented Jun 11, 2015

I reasoned about the "stupid corner case" int x = 1;:
I cannot imagine a real-world header file like this, since it would lead to the following problem:

Imagine we had a mod1.c which is used by mod2.c and mod3.c. mod1.c's header (which contains int x = 1) can only be included by eiher mod2.c or mod3.c otherwise the linker would complain, since mod2.c and mod3.c are defining 'x'.

If you agree with me, I propose to avoid complicating the interface of CLibInterface by an attribute that is never needed.

@MatthieuDartiailh
Copy link
Owner

I agree that this is very ugly coding, so ok lets keep it out for the time being. If we find somebody stupid enough to do such a thing we can always add it later.

@mrh1997
Copy link
Contributor Author

mrh1997 commented Jun 27, 2015

While implementing #15 I figured out a relatively clean interface of the parser. Please let me know your thoughts about the API.


Regarding reworking of preprocessor: My original intention was to move the macro evaluation to the CLibInterface but keeping the CParser parsing the macro definitions. This would have required a AST like structure for macro definitions. Then the backend could use the macro without need to have knowledge of the CParser (which is currently responsible for evaluation).

But then I detected, that a clean macro evalution required more work, as the current implementation is not really conform to the C preprocessor spec. The reason is that macros currently are always evaluated during definition, instead of invokation. This can cause trouble. I.e.:

#define Y X
#define X  int
Y var;    //<- var will be of type CustomType('Y') instead of BuildinType('int')

Then I realized that in 99% percent the current implementation should do its job. Thus I think I will skip this job and continue with the backend. If we come back later to the preprocessor issue, it should be doable without API change (the CParser would use the Preprocessor object internally then)

@MatthieuDartiailh
Copy link
Owner

This makes perfect sense to me. Have you found a templating engine suiting your needs ?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants