Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[RFC] API: Implement C++ wrapper #22

Closed
wants to merge 1 commit into from
Closed

[RFC] API: Implement C++ wrapper #22

wants to merge 1 commit into from

Conversation

weltling
Copy link
Contributor

@weltling weltling commented Apr 10, 2023

The idea behind the implementation is to get the full extent of the powerful C++ features like the strict type handling, method overloading, etc. At the same time, the C++ wrapper should not add any significant overhead to the IR code generation time, while being embeddable well into modern C++ code.

The most of the IR functions use ir_ref which is typeless on the C/C++ side of things. Many functions use the type constants like IR_BOOL and suffixes in the function names to signify the data types. The C++ paradigms allow to simplify this by wrapping types and values in a class, which can be used to overload a method handling a particular operand type.

For example, an operation like getting an absolute value using the C functionality could lead to a function API similar to:

ir_ref abs_i8(ir_ctx *ctx, ir_ref op1) { return ir_fold1(ctx, IR_OPT(IR_ABS, IR_I8), op1);
ir_ref abs_i8(ir_ctx *ctx, ir_ref op1) { return ir_fold1(ctx, IR_OPT(IR_ABS, IR_I16), op1);
.....

where in C++, the same could be done by an approach similar to

i8 abs(ir_ctx *ctx, i8 op1) { return ir_fold1(ctx, IR_OPT(IR_ABS, IR_I8), op1);
i16 abs(ir_ctx *ctx, i16 op1) { return ir_fold1(ctx, IR_OPT(IR_ABS, IR_I16), op1);
.....

As illustrated here, in C++ the function signatures can be simplified, using the type checking and function overloading paradigms. Further simplifications, for example encapsulating the context argument by using OOP, automatically recognizing whether there's a value, ref or type, can be done. As required by the underlaying C API, a corresponding object like i8 can be converted to a type as in enum _ir_type, to a value, or to a ref depending on the context.

This PR approaches an initial C++ wrapper implementation based on the C++ paradigms as above and more. There are still some open points with regard to the data types in certain functions, possible overloading with arguments of mixed types that will need a closer attention. This PR is in first place to open the discussion on the wrapper approach, to see how for it can lead and catch any possible issues in the boiler plate code.

Fixes: #14

Signed-off-by: Anatol Belski [email protected]

@weltling weltling changed the title API: Implement C++ wrapper [RFC] API: Implement C++ wrapper Apr 10, 2023
Signed-off-by: Anatol Belski <[email protected]>
Copy link
Owner

@dstogov dstogov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The constval() is really over-designed.
The following patch simplifies implementation and usage (in the test)

diff --git a/c++/include/ir.hxx b/c++/include/ir.hxx
index 0e209cf..c1b752f 100644
--- a/c++/include/ir.hxx
+++ b/c++/include/ir.hxx
@@ -61,19 +61,16 @@ namespace ir
 
 	// i8, i16, i32, i64, u8, u16, u32, u64, addr, c, b, d, f
 #define IR_TYPE_CLS(name, cxx_type, field, flags) \
-	class field : public type, public ref { \
-	protected: \
-		cxx_type v; \
+	class field : public ref { \
 	public: \
 		static const field t; \
-		field() : v(0), type(IR_ ## name), ref(0) {} \
-		field(cxx_type _v) : v(_v), type(IR_ ## name), ref(0) {} \
-		field(const ref& _r) : v(0), type(IR_ ## name), ref(_r) {} \
-		operator cxx_type() const { return v; } \
-		operator ref() const { return r; } \
-		operator type() const { return t; } \
+		field() : ref(0) {} \
+		field(ir_ref _r) : ref(_r) {} \
+		field(const ref& _r) : ref(_r) {} \
+		operator ir_type() const { return IR_ ## name; } \
 	}; \
-	const field field::t = field{ IR_ ## name };
+	const field field::t = field{ IR_ ## name }; \
+	const auto name = IR_ ## name;
 	IR_TYPES(IR_TYPE_CLS)
 #undef IR_TYPE_CLS
 	class none : public type {
@@ -128,24 +125,28 @@ namespace ir
 		builder(uint32_t flags, int32_t consts_limit = IR_CONSTS_LIMIT_MIN, int32_t insns_limit = IR_INSNS_LIMIT_MIN)
 			: engine(flags, consts_limit, insns_limit) {}
 
-		i8 constval(i8 v) { return ref(ir_const_i8(&ctx, v)); }
-		i16 constval(i16 v) { return ref(ir_const_i16(&ctx, v)); }
-		i32 constval(i32 v) { return ref(ir_const_i32(&ctx, v)); }
-		i64 constval(i64 v) { return ref(ir_const_i64(&ctx, v)); }
-		u8 constval(u8 v) { return ref(ir_const_u8(&ctx, v)); }
-		u16 constval(u16 v) { return ref(ir_const_u16(&ctx, v)); }
-		u32 constval(u32 v) { return ref(ir_const_u32(&ctx, v)); }
-		u64 constval(u64 v) { return ref(ir_const_u64(&ctx, v)); }
-		b constval(b v) { return ref(ir_const_bool(&ctx, v)); }
-		c constval(c v) { return ref(ir_const_char(&ctx, v)); }
-		f constval(f v) { return ref(ir_const_float(&ctx, v)); }
+		i8 const_i8(int8_t v) { return ref(ir_const_i8(&ctx, v)); }
+		i16 const_i16(int16_t v) { return ref(ir_const_i16(&ctx, v)); }
+		i32 const_i32(int32_t v) { return ref(ir_const_i32(&ctx, v)); }
+		i64 const_i64(int64_t v) { return ref(ir_const_i64(&ctx, v)); }
+		u8 const_u8(uint8_t v) { return ref(ir_const_u8(&ctx, v)); }
+		u16 const_u16(uint16_t v) { return ref(ir_const_u16(&ctx, v)); }
+		u32 const_u32(uint32_t v) { return ref(ir_const_u32(&ctx, v)); }
+		u64 const_u64(uint64_t v) { return ref(ir_const_u64(&ctx, v)); }
+		b const_bool(bool v) { return ref(ir_const_bool(&ctx, v)); }
+		c const_char(char v) { return ref(ir_const_char(&ctx, v)); }
+		f const_float(float v) { return ref(ir_const_float(&ctx, v)); }
+		d const_double(double v) { return ref(ir_const_double(&ctx, v)); }
+		addr const_addr(uintptr_t v) { return ref(ir_const_addr(&ctx, v)); }
+		addr const_func_addr(uintptr_t v, uint16_t flags) { return ref(ir_const_func_addr(&ctx, v, flags)); }
+
+		addr unique_const_addr(uintptr_t v)  { return ref(ir_unique_const_addr(&ctx, v)); }
+
+		b constval(bool v) { return ref(ir_const_bool(&ctx, v)); }
+		c constval(char v) { return ref(ir_const_char(&ctx, v)); }
 		f constval(float v) { return ref(ir_const_float(&ctx, v)); }
-		d constval(d v) { return ref(ir_const_double(&ctx, v)); }
 		d constval(double v) { return ref(ir_const_double(&ctx, v)); }
-		addr constval(addr v) { return ref(ir_const_addr(&ctx, v)); }
-		ref constval(ref v) { return ir_const_str(&ctx, v); } // XXX have type str?
-		addr constval(addr v, uint16_t flags) { return ref(ir_const_func_addr(&ctx, v, flags)); }
-		ref constval(ref v, uint16_t flags) { return ir_const_func(&ctx, v, flags); } // XXX have type func?
+		addr constval(const void *v) { return ref(ir_const_addr(&ctx, (uintptr_t)v)); }
 
 		b eq(ref op1, ref op2) { return ref(ir_fold2(&ctx, IR_OPT(IR_EQ, IR_BOOL), ref(op1), ref(op2))); }
 		b ne(ref op1, ref op2) { return ref(ir_fold2(&ctx, IR_OPT(IR_NE, IR_BOOL), ref(op1), ref(op2))); }
@@ -511,7 +512,7 @@ namespace ir
 			// NOTE There is likely no chance to verify whether the number is correct.
 			//      Numbers can be passed in an arbitrary order.
 			num = num > 0 ? num : ++param_pos_cnt;
-			return ref(_ir_PARAM(&ctx, type(typ), name.c_str(), num));
+			return ref(_ir_PARAM(&ctx, ir_type(typ), name.c_str(), num));
 		}
 
 		template <class T>
diff --git a/c++/ir_test3.cxx b/c++/ir_test3.cxx
index 6a760c8..ce356d3 100644
--- a/c++/ir_test3.cxx
+++ b/c++/ir_test3.cxx
@@ -58,14 +58,14 @@ int main(int argc, char** argv) {
 	auto ci = bld.copy(x);
 	auto zi = bld.copy(bld.constval(.0));
 	auto zr = bld.copy(bld.constval(.0));
-	auto i = bld.copy(bld.constval(ir::i32{0}));
+	auto i = bld.copy(bld.const_i32(0));
 
 	auto loop = bld.loop_begin();
 		ir::d zi_1 = bld.phi(zi);
 		ir::d zr_1 = bld.phi(zr);
 		ir::i32 i_1 = bld.phi(i);
 
-		auto i_2 = bld.add(i_1, bld.constval(ir::i32{1}));
+		auto i_2 = bld.add(i_1, bld.const_i32(1));
 		auto temp = bld.mul(zr_1, zi_1);
 		auto zr2 = bld.mul(zr_1, zr_1);
 		auto zi2 = bld.mul(zi_1, zi_1);
@@ -75,9 +75,9 @@ int main(int argc, char** argv) {
 			bld.if_true(if_1);
 				bld.ret(i_2);
 			bld.if_false(if_1);
-				auto if_2 = bld.if_cond(bld.gt(i_2, bld.constval(ir::i32{1000})));
+				auto if_2 = bld.if_cond(bld.gt(i_2, bld.const_i32(1000)));
 				bld.if_true(if_2);
-					bld.ret(bld.constval(ir::i32{0}));
+					bld.ret(bld.const_i32(0));
 				bld.if_false(if_2);
 					auto loop_end = bld.loop_end();
 

field(cxx_type _v) : v(_v), type(IR_ ## name), ref(0) {} \
field(const ref& _r) : v(0), type(IR_ ## name), ref(_r) {} \
operator cxx_type() const { return v; } \
operator ref() const { return r; } \
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see compilation warning

./c++/include/ir.hxx:73:17: warning: converting ‘ir::b’ to a base class ‘ir::ref’ will never use a type conversion operator [-Wclass-conversion]
   73 |                 operator ref() const { return r; } \
      |                 ^~~~~~~~

There are few other similar wrnings

ref end() { return _ir_END(&ctx); }
ref end_list(ref list) { return _ir_END_LIST(&ctx, list); }
void merge(ref src1, ref src2) { return _ir_MERGE_2(&ctx, src1, src2); }
// XXX datatypes
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A trailing space after the data type. Git emits 4 similar warnings

rebase-apply/patch:700: trailing whitespace.
		// XXX datatypes 
rebase-apply/patch:759: trailing whitespace.
	
rebase-apply/patch:769: trailing whitespace.
	
rebase-apply/patch:855: trailing whitespace.
			
warning: 4 lines add whitespace errors.

Comment on lines +48 to +49
OBJS_IR_TEST2 = $(BUILD_DIR)/c++/ir_test2.o
OBJS_IR_TEST3 = $(BUILD_DIR)/c++/ir_test3.o
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Build fails if c++ directory is not created manually.

f sub(f op1, f op2) { return ref(ir_fold2(&ctx, IR_OPT(IR_SUB, IR_FLOAT), ref(op1), ref(op2))); }
d sub(d op1, d op2) { return ref(ir_fold2(&ctx, IR_OPT(IR_SUB, IR_DOUBLE), ref(op1), ref(op2))); }

i8 add(i8 op1, i8 op2) { return ref(ir_fold2(&ctx, IR_OPT(IR_ADD, IR_I8), ref(op1), ref(op2))); }
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This methods seems not inlined in RELEASE build:

$ nm --print-size --size-sort build_x86_64/ir_test.o 
0000000000000300 0000000000000108 T run
0000000000000000 00000000000002f8 T gen_mandelbrot
0000000000000000 00000000000003e7 T main
$ nm --print-size --size-sort build_x86_64/c++/ir_test3.o 
00000000000000dc 0000000000000001 b _ZStL8__ioinit
0000000000000000 0000000000000003 W _ZNKSt5ctypeIcE8do_widenEc
0000000000000000 0000000000000004 B _ZN2ir4none1tE
00000000000000d0 000000000000000c B _ZN2ir1b1tE
0000000000000070 000000000000000c B _ZN2ir1c1tE
0000000000000008 000000000000000c B _ZN2ir1f1tE
0000000000000060 000000000000000c B _ZN2ir2i81tE
00000000000000c0 000000000000000c B _ZN2ir2u81tE
0000000000000050 000000000000000c B _ZN2ir3i161tE
0000000000000040 000000000000000c B _ZN2ir3i321tE
00000000000000b0 000000000000000c B _ZN2ir3u161tE
00000000000000a0 000000000000000c B _ZN2ir3u321tE
0000000000000020 0000000000000010 B _ZN2ir1d1tE
0000000000000030 0000000000000010 B _ZN2ir3i641tE
0000000000000090 0000000000000010 B _ZN2ir3u641tE
0000000000000080 0000000000000010 B _ZN2ir4addr1tE
0000000000000000 000000000000002d W _ZN2ir7builder3addENS_1dES1_
0000000000000000 000000000000002d W _ZN2ir7builder3mulENS_1dES1_
0000000000000000 0000000000000031 t main.cold
0000000000000000 00000000000000b0 W _ZNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEC1IS3_EEPKcRKS3_
0000000000000000 00000000000000b0 W _ZNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEC2IS3_EEPKcRKS3_
0000000000000780 0000000000000141 t _GLOBAL__sub_I__ZN2ir1b1tE
0000000000000000 00000000000001d7 T _Z3runPFiddE
0000000000000000 0000000000000776 T main

Comment on lines +53 to +89
// start gen_mandelbrot
bld.start();
auto x = bld.param(ir::d{}, "x");
auto y = bld.param(ir::d{}, "y");
auto cr = bld.sub(y, bld.constval(.5));
auto ci = bld.copy(x);
auto zi = bld.copy(bld.constval(.0));
auto zr = bld.copy(bld.constval(.0));
auto i = bld.copy(bld.constval(ir::i32{0}));

auto loop = bld.loop_begin();
ir::d zi_1 = bld.phi(zi);
ir::d zr_1 = bld.phi(zr);
ir::i32 i_1 = bld.phi(i);

auto i_2 = bld.add(i_1, bld.constval(ir::i32{1}));
auto temp = bld.mul(zr_1, zi_1);
auto zr2 = bld.mul(zr_1, zr_1);
auto zi2 = bld.mul(zi_1, zi_1);
auto zr_2 = bld.add(bld.sub(zr2, zi2), cr);
auto zi_2 = bld.add(bld.add(temp, temp), ci);
auto if_1 = bld.if_cond(bld.gt(bld.add(zi2, zr2), bld.constval(6.0)));
bld.if_true(if_1);
bld.ret(i_2);
bld.if_false(if_1);
auto if_2 = bld.if_cond(bld.gt(i_2, bld.constval(ir::i32{1000})));
bld.if_true(if_2);
bld.ret(bld.constval(ir::i32{0}));
bld.if_false(if_2);
auto loop_end = bld.loop_end();

// close loop
bld.merge_set_op(loop, 2, loop_end);
bld.phi_set_op(zi_1, 2, zi_2);
bld.phi_set_op(zr_1, 2, zr_2);
bld.phi_set_op(i_1, 2, i_2);
// end gen_mandelbrot
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you please separate this into a gen_mandelbrot() function.

@@ -83,6 +93,11 @@ $(BUILD_DIR)/ir_emit_$(DASM_ARCH).h: $(SRC_DIR)/ir_$(DASM_ARCH).dasc $(SRC_DIR)/
$(OBJS_COMMON) $(OBJS_IR) $(OBJS_IR_TEST): $(BUILD_DIR)/$(notdir %.o): $(SRC_DIR)/$(notdir %.c)
$(CC) $(CFLAGS) -I$(BUILD_DIR) -o $@ -c $<

$(OBJS_IR_TEST2): $(BUILD_DIR)/c++/$(notdir %.o): $(SRC_DIR)/c++/$(notdir %.cxx)
$(CXX) $(CFLAGS) $(CXXFLAGS) -I$(BUILD_DIR) -o $@ -c $<
$(OBJS_IR_TEST3): $(BUILD_DIR)/c++/$(notdir %.o): $(SRC_DIR)/c++/$(notdir %.cxx)
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should also depend on $(SRC_DIR)/c++/include/ir.hxx


b overflow(ref op1) { return ir_fold1(&ctx, IR_OPT(IR_OVERFLOW, IR_BOOL), op1); }

b not_cond(b op1) { return ref(ir_fold1(&ctx, IR_OPT(IR_NOT, IR_BOOL), ref(op1))); }
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

not_cond is really bad name that makes fragmentation with C version.
May be it's better to use upper-case names?


// i8, i16, i32, i64, u8, u16, u32, u64, addr, c, b, d, f
#define IR_TYPE_CLS(name, cxx_type, field, flags) \
class field : public type, public ref { \
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As I remember multiple inheritance is not a good C++ practice

#define IR_TYPE_CLS(name, cxx_type, field, flags) \
class field : public type, public ref { \
protected: \
cxx_type v; \
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This v is necessary only for constants, but you include it into each reference.

(gdb) p sizeof(ir_ref)
4
(gdb) p sizeof(ir::ref) 
4
(gdb) p sizeof(ir::i32)
12
(gdb) p sizeof(ir::i64)
16

auto y = bld.param(ir::d{}, "y");
auto cr = bld.sub(y, bld.constval(.5));
auto ci = bld.copy(x);
auto zi = bld.copy(bld.constval(.0));
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How this makes distinct between FLOAT and DOUBLE?

@dstogov
Copy link
Owner

dstogov commented Apr 11, 2023

Type usage in parm and call also looks more complex then in C.

The rest builder API (e.g. add, sub, etc) benefits from C++ type overloading, but note that this overloading works only if we know the type(s) statically. This is a serious limitation. For example in PHP JIT we keep an array of ir_refs to nodes that represent the current values of PHP variables (some of them DOUBLE others are LONG). Static typing won't work in this case. See https://github.com/dstogov/php-src/blob/php-ir/ext/opcache/jit/zend_jit_ir.c#L1147 . Using object with virtual method is also not an option.

Finally, it would be great to have C and C++ APIs without big differences, to have a common reference manual.

Comment on lines +11 to +12
#include "ir.h"
#include "ir_builder.h"
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we include these files under ir::c namespace to hide C declarations from C++ users.

namespace ir {
  namespace c {
#  include "ir.h"
#  include "ir_builder.h"
  }
    const auto consistency_check = ir::c::ir_consistency_check;

@weltling
Copy link
Contributor Author

After further consideration it became obvious, that the approach as in this PR isn't suitable and doesn't reflect concepts involved in the low level implementation. Some part of the code in this PR could be reused in the further work, but in general the approach needs to be chosen more deliberately. Thus, herewith is this PR abandoned.

Thanks a lot for your guidance and reviews, @dstogov!

@weltling weltling closed this May 22, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

bindings: Implement C++ frontend
2 participants