[RFC] API: Implement C++ wrapper #22

weltling · 2023-04-10T23:53:05Z

The idea behind the implementation is to get the full extent of the powerful C++ features like the strict type handling, method overloading, etc. At the same time, the C++ wrapper should not add any significant overhead to the IR code generation time, while being embeddable well into modern C++ code.

The most of the IR functions use ir_ref which is typeless on the C/C++ side of things. Many functions use the type constants like IR_BOOL and suffixes in the function names to signify the data types. The C++ paradigms allow to simplify this by wrapping types and values in a class, which can be used to overload a method handling a particular operand type.

For example, an operation like getting an absolute value using the C functionality could lead to a function API similar to:

ir_ref abs_i8(ir_ctx *ctx, ir_ref op1) { return ir_fold1(ctx, IR_OPT(IR_ABS, IR_I8), op1);
ir_ref abs_i8(ir_ctx *ctx, ir_ref op1) { return ir_fold1(ctx, IR_OPT(IR_ABS, IR_I16), op1);
.....

where in C++, the same could be done by an approach similar to

i8 abs(ir_ctx *ctx, i8 op1) { return ir_fold1(ctx, IR_OPT(IR_ABS, IR_I8), op1);
i16 abs(ir_ctx *ctx, i16 op1) { return ir_fold1(ctx, IR_OPT(IR_ABS, IR_I16), op1);
.....

As illustrated here, in C++ the function signatures can be simplified, using the type checking and function overloading paradigms. Further simplifications, for example encapsulating the context argument by using OOP, automatically recognizing whether there's a value, ref or type, can be done. As required by the underlaying C API, a corresponding object like i8 can be converted to a type as in enum _ir_type, to a value, or to a ref depending on the context.

This PR approaches an initial C++ wrapper implementation based on the C++ paradigms as above and more. There are still some open points with regard to the data types in certain functions, possible overloading with arguments of mixed types that will need a closer attention. This PR is in first place to open the discussion on the wrapper approach, to see how for it can lead and catch any possible issues in the boiler plate code.

Fixes: #14

Signed-off-by: Anatol Belski [email protected]

Signed-off-by: Anatol Belski <[email protected]>

dstogov

The constval() is really over-designed.
The following patch simplifies implementation and usage (in the test)

diff --git a/c++/include/ir.hxx b/c++/include/ir.hxx
index 0e209cf..c1b752f 100644
--- a/c++/include/ir.hxx
+++ b/c++/include/ir.hxx
@@ -61,19 +61,16 @@ namespace ir
 
 	// i8, i16, i32, i64, u8, u16, u32, u64, addr, c, b, d, f
 #define IR_TYPE_CLS(name, cxx_type, field, flags) \
-	class field : public type, public ref { \
-	protected: \
-		cxx_type v; \
+	class field : public ref { \
 	public: \
 		static const field t; \
-		field() : v(0), type(IR_ ## name), ref(0) {} \
-		field(cxx_type _v) : v(_v), type(IR_ ## name), ref(0) {} \
-		field(const ref& _r) : v(0), type(IR_ ## name), ref(_r) {} \
-		operator cxx_type() const { return v; } \
-		operator ref() const { return r; } \
-		operator type() const { return t; } \
+		field() : ref(0) {} \
+		field(ir_ref _r) : ref(_r) {} \
+		field(const ref& _r) : ref(_r) {} \
+		operator ir_type() const { return IR_ ## name; } \
 	}; \
-	const field field::t = field{ IR_ ## name };
+	const field field::t = field{ IR_ ## name }; \
+	const auto name = IR_ ## name;
 	IR_TYPES(IR_TYPE_CLS)
 #undef IR_TYPE_CLS
 	class none : public type {
@@ -128,24 +125,28 @@ namespace ir
 		builder(uint32_t flags, int32_t consts_limit = IR_CONSTS_LIMIT_MIN, int32_t insns_limit = IR_INSNS_LIMIT_MIN)
 			: engine(flags, consts_limit, insns_limit) {}
 
-		i8 constval(i8 v) { return ref(ir_const_i8(&ctx, v)); }
-		i16 constval(i16 v) { return ref(ir_const_i16(&ctx, v)); }
-		i32 constval(i32 v) { return ref(ir_const_i32(&ctx, v)); }
-		i64 constval(i64 v) { return ref(ir_const_i64(&ctx, v)); }
-		u8 constval(u8 v) { return ref(ir_const_u8(&ctx, v)); }
-		u16 constval(u16 v) { return ref(ir_const_u16(&ctx, v)); }
-		u32 constval(u32 v) { return ref(ir_const_u32(&ctx, v)); }
-		u64 constval(u64 v) { return ref(ir_const_u64(&ctx, v)); }
-		b constval(b v) { return ref(ir_const_bool(&ctx, v)); }
-		c constval(c v) { return ref(ir_const_char(&ctx, v)); }
-		f constval(f v) { return ref(ir_const_float(&ctx, v)); }
+		i8 const_i8(int8_t v) { return ref(ir_const_i8(&ctx, v)); }
+		i16 const_i16(int16_t v) { return ref(ir_const_i16(&ctx, v)); }
+		i32 const_i32(int32_t v) { return ref(ir_const_i32(&ctx, v)); }
+		i64 const_i64(int64_t v) { return ref(ir_const_i64(&ctx, v)); }
+		u8 const_u8(uint8_t v) { return ref(ir_const_u8(&ctx, v)); }
+		u16 const_u16(uint16_t v) { return ref(ir_const_u16(&ctx, v)); }
+		u32 const_u32(uint32_t v) { return ref(ir_const_u32(&ctx, v)); }
+		u64 const_u64(uint64_t v) { return ref(ir_const_u64(&ctx, v)); }
+		b const_bool(bool v) { return ref(ir_const_bool(&ctx, v)); }
+		c const_char(char v) { return ref(ir_const_char(&ctx, v)); }
+		f const_float(float v) { return ref(ir_const_float(&ctx, v)); }
+		d const_double(double v) { return ref(ir_const_double(&ctx, v)); }
+		addr const_addr(uintptr_t v) { return ref(ir_const_addr(&ctx, v)); }
+		addr const_func_addr(uintptr_t v, uint16_t flags) { return ref(ir_const_func_addr(&ctx, v, flags)); }
+
+		addr unique_const_addr(uintptr_t v)  { return ref(ir_unique_const_addr(&ctx, v)); }
+
+		b constval(bool v) { return ref(ir_const_bool(&ctx, v)); }
+		c constval(char v) { return ref(ir_const_char(&ctx, v)); }
 		f constval(float v) { return ref(ir_const_float(&ctx, v)); }
-		d constval(d v) { return ref(ir_const_double(&ctx, v)); }
 		d constval(double v) { return ref(ir_const_double(&ctx, v)); }
-		addr constval(addr v) { return ref(ir_const_addr(&ctx, v)); }
-		ref constval(ref v) { return ir_const_str(&ctx, v); } // XXX have type str?
-		addr constval(addr v, uint16_t flags) { return ref(ir_const_func_addr(&ctx, v, flags)); }
-		ref constval(ref v, uint16_t flags) { return ir_const_func(&ctx, v, flags); } // XXX have type func?
+		addr constval(const void *v) { return ref(ir_const_addr(&ctx, (uintptr_t)v)); }
 
 		b eq(ref op1, ref op2) { return ref(ir_fold2(&ctx, IR_OPT(IR_EQ, IR_BOOL), ref(op1), ref(op2))); }
 		b ne(ref op1, ref op2) { return ref(ir_fold2(&ctx, IR_OPT(IR_NE, IR_BOOL), ref(op1), ref(op2))); }
@@ -511,7 +512,7 @@ namespace ir
 			// NOTE There is likely no chance to verify whether the number is correct.
 			//      Numbers can be passed in an arbitrary order.
 			num = num > 0 ? num : ++param_pos_cnt;
-			return ref(_ir_PARAM(&ctx, type(typ), name.c_str(), num));
+			return ref(_ir_PARAM(&ctx, ir_type(typ), name.c_str(), num));
 		}
 
 		template <class T>
diff --git a/c++/ir_test3.cxx b/c++/ir_test3.cxx
index 6a760c8..ce356d3 100644
--- a/c++/ir_test3.cxx
+++ b/c++/ir_test3.cxx
@@ -58,14 +58,14 @@ int main(int argc, char** argv) {
 	auto ci = bld.copy(x);
 	auto zi = bld.copy(bld.constval(.0));
 	auto zr = bld.copy(bld.constval(.0));
-	auto i = bld.copy(bld.constval(ir::i32{0}));
+	auto i = bld.copy(bld.const_i32(0));
 
 	auto loop = bld.loop_begin();
 		ir::d zi_1 = bld.phi(zi);
 		ir::d zr_1 = bld.phi(zr);
 		ir::i32 i_1 = bld.phi(i);
 
-		auto i_2 = bld.add(i_1, bld.constval(ir::i32{1}));
+		auto i_2 = bld.add(i_1, bld.const_i32(1));
 		auto temp = bld.mul(zr_1, zi_1);
 		auto zr2 = bld.mul(zr_1, zr_1);
 		auto zi2 = bld.mul(zi_1, zi_1);
@@ -75,9 +75,9 @@ int main(int argc, char** argv) {
 			bld.if_true(if_1);
 				bld.ret(i_2);
 			bld.if_false(if_1);
-				auto if_2 = bld.if_cond(bld.gt(i_2, bld.constval(ir::i32{1000})));
+				auto if_2 = bld.if_cond(bld.gt(i_2, bld.const_i32(1000)));
 				bld.if_true(if_2);
-					bld.ret(bld.constval(ir::i32{0}));
+					bld.ret(bld.const_i32(0));
 				bld.if_false(if_2);
 					auto loop_end = bld.loop_end();

dstogov · 2023-04-11T19:34:11Z

c++/include/ir.hxx

+		field(cxx_type _v) : v(_v), type(IR_ ## name), ref(0) {} \
+		field(const ref& _r) : v(0), type(IR_ ## name), ref(_r) {} \
+		operator cxx_type() const { return v; } \
+		operator ref() const { return r; } \


I see compilation warning

./c++/include/ir.hxx:73:17: warning: converting ‘ir::b’ to a base class ‘ir::ref’ will never use a type conversion operator [-Wclass-conversion] 73 | operator ref() const { return r; } \ | ^~~~~~~~

There are few other similar wrnings

dstogov · 2023-04-11T19:38:47Z

c++/include/ir.hxx

+		ref end() { return _ir_END(&ctx); }
+		ref end_list(ref list) { return _ir_END_LIST(&ctx, list); }
+		void merge(ref src1, ref src2) { return _ir_MERGE_2(&ctx, src1, src2); }
+		// XXX datatypes 


A trailing space after the data type. Git emits 4 similar warnings

rebase-apply/patch:700: trailing whitespace. // XXX datatypes rebase-apply/patch:759: trailing whitespace. rebase-apply/patch:769: trailing whitespace. rebase-apply/patch:855: trailing whitespace. warning: 4 lines add whitespace errors.

dstogov · 2023-04-11T20:03:33Z

Makefile

+OBJS_IR_TEST2 = $(BUILD_DIR)/c++/ir_test2.o
+OBJS_IR_TEST3 = $(BUILD_DIR)/c++/ir_test3.o


Build fails if c++ directory is not created manually.

dstogov · 2023-04-11T20:11:21Z

c++/include/ir.hxx

+		f sub(f op1, f op2) { return ref(ir_fold2(&ctx, IR_OPT(IR_SUB, IR_FLOAT), ref(op1), ref(op2))); }
+		d sub(d op1, d op2) { return ref(ir_fold2(&ctx, IR_OPT(IR_SUB, IR_DOUBLE), ref(op1), ref(op2))); }
+
+		i8 add(i8 op1, i8 op2) { return ref(ir_fold2(&ctx, IR_OPT(IR_ADD, IR_I8), ref(op1), ref(op2))); }


This methods seems not inlined in RELEASE build:

$ nm --print-size --size-sort build_x86_64/ir_test.o 0000000000000300 0000000000000108 T run 0000000000000000 00000000000002f8 T gen_mandelbrot 0000000000000000 00000000000003e7 T main $ nm --print-size --size-sort build_x86_64/c++/ir_test3.o 00000000000000dc 0000000000000001 b _ZStL8__ioinit 0000000000000000 0000000000000003 W _ZNKSt5ctypeIcE8do_widenEc 0000000000000000 0000000000000004 B _ZN2ir4none1tE 00000000000000d0 000000000000000c B _ZN2ir1b1tE 0000000000000070 000000000000000c B _ZN2ir1c1tE 0000000000000008 000000000000000c B _ZN2ir1f1tE 0000000000000060 000000000000000c B _ZN2ir2i81tE 00000000000000c0 000000000000000c B _ZN2ir2u81tE 0000000000000050 000000000000000c B _ZN2ir3i161tE 0000000000000040 000000000000000c B _ZN2ir3i321tE 00000000000000b0 000000000000000c B _ZN2ir3u161tE 00000000000000a0 000000000000000c B _ZN2ir3u321tE 0000000000000020 0000000000000010 B _ZN2ir1d1tE 0000000000000030 0000000000000010 B _ZN2ir3i641tE 0000000000000090 0000000000000010 B _ZN2ir3u641tE 0000000000000080 0000000000000010 B _ZN2ir4addr1tE 0000000000000000 000000000000002d W _ZN2ir7builder3addENS_1dES1_ 0000000000000000 000000000000002d W _ZN2ir7builder3mulENS_1dES1_ 0000000000000000 0000000000000031 t main.cold 0000000000000000 00000000000000b0 W _ZNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEC1IS3_EEPKcRKS3_ 0000000000000000 00000000000000b0 W _ZNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEC2IS3_EEPKcRKS3_ 0000000000000780 0000000000000141 t _GLOBAL__sub_I__ZN2ir1b1tE 0000000000000000 00000000000001d7 T _Z3runPFiddE 0000000000000000 0000000000000776 T main

dstogov · 2023-04-11T20:14:30Z

c++/ir_test3.cxx

+	// start gen_mandelbrot
+	bld.start();
+	auto x = bld.param(ir::d{}, "x");
+	auto y = bld.param(ir::d{}, "y");
+	auto cr = bld.sub(y, bld.constval(.5));
+	auto ci = bld.copy(x);
+	auto zi = bld.copy(bld.constval(.0));
+	auto zr = bld.copy(bld.constval(.0));
+	auto i = bld.copy(bld.constval(ir::i32{0}));
+
+	auto loop = bld.loop_begin();
+		ir::d zi_1 = bld.phi(zi);
+		ir::d zr_1 = bld.phi(zr);
+		ir::i32 i_1 = bld.phi(i);
+
+		auto i_2 = bld.add(i_1, bld.constval(ir::i32{1}));
+		auto temp = bld.mul(zr_1, zi_1);
+		auto zr2 = bld.mul(zr_1, zr_1);
+		auto zi2 = bld.mul(zi_1, zi_1);
+		auto zr_2 = bld.add(bld.sub(zr2, zi2), cr);
+		auto zi_2 = bld.add(bld.add(temp, temp), ci);
+		auto if_1 = bld.if_cond(bld.gt(bld.add(zi2, zr2), bld.constval(6.0)));
+			bld.if_true(if_1);
+				bld.ret(i_2);
+			bld.if_false(if_1);
+				auto if_2 = bld.if_cond(bld.gt(i_2, bld.constval(ir::i32{1000})));
+				bld.if_true(if_2);
+					bld.ret(bld.constval(ir::i32{0}));
+				bld.if_false(if_2);
+					auto loop_end = bld.loop_end();
+
+	// close loop
+	bld.merge_set_op(loop, 2, loop_end);
+	bld.phi_set_op(zi_1, 2, zi_2);
+	bld.phi_set_op(zr_1, 2, zr_2);
+	bld.phi_set_op(i_1, 2, i_2);
+	// end gen_mandelbrot


Could you please separate this into a gen_mandelbrot() function.

dstogov · 2023-04-11T20:29:43Z

Makefile

@@ -83,6 +93,11 @@ $(BUILD_DIR)/ir_emit_$(DASM_ARCH).h: $(SRC_DIR)/ir_$(DASM_ARCH).dasc $(SRC_DIR)/
 $(OBJS_COMMON) $(OBJS_IR) $(OBJS_IR_TEST): $(BUILD_DIR)/$(notdir %.o): $(SRC_DIR)/$(notdir %.c)
 	$(CC) $(CFLAGS) -I$(BUILD_DIR) -o $@ -c $<

+$(OBJS_IR_TEST2): $(BUILD_DIR)/c++/$(notdir %.o): $(SRC_DIR)/c++/$(notdir %.cxx)
+	$(CXX) $(CFLAGS) $(CXXFLAGS) -I$(BUILD_DIR) -o $@ -c $<
+$(OBJS_IR_TEST3): $(BUILD_DIR)/c++/$(notdir %.o): $(SRC_DIR)/c++/$(notdir %.cxx)


should also depend on $(SRC_DIR)/c++/include/ir.hxx

dstogov · 2023-04-11T20:33:11Z

c++/include/ir.hxx

+
+		b overflow(ref op1) { return ir_fold1(&ctx, IR_OPT(IR_OVERFLOW, IR_BOOL), op1); }
+
+		b not_cond(b op1) { return ref(ir_fold1(&ctx, IR_OPT(IR_NOT, IR_BOOL), ref(op1))); }


not_cond is really bad name that makes fragmentation with C version.
May be it's better to use upper-case names?

dstogov · 2023-04-11T21:20:31Z

c++/include/ir.hxx

+
+	// i8, i16, i32, i64, u8, u16, u32, u64, addr, c, b, d, f
+#define IR_TYPE_CLS(name, cxx_type, field, flags) \
+	class field : public type, public ref { \


As I remember multiple inheritance is not a good C++ practice

dstogov · 2023-04-11T21:24:01Z

c++/include/ir.hxx

+#define IR_TYPE_CLS(name, cxx_type, field, flags) \
+	class field : public type, public ref { \
+	protected: \
+		cxx_type v; \


This v is necessary only for constants, but you include it into each reference.

(gdb) p sizeof(ir_ref) 4 (gdb) p sizeof(ir::ref) 4 (gdb) p sizeof(ir::i32) 12 (gdb) p sizeof(ir::i64) 16

dstogov · 2023-04-11T21:45:31Z

c++/ir_test3.cxx

+	auto y = bld.param(ir::d{}, "y");
+	auto cr = bld.sub(y, bld.constval(.5));
+	auto ci = bld.copy(x);
+	auto zi = bld.copy(bld.constval(.0));


How this makes distinct between FLOAT and DOUBLE?

dstogov · 2023-04-11T22:50:33Z

Type usage in parm and call also looks more complex then in C.

The rest builder API (e.g. add, sub, etc) benefits from C++ type overloading, but note that this overloading works only if we know the type(s) statically. This is a serious limitation. For example in PHP JIT we keep an array of ir_refs to nodes that represent the current values of PHP variables (some of them DOUBLE others are LONG). Static typing won't work in this case. See https://github.com/dstogov/php-src/blob/php-ir/ext/opcache/jit/zend_jit_ir.c#L1147 . Using object with virtual method is also not an option.

Finally, it would be great to have C and C++ APIs without big differences, to have a common reference manual.

dstogov · 2023-04-11T22:56:50Z

c++/include/ir.hxx

+#include "ir.h"
+#include "ir_builder.h"


Can we include these files under ir::c namespace to hide C declarations from C++ users.

namespace ir { namespace c { # include "ir.h" # include "ir_builder.h" } const auto consistency_check = ir::c::ir_consistency_check;

weltling · 2023-05-22T21:03:43Z

After further consideration it became obvious, that the approach as in this PR isn't suitable and doesn't reflect concepts involved in the low level implementation. Some part of the code in this PR could be reused in the further work, but in general the approach needs to be chosen more deliberately. Thus, herewith is this PR abandoned.

Thanks a lot for your guidance and reviews, @dstogov!

weltling changed the title ~~API: Implement C++ wrapper~~ [RFC] API: Implement C++ wrapper Apr 10, 2023

API: Implement C++ wrapper

b08d294

Signed-off-by: Anatol Belski <[email protected]>

dstogov reviewed Apr 11, 2023

View reviewed changes

weltling closed this May 22, 2023

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[RFC] API: Implement C++ wrapper #22

[RFC] API: Implement C++ wrapper #22

weltling commented Apr 10, 2023 •

edited

Loading

dstogov left a comment

dstogov Apr 11, 2023

dstogov Apr 11, 2023

dstogov Apr 11, 2023

dstogov Apr 11, 2023

dstogov Apr 11, 2023

dstogov Apr 11, 2023

dstogov Apr 11, 2023

dstogov Apr 11, 2023

dstogov Apr 11, 2023

dstogov Apr 11, 2023

dstogov commented Apr 11, 2023

dstogov Apr 11, 2023

weltling commented May 22, 2023

		OBJS_IR_TEST2 = $(BUILD_DIR)/c++/ir_test2.o
		OBJS_IR_TEST3 = $(BUILD_DIR)/c++/ir_test3.o


		b overflow(ref op1) { return ir_fold1(&ctx, IR_OPT(IR_OVERFLOW, IR_BOOL), op1); }

		b not_cond(b op1) { return ref(ir_fold1(&ctx, IR_OPT(IR_NOT, IR_BOOL), ref(op1))); }

		#include "ir.h"
		#include "ir_builder.h"

[RFC] API: Implement C++ wrapper #22

[RFC] API: Implement C++ wrapper #22

Conversation

weltling commented Apr 10, 2023 • edited Loading

dstogov left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

dstogov commented Apr 11, 2023

Choose a reason for hiding this comment

weltling commented May 22, 2023

weltling commented Apr 10, 2023 •

edited

Loading