Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

MDEV-35785 innodb_log_file_mmap is not defined on 32-bit systems #3732

Merged
merged 2 commits into from
Jan 13, 2025

Conversation

dr-m
Copy link
Contributor

@dr-m dr-m commented Jan 7, 2025

  • The Jira issue number for this PR is: MDEV-35785

Description

HAVE_INNODB_MMAP: Remove, and unconditionally enable this code.

log_mmap(): On 32-bit systems, ensure that the size fits in 32 bits.

log_t::resize_start(), log_t::resize_abort(): Only handle memory-mapping if HAVE_PMEM is defined. The generic memory-mapped interface is only for reading the log in recovery. Writable memory mappings are only for persistent memory, that is, Linux file systems with mount -o dax.

Release Notes

The configuration parameter innodb_log_file_mmap will be available in all target platforms.

How can this PR be tested?

This is covered by the regression test suite. The test sys_vars.sysvars_innodb will ensure that the parameter is present.

I tested this by running the regression test suite on IA-32 and AMD64 in my local development environment.

Basing the PR against the correct MariaDB version

  • This is a new feature or a refactoring, and the PR is based against the main branch.
  • This is a bug fix, and the PR is based against the earliest maintained branch in which the bug can be reproduced.

PR quality check

  • I checked the CODING_STANDARDS.md file and my PR conforms to this where appropriate.
  • For any trivial modifications to the PR, I am ok with the reviewer making the changes themselves.

@dr-m dr-m self-assigned this Jan 7, 2025
@CLAassistant
Copy link

CLAassistant commented Jan 7, 2025

CLA assistant check
Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you sign our Contributor License Agreement before we can accept your contribution.
You have signed the CLA already but the status is still pending? Let us recheck it.

@dr-m
Copy link
Contributor Author

dr-m commented Jan 7, 2025

4268d64 needs to be tested on the Debian CI by @ottok, because we have neither RISC-V nor LoongArch on https://buildbot.mariadb.org.

Copy link
Contributor

@mariadb-DebarunBanerjee mariadb-DebarunBanerjee left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good that we are getting rid of some conditional paths HAVE_INNODB_MMAP. Please check the backup source file too. I wonder if some backup test should have caught it,

storage/innobase/include/univ.i Show resolved Hide resolved
@grooverdan grooverdan enabled auto-merge (rebase) January 9, 2025 02:57
@grooverdan grooverdan disabled auto-merge January 9, 2025 03:20
@ottok
Copy link
Contributor

ottok commented Jan 9, 2025

4268d64 needs to be tested on the Debian CI by @ottok, because we have neither RISC-V nor LoongArch on https://buildbot.mariadb.org.

I am in the process of testing with riscv64, but builds are very slow..
I don't have access to LoongArch but let's assume they are close enough to riscv64 in this regard.

@ottok ottok marked this pull request as draft January 9, 2025 07:43
@ottok
Copy link
Contributor

ottok commented Jan 9, 2025

I converted this to a draft so it does not get accidentally merged before it has been finalized and rebased so the fixup commit is squashed into the first one.

@ottok
Copy link
Contributor

ottok commented Jan 9, 2025

To summarize original situation in MariaDB 11.4.4:

  1. On amd64/arm64/ppc64 the output of mariadbd --help --verbose shows:

--innodb-log-file-mmap
Whether ib_logfile0 resides in persistent memory or
should initially be memory-mapped

  1. On s390x/riscv64/loong64 the description in the mariadbd --help --verbose is:

--innodb-log-file-mmap
Whether ib_logfile0 should initially be memory-mapped

  1. On i386, armel and armhf the --innodb-log-file-mmap is completely missing.

@dr-m
Copy link
Contributor Author

dr-m commented Jan 9, 2025

By the way, have you noticed tha ton s390x the description in the mariadbd --help --verbose is:

--innodb-log-file-mmap
Whether ib_logfile0 should initially be memory-mapped

On all other archs it seems to be:

--innodb-log-file-mmap
Whether ib_logfile0 resides in persistent memory or
should initially be memory-mapped

(and on some archs --innodb-log-file-mmap is completely missing, which this PR is about to fix)

By the way, have you noticed tha ton s390x the description in the mariadbd --help --verbose is:

--innodb-log-file-mmap
Whether ib_logfile0 should initially be memory-mapped

On all other archs it seems to be:

--innodb-log-file-mmap
Whether ib_logfile0 resides in persistent memory or
should initially be memory-mapped

(and on some archs --innodb-log-file-mmap is completely missing, which this PR is about to fix)

I pointed this out a week ago, including some reasoning why we can’t support persistent memory on IBM System Z (s390x), or on any 32-bit architecture.

@dr-m
Copy link
Contributor Author

dr-m commented Jan 9, 2025

I converted this to a draft so it does not get accidentally merged before it has been finalized and rebased so the fixup commit is squashed into the first one.

I intend this to be merged as 2 commits, addressing separate issues in the same area.

@@ -5,13 +5,13 @@

--vertical_results
--replace_regex /^\/\S+/PATH/ /\.\//PATH/
--replace_result 'resides in persistent memory or ' ''
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Alright, I see this unifies the variation in the help text:

   --innodb-log-file-mmap 
-                      Whether ib_logfile0 resides in persistent memory or
-                      should initially be memory-mapped
+                      Whether ib_logfile0 should initially be memory-mapped
                       (Defaults to on; use --skip-innodb-log-file-mmap to disable.)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That’s right.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Typically places that have documentation to users show one version per software version. For example https://manpages.debian.org/unstable/mariadb-server-core/mariadbd.8.en.html has 3 versions for the 3 versions of MariaDB in Debian currently. Now it seems that the text will be different depending on architecture, so documentation should maybe have its own version per architecture.. obviously that would not make sense in any online collection of documentation, but for binaries outputting different help texts on different platforms it is less clear that may be the case

@@ -5,13 +5,13 @@

--vertical_results
--replace_regex /^\/\S+/PATH/ /\.\//PATH/
--replace_result 'resides in persistent memory or ' ''
select VARIABLE_NAME, SESSION_VALUE, DEFAULT_VALUE, VARIABLE_SCOPE, VARIABLE_TYPE, VARIABLE_COMMENT, NUMERIC_MIN_VALUE, NUMERIC_MAX_VALUE, NUMERIC_BLOCK_SIZE, ENUM_VALUE_LIST, READ_ONLY, COMMAND_LINE_ARGUMENT from information_schema.system_variables
where variable_name like 'innodb%' and
variable_name not in (
'innodb_numa_interleave', # only available WITH_NUMA
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

..and this seems to be a collection of all variables that appear/disappear based on platform capabilities, even though the source code and dependency versions would be the same.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Typically places that have documentation to users show one version per software version. For example https://manpages.debian.org/unstable/mariadb-server-core/mariadbd.8.en.html has 3 versions for the 3 versions of MariaDB in Debian currently. Now it seems that the text will be different depending on architecture, so documentation should maybe have its own version per architecture.. obviously that would not make sense in any online collection of documentation, but for binaries outputting different help texts on different platforms it is less clear that may be the case

@ottok
Copy link
Contributor

ottok commented Jan 9, 2025

My riscv64 test build is still running.. so slow, and also host/queue seems to be busy

@ottok
Copy link
Contributor

ottok commented Jan 9, 2025

I converted this to a draft so it does not get accidentally merged before it has been finalized and rebased so the fixup commit is squashed into the first one.

I intend this to be merged as 2 commits, addressing separate issues in the same area.

Thanks for clarification. I misunderstood that "fixup" keyword as it (or "!fixup") is typically used as command words for git rebase --interactive --autosquash but seems in this case it is meant for humans to parse together with the MDEV number.

The third commit is a merge commit, surely you don't want that?

Maybe it came because Daniel updated with "Merge" instead of "Rebase" in the UI? (Wondering why GitHub even offers merge updates at all, or as primary option)

image

@dr-m
Copy link
Contributor Author

dr-m commented Jan 10, 2025

I prefer to push this manually (first rebasing and force-pushing the source branch, then pushing it to the target branch as soon as the tests on the mandatory builders have completed), because it is not clear to me how and when the GitHub auto-merge is being triggered.
I’m eagerly looking forward to the LoongArch and RISC-V test results.
I could also make the parameter documentation string the same on all platforms, replacing the compile-time condition with a comprehension-time condition, something like this:

Whether ib_logfile0 resides in persistent memory (when supported) or should initially be memory-mapped

Would that be acceptable?

@ottok
Copy link
Contributor

ottok commented Jan 10, 2025

Unfortunately the riscv64 build is now failing on unrelated error:

[ 39%] Building CXX object storage/perfschema/CMakeFiles/perfschema.dir/table_esgs_by_account_by_event_name.cc.o
cd /<<PKGBUILDDIR>>/builddir/storage/perfschema && /usr/bin/c++ -DHAVE_CONFIG_H -DMYSQL_SERVER -D_FILE_OFFSET_BITS=64 -I/<<PKGBUILDDIR>>/wsrep-lib/include -I/<<PKGBUILDDIR>>/wsrep-lib/wsrep-API/v26 -I/<<PKGBUILDDIR>>/builddir/include -I/<<PKGBUILDDIR>>/include/providers -I/<<PKGBUILDDIR>>/include -I/<<PKGBUILDDIR>>/sql -I/<<PKGBUILDDIR>>/builddir/sql -I/<<PKGBUILDDIR>>/builddir/storage/perfschema -g -O2 -fno-omit-frame-pointer -ffile-prefix-map=/<<PKGBUILDDIR>>=. -fstack-protector-strong -Wformat -Werror=format-security -fno-stack-clash-protection -fdebug-prefix-map=/<<PKGBUILDDIR>>=/usr/src/mariadb-1:11.4.4-3~bpo25.04.1~1736454963.86e3d18b7fc+dev.otto -Wdate-time -D_FORTIFY_SOURCE=3 -Wdate-time -D_FORTIFY_SOURCE=3 -pie -fPIC -fstack-protector --param=ssp-buffer-size=4 -O2 -g -fno-omit-frame-pointer -fno-strict-aliasing -Wno-uninitialized -fno-omit-frame-pointer -DDBUG_OFF -Wall -Wenum-compare -Wenum-conversion -Wextra -Wformat-security -Wmissing-braces -Wno-format-truncation -Wno-init-self -Wno-nonnull-compare -Wno-unused-parameter -Wnon-virtual-dtor -Woverloaded-virtual -Wvla -Wwrite-strings -std=gnu++11   -Wdate-time -D_FORTIFY_SOURCE=3 -DHAVE_OPENSSL -DOPENSSL_API_COMPAT=0x10100000L  -fvisibility=hidden -MD -MT storage/perfschema/CMakeFiles/perfschema.dir/table_esgs_by_account_by_event_name.cc.o -MF CMakeFiles/perfschema.dir/table_esgs_by_account_by_event_name.cc.o.d -o CMakeFiles/perfschema.dir/table_esgs_by_account_by_event_name.cc.o -c /<<PKGBUILDDIR>>/storage/perfschema/table_esgs_by_account_by_event_name.cc
In file included from /usr/include/unicode/uenum.h:25,
                 from /usr/include/unicode/ucnv.h:52,
                 from /usr/include/libxml2/libxml/encoding.h:31,
                 from /usr/include/libxml2/libxml/parser.h:812,
                 from /<<PKGBUILDDIR>>/storage/connect/libdoc.cpp:8:
/usr/include/unicode/localpointer.h:561:26: error: parameter declared ‘auto’
  561 | template <typename Type, auto closeFunction>
      |                          ^~~~
/usr/include/unicode/localpointer.h:573:76: error: template argument 2 is invalid
  573 |     explicit LocalOpenPointer(std::unique_ptr<Type, decltype(closeFunction)> &&p)
      |                                                                            ^
/usr/include/unicode/localpointer.h:583:78: error: template argument 2 is invalid
  583 |     LocalOpenPointer &operator=(std::unique_ptr<Type, decltype(closeFunction)> &&p) {
      |                                                                              ^
/usr/include/unicode/localpointer.h:599:59: error: template argument 2 is invalid
  599 |     operator std::unique_ptr<Type, decltype(closeFunction)> () && {
      |                                                           ^
/usr/include/unicode/uenum.h:69:1: note: invalid template non-type parameter
   69 | U_DEFINE_LOCAL_OPEN_POINTER(LocalUEnumerationPointer, UEnumeration, uenum_close);
      | ^~~~~~~~~~~~~~~~~~~~~~~~~~~
/usr/include/unicode/ucnv.h:597:1: note: invalid template non-type parameter
  597 | U_DEFINE_LOCAL_OPEN_POINTER(LocalUConverterPointer, UConverter, ucnv_close);
      | ^~~~~~~~~~~~~~~~~~~~~~~~~~~
make[4]: *** [storage/connect/CMakeFiles/connect.dir/build.make:796: storage/connect/CMakeFiles/connect.dir/libdoc.cpp.o] Error 1
make[4]: Leaving directory '/<<PKGBUILDDIR>>/builddir'
make[3]: *** [CMakeFiles/Makefile2:6589: storage/connect/CMakeFiles/connect.dir/all] Error 2

Trying to debug it..

@ottok
Copy link
Contributor

ottok commented Jan 11, 2025

Finally https://launchpadlibrarian.net/769798422/buildlog_ubuntu-oracular-riscv64.mariadb_1%3A11.4.4-3~bpo24.10.1~1736527158.86e3d18b7fc+dev.otto_BUILDING.txt.gz

innodb-log-file-mmap is now behaving identical to amd64 and also all main suite tests pass so no regressions.

The servers were restarted 230 times
Spent 6990.452 of 1577 seconds executing testcases
Completed: All 1108 tests were successful.
``

Looks good to me!

@ottok
Copy link
Contributor

ottok commented Jan 11, 2025

I could also make the parameter documentation string the same on all platforms, replacing the compile-time condition with a comprehension-time condition, something like this:

Whether ib_logfile0 resides in persistent memory (when supported) or should initially be memory-mapped

Would that be acceptable?

Sound good.The text from --help and first sentence of wiki (https://mariadb.com/kb/en/innodb-system-variables/#innodb_log_file_mmap) should perhaps be identical so that they are aligned and later easy to keep aligned?

@dr-m
Copy link
Contributor Author

dr-m commented Jan 11, 2025

@ottok Thank you for the testing. Did you have a chance to run this on LoongArch? This is about covering the following function:

void pmem_persist(const void *buf, size_t size)
{
# if defined __riscv && __riscv_xlen == 64
  __asm__ __volatile__("fence w,w" ::: "memory");
# elif defined __loongarch64
  __asm__ __volatile__("dbar 0" ::: "memory");
# else
#  error "Missing implementation; recompile with cmake -DWITH_INNODB_PMEM=OFF"
# endif
}

According to https://godbolt.org, fence w,w compiles into 0x0110000f, which matches
https://riscv-software-src.github.io/riscv-unified-db/manual/html/isa/20240411/insts/fence.html
with the following parameters:

  • fm=0 (fence mode: normal)
  • pred=1 (PW)
  • succ=1 (SW)
  • rs1=0, rd=0 (ignored)
    If I understand the reference correctly, implementing this instruction is mandatory.

https://loongson.github.io/LoongArch-Documentation/LoongArch-Vol1-EN#_dbar
https://gcc.gnu.org/onlinedocs/gcc/LoongArch-Base-Built-in-Functions.html mentions that __builtin_loongarch_dbar() is available in all ISA versions. I’m using inline assembly, because older compilers might not implement that built-in function. For clang, it was added in D136906. The oldest that https://godbolt.org offers is clang-17, which generates the same binary encoding (0x38720000) for both __builtin_loongarch_dbar(0) and dbar 0. Likewise, all GCC versions available there generate the same binary encoding.

Hence, it seems to me that it is not strictly necessary to wait for the test results for RISC-V. Curiously, https://gcc.gnu.org/onlinedocs/gcc/RISC-V-Built-in-Functions.html does not mention any barrier instructions, only __builtin_riscv_pause(), which MY_RELAX_CPU() currently fails to put into use. The my_atomic_cas32_strong_explicit() fallback ought to be much more expensive. It could be the topic of another patch (which would be nice to get tested):

diff --git a/include/my_cpu.h b/include/my_cpu.h
index 003087dd94d..0aa3c748d27 100644
--- a/include/my_cpu.h
+++ b/include/my_cpu.h
@@ -104,6 +104,8 @@ static inline void MY_RELAX_CPU(void)
 #elif defined __GNUC__ && (defined __arm__ || defined __aarch64__)
   /* Mainly, prevent the compiler from optimizing away delay loops */
   __asm__ __volatile__ ("":::"memory");
+#elif defined __riscv
+  __builtin_riscv_pause();
 #else
   int32 var, oldval = 0;
   my_atomic_cas32_strong_explicit(&var, &oldval, 1, MY_MEMORY_ORDER_RELAXED,

Curiously, __builtin_riscv_pause() uses almost the same encoding as fence w,w (0x0100000f, only differing in the parameter succ=0). The patch for implementing __builtin_riscv_pause() is accompanied with a statement that the fence instruction encoding is valid for all ISA versions.

@ottok
Copy link
Contributor

ottok commented Jan 13, 2025

Did you have a chance to run this on LoongArch?

No, I don't have access to a LoongArch machine other than as an official Debian builder. I will upload the contents of https://salsa.debian.org/mariadb-team/mariadb-server/-/commits/debian/latest to Debian now so results for the full build and Debian CI on all platforms should be available in a couple of days.

Commit log of what uploaded to Debian is visible at https://salsa.debian.org/mariadb-team/mariadb-server/-/commits/debian/latest

dr-m added 2 commits January 13, 2025 07:27
innodb_log_file_mmap: Use a constant documentation string that
refers to persistent memory also when it is not available in the build.

HAVE_INNODB_MMAP: Remove, and unconditionally enable this code.

log_mmap(): On 32-bit systems, ensure that the size fits in 32 bits.

log_t::resize_start(), log_t::resize_abort(): Only handle memory-mapping
if HAVE_PMEM is defined. The generic memory-mapped interface is only for
reading the log in recovery. Writable memory mappings are only for
persistent memory, that is, Linux file systems with mount -o dax.

Reviewed by: Debarun Banerjee, Otto Kekäläinen
Let us enable pmem_persist() on RISC-V and LoongArch, because those are
available in the Debian CI.
In commit 3f9f5ca these were initially
disabled by default.

According to the available documentation, these instructions are
available in all ISA versions. On LoongArch there would also be
__builtin_loongarch_dbar() that generates the same code.
@dr-m dr-m force-pushed the 10.11-MDEV-35785 branch from d5867eb to 4fc3a44 Compare January 13, 2025 06:07
@dr-m dr-m merged commit 4fc3a44 into 10.11 Jan 13, 2025
10 of 13 checks passed
@dr-m dr-m deleted the 10.11-MDEV-35785 branch January 13, 2025 06:38
@dr-m
Copy link
Contributor Author

dr-m commented Jan 13, 2025

I filed #3752 for putting __builtin_riscv_pause() into use.

@ottok
Copy link
Contributor

ottok commented Jan 18, 2025

Follow-up:

@dr-m
Copy link
Contributor Author

dr-m commented Jan 22, 2025

Thank you, Otto!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Development

Successfully merging this pull request may close these issues.

5 participants