Crashes on Windows on ARM #304

Open
orditeck opened this issue Dec 16, 2024 · 68 comments

Comments

@orditeck

Bug description
Every time I put my Surface Pro 11 to sleep and wake it up, the Syncthing Tray icon has disappeared and syncthing isn't running anymore. Doesn't happen with any other app.

Addition: it also just crashed after I right-clicked on the icon, so I guess it's not only after sleep.

Version: syncthingtray-1.6.4-aarch64-w64-mingw32
Qt 6.8.1

@orditeck orditeck added the bug label Dec 16, 2024
@orditeck
Author

Now using the x64 version, emulated for ARM, drop-in replacement and no crash so far.

@Martchus
Owner

Martchus commented Dec 16, 2024

Sounds similar to what has been discussed on #218 (comment). Probably it makes sense to have a distinct issue for that so I'm not closing this as duplicate.

Considering I don't have aarch64 hardware to reproduce this, I'm limited in how I can help here. It probably makes sense for you to debug this further on your own, but I can of course try to assist.

Maybe it makes a difference that the ARM builds use libc++ instead of libstdc++. Another difference is that the ARM build uses QProcess instead of Boost.Process.

@Martchus Martchus mentioned this issue Dec 16, 2024
6 tasks
@Martchus
Owner

Martchus commented Dec 16, 2024

Without a stacktrace this is hard to pin down. So I recommend you create a debug build using MSYS2's mingw-w64 packaging and reproduce/debug the crash with gdb. You can find build instructions in the README of c++utilities (another repo on my GitHub page). If you paste a stacktrace here I can also help with the further investigation.

@Martchus Martchus changed the title Crashes on ARM Crashes on Windows on ARM Dec 17, 2024
@Martchus
Owner

Martchus commented Dec 22, 2024

I was able to compile the latest version of Boost for Windows on ARM with the Boost.Process library using the patch from msys2/MINGW-packages@da4f4c3#diff-35ebcd8b8f3e30bb84ddbfad90b6dd1ed6f063ae4b2197b55464081b736f5a28R85. I was also able to compile the latest Syncthing Tray release against it using Boost.Process. The builds that are supposed to be statically linked are unfortunately dynamically linked so one can't test the binaries easily at the moment. (The linking problem has been fixed in c++utilities and thus will be fixed on the next release. Maybe I can provide development builds in the meantime. I also came up with a change in my release scripts to prevent uploading binaries that depend on DLLs not provided by Windows itself.)

So on the next release the ARM build will be closer to the x86_64 build; maybe that's already enough. (Somehow I suspect it is the use of libc++, though.)

@theAeon

theAeon commented Dec 26, 2024

Currently running the arm build myself and have both VS devtools and mingw installed - I can try giving stuff a debug compile when i get the chance later.

@theAeon

theAeon commented Dec 27, 2024

okay for whatever its worth it appears to have survived multiple sleep wake cycles w/ boost-process

will let you know if/when it dies

@theAeon

theAeon commented Dec 28, 2024

yeah i think that was it

@Martchus
Owner

So it works with Boost.Process? That's good news. I only hope that my cross-compiled builds using static libraries will work as well.

@theAeon

theAeon commented Dec 28, 2024

That it does. Feel free to shoot over a test build if you'd like.

@theAeon

theAeon commented Jan 1, 2025

....that one crashed on first wake.

for whatever its worth its not dying on debug attach so i should be able to get a proper trace.

@theAeon

theAeon commented Jan 1, 2025

ModLoad: 00007ffc`86d50000 00007ffc`86d90000   C:\windows\SYSTEM32\gpapi.dll
(245ac.29144): C++ EH exception - code e06d7363 (first chance)
(245ac.29144): C++ EH exception - code e06d7363 (first chance)
(245ac.29144): C++ EH exception - code e06d7363 (first chance)
(245ac.29144): C++ EH exception - code e06d7363 (first chance)
ModLoad: 00007ffc`3c7e0000 00007ffc`3c974000   C:\windows\SYSTEM32\d3d9on12.dll
ModLoad: 00007ffc`3c020000 00007ffc`3c43e000   C:\windows\SYSTEM32\D3D12Core.dll
ModLoad: 00007ffc`3b400000 00007ffc`3bc77000   C:\windows\System32\DriverStore\FileRepository\qcdx8380.inf_arm64_cd283b9bc940b474\qcdx12arm64xum.dll
ModLoad: 00007ffb`c5ce0000 00007ffb`c5f40000   C:\windows\System32\DriverStore\FileRepository\qcdx8380.inf_arm64_cd283b9bc940b474\qcegparm64x.DLL
ModLoad: 00007ffc`3a6c0000 00007ffc`3a9a2000   C:\windows\SYSTEM32\dxilconv.dll
ModLoad: 00007ffc`67d20000 00007ffc`6a509000   C:\windows\System32\DriverStore\FileRepository\qcdx8380.inf_arm64_cd283b9bc940b474\qcgpuarm64xcompilercore.dll
ModLoad: 00007ffc`67a70000 00007ffc`67ac1000   C:\windows\SYSTEM32\D3DSCache.dll
ModLoad: 00007ffc`83cf0000 00007ffc`83d1a000   C:\windows\SYSTEM32\resourcepolicyclient.dll
ModLoad: 00007ffc`495c0000 00007ffc`4967b000   C:\windows\system32\dataexchange.dll
ModLoad: 00007ffc`72be0000 00007ffc`72e68000   C:\windows\SYSTEM32\textinputframework.dll
ModLoad: 00007ffc`7c5d0000 00007ffc`7c85b000   C:\windows\SYSTEM32\CoreMessaging.dll
ModLoad: 00007ffc`6c6c0000 00007ffc`6cd2f000   C:\windows\SYSTEM32\CoreUIComponents.dll
ModLoad: 00007ffc`49860000 00007ffc`4994c000   C:\windows\system32\Oleacc.dll
ModLoad: 00007ffc`49d50000 00007ffc`4a58d000   C:\windows\SYSTEM32\uiautomationcore.DLL
ModLoad: 00007ffc`88100000 00007ffc`88200000   C:\windows\SYSTEM32\sxs.dll
mincore\com\oleaut32\dispatch\ups.cpp(2126)\OLEAUT32.dll!00007FFC8A44DAB0: (caller: 00007FFC8A44F7EC) ReturnHr(1) tid(29144) 8002801D Library not registered.
(245ac.29144): C++ EH exception - code e06d7363 (first chance)
(245ac.29144): C++ EH exception - code e06d7363 (first chance)
(245ac.1dcc): Security check failure or stack buffer overrun - code c0000409 (!!! second chance !!!)
Subcode: 0xa FAST_FAIL_GUARD_ICALL_CHECK_FAILURE 
ntdll!RtlFailFast2:
00007ffc`8d02ee20 d43e0060 brk         #0xF003
0:028> !analyze -v
*******************************************************************************
*                                                                             *
*                        Exception Analysis                                   *
*                                                                             *
*******************************************************************************


KEY_VALUES_STRING: 1

    Key  : Analysis.CPU.mSec
    Value: 1718

    Key  : Analysis.Elapsed.mSec
    Value: 75591

    Key  : Analysis.IO.Other.Mb
    Value: 16

    Key  : Analysis.IO.Read.Mb
    Value: 1

    Key  : Analysis.IO.Write.Mb
    Value: 55

    Key  : Analysis.Init.CPU.mSec
    Value: 1281

    Key  : Analysis.Init.Elapsed.mSec
    Value: 247723

    Key  : Analysis.Memory.CommitPeak.Mb
    Value: 184

    Key  : Analysis.Version.DbgEng
    Value: 10.0.27704.1001

    Key  : Analysis.Version.Description
    Value: 10.2408.27.01 arm64fre

    Key  : Analysis.Version.Ext
    Value: 1.2408.27.1

    Key  : CFG.InvalidUserCallTarget.Detected
    Value: 1

    Key  : FailFast.Name
    Value: GUARD_ICALL_CHECK_FAILURE

    Key  : FailFast.Type
    Value: 10

    Key  : Failure.Bucket
    Value: FAIL_FAST_GUARD_ICALL_CHECK_FAILURE_c0000409_powrprof.dll!PowerpResumeSuspendCallback

    Key  : Failure.Hash
    Value: {991fe02e-bb2a-c2c7-02b1-03b2ee7aaecd}

    Key  : Timeline.OS.Boot.DeltaSec
    Value: 1816313

    Key  : Timeline.Process.Start.DeltaSec
    Value: 247

    Key  : WER.OS.Branch
    Value: ge_release

    Key  : WER.OS.Version
    Value: 10.0.26100.1

    Key  : WER.Process.Version
    Value: 1.7.0.0


NTGLOBALFLAG:  70

APPLICATION_VERIFIER_FLAGS:  0

EXCEPTION_RECORD:  (.exr -1)
ExceptionAddress: 00007ffc8d02ee20 (ntdll!RtlFailFast2)
   ExceptionCode: c0000409 (Security check failure or stack buffer overrun)
  ExceptionFlags: 00000001
NumberParameters: 1
   Parameter[0]: 000000000000000a
Subcode: 0xa FAST_FAIL_GUARD_ICALL_CHECK_FAILURE 

FAULTING_THREAD:  00001dcc

PROCESS_NAME:  syncthingtray-qt6-static.exe

ERROR_CODE: (NTSTATUS) 0xc0000409 - The system detected an overrun of a stack-based buffer in this application. This overrun could potentially allow a malicious user to gain control of this application.

EXCEPTION_CODE_STR:  c0000409

EXCEPTION_PARAMETER1:  000000000000000a

STACK_TEXT:  
000000fc`a59fec50 00007ffc`8cfcb518     : 000000fc`a59fec70 7b60fffc`8d01ebd8 0000025e`cc10ae30 00000000`00001000 : ntdll!RtlFailFast2
000000fc`a59fec50 00007ffc`8d01ebd8     : 000000fc`a59fec70 7b60fffc`8d01ebd8 0000025e`cc10ae30 00000000`00001000 : ntdll!RtlpHandleInvalidUserCallTarget+0x78
000000fc`a59fec70 00007ffc`883751e4     : 000000fc`a59fed60 00007ffc`883751e4 00000000`00000000 00000000`00000000 : ntdll!LdrpHandleInvalidUserCallTarget+0x38
000000fc`a59fed60 00007ffc`88346654     : 000000fc`a59ff490 de1f7ffc`88346654 00000000`00000000 00000000`00000000 : powrprof!PowerpResumeSuspendCallback+0xb4
000000fc`a59ff0b0 00007ffc`88346410     : 00000000`00000028 00000000`00000320 20000000`20000000 00000000`000002e0 : UMPDC!PdcpAlpcProcessMessage+0x20c
000000fc`a59ff4c0 00007ffc`8cf63300     : 000000fc`a59ff550 29157ffc`8cf63300 0000025e`cc122ae0 000000fc`a59ff6c8 : UMPDC!PdcpAlpcCallback+0x50
000000fc`a59ff4f0 00007ffc`8cf6dbac     : 00000000`00000002 0000025e`cc000000 00000000`00000000 00000000`00000000 : ntdll!TppAlpcpExecuteCallback+0x480
000000fc`a59ff5b0 00007ffc`8a968740     : 00010101`01000101 00000001`00000000 00000000`00000000 00000000`00000000 : ntdll!TppWorkerThread+0x7ac
000000fc`a59ff880 00007ffc`8cf91084     : 000000fc`a59ff890 e1547ffc`8cf91084 00000000`00000000 89690000`00000000 : KERNEL32!BaseThreadInitThunk+0x40
000000fc`a59ff890 00000000`00000000     : 00000000`00000000 89690000`00000000 00000000`00000000 00000000`00000000 : ntdll!RtlUserThreadStart+0x44


SYMBOL_NAME:  powrprof!PowerpResumeSuspendCallback+b4

MODULE_NAME: powrprof

IMAGE_NAME:  powrprof.dll

STACK_COMMAND:  ~28s ; .cxr ; kb

FAILURE_BUCKET_ID:  FAIL_FAST_GUARD_ICALL_CHECK_FAILURE_c0000409_powrprof.dll!PowerpResumeSuspendCallback

OS_VERSION:  10.0.26100.1

BUILDLAB_STR:  ge_release

OSPLATFORM_TYPE:  arm64

OSNAME:  Windows 10

IMAGE_VERSION:  10.0.26100.1882

FAILURE_ID_HASH:  {991fe02e-bb2a-c2c7-02b1-03b2ee7aaecd}

Followup:     MachineOwner
---------
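
A note on reading the dump above: the ERROR_CODE text talks about a stack-based buffer overrun, but c0000409 is also the status raised by a fast-fail, and the subcode in Parameter[0] says which check tripped. Here 0xa is the Control Flow Guard indirect-call check, as the debugger already prints. A minimal lookup sketch, assuming the FAST_FAIL_* numbering from winnt.h; the function name fast_fail_name is made up for illustration and only a handful of codes are listed:

```shell
#!/bin/sh
# Hypothetical helper (not part of any tool mentioned in this thread):
# map a fast-fail subcode (Parameter[0] of exception c0000409) to its
# FAST_FAIL_* name from winnt.h. Only a few codes are covered here.
fast_fail_name() {
    # Accept decimal ("10") or 0x-prefixed hex ("0xa") input.
    code=$(printf '%d' "$1" 2>/dev/null) || return 1
    case "$code" in
        0)  printf '%s\n' FAST_FAIL_LEGACY_GS_VIOLATION ;;
        2)  printf '%s\n' FAST_FAIL_STACK_COOKIE_CHECK_FAILURE ;;
        3)  printf '%s\n' FAST_FAIL_CORRUPT_LIST_ENTRY ;;
        7)  printf '%s\n' FAST_FAIL_FATAL_APP_EXIT ;;
        10) printf '%s\n' FAST_FAIL_GUARD_ICALL_CHECK_FAILURE ;;
        *)  printf '%s\n' "unlisted fast-fail code $code" ;;
    esac
}
```

Calling fast_fail_name 0xa gives the same name that WinDbg shows next to "Subcode", which points at an invalid indirect call target and not at a classic buffer overrun.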

@Martchus
Owner

Martchus commented Jan 1, 2025

Probably my builds are done with more security hardening flags than your builds so a problem with the code leads to this hard crash.

I have no idea what the problem with my code could be, though. The stacktrace doesn't point to a concrete function of my code or Qt. I'm not even sure where the thread comes from.

Syncthing Tray actually has code that runs explicitly after the system has been resumed (the event filter branch guarded by if (eventType == "windows_generic_MSG") {), but I cannot see a problem with it, and the stacktrace also doesn't point to this handler. The Qt documentation also says that this kind of event filter is executed on the main thread, so if it were the culprit the crash would have happened on the main thread.

Those are the compile flags I used to compile Syncthing Tray and all of its dependencies: https://github.com/Martchus/PKGBUILDs/blob/master/environment/mingw-w64-clang/mingw-env.sh
Maybe the use of -mguard=cf and/or -fcf-protection is problematic here.

@theAeon

theAeon commented Jan 1, 2025

want me to rebuild on my end w/ your flags?

@Martchus
Owner

Martchus commented Jan 1, 2025

You could try; I'm not sure whether it'll make a difference when just compiling Syncthing Tray itself with different flags. Maybe the compilation of Qt, other dependencies and mingw-w64 itself is also relevant.

@Martchus
Owner

Martchus commented Jan 1, 2025

It looks like the use of --enable-cfguard when compiling mingw-w64-crt is at least also present in MSYS2 (https://github.com/msys2/MINGW-packages/blob/master/mingw-w64-crt-git/PKGBUILD#L51).

(This page contains more information about possibly relevant flags: https://gist.github.com/alvinhochun/a65e4177e2b34d551d7ecb02b55a4b0a)

@theAeon

theAeon commented Jan 1, 2025

    Run Build Command(s): C:/rtools44-aarch64/clangarm64/bin/ninja.exe -v cmTC_41189
    [1/2] C:\rtools44-aarch64\clangarm64\bin\cc.exe   -rtlib=compiler-rt -fuse-ld=lld -mguard=cf -D_FORTIFY_SOURCE=3 -D_GLIBCXX_ASSERTIONS -O2 -pipe -fno-plt -fexceptions --param=ssp-buffer-size=4 -Wformat -Werror=format-security -mguard=cf -fcf-protection -MD -MT CMakeFiles/cmTC_41189.dir/testCCompiler.c.obj -MF CMakeFiles\cmTC_41189.dir\testCCompiler.c.obj.d -o CMakeFiles/cmTC_41189.dir/testCCompiler.c.obj -c E:/build/CMakeFiles/CMakeScratch/TryCompile-tsfez2/testCCompiler.c
    FAILED: CMakeFiles/cmTC_41189.dir/testCCompiler.c.obj
    C:\rtools44-aarch64\clangarm64\bin\cc.exe   -rtlib=compiler-rt -fuse-ld=lld -mguard=cf -D_FORTIFY_SOURCE=3 -D_GLIBCXX_ASSERTIONS -O2 -pipe -fno-plt -fexceptions --param=ssp-buffer-size=4 -Wformat -Werror=format-security -mguard=cf -fcf-protection -MD -MT CMakeFiles/cmTC_41189.dir/testCCompiler.c.obj -MF CMakeFiles\cmTC_41189.dir\testCCompiler.c.obj.d -o CMakeFiles/cmTC_41189.dir/testCCompiler.c.obj -c E:/build/CMakeFiles/CMakeScratch/TryCompile-tsfez2/testCCompiler.c
    error: option 'cf-protection=return' cannot be specified on this target
    error: option 'cf-protection=branch' cannot be specified on this target

it won't even let me 😅

@R-Goc

R-Goc commented Jan 1, 2025

This stack trace doesn't even leave Windows DLLs. Unless it is corrupted, main wasn't even called. You might want to report it to one of the Microsoft forums, not sure which one.

@Martchus
Owner

Martchus commented Jan 1, 2025

Maybe PowerpResumeSuspendCallback is trying to call some callback provided by application code but the address of the provided callback is wrong. However, none of my code is using these WinAPIs trying to supply some kind of callback. Maybe some other piece of code in the software stack does (e.g. Qt, however, I doubt it and couldn't find anything after a brief search).

@R-Goc

R-Goc commented Jan 1, 2025

Seems a bit weird then that it would cause this error.

@R-Goc

R-Goc commented Jan 1, 2025

Never mind, you're right. RtlpHandleInvalidUserCallTarget handles a call to an invalid function pointer.

@R-Goc

R-Goc commented Jan 1, 2025

If you could read the registers it would be nice. The function pointer should still be there.

@Martchus
Owner

Martchus commented Jan 1, 2025

I suppose the relevant WinAPI function where one would pass the callback is PowerRegisterSuspendResumeNotification (https://learn.microsoft.com/en-us/windows/win32/api/powerbase/nf-powerbase-powerregistersuspendresumenotification). I'm just wondering where this function is called. I've already checked Qt, relevant Boost libraries and LLVM but none of these code bases contain PowerRegisterSuspendResumeNotification. Of course this function might also be called by any of the loaded DLLs for whatever reason.

@R-Goc

R-Goc commented Jan 1, 2025

That is my suspicion. As long as this can be reproduced and you check where the pointer points to it should be done.

@theAeon

theAeon commented Jan 1, 2025

on it.

@R-Goc

R-Goc commented Jan 1, 2025

Here is a blog from a similar failure. https://devblogs.microsoft.com/oldnewthing/20240913-00/?p=110257

@Martchus
Owner

Martchus commented Jan 2, 2025

At least the flag when compiling mingw-w64 shouldn't hurt as MSYS2 enabled it as well. They also have tests using -mguard=cf which probably run on all archs. See msys2/MINGW-packages#16833 for details.

It also looks like llvm-mingw enables it for all archs: https://github.com/mstorsjo/llvm-mingw/blob/master/Dockerfile

@theAeon

theAeon commented Jan 2, 2025

probably worth noting that mguard=cf, enable-cfguard, and fcf-protection are all different flags

@theAeon

theAeon commented Jan 2, 2025

only one that clangarm is giving me trouble for using is fcf

@Martchus
Owner

Martchus commented Jan 2, 2025

I know but I'm also not using fcf. Sorry for the confusion, I initially thought as well that fcf might be the culprit but it is not used at all in my builds as it is only used when the condition [[ $_arch_ == aarch64 ]] is not true (see https://github.com/Martchus/PKGBUILDs/blob/9ab7f046f27d9abfade7efdb5c31c9b39d16746d/environment/mingw-w64-clang/mingw-env.sh#L13).
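
The condition in that script can be pictured roughly like this. This is a sketch and not a copy of mingw-env.sh: the function name hardening_flags and the reduced flag set are assumptions for illustration. It only shows the idea of adding -fcf-protection for non-aarch64 targets, since Clang rejects that option on aarch64 (see the cf-protection errors earlier in the thread).

```shell
#!/bin/sh
# Hypothetical helper, not taken from mingw-env.sh: print the control-flow
# hardening flags to use for a given target architecture.
hardening_flags() {
    arch=$1
    # -mguard=cf (Control Flow Guard) is used on every architecture.
    flags="-mguard=cf"
    case "$arch" in
        aarch64)
            # -fcf-protection is not accepted for aarch64 targets, so skip it.
            ;;
        *)
            flags="$flags -fcf-protection"
            ;;
    esac
    printf '%s\n' "$flags"
}
```

With this shape, hardening_flags aarch64 yields only -mguard=cf, which matches the statement that fcf is not used at all in the ARM builds.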

@theAeon

theAeon commented Jan 2, 2025

Bash syntax, love it.

gonna see what happens if its a personal build w/o fcf then

@theAeon

theAeon commented Jan 2, 2025

misaligned ldr/str offset?

@theAeon

theAeon commented Jan 2, 2025

C:\windows\system32\cmd.exe /C "cd . && C:\rtools44-aarch64\clangarm64\bin\c++.exe -rtlib=compiler-rt -fuse-ld=lld -mguard=cf -D_FORTIFY_SOURCE=3 -D_GLIBCXX_ASSERTIONS -O2 -pipe -fno-plt -fexceptions --param=ssp-buffer-size=4 -Wformat -Werror=format-security -mguard=cf -stdlib=libc++ -g -rtlib=compiler-rt -fuse-ld=lld -mguard=cf -Wl,-O1,--sort-common,--as-needed -fstack-protector syncthingtray/cli/CMakeFiles/syncthingctl-cli.dir/syncthingctl-cli_autogen/mocs_compilation.cpp.obj syncthingtray/cli/CMakeFiles/syncthingctl-cli.dir/resources/windows-cli-wrapper-Debug.rc.obj syncthingtray/cli/CMakeFiles/syncthingctl-cli.dir/E_/syncthing/c++utilities/cmake/templates/cli-wrapper.cpp.obj -o syncthingtray\cli\syncthingctl-cli.exe -Wl,--out-implib,syncthingtray\cli\libsyncthingctl-cli.dll.a -Wl,--major-image-version,0,--minor-image-version,0 -LE:/build/c++utilities   -LE:/build/qtutilities   -LE:/build/syncthingtray/syncthingconnector   -LE:/lib -lkernel32 -luser32 -lgdi32 -lwinspool -lshell32 -lole32 -loleaut32 -luuid -lcomdlg32 -ladvapi32 && cd ."
ld.lld: error: misaligned ldr/str offset
ld.lld: error: misaligned ldr/str offset
ld.lld: error: misaligned ldr/str offset
ld.lld: error: misaligned ldr/str offset
ld.lld: error: misaligned ldr/str offset
c++: error: linker command failed with exit code 1 (use -v to see invocation)
[2/74] Linking CXX executable syncthingtray\tray\syncthingtray-cli.exe
FAILED: syncthingtray/tray/syncthingtray-cli.exe
C:\windows\system32\cmd.exe /C "cd . && C:\rtools44-aarch64\clangarm64\bin\c++.exe -rtlib=compiler-rt -fuse-ld=lld -mguard=cf -D_FORTIFY_SOURCE=3 -D_GLIBCXX_ASSERTIONS -O2 -pipe -fno-plt -fexceptions --param=ssp-buffer-size=4 -Wformat -Werror=format-security -mguard=cf -stdlib=libc++ -g -rtlib=compiler-rt -fuse-ld=lld -mguard=cf -Wl,-O1,--sort-common,--as-needed -fstack-protector syncthingtray/tray/CMakeFiles/syncthingtray-cli.dir/syncthingtray-cli_autogen/mocs_compilation.cpp.obj syncthingtray/tray/CMakeFiles/syncthingtray-cli.dir/resources/windows-cli-wrapper-Debug.rc.obj syncthingtray/tray/CMakeFiles/syncthingtray-cli.dir/E_/syncthing/c++utilities/cmake/templates/cli-wrapper.cpp.obj -o syncthingtray\tray\syncthingtray-cli.exe -Wl,--out-implib,syncthingtray\tray\libsyncthingtray-cli.dll.a -Wl,--major-image-version,0,--minor-image-version,0 -LE:/build/c++utilities   -LE:/build/qtutilities   -LE:/build/syncthingtray/syncthingconnector   -LE:/build/syncthingtray/syncthingmodel   -LE:/lib   -LE:/build/lib -lkernel32 -luser32 -lgdi32 -lwinspool -lshell32 -lole32 -loleaut32 -luuid -lcomdlg32 -ladvapi32 && cd ."
ld.lld: error: misaligned ldr/str offset
ld.lld: error: misaligned ldr/str offset
ld.lld: error: misaligned ldr/str offset
ld.lld: error: misaligned ldr/str offset
ld.lld: error: misaligned ldr/str offset
c++: error: linker command failed with exit code 1 (use -v to see invocation)
[3/74] Linking CXX static library syncthingtray\syncthingconnector\libsyncthingconnector.a
ninja: build stopped: subcommand failed.

@R-Goc

R-Goc commented Jan 2, 2025

This looks curious:
llvm/llvm-project#110186

@theAeon

theAeon commented Jan 2, 2025

guess the crt as compiled by the msys guys isn't above board here, huh

edit: can't find any issue reports, granted, so that seems unlikely

@Martchus
Owner

Martchus commented Jan 2, 2025

The issue with ld.lld: error: misaligned ldr/str offset was very strange, indeed. I came to the conclusion that this was just due to binutils and LLVM tooling being incompatible. That you're now running into the issue as well is even more curious, and maybe my previous conclusion was wrong.

Note that I rebuilt all my packages using LLVM tooling. (The only package where I used binutils was Boost, but only for windmc. Other tooling was taken from LLVM. I think it would have been impossible to use binutils anyway because the misalignment issue gets in the way very soon.)

@theAeon

theAeon commented Jan 2, 2025

perhaps even more strange is that this wasn't happening until i started adding the hardening flags.

edit: and compiler-rt/ld.lld so maybe i should try removing those first

edit edit: same deal

@theAeon

theAeon commented Jan 2, 2025

culprit appears to either be fno-plt or fexceptions, per testing

edit: looks like its fno-plt, gonna double check by readding the other options

edit edit: yup, testing now

edit edit edit: cfguard fast fail crash

@theAeon

theAeon commented Jan 2, 2025

0:035> u @x1 L1
syncthingtray!runtime.callbackasm+0x8 [./runtime/zcallback_windows_arm64.s @ 15]:
00007ff6`d7d29b88 b24003ec movi        x12,#1

that's....new.

https://github.com/golang/go/blob/master/src/runtime/zcallback_windows_arm64.s

@Martchus
Owner

Martchus commented Jan 2, 2025

Ok, so what flags exactly did you have to add to the compilation of what to reproduce a crash? And did you reproduce the power callback crash or only the one pointing to the Go runtime? Unfortunately I'm really not familiar enough with these callbacks from the Go runtime to shed some light on what's perhaps going on. Maybe you're onto something, though :-)

@theAeon

theAeon commented Jan 2, 2025

> Ok, so what flags exactly did you have to add to the compilation of what to reproduce a crash? And did you reproduce the power callback crash or only the one pointing to the Go runtime? Unfortunately I'm really not familiar enough with these callbacks from the Go runtime to shed some light on what's perhaps going on. Maybe you're onto something, though :-)

Same flags minus -fno-plt. That's the function pointer present when the same powercallback trips on resume from sleep.

@Martchus
Owner

Martchus commented Jan 2, 2025

And I assume you're only recompiling Syncthing Tray but none of the dependencies like Qt, libc++ or mingw-w64? If the compilation of Syncthing Tray alone makes a difference that's maybe a good sign (and I won't have to recompile everything to create a fixed build).

To exclude Go you can of course build with NO_LIBSYNCTHING=ON and USE_LIBSYNCTHING=OFF.

@theAeon

theAeon commented Jan 2, 2025

Correct. Not sure how to get the output of a no_libsyncthing build to attach to a running copy of syncthing, though.

edit: its in settings, ignore me. will update.

@theAeon

theAeon commented Jan 2, 2025

seems its surviving sleep-wake with external syncthing. next testing your current release build.

@Martchus
Owner

Martchus commented Jan 2, 2025

Judging by https://gcc.gnu.org/onlinedocs/gcc/Code-Gen-Options.html#index-fno-plt and other information I found about -fno-plt, I am starting to think that this flag is not relevant for PE targets as it seems rather ELF-specific. Unfortunately I haven't found a clear statement that it is ELF-specific yet. However, if that's true then it would be very strange if it was the culprit.

I think I only added this flag because it is also used in regular x86 mingw-w64 builds (https://aur.archlinux.org/cgit/aur.git/tree/mingw-env.sh?h=mingw-w64-environment). It might have just ended up being used in regular x86 mingw-w64 builds because we generally try to stay close to how things are done in native Arch Linux packaging (to have the same level of hardening and optimization).

@theAeon

theAeon commented Jan 2, 2025

still dies. seems its a function of whether libsyncthing is compiled in rather than a function of whether its actually using it.

(also interesting that this ones back to the libpcre call)

@theAeon

theAeon commented Jan 2, 2025

> However, if that's true then it would be very strange if it was the culprit.

To be fair this didn't affect the crashing (although it did change the called-back function to something that at least made some sense). It only affected the alignment error on native build.

@Martchus
Owner

Martchus commented Jan 2, 2025

> still dies. seems its a function of whether libsyncthing is compiled in rather than a function of whether its actually using it.

That doesn't surprise me. A few years ago I had crashes when trying to run an x86 Windows build with Wine. It always crashed when it was compiled with libsyncthing support, even though none of the Go functions were ever executed. (It worked natively under Windows, though.)

@Martchus
Owner

Martchus commented Jan 3, 2025

I asked about this on the LLVM discord and got answers:

  • -fno-plt has no effect on Windows/PE targets, indeed.
  • -mguard=cf is supported by LLVM for mingw-w64 targets including aarch64 as long as mingw-w64 is built with --enable-cfguard (my build of mingw-w64 is configured with that flag).

About debugging the linker errors:

> Oh btw, if you or the reporter run into issues with LLD, regarding the issues with alignment - rerun the link command and add -Wl,--reproduce=repro.tar, this produces a tarball which can be shared, which should allow someone else (me) to reproduce and debug what's really going on at the link stage, without needing to share the whole compilation/build environment
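
For anyone wanting to follow that advice, a small sketch of re-running a failed link step with the extra option. The -Wl,--reproduce=repro.tar option is the one quoted above; the helper name with_reproduce and the example object/output names are assumptions for illustration. The helper only prints the amended command so it can be inspected before running it.

```shell
#!/bin/sh
# Hypothetical helper: take a tarball name followed by an existing link
# command and print the same command with lld's --reproduce option appended.
with_reproduce() {
    tarball=$1
    shift
    # "$*" joins the remaining arguments (the original link command) with spaces.
    printf '%s\n' "$* -Wl,--reproduce=$tarball"
}

# Example usage (prints the command, does not execute it):
#   with_reproduce repro.tar c++ -fuse-ld=lld main.obj -o app.exe
```

In practice the link command would be copied from the failing ninja output shown earlier in the thread.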

@Martchus
Copy link
Owner

Martchus commented Jan 6, 2025

I've just released 1.7.0 and for this release I removed support for the built-in Syncthing library in ARM builds. Maybe this makes a difference.

@theAeon

theAeon commented Jan 7, 2025

Appears it survives sleep-wake.

Dunno if that counts as solving the issue, but it certainly solves my usecase.

@Martchus
Owner

Martchus commented Jan 7, 2025

I suppose then the culprit is really the Go runtime, which doesn't support Windows on ARM in the way I'm building my binaries (with recommended build/hardening flags and where the Go runtime is linked into the final executable as a static library).

Then I'll keep the built-in library disabled for now. I wouldn't close the issue right now. I'll at least update the notes on the website and README and maybe there's an issue on the Go side I can watch and block on.

@Martchus
Owner

Martchus commented Jan 11, 2025

I had one new thought: Maybe I just need to be more careful with what flags to pass to cgo specifically. My CMake-based build system passes flags via the env variables CGO_CFLAGS, CGO_CXXFLAGS and CGO_LDFLAGS from CMake to cgo. The idea is to build everything with consistent flags that are specified in one place (CMake level, e.g. from a toolchain file). I also had a brief look at the code and I must say that passing the flags is probably actually slightly broken. So there's definitely room for improvement anyway.

EDIT: You can experiment with this by passing -DCGO_${LANGUAGE}FLAGS_OVERRIDE=… as CMake argument where ${LANGUAGE} is C, CXX or LD. To clear the flags (so no flags from CMake are passed to cgo) you probably have to pass a whitespace.
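
A sketch of how such an experiment could be scripted, here dropping -mguard=cf from the flags that reach cgo. The variable names CGO_CFLAGS_OVERRIDE, CGO_CXXFLAGS_OVERRIDE and CGO_LDFLAGS_OVERRIDE follow the pattern given above; the helper names strip_flag and cgo_override_args and the example flag strings are assumptions for illustration, not part of the build system.

```shell
#!/bin/sh
# Hypothetical helper: remove one flag (default: -mguard=cf) from a
# whitespace-separated flag string and print the remaining flags.
strip_flag() {
    unwanted=${2:--mguard=cf}
    out=""
    # Intentionally unquoted: split the flag string on whitespace.
    for f in $1; do
        [ "$f" = "$unwanted" ] || out="${out:+$out }$f"
    done
    printf '%s\n' "$out"
}

# Hypothetical helper: print one CMake -D argument per line for the C, CXX
# and LD flags handed to cgo, each with -mguard=cf removed.
cgo_override_args() {
    printf '%s\n' "-DCGO_CFLAGS_OVERRIDE=$(strip_flag "$1")"
    printf '%s\n' "-DCGO_CXXFLAGS_OVERRIDE=$(strip_flag "$2")"
    printf '%s\n' "-DCGO_LDFLAGS_OVERRIDE=$(strip_flag "$3")"
}
```

Each printed line would be passed to cmake as a single argument. If stripping leaves a value empty, the note above about passing a whitespace to clear the flags probably applies.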

@Martchus
Owner

Martchus commented Jan 24, 2025

Note that issues with ld.lld: error: misaligned ldr/str offset were definitely caused by use of -fno-plt, see Martchus/PKGBUILDs@dcbd169 for an explanation and further links.


I plan to do the next release once Qt 6.8.2 is out (should have been released yesterday). Then the builds for Windows on ARM will be using LLVM/Clang/libc++ 19.1.7. For this release I will keep support for the built-in Syncthing library disabled. However, I will also build additional binaries with the built-in Syncthing library enabled:

  1. A version where the mguard flag is not passed to the Go build. Maybe this already helps to prevent the issue.
  2. A version where the mguard flag is not used at all during the build of Syncthing Tray. If 1. does not prevent the issue then maybe not using mguard at all helps. However, my build of mingw-w64 and all other dependencies will still use mguard flag. So maybe this doesn't make a big difference.
  3. A version where the mguard flag is still used everywhere as before. Maybe it will just work after updating the LLVM and Go toolchains.

If you want to experiment with flags yourself, you can do so by setting CMake variables as mentioned in my previous comment. For an example, see the commit on the branch I prepared in my build script repository for the next release. I have tested whether passing/modifying flags this way works (and it does). I just have no binaries to share right now because I apparently need to rebuild Qt after updating LLVM/Clang/libc++, as I currently run into linker errors¹.


¹

ld.lld: error: duplicate symbol: std::__1::bad_function_call::~bad_function_call()
>>> defined at libQt6Core.a(removed_api.cpp.obj)
>>> defined at libc++.a(functional.cpp.obj)

ld.lld: error: duplicate symbol: vtable for std::__1::bad_function_call
>>> defined at libQt6Core.a(removed_api.cpp.obj)
>>> defined at libc++.a(functional.cpp.obj)
clang++: error: linker command failed with exit code 1 (use -v to see invocation)

@Martchus
Owner

By the way, I recently dealt with another problem caused by a limitation of the Go runtime affecting Go programs built as a library on certain platforms. So maybe we are hitting another limitation here as well, one that is possibly specific to the program being built as a library and the use of mguard.
