Root cause PR_TAGGED_ADDR_ENABLE failures on Pixels #962

Open
Mygod opened this issue Aug 2, 2023 · 25 comments

@Mygod
Contributor

Mygod commented Aug 2, 2023

How long until the next release? I miss my PR.

@ignoramous
Collaborator

Hi, which PR is missing?

The next release is imminent. Everything looks okay (except for some hard-to-track-down bugs).

@Mygod
Contributor Author

Mygod commented Aug 4, 2023

#856. The app is randomly closing itself and not restarting. It's very annoying.

@ignoramous
Collaborator

(Going by the dates) v054c was released on May 6, and the patch you submitted was merged on Apr 14. Are you sure that patch isn't already in the released v054c version?

@Mygod
Contributor Author

Mygod commented Aug 4, 2023

You are right. Maybe #966 would do the trick? 🤔

@ignoramous
Collaborator

> The app is randomly closing itself and not restarting. It's very annoying.

Is this similar to #893?

Or, is it about the app crashing (due to bugs)?

I'm not sure how START_STICKY would solve either of those two cases. That said, you're right that Rethink should probably use START_STICKY and START_NOT_STICKY like so: https://android.googlesource.com/platform/development/+/master/samples/ToyVpn/src/com/example/android/toyvpn/ToyVpnService.java#73

@Mygod
Contributor Author

Mygod commented Aug 4, 2023

I think it is closed due to memory pressure, so perhaps it is #893. This is on a Pixel, though.

@Mygod closed this as completed Aug 6, 2023
@Mygod
Contributor Author

Mygod commented Sep 6, 2023

Following up, it looks like it was actually crashing (a few times a day). Sent the bug report to your email.

@ignoramous
Collaborator

Thanks for sending along a bug report. Can you confirm whether you sent it from @mygod.be? We can't find it in our inboxes (you can forward the attachment to hello at celzero dot com or mz at celzero dot com or md at celzero dot com). Thanks.

@Mygod
Contributor Author

Mygod commented Sep 7, 2023

Yes that's me.

@hussainmohd-a
Collaborator

@Mygod, could you please confirm which email address you sent the bug report to? We can't seem to find it in our inboxes.

@Mygod
Contributor Author

Mygod commented Sep 7, 2023

It's hello at celzero dot com?

@ignoramous
Collaborator

Found both the emails. Thanks.

@ignoramous
Collaborator

By the time I checked last night, the files were gone, though. If you could reshare them...

Sorry that this has been a bit too cumbersome for you.

@Mygod
Contributor Author

Mygod commented Sep 8, 2023

Weird. Resent. Sorry!

@hussainmohd-a
Collaborator

Cmdline: com.celzero.bravedns
pid: 20494, tid: 28355, name: Thread-80  >>> com.celzero.bravedns <<<
uid: 10273
tagged_addr_ctrl: 0000000000000001 (PR_TAGGED_ADDR_ENABLE)
signal 6 (SIGABRT), code -6 (SI_TKILL), fault addr --------
    x0  0000000000000000  x1  0000000000006ec3  x2  0000000000000006  x3  0000000000000008
    x4  0000000000000001  x5  0000000000000001  x6  0000000000000000  x7  000000000000000a
    x8  0000000000000083  x9  0000000000000035  x10 0000000000000010  x11 0000007b0f6bc130
    x12 0000000000000002  x13 0000000000000000  x14 00000000000000c3  x15 0000000000000000
    x16 0000007b2154d340  x17 0000007b216446e0  x18 0000007b1c6e8000  x19 000000000000500e
    x20 0000007b21644580  x21 00000040003c4000  x22 0000000000000004  x23 000000000000be45
    x24 000000400345b9d0  x25 000000400020c980  x26 0000000000000000  x27 0000000000000010
    x28 0000004002b1ba00  x29 00000040052db788
    lr  0000007b0f09aaa4  sp  00000040052db790  pc  0000007b0f0b8268  pst 0000000080001000
backtrace:
      #00 pc 0000000000292268  /data/app/~~sVw6-98aRfkKBSylSWt3ug==/com.celzero.bravedns-gMO_xYdcYIK4yhVhAR-KXQ==/base.apk (offset 0x31f000)
Fatal signal 6 (SIGABRT), code -6 (SI_TKILL) in tid 10499 (Thread-22), pid 10407 (elzero.bravedns)

It seems the cause is similar to issue #786. The tombstone mentions PR_TAGGED_ADDR_ENABLE, which suggests that address tagging is involved (we believe the Go runtime, which runs Rethink's network engine, isn't compatible with MTE). It's worth noting that this feature may be enabled by default on Pixel devices (maybe debug builds?) but disabled on other devices.

However, Rethink already sets memtagMode to off (per #786) to disable the Memory Tagging Extension (MTE), and that does not seem to have resolved the crashes?

Alternatively, this command should achieve the same result:

adb shell am compat enable NATIVE_MEMTAG_OFF com.celzero.bravedns

We don't have Pixel devices of our own to test this on. If you are okay with it, could you please let us know whether it helps in any way?
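
For reference, the `tagged_addr_ctrl` value in the tombstone above can be decoded bit by bit. This is a minimal Go sketch, assuming the bit layout defined in the Linux uapi header `linux/prctl.h` (`PR_TAGGED_ADDR_ENABLE`, `PR_MTE_TCF_SYNC`, `PR_MTE_TCF_ASYNC`, `PR_MTE_TAG_SHIFT`). The type and function names are hypothetical and not part of Rethink or firestack.

```go
package main

import (
	"fmt"
	"strconv"
)

// Bit layout assumed from the Linux uapi header linux/prctl.h.
const (
	prTaggedAddrEnable = 1 << 0 // PR_TAGGED_ADDR_ENABLE
	prMTETCFSync       = 1 << 1 // PR_MTE_TCF_SYNC
	prMTETCFAsync      = 1 << 2 // PR_MTE_TCF_ASYNC
	prMTETagShift      = 3      // PR_MTE_TAG_SHIFT
	prMTETagMask       = 0xffff << prMTETagShift
)

// TaggedAddrCtrl is a decoded view of a tombstone's tagged_addr_ctrl field.
// (Hypothetical helper type, for illustration only.)
type TaggedAddrCtrl struct {
	TaggedAddrABI bool   // tagged-address ABI enabled
	SyncChecks    bool   // synchronous MTE tag checks enabled
	AsyncChecks   bool   // asynchronous MTE tag checks enabled
	TagMask       uint16 // tag inclusion mask
}

// decodeTaggedAddrCtrl parses the hex string printed in the tombstone
// and splits it into the individual flags.
func decodeTaggedAddrCtrl(hex string) (TaggedAddrCtrl, error) {
	v, err := strconv.ParseUint(hex, 16, 64)
	if err != nil {
		return TaggedAddrCtrl{}, err
	}
	return TaggedAddrCtrl{
		TaggedAddrABI: v&prTaggedAddrEnable != 0,
		SyncChecks:    v&prMTETCFSync != 0,
		AsyncChecks:   v&prMTETCFAsync != 0,
		TagMask:       uint16((v & prMTETagMask) >> prMTETagShift),
	}, nil
}

func main() {
	// Value copied from the tombstone in this thread.
	c, err := decodeTaggedAddrCtrl("0000000000000001")
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}
	fmt.Printf("%+v\n", c)
}
```

Under this layout, the value `0000000000000001` sets only the tagged-address ABI bit; neither of the two tag-check mode bits is set.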

@Mygod
Contributor Author

Mygod commented Sep 8, 2023

Unknown or invalid change: 'NATIVE_MEMTAG_OFF'.

@Mygod
Contributor Author

Mygod commented Sep 8, 2023

Maybe there is actually a use-after-free or buffer overflow in your code instead? Are there other cgo/android repos that have this issue?

@ignoramous
Collaborator

A use-after-free or buffer overflow in the gomobile / cgo output of firestack's code, perhaps?

I don't think firestack uses golang's unsafe anywhere. Maybe its dependency gvisor/pkg/tcpip@go does; it is a vast codebase.

If only these logs were more helpful...

I'm generally surprised that switching off memtagMode for Rethink didn't work on your Pixel (it did for folks who originally reported this bug back in the day).

@Mygod
Contributor Author

Mygod commented Sep 8, 2023

It looks like if you run it in SYNC mode the logs will be more helpful. https://source.android.com/docs/security/test/memory-safety/arm-mte#sync-mode
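
When reading tombstones from a SYNC-mode build, it may help to tell an MTE tag-check fault apart from a plain abort. Below is a minimal Go sketch, assuming the signal line keeps the `signal N (NAME), code N (NAME)` shape seen earlier in this thread, and assuming MTE faults are reported as SIGSEGV with code SEGV_MTESERR (sync) or SEGV_MTEAERR (async). The function names are hypothetical.

```go
package main

import (
	"fmt"
	"regexp"
)

// signalLine matches e.g. "signal 6 (SIGABRT), code -6 (SI_TKILL)".
var signalLine = regexp.MustCompile(`signal (\d+) \((\w+)\), code (-?\d+) \((\w+)\)`)

// parseSignal extracts the signal name and code name from a tombstone line.
// ok is false when the line does not look like a signal line.
func parseSignal(line string) (name, code string, ok bool) {
	m := signalLine.FindStringSubmatch(line)
	if m == nil {
		return "", "", false
	}
	return m[2], m[4], true
}

// isMTEFault reports whether the signal/code pair is one of the two
// MTE tag-check fault codes (assumption stated in the lead-in).
func isMTEFault(name, code string) bool {
	return name == "SIGSEGV" && (code == "SEGV_MTESERR" || code == "SEGV_MTEAERR")
}

func main() {
	// The first line is copied from the tombstone in this thread; the second
	// is a made-up example of what a SYNC-mode MTE fault line might look like.
	lines := []string{
		"signal 6 (SIGABRT), code -6 (SI_TKILL), fault addr --------",
		"signal 11 (SIGSEGV), code 9 (SEGV_MTESERR), fault addr 0x0",
	}
	for _, l := range lines {
		name, code, ok := parseSignal(l)
		if !ok {
			fmt.Println("unrecognized:", l)
			continue
		}
		fmt.Printf("%s/%s mteFault=%v\n", name, code, isMTEFault(name, code))
	}
}
```

By this check, the SIGABRT / SI_TKILL line from the tombstone above is not classified as an MTE tag-check fault.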

@ignoramous
Collaborator

In my limited experience elsewhere with ASAN / MSAN / TSANs... debug modes are painfully slow (:

(on that note, I see a few firestack golang deps use pkg unsafe).
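
One way to narrow down which source files pull in `unsafe` is to scan their import blocks. This is a rough Go sketch using the standard `go/parser` package. The helper name `importsUnsafe` and the sample source are hypothetical, and a real audit would need to walk every file of every dependency.

```go
package main

import (
	"fmt"
	"go/parser"
	"go/token"
	"strconv"
)

// importsUnsafe reports whether the given Go source imports package "unsafe".
// Only the import declarations are parsed, so the rest of the file is ignored.
func importsUnsafe(filename, src string) (bool, error) {
	fset := token.NewFileSet()
	f, err := parser.ParseFile(fset, filename, src, parser.ImportsOnly)
	if err != nil {
		return false, err
	}
	for _, imp := range f.Imports {
		// Import paths are stored as quoted string literals.
		path, err := strconv.Unquote(imp.Path.Value)
		if err != nil {
			return false, err
		}
		if path == "unsafe" {
			return true, nil
		}
	}
	return false, nil
}

func main() {
	// Made-up sample source, for illustration only.
	const sample = `package demo

import (
	"fmt"
	"unsafe"
)
`
	found, err := importsUnsafe("demo.go", sample)
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}
	fmt.Println("imports unsafe:", found)
}
```

A file that imports `unsafe` is not necessarily buggy; a scan like this only shortens the list of places worth reviewing.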

@Mygod
Contributor Author

Mygod commented Sep 8, 2023

You can send a debug mode apk to me and I can try to grab some logs. :)

@ignoramous reopened this Sep 8, 2023
@ignoramous changed the title from "New release?" to "Root cause PR_TAGGED_ADDR_ENABLE failures on Pixels" Sep 8, 2023
@hussainmohd-a
Collaborator

@Mygod, please find the debug build at this link, with android:memtagMode set to "sync". Let me know if you have any difficulty downloading the APK.

@Mygod
Contributor Author

Mygod commented Oct 3, 2023

Did you receive the crash report?

@Mygod
Contributor Author

Mygod commented Oct 4, 2023

I can no longer use this app after updating to Android 14.

10-04 17:32:53.426 18682 18682 E AndroidRuntime: android.app.MissingForegroundServiceTypeException: Starting FGS without a type  callerApp=ProcessRecord{51c5a3e 18682:com.celzero.bravedns/u0a400} targetSDK=34
10-04 17:32:53.426 18682 18682 E AndroidRuntime: 	at android.app.MissingForegroundServiceTypeException$1.createFromParcel(MissingForegroundServiceTypeException.java:53)
10-04 17:32:53.426 18682 18682 E AndroidRuntime: 	at android.app.MissingForegroundServiceTypeException$1.createFromParcel(MissingForegroundServiceTypeException.java:49)
10-04 17:32:53.426 18682 18682 E AndroidRuntime: 	at android.os.Parcel.readParcelableInternal(Parcel.java:4870)
10-04 17:32:53.426 18682 18682 E AndroidRuntime: 	at android.os.Parcel.readParcelable(Parcel.java:4852)
10-04 17:32:53.426 18682 18682 E AndroidRuntime: 	at android.os.Parcel.createExceptionOrNull(Parcel.java:3052)
10-04 17:32:53.426 18682 18682 E AndroidRuntime: 	at android.os.Parcel.createException(Parcel.java:3041)
10-04 17:32:53.426 18682 18682 E AndroidRuntime: 	at android.os.Parcel.readException(Parcel.java:3024)
10-04 17:32:53.426 18682 18682 E AndroidRuntime: 	at android.os.Parcel.readException(Parcel.java:2966)
10-04 17:32:53.426 18682 18682 E AndroidRuntime: 	at android.app.IActivityManager$Stub$Proxy.setServiceForeground(IActivityManager.java:6761)
10-04 17:32:53.426 18682 18682 E AndroidRuntime: 	at android.app.Service.startForeground(Service.java:775)
10-04 17:32:53.426 18682 18682 E AndroidRuntime: 	at com.celzero.bravedns.service.BraveVPNService$onStartCommand$1.invokeSuspend(BraveVPNService.kt:1206)
10-04 17:32:53.426 18682 18682 E AndroidRuntime: 	at com.celzero.bravedns.service.BraveVPNService$onStartCommand$1.invoke(Unknown Source:8)
10-04 17:32:53.426 18682 18682 E AndroidRuntime: 	at com.celzero.bravedns.service.BraveVPNService$onStartCommand$1.invoke(Unknown Source:2)
10-04 17:32:53.426 18682 18682 E AndroidRuntime: 	at com.celzero.bravedns.service.BraveVPNService$ui$1.invokeSuspend(BraveVPNService.kt:2186)
10-04 17:32:53.426 18682 18682 E AndroidRuntime: 	at kotlin.coroutines.jvm.internal.BaseContinuationImpl.resumeWith(ContinuationImpl.kt:33)
10-04 17:32:53.426 18682 18682 E AndroidRuntime: 	at kotlinx.coroutines.DispatchedTask.run(DispatchedTask.kt:108)
10-04 17:32:53.426 18682 18682 E AndroidRuntime: 	at android.os.Handler.handleCallback(Handler.java:958)
10-04 17:32:53.426 18682 18682 E AndroidRuntime: 	at android.os.Handler.dispatchMessage(Handler.java:99)
10-04 17:32:53.426 18682 18682 E AndroidRuntime: 	at android.os.Looper.loopOnce(Looper.java:205)
10-04 17:32:53.426 18682 18682 E AndroidRuntime: 	at android.os.Looper.loop(Looper.java:294)
10-04 17:32:53.426 18682 18682 E AndroidRuntime: 	at android.app.ActivityThread.main(ActivityThread.java:8177)
10-04 17:32:53.426 18682 18682 E AndroidRuntime: 	at java.lang.reflect.Method.invoke(Native Method)
10-04 17:32:53.426 18682 18682 E AndroidRuntime: 	at com.android.internal.os.RuntimeInit$MethodAndArgsCaller.run(RuntimeInit.java:552)
10-04 17:32:53.426 18682 18682 E AndroidRuntime: 	at com.android.internal.os.ZygoteInit.main(ZygoteInit.java:971)
10-04 17:32:53.426 18682 18682 E AndroidRuntime: 	Suppressed: kotlinx.coroutines.internal.DiagnosticCoroutineContextException: [StandaloneCoroutine{Cancelling}@d1d5641, Dispatchers.Main]

3 participants