Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Pure Go SQLite #20614

Closed
wants to merge 4 commits into from
Closed

Pure Go SQLite #20614

wants to merge 4 commits into from

Conversation

jolheiser
Copy link
Member

This PR is an attempt at reviving #15002

Signed-off-by: jolheiser <[email protected]>
Copy link
Contributor

@Gusted Gusted left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How much is the binary size affected?

@@ -182,7 +179,7 @@ GOOS=linux GOARCH=arm64 make build
Cross-build Gitea for Linux ARM64, with recommended build tags:

```
CC=aarch64-unknown-linux-gnu-gcc GOOS=linux GOARCH=arm64 TAGS="bindata sqlite sqlite_unlock_notify" make build
CC=aarch64-unknown-linux-gnu-gcc GOOS=linux GOARCH=arm64 TAGS="bindata" make build
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also good to remove any reference to CGO? Like providing C profile here.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If some users want to use some C modules like pam, then CGO is still needed.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was under the impression the pam module didn't use CGO as well.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, we use https://github.com/msteinert/pam which is CGO 😔

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If some users want to use some C modules like pam, then CGO is still needed.

But there is a build tag for pam

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If some users want to use some C modules like pam, then CGO is still needed.

But there is a build tag for pam

But this document is showing users how to set the C compiler if the CGO is needed when cross-build

@GiteaBot GiteaBot added the lgtm/need 2 This PR needs two approvals by maintainers to be considered for merging. label Aug 2, 2022
@jolheiser
Copy link
Member Author

jolheiser commented Aug 2, 2022

How much is the binary size effected?

On my machine it looks like it increased ~2mb

EDIT: When I built the "old" version I forgot to use the sqlite tags, I've updated the change (only a ~1mb difference)

@jolheiser jolheiser added the type/enhancement An improvement of existing functionality label Aug 2, 2022
@jolheiser jolheiser added this to the 1.18.0 milestone Aug 2, 2022
@lunny
Copy link
Member

lunny commented Aug 2, 2022

This PR is an attempt at reviving #15002

Thanks @jolheiser, I closed mine.

@lunny
Copy link
Member

lunny commented Aug 2, 2022

How about the performance for the new versions of sqlite compare with cgo sqlite.

Signed-off-by: jolheiser <[email protected]>
Signed-off-by: jolheiser <[email protected]>
@@ -182,7 +179,7 @@ GOOS=linux GOARCH=arm64 make build
Cross-build Gitea for Linux ARM64, with recommended build tags:

```
CC=aarch64-unknown-linux-gnu-gcc GOOS=linux GOARCH=arm64 TAGS="bindata sqlite sqlite_unlock_notify" make build
CC=aarch64-unknown-linux-gnu-gcc GOOS=linux GOARCH=arm64 TAGS="bindata" make build
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since there is no CGO, so the CC env could be removed?

@lunny
Copy link
Member

lunny commented Aug 9, 2022

Looks like sqlite doesn't support lock mechanism of c sqlite. We may need to post issues on upstream.

@Gusted
Copy link
Contributor

Gusted commented Aug 9, 2022

Looks like sqlite doesn't support lock mechanism of c sqlite. We may need to post issues on upstream.

Not sure, this is just how Go works. They are listening for when the provided ctx is cancelled, however in the meantime we are just changing the data of that context pointer. Solution is simple, avoid changing the context.

@jolheiser
Copy link
Member Author

I'll have to take a look at why we need to set those values/where they are being used.
Unfortunately Go itself doesn't really allow setting context values without the whole ctx = context.WithValue(...) dance afaik

@Gusted
Copy link
Contributor

Gusted commented Aug 9, 2022

I'll have to take a look at why we need to set those values/where they are being used.

Because context is the only "variable" being shared across the publicKeyHandler and sessionHandler. Maybe giving a copy of the context(context.WithCancel) to the database function in publicKeyHandler would avoid the data race? I'm not even sure why sqlite is still waiting for a interrupt when the database function already returned the result, seems a bug on it's own.

@Gusted
Copy link
Contributor

Gusted commented Aug 11, 2022

Just to note this data race is not just in testing but also for actual workflow.

So yeah giving a duplicated context and then cancel it after the function does resolve the data race:

diff --git a/modules/ssh/ssh.go b/modules/ssh/ssh.go
index 8a280fb1b..31084870e 100644
--- a/modules/ssh/ssh.go
+++ b/modules/ssh/ssh.go
@@ -181,7 +181,11 @@ func publicKeyHandler(ctx ssh.Context, key ssh.PublicKey) bool {
                // look for the exact principal
        principalLoop:
                for _, principal := range cert.ValidPrincipals {
-                       pkey, err := asymkey_model.SearchPublicKeyByContentExact(ctx, principal)
+                       // Copy the context and give it a new done channel. This ensures that the read for context
+                       // in sqlite3 is closed after we got the results and avoids a data race when we change the context.
+                       dupCtx, cancel := context.WithCancel(ctx)
+                       pkey, err := asymkey_model.SearchPublicKeyByContentExact(dupCtx, principal)
+                       cancel()
                        if err != nil {
                                if asymkey_model.IsErrKeyNotExist(err) {
                                        log.Debug("Principal Rejected: %s Unknown Principal: %s", ctx.RemoteAddr(), principal)
@@ -242,7 +246,11 @@ func publicKeyHandler(ctx ssh.Context, key ssh.PublicKey) bool {
                log.Debug("Handle Public Key: %s Fingerprint: %s is not a certificate", ctx.RemoteAddr(), gossh.FingerprintSHA256(key))
        }
 
-       pkey, err := asymkey_model.SearchPublicKeyByContent(ctx, strings.TrimSpace(string(gossh.MarshalAuthorizedKey(key))))
+       // Copy the context and give it a new done channel. This ensures that the read for context
+       // in sqlite3 is closed after we got the results and avoids a data race when we change the context.
+       dupCtx, cancel := context.WithCancel(ctx)
+       pkey, err := asymkey_model.SearchPublicKeyByContent(dupCtx, strings.TrimSpace(string(gossh.MarshalAuthorizedKey(key))))
+       cancel()
        if err != nil {
                if asymkey_model.IsErrKeyNotExist(err) {
                        if log.IsWarn() {

@Gusted
Copy link
Contributor

Gusted commented Aug 18, 2022

Okay we're getting somewhere, so relevant issues:

--- FAIL: TestDumpDatabase (2.08s)

    engine_test.go:33: 

        	Error Trace:	engine_test.go:33

        	Error:      	Received unexpected error:

        	            	unsupported database type sqlite

        	Test:       	TestDumpDatabase

FAIL

coverage: 7.4% of statements

Seems like xorm doesn't cover the sqlite database type?

    --- FAIL: TestResourceIndex/issue_19 (0.23s)

        issue_test.go:392: 

            	Error Trace:	issue_test.go:392

            	Error:      	Received unexpected error:

            	            	generate issue index failed: database table is locked (262)

            	Test:       	TestResourceIndex/issue_19

Seems like busy_timeout needs to be changed(maybe 3d10000?)

@lunny lunny modified the milestones: 1.18.0, 1.19.0 Oct 17, 2022
@lunny lunny modified the milestones: 1.19.0, 1.20.0 Jan 31, 2023
@yardenshoham yardenshoham removed this from the 1.20.0 milestone May 10, 2023
@silverwind
Copy link
Member

Related: #24658

@wxiaoguang
Copy link
Contributor

wxiaoguang commented May 11, 2023

And put my question here:

Actually, I have somewhat objection for using that CCGO based sqlite package in Gitea

Are we willing to take the risk of a complex essential core library getting unmaintained(some unfixable bugs)?

Although SQLite is popular, but the pure-go solution isn't (actually, most people do not need the pure-go solution).


I would still question whether it's worth to use CCGO in Gitea. What if there is an unfixable bug?

  • If we use a 3rd go package, even if there is a bug, we can fork and fix.
  • But if there is a bug in CCGO related package, how to fix it? I don't have the time to patch the CCGO compiler ....

If CCGO pakcage has been proven to be stable enough, then I have no objection.

@lnchan
Copy link

lnchan commented Jul 26, 2023

And put my question here:

Actually, I have somewhat objection for using that CCGO based sqlite package in Gitea

Are we willing to take the risk of a complex essential core library getting unmaintained(some unfixable bugs)?

Although SQLite is popular, but the pure-go solution isn't (actually, most people do not need the pure-go solution).

I would still question whether it's worth to use CCGO in Gitea. What if there is an unfixable bug?

  • If we use a 3rd go package, even if there is a bug, we can fork and fix.
  • But if there is a bug in CCGO related package, how to fix it? I don't have the time to patch the CCGO compiler ....

If CCGO pakcage has been proven to be stable enough, then I have no objection.

As far as I'm aware, there is a bun module to automatically select between the modernc.org/sqlite if CGo is disabled, and mattn/go-sqlite if CGo is enabled. I believe this would be the best option for everyone. This way, users may use whichever they prefer: the pure Go version, easier to compile and with better cross-platform support, or the native C version, harder to compile, but offering bindings to the original version.

As a side note: the modernc.org/sqlite package is used in production, and passes the SQLite3 testing suites. It is used by many applications, specifically using the above bun ORM module. Note that it is also sponsored by a German company as specified in their README; meaning, they have interests to keep the project maintained.

I would personally enjoy having the pure-Go version, as it would be a step towards being able to go build Gitea instead of using the Makefile (at least for the backend); which would make life easier for everyone; maintainers, contributors, packagers or users.

@wxiaoguang
Copy link
Contributor

I would personally enjoy having the pure-Go version, as it would be a step towards being able to go build Gitea instead of using the Makefile (at least for the backend); which would make life easier for everyone; maintainers, contributors, packagers or users.

IMO it's not related. With current CGO sqlite, go build also works well. The Makefile is never a must. The only benefit of the pure-go sqlite is it drops the C compiler dependency.

@lnchan
Copy link

lnchan commented Jul 26, 2023

I would personally enjoy having the pure-Go version, as it would be a step towards being able to go build Gitea instead of using the Makefile (at least for the backend); which would make life easier for everyone; maintainers, contributors, packagers or users.

IMO it's not related. With current CGO sqlite, go build also works well. The Makefile is never a must. The only benefit of the pure-go sqlite is it drops the C compiler dependency.

I thought there was still a dependency on an old-fashioned way of embedding files in Go binaries, but I may have read incorrectly or be wrong.
Dropping the C compiler dependency would be a definite benefit for many users, but not everyone wants that (for diverse reasons), which is why letting the user decide would be, in my opinion, the best option.

@silverwind
Copy link
Member

Yes, Dropping CGO would considerably simply our build because we can use the native go compiler, eliminating the xgo dependency. I would definitely prefer it.

@lunny
Copy link
Member

lunny commented Dec 20, 2024

This should be replaced by #32628

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
lgtm/need 2 This PR needs two approvals by maintainers to be considered for merging. type/enhancement An improvement of existing functionality
Projects
None yet
Development

Successfully merging this pull request may close these issues.

9 participants