Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve json decoding #1402

Merged
merged 9 commits into from
May 5, 2022
Merged

Conversation

cmaglie
Copy link
Member

@cmaglie cmaglie commented Aug 19, 2021

I found that the standard golang JSON decoder is very slow. This is particularly noticeable because the library_index.json is constantly growing (currently we are at ~20Mb) and the parsing time adds up at every invocation of the cli.

This PR:

  • use easyjson library for parsing the library_index.json
  • updates the testdata/library_index.json with the latest current
  • add a benchmark to test JSON decoding performances

Here the result:

$ go test -v -benchmem -bench BenchmarkIndexParsing github.com/arduino/arduino-cli/arduino/libraries/librariesindex
=== RUN   TestIndexer
--- PASS: TestIndexer (0.16s)
goos: linux
goarch: amd64
pkg: github.com/arduino/arduino-cli/arduino/libraries/librariesindex
cpu: Intel(R) Core(TM) i7-8550U CPU @ 1.80GHz
BenchmarkIndexParsingStdJSON
BenchmarkIndexParsingStdJSON-8                 5         214872730 ns/op          94.52 MB/s    58956539 B/op     418973 allocs/op
BenchmarkIndexParsingEasyJSON
BenchmarkIndexParsingEasyJSON-8               16          69215472 ns/op         293.42 MB/s    56162664 B/op     418966 allocs/op
PASS
ok      github.com/arduino/arduino-cli/arduino/libraries/librariesindex 4.442s

@cmaglie cmaglie force-pushed the improve_json_decoding branch 3 times, most recently from 21db43c to ee5eeca Compare August 19, 2021 14:40
@per1234 per1234 added type: enhancement Proposed improvement topic: code Related to content of the project itself labels Jan 12, 2022
@cmaglie cmaglie force-pushed the improve_json_decoding branch from ee5eeca to 470e2b9 Compare May 4, 2022 10:02
@cmaglie cmaglie force-pushed the improve_json_decoding branch from 470e2b9 to fdfb53b Compare May 4, 2022 10:03
cmaglie added 2 commits May 4, 2022 12:04
Results:

$ go test -v -benchmem -bench BenchmarkIndexParsing github.com/arduino/arduino-cli/arduino/libraries/librariesindex
=== RUN   TestIndexer
--- PASS: TestIndexer (0.16s)
goos: linux
goarch: amd64
pkg: github.com/arduino/arduino-cli/arduino/libraries/librariesindex
cpu: Intel(R) Core(TM) i7-8550U CPU @ 1.80GHz
BenchmarkIndexParsingStdJSON
BenchmarkIndexParsingStdJSON-8                 5         214872730 ns/op          94.52 MB/s    58956539 B/op     418973 allocs/op
BenchmarkIndexParsingEasyJSON
BenchmarkIndexParsingEasyJSON-8               16          69215472 ns/op         293.42 MB/s    56162664 B/op     418966 allocs/op
PASS
ok      github.com/arduino/arduino-cli/arduino/libraries/librariesindex 4.442s

easyjson is 3x faster.
@cmaglie cmaglie force-pushed the improve_json_decoding branch from fdfb53b to 878168e Compare May 4, 2022 10:04
@cmaglie cmaglie requested a review from a team May 4, 2022 10:38
Copy link
Contributor

@per1234 per1234 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it be possible to set up some infrastructure for the code generation?

  • Task that runs the generation command
  • GitHub Actions workflow that runs the task and then checks for a diff (which would indicate that the generated code has gone out of sync)

I did that for the generated code in Arduino Lint and found it to be very useful. The benefits should be even more here where there are more contributors.

Probably should also be documented in the contributor guide:
https://github.com/arduino/arduino-cli/blob/master/docs/CONTRIBUTING.md

@ubidefeo ubidefeo self-requested a review May 5, 2022 09:07
Copy link

@ubidefeo ubidefeo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have run the build from this PR for about one hour on several projects and have not been able to break it.
From a usage point of view it LGTM

@cmaglie cmaglie force-pushed the improve_json_decoding branch 2 times, most recently from f4d814a to ab68a8e Compare May 5, 2022 11:14
@cmaglie cmaglie force-pushed the improve_json_decoding branch from ab68a8e to 7f393b1 Compare May 5, 2022 11:18
@cmaglie cmaglie requested a review from per1234 May 5, 2022 11:23
.github/workflows/check-easyjson.yml Outdated Show resolved Hide resolved
@cmaglie cmaglie merged commit 20449fc into arduino:master May 5, 2022
@cmaglie cmaglie deleted the improve_json_decoding branch May 5, 2022 13:07
@cmaglie
Copy link
Member Author

cmaglie commented May 6, 2022

For the record: we are using a patched version of easyjson until this patch is accepted upstream: mailru/easyjson#372

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
topic: code Related to content of the project itself type: enhancement Proposed improvement
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants