ra_log_cache_key_not_found exception exit occured #416

Closed
sile opened this issue Feb 5, 2024 · 4 comments · Fixed by #428

@sile
Contributor

sile commented Feb 5, 2024

Describe the bug

The following exception was raised when processing a consistent query:

exception exit: {ra_log_cache_key_not_found,15}
      in function  ra_log_cache:fetch/2 (_build/default/lib/ra/src/ra_log_cache.erl, line 68)
      in call from ra_log:'-resend_from0/2-fun-0-'/3 (_build/default/lib/ra/src/ra_log.erl, line 932)
      in call from ra_log:'-resend_from0/2-lists^foldl/2-0-'/3 (_build/default/lib/ra/src/ra_log.erl, line 931)
      in call from ra_log:resend_from/2 (_build/default/lib/ra/src/ra_log.erl, line 915)
      in call from ra_log:handle_event/2 (_build/default/lib/ra/src/ra_log.erl, line 456)
      in call from ra_server:handle_follower/2 (_build/default/lib/ra/src/ra_server.erl, line 1123)
      in call from ra_server_proc:handle_follower/2 (_build/default/lib/ra/src/ra_server_proc.erl, line 1090)
      in call from ra_server_proc:follower/3 (_build/default/lib/ra/src/ra_server_proc.erl, line 794)
      in call from gen_statem:loop_state_callback/11 (gen_statem.erl, line 1395)
  • ra version: 2.9.0

Reproduction steps

I am unable to provide reproduction steps because the exception occurs very rarely, and it happened while running test code for our non-open-source product.
(Feel free to ignore this issue if you think there is insufficient information.)

Instead, here is a rough outline of the test scenario:

  1. Construct a cluster of 5 nodes named a, b, c, d, and e.
  2. Regularly monitor the availability of the cluster by running a consistent query.
  3. Divide the nodes into two groups: {a,b} forms one group, while {c,d,e} forms the other group.
    • We use a custom Erlang distribution module that emulates a very slow network: when a certain flag is enabled, communication between the connected nodes is severely limited.
  4. Restore the split cluster to its normal state.

The exception mentioned above appears to have occurred on a node in the majority group while running a consistent query (for the health check) just after step 3.
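
To make "majority group" concrete: assuming standard Raft majority rules, a cluster of five voting members needs three of them to agree, so after the split in step 3 only {c,d,e} would be expected to keep serving consistent queries. Below is a minimal sketch of that arithmetic in Python. The helper names `quorum_size` and `has_quorum` are illustrative only and are not part of ra.

```python
def quorum_size(cluster_size: int) -> int:
    """Smallest number of members that forms a strict majority."""
    return cluster_size // 2 + 1


def has_quorum(group: set[str], members: set[str]) -> bool:
    """True if the reachable group contains a strict majority of members."""
    return len(group & members) >= quorum_size(len(members))


# Node names taken from the scenario above; the grouping mirrors step 3.
members = {"a", "b", "c", "d", "e"}
minority = {"a", "b"}
majority = {"c", "d", "e"}

print(quorum_size(len(members)))      # 3
print(has_quorum(minority, members))  # False
print(has_quorum(majority, members))  # True
```

This only models the membership arithmetic. It does not reproduce the crash, which happened inside the ra server process.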

Expected behavior

No exception exit should occur.

Additional context

No response

sile added the bug label Feb 5, 2024
sile changed the title from "ra_log_cache_key_not_found exception exit was occured" to "ra_log_cache_key_not_found exception exit occured" Feb 5, 2024
@kjnilsson
Contributor

We have seen this error a couple of times but have not been able to track it down. After the process has crashed and restarted, it should typically be OK. Was this the case here?

@sile
Contributor Author

sile commented Feb 13, 2024

I see. Thank you for your response.

> After the process has crashed and restarted it should be ok typically. Was this the case here?

Yes, our test case mostly passed even though the exception occurred; only the crash log check at the end of the test case failed.
So it seems this exception did not cause a critical problem.

@kjnilsson
Contributor

kjnilsson commented Apr 24, 2024

I am very confident that #428 will fix this issue.

@sile
Contributor Author

sile commented Apr 25, 2024

Great!
