Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: Adds retry to send raw transaction #3161

Merged

Conversation

konstantinabl
Copy link
Collaborator

Description:

Currently, we experience the SDk timing out in the CI and on mainnet, eventhough the transactions that fail do not take more than 10 seconds. More info in the issue below
Related issue(s):

Workaround until #3118 is solved

@konstantinabl konstantinabl linked an issue Oct 28, 2024 that may be closed by this pull request
Copy link

github-actions bot commented Oct 28, 2024

Test Results

    4 files   -  16    419 suites  +148   19s ⏱️ - 40m 15s
1 477 tests +873  1 476 ✅ +881  1 💤  - 3  0 ❌  - 5 
1 486 runs  +753  1 485 ✅ +762  1 💤  - 4  0 ❌  - 5 

Results for commit ec0a0a1. ± Comparison against base commit 7fea98c.

This pull request removes 604 and adds 1477 tests. Note that renamed tests count towards both.
"before all" hook for "should execute "eth_getCode" for hts token" ‑ RPC Server Acceptance Tests Acceptance tests @api-batch-2 RPC Server Acceptance Tests eth_getCode "before all" hook for "should execute "eth_getCode" for hts token"
"eth_call" for non-existing contract address returns 0x ‑ RPC Server Acceptance Tests Acceptance tests @api-batch-3 RPC Server Acceptance Tests eth_call "eth_call" for non-existing contract address returns 0x
001 Should call pureMultiply ‑ RPC Server Acceptance Tests Acceptance tests @api-batch-3 RPC Server Acceptance Tests eth_call Caller contract With evm address 001 Should call pureMultiply
001 Should call pureMultiply ‑ RPC Server Acceptance Tests Acceptance tests @api-batch-3 RPC Server Acceptance Tests eth_call Caller contract With long-zero address 001 Should call pureMultiply
002 Should call msgSender ‑ RPC Server Acceptance Tests Acceptance tests @api-batch-3 RPC Server Acceptance Tests eth_call Caller contract With evm address 002 Should call msgSender
002 Should call msgSender ‑ RPC Server Acceptance Tests Acceptance tests @api-batch-3 RPC Server Acceptance Tests eth_call Caller contract With long-zero address 002 Should call msgSender
003 Should call txOrigin ‑ RPC Server Acceptance Tests Acceptance tests @api-batch-3 RPC Server Acceptance Tests eth_call Caller contract With evm address 003 Should call txOrigin
003 Should call txOrigin ‑ RPC Server Acceptance Tests Acceptance tests @api-batch-3 RPC Server Acceptance Tests eth_call Caller contract With long-zero address 003 Should call txOrigin
004 Should call msgSig ‑ RPC Server Acceptance Tests Acceptance tests @api-batch-3 RPC Server Acceptance Tests eth_call Caller contract With evm address 004 Should call msgSig
004 Should call msgSig ‑ RPC Server Acceptance Tests Acceptance tests @api-batch-3 RPC Server Acceptance Tests eth_call Caller contract With long-zero address 004 Should call msgSig
…
"eth_blockNumber" return the latest block number on second try ‑ @ethGetBlockByNumber using MirrorNode "eth_blockNumber" return the latest block number on second try
"eth_blockNumber" should return the latest block number using cache ‑ @ethGetBlockByNumber using MirrorNode "eth_blockNumber" should return the latest block number using cache
"eth_blockNumber" should return the latest block number ‑ @ethGetBlockByNumber using MirrorNode "eth_blockNumber" should return the latest block number
"eth_blockNumber" should throw an error if no blocks are found after third try ‑ @ethGetBlockByNumber using MirrorNode "eth_blockNumber" should throw an error if no blocks are found after third try
"eth_blockNumber" should throw an error if no blocks are found ‑ @ethGetBlockByNumber using MirrorNode "eth_blockNumber" should throw an error if no blocks are found
Adds a revertReason field for receipts with errorMessage ‑ @ethGetTransactionReceipt eth_getTransactionReceipt tests Adds a revertReason field for receipts with errorMessage
BLOCK_HASH filter timeouts and throws the expected error ‑ @ethGetLogs using MirrorNode timeout BLOCK_HASH filter timeouts and throws the expected error
BLOCK_HASH filter ‑ @ethGetLogs using MirrorNode BLOCK_HASH filter
Can extract the account number out of an account pagination next link url ‑ MirrorNodeClient Can extract the account number out of an account pagination next link url
Can extract the evm address out of an account pagination next link url ‑ MirrorNodeClient Can extract the evm address out of an account pagination next link url
…
This pull request removes 4 skipped tests and adds 1 skipped test. Note that renamed tests count towards both.
011 Should fail when calling msgValue with more value than available balance ‑ RPC Server Acceptance Tests Acceptance tests @api-batch-3 RPC Server Acceptance Tests eth_call Caller contract With evm address 011 Should fail when calling msgValue with more value than available balance
011 Should fail when calling msgValue with more value than available balance ‑ RPC Server Acceptance Tests Acceptance tests @api-batch-3 RPC Server Acceptance Tests eth_call Caller contract With long-zero address 011 Should fail when calling msgValue with more value than available balance
from/to Addresses in transaction to a contract (deployed through HAPI tx) are in evm and long-zero format ‑ RPC Server Acceptance Tests Acceptance tests @api-batch-2 RPC Server Acceptance Tests Formats of addresses in Transaction and Receipt results from/to Addresses in transaction to a contract (deployed through HAPI tx) are in evm and long-zero format
should not return contract bytecode after sefldestruct ‑ RPC Server Acceptance Tests Acceptance tests @api-batch-2 RPC Server Acceptance Tests eth_getCode should not return contract bytecode after sefldestruct
supports optionality of request id when configured ‑ RPC Server given REQUEST_ID_IS_OPTIONAL=true are set supports optionality of request id when configured

♻️ This comment has been updated with latest results.

@konstantinabl konstantinabl force-pushed the 3160-fix-handle-timeout-exceeded-for-better-user-experience branch from 41aa0c4 to 4ef6b94 Compare October 28, 2024 13:16
@konstantinabl konstantinabl marked this pull request as ready for review October 28, 2024 16:17
Copy link
Contributor

@victor-yanev victor-yanev left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, the following suggestion will make sure we don't do unnecessary repeats and avoids submitting duplicated transactions.

Also, please add unit tests which simulate a 'timeout exceeded'/'Connection dropped' error from the SDK client and confirm that:

  • the transaction is not retried when the mirror node returns the transaction with a { result: 'SUCCESS' }
  • the transaction is retried when the transaction is not found in the mirror node or a result different from 'SUCCESS' is returned

packages/relay/src/lib/eth.ts Outdated Show resolved Hide resolved
Copy link
Contributor

@ebadiere ebadiere left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good work. A couple of questions.

@ebadiere ebadiere self-requested a review October 30, 2024 16:36
Copy link
Collaborator

@Nana-EC Nana-EC left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice change although I don't think we should be doing a retry.
Let's discuss

packages/relay/src/lib/eth.ts Outdated Show resolved Hide resolved
@konstantinabl konstantinabl force-pushed the 3160-fix-handle-timeout-exceeded-for-better-user-experience branch from 25437f2 to cc16db5 Compare October 31, 2024 10:17
@konstantinabl konstantinabl marked this pull request as draft October 31, 2024 10:18
@konstantinabl konstantinabl marked this pull request as ready for review October 31, 2024 19:23
@konstantinabl konstantinabl force-pushed the 3160-fix-handle-timeout-exceeded-for-better-user-experience branch from dc77178 to b4ec668 Compare October 31, 2024 20:19
@konstantinabl konstantinabl added this to the 0.60.0 milestone Oct 31, 2024
@konstantinabl konstantinabl added the Internal For changes that affect the project's internal workings but not its outward-facing functionality. label Oct 31, 2024
Copy link
Member

@quiet-node quiet-node left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice changes but sorry gotta block. Please fine more details below.

packages/relay/src/lib/eth.ts Outdated Show resolved Hide resolved
packages/relay/src/lib/errors/SDKClientError.ts Outdated Show resolved Hide resolved
packages/relay/src/lib/clients/sdkClient.ts Show resolved Hide resolved
packages/relay/src/lib/clients/sdkClient.ts Show resolved Hide resolved
packages/relay/src/lib/clients/sdkClient.ts Outdated Show resolved Hide resolved
@quiet-node
Copy link
Member

@konstantinabl Now that I’ve had a chance to delve into this “Transaction execution returns a null value...” error and to examine more closely how sdkClient.executeTransaction() handles errors, this block caught my attention. It seems we only throw an error if it has the WRONG_NONCE error code. As I mentioned above, there are several Ethereum-related errors a transaction can encounter while still reaching the consensus node and being retrievable by the mirror node. These error codes may include WRONG_NONCE, INSUFFICIENT_TX_FEE, CONTRACT_REVERT_EXECUTED, and others. These errors fail the transaction but still reach the consensus node. Thus, the line transactionResponse = await transaction.execute(this.clientMain); still returns a non-null value assigned to transactionResponse.

In the catch block, this part only throws an error if it has the WRONG_NONCE code. I initially designed this logic under the assumption that errors like WRONG_NONCE or CONTRACT_REVERT_EXECUTED were the main issues causing transactions to fail while still making their way to the consensus node, so transactionResponse wouldn’t be null and wouldn’t trigger this block.

However, this may be a regression and not entirely accurate, as it appears there are errors that cause transactions to fail and not reach the consensus node, i.e., the transactionResponse is null. I think it’s worth investigating which other errors can hit this catch block without making it to the consensus node. Once we better understand these errors, we can design a more robust error-handling solution instead of throwing an INTERNAL_ERROR as we currently do in this line.

I hope this clarifies things and stays within the scope of the Transaction execution returns a null value... error.

@konstantinabl
Copy link
Collaborator Author

@konstantinabl Now that I’ve had a chance to delve into this “Transaction execution returns a null value...” error and to examine more closely how sdkClient.executeTransaction() handles errors, this block caught my attention. It seems we only throw an error if it has the WRONG_NONCE error code. As I mentioned above, there are several Ethereum-related errors a transaction can encounter while still reaching the consensus node and being retrievable by the mirror node. These error codes may include WRONG_NONCE, INSUFFICIENT_TX_FEE, CONTRACT_REVERT_EXECUTED, and others. These errors fail the transaction but still reach the consensus node. Thus, the line transactionResponse = await transaction.execute(this.clientMain); still returns a non-null value assigned to transactionResponse.

In the catch block, this part only throws an error if it has the WRONG_NONCE code. I initially designed this logic under the assumption that errors like WRONG_NONCE or CONTRACT_REVERT_EXECUTED were the main issues causing transactions to fail while still making their way to the consensus node, so transactionResponse wouldn’t be null and wouldn’t trigger this block.

However, this may be a regression and not entirely accurate, as it appears there are errors that cause transactions to fail and not reach the consensus node, i.e., the transactionResponse is null. I think it’s worth investigating which other errors can hit this catch block without making it to the consensus node. Once we better understand these errors, we can design a more robust error-handling solution instead of throwing an INTERNAL_ERROR as we currently do in this line.

I hope this clarifies things and stays within the scope of the Transaction execution returns a null value... error.

@quiet-node Not quite sure what the relation to this PR is, but I believe we also handle these error here -> https://github.com/hashgraph/hedera-json-rpc-relay/blob/main/packages/relay/src/lib/eth.ts#L1482

Signed-off-by: Konstantina Blazhukova <[email protected]>
… to include transactionId

Signed-off-by: Konstantina Blazhukova <[email protected]>
Signed-off-by: Konstantina Blazhukova <[email protected]>
Signed-off-by: Konstantina Blazhukova <[email protected]>
Signed-off-by: Konstantina Blazhukova <[email protected]>
Signed-off-by: Konstantina Blazhukova <[email protected]>
Signed-off-by: Konstantina Blazhukova <[email protected]>
@konstantinabl konstantinabl force-pushed the 3160-fix-handle-timeout-exceeded-for-better-user-experience branch from a2773b5 to b83a40a Compare November 1, 2024 16:30
Signed-off-by: Konstantina Blazhukova <[email protected]>
Copy link

sonarcloud bot commented Nov 1, 2024

Quality Gate Failed Quality Gate failed

Failed conditions
22.6% Duplication on New Code (required ≤ 3%)

See analysis details on SonarCloud

Copy link
Member

@quiet-node quiet-node left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Approve to unblock.

Copy link
Collaborator

@Nana-EC Nana-EC left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice improvements. Left some comments.
I'm unblocking as the transaction is no longer being retried

packages/relay/src/lib/clients/sdkClient.ts Show resolved Hide resolved
packages/relay/src/lib/clients/sdkClient.ts Show resolved Hide resolved
packages/relay/src/lib/clients/sdkClient.ts Show resolved Hide resolved
@Nana-EC Nana-EC added enhancement New feature or request and removed Internal For changes that affect the project's internal workings but not its outward-facing functionality. labels Nov 4, 2024
@konstantinabl konstantinabl merged commit acc0b35 into main Nov 4, 2024
45 of 47 checks passed
@konstantinabl konstantinabl deleted the 3160-fix-handle-timeout-exceeded-for-better-user-experience branch November 4, 2024 15:16
Copy link

codecov bot commented Nov 4, 2024

Codecov Report

Attention: Patch coverage is 77.77778% with 6 lines in your changes missing coverage. Please review.

Project coverage is 84.55%. Comparing base (7fea98c) to head (ec0a0a1).
Report is 1 commits behind head on main.

Files with missing lines Patch % Lines
packages/relay/src/lib/clients/sdkClient.ts 73.91% 3 Missing and 3 partials ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #3161      +/-   ##
==========================================
+ Coverage   83.35%   84.55%   +1.19%     
==========================================
  Files          66       69       +3     
  Lines        4290     4473     +183     
  Branches      837      881      +44     
==========================================
+ Hits         3576     3782     +206     
+ Misses        473      428      -45     
- Partials      241      263      +22     
Flag Coverage Δ
config-service 98.14% <ø> (ø)
relay 85.28% <48.14%> (-0.26%) ⬇️
server 83.52% <ø> (ø)
ws-server 36.87% <ø> (ø)

Flags with carried forward coverage won't be shown. Click here to find out more.

Files with missing lines Coverage Δ
packages/relay/src/lib/errors/SDKClientError.ts 78.26% <100.00%> (+2.07%) ⬆️
.../relay/src/lib/services/hapiService/hapiService.ts 80.43% <100.00%> (+2.17%) ⬆️
packages/relay/src/lib/clients/sdkClient.ts 66.66% <73.91%> (+7.81%) ⬆️

... and 15 files with indirect coverage changes

ebadiere pushed a commit that referenced this pull request Nov 4, 2024
* Adds retry to eth send raw transaction

Signed-off-by: Konstantina Blazhukova <[email protected]>

* Adds mirror node check before retry and expands SDKClientError object to include transactionId

Signed-off-by: Konstantina Blazhukova <[email protected]>

* Adds unit tests

Signed-off-by: Konstantina Blazhukova <[email protected]>

* Adds improvements

Signed-off-by: Konstantina Blazhukova <[email protected]>

* adds relevant comment for hardcoded values

Signed-off-by: Konstantina Blazhukova <[email protected]>

* removes retry and adds mirror node check in the sdk client

Signed-off-by: Konstantina Blazhukova <[email protected]>

* Adds unit tests

Signed-off-by: Konstantina Blazhukova <[email protected]>

* Addresses PR comments

Signed-off-by: Konstantina Blazhukova <[email protected]>

* Changes the error thrown when timeout occurs to SDKClientError

Signed-off-by: Konstantina Blazhukova <[email protected]>

* Ensure other error types are thrown

Signed-off-by: Konstantina Blazhukova <[email protected]>

---------

Signed-off-by: Konstantina Blazhukova <[email protected]>
ebadiere pushed a commit that referenced this pull request Nov 4, 2024
* Adds retry to eth send raw transaction

Signed-off-by: Konstantina Blazhukova <[email protected]>

* Adds mirror node check before retry and expands SDKClientError object to include transactionId

Signed-off-by: Konstantina Blazhukova <[email protected]>

* Adds unit tests

Signed-off-by: Konstantina Blazhukova <[email protected]>

* Adds improvements

Signed-off-by: Konstantina Blazhukova <[email protected]>

* adds relevant comment for hardcoded values

Signed-off-by: Konstantina Blazhukova <[email protected]>

* removes retry and adds mirror node check in the sdk client

Signed-off-by: Konstantina Blazhukova <[email protected]>

* Adds unit tests

Signed-off-by: Konstantina Blazhukova <[email protected]>

* Addresses PR comments

Signed-off-by: Konstantina Blazhukova <[email protected]>

* Changes the error thrown when timeout occurs to SDKClientError

Signed-off-by: Konstantina Blazhukova <[email protected]>

* Ensure other error types are thrown

Signed-off-by: Konstantina Blazhukova <[email protected]>

---------

Signed-off-by: Konstantina Blazhukova <[email protected]>
Signed-off-by: Eric Badiere <[email protected]>
ebadiere added a commit that referenced this pull request Nov 4, 2024
* Adds retry to eth send raw transaction



* Adds mirror node check before retry and expands SDKClientError object to include transactionId



* Adds unit tests



* Adds improvements



* adds relevant comment for hardcoded values



* removes retry and adds mirror node check in the sdk client



* Adds unit tests



* Addresses PR comments



* Changes the error thrown when timeout occurs to SDKClientError



* Ensure other error types are thrown



---------

Signed-off-by: Konstantina Blazhukova <[email protected]>
Signed-off-by: Eric Badiere <[email protected]>
Co-authored-by: konstantinabl <[email protected]>
@Nana-EC
Copy link
Collaborator

Nana-EC commented Nov 11, 2024

@konstantinabl we should fix the title of the PR as no retry logic was added

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

Successfully merging this pull request may close these issues.

fix: handle timeout exceeded for better user experience
5 participants