
Fix flush race #1363

Merged
merged 4 commits on Jan 30, 2024
Conversation


@agouin agouin commented Dec 16, 2023

Without this change, acks are not fully cleared from the caches when the relayer gives up on sending a packet message after max retries.

I think this is caused by a race condition with flushing. Flushing performs queries and enqueues chantypes.EventTypeSendPacket or [chantypes.EventTypeRecvPacket, chantypes.EventTypeWriteAck] events when recvs or acks need to be relayed, respectively. These can be enqueued after the respective action was already successful and observed, which I believe is causing the looping ACKs.
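To make the interleaving concrete, below is a minimal sketch of the ordering (the package, types, and function names are hypothetical; only the event types above come from the codebase): an ack is observed and clears the cache, but a flush whose query ran before the ack re-enqueues events for the sequence, and nothing will clear them again.

```go
package flushrace

// cache stands in for the per-channel packet message cache, keyed by
// sequence; the event strings stand in for chantypes.EventTypeRecvPacket
// and chantypes.EventTypeWriteAck.
type cache map[uint64][]string

// onAckObserved models the chain processor seeing acknowledge_packet for
// a sequence and clearing its retention.
func onAckObserved(c cache, seq uint64) {
	delete(c, seq)
}

// flushEnqueue models the flush path: it enqueues recv/ack events for
// every sequence its query reported as unrelayed. If the query snapshot
// predates the ack above, seq is re-added here after being cleared, and
// no further acknowledge_packet event will arrive to clear it, which is
// the looping-ACK symptom.
func flushEnqueue(c cache, unrelayed []uint64) {
	for _, seq := range unrelayed {
		c[seq] = append(c[seq], "recv_packet", "write_acknowledgement")
	}
}
```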

Some potential ways to handle this:

  • maintain a separate cache (or build into the existing PacketSequenceCache) of per-sequence state, where entries expire after some amount of time. Then, when flushing, check whether the sequence is already beyond the state that is about to be enqueued; if so, don't enqueue it (see the sketch after this list).
  • clear a sequence's caches when an ACK we have broadcast succeeds (since we will not get another acknowledge_packet event for the uneffected acks, which is currently what is needed to clear the caches)
  • pause the CosmosChainProcessors while performing a flush (not desirable, as it would slow down relay performance)
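A sketch of the first option, assuming a hypothetical sequenceStateCache (the names, state ordering, and TTL handling are illustrative, not existing relayer API): chain processors record the latest state observed per sequence, and flush consults it before enqueueing, skipping events the sequence has already moved past.

```go
package flushguard

import (
	"sync"
	"time"
)

// packetState orders the packet lifecycle so later states compare greater.
type packetState int

const (
	stateSent packetState = iota
	stateReceived
	stateAcked
)

type seqEntry struct {
	state    packetState
	observed time.Time
}

// sequenceStateCache remembers the most recent state observed for each
// sequence; entries older than ttl expire, so the map does not grow forever.
type sequenceStateCache struct {
	mu      sync.Mutex
	entries map[uint64]seqEntry
	ttl     time.Duration
}

// observe records a newly observed state, never moving a sequence backward.
func (c *sequenceStateCache) observe(seq uint64, s packetState) {
	c.mu.Lock()
	defer c.mu.Unlock()
	if e, ok := c.entries[seq]; !ok || s > e.state {
		c.entries[seq] = seqEntry{state: s, observed: time.Now()}
	}
}

// shouldEnqueue reports whether flush may enqueue an event that would move
// the sequence to state s. Unknown or expired sequences are allowed through.
func (c *sequenceStateCache) shouldEnqueue(seq uint64, s packetState) bool {
	c.mu.Lock()
	defer c.mu.Unlock()
	e, ok := c.entries[seq]
	if !ok || time.Since(e.observed) > c.ttl {
		delete(c.entries, seq)
		return true
	}
	// Skip (return false) when the sequence is already at or past state s.
	return s > e.state
}
```

With this guard, a flush whose query ran before an ack landed would call shouldEnqueue(seq, stateReceived) and get false once observe(seq, stateAcked) has run, so the stale events are never enqueued.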

Resolves #1359

@agouin agouin changed the title from "Remove writeack too when removing packet retention" to "Fix flush race" on Dec 16, 2023
@agouin agouin force-pushed the andrew/delete_write_ack_too branch from fe6793a to 34d60fa on January 30, 2024 19:35
@agouin agouin marked this pull request as ready for review January 30, 2024 20:57
@jtieri jtieri (Member) left a comment

After our discussion in Slack I think I have my head completely wrapped around the changes here! Everything looks good; the only thing I can think of is that we may want to make the value passed into Prune configurable at some point, but that's not necessary right now.

Development

Successfully merging this pull request may close these issues.

v2.4.x - Update Client and ACKs in loop