Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Best Practices for storage network configuration #709

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

rrajendran17
Copy link

Copy link

github-actions bot commented Jan 22, 2025

Name Link
🔨 Latest commit af3d934
😎 Deploy Preview https://67a262a1eb580500af33aee7--harvester-preview.netlify.app

Copy link
Member

@w13915984028 w13915984028 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thanks.

If comment on harvester/harvester#7422 is accepted, then please update the last section accordingly.

since validation added for this in https://github.com/harvester/harvester/pull/7422/files, removing the last line regarding the mgmt network.

@rrajendran17
Copy link
Author

I also observed that harvester latest docs still mentions about "instance-manager-r" and "instance-manager-e".
Created this harvester/harvester#7446 to take care of the changes.

Copy link
Member

@starbops starbops left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thank you.

docs/advanced/storagenetwork.md Outdated Show resolved Hide resolved
docs/advanced/storagenetwork.md Outdated Show resolved Hide resolved
docs/advanced/storagenetwork.md Outdated Show resolved Hide resolved
docs/advanced/storagenetwork.md Outdated Show resolved Hide resolved
docs/advanced/storagenetwork.md Outdated Show resolved Hide resolved
docs/advanced/storagenetwork.md Outdated Show resolved Hide resolved
Copy link
Contributor

@jillian-maroket jillian-maroket left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Review done. @ihcsim is correct in that the list items must be worded as best practices. We must state the recommended action before mentioning other options.

Comment on lines 390 to 395
- When allocating IP range for storage network using this https://docs.harvesterhci.io/v1.4/advanced/storagenetwork/#configuration-example
consider allocating sufficient IP addresses keeping in mind the future needs of the cluster.
When new nodes are added to the cluster or more disks added to the node after storage network configuration and if required number of IPs becomes
greater than allocated, then it results in Longhorn pods (`instance-manager`, and `backing-image-manager`) not in running state.

Storage network has to be reconfigured with sufficient IP range to solve this.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
- When allocating IP range for storage network using this https://docs.harvesterhci.io/v1.4/advanced/storagenetwork/#configuration-example
consider allocating sufficient IP addresses keeping in mind the future needs of the cluster.
When new nodes are added to the cluster or more disks added to the node after storage network configuration and if required number of IPs becomes
greater than allocated, then it results in Longhorn pods (`instance-manager`, and `backing-image-manager`) not in running state.
Storage network has to be reconfigured with sufficient IP range to solve this.
- When configuring an [IP range](https://docs.harvesterhci.io/v1.4/advanced/storagenetwork/#configuration-example) for the storage network, ensure that the allocated IP addresses can service the future needs of the cluster. This is important because Longhorn pods (`instance-manager` and `backing-image-manager`) stop running when new nodes are added to the cluster or more disks are added to a node after the storage network is configured, and when the required number of IPs exceeds the allocation. Resolving the issue involves reconfiguring the storage network with the correct IP range.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @jillian-maroket. I have updated the PR as suggested.
"when the required number of IPs exceeds the allocation" - I changed allocation to "allocated IPs" to be more clearer.

Comment on lines 397 to 402
- Users are allowed to configure storage networks using mgmt cluster, but it is always recommended to configure storage network on a non mgmt cluster to
ensure complete separation of the Longhorn replication traffic from the pod network traffic.

- Configuring the storage network on the managment (`mgmt`) network will have a negative impact on the network performance, due to resource and
bandwidth contention as same NIC is shared by both. Use this option only if you could efficiently segregate the two types of traffic or your cluster has
some NIC constraints.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
- Users are allowed to configure storage networks using mgmt cluster, but it is always recommended to configure storage network on a non mgmt cluster to
ensure complete separation of the Longhorn replication traffic from the pod network traffic.
- Configuring the storage network on the managment (`mgmt`) network will have a negative impact on the network performance, due to resource and
bandwidth contention as same NIC is shared by both. Use this option only if you could efficiently segregate the two types of traffic or your cluster has
some NIC constraints.
- Configure the storage network on a non-`mgmt` cluster network to ensure complete separation of the Longhorn replication traffic from the Kubernetes control plane traffic. Using `mgmt` is possible but not recommended because of the negative impact (resource and bandwidth contention) on the control plane network performance. Use `mgmt` only if your cluster has NIC-related constraints and if you can completely segregate the traffic.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The second and third items are related so I combined them. In addition, the third point was originally phrased as a limitation rather than a recommendation.

I use mgmt instead of "management network" to refer to the built-in cluster network. This wording decision was based on how @starbops explained the different networks.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @jillian-maroket for the suggestions.I have incorporated as suggested.

Copy link
Contributor

@ihcsim ihcsim left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM - just one comment about the href.


## Best Practices

- When configuring an [IP range](https://docs.harvesterhci.io/v1.4/advanced/storagenetwork/#configuration-example) for the storage network, ensure that the allocated IP addresses can service the future needs of the cluster. This is important because Longhorn pods (`instance-manager` and `backing-image-manager`) stop running when new nodes are added to the cluster or more disks are added to a node after the storage network is configured, and when the required number of IPs exceeds the allocated IPs. Resolving the issue involves reconfiguring the storage network with the correct IP range.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should use relative href like:

Harvester also introduced storage networking to separate the storage traffic from other cluster-wide workloads. Please refer to [the storage network document](../advanced/storagenetwork.md) for more details.

Otherwise, it will always be pinned to the 1.4 version.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants