Staggered upgrade procedure for OVS clouds #1408

Open
wants to merge 7 commits into
base: stackhpc/2024.1
80 changes: 80 additions & 0 deletions doc/source/operations/upgrading-openstack.rst
@@ -1063,6 +1063,12 @@ This will block the upgrade, but may be overridden by setting
``etc/kayobe/kolla/globals.yml`` or
``etc/kayobe/environments/<env>/kolla/globals.yml``.

Depending on the networking architecture of your cloud, the steps used
to upgrade the containerised services will differ.

OVN
^^^

To upgrade the containerised control plane services:

.. code-block:: console
@@ -1076,6 +1082,80 @@ scope of the upgrade:

   kayobe overcloud service upgrade --tags config --kolla-tags keystone

OVS
^^^

You should first stop the Octavia health manager to prevent alerts during
the service upgrade.

.. code-block:: console

   kayobe overcloud host command run --command "docker stop octavia_health_manager" --limit controllers --become

For dedicated network nodes, upgrade the control plane services:

.. code-block:: console

   kayobe overcloud service upgrade --kolla-limit controllers

For converged network nodes, you should specify the service limit to only
upgrade the Neutron API service.

.. code-block:: console

   kayobe overcloud service upgrade --kolla-limit controllers -ke neutron_service_limit=neutron-server

To ensure L3 reliability during the upgrade, each network node must be drained
of its agents and then upgraded with a targeted upgrade, one node at a time.

Kolla credentials must be activated before running the
``neutron_namespace_drain`` role:

.. code-block:: console

   source $KOLLA_CONFIG_PATH/public-openrc.sh
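
As a hedged illustration, a small guard such as the following could confirm
that credentials are present in the current shell before the drain playbook is
run. The function name ``require_env`` is hypothetical. The variable names in
the usage comment are the standard OpenStack client variables, and it is an
assumption that ``public-openrc.sh`` exports all of them in your deployment.

```shell
#!/bin/sh
# Hypothetical helper: return non-zero if any named variable is unset or empty.
require_env() {
    missing=0
    for name in "$@"; do
        # Indirectly read the variable whose name is stored in $name.
        eval "value=\${$name:-}"
        if [ -z "$value" ]; then
            echo "missing environment variable: $name" >&2
            missing=1
        fi
    done
    return "$missing"
}

# Usage, after sourcing public-openrc.sh (variable names are an assumption):
#   require_env OS_AUTH_URL OS_USERNAME OS_PASSWORD || exit 1
```

The check only inspects the shell environment. It does not contact Keystone,
so it cannot detect expired or incorrect credentials.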

Substitute ``<network0>`` with the first network node to be drained. To put
the node into maintenance and begin draining the agents:

.. code-block:: console

   kayobe playbook run $KAYOBE_CONFIG_PATH/ansible/neutron-l3-drain.yml -e neutron_namespace_drain_host=<network0> -e maintenance=true -e neutron_namespace_drain_dhcp_agents=true

You can monitor the L3/DHCP agents being drained from the node by running:

.. code-block:: console

   ssh -t <network0> watch ip netns ls
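
Instead of watching the namespace list by hand, the check could be scripted.
The following is a minimal sketch, not part of this change: the function names
``count_neutron_namespaces`` and ``wait_for_drain`` are hypothetical, and it
assumes that a drained node has no remaining ``qrouter-`` or ``qdhcp-``
namespaces (the usual Neutron naming for router and DHCP namespaces). Other
namespace types, if present in your deployment, are not counted.

```shell
#!/bin/sh
# Count router/DHCP namespaces in `ip netns ls` output read from stdin.
# grep -c exits non-zero when the count is 0, so mask that status.
count_neutron_namespaces() {
    grep -cE '^(qrouter|qdhcp)-' || true
}

# Hypothetical helper: poll a host until no such namespaces remain.
# Usage: wait_for_drain <host> [interval_seconds]
wait_for_drain() {
    host=$1
    interval=${2:-10}
    while :; do
        # Fail loudly if SSH fails, rather than treating it as "drained".
        listing=$(ssh "$host" ip netns ls) || {
            echo "ssh to $host failed" >&2
            return 1
        }
        remaining=$(printf '%s\n' "$listing" | count_neutron_namespaces)
        [ "$remaining" -eq 0 ] && break
        echo "$host: $remaining namespaces remaining"
        sleep "$interval"
    done
    echo "$host: drained"
}
```

If DHCP agents are not being drained, ``qdhcp-`` namespaces will remain on the
node and the pattern would need to be narrowed to ``qrouter-`` only.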

Once all agents have been drained, you can upgrade the containerised services
on the network node. For dedicated network nodes run:

.. code-block:: console

   kayobe overcloud service upgrade --kolla-limit <network0>

Converged network nodes require the service limit to be specified for the
Neutron agents:

.. code-block:: console

   kayobe overcloud service upgrade --kolla-limit <network0> -ke neutron_service_limit='neutron-openvswitch-agent,neutron-dhcp-agent,neutron-l3-agent,neutron-metadata-agent,ironic-neutron-agent'

Following the service upgrade, the agents can be restored on the node by disabling maintenance:

.. code-block:: console

   kayobe playbook run $KAYOBE_CONFIG_PATH/ansible/neutron-l3-drain.yml -e neutron_namespace_drain_host=<network0> -e maintenance=false -e neutron_namespace_drain_dhcp_agents=true

Repeat the above steps for each remaining network node. Once all network nodes
have been upgraded, the remaining containerised services can be upgraded:

.. code-block:: console

   kayobe overcloud service upgrade --kolla-tags common,nova,prometheus,openvswitch,neutron --skip-prechecks -kl controllers,compute --limit controllers,compute
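
The per-node drain, upgrade and restore cycle described above could be wrapped
in a loop so that nodes are handled strictly one at a time. This is only a
sketch under stated assumptions: the helper names ``run``,
``upgrade_network_node`` and ``upgrade_network_nodes`` and the ``DRY_RUN``
variable are hypothetical, the node names in the usage note are placeholders,
and the sketch covers dedicated network nodes only (converged nodes would also
need the ``neutron_service_limit`` override shown earlier). It prints the
commands by default rather than executing them.

```shell
#!/bin/sh
# Print the command (DRY_RUN=1, the default) or execute it (DRY_RUN=0).
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        printf '+ %s\n' "$*"
    else
        "$@"
    fi
}

# Drain, upgrade and restore a single dedicated network node.
upgrade_network_node() {
    node=$1
    playbook="${KAYOBE_CONFIG_PATH:?KAYOBE_CONFIG_PATH must be set}/ansible/neutron-l3-drain.yml"
    run kayobe playbook run "$playbook" \
        -e "neutron_namespace_drain_host=$node" \
        -e maintenance=true \
        -e neutron_namespace_drain_dhcp_agents=true || return 1
    # When executing for real, pause so the operator can confirm the drain.
    if [ "${DRY_RUN:-1}" != "1" ]; then
        printf 'Press Enter once %s is fully drained: ' "$node"
        read -r reply
    fi
    run kayobe overcloud service upgrade --kolla-limit "$node" || return 1
    run kayobe playbook run "$playbook" \
        -e "neutron_namespace_drain_host=$node" \
        -e maintenance=false \
        -e neutron_namespace_drain_dhcp_agents=true || return 1
}

# Process nodes sequentially, stopping at the first failure.
upgrade_network_nodes() {
    for n in "$@"; do
        upgrade_network_node "$n" || return 1
    done
}
```

For example, ``upgrade_network_nodes net0 net1`` prints the six commands for
two placeholder nodes. Reviewing that output before setting ``DRY_RUN=0``
keeps the operator in control of each step.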

Updating the Octavia Amphora Image
----------------------------------

25 changes: 25 additions & 0 deletions etc/kayobe/ansible/neutron-l3-drain.yml
@@ -0,0 +1,25 @@
---
- name: Drain neutron of l3 agents and dhcp agents
  hosts: localhost
  gather_facts: true
  tags:
    - neutron-l3-drain
  vars:
    maintenance: false
    neutron_namespace_drain_ctrl1: false
    neutron_namespace_drain_ctrl2: false
    neutron_namespace_drain_ctrl3: false
  tasks:
    - name: Drain hosts
      ansible.builtin.import_role:
        name: stackhpc.openstack_ops.neutron_namespace_drain
        tasks_from: main.yml
      when: neutron_namespace_drain_ctrl1 | bool or neutron_namespace_drain_ctrl2 | bool or neutron_namespace_drain_ctrl3 | bool or neutron_namespace_drain_host is defined

    - name: Print info
      ansible.builtin.debug:
        msg:
          - "{{ neutron_namespace_drain_host }} is ready for maintenance"
          - "rerun this playbook with -e maintenance=false to re-add"
          - "routers"
      when: maintenance | bool
3 changes: 3 additions & 0 deletions etc/kayobe/ansible/requirements.yml
@@ -12,6 +12,9 @@ collections:
    version: 2.5.1
  - name: stackhpc.kayobe_workflows
    version: 1.1.0
  - name: https://github.com/stackhpc/ansible-collection-openstack-ops
    type: git
    version: feature/neutron-namespace-drain
roles:
  - src: stackhpc.vxlan
  - name: ansible-lockdown.ubuntu22_cis