diff --git a/source/images/aws_cluster_images_datastore.png b/source/images/aws_cluster_images_datastore.png index b470ea013..a58589db0 100644 Binary files a/source/images/aws_cluster_images_datastore.png and b/source/images/aws_cluster_images_datastore.png differ diff --git a/source/images/minione-aws-ubuntu24.04.png b/source/images/minione-aws-ubuntu24.04.png new file mode 100644 index 000000000..a521e3edc Binary files /dev/null and b/source/images/minione-aws-ubuntu24.04.png differ diff --git a/source/images/sunstone-aws_cluster_download_oneke.png b/source/images/sunstone-aws_cluster_download_oneke.png new file mode 100644 index 000000000..6d0a4ed17 Binary files /dev/null and b/source/images/sunstone-aws_cluster_download_oneke.png differ diff --git a/source/images/sunstone-aws_cluster_replica_host.png b/source/images/sunstone-aws_cluster_replica_host.png index 73b6b7899..756b3506f 100644 Binary files a/source/images/sunstone-aws_cluster_replica_host.png and b/source/images/sunstone-aws_cluster_replica_host.png differ diff --git a/source/images/sunstone-aws_edge_cluster_deploying.png b/source/images/sunstone-aws_edge_cluster_deploying.png new file mode 100644 index 000000000..e3b65bc6e Binary files /dev/null and b/source/images/sunstone-aws_edge_cluster_deploying.png differ diff --git a/source/images/sunstone-aws_edge_cluster_sys_ds.png b/source/images/sunstone-aws_edge_cluster_sys_ds.png new file mode 100644 index 000000000..ac3a80405 Binary files /dev/null and b/source/images/sunstone-aws_edge_cluster_sys_ds.png differ diff --git a/source/images/sunstone-aws_k8s_vms_list.png b/source/images/sunstone-aws_k8s_vms_list.png new file mode 100644 index 000000000..10bd5fb49 Binary files /dev/null and b/source/images/sunstone-aws_k8s_vms_list.png differ diff --git a/source/images/sunstone-aws_kubernetes_vnf_ip.png b/source/images/sunstone-aws_kubernetes_vnf_ip.png new file mode 100644 index 000000000..1af726b3e Binary files /dev/null and 
b/source/images/sunstone-aws_kubernetes_vnf_ip.png differ diff --git a/source/images/sunstone_kubernetes_netw_dropdowns.png b/source/images/sunstone_kubernetes_netw_dropdowns.png new file mode 100644 index 000000000..9daa2d539 Binary files /dev/null and b/source/images/sunstone_kubernetes_netw_dropdowns.png differ diff --git a/source/quick_start/deployment_basics/try_opennebula_on_kvm.rst b/source/quick_start/deployment_basics/try_opennebula_on_kvm.rst index 4f8599878..6034241cd 100644 --- a/source/quick_start/deployment_basics/try_opennebula_on_kvm.rst +++ b/source/quick_start/deployment_basics/try_opennebula_on_kvm.rst @@ -61,15 +61,15 @@ To run the miniONE script on AWS, you will need to instantiate a virtual machine - 2616 (for the FireEdge GUI) - 5030 (for the OneGate service) -.. tip:: To quickly deploy a suitable VM, browse the AWS AMI Catalog and select ``Ubuntu Server 22.04 LTS (HVM), SSD Volume Type``: +.. tip:: To quickly deploy a suitable VM, browse the AWS AMI Catalog and select **Ubuntu Server 24.04 LTS (HVM), SSD Volume Type**: - .. image:: /images/minione-aws-ubuntu22.04.png + .. image:: /images/minione-aws-ubuntu24.04.png :align: center Below is an example of a successfully-tested configuration (though by no means the only possible one): - Region: Frankfurt -- Operating System: Ubuntu Server 22.04 LTS (HVM) +- Operating System: Ubuntu Server 24.04 LTS (HVM) - Tier: ``t2.medium`` - Open ports: 22, 80, 2616, 5030 - Storage: 80 GB SSD @@ -120,7 +120,7 @@ Once you have logged in to the VM as user ``ubuntu``, use the ``sudo`` command t .. prompt:: - sudo su - + sudo -i Then, update the system to its latest software packages by running the following command: @@ -128,6 +128,35 @@ Then, update the system to its latest software packages by running the following apt update && apt upgrade +After updating, you will probably need to restart the VM to run the latest kernel. Check the output of the ``apt upgrade`` command for lines similar to the following: + +.. 
prompt:: + + Pending kernel upgrade! + Running kernel version: + 6.8.0-1012-aws + Diagnostics: + The currently running kernel version is not the expected kernel version 6.8.0-1014-aws. + +In this example, you need to restart the VM in order to upgrade to kernel 6.8.0-1014-aws. To restart the VM, run: + +.. prompt:: + + shutdown -r now + +You will be immediately logged out of the VM as it restarts. Wait a few moments for the VM to finish rebooting, then log in again using the same procedure as before. After logging back into the VM, you can check the running kernel version with: + +.. prompt:: + + uname -a + +For example, in this case: + +.. prompt:: + + $ uname -a + Linux ip-172-31-3-252 6.8.0-1014-aws #15-Ubuntu SMP Thu Aug 8 19:13:06 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux + Your AWS VM is now ready. In the next steps, we’ll download the miniONE script, upload it to the VM, and run the installation. Step 3: Download and install miniONE @@ -162,11 +191,25 @@ Step 3.2. Run the miniONE script on the AWS VM After copying the miniONE script to the VM, log in to the VM (as described :ref:`above `). -Use the ``sudo`` command to become the ``root`` user. +Use the ``sudo`` command to become the ``root`` user: + +.. prompt:: + + sudo -i + +If necessary, use the ``cd`` command to navigate to the folder where you copied the miniONE script. For example, if you copied it to the home directory of user ``ubuntu`` run: + +.. prompt:: + + cd ~ubuntu + +Next, ensure that the ``minione`` file has execute permissions, by running: + +.. prompt:: -If necessary, use the ``cd`` command to navigate to the folder where you copied the miniONE script. For example, if you copied it to the home directory of user ``ubuntu`` run ``cd ~ubuntu``. + chmod +x minione -To install miniONE, run: +To install miniONE, run as root: .. 
prompt:: diff --git a/source/quick_start/usage_basics/running_kubernetes_clusters.rst b/source/quick_start/usage_basics/running_kubernetes_clusters.rst index 31758eac4..009b9566b 100644 --- a/source/quick_start/usage_basics/running_kubernetes_clusters.rst +++ b/source/quick_start/usage_basics/running_kubernetes_clusters.rst @@ -39,7 +39,11 @@ Follow these steps: :align: center :scale: 50% - #. Select the **system** datastore for the AWS cluster. (If you began this Quick Start Guide on a clean install, it will probably display ID ``101``.) + #. Select the **system** datastore for the AWS cluster. (If you began this Quick Start Guide on a clean install, it will probably display ID ``100``.) + + .. image:: /images/sunstone-aws_edge_cluster_sys_ds.png + :align: center + #. Sunstone will display the **Info** panel for the datastore. Scroll down to the **Attributes** section and find the ``REPLICA_HOST`` attribute. Hover your mouse to the right, to display the **Copy**/**Edit**/**Delete** icons |icon3| for the attribute value: .. image:: /images/sunstone-aws_cluster_replica_host.png @@ -49,6 +53,7 @@ Follow these steps: | #. Click the **Delete** icon |icon4|. + #. When Sunstone requests to confirm the action, click **Yes**. You have deleted the ``REPLICA_HOST`` parameter from the datastore. In the next step we’ll download the OneKE appliance. @@ -61,8 +66,6 @@ The `OpenNebula Public Marketplace `__ is a r The Kubernetes cluster is packaged in a multi-VM service appliance listed as **Service OneKE **. To download it, follow the same steps as when downloading the WordPress VM: -Log in to Sunstone as user ``oneadmin``. - Open the left-hand pane, then select **Storage** -> **Apps**. Sunstone will display the **Apps** screen, showing the first page of apps that are available for download. .. image:: /images/sunstone-apps_list.png @@ -81,7 +84,13 @@ In the search field at the top, type ``oneke`` to filter by name. 
Then, select * Click the **Import into Datastore** |icon1| icon. -As with the WordPress appliance, Sunstone displays the **Download App to OpenNebula** wizard. In the first screen of the wizard, click **Next**. In the second screen you will need to select a datastore for the appliance. Select the **aws-edge-cluster-image** datastore. +As with the WordPress appliance, Sunstone displays the **Download App to OpenNebula** wizard. In the first screen of the wizard, click **Next**. + +.. image:: /images/sunstone-aws_cluster_download_oneke.png + :align: center + :scale: 60% + +In the second screen you will need to select a datastore for the appliance. Select the **aws-edge-cluster-image** datastore. |kubernetes-qs-marketplace-datastore| @@ -118,6 +127,8 @@ Sunstone displays the **Address Range** dialog box. Here you can define an addre |kubernetes-aws-private-network-range| +Click **Accept**. + Lastly, you will need to add a DNS server for the network. Select the **Context** tab, then the **DNS** input field. Type the address for the DNS server, such as ``8.8.8.8`` or ``1.1.1.1``. |kubernetes-aws-dns| @@ -192,7 +203,13 @@ Click **Next** to go to the next screen, **Network**. Select the Public and Private Networks ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ -The Kubernetes cluster needs access to the private and the public network defined for the Edge Cluster. First we’ll select the public network. Check that the **Network ID** drop-down menu displays ``Public``, then select the **metal-aws-edge-cluster-public** network. +The Kubernetes cluster needs access to the private and the public network defined for the Edge Cluster. First we’ll select the public network. + +Set the **Network ID** drop-down menu to ``Public`` and the **Network Type** drop-down menu to ``Existing``. + +.. image:: /images/sunstone_kubernetes_netw_dropdowns.png + +Then, select the **metal-aws-edge-cluster-public** network. 
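Before instantiating the service, you can optionally confirm from the Front-end CLI that both Edge Cluster networks exist. The sketch below filters a hypothetical captured sample of ``onevnet list`` output (the network names are the ones used in this guide and may differ in your deployment; in practice, pipe the live command output instead of the sample):

```shell
# Hypothetical capture of "onevnet list" output on the Front-end.
# In a live deployment, run "onevnet list" as oneadmin and pipe it to awk.
sample='  ID USER     GROUP    NAME                           CLUSTERS LEASES
   1 oneadmin oneadmin metal-aws-edge-cluster-private 100           0
   0 oneadmin oneadmin metal-aws-edge-cluster-public  100           1'

# Keep only the rows whose NAME column matches the Edge Cluster networks
printf '%s\n' "$sample" | awk '$4 ~ /^metal-aws-edge-cluster-(public|private)$/ {print $4}'
```

If either name is missing from the output, revisit the network creation steps earlier in this guide.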
|kubernetes-qs-pick-networks-public| @@ -226,12 +243,11 @@ To verify that the VMs for the cluster were correctly deployed, you can use the .. prompt:: bash $ auto [oneadmin@FN]$ onevm list - ID USER GROUP NAME STAT CPU MEM HOST TIME - 5 oneadmin oneadmin storage_0_(service_3) runn 2 3G 0d 00h05 - 4 oneadmin oneadmin worker_0_(service_3) runn 2 3G 0d 00h05 - 3 oneadmin oneadmin master_0_(service_3) runn 2 3G 0d 00h05 - 2 oneadmin oneadmin vnf_0_(service_3) runn 1 2G 0d 00h06 - 1 oneadmin oneadmin Service WordPress - KVM-1 runn 1 2G 54.235.30.169 0d 00h21 + ID USER GROUP NAME STAT CPU MEM HOST TIME + 3 oneadmin oneadmin worker_0_(service_3) runn 2 3G 0d 00h31 + 2 oneadmin oneadmin master_0_(service_3) runn 2 3G 0d 00h31 + 1 oneadmin oneadmin vnf_0_(service_3) runn 1 512M 0d 00h31 + 0 oneadmin oneadmin Service WordPress - KVM-0 runn 1 768M 0d 01h22 At this point you have successfully instantiated the Kubernetes cluster. Before deploying an application, you need to find out the **public** IP address of the VNF node, since we will use it later to connect to the master Kubernetes node. @@ -240,15 +256,18 @@ At this point you have successfully instantiated the Kubernetes cluster. Before Check the IP Address for the VNF Node ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ -To see the IP in Sunstone, go to **Instances** -> **VMs**, then check the **IP** column for the VNF VM. +To check the VNF node IP in Sunstone, in the left-hand pane go to **Instances** -> **VMs**, then check the information displayed under **vnf_0_(service_)**. The IP is displayed on the right, highlighted in the image below (note that all public IPs have been blurred in the image): + + .. image:: /images/sunstone-aws_k8s_vms_list.png + :align: center Alternatively, to check on the command line, log in to the Front-end and run: .. 
prompt:: bash $ auto - [oneadmin@FN]$ onevm show -j |jq -r .VM.TEMPLATE.NIC[0].EXTERNAL_IP + onevm show -j |jq -r .VM.TEMPLATE.NIC[0].EXTERNAL_IP -Replace ```` with the ID of the VNF VM as listed by the ``onevm list`` command (ID ``2`` in the example above). +Replace ```` with the ID of the VNF VM as listed by the ``onevm list`` command (ID ``1`` in the example above). If you do not see all VMs listed, or if the OneKE Service is stuck in ``DEPLOYING``, see :ref:`Known Issues ` below. @@ -277,17 +296,17 @@ To deploy an application, we will first connect to the master Kubernetes node vi For connecting to the master Kubernetes node, you need to know the public address (AWS elastic IP) of the VNF node, as described :ref:`above `. -Once you know the correct IP, from the Front-end node connect to the master Kubernetes node with this command: +Once you know the correct IP, from the Front-end node connect to the master Kubernetes node with the below command (replace “1.2.3.4” with the public IP address of the VNF node): .. prompt:: bash $ auto - $ ssh -A -J root@ root@172.20.0.2 + $ ssh -A -J root@1.2.3.4 root@172.20.0.2 In this example, ``172.20.0.2`` is the private IP address of the Kubernetes master node (the second address in the private network). .. tip:: - If you don't use ``ssh-agent`` then you may skip the ``-A`` flag in the above command. You will need to copy your *private* ssh key (used to connect to VNF) into the VNF node itself, at the location ``~/.ssh/id_rsa``. Make sure that the file permissions are correct, i.e. ``0600`` (or ``u=rw,go=``). For example: + If you don’t use ``ssh-agent`` then you may skip the ``-A`` flag in the above command. You will need to copy your *private* ssh key (used to connect to VNF) into the VNF node itself, at the location ``~/.ssh/id_rsa``. Make sure that the file permissions are correct, i.e. ``0600`` (or ``u=rw,go=``). For example: .. 
prompt:: bash $ auto @@ -386,7 +405,7 @@ OneFlow Service is Stuck in ``DEPLOYING`` An error in network configuration, or any major failure (such as network timeouts or performance problems) can cause the OneKE service to lock up due to a communications outage between it and the Front-end node. The OneKE service will lock if *any* of the VMs belonging to it does not report ``READY=YES`` to OneGate within the default time. -If one or more of the VMs in the Kubernetes cluster never leave the ``DEPLOYING`` state, you can troubleshoot OneFlow communications by inspecting the file ``/var/log/oneflow.log`` on the Front-end node. Look for a line like the following: +If one or more of the VMs in the Kubernetes cluster never leave the ``DEPLOYING`` state, you can troubleshoot OneFlow communications by inspecting the file ``/var/log/one/oneflow.log`` on the Front-end node. Look for a line like the following: .. code-block:: text @@ -402,7 +421,7 @@ To recreate the VM instance, you must first terminate the OneKE service. A servi .. prompt:: bash $ auto - [oneadmin@FN]$ oneflow recover --delete + oneflow recover --delete Then, re-instantiate the service from the Sunstone UI: in the left-hand pane, **Service Templates** -> **OneKE 1.29**, then click the **Instantiate** icon. @@ -411,7 +430,7 @@ Lack of Connectivity to the OneGate Server Another possible cause for VMs in the Kubernetes cluster failing to run is lack of contact between the VNF node in the cluster and the OneGate server on the Front-end. -As described in :ref:`Quick Start Using miniONE on AWS `, the AWS instance where the Front-end is running needs to allow incoming connections for port 5030. If you do not want to open the port for all addresses, check the **public** IP address of the VNF node (the AWS Elastic IP, see :ref:`above `), and create an inbound rule in the AWS security groups that IP. 
+As described in :ref:`Quick Start Using miniONE on AWS `, the AWS instance where the Front-end is running must allow incoming connections for port 5030. If you do not want to open the port for all addresses, check the **public** IP address of the VNF node (the AWS Elastic IP, see :ref:`above `), and create an inbound rule in the AWS security groups for that IP. In cases of lack of connectivity with the OneGate server, the ``/var/log/one/oneflow.log`` file on the Front-end will display messages like the following: @@ -422,43 +441,121 @@ In cases of lack of connectivity with the OneGate server, the ``/var/log/one/one In this scenario only the VNF node is successfully deployed, but no Kubernetes nodes. -To troubleshoot, log in to the VNF node via SSH. Then, check if the VNF node is able to contact the OneGate server on the Front-end node, by running this command as root: +To troubleshoot, follow these steps: -.. prompt:: bash $ auto + #. Find out the IP address of the VNF node, as described :ref:`above `. + #. Log in to the VNF node via ssh as root. + #. Check if the VNF node is able to contact the OneGate server on the Front-end node, by running this command: - [root@VNF]$ onegate vm show + .. prompt:: bash $ auto -A successful response should look like: + onegate vm show -.. code-block:: text + A successful response should look like: - [root@VNF]$ onegate vm show - VM 0 - NAME : vnf_0_(service_3) + .. code-block:: text -And a failure gives a timeout message: + [root@VNF]$ onegate vm show + VM 0 + NAME : vnf_0_(service_3) -.. code-block:: text + And a failure gives a timeout message: - [root@VNF]$ onegate vm show - Timeout while connected to server (Failed to open TCP connection to :5030 (execution expired)). - Server: :5030 + .. code-block:: text -Possible causes -++++++++++++++++ + [root@VNF]$ onegate vm show + Timeout while connected to server (Failed to open TCP connection to :5030 (execution expired)). 
+ Server: :5030 + + In this case, the VNF node cannot communicate with the OneGate service on the Front-end node. Possible causes include: -**Wrong Front-end node AWS IP**: The VNF node may be trying to connect to the OneGate server on the wrong IP address. In the VNF node, the IP address for the Front-end node is defined by the value of ``ONEGATE_ENDPOINT``, in the scripts found in the ``/run/one-context*`` directories. You can check the value with: + * **Wrong Front-end node for the AWS IP**: The VNF node may be trying to connect to the OneGate server on the wrong IP address. In the VNF node, the IP address for the Front-end node is defined by the value of ``ONEGATE_ENDPOINT``, in the scripts found in the ``/run/one-context`` directory. You can check the value with: -.. code-block:: text + .. code-block:: text - [root@VNF]$ grep ONEGATE -r /run/one-context* + grep -r ONEGATE /run/one-context* -If the value of ``ONEGATE_ENDPOINT`` does not match the IP address where OneGate is listening on the Front-end node, edit the parameter with the correct IP address, then terminate the service from the Front-end (see :ref:`above `) and re-deploy. + If the value of ``ONEGATE_ENDPOINT`` does not match the IP address where OneGate is listening on the Front-end node, edit the parameter with the correct IP address. Then, terminate the OneKE service from the Front-end (see :ref:`above `) and re-deploy. -**Filtered incoming connections**: On the Front-end node, the OneGate server listens on port 5030, so you must ensure that this port accepts incoming connections. If necessary, create an inbound rule in the AWS security groups for the elastic IP of the VNF node. + * **Filtered incoming connections**: On the Front-end node, the OneGate server listens on port 5030, so you must ensure that this port accepts incoming connections. If necessary, create an inbound rule in the AWS security groups for the elastic IP of the VNF node. .. 
|icon1| image:: /images/icons/sunstone/import_into_datastore.png .. |icon2| image:: /images/icons/sunstone/instantiate.png .. |icon3| image:: /images/icons/sunstone/parameter_manipulation_icons.png .. |icon4| image:: /images/icons/sunstone/trash.png .. |icon5| image:: /images/icons/sunstone/VNC.png + +One or more VMs Fail to Report Ready +++++++++++++++++++++++++++++++++++++++ + +Another possible reason for the OneKE Service failing to leave the ``DEPLOYING`` state is that a temporary network glitch or other performance issue prevented one or more of the VMs in the service from reporting ``READY`` to the OneGate service. In this case, you may see all of the VMs in the service up and running, yet the OneKE service remains stuck in ``DEPLOYING``. + +For example, on the Front-end, the output of ``onevm list`` shows all VMs running: + +.. prompt:: + + onevm list + ID USER GROUP NAME STAT CPU MEM HOST TIME + 3 oneadmin oneadmin worker_0_(service_3) runn 2 3G 0d 01h02 + 2 oneadmin oneadmin master_0_(service_3) runn 2 3G 0d 01h02 + 1 oneadmin oneadmin vnf_0_(service_3) runn 1 512M 0d 01h03 + 0 oneadmin oneadmin Service WordPress - KVM-0 runn 1 768M 0d 01h53 + +Yet ``oneflow list`` shows: + +.. prompt:: + + ID USER GROUP NAME STARTTIME STAT + 3 oneadmin oneadmin OneKE 1.29 08/30 12:30:07 DEPLOYING + +In this case, you can manually instruct the VMs to report ``READY`` to the OneGate server. Follow these steps: + + #. From the Front-end node, log in to the VNF node by running: + + .. prompt:: + + ssh root@ + + (To find out the IP address of the VNF node, see :ref:`above `.) + + #. For each VM in the OneKE service, run the following command: + + .. prompt:: + + onegate vm update --data "READY=YES" + + For example, ``onegate vm update 2 --data "READY=YES"``. + + Then, you can check the status of the service with ``onegate service show``: + + .. 
prompt:: + + onegate service show + SERVICE 3 + NAME : OneKE 1.29 + STATE : RUNNING + + ROLE vnf + VM 1 + NAME : vnf_0_(service_3) + + ROLE master + VM 2 + NAME : master_0_(service_3) + + ROLE worker + VM 3 + NAME : worker_0_(service_3) + + ROLE storage + + #. On the Front-end, run ``oneflow list`` again to verify that the service reports ``RUNNING``: + + .. prompt:: + + [oneadmin@FN]$ oneflow list + ID USER GROUP NAME STARTTIME STAT + 3 oneadmin oneadmin OneKE 1.29 08/30 12:35:21 RUNNING + + +
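If you script this verification, note that the service state is the last column of the ``oneflow list`` output. Below is a minimal sketch, under the assumption that the output format matches the examples above; it parses a captured sample rather than the live command, so substitute the real ``oneflow list`` output in practice:

```shell
# Captured sample of "oneflow list" output, as shown earlier in this guide.
# In practice, replace the sample with the live command output: oneflow list
sample='  ID USER     GROUP    NAME       STARTTIME      STAT
   3 oneadmin oneadmin OneKE 1.29 08/30 12:35:21 RUNNING'

# The service state is the last whitespace-separated field of each data row
state=$(printf '%s\n' "$sample" | awk 'NR>1 {print $NF}')
echo "$state"   # prints RUNNING
```

A cron job or watch loop built on this check could, for example, alert you when the service leaves ``DEPLOYING``.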