How to delete the failed TKGm or Native k8s cluster in CSE?

In CSE 3.1.1, delete operation on a cluster (Native or TKG) that is in an error state (RDE.state = RESOLUTION_ERROR (or) status.phase = :FAILED), may fail with Bad request (400) or the Delete process will be stuck in ‘DELETEIN_PROGRESS’ state. The steps are given below to resolve the issue.

Step1: Assign API explorer privilege to the CSE Service Account.

Login to VCD Provider portal as Administrator.

Edit the CSE Service Role.

Navigate to Administration > Provider Access Control > Roles > CSE Service Role.

In the tenant portal check if there’re any stale vApp entries for the failed clusters. If so, please delete them.

Step2: Get the failed cluster ID through vcd cli

# vcd cse cluster list
Name        Org             Owner     VDC            K8s Runtime    K8s Version            Status
----------  --------------  --------  -------------  -------------  ---------------------  ------------------
test-tkg-1  CSE-Site1-Test  orgadmin  CSE-TEST-OVDC  TKGm           TKGm v1.21.2+vmware.1  DELETE:IN_PROGRESS
tkg         CSE-Site1-Test  orgadmin  CSE-TEST-OVDC  TKGm           TKGm v1.21.2+vmware.1  DELETE:IN_PROGRESS
tkgtest     CSE-Site1-Test  orgadmin  CSE-TEST-OVDC  TKGm           TKGm v1.21.2+vmware.1  DELETE:IN_PROGRESS
tkg-test    CSE-Site1-Test  orgadmin  CSE-TEST-OVDC  TKGm           TKGm v1.21.2+vmware.1  DELETE:IN_PROGRESS
tkg-test3   CSE-Site1-Test  orgadmin  CSE-TEST-OVDC  TKGm           TKGm v1.21.2+vmware.1  DELETE:IN_PROGRESS

# vcd cse cluster info tkg-test3 | grep uid
  uid: urn:vcloud:entity:cse:nativeCluster:9364bf18-0faa-49ce-8be7-7e92af692d1b

Step3: Run GET call to check the status

Login to VCD Provider portal with the CSE service account which has CSE Service Role assigned.
Open API Explorer.
Click on GET in difinedEntity section.

Click on TryitOut
In Description, provide the cluster UID from last step.
In the output we can see the state as PRE_CREATED.

Step3: Run the POST call resolve to resolve

Select the POST call from definedEntity section.

/1.0.0/entities/{id}/resolve
Validates the defined entity against the entity type schema.

Provide the cluster ID and run the call. The state will be changed to RESOLVED.


Step4: Run the DELETE call to delete RDE.

Povide the cluser ID and ‘false’ as value for inovkeHooks.

Please check and confirm the failed Cluster is deleted now.

#vcd cse cluster list
InsecureRequestWarning: Unverified HTTPS request is being made. Adding certificate verification is strongly advised.
Name      Org             Owner     VDC            K8s Runtime    K8s Version            Status
--------  --------------  --------  -------------  -------------  ---------------------  ------------------
tkg       CSE-Site1-Test  orgadmin  CSE-TEST-OVDC  TKGm           TKGm v1.21.2+vmware.1  DELETE:IN_PROGRESS
tkgtest   CSE-Site1-Test  orgadmin  CSE-TEST-OVDC  TKGm           TKGm v1.21.2+vmware.1  DELETE:IN_PROGRESS
tkg-test  CSE-Site1-Test  orgadmin  CSE-TEST-OVDC  TKGm           TKGm v1.21.2+vmware.1  DELETE:IN_PROGRESS


Upgrade vRealize Operations Tenant App to 8.6 Using the Offline Upgrade Image

In this blog post, I am going to review the process of updating a vRealize Oprations Tenant App for VMware Cloud Director 2.6 virtual appliance to version 8.6.

  1. First, Download vROps Tenant App 8.6 offline upgrade ISO from VMware marketplace. The Upgrade bundle is available in following URL > Resource & Support > ‘Offline Upgrade: vRealize Operations TA 8.6’

    https://marketplace.cloud.vmware.com/services/details/vrealize-operations-tenant-app-for-vmware-cloud-director-2-61211?slug=true
  2. Take a snapshot of the Tenant App appliance.
  3. Connect the ISO to the Tenant App appliance. You can use ‘Client Device’, ‘Host Device’, ‘Datastore ISO file’ or ‘Content Library ISO FIle’ for mounting the ISO.
  4. n a Web browser, log in to the virtual appliance management interface (VAMI). The URL for the VAMI is https:// tenantapp_appliance_address:5480
  5. Click the Update tab.
  6. Click Settings, select Use CDROM Updates, and click Save Settings.
  7. Click Status and click Check Updates. The appliance version appears in the list of available updates.

8. Click Install Updates and click OK. It will take a while to complete the upgrade.

9. After the updates install, click the System tab and click Reboot
10. After reboot confirm the Tenant App version is 8.6.

Upgrade VMware Cloud Director App Launchpad from 2.0 to 2.1

Please find the steps to upgrade VMware Cloud Director App Launchpad from version 2.0 to 2.1

  1. Download VMware Cloud Director App Launchpad 2.1 RPM package from here.
  2. Upload it to the App Launchpad VM.
  3. Open an SSH connection to the App Launchpad VM and log in as root.
  4. Upgrade the RPM package.
[[email protected] ~]# rpm -U vmware-alp-2.1.0-18834930.x86_64.rpm
warning: vmware-alp-2.1.0-18834930.x86_64.rpm: Header V3 RSA/SHA1 Signature, key ID 001e5cc9: NOKEY
Upgrading...

Execute 'alp upgrade' to upgrade ...

  Append the excute permission to the existing logs...

5. Run the following command to upgrade App Launchpad.

[[email protected] ~]# alp upgrade --admin-user [email protected] --admin-pass 'passwd'
Upgraded the plugin of App Launchpad successfully.

Upgraded the management service successfully.
  [Upgrade Task]
     CREATE_ENTITY_TYPE_CATALOG_INFO : true
     MIGRATE_CATALOGS : true
     CREATE_ENTITY_TYPE_SIZING_TEMPLATE : true
     MIGRATE_LEGACY_SIZING_TEMPLATES : true
     CREATE_ENTITY_TYPE_MARKETPLACE_BANNER : true
     CREATE_ENTITY_TYPE_ORG_METRICS : true
     UPGRADE_SERVICE_ROLE : true

6. Restart alp service and confirm its running.

[[email protected]~]# systemctl restart alp
[[email protected] ~]# systemctl status alp
● alp.service - VMware ALP Management Service
   Loaded: loaded (/usr/lib/systemd/system/alp.service; enabled; vendor preset: disabled)
   Active: active (running) since Thu 2021-11-18 11:46:14 +01; 14s ago
 Main PID: 29334 (java)
   CGroup: /system.slice/alp.service
           └─29334 java -jar /opt/vmware/alp/alp.jar --logging.path=log

Nov 18 11:46:14 bd1-srp-al01.acs.local systemd[1]: Stopped VMware ALP Management Service.
Nov 18 11:46:14 bd1-srp-al01.acs.local systemd[1]: Started VMware ALP Management Service.

7. Diagnose deployment errors by running the /opt/vmware/alp/bin/diagnose executable file.

The diagnose tool verifies that the services are up and running and that all configuration
requirements are met.

[[email protected] ~]# /opt/vmware/alp/bin/diagnose

Step 1: System diagnose
--------------------------------------------------------------------------------
- App Launchpad service is initialized.


Step 2: Cloud Director diagnose
--------------------------------------------------------------------------------
- Service Account for App Launchpad is good.
- App Launchpad's extension is ready.


Step 3: MQTT diagnose
--------------------------------------------------------------------------------
- Cloud Director MQTT for extensibility is ready.


Step 4: Integration diagnose
--------------------------------------------------------------------------------
- App Launchpad API is up, and version is 2.1.0-18834930.


Step 5: App Launchpad diagnose
--------------------------------------------------------------------------------
- App Launchpad service is listening on port 8086.


8. Confirm the ALP version.

[[email protected] ~]# alp
NAME:
        alp - The Cloud Director App Launchpad
        (ALP) Command-line tool

USAGE:
        alp <subcommand> [flags]

VERSION:
        '2.1.0-18834930'