Creation of VSVIP for Gateway object always uses global VRF, returns error if AKO is configured with custom VRF in NodePort Mode #1610

Open
Vacant0mens opened this issue Dec 12, 2024 · 0 comments
Labels
bug Something isn't working

Vacant0mens commented Dec 12, 2024

Describe the bug

In my test setup, I have a custom VRF that uses a Tier 1 to route requests to my Kubernetes cluster, which is running AKO v1.12.2.

When I set up AKO and use Ingress and Service objects to create a VS and VSVIP, everything works fine, and they use the correct (custom) VRF as expected.
However, when I create a Gateway object (as described in the Kubernetes documentation, as well as the Avi/VMware-ALB documentation), the ako-gateway-api container reports these warnings (which are actually errors returned by the Avi API):
[formatted for easier reading]

WARN rest/rest_operation.go:305 key: <tenantName>/ako-gw-<clusterName>--avi-system-test-gateway-EVH,
msg: RestOp method POST path /api/vsvip tenant <tenantName>
Obj {
    "cloud_ref": "/api/cloud?name=<cloudName>",
    "dns_info": [
      {
        "fqdn": "<domainName>"
      }
    ],
    "east_west_placement": false,
    "markers": [
      {
        "key": "clustername",
        "values": [
          "<clusterName>"
        ]
      }
    ],
    "name": "ako-gw-<clusterName>--avi-system-test-gateway-EVH",
    "tenant_ref": "/api/tenant/?name=<tenantName>",
    "vip": [
      {
        "auto_allocate_ip": true,
        "ipam_network_subnet": {
          "network_ref": "/api/network/?name=avi-vip",
          "subnet": {
            "ip_addr": {
              "addr": "10.10.4.0",
              "type": "V4"
            },
            "mask": 24
          }
        },
        "vip_id": "0"
      }
    ],
    "vrf_context_ref": "/api/vrfcontext?name=global",
    "vsvip_cloud_config_cksum": "160433136"
  },
  "returned err": {
    "code": 0,
    "message": "map[error:Tier 1 cannot be derived from vrf]",
    "Verb": "POST",
    "Url": "https://10.1.1.230//api/vsvip",
    "HttpStatusCode": 400
}
with response null
2024-12-12T21:41:06.941Z 
WARN rest/dequeue_nodes.go:627 key: <vsVipName>, 
msg: there was an error sending the macro
Error during POST: Encountered an error on POST request to URL https://<avi-controller>//api/vsvip: 
    HTTP code: 400; error from Controller: map[error:Tier 1 cannot be derived from vrf]

With the log level set to INFO, this log line shows up immediately after the above errors:

INFO rest/rest_operation.go:158 Failed to remove VsVip ref, object is not of type Virtualservice

The second-to-last line of the request body ("vrf_context_ref": "/api/vrfcontext?name=global") shows that AKO is trying to use the global VRFContext rather than my custom one.

If I copy the JSON from the error message, change that one line to use my custom VRFContext, and submit the request through the Swagger UI for Avi, the VSVIP is created successfully. However, AKO still cannot use it, doesn't seem to know it exists, and still throws the same error.

I also get the same error if I submit the request body with no changes.
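
The manual edit described above amounts to something like the following sketch. It only rewrites the `vrf_context_ref` field of the logged request body; the helper name `with_vrf` is hypothetical, the body is trimmed to a few fields from the log, and `<customVrfName>` is a placeholder. It does not talk to the controller.

```python
import json


def with_vrf(vsvip_body: dict, vrf_name: str) -> dict:
    """Return a copy of a VsVip request body that points at the named VRF context."""
    # Deep copy via a JSON round-trip so the logged body is left untouched.
    patched = json.loads(json.dumps(vsvip_body))
    patched["vrf_context_ref"] = f"/api/vrfcontext?name={vrf_name}"
    return patched


# Trimmed-down version of the request body from the log above.
logged_body = {
    "cloud_ref": "/api/cloud?name=<cloudName>",
    "name": "ako-gw-<clusterName>--avi-system-test-gateway-EVH",
    "vrf_context_ref": "/api/vrfcontext?name=global",
}

# "<customVrfName>" is a placeholder for the custom VRF context name.
patched_body = with_vrf(logged_body, "<customVrfName>")
print(json.dumps(patched_body, indent=2))
```

The printed JSON is what would be pasted into the Swagger UI in place of the body that AKO generated.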

Reproduction steps

  1. Create a custom T1 in NSX that is associated with a custom VRF Context in Avi, with an appropriate static route (verify that it is usable by creating Service and Ingress objects using the VRF Context).
  2. Install AKO on a Kubernetes cluster using the custom VRF Context and NSX T1 router, with the Gateway feature enabled.
  3. Verify that the VRF Context is usable by creating Service and Ingress objects (that point appropriately to a pod or pods) using the VRF Context.
  4. Apply a standard YAML manifest for a Gateway object of version gateway.networking.k8s.io/v1.
  5. The error shows up as a warning in the pod logs for the ako-gateway-api container in the ako-0 pod.
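
For step 4, a minimal Gateway manifest could look like the sketch below, generated as JSON (which `kubectl apply -f -` also accepts). The Gateway name, namespace, and listener name are taken from the logs in this issue; the gateway class name `avi-lb`, port 80, and the `hostname` field are assumptions and may differ from the manifest actually used.

```python
import json


def build_gateway(name: str, namespace: str, gateway_class: str, hostname: str) -> dict:
    """Build a minimal gateway.networking.k8s.io/v1 Gateway with one HTTP listener."""
    return {
        "apiVersion": "gateway.networking.k8s.io/v1",
        "kind": "Gateway",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            "gatewayClassName": gateway_class,  # assumed class name
            "listeners": [
                {
                    "name": "test-http-listener",  # listener name seen in the status log
                    "protocol": "HTTP",
                    "port": 80,  # assumed port
                    "hostname": hostname,  # placeholder FQDN
                }
            ],
        },
    }


gateway = build_gateway("test-gateway", "avi-system", "avi-lb", "<domainName>")
print(json.dumps(gateway, indent=2))
```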

Expected behavior

A VSVIP should be created in Avi that corresponds to the Gateway object in the Kubernetes cluster. The VSVIP should also be associated with the custom VRFContext.

Additional context

In the values.yml file that I use when installing the Helm chart, I included:
NetworkSettings.nsxtT1LR: /infra/tier-1s/<GUID>
ControllerSettings.vrfName: <customVrfName>
L7Settings.serviceType: NodePort

For the ako container, I can see that both of these settings were added as environment variables. In the ako-gateway-api container, however, neither of them exists.

If I add the variables so that the ako-gateway-api container has the same environment variables for those two configuration items, it does not change the outcome.
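
One way to make the environment-variable comparison repeatable is to diff the two containers in the pod spec, for example from the output of `kubectl get pod ako-0 -n avi-system -o json`. This is only a sketch: the helper names are hypothetical, and the variable names in the sample pod (`EXAMPLE_VRF_NAME`, `EXAMPLE_T1_LR`) are made-up stand-ins, not the names AKO actually uses.

```python
def env_names(pod: dict, container_name: str) -> set:
    """Return the set of env var names defined on one container of a pod."""
    for container in pod["spec"]["containers"]:
        if container["name"] == container_name:
            return {entry["name"] for entry in container.get("env", [])}
    raise KeyError(f"container {container_name!r} not found in pod")


def missing_env(pod: dict, reference: str, target: str) -> list:
    """List env var names present on `reference` but absent on `target`."""
    return sorted(env_names(pod, reference) - env_names(pod, target))


# Hand-written stand-in for the pod JSON; variable names are hypothetical.
sample_pod = {
    "spec": {
        "containers": [
            {
                "name": "ako",
                "env": [
                    {"name": "EXAMPLE_VRF_NAME", "value": "<customVrfName>"},
                    {"name": "EXAMPLE_T1_LR", "value": "/infra/tier-1s/<GUID>"},
                    {"name": "EXAMPLE_SHARED", "value": "x"},
                ],
            },
            {
                "name": "ako-gateway-api",
                "env": [{"name": "EXAMPLE_SHARED", "value": "x"}],
            },
        ]
    }
}

print(missing_env(sample_pod, "ako", "ako-gateway-api"))
# prints ['EXAMPLE_T1_LR', 'EXAMPLE_VRF_NAME']
```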

Other logs that might be useful to someone who knows more than me:

INFO rest/rest_operation.go:158 Failed to remove VsVip ref, object is not of type Virtualservice
WARN rest/dequeue_nodes.go:627 key: <vsVipName>, msg: there was an error sending the macro Error during POST: Encountered an error on POST request to URL https://10.10.1.230//api/vsvip: HTTP code: 400; error from Controller: map[error:Tier 1 cannot be derived from vrf]
INFO rest/dequeue_nodes.go:714 key: <vsVipName>, msg: Service Metadata: {"namespace_ingress_name":null,"ingress_name":"","namespace":"","hostnames":null,"namespace_svc_name":null,"crd_status":{"type":"","value":"","status":""},"pool_ratio":0,"passthrough_parent_ref":"","passthrough_child_ref":"","gateway":"avi-system/test-gateway","insecureedgetermallow":false,"is_mci_ingress":false}
WARN rest/dequeue_nodes.go:648 key: <vsVipName>, msg: Retrieved key for Retry:<vsVipName>, object: <vsVipName>
INFO rest/dequeue_nodes.go:651 key: <vsVipName>, msg: Error is not of type AviError, err: Aborted due to prev error, *errors.errorString
WARN rest/dequeue_nodes.go:648 key: <vsVipName>, msg: Retrieved key for Retry:<vsVipName>, object: <vsVipName>
WARN rest/dequeue_nodes.go:861 key: <vsVipName>, msg: problem in processing request for: VsVip
INFO rest/dequeue_nodes.go:862 key: <vsVipName>, msg: error str: Encountered an error on POST request to URL https://10.10.1.230//api/vsvip: HTTP code: 400; error from Controller: map[error:Tier 1 cannot be derived from vrf]
INFO rest/dequeue_nodes.go:1129 key: <vsVipName>, msg: Detected error code 400 that we don't support, not going to retry
INFO status/status.go:57 key: <vsVipName>, msg: starting status Sync
INFO status/gateway_status.go:246 key: <vsVipName>, msg: Successfully updated the gateway avi-system/test-gateway status {"listeners":[{"name":"test-http-listener","supportedKinds":[{"group":"gateway.networking.k8s.io","kind":"HTTPRoute"}],"attachedRoutes":0,"conditions":[{"type":"Accepted","status":"False","observedGeneration":1,"lastTransitionTime":"2024-12-12T22:52:25Z","reason":"Invalid","message":"Aborted due to prev error"}]}]}
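
The last log line shows the effect on the Gateway itself: the listener ends up with `Accepted` set to `False`. A small sketch for pulling that out of the status JSON (copied from the log line above; the helper name `rejected_listeners` is hypothetical):

```python
import json


def rejected_listeners(status: dict) -> list:
    """Return (listener name, reason, message) for each listener whose Accepted condition is not True."""
    rejected = []
    for listener in status.get("listeners", []):
        for cond in listener.get("conditions", []):
            if cond["type"] == "Accepted" and cond["status"] != "True":
                rejected.append((listener["name"], cond["reason"], cond["message"]))
    return rejected


# Status payload as logged by status/gateway_status.go above.
status = json.loads(
    '{"listeners":[{"name":"test-http-listener",'
    '"supportedKinds":[{"group":"gateway.networking.k8s.io","kind":"HTTPRoute"}],'
    '"attachedRoutes":0,'
    '"conditions":[{"type":"Accepted","status":"False","observedGeneration":1,'
    '"lastTransitionTime":"2024-12-12T22:52:25Z","reason":"Invalid",'
    '"message":"Aborted due to prev error"}]}]}'
)

print(rejected_listeners(status))
# prints [('test-http-listener', 'Invalid', 'Aborted due to prev error')]
```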
@Vacant0mens Vacant0mens added the bug Something isn't working label Dec 12, 2024
@Vacant0mens Vacant0mens changed the title Creation of VSVIP for Gateway object always uses global VRF, returns error if AKO is configured with custom VRF Creation of VSVIP for Gateway object always uses global VRF, returns error if AKO is configured with custom VRF in NodePort Mode Dec 13, 2024