# dev-metaflow
f
Hi, I deployed my stack on GCP this way
```
terraform apply -target="module.infra" -var-file=myenv
terraform apply -target="module.services" -var-file=myenv
```
and everything was deployed perfectly: I can access the services and run a DAG. But when I try to modify the stack with
```
terraform plan/apply/destroy -var-file=myenv -target="module.infra"
terraform plan/apply/destroy -var-file=myenv -target="module.services"
```
I get the following and can't access the services on k8s. I have a `metaflow_gsa_key` service account JSON (generated at infra creation) in my `terraform` folder. I tried
```
gcloud auth application-default login
gcloud container clusters get-credentials mygke --region=myregion
```
but no luck. Is there anything I'm missing?
```
╷
│ Error: Get "http://localhost/api/v1/namespaces/argo": dial tcp [::1]:80: connect: connection refused
│
│   with module.services.kubernetes_namespace.argo[0],
│   on services/argo.tf line 1, in resource "kubernetes_namespace" "argo":
│    1: resource "kubernetes_namespace" "argo" {
│
╵
╷
│ Error: Get "http://localhost/api/v1/namespaces/default/services/metadata-service": dial tcp [::1]:80: connect: connection refused
│
│   with module.services.kubernetes_service.metadata-service,
│   on services/metadata_service.tf line 90, in resource "kubernetes_service" "metadata-service":
│   90: resource "kubernetes_service" "metadata-service" {
│
╵
╷
│ Error: Get "http://localhost/api/v1/namespaces/default/services/metaflow-ui-backend-service": dial tcp [::1]:80: connect: connection refused
│
│   with module.services.kubernetes_service.metaflow-ui-backend-service,
│   on services/metaflow_ui_backend_service.tf line 122, in resource "kubernetes_service" "metaflow-ui-backend-service":
│  122: resource "kubernetes_service" "metaflow-ui-backend-service" {
│
╵
╷
│ Error: Get "http://localhost/apis/apps/v1/namespaces/default/deployments/metaflow-ui-static-service": dial tcp [::1]:80: connect: connection refused
│
│   with module.services.kubernetes_deployment.metaflow-ui-static-service,
│   on services/metaflow_ui_static_service.tf line 1, in resource "kubernetes_deployment" "metaflow-ui-static-service":
│    1: resource "kubernetes_deployment" "metaflow-ui-static-service" {
│
╵
╷
│ Error: Get "http://localhost/api/v1/namespaces/default/services/metaflow-ui-static-service": dial tcp [::1]:80: connect: connection refused
│
│   with module.services.kubernetes_service.metaflow-ui-static-service,
│   on services/metaflow_ui_static_service.tf line 58, in resource "kubernetes_service" "metaflow-ui-static-service":
│   58: resource "kubernetes_service" "metaflow-ui-static-service" {
│
╵
╷
│ Error: Get "http://localhost/api/v1/namespaces/default/serviceaccounts/ksa-metaflow": dial tcp [::1]:80: connect: connection refused
│
│   with module.services.kubernetes_service_account.metaflow_service_account,
│   on services/service_account.tf line 2, in resource "kubernetes_service_account" "metaflow_service_account":
│    2: resource "kubernetes_service_account" "metaflow_service_account" {
│
╵
╷
│ Error: Get "http://localhost/api/v1/namespaces/argo-events": dial tcp [::1]:80: connect: connection refused
│
│   with module.services.module.argo_events.kubernetes_namespace.argo_events,
│   on common/terraform/argo_events/main.tf line 148, in resource "kubernetes_namespace" "argo_events":
│  148: resource "kubernetes_namespace" "argo_events" {
│
╵
```
a
I would guess something is off with the `gcloud container clusters get-credentials ..` bit. The cluster API endpoint is not supposed to be "localhost" if your cluster is running on GKE, and that command is supposed to generate a kubeconfig for you that includes the right hostname.
Actually, never mind: you shouldn't even need that command. Terraform is supposed to create the GKE cluster, get credentials for it, and then install the K8s resources into it.
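For context, the usual wiring looks something like this — a sketch, not the actual template; the module output names (`cluster_endpoint`, `cluster_ca_certificate`) are illustrative:

```terraform
# Sketch: the kubernetes provider is typically configured from the
# GKE cluster's outputs. Output names here are hypothetical.
data "google_client_config" "default" {}

provider "kubernetes" {
  host                   = "https://${module.infra.cluster_endpoint}"
  token                  = data.google_client_config.default.access_token
  cluster_ca_certificate = base64decode(module.infra.cluster_ca_certificate)
}
```

If those values are unknown at plan time (e.g. the cluster is slated for re-creation, or `-target` keeps it out of the graph), the provider falls back to its defaults and dials `localhost` — which is exactly the `dial tcp [::1]:80: connect: connection refused` error above.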
Can you check whether terraform actually managed to create the cluster there?
At least with the AWS template, I was getting similar errors whenever I changed tf files in a way that caused the K8s cluster to be re-created.
f
I guess it was my `GOOGLE_APPLICATION_CREDENTIALS` env var; somehow it didn't affect creating the stack, but it did affect modifying it 🤔
Hmmm, now I'm not sure if it was `GOOGLE_APPLICATION_CREDENTIALS`.
> At least with the AWS template, I was getting similar errors whenever I changed tf files in a way that caused the K8s cluster to be re-created.
@narrow-lion-2703 OK, so updating the k8s cluster via terraform is not supported?
`terraform destroy` also returns the same error, so I imagine manually deleting the entire stack and re-deploying is the way to go?
a
I'm not sure how it is on GCP, but with AWS EKS, depending on the nature of the change, terraform can usually update the k8s cluster in place without re-creating it; that works fine for me. It's only when I make changes that force terraform to decide the cluster has to be destroyed and re-created as part of the plan that I get that error (on either apply or destroy).
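For the stuck-destroy case, one workaround worth trying (a sketch, mirroring the targeted commands from the original deploy; I haven't verified it against this exact template) is to tear down the in-cluster resources first, while the GKE control plane — and hence valid provider credentials — still exists, and only then destroy the infra:

```shell
# Destroy in-cluster K8s resources while the cluster is still reachable,
# then destroy the underlying infrastructure that hosts it.
terraform destroy -target="module.services" -var-file=myenv
terraform destroy -target="module.infra" -var-file=myenv
```

Destroying in the reverse order (or all at once, with the cluster scheduled for deletion) is what leaves the kubernetes provider with no endpoint and produces the `localhost` errors.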