# dev-metaflow
f
Hi, I deployed my stack on GCP this way
```
terraform apply -target="module.infra" -var-file=myenv
terraform apply -target="module.services" -var-file=myenv
```
and everything was deployed perfectly: I can access the services and run a DAG. But when I try to modify the stack with
```
terraform plan/apply/destroy -var-file=myenv -target="module.infra"
terraform plan/apply/destroy -var-file=myenv -target="module.services"
```
I get the following and can't access the services on k8s. I have a `metaflow_gsa_key` service account JSON (generated at infra creation) in my `terraform` folder. I tried
```
gcloud auth application-default login
gcloud container clusters get-credentials mygke --region=myregion
```
but no luck. Is there anything I'm missing?
```
╷
│ Error: Get "http://localhost/api/v1/namespaces/argo": dial tcp [::1]:80: connect: connection refused
│
│   with module.services.kubernetes_namespace.argo[0],
│   on services/argo.tf line 1, in resource "kubernetes_namespace" "argo":
│    1: resource "kubernetes_namespace" "argo" {
│
╵
╷
│ Error: Get "http://localhost/api/v1/namespaces/default/services/metadata-service": dial tcp [::1]:80: connect: connection refused
│
│   with module.services.kubernetes_service.metadata-service,
│   on services/metadata_service.tf line 90, in resource "kubernetes_service" "metadata-service":
│   90: resource "kubernetes_service" "metadata-service" {
│
╵
╷
│ Error: Get "http://localhost/api/v1/namespaces/default/services/metaflow-ui-backend-service": dial tcp [::1]:80: connect: connection refused
│
│   with module.services.kubernetes_service.metaflow-ui-backend-service,
│   on services/metaflow_ui_backend_service.tf line 122, in resource "kubernetes_service" "metaflow-ui-backend-service":
│  122: resource "kubernetes_service" "metaflow-ui-backend-service" {
│
╵
╷
│ Error: Get "http://localhost/apis/apps/v1/namespaces/default/deployments/metaflow-ui-static-service": dial tcp [::1]:80: connect: connection refused
│
│   with module.services.kubernetes_deployment.metaflow-ui-static-service,
│   on services/metaflow_ui_static_service.tf line 1, in resource "kubernetes_deployment" "metaflow-ui-static-service":
│    1: resource "kubernetes_deployment" "metaflow-ui-static-service" {
│
╵
╷
│ Error: Get "http://localhost/api/v1/namespaces/default/services/metaflow-ui-static-service": dial tcp [::1]:80: connect: connection refused
│
│   with module.services.kubernetes_service.metaflow-ui-static-service,
│   on services/metaflow_ui_static_service.tf line 58, in resource "kubernetes_service" "metaflow-ui-static-service":
│   58: resource "kubernetes_service" "metaflow-ui-static-service" {
│
╵
╷
│ Error: Get "http://localhost/api/v1/namespaces/default/serviceaccounts/ksa-metaflow": dial tcp [::1]:80: connect: connection refused
│
│   with module.services.kubernetes_service_account.metaflow_service_account,
│   on services/service_account.tf line 2, in resource "kubernetes_service_account" "metaflow_service_account":
│    2: resource "kubernetes_service_account" "metaflow_service_account" {
│
╵
╷
│ Error: Get "http://localhost/api/v1/namespaces/argo-events": dial tcp [::1]:80: connect: connection refused
│
│   with module.services.module.argo_events.kubernetes_namespace.argo_events,
│   on common/terraform/argo_events/main.tf line 148, in resource "kubernetes_namespace" "argo_events":
│  148: resource "kubernetes_namespace" "argo_events" {
│
╵
```
a
I would guess something is off with the `gcloud container clusters get-credentials ..` bit. The cluster API endpoint is not supposed to be "localhost" if your cluster is running on GKE, and that command is supposed to generate a kubeconfig for you that includes the right hostname.
Actually, never mind: you shouldn't even need that command. Terraform is supposed to create the GKE cluster, get credentials for it, and then install the K8s resources into it.
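For context, the usual wiring looks something like this — a sketch, not the actual template; the module output names (`cluster_endpoint`, `cluster_ca_certificate`) are illustrative:

```terraform
# Sketch: the kubernetes provider is typically configured from the
# GKE cluster's outputs. Output names here are hypothetical.
data "google_client_config" "default" {}

provider "kubernetes" {
  host                   = "https://${module.infra.cluster_endpoint}"
  token                  = data.google_client_config.default.access_token
  cluster_ca_certificate = base64decode(module.infra.cluster_ca_certificate)
}
```

If those values are unknown at plan time (e.g. the cluster is slated for re-creation, or `-target` keeps it out of the graph), the provider falls back to its defaults and dials `localhost` — which is exactly the `dial tcp [::1]:80: connect: connection refused` error above.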
Can you check whether terraform actually managed to create the cluster there?
At least with the AWS template, I was getting similar errors whenever I changed tf files in a way that caused the K8s cluster to be re-created.
f
I guess it was my `GOOGLE_APPLICATION_CREDENTIALS` env var; somehow it didn't affect creating the stack, but it did affect modifying it 🤔
Hmmm, now I'm not sure if it was `GOOGLE_APPLICATION_CREDENTIALS`.
> At least with the AWS template, I was getting similar errors whenever I changed tf files in a way that caused the K8s cluster to be re-created.
@narrow-lion-2703 OK, so updating the k8s cluster via terraform is not supported?
`terraform destroy` also returns the same error, so I imagine manually deleting the entire stack and re-deploying is the way to go?
a
I'm not sure how it is on GCP, but with AWS EKS, depending on the nature of the change, terraform can usually update the k8s cluster in place without re-creating it; that works fine for me. It's only when I make changes that force terraform to decide the cluster has to be destroyed and re-created as part of the plan that I get that error (on either apply or destroy).
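For the stuck-destroy case, one workaround worth trying (a sketch, mirroring the targeted commands from the original deploy; I haven't verified it against this exact template) is to tear down the in-cluster resources first, while the GKE control plane — and hence valid provider credentials — still exists, and only then destroy the infra:

```shell
# Destroy in-cluster K8s resources while the cluster is still reachable,
# then destroy the underlying infrastructure that hosts it.
terraform destroy -target="module.services" -var-file=myenv
terraform destroy -target="module.infra" -var-file=myenv
```

Destroying in the reverse order (or all at once, with the cluster scheduled for deletion) is what leaves the kubernetes provider with no endpoint and produces the `localhost` errors.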