The human-centric platform for production ML & AI

Outerbounds

We’ve been examining some of the slower Metaflow UI queries using the AWS RDS Performance Insights dashboard, and we’ve identified at least one slow query when running in RDS PostgreSQL 11.18 on a `db.m5.2xlarge` instance. The following query on a 200 million row `artifact_v3` takes up to four minutes to run:
```SELECT * FROM ( SELECT flow_id,run_number,run_id,step_name,task_id,task_name,name,location,ds_type,sha,type,content_type,user_name,attempt_id,ts_epoch,tags,system_tags FROM artifact_v3 ) T WHERE flow_id = '[REDACTED]' AND run_id = '[REDACTED]' AND ("ds_type" = 'local') LIMIT 1;```
While acknowledging that searching 200 million rows would be potentially slow, we’ve noticed via `EXPLAIN` and `EXPLAIN ANALYZE` that the query does not actually make use of the indexes defined on `artifacts_v3`. There is an index:
```"artifact_v3_idx_str_ids_primary_key" btree (flow_id, run_id, step_name, task_name, attempt_id, name) WHERE run_id IS NOT NULL AND task_name IS NOT NULL```
but for PostgreSQL to use the index we need to add `AND task_name IS NOT NULL` to the query.

In general, would we ever expect either `run_id` or `task_name` to be null?