big-football-5446
11/25/2024, 4:26 AM
big-football-5446
01/22/2025, 12:04 AM
square-wire-39606
01/22/2025, 2:27 PM
big-football-5446
02/03/2025, 5:39 AM
METAFLOW_BATCH_EMIT_TAGS is not set or is set to false. Would you have any examples of how I should set up warnings in this repo, so I can keep the pattern the same?
thankful-ambulance-42457
02/04/2025, 1:09 AM
thankful-ambulance-42457
02/04/2025, 7:47 PM
big-football-5446
02/16/2025, 11:50 PM
BATCH_DEFAULT_TAGS environment variable into the decorator AWS tags. I don't want to go down this route, as we want BATCH_DEFAULT_TAGS to be separate and to tag even when the batch decorator is not used. We want the batch_decorator to be the last applied tags option. I'm looking for a place to move the validation of BATCH_DEFAULT_TAGS out to. What would be the best place, so that it can validate before the batch job is created and still remain separate from the decorator logic? I'm thinking of validating inside batch_cli.py.
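A minimal sketch of what standalone validation of the default tags could look like, kept independent of any decorator logic. The function name `parse_default_tags` is hypothetical and not taken from the PR; the length limits are the general AWS limits for user-defined tags, and the JSON format follows the `METAFLOW_BATCH_DEFAULT_TAGS` usage shown later in this thread.

```python
import json
import os
from typing import Dict

# General AWS limits for user-defined tags (assumed to apply here).
MAX_KEY_LEN = 128
MAX_VALUE_LEN = 256


def parse_default_tags(raw: str) -> Dict[str, str]:
    """Parse a JSON tag mapping such as '{"team": "ml"}' and validate it.

    Hypothetical helper: an empty or unset value yields an empty dict.
    """
    if not raw:
        return {}
    try:
        tags = json.loads(raw)
    except json.JSONDecodeError as exc:
        raise ValueError("BATCH_DEFAULT_TAGS is not valid JSON: %s" % exc)
    if not isinstance(tags, dict):
        raise ValueError("BATCH_DEFAULT_TAGS must be Dict[str, str]")
    for key, value in tags.items():
        if not isinstance(value, str):
            raise ValueError("Tag %r has a non-string value" % key)
        if not key or len(key) > MAX_KEY_LEN:
            raise ValueError("Tag key %r must be 1-%d characters" % (key, MAX_KEY_LEN))
        if key.lower().startswith("aws:"):
            raise ValueError("Tag key %r uses the reserved 'aws:' prefix" % key)
        if len(value) > MAX_VALUE_LEN:
            raise ValueError("Value for tag %r exceeds %d characters" % (key, MAX_VALUE_LEN))
    return tags


if __name__ == "__main__":
    # Reads the same variable name used elsewhere in this thread.
    print(parse_default_tags(os.environ.get("METAFLOW_BATCH_DEFAULT_TAGS", "")))
```

Because the helper takes only the raw string, it could be called from wherever the job is assembled (for example a CLI entry point) before any Batch job is created.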
thankful-ambulance-42457
02/17/2025, 9:25 AM
thankful-ambulance-42457
02/18/2025, 1:02 PM
big-football-5446
03/03/2025, 10:51 PM
from metaflow import (
    FlowSpec,
    step,
    batch,
    retry,
    schedule,
    project,
    conda,
    current
)
import logging
from common import (
    logging_setup
)

logging_setup.configure()

custom_step_tags = {
    "goodbye": "world",
    "hello": "universe",
    "batch_decorator_tag": "True",
    # "inv@lid": "t@g"
}

custom_step_tags_v2 = {
    "something": "else",
    "batch_decorator_tag": "True"
}


@project(name="ml_apac_trisolaris")
@schedule(hourly=True)
class CanarySimpleAdhoc(FlowSpec):

    @step
    def start(self):
        self.logger = logging.getLogger(self.__class__.__name__)
        self.logger.info("Canary Hello")
        self.next(self.hello)

    @batch(cpu=1, memory=500, aws_batch_tags=custom_step_tags)
    @retry
    @step
    def hello(self):
        self.logger.info("Canary Hello World in Batch!")
        self.next(self.goodbye)

    @batch(cpu=1, memory=500, aws_batch_tags=custom_step_tags_v2)
    @retry
    @step
    def goodbye(self):
        self.logger.info("Canary Goodbye World in Batch!")
        self.next(self.end)

    @step
    def end(self):
        self.logger.info("HelloAWS is finished.")


if __name__ == "__main__":
    CanarySimpleAdhoc()
But it seems the start step is receiving all the decorators at once - Is this expected?
Here are the logs from the batch job for the start step:
2025-03-03T22:10:44.773Z AWS Batch error:
2025-03-03T22:10:44.773Z aws_batch_tags is not a dictionary must be Dict[str, str]
2025-03-03T22:10:44.773Z Recieved BATCH_DEFAULT_TAGS as: {}
2025-03-03T22:10:44.714Z Recieved BATCH_DEFAULT_TAGS as: {}
2025-03-03T22:10:44.714Z Recieved Decorator AWS Tags Dictionary as: {'something': 'else', 'batch_decorator_tag': 'True'}
2025-03-03T22:10:44.713Z Recieved BATCH_DEFAULT_TAGS as: {}
2025-03-03T22:10:44.713Z Recieved Decorator AWS Tags Dictionary as: {'goodbye': 'world', 'hello': 'universe', 'batch_decorator_tag': 'True'}
2025-03-03T22:10:41.471Z Task is starting.
2025-03-03T22:10:41.421Z Code package downloaded.
2025-03-03T22:10:40.335Z Downloading code package...
2025-03-03T22:10:34.682Z Setting up task environment.
big-football-5446
03/03/2025, 10:54 PM
METAFLOW_BATCH_DEFAULT_TAGS does not get carried over when deploying to Step Functions. It's not present in the job definition, and in the logs it's coming through as the default {}. Is there another place that needs updating?
thankful-ambulance-42457
03/03/2025, 10:58 PM
step_functions.py where the overall state-machine setup is done
big-football-5446
03/03/2025, 10:59 PM
bulky-afternoon-92433
03/04/2025, 10:40 AM
big-football-5446
03/04/2025, 12:18 PM
batch_decorator.py change from it. I can fix the rebase tomorrow morning.
big-football-5446
03/12/2025, 4:14 AM
aws_batch_tags in the decorator into another variable in there called aws_batch_tags_list. Without this split it seemed like there was a loop somewhere, and already-processed tags were being sent through twice, which was causing issues.
The PR still needs cleanup and some variable name fixes like you suggested before, such as changing aws_batch_tags to aws_tags in batch-related files. Once you are happy with the setup I will clean things up. Another thing to note: I commented out the recommendation you made for batch_decorator in runtime_init. It seems this is not necessary, but if you think this could cause issues with a different setup I'm happy to run more tests and discuss.
From my testing it does look like there is a difference between what happens in Batch from a deployed step function and what happens when deploying. The BATCH_DEFAULT_TAGS variable is correctly assigned when deploying and shows up in the correct place in the temporary logging I have, but when it runs on Batch from a step function trigger, what is supposed to log in the BATCH_DEFAULT_TAGS temporary logging instead gets logged through the aws_batch_tags temporary logging. I'm not sure what is causing this mixup.
big-football-5446
03/17/2025, 11:00 PM
thankful-ambulance-42457
03/21/2025, 4:39 PM
big-football-5446
03/24/2025, 5:10 AM
cool-notebook-79020
03/25/2025, 4:12 PM
big-football-5446
03/25/2025, 10:42 PM
Do I understand this feature correctly: we would be able to add AWS tags on all our flows that can later be filtered in AWS Cost Explorer?
Yes, it adds tags to the Batch jobs created by flows, which can be filtered for cost attribution across teams.
shy-midnight-40599
04/11/2025, 9:11 AM
big-football-5446
04/15/2025, 2:49 AM
thankful-ambulance-42457
04/16/2025, 8:19 PM
python flow.py step-functions create --aws-batch-tag some=tag --aws-batch-tag another=tag
2. Defaults through env var / Metaflow config
METAFLOW_BATCH_DEFAULT_TAGS='{"tag1":"value1","tag2":"value2"}' python flow.py run --with batch
3. Step-specific tags with the @batch decorator
@batch(aws_batch_tags={"tag1": "val1", "tag2": "val2"})
The order of merging tags is env < cli < decorator.
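The merge order above (env < cli < decorator) can be sketched as a layered dictionary update, where later layers override earlier ones on key conflicts. This is an illustration only: `merge_tags` and `parse_cli_tags` are hypothetical names, not functions from the PR or from Metaflow.

```python
from typing import Dict, Iterable, Optional


def parse_cli_tags(pairs: Iterable[str]) -> Dict[str, str]:
    """Turn repeated 'key=value' CLI arguments into a dict (hypothetical helper)."""
    tags: Dict[str, str] = {}
    for pair in pairs:
        key, sep, value = pair.partition("=")
        if not sep or not key:
            raise ValueError("Expected key=value, got %r" % pair)
        tags[key] = value
    return tags


def merge_tags(
    env_tags: Optional[Dict[str, str]] = None,
    cli_tags: Optional[Dict[str, str]] = None,
    decorator_tags: Optional[Dict[str, str]] = None,
) -> Dict[str, str]:
    """Merge tag layers so that env < cli < decorator in precedence."""
    merged: Dict[str, str] = {}
    # Apply the lowest-precedence layer first; each update overrides earlier keys.
    for layer in (env_tags, cli_tags, decorator_tags):
        merged.update(layer or {})
    return merged


if __name__ == "__main__":
    env = {"team": "platform", "stage": "dev"}
    cli = parse_cli_tags(["stage=prod", "some=tag"])
    deco = {"team": "ml"}
    print(merge_tags(env, cli, deco))
```

With these inputs, `team` comes from the decorator, `stage` from the CLI, and `some` from the CLI, since the decorator layer is applied last.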
thankful-ambulance-42457
04/17/2025, 1:21 PM
pytest tag_test.py should make it easier to validate the functionality. As I noted, right now the branch fails to execute with these though.
I'll add the step-functions coverage later on.
big-football-5446
04/21/2025, 11:55 PM
big-football-5446
04/22/2025, 7:57 AM
METAFLOW_BATCH_EMIT_TAGS to true, as it's still under that