# ask-metaflow
b
Hello! Question about the interaction of CometML and Metaflow when it comes to exceptions. I noticed that if a step throws an exception I first see a message like
<flow FailExample step process_fail_on_two[1] (input: 2)> failed:
then after a moment I see the actual exception as
Internal Error
with a useful stack trace. However, when using CometML, that error just gets swallowed. I still see the message that the flow step failed, but the internal error is gone. CometML reports the end of the experiment, but the stack trace is nowhere to be found: not in the console and not in the CometML UI. Has anyone had or solved this issue before?
a
I am not super familiar with the internals of CometML, but how are you using CometML today?
b
Actually, I am struggling to reproduce the problem in an isolated context. I will come back to you when I have managed to recreate it in a simpler example.
I had trouble reproducing it because it turns out that it behaves differently on macOS than on Linux! Flow code:
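import os

from comet_ml import Experiment
from metaflow import FlowSpec, secrets, step

os.environ["COMET_URL_OVERRIDE"] = "redacted"

class CometFlow(FlowSpec):
    @secrets(sources=["COMET_API_KEY"])
    @step
    def start(self):
        # The experiment is created before the line that fails.
        experiment = Experiment(
            project_name="comet-example-metaflow-hello-world",
        )

        num = 0 / 0  # deliberately raises ZeroDivisionError

        self.next(self.end)

    @step
    def end(self):
        pass

if __name__ == "__main__":
    CometFlow()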
Output on macOS:
COMET INFO: Experiment is live on comet.com <https://redacted/tim-scheller/comet-example-metaflow-hello-world/fd690ab3a55c4105b32c4262d51e40b3>

<flow CometFlow step start> failed:
Internal error
Traceback (most recent call last):
<snip stack trace>
ZeroDivisionError: division by zero

COMET INFO: The process of logging environment details (conda environment, git patch) is underway. Please be patient as this may take some time.
<more comet info>
The same code on Linux produces the following (note the missing traceback):
COMET INFO: Experiment is live on comet.com <https://redacted/tim-scheller/comet-example-metaflow-hello-world/fd690ab3a55c4105b32c4262d51e40b3>

<flow CometFlow step start> failed:
COMET INFO: The process of logging environment details (conda environment, git patch) is underway. Please be patient as this may take some time.
<more comet info>
This only happens if I successfully set up a Comet experiment before the line that throws the exception. It does not happen outside of Metaflow: if I put the same code I have in the start step into a plain Python file, I do get the traceback.
More data: when using Comet's @comet_flow decorator, we do get the traceback. (Unfortunately, we would like to avoid using the decorator.)
from comet_ml.integration.metaflow import comet_flow

@comet_flow(project_name="comet-example-metaflow-hello-world")
class CometFlow(FlowSpec):
    @secrets(sources=["metaflow_secrets"])
    @step
    def start(self):
        num = 0 / 0  # traceback is printed in this variant
When setting up the experiment inside the step code, we do not get the traceback:
class CometFlow(FlowSpec):
    @secrets(sources=["metaflow_secrets"])
    @step
    def start(self):
        experiment = Experiment(
            project_name="comet-example-metaflow-hello-world",
        )

        num = 0 / 0
On macOS both work; on Linux only the first one does. I will see if I can report this to the Comet team as well.
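In the meantime, one possible diagnostic is to print the traceback explicitly from inside the failing function before re-raising, so that it does not depend on whoever handles the exception afterwards. The sketch below is only an illustration using the standard library: `print_traceback_on_error` and `failing_step` are hypothetical names, the function stands in for the body of the start step, and none of this has been verified against Metaflow's `@step` or confirmed to restore the traceback on Linux.

```python
import functools
import sys
import traceback


def print_traceback_on_error(func):
    """Hypothetical helper: print the traceback to stderr, then re-raise."""

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        try:
            return func(*args, **kwargs)
        except Exception:
            # Emit the traceback ourselves and flush immediately, instead of
            # relying on the caller to report the exception.
            traceback.print_exc(file=sys.stderr)
            sys.stderr.flush()
            raise  # keep the original failure semantics

    return wrapper


@print_traceback_on_error
def failing_step():
    # Stand-in for the body of the start step (assumption, not Metaflow code).
    return 0 / 0
```

Calling `failing_step()` still raises `ZeroDivisionError`, but the traceback is written to stderr first. If that output also disappears on Linux once an `Experiment` has been created, that would suggest the problem lies with how stderr is handled rather than with how the exception is reported.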