# dev-metaflow
e
Hello Team, @important-xylophone-52223 and I just finished deploying/configuring Metaflow on AWS using the CloudFormation template provided on GitHub. We would like to configure PyCharm with remote containers/compute, but can't find any resource about it. Is there a guide available? Here is the goal 🙂 : https://docs.metaflow.org/metaflow-on-aws#:~:text=Netflix%20uses%20this,remote%20instance%20natively. Thanks!
✅ 1
d
hey @elegant-carpenter-7681, what do you mean by “configure PyCharm with remote containers/compute”? The section you point to describes how, at Netflix, people use EC2 instances as their main development machines and typically use something like PyCharm or VS Code to have a local editor but a remote machine (not to say we have a perfect setup; we are actually working to improve it). What exactly are you looking to do?
e
hey! Thanks for the fast response. My goal is to set up a development environment for all the data scientists on my team, and I like the idea of having a local editor and a remote machine. However, I don't know how to configure the EC2 instance and the connection between the EC2 instance and PyCharm (ideally). Is the EC2 instance already created by the CloudFormation template? Or should I create it?
d
ah ok. So no, the EC2 instance is not created by the template (as far as I remember). This would be an EC2 instance that you set up for your user. Internally, we have a team that sets up cloud workstations, and that is what that section refers to. It’s not directly related to Metaflow per se. I am not a huge expert in CF and whatnot, but @average-beach-28850 is better versed in this. In terms of how to connect to it, typically it’s just plain old ssh
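A minimal sketch of the "plain old ssh" setup mentioned above, assuming you have already launched an EC2 instance yourself and have its key pair locally. The alias `metaflow-dev`, the address `203.0.113.10`, and the key path are hypothetical placeholders, and `ec2-user` is only the default login on Amazon Linux images. The helper just renders a `Host` entry that you could paste into `~/.ssh/config`.

```python
def render_ssh_host(alias: str, hostname: str, user: str, identity_file: str) -> str:
    """Return one Host block in OpenSSH client config syntax."""
    lines = [
        f"Host {alias}",
        f"    HostName {hostname}",            # public DNS name or IP of the instance
        f"    User {user}",                    # login user on the instance
        f"    IdentityFile {identity_file}",   # private key of the EC2 key pair
        "    ServerAliveInterval 60",          # keep idle editor sessions from dropping
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # All values below are placeholders, not real resources.
    print(render_ssh_host(
        alias="metaflow-dev",
        hostname="203.0.113.10",
        user="ec2-user",
        identity_file="~/.ssh/metaflow-dev.pem",
    ))
```

With such an entry in place, `ssh metaflow-dev` should open a shell on the instance, and an editor that supports SSH remotes can usually be pointed at the same host, user, and key.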
e
An example would be much appreciated! I found a lot of guides online, but nothing as precise as I would like. Here is a list of questions/tasks:
• What kind of EC2 instance should I aim for?
• How to properly configure the EC2 instance (with the Metaflow resources already up and running, like VPC, subnet, etc.)
• How to deploy/configure a data science Docker container on the machine
• How to ssh into the Docker container on the EC2 instance
Many questions hehe! Or maybe the EC2/remote instance is overkill for the size of our team (we are just 2, soon to be 3) and we should use our local environment.
d
oops, sorry this slipped. This is a bit past my comfort zone so @narrow-lion-2703 would probably know better. I will say that setting up remote environments for everyone is not always simple (everyone wants something slightly different). For the size of your team, I think that using your laptop and
--with batch
is probably the simplest and easiest way to go. If you wanted, you could set up a “lightweight” shared environment, for example by having a common conda configuration that everyone installs.
e
Thanks Romain for the answer! For now, I think using
--with batch
is a good solution. I'm still curious about how to set up a cloud desktop/remote container, so if anyone has a guide to implementing a simple architecture, that would be much appreciated.