framebit | 2 years ago

I'm an ML Engineer who's really closer to an MLOps role. I'm weak on ML and strong on data, scaling, cloud stuff, infra as code, making processes not suck, kinda everything _but_ the ML. So take my opinions for what they are worth, and keep in mind that the role of ML Engineering at company A != ML Engineering at company B.

I've described ML Engineering as putting the "science" in Data Science because we help introduce reproducibility. For example, I can take your model training and make it a robust process that happens over a huge amount of data on a daily basis with all the monitoring, logging, and reliability stuff surrounding that.

Some topics I would personally want to see for an ML Engineer on my team (and again, "ML Engineer" has less of a solid definition across the industry than "frontend engineer" or other roles that have been around longer):
- Docker: can you containerize your code? Can you interact with a local container?
- Model serving: at a basic level, can you wrap an API around a model? There's lots more systems design stuff here if you want to go deeper on model serving platforms.
- CI/CD: do you know what Jenkins (or an equivalent) does? Can you talk about a coherent code testing strategy for ML code? How would you deploy a model service using a system like Jenkins?
- Cloud stuff: you don't need to be an expert, but can you interact with cloud APIs directly or through Terraform, spin up instances, and explain the difference between object storage and databases? Do you have some Kubernetes experience (run a pod, get the logs, take some debugging steps when something's wrong)?
- Modern MLOps: model registry systems like MLflow, feature stores (DIY preferred but vendors ok).
- Scheduling and pipelining: Airflow, Vertex Pipelines, lots of options here but those are the biggies. Know how to use these for basic data pipelines, model training, and service deployment, and why and how you can deploy these via CD.
- Monitoring: know the difference between system metrics (CPU usage, etc.) and model metrics (data drift, etc.), and have strategies for monitoring both.
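To make the "wrap an API around a model" item concrete, here's a minimal sketch using only the Python standard library. Everything in it is an assumption for illustration: the `predict` function is a made-up linear scorer standing in for a real model, and the `/predict` path, the `features` field, and port 8000 are arbitrary choices. In practice you'd more likely reach for a web framework or a serving platform, but the shape is the same.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer


def predict(features):
    """Hypothetical stand-in for a real model: a fixed linear scorer."""
    weights = [0.5, -0.25, 1.0]  # made-up weights, purely illustrative
    if len(features) != len(weights):
        raise ValueError(f"expected {len(weights)} features, got {len(features)}")
    return sum(w * x for w, x in zip(weights, features))


def handle_predict(body: bytes):
    """Parse a raw JSON request body and return (status_code, response_dict)."""
    try:
        payload = json.loads(body)
        features = [float(x) for x in payload["features"]]
        return 200, {"prediction": predict(features)}
    except (ValueError, KeyError, TypeError) as exc:
        # Bad JSON, missing field, or wrong shape: report a client error.
        return 400, {"error": str(exc)}


class ModelHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        if self.path != "/predict":
            self.send_error(404)
            return
        length = int(self.headers.get("Content-Length", 0))
        status, result = handle_predict(self.rfile.read(length))
        data = json.dumps(result).encode("utf-8")
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(data)))
        self.end_headers()
        self.wfile.write(data)


def serve(port=8000):
    """Start the server on localhost; blocks until interrupted."""
    HTTPServer(("127.0.0.1", port), ModelHandler).serve_forever()
```

Calling `serve()` and POSTing a JSON body like `{"features": [2, 4, 1]}` to `/predict` returns a JSON prediction. Keeping `handle_predict` separate from the HTTP handler is deliberate: it lets you unit test the request logic in CI without starting a server.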

A lot of this stuff is harder to learn on your own because it often comes up in the context of larger teams and enterprise scale, where monitoring and reliability turn into KPIs that execs look at, but this is, to me, the stuff that defines the difference between a Data Scientist and an ML Engineer.

varane | 2 years ago

Thanks, this is useful information (and also fairly overwhelming). I have a basic idea of some of these from taking CS courses, but no hardcore experience in any of them. Even though I'd like to work on these, it does sound like I need to get into a tech company that does this in the first place. Having had my life revolve around university for a while, it looks like I have a hill to climb.