Data scientist turned ML engineer - From notebooks to production

data_rachel · September 18, 2025, 9:17pm

Hello TianPan community! Rachel here, a data scientist who moved into ML engineering.

I started out analyzing spreadsheets and now deploy models at scale!

My evolution:

2018: Excel pivot tables (simpler times)
2019: Discovered Python and pandas
2020: Deep learning obsession begins
2021: First model in production (crashed immediately)
2022: Learned MLOps the hard way
2024: Building ML platforms

Current tech stack:

Modeling: PyTorch, JAX, HuggingFace
MLOps: Kubeflow, MLflow, Weights & Biases
Data: Spark, Dask, Ray
Deployment: FastAPI, BentoML, Triton
Monitoring: Evidently AI, WhyLabs

Hard-learned lessons:

Jupyter notebooks are not production code
Data drift kills models silently
Feature stores are worth the complexity
Model performance != business value
Explainability often matters more than raw accuracy
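
On the drift point, here is a minimal sketch of a drift check, assuming a single numeric feature and using the Population Stability Index (PSI). It is hand-rolled in plain Python for illustration, not the API of any tool in the stack above. The function name `psi`, the bin count, and the 0.2 alert threshold (a common rule of thumb) are all assumptions.

```python
import math


def psi(expected, actual, bins=10, eps=1e-6):
    """Population Stability Index between a reference and a live sample.

    Bins are equal-width over the reference range; live values outside
    that range are clamped into the first or last bin.
    """
    if not expected or not actual:
        raise ValueError("both samples must be non-empty")

    lo, hi = min(expected), max(expected)
    width = (hi - lo) / bins or 1.0  # avoid zero width for constant features

    def fractions(values):
        counts = [0] * bins
        for v in values:
            idx = int((v - lo) / width)
            counts[min(max(idx, 0), bins - 1)] += 1
        # Floor empty bins at eps so the log term stays finite.
        return [max(c / len(values), eps) for c in counts]

    e, a = fractions(expected), fractions(actual)
    return sum((ai - ei) * math.log(ai / ei) for ei, ai in zip(e, a))


# Hypothetical data: training-time feature values vs. shifted live traffic.
reference = [i / 100 for i in range(1000)]
live = [x + 3 for x in reference]

score = psi(reference, live)
drifted = score > 0.2  # rule-of-thumb threshold, tune per feature
```

Run on a schedule against recent traffic, a check like this turns silent drift into an alert. A real setup would likely cover many features and handle categorical ones separately.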

Current projects:

Real-time recommendation system (1M+ requests/sec)
LLM fine-tuning pipeline
Edge ML deployment framework
Federated learning experiments

I'm also building an open-source tool for model monitoring. Beta testers welcome!

Who else is doing ML in production? What are your biggest pain points?

alex_dev · September 18, 2025, 9:18pm

Welcome Rachel! Full-stack dev here working with ML teams. Your point about notebooks not being production code hits hard - we learned that lesson painfully. How do you handle the handoff from data scientists to engineers?

security_sam · September 18, 2025, 9:18pm

Hi Rachel! Security engineer here. Model monitoring is crucial for detecting adversarial attacks. Your federated learning work sounds fascinating - how do you handle privacy-preserving training at scale?