Location: SF Bay Area
Type: Full Time
Compensation: Cash + Equity
Vali is transforming the home care industry from the ground up with our agentic OS. We’re hiring a pragmatic ML platform engineer—ideally with tech-lead experience—to stand up our ML stack and own models in production.
What you’ll do
- Build the ML platform from scratch: data pipelines, feature/embedding store, model registry, CI/CD for models, evals (offline/online), observability, rollback.
- Ship production models/agents for scheduling & matching (availability forecasting, constraints/optimization) and communications (intent/routing, summarization, after-hours voice agent).
- Create training/feedback loops from historical interactions; enforce data quality, drift detection, guardrails, and human-in-the-loop review.
- Apply reinforcement learning to LLM agents: reward modeling, offline RL from logged interactions, contextual bandits and A/B tests, RLAIF/RLHF, safe exploration, and policy evaluation.
- Define and move the metrics (fill rate, on-time starts, reassign latency, SLA adherence) with tight product/ops collaboration.
- [Optional] Lead/mentor a small team; drive roadmap and engineering standards.
What you’ve done
- 5–8+ years building ML systems in production with ownership of reliability, latency, and business KPIs.
- Hands-on with MLOps: Airflow/Prefect, Spark/Ray, Feast/feature stores, MLflow/Kubeflow, vector DBs, Docker/K8s, and a major cloud (GCP/AWS/Azure).
- Experience with ranking/matching and forecasting; strong Python and software engineering fundamentals.
- LLM/agent know-how (RAG, tool use, orchestration) or a demonstrated ability to ramp quickly.
Nice to have
- Workforce scheduling, logistics, contact centers, or healthcare ops background.
- HIPAA/PHI handling and healthcare compliance experience.