Staff Data Scientist
Company | Proofpoint |
---|---|
Location | Remote in India; Toronto, ON, Canada; Cork, Ireland |
Salary | Not provided |
Type | Full-Time |
Degrees | Master’s, PhD |
Experience Level | Expert or higher |
Requirements
- PhD or Master’s degree in Computer Science, Data Science, Machine Learning, Statistics, or related discipline.
- 10+ years of experience in data science or applied machine learning, with 3+ years in a technical leadership or managerial role.
- Proven track record of designing, developing, and deploying ML and GenAI solutions at scale.
- Hands-on experience working with LLMs (e.g., OpenAI, Anthropic, LLaMA, Mistral) and GenAI frameworks (e.g., LangChain, LlamaIndex, Hugging Face).
- Proficiency in Python and relevant ML/AI libraries (e.g., PyTorch, TensorFlow, Transformers, Scikit-learn).
- Strong grasp of LLM fine-tuning, prompt engineering, RAG pipelines, vector databases (e.g., FAISS, Pinecone), and inference optimization.
- Experience with cloud platforms (AWS, GCP, Azure) and containerization tools (Docker, Kubernetes).
- Solid understanding of MLOps principles including CI/CD for ML, feature stores, model versioning, and monitoring.
- Familiarity with privacy, security, and compliance considerations in deploying AI solutions.
- Excellent leadership and mentorship skills, with a collaborative approach to cross-functional problem solving.
- Ability to communicate complex technical ideas to both technical and non-technical stakeholders.
- Strong innovation mindset, strategic thinking, and a passion for applying AI to impactful real-world problems.
Responsibilities
- Lead the development of machine learning models and advanced analytics solutions to solve complex business problems.
- Design and implement generative AI and large language model (LLM) applications, including fine-tuning and domain adaptation for cybersecurity use cases.
- Collaborate with engineering teams to build scalable and secure LLM-based systems (retrieval-augmented generation, prompt engineering, evaluation pipelines).
- Architect and lead AI solutions across the full lifecycle, from experimentation to MLOps pipelines and production deployment.
- Design experiments and use statistical analysis to measure the impact of various business strategies.
- Oversee the deployment of machine learning models and LLMs in production, ensuring performance, scalability, and responsible AI practices.
- Lead the development of model performance monitoring, observability, and continuous learning pipelines.
- Define technical direction for LLM and GenAI adoption, including benchmarking open-source and commercial models.
- Champion AI/ML best practices including model governance, reproducibility, and ethical AI considerations.
- Promote a data-driven and AI-forward culture within the organization and advocate for cutting-edge AI adoption across teams.
- Stay current with advancements in LLMs, GenAI, AI engineering, and emerging AI regulations.
Preferred Qualifications
- Experience in cybersecurity or enterprise-scale threat detection systems is a strong plus.