Staff Data Scientist

Company	Proofpoint
Location	Remote in India; Toronto, ON, Canada; Cork, Ireland
Salary	Not provided
Type	Full-Time
Degrees	Master’s, PhD
Experience Level	Expert or higher

PhD or Master’s degree in Computer Science, Data Science, Machine Learning, Statistics, or related discipline.
10+ years of experience in data science or applied machine learning, with 3+ years in a technical leadership or managerial role.
Proven track record of designing, developing, and deploying ML and GenAI solutions at scale.
Hands-on experience working with LLMs (e.g., OpenAI, Anthropic, LLaMA, Mistral) and GenAI frameworks (e.g., LangChain, LlamaIndex, Hugging Face).
Experience in cybersecurity or enterprise-scale threat detection systems is a strong plus.
Proficiency in Python and relevant ML/AI libraries (e.g., PyTorch, TensorFlow, Transformers, Scikit-learn).
Strong grasp of LLM fine-tuning, prompt engineering, RAG pipelines, vector databases (e.g., FAISS, Pinecone), and inference optimization.
Experience with cloud platforms (AWS, GCP, Azure) and containerization tools (Docker, Kubernetes).
Solid understanding of MLOps principles including CI/CD for ML, feature stores, model versioning, and monitoring.
Familiarity with privacy, security, and compliance considerations in deploying AI solutions.
Excellent leadership and mentorship skills, with a collaborative approach to cross-functional problem solving.
Ability to communicate complex technical ideas to both technical and non-technical stakeholders.
Strong innovation mindset, strategic thinking, and a passion for applying AI to impactful real-world problems.

Lead the development of machine learning models and advanced analytics solutions to solve complex business problems.
Design and implement generative AI and large language model (LLM) applications, including fine-tuning and domain adaptation for cybersecurity use cases.
Collaborate with engineering teams to build scalable and secure LLM-based systems (retrieval-augmented generation, prompt engineering, evaluation pipelines).
Architect and lead AI solutions across the full lifecycle, from experimentation to MLOps pipelines and production deployment.
Design experiments and use statistical analysis to measure the impact of various business strategies.
Oversee the deployment of machine learning models and LLMs in production, ensuring performance, scalability, and responsible AI practices.
Lead the development of model performance monitoring, observability, and continuous learning pipelines.
Define technical direction for LLM and GenAI adoption, including benchmarking open-source and commercial models.
Champion AI/ML best practices including model governance, reproducibility, and ethical AI considerations.
Promote a data-driven and AI-forward culture within the organization and advocate for cutting-edge AI adoption across teams.
Stay current with advancements in LLMs, GenAI, AI engineering, and emerging AI regulations.