Software Engineer – XR Codec Interactions and Avatars Team

Company: Meta
Location: Redmond, WA, USA; Pittsburgh, PA, USA
Salary: $56.25 – $173,000
Type: Full-Time
Degrees: Bachelor’s
Experience Level: Mid Level, Senior

Requirements

  • Currently has, or is in the process of obtaining, a Bachelor’s degree in Computer Science, Computer Engineering, a relevant technical field, or equivalent practical experience. Degree must be completed prior to joining Meta
  • 3+ years of experience in UNIX/Linux and a clear understanding of TCP/IP network fundamentals
  • 5+ years of experience coding in at least one of the following languages: C++, Python, or Rust
  • Experience with software development practices such as source control, code reviews, unit testing, debugging and profiling
  • Experience with Internet service architecture capacity planning and/or handling needs for urgent capacity augmentation
  • Knowledge of common web technologies and/or Internet service architectures (such as LAMP or MEAN stacks, CDN, Load Balancing techniques, etc.)
  • Experience configuring and running infrastructure level applications, such as Kubernetes, Terraform, MySQL, SLURM, etc.

Responsibilities

  • Leverage the scale and complexity of the larger Meta infrastructure to accelerate our Codec Interaction and Avatars projects
  • Influence outcomes within your immediate team, peer engineering teams, and with cross-functional stakeholders
  • Work independently, handle large projects simultaneously, and prioritize team roadmap and deliverables by balancing required effort with resulting impact
  • Own Research Super Cluster back-end services which handle fleet management, infrastructure components that drive Meta’s advances in AI, core services which are used by every team at XRCIA, networking systems, and everything in between
  • Author and review code, develop documentation and capacity plans, and debug the hardest problems, all live, on some of the largest and most complex systems in the world
  • Share an on-call rotation with your engineering team and serve as an escalation contact for service incidents. Lead incident root cause analysis across multiple data engineering layers (compute, storage, network) for GPU clusters, and act as a final escalation point

Preferred Qualifications

  • Thorough understanding of the Linux operating system, including the networking subsystem
  • Experience in distributed system performance measurement, logging, and optimization
  • Experience with Python library management systems such as Conda
  • Prior experience in cluster on-call operations, including troubleshooting server/scheduler/storage errors, maintaining compute/storage environments/libraries/tools, helping onboard users to the cluster, and answering general questions from users
  • Prior experience in cluster coordination and strategy planning, including collecting/understanding needs of users, developing tools to improve user experience, providing guidance on best practices, forecasting compute/storage needs, and developing long-term user experience/compute/storage strategies
  • Prior experience building tooling for monitoring and telemetry
  • Prior experience in developing/managing distributed network file systems
  • Prior experience in network security