Senior Software Developer – Big Data

Company: Autodesk
Location: Toronto, ON, Canada
Salary: Not Provided
Type: Full-Time
Degrees: Not Provided
Experience Level: Senior

Requirements

  • 5+ years of experience with cloud ETL, ELT, and near-real-time data collection, transport, and processing technologies
  • 3+ years of experience with data modeling, including the design of schemas optimized for data retrieval
  • 5+ years of experience programming with SQL, Python, Spark, PySpark, Spark SQL, Java, Jinja, dbt, and related technologies (a minimal PySpark sketch follows this list)
  • 3+ years of experience working with a big data environment including Hadoop, Hive, Spark and Presto
  • Experience architecting and implementing data testing solutions
  • Experience with workflow management tools, like Airflow and Temporal
  • Proven expertise with algorithms, distributed systems design and the software development lifecycle
  • Experience designing and implementing RESTful APIs in Python, preferably with Flask
  • Experience deploying and maintaining cloud infrastructure, preferably on AWS
  • Experience with Infrastructure as Code frameworks, preferably Terraform
  • Experience developing CI/CD pipelines, preferably with Jenkins or Spinnaker
  • Familiarity with data governance frameworks, SDLC (Software Development Life Cycle), and Agile methodology
  • Strong technical skills and an interest in learning best-in-class technologies for data warehousing, data wrangling, data quality, data governance, and data ethics
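
The requirements above center on PySpark and Spark SQL. As a rough illustration only (not part of the posting), the sketch below shows the kind of batch transformation the role describes; all table, column, and bucket names are hypothetical.

    # Minimal sketch, not Autodesk's code: a PySpark job that rolls up
    # hypothetical product usage logs with Spark SQL.
    from pyspark.sql import SparkSession

    spark = SparkSession.builder.appName("usage-rollup").getOrCreate()

    # Hypothetical source: usage logs landed as Parquet.
    logs = spark.read.parquet("s3://example-bucket/usage_logs/")
    logs.createOrReplaceTempView("usage_logs")

    # Aggregate events per product per day via Spark SQL.
    daily_counts = spark.sql("""
        SELECT product,
               CAST(event_time AS DATE) AS event_date,
               COUNT(*) AS events
        FROM usage_logs
        GROUP BY product, CAST(event_time AS DATE)
    """)

    # Write partitioned output for downstream consumers.
    daily_counts.write.mode("overwrite").partitionBy("event_date") \
        .parquet("s3://example-bucket/daily_counts/")

    spark.stop()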

Responsibilities

  • Design, architect, and implement secure, scalable data solutions that enable data scientists and analysts
  • Develop micro-frontends and APIs that interact to automate complex processes
  • Leverage programming languages for data manipulation, including SQL, Python, Spark, PySpark, Spark SQL, Java
  • Apply CI/CD processes to orchestrate automated batch and streaming pipelines running PySpark and Flink (see the Airflow sketch after this list)
  • Design, develop, execute, and document software solutions that address complex data collection, processing, transformation, testing, and publishing requirements
  • Collaborate with peer organizations, DevOps, and Support on technical issues and help troubleshoot code level problems and performance issues
  • Recommend architectural standards, best practices and quality assurance processes for data related systems and applications
  • Build data quality and durability tracking mechanisms to provide visibility into and address changes in data ingestion, processing, and storage
  • Lead and/or help design data schemas to ensure common protocols and storage mechanisms are used across all data services
  • Partner with data and software architects to design data models, APIs or other architectural elements
  • Develop, refine, and educate the data community on coding standards and best practices
  • Automate data transformations against data sourced from a variety of systems including desktop, web, and mobile product usage logs and business systems data
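
For the orchestration side, here is a minimal sketch assuming Apache Airflow 2.4 or later (one of the workflow tools named in the requirements). The DAG id, schedule, and spark-submit path are illustrative placeholders, not the team's actual pipeline.

    # Minimal sketch: a daily Airflow DAG that submits the hypothetical
    # PySpark rollup job shown after the Requirements list.
    from datetime import datetime

    from airflow import DAG
    from airflow.operators.bash import BashOperator

    with DAG(
        dag_id="daily_usage_rollup",
        start_date=datetime(2024, 1, 1),
        schedule="@daily",   # cadence is illustrative
        catchup=False,       # skip backfills for missed intervals
    ) as dag:
        # Placeholder spark-submit invocation for a PySpark batch job.
        run_rollup = BashOperator(
            task_id="run_rollup",
            bash_command="spark-submit /opt/jobs/usage_rollup.py",
        )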

Preferred Qualifications

    No preferred qualifications provided.