Big Data Engineer
| Company | Synechron |
| --- | --- |
| Location | Charlotte, NC, USA |
| Salary | $100,000 – $110,000 |
| Type | Full-Time |
| Degrees | Bachelor’s, Master’s, PhD |
| Experience Level | Expert or higher |
Requirements
- Bachelor’s, Master’s, or Ph.D. in Computer Science, Information Technology, or a related field.
- 10+ years of industry experience in big data engineering with proven expertise in Hadoop and Spark technologies.
- Extensive experience with Hadoop ecosystem components: HDFS, MapReduce, YARN, Hive, Pig, HBase, and Oozie.
- Strong proficiency in Apache Spark (Scala, Python, or Java) and Spark SQL (a short PySpark sketch follows this list).
- Experience with data ingestion tools such as Apache NiFi, Kafka, or Flume.
- Hands-on experience with cloud platforms (AWS, Azure, GCP) and their big data services integrated with Hadoop/Spark.
- Knowledge of data modeling, data warehousing, and database technologies (NoSQL, relational systems).
- Familiarity with containerization and orchestration tools like Docker and Kubernetes.
- Familiarity with data governance, security, and compliance standards.
- Excellent problem-solving, system architecture, and communication skills.
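
As a reference point for the Spark SQL proficiency listed above, here is a minimal PySpark sketch, assuming a hypothetical Parquet dataset of transactions; the path, view name, and columns (`account_id`, `event_time`, `amount`) are illustrative only.

```python
from pyspark.sql import SparkSession

# Start a local Spark session; production deployments on YARN or
# Kubernetes would add cluster-specific configuration.
spark = SparkSession.builder.appName("example-spark-sql").getOrCreate()

# Hypothetical input: a Parquet dataset of transaction records.
df = spark.read.parquet("/data/transactions")

# Register the DataFrame as a temporary view so it can be queried with SQL.
df.createOrReplaceTempView("transactions")

# Aggregate daily totals per account with Spark SQL.
daily_totals = spark.sql("""
    SELECT account_id,
           to_date(event_time) AS event_date,
           SUM(amount)         AS total_amount
    FROM transactions
    GROUP BY account_id, to_date(event_time)
""")

daily_totals.show()
spark.stop()
```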
Responsibilities
- Design, develop, and optimize big data pipelines and processing frameworks using Hadoop (HDFS, MapReduce, YARN) and Apache Spark.
- Build scalable data ingestion processes and data lakes for diverse data sources.
- Develop and maintain ETL workflows that handle processing of structured and unstructured data.
- Collaborate with Data Scientists, Analysts, and Business Teams to translate requirements into technical solutions.
- Tune and troubleshoot Spark and Hadoop jobs for efficiency and performance (see the tuning sketch after this list).
- Implement data security, privacy, and compliance best practices across all platforms.
- Mentor junior team members and foster best practices in big data development.
- Stay current with emerging trends and technologies related to Hadoop and Spark.
- Document architecture, workflows, and standards for maintainability and knowledge sharing.
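
To make the tuning responsibility above concrete, the sketch below shows a few common Spark levers (shuffle-partition sizing, broadcast joins, caching). The table paths, join key, and the 64-partition setting are assumptions for illustration, not recommended values.

```python
from pyspark.sql import SparkSession
from pyspark.sql.functions import broadcast

spark = (
    SparkSession.builder
    .appName("example-tuning")
    # Hypothetical setting: size shuffle partitions to the cluster's
    # parallelism instead of the default 200.
    .config("spark.sql.shuffle.partitions", "64")
    .getOrCreate()
)

# Hypothetical datasets: a large fact table and a small dimension table.
facts = spark.read.parquet("/data/facts")
dims = spark.read.parquet("/data/dims")

# Broadcasting the small table avoids shuffling the large one.
joined = facts.join(broadcast(dims), on="dim_id")

# Cache a result that downstream steps reuse, then materialize it once.
joined.cache()
joined.count()

spark.stop()
```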
Preferred Qualifications
- Experience with Spark Streaming and real-time data processing (a streaming sketch follows this list).
- Knowledge of advanced analytics, machine learning pipelines, and integration with Spark MLlib.
- Experience with automation and orchestration tools such as Apache Airflow.
- Familiarity with version control and CI/CD practices for big data platforms.
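
For the real-time processing noted above, here is a minimal Spark Structured Streaming sketch that reads from Kafka, assuming the spark-sql-kafka connector is on the classpath; the broker address, topic, and checkpoint path are placeholders.

```python
from pyspark.sql import SparkSession
from pyspark.sql.functions import col

spark = SparkSession.builder.appName("example-streaming").getOrCreate()

# Hypothetical Kafka source: broker and topic are placeholders.
events = (
    spark.readStream
    .format("kafka")
    .option("kafka.bootstrap.servers", "broker:9092")
    .option("subscribe", "events")
    .load()
)

# Kafka delivers values as binary; cast to string for downstream parsing.
payloads = events.select(col("value").cast("string").alias("payload"))

# Console sink for demonstration; a production job would write to a
# durable sink (Parquet, Kafka, etc.) with the same checkpointing.
query = (
    payloads.writeStream
    .format("console")
    .option("checkpointLocation", "/tmp/checkpoints/events")
    .start()
)

query.awaitTermination()
```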