Data Engineer
| Company | Manulife Financial |
|---|---|
| Location | Toronto, ON, Canada; Kitchener, ON, Canada |
| Salary | $75,880 – $140,920 |
| Type | Full-Time |
| Degrees | Bachelor’s |
| Experience Level | Senior |
Requirements
- Bachelor’s Degree in Computer Science, Information Technology, or a related field.
- 5+ years of experience as a Data Engineer, with a track record of efficiently implementing and maintaining data pipelines, scheduling, monitoring, notification, and ETL processes using Azure Data Factory, Databricks, and Python/PySpark, Java, or Scala.
- Understanding of Azure infrastructure: subscriptions, resource groups, resources, access control with RBAC (role-based access control), integration with Azure AD and Azure security principals (user group, service principal, managed identity), network concepts (VNet, subnet, NSG rules, private endpoints), and password/credential/key management and data protection (see the first sketch after this list).
- Experience deploying and integrating Azure Data Services (Azure Data Factory, Databricks) using DevOps tools and principles: GitHub repositories, Jenkins CI/CD pipelines, integrated unit tests, etc.
- Knowledge of Azure Data Lake Storage (ADLS Gen2) and its topology on blob storage: storage accounts, containers, folders, and the hierarchical file namespace.
- Proficient in data mart fact and dimension design concepts and ETL/ELT logic to perform upserts and type-2 slowly changing dimensions, applying/implementing them with Python/PySpark/stored procedures/SQL in ADLS (with a Lakehouse architecture using Databricks) or in Azure Synapse (with a Dedicated SQL pool); see the merge sketch after this list.
- Experience using Power BI to connect to sources such as ADLS (including Delta Lake), Databricks SQL, and Azure Synapse (SQL pool); design semantic layers/data models; create and publish reporting content (datasets, paginated reports, interactive dashboards); and manage workspaces.
- Solid understanding of data privacy and compliance regulations and standard methodologies.
- Excellent problem-solving skills and the ability to troubleshoot and resolve technical issues efficiently.
- Effective communication skills to collaborate with technical and non-technical partners.
- Thorough approach with a commitment to delivering high-quality work in a fast-paced environment.
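As a rough illustration of the Azure identity and ADLS topology items above, the sketch below lists the contents of an ADLS Gen2 folder using whatever credential DefaultAzureCredential resolves (managed identity, service principal environment variables, or an Azure CLI login). The storage account, container, and folder names are placeholders, not details from the posting.

```python
# Illustrative only: lists paths in an ADLS Gen2 container using the
# credential resolved by DefaultAzureCredential (managed identity,
# service principal env vars, Azure CLI login, ...).
# Account, container, and folder names below are placeholders.
from azure.identity import DefaultAzureCredential
from azure.storage.filedatalake import DataLakeServiceClient

credential = DefaultAzureCredential()
service = DataLakeServiceClient(
    account_url="https://<storage-account>.dfs.core.windows.net",
    credential=credential,
)

filesystem = service.get_file_system_client("raw")    # container
for item in filesystem.get_paths(path="sales/2024"):  # folder hierarchy
    print(item.name, "dir" if item.is_directory else "file")
```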
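For the upsert/type-2 requirement, here is a minimal PySpark sketch of the common Delta Lake MERGE pattern for a slowly changing dimension, assuming a Databricks-style environment. The table (dim_customer), key (customer_id), and tracked attribute (address) are hypothetical names chosen for the example.

```python
# Minimal SCD type-2 upsert sketch using Delta Lake's MERGE on Databricks.
# dim_customer, customer_id, and address are hypothetical names.
from pyspark.sql import SparkSession, functions as F
from delta.tables import DeltaTable

spark = SparkSession.builder.getOrCreate()

updates = spark.read.format("delta").load("/mnt/staging/customer_updates")
dim = DeltaTable.forPath(spark, "/mnt/curated/dim_customer")

# Rows whose tracked attribute changed need a fresh "current" version.
changed = (
    updates.alias("u")
    .join(dim.toDF().alias("d"),
          (F.col("u.customer_id") == F.col("d.customer_id"))
          & F.col("d.is_current"))
    .where("u.address <> d.address")
    .select("u.*")
)

# Staging trick: rows with a null merge_key never match, so the MERGE
# inserts them as new versions; rows keyed by customer_id close out
# the old version via the matched branch.
staged = (
    changed.withColumn("merge_key", F.lit(None).cast("string"))
    .unionByName(updates.withColumn("merge_key",
                                    F.col("customer_id").cast("string")))
)

(dim.alias("d")
 .merge(staged.alias("s"),
        "d.customer_id = s.merge_key AND d.is_current = true")
 .whenMatchedUpdate(
     condition="d.address <> s.address",
     set={"is_current": "false", "end_date": "current_date()"})
 .whenNotMatchedInsert(values={
     "customer_id": "s.customer_id",
     "address": "s.address",
     "start_date": "current_date()",
     "end_date": "null",
     "is_current": "true"})
 .execute())
```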
Responsibilities
- Design, develop, and manage end-to-end data pipelines that facilitate the detailed extraction, transformation, and loading of data from diverse sources.
- Collaborate closely with multi-functional teams to understand and design schemas for data from various source systems and other transactional or application databases, ensuring accuracy and reliability.
- Continuously improve and optimize ETL processes to enhance data flow efficiency, minimize latency, and support real-time and batch processing requirements.
- Implement data cleansing, enrichment, and transformation processes to ensure high-quality data is available for analysis and reporting.
- Design testing plans, and develop and implement data quality checks, validation rules, and monitoring mechanisms to maintain data accuracy and integrity (a small example follows this list).
- Collaborate with various technical resources from across the organization to identify and implement enhancements to the infrastructure, integrations, and functionalities.
- Work closely with business leads and data architects to design, implement, and manage end-to-end architecture based on business requirements.
- Create and maintain comprehensive documentation for data pipelines, processes, and configurations to facilitate knowledge sharing and onboarding.
- Partner with other data engineers, data analysts, business collaborators, and data scientists to understand data requirements and translate them into effective data engineering solutions.
- Monitor data pipeline performance and resolve issues to ensure optimal data flow, proactively identifying opportunities for enhancement.
- Ensure consistency with data privacy and compliance standards throughout the data lifecycle.
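To make the data quality responsibility concrete, here is a hedged PySpark sketch of a simple quality gate that fails a pipeline run when checks do not pass, so the scheduler's monitoring and notification hooks fire. The table path, column names, and thresholds are illustrative assumptions, not Manulife's actual standards.

```python
# Illustrative data-quality gate: raise on failed checks so downstream
# monitoring/notification (e.g., ADF alerts on activity failure) fires.
# Table path, columns, and thresholds are assumptions for the example.
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.getOrCreate()
df = spark.read.format("delta").load("/mnt/curated/policy_transactions")

checks = {
    "primary_key_not_null": df.filter(F.col("policy_id").isNull()).count() == 0,
    "no_duplicate_keys": (
        df.groupBy("policy_id", "txn_date").count().filter("count > 1").count() == 0
    ),
    "amounts_within_range": (
        df.filter(~F.col("amount").between(-1e7, 1e7)).count() == 0
    ),
}

failed = [name for name, passed in checks.items() if not passed]
if failed:
    raise ValueError(f"Data quality checks failed: {failed}")
```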
Preferred Qualifications
- Master’s degree is a plus.