Data Engineer

Company	CVS Health
Location	Irving, TX, USA
Salary	$122,949 – $180,000
Type	Full-Time
Degrees	Master’s
Experience Level	Junior, Mid Level

Requirements

Master’s degree (or foreign equivalent) in Computer Science, Information Technology, Computer Information Systems, Engineering, or a related field
one (1) year of experience in the job offered or a related occupation
one (1) year of experience with machine learning operations, including model versioning, model and data lineage, and model deployment, scalability and orchestration
one (1) year of experience with designing data models and solutions for analytical and reporting use cases
one (1) year of experience with CI/CD, Jenkins, Git, or DevOps
one (1) year of experience with programming in Python, R, or SQL
one (1) year of experience with Spark, Airflow, Kafka, HBase, Pig, MySQL, or NoSQL
one (1) year of experience with Oracle, Teradata, or DB2
one (1) year of experience with quantitative analysis techniques, including clustering, regression, and pattern recognition
one (1) year of experience with software development lifecycle (SDLC)
one (1) year of experience contributing to large-scale application development, data science, or data analytics projects
one (1) year of experience designing data architectures, including data pipelines, distributed computing engines, and machine learning infrastructure design
one (1) year of experience with data analytics on large data sets in the healthcare, business, or retail sector
one (1) year of experience with healthcare data management processes and techniques, including data standards, interoperability, and data privacy
one (1) year of experience with cloud components including cluster management

Responsibilities

Analyze data engineering problems and develop, build and manage large-scale data structures, pipelines and efficient Extract/Transform/Load (ETL) workflows
Develop large-scale data structures and pipelines to organize, collect and standardize data to generate insights and address reporting needs
Write ETL (Extract/Transform/Load) processes, design database systems, and develop tools for real-time and offline analytic processing that improve existing systems and expand capabilities
Collaborate with the Data Science team to transform data and integrate algorithms and models into automated processes
Test and maintain systems and troubleshoot malfunctions
Leverage knowledge of Hadoop architecture, HDFS commands, and designing and optimizing queries to build data pipelines
Utilize programming skills in Python, Java, or similar languages to build robust data pipelines and dynamic systems
Build data marts and data models to support Data Science and other internal customers
Integrate data from a variety of sources and ensure adherence to data quality and accessibility standards
Analyze current information technology environments to identify and assess critical capabilities and recommend solutions to complex business problems
Experiment with available tools and advise on new tools to provide optimal solutions that meet the requirements dictated by the model/use case

Preferred Qualifications

No preferred qualifications provided.