Data Scientist

Company: Caterpillar Inc.
Location: Chicago, IL, USA
Salary: $95,640 – $155,400
Type: Full-Time
Degrees: Master’s
Experience Level: Junior

Requirements

  • Master’s degree, or foreign equivalent, in Computer Science, Computer Engineering, or a related field
  • One (1) year of experience as a Junior Data Scientist, Associate Data Engineer, or related occupation
  • Formal version control such as Git
  • Extracting, transforming, and loading structured data sets
  • Industry-standard data science packages and libraries in Python
  • Operating in an Agile environment
  • Performing data processing and working on analytics initiatives
  • Creating data extracts
  • Working with Python, NoSQL, and relational databases such as Snowflake and MySQL, and experience with dashboard / reporting data using BI software such as Tableau and MS Power BI

Responsibilities

  • Perform analytical tasks and initiatives on large volumes of data to support data-driven business decisions and development
  • Direct high-volume data gathering, data mining, and data processing, creating appropriate data models
  • Explore, promote, and implement semantic data capabilities through Natural Language Processing, text analysis, and machine learning techniques
  • Lead the definition of requirements and scope for data analyses, presenting and reporting possible business insights to management using data visualization technologies
  • Conduct research on data model optimization and algorithms to improve the effectiveness and accuracy of data analyses
  • Develop failure models that help prevent weeks to months of machine downtime
  • Responsible for the development and deployment of predictive analytical models for condition monitoring on Caterpillar engines and machines
  • Build analytical models for active monitoring of mining and construction machinery, using high-frequency time-series (sensor) data in Python
  • Develop Python packages and machine learning models for Caterpillar partners, maintaining data privacy using AWS WorkSpaces and SageMaker
  • Develop an end-to-end strategy, from data modeling to communicating insights to SMEs and business partners
  • Develop back-testing strategies and validate models to compute model performance metrics
  • Develop Python packages and rule-based algorithms to efficiently solve problems faced during data retrieval and model development
  • Work with SQL using Snowflake and SQLAlchemy to keep a data pipeline ready for condition monitoring and statistical metrics
  • Lead the failure-modeling deployment pipeline for all industry partner requests
  • Develop dashboards using Power BI to identify patterns in model triggers
  • Responsible for presenting insights on on-prem tools and projects to business and cross-industry partners

Preferred Qualifications

    No preferred qualifications provided.