Data Scientist

Company: Caterpillar Inc.
Location: Chicago, IL, USA
Salary: $95,640 – $155,400
Type: Full-Time
Degrees: Master’s
Experience Level: Junior

Requirements

  • Master’s degree, or foreign equivalent, in Computer Science, Computer Engineering, or a related field
  • One (1) year of experience as a Junior Data Scientist, Associate Data Engineer, or related occupation
  • Formal version control such as Git
  • Extracting, transforming, and loading structured data sets
  • Industry-standard data science packages and libraries in Python
  • Operating in an Agile environment
  • Performing data processing and working on analytics initiatives
  • Creating data extracts
  • Working with Python, NoSQL, and relational databases such as Snowflake and MySQL, and experience with dashboard / reporting data using BI software such as Tableau and MS Power BI

Responsibilities

  • Perform analytical tasks and initiatives on large volumes of data to support data-driven business decisions and development
  • Direct high-volume data gathering, data mining, and data processing, creating appropriate data models
  • Explore, promote, and implement semantic data capabilities through Natural Language Processing, text analysis, and machine learning techniques
  • Lead the definition of requirements and scope for data analyses, presenting and reporting possible business insights to management using data visualization technologies
  • Conduct research on data model optimization and algorithms to improve the effectiveness and accuracy of data analyses
  • Develop failure models that help prevent weeks to months of machine downtime
  • Responsible for the development and deployment of predictive analytical models for condition monitoring on Caterpillar engines and machines
  • Build analytical models for active monitoring of mining and construction machinery, using high-frequency time-series (sensor) data in Python
  • Develop Python packages and machine learning models for Caterpillar partners, maintaining data privacy using AWS WorkSpaces and SageMaker
  • Develop an end-to-end strategy, from data modeling to communicating insights to SMEs and business partners
  • Develop back-testing strategies and validate models to compute model performance metrics
  • Develop Python packages and rule-based algorithms to efficiently solve problems faced during data retrieval and model development
  • Work with SQL using Snowflake and SQLAlchemy to keep a data pipeline ready for condition monitoring and statistical metrics
  • Lead the failure-modeling deployment pipeline for all industry partner requests
  • Develop dashboards using Power BI to identify patterns in model triggers
  • Responsible for presenting insights on on-prem tools and projects to business and cross-industry partners

Preferred Qualifications

    No preferred qualifications provided.