Data Engineer
Company | City of Philadelphia |
---|---|
Location | Philadelphia, PA, USA |
Salary | Not Provided |
Type | Full-Time |
Degrees | |
Experience Level | Mid Level |
Requirements
- Python programming and scripting for automation and data workflows.
- SQL and relational databases (PostgreSQL preferred).
- Modern data engineering tools such as Apache Airflow, DBT, and Terraform.
- Cloud infrastructure and services (especially AWS).
- Git and GitHub for version control and CI/CD.
- Docker and Linux-based server environments.
- Data modeling and integration design.
- Writing clear, maintainable code and documentation.
- Analyzing large, diverse datasets to extract meaningful insights.
- Managing multiple concurrent workflows and priorities.
- Troubleshooting technical issues and implementing solutions independently.
- Collaborating effectively across departments with varying levels of technical expertise.
- Designing and implementing secure, scalable, and maintainable data systems.
- Communicating technical concepts clearly to both technical and non-technical audiences.
- Learning new tools, platforms, and programming languages as needed.
- Thriving in a hybrid work environment and contributing to a collaborative, innovative team.
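Several of the requirements above (Python scripting, SQL and relational databases, automated data workflows) come together in a typical extract-transform-load step. The following is a minimal, hypothetical sketch only: the table, columns, and sample data are invented, and SQLite stands in for PostgreSQL so the example is self-contained.

```python
import csv
import io
import sqlite3

# Hypothetical raw extract, e.g. a departmental CSV export.
RAW_CSV = """permit_id,issued,fee
1001,2024-01-15,250.00
1002,2024-02-03,175.50
"""

def load_permits(conn: sqlite3.Connection, raw: str) -> int:
    """Parse CSV rows, cast types, and load them into a relational table.

    Returns the number of rows loaded. SQLite is used here only so the
    sketch runs anywhere; a production pipeline would target PostgreSQL.
    """
    conn.execute(
        "CREATE TABLE IF NOT EXISTS permits "
        "(permit_id INTEGER PRIMARY KEY, issued TEXT, fee REAL)"
    )
    rows = [
        (int(r["permit_id"]), r["issued"], float(r["fee"]))
        for r in csv.DictReader(io.StringIO(raw))
    ]
    # Idempotent load: re-running the step does not duplicate rows.
    conn.executemany("INSERT OR REPLACE INTO permits VALUES (?, ?, ?)", rows)
    conn.commit()
    return len(rows)

conn = sqlite3.connect(":memory:")
print(load_permits(conn, RAW_CSV))  # → 2
```

In an orchestrated setting, a function like this would run as one task in an Apache Airflow DAG, with upstream extract and downstream validation tasks around it.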
Responsibilities
- Develop a deep understanding of the City’s diverse data and contribute to improving the City’s data engineering infrastructure, pipelines, models, and integrations.
- Use a blend of open-source, custom-developed, and off-the-shelf tools, including Python, Bash, SQL, DBT, GIS, Docker, Terraform, Apache Airflow, Jenkins, Postgres, PostGIS, AWS, GitHub, MuleSoft as an iPaaS, and SaaS API providers like ArcGIS Online and CARTO.
- Design, develop, and maintain robust data pipelines and workflows using Python, Apache Airflow, and related tools.
- Build and support scalable data integrations between diverse systems of record, centralized databases, and cloud-based platforms (AWS, SaaS tools, etc.).
- Collaborate with City departments and agencies to understand business needs and deliver data integrations that improve public services and resident-facing applications.
- Design and maintain data pipelines that promote equitable access to information by supporting the publication of open datasets and GIS tools that inform the public on key City initiatives.
- Document technical procedures, system architectures, and operational workflows in a clear and organized manner for internal reference and audit purposes.
- Communicate effectively with both technical and non-technical stakeholders, ensuring alignment between system capabilities and departmental needs.
- Work with departments to improve data quality and integrity, particularly for datasets used in service delivery to underrepresented and historically marginalized communities.
- Ensure integration solutions meet accessibility and inclusion standards when applied in public-facing tools and systems.
- Collaborate with business partners and stakeholders to understand technical needs, create integration solutions, and make data more accessible to internal departments and the public.
- Analyze complex datasets to identify trends, correlations, and data anomalies.
- Write technical documentation, data dictionaries, and integration specifications; effectively communicate technical information to non-technical stakeholders.
- Apply DevOps practices using tools like Docker, GitHub, and Terraform to support CI/CD pipelines.
- All other duties as assigned.
Preferred Qualifications
- Geospatial data and tools (PostGIS, ArcGIS Online, CARTO).