Data Management Lead
Company | BioAgilytix |
---|---|
Location | Durham, NC, USA |
Salary | Not provided |
Type | Full-Time |
Degrees | Bachelor’s, Master’s |
Experience Level | Expert or higher |
Requirements
- Bachelor’s degree in Data Science, Information Management, Computer Science, or a related field; Master’s degree preferred.
- 10+ years in data management, governance, or a related field, ideally in a highly regulated industry (life sciences, pharmaceuticals, healthcare).
- Strong expertise with data integration tools (e.g., Microsoft Azure Data Factory, Informatica), ETL processes, and data governance platforms (e.g., Collibra, Alation).
- Demonstrated experience as a self-directed technical contributor with hands-on expertise in data engineering, data governance, and compliance automation, including work with LabVantage LIMS and ERP systems.
- Proven ability to define data standards, develop and implement quality assurance processes, and ensure data integrity in alignment with regulatory frameworks such as HIPAA, GxP, GDPR, and FERPA.
Responsibilities
- Develop a foundational data governance framework, with hands-on responsibility for data quality, access control, and compliance protocols.
- Define standards, build quality assurance processes, and monitor data integrity to meet regulatory requirements, including HIPAA, GxP, GDPR, and FERPA.
- Design data quality KPIs and implement initial quality checks independently, ensuring data accuracy, consistency, and reliability.
- Develop ETL processes to automate data extraction, transformation, and loading from core systems (LabVantage LIMS, ERP) into a centralized data repository.
- Design and implement the architecture for a scalable data hub, using cloud platforms like Microsoft Azure or SQL-based solutions for centralized data storage.
- Create APIs to enable real-time data sharing between systems, with a focus on operational efficiency and data integrity.
- Use Robotic Process Automation (RPA) tools like UiPath or Microsoft Power Automate to automate high-frequency, repetitive tasks, such as data validation and QC compliance checks.
- Establish automated data validation and rule-based checks to identify data discrepancies in real time, enhancing regulatory compliance and minimizing errors.
- Independently design and configure basic automated workflows to streamline compliance reporting and reduce manual effort.
- Organize and clean historical QC and operational data to lay the groundwork for future predictive models focused on quality control, lab maintenance, and operational forecasting.
- Assess core lab and operational data for predictive value, setting up structured data sets to support the rollout of predictive analytics as the team grows.
- Independently create data visualizations and dashboards using tools like Power BI or Tableau to enable lab and operations teams to access actionable insights.
- Work closely with IT, lab operations, and quality/compliance teams to ensure alignment on data goals and processes.
- Act as an advocate for data quality and governance, providing education to teams on the importance of data integrity and compliance.
- Develop a roadmap for building a high-performing data management team, identifying roles in data engineering, quality, and automation as the function matures.
- Direct the complete pipeline for extracting, transforming, validating, packaging, and securely transmitting data sets to sponsors, meeting agreed SLAs (e.g., ≤ 24 h for interim cuts, ≤ 5 days for final locked data).
- Standardize fields, units, and controlled vocabularies (e.g., CDISC SDTM/SEND) to present a single, consistent dataset to sponsors.
- Leverage RPA, scripted edit checks, and audit-trail capture to shrink cycle times and reduce defects.
- Provide dashboards, secure APIs, or portals that allow sponsors real-time visibility into validated data.
- Ensure comprehensive data-lineage and validation documentation is always ready for FDA, EMA, or sponsor inspection.
Preferred Qualifications
- Proficiency in designing, building, and optimizing ETL pipelines using SQL and cloud data tools (Azure Data Factory, Informatica). Ability to work with large datasets from multiple sources (LIMS, ERP).
- Experience with RPA platforms (e.g., UiPath, Microsoft Power Automate) to independently automate data validation checks and routine compliance tasks.
- Skilled in developing and managing APIs for data integration, enabling seamless connectivity between lab systems, ERP, and data hubs.
- Proficient in Power BI or Tableau for creating dashboards and visualizations, translating data insights into accessible formats for lab and operational teams.
- Familiarity with governance platforms like Collibra or Alation for data cataloging, quality monitoring, and compliance tracking.