Data Engineer

Medcare MSO is one of the largest USA-based Healthcare IT organizations in Pakistan, with 950+ people on board. We implement best practices and adopt state-of-the-art technology tools to achieve results. As a Data Engineer, you will be responsible for building and maintaining the data infrastructure that powers Medcare’s AI and analytics systems. You will work closely with ML Engineers and Product teams to develop reliable, scalable, and secure data pipelines across structured and unstructured healthcare data sources. Your role will focus on ensuring high-quality, accessible, and compliant data that enables downstream machine learning models and business-critical workflows. This role requires strong fundamentals in data engineering, a solid understanding of data systems, and the ability to work in regulated environments with sensitive data.

 

Position: Data Engineer

Shift Timing: 2:00 PM – 11:00 PM

Location: M.M. Alam Road, Lahore

 

Key Responsibilities:

  • Build, maintain, and optimize batch and near real-time data pipelines for ingesting, transforming, and loading data from multiple internal and external sources
  • Develop robust ETL/ELT workflows for structured and semi-structured healthcare data, ensuring data quality, consistency, and reliability
  • Write efficient, scalable SQL queries and Python-based data processing scripts for large datasets
  • Collaborate with ML Engineers to prepare and serve clean, feature-ready datasets for training and inference workflows
  • Implement data validation, quality checks, and monitoring to ensure pipeline reliability and integrity
  • Work with orchestration tools to schedule, manage, and monitor pipeline execution
  • Ensure secure handling of sensitive healthcare data, including PHI-safe practices, access control, and audit logging
  • Maintain and optimize data storage systems including data warehouses, data lakes, and hybrid architectures
  • Contribute to schema design, data modeling, and standardization of data structures across systems
  • Support data versioning, lineage tracking, and reproducibility of datasets used in ML workflows
  • Collaborate with engineering teams to integrate data pipelines with downstream applications and APIs
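The validation and quality-check responsibilities above can be illustrated with a minimal Python sketch: a row-level check that partitions an incoming batch into loadable rows and quarantined rows with reasons. The field names (`record_id`, `visit_date`) are hypothetical examples, not part of this posting.

```python
# Illustrative data-quality check of the kind a pipeline might run before
# loading. Field names are assumptions for the sketch, not a real schema.
from datetime import datetime

REQUIRED_FIELDS = ("record_id", "visit_date")


def validate_record(record: dict) -> list[str]:
    """Return a list of validation errors for one record (empty = valid)."""
    errors = []
    for field in REQUIRED_FIELDS:
        if not record.get(field):
            errors.append(f"missing required field: {field}")
    # Reject rows whose visit_date is present but not ISO-8601 parseable.
    raw = record.get("visit_date")
    if raw:
        try:
            datetime.fromisoformat(raw)
        except ValueError:
            errors.append(f"unparseable visit_date: {raw!r}")
    return errors


def split_batch(records: list[dict]):
    """Partition a batch into loadable rows and quarantined (row, errors) pairs."""
    good, bad = [], []
    for rec in records:
        errs = validate_record(rec)
        if errs:
            bad.append((rec, errs))
        else:
            good.append(rec)
    return good, bad
```

In production this kind of check would typically run as a task inside an orchestrator (e.g., an Airflow DAG), with quarantined rows written to a dead-letter table and surfaced through pipeline monitoring rather than silently dropped.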

 

Qualifications:

  • Bachelor’s degree in Computer Science, Data Engineering, or a related discipline
  • 2–3 years of experience in Data Engineering, with a strong preference for 1+ year of rigorous hands-on experience, including freelance or independent project work that demonstrates real-world problem solving
  • Demonstrated ability to independently write clean, efficient, production-grade code, with a strong emphasis on first-principles problem solving rather than reliance on code-generation tools or low-code/no-code platforms
  • Strong proficiency in SQL and Python for data processing and transformation
  • Hands-on experience building ETL/ELT pipelines using tools such as Airflow, dbt, or Azure Data Factory
  • Solid understanding of data modeling concepts, including normalization, denormalization, and schema design
  • Experience working with relational and non-relational databases
  • Familiarity with data warehouses and data lake architectures
  • Understanding of data quality, validation techniques, and pipeline monitoring
  • Experience working with large-scale datasets and performance optimization
  • Familiarity with healthcare data or regulated environments is a plus
  • Experience with Microsoft Azure (e.g., Azure Data Factory, Synapse, Blob Storage) and serverless data workflows is a plus
  • Strong problem-solving skills with the ability to work on ambiguous data challenges
  • Strong cross-functional communication and documentation skills
  • Demonstrated ability to write efficient and reliable data pipelines from first principles, without over-reliance on low-code tools or automated pipeline generators