Job Title: Data Engineer

Experience Required: 5+ years
Qualification: Bachelor’s or Master’s degree in Computer Science, Information Technology, Data Engineering, Software Engineering, or a related field.


📝 Job Description

We are looking for an experienced Data Engineer with 5+ years of hands-on expertise in designing, building, and optimizing data pipelines and analytics solutions. The ideal candidate will be responsible for building robust data architectures and scalable ETL/ELT workflows, integrating complex datasets, and ensuring high data quality across systems.

You will collaborate with cross-functional teams, including data scientists, analysts, and platform engineers, to support data-driven decision-making and advanced analytics initiatives.


🔎 Key Responsibilities

Data Pipeline Development

  • Design, develop, and maintain scalable ETL/ELT pipelines that acquire, transform, and load structured and unstructured data (see the sketch after this list).
  • Implement batch and real-time data processing workflows using modern big data tools.
  • Optimize existing pipelines for performance, scalability, and cost-efficiency.
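
For illustration, here is a minimal sketch of the kind of batch pipeline this role builds, written against Apache Airflow 2.x. The DAG name, sample data, and task bodies are hypothetical placeholders, not an actual CloudBridge pipeline.

    from datetime import datetime

    from airflow import DAG
    from airflow.operators.python import PythonOperator

    def extract():
        # Placeholder: pull raw records from a source API or database.
        return [{"id": 1, "amount": "42.50"}]

    def transform(ti):
        # Pull the upstream batch from XCom, cast types on well-formed rows.
        rows = ti.xcom_pull(task_ids="extract")
        return [{**row, "amount": float(row["amount"])} for row in rows]

    def load(ti):
        # Placeholder: write the cleaned batch to the warehouse.
        print(ti.xcom_pull(task_ids="transform"))

    with DAG(
        dag_id="daily_sales_etl",  # hypothetical pipeline name
        start_date=datetime(2024, 1, 1),
        schedule_interval="@daily",
        catchup=False,
    ) as dag:
        extract_task = PythonOperator(task_id="extract", python_callable=extract)
        transform_task = PythonOperator(task_id="transform", python_callable=transform)
        load_task = PythonOperator(task_id="load", python_callable=load)

        # Linear dependency chain: extract, then transform, then load.
        extract_task >> transform_task >> load_task

Production pipelines layer retries, alerting, and idempotent loads on top of a skeleton like this.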

Data Architecture & Modeling

  • Build and maintain data warehouses, data lakes, and lakehouse architectures.
  • Develop logical and physical data models that support reporting, analytics, and ML workloads.
  • Ensure proper metadata management and lineage documentation.

Cloud & Platform Engineering

  • Work with cloud platforms such as AWS, Azure, or Google Cloud to architect and deploy data solutions.
  • Implement storage solutions (S3, ADLS, GCS), compute (Databricks, EMR, Dataflow), and orchestration (Airflow, Data Factory).
  • Monitor and optimize cloud resource usage and system performance.

Data Quality & Governance

  • Implement data validation, reconciliation, and quality checks (see the example after this list).
  • Ensure compliance with data governance, security, and privacy standards.
  • Develop frameworks for monitoring pipeline reliability and data accuracy.
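
As a concrete example of the checks mentioned above, here is a minimal, framework-free sketch; the required fields and the 1% null-rate threshold are hypothetical choices.

    def validate_batch(rows, required_fields=("id", "amount"), max_null_rate=0.01):
        """Reject a batch whose required fields are missing too often."""
        if not rows:
            raise ValueError("empty batch: upstream extract may have failed")
        for field in required_fields:
            nulls = sum(1 for row in rows if row.get(field) is None)
            rate = nulls / len(rows)
            if rate > max_null_rate:
                raise ValueError(f"{field}: null rate {rate:.2%} exceeds {max_null_rate:.2%}")
        return rows

    # Usage: a fully populated batch passes through unchanged.
    clean = validate_batch([{"id": 1, "amount": 9.99}, {"id": 2, "amount": 4.50}])

In practice the same idea is often expressed through tooling such as Great Expectations or dbt tests rather than hand-rolled functions.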

Integration & Collaboration

  • Integrate data from APIs, databases, applications, and external sources (a sketch follows this list).
  • Work closely with data science and analytics teams to ensure data availability and usability.
  • Support business stakeholders by delivering curated datasets and insights-ready data layers.
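
One common integration pattern is walking a paginated JSON API. The sketch below assumes the requests library; the endpoint and query parameters are invented for illustration.

    import requests

    def fetch_all(url="https://api.example.com/v1/records", page_size=100):
        """Yield every record from a hypothetical paginated JSON API."""
        page = 1
        while True:
            resp = requests.get(
                url, params={"page": page, "per_page": page_size}, timeout=30
            )
            resp.raise_for_status()  # fail loudly on HTTP errors
            batch = resp.json()
            if not batch:  # an empty page signals the end of the data
                return
            yield from batch
            page += 1

    # Usage: rows = list(fetch_all())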

📚 Required Skills & Technical Competencies

  • Strong proficiency in Python and SQL, including writing data transformation scripts.
  • Experience with big data technologies such as Spark, Hadoop, Kafka, Snowflake, or Databricks (a brief PySpark sketch follows this list).
  • Expertise with cloud-native data services:
    • AWS: Glue, Redshift, EMR, Lambda
    • Azure: Data Factory, Synapse, Databricks, ADLS
    • GCP: BigQuery, Dataflow, Pub/Sub
  • Solid understanding of databases:
    • Relational (PostgreSQL, MySQL, SQL Server)
    • NoSQL (MongoDB, Cassandra, DynamoDB)
  • Experience with data orchestration tools (Airflow, Prefect, Dagster).
  • Strong knowledge of data modeling, schema design, and normalization techniques.
  • Proficiency with version control (Git) and CI/CD workflows.
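
To make the Python-plus-SQL combination concrete, here is a minimal PySpark sketch; the table and column names are illustrative only.

    from pyspark.sql import SparkSession

    spark = SparkSession.builder.appName("revenue_example").getOrCreate()

    # Hypothetical order data registered as a temporary SQL view.
    orders = spark.createDataFrame(
        [(1, "2024-01-01", 42.50), (2, "2024-01-01", 10.00), (3, "2024-01-02", 7.25)],
        ["order_id", "order_date", "amount"],
    )
    orders.createOrReplaceTempView("orders")

    # Express the transformation in SQL; Spark executes it on the cluster.
    daily = spark.sql(
        "SELECT order_date, SUM(amount) AS revenue FROM orders GROUP BY order_date"
    )
    daily.show()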

🎯 Preferred Skills

  • Experience with containerization (Docker, Kubernetes).
  • Familiarity with machine learning pipelines and feature stores.
  • Experience working in Agile or DevOps environments.
  • Knowledge of data security best practices and compliance frameworks.
  • Exposure to real-time streaming platforms (Kafka, Kinesis, Pulsar).

Send your resume to career@cloudbridgeusa.com