Experience Required: 5+ years
Qualification: Bachelor’s or Master’s degree in Computer Science, Information Technology, Data Engineering, Software Engineering, or a related field.
📝 Job Description
We are looking for an experienced Data Engineer with 5+ years of hands-on expertise in designing, building, and optimizing data pipelines and analytics solutions. The ideal candidate will be responsible for constructing robust data architectures, building scalable ETL/ELT workflows, integrating complex datasets, and ensuring high data quality across systems.
You will collaborate with cross-functional teams, including data scientists, analysts, and platform engineers, to support data-driven decision-making and advanced analytics initiatives.
🔎 Key Responsibilities
Data Pipeline Development
- Design, develop, and maintain scalable ETL/ELT pipelines that acquire, transform, and load structured and unstructured data.
- Implement batch and real-time data processing workflows using modern big data tools.
- Optimize existing pipelines for performance, scalability, and cost-efficiency.
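For illustration, a minimal batch ETL step of the kind described above might look like the following Python sketch (the file path, table, and column names are hypothetical; SQLite stands in for a real warehouse):

```python
import csv
import sqlite3

def extract(path: str) -> list[dict]:
    # Read raw records from a source CSV file (hypothetical path/schema).
    with open(path, newline="") as f:
        return list(csv.DictReader(f))

def transform(rows: list[dict]) -> list[tuple]:
    # Normalize types and drop rows that fail a basic quality rule.
    cleaned = []
    for row in rows:
        if not row.get("order_id"):
            continue  # skip records missing a primary key
        cleaned.append((row["order_id"], row["customer_id"], float(row["amount"])))
    return cleaned

def load(rows: list[tuple], db_path: str = "warehouse.db") -> None:
    # Load the cleaned batch into a target table.
    with sqlite3.connect(db_path) as conn:
        conn.execute(
            "CREATE TABLE IF NOT EXISTS orders (order_id TEXT, customer_id TEXT, amount REAL)"
        )
        conn.executemany("INSERT INTO orders VALUES (?, ?, ?)", rows)

if __name__ == "__main__":
    load(transform(extract("orders.csv")))
```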
Data Architecture & Modeling
- Build and maintain data warehouses, data lakes, and lakehouse architectures.
- Develop logical and physical data models that support reporting, analytics, and ML workloads.
- Ensure proper metadata management and lineage documentation.
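As a rough sketch of the dimensional modeling described above, the following creates a hypothetical star schema with one fact table and two dimensions (all names are assumptions; SQLite again stands in for a warehouse):

```python
import sqlite3

# A minimal star schema: one fact table keyed to two dimension tables.
schema = """
CREATE TABLE IF NOT EXISTS dim_customer (
    customer_key INTEGER PRIMARY KEY,
    customer_name TEXT,
    region TEXT
);
CREATE TABLE IF NOT EXISTS dim_date (
    date_key INTEGER PRIMARY KEY,
    full_date TEXT,
    month TEXT,
    year INTEGER
);
CREATE TABLE IF NOT EXISTS fact_sales (
    sale_id INTEGER PRIMARY KEY,
    customer_key INTEGER REFERENCES dim_customer(customer_key),
    date_key INTEGER REFERENCES dim_date(date_key),
    quantity INTEGER,
    amount REAL
);
"""

with sqlite3.connect("warehouse.db") as conn:
    conn.executescript(schema)  # create the whole model in one pass
```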
Cloud & Platform Engineering
- Work with cloud platforms such as AWS, Azure, or Google Cloud to architect and deploy data solutions.
- Implement storage solutions (S3, ADLS, GCS), compute (Databricks, EMR, Dataflow), and orchestration (Airflow, Data Factory).
- Monitor and optimize cloud resource usage and system performance.
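For example, landing a pipeline output in S3 storage with boto3 might look like the sketch below (the bucket and key are hypothetical, and AWS credentials are assumed to be configured in the environment; ADLS and GCS have equivalent SDKs):

```python
import boto3

# Upload a locally produced extract to an S3 data-lake bucket.
s3 = boto3.client("s3")
s3.upload_file(
    Filename="orders.parquet",           # local file produced by a pipeline run
    Bucket="example-data-lake",          # hypothetical bucket
    Key="raw/orders/2024/orders.parquet" # partition-style key layout
)
```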
Data Quality & Governance
- Implement data validation, reconciliation, and quality checks.
- Ensure compliance with data governance, security, and privacy standards.
- Develop frameworks for monitoring pipeline reliability and data accuracy.
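A minimal example of the kind of reconciliation check described above, written against the hypothetical warehouse table from earlier:

```python
import sqlite3

def reconcile_row_counts(db_path: str, table: str, source_count: int) -> None:
    # Compare the loaded row count against the count reported by the source
    # system; raise so the orchestrator marks the run failed and alerts.
    with sqlite3.connect(db_path) as conn:
        (loaded,) = conn.execute(f"SELECT COUNT(*) FROM {table}").fetchone()
    if loaded != source_count:
        raise ValueError(
            f"Reconciliation failed for {table}: source={source_count}, loaded={loaded}"
        )

# Example usage (hypothetical values):
# reconcile_row_counts("warehouse.db", "orders", source_count=1_000)
```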
Integration & Collaboration
- Integrate data from APIs, databases, applications, and external sources.
- Work closely with data science and analytics teams to ensure data availability and usability.
- Support business stakeholders by delivering curated datasets and insights-ready data layers.
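Illustrating the API integration work above, a hedged sketch that pages through a hypothetical REST endpoint and collects records for staging:

```python
import requests

def fetch_records(base_url: str, page_size: int = 100) -> list[dict]:
    # Page through a hypothetical REST API until it returns an empty batch.
    records, page = [], 1
    while True:
        resp = requests.get(
            base_url, params={"page": page, "per_page": page_size}, timeout=30
        )
        resp.raise_for_status()  # surface HTTP errors to the pipeline
        batch = resp.json()
        if not batch:
            break
        records.extend(batch)
        page += 1
    return records

# Example (hypothetical endpoint):
# rows = fetch_records("https://api.example.com/v1/orders")
```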
📚 Required Skills & Technical Competencies
- Strong proficiency in Python and SQL, including writing data transformation scripts.
- Experience with big data technologies like Spark, Hadoop, Kafka, Snowflake, or Databricks.
- Expertise with cloud-native data services:
  - AWS: Glue, Redshift, EMR, Lambda
  - Azure: Data Factory, Synapse, Databricks, ADLS
  - GCP: BigQuery, Dataflow, Pub/Sub
- Solid understanding of databases:
  - Relational (PostgreSQL, MySQL, SQL Server)
  - NoSQL (MongoDB, Cassandra, DynamoDB)
- Experience with data orchestration tools (Airflow, Prefect, Dagster).
- Strong knowledge of data modeling, schema design, and normalization techniques.
- Proficiency with version control (Git) and CI/CD workflows.
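To give a flavor of the orchestration tools listed above, a minimal Airflow DAG wiring an extract step to a load step might look like this sketch (the DAG name, schedule, and task bodies are assumptions; the `schedule` argument applies to Airflow 2.4+, with older versions using `schedule_interval`):

```python
from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator

def extract():
    print("pull from source")  # placeholder for real extract logic

def load():
    print("write to warehouse")  # placeholder for real load logic

with DAG(
    dag_id="example_daily_etl",     # hypothetical DAG name
    start_date=datetime(2024, 1, 1),
    schedule="@daily",              # run once per day
    catchup=False,
) as dag:
    extract_task = PythonOperator(task_id="extract", python_callable=extract)
    load_task = PythonOperator(task_id="load", python_callable=load)
    extract_task >> load_task       # load runs only after extract succeeds
```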
🎯 Preferred Skills
- Experience with containerization (Docker, Kubernetes).
- Familiarity with machine learning pipelines and feature stores.
- Experience working in Agile or DevOps environments.
- Knowledge of data security best practices and compliance frameworks.
- Exposure to real-time streaming platforms (Kafka, Kinesis, Pulsar).
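As an illustration of the streaming exposure mentioned above, a minimal consumer sketch using the kafka-python client (the topic and broker address are hypothetical):

```python
import json

from kafka import KafkaConsumer

# Consume JSON events from a hypothetical topic and apply a trivial filter.
consumer = KafkaConsumer(
    "orders-events",                     # hypothetical topic
    bootstrap_servers="localhost:9092",  # hypothetical broker
    value_deserializer=lambda b: json.loads(b.decode("utf-8")),
    auto_offset_reset="earliest",
)

for message in consumer:
    event = message.value
    if event.get("amount", 0) > 0:       # trivial check before downstream load
        print(event["order_id"], event["amount"])
```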
Share your resume at career@cloudbridgeusa.com