Experience: 2+ years
Minimum Education Requirement: This is a professional position, and as such, we require, at minimum, a Bachelor’s degree (or equivalent) in computer science, computer information systems, information technology, mechanical engineering, industrial engineering, or any science field, or a combination of education equating to the U.S. equivalent of a Bachelor’s degree in one of the aforementioned subjects.
Salary Range: $105,747.00 to $106,000.00 per year
Job Description:
- Design, develop, and maintain scalable ETL/ELT data pipelines processing large-scale structured and unstructured datasets using Python, Apache Spark, Apache Airflow, BigQuery, and cloud-based data platforms to support advanced analytics, machine learning, and business intelligence initiatives.
- Develop and implement data transformation frameworks and enterprise data models using dbt, enabling standardized, high-quality datasets for data science, artificial intelligence, and reporting applications.
- Architect and optimize cloud-based data solutions, including migration of large-scale data assets from Hadoop environments to Google Cloud Platform (GCP), leveraging BigQuery, Looker, and LookML for enterprise analytics and decision support.
- Analyze and optimize data warehouse performance through partitioning, clustering, schema design, workload tuning, monitoring, and automated alerting to ensure data reliability, scalability, and operational efficiency.
- Develop enterprise data quality frameworks utilizing statistical analysis, anomaly detection, and automated monitoring techniques to ensure data accuracy, consistency, and integrity across streaming and batch data environments.
- Design, build, and deploy machine learning solutions, including feature engineering pipelines, predictive models, and automated MLOps workflows for inventory optimization, audit prioritization, forecasting, and business process improvement.
- Implement machine learning model lifecycle management using Vertex AI and related cloud technologies, including model deployment, retraining automation, performance monitoring, drift detection, and production support.
- Conduct statistical analyses, experimental design, hypothesis testing, sample size determination, A/B testing, and model validation to evaluate and improve predictive performance of machine learning solutions.
- Develop and deploy Generative AI and Retrieval-Augmented Generation (RAG) solutions using large language models (LLMs), LangChain, LangGraph, vector databases, embeddings, and semantic search technologies to support enterprise knowledge management and data discovery.
- Design AI-driven automation solutions, including intelligent agents, metadata enrichment systems, prompt engineering frameworks, and retrieval optimization strategies to improve data accessibility and decision-making capabilities.
- Establish and maintain enterprise semantic layer standards, data governance controls, certified business metrics, and analytical models using LookML and Looker to support consistent reporting and analytics across business functions.
- Collaborate with Data Engineering, Data Governance, Product Management, and Business stakeholders to implement data security controls, row-level security, data lineage tracking, PII governance, and scalable analytics solutions supporting organizational objectives.
Send your resume to career@cloudbridgeusa.com


