Job Overview
- Date Posted: November 2, 2024
- Location: Hyderabad
- Expiration date: --
Job Description
We are looking for a skilled and experienced GCP Data Lead Engineer to join our Data team in Hyderabad.
About the job
Position: GCP Data Engineering – Lead Programmer Analyst
Total Experience: 5 to 7 years.
Notice Period: Immediate to 15 days
Must have skills: GCP Cloud, GCS, BigQuery, Python, PySpark, DWH Architecture, ETL, SQL
Job Description: GCP Data Engineering – Lead Programmer Analyst
We are seeking a highly skilled and knowledgeable Data Engineer to join our Data Management team on a transformative Move to Cloud (M2C) project. The ideal candidate will have a strong background in building robust data ingestion pipelines and a deep understanding of ETL processes, particularly within the Google Cloud Platform ecosystem, using tools such as dbt over BigQuery.
Responsibilities:
- Develop and maintain scalable and reliable data pipelines using PySpark and SQL to support the migration from an on-premises Oracle data warehouse (structured data) and unstructured data sources to BigQuery.
- Design and implement robust data ingestion and integration processes that ensure data quality and consistency.
- Use the dbt tool to create and manage ETL processes that transform and load data efficiently from/into BigQuery.
- Ensure data transformation jobs are resilient and efficient, and can handle large volumes of data within a cloud-based architecture.
- Work closely with our Data Engineers to gather requirements for the data pipelines currently under development.
- Provide expertise in GCP services like DataProc, DataFlow, Cloud Functions, Workflows, Cloud Composer, and BigQuery, advocating for best practices in cloud-based data management.
- Collaborate with data architects and other stakeholders to optimize data models and warehouse design for the cloud environment.
- Develop and implement monitoring, quality, and validation processes to ensure the integrity of data pipelines and data.
- Document all data engineering processes and create clear specifications for future reference and compliance.
Requirements:
- Bachelor's or Master's degree in Computer Science, Engineering, or a related field.
- Minimum of 5 years of experience as a Data Engineer with a focus on cloud-based solutions.
- Proficient in GCP services, with a strong emphasis on data-related products such as DataProc, DataFlow, Cloud Functions, Workflows, Cloud Composer, and BigQuery.
- Extensive experience with ETL tools, particularly dbt, and a clear understanding of ETL best practices.
- Experience in building and optimizing data pipelines, architectures, and data sets from structured/unstructured data sources.
- Strong analytical skills with the ability to understand complex requirements and translate them into technical solutions.
- Excellent problem-solving abilities and a commitment to quality.
- Strong communication skills, with the ability to work collaboratively in a team environment.
- Relevant certifications in Google Cloud Platform or other data engineering credentials are desirable.
- Proficiency in SQL and Python with knowledge of Spark.
- Fluent in English, with strong written and verbal communication skills.