Job Description
Senior Data Engineer
Get to know the Role
As a Senior Data Engineer on the Lending Data Engineering team, you will work closely with data modelers, product analysts, product managers, software engineers and business stakeholders across Southeast Asia (SEA) to understand business and data requirements. You will be responsible for building and managing data assets, including acquisition, storage, processing and consumption channels, using some of the most scalable and resilient open-source big data technologies, such as Flink, Airflow, Spark, Kafka and Trino, on cloud infrastructure. You are encouraged to think outside the box and have fun exploring the latest patterns and designs.
The Day-to-Day Activities
- Develop and maintain scalable, reliable ETL pipelines and processes to ingest data from a large number and variety of data sources (a minimal orchestration sketch follows this list)
- Develop a deep understanding of real-time data production and availability to inform real-time metric definitions
- Build solutions leveraging AWS services such as Glue, Redshift, Athena, Lambda, S3, Step Functions, EMR, and Kinesis to enable efficient data processing and analytics.
- Implement and monitor data quality checks, and establish best practices for data governance, quality assurance, data cleansing, and ETL-related activities, using AWS Glue DataBrew or similar tools.
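To make the first responsibility concrete, here is a minimal sketch of the kind of pipeline orchestration involved, assuming Airflow 2.4+ and boto3 are available. The DAG id and the Glue job name (`lending_raw_to_curated`) are hypothetical placeholders, not an actual team pipeline:

```python
# Illustrative Airflow DAG: trigger a daily Glue transform job.
# All names (dag_id, JobName) are placeholders for this sketch.
from datetime import datetime

import boto3
from airflow import DAG
from airflow.operators.python import PythonOperator


def trigger_glue_job(**_):
    """Start a (hypothetical) Glue job via boto3 and log its run id."""
    glue = boto3.client("glue")
    run = glue.start_job_run(JobName="lending_raw_to_curated")
    print(f"Started Glue job run: {run['JobRunId']}")


with DAG(
    dag_id="lending_daily_ingest",
    start_date=datetime(2024, 1, 1),
    schedule="@daily",
    catchup=False,
) as dag:
    transform = PythonOperator(
        task_id="run_glue_transform",
        python_callable=trigger_glue_job,
    )
```

In practice the job run would also be polled for completion (for example with Glue's `get_job_run`), or the hand-rolled callable replaced by the Amazon provider package's Glue operator; this sketch only shows the shape of the work.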
The Must-Haves
- At least 5 years of relevant experience developing scalable, secure, distributed, fault-tolerant, resilient and mission-critical data pipelines.
- Proficiency in at least one of the following programming languages: Python or Scala.
- Strong understanding of big data technologies such as Flink, Spark, Trino, Airflow and Kafka, and familiarity with AWS services such as EMR, Glue, Redshift, Kinesis and Athena.
- Experience with SQL, schema design and data modeling (experience in all three is required; see the sketch after this list).
- Experience with different database types: NoSQL, columnar and relational.
- Ability to quickly become familiar with in-house and AWS-native data platform tools in order to set up data pipelines efficiently.
- You are organized and insightful, and can communicate your observations clearly, both in writing and verbally, to stakeholders in order to share updates and coordinate the development of data pipelines.
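As a toy illustration of the SQL, schema design and data modeling skills listed above, the PySpark snippet below reads a hypothetical loans dataset with an explicit schema and aggregates it with SQL. All table, column and path names are made up for this sketch:

```python
# Illustrative only: explicit schema definition plus a SQL query in PySpark.
from pyspark.sql import SparkSession
from pyspark.sql.types import (DateType, LongType, StringType, StructField,
                               StructType)

spark = SparkSession.builder.appName("schema_example").getOrCreate()

# An explicit schema catches type drift at read time rather than query time.
loan_schema = StructType([
    StructField("loan_id", StringType(), nullable=False),
    StructField("principal_cents", LongType(), nullable=False),
    StructField("disbursed_on", DateType(), nullable=True),
])

loans = spark.read.schema(loan_schema).parquet("s3://example-bucket/raw/loans/")
loans.createOrReplaceTempView("loans")

# Simple aggregation in SQL over the modeled table.
spark.sql("""
    SELECT disbursed_on, COUNT(*) AS n_loans, SUM(principal_cents) AS total_cents
    FROM loans
    GROUP BY disbursed_on
    ORDER BY disbursed_on
""").show()
```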
The Nice-to-Haves
- You have a degree or higher in Computer Science, Electronics or Electrical Engineering, Software Engineering, Information Technology or other related technical disciplines.
- You have a good understanding of data structures, algorithms, or machine learning models.