Our data engineering team extends and maintains the various data pipelines at the heart of our business.
We are a data-driven company that collects and processes more than 500 GB of raw data daily. We leverage big data technologies such as Spark on AWS EMR to crunch these volumes of data and make them queryable.
During this internship you will assist the data engineering team with their responsibilities.
Our data engineers are responsible for the following topics:
- Data lake expansion & maintenance (Spark, Scala, AWS services such as EMR, Lambda, Elasticsearch, Athena, …, and Apache Airflow).
- Infrastructure setup for data processing pipelines (EMR, Spark, Airflow, automation, Docker, …).
- Designing and building big data processing architectures that support further use cases (such as AI and machine learning) on top of our data (AWS solution/architecture design, Python, cost-awareness).
- Data mart/warehouse design to make our data more accessible for BI tools, marketing, and analytics (SQL database/data warehouse design, Python, data modeling).
About the stack
Daltix uses big data technologies such as Spark, Airflow, Amazon Athena (Presto), Elasticsearch & Snowflake to cope with the large volumes of data it has to process and make accessible to analytics and the data science team every day. This is not an easy task, as the volume of data keeps growing.
Making huge data sets easily analyzable and available for different use cases is one of our main challenges, as is building tools to monitor and guard the quality of the data.
To qualify for this internship, you should meet the following requirements:
- Three years of university study in computer engineering (or a relevant computer engineering degree).
- Good experience with and exposure to Python programming.
- Understanding of databases & SQL.
- Highly proficient in spoken and written English.
Ideally, you also have…
- Solid programming experience in Python.
- Experience building on top of Amazon Web Services.
- Some experience with big data technologies (such as Elasticsearch, Spark, Hadoop, Airflow, Cassandra).
What we can offer:
- Health Insurance
- Direct impact in the company
- Meal allowance
- Flexible working hours
- Free fruit, tea, coffee and snacks
- Occasional drinks and other team events