Processing terabytes of data in streaming mode and generating a structured output format
@Adobe Systems Romania, on Bd. Timișoara, Bucharest, Bucharest Municipality, Romania

Responsibilities

The challenge:
Adobe Audience Manager helps our customers understand more about the users of their digital properties (websites or mobile applications). To achieve this, we need to create meaningful reports for our customers as quickly as possible. Reporting becomes a challenge when billions of events flow into our pipeline every day and we need to report on all of them.
The goal here is to create a fast processing pipeline that partitions the data and writes it in a columnar format such as Parquet, enabling quick aggregation at the reporting level.
The application created here will use AWS services:
S3 for reading / storing the data
EMR / EC2 instances with Spark to process the data
SNS / Kinesis / Kafka for notifications
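
As an illustration only, the partition-then-columnar idea described above can be sketched in plain Python. This is not Spark or Parquet code and it is not the team's implementation. The event fields (`event_date`, `customer_id`, `user_id`), the function names, and the Hive-style partition paths are all assumptions made for the example.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

# Hypothetical event shape: (event_date, customer_id, user_id).
# These fields are assumptions for illustration, not taken from the posting.
Event = Tuple[str, str, str]
Columns = Dict[str, List[str]]


def partition_key(event: Event) -> str:
    """Build a Hive-style partition path such as 'date=2024-01-01/customer=acme'."""
    event_date, customer_id, _ = event
    return f"date={event_date}/customer={customer_id}"


def to_columnar(events: Iterable[Event]) -> Dict[str, Columns]:
    """Consume events one at a time and store each partition column by column."""
    partitions: Dict[str, Columns] = defaultdict(
        lambda: {"event_date": [], "customer_id": [], "user_id": []}
    )
    for event in events:
        event_date, customer_id, user_id = event
        columns = partitions[partition_key(event)]
        columns["event_date"].append(event_date)
        columns["customer_id"].append(customer_id)
        columns["user_id"].append(user_id)
    return dict(partitions)


def unique_users(columns: Columns) -> int:
    """A report-level aggregation that reads only the 'user_id' column."""
    return len(set(columns["user_id"]))
```

Because each partition keeps its values per column, an aggregation such as `unique_users` touches a single column of a single partition. Columnar formats like Parquet rely on the same property to make reporting queries fast.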

What you’ll do:
Learn about the principles and best practices of Agile Software Development
Develop real-life applications using big data technologies (e.g. Hadoop, Spark)
Learn how to use Amazon Web Services (AWS): S3, EMR, SNS, Kinesis
Demo/Showcase your work

Qualifications

What you need to succeed:
Analytical thinking and a desire to learn new things
Knowledge of service-oriented architectures and server-side technologies (e.g. Java, OS/Unix)
Ability to work in a fast-paced, dynamic environment
Fluency in English (both written and spoken)
