Data Team at Summer Accelerator 2021
After a crazy 2020, an equally crazy 2021 followed. But the CROZ Summer Accelerator once again proved it’s the only constant in these uncertain times, just as constant as the (very) positive experiences of the students who participate in the Accelerator every year. Our Data internship is no exception, and this is our story.
Before the real story began, there was an intro – an interview for the internship. Why is the interview listed as a casually mentioned intro and not some huge plot point? Because that’s exactly what it was – a laid-back, relaxed conversation where we had the chance to show our knowledge and (more importantly) our interests and passions, to meet our future mentors, and to find out what we would do during the internship.
After that, we got our place at the internship, and that is where our story actually began. As we all know, every good story answers six basic questions: what, how, why, where, when, and who – so does ours.
The goal was to explore three cloud technologies: Amazon’s AWS, IBM’s Cloud Pak, and Microsoft’s Azure. In the beginning, we were cleaning the data and defining use cases together, but later we split into three teams. Each team explored one of the technologies and at the end of the internship, we compared our impressions and results to discover which one of these three technologies is the best.
As we all know, the best way to learn about any new technology is to use it to create something. We had to create a project consisting of three parts: a data warehouse, data visualization, and machine learning, trying out the capabilities of all three cloud technologies. Each of these clouds has built-in tools for each of these parts, so we didn’t have to reinvent the wheel; we only had to learn how to use these tools, and from there, it was (mostly) smooth sailing.
The purpose of the project was to analyze and explore the neighborhoods and real estate in Zagreb, build a data warehouse, and visualize everything we discovered. We used machine learning to predict real estate prices based on the size of the property and its neighborhood. We paid attention to the following parameters: the price of the property, the number of amenities in the neighborhood (recreational, entertainment, cultural, etc.), the proximity of public transportation, the demographic composition, and so on. It was fun choosing which parameters to use and seeing what we could find out from the available data. Our data sources included OpenStreetMap, data.zagreb.hr, Airbnb, and the Croatian Bureau of Statistics.
The internship started with a few training courses so that we would all have the same basic knowledge from the beginning. We learned about agile development and Git, but also about advanced SQL and data warehouses (which are our bread and butter). Some training courses were real lectures, while others were more like informal gatherings where our mentors would give us examples of good practices and share experiences from their areas of expertise.
Cloud technologies were relatively new to us, which meant a lot of digging through the official documentation, but also a lot of helping each other (this is why working in teams was definitely an advantage). Moreover, we could always ask our mentors if we needed any help. The mentor-student ratio was almost 1:1, which meant we could get an answer to basically any question we had.
CROZ is always working on improvement when it comes to employees and technologies, and this year the emphasis of the internship was exactly on that. We wanted to explore the technologies of the future and see what it is like to work with them (because, in the future, everything is going to move to the cloud for sure). As interns, we had the opportunity to learn about these new technologies firsthand and use them in a real project that we (both us and our mentors) were incredibly proud of.
This year, the Data internship was held in two cities for the first time – Zagreb and Rijeka. That proves that it doesn’t matter where you work, you will have colleagues and mentors everywhere, and you will always have somebody to hang out with or ask for help. And who knows, maybe next year we will have an internship in even more cities (or even abroad because CROZ has offices in five countries).
The internship lasted almost three months – from mid-July till the end of September. Although that seems like a long period, time really flew by, and towards the end, we asked ourselves: “Is it possible the internship is already over?” Looks like time really does fly when you’re having fun.
Finally, we should meet the protagonists – the heart of every story. This year the Data internship was bigger than ever, with eight data interns (which proves that the industry needs data engineers, so if you want to pursue this field, you definitely can’t go wrong). We came from a bunch of different colleges and cities, but what connected us was a love for data and a curiosity for discovering something new. Now, it’s time to meet our protagonists…
Let’s start with Dino. Why are we starting with him? Because he was the one who started all of our online meetings (our very own icebreaker) so nobody else had to. When he isn’t starting meetings, he is preparing for work with a (large) coffee. After Dino, we introduce you to Lara, who specialized in data visualization. We discovered she has a talent for designing T-shirts, so with her help, we had the best T-shirts for the last day of our internship – The Show Off. The person who made sure that Lara would have complete and accurate data was Marin. He diligently worked on the ETL process. Marin also introduced the Zagreb team to the drink that the Rijeka team jealously guarded for most of the summer – Pašareta. Next up – Andro. With him, atomic watches are becoming a thing of the past… If you need to set your watch, just wait for him to come to work. When he arrives, you can be sure it’s exactly 9 AM. During his internship, Andro discovered a hidden talent for data visualization, which he continued developing after the internship. After Andro, meet Franko – during the internship, he was a doctor for polygons, but currently, he is a doctor for big data. He is also a fierce enemy of “broken windows” (bad programming practices). On our team, we also had Gabrijela, our on-call statistician. In her free time, she loves mountain climbing, and, compared to that, searching for bugs seems like a vacation. Next, we also had Mister Dorian, our machine learning expert. Since the initial data wasn’t enough for us (because real data engineers always want more data), he made sure we got more using some “Python magic”. Finally, there’s me, Robert, someone who loves writing (a lot) – which includes documentation and use cases, but also this review… And someone who not only learned so much but also had a lot of fun during his internship.
This was our CROZ story. The only thing left to write is the ending – the happy ending. We all continued to work at CROZ after the internship. I think that says enough about how great the Summer Accelerator was, and that we will write many more stories here in the future.
Summer Accelerator – the most wanted student internship at the Faculty of Electrical Engineering and Computing in 2020.