Bojan Stavrikj
Senior Data Engineer

Projects

ECB Climate Stress Test - Data Collection, Processing and Preparation - BNP Paribas

I was part of a project addressing a new ECB requirement to stress test the portfolios of systemically important European banks. I was involved in the data collection, data integration and data transformation processes, ultimately building an ETL pipeline to deliver databases containing the information required by the ECB to stress the bank’s portfolio under different climate change scenarios and projection years.

Global Network Analysis - Banco de Portugal & Nova IMS

I completed my thesis project in 2022, analyzing a global network of investment relationships. The project used two datasets: the Coordinated Direct Investment Survey (CDIS) and the Coordinated Portfolio Investment Survey (CPIS). Network analysis was used to understand the position of different countries in intermediating investments in a global network. These findings can be used to identify patterns and preferential paths for investment, establish trends and describe the relations between countries over time. The results are visualized in an interactive web application developed mainly with d3.js. The visualizations include a complex force-directed node-link graph, as well as simpler bar charts, line charts and tabular representations. The web application is available at: https://fi-networks.com/.

Fantasy Premier League Web App - Personal Project

I built a transfer planner and player comparison web application for Fantasy Premier League managers. The data was obtained through the Premier League and Understat APIs. I used Python for data cleaning and aggregation, and built the app mainly with HTML, CSS and JavaScript (d3.js). The app allows users to import their team by ID and see the players they currently own. The transfer planner section makes it possible to simulate transfers in a single one-page view where all the necessary information is available. Additionally, users can visually compare the past and expected stats of players to understand which asset is the most valuable. The web app is available at: https://fplmania.com.

Web Scraping - Personal

While working on some of the projects outlined in this section, I needed to obtain data that would contribute to my models. In one such case the team needed weather data. After several failed attempts to find structured files with weather information, I realized I should scrape the data myself. This is when I started learning how to web scrape. After successfully writing the code, I decided to publish a post with the code and an explanation so that others in similar situations can get this data (available in the content section).

Demand Forecasting - Nova IMS

The company had trouble determining the right number of drivers to hire for different periods of the year. A predictive model was built using machine learning techniques to estimate the expected number of services. The main challenges in this project were data quality and aggregation. Additionally, the client's request to predict 4-week batches 8 weeks in advance made the project more challenging, although this was a necessary constraint for the model to be useful given the long hiring process. The target set by the client was a maximum mean absolute error of 10%, and we achieved a 5% error on average. The model could be improved further with more, and better-quality, data.

Booking Cancellation Prediction - Nova IMS

The data used is real and was obtained from a hotel based in Lisbon, Portugal. The issue the hotel faced was a staggering 40% booking cancellation rate. This is largely driven by Online Travel Agencies (OTAs) such as Booking.com and Airbnb, which often give clients the flexibility of free cancellation. To tackle this problem, a predictive model was built which ultimately achieved 81% accuracy. Deploying this model would allow management to better allocate resources and decrease the total number of vacant rooms at any given time. This could be achieved by allowing for over/under booking flexibility when expected cancellation rates are high/low.

Customer Segmentation

The data used is real and was obtained from a hotel based in Lisbon, Portugal. At the time, the hotel had a customer segmentation in place that was based solely on the booking platform the customer had last used. Our team obtained data on a large number of the hotel's clients and performed an in-depth analysis of the types of customers who book stays there. The end result was 4 segments for this specific static dataset. Lastly, our team gave a detailed description of each cluster, along with suggestions on deployment, maintenance and marketing strategies for each customer profile.

Classic Models Data Warehouse

The data used is synthetic and describes a company selling scale models, with 110 different models for sale grouped into 7 major product lines: classic cars, vintage cars, motorcycles, trucks, planes, ships and trains. Sales order records are available for the period from January 2003 to May 2005. The goal of the project was to develop a data warehouse with a star schema, including a fact table and several dimensions. This was done in MySQL and later connected to Pentaho in order to schedule jobs for running the ETL processes. Lastly, a dashboard was created in PowerBI to give the sales department of Classic Models a clear overview.

Developing New Control Tool

I developed a new controlling tool for the team. It spanned more than 10 interconnected sheets, controlling and reconciling data from 3 different systems at several levels of granularity. The challenge was to have all the data in one place while keeping the tool fast, without having to go through many formulas to obtain the final result. Data was entered across several sheets from the 3 sources, and manual ledger adjustments were added to the file depending on where the system was breaking and needed to be adjusted. All of this was then summarized as a simulation of what the final numbers would be once all necessary adjustments were uploaded to the system. The summary was shown on a day-to-date, month-to-date and year-to-date basis, including the difference from the initial risk system. Once this was done, a final system number was entered for a final check. After the trading desk approved and agreed with the final reported numbers, a macro distributed them to a list of managers within the company.

Work Experience

BNP Paribas - Data Engineer

05.2021-Present

I am part of the Stress Testing Financial Synthesis (STFS) team at BNP Paribas, in the Stress Testing Data Analytics (STDA) sub-team, where we process data from different data streams in the bank. I work in a big data environment, leveraging Hadoop and PySpark to process large volumes of data and prepare it for the modelling team, who stress the portfolio based on different types of shocks. The main project I have been part of addresses a new ECB requirement to stress the portfolios of systemically important European banks. I was involved in the new data collection, data integration and data transformation processes, ultimately building an ETL pipeline to deliver databases containing the information required by the ECB to stress the bank’s portfolio under different climate change scenarios and projection years.

Citibank - Junior Product Control Analyst

03.2018-08.2019

Within Citibank, Product Control (PC) is the largest department in Finance, with responsibility for controlling daily profit and loss reporting, price verification and new trading activity for the ICG in EMEA. The department is organised into business-aligned teams and the product scope is comprehensive, comprising cash, derivative, and structured and exotic variants of the following asset classes: credit, FX, equity, money markets, commodities and rates. I worked closely across functions on a daily basis (including the Trading desks, Risk Management, Operations, and other areas of Finance) and developed a good understanding of the products traded, along with the associated market risks and accounting complexities.

Magyar Telekom - Financial Controlling Intern

11.2017-12.2017

I took this internship as part of my bachelor's degree, with the main goal of studying the importance of financial controlling in keeping Magyar Telekom competitive and at the forefront of the market. While I was there, the company was developing new strategic packages to be released in response to competitors' moves.

Skills

Exploratory Data Analysis

Market Analysis

Business Dashboards

Predictive Modeling

Customer Base Analytics

Data Engineering

Tools

Python

JavaScript

SQL

PowerBI

Web Scraping

Machine Learning

HTML5 & CSS

PySpark

Education

NOVA Information Management School (NOVA IMS)

Master's in Data Science and Advanced Analytics

09.2019 - 03.2021

Major in Business Analytics

Thesis: Global cross-country investment relationships – using network analysis and interactive visualization techniques

Average: 17/20

Erasmus University Rotterdam

International Bachelor of Business and Business Economics

09.2014 - 03.2018

Major in International Economics

Thesis: Empirical analysis on “The Relationship Between Oil Price Fluctuations and Exchange Rates of Net Oil Exporters”

Average: 6.5/10

American International School of Budapest (AISB)

High School - International Baccalaureate (IB)

09.2011 - 05.2014

Higher Level - Economics, Biology and Theatre

Standard Level - Math, English and Spanish

About

I am a Macedonian national currently living in Lisbon, pursuing my master's in data science with a major in business analytics. After spending my elementary school years in Macedonia, I moved to Budapest with my family and lived there until graduating from high school, picking up Hungarian quite fluently along the way.

My desire to leave home and experience college life led me to Rotterdam in the Netherlands, where I completed my bachelor's in Economics at Erasmus University Rotterdam. Four years in, I decided to move back to Budapest for a position as Junior Product Control Analyst at Citibank, where I acquired knowledge and experience in finance, mostly focused on equity derivatives and corporate equity derivatives.

Fast forward to 2020, I am currently expanding my knowledge of data science at NOVA IMS in Lisbon, getting hands-on experience with supervised and unsupervised machine learning models, as well as forecasting, classification, prediction and segmentation methods, on data sourced from established companies.

I am actively seeking opportunities in the field of Data Science in Lisbon. Feel free to get in touch and we can chat more about how my skills can meet your current requirements in data science or business analysis!

Languages

Macedonian: Native

English: Fluent

Serbian: Fluent

Hungarian: Advanced

Portuguese: Basic
