STATWORX on TOUR

A data geek, an AI guy, and a fintech dude go into a bar…

Lukas Strömsdörfer Blog, Data Science, Statistics

… some water under the bridge later, we are having a Co-Meetup in Frankfurt – kudos to the organizers. Those guys are just awesome. For the past few years they have been making an effort to build a Data Science community in Frankfurt – you should check out their Twitter feed. Whenever there is a Meetup – which you should totally attend, by the way – we typically sponsor some beers and snacks to support the effort those guys are making.

On the 24th of January the guys planned something special, though: a Meetup for data scientists, AI developers, and FinTech guys. What can I say, it was lit. The organizers really outdid themselves. We all met at the Frankfurt School of Finance campus. Besides the massive conference room for the talks, we and other companies had little pop-up displays to show interested people what we do all day long.

Not only the location but also the speakers were remarkable. Apart from us (I will get to this in a second), Jonathan Masci, one of the founders of NNAISENSE, talked about their research on deep learning (@Jonathan: I was really hyped by your talk, by the way). Yassin Hankir bombarded us with memes while giving a talk about his company savedroid, which you should really check out if you are broke all the time because you occasionally forget to put money aside.

As I said before, we also gave a talk. Led by our fierce CEO Sebastian, Fabian and I presented one of our newest creations. With only a couple of weeks of preparation, the entire team got together – sometimes with a couple of beers – and we built a really cool and handy tool. Let me tell you about it:

If you have ever had the joy of working on a Machine Learning forecasting project, you pretty much know the workflow. At first you are all excited to finally get the data so you can start hacking some models together. Well, it is usually not like that. You pretty much spend your first days cleaning the data and forcing it into a somewhat machine-interpretable shape. After some exhausting, frustrating and caffeine-intensive hours you are finally done and your dataset is recognizable as such.

The next thing you do is start thinking about features (variables) that could potentially influence your target. Our world is quite complex – statistically speaking – so there is a nearly infinite number of variables that could explain some part of the variation in your target. So you go on and select some of them. You might use some statistical selection method, come up with a logical explanation for why a feature matters, or just pick at random.

Once you have found the formula you were looking for, you will likely start testing various algorithms to see which one predicts your data best. You will probably use some elaborate train-test split with an elaborate cross-validation scheme for model tuning. In the end, you evaluate your models and select the one that fits your data best. Now you are pretty much good to go and you can start forecasting.
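To make the "train-test split with cross-validation for model tuning" step a bit more concrete, here is a minimal sketch in plain Python. Everything in it is an assumption for illustration: the toy series, the function names, and the choice of a simple moving-average forecaster whose window length is the only thing being tuned. The one point it tries to show is that, with time series, every test block has to come after the data the model was fitted on.

```python
from statistics import mean


def expanding_window_splits(n_obs, n_folds, horizon):
    """Yield (train_idx, test_idx) pairs where training always precedes testing."""
    first_cut = n_obs - n_folds * horizon
    if first_cut <= 0:
        raise ValueError("series too short for this many folds")
    for k in range(n_folds):
        cut = first_cut + k * horizon
        yield range(0, cut), range(cut, cut + horizon)


def moving_average_forecast(history, window, horizon):
    """Forecast a flat line at the mean of the last `window` observations."""
    level = mean(history[-window:])
    return [level] * horizon


def cv_error(series, window, n_folds=3, horizon=2):
    """Mean absolute error of the forecaster across all expanding-window folds."""
    errors = []
    for train_idx, test_idx in expanding_window_splits(len(series), n_folds, horizon):
        history = [series[i] for i in train_idx]
        actual = [series[i] for i in test_idx]
        predicted = moving_average_forecast(history, window, horizon)
        errors.extend(abs(a - p) for a, p in zip(actual, predicted))
    return mean(errors)


# Hypothetical toy data, purely for illustration.
series = [10, 12, 11, 13, 12, 14, 13, 15, 14, 16, 15, 17]

# "Model tuning": score each candidate window length and keep the best one.
scores = {window: cv_error(series, window) for window in (1, 2, 4)}
best_window = min(scores, key=scores.get)
print(scores, best_window)
```

In a real project the moving average would be replaced by whatever algorithms you are comparing, but the splitting logic stays the same.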

This is, of course, tedious work. Since we pretty much go through this entire workflow with every single project we work on, we came up with a handy automation for it. For now we have agreed – not unanimously though – on the name TSBOX. TSBOX is able to take a time series and then automatically produce a forecast. Here is an illustration of how it works.

[Figure: workflow of automated forecasting]

We pretty much follow the logic I described before. However, with a bunch of code we can delegate those tedious tasks to our machines. Like every Data Scientist would do, our program preprocesses the data first. The exhaustive search for potential outliers, NAs or other weird things going on is taken care of by our artificial Data Scientist, so to say. Then comes the task that is usually even more coding-intensive: feature generation. That is taken care of just as well. Selection of features – no problem. Choosing an algorithm that best fits your data – once again, our program is stealing at least my job.
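To give a rough feeling for what such an automated chain can look like, here is a toy sketch in plain Python. It is not TSBOX and makes no claim about how TSBOX works internally; the function names, the crude outlier rule, and the three very simple candidate forecasters are all assumptions made up for this illustration. It skips feature generation and selection entirely and only strings together preprocessing, holdout-based model selection, and the final forecast.

```python
from statistics import mean, median


def fill_missing(series):
    """Forward-fill None values; a leading gap takes the first observed value."""
    last = next(v for v in series if v is not None)
    filled = []
    for value in series:
        last = value if value is not None else last
        filled.append(last)
    return filled


def clip_outliers(series, k=3.0):
    """Crude rule: clip points more than k median absolute deviations from the median."""
    med = median(series)
    mad = median([abs(v - med) for v in series]) or 1.0
    low, high = med - k * mad, med + k * mad
    return [min(max(v, low), high) for v in series]


def naive(history, horizon):
    """Repeat the last observed value."""
    return [history[-1]] * horizon


def overall_mean(history, horizon):
    """Repeat the mean of the whole history."""
    return [mean(history)] * horizon


def drift(history, horizon):
    """Extend the straight line through the first and last observation."""
    slope = (history[-1] - history[0]) / (len(history) - 1)
    return [history[-1] + slope * (step + 1) for step in range(horizon)]


# Hypothetical candidate set; a real tool would use far richer models.
CANDIDATES = {"naive": naive, "mean": overall_mean, "drift": drift}


def auto_forecast(raw_series, horizon=3):
    """Clean the series, pick the candidate with the lowest holdout error, forecast."""
    series = clip_outliers(fill_missing(raw_series))
    train, holdout = series[:-horizon], series[-horizon:]

    def holdout_error(name):
        predicted = CANDIDATES[name](train, horizon)
        return mean([abs(a - p) for a, p in zip(holdout, predicted)])

    best_name = min(CANDIDATES, key=holdout_error)
    # Refit the winner on the full cleaned series before forecasting.
    return best_name, CANDIDATES[best_name](series, horizon)


# Made-up series with one missing value and one obvious outlier.
raw = [1, 2, None, 4, 5, 100, 7, 8, 9, 10]
model_name, forecast = auto_forecast(raw)
print(model_name, forecast)
```

The design choice worth noting is that the winning model is chosen on a holdout at the end of the series and then refitted on all cleaned data, so the final forecast uses the most recent observations.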

We are still developing our prototype. What you saw at the Data Science Meetup was our first demo, which we hacked together in merely two weeks. With a little more time, we are pretty confident that we will come up with something really cool. So let's hope it works. If it does, it's going to be lit.

About the Author
Lukas Strömsdörfer

I am a data scientist at STATWORX. Apart from automating my job, I take my vintage bike for a spin and am building an ML tool that lets me become a below-average gardener.
