As a REST API, Livy lets you interact with Spark without any Spark configuration on your client. Once you can communicate with the API, Spark code can be submitted from anywhere.
What's cooking at STATWORX?
Besides working hard to provide our clients with cutting-edge Machine Learning solutions, we are also big fans of all things culinary here at STATWORX. But can we apply some of those algorithms to make us better cooks? This blog article explores the unlikely union of Data Science and baking!
Web Scraping 101 in Python with Requests & BeautifulSoup
Want to obtain a specific dataset from a website that does not have an API? In this post, I explain how to do this by scraping data with Python, how to determine whether you are allowed to scrape a specific page, and more.
Using Reinforcement Learning to play Super Mario Bros on NES using TensorFlow
Could you #BeatTheAI? We let deep learning have a go at Super Mario’s first level and compared it to human players. Here we explain how we did it!
R or Python
Data science beginners keep facing the same question: which programming language should you learn first? The choice usually falls on one of the two major options, R or Python. With this blog article, we want to help you find the right programming language.
Open Workshop: Programming with Python, May 6th and 7th in Frankfurt
We at STATWORX are opening our doors to anyone who wants to learn more about data. Our open workshop is designed for Python beginners and offers a balanced mix of theory and practice. Participants gain first insights into data science and programming with Python. The workshop will be held in German at our office in Frankfurt. There is a limit of …
How to Speed Up Gradient Boosting by a Factor of Two
Our latest tool development at STATWORX: random boost, an algorithm twice as fast as gradient boosting, with comparable prediction performance.
R and Python: Using reticulate to get the best of both worlds
We at STATWORX mostly use R or Python for our projects. But why not both? With the help of the reticulate package, we can use Python within R. Here we show an example of how to train a Support Vector Machine.
Fixing the most common problem with Plotly histograms
In today’s blog post, we show you how to improve the interactivity of Plotly histograms with automatic rebinning.
Plotly – An interactive Charting library
In this blog post, we explore the plotly library for Python and R. We show how plotly is structured and use the LA Metro Bike dataset as an example to create interactive plots.