Coding Regression trees in 150 lines of R code

André Bleier Blog, Data Science

Motivation: There are dozens of machine learning algorithms out there. It is impossible to learn all their mechanics; however, many algorithms sprout from the most established ones, e.g. ordinary least squares, gradient boosting, support vector machines, tree-based algorithms and neural networks. At STATWORX we discuss algorithms daily to evaluate their usefulness for a specific project. In any case, understanding these …

Coding Gradient boosted machines in 100 lines of R code

André Bleier Blog, Data Science

Motivation: There are dozens of machine learning algorithms out there. It is impossible to learn all their mechanics; however, many algorithms sprout from the most established ones, e.g. ordinary least squares, gradient boosting, support vector machines, tree-based algorithms and neural networks. At STATWORX we discuss algorithms daily to evaluate their usefulness for a specific project or problem. In any case, …

Data Science in Python – Getting Started with Machine Learning Using Scikit-Learn

Moritz Gnisia Blog, Data Science

In our previous articles on Data Science in Python, we covered the basic syntax, data structures, arrays, data visualization, and data manipulation/selection. What is still missing for getting started is the ability to apply models to the data, on the one hand to detect patterns in it and on the other to derive predictions. The variety of models implemented in Python …

Using Machine Learning for Causal Inference

Markus Berroth Blog, Data Science

Machine Learning (ML) is still an underdog in the field of economics. However, it has gained more and more recognition in recent years. One reason for its underdog status is that in economics and other social sciences, one is interested not only in prediction but also in causal inference. Thus many "off-the-shelf" ML algorithms are solving a fundamentally different …

Regularized Greedy Forest – The Scottish Play (Act II)

Fabian Müller Blog, Data Science

In part one of this blog post, the Regularized Greedy Forest (RGF) was introduced as a contender to the more frequently used technique of Gradient Boosting Decision Trees (GBDT). Now it is time to turn words into actions and find out whether it actually is. Among all GBDT implementations, XGBoost is probably the most commonly used in the field …

Regularized Greedy Forest – The Scottish Play (Act I)

Fabian Müller Blog, Data Science, Statistics

"Macbeth shall never vanquish'd be until Great Birnam Wood to high Dunsinane Hill Shall come against him." (Act 4, Scene 1) In Shakespeare's The Tragedy of Macbeth, the prophecy of Birnam Wood is one of three misleading prophecies foreshadowing the defeat of the protagonist of the same name. While highly unlikely, the event of a nearby forest moving towards his …