An Implementation of Double Machine Learning with XGBoost in R

A Benchmark Estimate

This is an attempt to implement Double Machine Learning (DML) with the XGBoost algorithm in R. The purpose is to create a benchmark DML estimate. The user can choose among various machine learning algorithms, but optimizing their hyperparameters can be time-consuming. XGBoost is very useful in this regard. This script can be used to produce substantially accurate preliminary results. The repository is here. [Read More]
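For readers new to DML, here is a minimal sketch of the cross-fitted partialling-out idea for a partially linear model. It is written in Python with only the standard library and is not the script from the repository. The names `fit_line` and `dml_plr` are hypothetical, the data are simulated, and a one-variable least-squares fit stands in for XGBoost as the nuisance learner.

```python
import random

def fit_line(x, y):
    """Least-squares fit of y = a + b*x; returns a prediction function.
    Stand-in learner (assumption): any ML model could be plugged in here."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b = sxy / sxx
    a = my - b * mx
    return lambda v: a + b * v

def dml_plr(x, d, y, n_folds=2, learner=fit_line):
    """Cross-fitted estimate of theta in y = theta*d + g(x) + e."""
    n = len(y)
    res_d, res_y = [0.0] * n, [0.0] * n
    for k in range(n_folds):
        test = [i for i in range(n) if i % n_folds == k]
        train = [i for i in range(n) if i % n_folds != k]
        # Fit both nuisance models on the training folds only.
        m_hat = learner([x[i] for i in train], [d[i] for i in train])
        l_hat = learner([x[i] for i in train], [y[i] for i in train])
        # Residualize treatment and outcome on the held-out fold.
        for i in test:
            res_d[i] = d[i] - m_hat(x[i])
            res_y[i] = y[i] - l_hat(x[i])
    # Final stage: regress outcome residuals on treatment residuals.
    num = sum(rd * ry for rd, ry in zip(res_d, res_y))
    den = sum(rd * rd for rd in res_d)
    return num / den

# Simulated data with a true treatment effect of 0.5 (illustrative only).
rng = random.Random(0)
n = 2000
x = [rng.gauss(0, 1) for _ in range(n)]
d = [0.8 * xi + rng.gauss(0, 1) for xi in x]
y = [0.5 * di + 1.5 * xi + rng.gauss(0, 1) for di, xi in zip(d, x)]
theta_hat = dml_plr(x, d, y)
print(round(theta_hat, 3))
```

The `learner` argument is where a more flexible model would go. The cross-fitting loop and the final residual-on-residual regression stay the same whichever learner is used.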

A Fast Method to Create Synthetic Data with Python

Python package available in PyPI: synloc

I have been working on a project to create synthetic data. I mostly used the R package synthpop in the project. I have been thinking about a very simple algorithm to create synthetic data using the nearest neighbor algorithm since then. I have created a Python package named synloc. I discuss the practical and theoretical aspects here: Generating Synthetic Data with The Nearest Neighbors Algorithm [Read More]
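The general idea of nearest-neighbor synthesis can be sketched in a few lines of standard-library Python. This is only one plausible reading of the approach and does not use or reproduce the synloc API. The function name `synthesize` is hypothetical, and drawing each column from an independent normal distribution fitted to the neighbors is a simplifying assumption.

```python
import math
import random
import statistics

def synthesize(data, k=5, seed=0):
    """For each observation, fit a simple local distribution to its k
    nearest neighbors and draw one synthetic observation from it."""
    rng = random.Random(seed)
    synthetic = []
    for point in data:
        # k nearest neighbors by Euclidean distance (the point itself included).
        neighbors = sorted(data, key=lambda other: math.dist(point, other))[:k]
        new_point = []
        for column in zip(*neighbors):
            # Assumption: independent normal per column, fitted locally.
            mu = statistics.fmean(column)
            sigma = statistics.stdev(column)
            new_point.append(rng.gauss(mu, sigma))
        synthetic.append(tuple(new_point))
    return synthetic

# Toy data set with two numeric columns (illustrative only).
rng = random.Random(1)
data = [(rng.gauss(0, 1), rng.gauss(5, 2)) for _ in range(200)]
synthetic = synthesize(data, k=10)
print(len(synthetic), synthetic[0])
```

Sampling columns independently ignores local correlation between variables. Fitting a multivariate distribution to each neighborhood would preserve more of that structure, at the cost of more computation.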

Reducing Matrix Computation Time in R

Using sparseMatrix from the Matrix package

In order to reduce computation time, I transformed loops into matrix operations in an algorithm. Nevertheless, my matrices were extremely large, and thus computation was slower than I expected. I was using the %*% operator in R to do matrix multiplication. I found out that it is not possible to achieve dramatically faster computations with other background programming languages (e.g., using Rcpp or JuliaCall). I tried and failed. [Read More]
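To illustrate why a sparse representation can help when most entries are zero, here is a small sketch in standard-library Python. It is not the Matrix package and says nothing about how sparseMatrix is implemented. The helper names `to_sparse`, `sparse_matmul`, and `to_dense` are hypothetical. The point is only that the multiplication touches nonzero entries and skips everything else.

```python
def to_sparse(dense):
    """Store only nonzero entries as {row: {col: value}}."""
    return {i: {j: v for j, v in enumerate(row) if v != 0}
            for i, row in enumerate(dense)}

def sparse_matmul(a, b):
    """Multiply two sparse matrices, visiting only nonzero entries."""
    out = {}
    for i, a_row in a.items():
        acc = {}
        for k, a_ik in a_row.items():
            # Row k of b contributes only through its nonzero entries.
            for j, b_kj in b.get(k, {}).items():
                acc[j] = acc.get(j, 0) + a_ik * b_kj
        out[i] = acc
    return out

def to_dense(sparse, n_rows, n_cols):
    """Expand the sparse form back into a list of lists."""
    return [[sparse.get(i, {}).get(j, 0) for j in range(n_cols)]
            for i in range(n_rows)]

a = [[1, 0, 0], [0, 0, 2], [0, 3, 0]]
b = [[0, 4, 0], [5, 0, 0], [0, 0, 6]]
product = to_dense(sparse_matmul(to_sparse(a), to_sparse(b)), 3, 3)
print(product)  # [[0, 4, 0], [0, 0, 12], [15, 0, 0]]
```

A dense multiplication of two n-by-n matrices performs on the order of n^3 multiply-add steps regardless of content, while the loop above does work only for pairs of nonzero entries that actually meet.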