Metrics and utilities

synloc.compareStats(original_data, syn_data)

Compare univariate statistics for original and synthetic data.

The returned DataFrame includes mean, standard deviation, min/max, Kolmogorov-Smirnov statistic and Wasserstein distance for each column.
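To make two of the listed quantities concrete, here is a minimal NumPy sketch for a single column. It is not synloc's implementation. The helper names `summarize` and `wasserstein_1d_equal_size` are hypothetical, the sample standard deviation (`ddof=1`) is an assumption, and the Wasserstein shortcut is only valid when both samples have the same number of observations.

```python
import numpy as np

def summarize(column):
    """Basic univariate summary for one column (hypothetical helper)."""
    x = np.asarray(column, dtype=float)
    return {
        "mean": float(x.mean()),
        "std": float(x.std(ddof=1)),  # assumption: sample standard deviation
        "min": float(x.min()),
        "max": float(x.max()),
    }

def wasserstein_1d_equal_size(a, b):
    """1-Wasserstein distance between two equal-size 1-D samples.

    For equal sample sizes, the optimal transport plan pairs the sorted
    values, so the distance is the mean absolute difference between them.
    """
    a = np.sort(np.asarray(a, dtype=float))
    b = np.sort(np.asarray(b, dtype=float))
    if a.size != b.size:
        raise ValueError("this shortcut requires equal sample sizes")
    return float(np.mean(np.abs(a - b)))

original_col = [1.0, 2.0, 3.0, 4.0]
synthetic_col = [2.0, 3.0, 4.0, 5.0]

original_stats = summarize(original_col)
synthetic_stats = summarize(synthetic_col)
w_dist = wasserstein_1d_equal_size(original_col, synthetic_col)
```

A synthetic column that is a pure shift of the original keeps the same standard deviation but moves the mean, and the Wasserstein distance equals the size of the shift.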

synloc.quality_report(original_data, syn_data)

Return per-variable and overall utility metrics.

synloc.kolmogorov_distances(original_data, syn_data)

Return per-column Kolmogorov-Smirnov distances.
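As an illustration of what a per-column Kolmogorov-Smirnov distance measures, the sketch below computes the standard two-sample statistic (the largest gap between the two empirical CDFs) for each column of two 2-D arrays. It is a NumPy-only sketch rather than synloc's code. The function names `ks_statistic` and `ks_per_column` are hypothetical, and the use of plain arrays instead of DataFrames is an assumption made to keep the example self-contained.

```python
import numpy as np

def ks_statistic(a, b):
    """Two-sample KS statistic: max absolute gap between empirical CDFs."""
    a = np.sort(np.asarray(a, dtype=float))
    b = np.sort(np.asarray(b, dtype=float))
    # The gap can only change at observed values, so evaluate both CDFs there.
    grid = np.concatenate([a, b])
    cdf_a = np.searchsorted(a, grid, side="right") / a.size
    cdf_b = np.searchsorted(b, grid, side="right") / b.size
    return float(np.max(np.abs(cdf_a - cdf_b)))

def ks_per_column(original, synthetic):
    """Apply ks_statistic column by column to two 2-D arrays."""
    original = np.asarray(original, dtype=float)
    synthetic = np.asarray(synthetic, dtype=float)
    if original.shape[1] != synthetic.shape[1]:
        raise ValueError("both inputs must have the same number of columns")
    return np.array([
        ks_statistic(original[:, j], synthetic[:, j])
        for j in range(original.shape[1])
    ])

original = np.array([[1, 10], [2, 20], [3, 30], [4, 40]])
synthetic = np.array([[1, 30], [2, 40], [3, 50], [4, 60]])

ks_values = ks_per_column(original, synthetic)
```

The first column is identical in both arrays, so its distance is 0. The second column is shifted so that half of the original values fall below every synthetic value, giving a distance of 0.5.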

synloc.compute_k_distances(data, K=5)

For each observation in data, compute the sum of Euclidean distances to its K nearest neighbors (excluding itself).

Parameters:
data: pandas.DataFrame or numpy.ndarray
K: int, number of neighbors

Returns: numpy.ndarray of shape (n_samples,)
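The computation described above can be sketched with brute-force pairwise distances in NumPy. This is an illustrative version under stated assumptions, not the library's implementation: the name `k_distance_sums` is hypothetical, and the O(n²) distance matrix is only practical for small inputs.

```python
import numpy as np

def k_distance_sums(data, K=5):
    """Sum of Euclidean distances from each row to its K nearest other rows."""
    X = np.asarray(data, dtype=float)
    if X.ndim == 1:
        X = X.reshape(-1, 1)  # treat a 1-D input as a single feature
    n = X.shape[0]
    if not 1 <= K < n:
        raise ValueError("K must satisfy 1 <= K < n_samples")
    # Full pairwise Euclidean distance matrix, shape (n, n).
    diff = X[:, None, :] - X[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    # Exclude each point from its own neighbor set.
    np.fill_diagonal(dist, np.inf)
    # np.partition places the K smallest distances in the first K slots of each row.
    nearest = np.partition(dist, K - 1, axis=1)[:, :K]
    return nearest.sum(axis=1)

points = np.array([[0.0, 0.0], [3.0, 4.0], [6.0, 8.0], [0.0, 1.0]])
k_sums = k_distance_sums(points, K=2)
```

For the point at the origin, the two nearest neighbors are at distances 1 and 5, so its entry is 6. Larger values indicate observations that sit farther from their neighbors.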

synloc.sample_trivariate_xyz(size=1000)
synloc.sample_circulars_xy(size)