The k parameter for the k-nearest-neighbor search used in DiffusionMap should be as large as is computationally feasible. This function approximates a suitable k from the number of possible neighbors n.

find_dm_k(n, min_k = 100L, small = 1000L, big = 10000L)

Arguments

n

Number of possible neighbors (nrow(dataset) - 1)

min_k

Minimum number of neighbors. Returned where \(n \ge big\).

small

Number of neighbors considered small. Where \(n \le small\), n itself is returned.

big

Number of neighbors considered big. Where \(n \ge big\), min_k is returned.

Value

A vector of the same length as n containing a suitable k value for each element of n.

Examples

curve(find_dm_k(n), 0, 13000, xname = 'n')
curve(find_dm_k(n) / n, 0, 13000, xname = 'n')
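
The boundary behavior described under Arguments can be illustrated with a small stand-in function. This is only a sketch: sketch_dm_k is a hypothetical name, not part of the package, and the straight-line interpolation it uses between small and big is an assumption made for illustration, since this page does not specify how find_dm_k behaves in that range.

```r
# Hypothetical stand-in for illustration; NOT the package's implementation.
sketch_dm_k <- function(n, min_k = 100L, small = 1000L, big = 10000L) {
  stopifnot(small < big)
  k <- rep(NA_integer_, length(n))

  is_small <- !is.na(n) & n <= small
  is_big   <- !is.na(n) & n >= big
  is_mid   <- !is.na(n) & !is_small & !is_big

  k[is_small] <- as.integer(n[is_small])  # documented: n itself is returned
  k[is_big]   <- as.integer(min_k)        # documented: min_k is returned

  # Assumption: straight line from (small, small) down to (big, min_k).
  frac <- (n[is_mid] - small) / (big - small)
  k[is_mid] <- as.integer(round(small + frac * (min_k - small)))

  k
}

sketch_dm_k(c(500, 1000, 5500, 10000, 20000))
```

Only the first, second, fourth and fifth results follow from the documented rules (n returned at or below small, min_k returned at or above big). The middle value depends entirely on the assumed interpolation and may differ from what find_dm_k returns.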