Approximate k-nearest-neighbor search with a flexible distance function.
find_knn(data, k, ..., query = NULL, distance = c("euclidean", "cosine", "rankcor", "l2"), method = c("covertree", "hnsw"), sym = TRUE, verbose = FALSE)
Argument | Description |
---|---|
data | Data matrix |
k | Number of nearest neighbors |
... | Parameters passed to |
query | Query matrix. Leave it out to use |
distance | Distance metric to use. Allowed measures: Euclidean distance (default), cosine distance (\(1 - \mathrm{corr}(c_1, c_2)\)), or rank correlation distance (\(1 - \mathrm{corr}(\mathrm{rank}(c_1), \mathrm{rank}(c_2))\)) |
method | Method to use. |
sym | Return a symmetric matrix (as long as query is NULL)? |
verbose | Show a progress bar? (default: FALSE) |
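
The distance measures listed for `distance` can be illustrated in base R. This is a minimal sketch that follows the formulas exactly as they are written in the table above; it does not show how the package computes them internally, and the vectors `c1` and `c2` are made-up example data.

```r
# Two example observations (hypothetical values, for illustration only)
c1 <- c(1, 2, 3, 4)
c2 <- c(2, 1, 4, 8)

# Euclidean distance: square root of the summed squared differences
euclidean <- sqrt(sum((c1 - c2)^2))

# "cosine" as written in the table: one minus the correlation
cosine_as_documented <- 1 - cor(c1, c2)

# Rank correlation distance: one minus the correlation of the ranks,
# which is the same as one minus the Spearman correlation
rankcor <- 1 - cor(rank(c1), rank(c2))

print(c(euclidean = euclidean,
        cosine = cosine_as_documented,
        rankcor = rankcor))
```

Both correlation-based measures are 0 for perfectly correlated observations and grow as the correlation drops, so smaller values mean closer neighbors under every option.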
A list with the entries:
index
A \(nrow(data) \times k\) integer matrix containing the indices of the k nearest neighbors for each cell.
dist
A \(nrow(data) \times k\) double matrix containing the distances to the k nearest neighbors for each cell.
dist_mat
A dsCMatrix if sym == TRUE, else a dgCMatrix (\(nrow(query) \times nrow(data)\)).
Any zero in the matrix (except for the diagonal) indicates that the cells in the corresponding pair are close neighbors.
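
To make the shape of the three entries concrete, the sketch below builds comparable objects with an exact brute-force search in base R. It is not the approximate cover tree or HNSW method the function uses, and it assumes that a point is not counted as its own neighbor. The names `knn_dist`, `dense`, and `dense_sym` are hypothetical helpers, and the data are random.

```r
set.seed(1)
data <- matrix(rnorm(20 * 3), nrow = 20)  # 20 observations, 3 features
k <- 4
n <- nrow(data)

# Exact pairwise Euclidean distances; Inf on the diagonal excludes self-matches
d_full <- as.matrix(dist(data))
diag(d_full) <- Inf

# n x k matrices: neighbor indices and the matching distances, nearest first
index    <- unname(t(apply(d_full, 1, function(d) order(d)[seq_len(k)])))
knn_dist <- unname(t(apply(d_full, 1, function(d) sort(d)[seq_len(k)])))

# n x n matrix holding only the k nearest-neighbor distances of each row
dense <- matrix(0, n, n)
dense[cbind(rep(seq_len(n), times = k), as.vector(index))] <- as.vector(knn_dist)

# Symmetric variant: keep a distance if either point lists the other
dense_sym <- pmax(dense, t(dense))

# Optional sparse storage, if the Matrix package is installed
if (requireNamespace("Matrix", quietly = TRUE)) {
  dist_mat <- Matrix::Matrix(dense_sym, sparse = TRUE)
}

print(dim(index))
print(dim(knn_dist))
```

Row `i` of `index` and of `knn_dist` describe the same neighbors in the same order, so `knn_dist[i, j]` is the distance from observation `i` to observation `index[i, j]`.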