This repository has been archived by the owner on Jan 8, 2020. It is now read-only.
#' Generic function to make a prediction for a time series.
#' If a kNN model is provided as the first argument, knn_forecast
#' is called directly. If single values are provided for both k and d,
#' no parameter search can be performed, so knn_forecast is also
#' called directly. Otherwise a parameter search is run first; if no
#' values are provided for k and/or d, values 1 to 50 are used by default.
#'
#' @param y A time series or a trained kNN model generated by the
#' knn_param_search function. If a model is provided, the knn_forecast
#' function is called automatically.
#' @param k Values of k to be analyzed, or the chosen k for kNN forecasting.
#' Default value is 1 to 50.
#' @param d Values of d to be analyzed, or the chosen d for kNN forecasting.
#' Default value is 1 to 50.
#' @param distance Type of metric to evaluate the distance between points.
#' Many metrics are supported: euclidean, manhattan, dynamic time warping,
#' canberra and others. Distances are calculated with the parDist function
#' (from the parallelDist package), so the supported metrics are the values
#' that its 'method' argument can take.
#' Link to package info: https://cran.r-project.org/web/packages/parallelDist
#' Some of the values that this argument can take are "euclidean", "manhattan",
#' "dtw", "canberra", "chord".
#' @param error_measure Type of metric to evaluate the prediction error.
#' Five metrics supported:
#' \describe{
#' \item{ME}{Mean Error}
#' \item{RMSE}{Root Mean Squared Error}
#' \item{MAE}{Mean Absolute Error}
#' \item{MPE}{Mean Percentage Error}
#' \item{MAPE}{Mean Absolute Percentage Error}
#' }
#' @param weight Type of weight to be used when calculating the
#' predicted value as a weighted mean. Three types are supported:
#' proportional, average, linear.
#' \describe{
#' \item{proportional}{the weight assigned to each neighbor is inversely
#' proportional to its distance}
#' \item{average}{all neighbors are assigned the same weight}
#' \item{linear}{the nearest neighbor is assigned weight k, the second
#' closest neighbor weight k-1, and so on until the farthest
#' neighbor, which is assigned a weight of 1}
#' }
#' @param v Variable to be predicted if given multivariate time series.
#' @param threads Number of threads to be used when parallelizing. Default is 1.
#' @return A matrix of errors and the optimal k and d, together with all
#' tested ks and ds and all the metrics used.
#' @examples
#' knn(AirPassengers, 1:5, 1:3)
#' knn(LakeHuron, 1:10, 1:6)
#' @export
knn <- function(y, k = 1:50, d = 1:50, distance = "euclidean",
                error_measure = "MAE", weight = "proportional", v = 1,
                threads = 1) {
  if (inherits(y, "kNN")) {
    # A trained model was supplied: forecast directly from it.
    warning("kNN model provided, simple prediction carried out",
            immediate. = TRUE)
    knn_forecast(y)
  } else if (length(k) == 1 && length(d) == 1) {
    # Single k and d: there is nothing to search over, so forecast directly.
    warning(paste0("k and d are single integers: supposing simple ",
                   "prediction is required"), immediate. = TRUE)
    knn_forecast(y = y, k = k, d = d, distance = distance, weight = weight,
                 v = v, threads = threads)
  } else {
    # Several candidate values: run the parameter search first, then
    # forecast with the resulting model.
    warning(paste0("Beginning parameter search process. This may take a ",
                   "while"), immediate. = TRUE)
    knn_forecast(knn_param_search(y = y, k = k, d = d, distance = distance,
                                  error_measure = error_measure,
                                  weight = weight, v = v, threads = threads))
  }
}
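
# Illustrative sketch only, not part of the package API: one way the three
# 'weight' options documented above (proportional, average, linear) could be
# turned into a weighted mean of the neighbors' values. The helper name
# `example_weighted_prediction` and its arguments are hypothetical; the
# weighting actually used by knn_forecast is defined elsewhere and may differ,
# for example in how zero distances are handled.
example_weighted_prediction <- function(values, distances,
                                        weight = c("proportional", "average",
                                                   "linear")) {
  weight <- match.arg(weight)
  k <- length(values)
  # Sort the neighbors from nearest to farthest.
  ord <- order(distances)
  values <- values[ord]
  distances <- distances[ord]
  w <- switch(weight,
    # Assumption: a tiny constant guards against division by zero.
    proportional = 1 / (distances + .Machine$double.eps),
    # Every neighbor counts equally.
    average = rep(1, k),
    # Nearest neighbor gets weight k, farthest gets weight 1.
    linear = rev(seq_len(k))
  )
  sum(w * values) / sum(w)
}
# Possible usage, with three neighbors whose next values are 10, 20 and 30:
# example_weighted_prediction(c(10, 20, 30), c(1, 2, 4), "average")  # 20
# example_weighted_prediction(c(10, 20, 30), c(1, 2, 4), "linear")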