When evaluating a time series prediction model, if very large errors (outliers) are a big concern, which error metric would you use to highlight them more: Mean Absolute Error (MAE) or Root Mean Squared Error (RMSE)?
Root Mean Squared Error (RMSE). When very large errors (outliers) are a significant concern in evaluating a time series prediction model, RMSE is the better choice than Mean Absolute Error (MAE) because it weights large errors more heavily.
Mean Absolute Error (MAE) is the average of the absolute differences between the predicted values and the actual observed values. Taking the absolute difference discards the sign of the error (whether the prediction was too high or too low) and keeps only its magnitude. MAE treats all errors linearly: an error of 10 units contributes exactly ten times as much to the total error sum as an error of 1 unit. MAE is therefore a straightforward average of the error magnitudes that gives no disproportionate weight to larger errors.
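As a minimal sketch of the MAE calculation described above, the following plain-Python function averages the absolute differences. The function name `mean_absolute_error` and the sample series are illustrative assumptions, not part of the original question.

```python
def mean_absolute_error(actual, predicted):
    """Average of the absolute differences between actual and predicted values."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    if not actual:
        raise ValueError("at least one observation is required")
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


# Hypothetical series: three forecasts are off by 1 unit, one is off by 10 units.
actual = [100, 102, 101, 103]
predicted = [101, 101, 102, 113]

mae = mean_absolute_error(actual, predicted)  # (1 + 1 + 1 + 10) / 4 = 3.25
```

Here the 10-unit error supplies 10 of the 13 units in the absolute-error sum, which is exactly ten times the share of each 1-unit error.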
Root Mean Squared Error (RMSE), on the other hand, is calculated by first squaring the differences between predicted and actual values, then averaging these squared differences, and finally taking the square root of that average. The crucial part of RMSE is the "squaring the differences" step. This operation disproportionately penalizes larger errors. For example, an error of 10, when squared, becomes 100, while an error of 1, when squared, becomes 1. This means the 10-unit error contributes 100 times more to the sum of squared errors than the 1-unit error, even though it's only 10 times larger in magnitude. This quadratic weighting causes RMSE to give much more weight to large errors than MAE does. The subsequent square root operation in RMSE brings the unit of the error back to the original scale of the data, making it interpretable, but the amplifying effect of the initial squaring on larger errors remains.
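The three RMSE steps described above (square, average, take the square root) can be sketched in plain Python as follows. The function name `root_mean_squared_error` and the sample series are illustrative assumptions.

```python
import math


def root_mean_squared_error(actual, predicted):
    """Square each difference, average the squares, then take the square root."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    if not actual:
        raise ValueError("at least one observation is required")
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))


# Hypothetical series: three forecasts are off by 1 unit, one is off by 10 units.
actual = [100, 102, 101, 103]
predicted = [101, 101, 102, 113]

# Squaring turns the errors 1, 1, 1, 10 into 1, 1, 1, 100.
squared_errors = [(a - p) ** 2 for a, p in zip(actual, predicted)]
rmse = root_mean_squared_error(actual, predicted)  # sqrt(103 / 4)
```

In this sketch the single 10-unit error accounts for 100 of the 103 units in the squared-error sum, so it dominates the result even after the square root is taken.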
Therefore, because RMSE applies a quadratic penalty, it amplifies the impact of large errors (outliers) far more than MAE does. A single large error can raise RMSE substantially while moving MAE only modestly, so RMSE highlights these errors more prominently. RMSE is never smaller than MAE on the same set of errors, and the gap between the two widens as the error magnitudes become more uneven, which makes that gap a quick signal that a few large errors are present.
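To make the difference in outlier sensitivity concrete, the sketch below compares two hypothetical sets of forecast errors that share the same MAE but differ in how the error is distributed. The helper names `mae` and `rmse` and both error lists are made up for illustration.

```python
import math


def mae(errors):
    """Mean of the absolute errors."""
    return sum(abs(e) for e in errors) / len(errors)


def rmse(errors):
    """Square root of the mean of the squared errors."""
    return math.sqrt(sum(e * e for e in errors) / len(errors))


# Hypothetical error sets with the same total absolute error (16 units).
even_errors = [4, 4, 4, 4]     # error spread evenly across forecasts
spiky_errors = [1, 1, 1, 13]   # most of the error sits in one outlier

for name, errors in [("even", even_errors), ("spiky", spiky_errors)]:
    print(f"{name}: MAE={mae(errors):.2f}, RMSE={rmse(errors):.2f}")
# even: MAE=4.00, RMSE=4.00
# spiky: MAE=4.00, RMSE=6.56
```

MAE cannot tell the two cases apart, while RMSE rises from 4.00 to about 6.56 when the same total error is concentrated in a single forecast.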