modeltime_calibrate() Unable to calibrate a classification model (R 4.3.1) - .pred_class column returned #252

Open
sebsfox opened this issue Aug 29, 2024 · 3 comments
Labels
enhancement New feature or request

Comments

@sebsfox

sebsfox commented Aug 29, 2024

I'm new to {modeltime} and I think it is a great package - thank you for putting so much time into it.
I'm trying to perform a classification task with time series data.

When I get to modeltime_calibrate() I get an error. It appears to be looking for a .pred column, whereas predict() returns a .pred_class column for this model. It might be related to #228.

Reproducible example below:

library(tsibbledata) # Diverse Datasets for 'tsibble'
library(tidymodels)
library(modeltime)
library(dplyr)
library(timetk)

# create high demand variable
df <- vic_elec |> #vic_elec is in tsibbledata package
  mutate(
    high_demand = case_when(
      Demand > quantile(Demand, probs = 0.95) ~ "Yes",
      .default = "No"
    )
  )

splits <- initial_time_split(
  df,
  prop = 0.8
)

recipe_spec <- recipe(
  high_demand ~ Time, # consider Temperature and Holiday
  data = training(splits)
)

model_fit_glm <- logistic_reg() |>
  set_engine(
    "glm",
    family = stats::binomial(link = "logit")
  )

wflw_glm <- workflow() |>
  add_recipe(recipe_spec) |>
  add_model(model_fit_glm) |>
  fit(training(splits))

models_tbl <- modeltime::modeltime_table(
  wflw_glm
)

calibration_tbl <- models_tbl |> 
  modeltime::modeltime_calibrate(
    new_data = testing(splits),
    quiet = FALSE
  )
#> Error: ℹ In argument: `.pred = ifelse(is.na(.pred), high_demand, .pred)`.
#> Caused by error:
#> ! object '.pred' not found
#> 
#> ── Model Calibration Failure Report ────────────────────────
#> # A tibble: 1 × 4
#>   .model_id .model     .model_desc .nested.col
#>       <int> <list>     <chr>       <lgl>      
#> 1         1 <workflow> GLM         NA
#> All models failed Modeltime Calibration:
#> - Model 1: Failed Calibration.
#> 
#> Potential Solution: Use `modeltime_calibrate(quiet = FALSE)` AND Check the Error/Warning Messages for clues as to why your model(s) failed calibration.
#> ── End Model Calibration Failure Report ────────────────────
#> Error in `validate_modeltime_calibration()`:
#> ! All models failed Modeltime Calibration.


predict(
  wflw_glm,
  new_data = testing(splits)
)
#> # A tibble: 10,522 × 1
#>    .pred_class
#>    <fct>      
#>  1 No         
#>  2 No         
#>  3 No         
#>  4 No         
#>  5 No         
#>  6 No         
#>  7 No         
#>  8 No         
#>  9 No         
#> 10 No         
#> # ℹ 10,512 more rows

Created on 2024-08-29 with reprex v2.1.1

Session info

─ Session info ────────────────────────────────────────────────────────────────────────────────────
 setting  value
 version  R version 4.4.1 (2024-06-14 ucrt)
 os       Windows 10 x64 (build 19045)
 system   x86_64, mingw32
 ui       RStudio
 language (EN)
 collate  English_United Kingdom.utf8
 ctype    English_United Kingdom.utf8
 tz       Europe/London
 date     2024-08-29
 rstudio  2024.04.2+764 Chocolate Cosmos (desktop)
 pandoc   3.1.11 @ C:/Users/Sebastian.Fox/AppData/Local/Programs/RStudio/resources/app/bin/quarto/bin/tools/ (via rmarkdown)

─ Packages ────────────────────────────────────────────────────────────────────────────────────────
 ! package      * version    date (UTC) lib source
   backports      1.5.0      2024-05-23 [1] CRAN (R 4.4.0)
   broom        * 1.0.6      2024-05-17 [1] CRAN (R 4.4.1)
   callr          3.7.6      2024-03-25 [1] CRAN (R 4.4.1)
   class          7.3-22     2023-05-03 [1] CRAN (R 4.4.1)
   cli            3.6.3      2024-06-21 [1] CRAN (R 4.4.1)
   clipr          0.8.0      2022-02-22 [1] CRAN (R 4.4.1)
   clock          0.7.0      2023-05-15 [1] CRAN (R 4.4.1)
   codetools      0.2-20     2024-03-31 [1] CRAN (R 4.4.1)
   colorspace     2.1-0      2023-01-23 [1] CRAN (R 4.4.1)
   crayon         1.5.3      2024-06-20 [1] CRAN (R 4.4.1)
   data.table     1.15.4     2024-03-30 [1] CRAN (R 4.4.1)
   dials        * 1.2.1      2024-02-22 [1] CRAN (R 4.4.1)
   DiceDesign     1.10       2023-12-07 [1] CRAN (R 4.4.1)
   digest         0.6.36     2024-06-23 [1] CRAN (R 4.4.1)
   dplyr        * 1.1.4      2023-11-17 [1] CRAN (R 4.4.1)
   evaluate       0.24.0     2024-06-10 [1] CRAN (R 4.4.1)
   fansi          1.0.6      2023-12-08 [1] CRAN (R 4.4.1)
   farver         2.1.2      2024-05-13 [1] CRAN (R 4.4.1)
   fastmap        1.2.0      2024-05-15 [1] CRAN (R 4.4.1)
   forcats        1.0.0      2023-01-29 [1] CRAN (R 4.4.1)
   foreach        1.5.2      2022-02-02 [1] CRAN (R 4.4.1)
   fs             1.6.4      2024-04-25 [1] CRAN (R 4.4.1)
   furrr          0.3.1      2022-08-15 [1] CRAN (R 4.4.1)
   future         1.33.2     2024-03-26 [1] CRAN (R 4.4.1)
   future.apply   1.11.2     2024-03-28 [1] CRAN (R 4.4.1)
   generics       0.1.3      2022-07-05 [1] CRAN (R 4.4.1)
   ggplot2      * 3.5.1      2024-04-23 [1] CRAN (R 4.4.1)
   globals        0.16.3     2024-03-08 [1] CRAN (R 4.4.0)
   glue           1.7.0      2024-01-09 [1] CRAN (R 4.4.1)
   gower          1.0.1      2022-12-22 [1] CRAN (R 4.4.0)
   GPfit          1.0-8      2019-02-08 [1] CRAN (R 4.4.1)
   gtable         0.3.5      2024-04-22 [1] CRAN (R 4.4.1)
   hardhat        1.4.0      2024-06-02 [1] CRAN (R 4.4.1)
   htmltools      0.5.8.1    2024-04-04 [1] CRAN (R 4.4.1)
   infer        * 1.0.7      2024-03-25 [1] CRAN (R 4.4.1)
   ipred          0.9-14     2023-03-09 [1] CRAN (R 4.4.1)
   iterators      1.0.14     2022-02-05 [1] CRAN (R 4.4.1)
   knitr          1.48       2024-07-07 [1] CRAN (R 4.4.1)
   labeling       0.4.3      2023-08-29 [1] CRAN (R 4.4.0)
   lattice        0.22-6     2024-03-20 [1] CRAN (R 4.4.1)
   lava           1.8.0      2024-03-05 [1] CRAN (R 4.4.1)
   lhs            1.2.0      2024-06-30 [1] CRAN (R 4.4.1)
   lifecycle      1.0.4      2023-11-07 [1] CRAN (R 4.4.1)
   listenv        0.9.1      2024-01-29 [1] CRAN (R 4.4.1)
   lubridate      1.9.3      2023-09-27 [1] CRAN (R 4.4.1)
   magrittr       2.0.3      2022-03-30 [1] CRAN (R 4.4.1)
   MASS           7.3-60.2   2024-04-26 [1] CRAN (R 4.4.1)
   Matrix         1.7-0      2024-04-26 [1] CRAN (R 4.4.1)
   modeldata    * 1.4.0      2024-06-19 [1] CRAN (R 4.4.1)
   modeltime      1.3.0      2024-07-29 [1] CRAN (R 4.4.1)
   munsell        0.5.1      2024-04-01 [1] CRAN (R 4.4.1)
   nnet           7.3-19     2023-05-03 [1] CRAN (R 4.4.1)
   parallelly     1.37.1     2024-02-29 [1] CRAN (R 4.4.0)
   parsnip      * 1.2.1      2024-03-22 [1] CRAN (R 4.4.1)
   pillar         1.9.0      2023-03-22 [1] CRAN (R 4.4.1)
   pkgconfig      2.0.3      2019-09-22 [1] CRAN (R 4.4.1)
   pkgload        1.4.0      2024-06-28 [1] CRAN (R 4.4.1)
   processx       3.8.4      2024-03-16 [1] CRAN (R 4.4.1)
   prodlim        2024.06.25 2024-06-24 [1] CRAN (R 4.4.1)
   ps             1.7.6      2024-01-18 [1] CRAN (R 4.4.1)
   purrr        * 1.0.2      2023-08-10 [1] CRAN (R 4.4.1)
   R6             2.5.1      2021-08-19 [1] CRAN (R 4.4.1)
   ranger         0.16.0     2023-11-12 [1] CRAN (R 4.4.1)
   rappdirs       0.3.3      2021-01-31 [1] CRAN (R 4.4.1)
   Rcpp           1.0.12     2024-01-09 [1] CRAN (R 4.4.1)
 D RcppParallel   5.1.9      2024-08-19 [1] CRAN (R 4.4.1)
   recipes      * 1.1.0      2024-07-04 [1] CRAN (R 4.4.1)
   reprex         2.1.1      2024-07-06 [1] CRAN (R 4.4.1)
   rlang          1.1.4      2024-06-04 [1] CRAN (R 4.4.1)
   rmarkdown      2.27       2024-05-17 [1] CRAN (R 4.4.1)
   rpart          4.1.23     2023-12-05 [1] CRAN (R 4.4.1)
   rsample      * 1.2.1      2024-03-25 [1] CRAN (R 4.4.1)
   rstudioapi     0.16.0     2024-03-24 [1] CRAN (R 4.4.1)
   scales       * 1.3.0      2023-11-28 [1] CRAN (R 4.4.1)
   sessioninfo    1.2.2      2021-12-06 [1] CRAN (R 4.4.1)
   StanHeaders    2.32.10    2024-07-15 [1] CRAN (R 4.4.1)
   stringi        1.8.4      2024-05-06 [1] CRAN (R 4.4.0)
   stringr        1.5.1      2023-11-14 [1] CRAN (R 4.4.1)
   survival       3.6-4      2024-04-24 [1] CRAN (R 4.4.1)
   tibble       * 3.2.1      2023-03-20 [1] CRAN (R 4.4.1)
   tidymodels   * 1.2.0      2024-03-25 [1] CRAN (R 4.4.1)
   tidyr        * 1.3.1      2024-01-24 [1] CRAN (R 4.4.1)
   tidyselect     1.2.1      2024-03-11 [1] CRAN (R 4.4.1)
   timechange     0.3.0      2024-01-18 [1] CRAN (R 4.4.1)
   timeDate       4032.109   2023-12-14 [1] CRAN (R 4.4.1)
   timetk       * 2.9.0      2023-10-31 [1] CRAN (R 4.4.1)
   tsibbledata  * 0.4.1      2022-09-01 [1] CRAN (R 4.4.1)
   tune         * 1.2.1      2024-04-18 [1] CRAN (R 4.4.1)
   tzdb           0.4.0      2023-05-12 [1] CRAN (R 4.4.1)
   utf8           1.2.4      2023-10-22 [1] CRAN (R 4.4.1)
   vctrs          0.6.5      2023-12-01 [1] CRAN (R 4.4.1)
   withr          3.0.0      2024-01-16 [1] CRAN (R 4.4.1)
   workflows    * 1.1.4      2024-02-19 [1] CRAN (R 4.4.1)
   workflowsets * 1.1.0      2024-03-21 [1] CRAN (R 4.4.1)
   xfun           0.45       2024-06-16 [1] CRAN (R 4.4.1)
   xts            0.14.0     2024-06-05 [1] CRAN (R 4.4.1)
   yaml           2.3.9      2024-07-05 [1] CRAN (R 4.4.1)
   yardstick    * 1.3.1      2024-03-21 [1] CRAN (R 4.4.1)
   zoo            1.8-12     2023-04-13 [1] CRAN (R 4.4.1)

 [1] C:/Program Files/R/R-4.4.1/library

 D ── DLL MD5 mismatch, broken installation.

───────────────────────────────────────────────────────────────────────────────────────────────────
@sebsfox
Author

sebsfox commented Aug 29, 2024

Perhaps this line needs updating to:

nms_final <- stringr::str_replace(nms_final, ".pred_res|.pred_class", ".pred")
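As a minimal sketch of what the proposed replacement would do, the snippet below applies it to a hypothetical vector of column names (`nms` is an illustration only, not taken from the modeltime source). It also shows a stricter variant with an escaped dot and anchors, since an unescaped `.` in a regular expression matches any character. Whether renaming alone is enough for calibration to succeed is not established here.

```r
library(stringr)

# Hypothetical column names, standing in for what predict() may return
nms <- c(".pred_res", ".pred_class", ".pred", "Time")

# Pattern as proposed above; the unescaped "." matches any character
renamed <- str_replace(nms, ".pred_res|.pred_class", ".pred")

# Stricter variant (an assumption, not the author's proposal):
# literal dot, anchored to the whole column name
renamed_strict <- str_replace(nms, "^\\.pred_(res|class)$", ".pred")

print(renamed)
print(renamed_strict)
```
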

@mdancho84
Contributor

The problem:

  • Modeltime is designed for forecasting with regression models, which predict numeric values
  • A logistic_reg() model performs classification, so it predicts classes instead

Solution:

  • For now I'd make it a regression problem. For example, instead of Yes or No, convert to 1 or 0. Then have the model predict a numeric value that you can compare with the true values.
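The workaround above could be sketched as follows, reusing the data and split from the reproducible example. This is an illustration under assumptions, not a confirmed fix: the choice of `linear_reg()` with the `"lm"` engine and the name `wflw_lm` are not from the thread, and the outcome of calibration has not been verified here.

```r
library(tsibbledata)
library(tidymodels)
library(modeltime)

# Recode the target as numeric 0/1 instead of "Yes"/"No"
df <- vic_elec |>
  mutate(high_demand = as.numeric(Demand > quantile(Demand, probs = 0.95)))

splits <- initial_time_split(df, prop = 0.8)

recipe_spec <- recipe(high_demand ~ Time, data = training(splits))

# Assumption: a plain linear regression stands in for the classifier,
# so that predict() returns a numeric .pred column
wflw_lm <- workflow() |>
  add_recipe(recipe_spec) |>
  add_model(linear_reg() |> set_engine("lm")) |>
  fit(training(splits))

calibration_tbl <- modeltime_table(wflw_lm) |>
  modeltime_calibrate(new_data = testing(splits), quiet = FALSE)
```

Predictions from a linear model are not bounded to the range 0 to 1. Thresholding them (for example at 0.5) to recover class labels would be a further step that the thread does not describe.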

@sebsfox
Author

sebsfox commented Oct 24, 2024

Thanks for responding and putting it on your backlog. I've moved on to another project, but I may come back to this at some point.
