Error running robyn_outputs() #318

Closed
tgtod002 opened this issue Feb 23, 2022 · 23 comments

@tgtod002

Project Robyn 3.6

I get this error message when running robyn_outputs():

Using robyn object location: C:/Users todor/OneDrive - Vericast/Documents/AFW_MMM_Model
Provided 'plot_folder' doesn't exist. Using default 'plot_folder = getwd()': C:/Users/ttodor/OneDrive - Vericast/Documents/AFW_MMM_Model

Running Pareto calculations for 10000 models on 3 fronts...
Error in robyn_pareto(InputCollect, OutputModels, pareto_fronts, calibration_constraint) :
object 'dt_expoCurvePlot' not found

Provide dummy data & model configuration

Issues are often related to custom input data that is difficult to debug without. If necessary, please modify your data to mask real values and share a dataset that is able to reproduce the issue. Please also share your model configuration.

Environment & Robyn version

R version (R --version)
Please make sure you're using the latest Robyn version

@gufengzhou
Contributor

Hi, have you installed/updated to the latest Robyn v3.6.0?

@tgtod002
Author

Yes. I have.

@Jozephz

Jozephz commented Feb 24, 2022

I'm also getting a very similar error message when running the model with robyn_run():

Running Pareto calculations for 10000 models on 3 fronts...
Error in robyn_pareto(InputCollect, OutputModels, pareto_fronts, calibration_constraint) :
object 'dt_expoCurvePlot' not found
In addition: Warning message:
In check_legacy_input(InputCollect, cores, iterations, trials, intercept_sign, :
Using legacy InputCollect values. Please set "iterations", "trials" within robyn_run() instead

This message began appearing yesterday, after a suggested update. I think it's possible a dependency may have been broken here, but I'm not sure how to resolve this. Thanks for your help :)

@F1nalFortune

I'm also getting the same error here. Before this error, I get a convergence plot error:

> OutputModels$convergence$moo_distrb_plot
Error in if (!(lo <- min(hi, IQR(x)/1.34))) (lo <- hi) || (lo <- abs(x[1L])) ||  : 
  missing value where TRUE/FALSE needed

@laresbernardo
Collaborator

@F1nalFortune this seems like another issue. Reading this ticket in ggridges repo, it seems like an Inf issue. Can you please check your data for missing values, too many zeroes or NAs?
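
A quick way to run that kind of check in base R is sketched below. The data frame `dt` and its columns are made up for illustration, and `summarise_problem_values` is a hypothetical helper, not part of Robyn; swap in your own input data.

```r
# Hypothetical helper (not part of Robyn): per numeric column, count
# missing values (NA/NaN), infinite values, and the share of zeros.
summarise_problem_values <- function(dt) {
  num_cols <- names(dt)[vapply(dt, is.numeric, logical(1))]
  data.frame(
    variable = num_cols,
    n_na = vapply(num_cols, function(v) sum(is.na(dt[[v]])), integer(1)),
    n_inf = vapply(num_cols, function(v) sum(is.infinite(dt[[v]])), integer(1)),
    share_zero = vapply(num_cols, function(v) mean(dt[[v]] == 0, na.rm = TRUE), numeric(1)),
    row.names = NULL
  )
}

# Made-up example data
dt <- data.frame(
  date = as.Date("2021-01-04") + 7 * 0:4,
  tv_spend = c(0, 0, 0, 120, 95),
  search_spend = c(10, NA, 12, Inf, 9)
)
summarise_problem_values(dt)
```

Columns with a high `share_zero`, or any non-zero `n_na` or `n_inf`, are the first candidates to clean before re-running the model.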

@tgtod002
Author

This is the new error I am getting. Also, no plots or files were output.
Using robyn object location: C:/Users todor/OneDrive - Vericast/Documents/AFW_MMM_Model
Provided 'plot_folder' doesn't exist. Using default 'plot_folder = getwd()': C:/Users/ttodor/OneDrive - Vericast/Documents/AFW_MMM_Model

Error in RNGseq(n, seed, ..., version = if (checkRNGversion("1.4") >= :
NMF::createStream - invalid value for 'n' [positive value expected]
In addition: Warning message:
In min(coef) : no non-missing arguments to min; returning Inf

@tgtod002
Author

I re-ran robyn_run() with outputs = FALSE, then re-ran robyn_outputs(). I am back to the original error message:
Using robyn object location: C:/Users todor/OneDrive - Vericast/Documents/AFW_MMM_Model
Provided 'plot_folder' doesn't exist. Using default 'plot_folder = getwd()': C:/Users/ttodor/OneDrive - Vericast/Documents/AFW_MMM_Model

Running Pareto calculations for 10000 models on 3 fronts...
Error in robyn_pareto(InputCollect, OutputModels, pareto_fronts, calibration_constraint) :
object 'dt_expoCurvePlot' not found

@gufengzhou
Contributor

The dt_expoCurvePlot not found bug should be fixed now. Please update and test the package. 9b77b99

@tgtod002
Author

@gufengzhou, I re-installed Robyn and re-ran the code. I am still getting the same error message:

Using robyn object location: C:/Users todor/OneDrive - Vericast/Documents/AFW_MMM_Model
Provided 'plot_folder' doesn't exist. Using default 'plot_folder = getwd()': C:/Users/ttodor/OneDrive - Vericast/Documents/AFW_MMM_Model

Running Pareto calculations for 10000 models on 3 fronts...
Error in robyn_pareto(InputCollect, OutputModels, pareto_fronts, calibration_constraint) :
object 'dt_expoCurvePlot' not found

@laresbernardo
Collaborator

@tgtod002 If you could share your script and anonymized data with me (laresbernardo @gmail.com) so I can replicate it, it would definitely help me debug this issue. Before that, make sure you are really running the latest version in a fresh new session to see if the error persists.

@tgtod002
Author

Yes, I am running this version:
packageVersion("Robyn")
[1] ‘3.6.0’
I will send it to you shortly.

@FDwangchao

@laresbernardo @tgtod002
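In the pareto.R script, at line 269, at the end of the code block, add the following code and then re-install Robyn; after that it runs through without this error.
image
My guess at the cause: paid_media_vars and exposure_vars in InputCollect are the same, so the code inside the block is never executed. The object is therefore never defined outside the block, which triggers the error.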

@laresbernardo
Collaborator

Hi, @FDwangchao thanks for your suggestion. Can you please update to the most recent version and check that the fix Gufeng deployed also works for you? He added dt_expoCurvePlot <- NULL instead, which should be enough to fix this problem. Thanks
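
For anyone following along, the failure mode and the effect of a `NULL` default can be reproduced with a small stand-in. This is only a sketch of the pattern, not Robyn's actual code: `build_curve_data` and its `init_null` switch are hypothetical, and only the object name `dt_expoCurvePlot` comes from the error message.

```r
# Hypothetical stand-in for the pattern discussed above; not Robyn's source.
build_curve_data <- function(paid_media_vars, exposure_vars, init_null = TRUE) {
  if (init_null) {
    dt_expoCurvePlot <- NULL # default so the object always exists
  }
  if (!identical(paid_media_vars, exposure_vars)) {
    # Only reached when the exposure variables differ from the paid media variables
    dt_expoCurvePlot <- data.frame(channel = exposure_vars)
  }
  dt_expoCurvePlot
}

vars <- c("tv", "search")
build_curve_data(vars, vars)                         # returns NULL, no error
try(build_curve_data(vars, vars, init_null = FALSE)) # object 'dt_expoCurvePlot' not found
```

Without the default, the object only exists when the conditional branch runs, so identical variable sets hit the "not found" error at the first later use.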

laresbernardo self-assigned this on Feb 28, 2022

@tgtod002
Author

tgtod002 commented Feb 28, 2022

Hi Lares, the error occurs when running robyn_outputs(). I am actually just following the demo.

Here is my code

OutputModels <- robyn_run(
  InputCollect = InputCollect # feed in all model specification
  #, cores = NULL # default
  #, add_penalty_factor = FALSE # Untested feature. Use with caution.
  , iterations = 2000 # recommended for the dummy dataset
  , trials = 5 # recommended for the dummy dataset
  , outputs = FALSE # outputs = FALSE disables direct model output
)

OutputCollect <- robyn_outputs(
  InputCollect, OutputModels
  , pareto_fronts = 3
  , csv_out = "all" # "pareto" or "all"
  , clusters = TRUE # Set to TRUE to cluster similar models by ROAS. See ?robyn_clusters
  , plot_pareto = FALSE # Set to FALSE to deactivate plotting and saving model one-pagers
  , plot_folder = robyn_object # path for plots export
)

@laresbernardo
Collaborator

laresbernardo commented Feb 28, 2022

Sorry @tgtod002 but I can't seem to replicate the issue with the demo.R file and your code:

> OutputModels <- robyn_run(
+   InputCollect = InputCollect # feed in all model specification
+   #, cores = NULL # default
+   #, add_penalty_factor = FALSE # Untested feature. Use with caution.
+   , iterations = 2000 # recommended for the dummy dataset
+   , trials = 5 # recommended for the dummy dataset
+   , outputs = FALSE # outputs = FALSE disables direct model output
+ )
Input data has 208 weeks in total: 2015-11-23 to 2019-11-11
Initial model is built on rolling window of 92 week: 2016-11-21 to 2018-08-20
Using geometric adstocking with 19 hyperparameters (19 to iterate + 0 fixed) on 16 cores
>>> Starting 5 trials with 2000 iterations each using TwoPointsDE nevergrad algorithm...
  Running trial 1 of 5
  |==========================================================================================================| 100%
  Finished in 1.65 mins
  Running trial 2 of 5
  |==========================================================================================================| 100%
  Finished in 1.68 mins
  Running trial 3 of 5
  |==========================================================================================================| 100%
  Finished in 1.67 mins
  Running trial 4 of 5
  |==========================================================================================================| 100%
  Finished in 1.62 mins
  Running trial 5 of 5
  |==========================================================================================================| 100%
  Finished in 1.45 mins
- DECOMP.RSSD NOT converged: [email protected] 0.068 >= 0.059 & [email protected] 0.15 < 0.22 [email protected]*sd
- NRMSE converged: [email protected] 0.0047 < 0.046 & [email protected] 0.059 < 0.089 [email protected]*sd
Total run time: 8.16 mins
> OutputCollect <- robyn_outputs(
+   InputCollect, OutputModels
+   , pareto_fronts = 3
+   , csv_out = "all" # "pareto" or "all"
+   , clusters = TRUE # Set to TRUE to cluster similar models by ROAS. See ?robyn_clusters
+   , plot_pareto = FALSE # Set to FALSE to deactivate plotting and saving model one-pagers
+   , plot_folder = robyn_object # path for plots export
+ )
Using robyn object location: /Users/bernardolares/Desktop
>>> Running Pareto calculations for 10000 models on 3 fronts...
>>> Collecting 154 pareto-optimum results into: /Users/bernardolares/Desktop/2022-02-28 14.04 init/
>> Exporting all results as CSVs into directory...
>> Exporting general plots into directory...
>>> Calculating clusters for model selection using Pareto fronts...
>> Auto selected k = 6 (clusters) based on minimum WSS variance of 5%

What hyperparameters are you using? Are you literally running the demo or did you adapt those hyperparameters to be used with your own data?

Please, update to the latest version, and DO make sure you're using the latest version in a fresh new session.

laresbernardo changed the title from "Error running Robyn(outputs)" to "Error running robyn_outputs()" on Feb 28, 2022

@tgtod002
Author

Hi Lares,

Just making sure: are you using my sample file? I only have 105 weeks of data.

@laresbernardo
Collaborator

laresbernardo commented Feb 28, 2022 via email

@tgtod002
Author

tgtod002 commented Feb 28, 2022

InputCollect <- robyn_inputs(
  dt_input = afw_data1
  ,dt_holidays = dt_prophet_holidays
  
  ### set variables
  
  ,date_var = "new_date"                               # date format must be "2020-01-01"
  ,dep_var = "SALES.AMT"                              # there should be only one dependent variable
  ,dep_var_type = "revenue"                         # "revenue" or "conversion"
  
  ,prophet_vars = c("trend", "season", "holiday") #               
  ,prophet_signs = c("default", "default", "default")       # c("default", "positive", and "negative").
  ,prophet_country = "US"                           # only one country allowed once. Including national s

  ,paid_media_vars = c("CrossDeviceDisplayimpressions",  "Cross_device_impressions", "DirectMailMediaimpressions", "DynamicMobileimpressions"	)
  ,paid_media_signs = c("positive", "positive", "positive", "positive")            #   
  ,paid_media_spends = c("CrossDevideDisplay_media_cost",  "Cross_dev_video_cost", "DirectMailInsert_media_cost", "DynamicMobilemedia_cost" 	)         
  ### set model parameters
  
  ## set cores for parallel computing
  ,cores = 6 
  
  ## set rolling window start
  ,window_start = "2020-01-04"
  ,window_end = "2021-12-27"
  
  ## set model core features
  ,adstock = "geometric"            
  ,iterations = 2000  
  ,intercept_sign = "non_negative" # intercept_sign input must be any of: non_negative, unconstrained
  ,nevergrad_algo = "TwoPointsDE" 
  ,trials = 5 # number of allowed trials. 5 is recommended without calibration,

)

## 3. Hyperparameter interpretation & recommendation:

hyper_names(adstock = InputCollect$adstock, all_media = InputCollect$all_media)

hyperparameters <- list(
  
  Cross_dev_video_cost_alphas = c(0.5, 3)      
  ,Cross_dev_video_cost_gammas = c(0.3, 1)      
  ,Cross_dev_video_cost_thetas =  c(0, 0.3)
  
  
  ,CrossDevideDisplay_media_cost_alphas = c(0.5, 3) 
  ,CrossDevideDisplay_media_cost_gammas = c(0.3, 1)  
  ,CrossDevideDisplay_media_cost_thetas =  c(0, 0.3)
  
  ,DirectMailInsert_media_cost_alphas = c(0.5, 3)
  ,DirectMailInsert_media_cost_gammas = c(0.3, 1)
  ,DirectMailInsert_media_cost_thetas = c(0.1, 0.4)
  
  ,DynamicMobilemedia_cost_alphas = c(0.5, 3) 
  ,DynamicMobilemedia_cost_gammas =  c(0.3, 1)  
  , DynamicMobilemedia_cost_thetas   = c(0, 0.3)  
 )


#### 2a-3: Third, add hyperparameters into robyn_inputs()

InputCollect <- robyn_inputs(InputCollect = InputCollect, hyperparameters = hyperparameters)

@laresbernardo
Collaborator

laresbernardo commented Feb 28, 2022

Ok, thanks for sharing. A couple of comments:

  • Your data contains long runs of consecutive zeros. When checking the data, I see loads of zeroes in the first sets of weeks. Please change to a narrower window and/or clean your data. That is definitely a problem. Notice the warnings printed:
Recommendations are: 
1. increase hyperparameter ranges for 0-coef channels to give Robyn more freedom
2. split media into sub-channels, and/or aggregate similar channels, and/or introduce other media
3. increase trials to get more samples
  • You are not using Measure_1__Visits_ data anywhere. Is that intentional?
  • Notice that 'cores', 'iterations', 'trials', 'intercept_sign', 'nevergrad_algo' should be set in robyn_run(), not robyn_inputs().
  • Regarding the "Michaelis-Menten fitting for X out of range. Using lm instead", check this answer.
  • With your particular input dataset, I noticed a small bug when calculating convergence in the presence of Inf values, which I've just fixed.
  • Lastly, I'm sorry but with your code and your inputs I wasn't able to reproduce the mentioned (or any) error. Please, DO make sure you're running on the latest version. I'm attaching how I wanted you to kindly share your script so I can replicate the error (simply pasted it all together). Try running it for me (with updated Robyn version) and see if you get any errors: tgtod002-318.R.zip.
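
One way to spot the leading zeros mentioned in the first point is to look for the first week in which every media variable is non-zero, and treat that date as a candidate for `window_start`. The sketch below uses made-up data and a hypothetical helper, `first_active_date`; it is base R only and not a Robyn function.

```r
# Hypothetical helper (not a Robyn function): first date on which all of the
# given media columns are non-zero and non-missing.
first_active_date <- function(dt, date_var, media_vars) {
  inactive <- is.na(dt[media_vars]) | dt[media_vars] == 0
  active <- rowSums(inactive, na.rm = TRUE) == 0
  if (!any(active)) {
    return(as.Date(NA))
  }
  min(dt[[date_var]][active])
}

# Made-up weekly data with zeros in the first weeks
dt <- data.frame(
  new_date = as.Date("2020-01-04") + 7 * 0:5,
  channel_a = c(0, 0, 5, 7, 6, 8),
  channel_b = c(0, 0, 0, 3, 4, 2)
)
first_active_date(dt, "new_date", c("channel_a", "channel_b"))
```

This is only a starting point: channels can legitimately go dark later in the series, so the result should be checked against the data rather than used blindly.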

@tgtod002
Author

tgtod002 commented Mar 1, 2022

Thanks Lares. What I am trying to do is run the program with some data as a test. So, my data is still WIP. Given that, I noticed that I get an error running robyn_outputs when I set outputs = TRUE in robyn_run
The following is the error:
Using robyn object location: C:/Users todor/OneDrive - Vericast/Documents/AFW_MMM_Model
Provided 'plot_folder' doesn't exist. Using default 'plot_folder = getwd()': C:/Users/ttodor/OneDrive - Vericast/Documents/AFW_MMM_Model

Running Pareto calculations for 10000 models on 3 fronts...
Error in eval(jsub, SDenv, parent.frame()) : object 'iterNG' not found

If I set outputs = FALSE in robyn_run, robyn_outputs works.

I want to be able to look at the one-pagers, which are created when you set outputs = TRUE.

@laresbernardo
Collaborator

laresbernardo commented Mar 1, 2022

Yeah, makes sense.

  • I notice you are trying to save results in C:/Users todor/... and you missed a "/" there.
  • Ideally, you'd first run robyn_run(..., outputs = FALSE) and then robyn_outputs(...) to export results. That way you don't have to re-run simulations to select the outputs you'd like.
  • Can you share your head(OutputModels$trial1$resultCollect$resultHypParam) (given OutputModels as robyn_run(..., outputs = FALSE) output)?
  • Having outputs = TRUE is exactly the same as running robyn_outputs(), so you'll get the one-pagers.
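
On the path point, a small guard before calling robyn_outputs() can catch a mistyped folder early instead of silently falling back to getwd(). The sketch below is base R only; `ensure_folder` is a hypothetical helper, and `tempdir()` stands in for the real project location.

```r
# Hypothetical helper (not part of Robyn): create the export folder if it is
# missing and return its normalized path.
ensure_folder <- function(path) {
  if (!dir.exists(path)) {
    dir.create(path, recursive = TRUE)
  }
  normalizePath(path, winslash = "/")
}

# Build the path from its parts so a separator cannot be dropped by hand.
plot_folder <- ensure_folder(file.path(tempdir(), "AFW_MMM_Model"))
dir.exists(plot_folder)
```

The resulting `plot_folder` value can then be passed as the `plot_folder` argument shown earlier in this thread.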

@laresbernardo
Collaborator

Feel free to re-open if this particular error or issue persists after updating to 3.6.2 @tgtod002

@FDwangchao

FDwangchao commented Oct 11, 2022 via email
