vsl.ml: Random Forest #126
Comments
I've been making some progress on this in my own fork.
@BMJHayward feel free to send me a Draft Pull Request or the link to the branch where you have your changes so I can suggest how to do it 😊
Thanks @ulises-jeremias, here's my current branch, not quite ready for a PR: I thought I would write a proper decision tree implementation and use that in the RF as well, but using the
Yeah, I think it is not enough. The
I just updated latest master, adding some methods. There are two ways you can do this now:

    mut new_data := data.clone_with_same_x()
    new_data.set_y(new_index_y)?

or, if you want X to be a new instance of la.Matrix, you can have multiple instances of Data by doing

    mut data_with_new_index := data.clone()
    data_with_new_index.set_y(new_index_y)?
@BMJHayward ^^
Excellent, thank you. I'll take a look over the weekend.
@BMJHayward hey! Did that work? Is there anything else I can do to help?
@ulises-jeremias hi, thanks for following up on this. The lynchpin for me is in I can't figure out how to do this yet and maintain a consistent interface using I'm also busy with family and renovations on the house at the moment, so it might be better if someone else takes this on and I can consult or something. I'm happy to do it, it just won't be quick.
I was just looking for Cox Regression and Random Forest in VSL, which brought me here. I wonder if there are any plans for Cox Regression and perhaps a few others from https://github.com/shankarpandala/lazypredict.
Also, it seems VSL so far does not support a "stop & resume" operation, which is acutely needed for fully automated "checkpointing to HDD & recovery from HDD" in long-running apps (which often fail due to full memory, stalls, etc. and need to be restarted, paying for tens of hours of identical computation again and again...). Any plans for such a "stop & resume" API? Of course, it has to be weighed against performance, so maybe it could be tied to time: by default, roughly every 10 seconds the computation would be interrupted and saved to a user-defined location. IDK
Hey! Don't rush with it. Family is more important 😊 About the question, I think calling set_y multiple times is OK as long as the .clone() method is used 👌🏻
lazypredict is great! We will probably add more models over time 👌🏻 Regarding the checkpointing, I hadn't thought about it. We can probably add it in the near future. I will think about it and try to figure out the best way to do it, probably by creating .h5 files on some iterations.
Yep, .h5 is fine. Maybe, to avoid slowing down the computation, we could just fork the process (i.e. delegate copy-on-write of all the structs with data to the operating system, as e.g. Redis does) so it takes negligible time, and then save it to disk. The data might easily be hundreds of MB or more, so not doing it fully in parallel could slow down the computation too much (and V's threading support is probably not enough as it would involve
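To make the fork idea above concrete, here is a minimal sketch in V. It is only an illustration under stated assumptions, not anything that exists in VSL: `TrainState`, `save_snapshot` and `load_snapshot` are hypothetical names, the snapshot is written as JSON rather than .h5 to keep the example dependency-free, and it relies on `os.fork()`, so it is POSIX-only.

```v
import os
import json

// Hypothetical training state; real VSL models hold different fields.
struct TrainState {
mut:
	iteration int
	weights   []f64
}

// save_snapshot forks the process. The child sees a copy-on-write snapshot
// of `state`, writes it to disk and exits; the parent returns immediately.
// Returns the child's pid, or a negative value if the fork failed.
fn save_snapshot(state TrainState, path string) int {
	pid := os.fork()
	if pid == 0 {
		// Child: write to a per-process temp file, then rename it into place
		// so a crash mid-write never leaves a truncated checkpoint behind.
		tmp := '${path}.${os.getpid()}.tmp'
		os.write_file(tmp, json.encode(state)) or { exit(1) }
		os.rename(tmp, path) or { exit(1) }
		exit(0)
	}
	return pid
}

// load_snapshot restores the state written by save_snapshot.
fn load_snapshot(path string) !TrainState {
	text := os.read_file(path)!
	return json.decode(TrainState, text)
}

fn main() {
	path := os.join_path(os.temp_dir(), 'rf_checkpoint.json')
	mut state := TrainState{
		weights: []f64{len: 4}
	}
	for i in 0 .. 100 {
		// Stand-in for one step of a long-running computation.
		state.iteration = i
		state.weights[i % 4] += 0.5
		if i % 25 == 0 {
			if save_snapshot(state, path) < 0 {
				eprintln('fork failed; skipping this checkpoint')
			}
		}
	}
	// Reap the snapshot children before reading the file back.
	for {
		if os.wait() <= 0 {
			break
		}
	}
	restored := load_snapshot(path) or { panic(err) }
	println('resumable from iteration ${restored.iteration}')
}
```

The parent only pays for the `fork` call itself; the operating system copies pages lazily as the parent keeps mutating them. In this sketch, snapshot children may overlap and finish out of order, so a real implementation would probably wait for the previous child before starting a new one, and would trigger snapshots on a timer instead of a fixed iteration count.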
I wonder if there is any news regarding Cox Regression, Random Forest, and .h5 checkpointing. I could not find anything in the commits. But no pressure, I just want to regularly get up to date 😉.
Any news? Especially the checkpointing seems highly beneficial to everybody (compared to Cox Regression and Random Forest).
Still interested in this, as it would allow me to start recommending V (VSL) within my bubble 😉.
Describe the feature
We want to create a new model in vsl.ml to do classification using the Random Forest algorithm. That model should follow the following interfaces:
With the following methods
Use Case
Proposed Solution
Other Information
Acknowledgements
Version used
Environment details (OS name and version, etc.)