How to control the max length of a genome and several other issues? #158
Thanks so much for these bug reports @Y1fanHE!
Good catch. To address this we would need to do a few things.
If you are interested in working on a PR for these changes, I would be happy to assist. Otherwise, I can add it to my list of things to do for this project and try to get around to them at some point.
Another good catch! I see you have already opened PR #159 to fix this issue. Thank you! We can discuss this further there.
I don't have any data on this at the moment. One of the things on the backlog of this project is to add a script for launching multiple runs of common benchmark problems and gathering statistics such as runtime and solution rates. This is a larger task that isn't well defined yet, but I am happy to offer assistance if you feel motivated to take it on.
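The kind of script described above could start from something like the following minimal sketch. It is not part of pyshgp: `run_once` is a hypothetical callable that performs one complete run for a given seed and returns whether a solution was found, and `toy_run` is only a stand-in so the example executes on its own.

```python
import random
import statistics
import time
from typing import Callable, Dict, List


def run_benchmark(run_once: Callable[[int], bool], n_runs: int) -> Dict[str, float]:
    """Call `run_once(seed)` n_runs times and summarize runtime and solution rate.

    `run_once` is a hypothetical callable (an assumption of this sketch): it
    performs one full evolutionary run and returns True if a solution was found.
    """
    runtimes: List[float] = []
    successes = 0
    for seed in range(n_runs):
        start = time.perf_counter()
        solved = run_once(seed)
        runtimes.append(time.perf_counter() - start)
        successes += int(solved)
    return {
        "runs": n_runs,
        "solution_rate": successes / n_runs,
        "mean_runtime_s": statistics.mean(runtimes),
        "max_runtime_s": max(runtimes),
    }


def toy_run(seed: int) -> bool:
    """Stand-in for a real run: 'succeeds' with probability about 0.5."""
    return random.Random(seed).random() < 0.5


if __name__ == "__main__":
    print(run_benchmark(toy_run, n_runs=10))
```

In a real harness, `toy_run` would be replaced by a function that configures and launches one run of a benchmark problem, and the per-run records would likely be written to disk rather than only summarized.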
Hello @erp12. Thank you for the reply. Yes, I would like to work on the max genome size changes.
I think this could be an additional parameter, since in GP the genome can usually grow longer than its initial length. For the bug report and the test case, I have already made some comments in PR #159.
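As a rough illustration of what such an additional parameter might do, here is a minimal sketch of one possible policy: a child produced by variation is kept only if it fits within the limit, and otherwise the parent is kept instead. The names `limit_genome_size` and `max_genome_size` are hypothetical and are not taken from pyshgp or Clojush; truncating the child would be another option.

```python
from typing import List, Sequence, TypeVar

Gene = TypeVar("Gene")


def limit_genome_size(
    parent: Sequence[Gene],
    child: Sequence[Gene],
    max_genome_size: int,
) -> List[Gene]:
    """Return the child if it fits within the limit, otherwise fall back to the parent.

    All names here are hypothetical; this is one policy among several.
    """
    if len(child) <= max_genome_size:
        return list(child)
    return list(parent)


if __name__ == "__main__":
    parent = ["a", "b"]
    print(limit_genome_size(parent, ["a", "b", "c"], max_genome_size=3))
    print(limit_genome_size(parent, ["a", "b", "c", "d"], max_genome_size=3))
```

Falling back to the parent keeps every genome valid without changing the variation operators themselves, at the cost of occasionally wasting a variation step.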
I think this is a good idea. Since I eventually will run experiments using this
Also, I saw in the GECCO paper on this library that implementing the stacks in another language is listed as future work. Do you plan to do this soon? Otherwise I can try it, though I have no prior experience with this.
That's great! I think your decision to use an additional parameter is good. To guide your work on this feature, I think the best place to implement the limit is in the
Sounds good. If you think your script is generic enough to be usable by a wider set of users and you would like to take the time to contribute it to the project, feel free to open a PR and we can discuss further.
A few years ago I made an attempt to implement the stacks as a lightweight C++ data structure. Unfortunately I was not knowledgeable enough about how to interface between Python and C++ without sacrificing either performance or flexibility. I am also concerned about increasing the complexity of the build and distribution processes once the project moves to a multi-language codebase. I am happy to consider an architecture for a faster Push interpreter, including implementations in other languages, if you have a specific proposal in mind.
Some other possibilities for getting faster runtimes could be to try running pyshgp on PyPy or after compiling the project to C using mypyc. I have not tried either of these technologies yet.
Lastly, I should mention that a single PushGP run on the benchmark problems used in the literature typically takes hours, regardless of implementation language. This recent paper about a new set of benchmark problems (similar to "small-or-large") did an informal measurement of runtime and found that most problems required an average runtime of over 5 hours on modest hardware.
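Before committing to PyPy or mypyc, one cheap check is to time a small stack-heavy loop under each option. The sketch below is a toy stand-in and not pyshgp code: `stack_workload` just pushes and pops integers on a list, so any speedup it shows is only a rough hint about how a real Push interpreter might behave.

```python
import platform
import time


def stack_workload(n_steps: int) -> int:
    """Toy stand-in for an interpreter loop: pop two integers, push two back."""
    stack = [1, 1]
    for _ in range(n_steps):
        a = stack.pop()
        b = stack.pop()
        stack.append(a)
        stack.append((a + b) % 1_000_003)
    return stack[-1]


def time_workload(n_steps: int, repeats: int = 3) -> float:
    """Return the best wall-clock time in seconds over `repeats` runs."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        stack_workload(n_steps)
        best = min(best, time.perf_counter() - start)
    return best


if __name__ == "__main__":
    # Run the same file under different interpreters and compare the numbers.
    print(platform.python_implementation(), time_workload(200_000))
```

Running the same file with `python` and with `pypy3` prints the interpreter name next to the timing, which makes the two results easy to compare.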
Hello @erp12 @lspector ,
I am very grateful for your work on PyshGP!
I am doing my research project on PushGP and I am trying to use this Python library.
I found that there seems to be no control parameter for the maximum genome length (as opposed to the one for the initial genome length). This parameter seems to be named max-points in Clojush.
In pyshgp/push/stack.py, you commented at Line 48: "# Collection sizes and string lengths are bounded to avoid utilizing too many resources." However, in pyshgp/push/types.py, you did not set is_collection=True for the PushStrType. This raises a MemoryError when I try some string-related tasks, such as "small or large".
I would like to ask how long it takes you on average to run experiments on the general program synthesis benchmark problems. It takes me a long time to run 1000 x 300 evaluations (about 2.5 h for "small or large" with parallelism across 6 cores).
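To illustrate why an unbounded string type can exhaust memory, and what bounding might look like, here is a minimal sketch. It does not reproduce how pyshgp coerces values: `MAX_STRING_LENGTH`, `bounded_str`, and `bounded_concat` are hypothetical names, and the limit value is chosen only for illustration.

```python
MAX_STRING_LENGTH = 1000  # hypothetical limit, chosen only for illustration


def bounded_str(value: str, limit: int = MAX_STRING_LENGTH) -> str:
    """Truncate a string so that it never exceeds `limit` characters."""
    return value[:limit]


def bounded_concat(a: str, b: str, limit: int = MAX_STRING_LENGTH) -> str:
    """Concatenate two strings without building more than `limit` characters."""
    a = a[:limit]
    return a + b[: limit - len(a)]


if __name__ == "__main__":
    # An evolved program can easily concatenate a string with itself in a loop.
    # Unbounded, 100 doublings of a 2-character string would need 2 * 2**100
    # characters; with the bound, the length stays capped.
    s = "ab"
    for _ in range(100):
        s = bounded_concat(s, s, limit=10)
    print(len(s))
```

Truncating the operands before concatenating matters here, because building the full result first and trimming it afterwards could still trigger the very allocation the bound is meant to prevent.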