-
I ended up using Redis SUBSCRIBE as a buffered stream for URL input, with the Go binary acting as the consumer. There's probably a simpler way to do this, but I'm happy it's working, and the memory footprint seems smaller than when I used Puppeteer.
-
Yes, one goroutine per URL. There's no need for Redis on a single machine; use Redis or another DB for a distributed system. The question is too general to answer in detail here, but you can find the specifics by googling.
-
I'm new to concurrency. Say I want to implement a page crawler using the page pool, like the example at https://github.com/go-rod/rod/blob/master/examples_test.go#L532, but modified to accept an array of URLs and do the crawling. Is it best practice to just create a goroutine for every URL I want to crawl, or do I need to implement another pool for this?
Looking at the process in htop, it doesn't actually create a hundred threads as I thought it would, and it looks like Go uses all the cores of the current CPU. Is there a way to limit how many cores the process uses? Thanks 🙏