Not downloading images #2

Open
zazencodes opened this issue Feb 16, 2019 · 6 comments

Comments

@zazencodes
Contributor

I cannot get the script working. Here's the output I get:

=====================================
=== Google Arts & Culture crawler ===
=====================================
Provide image URL
sample url: https://artsandculture.google.com/asset/madame-moitessier/hQFUe-elM1npbw
> URL: https://artsandculture.google.com/asset/madame-moitessier/hQFUe-elM1npbw
=====================================
Provide image maximum SIZE
sample size: 12000 (recommended)
> SIZE: 12000
=====================================
> Opening website
> Downloading partial images..
> Downloaded 0 partial images
> Saving partial images as final image
FAILED
integer division or modulo by zero

As you can see, no images are downloaded. Looking at the chromedriver window, I don't see any images on the screen. Is that expected?

What version of chromedriver are you using, and can you confirm this script still works for you?
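
The "integer division or modulo by zero" message above appears right after "Downloaded 0 partial images", which suggests the stitching step runs even when nothing was downloaded. A minimal sketch of an early guard follows. The function name `list_partial_images` is hypothetical and not part of the crawler, and the `blobs` directory name is assumed from the error message later in this thread.

```python
from pathlib import Path


def list_partial_images(blob_dir="blobs"):
    """Return the downloaded tile paths, failing early if there are none.

    Hypothetical helper: the real script may track its tiles differently.
    """
    # Plain name sort; the real script may need a numeric ordering instead.
    tiles = sorted(Path(blob_dir).glob("*.jpg"))
    if not tiles:
        # Raise a readable error instead of letting a later division by
        # the tile count fail with ZeroDivisionError.
        raise RuntimeError(
            f"No partial images found in {str(blob_dir)!r}; "
            "the page may not have finished loading."
        )
    return tiles
```

Calling this before the save step would turn the opaque arithmetic error into a message that points at the actual problem: the download produced no tiles.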

@zazencodes
Contributor Author

Sometimes I have a bit more luck and instead get an error message like this:

> Downloading partial images..
FAILED
cannot identify image file 'blobs/17.jpg'

I have yet to get it working.

@piotrantosz
Owner

piotrantosz commented Feb 16, 2019 via email

@piotrantosz
Owner

The 'cannot identify image file' error sometimes occurs on slow networks. I will take a closer look at it.

@zazencodes
Contributor Author

zazencodes commented Feb 17, 2019

Thanks for taking a look. When I got the "cannot identify image" error, I opened the blob file (e.g. 17.jpg), and it turns out it's HTML with a 404 error:

404. That’s an error.

The requested URL was not found on this server.

It seems that some of the blobs are not loading properly.
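
Since the failing blob turned out to be an HTML error page saved under a `.jpg` name, one inexpensive check is to look at the first bytes of each file. This is only a sketch: it assumes the blobs are meant to be JPEG files (as the extension suggests) and relies on the fact that JPEG data starts with the bytes `FF D8 FF`. The helper names `looks_like_jpeg` and `find_bad_blobs` are hypothetical and not part of the crawler.

```python
from pathlib import Path

# Every JPEG stream begins with the SOI marker (FF D8) followed by FF.
JPEG_MAGIC = b"\xff\xd8\xff"


def looks_like_jpeg(path):
    """Return True if the file starts with the JPEG signature."""
    with open(path, "rb") as fh:
        return fh.read(len(JPEG_MAGIC)) == JPEG_MAGIC


def find_bad_blobs(blob_dir="blobs"):
    """List .jpg files in blob_dir that do not start with JPEG data.

    An HTML 404 page saved as 17.jpg would be reported here.
    """
    return [p for p in sorted(Path(blob_dir).glob("*.jpg"))
            if not looks_like_jpeg(p)]
```

This only inspects the file signature, so a truncated JPEG would still pass; it is meant to catch the "HTML instead of image" case described above, not to validate the image fully.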

@zazencodes
Contributor Author

I got it working today :)

Maybe the issue above was due to a slow network (either on my end or on the Google Arts servers).

@piotrantosz
Owner

Yep, it looks like network speed is breaking the blobs. We can't really check whether an image was loaded, as it's really hard to parse. I guess the only option is to catch the exception and retry the download.

Thank you for your contribution.
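
The "catch the exception and retry" idea above could be sketched roughly as follows. This is not the crawler's actual code: `retry` is a hypothetical helper, and `fetch` stands in for whatever function downloads and opens a blob. The attempt count and delay are arbitrary defaults.

```python
import time


def retry(fetch, attempts=3, delay=1.0, exceptions=(Exception,)):
    """Call fetch() until it succeeds or the attempts run out.

    fetch      -- zero-argument callable (hypothetical download step)
    attempts   -- maximum number of calls to fetch
    delay      -- base wait in seconds; grows linearly with each attempt
    exceptions -- exception types that should trigger a retry
    """
    last_error = None
    for attempt in range(1, attempts + 1):
        try:
            return fetch()
        except exceptions as err:
            last_error = err
            if attempt < attempts:
                # Wait a little longer after each failure so a slow
                # network has time to recover.
                time.sleep(delay * attempt)
    raise RuntimeError(f"Gave up after {attempts} attempts") from last_error
```

In practice it would be safer to pass a narrow exception tuple (for example the error raised when an image file cannot be identified) than to retry on every `Exception`.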
