Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix fingerprinting import issue #317

Open
wants to merge 13 commits into
base: master
Choose a base branch
from

Conversation

J-CPelletier
Copy link

Goal

The goal of this PR is to do the changes explained here: #311 (comment) and to solve #316

I do this by copying the Scrapy code for fingerprinting to keep backwards compatibility, and I also added a new fingerprinter using the newer function.

This is done for backward compatibility purposes, since we want the same
behavior as the function deleted in Scrapy 2.12.0.
Scrapy 2.12.0 added JsonResponse, which is the class any JSON response
is instantiated as. We therefore needed to use it for our test.
@J-CPelletier
Copy link
Author

J-CPelletier commented Jan 9, 2025

After testing the master branch on my end with this change to the setup.py to allow for things to run like before:
install_requires=['scrapy==2.11.2', 'six'],
I see that most integrations tests fail in the exact same way that they do in this PR. Do we want to fix them in this PR or in another one? This seems more like a blocker to us being able to release a working version in this situation.

@Gallaecio
Copy link
Contributor

Do we want to fix them in this PR or in another one?

I think whatever works best for you.

Scrapy removed support for Python 3.8
@J-CPelletier
Copy link
Author

I think whatever works best for you.

The integration tests seem to working in the CI, although I haven't been able to reproduce it locally (through the commands directly or using act). The issue had to do with setting up the MockServer and getting Connection Refused in the test, but if it works in the Actions it's not a big deal.

This should now be ready for code reviews.

scrapy_splash/cache.py Outdated Show resolved Hide resolved
scrapy_splash/request.py Outdated Show resolved Hide resolved
Copy link
Contributor

@Gallaecio Gallaecio left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me. The only thing I think is missing is updating the README accordingly, to replace the old configuration instructions with the new ones.

@J-CPelletier
Copy link
Author

I just did the fixes you requested, let me know if I missed something!

@wRAR
Copy link

wRAR commented Jan 14, 2025

TypeError: SplashRequestFingerprinter.__init__() missing 1 required positional argument: 'crawler'

@J-CPelletier
Copy link
Author

TypeError: SplashRequestFingerprinter.__init__() missing 1 required positional argument: 'crawler'

I just fixed that, the tests should now pass

@fliot
Copy link

fliot commented Jan 18, 2025

This CR is a great update, thanks !
+1

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants