Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Some fixes and improvements #19

Closed
wants to merge 11 commits into from
5 changes: 3 additions & 2 deletions pyproject.toml
Original file line number Diff line number Diff line change
Expand Up @@ -16,12 +16,13 @@ classifiers = [
"Programming Language :: Python :: 3.12",
"Programming Language :: Python :: 3.13",
]
requires-python = ">=3.10"
requires-python = ">=3.9"
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hey, this change needs to be reverted if you want this PR to get merged - I am sticking with >=3.10 for the X | Y union syntax for type annotations.

BTW, sorry for the delay in responding, I need to do some further research to understand the websockets==14.0 changes.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can I replace|with typing.Union ? I need 3.9 compatibility …

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, I recreated MR, (can't find way to change branch in already created MR), I can't rollback python 3.9 adaptation in main and created MR with other branch : #22

dependencies = [
"requests",
"asyncio-atexit>=1.0.1",
"deprecated>=1.2.14",
"mss>=9.0.2",
"websockets>=13.1,<14",
"websockets>=13.1",
]

[build-system]
Expand Down
3 changes: 3 additions & 0 deletions zendriver/cdp/network.py
Original file line number Diff line number Diff line change
Expand Up @@ -1491,6 +1491,9 @@ def to_json(self) -> T_JSON_DICT:

@classmethod
def from_json(cls, json: T_JSON_DICT) -> CookiePartitionKey:
if isinstance(json, str):
# Some chrome versions return partitionKey as string
return cls(json, False)
Comment on lines +1494 to +1496
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

just out of curiosity are there any docs on this?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I didn't find something in docs, simple logged CDP message and got from practice usage of chrome >=129

return cls(
top_level_site=str(json["topLevelSite"]),
has_cross_site_ancestor=bool(json["hasCrossSiteAncestor"]),
Expand Down
16 changes: 13 additions & 3 deletions zendriver/core/browser.py
Original file line number Diff line number Diff line change
Expand Up @@ -276,6 +276,11 @@ async def get(
await connection.sleep(0.25)
return connection

async def communicate(self) -> tuple[bytes, bytes]:
if self._process is None:
raise ValueError("Browser process not running")
return await self._process.communicate()

async def start(self) -> Browser:
"""launches the actual browser"""
if not self:
Expand Down Expand Up @@ -358,6 +363,12 @@ async def start(self) -> Browser:
break

if not self.info:
stderr = None
try :
_, stderr_bytes = await self._process.communicate()
stderr = stderr_bytes.decode()[:1000]
except Exception :
pass
Comment on lines +367 to +371
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is the try/except needed? when would this cause an exception?

if this is to handle self._process being None, we should check for that explicitly instead of doing try/except

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

also, why are we truncating to first 1000 characters? maybe if anything getting the last 1000 characters would be more useful?

Copy link
Author

@yoori yoori Nov 17, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

about exception handling : communicate is hard method (it use system calls and can raise exception by much reasons), and raise exception in block that already process exception isn't good for me ... in this case we can live with original exception without output (and debug output separatly)
about output : I got first 1000 chars, because I don't expect here much output (we only started chrome here and don't pushed some commands) - it is trivial protection for memory overuse for strange cases (for example : when chrome executable file replaced with something)

raise Exception(
(
"""
Expand All @@ -366,7 +377,8 @@ async def start(self) -> Browser:
---------------------
One of the causes could be when you are running as root.
In that case you need to pass no_sandbox=True
"""
""" +
("Browser error output:" + stderr if stderr else '')
Comment on lines +380 to +381
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks, I've been meaning to add something like this. very useful

)
)

Expand Down Expand Up @@ -702,8 +714,6 @@ async def save(self, file: PathLike = ".session.dat", pattern: str = ".*"):
# return
# if not connection.websocket:
# return
# if connection.websocket.closed:
# return
cookies = await self.get_all(requests_cookie_format=False)
included_cookies = []
for cookie in cookies:
Expand Down
31 changes: 15 additions & 16 deletions zendriver/core/connection.py
Original file line number Diff line number Diff line change
Expand Up @@ -201,7 +201,7 @@ def __init__(
):
super().__init__()
self._target = target
self.__count__ = itertools.count(0)
self._cdp_id_generator = itertools.count(0)
self._owner = _owner
self.websocket_url: str = websocket_url
self.websocket = None
Expand Down Expand Up @@ -230,7 +230,7 @@ def target(self, target: cdp.target.TargetInfo):
def closed(self):
if not self.websocket:
return True
return self.websocket.closed
return (not self.websocket)
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see you added

websocket = None

below, so this will work when we close the connection, but aren't there also other ways for the websocket to get closed? what if the browser closes the connection?

to handle all cases, we should probably do like

self.websocket is None or self.websocket.closed

(this applies to the other places you changed this as well)

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

closed property removed in 14 and logic of websockets authors :
closed can be changed only on operations (send/recv) - here no any loop in websockets that check sockets in background
closed attribute can be filled only on operations, that already raise ConnectionClosedError
using of closed attribute after this exception is bad design for applications


def add_handler(
self,
Expand Down Expand Up @@ -282,7 +282,7 @@ async def aopen(self, **kw):
:return:
"""

if not self.websocket or self.websocket.closed:
if not self.websocket:
try:
self.websocket = await websockets.connect(
self.websocket_url,
Expand All @@ -308,11 +308,12 @@ async def aclose(self):
"""
closes the websocket connection. should not be called manually by users.
"""
if self.websocket and not self.websocket.closed:
if self.websocket:
if self.listener and self.listener.running:
self.listener.cancel()
self.enabled_domains.clear()
await self.websocket.close()
self.websocket = None
logger.debug("\n❌ closed websocket connection to %s", self.websocket_url)

async def sleep(self, t: Union[int, float] = 0.25):
Expand Down Expand Up @@ -411,7 +412,7 @@ async def send(
:return:
"""
await self.aopen()
if not self.websocket or self.closed:
if not self.websocket:
return
if self._owner:
browser = self._owner
Expand All @@ -425,9 +426,7 @@ async def send(
try:
tx = Transaction(cdp_obj)
tx.connection = self
if not self.mapper:
self.__count__ = itertools.count(0)
tx.id = next(self.__count__)
tx.id = next(self._cdp_id_generator)
self.mapper.update({tx.id: tx})
if not _is_update:
await self._register_handlers()
Expand Down Expand Up @@ -603,15 +602,17 @@ async def listener_loop(self):
# breathe
# await asyncio.sleep(self.time_before_considered_idle / 10)
continue
except (Exception,) as e:
# break on any other exception
# which is mostly socket is closed or does not exist
# or is not allowed

except (websockets.exceptions.ConnectionClosedError,) as e:
logger.debug(
"connection listener exception while reading websocket:\n%s", e
)
break
except (Exception,) as e:
# we don't expect here other exceptions, need to debug if it will appear.
logger.exception(
"connection listener exception while reading websocket"
)
break

if not self.running:
# if we have been cancelled or otherwise stopped running
Expand Down Expand Up @@ -646,9 +647,7 @@ async def listener_loop(self):
try:
event = cdp.util.parse_json_event(message)
event_tx = EventTransaction(event)
if not self.connection.mapper:
self.connection.__count__ = itertools.count(0)
event_tx.id = next(self.connection.__count__)
event_tx.id = next(self.connection._cdp_id_generator)
self.connection.mapper[event_tx.id] = event_tx
except Exception as e:
logger.info(
Expand Down