Added functionality to locate elements based on their tagname, attribute, and value. #32

hamzaaitbrik · 2024-12-20T19:02:59Z

Hello, I hope you're doing fine.

I submitted an issue not along ago stating that I had issues trying to locate elements nested deeply into shadowRoots. I tried doing it with JavaScript, the same way I do it with Selenium, but that didn't work because the evaluate() method only returns serializable data, and I needed elements to interact with.

So, I decided to change the source code of the project to fix my issue, I ended up creating 4 other functions that can accurately, and efficiently locate elements in the whole DOM, including shadowRoots and iFrames, using only the tagname of the element, and a pair of attribute/value to narrow down the targeting.

The functions I created are locate(), locate_all(), locate_element_by_tagname_attribute_value(), and locate_elements_by_tagname_attribute_value().

The locate() function takes three arguments, tagname, attribute, and value. That way, if we're aiming to locate a button, and we know that our target button is the only button with a certain pair of attribute/value, we can effectively locate that button every single time, even if the DOM changes in case we're working on a website that changes its structure.

The locate_all() function takes three arguments as well, tagname, attribute, and value. It works the exact same way as the locate() function, the only difference is that we're able to retrieve a list of all elements of our specified tagname, and pair of attribute/value.

Feel free to examine the code I added, and let me know if I missed anything. I will be more than happy to modify on it to match the preferences of this project.

I thought of deleting the find() and select() functions as they won't be much of use compared to locate(), but that's your decision to make.

Looking forward for this pull request to get merged. I believe this will add great value!

Thanks,
Hamza

…, and value

…e and return multiple elements

stephanlensky · 2024-12-20T21:11:16Z

Thank you for your contribution!

I'm interested in merging this, but am hesitant because adding another "find" function with a new name when find and select already exist sounds like it might be confusing for new users.

I spent a bit of time thinking how to solve this problem, but wasn't able to come up with anything. I'd be open to suggestions, though. I do think it's important we don't add a 3rd new function with ambiguous purpose compared to the other existing ones.

hamzaaitbrik · 2024-12-20T23:29:16Z

Thanks for the reply @stephanlensky.

I don't know what you suggest. I believe the function I added is much more efficient at locating elements than the other two functions, especially if elements are nested deeply into shadowRoots, hence why I thought of making such thing.

The find() and select() functions are not able to locate elements within shadowRoots, which I stumble upon quite frequently when doing web automation.

If having three functions is not an option, I would suggest removing the find() function altogether, because that's the least important one of them. Selecting elements based on their text can be achieved using the locate() function with the arguments tagname=tagname, attribute=innerHTML or attribute=textContent, and value=ourTarget.

The only downside in only using the select() and locate() functions solely is having to inspect the webpage whenever we're looking for our targeted element. I believe that should be totally okay for people who are using this project, because they're developers. I myself do web automation, and I can't imagine myself writing a script for a webpage without inspecting it's DOM.

Let me know what you think about what I said, and whether there's any modifications I should add.

I am also looking forward into having this pull request merged into the main branch.

Cheers,
Hamza

stephanlensky · 2024-12-21T01:22:16Z

@hamzaaitbrik after giving this more thought, I think the best path is to integrate your functionality into the existing find/find_all functions, rather than creating a new function.

If you are familiar with the Python package beautifulsoup, my goal is for find/find_all to be very similar to the find/find_all functions from there.

More concretely, here is what I suggest:

Modify the existing find/find_all functions to traverse shadow roots
Add a new optional argument (we don't want to break existing users) to find/find_all: attrs.
- This argument will be a dict mapping attribute name -> value
- You can specify one or more attributes in this dict to filter results
- Very similar to what you have now, but extended to work with multiple attributes instead of just one
Make the existing text and best_match arguments optional, so that you have the option to filter only by attrs without searching for some specific text.

Let me know if you have any questions or if this makes sense, thanks.

hamzaaitbrik · 2024-12-22T21:01:30Z

Hello @stephanlensky,

I have given what you said so much thought. But I can't comprehend how would such function work? The locate() function takes a tagname as argument, that way we know what we're targeting; and that's more technical than just trying to find something called 'Log In' or 'Submit,' which is not practical in my opinion.

I genuinely believe that this project should ditch the option of finding elements by text, because that's simply not as efficient as finding elements by tagname and attribute/value.

I would argue that whoever will base their work on this project will definitely need a bulletproof method to locate elements effectively, and text simply doesn't do it. I tried, tho; I tried looking for buttons with 'Log In' text with a modified find() method, but I don't accurately get my target element every time, I sometimes get div or span elements while I need a button to click.

This will obviously be a big merge, adding/changing functions is something users will have to adapt with.
What I suggest is, we ditch finding elements by text altogether, because simply that is not efficient. I will be more than happy to rewrite the documentation for the project explaining how the function responsible for finding elements using their tagname and attrs work.

I'm open to other suggestions, but finding elements with text just doesn't work, I would consider myself intermediate in web automation and I know that finding the exact same element every single time is a crucial for longevity.

Let me know what you think.

stephanlensky · 2024-12-22T21:06:40Z

@hamzaaitbrik please reread what I said carefully. Specifically:

Make the existing text and best_match arguments optional, so that you have the option to filter only by attrs without searching for some specific text.

I am suggesting the existing function be extended in such a way that searching by text is no longer required, and (if you desire) you can instead search just by attribute and tag name.

Attributes/attributes values can be provided as an optional dictionary. Similarly, tag name can be provided as an optional string.

This gives users the power to search however they want, whether that is by text, attribute or tag name, all without breaking any existing usages or adding a new function.

hamzaaitbrik · 2024-12-22T21:13:06Z

Oh, okay! Gotcha now. Sorry, I totally missed the part where you said text could be optional. I will code it and try to make it as efficient as possible and submit the modified code. Thanks.

hamzaaitbrik added 3 commits December 20, 2024 17:25

adding functions to locate elements based on their tagname, attribute…

0e32c2d

…, and value

making the functions more efficient and adding functionality to locat…

1547b3d

…e and return multiple elements

comments

df9bd3f

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Added functionality to locate elements based on their tagname, attribute, and value. #32

Added functionality to locate elements based on their tagname, attribute, and value. #32

hamzaaitbrik commented Dec 20, 2024

stephanlensky commented Dec 20, 2024

hamzaaitbrik commented Dec 20, 2024

stephanlensky commented Dec 21, 2024

hamzaaitbrik commented Dec 22, 2024

stephanlensky commented Dec 22, 2024 •

edited

Loading

hamzaaitbrik commented Dec 22, 2024

Added functionality to locate elements based on their tagname, attribute, and value. #32

Are you sure you want to change the base?

Added functionality to locate elements based on their tagname, attribute, and value. #32

Conversation

hamzaaitbrik commented Dec 20, 2024

stephanlensky commented Dec 20, 2024

hamzaaitbrik commented Dec 20, 2024

stephanlensky commented Dec 21, 2024

hamzaaitbrik commented Dec 22, 2024

stephanlensky commented Dec 22, 2024 • edited Loading

hamzaaitbrik commented Dec 22, 2024

stephanlensky commented Dec 22, 2024 •

edited

Loading