Create json tracking protection lists #176
base: main
I have checked this PR locally and I can see the files generated.
@say-yawn can you confirm if that's correct? I see there are 5 failing tests that also fail for me locally. Does that mean the JSON files are not correct? I don't have enough knowledge about those yet, but if the lists are fine and the tests just need some changes to pass, we could start working on simplifying the iOS scripts to use these lists.
The tests have not been updated to take the JSON output into account. I've been trying to update them, but it is very challenging because the tests mock the I wonder if we should rewrite the tests without mocking and instead just write the generated files to a temporary directory where they can be inspected/validated. I think that would make it simpler to support multiple output formats. What do you think, @say-yawn?
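A minimal sketch of the temp-directory approach suggested above. `write_json_list` is a hypothetical stand-in for whatever the script uses to emit a list, and the payload (a sorted array of domains) is illustrative only, since the real JSON schema is not shown in this thread.

```python
import json
import tempfile
from pathlib import Path


def write_json_list(output_dir, list_name, domains):
    """Hypothetical stand-in for the script's JSON writer (not the real API)."""
    path = Path(output_dir) / f"{list_name}.json"
    # Sort so the file content is deterministic and easy to compare in tests.
    path.write_text(json.dumps(sorted(domains)))
    return path


def test_json_list_written_to_temp_dir():
    # No mocking: write a real file, then read it back and validate it.
    with tempfile.TemporaryDirectory() as tmp:
        path = write_json_list(
            tmp, "base-fingerprinting-track-digest256", {"b.example", "a.example"}
        )
        assert path.name == "base-fingerprinting-track-digest256.json"
        assert json.loads(path.read_text()) == ["a.example", "b.example"]
```

Because the test only inspects files on disk, the same pattern should extend to other output formats by adding another writer and another read-back check.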
I'm going to pick up this PR and refactor it a bit to make testing simpler.
Temporarily comment out the publish to cloud to remove noise/errors
I pushed some changes to generate the JSON list for all versioned branches too. The naming convention for the versioned ETP list is as mentioned in this code snippet. Previously, the script was generating and replacing the JSON files every time the
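For reference, the naming convention from the PR description (`{version}-{list name}`, e.g. `93.0-base-fingerprinting-track-digest256.json`) could be sketched as below. The function name `versioned_json_name` is hypothetical and not taken from the script; the unversioned fallback is an assumption based on the later comment that files without a version are also produced.

```python
def versioned_json_name(list_name, version=None):
    """Build a JSON file name following the {version}-{list name} convention.

    Hypothetical helper for illustration; not the script's actual function.
    """
    base = f"{list_name}.json"
    # Assumption: when no version is given, the plain list name is used.
    return f"{version}-{base}" if version else base
```

Usage: `versioned_json_name("base-fingerprinting-track-digest256", "93.0")` gives the example name from the acceptance criteria.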
@say-yawn I can't get this to work. I copy the config file from the config repo into place and run the tool, but it fails with an error:
I don't understand this error because it is definitely in the file:
I was able to run the script and get the JSON files before the latest commit landed. On 4a7c13a it is working (I get the JSON files, although they are not versioned). I have followed your steps above to copy the
I was able to get this working. I had to rename: After that I ran ./lists2safebrowsing.py and got all the versioned JSON files! 👏
Hello @say-yawn, let me try to come back to this and see if we can move it forward a bit, if you have some time at some point. Once I was able to get the lists as described above, I tried to match them with the current lists used in Firefox. This is what I got, and we will need to confirm with you whether it's right:
I would say it looks good and we are a step closer to continue improving the environment.
I can take care of 3 and 4 but need your help for 1 and 2. Thank you in advance!
@isabelrios hello! Thanks for being patient with me
I am unsure what this is referring to but
Thanks for your reply @say-yawn! Your comment makes sense. In fact, that list is not being used for cookie blocking, as you can see here: https://github.com/mozilla-mobile/firefox-ios/blob/9a9b927484cf49a2cde438e3db71ee5203f88d66/content-blocker-lib-ios/src/ContentBlocker.swift#L41 which matches cell B8 of the doc you shared (if I'm reading that table correctly). If the other iOS lists match the JSON files you can create with this PR, I think we are on a good path here. Happy to work with you and clarify what's needed so that we can move to the next steps :)
I had a look and also came to the same mapping conclusion as @isabelrios described in #176 (comment) 👍 I think the next step is to
Hello! So it turns out we'll need one more file to be generated with this PR, which is the entity-list as we had on the shavar-prod-lists. With that we'll be able to generate all the files necessary for the Firefox for iOS project.
Hello! All in all, the only thing missing on this PR is that we get the entity-list from it (as is, no change needed in format).
@say-yawn would it be possible to get that entity-list JSON file when running your script?
Hi, if you can pull the entity lists from shavar-prod-lists that would be great. I believe that was the last thing identified as needed to merge this PR, and I will not be able to make the changes to the script as I need to prioritize other work I was asked to deliver. @isabelrios and @lmarceau, if this is an acceptable step forward, will either of you review and merge the PR so you can use the changes from main rather than a feature branch?
Hello @say-yawn, thanks for the update. I think we can move forward with what we have. There are a few questions, though, that we would need to clarify:
In summary, the questions are:
Thank you! @lmarceau please add anything I may have missed... 🤔
I have one more question @say-yawn, just to double check that we are doing this correctly... When I run the lists2safebrowsing.py script I get versioned files (from version 69.0), but the latest version is 98.0. Then there are the same files without a version, which are the ones I am using... is that right? Thanks
@say-yawn, I forgot to comment on our final solution to store the JSON and manage the files we generate. They will live in a public GCP bucket. Is that fine, or do we need to check with someone? I remember that you said we would need a license if we wanted to have them in the repo, but since they will not be there... please confirm this with us so we can be sure we are doing this right. Thanks!
I responded in a separate email about the public GCP bucket. The list should be fine to use if it's not public and may be fine to use publicly. I recommend reaching out to legal to check if the public bucket is alright.
Thank you @say-yawn for your detailed response. It helps a lot to understand how to continue with our solution.
This expands the function to be used for entity lists as well
About this PR
Generate JSON files of the ETP lists for every versioned branch on shavar-prod-lists including the main branch.
Acceptance Criteria
{version}-{list name}. For example, the 93.0 versioned fingerprinting list looks like: 93.0-base-fingerprinting-track-digest256.json
base-fingerprinting-track-digest256/93.0 in the bucket for shavar-list-creation.
STR
Generate versioned list to upload to shavar-prod-lists
Copy prod.ini from shavar-list-creation-config here into the shavar_list_creation.ini locally.
Run lists2safebrowsing.py
Find the base-fingerprinting-track-digest256.json you need, then copy and rename the file to base-fingerprinting-track.json
Create normalized-lists from the project root folder and add the base-fingerprinting-track.json file under the newly created folder.
Open a PR on the shavar-prod-lists repo with the changes, merging to the versioned branch the JSON file was for.
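The copy-and-rename steps above could be scripted roughly as follows. This is a sketch under assumptions: `stage_normalized_list` is a hypothetical helper, and the rename rule (dropping the `-digest256` suffix) is generalized from the single `base-fingerprinting-track` example in the steps.

```python
import shutil
from pathlib import Path


def stage_normalized_list(generated_json, project_root):
    """Copy a generated *-digest256.json into normalized-lists/ without the suffix.

    Hypothetical helper; the suffix rule is inferred from one example.
    """
    src = Path(generated_json)
    target_dir = Path(project_root) / "normalized-lists"
    # Create the folder under the project root if it does not exist yet.
    target_dir.mkdir(parents=True, exist_ok=True)
    # base-fingerprinting-track-digest256.json -> base-fingerprinting-track.json
    dest = target_dir / src.name.replace("-digest256.json", ".json")
    shutil.copyfile(src, dest)
    return dest
```

The original file is copied rather than moved, so the generated output stays in place for comparison.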