These notebooks are prototypes, research, and sanity checks for the Firewall Cafe project.
Install these packages at a minimum:
- Jupyter Notebooks (or the Anaconda stack)
Some of the notebooks also need:
- Selenium
- Google Cloud Translate
- ipyplot
To run those notebooks, you'll need to set up credentials for Google Cloud Translation and download the Chrome webdriver (ChromeDriver) that matches your installed version of Chrome.
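Before opening the notebooks, it can help to confirm that the optional pieces are in place. The following is a minimal sketch, not part of the project: the import names are assumed from the usual PyPI packages, the `chromedriver` executable is assumed to be on your `PATH`, and the credentials are assumed to be a service-account key file referenced by the `GOOGLE_APPLICATION_CREDENTIALS` environment variable that Google's client libraries read. The names `module_available` and `check_prerequisites` are hypothetical helpers.

```python
import importlib.util
import os
import shutil

# Assumed import names for the optional packages listed above
# (PyPI: selenium, google-cloud-translate, ipyplot).
OPTIONAL_MODULES = ["selenium", "google.cloud.translate", "ipyplot"]


def module_available(name):
    """Return True if `name` looks importable (parent packages may be imported)."""
    try:
        return importlib.util.find_spec(name) is not None
    except (ImportError, ValueError):
        # Raised when a parent package (for example `google`) is missing.
        return False


def check_prerequisites(environ=os.environ, which=shutil.which):
    """Build a report mapping each prerequisite to True (found) or False."""
    report = {name: module_available(name) for name in OPTIONAL_MODULES}
    # Assumption: credentials are a key file pointed to by this variable.
    creds = environ.get("GOOGLE_APPLICATION_CREDENTIALS", "")
    report["google credentials file"] = bool(creds) and os.path.isfile(creds)
    # Assumption: the driver binary is named `chromedriver` and is on PATH.
    report["chromedriver on PATH"] = which("chromedriver") is not None
    return report


if __name__ == "__main__":
    for item, ok in check_prerequisites().items():
        print(f"{'ok     ' if ok else 'MISSING'}  {item}")
```

Anything reported as `MISSING` only matters for the notebooks that use it; the rest should still run with Jupyter alone.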
1_requests-google-baidu. Reverse-engineering search results.
2_using-google-cloud-translation. Getting some basic automatic translation with Google Translate.
3_compare-languages-Google. Comparing what search results look like in different languages on Google.
4_compare-languages-Baidu. Comparing what search results look like in different languages on Baidu.
5_querying-many-sensitive-words-archive. Testing rate limits to see whether Google or Baidu has an automatic ban-hammer at a certain request rate.
6_firewall-api. Testing Firewall Cafe API endpoints and demonstrating their use.
7_firewall-babelfish. Demonstrating how to use the Babelfish translate API (if you have a key).
8_image-hashing. Testing different image hashing algorithms.
9_wordpress-node-APIs. Looking at similarities between the old and new Firewall Cafe APIs.
10_transfer-images-http. A first attempt at getting 10k images from one place to another.
11_extract-images-postgres-dump. Extracting images from a PostgreSQL dump; never got it working.
12_data-integrity. Checking that search results are getting entered correctly into the API, and returning as expected when we ask for them.
13_clean-up-searches-API. Deleting searches that incorrectly stored far too many images.
14_wordpress-and-db-check. Taking a closer look at the WordPress API vs. the new API to see if there are discrepancies in image results (they all seem to match).
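For readers unfamiliar with the idea behind 8_image-hashing, here is a small illustration of one common perceptual hash, the average hash. This README does not say which algorithms the notebook compares, so treat this as a generic sketch rather than the notebook's method. It works on a grid of grayscale values directly and assumes the image has already been shrunk to a small fixed size, which real implementations do first.

```python
def average_hash(pixels):
    """Average hash of a grayscale image given as rows of 0-255 ints.

    Each pixel contributes one bit: 1 if it is brighter than the mean.
    """
    flat = [value for row in pixels for value in row]
    mean = sum(flat) / len(flat)
    bits = 0
    for value in flat:
        bits = (bits << 1) | (1 if value > mean else 0)
    return bits


def hamming_distance(hash_a, hash_b):
    """Number of bit positions in which two hashes differ."""
    return bin(hash_a ^ hash_b).count("1")


# Toy 2x2 "images": a slightly brightened copy and an inverted pattern.
original = [[10, 200], [220, 30]]
brightened = [[20, 210], [230, 40]]
inverted = [[200, 10], [30, 220]]

print(hamming_distance(average_hash(original), average_hash(brightened)))  # 0
print(hamming_distance(average_hash(original), average_hash(inverted)))    # 4
```

A small Hamming distance suggests two images are near-duplicates even when their bytes differ, which is why this family of hashes is useful for comparing image results from different sources.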