Update alt-text-scan.py.md
Adding headings
mgifford authored Nov 20, 2024
1 parent 80ad4f4 commit ff14f91
Showing 1 changed file with 12 additions and 12 deletions.
24 changes: 12 additions & 12 deletions alt-text-scan.py.md
@@ -1,34 +1,34 @@
# Image Analysis Script for Web Accessibility

-Overview
+## Overview

This script analyzes images on a website for accessibility compliance. It identifies issues with alt text and other metadata, providing suggestions to improve accessibility. The script can parse sitemaps or crawl the website manually if a sitemap is unavailable or invalid.
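The sitemap-first, crawl-as-fallback behavior described above could look roughly like the sketch below. It is not taken from the script itself: the function names `urls_from_sitemap` and `choose_strategy` are hypothetical, and it uses only the standard library to stay self-contained.

```python
import xml.etree.ElementTree as ET

# Namespace used by standard sitemap files (sitemaps.org protocol).
SITEMAP_NS = "{http://www.sitemaps.org/schemas/sitemap/0.9}"


def urls_from_sitemap(xml_text):
    """Return the <loc> URLs found in a sitemap, or [] if it cannot be parsed."""
    try:
        root = ET.fromstring(xml_text)
    except ET.ParseError:
        return []  # invalid sitemap: the caller should fall back to crawling
    return [loc.text.strip() for loc in root.iter(SITEMAP_NS + "loc") if loc.text]


def choose_strategy(xml_text):
    """Hypothetical helper: use the sitemap when it yields URLs, else crawl."""
    urls = urls_from_sitemap(xml_text)
    return ("sitemap", urls) if urls else ("crawl", [])
```

An empty or unparsable sitemap yields the `"crawl"` strategy, which mirrors the "unavailable or invalid" case mentioned in the overview.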

-Features
+## Features

• Crawl websites for image data using sitemaps or manual crawling.
• Analyze image metadata, including alt text, title, and size.
• Generate detailed suggestions for improving alt text.
• Exclude non-HTML content (e.g., PDFs, videos).
• Output results to a CSV file with a summary of findings and recommendations.
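As an illustration of the metadata collection step, the sketch below gathers `src`, `alt`, and `title` for each image on a page. The script's dependency list includes BeautifulSoup; this example swaps in the standard library's `html.parser` so it runs without extra installs, and `ImageCollector` and `collect_images` are hypothetical names.

```python
from html.parser import HTMLParser


class ImageCollector(HTMLParser):
    """Collects basic attributes for every <img> tag fed to the parser."""

    def __init__(self):
        super().__init__()
        self.images = []

    def handle_starttag(self, tag, attrs):
        if tag != "img":
            return
        attr = dict(attrs)
        self.images.append({
            "src": attr.get("src", ""),
            "alt": attr.get("alt"),      # None means the attribute is absent
            "title": attr.get("title"),
        })


def collect_images(html):
    """Return a list of dicts, one per <img> tag in the given HTML string."""
    parser = ImageCollector()
    parser.feed(html)
    return parser.images
```

Keeping a missing `alt` (`None`) distinct from an empty one (`""`) matters for later analysis, since an empty `alt` is the conventional way to mark a decorative image.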

-Installation
+## Installation

-Prerequisites
+### Prerequisites

Ensure you have Python 3.10 or later installed. Install the following Python libraries:

pip install requests beautifulsoup4 pandas tqdm textblob readability-lxml textstat

-Usage
+## Usage

Running the Script

To run the script, use the following command:

python3.10 alt_scan.py <domain> --sample_size <number>

-Parameters
+## Parameters

• <domain>: The starting URL for the website (e.g., https://example.com).
• --sample_size: Maximum number of unique URLs to crawl (default: 100).
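A command-line interface matching these two parameters could be declared with `argparse` as in the sketch below. The argument names and the default of 100 come from the list above; the `build_parser` function and the help strings are assumptions, not the script's actual code.

```python
import argparse


def build_parser():
    """Hypothetical parser mirroring the documented parameters."""
    parser = argparse.ArgumentParser(
        description="Analyze images on a website for accessibility issues."
    )
    # Positional argument: the starting URL, e.g. https://example.com
    parser.add_argument("domain", help="Starting URL for the website")
    # Optional cap on the number of unique URLs to crawl
    parser.add_argument(
        "--sample_size",
        type=int,
        default=100,
        help="Maximum number of unique URLs to crawl (default: 100)",
    )
    return parser
```

Calling `build_parser().parse_args(["https://example.com"])` leaves `sample_size` at its default, while passing `--sample_size 1000` overrides it.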
@@ -39,7 +39,7 @@ python3.10 alt_scan.py https://www.whitehouse.gov --sample_size 1000

This command crawls up to 1,000 unique pages on the specified domain and analyzes the images found.

-Output
+## Output

The script generates two files:
1. CSV File: <domain>_images.csv
@@ -59,7 +59,7 @@ CSV Columns
• Load_Time (s): Time taken to fetch the image.
• Suggestions: Accessibility improvement recommendations.

-Features of Analysis
+## Features of Analysis

The script provides actionable suggestions, including:
• “Image hidden with no semantic value” if an image is marked with aria-hidden or hidden attributes.
@@ -69,7 +69,7 @@ The script provides actionable suggestions, including:
• Suggestions to avoid unnecessary phrases like “A picture of” in alt text.
• Readability checks using a customizable threshold.
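Two of the rules listed above, the hidden-image check and the filler-phrase check, might be expressed as in the sketch below. Only the "Image hidden with no semantic value" message appears in this README; the `suggest` function, the regular expression, and the wording of the second message are assumptions made for illustration.

```python
import re

# Matches alt text that opens with a filler phrase such as "A picture of".
FILLER = re.compile(r"^\s*(?:an? )?(?:picture|image|photo) of\b", re.IGNORECASE)


def suggest(attrs):
    """Return suggestions for one image, given a dict of its HTML attributes."""
    # Hidden images carry no semantics, so no further checks are applied.
    if attrs.get("aria-hidden") == "true" or "hidden" in attrs:
        return ["Image hidden with no semantic value"]

    suggestions = []
    alt = attrs.get("alt")
    if alt and FILLER.match(alt):
        # Hypothetical wording; the script's real message may differ.
        suggestions.append("Avoid phrases like 'A picture of' in alt text")
    return suggestions
```

Returning early for hidden images keeps the report from flagging alt text on content that assistive technology never announces.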

-Troubleshooting
+## Troubleshooting

Invalid or Missing Sitemap

@@ -82,14 +82,14 @@ The script excludes non-HTML content, such as:
• Media files (.jpg, .mp4, etc.)
• Archives (.zip, .rar, etc.)
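One simple way to apply this kind of exclusion is to check the file extension in the URL path before fetching, as sketched below. The extension set contains only the examples named in this README (the real list is longer, per the "etc."), and `is_probably_html` is a hypothetical helper name.

```python
import posixpath
from urllib.parse import urlparse

# Only the extensions named in this README; the script's real list may be longer.
EXCLUDED_EXTENSIONS = {".pdf", ".jpg", ".mp4", ".zip", ".rar"}


def is_probably_html(url):
    """Return False when the URL path ends in a known non-HTML extension."""
    path = urlparse(url).path  # drops the query string and fragment
    extension = posixpath.splitext(path)[1].lower()
    return extension not in EXCLUDED_EXTENSIONS
```

An extension check is only a heuristic: a URL without an extension can still serve a PDF, so checking the response's `Content-Type` header is a useful complement.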

-Logging Issues
+## Logging Issues

The script outputs warnings for any URLs it fails to process.
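A warn-and-continue loop of this kind could be written with the standard `logging` module, as in the sketch below. The logger name `alt_scan`, the `process_all` function, and the message format are all assumptions; the README only states that warnings are emitted for URLs that fail.

```python
import logging

logger = logging.getLogger("alt_scan")  # hypothetical logger name


def process_all(urls, process):
    """Run process(url) for each URL, warning on failures instead of stopping.

    Returns the list of URLs that could not be processed.
    """
    failed = []
    for url in urls:
        try:
            process(url)
        except Exception as exc:  # broad on purpose: one bad page should not end the scan
            logger.warning("Failed to process %s: %s", url, exc)
            failed.append(url)
    return failed
```

Collecting the failed URLs alongside the warnings makes it easy to retry them or list them in a final summary.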

-Contributing
+## Contributing

Feel free to submit issues or pull requests to improve this script.

-License
+## License

This project is open-source and available under the MIT License.
