first implementation
tawandakembo committed Jul 27, 2024
0 parents commit cb180d7
Showing 8 changed files with 336 additions and 0 deletions.
30 changes: 30 additions & 0 deletions .github/workflows/python-publish.yml
@@ -0,0 +1,30 @@
name: Upload Python Package

on:
  push:
    branches:
      - main
  pull_request:
    branches:
      - main

jobs:
  deploy:
    runs-on: ubuntu-latest

    steps:
      - uses: actions/checkout@v2
      - name: Set up Python
        uses: actions/setup-python@v2
        with:
          python-version: '3.x'
      - name: Install dependencies
        run: |
          python -m pip install --upgrade setuptools wheel twine
      - name: Build and publish
        env:
          TWINE_USERNAME: ${{ secrets.PYPI_USERNAME }}
          TWINE_PASSWORD: ${{ secrets.PYPI_PASSWORD }}
        run: |
          python setup.py sdist bdist_wheel
          twine upload dist/*
130 changes: 130 additions & 0 deletions .gitignore
@@ -0,0 +1,130 @@
*.pyc
collated-code.md

# Byte-compiled / optimized / DLL files
__pycache__/
*.py[cod]
*$py.class

# Distribution / packaging
.Python
build/
develop-eggs/
dist/
downloads/
eggs/
.eggs/
lib/
lib64/
parts/
sdist/
var/
wheels/
pip-wheel-metadata/
share/python-wheels/
*.egg-info/
.installed.cfg
*.egg
MANIFEST

# PyInstaller
# Usually these files are written by a python script from a template
# before PyInstaller builds the exe, so as to inject date/other infos into it.
*.manifest
*.spec

# Installer logs
pip-log.txt
pip-delete-this-directory.txt

# Unit test / coverage reports
htmlcov/
.tox/
.nox/
.coverage
.coverage.*
.cache
nosetests.xml
coverage.xml
*.cover
*.py,cover
.hypothesis/
.pytest_cache/
cover/

# Translations
*.mo
*.pot

# Django stuff:
*.log
local_settings.py
db.sqlite3
db.sqlite3-journal

# Flask stuff:
instance/
.webassets-cache

# Scrapy stuff:
.scrapy

# Sphinx documentation
docs/_build/
target/

# Jupyter Notebook
.ipynb_checkpoints

# IPython
profile_default/
ipython_config.py

# pyenv
.python-version

# pipenv
Pipfile.lock

# poetry
poetry.lock

# Celery stuff
celerybeat-schedule
celerybeat.pid

# SageMath parsed files
*.sage.py

# Environments
.env
.venv
env/
venv/
ENV/
env.bak/
venv.bak/

# Spyder project settings
.spyderproject
.spyproject

# Rope project settings
.ropeproject

# mkdocs documentation
/site

# mypy
.mypy_cache/
.dmypy.json
dmypy.json

# Pyre type checker
.pyre/

# pytype static type analyzer
.pytype/

# Cython debug symbols
cython_debug/
1 change: 1 addition & 0 deletions LICENSE
@@ -0,0 +1 @@
MIT License
40 changes: 40 additions & 0 deletions README.md
@@ -0,0 +1,40 @@
# Code Collator

Code Collator is a powerful CLI tool designed to streamline your code review and documentation process by collating your entire codebase into a single, organised Markdown file. This is particularly useful for sharing with AI tools like ChatGPT or Claude for analysis, troubleshooting, or documentation.

## Use Case

Have you ever needed to provide a comprehensive overview of your codebase for a code review, AI analysis, or detailed documentation? Code Collator simplifies this task by aggregating all your code files into a single Markdown file. This makes it easy to:
- Share your code with AI tools like ChatGPT or Claude for intelligent analysis.
- Generate a unified document for code reviews or team collaboration.
- Create comprehensive documentation for your projects with minimal effort.

## Features
- **Full Codebase Collation**: Collates all files in the specified directory and subdirectories into one Markdown file.
- **.gitignore Support**: Automatically ignores files specified in the `.gitignore` file if one exists.
- **Customizable Output**: Outputs a single Markdown file named `collated-code.md` by default, with options to specify the path to the codebase directory and output file name.
- **Binary File Inclusion**: Lists binary files such as images in the output with a note that they are binary, rather than embedding their contents.
- **Help Command**: Provides a help command to display usage instructions.
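
As a rough illustration of the `.gitignore` support described above, the sketch below shows one way shell-style patterns can be matched against file paths with Python's standard `fnmatch` module. The helper name `matches_any` is hypothetical and not part of Code Collator's API, and the matching is deliberately simplified: negation (`!pattern`) and anchored patterns (`/site`) are not handled.

```python
from fnmatch import fnmatch
from pathlib import PurePosixPath


def matches_any(rel_path, patterns):
    """Return True if rel_path matches any (simplified) ignore pattern.

    Hypothetical helper for illustration only; it approximates
    .gitignore behaviour and does not implement the full rules.
    """
    parts = PurePosixPath(rel_path).parts
    for raw in patterns:
        pattern = raw.strip()
        # Skip blank lines and comment lines.
        if not pattern or pattern.startswith("#"):
            continue
        # Treat "build/" like "build" so it can match a directory component.
        name = pattern.rstrip("/")
        if fnmatch(rel_path, name) or any(fnmatch(part, name) for part in parts):
            return True
    return False


patterns = ["# Byte-compiled files", "__pycache__/", "*.pyc", "dist/"]
print(matches_any("pkg/__pycache__/mod.cpython-311.pyc", patterns))
print(matches_any("pkg/collate.py", patterns))
```

Matching each path component separately is what lets a directory pattern such as `dist/` exclude every file beneath that directory.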

## Installation

You can easily install Code Collator using pip:

```sh
pip install code-collator
```

## Usage

Here’s a basic example of how to use Code Collator:

```sh
code-collator --path /path/to/codebase --output my-collated-code.md
```

For more detailed usage instructions, use the help command:

```sh
code-collator --help
```
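
The two flags shown above could be parsed with `argparse` roughly as follows. This is a minimal sketch that mirrors the options and defaults defined in `code_collator/collate.py`; the `build_parser` helper is a hypothetical name used only for illustration.

```python
import argparse


def build_parser():
    """Build a parser exposing the --path and --output options.

    Hypothetical helper: the real tool may construct its parser differently.
    """
    parser = argparse.ArgumentParser(
        prog="code-collator",
        description="Aggregate codebase into a single Markdown file.",
    )
    parser.add_argument(
        "-p", "--path", type=str, default=".",
        help="Path to the codebase directory (default: current directory)",
    )
    parser.add_argument(
        "-o", "--output", type=str, default="collated-code.md",
        help="Output file (default: collated-code.md)",
    )
    return parser


# Parse an explicit argument list instead of sys.argv for demonstration.
args = build_parser().parse_args(
    ["--path", "/path/to/codebase", "--output", "my-collated-code.md"]
)
print(args.path, args.output)

# With no arguments, both options fall back to their defaults.
defaults = build_parser().parse_args([])
print(defaults.path, defaults.output)
```

Both options are optional, so running the command with no arguments collates the current directory into `collated-code.md`.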

Empty file added code_collator/__init__.py
Empty file.
94 changes: 94 additions & 0 deletions code_collator/collate.py
@@ -0,0 +1,94 @@
import os
import argparse
from pathlib import Path
import logging

def setup_logging():
    """Set up logging configuration."""
    logging.basicConfig(
        level=logging.INFO,
        format='%(asctime)s - %(levelname)s - %(message)s'
    )

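def is_binary_file(filepath):
    """Check if a file is binary."""
    try:
        with open(filepath, 'rb') as f:
            # Sample the first 8 KiB instead of reading the whole file.
            chunk = f.read(8192)
        # A NUL byte is a strong sign of binary data. Testing for bytes
        # above 127 would misclassify UTF-8 text with non-ASCII characters.
        return b'\0' in chunk
    except Exception as e:
        logging.error(f"Error reading file {filepath}: {e}")
        return False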

def read_gitignore(path):
    """Read the .gitignore file and return patterns to ignore."""
    gitignore_path = os.path.join(path, '.gitignore')
    if not os.path.exists(gitignore_path):
        return []

    try:
        with open(gitignore_path, 'r') as f:
            patterns = f.read().splitlines()
        logging.info(f"Loaded .gitignore patterns from {gitignore_path}")
        return patterns
    except Exception as e:
        logging.error(f"Error reading .gitignore file {gitignore_path}: {e}")
        return []

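def should_ignore(file_path, ignore_patterns):
    """Check if a file should be ignored based on .gitignore patterns and if it's in the .git directory."""
    from fnmatch import fnmatch
    parts = Path(file_path).parts
    if '.git' in parts:
        return True
    for pattern in ignore_patterns:
        pattern = pattern.strip()
        # Skip blank lines and comments from the .gitignore file.
        if not pattern or pattern.startswith('#'):
            continue
        # Directory patterns such as "build/" never match a full file path,
        # so also compare each path component against the bare name.
        # This approximates, but does not fully implement, gitignore rules.
        name = pattern.rstrip('/')
        if fnmatch(file_path, pattern) or any(fnmatch(part, name) for part in parts):
            return True
    return False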

def collate_codebase(path, output_file):
    """Aggregate the codebase into a single Markdown file."""
    ignore_patterns = read_gitignore(path)
    output_path = os.path.abspath(output_file)
    try:
        with open(output_file, 'w', encoding='utf-8') as output:
            output.write("# Collated Codebase\n\n")
            for root, _, files in os.walk(path):
                for file in files:
                    file_path = os.path.join(root, file)
                    # Never collate the output file into itself.
                    if os.path.abspath(file_path) == output_path:
                        continue
                    if should_ignore(file_path, ignore_patterns):
                        logging.info(f"Ignored file {file_path}")
                        continue

                    output.write(f"## {file_path}\n\n")
                    if is_binary_file(file_path):
                        output.write("**Note**: This is a binary file.\n\n")
                    elif file.endswith('.svg'):
                        output.write("**Note**: This is an SVG file.\n\n")
                    else:
                        try:
                            with open(file_path, 'r', encoding='utf-8', errors='ignore') as f:
                                content = f.read()
                            output.write(f"```\n{content}\n```\n\n")
                        except Exception as e:
                            logging.error(f"Error reading file {file_path}: {e}")
                            output.write("**Note**: Error reading this file.\n\n")
        logging.info(f"Collated codebase written to {output_file}")
    except Exception as e:
        logging.error(f"Error writing to output file {output_file}: {e}")

def main():
    """Parse arguments and initiate codebase collation."""
    setup_logging()
    parser = argparse.ArgumentParser(description="Aggregate codebase into a single Markdown file.")
    parser.add_argument('-p', '--path', type=str, default='.', help="Specify the path to the codebase directory (default: current directory)")
    parser.add_argument('-o', '--output', type=str, default='collated-code.md', help="Specify output file (default: collated-code.md)")

    args = parser.parse_args()

    logging.info(f"Starting code collation for directory: {args.path}")
    collate_codebase(args.path, args.output)
    logging.info("Code collation completed.")

if __name__ == "__main__":
    main()
Empty file added requirements.txt
Empty file.
41 changes: 41 additions & 0 deletions setup.py
@@ -0,0 +1,41 @@
from setuptools import setup, find_packages
import pathlib

here = pathlib.Path(__file__).parent.resolve()

setup(
    name="code-collator",
    version="0.1",
    description="A CLI tool to aggregate codebase into a single Markdown file",
    long_description=(here / 'README.md').read_text(encoding='utf-8'),
    long_description_content_type='text/markdown',
    url="https://github.com/tawanda-kembo/code-collator",
    author="Tawanda Kembo",
    author_email="[email protected]",
    classifiers=[
        "Development Status :: 3 - Alpha",
        "Intended Audience :: Developers",
        "Topic :: Software Development :: Build Tools",
        "License :: OSI Approved :: MIT License",
        "Programming Language :: Python :: 3",
        "Programming Language :: Python :: 3.8",
        "Programming Language :: Python :: 3.9",
        "Programming Language :: Python :: 3.10",
        "Programming Language :: Python :: 3.11",
    ],
    keywords="cli, development, documentation",
    packages=find_packages(),
    python_requires=">=3.6, <4",
    install_requires=[
        # Add any dependencies here
    ],
    entry_points={
        "console_scripts": [
            "code-collator=code_collator.collate:main",
        ],
    },
    project_urls={
        "Bug Reports": "https://github.com/tawanda-kembo/code-collator/issues",
        "Source": "https://github.com/tawanda-kembo/code-collator",
    },
)
