rest-ensembl-batch-request

A collection of Python files to help analyze gene coordinates for regulatory features.

The files use the Ensembl REST API to retrieve regulatory features for gene coordinates. They are designed for batch requests: you supply a list of gene coordinates, and the files retrieve the regulatory features for each one. The regulatory features include transcription factor binding sites, enhancers, promoters, and other regulatory elements.
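As a rough illustration of what a single request for one coordinate might look like, here is a minimal sketch that uses the Ensembl REST `overlap/region` endpoint with `feature=regulatory`. The function names `build_overlap_url` and `fetch_regulatory_features` are hypothetical, and the code in `src` may build its requests differently.

```python
import json
import urllib.request

ENSEMBL_SERVER = "https://rest.ensembl.org"


def build_overlap_url(chromosome: str, start: int, end: int, species: str = "human") -> str:
    """Build an overlap/region URL that asks for regulatory features only.

    Hypothetical helper for illustration; not taken from this repository.
    """
    region = f"{chromosome}:{start}-{end}"
    return f"{ENSEMBL_SERVER}/overlap/region/{species}/{region}?feature=regulatory"


def fetch_regulatory_features(chromosome: str, start: int, end: int) -> list:
    """Send one GET request and return the decoded JSON list of features."""
    request = urllib.request.Request(
        build_overlap_url(chromosome, start, end),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)


if __name__ == "__main__":
    # Requires network access; prints how many features overlap the region.
    features = fetch_regulatory_features("1", 918352, 918705)
    print(len(features), "regulatory features returned")
```

Keeping the URL construction separate from the network call makes it possible to check the request format without contacting the server.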

The files in this repository are meant to analyze the human genome.

Steps to use the files

1. Clone the repository

git clone https://github.com/your-username/new-repo.git

1a. Create a virtual environment

Optional: At the root of the repository, create a virtual environment, and select it as the interpreter.

python -m venv venv
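After creating the environment, activate it (`source venv/bin/activate` on Linux or macOS, `venv\Scripts\activate` on Windows). If it is unclear whether the selected interpreter really is the one inside `venv`, a small check like the following can help. This is only a sketch; the helper name `in_virtual_environment` is made up for illustration.

```python
import sys


def in_virtual_environment() -> bool:
    """Return True when the running interpreter belongs to a venv."""
    # Inside a venv, sys.prefix points at the venv directory while
    # sys.base_prefix still points at the base Python installation.
    return sys.prefix != sys.base_prefix


if __name__ == "__main__":
    print("virtual environment active:", in_virtual_environment())
    print("interpreter:", sys.executable)
```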

2. Install the required packages

Still at the root of the repository, install the required packages:

pip install -r requirements.txt

3. Update input files

Place the gene coordinates in a CSV file in the data/input directory. The CSV file should have a column named chromosomal_region, with each value in the format:

1:918352:918705

where the numbers represent chromosome_number:start_position:end_position.
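To make the expected format concrete, the sketch below reads a `chromosomal_region` column and splits each value into its three parts. The `Region` type and the `parse_region` and `read_regions` helpers are hypothetical names chosen for this example; they are not necessarily how the code in `src` parses the input.

```python
import csv
import io
from typing import List, NamedTuple, TextIO


class Region(NamedTuple):
    chromosome: str
    start: int
    end: int


def parse_region(value: str) -> Region:
    """Split 'chromosome:start:end' into its parts and validate them."""
    parts = value.strip().split(":")
    if len(parts) != 3:
        raise ValueError(f"expected chromosome:start:end, got {value!r}")
    # int() raises ValueError if a position is not a whole number.
    chromosome, start, end = parts[0], int(parts[1]), int(parts[2])
    if start > end:
        raise ValueError(f"start is greater than end in {value!r}")
    return Region(chromosome, start, end)


def read_regions(handle: TextIO) -> List[Region]:
    """Read every row of the chromosomal_region column from an open CSV."""
    return [parse_region(row["chromosomal_region"]) for row in csv.DictReader(handle)]


# In-memory stand-in for a file in data/input.
sample = io.StringIO("chromosomal_region\n1:918352:918705\nX:100:200\n")
regions = read_regions(sample)
print(regions[0])
```

Validating the values before sending any requests means a malformed row is reported immediately rather than partway through a long run.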

4. Run the main file

From the root directory, run:

python -m src.main

This retrieves regulatory features for the gene coordinates listed in each CSV file in the data/input directory. Depending on the size of your input file(s), this may take anywhere from several minutes to several hours.
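The "each CSV file in the data/input directory" behaviour can be pictured with the following sketch, which walks a directory and yields every coordinate together with the file it came from. The helper name `iter_input_regions` is an assumption made for illustration, and the real entry point in `src.main` may be organized differently.

```python
import csv
from pathlib import Path
from typing import Iterator, Tuple


def iter_input_regions(input_dir: Path) -> Iterator[Tuple[str, str]]:
    """Yield (file name, chromosomal_region) pairs from every CSV in input_dir."""
    # Sorting gives a stable processing order across runs.
    for csv_path in sorted(input_dir.glob("*.csv")):
        with csv_path.open(newline="") as handle:
            for row in csv.DictReader(handle):
                yield csv_path.name, row["chromosomal_region"]


if __name__ == "__main__":
    for file_name, region in iter_input_regions(Path("data/input")):
        print(f"{file_name}: {region}")
```

Because the function is a generator, coordinates are read one row at a time, so large input files do not have to be loaded into memory before the first request is sent.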

5. Wait for the requests to complete

While the requests run, print statements indicate which chromosome and which file are currently being read. The output is saved in the data/output directory as a CSV file. Because the API returns nested JSON objects, the output files are "flattened" once every file in the data/input directory has been processed and its features have been fetched. The flattened files are saved in the data/output directory with the prefix flat_.
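To show what "flattening" a nested JSON object means in practice, here is a minimal sketch that turns nested dictionaries and lists into a single level of dotted keys, plus a helper that derives the `flat_` file name. Both `flatten` and `flat_output_path` are hypothetical, the dotted-key scheme is an assumption, and the sample record is invented rather than real API output.

```python
from pathlib import Path
from typing import Any, Dict


def flatten(value: Any, prefix: str = "") -> Dict[str, Any]:
    """Collapse nested dicts and lists into one dict with dotted keys."""
    flat: Dict[str, Any] = {}
    if isinstance(value, dict):
        for key, child in value.items():
            flat.update(flatten(child, f"{prefix}{key}."))
    elif isinstance(value, list):
        for index, child in enumerate(value):
            flat.update(flatten(child, f"{prefix}{index}."))
    else:
        # Drop the trailing dot that the recursion added to the prefix.
        flat[prefix[:-1]] = value
    return flat


def flat_output_path(output_path: Path) -> Path:
    """Return the sibling path whose file name carries the flat_ prefix."""
    return output_path.with_name(f"flat_{output_path.name}")


# Invented record, used only to demonstrate the transformation.
record = {
    "id": "EXAMPLE_1",
    "location": {"start": 918352, "end": 918705},
    "tags": ["a", "b"],
}
print(flatten(record))
print(flat_output_path(Path("data/output/genes.csv")))
```

Note that in this sketch empty dictionaries and empty lists contribute no keys, so a column can disappear from a row whose nested value is empty.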

Future Improvements

Feel free to suggest improvements or modifications via an issue.

joseph-belmonte/rest-ensembl-batch-request
