"loopmailsource" is a Python project that collects information about your favorite music artists and their producers, and scrapes their publicly available contact details. It fetches data about songs, producers, and artists, and also gathers Instagram usernames, public email addresses, and biographies.
To set up the project, follow these steps:
- Clone the project repository from GitHub.
- Create a virtual environment and activate it.
- Install the required packages listed in the requirements.txt file using pip install -r requirements.txt.
- Create a .env file in the root directory (next to requirements.txt) and populate it with your API keys and Instagram credentials (see Configuration).
- You are now ready to use the project.
To use the Music and Instagram Data Collector:
- Run the main.py script.
- Follow the prompts to input the artist's name and the maximum number of songs to process.
- The script will collect data about the artist, their songs, producers, and Instagram-related information.
- The results will be saved to a CSV file named <artist_name>_<max_count>_tracks.csv.
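The output naming scheme above can be sketched as follows. This is a minimal illustration, not the project's actual code: the function names build_output_path and save_rows, the column names, and the replacement of spaces with underscores are all assumptions.

```python
import csv
from pathlib import Path


def build_output_path(artist_name: str, max_count: int, directory: str = ".") -> Path:
    """Return the path <artist_name>_<max_count>_tracks.csv inside directory."""
    # Assumption: spaces become underscores to keep the file name tidy.
    safe_name = artist_name.strip().replace(" ", "_")
    return Path(directory) / f"{safe_name}_{max_count}_tracks.csv"


def save_rows(path: Path, header: list[str], rows: list[list[str]]) -> None:
    """Write a header line followed by the data rows to a CSV file."""
    # newline="" lets the csv module control line endings itself.
    with path.open("w", newline="", encoding="utf-8") as handle:
        writer = csv.writer(handle)
        writer.writerow(header)
        writer.writerows(rows)
```

For example, build_output_path("Daft Punk", 10) yields a file named Daft_Punk_10_tracks.csv in the current directory.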
In the .env file, configure the following variables:
api_key = 'your_genius_api_key'
user_name = 'your_instagram_username'
password = 'your_instagram_password'
Make sure to replace 'your_genius_api_key', 'your_instagram_username', and 'your_instagram_password' with your actual API key and credentials.
- Instagram Credentials: Use your own Instagram username and password for Instagram data scraping.
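To illustrate how these three variables might be read, here is a small standard-library sketch that parses lines of the form key = 'value'. The project's own loading code is not shown in this README (a package such as python-dotenv is a common choice for this job), so the functions parse_env_text and load_credentials are hypothetical.

```python
def parse_env_text(text: str) -> dict[str, str]:
    """Parse lines of the form key = 'value' into a dictionary."""
    values = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        # Skip blank lines, comments, and lines without an assignment.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Remove surrounding whitespace and quote characters from the value.
        values[key.strip()] = value.strip().strip("'\"")
    return values


def load_credentials(env_text: str) -> tuple[str, str, str]:
    """Return (api_key, user_name, password), failing early if one is missing."""
    values = parse_env_text(env_text)
    required = ("api_key", "user_name", "password")
    missing = [name for name in required if not values.get(name)]
    if missing:
        raise KeyError(f"Missing .env variables: {', '.join(missing)}")
    return values["api_key"], values["user_name"], values["password"]
```

Failing early on a missing variable gives a clearer error than a rejected login or API call later in the run.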
To obtain the necessary API keys:
- Genius API Key:
- Sign up on Genius.
- Go to the Genius API Clients page.
- Obtain a CLIENT ACCESS TOKEN for your application.
With the API keys and credentials configured in the .env file, you can start using the Music and Instagram Data Collector to gather information about your favorite music artist and their Instagram presence.
The project is organized into several Python files within the src directory:
- genius.py: Handles interaction with the Genius API for music-related data.
- instagram_scraper.py: Provides functions for scraping Instagram data.
- log_manager.py: Manages API keys and authentication for Genius and Instagram.
- main.py: The main script that orchestrates the data collection process.
- song_list.py: Defines lists for storing data related to songs, producers, Instagram usernames, emails, and biographies.
- song_processing.py: Contains functions for processing song-related data.
genius.py
This module handles interactions with the Genius API for music-related data. It includes functions for authentication, retrieving artist information, and fetching songs.
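As a rough sketch of what an authenticated Genius call involves, the snippet below builds (without sending) a request to the Genius search endpoint, passing the client access token as a bearer token. The helper build_search_request is hypothetical and is not taken from genius.py, which may rely on a wrapper library instead.

```python
from urllib.parse import urlencode
from urllib.request import Request

GENIUS_API_ROOT = "https://api.genius.com"


def build_search_request(query: str, access_token: str) -> Request:
    """Build (but do not send) a Genius search request for the given query."""
    url = f"{GENIUS_API_ROOT}/search?{urlencode({'q': query})}"
    # The client access token travels in the Authorization header.
    return Request(url, headers={"Authorization": f"Bearer {access_token}"})
```

Passing the returned object to urllib.request.urlopen would perform the call; the response body is JSON.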
instagram_scraper.py
The instagram_scraper.py module provides functions for scraping Instagram data, including public email addresses and biographies, from Instagram usernames.
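One way a contact address might be recovered from a biography is a simple pattern match over the text. This is an assumption for illustration only: the module may instead read the public email field that a profile exposes, and extract_emails is a hypothetical name.

```python
import re

# Deliberately simple pattern; it will not match every valid address.
EMAIL_PATTERN = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def extract_emails(biography: str) -> list[str]:
    """Return the unique email addresses in a biography, in order of appearance."""
    seen = []
    for match in EMAIL_PATTERN.findall(biography):
        email = match.lower()  # treat addresses as case-insensitive
        if email not in seen:
            seen.append(email)
    return seen
```

Lowercasing before the duplicate check means an address repeated with different capitalization is reported only once.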
log_manager.py
log_manager.py manages API keys and authentication for both the Genius and Instagram APIs. It loads environment variables from the .env file and provides the genius_auth function for Genius API authentication.
main.py
The main.py script is the main entry point for the project. It prompts the user for input, fetches data about the artist and their songs, processes the data, scrapes Instagram information, and saves the results to a CSV file.
song_list.py
song_list.py defines lists for storing data related to songs, producers, Instagram usernames, public email addresses, and biographies. These lists are used to collect and organize the data during the execution of the project.
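The idea of index-aligned lists can be sketched like this. The list names and the helpers record and as_rows are hypothetical; the real module may name and fill its lists differently.

```python
# Hypothetical list names; one entry per processed song in each list.
songs = []
producers = []
instagram_usernames = []
emails = []
biographies = []


def record(song, producer, username="", email="", biography=""):
    """Append one entry to every list so the lists stay aligned by index."""
    songs.append(song)
    producers.append(producer)
    instagram_usernames.append(username)
    emails.append(email)
    biographies.append(biography)


def as_rows():
    """Combine the parallel lists into one row per song."""
    return [
        list(row)
        for row in zip(songs, producers, instagram_usernames, emails, biographies)
    ]
```

Appending to all lists in a single function, with empty strings for missing values, keeps the indexes in step, so the rows can be written out directly.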
song_processing.py
song_processing.py contains functions for processing song-related data. It iterates through songs, retrieves information about producers, and calculates estimated time and progress during data collection.
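A progress and time estimate of this kind can be computed from the average time per song so far. The sketch below shows one such approach under that assumption; estimate_remaining and process_with_progress are hypothetical names, not the module's actual functions.

```python
import time


def estimate_remaining(elapsed_seconds: float, done: int, total: int) -> float:
    """Estimate the seconds left, assuming each song takes the average time so far."""
    if done <= 0:
        raise ValueError("at least one item must be processed before estimating")
    return elapsed_seconds / done * (total - done)


def process_with_progress(songs, handle_song, clock=time.monotonic):
    """Call handle_song on each song and return one progress report per song."""
    start = clock()
    total = len(songs)
    reports = []
    for index, song in enumerate(songs, start=1):
        handle_song(song)
        remaining = estimate_remaining(clock() - start, index, total)
        reports.append(f"{index}/{total} ({index / total:.0%}), ~{remaining:.0f}s left")
    return reports
```

The clock is a parameter so the estimate can be checked with a fake timer; a monotonic clock is the default because it does not jump if the system time is adjusted.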