This repository contains code for classifying high-resolution satellite images using SE-ResNet50 and other advanced deep learning models. The dataset used is the UCMerced LandUse dataset, and the code includes models like SE-ResNet50, SE-ResNeXt, and SENet, using transfer learning to fine-tune pretrained models for image classification.
- Project Overview
- Dataset
- Preprocessing
- Model Architectures
- Training
- Evaluation
- Results
- Installation
- Usage
- References
This project aims to classify high-resolution satellite images from the UCMerced LandUse dataset into various land use categories. The key objectives are:
- Convert .tif images to .jpg format for easier processing.
- Use SE-ResNet50, SE-ResNeXt, and SENet models for high-accuracy classification.
- Train models on the dataset and evaluate performance with metrics such as accuracy, confusion matrix, and ROC curves.
The dataset used for this project is the UCMerced LandUse Dataset. It contains 21 land-use classes with 100 images per class. Each image is 256x256 pixels with a spatial resolution of 0.3 meters.
- Input: .tif satellite images.
- Output: Classified land use images in categories like agricultural, residential, commercial, etc.
Before feeding the images into the model, we perform the following preprocessing steps:
- Convert .tif images to .jpg format using the Python Imaging Library (PIL).
- Resize images to 224x224 pixels.
- Normalize images using ImageNet mean and standard deviation values.
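from torchvision import transforms

# Resize to the 224x224 input size and normalize with ImageNet statistics
transform = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
])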
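The .tif-to-.jpg conversion step is only described in prose above. A minimal sketch of it with Pillow might look like the following; the body of `convert_image` and the `quality` setting are assumptions, since the repository's own implementation is not shown here.

```python
from pathlib import Path

from PIL import Image


def convert_image(input_path, output_path, quality=95):
    """Convert a single .tif image to .jpg (illustrative sketch, not the repository's code)."""
    with Image.open(input_path) as img:
        # JPEG has no alpha channel and no 16-bit modes, so force 8-bit RGB first.
        rgb = img.convert("RGB")
        # Create the destination folder if it does not exist yet.
        Path(output_path).parent.mkdir(parents=True, exist_ok=True)
        rgb.save(output_path, "JPEG", quality=quality)
```

JPEG compression is lossy, so a high `quality` value is a reasonable default when the converted images are used for training.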
We have used the following models, all of which are based on the Squeeze-and-Excitation (SE) block:
- SE-ResNet50: A variant of ResNet50 with SE blocks.
- SE-ResNeXt50/101: SE variants of ResNeXt, which adds cardinality (parallel grouped-convolution paths) to the ResNet bottleneck block.
- SENet154: The largest and most computationally expensive model of the three, and the most accurate in our experiments.
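To make the shared building block concrete, here is a minimal PyTorch sketch of a Squeeze-and-Excitation block. It is a simplified illustration rather than the implementation used by the pretrained models; the class name `SEBlock` is hypothetical, and the reduction ratio of 16 is the default from the SENet paper.

```python
import torch
import torch.nn as nn


class SEBlock(nn.Module):
    """Illustrative Squeeze-and-Excitation block (not the repository's code)."""

    def __init__(self, channels, reduction=16):
        super().__init__()
        # Squeeze: global average pooling collapses each channel to one value.
        self.pool = nn.AdaptiveAvgPool2d(1)
        # Excitation: a two-layer bottleneck produces one gate in (0, 1) per channel.
        self.fc = nn.Sequential(
            nn.Linear(channels, channels // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels),
            nn.Sigmoid(),
        )

    def forward(self, x):
        b, c, _, _ = x.shape
        weights = self.pool(x).view(b, c)            # (B, C)
        weights = self.fc(weights).view(b, c, 1, 1)  # (B, C, 1, 1)
        # Rescale each feature map by its learned channel weight.
        return x * weights
```

The block leaves the tensor shape unchanged, which is why it can be dropped into existing residual blocks such as those of ResNet50 or ResNeXt.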
We trained the models on the UCMerced dataset using the following configuration:
- Optimizer: Adam
- Loss Function: CrossEntropyLoss
- Learning Rate: 0.001
- Epochs: 35
- Batch Size: 32
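The configuration above could be set up roughly as follows. This is a sketch under stated assumptions: the tiny `model` and the random tensors are placeholders so the snippet runs on its own, and in practice they would be replaced by a pretrained SE model and the real dataset.

```python
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

num_classes = 21   # UCMerced land-use classes
epochs = 35
batch_size = 32
learning_rate = 0.001

# Placeholder model (assumption): stands in for a pretrained SE network.
model = nn.Sequential(
    nn.AdaptiveAvgPool2d(1),
    nn.Flatten(),
    nn.Linear(3, num_classes),
).to(device)

criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(model.parameters(), lr=learning_rate)

# Placeholder data (assumption): random tensors shaped like preprocessed images.
images = torch.randn(64, 3, 224, 224)
labels = torch.randint(0, num_classes, (64,))
train_loader = DataLoader(TensorDataset(images, labels), batch_size=batch_size, shuffle=True)
```

These names (`model`, `criterion`, `optimizer`, `train_loader`, `device`, `epochs`) match the ones a standard PyTorch training loop expects.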
Example training loop:
for epoch in range(epochs):
    model.train()
    running_loss = 0.0
    for inputs, labels in train_loader:
        inputs, labels = inputs.to(device), labels.to(device)
        optimizer.zero_grad()
        outputs = model(inputs)
        loss = criterion(outputs, labels)
        loss.backward()
        optimizer.step()
        running_loss += loss.item()
    print(f"Epoch {epoch+1}/{epochs}, Loss: {running_loss/len(train_loader)}")
We evaluated the model using:
- Accuracy: The percentage of correct predictions.
- Confusion Matrix: A heatmap to visualize misclassifications.
- ROC Curve: For multiclass classification, we plotted the ROC curve for each class.
from sklearn.metrics import accuracy_score, confusion_matrix, roc_curve, auc
from sklearn.preprocessing import label_binarize
import matplotlib.pyplot as plt
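Building on these imports, the three metrics could be computed along the following lines. This is a sketch, not the repository's evaluation script: the function name `summarize_predictions` is hypothetical, and it assumes `y_score` holds one score per class for each image (for example, softmax outputs) and that there are more than two classes.

```python
import numpy as np
from sklearn.metrics import accuracy_score, confusion_matrix, roc_curve, auc
from sklearn.preprocessing import label_binarize


def summarize_predictions(y_true, y_score, n_classes):
    """Return accuracy, confusion matrix, and one-vs-rest AUC per class.

    Illustrative helper; assumes n_classes > 2 and every class appears in y_true.
    """
    y_true = np.asarray(y_true)
    y_score = np.asarray(y_score)
    class_ids = list(range(n_classes))

    # The predicted label is the class with the highest score.
    y_pred = y_score.argmax(axis=1)
    accuracy = accuracy_score(y_true, y_pred)
    cm = confusion_matrix(y_true, y_pred, labels=class_ids)

    # One-vs-rest ROC: binarize the labels, then score each class separately.
    y_onehot = label_binarize(y_true, classes=class_ids)
    per_class_auc = {}
    for k in class_ids:
        fpr, tpr, _ = roc_curve(y_onehot[:, k], y_score[:, k])
        per_class_auc[k] = auc(fpr, tpr)

    return accuracy, cm, per_class_auc
```

The `fpr` and `tpr` arrays for each class can be passed to `plt.plot` to draw the per-class ROC curves, and the confusion matrix can be rendered as a heatmap with `plt.imshow`.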
The models achieved the following accuracy on the validation set:
- SE-ResNet50: 94.5% accuracy
- SE-ResNeXt50: 95.3% accuracy
- SENet154: 96.2% accuracy
To install the required dependencies, clone the repository and install the Python packages using pip:
git clone https://github.com/ArnavGhosh999/Aasmaan.git
cd Aasmaan/Se-ResNet50
Ensure that you have the following libraries installed:
torch
torchvision
timm
Pillow
rasterio
opencv-python
scikit-learn
matplotlib
You can also install them manually using:
pip install torch torchvision timm Pillow rasterio opencv-python scikit-learn matplotlib
- Dataset Preparation: Mount your Google Drive or local directory where the dataset is stored.
- Image Conversion: Run the conversion script to convert .tif images to .jpg:
  for root, dirs, files in os.walk(input_directory):
      for file in files:
          if file.lower().endswith(('.tif', '.tiff')):
              input_path = os.path.join(root, file)
              relative_path = os.path.relpath(input_path, input_directory)
              output_path = os.path.join(output_directory, os.path.splitext(relative_path)[0] + '.jpg')
              output_dir = os.path.dirname(output_path)
              if not os.path.exists(output_dir):
                  os.makedirs(output_dir)
              convert_image(input_path, output_path)
- Training: Train the model using:
  for epoch in range(epochs):
      model.train()
      running_loss = 0.0
      for inputs, labels in train_loader:
          inputs, labels = inputs.to(device), labels.to(device)
          optimizer.zero_grad()
          outputs = model(inputs)
          loss = criterion(outputs, labels)
          loss.backward()
          optimizer.step()
          running_loss += loss.item()
- Evaluation: Evaluate the model performance:
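  model.eval()  # use inference behavior for dropout and batch norm
  correct, total = 0, 0
  with torch.no_grad():  # gradients are not needed for evaluation
      # Iterate over a held-out loader instead to measure validation accuracy
      for inputs, labels in train_loader:
          inputs, labels = inputs.to(device), labels.to(device)
          outputs = model(inputs)
          _, predicted = torch.max(outputs, 1)
          total += labels.size(0)
          correct += (predicted == labels).sum().item()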
- UCMerced LandUse Dataset: Link
- SE-ResNet and SE-ResNeXt models: Timm Library
- SENet Paper: Squeeze-and-Excitation Networks