- This project was conducted as part of the "RISE (Research Intensive Self-motivated Education)" class in Molecular Life Sciences at Incheon National University, Korea.
- The purpose of this project is to learn how to analyze DNA methylation datasets using the "ChAMP" R package.
- Through this project, I aim to identify which genes are differentially methylated at CpG sites between the control and bipolar groups.
- The analysis mainly covers the following steps:
- Quality control (QC) & Preprocessing
- Differential methylation analysis
- Gene set enrichment analysis (GSEA)
- Estimation of copy number variation (CNA)
- Infinium MethylationEPIC array chip (850K) platform
- Post-mortem hippocampus tissue
- Control group: 32 samples
- Bipolar group: 32 samples
- IDAT format (GSE129428, downloaded from the GEO database)
- ChAMP_analysis.R : DNA methylation analysis code using the ChAMP R package.
- GSE129428_pd.csv : Metadata (phenotype, age, smoking status, BMI, and so on)
- GSEA_result.csv : GSEA result
- CNA : CNA analysis result
- Normalization : QC results after normalization (MDS plot, hierarchical clustering, density plot)
- SVD_analysis : SVD analysis result (used to check for batch effects)
- Annotation.py : Python script that annotates genes based on the estimated CNV information.
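
To illustrate the idea behind CNV-based gene annotation, here is a minimal sketch that assigns genes to copy number segments by coordinate overlap. It is not the contents of `Annotation.py`: the `Segment` and `Gene` types, the `annotate_segments` function, and all coordinates and gene names are hypothetical and made up for the example.

```python
from typing import NamedTuple


class Segment(NamedTuple):
    """A copy number segment (hypothetical layout, not the ChAMP output format)."""
    chrom: str
    start: int
    end: int
    mean: float  # segment mean; the value is illustrative only


class Gene(NamedTuple):
    """A gene with genomic coordinates (made-up values below)."""
    name: str
    chrom: str
    start: int
    end: int


def annotate_segments(segments, genes):
    """Map each segment to the names of genes that overlap it on the same chromosome."""
    annotated = {}
    for seg in segments:
        # Two closed intervals overlap when each one starts before the other ends.
        annotated[seg] = [
            g.name
            for g in genes
            if g.chrom == seg.chrom and g.start <= seg.end and seg.start <= g.end
        ]
    return annotated


# Made-up example data: two segments and three genes.
segments = [
    Segment("chr1", 1000, 5000, 0.4),
    Segment("chr2", 100, 900, -0.3),
]
genes = [
    Gene("GENE_A", "chr1", 4500, 6000),  # overlaps the chr1 segment
    Gene("GENE_B", "chr1", 7000, 8000),  # lies outside every segment
    Gene("GENE_C", "chr2", 50, 150),     # overlaps the chr2 segment
]

result = annotate_segments(segments, genes)
for seg, names in result.items():
    print(seg.chrom, seg.start, seg.end, names)
```

The nested loop is quadratic, which is fine for a small example. For genome-wide gene tables, a sorted or interval-tree lookup would likely be a better fit.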
- Place the metadata file and the IDAT files in the same directory.
- Check that the Slide column in the metadata matches the slide ID in Basenames (e.g., Slide: 202816900054, Basenames: GSM3712754_202816900054_R01C01).
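
The Slide check above can be automated before running the analysis. The following is a minimal sketch, assuming the metadata has `Basenames` and `Slide` columns and that each basename follows the `<GSM>_<Slide>_<Array>` pattern shown in the example. The `check_slide_matches_basename` function is hypothetical. The second row of the sample data is invented to show one plausible failure, a slide ID rewritten in scientific notation by a spreadsheet program.

```python
import csv
import io


def check_slide_matches_basename(rows):
    """Return the Basenames whose embedded slide ID differs from the Slide column."""
    mismatches = []
    for row in rows:
        basename = row["Basenames"].strip()
        slide = row["Slide"].strip()
        # Assumed pattern: <GSM accession>_<slide ID>_<array position>
        parts = basename.split("_")
        if len(parts) != 3 or parts[1] != slide:
            mismatches.append(basename)
    return mismatches


# In-memory stand-in for GSE129428_pd.csv.
# Row 1 is the example from this README; row 2 is invented for illustration.
example_csv = io.StringIO(
    "Basenames,Slide\n"
    "GSM3712754_202816900054_R01C01,202816900054\n"
    "GSM0000000_202816900054_R02C01,2.028169E+11\n"
)

rows = list(csv.DictReader(example_csv))
mismatches = check_slide_matches_basename(rows)
print(mismatches)
```

To run it against the real metadata, replace `example_csv` with `open("GSE129428_pd.csv", newline="")`. An empty list means every Slide value agrees with its basename.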