Scripts from Fifer, J., Bentlage, B., Lemer, S., Fujimura, A. G., Sweet, M., & Raymundo, L. J. (2021). Going with the flow: How corals in high‐flow environments can beat the heat. Molecular Ecology, 30(9), 2009-2024.
These scripts were written by Dr. Bastian Bentlage and James Fifer and are a modified version of the pipeline available at
Step 1) Trim files using script. Sequences were trimmed using Trimmomatic (ILLUMINACLIP:TruSeq3-PE.fa:2:30:10 LEADING:3 TRAILING:3 SLIDINGWINDOW:4:15 MINLEN:35, with -phred33) (Bolger, Lohse, & Usadel, 2014), which removes sequencing adapters and low-quality nucleotides and discards reads shorter than 35 bp, per default settings.
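As a rough illustration of the trimming settings quoted in Step 1, the sketch below assembles a paired-end Trimmomatic call as an argument list. It is not the repository's trimming script: the function name, the jar path, and the output file naming are assumptions made for the example. Only the trimming steps and the -phred33 flag come from the text above.

```python
from typing import List


def trimmomatic_command(fwd: str, rev: str, prefix: str,
                        jar: str = "trimmomatic.jar",
                        adapters: str = "TruSeq3-PE.fa") -> List[str]:
    """Build a paired-end Trimmomatic call with the settings quoted in Step 1.

    The jar location and the output names (<prefix>_1P.fq etc.) are
    hypothetical; adjust them to the local installation and naming scheme.
    """
    return [
        "java", "-jar", jar, "PE", "-phred33",
        fwd, rev,                                # forward and reverse input reads
        f"{prefix}_1P.fq", f"{prefix}_1U.fq",    # forward paired / unpaired output
        f"{prefix}_2P.fq", f"{prefix}_2U.fq",    # reverse paired / unpaired output
        f"ILLUMINACLIP:{adapters}:2:30:10",      # adapter clipping
        "LEADING:3", "TRAILING:3",               # trim low-quality read ends
        "SLIDINGWINDOW:4:15",                    # 4-base window, mean quality 15
        "MINLEN:35",                             # drop reads shorter than 35 bp
    ]


if __name__ == "__main__":
    # Print the command instead of running it, so it can be checked first.
    print(" ".join(trimmomatic_command("sample_R1.fq", "sample_R2.fq", "sample")))
```

The list can be passed to subprocess.run once the paths have been checked against the local setup.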
Step 2) Use on trimmed combined fq files. All quality-filtered paired reads were aligned against the publicly available A. digitifera genome (Shinzato et al., 2011) and Cladocopium goreaui genome (Liu et al., 2018) using the splice-junction mapper TopHat2 (Kim et al., 2013). Using the resulting BAM files, two reference transcriptomes, one for A. cf. pulchra and one for Cladocopium sp., were assembled with the genome-guided version of the Trinity transcriptome assembler (Haas et al., 2013), using the A. digitifera genome (Shinzato et al., 2011) or the C. goreaui genome (Liu et al., 2018) as a guide.
Step 3) Use the to split the fasta into one file per sequence in order to run in parallel. This script blasts against the NCBI nt database and retains hits that match scleractinians with an e-value cutoff of 1e-5 (for the symbiont, contigs were blasted against the nr database and Dinophyceae hits were retained). This creates a coralcontigs.txt or a symcontigs.txt file.
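The splitting part of Step 3 can be sketched as follows. This is a minimal stand-in, not the repository's own splitting script: the function names and the seq_000001.fasta output naming are assumptions chosen for the example.

```python
from pathlib import Path
from typing import Iterator, List, Tuple


def read_fasta(text: str) -> Iterator[Tuple[str, str]]:
    """Yield (header, sequence) pairs from FASTA-formatted text."""
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def split_fasta(text: str, out_dir: str) -> List[Path]:
    """Write one FASTA file per record into out_dir and return the paths.

    The numbered file names are hypothetical; any unique name per record works.
    """
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    paths = []
    for i, (header, seq) in enumerate(read_fasta(text), start=1):
        path = out / f"seq_{i:06d}.fasta"
        path.write_text(f">{header}\n{seq}\n")
        paths.append(path)
    return paths
```

Each resulting file can then be submitted as an independent BLAST job, which is what makes the parallel run possible.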
Step 4) Filter out contigs shorter than 300 bp from the symcontigs.txt or coralcontigs.txt file using the &
Step 5) Filter out rRNA with and
Step 6) Annotate this fasta file using the script (blastx against the UniProt database with desired GO terms). Use script to pull only cnidarian GO terms from UniProt (note that this filtering was done only for the host, not the symbiont, because fewer annotations are available for Symbiodiniaceae in UniProt). Use to filter hits with e-values < 1e-5 and to turn these results into a fasta file. This creates the final reference transcriptome.
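The e-value filtering in Step 6 might look like the sketch below. It assumes the blastx results are in the default 12-column tabular format (where the e-value is the 11th column), and it reads "filter hits with e-values < 1e-5" as keeping hits below that threshold. Keeping only the best hit per contig is an additional assumption, as is the function name; the repository's own filtering script may behave differently.

```python
from typing import Dict, Iterable, Tuple

# 0-based index of the e-value in the default 12-column tabular BLAST output:
# qseqid sseqid pident length mismatch gapopen qstart qend sstart send evalue bitscore
EVALUE_COLUMN = 10


def best_hits(lines: Iterable[str], max_evalue: float = 1e-5) -> Dict[str, str]:
    """Return, per query, the hit line with the lowest e-value below max_evalue."""
    best: Dict[str, Tuple[float, str]] = {}
    for line in lines:
        line = line.rstrip("\n")
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        fields = line.split("\t")
        evalue = float(fields[EVALUE_COLUMN])
        if evalue >= max_evalue:
            continue  # hit does not pass the cutoff
        query = fields[0]
        if query not in best or evalue < best[query][0]:
            best[query] = (evalue, line)
    return {query: hit for query, (_, hit) in best.items()}
```

The keys of the returned dictionary are the contig IDs to carry forward into the final fasta file.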
Step 7) Format sample file according to samples.txt. Use the script to create counts files (through RSEM). For all downstream analyses with counts check
We followed the pipeline described in Davies et al., 2018 for determining Symbiodiniaceae genera.
To determine whether differences in Symbiodiniaceae between treatments might account for differences in host gene expression, we further mapped RNAseq libraries against the Symbiodiniaceae COI BOLD database (Ratnasingham & Hebert, 2007) and publicly available COI sequences from NCBI, using Bowtie 2 (Langmead & Salzberg, 2013) with the --no-unal flag. This database is available in Sym_COIDB.tar.gz.
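For orientation, a Bowtie 2 call of the kind described above could be assembled as in this sketch. Only the --no-unal flag comes from the text; the function name, the index prefix, the file names, and the thread count are assumptions for the example, and the index would first have to be built from the unpacked Sym_COIDB sequences with bowtie2-build.

```python
from typing import List


def bowtie2_command(index_prefix: str, fwd: str, rev: str,
                    sam_out: str, threads: int = 1) -> List[str]:
    """Build a paired-end Bowtie 2 call that omits unaligned reads from the SAM."""
    return [
        "bowtie2",
        "--no-unal",            # do not write reads that failed to align
        "-p", str(threads),     # number of alignment threads
        "-x", index_prefix,     # index built beforehand with bowtie2-build
        "-1", fwd,              # forward reads
        "-2", rev,              # reverse reads
        "-S", sam_out,          # SAM output file
    ]


if __name__ == "__main__":
    # "sym_coi" and the sample file names are hypothetical placeholders.
    print(" ".join(bowtie2_command("sym_coi", "sample_1P.fq", "sample_2P.fq",
                                   "sample_coi.sam", threads=4)))
```

Because of --no-unal, the resulting SAM file holds only reads that mapped to a COI reference, which keeps the input to the downstream assembly small.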
We used the script to create reads. Reads that were mapped successfully were then assembled against the consensus sequence of our Symbiodiniaceae reference alignment, using the reference-guided assembly algorithm implemented in Geneious (version 10.0.2; Biomatters, Auckland, NZ). All Symbiodiniaceae sequences and a Protodinium plus Polarella glacialis outgroup were aligned using MUSCLE (Edgar, 2004) with the default options implemented in Geneious. This was followed by removal of sites of ambiguous alignment using Gblocks (Castresana, 2000), with the default settings implemented in the alignment viewer Seaview version 4 (Gouy, Guindon, & Gascuel, 2010). Phylogenetic relationships were inferred using PhyML version 3.2 (Guindon et al., 2010), under the GTR+I+G model of nucleotide substitution. Robustness of the phylogeny was evaluated with 100 non-parametric bootstrap replicates.