Fish Behavioural screening
Behavioural screening was used to select for animal personality (AP) for both European sea bass, Dicentrarchus labrax, and Atlantic salmon, Salmo salar. Screenings were conducted following methods described in (Ferrari et al., 2015)  for D. labrax and a similar method with modification for S. salar (Damsgard et al., 2019) .
Briefly, the study was conducted at the Aquaculture Research Station in Tromsø, northern Norway, using age 0+ Atlantic salmon (Atlantic QTL-innova IPN). The fish were reared at 10 °C, in a continuous 24 h light regime and surplus feeding (Skretting Nutra). Fish were individually tagged using internal 12 mm PIT-Tags (HPT12 tags, Biomark, Boise, US), injected with a MK-25 implant gun. The fish population (n = 480, divided over 8 groups) was reared in circular holding tanks (~ 116 L) with flow through fresh water. Average weight 2 weeks prior to the experiment was 57.1 ± 7.3 g. Hypoxia sorting consisted of two custom-made circular tanks (~ 200 L, diameter 65 cm, water depth 60 cm, Cipax AS, Bjørkelangen, Norway), i.e. hypoxia and normoxia tanks. The tanks were connected at the surface level by a tube (inner diameter 9 cm), containing an integrated custom-made spool PIT-Tag antenna (Biomark Ltd., Boise, US), linked to a Biomark FS2001 reader. Tag manager software was used to identify fish leaving to the normoxia tank and those staying (independent of the decreasing oxygen level). Each tank had a separate water inlet and outlet. In the hypoxia tank the inlet was connected to a N2 gas exchanger (15 mg N2 L− 1), which deoxygenated incoming water. Oxygen levels (mgL− 1) in the tanks were monitored every minute, using an O2-monitoring system (Loligo Systems, Tjele, Denmark). Control tests prior to the experiment demonstrated that oxygen depletion was homogenous throughout the water column. Two video cameras were mounted outside the tube on each side of the opening and over water, in order to monitor fish movement. Each test had a duration of approximately 5 h starting at 8.30 AM and all tests were conducted in an equal manner. Prior to the test, the tanks were cleaned, water temperature regulated if necessary and the water flow in each tank set to 3.5 L min− 1. Fish were transferred to the experimental set-up as carefully as possible and released into the hypoxia tank. The two tanks were left undisturbed behind an opaque curtain for the duration of the trial. The monitoring of the fish started immediately, and the fish were allowed to acclimate for 2 h prior to the change in the oxygen level. During the decline in oxygen, the water flow in the hypoxia tank was directed through the gas exchanger, and a door between the hypoxic tank and the normoxic tank was opened so fish could freely swim between the tanks. After the oxygen level in the hypoxia tank reached 25% O2 saturation, the screening was terminated and the fish transferred back to their holding tanks. Fish that left the hypoxia tank were classified as leavers and considered proactives. Fish that never left the tank were considered stayers and classified as reactives. All fish was sampled for blood plasma cortisol to confirm the activity levels of the HPI axis corresponding to proactive and reactive personalities. (see  for details). After sorting, fish were left undisturbed in their holding tanks for a period of two and a half months, before sampling at basal conditions. Proactive and reactive individuals were sampled directly from holding tanks. Immediately after collection, individuals were euthanized with a lethal overdose of 1 g/L MS-222 (Finquel®, Argent Chemical Laboratories, Redmond, WA, USA). All fish were rapidly weighed and fork length measured. Whole brains were dissected out and placed in individually labelled tubes and stored at − 80 °C for analysis of gene expression.
Fish were hatched and reared at the experimental research station; Ifremer (Palavas-les-Flots, France) according to seabass rearing standards . Three triplicates of 120 fish (60 per tank) were used and each triplicate was housed in a 1.5 m3 tank in a sand filtered open flow system. Three batches of 120 individuals were screened at 215 days post hatching (dph) for hypoxia tolerance [33, 34] in order to assign animal personality (AP) and then returned to original 1.5 m3 tanks without modifying group composition. For the hypoxia test, oxygen concentration was decreased in one chamber of a two-chamber tank and escape from the hypoxic to the normoxic chamber was assessed. Experiments were carried out using two identical circular tanks (70 l, h: 48 cm, diameter: 49.5 cm,) attached via a transparent acrylic pipe (diameter: 11 cm, length: 30 cm, height from bottom: 23 cm) see . Each tank was considered to be a separate environment and was equipped with an independent oxygen and air supply that were switched off during trials. Sixty fish were placed in one chamber of the tank, subsequently the hypoxia tank, and were allowed to acclimate to the conditions for 30 min. To induce hypoxic conditions nitrogen bubbling was used to decrease oxygen saturation from 90 to 8% in 1 h. The second chamber of the tank was maintained under normal conditions. Once an individual escaped from the hypoxic tank into the normoxic tank it was immediately netted and placed in a separate tank before being anesthetized, benzocaine 200 ppm, and tagged with 12 mm ISO PIT tags, measured for weight and returned to their respective tanks. Assignment of AP categories was determined as follows. The first 20 fish escaping hypoxic conditions were proactive (P), the ~ 20 followers were intermediate (I) and the remaining fish, with no escape behaviour, were reactive (R). The hypoxia test ended when two thirds of the fish had escaped from the hypoxia tank or when 8% oxygen saturation was reached (water temperature 20 °C, salinity 26.9). This operation was repeated for all experimental fish. SCS assignation in each tank was: Tank 1: 40 P, 39 I and 40 R, Tank 2: 40 P, 61 I and 19 R and finally Tank 3: 40 P, 46 I and 34 R. Fish were sampled at 342 dph (mean weight 89.2 ± 31.8 g) with prior 24 h fasting. Fish were anaesthetized in their home tank using benzocaine, 200 ppm, and gathered in a smaller holding tank. PIT tags were read, weights were recorded, and fish were separated into new tanks according to AP. A random subsample of screened fish were immediately sacrificed using an overdose of anaesthetic (benzocaine) and kept on ice for further dissection. Whole brains were extracted and immediately frozen with liquid nitrogen and stored at − 80 °C. (see  for details).
qPCR analysis of fish brains
RNA isolation and cDNA synthesis
Individual brain samples were separately weighed and homogenized with Tri-Reagent following manufacturer’s instructions (1 mL/100 mg of tissue; Molecular Research Centre, Sigma-Aldrich, UK). Total RNA was extracted from individual fish brains for both S. salar (Proactive = 88; Reactive =40, Figure S1a and Table S1) and D. labrax (Proactive = 20; Reactive = 20. Figure S1b and Table S1) using the standard TriReagent (Sigma-Aldrich, UK) based method following manufacturer’s instructions. The concentration of each sample (total RNA) was quantified by Nanodrop (ND-1000) and quality visualized under UV light in a 1% agarose gel containing 1 μg/ml ethidium bromide. 2 μg of total RNA was taken from each individual to synthesize cDNA with SuperScript III RNase Transcriptase (Invitrogen) and oligo-dT primer (Promega).
Identification of salmon and sea bass target mRNAs using the zebrafish AP transcriptome
AP-specific gene lists were derived from our previous study in zebrafish  using individual zebrafish brain transcriptomes (dataset http://www.ncbi.nlm.nih.gov/geo/; GEO Accession: GSE40615GEO). Briefly, in this study behavioural tests including risk-taking, activity, mirror image stimulation and latency to feeding after confinement were carried out on a population of 280 zebrafish from a wild-type background. A sub-set of individuals exhibiting consistent AP were selected for brain transcriptome analyses, n = 10, for each animal personality. Microarray hybridisations were performed using the Zebrafish V2 (G2519F) 4x44K Agilent oligonucleotide microarray following standard methods according to manufacturer’s instructions (Agilent Technologies). To identify DEGs for each personality we selected genes that were highly significant (ANOVA P < 0.001) for proactive and reactive individuals. All p-values were adjusted with a false-discovery rate (FDR) correction for multiple testing by the Benjamini–Hochberg method (Benjamini & Hochberg 1995). All genes with FDR-corrected p-values < 0.05 were considered significant.
In silico cloning in the target species was carried out using genomic resources. These resources were: A all available public resources; S. salar were derived from the NCBI (994,195 sequences) and UniProt (> 9 K sequences). B All public resources for sea bass (Dicentrarchus labrax) UniProt (> 2.5 K sequences) and NCBI (86,801 ESTs) were added to sequence collections for sea bass held in the AquaSea server which forms part of the Aquagenomics website (http://www.aquagenomics.es). Sea bass resources were released in 2015. In order to maximize identification of salmon and sea bass mRNAs we applied our best BLAST iterative methodology [28, 39] (Goetz et al., 2010; Mackenzie et al., 2009). The scripts used consist of a four-step iterative BLAST, combining BLASTx: BLASTn: BLASTx: BLASTn, until a best hit description is assigned, the two first rounds are based on best description and the two next rounds based on best BLAST scores. If no description is found after the 4 rounds, sequences appear as “no hit found”. The e-value cut-off was set to 1E− 20 and the best BLAST hit with both highest similarity and coverage and lowest e-value was assigned as the mRNA transcript identity. BLAST results obtained for each species filtering at <1E-20 are available as Supplementary material (Table S2).
Curation of results
The lists obtained were then manually filtered to remove redundancy, cross check annotations and curated using the following criteria: 1) Species specificity, 2) contig length, 3) lowest e-value and 4) identity score. We found that BLASTn results were adequate for the analyses. A manual search for functional significance was undertaken using both classical literature search methods and also database orientated searches for functionality in model organisms for example GeneCards. This was also supported by network building in the Cytoscape platform aiming to identify interactions between the mRNAs identified.
All primers for QPCR were design by BatchPrimer3 v1.0 based on different sources of template genes as: i) for individual separation of mRNA transcript abundance target marker genes were chosen from the proactive-related gene list based on previous study on D. rerio  primers were designed based on the orthologous sequences from S. salar and D. labrax respectively (Table S2). Target genes were validated for each species using thermal gradient RT-PCR and the products that met the quality criteria were cloned into bacterial plasmids. ii) for the verification of RNA-Seq results, sequences of overlapping proactive-related mRNA transcripts (ppfibp1b and edrf1) across all three species were obtained from either the transcriptome assemblies (for S. salar and D. labrax) or from the UniProt database via unique identifiers (D. rerio) (Table S2). cDNA (2 μl) was used as a template for PCR with gene-specific primers. Target mRNAs were amplified using MyTaq HS DNA Polymerases (Bioline, UK), and amplicons were separated on 1% agarose gels, stained with ethidium bromide and purified with NucleoSpin® Gel and PCR Clean-up (MACHEREY-NAGEL, Germany). Purified PCR products were ligated in pGEM-T easy vector (Promega, USA) and transformed into E. coli (DH5a strain). One selected transformant of each construct was grown to obtain plasmid DNA (Miniprep kit, Macherey-Nagel). All constructs were verified by Sanger sequencing (GATC Biotech).
Absolute quantification was performed and the copy number of each transcript, derived from the standard dilution curve obtained from target plasmids was analyzed using a Thermocycler Stratagene Mx3005P (Agilent, USA). Each sample was tested in triplicate in a 96-well plate. The reaction mix (20 μl final volume) consisted of 10 μ of SYBR Green mix (Aligent, USA), 0.5 μl of each primer (20 μM), 7 μl of H2O and 2 μl of a 1/10 dilution of the cDNA sample. The thermocycling program consisted of one hold at 95 °C for 3 min, followed by three-step 35 cycles of 15 s at 95 °C, 10 s at 58 °C and 10 s at 72 °C. No template controls (NTCs) were used to assure no false positive signals were calculated.
Statistical analysis for behavioural and QPCR data
Exploratory analysis of the gene expression data in relation to the behavioural data was performed with specific software (AutoDiscovery®, Butler Scientifics). The approach evaluates Spearman’s Rank correlation coefficients for every pair of numerical variables and one-way ANOVAs for every pair of qualitative numerical variables within the consolidated data set in order to extract the most relevant relationships between the variables (Exploratory Data Analysis or EDA process). This was used as an exploratory analysis without significance corrections for false discovery rate (FDR) as the main objective was to suggest potential associations between multiple variables to be further explored. Correlations between all mRNAs, gene expression, tank, sex and weight were also performed. Data was tested for normality with a Kolmogorov-Smirnoff test and Levene’s test for homogeneity of variances. Non-normal data was log10 (var + 1) transformed. Tank effects on fish population total weight and gene copy number were also checked either with paired samples T-test or ANOVA tests. Post-hoc Scheffé or Dunnett test, for non-homogeneous variances, were performed for specific significances. A GLM ANCOVA was used to test for significant differences in gene expression between all individuals screened for AP in the populations studied. Dependent variable was Log copy number for each target gene with AP as the fixed factor. Tank and weight were used as co-variables.
To use the individual mRNA transcript abundance data to screen for AP we firstly analysed individuals using the K-mean cluster method, based on absolute mRNA transcript levels for each individual for both species. Optimal cluster numbers were chosen and different clusters were named following the cluster groupings. The criteria of classification was based on our previous data for zebrafish behavioural phenotypes. As a second approach we applied statistical comparisons between different personalities of both species following one-way ANOVA (individual grouping) or Two-way T-test. Significance was reported at P < 0.05. All analyses were performed with SPSS v19 (IBM®). Graphs were plotted using Prism5® for MacOS X, SPSSv19® or EXCEL® for Mac 2011 (v14.4.6).
The statistical difference of mean expression levels of gene markers among behavioural groups were calculated by R function “aov” (ANOVA) together with post hoc multiple pairwise comparisons by “TukeyHSD”. Significantly different groups were reported by LSD.test (R package “agricolae”) at p < 0.05. The Principle Component Analysis (PCA) of individual fish brains was performed by R function “prcomp” without normalization and scale.
cDNA libraries preparation and RNA seq
Only individuals tested to be proactive with coherence in both behavioural screening and absolute QPCR assays were classified as proactive. For control reference groups, individual RNA samples were pooled proportionally following 25% proactive, 25% reactive and 50% intermediate for both species. In total, 12 pools of total RNA were prepared; 3 Proactive pooled samples and 3 control pooled samples for both species (Table S3). All pooled RNA samples were quantified with Nanodrop (ND-1000) and a Qubit Fluorometer (Life Technology) respectively. The RNA integrity and quality were also assessed with a Bioanalyzer 2100 (Agilent Technologies), of which only samples with a RIN > 8 were further processed (Table S3). Total RNA (1 μg) for each pooled RNA sample was used for reverse transcription by SuperScript III RNase Transcriptase (Invitrogen) following the manufacturer’s protocol, with the only exception of reduced RNA fragmentation time to maximize obtention of longer reads . The cDNA libraries were constructed by TruSeq V2 kits (Illumina, CA, USA) and sequenced on a Illumina HiSeq 2500 platform at the Norwegian Sequencing Centre (raw reads: https://www.ncbi.nlm.nih.gov/bioproject/PRJNA543167; sequence depth approx.: 20 × 106 reads/library). The adapters (indexers) in the paired-end raw reads were trimmed out by the quality control tool Trim-Galore (a wrapper tool based on Cutadapt Version 1.4.2 and FastQC Version 0.10.1) for high throughput sequence data, as set up by default quality threshold of Q20. The FastQC reports from both before- and after-trimming were checked.
Genome guided de novo transcriptomic assembly and completeness estimation
To facilitate downstream analyses, the after-trimmed reads across all samples (both proactive and control) were concatenated together into a single dataset for follow-up assembly for S. salar and D. labrax respectively. Trinity was used to generate a de novo transcriptomic assembly. For each species, all ‘left’ reads were combined into a single file, and the same was applied for the ‘right’ reads. Trinity parameters were held as default (Version 2.0.6)  with the “genome_guided_max_intron” parameter set to 15,000 for S. salar and 10,000 for D. labrax. The reference genome sequence for S. salar was obtained from NCBI (Accession No.GCA_000233375.4, assembly ICSASG_v2) and D. labrax reference genome was obtained from the online database (http://seabass.mpipz.mpg.de).
Quantitative assessment of trinity assemblies for both S. salar and D. labrax assemblies were measured based on evolutionarily informed expectations of gene content from near-universal single-copy orthologs selected from OrthoDB by BUSCO (v1.1B1) (python3/3.5.0; gcc/5.2.0; emboss/6.5.7; hmmer/3.1b1). Additionally, assemblies were inspected by Translate (1.0.2) which assigned scores based on alignments of reads to assemblies (Figure S2 and Table S4).
Whole brain transcriptomic annotation and species-wide comparisons
For D. rerio, microarray data (Agilent V2, G2519F, 4x44K)  were re-analyzed as follows: the probe sequences were extracted, and the sequence annotation updated using both BLASTn (2.2.26) against RefSeq (GCF_000002035.5_GRCz10, 54,437 sequences) and BLASTx (2.2.26) against UniProt (GRCz10, 59,208 sequences). After background correction (control and low expressed probes removed) and normalization by R (3.2.2) package “limma” (3.24.15) only probes determined to be positively expressed (N > =4 arrays) in both control and proactive groups were retained for further analysis (Figure S3). To further complement the D. rerio transcriptome information the RNA-Seq dataset (GSE61108)  (Figure S4) was parsed into the D. rerio transcriptome increase and improve comparability with the brain transcriptomes of S. salar and D. labrax obtained on the same Illumina platform. To facilitate species-wide transcriptome comparison the gene identifiers from the two non-model species were substituted for zebrafish identifiers based on sequence similarity. For S. salar and D. labrax, both transcriptome assemblies were compared to the D. rerio UniProt database (GRCz10, 59,208 sequences) using BLASTx (2.2.26). The UniProt identifiers of all three species were used for the following comparisons: i) “gene names”, “Pfam ID numbers” and “GO (gene ontology) terms” were obtained based on their UniProt identifiers by Biomart (http://www.ensembl.org/biomart) respectively; ii) Venn diagram are used to visualized overlapping and/or unique genes and protein families (Pfam) across three teleost brain transcriptomes. iii) Functional GO terms classification was calculated by online software WEGO.
Proactive-related gene identification and cross species comparison
Gene expression changes between proactive and control groups were compared using two distinct methods for RNA-Seq, e.g. Agilent microarray hybridization and Illumina sequencing. For the first approach, the intensity value of hybridization for each probe across all D. rerio samples (N = 20) was measured by R package “limma”. For RNA-Seq the TMM-trimmed (trimmed mean of M values) FPKM (Fragments Per Kilobase of exon per Million reads) value of each transcript was calculated for samples (N = 6) for both S. salar and D. labrax using “edgeR”. For D. rerio, the differential expressed genes (DEGs) were calculated and compared, with threshold of FDR < 0.05 and log2“fold change” > 0 (N = 10 for each group). For S. salar and D. labrax, the expression abundance of transcripts were estimated as FPKM which were calculated by RSEM  and normalized by TMM. Finally a set of pair-wised differential expression analysis were conducted separately using the R package “edgeR” for both species following Trinity DEGs identification pipeline  with the same statistical threshold as that of D. rerio, i.e. FDR < 0.05 and log2 “fold change” > 0 (N = 3). In order to estimate sequence conservation and the consistency of expression patterns in each species the following comparisons were conducted. The numbers of overlapping “Gene names” of proactive-related DEGs from each species were compared and visualized using a venn diagram. Only those expressed in at least 2 species were labelled as conserved proactive-related mRNAs and were kept for the subsequent analysis. The BLASTp results (Identities % and E-values) of these highly conservative DEGs from both S. salar and D. labrax against D. rerio were analysed respectively and then log2-transformed fold changes of these highly conserved DEGs were extracted for comparison.
Functional gene ontology enrichment and network analysis
Functional gene ontology enrichment and network analysis were conducted using Cytoscape (v3.3.0) plugin ClueGO and CluePedia (v2.2.3) following standard pipelines . Briefly, proactive up- or down-regulated DEGs were separately analysed using the model species mode for D. rerio (Taxonomy ID:7955) for all three species. Numbers of associated genes for each GO term were setup differently according to the number of DEGs identified in each species (1% for D. labrax with 284 DEGs, 2% for S. salar with 462 DEGs and 7% for D. rerio with 2078 DEGs). GO levels from 3 to 20 were examined with the “GO Fusion” option chosen. Enriched/Depleted GOs were identified by two-sided hypergeometric test and with only adjusted Mid P-values (Benjamini-Hochberg) less than 0.05 being kept. The percentage for a Cluster to be significant was set at 60%. The name of each functional group (Overview term) was given by the leading term with the smallest P-value in the group. The network grouping Kappa Score threshold was 0.30. The GO annotation data base versions were: KEGG (15.01.2016, 5461), REACTOME (15.01.2016, 6173), GO_CellularComponent-Custom (08.01.2016_00h21,14,997), GO_MolecularFunction-Custom (08.01.2016_00h21, 15,812), GO_BiologicalProcess-Custom (08.01.2016_00h21, 15,294) and InterPro_ProteinDomains (21.03.2016, 14,291).