The porcine translational research database: a manually curated, genomics and proteomics-based research resource
© The Author(s). 2017
Received: 12 October 2016
Accepted: 2 August 2017
Published: 22 August 2017
The use of swine in biomedical research has increased dramatically in the last decade. Diverse genomic and proteomic databases have been developed to facilitate research using human and rodent models. Current porcine gene databases, however, lack the robust annotation needed to study pig models that are relevant to human studies and to compare them with rodent models. Furthermore, they contain a significant number of errors due to their primary reliance on machine-based annotation. To address these deficiencies, a comprehensive literature-based survey was conducted to identify genes that have demonstrated function in humans, mice or pigs.
The process identified 13,054 candidate human, bovine, mouse or rat genes/proteins used to select potential porcine homologs by searching multiple online sources of porcine gene information. The data in the Porcine Translational Research Database (http://www.ars.usda.gov/Services/docs.htm?docid=6065) are supported by >5800 references and comprise 65 data fields for each entry, including >9700 full-length (5′ and 3′) unambiguous pig sequences, >2400 real-time PCR assays and reactivity information on >1700 antibodies. The database also contains gene and/or protein expression data for >2200 genes and identifies and corrects 8187 errors (gene duplication artifacts, mis-assemblies, mis-annotations, and incorrect species assignments) for 5337 porcine genes.
This database is the largest manually curated database for any single veterinary species and is unique among porcine gene databases in regard to linking gene expression to gene function, identifying related gene pathways, and connecting data with other porcine gene databases. This database provides the first comprehensive description of three major Super-families or functionally related groups of proteins (Cluster of Differentiation (CD) Marker genes, Solute Carrier Superfamily, ATP binding Cassette Superfamily), and a comparative description of porcine microRNAs.
Swine are important models for human anatomy, nutrition, metabolism and immunology [1–3]. Their organs are anatomically and histologically similar to those of humans, as are their sensory innervation and blood supply. Pigs are naturally susceptible to infection with organisms that are closely related or identical to those species infecting humans, including helminths (Ascaris, Taenia, Trichuris, Trichinella, Schistosoma, Strongyloides), bacteria (Campylobacter, Chlamydia, Escherichia coli, Helicobacter, Neisseria, Mycoplasma, Salmonella), protozoans (Toxoplasma) and viruses (Coronavirus, Hepatitis E, Influenza, Nipah, Reovirus, Rotavirus) [2, 5, 6]. The last 10 years have seen a boom in the development of genetically modified pigs as models for human cardiovascular and lung disease, neurodegenerative and musculoskeletal disorders [7, 8] and cancer. There is also a robust effort to develop pigs as sources of organs and tissues for human xenotransplantation.
Despite these potential strengths as a model, the lack of an annotated database for porcine gene and protein expression data is a limiting factor for translating findings in one species to another. Multiple online databases exist for the storage and retrieval of diverse bovine, rodent or human biomedical data [11–19]. Other databases exist for Zebrafish (ZFIN), C. elegans (WormBase), and Drosophila melanogaster (FlyBase). Databases that encompass multispecies analysis, such as HomoloGene, and/or that rely on manual annotation, such as InnateDB, include bovine but not porcine genes. Several porcine genome companion databases exist; however, they lack robust manual annotation and are somewhat limited in scope or are infrequently updated [16–19]. AgBase, a large, multispecies functional analysis database, allows the user to search 51,489 porcine genes based on 12 criteria including gene and protein names (UniProt) and Gene Ontology (GO) annotations. Furthermore, databases can contain a significant number of errors due to their primary reliance on machine-based annotation. For example, the SUS-BAR database is designed to identify protein orthologs based upon data that include annotations from the machine-annotated NCBI genome. NCBI has recently begun to include GO annotations in curated entries for species other than human and rodent, but most of these are indirect and often based on observations made in other species. As swine are an important model for comparative human studies, there is a critical need for a centralized, manually curated source of information for biomedical research. To address these needs, we created the Porcine Translational Research Database.
Construction and Content
To generate content of immunological relevance, broad-based literature searches were conducted using the following terms: apoptosis, B cell development or activation, CD markers, chemokines, chemokine receptors, cytokines, cytokine receptors, dendritic cells, type 1 IFN induced genes, inflammation, nuclear factor kappa-light-chain-enhancer of activated B cells (NF-κB) signaling pathway, toll receptor signaling pathway, T cell development or activation, Th1 cell development and Th2 cell development. In addition, immunologically related genes associated with the susceptibility to or pathology of allergy, asthma, arthritis, atherosclerosis and inflammation were included. The Gene Ontology Consortium’s community annotation wikis for immunology, cardiovascular disease and muscle biology were also searched (http://wiki.geneontology.org/index.php/Main_Page). The Jackson Laboratory database of knockout mouse phenotypes was searched for genes leading to defects in immune or metabolic phenotypes when over- or under-expressed. These genes include the vast majority of genes that are related to immunity and inflammation [2, 3, 25, 26]. For additional metabolically related genes, genes involved in the transport or metabolism of macronutrients, trace vitamins and minerals were searched. Other genes associated with the susceptibility to or pathology of atherosclerosis, diabetes, and obesity were also identified. This process identified 13,054 candidate human, bovine, mouse or rat genes/proteins of interest used to select potential porcine orthologs by searching various online sources of porcine gene information. One-to-one orthology of protein-coding genes was determined by protein structure similarity (best reciprocal BLAST hits) and the presence of a corresponding gene in the syntenic region of the human and/or mouse genome.
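The reciprocal-best-hit criterion mentioned above can be illustrated with a minimal sketch. This is not the pipeline used to build the database; the hit tables, gene names and scores below are invented toy data, and the synteny check described in the text is not modeled.

```python
from typing import Dict, Tuple

# Each table maps a query ID to (best subject ID, bit score).
# In practice these would be parsed from BLAST output; here they are toy data.
Hits = Dict[str, Tuple[str, float]]


def reciprocal_best_hits(a_vs_b: Hits, b_vs_a: Hits) -> Dict[str, str]:
    """Return {a_id: b_id} for pairs in which each sequence is the other's best hit."""
    pairs = {}
    for a_id, (b_id, _score) in a_vs_b.items():
        back = b_vs_a.get(b_id)
        if back is not None and back[0] == a_id:
            pairs[a_id] = b_id
    return pairs


# Hypothetical identifiers, for illustration only.
human_vs_pig = {
    "HUMAN_GENE_A": ("pig_seq_1", 950.0),
    "HUMAN_GENE_B": ("pig_seq_2", 400.0),
    "HUMAN_GENE_C": ("pig_seq_2", 380.0),  # competes with HUMAN_GENE_B
}
pig_vs_human = {
    "pig_seq_1": ("HUMAN_GENE_A", 948.0),
    "pig_seq_2": ("HUMAN_GENE_B", 402.0),
}

orthologs = reciprocal_best_hits(human_vs_pig, pig_vs_human)
print(orthologs)
```

In this toy case HUMAN_GENE_C is excluded because its best pig hit points back to a different human gene, which is the situation that arises in expanded gene families where no 1:1 assignment is possible.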
No 1:1 orthology could be established for members of some gene families, including the Leukocyte Immunoglobulin-like Receptor (LILR), Killer Cell Immunoglobulin-like Receptor (KIR), Carcinoembryonic antigen-related cell adhesion molecule (CEACAM) and Cytochrome P450 superfamilies. One-to-one porcine orthologs of human genes use the approved HGNC name, in accordance with the International Society for Animal Genetics (ISAG) publishing guidelines. We defined pseudogenes by the criteria used by Ensembl and GENCODE, namely the presence of one or more stop codons in the open reading frame that disrupt the protein structure, and (usually) a lack of intron structure at the genome level. Pseudogenes are further classified into Processed, Duplicated, Unitary or Polymorphic categories.
Current Database Statistics (07/12/2017)
Number of Entries
Number of Full-Length RNA Sequences (5′ and 3′ Representation)
Number of Genes with Full-Length RNA Sequences
Dawson Lab Full Length Submissions to Genbank
Percent of Genome in Database with RNA Sequences
Number of Protein Coding Genes with Full-Length RNA Sequences
Number of Protein Coding Gene Splice Variants
Number of Genes in Database with Full-Length Protein Sequences
Percent of Genome in Database with Protein Sequences
Percent of Proteins in Database with Full-Length RNA Sequences
Number of Unigene Numbers Assigned
Percentage of Entries with a Unigene Assignment
Entries with a Unigene Assignment
Entries without a Unigene Assignment
Number of NCBI Loci Represented
Percentage of Entries with a NCBI loci Assignment
Entries with a NCBI loci Assignment
Entries without a NCBI loci Assignment
Number and Types of Errors Located in Publicly Available Porcine Databases
Number of Errors
Number of Entries with Errors
Number of Genes not Identified in Ensembl Build 10.2
Missing from Genome
Present but not Annotated
Artifactually Duplicated Loci
Functional Annotations for 1041 Protein-Coding Genes that are Missing from Ensembl build 10.2
GOTERM MF DIRECT
Nucleic acid binding transcription factor activity
GOTERM MF DIRECT
Interleukin-1 receptor binding
GOTERM MF DIRECT
Ly-6 antigen / uPA receptor-like
DNA binding HTH domain, Psq-type
CENP-B N-terminal DNA-binding domain
RNA biosynthetic process
Based upon gene number estimates from other closely related species such as human and cow, we estimated that our database has a coverage rate of approximately 42% of the porcine genome. These represent sequences found in 10,232 Unigene entries (1.45 per gene) and 9967 NCBI loci (5756 are single loci that are not duplicated gene artifacts or split into multiple loci, and 1793 genes have multiple (4211) loci). A total of 2109 and 1616 of the genes have no assigned Unigene number or NCBI locus, respectively. In addition to GO and Kyoto Encyclopedia of Genes and Genomes (KEGG) annotations, literature-based functional annotations (derived from more than 5500 references) are provided for these sequences. We have also discovered a relatively large number (178) of porcine or artiodactyl-specific paralogs (Additional file 3) for 104 protein-coding or non-protein-coding porcine genes. For genes with multiple paralogs, genes are named in the order of phylogenetic distance from the parent human or bovine gene. Some of these genes are expressed pseudogenes. Some have been previously discussed (e.g., CD36, IL1B [25, 31]) or will be discussed in the following sections.
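The locus counts quoted above can be cross-checked with simple arithmetic. The sketch below uses only the figures given in the text and assumes that each single locus corresponds to exactly one gene; the variable names are ours.

```python
# Figures quoted in the text.
single_loci = 5756        # loci that are neither duplicated nor split
multi_locus_genes = 1793  # genes represented by more than one NCBI locus
multi_loci = 4211         # loci belonging to those multi-locus genes
reported_total_loci = 9967

# The two locus classes should add up to the reported total.
total_loci = single_loci + multi_loci
print(total_loci == reported_total_loci)

# Average number of loci assigned to each multi-locus gene.
mean_loci_per_multi_gene = multi_loci / multi_locus_genes
print(round(mean_loci_per_multi_gene, 2))

# Genes with at least one locus, under the one-locus-one-gene assumption.
genes_with_locus = single_loci + multi_locus_genes
print(genes_with_locus)
```

Under that assumption, genes with duplicated or split loci carry on average slightly more than two loci each.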
Extensive Gene Fragmentation/Truncation Frequently Occurs Among Proteins of Extreme Size
# of Exons
In previous studies, we extensively compared porcine, human and mouse genes related to immunity and inflammation [2, 3, 25, 26]. In the following section, we summarize our findings for three major Superfamilies or functionally related groups of proteins (CD marker genes, Solute Carrier Superfamily, ATP binding Cassette Superfamily) and for non-coding RNA (microRNA) that have complete or nearly complete representation. CD markers (accessible as a group by entering CD markers in the Annotations field) encode a heterogeneous group of cell surface proteins. The Human Leucocyte Differentiation Antigen (HLDA) workshop has designated 408 molecules (some of which are grouped within a CD) as CD markers. Based upon our assembly and analysis, we could establish 1:1 orthology for 357 porcine genes to those that compose HLDA version 10. Forty-three genes are not present in the porcine genome or could not be designated as 1:1 orthologs. Of these, nine genes (CLEC4C, CLEC4M, SIGLEC7, BTN3A1, LILRA1, LAIR2, PSG1, SIRPG, TNFRSF10C) are primate-specific [35–37]. KLRC2 (CD159c) is found in humans and rodents but not pigs. FCGR2C is a human-specific gene/pseudogene that belongs to a family of three low-affinity immunoglobulin gamma Fc receptors (CD32). We have determined that pigs have two members of this family that roughly correspond to FCGR2A and FCGR2B. TNFRSF14 (CD270) is a marker for B cells, dendritic cells, monocytes, and Treg cells found in humans and rodents, but not cows. Although canine, feline, equine and ursine homologs have been identified, this gene may be a pseudogene in pigs as the putative ORF is interrupted by an endogenous retroviral sequence (H. Dawson, unpublished). FCRL2 (CD307b) is a marker for B cells in humans. Although sequences corresponding to FCRL2 have been identified in other mammals including dog and horse, no mouse ortholog has been identified. This gene shows evidence of positive selection in humans and is most likely a pseudogene in pigs.
Due to rapid evolution and post-speciation gene duplication, no 1:1 orthology could be established for most mouse and pig LILR or KIR family members, including LILRA4 (CD85G) and LILRB4 (CD85K). Similarly, other than CEACAM1 (CD66) and CEACAM6 (CD66C), no 1:1 orthology could be established for most pig and mouse CEACAM family members (CEACAM3 (CD66D), CEACAM5 (CD66E)). CEACAM8 (CD67) may be a pseudogene, as ESTs in Unigene Ssc.60435 predict a 243 amino acid protein interrupted by several stop codons. CEACAM8 and CEACAM6 were previously determined to have no direct murine orthologs. Several other shared human-pig CD marker orthologs (ADGRE2 (CD312), ADGRE3 (CD313r), CD1A, CD1E, CR1 (CD35), CD58, FCGR2A (CD32), FCAR (CD89), FCRL3 (CD307c), FCRL4 (CD307d), ICAM3 (CD50), NCR2 (CD336), NCR3 (CD337) and TLR10 (CD290r)) have no rodent orthologs [2, 40, 43].
A significant number of errors were discovered in genes encoding porcine orthologs of human CD markers: 25 are not present in Ensembl build 10.2, 88 of the proteins are truncated and 52 are duplicated gene artifacts. Sixty-seven full-length mRNA sequences encoding proteins, assembled from macrophage RNA-Seq reads, have been deposited in Genbank. An additional 79 in silico constructs are provided. Antibody data, gathered from publications and manufacturers or generated in house, are provided for 186 proteins and include 395 monoclonal and 285 polyclonal antibodies. Additional cross-reactivity is expected for 29 proteins because they are >95% similar to human proteins. Several members of the CD marker family also belong to other gene families, including the Solute Carrier and ATP binding Cassette Superfamilies.
The Human Genome Organization’s gene nomenclature committee (HGNC) has assigned 395 genes to the Solute Carrier Superfamily; 21 are pseudogenes and 374 encode proteins (accessible as a group by entering Solute Carrier Superfamily in the Annotations field). These are organized into 52 subfamilies; about 25% are dedicated to nutrient transport. The porcine Solute Carrier Superfamily contains 398 protein-coding members and all human subfamilies are represented. Forty-two of these genes are present in other porcine genomes but missing from Ensembl build 10.2, 113 are truncated and 58 are duplicated gene artifacts. Sixty-three full-length mRNA sequences, assembled from macrophage RNA-Seq reads, have been deposited in Genbank and an additional 159 in silico constructs are provided. Forty-two of these genes are missing from all porcine genomes or are present as pseudogenes. Among these genes are UCP1 (thermogenin), a protein involved in non-shivering thermogenesis and a pseudogene in pigs, and SLC52A2, a primate-specific riboflavin transporter. Other species-specific genes include eight primate-specific genes (SLC2A14, SLC22A24, SLC35E2, SLC35G3, SLC35G4, SLC35G5, SLCO1B1, SLCO1B7), one human-specific gene (SLC22A25) and 14 mouse or rodent-specific genes (Slc6a20b, Slc7a12, Slc21a4, Slc22a19, Slc22a21, Slc22a22, Slc22a26, Slc22a27, Slc22a28, Slc22a29, Slc22a30, Slco1a1, Slco1b2, and Slco6b1). SLC25A18 is present in human and rodent genomes but is missing from bovine and porcine genomes. SLC25A52 is present in primate and rat genomes but not mouse. SLC9C2 is a pseudogene in mouse. SLC22A31 is an expressed pseudogene in pigs and is missing in rodents. SLC22A11 is an expressed pseudogene in pigs and a non-expressed pseudogene in mouse. Lastly, SLC23A4, an intestinal nucleobase transporter, is a pseudogene in humans but is present in pig, cow and rodent genomes.
Several porcine or artiodactyl-specific gene expansions are found in subfamilies (Additional file 3), including SLC7A3 (14 members), SLC7A13 (3 members), SLC22A6 (2 members), SLC22A10 (4 members) and SLC47A1 (2 members). The biological functions of these paralogs remain to be determined; however, the parent genes are involved in amino acid (SLC7A3, SLC7A13) or dipeptide transport (SLC22A6) [48, 49].
The HGNC has assigned 51 genes to the ATP binding Cassette Superfamily; three are pseudogenes and 48 encode proteins (accessible as a group by entering ATP binding Cassette Superfamily in the Annotations field). These are organized into seven subfamilies (A-G); about 20% are dedicated to nutrient (e.g., carotenoid, cholesterol and vitamin A) transport. The porcine ATP binding Cassette Superfamily contains 57 members and all human subfamilies are represented. Five of these genes are present in other porcine genomes but missing from Ensembl build 10.2, 21 are truncated, and 18 are duplicated gene artifacts. Eleven full-length mRNA sequences, assembled from macrophage RNA-Seq reads, have been deposited in Genbank and an additional 24 in silico constructs are provided.
An analysis of this superfamily revealed that ABCC11 has no murine ortholog and ABCA8 has no direct rodent ortholog, as the gene has diverged into two paralogs, Abca8a and Abca8b. ABCC4, a prostaglandin E2 transporter, has diverged from the parent gene into five paralogs (ABCC4L1, ABCC4L2, ABCC4L3, ABCC4L4 and ABCC4L5) (Additional file 3). ABCA10, involved in human macrophage cholesterol transport, is a pseudogene in rodents. It may be an expressed pseudogene in pigs, as the predicted protein is half (787 amino acids) the size of human ABCA10 (1543 amino acids), and weak expression (by RNA-Seq) was detected in macrophages and moderate expression in intestine (H. Dawson, unpublished). ABCA17 is an expressed pseudogene in humans and pigs. As with the Solute Carrier Superfamily, most of the genes in the ATP binding Cassette Superfamily have not been characterized at the functional level. Nevertheless, the similarities and differences in the Solute Carrier and ATP binding Cassette Superfamilies affect the suitability of rodents and pigs as models for human drug and nutrient transport and metabolism.
The exact number of microRNAs in the porcine genome is unknown. There are 4272 annotated microRNAs in the human genome (build 30). Several papers describe the measurement of porcine microRNAs in various tissues or estimate the number in the porcine genome [54–57], and there are three partially overlapping sources of porcine microRNA sequences. There are only 382, 385 and 816 (non-redundant) annotated pig miRNA sequences in miRBase, the NCBI gene build, and Ensembl build 10.2, respectively. These three sources of information have a significant amount of overlap (Fig. 5a). We have consolidated this information and added our own predicted sequences, based on conserved sequence identity to 1900 human, mouse or bovine sequences, to provide 1033 non-redundant porcine microRNA sequences (accessible as a group by entering MicroRNA in the Annotations field). Of note, all of the sequences found in miRBase were found in the NCBI gene build, 59 of the microRNA sequences in Ensembl were found to be duplicated artifacts, and 214 of the 1033 sequences are not present in the current Ensembl gene build (10.2). This includes 81 that we have predicted based upon their presence in other species and other unfinished porcine genomes. We discovered the following species- or genus-specific microRNAs: pigs (454), humans (199), primates (111), bovine (179), mouse (76) and rodents (20). Many of the porcine-specific microRNAs have arisen from biological duplication/expansion (Additional file 3). A comparison of microRNAs that are present in pigs and shared with at least one of three other species (humans, cows, and mice) revealed that 318 microRNAs are shared among the four species, 107 are shared between pigs, humans and cows but not mice, and 34 are shared between pigs, mice and cows but not humans (Fig. 5b).
Thus, among microRNAs that are not conserved across all four species, roughly three times as many are shared between humans and pigs as between mice and pigs.
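The comparison above reduces to a ratio of two of the reported counts, and the underlying categories are plain set operations. The sketch below first computes the ratio from the published figures and then shows, on invented toy identifiers rather than real microRNA names, how such categories can be derived from per-species sets.

```python
# Counts reported in the text (Fig. 5b).
pig_human_cow_not_mouse = 107
pig_mouse_cow_not_human = 34

ratio = pig_human_cow_not_mouse / pig_mouse_cow_not_human
print(round(ratio, 2))

# Toy sets illustrating the partition; identifiers are hypothetical.
pig = {"miR-a", "miR-b", "miR-c", "miR-d"}
human = {"miR-a", "miR-b"}
cow = {"miR-a", "miR-b", "miR-c"}
mouse = {"miR-a", "miR-c"}

shared_by_all_four = pig & human & cow & mouse
shared_not_mouse = (pig & human & cow) - mouse
shared_not_human = (pig & mouse & cow) - human
pig_only = pig - (human | cow | mouse)

print(shared_by_all_four, shared_not_mouse, shared_not_human, pig_only)
```

With the reported counts the ratio is a little above three, in line with the statement in the text.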
The Porcine Translational Research Database is so named because of its unique utility in translating findings made in rodents to pigs and those made in pigs to humans. A comprehensive literature-based survey was conducted to identify genes that have demonstrated function in humans, mice or pigs. The resulting data in the database are documented by >6000 references. The database currently contains 65 data fields for each entry. Our efforts to improve the genome and its annotation are similar to other efforts, for example the sequencing of 12,000 genes to supplement annotation of the pig genome [32, 33, 58] and de novo assembly of multiple pig genomes to reveal 1737 protein-coding genes that are missing from Ensembl build 10.2. The online supplemental data from the latter manuscript were unavailable at the time of the preparation of this manuscript, so no comparison could be made. The manual assembly of >9700 RNA sequences has direct practical implications for genomics-based analysis. The state of the current genome build (mis-annotations, duplication artifacts, and missing sequences) effectively prohibits its use for aligning RNA-Seq reads. We have used these sequences to compare gene expression separately from Ensembl 10.2 and have also compared the number of reads obtained from the corresponding templates in Ensembl 10.2. For the great majority of transcripts compared, as expected, our full-length sequences provided a higher level of sensitivity than the corresponding Ensembl sequences (H. Dawson, unpublished).
The full 5′ and 3′ representation of each gene will also allow for characterization of regulatory regions and miRNA target sites. In our estimation, >40% of transcripts in Ensembl or NCBI genomes do not represent the full-length gene. Our efforts will also allow for further consolidation of porcine Unigene numbers. Currently, each gene is represented by 0 to >10 Unigene assignments, and >10% of genes have more than one.
It is significant that we discovered a large number of errors (about 30% of entries) in the publicly available sequence databases (these can be accessed by searching the “Notes Field” using the word “error” (Fig. 3)). In addition to the duplication artifacts, mis-annotations and missing genes, we also encountered a number of RNA sequences in publicly available archives that belong to other species. For example, human (AHR, AF233432.1), panda (IL2, NM_001199892.1) and rat (NUDT14, ESTs in Unigene Ssc.85635) RNA sequences are annotated as porcine derived. We also found contaminating DNA from completely unrelated species. For example, about 1/5 of porcine chromosome 4 clone CU076066.6 is from Zebrafish. These sequences represent 6 Zebrafish genes (LOC100003615, LOC447815, LOC108179932, LOC108183883, LOC108183971, and LOC103910681) and are annotated as porcine genes by Ensembl build 10.2 (ENSSSCG00000006223) and NCBI genomes (LOC100739857). Similarly, several NCBI loci (ASNA1L*, LOC100737282, LOC100737202, LOC100620149, LOC100737282) and one Ensembl locus (ENSSSCG00000026988) are derived from contaminating Babesia bigemina genomic DNA.
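The keyword search of the Notes field described above can be mimicked offline. The sketch below assumes a hypothetical tab-delimited export with columns named Gene and Notes; the database itself is queried through its web interface, and the export layout, gene names and note texts here are invented for illustration.

```python
import csv
import io

# Hypothetical export; column names and contents are assumptions.
export = io.StringIO(
    "Gene\tNotes\n"
    "GENE1\tEnsembl error: truncated protein\n"
    "GENE2\tFull-length sequence confirmed\n"
    "GENE3\tNCBI Error: duplicated gene artifact\n"
)

rows = list(csv.DictReader(export, delimiter="\t"))

# Case-insensitive match on the word used to flag problem entries.
flagged = [row["Gene"] for row in rows if "error" in row["Notes"].lower()]
fraction_flagged = len(flagged) / len(rows)

print(flagged)
```

The same filter applied to a real export would give the share of entries carrying an error note, which the text reports as about 30%.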
We have discovered several sources of systematic errors in the Ensembl and NCBI gene/protein prediction or annotation pipelines. For example, all selenoproteins in Ensembl are truncated because the codon for selenocysteine (UGA) is translated as a stop codon. We and others have identified a systematic error in the identification of another gene family, the Taste receptor, type 2 (TAS2R) Superfamily. Although these genes are intronless and mostly devoid of 5′ and 3′ UTR regions, Ensembl consistently fails to recognize them as genes. These data illustrate the critical importance of the manual curation process in reducing errors.
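The selenoprotein truncation described above can be reproduced with a toy translation routine. This is a minimal sketch, not the Ensembl or NCBI pipeline: the coding sequence is invented, and the positions of selenocysteine codons are passed in by hand (the `sec_codons` parameter is our assumption), whereas in a real transcript recoding depends on a SECIS element in the mRNA.

```python
from itertools import product

# Standard genetic code in the conventional TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(cds, sec_codons=frozenset()):
    """Translate a DNA coding sequence, stopping at the first stop codon.

    Codon indices (0-based) listed in sec_codons are read as
    selenocysteine (U) when the codon is TGA.
    """
    protein = []
    for i in range(0, len(cds) - 2, 3):
        codon = cds[i:i + 3].upper()
        if codon == "TGA" and (i // 3) in sec_codons:
            protein.append("U")
            continue
        residue = CODON_TABLE[codon]
        if residue == "*":
            break
        protein.append(residue)
    return "".join(protein)


toy_cds = "ATGGCTTGAGGTTAA"  # invented: Met-Ala-Sec-Gly-stop

naive = translate(toy_cds)                    # TGA read as stop
recoded = translate(toy_cds, sec_codons={2})  # TGA at codon 2 read as Sec

print(naive, recoded)
```

The naive translation ends at the in-frame TGA and yields a shortened product, which is the pattern of truncation reported for selenoproteins.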
We believe that this is the largest manually curated database for any veterinary species and that its informatics are unique among databases targeting a veterinary species in regard to linking gene expression to gene function, identifying related gene pathways, and connecting with other porcine gene databases, as well as providing information on reagents that measure gene and protein expression. In addition, it is the largest source of centralized antibody information for the pig. Any database must be updated frequently in order to be useful. Currently the database is updated monthly and we anticipate expanding the content to include all porcine genes. Several Superfamilies of genes will be the next targets of our efforts. One is the GPCR Superfamily; its exact size is still unknown, but nearly 800 different human genes (or ~4% of the entire protein-coding genome) have been predicted to code for GPCRs. We will also continue to develop and annotate new assays. We intend to include our own prediction analysis of the promoter and 3′ UTR regions for transcription factor and microRNA binding sites. Lastly, we intend to synchronize our database with the porcine “Snowball” array and porcine gene expression atlas.
Supported by USDA/ARS Project Plan 1235-51,000-055-00D.
Availability and requirements
The dataset(s) supporting the conclusions of this article are included within the article, its additional files (Additional files 1, 2, 3 and 4) and within the online database (http://www.ars.usda.gov/Services/docs.htm?docid=6065).
HDD, CC, BG and JS contributed to the content of the database. HDD and JFU wrote the manuscript. All authors read and approved the final manuscript.
The authors declare that they have no competing interests.
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
- Miller ER, Ullrey DE. The pig as a model for human nutrition. Annu Rev Nutr. 1987;7:361–82.View ArticlePubMedGoogle Scholar
- Dawson HD: A comparative assessment of the pig, mouse and human genomes. In: The Minipig in Biomedical Research. Boca Raton: CRC Press; 2011: 323-342.Google Scholar
- Groenen MA, Archibald AL, Uenishi H, Tuggle CK, Takeuchi Y, Rothschild MF, Rogel-Gaillard C, Park C, Milan D, Megens HJ, et al. Analyses of pig genomes provide insight into porcine demography and evolution. Nature. 2012;491(7424):393–8.View ArticlePubMedPubMed CentralGoogle Scholar
- Roth JA, Tuggle CK. Livestock models in translational medicine. ILAR J. 2015;56(1):1–6.View ArticlePubMedGoogle Scholar
- Meurens F, Summerfield A, Nauwynck H, Saif L, Gerdts V. The pig: a model for human infectious diseases. Trends Microbiol. 2012;20(1):50–7.View ArticlePubMedGoogle Scholar
- Schautteet K, Vanrompay D. Chlamydiaceae infections in pig. Vet Res. 2011;42:29.View ArticlePubMedPubMed CentralGoogle Scholar
- Holm IE, Alstrup AK, Luo Y. Genetically modified pig models for neurodegenerative disorders. J Pathol. 2016;238(2):267–87.View ArticlePubMedGoogle Scholar
- Selsby JT, Ross JW, Nonneman D, Hollinger K. Porcine models of muscular dystrophy. ILAR J. 2015;56(1):116–26.View ArticlePubMedPubMed CentralGoogle Scholar
- Flisikowska T, Kind A, Schnieke A. Pigs as models of human cancers. Theriogenology. 2016;86(1):433–7.View ArticlePubMedGoogle Scholar
- Klymiuk N, Aigner B, Brem G, Wolf E. Genetic modification of pigs as organ donors for xenotransplantation. Mol Reprod Dev. 2010;77(3):209–21.PubMedGoogle Scholar
- Kelley J, de Bono B, Trowsdale J. IRIS: a database surveying known human immune system genes. Genomics. 2005;85(4):503–11.View ArticlePubMedGoogle Scholar
- Schonbach C, Koh JL, Flower DR, Brusic V. An update on the functional molecular immunology (FIMM) database. Appl Bioinforma. 2005;4(1):25–31.View ArticleGoogle Scholar
- Lefranc MP. IMGT, the international ImMunoGeneTics information system(R): a standardized approach for immunogenetics and immunoinformatics. Immunome Res. 2005;1(1):3.View ArticlePubMedPubMed CentralGoogle Scholar
- Grimes GR, Moodie S, Beattie JS, Craigon M, Dickinson P, Forster T, Livingston AD, Mewissen M, Robertson KA, Ross AJ, et al. GPX-macrophage expression atlas: a database for expression profiles of macrophages challenged with a variety of pro-inflammatory, anti-inflammatory, benign and pathogen insults. BMC Genomics. 2005;6:178.View ArticlePubMedPubMed CentralGoogle Scholar
- Korber B, LaBute M, Yusim K. Immunoinformatics comes of age. PLoS Comput Biol. 2006;2(6):e71.View ArticlePubMedPubMed CentralGoogle Scholar
- Uenishi H, Eguchi T, Suzuki K, Sawazaki T, Toki D, Shinkai H, Okumura N, Hamasima N, Awata T. PEDE (pig EST data explorer): construction of a database for ESTs derived from porcine full-length cDNA libraries. Nucleic Acids Res. 2004;32(Database issue):D484–8.View ArticlePubMedPubMed CentralGoogle Scholar
- McCarthy FM, Wang N, Magee GB, Nanduri B, Lawrence ML, Camon EB, Barrell DG, Hill DP, Dolan ME, Williams WP, et al. AgBase: a functional genomics resource for agriculture. BMC Genomics. 2006;7(1):229.View ArticlePubMedPubMed CentralGoogle Scholar
- Ruan J, Guo Y, Li H, Hu Y, Song F, Huang X, Kristiensen K, Bolund L, Wang J. PigGIS: pig genomic informatics system. Nucleic Acids Res. 2007;35(Database issue):D654–7.View ArticlePubMedGoogle Scholar
- Piovesan D, Profiti G, Martelli PL, Fariselli P, Fontanesi L, Casadio R. SUS-BAR: a database of pig proteins with statistically validated structural and functional annotation. Database. 2013;2013:bat065.View ArticlePubMedPubMed CentralGoogle Scholar
- Howe DG, Bradford YM, Eagle A, Fashena D, Frazer K, Kalita P, Mani P, Martin R, Moxon ST, Paddock H, et al. The Zebrafish model organism database: new support for human disease models, mutation details, gene expression phenotypes and searching. Nucleic Acids Res. 2017;45(D1):D758–68.View ArticlePubMedGoogle Scholar
- Howe KL, Bolt BJ, Cain S, Chan J, Chen WJ, Davis P, Done J, Down T, Gao S, Grove C, et al. WormBase 2016: expanding to enable helminth genomic research. Nucleic Acids Res. 2016;44(D1):D774–80.View ArticlePubMedGoogle Scholar
- Drysdale R, FlyBase C. FlyBase : a database for the drosophila research community. Methods Mol Biol. 2008;420:45–59.View ArticlePubMedGoogle Scholar
- Breuer K, Foroushani AK, Laird MR, Chen C, Sribnaia A, Lo R, Winsor GL, Hancock RE, Brinkman FS, Lynn DJ. InnateDB: systems biology of innate immunity and beyond--recent updates and continuing curation. Nucleic Acids Res. 2013;41(Database issue):D1228–33.View ArticlePubMedGoogle Scholar
- Rhee SY, Wood V, Dolinski K, Draghici S. Use and misuse of the gene ontology annotations. Nat Rev Genet. 2008;9(7):509–15.View ArticlePubMedGoogle Scholar
- Dawson HD, Loveland JE, Pascal G, Gilbert JG, Uenishi H, Mann KM, Sang Y, Zhang J, Carvalho-Silva D, Hunt T, et al. Structural and functional annotation of the porcine immunome. BMC Genomics. 2013;14:332.View ArticlePubMedPubMed CentralGoogle Scholar
- Dawson HD, Smith AD, Chen C, Urban JF Jr. An in-depth comparison of the porcine, murine and human inflammasomes; lessons from the porcine genome and transcriptome. Vet Microbiol. 2016;Google Scholar
- Pei B, Sisu C, Frankish A, Howald C, Habegger L, Mu XJ, Harte R, Balasubramanian S, Tanzer A, Diekhans M, et al. The GENCODE pseudogene resource. Genome Biol. 2012;13(9):R51.View ArticlePubMedPubMed CentralGoogle Scholar
- Thomas JW, Prasad AB, Summers TJ, Lee-Lin SQ, Maduro VV, Idol JR, Ryan JF, Thomas PJ, McDowell JC, Green ED. Parallel construction of orthologous sequence-ready clone contig maps in multiple species. Genome Res. 2002;12(8):1277–85.
- Heckel T, Schmucki R, Berrera M, Ringshandl S, Badi L, Steiner G, Ravon M, Kung E, Kuhn B, Kratochwil NA, et al. Functional analysis and transcriptional output of the Gottingen minipig genome. BMC Genomics. 2015;16:932.
- Li M, Chen L, Tian S, Lin Y, Tang Q, Zhou X, Li D, Yeung CK, Che T, Jin L, et al. Comprehensive variation discovery and recovery of missing sequence in the pig genome using multiple de novo assemblies. Genome Res. 2016.
- Mathew DJ, Newsom EM, Guyton JM, Tuggle CK, Geisert RD, Lucy MC. Activation of the transcription factor nuclear factor-kappa B in uterine luminal epithelial cells by interleukin 1 Beta 2: a novel interleukin 1 expressed by the elongating pig conceptus. Biol Reprod. 2015;92(4):107.
- International Human Genome Sequencing Consortium. Finishing the euchromatic sequence of the human genome. Nature. 2004;431(7011):931–45.
- Church DM, Goodstadt L, Hillier LW, Zody MC, Goldstein S, She X, Bult CJ, Agarwala R, Cherry JL, DiCuccio M, et al. Lineage-specific biology revealed by a finished genome assembly of the mouse. PLoS Biol. 2009;7(5):e1000112.
- Clark G, Stockinger H, Balderas R, van Zelm MC, Zola H, Hart D, Engel P. Nomenclature of CD molecules from the tenth human Leucocyte differentiation antigen workshop. Clin Translational Immunol. 2016;5(1):e57.
- Kelley J, Walter L, Trowsdale J. Comparative genomics of natural killer cell receptor gene clusters. PLoS Genet. 2005;1(2):129–39.
- van Beek EM, Cochrane F, Barclay AN, van den Berg TK. Signal regulatory proteins in the immune system. J Immunol. 2005;175(12):7781–7.
- Crocker PR, Paulson JC, Varki A. Siglecs and their roles in the immune system. Nat Rev Immunol. 2007;7(4):255–66.
- Su K, Wu J, Edberg JC, McKenzie SE, Kimberly RP. Genomic organization of classical human low-affinity Fcgamma receptor genes. Genes Immun. 2002;3(Suppl 1):S51–6.
- Tao R, Wang L, Murphy KM, Fraser CC, Hancock WW. Regulatory T cell expression of herpesvirus entry mediator suppresses the function of B and T lymphocyte attenuator-positive effector T cells. J Immunol. 2008;180(10):6649–55.
- Davis RS. Fc receptor-like molecules. Annu Rev Immunol. 2007;25:525–60.
- Barreiro LB, Quintana-Murci L. From evolutionary genetics to human immunology: how selection shapes host defence genes. Nat Rev Genet. 2010;11(1):17–30.
- Kang X, Kim J, Deng M, John S, Chen H, Wu G, Phan H, Zhang CC. Inhibitory leukocyte immunoglobulin-like receptors: immune checkpoint proteins and tumor sustaining factors. Cell Cycle. 2016;15(1):25–40.
- Jacobson AC, Weis JH. Comparative functional evolution of human and mouse CR1 and CR2. J Immunol. 2008;181(5):2953–9.
- Berg F, Gustafson U, Andersson L. The uncoupling protein 1 gene (UCP1) is disrupted in the pig lineage: a genetic explanation for poor thermoregulation in piglets. PLoS Genet. 2006;2(8):e129.
- Yonezawa A, Inui K. Novel riboflavin transporter family RFVT/SLC52: identification, nomenclature, functional characterization and genetic diseases of RFVT/SLC52. Mol Asp Med. 2013;34(2-3):693–701.
- Fuster DG, Alexander RT. Traditional and emerging roles for the SLC9 Na+/H+ exchangers. Pflugers Archiv. 2014;466(1):61–76.
- Yamamoto S, Inoue K, Murata T, Kamigaso S, Yasujima T, Maeda JY, Yoshida Y, Ohta KY, Yuasa H. Identification and functional characterization of the first nucleobase transporter in mammals: implication in the species difference in the intestinal absorption mechanism of nucleobases and their analogs between higher primates and other mammals. J Biol Chem. 2010;285(9):6522–31.
- Ito K, Groudine M. A new member of the cationic amino acid transporter family is preferentially expressed in adult mouse brain. J Biol Chem. 1997;272(42):26780–6.
- Hagos Y, Burckhardt G, Burckhardt BC. Human organic anion transporter OAT1 is not responsible for glutathione transport but mediates transport of glutamate derivatives. Am J Physiol Renal Physiol. 2013;304(4):F403–9.
- Shimizu H, Taniguchi H, Hippo Y, Hayashizaki Y, Aburatani H, Ishikawa T. Characterization of the mouse Abcc12 gene and its transcript encoding an ATP-binding cassette transporter, an orthologue of human ABCC12. Gene. 2003;310:17–28.
- Annilo T, Chen ZQ, Shulenin S, Dean M. Evolutionary analysis of a cluster of ATP-binding cassette (ABC) genes. Mamm Genome. 2003;14(1):7–20.
- Kochel TJ, Fulton AM. Multiple drug resistance-associated protein 4 (MRP4), prostaglandin transporter (PGT), and 15-hydroxyprostaglandin dehydrogenase (15-PGDH) as determinants of PGE2 levels in cancer. Prostaglandins Other Lipid Mediators. 2015;116-117:99–103.
- Wenzel JJ, Kaminski WE, Piehler A, Heimerl S, Langmann T, Schmitz G. ABCA10, a novel cholesterol-regulated ABCA6-like ABC transporter. Biochem Biophys Res Commun. 2003;306(4):1089–98.
- Bao H, Kommadath A, Plastow GS, Tuggle CK, Guan le L, Stothard P. MicroRNA buffering and altered variance of gene expression in response to Salmonella infection. PLoS One. 2014;9(4):e94352.
- Sharbati S, Friedlander MR, Sharbati J, Hoeke L, Chen W, Keller A, Stahler PF, Rajewsky N, Einspanier R. Deciphering the porcine intestinal microRNA transcriptome. BMC Genomics. 2010;11:275.
- Anthon C, Tafer H, Havgaard JH, Thomsen B, Hedegaard J, Seemann SE, Pundhir S, Kehr S, Bartschat S, Nielsen M, et al. Structured RNAs and synteny regions in the pig genome. BMC Genomics. 2014;15:459.
- Paczynska P, Grzemski A, Szydlowski M. Distribution of miRNA genes in the pig genome. BMC Genet. 2015;16:6.
- Uenishi H, Morozumi T, Toki D, Eguchi-Ogawa T, Rund LA, Schook LB. Large-scale sequencing based on full-length-enriched cDNA libraries in pigs: contribution to annotation of the pig genome draft sequence. BMC Genomics. 2012;13:581.
- Freeman TC, Ivens A, Baillie JK, Beraldi D, Barnett MW, Dorward D, Downing A, Fairbairn L, Kapetanovic R, Raza S, et al. A gene expression atlas of the domestic pig. BMC Biol. 2012;10:90.