Pseudofam -- the pseudogene families database
What you can do:
A database of pseudogene families based on the protein families from the Pfam database.
Highlights:
- Pseudofam provides resources for analyzing the family structure of pseudogenes including query tools, statistical summaries and sequence alignments.
- It contains more than 125,000 pseudogenes identified from 10 eukaryotic genomes and aligned within nearly 3000 families (approximately one-third of the total families in Pfam-A).
- Pseudofam uses a large-scale parallelized homology search algorithm (implemented as an extension of the PseudoPipe pipeline) to identify pseudogenes.
- Each identified pseudogene is assigned to its parent protein family, and the pseudogenes within a family are then aligned to one another by transferring the parent domain alignments from the Pfam family.
- Pseudogenes are also annotated using an ontology that reflects their mode of creation and subsequent history.
- The annotation highlights the association of pseudogene families with genomic features, such as segmental duplications.
- In addition, pseudogene families are associated with key statistics, which identify outlier families with an unusual degree of pseudogenization.
- The statistics also show how the number of genes and pseudogenes in families correlates across different species.
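As an illustration of the alignment-transfer idea in the highlights above, here is a minimal Python sketch. It is not Pseudofam's actual code. It assumes that each pseudogene already has a gapped pairwise alignment to its parent protein and that the parent's row in the family alignment is known. The function name `transfer_alignment` and the toy sequences are hypothetical and are not taken from Pseudofam or Pfam.

```python
def transfer_alignment(parent_msa_row, parent_pw, pseudo_pw, gap="-"):
    """Project a pseudogene onto the columns of a family alignment.

    parent_msa_row: the parent protein's row in the family alignment (gapped).
    parent_pw, pseudo_pw: pairwise alignment of parent and pseudogene
        (two gapped strings of equal length).
    Pseudogene residues aligned to a gap in the parent (insertions relative
    to the parent) have no family column and are dropped in this sketch.
    """
    if len(parent_pw) != len(pseudo_pw):
        raise ValueError("pairwise alignment rows must have equal length")
    if parent_pw.replace(gap, "") != parent_msa_row.replace(gap, ""):
        raise ValueError("parent sequence differs between the two alignments")

    # Map each parent residue index to the pseudogene character aligned to it.
    aligned_to_parent = []
    for p, q in zip(parent_pw, pseudo_pw):
        if p != gap:
            aligned_to_parent.append(q)  # q may itself be a gap (deletion)

    # Walk the family row: copy gaps, substitute pseudogene characters elsewhere.
    out = []
    idx = 0
    for c in parent_msa_row:
        if c == gap:
            out.append(gap)
        else:
            out.append(aligned_to_parent[idx])
            idx += 1
    return "".join(out)


# Toy example with invented sequences.
row = transfer_alignment("MK-TAY-L", "MKTA-YL", "MR-AGYL")
print(row)
```

Because every pseudogene in a family is projected onto the same parent-derived columns, the resulting rows all have the same length and can be stacked directly, without running a new multiple alignment.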
Keywords:
- pseudogene
- non-coding sequence
- pseudogenization
Literature & Tutorials:
PubMed Link: Pseudofam: the pseudogene families database
This record last updated: 01-21-2009