iUUCD 2.0 - integrated annotations for ubiquitin and ubiquitin-like conjugation database

※ Data Statistics for iUUCD 2.0 and UUCD 1.0:

Content	iUUCD 2.0	UUCD 1.0
Known data
E1	27	26
E2	109	105
E3	1153	1003
DUB	164	148
UBDs	396	N/A
ULDs	183	N/A
Total	1895	1282
Data integration
Data size	32.1GB	0.41GB
Families	74	30
HMM profiles	58	23
Species	148	70
Total entries	136,512	56,949
Integrated databases	68	2
Regulator types	E1, E2, E3, DUB, UBDs and ULDs	E1, E2, E3 and DUB
Integrated information	Basic information, functional domain, protein sequence, gene sequence, cancer mutation, single nucleotide polymorphism (SNP), mRNA expression, DNA & RNA element, protein-protein interaction, protein structure, disease-associated variation, drug-target relation, post-translational modifications (PTMs), DNA methylation and protein expression	Basic information, functional domain and protein sequence
Retrieval mode	Browse by species, browse by families, simple search, batch search, advance search, BLAST search and HMM search	Browse by species, browse by families, simple search, advance search, BLAST search and HMM search

※ USAGE:

iUUCD is a comprehensive database designed to be convenient to use. It is provided as an online service, and this USAGE section describes that service. iUUCD offers three groups of functions: browse, search and advance options.

1. Browse. In the middle of the "browse" webpage, click one of the three pictures to browse by classification or by species of animals, plants or fungi. A page is then shown with a detailed tree view and a large picture of the phylogenetic tree. Click a name in the tree view on the left or a small logo in the picture on the right to browse each family or species.

2. Search. You can input one or multiple keywords (separated by a space character) to search iUUCD. The search fields include iUUCD ID, UUCD1 ID, UniProt ID, Ensembl Protein ID, Ensembl Gene ID, Ensembl Transcript ID, Protein Name, Gene Name and Family.

EXAMPLE: Please click on the "Example" button to search for "MDM2" in the Gene Name field. By clicking on the "Search" button, the related iUUCD proteins will be shown.

3. Advance Options. Four advance options are provided, including batch search, advance search, BLAST search, and HMM search.

(1) Batch search. You can input a list of keywords to search for several proteins in iUUCD at once. The search fields include iUUCD ID, UUCD1 ID, Ensembl Protein ID, Ensembl Gene ID, Ensembl Transcript ID, UniProt Accession, Gene Name, Protein Name and Family.

You can click on the "Example" button to load an instance. By clicking on the "Submit" button, all entries matching Ensembl Gene IDs such as "ENSG00000182866; ENSG00000169967; ENSG00000163558" will be shown.
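A batch query like the one above is a semicolon-separated list of identifiers. As a minimal sketch of how such a list could be checked before it is pasted into the form, the following Python splits the string and labels each identifier by its Ensembl prefix. The helper names `split_batch` and `ensembl_kind` are hypothetical and not part of iUUCD; the prefix rule (ENSG for genes, ENST for transcripts, ENSP for proteins) is Ensembl's convention for human stable IDs, and the pattern below covers only that human form.

```python
import re

# Human Ensembl stable IDs: ENSG = gene, ENST = transcript, ENSP = protein.
# IDs of other species carry an extra species code and are not matched here.
_ENSEMBL = re.compile(r"^ENS([GTP])\d{11}$")
_KIND = {"G": "gene", "T": "transcript", "P": "protein"}

def split_batch(query):
    """Split a semicolon-separated batch query into clean identifiers."""
    return [part.strip() for part in query.split(";") if part.strip()]

def ensembl_kind(identifier):
    """Return 'gene', 'transcript' or 'protein', or None for anything else."""
    match = _ENSEMBL.match(identifier)
    return _KIND[match.group(1)] if match else None

ids = split_batch("ENSG00000182866; ENSG00000169967; ENSG00000163558")
kinds = [ensembl_kind(i) for i in ids]
```

Here `kinds` labels all three identifiers as genes, which is a quick way to confirm that the keywords suit the search field selected in the form.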

(2) Advance search. You can input up to three terms to search for more specific information. Query fields can be left empty if fewer terms are needed. The three terms can be connected by the following operators:

exclude: the term following this operator must not be contained in the specified field(s)
and: the term following this operator must be contained in the specified field(s)
or: either the term preceding or the term following this operator should occur in the specified field(s)
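
The way these operators combine terms can be sketched in a few lines of Python. This is an illustration only, not iUUCD's implementation: the function names, the case-insensitive substring matching and the strict left-to-right evaluation are all assumptions, and the record is made up for the example.

```python
def term_matches(record, field, term):
    """Case-insensitive substring match of a term in one field (assumed rule)."""
    return term.lower() in str(record.get(field, "")).lower()

def advanced_match(record, terms):
    """Evaluate (operator, field, term) triples from left to right.

    Empty terms are skipped, mirroring query fields that are left empty.
    Left-to-right evaluation is an assumption, not documented behaviour.
    """
    result = None
    for operator, field, term in terms:
        if not term:
            continue
        hit = term_matches(record, field, term)
        if result is None:
            # First non-empty term: only "exclude" changes its meaning.
            result = (not hit) if operator == "exclude" else hit
        elif operator == "and":
            result = result and hit
        elif operator == "or":
            result = result or hit
        elif operator == "exclude":
            result = result and not hit
        else:
            raise ValueError(f"unknown operator: {operator}")
    return bool(result)

# Made-up record for illustration only.
record = {
    "Protein Name": "E3 ubiquitin-protein ligase Mdm2",
    "Species": "Homo sapiens",
    "Gene Name": "MDM2",
}
query = [
    (None, "Protein Name", "E3"),
    ("and", "Species", "Homo sapiens"),
    ("exclude", "Gene Name", "TP53"),
]
found = advanced_match(record, query)
```

With this query the record is kept, because the first two terms match and the excluded term does not occur in the gene name.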

EXAMPLE: You can click on the "Example" button to load an instance that searches for an E3 ligase in Homo sapiens. By clicking on the "Submit" button, the H. sapiens MDM2 (IUUC-Hsa-046376) will be shown.

(3) BLAST search. You can input a sequence to find a specific protein and/or related homologues by sequence alignment. This search option helps you find the queried protein quickly and accurately. Only one protein sequence in FASTA format is allowed at a time. The E-value threshold can be user-defined, and the species can be specified. The default E-value and species are 0.01 and H. sapiens, respectively.

EXAMPLE: You can click on the "Example" button to load the protein sequence of human F-box/WD repeat-containing protein 1A. By clicking on the "Submit" button, you can find the related homologues in H. sapiens.
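
Because only one FASTA sequence is accepted at a time, it can help to check the input locally before submitting it. The sketch below is a hypothetical helper, not part of iUUCD: `parse_single_fasta` and the two constants are illustrative names, and the sequence is a toy placeholder rather than a real protein.

```python
DEFAULT_EVALUE = 0.01           # default E-value stated in the text above
DEFAULT_SPECIES = "H. sapiens"  # default species stated in the text above

def parse_single_fasta(text):
    """Return (header, sequence) if text holds exactly one FASTA record."""
    lines = [line.strip() for line in text.strip().splitlines() if line.strip()]
    if not lines or not lines[0].startswith(">"):
        raise ValueError("FASTA input must start with a '>' header line")
    if sum(line.startswith(">") for line in lines) != 1:
        raise ValueError("only one sequence is allowed at a time")
    sequence = "".join(lines[1:]).upper()
    if not sequence:
        raise ValueError("the record contains no sequence")
    return lines[0][1:], sequence

# Toy placeholder input, not a real protein sequence.
header, sequence = parse_single_fasta(">toy_protein\nMDPA\nEAVL\n")
```

A second `>` header in the input raises a `ValueError`, which mirrors the one-sequence restriction described above.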

(4) HMM search. You can find a specific domain or motif in a protein sequence with the HMM algorithm. This search option helps you locate the exact positions of the queried domains. Only one protein sequence in FASTA format is allowed at a time. The score threshold is internally defined. There is no single threshold shared by all domain-based HMM profiles; each threshold corresponds to a particular domain model. For instance, the score threshold of the Cullin domain is 196.1, and the score threshold of the HECT domain is 95.8.
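
The idea of a separate score threshold per domain model can be sketched as a simple lookup. The following Python is an illustration under stated assumptions, not iUUCD's code: `passing_hits` is a hypothetical name, the hit tuples are made up, and treating a score equal to the threshold as passing is an assumption. Only the two threshold values are taken from the text above.

```python
# Score thresholds quoted in the text; other domains would need their own values.
SCORE_THRESHOLDS = {"Cullin": 196.1, "HECT": 95.8}

def passing_hits(hits, thresholds=SCORE_THRESHOLDS):
    """Keep (domain, score, start, end) hits that reach their domain's threshold.

    Hits for domains without a known threshold are dropped.
    """
    kept = []
    for domain, score, start, end in hits:
        cutoff = thresholds.get(domain)
        if cutoff is not None and score >= cutoff:
            kept.append((domain, score, start, end))
    return kept

# Made-up hits for illustration only.
hits = [
    ("Cullin", 250.3, 10, 620),
    ("HECT", 80.0, 500, 840),
    ("HECT", 120.5, 510, 845),
]
kept = passing_hits(hits)
```

The HECT hit scoring 80.0 is discarded even though the same score would be judged against a different cutoff for another domain, which is the point of keeping one threshold per model.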

※ Frequently Asked Questions:

A: If you input a protein that can potentially act in both systems, we will display all the possible roles it can play.

1. Q: What is the difference between "Reviewed" and "Unreviewed"? (Status)

A: A protein is annotated as reviewed or unreviewed according to whether its function has been reported in the literature. For example, if a protein is clearly reported to have a specific function, such as binding ubiquitin chains, we mark it as "Reviewed". If a protein is identified by prediction but is not supported by a publication, we annotate it as "Unreviewed".

2. Q: I have a few questions that are not listed above. How can I contact the authors of iUUCD?

A: Please contact Jiaqi Zhou, Yang Xu, Shaofeng Lin and Dr. Yu Xue for details.