Conserved domains on  [gi|42573129|ref|NP_974661|]
choice-of-anchor C domain protein, putative (Protein of unknown function, DUF642) [Arabidopsis thaliana]

Protein Classification

DUF642 domain-containing protein (domain architecture ID 11477412)

DUF642 domain-containing protein contains a conserved CGP sequence motif

List of domain hits

Name: PLN03089  Accession: PLN03089  Interval: 1-365  E-value: 0e+00
Description: hypothetical protein; Provisional

Pssm-ID: 215569 [Multi-domain]  Cd Length: 373  Bit Score: 701.33  E-value: 0e+00
                         10        20        30        40        50        60        70        80
                 ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 42573129    1 MKEMGVIVLLLLHSFFYVAFCFNDGLLPNGDFELGPRHSDMKGTQVINITAIPNWELSGFVEYIPSGHKQGDMILVVPKG 80
Cdd:PLN03089   4 MHSLLLLLLLLLCAAAASAAPVTDGLLPNGDFETPPKKSQMNGTVVIGKNAIPGWEISGFVEYISSGQKQGGMLLVVPEG 83
                         90       100       110       120       130       140       150       160
                 ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 42573129   81 AFAVRLGNEASIKQKISVKKGSYYSITFSAARTCAQDERLNVSVAPHHAVMPIQTVYSSSGWDLYSWAFKAQSDYADIVI 160
Cdd:PLN03089  84 AHAVRLGNEASISQTLTVTKGSYYSLTFSAARTCAQDESLNVSVPPESGVLPLQTLYSSSGWDSYAWAFKAESDVVNLVF 163
                        170       180       190       200       210       220       230       240
                 ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 42573129  161 HNPGVEEDPACGPLIDGVAMRALFPPRPTNKNILKNGGFEEGPWVLPNISSGVLIPPNSIDDHSPLPGWMVESLKAVKYI 240
Cdd:PLN03089 164 HNPGVEEDPACGPLIDAVAIKTLFPPRPTKDNLLKNGGFEEGPYVFPNSSWGVLLPPNIEDDTSPLPGWMIESLKAVKYI 243
                        250       260       270       280       290       300       310       320
                 ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 42573129  241 DSDHFSVPQGRRAVELVAGKESAVAQVVRTIPGKTYVLSFSVGDASNACAGSMIVEAFAGKDTIKVPYESKGKGGFKRSS 320
Cdd:PLN03089 244 DSAHFSVPEGKRAVELVSGKESAIAQVVRTVPGKSYNLSFTVGDANNGCHGSMMVEAFAGKDTQKVPYESQGKGGFKRAS 323
                        330       340       350       360
                 ....*....|....*....|....*....|....*....|....*
gi 42573129  321 LRFVAVSSRTRVMFYSTFYAMRNDDFSSLCGPVIDDVKLLSARRP 365
Cdd:PLN03089 324 LRFKAVSNRTRITFYSSFYHTKSDDFGSLCGPVVDDVRVVPVRAP 368
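
The alignment blocks above pair each query row (`gi 42573129`) with the matching row of the domain model. A minimal sketch of how percent identity could be computed from one such pair of rows follows. The function name `percent_identity` is hypothetical and not part of any NCBI tool. As simplifying assumptions, columns containing a gap character (`-`) are skipped and letter case is ignored. The example rows are the final block of the PLN03089 alignment (query residues 321-365).

```python
def percent_identity(query: str, subject: str) -> float:
    """Percent of aligned (non-gap) columns where both rows carry the same residue."""
    if len(query) != len(subject):
        raise ValueError("aligned rows must have the same number of columns")
    aligned = 0
    identical = 0
    for q, s in zip(query.upper(), subject.upper()):
        # Skip columns where either row has a gap.
        if q == "-" or s == "-":
            continue
        aligned += 1
        if q == s:
            identical += 1
    if aligned == 0:
        raise ValueError("no aligned columns to compare")
    return 100.0 * identical / aligned


# Last block of the PLN03089 alignment, copied from the listing above.
query_row = "LRFVAVSSRTRVMFYSTFYAMRNDDFSSLCGPVIDDVKLLSARRP"
subject_row = "LRFKAVSNRTRITFYSSFYHTKSDDFGSLCGPVVDDVRVVPVRAP"

if __name__ == "__main__":
    value = percent_identity(query_row, subject_row)
    print(f"{value:.1f}% identity over the final alignment block")
```

The same function can be applied block by block, or to the concatenated rows of a whole alignment, as long as both rows have equal length.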
 

Name: DUF642  Accession: pfam04862  Interval: 25-181  E-value: 4.05e-94
Description: Protein of unknown function (DUF642); This family represents a duplicated conserved region found in a number of uncharacterized plant proteins, potentially in the stem. There is a conserved CGP sequence motif.

Pssm-ID: 398500 [Multi-domain]  Cd Length: 157  Bit Score: 277.60  E-value: 4.05e-94
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 42573129    25 GLLPNGDFELGPRHSDMKGTQVINITAIPNWELSGFVEYIPSGHKQGDMILVVPKGAFAVRLGNEASIKQKISVKKGSYY 104
Cdd:pfam04862   1 GLLPNGDFETGPDPSNMKGTVLAGPNAIPGWTVTGFVEYIKSGQKQGDMYLQVPEGAHAVRLGNDASISQTFSVTPGSTY 80
                          90       100       110       120       130       140       150
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*..
gi 42573129   105 SITFSAARTCAQDERLNVSVAPHHAVMPIQTVYSSSGWDLYSWAFKAQSDYADIVIHNPGVEEDPACGPLIDGVAMR 181
Cdd:pfam04862  81 SLTFSAARTCAQDESLNVSVAPDSGVFPFQTLYSSSGWDSYAWAFKATGSVVTLVFHNPGVEEDPACGPLIDNVAIK 157
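
The DUF642 description mentions a conserved CGP sequence motif in a duplicated region. A minimal sketch of how the motif could be located in the query protein follows. The sequence is assembled from the `gi 42573129` rows of the PLN03089 alignment above (residues 1-365), and `find_motif` is a hypothetical helper, not an NCBI function. Positions are reported 1-based to match the interval numbering on this page.

```python
# Query sequence (gi 42573129), concatenated from the alignment rows above.
QUERY = (
    "MKEMGVIVLLLLHSFFYVAFCFNDGLLPNGDFELGPRHSDMKGTQVINITAIPNWELSGFVEYIPSGHKQGDMILVVPKG"
    "AFAVRLGNEASIKQKISVKKGSYYSITFSAARTCAQDERLNVSVAPHHAVMPIQTVYSSSGWDLYSWAFKAQSDYADIVI"
    "HNPGVEEDPACGPLIDGVAMRALFPPRPTNKNILKNGGFEEGPWVLPNISSGVLIPPNSIDDHSPLPGWMVESLKAVKYI"
    "DSDHFSVPQGRRAVELVAGKESAVAQVVRTIPGKTYVLSFSVGDASNACAGSMIVEAFAGKDTIKVPYESKGKGGFKRSS"
    "LRFVAVSSRTRVMFYSTFYAMRNDDFSSLCGPVIDDVKLLSARRP"
)


def find_motif(sequence: str, motif: str) -> list[int]:
    """Return the 1-based start positions of every occurrence of motif."""
    positions = []
    start = sequence.find(motif)
    while start != -1:
        positions.append(start + 1)  # convert 0-based index to 1-based residue number
        start = sequence.find(motif, start + 1)
    return positions


if __name__ == "__main__":
    for pos in find_motif(QUERY, "CGP"):
        # Show two residues of context on each side of the motif.
        context = QUERY[max(0, pos - 3): pos + 4]
        print(f"CGP at residue {pos}: ...{context}...")
```

For this query the motif starts at residues 171 and 350, which correspond to the `ACGPLIDGV` and `LCGPVIDDV` stretches in the alignments and fit the description of a duplicated conserved region.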

Name: choice_anch_C  Accession: TIGR04362  Interval: 26-178  E-value: 6.38e-05
Description: choice-of-anchor C domain; This family describes an extracellular bacterial domain that occurs on a number of proteins with PEP-CTERM (exosortase recognition site) sequences at the C-terminus, as well as some with an apparent alternate anchor sequence. Note that related pfam04862 (DUF642), as of release 26, is double the length of this model because it has two tandem regions homologous to this domain. pfam04862, in turn, belongs to a Pfam clan called the galactose-binding domain-like superfamily.

Pssm-ID: 275156 [Multi-domain]  Cd Length: 157  Bit Score: 42.74  E-value: 6.38e-05
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 42573129    26 LLPNGDFELGPRHSDmkGTQVINI--TAIPNWE-LSGFVEYIPSGHKQGDmilvvpkGAFAVRL-GNEA--SIKQKISVK 99
Cdd:TIGR04362   2 LITNGSFESGSDPGN--GFSTLSAgsSAITGWTvGSGSVDLINGYWQASE-------GSRSIDLnGTTGpgGISQTFNTV 72
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 42573129   100 KGSYYSITFSAARTCAQDERLN---VSVAPhhavMPIQTVYSSSG-------WDLYSWAFKAQSD-----YADIvihnpg 164
Cdd:TIGR04362  73 AGQTYRVTFDLAGNPDGGPGLKdltVSVGG----ASQDFSFDTTGkttanmgWTTKSFDFTATSTsttlsFTSL------ 142
                         170
                  ....*....|....
gi 42573129   165 vEEDPACGPLIDGV 178
Cdd:TIGR04362 143 -DNGGAWGPALDNV 155
 

Name: PLN03089  Accession: PLN03089  Interval: 23-188  E-value: 1.88e-08
Description: hypothetical protein; Provisional

Pssm-ID: 215569 [Multi-domain]  Cd Length: 373  Bit Score: 55.35  E-value: 1.88e-08
                         10        20        30        40        50        60        70        80
                 ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 42573129   23 NDGLLPNGDFELGP--------------RHSDMkgtqvinITAIPNW--ELSGFVEYIPSGHkqgdmiLVVPKGAFAVRL 86
Cdd:PLN03089 193 KDNLLKNGGFEEGPyvfpnsswgvllppNIEDD-------TSPLPGWmiESLKAVKYIDSAH------FSVPEGKRAVEL 259
                         90       100       110       120       130       140       150       160
                 ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 42573129   87 --GNEASIKQKISVKKGSYYSITFS---AARTC---------AQDERLNVSVAPHhavmpiqtvySSSGWDLYSWAFKAQ 152
Cdd:PLN03089 260 vsGKESAIAQVVRTVPGKSYNLSFTvgdANNGChgsmmveafAGKDTQKVPYESQ----------GKGGFKRASLRFKAV 329
                        170       180       190       200
                 ....*....|....*....|....*....|....*....|....*
gi 42573129  153 SDYADIV---------IHNPGVeedpACGPLIDGVAMRALFPPRP 188
Cdd:PLN03089 330 SNRTRITfyssfyhtkSDDFGS----LCGPVVDDVRVVPVRAPRA 370
 
Blast search parameters
Data Source: Precalculated data, version = cdd.v.3.21
Preset Options: Database: CDSEARCH/cdd   Low complexity filter: no   Composition Based Adjustment: yes   E-value threshold: 0.01
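
The E-value threshold above determines which domain hits are reported. A minimal sketch of that filtering step follows, using the four hits listed on this page. The `DomainHit` class and `passing_hits` function are hypothetical names used only for illustration. Treating the threshold as inclusive (`<=`) is an assumption, since the page does not say how a hit exactly at the threshold is handled.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DomainHit:
    name: str
    accession: str
    start: int
    end: int
    evalue: float


# Hits as listed on this page (name, accession, interval, E-value).
HITS = [
    DomainHit("PLN03089", "PLN03089", 1, 365, float("0e+00")),
    DomainHit("DUF642", "pfam04862", 25, 181, 4.05e-94),
    DomainHit("PLN03089", "PLN03089", 23, 188, 1.88e-08),
    DomainHit("choice_anch_C", "TIGR04362", 26, 178, 6.38e-05),
]


def passing_hits(hits: list[DomainHit], threshold: float = 0.01) -> list[DomainHit]:
    """Keep hits whose E-value is at or below the threshold (inclusive by assumption)."""
    return [hit for hit in hits if hit.evalue <= threshold]


if __name__ == "__main__":
    for hit in passing_hits(HITS):
        print(f"{hit.name} ({hit.accession}) {hit.start}-{hit.end}  E-value {hit.evalue:.2e}")
```

With the preset threshold of 0.01 all four hits are retained. A stricter cutoff such as 1e-06 would drop the TIGR04362 hit.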
