Amino acid dipeptide frequency for Mycobacterium phage SamScheppers

Apart from single amino acid frequencies, one can also calculate the so-called amino acid dipeptide frequency.

There are 400 possible dipeptides (441 if Xaa is counted as an additional, 21st amino acid representing all non-standard or unknown amino acids). Thus, if dipeptides occurred randomly in proteins, with all amino acids equally frequent, each dipeptide would be present at a frequency of 0.25% (2.5‰). As this is not the case in nature, for better visibility all dipeptides that are more frequent than expected are marked in red, and those that are underrepresented are marked in blue in the table.
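The calculation described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the code used by Proteome-pI: the function names are hypothetical, dipeptides are assumed to be counted as overlapping pairs within each protein, and any letter outside the 20 standard amino acids is mapped to X (Xaa).

```python
from collections import Counter

# One-letter codes of the 20 standard amino acids.
STANDARD = "ACDEFGHIKLMNPQRSTVWY"

# Uniform expectation: 1 of 400 dipeptides, i.e. 0.25% or 2.5 per mille.
EXPECTED_PER_MILLE = 1000.0 / 400


def dipeptide_counts(proteins):
    """Count overlapping dipeptides within each protein (assumption)."""
    counts = Counter()
    for seq in proteins:
        seq = seq.upper()
        for a, b in zip(seq, seq[1:]):
            # Map non-standard or unknown residues to X (Xaa).
            a = a if a in STANDARD else "X"
            b = b if b in STANDARD else "X"
            counts[a + b] += 1
    return counts


def dipeptide_per_mille(proteins):
    """Return observed dipeptide frequencies in per mille."""
    counts = dipeptide_counts(proteins)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {pair: 1000.0 * n / total for pair, n in counts.items()}


def classify(freq_per_mille):
    """Label a frequency relative to the uniform expectation."""
    if freq_per_mille > EXPECTED_PER_MILLE:
        return "over"
    if freq_per_mille < EXPECTED_PER_MILLE:
        return "under"
    return "expected"
```

With this sketch, the AlaAla value from the table (19.234‰) would be classified as overrepresented and CysCys (0.161‰) as underrepresented.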

All values are given in per mille (‰) and therefore need to be multiplied by 10⁻³ to obtain fractions.
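As a small worked example of the unit conversion, the following sketch converts a per mille value to a fraction and to a percentage. The helper names are hypothetical and chosen only for illustration.

```python
def per_mille_to_fraction(value):
    """Convert a per mille value to a plain fraction (multiply by 10^-3)."""
    return value / 1000.0


def per_mille_to_percent(value):
    """Convert a per mille value to a percentage."""
    return value / 10.0
```

For instance, the AlaAla entry of 19.234‰ corresponds to a fraction of about 0.019234, and 2.5‰ corresponds to 0.25%.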

For more information, see the sequence space article on Wikipedia.

Ala Cys Asp Glu Phe Gly His Ile Lys Leu Met Asn Pro Gln Arg Ser Thr Val Trp Tyr Xaa
Ala
AlaAla: 19.234 ± 1.441
AlaCys: 1.021 ± 0.255
AlaAsp: 7.683 ± 0.564
AlaGlu: 11.175 ± 1.237
AlaPhe: 3.062 ± 0.604
AlaGly: 10.208 ± 1.189
AlaHis: 2.418 ± 0.412
AlaIle: 4.997 ± 0.607
AlaLys: 3.331 ± 0.332
AlaLeu: 10.53 ± 0.727
AlaMet: 2.794 ± 0.394
AlaAsn: 3.224 ± 0.438
AlaPro: 5.695 ± 0.65
AlaGln: 5.211 ± 0.519
AlaArg: 8.918 ± 0.97
AlaSer: 4.943 ± 0.491
AlaThr: 5.373 ± 0.74
AlaVal: 9.241 ± 0.735
AlaTrp: 2.149 ± 0.329
AlaTyr: 3.062 ± 0.472
AlaXaa: 0.0 ± 0.0
Cys
CysAla: 1.504 ± 0.356
CysCys: 0.161 ± 0.097
CysAsp: 0.806 ± 0.206
CysGlu: 0.645 ± 0.181
CysPhe: 0.322 ± 0.13
CysGly: 1.236 ± 0.31
CysHis: 0.269 ± 0.13
CysIle: 0.591 ± 0.162
CysLys: 0.484 ± 0.173
CysLeu: 0.698 ± 0.235
CysMet: 0.107 ± 0.07
CysAsn: 0.215 ± 0.104
CysPro: 0.86 ± 0.207
CysGln: 0.376 ± 0.151
CysArg: 1.182 ± 0.29
CysSer: 0.967 ± 0.202
CysThr: 0.376 ± 0.162
CysVal: 0.322 ± 0.141
CysTrp: 0.107 ± 0.074
CysTyr: 0.0 ± 0.0
CysXaa: 0.0 ± 0.0
Asp
AspAla: 7.199 ± 0.521
AspCys: 0.645 ± 0.217
AspAsp: 6.984 ± 0.772
AspGlu: 5.587 ± 0.539
AspPhe: 1.88 ± 0.309
AspGly: 6.877 ± 0.669
AspHis: 1.182 ± 0.346
AspIle: 1.504 ± 0.265
AspLys: 2.095 ± 0.314
AspLeu: 5.587 ± 0.55
AspMet: 1.504 ± 0.315
AspAsn: 1.773 ± 0.284
AspPro: 4.298 ± 0.417
AspGln: 1.558 ± 0.29
AspArg: 5.426 ± 0.735
AspSer: 2.794 ± 0.567
AspThr: 3.492 ± 0.366
AspVal: 4.406 ± 0.574
AspTrp: 0.752 ± 0.174
AspTyr: 1.075 ± 0.216
AspXaa: 0.0 ± 0.0
Glu
GluAla: 7.898 ± 0.879
GluCys: 0.752 ± 0.224
GluAsp: 2.74 ± 0.356
GluGlu: 1.773 ± 0.354
GluPhe: 0.913 ± 0.282
GluGly: 4.029 ± 0.374
GluHis: 1.666 ± 0.421
GluIle: 1.934 ± 0.473
GluLys: 2.686 ± 0.4
GluLeu: 8.059 ± 0.743
GluMet: 1.289 ± 0.197
GluAsn: 1.075 ± 0.251
GluPro: 3.17 ± 0.491
GluGln: 2.74 ± 0.402
GluArg: 5.265 ± 0.853
GluSer: 2.633 ± 0.477
GluThr: 3.009 ± 0.367
GluVal: 4.943 ± 0.607
GluTrp: 1.289 ± 0.192
GluTyr: 1.773 ± 0.262
GluXaa: 0.0 ± 0.0
Phe
PheAla: 2.525 ± 0.343
PheCys: 0.215 ± 0.1
PheAsp: 3.277 ± 0.436
PheGlu: 1.88 ± 0.387
PhePhe: 0.537 ± 0.155
PheGly: 3.277 ± 0.427
PheHis: 0.698 ± 0.235
PheIle: 1.075 ± 0.26
PheLys: 0.752 ± 0.198
PheLeu: 1.827 ± 0.302
PheMet: 0.591 ± 0.164
PheAsn: 0.913 ± 0.228
PhePro: 0.967 ± 0.299
PheGln: 0.806 ± 0.265
PheArg: 1.666 ± 0.343
PheSer: 1.128 ± 0.259
PheThr: 1.988 ± 0.369
PheVal: 2.471 ± 0.284
PheTrp: 0.376 ± 0.137
PheTyr: 0.376 ± 0.122
PheXaa: 0.0 ± 0.0
Gly
GlyAla: 8.059 ± 1.105
GlyCys: 1.021 ± 0.229
GlyAsp: 5.856 ± 0.717
GlyGlu: 6.017 ± 0.625
GlyPhe: 2.256 ± 0.391
GlyGly: 8.328 ± 1.019
GlyHis: 1.558 ± 0.39
GlyIle: 3.438 ± 0.578
GlyLys: 3.868 ± 0.478
GlyLeu: 6.34 ± 1.001
GlyMet: 1.612 ± 0.278
GlyAsn: 3.546 ± 0.64
GlyPro: 3.385 ± 0.52
GlyGln: 3.546 ± 0.476
GlyArg: 6.017 ± 0.644
GlySer: 4.943 ± 0.786
GlyThr: 5.856 ± 0.731
GlyVal: 5.856 ± 0.61
GlyTrp: 1.719 ± 0.321
GlyTyr: 2.471 ± 0.42
GlyXaa: 0.0 ± 0.0
His
HisAla: 1.719 ± 0.32
HisCys: 0.215 ± 0.098
HisAsp: 1.128 ± 0.256
HisGlu: 0.86 ± 0.191
HisPhe: 0.698 ± 0.176
HisGly: 2.203 ± 0.507
HisHis: 0.698 ± 0.171
HisIle: 0.806 ± 0.239
HisLys: 0.698 ± 0.206
HisLeu: 1.558 ± 0.279
HisMet: 0.376 ± 0.118
HisAsn: 0.484 ± 0.146
HisPro: 1.075 ± 0.238
HisGln: 0.698 ± 0.181
HisArg: 2.418 ± 0.392
HisSer: 0.806 ± 0.234
HisThr: 1.075 ± 0.229
HisVal: 2.256 ± 0.426
HisTrp: 0.376 ± 0.134
HisTyr: 0.591 ± 0.204
HisXaa: 0.0 ± 0.0
Ile
IleAla: 4.352 ± 0.543
IleCys: 0.215 ± 0.1
IleAsp: 3.546 ± 0.39
IleGlu: 3.224 ± 0.484
IlePhe: 0.86 ± 0.187
IleGly: 4.298 ± 0.628
IleHis: 0.322 ± 0.127
IleIle: 1.182 ± 0.282
IleLys: 1.612 ± 0.501
IleLeu: 2.256 ± 0.296
IleMet: 0.322 ± 0.112
IleAsn: 1.719 ± 0.325
IlePro: 1.88 ± 0.358
IleGln: 0.537 ± 0.24
IleArg: 2.847 ± 0.407
IleSer: 1.236 ± 0.262
IleThr: 3.331 ± 0.475
IleVal: 3.6 ± 0.447
IleTrp: 0.591 ± 0.192
IleTyr: 0.967 ± 0.238
IleXaa: 0.0 ± 0.0
Lys
LysAla: 5.265 ± 0.604
LysCys: 0.752 ± 0.229
LysAsp: 1.719 ± 0.33
LysGlu: 0.806 ± 0.181
LysPhe: 0.806 ± 0.182
LysGly: 2.042 ± 0.281
LysHis: 1.128 ± 0.286
LysIle: 1.236 ± 0.287
LysLys: 0.752 ± 0.272
LysLeu: 2.901 ± 0.293
LysMet: 1.021 ± 0.224
LysAsn: 1.075 ± 0.237
LysPro: 2.471 ± 0.345
LysGln: 1.666 ± 0.298
LysArg: 2.418 ± 0.418
LysSer: 1.773 ± 0.312
LysThr: 1.988 ± 0.386
LysVal: 2.149 ± 0.343
LysTrp: 0.484 ± 0.144
LysTyr: 0.698 ± 0.169
LysXaa: 0.0 ± 0.0
Leu
LeuAla: 11.497 ± 0.72
LeuCys: 0.86 ± 0.202
LeuAsp: 7.146 ± 0.711
LeuGlu: 2.149 ± 0.288
LeuPhe: 2.418 ± 0.329
LeuGly: 6.931 ± 0.575
LeuHis: 1.128 ± 0.291
LeuIle: 3.385 ± 0.432
LeuLys: 2.364 ± 0.487
LeuLeu: 6.178 ± 0.548
LeuMet: 1.504 ± 0.309
LeuAsn: 2.74 ± 0.408
LeuPro: 4.406 ± 0.532
LeuGln: 2.901 ± 0.497
LeuArg: 7.199 ± 0.628
LeuSer: 5.319 ± 0.487
LeuThr: 5.587 ± 0.521
LeuVal: 5.641 ± 0.554
LeuTrp: 1.558 ± 0.354
LeuTyr: 2.364 ± 0.401
LeuXaa: 0.0 ± 0.0
Met
MetAla: 1.719 ± 0.255
MetCys: 0.0 ± 0.0
MetAsp: 0.645 ± 0.178
MetGlu: 0.591 ± 0.188
MetPhe: 0.645 ± 0.205
MetGly: 1.289 ± 0.256
MetHis: 0.43 ± 0.154
MetIle: 1.021 ± 0.241
MetLys: 0.698 ± 0.185
MetLeu: 1.988 ± 0.366
MetMet: 0.322 ± 0.133
MetAsn: 0.913 ± 0.218
MetPro: 0.967 ± 0.241
MetGln: 0.484 ± 0.156
MetArg: 1.236 ± 0.261
MetSer: 1.88 ± 0.364
MetThr: 2.095 ± 0.344
MetVal: 1.182 ± 0.234
MetTrp: 0.43 ± 0.141
MetTyr: 0.322 ± 0.138
MetXaa: 0.0 ± 0.0
Asn
AsnAla: 3.653 ± 0.492
AsnCys: 0.269 ± 0.116
AsnAsp: 1.451 ± 0.266
AsnGlu: 1.128 ± 0.193
AsnPhe: 0.913 ± 0.307
AsnGly: 2.847 ± 0.432
AsnHis: 0.322 ± 0.129
AsnIle: 0.913 ± 0.26
AsnLys: 0.86 ± 0.205
AsnLeu: 2.471 ± 0.465
AsnMet: 0.645 ± 0.164
AsnAsn: 0.806 ± 0.189
AsnPro: 2.847 ± 0.437
AsnGln: 0.86 ± 0.212
AsnArg: 2.31 ± 0.457
AsnSer: 1.182 ± 0.308
AsnThr: 1.666 ± 0.282
AsnVal: 2.847 ± 0.417
AsnTrp: 0.537 ± 0.153
AsnTyr: 0.752 ± 0.205
AsnXaa: 0.0 ± 0.0
Pro
ProAla: 9.026 ± 0.851
ProCys: 0.322 ± 0.102
ProAsp: 3.062 ± 0.491
ProGlu: 4.298 ± 0.475
ProPhe: 1.827 ± 0.296
ProGly: 6.662 ± 0.569
ProHis: 1.289 ± 0.344
ProIle: 2.686 ± 0.297
ProLys: 1.504 ± 0.254
ProLeu: 3.761 ± 0.506
ProMet: 0.484 ± 0.167
ProAsn: 1.182 ± 0.265
ProPro: 3.976 ± 0.566
ProGln: 1.88 ± 0.355
ProArg: 3.116 ± 0.522
ProSer: 2.901 ± 0.44
ProThr: 3.868 ± 0.449
ProVal: 4.298 ± 0.536
ProTrp: 0.645 ± 0.161
ProTyr: 1.719 ± 0.286
ProXaa: 0.0 ± 0.0
Gln
GlnAla: 5.91 ± 0.729
GlnCys: 0.537 ± 0.184
GlnAsp: 1.612 ± 0.28
GlnGlu: 1.343 ± 0.275
GlnPhe: 0.967 ± 0.233
GlnGly: 2.095 ± 0.363
GlnHis: 1.236 ± 0.267
GlnIle: 1.075 ± 0.238
GlnLys: 1.236 ± 0.224
GlnLeu: 3.277 ± 0.502
GlnMet: 0.806 ± 0.214
GlnAsn: 0.806 ± 0.232
GlnPro: 2.74 ± 0.367
GlnGln: 1.88 ± 0.396
GlnArg: 2.901 ± 0.37
GlnSer: 1.451 ± 0.233
GlnThr: 1.558 ± 0.306
GlnVal: 2.042 ± 0.37
GlnTrp: 0.86 ± 0.173
GlnTyr: 0.913 ± 0.214
GlnXaa: 0.0 ± 0.0
Arg
ArgAla: 8.059 ± 0.989
ArgCys: 1.451 ± 0.348
ArgAsp: 4.406 ± 0.595
ArgGlu: 4.083 ± 0.583
ArgPhe: 2.364 ± 0.33
ArgGly: 4.406 ± 0.593
ArgHis: 1.719 ± 0.322
ArgIle: 3.385 ± 0.53
ArgLys: 2.794 ± 0.422
ArgLeu: 7.898 ± 0.961
ArgMet: 1.934 ± 0.339
ArgAsn: 2.525 ± 0.361
ArgPro: 4.244 ± 0.627
ArgGln: 3.438 ± 0.422
ArgArg: 7.898 ± 1.096
ArgSer: 3.922 ± 0.53
ArgThr: 3.976 ± 0.43
ArgVal: 5.426 ± 0.632
ArgTrp: 2.149 ± 0.394
ArgTyr: 1.988 ± 0.364
ArgXaa: 0.0 ± 0.0
Ser
SerAla: 6.017 ± 0.709
SerCys: 0.537 ± 0.191
SerAsp: 2.847 ± 0.349
SerGlu: 2.256 ± 0.364
SerPhe: 1.988 ± 0.313
SerGly: 4.352 ± 1.003
SerHis: 0.752 ± 0.211
SerIle: 1.827 ± 0.331
SerLys: 1.451 ± 0.252
SerLeu: 3.546 ± 0.516
SerMet: 0.913 ± 0.205
SerAsn: 1.88 ± 0.367
SerPro: 2.955 ± 0.364
SerGln: 1.343 ± 0.288
SerArg: 3.438 ± 0.429
SerSer: 3.116 ± 0.54
SerThr: 3.707 ± 0.564
SerVal: 3.761 ± 0.485
SerTrp: 1.236 ± 0.22
SerTyr: 1.558 ± 0.248
SerXaa: 0.0 ± 0.0
Thr
ThrAla: 6.931 ± 1.011
ThrCys: 0.537 ± 0.161
ThrAsp: 3.385 ± 0.418
ThrGlu: 3.761 ± 0.452
ThrPhe: 2.042 ± 0.287
ThrGly: 5.211 ± 0.66
ThrHis: 1.451 ± 0.361
ThrIle: 3.707 ± 0.542
ThrLys: 2.095 ± 0.339
ThrLeu: 4.513 ± 0.655
ThrMet: 0.806 ± 0.222
ThrAsn: 1.343 ± 0.255
ThrPro: 4.943 ± 0.451
ThrGln: 1.289 ± 0.265
ThrArg: 3.868 ± 0.579
ThrSer: 2.471 ± 0.502
ThrThr: 3.653 ± 0.666
ThrVal: 5.587 ± 0.6
ThrTrp: 1.289 ± 0.27
ThrTyr: 1.451 ± 0.275
ThrXaa: 0.0 ± 0.0
Val
ValAla: 9.187 ± 0.851
ValCys: 0.86 ± 0.24
ValAsp: 5.91 ± 0.722
ValGlu: 5.587 ± 0.598
ValPhe: 1.719 ± 0.285
ValGly: 5.695 ± 0.582
ValHis: 1.451 ± 0.351
ValIle: 2.794 ± 0.406
ValLys: 2.471 ± 0.444
ValLeu: 5.91 ± 0.68
ValMet: 0.967 ± 0.23
ValAsn: 1.773 ± 0.321
ValPro: 5.104 ± 0.588
ValGln: 2.095 ± 0.375
ValArg: 5.104 ± 0.635
ValSer: 3.761 ± 0.512
ValThr: 5.211 ± 0.465
ValVal: 6.017 ± 0.632
ValTrp: 1.666 ± 0.311
ValTyr: 1.988 ± 0.286
ValXaa: 0.0 ± 0.0
Trp
TrpAla: 1.666 ± 0.248
TrpCys: 0.322 ± 0.124
TrpAsp: 0.86 ± 0.154
TrpGlu: 0.806 ± 0.211
TrpPhe: 0.86 ± 0.223
TrpGly: 1.182 ± 0.29
TrpHis: 0.698 ± 0.175
TrpIle: 0.591 ± 0.176
TrpLys: 0.537 ± 0.147
TrpLeu: 1.934 ± 0.353
TrpMet: 0.376 ± 0.127
TrpAsn: 0.591 ± 0.224
TrpPro: 0.967 ± 0.245
TrpGln: 1.343 ± 0.218
TrpArg: 2.31 ± 0.397
TrpSer: 0.752 ± 0.215
TrpThr: 1.289 ± 0.306
TrpVal: 1.236 ± 0.315
TrpTrp: 0.43 ± 0.159
TrpTyr: 0.322 ± 0.11
TrpXaa: 0.0 ± 0.0
Tyr
TyrAla: 2.847 ± 0.329
TyrCys: 0.537 ± 0.181
TyrAsp: 1.558 ± 0.3
TyrGlu: 1.719 ± 0.389
TyrPhe: 0.269 ± 0.107
TyrGly: 2.525 ± 0.399
TyrHis: 0.322 ± 0.123
TyrIle: 0.752 ± 0.234
TyrLys: 0.86 ± 0.236
TyrLeu: 2.042 ± 0.433
TyrMet: 0.43 ± 0.132
TyrAsn: 0.698 ± 0.172
TyrPro: 1.558 ± 0.357
TyrGln: 0.591 ± 0.203
TyrArg: 2.525 ± 0.325
TyrSer: 1.504 ± 0.29
TyrThr: 1.236 ± 0.312
TyrVal: 1.988 ± 0.316
TyrTrp: 0.43 ± 0.12
TyrTyr: 0.645 ± 0.188
TyrXaa: 0.0 ± 0.0
Xaa
XaaAla: 0.0 ± 0.0
XaaCys: 0.0 ± 0.0
XaaAsp: 0.0 ± 0.0
XaaGlu: 0.0 ± 0.0
XaaPhe: 0.0 ± 0.0
XaaGly: 0.0 ± 0.0
XaaHis: 0.0 ± 0.0
XaaIle: 0.0 ± 0.0
XaaLys: 0.0 ± 0.0
XaaLeu: 0.0 ± 0.0
XaaMet: 0.0 ± 0.0
XaaAsn: 0.0 ± 0.0
XaaPro: 0.0 ± 0.0
XaaGln: 0.0 ± 0.0
XaaArg: 0.0 ± 0.0
XaaSer: 0.0 ± 0.0
XaaThr: 0.0 ± 0.0
XaaVal: 0.0 ± 0.0
XaaTrp: 0.0 ± 0.0
XaaTyr: 0.0 ± 0.0
XaaXaa: 0.0 ± 0.0
Statistics based on 94 proteins (18614 amino acids)

Note: The error has been estimated by bootstrapping (×100) at the protein level.
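A protein-level bootstrap of this kind can be sketched as follows. This is a minimal illustration, not the Proteome-pI implementation: the function names are hypothetical, dipeptides are assumed to be counted as overlapping pairs within each protein, and the reported error is assumed to be the sample standard deviation across replicates, which the source does not state.

```python
import random
import statistics


def pair_per_mille(proteins, pair):
    """Frequency of one dipeptide (one-letter codes, e.g. "AA") in per mille."""
    hits = 0
    total = 0
    for seq in proteins:
        for a, b in zip(seq, seq[1:]):
            total += 1
            if a + b == pair:
                hits += 1
    return 1000.0 * hits / total if total else 0.0


def bootstrap_error(proteins, pair, replicates=100, seed=0):
    """Estimate the spread of a dipeptide frequency by resampling proteins.

    Whole proteins are drawn with replacement, so each replicate has the
    same number of proteins as the original set. The error is taken to be
    the standard deviation over replicates (an assumption).
    """
    rng = random.Random(seed)
    estimates = []
    for _ in range(replicates):
        sample = rng.choices(proteins, k=len(proteins))
        estimates.append(pair_per_mille(sample, pair))
    return statistics.stdev(estimates)
```

Resampling whole proteins rather than individual residues keeps the dipeptides within each protein together, so the estimate reflects variation between proteins.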

The dipeptide statistics above (among other statistics for this proteome) can be downloaded from this CSV file
See this proteome in: uniprot_link
Proteome-pI is available under the Creative Commons Attribution-NoDerivs license; for more details see here

Reference: Kozlowski LP. Proteome-pI 2.0: Proteome Isoelectric Point Database Update. Nucleic Acids Res. 2021, doi: 10.1093/nar/gkab944
Contact: Lukasz P. Kozlowski