Amino acid dipeptide frequency for Streptococcus virus MS1

Apart from single amino acid frequencies, one can also calculate the so-called amino acid dipeptide frequency.

There are 400 possible dipeptides (441 if we consider Xaa as an additional 21st amino acid representing all non-standard or unknown amino acids). Thus, if dipeptides occurred randomly in proteins, each amino acid dipeptide would be present at a frequency of 0.25% (1/400, i.e. 2.5‰). As this is not the case in nature, for better visibility all overrepresented dipeptides are marked in red, and those which are underrepresented are marked in blue in the table.
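The counting described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the code behind the database: the function names (`dipeptide_per_mille`, `classify`) are hypothetical, sequences are assumed to be in one-letter code, the denominator is assumed to be the number of overlapping dipeptide windows, and the over/under call is assumed to be a plain comparison against the uniform expectation of 2.5‰.

```python
from collections import Counter
from itertools import product

# The 20 standard amino acids in one-letter code; anything else is treated as X (Xaa).
STANDARD = "ACDEFGHIKLMNPQRSTVWY"
ALPHABET = STANDARD + "X"

# Uniform expectation over the 400 standard dipeptides, in per mille (0.25%).
UNIFORM_EXPECTATION = 1000.0 / 400


def dipeptide_per_mille(proteins):
    """Return {dipeptide: frequency in per mille} for all 441 possible pairs.

    Assumption: frequencies are normalised by the number of overlapping
    dipeptide windows; the database may use a different denominator.
    """
    counts = Counter()
    for seq in proteins:
        # Map non-standard or unknown residues to X.
        cleaned = "".join(aa if aa in STANDARD else "X" for aa in seq.upper())
        # Overlapping windows of length 2; windows never span two proteins.
        for i in range(len(cleaned) - 1):
            counts[cleaned[i:i + 2]] += 1
    total = sum(counts.values())
    return {
        a + b: (1000.0 * counts[a + b] / total if total else 0.0)
        for a, b in product(ALPHABET, repeat=2)
    }


def classify(freq_per_mille, expected=UNIFORM_EXPECTATION):
    """Label a frequency as over- or underrepresented relative to `expected`."""
    if freq_per_mille > expected:
        return "over"
    if freq_per_mille < expected:
        return "under"
    return "expected"
```

With the table values, `classify(5.079)` (AlaAla) gives "over" and `classify(0.395)` (AlaCys) gives "under". A more careful comparison would use the product of the two single amino acid frequencies as the expectation rather than a flat 2.5‰.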

All values are given in per mille (‰) and therefore need to be multiplied by 10⁻³ to obtain fractions.
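As a small worked example of this unit conversion, the sketch below turns a per mille value into a plain fraction and into a percentage. The helper names are hypothetical and used only for illustration.

```python
def per_mille_to_fraction(value_per_mille):
    """Convert a per mille value to a fraction (multiply by 10^-3)."""
    return value_per_mille * 1e-3


def per_mille_to_percent(value_per_mille):
    """Convert a per mille value to a percentage (divide by 10)."""
    return value_per_mille / 10.0


# AlaAla is listed as 5.079 per mille in the table.
ala_ala_fraction = per_mille_to_fraction(5.079)  # about 0.005079
ala_ala_percent = per_mille_to_percent(5.079)    # about 0.5079 %
```

In the same units, the 0.25% expected for a uniformly random dipeptide corresponds to 2.5‰.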

For more information, see the sequence space article on Wikipedia.

Ala Cys Asp Glu Phe Gly His Ile Lys Leu Met Asn Pro Gln Arg Ser Thr Val Trp Tyr Xaa
Ala
AlaAla: 5.079 ± 1.087
AlaCys: 0.395 ± 0.151
AlaAsp: 3.273 ± 0.457
AlaGlu: 5.531 ± 0.71
AlaPhe: 2.935 ± 0.484
AlaGly: 4.797 ± 0.773
AlaHis: 1.185 ± 0.243
AlaIle: 4.346 ± 0.743
AlaLys: 5.531 ± 0.591
AlaLeu: 7.111 ± 1.122
AlaMet: 1.919 ± 0.379
AlaAsn: 3.273 ± 0.402
AlaPro: 1.467 ± 0.241
AlaGln: 2.765 ± 0.553
AlaArg: 3.048 ± 0.392
AlaSer: 4.12 ± 0.47
AlaThr: 4.402 ± 0.554
AlaVal: 3.668 ± 0.47
AlaTrp: 1.185 ± 0.241
AlaTyr: 2.935 ± 0.47
AlaXaa: 0.0 ± 0.0
Cys
CysAla: 0.395 ± 0.156
CysCys: 0.113 ± 0.077
CysAsp: 0.621 ± 0.172
CysGlu: 0.508 ± 0.186
CysPhe: 0.282 ± 0.133
CysGly: 0.395 ± 0.16
CysHis: 0.113 ± 0.083
CysIle: 0.677 ± 0.208
CysLys: 0.903 ± 0.218
CysLeu: 1.185 ± 0.299
CysMet: 0.226 ± 0.106
CysAsn: 0.564 ± 0.187
CysPro: 0.564 ± 0.172
CysGln: 0.226 ± 0.107
CysArg: 0.451 ± 0.213
CysSer: 0.395 ± 0.157
CysThr: 0.395 ± 0.139
CysVal: 0.339 ± 0.152
CysTrp: 0.113 ± 0.074
CysTyr: 0.451 ± 0.162
CysXaa: 0.0 ± 0.0
Asp
AspAla: 3.951 ± 0.48
AspCys: 0.564 ± 0.185
AspAsp: 3.781 ± 0.608
AspGlu: 4.176 ± 0.663
AspPhe: 2.596 ± 0.373
AspGly: 5.644 ± 0.855
AspHis: 1.411 ± 0.236
AspIle: 4.176 ± 0.474
AspLys: 4.12 ± 0.591
AspLeu: 4.854 ± 0.551
AspMet: 1.524 ± 0.235
AspAsn: 3.104 ± 0.325
AspPro: 2.878 ± 0.43
AspGln: 1.862 ± 0.366
AspArg: 3.104 ± 0.408
AspSer: 3.386 ± 0.394
AspThr: 3.273 ± 0.493
AspVal: 4.007 ± 0.425
AspTrp: 0.734 ± 0.226
AspTyr: 3.16 ± 0.461
AspXaa: 0.0 ± 0.0
Glu
GluAla: 5.249 ± 0.616
GluCys: 0.734 ± 0.213
GluAsp: 5.192 ± 0.599
GluGlu: 6.321 ± 1.3
GluPhe: 2.596 ± 0.312
GluGly: 4.233 ± 0.612
GluHis: 1.185 ± 0.342
GluIle: 4.402 ± 0.465
GluLys: 5.192 ± 0.702
GluLeu: 7.563 ± 0.834
GluMet: 1.524 ± 0.371
GluAsn: 4.007 ± 0.589
GluPro: 2.54 ± 0.541
GluGln: 4.515 ± 0.589
GluArg: 2.935 ± 0.4
GluSer: 2.991 ± 0.389
GluThr: 3.16 ± 0.391
GluVal: 4.515 ± 0.546
GluTrp: 0.959 ± 0.22
GluTyr: 2.822 ± 0.501
GluXaa: 0.0 ± 0.0
Phe
PheAla: 2.145 ± 0.338
PheCys: 0.282 ± 0.112
PheAsp: 3.104 ± 0.401
PheGlu: 3.273 ± 0.397
PhePhe: 1.298 ± 0.245
PheGly: 2.878 ± 0.523
PheHis: 0.734 ± 0.175
PheIle: 2.032 ± 0.279
PheLys: 3.33 ± 0.572
PheLeu: 3.048 ± 0.459
PheMet: 0.734 ± 0.199
PheAsn: 2.201 ± 0.324
PhePro: 0.79 ± 0.211
PheGln: 1.242 ± 0.25
PheArg: 2.257 ± 0.429
PheSer: 3.217 ± 0.451
PheThr: 2.54 ± 0.497
PheVal: 2.483 ± 0.392
PheTrp: 0.564 ± 0.16
PheTyr: 1.693 ± 0.28
PheXaa: 0.0 ± 0.0
Gly
GlyAla: 4.628 ± 0.676
GlyCys: 0.282 ± 0.13
GlyAsp: 2.991 ± 0.436
GlyGlu: 4.684 ± 0.536
GlyPhe: 2.935 ± 0.352
GlyGly: 3.668 ± 0.58
GlyHis: 1.016 ± 0.274
GlyIle: 3.838 ± 0.806
GlyLys: 5.305 ± 0.569
GlyLeu: 5.531 ± 0.815
GlyMet: 1.975 ± 0.289
GlyAsn: 3.499 ± 0.55
GlyPro: 1.806 ± 0.364
GlyGln: 3.104 ± 0.405
GlyArg: 3.781 ± 0.619
GlySer: 4.515 ± 0.733
GlyThr: 5.079 ± 0.541
GlyVal: 4.966 ± 0.599
GlyTrp: 0.734 ± 0.229
GlyTyr: 2.878 ± 0.48
GlyXaa: 0.0 ± 0.0
His
HisAla: 1.298 ± 0.286
HisCys: 0.508 ± 0.171
HisAsp: 0.621 ± 0.167
HisGlu: 0.677 ± 0.167
HisPhe: 0.79 ± 0.252
HisGly: 0.903 ± 0.287
HisHis: 0.339 ± 0.111
HisIle: 0.847 ± 0.22
HisLys: 1.129 ± 0.29
HisLeu: 1.298 ± 0.231
HisMet: 0.282 ± 0.11
HisAsn: 0.677 ± 0.213
HisPro: 0.677 ± 0.2
HisGln: 0.621 ± 0.193
HisArg: 0.508 ± 0.159
HisSer: 1.129 ± 0.243
HisThr: 1.185 ± 0.231
HisVal: 0.847 ± 0.203
HisTrp: 0.0 ± 0.0
HisTyr: 0.903 ± 0.214
HisXaa: 0.0 ± 0.0
Ile
IleAla: 5.079 ± 0.69
IleCys: 0.339 ± 0.121
IleAsp: 4.628 ± 0.556
IleGlu: 6.095 ± 0.556
IlePhe: 2.653 ± 0.294
IleGly: 4.007 ± 0.636
IleHis: 0.959 ± 0.219
IleIle: 3.556 ± 0.516
IleLys: 4.571 ± 0.482
IleLeu: 5.079 ± 0.521
IleMet: 1.693 ± 0.321
IleAsn: 2.257 ± 0.343
IlePro: 2.765 ± 0.416
IleGln: 3.838 ± 0.815
IleArg: 2.483 ± 0.3
IleSer: 3.781 ± 0.49
IleThr: 3.894 ± 0.518
IleVal: 4.289 ± 0.465
IleTrp: 0.451 ± 0.154
IleTyr: 2.145 ± 0.396
IleXaa: 0.0 ± 0.0
Lys
LysAla: 3.781 ± 0.423
LysCys: 0.677 ± 0.201
LysAsp: 4.233 ± 0.452
LysGlu: 6.321 ± 0.678
LysPhe: 2.991 ± 0.367
LysGly: 4.854 ± 0.453
LysHis: 0.959 ± 0.215
LysIle: 4.684 ± 0.512
LysLys: 4.402 ± 0.691
LysLeu: 5.531 ± 0.699
LysMet: 2.709 ± 0.416
LysAsn: 3.33 ± 0.518
LysPro: 2.257 ± 0.536
LysGln: 2.822 ± 0.402
LysArg: 3.838 ± 0.517
LysSer: 4.402 ± 0.385
LysThr: 3.668 ± 0.373
LysVal: 6.152 ± 0.699
LysTrp: 0.903 ± 0.189
LysTyr: 3.048 ± 0.417
LysXaa: 0.0 ± 0.0
Leu
LeuAla: 7.111 ± 0.743
LeuCys: 0.847 ± 0.202
LeuAsp: 6.49 ± 0.595
LeuGlu: 5.644 ± 0.621
LeuPhe: 3.104 ± 0.502
LeuGly: 5.531 ± 0.495
LeuHis: 1.185 ± 0.308
LeuIle: 5.813 ± 0.892
LeuLys: 4.854 ± 0.543
LeuLeu: 7.563 ± 0.798
LeuMet: 2.653 ± 0.415
LeuAsn: 4.515 ± 0.554
LeuPro: 4.458 ± 0.52
LeuGln: 3.781 ± 0.419
LeuArg: 3.217 ± 0.356
LeuSer: 4.797 ± 0.533
LeuThr: 5.7 ± 0.54
LeuVal: 6.095 ± 0.745
LeuTrp: 1.185 ± 0.347
LeuTyr: 3.273 ± 0.432
LeuXaa: 0.0 ± 0.0
Met
MetAla: 2.088 ± 0.514
MetCys: 0.226 ± 0.118
MetAsp: 1.693 ± 0.3
MetGlu: 1.693 ± 0.299
MetPhe: 0.847 ± 0.205
MetGly: 1.524 ± 0.498
MetHis: 0.226 ± 0.118
MetIle: 2.088 ± 0.494
MetLys: 2.201 ± 0.34
MetLeu: 1.919 ± 0.289
MetMet: 0.508 ± 0.218
MetAsn: 1.467 ± 0.263
MetPro: 1.185 ± 0.251
MetGln: 0.847 ± 0.267
MetArg: 0.959 ± 0.246
MetSer: 2.032 ± 0.293
MetThr: 1.862 ± 0.332
MetVal: 1.411 ± 0.398
MetTrp: 0.395 ± 0.146
MetTyr: 1.016 ± 0.263
MetXaa: 0.0 ± 0.0
Asn
AsnAla: 4.233 ± 0.49
AsnCys: 0.395 ± 0.14
AsnAsp: 3.16 ± 0.414
AsnGlu: 3.951 ± 0.401
AsnPhe: 2.145 ± 0.421
AsnGly: 4.176 ± 0.496
AsnHis: 0.677 ± 0.179
AsnIle: 3.217 ± 0.451
AsnLys: 3.217 ± 0.425
AsnLeu: 4.12 ± 0.596
AsnMet: 1.185 ± 0.234
AsnAsn: 1.975 ± 0.375
AsnPro: 2.483 ± 0.443
AsnGln: 1.75 ± 0.317
AsnArg: 2.314 ± 0.356
AsnSer: 2.54 ± 0.365
AsnThr: 2.088 ± 0.385
AsnVal: 3.273 ± 0.39
AsnTrp: 0.79 ± 0.222
AsnTyr: 2.032 ± 0.324
AsnXaa: 0.0 ± 0.0
Pro
ProAla: 2.257 ± 0.342
ProCys: 0.226 ± 0.117
ProAsp: 2.596 ± 0.343
ProGlu: 2.991 ± 0.422
ProPhe: 1.298 ± 0.33
ProGly: 2.201 ± 0.444
ProHis: 0.226 ± 0.111
ProIle: 2.37 ± 0.294
ProLys: 3.612 ± 0.679
ProLeu: 2.596 ± 0.404
ProMet: 0.847 ± 0.224
ProAsn: 1.806 ± 0.312
ProPro: 0.564 ± 0.169
ProGln: 1.524 ± 0.343
ProArg: 1.354 ± 0.351
ProSer: 2.314 ± 0.397
ProThr: 2.596 ± 0.317
ProVal: 1.637 ± 0.366
ProTrp: 0.282 ± 0.113
ProTyr: 1.467 ± 0.31
ProXaa: 0.0 ± 0.0
Gln
GlnAla: 3.104 ± 0.754
GlnCys: 0.226 ± 0.101
GlnAsp: 2.765 ± 0.407
GlnGlu: 3.16 ± 0.459
GlnPhe: 2.257 ± 0.342
GlnGly: 2.878 ± 0.684
GlnHis: 0.508 ± 0.164
GlnIle: 2.822 ± 0.67
GlnLys: 2.709 ± 0.38
GlnLeu: 4.458 ± 0.466
GlnMet: 0.903 ± 0.363
GlnAsn: 1.637 ± 0.338
GlnPro: 0.79 ± 0.261
GlnGln: 1.75 ± 0.348
GlnArg: 1.862 ± 0.357
GlnSer: 2.145 ± 0.35
GlnThr: 1.411 ± 0.339
GlnVal: 4.007 ± 0.445
GlnTrp: 0.339 ± 0.153
GlnTyr: 2.822 ± 0.452
GlnXaa: 0.0 ± 0.0
Arg
ArgAla: 2.596 ± 0.418
ArgCys: 0.282 ± 0.155
ArgAsp: 2.54 ± 0.33
ArgGlu: 2.596 ± 0.419
ArgPhe: 1.919 ± 0.295
ArgGly: 2.765 ± 0.505
ArgHis: 0.508 ± 0.162
ArgIle: 3.556 ± 0.341
ArgLys: 4.176 ± 0.743
ArgLeu: 4.515 ± 0.599
ArgMet: 1.185 ± 0.216
ArgAsn: 3.16 ± 0.39
ArgPro: 1.75 ± 0.318
ArgGln: 1.919 ± 0.361
ArgArg: 2.201 ± 0.545
ArgSer: 2.483 ± 0.45
ArgThr: 2.765 ± 0.334
ArgVal: 2.201 ± 0.282
ArgTrp: 0.79 ± 0.177
ArgTyr: 2.54 ± 0.434
ArgXaa: 0.0 ± 0.0
Ser
SerAla: 4.402 ± 0.497
SerCys: 0.564 ± 0.175
SerAsp: 2.596 ± 0.36
SerGlu: 2.37 ± 0.356
SerPhe: 2.314 ± 0.438
SerGly: 4.571 ± 0.513
SerHis: 0.734 ± 0.2
SerIle: 3.951 ± 0.405
SerLys: 4.684 ± 0.471
SerLeu: 4.854 ± 0.575
SerMet: 1.75 ± 0.43
SerAsn: 3.556 ± 0.378
SerPro: 1.862 ± 0.298
SerGln: 2.145 ± 0.325
SerArg: 3.556 ± 0.421
SerSer: 5.023 ± 0.542
SerThr: 3.725 ± 0.401
SerVal: 4.007 ± 0.557
SerTrp: 1.016 ± 0.208
SerTyr: 2.032 ± 0.304
SerXaa: 0.0 ± 0.0
Thr
ThrAla: 3.217 ± 0.403
ThrCys: 0.959 ± 0.327
ThrAsp: 3.725 ± 0.411
ThrGlu: 4.289 ± 0.427
ThrPhe: 2.201 ± 0.361
ThrGly: 4.797 ± 0.471
ThrHis: 0.508 ± 0.139
ThrIle: 4.458 ± 0.582
ThrLys: 4.289 ± 0.442
ThrLeu: 5.7 ± 0.599
ThrMet: 1.467 ± 0.263
ThrAsn: 2.596 ± 0.461
ThrPro: 2.483 ± 0.387
ThrGln: 2.37 ± 0.336
ThrArg: 2.596 ± 0.42
ThrSer: 2.991 ± 0.427
ThrThr: 4.12 ± 0.435
ThrVal: 4.628 ± 0.509
ThrTrp: 0.677 ± 0.173
ThrTyr: 2.709 ± 0.426
ThrXaa: 0.0 ± 0.0
Val
ValAla: 4.741 ± 0.617
ValCys: 0.677 ± 0.191
ValAsp: 4.515 ± 0.409
ValGlu: 4.458 ± 0.472
ValPhe: 2.145 ± 0.32
ValGly: 3.725 ± 0.452
ValHis: 1.467 ± 0.312
ValIle: 3.951 ± 0.386
ValLys: 4.289 ± 0.647
ValLeu: 5.757 ± 0.627
ValMet: 1.524 ± 0.298
ValAsn: 3.048 ± 0.414
ValPro: 2.596 ± 0.475
ValGln: 3.894 ± 0.557
ValArg: 3.104 ± 0.299
ValSer: 4.797 ± 0.551
ValThr: 4.628 ± 0.384
ValVal: 4.063 ± 0.517
ValTrp: 1.016 ± 0.357
ValTyr: 1.58 ± 0.254
ValXaa: 0.0 ± 0.0
Trp
TrpAla: 0.621 ± 0.131
TrpCys: 0.056 ± 0.057
TrpAsp: 0.903 ± 0.221
TrpGlu: 0.903 ± 0.235
TrpPhe: 0.847 ± 0.231
TrpGly: 0.395 ± 0.133
TrpHis: 0.451 ± 0.141
TrpIle: 0.508 ± 0.171
TrpLys: 0.903 ± 0.266
TrpLeu: 1.411 ± 0.283
TrpMet: 0.169 ± 0.097
TrpAsn: 0.734 ± 0.201
TrpPro: 0.113 ± 0.083
TrpGln: 0.508 ± 0.162
TrpArg: 0.621 ± 0.267
TrpSer: 0.508 ± 0.138
TrpThr: 1.354 ± 0.449
TrpVal: 0.734 ± 0.182
TrpTrp: 0.113 ± 0.072
TrpTyr: 1.016 ± 0.262
TrpXaa: 0.0 ± 0.0
Tyr
TyrAla: 2.991 ± 0.445
TyrCys: 0.564 ± 0.214
TyrAsp: 2.878 ± 0.425
TyrGlu: 3.048 ± 0.496
TyrPhe: 1.693 ± 0.305
TyrGly: 3.273 ± 0.513
TyrHis: 0.847 ± 0.238
TyrIle: 3.273 ± 0.458
TyrLys: 2.145 ± 0.379
TyrLeu: 3.725 ± 0.52
TyrMet: 1.354 ± 0.297
TyrAsn: 2.314 ± 0.355
TyrPro: 0.734 ± 0.187
TyrGln: 1.072 ± 0.274
TyrArg: 2.257 ± 0.343
TyrSer: 2.145 ± 0.393
TyrThr: 2.935 ± 0.439
TyrVal: 2.653 ± 0.422
TyrTrp: 0.621 ± 0.163
TyrTyr: 1.411 ± 0.399
TyrXaa: 0.0 ± 0.0
Xaa
XaaAla: 0.0 ± 0.0
XaaCys: 0.0 ± 0.0
XaaAsp: 0.0 ± 0.0
XaaGlu: 0.0 ± 0.0
XaaPhe: 0.0 ± 0.0
XaaGly: 0.0 ± 0.0
XaaHis: 0.0 ± 0.0
XaaIle: 0.0 ± 0.0
XaaLys: 0.0 ± 0.0
XaaLeu: 0.0 ± 0.0
XaaMet: 0.0 ± 0.0
XaaAsn: 0.0 ± 0.0
XaaPro: 0.0 ± 0.0
XaaGln: 0.0 ± 0.0
XaaArg: 0.0 ± 0.0
XaaSer: 0.0 ± 0.0
XaaThr: 0.0 ± 0.0
XaaVal: 0.0 ± 0.0
XaaTrp: 0.0 ± 0.0
XaaTyr: 0.0 ± 0.0
XaaXaa: 0.0 ± 0.0
Statistics based on 77 proteins (17720 amino acids)

Note: The error was estimated by bootstrapping (100 replicates) at the protein level.
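A protein-level bootstrap of this kind can be sketched as follows. The exact procedure used for the table is not described here, so this is an assumption-laden illustration: whole proteins are resampled with replacement, the frequency is recomputed for each replicate, and the error is taken to be the standard deviation across replicates. The function names and the `seed` parameter are hypothetical.

```python
import random
import statistics
from collections import Counter


def dipeptide_counts(seq):
    """Count overlapping dipeptides within a single protein sequence."""
    return Counter(seq[i:i + 2] for i in range(len(seq) - 1))


def bootstrap_error(proteins, dipeptide, replicates=100, seed=0):
    """Return (mean, standard deviation) of a dipeptide frequency in per mille.

    Assumption: each replicate draws len(proteins) whole proteins with
    replacement, and the spread is the sample standard deviation.
    """
    rng = random.Random(seed)
    # Count once per protein so that resampling only re-sums the counters.
    per_protein = [dipeptide_counts(p) for p in proteins]
    estimates = []
    for _ in range(replicates):
        sample = rng.choices(per_protein, k=len(per_protein))
        total = sum(sum(c.values()) for c in sample)
        hits = sum(c[dipeptide] for c in sample)
        estimates.append(1000.0 * hits / total if total else 0.0)
    return statistics.mean(estimates), statistics.stdev(estimates)
```

Resampling whole proteins rather than individual dipeptides keeps the within-protein composition intact, which is presumably why the error is described as being estimated at the protein level.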

The dipeptide statistics above (among other statistics for this proteome) can be downloaded from this CSV file
See this proteome in: uniprot_link
Proteome-pI is available under the Creative Commons Attribution-NoDerivs license; for more details see here

Reference: Kozlowski LP. Proteome-pI 2.0: Proteome Isoelectric Point Database Update. Nucleic Acids Res. 2021, doi: 10.1093/nar/gkab944
Contact: Lukasz P. Kozlowski