TY - JOUR
T1 - Composite Mapping for Peptide-Based Data Storage with Higher Coding Density and Fewer Synthesis Cycles
AU - Zhang, Anxun
AU - Wang, Longjie
AU - Zhai, Xiaowei
AU - Xiao, Yao
AU - Wu, Yanchan
AU - Zhao, Yongxi
AU - Liu, Kai
AU - Zheng, Ji Shen
AU - Chen, Dong
N1 - Publisher Copyright: © 2025 The Author(s). Advanced Science published by Wiley-VCH GmbH.
PY - 2025/7/17
Y1 - 2025/7/17
AB - Peptides are natural information-bearing mediums and are promising for high-density data storage. However, conventional mapping of one amino acid (AA) to one binary code has limited the improvement of coding density by increasing the total number of different AAs. Here, a novel composite mapping strategy is developed, where each position in the peptide sequence is a composite letter consisting of several different AAs, and thousands of composite letters are available for mapping, thus breaking the limit of conventional mapping. When 20 different AAs are used, the coding density of six-AAs composite mapping achieves 15 bits/letter, while conventional mapping only reaches 4 bits/AA. The whole process of encoding data into composite letter sequences, synthesizing composite letter sequences via solid-phase peptide synthesis, sequencing composite letter sequences by mass spectrometry, and decoding data from composite letter sequences is successfully demonstrated for the first time. Composite mapping also demonstrates several distinct advantages, including high coding density, few synthesis cycles, high reliability against errors, low probability of homopolymers, and good compatibility with other encoding algorithms. The developed composite mapping strategy provides a novel way for peptide-based data storage to increase the coding density and reduce the synthesis cycles, showing great potential for large-scale data storage.
KW - composite mapping
KW - data storage
KW - mass spectrometry sequencing
KW - solid-phase peptide synthesis
KW - statistical analysis
UR - https://www.scopus.com/pages/publications/105003816343
U2 - 10.1002/advs.202503790
DO - 10.1002/advs.202503790
M3 - Article
C2 - 40285644
AN - SCOPUS:105003816343
SN - 2198-3844
VL - 12
JO - Advanced Science
JF - Advanced Science
IS - 27
M1 - 2503790
ER -