An approach to compress human genome sequence by delta computation and secure storage by Blockchain

  • Garima Mathur, Anjana Pandey, Sachin Goyal
Keywords: Genome sequence, Compression, FASTA, Blockchain, Delta computation, Genbank.

Abstract

In healthcare, DNA plays a vital role because it carries the genetic information of an organism; FASTA is the text format commonly used to store DNA sequences. DNA data files are so large that they are difficult to store and manage, and securing them is a further concern. Novel lossless compression techniques that reduce the size of these files are an appropriate solution, since smaller files also need fewer resources for transmission. This work therefore proposes an ASCII-based compression algorithm in which DNA characters are first converted into ASCII integers and then delta-computed; the LZW compression technique is then applied to the result. To secure the data, a blockchain-based framework is applied after the compression module to make the data immutable.
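The pipeline described above (ASCII conversion, delta computation, then LZW) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the abstract does not give the exact delta definition, so the sketch assumes the difference between consecutive ASCII codes taken modulo 256 (to keep every value in one byte), and all function names are hypothetical.

```python
from typing import List


def delta_encode(seq: str) -> bytes:
    """Convert characters to ASCII codes and store successive differences.

    Assumption: differences are taken modulo 256 so each fits in one byte.
    """
    out = bytearray()
    prev = 0
    for code in seq.encode("ascii"):
        out.append((code - prev) % 256)
        prev = code
    return bytes(out)


def delta_decode(data: bytes) -> str:
    """Invert delta_encode by accumulating the differences modulo 256."""
    out = bytearray()
    prev = 0
    for diff in data:
        prev = (prev + diff) % 256
        out.append(prev)
    return out.decode("ascii")


def lzw_compress(data: bytes) -> List[int]:
    """Textbook LZW over bytes; returns a list of dictionary codes."""
    table = {bytes([i]): i for i in range(256)}
    next_code = 256
    current = b""
    codes = []
    for byte in data:
        candidate = current + bytes([byte])
        if candidate in table:
            current = candidate
        else:
            codes.append(table[current])
            table[candidate] = next_code
            next_code += 1
            current = bytes([byte])
    if current:
        codes.append(table[current])
    return codes


def lzw_decompress(codes: List[int]) -> bytes:
    """Rebuild the byte stream from LZW codes."""
    if not codes:
        return b""
    table = {i: bytes([i]) for i in range(256)}
    next_code = 256
    prev = table[codes[0]]
    out = bytearray(prev)
    for code in codes[1:]:
        if code in table:
            entry = table[code]
        elif code == next_code:
            # Code refers to the entry that is about to be defined.
            entry = prev + prev[:1]
        else:
            raise ValueError("invalid LZW code")
        out += entry
        table[next_code] = prev + entry[:1]
        next_code += 1
        prev = entry
    return bytes(out)


def compress(seq: str) -> List[int]:
    return lzw_compress(delta_encode(seq))


def decompress(codes: List[int]) -> str:
    return delta_decode(lzw_decompress(codes))
```

Under these assumptions, `delta_encode("ACGT")` yields the bytes 65, 2, 4, 13, and `decompress(compress(s))` returns `s` for any ASCII string, which is the lossless property the abstract requires. A real FASTA file would also need its header lines handled separately, which this sketch does not attempt.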

For comparison, compression ratios were also determined for existing methods such as LZW and Huffman coding on Homo sapiens sequences. The results indicate that the proposed algorithm achieves a good compression ratio on some randomly selected data sets. A further aim of this paper is to show the benefits of using a blockchain-based framework to secure healthcare data.
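The immutability claim can be illustrated with a simple hash chain, in which each block stores a digest of a compressed payload together with the hash of the previous block. The abstract does not describe the framework's actual design, so this is only a sketch under stated assumptions: SHA-256 as the hash function, a single in-memory list as the ledger, and hypothetical names throughout (no consensus, networking, or access control).

```python
import hashlib
import json

GENESIS_HASH = "0" * 64  # assumed placeholder for the first block's predecessor


def block_hash(index: int, prev_hash: str, payload_digest: str) -> str:
    """Hash the block fields in a canonical (sorted-key) JSON encoding."""
    record = json.dumps(
        {"index": index, "prev_hash": prev_hash, "payload_digest": payload_digest},
        sort_keys=True,
    )
    return hashlib.sha256(record.encode("utf-8")).hexdigest()


def append_block(chain: list, payload: bytes) -> dict:
    """Append a block that records the SHA-256 digest of a compressed payload."""
    index = len(chain)
    prev_hash = chain[-1]["hash"] if chain else GENESIS_HASH
    digest = hashlib.sha256(payload).hexdigest()
    block = {
        "index": index,
        "prev_hash": prev_hash,
        "payload_digest": digest,
        "hash": block_hash(index, prev_hash, digest),
    }
    chain.append(block)
    return block


def verify_chain(chain: list) -> bool:
    """Return True only if every block links to and matches its predecessor."""
    prev_hash = GENESIS_HASH
    for i, block in enumerate(chain):
        if block["index"] != i or block["prev_hash"] != prev_hash:
            return False
        if block["hash"] != block_hash(i, prev_hash, block["payload_digest"]):
            return False
        prev_hash = block["hash"]
    return True
```

Because each block's hash covers the previous block's hash, changing any stored digest makes `verify_chain` fail unless every later block is recomputed as well. Strictly speaking, this makes tampering detectable; resistance to rewriting in a real blockchain comes from the distributed consensus layer that this sketch omits.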

Published
2021-08-07
How to Cite
Sachin Goyal, G. M. A. P. (2021). An approach to compress human genome sequence by delta computation and secure storage by Blockchain. Design Engineering, 7130-7144. Retrieved from http://www.thedesignengineering.com/index.php/DE/article/view/3228
Section
Articles