Quantum protocol for privacy preserving Hamming distance problem of DNA sequences

2020 ◽  
Vol 59 (7) ◽  
pp. 2101-2111
Author(s):  
Min-Yao Ma ◽  
Zhuo Liu ◽  
Yi Xu
2011 ◽  
Vol 6 (10) ◽  
pp. 33-40
Author(s):  
Joyce Jiyoung Whang ◽  
Uran Oh ◽  
Aeyoung Kim ◽  
SangHo Lee

2020 ◽  
Vol 21 (6) ◽  
pp. 2191 ◽  
Author(s):  
Qiang Yin ◽  
Ben Cao ◽  
Xue Li ◽  
Bin Wang ◽  
Qiang Zhang ◽  
...  

The high density, large capacity, and long-term stability of DNA molecules make them an emerging storage medium that is especially suitable for the long-term storage of large datasets. To avoid nonspecific hybridization reactions, the DNA sequences used in storage must satisfy relevant constraints, such as the no-runlength constraint, GC-content, and the Hamming distance. In this work, a new nonlinear control parameter strategy and a random opposition-based learning strategy were used to improve the Harris hawks optimization algorithm (the improved algorithm is called NOL-HHO) in order to prevent it from falling into local optima. Experimental testing was performed on 23 widely used benchmark functions, and the proposed algorithm was used to obtain better coding lower bounds for DNA storage. The results show that our algorithm better maintains a smooth transition between exploration and exploitation and has stronger global exploration capabilities than other algorithms. At the same time, the improvement of the lower bound directly affects the storage capacity and code rate, which promotes the further development of DNA storage technology.
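The three constraints named in the abstract above can be made concrete with a small checker. This is only an illustrative sketch, not the NOL-HHO algorithm: the function names are hypothetical, and the GC-content rule used here (exactly floor(n/2) bases are G or C) is an assumed convention that the abstract does not spell out.

```python
from itertools import combinations


def hamming(a: str, b: str) -> int:
    """Number of positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(x != y for x, y in zip(a, b))


def has_no_runlength(seq: str) -> bool:
    """True if no two adjacent bases are identical."""
    return all(x != y for x, y in zip(seq, seq[1:]))


def has_balanced_gc(seq: str) -> bool:
    """True if exactly floor(n/2) bases are G or C (assumed convention)."""
    gc_count = sum(base in "GC" for base in seq)
    return gc_count == len(seq) // 2


def is_valid_code(code: list[str], min_distance: int) -> bool:
    """Check every codeword, then every pair, against the constraints."""
    words_ok = all(has_no_runlength(s) and has_balanced_gc(s) for s in code)
    pairs_ok = all(
        hamming(a, b) >= min_distance for a, b in combinations(code, 2)
    )
    return words_ok and pairs_ok
```

A larger set of codewords that passes `is_valid_code` for a given length and minimum distance corresponds to a better lower bound in the sense the abstract describes.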


2000 ◽  
Vol 7 (11) ◽  
Author(s):  
Jakob Pagter

In this report we study the proof employed by Miklos Ajtai [Determinism versus Non-Determinism for Linear Time RAMs with Memory Restrictions, 31st Symposium on Theory of Computation (STOC), 1999] when proving a non-trivial lower bound in a general model of computation for the Hamming Distance problem: given n elements, decide whether any two of them have "small" Hamming distance. Specifically, Ajtai was able to show that any R-way branching program deciding this problem using time O(n) must use space Omega(n lg n). We generalize Ajtai's original proof, allowing us to prove a time-space trade-off for deciding the Hamming Distance problem in the R-way branching program model for time between n and alpha n lg n / lg lg n, for some suitable 0 < alpha < 1. In particular we prove that if space is O(n^(1−epsilon)), then time is Omega(n lg n / lg lg n).
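For orientation, the decision problem itself can be stated as a brute-force check over all pairs. This sketch only illustrates the problem definition; it says nothing about the branching-program lower bounds. The function name and the explicit `threshold` parameter (standing in for "small") are assumptions.

```python
from itertools import combinations


def has_close_pair(elements: list[str], threshold: int) -> bool:
    """Return True if two of the elements differ in at most `threshold` positions.

    All elements are assumed to be strings of one common length.
    """
    if len({len(e) for e in elements}) > 1:
        raise ValueError("all elements must have the same length")
    for a, b in combinations(elements, 2):
        distance = sum(x != y for x, y in zip(a, b))
        if distance <= threshold:
            return True
    return False
```

This naive check compares every pair, so it performs a number of comparisons quadratic in n.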


Author(s):  
Daniel Liu

Previous algorithms for solving the approximate string matching with Hamming distance problem with wildcard ("don't care") characters have been shown to take \(O(|\Sigma| N \log M)\) time, where \(N\) is the length of the text, \(M\) is the length of the pattern, and \(|\Sigma|\) is the size of the alphabet. They make use of the Fast Fourier Transform for efficiently calculating convolutions. We describe a novel approach to the problem, which makes use of special encoding schemes that depend on \((|\Sigma| - 1)\)-simplexes in \((|\Sigma| - 1)\)-dimensional space.
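The per-symbol counting idea behind the earlier \(O(|\Sigma| N \log M)\) algorithms can be sketched as follows. This is not the simplex encoding proposed in the abstract. To stay dependency-free, each correlation is computed directly, so the sketch does not reach the FFT running time. The function names and the `?` wildcard character are assumptions.

```python
def correlate(text_bits: list[int], pattern_bits: list[int]) -> list[int]:
    """Sliding dot product; an FFT-based convolution would replace this step."""
    n, m = len(text_bits), len(pattern_bits)
    return [
        sum(text_bits[i + j] * pattern_bits[j] for j in range(m))
        for i in range(n - m + 1)
    ]


def wildcard_hamming(text: str, pattern: str, wildcard: str = "?") -> list[int]:
    """Hamming distance of `pattern` at every alignment in `text`.

    A wildcard on either side never counts as a mismatch.
    """
    alphabet = (set(text) | set(pattern)) - {wildcard}
    # Number of aligned positions where both characters are concrete symbols.
    comparable = correlate(
        [int(c != wildcard) for c in text],
        [int(c != wildcard) for c in pattern],
    )
    # One correlation per alphabet symbol counts the positions that agree.
    matches = [0] * len(comparable)
    for symbol in sorted(alphabet):
        hits = correlate(
            [int(c == symbol) for c in text],
            [int(c == symbol) for c in pattern],
        )
        matches = [total + h for total, h in zip(matches, hits)]
    return [c - m for c, m in zip(comparable, matches)]
```

The loop over the alphabet is where the \(|\Sigma|\) factor in the stated bound comes from: one correlation per symbol, plus one for the wildcard mask.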


2006 ◽  
Vol 99 (4) ◽  
pp. 149-153 ◽  
Author(s):  
Wei Huang ◽  
Yaoyun Shi ◽  
Shengyu Zhang ◽  
Yufan Zhu

2020 ◽  
Vol 2020 ◽  
pp. 1-11
Author(s):  
Juan Zhang ◽  
Changsheng Wan ◽  
Chunyu Zhang ◽  
Xiaojun Guo ◽  
Yongyong Chen

To determine whether images on the crowdsourcing server meet the mobile user’s requirement, an auditing protocol is desired to check these images. However, before paying for images, the mobile user typically cannot download them for checking. Moreover, since mobiles are usually low-power devices and the crowdsourcing server has to handle a large number of mobile users, the auditing protocol should be lightweight. To address the above security and efficiency issues, we propose a novel noninteractive lightweight privacy-preserving auditing protocol on images in mobile crowdsourcing networks, called NLPAS. Since NLPAS allows the mobile user to check images on the crowdsourcing server without downloading them, the newly designed protocol can provide privacy protection for these images. At the same time, NLPAS uses the binary convolutional neural network for extracting features from images and designs a novel privacy-preserving Hamming distance computation algorithm for determining whether these images on the crowdsourcing server meet the mobile user’s requirement. Since these two techniques are both lightweight, NLPAS can audit images on the crowdsourcing server in a privacy-preserving manner while still enjoying high efficiency. Experimental results show that NLPAS is feasible for real-world applications.


2020 ◽  
Author(s):  
Alexey V. Rakov ◽  
Dieter M. Schifferli ◽  
Shu-Lin Liu ◽  
Emilio Mastriani

The problem of fast calculation of Hamming distance inferred from many sequence datasets is still not a trivial task. Here, we present HamHeat, a new software package to efficiently calculate Hamming distance for hundreds of aligned protein or DNA sequences of a large number of residues or nucleotides, respectively. HamHeat uses a unique algorithm with many advantages, including its ease of use and the execution of fast runs for large amounts of data. The package consists of three consecutive modules. In the first module, the software ranks the sequences from the most to the least frequent variant. The second module uses the most common variant as the reference sequence to calculate the Hamming distance of each additional sequence based on the number of residue or nucleotide changes. A final module formats all the results in a comprehensive table that displays the sequence ranks and Hamming distances. Availability and implementation: HamHeat is based on Python 3 and AWK, runs on Linux systems, and is available under the MIT License at: https://github.com/alexeyrakov/[email protected]
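The three-module flow described above (rank variants by frequency, measure each against the most common one, format a table) can be sketched in a few lines. This is an independent illustration under stated assumptions, not HamHeat's actual code: the function names, the tab-separated layout, and the tie-breaking rule (first-seen order, as given by `Counter.most_common`) are all hypothetical.

```python
from collections import Counter


def rank_and_measure(sequences: list[str]) -> list[tuple[int, str, int, int]]:
    """Return rows of (rank, variant, count, Hamming distance to reference)."""
    if not sequences:
        return []
    # Module 1: rank distinct variants from most to least frequent.
    ranked = Counter(sequences).most_common()
    # Module 2: the most common variant serves as the reference sequence.
    reference = ranked[0][0]
    rows = []
    for rank, (variant, count) in enumerate(ranked, start=1):
        if len(variant) != len(reference):
            raise ValueError("aligned sequences must share one length")
        distance = sum(a != b for a, b in zip(variant, reference))
        rows.append((rank, variant, count, distance))
    return rows


def format_table(rows: list[tuple[int, str, int, int]]) -> str:
    """Module 3: render the rows as a tab-separated table with a header."""
    lines = ["rank\tvariant\tcount\thamming"]
    lines += [f"{r}\t{v}\t{c}\t{d}" for r, v, c, d in rows]
    return "\n".join(lines)
```

Calling `format_table(rank_and_measure(seqs))` on a list of aligned sequences `seqs` yields one row per distinct variant, with the reference in the first row at distance 0.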


2018 ◽  
Author(s):  
Maria Fernandes ◽  
Jérémie Decouchant ◽  
Marcus Völp ◽  
Francisco M Couto ◽  
Paulo Esteves-Veríssimo

The advent of high-throughput next-generation sequencing (NGS) machines made DNA sequencing cheaper, but also put pressure on the genomic life-cycle, which includes aligning millions of short DNA sequences, called reads, to a reference genome. On the performance side, efficient algorithms have been developed and parallelized on public clouds. On the privacy side, since genomic data are utterly sensitive, several cryptographic mechanisms have been proposed to align reads securely, with a lower performance than the former, which in turn are not secure. This manuscript proposes a novel contribution to improving the privacy-performance product in current genomic studies. Building on recent works that argue that genomic data need to be treated according to a threat-risk analysis, we introduce a multi-level sensitivity classification of genomic variations. Our classification prevents the amplification of possible privacy attacks, thanks to promoting and partitioning mechanisms among sensitivity levels. Thanks to this classification, reads can be aligned, stored, and later accessed using different security levels. We then extend a recent filter, which detects the reads that carry sensitive information, to classify reads into sensitivity levels. Finally, based on a review of the existing alignment methods, we show that adapting alignment algorithms to read sensitivity allows high performance gains while enforcing high privacy levels. Our results indicate that using sensitivity levels is a feasible way to optimize the performance of privacy-preserving alignment, if one combines the advantages of private and public clouds.

