file formats Latest Research Papers

Assessing and assuring interoperability of a genomics file format

Background: Bioinformatics software tools operate largely through the use of specialized genomics file formats. Often these formats lack formal specification, and only rarely do the creators of these tools robustly test them for correct handling of input and output. This causes interoperability problems between different tools that, at best, waste time and frustrate users. At worst, interoperability issues could lead to undetected errors in scientific results. Methods: We sought (1) to assess the interoperability of a wide range of bioinformatics software using a shared genomics file format and (2) to provide a simple, reproducible method for enhancing interoperability. As a focus, we selected the popular BED file format for genomic interval data. Based on the file format's original documentation, we created a formal specification. We developed a new verification system, Acidbio (https://github.com/hoffmangroup/acidbio), which tests for correct behavior in bioinformatics software packages. We crafted tests to unify correct behavior when tools encounter various edge cases (potentially unexpected inputs that exemplify the limits of the format). To analyze the performance of existing software, we tested the input validation of 80 Bioconda packages that parsed the BED format. We also used a fuzzing approach to automatically perform additional testing. Results: Of 80 software packages examined, 75 achieved less than 70% correctness on our test suite. We categorized multiple root causes for the poor performance of different types of software. Fuzzing detected other errors that the manually designed test suite could not. We also created a badge system that developers can use to indicate more precisely which BED variants their software accepts and to advertise the software's performance on the test suite. Discussion: Acidbio makes it easy to assess interoperability of software using the BED format, and therefore to identify areas for improvement in individual software packages. Applying our approach to other file formats would increase the reliability of bioinformatics software and data.
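
To make the idea of edge-case input validation concrete, here is a minimal Python sketch of a checker for the first three BED fields. It is not Acidbio's implementation, and the rules it applies (tab-separated fields, non-negative integer coordinates, chromStart no greater than chromEnd) are a simplified assumption rather than the formal specification described above. The function name `check_bed3_line` and the sample lines are hypothetical.

```python
from typing import Optional


def check_bed3_line(line: str) -> Optional[str]:
    """Return None if the line looks like a valid BED3 record, else a reason.

    Assumed, simplified rules: at least three tab-separated fields, where
    chromStart and chromEnd are non-negative integers and
    chromStart <= chromEnd.
    """
    fields = line.rstrip("\n").split("\t")
    if len(fields) < 3:
        return "fewer than 3 tab-separated fields"
    chrom, start_text, end_text = fields[0], fields[1], fields[2]
    if not chrom:
        return "empty chrom field"
    # isdigit() alone accepts some non-ASCII digits, so require ASCII too.
    for name, text in (("chromStart", start_text), ("chromEnd", end_text)):
        if not (text.isascii() and text.isdigit()):
            return f"{name} is not a non-negative integer"
    if int(start_text) > int(end_text):
        return "chromStart is greater than chromEnd"
    return None


# Hypothetical edge cases of the kind a test suite might probe.
examples = ["chr1\t0\t100", "chr1\t100\t50", "chr1\t-5\t10", "chr1 0 100"]
for example in examples:
    print(repr(example), "->", check_bed3_line(example) or "ok")
```

Returning a reason string instead of a boolean makes it easier to see which rule a given input violates, which is the kind of information an interoperability report needs.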

Unified Nanotechnology Format: One Way to Store Them All

Molecules, 2021, Vol 27 (1), pp. 63. DOI: 10.3390/molecules27010063

Author(s): David Kuťák, Erik Poppleton, Haichao Miao, Petr Šulc, Ivan Barišić

Keyword(s): Free Form, File Format, DNA Structures, RNA Nanotechnology, DNA and RNA, File Formats, Computer Based, Wet Lab, The Many, Future Work

The domains of DNA and RNA nanotechnology are steadily gaining in popularity while proving their value with various successful results, including biosensing robots and drug delivery cages. Nowadays, the nanotechnology design pipeline usually relies on computer-aided design (CAD) approaches to design and simulate the desired structure before the wet lab assembly. To aid with these tasks, various software tools exist and are often used in conjunction. However, their interoperability is hindered by the lack of a common file format that is fully descriptive of the many design paradigms. Therefore, in this paper, we propose a Unified Nanotechnology Format (UNF) designed specifically for the biomimetic nanotechnology field. UNF allows storage of both design and simulation data in a single file, including free-form and lattice-based DNA structures. By defining a logical and versatile format, we hope it will become a widely accepted and used file format for the nucleic acid nanotechnology community, facilitating the future work of researchers and software developers. Together with the format description and publicly available documentation, we provide a set of converters from existing file formats to simplify the transition. Finally, we present several use cases visualizing example structures stored in UNF, showcasing the various types of data UNF can handle.

Ontology of Heterogeneous Image File Formats and their Disparate Applications

International Journal of Advanced Trends in Computer Science and Engineering, 2021, Vol 10 (6), pp. 3138-3143. DOI: 10.30534/ijatcse/2021/091062021

Keyword(s): Digital Images, File Format, Image File, Vector Type, File Formats, The World, Basic Image, Image File Format, Image Format, Image Type

Many image formats are in use today for various purposes. This paper elaborates the ontology of the different image file formats and their applications. Digital images are saved in various image file formats, each with properties and features that make it suited to a particular use. A digital image is primarily classified as one of two types, raster or vector. An image format determines how the information in the image is stored; an image file format is a systematic way of storing and arranging digital images. An image file format can store data in a compressed format (which may be lossy or lossless), an uncompressed format, or a vector format. Some image formats are suitable for a particular purpose while others are not: TIFF is good for printing, whereas PNG and JPG are best for the web. A practical analysis of the basic image file formats has been carried out, and the results are presented in the following section.
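
The difference between lossless and lossy storage can be shown with a short Python sketch. It is an illustration only and does not come from the paper: the pixel values are hypothetical, the lossless path uses DEFLATE via the standard `zlib` module (the compression method PNG uses), and the lossy path is a toy quantization step, not JPEG.

```python
import zlib

# Hypothetical 8-bit grayscale pixel values standing in for raster data.
pixels = bytes([10, 10, 10, 12, 200, 201, 203, 255] * 4)

# Lossless: DEFLATE (the compression method used inside PNG) round-trips exactly.
compressed = zlib.compress(pixels)
restored = zlib.decompress(compressed)

# Lossy (toy illustration only, not JPEG): drop the 4 low bits of each pixel.
quantized = bytes((p >> 4) << 4 for p in pixels)

print("lossless round trip exact:", restored == pixels)
print("lossy round trip exact:", quantized == pixels)
print("max lossy error:", max(abs(a - b) for a, b in zip(pixels, quantized)))
```

The lossless path recovers every byte, while the quantized copy differs from the original by a bounded amount. That is the trade-off a format accepts in exchange for storing less information.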

Revisiting dynamic range and image enhancement ability of contemporary digital radiographic systems

Dentomaxillofacial Radiology, 2021. DOI: 10.1259/dmfr.20210404

Author(s): Luiz Eduardo Marinho, Luciano Augusto Cano Martins, Deborah Queiroz Freitas, Francisco Haiter-Neto, Matheus L. Oliveira

Keyword(s): Image Enhancement, Dynamic Range, Angular Coefficient, Image Brightness, File Formats, Contrast Enhanced, Phosphor Plate, The Mean, Post Hoc, The Relationship

Objectives: To assess the dynamic range and enhancement ability of radiographs acquired with contemporary digital systems. Methods: Five repeated periapical radiographs of human mandibles with an aluminium step-wedge were acquired using two sensor-based and three photostimulable phosphor (PSP) plate-based systems and an X-ray unit at ten exposure times: 0.020, 0.032, 0.063, 0.080, 0.100, 0.200, 0.320, 0.400, 0.500, and 0.630 s. All images had their brightness and contrast enhanced by two experienced oral and maxillofacial radiologists in consensus and were exported in both the original and enhanced file formats. Mean grey values were obtained from the aluminium steps and tabulated with their corresponding thicknesses for each exposure time, digital radiographic system, and file format. Images with saturated steps were excluded. To assess image brightness, the mean grey values from the remaining images were averaged. To assess image contrast, the angular coefficient of the linear trendline relating mean grey values to their corresponding aluminium thicknesses was calculated. Brightness and contrast values were compared using two-way ANOVA with post-hoc Tukey (α = 0.05). Results: PSP plate-based digital radiographic systems had a broader dynamic range. Longer exposure times produced original images with lower brightness and variable contrast (p < 0.05). Subjective enhancement significantly increased or reduced brightness and/or contrast in some systems (p < 0.05). Conclusions: Contemporary digital radiographic systems present different dynamic ranges and exposure-related brightness and contrast. Image enhancement may be a valuable tool at slightly suboptimal exposure times.
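
The two measures described in the Methods can be sketched in a few lines of Python: brightness as the average of the mean grey values, and contrast as the slope (angular coefficient) of a least-squares line of grey value against aluminium thickness. The numbers below are hypothetical and chosen only to make the arithmetic easy to follow; they are not data from the study, and the use of ordinary least squares for the trendline is an assumption.

```python
from statistics import mean


def slope(xs, ys):
    """Least-squares slope of ys against xs (the 'angular coefficient')."""
    x_bar, y_bar = mean(xs), mean(ys)
    numerator = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    denominator = sum((x - x_bar) ** 2 for x in xs)
    return numerator / denominator


# Hypothetical values for illustration only; not data from the study.
thickness_mm = [2, 4, 6, 8, 10]
mean_grey = [40, 70, 100, 130, 160]

brightness = mean(mean_grey)               # average grey level across steps
contrast = slope(thickness_mm, mean_grey)  # grey-level change per mm
print(f"brightness={brightness}, contrast={contrast}")
```

A steeper slope means that adjacent aluminium steps differ more in grey level, which is why the slope serves as a contrast measure.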

easyfm: An easy software suite for file manipulation of Next Generation Sequencing data on desktops

2021. DOI: 10.22541/au.163845474.49811073/v1

Author(s): Hyungtaek Jung, Brendan Jeon, Daniel Ortiz-Barrientos

Keyword(s): Next Generation Sequencing, Life Sciences, Next Generation Sequencing Data, Command Line, Next Generation, Web Based, File Formats, Wide Range, NGS Data, Generation Sequencing

Storing and manipulating Next Generation Sequencing (NGS) file formats for understanding biological phenomena is an essential but difficult task in the life sciences. Yet most methods for analysing NGS data require complex command-line tools in high-performance computing (HPC) environments or on web-based servers, and have not yet been implemented in comprehensive, easy-to-use software. Here we present easyfm (easy file manipulation), a free standalone Graphical User Interface (GUI) software with Python support that facilitates the rapid discovery of target sequences (or other regions of interest to the user) in NGS datasets and is accessible to novice users such as biologists. It enables them to perform end-to-end reproducible data analyses using a desktop application (Windows, Mac and Linux). Unlike existing tools, the GUI-based easyfm is not dependent on any HPC system and can be operated without an internet connection. For user-friendliness and convenience, easyfm was developed with four work modules and a secondary GUI window, covering different aspects of NGS data analysis, including post-processing, filtering, format conversion, generating results, real-time log, and help. In combination with the executable tools (BLAST+ and BLAT) and Python, easyfm allows the user to set analysis parameters, select/extract regions of interest, examine the input and output results, and convert to a wide range of file formats. To help augment the functionality of existing web-based and command-line tools, easyfm, a self-contained program, comes with extensive documentation (https://github.com/TaekAndBrendan/easyfm). This allows easyfm to seamlessly integrate visual and interactive representations of NGS files, supporting a wider scope of bioinformatics applications in the life sciences.

An OpenBIM workflow to support collaboration between Acoustic Engineers and Architects

Journal of Physics Conference Series, 2021, Vol 2069 (1), pp. 012164. DOI: 10.1088/1742-6596/2069/1/012164

Author(s): Tim Pat McGinley, Thomas Vestergaard, Cheol-Ho Jeong, Finnur Pind

Keyword(s): Real Time, Design Process, Architectural Design, Acoustic Analysis, Geometric Information, International Standard, Acoustic Performance, Analysis Process, Exchange Format, File Formats

Architects require the insight of acoustic engineers to understand how to improve and/or optimize the acoustic performance of their buildings. Normally this is supported by the architect providing digital models of the design to the acoustic engineer for analysis in the acoustician’s disciplinary software, for instance Odeon. This current workflow suffers from the following challenges: (1) architects typically require feedback on architectural disciplinary models that have too much geometric information, unnecessarily complicating the acoustic analysis process; (2) the acoustician then has to waste time simplifying that geometry; and (3) this extra work wastes money that could otherwise be spent on faster design iterations supported by frequent feedback between architects and acousticians early in the design process. This paper focuses on the architect/acoustician workflow; however, similar challenges can be found in other disciplines. OpenBIM workflows provide opportunities to increase the standardization of processes and interfaces between disciplines by reducing the reliance on proprietary, discipline-specific file formats and tools. This paper lays the foundation for an OpenBIM workflow that enables the acoustic engineer to provide near-real-time feedback on the acoustic performance of the architectural design. The proposed workflow investigates the use of the international standard IFC as a design format rather than simply an exchange format. The workflow is presented here with the intention that it will be further explored and developed by other researchers, architects and acousticians.

Virtual Disks Performance Analysis Using Flexible I/O and Powerstat

International Journal for Research in Applied Science and Engineering Technology, 2021, Vol 9 (10), pp. 267-296. DOI: 10.22214/ijraset.2021.38392

Author(s): Pritam Patange

Keyword(s): Cloud Computing, Virtual Machine, Virtual Machines, Disk File, Virtual Machine Monitor, Image File, File Formats, Physical Infrastructure, Disk Image, The Impact

Abstract: Cloud computing has experienced significant growth in recent years owing to the advantages it provides, such as 24/7 availability, quick provisioning of resources, and easy scalability, to name a few. Virtualization is the backbone of cloud computing. Virtual Machines (VMs) are created and executed by software called a Virtual Machine Monitor (VMM) or hypervisor, which separates compute environments from the actual physical infrastructure. A disk image file representing a single virtual machine is created on the hypervisor’s file system. In this paper, we analysed the runtime performance of multiple disk image file formats. The analysis comprises four performance parameters, namely bandwidth, latency, input-output operations performed per second (IOPS) and power consumption. The impact of the hypervisor’s block and file sizes is also analysed for the different file formats. The paper aims to act as a reference for the reader in choosing the most appropriate disk image file format for their use case, based on performance comparisons between different disk image file formats on two hypervisors: KVM and VirtualBox. Keywords: Virtualization, Virtual disk formats, Cloud computing, fio, KVM, virt-manager, powerstat, VirtualBox.
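
Three of the metrics named above are arithmetically related, which a brief Python sketch can make explicit: bandwidth equals IOPS multiplied by block size, and, by Little's law, mean latency equals the number of outstanding requests divided by IOPS. The figures and the labels "image format A" and "image format B" are hypothetical and are not results from the paper; the latency relation assumes the queue is kept full at the stated depth.

```python
def bandwidth_mib_per_s(iops: float, block_size_kib: float) -> float:
    """Bandwidth implied by an IOPS figure at a fixed block size."""
    return iops * block_size_kib / 1024


def mean_latency_ms(iops: float, queue_depth: int = 1) -> float:
    """Mean latency from Little's law: queue_depth = IOPS * latency."""
    return queue_depth / iops * 1000


# Hypothetical measurements for illustration only; not results from the paper.
runs = [
    {"label": "image format A", "iops": 20000, "block_size_kib": 4},
    {"label": "image format B", "iops": 1500, "block_size_kib": 64},
]
for run in runs:
    bw = bandwidth_mib_per_s(run["iops"], run["block_size_kib"])
    lat = mean_latency_ms(run["iops"])
    print(f'{run["label"]}: {bw:.1f} MiB/s, {lat:.3f} ms mean latency')
```

This is why block size has to be reported alongside any comparison: a run with far fewer IOPS can still deliver more bandwidth if each operation moves a larger block.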

ClineHelpR: an R package for genomic cline outlier detection and visualization

BMC Bioinformatics, 2021, Vol 22 (1). DOI: 10.1186/s12859-021-04423-x

Author(s): Bradley T. Martin, Tyler K. Chafin, Marlis R. Douglas, Michael E. Douglas

Keyword(s): Management Strategies, R Package, Model Organisms, Hybrid Zones, Testing Tools, File Formats, Genome Wide, Outlier Loci, Adaptive Processes, Post Hoc

Background: Patterns of multi-locus differentiation (i.e., genomic clines) often extend broadly across hybrid zones and their quantification can help diagnose how species boundaries are shaped by adaptive processes, both intrinsic and extrinsic. In this sense, the transitioning of loci across admixed individuals can be contrasted as a function of the genome-wide trend, in turn allowing an expansion of clinal theory across a much wider array of biodiversity. However, computational tools that serve to interpret and consequently visualize ‘genomic clines’ are limited, and users must often write custom, relatively complex code to do so. Results: Here, we introduce the ClineHelpR R-package for visualizing genomic clines and detecting outlier loci using output generated by two popular software packages, bgc and Introgress. ClineHelpR bundles both input generation (i.e., filtering datasets and creating specialized file formats) and output processing (e.g., MCMC thinning and burn-in) with functions that directly facilitate interpretation and hypothesis testing. Tools are also provided for post-hoc analyses that interface with external packages such as ENMeval and RIdeogram. Conclusions: Our package increases the reproducibility and accessibility of genomic cline methods, thus allowing an expanded user base and promoting these methods as mechanisms to address diverse evolutionary questions in both model and non-model organisms. Furthermore, the ClineHelpR extended functionality can evaluate genomic clines in the context of spatial and environmental features, allowing users to explore underlying processes potentially contributing to the observed patterns and helping facilitate effective conservation management strategies.
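
The output-processing step mentioned above, MCMC burn-in and thinning, amounts to discarding an initial stretch of samples and then keeping every k-th one. The Python sketch below illustrates the general technique only. ClineHelpR is an R package and this is not its API; the function name `burn_in_and_thin` and the toy chain are hypothetical.

```python
def burn_in_and_thin(samples, burn_in, thin):
    """Drop the first `burn_in` samples, then keep every `thin`-th sample."""
    if burn_in < 0 or thin < 1:
        raise ValueError("burn_in must be >= 0 and thin must be >= 1")
    return samples[burn_in::thin]


# Toy chain: sample indices stand in for posterior draws.
chain = list(range(20))
kept = burn_in_and_thin(chain, burn_in=5, thin=3)
print(kept)
```

Burn-in removes draws taken before the chain has settled, and thinning reduces autocorrelation between the samples that remain.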

Stegomalware: A Systematic Survey of Malware Hiding and Detection in Images, Machine Learning Models and Research Challenges

2021. DOI: 10.36227/techrxiv.16755457 (also listed under 10.36227/techrxiv.16755457.v1)

Author(s): Raj Chaganti, Vinayakumar R, Mamoun Alazab, Tuan Pham

Keyword(s): Machine Learning, Academic Research, Image Steganography, Machine Learning Techniques, Current Status, Generative Adversarial Networks, Malware Analysis, Source Of Infection, Adversarial Networks, File Formats

Malware is commonly distributed to a victim network through file attachments in phishing emails or through illegitimate files downloaded from the internet, when the victim interacts with the source of infection. To detect and prevent malware distribution on the victim machine, existing end-device security applications may leverage sophisticated techniques such as signature-based, anomaly-based, or machine learning techniques. The well-known file formats Portable Executable (PE) for Windows and Executable and Linkable Format (ELF) for Linux-based operating systems are used for malware analysis, and malware detection for these file types is well advanced for real-time detection. However, detecting malware payloads hidden by steganography in multimedia such as cover images has been a challenge for enterprises, as these payloads are rarely seen and usually act as a stager in sophisticated attacks. In this article, to our knowledge, we are the first to try to address the knowledge gap between current academic research on image steganography and steganalysis, which focuses on data hiding, and the current status of stegomalware (malware payloads hidden in images) used in cyberattacks targeting enterprises. We present the history of stegomalware, its generation tools, and a description of file format specifications. Based on our findings, we review in detail the image steganography techniques, including recent Generative Adversarial Network (GAN) based models, and the image steganalysis methods, including deep learning approaches. Opportunities and challenges in stegomalware generation and detection are also presented.
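
As background to the PE and ELF formats named above, the following Python sketch identifies them from their header signatures: ELF files begin with the bytes `0x7F 'E' 'L' 'F'`, and PE files begin with an `MZ` DOS header whose 4-byte little-endian field at offset 0x3C points to a `PE\0\0` signature. The sketch only recognizes the container format. It is not a malware detector and does not come from the survey; the function name `sniff_executable_format` is hypothetical.

```python
import struct


def sniff_executable_format(data: bytes) -> str:
    """Classify a byte string as 'ELF', 'PE', or 'unknown' by its signatures."""
    if data[:4] == b"\x7fELF":
        return "ELF"
    if data[:2] == b"MZ" and len(data) >= 0x40:
        # The DOS header stores the offset of the PE signature at 0x3C.
        (pe_offset,) = struct.unpack_from("<I", data, 0x3C)
        if data[pe_offset:pe_offset + 4] == b"PE\x00\x00":
            return "PE"
    return "unknown"


# Synthetic headers built for illustration; these are not real executables.
fake_pe = bytearray(0x44)
fake_pe[:2] = b"MZ"
struct.pack_into("<I", fake_pe, 0x3C, 0x40)
fake_pe[0x40:0x44] = b"PE\x00\x00"

png_signature = b"\x89PNG\r\n\x1a\n"

print(sniff_executable_format(bytes(fake_pe)))
print(sniff_executable_format(b"\x7fELF" + bytes(12)))
print(sniff_executable_format(png_signature))
```

A cover image carrying a hidden payload still presents an ordinary image signature, so a check of this kind classifies it as a non-executable file. This is part of why the survey treats stegomalware as a separate detection problem.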
