In-Memory Management System for 3D Protein Macromolecular Structures

Bozena Malysiak-Mrozek; Kamil Zur; Dariusz Mrozek

doi:10.2174/1570164615666180320151452

Abstract

Background: The Protein Data Bank is a worldwide repository that collects and provides macromolecular data on protein structures and other molecules for the life sciences community. Manipulating vast numbers of 3D protein structures and exploring their properties requires parsing thousands of flat files that describe these macromolecular structures every time calculations are performed.

Objective: Expecting more protein structures to appear in open-access repositories such as the Protein Data Bank, and aiming to meet the expectations of the era of fast data analytics, we propose an in-memory management system for protein structures that predominantly uses the main memory of the host server to store, manage, and manipulate data. This eliminates the overhead of loading data from hard drives and storing them in a buffer cache.
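To make the idea concrete, the following minimal Python sketch parses the ATOM records of a PDB flat file once and keeps the result in main memory, so later operations work on parsed objects rather than on files. It is an illustration only: the abstract does not describe the actual IMPSMS data structures, and the names `Atom`, `parse_pdb_atoms`, and `InMemoryStructureStore` are hypothetical. The column offsets follow the fixed-width ATOM record layout of the PDB format.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class Atom:
    """One parsed ATOM record (hypothetical layout, not the IMPSMS schema)."""
    name: str
    res_name: str
    chain_id: str
    res_seq: int
    xyz: Tuple[float, float, float]


def parse_pdb_atoms(pdb_text: str) -> List[Atom]:
    """Parse fixed-width ATOM records from the text of a PDB flat file."""
    atoms = []
    for line in pdb_text.splitlines():
        if not line.startswith("ATOM"):
            continue  # skip HEADER, HETATM, TER, END, and other records
        atoms.append(Atom(
            name=line[12:16].strip(),       # columns 13-16: atom name
            res_name=line[17:20].strip(),   # columns 18-20: residue name
            chain_id=line[21],              # column 22: chain identifier
            res_seq=int(line[22:26]),       # columns 23-26: residue number
            xyz=(float(line[30:38]),        # columns 31-38: x
                 float(line[38:46]),        # columns 39-46: y
                 float(line[46:54])),       # columns 47-54: z
        ))
    return atoms


class InMemoryStructureStore:
    """Keeps parsed structures in a dictionary held in main memory."""

    def __init__(self) -> None:
        self._structures: Dict[str, List[Atom]] = {}

    def insert(self, structure_id: str, pdb_text: str) -> None:
        """Parse once on insert; inserting an existing ID acts as an update."""
        self._structures[structure_id] = parse_pdb_atoms(pdb_text)

    def select(self, structure_id: str) -> List[Atom]:
        """Return the already-parsed atoms without touching any file."""
        return self._structures[structure_id]

    def search_by_residue(self, res_name: str) -> List[str]:
        """Return the sorted IDs of structures containing the given residue."""
        return sorted(
            sid for sid, atoms in self._structures.items()
            if any(atom.res_name == res_name for atom in atoms)
        )
```

In this sketch the parsing cost is paid once per structure at insert time; every later `select` or `search_by_residue` call reads only from the dictionary.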

Method: In this paper, we present the in-memory protein structure management system (IMPSMS), which supports basic operations such as selecting, inserting, updating, and searching protein structures, as well as more sophisticated functions such as batch calculation of the root mean square deviation (RMSD) between proteins stored in the database, batch calculation of torsion angles, structure comparison, structural alignment, and superposition of a given molecule onto molecules stored in the in-memory database.
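Two of the batch functions named above, root mean square deviation and torsion angles, can be sketched in a few lines of Python. This is a hedged illustration, not the IMPSMS implementation: the function names are hypothetical, coordinates are assumed to be plain (x, y, z) tuples already held in memory, and the RMSD here assumes the two structures have equal numbers of atoms and are already superposed (no optimal superposition step is performed).

```python
import math
from typing import Dict, Sequence, Tuple

Point = Tuple[float, float, float]


def rmsd(a: Sequence[Point], b: Sequence[Point]) -> float:
    """RMSD between two equally sized, already superposed coordinate sets."""
    if len(a) != len(b) or not a:
        raise ValueError("coordinate sets must be non-empty and of equal size")
    squared = sum((p - q) ** 2 for pa, pb in zip(a, b) for p, q in zip(pa, pb))
    return math.sqrt(squared / len(a))


def batch_rmsd(query: Sequence[Point],
               store: Dict[str, Sequence[Point]]) -> Dict[str, float]:
    """RMSD of the query against every stored structure of matching size."""
    return {sid: rmsd(query, coords)
            for sid, coords in store.items()
            if len(coords) == len(query)}


def _sub(u: Point, v: Point) -> Point:
    return (u[0] - v[0], u[1] - v[1], u[2] - v[2])


def _dot(u: Point, v: Point) -> float:
    return u[0] * v[0] + u[1] * v[1] + u[2] * v[2]


def _cross(u: Point, v: Point) -> Point:
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])


def torsion_angle(p0: Point, p1: Point, p2: Point, p3: Point) -> float:
    """Torsion (dihedral) angle in degrees defined by four consecutive atoms."""
    b1, b2, b3 = _sub(p1, p0), _sub(p2, p1), _sub(p3, p2)
    n1, n2 = _cross(b1, b2), _cross(b2, b3)   # normals of the two planes
    x = _dot(n1, n2)
    y = math.sqrt(_dot(b2, b2)) * _dot(b1, n2)
    return math.degrees(math.atan2(y, x))
```

Because both functions operate on coordinates that are already in memory, a batch run over many stored structures involves no file access or parsing.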

Results: In the experimental part, we show that, with dedicated in-memory data structures, particular operations on proteins can be performed up to a hundred times faster than analogous operations preceded by traditional loading and parsing of macromolecular data from standard PDB flat files.

Conclusion: Our work demonstrates that designing dedicated data structures and management systems for frequent protein data manipulations brings significant time savings and increases the capability to run fast data analytics in bioinformatics.

Keywords: Databases, in-memory, management system, proteins, 3D protein structure, structural bioinformatics.


DOI: https://dx.doi.org/10.2174/1570164615666180320151452
Print ISSN: 1570-1646
Online ISSN: 1875-6247
Publisher Name: Bentham Science Publisher

Current Proteomics
