Digital communication is generating ever larger volumes of data (keyword: big data). This is a great opportunity for companies that work with data. But the more data a company has, the harder it becomes to extract relationships, patterns, and meaningful statements from it. IT solutions and systems are needed to help companies analyze this flood of information. Traditional databases are often no longer sufficient to store, retrieve, and ultimately process such huge data collections. When a classical database reaches its limits, an in-memory database is worth considering.
What is an in-memory database?
An in-memory database (IMDB) is a database management system that keeps its data directly in the main memory (RAM) of one or more computers. Using main memory has a decisive advantage: access is much faster than with disk-based storage, so stored data is available quickly.
How does an in-memory database work?
An in-memory database stores large amounts of data and delivers a variety of analysis results. But how exactly is big data stored, and what technology makes this storage possible?
This is how your data is stored
In-memory databases distinguish between row-based and column-based data storage; some database systems use both. Row-oriented databases store the values of each record together in one row. If, for example, the values Name, Location, and Country are stored, the data is laid out as follows: Name 1, Location 1, Country 1, Name 2, Location 2, Country 2. With column-based storage, the values of each category are stored together: Name 1, Name 2, Location 1, Location 2, Country 1, Country 2.
The column-based data storage format is called the column-store format. Because values of the same kind, and often identical values, sit next to each other, the system can compress them well and reduce the amount of data to be stored. This reduces both storage space and transfer time. Analysis performance also improves because a query reads only the columns it needs rather than every full record. This form of data access is called column projection.
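The difference between the two layouts can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the three records, the `dictionary_encode` helper, and the use of dictionary encoding as the compression scheme are hypothetical choices, not a description of how any particular product stores data.

```python
# Three hypothetical records, following the Name / Location / Country example.
records = [
    ("Name 1", "Location 1", "Country 1"),
    ("Name 2", "Location 2", "Country 2"),
    ("Name 3", "Location 3", "Country 1"),
]

# Row-oriented layout: the values of each record sit next to each other.
row_store = [value for record in records for value in record]

# Column-oriented layout: the values of each category sit next to each other.
column_store = {
    "name": [record[0] for record in records],
    "location": [record[1] for record in records],
    "country": [record[2] for record in records],
}


def dictionary_encode(column):
    """Replace repeated values with small integer codes (one simple compression scheme)."""
    dictionary = sorted(set(column))
    codes = [dictionary.index(value) for value in column]
    return dictionary, codes


# Column projection: a question about countries touches only the country column.
country_dictionary, country_codes = dictionary_encode(column_store["country"])
```

Here the country column shrinks to two distinct strings plus one small integer per record, and the name and location columns are never read at all.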
This technology enables large-scale data storage
The concept of the in-memory database is not new. The basics of the technology have been developed since the mid-1980s. For a long time, however, inadequate IT infrastructure hindered successful implementation. Developments such as data warehouses, 64-bit technology, and multi-core processors finally made practical implementations possible. Falling memory prices have also driven adoption.
- In-memory databases are usually part of a data warehouse. This database system collects and combines data from various sources, stores it for the long term, and prepares it for analysis.
- With 64-bit technology, main memory capacity can grow into the terabyte range. As a result, an in-memory database can hold far more data.
- With multi-core processors, several processor cores work on a single chip, resulting in faster data processing and higher data throughput. Data throughput is the net amount of data transferred per unit of time.
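The effect of address width on memory capacity, and the definition of throughput, can be checked with a little arithmetic. The sketch below is a back-of-the-envelope calculation only; the function names are hypothetical, and the 64-bit figure is a theoretical upper bound, since actual hardware and operating systems support considerably less.

```python
GIB = 2**30  # bytes in one gibibyte
TIB = 2**40  # bytes in one tebibyte


def addressable_bytes(address_bits):
    """Number of distinct byte addresses a flat address of this width can express."""
    return 2**address_bits


limit_32 = addressable_bytes(32) // GIB  # upper bound for 32-bit addressing, in GiB
limit_64 = addressable_bytes(64) // TIB  # theoretical bound for 64-bit addressing, in TiB


def throughput(net_bytes, seconds):
    """Net amount of data transferred per unit of time, in bytes per second."""
    return net_bytes / seconds
```

With 32-bit addresses the bound is 4 GiB, which is why main memory in the terabyte range only became addressable with 64-bit technology.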
What processes are involved in running an in-memory database?
Operating an in-memory database involves recurring, identical processes. The database goes through the following steps to process and safeguard data:
- Starting the database: When the database is started, the system loads the entire database from the hard disk into main memory. As a result, no data has to be reloaded during database operations.
- Reconciling changed data: When data changes, the database reconciles these changes periodically.
- Keeping transaction logs: Ongoing changes are recorded in so-called transaction logs. If an error occurs, the database can be restored to its state at the time of the error. This process is called "roll forward".
- Data processing: Data is processed according to the ACID principles (atomicity, consistency, isolation, durability), as in a traditional database. These principles describe the desired properties of operations in a database management system.
- Database replication: For backup purposes, this process continuously copies data from the database to another computer or server.
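Three of these steps (loading at startup, ACID-style processing, and copying the data back out) can be sketched with Python's built-in `sqlite3` module, which supports databases held entirely in memory. SQLite merely stands in here for a full in-memory DBMS; the `account` table, the `transfer` function, and the file name are hypothetical, and a single copy at the end replaces continuous replication.

```python
import os
import sqlite3
import tempfile

# Hypothetical on-disk database file, created only for this illustration.
disk_path = os.path.join(tempfile.mkdtemp(), "example.db")
disk = sqlite3.connect(disk_path)
disk.execute("CREATE TABLE account (id INTEGER PRIMARY KEY, balance INTEGER NOT NULL)")
disk.executemany("INSERT INTO account VALUES (?, ?)", [(1, 100), (2, 50)])
disk.commit()

# Startup: copy the whole database from disk into main memory.
memory = sqlite3.connect(":memory:")
disk.backup(memory)


def transfer(conn, source, target, amount):
    """Move an amount between accounts as one atomic transaction."""
    with conn:  # commits on success, rolls back if an exception is raised
        conn.execute("UPDATE account SET balance = balance - ? WHERE id = ?", (amount, source))
        (balance,) = conn.execute(
            "SELECT balance FROM account WHERE id = ?", (source,)
        ).fetchone()
        if balance < 0:
            raise ValueError("insufficient funds")
        conn.execute("UPDATE account SET balance = balance + ? WHERE id = ?", (amount, target))


# Data processing: the first transfer succeeds, the second is undone as a whole.
transfer(memory, 1, 2, 30)
try:
    transfer(memory, 1, 2, 500)
except ValueError:
    pass

# Replication (simplified to a single copy): write the in-memory state back to disk.
memory.backup(disk)
balances = disk.execute("SELECT id, balance FROM account ORDER BY id").fetchall()
```

The failed transfer leaves no trace in the data, which is the atomicity part of ACID; all reads and writes in between run against main memory only.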
Advantages and disadvantages of in-memory databases
As already mentioned, an in-memory database achieves much faster access by storing data in main memory. This advantage is at the same time its biggest weakness, because main memory cannot store data permanently. What other advantages and disadvantages are there?
In-memory database: advantages
The biggest advantage of an in-memory database is that access speeds are significantly higher because data is held in main memory. This leads to faster data analysis and correspondingly shorter query times. But speed is not the only thing that improves data analysis. An in-memory database allows structured and unstructured data from any system to be evaluated. Companies and software solutions face the challenge of storing and processing large amounts of unstructured data such as text, images, audio, and video files.
Unstructured data can be stored in an in-memory database by using a distributed infrastructure, in which several computing units (computers, processors, etc.) work in parallel on common tasks (parallelization) and the work is distributed across different server groups. This results in higher storage capacity and faster processing and transfer of unstructured data.
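The idea of splitting a common task across several computing units can be sketched as follows. This is a simplified illustration, assuming that worker threads stand in for separate servers and that the task is a plain sum; the `partition` and `partial_sum` names are hypothetical, and in CPython threads illustrate the structure rather than deliver a real speedup for this kind of computation.

```python
from concurrent.futures import ThreadPoolExecutor


def partition(values, parts):
    """Split the data into roughly equal chunks, one per computing unit."""
    size = -(-len(values) // parts)  # ceiling division
    return [values[i:i + size] for i in range(0, len(values), size)]


def partial_sum(chunk):
    """Work done independently by one computing unit."""
    return sum(chunk)


values = list(range(1, 1001))
chunks = partition(values, 4)

# Each chunk is processed in parallel; the partial results are combined at the end.
with ThreadPoolExecutor(max_workers=4) as pool:
    partial_results = list(pool.map(partial_sum, chunks))

total = sum(partial_results)
```

The same split, process, and combine pattern applies when the chunks live in the main memory of different machines instead of in one Python list.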
In-memory database: disadvantages
Using main memory provides higher access speeds, but it also has a major disadvantage: data is stored only temporarily. If the system fails, all data held in volatile memory is lost. The following methods have been developed to guard against data loss:
- Snapshot files: At certain times, such as at regular intervals or before shutdown, the current state of the database is saved. An important criticism of this method, however, is that all changes made after the latest snapshot are lost.
- Transaction log backup: Changes are recorded in the transaction log during ongoing operation, which serves as a backup method. In combination with regular snapshots, the last state can be reconstructed after a crash.
- Replication: Most in-memory databases can already keep an exact copy of the database on a traditional hard disk. In the event of a failure, this stored copy can be used.
- Non-volatile RAM: When combined with an energy storage device, RAM can retain data so that it can be retrieved even after the system restarts.
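How snapshots and a transaction log complement each other can be sketched with a toy key-value store. This is a minimal sketch under stated assumptions: the `TinyStore` class, its file names, and the JSON file format are hypothetical and chosen for readability, not taken from any real in-memory database.

```python
import json
import os
import tempfile


class TinyStore:
    """Hypothetical key-value store held in a dict, with a snapshot file and a log file."""

    def __init__(self, directory):
        self.snapshot_path = os.path.join(directory, "snapshot.json")
        self.log_path = os.path.join(directory, "transactions.log")
        self.data = {}

    def put(self, key, value):
        # Record the change in the transaction log first, then apply it in memory.
        with open(self.log_path, "a", encoding="utf-8") as log:
            log.write(json.dumps({"key": key, "value": value}) + "\n")
            log.flush()
            os.fsync(log.fileno())
        self.data[key] = value

    def snapshot(self):
        # Save the full current state, then start a fresh log.
        temporary_path = self.snapshot_path + ".tmp"
        with open(temporary_path, "w", encoding="utf-8") as snapshot_file:
            json.dump(self.data, snapshot_file)
        os.replace(temporary_path, self.snapshot_path)
        open(self.log_path, "w", encoding="utf-8").close()

    @classmethod
    def recover(cls, directory):
        # Load the latest snapshot, then roll forward through the logged changes.
        store = cls(directory)
        if os.path.exists(store.snapshot_path):
            with open(store.snapshot_path, encoding="utf-8") as snapshot_file:
                store.data = json.load(snapshot_file)
        if os.path.exists(store.log_path):
            with open(store.log_path, encoding="utf-8") as log:
                for line in log:
                    entry = json.loads(line)
                    store.data[entry["key"]] = entry["value"]
        return store


directory = tempfile.mkdtemp()
store = TinyStore(directory)
store.put("a", 1)
store.snapshot()
store.put("b", 2)  # made after the snapshot, so it exists only in the log
del store          # simulate a crash that wipes main memory

recovered = TinyStore.recover(directory)
```

After recovery, `recovered.data` again contains both entries. With the snapshot alone, the change to `"b"` would have been lost, which is exactly the criticism of snapshots noted in the list.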
Another disadvantage of using main memory: a single computer has only a limited amount of RAM available. Grid computing can be a solution to this limitation. In grid computing, many different computers are connected to one another. Taking part in such a network requires special software to be installed on each computer. Pooling unused capacity creates a high-performance virtual computer.
When does an in-memory database make sense for your company?
After weighing the pros and cons of in-memory databases and comparing them directly with traditional databases, you should consider which database management system (DBMS) is right for your organization. If you work with big data, the decision has already been made for you: only an in-memory database comes into question. But even in other cases, an in-memory database can be the right choice.
An in-memory database is the right DBMS for your data if:
- You have a large data collection
- You need fast and frequent access to your data
- Your database management system or server is currently overloaded
- The persistence of your data is not your top priority
- You can accept the possibility of losing your data
Examples of in-memory databases
The most popular in-memory databases include SAP HANA and Oracle TimesTen. For companies looking for enterprise software with a wide range of functions, solutions from SAP and Oracle are the most common. Both database management systems are designed for high performance. What distinguishes them, and how are they used in practice in a company?
SAP HANA (High Performance Analytic Appliance)
The in-memory database SAP HANA (High Performance Analytic Appliance) is a combination of hardware and software. The software was developed by SAP itself, while the hardware (servers) comes from 10 different manufacturers. Unlike other in-memory databases, SAP HANA keeps data in main memory not just temporarily but on a lasting basis, and secures it using transaction logs.
Processing transactions and analyses in one shared database makes it possible to process information in real time. SAP HANA can be used both on company servers and in the cloud, which reduces the burden on the company's IT infrastructure. In addition, the costs of previous data management methods are minimized, and decision makers receive current, accurate estimates.
Oracle TimesTen
Oracle's database has a lot in common with SAP's. Data is likewise processed in real time, and the database can be used both on a server and as a cloud service. Unlike with the SAP database, both the software and the hardware for Oracle TimesTen come from Oracle itself.
It is therefore a pure Oracle tool. The benefit for users is that they deal with a single vendor if something goes wrong, and the company does not depend on a variety of hardware and software companies. Oracle does not keep all the data it collects in memory: data that does not require high performance can be saved to hard disk or flash storage.
In-memory databases compared: SAP HANA and Oracle TimesTen
The functions of SAP HANA and Oracle TimesTen are mostly identical. As a result, both databases offer your company the same main benefits:
- Accelerated data processing
- Realignment of the company through innovative applications
- Increased agility in the form of flexibility, activity, and adaptability
Challenges for in-memory databases
As digitalization advances, data sets that are already very large will continue to grow. Developers of in-memory databases therefore need to keep developing existing systems. The following tasks must be handled:
- Collect data from a growing number of sources
- Further simplify IT structures while reducing response times and increasing analysis speed
- Get more insight from data analysis and help companies make decisions
- Develop applications that are even more focused on the challenges of digital transformation