Research on Embedded Real-time Database Technology

An embedded real-time database system (RTDBS) [1][2] is a database system that runs independently on an embedded device and processes large volumes of time-critical data under strict timing constraints. It must offer high reliability and high real-time performance. With high information throughput as its goal, the correctness of its data depends not only on the logical result but also on the time at which that result is produced. Figure 1 shows the basic architecture of an embedded application, in which the entire embedded RTDBS is built on a real-time operating system. Because an embedded real-time database system differs greatly from a common enterprise-level database management system in both operating environment and mode of operation, traditional enterprise databases such as Oracle and Sybase are hard to put to use in a real-time embedded environment, and in control systems with demanding real-time requirements they are even less adequate. With the emergence of various commercial embedded real-time operating systems, research on real-time database systems for the embedded environment has therefore become an important part of embedded software.

Figure 1 Basic structure of embedded applications

Embedded RTDBS Connotation and Its Architecture

In current embedded system development, the majority view is that the embedded RTDBS is essentially an "in-memory database": a memory buffer pool managed by the application program. Its role in the system is that of a shared data area used by multiple real-time tasks. Such a database is in fact an inseparable part of the application program, embedded in the user's application software. Its function is mainly the storage and retrieval of data. It is not independent, and it is not a true database system. A complete embedded real-time database system should contain, in addition to the in-memory database, a historical database, a database management system (DBMS), and the interface functions provided to the user. The DBMS carries out the specific configuration of the whole database and the various operations on it, such as adding or removing memory nodes in the database according to actual needs before the system runs.

Current embedded real-time database systems fall into two categories. The first is the commercial-grade embedded real-time database system, which is independent of any specific application software. An example is the eXtremeDB in-memory real-time database provided by McObject (USA), a real-time database written specifically for data management in embedded systems. It builds the database directly in memory and generates database APIs based on the characteristics of the application, so the user can conveniently call these interface functions to manage the entire database system. The second is the user-specific system, which users design and develop themselves for a specific application. Such a database is generally embedded in the application software as part of the application program and is not independent. Most of the real-time databases that users currently develop for measurement and control systems are of this kind.

Figure 2 Embedded RTDBS system structure

Figure 2 shows the architecture of an embedded real-time database system. Like a traditional database, it still has a three-level schema architecture: the user schema, the logical schema, and the storage schema. A real-time database system built for an embedded environment should provide an efficient data access mechanism, data security control, a real-time transaction management mechanism, and a database recovery mechanism. The design is mainly concerned with the real-time behavior, cost, size, performance, reliability, predictability, and low-level control capability of the system; that is, with how to design a reasonable data model and physical structure for the chosen real-time OS and embedded hardware platform. The focus is on how to use the limited resources of the embedded system efficiently, how to improve data access speed, and how to protect data, as well as on data exchange, optimization of query and transaction processing algorithms, transaction prioritization, transaction scheduling, and concurrency control.

Embedded RTDBS Data Model

The key to an embedded real-time database system is its data model, which determines how data is accessed and manipulated; the performance and reliability of the application also depend largely on it. Most database systems in the embedded environment adopt the relational model, which is also the data model of commercial database systems. This model stores data in two-dimensional relational tables and uses indexes to access and query it. It rests on a rigorous mathematical foundation, its structure is simple and flexible, and it offers good data independence. In an embedded environment, however, its memory overhead and data redundancy are large, so the user must optimize it, which makes the database system harder to develop.

Some embedded databases adopt the network (mesh) model instead. This model uses pointers to establish explicit links between data items. It saves the large amount of storage that the relational model spends on redundant data and index files, offers a degree of data independence and data sharing, and runs efficiently; because it avoids index operations, it needs less storage than the relational model and operates on data faster. Its structure is more complicated, however, and as the embedded system grows the database structure becomes very large, which may affect the real-time performance of the system. Figure 3 compares the system overhead of the relational model and the network model for the same number of records. The figure shows that the network model has lower overhead than the relational model because it avoids index operations. In practice, however, the data model should be chosen according to the overall performance of the real-time system. A hybrid structure, network plus relational or hierarchical plus relational, is commonly used so that each model makes up for the shortcomings of the other. One example is RDM (Raima Database Manager), the embedded real-time database from Centura Corporation, which combines the advantages of the network and relational models, avoids unnecessary indexing overhead, and significantly reduces storage space, I/O operations, and CPU cycles. Because it is fast and highly reliable, it is widely used in embedded products.
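The difference between the two models can be illustrated with a minimal Python sketch. It is not taken from any of the products mentioned above; the station and point records, the field names, and both helper functions are hypothetical. The relational variant reaches related rows through a separately maintained index, while the network (mesh) variant follows references stored in the owner record.

```python
from dataclasses import dataclass, field

# Relational style: two flat tables linked by a key, plus an index on that key.
stations = [{"station_id": 1, "name": "pump-house"}]
points = [
    {"point_id": 10, "station_id": 1, "value": 3.2},
    {"point_id": 11, "station_id": 1, "value": 7.5},
]
# The index costs extra memory and must be maintained on every insert.
points_by_station = {}
for row_no, row in enumerate(points):
    points_by_station.setdefault(row["station_id"], []).append(row_no)


def relational_points(station_id):
    """Find the points of a station through the index, then fetch the rows."""
    return [points[i]["value"] for i in points_by_station.get(station_id, [])]


# Network style: the owner record holds direct references to its members.
@dataclass
class Point:
    point_id: int
    value: float


@dataclass
class Station:
    name: str
    members: list = field(default_factory=list)  # explicit owner-to-member links


def network_points(station):
    """Follow the stored links; no index lookup is involved."""
    return [p.value for p in station.members]


pump_house = Station("pump-house", [Point(10, 3.2), Point(11, 7.5)])
```

Both functions return the same values for the same station. The network variant needs no index structure, but every relationship has to be wired into the records when they are created, which is the added structural complexity the text refers to.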

Figure 3 Comparison of relational and network (mesh) model overhead

Physical Structure of Embedded RTDBS

In embedded real-time systems, determinism is an important performance indicator: users must be able to determine how long data operations take and how much storage the database occupies. Traditional database storage management is based mainly on a disk storage structure, so data access requires frequent I/O operations. Because the duration of an I/O operation is uncertain, traditional database storage technology cannot be applied to embedded systems.

Taking access time, storage utilization, maintenance cost, and other factors into account, the storage structure of an embedded real-time system is usually divided into two levels. The first level is memory, which holds the in-memory database of the RTDBS. The performance requirements of the entire real-time database system rest on the in-memory database as the underlying support, which makes it the key to the real-time database system. It is used for program execution and real-time data processing, offers fast access, and requires no disk I/O, so it is the most suitable place for the data management and operations of real-time applications. The second level is external storage, usually some form of permanent storage device, which requires I/O operations to read and write and is used to store the historical data of the system. In this arrangement, data such as the engineering units of analog quantities, and data that is accessed relatively infrequently (such as data backups or log backups used only during recovery), is placed in external storage, while all real-time data, or the data of the current working set, resides in memory. This avoids operations on database files and greatly improves the performance of the real-time database system.

The memory level of this two-level structure can be allocated automatically by the embedded OS, or the user can specify the space to allocate. It generally consists of three shared-memory regions: the index area, the data area, and the system information area. A record is located by its table name, segment number, and offset within the segment.
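As an illustration of this addressing scheme, the following Python sketch models the three memory regions and locates a record by table name, segment number, and offset within the segment. The segment size, the record layout (an integer point id and a float value), and all class and method names are assumptions made for the example, not details from the systems discussed here.

```python
import struct

SEGMENT_SIZE = 64                 # bytes per segment (assumed)
RECORD = struct.Struct("<If")     # hypothetical record: point id, value


class MemoryDatabase:
    """Toy model of the index area, data area, and system information area."""

    def __init__(self):
        self.data_area = {}       # table name -> list of bytearray segments
        self.index_area = {}      # (table, point id) -> (segment number, offset)
        self.system_info = {"records": 0}

    def create_table(self, table, segments=1):
        self.data_area[table] = [bytearray(SEGMENT_SIZE) for _ in range(segments)]

    def write(self, table, segment_no, offset, point_id, value):
        if offset < 0 or offset + RECORD.size > SEGMENT_SIZE:
            raise ValueError("record does not fit in the segment")
        RECORD.pack_into(self.data_area[table][segment_no], offset, point_id, value)
        if (table, point_id) not in self.index_area:
            self.system_info["records"] += 1
        self.index_area[(table, point_id)] = (segment_no, offset)

    def read(self, table, segment_no, offset):
        """Address a record directly by table name, segment number, and offset."""
        return RECORD.unpack_from(self.data_area[table][segment_no], offset)

    def lookup(self, table, point_id):
        """Resolve a key to its address through the index area, then read it."""
        segment_no, offset = self.index_area[(table, point_id)]
        return self.read(table, segment_no, offset)
```

Because every record has a fixed size and a fixed address, the time to reach it does not depend on how many records the table holds, which is the kind of determinism the section asks for.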

Data in an embedded RTDBS is searched and updated very frequently, so a good index structure is needed to speed up the various operations and keep the data structure compact. Because all real-time data in an embedded RTDBS resides in memory, the system rarely performs disk I/O. Its index structure therefore concentrates on time and space overhead, and the guiding principles are fast lookup and economy of space. The commonly used database index structures are the sequential structure, the B-tree, and the AVL tree. A sequential structure can be stored in an array. It is easy to access but inconvenient to maintain dynamically, because insertions and deletions require moving large amounts of data. The B-tree is the most widely used; it performs well and is convenient to maintain dynamically, but the data occupancy of each node is only about 55% [3], so its storage efficiency is low. The AVL tree has high access performance, but each node requires two pointer fields and some additional control information, so its storage efficiency is not high either.

None of these is the best choice in an embedded real-time environment, so many improved index structures suited to embedded database systems have been built on them. One is the T*-tree index structure [4], designed to improve the performance of in-memory databases. It is an improved T-tree with higher space utilization than the AVL tree and the B-tree. Although its search time complexity is slightly higher than that of the AVL tree, operating in memory keeps its search time short enough to satisfy real-time requirements. The structure greatly reduces the movement of elements between nodes and the amount of rebalancing, strikes a good balance between time and space, and is well suited to embedded systems. In addition, an unbalanced B-tree index structure [5] has been proposed with the I/O performance of embedded systems in mind, mainly to reduce the number of writes to storage blocks. Each node of this index consists of several keys and pointer fields, and each pointer refers to the file record of the corresponding key. Because the unbalanced B-tree avoids the further splits that a B-tree performs to restore balance, it improves the write performance of the system.
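The idea behind the T-tree family can be sketched in a few lines of Python. The following is a simplified T-tree, not the T*-tree of [4]: it omits rebalancing and the successor pointers that the improved variants add, and the node capacity and function names are assumptions. Each node holds a sorted array of keys, so a search compares the key against the node's minimum and maximum to pick a subtree, and only then searches inside the node.

```python
from bisect import bisect_left

NODE_CAPACITY = 4  # keys per node (assumed; real systems tune this)


class TNode:
    def __init__(self, key):
        self.keys = [key]   # sorted keys held by this node
        self.left = None    # subtree with keys below self.keys[0]
        self.right = None   # subtree with keys above self.keys[-1]


def insert(node, key):
    """Insert key and return the subtree root. Rebalancing is omitted."""
    if node is None:
        return TNode(key)
    full = len(node.keys) >= NODE_CAPACITY
    if key < node.keys[0] and (node.left is not None or full):
        node.left = insert(node.left, key)
    elif key > node.keys[-1] and (node.right is not None or full):
        node.right = insert(node.right, key)
    else:
        # The node bounds the key, or it has room and no child on that side.
        pos = bisect_left(node.keys, key)
        if pos < len(node.keys) and node.keys[pos] == key:
            return node                       # duplicate key, nothing to do
        node.keys.insert(pos, key)
        if len(node.keys) > NODE_CAPACITY:
            # Overflow: push the smallest key down into the left subtree.
            smallest = node.keys.pop(0)
            node.left = insert(node.left, smallest)
    return node


def search(node, key):
    """Return True if key is stored in the tree."""
    while node is not None:
        if key < node.keys[0]:
            node = node.left
        elif key > node.keys[-1]:
            node = node.right
        else:
            pos = bisect_left(node.keys, key)  # binary search inside the node
            return node.keys[pos] == key
    return False
```

With several keys per node, the two child pointers are amortized over the whole key array, which is where the space advantage over an AVL tree comes from. An insertion usually touches a single node, and an overflow moves only one key rather than splitting the node as a B-tree does.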

Embedded RTDBS Management System

The embedded RTDBS management system is a layer of software between the user and the real-time operating system. It consists of many program modules, and its role is to organize, manage, and access the shared data in the database effectively. Its structure is shown in Figure 4. Among its modules, the storage space management module, the security and integrity control module, the transaction concurrency control module, the real-time data dump module, and the operation log management module pose problems that need particular attention when a real-time database system is developed for an embedded environment.

(1) Storage space management module. Because the embedded real-time database system uses in-memory database technology, it necessarily involves the memory management of the embedded operating system. The user must therefore understand the system's memory allocation mechanism and design a memory management program accordingly. When the system starts, the module requests a memory buffer from the real-time OS and uses it as the shared-memory data area. It then loads the initialization data from the historical database into this area to initialize the blank memory. Memory can be requested statically or dynamically. Static allocation is simple to implement and does not require a complicated index structure, but it is inflexible: the required memory must be known and allocated in advance, during the design phase. Dynamic allocation is flexible and can add data nodes as needed, but an appropriate index structure must be built to keep data retrieval fast. The module should be designed for the specific real-time OS.

(2) Data security and integrity control module. Data security must be considered in the design of a real-time database. Security refers on the one hand to the legitimacy of a user's access to the data and on the other to the security of the system itself. Integrity means that the user's operations on real-time or historical data must conform to certain semantics; it can be enforced through integrity constraints.

(3) Transaction concurrency control module. A real-time database is a shared resource that multiple tasks may use together. If concurrent transactions are not controlled, data may be read or stored incorrectly and the consistency of the data destroyed, so a real-time database system must implement a good concurrency control mechanism. Traditional databases generally use locking, which is similar to the semaphores of a real-time operating system, and the lock granularity is chosen according to the specific application. In a traditional database the cost of acquiring a lock is relatively small, so a fine-grained locking unit is usually used to increase parallelism. In a real-time database system, however, the cost of acquiring a lock for a transaction is about the same as the cost of processing the data, so a lock granularity that is too fine reduces system performance. A real-time database therefore usually takes the relational table as its locking unit (for example, the analog-quantity table is one locking unit). This simplifies the concurrency control mechanism, reduces system overhead, and improves overall transaction processing performance.

(4) Real-time data dump module. This module stores real-time data as historical data. The module usually saves the historical data in a memory buffer first and writes the buffer to disk in a single operation once it is full. When historical data is read, the buffer is searched first, and the disk is accessed only if the data is not found there. This reduces the number of disk I/O operations. In addition, only changed data is stored, which saves external storage space without affecting system performance.

(5) Operation log management module. Log files play a very important role in database recovery and can be used to recover from both transaction failures and system failures. The log buffer stores records of database operations. A traditional database log record includes the record name, the old value of the record before the update, the new value after the update, the transaction identifier, and the operation type. In an embedded real-time database system, the old and new record values are left out of the log record to reduce system overhead. Log records are written only to the buffer, and when the buffer is full its contents are written to the log file on disk.
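The behavior of the real-time data dump module described above can be sketched as follows. This is a minimal Python illustration, not the module of any particular system: a plain list stands in for the history file on disk, and the class name, method names, and record layout are assumptions. It shows the three points made in the text: only changed values are stored, the buffer is written out once when it fills, and reads consult the buffer before the disk.

```python
class HistoryDump:
    """Buffers changed values in memory and writes them out in batches."""

    def __init__(self, capacity, disk):
        self.capacity = capacity    # number of records the buffer can hold
        self.buffer = []            # in-memory history buffer
        self.disk = disk            # list standing in for the history file
        self.last_value = {}        # most recent value stored per point
        self.disk_writes = 0        # counts batch writes, for illustration

    def record(self, point, timestamp, value):
        """Store a sample only if the value changed. Return True if stored."""
        if self.last_value.get(point) == value:
            return False
        self.last_value[point] = value
        self.buffer.append((point, timestamp, value))
        if len(self.buffer) >= self.capacity:
            self.flush()
        return True

    def flush(self):
        """Write the whole buffer to disk in a single operation."""
        if self.buffer:
            self.disk.extend(self.buffer)
            self.disk_writes += 1
            self.buffer.clear()

    def read(self, point):
        """Look in the buffer first; go to disk only if nothing is found."""
        hits = [r for r in self.buffer if r[0] == point]
        if hits:
            return hits
        return [r for r in self.disk if r[0] == point]
```

With a capacity of three, four samples of which one repeats the previous value produce exactly one disk write. The same buffering pattern applies to the log buffer of the operation log management module.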

Figure 4 Embedded RTDBS module structure
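The table-level locking used by the transaction concurrency control module can be illustrated with a short Python sketch. It assumes one lock per relational table, as the text describes; the table names, the lock manager class, and the increment workload are hypothetical, and Python threads stand in for real-time tasks.

```python
import threading


class TableLockManager:
    """One lock per table: the table is the unit of locking."""

    def __init__(self, table_names):
        self._locks = {name: threading.Lock() for name in table_names}

    def run(self, table, operation):
        """Run operation while holding the lock of the given table."""
        with self._locks[table]:
            return operation()


# Hypothetical in-memory tables shared by several tasks.
tables = {"analog": {"T1": 0}, "digital": {"S1": 0}}
locks = TableLockManager(tables)


def increment(table, key, times):
    """A task that updates one record repeatedly under the table lock."""
    def bump():
        tables[table][key] += 1

    for _ in range(times):
        locks.run(table, bump)
```

A task that works on the digital table is never blocked by updates to the analog table, while two tasks on the same table are serialized. Only one lock is acquired per operation, which keeps the locking cost small relative to the data processing itself.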

Embedded RTDBS Design Example

We have developed a real-time database system for a measurement and control system, based on the embedded operating system VxWorks. The hardware platform is a PC104 CPU board of the Intel486 series, and users can add or delete data nodes through the display interface on this platform. The structure of the entire application system is shown in Figure 5. The system is divided into two parts: the in-memory database and the historical database.

The in-memory database is the part of the system in which data is updated in real time. During initialization the application statically creates a large buffer pool to hold the various types of data nodes, and a free list manages the free cells in the pool. When the program adds a new node, it takes a free cell from the free list for the user and removes that cell from the list. When the buffer pool runs short, a new buffer pool is requested dynamically from system memory through memLib, the memory partition management library provided in VxWorks. This combination of static and dynamic allocation can overcome the problem of memory fragmentation, and it also avoids the problem that a purely static allocation makes the in-memory database either too large or too small.

For the index structure, considering how frequently data is retrieved and updated in a real-time system, we chose the L-tree in light of the system's performance requirements. The L-tree combines features of the B+ tree, the T-tree, and the AVL tree. Each node of the tree can contain multiple keys, and space utilization is high, so it supports in-memory databases well.

Real-time transactions are controlled by priority-driven scheduling, with each transaction's deadline mapped to its priority. The simplest strategy, first come, first served (FCFS), is unsuitable for real-time systems because it ignores timing constraints, so we use an earliest-deadline-first strategy.

For database management, the system provides the traditional functions of adding, deleting, modifying, and displaying data. A user must log in with the corresponding permission before operating on the database. The display system was developed with Zinc, the graphics software package in VxWorks. The entire system runs stably and meets the data management requirements of the real-time system.
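The buffer pool of the in-memory database, with its free list and on-demand extension, can be sketched as follows. This Python model is an illustration under stated assumptions, not the authors' implementation: growing a Python list stands in for requesting an additional pool from system memory, and the class, its methods, and the node names are hypothetical.

```python
_FREE = object()  # marker for an unused cell


class NodePool:
    """Statically sized pool of cells with a free list; grows when exhausted."""

    def __init__(self, initial_cells, grow_by):
        self.cells = [_FREE] * initial_cells
        # The free list is a stack of free cell indices; the lowest index is on top.
        self.free_list = list(range(initial_cells - 1, -1, -1))
        self.grow_by = grow_by
        self.extensions = 0         # how many extra pools were requested

    def allocate(self, data):
        """Take a cell off the free list and store data in it."""
        if not self.free_list:
            self._extend()
        index = self.free_list.pop()
        self.cells[index] = data
        return index

    def release(self, index):
        """Return a cell to the free list."""
        if self.cells[index] is _FREE:
            raise ValueError("cell is already free")
        self.cells[index] = _FREE
        self.free_list.append(index)

    def _extend(self):
        """Stand-in for requesting one more pool from system memory."""
        start = len(self.cells)
        self.cells.extend([_FREE] * self.grow_by)
        self.free_list.extend(range(start + self.grow_by - 1, start - 1, -1))
        self.extensions += 1
```

Because all cells have the same size and released cells go straight back onto the free list, allocation and release take constant time and leave no unusable gaps between cells. The pool grows only in whole blocks of `grow_by` cells.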

Figure 5 Embedded real-time database application diagram
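The effect of mapping deadlines to priorities in the design example can be shown with a small Python sketch that compares first come, first served with an earliest-deadline ordering. It assumes non-preemptive execution and that all transactions are ready at time zero; the transaction names, run times, and deadlines are made up for illustration.

```python
import heapq


def run(order):
    """Run transactions back to back; return the names that miss their deadline."""
    clock, missed = 0, []
    for name, run_time, deadline in order:
        clock += run_time
        if clock > deadline:
            missed.append(name)
    return missed


def earliest_deadline_first(transactions):
    """Order transactions by deadline, using the deadline as the priority."""
    heap = [
        (deadline, seq, name, run_time)
        for seq, (name, run_time, deadline) in enumerate(transactions)
    ]
    heapq.heapify(heap)
    order = []
    while heap:
        deadline, _, name, run_time = heapq.heappop(heap)
        order.append((name, run_time, deadline))
    return order


# (name, run time, deadline), listed in arrival order.
pending = [("log", 5, 20), ("alarm", 2, 3), ("sample", 3, 8)]
```

In arrival order the long "log" transaction runs first, so both "alarm" and "sample" finish after their deadlines. Ordered by deadline, all three finish in time, although the total work is the same.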

Conclusion

Many problems remain to be solved in the development of real-time databases for the embedded environment, and their performance in practical applications needs further improvement. The key issues are the storage structure of the real-time system, data access speed, priority scheduling of real-time transactions, and fault recovery. These are the keys to improving real-time database performance and the focus of research on real-time database systems.

References

1. Olson M A. Selecting and Implementing an Embedded Database System. IEEE Computer, 2000, 33(9): 27-34
2. Stankovic J, et al. Misconceptions about Real-Time Databases. IEEE Computer, 1999, 32(6): 29-36
3. Lehman T J, Carey M J. A Study of Index Structures for Main Memory Database Management Systems. Proc. of the 12th International Conference on Very Large Data Bases, 1986: 294-303
4. Lu Yansheng, Pan Yi, et al. Data organization of an in-memory database management system. Journal of Huazhong University of Science and Technology, 1999, 27(10): 64-66
5. Yang Jincai, Liu Yunsheng, et al. Storage management of embedded real-time database systems. Small and Microcomputer Systems, 2002, 24(1): 42-45
