Written before: This section discusses transactions, which are one of the core technologies of DBMS. In the history of computer science, three scientists have won the ACM Turing Award for their achievements in the field of databases, and one of them, Jim Gray ( (who once worked at Microsoft) won this honor because of his achievements in transaction processing. It is precisely because of him that the OLTP system became popular from then on until today. There is a lot involved in transaction processing technology, and you can easily write a book. Here I will only discuss some principles of SQLite transaction implementation. Compared with large-scale general-purpose DBMS, SQLite’s transaction implementation is relatively simple. These contents may be more theoretical, but they are not difficult and are also the basis for understanding other contents. Good Okay, let’s start the second section—business.
2. Transaction
2.1. Transaction Lifecycles
There are two things worth noting between programs and transactions:
(1) Which objects Running under a transaction – this is directly related to the API.
(2) The life cycle of the transaction, that is, when it starts, when it ends and when it starts to affect other connections (this is very important for concurrency) – this involves the specific implementation of SQLite.
A connection can contain multiple statements, and each connection has a B-tree and a pager associated with the database. Pager plays an important role in connections because it manages transactions, locks, memory caching, and is responsible for crash recovery. When you perform database write operations, the most important thing to remember is that only one connection is executed under one transaction at any time. These answer the first question.
Generally speaking, the life of a transaction is similar to that of a statement, and you can also end it manually. By default, transactions are committed automatically, but you can also commit them manually via BEGIN..COMMIT. Next is the issue of locks.
2.2. Lock States
Locks are very important to achieve concurrent access, and for large general-purpose DBMS, lock states The implementation is also very complex, while SQLite is relatively simple. Normally, its duration is consistent with the transaction. When a transaction starts, it will first lock, and when the transaction ends, the lock will be released. But if the system crashes without the transaction ending, then the next connection to the database will handle this situation.
There are 5 different states of locks in SQLite, and the connection is in one of them at any time. The following figure shows the corresponding states and the life cycle of the lock.
About this picture, there are the following A few points worth noting:
(1) A transaction can start in three states: UNLOCKED, RESERVED or EXCLUSIVE. Starts on UNLOCKED by default.
(2) UNLOCKED, PENDING, SHARED and RESERVED in the white box can exist at the same time in a database.
(3) Starting from the gray PENDING, things become strict, which means that the transaction wants to obtain an exclusive lock (EXCLUSIVE) (note the difference from the white box).
Although locks have so many states, physically speaking, there are only two situations: read transactions and write transactions.
2.3. Read Transactions
Let’s first take a look at the state change process of the lock when the SELECT statement is executed. It is very simple: A connection executes a select statement, triggering a transaction from UNLOCKED to SHARED. When the transaction COMMITs, it returns to UNLOCKED. It’s that simple.
Consider the following example (pseudocode used here for simplicity):
db = open(‘foods.db’)
db.exec(‘BEGIN’)
db. exec(‘SELECT * FROM episodes’)
db.exec(‘SELECT * FROM episodes’)
db.exec(‘COMMIT’)
db.close()
Due to the explicit use of BEGIN and COMMIT, the two SELECT commands are executed under one transaction. When the first exec() is executed, the connection is in SHARED, and then the second exec() is executed. When the transaction is submitted, the connection returns from SHARED to the UNLOCKED state, as follows:
UNLOCKED→PENDING→SHARED→UNLOCKED
If there are no BEGIN and COMMIT lines, it is as follows:
UNLOCKED→PENDING→SHARED→UNLOCKED→PENDING→ SHARED→UNLOCKED
2.4. Write transaction (Write Transactions)
Let’s consider writing a database, such as UPDATE. Like the read transaction, it will also go through UNLOCKED→PENDING→SHARED, but then it will be gray PENDING,
2.4.1, The Reserved States
When a connection writes data to the database, it changes from the SHARED state to the RESERVED state. If it obtains the RESERVED lock, it means that it is ready for write operations. Even if it does not write the modifications to the database, it can save the modifications to the page cache (page cache).
When a connection enters the RESERVED state, the pager begins to initialize the rollback journal. In the RESERVED state, the pager manages three types of pages:
(1) Modified pages: Contains records modified by the B-tree and is located in the page cache.
(2) Unmodified pages: Contains records that have not been modified by B-tree.
(3) Journal pages: This is the version before the page was modified. These are not stored in the page cache, but are written to the log before the B-tree modifies the page.
Page cache is very important, precisely because of its existence, a connection in the RESERVED state can really start working without interfering with other (read) connections. Therefore, SQLite can efficiently handle multiple read connections and one write connection at the same time.
2.4.2, The Pending States
When a connection completes the modification, it actually starts to submit the transaction, and the pager executing the process enters EXCLUSIVE state. From the RESERVED state, the pager tries to obtain the PENDING lock. Once obtained, it exclusively holds it and does not allow any other connection to obtain the PENDING lock (PENDING is a gateway lock). Since the write operation holds the PENDING lock, no other connection can enter the SHARED state from the UNLOCKED state, that is, no connection can enter the data (no new readers, no new writers). Only those connections that are already in the SHARED state can continue to work. The Writer in the PENDING state will wait until all these connections release their locks, then add an EXCUSIVE lock to the database, enter the EXCLUSIVE state, and monopolize the database (at this point, the locking mechanism of SQLite should be relatively clear).
2.4.3, The Exclusive State
In the EXCLUSIVE state, the main job is to remove the modified page from the page cache Write to the database file, this is where the actual writing occurs.
Before the pager writes modified pages, it must do one thing first: write a log. It checks whether all logs have been written to disk, and these are usually located in the operation buffer, so the pager has to tell the OS to write all files to disk, which is done synchronously by the program (implemented by calling the corresponding API of the OS) Completed.
Logs are the only method for database recovery, so logs are very important to DBMS. If the log pages are not completely written to disk and a crash occurs, the database cannot be restored to its original state and the database is left in an inconsistent state. After the log writing is completed, the pager writes all modified pages to the database file. The next step depends on the transaction submission mode. If it is automatically submitted, then the pager clears the log, page cache, and then enters UNLOCKED from EXCLUSIVE. If submitted manually, the pager continues to hold the EXCLUSIVE lock and save the log until COMMIT or ROLLBACK.
In short, from a performance perspective, the time the process holds the exclusive lock should be as short as possible, so the DBMS usually only holds the exclusive lock when the file is actually written, which can greatly improve concurrency performance.
from http://www.cnblogs.com/hustcat /archive/2009/02/14/1390731.html