SQLite is a relational embedded database engine created by D. Richard Hipp in 2000. It is the most widely deployed database engine, used by all Android devices and most web browsers.[1] As opposed to a client-server Database Management System (DBMS), SQLite is embedded into an application.[2] The database is a file that clients read and write to perform SQL operations. Its format is backward compatible and has long-term support.[3]
Atomicity, Consistency, Isolation, and Durability (ACID) are the properties that drive transactional database design. The robustness of a database system is measured by its ability to maintain these properties in the face of adverse conditions. Additionally, SQLite has to synchronize access to the same database file with multiple concurrent clients. You will not be able to use it effectively without some knowledge of its inner workings or without a little bit of configuration.
If you feel confused when trying to configure SQLite, do not know what Write-Ahead Logging is, or do not know how to handle `SQLITE_BUSY` errors, this article is for you. You will learn how SQLite ensures the ACID properties, how to use it effectively with concurrent clients, and how to configure it for better performance. It assumes you are familiar with the basics of Linux and SQL.
§Introduction
SQLite is available for most Linux distributions. You can install it with your favorite package manager:
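For example, on Debian- or Fedora-based distributions (the package names below are common defaults, but they vary between distributions):

```shell
# Debian/Ubuntu:
#   sudo apt install sqlite3 libsqlite3-dev
# Fedora:
#   sudo dnf install sqlite sqlite-devel
# Verify the installation:
sqlite3 --version
```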
It installs the shared library, the header files, and the interactive terminal client.
§Connect to a database
You can interact with an SQLite database from the command line with the built-in Command-Line Interface (CLI), or from an application through SQLite bindings.
§CLI
Running `sqlite3` gives you an interactive prompt. By default, the client connects to a transient in-memory database. You can open or create an actual database file with `sqlite3 FILENAME`.
This CLI accepts special commands such as:

- `.help`: list all the commands.
- `.dump [table_name]`: dump the database content.
- `.tables`: list the database tables.
- `.schema`: dump the database schema.
- `.quit`: exit the program (same as `CTRL+D`).
You can also execute regular SQL statements (terminated by `;`):
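For example, the following sketch creates a table, inserts a row, and queries it back (the `users` table and the `/tmp/example.db` path are illustrative):

```shell
rm -f /tmp/example.db   # start from a fresh database
sqlite3 /tmp/example.db "CREATE TABLE users (name TEXT);"
sqlite3 /tmp/example.db "INSERT INTO users (name) VALUES ('alice');"
sqlite3 /tmp/example.db "SELECT name FROM users;"
# → alice
```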
SQLite adds the `PRAGMA` statement to configure the database. For instance, you can set `busy_timeout` to 5 seconds as follows:
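For example, from the CLI (the database path is a placeholder; the pragma echoes the new value back, in milliseconds):

```shell
sqlite3 /tmp/example.db "PRAGMA busy_timeout = 5000;"
# → 5000
```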
With some exceptions, you need to set these pragmas on each connection and verify that the returned value matches your request.
§Bindings
The same SQL statements can be sent through a connection from an application using SQLite bindings. Some libraries, such as Python's `sqlite3` module, provide a high-level interface that hides some features such as transactions. In contrast, `rusqlite` provides an explicit, safe, and "low-level" API.
The rest of this article uses the CLI to stay language agnostic.
§In-memory databases
One of SQLite's strengths is that you can test your application with an in-memory database, without having to install and configure a full DBMS, or even touch the filesystem. To connect to an in-memory database, use the special file name `:memory:`.[4] Note that even within a single process, multiple connections to `:memory:` do not share the same database unless you use a shared cache.
§Shared cache
Due in part to its heritage and backward compatibility, SQLite configuration is slightly intricate. One example is the shared cache. The documentation explains that this feature "can significantly reduce the quantity of memory and IO required by the system."[5]
Richard Hipp mentions on the SQLite forum that "the shared-cache feature was invented many years ago, for use on SymbianOS phones. [...] But these days, WAL-mode is much preferred, and shared-cache is considered a mistake and a misfeature."[6]
However, there is one situation where the shared cache is necessary: tests involving multiple connections to an in-memory database in a single process. You have to use the shared cache if you want them to access the same underlying database. This is an example of interaction between orthogonal features.
You can enable the shared cache with the `cache=shared` connection parameter:
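For example, with a URI filename (a sketch of the syntax; the sharing only takes effect between connections within the same process):

```shell
sqlite3 "file::memory:?cache=shared" "SELECT 1;"
# → 1
```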
This mode allows for multiple in-memory databases inside an application by specifying a file name with the parameter `mode=memory`:
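For example, with `memdb1` as an arbitrary database name:

```shell
sqlite3 "file:memdb1?mode=memory&cache=shared" "CREATE TABLE t (a); SELECT count(*) FROM t;"
# → 0
```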
The rest of this article deals with concrete databases (on the filesystem), without the need for a shared cache.
§Choose a journal mode
Atomicity guarantees that transactions form a unit of operations that can either succeed or fail as a whole. Another client cannot see the individual statements that compose a transaction. This property implies resilience through recoverability. In the face of errors, crashes, or power losses, the database must be able to roll back partial changes. To this end, SQLite implements two algorithms with different characteristics.
§Rollback journal
In the default rollback journal mode, changes are written directly to the database. To ensure atomicity, SQLite has to synchronize access to it and implement a way to roll back changes.[7]
At the lowest level, SQLite interacts with the filesystem layer. Locks synchronize access to the database file, making sure no client can access the database while another is writing to it.[8] This concurrency control algorithm is called two-phase locking.
Before making any change, SQLite backs up the pages that need to be modified to a rollback journal.[9] If a crash happens halfway through a write transaction, SQLite can roll back the partial changes thanks to these backups.[10]
The default value for `journal_mode` is `DELETE`. After all the changes have been written to the database file on the disk, the backup pages are no longer needed: SQLite atomically deletes the rollback journal, thereby committing the transaction.[11]
The limitation of this mode is that all the clients read from and write to the same database file. As a consequence, writers take an exclusive lock that prevents readers and other writers from accessing it while it is being modified.[12]
§Write-Ahead Logging
The Write-Ahead Logging (WAL) journal mode takes a different approach: instead of saving unmodified pages to a rollback journal, it writes the modifications to a separate WAL file (named with the suffix `-wal`). The benefit of a distinct file for writes is that clients can read from the database while a write transaction is in progress, but writers still have to take turns to append to the journal.[13]
A transaction is atomically committed when a `COMMIT` is appended to the WAL file. Readers check it for committed transactions in addition to the main database file. Scanning the WAL file for each query would be very inefficient. Instead, readers rely on a shared index in a memory-mapped file (named with the suffix `-shm`) to quickly look up committed changes. This index requires shared-memory support, which makes network filesystems incompatible with WAL.[14]
If the last client using the database does not exit gracefully, the WAL file may remain on the disk (normally, it is cleaned up when the last connection closes). You have to keep this file alongside the main database. Otherwise, you lose all the changes in the WAL file that have not been transferred to the database.
To enable WAL, set the `journal_mode` pragma:
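For example (the pragma returns the new journal mode, so you can verify it; the path is a placeholder):

```shell
sqlite3 /tmp/example.db "PRAGMA journal_mode = WAL;"
# → wal
```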
You only need to do it once, as this setting is saved to the database file, but there is no harm in doing it each time you open a connection to ensure it is enabled.[15]
Besides commits and rollbacks, WAL has a third core operation explained in § Checkpointing.
§Adjust the synchronization level
Durability guarantees that the effects of a committed transaction remain persistent in non-volatile memory. On Linux, writing to a file does not actually write to the disk. Instead, the changes are written to in-memory write buffers managed by the kernel, which are automatically flushed in the background. To make these changes durable, SQLite has to actively flush these buffers using the `fsync(2)` system call.[16]
The `synchronous` pragma controls how often SQLite calls `fsync`. Note that for each value, the guarantees depend on the `journal_mode`. Generally, higher levels reduce the chances that power failures or system crashes may result in database corruption or revert committed transactions. Lower levels increase database performance at the cost of durability (and integrity below a certain level).
In the `DELETE` journal mode, the default `FULL` synchronous setting is appropriate, as there is a slight chance of database corruption in `NORMAL` mode on older systems. With WAL, you can safely decrease this level to `NORMAL`, but you are trading some durability for performance by skipping the `fsync` operation after committing a transaction. On a system crash, you can lose the entire WAL file if the changes are still in volatile memory. This will not corrupt the database, but may revert some transactions.
Nevertheless, the `NORMAL` synchronous level is appropriate for most use cases. You can enable it with the following command:
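Setting `synchronous` returns nothing, so query it back to verify the change (1 corresponds to `NORMAL`; the path is a placeholder):

```shell
sqlite3 /tmp/example.db "PRAGMA synchronous = NORMAL; PRAGMA synchronous;"
# → 1
```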
With Linux, the actual number of lost changes depends on the virtual memory subsystem configuration. The `dirty_writeback_centisecs` kernel tunable defines the interval at which Linux wakes up to flush dirty buffers (5 seconds by default). During this operation, buffers are selected based on other conditions. For example, `dirty_expire_centisecs` defines an expiration time and `dirty_background_ratio` defines a memory occupation threshold.
Note that during a checkpoint, SQLite also calls `fsync`, which provides a way to initiate a manual synchronization.
§Handle database locking
With concurrent clients, if one of them holds an exclusive lock on the database, or two concurrent transactions are not serializable, then SQLite can return `SQLITE_BUSY` errors. In WAL mode, these conditions arise in five main scenarios, detailed in the following subsections.
§Exclusive mode
A single client with exclusive access to the database blocks other clients: they receive an `SQLITE_BUSY` error.[17]
1. The database is in exclusive locking mode with `PRAGMA locking_mode = EXCLUSIVE`.
In `EXCLUSIVE` locking mode, the database only accepts a single connection. You need to disable this mode or stop the offending client. (As an example, web browsers open their databases in this mode, which means you have to close them before being able to access their databases.)
§Exclusive operations
Some internal operations need exclusive access to the database, but only for a short amount of time, during which they block other readers and writers:
2. A cleanup operation is running with an exclusive lock when closing a connection.
3. A cleanup operation is running with an exclusive lock on the first connection after a crash of the last connection to the database.
In rollback journal mode, a writer takes an exclusive lock that blocks readers and other writers. In WAL mode, writers do not block readers, but there can be only one writer at any given time.
4. A write transaction is started while another writer is active. (With a rollback journal, read transactions are also blocked, which makes this situation more common.)
The solution is to set a busy handler. SQLite provides a busy timeout handler that waits up to the specified amount of time before returning `SQLITE_BUSY`. To define its timeout, set the `busy_timeout` pragma, e.g. `PRAGMA busy_timeout = 5000` to wait up to 5 seconds.
Note that scenarios 2 and 3 are transient, but a writer in scenario 4 can lock the database for an indefinite amount of time. You may need to:
- Increase the busy timeout to ensure no write transaction blocks for longer.
- Wrap the write transactions in a finite retry loop with an exponential backoff before returning the error.
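A minimal sketch of such a retry loop in the shell (the schema and `/tmp/retry.db` path are placeholders; a real application would wrap its transaction logic this way):

```shell
rm -f /tmp/retry.db
sqlite3 /tmp/retry.db "CREATE TABLE t (a);"
# Retry the write a bounded number of times, backing off exponentially:
for delay in 0.1 0.2 0.4 0.8; do
    if sqlite3 /tmp/retry.db "INSERT INTO t VALUES (1);"; then
        break              # the write went through
    fi
    sleep "$delay"         # back off before retrying
done
sqlite3 /tmp/retry.db "SELECT count(*) FROM t;"
# → 1
```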
§Transaction isolation
Isolation ensures that concurrent transactions do not conflict with each
other. The isolation property, raised to the SERIALIZABLE
level, checks that
any transaction schedule is equivalent to the same transactions executing
serially (one after the other). If not, SQLite immediately returns
SQLITE_BUSY
.[18]
A transaction started with `BEGIN` is considered a read transaction (or `DEFERRED`) until a write statement is executed. If between a read and a write statement another writer commits changes to the database, any data read prior to this write may be invalidated. In this situation, SQLite returns the error immediately, without invoking the busy timeout handler.[19]
With a rollback journal, this condition causes a deadlock, solved by aborting the conflicting transaction. With WAL, a read transaction acquires a view of the database that corresponds to an offset in the WAL file. If, during a read transaction started in `DEFERRED` mode, another write transaction commits changes to the journal, SQLite prevents the read transaction from upgrading to a write transaction, as its view of the database (corresponding to the WAL offset) is now outdated. Therefore, the write query in the invalidated transaction returns `SQLITE_BUSY`, and you have to retry the whole transaction.[20]
5. A read transaction upgrades to a write transaction after a write invalidated prior reads. (The extended error in this case is `SQLITE_BUSY_SNAPSHOT`, because each transaction reads from its own "snapshot" of the database that may become outdated.)
Note that SQLite ensures serializability by enforcing serial writes (at most one writer at any given time). It does not detect conflicts at table or row granularity, but for the whole database.[21]

In the following schedule, SQLite returns `SQLITE_BUSY` for the 4th statement because transaction 1 upgrades to a write transaction, even though transactions 1 and 2 do not actually conflict (as they operate on distinct tables):
| | Transaction 1 | Transaction 2 |
|---|---|---|
| 1 | `BEGIN` | |
| 2 | `SELECT value FROM foo WHERE id = ?` | |
| 3 | | `UPDATE bar SET value = ?` |
| 4 | `UPDATE foo SET value = ?` | |
| 5 | `COMMIT` | |
Encountering this error in the middle of a transaction is inconvenient. As a solution, always start write transactions with `BEGIN IMMEDIATE`. This way, you signal SQLite that this transaction must block other writers. When issuing this statement, you may receive an `SQLITE_BUSY` error as in scenario 4, indicating that a write is in progress. The busy callback handles this situation. After the transaction begins, you cannot receive `SQLITE_BUSY` again (as the writer now has exclusive access to the database until a `COMMIT` or a `ROLLBACK`).[22]
§Optimize query execution
SQL is a declarative language that leaves query execution to the implementation. SQLite's query optimizer tries to make good choices when executing a query, but you can influence how it performs.
§Analyze the query plan
A relational database has tables, filled with rows, containing values organized in columns. The job of the query planner is to collect data from the database in the most efficient way. The key is to use indexes to quickly find values.[23]
To illustrate how indexes improve query performance, start by creating a table `t1` with a single column `a`. You can ask SQLite how it executes a query using `EXPLAIN QUERY PLAN`. Selecting rows by value requires a linear scan of the table:
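For example (the exact output format varies between SQLite versions; the path is a placeholder):

```shell
rm -f /tmp/plan.db
sqlite3 /tmp/plan.db "CREATE TABLE t1 (a);"
sqlite3 /tmp/plan.db "EXPLAIN QUERY PLAN SELECT * FROM t1 WHERE a = 1;"
# The plan reports a full table scan, e.g. "SCAN t1".
```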
A scan has an O(n) time complexity, where n is the number of rows.
An index on `t1 (a)` stores its values in a B-tree. Every table has an index for the `rowid` column, but you can create additional indexes on any column (or group of columns) with `CREATE INDEX`. With such an index, SQLite can find rows more efficiently:
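For example (the index name `t1_a` is arbitrary, and the output format varies between versions):

```shell
rm -f /tmp/plan2.db
sqlite3 /tmp/plan2.db "CREATE TABLE t1 (a); CREATE INDEX t1_a ON t1 (a);"
sqlite3 /tmp/plan2.db "EXPLAIN QUERY PLAN SELECT * FROM t1 WHERE a = 1;"
# The plan now reports a search using the index instead of a scan,
# e.g. "SEARCH t1 USING COVERING INDEX t1_a (a=?)".
```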
The search operation in a B-tree has an O(log n) time complexity. Indexes improve query performance at the cost of the space required to store the B-tree.
For some queries, SQLite automatically creates a transient index if the query optimizer determines it is more efficient than scanning a table.[24] In the previous example, creating an automatic index on `t1 (a)` would require scanning the whole table anyway, but in the following example, creating an index is more efficient:
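For example, joining two unindexed tables (the schema is illustrative):

```shell
rm -f /tmp/auto.db
sqlite3 /tmp/auto.db "CREATE TABLE t1 (a); CREATE TABLE t2 (b);"
sqlite3 /tmp/auto.db "EXPLAIN QUERY PLAN SELECT * FROM t1, t2 WHERE t1.a = t2.b;"
# One table is scanned; for the other, the planner builds a transient
# index, e.g. "SEARCH t2 USING AUTOMATIC COVERING INDEX (b=?)".
```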
Without an index, the query is equivalent to two nested scans, with a total time complexity of O(n²). With an index, a B-tree search replaces the inner scan, for an O(n log n) time complexity.
Since automatic indexes are created only for the duration of the query, you can improve performance by creating them in advance. Use the query plan as a tool to identify indexes to create and queries to optimize.
Checking the query plan for each query in an application is not really practical. Instead, you can enable SQLite's error log, which prints a warning with the `SQLITE_WARNING_AUTOINDEX` code (284) each time the query planner creates an automatic index. After the previous example, the log contains the following line:
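You can reproduce this in the CLI with the `.log` command (the message format may vary slightly between versions; the schema is illustrative):

```shell
rm -f /tmp/log.db
sqlite3 /tmp/log.db <<'EOF'
.log stdout
CREATE TABLE t1 (a);
CREATE TABLE t2 (b);
INSERT INTO t1 VALUES (1);
INSERT INTO t2 VALUES (1);
SELECT * FROM t1, t2 WHERE t1.a = t2.b;
EOF
# Among the output, a warning such as:
#   (284) automatic index on t2(b)
```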
Now you can quickly identify which indexes to create in order to improve performance.
§Create covering indexes
Creating an index for a single column yields the most gains, but you can improve performance further by using covering indexes. They encompass both the values you are searching for and the values you need to return. After finding a value, SQLite does not have to search for the original row to return the associated values, sparing an additional O(log n) lookup.[25]
Consider the following example: if you create an index on `t2 (b)`, SQLite makes use of it:
The time complexity is O(2 log n), because SQLite has to look up the row in `t2` to find the value of `t2 (c)`, which is not included in the index. To drop the constant factor, you can create a covering index that contains the two columns, `t2 (b, c)`:
The time complexity is reduced to O(log n), at the expense of additional space for the index.
§Collect query statistics
Each SQLite connection accumulates statistics about queries to identify where the query planner could do a better job. SQLite can use this information to improve query performance.[26]
To enable query analysis, run `PRAGMA optimize` before closing a connection. The results are stored in internal tables and influence how queries are performed (e.g., how to order tables to perform a join, or which indexes to pick).[27][28]
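A sketch of this maintenance step (schema and path are placeholders; within one heredoc, the pragma runs on the same connection as the queries):

```shell
rm -f /tmp/opt.db
sqlite3 /tmp/opt.db <<'EOF'
CREATE TABLE t (a);
CREATE INDEX t_a ON t (a);
SELECT * FROM t WHERE a = 42;
-- Before closing, let SQLite decide whether gathering statistics is worthwhile:
PRAGMA optimize;
EOF
```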
§Maintain the database
Compared to a DBMS, SQLite does not require a lot of maintenance, but there are some operations you need to be aware of.
§Checkpointing
When changes accumulate in the WAL file, read performance degrades.[29] Through a checkpointing process, SQLite periodically transfers the changes to the main database, and resets the WAL journal.
§How it works
The process has three steps:
1. Flush the WAL file to persistent storage (to ensure the checkpoint can continue after a power loss), then transfer as many modified pages as possible to the database file.
Each reader has a mark in the WAL file that represents the latest committed transaction it can view.[20] The checkpointer can safely transfer pages that appear before this offset, because no reader needs the matching unmodified pages from the database file. (If the reader considers these changes committed, then it does not need the original pages from the database, and they can be overwritten.) However, the checkpointer cannot transfer the pages modified later, because a reader might read the matching unmodified pages from the database file. Therefore, the checkpointer at step 1 can only transfer pages before the earliest active reader mark.[30]
2. If there are no remaining pages to transfer, `fsync` the database file (the changes are durable).
Assuming all the readers are up to date with the WAL file, the checkpointer can transfer all the changes and `fsync` the database to complete step 2. Without new changes, subsequent reads fetch the pages directly from the database file and do not even need to use the WAL journal (as it is empty).[31]
3. If no reader is using the WAL file, reset it.[30]
To complete step 3, all active readers must be reading exclusively from the database file (otherwise, the WAL file would be reset while they read from it). The next writer rewinds the WAL and overwrites the old pages from the beginning.[32][33]
§Automatic checkpoints
The automatic checkpointer runs in the following cases:[34]
- On `COMMIT`, at the end of a write transaction, when the WAL file exceeds the number of pages configured by the `wal_autocheckpoint` pragma (1000 pages by default, or about 4 MB with the default 4096-byte page size).
- When the last connection to the database is closed.
There is a tradeoff between read and write performance:[29]
- More frequent checkpoints keep the journal small, improving read performance at the cost of write performance.
- Less frequent checkpoints allow for a large journal, improving write performance at the cost of read performance.
§When it does not work
While the automatic checkpointer is sufficient for most purposes, a long-running read transaction (or constantly overlapping reads) can block it. The checkpointer will not be able to make progress past the earliest reader mark at step 1, or reset the WAL at step 3. Additionally, it may not be able to transfer enough changes back to the database file after huge write transactions. As a result, the WAL file can grow unbounded.[35]
To work around this problem, start a background thread that periodically issues a `wal_checkpoint`. There are different modes:
- `PASSIVE`: run a checkpoint without blocking any other reader or writer (it cannot make progress past the earliest reader mark or while there is an active writer).
- `FULL`: acquire an exclusive write lock (invoking the busy handler if necessary), run a checkpoint (it cannot make progress past the earliest reader mark), then release the exclusive write lock.
- `RESTART`: same as `FULL`, but while holding the lock, wait until all readers are done with the WAL journal (invoking the busy handler if necessary). The next writer will be able to reset it.
- `TRUNCATE`: same as `RESTART`, but also truncate the WAL file.
The `RESTART` and `TRUNCATE` modes actively wait until all readers are done with the WAL file after step 2 by blocking other writers. This ensures that the next writer will be able to reset the WAL file to complete the checkpoint (step 3). Another benefit is that the background thread bears the cost of calling `fsync` during a checkpoint instead of a foreground writer.[36] However, a long-running or dangling transaction can still block these manual checkpoints. Make sure to keep transactions small and do not leave them dangling.
§Vacuum
Deleting data in SQLite does not actually remove it from the database file, which makes this operation efficient (the pages are only marked as free). You can rewrite the database file with `VACUUM` for the following purposes:
- Reduce the database file size after deleting a large amount of data.
- Reduce database fragmentation. (SQLite reuses deleted pages, which over time leads to fragmentation that reduces query performance.)
- Permanently delete sensitive data. (See `secure_delete` to overwrite deleted data with zeros.)
- Change the `page_size`.
`VACUUM` is a write operation that blocks other writers. Note that the whole database is copied, so you need to have enough available space.
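For example, `freelist_count` shows the free pages before and after the rewrite (schema and path are placeholders):

```shell
rm -f /tmp/vac.db
sqlite3 /tmp/vac.db <<'EOF'
CREATE TABLE t (a);
WITH RECURSIVE c (x) AS (SELECT 1 UNION ALL SELECT x + 1 FROM c WHERE x < 10000)
INSERT INTO t SELECT x FROM c;
DELETE FROM t;
PRAGMA freelist_count;  -- non-zero: the pages are only marked as free
VACUUM;
PRAGMA freelist_count;  -- 0: the file has been rewritten
EOF
```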
§Ensure consistency
Consistency guarantees that the database maintains its invariants across changes. Examples of invariants include uniqueness constraints, or any other rule that contributes to data integrity.
§"Flexible" typing
A type associated with a column in SQLite has one of the following affinities:

- `NUMERIC`
- `INTEGER`
- `REAL`
- `TEXT`
- `BLOB`
They indicate the recommended type for values stored in a column. For example, the `DATETIME` column type has the `NUMERIC` affinity.
Based on the affinity, SQLite converts values to one of the following storage classes:

- `NULL`
- `INTEGER`
- `REAL`
- `TEXT`
- `BLOB`
Then, the value is stored in the database with an internal data type (suitable string encoding, integer size, etc).
Example: a `TINYINT` column (8-bit integer) has the `INTEGER` affinity. If you insert the value `42` or `1337` (exceeding 8 bits), the storage class for both is `INTEGER`. If you insert the string `'42'`, the `INTEGER` affinity converts it to an integer, stored with the `INTEGER` storage class. If you insert `'Hello, World!'` into this column, SQLite tries to convert this value based on the `INTEGER` affinity; because this string does not represent a number, the value is stored with the `TEXT` storage class.
As you can see, the actual type is not associated with the column, but with the value itself. That explains why you can store `TEXT` data in an `INTEGER` column, long `TEXT` in a `VARCHAR(50)` column, or `BLOB` and `NULL` values in any column type.
Note that SQLite enforces the `UNIQUE` and `NOT NULL` constraints. The latter prevents values with the `NULL` storage class from being stored in the column.
Unless you expect to move to another DBMS with "rigid" typing, using more specific column types in your schema is unnecessary, aside from documentation purposes.
§Foreign keys
Foreign keys are essential to maintain integrity in a relational database. By default, SQLite does not enforce them. Run `PRAGMA foreign_keys = ON` on each connection to enable these constraints.
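For example (the schema is illustrative; the `INSERT` violates the constraint on purpose):

```shell
sqlite3 :memory: <<'EOF' || true  # the last statement is expected to fail
PRAGMA foreign_keys = ON;
CREATE TABLE artists (id INTEGER PRIMARY KEY);
CREATE TABLE tracks (artist_id REFERENCES artists (id));
INSERT INTO tracks VALUES (1);  -- artist 1 does not exist
EOF
# Prints an error such as: FOREIGN KEY constraint failed
```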
§Conclusion
- Enable Write-Ahead Logging with `PRAGMA journal_mode = WAL`.
- Trade some durability for performance with `PRAGMA synchronous = NORMAL`.
- Define a busy timeout handler with `PRAGMA busy_timeout = 5000`. (Increase its value to reduce `SQLITE_BUSY` errors.)
- Use small transactions and do not keep them dangling.
- Start write transactions with `BEGIN IMMEDIATE`.
- Enforce foreign key constraints with `PRAGMA foreign_keys = ON`.
- Enable the error log and look for automatic indexes you can create permanently.
- Adjust `PRAGMA wal_autocheckpoint` based on the desired read/write performance.
- If the WAL file grows too large, periodically run `PRAGMA wal_checkpoint(RESTART)` in a background service.
- Occasionally run `VACUUM` to delete free pages and defragment the database.
§References
1. Most Widely Deployed SQL Database Engine. sqlite.org.
2. An Introduction To The SQLite C/C++ Interface. sqlite.org.
3. Long Term Support. sqlite.org.
4. In-Memory Databases. sqlite.org.
5. SQLite Shared-Cache Mode. sqlite.org.
6. Richard Hipp on shared cache. SQLite forum.
7. Atomic Commit in SQLite. sqlite.org.
8. File Locking and Concurrency in SQLite Version 3. sqlite.org.
9. Atomic Commit in SQLite § Creating A Rollback Journal File. sqlite.org.
10. Atomic Commit in SQLite § Rollback. sqlite.org.
11. Atomic Commit in SQLite § Deleting The Rollback Journal. sqlite.org.
12. Atomic Commit in SQLite § Obtaining An Exclusive Lock. sqlite.org.
13. Write-Ahead Logging § How WAL Works?. sqlite.org.
14. Write-Ahead Logging § Overview. sqlite.org.
15. Write-Ahead Logging § Persistence of WAL mode. sqlite.org.
16. Ensuring data reaches disk. lwn.net.
17. Write-Ahead Logging § Sometimes Queries Return `SQLITE_BUSY` In WAL Mode. sqlite.org.
18. Isolation in SQLite. sqlite.org.
19. Transaction § Read transactions versus write transactions. sqlite.org.
20. Write-Ahead Logging § Concurrency. sqlite.org.
21. Is WAL mode serializable?. SQLite mailing list.
22. Transaction § `DEFERRED`, `IMMEDIATE`, and `EXCLUSIVE` transactions. sqlite.org.
23. Query Planning. sqlite.org.
24. The SQLite Query Optimizer Overview § Automatic Indexes. sqlite.org.
25. Query Planning § Covering Indices. sqlite.org.
26. `ANALYZE` § Automatically Running `ANALYZE`. sqlite.org.
27. Database File Format § The sqlite_stat1 table. sqlite.org.
28. The SQLite Query Optimizer Overview § Manual Control of Query Plans. sqlite.org.
29. Write-Ahead Logging § Performance Considerations. sqlite.org.
30. Database File Format § Checkpoint Algorithm. sqlite.org.
31. `src/wal.c`: comment on reader marks. SQLite source code.
32. Database File Format § WAL Reset. sqlite.org.
33. `src/wal.c`: comment on WAL reset. SQLite source code.
34. Write-Ahead Logging § Automatic Checkpoint. sqlite.org.
35. Write-Ahead Logging § Avoiding Excessively Large WAL Files. sqlite.org.
36. `src/wal.c`: comment on `fsync` during checkpointing. SQLite source code.
§Further readings
- SQLite Documentation. sqlite.org.
- SQLite performance tuning. phiresky.github.io.
- Understanding `SQLITE_BUSY`. activesphere.com.