How to store data in SQLite: the complete guide from first table to production-ready database
SQLite runs inside every Android and iOS device, Python's standard library, and most embedded systems on the planet. The entire database lives in a single file, with no network layer, daemon, or config files to manage. That zero-overhead model makes it the default choice for web scrapers, mobile apps, CLI tools, and data pipelines that need structured storage without server complexity. This guide covers the full lifecycle: schema design, inserts, queries, security, and debugging.
Mykolas Juodis
Last updated: Apr 10, 2026
14 min read

TL;DR
- sqlite3 is in Python's standard library. No installation needed.
- Always use ? placeholders, never f-strings or string formatting in queries.
- Enable two PRAGMAs at connection start: foreign_keys = ON and journal_mode = WAL.
- Wrap every multi-step write in a transaction. Wrap every connection in a context manager.
- executemany() inside a single transaction is 50–100x faster than looping execute().
- Add STRICT to table definitions if you're on SQLite 3.37+. Type affinity will silently accept wrong types otherwise.
- Store the database path in a config variable. Centralize connection logic in one helper function.
- When you hit concurrent write bottlenecks or need network access, move to PostgreSQL. Your SQL transfers directly.
Prerequisites
You'll need Python 3.6+ and basic familiarity with functions, context managers, and exception handling. The sqlite3 module ships with the standard library, nothing to install.
Basic SQL is assumed. You should recognize SELECT, INSERT, UPDATE, and DELETE. This guide covers SQLite-specific behavior, not SQL fundamentals.
For the platform-specific sections: Android examples use Kotlin and Android Studio; Flutter examples use Dart with sqflite or drift in your pubspec.yaml; Windows examples use .NET 6+ with the Microsoft.Data.Sqlite NuGet package.
To run queries interactively, grab DB Browser for SQLite.
Setting up a SQLite database and managing connections
The difference between an SQLite project that runs cleanly for two years and one that starts breaking after three weeks usually has nothing to do with queries. It comes down to how you open and close connections.
Creating and initializing the database
Because sqlite3 is part of the standard Python library, a working database connection requires nothing beyond a Python installation.
If inventory.db doesn't exist yet, SQLite will create it. If it does, SQLite will open it. There's no separate CREATE DATABASE step like in PostgreSQL.
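A minimal sketch (inventory.db is this guide's example filename – any path works):

```python
import sqlite3

# Opens inventory.db in the current directory, creating it if it doesn't exist.
conn = sqlite3.connect("inventory.db")
print(sqlite3.sqlite_version)  # version of the embedded SQLite engine
conn.close()
```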
For testing or single-run pipelines where data doesn't need to persist beyond the process, use an in-memory database:
conn = sqlite3.connect(":memory:")
In-memory databases are fast and clean up automatically when the connection closes. They're a popular go-to for unit tests – spin up a fresh database, run logic against it, and throw it away without any teardown.
Note: File extension conventions matter for tooling. .sqlite is recognized by most GUI browsers like DB Browser for SQLite; .db is common in framework tutorials. Pick one and stay consistent. Mixing them is a surprisingly common source of "why is my table empty?" debugging sessions.
Scattering connect() calls across a codebase leads to multiple connection objects, some never closed, some pointing at the wrong file. Wrap initialization in a class instead:
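One possible shape for that wrapper (the class name and schema are illustrative, not a fixed API):

```python
import sqlite3

class Database:
    """Single owner of the connection and schema-creation logic."""

    def __init__(self, path="inventory.db"):
        self.conn = sqlite3.connect(path)
        self.conn.execute("PRAGMA foreign_keys = ON")
        self._create_tables()

    def _create_tables(self):
        # IF NOT EXISTS makes this safe to run on every startup.
        self.conn.execute("""
            CREATE TABLE IF NOT EXISTS products (
                id INTEGER PRIMARY KEY,
                name TEXT NOT NULL,
                unit_price REAL NOT NULL CHECK(unit_price > 0)
            )
        """)
        self.conn.commit()

    def close(self):
        self.conn.close()
```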
IF NOT EXISTS makes this safe to call on every startup without throwing errors on tables that already exist. Startup logic should be boring and reliable.
Libraries like SQLAlchemy, Flutter's sqflite, and Microsoft.Data.Sqlite all follow the same principle – wrap raw SQLite behind a structured interface so that the rest of the application never touches the database file directly.
Equivalent patterns in other environments
- SQLAlchemy: create_engine() + Base.metadata.create_all(engine)
- Flutter: openDatabase() with an onCreate callback
- .NET: Microsoft.Data.Sqlite with an InitializeDatabase() method
Getting the database open is the straightforward part. Keeping the connection lifecycle clean is where most projects run into trouble.
Managing connections reliably
First, decide on your connection lifetime – short-lived or persistent.
Short-lived connections open for a single operation or transaction and close immediately. They're the right default for scripts, request handlers, and anything that runs periodically.
Persistent connections stay open for the lifetime of a process and make sense when connection overhead is measurable, for example, desktop applications or long-running daemons with continuous write activity.
For most use cases, short-lived connections are safer. They release locks immediately, leave no dangling state, and fail loudly if something goes wrong.
Python's context manager pattern handles commit and rollback automatically (note that sqlite3's connection context manager manages transactions only – wrap it in contextlib.closing if you also want the connection closed):
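A sketch of the pattern, assuming a products table already exists at the given path:

```python
import sqlite3
from contextlib import closing

def add_product(path, name, price):
    # closing() guarantees conn.close(); the inner `with conn:` block
    # commits on success and rolls back if an exception is raised.
    with closing(sqlite3.connect(path)) as conn:
        with conn:
            conn.execute(
                "INSERT INTO products (name, unit_price) VALUES (?, ?)",
                (name, price),
            )
```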
As you can see, there's no explicit conn.close() or try/finally boilerplate. For larger projects, centralize connection logic in a single helper so that every part of the codebase opens connections the same way:
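A sketch of such a helper – DB_PATH and the pragma choices reflect this guide's recommendations, not requirements:

```python
import sqlite3

DB_PATH = "inventory.db"  # change it here once, not throughout the codebase

def get_connection(path=DB_PATH):
    conn = sqlite3.connect(path)
    conn.execute("PRAGMA foreign_keys = ON")   # enforce FK constraints
    conn.execute("PRAGMA journal_mode = WAL")  # concurrent readers + writer
    conn.row_factory = sqlite3.Row             # access columns by name
    return conn
```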
If you need to rename the file, switch to a config variable, or add logging, you change it in one place instead of hunting through the codebase.
Never hold a connection open across slow operations. A common anti-pattern is opening a connection, firing a network request, waiting for the response, and only then writing the result – SQLite holds a write lock for that entire wait. In a small script this is harmless, but in a scraper making 500 requests per run, it adds up.
To fix this, finish the slow work first, then open the connection only when you're ready to write. For projects with heavier concurrent read requirements, enable WAL mode early:
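The setting persists in the database file, so running it once at initialization is enough (re-running it is harmless):

```python
import sqlite3

conn = sqlite3.connect("inventory.db")
conn.execute("PRAGMA journal_mode = WAL")  # persists in the file across connections
conn.close()
```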
WAL (Write-Ahead Logging) allows readers and writers to operate concurrently rather than taking turns. You can add this at initialization and rarely think about it again.
With connections handled reliably, the next decision that shapes everything downstream is the schema.
Designing the database schema
A schema is a contract. It defines what data your application is allowed to store and what guarantees the database will enforce. Time spent sketching it before writing any SQL pays back every time you avoid hunting down a bug caused by a null that shouldn't exist or a duplicate that should have been blocked.
SQLite's type system – a trap worth knowing
SQLite uses type affinity rather than strict types. The five affinity types are INTEGER, TEXT, REAL, BLOB, and NUMERIC, but declaring a column as any of them doesn't lock it down. A column declared as INTEGER will silently accept a string without raising an error. Store "9.99" instead of 9.99 in a unit_price column and SQLite won't complain, but when you run a SUM across that column three months later, the result will be wrong, and nothing will tell you why.
SQLite 3.37+ has an opt-in fix. STRICT tables enforce actual type constraints at write time:
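A sketch with a version guard, since STRICT is a hard error on older SQLite builds (the fallback to a non-STRICT table is this example's choice):

```python
import sqlite3

def create_products_table(conn):
    # STRICT requires SQLite 3.37+; fall back silently on older builds.
    strict = " STRICT" if sqlite3.sqlite_version_info >= (3, 37, 0) else ""
    conn.execute(f"""
        CREATE TABLE IF NOT EXISTS products (
            id INTEGER PRIMARY KEY,
            name TEXT NOT NULL,
            unit_price REAL NOT NULL
        ){strict}
    """)
    return bool(strict)

conn = sqlite3.connect(":memory:")
if create_products_table(conn):
    try:
        # Under STRICT, a non-numeric string can't sneak into a REAL column.
        conn.execute("INSERT INTO products (name, unit_price) VALUES (?, ?)",
                     ("Widget", "not a number"))
    except sqlite3.Error as e:
        print("rejected:", e)
```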
When building something for production, always add STRICT from the start, because retrofitting type safety into an existing schema is painful.
Creating the tables
For the rest of this guide, we'll use a product inventory tracker with suppliers as the working example.
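A plausible version of that schema (exact column names are this guide's conventions):

```python
import sqlite3

SCHEMA = """
CREATE TABLE IF NOT EXISTS suppliers (
    id   INTEGER PRIMARY KEY,
    name TEXT NOT NULL UNIQUE
);
CREATE TABLE IF NOT EXISTS products (
    id          INTEGER PRIMARY KEY,
    name        TEXT NOT NULL,
    stock_count INTEGER NOT NULL DEFAULT 0,
    unit_price  REAL NOT NULL CHECK(unit_price > 0),
    created_at  TEXT DEFAULT CURRENT_TIMESTAMP
);
CREATE TABLE IF NOT EXISTS product_suppliers (
    product_id  INTEGER NOT NULL REFERENCES products(id),
    supplier_id INTEGER NOT NULL REFERENCES suppliers(id),
    PRIMARY KEY (product_id, supplier_id)
);
"""

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript(SCHEMA)
```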
A few decisions worth noting:
- DEFAULT 0 on stock_count means application code never has to pass an initial value.
- DEFAULT CURRENT_TIMESTAMP on created_at fills in the creation time automatically.
- CHECK(unit_price > 0) enforces a business rule at the database level. Even if the application has a bug, a negative price cannot be committed.
Foreign key enforcement is disabled by default in SQLite, which trips up developers coming from PostgreSQL. You can enable it at the start of every connection:
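The setting is per-connection and doesn't persist, so it belongs in your connection helper:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # must be re-run on every new connection
```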
Foreign keys are one constraint type. The others worth defining at schema time are primary keys, UNIQUE, and CHECK. Together they cover the data integrity guarantees that would otherwise require enforcement in application code, scattered across every function that touches the database.
Primary keys and additional constraints
INTEGER PRIMARY KEY is an alias for SQLite's internal rowid. Any row that doesn't explicitly set this column gets an auto-incremented value. For junction tables like product_suppliers, define a composite primary key instead:
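A sketch of the junction table, including the duplicate insert it blocks:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE product_suppliers (
        product_id  INTEGER NOT NULL,
        supplier_id INTEGER NOT NULL,
        PRIMARY KEY (product_id, supplier_id)
    )
""")
conn.execute("INSERT INTO product_suppliers VALUES (1, 1)")
try:
    conn.execute("INSERT INTO product_suppliers VALUES (1, 1)")  # same pair again
    duplicate_blocked = False
except sqlite3.IntegrityError:
    duplicate_blocked = True
```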
This prevents the same product-supplier pair from being recorded twice, which is exactly what a junction table guarantees.
UNIQUE constraints prevent duplicates on non-primary-key columns:
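For example, a supplier's email address (an illustrative column) can be kept unique without being the primary key:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE suppliers (
        id    INTEGER PRIMARY KEY,
        name  TEXT NOT NULL,
        email TEXT UNIQUE   -- a second row with the same email is rejected
    )
""")
conn.execute("INSERT INTO suppliers (name, email) VALUES (?, ?)",
             ("Acme", "sales@acme.example"))
try:
    conn.execute("INSERT INTO suppliers (name, email) VALUES (?, ?)",
                 ("Acme Two", "sales@acme.example"))
    duplicate_blocked = False
except sqlite3.IntegrityError:
    duplicate_blocked = True
```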
CHECK constraints push basic business rules down into the database layer – the last line of defense against bad data:
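A sketch using the price and stock rules mentioned earlier:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE products (
        id          INTEGER PRIMARY KEY,
        name        TEXT NOT NULL,
        unit_price  REAL NOT NULL CHECK(unit_price > 0),
        stock_count INTEGER NOT NULL DEFAULT 0 CHECK(stock_count >= 0)
    )
""")
try:
    # A negative price violates the CHECK even though the types are fine.
    conn.execute("INSERT INTO products (name, unit_price) VALUES (?, ?)",
                 ("Widget", -5.0))
    rejected = False
except sqlite3.IntegrityError:
    rejected = True
```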
Schema tips:
- Sketch the schema before writing any SQL. A quick diagram of tables, required vs. optional columns, and relationships takes ten minutes to prepare and prevents a lot of ALTER TABLE pain down the road.
- Use snake_case for all table and column names. Avoid SQL-reserved words as identifiers. SQLite parses ORDER and GROUP as keywords first, so a table named order breaks any query that references it unless you wrap the name in quotes every single time.
- Plan for schema migrations from the start. SQLite's ALTER TABLE supports ADD COLUMN, RENAME COLUMN, RENAME TO, and (since 3.35.0) DROP COLUMN, with the restriction that dropped columns cannot be primary keys, part of an index, or referenced by a foreign key. Any significant structural change requires recreating the table. Keeping schema creation SQL in version control from day one makes this far less painful.
With the schema defined, the next step is getting data in, with one rule applicable to every insert.
Inserting data
Three defaults routinely cause problems when writing inserts: how values are passed, how large datasets are handled, and how duplicate records behave. Each has a safer alternative worth knowing before writing any production code.
Parameterized queries are the only correct default
For every single insert, the write path looks like this:
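A sketch, assuming the products table from the schema section:

```python
import sqlite3

def insert_product(conn, name, unit_price):
    cur = conn.execute(
        "INSERT INTO products (name, unit_price) VALUES (?, ?)",
        (name, unit_price),  # values are bound by the driver, never interpolated
    )
    conn.commit()
    return cur.lastrowid     # auto-assigned primary key of the new row

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE products (id INTEGER PRIMARY KEY, name TEXT, unit_price REAL)")
new_id = insert_product(conn, "Widget", 9.99)
```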
The ? placeholders are where the safety comes from. The database driver handles the binding between placeholder and value. Whatever gets passed in, even if it contains SQL syntax, gets treated as literal data and never executed as a command. The alternative (building INSERT statements with f-strings or %-formatting) is the primary SQL injection vector in Python codebases, because user input gets interpreted as SQL rather than data.
For queries with many columns where positional order becomes error-prone, switch to named placeholders:
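The same insert with named placeholders bound from a dict:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE products (
    id INTEGER PRIMARY KEY, name TEXT, unit_price REAL, stock_count INTEGER
)""")
conn.execute(
    """INSERT INTO products (name, unit_price, stock_count)
       VALUES (:name, :unit_price, :stock_count)""",
    {"name": "Widget", "unit_price": 9.99, "stock_count": 25},
)
```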
Named placeholders make it obvious what's happening where and are resilient to column reordering.
Bulk inserts with executemany()
Looping execute() over a list of records and committing each one individually is the most common SQLite performance mistake. The difference is unnoticeable at 10 rows, but the overhead adds up quickly at 50,000.
executemany() inside a single transaction fixes this:
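A sketch with a synthetic batch of 1,000 rows:

```python
import sqlite3

rows = [(f"Product {i}", float(i)) for i in range(1, 1001)]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE products (id INTEGER PRIMARY KEY, name TEXT, unit_price REAL)")
with conn:  # one transaction for the whole batch, committed on exit
    conn.executemany(
        "INSERT INTO products (name, unit_price) VALUES (?, ?)", rows
    )
```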
Batching inside a single transaction is consistently 50× to 100× faster than per-row commits on large datasets. The code change is two lines.
INSERT OR REPLACE and INSERT OR IGNORE
Bulk inserts assume all data is new. Every scraper and sync pipeline eventually hits the case where a record might already exist.
INSERT OR REPLACE treats the incoming row as authoritative and overwrites the existing one:
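A sketch keyed on a hypothetical sku column (OR REPLACE needs a PRIMARY KEY or UNIQUE constraint to detect the conflict):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE products (sku TEXT PRIMARY KEY, name TEXT, unit_price REAL)")
conn.execute("INSERT INTO products VALUES ('W-1', 'Widget', 9.99)")
# Incoming row wins: the old 'W-1' row is replaced with the new price.
conn.execute("INSERT OR REPLACE INTO products VALUES ('W-1', 'Widget', 12.49)")
```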
INSERT OR IGNORE skips the row silently if it would violate a unique constraint. Re-running the same script against the same data produces no errors and no duplicates:
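The same setup with OR IGNORE – the existing row wins and the rerun is a no-op:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE products (sku TEXT PRIMARY KEY, name TEXT, unit_price REAL)")
conn.execute("INSERT OR IGNORE INTO products VALUES ('W-1', 'Widget', 9.99)")
conn.execute("INSERT OR IGNORE INTO products VALUES ('W-1', 'Widget', 12.49)")  # skipped
```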
Insertion checklist
- Wrap bulk inserts in a single transaction. Per-row commits on large datasets are catastrophically slow.
- Retrieve auto-assigned primary keys with cursor.lastrowid immediately after an insert.
- Validate and clean data before inserting. SQLite's type affinity silently accepts wrong types, storing bad data without complaint.
Collecting data before storing it?
Decodo's Web Scraping API returns clean, structured output that drops straight into your SQLite pipeline with minimal validation overhead.
Querying, updating, and deleting data
SELECT * is fine for exploration, but for anything running in production, name your columns explicitly. It makes the code readable, reduces memory usage, and prevents silent breakage when someone adds a column to the table later.
Querying with SELECT
For partial text matching, use LIKE with a % wildcard, for example, WHERE name LIKE ? with "%Widget%" as the bound value matches any name containing that string.
conn.row_factory = sqlite3.Row lets you access result columns by name (row['name']) rather than by index (row[0]). Index-based access is brittle, and adding a column in the middle of the schema breaks every index-based access downstream. However, Row objects and tuples coming out of SQLite are raw database types, so parsing them into the data structures your application actually uses is a separate step worth thinking about deliberately.
Pick the fetch method based on what you’re expecting back:
- fetchall() loads the full result set into memory. It works for small datasets, but can be a problem for large ones.
- fetchone() retrieves a single row. You can use this for lookups where you’d expect a 0 or 1 result.
- fetchmany(n) pulls n rows at a time, which is better for large result sets that you want to process in chunks.
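A sketch pulling those pieces together – row_factory for name-based access, fetchone() for a lookup, fetchmany() for chunked iteration:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.row_factory = sqlite3.Row  # rows become addressable by column name
conn.execute("CREATE TABLE products (id INTEGER PRIMARY KEY, name TEXT, unit_price REAL)")
conn.executemany("INSERT INTO products (name, unit_price) VALUES (?, ?)",
                 [("Widget", 9.99), ("Widget Pro", 19.99), ("Gadget", 4.50)])

# Partial text match, bound as a parameter
row = conn.execute(
    "SELECT id, name, unit_price FROM products WHERE name LIKE ?", ("%Widget%",)
).fetchone()          # 0-or-1 lookup
print(row["name"])    # by name, not row[1]

# Chunked processing for large result sets
names = []
cur = conn.execute("SELECT name FROM products ORDER BY name")
while True:
    chunk = cur.fetchmany(2)
    if not chunk:
        break
    names.extend(r["name"] for r in chunk)
```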
Updating records with UPDATE
When a product price changes or stock levels shift, target the exact rows you want to modify using UPDATE with a WHERE clause:
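A sketch of a targeted price update:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE products (id INTEGER PRIMARY KEY, name TEXT, unit_price REAL)")
conn.execute("INSERT INTO products (name, unit_price) VALUES ('Widget', 9.99)")

cur = conn.execute(
    "UPDATE products SET unit_price = ? WHERE name = ?", (12.49, "Widget")
)
conn.commit()
print(cur.rowcount)  # number of rows actually modified -> 1
```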
The WHERE clause is mandatory. An UPDATE without a WHERE clause modifies every row in the table. Check cursor.rowcount in scripts that run unattended so you know exactly how many rows were affected.
Deletion follows the same discipline – the WHERE clause is the only thing standing between a targeted operation and a full table wipe.
Deleting records with DELETE
Before running any DELETE in production, run the equivalent SELECT first to confirm you’re targeting the right rows. The deletion itself is straightforward:
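A sketch showing the SELECT-first discipline followed by the delete:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE products (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO products (name) VALUES ('Widget')")

# Confirm what the WHERE clause actually matches...
targets = conn.execute("SELECT id, name FROM products WHERE name = ?", ("Widget",)).fetchall()
# ...then delete exactly those rows
cur = conn.execute("DELETE FROM products WHERE name = ?", ("Widget",))
conn.commit()
```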
For data that needs an audit trail, a soft delete is a safer default than physically removing rows:
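A sketch using a hypothetical deleted_at column as the flag:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE products (
    id INTEGER PRIMARY KEY,
    name TEXT,
    deleted_at TEXT DEFAULT NULL   -- NULL means the row is live
)""")
conn.execute("INSERT INTO products (name) VALUES ('Widget')")

# Soft delete: mark the row instead of removing it
conn.execute("UPDATE products SET deleted_at = CURRENT_TIMESTAMP WHERE id = ?", (1,))
# Normal queries simply filter on the flag
live = conn.execute("SELECT COUNT(*) FROM products WHERE deleted_at IS NULL").fetchone()[0]
```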
Soft deletes preserve history, allow undoing, and make debugging accidental deletions much easier. Use the following pattern to expire the oldest records from a log or cache table without touching newer ones:
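A sketch with a hypothetical logs table and a 30-day retention window:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE logs (id INTEGER PRIMARY KEY, message TEXT, created_at TEXT)")
conn.execute("INSERT INTO logs (message, created_at) VALUES ('old', datetime('now', '-60 days'))")
conn.execute("INSERT INTO logs (message, created_at) VALUES ('new', datetime('now'))")

# Expire everything older than 30 days; newer rows are untouched
conn.execute("DELETE FROM logs WHERE created_at < datetime('now', '-30 days')")
conn.commit()
```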
Single-statement operations are straightforward; transactions ensure multiple writes succeed or fail as a unit.
Transactions for multi-step operations
Some operations only make sense together. Decrementing stock and recording the sale have to both succeed or both fail. Here's how to enforce that with an explicit transaction:
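A sketch of record_sale with explicit transaction control (isolation_level = None hands transaction management to the code rather than the driver):

```python
import sqlite3

def record_sale(conn, product_id, quantity):
    try:
        conn.execute("BEGIN")
        conn.execute(
            "UPDATE products SET stock_count = stock_count - ? WHERE id = ?",
            (quantity, product_id),
        )
        conn.execute(
            "INSERT INTO sales (product_id, quantity) VALUES (?, ?)",
            (product_id, quantity),
        )
        conn.execute("COMMIT")       # both writes become visible together
    except sqlite3.Error:
        conn.execute("ROLLBACK")     # neither write survives
        raise

conn = sqlite3.connect(":memory:")
conn.isolation_level = None  # manual transaction control
conn.execute("CREATE TABLE products (id INTEGER PRIMARY KEY, stock_count INTEGER)")
conn.execute("CREATE TABLE sales (id INTEGER PRIMARY KEY, product_id INTEGER, quantity INTEGER)")
conn.execute("INSERT INTO products (id, stock_count) VALUES (1, 10)")
record_sale(conn, 1, 3)
```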
The context manager pattern handles commit and rollback automatically for simpler cases. Explicit BEGIN/COMMIT/ROLLBACK gives finer control when the logic is more involved.
Transactions protect consistency across operations. The next layer of protection runs at every individual statement, keeping bad data out in the first place.
Protecting data integrity and preventing SQL injection
Most data problems trace back to one of three failure points: a query built from raw string input, a constraint that was never defined, or a value that looked valid but wasn't. The first is a security issue; the latter two are data quality problems, each with a different fix.
Why parameterized queries are non-negotiable
SQL injection is worth understanding concretely. The code below shows what happens when a query is built from raw string input, and how a simple name lookup becomes a full table dump with a crafted input value.
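A deliberately vulnerable sketch – the crafted input turns a name lookup into a full table dump:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE products (id INTEGER PRIMARY KEY, name TEXT, unit_price REAL)")
conn.executemany("INSERT INTO products (name, unit_price) VALUES (?, ?)",
                 [("Widget", 9.99), ("Secret Prototype", 999.0)])

user_input = "x' OR '1'='1"  # crafted "name" from an attacker
# VULNERABLE: the input becomes part of the SQL text itself
query = f"SELECT name FROM products WHERE name = '{user_input}'"
rows = conn.execute(query).fetchall()  # every row comes back, not zero
```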
Parameterized queries close this off entirely:
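The same lookup, now safe – the whole string is bound as a literal value and can never become SQL:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE products (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO products (name) VALUES ('Widget')")

user_input = "x' OR '1'='1"
rows = conn.execute(
    "SELECT name FROM products WHERE name = ?", (user_input,)
).fetchall()  # no product has that literal name, so nothing is returned
```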
This applies to every statement type. WHERE clauses in SELECT, UPDATE, and DELETE are just as vulnerable as INSERT. Any query touching user-supplied input needs parameterized values.
When building pipelines that write external data into SQLite, parameterized queries are the minimum bar – every value arriving from outside the application is untrusted input until proven otherwise. Using a data collection layer like Decodo's Web Scraping API reduces this burden because structured, clean output means less defensive handling before the data reaches the write path.
Enforcing integrity with constraints
Constraints handle a different class of problem – valid-looking data that violates business rules. They run as a second layer of defense even when the application code has a bug:
- NOT NULL prevents accidental null values in required fields.
- UNIQUE prevents duplicate entries on non-primary-key columns.
- CHECK prevents business rule violations, e.g., negative prices.
- FOREIGN KEY prevents orphaned records and referential inconsistency.
As noted in the schema section, foreign key enforcement is easy to forget and difficult to debug when missing. Enable it at the start of every connection:
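With the pragma on, a row referencing a nonexistent parent is rejected:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # per-connection setting
conn.execute("CREATE TABLE suppliers (id INTEGER PRIMARY KEY)")
conn.execute("""CREATE TABLE products (
    id INTEGER PRIMARY KEY,
    supplier_id INTEGER REFERENCES suppliers(id)
)""")
try:
    conn.execute("INSERT INTO products (supplier_id) VALUES (999)")  # no such supplier
    orphan_blocked = False
except sqlite3.IntegrityError:
    orphan_blocked = True
```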
Transactions as an integrity tool
Constraints protect individual rows, while transactions protect operations that span multiple rows or tables. SQLite transactions are ACID-compliant, meaning all changes commit together (or none do), constraints are checked on commit, concurrent operations don't interfere, and committed data persists after a crash.
Without a transaction, a crash between the UPDATE and the INSERT in record_sale leaves stock decremented but no sale recorded – two tables out of sync with no error anywhere. You can verify by running this snippet:
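A sketch that simulates the crash with a raised exception (record_sale_unsafe is a deliberately broken, hypothetical version):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE products (id INTEGER PRIMARY KEY, stock_count INTEGER)")
conn.execute("CREATE TABLE sales (id INTEGER PRIMARY KEY, product_id INTEGER, quantity INTEGER)")
conn.execute("INSERT INTO products VALUES (1, 10)")
conn.commit()

def record_sale_unsafe(conn, product_id, quantity):
    conn.execute("UPDATE products SET stock_count = stock_count - ? WHERE id = ?",
                 (quantity, product_id))
    conn.commit()                          # first write is already durable...
    raise RuntimeError("simulated crash")  # ...before the sale is recorded
    conn.execute("INSERT INTO sales (product_id, quantity) VALUES (?, ?)",
                 (product_id, quantity))

try:
    record_sale_unsafe(conn, 1, 3)
except RuntimeError:
    pass

stock = conn.execute("SELECT stock_count FROM products WHERE id = 1").fetchone()[0]
sales = conn.execute("SELECT COUNT(*) FROM sales").fetchone()[0]
print(stock, sales)  # 7 0 -> the two tables disagree
```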
Wrap both writes in a transaction and the rollback restores the original state automatically.
For simpler cases, the context manager handles commit and rollback without BEGIN/COMMIT boilerplate. Use explicit BEGIN and COMMIT when the logic branches mid-transaction, or you need finer control over save points.
Input validation before the database
Parameterized queries prevent injection. They don't validate that a value makes sense. A negative price passes through a placeholder without complaint, lands in the database, and breaks margin calculations. Catch bad data in application code before it ever reaches the write path:
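A sketch of such a gatekeeper function (the specific rules are illustrative):

```python
def validate_product(name, unit_price, stock_count):
    """Raise ValueError before bad data reaches the write path."""
    if not name or not name.strip():
        raise ValueError("name must be a non-empty string")
    if not isinstance(unit_price, (int, float)) or unit_price <= 0:
        raise ValueError("unit_price must be a positive number")
    if not isinstance(stock_count, int) or stock_count < 0:
        raise ValueError("stock_count must be a non-negative integer")
```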
Validation in application code, constraints in the schema, and parameterized queries at the statement level are three layers of the same defense – a gap in any one of them eventually becomes a problem.
Now that we’ve covered the fundamentals – schema, inserts, queries, and integrity – we can move to patterns for data that doesn't fit SQLite's native types, queries that span multiple tables, and the project contexts where SQLite's strengths matter most.
Advanced techniques: complex data types, joins, and specialized projects
Once the schema is stable and queries become more demanding, three challenges tend to surface: storing data that doesn't map cleanly to SQLite's native types, pulling data across multiple tables, and choosing the right patterns for specific project contexts. This section covers all three.
Storing complex data types
SQLite has no native array or list type. When you need to store tags, a set of URLs, or a dictionary of config values, choose between these two options.
Option 1 is serializing to JSON. It's quick to implement, but SQL can't query into the values:
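A sketch storing a tag list as a JSON string in a TEXT column:

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE products (id INTEGER PRIMARY KEY, name TEXT, tags TEXT)")

tags = ["sale", "clearance"]
conn.execute("INSERT INTO products (name, tags) VALUES (?, ?)",
             ("Widget", json.dumps(tags)))

# Reading requires deserializing in Python; SQL can't filter on individual tags
stored = json.loads(conn.execute("SELECT tags FROM products").fetchone()[0])
```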
Option 2 is a normalized child table. More code, but fully queryable:
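A sketch with a hypothetical product_tags child table:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE products (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE product_tags (
        product_id INTEGER NOT NULL REFERENCES products(id),
        tag        TEXT NOT NULL,
        PRIMARY KEY (product_id, tag)
    );
""")
conn.execute("INSERT INTO products (id, name) VALUES (1, 'Widget')")
conn.executemany("INSERT INTO product_tags VALUES (1, ?)", [("sale",), ("clearance",)])

# Tags are now first-class rows, queryable and indexable in SQL
on_sale = conn.execute("""
    SELECT p.name FROM products p
    JOIN product_tags t ON t.product_id = p.id
    WHERE t.tag = ?
""", ("sale",)).fetchall()
```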
WHERE tag = 'sale' now works at the SQL level, indexes apply, and joins become possible. Use JSON for list values that will never be queried directly, and a child table when they will.
A BLOB column stores binary data – small images, compressed payloads, serialized objects – directly alongside its metadata, so both can be retrieved in a single query with no separate file lookup. Where that coupling matters, BLOB keeps the data model simple. Where it doesn't, large BLOBs inflate the database file, slow backups, and make the file unwieldy to move around.
For anything above a few hundred kilobytes that doesn't need to travel with its metadata, storing the file on disk and keeping only the path in the database is the better trade-off:
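A sketch of the path-in-database pattern (the images/ directory and filename scheme are illustrative):

```python
import sqlite3
from pathlib import Path

IMAGE_DIR = Path("images")
IMAGE_DIR.mkdir(exist_ok=True)

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE products (id INTEGER PRIMARY KEY, name TEXT, image_path TEXT)")

def save_image(conn, product_id, name, data: bytes):
    path = IMAGE_DIR / f"product_{product_id}.png"
    path.write_bytes(data)                       # big payload lives on disk
    conn.execute("INSERT INTO products (id, name, image_path) VALUES (?, ?, ?)",
                 (product_id, name, str(path)))  # DB keeps only the pointer

save_image(conn, 1, "Widget", b"\x89PNG fake bytes")
```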
Combining tables with JOIN
When data lives across multiple tables, JOIN brings it together. INNER JOIN returns only rows with matching entries in both tables:
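A sketch with a product that has no supplier – INNER JOIN drops it:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE suppliers (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE products (id INTEGER PRIMARY KEY, name TEXT, supplier_id INTEGER);
    INSERT INTO suppliers VALUES (1, 'Acme');
    INSERT INTO products VALUES (1, 'Widget', 1), (2, 'Orphan', NULL);
""")
rows = conn.execute("""
    SELECT p.name, s.name
    FROM products p
    INNER JOIN suppliers s ON s.id = p.supplier_id
""").fetchall()
# Only 'Widget' appears; 'Orphan' has no matching supplier row
```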
LEFT JOIN includes all rows from the primary table even when no match exists in the joined table:
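The same data with LEFT JOIN – the unmatched product survives with NULL for the supplier side:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE suppliers (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE products (id INTEGER PRIMARY KEY, name TEXT, supplier_id INTEGER);
    INSERT INTO suppliers VALUES (1, 'Acme');
    INSERT INTO products VALUES (1, 'Widget', 1), (2, 'Orphan', NULL);
""")
rows = conn.execute("""
    SELECT p.name, s.name
    FROM products p
    LEFT JOIN suppliers s ON s.id = p.supplier_id
""").fetchall()
# 'Orphan' is kept, paired with None
```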
Always use table-qualified column names (for example, p.name, s.name) when joined tables share column names.
For complex multi-step queries, Common Table Expressions (CTEs) are far more readable than deeply-nested subqueries. Each CTE is a named mini-table and the final SELECT reads like a sentence:
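A small sketch – each CTE narrows the previous one, and the final SELECT stays trivial:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE products (id INTEGER PRIMARY KEY, name TEXT,
                           stock_count INTEGER, unit_price REAL);
    INSERT INTO products VALUES
        (1, 'Widget', 2, 9.99), (2, 'Gadget', 50, 4.50), (3, 'Gizmo', 1, 19.99);
""")
rows = conn.execute("""
    WITH low_stock AS (
        SELECT * FROM products WHERE stock_count < 5
    ),
    expensive_low_stock AS (
        SELECT * FROM low_stock WHERE unit_price > 10
    )
    SELECT name FROM expensive_low_stock ORDER BY name
""").fetchall()
```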
SQLite in specialized project contexts
- Web scraper persistence. SQLite is a natural fit as the storage layer for Python web scrapers that accumulate records across runs. Using Decodo's Web Scraping API as the data collection layer means structured, clean output arrives ready to insert, reducing the validation burden before it hits the database. INSERT OR IGNORE prevents duplicates across runs, and timestamped rows allow detecting what has changed since the last scrape:
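A sketch of that pattern (the listings table and example.com URLs are hypothetical placeholders):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE listings (
        url        TEXT PRIMARY KEY,
        title      TEXT NOT NULL,
        price      REAL,
        scraped_at TEXT DEFAULT CURRENT_TIMESTAMP
    )
""")

def store_scraped(conn, records):
    # Re-running the scraper never creates duplicates: known URLs are skipped.
    with conn:
        conn.executemany(
            "INSERT OR IGNORE INTO listings (url, title, price) VALUES (?, ?, ?)",
            records,
        )

batch = [("https://example.com/a", "Item A", 10.0),
         ("https://example.com/b", "Item B", 20.0)]
store_scraped(conn, batch)
store_scraped(conn, batch)  # second run is a no-op
```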
The same pattern applies to AI web scraping pipelines where collected data feeds a model or analysis step. SQLite provides a lightweight, queryable store without adding infrastructure.
- IoT and sensor logging. SQLite's footprint is small enough to run on single-board computers. For time-series data, the index is the critical schema decision. Without one, time-range queries do full table scans:
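A sketch of a readings table with the timestamp index, using EXPLAIN QUERY PLAN to confirm the index is actually used:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE readings (
        id        INTEGER PRIMARY KEY,
        sensor_id TEXT NOT NULL,
        value     REAL NOT NULL,
        ts        TEXT NOT NULL
    );
    -- Without this index, every time-range query scans the whole table
    CREATE INDEX idx_readings_ts ON readings (ts);
""")
plan = conn.execute(
    "EXPLAIN QUERY PLAN SELECT * FROM readings WHERE ts BETWEEN ? AND ?",
    ("2026-01-01", "2026-01-02"),
).fetchall()
# The plan should mention idx_readings_ts rather than a full table scan
```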
- CLI tools and developer utilities. SQLite gives a persistent user state without asking anyone to install anything extra. Here's a simple key-value store built with Click:
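A sketch of that idea – the storage functions are plain sqlite3, and the Click wiring (a third-party package, guarded here in case it isn't installed) just exposes them as subcommands. DB_PATH is a hypothetical location for the tool's state:

```python
import sqlite3

DB_PATH = "cli_state.db"

def _conn(path=DB_PATH):
    conn = sqlite3.connect(path)
    conn.execute("CREATE TABLE IF NOT EXISTS kv (key TEXT PRIMARY KEY, value TEXT)")
    return conn

def set_value(key, value, path=DB_PATH):
    conn = _conn(path)
    with conn:  # commit on success, rollback on error
        conn.execute("INSERT OR REPLACE INTO kv (key, value) VALUES (?, ?)", (key, value))
    conn.close()

def get_value(key, path=DB_PATH):
    conn = _conn(path)
    try:
        row = conn.execute("SELECT value FROM kv WHERE key = ?", (key,)).fetchone()
        return row[0] if row else None
    finally:
        conn.close()

try:
    import click

    @click.group()
    def cli():
        """Tiny persistent key-value store."""

    @cli.command()
    @click.argument("key")
    @click.argument("value")
    def set(key, value):
        set_value(key, value)

    @cli.command()
    @click.argument("key")
    def get(key):
        click.echo(get_value(key) or "")
except ImportError:
    cli = None  # Click not installed; the storage functions still work
```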
The JSON vs. child table decision, the JOIN patterns, and the project-specific schema choices above are all expressions of the same underlying principle – match the data structure to how you need to query it. Get that right, and the rest follows.
Using SQLite across platforms and with abstraction layers
The SQL in this guide transfers directly across every platform below. What changes is the layer between the code and the database engine. Each platform presents the same core trade-off: raw SQL gives direct control with minimal setup, while an abstraction layer adds compile-time safety and reduces boilerplate at the cost of setup time and generated queries you didn't write.
Python: sqlite3 vs. SQLAlchemy
The built-in sqlite3 module covers most Python use cases without installation. You can use it when every query is explicitly written in the source code, and the project is small enough that an ORM would add more ceremony than clarity.
SQLAlchemy Core earns its place when query conditions are determined at runtime, filtering by whichever columns the user selects, for example, and the alternative is concatenating strings to build dynamic SQL. The key advantage over raw sqlite3 is programmatic query construction, assembling filters, joins, and ORDER BY clauses in code without touching a string.
SQLAlchemy ORM maps Python classes to tables and generates SQL from method calls. It pays off for larger applications with complex data models where the abstraction genuinely reduces code. For simple, read-heavy scripts like pulling data, transforming it, and writing a report, the ORM setup cost outweighs the benefit. Raw sqlite3 ships faster and stays easier to read.
Android: SQLiteOpenHelper vs. Room
SQLiteOpenHelper is the traditional approach. Override onCreate() to handle schema creation and onUpgrade() for migrations.
Room is recommended for any new Android development. It generates SQL at compile time, catches type mismatches before runtime, and eliminates the boilerplate code that SQLiteOpenHelper leaves to you.
SQLiteOpenHelper ships faster for a screen or two with simple queries. Once the schema involves multiple tables, relationships, and LiveData or Flow integration, Room pays off. Complex analytical queries that Room can't express cleanly are where @RawQuery earns its place.
Flutter: sqflite and Drift
sqflite uses raw SQL, returns Futures for every operation, and requires no setup beyond the package itself. Use it when direct SQL control is enough and the project doesn't justify the Drift setup cost.
Drift generates type-safe Dart query code from table definitions. Its relationship to sqflite mirrors Room's relationship to SQLiteOpenHelper: more setup overhead, more compile-time safety, and a better fit for large applications with complex schemas and reactive UI requirements. For a small app with two or three tables, sqflite ships faster.
Windows: Microsoft.Data.Sqlite and Entity Framework Core
Microsoft.Data.Sqlite gives thin, explicit control without abstraction overhead.
For Windows apps that need full ORM features, Entity Framework Core with the SQLite provider adds migrations, LINQ queries, and change tracking.
The same decision rule applies as on every other platform: Microsoft.Data.Sqlite ships faster for small projects with straightforward queries; Entity Framework Core pays off when the data model is complex, migrations matter, and LINQ queries would otherwise be hand-rolled SQL.
Enable query logging during development and verify the SQL the framework is actually generating. Abstractions produce unexpected queries, and EXPLAIN QUERY PLAN won't catch what you're not looking at.
SQLite is also widely used beyond the platforms covered here. If you're working with Go or comparing language options for a data-heavy project, this guide to Go vs. Python covers how the two languages approach the same class of problems.
Final thoughts
SQLite's reputation is deserved, but the thing most tutorials don't mention is how often it outlasts the projects that were supposed to outgrow it. The concurrent write limit sounds alarming until you measure actual write frequency and realize most applications never get close to it. The single-file model sounds fragile until you've spent a week debugging a misconfigured PostgreSQL connection in a staging environment.
Migrate when sustained write contention shows up in your logs, not in anticipation of scale. Most projects never reach that point.
The one place SQLite will genuinely surprise you is schema migrations. ALTER TABLE limitations are manageable early and painful late. Keeping your schema-creation SQL in version control from day one and treating every structural change as a migration script rather than a one-off fix is the habit that makes the difference when the schema needs to evolve six months in.
When you do eventually move to PostgreSQL, the SQL will transfer directly. That's rarer than it should be in this ecosystem, and it's worth choosing tools with that property.
Store data efficiently
Decodo's Web Scraping API delivers structured, clean output at scale so your SQLite pipeline gets reliable input from day one.
About the author

Mykolas Juodis
Head of Marketing
Mykolas is a seasoned digital marketing professional with over a decade of experience, currently leading Marketing department in the web data gathering industry. His extensive background in digital marketing, combined with his deep understanding of proxies and web scraping technologies, allows him to bridge the gap between technical solutions and practical business applications.
Connect with Mykolas via LinkedIn.
All information on Decodo Blog is provided on an as-is basis and for informational purposes only. We make no representation and disclaim all liability with respect to your use of any information contained on Decodo Blog or any third-party websites that may be linked therein.


