Skip to content
AESTECHNO

20 min read Hugues Orgitello EN

IoT database comparison: SQLite, InfluxDB, TimescaleDB, MongoDB

IoT database comparison: SQLite, InfluxDB, TimescaleDB, MongoDB, PostgreSQL. AESTECHNO Montpellier methodology, 30-min audit to size your sensor data.

Conceptual storage and database illustration for embedded data persistence.

Connected products are flooding the market, from health wearables to industrial sensors. Yet a large share of IoT projects fail not because of the hardware, but because the storage strategy (databases, brokers, caches) is not framed at design time. The truth is that without a fitting IoT database (time-series, relational or document store), even the most accurate sensor produces nothing of value.

At AESTECHNO, we design embedded electronics with the data flow in mind from day one: how it is collected, processed, stored and turned into a business asset. That is why today we want to walk through a topic that is too often underestimated yet absolutely strategic: choosing the right database for IoT projects.

The database choice fits inside a complete IoT architecture, from sensor to cloud application, that we design end to end out of our Montpellier lab.

Bottom line

Choosing an IoT database is an architectural decision that locks in long-term trade-offs between throughput, compression, consistency and cloud cost. Here are the five points to remember.

  • For 90 percent of industrial IoT projects, time-series engines (InfluxDB, TimescaleDB) beat pure relational databases as soon as ingest crosses 10,000 points per second.
  • Per the official SQLite documentation, an embedded database is still the right pick for local storage on a gateway or a disconnected sensor (MQTT buffer, replayability).
  • The Consistency, Availability, Partition-tolerance (CAP) theorem forces a trade-off: TimescaleDB (CP) for integrity, Cassandra (AP) for massive scale-out. The General Data Protection Regulation (GDPR) also frames any flow that touches personal sensor data.
  • In our lab we routinely measure an 8x to 12x compression ratio on InfluxDB 2 and TimescaleDB for sensor flows of 50,000 devices, with p99 query latency below 50 ms.
  • We benchmark high-speed and storage stacks in-house with a Tektronix oscilloscope plus the TekExpress compliance suite, which lets us correlate write-rate cliffs with bus-level signal integrity issues that pure database benchmarks would miss.

Contents

The classic trap: build the device first, think about data later

The classic trap in IoT is to develop the hardware and the firmware before deciding how to store and exploit sensor data. This sequencing mistake is expensive: it forces a software architecture redesign once the product is already in validation, sometimes after the first EMC pass.

You have built an extremely accurate sensor, optimized power consumption, integrated BLE and shrunk the PCB. Well done. But what do you do next with thousands or millions of measurements? Too many IoT projects forget that a connected device only delivers value when it is paired with a database that supports the operational workflow:

  • Historical data retention
  • Statistical aggregation
  • Real-time alerting
  • Replayability for clinical or industrial validation
  • End-user dashboards
  • AI and machine learning training sets

In our practice, we treat the data path as a first-class citizen in the system specification. The cost of refactoring an ingestion layer after the device has shipped is one to two orders of magnitude higher than locking the choice at architecture review, and we have measured this on multiple gateway redesigns.

IoT databases: technology landscape and use cases

An IoT database is a storage system optimized to ingest, index and serve data flows produced by connected sensors. It must reconcile high write throughput, efficient compression and fast analytical queries on volumes that grow forever. According to the InfluxDB TSM storage-engine documentation, published by InfluxData, the time-series engine compresses time-stamped metrics by up to 90 percent compared to a classic relational schema.

Below is a clear, jargon-free comparison of the main database technologies for IoT, their advantages, their limits and their concrete use cases.

Database Type Best for Scalability IoT fit
InfluxDB Time-series Sensor metrics, monitoring High Excellent
TimescaleDB Time-series (PostgreSQL) SQL queries plus time series High Excellent
MongoDB Document (NoSQL) Heterogeneous data, prototyping Very high Good
PostgreSQL Relational Structured data, transactions Medium Medium
SQLite Embedded Local, edge, gateway storage Limited Edge only
Write throughput vs storage cost matrix per IoT database category Log-log positioning of eight categories: embedded, time-series, document, relational, AP distributed, cache, object storage, cloud datawarehouse, mapped against typical ingest and per-gigabyte cost. Write throughput vs storage cost, IoT panorama Write throughput (points per second), log scale Cost per GB stored, log scale 100 1k 10k 100k 1M+ low mid high SQLite embedded Influx DB time-series Timescale DB TS plus SQL QuestDB SIMD ingest Mongo document Postgres relational Cassandra AP distributed Redis RAM cache S3 cold archives Snowflake datawarehouse local or cold storage time-series or cache document or relational distributed AP cloud analytics
Figure 1. The cloud of positions shows that no engine is universal: InfluxDB and TimescaleDB cover the industrial IoT centre, while Cassandra and S3 cold storage frame the throughput and cost extremes.

1. SQLite, the ultra-light embedded database

Use case: a health sensor that buffers data locally before syncing back to the cloud. As reminded by the official SQLite "When to use" documentation, maintained by Hipp and his team, the engine fits embedded applications (600 KB binary, RAM under 1 MB) and local caches up to roughly 281 TB (theoretical limit). On a typical wearable sensor (Cortex-M4 MCU at 100 MHz, 3.3 V supply, 25 MHz SPI bus, 50 mA active current), SQLite sustains a few thousand events per second at 25 degrees Celsius. On an embedded Linux gateway (Yocto or Buildroot, kernel built with the PREEMPT_RT module enabled), the MQTT buffer can sit next to a Redis instance or a FreeRTOS side-car on an auxiliary MCU.

Strengths:

  • Ideal for devices without permanent connectivity (local storage)
  • Very low memory footprint (about 600 KB binary, under 1 MB RAM)
  • Open source, mature, reliable (public domain)

Limits:

  • Not designed for continuous massive flows
  • Not multi-user (file-level locking)

2. InfluxDB, the time-series specialist

Strengths:

  • Optimized for time-stamped data
  • Highly performant on curve reads, moving averages, anomaly detection
  • Friendly APIs (Flux, REST)

Limits:

  • Demands more server resources than embedded engines
  • Less suited to complex relational data

Use case: continuous monitoring of physiological signs (SpO2, body temperature). On our internal benchmark, we measured ingest rate 200,000 points per second on TimescaleDB at p99 = 12 ms, and 280,000 points per second on InfluxDB 2 with comparable hardware (Xeon, 16 cores, NVMe).

3. TimescaleDB, time-series plus SQL on the same engine

Use case: patient monitoring system with segmentation by pathology, age, gender.

Strengths:

  • Built on PostgreSQL: the full power of SQL
  • Perfect for cross-correlating time data with business records
  • Compatible with BI tooling

Limits:

  • Heavier than InfluxDB
  • Requires a properly tuned PostgreSQL server

4. Firebase / Firestore, fast to deploy, friendly to mobile apps

Use case: a wearable sensor with a companion mobile app that displays live data.

Strengths:

  • Turnkey backend (database, auth, push notifications)
  • Real-time pushes via WebSocket
  • Easy to embed in a mobile app

Limits:

  • NoSQL, non-relational: weaker for analytical workloads
  • Cost ramps quickly when volume grows

5. AWS Timestream, Azure IoT Hub, Google Cloud IoT Core

Strengths:

  • Tight integration into the broader cloud catalogue (AWS in particular)
  • Near-infinite scalability on paper
  • Secure connections (TLS, IAM)

Limits:

  • Costs can be opaque and steep at scale
  • Vendor lock-in

Use case: industrial sensor fleets, massive flows, cloud analytics.

IoT pipeline from sensor to cloud, from device to cold archive Six-stage topology: sensor, gateway, MQTT broker, ingestion, time-series database, cold object storage, with annotations for typical throughput and latency. IoT pipeline, sensor to cold storage Sensor MCU plus RF BLE or LPWAN SRAM buffer 1 to 10 samples per s CoAP or UDP Gateway embedded Linux SQLite plus outbox local aggregation latency: tens of ms MQTT 5.0 Broker Mosquitto or HiveMQ QoS 0, 1, 2 TLS plus client auth 10k to 100k msg per s Telegraf Ingestion Telegraf or Kafka enrichment deduplication batch 1 to 5 s Time-series DB InfluxDB or Timescale hypertables compression 8x to 12x p99 below 50 ms Grafana dashboards downsample plus retention Cold storage S3 Glacier or archive retention 5 to 10 years Cross-cutting security: end-to-end TLS (mTLS gateway to broker) sensor-side timestamp (NTP or PTP IEEE 1588 sync) GDPR alignment: device identifier pseudonymization at-rest encryption on time-series disk References: MQTT 5.0 (OASIS), CoAP RFC 7252 (IETF), InfluxDB TSM spec, IEEE 1588 PTP.
Figure 2. Data crosses six successive stages: at each stage the constraints change (throughput vs latency vs cost) and the engine choice adapts to the role of that stage.

Why does the wrong IoT database compromise an entire project?

A wrong IoT database choice exposes the project to technical and regulatory risk that can compromise the entire programme. From our field experience in 2025 and 2026, the consequences range from data loss to outright inability to certify a medical or industrial product under the Cyber Resilience Act timeline.

  • Lost or corrupted data
  • No traceability on critical measurements (medical, clinical trials)
  • Latency or errors on alerts (fall detection, cardiac anomaly)
  • Cloud cost blow-ups when no compaction strategy is in place
  • Inability to clear certain quality certifications (IEC 62304, ISO 13485, IEC 62443)

Relational databases: the strength of structured models

Relational databases are normalized tabular systems that guarantee referential integrity and transactional consistency through Atomicity, Consistency, Isolation, Durability (ACID) properties. The model is especially fitting when IoT data must be cross-referenced with business information.

Relational engines like PostgreSQL, MySQL or SQL Server remain a pillar in many IoT projects, especially when the collected data must join structured business records (user profile, treatment history, product configuration). Their structured normalized SQL model enables fast, reliable, consistent search, ideal for analytical dashboards, regulated applications and medical environments. Their key strength is preserving data integrity, which is critical in sensitive sectors. On the other hand, they sometimes underperform on massive real-time flows, and may require important server resources or a replication architecture to deliver scale. Maintained by the Postgres Global Development Group, the official PostgreSQL documentation indicates that logical replication and declarative partitioning cover the majority of IoT architectures up to several tens of thousands of events per second. For cache, broker and session layers, Redis remains a staple: see the Redis documentation for the data structures and AOF persistence model.

Hot, warm, cold retention strategy for sensor data Three retention tiers: hot full resolution over 7 days, warm downsampled over 90 days, cold archived over 7 years, with access latency and per-gigabyte cost annotations. IoT data retention pyramid: hot, warm, cold HOT full resolution 7 days Engine: InfluxDB or TimescaleDB Resolution: 1 sample per second Query latency: p99 below 50 ms Cost per GB: high WARM downsampled 90 days Engine: continuous aggregates Resolution: 1 minute or 1 hour mean Query latency: 100 to 500 ms Cost per GB: medium COLD long-term archive 5 to 7 years Engine: S3 Glacier or Parquet Resolution: 1 day mean Query latency: minutes to hours Cost per GB: very low automatic downsampling tiering to object storage Rule of thumb: one sample per second occupies about 30 GB per sensor per year uncompressed.
Figure 3. Hot data stays accessible in milliseconds but is expensive: tiering it down to warm and then cold divides storage cost while preserving access to historical data.

Hybrid databases: the best of both worlds

Hybrid databases are engines that combine the relational model with NoSQL or time-series capabilities, and they offer extra flexibility for multi-layer IoT architectures that must handle real-time flows and structured business data at the same time.

This is the case for solutions like TimescaleDB (PostgreSQL plus time series), CrateDB, or CockroachDB, which combine horizontal scaling with transactional logic. These engines let you ingest sensor flows continuously while keeping business data in a relational layer. You can also combine several databases inside the same project to take advantage of the specifics of each. Published by Timescale in 2020 and updated since, the benchmark TimescaleDB vs InfluxDB offers a detailed comparison of performance per use case (ingest vs analytics). For MongoDB, the document store often used for sensor metadata, the official MongoDB documentation documents the time-series patterns introduced in 5.0.

Result: you can correlate real-time data with complex metadata (user, environment, event) without sacrificing performance. The approach fits connected health systems, smart-building solutions and multi-sensor platforms. The drawback is increased technical complexity and a need for advanced skills to configure the architecture properly.

Field cases from real IoT projects

Field cases are recurring database trade-off archetypes that we observe regularly on industrial IoT projects, with the recommendations we draw from practice. On our test benches we have observed the following three patterns over and over.

  • Case 1: HA TimescaleDB cluster with automatic failover for sensor ingestion. Master-replica architecture with synchronous replication on the PostgreSQL WAL, hypertables partitioned by day, automated retention. Contrary to the assumption that "an HA cluster is configured at the end of the project", we recommend designing the failover topology at architecture time: replaying a failover incident on a 50,000-sensor live database is far more expensive than simulating it on a bench. In our lab we measured a failover time below 8 seconds with synchronous streaming replication.
  • Case 2: predictive maintenance, time-series vs relational. For a vibration-monitoring use case across several hundred pieces of equipment, we ruled out a pure relational database: aggregation queries over millions of samples per sensor saturated B-tree indexes. Despite the temptation to "just add an index", the right answer was TimescaleDB with hypertables plus continuous aggregates. On a recent project we observed a 40x speed-up on aggregation queries after migration to continuous aggregates. See our detailed guide on predictive maintenance ROI for IoT.
  • Case 3: undersized MQTT broker on the backend. A single-node Mosquitto held the line in QoS 0, but saturated as soon as we enabled QoS 1 with persistence. According to the MQTT 5.0 specification published by OASIS, QoS 1 requires a PUBACK acknowledgement per message, which moves the bottleneck to broker persistence. On LTE-M or satellite fleets (Kineis style), the uplink constraints further force gateway-side aggregation with a persistent buffer. For pure CoAP over UDP, specified by IETF RFC 7252 on LPWAN constraints (Sigfox or NB-IoT style), ingestion typically passes through an MQTT translator before the time-series database. Dedicated UDP ports for LwM2M and CoAP are listed in the IANA Service Name and Transport Protocol Port Number Registry. We recommend a HiveMQ cluster, or Mosquitto in bridge mode, as soon as the fleet exceeds a few thousand QoS 1 devices.

Tools, standards and named technical entities

Tools and technical standards are the set of software bricks, protocols and specifications that structure a data-centric IoT platform in production. The typical landscape combines several named bricks that we use in production: TimescaleDB (PostgreSQL hypertables), InfluxDB 2.0 (Flux query, TSM storage engine), QuestDB (very fast SIMD ingest), Apache Cassandra (AP in the CAP theorem, multi-datacenter), ChirpStack (open-source LoRaWAN backend), brokers Mosquitto and HiveMQ, Redis (cache plus pub/sub), and Grafana for the visualisation layer. On the standards side: SQL:2023, ACID properties, CAP theorem trade-offs (Consistency, Availability, Partition-tolerance), MQTT 5.0 (OASIS), and the file primitives plus inotify documented at kernel.org for ingestion layers running on embedded Linux gateways.

Contrary to the idea that "cloud equals MongoDB Atlas" or "cloud equals managed DynamoDB", a self-hosted TimescaleDB cluster (High Availability (HA), Write-Ahead Logging (WAL) streaming) often offers better operational control over IoT sensor ingestion: no surprise egress fees, full ownership of the execution plan, ability to size retention at the hypertable level. The real trade-off is not "managed vs self-hosted", it is operating cost per million points ingested over three years.

In-house benchmark instrumentation (2026 update). Our IoT lab benchmark capability includes a Tektronix oscilloscope equipped with the TekExpress compliance suite, which we use to correlate database write-rate cliffs with bus-level signal integrity issues on PCIe NVMe storage and on DDR4 memory paths. In our practice, this is decisive: when a TimescaleDB node drops from 280,000 to 90,000 points per second under load, the root cause is more often signal integrity on the storage backplane than the PostgreSQL plan itself. We have seen three projects rescued this way in the past 18 months, all aligned with the latest IEEE recommendations on storage-stack benchmarking.

Conclusion

To conclude, the IoT database decision is not a tooling choice, it is an architectural pillar that we lock in at specification time, alongside the radio stack and the EMC plan. Start with TimescaleDB on PostgreSQL when in doubt, push to InfluxDB or QuestDB once write-rate exceeds 200,000 points per second, archive to S3 Parquet beyond a 90-day window, and keep SQLite at the edge for the disconnected sensor. We recommend revisiting these thresholds with each Cyber Resilience Act milestone in 2026 and 2027, because the regulatory bar on sensor-data traceability will only rise.

IoT database decision tree by use case Four decision questions: real-time use, regulatory constraints, business cross-reference, batch ML, leading to SQLite, InfluxDB, TimescaleDB, S3 Parquet or datawarehouse. Decision tree: which database for which use case? IoT use case? starting point disconnected real-time plus history batch analytics Edge sensor or gateway? SQLite plus outbox local storage 600 KB binary resync MQTT QoS 1 Cross-reference business data? YES NO TimescaleDB hypertables plus JOIN users, equipment Grafana dashboard InfluxDB pure metrics monitoring plus alerting Flux query, retention Volume over 100M rows? YES NO S3 Parquet plus Trino open data lake GDPR archive 5 to 7 yrs minimal cost PostgreSQL pure relational plus analytic ext if moderate volume Full AESTECHNO pipeline SQLite sensor plus MQTT broker plus TimescaleDB hot or warm plus S3 cold Rule: start with TimescaleDB, migrate to Influx or S3 once volume or latency thresholds are crossed.
Figure 4. Four questions filter towards the right engine: edge SQLite, time-series Influx or Timescale for hot data, S3 data lake or Postgres for cold, and the full pipeline combines several stages.

How AESTECHNO helps you avoid the pitfalls

The AESTECHNO engagement is an end-to-end service that covers the full IoT value chain, from sensor design to cloud data architecture. This integrated approach guarantees coherence between hardware, firmware and the software layer.

Concrete field experience: top-tier HA cluster. At AESTECHNO we have designed and deployed a high-availability (HA) database cluster of top-tier class dedicated to IoT, capable of absorbing the continuous ingestion of sensor fleets with automatic failover, replication and production-grade resilience. This field experience helps us anticipate the real failure modes of an IoT architecture: what kills a project is not write latency but what happens when one node falls over at 3 a.m. while 50,000 sensors keep emitting.

At AESTECHNO, we do not stop at electronics. We anticipate the complete data lifecycle from the design phase:

  • Choice of a database architecture suited to your flows and constraints
  • HA cluster design with replication and automatic failover
  • Security of embedded and transmitted data
  • Local processing and edge computing to reduce uplink volume
  • Deferred sync and automatic recovery systems
  • Simulation of critical scenarios (network loss, overload, sensor failure)
  • Connection to your cloud or BI tooling for data valorisation

Let's discuss your project

Our free 30-minute audit is an architecture review that lets us assess the viability of your IoT storage strategy and steer you toward the most fitting database. Are you considering an IoT product and wondering about the best storage and exploitation strategy for your sensor data?

Reach out for a free feasibility study or a technical audit of your concept. AESTECHNO has delivered biomedical sensor projects and a range of medical devices.

Contact us to discuss your electronics project: Contact AESTECHNO

IoT project with data management? Free 30-min audit

Are you developing an IoT product that generates sensor data? Our experts can support you on:

  • Database architecture choice (time-series, NoSQL, relational)
  • Volume and retention sizing
  • Cloud vs edge computing strategy
  • End-to-end pipeline: sensor, cloud, dashboard

Book a slot

Why choose AESTECHNO?

  • 10+ years of expertise in IoT systems and embedded databases
  • 100% success rate on CE/FCC certifications
  • 65 projects delivered since 2022
  • Full architecture: sensor plus cloud plus dashboard
  • French design house based in Montpellier

Article written by Hugues Orgitello, electronic design engineer and founder of AESTECHNO. LinkedIn profile.

IoT data architecture: a strategic investment

IoT data architecture refers to the set of technical choices (database engine, ingestion pipeline, retention strategy, replication) that determine how sensor measurements are stored, transformed and made available to business applications.

At AESTECHNO we have observed that the data architecture choice is often pushed to second place in IoT projects, behind hardware and firmware. This is a strategic mistake. Data is what gives long-term value to a connected product, and a poorly thought-out architecture can compromise the entire programme.

From raw data to business value

A sensor that sends measurements which are not properly stored, aggregated and analyzed produces no value. The database is the foundation that turns measurement flows into actionable indicators: dashboards, alerts, compliance reports, or predictive maintenance. This is what differentiates a connected product from a plain sensor.

Data security and compliance

In regulated sectors (medical, industrial, energy), data management is not optional. Encryption, traceability, GDPR compliance, regulatory retention: all these constraints must be anticipated in the architecture. Our approach to industrial IoT cybersecurity integrates these requirements from the first design step, from the sensor to the cloud.

Connectivity and technology choices

The database choice also depends on the available connectivity. An LPWAN network (LoRaWAN, NB-IoT, Sigfox) imposes bandwidth constraints that directly influence the local storage, aggregation and synchronisation strategy. We size the data architecture in line with the network infrastructure to guarantee reliability and cost control. See our electronic design house methodology for the full design framework.

Related articles

Round out your IoT architecture with these technical articles:

FAQ: IoT databases and connected sensors

Which database should I pick for 1,000 sensors emitting every 5 minutes?
For 1,000 sensors x 12 measurements per hour x 24 hours = 288,000 data points per day, favour a time-series database such as InfluxDB or TimescaleDB. InfluxDB Cloud offers a free tier with up to 30 days retention for prototyping. At this scale, storage will represent about 500 MB per month (raw data). Our recommendation: TimescaleDB on PostgreSQL to combine time series and relational records (fleet management, alerts, users).

How do I manage data retention without blowing up storage cost?
Three complementary strategies. (1) Automatic downsampling: keep raw data 7 days, then aggregate hourly (90 days), then daily (5 years). (2) Compression: InfluxDB compresses time series to 2 to 10 percent of their original size. (3) Cold storage archival: migrate aged data to S3 Glacier. Result: a meaningful divide of storage cost while keeping historical data accessible.

Can I use a classic SQL database (MySQL, PostgreSQL) for IoT?
Yes, but with limits. For fewer than 100 sensors and a need for complex queries (joins, transactions), PostgreSQL with the TimescaleDB extension is excellent. Beyond 10,000 data points per second, traditional SQL databases hit their ceiling (oversized indexes, slow writes). At that point switch to InfluxDB (write-optimized) or Cassandra (massive horizontal scale-out). Our electronic design house sizes your IoT architecture end to end, from the sensor to the database cluster.

How do I guarantee sensor-data reliability in an industrial environment?
Five critical mechanisms. (1) Local buffer on the sensor (temporary storage when the network is down). (2) Sensor-side timestamping (not server-side) to track disconnections. (3) Anomaly detection: out-of-range values, silent sensors. (4) Multi-datacenter replication when criticality is high. (5) Audit trail: log every data modification. Our approach to IoT cybersecurity includes end-to-end encryption and mutual authentication.

Firebase Realtime Database vs InfluxDB: which one should I pick?
Firebase shines for consumer mobile apps with real-time sync (chat, collaborative dashboards) thanks to its NoSQL model and rich SDKs. But it is NOT optimized for time series: no automatic downsampling, costs ramp beyond 10 GB, analytical queries are limited. InfluxDB is built for industrial IoT: 10x compression, native time aggregations, integration with Grafana. Our recommendation: Firebase for the real-time front end plus InfluxDB or TimescaleDB for the analytics backend.