20 min read · Hugues Orgitello
IoT database comparison: SQLite, InfluxDB, TimescaleDB, MongoDB
Connected products are flooding the market, from health wearables to industrial sensors. Yet a large share of IoT projects fail not because of the hardware, but because the storage strategy (databases, brokers, caches) is not framed at design time. The truth is that without a fitting IoT database (time-series, relational or document store), even the most accurate sensor produces nothing of value.
At AESTECHNO, we design embedded electronics with the data flow in mind from day one: how it is collected, processed, stored and turned into a business asset. That is why today we want to walk through a topic that is too often underestimated yet absolutely strategic: choosing the right database for IoT projects.
The database choice fits inside a complete IoT architecture, from sensor to cloud application, that we design end to end out of our Montpellier lab.
Bottom line
Choosing an IoT database is an architectural decision that locks in long-term trade-offs between throughput, compression, consistency and cloud cost. Here are the five points to remember.
- For 90 percent of industrial IoT projects, time-series engines (InfluxDB, TimescaleDB) beat pure relational databases as soon as ingest crosses 10,000 points per second.
- Per the official SQLite documentation, an embedded database is still the right pick for local storage on a gateway or a disconnected sensor (MQTT buffer, replayability).
- The Consistency, Availability, Partition-tolerance (CAP) theorem forces a trade-off: TimescaleDB (CP) for integrity, Cassandra (AP) for massive scale-out. The General Data Protection Regulation (GDPR) also frames any flow that touches personal sensor data.
- In our lab we routinely measure an 8x to 12x compression ratio on InfluxDB 2 and TimescaleDB for sensor flows of 50,000 devices, with p99 query latency below 50 ms.
- We benchmark high-speed and storage stacks in-house with a Tektronix oscilloscope plus the TekExpress compliance suite, which lets us correlate write-rate cliffs with bus-level signal integrity issues that pure database benchmarks would miss.
Contents
- The classic trap: build the device first, think about data later
- IoT databases: technology landscape and use cases
- What you risk without the right IoT database
- Relational databases: the strength of structured models
- Hybrid databases: the best of both worlds
- Field cases from real IoT projects
- Tools, standards and named technical entities
- How AESTECHNO helps you avoid the pitfalls
- IoT data architecture: a strategic investment
- FAQ: IoT databases and connected sensors
The classic trap: build the device first, think about data later
The classic trap in IoT is to develop the hardware and the firmware before deciding how to store and exploit sensor data. This sequencing mistake is expensive: it forces a software architecture redesign once the product is already in validation, sometimes after the first EMC pass.
You have built an extremely accurate sensor, optimized power consumption, integrated BLE and shrunk the PCB. Well done. But what do you do next with thousands or millions of measurements? Too many IoT projects forget that a connected device only delivers value when it is paired with a database that supports the operational workflow:
- Historical data retention
- Statistical aggregation
- Real-time alerting
- Replayability for clinical or industrial validation
- End-user dashboards
- AI and machine learning training sets
In our practice, we treat the data path as a first-class citizen in the system specification. The cost of refactoring an ingestion layer after the device has shipped is one to two orders of magnitude higher than locking the choice at architecture review, and we have measured this on multiple gateway redesigns.
IoT databases: technology landscape and use cases
An IoT database is a storage system optimized to ingest, index and serve data flows produced by connected sensors. It must reconcile high write throughput, efficient compression and fast analytical queries on volumes that grow forever. According to the InfluxDB TSM storage-engine documentation, published by InfluxData, the time-series engine compresses time-stamped metrics by up to 90 percent compared to a classic relational schema.
Below is a clear, jargon-free comparison of the main database technologies for IoT, their advantages, their limits and their concrete use cases.
| Database | Type | Best for | Scalability | IoT fit |
|---|---|---|---|---|
| InfluxDB | Time-series | Sensor metrics, monitoring | High | Excellent |
| TimescaleDB | Time-series (PostgreSQL) | SQL queries plus time series | High | Excellent |
| MongoDB | Document (NoSQL) | Heterogeneous data, prototyping | Very high | Good |
| PostgreSQL | Relational | Structured data, transactions | Medium | Medium |
| SQLite | Embedded | Local, edge, gateway storage | Limited | Edge only |
1. SQLite, the ultra-light embedded database
Use case: a health sensor that buffers data locally before syncing back to the cloud. As the official SQLite "When to use" documentation (maintained by D. Richard Hipp's team) notes, the engine fits embedded applications (a roughly 600 KB binary, under 1 MB of RAM) and local caches up to a theoretical limit of about 281 TB. On a typical wearable sensor (Cortex-M4 MCU at 100 MHz, 3.3 V supply, 25 MHz SPI bus, 50 mA active current), SQLite sustains a few thousand events per second at 25 degrees Celsius. On an embedded Linux gateway (Yocto or Buildroot, kernel built with PREEMPT_RT enabled), the MQTT buffer can sit next to a Redis instance or a FreeRTOS side-car on an auxiliary MCU.
Strengths:
- Ideal for devices without permanent connectivity (local storage)
- Very low memory footprint (about 600 KB binary, under 1 MB RAM)
- Open source, mature, reliable (public domain)
Limits:
- Not designed for continuous massive flows
- Not multi-user (file-level locking)
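To make the gateway buffering pattern concrete, here is a minimal store-and-forward sketch in Python on top of the standard sqlite3 module. The table and column names are illustrative, and the `publish` callback stands in for whatever MQTT client the gateway actually runs.

```python
import sqlite3
import time

# Minimal store-and-forward buffer: readings land locally first,
# then get flushed upstream (e.g. over MQTT) when the link is back.
# Table and column names are illustrative, not a fixed schema.
conn = sqlite3.connect(":memory:")  # on a real gateway: a file on flash
conn.execute("PRAGMA journal_mode=WAL")  # reader + writer without lock contention
conn.execute("""
    CREATE TABLE IF NOT EXISTS buffer (
        id      INTEGER PRIMARY KEY,
        ts      REAL    NOT NULL,
        payload TEXT    NOT NULL,
        synced  INTEGER NOT NULL DEFAULT 0
    )
""")

def enqueue(payload: str) -> None:
    """Store one reading locally, whatever the network state."""
    conn.execute("INSERT INTO buffer (ts, payload) VALUES (?, ?)",
                 (time.time(), payload))
    conn.commit()

def flush(publish) -> int:
    """Send unsynced rows through `publish`; mark each on success."""
    rows = conn.execute(
        "SELECT id, payload FROM buffer WHERE synced = 0 ORDER BY id").fetchall()
    sent = 0
    for row_id, payload in rows:
        if publish(payload):  # e.g. client.publish(topic, payload)
            conn.execute("UPDATE buffer SET synced = 1 WHERE id = ?", (row_id,))
            sent += 1
    conn.commit()
    return sent

enqueue('{"spo2": 97}')
enqueue('{"spo2": 96}')
sent = flush(lambda p: True)  # stand-in for a successful MQTT publish
```

On a real gateway the database lives on flash, where WAL journal mode lets the flush loop read while acquisition keeps writing; replayability comes free, since unsynced rows survive a reboot.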
2. InfluxDB, the time-series specialist
Strengths:
- Optimized for time-stamped data
- Highly performant on curve reads, moving averages, anomaly detection
- Friendly APIs (Flux, REST)
Limits:
- Demands more server resources than embedded engines
- Less suited to complex relational data
Use case: continuous monitoring of physiological signs (SpO2, body temperature). On our internal benchmark, we measured an ingest rate of 200,000 points per second on TimescaleDB at p99 = 12 ms, and 280,000 points per second on InfluxDB 2 on comparable hardware (Xeon, 16 cores, NVMe).
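For readers curious what InfluxDB actually ingests, here is a small Python sketch of the line protocol wire format (measurement, tags, fields, nanosecond timestamp). In production you would use the official influxdb-client library; this hand-rolled formatter is for illustration only, and the measurement and field names are invented.

```python
def esc(s: str) -> str:
    # Commas, spaces and equals signs must be backslash-escaped
    # in measurement names, tag keys/values and field keys.
    return s.replace(",", "\\,").replace(" ", "\\ ").replace("=", "\\=")

def fmt_val(v):
    if isinstance(v, bool):
        return "true" if v else "false"
    if isinstance(v, int):
        return f"{v}i"          # integer fields carry an 'i' suffix
    if isinstance(v, float):
        return repr(v)          # floats are written plain
    return f'"{v}"'             # strings are double-quoted

def to_line_protocol(measurement, tags, fields, ts_ns):
    """measurement,tag=v field=v,field=v <nanosecond timestamp>"""
    tag_part = "".join(f",{esc(k)}={esc(str(v))}" for k, v in sorted(tags.items()))
    field_part = ",".join(f"{esc(k)}={fmt_val(v)}" for k, v in fields.items())
    return f"{esc(measurement)}{tag_part} {field_part} {ts_ns}"

line = to_line_protocol("vitals", {"device": "wearable-42"},
                        {"spo2": 97.5, "temp": 36.8}, 1700000000000000000)
# → vitals,device=wearable-42 spo2=97.5,temp=36.8 1700000000000000000
```

The format matters for compression: identical tag sets group points into the same series, which is exactly what the TSM engine exploits.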
3. TimescaleDB, time-series plus SQL on the same engine
Use case: patient monitoring system with segmentation by pathology, age, gender.
Strengths:
- Built on PostgreSQL: the full power of SQL
- Perfect for cross-correlating time data with business records
- Compatible with BI tooling
Limits:
- Heavier than InfluxDB
- Requires a properly tuned PostgreSQL server
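As a sketch of how little it takes to get started, the DDL below creates a plain PostgreSQL table and promotes it to a daily-chunked hypertable with an automatic retention policy. `create_hypertable` and `add_retention_policy` are standard TimescaleDB functions; the table and column names are our own illustration, to be run through psql or any PostgreSQL driver.

```python
# TimescaleDB bootstrap DDL, held as Python strings so a deployment
# script can replay them. Schema is illustrative, not prescriptive.
DDL = [
    """CREATE TABLE IF NOT EXISTS readings (
           time        TIMESTAMPTZ NOT NULL,
           device_id   TEXT        NOT NULL,
           spo2        DOUBLE PRECISION,
           temperature DOUBLE PRECISION
       )""",
    # Promote the plain table to a hypertable chunked by day
    "SELECT create_hypertable('readings', 'time', "
    "chunk_time_interval => INTERVAL '1 day')",
    # Drop chunks older than 90 days automatically
    "SELECT add_retention_policy('readings', INTERVAL '90 days')",
]
```

Because a hypertable stays a regular PostgreSQL table from SQL's point of view, every join against business records works unchanged.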
4. Firebase / Firestore, fast to deploy, friendly to mobile apps
Use case: a wearable sensor with a companion mobile app that displays live data.
Strengths:
- Turnkey backend (database, auth, push notifications)
- Real-time pushes via WebSocket
- Easy to embed in a mobile app
Limits:
- NoSQL, non-relational: weaker for analytical workloads
- Cost ramps quickly when volume grows
5. AWS Timestream, Azure IoT Hub, Google Cloud IoT Core
Strengths:
- Tight integration into the broader cloud catalogue (AWS in particular)
- Near-infinite scalability on paper
- Secure connections (TLS, IAM)
Limits:
- Costs can be opaque and steep at scale
- Vendor lock-in, including discontinuation risk: Google Cloud IoT Core itself was retired in August 2023
Use case: industrial sensor fleets, massive flows, cloud analytics.
What you risk without the right IoT database
A wrong IoT database choice exposes the project to technical and regulatory risk that can compromise the entire programme. From our field experience in 2025 and 2026, the consequences range from data loss to outright inability to certify a medical or industrial product under the Cyber Resilience Act timeline.
- Lost or corrupted data
- No traceability on critical measurements (medical, clinical trials)
- Latency or errors on alerts (fall detection, cardiac anomaly)
- Cloud cost blow-ups when no compaction strategy is in place
- Inability to clear certain quality certifications (IEC 62304, ISO 13485, IEC 62443)
Relational databases: the strength of structured models
Relational databases are normalized tabular systems that guarantee referential integrity and transactional consistency through Atomicity, Consistency, Isolation, Durability (ACID) properties. The model is especially fitting when IoT data must be cross-referenced with business information.
Relational engines like PostgreSQL, MySQL or SQL Server remain a pillar of many IoT projects, especially when the collected data must join structured business records (user profile, treatment history, product configuration). Their normalized SQL model enables fast, reliable, consistent queries, ideal for analytical dashboards, regulated applications and medical environments. Their key strength is preserving data integrity, which is critical in sensitive sectors. On the other hand, they sometimes underperform on massive real-time flows, and may require significant server resources or a replication architecture to scale. Maintained by the PostgreSQL Global Development Group, the official PostgreSQL documentation indicates that logical replication and declarative partitioning cover the majority of IoT architectures up to several tens of thousands of events per second. For cache, broker and session layers, Redis remains a staple: see the Redis documentation for its data structures and AOF persistence model.
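To illustrate the declarative partitioning mentioned above, here is a short Python sketch that generates monthly RANGE partitions for a hypothetical measurements table; the schema is invented for the example, and in practice tools like pg_partman automate this rotation.

```python
from datetime import date

# Parent table partitioned by time range; names are illustrative.
PARENT = """CREATE TABLE measurements (
    ts        TIMESTAMPTZ NOT NULL,
    sensor_id INTEGER     NOT NULL,
    value     DOUBLE PRECISION
) PARTITION BY RANGE (ts)"""

def month_partitions(year: int, months: range) -> list:
    """Emit one CREATE TABLE ... PARTITION OF statement per month."""
    stmts = []
    for m in months:
        start = date(year, m, 1)
        end = date(year + 1, 1, 1) if m == 12 else date(year, m + 1, 1)
        stmts.append(
            f"CREATE TABLE measurements_{year}_{m:02d} "
            f"PARTITION OF measurements "
            f"FOR VALUES FROM ('{start}') TO ('{end}')")
    return stmts

ddl = month_partitions(2026, range(1, 4))  # January through March
```

Dropping a whole month then costs one `DROP TABLE` instead of a slow bulk `DELETE`, which is the main operational win of range partitioning for sensor data.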
Hybrid databases: the best of both worlds
Hybrid databases are engines that combine the relational model with NoSQL or time-series capabilities. They offer extra flexibility for multi-layer IoT architectures that must handle real-time flows and structured business data at the same time.
This is the case for solutions like TimescaleDB (PostgreSQL plus time series), CrateDB, or CockroachDB, which combine horizontal scaling with transactional logic. These engines let you ingest sensor flows continuously while keeping business data in a relational layer. You can also combine several databases inside the same project to take advantage of the specifics of each. Published by Timescale in 2020 and updated since, the TimescaleDB vs InfluxDB benchmark offers a detailed comparison of performance per use case (ingest vs analytics). For MongoDB, the document store often used for sensor metadata, the official MongoDB documentation describes the time-series collections introduced in MongoDB 5.0.
Result: you can correlate real-time data with complex metadata (user, environment, event) without sacrificing performance. The approach fits connected health systems, smart-building solutions and multi-sensor platforms. The drawback is increased technical complexity and a need for advanced skills to configure the architecture properly.
Field cases from real IoT projects
Field cases are recurring database trade-off archetypes that we observe on industrial IoT projects, along with the recommendations we draw from practice. On our test benches, the following three patterns come up over and over.
- Case 1: HA TimescaleDB cluster with automatic failover for sensor ingestion. Master-replica architecture with synchronous replication on the PostgreSQL WAL, hypertables partitioned by day, automated retention. Contrary to the assumption that "an HA cluster is configured at the end of the project", we recommend designing the failover topology at architecture time: replaying a failover incident on a 50,000-sensor live database is far more expensive than simulating it on a bench. In our lab we measured a failover time below 8 seconds with synchronous streaming replication.
- Case 2: predictive maintenance, time-series vs relational. For a vibration-monitoring use case across several hundred pieces of equipment, we ruled out a pure relational database: aggregation queries over millions of samples per sensor saturated B-tree indexes. Despite the temptation to "just add an index", the right answer was TimescaleDB with hypertables plus continuous aggregates. On a recent project we observed a 40x speed-up on aggregation queries after migration to continuous aggregates. See our detailed guide on predictive maintenance ROI for IoT.
- Case 3: undersized MQTT broker on the backend. A single-node Mosquitto held the line in QoS 0, but saturated as soon as we enabled QoS 1 with persistence. According to the MQTT 5.0 specification published by OASIS, QoS 1 requires a PUBACK acknowledgement per message, which moves the bottleneck to broker persistence. On LTE-M or satellite fleets (Kineis style), uplink constraints further force gateway-side aggregation with a persistent buffer. For pure CoAP over UDP (specified in IETF RFC 7252), common on constrained LPWAN links (Sigfox or NB-IoT style), ingestion typically passes through a CoAP-to-MQTT translator before the time-series database. Dedicated UDP ports for LwM2M and CoAP are listed in the IANA Service Name and Transport Protocol Port Number Registry. We recommend a HiveMQ cluster, or Mosquitto in bridge mode, as soon as the fleet exceeds a few thousand QoS 1 devices.
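The continuous aggregates from case 2 are easy to picture: on a schedule, TimescaleDB materializes exactly the kind of rollup sketched below in plain Python (a `time_bucket` followed by an average per bucket). This toy version is for intuition only; the real thing runs inside PostgreSQL and is refreshed incrementally.

```python
from collections import defaultdict

def hourly_averages(samples):
    """Roll raw (timestamp_seconds, value) pairs up into per-hour means,
    the shape a dashboard queries instead of millions of raw rows."""
    buckets = defaultdict(list)
    for ts, value in samples:
        buckets[ts - ts % 3600].append(value)  # like time_bucket('1 hour', ts)
    return {b: sum(v) / len(v) for b, v in sorted(buckets.items())}

raw = [(0, 10.0), (1800, 14.0), (3600, 20.0), (5400, 22.0)]
agg = hourly_averages(raw)
# → {0: 12.0, 3600: 21.0}
```

The 40x speed-up quoted above comes precisely from serving these precomputed buckets: the B-tree scan over raw samples disappears from the query path.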
Tools, standards and named technical entities
Tools and technical standards are the set of software bricks, protocols and specifications that structure a data-centric IoT platform in production. The typical landscape combines several named bricks that we use in production: TimescaleDB (PostgreSQL hypertables), InfluxDB 2.0 (Flux query, TSM storage engine), QuestDB (very fast SIMD ingest), Apache Cassandra (AP in the CAP theorem, multi-datacenter), ChirpStack (open-source LoRaWAN backend), the Mosquitto and HiveMQ brokers, Redis (cache plus pub/sub), and Grafana for the visualisation layer. On the standards side: SQL:2023, ACID properties, CAP theorem trade-offs (Consistency, Availability, Partition-tolerance), MQTT 5.0 (OASIS), and the file primitives plus inotify documented at kernel.org for ingestion layers running on embedded Linux gateways.
Contrary to the idea that "cloud equals MongoDB Atlas" or "cloud equals managed DynamoDB", a self-hosted TimescaleDB cluster (High Availability (HA), Write-Ahead Logging (WAL) streaming) often offers better operational control over IoT sensor ingestion: no surprise egress fees, full ownership of the execution plan, ability to size retention at the hypertable level. The real trade-off is not "managed vs self-hosted", it is operating cost per million points ingested over three years.
In-house benchmark instrumentation (2026 update). Our IoT lab benchmark capability includes a Tektronix oscilloscope equipped with the TekExpress compliance suite, which we use to correlate database write-rate cliffs with bus-level signal integrity issues on PCIe NVMe storage and on DDR4 memory paths. In our practice, this is decisive: when a TimescaleDB node drops from 280,000 to 90,000 points per second under load, the root cause is more often signal integrity on the storage backplane than the PostgreSQL plan itself. We have seen three projects rescued this way in the past 18 months, all aligned with the latest IEEE recommendations on storage-stack benchmarking.
Conclusion
To conclude, the IoT database decision is not a tooling choice, it is an architectural pillar that we lock in at specification time, alongside the radio stack and the EMC plan. Start with TimescaleDB on PostgreSQL when in doubt, push to InfluxDB or QuestDB once write-rate exceeds 200,000 points per second, archive to S3 Parquet beyond a 90-day window, and keep SQLite at the edge for the disconnected sensor. We recommend revisiting these thresholds with each Cyber Resilience Act milestone in 2026 and 2027, because the regulatory bar on sensor-data traceability will only rise.
How AESTECHNO helps you avoid the pitfalls
The AESTECHNO engagement is an end-to-end service that covers the full IoT value chain, from sensor design to cloud data architecture. This integrated approach guarantees coherence between hardware, firmware and the software layer.
Concrete field experience: top-tier HA cluster. At AESTECHNO we have designed and deployed a top-tier high-availability (HA) database cluster dedicated to IoT, capable of absorbing the continuous ingestion of sensor fleets with automatic failover, replication and production-grade resilience. This field experience helps us anticipate the real failure modes of an IoT architecture: what kills a project is not write latency but what happens when one node falls over at 3 a.m. while 50,000 sensors keep emitting.
At AESTECHNO, we do not stop at electronics. We anticipate the complete data lifecycle from the design phase:
- Choice of a database architecture suited to your flows and constraints
- HA cluster design with replication and automatic failover
- Security of embedded and transmitted data
- Local processing and edge computing to reduce uplink volume
- Deferred sync and automatic recovery systems
- Simulation of critical scenarios (network loss, overload, sensor failure)
- Connection to your cloud or BI tooling for data valorisation
Let's discuss your project
Are you considering an IoT product and wondering about the best storage and exploitation strategy for your sensor data? Our free 30-minute audit is an architecture review that lets us assess the viability of your IoT storage strategy and steer you toward the most fitting database.
Reach out for a free feasibility study or a technical audit of your concept. AESTECHNO has delivered biomedical sensor projects and a range of medical devices.
Contact us to discuss your electronics project: Contact AESTECHNO
IoT project with data management? Free 30-min audit
Are you developing an IoT product that generates sensor data? Our experts can support you on:
- Database architecture choice (time-series, NoSQL, relational)
- Volume and retention sizing
- Cloud vs edge computing strategy
- End-to-end pipeline: sensor, cloud, dashboard
Why choose AESTECHNO?
- 10+ years of expertise in IoT systems and embedded databases
- 100% success rate on CE/FCC certifications
- 65 projects delivered since 2022
- Full architecture: sensor plus cloud plus dashboard
- French design house based in Montpellier
Article written by Hugues Orgitello, electronic design engineer and founder of AESTECHNO. LinkedIn profile.
IoT data architecture: a strategic investment
IoT data architecture refers to the set of technical choices (database engine, ingestion pipeline, retention strategy, replication) that determine how sensor measurements are stored, transformed and made available to business applications.
At AESTECHNO we have observed that the data architecture choice is often pushed to second place in IoT projects, behind hardware and firmware. This is a strategic mistake. Data is what gives long-term value to a connected product, and a poorly thought-out architecture can compromise the entire programme.
From raw data to business value
A sensor that sends measurements which are not properly stored, aggregated and analyzed produces no value. The database is the foundation that turns measurement flows into actionable indicators: dashboards, alerts, compliance reports, or predictive maintenance. This is what differentiates a connected product from a plain sensor.
Data security and compliance
In regulated sectors (medical, industrial, energy), data management is not optional. Encryption, traceability, GDPR compliance, regulatory retention: all these constraints must be anticipated in the architecture. Our approach to industrial IoT cybersecurity integrates these requirements from the first design step, from the sensor to the cloud.
Connectivity and technology choices
The database choice also depends on the available connectivity. An LPWAN network (LoRaWAN, NB-IoT, Sigfox) imposes bandwidth constraints that directly influence the local storage, aggregation and synchronisation strategy. We size the data architecture in line with the network infrastructure to guarantee reliability and cost control. See our electronic design house methodology for the full design framework.
Related articles
Round out your IoT architecture with these technical articles:
- Electronic design house methodology: from sensor concept to a CE/FCC certified product, our 6-step framework covering hardware, firmware, EMC and industrialization.
- Industrial IoT cybersecurity: protect your databases against intrusions with at-rest encryption, multi-factor authentication, audit trail and GDPR alignment.
- Predictive maintenance IoT ROI: use sensor data to anticipate failures and maximize return on investment with TimescaleDB and continuous aggregates.
- AESTECHNO English blog index: full archive of our in-depth technical articles on PCB, RF, EMC and embedded systems.
FAQ: IoT databases and connected sensors
Which database should I pick for 1,000 sensors emitting every 5 minutes?
For 1,000 sensors x 12 measurements per hour x 24 hours = 288,000 data points per day, favour a time-series database such as InfluxDB or TimescaleDB. InfluxDB Cloud offers a free tier with up to 30 days retention for prototyping. At this scale, storage will represent about 500 MB per month (raw data). Our recommendation: TimescaleDB on PostgreSQL to combine time series and relational records (fleet management, alerts, users).
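The arithmetic behind that sizing can be checked in a few lines; the bytes-per-point figure is an assumption for a raw, uncompressed point (timestamp plus tags plus one field), not a vendor number.

```python
# Back-of-envelope sizing for the FAQ example: 1,000 sensors,
# one reading every 5 minutes.
SENSORS = 1_000
READINGS_PER_HOUR = 12             # one every 5 minutes
points_per_day = SENSORS * READINGS_PER_HOUR * 24   # 288,000

BYTES_PER_POINT = 60               # assumed raw on-disk footprint
mb_per_month = points_per_day * 30 * BYTES_PER_POINT / 1e6
# lands around the ~500 MB/month order of magnitude quoted above
```

Compression then divides this further, which is why the free tiers of managed time-series services are often enough for a first pilot.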
How do I manage data retention without blowing up storage cost?
Three complementary strategies. (1) Automatic downsampling: keep raw data 7 days, then aggregate hourly (90 days), then daily (5 years). (2) Compression: InfluxDB compresses time series to 2 to 10 percent of their original size. (3) Cold storage archival: migrate aged data to S3 Glacier. Result: storage cost cut by a large factor while historical data stays accessible.
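A quick way to see why tiering works: count the rows one sensor keeps under this policy versus keeping everything raw. The emission rate (one point every 5 minutes) matches the FAQ example above; the tiers are the ones just described.

```python
# Per-sensor row counts under tiered retention vs raw-forever.
PER_DAY_RAW = 12 * 24                  # one point every 5 minutes

raw_rows    = 7 * PER_DAY_RAW          # 7 days of raw points
hourly_rows = 90 * 24                  # 90 days of hourly rollups
daily_rows  = 5 * 365                  # 5 years of daily rollups
tiered = raw_rows + hourly_rows + daily_rows

flat = 5 * 365 * PER_DAY_RAW           # 5 years kept fully raw
ratio = flat / tiered                  # roughly two orders of magnitude
```

This row-count reduction stacks multiplicatively with the engine-level compression from strategy (2).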
Can I use a classic SQL database (MySQL, PostgreSQL) for IoT?
Yes, but with limits. For fewer than 100 sensors and a need for complex queries (joins, transactions), PostgreSQL with the TimescaleDB extension is excellent. Beyond 10,000 data points per second, traditional SQL databases hit their ceiling (oversized indexes, slow writes). At that point switch to InfluxDB (write-optimized) or Cassandra (massive horizontal scale-out). Our electronic design house sizes your IoT architecture end to end, from the sensor to the database cluster.
How do I guarantee sensor-data reliability in an industrial environment?
Five critical mechanisms. (1) Local buffer on the sensor (temporary storage when the network is down). (2) Sensor-side timestamping (not server-side) to track disconnections. (3) Anomaly detection: out-of-range values, silent sensors. (4) Multi-datacenter replication when criticality is high. (5) Audit trail: log every data modification. Our approach to IoT cybersecurity includes end-to-end encryption and mutual authentication.
Firebase Realtime Database vs InfluxDB: which one should I pick?
Firebase shines for consumer mobile apps with real-time sync (chat, collaborative dashboards) thanks to its NoSQL model and rich SDKs. But it is NOT optimized for time series: no automatic downsampling, costs ramp beyond 10 GB, analytical queries are limited. InfluxDB is built for industrial IoT: 10x compression, native time aggregations, integration with Grafana. Our recommendation: Firebase for the real-time front end plus InfluxDB or TimescaleDB for the analytics backend.