How to Handle Out-of-Order Data in Your IoT Pipeline

2023-08-22

Illustration: © IoT For All

Say you are a vertical manager at a logistics company. Knowing the value of proactive anomaly detection, you implement a real-time IoT system that generates streaming data, not just occasional batch reports. Now you’ll be able to get aggregated analytics data in real time.   

But can you really trust the data? 

If some of your data looks odd, it’s possible that something went wrong in your IoT data pipeline. Often, these errors are the result of out-of-order data, one of the most vexing IoT data issues in today’s streaming systems. 

Business insight can only tell an accurate story when it relies on quality data that you can trust. The meaning depends not just on a series of events, but on the order in which they occur. Get the order wrong, and the story changes—and false reports won’t help you optimize asset utilization or discover the source of anomalies. That’s what makes out-of-order data such a problem as IoT data feeds your real-time systems. 

So why does streaming IoT data tend to show up out of order? More importantly, how do you build a system that offers better IoT data quality? Keep reading to find out. 

The Causes of Out-of-Order Data in IoT Platforms

In an IoT system, data originates with devices. It travels over some form of connectivity. Finally, it arrives at a centralized destination, like a data warehouse that feeds into applications or IoT data analytics platforms.

The most common cause of out-of-order data relates to the first two links of this IoT chain. The IoT device may send data out of order because it’s operating in battery-save mode, or due to poor-quality design. The device may also lose connectivity for a period of time.

The device might travel outside of a cellular network’s coverage area (think “high seas” or “military areas jamming all signals”), or it might simply crash and then reboot. Either way, it’s programmed to send its stored data once it re-establishes a connection and is told to upload. That might not be anywhere near the time that it recorded a measurement or GPS position. You end up with an event logged hours or more after it actually occurred.
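To make the effect concrete, here is a minimal Python sketch. It assumes each reading carries a device-side timestamp; the field names (`event_time`, `km`) and the sample values are hypothetical, not taken from any particular device. Readings buffered during an outage arrive after a newer one, so arrival order and event order disagree.

```python
from datetime import datetime, timedelta

# Hypothetical readings, listed in the order they ARRIVED at the server.
# Assumed scenario: the device lost coverage after 10:00, reconnected at
# 10:20, sent its current reading first, then flushed its buffer.
t0 = datetime(2023, 8, 22, 10, 0)
arrivals = [
    {"event_time": t0, "km": 0.0},
    {"event_time": t0 + timedelta(minutes=20), "km": 12.0},  # sent on reconnect
    {"event_time": t0 + timedelta(minutes=5), "km": 3.0},    # buffered, arrives late
    {"event_time": t0 + timedelta(minutes=10), "km": 6.0},   # buffered, arrives late
]


def count_out_of_order(readings):
    """Count readings whose event_time is older than one that already arrived."""
    latest_seen = None
    late = 0
    for reading in readings:
        if latest_seen is not None and reading["event_time"] < latest_seen:
            late += 1
        else:
            latest_seen = reading["event_time"]
    return late


def in_event_order(readings):
    """Return the readings sorted by when they were measured, not received."""
    return sorted(readings, key=lambda r: r["event_time"])


late_count = count_out_of_order(arrivals)
ordered = in_event_order(arrivals)
```

Read in arrival order, the odometer appears to jump to 12 km and then fall back to 3 km. Sorted by `event_time`, the same readings describe a steady trip.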

But connectivity lapses aren’t the only cause of out-of-order (and otherwise noisy) data. Many devices are programmed to extrapolate when they fail to capture real-world readings. When you’re looking at a database, there’s no indication of which entries reflect actual measurements and which are just the device’s best guess. This is an unfortunately common problem. To comply with service level agreements, device manufacturers may program their products to send data according to a set schedule—whether there’s an accurate sensor reading or not.      

The bad news is that you can’t prevent these data-flow interruptions, at least not in today’s IoT landscape. But there’s good news, too. There are methods of processing streaming data that limit the impact of out-of-order data. That brings us to the solution for this persistent data-handling challenge.   

Fixing Data Errors Caused by Out-of-Order Logging

You can’t build a real-time IoT system without a real-time data processing engine—and not all of these engines offer the same suite of services. As you compare data processing frameworks for your streaming IoT pipeline, look for three features that keep out-of-order data from polluting your logs:  

  1. Bitemporal modeling. This is a fancy term for the ability to track an IoT device’s event readings along two timelines at once. The system applies one timestamp at the moment of the measurement. It applies a second the instant the data gets recorded in your database. That gives you (or your analytics applications) the ability to spot lapses between a device recording a measurement and that data reaching your database.
  2. Support for data backfilling. Your data processing engine should support later corrections to data entries in a mutable database (i.e., one that allows rewriting over data fields). To support the most accurate readings, your data processing framework should also accept multiple sources, including streams and static data.
  3. Smart data processing logic. The most advanced data processing engine doesn’t just create a pipeline; it also layers machine learning capabilities onto streaming data. That allows the streaming system to simultaneously debug and process data as it moves from the device to your warehouse.
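The first feature, bitemporal modeling, can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than any engine's actual API: the `Reading` class, its field names, and the five-minute `MAX_LAG` threshold are all hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass(frozen=True)
class Reading:
    """A reading tracked on two timelines (hypothetical schema)."""
    device_id: str
    event_time: datetime   # when the device took the measurement
    ingest_time: datetime  # when the pipeline recorded it in the database
    value: float

    @property
    def lag(self) -> timedelta:
        """Gap between measuring and recording."""
        return self.ingest_time - self.event_time


# Assumed threshold; a real pipeline would tune this per device type.
MAX_LAG = timedelta(minutes=5)


def is_late(reading: Reading, max_lag: timedelta = MAX_LAG) -> bool:
    """Flag readings that reached the database long after they were measured."""
    return reading.lag > max_lag


measured = datetime(2023, 8, 22, 10, 0, tzinfo=timezone.utc)
on_time = Reading("truck-7", measured, measured + timedelta(seconds=2), 4.2)
delayed = Reading("truck-7", measured, measured + timedelta(hours=3, minutes=30), 4.2)
```

Because both timestamps are stored, a query can order events by `event_time` for analysis while still using `ingest_time` to audit when the pipeline actually learned about each one.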

With these three capabilities operating in tandem, you can build an IoT system that flags—or even corrects—out-of-order data before it can cause problems. All you have to do is choose the right tool for the job. 
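As a rough illustration of backfilling, the sketch below uses an in-memory dictionary as a stand-in for a mutable database. The class name `ReadingStore`, its methods, and the sample values are hypothetical. Readings are keyed by device and event time, so a late arrival or a later correction simply overwrites or fills in the right slot, and aggregates computed afterwards reflect the corrected history.

```python
from datetime import datetime


class ReadingStore:
    """In-memory stand-in for a mutable store keyed by (device_id, event_time)."""

    def __init__(self):
        self._values = {}

    def upsert(self, device_id, event_time, value):
        """Insert a reading, or overwrite the one recorded for the same instant."""
        self._values[(device_id, event_time)] = value

    def hourly_average(self, device_id, hour_start):
        """Average all readings whose event_time falls in the given hour."""
        values = [
            value
            for (dev, ts), value in self._values.items()
            if dev == device_id
            and ts.replace(minute=0, second=0, microsecond=0) == hour_start
        ]
        return sum(values) / len(values) if values else None


store = ReadingStore()
hour = datetime(2023, 8, 22, 10, 0)

store.upsert("sensor-1", datetime(2023, 8, 22, 10, 10), 20.0)
store.upsert("sensor-1", datetime(2023, 8, 22, 10, 50), 30.0)
before_backfill = store.hourly_average("sensor-1", hour)

# A reading measured at 10:30 arrives hours later and is backfilled.
store.upsert("sensor-1", datetime(2023, 8, 22, 10, 30), 40.0)
after_late_arrival = store.hourly_average("sensor-1", hour)

# A corrected value for 10:10 replaces the earlier entry.
store.upsert("sensor-1", datetime(2023, 8, 22, 10, 10), 26.0)
after_correction = store.hourly_average("sensor-1", hour)
```

Recomputing the average on every read keeps the sketch short. A production engine would more likely update the affected window incrementally.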

What kind of tool, you ask? Look for a unified real-time data processing engine with a rich ML library covering the unique needs of the type of data you are processing. That may sound like a big ask, but the real-time IoT framework you’re looking for is available now, at this very moment—the one time that’s never out of order. 


  • Data Analytics
  • Big Data
  • Connectivity
  • Quality Management
