Design Practices: AWS IoT Solutions
As IoT devices proliferate, businesses need a solution to collect, store and analyze their devices' data. Amazon Web Services provides several useful tools for designing strong data pipelines for IoT devices.
The Internet of Things (IoT) gives every industry an opportunity to address its business challenges. As devices proliferate, businesses need a solution to connect them and to collect, store and analyze their data. Amazon Web Services provides services that help connected devices interact easily and securely with cloud applications and with other devices across a range of use cases. Solution architects in the field are generally familiar with the capabilities and reliability of the AWS Cloud. Migrating or designing IoT solutions on the AWS platform lets a team focus on its core business without the burden of infrastructure management and monitoring, and it helps keep the solution highly available to customers. Whatever solution is designed, it should run on a platform that keeps it stable, and AWS is one such platform.
There are a few practices to consider when designing IoT solutions with AWS. When the AWS services chosen match the customer's requirements, IoT solutions can deliver results in a more secure, reliable and scalable manner.
Design to Operate at Scale Reliably
IoT systems must handle high-velocity, high-volume data captured by devices and gateways. A surge of incoming data can result from sudden business growth or, sometimes, from a malicious attack. The cloud architecture must be able to scale to absorb such surges.
The best approach is to queue incoming data and buffer it in a real-time in-memory database before storing it. This supports real-time event handling and throttles the insertion rate, so that the database neither crashes nor responds slowly.
The device can publish data to Amazon Kinesis directly, or an AWS IoT rule can forward data to Amazon SQS and Kinesis, from which it is persisted as time-series data in stores such as Amazon S3, Amazon Redshift, a data lake or Elasticsearch. These data stores can then feed custom dashboards or Amazon QuickSight dashboards.
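The buffering idea can be sketched in a few lines of Python. This is a minimal in-memory illustration, not an AWS API: the class name `BatchingBuffer`, its parameters and the `sink` callable (standing in for a database or stream writer) are all hypothetical.

```python
from collections import deque
from typing import Callable, Deque, Dict, List

Message = Dict[str, object]


class BatchingBuffer:
    """Bounded in-memory buffer that writes messages to a sink in batches.

    Hypothetical sketch: `sink` stands in for whatever writes to the data store.
    """

    def __init__(self, sink: Callable[[List[Message]], None],
                 batch_size: int = 100, max_pending: int = 10_000) -> None:
        self._sink = sink
        self._batch_size = batch_size
        self._max_pending = max_pending
        self._pending: Deque[Message] = deque()
        self.rejected = 0  # messages refused because the buffer was full

    def put(self, message: Message) -> bool:
        """Accept a message, or reject it when the buffer is full."""
        if len(self._pending) >= self._max_pending:
            self.rejected += 1
            return False
        self._pending.append(message)
        return True

    def drain(self, max_batches: int = 1) -> int:
        """Write up to max_batches batches to the sink; return messages written."""
        written = 0
        for _ in range(max_batches):
            if not self._pending:
                break
            count = min(self._batch_size, len(self._pending))
            batch = [self._pending.popleft() for _ in range(count)]
            self._sink(batch)
            written += len(batch)
        return written
```

Calling `drain` on a timer caps the write rate at `batch_size` messages per tick regardless of how fast devices publish, while `max_pending` bounds memory use during a flood. In a real deployment a managed queue or stream would play this role.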
Route Large Data Volumes Through Data Pipelines
Sending incoming data from device topics directly into a single service prevents the system from scaling fully. Such an approach can also limit the availability of the system during failures and data floods.
The AWS IoT Rules Engine is designed to connect endpoints to AWS IoT Core in a scalable way. However, each AWS service has its own data flow properties, with its own pros and cons, and not every service is suitable as a single point of entry to the system. A poorly chosen entry point can cause downstream failures with no way to recover. For high-volume data, for example, consider buffering (Amazon ElastiCache) or queuing (Amazon SQS) the incoming data before invoking other services, so that the system can recover from downstream failures.
The AWS IoT Rules Engine can trigger multiple AWS services, such as Lambda, S3, Kinesis, SQS or SNS, in parallel. Once the IoT system has captured the data, these AWS endpoints (other AWS services) can process and transform it, which allows data to be stored in multiple data stores simultaneously. The safest way to ensure all data is processed and stored is to redirect all device topic data to an Amazon SNS topic, which is designed to handle data floods, so that incoming data is reliably retained, processed and delivered to the proper channel. For more scalability, separate SNS topics, SQS queues and Lambda functions can be used for individual device topics or groups of device topics. Consider storing the data in durable storage such as a queue, Amazon Kinesis, Amazon S3 or Amazon Redshift before processing it. This practice prevents data loss caused by message floods, unexpected exceptions in code or deployment issues.
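As an illustration of this fan-out, the sketch below builds a topic rule payload that sends each message to an SQS queue and a Kinesis stream in parallel, with an S3 error action as a fallback. The function name and every argument value are hypothetical. The field names follow the payload shape documented for the AWS IoT CreateTopicRule API and should be checked against the current API reference before use.

```python
from typing import Any, Dict


def build_fanout_rule(topic_filter: str, role_arn: str, queue_url: str,
                      stream_name: str, error_bucket: str) -> Dict[str, Any]:
    """Return a topic rule payload that fans device data out to two services.

    All arguments are placeholders supplied by the caller; nothing here
    contacts AWS.
    """
    return {
        # Select every message published on the matching device topics.
        "sql": f"SELECT * FROM '{topic_filter}'",
        "awsIotSqlVersion": "2016-03-23",
        "ruleDisabled": False,
        # Actions run in parallel for each matching message.
        "actions": [
            {"sqs": {"roleArn": role_arn, "queueUrl": queue_url,
                     "useBase64": False}},
            {"kinesis": {"roleArn": role_arn, "streamName": stream_name,
                         "partitionKey": "${topic()}"}},
        ],
        # If an action fails, keep the message in S3 so it is not lost.
        "errorAction": {
            "s3": {"roleArn": role_arn, "bucketName": error_bucket,
                   "key": "${topic()}/${timestamp()}"},
        },
    }
```

With boto3, a payload like this would typically be passed as `topicRulePayload` to the IoT client's `create_topic_rule` call, together with a `ruleName`. The IAM role must allow writes to each target.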
Automate Device Provisioning and Upgrades
As the business grows and more devices connect to the IoT ecosystem, manual processes such as device provisioning, software bootstrapping, security configuration, rule-action setup and over-the-air (OTA) device upgrades are no longer feasible. Minimizing human interaction during initialization and upgrades is important for saving time and reducing costs.
Building automated provisioning capabilities into the device, and using the tools AWS provides for device provisioning and management, lets systems achieve the desired operational efficiency with minimal human intervention.
AWS IoT supports batch import of devices together with a set of policies. This can be integrated with dashboards and manufacturing processes, so that a device is pre-registered with AWS IoT and its certificates are installed on it during manufacturing. Later, the device provisioning flow can claim the device and attach it to a user or another entity. AWS also provides facilities to trigger and track OTA upgrades for devices.
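The pre-register-then-claim flow can be modeled with a small in-memory registry. This is only a conceptual sketch: `DeviceRegistry`, `DeviceRecord` and their methods are hypothetical names, and a real system would rely on the AWS IoT registry and certificate services rather than a Python dictionary.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class DeviceRecord:
    serial: str
    certificate_id: str
    owner: Optional[str] = None  # None until a user claims the device


class DeviceRegistry:
    """Hypothetical model of pre-registration at the factory and later claiming."""

    def __init__(self) -> None:
        self._devices: Dict[str, DeviceRecord] = {}

    def pre_register(self, serial: str, certificate_id: str) -> None:
        """Record a device and its certificate during manufacturing."""
        self._devices[serial] = DeviceRecord(serial, certificate_id)

    def claim(self, serial: str, certificate_id: str, owner: str) -> DeviceRecord:
        """Attach a pre-registered device to an owner without manual steps."""
        record = self._devices.get(serial)
        if record is None:
            raise KeyError(f"device {serial} was never pre-registered")
        if record.certificate_id != certificate_id:
            raise PermissionError("certificate does not match the registered one")
        if record.owner is not None:
            raise ValueError(f"device {serial} is already claimed")
        record.owner = owner
        return record
```

Because the certificate check and the ownership check both happen inside `claim`, no operator has to approve individual devices. That is the property that matters once the fleet grows.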
Adopt Scalable Architecture for Custom Components
Because IoT systems connect to devices in the outside world, their scope does not end with connecting, controlling and reporting on devices. Consider adopting technologies such as data science and machine learning, or integrating third-party components such as IFTTT, Alexa or Google Home into the IoT system. The IoT architecture should ensure that external components can be integrated easily into solutions without creating performance bottlenecks.
Check for Offline Access and Processing
It is not always necessary to process all of your devices' data in the cloud, and in many cases continuous internet connectivity is not available. For such scenarios, add AWS IoT Greengrass at the edge. Greengrass processes and filters data locally and reduces the need to send all device data upstream. One can capture all data, hold it for a limited amount of time and send it to the cloud on error events or on demand. If time-series data is needed, one can schedule a periodic process that sends device data to the cloud, where it can support future enhancements such as machine learning models and cloud analytics tools.
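The hold-then-send behavior can be sketched as a small retention buffer. This is plain Python for illustration and does not use the Greengrass SDK. `EdgeBuffer`, `uplink` and the reading format are hypothetical, and timestamps are assumed to arrive in non-decreasing order.

```python
from collections import deque
from typing import Callable, Deque, Dict, List, Tuple

Reading = Dict[str, float]


class EdgeBuffer:
    """Keep recent readings locally and upload them only when needed.

    Hypothetical sketch: `uplink` stands in for whatever sends data to the cloud.
    """

    def __init__(self, uplink: Callable[[List[Reading]], None],
                 retention_seconds: float = 300.0) -> None:
        self._uplink = uplink
        self._retention = retention_seconds
        self._readings: Deque[Tuple[float, Reading]] = deque()

    def _expire(self, now: float) -> None:
        # Drop readings older than the retention window.
        while self._readings and now - self._readings[0][0] > self._retention:
            self._readings.popleft()

    def record(self, timestamp: float, reading: Reading,
               is_error: bool = False) -> int:
        """Store a reading; on an error event, upload the retained window.

        Returns the number of readings uploaded (0 when nothing was sent).
        """
        self._readings.append((timestamp, reading))
        self._expire(timestamp)
        return self.flush(timestamp) if is_error else 0

    def flush(self, now: float) -> int:
        """Upload everything still inside the window, e.g. on demand."""
        self._expire(now)
        batch = [reading for _, reading in self._readings]
        if batch:
            self._uplink(batch)
        self._readings.clear()
        return len(batch)
```

A scheduled job could call `flush` periodically to produce the time-series upload described above, while error events trigger an immediate upload that includes the readings leading up to the fault.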
Choose the Right Data Storage
IoT systems generate high-speed, high-volume and highly varied data. Each IoT device or device topic can use a different format, so the data may not be manageable through a single database or a single type of data store. An architect should choose database formats and data stores carefully. Frequently used static data can be kept in Amazon ElastiCache to improve performance. Such practices help keep the system scalable and maintainable.
Filter and Transform Data Before Processing
Incoming data to the IoT system may need processing or transformation before it is redirected to storage. AWS IoT rules provide actions that redirect messages to different AWS services. An architect should divide incoming data into categories: data that needs processing, data that can be ignored or is static (such as configuration), and data that can go directly to storage.
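The three categories can be expressed as a simple routing function. The message fields used here (`kind`, `alarm`) are assumptions made for illustration. In practice the same split would usually be made with separate AWS IoT rules, each with its own filter and action.

```python
from enum import Enum
from typing import Any, Dict


class Route(Enum):
    PROCESS = "process"  # needs transformation or analysis first
    STORE = "store"      # can go straight to storage
    IGNORE = "ignore"    # static or configuration data, not forwarded


def classify(message: Dict[str, Any]) -> Route:
    """Decide how an incoming device message should be handled.

    The `kind` and `alarm` fields are hypothetical; adapt them to the
    actual payload format of the devices.
    """
    kind = message.get("kind")
    if kind == "config":
        return Route.IGNORE
    if kind == "event" or "alarm" in message:
        return Route.PROCESS
    return Route.STORE
```

Keeping the decision in one place makes it easy to see which data incurs processing cost and which bypasses it.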