Distributed Data Service for Data Management in Internet of Things Middleware

ABSTRACT

The development of the Internet of Things (IoT) is closely related to a considerable increase in the number and variety of devices connected to the Internet. Sensors have become a regular component of our environment, as well as smart phones and other devices that continuously collect data about our lives even without our intervention. With such connected devices, a broad range of applications has been developed and deployed, including those dealing with massive volumes of data.

In this paper, we introduce a Distributed Data Service (DDS) to collect and process data for IoT environments. One central goal of this DDS is to enable multiple and distinct IoT middleware systems to share common data services from a loosely-coupled provider. In this context, we propose a new specification of functionalities for a DDS and the conception of the corresponding techniques for collecting, filtering and storing data conveniently and efficiently in this environment. Another contribution is a data aggregation component that is proposed to support efficient real-time data querying.

To validate its data collecting and querying functionalities and performance, the proposed DDS is evaluated in two case studies regarding a simulated smart home system. The first case evaluates data collection and aggregation when the DDS interacts with the UIoT middleware, and the second compares the DDS data collection with the same functionality implemented within the Kaa middleware.

BIG DATA MANAGEMENT FOR IOT MIDDLEWARE

Figure 1. IoT architecture

Figure 1 shows a general view of the different layers that compose an IoT middleware. In general, the upper layer is a component that has direct interaction with the application layer within the IoT architecture. The application layer receives request messages and sends responses related to services provided by the middleware. The lower layer interacts with the physical layer and exchanges binary information and control commands/responses with the physical devices.

RELATED WORKS

Much effort has been made in the area of IoT data storage and processing, indicating that one central objective in this domain is to efficiently gather data generated by heterogeneous devices, then process and store both the original and the resulting data in a persistent data store. As mentioned before, to achieve this, IoT middleware systems usually implement a data collector. In one such work, a data collector designed specifically for an IoT environment is made available through a REST API. Data are received in a message array format, and the collector splits these into single message packets and authenticates the sensor. After the sensor has been identified, the packets are put into a message queue to be processed.
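The collector flow described above (split a message array, authenticate the sensor, enqueue the accepted packets) can be sketched as follows. This is a minimal illustration, not the collector from the cited work: the sensor registry `KNOWN_SENSORS`, the message fields `sensor_id` and `token`, and the function names are all assumptions introduced here, and an in-memory queue stands in for the real message queue.

```python
import queue

# Hypothetical registry of known sensors and their tokens (assumption for illustration).
KNOWN_SENSORS = {"sensor-1": "token-a", "sensor-2": "token-b"}


def authenticate(message):
    """Return True only when the sensor is known and its token matches."""
    expected = KNOWN_SENSORS.get(message.get("sensor_id"))
    return expected is not None and expected == message.get("token")


def collect(message_array, out_queue):
    """Split an array of messages into single packets and enqueue the authenticated ones."""
    accepted = 0
    for message in message_array:
        if authenticate(message):
            out_queue.put(message)
            accepted += 1
    return accepted


pending = queue.Queue()
batch = [
    {"sensor_id": "sensor-1", "token": "token-a", "value": 21.5},
    {"sensor_id": "sensor-2", "token": "wrong", "value": 40.0},
    {"sensor_id": "sensor-9", "token": "token-a", "value": 3.0},
]
accepted_count = collect(batch, pending)
```

Only the first packet in `batch` passes authentication here; the second carries a wrong token and the third comes from an unknown sensor, so neither reaches the queue.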

DISTRIBUTED DATA SERVICE FOR IOT MIDDLEWARE

Figure 3. Data collection component

In Figure 3, processes related to consumers, filters and metadata are designed to handle large processing volumes since they can adapt to different parallel computing levels depending on environmental needs. These processes are executed according to a specific data flow, beginning within the consumer, passing through the filter to reach the metadata creator. The modules for data capture, data filtering and metadata creation can be instantiated in multiple different processes, each one with its respective consumers, filters and metadata.
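The data flow from consumer to filter to metadata creator can be illustrated with chained generators. This is only a sketch under stated assumptions: the filtering rule (keep messages with a numeric value), the metadata fields `source` and `received_at`, and all function names are hypothetical and do not come from the DDS itself.

```python
import time


def consume(raw_messages):
    """Consumer stage: yield messages one at a time from an input source."""
    for message in raw_messages:
        yield message


def filter_valid(messages):
    """Filter stage: keep only messages with a numeric value (hypothetical rule)."""
    for message in messages:
        if isinstance(message.get("value"), (int, float)):
            yield message


def add_metadata(messages, source, clock=time.time):
    """Metadata stage: wrap each message with descriptive fields."""
    for message in messages:
        yield {"source": source, "received_at": clock(), "payload": message}


def run_pipeline(raw_messages, source, clock=time.time):
    """Run the three stages in order: consumer, filter, metadata creator."""
    return list(add_metadata(filter_valid(consume(raw_messages)), source, clock))


records = run_pipeline(
    [{"device": "lamp", "value": 1}, {"device": "door", "value": None}],
    source="home-1",
    clock=lambda: 0.0,  # fixed clock so the example is deterministic
)
```

Because each call to `run_pipeline` owns its own consumer, filter and metadata stages, several such pipelines could in principle be started in separate processes, which mirrors the multiple-instance design described above.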

Figure 5. Example of time series organization

Figure 5 gives an example of data compaction during a small time window, when there is a constant intense data flow, and also in a larger time window for less intense traffic. The idea is that, regardless of the time window, the data are grouped and then ordered continuously. In the first phase, nine data groups are found in time window t0. In the second phase, the nine groups in time window t0 have been sorted and compacted, while in time window t1 nine data groups are waiting to be compacted and two new data groups are arriving to be processed.
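The group-then-order idea can be sketched with a small function that buckets readings into fixed time windows and sorts each bucket. The representation of a reading as a `(timestamp, value)` pair and the fixed window length are assumptions made for this illustration; the source does not specify how the DDS encodes its time series.

```python
from collections import defaultdict


def compact(readings, window_seconds):
    """Group (timestamp, value) readings into fixed windows and sort each group."""
    windows = defaultdict(list)
    for timestamp, value in readings:
        # Start of the window this reading falls into.
        window_start = (timestamp // window_seconds) * window_seconds
        windows[window_start].append((timestamp, value))
    # Order the windows, and order the readings inside each window by timestamp.
    return {start: sorted(group) for start, group in sorted(windows.items())}


readings = [(12, 0.4), (3, 0.1), (7, 0.3), (15, 0.9), (1, 0.2)]
compacted = compact(readings, window_seconds=10)
```

With a 10-second window, the three readings before second 10 land in one group and the remaining two in the next, each sorted by timestamp.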

IMPLEMENTATION OF THE DDS

Figure 8. Kafka Consumer Configuration

To process the large volume of data collected into a topic, the Kafka consumers have to be properly configured. In general, a Kafka consumer reads data partitions (P0, P1, P2, … PN) as the data arrive, as shown in Figure 8A. However, even when partitions are read in parallel, this degree of parallelism is not enough to manage the large volume of data produced by a large network of devices and sensors under the control of an IoT middleware. To improve parallelism, the way the Kafka consumer reads partitions was modified by defining one consumer for each existing partition, as shown in Figure 8B. This increases the degree of parallelism in partition reading: instead of a single consumer reading multiple partitions, each consumer is now dedicated to reading exactly one partition.
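The one-consumer-per-partition arrangement can be illustrated without a Kafka cluster. In the sketch below, in-memory lists stand in for partitions and threads stand in for consumers; no Kafka client API is used, and the function names are hypothetical. It shows only the structure of the approach, not its performance.

```python
import threading


def consume_partition(partition_id, messages, results, lock):
    """Dedicated consumer: read every message of exactly one partition."""
    processed = [(partition_id, message) for message in messages]
    with lock:
        results.extend(processed)


def consume_all(partitions):
    """Start one consumer thread per partition and wait for all of them."""
    results, lock = [], threading.Lock()
    threads = [
        threading.Thread(target=consume_partition, args=(pid, msgs, results, lock))
        for pid, msgs in partitions.items()
    ]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return results


partitions = {0: ["a", "b"], 1: ["c"], 2: ["d", "e", "f"]}
consumed = consume_all(partitions)
```

Messages from different partitions may interleave in `consumed`, but every message is read exactly once by the consumer that owns its partition.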

CASE STUDY: SMART HOME SYSTEM SIMULATION

Figure 9. Computational environment of DDS

In order to support the huge data volume, the distributed data service was designed to run on a computational cluster based on a messaging system. The DDS is therefore cluster based and uses publish-subscribe messaging to handle read and write operations. One of the cluster nodes is selected as a master server node for the data collector. The other cluster nodes are slave servers that receive messages from the master server for storing and processing purposes. It is important to note that a physical cluster, as shown in Figure 9, was implemented. This cluster internally supports three virtual clusters: the Kafka, Storm and Cassandra clusters.

Figure 10. Computational environment of DDS

It is worth highlighting that the cluster environment was implemented over four virtual machines, as shown in Figure 10. This cluster internally supports two virtual clusters: the Kaa and MongoDB clusters.

RESULTS

Figure 11. UIoT data producer: synchronous

Figure 11 presents the results for the synchronous producer, showing that performance is proportional to the number of producers being executed. In addition, the number of homes that can be supported in the synchronous scenario is presented. For example, for two producers, the number of messages per second is 20,000 (2934 homes).
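The two figures quoted for two producers imply a per-home message rate, which the short calculation below makes explicit. The rate is derived here from the quoted numbers and is not stated in the source; the helper `homes_supported` additionally assumes that this rate stays constant, which the source does not confirm.

```python
# Figures quoted in the text for two synchronous producers.
MESSAGES_PER_SECOND = 20_000
HOMES_SUPPORTED = 2934

# Implied per-home message rate (derived here, not stated in the source).
rate_per_home = MESSAGES_PER_SECOND / HOMES_SUPPORTED


def homes_supported(messages_per_second, per_home_rate=rate_per_home):
    """Estimate supported homes, assuming a constant per-home message rate."""
    return round(messages_per_second / per_home_rate)
```

Under that assumption, each simulated home accounts for roughly 6.8 messages per second.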

Figure 13. Kaa collector results

Unlike the UIoT-DDS simulation, the Kaa simulation involved exclusively the creation of synchronous messages. This is due to the fact that data collection in a Kaa platform can only store the received messages in the database and then reply with a confirmation or acknowledgment back to the endpoint. Figure 13 displays the relationship between the message receiving speed and the number of homes supported by different numbers of Kaa nodes.

DISCUSSION

In this section, we first analyze the data collection and data aggregation results of the UIoT-DDS study, then the Kaa collector results and, finally, compare the data collection of the UIoT-DDS with that of the Kaa collector in terms of ingesting a huge data volume. This comparison shows the better performance of the DDS when facing a huge volume of data coming from different sources. It is important to highlight that, for the sake of fairness, the comparison is made in terms of data ingestion when the UIoT-DDS operates synchronously, since the Kaa collector only operates in synchronous mode.

CONCLUSIONS AND FUTURE WORKS

The next generation of the Internet is moving in an important developmental direction, as the IoT increasingly captures the attention of industry and academia. Selecting and applying database middleware technology in a reasonable way is key to solving the problem of IoT data management for a massive volume of data generated in real time. In this context, our results show that the designed DDS, which supports data management for IoT middleware, performs well in collecting and retrieving data from an IoT middleware.

The choice of components and their articulation for collaborative data treatment are key factors for high velocity data collecting and analyzing. Besides this, the proposed DDS is able to process a variety of data coming from different IoT environments through a specific communications interface and metadata creation modules. Finally, the proposed DDS was shown to have better performance compared with a typical data collector, the Kaa middleware.

Source: University of Brasilia
Authors: Ruben Cruz Huacarpuma | Rafael Timoteo De Sousa Junior | Maristela Terto De Holanda | Robson De Oliveira Albuquerque | Luis Javier Garcia Villalba | Tai-hoon Kim
