**ABSTRACT**

The Internet of Things (IoT) is becoming ubiquitous in our everyday lives, implying that more technologies will generate data. IoT devices use sensors to monitor attributes of the environment such as temperature, humidity, and light. These sensors produce data periodically, and storing this massive volume of data in a database is becoming a major challenge for data storage infrastructure. Prior research has proposed compression algorithms and signature techniques to reduce data storage, but it does not specify how the data patterns are defined.

Since the environment exhibits similar patterns every day, the data gathered from daily sensing carries the same information. Therefore, in this study, we propose a system that stores data models rather than raw data points. Instead of storing one data point at a time, we develop and store data models, together with the corresponding time periods, that capture the behavior of the sensor data. This reduces data storage requirements. The data models developed are mathematical polynomial models that fit a sample data set. In addition, we propose a sensor database structure that addresses the issues of data redundancy as well as temporal constraints in the database.

**BACKGROUND INFORMATION**

Figure 2.1 shows an overview of the main components of an IoT system. In the following subsections, we briefly describe the function of each component. A sensor network is a group of different sensors that are embedded in various IoT devices to monitor physical or environmental conditions and that can communicate with each other to form a network. Sensor nodes play an important role in the network.

**RELATED WORK**

To avoid redundant data generated by the sensors, data models in prior work are created using polynomials while the sensor node provides new samples. When adding points, the algorithm tries to fit as many points as possible to the polynomial, raising the polynomial's degree as needed to fit the data. If the data does not fit, the degree keeps increasing until the maximum degree is reached. If a value still does not fit at the maximum degree, the polynomial is stored with the timestamp, a new polynomial of degree zero is created to fit the next sample, and the process restarts.
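The incremental procedure described above can be sketched roughly as follows. This is a minimal illustration, not the cited authors' implementation: the names `fit_within` and `build_models`, the use of NumPy least-squares fitting, and the maximum-absolute-error test for deciding whether a point "fits" are all assumptions made for the example.

```python
import numpy as np

def fit_within(ts, vs, max_degree, threshold):
    """Return coefficients (highest degree first) of the lowest-degree
    polynomial that fits every point within `threshold`, or None."""
    t = np.asarray(ts, dtype=float) - ts[0]   # time relative to segment start
    v = np.asarray(vs, dtype=float)
    # Never use a degree higher than the number of points allows.
    for degree in range(min(max_degree, len(t) - 1) + 1):
        coeffs = np.polyfit(t, v, degree)
        if np.max(np.abs(np.polyval(coeffs, t) - v)) <= threshold:
            return coeffs
    return None

def build_models(samples, max_degree=2, threshold=0.1):
    """Consume (time, value) samples and return a list of
    (start_time, end_time, coefficients) tuples."""
    models = []
    ts, vs, coeffs = [], [], None
    for t, v in samples:
        candidate = fit_within(ts + [t], vs + [v], max_degree, threshold)
        if candidate is None:
            # The new point does not fit even at the maximum degree:
            # store the current polynomial and restart at degree zero.
            models.append((ts[0], ts[-1], coeffs))
            ts, vs = [t], [v]
            coeffs = np.array([float(v)])
        else:
            ts.append(t)
            vs.append(v)
            coeffs = candidate
    if ts:
        models.append((ts[0], ts[-1], coeffs))
    return models

# Hypothetical readings: a flat stretch followed by a linear rise.
samples = [(0, 20.0), (1, 20.0), (2, 20.0), (3, 20.0), (4, 20.0),
           (5, 30.0), (6, 31.0), (7, 32.0)]
models = build_models(samples)
```

In this sketch, eight hypothetical samples collapse into two stored polynomials: a constant for the flat stretch and a first-degree polynomial for the rise.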

**SENSOR DATA MODELS**

Sensor data models provide an efficient way to represent data and minimize storage space while preserving data utility. The actual readings gathered by sensors from their environment are raw data points. Instead of storing these raw data points in the database, we can use the storage space efficiently by representing groups of similar raw data points as mathematical equations. Therefore, we manage the storage space by storing these mathematical equations (data models) in the database. To retrieve a raw data reading, instead of fetching a raw data point, we retrieve the data model that corresponds to this data point and calculate the data value from it.
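As a rough illustration of this retrieval step, the sketch below looks up the model whose time period covers a requested time and evaluates the polynomial there. The `DataModel` class, the `lookup` function, the choice to measure time relative to the start of each period, and the coefficient values are all hypothetical and chosen only for the example.

```python
from dataclasses import dataclass
from typing import List, Optional, Sequence

@dataclass
class DataModel:
    start: float              # first time covered by the model
    end: float                # last time covered by the model
    coeffs: Sequence[float]   # polynomial coefficients, lowest degree first

    def value_at(self, t: float) -> float:
        """Evaluate the polynomial at time t using Horner's rule."""
        x = t - self.start    # assumption: time is relative to the period start
        result = 0.0
        for c in reversed(self.coeffs):
            result = result * x + c
        return result

def lookup(models: List[DataModel], t: float) -> Optional[float]:
    """Return the modeled reading at time t, or None if no model covers t."""
    for m in models:
        if m.start <= t <= m.end:
            return m.value_at(t)
    return None

# Hypothetical stored models: one linear, one quadratic.
stored = [
    DataModel(start=0, end=10, coeffs=[20.0, 0.5]),
    DataModel(start=11, end=20, coeffs=[25.0, 0.0, 0.1]),
]
reading = lookup(stored, 4)   # evaluates 20.0 + 0.5 * 4
```

A linear scan is used here for brevity; a real database would presumably rely on an index over the time periods.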

**SENSOR DATABASE**

Figure 5.2 shows the proposed underlying architecture of an IoT database. When a user enters a query, it is parsed by the query processor. The query service module builds the query according to the query parameters and sends the query request to the database service module. The database service module holds the database logic and sends the query statements to the query engine. The index established in the index module is then used to retrieve the requested data.

Figure 5.3 shows an example of a relation named object Models that contains four records of sample information collected by sensors that monitor a particular environment. The column names object ID, location, timestamp, and data Model are the attributes of this relation, where timestamp is the primary key, meaning that each record is uniquely identified by its timestamp.
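A relation of this shape could be declared as in the sketch below, which uses Python's built-in `sqlite3` module with an in-memory database. The snake_case table and column names, the column types, the JSON encoding of the coefficients, and the sample row are assumptions for illustration and are not taken from Figure 5.3.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")

# timestamp is the primary key, so each record is identified by its timestamp.
conn.execute("""
    CREATE TABLE object_models (
        object_id  TEXT NOT NULL,
        location   TEXT NOT NULL,
        timestamp  TEXT PRIMARY KEY,
        data_model TEXT NOT NULL
    )
""")

# Hypothetical record: polynomial coefficients stored as a JSON list.
row = ("sensor-1", "room-a", "2016-01-11 00:00:00", json.dumps([20.0, 0.5]))
conn.execute("INSERT INTO object_models VALUES (?, ?, ?, ?)", row)

# A second record with the same timestamp violates the primary key.
try:
    conn.execute("INSERT INTO object_models VALUES (?, ?, ?, ?)", row)
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

# Fetch the stored model for a given timestamp and decode its coefficients.
fetched = conn.execute(
    "SELECT data_model FROM object_models WHERE timestamp = ?",
    ("2016-01-11 00:00:00",),
).fetchone()
coeffs = json.loads(fetched[0])
```

Using timestamp alone as the key means two sensors cannot store models that start at the same instant, which is worth keeping in mind if several objects share one relation.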

**EXPERIMENTAL RESULTS**

In Figure 6.3, 78 polynomial models were generated for January 11, 2016 00:00:00 to January 11, 2016 23:59:00. These models were generated using an error threshold of 0.10 degrees Celsius and a maximum polynomial degree of 2. The maximum degree does not need to be high: with a maximum degree of 2, all the data points covered by a model are represented by at most 3 coefficients, which saves storage space.
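The storage effect of these settings can be estimated with a short calculation. The helper `storage_summary` is hypothetical, and the figure of 1440 raw readings assumes one reading per minute over the day, which the text does not state. The time period stored with each model is not counted.

```python
def storage_summary(num_models, max_degree, num_readings):
    """Compare the upper bound on stored coefficients with the raw reading count."""
    coeffs_per_model = max_degree + 1            # degree 2 -> at most 3 coefficients
    max_coefficients = num_models * coeffs_per_model
    return {
        "max_coefficients": max_coefficients,
        "raw_values": num_readings,
        "ratio": max_coefficients / num_readings,
    }

# 78 models and maximum degree 2 come from the text; 1440 readings is an assumption.
summary = storage_summary(num_models=78, max_degree=2, num_readings=1440)
```

With 78 models of degree at most 2, no more than 234 coefficients are stored for the day, regardless of the actual sampling rate.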

**CONCLUSION**

With the growth of information and communication technologies, data is being generated at very high rates. Data is becoming very hard to manage, and organizing it efficiently in databases is an important issue. IoT model databases are becoming an important way to alleviate the burden of data generation by decreasing the space that data consumes while maintaining the same information. Data models also represent the data with negligible error and can fit many raw data points from sensors. These models are created by fitting a function to the data points.

In this research, we used polynomials of different orders (for example, first order and second order) to fit the data points. Our algorithm, Generation of Sensor Data Model, finds a polynomial curve whose parameters are the coefficients of the polynomial equation. These parameters cover many raw data points within a time range. In other words, with data models we can represent an enormous number of data points without overfilling databases or sacrificing data utility.

Source: University of Miami

Author: Parul Maheshwari