An Effective Coreset Compression Algorithm for Large Scale Sensor Networks

The Inventors present a data compression algorithm that employs a small coreset of a large sensor data set to yield a highly efficient compression system. This technology can be used in any applications where signal compression and simplification is desired, including services that handle video streams from wearable cameras, mobile sensors, GPS-based location and mapping, financial data and biological signals.  

Researchers

Daniela Rus / Dan Feldman

Departments: Dept of Electrical Engineering & Computer Science
Technology Areas: Computer Science: Networking & Signals
Impact Areas: Connected World

  • data coreset compression
    United States of America | Granted | 9,286,312

Technology

The Inventors use techniques from computational geometry to solve problems in data and signal compression using coresets. The input to the coreset algorithm is a constant e > 0, and a set P of n points in representing n signals from d sensors) that can be approximated by k segments. The algorithm returns a coreset of O(k) points, whose size is independent of the original input n, such that the Hausdorff Euclidean distance from P to any given set of points is preserved up to an additive error of e. This compression guarantees an approximation error for any query.These coresets can be constructed in parallel, as well as in a streaming setting where data points arrive one by one, in which it would be otherwise impossible to remember the entire data set due to memory constraints. Additionally, they can be computed at one pass over streaming data and when combined with map-reduce techniques can yield a system capable of compressing a stream of O(n) points using only O(log n) space and update time.  

Problem Addressed

Field-deployed sensor networks are collecting massive amounts of data in real-time across a wide range of applications, resulting in a great demand for systems that can support streaming and performing computation on this high frequency data. Methods that first store all the data from a given fielded sensor network struggle to analyze data in real time. Additionally, most analysis tools today are based on data mining algorithms that can only handle blocks of static data on the order of a few gigabytes, that fit in the internal memory. The Inventors’ approach employs coresets, which are small sets that approximately represents the original data, to achieve an efficient method of data compression. Running queries or fitting models on the coreset will yield similar results when applied to the original data set.  

Advantages      

  • Insertion time per new observation in coreset and required memory is only linear in both the dimension of the data, and the number k of segments
  • Approach achieves dimensionality reduction of very large scale sparse data sets by solving time and space limitations  

Publications

"The Single Pixel GPS: Learning Big Data Signals from Tiny Coresets." Association for Computing Machinery, November 2016, 23–32.

"An Effective Coreset Compression Algorithm for Large Scale Sensor Networks." Association for Computing Machinery, April 2012, 257–268.

License this technology

Interested in this technology? Connect with our experienced licensing team to initiate the process.

Sign up for technology updates

Sign up now to receive the latest updates on cutting-edge technologies and innovations.