Abstract:
With the proliferation of devices equipped with location sensors, mining spatio-temporal data for interesting behavioral patterns has gained attention in recent years. In the context of big data, the present decade has been dubbed the Digital Universe Decade because the digital universe, that is, all data available in digital form, is growing at an exponential rate. Much of this data is graph data. Examples include data related to social network connections and data diffusion, computer networks, telecommunication networks, the Web, and knowledge bases. Yet another example is road-network data: in step with the rapid increase in available vehicle trajectory data, this data grows rapidly in size, resolution, and sophistication. Behavioral patterns are needed to reveal the meaning of such spatio-temporal data and to build applications and solutions on top of it. One such pattern is the convoy pattern, which can be used to find groups of people moving together in public transport or to prevent traffic jams. A convoy consists of at least m objects moving together for at least k consecutive time instants, where m and k are user-defined parameters. Convoy mining is an expensive task, and existing sequential algorithms do not scale to real-life dataset sizes. Moreover, both the sequential and the parallel algorithms require a complex set of data-dependent parameters that are hard to optimize. In this work, we design a smart algorithm for trajectory pattern mining on big data platforms. The algorithm is parallel in nature: it uses the data parallelism strategy to achieve scalability. The data parallelism strategy involves dividing the data among multiple processors or nodes and executing an identical algorithm on each of the nodes. Interestingly, the k/2-hop technique can also be applied to the mining of many other existing patterns, e.g., moving clusters and flocks, to achieve similar performance gains.
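To make the convoy definition concrete, the following is a minimal sequential sketch in Python. It is not the parallel algorithm proposed in this work, nor the k/2-hop technique. It assumes that "moving together" has already been resolved into groups of objects per time instant (for example by a clustering step, which the abstract does not specify); the function name `find_convoys` and the input layout are hypothetical, chosen only for illustration. The sketch may also report convoys that are not maximal.

```python
from typing import Dict, FrozenSet, List, Set, Tuple

# A convoy is reported as (objects, first instant, last instant).
Convoy = Tuple[FrozenSet[str], int, int]


def find_convoys(
    clusters_per_instant: List[List[Set[str]]], m: int, k: int
) -> List[Convoy]:
    """Naive illustration of the convoy definition.

    clusters_per_instant[t] is assumed to hold the groups of objects
    that are together at time instant t (hypothetical input layout).
    """
    results: Set[Convoy] = set()
    # Maps a group of objects to the instant at which it started moving together.
    candidates: Dict[FrozenSet[str], int] = {}

    for t, clusters in enumerate(clusters_per_instant):
        next_candidates: Dict[FrozenSet[str], int] = {}
        extended: Set[FrozenSet[str]] = set()

        for cluster in clusters:
            current = frozenset(cluster)
            if len(current) < m:
                continue
            # Try to extend every open candidate with this cluster.
            for members, start in candidates.items():
                common = members & current
                if len(common) >= m:
                    earliest = min(start, next_candidates.get(common, start))
                    next_candidates[common] = earliest
                    if common == members:
                        extended.add(members)
            # The cluster itself may start a new candidate at instant t.
            next_candidates.setdefault(current, t)

        # Candidates that did not survive unchanged end at instant t - 1.
        for members, start in candidates.items():
            if members not in extended and t - start >= k:
                results.add((members, start, t - 1))
        candidates = next_candidates

    # Candidates still open after the last instant end there.
    last = len(clusters_per_instant)
    for members, start in candidates.items():
        if last - start >= k:
            results.add((members, start, last - 1))

    return sorted(results, key=lambda c: (c[1], c[2], sorted(c[0])))
```

With m = 2 and k = 3, a group qualifies only if at least two of its objects share a group for three or more consecutive instants; lowering k admits shorter-lived groups as well.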