Our Research Interests Are

Managing IoT data streams in real-time and geographical space

Our R&D is generating solid evidence that speed of execution, accuracy, and effectiveness are the main factors in managing a data stream lifecycle. In particular, finding the trade-off among these factors is essential for avoiding the risk of overflowing an automated analytical workflow with useless data streams that are continually being transmitted from IoT devices to edge, fog, and cloud resources. We are exploring a data stream lifecycle using Cisco Kinetic in order to meet the security, geographical distribution, and transmission needs of data lifecycle tasks such as data ingestion, data transportation, data storage, data leverage, and data control flow.
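As a sketch of how these lifecycle tasks chain together, the Python generators below pass tuples through ingestion, transportation, and storage stages, discarding useless tuples before they reach storage. The stage names, record fields, and filtering rule are illustrative assumptions, not the Cisco Kinetic API.

```python
# Hypothetical sketch of lifecycle stages as chained generators
# (illustrative only; not the Cisco Kinetic API).

def ingest(readings):
    """Ingestion: admit raw tuples arriving from an IoT device."""
    yield from readings

def transport(stream, drop_if):
    """Transportation: forward tuples edge -> fog -> cloud,
    discarding useless ones before they overflow the workflow."""
    for r in stream:
        if not drop_if(r):
            yield r

def store(stream, sink):
    """Storage: persist tuples, then hand them on for analysis."""
    for r in stream:
        sink.append(r)
        yield r

readings = [{"id": 1, "temp": 21.5}, {"id": 2, "temp": None}]
sink = []
kept = list(store(transport(ingest(readings),
                            drop_if=lambda r: r["temp"] is None),
                  sink))
# The noisy tuple (temp=None) is dropped before storage.
```

The point of the generator style is that no stage buffers the whole (unbounded) stream; each tuple flows through the lifecycle one at a time.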

We are using the Petri Net (PN) model as a process mining technique to determine the stream behaviour of the automated analytical tasks. A PN model is a bipartite directed graph consisting of two types of nodes, places and transitions, connected by arcs. In our PN model, transitions represent the sequence of events that take place when the analytical tasks are executed by the algorithms running at the edge, fog, and cloud resources; that is, they represent the state changes of the data streams being transported along the analytical tasks. Places model the resources available at the edge, fog, and cloud that are needed to execute our streaming analytical workflow.
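A toy firing-rule example makes the place/transition vocabulary concrete. The place and transition names below are hypothetical illustrations, not our actual workflow model.

```python
# Toy Petri net: places hold tokens (resources or stream states);
# a transition is enabled when every input place holds a token,
# and firing moves tokens from input to output places.
# Place/transition names are illustrative only.

net = {
    "ingest":  ({"stream_arrived", "edge_cpu"}, {"stream_ingested", "edge_cpu"}),
    "analyze": ({"stream_ingested", "fog_cpu"}, {"result", "fog_cpu"}),
}

def enabled(marking, transition):
    inputs, _ = net[transition]
    return all(marking.get(p, 0) >= 1 for p in inputs)

def fire(marking, transition):
    inputs, outputs = net[transition]
    m = dict(marking)
    for p in inputs:
        m[p] -= 1          # consume one token per input place
    for p in outputs:
        m[p] = m.get(p, 0) + 1  # produce one token per output place
    return m

m0 = {"stream_arrived": 1, "edge_cpu": 1, "fog_cpu": 1}
m1 = fire(m0, "ingest")        # ingest is enabled: stream arrived, edge CPU free
m2 = fire(m1, "analyze")       # only possible after ingest has fired
```

Note that "edge_cpu" appears as both input and output of "ingest": the resource is borrowed and returned, which is how places model shared edge/fog/cloud resources.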


We are using two process mining tools, ProM6 and Fluxicon's Disco.

We are members of the Fluxicon Academic Initiative for process mining research and education. Through it, we are working closely with a number of forward-thinking organizations around the world, led by Professor Wil van der Aalst, who invented process mining at Eindhoven University of Technology. Under this initiative, Dr. Wachowicz’s students have access to the latest process mining technology and practices available in the market, as well as to a discussion forum.

Analytics Everywhere Ecosystem

Our major breakthrough is our early prototype of an “analytics everywhere” ecosystem, which has provided us with an iterative learning experience on how to advance our research towards automated analytical tasks for the Internet of Things. Combining analytical tasks across distributed resources (i.e. edge-fog-cloud) is not trivial, and the deployment of different IoT use cases has been crucial in identifying the limitations and benefits of our ecosystem. The four main building blocks of our proposed ecosystem are:

Autonomous Analytical Workflows

We are developing a set of autonomous tasks to process and analyze data streams coming from IoT devices. These data streams are usually unbounded sequences of tuples generated at high data rates and containing exceptionally noisy data. From a conceptual perspective, the design of an automated analytical workflow depends on the integration of complementary mobility contexts for processing massive data streams without human intervention. We have introduced the notions of trips, networks of trips, and mobility neighborhoods to represent a mobility context across geographical scales: for example, a trip taken by a transit vehicle, or the mobility neighborhood of an autonomous car moving on a highway, can be aggregated into a network of trips or a cluster of synchronized mobility neighborhoods. Our research will continue to focus on identifying new conceptual artifacts that can be used to support the automation of analytical tasks. Despite the scientific evidence that context plays an important role in analytics, it has yet to receive careful examination.
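The mobility-context notions above (trips, networks of trips, mobility neighborhoods) can be sketched as simple data structures. The field names and the endpoint-sharing rule are illustrative assumptions, not our actual schema.

```python
from dataclasses import dataclass

@dataclass
class Trip:
    """One trip by a vehicle: an ordered trace of (t, lat, lon) points."""
    vehicle_id: str
    points: list

@dataclass
class MobilityNeighborhood:
    """Vehicles moving near an anchor vehicle at the same time."""
    anchor_id: str
    member_ids: set

def network_of_trips(trips, share_endpoint):
    """Group trips into networks: trips whose endpoints coincide
    (e.g. a shared transit stop) fall into the same network.
    The grouping rule is a hypothetical example."""
    networks = {}
    for trip in trips:
        networks.setdefault(share_endpoint(trip), []).append(trip)
    return networks

t1 = Trip("bus_1", [(0, 45.95, -66.64), (60, 45.96, -66.65)])
t2 = Trip("bus_2", [(0, 45.90, -66.60), (55, 45.96, -66.65)])
nets = network_of_trips([t1, t2],
                        share_endpoint=lambda t: t.points[-1][1:])
# both trips end at the same stop, so they form a single network
```

The same aggregation idea, applied to co-located vehicles rather than shared endpoints, would cluster MobilityNeighborhood instances across scales.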

Algorithm Transparency: This research direction has emerged from our concerns about the inherent high risk of bias in the data and algorithms of automated analytical workflows. Resolving these risks will require broader steps, but our R&D outcomes are pointing to the need to understand how the analytical tasks, data lifecycles, and geographic spaces used to collect training IoT data can influence the behavior of predictive models. In the United States, there are already initiatives such as AI Now at New York University and the Algorithmic Justice League, with the help of the MIT Media Lab, warning about the power of biased algorithms and their social implications. There is no similar initiative in Canada.

Machine Learning on Graphs

Several algorithms are being used to execute the automated tasks using bipartite graphs, directed graphs, and evolutionary graphs. Currently, they include techniques such as Louvain clustering, SVM, DBSCAN, Decision Tables, K-Means, and Random Forest, implemented in Python and JavaScript. We are combining algorithms to support descriptive analytics (what is happening?), diagnostic analytics (why is it happening?), and predictive analytics (what will happen?). We have successfully performed analytical tasks on unbounded data streams using landmark windows, sliding windows, event windows, and tilted windows. Previous research work has neglected the spatial dimension, even though data streams are generated over geographical areas having different but complementary local, regional, and global scales. Our pioneering research is addressing this issue in Geomatics Engineering. We will continue to explore the spatial dimension, in particular the challenge of determining the trade-off between spatial scale and the frequency of refreshing the training data sets needed to support machine learning in an automated analytical workflow. Finally, it is important to point out that our predictive models are currently built and updated manually.
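As an example of one of these window types, a sliding window over an unbounded stream can be maintained with a fixed-size deque. The window size and step below are arbitrary illustrative choices, not parameters from our workflow.

```python
from collections import deque

def sliding_windows(stream, size, step):
    """Yield overlapping windows of `size` tuples, advancing by `step`.
    Only the current window is kept in memory, so this works on
    unbounded streams."""
    window = deque(maxlen=size)  # old items fall off automatically
    for i, item in enumerate(stream, start=1):
        window.append(item)
        if i >= size and (i - size) % step == 0:
            yield tuple(window)

# A finite range stands in for an unbounded stream here.
windows = list(sliding_windows(range(6), size=3, step=2))
# -> [(0, 1, 2), (2, 3, 4)]
```

A landmark window would instead accumulate from a fixed start point, and a tilted window would keep recent items at fine granularity and older items summarized; all three fit the same generator shape.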

Anticipatory Learning: We are interested in investigating anticipatory learning to address two challenges. First is the challenge of labeling training IoT data, which is currently being done manually. Second is the challenge of creating event logs for the Petri Nets, which are currently done in a batch processing mode. We are interested in investigating a feedback mechanism in the course of the analytical workflow that will help us to address these challenges.

Cognitive Mapping for Machine Learning

We are exploring Dervin’s sense-making model because of its wide applicability to communication settings, from both procedural and cognitive perspectives, as well as at individual and group levels.

Edge-Fog-Cloud Architecture

We have built an edge/fog/cloud pipeline that consists of a distributed resource infrastructure. We have used the Cisco Kinetic platform because it is a single platform that can be deployed at both edge and fog nodes and can manage the data lifecycle of different types of IoT devices (e.g. lighting, parking, traffic, water management). Using the Cisco Kinetic platform was critical for initiating a data lifecycle at any time and any place. We also have the support of Compute Canada for the implementation of Hadoop cloud clusters, giving us access to the East Cloud and West Cloud resources. This has led to a key realization: the next challenge is not how fast to transport the different data streams produced by IoT devices, but rather the actual stream behavior of an automated analytical workflow. Our research work is raising awareness of how crucial it is to understand whether the data streams, generated under different mobility contexts belonging to different IoT systems, actually conform to the data lifecycle necessary to execute a sequence of tasks of an automated analytical workflow running at the edge, fog, and cloud.

University of New Brunswick


The People in Motion Lab is located at the Department of Geodesy and Geomatics.




University of New Brunswick

Department of Geodesy & Geomatics Engineering

Head Hall, E-50
P.O. Box 4400
Fredericton, N.B. E3B 5A3