QStates: Cognitive State Classification Software

Machine Learning Made Easy

Introduction

QStates is a rapid and efficient machine learning software tool developed by QUASAR that uses quantitative EEG (qEEG) and other physiological sensor data to assess cognitive states. Cognitive state assessment can be performed in real time or offline, with graphical displays of results. QStates offers users the flexibility to create their own models, which can be trained to classify any cognitive state for which there is a qEEG signature (e.g., cognitive workload, fatigue, engagement, or emotional states). Wearable Sensing and QUASAR scientists have validated classification accuracy of greater than 90% for the states of mental workload, engagement, and fatigue. Training a model is straightforward and fast: to create a cognitive state model, a user needs to collect a minimum of one minute of EEG data for each of the high and low state conditions (e.g., high workload vs. low workload). QStates can classify up to 3 models simultaneously. The software automatically monitors data quality and rejects bad epochs. The program offers two different machine learning algorithm outputs: linear interpolation or probability density function. State outputs are given as values from 0-100 every 2 seconds. QStates offers automated generation of summary tables. Data are saved in comma-separated value (.csv) format. QStates interfaces seamlessly with DSI-Streamer and the various DSI systems.
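As a concrete illustration of the output format described above, the following minimal Python sketch summarizes a QStates-style .csv log. The exact column layout of QStates files is not specified here, so the column names (time_s, workload, engagement, fatigue) and the file name are assumptions for illustration only.

import csv
from statistics import mean

# Hypothetical layout for a QStates output file: one row every 2 s,
# gauge values on a 0-100 scale. Column names are assumed, not QStates' actual schema:
# time_s,workload,engagement,fatigue
# 0,42.1,77.3,12.0

def summarize(path, states=("workload", "engagement", "fatigue")):
    """Compute a simple per-state summary from a QStates-style .csv log."""
    columns = {s: [] for s in states}
    with open(path, newline="") as f:
        for row in csv.DictReader(f):
            for s in states:
                columns[s].append(float(row[s]))
    return {s: {"mean": round(mean(v), 1), "min": min(v), "max": max(v)}
            for s, v in columns.items() if v}

if __name__ == "__main__":
    for state, stats in summarize("qstates_session.csv").items():
        print(state, stats)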

Real-Time Cognitive Classification Demo

In this video, we demonstrate a real-time assessment using models of “engagement” and “cognitive load”. The subject performs a task in which easy and hard addition problems are presented one at a time. The model gauges appear at the bottom middle of the screen: the leftmost gauge tracks the subject’s engagement, and the middle and right gauges track cognitive load. Changes in cognitive state cause the gauges to rise and fall. While the subject is given easy math problems, the engagement gauge is high, meaning the subject is paying attention, but the cognitive load gauges are low, meaning the subject is not exerting much mental effort. As the problems become more difficult, the cognitive load gauges start to rise. State outputs are given as values from 0-100 every 2 seconds. QStates offers automated generation of summary tables, and data are saved in CSV format.

Algorithm Development & Validation

Figure 1A
Figure 1B

Results from QUASAR’s Cognitive State Monitoring algorithm. A) Subject performing a 1st-person shooting game task; B) Classification of cognitive engagement, workload, and fatigue for 18 subjects during performance of a 1st-person shooting simulation, with >90% accuracy.

QUASAR’s real-time software for cognitive state classification, QStates, is at the cutting edge of cognitive state assessment methods. QUASAR has developed and validated this powerful and rapid algorithm for classifying cognitive state from EEG data. The details of this Partial Least Squares (PLS)-based algorithm are described in this publication (McDonald & Soussou, 2011). Briefly, the learning algorithm extracts spectral features from the EEG signal, trains cognitive models based on the researcher’s interests (workload, engagement, fatigue, etc.), and then processes EEG data in real time, producing cognitive state measures whose output ranges from 0 to 1, representing the relative intensity of the monitored state. The algorithm allows for expedient subject-specific calibration within minutes, or the creation of generalized (normative) models that operate across subjects.
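The publication above gives the full details; to make the pipeline concrete, here is a minimal sketch of the same general approach (band-power spectral features fed to a PLS regression trained on high/low state epochs), written with NumPy, SciPy, and scikit-learn. The sample rate, band definitions, and all function and variable names are our assumptions for illustration, not QStates' actual implementation.

import numpy as np
from scipy.signal import welch
from sklearn.cross_decomposition import PLSRegression

FS = 300  # sample rate in Hz (assumed for this sketch)
BANDS = {"delta": (1, 4), "theta": (4, 8), "alpha": (8, 13), "beta": (13, 30)}

def band_powers(epoch, fs=FS):
    """Spectral features: mean band power per channel via Welch's method.
    epoch has shape (n_channels, n_samples)."""
    freqs, psd = welch(epoch, fs=fs, nperseg=fs * 2, axis=-1)
    feats = []
    for lo, hi in BANDS.values():
        mask = (freqs >= lo) & (freqs < hi)
        feats.append(psd[..., mask].mean(axis=-1))
    return np.concatenate(feats)  # length n_bands * n_channels, band-major order

def train_state_model(epochs, labels, n_components=3):
    """Fit a PLS model on high (1) vs. low (0) state epochs."""
    X = np.vstack([band_powers(e) for e in epochs])
    model = PLSRegression(n_components=n_components)
    model.fit(X, np.asarray(labels, dtype=float))
    return model

def classify(model, epoch):
    """Return a state intensity in [0, 1] for one epoch (shown as 0-100 on the gauges)."""
    score = float(np.ravel(model.predict(band_powers(epoch)[None, :]))[0])
    return min(max(score, 0.0), 1.0)

In this sketch, the continuous PLS output is clipped to [0, 1], matching the relative-intensity range described above.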

QUASAR has tested and validated this cognitive gauge methodology on several research projects, using three commonly evaluated cognitive models: workload (effort), engagement (attention/focus), and fatigue. All three models regularly achieve average classification accuracies >90%, as determined by performance on primary and secondary tasks, primary task difficulty, subjective evaluation (NASA TLX), and time since last sleep (for fatigue) (Figure 1B). Furthermore, the models’ outputs have been shown to track task difficulty, correctly interpolating cognitive workload for tasks of intermediate difficulty compared to those used for training (Figure 2B).

Individualized Models

Many efforts at developing cognitive gauges have attempted to produce universal gauges that work for all individuals in order to produce “ready to go” systems. QUASAR’s rapidly training algorithms allow for expedient calibration within minutes, or the creation of “normative” models that do not require retraining. We evaluated the performance of such group-wide normative models against subject-specific models and against drift-calibrated models, for which calibration runs were conducted at the beginning and end of the experiment. Figure 2A shows the classification accuracy of cognitive workload gauges for 18 subjects performing a first-person shooting game at increasing difficulty levels. The normative model achieved 64% accuracy, whereas both individualized models produced average classification accuracies >90% on all 18 subjects with less than three minutes of training data (Figure 2A). Furthermore, the models’ outputs track task difficulty reliably, correctly interpolating cognitive workload for tasks of intermediate difficulty compared to those used for training (Figure 2B).
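A minimal sketch of the comparison described above, reusing the hypothetical train_state_model and classify helpers from the earlier sketch: each subject-specific model is trained on that subject's own calibration data, while the normative model is trained on all other subjects' data (leave-one-subject-out).

import numpy as np

def accuracy(model, epochs, labels):
    """Fraction of epochs whose thresholded gauge output matches the true label."""
    preds = np.array([classify(model, e) >= 0.5 for e in epochs])
    return float((preds == np.asarray(labels, dtype=bool)).mean())

def evaluate(subjects):
    """subjects: dict of subject_id -> (train_epochs, train_labels, test_epochs, test_labels)."""
    individual, normative = [], []
    for sid, (tr_x, tr_y, te_x, te_y) in subjects.items():
        # Subject-specific model: trained on this subject's own calibration run.
        individual.append(accuracy(train_state_model(tr_x, tr_y), te_x, te_y))
        # Normative model: trained on all other subjects (leave-one-subject-out).
        other_x = np.concatenate([v[0] for k, v in subjects.items() if k != sid])
        other_y = np.concatenate([v[1] for k, v in subjects.items() if k != sid])
        normative.append(accuracy(train_state_model(other_x, other_y), te_x, te_y))
    return float(np.mean(individual)), float(np.mean(normative))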

Results from a previous research project revealed that brain activity patterns vary across subjects even during performance of the same task (a first-person shooting simulation). From a group of 18, nine subjects had cognitive workload metrics with a high degree of similarity in terms of the EEG features that the algorithm selected for mental workload classification; five subjects shared a different set of features; and the remaining four subjects each had their own specific brain activity patterns that best characterized their mental effort during performance of the same task.

The algorithm behind QUASAR’s cognitive state gauges trains rapidly, and our current use scenario involves a brief (<5 min) calibration session for each individual on each day. By examining the relative weights that the algorithm assigns to the extracted EEG features, we can gain insight into the underlying brain areas recruited during task performance and look for differences between individuals and for correlations with performance. In this project, we will investigate the differences between brain activity patterns and how they relate to effective performance on the learning task. We will seek to identify patterns that correlate with fast-learning individuals, or with periods of accelerated learning.
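One simple way to perform that kind of inspection, assuming the PLS sketch above, is to rank the spectral features by the magnitude of the fitted regression coefficients. This is an illustrative stand-in, not QStates' actual reporting interface; the function name and example values are hypothetical.

import numpy as np

def top_features(model, channel_names,
                 band_names=("delta", "theta", "alpha", "beta"), k=10):
    """Rank (band, channel) features by the magnitude of the PLS coefficients."""
    coefs = np.ravel(model.coef_)  # one weight per feature, in band-major order
    names = [f"{b}:{ch}" for b in band_names for ch in channel_names]
    order = np.argsort(-np.abs(coefs))
    return [(names[i], float(coefs[i])) for i in order[:k]]

# Example: top_features(model, ["F3", "F4", "Cz", "P3", "P4"]) might return
# [("theta:F4", 0.83), ("alpha:P3", -0.61), ...] (hypothetical values).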

Figure 2A
Figure 2B

Cognitive Workload Performance. A. Average classification accuracy of normative, individual, and drift-calibrated models on 18 subjects performing a 1st-person shooting game task. B. Average workload gauge output for 18 subjects as a function of task difficulty under different calibration settings (avg ± std).

Task-Specific Cognitive Gauges

Figure 3

Average classification accuracy of cognitive workload models across 18 subjects on the following tasks: All, BF (1st-person shooting game), Validation (all tasks except BF), addition, FDS (forward digit span) memory, and spatial N-back. Red bars indicate the accuracy of a generalized model trained using all tasks, whereas green bars indicate the accuracy of task-specific models trained on a subset of data gathered during performance of the same tasks being classified.

The learning aspect of these cognitive gauges also allows them to be trained for various cognitive tasks. Pilot studies were conducted to evaluate the specificity of models. Figure 3 graphs the average classification accuracy of task-specific models (green) and a generalized “overall” model (red) on each of five different tasks for 18 subjects, and reveals that task-specific models have greater accuracy than generalized ones.

Mental workload theories propose that the brain has many functions that it can attend to serially or in parallel, but that different modalities can be taxed differentially at various points in time (Wickens, 1991). Any specific modality has optimal energetic states of operation, outside of which the person is prone to underload or overload and to errors (Gaillard and Wientjes, 1994). Accordingly, in order to track various learning functions, it makes sense to create modality-specific workload models (e.g., verbal, analytical, arithmetic) rather than a single generalized mental workload model, as sketched below.
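A sketch of that design choice, again reusing the hypothetical train_state_model and classify helpers from the earlier sketch: one workload model per modality, each calibrated on data from its own task type. The modality names are assumed for illustration.

# Hypothetical: one workload model per modality rather than a single generalized one.
MODALITIES = ("verbal", "analytical", "arithmetic")

def train_modality_models(calibration):
    """calibration: dict of modality -> (epochs, labels) recorded on that task type."""
    return {m: train_state_model(*calibration[m]) for m in MODALITIES if m in calibration}

def workload_for(models, modality, epoch):
    """Classify an epoch with the model matching the current task's modality."""
    return classify(models[modality], epoch)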

Model Output

Figure 4

Output of the cognitive load gauge tracks difficulty during performance of addition tasks: A) Alternating between medium difficulty and waiting; B) Alternating between easy and hard tasks; and C) Task difficulty varying from easy to medium, hard, medium, and back to easy. (X-axis is time; Y-axis is gauge output. The colored traces represent two different mathematical outputs of the cognitive gauge: brown is a linearized output, purple is a non-linear metric.)

Cognitive Workload in Training and Learning Models

There are several views of learning and training applied in educational models. The concept most basic and common to all is the acquisition of new information or rules. Learning can occur gradually, whereby students build knowledge in sequential steps; through novelty exposure, where students learn to explore and extract information from new environments; or through scenario-specific training, where students improve by memorization of familiar and repetitive situations. In these models, learning occurs as students transition from a state of work that is deliberate, monitored, and emotional to a state of work that is effortless and natural, but not thoughtless or accidental. These models therefore suggest that monitoring attention load vs. performance could help determine expertise levels, as they imply an inverse relationship between performance and attention load over the course of training and across different mastery levels.

In order to evaluate expertise, tests are generally administered to assess performance, such as whether students are able to complete tasks successfully or whether they can generalize the rules learned. However, such performance-based metrics are currently unable to assess the ease with which the task was completed, a measure that can help distinguish expertise level. The ability to monitor cognitive effort and performance over the course of training could thus reveal mastery level. Furthermore, it could also enable assessment of the efficacy of various training paradigms, by evaluating the steepness of their learning curves and their plateau values.

In 2010, under a DHS-funded SBIR, QUASAR used these gauges to monitor the cognitive workload of TSA operators and compare it to that of briefly trained novices during an X-ray screening task. The workload gauge was able to discriminate expertise level (Figure 5), whereas none of the standard performance metrics (score, error breakdown, response duration, etc.) were significantly related to expertise.

Figure 5

Average cognitive workload during an X-ray screening task as a function of expertise. (The novice group had 16 hours of training and passed the final exam, whereas the expert group had >2 years of screening experience.)

Furthermore, there were significant correlations between EEG-based cognitive workload and error rates: higher mental workload was associated with images on which subjects made errors, and lower mental effort with images on which a correct detection was made. These preliminary results suggest that in cognitive gauge-guided training, cognitive workload can be used to discriminate whether an error was due to lack of understanding or to inattentiveness, and to establish how hard an individual finds a particular task.

External Validation

Figure 6

The Army has used QUASAR’s EEG systems and cognitive state algorithms to objectively assess mental effort during performance of basic arithmetic tasks of varying difficulty. The report indicates that the EEG-based metrics were more closely related to performance than the subjective evaluations (NASA TLX) of task difficulty were. As subjects improved at the task, their evaluation of its difficulty did not reflect their improved performance, whereas their mental workload decreased correspondingly (Fielder, 2010).

During previous projects, we observed that as task difficulty increased beyond a certain threshold, subjects changed their execution strategy and reverted to one requiring lower mental effort. By monitoring their cognitive workload, we were able to objectively determine their maximal performance threshold, as well as to calibrate their relative limits compared to others in the group. Through direct measurement of mental activity, our neuromonitoring approach inherently provides an objective means to test baseline capabilities and improvements in the performance of individual students.