Next Generation Continual Learning


Project Description

One of the most impressive abilities of human beings is incremental learning. From birth, we learn continuously without forgetting previously acquired knowledge that remains useful to us. Imagine the number of real-world applications that could be developed if we extended this ability to modern AI systems, especially deep neural networks. Modern deep neural networks adapt their parameters on large-scale datasets to achieve state-of-the-art performance on specific computer vision tasks. However, due to legal and technical constraints and huge label diversity, deep learning models in real-world scenarios are rarely trained just once. Instead, they may be trained sequentially on several disjoint computer vision tasks without access to the data from previous tasks (for example, because that data is no longer available). Therefore, these networks should learn incrementally without forgetting previously learned knowledge. This is known as Continual Learning (CL). Currently, there are two families of CL methods. The first is rehearsal or memory-based methods, which select and store the most relevant samples of the current task so they can be replayed when subsequent tasks are learned. The second is regularization-based methods, which penalize changes to the parameters most relevant to previous tasks.

While the most studied challenge in the literature is learning with the least amount of forgetting on previous tasks, several other unexplored factors affect learning from a stream of data. For instance, how fast can the learner adapt the model's parameters when receiving a new batch of data? If the learner is too slow (i.e., its training routine is expensive), samples from the stream could be missed and never trained on. How, then, can we benchmark different continual learning methods under budget-constrained training?
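To make the rehearsal idea concrete, a minimal sketch of a memory-based buffer follows, using reservoir sampling to keep a bounded, uniform sample of the data stream. The class and method names here are illustrative only, not taken from any particular CL method or library:

```python
import random

class RehearsalBuffer:
    """Fixed-size episodic memory filled with reservoir sampling.

    A minimal sketch of the rehearsal idea: keep a bounded sample of
    past-task data that can be replayed alongside new-task batches.
    """

    def __init__(self, capacity, seed=0):
        self.capacity = capacity
        self.buffer = []            # stored (sample, label) pairs
        self.seen = 0               # total samples observed so far
        self.rng = random.Random(seed)

    def add(self, sample, label):
        self.seen += 1
        if len(self.buffer) < self.capacity:
            self.buffer.append((sample, label))
        else:
            # Replace a stored entry with probability capacity / seen,
            # which keeps the buffer a uniform sample of the stream.
            j = self.rng.randrange(self.seen)
            if j < self.capacity:
                self.buffer[j] = (sample, label)

    def replay(self, batch_size):
        # Draw a mini-batch of stored samples to mix into training
        # on the current task, mitigating forgetting.
        k = min(batch_size, len(self.buffer))
        return self.rng.sample(self.buffer, k)
```

In a training loop, each incoming sample would first be used to update the model together with a replayed batch, then offered to the buffer via `add`. Regularization-based methods instead add a penalty term to the loss that discourages moving parameters deemed important for earlier tasks.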
Furthermore, most continual learning benchmarks focus solely on the image domain, leaving the more challenging video domain unstudied. For video, one of the main issues has been the lack of realistic, challenging, and standardized evaluation setups, making direct comparisons hard to establish. Therefore, our group IVUL has developed vCLIMB, a novel video class-incremental learning benchmark, to promote and facilitate research on continual learning in the video domain. Video CL comes with unique challenges. (1) Memory-based methods developed in the image domain do not scale to storing full-resolution videos, so novel methods are needed to select representative frames to store in memory. (2) Untrimmed videos contain background frames with little helpful information, making the selection process more challenging. (3) Temporal information is unique to video data, and both memory-based and regularization-based methods need to mitigate forgetting while also integrating key information from this temporal dimension.
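As a point of reference for the frame-selection challenge above, the simplest baseline is uniform temporal subsampling: store only a fixed budget of frames per video. A learned selector, as targeted by this project, would replace the uniform rule with one that adapts to each video's content. This sketch and its function name are illustrative assumptions, not part of vCLIMB's API:

```python
def select_frames(num_frames, budget):
    """Pick `budget` frame indices from a video with `num_frames` frames.

    Uniform baseline: take the centre frame of each of `budget` equal
    temporal segments. Untrimmed videos make this rule weak, since the
    sampled frames may land on uninformative background segments.
    """
    if num_frames <= budget:
        return list(range(num_frames))
    step = num_frames / budget
    return [int(step * i + step / 2) for i in range(budget)]
```

For example, a 100-frame clip stored with a budget of 4 frames reduces the memory cost of that clip by 25x, at the price of discarding most of its temporal signal.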
Program - Computer Science
Division - Computer, Electrical and Mathematical Sciences and Engineering
Center Affiliation - Visual Computing Center
Field of Study - machine learning; computer vision


Bernard Ghanem

Professor, Electrical and Computer Engineering

Professor Ghanem's research interests focus on topics in computer vision, machine learning, and image processing. They include:
  • Modeling dynamic objects in video sequences to improve motion segmentation, video compression, video registration, motion estimation, and activity recognition.
  • Developing efficient optimization and randomization techniques for large-scale computer vision and machine learning problems.
  • Exploring novel means of involving human judgment to develop more effective and perceptually-relevant recognition and compression techniques.
  • Developing frameworks for joint representation and classification by exploiting data sparsity and low-rankness.

Desired Project Deliverables

(i) A novel memory sampling strategy that learns to select a different number of relevant frames per video, reducing memory consumption while maintaining nearly the same performance; (ii) novel training techniques/schemes to reduce forgetting; (iii) benchmarks of different continual learning methods on more practical metrics.