Machine Learning Engineer (I)
Paid internship at FotoNation · Starts after academic term
Categories:
  • Artificial Intelligence
City:
  • București

Here at Xperi we solve Computer Vision problems for in-car monitoring applications such as Driver and Occupancy Monitoring Systems. A critical part of researching and developing such solutions is creating machine learning algorithms that provide a deep understanding of the car scene. At the core of these solutions lies the Object Detection and Segmentation module, which is responsible for properly localizing all relevant body parts and objects in the scene.

Inspired by the tremendous success of transformer-based architectures in NLP, the Computer Vision community has recently adopted transformer networks to tackle these tasks from a very different perspective. Successful examples include DETR and DeformableDETR for object detection, and VisTR for video instance segmentation. Although transformer networks have achieved remarkable results on generic datasets, little is known about their usefulness in a dense and highly constrained environment such as a car interior.

Your job will be to study and experiment with transformer-based solutions for object detection and instance segmentation on in-car data, adapt them to the particularities of the in-car domain, and come up with novel ideas for further improvements.

Skills required:

  • Software development, testing and debugging
  • Problem solving and logical thinking
  • Python proficiency
  • Basic understanding of Deep Learning concepts

Nice to have:

  • Experience with PyTorch or other deep learning frameworks

References:

  • DETR: Carion, Nicolas, et al. "End-to-end object detection with transformers." European Conference on Computer Vision. Springer, Cham, 2020.
  • DeformableDETR: Zhu, Xizhou, et al. "Deformable DETR: Deformable Transformers for End-to-End Object Detection." arXiv preprint arXiv:2010.04159 (2020).
  • VisTR: Wang, Yuqing, et al. "End-to-End Video Instance Segmentation with Transformers." arXiv preprint arXiv:2011.14503 (2020).