PART 2: Design Paradigm of Combination of CNN and ViT: The shift from series to parallel

Time: 13:30, Friday, 22/07/2022

Venue: Online

Field: Machine Learning and Data Mining

Speakers: Duc Anh Nguyen, Nguyen Quoc Khanh

Affiliation: Pixta Vietnam

Talk abstract

Combining convolution with a vision transformer effectively encodes both local processing and global interaction. Recent work shows the benefit of combining them in series, either by using convolution at the beginning or by intertwining convolution into each transformer block. In this seminar, we will present Mobile-Former (CVPR 2022), which shifts the design paradigm from series to parallel and proposes a new network that runs MobileNet and a transformer in parallel with a two-way bridge in between.
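To make the parallel idea concrete, here is a minimal pure-Python sketch of one block with a two-way bridge. It is an illustration under stated assumptions, not the Mobile-Former implementation: `local_branch` is a hypothetical stand-in for the MobileNet branch (a simple neighbor average), `global_branch` is a stand-in for the transformer branch (plain self-attention over a few global tokens), and the bridge is modeled as single-head cross-attention with no learned projections. The names, the shared feature dimension, and the order of the four steps are all assumptions chosen for readability.

```python
import math
import random


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attend(queries, keys_values):
    """Single-head cross-attention with a residual connection.

    Assumption: no learned projections, and queries share the
    feature dimension of keys_values (a simplification).
    """
    d = len(queries[0])
    out = []
    for q in queries:
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d)
                  for k in keys_values]
        weights = softmax(scores)
        attended = [sum(w * kv[j] for w, kv in zip(weights, keys_values))
                    for j in range(d)]
        out.append([qi + ai for qi, ai in zip(q, attended)])
    return out


def local_branch(pixels):
    """Hypothetical stand-in for the convolutional branch:
    average each position with its immediate neighbors."""
    d = len(pixels[0])
    out = []
    for i in range(len(pixels)):
        window = pixels[max(0, i - 1):i + 2]
        out.append([sum(p[j] for p in window) / len(window)
                    for j in range(d)])
    return out


def global_branch(tokens):
    """Hypothetical stand-in for the transformer branch:
    self-attention among the global tokens."""
    return cross_attend(tokens, tokens)


def parallel_block(pixels, tokens):
    """One block: both branches run side by side, linked by a two-way bridge."""
    tokens = cross_attend(tokens, pixels)  # bridge: local features -> tokens
    tokens = global_branch(tokens)         # global interaction
    pixels = local_branch(pixels)          # local processing
    pixels = cross_attend(pixels, tokens)  # bridge: tokens -> local features
    return pixels, tokens


random.seed(0)
pixels = [[random.gauss(0, 1) for _ in range(4)] for _ in range(16)]
tokens = [[random.gauss(0, 1) for _ in range(4)] for _ in range(3)]
new_pixels, new_tokens = parallel_block(pixels, tokens)
print(len(new_pixels), len(new_tokens))  # prints: 16 3
```

In contrast to a series design, neither branch consumes the other's output as its only input: each keeps its own state (`pixels` or `tokens`) through the block, and information is exchanged only through the two bridge calls.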
