An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale
arXiv 📆 2021-06-03

References
https://juejin.cn/post/7214489453399277624
https://bbs.huaweicloud.com/blogs/264175
https://mathpretty.com/15541.html
https://blog.csdn.net/qq_35591253/article/details/131994377
https://zhengyu.tech/archives/vitvisiontransformer-li-jie
https://zhuanlan.zhihu.com/p/369781582