This page presents data visualisations based on the Nature Machine Intelligence Perspective paper "Towards deployment-centric multimodal AI beyond vision and language". A preprint version is available at arXiv:2504.03603.
The analysis covers arXiv preprints from 2019 to 2025 and highlights emerging research trends in multimodal AI, using data derived from the Kaggle arXiv Dataset. The data is updated annually, with the latest update in January 2026.
The reported numbers should be interpreted as indicators of overall trends rather than exact totals.
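To give a rough sense of how yearly counts of this kind could be derived from the Kaggle arXiv Dataset, here is a minimal Python sketch. It is not the pipeline used for the paper or for this page. It assumes the dataset is a JSON Lines file in which each record has `title`, `abstract`, and `versions` fields, with an RFC 2822 style `created` date on each version. The keyword pattern and the function names `first_version_year` and `count_by_year` are hypothetical choices made for illustration.

```python
import json
import re
from collections import Counter
from email.utils import parsedate_to_datetime

# Illustrative keyword filter (an assumption, not the paper's selection criteria).
KEYWORD = re.compile(r"multi-?modal", re.IGNORECASE)


def first_version_year(record):
    """Return the year in which the first version of a preprint was created."""
    # Assumed format, e.g. "Mon, 2 Apr 2007 19:18:42 GMT"
    created = record["versions"][0]["created"]
    return parsedate_to_datetime(created).year


def count_by_year(lines, start=2019, end=2025):
    """Count keyword-matching preprints per year from JSON Lines input."""
    counts = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue
        record = json.loads(line)
        year = first_version_year(record)
        if not (start <= year <= end):
            continue
        text = f"{record.get('title', '')} {record.get('abstract', '')}"
        if KEYWORD.search(text):
            counts[year] += 1
    # Sort by year so the result reads as a time series.
    return dict(sorted(counts.items()))
```

With a local copy of the metadata file (the path below is an assumed name), the function can be applied by passing the open file handle, which yields one line at a time: `with open("arxiv-metadata-oai-snapshot.json") as f: print(count_by_year(f))`. A simple keyword match of this kind both over- and under-counts, which is one reason such figures are better read as trend indicators than as exact totals.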
Star the GitHub repository to support the project.
Liu, X., Zhang, J., Zhou, S. et al. Towards deployment-centric multimodal AI beyond vision and language. Nature Machine Intelligence 7, 1612-1624 (2025). https://doi.org/10.1038/s42256-025-01116-5
@article{liu2025towards,
title={Towards deployment-centric multimodal AI beyond vision and language},
author={Liu, Xianyuan and Zhang, Jiayang and Zhou, Shuo and van der Plas, Thijs L. and Vijayaraghavan, Avish and Grishina, Anastasiia and Zhuang, Mengdie and Schofield, Daniel and Tomlinson, Christopher and others},
journal={Nature Machine Intelligence},
volume={7},
pages={1612--1624},
year={2025},
doi={10.1038/s42256-025-01116-5}
}