Appears to be an exhaustive survey of whether open-source large language models are catching up to ChatGPT (Chen et al., 2023).

(Zhuang et al., 2024)

The paper presents ChuXin, an entirely open-source language model with 1.6 billion parameters. Unlike the majority of works that open-source only the model weights and architecture, the authors release everything needed to train the model, including the training data, the training process, and the evaluation code, with the goal of strengthening the open research community, fostering transparency, and enabling a new wave of innovation in language modeling. They also extend the context length to 1M tokens through lightweight continual pretraining and demonstrate strong needle-in-a-haystack retrieval performance. The weights for both the base and long-context models are available on Hugging Face for download and use.
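A minimal sketch of the "download and use" step with the Hugging Face transformers library is below. The repository id is an assumption (the abstract does not name it), so check the actual model card before running.

```python
# Minimal sketch: loading the ChuXin weights via Hugging Face transformers.
# The repository id below is an assumption, not given in the abstract;
# replace it with the id listed on the authors' Hugging Face page.
from transformers import AutoModelForCausalLM, AutoTokenizer

repo_id = "chuxin-llm/Chuxin-1.6B-Base"  # assumed repo id for the 1.6B base model

tokenizer = AutoTokenizer.from_pretrained(repo_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(repo_id, trust_remote_code=True)

# Generate a short continuation to confirm the weights load and run.
inputs = tokenizer("Open-source language models are", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```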

Chen, H., Jiao, F., Li, X., Qin, C., Ravaut, M., Zhao, R., Xiong, C., & Joty, S. (2023). ChatGPT’s One-Year Anniversary: Are Open-Source Large Language Models Catching Up?
Zhuang, X., Jiang, Y., He, Q., & Wu, Z. (2024). ChuXin: 1.6B Technical Report. https://arxiv.org/abs/2405.04828