Original title: scELMo: Embeddings from Language Models are Good Learners for Single-cell Data Analysis
Authors: Tianyu Liu, Tianqi Chen, Wangjie Zheng, Xiao Luo, Hongyu Zhao
This article introduces a new approach called scELMo for analyzing single-cell data. Unlike previous methods, scELMo leverages Large Language Models (LLMs) to build a foundation model. The authors use LLMs such as GPT-3.5 to generate textual descriptions and embeddings for metadata (e.g., gene names and cell types). They combine these embeddings with the raw expression data to perform tasks such as cell clustering, batch effect correction, and cell-type annotation. What makes scELMo distinctive is that it does not require training a new model for these tasks. Additionally, scELMo can handle more challenging tasks, such as in-silico treatment analysis and modeling perturbation information, through a fine-tuning framework. The authors highlight that scELMo has a lighter architecture and lower resource requirements than existing single-cell foundation models while still delivering strong performance. This approach shows promise for developing domain-specific foundation models.
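To make the "combine embeddings with raw data" step concrete, here is a minimal sketch of one simple way such a combination can work: representing each cell as an expression-weighted average of per-gene text embeddings. This is an illustrative simplification, not the paper's exact pipeline; the gene embeddings below are toy values standing in for LLM-derived vectors, and the function name `cell_embeddings` is hypothetical.

```python
import numpy as np

def cell_embeddings(expr, gene_emb):
    """Combine expression data with LLM-derived gene embeddings.

    expr:     (n_cells, n_genes) raw expression matrix
    gene_emb: (n_genes, d) text embeddings, one per gene
    Returns   (n_cells, d) cell embeddings via per-cell weighted averaging.
    """
    # Normalize each cell's expression so weights sum to 1 per cell
    weights = expr / expr.sum(axis=1, keepdims=True)
    # Each cell embedding is a convex combination of gene embeddings
    return weights @ gene_emb

# Toy example: 3 cells, 4 genes, 2-dimensional embeddings (illustrative values)
expr = np.array([[5., 0., 1., 0.],
                 [0., 3., 0., 2.],
                 [1., 1., 1., 1.]])
gene_emb = np.array([[1., 0.],
                     [0., 1.],
                     [1., 1.],
                     [0., 0.]])
cells = cell_embeddings(expr, gene_emb)
print(cells.shape)  # (3, 2): one low-dimensional vector per cell
```

The resulting cell vectors can then be fed to standard tools (clustering, nearest-neighbor annotation) without training a new model, which is the zero-training property the summary emphasizes.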
Original article: https://www.biorxiv.org/content/10.1101/2023.12.07.569910v1