ChatTS is a time series multimodal LLM that focuses on understanding and reasoning about time series, much as vision/video/audio MLLMs do for their own modalities.
This repo provides the code, datasets, and model for the ChatTS paper, ChatTS: Aligning Time Series with LLMs via Synthetic Data for Enhanced Understanding and Reasoning.
Here is an example of a ChatTS application, which allows users to interact with an LLM to understand and reason about time series data:
ChatTS features native support for multivariate time series data of any length and value range. With ChatTS, you can easily understand and reason about both the shape features and the value features of a time series. ChatTS can also be integrated into existing LLM pipelines for more time series-related applications, leveraging existing inference frameworks such as vLLM. Check out the Case Studies section for more examples.
- 2025/01/01: We have released a new version of the ChatTS model, with enhanced CoT and question answering capability. Check for more information.
- 2024/12/30: An experimental version of vLLM support for ChatTS is available! Check demo_vllm.py for more information. (Note: This version is still under development and may not be stable.) We have also updated the ChatTS model implementation, which now supports kv_cache and AutoProcessor.
This repository provides several toolkits for generating synthetic data with the approaches introduced in ChatTS, as well as the evaluation code and evaluation datasets for reproduction:
- Toolkits for generating synthetic time series data and the corresponding attributes: chatts/ts_generator.py.
- Example code for generating a training dataset with pre-defined templates: chatts/generate_template_qa.py, which can be further used as seed QAs for TSEvol.
- Example code for generating a training dataset with LLMs: chatts/generate_llm_qa.py, which can be further used as seed QAs for TSEvol.
- Code implementation for TSEvol with the generated seed QAs: chatts/evol/evol_instruct.py.
- Code implementation for evaluation: evaluation/.
- Simple demos for inference: demo_hf.ipynb and demo_vllm.py.
- A trained ChatTS model (fine-tuned based on a modified version of QWen2.5-14B-Instruct), available on HuggingFace.
- Evaluation datasets, available on Zenodo.
- Training scripts for training your own model: ChatTS-Training.
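To make the idea of "synthetic time series data and the corresponding attributes" concrete, here is a minimal sketch of attribute-driven generation. It is not the API of chatts/ts_generator.py: the function name, the attribute names, and the value ranges are all hypothetical and chosen only for illustration. The point is that the attributes are sampled first and the series is built from them, so every series comes with exact labels.

```python
import math
import random


def generate_series(length=256, seed=0):
    """Return (series, attributes); the attribute schema is hypothetical."""
    rng = random.Random(seed)
    # Sample the attributes first, so the labels are known exactly.
    attributes = {
        "length": length,
        "trend_slope": rng.uniform(-0.05, 0.05),
        "period": rng.randint(16, 64),
        "amplitude": rng.uniform(1.0, 5.0),
        "noise_std": rng.uniform(0.05, 0.3),
        "shift_position": rng.randint(length // 4, 3 * length // 4),
        "shift_size": rng.uniform(-10.0, 10.0),
    }
    series = []
    for t in range(length):
        value = attributes["trend_slope"] * t  # linear trend
        value += attributes["amplitude"] * math.sin(
            2 * math.pi * t / attributes["period"]
        )  # seasonality
        value += rng.gauss(0.0, attributes["noise_std"])  # noise
        if t >= attributes["shift_position"]:
            value += attributes["shift_size"]  # local change: level shift
        series.append(value)
    return series, attributes


series, attributes = generate_series()
print(len(series), attributes["shift_position"])
```

Because the attributes are stored next to the series, they can later be turned into question-answer pairs without any manual labeling.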
- Basic requirements for model inference: python>=3.11, deepspeed, vllm==0.6.6.post1, torch==2.5.1, flash-attn (refer to requirements.txt).
- Download the evaluation datasets from Zenodo and put them under evaluation/dataset (evaluation/dataset/dataset_a.json and evaluation/dataset/dataset_b.json).
- Download the trained model weights from HuggingFace, extract them, and put all the extracted files under ckpt/ (ckpt/config.json, etc.).
- Note: ChatTS is trained on a 14B-sized base model, so you need a GPU with sufficient memory for inference. Additionally, the model requires Flash-Attention (https://github.com/Dao-AILab/flash-attention), so make sure that your GPU meets the installation requirements for Flash-Attention. Recommended GPUs: A100/A800.
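Before running inference or evaluation, it can help to confirm that the downloaded files ended up where the steps above expect them. The following is a small sketch of such a check, not a script shipped with the repository. It only looks for the three paths named above; the helper name find_missing_files is made up for this example.

```python
from pathlib import Path

# Paths named in the installation notes, relative to the repository root.
EXPECTED_FILES = [
    "ckpt/config.json",
    "evaluation/dataset/dataset_a.json",
    "evaluation/dataset/dataset_b.json",
]


def find_missing_files(repo_root="."):
    """Return the expected files that do not exist under repo_root."""
    root = Path(repo_root)
    return [path for path in EXPECTED_FILES if not (root / path).is_file()]


# Run the check against the current working directory.
missing = find_missing_files(".")
if missing:
    print("Missing files:", ", ".join(missing))
else:
    print("All expected files are in place.")
```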
- Follow the steps in Installation to download the trained ChatTS model and place it under ckpt.
- The ChatTS model can be loaded directly using the transformers library. Refer to demo_hf.ipynb for more information.
- About sp Encoding. To facilitate batched input of variable-length time series, we adopt a method named sp encoding when encoding the time series. For each time series data point, an additional numerical value of 1.0 is added as a mask. For convenience, we provide a Processor, which can be loaded with AutoProcessor in transformers, to normalize and convert the time series and text (Value-Preserved Time Series Encoding). Please refer to demo_hf.ipynb for more information about its usage.
- An example usage of ChatTS (with HuggingFace):
from transformers import AutoModelForCausalLM, AutoTokenizer, AutoProcessor
import torch
import numpy as np
# Load the model, tokenizer and processor
model = AutoModelForCausalLM.from_pretrained("./ckpt", trust_remote_code=True, device_map=0, torch_dtype='float16')
tokenizer = AutoTokenizer.from_pretrained("./ckpt", trust_remote_code=True)
processor = AutoProcessor.from_pretrained("./ckpt", trust_remote_code=True, tokenizer=tokenizer)
# Create time series and prompts
timeseries = np.sin(np.arange(256) / 10) * 5.0
timeseries[100:] -= 10.0
prompt = f"I have a time series length of 256: <ts><ts/>. Please analyze the local changes in this time series."
# Apply Chat Template
prompt = f"<|im_start|>system\nYou are a helpful assistant.<|im_end|><|im_start|>user\n{prompt}<|im_end|><|im_start|>assistant\n"
# Convert to tensor
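inputs = processor(text=[prompt], timeseries=[timeseries], padding=True, return_tensors="pt")
# Move the inputs to the same device as the model
inputs = {k: v.to(model.device) for k, v in inputs.items()}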
# Model Generate
outputs = model.generate(**inputs, max_new_tokens=300)
print(tokenizer.decode(outputs[0][len(inputs['input_ids'][0]):], skip_special_tokens=True))
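The mask idea behind sp encoding, mentioned in the notes above, can be illustrated with a short sketch. This is not the repository's implementation, and it leaves out the normalization step that the Processor performs. The only detail taken from the text is that each real data point carries an extra value of 1.0. Filling padded positions with a value of 0.0 and a mask of 0.0 is an assumption made here to show how such a mask lets series of different lengths share one batch.

```python
def encode_with_mask(batch, pad_value=0.0):
    """Pad variable-length series to one length and attach a mask channel.

    Each real point becomes [value, 1.0]. Padded positions become
    [pad_value, 0.0]; this padding scheme is an assumption, not repo behavior.
    """
    if not batch:
        return []
    max_len = max(len(series) for series in batch)
    encoded = []
    for series in batch:
        pairs = [[float(value), 1.0] for value in series]
        pairs += [[pad_value, 0.0] for _ in range(max_len - len(series))]
        encoded.append(pairs)
    return encoded


batch = [[0.5, 1.5, 2.5], [4.0]]
encoded = encode_with_mask(batch)
print(encoded[1])  # [[4.0, 1.0], [0.0, 0.0], [0.0, 0.0]]
```

With the mask channel in place, a real value of 0.0 stays distinguishable from padding, which a plain zero-padded batch cannot guarantee.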
Since vLLM lacks native support for the ChatTS model, we provide a patch that enables vLLM inference. Before using vLLM to load the model, make sure that your code includes import chatts.vllm.chatts_vllm to register the ChatTS model in vLLM. Follow these steps to use vLLM to load ChatTS:
- Install vllm==0.6.6.post1 (please make sure that you install this exact version, as vLLM's multimodal APIs change frequently).
- Please refer to demo_vllm.py for detailed usage.
A simple example of using vLLM to load ChatTS:
import chatts.vllm.chatts_vllm
from vllm import LLM, SamplingParams
# Load the model
ctx_length = 4096  # Maximum context length in tokens (illustrative value, not taken from the repo; adjust to your setup)
language_model = LLM(model="./ckpt", trust_remote_code=True, max_model_len=ctx_length, tensor_parallel_size=1, gpu_memory_utilization=0.95, limit_mm_per_prompt={"timeseries": 50})
# Create time series (np.ndarray) and prompts (with the chat template applied)
ts1, ts2 = ...
prompt = ...
# Model Inference
outputs = language_model.generate([{
    "prompt": prompt,
    "multi_modal_data": {"timeseries": [ts1, ts2]}
}], sampling_params=SamplingParams(max_tokens=300))
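The vLLM example expects a prompt with the chat template already applied. A small helper along the following lines can build it. This is a sketch rather than code from the repository: the template string is the one used in the HuggingFace example, while the helper name build_chat_prompt and the check that each `<ts><ts/>` placeholder corresponds to exactly one input series are assumptions made here.

```python
TS_PLACEHOLDER = "<ts><ts/>"


def build_chat_prompt(question, num_series, system="You are a helpful assistant."):
    """Wrap a question in the chat template used in the HuggingFace example.

    Assumes one <ts><ts/> placeholder per time series passed to the model.
    """
    found = question.count(TS_PLACEHOLDER)
    if found != num_series:
        raise ValueError(
            f"Expected {num_series} '{TS_PLACEHOLDER}' placeholders, found {found}."
        )
    return (
        f"<|im_start|>system\n{system}<|im_end|>"
        f"<|im_start|>user\n{question}<|im_end|>"
        f"<|im_start|>assistant\n"
    )


question = (
    "I have two time series. TS1: <ts><ts/>. TS2: <ts><ts/>. "
    "Please compare their local changes."
)
prompt = build_chat_prompt(question, num_series=2)
print(prompt)
```

Failing early on a placeholder mismatch is cheaper than discovering it after the model has been loaded onto the GPU.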
- QA Generation with Templates. Use python3 -m chatts.generate_template_qa to generate a training dataset with pre-defined templates.
- QA Generation with LLMs. You need a downloaded LLM that can be loaded with vLLM to perform this step. Set [LOCAL_LLM_PATH] in chatts/generate_llm_qa.py to a local LLM model (e.g., QWen2.5-32B-Instruct, NOT the ChatTS model) and set num_gpus and gpu_per_model accordingly. Use python3 -m chatts.generate_llm_qa to generate a training dataset with LLMs.
- TSEvol. You need a downloaded LLM that can be loaded with vLLM to perform this step. The datasets generated in the two steps above are used as seed QAs in TSEvol, so make sure that you have generated them successfully before running TSEvol. Then, follow these steps in chatts/evol/evol_instruct.py:
  - Set [LOCAL_LLM_PATH] in evol_instruct.py to the path of a local LLM model (e.g., QWen2.5-32B-Instruct, NOT the ChatTS model) for QA generation, and set num_gpus and gpu_per_model accordingly.
  - Run python3 -m chatts.evol.evol_instruct.
  - The output will be saved to the file specified in OUTPUT_FILE.
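As a rough illustration of what template-based QA generation means, the sketch below fills a fixed question and answer template from the attributes of a synthetic series. It does not reproduce chatts/generate_template_qa.py: the function name, the attribute keys, and the wording of the template are hypothetical.

```python
def make_template_qa(attributes):
    """Build one QA pair from known attributes (hypothetical schema)."""
    direction = "upward" if attributes["shift_size"] > 0 else "downward"
    question = (
        f"I have a time series of length {attributes['length']}: <ts><ts/>. "
        "Please describe any sudden level shift in this time series."
    )
    answer = (
        f"There is a {direction} level shift of about "
        f"{abs(attributes['shift_size']):.1f} near point "
        f"{attributes['shift_position']}."
    )
    return {"question": question, "answer": answer, "attributes": attributes}


# Attributes matching a series whose values drop by 10 from point 100 onward.
qa = make_template_qa({"length": 256, "shift_position": 100, "shift_size": -10.0})
print(qa["answer"])
# There is a downward level shift of about 10.0 near point 100.
```

Since the answer is derived from the same attributes that produced the series, the pair is correct by construction, which is what makes such pairs usable as seed QAs.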
- We provide a simple script for inference of ChatTS (chatts/inference_tsmllm_deepspeed.py) with deepspeed. After installing deepspeed, set WORKDIR (the absolute path of the current directory) and the evaluation dataset in the script. Then, run the following command to perform model inference:
deepspeed --num_gpus [YOUR_NUM_GPUS] --master_port 12345 chatts/inference_tsmllm_deepspeed.py
You should find the inference results under the exp/ folder, which will be further used for evaluation.
- Install ragas==0.1.9 (https://github.com/explodinggradients/ragas), which is used for evaluating the inductive reasoning results.
- Set the API_KEY and OPENAI_URL in evaluation/ragas/config/config.toml (refer to https://platform.openai.com/docs/api-reference).
- Run python3 -m evaluation.evaluate_tsmllm_models to evaluate ChatTS (make sure you have run the model inference first).
- We also provide a simple demo for evaluating the performance of text-based GPT models. After setting your API_KEY and OPENAI_URL in evaluation/evaluate_gpt_text_models.py, run python3 -m evaluation.evaluate_gpt_text_models to obtain the evaluation results of the text-based GPT model.
- We provide a simple script for fine-tuning your own TS-MLLM models: https://github.com/xiezhe-24/ChatTS-Training (modified based on LLaMA-Factory). Refer to this repository for more details.
- We provide the two evaluation datasets we gathered, as described in the paper, in the evaluation/dataset folder. Each sample in these datasets has several parts: timeseries, the time series data itself; question, the query about the time series; answer, the standard answers in text form, provided for reference only; attributes, the structured labels used to evaluate results; and ability_types, the types of tasks the question involves. Please note: to cut down on evaluation costs, we have combined different questions about the same time series into one question, and we use numbering to tell them apart. As a result, the actual number of questions may be larger than the number of timeseries entries. Also note that some inductive reasoning and alignment tasks are grouped together in one question, because inductive reasoning tasks often require explaining the physical meanings of time series attributes.
- The MCQ2 dataset is sourced from a third party and is open-source. However, due to licensing restrictions, we are unable to provide it within this repository. You can download it directly from https://github.com/behavioral-data/TSandLanguage.
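To get a feel for the sample structure described above, the following sketch loads a dataset and counts how often each ability type occurs. It rests on two assumptions that the text does not confirm: that each dataset file is a JSON list of sample dictionaries, and that ability_types is a list of strings. The in-memory samples and the ability type names used here are made up for illustration.

```python
import json
from collections import Counter


def load_samples(path):
    """Load a dataset file, assumed to be a JSON list of sample dicts."""
    with open(path, "r", encoding="utf-8") as f:
        return json.load(f)


def count_ability_types(samples):
    """Count how many samples involve each ability type."""
    counts = Counter()
    for sample in samples:
        counts.update(sample["ability_types"])
    return counts


# Hypothetical samples with the five fields described above.
# For the real data, use: samples = load_samples("evaluation/dataset/dataset_a.json")
samples = [
    {
        "timeseries": [[1.0, 2.0, 3.0]],
        "question": "1. What is the trend? 2. Is there a spike?",
        "answer": "1. Increasing. 2. No.",
        "attributes": {},
        "ability_types": ["trend", "local"],
    },
    {
        "timeseries": [[3.0, 2.0, 1.0]],
        "question": "1. What is the trend?",
        "answer": "1. Decreasing.",
        "attributes": {},
        "ability_types": ["trend"],
    },
]

counts = count_ability_types(samples)
print(counts["trend"], counts["local"])  # 2 1
```

Because several numbered questions share one question field, counts taken per ability type will usually differ from the number of samples.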
In ChatTS, we mainly focus on understanding and reasoning about time series, just as vision/video/audio MLLMs do for their modalities, rather than on time series prediction, anomaly detection, and classification tasks.
You can try more application scenarios of ChatTS by modifying the time series and the question text in demo_hf.ipynb!
- QWen (https://github.com/QwenLM/Qwen2.5)
- DeepSpeed (https://www.deepspeed.ai/)
- RAGAS (https://github.com/explodinggradients/ragas)
- vLLM (https://github.com/vllm-project/vllm)
- Flash Attention (https://github.com/Dao-AILab/flash-attention)
If you discover a potential security issue in this project, or think you may have discovered a security issue, we ask that you notify Bytedance Security via our security center or vulnerability reporting email.
Please do not create a public GitHub issue for a security vulnerability.
This project is licensed under the MIT License.
@article{xie2024chatts,
  title={ChatTS: Aligning Time Series with LLMs via Synthetic Data for Enhanced Understanding and Reasoning},
  author={Xie, Zhe and Li, Zeyan and He, Xiao and Xu, Longlong and Wen, Xidao and Zhang, Tieying and Chen, Jianjun and Shi, Rui and Pei, Dan},
  journal={arXiv preprint arXiv:2412.03104},
  year={2024}
}