1/6
Welcome Yi-9B
> Trained on 3 trillion tokens.
> Pretty good at coding, math, and common-sense reasoning.
> Open-access weights.
> Bilingual: English & Chinese.
2/6
Benchmarks look pretty strong!
4/6
Note: this is a base model, so you'd need to fine-tune it for chat-specific use cases.
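If you do want a chat model out of it, a minimal supervised fine-tuning sketch with TRL's SFTTrainer could look like the below. The dataset name and hyperparameters are placeholders (not from this thread), and TRL's exact argument names vary a bit across versions.

from datasets import load_dataset
from trl import SFTConfig, SFTTrainer

# Placeholder dataset: swap in your own chat/instruction data with a "text" column
dataset = load_dataset("your-org/your-chat-dataset", split="train")

trainer = SFTTrainer(
    model="01-ai/Yi-9B",                # SFTTrainer can load the model from its hub id
    train_dataset=dataset,
    args=SFTConfig(
        output_dir="yi-9b-sft",         # illustrative output path
        per_device_train_batch_size=1,  # a 9B model is memory-hungry; start small
    ),
)
trainer.train()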
5/6
Using it in Transformers is easy, too!
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_DIR = "01-ai/Yi-9B"

# torch_dtype="auto" keeps the checkpoint's native precision;
# device_map="auto" (needs accelerate) places the weights on available GPUs
model = AutoModelForCausalLM.from_pretrained(MODEL_DIR, torch_dtype="auto", device_map="auto")
tokenizer = AutoTokenizer.from_pretrained(MODEL_DIR, use_fast=False)

# It's a base model, so prompt it as a completion, not a chat turn
input_text = "# write the quick sort algorithm"
inputs = tokenizer(input_text, return_tensors="pt").to(model.device)

# max_new_tokens caps only the generated part (max_length would count the prompt too)
outputs = model.generate(**inputs, max_new_tokens=256)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
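If VRAM is tight, loading in 4-bit via bitsandbytes is one option. A hedged variant, assuming the bitsandbytes package is installed (the compute dtype choice here is illustrative):

import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

# Quantize the weights to 4-bit on load to cut memory roughly 4x vs fp16
bnb_config = BitsAndBytesConfig(load_in_4bit=True, bnb_4bit_compute_dtype=torch.float16)
model = AutoModelForCausalLM.from_pretrained(
    "01-ai/Yi-9B",
    quantization_config=bnb_config,
    device_map="auto",
)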
6/6
I’ve generally found Chinese LLMs to fly way more under the radar than typical Western ones.