Cohere open-sources a 35B model: RAG and tool-use capabilities surpass Mixtral, with Chinese support
Author: NLP前沿 Source: NLP前沿
https://txt.cohere.com/command-r/
https://huggingface.co/CohereForAI/c4ai-command-r-v01
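1. RAG performance
Across multiple datasets it is far ahead of the Mixtral MoE model; paired with Cohere's own embedding and rerank models, its lead over open-source models grows even larger.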
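To make the "embedding + rerank" setup concrete, here is a minimal sketch of a two-stage retrieval pipeline: a cheap similarity search shortlists candidates, and a second scorer reorders the shortlist. Everything here is an illustrative assumption: the bag-of-words `embed`, the word-overlap `rerank_score`, and the sample documents are toy stand-ins, not Cohere's embedding or rerank models.

```python
import math
from collections import Counter


def embed(text: str) -> Counter:
    """Toy 'embedding': a bag-of-words count vector (stand-in for a real embedding model)."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[w] * b[w] for w in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def rerank_score(query: str, doc: str) -> float:
    """Toy reranker: fraction of query words found in the document (stand-in for a real reranker)."""
    q = set(query.lower().split())
    return len(q & set(doc.lower().split())) / len(q) if q else 0.0


def retrieve_then_rerank(query, docs, k_retrieve=3, k_final=2):
    # Stage 1: cheap vector similarity over the whole corpus.
    q_vec = embed(query)
    candidates = sorted(docs, key=lambda d: cosine(q_vec, embed(d)), reverse=True)[:k_retrieve]
    # Stage 2: re-score only the shortlist and keep the best few.
    return sorted(candidates, key=lambda d: rerank_score(query, d), reverse=True)[:k_final]


docs = [
    "Emperor penguins are the tallest penguins",
    "Penguins live in the southern hemisphere",
    "Mixtral is a mixture of experts model",
    "The tallest building is in Dubai",
]
top_docs = retrieve_then_rerank("tallest penguins", docs)
print(top_docs)
```

In a real RAG setup, the passages that survive reranking would be handed to the generator model as its grounding documents.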
2. Tool use
Tool-use capability is slightly better than Mixtral's and far ahead of GPT-3.5's.
3. Multilingual capability
Supports English, French, Spanish, Italian, German, Brazilian Portuguese, Japanese, Korean, Simplified Chinese, and Arabic.
4. Long-context capability
It achieves an all-green result on the needle-in-a-haystack test.
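For readers unfamiliar with the needle-in-a-haystack test: a short fact (the needle) is buried at varying depths inside filler text of varying lengths, and the model is asked to retrieve it. A "green" cell means the model answered correctly for that length/depth pair. A minimal sketch of how such a grid could be built is below. The filler sentence, the needle, and all function names are illustrative assumptions, not Cohere's evaluation harness, and lengths are counted in words rather than tokens for simplicity.

```python
from itertools import product

FILLER = "The quick brown fox jumps over the lazy dog."  # hypothetical filler sentence
NEEDLE = "The secret passphrase is 'blue-penguin-42'."   # hypothetical needle
QUESTION = "What is the secret passphrase?"


def build_haystack(n_words: int, depth: float) -> str:
    """Return n_words of filler with NEEDLE inserted at a relative depth (0.0 to 1.0)."""
    filler_words = FILLER.split()
    # Cycle through the filler sentence until we have n_words words.
    words = [filler_words[i % len(filler_words)] for i in range(n_words)]
    # Insert the needle at the requested relative position.
    position = int(len(words) * depth)
    return " ".join(words[:position] + [NEEDLE] + words[position:])


def build_grid(lengths, depths):
    """Yield (length, depth, prompt) for every cell of the test grid."""
    for n_words, depth in product(lengths, depths):
        context = build_haystack(n_words, depth)
        yield n_words, depth, f"{context}\n\nQuestion: {QUESTION}"


def is_green(model_answer: str) -> bool:
    """A cell counts as 'green' if the answer contains the hidden fact."""
    return "blue-penguin-42" in model_answer


grid = list(build_grid(lengths=[100, 1000], depths=[0.0, 0.5, 1.0]))
print(len(grid))  # 6 cells: 2 lengths x 3 depths
```

Each prompt in `grid` would be sent to the model, and `is_green` applied to the reply to color the corresponding cell.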
5. License
CC-BY-NC; commercial use is not permitted.
To pad out the word count, here is a code sample that renders the tool-use prompt:
from transformers import AutoTokenizer

model_id = "CohereForAI/c4ai-command-r-v01"
# trust_remote_code=True loads the tokenizer code shipped with the model repository
tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)

# Define the conversation input:
conversation = [
    {"role": "user", "content": "What's the biggest penguin in the world?"}
]

# Define the tools available for the model to use:
tools = [
    {
        "name": "internet_search",
        "description": "Returns a list of relevant document snippets for a textual query retrieved from the internet",
        "parameter_definitions": {
            "query": {
                "description": "Query to search the internet with",
                "type": "str",
                "required": True
            }
        }
    },
    {
        "name": "directly_answer",
        "description": "Calls a standard (un-augmented) AI chatbot to generate a response given the conversation history",
        "parameter_definitions": {}
    }
]

# Render the tool-use prompt as a string (tokenize=False returns text, not token IDs):
tool_use_prompt = tokenizer.apply_tool_use_template(
    conversation,
    tools=tools,
    tokenize=False,
    add_generation_prompt=True,
)
print(tool_use_prompt)
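The snippet above only renders the prompt; the model's reply still has to be parsed and dispatched to the named tools. The sketch below shows one way that step could look, assuming the reply has the shape "Action:" followed by a fenced JSON list of objects with "tool_name" and "parameters" keys. That output shape, the sample completion, `parse_actions`, and the stub tool functions are all illustrative assumptions and should be checked against the model card before use.

```python
import json
import re

FENCE = "`" * 3  # built programmatically to avoid writing a literal fence inside this block

# Hypothetical completion, assumed to follow the "Action: <fenced JSON list>" shape.
sample_completion = (
    "Action: " + FENCE + "json\n"
    "[\n"
    '    {"tool_name": "internet_search",\n'
    '     "parameters": {"query": "biggest penguin in the world"}}\n'
    "]\n" + FENCE
)


def parse_actions(completion: str) -> list:
    """Extract the JSON list of tool calls from a completion; return [] if none is found."""
    pattern = r"Action:\s*" + FENCE + r"json\s*(.*?)" + FENCE
    match = re.search(pattern, completion, flags=re.DOTALL)
    if match is None:
        return []
    return json.loads(match.group(1))


# Stub tools standing in for real implementations (hypothetical).
def internet_search(query: str) -> list:
    return [f"(stub) search results for: {query}"]


def directly_answer() -> list:
    return []


TOOLS = {"internet_search": internet_search, "directly_answer": directly_answer}

actions = parse_actions(sample_completion)
for action in actions:
    # Look up the tool by name and call it with the model-supplied parameters.
    tool = TOOLS[action["tool_name"]]
    print(tool(**action["parameters"]))
```

Keeping the tools in a name-to-function dictionary mirrors the `name` fields in the `tools` list, so an unknown tool name fails loudly with a `KeyError` instead of being silently ignored.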