Vicuna: The Open-Source Chatbot Rivaling GPT-4 with Impressive Performance

Vicuna: A New Open-Source Chatbot Competing with GPT-4

In the world of large language models (LLMs), chatbot systems have seen significant advancements, with OpenAI’s ChatGPT being a prime example. However, the limited disclosure of ChatGPT’s training and architecture details has held back research and open-source innovation. Enter Vicuna-13B, an open-source chatbot inspired by the Meta LLaMA and Stanford Alpaca projects, built on an enhanced dataset and user-friendly, scalable infrastructure. Fine-tuned from a LLaMA base model on user-shared conversations from ShareGPT.com, Vicuna-13B performs competitively against other open-source models such as Stanford Alpaca.

Challenges in Evaluating AI Chatbots

Evaluating AI chatbots is no easy feat, as it involves assessing language understanding, reasoning, and context awareness. As AI chatbots become more advanced, existing open benchmarks may no longer be sufficient. For example, self-instruct, the evaluation dataset used for Stanford’s Alpaca, can be answered well by state-of-the-art chatbots, which makes it difficult for humans to discern performance differences. Other limitations include contamination between training and test data and the high cost of creating new benchmarks. To address these issues, we propose an evaluation framework that uses GPT-4 as a judge to automate chatbot performance assessment.
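
To make the idea of a GPT-4-based judge more concrete, here is a minimal sketch of a pairwise comparison harness. It is an illustration under stated assumptions, not the actual evaluation code: the prompt wording, the 1 to 10 scale, the reply format, and the names build_judge_prompt, parse_scores, and compare are all hypothetical. The judge argument stands in for whatever function sends a prompt to GPT-4 and returns its text reply; no real API call is shown.

```python
import re
from typing import Callable, Tuple


def build_judge_prompt(question: str, answer_a: str, answer_b: str) -> str:
    """Assemble a pairwise comparison prompt (wording is illustrative only)."""
    return (
        "You are comparing two AI assistant answers to the same question.\n\n"
        f"[Question]\n{question}\n\n"
        f"[Assistant A]\n{answer_a}\n\n"
        f"[Assistant B]\n{answer_b}\n\n"
        "Rate each answer from 1 to 10. Reply with the two scores on the "
        "first line, separated by a space, then a short explanation."
    )


def parse_scores(reply: str) -> Tuple[float, float]:
    """Read the two scores from the first line of the judge's reply."""
    lines = reply.strip().splitlines()
    if not lines:
        raise ValueError("judge reply is empty")
    numbers = re.findall(r"\d+(?:\.\d+)?", lines[0])
    if len(numbers) < 2:
        raise ValueError(f"expected two scores, got: {lines[0]!r}")
    return float(numbers[0]), float(numbers[1])


def compare(
    question: str,
    answer_a: str,
    answer_b: str,
    judge: Callable[[str], str],  # hypothetical: sends a prompt, returns text
) -> str:
    """Return 'A', 'B', or 'tie' depending on which answer scored higher."""
    reply = judge(build_judge_prompt(question, answer_a, answer_b))
    score_a, score_b = parse_scores(reply)
    if score_a > score_b:
        return "A"
    if score_b > score_a:
        return "B"
    return "tie"
```

Passing the judge in as a plain function keeps the harness testable with a stub and independent of any particular client library. In practice one would likely also run each pair twice with the answer order swapped, since a model-based judge may favor whichever answer it sees first.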

Model Details

Vicuna is an open-source chatbot trained by fine-tuning LLaMA on user-shared conversations collected from ShareGPT. It is an auto-regressive language model, based on the transformer architecture. Vicuna was trained between March 2023 and April 2023, with development led by a team from UC Berkeley, CMU, Stanford, and UC San Diego.
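
For readers unfamiliar with the term, "auto-regressive" means the model produces one token at a time, each conditioned on everything generated so far. The toy loop below sketches that idea only. The lookup table, the generate function, and the "<eos>" marker are hypothetical stand-ins; a real model such as Vicuna replaces toy_next_token with a transformer forward pass over the full context.

```python
from typing import Callable, Dict, List

# Hypothetical stand-in for a trained model: maps the last token to the next.
TOY_TABLE: Dict[str, str] = {"hello": "world", "world": "<eos>"}


def toy_next_token(context: List[str]) -> str:
    """Pick the next token from the toy table, ending on unknown tokens."""
    return TOY_TABLE.get(context[-1], "<eos>")


def generate(
    prompt: List[str],
    next_token: Callable[[List[str]], str],
    max_new_tokens: int = 10,
    eos: str = "<eos>",
) -> List[str]:
    """Append one token per step until the end marker or the length limit."""
    tokens = list(prompt)
    for _ in range(max_new_tokens):
        token = next_token(tokens)  # conditioned on all tokens so far
        if token == eos:
            break
        tokens.append(token)
    return tokens
```

Calling generate(["hello"], toy_next_token) feeds the growing token list back into the predictor at every step, which is the defining property of auto-regressive generation.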

Intended Use and Users

The primary intended use of Vicuna is research on large language models and chatbots. Its primary intended users are researchers and hobbyists in natural language processing, machine learning, and artificial intelligence.

More Models

Here you can find more AI models: https://huggingface.co/models?other=llama&p=1&sort=downloads

See a comparison and benchmark of AI models: https://lmsys.org/blog/2023-05-03-arena/