Open source ChatGPT alternatives on local machines — own AI model at your home

Jul 2, 2023

While some people are trying to find arguments against AI in various areas to show that it is useless, others think about how to use AI wisely. Whose efforts are more constructive? Guess who will be the winner in the end.

This is how Bing Image Creator portrays llama, alpaca, and vicuña, but please do not take this image too seriously

Companies and developers should not wait for ready-to-use AI products. This transformation is happening now. Only those who adapt best to change will survive. So let’s see what we can do ourselves in the area of AI with our home computers.

There is a myth that AI is only for big companies with large clusters of thousands of nodes. It is simply not true. There are models suitable even for a Raspberry Pi.

We can sit and complain about ChatGPT not being able to provide correct code for testing. We can point out that it cannot design correct test steps. But what have we actually done to teach AI how to do it?

If we want a person to do something, we usually accept that we have to instruct them first. Why should we treat AI differently? Let’s teach it first, and then we will see whether our lessons were worth it.

ZOO OF MODELS

ChatGPT — beginning of hype, far away from open source

As we probably all know, ChatGPT is a much-hyped piece of software developed by OpenAI and launched on November 30, 2022. By January 2023, it had become the fastest-growing consumer software application in history. The free version of ChatGPT is based on GPT-3.5, and the paid version is based on GPT-4.

Contrary to its name, OpenAI’s software is not very open. OpenAI was founded as a non-profit organization in 2015. In 2019, it restructured around a ‘capped-profit’ company, and it has not publicly released the weights of its GPT models from GPT-3 onward.

OpenAI GPT models were built on copyrighted articles, internet posts, web pages, and books scraped from 60 million domains over a period of 12 years.

LLaMA

Let’s start the story of open models with LLaMA, i.e., Large Language Model Meta AI. The ‘Meta’ in the name is no coincidence: the model comes from Meta AI, the research arm of Meta (formerly Facebook). LLaMA was born (or more precisely, its birth was announced) on February 24, 2023.

LLaMA’s creators decided to train the model on openly accessible data. The inference code was released under the open-source GPLv3 license. As the researchers wrote in their arXiv paper, ‘In particular, LLaMA-13B outperforms GPT-3 (175B) on most benchmarks, and LLaMA-65B is competitive with the best models, Chinchilla-70B and PaLM-540B. We release all our models to the research community.’

It is worth noting that in 2022 the Chinchilla-70B model had already outperformed GPT-3 (175B), despite having far fewer parameters.

Among the training data (1.4 trillion tokens) were:

Webpages scraped by CommonCrawl
Open-source repositories of source code from GitHub
Wikipedia in 20 different languages
Public domain books from Project Gutenberg
The LaTeX source code for scientific papers uploaded to ArXiv
Questions and answers from Stack Exchange websites

The code and training data were openly accessible, but the weights of the model were not: they were distributed only to researchers who applied for access. However, they quickly leaked out and became publicly accessible.

The architecture of LLaMA is similar to GPT but not identical. The general approach was also different. Instead of increasing the number of parameters, LLaMA’s developers focused on improving the model’s performance by increasing the volume of training data. To be precise, LLaMA is not one model but a whole collection of models with different numbers of parameters (from 7B to 65B).
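To make the ‘more data rather than more parameters’ point concrete, the ratio of training tokens to parameters can be computed directly. This is only a rough sketch. The 1.4 trillion tokens and the 65B size come from the figures above; the 300 billion training tokens for GPT-3 is taken from the GPT-3 paper and is an assumption added here, not something stated in this article.

```python
def tokens_per_parameter(training_tokens: float, parameters: float) -> float:
    """Return how many training tokens the model saw per parameter."""
    return training_tokens / parameters


# Figures from this article: LLaMA-65B and 1.4 trillion training tokens.
llama_65b = tokens_per_parameter(1.4e12, 65e9)

# Assumption not from this article: GPT-3 (175B) saw about 300 billion tokens.
gpt3_175b = tokens_per_parameter(300e9, 175e9)

print(f"LLaMA-65B:  {llama_65b:.1f} tokens per parameter")
print(f"GPT-3 175B: {gpt3_175b:.1f} tokens per parameter")
```

Under these assumptions, the largest LLaMA saw more than ten times as many tokens per parameter as GPT-3.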

Alpaca

The next creature worth mentioning is Alpaca. Its name is an obvious nod to LLaMA: a llama is a domesticated South American camelid, and an alpaca is a similar but slightly smaller animal. The Alpaca project was a response by Stanford University researchers to LLaMA. They started from the LLaMA-7B model and fine-tuned it on 52k instruction-output pairs generated with OpenAI’s text-davinci-003. The whole process cost less than $600. The training code and data were published on GitHub.

The main advantage of Alpaca is that, at 7B parameters, it is small enough to run in quantized form on low-end devices, such as phones or Raspberry Pis. This means you don’t need a powerful computer or a GPU to run Alpaca. How much RAM you need depends on how aggressively the weights are quantized: at 4 bits per weight, a 7B model needs roughly 3.5 GB for the weights alone.
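Whether a model fits on a small device is mostly a question of how many bytes its weights occupy. The sketch below is a back-of-the-envelope estimate, not a measurement. It counts only the weights and ignores the context cache, activations, and the per-block overhead of real quantization formats, so actual requirements are somewhat higher. The bit widths are illustrative assumptions.

```python
BITS_PER_BYTE = 8
BYTES_PER_GB = 1e9  # decimal gigabytes


def weight_memory_gb(parameters: float, bits_per_weight: float) -> float:
    """Approximate memory needed for the weights alone, in GB."""
    return parameters * bits_per_weight / BITS_PER_BYTE / BYTES_PER_GB


# A 7B-parameter model (the size Alpaca is built on) at three precisions.
for bits in (16, 8, 4):
    print(f"7B model at {bits}-bit: {weight_memory_gb(7e9, bits):.1f} GB")
```

The same function applied to 13B or 65B parameters shows why the larger models quickly outgrow home hardware unless they are heavily quantized.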

Unfortunately, Alpaca is intended only for academic research, so any commercial use is prohibited. The reason is that it is based on LLaMA, which has a non-commercial license. Additionally, the instruction data is based on OpenAI’s text-davinci-003 (GPT-3.5), whose terms of use prohibit developing models that compete with OpenAI.

Vicuna

The third animal in this zoo is Vicuna. Its name also refers to the vicuña, one of the two wild South American camelids that live in the high alpine areas of the Andes. Vicuñas are relatives of the llama and are now believed to be the wild ancestors of domesticated alpacas.

The Vicuna model was developed by a cross-university team from UC Berkeley, CMU, Stanford, and UC San Diego. Again, the basis was LLaMA (specifically LLaMA-13B), which was fine-tuned on about 70K user-shared conversations collected from ShareGPT via public APIs. The model was presented on March 30, 2023.

Compared to Alpaca, the training cost was reduced again, thanks to spot instances with auto-recovery from preemptions and automatic zone switching. This cut the cost of training the 7B model from $500 to around $140, and the 13B model from around $1K to $300. Additionally, the maximum context length was expanded from 512 in Alpaca to 2048.

Again, in the case of Vicuna, we cannot use this model for commercial purposes.

Here is a short and simple way to run Stable-Vicuna-13B on any machine with Node.js installed. It uses the CatAI GUI.

npm install -g catai
catai install Stable-Vicuna-13B
catai serve

That’s all. It opens in your browser at http://127.0.0.1:3000. You can talk to it just as you would talk to ChatGPT, but everything happens on your own machine.
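If the page does not open, a quick first check is whether anything is listening on that port at all. The following is a minimal sketch in Python. It is not part of CatAI and assumes the default address shown above. The function name is made up for this illustration.

```python
import socket


def is_port_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Connection refused, timed out, or host unreachable.
        return False


if __name__ == "__main__":
    # Address taken from the text above; adjust it if your setup differs.
    if is_port_open("127.0.0.1", 3000):
        print("Something is listening on port 3000.")
    else:
        print("Nothing is listening on port 3000 yet.")
```

An open port only tells you that some process accepted the connection, not that the model has finished loading.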

MPT — Mosaic Pretrained Transformer

Unlike LLaMA, the MPT model by MosaicML is licensed for commercial use. One of its advantages is the ability to handle extremely long inputs and outputs, which makes it suitable for story writing. It is also designed for fast training and inference.

GPT4All — Viva la revolución!

Like the Stanford researchers, the Nomic AI team used an OpenAI API, in this case GPT-3.5-Turbo, to collect around 800k prompt-response pairs, which were then cleaned down to 430k training pairs. The raw collection is roughly 16 times larger than Alpaca’s 52k-pair dataset.
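The size comparison depends on whether you count the collected pairs or the pairs kept for training. A small calculation, using only the rounded figures quoted in this article, shows both ratios:

```python
ALPACA_PAIRS = 52_000        # instruction-output pairs used for Alpaca
GPT4ALL_COLLECTED = 800_000  # prompt-response pairs collected ("around 800k")
GPT4ALL_KEPT = 430_000       # pairs kept as training data

collected_vs_alpaca = GPT4ALL_COLLECTED / ALPACA_PAIRS
kept_vs_alpaca = GPT4ALL_KEPT / ALPACA_PAIRS
kept_fraction = GPT4ALL_KEPT / GPT4ALL_COLLECTED

print(f"Collected pairs vs Alpaca: {collected_vs_alpaca:.1f}x")
print(f"Training pairs vs Alpaca:  {kept_vs_alpaca:.1f}x")
print(f"Share of collected pairs kept: {kept_fraction:.0%}")
```

Because the inputs are rounded, the results are approximate as well.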

GPT4All can run on a CPU without the need for a GPU and requires 8GB of RAM. It is currently an open-source software platform that allows anyone to train powerful and customized large language models on commonly used hardware.

Currently, GPT4All offers three different model architectures:

GPTJ — Based on the GPT-J architecture
LLAMA — Based on the LLaMA architecture
MPT — Based on Mosaic ML’s MPT architecture

Depending on the architecture and the training data, a given model may be licensed for commercial use. This opens up a wide range of possibilities for companies to experiment with various models in their real-life projects.

OpenLLaMA

OpenLLaMA is a permissively licensed open-source reproduction of Meta AI’s LLaMA-7B, trained on the RedPajama dataset.

I HAVE A MODEL. WHAT’S NEXT?

Once you have a local model that is licensed for commercial use, you can start fine-tuning it with your own data. This allows you to shape the behavior of the AI toward what you want to achieve. Of course, it requires a lot of data preparation and a lot of patience.
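Much of that preparation consists of putting your examples into one consistent shape. The sketch below assumes the instruction/input/output record layout known from the Alpaca dataset and writes the records as JSON Lines. The helper names, the sample records, and the output file name are hypothetical, and the format your tuning tool expects may differ, so check its documentation first.

```python
import json
import tempfile
from pathlib import Path


def to_record(instruction: str, output: str, input_text: str = "") -> dict:
    """Build one training example in the instruction/input/output layout."""
    return {
        "instruction": instruction.strip(),
        "input": input_text.strip(),
        "output": output.strip(),
    }


def write_jsonl(records: list[dict], path: Path) -> int:
    """Write usable, de-duplicated records as JSON Lines; return the count."""
    seen = set()
    written = 0
    with path.open("w", encoding="utf-8") as handle:
        for record in records:
            key = (record["instruction"], record["input"])
            # Skip examples with no instruction or answer, and repeats.
            if not record["instruction"] or not record["output"] or key in seen:
                continue
            seen.add(key)
            handle.write(json.dumps(record, ensure_ascii=False) + "\n")
            written += 1
    return written


if __name__ == "__main__":
    # Hypothetical examples, echoing the test-design theme of this article.
    examples = [
        to_record("List test steps for a login form.",
                  "Open the page, enter valid credentials, submit, verify."),
        to_record("List test steps for a login form.",
                  "A duplicate instruction that will be skipped."),
        to_record("", "An answer without an instruction, also skipped."),
    ]
    target = Path(tempfile.gettempdir()) / "train.jsonl"
    print(f"Wrote {write_jsonl(examples, target)} record(s) to {target}")
```

Filtering out empty and duplicate examples before training is cheap and keeps obviously useless examples out of the fine-tuning run.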

How can you do it? You can download the LLaMA-LoRA-Tuner project and run it locally.

I have no doubts that in the beginning, the benefits of using AI will be moderate, but this is common to every new technology. However, I am very optimistic when it comes to the future.

For a list of links related to this article see my AI Starter Pack.

If you like this article and would be interested to read the next ones, the best way to support my work is to become a Medium member using this link:

Join Medium with my referral link - Daniel Delimata

If you are already a member and want to support this work, just follow me on Medium.