- We describe how to build an AI.
- Run your LLM on (nearly) any machine.
- Graphics cards take much of the computational strain.
Using LLMs (large language models) like Copilot or ChatGPT offers clear advantages in specific areas of business today: web searches via chatbots return more relevant, intelligent results, and low-value content creation is pretty simple.
But using a so-called AI to mine an organization’s data remains a complex and expensive undertaking. It’s difficult to test a machine learning model’s capabilities in specific contexts without committing to paying one of the big AI companies for unlimited access to its best models. And when a data-driven AI project is still at the experimental phase, that’s not an expenditure many can justify.
Setting up your own AI is a relatively simple thing to do, though. Once you’ve completed the steps below, you’ll have your own LLM that you can experiment with, chat to, and even feed your own information to improve and sharpen the bot’s responses.
What you need to build an AI
1. Personnel. You’ll need someone (it may be you) who is very comfortable with computers. Ideally, that someone should have some experience using Linux or macOS. While an AI as described can be run on a Windows PC, there are certain limitations and extra hurdles to jump over to get the model up and running.
2. Hardware. Although you can run your own AI instance on just about any hardware, we’d strongly recommend a computer with a graphics card installed. If you’ve no graphics card, then the computer you choose should be as fast as possible. A first-generation M1 Mac (Apple silicon) is a good choice. In our setup, we used a six-year-old PC laptop with a similarly ancient Nvidia GTX 1060 installed. Our laptop has 16GB of RAM, which should be plenty.
3. Background reading. We’ll be using the llamafile project’s executables, which are hosted on Hugging Face (huggingface.co). Read through the project’s GitHub documentation. We’re following the same steps, but adding details of the gotchas we found, and any more information we think is relevant.
Instructions for running your own AI
Ready to build an AI? Let’s go.
1. Reset the machine to its factory settings. Then run any updates the operating system might suggest. This gives you a clean basis for installing the LLM.
2. Graphics card drivers and supporting files. If your machine has a graphics card, you should install the manufacturer’s drivers and libraries from its website. On an Apple silicon machine, there’s no need. Nvidia users should find what they’re looking for, in the form of the CUDA Toolkit, on Nvidia’s developer website.
– gotcha for Linux with Nvidia users. If your CUDA install fails, blacklist the open-source Nouveau driver before retrying; Nvidia’s CUDA installation guide covers the steps for Debian/Ubuntu.
– gotcha for macOS users on Apple Silicon (M1 Macs and later). You will need to download and install the Xcode Command Line Tools:
Launch Terminal.
Type xcode-select --install and press ENTER.
Follow the instructions that appear in dialog boxes.
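As a quick sanity check for this step, the shell snippet below (our own addition, not from the llamafile docs) reports whether the CUDA Toolkit’s compiler or Apple’s Command Line Tools are visible, and says so harmlessly on machines where neither applies.

```shell
# Sanity checks for step 2 (safe to run on any OS; purely informational).
# nvcc ships with the CUDA Toolkit; xcode-select manages Apple's CLT.
if command -v nvcc >/dev/null 2>&1; then
  TOOLING="cuda"
  nvcc --version
elif command -v xcode-select >/dev/null 2>&1 && xcode-select -p >/dev/null 2>&1; then
  TOOLING="xcode-clt"
  echo "Xcode Command Line Tools installed at $(xcode-select -p)"
else
  TOOLING="none"
  echo "No CUDA Toolkit or Xcode CLT found (fine if you're running CPU-only)"
fi
```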
3. Download a llamafile. A llamafile is an executable program that contains everything you need to run an LLM in one file. There are several to try, each with a differing download size ranging from simply huge to fully enormous.
| Model | Size | License | llamafile |
|---|---|---|---|
| LLaVA 1.5 | 3.97 GB | LLaMA 2 | llava-v1.5-7b-q4.llamafile |
| Mistral-7B-Instruct | 5.15 GB | Apache 2.0 | mistral-7b-instruct-v0.2.Q5_K_M.llamafile |
| Mixtral-8x7B-Instruct | 30.03 GB | Apache 2.0 | mixtral-8x7b-instruct-v0.1.Q5_K_M.llamafile |
| WizardCoder-Python-13B | 7.33 GB | LLaMA 2 | wizardcoder-python-13b.llamafile |
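If you prefer the command line to a browser download, curl can fetch a llamafile directly. The URL below is an assumption based on the naming the project uses on its Hugging Face pages at the time of writing; double-check it against the llamafile README before relying on it. The snippet only reports what it would fetch unless you opt in, since the file is around 4 GB.

```shell
# Fetch the smallest llamafile from the table above. The exact URL is an
# assumption based on the project's Hugging Face layout; verify it first.
URL="https://huggingface.co/Mozilla/llava-v1.5-7b-llamafile/resolve/main/llava-v1.5-7b-q4.llamafile"
FILE="$(basename "$URL")"
if [ "${DO_DOWNLOAD:-no}" = "yes" ]; then
  # -L follows redirects to the CDN; -C - resumes an interrupted download
  curl -L -C - -o "$FILE" "$URL"
else
  echo "Set DO_DOWNLOAD=yes to fetch $FILE (about 4 GB)"
fi
```

Run it with DO_DOWNLOAD=yes in the environment when you’re ready to commit to the full download.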
– gotcha for Windows users. Initially, you should choose the smallest download. Without some tweaking, Windows simply won’t run the larger models. You can move up to the larger models later, following the instructions on the llamafile project’s GitHub page, or by checking back in a few days for part II of this guide.
4. Move the file you’ve downloaded somewhere you won’t delete it. The Desktop is as good as any other choice.
Now you need to tell the operating system what you’ve downloaded is executable. On Windows, it’s easiest: rename the file to append ‘.exe’ to its name.
On Mac and Linux, open Terminal, and type:
cd /the/place/where/you/stored/the/llamafile
(for example, cd /home/techhq/Desktop on Linux, or cd /Users/TechHQ/Desktop on Mac), then:
chmod +x [name of llamafile]
and press ENTER.
– tip: when typing the llamafile name, above, just type the first few letters of the name, and then press TAB to autocomplete.
– tip: on Mac and Linux, leave your Terminal open. You’ll be typing in here again.
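The cd-and-chmod sequence above can be sketched as a single script. To keep the sketch runnable anywhere, it creates a scratch stand-in for the downloaded file; substitute your real Desktop path and llamafile name for DIR and LLAMAFILE.

```shell
# Demonstration of step 4 on a scratch file, so it runs anywhere.
# Replace DIR and LLAMAFILE with your real download location and name.
DIR="$(mktemp -d)"                      # stand-in for e.g. $HOME/Desktop
LLAMAFILE="llava-v1.5-7b-q4.llamafile"  # example name from the table
touch "$DIR/$LLAMAFILE"                 # stand-in for the downloaded file
cd "$DIR" || exit 1
chmod +x "$LLAMAFILE"                   # mark it executable (macOS/Linux)
ls -l "$LLAMAFILE"                      # permissions now start with -rwx
```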
5. Run your AI
On Windows, just double-click the llamafile.
On Mac and Linux, go back into the Terminal, and ensure you’ve cd-ed to the file’s location, as above. Now type:
./[llamafile name]
and press ENTER.
– tip: after the ‘dot slash,’ type the first few letters of the name, then TAB to use autocomplete.
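A couple of optional flags are worth knowing at this step. Per the llamafile project’s documentation, -ngl sets how many model layers are offloaded to the GPU (a large value offloads everything) and --port changes the web interface’s port; check the README for your version before relying on either. The snippet below is guarded so it does nothing if the llamafile isn’t present.

```shell
LLAMAFILE="./llava-v1.5-7b-q4.llamafile"   # example name; use your own
if [ -x "$LLAMAFILE" ]; then
  # -ngl 9999 requests as many layers as possible on the GPU;
  # --port 8080 is the default web-interface port, shown for clarity.
  "$LLAMAFILE" -ngl 9999 --port 8080
else
  echo "llamafile not found or not executable; see steps 3 and 4"
fi
```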
6. See if your web browser opens.
Your browser should open automatically and display a chat interface.
– gotcha. If your browser doesn’t open, open it manually and head to the following URL: http://127.0.0.1:8080
Start chatting with your local AI in the web page.
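The web page isn’t the only way to talk to the model: the llamafile server also answers HTTP requests, which its documentation describes as OpenAI-API-compatible, so you can script against it. The endpoint path below is taken from that documentation but is worth verifying for your version; the snippet exits cleanly if no server is running.

```shell
BASE="http://127.0.0.1:8080"
if curl -s -o /dev/null --max-time 2 "$BASE"; then
  SERVER_UP=yes
  # Ask the model a question over the OpenAI-compatible endpoint:
  curl -s "$BASE/v1/chat/completions" \
    -H "Content-Type: application/json" \
    -d '{"model":"local","messages":[{"role":"user","content":"Say hello."}]}'
else
  SERVER_UP=no
  echo "No server on $BASE; start the llamafile first (step 5)"
fi
```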
When you’ve finished, close the llamafile application (Windows) or return to the Terminal (Linux and macOS) and press CONTROL + C to stop the process. You can then close the browser window.
Part two of this guide on how to build an AI will describe how Windows users can run the larger models available from llamafile, how to interact in different ways with different models on all the platforms covered here, and how best to ‘teach’ your AI with data that pertains to your organization for some real data mining. Check back in a few days.