How to Run a Large Language Model on Linux (and Why You Should)
Large language models can hold conversations and answer questions with varying degrees of accuracy, and they have the potential to change the way you live and work.
To use one, you typically need an account with an LLM provider, and to log in via a website or dedicated app. But did you know you can run your own large language model entirely offline on Linux?

Why Run a Large Language Model on Linux?
Large language models (LLMs) are everywhere these days. They can process natural language and give responses convincing enough to fool you into thinking that a human has replied. Microsoft is rolling out a new AI-powered version of Bing, while Alphabet is building its Bard chatbot into Google Search.
Away from search engines, you can use so-called “AI chatbots” to answer questions, compose poetry, or even do your homework for you.

But by accessing LLMs online, you depend on the goodwill of a third-party provider—which can be withdrawn at any time.
You’re also subject to usage restrictions. Ask OpenAI to write a 6,000-word erotic novella set in Nazi Germany, for example, and you’ll get a response along the lines of “I apologize, but I won’t be able to generate that story for you.”

Anything you input to an online LLM may be used to train it further, and data you'd rather keep confidential could be spat out in the future as part of a response to someone else's question.
You may also find the service unavailable when the system is flooded with users, along with nags to subscribe so you can access the platform when demand is high.

Dalai is a free and open-source tool for running Meta’s LLaMA LLM and Stanford’s Alpaca model locally. It runs comfortably on modest hardware and provides a handy web interface and a range of prompt templates—so you can ask whatever you want, without fear that an admin will close your account, the LLM will refuse to answer, or your connection will drop.
When you install an LLM locally on Linux, it’s yours, and it’s possible to use it however you want.

How to Install Dalai on Linux
The easiest way to install Dalai on Linux is to use Docker and Docker Compose. If you don’t already have these, consult our guide on how to install Docker and Docker Compose.
With that out of the way, you’re ready to start installing Dalai. Clone the Dalai GitHub repository and use the cd command to move into it:
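Assuming the project still lives at its usual GitHub home under cocktailpeanut (check the project page if the repository has moved), that looks like:

```shell
# Clone the Dalai repository and move into it
git clone https://github.com/cocktailpeanut/dalai.git
cd dalai
```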
To get Dalai up and running with a web interface, first, build the Docker Compose file:
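From inside the cloned directory, that's a single command (older Docker installations may need the hyphenated docker-compose binary instead):

```shell
# Build the Dalai image defined in the repository's docker-compose.yml
docker compose build
```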
Docker Compose will download and install Python 3.11, Node Version Manager (NVM), and Node.js.
At stage seven of nine, the build will appear to freeze as Docker Compose downloads Dalai. Don’t worry: check your bandwidth use to reassure yourself that something is going on, and simulate the evolution of virtual organisms in your terminal while you wait.
Eventually, you’ll be returned to the command prompt.
Dalai and the LLaMA/Alpaca models require a lot of memory to run. While there isn’t an official specification, a good rough guide is 4GB of RAM for the 7B model, 8GB for the 13B model, 16GB for the 30B model, and 32GB for the 65B model.
The Alpaca models are relatively small, with the 13B model reaching a modest 7.6GB, but the LLaMA weights can be huge: the equivalent 13B download comes in at 60.21GB, and the 65B model will take up an epic half-terabyte on your hard disk.
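If you're not sure how much memory your machine has, a quick check on any standard Linux system is to read /proc/meminfo (this is just one way to do it; free -h works too):

```shell
# Report total system RAM in whole gigabytes (MemTotal is given in kB)
mem_gb=$(awk '/MemTotal/ {printf "%d", $2 / 1024 / 1024}' /proc/meminfo)
echo "Total RAM: ${mem_gb}GB"
```

Compare the result against the rough guide above before committing to a download.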
Decide which model is most suitable for your resources, and use the following command to install it:
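Dalai exposes its installer through npx inside the container; the model family (alpaca or llama) and size are the parts you swap out. For example, to fetch the 7B Alpaca model:

```shell
# Download and set up the 7B Alpaca model inside the Dalai container
docker compose run dalai npx dalai alpaca install 7B
```

Substitute llama for alpaca, or 13B, 30B, or 65B for 7B, if your hardware and disk space allow.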
There’s a chance that the models downloaded via Dalai may be corrupted. If this is the case, grab them from Hugging Face instead.
After you’re returned to the command prompt, bring up Docker Compose in detached mode:
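In detached mode, the container runs in the background and hands your terminal back to you:

```shell
# Start the Dalai service in the background
docker compose up -d
```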
Check if the container is running properly with:
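Compose can report the state of its own services:

```shell
# List this project's containers; the dalai service should show as running/Up
docker compose ps
```

If the service isn't listed as running, docker compose logs will usually show what went wrong.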
If everything’s working as it should, open a web browser and enter localhost:3000 in the address bar.
Have Fun With Your Own Large Language Model on Linux
When the web interface opens, you’ll see a text box, into which you can write your prompts.
Writing effective prompts is difficult, and the Dalai developers have helpfully provided a range of templates that will help you to get a useful response from Dalai.
These are AI-Dialog, Chatbot, Default, Instruction, Rewrite, Translate, and Tweet-sentiment.
As you’d expect, the AI-Dialog and Chatbot templates are structured in a way that allows you to hold a conversation of sorts with the LLM. The main difference between the two is that the Chatbot is supposed to be “highly intelligent”, while the AI-Dialog is “helpful, kind, obedient, honest, and knows its own limits”.
Of course, this is your “AI”, and if it pleases you, you can alter the prompt so the Chatbot is dumb, and the AI-Dialog’s characteristics are “sadistic” and “unhelpful”. It’s up to you.
We tested out the Translate template by copying the opening paragraph of a BBC news story and asking Dalai to translate it into Spanish. The translation was good, and when we ran it through Google Translate to turn it back into English, we found it quite readable, echoing the facts and sentiment of the original piece.
Likewise, the Rewrite template spun the text convincingly into the opening of a new article.
The Default and Instruction prompts are structured to help you ask questions or directly instruct Dalai.
The accuracy of Dalai’s responses will vary greatly depending on which model you use: a 30B model will be far more useful than a 7B model. But even then, remember that LLMs are simply sophisticated systems for guessing the next word in a sentence.
Neither the 7B nor the 13B Alpaca model was able to provide an accurate 200-word summary of Ernest Hemingway’s short story “Cat in the Rain”, and both made up thoroughly convincing plot lines and details about what the story contained.
And while the “helpful, kind, obedient, honest” AI-Dialog which “knows its own limits”, and “highly intelligent” Chatbot will balk at controversial prompts, you can give Dalai a straight Instruction or Default request, and it will write whatever you like—however you like it.
A Large Language Model on Your Linux Machine Is Yours
By running a large language model on your own Linux box, you’re not subject to oversight or withdrawal of service. You can use it however you see fit, without fear of consequences for violating a corporate content policy.
If your computing resources are limited, it’s even possible to run an LLM locally on a humble Raspberry Pi.