# Dalai

Run LLaMA and Alpaca on your computer.

<a href="https://github.com/cocktailpeanut/dalai" class='inverse btn'><i class="fa-brands fa-github"></i> Github</a>
<a href="https://twitter.com/cocktailpeanut" class='inverse btn'><i class="fa-brands fa-twitter"></i> Twitter</a>
<a href="https://discord.gg/XahBUrbVwz" class='inverse btn'><i class="fa-brands fa-discord"></i> Discord</a>
---
## JUST RUN THIS

<img src="npx.png" class='round'>

## TO GET
Both alpaca and llama working on your computer!

---
1. Powered by [llama.cpp](https://github.com/ggerganov/llama.cpp), [llama-dl CDN](https://github.com/shawwn/llama-dl), and [alpaca.cpp](https://github.com/antimatter15/alpaca.cpp)
2. Hackable web app included
3. Ships with JavaScript API
4. Ships with [Socket.io](https://socket.io/) API
---
# Intro
## 1. Cross platform
Dalai runs on all of the following operating systems:
1. Linux
2. Mac
3. Windows
## 2. Memory Requirements
Dalai runs on most modern computers. Unless your computer is very, very old, it should work.
According to [a llama.cpp discussion thread](https://github.com/ggerganov/llama.cpp/issues/13), here are the memory requirements:
- 7B => ~4 GB
- 13B => ~8 GB
- 30B => ~16 GB
- 65B => ~32 GB
## 3. Disk Space Requirements
### Alpaca
Currently only the 7B model is available via [alpaca.cpp](https://github.com/antimatter15/alpaca.cpp).
#### 7B
Alpaca comes fully quantized (compressed), and the only space you need for the 7B model is 4.21GB:

### LLaMA
You need a lot of space for storing the models.
You do NOT have to install all models; you can install them one by one. Let's take a look at how much space each model takes up:
> NOTE
>
> The following numbers assume that you DO NOT touch the original model files and keep BOTH the original model files AND the quantized versions.
>
> You can optimize this if you delete the original models (which are much larger) after installation and keep only the quantized versions.
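For example, a cleanup along these lines is possible once installation finishes. This is only a sketch: the file names below are illustrative and depend on where Dalai stored your models (by default under `~/dalai`), so verify the paths on your machine before deleting anything:
```
# illustrative paths -- check your actual install before running!
# once the quantized version has been generated, the original
# (unquantized) 7B weights are no longer needed for inference:
rm ~/dalai/llama/models/7B/consolidated.00.pth
```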
#### 7B
- Full: The model takes up 31.17GB
- Quantized: 4.21GB

#### 13B
- Full: The model takes up 60.21GB
- Quantized: 4.07GB * 2 = 8.14GB

#### 30B
- Full: The model takes up 150.48GB
- Quantized: 5.09GB * 4 = 20.36GB

#### 65B
- Full: The model takes up 432.64GB
- Quantized: 5.11GB * 8 = 40.88GB

---
# Quickstart

## Mac
### Step 1. Install Node.js >= 18
<a href="https://nodejs.org/en/download/" class='btn'>Install Node.js</a>
### Step 2. Install models
Currently supported engines are `llama` and `alpaca`.
#### Add alpaca models
Currently alpaca only has the 7B model:
```
npx dalai alpaca install 7B
```
#### Add llama models
To download llama models, you can run:
```
npx dalai llama install 7B
```
or to download multiple models:
```
npx dalai llama install 7B 13B
```
### Step 3. Run Web UI
After everything has been installed, run the following command to launch the web UI server:
```
npx dalai serve
```
and open http://localhost:3000 in your browser. Have fun!

---

## Windows
### Step 1. Install Visual Studio
On Windows, you need to install Visual Studio before installing Dalai.
Press the button below to visit the Visual Studio downloads page and download:
<a href="https://visualstudio.microsoft.com/downloads/" class='btn'>Download Microsoft Visual Studio</a>
**IMPORTANT!!!**
When installing Visual Studio, make sure to check the 3 options as highlighted below:
1. Python development
2. Node.js development
3. Desktop development with C++

---

### Step 2.1. Install models

> **IMPORTANT**
>
> On Windows, make sure to run all commands in **cmd**.
>
> DO NOT run in **PowerShell**. PowerShell has unnecessarily strict permissions and makes the script fail silently.
Currently supported engines are `llama` and `alpaca`.
#### Add alpaca models
Currently alpaca only has the 7B model. Open your `cmd` application and enter:
```
npx dalai alpaca install 7B
```
#### Add llama models
To download llama models, open your `cmd` application and enter:
```
npx dalai llama install 7B
```
or to download multiple models:
```
npx dalai llama install 7B 13B
```
---
### Step 2.2. Troubleshoot (optional)
In case the above steps fail, try installing Node.js and Python separately.
Install Python:
<a href="https://www.python.org/ftp/python/3.10.10/python-3.10.10-embed-amd64.zip" class='btn'>Download Python</a>
Install Node.js >= 18:
<a href="https://nodejs.org/en/download/" class='btn'>Download Node.js</a>
After both have been installed, open PowerShell and type `python` to check that the application exists, and likewise type `node`.
Once you've checked that they both exist, try again.
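For example, both of the following should print a version number; if either command errors out, that runtime is not installed correctly (the exact versions will vary):
```
node --version
python --version
```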
### Step 3. Run Web UI
After everything has been installed, run the following command to launch the web UI server (make sure to run it in `cmd`, not PowerShell!):
```
npx dalai serve
```
and open http://localhost:3000 in your browser. Have fun!

---

## Linux
### Step 1. Install Dependencies
You need to make sure you have the correct versions of Python and Node.js installed.
#### Step 1.1. Python <= 3.10
<a href="https://pimylifeup.com/installing-python-on-linux/" class='btn'>Download Python</a>
> Make sure the version is 3.10 or lower (not 3.11)
Python must be 3.10 or below (PyTorch and other libraries do not yet support the latest version).
#### Step 1.2. Node.js >= 18
<a href="https://nodejs.org/en/download/package-manager/" class='btn'>Download Node.js</a>
> Make sure the version is 18 or higher

---

### Step 2. Install models

Currently supported engines are `llama` and `alpaca`.
#### Add alpaca models
Currently alpaca only has the 7B model:
```
npx dalai alpaca install 7B
```
#### Add llama models
To download llama models, you can run:
```
npx dalai llama install 7B
```
or to download multiple models:
```
npx dalai llama install 7B 13B
```
### Step 3. Run Web UI
After everything has been installed, run the following command to launch the web UI server:
```
npx dalai serve
```
and open http://localhost:3000 in your browser. Have fun!

---

# API
Dalai is also an NPM package. With it you can:
1. programmatically install models
2. locally make requests to the model
3. run a dalai server (powered by socket.io)
4. programmatically make requests to a remote dalai server (via socket.io)

Install it with:
```
npm install dalai
```
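For example, here is a minimal end-to-end sketch that ties these together, using the methods documented in the sections below (it assumes the alpaca 7B model; see each method's section for details):
```javascript
const Dalai = require('dalai')
const dalai = new Dalai()

async function main() {
  // download and set up the alpaca 7B model
  await dalai.install("alpaca", "7B")

  // stream a completion token by token
  dalai.request({
    model: "alpaca.7B",
    prompt: "The following is a conversation between a boy and a girl:",
  }, (token) => {
    process.stdout.write(token)
  })
}

main()
```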
---
## 1. constructor()
### Syntax
```javascript
const dalai = new Dalai(home)
```
- `home` : (optional) manually specify the [llama.cpp](https://github.com/ggerganov/llama.cpp) folder
By default, Dalai automatically stores the entire `llama.cpp` repository under `~/llama.cpp` .
However, often you may already have a `llama.cpp` repository somewhere else on your machine and want to just use that folder. In this case you can pass in the `home` attribute.
### Examples
#### Basic
Creates a workspace at `~/llama.cpp`
```javascript
const dalai = new Dalai()
```
#### Custom path
Manually set the `llama.cpp` path:
```javascript
const dalai = new Dalai("/Documents/llama.cpp")
```
---
## 2. request()
### Syntax
```javascript
dalai.request(req, callback)
```
- `req` : a request object, made up of the following attributes:
  - `prompt` : **(required)** The prompt string
  - `model` : **(required)** The model type + model name to query. Takes the following form: `<model_type>.<model_name>`
    - Example: `alpaca.7B`, `llama.13B`, ...
  - `url` : only needed if connecting to a remote dalai server
    - if unspecified, it uses the node.js API to directly run dalai locally
    - if specified (for example `ws://localhost:3000`) it looks for a socket.io endpoint at the URL and connects to it.
  - `threads` : The number of threads to use (the default is 8 if unspecified)
  - `n_predict` : The number of tokens to return (the default is 128 if unspecified)
  - `seed` : The seed. The default is -1 (none)
  - `top_k`
  - `top_p`
  - `repeat_last_n`
  - `repeat_penalty`
  - `temp` : temperature
  - `batch_size` : batch size
  - `skip_end` : by default, every session ends with `\n\n<end>`, which can be used as a marker to know when the full response has returned. However, sometimes you may not want this suffix. Set `skip_end: true` and the response will no longer end with `\n\n<end>`
- `callback` : the streaming callback function that gets called every time the client gets any token response back from the model
### Examples
#### 1. Node.js
Using node.js, you just need to initialize a Dalai object with `new Dalai()` and then use it.
```javascript
const Dalai = require('dalai')
new Dalai().request({
  model: "llama.7B",
  prompt: "The following is a conversation between a boy and a girl:",
}, (token) => {
  process.stdout.write(token)
})
```
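The optional parameters documented above go in the same request object. For instance, a sketch combining several of them (the values shown are just illustrative):
```javascript
const Dalai = require('dalai')

new Dalai().request({
  model: "llama.7B",
  prompt: "The following is a conversation between a boy and a girl:",
  threads: 4,       // default is 8
  n_predict: 256,   // default is 128
  temp: 0.8,        // sampling temperature
  skip_end: true,   // drop the trailing "\n\n<end>" marker
}, (token) => {
  process.stdout.write(token)
})
```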
#### 2. Non node.js (socket.io)
To make use of this in a browser or any other language, you can use the socket.io API.
##### Step 1. start a server
First you need to run a Dalai socket server:
```javascript
// server.js
const Dalai = require('dalai')
new Dalai().serve(3000) // port 3000
```
##### Step 2. connect to the server
Then once the server is running, simply make requests to it by passing the `ws://localhost:3000` socket url when initializing the Dalai object:
```javascript
const Dalai = require("dalai")
new Dalai().request({
  url: "ws://localhost:3000",
  model: "llama.7B",
  prompt: "The following is a conversation between a boy and a girl:",
}, (token) => {
  console.log("token", token)
})
```
---
## 3. serve()
### Syntax
Starts a socket.io server at `port`
```javascript
dalai.serve(port)
```
### Examples
```javascript
const Dalai = require("dalai")
new Dalai().serve(3000)
```
---
## 4. http()
### Syntax
Connect with an existing `http` instance (the Node.js `http` server object)
```javascript
dalai.http(http)
```
- `http` : The [http](https://nodejs.org/api/http.html) object
### Examples
This is useful when you're trying to plug dalai into an existing node.js web app:
```javascript
const Dalai = require('dalai')
const dalai = new Dalai()
const app = require('express')();
const http = require('http').Server(app);
dalai.http(http)
http.listen(3000, () => {
  console.log("server started")
})
```
## 5. install()
### Syntax
```javascript
await dalai.install(model_type, model_name1, model_name2, ...)
```
- `model_type` : the type of model to install. Currently supports:
  - "alpaca"
  - "llama"
- `model_name1` , `model_name2` , ...: the model names to install ("7B", "13B", "30B", "65B", etc.)
### Examples
Install Llama "7B" and "13B" models:
```javascript
const Dalai = require("dalai");
const dalai = new Dalai()
await dalai.install("llama", "7B", "13B")
```
Install alpaca 7B model:
```javascript
const Dalai = require("dalai");
const dalai = new Dalai()
await dalai.install("alpaca", "7B")
```
---
## 6. installed()
Returns the array of installed models.
### Syntax
```javascript
const models = await dalai.installed()
```
### Examples
```javascript
const Dalai = require("dalai");
const dalai = new Dalai()
const models = await dalai.installed()
console.log(models) // prints ["7B", "13B"]
```

<!--

---
## 7. download()

Download models.

There are two download options:

1. **LLaMA:** Download the original LLaMA model, convert it, and quantize (compress) it
2. **LLaMA.zip:** Download the compressed version (generated from step 1 and published on HuggingFace)

### Syntax

```javascript
await dalai.download(model1, model2, model3, ...)
```
- `models` : the model names to install. Can be: "7B", "13B", "30B", "65B", "7B.zip", "13B.zip", "30B.zip", "65B.zip"
  - "7B", "13B", "30B", "65B": download the raw model, convert, and quantize
  - "7B.zip", "13B.zip", "30B.zip", "65B.zip": download the quantized model (no need to waste time downloading huge files)
### Examples
Install the "7B" and "13B" models:
```javascript
const Dalai = require("dalai");
const dalai = new Dalai()
await dalai.download("7B", "13B")
```

-->
---
# FAQ
## Using a different home folder
By default Dalai uses your home directory to store the entire repository (`~/dalai`). However, sometimes you may want to store the archive elsewhere.
In this case you can call all CLI methods using the `--home` flag:
### 1. Installing models to a custom path
```
npx dalai llama install 7B --home ~/test_dir
```
### 2. Serving from the custom path
```
npx dalai serve --home ~/test_dir
```
## Updating to the latest
To make sure you update to the latest, first find the latest version at https://www.npmjs.com/package/dalai
Let's say the latest version is `0.3.0`. To update the dalai version, run:
```
npx dalai@0.3.0 setup
```
## Staying up to date
Have questions or feedback? Follow the project through the following outlets:
<a href="https://github.com/cocktailpeanut/dalai" class='inverse btn'><i class="fa-brands fa-github"></i> Github</a>
<a href="https://twitter.com/cocktailpeanut" class='inverse btn'><i class="fa-brands fa-twitter"></i> Twitter</a>
<a href="https://discord.gg/XahBUrbVwz" class='inverse btn'><i class="fa-brands fa-discord"></i> Discord</a>
---