Merge branch 'main' into cherry-pick-storage-refactor

This commit is contained in:
cpacker 2024-01-02 12:21:56 -08:00
commit cdef0729b8
34 changed files with 399 additions and 201 deletions

View File

@ -75,7 +75,7 @@ memgpt configure
```
### In-chat commands
You can run the following commands in the MemGPT CLI prompt while chatting with an agent:
* `/exit`: Exit the CLI
* `/attach`: Attach a loaded data source to the agent
* `/save`: Save a checkpoint of the current agent/conversation state
@ -227,4 +227,4 @@ Datasets used in our [paper](https://arxiv.org/abs/2310.08560) can be downloaded
- [x] Add official gpt-3.5-turbo support ([discussion](https://github.com/cpacker/MemGPT/discussions/66))
- [x] CLI UI improvements ([issue](https://github.com/cpacker/MemGPT/issues/11))
- [x] Add support for other LLM backends ([issue](https://github.com/cpacker/MemGPT/issues/18), [discussion](https://github.com/cpacker/MemGPT/discussions/67))
- [ ] Release MemGPT family of open models (e.g. finetuned Mistral) ([discussion](https://github.com/cpacker/MemGPT/discussions/67))

docs/.markdownlint.json Normal file
View File

@ -0,0 +1,6 @@
{
"MD013": false,
"MD028": false,
"MD033": false,
"MD034": false
}

View File

@ -5,7 +5,7 @@ category: 6580da9a40bb410016b8b0c3
---
> ⚠️ MemGPT + local LLM failure cases
>
> When using open LLMs with MemGPT, **the main failure case will be your LLM outputting a string that cannot be understood by MemGPT**. MemGPT uses function calling to manage memory (e.g. `edit_core_memory(...)`) and interact with the user (`send_message(...)`), so your LLM needs to generate outputs that can be parsed into MemGPT function calls.
### What is a "wrapper"?
@ -42,10 +42,9 @@ class LLMChatCompletionWrapper(ABC):
You can follow our example wrappers ([located here](https://github.com/cpacker/MemGPT/tree/main/memgpt/local_llm/llm_chat_completion_wrappers)).
### Example with [Airoboros](https://huggingface.co/jondurbin/airoboros-l2-70b-2.1) (llama2 finetune)
To help you get started, we've implemented an example wrapper class for a popular llama2 model **fine-tuned on function calling** (Airoboros). We want MemGPT to run well on open models as much as you do, so we'll be actively updating this page with more examples. Additionally, we welcome contributions from the community! If you find an open LLM that works well with MemGPT, please open a PR with a model wrapper and we'll merge it ASAP.
```python
class Airoboros21Wrapper(LLMChatCompletionWrapper):
@ -83,9 +82,9 @@ See full file [here](https://github.com/cpacker/MemGPT/tree/main/memgpt/local_ll
MemGPT uses function calling to do memory management. With [OpenAI's ChatCompletion API](https://platform.openai.com/docs/api-reference/chat/), you can pass in a function schema in the `functions` keyword arg, and the API response will include a `function_call` field that includes the function name and the function arguments (generated JSON). Under the hood, the `functions` keyword arg is combined with the `messages` and `system` to form one big string input to the transformer, and the output of the transformer is parsed to extract the JSON function call.
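To make the shapes concrete, here is a rough sketch of what goes in and what comes back (the schema and response below are illustrative examples, not MemGPT's actual prompts): you pass a list of JSON-schema function definitions, and you get back a message whose `function_call` holds the function name plus a JSON string of arguments.

```python
import json

# Illustrative function schema, in the JSON-schema style used by the ChatCompletion API
functions = [
    {
        "name": "send_message",
        "description": "Send a message to the user.",
        "parameters": {
            "type": "object",
            "properties": {"message": {"type": "string"}},
            "required": ["message"],
        },
    }
]

# Illustrative shape of the assistant message returned by the API
response_message = {
    "role": "assistant",
    "content": None,
    "function_call": {
        "name": "send_message",
        "arguments": '{"message": "Hello! How can I help you today?"}',
    },
}

# The arguments come back as a JSON string, so the caller parses them
args = json.loads(response_message["function_call"]["arguments"])
print(response_message["function_call"]["name"], args["message"])
```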
In the future, more open LLMs and LLM servers (that can host OpenAI-compatible ChatCompletion endpoints) may start including parsing code to do this automatically as standard practice. However, in the meantime, when you see a model that says it supports “function calling”, like Airoboros, it doesn't mean that you can just load Airoboros into a ChatCompletion-compatible endpoint like WebUI, and then use the same OpenAI API call and it'll just work.
1. When a model page says it supports function calling, they probably mean that the model was fine-tuned on some function call data (not that you can just use ChatCompletion with functions out-of-the-box). Remember, LLMs are just string-in-string-out, so there are many ways to format the function call data. E.g. Airoboros formats the function schema in YAML style (see https://huggingface.co/jondurbin/airoboros-l2-70b-3.1.2#agentfunction-calling) and the output is in JSON style. To get this to work behind a ChatCompletion API, you still have to do the parsing from `functions` keyword arg (containing the schema) to the model's expected schema style in the prompt (YAML for Airoboros), and you have to run some code to extract the function call (JSON for Airoboros) and package it cleanly as a `function_call` field in the response.
2. Partly because of how complex it is to support function calling, most (all?) of the community projects that do OpenAI ChatCompletion endpoints for arbitrary open LLMs do not support function calling, because if they did, they would need to write model-specific parsing code for each one.
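As a rough illustration of the glue code described above, a ChatCompletion-style shim for such a model has to do two jobs: render the `functions` schemas into the prompt in the style the model was fine-tuned on, and pull a JSON function call back out of the raw completion text. The prompt format and the `function`/`params` output keys below are simplified stand-ins, not MemGPT's actual Airoboros wrapper.

```python
import json

import yaml  # PyYAML; used here only to illustrate a YAML-style schema rendering


def functions_to_prompt_block(functions: list) -> str:
    """Render function schemas into a YAML-style block for the prompt (simplified)."""
    return "Available functions:\n" + yaml.dump(functions, sort_keys=False)


def extract_function_call(raw_output: str) -> dict:
    """Pull the first JSON object out of the raw completion and reshape it like a
    ChatCompletion `function_call` field (simplified; real wrappers are more robust)."""
    start, end = raw_output.find("{"), raw_output.rfind("}") + 1
    call = json.loads(raw_output[start:end])
    return {"name": call["function"], "arguments": json.dumps(call.get("params", {}))}


# Example: raw model output in, ChatCompletion-style function_call out
# (functions_to_prompt_block(...) would be spliced into the prompt before generation)
raw = 'Sure. {"function": "send_message", "params": {"message": "Hi there!"}}'
print(extract_function_call(raw))
```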

View File

@ -7,21 +7,21 @@ category: 658135e7f596b800715c1cee
![memgpt llama](https://raw.githubusercontent.com/cpacker/MemGPT/main/docs/assets/memgpt_server.webp)
> ⚠️ API under active development
>
> The MemGPT API is under **active development** and **breaking changes are being made frequently**. Do not expect any endpoints or API schema to persist until an official `v1.0` of the MemGPT API is released.
>
> For support and to track ongoing developments, please visit [the MemGPT Discord server](https://discord.gg/9GEQrxmVyE) where you can chat with the MemGPT team and other developers about the API.
> 📘 Check Discord for the latest development build
>
> Make sure to check [Discord](https://discord.gg/9GEQrxmVyE) for updates on the latest development branch to use. The API reference viewable on this page may only apply to the latest dev branch, so if you plan to experiment with the API we recommend you [install MemGPT from source](https://memgpt.readme.io/docs/contributing#installing-from-source) for the time being.
## Starting a server process
You can spawn a MemGPT server process using the following command:
```sh
memgpt server
```
Before attempting to launch a server process, make sure that you have already configured MemGPT (using `memgpt configure`) and are able to successfully create and message an agent using `memgpt run`. For more information, see [our quickstart guide](https://memgpt.readme.io/docs/quickstart).

View File

@ -1,7 +1,7 @@
---
title: MemGPT + AutoGen
excerpt: Creating AutoGen agents powered by MemGPT
category: 6580dab16cade8003f996d17
---
> 📘 Need help?
@ -21,6 +21,7 @@ category: 6580dab16cade8003f996d17
MemGPT includes an AutoGen agent class ([MemGPTAgent](https://github.com/cpacker/MemGPT/blob/main/memgpt/autogen/memgpt_agent.py)) that mimics the interface of AutoGen's [ConversableAgent](https://microsoft.github.io/autogen/docs/reference/agentchat/conversable_agent#conversableagent-objects), allowing you to plug MemGPT into the AutoGen framework.
To create a MemGPT AutoGen agent for use in an AutoGen script, you can use the `create_memgpt_autogen_agent_from_config` constructor:
```python
from memgpt.autogen.memgpt_agent import create_memgpt_autogen_agent_from_config
@ -56,6 +57,7 @@ memgpt_autogen_agent = create_memgpt_autogen_agent_from_config(
```
Now this `memgpt_autogen_agent` can be used in standard AutoGen scripts:
```python
import autogen
@ -97,6 +99,7 @@ Once you've confirmed that you're able to chat with a MemGPT agent using `memgpt
> If you're using RunPod to run web UI, make sure that you set your endpoint to the RunPod IP address, **not the default localhost address**.
>
> For example, during `memgpt configure`:
>
> ```text
> ? Enter default endpoint: https://yourpodaddresshere-5000.proxy.runpod.net
> ```
@ -106,6 +109,7 @@ Once you've confirmed that you're able to chat with a MemGPT agent using `memgpt
Now we're going to integrate MemGPT and AutoGen by creating a special "MemGPT AutoGen agent" that wraps MemGPT in an AutoGen-style agent interface.
First, make sure you have AutoGen installed:
```sh
pip install pyautogen
```
@ -117,7 +121,9 @@ In order to run this example on a local LLM, go to lines 46-66 in [examples/agen
`config_list` is used by non-MemGPT AutoGen agents, which expect an OpenAI-compatible API. `config_list_memgpt` is used by MemGPT AutoGen agents, and requires additional settings specific to MemGPT (such as the `model_wrapper` and `context_window`). Depending on what LLM backend you want to use, you'll have to set up your `config_list` and `config_list_memgpt` differently:
#### web UI example
For example, if you are using web UI, it will look something like this:
```python
# Non-MemGPT agents will still use local LLMs, but they will use the ChatCompletions endpoint
config_list = [
@ -132,7 +138,7 @@ config_list = [
config_list_memgpt = [
{
"preset": DEFAULT_PRESET,
"model": None, # not required for web UI, only required for Ollama, see: https://memgpt.readme.io/docs/ollama
"model": None, # not required for web UI, only required for Ollama, see: https://memgpt.readme.io/docs/ollama
"model_wrapper": "airoboros-l2-70b-2.1", # airoboros is the default wrapper and should work for most models
"model_endpoint_type": "webui",
"model_endpoint": "http://localhost:5000", # notice port 5000 for web UI
@ -142,7 +148,9 @@ config_list_memgpt = [
```
#### LM Studio example
If you are using LM Studio, then you'll need to change the `api_base` in `config_list`, and `model_endpoint_type` + `model_endpoint` in `config_list_memgpt`:
```python
# Non-MemGPT agents will still use local LLMs, but they will use the ChatCompletions endpoint
config_list = [
@ -167,7 +175,9 @@ config_list_memgpt = [
```
#### OpenAI example
If you are using the OpenAI API (e.g. using `gpt-4-turbo` via your own OpenAI API account), then the `config_list` for the AutoGen agent and `config_list_memgpt` for the MemGPT AutoGen agent will look different (a lot simpler):
```python
# This config is for autogen agents that are not powered by MemGPT
config_list = [
@ -192,7 +202,9 @@ config_list_memgpt = [
```
#### Azure OpenAI example
Azure OpenAI API setup will be similar to OpenAI API, but requires additional config variables. First, make sure that you've set all the related Azure variables referenced in [our MemGPT Azure setup page](https://memgpt.readme.io/docs/endpoints#azure-openai) (`AZURE_OPENAI_API_KEY`, `AZURE_OPENAI_VERSION`, `AZURE_OPENAI_ENDPOINT`, etc). If you have all the variables set correctly, you should be able to create configs by pulling from the env variables:
```python
# This config is for autogen agents that are not powered by MemGPT
# See Auto
@ -219,17 +231,17 @@ config_list_memgpt = [
"azure_endpoint": os.getenv("AZURE_OPENAI_ENDPOINT"),
"azure_version": os.getenv("AZURE_OPENAI_VERSION"),
# if you are using Azure for embeddings too, include the following line:
"embedding_embedding_endpoint_type": "azure",
"embedding_embedding_endpoint_type": "azure",
},
]
```
> 📘 Making internal monologue visible to AutoGen
>
> By default, MemGPT's inner monologue and function traces are hidden from other AutoGen agents.
>
> You can modify `interface_kwargs` to change the visibility of inner monologue and function calling:
>
> ```python
> interface_kwargs = {
> "debug": False, # this is the equivalent of the --debug flag in the MemGPT CLI
@ -239,11 +251,13 @@ config_list_memgpt = [
> ```
The only parts of the `agent_groupchat.py` file you need to modify should be the `config_list` and `config_list_memgpt` (make sure to change `USE_OPENAI` to `True` or `False` depending on if you're trying to use a local LLM server like web UI, or OpenAI's API). Assuming you edited things correctly, you should now be able to run `agent_groupchat.py`:
```sh
python memgpt/autogen/examples/agent_groupchat.py
```
Your output should look something like this:
```text
User_proxy (to chat_manager):
@ -287,14 +301,14 @@ Remember, achieving one million dollars in revenue in such a short time frame wo
--------------------------------------------------------------------------------
MemGPT_coder (to chat_manager):
Great goal! Generating a million dollars in one month with an app is ambitious, but definitely doable if you approach it the right way. Here are some tips and potential ideas that could help:
1. Identify a niche market or trend (for example, AI-powered fitness apps or FinTech solutions).
2. Solve a significant problem for many people (such as time management or financial literacy).
3. Choose an effective monetization strategy like subscriptions, in-app purchases, or advertising.
4. Make sure your app is visually appealing and easy to use to keep users engaged.
Some ideas that might work:
- AI-powered personal finance management app
- A virtual assistant app that helps people manage their daily tasks
- A social networking platform for job seekers or freelancers
@ -316,15 +330,18 @@ User_proxy (to chat_manager):
First, follow the instructions in [Example - chat with your data - Creating an external data source](example_data/#creating-an-external-data-source):
To download the MemGPT research paper we'll use `curl` (you can also just download the PDF from your browser):
```sh
# we're saving the file as "memgpt_research_paper.pdf"
curl -L -o memgpt_research_paper.pdf https://arxiv.org/pdf/2310.08560.pdf
```
Now that we have the paper downloaded, we can create a MemGPT data source using `memgpt load`:
```sh
memgpt load directory --name memgpt_research_paper --input-files=memgpt_research_paper.pdf
```
```text
loading data
done loading data
@ -337,9 +354,11 @@ Generating embeddings: 100%|█████████████████
Note: you can ignore the "_LLM is explicitly disabled_" message.
Now, you can run `agent_docs.py`, which asks `MemGPT_coder` what a virtual context is:
```sh
python memgpt/autogen/examples/agent_docs.py
```
```text
Ingesting 65 passages into MemGPT_agent
100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 1/1 [00:01<00:00, 1.47s/it]

View File

@ -1,11 +1,11 @@
---
title: Frequently asked questions (FAQ)
excerpt: Check frequently asked questions
category: 6580d34ee5e4d00068bf2a1d
---
> 📘 Open / local LLM FAQ
>
> Questions specific to running your own open / local LLMs with MemGPT can be found [here](local_llm_faq).
## MemGPT CLI

View File

@ -1,7 +1,7 @@
---
title: Configuration
excerpt: Configuring your MemGPT agent
category: 6580d34ee5e4d00068bf2a1d
---
You can set agent defaults by running `memgpt configure`, which will store config information at `~/.memgpt/config` by default.
@ -18,7 +18,7 @@ The `memgpt run` command supports the following optional flags (if set, will ove
* `--no-verify`: (bool) Bypass message verification (default=False)
* `--yes`/`-y`: (bool) Skip confirmation prompt and use defaults (default=False)
You can override the parameters you set with `memgpt configure` with the following additional flags specific to local LLMs:
* `--model-wrapper`: (str) Model wrapper used by backend (e.g. `airoboros_xxx`)
* `--model-endpoint-type`: (str) Model endpoint backend type (e.g. lmstudio, ollama)
@ -26,14 +26,17 @@ You can override the parameters you set with `memgpt configure` with the followi
* `--context-window`: (int) Size of model context window (specific to model type)
#### Updating the config location
You can override the location of the config path by setting the environment variable `MEMGPT_CONFIG_PATH`:
```sh
export MEMGPT_CONFIG_PATH=/my/custom/path/config # make sure this is a file, not a directory
```
### Adding Custom Personas/Humans
You can add new human or persona definitions either by providing a file (using the `-f` flag) or text (using the `--text` flag).
```sh
# add a human
memgpt add human [--name <NAME>] [-f <FILENAME>] [--text <TEXT>]
@ -43,9 +46,11 @@ memgpt add persona [--name <NAME>] [-f <FILENAME>] [--text <TEXT>]
```
You can view available persona and human files with the following command:
```sh
memgpt list [humans/personas]
```
### Custom Presets
You can customize your MemGPT agent even further with [custom presets](presets) and [custom functions](functions).

View File

@ -1,14 +1,14 @@
---
title: How to contribute
excerpt: Learn how to contribute to the MemGPT project!
category: 6581eaa89a00e6001012822c
---
![memgpt llama](https://raw.githubusercontent.com/cpacker/MemGPT/main/docs/assets/memgpt_library.webp)
MemGPT is an active [open source](https://en.wikipedia.org/wiki/Open_source) project and we welcome community contributions! There are many ways to contribute for both programmers and non-programmers alike.
> 📘 Discord contributor role
>
> Contributing to the codebase gets you a **contributor role** on [Discord](https://discord.gg/9GEQrxmVyE). If you're a contributor and we forgot to assign you the role, message the MemGPT team [on Discord](https://discord.gg/9GEQrxmVyE)!
@ -22,4 +22,4 @@ We're always looking to improve our docs (like the page you're reading right now
## 🦙 Editing the MemGPT source code
If you're interested in editing the MemGPT source code, [check our guide on building and contributing from source](contributing_code).

View File

@ -1,7 +1,7 @@
---
title: Contributing to the codebase
excerpt: How to modify code and create pull requests
category: 6581eaa89a00e6001012822c
---
If you plan on making big changes to the codebase, the easiest way to make contributions is to install MemGPT directly from the source code (instead of via `pypi`, which you do with `pip install ...`).
@ -9,16 +9,17 @@ If you plan on making big changes to the codebase, the easiest way to make contr
Once you have a working copy of the source code, you should be able to modify the MemGPT codebase and immediately see any changes you make reflected in the way the `memgpt` command works! Then once you make a change you're happy with, you can open a pull request to get your changes merged into the official MemGPT package.
> 📘 Instructions on installing from a fork and opening pull requests
>
> If you plan on contributing your changes, you should create a fork of the MemGPT repo and install the source code from your fork.
>
> Please see [our contributing guide](https://github.com/cpacker/MemGPT/blob/main/CONTRIBUTING.md) for instructions on how to install from a fork and open a PR.
## Installing MemGPT from source
**Reminder**: if you plan on opening a pull request to contribute your changes, follow our [contributing guide's install instructions](https://github.com/cpacker/MemGPT/blob/main/CONTRIBUTING.md) instead!
To install MemGPT from source, start by cloning the repo:
```sh
git clone git@github.com:cpacker/MemGPT.git
```
@ -28,39 +29,45 @@ git clone git@github.com:cpacker/MemGPT.git
First, install Poetry using [the official instructions here](https://python-poetry.org/docs/#installation).
Once Poetry is installed, navigate to the MemGPT directory and install the MemGPT project with Poetry:
```sh
cd MemGPT
poetry shell
poetry install -E dev -E postgres -E local
```
Now when you want to use `memgpt`, make sure you first activate the `poetry` environment using poetry shell:
```sh
$ poetry shell
(pymemgpt-py3.10) $ memgpt run
```
Alternatively, you can use `poetry run` (which will activate the `poetry` environment for the `memgpt run` command only):
```sh
poetry run memgpt run
```
### Installing dependencies with pip
First, you should set up a dedicated virtual environment. This is optional, but highly recommended:
```sh
cd MemGPT
python3 -m venv venv
. venv/bin/activate
```
Once you've activated your virtual environment and are in the MemGPT project directory, you can install the dependencies with `pip`:
```sh
pip install -e '.[dev,postgres,local]'
```
Now, you should be able to run `memgpt` from the command-line using the downloaded source code (if you used a virtual environment, you have to activate the virtual environment to access `memgpt`):
```sh
$ . venv/bin/activate
(venv) $ memgpt run
```

View File

@ -1,14 +1,14 @@
---
title: Contributing to the documentation
excerpt: How to add to the MemGPT documentation
category: 6581eaa89a00e6001012822c
---
There are two ways to propose edits to the MemGPT documentation: editing the documentation files directly in the GitHub file editor (on the GitHub website), or cloning the source code and editing the documentation files (in your text/markdown editor of choice).
## Editing directly via GitHub
> 📘 Requires a GitHub account
>
> Before beginning, make sure you have an account on [github.com](https://github.com) and are logged in.
@ -23,6 +23,6 @@ The easiest way to edit the docs is directly via the GitHub website:
7. Add the necessary details describing the changes you've made, then click "Create pull request"
8. ✅ That's it! A MemGPT team member will then review your PR and if it looks good merge it into the main branch, at which point you'll see the changes updated on the docs page!
## Editing via the source code
Editing documentation via the source code follows the same process as general source code editing - forking the repository, cloning your fork, editing a branch of your fork, and opening a PR from your fork to the main repo. See our [source code editing guide](contributing_code) for more details.

View File

@ -1,17 +1,20 @@
---
title: Attaching data sources
excerpt: Connecting external data to your MemGPT agent
category: 6580d34ee5e4d00068bf2a1d
---
MemGPT supports pre-loading data into archival memory. In order to make data accessible to your agent, you must load data in with `memgpt load`, then attach the data source to your agent. You can configure where archival memory is stored by configuring the [storage backend](storage.md).
### Viewing available data sources
You can view available data sources with:
```sh
memgpt list sources
```
```sh
+----------------+----------+----------+
| Name | Location | Agents |
+----------------+----------+----------+
@ -20,18 +23,22 @@ memgpt list sources
| memgpt-docs | local | agent_1 |
+----------------+----------+----------+
```
The `Agents` column indicates which agents have access to the data, while `Location` indicates what storage backend the data has been loaded into.
### Attaching data to agents
Attaching a data source to your agent loads the data into your agent's archival memory to access. You can attach data to your agent in two ways:
*[Option 1]* From the CLI, run:
```sh
memgpt attach --agent <AGENT-NAME> --data-source <DATA-SOURCE-NAME>
```
*[Option 2]* While chatting with the agent, enter the `/attach` command and select the data source
```sh
> Enter your message: /attach
? Select data source (Use arrow keys)
» short-stories
@ -43,15 +50,18 @@ memgpt attach --agent <AGENT-NAME> --data-source <DATA-SOURCE-NAME>
> To encourage your agent to reference its archival memory, we recommend adding phrases like "_search your archival memory..._" for the best results.
### Loading a file or directory
You can load a file, list of files, or directory into MemGPT with the following command:
```sh
memgpt load directory --name <NAME> \
[--input-dir <DIRECTORY>] [--input-files <FILE1> <FILE2>...] [--recursive]
```
### Loading a database dump
You can load a database into MemGPT, either from a database dump or a database connection, with the following command:
```sh
memgpt load database --name <NAME> \
--query <QUERY> \ # Query to run on database to get data
@ -65,7 +75,9 @@ memgpt load database --name <NAME> \
```
### Loading a vector database
If you already have a vector database containing passages and embeddings, you can load them into MemGPT by specifying the table name, database URI, and the columns containing the passage text and embeddings.
```sh
memgpt load vector-database --name <NAME> \
--uri <URI> \ # Database URI
@ -73,15 +85,19 @@ memgpt load vector-database --name <NAME> \
--text_column <TEXT-COL> \ # Name of column containing text
--embedding_column <EMBEDDING-COL> # Name of column containing embedding
```
Since embeddings are already provided, MemGPT will not re-compute the embeddings.
### Loading a LlamaIndex dump
If you have a Llama Index `VectorIndex` which was saved to disk, you can load it into MemGPT by specifying the directory the index was saved to:
```sh
memgpt load index --name <NAME> --dir <INDEX-DIR>
```
Since Llama Index will have already computed embeddings, MemGPT will not re-compute embeddings.
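If you need an index to test this with, one quick way to produce one is to build and persist a small index with Llama Index itself. The snippet below is a sketch assuming the `llama-index` package (class and method names vary across versions); the directory names are placeholders.

```python
from llama_index import SimpleDirectoryReader, VectorStoreIndex

# Build an index over some local documents (placeholder path)
documents = SimpleDirectoryReader("./my_docs").load_data()
index = VectorStoreIndex.from_documents(documents)

# Persist it to disk; this directory is what you pass to `memgpt load index --dir`
index.storage_context.persist(persist_dir="./my_index")
```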
### Loading other types of data
We highly encourage contributions for new data sources, which can be added as a new [CLI data load command](https://github.com/cpacker/MemGPT/blob/main/memgpt/cli/cli_load.py). We recommend checking for [Llama Index connectors](https://gpt-index.readthedocs.io/en/v0.6.3/how_to/data_connectors.html) that may support ingesting the data you're interested in loading.

View File

@ -1,23 +1,26 @@
---
title: Chatting with MemGPT Bot
excerpt: Get up and running with the MemGPT Discord Bot
category: 6580da8eb6feb700166e5016
---
The fastest way to experience MemGPT is to chat with the MemGPT Discord Bot.
Join <a href="https://discord.gg/9GEQrxmVyE">Discord</a> and message the MemGPT bot (in the `#memgpt` channel). Then run the following commands (messaged to "MemGPT Bot"):
* `/profile` (to create your profile)
* `/key` (to enter your OpenAI key)
* `/create` (to create a MemGPT chatbot)
Make sure your privacy settings on this server are open so that MemGPT Bot can DM you: \
MemGPT → Privacy Settings → Direct Messages set to ON
<div align="center">
<img src="https://research.memgpt.ai/assets/img/discord/dm_settings.png" alt="set DMs settings on MemGPT server to be open in MemGPT so that MemGPT Bot can message you" width="400">
</div>
You can see the full list of available commands when you enter `/` into the message box.
<div align="center">
<img src="https://research.memgpt.ai/assets/img/discord/slash_commands.png" alt="MemGPT Bot slash commands" width="400">
</div>

View File

@ -1,26 +1,32 @@
---
title: Configuring embedding backends
excerpt: Connecting MemGPT to various endpoint backends
category: 6580d34ee5e4d00068bf2a1d
---
MemGPT uses embedding models for retrieval search over archival memory. You can use embeddings provided by OpenAI, Azure, or any model on Hugging Face.
## OpenAI
To use OpenAI, make sure your `OPENAI_API_KEY` environment variable is set.
```sh
export OPENAI_API_KEY=YOUR_API_KEY # on Linux/Mac
```
Then, configure MemGPT and select `openai` as the embedding provider:
```text
> memgpt configure
...
? Select embedding provider: openai
...
```
## Azure
To use Azure, set environment variables for Azure and an additional variable specifying your embedding deployment:
```sh
# see https://github.com/openai/openai-python#microsoft-azure-endpoints
export AZURE_OPENAI_KEY = ...
@ -30,18 +36,22 @@ export AZURE_OPENAI_VERSION = ...
# set the below if you are using deployment ids
export AZURE_OPENAI_EMBEDDINGS_DEPLOYMENT = ...
```
Then, configure MemGPT and select `azure` as the embedding provider:
```text
> memgpt configure
...
? Select embedding provider: azure
...
```
## Custom Endpoint
MemGPT supports running embeddings with any Hugging Face model using the [Text Embeddings Inference](https://github.com/huggingface/text-embeddings-inference) (TEI) library. First, follow TEI's [instructions](https://github.com/huggingface/text-embeddings-inference#get-started) to get an endpoint running. Once you have a running endpoint, you can configure MemGPT to use it:
```text
> memgpt configure
...
? Select embedding provider: hugging-face
? Enter default endpoint: http://localhost:8080
@ -50,24 +60,26 @@ MemGPT supports running embeddings with any Hugging Face model using the [Text E
...
```
## Local Embeddings
MemGPT can compute embeddings locally using a lightweight embedding model [`BAAI/bge-small-en-v1.5`](https://huggingface.co/BAAI/bge-small-en-v1.5).
> 🚧 Local LLM Performance
>
> The `BAAI/bge-small-en-v1.5` was chosen to be lightweight, so you may notice degraded performance with embedding-based retrieval when using this option.
To compute embeddings locally, install dependencies with:
```sh
pip install 'pymemgpt[local]'
```
Then, select the `local` option during configuration:
```text
> memgpt configure
...
? Select embedding provider: local
...
```
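For a sense of what this option does under the hood, the snippet below computes an embedding with that model directly. This is a standalone sketch (MemGPT handles this internally), assuming the `sentence-transformers` package is installed; it is not code from MemGPT itself.

```python
from sentence_transformers import SentenceTransformer

# Load the same lightweight model MemGPT uses for local embeddings
model = SentenceTransformer("BAAI/bge-small-en-v1.5")

# Embed a passage; for this model the result is a 384-dimensional vector
vector = model.encode("MemGPT stores passages like this one in archival memory.")
print(len(vector), vector[:5])
```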

View File

@ -1,13 +1,15 @@
---
title: Configuring LLM backends
excerpt: Connecting MemGPT to various LLM backends
category: 6580d34ee5e4d00068bf2a1d
---
You can use MemGPT with various LLM backends, including the OpenAI API, Azure OpenAI, and various local (or self-hosted) LLM backends.
## OpenAI
To use MemGPT with an OpenAI API key, simply set the `OPENAI_API_KEY` variable:
```sh
export OPENAI_API_KEY=YOUR_API_KEY # on Linux/Mac
set OPENAI_API_KEY=YOUR_API_KEY # on Windows
@ -15,7 +17,8 @@ $Env:OPENAI_API_KEY = "YOUR_API_KEY" # on Windows (PowerShell)
```
When you run `memgpt configure`, make sure to select `openai` for both the LLM inference provider and embedding provider, for example:
```text
$ memgpt configure
? Select LLM inference provider: openai
? Override default endpoint: https://api.openai.com/v1
@ -27,11 +30,14 @@ $ memgpt configure
? Select storage backend for archival data: local
```
### OpenAI Proxies
To use custom OpenAI endpoints, specify a proxy URL when running `memgpt configure` to set the custom endpoint as the default endpoint.
## Azure OpenAI
To use MemGPT with Azure, export the following variables and then re-run `memgpt configure`:
```sh
# see https://github.com/openai/openai-python#microsoft-azure-endpoints
export AZURE_OPENAI_KEY=...
@ -44,6 +50,7 @@ export AZURE_OPENAI_EMBEDDINGS_DEPLOYMENT=...
```
For example, if your endpoint is `customproject.openai.azure.com` (for both your GPT model and your embeddings model), you would set the following:
```sh
# change AZURE_OPENAI_VERSION to the latest version
export AZURE_OPENAI_KEY="YOUR_AZURE_KEY"
@ -53,18 +60,20 @@ export AZURE_OPENAI_EMBEDDING_ENDPOINT="https://customproject.openai.azure.com"
```
If you named your deployments names other than their defaults, you would also set the following:
```sh
# assume you called the gpt-4 (1106-Preview) deployment "personal-gpt-4-turbo"
export AZURE_OPENAI_DEPLOYMENT="personal-gpt-4-turbo"
# assume you called the text-embedding-ada-002 deployment "personal-embeddings"
export AZURE_OPENAI_EMBEDDING_DEPLOYMENT="personal-embeddings"
```
Replace `export` with `set` or `$Env:` if you are on Windows (see the OpenAI example).
When you run `memgpt configure`, make sure to select `azure` for both the LLM inference provider and embedding provider, for example:
```text
$ memgpt configure
? Select LLM inference provider: azure
? Select default model (recommended: gpt-4): gpt-4-1106-preview
@ -77,5 +86,6 @@ $ memgpt configure
Note: **your Azure endpoint must support functions** or you will get an error. See [this GitHub issue](https://github.com/cpacker/MemGPT/issues/91) for more information.
## Local Models & Custom Endpoints
MemGPT supports running open source models, either run locally or as a hosted service. Setting up MemGPT to run with open models requires a bit more setup; follow [the instructions here](local_llm).

View File

@ -1,11 +1,11 @@
---
title: Example - perpetual chatbot
excerpt: Using MemGPT to create a perpetual chatbot
category: 6580d34ee5e4d00068bf2a1d
---
> 📘 Confirm your installation
>
> Before starting this example, make sure that you've [properly installed MemGPT](quickstart)
In this example, we're going to use MemGPT to create a chatbot with a custom persona. MemGPT chatbots are "perpetual chatbots", meaning that they can be run indefinitely without any context length limitations. MemGPT chatbots are self-aware that they have a "fixed context window", and will manually manage their own memories to get around this problem by moving information in and out of their small memory window and larger external storage.
@ -15,6 +15,7 @@ MemGPT chatbots always keep a reserved space in their "core" memory window to st
### Creating a custom persona
First, we'll create a text file with a short persona description. Let's make our chatbot a life coach named "Chaz". We'll also include a sentence at the top of the persona block to remind MemGPT that it should actively update its own persona over time. Open a text editor on your computer, create a file called `chaz.txt`, and enter the following text:
```text
This is just the beginning of who I am. I should update my persona as I learn more about myself.
@ -27,15 +28,18 @@ I will help them achieve greatness! Huzzah!
```
Now that we've created a persona description inside `chaz.txt`, let's add this persona to MemGPT:
```sh
# --name specifies the profile name, -f specifies the file to load from
memgpt add persona --name chaz -f chaz.txt
```
We can check that the persona is available:
```sh
memgpt list personas
```
```text
...
| | |
@ -55,15 +59,18 @@ memgpt list personas
Next, we'll create a custom user profile. To show you the different commands, we'll add the user profile by typing the text directly into the command line, instead of writing it into a file.
Let's pretend I'm a software engineer named Bob Builder who works at a big tech company. Similar to the persona, we can register this user profile using `memgpt add human`, but this time, let's try registering the human profile directly with `--text`:
```sh
# Instead of using -f with a filename, we use --text and provide the text directly
memgpt add human --name bob --text "Name: Bob Builder. Occupation: Software Engineer at a big tech company. Hobbies: running, hiking, rock climbing, craft beer, ultimate frisbee."
```
Now when we run `memgpt list humans`, we should see "Bob Builder":
```sh
memgpt list humans
```
```text
...
| | |
@ -79,6 +86,7 @@ Let's try out our new chatbot Chaz, combined with our new user profile Bob:
# Alternatively we can run `memgpt configure`, then `memgpt run` without the --persona and --human flags
memgpt run --persona chaz --human bob
```
```text
💭 First login detected. Prepare to introduce myself as Chaz, the AI life coach. Also, inquire about Bob's day and his expectations from our interaction.
🤖 Hello Bob! I'm Chaz, your AI life coach. I'm here to help you achieve your full potential! How was your day? And how may I assist you in becoming your best self?

View File

@ -1,11 +1,11 @@
---
title: Example - chat with your data
excerpt: Using MemGPT to chat with your own data
category: 6580d34ee5e4d00068bf2a1d
---
> 📘 Confirm your installation
>
> Before starting this example, make sure that you've [properly installed MemGPT](quickstart)
In this example, we're going to use MemGPT to chat with a custom data source. Specifically, we'll try loading in the MemGPT research paper and ask MemGPT questions about it.
@ -15,15 +15,18 @@ In this example, we're going to use MemGPT to chat with a custom data source. Sp
To feed external data into a MemGPT chatbot, we first need to create a data source.
To download the MemGPT research paper we'll use `curl` (you can also just download the PDF from your browser):
```sh
# we're saving the file as "memgpt_research_paper.pdf"
curl -L -o memgpt_research_paper.pdf https://arxiv.org/pdf/2310.08560.pdf
```
Now that we have the paper downloaded, we can create a MemGPT data source using `memgpt load`:
```sh
memgpt load directory --name memgpt_research_paper --input-files=memgpt_research_paper.pdf
```
```text
loading data
done loading data
@ -42,12 +45,14 @@ Note: you can ignore the "_LLM is explicitly disabled_" message.
Now that we've created this data source, we can attach it to a MemGPT chatbot at any time.
For the sake of this example, let's create a new chatbot using the `memgpt_doc` persona (but you can use any persona you want):
```sh
# reminder: `memgpt run --persona memgpt_doc` will create a new MemGPT agent using the `memgpt_doc` persona
memgpt run --persona memgpt_doc
```
Once we're chatting with the agent, we can "attach" the data source to the agent's archival memory:
```text
Creating new agent...
Created new agent agent_2.
@ -66,6 +71,7 @@ Attached data source memgpt_research_paper to agent agent_2, consisting of 130.
### Testing out our new chatbot
Now that the data has been loaded into the chatbot's memory, we can start to ask questions about it:
```text
> Enter your message: The paper I loaded into your archival memory is called MemGPT. Can you tell me more about it?
💭 The search results show that the MemGPT paper explores operating-system-inspired techniques to enable large language models (LLMs) to manage memory and achieve unbounded context. The paper evaluates MemGPT in domains where LLMs are typically limited by finite context windows. This includes document analysis, allowing these models to process lengthy texts beyond their context limits, and conversational agents, where MemGPT allows for maintaining long-term memory, consistency, and evolution over extended dialogues. Time to relay this to Bob.

View File

@ -1,12 +1,13 @@
---
title: Giving MemGPT more tools
excerpt: Customize your MemGPT agents even further with your own functions
category: 6580daaa48aeca0038fc2297
---
If you would like to give MemGPT the ability to call new tools or functions, you can write a Python `.py` file with the functions you want to add, and place it inside of `~/.memgpt/functions`. You can see the example function sets provided [here](https://github.com/cpacker/MemGPT/tree/main/memgpt/functions/function_sets).
As an example, we provide a preset called [`memgpt_extras`](https://github.com/cpacker/MemGPT/blob/main/memgpt/presets/examples/memgpt_extras.yaml) that includes additional functions to read and write from text files, as well as make HTTP requests:
```yaml
# this preset uses the same "memgpt_chat" system prompt, but has more functions enabled
system_prompt: "memgpt_chat"
@ -37,18 +38,19 @@ There are three steps to adding more MemGPT functions:
### Simple example: giving MemGPT the ability to roll a D20
> ⚠️ Function requirements
>
> The functions you write MUST have proper docstrings and type hints - this is because MemGPT will use these docstrings and types to automatically create a JSON schema that is used in the LLM prompt. Use the docstrings and type annotations from the [example functions](https://github.com/cpacker/MemGPT/blob/main/memgpt/functions/function_sets/base.py) for guidance.
> ⚠️ Function output length
>
> Your custom function should always return a string that is **capped in length**. If your string goes over the specified limit, it will be truncated internally. This is to prevent potential context overflows caused by uncapped string returns (for example, a rogue HTTP request that returns a string larger than the LLM context window).
>
> If you return any type other than `str` (e.g. `dict`) in your custom functions, MemGPT will attempt to cast the result to a string (and truncate the result if it is too long). It is preferable to return strings - think of your function returning a natural language description of the outcome (see the D20 example below).
In this simple example we'll give MemGPT the ability to roll a [D20 die](https://en.wikipedia.org/wiki/D20_System).
First, let's create a python file `~/.memgpt/functions/d20.py`, and write some code that uses the `random` library to "roll a die":
```python
import random
@ -69,12 +71,13 @@ def roll_d20(self) -> str:
"""
dice_role_outcome = random.randint(1, 20)
output_string = f"You rolled a {dice_role_outcome}"
return output_string
```
Notice how we used [type hints](https://docs.python.org/3/library/typing.html) and [docstrings](https://peps.python.org/pep-0257/#multi-line-docstrings) to describe how the function works. **These are required**; if you do not include them, MemGPT will not be able to "link" to your function. This is because MemGPT needs a JSON schema description of how your function works, which we automatically generate for you using the type hints and docstring (which you write yourself).
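For intuition, the generated schema for `roll_d20` would look roughly like the dictionary below (a hand-written approximation for illustration; the exact fields and description MemGPT emits may differ): the docstring becomes the description and the type hints become the parameter types.

```python
# Approximate shape of the auto-generated schema for roll_d20 (illustrative only)
roll_d20_schema = {
    "name": "roll_d20",
    "description": "Simulate the roll of a 20-sided die (d20).",
    "parameters": {
        "type": "object",
        "properties": {},  # roll_d20 takes no arguments besides self
        "required": [],
    },
}
```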
Next, we'll create a custom preset that includes this new `roll_d20` function. Let's create a YAML file `~/.memgpt/presets/memgpt_d20.yaml`:
```yaml
system_prompt: "memgpt_chat"
functions:
@ -86,7 +89,7 @@ functions:
- "conversation_search_date"
- "archival_memory_insert"
- "archival_memory_search"
# roll a d20
- "roll_d20"
```
@ -105,6 +108,7 @@ As we can see, MemGPT now has access to the `roll_d20` function! `roll_d20` is a
_Example taken from [this pull request](https://github.com/cpacker/MemGPT/pull/282) by @cevatkerim_
As an example, if you wanted to give MemGPT the ability to make calls to Jira Cloud, you would write the function in Python (you would save this python file inside `~/.memgpt/functions/jira_cloud.py`):
```python
import os
@ -169,6 +173,7 @@ def run_jql(self, jql: str) -> dict:
```
Now we need to create a new preset file, let's create one called `~/.memgpt/presets/memgpt_jira.yaml`:
```yaml
# if we had created a new system prompt, we would replace "memgpt_chat" with the new prompt filename (no .txt)
system_prompt: "memgpt_chat"
@ -187,9 +192,11 @@ functions:
```
Now when we run `memgpt configure`, we should see the option to use `memgpt_jira` as a preset:
```sh
memgpt configure
```
```text
...
? Select default preset: (Use arrow keys)

View File

@ -1,7 +1,7 @@
---
title: Introduction
excerpt: Welcome to the MemGPT documentation!
category: 6580d34ee5e4d00068bf2a1d
---
<style>

View File

@ -1,13 +1,14 @@
---
title: koboldcpp
excerpt: Setting up MemGPT with koboldcpp
category: 6580da9a40bb410016b8b0c3
---
1. Download + install [koboldcpp](https://github.com/LostRuins/koboldcpp/) and the model you want to test with
2. In your terminal, run `./koboldcpp.py <MODEL> -contextsize <CONTEXT_LENGTH>`
For example, if we downloaded the model `dolphin-2.2.1-mistral-7b.Q6_K.gguf` and put it inside `~/models/TheBloke/`, we would run:
```sh
# using `-contextsize 8192` because Dolphin Mistral 7B has a context length of 8000 (and koboldcpp wants specific intervals, 8192 is the closest)
# the default port is 5001
@ -15,7 +16,8 @@ For example, if we downloaded the model `dolphin-2.2.1-mistral-7b.Q6_K.gguf` and
```
In your terminal where you're running MemGPT, run `memgpt configure` to set the default backend for MemGPT to point at koboldcpp:
```text
# if you are running koboldcpp locally, the default IP address + port will be http://localhost:5001
? Select LLM inference provider: local
? Select LLM backend (select 'openai' if you have an OpenAI compatible proxy): koboldcpp
@ -24,6 +26,7 @@ In your terminal where you're running MemGPT, run `memgpt configure` to set the
```
If you have an existing agent that you want to move to the koboldcpp backend, add extra flags to `memgpt run`:
```sh
memgpt run --agent your_agent --model-endpoint-type koboldcpp --model-endpoint http://localhost:5001
```

View File

@ -1,13 +1,14 @@
---
title: llama.cpp
excerpt: Setting up MemGPT with llama.cpp
category: 6580da9a40bb410016b8b0c3
---
1. Download + install [llama.cpp](https://github.com/ggerganov/llama.cpp) and the model you want to test with
2. In your terminal, run `./server -m <MODEL> -c <CONTEXT_LENGTH>`
For example, if we downloaded the model `dolphin-2.2.1-mistral-7b.Q6_K.gguf` and put it inside `~/models/TheBloke/`, we would run:
```sh
# using `-c 8000` because Dolphin Mistral 7B has a context length of 8000
# the default port is 8080, you can change this with `--port`
@ -15,7 +16,8 @@ For example, if we downloaded the model `dolphin-2.2.1-mistral-7b.Q6_K.gguf` and
```
In your terminal where you're running MemGPT, run `memgpt configure` to set the default backend for MemGPT to point at llama.cpp:
```text
# if you are running llama.cpp locally, the default IP address + port will be http://localhost:8080
? Select LLM inference provider: local
? Select LLM backend (select 'openai' if you have an OpenAI compatible proxy): llamacpp
@ -24,6 +26,7 @@ In your terminal where you're running MemGPT, run `memgpt configure` to set the
```
If you have an existing agent that you want to move to the llama.cpp backend, add extra flags to `memgpt run`:
```sh
memgpt run --agent your_agent --model-endpoint-type llamacpp --model-endpoint http://localhost:8080
```

View File

@ -1,7 +1,7 @@
---
title: LM Studio
excerpt: Setting up MemGPT with LM Studio
category: 6580da9a40bb410016b8b0c3
---
> 📘 Update your LM Studio
@ -9,7 +9,7 @@ category: 6580da9a40bb410016b8b0c3
> The current `lmstudio` backend will only work if your LM Studio is version 0.2.9 or newer.
>
> If you are on a version of LM Studio older than 0.2.9 (<= 0.2.8), select `lmstudio-legacy` as your backend type.
>
> ⚠️ Important LM Studio settings
>
> **Context length**: Make sure that "context length" (`n_ctx`) is set (in "Model initialization" on the right hand side "Server Model Settings" panel) to the max context length of the model you're using (e.g. 8000 for Mistral 7B variants).
@ -26,7 +26,8 @@ category: 6580da9a40bb410016b8b0c3
4. Copy the IP address + port that your server is running on (in the example screenshot, the address is `http://localhost:1234`)
In your terminal where you're running MemGPT, run `memgpt configure` to set the default backend for MemGPT to point at LM Studio:
```text
# if you are running LM Studio locally, the default IP address + port will be http://localhost:1234
? Select LLM inference provider: local
? Select LLM backend (select 'openai' if you have an OpenAI compatible proxy): lmstudio
@ -35,6 +36,7 @@ In your terminal where you're running MemGPT, run `memgpt configure` to set the
```
If you have an existing agent that you want to move to the LM Studio backend, add extra flags to `memgpt run`:
```sh
memgpt run --agent your_agent --model-endpoint-type lmstudio --model-endpoint http://localhost:1234
```

View File

@ -1,33 +1,37 @@
---
title: MemGPT + open models
excerpt: Set up MemGPT to run with open LLMs
category: 6580da9a40bb410016b8b0c3
---
> 📘 Need help?
>
> Visit our [Discord server](https://discord.gg/9GEQrxmVyE) and post in the #support channel. You can also check the [GitHub discussion page](https://github.com/cpacker/MemGPT/discussions/67), but the Discord server is the official support channel and is monitored more actively. Make sure to check the [local LLM troubleshooting page](local_llm_faq) to see common issues before raising a new issue or posting on Discord.
> 📘 Using Windows?
>
> If you're using Windows and are trying to get MemGPT with local LLMs setup, we recommend using Anaconda Shell, or WSL (for more advanced users). See more Windows installation tips [here](local_llm_faq).
> ⚠️ MemGPT + open LLM failure cases
>
> When using open LLMs with MemGPT, **the main failure case will be your LLM outputting a string that cannot be understood by MemGPT**. MemGPT uses function calling to manage memory (e.g. `edit_core_memory(...)`) and interact with the user (`send_message(...)`), so your LLM needs to generate outputs that can be parsed into MemGPT function calls. See [the local LLM troubleshooting page](local_llm_faq) for more information.
### Installing dependencies
To install dependencies required for running local models, run:
```sh
pip install 'pymemgpt[local]'
```
If you installed from source (`git clone` then `pip install -e .`), do:
```sh
pip install -e '.[local]'
```
If you installed from source using Poetry, do:
```sh
poetry install -E local
```
@ -38,7 +42,8 @@ poetry install -E local
2. Run `memgpt configure` and when prompted select your backend/endpoint type and endpoint address (a default will be provided but you may have to override it)
For example, if we are running web UI (which defaults to port 5000) on the same computer as MemGPT, running `memgpt configure` would look like this:
```text
? Select LLM inference provider: local
? Select LLM backend (select 'openai' if you have an OpenAI compatible proxy): webui
? Enter default endpoint: http://localhost:5000
@ -55,6 +60,7 @@ Saving config to /home/user/.memgpt/config
Now when we do `memgpt run`, it will use the LLM running on the local web server.
If you want to change the local LLM settings of an existing agent, you can pass flags to `memgpt run`:
```sh
# --model-wrapper will override the wrapper
# --model-endpoint will override the endpoint address
@ -66,15 +72,17 @@ memgpt run --agent agent_11 --model-endpoint http://localhost:1234 --model-endpo
### Selecting a model wrapper
When you use local LLMs, you can specify a **model wrapper** that changes how the LLM input text is formatted before it is passed to your LLM.
You can change the wrapper used with the `--model-wrapper` flag:
```sh
memgpt run --model-wrapper airoboros-l2-70b-2.1
```
You can see the full selection of model wrappers by running `memgpt configure`:
```text
? Select LLM inference provider: local
? Select LLM backend (select 'openai' if you have an OpenAI compatible proxy): webui
? Enter default endpoint: http://localhost:5000

View File

@ -1,7 +1,7 @@
---
title: Troubleshooting
excerpt: FAQ for MemGPT + custom LLM backends
category: 6580da9a40bb410016b8b0c3
---
## Problems getting MemGPT + local LLMs set up
@ -11,6 +11,7 @@ category: 6580da9a40bb410016b8b0c3
This error happens when MemGPT tries to run the LLM on the remote server you specified, but the server isn't working as expected.
For example, this error can happen when you have a typo in your endpoint (notice the duplicate `/v1` in the URL):
```text
Exception: API call got non-200 response code (code=400, msg={"error": {"message": "Missing required input", "code": 400, "type": "InvalidRequestError", "param": "context"}}) for address: http://localhost:5001/v1/api/v1/generate. Make sure that the web UI server is running and reachable at http://localhost:5001/v1/api/v1/generate.
```
@ -36,6 +37,7 @@ This error occurs when the LLM you're using outputs a string that cannot be pars
Many JSON-related output errors can be fixed by using a wrapper that uses grammars (requires a grammar-enabled backend). See instructions about [grammars here](local_llm).
For example, let's look at the following error:
```text
Failed to parse JSON from local LLM response - error: Failed to decode JSON from LLM output:
{
@ -48,6 +50,7 @@ JSONDecodeError.init() missing 2 required positional arguments: 'doc' and 'pos'
```
In this example, the error is saying that the local LLM output the following string:
```text
{
"function": "send_message",
@ -58,6 +61,7 @@ In this example, the error is saying that the local LLM output the following str
```
This string is not correct JSON - it is missing closing brackets and has a stray "<|>". Correct JSON would look like this:
```json
{
"function": "send_message",
@ -71,3 +75,20 @@ This string is not correct JSON - it is missing closing brackets and has a stray
### "Got back an empty response string from ..."
MemGPT asked the server to run the LLM, but got back an empty response. Double-check that your server is running properly and has context length set correctly (it should be set to 8k if using Mistral 7B models).
### "Unable to connect to endpoint" using Windows + WSL
> ⚠️ We recommend using Anaconda Shell, as WSL has been known to have issues passing network traffic between WSL and the Windows host.
> Check the [WSL Issue Thread](https://github.com/microsoft/WSL/issues/5211) for more info.
If you still would like to try WSL, you must be on WSL version 2.0.5 or above, installed from the Microsoft Store app.
You will need to verify that your WSL network mode is set to "mirrored".
You can do this by checking the `.wslconfig` file in `%USERPROFILE%`.
Add the following if the file does not already contain it:
```ini
[wsl2]
networkingMode=mirrored # add this line if the wsl2 section already exists
```
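If you are unsure which WSL version you have, or want to make sure the new networking mode is picked up, the following may help (a sketch: `wsl --version` is only available on newer Microsoft Store installs of WSL, and both commands are run from PowerShell or CMD on the Windows side):
```sh
# check the installed WSL version (should report 2.0.5 or above)
wsl --version
# after editing .wslconfig, fully shut down WSL so the mirrored networking mode takes effect
wsl --shutdown
```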

View File

@ -1,7 +1,7 @@
---
title: Customizing LLM parameters
excerpt: How to set LLM inference parameters (advanced)
category: 6580da9a40bb410016b8b0c3
---
> 📘 Understanding different parameters
@ -16,19 +16,22 @@ This means that many LLM inference parameters (such as temperature) will be set
### Finding the settings file
To set your own parameters passed to custom LLM backends (ie non-OpenAI endpoints), you can modify the file `completions_api_settings.json` located in your MemGPT home folder.
On Linux/MacOS, the file will be located at:
```sh
~/.memgpt/settings/completions_api_settings.json
```
And on Windows:
```batch
C:\Users\[YourUsername]\.memgpt\settings\completions_api_settings.json
```
You can also use the `memgpt folder` command which will open the home directory for you:
```sh
# this should pop open a folder view on your system
memgpt folder
@ -40,12 +43,13 @@ Once you've found the file, you can open it your text editor of choice and add f
When editing the file, make sure you are using parameters that are specified by the backend API you're using. In many cases, the naming scheme will follow the [llama.cpp conventions](https://github.com/ggerganov/llama.cpp/blob/master/examples/main/README.md) or the [OpenAI Completions API conventions](https://platform.openai.com/docs/api-reference/completions/create), but make sure to check the documentation of the specific backend you are using. **If parameters are misspecified, they may cause your LLM backend to throw an error or crash.**
Additionally, make sure that your settings file is valid JSON. Many text editors will highlight invalid JSON, but you can also check your JSON using [tools online](https://jsonformatter.org/).
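If you'd rather not paste your settings into an online tool, one local alternative (a sketch; any JSON validator will do) is Python's built-in `json.tool` module:
```sh
# prints the parsed JSON on success, or an error pointing at the offending line
python -m json.tool ~/.memgpt/settings/completions_api_settings.json
```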
### Example: LM Studio (simple)
As a simple example, let's try setting the temperature. Assuming we've already [set up LM Studio](lmstudio), if we start a MemGPT chat while using the LM Studio API, we'll see the request and its associated parameters inside the LM Studio server logs, including `"temp": 0.8`:
```sh
[INFO] Provided inference configuration: {
...(truncated)...
"temp": 0.8,
@ -54,15 +58,18 @@ As a simple example, let's try setting the temperature. Assuming we've already [
```
Let's try changing the temperature to `1.0`. In our `completions_api_settings.json` file, we set the following:
```json
{
  "temp": 1.0
}
```
Note how we're using the naming conventions from llama.cpp. In this case, using `"temperature"` instead of `"temp"` will also work.
Now if we save the file and start a new agent chat with `memgpt run`, we'll notice that the LM Studio server logs now say `"temp": 1.0`:
```sh
[INFO] Provided inference configuration: {
...(truncated)...
"temp": 1,
@ -77,23 +84,26 @@ Hooray! That's the gist of it - simply set parameters in your JSON file and they
With LM Studio we can observe the settings that are loaded in the server logs, but with some backends you may not be able to see the parameters of the request so it can be difficult to tell if your settings file is getting loaded correctly.
To double-check that your settings are being loaded and passed to the backend, you can run MemGPT with the `--debug` parameter and look for the relevant output:
```sh
memgpt run --debug
```
If your parameters are getting picked up correctly, they will be output to the terminal:
```sh
...(truncated)...
Found completion settings file '/Users/user/.memgpt/settings/completions_api_settings.json', loading it...
Updating base settings with the following user settings:
{
  "temp": 1.0
}
...(truncated)...
```
If you have an empty settings file or your file wasn't saved properly, you'll see the following message:
```sh
...(truncated)...
Found completion settings file '/Users/loaner/.memgpt/settings/completions_api_settings.json', loading it...
'/Users/user/.memgpt/settings/completions_api_settings.json' was empty, ignoring...
@ -102,10 +112,11 @@ Found completion settings file '/Users/loaner/.memgpt/settings/completions_api_s
### Example: LM Studio (advanced)
In practice, there are many parameters you might want to set, since tuning these parameters can dramatically alter the tone or feel of the generated LLM outputs. Let's try changing a larger set of parameters.
Now just for reference, let's record the set of parameters before any modifications (truncated to include the parameters we're changing only):
```text
[INFO] Provided inference configuration: {
...(truncated)...
"top_k": 40,
@ -127,11 +138,12 @@ Now just for reference, let's record the set of parameters before any modificati
```
Now copy the following to your `completions_api_settings.json` file:
```json
{
  "top_k": 1,
  "top_p": 0,
  "temp": 0,
"repeat_penalty": 1.18,
"seed": -1,
"tfs_z": 1,
@ -147,7 +159,8 @@ Now copy the following to your `completions_api_settings.json` file:
```
When we run, our settings are updated:
```text
[INFO] Provided inference configuration: {
...(truncated)...
"top_k": 1,
@ -166,4 +179,4 @@ When we run, our settings are updated:
"penalize_nl": true,
...(truncated)...
}
```

View File

@ -1,24 +1,26 @@
---
title: Ollama
excerpt: Setting up MemGPT with Ollama
category: 6580da9a40bb410016b8b0c3
---
> ⚠️ Make sure to use tags when downloading Ollama models!
>
> Don't do **`ollama pull dolphin2.2-mistral`**, instead do **`ollama pull dolphin2.2-mistral:7b-q6_K`**.
>
> If you don't specify a tag, Ollama may default to using a highly compressed model variant (e.g. Q4). We highly recommend **NOT** using a compression level below Q5 when using GGUF (stick to Q6 or Q8 if possible). In our testing, certain models start to become extremely unstable (when used with MemGPT) below Q6.
1. Download + install [Ollama](https://github.com/jmorganca/ollama) and the model you want to test with
2. Download a model to test with by running `ollama pull <MODEL_NAME>` in the terminal (check the [Ollama model library](https://ollama.ai/library) for available models)
For example, if we want to use Dolphin 2.2.1 Mistral, we can download it by running:
```sh
# Let's use the q6_K variant
ollama pull dolphin2.2-mistral:7b-q6_K
```
```sh
pulling manifest
pulling d8a5ee4aba09... 100% |█████████████████████████████████████████████████████████████████████████| (4.1/4.1 GB, 20 MB/s)
pulling a47b02e00552... 100% |██████████████████████████████████████████████████████████████████████████████| (106/106 B, 77 B/s)
@ -32,7 +34,8 @@ success
```
In your terminal where you're running MemGPT, run `memgpt configure` to set the default backend for MemGPT to point at Ollama:
```sh
# if you are running Ollama locally, the default IP address + port will be http://localhost:11434
# IMPORTANT: with Ollama, there is an extra required "model name" field
? Select LLM inference provider: local
@ -43,6 +46,7 @@ In your terminal where you're running MemGPT, run `memgpt configure` to set the
```
If you have an existing agent that you want to move to the Ollama backend, add extra flags to `memgpt run`:
```sh
# use --model to switch Ollama models (always include the full Ollama model name with the tag)
# use --model-wrapper to switch model wrappers
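# for example (illustrative values: "agent_11" is a placeholder agent name, and the model tag should match the one you pulled):
memgpt run --agent agent_11 --model dolphin2.2-mistral:7b-q6_K --model-endpoint-type ollama --model-endpoint http://localhost:11434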

View File

@ -1,7 +1,7 @@
---
title: Creating new MemGPT presets
excerpt: Presets allow you to customize agent functionality
category: 6580daaa48aeca0038fc2297
---
MemGPT **presets** are a combination of default settings, including a system prompt and a function set. For example, the `memgpt_docs` preset uses a system prompt that is tuned for document analysis, while the default `memgpt_chat` is tuned for general chatting purposes.
@ -9,6 +9,7 @@ MemGPT **presets** are a combination default settings including a system prompt
You can create your own presets by creating a `.yaml` file in the `~/.memgpt/presets` directory. If you want to use a new custom system prompt in your preset, you can create a `.txt` file in the `~/.memgpt/system_prompts` directory.
For example, if I create a new system prompt and place it in `~/.memgpt/system_prompts/custom_prompt.txt`, I can then create a preset that uses this system prompt by creating a new file `~/.memgpt/presets/custom_preset.yaml`:
```yaml
system_prompt: "custom_prompt"
functions:
@ -22,4 +23,4 @@ functions:
- "archival_memory_search"
```
This preset uses the same base function set as the default presets. You can see the example presets provided [here](https://github.com/cpacker/MemGPT/tree/main/memgpt/presets/examples), and you can see example system prompts [here](https://github.com/cpacker/MemGPT/tree/main/memgpt/prompts/system).
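To try the preset out, you can point the CLI at it when creating a new agent (a sketch: `custom_preset` is the hypothetical preset name from above, and it assumes `memgpt run` exposes a `--preset` flag for selecting a preset at agent-creation time):
```sh
memgpt run --preset custom_preset
```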

View File

@ -1,27 +1,28 @@
---
title: Python client
excerpt: Developing using the MemGPT Python client
category: 6580dab16cade8003f996d17
---
The fastest way to integrate MemGPT with your own Python projects is through the `MemGPT` client class:
```python
from memgpt import MemGPT
# Create a MemGPT client object (sets up the persistent state)
client = MemGPT(
quickstart="openai",
config={
"openai_api_key": "YOUR_API_KEY"
}
)
# You can set many more parameters, this is just a basic example
agent_id = client.create_agent(
agent_config={
"persona": "sam_pov",
"user": "cs_phd",
}
)
# Now that we have an agent_name identifier, we can send it a message!
@ -44,11 +45,11 @@ client = MemGPT(
# user message. This may have performance implications, so you
# can otherwise choose when to save explicitly using client.save().
auto_save=True,
    # Quickstart will automatically configure MemGPT (without having to run `memgpt configure`)
# If you choose 'openai' then you must set the api key (env or in config)
quickstart=QuickstartChoice.memgpt_hosted,
# Allows you to override default config generated by quickstart or `memgpt configure`
config={}
)
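# Usage sketch (assumes an agent_id from create_agent() above, and that the client
# exposes a user_message() helper for sending a message to that agent):
response = client.user_message(agent_id=agent_id, message="Hello, MemGPT!")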

View File

@ -1,17 +1,28 @@
---
title: Quickstart
excerpt: Get up and running with MemGPT
category: 6580d34ee5e4d00068bf2a1d
---
### Installation
> 📘 Using Local LLMs?
>
> If you're using local LLMs, refer to the MemGPT + open models page [here](local_llm) for additional installation requirements.
To install MemGPT, make sure you have Python installed on your computer, then run:
```sh
pip install pymemgpt
```
If you are running LLMs locally, you will want to install MemGPT with the local dependencies by running:
```sh
pip install 'pymemgpt[local]'
```
If you already have MemGPT installed, you can update to the latest version with:
```sh
pip install pymemgpt -U
```
@ -19,6 +30,7 @@ pip install pymemgpt -U
### Running MemGPT
Now, you can run MemGPT and start chatting with a MemGPT agent with:
```sh
memgpt run
```
@ -33,10 +45,12 @@ Neither of these options require you to have an LLM running on your own machine.
### Quickstart
If you'd ever like to quickly switch back to the default **OpenAI** or **MemGPT Free Endpoint** options, you can use the `quickstart` command:
```sh
# this will set you up on the MemGPT Free Endpoint
memgpt quickstart
```
```sh
# this will set you up on the default OpenAI settings
memgpt quickstart --backend openai

View File

@ -1,7 +1,7 @@
---
title: Configuring storage backends
excerpt: Customizing the MemGPT storage backend
category: 6580d34ee5e4d00068bf2a1d
---
> ⚠️ Switching storage backends
@ -11,20 +11,26 @@ category: 6580d34ee5e4d00068bf2a1d
MemGPT supports both local and database storage for archival memory. You can configure which storage backend to use via `memgpt configure`. For larger datasets, we recommend using a database backend.
## Local
MemGPT will default to using local storage (saved at `~/.memgpt/archival/` for loaded data sources, and `~/.memgpt/agents/` for agent storage).
## Postgres
In order to use the Postgres backend, you must have a running Postgres database that MemGPT can write to. You can enable the Postgres backend by running `memgpt configure` and selecting `postgres` for archival storage, which will then prompt for the database URI (e.g. `postgresql+pg8000://<USER>:<PASSWORD>@<IP>:5432/<DB_NAME>`). To enable the Postgres backend, make sure to install the required dependencies with:
```sh
pip install 'pymemgpt[postgres]'
```
### Running Postgres
You will need to have a URI to a Postgres database that supports [pgvector](https://github.com/pgvector/pgvector). You can either use a [hosted provider](https://github.com/pgvector/pgvector/issues/54) or [install pgvector](https://github.com/pgvector/pgvector#installation).
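If you just want a local database to test against, one option (a sketch, not an official recommendation: it assumes Docker is installed and uses the community `ankane/pgvector` image with throwaway credentials) is to run pgvector in a container:
```sh
# start a disposable Postgres instance with the pgvector extension available
docker run -d --name memgpt-db -p 5432:5432 -e POSTGRES_PASSWORD=memgpt ankane/pgvector
# the matching URI for `memgpt configure` would then be:
#   postgresql+pg8000://postgres:memgpt@localhost:5432/postgres
```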
## Chroma
You can configure Chroma with both the HTTP and persistent storage clients via `memgpt configure`. You will need to specify either a persistent storage path or a host/port depending on your client choice. The example below shows how to configure Chroma with local persistent storage:
```text
? Select LLM inference provider: openai
? Override default endpoint: https://api.openai.com/v1
? Select default model (recommended: gpt-4): gpt-4
@ -38,10 +44,11 @@ You can configure Chroma with both the HTTP and persistent storage client via `m
```
## LanceDB
You have to enable the LanceDB backend by running
```sh
memgpt configure
```
and selecting `lancedb` for archival storage, then providing a database URI (e.g. `./.lancedb`). An empty archival URI is also handled, in which case the default URI `./.lancedb` is used. For more information, check out the [LanceDB docs](https://lancedb.github.io/lancedb/).

View File

@ -1,13 +1,14 @@
---
title: vLLM
excerpt: Setting up MemGPT with vLLM
category: 6580da9a40bb410016b8b0c3
---
1. Download + install [vLLM](https://docs.vllm.ai/en/latest/getting_started/installation.html)
2. Launch a vLLM **OpenAI-compatible** API server using [the official vLLM documentation](https://docs.vllm.ai/en/latest/getting_started/quickstart.html)
For example, if we want to use the model `dolphin-2.2.1-mistral-7b` from [HuggingFace](https://huggingface.co/ehartford/dolphin-2.2.1-mistral-7b), we would run:
```sh
python -m vllm.entrypoints.openai.api_server \
--model ehartford/dolphin-2.2.1-mistral-7b
@ -16,7 +17,8 @@ python -m vllm.entrypoints.openai.api_server \
vLLM will automatically download the model (if it's not already downloaded) and store it in your [HuggingFace cache directory](https://huggingface.co/docs/datasets/cache).
In your terminal where you're running MemGPT, run `memgpt configure` to set the default backend for MemGPT to point at vLLM:
```text
# if you are running vLLM locally, the default IP address + port will be http://localhost:8000
? Select LLM inference provider: local
? Select LLM backend (select 'openai' if you have an OpenAI compatible proxy): vllm
@ -26,6 +28,7 @@ In your terminal where you're running MemGPT, run `memgpt configure` to set the
```
If you have an existing agent that you want to move to the vLLM backend, add extra flags to `memgpt run`:
```sh
memgpt run --agent your_agent --model-endpoint-type vllm --model-endpoint http://localhost:8000 --model ehartford/dolphin-2.2.1-mistral-7b
```

View File

@ -1,7 +1,7 @@
---
title: oobabooga web UI
excerpt: Setting up MemGPT with web UI
category: 6580da9a40bb410016b8b0c3
---
> 📘 web UI troubleshooting
@ -19,7 +19,8 @@ In this example we'll set up [oobabooga web UI](https://github.com/oobabooga/tex
5. Assuming steps 1-4 went correctly, the LLM is now properly hosted on a port you can point MemGPT to!
In your terminal where you're running MemGPT, run `memgpt configure` to set the default backend for MemGPT to point at web UI:
```text
# if you are running web UI locally, the default IP address + port will be http://localhost:5000
? Select LLM inference provider: local
? Select LLM backend (select 'openai' if you have an OpenAI compatible proxy): webui
@ -28,6 +29,7 @@ In your terminal where you're running MemGPT, run `memgpt configure` to set the
```
If you have an existing agent that you want to move to the web UI backend, add extra flags to `memgpt run`:
```sh
memgpt run --agent your_agent --model-endpoint-type webui --model-endpoint http://localhost:5000
```

View File

@ -1 +1,3 @@
# WebUI
TODO: write this documentation.

View File

@ -4,6 +4,8 @@ from prettytable import PrettyTable
import typer
import os
import shutil
from typing import Annotated
from enum import Enum
# from memgpt.cli import app
from memgpt import utils
@ -497,9 +499,16 @@ def configure():
config.save()
class ListChoice(str, Enum):
agents = "agents"
humans = "humans"
personas = "personas"
sources = "sources"
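# ListChoice restricts the `memgpt list` argument to one of the values above,
# e.g. `memgpt list agents` or `memgpt list sources`; other values are rejected by typer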
@app.command()
def list(arg: Annotated[ListChoice, typer.Argument]):
if arg == ListChoice.agents:
"""List all agents"""
table = PrettyTable()
table.field_names = ["Name", "Model", "Persona", "Human", "Data Source", "Create Time"]
@ -517,7 +526,7 @@ def list(option: str):
]
)
print(table)
elif arg == ListChoice.humans:
"""List all humans"""
table = PrettyTable()
table.field_names = ["Name", "Text"]
@ -526,7 +535,7 @@ def list(option: str):
name = os.path.basename(human_file).replace("txt", "")
table.add_row([name, text])
print(table)
elif arg == ListChoice.personas:
"""List all personas"""
table = PrettyTable()
table.field_names = ["Name", "Text"]
@ -536,7 +545,7 @@ def list(option: str):
name = os.path.basename(persona_file).replace(".txt", "")
table.add_row([name, text])
print(table)
elif arg == ListChoice.sources:
"""List all data sources"""
conn = StorageConnector.get_metadata_storage_connector(table_type=TableType.DATA_SOURCES) # already filters by user
passage_conn = StorageConnector.get_storage_connector(table_type=TableType.PASSAGES)
@ -554,7 +563,7 @@ def list(option: str):
table.add_row([data_source.name, data_source.created_at, num_passages, ""])
print(table)
else:
raise ValueError(f"Unknown argument {arg}")
@app.command()

View File

@ -19,7 +19,14 @@ ArchivalMemorySearchParams ::= "{" ws InnerThoughtsParam "," ws "\"quer
InnerThoughtsParam ::= "\"inner_thoughts\":" ws string
RequestHeartbeatParam ::= "\"request_heartbeat\":" ws boolean
namestring ::= "\"human\"" | "\"persona\""
boolean ::= "true" | "false"
number ::= [0-9]+
string ::=
"\"" (
[^"\\] |
"\\" (["\\/bfnrt] | "u" [0-9a-fA-F] [0-9a-fA-F] [0-9a-fA-F] [0-9a-fA-F]) # escapes
)* "\"" ws
# Optional space: by convention, applied in this grammar after literal chars when allowed
ws ::= ([ \t\n] ws)?
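# Illustrative note: the string rule above accepts escapes and brackets inside strings,
# so a value like "She said \"hi\" and listed [1, 2]" is matched as a single string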