Categories
AI

What I learned about LM Studio

LM Studio is a user-friendly desktop application designed for exploring local and open-source Large Language Models (LLMs). Your data remains stored securely on your machine and, best of all, it’s completely free for personal use.

Install

Download the installation binary from the LM Studio website and click through the installation wizard. It’s simple, quick, and has no surprises.

Select LLM Models

A good starting point is the model Llama 3 - 8B Instruct, which is simpler than most and can be run on a regular notebook.

LM Studio (2024-07-24)

Attempting to upgrade to Llama 3.1 8B Instruct on July 24, 2024, was unsuccessful because the download link led to an invalid path, resulting in an HTTP 404 Not Found error.

I had to apply for access to the gated repository on Hugging Face: huggingface.co – Meta-Llama-3.1-8B-Instruct-GGUF.

https://huggingface.co/settings/gated-repos (2024-07-24)

Then download from here: https://huggingface.co/lmstudio-community/Meta-Llama-3.1-8B-Instruct-GGUF/tree/main

I chose Meta-Llama-3.1-8B-Instruct-Q4_K_M.gguf.

Capacity and Complexity

  • 8B Model:
    • Lowest parameter count.
    • Suitable for less complex tasks with reduced memory and computational requirements.
  • 70B Model:
    • Mid-range complexity.
    • Better at handling nuanced tasks compared to the 8B model.
  • 405B Model:
    • Highest complexity and capacity.
    • Most capable in understanding context, generating coherent text, and performing complex reasoning tasks.
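As a rough rule of thumb (my own back-of-the-envelope estimate, not an official formula), the download size of a GGUF file is approximately the parameter count times the bits per weight; actual runtime memory is somewhat higher because of the context cache and overhead:

```python
def approx_gguf_size_gb(params_billion: float, bits_per_weight: float) -> float:
    """Rough estimate: file size ≈ parameters × bits per weight / 8.
    Runtime RAM usage is higher due to KV cache and runtime overhead."""
    return params_billion * bits_per_weight / 8

# 8B at Q8_0 (~8 bits/weight): roughly 8 GB on disk
print(approx_gguf_size_gb(8, 8))    # → 8.0
# 8B at Q4_K_M (~4.5 bits/weight on average): roughly 4.5 GB on disk
print(approx_gguf_size_gb(8, 4.5))  # → 4.5
```

These estimates line up with the memory usage I observed later (around 9 GB for Q8_0 and around 5.7 GB for Q4_K_M, including overhead).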

I had to create the folder myself: C:\Users\<user>\.cache\lm-studio\models\lmstudio-community\Meta-Llama-3.1-8B-Instruct-GGUF\

Knowledge cutoff is December 2023 and it has official support for 8 languages:

  • English
  • German
  • French
  • Italian
  • Portuguese
  • Hindi
  • Spanish
  • Thai

After reloading LM Studio just hours after the model’s release, I had access to the latest Meta-Llama-3.1-8B.

LM Studio (2024-07-24) with Meta-Llama-3.1-8B-Instruct-Q4_K_M.gguf

One significant improvement I’ve noticed in Meta-Llama-3.1-8B-Instruct-Q4_K_M.gguf is that it can now respond with valid and clean JSON, which was not possible for me with Llama 3.

Give the answer in JSON and do not add markers for markup. 

However, for automated processing you will always also want to provide the expected JSON structure.
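One way to combine both instructions is to spell out the target schema in the prompt itself; the field names below are purely illustrative:

```
Give the answer in JSON and do not add markers for markup.
Respond with exactly this structure:
{
  "name": "<string>",
  "score": <number>
}
```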

LM Studio (2024-07-24) with Meta-Llama-3.1-8B-Instruct-Q8_0.gguf around 9GB

Memory usage on my Windows is with Meta-Llama-3.1-8B-Instruct-Q8_0.gguf around 9GB

LM Studio (2024-07-24) with Meta-Llama-3.1-8B-Instruct-Q4_K_M.gguf around 5.7GB

Memory usage on my Windows is with Meta-Llama-3.1-8B-Instruct-Q4_K_M.gguf around 5.7GB

Local LLM Chat

You switch to the chat section, load your model in the top banner, and once it’s loaded, you can start chatting with your local LLM directly.

LM Studio (2024-07-24)

I noticed that English works much better than German with the local LLM, whereas models from Anthropic and OpenAI handle a larger set of languages more easily.

Local LLM API Server

You can utilize Large Language Models (LLMs) that are loaded into LM Studio through an API server running on localhost. The requests and responses adhere to OpenAI’s API format.

This is especially powerful for applications where you would prefer not to send the data to OpenAI or any other popular provider.

The big downside is that you can only serve one model for general chat inquiries plus one embedding model at a time. Other tools let the API request determine which model gets loaded; here, no matter which model you name in the HTTP request, the model you selected in the GUI will answer.
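A minimal client sketch, assuming LM Studio’s server is running on its default port 1234 (the port is configurable in the GUI). The `model` field is required by the OpenAI format, but as noted above, LM Studio answers with whichever model is loaded in the GUI regardless of its value:

```python
import json
import urllib.request

# LM Studio's local server speaks OpenAI's chat-completions format.
BASE_URL = "http://localhost:1234/v1"

def build_chat_request(prompt: str, model: str = "local-model") -> dict:
    """Build an OpenAI-style chat completion payload.
    LM Studio ignores the 'model' field and always answers with the
    model currently loaded in the GUI."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "temperature": 0.7,
    }

def chat(prompt: str) -> str:
    """Send a single-turn chat request to the local LM Studio server."""
    req = urllib.request.Request(
        BASE_URL + "/chat/completions",
        data=json.dumps(build_chat_request(prompt)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        body = json.load(resp)
    return body["choices"][0]["message"]["content"]

# Example (requires the LM Studio server to be running):
# print(chat("Answer in one short sentence: what is a GGUF file?"))
```

Because the request and response shapes follow OpenAI’s format, existing OpenAI client libraries can also be pointed at this base URL instead.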

Alternative

If you need it for commercial purposes, the licensing can be quite complicated. Therefore, I would recommend using LM Studio only for personal use.

https://lmstudio.ai/ (2024-07-25)

Other offerings, such as jan.ai and ollama.com, are what I would use for commercial purposes in a business.

No matter what you choose, modern hardware, especially a modern CPU, is almost a necessity. Therefore, running an old scrap PC as an AI server is sadly not a viable option.

Jan.ai

Personally, I have not played with Jan yet, but I have heard from friends that it is also a good alternative.

https://github.com/janhq/jan

Jan.ai API almost there

https://jan.ai/ (2024-07-25)

Ollama for API

If you are an advanced user mostly interested in a powerful local API for your AI needs, this is the best place to be. It offers native clients for all relevant operating systems and the option to run in Docker. This is the tool of choice for advanced users and for people who want to become advanced users.
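Unlike LM Studio, Ollama loads whichever model the request names. A minimal sketch against its REST API, assuming a default install listening on port 11434 and a model that has already been pulled (e.g. with `ollama pull llama3.1`):

```python
import json
import urllib.request

# Ollama listens on http://localhost:11434 by default.
OLLAMA_URL = "http://localhost:11434/api/generate"

def build_generate_request(model: str, prompt: str) -> dict:
    """Payload for Ollama's /api/generate endpoint; stream=False returns
    the whole answer in a single JSON response instead of chunks."""
    return {"model": model, "prompt": prompt, "stream": False}

def generate(model: str, prompt: str) -> str:
    """Ask a named model for a completion; Ollama loads it on demand."""
    req = urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(build_generate_request(model, prompt)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["response"]

# Example (requires a running Ollama daemon and a pulled model):
# print(generate("llama3.1", "Say hello in one short sentence."))
```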

https://ollama.com