Continue plugin for JetBrains IDEs and VS Code


IntelliJ IDEA offers AI support for programming through its built-in “AI Assistant”. However, it has two significant drawbacks:

  1. It runs in the cloud, meaning your code leaves your sphere of control and could potentially be used for further model training.
  2. It requires a paid subscription.

This has so far deterred many developers from using it for work.

Now, however, the plugin “Continue” offers a very simple and convenient way to use an arbitrary (possibly coding-specialized) LLM for assistance. This can be a local LLM, one hosted on the LAN or in the cloud, or even several at once.

You can install the plugin through IntelliJ IDEA's standard plugin mechanism. After restarting the IDE, a directory ~/.continue is created. The following script creates the configuration file for Continue:

bash -c "cat > ~/.continue/config.yaml" <<EOF
name: Local Assistant
version: 1.0.0
schema: v1
models: 
  - name: nomic-embed-text:latest
    provider: ollama
    apiBase: http://onza:11434
    model: nomic-embed-text:latest
    roles:
      - embed
  - name: dengcao/Qwen3-Reranker-4B:Q4_K_M
    provider: ollama
    apiBase: http://onza:11434
    model: dengcao/Qwen3-Reranker-4B:Q4_K_M
    roles:
      - rerank
  - name: GGUF-Qwen3-30B-A3B-Instruct-2507-Q8_0.gguf
    provider: openai
    apiBase: https://textapi.jlrbg.de/v1
    model: GGUF-Qwen3-30B-A3B-Instruct-2507-Q8_0.gguf
    roles:
      - chat
      - edit
      - apply
context:
  - provider: code
  - provider: docs
  - provider: diff
  - provider: file
  - provider: currentFile
  - provider: codebase
    params:
      nFinal: 5
      useReranking: true
      nRetrieve: 25
rules:
- |-
  <role>
      You are a code generation assistant that helps users with software engineering tasks. This will include solving bugs, adding new functionality, refactoring code, explaining code, and more. Adhere to the instructions below to assist the user.
  </role>
  <instructions>
      - You should be concise, direct, and to the point. When you generate non-trivial code, you should explain what the code does and why you propose it. Do not add comments to the code you write, unless the user asks you to, or the code is complex and requires additional context.
      - You should NOT answer with unnecessary preamble or postamble (such as repeating the user request or summarizing your answer), unless the user asks you to.
    - When proposing code modifications, consider the available context (if provided by the user). Mimic code style, use existing libraries and utilities, and follow existing patterns. Propose changes in a way that is most idiomatic. Emphasize only the necessary changes and use placeholders or "lazy" comments for unmodified sections.
      - Your responses can use Github-flavored markdown for formatting.
      - Always follow security best practices. Never introduce code that exposes or logs secrets and keys.
      - Class, method and variable names should be in German
  </instructions>
prompts:
- name: test
  description: Schreibe Unit-Test für den markierten Code
  prompt: |-
    Schreibe einen umfassenden Satz von Unit-Tests für den ausgewählten Code:
    - Es sollte Setup, Tests, die auf Korrektheit prüfen, einschließlich wichtiger Grenzfälle, und Teardown enthalten.
    - Stelle sicher, dass die Tests vollständig und anspruchsvoll sind.
    - Gib die Tests nur als Chat-Ausgabe aus, bearbeite keine Datei.
- name: comment
  description: Schreibe Kommentare für den markierten Code
  prompt: |-
    Schreibe Kommentare für den markierten Code:
    - Verändere hierbei nicht den Code an sich.
    - Verwende den jeweils üblichen Stil (bspw. Javadoc für Java).
- name: explain
  description: Beschreibe den folgenden Code detailliert hinsichtlich aller Methoden
    und Eigenschaften.
  prompt: |-
    Erläutere den markierten Teil des Codes näher.
    Im Einzelnen:
    1. Was ist der Zweck dieses Abschnitts?
    2. Wie funktioniert er Schritt für Schritt?
    3. Gibt es mögliche Probleme oder Einschränkungen bei diesem Ansatz?
- name: translate
  description: Übersetze den folgenden Text ins Englische
  prompt: |-
    Übersetze den folgenden Text ins Englische

EOF

As you can see, for chat, edit, and apply I use a model served over the OpenAI-compatible API (hosted through OTGWUI), while for rerank and embed I use small local models via Ollama. These could, of course, also run on another server if the local machine is too weak for them.
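
If no OpenAI-compatible endpoint is available, the chat, edit, and apply roles can just as well point at an Ollama model. Here is a minimal sketch of such an entry under models:, assuming the qwen3-coder:30b-a3b-q4_K_M model from the pull list below and the same example Ollama host onza as above:

  - name: qwen3-coder:30b-a3b-q4_K_M
    provider: ollama
    apiBase: http://onza:11434
    model: qwen3-coder:30b-a3b-q4_K_M
    roles:
      - chat
      - edit
      - apply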

The Continue plugin can provide its own line completion via “autocomplete” models. However, it is significantly inferior to the completion built into JetBrains IDEs, which used to be deactivated as soon as the Continue plugin was enabled. Recently it has finally become possible to keep the JetBrains line completion active alongside the Continue plugin.

I have therefore removed the configuration for the Continue line completion from the file.
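
For reference, such an autocomplete entry in config.yaml would look roughly like the following sketch; the model name is only an example and not part of my setup:

  - name: qwen2.5-coder:1.5b-base
    provider: ollama
    apiBase: http://onza:11434
    model: qwen2.5-coder:1.5b-base
    roles:
      - autocomplete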

Remember to disable the Continue line completion by unchecking Settings -> Tools -> Continue -> Enable Tab Autocomplete!


With this script, you can download the corresponding Ollama models:

ollama pull nomic-embed-text:latest
ollama pull dengcao/Qwen3-Reranker-4B:Q8_0
ollama pull qwen3-coder:30b-a3b-q8_0
ollama pull gemma3:27b-it-q8_0
ollama pull gpt-oss:20b
ollama pull mistral-small3.2:24b-instruct-2506-q8_0
ollama pull qwen3:30b-a3b-instruct-2507-q8_0
ollama pull qwen3:30b-a3b-thinking-2507-q8_0

For smaller GPUs:

ollama pull nomic-embed-text:latest
ollama pull qwen3-embedding:8b-q4_K_M
ollama pull dengcao/Qwen3-Reranker-4B:Q4_K_M
ollama pull qwen3-coder:30b-a3b-q4_K_M
ollama pull embeddinggemma:latest
ollama pull gemma3:27b-it-q4_K_M
ollama pull gpt-oss:20b
ollama pull mistral-small3.2:24b-instruct-2506-q4_K_M
ollama pull qwen3:30b-a3b-instruct-2507-q4_K_M
ollama pull qwen3:30b-a3b-thinking-2507-q4_K_M
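
To verify that the pulled models are actually available, you can list them on the Ollama host itself or query its API remotely (assuming the example hostname onza from the config above):

ollama list
curl http://onza:11434/api/tags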