AI Integration Guide 2026: Complete Model Setup & Best Practices

Every tool on Tools-Online.app includes AI assistance for writing, coding, diagram generation, and content analysis. This guide covers the three AI integration options available in 2026, how to set each one up, and which models work best for different tasks.

How to Use AI Features

The AI workflow is consistent across all tools — Notepad, code editors, Mermaid diagrams, and more.

Step 1: Click the settings icon at the bottom left of your browser window. This opens the AI configuration panel.

AI assistant configuration window

Step 2: Click the Model dropdown to see available options, organized by provider:

  • AI-ML API — Cloud models (Claude, GPT, Gemini, image/video generation)
  • OpenRouter — 100+ models from multiple providers via single API key
  • WebLLM — Privacy-first models running entirely in your browser

Step 3: Select your model and enter an API key if required. WebLLM models need no API key.


What You Need to Know Before Starting

Hardware Requirements

  • Cloud Models (AI-ML API, OpenRouter): No local hardware constraints — just stable internet
  • WebLLM Browser Models: Minimum 16 GB RAM (32 GB recommended). Models range from 0.9 GB to 5.7 GB in size. Processing runs on your CPU/GPU using WebAssembly

Costs

  • AI-ML API: Pay-per-use pricing. GPT-4.1 Nano is the most cost-effective for daily tasks
  • OpenRouter: Pay-per-use with many free models available (Llama 3.3 70B, DeepSeek V3, Qwen3)
  • WebLLM: Completely free after download — no API costs

Privacy

  • Cloud models: Data processed on provider servers with HTTPS encryption
  • WebLLM: Complete local processing — data never leaves your device

AI-ML API Models (Recommended for Most Users)

Our primary cloud provider offering state-of-the-art AI models through a unified API.

Text & Reasoning Models

  • GPT-4.1 NanoTop pick for daily tasks. Efficient text generation and analysis ($ — most cost-effective)
  • Claude 4.0 Sonnet/Opus — Advanced reasoning and writing ($$ Sonnet / $$$$ Opus)
  • o4-mini — Specialized reasoning model ($)
  • Gemini 2.5 Flash Preview — Multimodal with vision support ($$)
  • Qwen 2.5 Coder 32B — Specialized coding assistance ($$$)
  • Llama 3.1 70B — Open-source high-performance model ($$$)

Multimodal Models

Image & Video Generation

The system automatically routes requests to appropriate specialized models based on task type (text, image, code, multimodal).


OpenRouter Models (Multi-Provider Access)

OpenRouter provides a unified gateway to 100+ AI models from multiple providers — all accessible through a single API key. This is the fastest way to experiment with diverse models including free options.

Top Models on OpenRouter

ModelProviderContext / ParametersStrengths
Claude 4 SonnetAnthropic200K tokensHybrid reasoning, chain of thought, agents
Gemini 2.0 FlashGoogle1M tokensLow-latency SEO, summarization
Gemini 2.5 ProGoogle1M tokens, thinking modeDeep reasoning, coding, science
GPT-4o-miniOpenAI128K tokensVision support, highly cost-effective
DeepSeek V3 0324DeepSeek685B MoE, 163K tokensFree, open source, top logic performance

Top Free Models on OpenRouter

ModelProviderParameters / ContextUse Case
Llama 4 "Maverick"Meta400B MoE (17B active), 128KMultimodal, vision + text tasks
Llama 3.3 70B InstructMeta70B, 131K tokensChat, reasoning, multilingual
DeepSeek V3 0324DeepSeek685B MoE, 163K tokensResearch, logic, general purpose
Qwen3-30B-A3BTencent30.5B (3.3B active), 131KFast and intelligent dialogue + code
Mistral Small 3Mistral24B, 32K tokensHigh-quality open model, low latency

Model rankings update frequently. Explore current options:


WebLLM Browser Models (Privacy-First)

Privacy-first AI processing that runs entirely in your browser. After the initial download, these models work completely offline.

Ultra-Lightweight Options

  • Llama 3.2 1B (~0.9 GB) — Instruction-following with enhanced efficiency
  • Gemma 2 2B (~1.9 GB) — Google's optimized model with improved memory usage
  • Qwen3 4B (~3.2 GB)Top pick. Excellent multilingual capabilities with strong reasoning
  • Phi-3.5 Mini (~3.7 GB)Best for coding. Microsoft's specialized programming model
  • Llama 3.2 3B (~2.3 GB) — Enhanced reasoning with better memory efficiency

High-Performance Options

  • Qwen3 8B (~5.7 GB) — High-performance multilingual model with coding expertise
  • Hermes-3 Llama 3.1 8B (~4.9 GB) — Function-calling and instruction-following

Models are cached permanently after download until manually cleared.


Step-by-Step Setup Instructions

Setting Up AI-ML API

  1. Navigate to any AI-enabled tool (Notepad, Code Editors, Mermaid, etc.)
  2. Click the AI settings icon in the bottom left
  3. Select an AI-ML model from the dropdown menu
  4. Enter your AI-ML API key when prompted
  5. Start using AI assistance immediately

Get your API key: AI-ML API Dashboard

Setting Up OpenRouter

  1. Visit OpenRouter and create an account
  2. Generate an API key from your dashboard
  3. In your tool's AI settings, select your preferred OpenRouter model
  4. Enter your API key in the provided text box
  5. Start using any of 100+ available models

Documentation: OpenRouter Docs

Setting Up WebLLM Models

  1. Enable GPU acceleration in your browser for optimal performance:
    • Chrome/Edge: Visit chrome://flags/#enable-webgpu and enable WebGPU
    • Firefox: Visit about:config and set dom.webgpu.enabled to true
  2. Open AI settings and select a WebLLM model
  3. Choose based on your system resources:
    • 8–16 GB RAM: Llama 3.2 1B or Gemma 2B
    • 16–24 GB RAM: Qwen3 4B (recommended), Llama 3.2 3B, or Phi-3.5 Mini (coding)
    • 32 GB+ RAM: Any model including Qwen3 8B or Hermes-3 8B
  4. First selection triggers automatic download
  5. Wait for model initialization (one-time process)
  6. Enjoy offline AI assistance

Pro Tip: Download models on fast WiFi — they're cached permanently until manually cleared.


Best Practices

Model Selection Strategy

Task TypeRecommended ModelProviderCost
Daily tasksGPT-4.1 NanoAI-ML API$
Complex reasoningClaude 4.0 SonnetAI-ML API$$
Coding assistancePhi-3.5 Mini or Qwen 2.5 CoderWebLLM / AI-MLFree / $$$
Privacy-sensitiveQwen3 4BWebLLMFree
ExperimentationLlama 3.3 70B or DeepSeek V3OpenRouterFree
Image generationFLUX SchnellAI-ML API$
Video generationGoogle Veo3AI-ML API$$$$

Resource Management

  • Download WebLLM models on WiFi to conserve mobile data
  • Clear unused models if storage becomes limited
  • Monitor system performance when running multiple browser models

Troubleshooting

WebLLM model won't download: Ensure sufficient disk space and stable internet connection.

WebLLM models running slowly: Enable GPU acceleration in your browser:

  • Chrome/Edge: Visit chrome://flags/#enable-webgpu and enable WebGPU
  • Firefox: Visit about:config and set dom.webgpu.enabled to true

AI-ML API errors: Verify your API key is correctly entered in settings.

OpenRouter API errors: Check your API key and account balance at openrouter.ai.

Browser crashes with WebLLM: Reduce model size or increase system RAM allocation.


Security & Privacy

Cloud Providers (AI-ML API, OpenRouter)

  • Data processed on secure cloud infrastructure with HTTPS encryption
  • Check provider-specific privacy policies for data retention details
  • Requires active internet connection for all operations

WebLLM (Browser-Based)

  • Complete local processing — data never leaves your device
  • No internet communication after initial model download
  • Ideal for sensitive documents, personal notes, and confidential code

Deprecated: Ollama/LM Studio Integration

In January 2025, we removed support for Ollama and LM Studio integration. Browser security policies prevent HTTPS production sites from accessing localhost HTTP endpoints, making these integrations unreliable in deployment.

Migration paths:

  • For privacy-first local AI → Use WebLLM models (Qwen3 4B recommended)
  • For superior performance → Use AI-ML API or OpenRouter
  • Your workflow remains the same — only the underlying AI provider changes

Conclusion

Whether you prioritize privacy with WebLLM, cutting-edge capabilities with AI-ML API, or multi-provider flexibility with OpenRouter, every tool on Tools-Online.app gives you access to powerful AI assistance directly in your browser.

Start with the AI-powered Notepad to try text generation, explore the Mermaid diagram editor for AI-generated diagrams, or use any code editor with AI coding assistance.

Additional Resources