OSS Models
Deploy open-source models from a gallery of pre-configured options, with GPU acceleration for inference.
Access the Models Gallery
Navigate to Inference.
Select Your Model
Browse through available pre-configured models organized by category:
- Language Models: OpenAI OSS 20B, DeepSeek, Gemma, Llama, Phi
- Image Generation: Flux Kontext, Flux Dev, Flux Krea, Cogview
- Computer Vision: Qwen 2.5 VL, YOLO, ResNet, object detection models
Select the model you want to deploy.
GPU Selection
After selecting a model, you'll be redirected to a GPU selection page where you can choose from a wide range of GPU providers and configurations. Each provider offers different GPU types optimized for model inference.
Browse the GPU gallery and select the GPU that best matches your performance and budget requirements.
Configure Your Model
A configuration sheet opens as a side panel containing all necessary settings to customize your model deployment.
A. Port Configuration
Configure network access for your model API:
- Primary Port: Main port for model API endpoints
- Additional Ports: Secondary ports if needed for monitoring or management
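Once a deployment is live, you can verify that the primary port is serving the model API. The sketch below assumes a vLLM-backed language model deployment (vLLM exposes a /health endpoint); the hostname and port are placeholders for your own values.

```python
import requests

# Placeholder hostname and primary port -- substitute your deployment's values.
BASE_URL = "http://my-deployment.example.com:8000"

# vLLM-backed deployments expose a /health endpoint; other runtimes may differ.
resp = requests.get(f"{BASE_URL}/health", timeout=10)
print("Model API reachable:", resp.status_code == 200)
```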
B. Machine Configuration
- GPU Configuration (AI Apps Only)
- CPU & Memory
- Storage Configuration
Aquanode provides optimized default configurations for each model based on its requirements and performance testing, but certain settings can be customized depending on the selected provider.
Deploy Your Model
Review your configuration summary including:
- Resource Allocation: Total GPU, CPU, memory, and storage
- Estimated Costs: Monthly cost projection
Click the "Deploy" button to initiate deployment.
Featured Models
- Flux.1-Dev → Flux’s developer model optimized for fast deployment.
- Flux.1-Kontext-dev → Flux’s context-aware model for multimodal tasks.
- Flux.1-Krea-dev → Flux’s creative-focused model for generative use cases.
Text Models
- DeepSeek-R1 1.5B / 8B / 14B / 32B / 70B
DeepSeek’s reasoning-focused models in multiple parameter sizes.
- Gemma / Gemma 2
Google’s lightweight language models via Ollama.
- Llama 3.1 / 3.2 / 3.3
Meta’s Llama 3.x family via Ollama.
- Phi / Phi 3 / Phi 3.5
Microsoft’s compact and efficient Phi series via Ollama.
- Qwen 2.5
Alibaba’s Qwen 2.5 model via Ollama.
- OpenAI OSS 20B / 120B
OpenAI’s GPT-OSS open models via vLLM.
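Most of the text models above are served via Ollama, which exposes a simple REST API (by default on port 11434). A minimal sketch, assuming a placeholder deployment host and the Llama 3.1 model tag:

```python
import requests

# Placeholder host -- replace with your deployment's address; 11434 is Ollama's default port.
OLLAMA_URL = "http://my-deployment.example.com:11434/api/generate"

payload = {
    "model": "llama3.1",  # Ollama model tag; use the tag your deployment actually serves
    "prompt": "Explain the difference between CPU and GPU inference in two sentences.",
    "stream": False,      # return a single JSON response instead of a token stream
}

resp = requests.post(OLLAMA_URL, json=payload, timeout=120)
resp.raise_for_status()
print(resp.json()["response"])
```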
Coding Models
- Qwen 2.5 Coder
Specialized variant of Qwen 2.5 optimized for coding and programming tasks.
Vision & Multimodal Models
- Qwen 2.5 VL
Vision-language model (text + image) from Alibaba, served via vLLM.
- Flux.1-Kontext-dev
Multimodal image-text-to-image model for contextual generative tasks.
- Flux.1-Krea-dev
Multimodal model optimized for creative workflows.
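Because Qwen 2.5 VL is served via vLLM, its OpenAI-compatible chat API also accepts image inputs alongside text. A minimal sketch, with a placeholder endpoint, API key, model name, and image URL:

```python
import requests

# Placeholder endpoint and key -- replace with your deployment's values.
ENDPOINT = "http://my-deployment.example.com:8000/v1/chat/completions"
API_KEY = "YOUR_API_KEY"

payload = {
    "model": "Qwen/Qwen2.5-VL-7B-Instruct",  # check the model name your vLLM server registers
    "messages": [{
        "role": "user",
        "content": [
            # Image and text parts follow the OpenAI-style multimodal message format.
            {"type": "image_url", "image_url": {"url": "https://example.com/diagram.png"}},
            {"type": "text", "text": "Describe what this image shows."},
        ],
    }],
    "max_tokens": 256,
}

resp = requests.post(
    ENDPOINT,
    headers={"Authorization": f"Bearer {API_KEY}"},
    json=payload,
    timeout=120,
)
resp.raise_for_status()
print(resp.json()["choices"][0]["message"]["content"])
```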
Text-to-Image Models
- Flux.1-Schnell
High-speed text-to-image model optimized for fast deployment.
- Flux.1-Dev
Flux Dev model available for text-to-image generation.
- CogView4-6B
CogView4 generative model for high-quality text-to-image inference.
Usage Notes
- All models are served through Aquanode’s Inference APIs and are accessible with your API key.
- Models can be deployed via the Inference gallery, vLLM, or ComfyUI.
- Select GPU resources based on model size (e.g., larger models like DeepSeek 70B require H100/B200).
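For vLLM-backed deployments you can also use the official openai Python client instead of raw HTTP. The base URL, API key, and model name below are placeholders for your own deployment's values:

```python
from openai import OpenAI

# Placeholder base URL and key -- point the client at your deployment's OpenAI-compatible endpoint.
client = OpenAI(
    base_url="http://my-deployment.example.com:8000/v1",
    api_key="YOUR_API_KEY",
)

completion = client.chat.completions.create(
    model="openai/gpt-oss-20b",  # use the model name your deployment serves
    messages=[{"role": "user", "content": "List three use cases for open-source LLMs."}],
    max_tokens=128,
)
print(completion.choices[0].message.content)
```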
🎉 With Aquanode’s OSS catalog, you can quickly deploy LLMs, multimodal models, and generative pipelines without worrying about infrastructure.