Created by Khanfar Systems - Making AI Accessible
This repository provides a simplified, step-by-step guide to install and run DeepSeek-V3 on Windows. We've streamlined the process to make it as beginner-friendly as possible.
Minimum Requirements:
- Windows 10 or Windows 11
- Python 3.9 or higher
- NVIDIA GPU with 16GB VRAM (for 7B model)
- 32GB RAM
- 50GB free disk space

Recommended Setup:
- Windows 11
- Python 3.9
- NVIDIA GPU with 24GB VRAM
- 64GB RAM
- 100GB SSD space
Installation Steps:
Install Python 3.9:
- Download Python 3.9 from Python.org
- Select "Windows installer (64-bit)"
- During installation:
  - ✅ Check "Add Python 3.9 to PATH"
  - ✅ Click "Install Now"
- Verify installation by opening Command Prompt:
python --version
Should show:
Python 3.9.x
Install Git:
- Download from git-scm.com
- Use default installation options
- Verify installation:
git --version
Install CUDA Toolkit:
- Download CUDA 11.8 from NVIDIA CUDA Toolkit Archive
- Choose:
  - Windows
  - x86_64
  - 11 (your Windows version)
  - exe (local)
- Run installer with default options
- Verify installation:
nvcc --version
Clone the Repository:
git clone https://github.com/khanfar/DeepSeek-Windows.git
cd DeepSeek-Windows
Create Virtual Environment:
python -m venv venv
Activate Virtual Environment:
venv\Scripts\activate
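The command above is for Command Prompt. If you use PowerShell instead, run:
venv\Scripts\Activate.ps1
If PowerShell refuses to run the script, allow local scripts once with:
Set-ExecutionPolicy -ExecutionPolicy RemoteSigned -Scope CurrentUser
You should see (venv) at the start of the prompt once the environment is active.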
Install PyTorch:
pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu118
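To confirm that PyTorch was built for CUDA 11.8 and can see your GPU, run:
python -c "import torch; print(torch.__version__, torch.version.cuda, torch.cuda.is_available())"
The last value should be True; if it is False, see "CUDA not available" under Troubleshooting below.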
Install Other Requirements:
pip install -r requirements.txt
Run the Download Script:
python download_model.py
This will download the 7B parameter model (approximately 14GB)
Convert Model Format (if needed):
python fp8_cast_bf16.py --input-fp8-hf-path model_weights --output-bf16-hf-path model_weights_bf16
Run the Server:
python windows_server.py --model model_weights_bf16 --trust-remote-code
The server will start at: http://127.0.0.1:30000
Test the Model:
python test_client.py
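If you would rather test the server by hand instead of running test_client.py, a request in the style below usually works against an OpenAI-compatible chat endpoint. Whether windows_server.py actually serves /v1/chat/completions, and which model name it expects, are assumptions here, so check test_client.py for the exact request format.
import requests

# Assumption: windows_server.py exposes an OpenAI-compatible chat endpoint.
# Check test_client.py for the exact URL and payload the server expects.
response = requests.post(
    "http://127.0.0.1:30000/v1/chat/completions",
    json={
        "model": "deepseek",  # placeholder model name
        "messages": [{"role": "user", "content": "Hello! Who are you?"}],
        "max_tokens": 100,
    },
    timeout=300,
)
response.raise_for_status()
print(response.json()["choices"][0]["message"]["content"])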
Project Structure:
DeepSeek-Windows/
├── README.md # This file
├── requirements.txt # Python dependencies
├── download_model.py # Script to download model
├── fp8_cast_bf16.py # Model conversion script
├── kernel.py # CUDA kernels
├── windows_server.py # Local server
└── test_client.py # Test script
Troubleshooting:
"CUDA not available" Error:
- Ensure NVIDIA drivers are up to date
- Verify CUDA installation:
nvidia-smi
This should show your GPU and CUDA version
"Out of Memory" Error:
- Close other applications
- Reduce model parameters in windows_server.py:
max_tokens=100  # Reduce this value
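To see how much VRAM is actually in use while the model loads or generates, keep a second Command Prompt open with:
nvidia-smi -l 1
This refreshes the GPU memory readout every second.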
Import Errors:
- Ensure virtual environment is activated
- Reinstall dependencies:
pip install -r requirements.txt
Download Issues:
- Check internet connection
- Try using a VPN
- Manual download option available on HuggingFace
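As a rough illustration of the manual route, the weights can also be pulled with the huggingface_hub package (pip install huggingface_hub). The repository ID below is a placeholder, not a real name; use the one referenced in download_model.py:
from huggingface_hub import snapshot_download

# Placeholder repository ID; replace it with the repo used by download_model.py.
REPO_ID = "deepseek-ai/your-model-repo"
snapshot_download(repo_id=REPO_ID, local_dir="model_weights")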
Performance Tips:
For Better Speed:
- Use SSD for model storage
- Close background applications
- Update NVIDIA drivers
For Lower Memory Usage:
- Enable 4-bit quantization (see the sketch below)
- Reduce context length
- Limit batch size
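For reference, if windows_server.py loads the model through the Hugging Face transformers library (an assumption, check the script), 4-bit loading looks roughly like this and needs the bitsandbytes and accelerate packages:
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

# Assumption: the server loads weights via transformers; adapt this to windows_server.py.
# Requires: pip install bitsandbytes accelerate
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)
model = AutoModelForCausalLM.from_pretrained(
    "model_weights_bf16",
    quantization_config=bnb_config,
    device_map="auto",
    trust_remote_code=True,
)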
Support:
- GitHub Issues: Report problems or suggest improvements
- Version Updates: Check releases page for latest versions
- Community: Join our Discord for support
Credits:
- Original DeepSeek-V3 by DeepSeek-AI
- Windows adaptation by Khanfar Systems
- Community contributors
License:
- Code: MIT License
- Model: DeepSeek License
- Documentation: CC-BY-SA 4.0