Qdrant Vector Database

Skill for Qdrant Vector Database — auto-generated from documentation

infrastructure

by skynet, v1.0.0

qdrant, infrastructure, auto-generated

Compatible Agents

claude-code, codex, gemini

Instruction

---
name: Qdrant Vector Database
description: Use this skill when you need to set up, manage, and operate a Qdrant vector database for similarity search, embeddings storage, and vector operations. Essential for AI applications requiring semantic search, recommendation systems, and vector-based machine learning workflows.
metadata:
  author: skynet
  version: 1.0.0
  category: infrastructure
---

# Qdrant Vector Database

## Installation & Setup

### Docker Installation

```bash
# Pull and run Qdrant (6333 = REST API, 6334 = gRPC)
docker pull qdrant/qdrant
docker run -p 6333:6333 -p 6334:6334 \
    -v $(pwd)/qdrant_storage:/qdrant/storage:z \
    qdrant/qdrant
```

### Local Installation

```bash
# Download binary (Linux)
wget https://github.com/qdrant/qdrant/releases/latest/download/qdrant-x86_64-unknown-linux-gnu.tar.gz
tar -xzf qdrant-x86_64-unknown-linux-gnu.tar.gz
./qdrant

# Or via package manager
cargo install qdrant
```

### Python Client Setup

```bash
pip install qdrant-client
```

## Collection Management

### Create Collection

```python
from qdrant_client import QdrantClient
from qdrant_client.models import Distance, VectorParams

client = QdrantClient("localhost", port=6333)

# Create collection with vector configuration
client.create_collection(
    collection_name="my_collection",
    vectors_config=VectorParams(size=100, distance=Distance.COSINE)
)

# Create collection with multiple named vectors
client.create_collection(
    collection_name="multi_vector",
    vectors_config={
        "text": VectorParams(size=384, distance=Distance.COSINE),
        "image": VectorParams(size=512, distance=Distance.EUCLID)
    }
)
```

### Collection Operations

```python
from qdrant_client import models

# List collections
collections = client.get_collections()

# Get collection info
info = client.get_collection("my_collection")

# Update collection parameters (the keyword is optimizers_config)
client.update_collection(
    collection_name="my_collection",
    optimizers_config=models.OptimizersConfigDiff(
        indexing_threshold=10000
    )
)

# Delete collection (shown last: the calls above need the collection to exist)
client.delete_collection("my_collection")
```

## Vector Operations

### Insert Vectors

```python
import random
import time

from qdrant_client.models import PointStruct

# Single vector insert
client.upsert(
    collection_name="my_collection",
    points=[
        PointStruct(
            id=1,
            vector=[0.1, 0.2, 0.3, ...],  # 100-dimensional vector
            payload={"title": "Document 1", "category": "tech"}
        )
    ]
)

# Batch insert
points = []
for i in range(1000):
    points.append(PointStruct(
        id=i,
        vector=[random.random() for _ in range(100)],
        payload={"doc_id": i, "timestamp": time.time()}
    ))

client.upsert(
    collection_name="my_collection",
    points=points,
    wait=True  # Wait for operation to complete
)
```

### Search Vectors

```python
# Note: recent qdrant-client releases deprecate search() in favor of query_points().

# Basic similarity search
search_result = client.search(
    collection_name="my_collection",
    query_vector=[0.1, 0.2, 0.3, ...],
    limit=10
)

# Search with filters
from qdrant_client.models import Filter, FieldCondition, MatchValue

search_result = client.search(
    collection_name="my_collection",
    query_vector=[0.1, 0.2, 0.3, ...],
    query_filter=Filter(
        must=[
            FieldCondition(
                key="category",
                match=MatchValue(value="tech")
            )
        ]
    ),
    limit=10,
    with_payload=True,
    with_vectors=True
)
```

## Advanced Filtering

### Complex Filter Conditions

```python
from qdrant_client.models import Filter, FieldCondition, MatchValue, Range

# Range and multiple conditions
complex_filter = Filter(
    must=[
        FieldCondition(key="price", range=Range(gte=10.0, lt=100.0)),
        FieldCondition(key="category", match=MatchValue(value="electronics"))
    ],
    must_not=[
        FieldCondition(key="status", match=MatchValue(value="discontinued"))
    ]
)

# Search with complex filter
# (query_vector is a list of floats matching the collection's vector size)
results = client.search(
    collection_name="products",
    query_vector=query_vector,
    query_filter=complex_filter,
    limit=20
)
```

### Geo-filtering

```python
from qdrant_client.models import Filter, FieldCondition, GeoRadius, GeoPoint

geo_filter = Filter(
    must=[
        FieldCondition(
            key="location",
            geo_radius=GeoRadius(
                center=GeoPoint(lon=13.4050, lat=52.5200),  # Berlin
                radius=1000.0  # 1km radius (value is in meters)
            )
        )
    ]
)
```

## Performance Optimization

### Indexing Configuration

```python
from qdrant_client.models import TextIndexParams, PayloadSchemaType

# Create payload index
client.create_payload_index(
    collection_name="my_collection",
    field_name="category",
    field_schema=PayloadSchemaType.KEYWORD
)

# Create text index for full-text search
client.create_payload_index(
    collection_name="my_collection",
    field_name="description",
    field_schema=TextIndexParams(
        type="text",
        tokenizer="word",
        min_token_len=2,
        max_token_len=20
    )
)
```

### HNSW Parameters

```python
from qdrant_client.models import HnswConfigDiff

# Update HNSW configuration
client.update_collection(
    collection_name="my_collection",
    hnsw_config=HnswConfigDiff(
        m=16,                # Number of bi-directional links per node
        ef_construct=100,    # Size of dynamic candidate list during index build
        full_scan_threshold=10000
    )
)
```

## Decision Tree: Collection Setup Strategy

```
Collection Setup Decision Tree:
│
├── Vector Size < 100 dimensions?
│   ├── Yes: Use Distance.COSINE, m=16, ef_construct=100
│   └── No: Vector Size > 1000?
│       ├── Yes: Use Distance.DOT, m=32, ef_construct=200
│       └── No: Use Distance.EUCLID, m=24, ef_construct=150
│
├── Expected Collection Size?
│   ├── < 10K vectors: indexing_threshold=1000
│   ├── 10K-1M vectors: indexing_threshold=10000
│   └── > 1M vectors: indexing_threshold=50000
│
└── Query Pattern?
    ├── Frequent filtering: Create payload indexes
    ├── Geo queries: Use geo fields + indexes
    └── Text search: Create text indexes
```

## Backup and Recovery

### Create Snapshots

```bash
# Create collection snapshot
curl -X POST "http://localhost:6333/collections/my_collection/snapshots"

# Create full storage snapshot
curl -X POST "http://localhost:6333/snapshots"
```

```python
# Python client snapshot
snapshot_info = client.create_snapshot(collection_name="my_collection")
print(f"Snapshot created: {snapshot_info.name}")

# Download snapshot
client.download_snapshot(
    collection_name="my_collection",
    snapshot_name=snapshot_info.name,
    output_path="./backup.snapshot"
)
```

### Restore from Snapshot

```bash
# Restore collection from snapshot
# (the upload endpoint takes a POST with the file as a multipart form field)
curl -X POST "http://localhost:6333/collections/my_collection/snapshots/upload" \
    -F "snapshot=@backup.snapshot"
```

## Monitoring and Health Checks

### Health Endpoints

```bash
# Check service health
curl http://localhost:6333/healthz

# Get cluster info
curl http://localhost:6333/cluster

# Check collection info
curl http://localhost:6333/collections/my_collection
```

### Performance Metrics

```python
# Get collection cluster info
cluster_info = client.get_cluster_info()
print(f"Peer count: {len(cluster_info.peers)}")

# Collection statistics
# (points_count is used here; vectors_count is deprecated in recent releases)
collection_info = client.get_collection("my_collection")
print(f"Points count: {collection_info.points_count}")
print(f"Indexed vectors: {collection_info.indexed_vectors_count}")
```

## Troubleshooting

### Common Errors and Fixes

**Error: "Collection already exists"**

```python
# Check if collection exists before creating
try:
    client.get_collection("my_collection")
    print("Collection exists")
except Exception:
    client.create_collection(...)
```

**Error: "Vector dimension mismatch"**

```python
# Verify vector dimensions match collection config
# (applies to collections with a single unnamed vector)
collection_info = client.get_collection("my_collection")
expected_size = collection_info.config.params.vectors.size
assert len(vector) == expected_size, f"Expected {expected_size} dimensions"
```

**Error: "Service unavailable"**

```bash
# Check Qdrant service status
docker ps | grep qdrant

# Restart if needed
docker restart qdrant_container

# Check logs
docker logs qdrant_container
```

**Performance Issues:**

```python
from qdrant_client import models

# Check if indexing is complete
collection_info = client.get_collection("my_collection")
total = collection_info.points_count or 0
indexed = collection_info.indexed_vectors_count or 0
# Guard against dividing by zero on an empty collection
if total > 0 and indexed / total < 0.9:
    print("Indexing in progress, performance may be affected")

# Optimize collection
client.update_collection(
    collection_name="my_collection",
    optimizers_config=models.OptimizersConfigDiff(
        deleted_threshold=0.2,
        vacuum_min_vector_number=1000
    )
)
```

**Memory Issues:**

```bash
# Increase memory limits in Docker
docker run -p 6333:6333 -m 4g qdrant/qdrant

# Or adjust HNSW parameters to reduce memory usage
```

### Debug Commands

```bash
# Check storage usage
du -sh ./qdrant_storage/

# Monitor Qdrant logs
tail -f ./qdrant_storage/logs/qdrant.log

# Check open files (if hitting limits)
lsof -p $(pgrep qdrant) | wc -l
```
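### Pre-flight Checks for Batch Inserts

The batch insert under "Insert Vectors" and the "Vector dimension mismatch" fix can be combined into a check that runs before any data is sent. The following is a minimal sketch in plain Python, not part of the Qdrant client: `validate_vectors` and `chunked` are hypothetical helper names introduced here for illustration, and the batch size used in the usage note is an assumption rather than a Qdrant recommendation.

```python
from typing import Iterator, List, Sequence


def validate_vectors(vectors: Sequence[Sequence[float]], expected_size: int) -> None:
    """Raise ValueError if any vector's length differs from expected_size."""
    for index, vector in enumerate(vectors):
        if len(vector) != expected_size:
            raise ValueError(
                f"Vector {index} has {len(vector)} dimensions, expected {expected_size}"
            )


def chunked(items: Sequence, batch_size: int) -> Iterator[List]:
    """Yield consecutive slices of at most batch_size items, preserving order."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    for start in range(0, len(items), batch_size):
        yield list(items[start:start + batch_size])
```

One possible use: read `expected_size` from the collection config as in the dimension-mismatch fix, call `validate_vectors` on the raw vectors, then loop over `chunked(points, 256)` and pass each batch to `client.upsert(...)` so that a single oversized request is avoided.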

Install

curl -s https://skills.skynet.ceo/api/skills/qdrant/skill.md