High-performance similarity search powered by FAISS
The Vector Engine in ShibuDB provides similarity search built on FAISS (Facebook AI Similarity Search). It stores and retrieves high-dimensional vectors efficiently for applications such as recommendation systems, image search, and natural language processing.
┌─────────────────────────────────────┐
│ Vector Space │
├─────────────────────────────────────┤
│ FAISS Index (similarity search) │
├─────────────────────────────────────┤
│ In-Memory Buffer (batch ops) │
├─────────────────────────────────────┤
│ Write-Ahead Log (durability) │
├─────────────────────────────────────┤
│ Data Files (persistent storage) │
└─────────────────────────────────────┘
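The layering in the diagram can be read as a write path: log the write, stage it in memory, then move it into the index in batches. The following is a minimal sketch of that idea only. The class `TinyVectorSpace`, the JSON-lines log format, and the `flush_every` threshold are all hypothetical and are not taken from ShibuDB; a plain dictionary stands in for the FAISS index.

```python
import json
import os
import tempfile


class TinyVectorSpace:
    """Toy model of a layered write path; not ShibuDB's implementation."""

    def __init__(self, wal_path, flush_every=2):
        self.wal_path = wal_path        # write-ahead log file (hypothetical format)
        self.flush_every = flush_every  # buffer size that triggers a flush
        self.buffer = {}                # in-memory buffer of pending vectors
        self.index = {}                 # stand-in for the FAISS index

    def insert(self, vec_id, vector):
        # 1. Append to the log before touching in-memory state
        #    (a real log would also fsync the file).
        with open(self.wal_path, "a", encoding="utf-8") as wal:
            wal.write(json.dumps({"id": vec_id, "vector": vector}) + "\n")
        # 2. Stage the vector in the buffer.
        self.buffer[vec_id] = vector
        # 3. Move buffered vectors into the index in batches.
        if len(self.buffer) >= self.flush_every:
            self.flush()

    def flush(self):
        self.index.update(self.buffer)
        self.buffer.clear()

    @classmethod
    def recover(cls, wal_path, flush_every=2):
        """Rebuild state by replaying the log from the beginning."""
        space = cls(wal_path, flush_every)
        with open(wal_path, encoding="utf-8") as wal:
            for line in wal:
                record = json.loads(line)
                space.buffer[record["id"]] = record["vector"]
        space.flush()
        return space


wal_file = os.path.join(tempfile.mkdtemp(), "space.wal")
space = TinyVectorSpace(wal_file)
space.insert(1, [1.0, 2.0])
space.insert(2, [1.1, 2.1])   # second insert triggers a flush
space.insert(3, [9.0, 8.0])   # stays in the buffer
recovered = TinyVectorSpace.recover(wal_file)
print(sorted(recovered.index))  # [1, 2, 3]
```

Because every insert reaches the log before the buffer, replaying the log restores vector 3 even though it had not yet been flushed to the index.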
Vector data is organized in spaces with specific index types and distance metrics.
# Create a basic vector space (128 dimensions, Flat index, L2 metric)
CREATE-SPACE embeddings --engine vector --dimension 128
# Create with specific index type
CREATE-SPACE image_vectors --engine vector --dimension 512 --index-type HNSW32 --metric L2
# Create with custom parameters
CREATE-SPACE text_embeddings --engine vector --dimension 768 --index-type IVF32 --metric InnerProduct
Parameters:
- `--engine vector`: Specifies vector engine type
- `--dimension N`: Vector dimension (required for vector spaces)
- `--index-type TYPE`: FAISS index type (default: Flat)
- `--metric METRIC`: Distance metric (default: L2)

| Index Type | Description | Use Case | Memory | Speed |
|------------|-------------|----------|--------|-------|
| Flat | Exact search | Small datasets, high accuracy | High | Slow |
| HNSW32 | Approximate search | Fast similarity search | Medium | Fast |
| IVF32 | Inverted file index | Large datasets | Low | Medium |
| PQ4 | Product quantization | Very large datasets | Very Low | Fast |
# Switch to vector space
USE embeddings
# Verify current space (prompt will show current space)
[embeddings]>
Different index types for various use cases and performance requirements.
Best for: Small datasets (< 1M vectors), high accuracy requirements
# Create flat index
CREATE-SPACE exact_search --engine vector --dimension 128 --index-type Flat --metric L2
USE exact_search
# Insert vectors
INSERT-VECTOR 1 1.0,2.0,3.0,4.0,5.0,6.0,7.0,8.0
INSERT-VECTOR 2 1.1,2.1,3.1,4.1,5.1,6.1,7.1,8.1
INSERT-VECTOR 3 9.0,8.0,7.0,6.0,5.0,4.0,3.0,2.0
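To make "exact search" concrete, here is a minimal pure-Python sketch of what a flat index does conceptually: compare the query against every stored vector and keep the closest ones. The function names `l2_distance` and `flat_search` are illustrative and not part of ShibuDB or FAISS; the sketch uses plain Euclidean distance on the three vectors inserted above.

```python
import heapq
import math


def l2_distance(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def flat_search(vectors, query, k):
    """Return the k (id, distance) pairs closest to query, nearest first."""
    scored = ((l2_distance(vec, query), vec_id) for vec_id, vec in vectors.items())
    return [(vec_id, dist) for dist, vec_id in heapq.nsmallest(k, scored)]


vectors = {
    1: [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0],
    2: [1.1, 2.1, 3.1, 4.1, 5.1, 6.1, 7.1, 8.1],
    3: [9.0, 8.0, 7.0, 6.0, 5.0, 4.0, 3.0, 2.0],
}
results = flat_search(vectors, [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0], k=2)
print([vec_id for vec_id, _ in results])  # [1, 2]
```

Every query touches every vector, which is why this approach is exact but slows down as the dataset grows.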
Best for: Fast similarity search with good accuracy
# Create HNSW index
CREATE-SPACE fast_search --engine vector --dimension 128 --index-type HNSW32 --metric L2
USE fast_search
# Insert vectors
INSERT-VECTOR 1 1.0,2.0,3.0,4.0,5.0,6.0,7.0,8.0
INSERT-VECTOR 2 1.1,2.1,3.1,4.1,5.1,6.1,7.1,8.1
# Search with HNSW
SEARCH-TOPK 1.0,2.0,3.0,4.0,5.0,6.0,7.0,8.0 5
Best for: Large datasets with balanced performance
# Create IVF index
CREATE-SPACE large_dataset --engine vector --dimension 128 --index-type IVF32 --metric L2
USE large_dataset
# Insert many vectors
INSERT-VECTOR 1 1.0,2.0,3.0,4.0,5.0,6.0,7.0,8.0
INSERT-VECTOR 2 1.1,2.1,3.1,4.1,5.1,6.1,7.1,8.1
# ... insert more vectors
# Search with IVF
SEARCH-TOPK 1.0,2.0,3.0,4.0,5.0,6.0,7.0,8.0 10
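The idea behind an inverted file index can be sketched in a few lines: vectors are grouped by their nearest centroid, and a query scans only the groups whose centroids are closest to it. The `ToyIVF` class below is an illustration under simplifying assumptions (hand-picked centroids instead of trained ones, 2-dimensional vectors); it is not ShibuDB's or FAISS's implementation. Its `nprobe` argument plays the same role as the number of lists probed in FAISS IVF indexes.

```python
import math


def l2(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


class ToyIVF:
    """Toy inverted-file index: fixed centroids, one list of ids per centroid."""

    def __init__(self, centroids):
        self.centroids = centroids
        self.lists = [[] for _ in centroids]
        self.vectors = {}

    def _ranked_cells(self, vector):
        # Cell numbers ordered from nearest centroid to farthest.
        return sorted(range(len(self.centroids)),
                      key=lambda c: l2(self.centroids[c], vector))

    def insert(self, vec_id, vector):
        self.vectors[vec_id] = vector
        self.lists[self._ranked_cells(vector)[0]].append(vec_id)

    def search(self, query, k, nprobe=1):
        # Collect candidates only from the nprobe nearest cells.
        candidates = []
        for cell in self._ranked_cells(query)[:nprobe]:
            candidates.extend(self.lists[cell])
        candidates.sort(key=lambda vec_id: l2(self.vectors[vec_id], query))
        return candidates[:k]


index = ToyIVF(centroids=[[0.0, 0.0], [10.0, 10.0]])
index.insert(1, [1.0, 1.0])
index.insert(2, [4.9, 4.9])
index.insert(3, [5.2, 5.2])
index.insert(4, [9.0, 9.0])

query = [5.1, 5.1]
print(index.search(query, k=2, nprobe=1))  # [3, 4]
print(index.search(query, k=2, nprobe=2))  # [3, 2]
```

With `nprobe=1` the search misses vector 2, which sits just across the cell boundary; probing both cells recovers it at the cost of scanning more candidates.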
Core operations for managing vector data.
# Insert single vector
INSERT-VECTOR 1 1.0,2.0,3.0,4.0,5.0,6.0,7.0,8.0
# Insert multiple vectors
INSERT-VECTOR 2 1.1,2.1,3.1,4.1,5.1,6.1,7.1,8.1
INSERT-VECTOR 3 0.5,1.5,2.5,3.5,4.5,5.5,6.5,7.5
# Insert a longer vector (the number of values is expected to match the space's --dimension)
INSERT-VECTOR 4 1.0,2.0,3.0,4.0,5.0,6.0,7.0,8.0,9.0,10.0
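Client code that generates these commands may want to validate them before sending. The sketch below parses an `INSERT-VECTOR` line and checks the vector length against an expected dimension. The helper `parse_insert` is hypothetical, and the length check reflects the fixed dimensionality of FAISS indexes; how ShibuDB itself reports a mismatch is not specified here.

```python
def parse_insert(line, dimension):
    """Parse 'INSERT-VECTOR <id> <v1,v2,...>' and check the vector length."""
    parts = line.split()
    if len(parts) != 3 or parts[0] != "INSERT-VECTOR":
        raise ValueError("expected: INSERT-VECTOR <id> <comma-separated floats>")
    vec_id = parts[1]
    # float() raises ValueError on empty or non-numeric tokens.
    vector = [float(token) for token in parts[2].split(",")]
    if len(vector) != dimension:
        raise ValueError(f"expected {dimension} values, got {len(vector)}")
    return vec_id, vector


vec_id, vector = parse_insert(
    "INSERT-VECTOR 1 1.0,2.0,3.0,4.0,5.0,6.0,7.0,8.0", dimension=8
)
print(vec_id, len(vector))  # 1 8
```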
# Get vector by ID
GET-VECTOR 1
# Get multiple vectors
GET-VECTOR 1 2 3
# Check if vector exists
EXISTS-VECTOR 1
# Delete single vector
DELETE-VECTOR 1
# Delete multiple vectors
DELETE-VECTOR 2 3 4
# Count vectors in space
COUNT-VECTORS
# List all vector IDs
LIST-VECTORS
# Get space information
INFO-SPACE
Advanced search capabilities for finding similar vectors.
# Search for top 5 similar vectors
SEARCH-TOPK 1.0,2.0,3.0,4.0,5.0,6.0,7.0,8.0 5
# Search with different query vector
SEARCH-TOPK 2.0,3.0,4.0,5.0,6.0,7.0,8.0,9.0 10
# Search with an explicit --nprobe (in FAISS, the number of IVF clusters probed per query)
SEARCH-TOPK 1.0,2.0,3.0,4.0,5.0,6.0,7.0,8.0 5 --nprobe 10
# Find vectors within distance threshold
RANGE-SEARCH 1.0,2.0,3.0,4.0,5.0,6.0,7.0,8.0 0.5
# Range search with a larger radius and an explicit --nprobe
RANGE-SEARCH 1.0,2.0,3.0,4.0,5.0,6.0,7.0,8.0 1.0 --nprobe 20
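Unlike a top-k search, a range search returns every vector within a distance threshold, so the result size varies with the data. A minimal sketch, assuming the threshold is a plain Euclidean distance (FAISS itself reports squared distances for its L2 metric, and which convention `RANGE-SEARCH` uses is not stated here); the function `range_search` is illustrative only.

```python
import math


def range_search(vectors, query, radius):
    """Return (id, distance) pairs with L2 distance <= radius, nearest first."""
    hits = []
    for vec_id, vec in vectors.items():
        dist = math.sqrt(sum((x - y) ** 2 for x, y in zip(vec, query)))
        if dist <= radius:
            hits.append((vec_id, dist))
    return sorted(hits, key=lambda hit: hit[1])


vectors = {
    1: [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0],
    2: [1.1, 2.1, 3.1, 4.1, 5.1, 6.1, 7.1, 8.1],
    3: [9.0, 8.0, 7.0, 6.0, 5.0, 4.0, 3.0, 2.0],
}
hits = range_search(vectors, [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0], radius=0.5)
print([vec_id for vec_id, _ in hits])  # [1, 2]
```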
Different distance metrics for measuring vector similarity.
# For general similarity (L2)
CREATE-SPACE general --engine vector --dimension 128 --metric L2
# For normalized embeddings (InnerProduct)
CREATE-SPACE embeddings --engine vector --dimension 768 --metric InnerProduct
# For sparse vectors (L1)
CREATE-SPACE sparse --engine vector --dimension 256 --metric L1
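For reference, the three metrics can be written out directly. This is a plain-Python sketch of the standard definitions, not ShibuDB code; the helper names are illustrative. It also shows why InnerProduct suits normalized embeddings: for unit-length vectors the inner product equals cosine similarity.

```python
import math


def l2(a, b):
    """Euclidean (L2) distance: smaller means more similar."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def l1(a, b):
    """Manhattan (L1) distance: smaller means more similar."""
    return sum(abs(x - y) for x, y in zip(a, b))


def inner_product(a, b):
    """Inner product: larger means more similar."""
    return sum(x * y for x, y in zip(a, b))


def normalize(v):
    """Scale a non-zero vector to unit length."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


a = [3.0, 4.0]
b = [0.0, 0.0]
print(l2(a, b))  # 5.0
print(l1(a, b))  # 7.0
# Vectors pointing in the same direction have cosine similarity 1.
print(round(inner_product(normalize(a), normalize([6.0, 8.0])), 6))  # 1.0
```

Note the direction of the comparison: L2 and L1 rank the smallest values first, while InnerProduct ranks the largest values first.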
Tips for optimizing vector search performance.
Different index types have different memory requirements; see the Memory column of the index type table above.
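A rough back-of-the-envelope estimate helps when choosing between index types. The sketch below counts only the stored vector data and rests on two assumptions: uncompressed vectors use 4-byte floats, as in FAISS flat storage, and `PQ4` means 4 one-byte codes per vector, following the FAISS factory-string convention. Codebooks, ID maps, and graph links are ignored, so actual usage in ShibuDB will be higher.

```python
def flat_bytes(num_vectors, dimension, bytes_per_value=4):
    """Raw storage for uncompressed float32 vectors."""
    return num_vectors * dimension * bytes_per_value


def pq_bytes(num_vectors, code_bytes):
    """Storage for product-quantized codes of a fixed size per vector."""
    return num_vectors * code_bytes


n = 1_000_000
print(flat_bytes(n, 128) / 1e6)  # 512.0 (MB)
print(pq_bytes(n, 4) / 1e6)      # 4.0 (MB)
```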
Recommended practices for using the vector engine effectively.
Common use cases and practical examples.
# Create image vectors space
CREATE-SPACE image_search --engine vector --dimension 512 --index-type HNSW32 --metric L2
USE image_search
# Insert image embeddings
INSERT-VECTOR img_001 0.1,0.2,0.3,0.4,0.5,0.6,0.7,0.8,0.9,1.0
INSERT-VECTOR img_002 0.2,0.3,0.4,0.5,0.6,0.7,0.8,0.9,1.0,0.1
INSERT-VECTOR img_003 0.9,0.8,0.7,0.6,0.5,0.4,0.3,0.2,0.1,0.0
# Search for similar images
SEARCH-TOPK 0.1,0.2,0.3,0.4,0.5,0.6,0.7,0.8,0.9,1.0 5
# Create text embeddings space
CREATE-SPACE text_search --engine vector --dimension 768 --index-type IVF32 --metric InnerProduct
USE text_search
# Insert text embeddings
INSERT-VECTOR doc_001 0.1,0.2,0.3,0.4,0.5,0.6,0.7,0.8
INSERT-VECTOR doc_002 0.2,0.3,0.4,0.5,0.6,0.7,0.8,0.9
INSERT-VECTOR doc_003 0.8,0.7,0.6,0.5,0.4,0.3,0.2,0.1
# Search for similar documents
SEARCH-TOPK 0.1,0.2,0.3,0.4,0.5,0.6,0.7,0.8 10
# Create user preferences space
CREATE-SPACE recommendations --engine vector --dimension 128 --index-type HNSW32 --metric L2
USE recommendations
# Insert user preference vectors
INSERT-VECTOR user_001 0.1,0.2,0.3,0.4,0.5,0.6,0.7,0.8
INSERT-VECTOR user_002 0.2,0.3,0.4,0.5,0.6,0.7,0.8,0.9
INSERT-VECTOR user_003 0.8,0.7,0.6,0.5,0.4,0.3,0.2,0.1
# Find similar users for recommendations
SEARCH-TOPK 0.1,0.2,0.3,0.4,0.5,0.6,0.7,0.8 5
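In practice the embeddings come from application code rather than being typed by hand. The helpers below are a small sketch of turning in-memory vectors into the command lines used throughout these examples. The function names are hypothetical, and the number formatting (Python's shortest float representation) is an assumption about what the command parser accepts.

```python
def format_vector(vector):
    """Join vector components with commas, as in the examples above."""
    return ",".join(repr(float(x)) for x in vector)


def insert_command(vec_id, vector):
    return f"INSERT-VECTOR {vec_id} {format_vector(vector)}"


def search_command(query, k):
    return f"SEARCH-TOPK {format_vector(query)} {k}"


print(insert_command("user_001", [0.1, 0.2, 0.3]))  # INSERT-VECTOR user_001 0.1,0.2,0.3
print(search_command([0.1, 0.2, 0.3], 5))           # SEARCH-TOPK 0.1,0.2,0.3 5
```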