Experimental

Vector Sinks

Vector database connectors for embedding search (LanceDB, Pinecone, Weaviate, Qdrant, pgvector).

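Krishiv ships five vector-store sinks (LanceDB, Pinecone, Weaviate, Qdrant, pgvector) plus an in-memory sink for tests and prototypes. All of them share a common interface, so you can swap targets without changing pipeline code.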

Common interface

| Method | Purpose |
| --- | --- |
| upsert_batch(batch) | Write a batch of points (id, vector, payload). |
| query_nearest(vector, k) -> Vec<ScoredChunk> | k-NN search. Returns scored chunks with payload. |
| delete_by_ids(ids) | Delete by id. |
| sink_name() -> &'static str | For metrics labels and CLI output. |
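To make the contract concrete, here is a minimal pure-Python sketch of a sink that exposes the four methods above. It is an illustration only, not Krishiv's InMemoryVectorSink: the class name ToyVectorSink, the ScoredChunk fields, the tuple-based batch format, and the choice of cosine similarity as the score are all assumptions made for this example.

```python
from dataclasses import dataclass
from math import sqrt
from typing import Any


@dataclass
class ScoredChunk:
    # Hypothetical result shape: the real ScoredChunk fields are not documented here.
    id: str
    score: float
    payload: dict


def _cosine(a, b) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sqrt(sum(x * x for x in a))
    norm_b = sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class ToyVectorSink:
    """Illustrative stand-in that mirrors the four interface methods."""

    def __init__(self) -> None:
        self._points: dict[str, tuple[list[float], dict[str, Any]]] = {}

    def sink_name(self) -> str:
        return "toy"

    def upsert_batch(self, batch) -> None:
        # Each point is (id, vector, payload); an existing id is overwritten.
        for point_id, vector, payload in batch:
            self._points[point_id] = (list(vector), dict(payload))

    def query_nearest(self, vector, k: int) -> list[ScoredChunk]:
        # Brute-force scan: score every stored point, keep the k best.
        scored = [
            ScoredChunk(pid, _cosine(vector, stored), payload)
            for pid, (stored, payload) in self._points.items()
        ]
        scored.sort(key=lambda chunk: chunk.score, reverse=True)
        return scored[:k]

    def delete_by_ids(self, ids) -> None:
        # Unknown ids are ignored rather than treated as errors.
        for pid in ids:
            self._points.pop(pid, None)
```

Pipeline code written against these four methods does not need to know which backend sits behind them, which is the property the common interface is meant to provide.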

Backends

| Backend | Feature | Status |
| --- | --- | --- |
| InMemoryVectorSink | (always) | Preview (for tests and prototypes). |
| LanceDbSink::open(path, table) | vector-sinks | Preview (local, file-backed). |
| PgvectorSink::connect(conn_str, table) | vector-sinks + pgvector | Experimental. |
| QdrantSink::connect(url, collection) | vector-sinks + qdrant | Experimental. |
| PineconeSink::new(api_key, index) | vector-sinks | Preview. |
| WeaviateSink::connect(url, class) | vector-sinks | Preview. |

Data shape

All sinks expect a RecordBatch with at least:

| Column | Type |
| --- | --- |
| id | utf8 or int64 |
| vector | list<float32> (or fixed-size list) |
| payload | struct<…> (backend-specific fields) |

Schema validation is the caller's responsibility. The helper point_id_from_doc_epoch turns a timestamped id into a deterministic u64.
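Because validation is left to the caller, a pre-flight check before writing can catch malformed rows early. The sketch below works on plain Python dicts rather than a RecordBatch so it stays dependency-free. The names validate_rows and stable_point_id, the dim parameter, and the SHA-256 based id derivation are hypothetical; in particular, this is not the algorithm used by point_id_from_doc_epoch, only one way to obtain a deterministic 64-bit id.

```python
import hashlib

REQUIRED_COLUMNS = ("id", "vector", "payload")


def validate_rows(rows, dim: int) -> None:
    """Raise ValueError on the first row that lacks the minimum shape.

    `dim` is the expected vector length (an assumption of this sketch).
    """
    for i, row in enumerate(rows):
        missing = [col for col in REQUIRED_COLUMNS if col not in row]
        if missing:
            raise ValueError(f"row {i}: missing columns {missing}")
        row_id = row["id"]
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(row_id, bool) or not isinstance(row_id, (str, int)):
            raise ValueError(f"row {i}: id must be a string or an integer")
        vector = row["vector"]
        if len(vector) != dim or not all(isinstance(x, float) for x in vector):
            raise ValueError(f"row {i}: vector must hold {dim} floats")
        if not isinstance(row["payload"], dict):
            raise ValueError(f"row {i}: payload must be a mapping")


def stable_point_id(doc_id: str, epoch: int) -> int:
    """Hypothetical deterministic id: first 8 bytes of SHA-256 as an unsigned 64-bit int."""
    digest = hashlib.sha256(f"{doc_id}:{epoch}".encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")
```

The same (doc_id, epoch) pair always maps to the same id, so re-running a pipeline over the same documents overwrites existing points through upsert instead of creating duplicates.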

Python

import krishiv as ks

session = ks.Session.embedded()
sink = ks.LanceDbSink.open("./vectors", "embeddings")
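
# Stream the query result into the sink. The surrounding parentheses
# let the method chain span several lines without a syntax error.
(
    session.sql("SELECT id, vector, payload FROM embeddings")
    .write_stream()
    .format("vector")
    .option("sink", sink)
    .start()
)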

See also