Show HN: HelixDB – Native graph and vector types for RAG and retrieval

24 points by GeorgeCurtis 4 days ago

Hey HN! We're college friends building HelixDB. It's a database that natively supports both graph and vector types. It's designed for AI-driven use cases like RAG, vector search, code indexing, and agent frameworks, where you need both explicit relationships and similarity search.

We came up with the idea for Helix at university, while building a graph database as a side project in Rust. Reading some research papers on RAG setups, I realised there was a lot of infrastructure setup to get started. You need your own server, a graph database, a vector database and then some bespoke middleman software to link them together.

After looking into how vector search works, I discovered you can (kind of) abstract a vector index into a graph. The vectors are just nodes with coordinates, and the edges represent neighbour links! I realised I could link this up to our existing graph infrastructure, allowing traversals in conjunction with similarity searches.
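To make the "vectors are nodes, edges are neighbour links" idea concrete, here is a toy Python sketch. It is not HelixDB's code: `build_knn_graph` and `greedy_search` are hypothetical names, the graph is built by brute force, and the search is a single-layer greedy walk rather than full HNSW (which adds a layer hierarchy and candidate lists so it is less likely to get stuck in a local minimum).

```python
import math

def build_knn_graph(points, k=2):
    """Link every point to its k nearest neighbours (brute force, illustration only)."""
    graph = {}
    for i, p in enumerate(points):
        others = sorted(
            (j for j in range(len(points)) if j != i),
            key=lambda j: math.dist(p, points[j]),
        )
        graph[i] = others[:k]
    return graph

def greedy_search(points, graph, query, start=0):
    """Walk neighbour links, always moving to the neighbour closest to the query."""
    current = start
    while True:
        best = min(graph[current], key=lambda j: math.dist(query, points[j]))
        # Stop once no neighbour is strictly closer than the current node.
        if math.dist(query, points[best]) >= math.dist(query, points[current]):
            return current
        current = best

points = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (3.0, 0.0), (4.0, 0.0)]
graph = build_knn_graph(points, k=2)
nearest = greedy_search(points, graph, query=(3.9, 0.0))
print(nearest, points[nearest])
```

The similarity search here is just a graph traversal over neighbour links, which is why it can share infrastructure with ordinary graph nodes and edges.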

The way Helix works, at a low level, is that you have four main types, each in its own table: graph nodes, graph edges, vector nodes, and vector edges. The vector edges are irrelevant to developers (they just store the neighbour links for the similarity algorithm). The vector nodes, indexed using HNSW, work the same way you'd use vectors in Pinecone or Qdrant. Likewise, the graph nodes work the same as they would in Neo4j or Neptune. The graph edges, however, are where things get interesting: you can define relationships not only between graph nodes but also between graph nodes and vectors, meaning you can have explicit dependencies between vectors and nodes or vice versa.
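As a rough picture of that layout, here is a minimal in-memory sketch in Python. All class and field names (`GraphNode`, `VectorNode`, `GraphEdge`, `VectorEdge`, `Store`, `level`) are assumptions made for illustration and do not reflect HelixDB's actual storage format. The point it shows is that a graph edge may reference an id from either the graph-node table or the vector-node table.

```python
from dataclasses import dataclass, field

@dataclass
class GraphNode:
    id: str
    label: str
    properties: dict = field(default_factory=dict)

@dataclass
class VectorNode:
    id: str
    embedding: list

@dataclass
class GraphEdge:
    label: str
    from_id: str  # may point at a GraphNode or a VectorNode
    to_id: str    # may point at a GraphNode or a VectorNode

@dataclass
class VectorEdge:
    from_id: str
    to_id: str
    level: int  # hypothetical: index layer the neighbour link belongs to

class Store:
    """Four separate tables, mirroring the four types described above."""
    def __init__(self):
        self.graph_nodes = {}
        self.vector_nodes = {}
        self.graph_edges = []
        self.vector_edges = []  # maintained by the index, never by the developer

    def add_edge(self, label, from_id, to_id):
        # An endpoint is valid if it exists in either node table.
        for endpoint in (from_id, to_id):
            if endpoint not in self.graph_nodes and endpoint not in self.vector_nodes:
                raise KeyError(endpoint)
        edge = GraphEdge(label, from_id, to_id)
        self.graph_edges.append(edge)
        return edge

store = Store()
store.graph_nodes["doc1"] = GraphNode("doc1", "Document", {"title": "Intro"})
store.vector_nodes["vec1"] = VectorNode("vec1", [0.1, 0.9])
store.add_edge("EMBEDDING_OF", "vec1", "doc1")  # vector -> graph node
```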

So, you can run a similarity search, then walk the graph to get more structured context. Or the other way around.
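A small Python sketch of the "similarity search first, then walk the graph" pattern follows. The data, edge labels, and the functions `top_k`, `walk`, and `retrieve` are all made up for illustration; this is plain Python over dicts and tuples, not the Helix query language or either SDK.

```python
import math

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

# Toy data: embedded chunks, plus explicit edges as (source, label, target).
vectors = {"v1": [1.0, 0.0], "v2": [0.0, 1.0], "v3": [0.9, 0.1]}
edges = [
    ("v1", "chunk_of", "doc_a"),
    ("v2", "chunk_of", "doc_c"),
    ("v3", "chunk_of", "doc_b"),
    ("doc_a", "cites", "doc_c"),
]

def top_k(query, k):
    """Brute-force similarity search (a real system would use an ANN index)."""
    return sorted(vectors, key=lambda vid: cosine(query, vectors[vid]), reverse=True)[:k]

def walk(start, hops):
    """Breadth-first walk along outgoing edges, up to `hops` steps."""
    seen, frontier, reached = {start}, [start], []
    for _ in range(hops):
        next_frontier = []
        for node in frontier:
            for src, _label, dst in edges:
                if src == node and dst not in seen:
                    seen.add(dst)
                    reached.append(dst)
                    next_frontier.append(dst)
        frontier = next_frontier
    return reached

def retrieve(query, k=2, hops=2):
    """Similarity search, then graph expansion for structured context."""
    return {vid: walk(vid, hops) for vid in top_k(query, k)}

context = retrieve([1.0, 0.0])
print(context)
```

Going the other way around would mean starting from a known graph node, following edges to the vectors attached to it, and running the similarity search from those.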

When you run Helix, it spins up as its own server (like a Docker container). To query it, you hit an auto-generated API endpoint.

Out of ignorance, I used to think databases compiled their queries and ran them like functions, like in normal programming. Turns out most of them don't, and I never understood why that wasn't the norm. So, similar to what SpacetimeDB is doing, we made Helix queries deployable. You write a query, and it gets built directly into the database as its own endpoint. This avoids the overhead of sending entire query strings over the network, cuts latency down a lot, and also prevents query injection, since only parameters are sent.
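The general shape of deployed queries can be sketched in a few lines of Python. This is only an illustration of the concept: `deploy`, `handle_request`, the `get_user_followers` query, and the `FOLLOWS` data are hypothetical and are not HelixDB's API. Queries are registered once under a name, and a request carries just that name plus JSON parameters, so no query text is ever parsed from user input.

```python
import json

DEPLOYED = {}  # query name -> function, filled in at deploy time

def deploy(name):
    """Register a query function under a fixed endpoint name."""
    def register(fn):
        DEPLOYED[name] = fn
        return fn
    return register

# Hypothetical data standing in for the database contents.
FOLLOWS = {"alice": ["carol", "bob"]}

@deploy("get_user_followers")
def get_user_followers(user_id):
    # Parameters are plain values; they are never interpreted as query text.
    return sorted(FOLLOWS.get(user_id, []))

def handle_request(path, body):
    """Dispatch a request to the pre-registered query named in the path."""
    name = path.strip("/")
    if name not in DEPLOYED:
        return 404, json.dumps({"error": "unknown query"})
    params = json.loads(body)
    return 200, json.dumps(DEPLOYED[name](**params))

status, payload = handle_request("/get_user_followers", '{"user_id": "alice"}')
print(status, payload)
```

An input such as `{"user_id": "alice'; DROP"}` is simply looked up as an unusual user id and returns an empty list, because the only thing a client can choose is which registered function runs and with which values.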

Over the past week we've expanded our query language and released vector types and two SDKs (Python & TypeScript), making it easy to insert and query data. HelixDB is open source, super easy to self-host, and we offer a managed service.

If you're building anything involving retrieval, we’d love your feedback!

waleedlatif 4 days ago

curious how you handle updates. like if a vector changes, do you re-run the HNSW links automatically?

GeorgeCurtis 4 days ago

Yes, when a vector changes, the HNSW links are updated automatically.

pomarie 4 days ago

This sounds amazing, gonna give it a spin!

GeorgeCurtis 4 days ago

Thanks so much! :)

allisonee 4 days ago

congrats on the launch! excited to try it out!