Local AI inference at ConSol combines GPT‑OSS with vLLM on OpenShift, delivering high‑throughput, low‑latency model serving on NVIDIA RTX PRO 6000 GPUs. Running the workload locally gives us cost control, data sovereignty, and full control over performance tuning. The deployment relies on persistent storage, offline mode, and air‑gapped egress networking for a secure, production‑ready setup. | ConSol Blog
Open-source vector similarity search for Postgres (pgvector/pgvector). | GitHub
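pgvector's core operation is ranking rows by vector distance (its `<->` operator is Euclidean distance, `<=>` is cosine distance). As a minimal illustration of what such a similarity search computes, here is a plain-Python sketch; the row labels and vectors are hypothetical, and a real deployment would run this inside Postgres with an `ORDER BY embedding <=> query LIMIT k` query instead.

```python
import math

def l2_distance(a, b):
    # Euclidean (L2) distance, the metric behind pgvector's <-> operator
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def cosine_distance(a, b):
    # Cosine distance (1 - cosine similarity), behind pgvector's <=> operator
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)

def nearest(query, rows, metric=cosine_distance, k=1):
    # Rank (label, embedding) rows by distance to the query vector,
    # analogous to ORDER BY embedding <=> query LIMIT k
    return sorted(rows, key=lambda row: metric(row[1], query))[:k]

# Hypothetical stored embeddings
rows = [("a", [1.0, 0.0]), ("b", [0.0, 1.0]), ("c", [1.0, 1.0])]
print(nearest([1.0, 0.1], rows))  # "a" points in nearly the same direction
```

Note that an exact scan like this is linear in the number of rows; pgvector additionally offers approximate indexes (IVFFlat, HNSW) to keep large searches fast.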