About Semantic Software Engineering Search

This search engine helps you explore software engineering research papers using semantic search technology. It contains over 150,000 papers from major software engineering venues and journals.

How it works

Each paper is converted into a 1024-dimensional vector (an embedding) using the mxbai-embed-large model. The embedding captures the paper's semantic meaning, so searches match concepts rather than only keywords.
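
As a rough illustration of the embedding step, the sketch below turns paper records into vectors. The field names (title, abstract), the helper names, and the use of the sentence-transformers library with the mixedbread-ai/mxbai-embed-large-v1 checkpoint are assumptions made for this example; the site's actual pipeline may differ.

```python
from typing import Callable, Sequence

EMBEDDING_DIM = 1024  # dimensionality stated for mxbai-embed-large

# Any function mapping a batch of texts to one vector per text.
Encoder = Callable[[Sequence[str]], list[list[float]]]


def paper_to_text(paper: dict) -> str:
    """Join the fields to embed (assumption: title plus abstract)."""
    parts = [paper.get("title", ""), paper.get("abstract", "")]
    return ". ".join(p.strip() for p in parts if p and p.strip())


def embed_papers(papers: Sequence[dict], encode: Encoder) -> list[list[float]]:
    """Embed each paper and check that every vector has the expected size."""
    vectors = encode([paper_to_text(p) for p in papers])
    for vec in vectors:
        if len(vec) != EMBEDDING_DIM:
            raise ValueError(f"expected {EMBEDDING_DIM} dimensions, got {len(vec)}")
    return vectors


def load_mxbai_encoder() -> Encoder:
    """One possible encoder (assumption): sentence-transformers, imported lazily."""
    from sentence_transformers import SentenceTransformer

    model = SentenceTransformer("mixedbread-ai/mxbai-embed-large-v1")
    return lambda texts: model.encode(list(texts)).tolist()
```

Passing the encoder in as a function keeps the sketch independent of how the model is served; calling embed_papers(papers, load_mxbai_encoder()) would use the real model, provided the library and checkpoint are installed.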

Features

Data Sources

Paper metadata is retrieved from Crossref through its public API. The collection focuses on software engineering research from major venues and journals in the field.
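
For readers curious what such a retrieval might look like, here is a minimal sketch against the Crossref REST API. It builds a URL for the works of one journal and extracts a few fields from a response. The choice of the per-journal endpoint, the fields kept, and the function names are assumptions for illustration; the source does not say which endpoints or venues were actually used.

```python
from urllib.parse import urlencode

CROSSREF_API = "https://api.crossref.org"


def build_journal_works_url(issn: str, rows: int = 100, cursor: str = "*") -> str:
    """URL listing works of one journal; cursor="*" starts deep paging."""
    query = urlencode({"rows": rows, "cursor": cursor})
    return f"{CROSSREF_API}/journals/{issn}/works?{query}"


def extract_records(payload: dict) -> list[dict]:
    """Keep DOI, first title, and abstract (if any) from a Crossref work list."""
    records = []
    for item in payload.get("message", {}).get("items", []):
        titles = item.get("title") or []
        if not titles:
            continue  # skip entries without a title
        records.append(
            {
                "doi": item.get("DOI"),
                "title": titles[0],
                "abstract": item.get("abstract", ""),
            }
        )
    return records
```

The JSON response can be fetched with any HTTP client and passed to extract_records. To walk through a full journal, the request would be repeated with the next-cursor value returned in the response's message object until no items remain.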

Technical Details

The search ranks papers by the cosine similarity between the query embedding and each paper embedding. The index is hosted on Pinecone, a managed vector database, which keeps similarity searches fast across the entire collection.
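
To make the ranking concrete, the following sketch computes cosine similarity in plain Python and returns the closest papers by brute force. It only illustrates the measure: the function names and the toy two-dimensional vectors are hypothetical, and in the real system the vector database performs this search over 1024-dimensional embeddings.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors: dot(a, b) / (|a| * |b|)."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # convention chosen here for zero vectors
    return dot / (norm_a * norm_b)


def rank_papers(
    query: Sequence[float],
    papers: dict[str, Sequence[float]],
    top_k: int = 3,
) -> list[tuple[str, float]]:
    """Score every paper against the query and keep the top_k best matches."""
    scored = [(title, cosine_similarity(query, vec)) for title, vec in papers.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


# Toy example with 2-dimensional vectors instead of real embeddings.
toy_index = {"paper A": [1.0, 0.0], "paper B": [0.0, 1.0], "paper C": [1.0, 1.0]}
best = rank_papers([1.0, 0.1], toy_index, top_k=2)
```

Because cosine similarity depends only on the angle between vectors, two papers score as similar when their embeddings point in the same direction, regardless of vector length.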

Based on searchthearxiv by August Wester.

Martin Monperrus. Feb 2025.