This project implements a search engine for Ancient Greek and Latin texts, designed to support advanced queries and experimentation on biblical texts.
The system combines a containerized architecture for easy installation, REST interfaces for programmatic interaction, and tools for automated evaluation of search performance.
The engine is deployed using Docker Compose and consists of:
A search engine based on Elasticsearch and extended with custom plugins. It supports language-specific configurations for Greek and Latin, and manages both semantic embeddings and traditional methods (e.g., trigrams, classical tokenizers).
An API layer written in Python that exposes REST endpoints for all operations (index creation, data loading, querying) and also supports managing test cases and result collections.
A PostgreSQL database for data persistence, with dedicated folders (elasticsearch/ and postgres/) for volume storage.
A development web interface, accessible through the browser, that allows running queries and managing datasets, tests, and configurations without using the command line.
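As an illustration of programmatic interaction with the REST layer, the sketch below builds (but does not send) a JSON search request using only the Python standard library. The base URL, the `/search` route, the payload fields, and the index name `greek_nt` are all hypothetical placeholders, not the project's actual API; the real routes are defined in webapp/app/api/.

```python
import json
import urllib.request

# Hypothetical values: the real host, port, and routes live in webapp/app/api/.
BASE_URL = "http://localhost:8000"


def build_search_request(index, text, mode="hybrid"):
    """Build a POST request carrying a JSON search payload (assumed shape)."""
    payload = json.dumps({"index": index, "query": text, "mode": mode}).encode("utf-8")
    return urllib.request.Request(
        url=f"{BASE_URL}/search",  # assumed route, for illustration only
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_search_request("greek_nt", "ἐν ἀρχῇ ἦν ὁ λόγος")
# Once the stack is running, urllib.request.urlopen(request) would send it.
```

Building the request separately from sending it keeps the example runnable without a live stack and makes the payload easy to inspect.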
The system adopts a hybrid approach to improve the quality and accuracy of searches:
Semantic search: embeddings to compare texts on a semantic level
Traditional methods: trigrams, tokenization, and dedicated language configurations
Optional GPU acceleration to reduce indexing time
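The idea of combining a semantic signal with a traditional one can be sketched in a few lines of pure Python. This is a minimal illustration, not the project's scoring code: the function names, the Jaccard measure over character trigrams, the linear blend, and the weight `alpha` are all assumptions made for the example, and the embedding vectors are expected to come from whatever model is in use.

```python
import math


def char_trigrams(text):
    """Return the set of character trigrams of a lowercased, space-padded string."""
    padded = f"  {text.lower()} "
    return {padded[i:i + 3] for i in range(len(padded) - 2)}


def trigram_similarity(a, b):
    """Jaccard overlap between the trigram sets of two strings (0.0 to 1.0)."""
    ta, tb = char_trigrams(a), char_trigrams(b)
    union = ta | tb
    return len(ta & tb) / len(union) if union else 0.0


def cosine_similarity(u, v):
    """Cosine of the angle between two embedding vectors; 0.0 if either is zero."""
    dot = sum(x * y for x, y in zip(u, v))
    norm = math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v))
    return dot / norm if norm else 0.0


def hybrid_score(query, doc, query_vec, doc_vec, alpha=0.5):
    """Blend semantic and trigram similarity; alpha weights the semantic part."""
    semantic = cosine_similarity(query_vec, doc_vec)
    lexical = trigram_similarity(query, doc)
    return alpha * semantic + (1 - alpha) * lexical
```

With `alpha=1.0` the score reduces to pure embedding similarity and with `alpha=0.0` to pure trigram overlap, so intermediate values trade off tolerance for inflected or variant spellings against similarity of meaning.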
The source code is available on GitHub.
Full documentation, including setup instructions, can be found in the README.md file and in the API sources (webapp/app/api/).
Main requirements: