"Nutzungsgeschichte": Technology

This project implements a search engine for Ancient Greek and Latin texts, designed to support advanced queries and experimentation on biblical texts.

General Architecture

The system combines a containerized architecture for easy installation, REST interfaces for programmatic interaction, and tools for automated evaluation of search performance.

The engine is deployed using Docker Compose and consists of:

  • a customized Elasticsearch instance for indexing and text search,
  • a PostgreSQL database for persistent data management,
  • a Python backend exposing REST APIs and a web application for interacting with the system.
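
As a rough illustration of working with the containerized stack, the following sketch checks whether each service accepts TCP connections once the containers are started. The ports are assumptions: 9200 and 5432 are the stock defaults of Elasticsearch and PostgreSQL, and the ports actually published by this project's Compose file may differ. The helper names are hypothetical and not part of the project.

```python
import socket

# Assumed defaults: Elasticsearch listens on 9200 and PostgreSQL on 5432 out of
# the box. The ports actually published depend on the project's Compose file.
DEFAULT_SERVICES = {
    "elasticsearch": ("localhost", 9200),
    "postgres": ("localhost", 5432),
}


def port_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False


def stack_status(services=None, timeout: float = 1.0) -> dict:
    """Map each service name to whether its port currently accepts connections."""
    if services is None:
        services = DEFAULT_SERVICES
    return {
        name: port_open(host, port, timeout)
        for name, (host, port) in services.items()
    }
```

Calling `stack_status()` after starting the containers returns a dictionary such as `{"elasticsearch": True, "postgres": False}`, which is enough to tell which container is not yet reachable. An open port only shows that something is listening; it does not confirm that the service has finished initializing.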

Main Components

Indexing and Search

  • Based on Elasticsearch, enhanced with custom plugins

  • Supports language-specific configurations for Greek and Latin

  • Supports both semantic embeddings and traditional lexical methods (e.g., trigrams, classical tokenizers)
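
To make the combination of trigram and embedding fields more concrete, here is a minimal sketch of an index definition that uses only stock Elasticsearch components: the `ngram` tokenizer and the `dense_vector` field type. The project's custom plugins and its real Greek and Latin configurations are not reproduced here. The function name, the field names, and the embedding dimension of 384 are assumptions chosen for illustration.

```python
SUPPORTED_LANGUAGES = {"greek", "latin"}


def build_index_body(language: str, embedding_dims: int = 384) -> dict:
    """Build an index body with a word field, a trigram sub-field, and a vector field.

    Hypothetical helper: the field names and the default dimension are
    assumptions, not the project's actual schema.
    """
    if language not in SUPPORTED_LANGUAGES:
        raise ValueError(f"unsupported language: {language!r}")
    return {
        "settings": {
            "analysis": {
                "tokenizer": {
                    # Character trigrams restricted to letters.
                    "trigram_tokenizer": {
                        "type": "ngram",
                        "min_gram": 3,
                        "max_gram": 3,
                        "token_chars": ["letter"],
                    }
                },
                "analyzer": {
                    "trigram_analyzer": {
                        "type": "custom",
                        "tokenizer": "trigram_tokenizer",
                        "filter": ["lowercase"],
                    },
                    "word_analyzer": {
                        "type": "custom",
                        "tokenizer": "standard",
                        "filter": ["lowercase"],
                    },
                },
            }
        },
        "mappings": {
            # Free-form metadata recording which language the index holds.
            "_meta": {"language": language},
            "properties": {
                "text": {
                    "type": "text",
                    "analyzer": "word_analyzer",
                    "fields": {
                        "trigram": {"type": "text", "analyzer": "trigram_analyzer"}
                    },
                },
                "embedding": {"type": "dense_vector", "dims": embedding_dims},
            },
        },
    }
```

The resulting dictionary could be sent as the request body when creating an index. Keeping the trigram analysis in a sub-field of `text` lets one stored passage be queried by whole words, by character trigrams, or by vector similarity.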

Backend API

  • Written in Python, exposes REST endpoints for all operations (index creation, data loading, querying)

  • Also supports managing test cases and result collections
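
The source does not specify which measures the automated evaluation computes. As an illustration only, the sketch below scores one ranked result list against one test case using three common retrieval metrics: precision at k, recall at k, and reciprocal rank. The `SearchTestCase` class and all function names are hypothetical.

```python
from dataclasses import dataclass
from typing import FrozenSet, Sequence


@dataclass(frozen=True)
class SearchTestCase:
    """Hypothetical test case: a query plus the ids of passages judged relevant."""
    query: str
    relevant_ids: FrozenSet[str]


def precision_at_k(ranked_ids: Sequence[str], relevant: FrozenSet[str], k: int) -> float:
    """Fraction of the first k ranks that hold a relevant id."""
    if k <= 0:
        raise ValueError("k must be positive")
    hits = sum(1 for doc_id in ranked_ids[:k] if doc_id in relevant)
    return hits / k


def recall_at_k(ranked_ids: Sequence[str], relevant: FrozenSet[str], k: int) -> float:
    """Fraction of all relevant ids that appear in the first k ranks."""
    if k <= 0:
        raise ValueError("k must be positive")
    if not relevant:
        return 0.0
    hits = sum(1 for doc_id in ranked_ids[:k] if doc_id in relevant)
    return hits / len(relevant)


def reciprocal_rank(ranked_ids: Sequence[str], relevant: FrozenSet[str]) -> float:
    """1 / rank of the first relevant id, or 0.0 if none was retrieved."""
    for rank, doc_id in enumerate(ranked_ids, start=1):
        if doc_id in relevant:
            return 1.0 / rank
    return 0.0
```

For example, with a ranking `["a", "b", "c", "d"]` and relevant ids `{"b", "d", "z"}`, one of the first two results is relevant, two of the three relevant passages appear within the first four ranks, and the first relevant result sits at rank 2. Averaging such values over a collection of test cases gives a single figure for comparing index configurations.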

Database and Storage

  • PostgreSQL for data persistence

  • Dedicated folders (elasticsearch/ and postgres/) for volume storage

Web Interface

  • A development web interface accessible through the browser

  • Allows running queries, managing datasets, tests, and configurations without using the command line

Techniques and Methods

The system adopts a hybrid approach to improve the quality and accuracy of searches:

  • Semantic search: embeddings to compare texts on a semantic level

  • Traditional methods: trigrams, tokenization, and dedicated language configurations

  • Optional GPU acceleration to shorten indexing time
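
The hybrid idea can be sketched in a few lines of plain Python. This is not the project's scoring code: the source does not say how the two signals are merged, so the linear weighting with `alpha` below is an assumption, as are all function names. Lexical similarity is approximated by Jaccard overlap of character trigrams, semantic similarity by the cosine of two embedding vectors.

```python
import math
from typing import Sequence, Set


def char_trigrams(text: str) -> Set[str]:
    """Return the set of lowercase character trigrams of text."""
    normalized = text.lower()
    return {normalized[i:i + 3] for i in range(len(normalized) - 2)}


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Size of the intersection over size of the union; 0.0 for two empty sets."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def cosine(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine similarity of two equal-length vectors; 0.0 if either has zero norm."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(u, v))
    norm = math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v))
    return dot / norm if norm else 0.0


def hybrid_score(query: str, passage: str,
                 query_vec: Sequence[float], passage_vec: Sequence[float],
                 alpha: float = 0.5) -> float:
    """Weighted sum of the two signals (assumed scheme): alpha weights the semantic part."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must be between 0 and 1")
    lexical = jaccard(char_trigrams(query), char_trigrams(passage))
    semantic = cosine(query_vec, passage_vec)
    return alpha * semantic + (1.0 - alpha) * lexical
```

Trigram overlap tolerates spelling variants and inflected endings, since word forms that differ in a few characters still share most of their trigrams, while the embedding term can match passages that share meaning but little surface form. Setting `alpha` to 0 or 1 reduces the score to purely lexical or purely semantic ranking.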

Repository and Documentation

The source code is available on GitHub.
Full documentation, including setup instructions, can be found in the README.md file and in the API sources (webapp/app/api/).

Main requirements:

  • Docker + Docker Compose
  • NVIDIA Container Toolkit for GPU acceleration (optional)
  • 16 GB RAM recommended; tested on Ubuntu 24.04 LTS