a6c2fb93cf
Tesseract OCR (PHP, server-side):
- Dockerfile: adds tesseract-ocr + tesseract-ocr-ita + libgd-dev (gd extension)
- api/index.php: new tesseractReadExpiry() — decodes base64 image, pre-processes with GD (2× upscale, greyscale, auto-contrast, sharpen), runs tesseract CLI with ita+eng PSM-6, extracts date with multi-pattern regex (DD/MM/YYYY, MM/YYYY, ISO, named-month), returns YYYY-MM-DD + confidence
- geminiReadExpiry() now: (1) tries Tesseract first; (2) falls back to Gemini Vision if OCR returns null or finds no date; (3) returns the source ('ocr'|'gemini') in the response
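The multi-pattern date extraction described for tesseractReadExpiry() can be sketched as follows. This is not the commit's PHP code: it is a JavaScript illustration of one plausible pattern order, and the names extractExpiryDate, MONTHS, pad and lastDay are hypothetical. It also assumes that a bare MM/YYYY date means the last day of that month, which the commit message does not state.

```javascript
// Hypothetical sketch: the real logic lives in api/index.php (PHP).
// Three-letter prefixes for Italian and English month names.
const MONTHS = {
  gen: 1, jan: 1, feb: 2, mar: 3, apr: 4, mag: 5, may: 5, giu: 6, jun: 6,
  lug: 7, jul: 7, ago: 8, aug: 8, set: 9, sep: 9, ott: 10, oct: 10,
  nov: 11, dic: 12, dec: 12,
};

const pad = (n) => String(n).padStart(2, '0');

// Last calendar day of a 1-based month (assumption: MM/YYYY means "end of month").
const lastDay = (year, month) => new Date(Date.UTC(year, month, 0)).getUTCDate();

// Returns 'YYYY-MM-DD' for the first pattern that matches, or null.
function extractExpiryDate(text) {
  let m;
  // ISO: YYYY-MM-DD
  if ((m = text.match(/\b(\d{4})-(\d{2})-(\d{2})\b/))) {
    return `${m[1]}-${m[2]}-${m[3]}`;
  }
  // DD/MM/YYYY (also accepts '.' or '-' as separators)
  if ((m = text.match(/\b(\d{1,2})[\/.\-](\d{1,2})[\/.\-](\d{4})\b/))) {
    return `${m[3]}-${pad(m[2])}-${pad(m[1])}`;
  }
  // Named month: "12 GEN 2026" or "12 January 2026"
  if ((m = text.match(/\b(\d{1,2})\s+([a-z]{3})[a-z]*\.?\s+(\d{4})\b/i))) {
    const month = MONTHS[m[2].toLowerCase()];
    if (month) return `${m[3]}-${pad(month)}-${pad(m[1])}`;
  }
  // MM/YYYY
  if ((m = text.match(/\b(\d{1,2})[\/.\-](\d{4})\b/))) {
    const month = Number(m[1]);
    const year = Number(m[2]);
    if (month >= 1 && month <= 12) {
      return `${year}-${pad(month)}-${pad(lastDay(year, month))}`;
    }
  }
  return null;
}
```

The full-date patterns are tried before MM/YYYY so that a string such as "15/03/2026" is not misread as the month-only form. A null result corresponds to the "no date found" case that triggers the Gemini Vision fallback.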
@xenova/transformers embedding classifier (browser-side):
- index.html: ES-module bootstrap that lazy-loads 'Xenova/all-MiniLM-L6-v2' quantized (~23 MB, cached in browser) via window._getCategoryPipeline(); pre-warms on first scan page visit
- assets/js/app.js: classifyCategoryByEmbedding(name) — embeds product name + 16 category anchor descriptions, cosine similarity, threshold 0.30; results cached in _embeddingCache Map
- autoDetectCategory(): after keyword map misses, fires classifyCategoryByEmbedding async and updates select when resolved (respects manuallySet flag)
- createQuickProduct(): if regex returned 'altro', silently patches category with embedding result via a background api call
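The embedding classification step (cosine similarity against category anchors, threshold 0.30, Map cache) can be sketched as below. This is a minimal illustration, not the code in assets/js/app.js: the embed argument is a hypothetical async function standing in for the Transformers.js pipeline, and classifyByEmbedding, embeddingCache and the anchors object are illustrative names.

```javascript
// Cosine similarity between two equal-length numeric vectors.
function cosineSimilarity(a, b) {
  let dot = 0, normA = 0, normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  const denom = Math.sqrt(normA) * Math.sqrt(normB);
  return denom === 0 ? 0 : dot / denom; // zero vectors have no direction
}

const THRESHOLD = 0.30;           // value quoted in the commit message
const embeddingCache = new Map(); // stand-in for _embeddingCache

// `embed`: hypothetical async function, text -> number[].
// `anchors`: object mapping category id -> precomputed anchor vector.
// Resolves to the best category id, or null if nothing reaches the threshold.
async function classifyByEmbedding(name, anchors, embed) {
  const key = name.trim().toLowerCase();
  if (embeddingCache.has(key)) return embeddingCache.get(key);

  const vector = await embed(key);
  let best = null;
  let bestScore = -Infinity;
  for (const [category, anchorVector] of Object.entries(anchors)) {
    const score = cosineSimilarity(vector, anchorVector);
    if (score > bestScore) {
      best = category;
      bestScore = score;
    }
  }

  const result = bestScore >= THRESHOLD ? best : null;
  embeddingCache.set(key, result);
  return result;
}
```

Caching the null result as well avoids re-embedding names that no category matches. A caller such as autoDetectCategory() would still need to check the manuallySet flag before applying the resolved value.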
47 lines
1.3 KiB
Docker
FROM php:8.2-apache

# Install required PHP extensions + Tesseract OCR for offline expiry date reading
RUN apt-get update && apt-get install -y \
    libsqlite3-dev \
    libcurl4-openssl-dev \
    libonig-dev \
    libgd-dev \
    tesseract-ocr \
    tesseract-ocr-ita \
    tesseract-ocr-eng \
    && docker-php-ext-install pdo_sqlite curl mbstring gd \
    && apt-get clean && rm -rf /var/lib/apt/lists/*

# Enable Apache mod_rewrite and mod_headers
RUN a2enmod rewrite headers

# Set working directory
WORKDIR /var/www/html

# Copy application files
COPY . /var/www/html/

# Create data directory with proper permissions
RUN mkdir -p /var/www/html/data/backups \
    && chown -R www-data:www-data /var/www/html/data \
    && chmod -R 775 /var/www/html/data

# Create .env from example if it doesn't exist (will be overridden by volume mount)
RUN [ ! -f /var/www/html/.env ] && cp /var/www/html/.env.example /var/www/html/.env || true

# Apache configuration: serve from app root
RUN echo '<Directory /var/www/html>\n\
    AllowOverride All\n\
    Require all granted\n\
</Directory>' > /etc/apache2/conf-available/evershelf.conf \
    && a2enconf evershelf

# Expose port 80
EXPOSE 80

# Health check
HEALTHCHECK --interval=30s --timeout=5s --retries=3 \
    CMD curl -f http://localhost/ || exit 1

CMD ["apache2-foreground"]