Adding missed files

Remove Freeform and Find from UI. Allow Description to be added to Reviewed job
Added job review toggle
2026-06-30 12:16:16 +01:00 · 2026-06-29 13:09:01 +01:00 · 2026-06-23 10:43:44 +01:00 · 2026-06-19 23:12:33 +01:00 · 2026-06-19 17:47:53 +01:00 · 2026-06-10 22:04:52 +01:00
18 changed files with 2594 additions and 604 deletions
--- a/.env
+++ b/.env
@@ -0,0 +1,23 @@
 # DeepSeek OCR Application Configuration
 # API Configuration
 API_HOST=0.0.0.0
 API_PORT=8000
 # Frontend Configuration
 FRONTEND_PORT=3000
 # Model Configuration
 MODEL_NAME=deepseek-ai/DeepSeek-OCR
 HF_HOME=/models
 # CORS Configuration (comma-separated origins, defaults to http://localhost:3000)
 CORS_ORIGINS=http://localhost:3000
 # Upload Configuration
 MAX_UPLOAD_SIZE_MB=100
 # Processing Configuration
 BASE_SIZE=1024
 IMAGE_SIZE=640
 CROP_MODE=true
--- a/.env.example
+++ b/.env.example
@@ -11,12 +11,34 @@ FRONTEND_PORT=3000
 MODEL_NAME=deepseek-ai/DeepSeek-OCR
 HF_HOME=/models
 # OCR model selection
 # Register the local DeepSeek-OCR model (set to false for an Ollama-only deployment)
 ENABLE_DEEPSEEK_LOCAL=true
 # External Ollama host the backend should call (no trailing slash)
 OLLAMA_BASE_URL=http://host.docker.internal:11434
 # Comma-separated Ollama vision model tags to surface in the UI.
 # Pull these on the Ollama host first, e.g. `ollama pull glm-ocr`.
 OLLAMA_MODELS=glm-ocr,llama3.2-vision,minicpm-v,qwen2.5vl
 # Default model id selected in the UI (deepseek-local or ollama:<tag>)
 DEFAULT_OCR_MODEL=deepseek-local
 # Per-request timeout (seconds) for Ollama calls
 OLLAMA_TIMEOUT=300
 # CORS Configuration (comma-separated origins, defaults to http://localhost:3000)
 CORS_ORIGINS=http://localhost:3000
 # Upload Configuration
 MAX_UPLOAD_SIZE_MB=100
 # PostgreSQL Configuration
 POSTGRES_USER=ocr_user
 POSTGRES_PASSWORD=ocr_password
 POSTGRES_DB=ocr_db
 DATABASE_URL=postgresql://ocr_user:ocr_password@postgres:5432/ocr_db
 # OCR Image Storage (host path mounted into container)
 OCR_IMAGES_DIR=/data/ocr_images
 # Processing Configuration
 BASE_SIZE=1024
 IMAGE_SIZE=640
--- a/.gitignore
+++ b/.gitignore
@@ -46,7 +46,7 @@ yarn.lock
 pnpm-lock.yaml
 # Environment
-.env
+#.env
 .env.local
 .env.development.local
 .env.test.local
--- a/README.md
+++ b/README.md
@@ -172,6 +172,13 @@ FRONTEND_PORT=3000
 MODEL_NAME=deepseek-ai/DeepSeek-OCR
 HF_HOME=/models
 # OCR model selection (DeepSeek + Ollama)
 ENABLE_DEEPSEEK_LOCAL=true                          # register the local GPU model
 OLLAMA_BASE_URL=http://host.docker.internal:11434   # external Ollama host
 OLLAMA_MODELS=glm-ocr,llama3.2-vision,minicpm-v,qwen2.5vl
 DEFAULT_OCR_MODEL=deepseek-local                    # deepseek-local or ollama:<tag>
 OLLAMA_TIMEOUT=300                                  # per-request timeout (seconds)
 # Upload Configuration
 MAX_UPLOAD_SIZE_MB=100  # Maximum file upload size
@@ -186,13 +193,47 @@ CROP_MODE=true         # Enable dynamic cropping for large images
 - `API_HOST`: Backend API host (default: 0.0.0.0)
 - `API_PORT`: Backend API port (default: 8000)
 - `FRONTEND_PORT`: Frontend port (default: 3000)
- `MODEL_NAME`: HuggingFace model identifier
+- `MODEL_NAME`: HuggingFace model identifier for the local DeepSeek-OCR model
 - `HF_HOME`: Model cache directory
 - `ENABLE_DEEPSEEK_LOCAL`: Register the local DeepSeek-OCR model (set `false` for an Ollama-only deployment with no GPU model loaded)
 - `OLLAMA_BASE_URL`: URL of an external Ollama server the backend calls for non-DeepSeek models
 - `OLLAMA_MODELS`: Comma-separated Ollama vision model tags to expose in the UI (pull them on the Ollama host first, e.g. `ollama pull glm-ocr`)
 - `DEFAULT_OCR_MODEL`: Model id selected by default (`deepseek-local` or `ollama:<tag>`)
 - `OLLAMA_TIMEOUT`: Per-request timeout in seconds for Ollama calls
 - `MAX_UPLOAD_SIZE_MB`: Maximum file upload size in megabytes
 - `BASE_SIZE`: Base image processing size (affects memory usage)
 - `IMAGE_SIZE`: Tile size for dynamic cropping
 - `CROP_MODE`: Enable/disable dynamic image cropping
 ### Choosing an OCR Model
 The **Model** selector (next to the Mode selector) chooses which backend runs the OCR:
 - **DeepSeek-OCR (local GPU)** — the default. Loaded lazily on first use. Supports
  every mode including grounding/bounding-box modes (Find), plus the Advanced
  Settings (base size, crop mode, etc.).
 - **Ollama models** — any vision model pulled on your Ollama host and listed in
  `OLLAMA_MODELS` (e.g. `glm-ocr`, `llama3.2-vision`). These run remotely on the
  Ollama server. They return **plain text only**: bounding boxes are not produced,
  so grounding modes (Find) and the DeepSeek-specific Advanced Settings are ignored
  / disabled when an Ollama model is selected.
 Setup for Ollama models:
 ```bash
 # On the machine running Ollama
 ollama pull glm-ocr
 ollama pull llama3.2-vision
 # Point the backend at it (in .env), then restart
 OLLAMA_BASE_URL=http://host.docker.internal:11434
 OLLAMA_MODELS=glm-ocr,llama3.2-vision
 ```
 `GET /api/models` returns the registered models and their capabilities; the UI
 populates the selector from it. The model used for each job is stored on the job
 record (`ocr_model`) and shown in the Browse Jobs view.
 ## Tech Stack
 ### Frontend
@@ -377,6 +418,7 @@ For large images, the model uses dynamic cropping:
 **Parameters:**
 - `image` (file, required) - Image file to process (up to 100MB)
 - `model` (string) - OCR model id from `GET /api/models` (default: registry default). Grounding/Advanced settings apply to DeepSeek only.
 - `mode` (string) - OCR mode: `plain_ocr` | `describe` | `find_ref` | `freeform`
 - `prompt` (string) - Custom prompt for freeform mode
 - `grounding` (bool) - Enable bounding boxes (auto-enabled for find_ref)
@@ -416,6 +458,7 @@ Process PDF documents with OCR and export to various formats.
 **Parameters:**
 - `pdf_file` (file, required) - PDF file to process (up to 100MB)
 - `model` (string) - OCR model id from `GET /api/models` (default: registry default)
 - `mode` (string) - OCR mode: `plain_ocr` | `describe` | `find_ref` | `freeform`
 - `prompt` (string) - Custom prompt for freeform mode
 - `output_format` (string) - Output format: `markdown` | `html` | `docx` | `json`
--- a/backend/database.py
+++ b/backend/database.py
@@ -0,0 +1,115 @@
 import os
 import psycopg2
 import psycopg2.extras
 from contextlib import contextmanager
 from decouple import config as env_config
 DATABASE_URL = env_config(
    "DATABASE_URL",
    default="postgresql://ocr_user:ocr_password@postgres:5432/ocr_db"
 )
 def _get_conn():
    return psycopg2.connect(DATABASE_URL, cursor_factory=psycopg2.extras.RealDictCursor)
 def init_db():
    """Create tables if they don't exist. Called once at startup."""
    conn = None
    try:
        conn = _get_conn()
        with conn.cursor() as cur:
            cur.execute("""
                CREATE TABLE IF NOT EXISTS ocr_jobs (
                    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
                    author TEXT,
                    book TEXT,
                    chapter TEXT,
                    page TEXT,
                    submitted_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
                    image_path TEXT NOT NULL,
                    original_filename TEXT,
                    ocr_text TEXT,
                    status TEXT NOT NULL DEFAULT 'unreviewed',
                    reviewed_text TEXT,
                    reviewer_name TEXT,
                    reviewed_at TIMESTAMPTZ,
                    mode TEXT
                )
            """)
            # Index for fast full-text-style searches on common fields
            cur.execute("""
                CREATE INDEX IF NOT EXISTS ocr_jobs_status_idx ON ocr_jobs(status)
            """)
            cur.execute("""
                CREATE INDEX IF NOT EXISTS ocr_jobs_submitted_at_idx ON ocr_jobs(submitted_at DESC)
            """)
            # Add columns introduced after initial schema (safe to run repeatedly)
            cur.execute("""
                ALTER TABLE ocr_jobs
                ADD COLUMN IF NOT EXISTS describe_text TEXT
            """)
            cur.execute("""
                ALTER TABLE ocr_jobs
                ADD COLUMN IF NOT EXISTS freeform_text TEXT
            """)
            cur.execute("""
                ALTER TABLE ocr_jobs
                ADD COLUMN IF NOT EXISTS qdrant_synced_at TIMESTAMPTZ
            """)
            cur.execute("""
                ALTER TABLE ocr_jobs
                ADD COLUMN IF NOT EXISTS updated_at TIMESTAMPTZ
            """)
            # Which OCR model produced this job (e.g. "deepseek-local", "ollama:glm-ocr")
            cur.execute("""
                ALTER TABLE ocr_jobs
                ADD COLUMN IF NOT EXISTS ocr_model TEXT
            """)
            # Trigger function: stamp updated_at on every row update
            cur.execute("""
                CREATE OR REPLACE FUNCTION set_updated_at()
                RETURNS TRIGGER AS $$
                BEGIN
                    NEW.updated_at = NOW();
                    RETURN NEW;
                END;
                $$ LANGUAGE plpgsql
            """)
            cur.execute("""
                CREATE OR REPLACE TRIGGER ocr_jobs_set_updated_at
                BEFORE UPDATE ON ocr_jobs
                FOR EACH ROW EXECUTE FUNCTION set_updated_at()
            """)
            # Unique constraint: prevent duplicate (author, chapter, page) submissions.
            # Applies only when all three fields are non-null.
            cur.execute("""
                CREATE UNIQUE INDEX IF NOT EXISTS ocr_jobs_author_chapter_page_unique
                ON ocr_jobs (author, chapter, page)
                WHERE author IS NOT NULL AND chapter IS NOT NULL AND page IS NOT NULL
            """)
        conn.commit()
        print("Database initialized.")
    except Exception as exc:
        print(f"Database init failed: {exc}")
        if conn:
            conn.rollback()
        raise
    finally:
        if conn:
            conn.close()
@contextmanager
 def get_db():
    """Yield a connection and auto-commit/rollback."""
    conn = _get_conn()
    try:
        yield conn
        conn.commit()
    except Exception:
        conn.rollback()
        raise
    finally:
        conn.close()
--- a/backend/main.py
+++ b/backend/main.py
@@ -1,16 +1,15 @@
 import os
-import re
+import uuid
 import tempfile
 import shutil
 import base64
 from typing import List, Dict, Any, Optional
 from contextlib import asynccontextmanager
 from datetime import datetime, timezone
-from fastapi import FastAPI, File, UploadFile, Form, HTTPException
+from fastapi import FastAPI, File, UploadFile, Form, HTTPException, Query
 from fastapi.middleware.cors import CORSMiddleware
-from fastapi.responses import JSONResponse, StreamingResponse
+from fastapi.responses import JSONResponse, StreamingResponse, FileResponse
-import torch
+from pydantic import BaseModel
 from transformers import AutoModel, AutoTokenizer
 from PIL import Image
 import uvicorn
 from decouple import config as env_config
@@ -24,51 +23,41 @@ from pdf_utils import (
    clean_markdown_content
 )
 from format_converter import DocumentConverter
 from database import init_db, get_db
 from providers import (
    build_registry,
    parse_detections,
    clean_grounding_text,
    ProviderError,
    GROUNDING_MODES,
 )
 OCR_IMAGES_DIR = env_config("OCR_IMAGES_DIR", default="/data/ocr_images")
 # -----------------------------
-# Lifespan context for model loading
+# Lifespan context
 # -----------------------------
-model = None
+# The model registry holds all available OCR providers. Local models (e.g.
-tokenizer = None
+# DeepSeek-OCR) are loaded lazily on first use so an Ollama-only deployment
 # starts instantly and never touches the GPU.
 registry = None
@asynccontextmanager
 async def lifespan(app: FastAPI):
-    """Load model on startup, cleanup on shutdown"""
+    """Build the model registry on startup."""
-    global model, tokenizer
+    global registry
-    # Environment setup
+    # Image storage directory
-    os.environ.pop("TRANSFORMERS_CACHE", None)
+    os.makedirs(OCR_IMAGES_DIR, exist_ok=True)
    MODEL_NAME = env_config("MODEL_NAME", default="deepseek-ai/DeepSeek-OCR")
    HF_HOME = env_config("HF_HOME", default="/models")
    os.makedirs(HF_HOME, exist_ok=True)
-    # Load model
+    # Database
    print(f"🚀 Loading {MODEL_NAME}...")
    torch_dtype = torch.bfloat16
    tokenizer = AutoTokenizer.from_pretrained(
        MODEL_NAME,
        trust_remote_code=True,
    )
    model = AutoModel.from_pretrained(
        MODEL_NAME,
        trust_remote_code=True,
        use_safetensors=True,
        attn_implementation="eager",
        torch_dtype=torch_dtype,
    ).eval().to("cuda")
    # Pad token setup
    try:
-        if getattr(tokenizer, "pad_token_id", None) is None and getattr(tokenizer, "eos_token_id", None) is not None:
+        init_db()
-            tokenizer.pad_token = tokenizer.eos_token
+    except Exception as exc:
-        if getattr(model.config, "pad_token_id", None) is None and getattr(tokenizer, "pad_token_id", None) is not None:
+        print(f"Warning: database initialization failed: {exc}")
            model.config.pad_token_id = tokenizer.pad_token_id
    except Exception:
        pass
-    print("✅ Model loaded and ready!")
+    # OCR model registry (providers load their models lazily)
    registry = build_registry()
    yield
@@ -97,155 +86,6 @@ app.add_middleware(
    allow_headers=["*"],
 )
 # -----------------------------
 # Prompt builder
 # -----------------------------
 def build_prompt(
    mode: str,
    user_prompt: str,
    grounding: bool,
    find_term: Optional[str],
    schema: Optional[str],
    include_caption: bool,
 ) -> str:
    """Build the prompt based on mode"""
    parts: List[str] = ["<image>"]
    mode_requires_grounding = mode in {"find_ref", "layout_map", "pii_redact"}
    if grounding or mode_requires_grounding:
        parts.append("<|grounding|>")
    instruction = ""
    if mode == "plain_ocr":
        instruction = "Free OCR."
    elif mode == "markdown":
        instruction = "Convert the document to markdown."
    elif mode == "tables_csv":
        instruction = (
            "Extract every table and output CSV only. "
            "Use commas, minimal quoting. If multiple tables, separate with a line containing '---'."
        )
    elif mode == "tables_md":
        instruction = "Extract every table as GitHub-flavored Markdown tables. Output only the tables."
    elif mode == "kv_json":
        schema_text = schema.strip() if schema else "{}"
        instruction = (
            "Extract key fields and return strict JSON only. "
            f"Use this schema (fill the values): {schema_text}"
        )
    elif mode == "figure_chart":
        instruction = (
            "Parse the figure. First extract any numeric series as a two-column table (x,y). "
            "Then summarize the chart in 2 sentences. Output the table, then a line '---', then the summary."
        )
    elif mode == "find_ref":
        key = (find_term or "").strip() or "Total"
        instruction = f"Locate <|ref|>{key}<|/ref|> in the image."
    elif mode == "layout_map":
        instruction = (
            'Return a JSON array of blocks with fields {"type":["title","paragraph","table","figure"],'
            '"box":[x1,y1,x2,y2]}. Do not include any text content.'
        )
    elif mode == "pii_redact":
        instruction = (
            'Find all occurrences of emails, phone numbers, postal addresses, and IBANs. '
            'Return a JSON array of objects {label, text, box:[x1,y1,x2,y2]}.'
        )
    elif mode == "multilingual":
        instruction = "Free OCR. Detect the language automatically and output in the same script."
    elif mode == "describe":
        instruction = "Describe this image. Focus on visible key elements."
    elif mode == "freeform":
        instruction = user_prompt.strip() if user_prompt else "OCR this image."
    else:
        instruction = "OCR this image."
    if include_caption and mode not in {"describe"}:
        instruction = instruction + "\nThen add a one-paragraph description of the image."
    parts.append(instruction)
    return "\n".join(parts)
 # -----------------------------
 # Grounding parser
 # -----------------------------
 # Match a full detection block and capture the coordinates as the entire list expression
 # Examples of captured coords (including outer brackets):
 #  - [[312, 339, 480, 681]]
 #  - [[504, 700, 625, 910], [771, 570, 996, 996]]
 #  - [[110, 310, 255, 800], [312, 343, 479, 680], ...]
 # Using a greedy bracket capture ensures we include all inner lists up to the last ']' before </|det|>
 DET_BLOCK = re.compile(
    r"<\|ref\|>(?P<label>.*?)<\|/ref\|>\s*<\|det\|>\s*(?P<coords>\[.*\])\s*<\|/det\|>",
    re.DOTALL,
 )
 def clean_grounding_text(text: str) -> str:
    """Remove grounding tags from text for display, keeping labels"""
    # Replace <|ref|>label<|/ref|><|det|>[...any nested lists...]<|/det|> with just the label
    cleaned = re.sub(
        r"<\|ref\|>(.*?)<\|/ref\|>\s*<\|det\|>\s*\[.*\]\s*<\|/det\|>",
        r"\1",
        text,
        flags=re.DOTALL,
    )
    # Also remove any standalone grounding tags
    cleaned = re.sub(r"<\|grounding\|>", "", cleaned)
    return cleaned.strip()
 def parse_detections(text: str, image_width: int, image_height: int) -> List[Dict[str, Any]]:
    """Parse grounding boxes from text and scale from 0-999 normalized coords to actual image dimensions
    Handles both single and multiple bounding boxes:
    - Single: <|ref|>label<|/ref|><|det|>[[x1,y1,x2,y2]]<|/det|>
    - Multiple: <|ref|>label<|/ref|><|det|>[[x1,y1,x2,y2], [x1,y1,x2,y2], ...]<|/det|>
    """
    boxes: List[Dict[str, Any]] = []
    for m in DET_BLOCK.finditer(text or ""):
        label = m.group("label").strip()
        coords_str = m.group("coords").strip()
        print(f"🔍 DEBUG: Found detection for '{label}'")
        print(f"📦 Raw coords string (with brackets): {coords_str}")
        try:
            import ast
            # Parse the full bracket expression directly (handles single and multiple)
            parsed = ast.literal_eval(coords_str)
            # Normalize to a list of lists
            if (
                isinstance(parsed, list)
                and len(parsed) == 4
                and all(isinstance(n, (int, float)) for n in parsed)
            ):
                # Single box provided as [x1,y1,x2,y2]
                box_coords = [parsed]
                print("📦 Single box (flat list) detected")
            elif isinstance(parsed, list):
                box_coords = parsed
                print(f"📦 Boxes detected: {len(box_coords)}")
            else:
                raise ValueError("Unsupported coords structure")
            # Process each box
            for idx, box in enumerate(box_coords):
                if isinstance(box, (list, tuple)) and len(box) >= 4:
                    x1 = int(float(box[0]) / 999 * image_width)
                    y1 = int(float(box[1]) / 999 * image_height)
                    x2 = int(float(box[2]) / 999 * image_width)
                    y2 = int(float(box[3]) / 999 * image_height)
                    print(f"  Box {idx+1}: {box} → [{x1}, {y1}, {x2}, {y2}]")
                    boxes.append({"label": label, "box": [x1, y1, x2, y2]})
                else:
                    print(f"  ⚠️ Skipping invalid box: {box}")
        except Exception as e:
            print(f"❌ Parsing failed: {e}")
            continue
    print(f"🎯 Total boxes parsed: {len(boxes)}")
    return boxes
 # -----------------------------
 # Routes
 # -----------------------------
@@ -255,11 +95,38 @@ async def root():
@app.get("/health")
 async def health():
-    return {"status": "healthy", "model_loaded": model is not None}
+    return {"status": "healthy", "models": registry.list_models() if registry else []}
@app.get("/api/models")
 async def list_models():
    """List the OCR models available for selection in the UI."""
    if registry is None:
        raise HTTPException(status_code=503, detail="Model registry not ready.")
    return JSONResponse({"models": registry.list_models()})
 def _resolve_provider(model_id: Optional[str], mode: str):
    """Look up the provider and reject capability mismatches (e.g. grounding)."""
    if registry is None:
        raise HTTPException(status_code=503, detail="Model registry not ready.")
    try:
        provider = registry.get(model_id)
    except ProviderError as exc:
        raise HTTPException(status_code=400, detail=str(exc))
    if mode in GROUNDING_MODES and not provider.capabilities.get("grounding"):
        raise HTTPException(
            status_code=400,
            detail=f"Model '{provider.label}' does not support grounding modes (e.g. {mode}).",
        )
    return provider
@app.post("/api/ocr")
 async def ocr_inference(
    image: UploadFile = File(...),
    model: Optional[str] = Form(None),
    mode: str = Form("plain_ocr"),
    prompt: str = Form(""),
    grounding: bool = Form(False),
@@ -275,32 +142,18 @@ async def ocr_inference(
    Perform OCR inference on uploaded image
    - **image**: Image file to process
    - **model**: OCR model id (see GET /api/models); defaults to the registry default
    - **mode**: OCR mode (plain_ocr, markdown, tables_csv, etc.)
    - **prompt**: Custom prompt for freeform mode
-    - **grounding**: Enable grounding boxes
+    - **grounding**: Enable grounding boxes (DeepSeek only)
    - **include_caption**: Add image description
    - **find_term**: Term to find (for find_ref mode)
    - **schema**: JSON schema (for kv_json mode)
-    - **base_size**: Base processing size
+    - **base_size/image_size/crop_mode/test_compress**: DeepSeek processing options
    - **image_size**: Image size parameter
    - **crop_mode**: Enable crop mode
    - **test_compress**: Test compression
    """
-    if model is None or tokenizer is None:
+    provider = _resolve_provider(model, mode)
        raise HTTPException(status_code=503, detail="Model not loaded yet")
    # Build prompt
    prompt_text = build_prompt(
        mode=mode,
        user_prompt=prompt,
        grounding=grounding,
        find_term=find_term,
        schema=schema,
        include_caption=include_caption,
    )
    tmp_img = None
    out_dir = None
    try:
        # Save uploaded file
        with tempfile.NamedTemporaryFile(delete=False, suffix=".png") as tmp:
@@ -315,42 +168,27 @@ async def ocr_inference(
        except Exception:
            orig_w = orig_h = None
-        out_dir = tempfile.mkdtemp(prefix="dsocr_")
+        # Run inference through the selected provider
-        
+        text = provider.run(
-        # Run inference
+            tmp_img,
-        res = model.infer(
+            mode=mode,
-            tokenizer,
+            prompt=prompt,
-            prompt=prompt_text,
+            grounding=grounding,
-            image_file=tmp_img,
+            find_term=find_term,
-            output_path=out_dir,
+            schema=schema,
-            base_size=base_size,
+            include_caption=include_caption,
-            image_size=image_size,
+            options={
-            crop_mode=crop_mode,
+                "base_size": base_size,
-            save_results=False,
+                "image_size": image_size,
-            test_compress=test_compress,
+                "crop_mode": crop_mode,
-            eval_mode=True,
+                "test_compress": test_compress,
            },
        )
        # Normalize response
        if isinstance(res, str):
            text = res.strip()
        elif isinstance(res, dict) and "text" in res:
            text = str(res["text"]).strip()
        elif isinstance(res, (list, tuple)):
            text = "\n".join(map(str, res)).strip()
        else:
            text = ""
        # Fallback: check output file
        if not text:
            mmd = os.path.join(out_dir, "result.mmd")
            if os.path.exists(mmd):
                with open(mmd, "r", encoding="utf-8") as fh:
                    text = fh.read().strip()
        if not text:
            text = "No text returned by model."
-        # Parse grounding boxes with proper coordinate scaling
+        # Parse grounding boxes (no-op for providers/text without grounding tokens)
        boxes = parse_detections(text, orig_w or 1, orig_h or 1) if ("<|det|>" in text or "<|ref|>" in text) else []
        # Clean grounding tags from display text, but keep the labels
@@ -367,14 +205,21 @@ async def ocr_inference(
            "boxes": boxes,
            "image_dims": {"w": orig_w, "h": orig_h},
            "metadata": {
                "model": provider.id,
                "model_label": provider.label,
                "mode": mode,
-                "grounding": grounding or (mode in {"find_ref","layout_map","pii_redact"}),
+                "grounding": grounding or (mode in GROUNDING_MODES),
                "base_size": base_size,
                "image_size": image_size,
                "crop_mode": crop_mode
            }
        })
    except ProviderError as e:
        print(f"OCR provider error: {e}")
        raise HTTPException(status_code=502, detail=str(e))
    except HTTPException:
        raise
    except Exception as e:
        print(f"OCR inference error: {type(e).__name__}: {str(e)}")
        raise HTTPException(status_code=500, detail="An internal error occurred during OCR processing.")
@@ -385,12 +230,11 @@ async def ocr_inference(
                os.remove(tmp_img)
            except Exception:
                pass
        if out_dir:
            shutil.rmtree(out_dir, ignore_errors=True)
@app.post("/api/process-pdf")
 async def process_pdf(
    pdf_file: UploadFile = File(...),
    model: Optional[str] = Form(None),
    mode: str = Form("plain_ocr"),
    prompt: str = Form(""),
    output_format: str = Form("markdown"),  # markdown, html, docx, json
@@ -417,8 +261,7 @@ async def process_pdf(
    - **image_size**: Image size parameter
    - **crop_mode**: Enable crop mode
    """
-    if model is None or tokenizer is None:
+    provider = _resolve_provider(model, mode)
        raise HTTPException(status_code=503, detail="Model not loaded yet")
    # Validate output format
    if output_format not in ["markdown", "html", "docx", "json"]:
@@ -441,56 +284,32 @@ async def process_pdf(
        for page_idx, img in enumerate(images):
            print(f"🔍 Processing page {page_idx + 1}/{total_pages}...")
            # Build prompt for this page
            prompt_text = build_prompt(
                mode=mode,
                user_prompt=prompt,
                grounding=grounding,
                find_term=None,
                schema=None,
                include_caption=include_caption,
            )
            # Save image temporarily
            tmp_img = None
            out_dir = None
            try:
                with tempfile.NamedTemporaryFile(delete=False, suffix=".png") as tmp:
                    img.save(tmp, format="PNG")
                    tmp_img = tmp.name
                orig_w, orig_h = img.size
                out_dir = tempfile.mkdtemp(prefix="dsocr_pdf_")
-                # Run inference
+                # Run inference through the selected provider
-                res = model.infer(
+                text = provider.run(
-                    tokenizer,
+                    tmp_img,
-                    prompt=prompt_text,
+                    mode=mode,
-                    image_file=tmp_img,
+                    prompt=prompt,
-                    output_path=out_dir,
+                    grounding=grounding,
-                    base_size=base_size,
+                    find_term=None,
-                    image_size=image_size,
+                    schema=None,
-                    crop_mode=crop_mode,
+                    include_caption=include_caption,
-                    save_results=False,
+                    options={
-                    test_compress=False,
+                        "base_size": base_size,
-                    eval_mode=True,
+                        "image_size": image_size,
                        "crop_mode": crop_mode,
                        "test_compress": False,
                    },
                )
                # Normalize response
                if isinstance(res, str):
                    text = res.strip()
                elif isinstance(res, dict) and "text" in res:
                    text = str(res["text"]).strip()
                elif isinstance(res, (list, tuple)):
                    text = "\n".join(map(str, res)).strip()
                else:
                    text = ""
                if not text:
                    mmd = os.path.join(out_dir, "result.mmd")
                    if os.path.exists(mmd):
                        with open(mmd, "r", encoding="utf-8") as fh:
                            text = fh.read().strip()
                if not text:
                    text = f"No text returned for page {page_idx + 1}."
@@ -535,8 +354,6 @@ async def process_pdf(
                        os.remove(tmp_img)
                    except Exception:
                        pass
                if out_dir:
                    shutil.rmtree(out_dir, ignore_errors=True)
        print(f"✅ Processed all {total_pages} pages")
@@ -547,6 +364,8 @@ async def process_pdf(
                "total_pages": total_pages,
                "pages": pages_content,
                "metadata": {
                    "model": provider.id,
                    "model_label": provider.label,
                    "mode": mode,
                    "grounding": grounding,
                    "extract_images": extract_images,
@@ -575,12 +394,468 @@ async def process_pdf(
                headers={"Content-Disposition": f"attachment; filename=ocr_result.docx"}
            )
    except ProviderError as e:
        print(f"PDF provider error: {e}")
        raise HTTPException(status_code=502, detail=str(e))
    except Exception as e:
        import traceback
        print(f"Error processing PDF: {e}")
        print(traceback.format_exc())
        raise HTTPException(status_code=500, detail="An internal error occurred during PDF processing.")
 # -----------------------------
 # Job management routes
 # -----------------------------
 class ReviewRequest(BaseModel):
    reviewed_text: str
    reviewer_name: str
    author: Optional[str] = None
    book: Optional[str] = None
    chapter: Optional[str] = None
    page: Optional[str] = None
    describe_text: Optional[str] = None
    freeform_text: Optional[str] = None
 def _job_row_to_dict(row) -> Dict[str, Any]:
    """Convert a DB row (RealDictRow) to a plain dict with serialisable values."""
    d = dict(row)
    for key, val in d.items():
        if isinstance(val, datetime):
            d[key] = val.isoformat()
        elif val is not None and hasattr(val, '__str__') and type(val).__name__ == 'UUID':
            d[key] = str(val)
    return d
@app.post("/api/jobs")
 async def commit_job(
    image: UploadFile = File(...),
    author: str = Form(""),
    book: str = Form(""),
    chapter: str = Form(""),
    page: str = Form(""),
    ocr_text: str = Form(""),
    describe_text: str = Form(""),
    freeform_text: str = Form(""),
    mode: str = Form("plain_ocr"),
    ocr_model: str = Form(""),
 ):
    """Commit an OCR job: save the image and insert a DB record."""
    job_id = str(uuid.uuid4())
    # Determine file extension from original filename or content type
    original_filename = image.filename or "image"
    ext = os.path.splitext(original_filename)[1].lower()
    if not ext:
        ct = (image.content_type or "").lower()
        ext_map = {
            "image/png": ".png", "image/jpeg": ".jpg", "image/jpg": ".jpg",
            "image/webp": ".webp", "image/gif": ".gif", "image/bmp": ".bmp",
        }
        ext = ext_map.get(ct, ".png")
    image_path = os.path.join(OCR_IMAGES_DIR, f"{job_id}{ext}")
    try:
        content = await image.read()
        with open(image_path, "wb") as f:
            f.write(content)
    except Exception as exc:
        raise HTTPException(status_code=500, detail="Failed to save image file.")
    try:
        with get_db() as conn:
            with conn.cursor() as cur:
                cur.execute(
                    """
                    INSERT INTO ocr_jobs
                        (id, author, book, chapter, page, image_path, original_filename,
                         ocr_text, describe_text, freeform_text, mode, ocr_model, status)
                    VALUES (%s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, 'unreviewed')
                    RETURNING *
                    """,
                    (job_id, author or None, book or None, chapter or None,
                     page or None, image_path, original_filename,
                     ocr_text or None, describe_text or None, freeform_text or None,
                     mode, ocr_model or None),
                )
                row = cur.fetchone()
    except Exception as exc:
        # Clean up saved image if DB insert fails
        try:
            os.remove(image_path)
        except Exception:
            pass
        # Unique constraint violation (author + chapter + page already exists)
        if getattr(exc, 'pgcode', None) == '23505':
            raise HTTPException(
                status_code=409,
                detail="A job with this Author, Chapter, and Page already exists."
            )
        print(f"Job commit DB error: {exc}")
        raise HTTPException(status_code=500, detail="Failed to save job to database.")
    return JSONResponse(_job_row_to_dict(row), status_code=201)
@app.get("/api/jobs")
 async def list_jobs(
    search: Optional[str] = Query(None, description="General text search across all fields"),
    author: Optional[str] = Query(None),
    book: Optional[str] = Query(None),
    chapter: Optional[str] = Query(None),
    status: Optional[str] = Query(None, description="unreviewed | reviewed"),
    limit: int = Query(20, ge=1, le=200),
    offset: int = Query(0, ge=0),
 ):
    """Search and list jobs. All filters are optional and combinable."""
    conditions = []
    params: List[Any] = []
    if search:
        conditions.append(
            "(author ILIKE %s OR book ILIKE %s OR chapter ILIKE %s "
            "OR page ILIKE %s OR ocr_text ILIKE %s OR reviewer_name ILIKE %s)"
        )
        like = f"%{search}%"
        params.extend([like, like, like, like, like, like])
    if author:
        conditions.append("author ILIKE %s")
        params.append(f"%{author}%")
    if book:
        conditions.append("book ILIKE %s")
        params.append(f"%{book}%")
    if chapter:
        conditions.append("chapter ILIKE %s")
        params.append(f"%{chapter}%")
    if status:
        conditions.append("status = %s")
        params.append(status)
    where = ("WHERE " + " AND ".join(conditions)) if conditions else ""
    try:
        with get_db() as conn:
            with conn.cursor() as cur:
                cur.execute(
                    f"SELECT COUNT(*) AS total FROM ocr_jobs {where}",
                    params,
                )
                total = cur.fetchone()["total"]
                cur.execute(
                    f"""
                    SELECT id, author, book, chapter, page, submitted_at, status,
                           reviewer_name, reviewed_at, mode, ocr_model, original_filename
                    FROM ocr_jobs {where}
                    ORDER BY submitted_at DESC
                    LIMIT %s OFFSET %s
                    """,
                    params + [limit, offset],
                )
                rows = [_job_row_to_dict(r) for r in cur.fetchall()]
    except Exception as exc:
        print(f"list_jobs DB error: {exc}")
        raise HTTPException(status_code=500, detail="Database error.")
    return JSONResponse({"total": total, "limit": limit, "offset": offset, "jobs": rows})
@app.get("/api/jobs/suggestions")
 async def job_suggestions():
    """Return distinct values for author, book, chapter, and reviewer_name to power autocomplete."""
    try:
        with get_db() as conn:
            with conn.cursor() as cur:
                cur.execute("""
                    SELECT
                        array_remove(array_agg(DISTINCT author ORDER BY author), NULL) AS authors,
                        array_remove(array_agg(DISTINCT book ORDER BY book), NULL) AS books,
                        array_remove(array_agg(DISTINCT chapter ORDER BY chapter), NULL) AS chapters,
                        array_remove(array_agg(DISTINCT reviewer_name ORDER BY reviewer_name), NULL) AS reviewers
                    FROM ocr_jobs
                """)
                row = cur.fetchone()
    except Exception as exc:
        print(f"suggestions DB error: {exc}")
        raise HTTPException(status_code=500, detail="Database error.")
    return JSONResponse({
        "authors": row["authors"] or [],
        "books": row["books"] or [],
        "chapters": row["chapters"] or [],
        "reviewers": row["reviewers"] or [],
    })
@app.get("/api/jobs/{job_id}")
 async def get_job(job_id: str):
    """Retrieve full job record including OCR text."""
    try:
        uuid.UUID(job_id)
    except ValueError:
        raise HTTPException(status_code=400, detail="Invalid job ID.")
    try:
        with get_db() as conn:
            with conn.cursor() as cur:
                cur.execute("SELECT * FROM ocr_jobs WHERE id = %s", (job_id,))
                row = cur.fetchone()
    except Exception as exc:
        print(f"get_job DB error: {exc}")
        raise HTTPException(status_code=500, detail="Database error.")
    if not row:
        raise HTTPException(status_code=404, detail="Job not found.")
    return JSONResponse(_job_row_to_dict(row))
@app.get("/api/jobs/{job_id}/image")
 async def get_job_image(job_id: str):
    """Serve the stored image for a job."""
    try:
        uuid.UUID(job_id)
    except ValueError:
        raise HTTPException(status_code=400, detail="Invalid job ID.")
    try:
        with get_db() as conn:
            with conn.cursor() as cur:
                cur.execute("SELECT image_path FROM ocr_jobs WHERE id = %s", (job_id,))
                row = cur.fetchone()
    except Exception as exc:
        print(f"get_job_image DB error: {exc}")
        raise HTTPException(status_code=500, detail="Database error.")
    if not row:
        raise HTTPException(status_code=404, detail="Job not found.")
    path = row["image_path"]
    if not os.path.isfile(path):
        raise HTTPException(status_code=404, detail="Image file not found on disk.")
    return FileResponse(path)
@app.put("/api/jobs/{job_id}/review")
 async def review_job(job_id: str, body: ReviewRequest):
    """Mark a job as reviewed with the corrected text and reviewer name."""
    try:
        uuid.UUID(job_id)
    except ValueError:
        raise HTTPException(status_code=400, detail="Invalid job ID.")
    try:
        with get_db() as conn:
            with conn.cursor() as cur:
                cur.execute(
                    """
                    UPDATE ocr_jobs
                    SET status = 'reviewed',
                        reviewed_text = %s,
                        reviewer_name = %s,
                        reviewed_at = NOW(),
                        author = %s,
                        book = %s,
                        chapter = %s,
                        page = %s,
                        describe_text = %s,
                        freeform_text = %s
                    WHERE id = %s
                    RETURNING *
                    """,
                    (
                        body.reviewed_text,
                        body.reviewer_name,
                        body.author or None,
                        body.book or None,
                        body.chapter or None,
                        body.page or None,
                        body.describe_text,
                        body.freeform_text,
                        job_id,
                    ),
                )
                row = cur.fetchone()
    except Exception as exc:
        print(f"review_job DB error: {exc}")
        raise HTTPException(status_code=500, detail="Database error.")
    if not row:
        raise HTTPException(status_code=404, detail="Job not found.")
    return JSONResponse(_job_row_to_dict(row))
 class StatusRequest(BaseModel):
    status: str
    reviewer_name: Optional[str] = None
@app.put("/api/jobs/{job_id}/status")
 async def set_job_status(job_id: str, body: StatusRequest):
    """Toggle a job's reviewed status without touching its text or metadata.
    Marking 'reviewed' requires a reviewer_name and stamps reviewed_at.
    Marking 'unreviewed' clears reviewed_at while preserving reviewed_text.
    """
    try:
        uuid.UUID(job_id)
    except ValueError:
        raise HTTPException(status_code=400, detail="Invalid job ID.")
    if body.status not in ("reviewed", "unreviewed"):
        raise HTTPException(status_code=400, detail="status must be 'reviewed' or 'unreviewed'.")
    if body.status == "reviewed" and not (body.reviewer_name or "").strip():
        raise HTTPException(status_code=400, detail="Reviewer name is required to mark reviewed.")
    try:
        with get_db() as conn:
            with conn.cursor() as cur:
                if body.status == "reviewed":
                    cur.execute(
                        """
                        UPDATE ocr_jobs
                        SET status = 'reviewed',
                            reviewer_name = %s,
                            reviewed_at = NOW()
                        WHERE id = %s
                        RETURNING *
                        """,
                        (body.reviewer_name.strip(), job_id),
                    )
                else:
                    cur.execute(
                        """
                        UPDATE ocr_jobs
                        SET status = 'unreviewed',
                            reviewed_at = NULL
                        WHERE id = %s
                        RETURNING *
                        """,
                        (job_id,),
                    )
                row = cur.fetchone()
    except Exception as exc:
        print(f"set_job_status DB error: {exc}")
        raise HTTPException(status_code=500, detail="Database error.")
    if not row:
        raise HTTPException(status_code=404, detail="Job not found.")
    return JSONResponse(_job_row_to_dict(row))
 class JobDescribeRequest(BaseModel):
    model: Optional[str] = None
@app.post("/api/jobs/{job_id}/describe")
 async def describe_job(job_id: str, body: JobDescribeRequest):
    """Run Describe mode on a job's stored image and save the result to describe_text."""
    try:
        uuid.UUID(job_id)
    except ValueError:
        raise HTTPException(status_code=400, detail="Invalid job ID.")
    # Look up the stored image for this job
    try:
        with get_db() as conn:
            with conn.cursor() as cur:
                cur.execute("SELECT image_path FROM ocr_jobs WHERE id = %s", (job_id,))
                row = cur.fetchone()
    except Exception as exc:
        print(f"describe_job lookup DB error: {exc}")
        raise HTTPException(status_code=500, detail="Database error.")
    if not row:
        raise HTTPException(status_code=404, detail="Job not found.")
    image_path = row["image_path"]
    if not image_path or not os.path.isfile(image_path):
        raise HTTPException(status_code=404, detail="Image file not found on disk.")
    provider = _resolve_provider(body.model, "describe")
    try:
        text = provider.run(
            image_path,
            mode="describe",
            prompt="",
            grounding=False,
            find_term=None,
            schema=None,
            include_caption=False,
            options={"base_size": 1024, "image_size": 640, "crop_mode": True, "test_compress": False},
        )
    except ProviderError as e:
        print(f"describe_job provider error: {e}")
        raise HTTPException(status_code=502, detail=str(e))
    except Exception as e:
        print(f"describe_job inference error: {type(e).__name__}: {e}")
        raise HTTPException(status_code=500, detail="An internal error occurred during description.")
    display_text = clean_grounding_text(text) if ("<|ref|>" in text or "<|grounding|>" in text) else text
    # Persist the generated description on the job
    try:
        with get_db() as conn:
            with conn.cursor() as cur:
                cur.execute(
                    "UPDATE ocr_jobs SET describe_text = %s WHERE id = %s RETURNING *",
                    (display_text, job_id),
                )
                updated = cur.fetchone()
    except Exception as exc:
        print(f"describe_job save DB error: {exc}")
        raise HTTPException(status_code=500, detail="Database error.")
    if not updated:
        raise HTTPException(status_code=404, detail="Job not found.")
    return JSONResponse(_job_row_to_dict(updated))
@app.delete("/api/jobs/{job_id}")
 async def delete_job(job_id: str):
    """Delete a job record and its stored image."""
    try:
        uuid.UUID(job_id)
    except ValueError:
        raise HTTPException(status_code=400, detail="Invalid job ID.")
    try:
        with get_db() as conn:
            with conn.cursor() as cur:
                cur.execute(
                    "DELETE FROM ocr_jobs WHERE id = %s RETURNING image_path",
                    (job_id,),
                )
                row = cur.fetchone()
    except Exception as exc:
        print(f"delete_job DB error: {exc}")
        raise HTTPException(status_code=500, detail="Database error.")
    if not row:
        raise HTTPException(status_code=404, detail="Job not found.")
    # Best-effort removal of the stored image file
    try:
        if row["image_path"] and os.path.isfile(row["image_path"]):
            os.remove(row["image_path"])
    except Exception:
        pass
    return JSONResponse({"deleted": job_id})
 if __name__ == "__main__":
    host = env_config("API_HOST", default="0.0.0.0")
    port = env_config("API_PORT", default=8000, cast=int)
--- a/backend/providers.py
+++ b/backend/providers.py
@@ -0,0 +1,489 @@
 """
 OCR provider abstraction.
 Each provider knows how to turn an image + a semantic OCR request (mode, prompt,
 options) into raw model text. DeepSeek-specific prompt tokens and grounding-box
 parsing live here too so the FastAPI routes stay model-agnostic.
 Two providers ship today:
  - DeepSeekLocalProvider  -> the local HF transformers DeepSeek-OCR model (GPU)
  - OllamaProvider         -> any vision model served by an external Ollama host
 The registry is built from environment variables at startup (see build_registry()).
 """
 import os
 import re
 import base64
 import tempfile
 import shutil
 from abc import ABC, abstractmethod
 from typing import List, Dict, Any, Optional
 from decouple import config as env_config
 # httpx is only needed when an Ollama model is actually used; import lazily so the
 # backend can run DeepSeek-only without the dependency installed.
 try:
    import httpx
 except Exception:  # pragma: no cover - exercised only when httpx is missing
    httpx = None
 # =============================================================================
 # Prompt builders
 # =============================================================================
 def build_prompt(
    mode: str,
    user_prompt: str,
    grounding: bool,
    find_term: Optional[str],
    schema: Optional[str],
    include_caption: bool,
 ) -> str:
    """Build the DeepSeek-OCR prompt (with its special tokens) based on mode."""
    parts: List[str] = ["<image>"]
    mode_requires_grounding = mode in {"find_ref", "layout_map", "pii_redact"}
    if grounding or mode_requires_grounding:
        parts.append("<|grounding|>")
    parts.append(_instruction_for_mode(mode, user_prompt, find_term, schema, include_caption))
    return "\n".join(parts)
 def build_ollama_prompt(
    mode: str,
    user_prompt: str,
    find_term: Optional[str],
    schema: Optional[str],
    include_caption: bool,
 ) -> str:
    """Build a plain natural-language prompt for a generic vision model.
    No DeepSeek grounding tokens — Ollama vision models receive the image
    separately and respond in plain text.
    """
    if mode == "plain_ocr":
        instruction = (
            "Transcribe all of the text in this image exactly as it appears, "
            "preserving line breaks and reading order. Output only the transcribed "
            "text with no commentary."
        )
    elif mode == "markdown":
        instruction = (
            "Convert this document image to clean GitHub-flavored Markdown, "
            "preserving headings, lists, and tables. Output only the Markdown."
        )
    elif mode == "tables_csv":
        instruction = (
            "Extract every table in this image and output CSV only. Use commas with "
            "minimal quoting. If there are multiple tables, separate them with a line "
            "containing '---'. Output only the CSV."
        )
    elif mode == "tables_md":
        instruction = (
            "Extract every table in this image as GitHub-flavored Markdown tables. "
            "Output only the tables."
        )
    elif mode == "kv_json":
        schema_text = schema.strip() if schema else "{}"
        instruction = (
            "Extract the key fields from this image and return strict JSON only "
            f"(no prose). Use this schema, filling in the values: {schema_text}"
        )
    elif mode == "figure_chart":
        instruction = (
            "Parse the figure in this image. First extract any numeric series as a "
            "two-column table (x,y). Then add a line containing '---' followed by a "
            "two-sentence summary of the chart."
        )
    elif mode == "find_ref":
        key = (find_term or "").strip() or "Total"
        instruction = (
            f"Find every occurrence of '{key}' in this image and quote the surrounding "
            "text for each match. If it does not appear, say so."
        )
    elif mode == "layout_map":
        instruction = (
            'Identify the layout blocks in this image and return a JSON array of '
            'objects {"type": one of ["title","paragraph","table","figure"]}. '
            "Do not include the text content."
        )
    elif mode == "pii_redact":
        instruction = (
            "Find all emails, phone numbers, postal addresses, and IBANs in this image. "
            'Return a JSON array of objects {"label", "text"}.'
        )
    elif mode == "multilingual":
        instruction = (
            "Transcribe all of the text in this image exactly, detecting the language "
            "automatically and preserving the original script. Output only the text."
        )
    elif mode == "describe":
        instruction = "Describe this image, focusing on the key visible elements."
    elif mode == "freeform":
        instruction = user_prompt.strip() if user_prompt else "Transcribe the text in this image."
    else:
        instruction = "Transcribe the text in this image."
    if include_caption and mode != "describe":
        instruction += "\nThen add a one-paragraph description of the image."
    return instruction
 def _instruction_for_mode(
    mode: str,
    user_prompt: str,
    find_term: Optional[str],
    schema: Optional[str],
    include_caption: bool,
 ) -> str:
    """The DeepSeek instruction text (without the <image>/<|grounding|> prefix tokens)."""
    if mode == "plain_ocr":
        instruction = "Free OCR."
    elif mode == "markdown":
        instruction = "Convert the document to markdown."
    elif mode == "tables_csv":
        instruction = (
            "Extract every table and output CSV only. "
            "Use commas, minimal quoting. If multiple tables, separate with a line containing '---'."
        )
    elif mode == "tables_md":
        instruction = "Extract every table as GitHub-flavored Markdown tables. Output only the tables."
    elif mode == "kv_json":
        schema_text = schema.strip() if schema else "{}"
        instruction = (
            "Extract key fields and return strict JSON only. "
            f"Use this schema (fill the values): {schema_text}"
        )
    elif mode == "figure_chart":
        instruction = (
            "Parse the figure. First extract any numeric series as a two-column table (x,y). "
            "Then summarize the chart in 2 sentences. Output the table, then a line '---', then the summary."
        )
    elif mode == "find_ref":
        key = (find_term or "").strip() or "Total"
        instruction = f"Locate <|ref|>{key}<|/ref|> in the image."
    elif mode == "layout_map":
        instruction = (
            'Return a JSON array of blocks with fields {"type":["title","paragraph","table","figure"],'
            '"box":[x1,y1,x2,y2]}. Do not include any text content.'
        )
    elif mode == "pii_redact":
        instruction = (
            'Find all occurrences of emails, phone numbers, postal addresses, and IBANs. '
            'Return a JSON array of objects {label, text, box:[x1,y1,x2,y2]}.'
        )
    elif mode == "multilingual":
        instruction = "Free OCR. Detect the language automatically and output in the same script."
    elif mode == "describe":
        instruction = "Describe this image. Focus on visible key elements."
    elif mode == "freeform":
        instruction = user_prompt.strip() if user_prompt else "OCR this image."
    else:
        instruction = "OCR this image."
    if include_caption and mode != "describe":
        instruction = instruction + "\nThen add a one-paragraph description of the image."
    return instruction
 # =============================================================================
 # Grounding parser (DeepSeek-specific; no-op on plain text)
 # =============================================================================
 DET_BLOCK = re.compile(
    r"<\|ref\|>(?P<label>.*?)<\|/ref\|>\s*<\|det\|>\s*(?P<coords>\[.*\])\s*<\|/det\|>",
    re.DOTALL,
 )
 def clean_grounding_text(text: str) -> str:
    """Remove grounding tags from text for display, keeping labels."""
    cleaned = re.sub(
        r"<\|ref\|>(.*?)<\|/ref\|>\s*<\|det\|>\s*\[.*\]\s*<\|/det\|>",
        r"\1",
        text,
        flags=re.DOTALL,
    )
    cleaned = re.sub(r"<\|grounding\|>", "", cleaned)
    return cleaned.strip()
 def parse_detections(text: str, image_width: int, image_height: int) -> List[Dict[str, Any]]:
    """Parse grounding boxes from text and scale 0-999 normalized coords to pixels."""
    boxes: List[Dict[str, Any]] = []
    for m in DET_BLOCK.finditer(text or ""):
        label = m.group("label").strip()
        coords_str = m.group("coords").strip()
        try:
            import ast
            parsed = ast.literal_eval(coords_str)
            if (
                isinstance(parsed, list)
                and len(parsed) == 4
                and all(isinstance(n, (int, float)) for n in parsed)
            ):
                box_coords = [parsed]
            elif isinstance(parsed, list):
                box_coords = parsed
            else:
                raise ValueError("Unsupported coords structure")
            for box in box_coords:
                if isinstance(box, (list, tuple)) and len(box) >= 4:
                    x1 = int(float(box[0]) / 999 * image_width)
                    y1 = int(float(box[1]) / 999 * image_height)
                    x2 = int(float(box[2]) / 999 * image_width)
                    y2 = int(float(box[3]) / 999 * image_height)
                    boxes.append({"label": label, "box": [x1, y1, x2, y2]})
        except Exception as e:
            print(f"❌ Grounding parse failed: {e}")
            continue
    return boxes
 # =============================================================================
 # Providers
 # =============================================================================
 GROUNDING_MODES = {"find_ref", "layout_map", "pii_redact"}
 class ProviderError(Exception):
    """Raised when a provider cannot fulfil a request (e.g. backend unreachable)."""
 class OCRProvider(ABC):
    """Turns an image + OCR request into raw model text."""
    id: str
    label: str
    capabilities: Dict[str, Any]
    @abstractmethod
    def run(
        self,
        image_path: str,
        *,
        mode: str,
        prompt: str,
        grounding: bool,
        find_term: Optional[str],
        schema: Optional[str],
        include_caption: bool,
        options: Dict[str, Any],
    ) -> str:
        """Return the raw text output of the model for this image/request."""
    def info(self) -> Dict[str, Any]:
        return {"id": self.id, "label": self.label, "capabilities": self.capabilities}
 class DeepSeekLocalProvider(OCRProvider):
    """Local HF transformers DeepSeek-OCR model. Loaded lazily on first use."""
    def __init__(self):
        self.id = "deepseek-local"
        self.label = "DeepSeek-OCR (local GPU)"
        self.capabilities = {"grounding": True, "advanced_settings": True}
        self._model = None
        self._tokenizer = None
    @property
    def loaded(self) -> bool:
        return self._model is not None and self._tokenizer is not None
    def _ensure_loaded(self):
        if self.loaded:
            return
        # Heavy imports kept local so an Ollama-only deployment never needs torch.
        import torch
        from transformers import AutoModel, AutoTokenizer
        os.environ.pop("TRANSFORMERS_CACHE", None)
        model_name = env_config("MODEL_NAME", default="deepseek-ai/DeepSeek-OCR")
        hf_home = env_config("HF_HOME", default="/models")
        os.makedirs(hf_home, exist_ok=True)
        print(f"🚀 Loading {model_name}...")
        tokenizer = AutoTokenizer.from_pretrained(model_name, trust_remote_code=True)
        model = AutoModel.from_pretrained(
            model_name,
            trust_remote_code=True,
            use_safetensors=True,
            attn_implementation="eager",
            torch_dtype=torch.bfloat16,
        ).eval().to("cuda")
        try:
            if getattr(tokenizer, "pad_token_id", None) is None and getattr(tokenizer, "eos_token_id", None) is not None:
                tokenizer.pad_token = tokenizer.eos_token
            if getattr(model.config, "pad_token_id", None) is None and getattr(tokenizer, "pad_token_id", None) is not None:
                model.config.pad_token_id = tokenizer.pad_token_id
        except Exception:
            pass
        self._model = model
        self._tokenizer = tokenizer
        print("✅ DeepSeek-OCR loaded and ready!")
    def run(self, image_path, *, mode, prompt, grounding, find_term, schema, include_caption, options):
        self._ensure_loaded()
        prompt_text = build_prompt(
            mode=mode,
            user_prompt=prompt,
            grounding=grounding,
            find_term=find_term,
            schema=schema,
            include_caption=include_caption,
        )
        out_dir = tempfile.mkdtemp(prefix="dsocr_")
        try:
            res = self._model.infer(
                self._tokenizer,
                prompt=prompt_text,
                image_file=image_path,
                output_path=out_dir,
                base_size=int(options.get("base_size", 1024)),
                image_size=int(options.get("image_size", 640)),
                crop_mode=bool(options.get("crop_mode", True)),
                save_results=False,
                test_compress=bool(options.get("test_compress", False)),
                eval_mode=True,
            )
            if isinstance(res, str):
                text = res.strip()
            elif isinstance(res, dict) and "text" in res:
                text = str(res["text"]).strip()
            elif isinstance(res, (list, tuple)):
                text = "\n".join(map(str, res)).strip()
            else:
                text = ""
            if not text:
                mmd = os.path.join(out_dir, "result.mmd")
                if os.path.exists(mmd):
                    with open(mmd, "r", encoding="utf-8") as fh:
                        text = fh.read().strip()
            return text
        finally:
            shutil.rmtree(out_dir, ignore_errors=True)
 class OllamaProvider(OCRProvider):
    """A single vision model served by an external Ollama host."""
    def __init__(self, tag: str, base_url: str, label: Optional[str] = None):
        self.tag = tag
        self.base_url = base_url.rstrip("/")
        self.id = f"ollama:{tag}"
        self.label = label or f"{tag} (Ollama)"
        # Generic vision models don't emit DeepSeek grounding tokens.
        self.capabilities = {"grounding": False, "advanced_settings": False}
    def run(self, image_path, *, mode, prompt, grounding, find_term, schema, include_caption, options):
        if httpx is None:
            raise ProviderError("httpx is not installed; cannot reach Ollama.")
        prompt_text = build_ollama_prompt(
            mode=mode,
            user_prompt=prompt,
            find_term=find_term,
            schema=schema,
            include_caption=include_caption,
        )
        with open(image_path, "rb") as f:
            img_b64 = base64.b64encode(f.read()).decode("utf-8")
        payload = {
            "model": self.tag,
            "prompt": prompt_text,
            "images": [img_b64],
            "stream": False,
        }
        timeout = float(env_config("OLLAMA_TIMEOUT", default=300.0, cast=float))
        try:
            resp = httpx.post(f"{self.base_url}/api/generate", json=payload, timeout=timeout)
            resp.raise_for_status()
            data = resp.json()
        except httpx.HTTPStatusError as e:
            detail = ""
            try:
                detail = e.response.json().get("error", "")
            except Exception:
                detail = e.response.text[:200]
            raise ProviderError(f"Ollama returned {e.response.status_code}: {detail}") from e
        except httpx.HTTPError as e:
            raise ProviderError(f"Could not reach Ollama at {self.base_url}: {e}") from e
        return (data.get("response") or "").strip()
 # =============================================================================
 # Registry
 # =============================================================================
 class ModelRegistry:
    def __init__(self, providers: List[OCRProvider], default_id: str):
        self._providers: Dict[str, OCRProvider] = {p.id: p for p in providers}
        # Fall back to the first registered provider if the configured default is gone.
        self.default_id = default_id if default_id in self._providers else (
            next(iter(self._providers), None)
        )
    def get(self, model_id: Optional[str]) -> OCRProvider:
        chosen = model_id or self.default_id
        provider = self._providers.get(chosen)
        if provider is None:
            raise ProviderError(f"Unknown model '{chosen}'.")
        return provider
    def list_models(self) -> List[Dict[str, Any]]:
        out = []
        for p in self._providers.values():
            entry = p.info()
            entry["default"] = (p.id == self.default_id)
            out.append(entry)
        return out
 def build_registry() -> ModelRegistry:
    """Build the provider registry from environment variables.
    Env:
      ENABLE_DEEPSEEK_LOCAL  - register the local DeepSeek-OCR model (default: true)
      OLLAMA_BASE_URL        - Ollama host (default: http://host.docker.internal:11434)
      OLLAMA_MODELS          - comma-separated tags to surface (e.g. "glm-ocr,llama3.2-vision")
      DEFAULT_OCR_MODEL      - id to select by default (default: deepseek-local)
    """
    providers: List[OCRProvider] = []
    enable_deepseek = env_config("ENABLE_DEEPSEEK_LOCAL", default="true").strip().lower() in {"1", "true", "yes"}
    if enable_deepseek:
        providers.append(DeepSeekLocalProvider())
    base_url = env_config("OLLAMA_BASE_URL", default="http://host.docker.internal:11434")
    raw_tags = env_config("OLLAMA_MODELS", default="")
    tags = [t.strip() for t in raw_tags.split(",") if t.strip()]
    for tag in tags:
        providers.append(OllamaProvider(tag=tag, base_url=base_url))
    default_id = env_config("DEFAULT_OCR_MODEL", default="deepseek-local")
    if not providers:
        # Defensive: nothing configured. Register DeepSeek so the app still starts.
        providers.append(DeepSeekLocalProvider())
        default_id = "deepseek-local"
    registry = ModelRegistry(providers, default_id)
    print(f"🧠 OCR models registered: {[p.id for p in providers]} (default: {registry.default_id})")
    return registry
--- a/backend/requirements.txt
+++ b/backend/requirements.txt
@@ -15,3 +15,5 @@ PyMuPDF>=1.23.0
 img2pdf>=0.5.0
 python-docx>=1.1.0
 markdown>=3.5.0
 psycopg2-binary>=2.9.0
 httpx>=0.27.0
--- a/docker-compose.yml
+++ b/docker-compose.yml
@@ -1,4 +1,19 @@
 services:
  postgres:
    image: postgres:16-alpine
    container_name: deepseek-ocr-postgres
    environment:
      POSTGRES_USER: ${POSTGRES_USER:-ocr_user}
      POSTGRES_PASSWORD: ${POSTGRES_PASSWORD:-ocr_password}
      POSTGRES_DB: ${POSTGRES_DB:-ocr_db}
    volumes:
      - postgres_data:/var/lib/postgresql/data
    healthcheck:
      test: ["CMD-SHELL", "pg_isready -U ${POSTGRES_USER:-ocr_user} -d ${POSTGRES_DB:-ocr_db}"]
      interval: 5s
      timeout: 5s
      retries: 10
  backend:
    build: ./backend
    container_name: deepseek-ocr-backend
@@ -10,8 +25,23 @@ services:
      API_HOST: ${API_HOST:-0.0.0.0}
      API_PORT: ${API_PORT:-8000}
      MAX_UPLOAD_SIZE_MB: ${MAX_UPLOAD_SIZE_MB:-100}
      DATABASE_URL: ${DATABASE_URL:-postgresql://ocr_user:ocr_password@postgres:5432/ocr_db}
      OCR_IMAGES_DIR: ${OCR_IMAGES_DIR:-/data/ocr_images}
      ENABLE_DEEPSEEK_LOCAL: ${ENABLE_DEEPSEEK_LOCAL:-true}
      OLLAMA_BASE_URL: ${OLLAMA_BASE_URL:-http://host.docker.internal:11434}
      OLLAMA_MODELS: ${OLLAMA_MODELS:-}
      DEFAULT_OCR_MODEL: ${DEFAULT_OCR_MODEL:-deepseek-local}
      OLLAMA_TIMEOUT: ${OLLAMA_TIMEOUT:-300}
    # Lets the container reach an Ollama server running on the Docker host
    # (works out of the box on Docker Desktop; required for Linux engines).
    extra_hosts:
      - "host.docker.internal:host-gateway"
    volumes:
      - ./models:/models
      - ./ocr_images:/data/ocr_images
    depends_on:
      postgres:
        condition: service_healthy
    deploy:
      resources:
        reservations:
@@ -22,8 +52,6 @@ services:
    shm_size: "4g"
    ports:
      - "${API_PORT:-8000}:${API_PORT:-8000}"
    networks:
      - ocr-network
  frontend:
    build: ./frontend
@@ -32,9 +60,10 @@ services:
      - "${FRONTEND_PORT:-3000}:80"
    depends_on:
      - backend
-    networks:
+
-      - ocr-network
+volumes:
  postgres_data:
 networks:
-  ocr-network:
+  default:
-    driver: bridge
+    name: rw-research
--- a/frontend/src/App.jsx
+++ b/frontend/src/App.jsx
@@ -1,18 +1,35 @@
-import { useState, useCallback } from 'react'
+import { useState, useCallback, useEffect } from 'react'
 import { useSuggestions } from './hooks/useSuggestions'
 import { useModels } from './hooks/useModels'
 import { motion, AnimatePresence } from 'framer-motion'
-import { Sparkles, Zap, Loader2, Settings, Image as ImageIcon, FileText } from 'lucide-react'
+import {
  Sparkles, Zap, Loader2, Settings, Image as ImageIcon, FileText,
  Layers, ChevronLeft, CheckCircle2, Database,
 } from 'lucide-react'
 import ImageUpload from './components/ImageUpload'
 import ModeSelector from './components/ModeSelector'
 import ModelSelector from './components/ModelSelector'
 import ResultPanel from './components/ResultPanel'
 import AdvancedSettings from './components/AdvancedSettings'
 import PDFProcessor from './components/PDFProcessor'
 import MetadataForm from './components/MetadataForm'
 import JobsPanel from './components/JobsPanel'
 import axios from 'axios'
 const API_BASE = import.meta.env.VITE_API_URL || '/api'
 const INPUT_CLASS =
  'w-full bg-white/5 border border-white/10 rounded-lg px-3 py-2 text-sm text-gray-200 ' +
  'placeholder-gray-600 focus:outline-none focus:border-purple-500/50 transition-colors'
 function App() {
  const [view, setView] = useState('new_job')
  // OCR state
  const { models, loading: modelsLoading } = useModels()
  const [model, setModel] = useState(null)
  const [mode, setMode] = useState('plain_ocr')
-  const [fileType, setFileType] = useState('image') // 'image' or 'pdf'
+  const [fileType, setFileType] = useState('image')
  const [image, setImage] = useState(null)
  const [imagePreview, setImagePreview] = useState(null)
  const [result, setResult] = useState(null)
@@ -21,22 +38,39 @@ function App() {
  const [showAdvanced, setShowAdvanced] = useState(false)
  const [includeCaption, setIncludeCaption] = useState(false)
  // Form state
  const [prompt, setPrompt] = useState('')
  const [findTerm, setFindTerm] = useState('')
  const [advancedSettings, setAdvancedSettings] = useState({
-    base_size: 1024,
+    base_size: 1024, image_size: 640, crop_mode: true, test_compress: false,
    image_size: 640,
    crop_mode: true,
    test_compress: false
  })
-  const handleFileTypeChange = useCallback((newType) => {
+  const suggestions = useSuggestions()
-    // Clear current file when switching types
+
-    setImage(null)
+  const [metadata, setMetadata] = useState({ author: '', book: '', chapter: '', page: '' })
-    if (imagePreview) {
+  // Results accumulated per mode: { plain_ocr: 'text', describe: 'text', freeform: 'text' }
-      URL.revokeObjectURL(imagePreview)
+  const [modeResults, setModeResults] = useState({})
  const [editedResults, setEditedResults] = useState({})
  const [activeResultMode, setActiveResultMode] = useState(null)
  const [commitLoading, setCommitLoading] = useState(false)
  const [commitResult, setCommitResult] = useState(null)
  // Modes that produce editable text output and can be committed to the DB
  const COMMITTABLE_MODES = new Set(['plain_ocr', 'describe'])
  const MODE_LABELS = { plain_ocr: 'OCR Text', describe: 'Description' }
  // Pick the default model once the list loads
  useEffect(() => {
    if (!model && models.length > 0) {
      setModel((models.find(m => m.default) || models[0]).id)
    }
  }, [models, model])
  // Show the full-screen result view once at least one committable mode has a result
  const showResultView = view === 'new_job' && Object.keys(modeResults).length > 0
  const handleFileTypeChange = useCallback((newType) => {
    setImage(null)
    if (imagePreview) URL.revokeObjectURL(imagePreview)
    setImagePreview(null)
    setError(null)
    setResult(null)
@@ -45,42 +79,38 @@ function App() {
  const handleImageSelect = useCallback((file) => {
    if (file === null) {
      // Clear everything when removing image
      setImage(null)
-      if (imagePreview && fileType === 'image') {
+      if (imagePreview && fileType === 'image') URL.revokeObjectURL(imagePreview)
        URL.revokeObjectURL(imagePreview)
      }
      setImagePreview(null)
      setError(null)
      setResult(null)
      setModeResults({})
      setEditedResults({})
      setActiveResultMode(null)
      setCommitResult(null)
    } else {
      setImage(file)
-      // Only create preview URL for images, not PDFs
+      setImagePreview(fileType === 'image' ? URL.createObjectURL(file) : file)
      if (fileType === 'image') {
        setImagePreview(URL.createObjectURL(file))
      } else {
        setImagePreview(file) // Just store the file for PDFs
      }
      setError(null)
      setResult(null)
      setModeResults({})
      setEditedResults({})
      setActiveResultMode(null)
      setCommitResult(null)
    }
  }, [imagePreview, fileType])
  const handleSubmit = async () => {
-    if (!image) {
+    if (!image) { setError('Please upload an image first'); return }
      setError('Please upload an image first')
      return
    }
    setLoading(true)
    setError(null)
-
+    setCommitResult(null)
    try {
      const formData = new FormData()
      formData.append('image', image)
      if (model) formData.append('model', model)
      formData.append('mode', mode)
      formData.append('prompt', prompt)
      // Enable grounding only for find mode
      formData.append('grounding', mode === 'find_ref')
      formData.append('include_caption', includeCaption)
      formData.append('find_term', findTerm)
@@ -91,12 +121,16 @@ function App() {
      formData.append('test_compress', advancedSettings.test_compress)
      const response = await axios.post(`${API_BASE}/ocr`, formData, {
-        headers: {
+        headers: { 'Content-Type': 'multipart/form-data' },
          'Content-Type': 'multipart/form-data',
        },
      })
      setResult(response.data)
      if (COMMITTABLE_MODES.has(mode)) {
        const text = response.data.text || ''
        setModeResults(prev => ({ ...prev, [mode]: text }))
        setEditedResults(prev => ({ ...prev, [mode]: text }))
        setActiveResultMode(mode)
      }
      setCommitResult(null)
    } catch (err) {
      setError(err.response?.data?.detail || err.message || 'An error occurred')
    } finally {
@@ -104,31 +138,61 @@ function App() {
    }
  }
-  const handleCopy = useCallback(() => {
+  const handleNewAnalysis = () => {
-    if (result?.text) {
+    setResult(null)
-      navigator.clipboard.writeText(result.text)
+    setModeResults({})
    setEditedResults({})
    setActiveResultMode(null)
    setCommitResult(null)
  }
  const handleCommitJob = useCallback(async () => {
    if (!image) return
    setCommitLoading(true)
    setCommitResult(null)
    try {
      const formData = new FormData()
      formData.append('image', image)
      formData.append('author', metadata.author)
      formData.append('book', metadata.book)
      formData.append('chapter', metadata.chapter)
      formData.append('page', metadata.page)
      formData.append('ocr_text', editedResults.plain_ocr || '')
      formData.append('describe_text', editedResults.describe || '')
      formData.append('freeform_text', editedResults.freeform || '')
      formData.append('mode', mode)
      if (model) formData.append('ocr_model', model)
      const response = await axios.post(`${API_BASE}/jobs`, formData, {
        headers: { 'Content-Type': 'multipart/form-data' },
      })
      setCommitResult({ success: true, job: response.data })
    } catch (err) {
      setCommitResult({ success: false, error: err.response?.data?.detail || err.message })
    } finally {
      setCommitLoading(false)
    }
-  }, [result])
+  }, [image, editedResults, metadata, mode, model])
  const handleCopy = useCallback(() => {
    const text = (activeResultMode && editedResults[activeResultMode]) || result?.text
    if (text) navigator.clipboard.writeText(text)
  }, [activeResultMode, editedResults, result])
  const handleDownload = useCallback(() => {
-    if (!result?.text) return
+    const text = (activeResultMode && editedResults[activeResultMode]) || result?.text
-    
+    if (!text) return
-    const extensions = {
+    const ext = { plain_ocr: 'txt', describe: 'txt', find_ref: 'txt', freeform: 'txt' }[mode] || 'txt'
-      plain_ocr: 'txt',
+    const blob = new Blob([text], { type: 'text/plain' })
      describe: 'txt',
      find_ref: 'txt',
      freeform: 'txt',
    }
    const ext = extensions[mode] || 'txt'
    const blob = new Blob([result.text], { type: 'text/plain' })
    const url = URL.createObjectURL(blob)
    const a = document.createElement('a')
    a.href = url
    a.download = `deepseek-ocr-result.${ext}`
    a.click()
    URL.revokeObjectURL(url)
-  }, [result, mode])
+  }, [activeResultMode, editedResults, result, mode])
  const metaField = (key) => (e) => setMetadata(m => ({ ...m, [key]: e.target.value }))
  return (
    <div className="min-h-screen relative overflow-hidden">
@@ -138,27 +202,13 @@ function App() {
        <div className="absolute inset-0 bg-[url('data:image/svg+xml;base64,PHN2ZyB3aWR0aD0iNjAiIGhlaWdodD0iNjAiIHZpZXdCb3g9IjAgMCA2MCA2MCIgeG1sbnM9Imh0dHA6Ly93d3cudzMub3JnLzIwMDAvc3ZnIj48ZyBmaWxsPSJub25lIiBmaWxsLXJ1bGU9ImV2ZW5vZGQiPjxwYXRoIGQ9Ik0zNiAxOGMzLjMxIDAgNiAyLjY5IDYgNnMtMi42OSA2LTYgNi02LTIuNjktNi02IDIuNjktNiA2LTZ6TTI0IDZjMy4zMSAwIDYgMi42OSA2IDZzLTIuNjkgNi02IDYtNi0yLjY5LTYtNiAyLjY5LTYgNi02ek00OCAzNmMzLjMxIDAgNiAyLjY5IDYgNnMtMi42OSA2LTYgNi02LTIuNjktNi02IDIuNjktNiA2LTZ6IiBzdHJva2U9InJnYmEoMTQ3LCA1MSwgMjM0LCAwLjEpIiBzdHJva2Utd2lkdGg9IjIiLz48L2c+PC9zdmc+')] opacity-30" />
        <motion.div
          className="absolute top-20 left-20 w-96 h-96 bg-purple-500/10 rounded-full blur-3xl"
-          animate={{
+          animate={{ scale: [1, 1.2, 1], opacity: [0.3, 0.5, 0.3] }}
-            scale: [1, 1.2, 1],
+          transition={{ duration: 8, repeat: Infinity, ease: 'easeInOut' }}
            opacity: [0.3, 0.5, 0.3],
          }}
          transition={{
            duration: 8,
            repeat: Infinity,
            ease: "easeInOut"
          }}
        />
        <motion.div
          className="absolute bottom-20 right-20 w-96 h-96 bg-cyan-500/10 rounded-full blur-3xl"
-          animate={{
+          animate={{ scale: [1.2, 1, 1.2], opacity: [0.5, 0.3, 0.5] }}
-            scale: [1.2, 1, 1.2],
+          transition={{ duration: 8, repeat: Infinity, ease: 'easeInOut' }}
            opacity: [0.5, 0.3, 0.5],
          }}
          transition={{
            duration: 8,
            repeat: Infinity,
            ease: "easeInOut"
          }}
        />
      </div>
@@ -166,11 +216,7 @@ function App() {
      <header className="sticky top-0 z-50 glass border-b border-white/10">
        <div className="max-w-7xl mx-auto px-6 py-4">
          <div className="flex items-center justify-between">
-            <motion.div 
+            <motion.div className="flex items-center gap-3" initial={{ opacity: 0, x: -20 }} animate={{ opacity: 1, x: 0 }}>
              className="flex items-center gap-3"
              initial={{ opacity: 0, x: -20 }}
              animate={{ opacity: 1, x: 0 }}
            >
              <div className="relative">
                <div className="absolute inset-0 bg-gradient-to-r from-purple-500 to-cyan-500 rounded-xl blur-lg opacity-75" />
                <div className="relative bg-gradient-to-br from-purple-600 to-cyan-500 p-2 rounded-xl">
@@ -182,173 +228,348 @@ function App() {
                <p className="text-xs text-gray-400">Next-Gen Vision AI</p>
              </div>
            </motion.div>
            <nav className="flex gap-2">
              {showResultView && (
                <motion.button
                  onClick={handleNewAnalysis}
                  className="flex items-center gap-2 px-4 py-2 rounded-xl text-sm font-medium glass text-gray-400 hover:bg-white/5 transition-all"
                  whileHover={{ scale: 1.02 }} whileTap={{ scale: 0.98 }}
                >
                  <ChevronLeft className="w-4 h-4" />
                  New Analysis
                </motion.button>
              )}
              <motion.button
                onClick={() => setView('new_job')}
                className={`flex items-center gap-2 px-4 py-2 rounded-xl text-sm font-medium transition-all ${view === 'new_job' ? 'bg-gradient-to-r from-purple-600 to-cyan-600 text-white' : 'glass text-gray-400 hover:bg-white/5'}`}
                whileHover={{ scale: 1.02 }} whileTap={{ scale: 0.98 }}
              >
                <Zap className="w-4 h-4" />
                New Job
              </motion.button>
              <motion.button
                onClick={() => setView('jobs')}
                className={`flex items-center gap-2 px-4 py-2 rounded-xl text-sm font-medium transition-all ${view === 'jobs' ? 'bg-gradient-to-r from-purple-600 to-cyan-600 text-white' : 'glass text-gray-400 hover:bg-white/5'}`}
                whileHover={{ scale: 1.02 }} whileTap={{ scale: 0.98 }}
              >
                <Layers className="w-4 h-4" />
                Browse Jobs
              </motion.button>
            </nav>
          </div>
        </div>
      </header>
      {/* Main Content */}
-      <main className="max-w-7xl mx-auto px-6 py-8">
+      <main className="max-w-7xl mx-auto px-6 py-6">
-        <div className="grid lg:grid-cols-2 gap-6">
+        <AnimatePresence>
          {/* Left Panel - Upload & Controls */}
          <motion.div
            initial={{ opacity: 0, y: 20 }}
            animate={{ opacity: 1, y: 0 }}
            transition={{ delay: 0.1 }}
            className="space-y-6"
          >
            {/* File Type Toggle */}
            <div className="glass p-4 rounded-2xl">
              <div className="grid grid-cols-2 gap-2">
                <motion.button
                  onClick={() => handleFileTypeChange('image')}
                  className={`p-3 rounded-xl text-sm font-medium transition-all flex items-center justify-center gap-2 ${
                    fileType === 'image'
                      ? 'bg-gradient-to-r from-purple-600 to-cyan-600 text-white'
                      : 'glass text-gray-400 hover:bg-white/5'
                  }`}
                  whileHover={{ scale: 1.02 }}
                  whileTap={{ scale: 0.98 }}
                >
                  <ImageIcon className="w-4 h-4" />
                  Image OCR
                </motion.button>
                <motion.button
                  onClick={() => handleFileTypeChange('pdf')}
                  className={`p-3 rounded-xl text-sm font-medium transition-all flex items-center justify-center gap-2 ${
                    fileType === 'pdf'
                      ? 'bg-gradient-to-r from-purple-600 to-cyan-600 text-white'
                      : 'glass text-gray-400 hover:bg-white/5'
                  }`}
                  whileHover={{ scale: 1.02 }}
                  whileTap={{ scale: 0.98 }}
                >
                  <FileText className="w-4 h-4" />
                  PDF Processing
                </motion.button>
              </div>
            </div>
-            {/* Mode Selector with integrated inputs */}
+          {/* ── Full-screen OCR result view ── */}
-            <ModeSelector
+          {showResultView ? (
-              mode={mode}
+            <motion.div
-              onModeChange={setMode}
+              key="ocr_result"
-              prompt={prompt}
+              initial={{ opacity: 0, y: 20 }}
-              onPromptChange={setPrompt}
+              animate={{ opacity: 1, y: 0 }}
-              findTerm={findTerm}
+              exit={{ opacity: 0, y: -20 }}
-              onFindTermChange={setFindTerm}
+              className="flex flex-col gap-4"
            />
            {/* Image/PDF Upload */}
            <ImageUpload
              onImageSelect={handleImageSelect}
              preview={imagePreview}
              fileType={fileType}
            />
            {/* Advanced Settings Toggle */}
            <motion.button
              onClick={() => setShowAdvanced(!showAdvanced)}
              className="w-full glass px-4 py-3 rounded-2xl flex items-center justify-between hover:bg-white/5 transition-colors"
              whileHover={{ scale: 1.01 }}
              whileTap={{ scale: 0.99 }}
            >
-              <div className="flex items-center gap-2">
+              {/* Run additional modes */}
-                <Settings className="w-4 h-4 text-purple-400" />
+              <div className="glass p-4 rounded-2xl flex-shrink-0">
-                <span className="text-sm font-medium text-gray-300">Advanced Settings</span>
+                <div className="mb-3">
-              </div>
+                  <ModelSelector
-              <motion.div
+                    models={models} value={model} onChange={setModel} loading={modelsLoading}
-                animate={{ rotate: showAdvanced ? 180 : 0 }}
+                  />
-                transition={{ duration: 0.3 }}
+                </div>
-              >
+                <ModeSelector mode={mode} onModeChange={setMode} />
-                <svg className="w-4 h-4 text-gray-400" fill="none" stroke="currentColor" viewBox="0 0 24 24">
+                <div className="flex items-center gap-3 mt-3">
-                  <path strokeLinecap="round" strokeLinejoin="round" strokeWidth={2} d="M19 9l-7 7-7-7" />
+                  <motion.button
-                </svg>
+                    onClick={handleSubmit}
-              </motion.div>
+                    disabled={loading}
-            </motion.button>
+                    className={`flex items-center gap-2 px-5 py-2 rounded-xl font-medium text-sm transition-all ${loading ? 'opacity-50 cursor-not-allowed bg-white/5' : 'bg-gradient-to-r from-purple-600 to-cyan-600'}`}
-
+                    whileHover={!loading ? { scale: 1.02 } : {}}
-            {/* Advanced Settings Panel */}
+                    whileTap={!loading ? { scale: 0.98 } : {}}
            <AnimatePresence>
              {showAdvanced && (
                <AdvancedSettings
                  settings={advancedSettings}
                  onSettingsChange={setAdvancedSettings}
                  includeCaption={includeCaption}
                  onIncludeCaptionChange={setIncludeCaption}
                />
              )}
            </AnimatePresence>
            {/* Action Button / PDF Processor */}
            {fileType === 'pdf' ? (
              <PDFProcessor
                pdfFile={image}
                mode={mode}
                prompt={prompt}
                advancedSettings={advancedSettings}
                includeCaption={includeCaption}
              />
            ) : (
              <>
                <motion.button
                  onClick={handleSubmit}
                  disabled={!image || loading}
                  className={`w-full relative overflow-hidden rounded-2xl p-[2px] ${
                    !image || loading ? 'opacity-50 cursor-not-allowed' : ''
                  }`}
                  whileHover={!loading && image ? { scale: 1.02 } : {}}
                  whileTap={!loading && image ? { scale: 0.98 } : {}}
                >
                  <div className="absolute inset-0 bg-gradient-to-r from-purple-600 via-pink-600 to-cyan-600 animate-gradient" />
                  <div className="relative bg-dark-100 px-8 py-4 rounded-2xl flex items-center justify-center gap-3">
                    {loading ? (
                      <>
                        <Loader2 className="w-5 h-5 animate-spin" />
                        <span className="font-semibold">Processing Magic...</span>
                      </>
                    ) : (
                      <>
                        <Zap className="w-5 h-5" />
                        <span className="font-semibold">Analyze Image</span>
                      </>
                    )}
                  </div>
                </motion.button>
                {error && (
                  <motion.div
                    initial={{ opacity: 0, y: -10 }}
                    animate={{ opacity: 1, y: 0 }}
                    className="glass p-4 rounded-2xl border-red-500/50 bg-red-500/10"
                  >
-                    <p className="text-sm text-red-400">{error}</p>
+                    {loading
-                  </motion.div>
+                      ? <><Loader2 className="w-4 h-4 animate-spin" /> Processing...</>
-                )}
+                      : <><Zap className="w-4 h-4" /> Analyze</>}
-              </>
+                  </motion.button>
-            )}
+                  {error && <p className="text-sm text-red-400">{error}</p>}
-          </motion.div>
+                </div>
              </div>
-          {/* Right Panel - Results */}
+              {/* Image + Text */}
-          <motion.div
+              <div className="grid gap-6" style={{ gridTemplateColumns: '1fr 1fr', height: '130vh' }}>
-            initial={{ opacity: 0, y: 20 }}
+                {imagePreview && typeof imagePreview === 'string' ? (
-            animate={{ opacity: 1, y: 0 }}
+                  <div className="glass rounded-2xl overflow-hidden flex items-center justify-center bg-black/20 h-full">
-            transition={{ delay: 0.2 }}
+                    <img
-          >
+                      src={imagePreview}
-            <ResultPanel 
+                      alt="Source"
-              result={result}
+                      className="w-full h-full object-contain"
-              loading={loading}
+                    />
-              imagePreview={imagePreview}
+                  </div>
-              onCopy={handleCopy}
+                ) : (
-              onDownload={handleDownload}
+                  <div className="glass rounded-2xl flex items-center justify-center h-full">
-            />
+                    <p className="text-gray-500 text-sm">No preview</p>
-          </motion.div>
+                  </div>
-        </div>
+                )}
                <div className="glass rounded-2xl p-4 flex flex-col h-full">
                  {/* Mode tabs — only shown when multiple modes have results */}
                  {Object.keys(modeResults).length > 1 && (
                    <div className="flex gap-1 mb-3 flex-shrink-0">
                      {Object.keys(modeResults).map(m => (
                        <button
                          key={m}
                          onClick={() => setActiveResultMode(m)}
                          className={`px-3 py-1 rounded-lg text-xs font-medium transition-colors ${
                            activeResultMode === m
                              ? 'bg-purple-600 text-white'
                              : 'bg-white/5 text-gray-400 hover:bg-white/10'
                          }`}
                        >
                          {MODE_LABELS[m] || m}
                        </button>
                      ))}
                    </div>
                  )}
                  <p className="text-xs text-gray-400 mb-2 flex-shrink-0">
                    {MODE_LABELS[activeResultMode] || 'Result'}
                    <span className="text-purple-400 ml-1">(edit before committing)</span>
                  </p>
                  {loading && COMMITTABLE_MODES.has(mode) ? (
                    <div className="flex-1 flex items-center justify-center">
                      <Loader2 className="w-8 h-8 animate-spin text-purple-400" />
                    </div>
                  ) : (
                    <textarea
                      value={activeResultMode ? (editedResults[activeResultMode] ?? '') : ''}
                      onChange={e => setEditedResults(prev => ({ ...prev, [activeResultMode]: e.target.value }))}
                      className="flex-1 w-full bg-transparent text-sm text-gray-200 font-mono resize-none focus:outline-none min-h-0"
                      placeholder="Run a mode to see results here..."
                    />
                  )}
                </div>
              </div>
              {/* Metadata row */}
              <div className="glass p-4 rounded-2xl flex-shrink-0">
                <datalist id="rv-authors">
                  {suggestions.authors.map(a => <option key={a} value={a} />)}
                </datalist>
                <datalist id="rv-books">
                  {(suggestions.books || []).map(b => <option key={b} value={b} />)}
                </datalist>
                <datalist id="rv-chapters">
                  {suggestions.chapters.map(c => <option key={c} value={c} />)}
                </datalist>
                <div className="grid grid-cols-4 gap-4">
                  {[
                    { key: 'author',  label: 'Author',  placeholder: 'Author name',  list: 'rv-authors' },
                    { key: 'book',    label: 'Book',    placeholder: 'Book title',    list: 'rv-books' },
                    { key: 'chapter', label: 'Chapter', placeholder: 'Chapter',       list: 'rv-chapters' },
                    { key: 'page',    label: 'Page',    placeholder: 'Page number',   list: undefined },
                  ].map(({ key, label, placeholder, list }) => (
                    <div key={key}>
                      <label className="text-xs text-gray-400 mb-1 block">{label}</label>
                      <input
                        type="text"
                        list={list}
                        value={metadata[key]}
                        onChange={metaField(key)}
                        placeholder={placeholder}
                        className={INPUT_CLASS}
                      />
                    </div>
                  ))}
                </div>
              </div>
              {/* Commit row */}
              <div className="flex items-center gap-4 flex-shrink-0">
                <AnimatePresence>
                  {commitResult?.success && (
                    <motion.div
                      initial={{ opacity: 0, x: -10 }} animate={{ opacity: 1, x: 0 }} exit={{ opacity: 0 }}
                      className="flex-1 glass p-3 rounded-xl bg-green-500/10 border border-green-500/20"
                    >
                      <p className="text-xs text-green-400">
                        Job saved &mdash; ID: <span className="font-mono">{commitResult.job?.id}</span>
                      </p>
                    </motion.div>
                  )}
                  {commitResult && !commitResult.success && (
                    <motion.div
                      initial={{ opacity: 0, x: -10 }} animate={{ opacity: 1, x: 0 }} exit={{ opacity: 0 }}
                      className="flex-1 glass p-3 rounded-xl bg-red-500/10 border border-red-500/20"
                    >
                      <p className="text-xs text-red-400">{commitResult.error}</p>
                    </motion.div>
                  )}
                </AnimatePresence>
                <motion.button
                  onClick={handleCommitJob}
                  disabled={commitLoading || commitResult?.success}
                  className={`flex items-center gap-2 px-6 py-3 rounded-xl font-medium text-sm transition-all flex-shrink-0 ${
                    commitLoading || commitResult?.success
                      ? 'opacity-50 cursor-not-allowed bg-white/5'
                      : 'bg-gradient-to-r from-blue-600 to-indigo-600 hover:from-blue-500 hover:to-indigo-500'
                  }`}
                  whileHover={!commitLoading && !commitResult?.success ? { scale: 1.02 } : {}}
                  whileTap={!commitLoading && !commitResult?.success ? { scale: 0.98 } : {}}
                >
                  {commitLoading ? (
                    <><Loader2 className="w-4 h-4 animate-spin" /> Committing...</>
                  ) : commitResult?.success ? (
                    <><CheckCircle2 className="w-4 h-4" /> Committed</>
                  ) : (
                    <><Database className="w-4 h-4" /> Commit Job</>
                  )}
                </motion.button>
              </div>
            </motion.div>
          ) : view === 'jobs' ? (
            <motion.div
              key="jobs"
              initial={{ opacity: 0, y: 20 }}
              animate={{ opacity: 1, y: 0 }}
              exit={{ opacity: 0, y: -20 }}
            >
              <JobsPanel />
            </motion.div>
          ) : (
            /* ── Upload / Controls layout ── */
            <motion.div
              key="new_job"
              initial={{ opacity: 0, y: 20 }}
              animate={{ opacity: 1, y: 0 }}
              exit={{ opacity: 0, y: -20 }}
            >
              <div className="grid lg:grid-cols-2 gap-6">
                {/* Left Panel */}
                <motion.div
                  initial={{ opacity: 0, y: 20 }}
                  animate={{ opacity: 1, y: 0 }}
                  transition={{ delay: 0.1 }}
                  className="space-y-6"
                >
                  {/* File Type Toggle */}
                  <div className="glass p-4 rounded-2xl">
                    <div className="grid grid-cols-2 gap-2">
                      <motion.button
                        onClick={() => handleFileTypeChange('image')}
                        className={`p-3 rounded-xl text-sm font-medium transition-all flex items-center justify-center gap-2 ${fileType === 'image' ? 'bg-gradient-to-r from-purple-600 to-cyan-600 text-white' : 'glass text-gray-400 hover:bg-white/5'}`}
                        whileHover={{ scale: 1.02 }} whileTap={{ scale: 0.98 }}
                      >
                        <ImageIcon className="w-4 h-4" /> Image OCR
                      </motion.button>
                      <motion.button
                        onClick={() => handleFileTypeChange('pdf')}
                        className={`p-3 rounded-xl text-sm font-medium transition-all flex items-center justify-center gap-2 ${fileType === 'pdf' ? 'bg-gradient-to-r from-purple-600 to-cyan-600 text-white' : 'glass text-gray-400 hover:bg-white/5'}`}
                        whileHover={{ scale: 1.02 }} whileTap={{ scale: 0.98 }}
                      >
                        <FileText className="w-4 h-4" /> PDF Processing
                      </motion.button>
                    </div>
                  </div>
                  <MetadataForm metadata={metadata} onChange={setMetadata} suggestions={suggestions} />
                  <ModelSelector
                    models={models} value={model} onChange={setModel} loading={modelsLoading}
                  />
                  <ModeSelector mode={mode} onModeChange={setMode} />
                  <ImageUpload onImageSelect={handleImageSelect} preview={imagePreview} fileType={fileType} />
                  <motion.button
                    onClick={() => setShowAdvanced(!showAdvanced)}
                    className="w-full glass px-4 py-3 rounded-2xl flex items-center justify-between hover:bg-white/5 transition-colors"
                    whileHover={{ scale: 1.01 }} whileTap={{ scale: 0.99 }}
                  >
                    <div className="flex items-center gap-2">
                      <Settings className="w-4 h-4 text-purple-400" />
                      <span className="text-sm font-medium text-gray-300">Advanced Settings</span>
                    </div>
                    <motion.div animate={{ rotate: showAdvanced ? 180 : 0 }} transition={{ duration: 0.3 }}>
                      <svg className="w-4 h-4 text-gray-400" fill="none" stroke="currentColor" viewBox="0 0 24 24">
                        <path strokeLinecap="round" strokeLinejoin="round" strokeWidth={2} d="M19 9l-7 7-7-7" />
                      </svg>
                    </motion.div>
                  </motion.button>
                  <AnimatePresence>
                    {showAdvanced && (
                      <AdvancedSettings
                        settings={advancedSettings} onSettingsChange={setAdvancedSettings}
                        includeCaption={includeCaption} onIncludeCaptionChange={setIncludeCaption}
                      />
                    )}
                  </AnimatePresence>
                  {fileType === 'pdf' ? (
                    <PDFProcessor
                      pdfFile={image} mode={mode} prompt={prompt} model={model}
                      advancedSettings={advancedSettings} includeCaption={includeCaption}
                    />
                  ) : (
                    <>
                      <motion.button
                        onClick={handleSubmit}
                        disabled={!image || loading}
                        className={`w-full relative overflow-hidden rounded-2xl p-[2px] ${!image || loading ? 'opacity-50 cursor-not-allowed' : ''}`}
                        whileHover={!loading && image ? { scale: 1.02 } : {}}
                        whileTap={!loading && image ? { scale: 0.98 } : {}}
                      >
                        <div className="absolute inset-0 bg-gradient-to-r from-purple-600 via-pink-600 to-cyan-600 animate-gradient" />
                        <div className="relative bg-dark-100 px-8 py-4 rounded-2xl flex items-center justify-center gap-3">
                          {loading ? (
                            <><Loader2 className="w-5 h-5 animate-spin" /><span className="font-semibold">Processing Magic...</span></>
                          ) : (
                            <><Zap className="w-5 h-5" /><span className="font-semibold">Analyze Image</span></>
                          )}
                        </div>
                      </motion.button>
                      {error && (
                        <motion.div
                          initial={{ opacity: 0, y: -10 }} animate={{ opacity: 1, y: 0 }}
                          className="glass p-4 rounded-2xl border-red-500/50 bg-red-500/10"
                        >
                          <p className="text-sm text-red-400">{error}</p>
                        </motion.div>
                      )}
                    </>
                  )}
                </motion.div>
                {/* Right Panel - Results (non-plain_ocr modes or loading) */}
                <motion.div
                  initial={{ opacity: 0, y: 20 }}
                  animate={{ opacity: 1, y: 0 }}
                  transition={{ delay: 0.2 }}
                >
                  <ResultPanel
                    result={result}
                    loading={loading}
                    imagePreview={imagePreview}
                    onCopy={handleCopy}
                    onDownload={handleDownload}
                  />
                </motion.div>
              </div>
            </motion.div>
          )}
        </AnimatePresence>
      </main>
      {/* Footer */}
      <footer className="mt-20 border-t border-white/10 glass">
        <div className="max-w-7xl mx-auto px-6 py-8 text-center space-y-2">
          <p className="text-sm text-gray-400">
-            Powered by <span className="gradient-text font-semibold">DeepSeek-OCR</span> • 
+            Powered by <span className="gradient-text font-semibold">DeepSeek-OCR</span> &bull;
            Built with <span className="text-pink-400">♥</span> using React + FastAPI
          </p>
          <p className="text-xs text-gray-500">
--- a/frontend/src/components/JobsPanel.jsx
+++ b/frontend/src/components/JobsPanel.jsx
@@ -0,0 +1,665 @@
 import { useState, useEffect, useCallback } from 'react'
 import { useSuggestions } from '../hooks/useSuggestions'
 import { useModels } from '../hooks/useModels'
 import { motion, AnimatePresence } from 'framer-motion'
 import {
  Search, ChevronLeft, ChevronRight, CheckCircle2, Clock,
  FileText, Loader2, Save, RefreshCw, Trash2, Sparkles,
 } from 'lucide-react'
 import axios from 'axios'
 const API_BASE = import.meta.env.VITE_API_URL || '/api'
 const INPUT_CLASS =
  'w-full bg-white/5 border border-white/10 rounded-lg px-3 py-2 text-sm text-gray-200 ' +
  'placeholder-gray-600 focus:outline-none focus:border-purple-500/50 transition-colors'
 const STATUS_COLORS = {
  unreviewed: 'text-amber-400 bg-amber-400/10 border-amber-400/30',
  reviewed:   'text-green-400 bg-green-400/10 border-green-400/30',
 }
 function StatusBadge({ status }) {
  const Icon = status === 'reviewed' ? CheckCircle2 : Clock
  return (
    <span className={`inline-flex items-center gap-1 px-2 py-0.5 rounded-full text-xs border ${STATUS_COLORS[status] || 'text-gray-400'}`}>
      <Icon className="w-3 h-3" />
      {status}
    </span>
  )
 }
 // ─────────────────────────────────────────────────────────────
 // Full-screen Job Detail
 // ─────────────────────────────────────────────────────────────
 function JobDetail({ jobId, onClose, onReviewed, onDeleted, suggestions = {} }) {
  const { models } = useModels()
  const [job, setJob] = useState(null)
  const [loading, setLoading] = useState(true)
  const [error, setError] = useState(null)
  const [describeModel, setDescribeModel] = useState('')
  const [generatingDescribe, setGeneratingDescribe] = useState(false)
  const [editedText, setEditedText]         = useState('')
  const [editDescribeText, setEditDescribeText] = useState('')
  const [editFreeformText, setEditFreeformText] = useState('')
  const [activeTab, setActiveTab]           = useState('ocr')
  const [editAuthor, setEditAuthor]         = useState('')
  const [editBook, setEditBook]             = useState('')
  const [editChapter, setEditChapter]       = useState('')
  const [editPage, setEditPage]             = useState('')
  const [reviewerName, setReviewerName]     = useState('')
  const [submitting, setSubmitting] = useState(false)
  const [saveResult, setSaveResult] = useState(null)
  const [confirmDelete, setConfirmDelete] = useState(false)
  const [deleting, setDeleting] = useState(false)
  const [togglingStatus, setTogglingStatus] = useState(false)
  useEffect(() => {
    let cancelled = false
    setLoading(true)
    setError(null)
    setSaveResult(null)
    axios.get(`${API_BASE}/jobs/${jobId}`)
      .then(res => {
        if (!cancelled) {
          const d = res.data
          setJob(d)
          setEditedText(d.reviewed_text ?? d.ocr_text ?? '')
          setEditDescribeText(d.describe_text ?? '')
          setEditFreeformText(d.freeform_text ?? '')
          setEditAuthor(d.author || '')
          setEditBook(d.book || '')
          setEditChapter(d.chapter || '')
          setEditPage(d.page || '')
          setReviewerName(d.reviewer_name || '')
          // Default to the OCR tab when there's OCR text, otherwise Description
          if (d.reviewed_text || d.ocr_text) setActiveTab('ocr')
          else setActiveTab('describe')
        }
      })
      .catch(err => {
        if (!cancelled) setError(err.response?.data?.detail || err.message)
      })
      .finally(() => { if (!cancelled) setLoading(false) })
    return () => { cancelled = true }
  }, [jobId])
  // Default the Describe model to the job's original model (if available) or the registry default
  useEffect(() => {
    if (!describeModel && models.length > 0) {
      const def = models.find(m => m.default) || models[0]
      const fromJob = job?.ocr_model && models.some(m => m.id === job.ocr_model) ? job.ocr_model : null
      setDescribeModel(fromJob || def.id)
    }
  }, [models, job, describeModel])
  const handleGenerateDescribe = async () => {
    setGeneratingDescribe(true)
    setSaveResult(null)
    try {
      const res = await axios.post(`${API_BASE}/jobs/${jobId}/describe`, {
        model: describeModel || null,
      })
      setJob(res.data)
      setEditDescribeText(res.data.describe_text || '')
      onReviewed(res.data)
    } catch (err) {
      setSaveResult({ success: false, error: err.response?.data?.detail || err.message })
    } finally {
      setGeneratingDescribe(false)
    }
  }
  const handleSave = async () => {
    if (!reviewerName.trim()) {
      setSaveResult({ success: false, error: 'Reviewer name is required.' })
      return
    }
    setSubmitting(true)
    setSaveResult(null)
    try {
      const res = await axios.put(`${API_BASE}/jobs/${jobId}/review`, {
        reviewed_text: editedText,
        reviewer_name: reviewerName.trim(),
        author: editAuthor,
        book: editBook,
        chapter: editChapter,
        page: editPage,
        describe_text: editDescribeText || null,
        freeform_text: editFreeformText || null,
      })
      setJob(res.data)
      setSaveResult({ success: true })
      onReviewed(res.data)
    } catch (err) {
      setSaveResult({ success: false, error: err.response?.data?.detail || err.message })
    } finally {
      setSubmitting(false)
    }
  }
  const handleToggleStatus = async () => {
    // Marking reviewed accepts BOTH the reviewed document text and the description,
    // so it goes through the full review save (not a status-only flip).
    if (!isReviewed) {
      setTogglingStatus(true)
      try {
        await handleSave()
      } finally {
        setTogglingStatus(false)
      }
      return
    }
    // Reverting to unreviewed preserves the saved reviewed text and description.
    setTogglingStatus(true)
    setSaveResult(null)
    try {
      const res = await axios.put(`${API_BASE}/jobs/${jobId}/status`, {
        status: 'unreviewed',
        reviewer_name: reviewerName.trim() || null,
      })
      setJob(res.data)
      setReviewerName(res.data.reviewer_name || '')
      onReviewed(res.data)
    } catch (err) {
      setSaveResult({ success: false, error: err.response?.data?.detail || err.message })
    } finally {
      setTogglingStatus(false)
    }
  }
  const handleDelete = async () => {
    setDeleting(true)
    try {
      await axios.delete(`${API_BASE}/jobs/${jobId}`)
      onDeleted(jobId)
    } catch (err) {
      setSaveResult({ success: false, error: err.response?.data?.detail || err.message })
      setConfirmDelete(false)
    } finally {
      setDeleting(false)
    }
  }
  const isReviewed = job?.status === 'reviewed'
  return (
    <motion.div
      key={jobId}
      initial={{ opacity: 0, y: 20 }}
      animate={{ opacity: 1, y: 0 }}
      exit={{ opacity: 0, y: -20 }}
      className="flex flex-col gap-4"
    >
      {/* Top bar */}
      <div className="flex items-center gap-4 flex-shrink-0">
        <motion.button
          onClick={onClose}
          className="flex items-center gap-2 glass glass-hover px-4 py-2 rounded-xl text-sm text-gray-300"
          whileHover={{ scale: 1.02 }} whileTap={{ scale: 0.98 }}
        >
          <ChevronLeft className="w-4 h-4" />
          Back to results
        </motion.button>
        {job && (
          <>
            <StatusBadge status={job.status} />
            <motion.button
              onClick={handleToggleStatus}
              disabled={togglingStatus}
              title={isReviewed ? 'Revert to unreviewed' : 'Mark as reviewed'}
              className={`flex items-center gap-1 px-3 py-1.5 rounded-lg text-xs font-medium transition-colors disabled:opacity-50 ${
                isReviewed
                  ? 'glass glass-hover text-amber-400 hover:bg-amber-500/10'
                  : 'glass glass-hover text-green-400 hover:bg-green-500/10'
              }`}
              whileHover={!togglingStatus ? { scale: 1.02 } : {}}
              whileTap={!togglingStatus ? { scale: 0.98 } : {}}
            >
              {togglingStatus ? (
                <Loader2 className="w-3.5 h-3.5 animate-spin" />
              ) : isReviewed ? (
                <Clock className="w-3.5 h-3.5" />
              ) : (
                <CheckCircle2 className="w-3.5 h-3.5" />
              )}
              {isReviewed ? 'Mark Unreviewed' : 'Mark Reviewed'}
            </motion.button>
            <span className="text-xs text-gray-500 font-mono hidden sm:block">{job.id}</span>
          </>
        )}
        <div className="ml-auto flex items-center gap-2">
          {confirmDelete ? (
            <>
              <span className="text-xs text-red-400">Delete this job permanently?</span>
              <motion.button
                onClick={handleDelete}
                disabled={deleting}
                className="flex items-center gap-1 px-3 py-2 rounded-xl text-sm font-medium bg-red-600 hover:bg-red-500 disabled:opacity-50"
                whileHover={{ scale: 1.02 }} whileTap={{ scale: 0.98 }}
              >
                {deleting ? <Loader2 className="w-4 h-4 animate-spin" /> : <Trash2 className="w-4 h-4" />}
                Confirm
              </motion.button>
              <motion.button
                onClick={() => setConfirmDelete(false)}
                className="px-3 py-2 rounded-xl text-sm glass glass-hover text-gray-300"
                whileHover={{ scale: 1.02 }} whileTap={{ scale: 0.98 }}
              >
                Cancel
              </motion.button>
            </>
          ) : (
            <motion.button
              onClick={() => setConfirmDelete(true)}
              className="flex items-center gap-2 px-3 py-2 rounded-xl text-sm glass glass-hover text-red-400 hover:bg-red-500/10"
              whileHover={{ scale: 1.02 }} whileTap={{ scale: 0.98 }}
            >
              <Trash2 className="w-4 h-4" />
              Delete
            </motion.button>
          )}
        </div>
      </div>
      {loading && (
        <div className="flex-1 flex items-center justify-center">
          <Loader2 className="w-8 h-8 animate-spin text-purple-400" />
        </div>
      )}
      {error && (
        <div className="glass p-4 rounded-xl border-red-500/30 bg-red-500/10 flex-shrink-0">
          <p className="text-sm text-red-400">{error}</p>
        </div>
      )}
      {job && !loading && (
        <>
          {/* Image + Text */}
          <div className="grid gap-6" style={{ gridTemplateColumns: '1fr 1fr', height: '130vh' }}>
            <div className="glass rounded-2xl overflow-hidden flex items-center justify-center bg-black/20 h-full">
              <img
                src={`${API_BASE}/jobs/${job.id}/image`}
                alt="Job source"
                className="w-full h-full object-contain"
                onError={e => { e.target.style.display = 'none' }}
              />
            </div>
            <div className="glass rounded-2xl p-4 flex flex-col h-full">
              {/* Tabs — only show tabs that have content */}
              {(() => {
                const tabs = [
                  job.ocr_text || job.reviewed_text ? { id: 'ocr', label: 'OCR Text' } : null,
                  { id: 'describe', label: 'Description' },
                ].filter(Boolean)
                return tabs.length > 1 ? (
                  <div className="flex gap-1 mb-3 flex-shrink-0">
                    {tabs.map(t => (
                      <button
                        key={t.id}
                        onClick={() => setActiveTab(t.id)}
                        className={`px-3 py-1 rounded-lg text-xs font-medium transition-colors ${
                          activeTab === t.id
                            ? 'bg-purple-600 text-white'
                            : 'bg-white/5 text-gray-400 hover:bg-white/10'
                        }`}
                      >
                        {t.label}
                      </button>
                    ))}
                  </div>
                ) : null
              })()}
              <p className="text-xs text-gray-400 mb-2 flex-shrink-0">
                {{ ocr: isReviewed ? 'Reviewed Text' : 'OCR Text', describe: 'Description' }[activeTab]}
                <span className="text-purple-400 ml-1">(editable)</span>
              </p>
              {activeTab === 'ocr' && (
                <>
                  <textarea
                    value={editedText}
                    onChange={e => setEditedText(e.target.value)}
                    className="flex-1 w-full bg-transparent text-sm text-gray-200 font-mono resize-none focus:outline-none min-h-0"
                    placeholder="OCR text..."
                  />
                  {isReviewed && job.ocr_text && (
                    <details className="flex-shrink-0 mt-2 border-t border-white/10 pt-2">
                      <summary className="cursor-pointer text-xs text-gray-500 hover:text-gray-400 transition-colors">
                        Original OCR Text
                      </summary>
                      <pre className="text-xs text-gray-600 whitespace-pre-wrap font-mono mt-1 max-h-28 overflow-y-auto">
                        {job.ocr_text}
                      </pre>
                    </details>
                  )}
                </>
              )}
              {activeTab === 'describe' && (
                <>
                  <div className="flex items-center gap-2 mb-2 flex-shrink-0">
                    <select
                      value={describeModel}
                      onChange={e => setDescribeModel(e.target.value)}
                      disabled={generatingDescribe || models.length === 0}
                      className="bg-white/5 border border-white/10 rounded-lg px-2 py-1.5 text-xs text-gray-200 focus:outline-none focus:border-purple-500/50"
                    >
                      {models.length === 0 && <option value="">No models</option>}
                      {models.map(m => (
                        <option key={m.id} value={m.id}>{m.label}{m.default ? ' (default)' : ''}</option>
                      ))}
                    </select>
                    <motion.button
                      onClick={handleGenerateDescribe}
                      disabled={generatingDescribe || !describeModel}
                      className={`flex items-center gap-1.5 px-3 py-1.5 rounded-lg text-xs font-medium transition-all ${
                        generatingDescribe || !describeModel
                          ? 'opacity-50 cursor-not-allowed bg-white/5'
                          : 'bg-gradient-to-r from-violet-600 to-purple-600 hover:from-violet-500 hover:to-purple-500'
                      }`}
                      whileHover={!generatingDescribe && describeModel ? { scale: 1.02 } : {}}
                      whileTap={!generatingDescribe && describeModel ? { scale: 0.98 } : {}}
                      title="Run Describe on this job's image and save it"
                    >
                      {generatingDescribe
                        ? <><Loader2 className="w-3.5 h-3.5 animate-spin" /> Generating…</>
                        : <><Sparkles className="w-3.5 h-3.5" /> Generate Description</>}
                    </motion.button>
                  </div>
                  <textarea
                    value={editDescribeText}
                    onChange={e => setEditDescribeText(e.target.value)}
                    className="flex-1 w-full bg-transparent text-sm text-gray-200 font-mono resize-none focus:outline-none min-h-0"
                    placeholder="No description yet — pick a model and click Generate Description, or type one here."
                  />
                </>
              )}
            </div>
          </div>
          {/* Metadata + reviewer row */}
          <div className="glass p-4 rounded-2xl flex-shrink-0">
            <datalist id="jd-authors">
              {(suggestions.authors || []).map(a => <option key={a} value={a} />)}
            </datalist>
            <datalist id="jd-books">
              {(suggestions.books || []).map(b => <option key={b} value={b} />)}
            </datalist>
            <datalist id="jd-chapters">
              {(suggestions.chapters || []).map(c => <option key={c} value={c} />)}
            </datalist>
            <datalist id="jd-reviewers">
              {(suggestions.reviewers || []).map(r => <option key={r} value={r} />)}
            </datalist>
            <div className="grid grid-cols-6 gap-4">
              <div>
                <label className="text-xs text-gray-400 mb-1 block">Author</label>
                <input type="text" list="jd-authors" value={editAuthor} onChange={e => setEditAuthor(e.target.value)} placeholder="Author" className={INPUT_CLASS} />
              </div>
              <div>
                <label className="text-xs text-gray-400 mb-1 block">Book</label>
                <input type="text" list="jd-books" value={editBook} onChange={e => setEditBook(e.target.value)} placeholder="Book title" className={INPUT_CLASS} />
              </div>
              <div>
                <label className="text-xs text-gray-400 mb-1 block">Chapter</label>
                <input type="text" list="jd-chapters" value={editChapter} onChange={e => setEditChapter(e.target.value)} placeholder="Chapter" className={INPUT_CLASS} />
              </div>
              <div>
                <label className="text-xs text-gray-400 mb-1 block">Page</label>
                <input type="text" value={editPage} onChange={e => setEditPage(e.target.value)} placeholder="Page" className={INPUT_CLASS} />
              </div>
              <div>
                <label className="text-xs text-gray-400 mb-1 block">Reviewer</label>
                <input type="text" list="jd-reviewers" value={reviewerName} onChange={e => setReviewerName(e.target.value)} placeholder="Your name" className={INPUT_CLASS} />
              </div>
              <div className="flex flex-col justify-end">
                <motion.button
                  onClick={handleSave}
                  disabled={submitting || !reviewerName.trim()}
                  className={`w-full flex items-center justify-center gap-2 px-4 py-2 rounded-lg font-medium text-sm transition-all ${
                    submitting || !reviewerName.trim()
                      ? 'opacity-50 cursor-not-allowed bg-white/5'
                      : isReviewed
                        ? 'bg-gradient-to-r from-blue-600 to-indigo-600 hover:from-blue-500 hover:to-indigo-500'
                        : 'bg-gradient-to-r from-green-600 to-emerald-600 hover:from-green-500 hover:to-emerald-500'
                  }`}
                  whileHover={!submitting && reviewerName.trim() ? { scale: 1.02 } : {}}
                  whileTap={!submitting && reviewerName.trim() ? { scale: 0.98 } : {}}
                >
                  {submitting ? (
                    <><Loader2 className="w-4 h-4 animate-spin" /> Saving...</>
                  ) : isReviewed ? (
                    <><Save className="w-4 h-4" /> Save Changes</>
                  ) : (
                    <><CheckCircle2 className="w-4 h-4" /> Mark Reviewed</>
                  )}
                </motion.button>
              </div>
            </div>
            {!isReviewed && (
              <p className="text-xs text-gray-500 mt-2">
                Marking reviewed accepts both the reviewed document text and the description.
              </p>
            )}
            {saveResult && (
              <motion.div
                initial={{ opacity: 0, y: -4 }} animate={{ opacity: 1, y: 0 }}
                className={`mt-3 p-2 rounded-lg text-xs ${saveResult.success ? 'bg-green-500/10 text-green-400' : 'bg-red-500/10 text-red-400'}`}
              >
                {saveResult.success
                  ? (isReviewed ? 'Changes saved!' : 'Job marked as reviewed!')
                  : saveResult.error}
              </motion.div>
            )}
            {/* Read-only info row */}
            <div className="flex gap-6 mt-3 pt-3 border-t border-white/10">
              {job.submitted_at && (
                <span className="text-xs text-gray-500">Submitted: {new Date(job.submitted_at).toLocaleString()}</span>
              )}
              {isReviewed && job.reviewed_at && (
                <span className="text-xs text-gray-500">Last reviewed: {new Date(job.reviewed_at).toLocaleString()}</span>
              )}
              {job.mode && <span className="text-xs text-gray-500">Mode: {job.mode}</span>}
              {job.ocr_model && <span className="text-xs text-gray-500">Model: {job.ocr_model}</span>}
            </div>
          </div>
        </>
      )}
    </motion.div>
  )
 }
 // ─────────────────────────────────────────────────────────────
 // Search / List view
 // ─────────────────────────────────────────────────────────────
 export default function JobsPanel() {
  const suggestions = useSuggestions()
  const [search, setSearch] = useState('')
  const [filterStatus, setFilterStatus] = useState('')
  const [filterAuthor, setFilterAuthor] = useState('')
  const [filterBook, setFilterBook] = useState('')
  const [jobs, setJobs] = useState([])
  const [total, setTotal] = useState(0)
  const [page, setPage] = useState(0)
  const [loading, setLoading] = useState(false)
  const [error, setError] = useState(null)
  const [selectedJobId, setSelectedJobId] = useState(null)
  const LIMIT = 20
  const fetchJobs = useCallback(async (pageNum = 0) => {
    setLoading(true)
    setError(null)
    try {
      const params = new URLSearchParams()
      if (search.trim()) params.set('search', search.trim())
      if (filterStatus) params.set('status', filterStatus)
      if (filterAuthor.trim()) params.set('author', filterAuthor.trim())
      if (filterBook.trim()) params.set('book', filterBook.trim())
      params.set('limit', LIMIT)
      params.set('offset', pageNum * LIMIT)
      const res = await axios.get(`${API_BASE}/jobs?${params}`)
      setJobs(res.data.jobs)
      setTotal(res.data.total)
      setPage(pageNum)
    } catch (err) {
      setError(err.response?.data?.detail || err.message)
    } finally {
      setLoading(false)
    }
  }, [search, filterStatus, filterAuthor, filterBook])
  useEffect(() => { fetchJobs(0) }, []) // eslint-disable-line react-hooks/exhaustive-deps
  const handleReviewed = (updatedJob) => {
    setJobs(prev => prev.map(j => j.id === updatedJob.id ? { ...j, ...updatedJob } : j))
  }
  const totalPages = Math.ceil(total / LIMIT)
  // When a job is selected show full-screen detail
  if (selectedJobId) {
    return (
      <AnimatePresence mode="wait">
        <JobDetail
          key={selectedJobId}
          jobId={selectedJobId}
          onClose={() => setSelectedJobId(null)}
          onReviewed={handleReviewed}
          onDeleted={(id) => {
            setJobs(prev => prev.filter(j => j.id !== id))
            setTotal(prev => prev - 1)
            setSelectedJobId(null)
          }}
          suggestions={suggestions}
        />
      </AnimatePresence>
    )
  }
  return (
    <motion.div
      key="job_list"
      initial={{ opacity: 0, y: 20 }}
      animate={{ opacity: 1, y: 0 }}
      exit={{ opacity: 0, y: -20 }}
      className="space-y-4"
    >
      {/* Search form */}
      <div className="glass p-4 rounded-2xl space-y-3">
        <form onSubmit={e => { e.preventDefault(); fetchJobs(0) }} className="flex gap-2">
          <input
            type="text"
            value={search}
            onChange={e => setSearch(e.target.value)}
            placeholder="Search all fields..."
            className={`${INPUT_CLASS} flex-1`}
          />
          <motion.button
            type="submit"
            className="flex items-center gap-2 px-4 py-2 rounded-lg bg-gradient-to-r from-purple-600 to-cyan-600 text-sm font-medium"
            whileHover={{ scale: 1.02 }} whileTap={{ scale: 0.98 }}
          >
            <Search className="w-4 h-4" /> Search
          </motion.button>
        </form>
        <datalist id="jp-authors">
          {suggestions.authors.map(a => <option key={a} value={a} />)}
        </datalist>
        <datalist id="jp-books">
          {(suggestions.books || []).map(b => <option key={b} value={b} />)}
        </datalist>
        <div className="grid grid-cols-3 gap-2">
          <select value={filterStatus} onChange={e => setFilterStatus(e.target.value)} className={INPUT_CLASS}>
            <option value="">All statuses</option>
            <option value="unreviewed">Unreviewed</option>
            <option value="reviewed">Reviewed</option>
          </select>
          <input type="text" list="jp-authors" value={filterAuthor} onChange={e => setFilterAuthor(e.target.value)} placeholder="Author..." className={INPUT_CLASS} />
          <input type="text" list="jp-books" value={filterBook} onChange={e => setFilterBook(e.target.value)} placeholder="Book..." className={INPUT_CLASS} />
        </div>
        <div className="flex items-center justify-between">
          <span className="text-xs text-gray-500">{total} job{total !== 1 ? 's' : ''} found</span>
          <button onClick={() => fetchJobs(page)} className="flex items-center gap-1 text-xs text-gray-400 hover:text-gray-200 transition-colors">
            <RefreshCw className="w-3 h-3" /> Refresh
          </button>
        </div>
      </div>
      {loading && <div className="flex justify-center py-8"><Loader2 className="w-6 h-6 animate-spin text-purple-400" /></div>}
      {error && (
        <div className="glass p-4 rounded-xl border-red-500/30 bg-red-500/10">
          <p className="text-sm text-red-400">{error}</p>
        </div>
      )}
      {!loading && !error && jobs.length === 0 && (
        <div className="glass p-8 rounded-2xl text-center">
          <FileText className="w-10 h-10 mx-auto mb-3 text-gray-600" />
          <p className="text-gray-400">No jobs found</p>
          <p className="text-xs text-gray-500 mt-1">Commit your first OCR job from the New Job tab</p>
        </div>
      )}
      {/* Results grid */}
      <div className="grid grid-cols-1 sm:grid-cols-2 lg:grid-cols-3 xl:grid-cols-4 gap-3">
        <AnimatePresence>
          {jobs.map(job => (
            <motion.button
              key={job.id}
              onClick={() => setSelectedJobId(job.id)}
              className="text-left glass p-4 rounded-xl border border-white/5 hover:border-white/20 hover:bg-white/5 transition-all"
              initial={{ opacity: 0, y: 10 }}
              animate={{ opacity: 1, y: 0 }}
              exit={{ opacity: 0 }}
              whileHover={{ scale: 1.02 }}
              whileTap={{ scale: 0.98 }}
              layout
            >
              <div className="flex items-start justify-between gap-2 mb-2">
                <StatusBadge status={job.status} />
              </div>
              {job.book && <p className="text-sm font-medium text-gray-200 truncate">{job.book}</p>}
              <div className="flex items-center gap-2 mt-0.5">
                {job.chapter && <span className="text-xs text-gray-500">Ch. {job.chapter}</span>}
                {job.page && <span className="text-xs text-gray-500">p. {job.page}</span>}
              </div>
              {job.author && <p className="text-xs text-gray-400 mt-1">{job.author}</p>}
              <div className="flex items-center justify-between mt-2">
                <p className="text-xs text-gray-600 font-mono">{new Date(job.submitted_at).toLocaleDateString()}</p>
                {job.ocr_model && <span className="text-[10px] text-gray-500 truncate ml-2">{job.ocr_model}</span>}
              </div>
            </motion.button>
          ))}
        </AnimatePresence>
      </div>
      {totalPages > 1 && (
        <div className="flex items-center justify-center gap-3">
          <button onClick={() => fetchJobs(page - 1)} disabled={page === 0} className="glass glass-hover p-2 rounded-lg disabled:opacity-30">
            <ChevronLeft className="w-4 h-4" />
          </button>
          <span className="text-sm text-gray-400">Page {page + 1} of {totalPages}</span>
          <button onClick={() => fetchJobs(page + 1)} disabled={page >= totalPages - 1} className="glass glass-hover p-2 rounded-lg disabled:opacity-30">
            <ChevronRight className="w-4 h-4" />
          </button>
        </div>
      )}
    </motion.div>
  )
 }
--- a/frontend/src/components/MetadataForm.jsx
+++ b/frontend/src/components/MetadataForm.jsx
@@ -0,0 +1,77 @@
 import { BookOpen } from 'lucide-react'
 export default function MetadataForm({ metadata, onChange, suggestions = {} }) {
  const { author, book, chapter, page } = metadata
  const { authors = [], books = [], chapters = [] } = suggestions
  const field = (key) => (e) => onChange({ ...metadata, [key]: e.target.value })
  const inputClass =
    'w-full bg-white/5 border border-white/10 rounded-lg px-3 py-2 text-sm text-gray-200 ' +
    'placeholder-gray-600 focus:outline-none focus:border-purple-500/50 transition-colors'
  return (
    <div className="glass p-4 rounded-2xl space-y-3">
      <div className="flex items-center gap-2">
        <BookOpen className="w-4 h-4 text-purple-400" />
        <h3 className="text-sm font-medium text-gray-300">Job Metadata</h3>
      </div>
      <datalist id="mf-authors">
        {authors.map(a => <option key={a} value={a} />)}
      </datalist>
      <datalist id="mf-books">
        {books.map(b => <option key={b} value={b} />)}
      </datalist>
      <datalist id="mf-chapters">
        {chapters.map(c => <option key={c} value={c} />)}
      </datalist>
      <div className="grid grid-cols-2 gap-3">
        <div>
          <label className="text-xs text-gray-400 mb-1 block">Author</label>
          <input
            type="text"
            list="mf-authors"
            value={author}
            onChange={field('author')}
            placeholder="Author name"
            className={inputClass}
          />
        </div>
        <div>
          <label className="text-xs text-gray-400 mb-1 block">Book</label>
          <input
            type="text"
            list="mf-books"
            value={book}
            onChange={field('book')}
            placeholder="Book title"
            className={inputClass}
          />
        </div>
        <div>
          <label className="text-xs text-gray-400 mb-1 block">Chapter</label>
          <input
            type="text"
            list="mf-chapters"
            value={chapter}
            onChange={field('chapter')}
            placeholder="Chapter"
            className={inputClass}
          />
        </div>
        <div>
          <label className="text-xs text-gray-400 mb-1 block">Page</label>
          <input
            type="text"
            value={page}
            onChange={field('page')}
            placeholder="Page number"
            className={inputClass}
          />
        </div>
      </div>
    </div>
  )
 }
--- a/frontend/src/components/ModeSelector.jsx
+++ b/frontend/src/components/ModeSelector.jsx
@@ -1,29 +1,17 @@
 import { motion } from 'framer-motion'
-import { FileText, Eye, Search, Wand2 } from 'lucide-react'
+import { FileText, Eye } from 'lucide-react'
 const modes = [
-  { id: 'plain_ocr', name: 'Plain OCR', icon: FileText, color: 'from-blue-500 to-cyan-500', desc: 'Extract raw text', needsInput: false },
+  { id: 'plain_ocr', name: 'Plain OCR', icon: FileText, color: 'from-blue-500 to-cyan-500', desc: 'Extract raw text' },
-  { id: 'describe', name: 'Describe', icon: Eye, color: 'from-violet-500 to-purple-500', desc: 'Image description', needsInput: false },
+  { id: 'describe', name: 'Describe', icon: Eye, color: 'from-violet-500 to-purple-500', desc: 'Image description' },
  { id: 'find_ref', name: 'Find', icon: Search, color: 'from-yellow-500 to-orange-500', desc: 'Locate specific terms', needsInput: 'findTerm' },
  { id: 'freeform', name: 'Freeform', icon: Wand2, color: 'from-fuchsia-500 to-pink-500', desc: 'Custom prompt', needsInput: 'prompt' },
 ]
-export default function ModeSelector({ 
+export default function ModeSelector({ mode, onModeChange }) {
  mode, 
  onModeChange, 
  prompt, 
  onPromptChange,
  findTerm,
  onFindTermChange
 }) {
  const selectedMode = modes.find(m => m.id === mode)
  const needsInput = selectedMode?.needsInput
  return (
    <div className="glass p-4 rounded-2xl space-y-3">
      <h3 className="text-sm font-semibold text-gray-200">Mode</h3>
-      <div className="grid grid-cols-4 gap-2">
+      <div className="grid grid-cols-2 gap-2">
        {modes.map((m) => {
          const Icon = m.icon
          const isSelected = mode === m.id
@@ -32,6 +20,7 @@ export default function ModeSelector({
            <motion.button
              key={m.id}
              onClick={() => onModeChange(m.id)}
              title={m.desc}
              className={`
                relative p-2 rounded-xl text-center transition-all
                ${isSelected
@@ -68,38 +57,6 @@ export default function ModeSelector({
          )
        })}
      </div>
      {needsInput === 'findTerm' && (
        <motion.div
          initial={{ opacity: 0, height: 0 }}
          animate={{ opacity: 1, height: 'auto' }}
          exit={{ opacity: 0, height: 0 }}
        >
          <input
            type="text"
            value={findTerm}
            onChange={(e) => onFindTermChange(e.target.value)}
            placeholder="Enter term to find (e.g., Total, Invoice #)"
            className="w-full bg-white/5 border border-white/10 rounded-xl px-3 py-2 text-sm focus:outline-none focus:border-purple-500 transition-colors"
          />
        </motion.div>
      )}
      {needsInput === 'prompt' && (
        <motion.div
          initial={{ opacity: 0, height: 0 }}
          animate={{ opacity: 1, height: 'auto' }}
          exit={{ opacity: 0, height: 0 }}
        >
          <textarea
            value={prompt}
            onChange={(e) => onPromptChange(e.target.value)}
            placeholder="Enter your custom prompt..."
            className="w-full bg-white/5 border border-white/10 rounded-xl px-3 py-2 text-sm focus:outline-none focus:border-purple-500 transition-colors resize-none"
            rows={2}
          />
        </motion.div>
      )}
    </div>
  )
 }
--- a/frontend/src/components/ModelSelector.jsx
+++ b/frontend/src/components/ModelSelector.jsx
@@ -0,0 +1,33 @@
 import { Cpu } from 'lucide-react'
 const SELECT_CLASS =
  'w-full bg-white/5 border border-white/10 rounded-lg px-3 py-2 text-sm text-gray-200 ' +
  'focus:outline-none focus:border-purple-500/50 transition-colors'
 // Dropdown to pick which OCR model runs the analysis.
 // `models` comes from the useModels() hook; `value` is the selected model id.
 export default function ModelSelector({ models, value, onChange, loading }) {
  return (
    <div className="glass p-4 rounded-2xl space-y-3">
      <div className="flex items-center gap-2">
        <Cpu className="w-4 h-4 text-purple-400" />
        <h3 className="text-sm font-semibold text-gray-200">Model</h3>
      </div>
      <select
        value={value || ''}
        onChange={e => onChange(e.target.value)}
        disabled={loading || models.length === 0}
        className={SELECT_CLASS}
      >
        {loading && <option value="">Loading models…</option>}
        {!loading && models.length === 0 && <option value="">No models available</option>}
        {models.map(m => (
          <option key={m.id} value={m.id}>
            {m.label}{m.default ? ' (default)' : ''}
          </option>
        ))}
      </select>
    </div>
  )
 }
--- a/frontend/src/components/PDFProcessor.jsx
+++ b/frontend/src/components/PDFProcessor.jsx
@@ -5,7 +5,7 @@ import axios from 'axios'
 const API_BASE = import.meta.env.VITE_API_URL || '/api'
-function PDFProcessor({ pdfFile, mode, prompt, advancedSettings, includeCaption }) {
+function PDFProcessor({ pdfFile, mode, prompt, model, advancedSettings, includeCaption }) {
  const [processing, setProcessing] = useState(false)
  const [progress, setProgress] = useState(0)
  const [result, setResult] = useState(null)
@@ -29,6 +29,7 @@ function PDFProcessor({ pdfFile, mode, prompt, advancedSettings, includeCaption
    try {
      const formData = new FormData()
      formData.append('pdf_file', pdfFile)
      if (model) formData.append('model', model)
      formData.append('mode', mode)
      formData.append('prompt', prompt)
      formData.append('output_format', outputFormat)
@@ -80,7 +81,7 @@ function PDFProcessor({ pdfFile, mode, prompt, advancedSettings, includeCaption
    } finally {
      setProcessing(false)
    }
-  }, [pdfFile, mode, prompt, outputFormat, includeCaption, advancedSettings])
+  }, [pdfFile, mode, prompt, model, outputFormat, includeCaption, advancedSettings])
  const handleDownloadJSON = useCallback(() => {
    if (!result || outputFormat !== 'json') return
--- a/frontend/src/components/ResultPanel.jsx
+++ b/frontend/src/components/ResultPanel.jsx
@@ -205,7 +205,7 @@ export default function ResultPanel({ result, loading, imagePreview, onCopy, onD
            exit={{ opacity: 0, y: -20 }}
            className="space-y-4"
          >
-            {/* Preview with boxes */}
+            {/* Preview with boxes (grounding modes) */}
            {imagePreview && result.boxes && result.boxes.length > 0 && (
              <div className="relative rounded-xl overflow-hidden border border-white/10 bg-black">
                <img
@@ -226,15 +226,13 @@ export default function ResultPanel({ result, loading, imagePreview, onCopy, onD
              </div>
            )}
-            {/* Text result */}
+            {/* Rendered text result */}
            <div className="bg-white/5 border border-white/10 rounded-xl p-4 max-h-96 overflow-y-auto">
              {isHTML ? (
                <div
                  className="prose prose-invert prose-sm max-w-none"
                  dangerouslySetInnerHTML={{ __html: DOMPurify.sanitize(result.text) }}
-                  style={{
+                  style={{ color: '#e5e7eb' }}
                    color: '#e5e7eb',
                  }}
                />
              ) : isMarkdown ? (
                <div className="prose prose-invert prose-sm max-w-none">
--- a/frontend/src/hooks/useModels.js
+++ b/frontend/src/hooks/useModels.js
@@ -0,0 +1,24 @@
 import { useState, useEffect } from 'react'
 const API_BASE = import.meta.env.VITE_API_URL || '/api'
 // Fetches the OCR models available for selection. Returns { models, loading }.
 // Each model: { id, label, capabilities: { grounding, advanced_settings }, default }
 export function useModels() {
  const [models, setModels] = useState([])
  const [loading, setLoading] = useState(true)
  useEffect(() => {
    let cancelled = false
    fetch(`${API_BASE}/models`)
      .then(r => (r.ok ? r.json() : null))
      .then(data => {
        if (!cancelled && data?.models) setModels(data.models)
      })
      .catch(() => {})
      .finally(() => { if (!cancelled) setLoading(false) })
    return () => { cancelled = true }
  }, [])
  return { models, loading }
 }
--- a/frontend/src/hooks/useSuggestions.js
+++ b/frontend/src/hooks/useSuggestions.js
@@ -0,0 +1,16 @@
 import { useState, useEffect } from 'react'
 const API_BASE = import.meta.env.VITE_API_URL || '/api'
 export function useSuggestions() {
  const [suggestions, setSuggestions] = useState({ authors: [], books: [], chapters: [], reviewers: [] })
  useEffect(() => {
    fetch(`${API_BASE}/jobs/suggestions`)
      .then(r => r.ok ? r.json() : null)
      .then(data => { if (data) setSuggestions(data) })
      .catch(() => {})
  }, [])
  return suggestions
 }
Author	SHA1	Message	Date
Aaron Roberts	02185bef46	Adding missed files	2026-06-30 12:16:16 +01:00
Aaron Roberts	04bbbebd5a	Remove Freeform and Find from UI. Allow Description to be added to Reviewed job	2026-06-29 13:09:01 +01:00
Aaron Roberts	48f958de6c	Added job review toggle	2026-06-23 10:43:44 +01:00
Aaron Roberts	91c134faa7	Add updated_at column and trigger for Qdrant re-sync detection Adds updated_at TIMESTAMPTZ to ocr_jobs, stamped automatically by a BEFORE UPDATE trigger. The sync process can use updated_at > qdrant_synced_at to detect jobs that need re-ingestion after edits or reviews. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-06-19 23:12:33 +01:00
Aaron Roberts	38ac36b18e	Add qdrant_synced_at column	2026-06-19 17:47:53 +01:00
Aaron Roberts	ab19725e0b	Remove AnimatePresence mode=wait to fix blank screen on view transitions Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-06-10 22:04:52 +01:00
Aaron Roberts	a511db78cb	Fix blank screen on Analyze; add mode selector to result view showResultView now only activates after results exist (not during loading), preventing AnimatePresence from blanking the screen mid-transition. Adds a mode selector + Analyze button at the top of the result view so additional modes can be run without leaving the page. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-06-10 21:55:23 +01:00
Aaron Roberts	07b2f2b6bc	Fix stale editedOcrText reference in handleDownload dependency array Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-06-10 21:44:36 +01:00
Aaron Roberts	ae0ac3af59	Store all mode results (OCR, Describe, Freeform) in a single job record - DB: add describe_text and freeform_text columns (ALTER TABLE IF NOT EXISTS) - Backend: commit and review endpoints accept/persist all three text fields - App: accumulate results per mode in state; tabs appear when >1 mode run; all results sent on Commit Job - JobDetail: tabbed text panel shows whichever fields are populated, all editable Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-06-10 12:28:01 +01:00
Aaron Roberts	4ab87d2e6f	Extend commit workflow to Describe and Freeform modes All text-output modes (plain_ocr, describe, freeform) now show the full-screen editable result view with metadata fields and Commit Job button. The textarea label reflects the active mode. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-06-10 10:38:27 +01:00
Aaron Roberts	cc5ce0c6be	Fix suggestions fetch using wrong API base URL Fallback was http://localhost:8000/api instead of /api, causing silent failure in containerized deployments. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-06-09 18:37:13 +01:00
Aaron Roberts	02e3099388	Add delete job functionality with confirmation step Adds DELETE /api/jobs/{id} endpoint (removes DB record and image file), and a two-step Delete / Confirm button on the review page that returns to the job list on success. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-06-09 18:33:46 +01:00
Aaron Roberts	dc5a1a4ff5	Add book title to autocomplete suggestions Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-06-09 18:29:14 +01:00
Aaron Roberts	5ea18d76d6	Add autocomplete suggestions for Author, Chapter, and Reviewer fields Adds a GET /api/jobs/suggestions endpoint that returns distinct values for author, chapter, and reviewer_name from the database, and wires them into HTML datalist elements on the New Job, result view, and Browse Jobs pages. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-06-09 18:24:49 +01:00
Aaron Roberts	1d15b5f0c1	Add unique constraint to prevent duplicate (author, chapter, page) submissions Adds a PostgreSQL partial unique index on (author, chapter, page) where all three fields are non-null, and returns HTTP 409 when a duplicate is detected. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-06-09 18:19:54 +01:00
Aaron Roberts	cb704a2f27	Double image/text section height to 130vh Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-06-09 18:13:11 +01:00
Aaron Roberts	3ca40a2255	Revert to 50/50 image/text column split Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-06-09 18:10:51 +01:00
Aaron Roberts	6f86f872a9	Make image display significantly taller Give the image+text row an explicit 65vh height instead of flex-1 inside a viewport-locked container. Remove the overall height constraint so metadata and commit rows sit naturally below with scroll if needed. Image and textarea containers now use h-full to fill the fixed row height. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-06-09 18:10:39 +01:00
Aaron Roberts	7381ecd12e	Increase image display size to 60% of the split layout Change image/text column ratio from 50/50 to 60/40 (3fr 2fr) on both the New Job result view and the Browse Jobs detail view. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-06-09 18:05:09 +01:00
Aaron Roberts	247a5e4b0e	Full-screen side-by-side layout for New Job and Browse Jobs New Job (plain_ocr): - After OCR completes, the entire main area becomes a flex-column view pinned to viewport height: image and editable textarea side by side at top (filling available space), metadata fields in a compact row below, Commit Job button at the bottom - "New Analysis" button in the header returns to the upload view - ResultPanel reverted to simple rendered-output only (no commit logic) Browse Jobs: - Selecting a job replaces the search list with a full-screen detail view using the same layout: image \| editable textarea on top, all metadata fields + Reviewer name + action button in a single row below - "Back to results" button returns to the search/list grid - Search results now display as a responsive card grid Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-06-09 17:57:11 +01:00
Aaron Roberts	9356ba6d1b	Side-by-side image/text layout and editable metadata on review New Job page: - OCR result now shows source image and editable textarea side by side - Grounding-box overlay preview moved into the non-commit branch Browse Jobs / Review page: - JobDetail uses a 2-column layout: image + read-only info on left, all editable fields on right - Author, book, chapter, and page are now editable inputs (not read-only) - Text textarea is always editable (for both unreviewed and reviewed jobs) - Reviewer name pre-filled for reviewed jobs; button becomes "Save Changes" - Outer grid changed to 1/3 list + 2/3 detail for more review space Backend: - PUT /api/jobs/{id}/review now accepts and saves author, book, chapter, page alongside reviewed_text and reviewer_name Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-06-09 17:38:36 +01:00
Aaron Roberts	da7957d7d5	Fix commit job and OCR text editing - OCR text is now shown in an editable textarea (plain_ocr mode) so users can correct it before committing - editedOcrText state tracks edits; commit job sends the edited value instead of the original result.text - Remove silent early-return guard that blocked commit when text was empty - Copy and download also use the edited text Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-06-09 17:11:49 +01:00
Aaron Roberts	fd747e6c23	Add job tracking with PostgreSQL, image storage, and review workflow - Add PostgreSQL service to docker-compose with health check and postgres_data volume - Mount ./ocr_images as bind volume for persistent image storage - Add backend/database.py with schema init and get_db() context manager - Add 5 new API endpoints: POST /api/jobs, GET /api/jobs (search), GET /api/jobs/{id}, GET /api/jobs/{id}/image, PUT /api/jobs/{id}/review - Jobs are saved with author/book/chapter/page metadata, auto UUID, and submitted_at timestamp - Jobs start as 'unreviewed'; review captures edited text, reviewer name, and reviewed_at - Add MetadataForm.jsx (author/book/chapter/page inputs) to the New Job panel - Add JobsPanel.jsx with search/filter, paginated list, and detail pane with review form - Add "Commit Job" button to ResultPanel (plain_ocr mode only) with success/error feedback - Add "New Job" / "Browse Jobs" navigation to the app header Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-06-09 16:48:12 +01:00
Aaron Roberts	68147eb97c	.env	2026-06-09 15:10:25 +01:00
Aaron Roberts	ba313ee808	stack.env	2026-06-09 15:06:02 +01:00
Aaron Roberts	bd19e09630	Adding .env for portainer	2026-06-09 14:15:34 +01:00