Stuck Processing State Fix

Date: December 20, 2024
Issue: Files remain stuck in "processing" state after stopping or restarting the application


Problem Description

When the dashboard is stopped (Ctrl+C) or the Docker container is restarted while files are being processed, those files remain in the "processing" state in the database. Because processing was interrupted, these files should be marked as failed so they can be retried.

Affected Scenarios:

  • Dashboard server crash or forced shutdown
  • Docker container restart
  • System reboot while processing
  • Clicking the "Stop Processing" button
  • Application killed unexpectedly

Solution Implemented

1. Automatic Cleanup (Multi-Point)

Location: dashboard.py:183-210

Added cleanup_stuck_processing() method to DatabaseReader class:

def cleanup_stuck_processing(self):
    """Mark files stuck in 'processing' state as failed for retry"""
    try:
        conn = self._get_connection()
        cursor = conn.cursor()

        # Find files stuck in processing state
        cursor.execute("SELECT COUNT(*) as count FROM files WHERE state = 'processing'")
        stuck_count = cursor.fetchone()['count']

        if stuck_count > 0:
            logging.warning(f"Found {stuck_count} file(s) stuck in 'processing' state from previous session")

            # Mark them as failed (interrupted) so they can be retried
            cursor.execute("""
                UPDATE files
                SET state = 'failed',
                    error_message = 'Processing interrupted (application restart or crash)',
                    completed_at = CURRENT_TIMESTAMP
                WHERE state = 'processing'
            """)

            conn.commit()
            logging.info(f"✅ Marked {stuck_count} stuck file(s) as failed for retry")

        conn.close()
    except Exception as e:
        logging.error(f"Error cleaning up stuck processing files: {e}", exc_info=True)

Triggered in Multiple Places:

  1. On Startup (dashboard.py:1144-1148):
# Clean up any files stuck in processing state from previous session
try:
    logger.info("Checking for files stuck in processing state...")
    db_reader.cleanup_stuck_processing()
except Exception as e:
    logger.error(f"Failed to cleanup stuck files on startup: {e}", exc_info=True)
  2. On File List Load (dashboard.py:612-619):
@app.route('/api/files')
def api_files():
    """Get files list"""
    try:
        # Auto-cleanup stuck files whenever file list is requested
        # This ensures stuck files are cleaned up even if startup cleanup failed
        if not processing_active:
            db_reader.cleanup_stuck_processing()

This multi-point approach ensures stuck files are automatically cleaned up:

  • When the dashboard starts (container restart)
  • Whenever the file list refreshes (every page load/refresh)
  • Only when processing is not active (safeguard against failing files that are still being encoded)
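
The guarded cleanup described above can be sketched in isolation. This is a minimal illustration, not the code in dashboard.py: it assumes SQLite through the standard sqlite3 module, a hypothetical minimal files table (the path column and the make_demo_db helper are invented for the demo), and it takes processing_active as an argument instead of reading a global. It also reports the number of affected rows from cursor.rowcount rather than from a separate COUNT query.

```python
import sqlite3

INTERRUPTED_MSG = "Processing interrupted (application restart or crash)"


def cleanup_stuck_processing(conn, processing_active=False):
    """Mark rows left in 'processing' as failed; return how many were changed."""
    if processing_active:
        # Rows in 'processing' may belong to live jobs, so leave them alone.
        return 0
    cursor = conn.execute(
        """
        UPDATE files
        SET state = 'failed',
            error_message = ?,
            completed_at = CURRENT_TIMESTAMP
        WHERE state = 'processing'
        """,
        (INTERRUPTED_MSG,),
    )
    conn.commit()
    return cursor.rowcount  # number of rows the UPDATE modified


def make_demo_db():
    """Build an in-memory database with a hypothetical minimal files table."""
    conn = sqlite3.connect(":memory:")
    conn.row_factory = sqlite3.Row  # allows row["state"] style access
    conn.execute(
        "CREATE TABLE files (path TEXT, state TEXT, error_message TEXT, completed_at TEXT)"
    )
    conn.executemany(
        "INSERT INTO files (path, state) VALUES (?, ?)",
        [("a.mkv", "processing"), ("b.mkv", "completed"), ("c.mkv", "processing")],
    )
    conn.commit()
    return conn


if __name__ == "__main__":
    demo = make_demo_db()
    print(cleanup_stuck_processing(demo, processing_active=True))  # guard skips cleanup
    print(cleanup_stuck_processing(demo))  # two rows were stuck
    demo.close()
```

Calling the function a second time returns 0, so running it on every file list request is harmless when nothing is stuck.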

2. Manual Reset API Endpoint

Location: dashboard.py:824-832

Added new API endpoint for manual reset:

@app.route('/api/jobs/reset-stuck', methods=['POST'])
def api_reset_stuck():
    """Mark files stuck in processing state as failed for retry"""
    try:
        db_reader.cleanup_stuck_processing()
        return jsonify({'success': True, 'message': 'Stuck files marked as failed'})
    except Exception as e:
        logging.error(f"Failed to reset stuck files: {e}", exc_info=True)
        return jsonify({'success': False, 'error': 'Internal server error'}), 500

Endpoint: POST /api/jobs/reset-stuck
Auth: Requires CSRF token
Response: {"success": true, "message": "Stuck files marked as failed"}
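
One detail worth noting: as written, cleanup_stuck_processing() catches and logs its own exceptions, so the endpoint's 500 branch only fires if the helper lets an error propagate. The sketch below shows, without any web framework, how the two response shapes could be produced when the cleanup is allowed to raise. The reset_stuck_response helper, the cleanup callable argument, and the extra count field are hypothetical and do not appear in dashboard.py.

```python
import logging


def reset_stuck_response(cleanup):
    """Run a cleanup callable and build a (payload, status) pair for the endpoint."""
    try:
        count = cleanup()  # expected to return the number of rows it changed
        payload = {
            "success": True,
            "message": "Stuck files marked as failed",
            "count": count,  # hypothetical extra field, not in the original response
        }
        return payload, 200
    except Exception:
        # Log the details server-side; return only a generic error to the client.
        logging.exception("Failed to reset stuck files")
        return {"success": False, "error": "Internal server error"}, 500


def failing_cleanup():
    """Stand-in for a cleanup that hits a database error."""
    raise RuntimeError("database is locked")


if __name__ == "__main__":
    print(reset_stuck_response(lambda: 3))
    print(reset_stuck_response(failing_cleanup))
```

Note that the failure payload carries its text under the error key, not message, so a client should read result.error when success is false.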


3. UI Button for Manual Reset

Location: templates/dashboard.html:373-375

Added "Reset Stuck" button to control panel:

<button class="btn" onclick="resetStuckFiles()"
        style="background: #8b5cf6; color: white;"
        title="Reset files stuck in processing state">
    🔧 Reset Stuck
</button>

JavaScript Function (templates/dashboard.html:970-990):

async function resetStuckFiles() {
    if (!confirm('This will mark all files stuck in "processing" state as FAILED.\n\nThey can then be retried. Continue?')) {
        return;
    }

    try {
        const response = await fetchWithCsrf('/api/jobs/reset-stuck', {
            method: 'POST'
        });
        const result = await response.json();

        if (result.success) {
            alert('✅ Stuck files have been marked as failed and can be retried!');
            setTimeout(refreshData, 1000);
        } else {
            // The server's failure response carries its text in `error`, not `message`
            alert('Failed to reset stuck files: ' + (result.error || result.message));
        }
    } catch (error) {
        alert('Error resetting stuck files: ' + error.message);
    }
}

How It Works

Automatic Reset (Multi-Trigger)

On Startup:

  1. Dashboard starts
  2. db_reader.cleanup_stuck_processing() is called during startup
  3. Logs: "Checking for files stuck in processing state..."

On File List Load:

  1. User loads dashboard or clicks refresh
  2. /api/files endpoint is called
  3. If NOT actively processing, cleanup runs automatically
  4. Stuck files are marked as failed without any prompt in the UI (the change is still written to the log)

Common Flow:

  1. SQL query finds all files with state = 'processing'
  2. All found files are updated to:
    • state = 'failed'
    • error_message = 'Processing interrupted (application restart or crash)'
    • completed_at = CURRENT_TIMESTAMP
  3. Log message shows how many files were marked as failed

Manual Reset (Via UI)

  1. User clicks "🔧 Reset Stuck" button
  2. Confirmation dialog appears: "This will mark all files stuck in 'processing' state as FAILED. They can then be retried. Continue?"
  3. CSRF-protected POST request to /api/jobs/reset-stuck
  4. Server calls cleanup_stuck_processing()
  5. Success message shown: "Stuck files have been marked as failed and can be retried!"
  6. Dashboard refreshes to show updated states

Database Changes

Query Used:

UPDATE files
SET state = 'failed',
    error_message = 'Processing interrupted (application restart or crash)',
    completed_at = CURRENT_TIMESTAMP
WHERE state = 'processing';

Effect:

  • Files stuck in "processing" → changed to "failed"
  • Error message set to: "Processing interrupted (application restart or crash)"
  • completed_at set to the time the cleanup ran
  • Files appear in "failed" filter and can be retried
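
Because every interrupted file gets the same error message, those files can be told apart from files that failed for other reasons. The following is a rough sketch of that filter, assuming SQLite and a hypothetical path column. The requeue step is included only for illustration: the name of the queued state ('pending' here) is a guess and is not taken from the application.

```python
import sqlite3

INTERRUPTED_MSG = "Processing interrupted (application restart or crash)"


def find_interrupted(conn):
    """Return paths of failed files whose failure was an interruption."""
    rows = conn.execute(
        "SELECT path FROM files WHERE state = 'failed' AND error_message = ? ORDER BY path",
        (INTERRUPTED_MSG,),
    ).fetchall()
    return [row[0] for row in rows]


def requeue_interrupted(conn, queued_state="pending"):
    """Move interrupted files back to a queued state ('pending' is an assumption)."""
    cursor = conn.execute(
        "UPDATE files SET state = ?, error_message = NULL, completed_at = NULL "
        "WHERE state = 'failed' AND error_message = ?",
        (queued_state, INTERRUPTED_MSG),
    )
    conn.commit()
    return cursor.rowcount


def make_failed_demo_db():
    """In-memory table with one interrupted file and one genuine failure."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE files (path TEXT, state TEXT, error_message TEXT, completed_at TEXT)"
    )
    conn.executemany(
        "INSERT INTO files (path, state, error_message) VALUES (?, ?, ?)",
        [
            ("a.mkv", "failed", INTERRUPTED_MSG),
            ("b.mkv", "failed", "Encoder exited with an error"),
            ("c.mkv", "completed", None),
        ],
    )
    conn.commit()
    return conn


if __name__ == "__main__":
    demo = make_failed_demo_db()
    print(find_interrupted(demo))     # only the interrupted file
    print(requeue_interrupted(demo))  # number of rows moved back to the queue
    demo.close()
```

Files that failed for a real encoding error keep their state and message, so they are not retried blindly.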

Testing

Test Scenario 1: Container Restart

# Start encoding
docker exec encoderpro curl -X POST http://localhost:5000/api/jobs/start

# Wait a few seconds, then restart
docker restart encoderpro

# Check logs - should see reset message
# (docker logs replays the container's stderr on stderr, so merge it into the pipe)
docker logs encoderpro 2>&1 | grep "stuck"

Expected Output:

Found 3 file(s) stuck in 'processing' state from previous session
✅ Marked 3 stuck file(s) as failed for retry

Test Scenario 2: Manual Reset

  1. Start processing some files
  2. Stop processing (button or Ctrl+C)
  3. Click "🔧 Reset Stuck" button
  4. Confirm the dialog: "This will mark all files stuck in 'processing' state as FAILED. They can then be retried. Continue?"
  5. Verify success message: "Stuck files have been marked as failed and can be retried!"
  6. Verify files changed from "processing" to "failed"
  7. Check error message shows: "Processing interrupted (application restart or crash)"

Benefits

  1. Automatic Recovery - No manual intervention needed after restarts
  2. Data Integrity - Database state accurately reflects reality (interrupted = failed)
  3. User Control - Manual reset available if needed
  4. Visibility - Log messages show when cleanup occurs
  5. Error Tracking - Files marked with specific reason: "Processing interrupted"
  6. Retry Logic - Failed files can be easily filtered and re-queued
  7. Audit Trail - Completion timestamp records when the interrupted file was cleaned up (the cleanup time, not the exact moment of interruption)

Breaking Changes

None - This is a new feature that enhances existing functionality.


This fix also addresses the CSRF protection issue where POST requests were failing:

Fixed Functions:

  • startProcessing() - Now uses fetchWithCsrf()
  • stopProcessing() - Now uses fetchWithCsrf()
  • scanLibrary() - Now uses fetchWithCsrf()
  • saveEncodingSettings() - Now uses fetchWithCsrf()
  • encodeSelectedFiles() - Now uses fetchWithCsrf()

All POST requests now properly include the CSRF token.


Future Enhancements

Potential improvements:

  1. Add timestamp tracking for how long files have been "processing"
  2. Auto-reset files stuck for more than X hours
  3. Email notification when stuck files are detected
  4. Dashboard widget showing stuck file count
  5. Option to retry vs. skip stuck files
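
Items 1 and 2 could be combined along the following lines. This is only a sketch of one possible approach, assuming SQLite and a hypothetical started_at column that stores UTC timestamps in the same text layout as CURRENT_TIMESTAMP. Neither the column nor the reset_stale_processing function exists in the current code.

```python
import sqlite3
from datetime import datetime, timedelta, timezone

FMT = "%Y-%m-%d %H:%M:%S"  # text layout produced by SQLite's CURRENT_TIMESTAMP


def reset_stale_processing(conn, max_age_hours, now=None):
    """Fail only rows that have been 'processing' longer than max_age_hours."""
    now = now or datetime.now(timezone.utc)
    cutoff = (now - timedelta(hours=max_age_hours)).strftime(FMT)
    cursor = conn.execute(
        """
        UPDATE files
        SET state = 'failed',
            error_message = 'Processing interrupted (application restart or crash)',
            completed_at = CURRENT_TIMESTAMP
        WHERE state = 'processing' AND started_at < ?
        """,
        (cutoff,),  # strings in this layout sort chronologically
    )
    conn.commit()
    return cursor.rowcount


def make_stale_demo_db():
    """In-memory table with one long-running and one recently started file."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE files (path TEXT, state TEXT, error_message TEXT, "
        "started_at TEXT, completed_at TEXT)"
    )
    conn.executemany(
        "INSERT INTO files (path, state, started_at) VALUES (?, ?, ?)",
        [
            ("old.mkv", "processing", "2024-12-20 03:00:00"),
            ("fresh.mkv", "processing", "2024-12-20 11:30:00"),
        ],
    )
    conn.commit()
    return conn


if __name__ == "__main__":
    demo = make_stale_demo_db()
    fixed_now = datetime(2024, 12, 20, 12, 0, 0)  # fixed clock for a repeatable demo
    print(reset_stale_processing(demo, max_age_hours=6, now=fixed_now))
    demo.close()
```

An age threshold like this would also make the cleanup safe to run while processing is active, since recently started files are left untouched.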

Version History

  • v3.2.0 - Initial implementation of stuck processing fix