{"title":"Pickle Deserialization via `pickle.loads()`","language":"Python","severity":"Critical","cwe":"CWE-502","source_lines":[6],"flow_lines":[6,10,11],"sink_lines":[11],"vulnerable_code":"import pickle\nimport base64\nfrom flask import request, jsonify\n\ndef restore_ml_model_state():\n    encrypted_state = request.headers.get('X-Model-Checkpoint')\n    if not encrypted_state:\n        return jsonify({'error': 'No checkpoint provided'}), 400\n    try:\n        decoded_checkpoint = base64.b64decode(encrypted_state)\n        model_params = pickle.loads(decoded_checkpoint)\n        return jsonify({'status': 'Model restored', 'layers': model_params.get('layers', 0)})\n    except Exception as e:\n        return jsonify({'error': 'Restoration failed'}), 500","explanation":"The application reads attacker-controlled data from the X-Model-Checkpoint HTTP header, base64-decodes it, and passes the result directly to pickle.loads(). Despite the variable name encrypted_state, the value is only base64-encoded; it is neither encrypted nor authenticated. Because pickle can invoke arbitrary callables during deserialization (for example through an object's __reduce__ method), an attacker can craft a malicious payload that achieves remote code execution on the server.","remediation":"The fix replaces `pickle.loads()` with `json.loads()`, which eliminates code execution during deserialization because JSON decodes only to plain data types (objects, arrays, strings, numbers, booleans, and null). In addition, an HMAC-SHA256 signature is verified with a constant-time comparison before the payload is parsed, so only checkpoints signed with the server's secret key are accepted, which prevents tampering. 
Validating the parsed parameters against an allowlist of keys and value types adds defense in depth.","secure_code":"import json\nimport base64\nimport hmac\nimport hashlib\nfrom flask import request, jsonify, current_app\n\n# Only these top-level keys and JSON-native value types are accepted.\nALLOWED_KEYS = {'layers', 'weights', 'optimizer_state', 'epoch', 'loss', 'learning_rate'}\nALLOWED_VALUE_TYPES = (int, float, str, list, bool, type(None))\n\ndef validate_model_params(params: dict) -> bool:\n    if not isinstance(params, dict):\n        return False\n    for key, value in params.items():\n        if key not in ALLOWED_KEYS:\n            return False\n        if not isinstance(value, ALLOWED_VALUE_TYPES):\n            return False\n    return True\n\ndef restore_ml_model_state():\n    encrypted_state = request.headers.get('X-Model-Checkpoint')\n    checkpoint_signature = request.headers.get('X-Checkpoint-Signature')\n    if not encrypted_state:\n        return jsonify({'error': 'No checkpoint provided'}), 400\n    if not checkpoint_signature:\n        return jsonify({'error': 'No checkpoint signature provided'}), 400\n    try:\n        decoded_checkpoint = base64.b64decode(encrypted_state)\n        secret_key = current_app.config.get('CHECKPOINT_SECRET_KEY', '')\n        if not secret_key:\n            return jsonify({'error': 'Server configuration error'}), 500\n        # Verify the HMAC-SHA256 signature before parsing the payload.\n        expected_sig = hmac.new(secret_key.encode(), decoded_checkpoint, hashlib.sha256).hexdigest()\n        # Compare as bytes: compare_digest raises TypeError on non-ASCII str input.\n        if not hmac.compare_digest(expected_sig.encode(), checkpoint_signature.encode()):\n            return jsonify({'error': 'Invalid checkpoint signature'}), 403\n        model_params = json.loads(decoded_checkpoint)\n        if not validate_model_params(model_params):\n            return jsonify({'error': 'Invalid model parameters'}), 400\n        return jsonify({'status': 'Model restored', 'layers': model_params.get('layers', 0)})\n    except (json.JSONDecodeError, ValueError):\n        # Covers malformed base64 (binascii.Error) and malformed or non-UTF-8 JSON.\n        return jsonify({'error': 'Invalid checkpoint format'}), 400\n    except Exception:\n        return 
jsonify({'error': 'Restoration failed'}), 500"}