{"title":"Pickle Deserialization via `__reduce__` Arbitrary Code Execution","language":"Python","severity":"Critical","cwe":"CWE-502","source_lines":[9],"flow_lines":[9,10,11],"sink_lines":[11],"vulnerable_code":"import pickle\nimport base64\nfrom flask import Flask, request, jsonify\n\napp = Flask(__name__)\n\n@app.route('/iot/device/restore', methods=['POST'])\ndef restore_device_config():\n    encoded_state = request.json.get('device_state')\n    device_data = base64.b64decode(encoded_state)\n    restored_config = pickle.loads(device_data)\n    return jsonify({'status': 'restored', 'device_id': restored_config.get('device_id'), 'firmware': restored_config.get('firmware_version')})\n\nif __name__ == '__main__':\n    app.run(host='0.0.0.0', port=5000)","explanation":"The application accepts user-controlled data from a POST request, base64 decodes it, and directly deserializes it using pickle.loads() without validation. Python's pickle module can execute arbitrary code during deserialization through __reduce__ methods, allowing attackers to achieve remote code execution.","remediation":"The fix replaces pickle deserialization with JSON parsing, which cannot execute arbitrary code during deserialization. Additionally, input validation ensures the decoded data conforms to expected structure and types, and an optional HMAC signature verification provides defense-in-depth to ensure data integrity from trusted devices.","secure_code":"import json\nimport base64\nimport hmac\nimport hashlib\nfrom flask import Flask, request, jsonify\n\napp = Flask(__name__)\n\n# Secret key for HMAC validation (should be stored securely, e.g., environment variable)\nDEVICE_STATE_SECRET = b'your-secure-secret-key-here'\n\nALLOWED_FIELDS = {'device_id', 'firmware_version', 'config', 'network_settings', 'sensors', 'last_sync'}\n\ndef validate_device_config(data):\n    \"\"\"Validate that the deserialized config only contains expected fields and safe types.\"\"\"\n    if not isinstance(data, dict):\n        return False\n    if not data.keys() <= ALLOWED_FIELDS:\n        return False\n    # Ensure device_id is a string\n    if 'device_id' in data and not isinstance(data['device_id'], str):\n        return False\n    # Ensure firmware_version is a string\n    if 'firmware_version' in data and not isinstance(data['firmware_version'], str):\n        return False\n    return True\n\n@app.route('/iot/device/restore', methods=['POST'])\ndef restore_device_config():\n    request_data = request.json\n    if not request_data or 'device_state' not in request_data:\n        return jsonify({'status': 'error', 'message': 'Missing device_state'}), 400\n\n    encoded_state = request_data.get('device_state')\n    provided_signature = request_data.get('signature')\n\n    try:\n        device_data = base64.b64decode(encoded_state)\n    except Exception:\n        return jsonify({'status': 'error', 'message': 'Invalid base64 encoding'}), 400\n\n    # Verify HMAC signature if provided (defense in depth)\n    if provided_signature:\n        expected_signature = hmac.new(DEVICE_STATE_SECRET, device_data, hashlib.sha256).hexdigest()\n        if not hmac.compare_digest(provided_signature, expected_signature):\n            return jsonify({'status': 'error', 'message': 'Invalid signature'}), 403\n\n    # Use JSON instead of pickle for safe deserialization\n    try:\n        restored_config = json.loads(device_data)\n    except (json.JSONDecodeError, UnicodeDecodeError):\n        return jsonify({'status': 'error', 'message': 'Invalid JSON configuration data'}), 400\n\n    # Validate the structure of the restored config\n    if not validate_device_config(restored_config):\n        return jsonify({'status': 'error', 'message': 'Invalid configuration structure'}), 400\n\n    return jsonify({\n        'status': 'restored',\n        'device_id': restored_config.get('device_id'),\n        'firmware': restored_config.get('firmware_version')\n    })\n\nif __name__ == '__main__':\n    app.run(host='0.0.0.0', port=5000)"}