{"title":"Path Traversal via os.path.join on Untrusted Filenames","language":"Python","severity":"High","cwe":"CWE-22","source_lines":[9],"flow_lines":[9,10,11],"sink_lines":[11],"vulnerable_code":"import os\nfrom flask import Flask, request, send_file\n\napp = Flask(__name__)\nMODEL_STORAGE = '/var/ml/trained_models'\n\n@app.route('/api/v1/download-model', methods=['GET'])\ndef fetch_trained_model():\n    model_id = request.args.get('model_id', 'default.h5')\n    model_path = os.path.join(MODEL_STORAGE, model_id)\n    if os.path.exists(model_path):\n        return send_file(model_path, as_attachment=True)\n    return {'error': 'Model not found'}, 404","explanation":"The application accepts user-controlled input from the 'model_id' parameter without validation and directly uses it in os.path.join() to construct a file path. An attacker can use path traversal sequences (../) to escape the MODEL_STORAGE directory and access arbitrary files on the filesystem, which are then served via send_file().","remediation":"The fix applies two layers of defense: first, os.path.basename() strips any directory traversal components (like ../) from the user input, ensuring only the filename portion is used. Second, os.path.realpath() resolves symlinks and verifies the final path is still within the allowed MODEL_STORAGE directory, preventing any bypass via symbolic links or edge cases.","secure_code":"import os\nfrom flask import Flask, request, send_file\n\napp = Flask(__name__)\nMODEL_STORAGE = '/var/ml/trained_models'\n\n@app.route('/api/v1/download-model', methods=['GET'])\ndef fetch_trained_model():\n    model_id = request.args.get('model_id', 'default.h5')\n    # Sanitize: strip any directory components from the model_id\n    safe_model_id = os.path.basename(model_id)\n    if not safe_model_id:\n        return {'error': 'Invalid model identifier'}, 400\n    model_path = os.path.join(MODEL_STORAGE, safe_model_id)\n    # Verify the resolved path is still within MODEL_STORAGE\n    real_model_path = os.path.realpath(model_path)\n    real_storage = os.path.realpath(MODEL_STORAGE)\n    if not real_model_path.startswith(real_storage + os.sep):\n        return {'error': 'Invalid model identifier'}, 400\n    if os.path.exists(real_model_path):\n        return send_file(real_model_path, as_attachment=True)\n    return {'error': 'Model not found'}, 404"}