# Path Traversal via os.path.join on Untrusted Filenames

Language: Python
Severity: High
CWE: CWE-22

## Source
9

## Flow
9-10-11

## Sink
11

## Vulnerable Code
```python
import os
from flask import Flask, request, send_file

app = Flask(__name__)
MODEL_STORAGE = '/var/ml/trained_models'

@app.route('/api/v1/download-model', methods=['GET'])
def fetch_trained_model():
    model_id = request.args.get('model_id', 'default.h5')
    model_path = os.path.join(MODEL_STORAGE, model_id)
    if os.path.exists(model_path):
        return send_file(model_path, as_attachment=True)
    return {'error': 'Model not found'}, 404
```

## Explanation

The application takes the user-controlled `model_id` query parameter and passes it to `os.path.join()` without validation to construct a file path. An attacker can supply path traversal sequences (`../`) to escape the `MODEL_STORAGE` directory, or supply an absolute path, which causes `os.path.join()` to discard `MODEL_STORAGE` entirely. In both cases the resulting path can point to any file readable by the application process, which `send_file()` then serves to the attacker.
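The path construction can be reproduced outside Flask. The snippet below is a minimal sketch, not part of the application: `resolve_naively` is a hypothetical helper, and it uses `posixpath` (the POSIX flavour of `os.path`) with `normpath` only so that the printed result is easy to read and identical on any platform.

```python
import posixpath  # POSIX flavour of os.path, so results do not depend on the host OS

MODEL_STORAGE = '/var/ml/trained_models'

def resolve_naively(model_id):
    """Hypothetical helper: build the path the way the vulnerable code does,
    then normalise it to show where it actually points."""
    return posixpath.normpath(posixpath.join(MODEL_STORAGE, model_id))

print(resolve_naively('default.h5'))           # /var/ml/trained_models/default.h5
print(resolve_naively('../../../etc/passwd'))  # /etc/passwd
print(resolve_naively('/etc/passwd'))          # /etc/passwd
```

The second and third calls both leave the storage directory: one by climbing out with `../`, the other because an absolute second argument replaces the base path.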

## Remediation

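The fix applies two layers of defense. First, `os.path.basename()` discards everything up to the last path separator, so directory components such as `../` or a leading `/` are dropped and only the final component of the user input is used. Second, `os.path.realpath()` resolves symlinks and relative components, and the subsequent `startswith()` check against `real_storage + os.sep` confirms that the resolved path is still inside `MODEL_STORAGE`. This second check catches cases the first layer misses, such as a symlink inside the storage directory that points elsewhere, or a `model_id` of `..`, which `os.path.basename()` returns unchanged.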
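To illustrate what the first layer does and does not cover, the following sketch runs a few sample inputs through `basename`. The sample values are illustrative assumptions, and `posixpath` is used in place of `os.path` so the behaviour shown is the POSIX one regardless of host platform.

```python
import posixpath  # POSIX flavour of os.path

# Illustrative inputs an attacker or client might send as model_id
samples = ['default.h5', '../../etc/passwd', '/etc/passwd', 'models/', '..']

# Map each raw input to the final path component that basename keeps
stripped = {raw: posixpath.basename(raw) for raw in samples}

for raw, name in stripped.items():
    print(repr(raw), '->', repr(name))
# 'default.h5' -> 'default.h5'
# '../../etc/passwd' -> 'passwd'
# '/etc/passwd' -> 'passwd'
# 'models/' -> ''
# '..' -> '..'
```

The empty result for `'models/'` is why the secure code rejects an empty identifier, and the unchanged `'..'` is one reason the resolved-path check is still needed.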

## Secure Code
```python
import os
from flask import Flask, request, send_file

app = Flask(__name__)
MODEL_STORAGE = '/var/ml/trained_models'

@app.route('/api/v1/download-model', methods=['GET'])
def fetch_trained_model():
    model_id = request.args.get('model_id', 'default.h5')
    # Sanitize: strip any directory components from the model_id
    safe_model_id = os.path.basename(model_id)
    if not safe_model_id:
        return {'error': 'Invalid model identifier'}, 400
    model_path = os.path.join(MODEL_STORAGE, safe_model_id)
    # Verify the resolved path is still within MODEL_STORAGE
    real_model_path = os.path.realpath(model_path)
    real_storage = os.path.realpath(MODEL_STORAGE)
    if not real_model_path.startswith(real_storage + os.sep):
        return {'error': 'Invalid model identifier'}, 400
    # Serve regular files only; a directory name would make send_file fail
    if os.path.isfile(real_model_path):
        return send_file(real_model_path, as_attachment=True)
    return {'error': 'Model not found'}, 404
```
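The containment check in the secure code can also be written with `os.path.commonpath()`, which compares whole path components rather than string prefixes. The following is a sketch of that alternative, not the fix used above; `is_within_directory` is a hypothetical helper name, and a temporary directory stands in for `MODEL_STORAGE`.

```python
import os
import tempfile

def is_within_directory(base_dir, candidate):
    """Hypothetical helper: True if candidate resolves strictly inside base_dir."""
    real_base = os.path.realpath(base_dir)
    real_candidate = os.path.realpath(candidate)
    if real_candidate == real_base:
        return False  # the directory itself is not a downloadable file
    try:
        # commonpath works on whole components, so a sibling such as
        # '<base>_old' is not mistaken for a child of '<base>'
        return os.path.commonpath([real_base, real_candidate]) == real_base
    except ValueError:
        # Raised on Windows when the two paths are on different drives
        return False

storage = tempfile.mkdtemp()  # stand-in for MODEL_STORAGE
print(is_within_directory(storage, os.path.join(storage, 'default.h5')))    # True
print(is_within_directory(storage, os.path.join(storage, '..', 'secret')))  # False
print(is_within_directory(storage, storage + '_old'))                       # False
```

Either form works as the second layer, provided both paths are resolved with `os.path.realpath()` before they are compared.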
