# Unsafe tarfile Extraction via Path Traversal

Language: Python
Severity: Critical
CWE: CWE-22

## Source
1

## Flow
1-4-6-7-8

## Sink
8

## Vulnerable Code
```python
import tarfile
import os

def deploy_iot_firmware_package(firmware_archive, device_id):
    deployment_dir = f"/opt/iot/devices/{device_id}/firmware"
    os.makedirs(deployment_dir, exist_ok=True)
    with tarfile.open(firmware_archive, 'r:gz') as tar_pkg:
        for firmware_component in tar_pkg.getmembers():
            tar_pkg.extract(firmware_component, path=deployment_dir)
    return {"status": "deployed", "location": deployment_dir}
```

## Explanation

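The function opens the archive at `firmware_archive` and extracts every member into the deployment directory without checking where each member's name resolves. When no extraction filter is in effect, `tarfile` joins the member name onto the destination path, so a name that contains `../` sequences or is an absolute path lands outside that directory. An attacker who can supply the archive can include an entry such as `../../../etc/cron.d/backdoor` and write files outside the intended deployment directory with the privileges of the deploying process.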
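
To make the attack concrete, the following is a minimal sketch that builds such an archive in memory and reports where its entry would land, without extracting anything. The helper names `build_traversal_archive` and `resolved_targets` and the device ID `42` are hypothetical, chosen only for illustration. Five `../` components are used because this example deployment path is five levels deep; `posixpath` is used because tar member names use forward slashes.

```python
import io
import posixpath
import tarfile

def build_traversal_archive(entry_name, payload=b"demo"):
    """Return the bytes of a tar.gz that holds one regular file named entry_name."""
    buffer = io.BytesIO()
    with tarfile.open(fileobj=buffer, mode="w:gz") as archive:
        info = tarfile.TarInfo(name=entry_name)
        info.size = len(payload)
        archive.addfile(info, io.BytesIO(payload))
    return buffer.getvalue()

def resolved_targets(archive_bytes, deployment_dir):
    """Show where each member name resolves when joined onto deployment_dir.

    Nothing is extracted; this only inspects member names.
    """
    with tarfile.open(fileobj=io.BytesIO(archive_bytes), mode="r:gz") as archive:
        return [
            posixpath.normpath(posixpath.join(deployment_dir, member.name))
            for member in archive.getmembers()
        ]

# Hypothetical deployment path for device "42" (five directory levels deep).
archive_bytes = build_traversal_archive("../../../../../etc/cron.d/backdoor")
targets = resolved_targets(archive_bytes, "/opt/iot/devices/42/firmware")
print(targets)  # ['/etc/cron.d/backdoor']
```

The writer side of `tarfile` does not sanitize member names, so producing a hostile archive takes only a few lines.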

## Remediation

The fix validates each tar member before extracting it. It joins the member name onto the deployment directory, resolves the result, and confirms that the resolved path is the deployment directory itself or a path beneath it. Entries that escape the directory through `../` sequences or absolute paths are rejected with an error. The targets of symbolic and hard links are validated the same way, so an archive cannot plant a link that points outside the deployment directory and then write through it.
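
The containment test at the heart of this fix can be sketched on its own. The helper name `is_within_directory` is hypothetical. This version compares whole path components with `os.path.commonpath` rather than raw string prefixes, because a plain `startswith` test without a trailing separator would wrongly accept a sibling directory such as `firmware-evil`.

```python
import os

def is_within_directory(base_dir, candidate):
    """Return True if candidate resolves to base_dir or to a path beneath it.

    candidate may be relative (joined onto base_dir) or absolute. Symlinks that
    already exist on disk are resolved before the comparison.
    """
    base = os.path.realpath(base_dir)
    target = os.path.realpath(os.path.join(base, candidate))
    # commonpath compares whole components, so ".../firmware-evil" does not
    # count as being inside ".../firmware".
    return os.path.commonpath([base, target]) == base

# Hypothetical deployment directory used only for this demonstration.
base = "/opt/iot/devices/42/firmware"
print(is_within_directory(base, "bin/app"))                  # inside
print(is_within_directory(base, "../../etc/passwd"))         # escapes via ../
print(is_within_directory(base, "/etc/passwd"))              # absolute path
print(is_within_directory(base, base + "-evil/payload"))     # sibling with shared prefix
```

Only the first call reports the path as contained; the other three are rejected.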

## Secure Code
```python
import tarfile
import os

def deploy_iot_firmware_package(firmware_archive, device_id):
    deployment_dir = f"/opt/iot/devices/{device_id}/firmware"
    os.makedirs(deployment_dir, exist_ok=True)
    # Resolve the base once so every comparison uses the same canonical path.
    base_dir = os.path.realpath(deployment_dir)

    def is_inside(path):
        # True only for base_dir itself or paths beneath it (symlinks resolved).
        return os.path.commonpath([base_dir, os.path.realpath(path)]) == base_dir

    with tarfile.open(firmware_archive, 'r:gz') as tar_pkg:
        for firmware_component in tar_pkg.getmembers():
            member_path = os.path.join(base_dir, firmware_component.name)
            if not is_inside(member_path):
                raise ValueError(f"Attempted path traversal in firmware archive: {firmware_component.name}")
            if firmware_component.issym():
                # Symlink targets are relative to the directory that holds the link.
                link_target = os.path.join(os.path.dirname(member_path), firmware_component.linkname)
            elif firmware_component.islnk():
                # Hard link targets are relative to the archive root.
                link_target = os.path.join(base_dir, firmware_component.linkname)
            else:
                link_target = None
            if link_target is not None and not is_inside(link_target):
                raise ValueError(f"Symbolic/hard link points outside deployment directory: {firmware_component.name}")
            tar_pkg.extract(firmware_component, path=base_dir)
    return {"status": "deployed", "location": deployment_dir}
```
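
As a complementary option, Python 3.12 added extraction filters to `tarfile`, and some earlier security releases carry a backport. The sketch below assumes such an interpreter and uses the built-in `"data"` filter, which refuses members that would be written outside the destination. The function name `extract_with_data_filter` is hypothetical, and the `hasattr` guard makes the sketch fail closed on interpreters that lack the feature.

```python
import os
import tarfile

def extract_with_data_filter(firmware_archive, deployment_dir):
    """Extract a tar.gz using tarfile's built-in "data" extraction filter.

    Assumes an interpreter that ships extraction filters; raises RuntimeError
    otherwise instead of falling back to unfiltered extraction.
    """
    if not hasattr(tarfile, "data_filter"):
        raise RuntimeError("this Python build has no tarfile extraction filters")
    os.makedirs(deployment_dir, exist_ok=True)
    with tarfile.open(firmware_archive, "r:gz") as tar_pkg:
        # With the default errorlevel, a rejected member aborts the extraction
        # with a tarfile.FilterError subclass.
        tar_pkg.extractall(path=deployment_dir, filter="data")
```

Relying on the standard library's filter keeps the validation logic out of application code, at the cost of depending on the interpreter version. Where that version cannot be guaranteed, the manual checks in the secure code remain necessary.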
