{"title":"XML External Entity (XXE) via lxml.etree.fromstring with entity resolution","language":"Python","severity":"Critical","cwe":"CWE-611","source_lines":[6],"flow_lines":[6,7],"sink_lines":[7],"vulnerable_code":"from lxml import etree\nimport boto3\n\ndef process_cloud_config_manifest(manifest_xml, bucket_name):\n    s3_client = boto3.client('s3')\n    parser = etree.XMLParser(resolve_entities=True, no_network=False)\n    config_tree = etree.fromstring(manifest_xml.encode(), parser)\n    deployment_region = config_tree.find('.//Region').text\n    instance_type = config_tree.find('.//InstanceType').text\n    security_groups = [sg.text for sg in config_tree.findall('.//SecurityGroup')]\n    metadata = {'region': deployment_region, 'type': instance_type, 'sg': security_groups}\n    s3_client.put_object(Bucket=bucket_name, Key=f'deployments/{deployment_region}/config.json', Body=str(metadata))\n    return {'status': 'deployed', 'region': deployment_region, 'metadata': metadata}","explanation":"The code parses untrusted XML (manifest_xml) with an lxml parser configured with resolve_entities=True and no_network=False, which enables XXE attacks. An attacker can declare external entities in the manifest to read local files, perform SSRF, or cause denial of service through entity expansion. Because element text is copied into the returned metadata and the S3 object, resolved entity content can leak out of the process.","remediation":"The fix hardens the XML parser: resolve_entities=False stops entity references from being expanded, no_network=True blocks network-based entity resolution, and load_dtd=False with dtd_validation=False explicitly disables DTD loading and validation. These settings prevent attackers from injecting external entity declarations that could read local files, perform SSRF, or cause denial of service.","secure_code":"from lxml import etree\nimport boto3\n\ndef process_cloud_config_manifest(manifest_xml, bucket_name):\n    s3_client = boto3.client('s3')\n    parser = etree.XMLParser(resolve_entities=False, no_network=True, dtd_validation=False, load_dtd=False)\n    config_tree = etree.fromstring(manifest_xml.encode(), parser)\n    deployment_region = config_tree.find('.//Region').text\n    instance_type = config_tree.find('.//InstanceType').text\n    security_groups = [sg.text for sg in config_tree.findall('.//SecurityGroup')]\n    metadata = {'region': deployment_region, 'type': instance_type, 'sg': security_groups}\n    s3_client.put_object(Bucket=bucket_name, Key=f'deployments/{deployment_region}/config.json', Body=str(metadata))\n    return {'status': 'deployed', 'region': deployment_region, 'metadata': metadata}"}