{"title":"XML Entity Expansion DoS via Billion Laughs in lxml","language":"Python","severity":"High","cwe":"CWE-776","source_lines":[3],"flow_lines":[3,4,5,6],"sink_lines":[6],"vulnerable_code":"from lxml import etree\nimport io\n\ndef process_iot_device_config(xml_payload):\n    parser = etree.XMLParser(resolve_entities=True)\n    config_stream = io.BytesIO(xml_payload.encode('utf-8'))\n    device_tree = etree.parse(config_stream, parser)\n    root = device_tree.getroot()\n    device_id = root.find('.//device_id').text\n    firmware_ver = root.find('.//firmware').text\n    settings = {child.tag: child.text for child in root.find('.//settings')}\n    return {'device': device_id, 'firmware': firmware_ver, 'config': settings}","explanation":"The function parses untrusted XML with a parser that explicitly sets `resolve_entities=True`, so entity references declared in the document's DTD are expanded during parsing. This exposes it to an XML Entity Expansion attack (Billion Laughs): an attacker supplies a small document whose entity definitions are nested, each one referencing the previous entity several times, so the expanded content grows exponentially. Parsing such a payload can exhaust memory and CPU, causing a denial of service.","remediation":"The fix sets `resolve_entities=False`, so entity references are no longer replaced by their text values during parsing. It also hardens the parser with `load_dtd=False` and `dtd_validation=False` (the DTD is neither used for parsing nor validated against), `no_network=True` (no network access when looking up external documents), and `huge_tree=False` (libxml2's built-in security restrictions on tree depth and text size stay in place). Together these settings ensure that nested entity definitions in a DTD are not expanded, which prevents the Billion Laughs DoS attack and reduces exposure to related XML attacks such as external entity (XXE) injection.","secure_code":"from lxml import etree\nimport io\n\ndef process_iot_device_config(xml_payload):\n    # Hardened parser: entities are not expanded, no DTD is loaded, no network access\n    parser = etree.XMLParser(\n        resolve_entities=False,\n        no_network=True,\n        dtd_validation=False,\n        load_dtd=False,\n        huge_tree=False\n    )\n    config_stream = io.BytesIO(xml_payload.encode('utf-8'))\n    device_tree = etree.parse(config_stream, parser)\n    root = device_tree.getroot()\n    device_id = root.find('.//device_id').text\n    firmware_ver = root.find('.//firmware').text\n    settings = {child.tag: child.text for child in root.find('.//settings')}\n    return {'device': device_id, 'firmware': firmware_ver, 'config': settings}"}