• Malicious PDF analysis

    Part 1

    This post covers my analysis of a malicious PDF file that exploits a stack-based buffer overflow vulnerability in Adobe Acrobat Reader. Before analyzing the malicious PDF, it is necessary to understand how a PDF file is structured, so I will explain that first. A PDF document can include images, fonts, text, JavaScript, Flash and other content used to display the document.


    PDF document structure
    I will not explain the full PDF structure here, only the parts required for the analysis. A PDF document starts with a header of the form %PDF-x.y (for example %PDF-1.1), which gives the version of the PDF language. PDF readers expect this header and may refuse to open a file without it.
    A PDF document also consists of multiple indirect objects; an example is shown below. Each indirect object has an object id (1 in the example below) and a version number, formally called the generation number (0 in the example below), followed by the keyword obj. The keyword endobj marks the end of the object.

    Code:
    %PDF-1.1        <------ PDF header
    
    1 0 obj            <-------- Indirect object, 1 is the object id and 0 is the version number
    
    ........
    .....
    endobj        <------ This marks the end of object 1
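    The header and object layout described above can be checked with a few lines of Python. This is only a minimal sketch: the function names parse_header and list_objects are made up for illustration, the 1024-byte search window is an assumption, and a regex scan like this can be fooled by stream data that happens to contain the word endobj.

```python
import re

# Header such as %PDF-1.1, and indirect objects of the form "<id> <gen> obj ... endobj".
HEADER_RE = re.compile(rb"%PDF-(\d+)\.(\d+)")
OBJ_RE = re.compile(rb"(\d+)\s+(\d+)\s+obj\b(.*?)endobj", re.DOTALL)


def parse_header(data):
    """Return the PDF version string from the header, or None if it is missing."""
    # Only look near the start of the file (the window size is an assumption).
    match = HEADER_RE.search(data[:1024])
    if match is None:
        return None
    return "%d.%d" % (int(match.group(1)), int(match.group(2)))


def list_objects(data):
    """Return (object id, generation number, body) for every indirect object found."""
    return [
        (int(m.group(1)), int(m.group(2)), m.group(3).strip())
        for m in OBJ_RE.finditer(data)
    ]


if __name__ == "__main__":
    sample = b"%PDF-1.1\n1 0 obj\n<< /Type /Catalog >>\nendobj\n"
    print(parse_header(sample))
    for obj_id, gen, body in list_objects(sample):
        print(obj_id, gen, body)
```

    On a real sample you would read the file in binary mode and pass the bytes to both functions.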
    Inside an object there is a series of tags describing the contents of the object or referencing another object. In the example below, object 7 contains JavaScript (indicated by the /JavaScript tag), and the /JS tag holds the JavaScript content. Object 31 instead holds a reference to a JavaScript object (object id 34), indicated by /JS 34 0 R; the R stands for Reference.
    One of the objects essential for analyzing malicious PDF files is the stream object. In the example below, object 34 contains a stream object. A stream object holds a stream of data between the keywords stream and endstream. This data is often compressed, which is why it looks like meaningless data. In the example, /FlateDecode indicates that the data stream is compressed with the zlib compression algorithm.
    There are also different types of tags, for example:
    /JS and /JavaScript indicate JavaScript,
    /RichMedia indicates Flash,
    /AA and /OpenAction indicate an automatic action to be performed when the document is viewed.
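    As a quick triage step, the tags listed above can be counted in the raw bytes of a file. The sketch below is an illustration only: count_tags and SUSPICIOUS_TAGS are hypothetical names, the tag list is limited to the ones mentioned in this post, and the scan will miss tags that are obfuscated with #xx hex escapes or hidden inside compressed streams.

```python
import re

# Tags named in the text; illustrative, not an exhaustive list.
SUSPICIOUS_TAGS = [b"/JS", b"/JavaScript", b"/RichMedia", b"/AA", b"/OpenAction"]


def count_tags(data):
    """Count how often each tag appears as a whole name in raw PDF bytes."""
    counts = {}
    for tag in SUSPICIOUS_TAGS:
        # The lookahead stops /AA from matching a longer name such as /AAPL.
        pattern = re.compile(re.escape(tag) + rb"(?![A-Za-z0-9])")
        counts[tag.decode("ascii")] = len(pattern.findall(data))
    return counts


if __name__ == "__main__":
    sample = b"<< /S /JavaScript /JS 34 0 R >> << /OpenAction 7 0 R >>"
    for name, count in count_tags(sample).items():
        print(name, count)
```

    A non-zero count for /JS or /JavaScript together with /AA or /OpenAction is a reason to look closer, not proof that the file is malicious.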

    Code:
    7 0 obj                              <---- object 7
    <<
    /Type /Action
    /S /JavaScript                <---- javascript tag
    /JS <javascript code>      <---- javascript content
    ...............
    .............
    >>
    endobj
    
    
    obj 31 0                                    <----- object 31
    Type:
    Referencing: 34 0 R
    [(2, '<<'), (2, '/S'), (2, '/JavaScript'), (2, '/JS'), (1, ' '), (3, '34'), (1, ' '), (3, '0'), (1, ' '), (3, 'R'), (2, '>>'), (1, '\r')]
    <<
    /S /JavaScript
    /JS 34 0 R                          <------ reference to a javascript object, in this case object 34
    >>
    endobj
    
    
    34 0 obj<</Subtype/Type1C/Length 5416/Filter/FlateDecode
    >>stream                                                                     <--- stream object
    H‰|T}T#W#Ÿ!d&"FI#ʼnNFW#åC                      <---- compressed stream content, in this case zlib compressed data indicated by /FlateDecode
    …
    endstream
    endobj
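    The chain shown above (a /JS reference pointing to an object whose stream is /FlateDecode compressed) can be followed by hand with Python's zlib module. This is a rough sketch under several assumptions: the helper names are hypothetical, the stream uses only the /FlateDecode filter, and objects are located with a simple regex instead of the cross-reference table.

```python
import re
import zlib


def find_js_reference(data):
    """Return (object id, generation) from the first '/JS n g R', or None."""
    match = re.search(rb"/JS\s+(\d+)\s+(\d+)\s+R\b", data)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def extract_stream(data, obj_id, gen):
    """Return the raw bytes between 'stream' and 'endstream' of one object."""
    obj_re = re.compile(
        rb"(?<!\d)%d\s+%d\s+obj\b(.*?)endobj" % (obj_id, gen), re.DOTALL
    )
    obj_match = obj_re.search(data)
    if obj_match is None:
        return None
    stream_match = re.search(
        rb"stream\r?\n(.*?)endstream", obj_match.group(1), re.DOTALL
    )
    if stream_match is None:
        return None
    return stream_match.group(1)


def inflate(raw):
    """Decompress /FlateDecode (zlib) data; bytes after the zlib stream are ignored."""
    return zlib.decompressobj().decompress(raw)


if __name__ == "__main__":
    # Made-up sample that mirrors objects 31 and 34 from the example above.
    script = b"app.alert('hi');"
    sample = (
        b"31 0 obj\n<< /S /JavaScript /JS 34 0 R >>\nendobj\n"
        b"34 0 obj\n<< /Filter /FlateDecode >>\nstream\n"
        + zlib.compress(script)
        + b"\nendstream\nendobj\n"
    )
    ref = find_js_reference(sample)
    if ref is not None:
        raw = extract_stream(sample, *ref)
        if raw is not None:
            print(inflate(raw))
```

    decompressobj is used instead of zlib.decompress so that the end-of-line bytes before endstream do not need to be stripped first.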
    Continued

    Malicious PDF analysis Tutorial Part 1
    Malicious PDF analysis Tutorial Part 2
    Malicious PDF analysis Tutorial Part 3 Extracting Javascript
    Malicious PDF analysis Tutorial Part 4
    Part 5 Malicious PDF analysis Tutorial Shellcode analysis
    This article was originally published in the forum thread "Malicious PDF analysis", started by m0nna.