
Gray Hat Hacking: The Ethical Hacker’s Handbook 318

    .text:0804874B    mov     eax, [ebp+arg_0]
    .text:0804874E    push    dword ptr [eax]
    .text:08048750    call    sub_8057850
    .text:08048755    add     esp, 10h

yields the following improved disassembly, in which we are far less likely to waste time analyzing any of the three functions that are called.

    .text:0804872C    push    ebp
    .text:0804872D    mov     ebp, esp
    .text:0804872F    sub     esp, 18h
    .text:08048732    call    ___sys_getuid
    .text:08048737    mov     [ebp+var_4], eax
    .text:0804873A    call    ___sys_getgid
    .text:0804873F    mov     [ebp+var_8], eax
    .text:08048742    sub     esp, 8
    .text:08048745    mov     eax, [ebp+arg_0]
    .text:08048748    push    dword ptr [eax+0Ch]
    .text:0804874B    mov     eax, [ebp+arg_0]
    .text:0804874E    push    dword ptr [eax]
    .text:08048750    call    _initgroups
    .text:08048755    add     esp, 10h

We have not covered how to identify exactly which static library files to use when generating your IDA sig files. It is safe to assume that statically linked C programs are linked against the static C library. To generate accurate signatures, it is important to track down a version of the library that closely matches the one with which the binary was linked. Here, some file and strings analysis can assist in narrowing the field of operating systems that the binary may have been compiled on. The file utility can distinguish among various platforms such as Linux, FreeBSD, or OS X, and the strings utility can be used to search for version strings that may point to the compiler or libc version that was used. Armed with that information, you can attempt to locate the appropriate libraries from a matching system. If the binary was linked with more than one static library, additional strings analysis may be required to identify each additional library. Useful things to look for in strings output include copyright notices, version strings, usage instructions, or other unique messages that could be thrown into a search engine in an attempt to identify each additional library.
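The strings search described above can be approximated in a few lines of C. This is only a minimal sketch and not the strings utility itself: count_printable_runs and find_marker are hypothetical helper names, and the "GCC: (" marker used in the usage note is just an illustration of the kind of compiler version string one might look for.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Count runs of at least min_len printable ASCII bytes in a raw buffer,
   which is roughly what the strings utility reports. */
size_t count_printable_runs(const unsigned char *data, size_t len, size_t min_len)
{
    assert(data != NULL && min_len > 0);
    size_t runs = 0, current = 0;
    for (size_t i = 0; i < len; i++) {
        if (isprint(data[i])) {
            current++;
            if (current == min_len)  /* count each run once, when it qualifies */
                runs++;
        } else {
            current = 0;             /* a non-printable byte ends the run */
        }
    }
    return runs;
}

/* Return the offset of the first occurrence of marker in data, or -1.
   A byte-wise search is used because the buffer may contain NUL bytes. */
long find_marker(const unsigned char *data, size_t len, const char *marker)
{
    assert(data != NULL && marker != NULL);
    size_t mlen = strlen(marker);
    if (mlen == 0 || mlen > len)
        return -1;
    for (size_t i = 0; i + mlen <= len; i++) {
        if (memcmp(data + i, marker, mlen) == 0)
            return (long)i;
    }
    return -1;
}
```

In practice the buffer would hold the contents of the binary under analysis, and a call such as find_marker(buf, len, "GCC: (") would report where a candidate version string begins so that it can be read in full.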
By identifying as many libraries as possible and applying their signatures, you greatly reduce the amount of code that you need to spend time analyzing and get to focus more attention on application-specific code.

Data Structure Analysis

One consequence of compilation being a lossy operation is that we lose access to data declarations and structure definitions, which makes it far more difficult to understand the memory layout in disassembled code. As mentioned in Chapter 12, IDA provides the capability to define the layout of data structures and then to apply those structure definitions to regions of memory. Once a structure template has been applied to a region of memory, IDA can utilize structure field names in place of integer offsets within the disassembly, making the disassembly far more readable.

Chapter 13: Advanced Static Analysis with IDA Pro 319

There are two important steps in determining the layout of data structures in compiled code. The first step is to determine the size of the data structure. The second step is to determine how the structure is subdivided into fields and what type is associated with each field. The program in Listing 13-6 and its corresponding compiled version in Listing 13-7 will be used to illustrate several points about disassembling structures.
Listing 13-6

     1: #include <stdlib.h>
     2: #include <math.h>
     3: #include <string.h>
     4: typedef struct GrayHat_t {
     5:    char buf[80];
     6:    int val;
     7:    double squareRoot;
     8: } GrayHat;
     9: int main(int argc, char **argv) {
    10:    GrayHat gh;
    11:    if (argc == 4) {
    12:       GrayHat *g = (GrayHat*)malloc(sizeof(GrayHat));
    13:       strncpy(g->buf, argv[1], 80);
    14:       g->val = atoi(argv[2]);
    15:       g->squareRoot = sqrt(atof(argv[3]));
    16:       strncpy(gh.buf, argv[0], 80);
    17:       gh.val = 0xdeadbeef;
    18:    }
    19:    return 0;
    20: }

Listing 13-7

     1: ; int __cdecl main(int argc,const char **argv,const char *envp)
     2: _main      proc near
     3: var_70     = qword ptr -112
     4: dest       = byte ptr -96
     5: var_10     = dword ptr -16
     6: argc       = dword ptr  8
     7: argv       = dword ptr  12
     8: envp       = dword ptr  16
     9:       push    ebp
    10:       mov     ebp, esp
    11:       add     esp, 0FFFFFFA0h
    12:       push    ebx
    13:       push    esi
    14:       mov     ebx, [ebp+argv]
    15:       cmp     [ebp+argc], 4           ; argc != 4
    16:       jnz     short loc_4011B6
    17:       push    96                      ; struct size
    18:       call    _malloc
    19:       pop     ecx
    20:       mov     esi, eax                ; esi points to struct
    21:       push    80                      ; maxlen
    22:       push    dword ptr [ebx+4]       ; argv[1]
    23:       push    esi                     ; start of struct
    24:       call    _strncpy
    25:       add     esp, 0Ch
    26:       push    dword ptr [ebx+8]       ; argv[2]
    27:       call    _atol
    28:       pop     ecx
    29:       mov     [esi+80], eax           ; 80 bytes into struct
    30:       push    dword ptr [ebx+12]      ; argv[3]
    31:       call    _atof
    32:       pop     ecx
    33:       add     esp, 0FFFFFFF8h
    34:       fstp    [esp+70h+var_70]
    35:       call    _sqrt
    36:       add     esp, 8
    37:       fstp    qword ptr [esi+88]      ; 88 bytes into struct
    38:       push    80                      ; maxlen
    39:       push    dword ptr [ebx]         ; argv[0]
    40:       lea     eax, [ebp-96]
    41:       push    eax                     ; dest
    42:       call    _strncpy
    43:       add     esp, 0Ch
    44:       mov     [ebp-16], 0DEADBEEFh
    45: loc_4011B6:
    46:       xor     eax, eax
    47:       pop     esi
    48:       pop     ebx
    49:       mov     esp, ebp
    50:       pop     ebp
    51:       retn
    52: _main      endp

There are two methods for determining the size of a structure. The first and easiest method is to find locations at which a structure is dynamically allocated using malloc or new. Lines 17 and 18 in Listing 13-7 show a call to malloc requesting 96 bytes of memory.
Malloced blocks of memory generally represent either structures or arrays. In this case, we learn that this program manipulates a structure whose size is 96 bytes. The resulting pointer is transferred into the esi register and used to access the fields in the structure for the remainder of the function. References to this structure take place at lines 23, 29, and 37.

The second method of determining the size of a structure is to observe the offsets used in every reference to the structure and to compute the maximum size required to house the data that is referenced. In this case, line 23 references the 80 bytes at the beginning of the structure (based on the maxlen argument pushed at line 21), line 29 references 4 bytes (the size of eax) starting at offset 80 into the structure ([esi + 80]), and line 37 references 8 bytes (a quad word/qword) starting at offset 88 ([esi + 88]) into the structure. Based on these references, we can deduce that the structure is 88 (the maximum offset we observe) plus 8 (the size of data accessed at that offset), or 96 bytes long. Thus we have derived the size of the structure by two different methods. The second method is useful in cases where we can’t directly observe the allocation of the structure, perhaps because it takes place within library code.

To understand the layout of the bytes within a structure, we must determine the types of data that are used at each observable offset within the structure. In our example, the access at line 23 uses the beginning of the structure as the destination of a string copy operation, limited in size to 80 bytes. We can conclude therefore that the first 80 bytes of the structure are an array of characters. At line 29, the 4 bytes at offset 80 in the structure are assigned the result of the function atol, which converts an ASCII string to a long value. Here we can conclude that the second field in the structure is a 4-byte long.
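The second sizing method above (take the largest offset plus access width over all observed references) can be sketched in C as follows. The FieldAccess type, the min_struct_size function, and the listing_13_7_refs table are hypothetical names introduced for illustration; the offsets and widths are the ones read off lines 23, 29, and 37 of Listing 13-7.

```c
#include <assert.h>
#include <stddef.h>

/* One observed memory reference relative to the structure's base pointer. */
typedef struct {
    size_t offset;  /* displacement from the base register, e.g. 80 in [esi+80] */
    size_t width;   /* number of bytes read or written at that offset */
} FieldAccess;

/* Lower bound on the structure size: the largest offset + width observed.
   Fields that the analyzed code never touches cannot be detected this way. */
size_t min_struct_size(const FieldAccess *refs, size_t count)
{
    assert(refs != NULL || count == 0);
    size_t size = 0;
    for (size_t i = 0; i < count; i++) {
        size_t end = refs[i].offset + refs[i].width;
        if (end > size)
            size = end;
    }
    return size;
}

/* The three references to the esi-based structure in Listing 13-7. */
const FieldAccess listing_13_7_refs[] = {
    { 0, 80 },  /* line 23: strncpy destination, maxlen 80 */
    { 80, 4 },  /* line 29: mov [esi+80], eax */
    { 88, 8 },  /* line 37: fstp qword ptr [esi+88] */
};
```

Calling min_struct_size(listing_13_7_refs, 3) gives 96, which agrees with the size passed to malloc at line 17. The result should be read as a minimum, since a structure may have trailing fields that the function never references.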
Finally, at line 37, the 8 bytes at offset 88 into the structure are assigned the result of the function atof, which converts an ASCII string to a floating-point double value. You may have noticed that the bytes at offsets 84–87 of the structure appear to be unused. There are two possible explanations for this. The first is that there is a structure field between the long and the double that is simply not referenced by the function. The second possibility is that the compiler has inserted some padding bytes to achieve some desired field alignment. Based on the actual definition of the structure in Listing 13-6, we conclude that padding is the culprit in this particular case. If we wanted to see meaningful field names associated with each structure access, we could define a structure in the IDA structure window as described in Chapter 12.

IDA offers an alternative method for defining structures that you may find far easier to use than its structure editing facilities. IDA can parse C header files via the File | Load File menu option. If you have access to the source code or prefer to create a C-style struct definition using a text editor, IDA will parse the header file and automatically create structures for each struct definition that it encounters in the header file. The only restriction you must be aware of is that IDA only recognizes standard C data types. For any nonstandard types, uint32_t, for example, the header file must contain an appropriate typedef, or you must edit the header file to convert all nonstandard types to standard types.

Access to stack or globally allocated structures looks quite different from access to dynamically allocated structures. Listing 13-6 shows that main contains a local, stack-allocated structure declared at line 10. Lines 16 and 17 of main reference fields in this local structure. These correspond to lines 40 and 44 of the assembly in Listing 13-7.
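The padding conclusion reached above can be checked against the source definition with offsetof and sizeof. The sketch below reuses the GrayHat structure from Listing 13-6; padding_after_val is a hypothetical helper name, and the numbers quoted in the usage note assume an ABI with a 4-byte int and an 8-byte-aligned double, which matches the compiled code in Listing 13-7 but is not guaranteed everywhere.

```c
#include <assert.h>
#include <stddef.h>

/* Structure definition taken from Listing 13-6. */
typedef struct GrayHat_t {
    char buf[80];
    int val;
    double squareRoot;
} GrayHat;

/* Bytes of compiler-inserted padding between val and squareRoot. */
size_t padding_after_val(void)
{
    size_t end_of_val = offsetof(GrayHat, val) + sizeof(int);
    /* Members are laid out in declaration order, so squareRoot cannot
       start before val ends. */
    assert(offsetof(GrayHat, squareRoot) >= end_of_val);
    return offsetof(GrayHat, squareRoot) - end_of_val;
}
```

Under the stated assumptions, offsetof(GrayHat, val) is 80, offsetof(GrayHat, squareRoot) is 88, sizeof(GrayHat) is 96, and padding_after_val() returns 4, which accounts for the unused bytes at offsets 84–87. On an ABI that aligns double to 4 bytes inside structures (the 32-bit System V i386 ABI, for example), the padding disappears and the structure shrinks accordingly.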
While we can see that line 44 references memory that is 80 bytes ([ebp-96+80] == [ebp-16]) after the reference at line 40, we don’t get a sense that the two references belong to the same structure. This is because the compiler can compute the address of each field (as an absolute address in a global variable, or a relative address within a stack frame) at compile time, whereas access to fields in dynamically allocated structures must always be computed at runtime because the base address of the structure is not known at compile time.

Using IDA Structures to View Program Headers

In addition to enabling you to declare your own data structures, IDA contains a large number of common data structure templates for various build environments, including standard C library structures and Windows API structures. An interesting example use of these predefined structures is to use them to examine the program file headers which, by default, are not loaded into the analysis database. To examine file headers, you must perform a manual load when initially opening a file for analysis. Manual loads are selected via a checkbox on the initial load dialog box as shown in Figure 13-3.

Figure 13-3 Forcing a manual load with IDA

Manual loading forces IDA to ask you whether you wish to load each section of the binary into IDA’s database. One of the sections that IDA will ask about is the header section, which will allow you to see all the fields of the program headers including structures such as the MS-DOS and NT file headers. Another section that gets loaded only when a manual load is performed is the resource section that is used on the Windows platform to store dialog box and menu templates, string tables, icons, and the file properties. You can view the fields of the MS-DOS header by scrolling to the beginning of a manually loaded Windows PE file and placing the cursor on the first address in the database, which should contain the ‘M’ value of the MS-DOS ‘MZ’ signature.
No layout information will be displayed until you add the IMAGE_DOS_HEADER to your structures window. This is accomplished by switching to the Structures tab, pressing INSERT, entering IMAGE_DOS_HEADER as the Structure Name, and clicking OK as shown in Figure 13-4. This will pull IDA’s definition of the IMAGE_DOS_HEADER from its type library into your local structures window and make it available to you. Finally, you need to return to the disassembly window, position the cursor on the first byte of the DOS header, and use the ALT-Q hotkey sequence to apply the IMAGE_DOS_HEADER template. The structure may initially appear in its collapsed form, but you can view all of the struct fields by expanding the struct with the numeric keypad + key. This results in the display shown next:

    HEADER:00400000 __ImageBase    dw 5A4Dh    ; e_magic
    HEADER:00400000                dw 50h      ; e_cblp
    HEADER:00400000                dw 2        ; e_cp
    HEADER:00400000                dw 0        ; e_crlc
    HEADER:00400000                dw 4        ; e_cparhdr
    HEADER:00400000                dw 0Fh      ; e_minalloc
    ...
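To make the relationship between the raw bytes and the displayed fields concrete, the following C sketch decodes the first six 16-bit fields of a DOS header from a byte buffer. DosHeaderPrefix, read_le16, parse_dos_header_prefix, and sample_header are hypothetical names introduced here; the field names and sample values are those shown in the display above, and the fields are assumed to be stored little-endian as they are in a PE file.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* First six 16-bit fields of IMAGE_DOS_HEADER, in file order. */
typedef struct {
    uint16_t e_magic;     /* 'MZ' signature, 5A4Dh */
    uint16_t e_cblp;
    uint16_t e_cp;
    uint16_t e_crlc;
    uint16_t e_cparhdr;
    uint16_t e_minalloc;
} DosHeaderPrefix;

/* Read a little-endian 16-bit value regardless of host byte order. */
static uint16_t read_le16(const unsigned char *p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}

/* Decode the leading fields. Returns 0 on success, or -1 if the buffer is
   too short or does not start with the 'MZ' signature. */
int parse_dos_header_prefix(const unsigned char *data, size_t len,
                            DosHeaderPrefix *out)
{
    assert(data != NULL && out != NULL);
    if (len < 12)
        return -1;
    out->e_magic    = read_le16(data + 0);
    out->e_cblp     = read_le16(data + 2);
    out->e_cp       = read_le16(data + 4);
    out->e_crlc     = read_le16(data + 6);
    out->e_cparhdr  = read_le16(data + 8);
    out->e_minalloc = read_le16(data + 10);
    return out->e_magic == 0x5A4D ? 0 : -1;
}

/* Bytes corresponding to the six values in the display above. */
const unsigned char sample_header[12] = {
    0x4D, 0x5A,  /* e_magic    = 5A4Dh ('M', 'Z') */
    0x50, 0x00,  /* e_cblp     = 50h */
    0x02, 0x00,  /* e_cp       = 2 */
    0x00, 0x00,  /* e_crlc     = 0 */
    0x04, 0x00,  /* e_cparhdr  = 4 */
    0x0F, 0x00   /* e_minalloc = 0Fh */
};
```

Note that the first byte in the file is 4Dh (‘M’), which is why the cursor is placed on the ‘M’ of the signature before the template is applied; the 16-bit value 5A4Dh only appears once the two bytes are read as a little-endian word.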