How a Browser Works: A Beginner-Friendly Guide to Browser Internals
Have you ever wondered what happens when you type a URL and press Enter, and within seconds a webpage appears with all its text, images, and colors? What goes on behind the scenes to transform simple text files into a polished user interface (UI)?
In this article, we are going to look at how a browser actually works and at its main internal components.
Topics to Cover
What a browser actually is
Main parts of a browser
User Interface: address bar, tabs, buttons
Browser Engine vs Rendering Engine
Networking: how a browser fetches HTML, CSS, JS
HTML parsing and DOM creation
CSS parsing and CSSOM creation
How DOM and CSSOM come together
What a browser actually is
A browser is a software application that retrieves information from the web, interprets it, and displays the content on your device. In simple words, its job is to take a "recipe" (HTML, CSS, and JavaScript) and cook it into a finished "meal" (the interactive website).
Main parts of a browser
A browser consists of several major components, including the user interface, the browser engine, the rendering engine, and the networking layer.
User Interface: address bar, tabs, buttons
User Interface: The UI is the visible part of the browser: everything except the webpage content itself.
Components of the UI:
Address Bar: Where you type web addresses. Modern address bars also run searches, show security status, and provide suggestions.
Navigation Buttons: Back, Forward, Refresh, and Home buttons that control page navigation.
Tabs: Allow multiple webpages to be open simultaneously in one window.
Browser Menu: Settings, history, downloads, extensions.
Bookmarks Bar: Quick access to saved pages.
Browser Engine vs Rendering Engine
Browser Engine: The high-level coordinator. It marshals actions between the UI and the rendering engine. When you click the refresh button in the UI, the browser engine tells the rendering engine to reload the page.
Think of the browser engine as a manager who coordinates between different departments but doesn't do the detailed work itself.
Rendering Engine: The component that actually interprets HTML and CSS, builds the page structure, and displays content.
Think of the rendering engine as the artist who does the actual work of creating the visual representation.
Popular Rendering Engines:
Blink: Used by Chrome, Edge, Opera, Brave (most popular)
WebKit: Used by Safari
Gecko: Used by Firefox
These engines have the same job (render web content) but implement details differently, which is why websites sometimes look or behave slightly differently across browsers.
Networking: How a Browser Fetches HTML, CSS, and JavaScript
Before the browser can display anything, it needs to get the files. This is where the networking component comes in.
The Journey from URL to Files
When you press Enter after typing a URL:
1. DNS Lookup: Unless a cached answer is available, the browser asks a DNS resolver to translate the domain name (example.com) into an IP address (for example, 93.184.216.34).
2. TCP Connection: The browser establishes a TCP connection with the server at that IP address (the three-way handshake we discussed in previous posts).
3. HTTP Request: The browser sends an HTTP GET request asking for the webpage:
GET / HTTP/1.1
Host: example.com
4. Server Response: The server sends back HTML:
HTTP/1.1 200 OK
Content-Type: text/html

<!DOCTYPE html>
<html>
<head>
<link rel="stylesheet" href="style.css">
</head>
<body>
<h1>Hello World</h1>
<script src="script.js"></script>
</body>
</html>
5. Additional Requests: The browser parses this HTML and sees references to CSS (style.css) and JavaScript (script.js). It immediately makes additional HTTP requests to fetch these files.
6. Other Resources: As parsing continues, the browser discovers and fetches images, fonts, videos, and other resources.
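The request, response, and follow-up steps above can be sketched in a few lines of Python. This is only an illustration, not how any real browser is written: the helper names (build_get_request, split_response, ResourceFinder) are made up for this example, the response is a hard-coded string so nothing touches the network, and DNS, TCP, and TLS are left out entirely.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlsplit


def build_get_request(url: str) -> str:
    """Build the text of a minimal HTTP/1.1 GET request for `url`."""
    parts = urlsplit(url)
    path = parts.path or "/"
    if parts.query:
        path += "?" + parts.query
    # HTTP lines end with CRLF; an empty line ends the header section.
    return f"GET {path} HTTP/1.1\r\nHost: {parts.netloc}\r\nConnection: close\r\n\r\n"


def split_response(raw: str):
    """Split a raw response into (status code, headers dict, body)."""
    head, _, body = raw.partition("\r\n\r\n")
    status_line, *header_lines = head.split("\r\n")
    status = int(status_line.split(" ")[1])
    headers = {}
    for line in header_lines:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return status, headers, body


class ResourceFinder(HTMLParser):
    """Collect URLs of stylesheets, scripts, and images referenced by a page."""

    def __init__(self, base_url: str):
        super().__init__()
        self.base_url = base_url
        self.resources = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "link" and attrs.get("rel") == "stylesheet" and attrs.get("href"):
            self.resources.append(urljoin(self.base_url, attrs["href"]))
        elif tag in ("script", "img") and attrs.get("src"):
            self.resources.append(urljoin(self.base_url, attrs["src"]))


request = build_get_request("http://example.com/")

# A canned response standing in for what the server would send back.
raw_response = (
    "HTTP/1.1 200 OK\r\n"
    "Content-Type: text/html\r\n"
    "\r\n"
    '<html><head><link rel="stylesheet" href="style.css"></head>'
    '<body><h1>Hello World</h1><script src="script.js"></script></body></html>'
)
status, headers, body = split_response(raw_response)

finder = ResourceFinder("http://example.com/")
finder.feed(body)

print(request)
print(status, headers["content-type"])
print(finder.resources)
```

The relative references style.css and script.js are resolved against the page URL, which is why the browser knows exactly where to send the additional requests.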
Parsing: Breaking Down Code into Understandable Structures
Once the browser has the files, it needs to parse them—break them down into structures it can work with. Let's understand parsing with a simple analogy.
What is Parsing?
Imagine you're given this mathematical expression:
3 + 4 * 5
To evaluate this, your brain doesn't just see characters—it parses the expression into meaning:
First, recognize the numbers: 3, 4, 5
Then, recognize the operators: +, *
Finally, understand the structure (order of operations): multiply first, then add
You might mentally construct a tree:
  +
 / \
3   *
   / \
  4   5
This tree represents the structure: "Add 3 to the result of (4 times 5)."
Parsing is exactly this: breaking text into tokens (smallest meaningful units) and organizing them into a structure that represents their meaning and relationships.
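To make the idea concrete, here is a small Python sketch that tokenizes and parses the same expression. It is deliberately tiny and only understands non-negative integers, + and *; the function names are chosen for this example. The tree is stored as nested tuples of the form (operator, left, right).

```python
import re


def tokenize(text):
    """Split an expression into number and operator tokens."""
    return re.findall(r"\d+|[+*]", text)


def parse(tokens):
    """Parse `term (+ term)*`, where a term is `number (* number)*`.

    Handling * inside a term is what makes multiplication bind tighter than +.
    """
    pos = 0

    def parse_term():
        nonlocal pos
        node = int(tokens[pos])
        pos += 1
        while pos < len(tokens) and tokens[pos] == "*":
            pos += 1
            right = int(tokens[pos])
            pos += 1
            node = ("*", node, right)
        return node

    node = parse_term()
    while pos < len(tokens) and tokens[pos] == "+":
        pos += 1
        node = ("+", node, parse_term())
    return node


def evaluate(node):
    """Walk the tree bottom-up to compute the value."""
    if isinstance(node, int):
        return node
    op, left, right = node
    a, b = evaluate(left), evaluate(right)
    return a + b if op == "+" else a * b


tokens = tokenize("3 + 4 * 5")
tree = parse(tokens)
print(tokens)          # ['3', '+', '4', '*', '5']
print(tree)            # ('+', 3, ('*', 4, 5))
print(evaluate(tree))  # 23
```

The nested tuple is the same tree as the drawing above: + at the root, 3 on the left, and the 4 * 5 subtree on the right.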
HTML Parsing and DOM Creation
The DOM (Document Object Model) is a tree structure representing the webpage. It's the browser's internal model of the HTML.
From HTML to DOM
When the browser receives HTML:
<!DOCTYPE html>
<html>
<head>
<title>My Page</title>
</head>
<body>
<h1>Welcome</h1>
<p>This is a paragraph.</p>
</body>
</html>
The parsing process:
- Tokenization: Break HTML into tokens (tags, text, attributes)
<html> → Start tag token
<head> → Start tag token
<title> → Start tag token
"My Page" → Text token
</title> → End tag token
...
- Tree Construction: Build a tree structure from these tokens
             Document
                |
             <html>
            /      \
       <head>      <body>
          |        /    \
       <title>  <h1>    <p>
          |       |       |
     "My Page" "Welcome" "This is..."
This tree is the DOM. Each node represents an element or text in the HTML. The structure captures parent-child relationships: <body> is a child of <html>, <h1> is a child of <body>, etc.
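Both steps can be imitated with Python's built-in html.parser module, which reports start tags, end tags, and text as it finds them. The sketch below records those events as tokens and builds a simple tree from them. The Node and TreeBuilder classes are invented for this example, and the approach assumes well-formed markup; real browser parsers follow a much more detailed algorithm that also recovers from broken HTML.

```python
from html.parser import HTMLParser


class Node:
    def __init__(self, name, parent=None, text=None):
        self.name = name          # tag name, "#document", or "#text"
        self.text = text
        self.parent = parent
        self.children = []
        if parent is not None:
            parent.children.append(self)


class TreeBuilder(HTMLParser):
    """Records tokens as they arrive and builds a tree from them."""

    def __init__(self):
        super().__init__()
        self.tokens = []
        self.document = Node("#document")
        self.current = self.document   # the innermost open element

    def handle_starttag(self, tag, attrs):
        self.tokens.append(("start", tag))
        self.current = Node(tag, parent=self.current)

    def handle_endtag(self, tag):
        self.tokens.append(("end", tag))
        if self.current.parent is not None:
            self.current = self.current.parent

    def handle_data(self, data):
        if data.strip():   # skip whitespace-only text between tags
            self.tokens.append(("text", data.strip()))
            Node("#text", parent=self.current, text=data.strip())


def dump(node, depth=0):
    """Return the tree as a list of indented lines."""
    label = repr(node.text) if node.name == "#text" else node.name
    lines = ["  " * depth + label]
    for child in node.children:
        lines.extend(dump(child, depth + 1))
    return lines


source = """<!DOCTYPE html>
<html>
<head><title>My Page</title></head>
<body><h1>Welcome</h1><p>This is a paragraph.</p></body>
</html>"""

builder = TreeBuilder()
builder.feed(source)
builder.close()

print(builder.tokens[:5])
print("\n".join(dump(builder.document)))
```

The first five tokens match the list in the Tokenization step, and the indented dump has the same shape as the tree diagram.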
Why the DOM Matters
The DOM is crucial because:
JavaScript interacts with the DOM: When you write document.querySelector('h1'), you're querying this tree structure
The browser uses the DOM for rendering: It traverses this tree to determine what to display
It represents document structure: The hierarchy defines which elements contain others
Key Insight: The DOM is not the same as your HTML source code. It's a living, in-memory representation that JavaScript can modify. When you change the DOM with JavaScript, the visual page updates, but your original HTML file doesn't change.
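The same idea can be demonstrated outside a browser. The sketch below uses Python's xml.dom.minidom, which is an XML DOM rather than a browser DOM, so it only works here because the snippet is well-formed. It stands in for what JavaScript does with document.querySelector and a text update.

```python
from xml.dom.minidom import parseString

source = "<html><body><h1>Welcome</h1><p>This is a paragraph.</p></body></html>"

dom = parseString(source)                # build an in-memory tree from the text
h1 = dom.getElementsByTagName("h1")[0]   # roughly what querySelector('h1') finds
h1.firstChild.data = "Hello again"       # change the tree, not the source text

print(h1.toxml())   # <h1>Hello again</h1>
print(source)       # unchanged: still contains <h1>Welcome</h1>
```

The tree now says "Hello again" while the original string is untouched, which is exactly the relationship between the live DOM and the HTML file on the server.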
CSS Parsing and CSSOM Creation
Just as HTML becomes the DOM, CSS becomes the CSSOM (CSS Object Model).
From CSS to CSSOM
When the browser receives CSS:
body {
font-size: 16px;
color: #333;
}
h1 {
color: blue;
font-size: 32px;
}
p {
margin: 10px;
}
The parsing process:
- Tokenization: Break CSS into tokens (selectors, properties, values)
body → Selector
{ → Start block
font-size → Property
16px → Value
...
- CSSOM Tree Construction: Build a tree representing CSS rules
              CSSOM
                |
        ┌───────┼───────┐
        │       │       │
      body      h1      p
        |       |       |
    [styles] [styles] [styles]
    font-size  color   margin
    color    font-size
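As a rough sketch of these two steps, the Python below tokenizes the stylesheet and then groups it into rules. It uses regular expressions and only handles flat "selector { property: value; }" rules, so comments, at-rules such as @media, and nested blocks are out of scope. The function names are made up for this example.

```python
import re


def tokenize_css(css):
    """Split CSS into punctuation tokens and the text between them."""
    tokens = re.findall(r"[{}:;]|[^{}:;]+", css)
    return [t.strip() for t in tokens if t.strip()]


def parse_css(css):
    """Turn flat rules into a list of (selector, declarations) pairs."""
    rules = []
    for selector, block in re.findall(r"([^{}]+)\{([^{}]*)\}", css):
        declarations = {}
        for declaration in block.split(";"):
            if ":" in declaration:
                prop, _, value = declaration.partition(":")
                declarations[prop.strip()] = value.strip()
        rules.append((selector.strip(), declarations))
    return rules


css = """
body { font-size: 16px; color: #333; }
h1 { color: blue; font-size: 32px; }
p { margin: 10px; }
"""

print(tokenize_css(css)[:6])   # ['body', '{', 'font-size', ':', '16px', ';']
for selector, declarations in parse_css(css):
    print(selector, declarations)
```

Each (selector, declarations) pair corresponds to one branch of the CSSOM drawing: a selector with its bundle of styles underneath.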
Cascading and Inheritance
The CSSOM represents not just individual rules but the cascade, that is, how styles from different sources are combined and overridden. From lowest to highest priority (ignoring !important):
Browser default styles (base)
User styles (if any)
Author styles (your CSS)
Inline styles (highest priority)
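The CSSOM resolves all these to determine the final style for each element. This is why they are called "Cascading" Style Sheets: when several declarations set the same property on an element, the one from the higher-priority origin wins, then the one with the more specific selector, then the one that appears later. Inherited properties such as color and font-size also flow from parent to child when no rule sets them directly.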
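Here is a simplified Python sketch of how a winner could be picked for each property. It ranks declarations by origin, then by a very rough selector specificity, then by order of appearance. It ignores !important, cascade layers, and inheritance, and the ORIGIN_RANK table, the helper functions, and the sample declarations are all assumptions made up for this illustration.

```python
ORIGIN_RANK = {"browser": 0, "user": 1, "author": 2, "inline": 3}


def specificity(selector):
    """Very rough specificity for simple selectors: (ids, classes, type names)."""
    ids = selector.count("#")
    classes = selector.count(".")
    types = 1 if selector and selector[0].isalpha() else 0
    return (ids, classes, types)


def cascade(declarations):
    """Pick the winning value for each property.

    Each declaration is (origin, selector, property, value); the position in
    the list stands in for source order.
    """
    winners = {}
    for order, (origin, selector, prop, value) in enumerate(declarations):
        key = (ORIGIN_RANK[origin], specificity(selector), order)
        if prop not in winners or key > winners[prop][0]:
            winners[prop] = (key, value)
    return {prop: value for prop, (key, value) in winners.items()}


# Declarations that all match the same hypothetical <h1 class="title">.
matching = [
    ("browser", "h1", "font-size", "2em"),
    ("browser", "h1", "font-weight", "bold"),
    ("author", "h1.title", "color", "green"),
    ("author", "h1", "color", "blue"),
    ("author", "h1", "font-size", "32px"),
    ("inline", "", "color", "red"),
]

print(cascade(matching))
# {'font-size': '32px', 'font-weight': 'bold', 'color': 'red'}
```

Without the inline declaration, color would resolve to green rather than blue: the h1.title selector is more specific than h1, so it wins even though the h1 rule comes later.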
How DOM and CSSOM Come Together: The Render Tree
The browser now has two trees:
DOM: The structure and content
CSSOM: The styles
To display the page, it combines them into a Render Tree.
Building the Render Tree
The render tree contains only the elements that will be visible on the page, each with its computed styles:
DOM:                      CSSOM:        Render Tree:
<html>                    body {...}    <html> (visible)
  <head>                  h1 {...}        <body> (visible, styled)
    <title>...</title>    p {...}           <h1> (visible, styled)
  <body>                                    <p> (visible, styled)
    <h1>...</h1>
    <p>...</p>
(Invisible elements like <head>, <title>,
and elements with display:none are excluded)
Key Points:
Only visible elements: <head>, <script>, and elements with display: none are excluded from the render tree
Computed styles: Each node has its final, computed styles after cascading, inheritance, and defaults are applied
Visual hierarchy: The tree structure represents what needs to be painted and in what order
Example: If your CSS has:
h1 {
color: blue;
font-size: 32px;
display: none; /* This h1 won't be in render tree! */
}
The <h1> element exists in the DOM (JavaScript can still find it) but doesn't appear in the render tree (won't be displayed).
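The filtering step can be sketched in Python as a walk over the DOM that skips anything that should not be drawn. The data structures here are simplified stand-ins invented for this example: the DOM is nested (tag, children) tuples without text nodes, and styles is a per-tag dictionary playing the role of the already-resolved CSSOM.

```python
# Tags that never produce anything on screen.
NEVER_RENDERED = {"head", "title", "script", "style", "meta", "link"}

# DOM nodes as (tag, children); a simplification that ignores text nodes.
dom = ("html", [
    ("head", [("title", [])]),
    ("body", [("h1", []), ("p", [])]),
])

# Computed styles per tag, standing in for the resolved CSSOM.
styles = {
    "body": {"font-size": "16px", "color": "#333"},
    "h1": {"color": "blue", "font-size": "32px", "display": "none"},
    "p": {"margin": "10px"},
}


def build_render_tree(node):
    """Return (tag, style, children) for visible nodes, or None if excluded."""
    tag, children = node
    style = styles.get(tag, {})
    if tag in NEVER_RENDERED or style.get("display") == "none":
        return None   # the whole subtree is dropped
    rendered = [build_render_tree(child) for child in children]
    return (tag, style, [child for child in rendered if child is not None])


def tags(render_node):
    """Flatten a render tree into a list of tag names, in document order."""
    tag, _, children = render_node
    return [tag] + [t for child in children for t in tags(child)]


render_tree = build_render_tree(dom)
print(tags(render_tree))   # ['html', 'body', 'p']
```

The dom structure still contains the h1 node, but the render tree does not, mirroring the difference between what JavaScript can find and what actually gets displayed.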