Migrate from Confluence to Public Docs: A Step-by-Step Guide
You are going to move a Confluence space into a public-facing documentation site. By the end of this walkthrough you will have a live docs URL, clean markdown files in a repo, and redirects pointing from any previously public Confluence pages to their new homes.
This is the migration path most small teams actually take in 2026, not the "enterprise DITA migration" version. Searches for "export confluence to markdown" grew 129% year over year (DataForSEO, 2026), which tracks what you already suspect: teams are leaving internal wikis for public docs sites faster than ever.
Prerequisites:
- Confluence Cloud or Server access with Export Space permission (check by visiting any space, then Space settings in the left sidebar)
- A local machine with Python 3.10+ and Node.js 18+ (only needed if you run the export scripts below)
- Pandoc installed (brew install pandoc on macOS, choco install pandoc on Windows, or apt install pandoc on Linux)
- Somewhere to host the new site. A GitHub repo is enough
One note before the steps: if your goal is to skip the export-and-clean phase entirely, jump to step 3 and see how AI documentation tools can regenerate the docs from your product URL instead. Most teams still want their Confluence content preserved verbatim, so the main path below assumes that.
Step 1: Export the Confluence space
Confluence does not support direct markdown export. You have three real options. Pick one based on how much content you have.
Option A: Space-level HTML export (under 500 pages, mixed content). In the space, click Space settings in the sidebar, then Content tools, then the Export tab. Choose HTML, check Custom export, and select the pages you want. Click Export. Confluence emails you a ZIP when it finishes. Download and unzip it.
You now have index.html, one HTML file per page, and an attachments/ folder with images.
Option B: Python exporter (500+ pages, want markdown directly). Install the community exporter:
pip install confluence-markdown-exporter
confluence-markdown-exporter spaces ENGDOCS ./export \
  --url https://yourteam.atlassian.net/wiki \
  --username you@yourteam.com \
  --token YOUR_API_TOKEN
Generate the API token at https://id.atlassian.com/manage-profile/security/api-tokens. Replace ENGDOCS with your space key. The exporter preserves page hierarchy as nested folders and writes each page as a .md file with front matter.
Option C: Atlassian XML export (archive everything, no structure preference). Same menu as Option A, but pick XML instead of HTML. This gets you the raw Confluence storage format. You will need to convert it later. I would only use this when the other options fail or when legal requires an archival format.
Verify the export worked. Count the files:
cd export
find . -name "*.md" | wc -l # or *.html for Option A
If the count matches your Confluence page count within a few percent, you are fine. Missing pages usually mean permission issues on specific parent pages. Re-run the export with an admin account.
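The same check can be scripted so it is repeatable after each re-run. This is a minimal sketch, not part of any exporter: `count_pages` and `coverage_ok` are hypothetical helper names, and the 5% default tolerance is an assumption standing in for "within a few percent".

```python
from pathlib import Path


def count_pages(export_dir: str, suffix: str = ".md") -> int:
    """Count exported page files under export_dir, recursively."""
    return sum(1 for p in Path(export_dir).rglob(f"*{suffix}") if p.is_file())


def coverage_ok(found: int, expected: int, tolerance: float = 0.05) -> bool:
    """Return True when the exported count is within `tolerance` of the
    page count Confluence reports. The 5% default is an assumption."""
    if expected <= 0:
        raise ValueError("expected must be a positive page count")
    return abs(expected - found) / expected <= tolerance


if __name__ == "__main__":
    found = count_pages("./export")  # use suffix=".html" for Option A
    print(found, coverage_ok(found, expected=480))  # 480 is a placeholder
```

Replace the placeholder `expected` value with the page count shown for your own space.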
Step 2: Clean up macros, broken links, and internal-only content
Raw Confluence exports are never publish-ready. This is the step teams skip, and it is the reason most migrations look embarrassing in production.
Strip Confluence-specific macros. The exporter leaves behind HTML comments like <!-- ac:structured-macro ac:name="info" --> or literal text {info}, {warning}, {code:java}. Find them:
grep -rE "ac:structured-macro|\{info\}|\{warning\}|\{code:" . | head -20
For info/warning/note macros, rewrite to markdown blockquotes or your docs platform's admonition syntax. For {code:lang} blocks, convert to fenced code blocks with triple backticks. Macros for Jira embeds, Roadmap planner, or user mentions should be deleted, not converted. They do not mean anything outside Confluence.
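As a rough illustration of the conversion described above, the sketch below handles only the literal wiki-markup forms ({info}, {warning}, {note}, {code:lang}). It assumes macros appear as matching open/close pairs in the exported text; `convert_macros` is a hypothetical helper, and the bold-label blockquote is just one possible target format.

```python
import re

FENCE = "`" * 3  # built at runtime so this example nests cleanly in a fenced block

# {info}...{info}, {warning}...{warning}, {note}...{note}
ADMONITION = re.compile(r"\{(info|warning|note)\}(.*?)\{\1\}", re.DOTALL)
# {code:java}...{code} or {code}...{code}
CODE_BLOCK = re.compile(r"\{code(?::(\w+))?\}\n?(.*?)\n?\{code\}", re.DOTALL)


def _admonition_to_blockquote(match: re.Match) -> str:
    kind = match.group(1).capitalize()
    lines = match.group(2).strip().splitlines() or [""]
    quoted = "\n".join(f"> {line}" if line else ">" for line in lines)
    return f"> **{kind}**\n>\n{quoted}"


def _code_to_fence(match: re.Match) -> str:
    lang = match.group(1) or ""
    return f"{FENCE}{lang}\n{match.group(2)}\n{FENCE}"


def convert_macros(text: str) -> str:
    """Convert code macros first, then admonitions.

    Limitation: macro-like text inside a converted code block is not
    protected from the admonition pass, so review the diff by hand.
    """
    text = CODE_BLOCK.sub(_code_to_fence, text)
    return ADMONITION.sub(_admonition_to_blockquote, text)
```

If your docs platform has its own admonition syntax, swap the blockquote output in `_admonition_to_blockquote` for that syntax.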
Fix broken internal links. Confluence links often look like /wiki/spaces/ENGDOCS/pages/12345678/Page+Title. On your new site these will not resolve. Run a find-and-replace pass:
# Rough shape. Adjust for your URL structure.
find . -name "*.md" -exec sed -i '' \
  -E 's|/wiki/spaces/[A-Z]+/pages/[0-9]+/([^"\)]+)|/docs/\1|g' {} +
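The '' after -i is the macOS (BSD) sed form. On Linux (GNU sed), drop the empty '' and use plain sed -i. On Windows, run the command from WSL or Git Bash, or use a Node.js script instead. Manually spot-check a dozen links afterward. Regex never catches all edge cases.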
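An alternative to title-based rewriting is to key the rewrite on the numeric page ID, which does not change when a page is renamed. The sketch below assumes you maintain a hand-built mapping from page ID to new path; `rewrite_links` and `id_to_path` are hypothetical names. Links with no mapping are left untouched and reported, which gives you the spot-check list for free.

```python
import re

# Matches /wiki/spaces/<KEY>/pages/<ID>/<optional title>
CONFLUENCE_LINK = re.compile(r"/wiki/spaces/[^/\s]+/pages/(\d+)(?:/[^\s\"')]*)?")


def rewrite_links(markdown: str, id_to_path: dict[str, str]) -> tuple[str, list[str]]:
    """Replace Confluence page links using a page-ID lookup table.

    Returns the rewritten text and the page IDs that had no mapping.
    """
    unresolved: list[str] = []

    def replace(match: re.Match) -> str:
        page_id = match.group(1)
        new_path = id_to_path.get(page_id)
        if new_path is None:
            unresolved.append(page_id)
            return match.group(0)  # leave the original link in place
        return new_path

    return CONFLUENCE_LINK.sub(replace, markdown), unresolved


if __name__ == "__main__":
    text = "See [Login](/wiki/spaces/ENGDOCS/pages/12345678/Login)."
    print(rewrite_links(text, {"12345678": "/docs/api-reference/auth/login"}))
```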
Remove internal-only pages. Confluence spaces always have pages that should never see daylight: meeting notes, salary discussions, incident retrospectives with real customer names, links to private Google Drive folders. Make a list and delete them before publishing:
rm -r "./Meeting Notes" "./People Ops" "./Retros"
Grep for the obvious leaks before you go live:
grep -riE "(password|secret|credential|internal only|do not share)" . | head
Internal documentation and public documentation are different products even when they share a source. The grep above is the five-minute version of that distinction.
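The grep covers keywords but not the private Google Drive links mentioned earlier. A small sketch that checks both is shown here; `scan_for_leaks` is a hypothetical helper, and the Google URL pattern is an assumption about what private links look like in your space. Treat the result as a prompt for manual review, not as proof that a file is clean.

```python
import re

LEAK_PATTERNS = {
    # Same keywords as the grep above, case-insensitive.
    "keyword": re.compile(
        r"password|secret|credential|internal only|do not share", re.IGNORECASE
    ),
    # Assumed shape of private file links; extend for your own tools.
    "private_link": re.compile(r"https?://(?:drive|docs)\.google\.com/\S+", re.IGNORECASE),
}


def scan_for_leaks(text: str) -> list[tuple[int, str]]:
    """Return (line_number, pattern_label) for every suspicious line."""
    hits = []
    for number, line in enumerate(text.splitlines(), start=1):
        for label, pattern in LEAK_PATTERNS.items():
            if pattern.search(line):
                hits.append((number, label))
    return hits
```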
Normalize page titles and slugs. Confluence allows titles like "Q3 2024 Planning (DRAFT – Owen's copy)." That title will become a URL. Rename files to their clean, lowercase, hyphenated forms before the first publish so your URLs stay stable.
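One way to derive those clean names is a small slug function. This is a sketch under simple assumptions: non-ASCII characters without an ASCII decomposition are dropped, apostrophes are removed rather than hyphenated, and `slugify` is a hypothetical helper rather than the behavior of any particular docs platform.

```python
import re
import unicodedata


def slugify(title: str) -> str:
    """Turn a page title into a lowercase, hyphenated, ASCII-only slug."""
    # Decompose accented characters, then drop anything that is not ASCII.
    ascii_title = (
        unicodedata.normalize("NFKD", title).encode("ascii", "ignore").decode("ascii")
    )
    # Remove apostrophes so "Owen's" becomes "owens", not "owen-s".
    ascii_title = ascii_title.replace("'", "")
    # Collapse every run of other characters into a single hyphen.
    slug = re.sub(r"[^a-z0-9]+", "-", ascii_title.lower())
    return slug.strip("-")


if __name__ == "__main__":
    print(slugify("Q3 2024 Planning (DRAFT – Owen's copy)"))
    # prints: q3-2024-planning-draft-owens-copy
```

Two different titles can produce the same slug, so check for collisions before renaming files in bulk.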
Step 3: Choose the new platform
You have three credible paths. I am biased because I run Docsio, but the choice actually depends on who writes your docs.
Docs-as-code with Docusaurus, Nextra, or MkDocs. You commit markdown to a Git repo, the site builds on push, engineers are the only people who can contribute. Good for API-first products where docs belong with the code. See our docs-as-code guide for the full pattern.
WYSIWYG tools like GitBook or Notion. You paste content into an editor, non-engineers can edit. Good if your docs are written by PMs, support, and ops rather than engineers. We have a breakdown in Notion for documentation.
AI-generated docs with a tool like Docsio. You paste a URL, it generates a branded site from your product content in about five minutes, and an AI agent edits everything from there. This is the fastest path when your Confluence content is already stale and you would rather regenerate than salvage. Free tier includes hosting with SSL and a custom domain. For the salvage path we are on here, you can also import your cleaned markdown into Docsio as the starting content and edit from there.
For this walkthrough I will show the docs-as-code path because it has the most moving parts, and the other paths are mostly "upload and done." If you picked Docsio or GitBook, skip to Step 5.
Step 4: Restructure nested spaces into a flat information architecture
This is the step that separates good migrations from embarrassing ones. Confluence encourages nesting. A page can live 6 levels deep under Engineering > Backend > Services > Auth > v2 > Endpoints. Public documentation sites do not work that way. Readers get lost more than two or three levels down.
Flatten aggressively. A sane docs site has three levels at most: category, page, optional sub-page. Open your exported folder and look at what you have. Here is the pattern I use on every migration:
# Before (Confluence)
Engineering/
  Backend/
    Services/
      Auth/
        v2/
          Endpoints/
            login.md
            refresh.md

# After (public docs)
docs/
  api-reference/
    auth/
      login.md
      refresh.md
Map every Confluence page to one of five top-level categories: Getting Started, Guides, API Reference, Concepts, Changelog. If a page does not fit any of those, it probably should not be in public docs. If everything fits into "Guides" and nothing else, your structure is too loose. Split.
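The mapping itself is a judgment call, but applying it can be mechanical. The sketch below assumes you write the folder-to-category mapping by hand; `plan_moves` and `folder_map` are hypothetical names. It plans the moves without touching the filesystem, lists pages that map nowhere (candidates for deletion), and rejects targets deeper than three levels.

```python
from pathlib import PurePosixPath

MAX_DEPTH = 3  # levels below docs/, counting the page file itself


def plan_moves(
    pages: list[str], folder_map: dict[str, str]
) -> tuple[dict[str, str], list[str]]:
    """Map each exported page to its new flat location.

    `folder_map` maps an old Confluence folder to a new folder under docs/.
    Returns (old_path -> new_path, pages with no mapping).
    """
    moves: dict[str, str] = {}
    unmapped: list[str] = []
    for page in pages:
        old = PurePosixPath(page)
        target_dir = folder_map.get(str(old.parent))
        if target_dir is None:
            unmapped.append(page)
            continue
        new = PurePosixPath("docs") / target_dir / old.name
        if len(new.parts) - 1 > MAX_DEPTH:  # do not count the docs/ root
            raise ValueError(f"{new} is nested deeper than {MAX_DEPTH} levels")
        moves[page] = str(new)
    return moves, unmapped
```

Review the `unmapped` list before deleting anything; a page that fits no category may still need a home.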
Write a sidebar config that reflects the new structure. For Docusaurus, that is sidebars.ts:
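import type {SidebarsConfig} from '@docusaurus/plugin-content-docs';

// Doc ids are file paths relative to docs/, without the .md extension.
const sidebars: SidebarsConfig = {
  docsSidebar: [
    'intro',
    {
      type: 'category',
      label: 'Getting Started',
      items: ['quickstart', 'installation', 'configuration'],
    },
    {
      type: 'category',
      label: 'API Reference',
      items: ['api-reference/auth/login', 'api-reference/auth/refresh'],
    },
  ],
};

export default sidebars;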
This is the step where picking the right documentation platform pays off. Platforms that auto-generate the sidebar from your folder structure (Docsio, Mintlify, GitBook) save hours here. Platforms that require a manual sidebar file (Docusaurus, MkDocs) give you more control but cost time.
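If you stay on a manual-sidebar platform, a first draft of the sidebar can still be generated from the doc ids. This is a sketch, not a Docusaurus feature: `build_sidebar` is a hypothetical helper that groups ids by their first path segment and guesses a label from the folder name, so labels such as "Api Reference" need a manual touch-up.

```python
import json


def build_sidebar(doc_ids: list[str]) -> dict:
    """Group doc ids by top-level folder into a sidebar-shaped dict.

    Ids without a folder (for example 'intro') stay as plain items.
    """
    items: list = []
    categories: dict[str, dict] = {}
    for doc_id in doc_ids:
        head, sep, _ = doc_id.partition("/")
        if not sep:
            items.append(doc_id)
            continue
        if head not in categories:
            categories[head] = {
                "type": "category",
                # Guessed label: 'api-reference' becomes 'Api Reference'.
                "label": head.replace("-", " ").title(),
                "items": [],
            }
            items.append(categories[head])
        categories[head]["items"].append(doc_id)
    return {"docsSidebar": items}


if __name__ == "__main__":
    ids = ["intro", "api-reference/auth/login", "api-reference/auth/refresh"]
    print(json.dumps(build_sidebar(ids), indent=2))
```

Paste the printed structure into your sidebar file and fix the labels and ordering by hand.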
Step 5: First publish
Initialize the repo and install Docusaurus (or your platform of choice):
npx create-docusaurus@latest my-docs classic --typescript
cd my-docs
cp -r ../export/* ./docs/
npm run start
Open http://localhost:3000. Click through ten random pages. You are looking for three specific things:
- Rendering issues. Confluence tables with merged cells often break. Pandoc-style tables fix most of them; nested tables need manual work
- Missing images. If your export had attachments/image1.png referenced as image1.png in the markdown, Docusaurus will not find it. Move images into static/img/ or into the page's folder, then update paths
- Broken code blocks. Confluence {code:java} blocks sometimes export as plain paragraphs. Fenced code blocks with language hints fix the syntax highlighting
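For the missing-images case, the path update can be scripted once the files are in static/img/. The sketch below assumes all images were moved into that single folder, which Docusaurus serves at /img/. `rewrite_image_paths` is a hypothetical helper; it leaves absolute and remote URLs alone.

```python
import re

# ![alt text](target) with no whitespace in the target
IMAGE = re.compile(r"!\[([^\]]*)\]\(([^)\s]+)\)")


def rewrite_image_paths(markdown: str, public_prefix: str = "/img/") -> str:
    """Point relative image references at the static image folder."""

    def replace(match: re.Match) -> str:
        alt, target = match.group(1), match.group(2)
        if target.startswith(("http://", "https://", "/")):
            return match.group(0)  # already absolute, keep as written
        filename = target.rsplit("/", 1)[-1]
        return f"![{alt}]({public_prefix}{filename})"

    return IMAGE.sub(replace, markdown)


if __name__ == "__main__":
    print(rewrite_image_paths("![Flow](attachments/image1.png)"))
    # prints: ![Flow](/img/image1.png)
```

Flattening everything into one folder means two attachments with the same filename will collide, so check for duplicate names before moving files.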
Once local looks clean, deploy. GitHub Pages, Vercel, Netlify, or any static docs host will work. For Vercel:
npm install -g vercel
vercel --prod
You now have a live URL. Click through the same ten pages on production to confirm nothing changed between local and prod. Share the URL with one person internally before telling the rest of the team.
Step 6: Handle redirects and SEO for previously public pages
If your Confluence pages were behind SSO, skip this step. Nothing was indexed. You are done.
If any Confluence pages were public (Confluence Cloud lets you share pages with "anyone with the link" or make a whole space public), Google has indexed them. Killing those URLs without redirects tanks any traffic they earned.
List what Google knows about. Use Google Search Console. Open Pages and export everything under your Confluence domain (yourteam.atlassian.net/wiki/...). Or use a site: query:
site:yourteam.atlassian.net/wiki
Save the list. For each URL, write down the new URL on your docs site.
Set up the redirect map. If you control the old domain, add redirect rules. For Confluence Cloud you do not. Work around it with one or more of these:
- Putting a redirect stub page in Confluence that tells readers the content moved (ugly but works)
- Asking any external sites that link to you to update their links
- Submitting the new URLs to Search Console and letting the old ones drop
If the new site is on Vercel, the vercel.json redirect block looks like:
{
  "redirects": [
    {
      "source": "/wiki/spaces/ENGDOCS/pages/12345678/Login",
      "destination": "/docs/api-reference/auth/login",
      "permanent": true
    }
  ]
}
This only works if the old Confluence domain CNAMEs to your new host. In most cases it will not, and the redirect stub approach is what you actually do.
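For the cases where you do control the old domain, the redirect block can be generated from the old-to-new URL list you saved rather than typed by hand. This is a sketch: `build_redirects` and `url_map` are hypothetical names, and it copies each old path verbatim into "source". Page titles containing characters such as + or parentheses may be read as pattern syntax by the host, so test a few generated rules before relying on them.

```python
import json
from urllib.parse import urlsplit


def build_redirects(url_map: dict[str, str]) -> str:
    """Build a vercel.json-style redirects document from old URL -> new path."""
    redirects = []
    for old_url, new_path in sorted(url_map.items()):
        redirects.append(
            {
                "source": urlsplit(old_url).path,  # keep only the path part
                "destination": new_path,
                "permanent": True,
            }
        )
    return json.dumps({"redirects": redirects}, indent=2)


if __name__ == "__main__":
    mapping = {
        "https://yourteam.atlassian.net/wiki/spaces/ENGDOCS/pages/12345678/Login":
            "/docs/api-reference/auth/login",
    }
    print(build_redirects(mapping))
```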
Update your sitemap and resubmit. Most docs platforms auto-generate sitemap.xml at /sitemap.xml. Submit it in Search Console under Sitemaps. Google typically recrawls within two weeks.
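Before submitting, it can help to confirm that every new URL from your redirect list actually appears in the generated sitemap. The sketch below parses a standard sitemap.xml with the Python standard library. `missing_from_sitemap` is a hypothetical helper, and the trailing-slash normalization is an assumption, since platforms differ on whether they emit one.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def sitemap_urls(xml_text: str) -> set[str]:
    """Return every <loc> URL in a sitemap, without trailing slashes."""
    root = ET.fromstring(xml_text)
    return {
        loc.text.strip().rstrip("/")
        for loc in root.findall("sm:url/sm:loc", NS)
        if loc.text
    }


def missing_from_sitemap(xml_text: str, expected: list[str]) -> list[str]:
    """List the expected URLs that the sitemap does not contain."""
    present = sitemap_urls(xml_text)
    return [url for url in expected if url.rstrip("/") not in present]
```

Fetch /sitemap.xml from the deployed site, pass its text in, and fix any page that shows up in the result before you resubmit.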
What to do next
Ship a changelog entry on your product site that says "Our docs moved here." Link it. Tell the team in Slack so support stops pointing customers at dead Confluence links.
Then pick one page per week to improve for the next month. The migration gets you the content onto a site readers can find. The rewrite gets you the content readers actually finish. If you want a shortcut for that second phase, most AI-generated docs platforms can regenerate stale sections against your current product in minutes. For teams that decided mid-migration that the Confluence content was too stale to salvage, generating fresh docs from your product URL is often the faster path. Either way, the public docs site lives on its own now, and that is the change you actually wanted from the start.
