You have downloaded your Mastodon export archive from your account settings, but the files are in CSV or JSON format that is hard to search through quickly. The archive contains your posts, followers, following list, bookmarks, and other data. This article explains how to convert that export archive into a single, searchable JSON index that you can open in any text editor or import into a data analysis tool.
The export archive from Mastodon includes multiple files, each holding a specific type of data. A JSON index combines all these files into one structured document with key-value pairs for every item. You will use a Python script to read the archive, parse each file, and write a unified JSON index.
By the end of this guide, you will have a single JSON file that contains all your Mastodon data in one searchable structure, with one top-level key per source file. You can query it with standard JSON tools or load parts of it into a spreadsheet for further analysis.
Key Takeaways: Mastodon Export Archive to JSON Index
- Settings > Import and Export > Export > Request your archive: Downloads a ZIP file containing all Mastodon data files.
- Python script with json and zipfile modules: Reads the archive, parses each file, and writes a single JSON index.
- jq command-line tool: Queries the resulting JSON index for specific posts, followers, or dates without opening the entire file.
What the Mastodon Export Archive Contains
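When you request your Mastodon archive from Settings > Import and Export > Export, the server generates a ZIP file named something like mastodon-export-20250101.zip. The exact contents depend on your Mastodon version and server, and some of the lists below may instead be offered as separate CSV downloads on the same settings page. Inside the ZIP, look for the following key files: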
- outbox.json — All your public posts in ActivityStreams format, including boosts and replies.
- following_accounts.json — A list of accounts you follow, with their username and server URL.
- followers.json — A list of accounts that follow you.
- lists.json — Any lists you have created on your Mastodon account.
- blocks.json — Accounts you have blocked.
- mutes.json — Accounts you have muted.
- bookmarks.json — Posts you have bookmarked.
- media_attachments.json — Metadata about media files you uploaded.
- actor.json — Your account profile data.
Each file is already in JSON format, but they are separate and use different schemas. To search across all your data at once, you need a single JSON index that merges these files into one document with a consistent structure.
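Before merging anything, it can help to see which JSON files your own archive actually holds and how each one is shaped. The following is a minimal sketch under stated assumptions: the function name describe_archive is hypothetical, and the output format is only one way to summarize the top-level structure.

```python
import json
import zipfile


def describe_archive(zip_path):
    """Return {member name: short description of its top-level JSON shape}."""
    shapes = {}
    with zipfile.ZipFile(zip_path) as archive:
        for name in archive.namelist():
            if not name.endswith(".json"):
                continue  # media files and folders are not JSON
            with archive.open(name) as member:
                try:
                    data = json.load(member)
                except (json.JSONDecodeError, UnicodeDecodeError):
                    shapes[name] = "unparseable"
                    continue
            if isinstance(data, dict):
                # Show at most eight keys so the summary stays readable
                shapes[name] = "object with keys: " + ", ".join(sorted(data)[:8])
            elif isinstance(data, list):
                shapes[name] = f"array of {len(data)} items"
            else:
                shapes[name] = type(data).__name__
    return shapes
```

Calling describe_archive("mastodon-export.zip") and printing each entry shows at a glance which files are present and why their schemas differ.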
Steps to Convert the Archive into a Searchable JSON Index
You will use a Python script to extract the ZIP archive, read each JSON file, and write a new JSON index file. This script works on Windows, macOS, and Linux systems that have Python 3.6 or later installed.
- Download and install Python
If Python is not already on your system, download the latest version from python.org. During installation on Windows, check the box Add Python to PATH. Verify the installation by opening a terminal or Command Prompt and running python --version (on macOS and Linux the command may be python3 --version). You should see a version number like 3.12.0.
- Create a new Python script file
Open a text editor such as Notepad, VS Code, or Sublime Text. Paste the following script into a new file and save it as convert_archive.py in the same folder as your Mastodon export ZIP file.

    import json
    import os
    import sys
    import zipfile


    def convert_archive(zip_path, output_path):
        """Merge every .json file in the ZIP into one dict keyed by filename."""
        index = {}
        with zipfile.ZipFile(zip_path, 'r') as z:
            for filename in z.namelist():
                if not filename.endswith('.json'):
                    continue  # media and other non-JSON members are ignored
                with z.open(filename) as f:
                    try:
                        data = json.load(f)
                    except (json.JSONDecodeError, UnicodeDecodeError):
                        print(f"Warning: Could not parse {filename}, skipping")
                        continue
                # Use the filename (without folder or extension) as the key
                key = os.path.splitext(os.path.basename(filename))[0]
                index[key] = data
        with open(output_path, 'w', encoding='utf-8') as out:
            json.dump(index, out, indent=2, ensure_ascii=False)
        print(f"Index written to {output_path}")


    if __name__ == '__main__':
        if len(sys.argv) != 3:
            print("Usage: python convert_archive.py <input.zip> <output.json>")
            sys.exit(1)
        convert_archive(sys.argv[1], sys.argv[2])

- Run the script from the terminal
Open a terminal or Command Prompt. Navigate to the folder containing both the script and the ZIP file using the cd command. Run the following command, replacing mastodon-export.zip with your actual ZIP filename and mastodon-index.json with your desired output name:

    python convert_archive.py mastodon-export.zip mastodon-index.json

The script prints a success message when done. If you see errors, check that the ZIP file is not corrupted and that all JSON files inside it are valid.
- Verify the output JSON index
Open the generated mastodon-index.json file in a text editor or JSON viewer. The file should start with a top-level object containing keys like outbox, following_accounts, followers, and so on. Each key maps to the data from the corresponding file in the archive.
- Search the index using jq
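Install the jq command-line JSON processor from jqlang.org. For Windows, download the executable and add it to your PATH. For macOS, use Homebrew: brew install jq. For Linux, use your package manager, for example sudo apt install jq on Debian/Ubuntu. Then run a query like this to find all posts containing a specific word:

    jq '.outbox.orderedItems[] | select(.object.content | contains("Mastodon"))' mastodon-index.json

This command extracts every item from the outbox array and filters for content that includes the word Mastodon. The match is case-sensitive and runs against the post's HTML. In Windows Command Prompt, single quotes are not recognized, so wrap the filter in double quotes and escape the inner quotes as \". If your outbox includes boosts, jq may stop with an error at the first one, because a boost's object is a URL string rather than a nested object; the troubleshooting section below covers a more tolerant variant.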
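If you would rather stay in Python than install jq, the same search can be sketched against the loaded index. This is an illustration, not part of the conversion script: the names post_text and search_posts are hypothetical, and it assumes the outbox follows the orderedItems layout described above, with each post's HTML stored in object.content.

```python
import html
import json
import re

TAG_RE = re.compile(r"<[^>]+>")  # crude HTML tag stripper, enough for searching


def post_text(item):
    """Return the plain text of one outbox item, or "" if it has none."""
    obj = item.get("object")
    if not isinstance(obj, dict):  # boosts carry only a URL string here
        return ""
    content = obj.get("content") or ""
    return html.unescape(TAG_RE.sub("", content))


def search_posts(index, term):
    """Yield (published, text) for posts whose text contains term, ignoring case."""
    needle = term.lower()
    for item in index.get("outbox", {}).get("orderedItems", []):
        text = post_text(item)
        if needle in text.lower():
            yield item.get("published"), text
```

To use it, load the index with json.load(open("mastodon-index.json", encoding="utf-8")) and iterate over search_posts(index, "Mastodon"). Stripping tags first means a word interrupted by markup can still match, which the raw jq query cannot do.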
Common Mistakes and Limitations When Converting Archive Files
Python script fails with a zipfile.BadZipFile error
This error means the ZIP file is corrupted or incomplete. Download the archive again from Settings > Import and Export > Export. Ensure the download finishes completely before running the script.
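A quick integrity check before running the conversion can save a confusing traceback. The sketch below uses only the standard zipfile module; the function name check_zip and its return messages are illustrative choices.

```python
import zipfile


def check_zip(zip_path):
    """Return None if the archive looks intact, else a short problem description."""
    if not zipfile.is_zipfile(zip_path):
        return "not a ZIP file (possibly truncated or a different format)"
    with zipfile.ZipFile(zip_path) as archive:
        # testzip() reads every member and returns the first name with a bad CRC
        bad_member = archive.testzip()
    if bad_member is not None:
        return f"corrupt member: {bad_member}"
    return None
```

If check_zip returns a message instead of None, download the archive again before retrying the script.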
JSON output is too large to open in a regular text editor
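If you have a long posting history, the resulting index file can become too large for a basic text editor to open comfortably. Use an editor or viewer built for large files; VS Code handles JSON without an extra extension, although it may switch off some features for very large files. Alternatively, use jq to query the file from the command line instead of opening it in an editor. Keep in mind that jq still reads the whole file into memory by default.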
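When the file is too large to browse, a short summary of its top-level keys is often enough to confirm that the conversion worked. The following sketch assumes the index layout produced by the script above; summarize_index is a hypothetical helper name. It still loads the whole file into memory, so it avoids the editor rather than the memory cost.

```python
import json


def summarize_index(index_path):
    """Return {key: description} for each top-level key in the index file."""
    with open(index_path, encoding="utf-8") as handle:
        index = json.load(handle)
    summary = {}
    for key, value in index.items():
        if isinstance(value, dict) and isinstance(value.get("orderedItems"), list):
            # ActivityStreams collections keep their entries under orderedItems
            summary[key] = f"collection with {len(value['orderedItems'])} items"
        elif isinstance(value, (dict, list)):
            summary[key] = f"{type(value).__name__} with {len(value)} entries"
        else:
            summary[key] = type(value).__name__
    return summary
```

Printing the result of summarize_index("mastodon-index.json") gives one line per source file, which is usually easier to check than scrolling through the raw JSON.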
Missing data for private or direct posts
The Mastodon export archive only includes public posts by default. Private posts and direct messages are not exported. If you need those, you must use the Mastodon API with appropriate authentication tokens. The archive is not a full backup of your account.
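To see which visibility levels actually appear in your own index, you can count posts by their addressing fields. This is a heuristic sketch, not an official classification: it assumes the usual ActivityStreams convention in which the public collection URI appears in to for public posts and in cc for unlisted ones, and that followers-only posts are addressed to a URI ending in /followers. The function names are hypothetical.

```python
from collections import Counter

# Special ActivityStreams collection that stands for "everyone"
PUBLIC = "https://www.w3.org/ns/activitystreams#Public"


def _as_list(value):
    """Normalize a missing, single, or multiple address field to a list."""
    if value is None:
        return []
    if isinstance(value, str):
        return [value]
    return list(value)


def visibility(item):
    """Guess a post's visibility from its to and cc fields (heuristic)."""
    to = _as_list(item.get("to"))
    cc = _as_list(item.get("cc"))
    if PUBLIC in to:
        return "public"
    if PUBLIC in cc:
        return "unlisted"
    if any(str(addr).endswith("/followers") for addr in to + cc):
        return "followers-only"
    return "direct"


def count_visibility(index):
    """Count outbox items per guessed visibility level."""
    items = index.get("outbox", {}).get("orderedItems", [])
    return Counter(visibility(item) for item in items)
```

If count_visibility(index) reports only public and unlisted entries, anything else has to come from the API as described above.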
Script skips some JSON files without warning
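The script prints a warning only for .json files it cannot parse. Files with any other extension, such as media or CSV files, are ignored silently, and two files with the same name in different folders share one key, so the later one overwrites the earlier. If some files are missing from the index, check that they exist inside the ZIP by listing its contents with unzip -l mastodon-export.zip on Linux/macOS, with python -m zipfile -l mastodon-export.zip on any system, or by opening the ZIP in File Explorer on Windows.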
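The comparison between archive and index can also be automated. The sketch below assumes the index was built with filename-based keys as in the script above; compare_archive_to_index is a hypothetical helper that reports JSON members with no matching key, plus any keys that several members compete for.

```python
import json
import os
import zipfile


def compare_archive_to_index(zip_path, index_path):
    """Return (members missing from the index, keys claimed by several members)."""
    with open(index_path, encoding="utf-8") as handle:
        index_keys = set(json.load(handle))  # top-level keys of the index
    with zipfile.ZipFile(zip_path) as archive:
        json_members = [n for n in archive.namelist() if n.endswith(".json")]
    by_key = {}
    for name in json_members:
        # Same key derivation as the conversion script: basename without extension
        key = os.path.splitext(os.path.basename(name))[0]
        by_key.setdefault(key, []).append(name)
    missing = sorted(
        name
        for key, names in by_key.items()
        if key not in index_keys
        for name in names
    )
    collisions = {key: names for key, names in by_key.items() if len(names) > 1}
    return missing, collisions
```

An empty missing list and an empty collisions dictionary mean every JSON file in the ZIP has its own entry in the index.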
jq query returns no results for a known search term
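The .orderedItems[] path in the outbox contains items with different structures. For a boost, .object is a plain URL string rather than a nested object, so .object.content is not available and jq can stop with an error before printing anything. Make the query tolerant by using select((.object.content? // "") | contains("Mastodon")). The ? suppresses the error for boosts, and // "" substitutes an empty string when the content is missing. Also remember that contains is case-sensitive and that content holds HTML, so a term interrupted by tags will not match.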
Mastodon Export Archive vs JSON Index: Key Differences
| Item | Export Archive | JSON Index |
|---|---|---|
| File format | ZIP containing multiple JSON files | Single JSON file |
| Search capability | Requires opening each file separately | Searchable with jq or any JSON tool |
| Data structure | Each file uses its own schema | One top-level object keyed by filename; each value keeps its original schema |
| Portability | Must extract to access data | Ready to use immediately |
| Size | Compressed ZIP, smaller on disk | Uncompressed, larger than ZIP |
Now you have a single JSON index that contains all your Mastodon export data in one place. Use jq to run complex queries across your posts, followers, and bookmarks without opening multiple files. For advanced analysis, load the index into a database like SQLite using the JSON1 extension, or into a Python pandas DataFrame for statistical exploration. The script provided can be extended to filter out unnecessary keys or flatten nested structures for even easier searching.
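As one possible extension, the posts can be flattened into rows and loaded into SQLite with Python's built-in sqlite3 module. This is a minimal sketch under stated assumptions: the table layout and the name load_posts are hypothetical, boosts are skipped because they carry no content of their own, and plain columns are used instead of the JSON1 extension.

```python
import sqlite3


def load_posts(index, db_path=":memory:"):
    """Flatten outbox posts into a posts table and return the open connection."""
    conn = sqlite3.connect(db_path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS posts ("
        "id TEXT PRIMARY KEY, published TEXT, url TEXT, content TEXT)"
    )
    rows = []
    for item in index.get("outbox", {}).get("orderedItems", []):
        obj = item.get("object")
        if not isinstance(obj, dict):
            continue  # boosts reference another post by URL only
        rows.append(
            (obj.get("id"), obj.get("published"), obj.get("url"), obj.get("content") or "")
        )
    # INSERT OR REPLACE keeps reruns from failing on duplicate ids
    conn.executemany("INSERT OR REPLACE INTO posts VALUES (?, ?, ?, ?)", rows)
    conn.commit()
    return conn
```

After conn = load_posts(index, "mastodon.db"), a query such as SELECT published, url FROM posts WHERE content LIKE '%mastodon%' ORDER BY published searches all posts; SQLite's LIKE ignores case for ASCII letters by default.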