Querying Our Read-Only Database
We run a free public read-only Postgres server with the Stack Overflow database:
- Server: query.smartpostgres.com
- Username: readonly
- Password: 511e0479-4d35-49ab-98b1-c3a9d69796f4
(In spring 2024, we’ll be setting up automated per-person user accounts, so expect this username/password to stop working at that point.)
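As a rough sketch of how you might connect from Python, the snippet below builds a connection string from the details above. A few things here are assumptions rather than anything this page promises: the port (5432, the Postgres default), the use of the third-party psycopg2 driver, and the helper names build_dsn and server_version, which are made up for illustration. The database name stackoverflow is the one mentioned in the licensing note further down this page.

```python
from urllib.parse import quote

# Connection details copied from the list above.
HOST = "query.smartpostgres.com"
USER = "readonly"
PASSWORD = "511e0479-4d35-49ab-98b1-c3a9d69796f4"

# Assumption: the server listens on the default Postgres port.
PORT = 5432
# Database name as mentioned in the licensing note on this page.
DBNAME = "stackoverflow"


def build_dsn(host=HOST, port=PORT, dbname=DBNAME, user=USER, password=PASSWORD):
    """Return a libpq-style connection URI (hypothetical helper)."""
    # Percent-encode the parts that may contain reserved characters.
    return (
        f"postgresql://{quote(user, safe='')}:{quote(password, safe='')}"
        f"@{host}:{port}/{quote(dbname, safe='')}"
    )


def server_version():
    """Run one harmless query. Needs network access and psycopg2 installed."""
    import psycopg2  # third-party driver, imported lazily (an assumption)

    conn = psycopg2.connect(build_dsn())
    try:
        with conn.cursor() as cur:
            cur.execute("SELECT version();")
            return cur.fetchone()[0]
    finally:
        conn.close()
```

Calling server_version() only checks that the connection works. If the shared username and password have been retired, expect the connection attempt to fail.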
Building Your Own Database
If you want to create indexes or test queries that change data, you’ll want to build your own Stack Overflow database. To do that, check out Francesco Tisiot’s tutorial on loading the Stack Overflow data. The exact queries to do it are at the bottom of his post.
That method requires downloading the original XML files. The ones we use on our read-only server are the big ones for StackOverflow.com itself:
Note that those files are big, and once they’re unzipped with 7zip, they’re even bigger: hundreds of gigabytes. You don’t have to use the same Stack Overflow data that we use. You can pick the data for a smaller site instead. Each smaller site is distributed as a single 7z file that contains all of that site’s files. For example, here are some of ’em:
All of the sites use the exact same file names & formats – for example, anime.stackexchange.com and stackoverflow.com both have users.xml, posts.xml, comments.xml, etc.
Avoid the meta sites. They hold a different kind of data (discussion about the site itself) and tend to be extremely small.
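Because every site shares the same file names and formats, one small reader can be pointed at any site's files. Here is a minimal sketch, assuming each file holds a flat list of row elements whose attributes are the column values. That layout is not spelled out on this page, so check it against your own download. The iter_rows helper and the sample data are made up for illustration.

```python
import io
import xml.etree.ElementTree as ET


def iter_rows(source):
    """Yield each <row> element's attributes as a dict, one row at a time.

    `source` can be a file path or a binary file object.
    """
    for _event, elem in ET.iterparse(source, events=("end",)):
        if elem.tag == "row":
            yield dict(elem.attrib)
            elem.clear()  # drop the element's contents to keep memory use down


# Tiny made-up sample that mimics the assumed layout of users.xml.
SAMPLE = b"""<?xml version="1.0" encoding="utf-8"?>
<users>
  <row Id="1" DisplayName="Alice" Reputation="101" />
  <row Id="2" DisplayName="Bob" Reputation="1" />
</users>
"""

rows = list(iter_rows(io.BytesIO(SAMPLE)))
print(len(rows), rows[0]["DisplayName"])
```

Streaming with iterparse rather than loading the whole file matters for the larger files, which can be far bigger than available memory. To read a real download, pass the path to the unzipped file instead of the in-memory sample.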
Learning More About the Data
To learn more about the tables, columns, and their relationships to each other:
- Database diagram – which includes tables that aren’t part of the data dump
- Documentation about the schema – which honestly isn’t that friendly
The stackoverflow database on that server is imported from the Stack Overflow data dump. This data, like Stack’s data dump, is provided by Stack Overflow under the CC BY-SA 4.0 license. That means you are free to share this database and adapt it for any purpose, even commercially, but you must attribute it to Stack Overflow, the original authors (not Smart Postgres).
For questions, check out the FAQ.