Free Data for Developers 
June 18th, 2007
UPDATE (June 28, 2007): New release of this dataset is available below with corrections and new datasets.
Over the years I’ve needed data for various purposes when developing web applications. Examples: regional data for sign-up forms (“Choose your location:”), dictionaries for spell checkers, relational enterprise data for mockups and samples, etc. So I’ve compiled a fairly large resource that I imagine would be useful, given how hard some of it was to come by.
I know there are data generators out there and other sources for some of this, but when I was looking it was a chore (especially for the regional data) to get it all together the way I needed it. So here it is (download link is at the bottom of the page). Warning: it’s around 100 MB compressed and 550 MB uncompressed, so watch out!
Included is full schema information for each database.
For each database there are 4 formats available:
- CSV
- MS Access 2000
- SQL Syntax file
- XML
The following databases are included:
- ComputerLanguages – A random list of computer languages in a table. Don’t know why I needed this.
- ContactsFlatfile1k – A table of 1000 realistic but fake people with full contact information, email, phone, address, etc.
- ContactsFlatfile10k – Same but with 10,000 entries.
- ContactsFlatfile100k – Same but with 100,000 entries for testing big lookups.
- CountryRegion – A fairly extensive flat-file list of countries and their regions (state, province, etc.). Great for signup forms. UPDATED
- EnglishDictionary – 110,554 words and definitions from the English language. Don’t know where this came from, but you might want to check copyright before using it in anything serious.
- EnglishWordsLarge – A slightly larger (127,238) database of English words without definitions.
- EnglishWordsMostPopular – The most popular 1,000 words in the English Language. Great for live spell checkers.
- GeneralProducts – A table of product information.
- NorthWindUltra – A much larger and more detailed version of the famous NorthWind database from Microsoft. Note: it contains none of its data. This is all new stuff and would be great for app mockups.
- USCityStates – A detailed list of all major US cities and their states. This version is in a hierarchical parent-child relational database structure. Over 13,000 entries.
- USCityStatesFlatfile – A detailed list of all major US cities and their states. This version is in a flat-file single table structure for easy reference. The same 13,000 entries.
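To make the difference between the two USCityStates layouts concrete, here is a minimal Python sketch that joins a parent-child structure into flat-file rows. The table and column names (`states`, `cities`, `state_id`, and so on) and the sample rows are assumptions for illustration only; check the schema files in the download for the real ones.

```python
# Hypothetical sample rows; the actual schema in the download may differ.
states = [
    {"state_id": 1, "name": "Oregon"},
    {"state_id": 2, "name": "Texas"},
]
cities = [
    {"city_id": 10, "state_id": 1, "name": "Portland"},
    {"city_id": 11, "state_id": 2, "name": "Austin"},
    {"city_id": 12, "state_id": 2, "name": "Houston"},
]


def flatten(states, cities):
    """Join each child city row to its parent state, producing flat rows."""
    # Build a lookup from parent key to state name.
    state_names = {s["state_id"]: s["name"] for s in states}
    return [
        {"city": c["name"], "state": state_names[c["state_id"]]}
        for c in cities
    ]


flat_rows = flatten(states, cities)
```

The hierarchical version stores each state name once and is easier to keep consistent, while the flat version can be read straight into a dropdown without a join.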
And Now Including:
- Industries – A list of common industries. Useful for signup forms.
- CompanyTypesSizes – A list of common company types and sizes. Also useful for signup forms.
- UKPostalCodes – A list of UK Postal Codes with full longitude, latitude, and district name. Thanks to Tony Hine for this!
Download it here (100 MB): http://blogs.nitobi.com/alexei/dbtestdata_entset2007.zip
This entry was posted on Monday, June 18th, 2007 at 10:07 pm and is filed under web development, resources.

June 19th, 2007 at 12:19 am
Was thinking something like this would be good as a web service too, especially the data for forms.
June 19th, 2007 at 6:28 am
Thank you for the free download of sample data. Much appreciated.
June 19th, 2007 at 9:34 am
I have posted a link to the free sample database download here as well! Cheers Tony
http://www.ecademy.com/node.php?id=85948
June 19th, 2007 at 12:44 pm
Thanks Tony. And I should mention, if anybody wants to add a dataset to this let me know and I’ll package it up again to include it.
June 19th, 2007 at 12:45 pm
Andre: yeah, I agree. What did you have in mind? Anyway, let’s chat about it.
June 27th, 2007 at 3:30 pm
There is a list of UK post codes (zip codes) available here, http://www.easypeasy.com/guides/article.php?article=64 in several formats, which you may or may not want to add to your set! If you want more detailed help on how to use the information, please do not hesitate to contact me. [email protected]
UK post codes work by defining an area, usually based on a city. Your post goes to the central sorting office of your area, where the first part of the postcode (for example RG10) is used to send your post to the area it refers to. This first part of the postcode is therefore termed “outgoing”.
When the post arrives in the destination area, the last part of the postcode is used to identify where to send the post to.
June 27th, 2007 at 3:34 pm
I forgot to mention: this is a list of the “outward” codes only. The list also includes the “location” in a metre measurement, so you could calculate the distances between the postcode areas if you so desire.
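A minimal Python sketch of that distance calculation, assuming the metre “location” is stored as an (easting, northing) pair on a flat grid. The codes and coordinates below are made up for illustration and are not taken from the dataset.

```python
import math


def distance_km(a, b):
    """Straight-line distance in km between two (easting, northing) pairs in metres."""
    dx = a[0] - b[0]
    dy = a[1] - b[1]
    return math.hypot(dx, dy) / 1000.0


# Hypothetical outward codes with made-up grid coordinates in metres.
areas = {
    "AA1": (450000, 170000),
    "BB2": (453000, 174000),
}

d = distance_km(areas["AA1"], areas["BB2"])
```

This gives the straight-line distance between the two area reference points, not a road distance.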
June 28th, 2007 at 7:29 pm
Thanks again! I’ve added this to the database and converted it to Access and XML format also.
June 28th, 2007 at 7:29 pm
Note: I’ve posted a new version of the database with additional datasets and a corrected Regions table.
June 28th, 2007 at 7:30 pm
[…] Earlier I posted our sample database with tonnes of free datasets for developing web applications: Country/State information, Dictionaries, Fake sales data, Fake customer lists, and so on. There is an update available for this database with several new tables, and also a corrected Regions database. […]