City Data Portal Analysis

There's a lot of content on the City of Austin's open data portal. This project is about studying that content so we can make the portal better.

Current project goals

Write code that grabs specific pieces of information from Austin's public data portal and rearranges it into a format that's useful for analysis.
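A minimal sketch of what that rearranging could look like, assuming dataset metadata arrives as a nested dictionary. The field names (`id`, `name`, `department`, `columns`) and the sample record are hypothetical placeholders, not the portal's actual schema, and `flatten_columns` is an illustrative helper rather than part of this project.

```python
from typing import Any


def flatten_columns(dataset: dict[str, Any]) -> list[dict[str, Any]]:
    """Return one row per column, carrying the parent dataset's fields along."""
    rows = []
    for column in dataset.get("columns", []):
        rows.append(
            {
                "dataset_id": dataset.get("id"),
                "dataset_name": dataset.get("name"),
                "department": dataset.get("department"),
                "column_name": column.get("name"),
                "column_type": column.get("type"),
            }
        )
    return rows


# Hypothetical sample record; real portal metadata will look different.
sample = {
    "id": "abcd-1234",
    "name": "Street Signs",
    "department": "Transportation",
    "columns": [
        {"name": "asset_id", "type": "text"},
        {"name": "install_date", "type": "date"},
    ],
}

rows = flatten_columns(sample)
for row in rows:
    print(row)
```

A flat, one-row-per-column table like this is easy to load into a spreadsheet or data frame for the kinds of questions listed under "Why we're doing this".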

Next goals include automatically publishing the results to the City's data portal, so everyone can access and analyze them.

Why we're doing this

There are many ways to explore data quality. Improving data quality is a job that's never done.

Current business needs/issues to explore include:

Identifiers... How often are departments using unique identifiers for City assets? What is the nature of those identifiers? Where might we benefit from using common identifiers?

Redundancy... How often are departments publishing the same information within their datasets? Are there any departments publishing about the same topics who might want to collaborate?

Accessibility... Are we using multiple resources to publish the same information repeatedly for different time periods? (Not ideal for API consumers.) What column labels and descriptions don't match up with their values, and could perhaps use some tuning? How often are schemas changing? Are these changes good or bad for data consumers?

Table grain... How often are we publishing aggregate information (subtotals and totals) when we could be publishing atomic data? This one is huge!
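Two of the questions above (identifiers and redundancy) lend themselves to simple checks. The sketch below is one possible starting point, not an established method for this project: the function names, the dataset names, and the column names are all hypothetical, and matching columns by lowercased name is only a rough proxy for "the same information".

```python
from collections import defaultdict


def is_unique_identifier(values):
    """True when every value is present and no value repeats."""
    values = list(values)
    if not values:
        return False
    if any(v is None or v == "" for v in values):
        return False
    return len(set(values)) == len(values)


def shared_column_names(datasets):
    """Map each lowercased column name to the datasets that use it.

    Only names used by more than one dataset are kept.
    """
    seen = defaultdict(set)
    for dataset_name, columns in datasets.items():
        for column in columns:
            seen[column.strip().lower()].add(dataset_name)
    return {name: sorted(users) for name, users in seen.items() if len(users) > 1}


# Hypothetical column listings for two datasets.
datasets = {
    "Street Signs": ["Asset_ID", "Sign Type", "Council District"],
    "Traffic Signals": ["asset_id", "Signal Type", "Council District"],
}

shared = shared_column_names(datasets)
print(shared)
print(is_unique_identifier(["A1", "A2", "A3"]))
print(is_unique_identifier(["A1", "A1", "A3"]))
```

Columns that pass the uniqueness check are candidates for asset identifiers, and column names that appear across departments hint at where common identifiers or collaboration might help.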

Contributing terms

When you contribute to this project, you are sharing and/or creating content. Please do not contribute content unless you agree with the terms here.

How to contribute

Additional guidelines forthcoming. In the meantime, please contribute by:

Getting started

To be announced. We're all just getting started.

Credits

Coming soon

History

A detailed record of significant changes can be found in the changelog.

License

Unlicense