Vast quantities of government, health, and other data, figures, and statistics are effectively inaccessible to countless researchers and other interested people who could help mine and synthesize them into meaningful, far-reaching new data products. Much of this lost potential stems from the lack of standardization in how the various entities that gather the information publish it.
To improve the flow of this information, publishers should be required to release it in accordance with standards on which systems can be built to harness the data with greater automation.
The standards must be created by people in the open data advocacy community, who know exactly what to ask for. One requirement, however, should be implemented immediately: any data published in PDF format must be accompanied by the underlying data.