XBRL, the reporting standard of choice for business and financial information, is sweeping the world, steadily and surely. With over 60 countries now having adopted XBRL, a repository of XBRL data is being built the world over. XBRL International (XII) recently published a list of countries where XBRL data is being made available to data users. There are several other countries where regulators are collecting data in XBRL and sharing it with stakeholders.
From Data to Quality Data
The benefits of having structured data are well known, and we are already witnessing the growing demand for it. But if we are to rely on any data, and that goes for structured data too, we need to be sure of its quality. This awareness has led to another recent trend: a shift in focus from mere data collection to the collection of quality data.
Take, for example, the Data Quality Committee (DQC), which was set up by XBRL US to improve the utility of XBRL financial data filed with the U.S. Securities and Exchange Commission (SEC). A recent analysis reveals a whopping 64% decrease in errors in XBRL filings for Q1 2016 as compared to 2015, all because of the first set of validation rules that the DQC released to help filers improve data quality.
It is clear that good quality data can be achieved by putting in place validation rules and checks that can help clean data at the source itself.
But can we assume that an XBRL report that has cleared all the validations and checks contains good quality data? Well, it's not always as simple as that.
Going Beyond Mere Data Checks
Structured data is machine-readable and lends itself to incorporating automated processes for validations. But quality checks need to go beyond the data to also include technical validity, inconsistencies and other hygiene factors.
Correctness of data is being addressed by committees and by software applications that can validate XBRL data for technical correctness. Checking for consistency and other hygiene factors, however, is a trickier issue to address.
This is because such issues neither result in technical invalidity nor violate business logic. I wrote about one such issue, the inclusion of duplicate facts in an XBRL report, in my last post.
Today, let's look at another: applying incorrect contexts to data.
We all know that content (or data, in this case) is king, but it is context that really acts as the kingmaker! In our daily lives, the same information seen in different contexts leads to different interpretations. The world of structured data is no different.
Let me explain the importance of context with an example.
In an XBRL report, the context is made up of the reporting entity, the period or the date for which the data has been reported, and any other explanatory details (also called XBRL dimensions).
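To make this concrete, here is a minimal Python sketch of a context and a fact. The class names, field names, entity identifier, dates and value are all hypothetical, chosen only for illustration; this is not how any particular XBRL processor models a context.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional, Tuple


@dataclass(frozen=True)
class Context:
    """Illustrative stand-in for an XBRL context (names are assumptions)."""
    entity: str                   # identifier of the reporting entity
    end: date                     # period end date, or the instant itself
    start: Optional[date] = None  # None means an instant rather than a duration
    dimensions: Tuple[Tuple[str, str], ...] = ()  # (axis, member) pairs


@dataclass(frozen=True)
class Fact:
    concept: str
    value: int
    context: Context


# Hypothetical entity and dates: a three-month and a twelve-month period.
quarter = Context("COMPANY-A", start=date(2016, 1, 1), end=date(2016, 3, 31))
full_year = Context("COMPANY-A", start=date(2015, 4, 1), end=date(2016, 3, 31))

revenue_q = Fact("Revenues", 1_000_000, quarter)
revenue_y = Fact("Revenues", 1_000_000, full_year)

print(revenue_q.value == revenue_y.value)  # the number is identical
print(revenue_q == revenue_y)              # the facts are not
```

The value is the same in both facts, yet they say very different things: one is revenue for a quarter, the other revenue for a year.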
The 10-Q HTML report of Company A shown below has data for the current quarter and the corresponding quarter of the previous year. Nothing unusual there.
Figure 1: Snapshot of Income Statement from the HTML report as filed with SEC
Figure 2: Snapshot of Income Statement as shown in the XBRL Rendering of SEC
Anything jump out at you? The numbers reported are the same, but look at the period they have been reported for. The data, which actually pertains to a quarter (i.e. 3 months), has been captured in the XBRL report against a 12-month period. Talk about a change in context! Now, if you were using an application that was in turn consuming this XBRL data, would you really be confident of the end result? I'm guessing not.
And this is just one of many examples that we found where contexts were not correctly assigned.
And how did we find it, given that the error was neither technically invalid nor a failure of any business rule check?
At IRIS, we have built a rich structured data repository to give you access to global normalized financial and non-financial data of public and private companies. In the process of normalizing the XBRL data of US SEC filings, we came across many such issues that seem technically valid, but can lead to poor data quality. Our focus has been on finding such issues and building checks and validations to address them so that you not only have data, but data that you can rely on.
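To give a flavour of what a check for the context issue described above might look like, here is a small Python sketch. It is purely illustrative and is not the set of checks we actually run: the function names, the rule of thumb that a quarterly report should only carry durations of roughly 3, 6 or 9 months, and the sample dates are all assumptions.

```python
from datetime import date

# Assumed rule of thumb: durations (in months) one would normally expect
# in a quarterly report (the quarter itself plus year-to-date periods).
EXPECTED_MONTHS_QUARTERLY = {3, 6, 9}


def months_covered(start: date, end: date) -> int:
    """Approximate number of months in an inclusive start..end period."""
    days = (end - start).days + 1
    return round(days / 30.4375)  # 30.4375 = average days per month


def is_suspicious_duration(start: date, end: date,
                           expected=EXPECTED_MONTHS_QUARTERLY) -> bool:
    """Flag a duration context whose length is unusual for the report type."""
    return months_covered(start, end) not in expected


# Hypothetical contexts: a genuine quarter and a twelve-month period.
print(is_suspicious_duration(date(2016, 1, 1), date(2016, 3, 31)))
print(is_suspicious_duration(date(2015, 4, 1), date(2016, 3, 31)))
```

A flag from a heuristic like this is a prompt for review rather than proof of an error, since some filers do legitimately report other periods.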
Feel free to reach out to us should you need any guidance or assistance in setting up a robust framework for checking data quality, or if you would like a free trial of the rich data set and analytics that we have built.
You can contact me at email@example.com.