Understanding the Taxonomy of Data in Challenging Jurisdictions

Most people are familiar with the taxonomy of species. This process classifies animals to organize and understand how they are different and alike. For over a decade, humans have explored the world of data in the same way. The more research is conducted into emerging markets, the greater the reliance on a classification system to understand how users can access the public data collected on a daily basis. This system is simply the Taxonomy of Data.

When looking at the global supply of information required to complete legal, financial and corporate transactions, clear category lines appear. To simplify, let’s focus this discussion on corporate records. Now think of a spectrum: to the far left is the digitized information that is online, centralized and open. To the far right is the opposite: paper files inside metal cabinets. The spectrum from far left to far right spans five different categories.

The five different categories of data

T1. Beginning with T1 at the far left, these data and records are online, centralized and open. Anyone from anywhere may access the Companies House corporate data for most companies registered in the UK. Anyone who wants to can find information on the ultimate beneficial owner (UBO), see all the foundational elements of these legal entities, and review a limited amount of annual accounts reporting. This point on the spectrum represents a vast ocean of data that is the easiest and most efficient to access.

T2. One step to the right is T2. This data is online, centralized, and commercialized. This category is most commonly found across many jurisdictions in Western Europe and Anglophone countries such as Canada and Australia. Despite some wrinkles, the data may be accessed with a generic user account and any credit card. Many database aggregator companies have replicated this T2 category on their own platforms. With a username and credit card, users can access any data that’s available within the database. Companies such as Dow Jones and LexisNexis offer products in this category.

T3. Take another step to the right, into T3, where data is online, centralized, and restricted. Depending on the jurisdiction, this restriction may simply be language, such as in China or Russia. It may be that the user must pay with a credit card connected to a local banking system. Sometimes these databases require a username and password attached to a local phone number, or they employ Internet Protocol (IP) restrictions, which may only allow local IP addresses to access the database. The T3 category is where one begins to see a clear need for understanding the local data environment beforehand. This category is oftentimes the most problematic because countries that fall into this space remain well inside the emerging market category, but the data required to complete international transactions, such as a merger or acquisition, is not readily available. It’s no longer “in the cloud,” and a local professional must be engaged.

T4. Stepping into T4, data is categorized as decentralized, mostly offline and mostly restricted. Some jurisdictions, such as the United States, have online portals, both commercialized and not. But jurisdictions in this category, such as Brazil and India, are also decentralized, which makes their data challenging to access. The processes and funding behind managing public data in these jurisdictions are not nearly as well organized or transparent. The processes are often intricate, and information takes a long time to access, if you’re able to access it at all.

T5. Finally, T5 is the catch-all for most edge-market countries, and data here is defined by being partially or completely offline, decentralized and restricted. Mexico is an infamous example: its public data is just as challenging to obtain as that of, say, Niger, but unlike Niger, Mexico is a significant recipient of foreign direct investment. Other T5 countries, such as Rwanda, offer online portals, but link rot and source rot often occur when using these portals, and there are often entire years that are unavailable. Myanmar, also in this category, is currently benefiting from a significant Japanese investment in its public data, but there are no guarantees information will be completely digitized by the time the funding runs out. In other countries in this category, such as Indonesia or the Philippines, the only option is to stand in line and fill out a form to request the information needed.
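
The five categories above can be read as combinations of three attributes: whether records are online, whether they are centralized, and how access is controlled. The Python sketch below is one possible way to encode that reading, not a formal definition. The names DataSource, Access and classify are hypothetical, and the rule separating T4 from T5 (decentralized with some online portals versus everything else) is a simplifying assumption, since the article describes those two tiers in overlapping terms.

```python
from dataclasses import dataclass
from enum import Enum


class Access(Enum):
    """How access to the records is controlled (labels taken from the article)."""
    OPEN = "open"
    COMMERCIALIZED = "commercialized"
    RESTRICTED = "restricted"


@dataclass(frozen=True)
class DataSource:
    """Hypothetical description of a jurisdiction's corporate records."""
    online: bool       # True if records are at least partly reachable online
    centralized: bool  # True if a single registry holds the records
    access: Access


def classify(source: DataSource) -> str:
    """Map a source's attributes to a tier label, T1 through T5."""
    if source.online and source.centralized:
        # T1 to T3 differ only in how access is controlled.
        by_access = {
            Access.OPEN: "T1",
            Access.COMMERCIALIZED: "T2",
            Access.RESTRICTED: "T3",
        }
        return by_access[source.access]
    if source.online:
        # Assumption: decentralized sources with some online portals are T4.
        return "T4"
    # Assumption: everything that is offline falls into the T5 catch-all.
    return "T5"


if __name__ == "__main__":
    registry = DataSource(online=True, centralized=True, access=Access.RESTRICTED)
    print(classify(registry))
```

In practice a jurisdiction rarely fits one tier cleanly, so a rule like this is better treated as a starting label for a researcher to revise than as a final answer.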

Demonstrating the difference from T5 to T1 in challenging jurisdictions

The best example of the far-right category, T5, that our company, Evidencity, has seen is in Juba, South Sudan. An on-site provider alerted the team that he had accessed corporate records, but they were in piles of paper and files on the floor of a storage room. This situation reflects the offline, decentralized and restricted system in place in this challenging jurisdiction. It represents the pools of the global supply of data that are the most difficult to access.

An example of the T2 category would be online, centralized and commercialized data. We would classify it as “limited” because there is some information the database does not reveal. For example, in the British Virgin Islands, corporate registry data will return a registered agent rather than the beneficial owners of a corporate entity. Other tax havens are similarly limited.

The T3 category is online, centralized and restricted. These databases restrict access in one of three common ways: IP address, paywall, or user class. An example of this is the real estate database in Dubai, which requires users to pre-register their IP address to access portals or databases. Many others, and this is a growing reality, require a subscription or some other payment to be processed before you may access any information.

How understanding the taxonomy of data can determine risk when conducting business

Access to public data is often part of what defines a country as an emerging market versus an edge market. Public data access may be coupled with the Transparency International Corruption Perceptions Index or the Freedom House Index to determine the level of risk associated with participating in any sort of financial transaction. Generally speaking, the more restricted the data, the more risk there is to mitigate before continuing business in emerging and edge markets. Understanding the Taxonomy of Data in any target country is an essential first step to avoid the pitfalls of bribes, lies and delays associated with challenging yet often rewarding jurisdictions.
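
As a rough illustration of how a data tier might be coupled with a corruption index, the Python sketch below blends the two into a single number between 0 and 1. The function name risk_score and the equal weighting are assumptions made purely for illustration; the article does not prescribe a formula. The only external fact relied on is that the Corruption Perceptions Index is reported on a 0 to 100 scale, where a higher score means a country is perceived as less corrupt.

```python
def risk_score(tier: int, cpi_score: float) -> float:
    """Blend a data tier (1 to 5) and a CPI score (0 to 100) into a 0 to 1 value.

    Hypothetical weighting: both components count equally. Higher means riskier.
    """
    if not 1 <= tier <= 5:
        raise ValueError("tier must be between 1 and 5")
    if not 0 <= cpi_score <= 100:
        raise ValueError("cpi_score must be between 0 and 100")

    # T1 contributes 0.0 and T5 contributes 1.0.
    access_component = (tier - 1) / 4
    # A CPI score of 100 (perceived as very clean) contributes 0.0.
    corruption_component = (100 - cpi_score) / 100

    return 0.5 * access_component + 0.5 * corruption_component


if __name__ == "__main__":
    # Placeholder inputs, not real country figures.
    print(risk_score(tier=3, cpi_score=50))
```

A score like this only helps rank jurisdictions against one another for triage. It does not replace local knowledge of how records are actually obtained.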

While these categories are ever-changing, it’s important for any leader handling data in this landscape to understand how to categorize data to better protect their organizations and customers when working in challenging and emerging markets.


Founder and CEO at Evidencity