Unlock the 4 Gates of Data: Building the Bulletproof Data Organization
Architecture and process are both required to get data to the business, but in what order, and how is all that complexity managed? Take a step back and look at how a data practice lives and breathes and you begin to see some patterns emerge. Data organizations often go stale because they cannot keep up with the demand for information. This leads to fewer questions being asked and fewer requests for information. As a result, the organization struggles to get funded and to reach the next maturity level. Other organizations put out an incredible amount of technology and vision, but nobody uses it. We have seen both sides, so you might wonder how you fix it.
Organizations often see things as projects or initiatives, not thinking there needs to be an overarching strategy. For them there are only two gates: Get the Data and Present the Data. In reality, there are 4 gates of data, and we are about to explore all of them. You might ask how I came up with these. The simple answer is that I interviewed a lot of successful data organizations and distilled what worked versus what did not. Four gates stood out: every successful organization navigated them and made allowance for them in its approach. Let us walk through them.
The first gate is Make Data Available.
Sounds straightforward? It is anything but. The first rule of each gate is to take it seriously, so let us get through the first one. To make data available you need the right architecture, the right processes, and the ability to ingest data quickly, value your data set, and curate it. Only then is that data available. You will need metrics around that as well. How quickly can you do it? The only acceptable answer is that you know enough to have the data ready when it is needed. The implication is that you need to be in the know, not waiting for a request. By the time a request comes in, it is too late. Wondering how to do that? Let us walk through an example.
Imagine a family: a mother, a father, and three children (3, 9, and 16 years old). In that family, the mother often prepares the meals for each person. She knows when her spouse will return home and the constant needs of a 3-year-old. In between, she supplies larger portions for the 9-year-old and a sparing, healthy vegan diet for the 16-year-old. She knows when they are going to eat, when they will want something new, and how to cater to diverse needs. Sound familiar? It is a nice model, but I will also give you the other one I have seen, a much larger family with 6 children. This example comes directly from my own family, where I saw my aunt always in the kitchen with a large pot, and whatever came out of that pot, everyone had to eat. She was simply overwhelmed, and this is how she managed. Larger demands on your time mean less quality and diversity. The issue for organizations is that they need to know and manage the demands. How much focus is really spent on that part of the process versus developing capability? Put together an organization that can sit in and with the business and communicate effectively. Data Speak is an idea we do not use often: the practice of having people in the organization who can speak the language of data and disseminate it through the organization. Data Speak is a key facet of successful data organizations.
Architecture needs to be thought out as something shared across projects, not something built for each project. You need a method to get data ingested quickly, one that is repeatable, manageable, and scalable. There are many ways to do this and we will not discuss them here, but suffice it to say that you should have a cloud component in there and the ability to manage self-service. Shared sets of data are better than siloed data, so make sure your architecture does not lock users out of looking across data (think of how orthogonal data sets make feature engineering easier).
On to the next gate. Make Data Searchable.
By my research, this is the most missed of all the gates and the least understood. Many organizations have document management with a search capability, so surely that is good enough? Not quite. Quite simply, this is the secret sauce of successful organizations, and the cloud is making this capability mainstream. You may have read about the value that most companies miss: dark data, the data locked in their information or document systems, is where the value of their data resides. One tends to think of data sources and transactional systems as the places where information is stored, and not of other systems. Cognitive and semantic search capabilities are a way of unlocking and managing unstructured data that sits in documents and in data sources. Well-written internal documents for procurement, RFPs, and contracts are usually not considered data. Step one is to consider them data: they are valuable and reusable, and should be managed as such. As business users find ways to interact with data, this will be the beginning of a great friendship. Think of it as bringing the intelligence of Google search into the organization and supercharging efforts to answer questions, with a system that does not send you back into a backlog every time you search for a data source. The ability to ask questions and get answers is a long-misunderstood part of the data equation. You will need a gate in place, and you will need to navigate through it.
Let’s get to the third gate. Make Data Manageable.
Curate that data: make it uniform, complete, and valued. You can use multiple methods here; I would recommend you look at data as an asset and create equations accordingly. If you need to understand this in more detail, start by looking at how often a piece of data is loaded versus how often it is requested. To do that you will need to monitor your metadata layer as well as your reporting layer (which you are of course doing anyway? It's ok, you can manage this through the cognitive search you set up). So, in short, put a process in place to understand how data is used and make sure that the data is ready to be used. The last part of curation relates to having data ready when it is needed. We talked about Data Speak earlier, and it comes into play here as well. We will cover this in later blog entries, but you need to be ahead of the game (your data strategy should always be looking for where data is being requested and supporting those drives).
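To make the loaded-versus-requested comparison concrete, here is a minimal sketch in Python. It assumes you can export two simple event lists, one from the metadata layer (loads) and one from the reporting layer (requests). The function name, the data set names, and the counts are all hypothetical and only there for illustration.

```python
from collections import Counter


def usage_ratios(load_events, request_events):
    """Return requests-per-load for every data set that was loaded.

    Both arguments are lists of data set names, one entry per event.
    """
    loads = Counter(load_events)
    requests = Counter(request_events)
    # A data set that is loaded but never requested gets a ratio of 0.
    return {name: requests.get(name, 0) / count for name, count in loads.items()}


# Hypothetical event logs (assumed data, not taken from any real system).
load_events = ["sales"] * 4 + ["inventory"] * 2 + ["hr_archive"] * 2
request_events = ["sales"] * 12 + ["inventory"]

ratios = usage_ratios(load_events, request_events)

# List the least-used data sets first; these are candidates for review.
for name, ratio in sorted(ratios.items(), key=lambda item: item[1]):
    print(f"{name}: {ratio:.2f} requests per load")
```

A low ratio suggests you are spending effort loading data that few people ask for, while a high ratio points to the data sets that most deserve curation attention. Where you draw the line is a judgment call for your own organization.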
The final gate to navigate is Make Data Actionable.
Simply put, you need to get that data used, quickly and in a variety of ways. Reports, dashboards, and AI data sets will all be part of the equation. The service layer of actionability is the layer that lets users quickly find, provision, and access the information. Data can be requested as a CSV file, as part of a regular data feed, as part of an application, or as part of a real-time streaming service. Automating this layer frees up time to execute on other activities and stay on top of curation deadlines. There are a lot of architectures that can get this done, whether SOA-based or cloud-enabled. Better to have automation so you can focus on your core demand of getting that data ready for consumption.
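As a rough illustration of what an automated service layer might look like at its very smallest, the Python sketch below serves one curated data set in three of the shapes mentioned above: a CSV file, a feed payload, and a record-by-record stream. The function names, the format keys, and the sample rows are hypothetical, and a real service layer would sit behind whatever SOA or cloud tooling you already use.

```python
import csv
import io
import json


def as_csv(rows):
    """Render rows (a non-empty list of dicts) as CSV text with a header."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(rows[0].keys()), lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


def as_feed(rows):
    """Render all rows as one JSON document, as a batch feed might deliver."""
    return json.dumps(rows)


def as_stream(rows):
    """Yield one JSON record at a time, standing in for a streaming service."""
    for row in rows:
        yield json.dumps(row)


# Hypothetical registry of delivery formats; add entries as consumers need them.
DELIVERY_FORMATS = {"csv": as_csv, "feed": as_feed, "stream": as_stream}


def provision(rows, delivery):
    """Hand back the same curated rows in the requested delivery format."""
    if delivery not in DELIVERY_FORMATS:
        raise ValueError(f"unknown delivery format: {delivery}")
    return DELIVERY_FORMATS[delivery](rows)


# Assumed sample data for illustration only.
rows = [{"region": "east", "orders": 120}, {"region": "west", "orders": 95}]

csv_text = provision(rows, "csv")
stream_records = list(provision(rows, "stream"))
print(csv_text)
```

The point of the registry is that the data is curated once and provisioned many ways, so a new kind of request becomes a new entry rather than a new project.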
These are the 4 gates that, without exception, characterize organizations that can move in an agile manner. They execute by navigating the gates, and they lay out an architecture and process that listens to the business and delivers value. Dinner is served.