I am sure many people reading this article will have worked on one, or probably many, projects which involve migration of data from one IT system to another. One of the things that the data migration elements of these projects tend to have in common is that they are painful. It is not that migration of data itself is always tricky; it is more that the quality of data on the source system tends to be an issue. Data cleansing and data mapping can be tricky, particularly when there has been poor data governance and stewardship. In a worst-case scenario, we might even find fields being used for different purposes by different teams, or there might be ‘debates’ over what specific data items actually mean (“Order date is the date the order is received”. “No! Order date is the date that the order is accepted, which means payment has also been made”).
There are so many issues that we could discuss, but I’d like to zoom in on one: the granularity of data.
What Does ‘Granularity of Data’ Mean for Migration?
This is probably best illuminated with an example. My home insurance, which I purchased via a large online insurer, was recently up for renewal. Technically, they are a broker rather than an insurer, but you wouldn’t know that to look at them (the policies are branded with their logo, and you have to look pretty hard to find out which insurer actually underwrites them).
Close to the renewal date I shopped around and found some slightly better prices. I rang my insurer to see if they could price-match, and I was surprised when they asked me for a whole bunch of information that they already had on file. Concerned, I asked why, and the agent explained to me that they had migrated from one IT system to another, and he no longer had access to the ‘legacy’ data.
This lack of access to old data itself worried me, but things got worse. When I explained that I occasionally work from home, the quote increased, and I was told that the existing renewal quote did not include this. This was extremely worrying, as working from home had always been included on previous policies as standard, and I had a paper-trail that showed this. Then it struck me: because this used to be included as standard, it was never noted on the system. The new system has ‘greater granularity’, but there was no way of correctly mapping the two as the information wasn’t appropriately recorded in the first place. This might sound abstract, so let’s unpick this point a little.
Imagine there’s a bounded list that an agent chooses from to denote the usage of the property. The option that they choose here affects the premium, the documentation, and the sections of the policy that are activated.
|Option on existing system||Mapped to this option on new system|
|NO – No Business use||NO – No Business use|
|(no equivalent option)||CL – Clerical Business Use Only|
|CO – Commercial use (visitors)||CO – Commercial use (visitors)|
Here we have an issue: because clerical business use was accepted as standard, it has never been recorded. Where a policy has “No business use” selected against it, this could mean either:
- No business use at all
- Clerical business use
This causes an issue for mapping to the new set of configured options, as there is now a specific option for clerical business use. Yet how do we map to this? One approach might be to map to the ‘highest rated’ option (i.e. assume clerical business use), but doing this will likely increase the premium unnecessarily for those for whom this doesn’t apply. Another, which the insurance company appears to have adopted, is to map to the ‘most favourable’ option (i.e. assume no business use). Yet this potentially leaves someone incorrectly insured, or at the very least with incorrect paperwork. The most thorough thing to do would be to contact each customer prior to renewal and ask them, or at the very least communicate what the default will be and ask them to correct this by exception. However, this is expensive (which is presumably why the insurer didn’t do it).
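To make the trade-off concrete, here is a minimal Python sketch of the three approaches described above. It is purely illustrative and not how any real insurer's system works: the `Strategy` names, the `CANDIDATES` table and the `map_usage` function are all my own assumptions, built only from the option codes in the table.

```python
from enum import Enum


class Strategy(Enum):
    MOST_FAVOURABLE = "most_favourable"  # assume no business use
    HIGHEST_RATED = "highest_rated"      # assume clerical business use
    FLAG_FOR_REVIEW = "flag_for_review"  # ask the customer before deciding


# Hypothetical lookup: the new-system options each legacy option could
# really mean, ordered from lowest rated to highest rated.
CANDIDATES = {
    "NO": ["NO", "CL"],  # "no business use" may hide clerical use
    "CO": ["CO"],        # commercial use maps one-to-one
}


def map_usage(legacy_code, strategy):
    """Return (new_code, status) for one legacy property-usage code.

    status is "exact" for a one-to-one mapping, "assumed" when a default
    was picked from several candidates, and "needs_review" when no code
    was chosen and someone has to check with the customer.
    """
    candidates = CANDIDATES[legacy_code]
    if len(candidates) == 1:
        return candidates[0], "exact"
    if strategy is Strategy.MOST_FAVOURABLE:
        return candidates[0], "assumed"
    if strategy is Strategy.HIGHEST_RATED:
        return candidates[-1], "assumed"
    return None, "needs_review"


print(map_usage("NO", Strategy.MOST_FAVOURABLE))  # ('NO', 'assumed')
print(map_usage("NO", Strategy.HIGHEST_RATED))    # ('CL', 'assumed')
print(map_usage("NO", Strategy.FLAG_FOR_REVIEW))  # (None, 'needs_review')
print(map_usage("CO", Strategy.MOST_FAVOURABLE))  # ('CO', 'exact')
```

The point of returning a status alongside the code is that an assumed value stays visibly different from a recorded one, so the decision made in the workshop is not silently baked into the migrated data.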
It is easy to imagine how, in the heat and pressure of a migration workshop, decisions like this get glossed over. The mapping looks so simple and logical. (It’s easy to picture a tired stakeholder yelling “We need to move on, we’ve got 17,017 other fields, just map ‘no’ to ‘no’, we’ll deal with the fall-out if it happens!”) Yet where there is a lack of granularity or certainty over the existing data sets, we really must ask “what is the impact of this mapping?”. Will it mean that our data integrity is further eroded? Will it lead to negative customer or business outcomes? And if so, what can we do to avoid these situations occurring?
As business analysts, we need to be comfortable having conversations at the ‘micro’ level, on items such as this, which may have far-reaching consequences, as well as at the ‘macro’ and strategic levels. Our ability to look deep and wide is one of the things that makes our role so valuable.
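One way to put a number on “what is the impact of this mapping?” before the workshop moves on is to count how many existing records sit on an option that does not map cleanly. The sketch below is a rough illustration under my own assumptions: the `CANDIDATES` table, the `ambiguous_counts` function and the sample codes are all invented for the example.

```python
from collections import Counter

# Hypothetical lookup: new-system options each legacy option could mean.
CANDIDATES = {
    "NO": ["NO", "CL"],  # ambiguous: one legacy option, two possible targets
    "CO": ["CO"],        # one-to-one
}


def ambiguous_counts(legacy_codes, candidates=CANDIDATES):
    """Count records whose legacy code has no single, unambiguous target.

    Codes with several candidates are included, and so are codes that are
    missing from the lookup altogether.
    """
    counts = Counter(legacy_codes)
    return {
        code: n
        for code, n in counts.items()
        if len(candidates.get(code, [])) != 1
    }


# Made-up sample of usage codes read from the existing system.
sample = ["NO", "NO", "CO", "NO"]
print(ambiguous_counts(sample))  # {'NO': 3}
```

If three quarters of the book sits on an ambiguous option, “just map ‘no’ to ‘no’” stops looking like a quick decision.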
What are your views? Please add a comment below, and let’s keep the conversation flowing!
About the author:
Adrian Reed is Principal Consultant at Blackmetric Business Solutions, an organisation that offers Business Analysis consulting and training solutions. Adrian is a keen advocate of the analysis profession, and is constantly looking for ways of promoting the value that good analysis can bring.
To find out more about the training and consulting services offered at Blackmetric, please visit www.blackmetric.com