How To Develop And Modernize Your Enterprise Data Strategy

June 25, 2020 Data Based Consulting

Our data consulting team has had the privilege of helping companies all across the US develop strategies, data products, and algorithms, upskill their teams, and grow new data-focused departments.

With all of this work focused on growing teams, data analytics strategies, and custom data science solutions, we have developed several helpful assessments and guides that we walk through with our clients.

Today we want to walk through one of the data analytics strategy guides we use to build the final reports for our clients.

The purpose of this guide is to help you assess where your data analytics team, strategy, and culture currently stand, and to outline how your team can improve and modernize its data strategy.

This guide will help you take account of the data products you already have on hand, such as data warehouses, data sources, databases, dashboards, and algorithms.

It will also help outline what goals your team has and what decisions your team is looking to drive.

The end goal of all of this is to allow your team to write out a few pages summarizing its data strategy and outlining how it can be most impactful. Our consulting team has found this very helpful for taking the time to get organized and to crystallize future strategies.

This guide not only lays out what you should be tracking, but also shares the options and recommendations our team would offer in various situations, such as what to do if your team doesn’t have access to a data engineer, or what to do with unused dashboards.

Don’t feel restricted to the guide. Instead, use it to take inventory of where your data analytics strategy is and go from there.

And with that let’s start by tracking where your team is at.

Take Inventory Of Your Data Sources

For this, you don’t need to go crazy and list every table, stored procedure, and SQL statement that exists.

Instead, just record the general name of the data source, the category it fits in, a few notes on the data that exists, who owns it (even if it is no one), and what system it is located on. Sources could include data warehouses, flat files you get online or from third parties, databases, Workday, Salesforce, and so on.

The data could be stored in Postgres, MySQL, or really any data storage system. Tracking it just helps consultants like us get an idea of your data landscape.

For example, suppose you have databases in MySQL, data in Salesforce, and even more data in Workday, and you want to use all of it. This will require a decent amount of work, as you are interacting with three very different systems.

But for now, let’s track your data sources below.
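
If a spreadsheet feels too loose, the same inventory can be kept as a small script. The sketch below is only an illustration: the `DataSource` fields mirror the attributes listed above (name, category, notes, owner, system), while the class name, helper functions, and example entries are all hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Optional


@dataclass
class DataSource:
    name: str             # general name of the data source
    category: str         # e.g. "database", "SaaS application", "flat file"
    system: str           # where it lives, e.g. "MySQL", "Salesforce"
    owner: Optional[str]  # None when no one owns it
    notes: str = ""       # a few notes on the data that exists


# Hypothetical entries, for illustration only.
inventory = [
    DataSource("orders_db", "database", "MySQL", "Sales Ops", "order and customer tables"),
    DataSource("crm", "SaaS application", "Salesforce", "Sales", "accounts and opportunities"),
    DataSource("hr", "SaaS application", "Workday", None, "employee and payroll records"),
]


def sources_by_system(sources):
    """Group source names by the system they live on."""
    grouped = defaultdict(list)
    for source in sources:
        grouped[source.system].append(source.name)
    return dict(grouped)


def unowned(sources):
    """Return the names of sources that have no owner recorded."""
    return [source.name for source in sources if source.owner is None]


print(sources_by_system(inventory))
print(unowned(inventory))
```

Grouping by system gives a quick read on how many different systems a project would have to touch, and the unowned list shows where ownership still needs to be settled.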

 

Why Is It Important To Track Data Sources?

Along with getting a good idea for the scope of the work required to pull data from these different data sources, this step also lets the team know what types of analytics are possible. By providing a general understanding of your data sources, you will have a better understanding of what questions you can answer.

If you don’t know what data exists and where you could meld it together, for example accounting data with sales data and employee data, it becomes harder to see what you can do.

Even just following a few of the steps in the process can help provide perspective.

You will probably think of ideas for reports, dashboards, and products your team could develop along the way.

That is one of the side effects of reflecting on the work your team has already done and the data your team has access to. As we have consulted with companies, we have found that employees and team members often come up with ideas while we work through our process, because it gives them a new perspective.

So as you go through the rest of the steps, keep an open mind about how your team could use everything it already has in the future.

Take Inventory Of Your Data Pipelines And Automated Workflows

ETLs (Extract, Transform, and Load) and data pipelines are the main way of getting data from point A to point B.
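
For readers who have not seen one, the three ETL stages can be sketched in a few lines. This is a minimal illustration rather than a production pipeline: the CSV content, the `orders` table, and the cleaning rules are all hypothetical, and an in-memory SQLite database stands in for a real warehouse.

```python
import csv
import io
import sqlite3

# Hypothetical source data standing in for a flat file from a third party.
RAW_CSV = """order_id,amount,region
1,19.99,west
2,5.00,East
3,,west
"""


def extract(text):
    """Extract: read rows from the CSV text into dictionaries."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(rows):
    """Transform: drop rows with no amount and normalize types and casing."""
    cleaned = []
    for row in rows:
        if not row["amount"]:
            continue  # skip incomplete records
        cleaned.append((int(row["order_id"]), float(row["amount"]), row["region"].lower()))
    return cleaned


def load(rows, conn):
    """Load: write the cleaned rows into the destination table."""
    conn.execute("CREATE TABLE IF NOT EXISTS orders (order_id INTEGER, amount REAL, region TEXT)")
    conn.executemany("INSERT INTO orders VALUES (?, ?, ?)", rows)
    conn.commit()


conn = sqlite3.connect(":memory:")  # in-memory database as a stand-in for a warehouse
load(transform(extract(RAW_CSV)), conn)
row_count = conn.execute("SELECT COUNT(*) FROM orders").fetchone()[0]
print(row_count)
```

Real pipelines add scheduling, logging, and error handling on top of this shape, which is where most of the time and expertise goes.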

Not all teams have data pipelines, ETLs, or automated systems, and that is OK. The purpose of this step is merely to find out whether any exist. If they do, what language, framework, or tool do they use, and how many are there?

If there aren’t any, then we will look into what types of tools might be best for the team and whether the team will need training to understand ETLs as a whole.

For now, let’s list out the pipelines, what data they support, who owns them, and what frameworks they are written in.

ETLs And Data Pipelines – How To Manage Them

Before we go on to the next section we would like to discuss some of the suggestions we provide for teams depending on the information we track.

As mentioned earlier, we sometimes recommend training and upskilling if a team is not aware of what ETLs are. This is highly dependent on where the team sits in the organization because not every team should take on ETL development.

In addition, ETLs and automated workflows require a decent amount of time and technical expertise. So upskilling isn’t always the right answer.

For example, upskilling may be the wrong choice if the team is purely a data science team with very little understanding of data engineering. Data scientists should be focusing their energy on research and analytics, not on developing data pipelines.

So when your team doesn’t have a data engineer, or your company is too small to afford one full time, we generally recommend one of a few options.

Option 1 – Partner With Data Engineering Or BI Teams

Large organizations will have a BI or data engineering department of some kind or another. They might be broken up by department or they might be a central body that every team needs to go through.

If this is the case, then you will need to partner closely with those data engineers. This means you need to build relationships with the managers and individual contributors. 

That seems obvious right?

But what does this actually look like? 

This section varies from one final report to the next, but we do cover a few key concepts on how to improve your cross-functional relationships.

Here are a few tips.

Create win-win situations

Data engineers and data architects tend to do a lot of work that doesn’t always get recognized. Because they are mostly developing the back-end infrastructure that lets your teams analyze data, they often don’t get to sit in the limelight.

So as your team figures out new models and algorithms and creates an impact, make sure you give credit to the data engineering team, both publicly to directors and higher-ups and in private.

The win-win is ensuring that the data management teams get acknowledged as an important aspect of the company at a director/C-suite level. This ensures that their team continues to exist and be well supported.

Teams that don’t get discussed often in a positive light can often be viewed as a cost rather than a benefit.

Privately saying thanks a few weeks after the project is also helpful. It gives the data management team feedback that their work was good not just for the short term but also for the long term, meaning it plays an important role in the company.

This seems like a small recommendation. However, we have seen time and time again that data management teams get the work done and are then forgotten.

Explain to the data engineers the business impact

As many of our consultants have worked as data engineers, they are aware of the never-ending work that comes with the territory. Everyone needs new tables, new pipelines, and dashboards. This makes it difficult to manage what is a priority and what isn’t.

Part of the reason is that data engineers are sometimes several steps removed from the business. Even when they are more integrated into a specific area of expertise, they still might not know what your team’s goals are.

So provide clear context as to why your work is important. Truthfully, the data engineering team may still put it behind much of their other work, but you do need to try. Context not only helps the data engineering team prioritize, it also helps them understand where to get the data from.

This all assumes that you have some form of data management or data engineering team. However, this is not always the case.

Option 2 – Hire a Data Engineer

Some companies don’t have dedicated data engineering teams that create automated workflows. Even if they do, those teams are often too busy to build pipelines specifically for you. And some of our clients are small companies that can’t really hire a data engineer or data scientist full time.

In those cases, another option is to bring in a full-time or part-time data engineer. There are plenty of consultants and contractors who fit this role well. We have several clients for whom we build and manage data warehouses, pipelines, dashboards, and machine learning models.

This helps reduce their costs while still getting them the data they need.

This option is great when your company has the resources to afford an extra pair of hands. Both in terms of management costs as well as

Option 3 – Drag And Drop Pipelines

One other great option is drag and drop pipelines. When your pipelines are simple enough, your IT person might be all that is required. Instead of a code-based pipeline tool, you can use a drag and drop pipeline tool. These often do require some training, but they are usually designed to be much simpler than code.

We would also mention that drag and drop data pipeline tools, like Stitch for example, come with vendor lock-in. Because these tools are specific to their vendor, it is difficult to switch. Thus, if the vendor at some point decides to charge more, you will just have to pay it, even if it is double what you thought it would cost.

This is a small trade-off that could be worth it for small organizations.

Option 4 – Data Virtualization Tools

Data virtualization is a slightly newer option and truthfully, we don’t recommend it to everyone. Data virtualization allows data teams to create a virtual layer over multiple data sources. This means you can connect data sources that come from different database systems like Postgres, MySQL, DynamoDB, etc., and often across cloud providers.

In turn, this means you can reduce a lot of the heavy lifting of ETLs and data pipelines.

The caveat is that these tools often require a whole new skill set.

Working with providers like Denodo, you will quickly find that you need to be somewhat proficient in Denodo itself for it to be useful.

This is a common issue with technologies that require training before they become useful. In addition, these technologies are not universal. Python is used at a great many companies; Denodo is not.

Thus, incorporating it into your team’s third-party tools means you will always need someone who knows how to use it. This becomes difficult when you need to replace the person who used to have that skill set. Many developers know Python, but Denodo is more specialized and thus harder to hire for.

It doesn’t mean it isn’t an option. However, it does require a specific set of skills that might be difficult to find.

Finding The Right Data Engineering Solution

All of the solutions above move the work away from the data scientist, who should be focusing on analysis. Although we won’t go into it here, the big difference between data science and data engineering is the goals of the individuals.

When you take away data scientists from focusing on analysis work and force them to do engineering work, it not only costs time but takes away work they enjoy doing.

So we don’t recommend it.

Now that we have discussed a decent amount about what to do if you don’t have the support for ETLs and data management, we will go on to the next section where we will talk about dashboards.

Take Inventory Of Your Dashboards

Almost every data science, data analytics, or even just business team we have worked with has had at least one or two dashboards that they maintain.

Dashboards are still a very common business tool that can help succinctly describe what is occurring in a business.

When designed properly, we find them very valuable. However, it is not uncommon for dashboards to end up forgotten. That is why taking inventory of the dashboards your team manages is a useful exercise.

So let’s list out the dashboards your team manages.

 

This allows our team to see what decisions and metrics our clients are responsible for. In turn this lets us have a better understanding where the team fits in the organization as well as further helps us understand how we can start to guide their data analytics strategy.

In our experience, dashboards have a bad habit of ending up in the dashboard graveyard, forgotten and unused.

This is why we like to list out a few of the main dashboards your team uses and the decisions they drive, along with some dashboards that are unused.

Unused Dashboards

We don’t just look into what dashboards are being used, but we also track dashboards that are either lightly used or unused.

This is for a few reasons. 

One, we might be able to recommend integrating some of a dashboard’s information into another dashboard.

Two, we might find dashboards that are actually very important to the business but have simply been forgotten about.

In these cases, we will work to help the team promote the dashboard and its usefulness. This usually requires some buy-in from directors and stakeholders. Once that is done, the team has increased its impact with little work.

Finally, if the dashboard really provides no value, then we recommend getting rid of it.
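
If your BI tool lets you export usage data, finding candidates for this review can be partly automated. Here is a minimal sketch, assuming you can get a last-viewed date per dashboard. The dashboard names, dates, and the 90-day threshold are hypothetical, and a dashboard flagged here is only a candidate for review, since it may be important but forgotten.

```python
from datetime import date, timedelta

# Hypothetical usage records: dashboard name -> date it was last viewed.
last_viewed = {
    "payroll_overview": date(2020, 6, 20),
    "sales_funnel": date(2020, 6, 1),
    "legacy_inventory": date(2019, 11, 3),
}


def find_stale(last_viewed, today, max_idle_days=90):
    """Return dashboards not viewed within max_idle_days, oldest first."""
    cutoff = today - timedelta(days=max_idle_days)
    stale = [name for name, seen in last_viewed.items() if seen < cutoff]
    return sorted(stale, key=lambda name: last_viewed[name])


print(find_stale(last_viewed, today=date(2020, 6, 25)))
```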

Recap

This section allows us to review with teams exactly what their dashboards even are and let them qualify their value. In our experience, it is not uncommon for directors and managers to come up with ideas for dashboards that they then never actually use to make decisions.

This is, in a sense, a waste of time. There is some value in informing, but generally, dashboards should drive some sort of action.

For example, you might have a dashboard that tracks payroll and breaks out the different types of payroll like overtime.

In turn, directors may use that to better forecast how many people they should hire to optimize payroll costs.

A good data strategy makes sure dashboards drive decisions instead of just displaying vanity metrics. We won’t be going into vanity metrics in this data strategy guide.

Do You Currently Have A Data Team And Who Is On It

All of our clients have different sized data teams, budgets, and resources.

Small operators often don’t have anyone doing this type of work, or if they do, it’s just their one IT person who manages their cloud, analytics, and website. This poses several challenges when it comes to implementing new technologies or data-focused strategies.

Billion-dollar corporations, on the other hand, have huge teams, processes, and complex bureaucracies that bring a whole different set of challenges.

So this section is geared towards two paths: one where you already have a data team and one where you don’t.

Pretty much every billion-dollar corporation has one if not many data-focused teams. 

So let’s go down the billion-dollar path first.

Large Corporations

Large corporations will have most of the cliché data professionals. This includes data scientists, data engineers, research scientists, BI developers, data analysts, and other combinations of the words engineer, scientist, developer, etc.

How these organizations set up their teams varies. Sometimes teams are a mix of multiple disciplines where each member plays a role in the data workflow. Other times each team specializes in one discipline and the teams work together on specific projects.

This will change our consulting team’s final reports, but it doesn’t change the need to keep track of what types of team members are on each team.

So below let’s list out a few of the data specialists your team supports.

With that out of the way, let’s list out the skill sets you think your team has.

We don’t do this on the person tracking sheet because we assume most teammates can learn the skills required rather quickly.

What Skill Sets Does That Team Have

Here we only track some high-level notes and the skill sets we believe the team has. We will also often add notes about where we saw the team demonstrating these skill sets. This is helpful when we need examples in the future as we write reports. Some of the skills you can include are:

  • Data Pipelines And ETLs
  • Statistics
  • Analytics And Metrics Developments
  • Data Warehouse And Database Management
  • etc.


Now let’s discuss small companies.


Small Companies Without Data Teams

Not every company can afford a full-time data scientist or data engineer. However, they might need some reporting, metrics, or models developed. 

Oftentimes this is where our team comes in and develops entire data workflows, from creating a data storage system like a data warehouse and ETLs to developing dashboards and algorithms.

But in general, we recommend finding people who can develop an easy-to-run system that automates nearly 100% of the data loading, manipulation, and deployment.

This is usually a little easier for smaller companies that have only a few data workflows.

There isn’t a huge need for a full-time data professional, just someone who comes in and checks that your automated systems are still running.

Other than that there is not too much to track here.

Is Your Team Managing Research Or Algorithms 

Some teams we work with are more focused on research and developing algorithms. These are also important to track, both in terms of high-level documentation on what the algorithms are and in terms of how they change over time.

There are lots of great tools like SaturnCloud and Domino Data Lab that were developed to help track research and algorithms over time. They can be used to improve collaboration as well as act as version control.

In this case we do like to track who is in charge of the algorithms, what data sources support the algorithms and the purpose of the algorithm. Go ahead and track that below.

Moving Beyond The Technical Aspects

Improving a team’s or company’s enterprise data strategy involves much more than the technical aspects.

You will also need to learn about the team’s relationships and stakeholders as well as their goals. This is the last pillar we find important at this stage in the process of developing a data strategy.

Regardless of how great a team is technically, without good relationships, strong stakeholders, and clear goals, it is difficult, if not impossible, to be successful in large companies.

Note: If we work with smaller companies there might not be stakeholders, but there are still goals and targets that we will need to hit.

Who Are Your Major Stakeholders

Our team is built up of engineers and data scientists. We all went through our own phase of believing that just really smart design and engineering was enough to get things done in the business world.

Truthfully, our experience in large tech organizations as well as small start-ups has taught us that you need to develop strong relationships with your stakeholders.

Your cross-functional stakeholders help ensure that your projects actually provide the value they are supposed to. If you have poor buy-in or work without communicating your progress, timelines, and goals to stakeholders, then you are likely to at the very least be perceived as unreliable.

So as part of our inventory portion of the assessment, we work with our clients to figure out their largest stakeholders.

It doesn’t need to be every possible stakeholder.

Just the highest level stakeholder in a specific department that our clients work with directly.

If you’re following along you can list them in the table below.

As stated before, this step is just as important as understanding the technical side of a client’s current status.

In order to develop a future strategy 

Do You Currently Have Clear Goals For Your Team

An important part of any team is clear goals. In the data world, you can quickly get involved in a lot of ad-hoc analysis and data pulls for other teams.

This isn’t necessarily bad. But it might not be the way your team can provide the most impact.

So before getting too far in the process, take a moment to record what your team is currently doing (if you have a data team).

Take another moment to figure out what you want your team to do. This might not be what your team already does.

Here are some ideas of goals your teams might want to consider.

  • Our team wants to increase the conversion of customers in our sales funnel
  • Our team wants to decrease inefficiencies in our company’s supply chain
  • Our team wants to help increase product traction
  • Our team wants to drive better data science practices
  • We want our team to be in charge of data standardization
  • We want 15% of our team to be focused on ad-hoc work

These are some overarching goals that can help guide your team. Goals are important in any team, department, or organization. They act as a guiding light.

Goals help you know what skill sets your team may need to develop or hire out. 

Goals help you know which relationships internally you need to cultivate with which directors and managers.

Goals help provide that X on the other side of the map, toward which your team can start charting a path.

That way, when other teams start pressuring you to constantly do ad-hoc work, you can check whether the work aligns with your goals.

If you start doing too much ad-hoc work and get sidetracked from your goals, your team can never truly be as impactful as it should be.

Thus, our consulting team will always include a portion in our reports that discusses what we think a team’s goals should be, based on the team’s skill sets, placement in the organization, and relationships.

Side note: This isn’t to say you shouldn’t do ad-hoc work. You just need to take into consideration how much ad-hoc work your team does on a normal day-to-day basis.
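
One simple way to see how much ad-hoc work your team actually does is to tally it from a work log. The sketch below is an illustration only: the log entries are hypothetical, and treating the 15% from the example goal above as a share of logged hours is our assumption.

```python
# Hypothetical work log: (task, hours, is_ad_hoc).
work_log = [
    ("sales funnel model", 30, False),
    ("data pull for finance", 6, True),
    ("supply chain dashboard", 20, False),
    ("one-off revenue question", 4, True),
]


def ad_hoc_share(log):
    """Return the fraction of logged hours spent on ad-hoc work."""
    total = sum(hours for _, hours, _ in log)
    ad_hoc = sum(hours for _, hours, is_ad_hoc in log if is_ad_hoc)
    return ad_hoc / total if total else 0.0


TARGET = 0.15  # assumed reading of the 15% example goal as a share of hours
share = ad_hoc_share(work_log)
print(f"{share:.0%} of hours were ad-hoc (target {TARGET:.0%})")
```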

Where Should You Go Part 2

At this point in our consulting team’s process, we have a good understanding of where our clients are.

We know what data sources they have, we know what data pipelines and dashboards they have, and we know where the team sits relative to the rest of the company.

These are all very important factors to try to get straight early on.

This lets you know where the team is, and now you can start discussing where the team would like to go.

First, we can go back to the goals section of this worksheet and see what types of goals the team has.

From there we would assess if the team is currently meeting those goals, if they are a good fit for said goals and what goals they might be missing out on. 

This section is more about taking a high-level view. We aren’t as interested in the specific individual contributor work as in where we can see our client fitting best in their larger organization.

What Types Of Decisions Can Your Team Drive

Having goals is step one. Another area we look into is what decisions a team is looking to drive.

This is somewhat connected to goals. However, it is often more specific.

For example, perhaps your data team’s goal is to reduce the time spent in your sales funnel.

Based on that, what decisions does your team need to drive? 

Do you need to change the product design of your sales experience?

Do you need to change the training of your sales team?

All of these decisions can lead to impact and change, but from a data team’s perspective, the team is merely providing the numbers.

The team will need to be well aligned with the business’s goals.

Otherwise, it risks having its project fail.

This brings us to our next section which is discussing risk.

What Are Your Team’s Biggest Risk Factors

One concept that is often overlooked is risk.

Risk means something slightly different in the project world. It doesn’t necessarily mean something bad happening, like the risk of an airplane falling out of the sky.

Instead, it often refers to the factors that will cause your project to either slow down or stop outright.

Our consultants have had to lead many projects and have in turn faced many of these risks.

Here are some major ones we will write about in our reports, as well as often provide solutions to fix them.

Cultural 

Change is hard, especially when it comes to process and technology.

Organizations get comfortable with their way of doing things, and when someone introduces a new way of approaching a problem, many people push back.

This is one of the biggest risk factors there is when it comes to introducing new ideas in a company.

To try to reduce this risk, the best thing to do is to get stakeholders involved in whatever your new project is early.

Are you considering using machine learning to make some crucial decisions that up until now have been decided by humans?

Make sure you get the people who originally made that decision involved. They can feel threatened by the fact that you are trying to take work away, or they may lack confidence in your technology.

Thus, you need to make sure that they work with you on the process. Often the technology won’t take away their job, but instead, just let them focus on more important tasks.

Poor Buy-In

Any project, regardless of size, requires buy-in. This is slightly different from cultural risk. Cultural risk specifically refers to issues where people are against adopting a new idea. Poor buy-in occurs when people agree with the change but don’t do anything about it.

Change isn’t just hard because it’s new. It’s hard because it requires work. So you need to convince people that your new method is better than the old method.

Part of this is making sure you spend time telling people that the new data product, dashboard, or algorithm exists. If you don’t tell people, then they won’t know. Thus, you need to put time into working with your team to show them the value.

In addition, get your stakeholders involved so you have champions for your new data product besides yourself.

This will help reduce the risk of your product going unnoticed.

Process Bottlenecks

As consultants, we are constantly dealing with process bottlenecks. These take many forms. In particular, we see this in larger companies that have teams where goals don’t always align. 

This is what often leads to internal processes. Usually the goal is to help prioritize teams and keep them focused. But concepts like Scrum and other project management tools can also act as bottlenecks that slow down new work.

So being aware of different teams and their processes is important.

Security 

We have been hit by security issues time and time again. For example, we once worked at a company that blocked sites like GitHub, Google Apps, and Stack Overflow. This made our team’s typical workflow very difficult.

We couldn’t collaborate with their teams as easily because all the documents had to be shared as downloadable files rather than in the cloud, code had to be transferred in and then built out on their own version control system, and so on.

This isn’t to say that security processes are wrong. They are necessary. They keep companies secure and depending on the industry your data team is in

Data Silos

Data silos remain one of the big issues larger corporations struggle with. Data warehouses and databases tend to be siloed by team. Finance data is with the finance team, sales with sales, operations with operations, and so on.

This causes several issues.

For one, these data systems don’t speak to each other. When you want to join data across systems, there often aren’t easy-to-join IDs that let you connect different transactions with each other.
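
To make the missing-ID problem concrete, here is a small sketch of one common workaround: a manually maintained crosswalk table that maps one system’s IDs to another’s. The record layouts, ID formats, and function name are hypothetical and only meant to show the shape of the problem.

```python
# Hypothetical records from two siloed systems that use different customer IDs.
finance_invoices = [
    {"finance_customer_id": "F-100", "invoice_total": 1200.0},
    {"finance_customer_id": "F-200", "invoice_total": 450.0},
]
sales_deals = [
    {"sales_account_id": "ACC-9", "deal_stage": "closed won"},
    {"sales_account_id": "ACC-7", "deal_stage": "negotiation"},
]

# A manually maintained crosswalk that maps one system's ID to the other's.
id_crosswalk = {"F-100": "ACC-9"}


def join_invoices_to_deals(invoices, deals, crosswalk):
    """Join invoices to deals through the crosswalk; report invoices with no match."""
    deals_by_id = {deal["sales_account_id"]: deal for deal in deals}
    joined, unmatched = [], []
    for invoice in invoices:
        account_id = crosswalk.get(invoice["finance_customer_id"])
        deal = deals_by_id.get(account_id)
        if deal is None:
            unmatched.append(invoice["finance_customer_id"])
        else:
            joined.append({**invoice, **deal})
    return joined, unmatched


joined, unmatched = join_invoices_to_deals(finance_invoices, sales_deals, id_crosswalk)
print(joined)
print(unmatched)
```

Every ID that lands in the unmatched list is a record someone has to reconcile by hand, which is a large part of why cross-silo projects take longer than expected.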

In addition, there are often security issues that make it difficult to get access to the data. You will need to request access from the other team. This gets put into a ticketing system, and then maybe after a month or two someone finally gets you access to the data.

All the while your managers and directors are constantly asking where their new report is.

This is why data silos can act as major bottlenecks for your next data science, data engineering, or BI project.

How Do Some Companies Avoid Data Silos

Some companies have developed more open data protocols. Often what we see is decentralized-centralization as a method to avoid

As consultants, we ourselves are very process-driven. However, there is a point where due to the size of a company and the complexities of their various  

Other Teams

Another major issue that faces many teams is, well, other teams. It could be that the goals of other teams don’t align with yours, or that you take on projects similar to theirs.

Many teams are very protective of their responsibilities, and before you know it you could be accidentally working on someone else’s project.

This is why bringing in other teams and getting them on board with your project is a critical step. During our team’s projects, we have learned that you need to work not only with the team that brought you on but also with other teams that might be doing similar work.

This helps provide context and gets other people on board with your client’s overall strategy.

Can You Have External Influence

At this point our team has a pretty good idea about a client’s data sources, processes, infrastructure, and current approach to their data strategy. So our next step is to take a moment and reflect on where we think they can start playing a bigger role.

Modernizing your data infrastructure is not only technical. It is also about improving the positioning of our clients in their organization. This requires taking account of the skills, relationships, and data the team manages.

From there, our team can assess where our clients have the best opportunity to improve their company.

What Type Of Team Are We Working With

At the end of the day, most of this is based on skill set. Regardless of what the team currently manages, going forward it’s usually best to focus on the team’s strengths. Sometimes this means our consultants will recommend offloading work that might be better suited for cross-functional partners.

For example, if a data engineering team is doing some research or data science work, we would recommend finding a new team to manage it (as long as the company has a data science team). Similar things can be said for data science teams who are doing too much data engineering work.

We believe that teams should focus on what they are good at. We discuss this further in our post about data scientists vs. data engineers, where the difference is less about skill sets and more about the goals of the individuals.

Growing Your Team’s Impact 

One major reason clients reach out to our team is to grow their impact in their organization. Once we understand where our client fits in their enterprise and what role they currently play, we can start to outline their future strategy.

This will usually be one of the final sections in our team’s data strategy assessment.

This section of our strategy assessment will discuss what projects we think the team should be taking on to maximize their impact.

We will usually provide more project recommendations than the team could take on. We do this because we then lay out what we see as the priorities for the team.

What Projects Should You Take On

To prioritize our project recommendations, we use a four-quadrant chart.

The two axes of this chart are Impact and Time Required.

This tool helps teams take the mountains of requests they get from managers and directors and find the most impactful ones.

For example, projects that don’t require a lot of time and have high impact should be a high priority.

On the other hand, if a project idea you get from a manager has low impact and requires a lot of time, then you should pass on it.
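
The quadrant logic is simple enough to sketch in code. In the example below, the 1 to 10 scoring scale, the threshold of 5, and the project names are hypothetical. Only the "high priority" and "pass" verdicts come from the guidance above. The labels for the other two quadrants are our own placeholders.

```python
def quadrant(impact, time_required, threshold=5):
    """Place a project in one of four quadrants from 1-10 scores."""
    high_impact = impact >= threshold
    long_time = time_required >= threshold
    if high_impact and not long_time:
        return "high priority"   # high impact, little time
    if high_impact and long_time:
        return "plan carefully"  # placeholder label, not from the guide
    if not long_time:
        return "fill-in work"    # placeholder label, not from the guide
    return "pass"                # low impact, lots of time


# Hypothetical project requests scored as (impact, time_required).
projects = {
    "churn dashboard": (8, 3),
    "warehouse migration": (9, 9),
    "one-off export": (2, 2),
    "custom report redesign": (3, 8),
}
for name, (impact, time_required) in projects.items():
    print(name, "->", quadrant(impact, time_required))
```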

With this tool you can more succinctly approach other teams when they ask you for other work.

You can use it to visualize where your team is at, and why you are making the decisions you are making.

We use this chart to provide our recommendations on how the team can move forward to improve their data strategy.

 

Once this step is done, this part of the strategy recommendations is, for the most part, over.

The next deliverable would be based on our client’s decision.

Do they have specific projects they would like to move forward with?

Based on the data projects they choose, we will then go into the next section.

Next Steps

This data strategy assessment is meant to walk you through the concepts and steps you will need to consider if you are trying to plan out your future data analytics strategy.

Not every section needs to be filled out. This assessment just helps summarize a lot of the key points that a data team needs to consider as it moves forward and develops a new data analytics strategy.

As far as the final product, we end up with a report of several pages and a one-pager that summarizes our recommendations.

This might be a combination of dashboards, metrics, processes, and goals we recommend our clients take on. Our team does provide custom development services as well. But this is usually step one.
