Process Improvement

Curbing data sprawl in large, complex projects

Written By: Matt Lieberson
January 6, 2023
7 min read

Way back in 2016, the average enterprise-sized company managed over 347 terabytes of data. It’s hard to picture how much information that is. For a little context, it would fill about 1.1 million volumes of the Encyclopedia Britannica.

Now imagine an operation with the scope of Boston’s Big Dig, a 25-year project to burrow 1.5 miles into the heart of the city. Hundreds of public and private organizations needed to record budgets and expenses, coordinate their workforces, draw and iterate on plans, and send countless communications. Every action created more data. Each data set would be fragmented, duplicative, and quickly outdated, making it nearly impossible for everyone involved to trust or track.

This out-of-control mess of digital information is called data sprawl. And it’s a particularly nasty problem for cross-organizational projects because not only is more data generated from large operations, it’s also heavily fragmented in completely disconnected systems.

But as new technology makes it easier to pitch more data onto the pile, it also offers hope of controlling it. Directors who use the right tools to address sprawl will create an ecosystem of organized, trustworthy data that’s accessible to every organization involved. The earlier they act, the better; otherwise, project leaders will have to dig themselves out of a terabyte-sized hole later on.

Fragmented data: The real cause of data sprawl

Fragmented data is information that sits in siloed apps, servers, or laptops, where it’s hard to access and collate. The lack of visibility leads to duplicated records and duplicated effort, sending sprawl to unmanageable levels.

In an operation like the Big Dig, for example, several construction companies will work simultaneously in different locations and on different systems. Every day, each crew gathers data about things like progress and costs. They’ll store all that information on whichever platform their company has decided is best. Some will have a project management app; many keep tabs with a spreadsheet.

Project directors need the data these teams accrue to update reports and allocate budgets. Other teams need it to coordinate work crews so the electricians don’t show up before the structural engineers. But there’s no one place to get it, so everyone pulls the same data over and over, or manually ports and manipulates facts and figures from spreadsheets to create a usable output.

Pretty quickly, the same data lives in hundreds of locations and in various forms. All of it is updated at a different pace, leaving the real truth buried under a mound of ROT (redundant, obsolete, or trivial data). Now, the project director not only has to hunt down cost data from a dozen places, but they have to sleuth out which information is correct.

What well-managed data looks like

The above scenario is common, but it’s not a foregone conclusion for large projects coordinated between many service providers. There is a world where trustworthy information flows freely between teams. And when we look at well-managed data in real-world examples, we see it shares a handful of common characteristics.

Source-agnostic data collection

It’s unrealistic to expect a hundred organizations to use the same financial, CMS, and project management software. So successfully stopping sprawl needs a centralized platform that can integrate with the systems everyone uses to gather data.

Kayak, the popular travel app, is a perfect case in point. The company is headquartered in Boston, but works with partners in Europe, China, and India on IT infrastructure projects.

It would be nearly impossible for the Boston team to update and exchange single-user spreadsheets from each partner involved. So they used a no-code platform to create a centralized location that imports data from their partners. Now Kayak’s partners update data in their native locations, and Kayak employees see it all in one place.

Intentional governance

Governance is the control of who can see, add, and use data. As more organizations join a project, there are more opportunities to create duplicate data and muddy the reporting waters with out-of-date information. Governance is how you keep that from happening—especially if you’ve enabled teams to build their own apps using no-code platforms.

Governance works best when it’s controllable at various levels. That means setting automated approvals not just by role, like admin vs. user, but by the data field or query. So as a project expands and hundreds of new people need access, no one is tasked with setting individual permissions.
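To make the idea of layered governance concrete, here is a minimal sketch of how role-level defaults can combine with field-level overrides. The class and field names are purely illustrative assumptions, not the API of any particular platform:

```python
# Hypothetical sketch: role-based permissions with field-level overrides.
from dataclasses import dataclass, field


@dataclass
class AccessPolicy:
    # Role-level defaults, e.g. admins can view and edit, users can only view.
    role_permissions: dict
    # Field-level overrides: field name -> set of roles allowed to touch it.
    restricted_fields: dict = field(default_factory=dict)

    def can(self, role: str, action: str, field_name: str = "") -> bool:
        # First check the role-level default.
        if action not in self.role_permissions.get(role, set()):
            return False
        # Then apply any tighter field-level rule.
        if field_name in self.restricted_fields:
            return role in self.restricted_fields[field_name]
        return True


policy = AccessPolicy(
    role_permissions={"admin": {"view", "edit"}, "user": {"view"}},
    restricted_fields={"unit_cost": {"admin"}},  # only admins see cost data
)

print(policy.can("user", "view", "status"))     # True
print(policy.can("user", "view", "unit_cost"))  # False
```

Because the rules live in one policy object rather than in per-person settings, adding hundreds of new users is just a matter of assigning them a role.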

The Micron Consumer Products Group had several data management jobs that no-code apps could solve quickly. But they also had to be aware of sensitive data that could be at risk when non-IT personnel created these solutions. So they built in governance guardrails that allowed for fast app development while keeping data safe.

Discoverable and accessible

Just as data collection needs to be source-agnostic, avoiding sprawl requires that data be easy to find and usable by everyone who needs it. That includes making it available on all the devices, both stationary and portable, typically used to access it. The alternative is repeated requests for the same data, which then gets stored in yet more locations.

A single organization can handle this by storing data on an accessible cloud server. For multi-organization projects, it’ll require a central platform that “speaks” with the various systems of each team.

Here’s how Boyett Construction—a specialty subcontractor with 100-plus jobs always on the go—does it. They custom-built their own no-code project management app, called BMS, which gathers information from vendors, financial partners, and field crews. The team then quickly designs custom reports and dashboards that collate and contextualize all that disparate data.


Compliance-ready

Compliance is an essential concern for almost any organization that deals with data. Whether it’s HIPAA, GDPR, or CCPA, there is often a need to prove that sensitive information is handled properly. Data sprawl is a natural enemy of good compliance and makes reporting to compliance agencies a nightmare.

Centralizing data is a big leap towards curbing sprawl and being compliant. So is proper data organization and tagging.

For example, the Atlantic Research Group—a contract research organization—left behind the challenges of using spreadsheets and Google Docs to meet HIPAA requirements. Instead, the company created its own Clinical Trials Management System that keeps all the files they need to protect in one place that’s easily searched when HIPAA regulators come calling.

Flexibly managed

As a project increases in complexity and scope, so do its data management needs. What worked when three organizations collaborated around project design won’t be adequate when 50 teams are testing soil, gaining legal approvals, and building physical structures.

Wrangling data sprawl while working with a growing and changing list of subcontractors takes a lot of work. Canadian Solar Solutions Inc. found it impossible to do with an inflexible SharePoint instance locked behind a firewall. So they built a completely customized project management app. The flexible design of their no-code app lets them start with a few basic data sets, like key project dates, and add functionality over time as new vendors and partners require specific data.

The impact of curbing data sprawl

Well-structured, easily shareable data has profound impacts on a project. Decisions are better informed as leaders have easy access to contextualized, trustworthy information. Coordination improves since everyone is working off the same, up-to-date facts. There’s a sharp increase in productivity because people aren’t manually requesting details repeatedly. And without the exponential duplication of data left in many locations, data security increases while storage costs decrease.

There is a flexible, no-code solution that lets the largest, most complex projects curb data sprawl through customizable applications. Whether you’re organizing the deployment of a new state-wide health initiative or just digging a really big hole, we can show you how.


Matt Lieberson is a Content Marketing Manager at Quickbase.
