Start Collecting More Data & Enriching It to Have Better Results with Predictive AI

Predictive AI relies on training algorithms with data, and often the data the algorithm uses is derived from your raw data as a set of "features". Let's explore why you want to collect and enrich your data to make this process more impactful.

For context, the process to create features from your existing data involves tasks like:

  • Changing the shape so a single column becomes many columns or many rows are aggregated into a single cell

  • Handling empty or errant values to fill in a default or calculated value based on your other data

  • Resolving outliers that are edge or exceptional cases that you would not want to model, as these fall outside of your normal business processes
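To make those three tasks concrete, here is a minimal sketch in plain Python. The transaction rows, the field names, the median fill, and the cap of 100 are all assumptions made up for illustration; capping is only one way to resolve an outlier, and your own business rules may call for something different.

```python
from statistics import median

# Hypothetical raw data: one row per transaction; None marks a missing amount.
transactions = [
    {"customer": "a", "amount": 20.0},
    {"customer": "a", "amount": 30.0},
    {"customer": "b", "amount": None},
    {"customer": "b", "amount": 40.0},
    {"customer": "c", "amount": 5000.0},  # an exceptional case
    {"customer": "c", "amount": 10.0},
]


def build_features(rows, cap):
    """Turn many transaction rows into one feature row per customer."""
    # Handle empty values: use the median of the known amounts as the fill value.
    known = [r["amount"] for r in rows if r["amount"] is not None]
    fill_value = median(known)

    features = {}
    for r in rows:
        amount = fill_value if r["amount"] is None else r["amount"]
        # Resolve outliers: cap amounts at an assumed business limit.
        amount = min(amount, cap)
        # Change the shape: aggregate many rows into one row per customer.
        row = features.setdefault(
            r["customer"], {"txn_count": 0, "total_amount": 0.0}
        )
        row["txn_count"] += 1
        row["total_amount"] += amount
    return features


features = build_features(transactions, cap=100.0)
```

Each customer ends up with a single row holding a transaction count and a total, which is closer to what a model would train on than the raw transaction log.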

Even with those activities, you still need to have data to work with. Without a variety of data to use, you end up limited in what you can do, and your predictive AI models may not perform as well as you would hope.

Some industries have a head start collecting more data because they need it from a regulatory or compliance perspective.

For example, many sectors within financial services are required to know a customer's name, address, birthdate, and many other details that can be used for feature engineering.

However, other industries operate with much less data, which makes it harder to start working with predictive AI.

For example, imagine you only have first and last names, an email address, and transaction history to rely upon; many other details are missing, like a customer's geographic location, demographic details, and behavioral or psychographic elements that are valuable to use when modeling.

You can start to fill these gaps by collecting more information using:

  • First party data, whereby you're collecting it directly from your customers, donors, etc. by asking for or requiring it to engage with your organization

  • Third party data, whereby you're enriching your existing data by augmenting or comparing what you have to research done by outside firms

Both approaches can be used together, although you must appreciate that:

  • First party data is more likely to be accurate and useful to you

  • Third party data can accelerate your time to market for projects around analytics and predictive AI
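One way to use both sources together is to keep the first party value whenever you have one and fall back to third party data only to fill gaps. The sketch below assumes records keyed by email address; the field names, the sample values, and the `enrich` helper are hypothetical, not taken from any particular vendor or tool.

```python
# Hypothetical first party records: collected directly, but with gaps (None).
first_party = {
    "ann@example.com": {"postal_code": "60601", "age_band": None},
    "bob@example.com": {"postal_code": None, "age_band": None},
}

# Hypothetical third party records: broader coverage, possibly less accurate.
third_party = {
    "ann@example.com": {"postal_code": "60602", "age_band": "35-44"},
    "bob@example.com": {"postal_code": "73301", "age_band": "25-34"},
}


def enrich(first, third):
    """Prefer first party values; use third party values only to fill gaps."""
    enriched = {}
    for key, record in first.items():
        extra = third.get(key, {})
        merged = {}
        for field, value in record.items():
            if value is not None:
                merged[field] = value
                merged[field + "_source"] = "first_party"
            elif extra.get(field) is not None:
                merged[field] = extra[field]
                merged[field + "_source"] = "third_party"
            else:
                merged[field] = None
                merged[field + "_source"] = None
        enriched[key] = merged
    return enriched


enriched = enrich(first_party, third_party)
```

Recording the source next to each value lets you later check whether features built on third party data behave differently from those built on what customers told you directly.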

TLDR: You likely need more data from your customers or donors before exploring predictive AI; third party is fastest, first party is best.
