AI Needs the ‘Applied Sciences’ Treatment

As industries rapidly advance in AI/machine learning, a key to unlocking the power of these approaches for companies is an enabling environment. Domain experts need to be able to use artificial intelligence on data relevant to their work, but they should not have to know computer or data science techniques to solve their problems. An environment which enables the domain expert to easily and intuitively label data and train models will allow AI to become truly ‘applied.’ The above image shows a series of fault planes predicted by our approach in the SubsurfaceAI Seismic application, created with ‘applied machine learning’ in mind.

The Rise of ‘Applied Machine Learning’ and Geoscience

A generally accepted definition of ‘applied sciences’ is the use of the scientific method and associated knowledge to solve practical problems. The next phase for artificial intelligence now gaining attention is ‘applied AI,’ with an analogous definition being ‘the use of AI methods and associated domain expert knowledge to solve practical problems,’ with those underlying methods transparent to the expert owning the challenge and using the tools. 

In the last few years, the field of AI/machine learning has experienced rapid advances in capabilities and applications. A search on the words ‘applied machine learning’ provides a host of engaging articles. These articles also indicate AI capabilities have been driven largely by data scientists and experts in coding/model building using general approaches. 

Many of these capabilities have yet to be easily accessible to domain experts in a way that enables them to rapidly adapt them to solve their specific problems – in other words, to be truly ‘applied.’ 

Machine learning models themselves are increasingly commoditized, freely available, and easy to build proof-of-concept work around. In business terms, this means that internally developed machine learning models will soon stop differentiating a company’s offering.

Machine learning models have become more like plug-and-play building blocks that can be fitted into a solution. This makes it very easy to rapidly test and prototype solutions, but it does not solve the critical problems associated with actually putting AI solutions into the hands of domain experts in a way that allows adaptation as working tools.

The problem is one of industries moving to ‘applied machine learning.’ A key question for this transition is how to set up a solution that gives a domain expert access to machine learning tools in an environment where they can easily and intuitively train models.

That is a much trickier proposition than prototyping a solution, and it’s why we’re seeing recent high valuations for companies such as Scale and Labelbox, which are focused on providing a way to operationalize AI for business. 

It’s All About the User and Labeled Data 

The machine learning models themselves are important, and it is necessary to test different networks, different models, and various ways of layering models to arrive at reasonable results. However, many of these models are essentially commodities. So, although you need to fiddle with them, a domain expert can take them off the shelf and connect different ones in various ways to test different solutions relatively quickly and easily. 

Increasingly, labeled data is the key. Things get more challenging when it comes to how the domain expert will interact with the machine learning models and with the data they are interested in manipulating or analyzing. A lot of effort from many companies has gone into labeling, analyzing, and interpreting everyday types of data. In the B2C world, this is dominated by pictures of people, roads, cars, or usage patterns of consumers of various services, such as social media platforms and video streaming services. 

These efforts have resulted in a large amount of labeled data on objects that commonly appear in our world. But the areas left behind in those efforts include many of the sciences, where datasets are typically much smaller, fewer people work with the data, and fewer still can label it correctly.

For example, let’s say an exploration team wants geologic core (rock) data labeled such that the stratigraphy is highlighted, as well as the general makeup of each of the stratigraphic layers (e.g., sand, shale and carbonates). They can’t just let someone with no geoscience background do the labeling. The result would be a bunch of meaningless training data. That’s the situation many science-based companies are in. They have good data, maybe not on a large scale, but enough to use AI and ML to good effect. However, they lack the labeling technology and labeled data to use it effectively.

So the user interface to the data and AI models is the really important thing to get right, and it has to be simple and intuitive. The domain expert must be able to interact with the data easily and rapidly build out training data on which to run AI models.

The Ultimate Prize for the Geoscientist 

The ultimate prize in the subsurface world of energy is for the geoscientist to be training the machine learning model while labeling the data. This is a revolutionized workflow – one that completely removes any role for an intermediary such as a data scientist and one that enables the domain expert to utilize a model that will interpret the way they do.
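To make ‘training while labeling’ a little more concrete, here is a minimal sketch of the loop in Python. It is a toy under stated assumptions, not a description of how SubsurfaceAI Seismic works: the nearest-centroid model, the two made-up numeric features per sample, and the names `label_while_training` and `simulated_expert` are all hypothetical stand-ins. In a real tool, the `ask_expert` call would be the geoscientist labeling a sample in the interface.

```python
import random


def centroid(points):
    """Mean of a list of equal-length feature vectors."""
    n = len(points)
    return tuple(sum(col) / n for col in zip(*points))


def sq_dist(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


class NearestCentroid:
    """Toy stand-in for a real model: one mean vector per label."""

    def __init__(self):
        self.examples = {}   # label -> feature vectors labeled so far
        self.centroids = {}  # label -> mean of those vectors

    def add_label(self, features, label):
        """Record one expert label and retrain immediately."""
        self.examples.setdefault(label, []).append(features)
        self.centroids[label] = centroid(self.examples[label])

    def predict(self, features):
        """Return the label whose centroid is closest."""
        return min(self.centroids,
                   key=lambda lab: sq_dist(features, self.centroids[lab]))

    def margin(self, features):
        """Gap between the two closest centroids; small means uncertain."""
        d = sorted(sq_dist(features, c) for c in self.centroids.values())
        return d[1] - d[0] if len(d) > 1 else 0.0


def label_while_training(samples, ask_expert, budget):
    """Ask the expert about the least certain sample, retrain, repeat."""
    model = NearestCentroid()
    unlabeled = list(samples)
    for _ in range(min(budget, len(unlabeled))):
        pick = min(unlabeled, key=model.margin)
        unlabeled.remove(pick)
        model.add_label(pick, ask_expert(pick))
    return model


def simulated_expert(features):
    """Hypothetical stand-in for the geoscientist's judgment."""
    return "sand" if features[0] < 75.0 else "shale"


# Synthetic, well-separated data (assumption: invented for illustration).
rng = random.Random(0)
samples = []
for _ in range(20):
    samples.append((rng.uniform(30, 50), rng.uniform(2.0, 2.7)))    # sand-like
    samples.append((rng.uniform(100, 120), rng.uniform(2.0, 2.7)))  # shale-like

model = label_while_training(samples, simulated_expert, budget=8)
agreement = sum(model.predict(s) == simulated_expert(s) for s in samples) / len(samples)
print(f"expert labels used: 8 of {len(samples)}, agreement: {agreement:.2f}")
```

The point of the sketch is the shape of the loop rather than the model: every label the expert gives retrains the model at once, and the model in turn chooses which sample to show next, so no intermediary is needed between the expert and the training step.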

In the energy industry subsurface world, one could envision analogs to ImageNet, for example a ‘Seismic ImageNet,’ a ‘WellLogNet,’ and a ‘CoreCTScanNet,’ as open source datasets. Enough open source data is rapidly becoming available to develop such high-quality models.

Automated, iterative image labeling integrated with models makes it possible, and the result is that companies with massive amounts of subsurface data exclusive to them will find their advantage in big data approaches eroding. 

This prize is available, albeit in an early stage, for seismic interpretation in our recently developed custom deep learning application, SubsurfaceAI Seismic. If you would like to see how it meets the ‘applied machine learning’ test, please get in touch.

About the Author

Mason Dykstra is the Enthought vice president of Energy Solutions. As an intuitive thought leader, he helps oil and gas companies connect the dots between science, engineering, technology, and business needs. Mason leads the Enthought team of energy experts and scientists in tackling big problems that contribute to the bottom line. Connect with Mason on LinkedIn to join his online conversations.
