As industries rapidly advance in AI/machine learning, a key to unlocking the power of these approaches for companies is an enabling environment. Domain experts need to be able to use artificial intelligence on data relevant to their work, but they should not have to know computer or data science techniques to solve their problems. An environment that enables the domain expert to easily and intuitively label data and train models will allow AI to become truly ‘applied.’ The above image shows a series of fault planes predicted by our approach in the SubsurfaceAI Seismic application, created with ‘applied machine learning’ in mind.
The Rise of ‘Applied Machine Learning’ and Geoscience
A generally accepted definition of ‘applied sciences’ is the use of the scientific method and associated knowledge to solve practical problems. The next phase for artificial intelligence now gaining attention is ‘applied AI,’ with an analogous definition being ‘the use of AI methods and associated domain expert knowledge to solve practical problems,’ with those underlying methods transparent to the expert owning the challenge and using the tools.
In the last few years, the field of AI/machine learning has experienced rapid advances in capabilities and applications. A search on the words ‘applied machine learning’ provides a host of engaging articles. These articles also indicate AI capabilities have been driven largely by data scientists and experts in coding/model building using general approaches.
Many of these capabilities have yet to be easily accessible to domain experts in a way that enables them to rapidly adapt them to solve their specific problems – in other words, to be truly ‘applied.’
Machine learning models themselves are increasingly commoditized, freely available, and easy to build proof-of-concept work around. In business terms, this means that internally developed machine learning models will soon stop differentiating a company’s offering.
Machine learning models have become more like plug-and-play building blocks that can be fit into a solution. This makes it very easy to rapidly test and prototype solutions, but it does not solve the critical problems associated with actually putting AI solutions into the hands of domain experts in a way that allows them to be adapted as working tools.
The problem is one of industries moving to ‘applied machine learning.’ A key question for this transition is how to set up a solution that gives domain experts access to machine learning tools in an environment where they can easily and intuitively train machine learning models.
That is a much trickier proposition than prototyping a solution, and it’s why we’re seeing recent high valuations for companies such as Scale and Labelbox, which are focused on providing a way to operationalize AI for business.
It’s All About the User and Labeled Data
The machine learning models themselves are important, and it is necessary to test different networks, different models, and various ways of layering models to arrive at reasonable results. However, many of these models are essentially commodities. So, although you need to fiddle with them, a domain expert can take them off the shelf and connect different ones in various ways to test different solutions relatively quickly and easily.
Increasingly, labeled data is the key. Things get more challenging when it comes to how the domain expert will interact with the machine learning models and with the data they are interested in manipulating or analyzing. A lot of effort from many companies has gone into labeling, analyzing, and interpreting everyday types of data. In the B2C world, this is dominated by pictures of people, roads, cars, or usage patterns of consumers of various services, such as social media platforms and video streaming services.
These efforts have resulted in a large amount of labeled data on objects that commonly appear in our world. But areas that have been left behind in those efforts include many of the sciences, where datasets are typically much smaller, fewer people work on them, and fewer still can label the data correctly.
For example, let’s say an exploration team wants geologic core (rock) data labeled such that the stratigraphy is highlighted, as well as the general makeup of each of the stratigraphic layers (e.g., sand, shale and carbonates). They can’t just let someone with no geoscience background do the labeling. The result would be a bunch of meaningless training data. That’s the situation many science-based companies are in. They have good data, maybe not on a large scale, but enough to use AI and ML to good effect. However, they lack the labeling technology and labeled data to use it effectively.
So, a really important thing to build simply and intuitively is the user interface to the data and AI models. The domain expert must be able to easily and intuitively interact with the data and rapidly build out training data on which to run AI models.
The Ultimate Prize for the Geoscientist
The ultimate prize in the subsurface world of energy is for the geoscientist to be training the machine learning model while labeling the data. This is a revolutionized workflow – one that completely removes any role for an intermediary such as a data scientist and one that enables the domain expert to utilize a model that will interpret the way they do.
In the energy industry subsurface world, one could envision analogs to ImageNet, for example a ‘Seismic ImageNet,’ a ‘WellLogNet,’ and a ‘CoreCTScanNet,’ as open source datasets. Enough open source data is rapidly becoming available to develop high-quality models from such datasets.
Automated, iterative image labeling integrated with models makes this workflow possible, and the result is that companies with massive amounts of subsurface data exclusive to them will find their advantage in big data approaches eroding.
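To make the idea of training a model while labeling more concrete, here is a minimal sketch of such a loop in Python. It is an illustration only, not how SubsurfaceAI Seismic works: the nearest-centroid classifier, the single numeric feature per sample, the 75-unit sand/shale cutoff, and every function name (`train_centroids`, `label_while_training`, `expert_label`, and so on) are assumptions made up for this example. The `expert_label` function stands in for a geoscientist answering one labeling question at a time.

```python
import random

def train_centroids(labeled):
    """Fit a nearest-centroid model: one mean feature value per class."""
    sums, counts = {}, {}
    for x, y in labeled:
        sums[y] = sums.get(y, 0.0) + x
        counts[y] = counts.get(y, 0) + 1
    return {y: sums[y] / counts[y] for y in sums}

def predict(model, x):
    """Return the class whose centroid is closest to x."""
    return min(model, key=lambda y: abs(x - model[y]))

def uncertainty(model, x):
    """Score a sample: a small gap between the two nearest centroids scores high."""
    distances = sorted(abs(x - c) for c in model.values())
    return -(distances[1] - distances[0]) if len(distances) > 1 else 0.0

def label_while_training(samples, expert_label, seed_labels, rounds):
    """Alternate between retraining and asking the expert for one more label."""
    labeled = list(seed_labels)
    seen = {x for x, _ in labeled}
    unlabeled = [x for x in samples if x not in seen]
    model = train_centroids(labeled)
    for _ in range(rounds):
        if not unlabeled:
            break
        # Ask the expert about the sample the current model is least sure of.
        query = max(unlabeled, key=lambda x: uncertainty(model, x))
        unlabeled.remove(query)
        labeled.append((query, expert_label(query)))
        model = train_centroids(labeled)  # retrain after every new label
    return model, labeled

# Hypothetical data: one synthetic log-style reading per sample.
rng = random.Random(0)
samples = [rng.uniform(0.0, 150.0) for _ in range(50)]

def expert_label(x):
    """Stand-in for the domain expert (made-up rule: high readings are shale)."""
    return "shale" if x >= 75.0 else "sand"

model, labeled = label_while_training(
    samples, expert_label, seed_labels=[(20.0, "sand"), (130.0, "shale")], rounds=10
)
print(model)
```

The design choice worth noting is that the model picks which sample the expert sees next, so each answer goes where the model is weakest. In a real application the classifier would be a deep learning model and the expert would answer through a labeling interface rather than a function call.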
This prize is available, albeit at an early stage, for seismic interpretation in our recently developed custom deep learning application, SubsurfaceAI Seismic. If you would like to see how it meets the ‘applied machine learning’ test, please get in touch.
About the Author
Mason Dykstra is the Enthought vice president of Energy Solutions. As an intuitive thought leader, he helps oil and gas companies connect the dots between science, engineering, technology, and business needs. Mason leads the Enthought team of energy experts and scientists in tackling big problems that contribute to the bottom line. Connect with Mason on LinkedIn at linkedin.com/in/mason-dykstra-a304b25/ to join his online conversations.