Published in the October 2016 edition of The Leading Edge magazine by the Society of Exploration Geophysicists. Read the full article here.
Author: Brendon Hall
Coordinator: Matt Hall, Agile Geoscience
Abstract
There has been much excitement recently about big data and the dire need for data scientists who possess the ability to extract meaning from it. Geoscientists, meanwhile, have been doing science with voluminous data for years, without needing to brag about how big it is. But now that large, complex data sets are widely available, there has been a proliferation of tools and techniques for analyzing them. Many free and open-source packages now exist that provide powerful additions to the geoscientist’s toolbox, much of which used to be only available in proprietary (and expensive) software platforms.
One of the best examples is scikit-learn, a collection of tools for machine learning in Python. What is machine learning? You can think of it as a set of data-analysis methods that includes classification, clustering, and regression. These algorithms can be used to discover features and trends within the data without being explicitly programmed, in essence learning from the data itself.
In this tutorial, we will demonstrate how to use a classification algorithm known as a support vector machine to identify lithofacies based on well-log measurements. A support vector machine (or SVM) is a type of supervised-learning algorithm, which needs to be supplied with training data to learn the relationships between the measurements (or features) and the classes to be assigned. In our case, the features will be well-log data from nine gas wells. These wells have already had lithofacies classes assigned based on core descriptions. Once we have trained a classifier, we will use it to assign facies to wells that have not been described.
Related Content
ChatGPT on Software Engineering
Recently, I’ve been working on a new course offering in Enthought Academy titled Software Engineering for Scientists and Engineers course. I’ve focused on distilling the…
What’s in a __name__?
if __name__ == “__main__”: When I was new to Python, I ran into a mysterious block of code that looked something like: def main(): …
3 Trends for Scientists To Watch in 2023
As a company that delivers Digital Transformation for Science, part of our job at Enthought is to understand the trends that will affect how our…
Retuning the Heavens: Machine Learning and Ancient Astronomy
What can we learn about machine learning from ancient astronomy? When thinking about Machine Learning it is easy to be model-centric and get caught up…
Extracting Target Labels from Deep Learning Classification Models
In the blog post Configuring a Neural Network Output Layer we highlighted how to correctly set up an output layer for deep learning models. Here,…
Exploring Python Objects
Introduction When we teach our foundational Python class, one of the things we do is make sure that our students know how to explore Python…
Choosing the Right Number of Clusters
Introduction When I first started my machine learning journey, K-means clustering was one of the first algorithms I was introduced to – and it is…
Prospecting for Data on the Web
Introduction At Enthought we teach a lot of scientists and engineers about using Python and the ecosystem of scientific Python packages for processing, analyzing, and…
Configuring a Neural Network Output Layer
Introduction If you have used TensorFlow before, you know how easy it is to create a simple neural network model using the Keras API. Just…
No Zero Padding with strftime()
One of the best features of Python is that it is platform independent. You can write code on Linux, Windows, and MacOS and it works…