Concurrent Materials Design, Accelerated by AI
This article references topics presented by Dr. Michael Heiber at Enthought’s 2025 R&D Innovation Summit in Tokyo. Link to video below. Over the last...
For many traditional innovation-driven organizations, scientific data is generated to answer specific, immediate research questions and then archived to protect IP, with little attention paid to the future value of reusing that data to answer similar or tangential questions. Data is treated as a by-product of R&D rather than a primary output. As a result, important experimental process details and implied contextual information are often not recorded.
The data that is collected is often not formatted in a consistent, well-structured manner, making it difficult and expensive to parse large volumes of historical data files archived on a network drive or in a data lake. The experimental workflows that produce this data are also typically manual and require coordination between multiple teams: manual sample preparation and handoff between labs, manual data transfer between computers, and manual raw data analysis on instrument computers. Together, these challenges make new data generation slow and expensive.
The result is that many R&D labs have surprisingly small datasets that are clean and complete enough to serve a higher purpose, such as training data for a machine learning model.
Faced with this “small data” situation, researchers and managers often feel that they are not yet in a position to benefit from data-driven approaches to new product development. They are not sure what can be done given the current state of their data, or how to efficiently gather more data to alleviate the issue. Even in organizations that have pushed forward a high-level vision with one-size-fits-all data platforms, new data science and engineering teams struggle to generate value because of the unique challenges inherent to scientific small data problems.
At Enthought, we have tackled many small data challenges in science-driven product development and have employed multiple strategies for getting the most value out of our customers’ small data to meet their strategic innovation goals. There is no universal solution, because each R&D organization has unique data and workflows, but we help customers make the most of what they have and set a course toward continuous improvement. Teams can get started with little to no data and use existing domain knowledge to go further with less data through well-crafted experimental designs, feature engineering, informed model constraints and priors, and improved data quality. We also assess existing data generation workflows and prioritize the improvements that will accelerate new data generation and raise data quality, using software tools to streamline data labeling tasks and to automate or assist users with raw data analysis.
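To make the idea of "informed priors" concrete, here is a minimal sketch using Bayesian linear regression with a known noise variance. It is an illustration under stated assumptions, not a description of any specific Enthought workflow: the dataset is synthetic, and the function name `posterior_weights`, the prior values, and the noise level are all hypothetical choices made for this example.

```python
import numpy as np


def posterior_weights(X, y, prior_mean, prior_cov, noise_var):
    """Gaussian posterior over linear-model weights (conjugate update, known noise variance)."""
    prior_precision = np.linalg.inv(prior_cov)
    post_cov = np.linalg.inv(prior_precision + X.T @ X / noise_var)
    post_mean = post_cov @ (prior_precision @ prior_mean + X.T @ y / noise_var)
    return post_mean, post_cov


# Hypothetical small dataset: five measurements of one property against
# one normalized process variable (synthetic, for illustration only).
rng = np.random.default_rng(0)
x = np.array([0.1, 0.3, 0.5, 0.7, 0.9])
X = np.column_stack([np.ones_like(x), x])  # columns: intercept, slope
noise_var = 0.25  # assumed measurement noise variance
y = X @ np.array([1.0, 2.0]) + rng.normal(0.0, np.sqrt(noise_var), size=x.size)

# Vague prior: almost no knowledge about the intercept or slope.
vague_mean, vague_cov = posterior_weights(
    X, y, np.zeros(2), 100.0 * np.eye(2), noise_var
)

# Informed prior: assumed domain knowledge that the intercept is near 0.8
# and the slope is near 1.8, each with a standard deviation of 0.5.
informed_mean, informed_cov = posterior_weights(
    X, y, np.array([0.8, 1.8]), 0.25 * np.eye(2), noise_var
)

for label, mean, cov in [
    ("vague", vague_mean, vague_cov),
    ("informed", informed_mean, informed_cov),
]:
    std = np.sqrt(np.diag(cov))
    print(f"{label:>8} prior: slope = {mean[1]:.2f} +/- {std[1]:.2f}")
```

With only five points, the informed prior yields a tighter posterior on the slope than the vague prior does, which is the sense in which domain knowledge can stand in for data. The benefit depends on the prior being reasonable; a confidently wrong prior will bias the estimate.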
For a deeper dive into how these strategies can be applied, particularly in the realm of materials science, check out our related webinar, Materials Informatics for Product Development: Deliver Big with Small Data. In this webinar, we share proven tips for getting the most value out of small data to meet your innovation goals, offering practical advice for R&D managers and researchers.
Don’t let small data stop you from getting started with data-driven methods. In fact, it is in organizations where small data is the norm that data-driven modeling and prediction can provide the most value and accelerate discovery and innovation.
Contact us today to discuss your team's small data challenges.
Michael Heiber holds a Ph.D. in polymer science from The University of Akron and a B.S. in materials science and engineering from the University of Illinois at Urbana-Champaign, and has expertise in polymers for optoelectronic applications. He leads Enthought's Materials Informatics solutions.
Prior to joining Enthought, Michael worked as a postdoctoral researcher at several institutions, where he developed improved physical models for organic electronic devices using custom open-source software tools for physics-based device simulations, automated experimental measurements, and advanced data analysis. At Enthought, he drives diverse client projects, from laboratory automation to data-driven recommendation systems to materials informatics (MI) training.