Predicting Breadmaking Performance from Wheat Data
Our Customer and the Challenge
Data quality is everything. In the food industry, the metrics are broken. And the funny thing is that everyone knows it. The challenge is what to do about it. This is true for wheat as much as it is for any other food.
A wheat breeding laboratory needed to predict bread quality from wheat, and they came to us to explore how machine learning could help. Wheat breeders use a metric called bread score to track overall breadmaking performance for wheat varieties. The bread score aggregates performance across important characteristics like bread volume, pore size, and crumb texture. Measuring the bread score is time-intensive: it requires growing wheat (up to four generations' worth), milling the wheat into flour, baking bread with the flour, and testing the bread. Because measuring the bread score takes so long, breeders rely on proxy measurements like wheat protein content. Here's the challenge: wheat protein content is an unreliable predictor of bread score, but supplementing it with other measurements increases cost and experimentation time.
The wheat lab came to us with the following questions:
- 1. What is it about a wheat variety that leads to a high bread score?
- 2. Given that protein by itself isn't enough, what else should a wheat lab measure?
- 3. How can we minimize measurement costs while maintaining predictive accuracy?
Answering these questions would help them with the following objectives:
- 1. Save time and money in developing high-performing wheat varieties for breadmaking
- 2. Identify grain properties besides protein that predict bread quality, to guide breeding and cultivation decisions
- 3. Optimize the measurements to balance cost constraints with predictive performance
Question 2 is a challenge for many labs: there is a range of instruments that can be used to measure breadmaking potential, and none of them are cheap. Two popular options are the Mixograph and the Farinograph. On the surface the two look very similar: both are mechanical tests of dough strength, and both are resource-intensive to own and operate. Many labs have a lurking suspicion that they might not need both.
The wheat lab shared a dataset with the following variables:
- Grain and flour properties: Kernel weight and dimensions, milling yield, flour protein, ash, and moisture content, plus gluten content and gluten index
- Dough strength measurements: Mixograph and Farinograph readings
- Bread quality features: 3 dough processing properties and 2 bread quality scores
Our Approach: Pareto Frontier Analysis
We developed machine learning models to predict the baking quality features from the grain, flour and dough strength features. We then used our Pareto Frontier Analysis tool—a decision-support system that automatically evaluates trade-offs between predictive performance and measurement cost. This tool enables organizations to make informed decisions about which measurements deliver the highest value for their specific objectives and within their budgets.
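To make the idea concrete, here is a minimal sketch of how a cost-versus-accuracy Pareto frontier can be computed. It is an illustration under stated assumptions, not the implementation of our tool: the `Panel` type and the `pareto_frontier` function are hypothetical names, and the only figures taken from this case study are the full panel ($390/sample, average R² = 0.597) and the flour-plus-dough panel ($248/sample, 96.9% of that R²). The other three panels are made-up placeholders included only so the filter has something to discard.

```python
from typing import List, NamedTuple


class Panel(NamedTuple):
    """A candidate set of measurements (hypothetical structure)."""
    name: str
    cost: float  # dollars per sample
    r2: float    # average R² across the predicted properties


def pareto_frontier(panels: List[Panel]) -> List[Panel]:
    """Keep panels that no cheaper-or-equal panel matches or beats on R²."""
    frontier = []
    best_r2 = float("-inf")
    # Walk from cheapest to most expensive; on cost ties, best R² first.
    for panel in sorted(panels, key=lambda p: (p.cost, -p.r2)):
        if panel.r2 > best_r2:  # strictly better than anything cheaper
            frontier.append(panel)
            best_r2 = panel.r2
    return frontier


panels = [
    Panel("all tests", 390.0, 0.597),              # from this case study
    Panel("flour + dough", 248.0, 0.597 * 0.969),  # from this case study
    Panel("protein only", 20.0, 0.10),             # placeholder, not real data
    Panel("flour only", 120.0, 0.45),              # placeholder, not real data
    Panel("grain + farinograph", 300.0, 0.50),     # placeholder, not real data
]

for panel in pareto_frontier(panels):
    print(f"{panel.name}: ${panel.cost:.0f}/sample, R² = {panel.r2:.3f}")
```

In this toy example the "grain + farinograph" placeholder drops out because the flour-plus-dough panel is both cheaper and more accurate. Every panel that survives is a defensible choice at some budget.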
Results and Insights
1. Mixograph Peak Integral is the most important predictor for most tasks
We found that excluding Farinograph data has little to no impact on predictions of Mixing Time, Dough Handling, and Bread Volume (3 of the 5 baking quality features we predicted). This is good news, because Farinograph measurements take 3 times as long, are very sensitive to noise, and require a much larger sample mass. But don't throw out your Farinograph just yet: excluding Farinograph data does reduce predictive performance for Water Absorption (R² from 0.62 to 0.51) and Bread Score (R² from 0.53 to 0.44).
Cost vs. Performance Insight: Excluding Farinograph measurements reduces testing costs by 30-40%, but labs must weigh the cost savings against the drop in predictive performance.
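One way to weigh those drops is to express each as a share of the full model's R². The short sketch below does that arithmetic using only the four R² values reported above; the `relative_drop` helper is a hypothetical name introduced for illustration.

```python
# R² with and without Farinograph features, as reported above.
r2_with_without = {
    "Water Absorption": (0.62, 0.51),
    "Bread Score": (0.53, 0.44),
}


def relative_drop(r2_full: float, r2_reduced: float) -> float:
    """Fraction of the full model's R² lost when a feature group is removed."""
    return (r2_full - r2_reduced) / r2_full


drops = {
    target: relative_drop(r2_full, r2_reduced)
    for target, (r2_full, r2_reduced) in r2_with_without.items()
}

for target, drop in drops.items():
    print(f"{target}: loses {drop:.1%} of its R² without the Farinograph")
```

Both properties lose a little over one sixth of their explained variance, which is the figure to set against the 30-40% cost saving.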
2. Protein Content alone is insufficient to predict bread performance
The wheat lab was already aware that protein content isn't a reliable predictor by itself. Wheat varieties with identical protein can vary in bread volume by as much as 34%. Our analysis confirmed this: protein content alone explains only 27% of the variation in bread volume. Protein content was even worse as a predictor of the other bread qualities, explaining 3% of the variance in overall bread score and at most 7% of the variance in Dough Mixing, Water Absorption, and Dough Handling.
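For readers who want to see what "explains X% of the variation" means in practice, here is a minimal sketch of R² for a single-feature linear fit. It rests on several assumptions: the function name is hypothetical, the protein and volume numbers are invented toy values (not the lab's data), and the score is computed in-sample, which may differ from how the figures in this article were evaluated.

```python
from typing import Sequence


def r_squared_single_feature(x: Sequence[float], y: Sequence[float]) -> float:
    """R² of an ordinary least-squares line fitted to y from one feature x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Residual and total sums of squares.
    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot


# Invented toy data: pairs of samples with identical protein (%) but very
# different loaf volumes (mL), mimicking the situation described above.
protein = [11.0, 11.0, 12.0, 12.0, 13.0, 13.0]
volume = [700.0, 900.0, 720.0, 940.0, 760.0, 980.0]

r2 = r_squared_single_feature(protein, volume)
print(f"Protein alone explains {r2:.0%} of the variance in this toy data")
```

When samples with the same protein value differ widely in volume, no line through protein can capture that spread, so R² stays low however well the slope is estimated.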
3. Protein is predictive in combination with other grain and flour features
Protein isn't great on its own, but it's an excellent team player – we found optimal predictive performance using protein in combination with gluten index, ash content, and milling yield.
4. Balancing the Measurements of Flour and Dough
Pareto-optimal predictive models assign most of the feature importance (60%) to grain and flour features, but dough features still account for a substantial share (40%).
5. 36% reduction in measurement costs maintains 97% of predictive performance
Our Pareto Frontier Analysis reveals clear cost-performance tradeoffs. Comprehensive testing costs $390/sample and establishes our baseline performance (average R² = 0.597 across all breadmaking properties), but strategic test exclusions can reduce costs significantly with minimal accuracy loss.
The optimal budget-conscious approach costs $248/sample and focuses exclusively on flour quality and dough rheology measurements while skipping all grain-level tests (grain moisture, protein, ash, test weight, kernel dimensions, and milling yield). This configuration achieves 96.9% of maximum predictive accuracy while saving $142/sample—a 36% cost reduction.
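The headline numbers follow directly from the figures quoted above, and the arithmetic can be checked in a few lines. The variable names below are ours; every input value comes from this section.

```python
full_cost = 390.0          # $/sample, comprehensive testing
budget_cost = 248.0        # $/sample, flour + dough rheology only
full_r2 = 0.597            # average R² with all measurements
retained_fraction = 0.969  # share of predictive accuracy kept

savings = full_cost - budget_cost   # dollars saved per sample
savings_pct = savings / full_cost   # fraction of the full cost saved
budget_r2 = full_r2 * retained_fraction  # implied average R² of budget panel

print(f"Savings: ${savings:.0f}/sample ({savings_pct:.0%} of full cost)")
print(f"Implied average R² for the budget panel: {budget_r2:.3f}")
```

The implied average R² of the budget panel is a derived figure (0.597 multiplied by 0.969), not a separately reported result.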
What's Next
At the moment, about 40% of the breadmaking puzzle is still unsolved, at least for the data we received: our best models explain roughly 60% of the variance on average. We're looking forward to continued progress on this problem. Our results hint at complex interactions between protein and other parts of the grain matrix. We think that future studies should explore measurements of protein quality rather than just total protein content. The importance of gluten index in our models suggests that measuring the gliadin and glutenin fractions may provide additional predictive power. We expect that starch composition would also improve prediction, due to its significant role in water binding in dough.
Contact Us
We combine deep expertise in ingredients with advanced machine learning to tackle the complex challenges food scientists face every day. We know that ingredient behavior is often unpredictable—that's what makes this work both frustrating and fascinating.
- Do you have questions about your single-proxy predictions, such as whether kernel hardness is the main predictor of milling efficiency?
- Do you have complex questions about the rules of blending flour for performance?
We offer a systematic approach to uncover patterns in your data that can complement your expertise and guide strategic decisions.
