
Data-driven design: predicting functional attributes in early-stage automotive engineering

Published online by Cambridge University Press:  27 August 2025

Maximilian Rahn*
Affiliation:
Technische Universität Berlin, Germany
Dietmar Göhlich
Affiliation:
Technische Universität Berlin, Germany
Tu-Anh Fay
Affiliation:
Technische Universität Berlin, Germany
Kien van Ho
Affiliation:
Technische Universität Berlin, Germany
*Corresponding author: Maximilian Rahn, Technische Universität Berlin, Germany. Email: max_rahn@outlook.de

Abstract

This paper investigates the effectiveness of machine learning models in predicting customer-relevant functional attributes of vehicles based on selected design variables, using a limited automobile market dataset. By comparing machine learning algorithms such as Support Vector Regression, k-Nearest Neighbour Regression, and Lasso Regression, the study evaluates the models’ predictive accuracy and their potential application in automotive design. The findings highlight both the opportunities and limitations of these methods, emphasising their capacity to support data-driven decision-making despite constraints posed by dataset size, as encountered in real-world, early-stage automotive platform strategies.

Information

Type
Article
Creative Commons
This is an Open Access article, distributed under the terms of the Creative Commons Attribution-NonCommercial-NoDerivatives licence (http://creativecommons.org/licenses/by-nc-nd/4.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is unaltered and is properly cited. The written permission of Cambridge University Press must be obtained for commercial re-use or in order to create a derivative work.
Copyright
© The Author(s) 2025

1 Introduction

Globalization, evolving consumer behaviour, technological advancements, and the growing demand for individualization are megatrends that necessitate a broader diversification of product portfolios for companies to maintain long-term competitiveness. This diversification is reflected in the extensive range of products available to customers (Bender & Gericke, 2021).

In 2021, the electric Mercedes-Benz S-Class, for example, offered tens of thousands of possible options, with up to 350 sensors dedicated solely to its safety systems (Lang & Baumann, 2021). This example aligns with previous findings that very high complexity is one of the biggest obstacles for many industries (ElMaraghy et al., 2012).

Modular product strategies, such as product platforming, can help balance the trade-off between standardization and differentiation (Simpson et al., 2006). These strategies enable the efficient delivery of a wide variety of products to customers (Seepersad et al., 2002, p. 1) while significantly reducing costs (Meyer & Utterback, 1993). According to Weck (2006), the multi-platform extent problem, which addresses several platforms spanning multiple product segments, considers a large but finite set of factors and can be solved quantitatively. The first step in the proposed framework is the calculation of the Net Present Value for single products. For this process, it is essential to predict customer-relevant functional attributes. These attributes, such as fuel efficiency or acceleration, depend on design variables of the product, including vehicle height, wheelbase, and engine displacement, all of which are determined through the engineering process (Weck, 2006). Beyond this model, the prediction of customer-relevant attributes plays a crucial role in defining and optimizing modular product architectures to achieve a balance between standardization and differentiation (Robertson & Ulrich, 1998; Bender & Gericke, 2021; Krause & Gebhardt, 2023). Our paper contributes by investigating new approaches, specifically those based on machine learning (ML), to predict customer-relevant attributes from given design variables. The investigated ML approaches are quick to set up and offer a deliberate trade-off between efficiency and accuracy. This is especially important in the early project phase, where major but largely directional decisions have to be made (Koch et al., 1997).

Based on the challenges discussed earlier, this study investigates the effectiveness of machine learning (ML) methods in predicting functional vehicle attributes as a foundation for further estimates such as the Net Present Value. This leads to the central research question:

How effective is machine learning in predicting customer-relevant functional attributes based on selected design variables, when trained with a limited data set?

If ML models can accurately predict these relationships, this would mark a step toward data-driven support in the design process and, more broadly, in platform optimization. To ensure comparability with existing literature and reflect the practical constraints faced by automotive manufacturers, particularly during the early stages of new model development, where comprehensive data is often unavailable, this study employs a limited data set.

Section 2 defines the terminology used in this paper, provides an overview of the underlying single-product model, and identifies the specific design stage where ML models offer significant leverage. Additionally, the ML approaches utilized are briefly introduced. Section 3 outlines the methodology, including details about the limited, real-world data set employed. Section 4 presents the key findings, highlighting the performance of ML models in predicting functional attributes. Section 5 discusses the broader implications of these findings for the prediction of functional and customer-related attributes. Finally, the paper concludes with Section 6, summarizing the outcomes and offering an outlook on future research opportunities.

2 State of the art

2.1 Terminology: design variables, functional attributes

Design Variables

Design variables describe the physical components of a product, i.e., its parts and the modules composed of those parts. In the development process of a product, engineers set the "values" of the design variables to comply with the targeted functional attributes (Weck, 2006; Pirmoradi et al., 2014). Examples in the automobile sector are wheelbase, height, and tire diameter in general, or, depending on the drive system type, engine displacement or battery capacity for a combustion engine and a battery electric car, respectively. A real-world automobile has a significant number (>>10,000) of design variables that can be influenced within the engineering process (Weck, 2006).

Functional Attributes

Functional attributes are relevant to the customer because they indicate the performance of the product. They depend on the design variables, although this relationship can only be approximated, for example by modelling the physical correlation (Weck, 2006).

Examples for automobiles are acceleration and fuel or energy efficiency.

Mapping design variables to customer-relevant functional attributes in order to model their relationship is a standard procedure that can be carried out with a variety of methods, including Object-Process Methodology (Dori, 2002; Bender & Gericke, 2021).

2.2 Single product model with engineering performance domain

The single product model from Weck (2006) is embedded in his multi-platform optimization approach and aims to estimate the net present value (NPV) of a single product. Estimating this value is important because a platform strategy may itself target NPV maximization, which is consistent with a for-profit company’s main aim. To estimate the NPV, a simplified model of the entire value creation process for a single product, such as an automobile, is designed. The data used consist of four clusters: design variables, functional attributes, soft attributes, and market data (Figure 1). The framework itself comprises six domain blocks that map the entire internal value creation flow at a high level: Product Architecture, Engineering Performance, Product Value, Market Demand, Manufacturing Cost, and Investment Finance. The first domain maps the design variables and soft attributes, as influenced by engineering, to the customer-relevant functional attributes. Soft attributes for automobiles could include styling, comfort, or quality. Next, the Engineering Performance domain predicts the functional attributes from the design variables based on a data set of peer-grouped products; the data set of mid-size cars used in our paper is discussed later. In the third domain, a model-predicted product value is calculated based on the product’s functional and soft attributes. Fourth, in Market Demand, the product’s “real” market value, as revealed by the market data, is calculated. Synchronizing the model-predicted product value and derived demand with the revealed market value and demand provides insights into the influence of each functional and soft attribute as well as of the design variables. The demand serves as the basis for approximating the variable manufacturing costs in the fifth domain. Finally, in Investment Finance, the net present value is calculated via the discounted cash flow method, taking into account fixed and variable costs, price, demand, and investments (Weck, 2006; Cook, 1997).

Figure 1. Use of ML in the Engineering Performance domain (based on the single product model, Weck (2006))
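To make the Investment Finance step concrete, the following minimal Python sketch shows a discounted cash flow NPV calculation of the kind described above; the cash flows, discount rate, and investment figures are purely illustrative placeholders, not values from the model.

```python
# Minimal sketch of a discounted cash flow NPV calculation as used in the
# Investment Finance domain; all figures below are illustrative assumptions.

def net_present_value(cash_flows, discount_rate, initial_investment):
    """Discount yearly cash flows (revenue minus fixed and variable costs)
    and subtract the upfront investment."""
    discounted = sum(
        cf / (1.0 + discount_rate) ** year
        for year, cf in enumerate(cash_flows, start=1)
    )
    return discounted - initial_investment

# Hypothetical single-product example: demand * (price - variable cost) - fixed cost
yearly_cash_flows = [120e6, 150e6, 160e6, 140e6, 110e6]  # EUR per year, assumed
npv = net_present_value(yearly_cash_flows, discount_rate=0.08,
                        initial_investment=400e6)
print(f"NPV: {npv / 1e6:.1f} million EUR")
```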

In the Engineering Performance domain, empirical approximation models such as linear regression or response surface models can be used (Srivastava et al., 2004). We see high leverage in using machine learning models in this domain. As it involves regression problems, i.e., predicting a numerical target variable from features, this is a typical application for supervised machine learning models (Géron, 2018).
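As an illustration of such an empirical approximation model, the following sketch fits a simple second-order response surface that maps two design variables to one functional attribute; the data, variable names, and coefficients are synthetic assumptions, not values from the study.

```python
# Minimal sketch of an empirical approximation (second-order response surface)
# mapping design variables to one functional attribute; data is synthetic.
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)
# Hypothetical design variables: wheelbase [mm], engine displacement [l]
X = np.column_stack([rng.uniform(2600, 2900, 40), rng.uniform(1.6, 3.5, 40)])
# Hypothetical functional attribute: 0-100 km/h acceleration time [s]
y = 14.0 - 2.0 * X[:, 1] + 0.001 * (X[:, 0] - 2750) + rng.normal(0, 0.3, 40)

response_surface = make_pipeline(PolynomialFeatures(degree=2), LinearRegression())
response_surface.fit(X, y)
print(response_surface.predict([[2750, 2.4]]))  # predicted acceleration [s]
```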

3 Methodical approach to predicting customer-relevant functional attributes with machine learning

3.1 Model explanation

In this study, a dataset from the literature was utilized, comprising 31 vehicles from the Medium Sedan/Coupe segment sold in the USA in 2002 (Weck, 2006). The dataset was first split into training and test sets using k-fold cross-validation, a technique that involves dividing the dataset into multiple subsets (folds), training on some folds while testing on others, and rotating through all folds (Raschka, 2018; Raschka & Mirjalili, 2021).
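A minimal sketch of this splitting scheme, using scikit-learn's KFold on a synthetic stand-in for the 31-vehicle data set; the column names and values are assumed placeholders.

```python
# Minimal sketch of k-fold cross-validation; the synthetic DataFrame below
# stands in for the 31-vehicle data set from the literature.
import numpy as np
import pandas as pd
from sklearn.model_selection import KFold

rng = np.random.default_rng(42)
data = pd.DataFrame(rng.normal(size=(31, 4)),
                    columns=["wheelbase", "height", "displacement", "acceleration"])

kfold = KFold(n_splits=5, shuffle=True, random_state=42)
for fold, (train_idx, test_idx) in enumerate(kfold.split(data)):
    print(f"Fold {fold}: {len(train_idx)} train / {len(test_idx)} test vehicles")
```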

One observation was removed beforehand as a separate evaluation set; the remaining training data was divided into five folds for this purpose. Data preprocessing included mean imputation, a method for handling missing data by replacing missing values with the mean value of the corresponding feature, and z-score standardization, a normalization technique that transforms data into a distribution with a mean of 0 and a standard deviation of 1 (Alexandropoulos et al., 2019). Feature selection was performed using Sequential Floating Forward Selection, a method that iteratively adds and removes features to find the optimal subset for the model (Ververidis & Kotropoulos, 2008).
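The following sketch strings these preprocessing and feature-selection steps together; scikit-learn is used for imputation and standardization, and mlxtend's sequential feature selector (with floating enabled) for SFFS. The library choices, the wrapper estimator, and the synthetic data are assumptions, as the paper does not specify an implementation.

```python
# Sketch of mean imputation, z-score standardisation, and Sequential Floating
# Forward Selection; data and estimator choice are illustrative assumptions.
import numpy as np
from sklearn.impute import SimpleImputer
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import Pipeline
from sklearn.svm import SVR
from mlxtend.feature_selection import SequentialFeatureSelector as SFFS

rng = np.random.default_rng(0)
X = rng.normal(size=(30, 9))                                   # 9 design variables, synthetic
y = 2.0 * X[:, 0] - X[:, 3] + rng.normal(scale=0.1, size=30)   # one functional attribute, synthetic
X[rng.random(X.shape) < 0.05] = np.nan                         # inject a few missing values

preprocess = Pipeline([
    ("impute", SimpleImputer(strategy="mean")),   # mean imputation
    ("scale", StandardScaler()),                  # z-score standardisation
])
X_prep = preprocess.fit_transform(X)

sffs = SFFS(SVR(), k_features="best", forward=True, floating=True,
            scoring="r2", cv=5)                   # floating=True gives SFFS
sffs = sffs.fit(X_prep, y)
print("Selected feature indices:", sffs.k_feature_idx_)
```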

The processed training data was then used to train and optimize various models. Model performance was assessed using the RMSE, MAE, and R2 metrics during cross-validation (Géron, 2018). The best configurations from cross-validation were then applied to the evaluation set to predict specific data points.
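A minimal sketch of how these cross-validated metrics could be computed with scikit-learn's cross_validate; the model configuration and the synthetic data are placeholders.

```python
# Sketch of cross-validated RMSE, MAE, and R2; model and data are placeholders.
import numpy as np
from sklearn.model_selection import cross_validate
from sklearn.svm import SVR

rng = np.random.default_rng(1)
X = rng.normal(size=(30, 5))
y = X @ rng.normal(size=5) + rng.normal(scale=0.2, size=30)

scores = cross_validate(
    SVR(C=1.0, epsilon=0.1),
    X, y, cv=5,
    scoring=("neg_root_mean_squared_error", "neg_mean_absolute_error", "r2"),
)
print("RMSE:", -scores["test_neg_root_mean_squared_error"].mean())
print("MAE :", -scores["test_neg_mean_absolute_error"].mean())
print("R2  :", scores["test_r2"].mean())
```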

3.2 Model specifications

In this section, we describe the methods and specific hyperparameters used in the development and training of the three machine learning models: Support Vector Regression (SVR), k-Nearest Neighbour Regression (kNNR), and Least Absolute Shrinkage and Selection Operator (Lasso) Regression. These models were selected for their ability to handle regression tasks, and their hyperparameters were carefully tuned to optimize their performance.

SVR, which finds a hyperplane in a high-dimensional space that best fits the data (Géron, 2018), was employed to capture the complex relationships between the input features and the target variable. The key hyperparameters for the SVR model were tuned as follows: the regularization parameter C was set between 0.807 (Fuel Efficiency) and 55.9 (Passenger Volume). This parameter controls the trade-off between achieving a low error on the training data and minimizing the model complexity, thus preventing overfitting. A higher C value puts more emphasis on minimizing errors on the training data, potentially leading to overfitting. The epsilon value was set within a range of 0.02 (Fuel Efficiency) to 0.95 (Cargo Volume). This parameter defines a margin of tolerance within which no penalty is given to errors, allowing the model to ignore errors that fall within this margin and thus reducing sensitivity to noise in the data. A higher epsilon allows for more significant errors to be ignored, which can make the model less sensitive to outliers but may also reduce accuracy. The gamma value varied between 0.03 (Acceleration) and 0.73 (Cargo Volume), and the kernels used were linear (Cargo Volume), sigmoid (Fuel Efficiency, Acceleration), and radial basis function (Passenger Volume, Towing Capacity).
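A sketch of how these SVR configurations could be instantiated in scikit-learn; only the hyperparameter values explicitly reported above are filled in, and everything else is left at library defaults as an assumption.

```python
# Sketch of per-attribute SVR configurations; unreported values stay at defaults.
from sklearn.svm import SVR

svr_per_attribute = {
    "passenger_volume": SVR(kernel="rbf", C=55.9),
    "fuel_efficiency":  SVR(kernel="sigmoid", C=0.807, epsilon=0.02),
    "acceleration":     SVR(kernel="sigmoid", gamma=0.03),
    "cargo_volume":     SVR(kernel="linear", epsilon=0.95, gamma=0.73),  # gamma reported, though unused by the linear kernel
    "towing_capacity":  SVR(kernel="rbf"),
}
# Each model would then be fitted per attribute on the preprocessed folds, e.g.:
# svr_per_attribute["acceleration"].fit(X_train, y_train_acceleration)
```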

The kNNR model, a non-parametric method predicting the value of a new data point based on the average value of its k-nearest neighbours (Raschka & Mirjalili, 2021), was utilized for its simplicity and effectiveness in capturing local patterns in the data. The number of neighbours k varied between 2 (Towing Capacity) and 10 (Fuel Efficiency). This parameter determines how many of the closest data points (neighbours) are considered when making a prediction for a new point. A higher value of k tends to smooth out predictions but may also introduce bias. The neighbour weights were calculated using uniform weighting (Fuel Efficiency, Acceleration), where all neighbours contribute equally to the prediction, and distance weighting (Passenger Volume, Cargo Volume, Towing Capacity), where closer neighbours have a greater influence on the prediction.
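A corresponding sketch for the kNNR configurations; again, only the values reported above are set, and the remaining neighbour counts are left at scikit-learn defaults as placeholders.

```python
# Sketch of per-attribute kNNR configurations; unreported k values stay at defaults.
from sklearn.neighbors import KNeighborsRegressor

knn_per_attribute = {
    "fuel_efficiency":  KNeighborsRegressor(n_neighbors=10, weights="uniform"),
    "acceleration":     KNeighborsRegressor(weights="uniform"),   # k not reported
    "towing_capacity":  KNeighborsRegressor(n_neighbors=2, weights="distance"),
    "passenger_volume": KNeighborsRegressor(weights="distance"),  # k not reported
    "cargo_volume":     KNeighborsRegressor(weights="distance"),  # k not reported
}
```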

LassoR, which includes a regularization term that penalizes large coefficients and effectively performs feature selection by shrinking some of them to zero (Géron, 2018), was employed to enforce sparsity in the model and thereby potentially reduce the number of active features. The regularization parameter alpha was set between 0.06 (Cargo Volume) and 0.98 (Towing Capacity). This parameter controls the strength of the regularization applied to the model. A higher alpha value increases the penalty on large coefficients, encouraging the model to prefer simpler solutions with fewer active features.
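A sketch of the LassoR configurations; only the two alpha endpoints reported above are filled in, since the values for the remaining attributes are not stated.

```python
# Sketch of per-attribute Lasso configurations; only reported alpha endpoints are set.
from sklearn.linear_model import Lasso

lasso_per_attribute = {
    "cargo_volume":    Lasso(alpha=0.06),
    "towing_capacity": Lasso(alpha=0.98),
    # alpha values for the remaining attributes lie between these two endpoints
}
```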

The values of the presented hyperparameters were chosen based on cross-validation performance and a grid-search optimization, ensuring that the models effectively captured the underlying patterns without overfitting.
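A minimal sketch of such a grid search over cross-validation folds, shown here for SVR; the parameter grid, scoring choice, and synthetic data are illustrative assumptions rather than the exact setup used in the study.

```python
# Sketch of grid-search hyperparameter optimisation over 5-fold cross-validation;
# the grid and data are illustrative assumptions.
import numpy as np
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVR

rng = np.random.default_rng(2)
X = rng.normal(size=(30, 5))
y = X @ rng.normal(size=5) + rng.normal(scale=0.2, size=30)

param_grid = {
    "kernel": ["linear", "rbf", "sigmoid"],
    "C": np.logspace(-1, 2, 6),
    "epsilon": np.linspace(0.02, 0.95, 5),
    "gamma": np.linspace(0.03, 0.73, 5),
}
search = GridSearchCV(SVR(), param_grid, cv=5,
                      scoring="neg_root_mean_squared_error")
search.fit(X, y)
print("Best parameters:", search.best_params_)
```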

3.3 Model performance

In this section, we present the performance of the three machine learning models, Support Vector Regression (SVR), k-Nearest Neighbour Regression (kNNR), and Lasso Regression (LassoR), across the different functional attributes. Table 1 summarizes the key performance metrics, R2, Mean Absolute Error (MAE), and Root Mean Squared Error (RMSE), for each model and each predicted attribute: Passenger Volume, Cargo Volume, Towing Capacity, Fuel Efficiency, and Acceleration.

Table 1. Model performance for predicting functional attributes

* Highest R2 (see Table 1)

For passenger volume (PV), the SVR model performed the best with an R2 of 0.51, indicating that it explained 51% of the variance in the passenger volume data. It also had the lowest MAE and RMSE values, 2.22 and 2.76, respectively, making it the most accurate model for this attribute. kNNR and LassoR performed less well, with lower R2 values (0.32 and 0.22) and higher errors.

Concerning cargo volume (CV), both the kNNR and LassoR models achieved the highest R2 value of 0.62, demonstrating strong predictive power. However, LassoR had the lowest MAE (0.61) and RMSE (0.76), suggesting that it provided the most accurate predictions. The SVR model also performed well, with an R2 of 0.54 and reasonable error metrics.

For towing capacity, the kNNR model was the only one that produced a positive R2 value (0.21), indicating a slight ability to predict this attribute. SVR and LassoR had negative R2 values (-0.15 and -1.12), indicating that these models performed poorly for this attribute, with high MAE and RMSE values.

Concerning fuel efficiency (FE), the SVR model again demonstrated strong performance with an R2 of 0.39, outperforming kNNR and LassoR, which had R2 values of 0.25 and 0.17, respectively. The SVR model also had the lowest error metrics, with an MAE of 1.28 and an RMSE of 1.49.

For acceleration (AC), the SVR model showed the best performance with an R2 of 0.61, the highest among the three models. It also had the lowest MAE (0.56) and RMSE (0.74), indicating that it was the most accurate model for this attribute. kNNR and LassoR showed slightly lower R2 values (0.51 and 0.50) and higher errors.

4 Case study: prediction of functional attributes of combustion engine cars

In automotive engineering, predicting customer-relevant functional attributes such as fuel efficiency or acceleration is a critical challenge. A predictive model based on publicly available, rudimentary data, such as the one developed in the previous section, would make the early-stage design process more efficient by providing indications that support decision-making, thereby reducing the time to market. To illustrate this, we discuss one observation as an evaluation point. In industry practice, this vehicle would be in an early development stage, and its functional attributes would be needed for a competitor analysis.

4.1 Results

For the evaluation of the different models, namely support vector regression (SVR), k-nearest neighbour regression (kNNR), and least absolute shrinkage and selection operator regression (LassoR), with respect to the five dependent variables passenger volume (PV, in liters), cargo volume (CV, in liters), towing capacity (TC, in kg), fuel efficiency (FE, in l/100 km), and acceleration (AC, in s), a single observation was used due to the limited data set. This observation is the median of the dataset when ranked by sales volume (Mercury Sable). The data point was excluded from the training and testing of all models to comply with ML modelling principles. Table 2 summarizes the prediction results and the deviations from the actual values.
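A minimal sketch of this evaluation step, computing the signed percentage deviation of a prediction from the actual value; the fitted models and preprocessing objects from the previous section are assumed, and the numbers in the example call are purely illustrative.

```python
# Sketch of the held-out evaluation: predict the excluded observation and
# report the percentage deviation from its actual value.

def percent_deviation(predicted, actual):
    """Signed deviation of the prediction from the actual value, in percent."""
    return 100.0 * (predicted - actual) / actual

# Hypothetical usage with a fitted model and the held-out evaluation point:
# x_eval = preprocess.transform(eval_point[design_variable_columns])
# y_pred = svr_per_attribute["acceleration"].predict(x_eval)[0]
# print(f"Deviation: {percent_deviation(y_pred, y_actual):+.2f} %")

print(f"Illustrative values: {percent_deviation(9.8, 10.0):+.2f} % deviation")
```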

The prediction for Passenger Volume (PV) using kNNR is 2883.61 liters, deviating by only -0.65% from the actual value of 2902.47 liters, which is slightly more accurate than the SVR prediction with its +1.25% deviation. Despite this, the SVR model has the best R2 performance (0.51; see Table 1).

The kNNR model provides the best approximation for cargo volume (CV), predicting 474.75 liters with a deviation of +4.81% from the actual value of 453.07 liters. Meanwhile, the LassoR model, which has the same R2 value (0.62) as the kNNR, predicts 487.49 liters, deviating by +7.56%, and the SVR model overestimates this attribute slightly more, with a deviation of +7.81%.

For towing capacity (TC), the kNNR model again delivers the best approximation, predicting 562.52 kg with a deviation of -0.79% from the real value of 566.99 kg. The SVR model predicts 537.68 kg, deviating by +5.18%, while the LassoR model predicts 584.09 kg, deviating by +3.02%.

The SVR model, despite having the best R2 value (0.39), predicts fuel efficiency (FE) at 9.51 liters per 100 km, deviating by +7.48% from the real value of 10.23 liters per 100 km. The kNNR and LassoR models offer closer approximations, predicting 10.32 and 9.96 liters per 100 km, respectively.

The SVR model, while having the highest R2 value (0.61), predicts acceleration with a value of 8.54 seconds, deviating by +14.78% from the real value of 10.02 seconds. The kNNR model provides a better approximation with a deviation of -12.68%, while the LassoR model, which also has a strong R2 value, predicts 8.90 seconds with a deviation of -11.17%.

Table 2. Real-world predictions (evaluation observation: Mercury Sable, the median of the dataset ranked by sales volume)

* Highest R2 (see Table 1)

** Best approximation. The median data point of the dataset, ranked by sales volume (Mercury Sable), was used as the evaluation data point.

4.2 Comparative performance

The results indicate that while the SVR model consistently achieved the highest R2 values (see Table 1), the kNNR model often provided the best approximations to the actual values, particularly for cargo volume, towing capacity, and passenger volume (see Table 2). The LassoR model showed strong performance in predicting acceleration but generally had slight underestimations in other attributes.

Our findings suggest that while models with the highest R2 values, such as SVR, offer strong predictive capabilities, models like kNNR, which provided the best approximations to actual values, are crucial for practical applications. These models demonstrate strong potential for application in the early stages of product design to support decision-making processes, balancing accuracy with approximation effectiveness.

5 Discussion

This section delves into the insights derived from the empirical results of the models utilized in predicting various functional attributes of combustion engine cars, considering the practical implications, limitations, and scope for future research.

Study Context and Assumptions

This study was conducted under the premise of using a limited dataset from the Medium Sedan/Coupe segment sold in the USA in 2002, consisting of only 31 vehicles (Weck, 2006). The small size of the dataset, limited to 9 design variables and 5 functional attributes, posed significant challenges in predicting customer-relevant outputs such as Passenger Volume, Cargo Volume, Towing Capacity, Fuel Efficiency, and Acceleration. Moreover, several of these functional attributes are governed by complex physical relationships, which further complicated the modelling process. Despite these constraints, the study aimed to evaluate the effectiveness of three machine learning models, Support Vector Regression (SVR), k-Nearest Neighbour Regression (kNNR), and Lasso Regression (LassoR), in generating accurate predictions based on the available data.

Interpretation of Results

The results indicate that the SVR model consistently achieved the highest R2 values across most functional attributes, particularly for Passenger Volume, Acceleration, and Fuel Efficiency. However, the kNNR model often provided the most accurate approximations. This suggests that ML models in general, and SVR and kNNR in particular, effectively capture the relationships between design variables and functional attributes, despite the limited data. For Passenger Volume and Cargo Volume, the models demonstrate the ability to accurately predict these attributes, highlighting their capacity to manage the straightforward relationship between design variables and interior spaces. Similarly, for Acceleration and Fuel Efficiency, the models’ high R2 values reflect their capability to handle the complex interactions between aerodynamics, powertrain efficiency, and vehicle weight, even though these attributes are influenced by more intricate physical dynamics.

Limitations of Discussed ML Models

Given the limited dataset, the models were not able to precisely predict all functional attributes. From our perspective, more data across all dimensions (design variables, functional attributes, and observations) is needed to enhance the prediction capabilities. While this is an obvious conclusion, it raises important questions in the industry about which data is available, at what cost, and with what quality. Additionally, in real-world scenarios, a product segment like the US midsize market has only a limited number of competitors, making it difficult to gather more “observations”.

Another limitation of our study is the lack of consideration for specific technologies. In our model-building process, no specific technologies, such as turbocharging or hybrid technology, were considered. These could be included as additional design variables to improve the models’ accuracy.

6 Conclusion and outlook

In this study, we presented the results of a data-driven approach to predict customer-relevant functional attributes. We explored the effectiveness of three machine learning models, Support Vector Regression (SVR), k-Nearest Neighbour Regression (kNNR), and Least Absolute Shrinkage and Selection Operator Regression (LassoR), in predicting various functional attributes of vehicles using a limited dataset from the Medium Sedan/Coupe segment sold in the USA in 2002. Despite the constraints posed by the limited dataset, which included 9 design variables and 5 functional attributes, our findings provide valuable insights into the capabilities and limitations of these models in the context of automotive engineering and address the question of how effective machine learning is in predicting customer-relevant functional attributes.

Key Findings

The results of our analysis indicate that the SVR model consistently outperformed the other models across most functional attributes, achieving the highest R2 values and the lowest errors (MAE and RMSE) for Passenger Volume, Acceleration, and Fuel Efficiency.

On the other hand, the kNNR model often provided the most accurate approximations, particularly for Cargo Volume, Passenger Volume, and Fuel Efficiency, and performed well in predicting Acceleration.

This demonstrates SVR’s and kNNR’s robustness in capturing both straightforward and complex relationships between design variables and functional attributes.

LassoR showed competitive performance in predicting Cargo Volume and Acceleration but underperformed in other areas, highlighting its limitations when applied to this specific dataset.

Implications

Our contribution includes the development of models using Support Vector Regression (SVR) and k-Nearest Neighbour Regression (kNNR) to predict customer-relevant functional attributes based on underlying design variables, at least at an indicative level. In the early stages of product development, these models can support directional decisions in product architecture definition, such as identifying the best fit of existing modules or technologies to meet specific customer requirements. This approach helps balance synergies from potential module use with product distinctiveness, thereby enhancing the success of platform strategies. These tasks are crucial in the product development process (Robertson & Ulrich, 1998; Bender & Gericke, 2021). Additionally, the models, which were trained with rudimentary, publicly available data, align with the trade-off between accuracy and efficiency in the early phase (Koch et al., 1997).

The models demonstrated significant potential, particularly in predicting attributes such as passenger volume, fuel efficiency, and acceleration. The ability of these models to generalize appears to be effective, given the nature of their training data. In this study, a dataset of the US midsize segment was used, which was derived from the literature (Weck, 2006). This indicates that the methodologies and models applied in this research are robust and capable of predicting key performance metrics in similar automotive engineering scenarios.

Future Research

This study was limited by the small dataset and the narrow scope of design variables and functional attributes. The models’ inability to predict all functional attributes with greater accuracy underscores the need for more extensive datasets that capture a broader range of variables and observations. Future research should focus on expanding the dataset to include modern vehicles and technologies, such as turbocharging and hybrid systems, which could be incorporated as additional design variables.

Another important subject is adapting the models to battery electric vehicles, which naturally involve a different set of design variables and functional attributes.

Additionally, the challenge of gathering sufficient data in specific automobile product segments, such as the US midsize market with combustion engines, raises important questions about data availability, as well as the associated costs and quality. Addressing these challenges will be crucial for improving the accuracy and reliability of predictive models in automotive design.

Acknowledgments

We would like to express our gratitude to our colleagues at Mercedes-Benz AG, especially Dr. Jan-Christoph Goos, for their exceptional guidance and support. We also thank the colleagues at the MPM at the Technical University of Berlin for their valuable insights and assistance.

References

Alexandropoulos, S.-A. N., Kotsiantis, S. B., & Vrahatis, M. N. (2019). Data preprocessing in predictive data mining. The Knowledge Engineering Review, 34, e1. https://doi.org/10.1017/S026988891800036X
Bender, B., & Gericke, K. (Eds.). (2021). Pahl/Beitz Konstruktionslehre: Methoden und Anwendung erfolgreicher Produktentwicklung (9th ed.). Springer Vieweg. https://doi.org/10.1007/978-3-662-57303-7
Cook, H. E. (1997). Product management: Value, quality, cost, price, profit and organization (1st ed.). Chapman & Hall.
Dori, D. (2002). Object-Process Methodology. Springer Berlin Heidelberg. https://doi.org/10.1007/978-3-642-56209-9
ElMaraghy, W., ElMaraghy, H., Tomiyama, T., & Monostori, L. (2012). Complexity in engineering design and manufacturing. CIRP Annals, 61(2), 793-814. https://doi.org/10.1016/j.cirp.2012.05.001
Géron, A. (2018). Praxiseinstieg Machine Learning mit Scikit-Learn und TensorFlow: Konzepte, Tools und Techniken für intelligente Systeme (K. Rother, Trans.) (1st ed.). O'Reilly. https://www.oreilly.de/buecher/13111/9783960090618-praxiseinstieg-machine-learning-mit-scikit-learn-und-tensorflow.html
Koch, P. N., Allen, J. K., Mistree, F., & Mavris, D. (1997). The Problem of Size in Robust Design. In Proceedings of the 1997 ASME Design Engineering Technical Conferences (pp. 1-13). https://doi.org/10.1115/DETC97/DAC-3983
Krause, D., & Gebhardt, N. (2023). Methodical development of modular product families: Developing high product diversity in a manageable way. Springer. https://doi.org/10.1007/978-3-662-65680-8
Lang, P., & Baumann, U. (2021). Luxus-Stromer mit Diesel-Reichweite ab 106.374 Euro: Mercedes EQS (2021), Die elektrische S-Klasse. Motor Presse Stuttgart. https://www.auto-motor-und-sport.de/elektroauto/mercedes-eqs-2021-marktstart-elektrische-s-klasse-reichweite-batterie-preis/
Meyer, M. H., & Utterback, J. M. (1993). The Product Family and the Dynamics of Core Capability. Sloan Management Review, 34(3), 29-47. http://hdl.handle.net/1721.1/2440
Raschka, S. (2018). Model Evaluation, Model Selection, and Algorithm Selection in Machine Learning. arXiv. https://doi.org/10.48550/arXiv.1811.12808
Raschka, S., & Mirjalili, V. (2021). Machine Learning mit Python und Keras, TensorFlow 2 und Scikit-learn: Das umfassende Praxis-Handbuch für Data Science, Deep Learning und Predictive Analytics (K. Lorenzen, Trans.) (3rd, updated and expanded ed.). mitp. https://ebookcentral.proquest.com/lib/kxp/detail.action?docID=6947823
Robertson, D., & Ulrich, K. (1998). Planning for Product Platforms. Sloan Management Review, 39(4), 19-31. https://repository.upenn.edu/oid_papers/266/
Seepersad, C. C., Mistree, F., & Allen, J. K. (2002). A Quantitative Approach for Designing Multiple Product Platforms for an Evolving Portfolio of Products. In Proceedings of the ASME 2002 Design Engineering Technical Conferences (pp. 579-592). https://doi.org/10.1115/DETC2002/DAC-34096
Simpson, T. W., Siddique, Z., & Jiao, J. R. (2006). Platform-Based Product Family Development. In Simpson, T. W., Siddique, Z., & Jiao, J. R. (Eds.), Product Platform and Product Family Design: Methods and Applications (pp. 1-15). Springer US. https://doi.org/10.1007/0-387-29197-0_1
Srivastava, A., Hacker, K., Lewis, K., & Simpson, T. W. (2004). A method for using legacy data for metamodel-based design of large-scale systems. Structural and Multidisciplinary Optimization, 28(2-3), 146-155. https://doi.org/10.1007/s00158-004-0438-4
Ververidis, D., & Kotropoulos, C. (2008). Fast and accurate sequential floating forward feature selection with the Bayes classifier applied to speech emotion recognition. Signal Processing, 88(12), 2956-2970. https://doi.org/10.1016/j.sigpro.2008.07.001
Weck, O. L. de. (2006). Determining Product Platform Extent. In Simpson, T. W., Siddique, Z., & Jiao, J. R. (Eds.), Product Platform and Product Family Design: Methods and Applications (pp. 241-301). Springer US.