The Story of the P-version in a Nutshell

By Dr. Barna Szabó
Engineering Software Research and Development, Inc.
St. Louis, Missouri USA


The idea of achieving convergence by increasing the polynomial degree (p) of the approximating functions on a fixed mesh, known as the p-version of the finite element method, was at odds with the prevailing view in the finite element research community in the 1960s and 70s.

The accepted paradigm was that elements should have a fixed polynomial degree, and convergence should be achieved by decreasing the size of the largest element of the mesh, denoted by h. This approach came to be called the h-version of the finite element method. This view shaped the software architecture of legacy finite element codes in ways that made them inhospitable to later developments.

The finite element research community rejected the idea of the p-version of the finite element method with nearly perfect unanimity, predicting that “it would never work”. The reasons given are listed below.

Why "the p-version would never work"

The first objection was that the system of equations would become ill-conditioned at high p-levels. − This problem was solved by proper selection of the basis functions [1].
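This point can be illustrated with a few lines of code (a minimal sketch written for this article, not code from any ESRD product): for a one-dimensional model problem, the stiffness matrix of the higher-order shape functions is diagonal when they are chosen as integrated Legendre polynomials of the kind described in [1], whereas a monomial basis spanning the same space produces a severely ill-conditioned matrix.

```python
# Minimal sketch (not ESRD code): conditioning of the 1D higher-order-mode
# stiffness matrix K_ij = integral_{-1}^{1} phi_i'(x) phi_j'(x) dx
# for two bases spanning the same polynomial space of degree p.
import numpy as np

def k_monomial(p):
    """phi_i(x) = x**i, i = 2..p (monomial-type higher-order functions)."""
    idx = list(range(2, p + 1))
    K = np.zeros((len(idx), len(idx)))
    for a, i in enumerate(idx):
        for b, j in enumerate(idx):
            # integral of i*x**(i-1) * j*x**(j-1) over (-1, 1); zero when i + j is odd
            K[a, b] = 0.0 if (i + j) % 2 else i * j * 2.0 / (i + j - 1)
    return K

def k_integrated_legendre(p):
    """phi_i(x) = integral_{-1}^{x} P_{i-1}(t) dt, i = 2..p (hierarchic shape functions).
    Then phi_i' = P_{i-1}, and K is diagonal by orthogonality of the Legendre polynomials."""
    return np.diag([2.0 / (2 * i - 1) for i in range(2, p + 1)])

for p in (4, 8, 12):
    print(f"p = {p:2d}:  cond(monomial) = {np.linalg.cond(k_monomial(p)):.2e},  "
          f"cond(integrated Legendre) = {np.linalg.cond(k_integrated_legendre(p)):.2f}")
```

The condition number of the monomial matrix grows rapidly with p, while that of the hierarchic basis grows only linearly with p; with suitable scaling of the shape functions it can be reduced to 1.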

The second objection was that high-order elements would require excessive computer time. − This problem was solved by proper ordering of the operations. If the task is stated as: "Compute (say) the maximum normal stress and verify that the result is accurate to within (say) 5 percent relative error," then the p-version requires substantially fewer machine cycles than the h-version and virtually no user intervention.

The third objection was that mappings other than isoparametric and subparametric mappings fail to represent rigid body displacements exactly. − This is true but unimportant because the errors associated with rigid body modes converge to zero very fast [1].

The fourth objection was that solutions obtained using high-order elements oscillate in the neighborhoods of singular points. − Although the finite element solution does oscillate in the neighborhood of singular points, the p-version nevertheless converges at a higher rate because it is very efficient elsewhere [1].

The fifth objection was the hardest one to overcome. There was a theoretical estimate of the error of approximation in energy norm which states:

\| \boldsymbol u_{ex} - \boldsymbol u_{fe} \|_E \le C h^{\min(\lambda,p)}  \quad (1)

On the left of this inequality is the error of approximation in the energy norm; on the right, C is a positive constant, h is the size of the largest element of the mesh, λ is a measure of the regularity of the exact solution (usually a number less than one), and p is the polynomial degree. The argument was that since λ is usually a small number, it does not matter how high p is; it will not affect the rate of convergence. This estimate is correct for the h-version but not for the p-version, because C depends on p [2].
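For comparison, the corresponding estimate proved in [2] for the p-version on a fixed mesh can be stated as follows (this is the form commonly quoted in the literature for an exact solution with an r^λ-type singularity located at a vertex of the mesh; it is a restatement from the standard literature, not a verbatim quotation):

\| \boldsymbol u_{ex} - \boldsymbol u_{fe} \|_E \le C p^{-2\lambda}

where C is independent of p. Measured in terms of the number of degrees of freedom, this is twice the asymptotic rate of convergence of the h-version on quasiuniform meshes.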

(From left to right) Norman Katz, Ivo Babuška and Barna Szabó.

The sixth objection was that the p-version is not suitable for solving nonlinear problems. – This objection was answered when the German Research Foundation (DFG) launched a project in 1994 that involved nine university research institutes. The aim was to investigate adaptive finite element methods with reference to problems in the mechanics of solids [3]. The research was led by professors of mathematics and engineering.

As part of this project, the question of whether the p-version can be used for solving nonlinear problems was addressed. The researchers agreed to investigate a two-dimensional nonlinear model problem. The exact solution of the model problem was not known; therefore, a highly refined mesh with millions of degrees of freedom was used to obtain a reference solution. This is the "overkill" method. The researchers unanimously agreed at the start of the project that the refinement was sufficient, so that the corresponding finite element solution could be used as if it were the exact solution.

Professor Ernst Rank and Dr. Alexander Düster, of the Department of Construction Informatics of the Technical University of Munich, showed that the p-version can achieve significantly better results than the h-version, even when compared with adaptive mesh refinement, and recommended further investigation of complex material models with the p-version [4]. They were also able to show that the reference solution was not accurate enough. With this, the academic debate was decided in favor of the p-version. I attended the concluding conference, held at the University of Hannover (now Leibniz University Hannover).

Understanding the Finite Element Method

The finite element method is properly understood as a numerical method for the solution of ordinary and partial differential equations cast in a variational form. The error of approximation is controlled by both the finite element mesh and the assignment of polynomial degrees [2]. 

The separate labels of h- and p-version exist for historical reasons. Since both the mesh (h) and the assignment of polynomial degrees (p) are important in finite element analysis, the h- and p-versions should not be seen as competing alternatives, but rather as integral components of an adaptable discretization strategy. Note that a code that has p-version capabilities can always be operated as an h-version code, but not the other way around.

There are other discretization strategies, such as X-FEM and Isogeometric Analysis. They have advantages for certain classes of problems, but they lack the generality, adaptability, and efficiency of the finite element method implemented with p-version capabilities.

Outlook

Explainable Artificial Intelligence (XAI) will impose the requirements of reliability, traceability, and auditability on numerical simulation. This will lead to the adoption of methods that support solution verification and hierarchic modeling approaches in the engineering sciences.  

Artificial intelligence tools will have the capability to produce smart discretizations based on the information content of the problem definition. The p-version, used in conjunction with properly designed meshes, is expected to play a pivotal role in that process.


References

[1] B. Szabó and I. Babuška, Finite Element Analysis. John Wiley & Sons, Inc., 1991.

[2] I. Babuška, B. Szabó and I. N. Katz, The p-version of the finite element method. SIAM J. Numer. Anal., Vol. 18, pages 515-545, 1981.

[3] E. Ramm, E. Rank, R. Rannacher, K. Schweizerhof, E. Stein, W. Wendland, G. Wittum, P. Wriggers, and W. Wunderlich, Error-controlled Adaptive Finite Elements in Solid Mechanics, edited by E. Stein. John Wiley & Sons Ltd., Chichester 2003.

[4] A. Düster and E. Rank, The p-version of the finite element method compared to an adaptive h-version for the deformation theory of plasticity. Computer Methods in Applied Mechanics and Engineering, Vol. 190, pages 1925-1935, 2001.


XAI Will Force Clear Thinking About the Nature of Mathematical Models

By Dr. Barna Szabó
Engineering Software Research and Development, Inc.
St. Louis, Missouri USA


It is generally recognized that explainable artificial intelligence (XAI) will play an important role in numerical simulation where it will impose the requirements of reliability, traceability, and auditability. These requirements will necessitate clear thinking about the nature of mathematical models, the trustworthiness of their predictions, and ways to improve their reliability.


What is a Mathematical Model?

A mathematical model is an operator that transforms one set of data D, the input, into another set, the quantities of interest F. In shorthand notation we have:

\boldsymbol D \xrightarrow[(I,\boldsymbol p)]{} \boldsymbol F, \quad (\boldsymbol D, \boldsymbol p) \in \mathbb{C} \quad (1)

where the right arrow represents the mathematical model. The letters I and p under the right arrow indicate that the transformation involves an idealization (I) as well as parameters (physical properties) p that are determined by calibration. Restrictions on D and p define the domain of calibration ℂ, which is also called the domain of application of the mathematical model.
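As a schematic illustration of this structure (the class and field names below are hypothetical, invented for this article, and do not refer to any standard or software product), equation (1) can be read as a contract: a model combines an idealization with calibrated parameters, and it should decline to make predictions when (D, p) lies outside the domain of calibration.

```python
# Hypothetical sketch of the structure expressed by equation (1); all names are illustrative.
from dataclasses import dataclass
from typing import Callable, Mapping

Data = Mapping[str, float]      # the input data D
Params = Mapping[str, float]    # the parameters p, determined by calibration

@dataclass
class MathematicalModel:
    idealization: Callable[[Data, Params], float]          # the transformation (I, p): D -> F
    parameters: Params
    in_calibration_domain: Callable[[Data, Params], bool]  # is (D, p) in the domain of calibration?

    def predict(self, data: Data) -> float:
        """Transform the input data D into the quantity of interest F."""
        if not self.in_calibration_domain(data, self.parameters):
            raise ValueError("(D, p) lies outside the domain of calibration")
        return self.idealization(data, self.parameters)
```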

The formulation of mathematical models is a creative, open-ended activity, guided by insight, experience, and personal preferences. The validation and ranking of mathematical models, on the other hand, are based on objective criteria.

The systematic improvement of the predictive performance of mathematical models and their validation is, essentially, a scientific research program. According to Lakatos [1], a scientific research program has three constituent elements: (a) a set of hardcore assumptions, (b) a set of auxiliary hypotheses, and (c) a problem-solving machinery.

In the applied sciences, the hardcore assumptions are the assumptions incorporated in validated models of broad applicability, such as the theory of elasticity, the Navier-Stokes equations, and the Maxwell equations. The objects of investigation are the auxiliary hypotheses.

For example, in linear elastic fracture mechanics (LEFM), the goal is to predict the probability distribution of the length of a crack in a structural component, given the initial crack configuration and a load spectrum.  In this case, the hardcore assumptions are the assumptions incorporated in the theory of elasticity. One auxiliary hypothesis establishes a relationship between a functional defined on the elastic stress field, such as the stress intensity factor, and increments in crack length caused by the application of cyclic loads. The second auxiliary hypothesis accounts for the effects of overload and underload events.  The third auxiliary hypothesis models the statistical dispersion of crack length.
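One commonly used form of the first auxiliary hypothesis (given here only to make the idea concrete; the text above does not commit to this particular relation) is a Paris-type crack growth law relating the range of the stress intensity factor ΔK to the crack length increment per load cycle:

\frac{da}{dN} = C (\Delta K)^m

where a is the crack length, N is the number of load cycles, and C and m are parameters determined by calibration.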

The parameters characterize the relationships defined by the auxiliary hypotheses and define the material properties of the hardcore problem. The domain of calibration ℂ is the set of restrictions on the parameters imposed by the hardcore assumptions and by limitations in the available calibration data.

Problem-Solving

The problem-solving machinery is a numerical method, typically the finite element method. It generates an approximate solution from which the quantities of interest Fnum are computed. It is necessary to show that the relative error in Fnum does not exceed an allowable value τall:

| \boldsymbol F - \boldsymbol F_{num} |/|\boldsymbol F| \le \tau_{all} \quad (2)

To achieve this goal, it is necessary to obtain a sequence of numerical solutions corresponding to an increasing number of degrees of freedom [2].
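A minimal sketch of how such a sequence can be used follows (an illustrative script written for this article, not code from any ESRD product; it assumes that the quantity of interest converges algebraically with respect to the number of degrees of freedom N, that is, F(N) ≈ F + k N^(−β)): from three solutions computed with increasing N, the limit value, and hence the relative error of the finest solution, can be estimated.

```python
# Illustrative sketch only (not ESRD code): estimate the limit of a quantity of
# interest F from three solutions, assuming F(N) ~ F_inf + k * N**(-beta).
def estimate_limit(N, F, tol=1e-12):
    """N = (N1, N2, N3) with N1 < N2 < N3; F = corresponding quantities of interest."""
    N1, N2, N3 = map(float, N)
    F1, F2, F3 = map(float, F)
    target = (F1 - F2) / (F2 - F3)   # assumes monotone convergence

    def q(beta):                      # mismatch between the assumed model and the data
        return (N1**-beta - N2**-beta) / (N2**-beta - N3**-beta) - target

    lo, hi = 1e-6, 20.0               # bracket for the convergence rate beta
    for _ in range(200):              # plain bisection; q is monotone in beta here
        mid = 0.5 * (lo + hi)
        if q(lo) * q(mid) <= 0.0:
            hi = mid
        else:
            lo = mid
        if hi - lo < tol:
            break
    beta = 0.5 * (lo + hi)
    k = (F2 - F3) / (N2**-beta - N3**-beta)
    return F3 - k * N3**-beta, beta   # (estimated limit, estimated rate)

# Synthetic example with F(N) = 1.0 + 5*N**(-0.5):
Ns = (1000, 4000, 16000)
Fs = tuple(1.0 + 5.0 * n**-0.5 for n in Ns)
F_inf, beta = estimate_limit(Ns, Fs)
rel_err = abs(F_inf - Fs[-1]) / abs(F_inf)   # estimated relative error, cf. equation (2)
print(F_inf, beta, rel_err)
```

The same idea, applied to the potential energy or to the quantities of interest, underlies the extrapolation procedures described in [2].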

Demarcation

Not all model development projects (MDPs) are created equal. It is useful to differentiate between progressive, stagnant, and improper MDPs: An MDP is progressive if the domain of calibration is increasing; stagnant if the domain of calibration is not increasing; and improper if the auxiliary hypotheses do not conform with the hardcore assumptions, or if the problem-solving method does not have the capability to estimate and control the numerical approximation errors in the quantities of interest. Linear elastic fracture mechanics is an example of a stagnant model development project [3].

Presently, the large majority of engineering model development projects are improper. The primary reason for this is that finite element modeling, rather than numerical simulation, is used; hence the capability to estimate and control the numerical approximation errors is absent.

Finite element modeling is formally similar to equation (1):

\boldsymbol D\xrightarrow[(i,\boldsymbol p)]{} \overline {\boldsymbol F}_{num} \quad (3)

where lowercase i is used to indicate intuition in the place of idealization (I), and F̄num replaces F. The overbar is used to distinguish the solutions obtained by finite element modeling from those obtained by proper application of the finite element method.

In finite element modeling, elements are intuitively selected from the library of a finite element software tool and assembled to represent the object of analysis. Constraints and loads are imposed to produce a numerical problem. The right arrow in equation (3) represents a "numerical model", which may not be an approximation to a well-defined mathematical model, in which case F is not defined and F̄num does not converge to a limit value as the number of degrees of freedom is increased. Consequently, error estimation is not possible. Also, the domain of calibration has a different meaning in finite element modeling than in numerical simulation.

Opportunities for Improving the Predictive Performance of Models

There is a very substantial unrealized potential in numerical simulation technology. To realize that potential, it will be necessary to replace the practice of finite element modeling with numerical simulation and utilize XAI tools to aid analysts in performing simulation projects:

  • Rapid advancements are anticipated in the standardization of engineering workflows, initially through the use of expert-designed engineering simulation applications equipped with autonomous error control procedures.
  • XAI will make it possible to control the errors of approximation very effectively.  Ideally, the information in the input will be used to design the initial mesh and assignment of polynomial degrees in such a way that in one or two adaptive steps the desired accuracies are reached.
  • XAI will be less helpful in controlling model form errors. This is because the formulation of models involves creative input for which no algorithm exists. Nevertheless, XAI will be useful in tracking the evolutionary changes in model development and the relevant experimental data.
  • XAI will help navigate numerical simulation projects. For example, it can:
    • Prevent the use of intuitively plausible but conceptually wrong input data.
    • Shorten training time for the operators of simulation software tools.

The Main Points

  • The reliability and effectiveness of numerical simulation can be greatly enhanced through integration with XAI processes. 
  • The main elements of XAI-integrated numerical simulation processes are shown in Figure 1:

Figure 1: The main elements of XAI-integrated numerical simulation.
  • The integration of numerical simulation with explainable artificial intelligence tools will force the adoption of science-based algorithms for solution verification and hierarchic modeling approaches. 

References

[1] I. Lakatos, The methodology of scientific research programmes, vol. 1, J. Worrall and G. Currie, eds., Cambridge University Press, 1978.

[2] B. Szabó and I. Babuška, Finite Element Analysis: Method, Verification and Validation, 2nd ed., John Wiley & Sons, Inc., 2021.

[3] B. Szabó and R. Actis, The Demarcation Problem in the Applied Sciences. Manuscript under review. Available on request.


Why Finite Element Modeling is Not Numerical Simulation?

By Dr. Barna Szabó
Engineering Software Research and Development, Inc.
St. Louis, Missouri USA


The term “simulation” is often used interchangeably with “finite element modeling” in the engineering literature and marketing materials.  It is important to understand the difference between the two.

The Origins of Finite Element Modeling

Finite element modeling is a practice rooted in the 1960s and 70s.  The development of the finite element method began in 1956 and was greatly accelerated during the US space program in the 1960s. The pioneers were engineers who were familiar with the matrix methods of structural analysis and sought to extend those methods to solve the partial differential equations that model the behavior of elastic bodies of arbitrary geometry subjected to various loads.   The early papers and the first book on the finite element method [1], written when our understanding of the subject was just a small fraction of what it is today, greatly influenced the idea of finite element modeling and its subsequent implementations.

Guided by their understanding of models for structural trusses and frames, the early code developers formulated finite elements for two- and three-dimensional elasticity problems, plate and shell problems, etc. They focused on getting the stiffness relationships right, subject to the limitations imposed by the software architecture on the number of nodes per element and the number of degrees of freedom per node. They observed that elements of low polynomial degree were "too stiff". The elements were then "softened" by using fewer integration points than necessary. This caused "hourglassing" (zero-energy modes), which was then suppressed by "hourglass control". For example, the formulation of the element designated as C3D8R and described as "8-node linear brick, reduced integration with hourglass control" in the Abaqus Analysis User's Guide [2] was based on such considerations.
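The hourglassing phenomenon is easy to reproduce (a self-contained sketch written for this article, not taken from any commercial code; it uses a 4-node plane-strain quadrilateral rather than the 8-node brick mentioned above, but the mechanism is the same): with the full 2×2 Gauss rule the element stiffness matrix has exactly three zero eigenvalues, corresponding to the rigid body modes, whereas one-point reduced integration introduces two additional zero-energy (hourglass) modes.

```python
# Self-contained sketch (not from any commercial code): zero-energy modes of a
# 4-node plane-strain quadrilateral with full vs. reduced Gauss integration.
import numpy as np

E, nu = 1.0, 0.3
C = E / ((1 + nu) * (1 - 2 * nu)) * np.array(
    [[1 - nu, nu, 0.0],
     [nu, 1 - nu, 0.0],
     [0.0, 0.0, (1 - 2 * nu) / 2.0]])          # plane-strain material matrix

xy = np.array([[0.0, 0.0], [1.0, 0.0], [1.0, 1.0], [0.0, 1.0]])  # unit square element
xi_n  = np.array([-1.0,  1.0, 1.0, -1.0])       # nodal (xi, eta) of the reference square
eta_n = np.array([-1.0, -1.0, 1.0,  1.0])

def stiffness(gauss_points):
    K = np.zeros((8, 8))
    for xi, eta, w in gauss_points:
        dN_dxi  = 0.25 * xi_n  * (1 + eta * eta_n)   # derivatives of bilinear shape functions
        dN_deta = 0.25 * eta_n * (1 + xi * xi_n)
        J = np.array([dN_dxi, dN_deta]) @ xy          # 2x2 Jacobian of the mapping
        dN_dx, dN_dy = np.linalg.inv(J) @ np.array([dN_dxi, dN_deta])
        B = np.zeros((3, 8))                          # strain-displacement matrix
        B[0, 0::2] = dN_dx
        B[1, 1::2] = dN_dy
        B[2, 0::2] = dN_dy
        B[2, 1::2] = dN_dx
        K += w * np.linalg.det(J) * B.T @ C @ B
    return K

g = 1.0 / np.sqrt(3.0)
full    = [(s * g, t * g, 1.0) for s in (-1, 1) for t in (-1, 1)]   # 2x2 rule
reduced = [(0.0, 0.0, 4.0)]                                         # one-point rule

for name, rule in (("full", full), ("reduced", reduced)):
    eigvals = np.linalg.eigvalsh(stiffness(rule))
    print(name, "zero-energy modes:", int(np.sum(eigvals < 1e-10)))
# Expected: 3 zero-energy modes with full integration (rigid body modes) and
# 5 with reduced integration (3 rigid body modes + 2 hourglass modes).
```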

Through an artful combination of elements and the finite element mesh, the code developers were able to show reasonable correspondence between the solutions of some simple problems and the finite element solutions.  It is a logical fallacy, called the fallacy of composition, to assume that elements that performed well in particular situations will also perform well in all situations.

The Science of Finite Element Analysis

Investigation of the mathematical foundations of finite element analysis (FEA) began in the early 1970s. Mathematicians understand FEA as a method for obtaining an approximation to the exact solution of a well-defined mathematical problem, such as a problem of elasticity. Specifically, the finite element solution uFE has to converge to the exact solution uEX in a norm (which depends on the formulation) as the number of degrees of freedom n is increased:

\lim_{n \to \infty} \| \boldsymbol u_{EX} - \boldsymbol u_{FE} \| = 0

Under conditions that are usually satisfied in practice, it is known that uEX exists and is unique.

The first mathematical book on finite element analysis was published in 1973 [3].  Looking at the engineering papers and contemporary implementations, the authors identified four types of error, called “variational crimes”. These are (1) non-conforming elements, (2) numerical integration, (3) approximation of the domain and boundary conditions, and (4) mixed methods. In fact, many other kinds of variational crimes commonly occur in finite element modeling, such as using point forces, point constraints, and reduced integration.

By the mid-1980s the mathematical foundations of FEA were substantially established.   It was known how to design finite element meshes and assign polynomial degrees so as to achieve optimal or nearly optimal rates of convergence, how to extract the quantities of interest from the finite element solution, and how to estimate their errors.  Finite element analysis became a branch of applied mathematics.

By that time the software architectures of the large finite element codes used in current engineering practice were firmly established. Unfortunately, they were not flexible enough to accommodate the new technical requirements that arose from scientific understanding of the finite element method. Thus, the pre-scientific origins of finite element analysis became petrified in today’s legacy finite element codes.

Figure 1 shows an example that would be extremely difficult, if not impossible, to solve using legacy finite element analysis tools:

Figure 1: Lug-clevis-pin assembly. The lug is made of 16 fiber-matrix composite plies and 5 titanium plies. The model accounts for mechanical contact as well as the nonlinear deformation of the titanium plies. Solution verification was performed.

Notes on Tuning

On a sufficiently small domain of calibration any model, even a finite element model laden with variational crimes, can produce results that appear reasonable and can be tuned to match experimental observations. We use the term tuning to refer to the artful practice of balancing two large errors in such a way that they nearly cancel each other out. One error is conceptual:  Owing to variational crimes, the numerical solution does not converge to a limit value in the norm of the formulation as the number of degrees of freedom is increased. The other error is numerical: The discretization error is large enough to mask the conceptual error [4].

Tuning can be effective in structural problems, such as automobile crash dynamics and load models of airframes, where the force-displacement relationships are of interest.  Tuning is not effective, however, when the quantities of interest are stresses or strains at stress concentrations.  Therefore finite element modeling is not well suited for strength calculations.

Solution Verification is Mandatory

Solution verification is an essential technical requirement for democratization, model development, and applications of mathematical models.  Legacy FEA software products were not designed to meet this requirement. 

There is a general consensus that numerical simulation will have to be integrated with explainable artificial intelligence (XAI) tools.  This can be successful only if mathematical models are free from variational crimes.

The Main Points

Owing to limitations in their infrastructure, legacy finite element codes have not kept pace with important developments that occurred after the mid-1970s.

The practice of finite element modeling will have to be replaced by numerical simulation.  The changes will be forced by the technical requirements of XAI.

References

[1]  O. C. Zienkiewicz and Y. K. Cheung, The Finite Element Method in Structural and Continuum Mechanics, London: McGraw-Hill, 1967.

[2] Abaqus Analysis User's Guide, Version 6.14. http://130.149.89.49:2080/v6.14/books/usb/default.htm

[3] G. Strang and G. J. Fix, An Analysis of the Finite Element Method, Englewood Cliffs, NJ: Prentice-Hall, 1973.

[4] B. Szabó and I. Babuška, Finite Element Analysis: Method, Verification and Validation, 2nd ed., Hoboken, NJ: John Wiley & Sons, Inc., 2021.

A Memo from the 5th Century BC

By Dr. Barna Szabó
Engineering Software Research and Development, Inc.
St. Louis, Missouri USA


Let us be reminded of the timeless wisdom found in The Analects of Confucius (Book XIII, Chapter 3):

“If names be not correct, language is not in accordance with the truth of things. If language be not in accordance with the truth of things, affairs cannot be carried on to success [1]”. 

Confucius (551-479 BC) tells us, engineers working in the 21st century AD, that having a firm grasp on terminology is an essential prerequisite to success. For example, if we are interested in setting up a numerical simulation project then we cannot set realistic goals and expectations unless we understand what numerical simulation is. We cannot understand what numerical simulation is unless we understand what simulation is. We cannot understand what simulation is unless we understand the difference between physical reality and an idea of physical reality, and so on.

The Analects of Confucius (courtesy Wikipedia)

Misleading or Meaningless Terms

There are many popular but misleading or meaningless terms floating around in engineering presentations, technical papers, blog articles, training courses, and workplace environments, such as "physical reality", "laws of nature", "physics-based model", "physics-informed model", "computational model", "governing equations", "finite element modeling", and "artificial intelligence". These terms are not in accordance with the truth of things. Brief explanations follow.

Physical Reality

Quoting Wolfgang Pauli (Nobel Prize in Physics, 1945): "The layman always means, when he says 'reality' that he is speaking of something self-evidently known; whereas to me it seems the most important and exceedingly difficult task of our time is to work on the construction of a new idea of reality [2]".

A mathematical model is a precisely formulated idea about some aspect of physical reality and must never be confused with reality itself.  Many different models can be formulated about the same aspect of reality.  If the predictive performance of two or more models were found to be the same then the models would be equally valid. This view is known as model-dependent realism.  Quoting Stephen Hawking: “I take the positivist viewpoint that a physical theory is just a mathematical model and that it is meaningless to ask whether it corresponds to reality. All that one can ask is that its predictions should be in agreement with observation [3].” – In other words, aspects of physical reality are seen and understood through mathematical models.

You may ask: since Pauli and Hawking were theoretical physicists, and you are interested in engineering, how does all this pertain to you? − The answer is that engineering models deal with aspects of reality as well. As engineers, we benefit from operating within established conceptual frameworks like continuum mechanics, fluid dynamics, and electromagnetism. Working in the foundational sciences, physicists are laboring to formulate and validate a comprehensive conceptual framework. They are progressing very slowly. A physicist has even argued that theoretical physics has been stagnating for about forty years [4].

Laws of Nature

Nature is what it is and if there are such things as laws of nature, they are not known and probably are not knowable.  Laws of great generality and elegance have been formulated to describe aspects of physical reality.  For example, Newton’s laws were considered to be laws of nature for about two centuries, but we now know that they are approximations to a more comprehensive model, the theory of relativity, which has its limitations also.

Physics-based Model

In view of the foregoing, this is a meaningless term.

Physics-informed Model

This vague term is also meaningless.

Computational Model

This term conflates the operations that transform the input data D into the quantities of interest F, called the mathematical model, with the procedures by which a numerical approximation Fnum is obtained. Understanding the difference between F and Fnum is essential.

Governing Equations

This term is misleading. Natural processes are not governed by our equations. Nature knows nothing about our equations. Our equations describe certain aspects of natural processes, subject to limitations imposed by the underlying assumptions and the available calibration data.

Finite Element Modeling

This term refers to the obsolete practice of constructing numerical problems by assembling elements from the library of a finite element code.  In these libraries, the model form and the approximating functions are mixed.

Artificial Intelligence (AI)

This is an umbrella term referring to computer systems designed to perform tasks such as visual perception, speech recognition, and translation between languages. The term is misleading because these tasks involve the execution of clever algorithms that form only a small subset of the functions we associate with human intelligence. Importantly, there is no algorithm for theory choice [5], which involves elements of creativity, innovation, and, possibly, paradigm-shifting approaches. It would be more accurate to interpret the 'I' in AI as signifying 'idiot savant' rather than 'intelligence'.


[1] Translated by James Legge.

[2] Wolfgang Pauli.  Letter to Markus Fierz, 1948.

[3] Stephen Hawking and Roger Penrose.  The nature of space and time. Princeton University Press, 2010.

[4] Sabine Hossenfelder. Lost in math: How beauty leads physics astray. Hachette UK, 2018.

[5] Thomas Kuhn. Postscript to the second edition of “The Structure of Scientific Revolutions”, University of Chicago Press, 1970.
