Treatment of Bone Marrow Cancer Based on Model Predictive Control

a,1 Electrical and Computer Engineering department of Tarbiat Modares University, Al Ahmad Street, Jalal, Tehran 14115-111, Iran b,2 Mechanics, Electrical Power and Computer department of Azad University, Science and Research branch, Simon Bulivar BlvdDaneshgah Blvd, Tehran 14515/775, Iran b,3 Computer Science and Engineering department of Oakland University, 318 Meadow Brook Rd, Rochester, MI 48309, USA 1 Ehsansalajegheh@modares.ac.ir; 2 Mojalalsepide@gmail.com; 3 mojarradghahfar@oakland.edu * Corresponding Author


Introduction
Bone is a complex and dynamic tissue that is constantly being rebuilt. The regeneration process consists of two steps: new bone is made while removing old bone. Humans have about 200 bones, and each of them is a complex of several different tissues that work together: bone, cartilage, connective tissue, various blood tissues, adipose tissue, and nerve tissue. The microscopic structure of bone is also very complex. Bone is composed of approximately 25% water, 25% collagen fibers, and 50% crystalline mineral salt. As mineral salts form in the bones, they form, and the tissues harden. This process of calcification is initiated by osteoblasts, the bone-forming cells [1]. Bone marrow is a soft and vascular tissue that is based on reticular tissue and has various blood, hematopoietic and fat cells on it [2]. Cancer is a genetic disease. Although Bone marrow is a spongy tissue that contains stem cells that are found inside some bones, including the hip and femur. Bone marrow cancer is a type of cancer that is caused by stem cells that make up the blood cells in the bone marrow. Sometimes these cells grow too fast or abnormally, which is called bone marrow cancer. Bone tissue cells are mainly composed of osteoblasts and osteoclasts. Osteoblast cells constantly build new bone throughout the life of each bone, and other cells called osteoclasts constantly absorb pieces of bone, so the bone is constantly being renewed. In this paper, mathematical models of tumors, the effect of the body on the drug, and the drug on the body are introduced, and then the appropriate dose of the drug to reduce tumor density is calculated using the model predictive control (MPC) algorithm. To obtain an adaptive MPC strategy, the extended least squares (ELS) method developed to learn the parameters of the tumor growth model is used. Finally, the simulation in MATLAB, assuming the model is correct, shows that the tumor is gone, and the bone mass improves over a period of time. The results demonstrate that the proposed method is effective for the treatment of bone marrow cancer.
ISSN 2775-2658 Vol. 1, No. 4, 2021, pp. 463-476 Ehsan Salajegheh (Treatment of Bone Marrow Cancer Based on Model Predictive Control) environmental and other non-genetic factors play a role in many stages of tumor development, cancer is primarily caused by mutations in genes prone to cancer [3] [4]. With the increasing threat of cancer to human health, we must seek treatment. In the treatment of these diseases, surgery, radiation therapy, and chemotherapy are used, which must be well planned to have effective treatment results. Proper dosing of the drug requires optimization processes, and this is one of the factors that increases the number of mathematical optimization studies for cancer treatment [5] [6].
The purpose of a mathematical model for tumor growth is to predict and control the course of the disease when using a treatment method. Having a mathematical model is very important to show tumor growth [7]. Another application of Mathematical models to predict and control diseases such as Parkinson's is provided in [8]. In [9], a dose of optimal chemotherapy is calculated by solving the problem of convex optimization based on the inequality of the linear matrix. In [10], it shows that when system states are not fully measurable, MPC is still a useful method of treating cancer. In [11], MPC has been used to provide a chemotherapy program for mice with breast cancer. In many diseases research, it has been hypothesized that the direct effect of the drug is possible. Pharmacodynamics (PD) is the study of the physiological effects of drugs on the body and other microorganisms or parasites. In addition, the mechanism of action of drugs and the effect of drug concentration on the drug are among the topics discussed in pharmacodynamics.
Pharmacodynamics is the effect of a drug on the body, while pharmacokinetics (PK) is the effect of the body on a drug. Leading horizon control is a powerful and well-known technique used to solve the dynamic optimization problem. In most approaches, the leading horizon control is associated with the model predictive control [19]. In [12] and [13], the leading horizon control for tumor growth is expressed based on optimal control. The study [14] [15] proposes mathematical models of drug resistance. In [16], a model is developed to show the microenvironmental interaction between osteoclasts, osteoblasts, and bone mass density, which allows the calculation of cell populations and changes in bone mass at a discrete location of the bone model. In [17], a mathematical model of the relationship between bone cells has been developed, which includes the increase of osteoblasts in the model. In this model, it has been shown that increasing new cells or increasing the proliferation rate of new osteoblasts increases bone mass. In [18], a mathematical model for preventing the progression of bone metastasis using optimal control has been proposed. Model predictive control is one of the advanced process control methods. In model predictive control, an explicit process model is used to obtain a control signal by minimizing a cost function [19][20] [21].
In this article, control is used to plan treatment to reduce cancer tumor density. Tumor density is indicated by a function that depends on the tumor density and drug effect. To find the pharmacological effects, the MPC algorithm must be used to solve an optimization problem. For this reason, a quadratic cost function is used that weighs the drug effects and the error between tumor density and a reference signal [22] [23]. Pharmacokinetics and pharmacodynamics are modeled as two models for a drug. Drugs are grouped according to pharmacodynamics properties, and these properties play a decisive role in deciding whether to use a drug group to treat a disease. The overall goal in pharmacodynamics modeling is to investigate and identify the effects of the drug. The ELS method is used to estimate the model parameters. Since MPC calculates the optimal drug effect of * and only drug dose can be changed, it is necessary to calculate the optimal drug effects of * , which are related to the optimal drug concentration * . To do this, the reverse PD model is defined and used. A controller with a monitor is designed to determine which of the drug d doses produces a drug concentration c close to the optimal drug * concentration.
The structure of the article is as follows: Section 2 defines the mathematical models of PK, PD, and drug resistance as well as the bone model with tumor. In section 3 of the article, the

System Model
In PK/PD models, one of the most important characteristics of drug activity is the concentration of drug at the site of activity. Effective concentrations at the site of effect are required for pharmacodynamics modeling. Pharmacokinetic models, in most cases, are not able to easily predict those concentrations. To meet the need, it is necessary to introduce another class of models that link the pharmacokinetics and pharmacodynamics of the drug [11]. The pharmacodynamics model consists of PK, PD, and drug resistance models and aims to provide a true clinical treatment.

Pharmacokinetic Model (PK)
Pharmacokinetics is the study of the absorption, distribution, metabolism, and excretion of a drug, as well as the effect of a drug on the body and the evaluation of an effective therapeutic response. In fact, it examines the parameters affecting the plasma concentration curve and the factors affecting it. Equation (1) shows the PK model of the drug with a nonlinear and time-invariant system with 2 real poles and a single static gain in which s is the mixed frequency and 1 and 2 are the magnitude of the poles [24].
The state-space of the transfer function (1) can be written as follows: is the state, and is the concentration of drugs in the blood. This system is fully controllable and observable, so a controller with a monitor can be designed to detect the optimal dose of drug . The dynamics of the system with a controller and a monitor is as follows: and are Gain vectors that may be calculated by a pole locating technique. The closed-loop system can be defined as a relation in (4).
With the system defined by (4), the evolution of the drug concentration at the discrete-time is given for the input signal of the Dirac equation ( ) by (5). ..
Where ∆ is the sampling time.

Pharmacodynamics Model (PD)
Pharmacodynamics is the study of the physiological effects of drugs on the body and other microorganisms. In addition, the mechanism of action of drugs and the effect of drug concentration on drug effect are among the topics discussed in pharmacodynamics. An important class of models that study the dose-effect process is known as models or the Hill equation. The general form of these models is as follows: Where is the severity of the pharmacological effects when the drug plasma concentration is . EC50 Drug concentration in which the pharmacological effect is equal to half the maximum pharmacological effect. The variable 1 , 2 , 3 are shape factors that indicate the slope of the effect-concentration curve. This model is used as a model for many pharmacodynamics mechanisms [8]. The PD model of the drug is assumed to be a static nonlinear relation expressed by relation 7 [10].
Where is the effect of the drug at time . The drug concentration of 50 is equal to half of its maximum effect. Therefore, the PD model varies between the amount without drug effect to the maximum drug effect. Changes in drug effect as a function of drug concentration are shown in Fig. 1. Although it has been concluded that the drug effect of u varies between no drug effect to maximum drug effect, the inverse pharmacodynamics model is not defined when is 1, hence to 99% of maximum drug effect for Both PD and PD -1 models are restricted. Drug Resistance Model Drug resistance is a natural process in the human body and is a major problem in the treatment of cancer. If the drug concentration is below a certain threshold , only weak cells are killed. The cells produced are resistant to this amount of drug concentration. This phenomenon is called drug resistance. Equation (8) defines the amount of drug resistance at time .
Where is the sampling distance and the maximum at which there is no drug resistance. Given that drug resistance r affects the following, 50 can be used as a relation (9) in which is a parameter related to the disease's ability to develop drug resistance [10]. This method can overcome the limitations of other methods, such as wavelet transform. The empirical mode decomposition (EMD) and Hilbert transform can be used to characterize the system [25]. Initially, the intrinsic mode function (IMFs) of the structural acceleration responses will be obtained by the empirical mode analysis method. The combination of these functions can be calculated by HHT of natural frequency and system attenuation ratio [26] [27].

Bone microbial model with tumor dynamics and drug treatment
Among the models studied for tumor growth, the Gompertz model is used to show tumor growth and simulate its behavior. Equation (20) represents the Gompertz model with a control input. Tumor density changes are considered as a function of continuous-time by ( ) [4].
Where ∈ 0 + is a parameter dependent on the tumor growth rate, ∈ 0 + is the sensitivity of the tumor to the drug, ∈ + is the stable level, ∈ [0 ] is the tumor density and 2 ∈ 0 + is the effect of the drug that kills the tumor cell.

Bone microenvironment model
Where 1,2 is the rate of osteoclast and osteoblast production, 1,2 is the rate of osteoclast and osteoblast removal, 11 and 22 are autocrine parameters, 21 and 12 are paracrine parameters, is tumor parameter. The variable ̅ and ̅ are the average amount of osteoclasts and osteoblasts, respectively. The variable u1 indicates the therapeutic effect of osteoblasts.

Extended Least Square Method
In the least-squares method in general, the unknown parameters of a mathematical model must be chosen in such a way that the sum of the squares of the difference or error between the observations (actual data) and the results calculated from the mathematical model is minimized. In this method, the mathematical model is a linear regression model that is defined as The calculation ̂ in this method is known as the least-squares method as In the least-squares method (RLS), the matrix 1 () T   must be non-single, i.e., and its determinant must be zero. So, the recursive least-squares method is an online method of identification. If the detected data is white noise, the recursive least squares estimate will be able to perform the detection without bias, but if the data is colored noise, the RLS will be biased. In this case, the Extended Least Squares (ELS) is implemented to solve the problem using the semi-analytical method. In this method, the system model is considered as, Where ( ), ( ), and ( ) are polynomials in terms of the leading transmission operator and e (t) noise. After removing the system noise, the method of empirical mode decomposition will also be applied to decompose the obtained signals. In this case, the equation (15) Therefore, the estimation will be based on (16) relationships.

Method PID Controller
For bone repair, osteoclast and osteoblast oscillations must reach a certain steady state. The closed-loop system stability and performance conditions will be extracted in the format of the sum of squares by guaranteeing ∞ tracking index according to the work that has been done in [26]. For this purpose, a discrete PID is used to calculate the variable 1 . The lubricating control is obtained by the differential equation (17). 1 The drug effect of osteoblast regulator and ( ) indicates the error between the osteoblast reference signal and the output. The variable is the proportional interest, is the integral interest and is the derivative interest. Where, , MPC Method MPC is used in a wide variety of optimal control problems. Ref. [28] aims to determine an optimal allocation of autonomous vehicles in a multi-lane heterogeneous traffic network where the road is shared between autonomous and human-driven vehicles. MPC formulation is a model-based control method. It is desirable to find how much 2 ( ) should be at time to reduce tumor density. This is done by solving the limited optimization problem in a leading horizon strategy.
Where +1 is the predicted output vector at the moment + 1.
Where is the forecast horizon and R    is the setting parameter. The reference signal vector * 1 k Y  follows the relation (22).
And is the input control vector.
Finally, the relation (19) is defined as a relation (24). To solve the optimization problem, the quasi-Newton algorithm is used, which requires an initial estimate of the solution * for each time . Equation (25) is used to obtain the input control vector at time .  [31], the parameters of the model can be determined as time variable entities. More clearly, the C/GMRES approach can be determined by variable time parameters by defining the Jacobian as a function of frizzed/dynamic parameters. The detailed approach is presented in [24].

Simulation
Consider two different simulations: In simulation A, MPC is used to control tumor growth, assuming direct control of the drug u effect. In simulation B, the PC pharmacodynamics model is also considered so that the treatment is as close as possible to clinical reality. The ELS method is used to estimate the model parameters in both simulations A and B, then some features of MPC are tested. Growth time is measured to check the MPC speed for the system to approach the reference signal. Some simulation results are presented in Fig. 5 to Fig. 13. Note that the 5% growth time required by ( ) to reach the 5% difference between the primary tumor density (0) and the tumor density equilibrium (+ ∞). The "rate ⁄ ∞" is the offset between the stable , , and the reference signal balance. These two functional properties are investigated as a function of the maximum effect of the drug and the optimization parameter . The purpose of the relationship (26) is exponential reference seeking.
With initial tumor density ∞ = 10, (0) = 97.5, while ∈ + is the parameter that defines the rate of change of the reference signal. The exponential reference signal starts with (0). Tumor density detects the reference signal, and the system reaches equilibrium in about 5 weeks of treatment. The selected value is with step size ℎ = 1/7. Fig. 4 shows the daily dose of medication that should be administered to the patient. Fig. 4 is simulated for different with constant and vice versa. Fig . 5 shows that as decreases, the system response slows down. However, when is small, the MPC cannot bring y to ∞ even if the simulation time increases.  Fig. 6 shows that as increases, the bandwidth becomes smaller, and the system response slows down. In addition, more resistance is applied to the system because high-frequency dynamics are reduced. Two other important features to consider are the relationship between simulation cost and simulation time as a function of the N forecast horizon. When analyzing the cost as a function of the forecast horizon, two things are considered that are in accordance with Fig. 7. In test 1, the MPC tumor density model has no parametric error, while the MPC tumor density model in test 2 has 20 Percentage error is parametric. The variable is a cost function that evaluates all experiments with sample , which is defined as The vector contains the system outputs, the vector * contains the reference signal, and the vector contains the system inputs. Analyzing Fig. 8, Test 1 has a lower value than Test 2. There are no specific rules for choosing the best N forecast horizon. There is a general rule that the choice of should be after the angle of the cost curve because more predictions will reduce costs very little. For test 1, the angle of the cost curve is at = 2, meaning that any > 2 is probably a good choice (for test 2, the angle of the cost curve is approximately at = 5). However, large values of do not mean that the cost function decreases. It is obvious that when increases, the change in the cost function is almost zero. In addition, when increases, the simulation time increases exponentially. There is no significant difference in simulation time between the two tests, as the MPC must perform the exact same calculations for both tests.  Fig. 8 shows the system inputs and outputs for four different 0 (0). Although system inputs and outputs are different in the first few weeks, all system inputs converge to the same result, indicating that the initialization of the optimization problem is not important. Fig. 8. System input and output for different initial optimization Test 1: 0 (0) = 0.1, Test 2: 0 (0) = 0.5, Test 3: 0 (0) = 1, Test 4: 0 (0). A random vector between 0 and 1 whose elements may be different. Fig. 9 shows the estimation of tumor growth model parameters from = 1. Estimation of tumor growth model parameters begins when cell-killing agents are being injected. It can be seen that the estimator converges in about 10 days. The block diagrams in Fig. 2 and Fig. 3 are used to reduce tumor density and repair bone volume. Two moments are defined: 1 marks the beginning of the time for both treatments (tumor cell killer and osteoblast regulator), and 2 stands when the tumor is gone. In bone mass density recovery, a discrete PI controller with a leading Euler integral is used to calculate the therapeutic effect of u1. Fig. 10 examines the drug effects of 1 and 2 and the drug dose 2.  After k2, in which only osteoblast regulator is active, bone mass density improves until it reaches a steady of ̅ = 100. Changes in the concentration of the lethal drug in cell parameter 50 and the level of drug resistance are shown in Fig. 11.  Fig. 12 and Fig. 13 show that between = 0 and 1, when no drug is used, tumor density increases, and the balance between osteoclasts and osteoblasts decreases, and thus bone mass density decreases from steady-state. Between 1 and 2, where both drugs are used, the tumor density decreases until the tumor is destroyed (T<2%), and the balance of osteoclasts and osteoblasts improves, as does bone mass density.

Conclusion
In this article, we used an adaptive optimal solution method for the treatment of bone marrow cancer, which is a combination of MPC and ELS methods, and also empirical mode decomposition method is applied to decompose the denoised obtained signals. This control problem is solved by optimization, and a drug dosing program is presented. In the proposed simulations, MPC can reduce tumor density and prevent the tumor from spreading to other organs. The controller works well when checking for errors between the system model and the actual system output, as well as when using different methods to interpret the Gompertz model and directs tumor density to the desired reference values. By starting treatment to regulate the activity of osteoblasts, an optimal dose of medication is obtained for the patient, which restores the density of healthy bone mass. Also, assuming the model is correct, it eliminates the tumor in about 2 years and improves normal bone density in 5 years. Actual treatment should be considered by considering the pharmacodynamics model of the drug. In subsequent studies, in addition to examining the pharmacodynamics model (PD, PK, and drug resistance), the effect of drug toxicity can be investigated.