Mina Ferizbegović

Kévin Colin, Mina Ferizbegovic, H. Hjalmarsson

In this letter, we study the trade-off between exploration and exploitation in linear quadratic adaptive control. This trade-off can be expressed as a function of the exploration and exploitation costs, called the cumulative regret. It has been shown over the years that the optimal asymptotic rate of the cumulative regret is, in many instances, $\mathcal{O}(\sqrt{T})$. In particular, this rate can be obtained by adding a white-noise external excitation with a variance decaying as $\mathcal{O}(1/\sqrt{T})$. Because the amount of excitation is predetermined, such approaches can be viewed as open-loop control of the external excitation. In this contribution, we approach the design of the external excitation from a feedback perspective, leveraging the well-known benefits of feedback control over open-loop strategies: reduced sensitivity to external disturbances and to system-model mismatch. We base the feedback on the Fisher information matrix, which is a measure of the accuracy of the model. Specifically, the amplitude of the exploration signal is treated as the control input, while the minimum eigenvalue of the Fisher matrix is the variable to be controlled. We call such exploration strategies Fisher Feedback Exploration (F2E). We propose one explicit F2E design, called Inverse Fisher Feedback Exploration (IF2E), and argue that this design guarantees the optimal asymptotic rate for the cumulative regret. We provide theoretical support for IF2E and, in a numerical example, illustrate its benefits and compare it with the open-loop approach as well as with a method based on Thompson sampling.
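The contrast between a predetermined excitation schedule and one driven by the measured information can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the scalar system, the fixed (non-adaptive) gain `k`, the constant `c`, and in particular the feedback rule, which sets the excitation variance inversely proportional to the current minimum eigenvalue of the accumulated regressor matrix. It is not claimed to be the IF2E design from the letter.

```python
import numpy as np

def simulate(T, mode, a=0.9, b=1.0, k=0.5, c=1.0, seed=0):
    """Run x[t+1] = a*x[t] + b*u[t] + w[t] with u[t] = -k*x[t] + e[t].

    mode="open_loop": excitation variance c / sqrt(t + 1), fixed in advance.
    mode="feedback":  excitation variance c / lambda_min(info), capped at c
                      (a hypothetical rule, chosen only for illustration).
    `info` is the running sum of regressor outer products, which is
    proportional to the Fisher information for Gaussian noise.
    Returns (final minimum eigenvalue, total excitation energy).
    """
    rng = np.random.default_rng(seed)
    x = 0.0
    info = 1e-3 * np.eye(2)  # small regularizer so lambda_min > 0 at t = 0
    energy = 0.0
    for t in range(T):
        if mode == "open_loop":
            var = c / np.sqrt(t + 1)
        else:
            lam_min = np.linalg.eigvalsh(info)[0]  # eigenvalues are ascending
            var = min(c / lam_min, c)
        e = np.sqrt(var) * rng.standard_normal()
        u = -k * x + e
        phi = np.array([x, u])          # regressor for the parameters (a, b)
        info += np.outer(phi, phi)
        energy += e * e
        x = a * x + b * u + rng.standard_normal()
    return float(np.linalg.eigvalsh(info)[0]), energy

if __name__ == "__main__":
    for mode in ("open_loop", "feedback"):
        lam, energy = simulate(2000, mode)
        print(f"{mode}: lambda_min = {lam:.1f}, excitation energy = {energy:.1f}")
```

In the feedback variant the excitation shrinks only as fast as the measured information grows, so a disturbance that happens to excite the system well reduces the injected noise automatically, whereas the open-loop schedule ignores it.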

Mina Ferizbegovic, H. Hjalmarsson, Per Mattsson, Thomas Bo Schön

In this paper, we propose variations of Willems’ fundamental lemma that utilize second-order moments, such as correlation functions in the time domain and power spectra in the frequency domain. We believe that a formulation based on estimated correlation coefficients is suitable for data compression and may also reduce noise. In addition, the frequency-domain formulations can enable modeling of a system in a frequency region of interest.
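As background, the standard form of Willems’ fundamental lemma relies on a Hankel matrix of input data having full row rank (persistency of excitation). The sketch below, for a scalar signal, builds that Hankel matrix, checks the rank condition, and forms the product $HH^\top/N$ as one simple second-order summary whose size does not grow with the data length. The function names are hypothetical, and the summary only illustrates the compression idea; it is not the formulation proposed in the paper.

```python
import numpy as np

def hankel(signal, depth):
    """Stack all length-`depth` windows of a scalar signal as columns."""
    signal = np.asarray(signal, dtype=float)
    cols = len(signal) - depth + 1
    if cols < 1:
        raise ValueError("signal is shorter than the requested depth")
    return np.column_stack([signal[j:j + depth] for j in range(cols)])

def is_persistently_exciting(signal, order):
    """True if the depth-`order` Hankel matrix has full row rank."""
    return bool(np.linalg.matrix_rank(hankel(signal, order)) == order)

def sample_correlation(signal, depth):
    """Second-order summary H H^T / N: a depth-by-depth matrix."""
    H = hankel(signal, depth)
    return H @ H.T / H.shape[1]

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    u = rng.standard_normal(200)
    print(is_persistently_exciting(u, 5))            # random input
    print(is_persistently_exciting(np.ones(200), 2)) # constant input
    print(hankel(u, 5).shape, sample_correlation(u, 5).shape)
```

The Hankel matrix has one column per data window, while the correlation summary stays `depth` by `depth` however long the record is.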

Mina Ferizbegovic, Jack Umenberger, H. Hjalmarsson, Thomas Bo Schön

This letter concerns the problem of learning robust LQ controllers when the dynamics of the linear system are unknown. First, we propose a robust control synthesis method that minimizes the worst-case LQ cost, with probability $1-\delta$, given empirical observations of the system. Next, we propose an approximate dual controller that simultaneously regulates the system and reduces model uncertainty. The objective of the dual controller is to minimize the worst-case cost attained by a new robust controller, synthesized with the reduced model uncertainty. The dual controller is subject to an exploration budget, in the sense that its worst-case cost with respect to the current model uncertainty is constrained. In our numerical experiments, the proposed robust LQ regulator outperforms existing methods. Moreover, the dual control strategy gives promising results in comparison with common greedy random exploration strategies.
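The notion of a worst-case LQ cost can be made concrete with a toy scalar example. The sketch assumes a system $x_{t+1} = a x_t + b u_t + w_t$ with unit-variance noise and static feedback $u_t = -k x_t$, for which the stationary average cost is $(q + r k^2)/(1 - (a - bk)^2)$ when $|a - bk| < 1$. The box-shaped uncertainty set, the grid search, and all names are assumptions for illustration; the letter's synthesis method and its probabilistic uncertainty set are not reproduced here.

```python
import math
from itertools import product

def lq_cost(a, b, k, q=1.0, r=1.0):
    """Stationary average cost of u = -k*x on x+ = a*x + b*u + w, Var(w) = 1."""
    rho = a - b * k                      # closed-loop pole
    if abs(rho) >= 1.0:
        return math.inf                  # unstable loop: unbounded cost
    return (q + r * k * k) / (1.0 - rho * rho)

def worst_case_cost(k, a_hat, b_hat, radius, n=11):
    """Largest cost over an n-by-n grid on the box around (a_hat, b_hat)."""
    offsets = [radius * (2 * i / (n - 1) - 1) for i in range(n)]
    return max(lq_cost(a_hat + da, b_hat + db, k)
               for da, db in product(offsets, offsets))

def robust_gain(a_hat, b_hat, radius, k_grid):
    """Brute-force search for the gain with the smallest worst-case cost."""
    return min(k_grid, key=lambda k: worst_case_cost(k, a_hat, b_hat, radius))

if __name__ == "__main__":
    k_grid = [i / 100 for i in range(151)]
    for radius in (0.0, 0.1, 0.2):
        k = robust_gain(0.9, 1.0, radius, k_grid)
        print(radius, k, worst_case_cost(k, 0.9, 1.0, radius))
```

Shrinking `radius` can only lower the achievable worst-case cost in this toy setting, which is the incentive for a dual controller to spend part of its budget on reducing model uncertainty.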

Jack Umenberger, Mina Ferizbegovic, Thomas Bo Schön, H. Hjalmarsson

This paper concerns the problem of learning control policies for an unknown linear dynamical system to minimize a quadratic cost function. We present a method, based on convex optimization, that accomplishes this task robustly: we minimize the worst-case cost, accounting for system uncertainty given the observed data. The method balances exploitation and exploration, exciting the system so as to reduce uncertainty in the model parameters to which the worst-case cost is most sensitive. Numerical simulations and an application to a hardware-in-the-loop servo-mechanism demonstrate the approach, with appreciable performance and robustness gains over alternative methods observed in both.

Mina Ferizbegovic, Miguel Galrinho, H. Hjalmarsson

Identification of a complete dynamic network affected by sensor noise using the prediction error method is often too complex. One of the reasons for this complexity is the requirement to minimize a non-convex cost function, which becomes more difficult with more complex networks. In this paper, we consider serial cascade networks affected by sensor noise. Recently, the Weighted Null-Space Fitting method has been shown to be appropriate for this setting, providing asymptotically efficient estimates without suffering from non-convexity; however, applicability of the method was subject to some conditions on the locations of sensors and excitation signals. In this paper, we drop such conditions, proposing an extension of the method that is applicable to general serial cascade networks. We formulate an algorithm that describes application of the method in a general setting, and perform a simulation study to illustrate the performance of the method, which suggests that this extension is still asymptotically efficient.
