
Introduction

The generic fitting package Gfitter provides a framework for the statistical analysis of parameter estimation problems in high-energy physics (HEP). It is specifically designed to offer a user-friendly environment for involved fitting problems, such as the global Standard Model fit to electroweak precision data. During its development, Gfitter also proved to be a convenient framework for averaging problems, ranging from simple weighted means, with or without correlated input data, to more involved problems with common systematic errors, with or without parameter rescaling.
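The simplest averaging problem mentioned above, a weighted mean of correlated inputs, can be sketched for the case of two measurements. This is a minimal stand-alone illustration of the standard best linear unbiased average, not Gfitter code; the names Average and averageCorrelated are hypothetical.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical helper types for illustration; not part of the Gfitter API.
struct Average {
    double mean;   // combined central value
    double error;  // combined one-sigma uncertainty
};

// Best linear unbiased average of two measurements x1 +- s1 and x2 +- s2
// with linear correlation coefficient rho (-1 < rho < 1).
Average averageCorrelated(double x1, double s1, double x2, double s2, double rho)
{
    assert(s1 > 0.0 && s2 > 0.0 && rho > -1.0 && rho < 1.0);

    const double cov   = rho * s1 * s2;                // off-diagonal covariance
    const double denom = s1 * s1 + s2 * s2 - 2.0 * cov;

    // Weights that minimise the variance of w1*x1 + w2*x2 with w1 + w2 = 1.
    const double w1 = (s2 * s2 - cov) / denom;
    const double w2 = 1.0 - w1;

    // Variance of the combined value.
    const double var = s1 * s1 * s2 * s2 * (1.0 - rho * rho) / denom;

    return {w1 * x1 + w2 * x2, std::sqrt(var)};
}
```

For two measurements 10 ± 1 and 12 ± 1 with rho = 0, the function returns 11 with an error of about 0.71. A positive correlation of 0.5 leaves the mean unchanged and enlarges the error to about 0.87.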

The software package consists of abstract object-oriented code in C++ using ROOT functionality. Tools for the handling of the data, the fitting, and statistical analyses such as toy Monte Carlo sampling are provided by a core package, where theoretical errors, correlations, and inter-parameter dependencies are consistently dealt with. Theoretical models are inserted as plugin packages, which may be hierarchically organised. The use of dynamic parameter caching avoids the recalculation of unchanged results between fit steps, and thus significantly reduces the amount of computing time required for a fit.

The Gfitter group are:

Gfitter Features

The Gfitter core package provides fitting of parameters (so-called GParameters) steered via data cards. Currently, the input data are written in XML format, which can however be transparently replaced by any other format. The theory to be tested is coded in the form of classes derived from GTheory, in which the dependent GParameters, defined in the input data card, are booked. Each package comprising a full theory has its own namespace. Parameter and theory handling, and the attribution of theories to parameters (through their names), are performed centrally in the Gfitter core package through a data container class, denoted GStore. The detection of fit parameters is automatic, depending on whether or not a given parameter has an associated GTheory.

A standard Gfitter analysis begins by instantiating a GController object from a user script, followed by an initialisation step and the execution of the actions defined in the data card. Global fits, parameter scans in 1D and 2D, and toy analyses can be performed. The results are stored in the target ROOT file and can be used for plotting in custom macros.

The parameter fitting is transparent with respect to the fitter implementation. TMinuit is used by default, but the fitting can be extended via the driving card to the more involved global minimum finders, Genetic Algorithm and Simulated Annealing, implemented in the ROOT package TMVA.

An important feature of Gfitter is the possibility to cache computation results between fit steps. Each parameter holds pointers to the theories that depend on it. Upon computation of the log-likelihood function in a new fit step, only those theories (or parts of theories) that depend on parameters modified with respect to the previous fit step are recomputed. The gain in CPU time from this caching mechanism is substantial, and can reach orders of magnitude in many-parameter fitting problems.
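The caching idea described above can be sketched in a few lines of stand-alone C++. The classes Param and CachedTheory are hypothetical stand-ins invented for this illustration; they are not the actual GParameter and GTheory interfaces, and the real implementation may differ.

```cpp
#include <cassert>
#include <functional>
#include <utility>
#include <vector>

// Hypothetical illustration classes; NOT the Gfitter GTheory/GParameter API.
class CachedTheory {
public:
    explicit CachedTheory(std::function<double()> compute)
        : compute_(std::move(compute)) {}

    // Called by a parameter when its value has changed.
    void invalidate() { dirty_ = true; }

    // Returns the cached prediction, recomputing only if an input changed.
    double value()
    {
        if (dirty_) {
            cache_ = compute_();
            dirty_ = false;
            ++nEvaluations_;
        }
        return cache_;
    }

    int nEvaluations() const { return nEvaluations_; }

private:
    std::function<double()> compute_;
    double cache_ = 0.0;
    bool dirty_ = true;      // nothing is cached before the first evaluation
    int nEvaluations_ = 0;   // counts the actual (expensive) recomputations
};

class Param {
public:
    explicit Param(double v) : value_(v) {}

    // Register a theory that depends on this parameter.
    void addDependent(CachedTheory* theory)
    {
        assert(theory != nullptr);
        dependents_.push_back(theory);
    }

    double get() const { return value_; }

    // Changing the value invalidates only the theories that depend on it.
    void set(double v)
    {
        if (v == value_) return;  // unchanged: all caches stay valid
        value_ = v;
        for (CachedTheory* theory : dependents_) theory->invalidate();
    }

private:
    double value_;
    std::vector<CachedTheory*> dependents_;
};
```

In this sketch, if two theories each depend on a different parameter and only one parameter is changed between two fit steps, only the theory attached to that parameter is recomputed on the next call to value(). The other theory returns its cached result.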

Gfitter offers the possibility to study the behaviour of the log-likelihood test statistic as a function of one or two parameters through one- or two-dimensional scans, respectively. For this purpose, penalty contributions are added to the log-likelihood test statistic, forcing the fit to the parameter value under study. In addition, two-dimensional contour regions of the test statistic can be computed using the corresponding TMinuit functionality.
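As an illustration of the penalty approach, a one-dimensional scan of a parameter a at the value a_0 could be written as follows. The quadratic form of the penalty and the width epsilon are assumptions made for this sketch; the exact form used in Gfitter is not specified here. The symbol chi^2 stands for the test statistic -2 ln L, theta for the full set of fit parameters (including a), and epsilon is taken to be much smaller than the uncertainty on a.

```latex
\chi^2_{\mathrm{scan}}(\vec{\theta};\,a_0)
  \;=\; \chi^2(\vec{\theta}) \;+\; \left(\frac{a - a_0}{\epsilon}\right)^{2},
\qquad
\Delta\chi^2(a_0)
  \;=\; \min_{\vec{\theta}}\,\chi^2_{\mathrm{scan}}(\vec{\theta};\,a_0)
  \;-\; \min_{\vec{\theta}}\,\chi^2(\vec{\theta})
```

In the limit of small epsilon, the minimisation pins a to a_0 while the remaining parameters stay free, so that the scan curve approaches the profile of the test statistic in a.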

Gfitter offers the possibility to perform toy Monte Carlo (MC) analyses, repeating the minimisation step for input parameter values randomly generated around expectation values according to specified errors and correlations. For each MC experiment the fit results are recorded, allowing a statistical analysis, e.g., the determination of the p-value. All parameter scans can optionally be performed in this way, as opposed to using a Gaussian approximation to estimate the p-value for a given scan point (manifestation of true values). The toy analysis also allows one to derive an overall goodness-of-fit probability.
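The toy procedure can be sketched for a deliberately simple fit, a weighted mean of uncorrelated measurements. This is a stand-alone illustration using the C++ standard library rather than Gfitter or ROOT. The function names are hypothetical, and correlations between inputs are ignored here (they would require sampling from a multivariate Gaussian).

```cpp
#include <cassert>
#include <cstddef>
#include <random>
#include <vector>

// Hypothetical helper types for illustration; not part of the Gfitter API.
struct FitResult {
    double mean;  // fitted common value
    double chi2;  // chi-square at the minimum
};

// "Fit" step: weighted mean of uncorrelated measurements x[i] +- sigma[i].
FitResult fitWeightedMean(const std::vector<double>& x,
                          const std::vector<double>& sigma)
{
    assert(!x.empty() && x.size() == sigma.size());
    double sumW = 0.0, sumWX = 0.0;
    for (std::size_t i = 0; i < x.size(); ++i) {
        assert(sigma[i] > 0.0);
        const double w = 1.0 / (sigma[i] * sigma[i]);
        sumW += w;
        sumWX += w * x[i];
    }
    const double mean = sumWX / sumW;
    double chi2 = 0.0;
    for (std::size_t i = 0; i < x.size(); ++i) {
        const double pull = (x[i] - mean) / sigma[i];
        chi2 += pull * pull;
    }
    return {mean, chi2};
}

// Toy MC p-value: fraction of toy experiments whose minimum chi-square is at
// least as large as the one observed in data. Toys are generated around the
// fitted mean with the quoted errors.
double toyPValue(const std::vector<double>& x,
                 const std::vector<double>& sigma,
                 int nToys, unsigned seed)
{
    assert(nToys > 0);
    const FitResult observed = fitWeightedMean(x, sigma);

    std::mt19937 rng(seed);
    std::vector<double> toy(x.size());
    int nAbove = 0;
    for (int t = 0; t < nToys; ++t) {
        for (std::size_t i = 0; i < x.size(); ++i) {
            std::normal_distribution<double> gauss(observed.mean, sigma[i]);
            toy[i] = gauss(rng);
        }
        // Repeat the minimisation for each toy data set.
        if (fitWeightedMean(toy, sigma).chi2 >= observed.chi2) ++nAbove;
    }
    return static_cast<double>(nAbove) / nToys;
}
```

Perfectly consistent inputs give an observed chi-square of zero and hence a p-value of one. Strongly incompatible inputs give a p-value close to zero, with a resolution limited by the number of toys.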
