Introduction
The generic fitting package Gfitter comprises a framework for
the statistical analysis of parameter estimation problems in HEP. It is specifically
designed to provide a user-friendly environment for involved fitting problems,
such as the global Standard Model fit to electroweak precision data. During its
development, it was found that Gfitter is also a convenient framework for
averaging problems, ranging from simple weighted means, with or without correlated
input data, to more involved problems with common systematic errors, with or
without parameter rescaling.
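The simplest averaging problems mentioned above can be illustrated with a minimal C++ sketch. It is not Gfitter code: the names `Average`, `weightedMean` and `correlatedMean` are hypothetical, Gaussian errors are assumed, and the correlated case is restricted to two measurements for brevity.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Central value and uncertainty of an average (hypothetical helper type).
struct Average {
    double mean;
    double error;
};

// Inverse-variance weighted mean of uncorrelated measurements.
Average weightedMean(const std::vector<double>& values,
                     const std::vector<double>& errors) {
    assert(!values.empty() && values.size() == errors.size());
    double sumW = 0.0;
    double sumWX = 0.0;
    for (std::size_t i = 0; i < values.size(); ++i) {
        const double w = 1.0 / (errors[i] * errors[i]);
        sumW += w;
        sumWX += w * values[i];
    }
    return {sumWX / sumW, 1.0 / std::sqrt(sumW)};
}

// Average of two measurements with correlation coefficient rho, using the
// minimum-variance linear combination for a 2x2 covariance matrix.
Average correlatedMean(double x1, double e1, double x2, double e2, double rho) {
    const double v1 = e1 * e1;
    const double v2 = e2 * e2;
    const double cov = rho * e1 * e2;
    const double denom = v1 + v2 - 2.0 * cov;
    assert(denom > 0.0);  // zero only for rho = 1 with equal errors
    const double w1 = (v2 - cov) / denom;
    return {w1 * x1 + (1.0 - w1) * x2,
            std::sqrt((v1 * v2 - cov * cov) / denom)};
}
```

For rho = 0 the two functions agree. A positive correlation between measurements of equal precision leaves the central value unchanged but increases the uncertainty of the average.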
The software package consists of abstract object-oriented code in C++ using ROOT functionality.
Tools for the handling of the data,
the fitting, and statistical analyses such as toy Monte Carlo sampling are provided by a core
package, where theoretical errors, correlations, and inter-parameter dependencies are consistently
dealt with. Theoretical models are inserted
as plugin packages, which may be hierarchically organised.
The use of dynamic parameter caching avoids the recalculation of unchanged
results between fit steps, and thus significantly reduces the amount of computing
time required for a fit.
The Gfitter group are:
Gfitter Features
The Gfitter core package provides fitting of parameters (so-called GParameters) via data cards. Currently, the input data are written in XML format, which can, however, be transparently replaced by any other format. The theory to be tested is coded in the form of classes derived from GTheory, in which the dependent GParameters, defined in the input data card, are booked. Each package comprising a full theory has its own namespace. Parameter and theory handling, and the attribution of theories to parameters (through their names), are performed centrally in the Gfitter core package via a data container class, denoted GStore. The detection of fit parameters is automatic, depending on whether or not a given parameter has an associated GTheory.
A standard Gfitter analysis begins by instantiating a GController object from a user script, followed by an initialisation step and the execution of the actions defined in the data card. Global fits, parameter scans in 1D and 2D, and toy analyses can be performed. The results are stored in the target ROOT file and can be used for plotting in custom macros.
The parameter fitting is transparent with respect to the fitter implementation. The
default fitter is TMinuit, but the driving card can also select the more
involved global minimum finders Genetic Algorithm and Simulated Annealing, implemented
in the ROOT package TMVA.
An important feature of Gfitter is the possibility to cache computation results between
fit steps. Each parameter holds pointers to the theories that depend on it. Upon computation
of the log-likelihood function in a new fit step, only those theories (or part of theories)
that depend on modified parameters (with respect to the previous fit step) are recomputed.
The gain in CPU time from this caching mechanism is substantial, and can reach orders of
magnitude in many-parameter fitting problems.
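The caching idea described above can be sketched in a few lines of C++. This is an illustration only, not the Gfitter implementation: the classes `Parameter` and `Theory` are hypothetical stand-ins for GParameter and GTheory, and the "expensive" calculation is a trivial placeholder.

```cpp
#include <cstddef>
#include <vector>

class Theory;  // forward declaration

// A fit parameter that holds pointers to the theories depending on it.
class Parameter {
public:
    explicit Parameter(double v) : value_(v) {}
    double value() const { return value_; }
    void addDependent(Theory* t) { dependents_.push_back(t); }
    void set(double v);  // defined below, needs the complete Theory type
private:
    double value_;
    std::vector<Theory*> dependents_;
};

// A theory prediction that is recomputed only when one of its inputs changed.
class Theory {
public:
    Theory(Parameter& a, Parameter& b) : a_(a), b_(b) {
        a_.addDependent(this);
        b_.addDependent(this);
    }
    Theory(const Theory&) = delete;             // parameters store `this`,
    Theory& operator=(const Theory&) = delete;  // so copying is forbidden

    void invalidate() { dirty_ = true; }

    double prediction() {
        if (dirty_) {
            // Placeholder for an expensive theory calculation.
            cache_ = a_.value() * a_.value() + b_.value();
            ++evaluations_;
            dirty_ = false;
        }
        return cache_;
    }
    std::size_t evaluations() const { return evaluations_; }
private:
    Parameter& a_;
    Parameter& b_;
    double cache_ = 0.0;
    bool dirty_ = true;
    std::size_t evaluations_ = 0;
};

// Changing a parameter invalidates only the theories that depend on it.
void Parameter::set(double v) {
    if (v == value_) return;  // unchanged value: keep all caches
    value_ = v;
    for (Theory* t : dependents_) t->invalidate();
}
```

In this sketch a fit step that changes a single parameter triggers a recalculation only in the theories registered with that parameter. All other predictions are served from their cached values.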
Gfitter offers the possibility to study the behaviour of the log-likelihood test
statistic as a function of one or two parameters by means of one- or two-dimensional
scans, respectively. For this purpose, penalty contributions are added to
the log-likelihood test statistic, forcing the fit to the parameter value
under study. In addition, two-dimensional contour regions of the test statistic
can be computed using the corresponding TMinuit functionality.
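One way such a penalty contribution could look is sketched below in C++. The quadratic form of the penalty, the `tolerance` parameter and the names `TestStatistic` and `withPenalty` are assumptions made for illustration; the text does not specify the form Gfitter actually uses.

```cpp
#include <cstddef>
#include <functional>
#include <vector>

// A test statistic as a function of the full parameter vector (hypothetical type).
using TestStatistic = std::function<double(const std::vector<double>&)>;

// Wrap a test statistic so that parameter `index` is pulled towards `target`.
// A small `tolerance` makes the penalty steep, which effectively pins the
// scanned parameter while all other parameters remain free in the fit.
TestStatistic withPenalty(TestStatistic chi2, std::size_t index,
                          double target, double tolerance) {
    return [chi2, index, target, tolerance](const std::vector<double>& p) {
        const double pull = (p.at(index) - target) / tolerance;
        return chi2(p) + pull * pull;
    };
}
```

A one-dimensional scan would then loop over the target values, minimise the wrapped function at each point with the fitter of choice, and record the minimum of the unpenalised test statistic found there.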
Gfitter offers the possibility to perform toy Monte Carlo (MC) analyses, repeating
the minimisation step for input parameter values randomly generated around their
expectation values according to the specified errors and correlations.
For each MC experiment the fit results are recorded, allowing
a statistical analysis, e.g., the determination of the p-value. All parameter scans
can optionally be performed in this way, as opposed to using a Gaussian approximation
to estimate the p-value for a given scan point (manifestation of true values). It
also allows an overall goodness-of-fit probability to be derived.
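The toy procedure can be illustrated with a minimal C++ sketch for the simplest case, a fit of one common mean to several measurements. It is not Gfitter code: the names `chi2Min` and `toyPValue` are hypothetical, the errors are assumed Gaussian, and correlations are omitted for brevity.

```cpp
#include <cassert>
#include <cstddef>
#include <random>
#include <vector>

// Minimum chi2 of a fit of one common mean to measurements with Gaussian
// errors; the minimum lies at the inverse-variance weighted mean.
double chi2Min(const std::vector<double>& x, const std::vector<double>& err) {
    assert(!x.empty() && x.size() == err.size());
    double sumW = 0.0;
    double sumWX = 0.0;
    for (std::size_t i = 0; i < x.size(); ++i) {
        const double w = 1.0 / (err[i] * err[i]);
        sumW += w;
        sumWX += w * x[i];
    }
    const double mean = sumWX / sumW;
    double chi2 = 0.0;
    for (std::size_t i = 0; i < x.size(); ++i) {
        const double pull = (x[i] - mean) / err[i];
        chi2 += pull * pull;
    }
    return chi2;
}

// p-value from toys: the fraction of toy experiments, generated around
// `trueMean` with the given errors, whose minimum chi2 is at least as
// large as the observed one.
double toyPValue(double chi2Observed, double trueMean,
                 const std::vector<double>& err,
                 std::size_t nToys, unsigned seed) {
    assert(nToys > 0);
    std::mt19937 rng(seed);
    std::normal_distribution<double> gauss(0.0, 1.0);
    std::vector<double> toy(err.size());
    std::size_t nAbove = 0;
    for (std::size_t t = 0; t < nToys; ++t) {
        for (std::size_t i = 0; i < err.size(); ++i) {
            toy[i] = trueMean + err[i] * gauss(rng);
        }
        if (chi2Min(toy, err) >= chi2Observed) ++nAbove;
    }
    return static_cast<double>(nAbove) / static_cast<double>(nToys);
}
```

In this Gaussian case the toy result should approach the usual chi2 probability for (number of measurements - 1) degrees of freedom. The toy approach becomes useful when that approximation is not expected to hold.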