Overview
========

The codebase consists of two parts. The code that implements the computations is
written in C++11 and is spread over several top-level directories:

.. code-block:: bash

    core/
    neutrino/
    integrator/
    detector/

The paths to the compiled source files should be explicitly listed in
``CMakeLists.txt``.

Computation flow configuration, user interaction and everything else is written
in Python 2 and placed in the ``pylib/`` directory.

The two parts are glued together with PyROOT. Although it saves some coding,
PyROOT is slow and too implicit. I would seriously consider dropping it in favor
of something cleaner (maybe pybind11) in the future.

C++ side
--------

All the objects that are going to be used in the computations are derived from
the ``GNAObject`` class. This object provides the following basic features.

.. _variables:

Variables, parameters, evaluables and taintflags
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

A *variable* represents a named value that is going to be changed during the fit
procedure; it is generally a parameter on which the theoretical prediction of
experimental observations depends. Variables are implemented as the template
class ``variable<T>`` (where ``double`` is almost the only tested and used choice
for ``T``) and provide status tracking: when some variable ``v`` is modified, any
other variable (or a similar object described below) ``x`` that is declared to
depend on ``v`` is signalled to invalidate its currently computed value, marking
it for later recomputation. The invalidation procedure is internally called
*tainting*, while dependencies are named *subscriptions*: ``v.subscribe(x)``
literally means that ``x`` subscribes to invalidation events of ``v``. The users
of variables are not expected to change their values, only to use them in
computations, so the ``variable`` class does not have any direct modification
method.
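
The tainting and subscription mechanism can be illustrated with a minimal Python
sketch. It is not the actual C++ implementation from ``core/Parameters.hh``: the
``Trackable`` class, its ``taint()`` method and its ``tainted`` attribute are
hypothetical names chosen for illustration, and only the ``v.subscribe(x)``
semantics is taken from the text.

.. code-block:: python

    class Trackable(object):
        """Hypothetical stand-in for an object with invalidation tracking."""

        def __init__(self, name):
            self.name = name
            self.tainted = False     # True means the cached value is stale
            self._subscribers = []   # objects to notify when this one is tainted

        def subscribe(self, other):
            """Make `other` subscribe to invalidation events of `self`."""
            self._subscribers.append(other)

        def taint(self):
            """Mark this object as invalid and propagate to all subscribers."""
            if self.tainted:
                return               # already invalid, nothing new to propagate
            self.tainted = True
            for sub in self._subscribers:
                sub.taint()


    v = Trackable("v")
    x = Trackable("x")
    y = Trackable("y")
    z = Trackable("z")   # not subscribed to anything

    v.subscribe(x)       # x depends on v
    x.subscribe(y)       # y depends on x

    v.taint()            # modifying v invalidates x and, transitively, y

In this sketch ``x`` and ``y`` end up tainted, while the unrelated ``z`` stays
valid and would not need to be recomputed.
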
To make a variable actually usable (to make it hold some useful value), it should
be associated with either a *parameter* (implemented as ``parameter``) or an
*evaluable* (``evaluable``). The former provides an interface to directly modify
the value (the ``set()`` method), while the latter is initialized with a function
that computes the value, usually depending on some other variables. The procedure
of such association is called *binding* and is done almost exclusively on the
Python side of the code. All the variables of constructed objects are assumed to
be bound to something, unless explicitly stated otherwise on the Python side
during creation. Any number of variables of different objects may be bound to the
same parameter or evaluable, making it possible to have common systematics. The
computation code should not care about the bindings themselves: whether a
variable is bound to an independent parameter or to a dependent evaluable, the
code should just use the ``variable`` interface.

A *taintflag* is an object used to expose the invalidation state outside of the
parameter/variable/evaluable. It may be subscribed to a number of trackable
objects, and casting it to ``bool`` tells whether any of them has changed since
the last reset.

Please refer to the code in ``core/Parameters.hh``, where the ``variable``,
``parameter``, ``evaluable`` and ``taintflag`` machinery is implemented. They are
exposed in ``GNAObject`` through the ``Parametrized`` base class, which is
implemented in ``core/Parametrized.hh``.

Transformations
^^^^^^^^^^^^^^^

A *transformation* represents the actual computation procedure: it is basically a
function taking any number of inputs (sometimes internally called *sources*) and
providing an arbitrary number of outputs (internally *sinks*). The number, size
and all other properties of inputs and outputs are determined during the
initialization stage, which starts at the object creation and ends at the first
request of actual computation results.
The function that determines the output properties (like sizes) from the input
properties is called the *types function*, and it is expected to be called only
during the initialization stage. The computation itself is done by a *function*
that is called only after the types function has successfully precomputed
everything possible. To carry out the computations there should be no free
inputs: all of them should be connected to an output of another object, forming
the computation chain, or computation graph (which is directed and acyclic).

The transformation gets invalidated when any of the transformations connected to
it is invalidated or when a variable of the containing ``GNAObject`` gets
invalidated (unless another behaviour is explicitly specified during the
transformation initialization). A computed result is reused for as long as it
stays valid, which makes partial recomputation faster.

There may be any number of transformations provided by a single ``GNAObject``;
they all share its state. If only one transformation is expected, one can use
``GNASingleObject`` instead, which allows the transformation name to be dropped
in some contexts, making the code shorter.

The transformations machinery is implemented by inheriting from two classes:
``TransformationTypes::TransformationBase``, which is generally inherited only
through ``GNAObject`` and should not be used directly, and the
``TransformationBind<T>`` template class, which should be inherited by each user
of transformations with the class itself as ``T`` (CRTP is employed). The code
can be found in ``core/transformation/`` and ``core/TransformationDescriptor.hh``.
The former is (or at least was) intended to be used only from C++, while the
latter is meant to be exported to the Python side with PyROOT. This was done with
ROOT5 in mind; since ROOT6 is very good at C++11 bindings, the distinction is not
as important anymore.

Python side
-----------

Running
^^^^^^^

The entry point of the Python code is the ``run()`` function from
``gna.dispatch``.
It handles command line argument parsing and runs the specified commands from
``gna.ui``. Each command corresponds to one module in ``gna.ui``, which should
contain a ``cmd`` class derived at least from ``basecmd``. The methods ``init()``
and ``run()`` are executed, the latter immediately after the former, so there is
no real difference between them. Commands are executed strictly sequentially in
the order specified on the command line.

.. _environment_ns:

Environment and namespaces
^^^^^^^^^^^^^^^^^^^^^^^^^^

Commands do everything: define experiments and data, do fits, plotting, etc. They
share common state through the ``env`` object. This object is implemented in
``gna.env`` and is assumed to be constructed only once in the whole program; the
instance is available as ``gna.gna.env``. Sometimes you can see the ``env``
object passed around as if it were not a singleton; this behaviour is old (from
the time there were plans to have multiple ``env`` objects) and is considered
deprecated.

Every time a ``GNAObject`` is created, it is registered in the ``env``. This
means (at least):

- a reference to the object is kept somewhere in the env, in order not to bother
  with memory management;
- each variable of the object is bound to some value (parameter or evaluable);
  if a variable can't be bound and isn't declared optional by the object itself,
  an exception is thrown;
- each evaluable expression is registered under the corresponding name in the
  namespace provided with the ``ns`` kwarg to the constructor.

This is implemented in ``gna.bindings``, where some pythonization and
monkeypatching is done.

The parameters and evaluables are collected into hierarchical namespaces. Their
names inside a namespace are unique, but they can coincide with subnamespace
names (this is not recommended, to avoid confusion). By convention, parameter and
evaluable names start with an upper case letter, while namespace names are lower
case.
Some namespaces may be active, so the names inside them are visible through
``env.pars`` and are available during the binding procedure. Inactive namespaces
are invisible until activated. This is handled by the ``nsview`` object inside
``env``. Activation/deactivation is done with the context syntax
(``with ns: ...``) or with explicit ``ns.add([...])`` / ``ns.remove([...])``.

Bundles and configuration
^^^^^^^^^^^^^^^^^^^^^^^^^

Bundles are implemented by deriving from the base class
:ref:`TransformationBundle` and implement the following functionality:

+ Construct and configure a single transformation.
+ Construct, configure and connect several transformations. In this sense a
  bundle is a transformation of a higher level.
+ Based on the given configuration, initialize the necessary environments and
  variables, set uncertainties, etc.

The class :ref:`NestedDict` is used to implement the configuration. A
:ref:`NestedDict` configuration can be saved to or loaded from a file, or created
within Python as a regular dictionary. It supports nesting and attribute access
syntax.
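
To make the two features named above (nesting and attribute access) concrete,
here is a minimal sketch of a dictionary that behaves this way. It is not the
real ``NestedDict`` and makes no claim about its API: the class name
``NestedDictSketch`` and the configuration keys are made up for illustration.

.. code-block:: python

    class NestedDictSketch(dict):
        """Hypothetical dictionary with nesting and attribute access."""

        def __init__(self, *args, **kwargs):
            super(NestedDictSketch, self).__init__()
            # Route every item through __setitem__ so nested dicts are wrapped
            for key, value in dict(*args, **kwargs).items():
                self[key] = value

        def __setitem__(self, key, value):
            # Convert plain dictionaries on the way in, so nesting is preserved
            if isinstance(value, dict) and not isinstance(value, NestedDictSketch):
                value = NestedDictSketch(value)
            super(NestedDictSketch, self).__setitem__(key, value)

        def __getattr__(self, name):
            # Called only when normal attribute lookup fails
            try:
                return self[name]
            except KeyError:
                raise AttributeError(name)

        def __setattr__(self, name, value):
            # Attribute assignment is stored as a regular dictionary item
            self[name] = value


    cfg = NestedDictSketch(
        name="detector",                      # made-up configuration keys
        resolution={"a": 0.03, "b": 0.01},
    )
    cfg.resolution.c = 0.0                    # attribute-style assignment

With this sketch, ``cfg.resolution.a`` and ``cfg["resolution"]["a"]`` refer to
the same value, so a configuration written as a regular dictionary can be read
with either syntax.
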