3. Simple transformations
Here we review several simple transformations. Simple means that they do not depend on any variables and have no inputs. There are currently three such transformations, Points, Histogram and Histogram2d, which enable the user to provide input data (arrays or histograms) in the form of transformation outputs.
Note
Points, Histogram and Histogram2d objects may also be constructed from ROOT TH1, TH2 and TMatrix objects. See Importing data from ROOT files for an example.
3.1. Points
The Points transformation represents a 1d or 2d array as a transformation output. A Points instance is created from a numpy array passed as input:
import numpy as np
# The constructors module, referred to as `C` below
# (import path assumed here; it is not shown in the original listing)
from gna import constructors as C

# Create numpy array
narray = np.arange(12).reshape(3,4)
# Create a points instance with data, stored in `narray`
parray = C.Points(narray)

# Print the structure of GNAObject parray
parray.print()
print()

# Print list of transformations
print('Transformations:', list(parray.transformations.keys()))

# Print list of outputs
print('Outputs:', list(parray.points.outputs.keys()))
print()

# Access the output `points` of transformation `points` of the object `parray`
print('Output:', parray.points.points)
# Access and print relevant DataType
print('DataType:', parray.points.points.datatype())
# Access the actual data
print('Data:\n', parray.points.points.data())
The code produces the following output:
[obj] Points: 1 transformation(s)
    0 [trans] points: 0 input(s), 1 output(s)
        0 [out] points: array 2d, shape 3x4, size 12

Transformations: ['points']
Outputs: ['points']

Output: [out] points: array 2d, shape 3x4, size 12
DataType: array 2d, shape 3x4, size 12
Data:
 [[ 0.  1.  2.  3.]
 [ 4.  5.  6.  7.]
 [ 8.  9. 10. 11.]]
Let us now go through the code in more detail. We first prepare a 2-dimensional array on the Python side:
narray = np.arange(12).reshape(3,4)
In order to use this data in the computational chain, a transformation should be provided. The Points transformation is used for arrays. We use the Points constructor from the constructors module to initialize it from the numpy array [1].
parray = C.Points(narray)
Here parray is a GNAObject. We may now print the information about its transformations, inputs and outputs:
parray.print()
[obj] Points: 1 transformation(s)
    0 [trans] points: 0 input(s), 1 output(s)
        0 [out] points: array 2d, shape 3x4, size 12
As can be seen from the output, the Points instance has a single transformation called points with a single output, also called points. As was shown in the Introduction, a transformation may be accessed by its name as an attribute of the object, as object.transformation_name:
t = parray.points
print(t)
[trans] points: 0 input(s), 1 output(s)
The short way to access its output is similar: object.transformation_name.output_name. In our case it reads as follows:
output = parray.points.points
print(output)
[out] points: array 2d, shape 3x4, size 12
There exists a longer, but in some cases more readable, way of accessing the same data:
output = parray.transformations['points'].outputs['points']
print(output)
[out] points: array 2d, shape 3x4, size 12
Here we read the dictionary of transformations, request the transformation points, access the dictionary of its outputs and request the output points.
Now that we can access the transformation output, we may request the data it holds:
arr = parray.points.points.data()
print(arr)
print('shape:', arr.shape)
[[ 0. 1. 2. 3.]
[ 4. 5. 6. 7.]
[ 8. 9. 10. 11.]]
shape: (3, 4)
The data() method triggers the transformation function, which does the calculation, and returns a numpy view on the result contained in parray.points.points. Accessing data() for the second time will do one of the following things:
- Return the same view on the cached data if no calculation is required.
- If some of the prerequisites of the output have changed, call the transformation function again, updating the result. A view on the updated data is then returned.
The status of the transformation may be checked by accessing its taintflag:
print(bool(parray.points.tainted()))
If the result of the method is false, the call to data() will return the cached data without triggering the transformation function. If it is true, the call to data() will execute the transformation function and then return a view on the updated data [2].
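The caching behaviour described above can be illustrated with a minimal pure-Python sketch. The class LazyOutput and its methods are hypothetical stand-ins written for this illustration only; they are not GNA code and make no claim about how GNA implements the taintflag internally.

```python
class LazyOutput:
    """Hypothetical stand-in for a transformation output (not GNA code)."""

    def __init__(self, func):
        self._func = func      # plays the role of the transformation function
        self._cache = None     # last computed result
        self._tainted = True   # nothing has been computed yet
        self.ncalls = 0        # how many times the function actually ran

    def tainted(self):
        return self._tainted

    def taint(self):
        """Mark the cached result as outdated (a prerequisite has changed)."""
        self._tainted = True

    def data(self):
        # Recompute only if the cached result is outdated
        if self._tainted:
            self._cache = self._func()
            self.ncalls += 1
            self._tainted = False
        return self._cache


scale = {'value': 2.0}
out = LazyOutput(lambda: [scale['value'] * x for x in range(3)])

first = out.data()    # tainted: the function is executed
second = out.data()   # not tainted: the cached result is returned
scale['value'] = 3.0  # a prerequisite changes...
out.taint()           # ...so the output is marked as tainted
third = out.data()    # the function is executed again

print(first, third, 'function calls:', out.ncalls)
```

In this sketch the taint has to be set by hand; the text above describes it as following from a change in the prerequisites of the output.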
The term view here means that if the data is modified by the transformation, the arr variable will contain the updated data. At the same time, accessing arr does not trigger the calculation itself; only data() does.
If the user wants a fixed version of the data, the copy() method should be used:
arr = parray.points.points.data().copy()
print(arr)
print('shape:', arr.shape)
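The difference between a view and a copy can be checked with plain numpy, without GNA. In the sketch below the array buffer is only an assumed stand-in for the memory that holds the transformation result, and the in-place assignment imitates a recalculation.

```python
import numpy as np

# `buffer` stands in for the memory holding the transformation result
buffer = np.arange(12, dtype='d').reshape(3, 4)

view = buffer.view()      # shares memory with `buffer`
snapshot = buffer.copy()  # independent, fixed version of the data

# Imitate a recalculation that updates the result in place
buffer[0, 0] = 100.0

print('view:    ', view[0, 0])      # follows the update: 100.0
print('snapshot:', snapshot[0, 0])  # keeps the old value: 0.0
print('view shares memory:    ', np.shares_memory(buffer, view))
print('snapshot shares memory:', np.shares_memory(buffer, snapshot))
```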
There is also a datatype() method that returns a DataType instance holding the information on the array dimensions:
dt = parray.points.points.datatype()
print(dt)
We have now defined a transformation holding the data. The transformation's output may be connected to the inputs of other transformations in order to build a computational chain (see Sum and product: transformations with inputs). It is important to understand that the way to access transformations and their inputs and outputs is universal and is applicable to any GNAObject.
3.2. Histogram
The Histogram transformation stores 1-dimensional histogrammed data. It is very similar to the 1d version of Points, with one difference: its DataType also stores the bin edges.
 1  import numpy as np
 2  # Create numpy array for data points
 3  nbins = 12
 4  narray = np.arange(nbins)**2 * np.arange(nbins)[::-1]**2
 5  # Create numpy array for bin edges
 6  edges = np.linspace(1.0, 7.0, nbins+1)
 7
 8  # Create a histogram instance with data, stored in `narray`
 9  # and edges, stored in `edges`
10  hist = C.Histogram(edges, narray)
11  hist.print()
12  print()
13
14  # Access the output `hist` of transformation `hist` of the object `hist`
15  print('Output:', hist.hist.hist)
16  # Access and print relevant DataType
17  datatype = hist.hist.hist.datatype()
18  print('DataType:', datatype)
19  print('Bin edges:', list(datatype.edges))
20  # Access the actual data
21  print('Data:', hist.hist.hist.data())
The workflow for a histogram is very similar to that of an array. The object has a single transformation hist with a single output hist. The main difference is that the DataType of the histogram now has the histogram edges defined. On line 19 the datatype.edges C++ vector is accessed and converted to a Python list.
The code produces the following output:
[obj] Histogram: 1 transformation(s)
    0 [trans] hist: 0 input(s), 1 output(s)
        0 [out] hist: hist, 12 bins, edges 1.0->7.0, width 0.5

Output: [out] hist: hist, 12 bins, edges 1.0->7.0, width 0.5
DataType: hist, 12 bins, edges 1.0->7.0, width 0.5
Bin edges: [1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0, 4.5, 5.0, 5.5, 6.0, 6.5, 7.0]
Data: [ 0. 100. 324. 576. 784. 900. 900. 784. 576. 324. 100. 0.]
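The relation between the edges and the data in this example can be verified with numpy alone. The sketch below only repeats the array arithmetic of the listing above; the names widths and centers are introduced here for illustration and are not part of the GNA interface.

```python
import numpy as np

nbins = 12
edges = np.linspace(1.0, 7.0, nbins + 1)
data = np.arange(nbins)**2 * np.arange(nbins)[::-1]**2

# A histogram with N bins is described by N+1 edges
print('edges:', edges.size, 'bins:', data.size)

# Bin widths and bin centers follow from the edges
widths = np.diff(edges)
centers = 0.5 * (edges[:-1] + edges[1:])
print('widths are all 0.5:', bool(np.allclose(widths, 0.5)))
print('first and last centers:', centers[0], centers[-1])
```

The constant width of 0.5 agrees with the value reported in the DataType line of the output above.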
3.3. Histogram2d
Histogram2d is the 2-dimensional version of a histogram. It holds a 2-dimensional array, and its DataType has two sets of bin edges.
 1  import numpy as np
 2  # Create numpy arrays for bin edges
 3  nbinsx, nbinsy = 12, 8
 4  edgesx = np.linspace(0, nbinsx, nbinsx+1)
 5  edgesy = np.linspace(0, nbinsy, nbinsy+1)
 6  # Create fake data array
 7  narray = np.arange(nbinsx*nbinsy).reshape(nbinsx, nbinsy)
 8  narray = narray**2 * narray[::-1,::-1]**2
 9
10  # Create a histogram instance with data, stored in `narray`
11  # and edges, stored in `edgesx` and `edgesy`
12  hist = C.Histogram2d(edgesx, edgesy, narray)
13  hist.print()
14  print()
15
16  # Access the output `hist` of transformation `hist` of the object `hist`
17  print('Output:', hist.hist.hist)
18  # Access and print relevant DataType
19  datatype = hist.hist.hist.datatype()
20  print('DataType:', datatype)
21  print('Bin edges (X):', list(datatype.edgesNd[0]))
22  print('Bin edges (Y):', list(datatype.edgesNd[1]))
23  # Access the actual data
24  print('Data:', hist.hist.hist.data())
Again, the general workflow is very similar. When there are multiple axes, their bin edges may be accessed by axis index via the edgesNd member of the DataType: see lines 21 and 22.
The code produces the following output:
[obj] Histogram2d: 1 transformation(s)
    0 [trans] hist: 0 input(s), 1 output(s)
        0 [out] hist: hist2d, 12x8=96 bins, edges 0.0->12.0 and 0.0->8.0

Output: [out] hist: hist2d, 12x8=96 bins, edges 0.0->12.0 and 0.0->8.0
DataType: hist2d, 12x8=96 bins, edges 0.0->12.0 and 0.0->8.0
Bin edges (X): [0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0, 11.0, 12.0]
Bin edges (Y): [0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
Data: [[ 0. 8836. 34596. 76176. 132496. 202500. 285156. 379456.]
 [ 484416. 599076. 722500. 853776. 992016. 1136356. 1285956. 1440000.]
 [1597696. 1758276. 1920996. 2085136. 2250000. 2414916. 2579236. 2742336.]
 [2903616. 3062500. 3218436. 3370896. 3519376. 3663396. 3802500. 3936256.]
 [4064256. 4186116. 4301476. 4410000. 4511376. 4605316. 4691556. 4769856.]
 [4840000. 4901796. 4955076. 4999696. 5035536. 5062500. 5080516. 5089536.]
 [5089536. 5080516. 5062500. 5035536. 4999696. 4955076. 4901796. 4840000.]
 [4769856. 4691556. 4605316. 4511376. 4410000. 4301476. 4186116. 4064256.]
 [3936256. 3802500. 3663396. 3519376. 3370896. 3218436. 3062500. 2903616.]
 [2742336. 2579236. 2414916. 2250000. 2085136. 1920996. 1758276. 1597696.]
 [1440000. 1285956. 1136356. 992016. 853776. 722500. 599076. 484416.]
 [ 379456. 285156. 202500. 132496. 76176. 34596. 8836. 0.]]
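As a consistency check, the shape of the data array in this example can be compared with the two sets of edges using numpy alone. The sketch below repeats the array construction from the listing; that the first array axis corresponds to X and the second to Y is read off the example (12 bins in X, 8 in Y) and is treated here as an assumption rather than a documented rule. The name expected_shape is introduced for illustration.

```python
import numpy as np

nbinsx, nbinsy = 12, 8
edgesx = np.linspace(0, nbinsx, nbinsx + 1)
edgesy = np.linspace(0, nbinsy, nbinsy + 1)
narray = np.arange(nbinsx * nbinsy).reshape(nbinsx, nbinsy)
narray = narray**2 * narray[::-1, ::-1]**2

# Each axis has one more edge than it has bins
expected_shape = (edgesx.size - 1, edgesy.size - 1)
print('shape matches edges:', narray.shape == expected_shape)

# The fake data is unchanged when both axes are reversed,
# which is why the printed table reads the same from both ends
print('symmetric:', bool(np.array_equal(narray, narray[::-1, ::-1])))
print('corner and maximum:', narray[0, 1], narray.max())
```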