% \input{epigraph}
% \noindent
\section{Historical background}
\label{sec:bg}
In the history of the natural sciences,
there have been two main approaches to describing dynamical systems,
which I call here
\emph{kinetics} and \emph{thermodynamics}.
% This thesis is about the relationship between the two.
The first approach goes all the way back to
Newton's laws of motion~\citep{newton}.
Loosely speaking, it describes a system by
the positions and momenta of each particle.
A description of this type is as explicit and detailed as it can get
for the type of systems under consideration in classical mechanics,
\ie the movement of point particles.
In the case of a classical mechanical system that has $N$ particles,
a state of the system is specified by
a vector in $\RR^{2 \cdot 3 \cdot N}$.
In general, a kinetic description is the full description of the
dynamics of the system in terms of the velocities of its processes.
% Regarding the role of forces in Newtonian mechanics:
% Interactions producing a change of momentum on a particle
% that can be measured independently do not interact between them
% and thus the resulting derivative of the momentum (force)
% is just the sum of the forces as measured independently.
The thermodynamic approach, on the other hand,
gives a description based on an \emph{energy function}.
The energy function is defined on the states of the system
and assigns a real value to each state, its energy.
That is, a state is described by a single scalar
regardless of how many particles it comprises.
Naturally, this approach endowed the description of
a dynamical system in classical mechanics
with a remarkable conciseness, simplicity and elegance.
It first appeared in the work of
\citet{lagrange2} and \citet{hamilton},
and has been subsequently used as the basis for most of modern physics.
Once in possession of the energy function,
the kinetic description (\ie the equations of motion)
can be derived from it.
However, the converse is not true:
in general a kinetic description might not have an energy function
from which it can be derived \citep{santilli}
because of non-conservative (dissipative) forces.
Obtaining an energy function from the equations of motion
is referred to as the \emph{inverse problem} % in classical mechanics
and it was first attended to by \citet{helmholtz}.
Both the direct and the inverse problem are the interest of this thesis
and we aim to answer these questions
in the context of biomolecular interaction networks.
% A more precise statement of the problem will have to wait
% until we have introduced all the relevant concepts.
% Note that this approach has been given the name `thermodynamic'
% not because of thermodynamics,
% the science that studies the dynamics of heat and temperature,
% but because of the protagonical role of the energy
% in driving the system's evolution.
% Certainly, there are connections to thermodynamics
% that will be highlighted as they arise.
Half a century after Hamilton's work,
researchers like Maxwell, Boltzmann, and Gibbs
applied the ideas of classical mechanics to \emph{atoms}
in order to describe physical properties of matter like pressure,
the capacity to transfer heat, and others.
This body of work came to be known as \emph{statistical mechanics}
and was used to explain Brownian motion by \citet{einstein-brownian},
which after its experimental verification \citep{perrin}
settled the debate about the existence of atoms.
It also tried, albeit unsuccessfully, to explain
the second law of thermodynamics
and thus how irreversible processes arise from reversible ones.
% Here we do not attempt to explain irreversible processes
% in biomolecular interaction networks.
Perhaps it failed because the second law does not hold
in general, \eg in small systems and at short time scales \citep{ft-lab}.
A theorem in dynamical systems generalises the second law
and can explain these results \citep{ft}.
Here, however, we concern ourselves with equilibrium thermodynamics
and all processes considered are reversible.
Statistical mechanics had to be extended in order to explain
% This work however did not attempt to explain
the chemical interactions and reactions that molecules undergo.
% That would have to wait yet half a century for
This extension was greatly helped by
the axiomatisation of probability theory by \citet{kolmogorov}
and the further developments by \citet{doob} and \citet{feller},
who, among others, established the theoretical framework
for continuous-time Markov chains (CTMCs).
The definitions of (time-homogeneous) CTMCs
and of the \qmatrices used here are given below.
% In particular, we work with time-homogeneous CTMCs.
\begin{definition}%[Infinitesimal generator]
A (stable and conservative) \emph{\qmatrix} $\qm$
on an at most countable set of states $\states$
is an $\states \times \states$ matrix
with elements $q_{ij} \in \RR$, $i,j \in \states$
such that $0 \leqslant q_{ij} < \infty$ when $i \neq j$
and $q_{ii} = - \sum_{j \neq i} q_{ij} > -\infty$.\footnote{
If unstable, $q_{ii}$ can be $-\infty$; if non-conservative,
$q_{ii} \leqslant - \sum_{j \neq i} q_{ij}$.}
\end{definition}
The \qmatrix plays the role of
the time derivative of the transition probabilities at time $0$
and induces the evolution of a probabilistic state
according to the Kolmogorov backward equation,
\begin{equation}
\label{eq:transition-function}
\ddt P(t) = \qm P(t), \quad P(0) = I
\end{equation}
where $P(t)$ is the $\states \times \states$ matrix
whose element $p_{ij}(t) \in \RR$ is the probability that,
having been in state $i$ at time $0$,
the system is in state $j$ at time $t$.
% transition probability from state $i$ to $j$ at time $t$.
When the \qmatrix is stable and conservative
there exists a unique minimal\footnote{
% In the sense that
If $P'(t)$ is any
non-negative solution of \eqn{transition-function},
then $p_{ij}(t) \leqslant p'_{ij}(t)$
for all $i, j \in \states$ and $t \geqslant 0$.}
solution $P(t)$ of \eqn{transition-function} \citep{anderson}.
% Proved in Th 2.2.2, page 70, of Anderson's book
% Properties of transition functions and q-matrices:
% P(t) is stable if \ddt p_{ii}(t) is finite at t = 0 for all i. (p 4)
% P(t) is standard if \lim_{t \to 0} p_{ii} = 1 for all i. (p 6)
% Q is stable if q_{ii} > -\infty for all i. (p 9)
% Q is conservative if \sum_j q_{ij} = 0 for all i. (p 13)
We shall work with this type of \qmatrices
and assume there is a transition function $P(t)$
whenever we have a \qmatrix $\qm$ and vice versa.
Given a \pmf $\state(0)$ on $\states$ (seen as a row vector)
representing the state at $t = 0$,
the probability distribution $\state(t)$ after time $t$
is given by $\state(t) = \state(0) P(t)$.
% We can obtain the time derivative of this distribution
% from \eqn{transition-function}.
% % $s_i(t)$ that the Markov chain is in state $i$ at time $t$.
% In coordinate form, we have
% \begin{equation} % TODO: is this equation correct?
% \label{eq:prob-deriv}
% \ddt s_i(t) = \sum_{j \in \states} q_{ji} s_j(t)
% \end{equation}
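As a small numerical illustration
(a minimal sketch in Python; the two-state \qmatrix is hypothetical),
we can compute the transition function $P(t) = e^{\qm t}$
with an off-the-shelf matrix exponential
and watch $\state(t) = \state(0) P(t)$ converge to an invariant \pmf.
\begin{verbatim}
import numpy as np
from scipy.linalg import expm

# Hypothetical two-state q-matrix: off-diagonal entries are
# non-negative rates and each row sums to zero (conservative).
Q = np.array([[-2.0,  2.0],
              [ 1.0, -1.0]])
s0 = np.array([1.0, 0.0])      # initial pmf: start in state 0

for t in [0.0, 0.5, 2.0, 10.0]:
    P = expm(Q * t)            # solution of dP/dt = Q P, P(0) = I
    print(t, s0 @ P)           # s(t) = s(0) P(t) -> (1/3, 2/3)
\end{verbatim}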
We say the \qmatrix is \emph{irreducible}
if every state is reachable regardless of the initial state,
\ie for all $i,j \in \states$ there is some $t > 0$
such that $p_{ij}(t) > 0$.
\begin{definition}[CTMC]%[continuous-time Markov chain]
A \emph{continuous-time Markov chain} is a tuple
$\tuple{\states, \state(0), \qm}$ with
$\states$ an at most countable set of states,
$\state(0)$ a \pmf on $\states$
representing the initial state and
$\qm$ the \qmatrix of the Markov chain.
% The Markov chain can be presented as a time-indexed family
% of random variables $X_t$.
\end{definition}
A few properties of CTMCs that are important
for the present work are given below.
\begin{definition}[detailed balance]
\label{def:detailed-balance}
A \qmatrix $\qm$ on $\states$
is said to be \emph{time reversible} iff
there is a \pmf $\ip$ on $\states$ such that
\begin{equation}
\label{eq:detailed-balance}
\ip_i q_{ij} = \ip_j q_{ji}
\end{equation}
for all $i,j \in \states$.
Then $\qm$ is said to have \emph{detailed balance}
with respect to $\ip$.
\end{definition}
Detailed balance was first proposed,
in a slightly stronger form
that requires every path going from $i$ to $j$
to have a reverse path with which it is in equilibrium,
by \citet{wegscheider} in the context of chemical kinetics.
Its validity for other physical systems was argued by
\citet{lewis} and \citet{tolman}.
Tolman called the generalised principle
\emph{microscopic reversibility}.
% The stronger form requires every path going from $i$ to $j$
% to have an reverse path with which it is in equilibrium.
\begin{definition}
A \pmf $\ip$ on $\states$ is
\emph{invariant} for a \qmatrix $\qm$
iff $\ip \qm = 0$, \ie
\[ -\ip_i q_{ii} = \ip_i \sum_{j \neq i} q_{ij}
= \sum_{j \neq i} \ip_j q_{ji} \]
\end{definition}
That is to say, $\ip$ is invariant
whenever the action of $\qm$ on $\ip$ does not change $\ip$,
or equivalently when $\ip$ is a fixed point of the evolution,
\ie $\ip P(t) = \ip$ for all $t \geqslant 0$.
% The relationship between detailed balance and an invariant \pmf
% is established by the following lemma.
\begin{lemma}
Suppose the \qmatrix $\qm$
has detailed balance with respect to $\ip$.
Then $\ip$ is invariant for $\qm$.
\end{lemma}
\begin{proof}
From \eqn{detailed-balance} we obtain
\[ \sum_{i \in \states} \ip_i q_{ij} =
\sum_{i \in \states} \ip_j q_{ji} = -\ip_j q_{jj}, \]
as $\sum_{i \in \states} q_{ji} = -q_{jj}$ for any fixed state $j$.
\end{proof}
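To make the lemma concrete, here is a minimal numerical check
(the three-state \pmf below is hypothetical):
we build a \qmatrix with rates $q_{ij} = \min(1, \ip_j/\ip_i)$,
which has detailed balance with respect to $\ip$ by construction
since $\ip_i q_{ij} = \min(\ip_i, \ip_j)$ is symmetric in $i$ and $j$,
and verify that $\ip$ is indeed invariant.
\begin{verbatim}
import numpy as np

# Hypothetical pmf on three states.
pi = np.array([0.5, 0.2, 0.3])

n = len(pi)
Q = np.zeros((n, n))
for i in range(n):
    for j in range(n):
        if i != j:
            Q[i, j] = min(1.0, pi[j] / pi[i])
    Q[i, i] = -Q[i].sum()      # conservative diagonal

F = pi[:, None] * Q            # flow matrix: F_ij = pi_i q_ij
assert np.allclose(F, F.T)     # detailed balance
assert np.allclose(pi @ Q, 0)  # hence pi is invariant
\end{verbatim}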
Once an invariant \pmf is reached by the Markov chain
it stays there forever.
We would therefore like to know when an invariant
\pmf is realised by the Markov chain.
\begin{definition}[ergodicity]
A \qmatrix $\qm$ is \emph{ergodic} when
there is a \pmf $\ip$ on $\states$ such that
for all $i,j \in \states$,
\[ \lim_{t \to \infty} p_{ij}(t) = \ip_j \]
\end{definition}
This is equivalent to saying that the Markov chain
will converge to the \pmf $\ip$
regardless of the initial state $\state(0)$.
\begin{lemma}
\label{lemma:ergodic}
Suppose the \qmatrix $\qm$ is irreducible
and has an invariant \pmf $\ip$.
Then $\qm$ is ergodic and converges to $\ip$.
\end{lemma}
The proof for this lemma can be found in part 2 of theorem 1.6
in chapter 5 of Anderson's book (\cite*[][pages 160--161]{anderson}).
CTMCs have a strong kinetic flavour as they describe
stochastic processes in terms of probability flows
happening at a certain rate. % (read velocities).
% They are the most explicit description in the stochastic world
% for discrete state, continuous time processes.
% and all approaches to describe these processes
% are interpreted in terms of them.
%
It is then natural to wonder what the thermodynamic approach
looks like in the stochastic world.
It turns out the energy function has a very clear interpretation
in this setting, namely, that of defining the probability $\ip_i$
that the system finds itself in state $i \in \states$ as follows
% \citep[page 72]{mcquarrie-stat-mech}.\footnote{
\citep[page 40]{mcquarrie-stat-mech}.\footnote{
We express the energy in units of $1/\kB T$
% (with $\kB$ the Boltzmann constant and $T$ the temperature)
% (known as inverse temperature)
to avoid writing this term explicitly.}
\begin{equation}
\label{eq:energy}
\ip_i = \frac{\exp{-E(i)}}{\sum_{j \in \states} \exp{-E(j)}}
\end{equation}
This is known as the \emph{Boltzmann distribution}.
Note that
(i) when given the probability distribution $\ip$
the energy function is defined uniquely
only up to an additive constant;\footnote{
In other words, if we change the energy of each state by adding
a fixed constant we obtain the same probability distribution $\ip$.}
(ii) by convention the sign of the energy is inverted
so lower energies represent more favourable states; and
(iii) in the case of detailed balance,
we obtain $\exp{E(j)-E(i)} = q_{ji}/q_{ij}$
by combining \eqn{energy} and \eqn{detailed-balance}.
% The next question is
How do we construct a CTMC
from an energy function?
% What else do we need?
% Clearly, we need to know the state space $\states$.
% Also, unlike in classical mechanics,
% we would need to know which transitions between states are possible
% since there are no assumptions of continuity on $\states$.
The first formulation to shed light on this problem
was proposed by \citet{metropolis}.
The algorithm asks for an energy function and
an \emph{a priori} one-step transition probability matrix $A$
where each element $a_{ij}$ ($i,j \in \states$)
denotes the probability that
we choose to jump to state $j$ when we are at state $i$.
Hence $\sum_{j \in \states} a_{ij} = 1$ for any fixed $i$
and we write $a_{i-}$ for this probability distribution.
The $A$ matrix is assumed to be symmetric,
\ie $a_{ij} = a_{ji}$ for all $i,j \in \states$,
% This matrix plays the role of the \qmatrix in the
% discrete-time setting (\ie where time is indexed by the naturals)
% and each element $a_{ij}$ denotes the probability that
% we choose to jump to state $j$ when we are at state $i$.
although this is not strictly necessary,
and the algorithm was later generalised by \citet{hastings}
to work under the weaker assumption that
$a_{ij} = 0$ iff $a_{ji} = 0$.
% The $A$ matrix asserts which transitions are possible.
% By symmetry, their reverse transitions are also possible.
Note that when addressing the direct problem for CTMCs
by using the Metropolis algorithm
we require an extra ingredient --- the $A$ matrix ---
which was not needed in classical mechanics.
% This extra ingredient was unnecesary
% when addressing the direct problem in classical mechanics
% because implicit assumptions of continuity on $\states$
% supply us with this information.
This is because in classical mechanics there are
implicit assumptions of continuity on $\states$
that supply this information.
For a single particle the state space is $\RR^{2 \cdot 3}$
and, intuitively, an allowed transition in this continuous space
is a differential change in any direction,
\ie $dx$, $dy$, $dz$.
% In addition to the energy function and the state space,
% we would need to know which transitions are possible.
% This is unlike classical mechanics,
% where assumptions of continuity on $\states$
% make this unnecesary.
%
On the other hand,
the method that will be presented in \chp{direct}
does not ask for a priori transition probabilities
% (\ie an $A$ matrix)
but only which reversible transitions are possible at all.
The construction of \citet{metropolis} gives
a \emph{discrete-time} Markov chain that
converges to the probability distribution $\ip$ in \eqn{energy}.
% For the sake of simplicity we present here
% only the original formulation.
The algorithm works as follows.
Given any state $i \in \states$ we pick a neighbour state $j$
at random according to the probability distribution $a_{i-}$.
We evaluate the energy function at $i$ and $j$
to compute $\Delta E = E(j)-E(i)$ and proceed with the transition
with probability $1$ if $\Delta E \leqslant 0$ and
with probability $\exp{-\Delta E}$ if $\Delta E > 0$;
if the transition is rejected we stay at state $i$.
In both cases time (a natural number) is increased by $1$.
We repeat from state $j$ if the transition was successful
and from $i$ otherwise.
To see that $\ip$, as defined in \eqn{energy},
is the invariant probability distribution
of the discrete-time Markov chain
we show that it has (the discrete-time version of)
detailed balance with respect to $\ip$.
The probability $p_{ij}$ of jumping from $i$ to $j$ is
a combination of the a priori probability $a_{ij}$ and
the probability of accepting that transition,
which depends on $\Delta E$.
\[ p_{ij} = a_{ij}\; \min(1, \exp{-\Delta E}) \]
By taking the ratio of $p_{ij}$ and $p_{ji}$ we obtain
\[ \frac{p_{ij}}{p_{ji}} =
\frac{a_{ij}\; \min(1, \exp{E(i)-E(j)})}{
a_{ji}\; \min(1, \exp{E(j)-E(i)})} =
\frac{\min(1, \exp{E(i)-E(j)})}{
\min(1, \exp{E(j)-E(i)})} \]
since $a_{ij} = a_{ji}$ by symmetry of $A$.
% of the a priori transition probability matrix.
Suppose $E(i) > E(j)$; then
\[ \frac{p_{ij}}{p_{ji}} = \exp{E(i)-E(j)}
= \frac{\exp{-E(j)}}{\exp{-E(i)}}
= \frac{\ip_{j}}{\ip_{i}} \]
It is easy to see that when
$E(j) \geqslant E(i)$ we obtain the same equation.
Hence the discrete-time Markov chain has detailed balance
with respect to $\ip$. % as defined in \eqn{energy}.
Provided the a priori transition probability matrix $A$
makes it possible to reach any state from any other state,
the Markov chain will converge to $\ip$ as $t \to \infty$.
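To fix ideas, the following minimal sketch implements
the original algorithm in Python
(the ring-shaped state space and its energies are hypothetical);
the uniform proposal over the two neighbours of a state
plays the role of the symmetric $A$ matrix,
and the empirical distribution of visited states
approaches the Boltzmann distribution of \eqn{energy}.
\begin{verbatim}
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical states 0..4 on a ring, energies in units of 1/kT.
E = np.array([0.0, 1.0, 2.0, 1.0, 0.5])
n = len(E)

def metropolis(i, steps):
    counts = np.zeros(n)
    for _ in range(steps):
        # a priori proposal a_{i-}: uniform over the two ring
        # neighbours, so the A matrix is symmetric
        j = rng.choice([(i - 1) % n, (i + 1) % n])
        dE = E[j] - E[i]
        if dE <= 0 or rng.random() < np.exp(-dE):
            i = j              # accept the transition
        counts[i] += 1         # time advances by 1 either way
    return counts / steps

print(metropolis(0, 200_000))            # empirical distribution
print(np.exp(-E) / np.exp(-E).sum())     # Boltzmann distribution
\end{verbatim}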
The Metropolis-Hastings algorithm can be generalised
to the continuous-time case \citep{diaconis}.
However, the algorithm requires us to either
(i) compute the energy of all states to obtain the probabilities
$p_{ij}$ (or transition rates $q_{ij}$ in the continuous-time case),
or (ii) do rejection sampling, as outlined above.
Option (i) can be very time-consuming when $\states$ is large
or when the energy function is expensive to evaluate.
Option (ii) can be inefficient when the rejection rate is high.
For these reasons we explore an alternative method in this thesis.
We partition the state space into regions of equal energy
and group transitions according to these regions.
This is made possible by assuming extra structure on $\states$
(to be introduced in \sct{kappa}).
Let us go back to the stochastic modelling of
chemical interactions mentioned above.
The theory of CTMCs allows one to frame
the dynamics of chemical reaction systems.
A stochastic approach to such systems
was pioneered by \citet{delbruck}
and has been common practice for decades
\citep{mcquarrie-stoch-kinetics}.
The physical conditions under which this approach is plausibly valid
have been argued by \citet{gillespie76}.
% The stochastic approach has provided insights
% in the understanding of living organisms. Posible refs:
% http://dx.doi.org/10.1371/journal.pbio.1000115
% http://science.sciencemag.org/content/297/5584/1183
% http://www.nature.com/nrg/journal/v6/n6/full/nrg1615.html
% http://science.sciencemag.org/content/320/5872/65
% http://www.alexeikurakin.org/text/akDGE2005.pdf
Since the number of molecules of a species
is a priori unbounded, $\states$ might be infinite,
and one would like to have a way to express
these systems in a finite and simple form.
A language that could do this was designed by \citet{petri}.
% came to be in the work of \citet{petri}.
This language, later called \emph{Petri nets},
sees reactions as transformations of
multisets of chemical species.
\begin{definition}
A \emph{multiset} $M$ over a set $X$ is a map from $X$ to
the naturals assigning to each element $x \in X$
the number of copies $M(x) \in \NN$ of that element
in the multiset.
\end{definition}
There is a natural partial order $\leqslant$ on multisets over $X$.
We say $M \leqslant N$ if for each element $x \in X$,
$M(x) \leqslant N(x)$.
We write $\MM(X)$ for the set of all multisets over $X$.
\begin{definition}
Given a set of species $\species$,
a \emph{reaction} $r$ is a pair $\tuple{L,R}$
with $L$ and $R$ multisets over $\species$.
We refer to $L$ and $R$ as the left- and right-hand side of $r$.
% and write $L \to R$ for the reaction.
\end{definition}
\begin{definition}%[PN]%[Petri net]
A \emph{Petri net} is a pair $\tuple{\species, \reactions}$ of
sets of species $\species$ and reactions $\reactions$.
\end{definition}
A state of a Petri net is a multiset over $\species$,
usually called a \emph{marking}.
A reaction can occur in a given state $M$ only if
its left-hand side $L \leqslant M$.
\begin{definition}
A \emph{match} of the left-hand side $L$ of a reaction
on a state $M$ is an injective function from $L$ to $M$
that identifies each copy of species $s$ in $L$
with a copy of $s$ in $M$.
\end{definition}
We write $\matches{L}{M}$ for the set of matches from $L$ to $M$.
From this definition we have that the number of matches
$\abs{\matches{L}{M}}$ from $L$ to $M$ is
\[ \abs{\matches{L}{M}} = \prod_{s \in \species} \binom{M(s)}{L(s)} \]
A reaction is said to be elementary iff its rate is
proportional to the number of matches of its left-hand side.
This is known as the \emph{law of mass action} in chemistry.
Here we consider only elementary reactions.
Petri nets can be given a stochastic interpretation
in terms of a CTMC.
Given a Petri net $\tuple{\species, \reactions}$,
an initial marking $M_0$ and
a rate map $k: \reactions \to \RR_{\geqslant 0}$
that assigns rates to reactions,
we construct a CTMC $\tuple{\states, \state(0), \qm}$ as follows.
\begin{align*}
\states &= \MM(\species) \\
\state(0)(x) &= \begin{cases}
1 \quad\text{if } x = M_0 \\
0 \quad\text{if } x \neq M_0
\end{cases} \\
q_{MN} &= \sum_{\substack{r \in \reactions\\r = \tuple{L,R}}}
k(r) \; \nitoj_{MN}(L,R)
\end{align*}
with
\begin{equation*}
\nitoj_{MN}(L,R) = \left\{\begin{array}{ll}
\abs{\matches{L}{M}} & \text{if } M - L + R = N \\
0 & \text{otherwise}
\end{array}\right.
\end{equation*}
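For illustration, here is a minimal sketch of this construction
in Python (the reaction system is hypothetical):
markings are dictionaries,
the number of matches is the product of binomial coefficients above,
and each reaction contributes $k(r)\,\nitoj_{MN}(L,R)$
to the rate towards the successor marking $N = M - L + R$.
\begin{verbatim}
from math import comb

# Hypothetical reaction system: 2A -> B with rate constant 1.5.
L, R = {'A': 2}, {'B': 1}
k = 1.5

def matches(L, M):
    # |[L, M]| = product over species s of binom(M(s), L(s))
    out = 1
    for s, n in L.items():
        out *= comb(M.get(s, 0), n)
    return out

def successor(M, L, R):
    # N = M - L + R
    N = dict(M)
    for s, n in L.items():
        N[s] = N.get(s, 0) - n
    for s, n in R.items():
        N[s] = N.get(s, 0) + n
    return N

M = {'A': 5, 'B': 0}
print(matches(L, M))           # binom(5, 2) = 10
print(k * matches(L, M))       # q_{MN} = 15.0
print(successor(M, L, R))      # {'A': 3, 'B': 1}
\end{verbatim}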
% Interestingly,
\citet{et2} have solved
the \emph{direct} and \emph{inverse} problems for Petri nets,
that is, they have shown which conditions
the set of reactions and the rate map must fulfil
for a Petri net to have an energy function,
and what the structure of that energy function is.
% that is, the problem of constructing a Petri net
% from an energy function on $\states$ and vice versa.
Petri nets have limitations when we take into consideration
what happens inside molecules in a chemical reaction.
The chemical transformation taking place amounts to
a change in the way electrons are shared by atoms
resulting in a relocation of chemical bonds.
In other words, (non-radioactive) reactions are all about
the binding and unbinding of atoms,
how they establish connections and break them.
This is poorly captured by the conversion of species
as modelled by Petri nets.
A consequence of this lack of a formal representation for
molecular bonds is that certain systems of chemical reactions
cannot be described in a finite way using Petri nets,
\eg unbounded polymerisation
(think of a molecular chain that can always attach new links).
Recently,
a formal language that describes biochemical interactions
using rewriting rules,
in which molecules do not just react but can also bind one another,
has been proposed by \citet{kappa}.
In the next section we introduce this language, called Kappa.
% keeping in mind that we want to address
The language will give us a formal foundation
from which we can address
% , in the context of biomolecular interactions,
the \emph{direct} and \emph{inverse} problems mentioned above,
namely, the problem of generating a set of rewriting rules
from an energy function and vice versa.
\section{Kappa}
\label{sec:kappa}
Kappa represents interactions among proteins,
nucleic acids and other biomolecules as
connections in a biomolecular network.
In these networks, nodes stand for the biomolecules
while connections represent transient molecular bonds
(\eg non-covalent interactions like hydrogen bonds).
This network is constantly changing as molecules
travel and interact with other molecules in a cell,
which is viewed as the constant destruction and creation
of the connections that make up the network.
Due to spatial constraints,
molecules can physically interact with
just so many other molecules at once.
Exactly how many will depend on multiple factors
like the size of the two interacting molecules
and the region where they come in contact.
These regions, known in molecular biology by the names of domains,
motifs or binding sites, are simply called \emph{sites} in Kappa.
Any such site can bind at most one other site at a time.
These sites belong to the nodes of the graph,
which Kappa calls \emph{agents}. % and uses to represent biomolecules.
In the same way a molecule is of a certain species,
agents can be of different types.
These types also live in a network,
a static ``network of possibilities'' which
informs us of the set of sites an agent can have
and the possible connections that sites can form.
To make these ideas formal we use a simpler version of
the category-theoretical approach
introduced by \citet{kappadpo}.
We first define the networks where types live and then
use them as a basis to construct the actual biomolecular networks.
Graphs are used as a mathematical model of networks
and site graphs in particular for biomolecular interaction networks.
\begin{definition}%[site graph]
A \emph{site graph} $G$ consists of
a finite set of agents $\agents_G$,
a finite set of sites $\sites_G$,
a map $\sitemap_G: \sites_G \to \agents_G$
that assigns sites to agents
and a symmetric edge relation $\edges_G$ on $\sites_G$.
\end{definition}
The pair $\sites_G$, $\edges_G$ forms an undirected graph.
Clearly, the definition of site graphs does not impose
a bound on the number of connections a site can have.
Indeed there is no restriction at all so far.
This is the network where types live.
Sites not in the domain of $\edges_G$ are said to be \emph{free}.
One says $G$ is \emph{realisable} iff
(i) no site has an edge to itself and
(ii) sites have at most one incident edge.
Each realisable site graph represents a
(possibly partially specified\footnote{
A fully specified state, which we call a mixture,
is defined below.})
state in which our biomolecular network can be.
However it contains no typing information.
% Note however that it contains no typing information.
We give a type to each agent and site in the graph
by assigning to it an agent and site in the type graph.
More precisely,
we use a map from a realisable site graph to a site graph.
Below we introduce such maps.
\begin{flushleft}
\begin{minipage}{.66\linewidth}
\begin{definition}
A \emph{homomorphism} $h: G \to G'$ of site graphs is
a pair of functions, $h_\sites: \sites_G \to \sites_{G'}$
and $h_\agents: \agents_G \to \agents_{G'}$, such that
for all $s,s' \in \sites_G$ we have
(i) $h_\agents(\sitemap_G(s)) = \sitemap_{G'}(h_\sites(s))$
and (ii) if $s \mathbin{\edges_G} s'$ then
$h_\sites(s) \mathbin{\edges_{G'}} h_\sites(s')$.
\end{definition}
\end{minipage}
\begin{minipage}{.3\linewidth}
\begin{flushright}
\begin{tikzpicture}
\matrix (m) [matrix of math nodes,row sep=30pt,column sep=30pt] {
\sites_G & \sites_{G'} \\
\agents_G & \agents_{G'} \\};
\draw[hom] (m-1-1) -- node[above] {$h_\sites$} (m-1-2);
\draw[hom] (m-2-1) -- node[below] {$h_\agents$} (m-2-2);
\draw[hom] (m-1-1) -- node[left] {$\sitemap_G$} (m-2-1);
\draw[hom] (m-1-2) -- node[right] {$\sitemap_{G'}$} (m-2-2);
\end{tikzpicture}
\end{flushright}
\end{minipage}
\end{flushleft}
Put simply, homomorphisms preserve site ownership and connections.
The diagram on the right is the commutative diagram,
in the category $\Set$ of sets and total functions,
corresponding to condition (i) in the definition.
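Both conditions of the definition are directly machine-checkable;
the following minimal sketch in Python
(with illustrative names, not tied to any Kappa tool)
checks them for site graphs given as dictionaries and sets.
\begin{verbatim}
def is_homomorphism(hA, hS, ownerG, ownerH, edgesG, edgesH):
    """hA: agent map, hS: site map, owner*: site -> agent,
    edges*: symmetric edge relations on sites."""
    # (i) ownership is preserved: hA(sigma_G(s)) = sigma_H(hS(s))
    if any(hA[ownerG[s]] != ownerH[hS[s]] for s in ownerG):
        return False
    # (ii) edges are preserved
    return all((hS[s], hS[t]) in edgesH for (s, t) in edgesG)
\end{verbatim}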
We say the homomorphism $g: G \to C$ is
a \emph{contact map} over $C$ iff
(i) $G$ is realisable and
(ii) whenever $g_\sites(s_1) = g_\sites(s_2)$
and $\sitemap_G(s_1) = \sitemap_G(s_2)$, then $s_1 = s_2$.
Condition (ii) means that every agent in $G$ has at most
one copy of each site of its corresponding agent in $C$.
We refer to $C$ as the contact graph.
Contact maps act as the typing map mentioned above.
In particular,
$C$ specifies the types of agents that can exist in $G$,
the sites that they may possess,
and which of the $|\sites_C|(|\sites_C|+1)/2$
possible edge types are actually valid.
% |\sites_C|(|\sites_C|+1)/2 = |\sites_C|(|\sites_C|-1)/2+|\sites_C|,
% that is, each site can bind any other site (divided by 2 to
% compensate for counting each edge twice) or it can bind itself
% (these edges aren't counted twice).
Site graphs and homomorphisms form a category $\SG$.
The composition of two homomorphisms
$h_1: G_1 \to G_2$ and $h_2: G_2 \to G_3$
is a homomorphism $h: G_1 \to G_3$ with
$h_\sites = h_{2,\sites} \comp h_{1,\sites}$ and
$h_\agents = h_{2,\agents} \comp h_{1,\agents}$.
It is easy to see that composition defined in this way is associative.
The identity arrow $\id_G: G \to G$ in $\SG$ is defined
using the identity functions of the corresponding sets.
A homomorphism $\psi: G \to G'$ is an \emph{embedding} iff
(i) $\psi_\agents$ and $\psi_\sites$ are injective and
(ii) if $s$ is free in $G$, so is $\psi_\sites(s)$ in $G'$.
Injectivity of $\psi_\agents$ and $\psi_\sites$ implies
\begin{wrapfigure}[4]{r}{0.28\textwidth}
\vspace{-2em}
\begin{center}
\begin{tikzpicture}
\matrix (m) [matrix of math nodes,row sep=20pt,column sep=20pt] {
G & & G' \\
& C & \\};
\draw[hom] (m-1-1) -- node[above] {$\psi$} (m-1-3);
\draw[hom] (m-1-1) -- node[below left] {$g$} (m-2-2);
\draw[hom] (m-1-3) -- node[below right] (g) {\phantom{$g$}}
(m-2-2);
\node[anchor=south] at (g.south) {$g'$};
\end{tikzpicture}
\end{center}
\end{wrapfigure}
that whenever $\psi: G \to G'$ is an embedding
and $G'$ is realisable then $G$ is also realisable.
An embedding $\psi: G \to G'$ between realisable site graphs
can be lifted to a morphism between contact maps $g: G \to C$
and $g': G' \to C$ iff the diagram on the right commutes in $\SG$.
Contact maps over $C$ and embeddings form a category $\rSGe_C$.
Composition and identity % the identity arrow
are defined in a similar manner to $\SG$.
We write $\matches{g}{g'}$ for the set of embeddings
between $g$ and $g'$ in $\rSGe_C$
and refer to $g$ as a \emph{pattern}
to be matched in $g'$.
We have a functor $\anon{\cdot}$
from $\rSGe_C$ to $\SG$ which forgets types.
In particular, if $g: G \to C$ is a contact map,
we write $\anon{g}$ for its domain $G$.
As an example, consider the site graph $T$ for a triangle.
\vspace{-.2cm}
\begin{minipage}{.5\textwidth}
\begin{align*}
\agents_T &= \set{1, 2, 3}, \quad
\sites_T = \set{l_1, r_1, l_2, r_2, l_3, r_3}, \\
\sitemap_T &= \setof{s_a \mapsto a}{
s \in \set{l, r},\; a \in \agents_T}, \\
\edges_T &= \set{(r_1, l_2), (l_2, r_1), (r_2, l_3),
(l_3, r_2), (r_3, l_1), (l_1, r_3)}
\end{align*}
\end{minipage}
\begin{minipage}{.32\textwidth}
\vspace{.7cm}
\begin{center}
\begin{tikzpicture}[grphdiag]
\nn[n]{n1}{0,0}{1};
\nn[n]{n2}{0:1.4}{2};
\nn[n]{n3}{60:1.4}{3};
\e{n1}{n2};
\e{n2}{n3};
\e{n1}{n3};
\site{r1}{0:10pt};
\site{l1}{60:10pt};
\node at (86:15pt) {\scriptsize $l_1$};
\node at (-26:15pt) {\scriptsize $r_1$};
\begin{scope}[shift={(0:1.4)}]
\site{r2}{180:10pt};
\site{l2}{120:10pt};
\node at (206:15pt) {\scriptsize $l_2$};
\node at (94:15pt) {\scriptsize $r_2$};
\end{scope}
\begin{scope}[shift={(60:1.4)}]
\site{r3}{-120:10pt};
\site{l3}{-60:10pt};
\node at (-146:15pt) {\scriptsize $r_3$};
\node at (-34:15pt) {\scriptsize $l_3$};
\end{scope}
\end{tikzpicture}
\end{center}
\end{minipage}
\newline
\vspace{.2cm}
Let us use $T$ as the contact graph
for a contact map $g: G \to T$ where
\vspace{.3cm}
\begin{center}
$G = \;\;$
\begin{tikzpicture}[grphdiag,baseline=-2.5,thick]
\path[use as bounding box] (-0.8,0.4) rectangle (3.4,-0.42);
\e{-0.6,0}{3.2,0};
\begin{scope}
\nn[n]{x}{0,0}{$x$};
\site{lx}{x.west};
\site{rx}{x.east};
\node[anchor=north east,yshift=2] at (180:.2) {\scriptsize $l_x$};
\node[anchor=south west,yshift=-1] at (0:.2) {\scriptsize $r_x$};
\end{scope}
\begin{scope}[shift={(1.3,0)}]
\nn[n]{y}{0,0}{$y$};
\site{ly}{y.west};
\site{ry}{y.east};
\node[anchor=north east,yshift=2] at (180:.2) {\scriptsize $l_y$};
\node[anchor=south west,yshift=-1] at (0:.2) {\scriptsize $r_y$};
\end{scope}
\begin{scope}[shift={(2.6,0)}]
\nn[n]{z}{0,0}{$z$};
\site{lz}{z.west};
\site{rz}{z.east};
\node[anchor=north east,yshift=2] at (180:.2) {\scriptsize $l_z$};
\node[anchor=south west,yshift=-1] at (0:.2) {\scriptsize $r_z$};
\end{scope}
\end{tikzpicture}
\end{center}
\begin{alignat*}{3}
\agents_G &= \set{x, y, z} &
\sitemap_G &= \setof{s_a \mapsto a}{
s \in \set{l, r},\; a \in \agents_G} \\
\sites_G &= \set{l_x, r_x, l_y, r_y, l_z, r_z} \quad\quad &
\edges_G &= \set{(r_x, l_y), (l_y, r_x), (r_y, l_z), (l_z, r_y)}
\end{alignat*}
and
\vspace{-.4cm}
\begin{align*}
g_\agents &= \set{x \mapsto 1, y \mapsto 2, z \mapsto 3} \\
g_\sites &= \setof{s_a \mapsto s_{a'}}{
s \in \set{l, r},\; a \in \agents_G,\; a' = g_\agents(a)}
\end{align*}
Sites $l_x$ and $r_z$ in $G$ are free,
which we denote graphically by a stub coming out of the site.
$T$ and $G$ are realisable since no site is bound to itself or
bound to more than one other site.
Note however that the codomain of a contact map ($T$ in this case)
does not have to be realisable in general.
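Continuing the example, here is a minimal sketch in Python
that encodes the edge relation of $T$
and checks the two realisability conditions.
\begin{verbatim}
def is_realisable(sites, edges):
    # edges is symmetric: each bond appears as (s, t) and (t, s),
    # so the number of incident edges of a site equals the number
    # of pairs in which it occurs first.
    incident = {s: 0 for s in sites}
    for (s, t) in edges:
        if s == t:
            return False       # (i) no site binds itself
        incident[s] += 1
    return all(n <= 1 for n in incident.values())   # (ii)

sites_T = {'l1', 'r1', 'l2', 'r2', 'l3', 'r3'}
edges_T = {('r1', 'l2'), ('l2', 'r1'), ('r2', 'l3'),
           ('l3', 'r2'), ('r3', 'l1'), ('l1', 'r3')}
print(is_realisable(sites_T, edges_T))              # True
\end{verbatim}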
To ease the definition of concrete contact maps,
we colour agents according to their type
and annotate sites by their name in $C$.
The contact map $g: G \to T$ above can then be defined
more succinctly by the following drawing.
\vspace{.3cm}
\begin{center}
\begin{tikzpicture}[grphdiag,thick]
\e{-0.5,0}{2.7,0};
\begin{scope}
\n[n1]{x}{0,0};
\site{lx}{x.west};
\site{rx}{x.east};
\node[anchor=north east,yshift=2] at (180:.2) {\scriptsize $l$};
\node[anchor=south west,yshift=-1] at (0:.2) {\scriptsize $r$};
\end{scope}
\begin{scope}[shift={(1.1,0)}]
\n[n2]{y}{0,0};
\site{ly}{y.west};
\site{ry}{y.east};
\node[anchor=north east,yshift=2] at (180:.2) {\scriptsize $l$};
\node[anchor=south west,yshift=-1] at (0:.2) {\scriptsize $r$};
\end{scope}
\begin{scope}[shift={(2.2,0)}]
\n[n3]{z}{0,0};
\site{lz}{z.west};
\site{rz}{z.east};
\node[anchor=north east,yshift=2] at (180:.2) {\scriptsize $l$};
\node[anchor=south west,yshift=-1] at (0:.2) {\scriptsize $r$};
\end{scope}
\end{tikzpicture}
\end{center}
where we have assigned colours orange, blue and green to
agent types $1$, $2$, $3$ in $T$.
We have written $l$ and $r$ for sites $l_1,l_2,l_3$ and $r_1,r_2,r_3$
as the subscript can be deduced from the colour of the agent as well.
Whenever a contact map $g: G \to C$ specifies all sites
that its type $C$ permits for all its agents, that is,
if for all $a \in \agents_G$,
$g_\sites(\sitemap_G^{-1}(a)) = \sitemap_C^{-1}(g_\agents(a))$,
then we say $g$ is a \emph{mixture}.
We write $\MM(C)$ for the set of all mixtures in $\rSGe_C$.
In the above example, $g$ is a mixture.
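The mixture condition can likewise be checked mechanically;
a minimal sketch in Python (names illustrative)
that compares the sites each agent carries
with those its type permits:
\begin{verbatim}
def is_mixture(gA, gS, ownerG, ownerC):
    # g is a mixture iff for every agent a of G,
    # gS(sigma_G^{-1}(a)) = sigma_C^{-1}(gA(a))
    carried, permitted = {}, {}
    for s, a in ownerG.items():
        carried.setdefault(a, set()).add(gS[s])
    for s, a in ownerC.items():
        permitted.setdefault(a, set()).add(s)
    return all(carried.get(a, set()) == permitted[gA[a]]
               for a in gA)
\end{verbatim}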
What other mixtures are there that have $T$ as contact graph?
We can have chains of any length, as well as closed cycles
whose length is a multiple of three: triangles, hexagons, etc.
We can also have any disjoint sum of these.
Since mixtures are biomolecular networks fully specified
with respect to the type $C$,
they are a natural choice for the states of our dynamical system.
We jump from state to state by the application of \emph{rules}.
\begin{definition}
A \emph{rule} $r$ is a pair of contact maps
$r_L: L \to C$, $r_R: R \to C$
which differ only in their edge structures,
\ie $\agents_L = \agents_R$, $\sites_L = \sites_R$,
$\sitemap_L = \sitemap_R$, $r_{L,\agents} = r_{R,\agents}$
and $r_{L,\sites} = r_{R,\sites}$.
\end{definition}
\newcounter{markpoint}
\refstepcounter{markpoint}
\label{p:example}
In the context of the contact graph $T$,
we obtain a rule that binds
agents of type $1$ with agents of type $2$ as follows.
\vspace{-.6cm}
\begin{flushright}
\begin{minipage}{.43\linewidth}
\begin{center}
\begin{tikzpicture}
\node[grphnode] (lhs) at (0,0) {
\tikz[ingrphdiag]{
\begin{scope}[shift={(0,0)}]
\n[n1]{x}{0,0};
\e{x}{.5,0};
\site{rx}{x.east};
\node at (26:.42) {\scriptsize $r$};
\end{scope}
\begin{scope}[shift={(1.2,0)}]
\n[n2]{y}{0,0};
\e{y}{-.5,0};
\site{ly}{y.west};
\node at (206:.42) {\scriptsize $l$};
\end{scope}
}};
\path (lhs.east) +(.3,0) edge[rule] +(1,0)
+(1.3,0) coordinate (r);
\node[grphnode,anchor=west] (rhs) at (r) {
\tikz[ingrphdiag]{
\e{0,0}{1.1,0};
\begin{scope}
\n[n1]{x}{0,0};
\site{rx}{x.east};
\node at (26:.42) {\scriptsize $r$};
\end{scope}
\begin{scope}[shift={(1.1,0)}]
\n[n2]{y}{0,0};
\site{ly}{y.west};
\node at (206:.42) {\scriptsize $l$};
\end{scope}
}};
\end{tikzpicture}
\end{center}
\end{minipage}
\begin{minipage}{.5\linewidth}
\begin{equation*}
\arraycolsep=2pt
\begin{array}{ccccl}
\agents_L &=& \agents_R &=& \set{u,v} \\
\sites_L &=& \sites_R &=& \set{r_u, l_v} \\
\sitemap_L &=& \sitemap_R &=& \set{r_u \mapsto u, l_v \mapsto v} \\
r_{L,\agents} &=& r_{R,\agents} &=& \set{u \mapsto 1, v \mapsto 2} \\
r_{L,\sites} &=& r_{R,\sites} &=& \set{r_u \mapsto r_1, l_v \mapsto l_2} \\
&& \edges_L &=& \emptyset \\
&& \edges_R &=& \set{(r_u,l_v),(l_v,r_u)}
\end{array}
\end{equation*}
\end{minipage}
\end{flushright}
Note that there is no site $l$ in $u$ and no site $r$ in $v$.
Hence $r_L$ and $r_R$ are not mixtures
as they are only partially specified.
Intuitively, this means that the rule can be applied
regardless of whether those sites are bound or free.
\begin{wrapfigure}[5]{r}{0.35\textwidth}
\vspace{-1.7em}
\begin{equation}
\label{eq:rewrite-step}
\tikz[baseline=-1.1cm,x=2cm,y=2cm]{
\node (rl) at (0,0) {$r_L$};
\node (rr) at (1,0) {$r_R$};
\node (ml) at (0,-1) {$m$};
\node (mr) at (1,-1) {$m^{(r,\psi)}$};
\draw[hom,dotted] (rl) -- (rr);
\draw[hom,dotted] (ml) -- (mr);
\draw[hom] (rl) -- node[left] {$\psi$} (ml);
\draw[hom] (rr) -- node[right] {$\comatch{\psi}$} (mr);
% \matrix (m) [matrix of math nodes,row sep=35pt,column sep=35pt] {
% r_L & r_R \\
% m & m^{(r,\psi)} \\};
% \draw[hom,dotted] (m-1-1) -- (m-1-2);
% \draw[hom,dotted] (m-2-1) -- (m-2-2);
% \draw[hom] (m-1-1) -- node[left] {$\psi$} (m-2-1);
% \draw[hom] (m-1-2) -- node[right] {$\comatch{\psi}$} (m-2-2);
}
\end{equation}
\end{wrapfigure}
When a rule $r$ is applied to an embedding $\psi: r_L \to m$
it induces a \emph{rewrite} of the mixture $m$
by modifying the edge structure of the image of $\psi$
from that of $r_L$ to that of $r_R$.
The result of rewriting is a new mixture $m^{(r,\psi)}$
(or simply $\comatch{m}$ when $(r,\psi)$ is clear from the context)
and an embedding $\comatch{\psi}: r_R \to \comatch{m}$, where
$\anon{\comatch{m}}$ has the same agents and sites as $\anon{m}$,
\ie $\agents_{\anon{\comatch{m}}} = \agents_{\anon{m}}$,
$\sites_{\anon{\comatch{m}}} = \sites_{\anon{m}}$,
$\sitemap_{\anon{\comatch{m}}} = \sitemap_{\anon{m}}$,
$\comatch{m}_\agents = m_\agents$,
$\comatch{m}_\sites = m_\sites$,
and $\edges_{\anon{\comatch{m}}} = \edges_{\anon{m}} -
\psi_\sites(\edges_{\anon{r_L}}) +
\comatch{\psi}_\sites(\edges_{\anon{r_R}})$.
Since the set of agents and sites are equal,
% of $\anon{m}$ and $\anon{\comatch{m}}$ are equal,
$\comatch{\psi}$ is given by
% The embedding $\comatch{\psi}$ is simply given by % simply defined by
$\comatch{\psi}_\agents = \psi_\agents$ and
$\comatch{\psi}_\sites = \psi_\sites$.
The inverse of $r$,
defined as $\inv{r} := (r_R, r_L)$, is also a valid rule.
By applying $\inv{r}$ to $\comatch{\psi}$
we recover $m$ and $\psi$.
% By rewriting $\comatch{m}$ with $\inv{r}$ via $\comatch{\psi}$,
% we recover $m$ and $\psi$.
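The edge replacement at the heart of a rewrite step
is exactly the set equation for $\edges_{\anon{\comatch{m}}}$ above;
the following is a minimal sketch of it in Python
(the mixture and the embedding are hypothetical,
with the binding rule of the earlier example as $r$).
\begin{verbatim}
def rewrite(mix_edges, psi, edges_L, edges_R):
    # E(m') = E(m) - psi(E(r_L)) + psi'(E(r_R)), where the site
    # map of psi' equals that of psi
    img = lambda E: {(psi[s], psi[t]) for (s, t) in E}
    return (set(mix_edges) - img(edges_L)) | img(edges_R)

# Binding rule from the example: no edges on the left,
# bond (r_u, l_v) on the right.
edges_L = set()
edges_R = {('ru', 'lv'), ('lv', 'ru')}
psi = {'ru': 'rx', 'lv': 'ly'}          # embed u -> x, v -> y
m_edges = {('ry', 'lz'), ('lz', 'ry')}  # hypothetical mixture
print(rewrite(m_edges, psi, edges_L, edges_R))
# -> bonds (rx, ly) and (ry, lz), each in both directions
\end{verbatim}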
\begin{lemma}
\label{lemma:reversibility}
Let $r = (r_L, r_R)$ be a rule,
$r_L/\rSGe_C$ the coslice category under $r_L$, and
$r_R/\rSGe_C$ the coslice category under $r_R$.