The eigenfunctions $\{\Psi_n\}$ of $\hat A$ form a complete, orthonormal set of functions. Normalization in the one-dimensional case: $\int_{-\infty}^{+\infty}\Psi_n^*(x)\Psi_n(x)\mathrm dx = 1$. Orthogonality: $\int_{-\infty}^{+\infty}\Psi_m^*(x)\Psi_n(x)\mathrm d x = \delta_{mn}$.
```text
pick aaa
pick bbb
pick ccc

# Rebase xxx onto xxx (xxx command)
#
# Commands:
# p, pick <commit> = use commit
# r, reword <commit> = use commit, but edit the commit message
# e, edit <commit> = use commit, but stop for amending
# s, squash <commit> = use commit, but meld into previous commit
# f, fixup [-C | -c] <commit> = like "squash" but keep only the previous
#                    commit's log message, unless -C is used, in which case
#                    keep only this commit's message; -c is same as -C but
#                    opens the editor
# x, exec <command> = run command (the rest of the line) using shell
# b, break = stop here (continue rebase later with 'git rebase --continue')
# d, drop <commit> = remove commit
# l, label <label> = label current HEAD with a name
# t, reset <label> = reset HEAD to a label
# m, merge [-C <commit> | -c <commit>] <label> [# <oneline>]
# .       create a merge commit using the original merge commit's
# .       message (or the oneline, if no original merge commit was
# .       specified); use -c <commit> to reword the commit message
#
# These lines can be re-ordered; they are executed from top to bottom.
#
# If you remove a line here THAT COMMIT WILL BE LOST.
#
# However, if you remove everything, the rebase will be aborted.
#
```
```js
var ifram = document.querySelector("#iframe");
var idoc = ifram.contentWindow.document;
//console.log(idoc);
var ifram2 = idoc.querySelector("#ext-gen1046 > iframe");
var idoc2 = ifram2.contentWindow.document;
//console.log(idoc2);
var ifram3 = idoc2.querySelector("#frame_content");
var idoc3 = ifram3.contentWindow.document;
console.log(idoc3);
text = idoc3.documentElement.innerHTML;
```
hasura-cli@2.36.1 Error! Failed to install Hasura CLI binary. Try npm uninstall hasura-cli or yarn remove hasura-cli and then reinstall it. If the issue occurs repeatedly, check if your network can access https://github.com as the Hasura CLI binary file is hosted on Github. You can report the issue on https://github.com/jjangga0214/hasura-cli/issues with the error message.
```python
from torch.utils.tensorboard import SummaryWriter
from accelerate.tracking import GeneralTracker, on_main_process
import os
from typing import Union


# 0. Custom tracker
class MyCustomTracker(GeneralTracker):
    """
    My custom `Tracker` class that supports `tensorboard`. Should be initialized at
    the start of your script.

    Args:
        run_name (`str`):
            The name of the experiment run.
        logging_dir (`str`, `os.PathLike`):
            Location for TensorBoard logs to be stored.
        kwargs:
            Additional keyword arguments passed along to the
            `tensorboard.SummaryWriter.__init__` method.
    """

    name = "tensorboard"
    requires_logging_directory = True
```
""" # 第一种写法,按照类型勾,但如果有重复类型的layer比较复杂 net_chilren = net.children() for child in net_chilren: if not isinstance(child, nn.ReLU6): child.register_forward_hook(hook=hook) """
When $\Omega$ is continuous ($\R$, for example), the Borel field is useful.
+
The “minimal” $\sigma$-algebra means that deleting any element from $\mathcal B (\mathbf R)$ would break the defining requirements.
+
+
Uncountable
decimal numbers between 0 and 1 are uncountable.
+
Probability measures
$$
+P:\mathcal F \rightarrow [0, 1]
+$$
+
+
Nonnegativity $P(A)\ge0, \forall A \in \mathcal{ F}$
+
Normalization $P(\emptyset)=0, P(\Omega)=1$
+
Countable additivity: if $A_1, A_2, \dots$ are disjoint events in $\mathcal F$, then $P(A_1\cup A_2\cup \dots)=P(A_1)+P(A_2)+\dots$
+
+
They are the axioms of probability.
+
A probability measure is a mapping from the $\sigma$-algebra to real numbers between 0 and 1, which intuitively specifies the “likelihood” of any event.
+
There exist non-measurable sets, on which we cannot define a probability measure.
+
+
Discrete models
$$
+P(\lbrace s_1, \dots, s_n\rbrace)=P(s_1)+\dots+P(s_n)\\
+P(A) = \frac{\text{\# of elements of }A}{\text{total \# of sample points}}
+$$
+
+
+
Continuous Models
Probability = Area
+
Some properties of Probability measure
$$
+A\subset B\Rightarrow P(A)\le P(B)\\
+P(A\cup B)=P(A)+P(B)-P(A\cap B)\\
+P(A\cup B) \le P(A) + P(B)\\
+P(A\cup B \cup C)=P(A) + P(A^C\cap B) + P(A^C\cap B^C\cap C)
+$$
+
+
Conditional Probability
$$
+P(A|B)=\frac{P(A\cap B)}{P(B)}
+$$
+
+
+
If $P(B)=0$, $P(A|B)$ is undefined.
+
For a fixed event $B$ with $P(B)>0$, $P(\cdot|B)$ can be verified to be a legitimate probability measure on the new universe $B$: $P(A|B)\ge 0$, $P(\Omega|B)=1$, and $P(A_1\cup A_2\cup…|B)=P(A_1|B)+P(A_2|B)+…$ for disjoint $A_i$.
+
$P(A|B)=\frac{\text{ \# of elements of }A\cap B}{\text{total \# of elements of }B}$
+
+
Total probability theorem
Let $A_1, …, A_n$ be disjoint events that form a partition of the sample space and assume that $P(A_i)>0$ for all $i$. Then for any event B, we have
+
$$
+P(B) = \sum_{i=1}^n P(A_i\cap B) = \sum_{i=1}^nP(A_i)P(B|A_i)
+$$
+
+
Remark
+
+
The definition of partition is that $\cup_{i=1}^n A_i = \Omega, A_i\cap A_j = \emptyset, \forall i\ne j$
+
The probability of B is a weighted average of its conditional probability under each scenario
+
Each scenario is weighted according to its prior probability
+
Useful when $P(B|A_i)$ is known or easy to derive
+
+
Inference and Bayes’ rule
Let $A_1, …, A_n$ be disjoint events that form a partition of the sample space and assume that $P(A_i) \gt 0$ for all $i$. Then for any event $B$ such that $P(B)\gt 0$, we have
$$
P(A_i|B) = \frac{P(A_i)P(B|A_i)}{\sum_{j=1}^n P(A_j)P(B|A_j)}
$$
Relates conditional probabilities of the form $P(A_i|B)$ with conditional probabilities of the form $P(B|A_i)$
+
often used in inference: effect $B$ $\lrarr$ cause $A_i$
+
+
The meaning of $P(A_i|B)$ in the view of Bayes: the belief in $A_i$ is revised if we observe the effect $B$. If the cause and the effect are closely bound ($P(B|A_i) > P(B|A_i^c)$), then the belief in $A_i$ is enhanced by the observation of the effect $B$ ($P(A_i|B) > P(A_i)$). This can be derived from Bayes’ rule through a simple calculation. If $P(A_i|B)=P(A_i)$, then $B$ provides no information on $A_i$.
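A small numeric sketch of this inference pattern (the prior and likelihoods below are made-up values, used only to show the mechanics of the total probability theorem and Bayes' rule):

```python
# Hypothetical numbers, only to illustrate total probability + Bayes' rule.
p_A = 0.01             # prior P(A): cause is present
p_B_given_A = 0.95     # likelihood P(B|A): effect observed given the cause
p_B_given_notA = 0.10  # P(B|A^c): effect observed without the cause

# Total probability theorem: P(B) = P(A)P(B|A) + P(A^c)P(B|A^c)
p_B = p_A * p_B_given_A + (1 - p_A) * p_B_given_notA

# Bayes' rule: P(A|B) = P(A)P(B|A) / P(B)
p_A_given_B = p_A * p_B_given_A / p_B
print(p_B, p_A_given_B)  # ~0.1085, ~0.0876 -> belief in A is enhanced (0.0876 > 0.01)
```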
+
Independence
Independence of two events
Events A and B are called independent if
+
$$
+P(A\cap B) = P(A)\cdot P(B)
+$$
+or equivalently, when $P(B) > 0$,
+
+
$$
+P(A|B) = P(A)
+$$
+
+
Remarks
+
+
Occurrence of B provides no information about A’s occurrence
+
Equivalence due to $P(A\cap B) = P(B)\cdot P(A|B)$
+
Symmetric with respect to $A$ and $B$.
+
+
applies even if $P(B) = 0$
+
+
+
+
implies $P(B|A) = P(B)$ and $P(A|B^c) = P(A)$
+
+
+
Does not imply that A and B are disjoint; in fact, quite the opposite!
+
+
Two disjoint events with positive probabilities are never independent! ($P(A\cap B) = 0$, but $P(A)\cdot P(B)\ne 0$)
+
+
+
+
Conditional independence
$$
+P(A\cap B | C) = P(A| C) \cdot P(B|C)
+$$
+
+
Definition
+
Events $A_1, A_2, …, A_n$ are called independent if
$$
P(A_i\cap A_j\cap\dots\cap A_q) = P(A_i)P(A_j)\cdots P(A_q)
$$
for any distinct indices $i, j, \dots, q$ chosen from $\{1, \dots, n\}$.
+
Pairwise independence does not imply (mutual) independence.
+
Discrete Random Variables
Random Variable is neither random, nor variable.
+
Definition
We care about the probability that $X \le x$ instead of $X = x$ for the sake of generality.
+
Random variables
+
Given a probability space $(\Omega, \mathcal F, P)$, a random variable is a function $X: \Omega \rightarrow \R$ with the property that $\{\omega \in \Omega: X(\omega) \le x\} \in \mathcal F$ for each $x\in \R$. Such a function $X$ is said to be $\mathcal F$-measurable.
This assumes that the integral is well-defined. The Cauchy distribution (density proportional to $\frac{1}{1+x^2}$) doesn’t have an expectation since $\frac{x}{1+x^2}$ is not absolutely integrable.
$$
+F_X(x) = P(X\le x) = \begin{cases}
+ \sum_{k\le x}p_X(k), &\text{if } X \text{ is discrete,}\\
+ \int_{-\infty}^x f_X(t)\mathrm dt, &\text{if } X \text{ is continuous.}
+\end{cases}
+$$
+
+
Properties
+
$$
+\text{if } x \le y, \text{ then } F_X(x)\le F_X(y).\\
+F_X(x)\text{ tends to 0 as } x \rightarrow -\infty, \text{ and to 1 as } x \rightarrow \infty.\\
+\text{If } X \text{ is discrete, then } F_X(x) \text{ is a piecewise constant function of }x.\\
+\text{If } X \text{ is continuous, then } F_X(x) \text{ is a continuous function of }x.\\
+\text{If } X \text{ is discrete and takes integer values, the PMF and the CDF can be obtained from each other by summing or differencing: }\\
+F_X(k) = \sum_{i = -\infty}^k p_X(i),\\
+p_X(k) = P(X\le k) - P(X \le k -1) = F_X(k) - F_X(k - 1),\\
+\text{ for all integers }k.\\
+\text{If } X \text{ is continuous, the PDF and the CDF can be obtained from each other by integration or differentiation: }\\
+F_X(x) = \int_{-\infty}^x f_X(t)\mathrm dt, \quad f_X(x) = \frac{\mathrm dF_X}{\mathrm dx}(x)
+$$
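A quick sketch of the summing/differencing relation for an integer-valued discrete RV (the PMF below is made up for illustration):

```python
import numpy as np

# Hypothetical PMF of a discrete RV taking integer values 0..4.
p = np.array([0.1, 0.2, 0.4, 0.2, 0.1])

F = np.cumsum(p)                              # CDF by summing:      F_X(k) = sum_{i<=k} p_X(i)
p_back = np.diff(np.concatenate(([0.0], F)))  # PMF by differencing: p_X(k) = F_X(k) - F_X(k-1)

assert np.allclose(p, p_back)
print(F)  # [0.1 0.3 0.7 0.9 1. ]
```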
The Gaussian is convenient: the sum of two independent Gaussian RVs is again Gaussian. And with a huge amount of samples, the distribution of the sum is close to Gaussian (central limit theorem).
The CDF of the standard normal RV, $\Phi(y)$, cannot be expressed in closed form; we use the standard normal table to get its value.
+
$$
+\Phi(-y) = 1 - \Phi(y)
+$$
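In code, $\Phi(y)$ can be evaluated with the error function instead of a table; a minimal sketch:

```python
from math import erf, sqrt

def std_normal_cdf(y: float) -> float:
    """Phi(y) for the standard normal, via the error function."""
    return 0.5 * (1.0 + erf(y / sqrt(2.0)))

print(std_normal_cdf(1.0))                             # ~0.8413, matches the standard normal table
print(std_normal_cdf(-1.0), 1 - std_normal_cdf(1.0))   # symmetry: Phi(-y) = 1 - Phi(y)
```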
+
+
Multiple Continuous Random Variables
Joint PDFs
+
Two continuous RVs $X$ and $Y$ associated with the same experiment are jointly continuous if they can be described by a joint PDF $f_{X, Y}$, where $f_{X, Y}$ is a nonnegative function that satisfies
+
$$
+P((X, Y) \in B) = \iint_{(x, y)\in B} f_{X, Y}(x, y)\mathrm d x\mathrm dy
+$$
+
+
for every subset $B$ of the two-dimensional plane. In particular, when $B$ is of the form $B = \{(x, y)\mid a\le x \le b, c\le y \le d\}$, we have
+
$$
+P(a\le X \le b, c \le Y \le d) = \int_c^d\int_a^bf_{X, Y}(x, y)\mathrm dx\mathrm dy
+$$
+
+
Normalization
+
$$
+\int_{-\infty}^\infty\int_{-\infty}^\infty f_{X, Y}(x, y)\mathrm dx\mathrm dy = 1
+$$
+
+
Interpretation(Small rectangle)
+
$$
+P(a\le X \le a + \delta, c \le Y \le c + \delta) \approx f_{X, Y}(a, c)\cdot\delta^2
+$$
+
+
Marginal PDF
+
$$
+P(X\in A) = P(X \in A, Y \in (-\infty, \infty)) = \int_A \int_{-\infty}^\infty f_{X, Y}(x, y)\mathrm dy\mathrm dx
+$$
$$
+P(x \le X \le x + \delta|Y = y) \approx f_{X|Y}(x|y)\cdot\delta
+$$
+
+
But $\{Y = y\}$ is a zero-probability event.
+
Let $B = \{y\le Y \le y + \epsilon\}$, for small $\epsilon > 0$. Then
+
$$
+P(x \le X \le x + \delta|Y \in B) \approx \frac{P(x \le X \le x + \delta, y \le Y \le y + \epsilon)}{P(y \le Y \le y + \epsilon)} \approx \frac{f_{X, Y}(x, y)\cdot\epsilon\delta}{f_Y(y)\cdot\epsilon} \approx f_{X|Y}(x|y)\cdot\delta
+$$
+
+
Limiting case when $\epsilon \rightarrow 0$, to define conditional PDF where the denominator is a zero-probability event.
+
Conditional Expectation
+
The conditional expectation of $X$ given that $A$ has happened is defined by $E[X\mid A]=\sum_x x\,p_{X|A}(x)$ (with the sum replaced by an integral of $x f_{X|A}(x)$ in the continuous case).
Suppose $g$ is a strictly monotonic function and that for some function $h$ and all $x$ in the range of $X$ we have
+
$$
+y = g(x) \text{ if and only if } x = h(y)
+$$
+
+
Assume that $h$ is differentiable.
+
Then the PDF of $Y = g(X)$ is given by
+
$$
+f_Y(y) = \frac{\mathrm d F_Y}{\mathrm d y}(y) = \frac{\mathrm d}{\mathrm d y} F_X(h(y)) = f_X(h(y))\left|\frac{\mathrm d h}{\mathrm d y}(y)\right|
+$$
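As a worked instance of this formula (assuming $X$ is standard normal), take $Y=e^X$: then $g$ is strictly increasing with inverse $h(y)=\ln y$, and

$$
f_Y(y) = f_X(\ln y)\left|\frac{\mathrm d h}{\mathrm d y}(y)\right| = \frac{1}{y\sqrt{2\pi}}\,e^{-(\ln y)^2/2}, \qquad y > 0,
$$

which is the lognormal PDF.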
+
+
Entropy
Definition
+
Discrete case
+
Let $X$ be a discrete RV defined on a probability space $(\Omega, \mathcal F, P)$. The entropy of $X$ is defined by $H(X) = -\sum_{x} p_X(x)\log p_X(x)$.
$$
+M_Y(s) = \int_{-\infty}^\infty \frac{1}{\sqrt{2\pi}}e^{-(y^2/2)}e^{sy}\mathrm dy = e^{s^2/2}
+$$
+
+
Consider $X = \sigma Y + \mu$
+
$$
+M_X(s) = e^{s^2\sigma^2/2 + \mu s}
+$$
+
+
Inversion of transforms
Inversion Property
+
The transform $M_X(s)$ associated with a RV $X$ uniquely determines the CDF of $X$, assuming that $M_X(s)$ is finite for all $s$ in some interval $[-a, a]$, where $a$ is a positive number.
The probability $P(X = k)$ is found by reading the coefficient of the term $e^{ks}$:
+
$$
+P(X = k) = p(1-p)^{k-1}
+$$
+
+
Transform of Mixture of Distributions
Let $X_1,\dotsb, X_n$ be continuous RVs with PDFs $f_{X_1}, \dotsb, f_{X_n}$.
+
The value $y$ of the RV $Y$ is generated as follows: an index $i$ is chosen with a corresponding probability $p_i$, and $y$ is taken to be equal to the value of $X_i$. Then $f_Y(y)=\sum_{i=1}^n p_i f_{X_i}(y)$ and $M_Y(s)=\sum_{i=1}^n p_i M_{X_i}(s)$.
Let $X$ and $Y$ be independent Gaussian RVs with means $\mu_x$ and $\mu_y$, and variances $\sigma_x^2, \sigma_y^2$. And let $Z = X + Y$. Then $Z$ is still Gaussian with mean $\mu_x + \mu_y$ and variance $\sigma_x^2 + \sigma_y^2$
where $N$ is a RV that takes integer values, and $X_1, X_2, \dotsb$ are identically distributed RVs. Assume that $N$, $X_1, X_2, \dotsb$ are independent.
Assume that $N$ and $X_i$ are both geometrically distributed with parameters $p$ and $q$ respectively. All of these RVs are independent. $Y = X_1 + \dotsb + X_N$
For large $n$, the bulk of the distribution of $M_n$ is concentrated near $\mu$
+
Theorem
+
Let $X_1, X_2, \dots$ be independent identically distributed (i.i.d.) RVs with finite mean $\mu$ and variance $\sigma^2$. For every $\epsilon \gt 0$, we have $\lim_{n\rightarrow\infty}P\left(\left|\frac{X_1+\dots+X_n}{n}-\mu\right|\ge\epsilon\right)=0$.
Let $\lbrace Y_n\rbrace$ (or $Y_1, Y_2, \dots$) be a sequence of RVs (not necessarily independent), and let $a$ be a real number. We say that the sequence $Y_n$ converges to $a$ in probability if, for every $\epsilon \gt 0$, we have $\lim_{n\rightarrow\infty}P(|Y_n-a|\ge\epsilon)=0$.
The relation between “almost surely” and “in r-th mean” is complicated. There exist sequences which converge almost surely but not in mean, and which converge in mean but not almost surely!
+
Central Limit Theorem
Theorem
Let $X_1, X_2, \dots$ be i.i.d. RVs with mean $\mu$ and variance $\sigma^2$. Let $Z_n = \frac{X_1+\dots+X_n-n\mu}{\sigma\sqrt n}$.
The CDF of $Z_n$ converges to the standard normal CDF (convergence in distribution).
+
Normal Approximation Based on the Central Limit Theorem
Let $S_n = X_1 + \dotsb + X_n$, where $X_i$ are $\text{i.i.d.}$ RVs with mean $\mu$ and variance $\sigma^2$. If $n$ is large, the probability $P(S_n ≤ c)$ can be approximated by treating $S_n$ as if it were normal, according to the following procedure.
+
+
Calculate the mean $n\mu$ and the variance $n\sigma^2$ of $S_n$
+
Calculate the normalized value $z = (c - n\mu)/(\sigma\sqrt{n})$
+
Use the approximation
+
+
$$
+ P(S_n \le c) \approx \Phi(z)
+$$
+
+
where $\Phi(z)$ is available from the standard normal CDF.
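A minimal sketch of this three-step procedure, assuming $X_i \sim \text{Uniform}(0,1)$ (so $\mu = 0.5$, $\sigma^2 = 1/12$), $n = 100$ and $c = 55$:

```python
from math import erf, sqrt

# Assumed example: X_i ~ Uniform(0, 1), so mu = 0.5, sigma^2 = 1/12; n = 100, c = 55.
n, mu, sigma2 = 100, 0.5, 1.0 / 12.0
c = 55.0

mean_Sn = n * mu                      # step 1: mean n*mu = 50
var_Sn = n * sigma2                   # step 1: variance n*sigma^2 = 100/12
z = (c - mean_Sn) / sqrt(var_Sn)      # step 2: normalized value
Phi = 0.5 * (1 + erf(z / sqrt(2)))    # step 3: P(S_n <= c) ~ Phi(z)
print(z, Phi)                         # z ~ 1.73, Phi(z) ~ 0.958
```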
Independence property: For any given time $n$, the sequence of $X_{n + 1}, X_{n + 2}, \dots$ is also a Bernoulli process, and is independent from $X_1, \dots, X_n$
+
Memoryless property: Let $n$ be a given time and let $\overline T$ be the time of the first success after time $n$. Then $\overline T − n$ has a geometric distribution with parameter $p$, and is independent of the RVs $X_1, \dots , X_n$.
+
$$
+P(\overline T - n = t | \overline T \gt n) = (1 - p)^{t - 1}p = P(T = t)
+$$
+
+
Interarrival times
+
Denote the $k$th success as $Y_k$, the $k$th interarrival time as $T_k$.
Whenever there is an arrival, we choose to either keep it (with probability $q$), or to discard it (with probability $1 − q$).
+
Both the process of arrivals that are kept and the process of discarded arrivals are Bernoulli processes, with success probability $pq$ and $p(1 − q)$, respectively, at each time.
+
Merging of a Bernoulli process
+
In a reverse situation, we start with two independent Bernoulli processes (with parameters $p$ and $q$ respectively). An arrival is recorded in the merged process if and only if there is an arrival in at least one of the two original processes.
+
The merged process is Bernoulli, with success probability $p+q−pq$ at each time step.
+
The Poisson Process
Definition
An arrival process is called a Poisson process with rate $λ$ if it has the following properties:
+
Time homogeneity
+
$$
+P(k, \tau) = P(k \text{ arrivals in interval of duration }\tau)
+$$
+
+
Independence
+
Numbers of arrivals in disjoint time intervals are independent.
We also have orthogonal function decomposition(Chap.3, Chap.6).
+
System modeling and Classification
A system model can be represented by mathematical equations (including the input-output description and the state variables / state equations), graphic symbols, and block diagrams.
+
We mostly use the input-output description. If something internal needs to be controlled, the state equation is useful.
If functions are continuous, we can get their boundary conditions by determining the derivatives.
+
Then the coefficients can be solved by multiplying the inverse of the Vandermonde matrix with the boundary-condition matrix.
+
Zero-input and -state Responses
Zero-input response The response caused by the initial state (i.e., energy originally stored in the system), and it is denoted by $r_{zi}(t)$
+
Zero-state response $r(0_-)\equiv 0$, the response caused only by the external excitation and it is denoted by $r_{zs}(t)$
+
+
+
The combination of the zero-input and zero-state responses is not necessarily linear, because of the constant term contributed by the initial state. If one of them vanishes, the other is linear.
+
Impulse and Step Responses
Impulse Response the zero-state response $h(t)$ to $\delta (t)$; the effect of $\delta(t)$ can be converted into an equivalent initial condition.
+
Note: normally $n>m$.
+
Unit Step Response The zero-state response $g(t)$ to $u(t)$
Integral interval $e(t)=0, \forall t<0$, $h(t)=0,\forall t<0$, so $r(t)=\int_0^t{e(\tau)h(t-\tau)\mathrm d\tau}$
+
The condition for applying convolution:
+
+
For linear systems ONLY
+
For time-variant systems, $h(t, \tau)$ means the response at time $t$ generated by an impulse at time $\tau$, so $r(t)=\int_0^th(t,\tau)e(\tau)\mathrm d \tau$; the time-invariant system is a special case, with $h(t,\tau)=h(t-\tau)$.
+
+
The Properties of Convolution The commutative property, the distributive property, the associative property
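A numerical sketch of the zero-state response $r(t)=\int_0^t e(\tau)h(t-\tau)\mathrm d\tau$, with an assumed input $e(t)=u(t)$ and an assumed impulse response $h(t)=e^{-t}u(t)$, whose exact response is $1-e^{-t}$:

```python
import numpy as np

dt = 1e-3
t = np.arange(0, 5, dt)
e = np.ones_like(t)        # e(t) = u(t)          (assumed input for the example)
h = np.exp(-t)             # h(t) = e^{-t} u(t)   (assumed impulse response)

# r(t) = integral of e(tau) h(t - tau) dtau, approximated by a discrete convolution.
r = np.convolve(e, h)[:len(t)] * dt

print(np.max(np.abs(r - (1 - np.exp(-t)))))  # small error vs. the exact answer 1 - e^{-t}
```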
Spectrum is discrete with frequency spacing $\omega_1 = \frac{2\pi}{T_1}$. When $T_1 \rightarrow \infty$, the spectrum will be continuous.
+
Amplitude: $\text{Sa}\left(\frac{n\pi\tau}{T_1}\right)$ or $\text{Sa} \left(\frac{n\omega_1\tau}{2}\right)$, crossing zero when $n\omega_1 = \frac{2m\pi}{\tau}$
+
The non-zero FS coefficients of a periodic signal are infinite in number, with most of the energy concentrated in the low-frequency components (within $\left(-\frac{2\pi}{\tau},\frac{2\pi}{\tau}\right)$). Thus we define the bandwidth $B_{\omega} = \frac{2\pi}{\tau}$
+
+
Periodic symmetric square wave
+
Since the spectrum crosses zero when $n\omega_1 = \frac{2m\pi}{\tau}$, the even harmonics vanish. Also the sine components vanish.
More compact than the square signal ($|F(\omega)|\propto \frac 1{\omega^3}$). An explanation is that the raised cosine has no discontinuities.
+
Generally:
+
+
$f(t)$ has discontinuities, $|F(\omega)|\propto \frac 1{\omega}$
+
$\frac{d}{dt}f(t)$ has discontinuities, $|F(\omega)|\propto \frac 1{\omega^2}$
+
$\frac{d^2}{dt^2}f(t)$ has discontinuities, $|F(\omega)|\propto \frac 1{\omega^3}$
+
+
The width $\tau$ of the raised cosine signal is defined at $\frac E2$ rather than at the bottom, making it easy to compare with the rectangular pulse of same width. The first zeros of the frequency spectrum are identical.
+
The raised cosine concentrates its energy at low frequencies and has been widely used in digital communications.
The spectrum of impulse function covers the entire frequency range. The interferences caused by a variety of electric sparks always cover the full frequency range.
if $f(t)$ is real and even, then $f(t)=f_e(t), F(\omega)=R(\omega)$, the phase shift is $0$ or $\pi$.
+
if $f(t)$ is real and odd, $f(t) = f_o(t)$, then $F(\omega)=jX(\omega)$, $F(\omega)$ has only imaginary part and is odd, the phase shift is $\pm \frac{\pi}{2}$
+
Scaling $\mathcal{F}[f(at)]=\frac 1{|a|}F\left(\frac{\omega}a\right)$ Expansion in TD results in Compression in FD.
+
Time Shifting $\mathcal{F}[f(t\pm t_0)] = F(\omega)e^{\pm j\omega t_0}$
+
Frequency Shifting $\mathcal F[f(t)e^{\pm j\omega_0t}] = F(\omega\mp\omega_0)$
$$
+F_0(\omega) \text{ determines the profile of } F(\omega)\\
+T_1 \text{ determines the density of the impulses}\\
+T_1\uparrow, \omega_1\downarrow\text{, intensity of harmonics}\downarrow\\
+T_1\downarrow, \omega_1\uparrow\text{, intensity of harmonics}\uparrow\\
+$$
A band-limited signal whose spectrum is strictly within $[0, f_m]$ can be uniquely determined by its samples if and only if the sampling interval $T_s \le 1/(2f_m)$.
+
$T_s = \frac{1}{2f_m}$ is called the Nyquist interval.
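For example (numbers assumed only for illustration), a signal band-limited to $f_m = 4\,\text{kHz}$ must be sampled at least every $125\,\mu s$:

```python
# Assumed example: a signal band-limited to f_m = 4 kHz.
f_m = 4_000.0                  # Hz
T_nyquist = 1.0 / (2.0 * f_m)  # Nyquist interval: 125 microseconds
f_s_min = 2.0 * f_m            # minimum sampling rate: 8 kHz

print(T_nyquist, f_s_min)      # 0.000125 s, 8000.0 Hz
```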
All the poles of $H(s)$ are natural frequencies of the system, but $h(t)$ may not include all the natural frequencies (the roots of $\Delta$ contain all natural frequencies).
The amplitude is constant, while the phase can change.
+
Minimum-phase system/function
+
Definition: A stable system with poles in the left half of the s-plane is called a minimum-phase system/function if all the zeros are also in the left half-plane or on the jω-axis. Otherwise it is a non-minimum-phase system/function.
+
Property: A non-minimum-phase function can be represented as the product of a minimum-phase function and an all-pass function.
+
Stability of Linear System
A system is considered to be stable if bounded input always leads to bounded output.
+
Bounded-input, Bounded-output(BIBO)
+
The necessary & sufficient conditions for BIBO:
+
$$
+\int_{-\infty}^\infty|h(t)|\mathrm dt \le M
+$$
+
+
Poles are:
+
+
on the left half-plane: $\lim_{t\rightarrow \infty}[h(t)] = 0$, stable system
+
on the right half-plane, or on the $j\omega$-axis with order greater than one: $\lim_{t\rightarrow \infty}[h(t)] = \infty$, unstable system
+
on the $j\omega$-axis with order one: $h(t)$ tends to a non-zero constant or oscillates with constant amplitude, critically stable system
+
+
Two-sided (Bilateral) LT
$$
+F_B(s) = \int_{-\infty}^\infty f(t)e^{-st}\mathrm{d} t
+$$
+
+
+
t starts from −∞, i.e., non-causal signal as the input or regarding the initial condition as the input.
+
Easily associated with the Fourier transform and the Z-transform
If there is no overlap between the two convergence regions, then $F_B(s)$ does not exist.
+
$F_B(s)$ and $f(t)$ are not uniquely corresponding to each other.($\int_{-\infty}^\infty u(t)e^{-st}\mathrm{d} t = \frac{1}{s}$, $\int_{-\infty}^\infty -u(-t)e^{-st}\mathrm{d} t=\frac{1}{s}$)
+
Two-sided L-Transform shares almost all the properties with its single-sided counterpart except for the initial-value theorem.
+
Two-sided L-Transform has very limited applications as most continuous-time systems are causal.
$1 + e^{-s}$ also has zeros (infinitely many!). Note the case when it appears in the denominator.
+
FT in Telecom. Systems
Systems discussed in this chapter are strictly stable:
+
$$
+\mathcal{F}[f(t)] = F(s)|_{s=j\omega}
+$$
+
+
Because even for a critically stable system the FT is not the same as the LT evaluated at $s=j\omega$ (it contains $\delta$ terms), there would otherwise be ambiguity between $H(j\omega)$ and $H(s)|_{s=j\omega}$.
+
Every frequency component is reshaped in amplitude and phase by the system function when passing through the system, in a frequency-dependent way. Thus the system can distort the original signal.
+
Distortion
+
2 types of distortion:
+
+
Non-linear distortion (new frequency components)
+
Linear distortion (without new frequency components), just the amplitude and/or phase distortion.
Symbol rate : clock period is $T$, signal symbol rate is $f = 1/T$.
+
Information rate: the information rate equals the symbol rate for binary encoding; otherwise it equals the symbol rate multiplied by the number of information bits per symbol.
+
Signal bandwidth: the first zero of non-return-to-zero (NRZ) signal’s spectrum is $1/T$, so the signal bandwidth is $B=1/T =f$.
When NRZ code is used, signal bandwidth = symbol rate
+
When return-to-zero (RZ) code is used, signal bandwidth > symbol rate
+
Using the NRZ code saves bandwidth, yet the high-frequency components of the rectangular signal will suffer from severe inter-symbol interference (ISI). So the raised cosine or Sa function is preferred.
(b) unilateral: if $\mathcal{Z}[x(n)] = X(z), R_{X_1} < |z|$, then $\mathcal{Z}[x(n-m)] = z^{-m}[X(z) + \sum_{k = -m}^{-1}x(k)z^{-k}], R_{X_1}\lt |z|$, and $\mathcal{Z}[x(n+m)] = z^{m}[X(z) - \sum_{k = 0}^{m-1}x(k)z^{-k}], R_{X_1}\lt |z|$
+
For a causal sequence ($x(n) = 0$ for $n < 0$), the unilateral transform also satisfies $\mathcal{Z}[x(n-m)] = z^{-m}X(z)$.
+
The reason is that the unilateral z-transform doesn’t contain the $n<0$ part of the sequence, but after shifting, those samples sometimes must be counted (right shift) and sometimes must be discarded (left shift).
+
Linear weighting on sequence(Z domain differentiation)
The FT of $h(n)$, $H(e^{j\omega})$ is a periodic function with period of $\omega_s = 2\pi /T = 2\pi$.
+
If $h(n)$ is real, then the amplitude/phase response is even/odd function.
+
The amplitude is determined within $[0, \omega_s/2]$
+
+
+
NOTE:
+
+
We can derive the frequency response (function of $\omega$) by letting $D$ move along the unit circle once.
+
$H(e^{j\omega})$ is periodic. The frequency response from 0 to $\omega_s/2$ can be determined by letting $D$ move along half the circle.
+
If pole $p_i$ is close to the unit circle, there will be a peak in the frequency response. If zero $z_i$ is close to the unit circle, there will be a notch in the frequency response.
+
For stable systems, $p_i$ should be inside the unit circle, while $z_i$ could be inside or outside the unit circle.
+
Poles and zeros at the origin have no influence on the amplitude response.
+
+
Analog and digital Filter
Fundamental Principles
+
+
The spectrum of $x(t)$ is strictly inside $\pm \omega_m$.
+
We choose the sampling frequency:$\omega_s = \frac{2\pi}{T} \ge 2\omega_m$
Be careful when changing the order of operation($\frac{d}{dt}\int_{-\infty}^tx(\tau)d\tau = x(t)$, $\int_{-\infty}^t\frac{d}{d\tau}x(\tau)d\tau = x(t) - x(-\infty)$)
+
+
transfer operator:
+
$$
+r(t) = \frac{N(p)}{D(p)}e(t) = H(p)e(t)
+$$
+
+
Brief introduction to the signal flow graphs(SFG)
+
+
+
Terminologies in SFG
+
Node, Transfer function, Branch(The branch gain is the transfer function), Source node, Sink node, Mixed node.
+
Properties of SFG
+
+
Signal only passes through a branch with the direction indicated by the arrowhead.
+
Signals of incoming branches are added at a node, and the added signal appears on all outgoing branches.
+
A sink node can be separated from a mixed node.
+
For a given system, the SFGs can be different.(equations for a system can be different)
+
After the SFG is inverted, the transfer function remains invariant, but the signals represented by the inner nodes will be different.
+
+
Note: Inversion is done by reversing the transfer direction of each branch, and exchanging the source and sink nodes as well.
+
Algebra of SFG
+
+
+
Simplify:
+
NOTE: The SFG can be simplified using the following steps:
+
a. Merge the cascaded branches to decrease the number of nodes;
+
b. Merge the parallel branches to decrease the number of branches;
The period of the address code is much shorter than that of the data code ($T_c < T_d$), so the modulated signal is much wider in the frequency domain; its spectrum is called a spread spectrum.
Min-term A min-term is a product of all variables taken either in the direct or complemented form, each variable shown once.
+
$A’B’C’=m_0$ and $ABC=m_7$
+
Max-term A max-term is a sum of all variables taken either in the direct or complemented form, each variable shown once.
+
$A+B+C=M_0$ and $A’+B’+C’=M_7$
+
$$
+m_6= \overline{M_6}
+$$
+
+
Karnaugh Maps
+
Use rows and columns to represent combinations of the inputs (by min-terms), and the cells to represent the value. The inputs are ordered in Gray-code sequence.
+
Simplification of 2-level logic
Karnaugh Maps method
+
If two adjacent min-terms deliver logic 1, merge them.
+
Implicant
+
$$
+\text{If } G\Rightarrow F, \text{ then } G \text{ is an implicant of } F.\\
+\text{If there is no } Q \text{ s.t. } P\Rightarrow Q\Rightarrow F, \text{ then } P \text{ is a prime implicant of } F.\\
+\text{If a min-term can only be covered by one prime implicant, that prime implicant is an essential prime implicant (EPI).}
+$$
+
+
Every EPI must appear in the final minimal cover.
+
Q-M method
+
An algorithm to simplify large-scale functions with many inputs.
+
Combinational Logic
Gate
NAND, NOT, and NOR are better than AND and OR at saving area.
+
Transmission Gate Use both NMOS and PMOS to form a CMOS switch. NMOS is good at passing a low voltage, while PMOS is better at passing a high voltage.
+
Tri-state Gate EN is high, TG on, F=A; EN is low, TG off, F is isolated from input A. The states are called logic 0, logic 1 and high-resistance Z.
+
+
The bottom part of the circuit is used to avoid the high-Z state.
+
Combinational logic circuits
Outputs are the function of logic input circuits.
+
Determined only by current not past inputs + delay
+
To deal with complex logic with many inputs, we should:
+
+
From 2-level to multi-level(BDD?)
+
Divide-and-conquer
+
Re-using
+
+
Metrics
Static metrics Logic voltage values, DC noise margins, Area, Fan-out
+
Dynamic metrics Speed/delay, Power dissipation, Noise(reference)
+
Speed rise time and fall time. Propagation time.
+
Fan out The maximum number of CMOS inputs that one logic output can drive.
+
Power and energy Leakage power (static power): subthreshold leakage, gate leakage, D/S substrate leakage. We can reduce the static power:
Energy-delay product is a metric. It’s hard to reduce.
+
+
Low power can increase the lifetime of the chip.
+
Low delay can increase the speed of the chip.
+
+
Hazard
static-1 hazard ‘1’ output has a transient ‘0’ glitch
+
static-0 hazard ‘0’ output has a transient ‘1’ glitch
+
dynamic hazard several transitions during a single output change(not required)
+
If the initial and final inputs cannot be covered by one prime implicant, the output may have a glitch during the transition.
+
Basic comb. logic circuits
Encoder: more inputs, fewer outputs ($n \le 2^m$)
+
Decoder: fewer inputs, more outputs ($m = 2^n$)
+
Multiplexer: From several inputs, choose one as output according to the address inputs. It can be used to make a shift register. We can use n-bit-addr MUX for m-bit function.
+
Adder: Half adder & full adder.
+
Half Adder: $S = A \oplus B$, $C = A\cdot B$
+
Full adder: $C_{out} = A\cdot C_{in} + B\cdot C_{in} + A\cdot B, S = A\oplus B\oplus C_{in}$
+
Implements of Full Adder:
+
Serial (ripple-carry) Adder $C_{i+1} = A_iC_i+B_iC_i + A_iB_i$. The carry-propagation latency is disastrous.
+
Carry Lookahead Adder(CLA)
+
First define $P_i = A_i\oplus B_i, G_i = A_iB_i$, P means Carry-bit propagation, G means Carry-bit Generation. Thus, $C_{i+1} = G_i + P_iC_i, S_i = A_i\oplus B_i\oplus C_i = P_i\oplus C_i$. Thus the $C_i$ can be replaced:
A 4-bit CLA can be designed using these formulas. For more bits, it becomes too complex. However, we can cascade 4-bit CLAs to reach a balance between latency and complexity.
+
Moreover, here comes the parallel intra- and inter-group CLA which regards the 4-bit adder as a block and defines its $P_i$ and $G_i$, connecting the 4-bit adders in a similar manner as the structure of 4-bit adder inside.
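A behavioural sketch of a 4-bit CLA built from these $P_i$, $G_i$ definitions (a Python model, not HDL; the carries are evaluated iteratively here, whereas real CLA hardware expands $C_{i+1}=G_i+P_iC_i$ so that every carry depends only on $C_0$):

```python
def cla_4bit(a: int, b: int, c0: int = 0):
    """Behavioural sketch of a 4-bit carry-lookahead adder."""
    A = [(a >> i) & 1 for i in range(4)]
    B = [(b >> i) & 1 for i in range(4)]
    P = [A[i] ^ B[i] for i in range(4)]   # carry propagate: P_i = A_i xor B_i
    G = [A[i] & B[i] for i in range(4)]   # carry generate:  G_i = A_i and B_i

    # C_{i+1} = G_i + P_i * C_i, evaluated iteratively for brevity.
    C = [c0]
    for i in range(4):
        C.append(G[i] | (P[i] & C[i]))

    S = [P[i] ^ C[i] for i in range(4)]   # sum bits: S_i = P_i xor C_i
    total = sum(s << i for i, s in enumerate(S))
    return total, C[4]                    # (4-bit sum, carry-out)

# Exhaustive check against ordinary addition.
assert all(cla_4bit(a, b) == ((a + b) & 0xF, (a + b) >> 4)
           for a in range(16) for b in range(16))
```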
+
Sequential logic
+
Clock
+
Ring Oscillator
+
LC oscilator
+
Crystal Oscillator
+
+
State and FSM
States contain all the needed information; they could be redundant.
+
Finite State Machine (FSM) The numbers of inputs, outputs, and states are finite.
+
Mealy FSM:
+
+
Moore FSM:
+
+
We describe the FSM by State Transition Table or State Diagram.
+
+
+
Remember: Moore is less.
+
Latch
The latch watches the input during the interval when the clock enables it.
+
Examples
SR latch
+
+
$SR=0$ is required.
+
$$
+Q^+=S+R^\prime Q
+$$
+
+
A gated version:
+
+
$$
+Q^+=C^\prime Q + C(S+R^\prime Q)
+$$
+
+
D latch
+
+
$$
+Q^+=D
+$$
+
+
Transmission Gate version
+
+
Timing parameters
+
+
Flip-Flop
Watch the input only at the moment when clock signal rises or falls.
+
Examples
D Flip-flop(DFF)
+
+
2 D latches in series with opposite clock.
+
Use the delay to help us
+
The delay of NOT gate at the bottom enables the slave to lock first and the master to unlock second when clock falls.
+
Also, the delay of NOT gate at the bottom makes $t_h=0$.
+
+
Two time constraints
+
Set-up time constraints: restrict the clock cycle.
+
+
$t_{logic}(max)$ is also called propagation delay $t_{dab}$.
+
Hold time constraints: restrict the $t_d$
+
+
$t_{logic}(min)$ is also called contamination delay($t_{cab}$).
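The two constraints are commonly summarized as follows (written here with $t_{c\to q}$ for the flip-flop clock-to-Q delay; the notation is an assumption, not taken from the figures above):

$$
T_{clk} \ge t_{c\to q}(\max) + t_{logic}(\max) + t_{setup}, \qquad
t_{c\to q}(\min) + t_{logic}(\min) \ge t_{hold}
$$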
The multicycle processor processes different instructions in different numbers of cycles. Thus, it can avoid being limited by the slowest instruction.
+
Modules on the datapath can be used multiple times within an instruction. That is module reuse.
+
Performance improvement depends on the detailed delay. The multicycle is not necessarily faster than the single cycle.
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
Exception and Interrupt
+
Generally, exceptions and interrupts are events that can change the normal instruction execution flow (other than branches and jumps).
+
Exception – Internal unpredictable events such as overflow.
+
Interrupt – External unpredictable events such as I/O.
+
+
+
+
+
+
+
+
Pipelined processor
+
Pipeline means spatial and temporal reuse.
+
Divide a complex task into several sub-tasks to execute sequentially, and assign each sub-task to dedicated hardware
+
Different sub-tasks of multiple tasks can be processed simultaneously to improve the performance
+
• Time-division multiplexing: the same resource is reused through different cycles
+
• Space-division multiplexing: multiple resources are reused within one cycle
+
Pros: improved efficiency
+
Cons: some instructions depend on earlier instructions that have not finished yet; moreover, if a branch prediction is wrong, time is wasted.
+
Instruction-Level Parallelism (ILP)
+
– Execute multiple instructions in parallel
+
– One mainstream approach to CPU performance improvement
+
Basic Techniques
+
– Pipelining: instruction execution is divided into several stages, and each stage can be processed with the other stages (of other instructions) simultaneously
+
– Superscalar: multiple dedicated functional units are equipped so that CPU can receive & execute more instructions within one cycle.
+
– Very Long Instruction Word (VLIW): each instruction consists of multiple segments to utilize the processor resources independently.
+
5 stages in MIPS:
+
+
Instruction Fetch (IF)
+
Instruction Decode / Register File
+
ALU Execution(EX)
+
Memory Data Access(MEM)
+
Reg Write-Back(WB)
+
+
Latency of the stages in the pipeline should be as equivalent as possible, why? –The pipeline performance is bottlenecked by the stage of the longest latency
+
Metrics of pipelined processors
+
+
Throughput(TP): Executed instruction # per unit time
+
Max throughput: The throughput of a steady-state pipeline with a continuous instruction input stream
+
Real throughput: The throughput of the pipeline executing task with finite instructions
+
+
Real TP:
+
$$
+TP = \frac n {T_k} = \frac{n}{(n+k-1)\Delta t}
+$$
Speedup: the execution-time ratio between the cases without and with pipelining.
+
$T_0$: execution time without pipelining
+
$T_k$: execution time with k-stage pipelining(assume each stage has the same latency)
+
$$
+\frac{T_0}{T_k}
+$$
+
+
Real speedup:
+
$$
+S = \frac{nk\Delta t}{(n + k - 1)\Delta t} = \frac{kn}{n + k - 1}
+$$
+
+
Max speedup:
+
$$
+S_{max} = \lim_{n\rightarrow \infty}\frac{kn}{k + n - 1} = k
+$$
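A small sketch evaluating both formulas (the numbers are arbitrary):

```python
def pipeline_metrics(n: int, k: int, dt: float):
    """Real throughput and speedup of a k-stage pipeline on n instructions,
    assuming every stage takes the same time dt (a sketch of the formulas above)."""
    T_k = (n + k - 1) * dt   # pipelined execution time
    T_0 = n * k * dt         # non-pipelined execution time
    tp = n / T_k             # real throughput
    speedup = T_0 / T_k      # real speedup = kn / (n + k - 1)
    return tp, speedup

print(pipeline_metrics(n=1000, k=5, dt=1.0))  # speedup -> close to k = 5 for large n
```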
+
+
Pipelined datapath
+
+
+
+
+
+
+
Many instructions don’t need the MEM stage, but the stage can’t be skipped, since skipping it may “collide” with the previous instruction.
+
Q: How to get the control signals in each stage?
+
Control signals are generated at ID/RF stage.
+
Control signals flow in the pipeline: use when needed; reserve when subsequent stages needed; discard when not needed any more.
+
+
Why is RegDst not needed in the stages after EX?
+
The RegDst is used to select the destination register. However, the destination register is determined in the EX stage. Thus, RegDst is not needed in the stages after EX.
+
Sometimes this can cause trouble:
+
```text
LW  R2, 10(R9)
ADD R4, R3, R2
ADD R6, R5, R4
```
+
+
The R4 is accessed before it updates.
+
Hazard in the pipeline
Structural Hazards
+
Two instructions acquire the same hardware resource simultaneously.
+
+
Solution:
+
+
Add resources: separating PC+4 from ALU; Harvard architecture
+
Adjust stages: add MEM
+
+
+
+
MIPS is born to be pipelined: the problem of structural hazard is solved by the structure of MIPS.
+
Data Hazards
+
+
2 solutions:
+
Stalling
+
+
Forwarding(Bypassing)
+
+
+
But the load-use hazard emerges:
+
+
+
Still, a nop is needed to stall the pipeline.
+
+
Is there any possibility to eliminate the stall?
+
Yes, if the MIPS code can be optimized.
+
+
Control Hazards
+
+
Stalling
+
+
Forwarding
+
+
If we move the branch decision to the ID stage:
+
+
+
Delay slot
+
+
Prediction
+
Static Branch Prediction
+
+
Cancel the effect caused by false prediction.
+
Dynamic Branch Prediction
+
+
History-based dynamic prediction
+
Using runtime behaviour to predict future branches
+
+
At IF stage, there are Branch History Table(BHT) and Branch Target Buffer(BTB).
+
beq at IF stage
+
• Look up if the instruction address is in BHT and BTB.
+
• If not in, create a new entry. If in, check whether the branch is taken at the last time. If taken, send the target address to PC as the next address for IF.
+
–beq at ID stage
+
• IF stage will fetch instruction based on the predicted target address
+
Implementation of the Pipeline
+
Data hazard
+
Forwarding
+
EX/MEM hazard
+
```text
if (EX/MEM.RegWrite and (EX/MEM.RegWrAddr != 0)
    and (EX/MEM.RegWrAddr == ID/EX.RegisterRs))
        ForwardA = 10
if (EX/MEM.RegWrite and (EX/MEM.RegWrAddr != 0)
    and (EX/MEM.RegWrAddr == ID/EX.RegisterRt))
        ForwardB = 10
```
+
+
MEM/WB hazard
+
```text
if (MEM/WB.RegWrite and (MEM/WB.RegWrAddr != 0)
    and (MEM/WB.RegWrAddr == ID/EX.RegisterRs)
    and (EX/MEM.RegWrAddr != ID/EX.RegisterRs || ~ EX/MEM.RegWrite))
        ForwardA = 01
if (MEM/WB.RegWrite and (MEM/WB.RegWrAddr != 0)
    and (MEM/WB.RegWrAddr == ID/EX.RegisterRt)
    and (EX/MEM.RegWrAddr != ID/EX.RegisterRt || ~ EX/MEM.RegWrite))
        ForwardB = 01
```
+
+
load-use hazard
+
We have to stall the pipeline for one cycle.
+
Condition: if (ID/EX.MemRead and ((ID/EX.RegisterRt == IF/ID.RegisterRs) or (ID/EX.RegisterRt == IF/ID.RegisterRt)))
+
How to stall?
+
Stall & flush
+
PC & ID/IF: stall(Keep the control signals)
+
EX & MEM & WB: flush(Set the control signals to 0)
+
Control signals:
+
EX: RegDst, ALUOp1, ALUOp0, ALUSrc
+
MEM: Branch, MemRead, MemWrite
+
WB: MemToReg, RegWrite
+
```text
if (ID/EX.MemRead and
    ((ID/EX.RegisterRt == IF/ID.RegisterRs) or (ID/EX.RegisterRt == IF/ID.RegisterRt)))
        Keep IF/ID; Keep PC; Flush ID/EX;
```
+
+
Special Case:
+
Memory-to-memory copy
+
```text
lw $1, 10($2)
sw $1, 10($3)
```
+
+
The stall is unnecessary. We can use forwarding to solve the problem.
+
Control Hazard
+
BEQ & J need 1 stall cycle.
+
Control hazards are not as frequent as data hazards, but they are harder to resolve as effectively as forwarding resolves data hazards.
+
Another type of control hazard: exceptions and interrupts.
+
– “Exception” includes any unpredictable events that can change the normal control flow, which has no distinction between internal and external.
+
– “Interrupt” is only used for external events.
+
+
Advanced techniques for processor
Instruction-level parallelism
Superpipelining: a deeper pipeline
+
VLIW
+
Multiple-issue: a wider pipeline
+
Superscalar
+
+
Thread-level parallelism
Hyper-threading
+
+
Multicore
+
Heterogeneous computing
GPU
+
XPU
+
Memory
Basics
SRAM cell
+
High speed, low density & expensive
+
+
DRAM cell
+
Low speed, high density & cheap. The charge in capacitor may leak, so the data cannot be stored for a long time.
+
+
+
Evaluation
3C model:
+
Compulsory miss first access to a block
+
Capacity miss all lines in cache are used
+
Conflict miss (collision miss): the cache is not fully filled, but the number of blocks mapped to a set exceeds the number of available ways.
NMS (Non-Maximum Suppression) Bounding boxes for one instance may overlap. Method: for each class, use NMS to eliminate redundant bounding boxes (a greedy approach). Workflow (a code sketch follows the list):
+
+
Sort candidate bounding boxes by classification confidence.
+
Add the box $b$ with the highest confidence to the output list, and delete it from the candidate boxes.
+
Calculate IoU between b and other boxes bi. If > threshold, delete bi.
+
Repeat until there are no candidate bounding boxes left.
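A minimal NumPy sketch of this greedy procedure for a single class (the box format $[x_1, y_1, x_2, y_2]$ and the 0.5 threshold are assumptions):

```python
import numpy as np

def iou(box, boxes):
    """IoU between one box and an array of boxes; boxes are [x1, y1, x2, y2]."""
    x1 = np.maximum(box[0], boxes[:, 0]); y1 = np.maximum(box[1], boxes[:, 1])
    x2 = np.minimum(box[2], boxes[:, 2]); y2 = np.minimum(box[3], boxes[:, 3])
    inter = np.clip(x2 - x1, 0, None) * np.clip(y2 - y1, 0, None)
    area = (box[2] - box[0]) * (box[3] - box[1])
    areas = (boxes[:, 2] - boxes[:, 0]) * (boxes[:, 3] - boxes[:, 1])
    return inter / (area + areas - inter)

def nms(boxes, scores, threshold=0.5):
    """Greedy NMS for one class: keep the highest-scoring box, drop overlapping ones, repeat."""
    order = np.argsort(scores)[::-1]          # sort candidates by confidence
    keep = []
    while order.size > 0:
        best, rest = order[0], order[1:]
        keep.append(int(best))                # keep the most confident remaining box
        if rest.size == 0:
            break
        order = rest[iou(boxes[best], boxes[rest]) <= threshold]  # drop high-IoU boxes
    return keep
```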
+
+
Sequence Models
+
To process speech, text, video, audio, etc.
+
Feature:
+
+
The data input is in the time sequence.
+
There is a correlation between the data before and after.
+
+
So the model should have the ability to “store” information.
+
Speech dataset: TIMIT
+
+
It consists of recordings of 630 speakers of 8 dialects of American English each reading 10 phonetically-rich sentences.
+
It also comes with the word and phone-level transcriptions of the speech.
+
+
Video dataset: DAVIS
+
The Densely Annotation Video Segmentation dataset (DAVIS) is a high quality and high resolution densely annotated video segmentation dataset under two resolutions, 480p and 1080p.
+
There are 50 video sequences with 3455 densely annotated frames in pixel level. 30 videos with 2079 frames are for training and 20 videos with 1376 frames are for validation.
+
NLP dataset: GLUE
+
General Language Understanding Evaluation (GLUE) benchmark: a standard split of the data into train, validation, and test, where the labels for the test set are only held on the server.
+
+
Sentence pair tasks
+
MNLI, Multi-Genre Natural Language Inference
+
QQP, Quora Question Pairs
+
QNLI, Question Natural Language Inference
+
STS-B The Semantic Textual Similarity Benchmark
+
MRPC Microsoft Research Paraphrase Corpus
+
RTE Recognizing Textual Entailment
+
WNLI, Winograd NLI, a small natural language inference dataset
This loss is convex. But there are many solutions that result in the same outputs, so regularization is indispensable to prevent divergence.
+
+
Support Vector Machine (SVM)
Soft-SVM (Hinge Loss)
$$
+\min_{w,b,\xi}\frac{1}{2}\|w\|_{2}^{2}+\frac{C}{n}\sum_{i=1}^{n}\xi_{i}\\\mathrm{s.t.~}y_i(\boldsymbol{w}\cdot\boldsymbol{x}_i+b)\geq1-\xi_i\\\xi_i\geq0,1\leq i\leq n
+$$
+
+
Define Hinge Loss
+
$$
+\ell(f(x),y)=\max\{0,1-yf(x)\}
+$$
+
+
For the linear hypothesis:
+
$$
+\ell(f(x),y)=\max\{0,1-y(w\cdot x+b)\}
+$$
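A small sketch that evaluates the regularized hinge-loss objective for a linear classifier (the toy data and $C=1$ are assumptions made for the example):

```python
import numpy as np

def hinge_objective(w, b, X, y, C=1.0):
    """0.5*||w||^2 + C * mean hinge loss, for X: (n, d) features, y in {-1, +1}."""
    margins = y * (X @ w + b)
    hinge = np.maximum(0.0, 1.0 - margins)   # max{0, 1 - y(w.x + b)}
    return 0.5 * np.dot(w, w) + C * hinge.mean()

# Tiny made-up example: two separable points and a separating w.
X = np.array([[2.0, 0.0], [-2.0, 0.0]])
y = np.array([1.0, -1.0])
print(hinge_objective(np.array([1.0, 0.0]), 0.0, X, y))  # 0.5: zero hinge loss, only the regularizer
```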
+
+
Theorem: Soft-SVM is equivalent to a Regularized Risk Minimization:
$$
+m \ge n, c_s\left[m,n\right]=a^{m+n+2}\sigma_s^2+\sum_{k=m-n}^ma^{2k+n-m}\sigma_u^2=a^{m+n+2}\sigma_s^2+\sigma_u^2a^{m-n}\sum_{k=0}^na^{2k}\\
+m \lt n, c_s\left[m,n\right]=a^{m+n+2}\sigma_s^2+\sum_{k=0}^ma^{2k+n-m}\sigma_u^2=a^{m+n+2}\sigma_s^2+\sigma_u^2a^{n-m}\sum_{k=0}^ma^{2k}=c_s\left[n,m\right]
+$$
Effective Aperture and Aperture Efficiency
+
+
E-plane: the plane parallel to the direction of the electric field
+
H-plane: the plane parallel to the direction of the magnetic field
+
Pattern Parameters
+
Often use log scale.
+
Power Density
Instantaneous Poynting vector $\vec S(x, y, z, t)$
+
Radiation Power Density = Time average Poynting vector $\vec S_{av}(x, y, z)=\frac1T\int_0^T\vec S(x, y, z, t)\mathrm dt = \frac12\text{Re}[\tilde{\vec E} \times \tilde{\vec H^*}]$
+
Total Radiation Power $P_{rad} = \oiint_S \vec S_{av} \cdot \mathrm d\vec s = \frac12\oiint_S\text{Re}[\tilde{\vec E} \times \tilde{\vec H}^*] \cdot \mathrm d\vec s$
RCS (σ) of a radar target is an effective area that intercepts the transmitted radar power and then scatters that power isotropically back to the radar receiver.
Q and R are symmetric and positive definite. If they are not positive definite, the optimization may drive the cost to negative infinity.
+
Value Iteration
+
Model-Free RL
What if the model is not known?
+
Model-based RL:
+
+
base policy to collect dataset
+
learning dynamics model from dataset
+
plan through dynamics model and give actions
+
Execute the actions and add the result into data set
+
+
Model predictive control (MPC) — a code sketch follows the steps below:
+
+
base policy to collect dataset
+
learning dynamics model from dataset
+
plan through dynamics model and give actions
+
only execute the first planned action
+
append the $(s, a, s^\prime)$ to dataset
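A toy, self-contained sketch of this MPC loop (the 1-D environment, the least-squares dynamics model, and the random-shooting planner are all assumptions made only to keep the example runnable):

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy 1-D environment (assumed): true dynamics s' = s + a, reward = -s'^2 (drive the state to 0).
def env_step(s, a):
    s_next = s + a
    return s_next, -s_next ** 2

# Learned dynamics model s' ~ s + w*a, with w fit by least squares on the dataset.
def fit_model(data):
    S, A, S_next = (np.array(x) for x in zip(*data))
    return np.sum((S_next - S) * A) / (np.sum(A ** 2) + 1e-8)

def plan(w, s, horizon=5, n_candidates=100):
    """Random-shooting planner that rolls out the *learned model*, not the real env."""
    seqs = rng.uniform(-1, 1, size=(n_candidates, horizon))
    best, best_ret = None, -np.inf
    for seq in seqs:
        sim_s, ret = s, 0.0
        for a in seq:
            sim_s = sim_s + w * a
            ret += -sim_s ** 2
        if ret > best_ret:
            best, best_ret = seq, ret
    return best

# 1. base (random) policy collects an initial dataset
s, data = 3.0, []
for _ in range(20):
    a = rng.uniform(-1, 1)
    s_next, _ = env_step(s, a)
    data.append((s, a, s_next))
    s = s_next

# MPC loop: refit, re-plan every step, execute only the first action, grow the dataset.
s = 3.0
for t in range(15):
    w = fit_model(data)                 # 2. learn dynamics from the dataset
    a = plan(w, s)[0]                   # 3./4. plan, execute only the first planned action
    s_next, r = env_step(s, a)
    data.append((s, a, s_next))         # 5. append (s, a, s') to the dataset
    s = s_next
print(round(s, 3))                      # state driven close to 0
```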
+
+
Model-based RL with a policy
Why Model based RL with a learned model?
+
+
Data-efficiency
+
Dont need to act in real world
+
+
+
Multi-task with a model
+
reuse the world model
+
+
+
+
But they are unstable and have worse asymptotic performance.
+
+
If the model is biased toward the positive side,
+
the actions overfit to the learned model
+
+
+
if the trajectory is really long
+
Accumulated errors
+
+
+
+
To resolve 1: use uncertainty
+
Optimize towards expectation of rewards rather than rewards
+
Two types of uncertainty
+
+
Aleatoric or statistical: The reality itself has uncertainty (e.g. Dice)
+
Epistemic or model uncertainty: You are uncertain about the true function
+
+
If we use the output entropy, it can’t tell the two types of uncertainty apart. We need to measure the epistemic uncertainty.
+
How to measure?
+
We use the collected data to train the model, maximizing $\log p(D|\theta)$ over $\theta$.
+
Can we instead measure $\log p(\theta|D)$ – the model uncertainty?
+
but it is rather intractable.
+
Model Ensemble!
+
Train multiple models and see whether they agree with each other. We have to make the models different (introduce variance).
+
The randomness of initialization and SGD is enough to make the models different. (A small code sketch follows the procedure below.)
+
+
Each time, draw one model from the ensemble and use it to give actions
+
calculate the reward
+
add the data to the dataset and update the policy
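A small sketch of ensemble disagreement as an epistemic-uncertainty signal (the bootstrapped linear models and the toy data are assumptions):

```python
import numpy as np

rng = np.random.default_rng(0)

def fit_linear(X, y):
    # least-squares fit of y ~ w*x + b
    return np.linalg.lstsq(np.c_[X, np.ones(len(X))], y, rcond=None)[0]

X = rng.uniform(-1, 1, size=(30, 1))
y = 2 * X[:, 0] + 0.1 * rng.normal(size=30)   # assumed toy data

models = []
for _ in range(5):                            # ensemble of 5 models
    idx = rng.integers(0, len(X), len(X))     # bootstrap resample to decorrelate them
    models.append(fit_linear(X[idx], y[idx]))

x_query = np.array([[0.0], [5.0]])            # in-distribution vs. far out-of-distribution
preds = np.stack([np.c_[x_query, np.ones(2)] @ m for m in models])
print(preds.std(axis=0))                      # larger disagreement at x = 5 (epistemic uncertainty)
```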
+
+
To resolve 2 (long rollouts can be error-prone), we can always use short rollouts.
+
combine the real and model data to improve policy
+
Example: DYNA-style MBRL
+
We can also try Bayesian Neural Networks.
+
Value-Equivalent Model
You don’t have to simulate the world exactly; just learn a simplified model while ensuring that the predicted value function stays approximately the same as the real one.
+
Use mean square error.
+
Model-Based RL with images
Imitation Learning
Accumulated Error and Covariate Shift
+
DAgger (a code sketch follows the steps):
+
+
Train a policy from human data $D$
+
Run the policy to get dataset $D_\pi$
+
Ask human to label $D_\pi$ with actions $a_t$
+
Aggregate: $D \larr D \cup D_\pi$
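A toy, self-contained sketch of this loop (the 1-D task, the scripted “expert,” and the nearest-neighbour “policy training” are stand-ins chosen only to keep the example runnable):

```python
import numpy as np

rng = np.random.default_rng(0)

def expert(s):                       # stand-in for human labelling (assumed known here)
    return -0.5 * s                  # push the state halfway back towards 0

def policy(s, D):                    # "trained" policy: 1-NN lookup over the aggregated dataset D
    states, actions = zip(*D)
    return actions[int(np.argmin(np.abs(np.array(states) - s)))]

def rollout(D, n_steps=20):          # run the current policy and record the visited states
    s, visited = 3.0, []
    for _ in range(n_steps):
        visited.append(s)
        s = s + policy(s, D) + 0.1 * rng.normal()
    return visited

# 1. initial dataset from expert demonstrations
D = [(s, expert(s)) for s in rng.uniform(-3, 3, 10)]

for _ in range(5):                   # DAgger iterations
    visited = rollout(D)             # 2. run the learned policy to get D_pi
    D += [(s, expert(s)) for s in visited]  # 3. expert labels D_pi, 4. aggregate D <- D U D_pi

print(round(rollout(D)[-1], 2))      # the final policy keeps the state near 0
```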
+
+
Techniques: Dataset Resampling / Reweighting
+
Techniques: Pre-Trained Models to extract representations
+
MSE regression gives the mean value, while cross-entropy gives a probability distribution. If a task has a 50% probability of going left and 50% of going right, MSE will give the answer “go forward”.