\caption{\textbf{3D object detection and segmentation} with GRNNs. In the first and second rows on the left we show the input images over time and their corresponding object detection results from a top view, respectively. Blue voxels denote groundtruth objects and the predicted bounding boxes are shown in \color{red}red \color{black} and \color{green} green \color{black}. On the right, we show segmentation results for the third time step, visualizing the results from two views. Predicted 3D boxes and their corresponding predicted masks are shown in red and green, and we show in blue the corresponding groundtruth. Best seen in color.}
\caption{ Experimental frame (top) and processed data analysis (bottom) for fiber radius $\R = 0.1 \, \si{mm}$, nozzle inner diameter $ID = 0.8$ \si{mm}, and flow rate $\Qm$ = 0.04 \si{g/s}, \red{shown at a distance of 8--10 mm from the nozzle}. \red{The red dots superimposed on the experimental frame correspond to the extracted film profile and the black dots on the bottom plot} correspond to the locations of the maxima. \red{The average distance between two maxima is $L^* = 5.92$ mm and the film thickness between the drops ranges from 0.095 to 0.120 mm.}}
\caption{Average bead speed for fiber radius $\R=0.1$ mm (left) and $\R = 0.215$ mm (right), and flow rates $\Qm =$ 0.04, 0.06, and 0.08 g/s from top to bottom, compared to the proposed model (\GvdW) as blue diamonds and Craster \& Matar (CM) as red circles. The last two data points in (a) are in the isolated droplet regime. There is no plot for $\R =0.215$ mm, $\Qm = 0.08$ g/s, because in this case there is no TWS and the experiments lie in the convective regime. }
\caption{A comparison of traveling wave solutions to \eqref{eq:travelingODE} from the original Craster \& Matar model (CM), the Slip Craster \& Matar model (SCM), and the Full Curvature Model (FCM) with the experimental observation. While the solution profiles from the three models are close, the bead velocity predicted by the FCM best matches the experimental result. The bead profile is obtained from the thick fiber experiment with $\R = 0.215$ mm, flow rate $\Qm = 0.04$ g/s, and inner nozzle diameter ID = $1.06$ mm. The corresponding non-dimensional constraints are given by $(L, M_0) = (4.94,8.92)$.}
\caption{ \red{(Left) The experimentally measured relations between the dimensional $L^*$, $M_0^*$, and the nozzle inner diameters for fiber radius $\R = 0.215$ mm at flow rate $\Qm = 0.04$ g/s; (Right) the predicted speed of traveling wave solutions to the film stabilization model (\GvdW) in comparison with the corresponding experimental results (large dots) with varying $M_0^*$ and $L^*$}.}
\caption{Evaluation results for $\CorpusReddit$ and $\CorpusAmazon$ sorted by $\accuracy$ in descending order. Binary-intrinsic approaches are highlighted by \colorbox{BinaryIntrinisc}{\textbf{purple}}, binary-extrinsic approaches by \colorbox{BinaryExtrinisc}{\textbf{orange\protect\vphantom{l}}} and unary approaches by \colorbox{Unary}{\textbf{green\protect\vphantom{l}}}. \textbf{Non-optimizable} and \textbf{non-deterministic} \AV methods are marked by $\dagger$ and $\star$, respectively.}
\caption{Gas network connected to a power grid. Red nodes are \textcolor{red!50}{PQ/demand nodes}, green nodes are generators (\textcolor{green!50}{PV nodes}) and the blue node is the \textcolor{blue!50}{slack bus} (also a generator, with gas consumption of the form $\fuel(P)=a_0 + a_1 P + a_2 P^2$). The circle symbol indicates a compressor station.}
\caption{Polarized optical microscope (POM) images of three polymorphs of paracetamol (forms I, II, III)~\cite{Haisa,parac,Perrin} and possible phase transitions; \blue{$T_{g,m,tr}$ are temperatures of glass transition, melting and solid phase transition, respectively~\cite{Martino,Gaisford,Qi,Martino2,John,Kach,Zimm}. Blue arrows show the preparation procedure of the sample used for IR measurement in this study.} The polariser (P) and analyser (A) orientations are shown together with that of a waveplate $\lambda = 530$~nm used to color-shift the polariscopy image. The thickness of the paracetamol film was $d\approx 10~\mu$m for forms II and III, and $\sim 25~\mu$m for form I. }
\caption{(a) Molecular organisation \blue{of paracetamol} in \blue{the} forms I, II, III and their melting temperatures~\cite{Martino,Gaisford,Qi}. (b) Paracetamol form II crystal structure and (c) the absorbance spectrum \blue{of the} form II~\cite{parac} averaged over the measured area at $\theta = 0^\circ$ (horizontal x-axis in transmission, $T$, measurements). The hydrogen bonding at 3205~cm$^{-1}$ was used for mapping and for orientation of the optical slow-axis at 3600~cm$^{-1}$, at which absorbance is close to zero and \blue{transmittance} is changing due to the retardance (see Eqn.~\ref{e1}). The OH \blue{stretching} band is aligned with the molecular chain (inset in (c)), which is the \blue{optical} slow-axis.}
\caption{\blue{Specific heat capacity $c_{\rm{p}}$ measured on heating at the rate of 10 K/min for the specimen as purchased (in the first heating, blue line) and after cooling at 10 K/min (in the second heating, red line). The highest temperature reached at the end of the first cycle was 180$^\circ$C.} }
\caption{Polarisation angle dependence of transmittance, $T(\theta)$ (Eqn.~\ref{e1}). Experimental transmittance is fitted by Eqn.~\ref{e1} at four polarisation angles~\cite{Hikima} $\theta$ separated by $\pi/4$; more angles increase \blue{the} fidelity of the fit. The marked losses represent collection and absorption losses measured through the parallel polariser--analyser setup. The slow\blue{-}axis orientation $\theta^{'}\equiv\theta^{'}_n$ (for pure retardance effect) and the retardance $\Delta nd/\lambda$ are also found from the fit. This measurement is made at a selected wavelength.}
\caption{(a) Color map of the spectral average of absorbance. ROI indicates the region-of-interest for single point spectral measurements; the ROI is at $x=23$, $y=14$ when the upper left corner is (0,0). (b) Single point spectra at different $\theta$ angles; note the logarithmic abscissa scale used to better separate \blue{absorption} bands. (c) Transmittance (Eqn.~\ref{e1}) change at the non-absorption band of 3600 cm$^{-1}$. The best fit is plotted by $y=1-a\sin^2(2x-2b)+c$, which achieves a coefficient of determination $R^2 = 0.986$ with $a=0.893$, $b=-36.11$, $c=-0.036$. (d) Absorbance change at the $\delta$(CH$_3$) band 1377~cm$^{-1}$ was also measured in transmission; however, at the \blue{absorption} band $A = -\lg(T)$ and the Malus law applies due to the mutual orientation between the linear polarisation and the orientation of the absorbing dipoles. The best fit was made by $y=a\cos(2x-2b)+c$ with $R^2 = 0.942$, $a=0.72$, $b=-31.97$, $c=0.79$. }
\caption{Qualitative results on the VeRi dataset using \emph{BS}-based triplet embedding. Each row shows a query image and its top-10 retrievals. A {\color{red}{red}} border indicates an incorrect retrieval and a {\color{green}{green}} border indicates a correct retrieval. These results demonstrate good embedding quality, as the top retrievals include different views and cameras. }
\caption{{\bf RayNet.} Given a reference view and its adjacent views, we extract features via a 2D CNN (\textcolor{blue}{blue}). Features corresponding to the projection of the same voxel along ray $r$ (see (\subref{fig:network_multi_view}) for an illustration) are aggregated via the average inner product into per pixel depth distributions. The average runs over all pairs of views and $\sigma(\cdot)$ denotes the softmax operator. The depth distributions for all rays (\ie, all pixels in all views) are passed to the unrolled MRF. The final depth predictions $d_r$ are passed to the loss function. The forward pass is illustrated in \textcolor{darkgreen}{green}. The backpropagation pass is highlighted in \textcolor{red}{red}. $\cC=\{C_i\}$ is the set of all cameras, $W\times H$ are the image dimensions and $D$ is the max. number of voxels along each ray. }
\caption{\label{table:categoryPerformance} List of AUC values for different categories using fixed threshold $T_1= 0.55$ and \emph{Scale 1} saliency maps. Note that the highest AUC value in each category is labeled in \textcolor{ForestGreen}{green} and the lowest AUC value in \textcolor{red}{red}. Also, the category with the highest AUC in the dataset is shown in \textbf{bold}.}
\caption{\label{table:distanceMetrics} Estimated distances using distribution-based metrics for the proposed algorithm in comparison with the state-of-the-art algorithms using \emph{Scale 1} saliency maps computed using the 3DFFT algorithm \cite{long2015saliency}. Note that the highest value in each distance metric is labeled in \textcolor{ForestGreen}{green} and the lowest value in \textcolor{red}{red}. Also, the distance values for the proposed algorithm are shown in \textbf{bold}.}
\caption{Performance comparisons on five tracking benchmarks. \textcolor{red}{Red}, \textcolor{green}{green} and \textcolor{blue}{blue} fonts indicate the first-, second- and third-ranked trackers, respectively.}
\caption{\textbf{Left}: Ring of \gray excess detected by AGILE. \textbf{Right}: CO map \citep{Dame01}, which reveals the star formation shell discussed by \cite{Pillitteri16}; the contour levels from \gray data are shown in black. The figure is from M18.}
\caption{AGILE \gray spectrum obtained with an extended source likelihood analysis centred on $(l, b) = (214.4, -18.5)$ with a $1^{\circ}$ radius convolved with the AGILE PSF (M18). The blue line shows the best-fit power-law function, with index 1.7.}
\caption{AGILE (red) \gray points plotted with \gray emission from pion decay (blue line), from primary Bremsstrahlung (cyan dashed line), from secondary Bremsstrahlung (magenta dashed line), and total emission (black line).}
\caption{AGILE (red) \gray points plotted with the total \gray emission produced by re-acceleration (brown line) and acceleration (grey line, $\xi_{CR}=20\%$) described with parameters of our best model. The black line is the sum of the two contributions.}
\caption{Gaze prediction AUC and action recognition accuracy with respect to inference iteration on the EGTEA dataset. The \textcolor{myblue}{blue line} corresponds to action recognition accuracy on the left axis, and the \textcolor{myorange}{orange line} corresponds to gaze prediction AUC on the right axis. We show the strongest baselines for action recognition \cite{li2018eye} and gaze prediction \cite{huang2018predicting} as \textcolor{cyan}{cyan} and \textcolor{red}{red} dashed lines, respectively.}
\caption{\textbf{Scenarios:} Our \blue{validation user study} comprised $5$ scenarios, including indoor, outdoor, residential, and fantastical scenes.}
\caption{\textbf{Reported Dominance:} We present the mean values of the normalized participant responses $\in [-1, 1]$. We generated $10$ characters for each scenario using \blue{two} gaits from each dominance level: $HS, S, N, D, HD$. Participants reported higher dominance for more dominant gaits, as predicted by our algorithm, across all the scenarios.}
\caption{\textbf{Comparison of Means:} We compare the means of participant responses for pairs of gaits with different levels of dominance using paired samples t-tests. This table presents the p-values for these comparisons \blue{(highlighted values indicate $p \geq 0.05$)}. We observe statistically significant differences ($p < 0.05$) in the means of reported dominance for pairs of gaits with different predicted dominance levels. These results support our hypothesis that our data-driven approach can be used to generate virtual characters with varying levels of dominance.}
\caption{\blue{\textbf{Consistency Across Scenarios}: We perform paired samples t-tests between scenarios to assess whether the participant responses for a gait remain consistent across a variety of scenarios. We present a visualization of the p-values obtained for these comparisons. We color the cells where we observed a significant difference $(p < 0.05)$ between mean participant responses red and we color the cells where we did not observe a significant difference $(p > 0.05)$ green. For most of the gaits, there was not a significant difference $(p > 0.05)$ between mean participant responses across scenarios, indicating consistent dominance levels irrespective of the scenario.}}
\caption{\textbf{Dominance spectrum:} Based on a \blue{perception user study}, we obtain dominance labels for motion-captured gaits. As an example, participants rated the character on the left as Highly Submissive (HS) and the character on the right as Highly Dominant (HD). According to the psychology literature, more leg and hand movement and erect posture are observed in a dominant gait as compared to a submissive gait. }
\caption{\textbf{Dataset Variation:} We divide $179$ gaits into $5$ dominance levels using a \blue{perception user study}: \textit{(Highly Submissive (\textbf{HS}), Submissive (\textbf{S}), Neutral (\textbf{N}), Dominant (\textbf{D}), Highly Dominant (\textbf{HD}))}.}
\caption{\blue{\textbf{Average Frame Update Time}: We present the average frame update time for generating gaits of virtual characters with different dominance traits. We compare the performance to an algorithm that does not consider any dominance traits.}}
\caption{\textbf{Application User Study Results}: We present the percentages of participants that answered teacher, student, or unsure for the \textit{School} scene (in light gray) and the percentages of participants that answered employee, boss, or unsure for the \textit{Office} scene (in dark gray). For the \textit{Office} scene, our approach to generating virtual characters with different dominance traits is able to create characters that are distinguishable as employees and bosses. Overall, the results of our \blue{application user study} indicate that our approach can be used to simulate the vertical dimension of interpersonal social relationships between virtual characters.}
\caption{\textbf{Interpersonal Social Relationship Between Virtual Characters}: Our approach can be used to realize the vertical dimension of interpersonal social relationships. Members of a pair of dominant and submissive characters generated using our method were perceived by participants as being a boss or an employee depending on their dominance level in our \blue{application user study}.}
\caption{\textbf{Modeling Dominance Traits for Virtual Characters:} Our approach learns dominance traits from motion-captured gaits and computes a data-driven dominance mapping. We use this mapping to \blue{interactively} generate virtual characters with different dominance traits (below) for an immersed user (top). According to the psychology literature, more leg and hand movement and erect posture indicate a dominant gait, whereas slumped posture and less leg and hand movement indicate a submissive gait.}
\caption{\emph{Bidisperse shear thickening phenomenology.} (a) $\eta/\eta_f$ as a function of $\sigma/\sigma_0$ from simulations at $\alpha=0.25$ and $\phi=0.53$, with $\xi=0$ (pentagon), 0.2 ($\square$), 0.5 ($\triangledown$), 0.65 ($\triangle$), 0.8 ($\circ$) and 1 ($\diamond$). (b) Frictionless relative viscosity $\eta_0/\eta_f$ from simulations (\blue{--$\blacksquare$--}) and experiments (\blue{--$\circ$--}), and frictional relative viscosity $\eta_{\rm m}/\eta_f$ from simulations (\red{--$\blacksquare$--}). (c) Limiting jamming volume fractions, $\phi_0$ (blue) and $\phi_{\rm m}<\phi_0$ (red), versus $\xi$ from simulations (\blue{--$\blacksquare$--},\red{--$\blacksquare$--}) and experiments (\blue{--$\circ$--},\red{--$\circ$--}). (d) Experimental $\eta/\eta_f$ versus $\sigma$ for PMMA spheres at $\alpha=0.26$ and $\phi=0.51$, with $\xi=0$ (pentagon), 0.2 ($\square$), 0.5 ($\triangledown$), 0.65 ($\triangle$), 0.8 ($\circ$) and 1 ($\diamond$). Inertial fracture at $\dot{\gamma} \approx \SI{8e3}{s^{-1}}$ renders the grey-shaded region inaccessible\cite{guy2015towards}. }
\caption{Additional \aastex\symbols}
\caption{Abstract syntax of Hybrid Rebeca. The main differences in syntax compared to Timed Rebeca are highlighted with color \textcolor{\HighlightColor}{\HighlightColor}. Angle brackets $\langle~\rangle$ denote meta parentheses; superscripts $+$ and $*$ are used for repetition of one or more and repetition of zero or more times, respectively. Combination of $\langle~\rangle$ with repetition is used for comma separated lists. Brackets $[~]$ are used for optional syntax. Identifiers $C$, $T$, $\MessageName$, $\ModeName$, $v$, $c$ and $r$ denote class, primitive type, method name, mode name, variable, constant, and rebec name, respectively; and $e$ denotes an expression. \label{Fig::HybridRebecaGrammar}}
\caption{Critical temperature of the studied granular aluminum films versus the normal state resistivity. Black upside-down triangles \textcolor{black}{$(\lyxmathsym{\textifsymbol[ifgeo]{99}})$} mark $T_{c}$ of the samples studied by optical spectroscopy. Blue triangles \textcolor{blue}{$(\text{\textifsymbol[ifgeo]{97}})$} mark $T_{c}$ as measured by Hall bars with similar deposition conditions as described in the main text. The dashed line marks the bulk aluminum critical temperature. \label{fig:Phasediagram}}
\caption{$\Delta(T)$ and $\rho(T)$ versus temperature for all studied samples. Black squares $(\blacksquare)$ are the measured gap, obtained by fitting $\sigma_{1,s}/\sigma_{1,n}$ to MB formulae and the red curve is a fit to a BCS gap equation curve. Blue circles \textcolor{blue}{$(\bullet)$} are the measured resistivity and the grey area marks the decrease of the normal state resistivity from 90 to 10\% of its normal state value. \label{RTgap}}
\caption{The coupling ratio as a function of $k_{F}\xi_{pair}$ or $(k_{F}a_{F})^{-1}$ (Pisani et al., private communication). The red circles \textcolor{red}{$(\bullet)$} are the results of the numerical calculations that include corrections beyond mean field which arise from pairing fluctuations, labeled GMB. The red line joining these circles is a guide to the eye. The black squares $(\blacksquare)$ correspond to our measured coupling ratio. The corresponding values of $(k_{F}a_{F})^{-1}$ range from $-1.3$ to $-0.7$ and those of $k_{F}\xi_{pair}$ from 5.3 to 2.3. \label{fig:PisanisdeltakFxi}}
\caption{An overview of test sets included in EQUATE. \reddit{} and \st{} are framed as 3-class (entailment, neutral, contradiction) while \rte{}, \qnli{} and \awp{} are 2-class (entails=yes/no). RTE 2-4 formulate entailment as a 2-way decision. We find that few news article headlines are contradictory, thus \qnli{} is similarly framed as a 2-way decision. For algebra word problems, substituting the wrong answer in the hypothesis necessarily creates a contradiction under the event coreference assumption \cite{de-marneffe-etal-2008-finding}, thus it is framed as a 2-way decision as well.}
\caption{Accuracies (\%) of 9 NLI models on five tests for quantitative reasoning in entailment. M and D represent \emph{models} and \emph{datasets} respectively. $\Delta$ captures improvement over the majority-class baseline for a dataset. Column Nat.Avg. reports the average accuracy (\%) of each model across 3 evaluation sets constructed from natural sources (\rte, \qnli, \reddit), whereas Synth.Avg. reports the average accuracy (\%) on 2 synthetic evaluation sets (\st, \awp). Column Avg. represents the average accuracy (\%) of each model across all 5 evaluation sets in \textsc{equate}. }
\caption{\draw \cite{Gregor:2015:draw}}
\caption{Keyframe localization error comparison. \textcolor{red}{X} denotes unsuccessful cases.}
\caption{The infinite temperature spectral form factor $g(t,\beta=0)$ for various re-wirings of the network. On the left is the SFF for zero re-wirings. The SFF does not show any dip or ramp regimes. The middle plot shows the SFF for 1, 2 ({\color{blue}{blue}}) and 3 ({\color{red}{red}}) re-wirings. The dip-linear ramp-plateau behaviour is immediate and more pronounced for increasing randomness. On the far right, the SFF computed for 20 re-wirings of the network where $p\sim 1$ clearly displays a linear ramp connecting the dip to the plateau at $t\sim 10$. We have checked also that the onset of the plateau remains at $t\sim 10$ for $N=6,7,8$ and $9$.}
\caption{Safecast privacy gain: Spatial (top) and \acp{POI} (bottom). Number of measurements per user: \textbf{+} : \textless{}10k, \tikzcircle[black, fill=black]{2pt} : [10k,50k], $\blacktriangle$ : \textgreater{}50k. Each point on the graphs represents one user.}
\caption{Spatial privacy gain (left part) and \ac{POI} privacy gain (right part) in Radiocells. Number of measurements per user: \textbf{+} : \textless{}10k, \tikzcircle[black, fill=black]{2pt} : [10k,50k], $\blacktriangle$ : \textgreater{}50k. Each point on the graphs represents one user.}
\caption{Average \textcolor{green}{$Avg_a$} / \textcolor{blue}{$Avg_b$} scores on UCCS. $Avg_a$ represents the green-red component, with green in the negative direction. $Avg_b$ represents the blue-yellow component, with blue in the negative direction. The best two are shown in bold.}
\caption{Comparison of the image quality among CBCT, SynPlanCT, and PlanCT \red{of test patient (ii)}. For each patient, the images in the top, middle, and bottom row are axial, coronal, and sagittal views, respectively. The images on the left, middle, and right are CBCT, SynPlanCT and PlanCT, respectively. The display window range was set to (-400, 0) HU for CBCT and (-200, 200) HU for SynPlanCT and PlanCT.}
\caption{ \red{Comparison of an ROI containing air. From left to right: ROI position in CBCT, (a) original CBCT, (b) SynPlanCT with $\lambda_{air}=1.0$ and $\lambda_{grad}=0.1$, (c) SynPlanCT with $\lambda_{air}=0$ and $\lambda_{grad}=0$, checkerboard overlay of (a) and (b), and checkerboard overlay of (a) and (c). The shapes of the air regions in (a) and (b) match very well. }}
\caption{Comparison of the proposed MOT framework (online mode) with other online processing SOTA methods on MOT16 and MOT17. `with filter' means the detection score refiner is used. `MOT16p' means MOT16 with private detection. \textcolor{red}{Red} indicates the best result.}
\caption{Comparison of the proposed MOT framework (batch mode) with other batch processing SOTA methods on MOT16 and MOT17. `with filter' means the detection score refiner is used. `MOT16p' means MOT16 with private detection. \textcolor{red}{Red} indicates the best result.}
\caption{\textbf{Binary Image SelectiON (BISON):} Given a \emph{text query}, the system must select which of two images best matches the caption. This task evaluates fine-grained visual grounding. The BISON accuracy of a system is the proportion of examples for which the system correctly chooses the \emph{positive image} ({\color{green}\cmark}) over the \emph{negative image} ({\color{red}\xmark}).}
\caption{\Pred Indexing Algorithm}
\caption[none]{\textbf{\emph{Step 1}}: With the current estimates of $\ddot{\b{z}}_\mesh^{(i)}$ (\begin{tikzpicture}{\draw[my_blue, line width=1.5] (0,0) -- (.4,0);}{\draw[my_blue, fill=my_blue] (0.2,0) circle[radius=3pt];} \end{tikzpicture}) we generate, using the $\GP$ model Eq.~\ref{eq:posterior}, the current posterior curve $\bs{\mu}^{(i)}$ (\begin{tikzpicture}{\draw[my_green, line width=1.5] (0,0) -- (.4,0);}{\draw[my_green, fill=my_green] (0.2,0) circle[radius=3pt];} \end{tikzpicture}) and velocities $\dot{\bs{\mu}}^{(i)}$ (\begin{tikzpicture}{\draw[my_red, line width=1.5] (0,0) -- (.4,0);}{\draw[my_red, fill=my_red] (0.2,0) circle[radius=3pt];} \end{tikzpicture}). \textbf{\emph{Step 2}}: Then, using the proposed fixed-point update scheme Eq.~\ref{eq:fixed-point-update}, we get the updated parameters $\ddot{\b{z}}_\mesh^{(i+1)}$ $(\begin{tikzpicture} {\draw [my_blue, densely dotted, line width=1.5] (0,0) -- (.4,0);} {\draw[my_blue, line width=1] (0.2,0) circle[radius=3pt];} \end{tikzpicture})$. The algorithm iterates until $\norm{ \begin{tikzpicture} {\draw [my_blue, densely dotted, line width=1.5] (0,0) -- (.4,0);} {\draw[my_blue, line width=1] (0.2,0) circle[radius=3pt];} \end{tikzpicture} - f( \begin{tikzpicture} {\draw [my_green, densely dotted, line width=1.5] (0,0) -- (.4,0);} {\draw[my_green, line width=1] (0.2,0) circle[radius=3pt];} \end{tikzpicture} , \begin{tikzpicture} {\draw [my_red, densely dotted, line width=1.5] (0,0) -- (.4,0);} {\draw[my_red, line width=1] (0.2,0) circle[radius=3pt];} \end{tikzpicture} ) }$ is small enough.}
\caption{Phaseogram showing the variation of the pulse phase corresponding to an imperfect orbital solution (in this case the time at the ascending node $T_0$) in a \nustar observation of Her X-1, executed with \stingray and plotted in a convenient, interactive interface with \hendrics. The TOA button allows the user to calculate the TOA for use with Tempo2, PINT or similar programs.}
\caption{Visualization of the SeisInvNet framework. Given the seismic data (\brown{$\blacktriangledown$} to \red{$\blacktriangledown$} indicate data from different sources, while \black{$\blacktriangle$} to \gray{$\blacktriangle$} indicate data recorded by different receivers; to save space, the data are not visualized at their original scale), (a) the \emph{Embedding Encoder} replaces each original seismic trace $D^i_{s,r}$ by an embedding vector $\mathbf{E}^i_{s,r}$, which is composed of neighborhood information $\mathcal{N}(D^i_{s,:})_r$ (indicated by $\vrectangle$), observation setup $\mathcal{S}(D^i_{s,r})$ (indicated by squares, such as \tiny{\black{$\blacksquare$}} \footnotesize{and} \tiny{\red{$\blacksquare$}}\footnotesize.), and global context $\mathcal{G}(D^i_{s,:})$ of the corresponding seismic profile (indicated by rectangles, such as \orange{$\vrectangleblack$}, \blue{$\vrectangleblack$} and \green{$\vrectangleblack$}); (b) the \emph{spatially aligned feature Generator} transforms each embedding vector to a feature map whose information is spatially aligned to the velocity model; (c) the \emph{Velocity Model Decoder} collects all the feature maps, from which knowledge is decoded to regress the velocity model. (d) We optimize the parameters of the Encoder, Generator, and Decoder by minimizing the \emph{$L_2$} and maximizing the \emph{MSSIM} metrics to bring the output closer to the ground truth. See Sec.~\ref{sec:implement} for the details.}
\caption{Example of data interpolation considering one gather of the synthetic dataset. (a) depicts the original gather $\I$, cropped in its central portion with size $450 \times 300$; (b) reports the corrupted gather $\Ihole$, with $50\%$ of randomly missing traces; (c) shows the reconstructed gather $\Ihat$; \red{(d) depicts the reconstruction error, which is the difference between reconstructed and original shot gather.}}
\caption{Pictorial representation of a Typed Graph Network from the perspective of a vertex $v$. A set of embeddings is received from vertices in its \colorbox{GreenYellow!60!white!40}{incoming neighbourhood}, a message is computed from each embedding with the message function \colorbox{red!40!white!60}{$\mu$} and messages are aggregated and fed to the update function \colorbox{Cyan!40!white!60}{$\phi$}, which produces an updated embedding for $v$. Simultaneously, $v$ sends messages to vertices in its \colorbox{Emerald!30!white!70}{outgoing neighbourhood}, which will undergo the same update process.}
\caption{Comparison of state-of-the-art detectors on the MSCOCO18 test-dev set. More results can be found \href{https://competitions.codalab.org/competitions/5181\#results}{on the MSCOCO evaluation website (test-dev2018)}. For each metric, the best results are marked with {\color{red}\underline{\textbf{red}}}. }
\caption{Comparisons of max F-measure ($F_{\beta}$-max) and MAE values. Results on both VGG \cite{VGG} and ResNet \cite{resnet:He2015Deep} backbones are reported, and the top two results are shown in \textcolor{aa}{\textbf{red}} and \textcolor{bb}{\textbf{blue}} colors, respectively. Best viewed in color.}
\caption{Evaluation results of $F_{\beta}$-max and MAE with different modules on 6 datasets for ablation studies. We report these results on both VGG \cite{VGG} and ResNet \cite{resnet:He2015Deep} backbones. The top two results are highlighted in \textcolor{aa}{\textbf{red}} and \textcolor{bb}{\textbf{blue}} colors, respectively. Best viewed in color.}
\caption{Qualitative results of sRb-VAE and Rb-VAE applied to conditional image generation. See Sec. (\textcolor{red}{5.4}) of the paper for details.}
\caption{Qualitative results of sRb-VAE and Rb-VAE applied to visual attribute transfer. See Sec. (\textcolor{red}{5.4}) of the paper for details.}
\caption{GZSL results on CUB, AWA2 and aPY. ts = test classes (unseen classes), tr = train classes (seen classes), H = harmonic mean. The accuracy is class-average Top-1 in \%. The highest accuracy is in \textcolor{red}{red} color and the second is in \textcolor{blue}{blue} (better viewed in color).}
\caption[caption]{DA classification results. M = MNIST, U = USPS, S = SVHN. The highest accuracy is in \textcolor{red}{red} color and the second is in \textcolor{blue}{blue} (better viewed in color). Self-ensembling, unlike other methods, leverages data-augmentation and reports accuracy numbers that are evidently higher than those obtained in the fully supervised case for $U\rightarrow M,\,S \rightarrow M$. \label{tab:da}}
\caption{\textbf{Concrete autoencoder architecture and pseudocode.} (a) The architecture of a concrete autoencoder consists of a single encoding layer, shown in \textcolor{brown}{brown}, and arbitrary decoding layers (e.g. a deep feedforward neural network), shown in \textcolor{teal}{teal}. The encoder has one neuron for each feature to be selected. During the training phase, the $i^\mathrm{th}$ neuron $u^{(i)}$ takes the value $\xb^\top \mb^{(i)}, \; \mb^{(i)} \sim $ Concrete$(\alphab^{(i)}, \mathrm{T})$. During test time, these weights are fixed and the element with the highest value in $\alphab^{(i)}$ is selected by the corresponding $i^\mathrm{th}$ hidden neuron. The architecture of the \textit{decoder} remains the same during train and test time, namely that $\hat{\xb} = f_\theta(\ub)$, where $\ub$ is the vector consisting of each $u^{(i)}$. (b) Here, we show pseudocode for the concrete autoencoder algorithm, see Appendix \ref{appendix:pseudocode} for more details.}
\caption{\textbf{Annealing schedules for the concrete autoencoder.} Here, we show the effect of different annealing schedules on a concrete autoencoder trained on the MNIST dataset with $k=20$ selected features. At each epoch, we plot the temperature in \textcolor{red}{red}, the average of the largest value in each concrete sample $\mb^{(i)}$ in black, as well as the reconstruction error (using linear regression with the top $k=20$ features on validation data), shown in \textcolor{blue}{blue}. If the temperature is kept high, the concrete samples do not converge to individual features, and the reconstruction error remains large (top left). If the temperature is kept low, the samples immediately converge to poor features, and the error remains large (top right). If the temperature is exponentially decayed (the annealing schedule we use), the samples converge to informative features, and the reconstruction error reaches a suitable minimum (bottom left). Finally, if the temperature is dropped abruptly, the samples converge, but the error is suboptimal (bottom right).}
\caption{\textbf{Imputation errors of concrete autoencoders and landmark genes.} (a) Here, we show the mean-squared error of the imputation task using both the 943 landmark genes (\textcolor{red}{red}) and the 943 genes selected by the concrete autoencoder (\textcolor{blue}{blue}) on the test set. The task is to impute the expression of all 10,463 genes. We observe about a 3\% reduction (note that the y-axis begins at 0.20) of the reconstruction error when using the genes selected by the concrete autoencoders (CAE) across all architectures. These are results averaged over three trials. Standard deviation bars are shown but were very low, as the final imputations were very similar across all trials. (b) We train the CAE with different numbers of selected features, and calculate the MSE using linear regression on the test set. We find that we can achieve a similar MSE to the landmark genes using only around 750 genes, a 20\% reduction in the number of genes measured.}
\caption{An example of the super-node concept used in $k$-concurrent community detection is shown for partitioning into 4 parts. Two super-nodes $I$ and $J$, each consisting of four subnodes ($x_{i/j,k}$), are connected by a super-edge Q$_{I,J}$. Internal edges \textcolor{blue}{Q$^{I/J}_{l,m}$}, where $l,m \in \{1,\dots,4\}$, are set to enforce that only one subnode is equal to ``1'' after the annealing. The super-edge Q$_{I,J}$ is shown with connections between corresponding subnodes.}
\caption{Mean squared error (a), mean standard error of the estimator (b), and empirical coverage (c) in the simulation study. The nominal confidence level ({\color{red} - - - }) in panel (c) is 0.95.}
\caption[]{ A tree representing a metric and the objects of an assignment instance associated to its nodes. In this instance $A=\{a_1,a_2,a_3,a_4,a_5\}$ and these objects are labelled $(t,t,w,w,w)$. Similarly the five objects in $B$ are labelled $(s,s,t,w,x)$. The objects are associated to tree nodes by the map $\varrho$, and \tikz\node[setA](){}; denotes the elements of the set $A$ and \tikz\node[setB](){}; of $B$, respectively.}
\caption{Performance of subsets of varying size on the \cifarhundred dataset. Each point is an average over $10$ trials and the vertical bars denote standard deviation.}
\caption{Example of variation between images in the same redundant group compared to variation across different redundant groups in the \cifarhundred dataset. Each column contains a specific class of images. In contrast to Figure \ref{fig:cifar_redundant}, the images within each redundant group show much more variation. The groups were found when retaining a 90\% subset, and retraining on only the selected images (in green boxes) and discarding the rest had a negative impact on test performance.}
\caption{Number of redundant groups of various sizes in the \cifarhundred dataset when finding a 90\% subset for two classes. Note that the y-axis is logarithmic. }
\caption{Average dissimilarity to the retained sample across redundant groups (clusters) of size greater than 1. We report the class-wise mean for 3 classes as well as the average over the entire dataset. All clusters were created to find a subset of 90\% the size of the full set. We can observe that the average dissimilarity is about an order of magnitude higher for the \cifarhundred dataset, indicating that there is more variation in the redundant groups. }
\caption{GLUE test set results scored using the GLUE evaluation server. The number below each task denotes the number of training examples. The state-of-the-art results are in \textbf{bold}, and the results on par with or surpassing human performance are in {\color{blue}\textbf{bold}}. MT-DNN uses BERT\textsubscript{LARGE} to initialize its shared layers. All the results are obtained from \href{https://gluebenchmark.com/leaderboard}{https://gluebenchmark.com/leaderboard} on February 25, 2019. Model references: $^1$:\protect\cite{wang2018glue}; $^2$:\protect\cite{gpt2018}; $^3$:\protect\cite{phang2018sentence}; $^4$:\protect\cite{bert2018}.}
\caption{GLUE test results, which are scored by the GLUE evaluation server. The number below each task denotes the number of training examples. The state-of-the-art results are in \textbf{bold}. MT-DNN uses BERT\textsubscript{LARGE} for its shared layers. All the results are obtained from \href{https://gluebenchmark.com/leaderboard}{https://gluebenchmark.com/leaderboard}.}
\caption{ Comparison of samples generated from two generative models on the Yelp reviews dataset. The standard model struggles with repetitions of the same context or words (in \textcolor{azure(colorwheel)}{blue}), yielding non-coherent text. A hierarchical decoder with multi-layered latent variables eliminates redundancy and yields more coherent text planned around focused concepts.}
\caption{Pearson correlation matrix for selected keywords from the burglary and robbery reports. Three groups of keywords have been selected: crime descriptions ({\it burglary}, {\it robbery}, {\it carjacking}, {\it stole}, {\it jewelry}, {\it arrestee}, {\it jail}, {\it shot}), racial descriptions ({\it black\_male}, {\it black\_males}), and their comparisons ({\it black}, {\it male}, {\it males}). The correlations between crime descriptions and racial descriptions are highlighted by a {\bf black} box, and the correlations between crime descriptions and comparisons of racial descriptions are highlighted by a {\color{amethyst}{\bf purple}} box.}
\caption{\textit{Top} Power consumption ({\protect\tikz[baseline=-0.5ex] \protect\draw[line width=0.5mm, black] (0,0) -- (0.3,0);}), skeletal density ({\protect\tikz[baseline=-0.5ex] \protect\draw[line width=0.5mm, red] (0,0) -- (0.3,0);}) and envelope density ({\protect\tikz[baseline=-0.5ex] \protect\draw[line width=0.5mm, blue] (0,0) -- (0.3,0);}) as a function of mixing time for a typical model chocolate formulation with $\phi_{0} = 0.55$ ($\equiv$~74 wt.\%). Red box: the density of the grey shaded cluster is the skeletal density. Blue box: the average density inside the black dashed circle is the envelope density. The red dashed line denotes the time at which the second shot of lecithin is added, marking the transition from dry to wet conche. \textit{Bottom} Visual appearance of samples taken out of the planetary mixer at various stages of the conche. Letter labels correspond in the upper and lower panels. Scale bars are \SI{10}{\mm}. Granule size increases A--E; by F, the granule size has diverged to the size of the system.}
\caption{Model chocolate flow curves. ({\color{black}\CIRCLE}): $\sigma^{\star}/\sigma_{\textrm{a}} \ll 1$, crumb conched with \SI{0.83}{\percent} lecithin ($\phi_0 = 0.55$). ({\color{red}\CIRCLE}): $\sigma^{\star}/\sigma_{\textrm{a}} \gg 1$; as before, but with \SI{1.2}{\percent} PGPR. Intermediate curves for intermediate PGPR contents. Dashed lines guide the eye to high-shear viscosity at $\sigma > \sigma_{\textrm{frac}}$.}
\caption{(a) Relative high shear viscosity of chocolate suspensions {\it vs} solid volume fraction. ({\color{red}\CircPipe}): $\eta_{\textrm{fc}}^{[\phi_0]}$, suspensions fully conched in the planetary mixer. \textit{Filled circles}: Diluted fully-conched suspensions: ({\color[RGB]{255,173,0}\CIRCLE}): diluted from $\phi_0 = 0.596$. ({\color[RGB]{0,220,0}\CIRCLE}): diluted from $\phi_0 = 0.586$. ({\color[RGB]{0,120,221}\CIRCLE}): diluted from $\phi_0 = 0.576$. ({\color[RGB]{88,0,159}\CIRCLE}): diluted from $\phi_0 = 0.536$. Matching-color vertical dotted lines: $\phi_{\rm m}$ from fitting \autoref{eq:krieger-dougherty} to these four data sets. Thin red curves: \autoref{eq:krieger-dougherty} with $\lambda = 1,73$ consistent with single open red circle data points. \textit{Inset} Frictional jamming $\phi_{\textrm{J}}$ of chocolate suspensions as a function of the conched volume fraction $\phi_0$. Symbols as in main figure. (b) Replotted data of Lewis and Nielsen \cite{lewis1968} for 30-\SI{40}{\micro\meter} glass spheres suspended in Aroclor, with each data set fitted to \autoref{eq:krieger-dougherty}; aggregate size increases from $\aleph$ and then A to K.}
\caption{The expansion terms of each expression. Terms that are expansion parameters in the sense of Eq.~\ref{eq:def-exp-param} are denoted with a green check ({\color{green}\checkmark}), while terms that are not are denoted with a red cross ({$\color{red}\times$}). Note that Madrid also refers to the expressions AJLOS(31) and FL, which are all generally quite similar. AM refers to both AM$^2$ and AM$^{5/2}$, and DMP refers to both DMP$^0$ and DMP$^1$.}
\caption{ (a) The majorization lattice $E_4$ on 4 variables. (b) The 27 complete simple games $C_4$ on 4 variables. The symbol $\vee$ denotes an element that is join-reducible. {\color{red}Red} and {\color{blue} blue} denote the image under the first map $E_4\to C_4$ and {\color{blue} blue} in particular denotes some elements sufficient for the second map $C_4\to C_3$ to be onto. (c) The 10 complete simple games $C_3$ on 3 variables. }
\caption[]{Illustration of the design criteria in two-dimensional parametric space. The plot shows a generic exact CR (shaded), the over-approximating orthotope (dashed) of the exact CR identified using the anchor points $\ve \pi$ (\tikz\draw[line width=0.4 mm,blue, fill={rgb,255:red,180; green,170; blue,255}] (0,0) \Square{10pt};). \tikz\draw[line width=0.3 mm, white,fill={rgb,255:red,255; green,100; blue,255}] (0,0) circle (.75ex); mark the points that give the maximal Euclidean distance of two points in the CR.}
\caption[]{The exact (solid) and linearized (dashed ellipsoid) CRs using $N = 4$ as obtained by classical and exact A designs. The plot shows the over-approximating orthotopes (dashed) of the exact CRs identified using the anchor points $\ve \pi$ represented by \tikz\draw[line width=0.25 mm,white, fill=green] (0,0) \Square{10pt};.}
\caption[]{The exact (solid) and linearized (dashed ellipsoid) CRs using $N = 4$ as obtained by classical, ellipsoidal and exact D designs. The plot shows the outer-/inner-approximating ellipsoids of the exact CRs (dotted and dash-dotted lines, respectively). \tikz\draw[line width=0.25 mm, white,fill={rgb,255:red,255; green,204; blue,255}] (0,0) circle (.75ex); and \tikz\draw[line width=0.25 mm, white,fill=yellow] (0,0) circle (.75ex); are the intersection points for the outer-/inner-approximating ellipsoids and the exact CRs.}
\caption[]{The exact (solid) and linearized (dashed ellipsoid) CRs using $N = 4$ as obtained for classical and exact E OED. \tikz\draw[line width=0.3 mm, white,fill=green] (0,0) circle (.75ex); mark the points used to calculate the Euclidean distance of the CRs.}
\caption[]{Mean and variance of $\phi_A$, $\hat\phi_D$ and $\phi_E$ for $1{,}000$ random experiments with $N=4$ noisy measurements at $\mathcal U^*$ of classical (\tikz\draw[line width=0.25 mm, red,fill=red] (0,0) \LongRrectangle{10pt};), ellipsoidal (\tikz\draw[line width=0.25 mm, green,fill=green] (0,0) \LongRrectangle{10pt};) and the exact (\tikz\draw[line width=0.25 mm, blue,fill=blue] (0,0) \LongRrectangle{10pt};) designs. Dashed line signifies the performance of the nominal design.}
\caption[]{The exact (solid) and linearized (dashed ellipsoid) CRs using $N = 2$ as obtained for classical and exact A OED. The plot shows the over-approximating orthotopes (dashed) of the exact CRs identified using the anchor points $\ve \pi$ represented by \tikz\draw[line width=0.25 mm,white, fill=green] (0,0) \Square{10pt};.}
\caption[]{The exact (solid) and linearized (dashed ellipsoid) CRs using $N = 2$ as obtained for classical, ellipsoidal and exact D OED. The plot shows the outer-/inner-approximating ellipsoids of the exact CRs (dotted and dash-dotted lines, respectively). \tikz\draw[line width=0.25 mm, white,fill={rgb,255:red,255; green,204; blue,255}] (0,0) circle (.75ex); and \tikz\draw[line width=0.25 mm, white,fill=yellow] (0,0) circle (.75ex); are the intersection points for the outer-/inner-approximating ellipsoids and the exact CRs.}
\caption[]{The exact (solid) and linearized (dashed ellipsoid) CRs using $N = 2$ as obtained for classical and exact E OED. \tikz\draw[line width=0.3 mm, white,fill=green] (0,0) circle (.75ex); mark the points used to calculate the Euclidean distance of the CRs.}
\caption[]{Mean and variance of $\phi_A$, $\hat\phi_D$ and $\phi_E$ for $1{,}000$ random experiments using $N=4$ noisy measurements at $\mathcal U^*$ of classical (\tikz\draw[line width=0.25 mm, red,fill=red] (0,0) \LongRrectangle{10pt};), ellipsoidal (\tikz\draw[line width=0.25 mm, green,fill=green] (0,0) \LongRrectangle{10pt};) and the exact (\tikz\draw[line width=0.25 mm, blue,fill=blue] (0,0) \LongRrectangle{10pt};) OED. Dashed line signifies the performance of the nominal design.}
\caption{\small Averaged BER along $\EbNo$ for turbo LMMSE (\textcolor{blue}{$\triangledown$}), BEP/KSEP \cite{Santos18,Santos18c} (\textcolor{mycolor3}{$\circ$}), FEP \cite{Santos18} (\textcolor{mycolor}{$\diamond$}), BP-EP \cite{Sun15} (\textcolor{mycolor1}{$+$}), D-BEP/D-KSEP with $T=3$ (\textcolor{red}{$\square$}) and $T=5$ (\textcolor{auburn}{$\circ$}) and D-FEP (\textcolor{mycolor4}{$\diamond$}) equalizers, with $\ntaps=7$ for (a) $64$-QAM and (b) $128$-QAM modulations. }
\caption{Model outline. \textit{Input}: (i) a sequence of vectors representing the words and (ii) a sequence of vectors which serve to highlight predicate and argument. \textit{Processing}: 1.\ element-wise multiplication of the two sequences ($\bigotimes$); 2.\ generation of hidden states with forward and backward Bi-LSTM reads (\includegraphics[angle=180,origin=c,scale=0.4]{pdfresizer.pdf} \& \includegraphics[scale=0.4]{pdfresizer.pdf}); 3.\ self-attention mechanism builds a new sequence of hidden states by letting every hidden state attend to every other hidden state; 4.\ concatenation of the hidden states to generate a vector representation ($[;]$). \textit{Output}: (i) use vector representation to output Likert scale auxiliary predictions (FF$_{ReLU}$) and (ii) concatenate auxiliary predictions to the vector representation ($[;]$) to finally (iii) compute the multi-label predictions at the top level ($|P|\cdot$FF$_{Softmax}$; $P$: set of proto role properties).}
\caption{Schematic illustration of the individual, disseminated, and composite effects defined by \citet{buchanan2018assessing} in a cluster of size 3. Cluster $k$ is shown as a rectangle and subjects are shown as circles, with subject $i$ labeled. Gray shading {\protect\tikz[baseline=-0.5ex]{\protect\node[minimum size=0.5cm,inner sep=0.05cm,fill=gray,circle,draw]{};}}\ indicates a treated index subject, and a hatched pattern {\protect\tikz[baseline=-0.5ex]{\protect\node[minimum size=0.5cm,inner sep=0.05cm,pattern=north east lines,pattern color=gray,circle,draw]{};}}\ indicates a subject exposed to treatment via dissemination from a treated cluster member. The individual effect $RD^I$ compares the potential outcome of subject $i$ when treated with no dissemination from group members, versus the potential outcome of subject $i$ when untreated with dissemination from the treated index.}
\caption{A random 100,000-gon, colored so that the first edge is a bright orange ({\color[RGB]{253,95,0}$\bullet$}), the 50,000th is a light green ({\color[RGB]{118,179,157} $\bullet$}), and the edge colors between are interpolated sinusoidally.}
\caption{\label{tab100} The $\square_{\gamma Z} (E)\ \big (\times 10^{-3}\big)$ corrections evaluated for the measured proton PV asymmetry $A^{p}_{PV}$ at forward angles.} \begin{ruledtabular} \pgfplotstabletypeset[every head row/.style={before row=\toprule, after row=\hline}, every last row/.style={after row=\bottomrule}, col sep=space, header=false, display columns/0/.style={column name={Experiment}}, display columns/1/.style={column name={$Q^2$ (GeV$^2$)}}, display columns/2/.style={column name={$E$ (GeV)}}, display columns/3/.style={column name={$\square_{\gamma Z} (E)\ \big (\times 10^{-3}\big)$}
\caption{$\psi$ vs $r$ for $D_s$ meson (Linear parent Coulomb perturbation (Dalgarno))}{\includegraphics[scale=0.802]{delg4.eps}}
\caption{$\psi$ vs $r$ for $D$ (Linear parent (VIPT))}{\includegraphics[scale=0.802]{VIPT2.eps}}
\caption{$\psi$ vs $r$ for $D_s$ (Linear parent (VIPT))}{\includegraphics[scale=0.802]{viptcoul.eps}}
\caption{Additional \aastex\ symbols}
\caption{ Error in the instantaneous velocity field as a function of time for LES1 (\solidline), LES2 (\dashedline), and LES4 (\dotdashline) compared with DNS in (a) semi-log scale and (b) log-log scale (LES1 excluded). \label{fig:error_full}}
\caption{ Error in (a) mean velocity profile and (b) streamwise turbulence intensity as a function of time for LES1 (\solidline), LES2 (\dashedline), and LES4 (\dotdashline) compared with DNS. \label{fig:error}}
\caption{ Mean velocity profile (a) before and (b) after saturation of error for LES1 (\solidline), LES2 (\dashedline), and LES4 (\dotdashline) compared with DNS (\dotline). \label{fig:Umean}}
\caption{ Streamwise turbulence intensity profile (a) before and (b) after saturation of error for LES1 (\solidline), LES2 (\dashedline), and LES4 (\dotdashline) compared with DNS (\dotline). \label{fig:urms}}
\caption{ Galaxy halo masses for SDSS galaxies predicted by ML compared to traditional methods. The y-axis shows galaxy mass predictions from the \XGBoosts algorithm that was trained on mock catalogues. The x-axis shows mass estimates for galaxies through \HAMs (\textit{left panel}) and \DYNs (\textit{right panel}). The blue shading shows the frequency of galaxies in two-dimensional bins, where the number of galaxies in each bin is normalised by the value for the bin containing the most galaxies. Yellow solid lines and error bars correspond to the mean and standard deviation of \mpreds in bins of \Mhams or \Mdyn. The dashed black lines show the one-to-one relation between mass estimates. }
\caption{(a) Wide-angle 2D x-ray diffraction of a bundle of white silk \emph{Bombyx mori} fibers. Inset shows an optical microscopic image of \blue{a convolved} silk \blue{fiber} bundle. The silk bundle was made of degummed single strand silk fibers. The long axis of the fibers was predominantly vertical. (b) Optical image of white silk fibers through optically aligned polariser-analyser (high transmission) setup under white light illumination using a Nikon MPlan 10$\times$ DIC objective lens with numerical aperture $NA = 0.25$. }
\caption{(a) Far-field optical image of longitudinal slices of white silk embedded in an epoxy sheet. \blue{The inset shows schematics of a lateral silk slice composed of $\beta$-sheets interconnected with $\alpha$-coils and amorphous segments.} (b) Optical and topographic images of the silk slice shown in (a) measured with scattering near-field microscopy (SNOM; neaspec). Markers in optical image indicate locations where spectra were acquired.}
\caption{AUC values for recently published real-time trackers using Siamese networks. Values highlighted in \textcolor{red}{\textit{red}}, \textcolor{blue}{\textit{blue}}, and \textcolor{green}{\textit{green}} denote the first, second, and third place on each benchmark, respectively.}
\caption{Schematic diagram of the ground states of the Dicke model for the normal and superradiant phases. (a) The energy eigenstates of the normal phase are represented by the phonon Fock states, $|n\rangle$, and the spins oriented along the $x$-axis. If $B^{x}>\delta$ {\color{red}(plotted here)}, the low-lying excitations are phonon-like, and if $B^{x}<\delta$ {\color{red} (not shown)} they are represented by spin flips along the $x$-axis. (b) The energy eigenstates in the superradiant phase, where the phonons are represented by displaced Fock states, $\hat{D}(\alpha)|n\rangle$, and the spins are aligned in the $\pm z$-direction. In this region, the low-lying excitations are phonon-like if $g^{2}/\delta > \delta$ {\color{red} (plotted here)} and are represented by spin flips along the $z$-axis if $g^{2}/\delta < \delta$ {\color{red} (not shown)}. {\color{red} The symbol $\hat e_i$ denotes the unit vector in the $i$ direction.} \label{fig: phases} }
\caption{Ramp profiles for the time-dependent transverse field in the Dicke model. We show the LA ramp and the bang-bang ramp. The LA and bang-bang ramps have been optimized to produce the highest ground-state fidelity for a simulation time less than or equal to 2 ms. {\color{red} The theoretical and experimental bang-bang ramps are optimized at about 1~ms (open circle). The experimental data was sampled out to 2 ms with the same quench field.} \label{fig: ramps} }
\caption{Comparison of experimental data and theory estimates for the optimal quench of the bang-bang ramp for a system of 75 ions with coupling constant $J=2\pi\times 0.875$~kHz, and detuning from the COM mode of $\delta=-2\pi\times1$~kHz. The spins are initialized to the state $|-N/2\rangle_{x}$ and the COM mode is in a thermal state with an initial occupation of $\bar{n}\approx 6$. Figures (a) and (b) show plots of the experiment and theory respectively for the total spin projections in the $x$, $y$, and $z$ directions. Figure (c) shows the mean value of $\langle|S_{z}|\rangle/N$. {\color{red} A noticeable growth of $\langle|S_{z}|\rangle$ is observed after the initial quench. Figure (d) shows the mean value of $\langle S_{x}\rangle/N$ which exhibits fast demagnetization. For this observable, however, dephasing plays a non-negligible role and the disagreement between theory and experiment becomes larger. The statistical error bars are on the order of the size of the data points.} \label{fig: expt} }
\caption{Comparison of experimental data and theory estimates for the LA ramp for a system of 76 ions with coupling constant $J=2\pi\times 0.875$~kHz, and detuning from the COM mode of $\delta=-2\pi\times1$~kHz. The spins are initialized to the state $|-N/2\rangle_{x}$ and the COM mode is in a thermal state with an initial occupation of $\bar{n}\approx 6$. Figures (a) and (b) represent false-color plots of the experiment and theory respectively for the total spin projections in the $x$, $y$, and $z$ directions. Both of the $S_{z}$ plots show {\color{red} good} qualitative agreement. Figure (c) shows the values of $\langle|S_{z}|\rangle/N$. {\color{red} A noticeable growth of $\langle|S_{z}|\rangle$ is observed in the superradiant regime. Figure (d) shows the mean value of $\langle S_{x}\rangle/N$, which exhibits fast demagnetization. As in the bang-bang case, dephasing plays a non-negligible role in this observable, and the disagreement between theory and experiment becomes larger. The statistical error bars are on the order of the size of the data points.} \label{fig: la_exp} }
\caption{\blue{Results from control and experimental groups (average)}}
\caption{Performance of different methods on two datasets. The best two results are shown in \textcolor{red}{red} and \textcolor{blue}{blue}, respectively.}
\caption{\label{FIGclsPerformance}Parametrisation of the performance of the trained \cls method. \Subref{FIGclsDist}~Distributions of the \cls metric, \predCls, for the signal and background samples, as indicated. \Subref{FIGclsTs}~The parametrised \cls test statistic, \tsCls, (see \autoref{eqTsCls}) as a function of \predCls, before and after the correction for trials. The dashed-dotted horizontal line highlights the value, $\ts = 25$.}
\caption{(Color online) Structural and electronic properties of H-MgB$_2$. (a) Top view of the crystal structure. (b) The electronic band structure and density of states (DOS). The $\sigma$ and $\pi$-H{\blue \textit{s}} bands crossing $E_{\mathrm{F}}$ are also indicated. (c) The norm of the wave function of the $\pi$-H\textit{s} state at $E_{\mathrm{F}}$ obtained along the path $\Gamma$-K.}
\caption{(color online) $\kappa$'s determined by different authors from near-wall $U^+(y^+)$ profiles ($\bullet$) and from the centerline velocity $U^+_{\mathrm{CL}}(R^+)$ ($\blacksquare$) versus maximum $R^+$ of the respective experiment: \textcolor{cyan}{$\bullet$}, \cite{Monty_thesis}; \textcolor{blue}{$\bullet$}, \cite{zanoun2007}; \textcolor{green}{$\bullet$}, \cite{Furuichi15}; \textcolor{orange}{$\blacksquare$}, \cite{FioriniPhD}; \textcolor{red}{$\blacksquare$}, \cite{TSFP17}; \textcolor{brown}{$\blacksquare$}, \cite{Nikuradse-pipe}; \textcolor{violet}{$\bullet\, \blacksquare$}, \cite{ZS98} and \textcolor{purple}{$\bullet\, \blacksquare$}, \cite{MLJMS04}. The gray boxes emphasize the separation of values obtained from low and high Reynolds number experiments.}
\caption{(color online) (a) Difference $\Delta U^{+\,\mathrm{(Z\&S)}}_{\mathrm{Pitot}}$ (equ. \ref{Zagcorr}) between the uncorrected pipe velocities $U^{+\,\mathrm{(Z\&S)}}_{\mathrm{uncorr}}$ of \cite{ZS98} and the Musker-Chauhan fit $U^{+\,\mathrm{(ZPG)}}_{\mathrm{inner}}$ for the ZPG TBL inner expansion (equ. \ref{Musker}), scaled by $(d^+)^{0.9}(R^+)^{-0.4} \equiv 0.0213 (R^+)^{0.5}$. Data range $0 \leq y^+ \leq \mathrm{min}[300, 0.1\,R^+]$ containing data up to $R^+\approx 4\times 10^4$. Data symbols as in fig. \ref{Fig:Plog1} except for the lowest $R^+=851$ identified by \textcolor{green}{$\Box$}. \textcolor{red}{\textbf{---}}, fit by equation (\ref{Zagcorr}). \newline (b) Scaled difference $\Delta U^{+\,\mathrm{(Z\&S)}}_{\mathrm{Pitot}}$ versus $R^+$ for the Z\&S data with $d/R = 0.0139$. Solid and open symbols correspond to $y^+ \leq 50$ and $50 < y^+ \leq 300$, respectively. The McKeon data of panel (c) for the same $d/R = 0.0139$ are included as purple triangles. \newline (c) Analogous scaled difference $\Delta U^{+\,\mathrm{(McK)}}_{\mathrm{Pitot}}$ between the data from Appendix C of \cite{McK_thesis} and $U^{+\,\mathrm{(ZPG)}}_{\mathrm{inner}}$ versus $y^+$, for $R^+ = 1825$ and Pitot O.D.'s of 0.3mm ($\blacksquare$), 0.5mm (\textcolor{green}{$\blacklozenge$}), 0.9mm (\textcolor{purple}{$\blacktriangle$}) and 1.8mm (\textcolor{blue}{$\bullet$}); corresponding open symbols, data for $R^+ = 3328$.}
\caption{(color online) Analysis of the overlap layer of the 26 Superpipe profiles of \cite{ZS98}, corrected according to equations (\ref{Zagcorr}) and (\ref{Hama}), for $851 \leqslant R^+ \leqslant 528000$. \textcolor{green}{$\blacksquare$}, $R^+ < 3\times 10^3$ ; $\bullet$, $3\times 10^3 < R^+ < 10^4$ ; \textcolor{blue}{$\blacktriangle$}, $10^4 < R^+ < 5\times 10^4$ ; \textcolor{red}{$\blacklozenge$}, $5\times 10^4 < R^+ < 2.5\times 10^5$ ; $\times$, $2.5\times 10^5 < R^+$ where roughness effects become significant. Corresponding large symbols mark the centerline fitted by equ. (\ref{CLlog}) (\textbf{- - -}). \newline (a) \textcolor{green}{$\blacksquare$} $\bullet$ \textcolor{blue}{$\blacktriangle$} \textcolor{red}{$\blacklozenge$} $\times$, $(U^{+\,\mathrm{(Z\&S)}} - U^{+\,\mathrm{(ZPG)}}_{\mathrm{inner}})$ [equ. (\ref{Musker})]. \textcolor{violet}{$\blacksquare$}, $U^{+\,\mathrm{(Z\&S)}}$, taken from fig. 17 of \cite{ZS98}, minus $U^{+\,\mathrm{(ZPG)}}_{\mathrm{inner}}$. $- - -$, $(U^{+\,\mathrm{(P)}}_{\mathrm{CL}} - U^{+\,\mathrm{(ZPG)}}_{\mathrm{inner}})$ ; \textcolor{orange}{\textbf{---}}, $(U^{+\,\mathrm{(P)}}_{\mathrm{inner}} - U^{+\,\mathrm{(ZPG)}}_{\mathrm{inner}})$ [equ. (\ref{Pinner})]. \newline (b) \textcolor{green}{$\blacksquare$} $\bullet$ \textcolor{blue}{$\blacktriangle$} \textcolor{red}{$\blacklozenge$} $\times$, $(U^{+\,\mathrm{(Z\&S)}} - U^{+\,\mathrm{(P)}}_{\mathrm{cp}})$ [equ. (\ref{Pcp})]. $- - -$, $(U^{+\,\mathrm{(P)}}_{\mathrm{CL}} - U^{+\,\mathrm{(P)}}_{\mathrm{cp}}) = 1.56$ ; \textcolor{orange}{\textbf{---}}, $(U^{+\,\mathrm{(ZPG)}}_{\mathrm{inner}} - U^{+\,\mathrm{(P)}}_{\mathrm{cp}})$ ; $\cdot \cdot \cdot$, departure $L^{+\,\mathrm{(P)}}$ (equ. \ref{Lpart}) from the log-law for the last profile in each group.}
\caption{(color online) (a) \textcolor{red}{---}, turbulent viscosity $N_T \equiv \mu_T^+/R^+$, calculated from the DNS of \cite{ElKhoury2013} for $R^+=999$; \textcolor{red}{$-\cdot -$}, $0.436 Y$; \textcolor{red}{$-\cdot \cdot -$}, fit (\ref{NTfit}). (b) DNS mean velocity profiles of \cite{ElKhoury2013} for $R^+=550$ (\textcolor{blue}{---}) and $R^+=999$ (\textcolor{red}{---}) minus $U^{+\,\mathrm{(P)}}_{\mathrm{inner}}$ (equ. \ref{Pinner}); - - -, corresponding profiles minus $U^{+\,\mathrm{(ZPG)}}_{\mathrm{inner}}$ (equ. \ref{Musker}); $- \cdot \cdot -$, $L^{+\,\mathrm{(P)}}$ fitted by equation (\ref{Lpart}) for $R^+ = 550$ and 999; $- \cdot -$, limit $2.43 Y$ at $R^+=\infty$ (without the slope decrease towards the CL).}
\caption{(color online) (a) The wake function $U^{+\,\mathrm{(P)}}_{\mathrm{wake}}(Y)$ of equation (\ref{Pcomp2}) with the same data and color coding as in figure \ref{Fig:Plog1}; \textcolor{green}{- - -, ---, - - -}, asymptotically linear part $L^{+\,\mathrm{(P)}}$ in equation (\ref{Lpart}) for $R^+$ = 850, 2000 and $\infty$; $\blacklozenge$ , $U^{+\,\mathrm{(P)}}_{\mathrm{CL}}$ [equ. \ref{CLlog}] minus $U^{+\,\mathrm{(P)}}_{\mathrm{inner}}(y^+ = R^+)$. (b) $U^{+\,\mathrm{(P)}}_{\mathrm{wake}}(Y) - L^{+\,\mathrm{(P)}}(Y;R^+=2000)$. (c) Complete fit $U^{+\,\mathrm{(Z\&S)}} - U^{+\,\mathrm{(P)}}_{\mathrm{comp}}$ [equs. (\ref{Pcomp}), (\ref{Pcomp2})].}
\caption{(color online) \textcolor{red}{---}, DNS mean velocity profile $U^{+\,\mathrm{(L\&M)}}$ of \cite{LM14} for $H^+=5200$ minus $U^{+\,\mathrm{(Ch)}}_{\mathrm{inner}}$ ; \textcolor{red}{- - -}, DNS profile minus $U^{+\,\mathrm{(ZPG)}}_{\mathrm{inner}}$ (equ. \ref{Musker}); \textcolor{red}{$- \cdot \cdot -$}, $L^{+\,\mathrm{(Ch)}}$ fitted by equation (\ref{LpartCh}); \textcolor{red}{$\cdot \cdot \cdot$}, $U^{+\,\mathrm{(L\&M)}} - U^{+\,\mathrm{(Ch)}}_{\mathrm{comp}}$. For an explanation of the nonphysical ``hump'' near the origin, see the comment in section \ref{sec:lin} regarding the same phenomenon in figure \ref{Fig:NT}b.}
\caption{{\small{\bf S-DOD-CNN Framework.} \textcolor{myred}{Red}, \textcolor{myblue}{blue}, and \textcolor{mygreen}{green} arrows indicate the computational flow responsible for event recognition ($e$), rigid object detection ($r$), and non-rigid object detection ($n$), respectively. For rigid and non-rigid object detection, a combined feature map is constructed by combining per-RoI feature maps while preserving the spatial locations of the RoIs within the original image.}}
\caption{\label{table:loc_performance_summary} The aggregated localization performance on the \nclt, \updrive, and \kitti datasets, showing average localization recall with \maptracking $\recallmt$, and with \gloloc $\recalllc$, and the median translation ($\transmedianerror$) and orientation ($\orientmedianerror$) accuracy. For \updrive and \kittins, \blue{planar~$\planarmedianerror$} and \blue{lateral~$\latmedianerror$} errors are shown instead of full $3DoF$ translation errors. Standard deviations are denoted by ``+/-'', and the $90$th~percentile is shown in square brackets.}
\caption{% %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% \textsf{\textbf{a}}~The larger is $\phi$, the smaller is the average strain $\gamma_{\mathrm{J}}$ to reach a shear-jammed state. % $\gamma_{\mathrm{J}}^{(1)}$ ($\circ$) and $\gamma_{\mathrm{J}}^{(2)}$ (\textcolor{red}{$\diamond $}) are strains to reach the first jammed states from the initial states and the second jammed states after stress reversals, respectively. % Only jammed results of ten simulations are plotted. % %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% % \textsf{\textbf{b}}~Mean contact number $Z$ with non-rattlers of SJ states (\textcolor{black}{$\circ$}) are almost constant for $\phi \leq 0.84$. % The lowest value of the SJ states is $ Z \approx 3.07$ (dashed line). %% These SJ states can be confirmed as fragile with the minimum values after the shear reversals (\textcolor{red}{$\triangledown$}), which are below the isostatic condition $Z_{\mathrm{iso}} = 3$. % $Z$ of unjammed states (\textcolor{blue}{$\times$}) are below but close to the plateau value near the boundary. % %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% % \textsf{\textbf{c}}~The sharp decrease of the stress anisotropy $\sigma^{xy}/P$ of jammed states (\textcolor{black}{$\circ$}) above $\phi = 0.84$ indicates the transition from shear jamming to isotropic jamming. % }
\caption{ \textsf{\textbf{a}}~Friction coefficient $\mu$ dependence of the shear jamming strain $\gamma_{\mathrm{J}}$ (\textcolor{black}{$\circ$}) at $\phi = 0.8$. Suspensions with lower frictions ($\mu = 0.2$ and below) did not reach jamming at this area fraction. % \textsf{\textbf{b}}~The contact number $Z$ (\textcolor{blue}{$\times$}) of unjammed states increases with the friction coefficient $\mu$. % However, less $Z$ (\textcolor{black}{$\circ$}) is required to realize jammed states with higher $\mu$. % % %becomes larger with smaller $\mu$. % %Friction coefficient $\mu$ dependence % % Jammed states and time-averaged unjammed states % are shown with and \textcolor{blue}{$\times$}, respectively. }
\caption{\textsf{\textbf{a}} Strains $\gamma_{\mathrm{J}}$ to reach jammed states for frictionless suspensions ($\mu = 0$). \textsf{\textbf{b}} The average contact numbers $Z$ (\textcolor{black}{$\circ$}) monotonically increase as the volume fraction $ \phi $. % SJ states may be indicated by the minimum values after the shear reversal (\textcolor{red}{$\triangledown$}) which are below the isostatic condition $Z_{\mathrm{iso}}=4$. % However, the observed range of area fractions are rather narrow. % $Z$ of unjammed states are time-averaged values (\textcolor{blue}{$\times$}). }
\caption{\label{fig:fault_cases} Effect of a fault at different locations: a) output layer bias, b) weight connecting the hidden and output layers, c) hidden layer bias, and d) weight connecting the input and hidden layers. The \textcolor{red}{red} neurons are those affected by the induced fault.}
\caption{\textbf{Evaluation of the VAE and cVAE on the Grassy-MNIST dataset.} Here, we show the results of applying both the standard VAE and cVAE to the Grassy-MNIST dataset, introduced in Fig. \ref{fig:mnist-examples}. (a) The top panel shows that when the samples are embedded into the 2-dimensional latent space of the VAE, they do not cluster by digit (here, images with the digit 0 are in \textcolor{YellowGreen}{green}, with the digit 1 are in \textcolor{teal}{blue}, and the digit 2 in \textcolor{gray}{gray}). The bottom panel shows that when the samples are embedded into the 2-dimensional $s$-latent space of the VAE, they do cluster by digit. (b) These results are also consistent, as shown in this boxplot which shows the resulting silhouette scores across 10 trials. (c) Here, we use the trained VAE to generate new samples by sweeping values in the 2-dimensional latent space. The generated samples include both digit features and background features. (d) Here, we use the trained cVAE to generate new samples by sweeping values in the 2-dimensional latent space resulting in clean hand-written digits. In Appendix \ref{appendix:high_dimension_vae}, we show that even a standard VAE with a larger latent space does not learn digit-related factors in this setting.}
\caption{\small Overview of our $3$D face modeling method. A mixture of synthetic and real data is used to train the encoder-decoder network with supervised (\textcolor{green}{green}) and unsupervised (\textcolor{red}{red}) loss. Our network can be used for $3$D dense correspondence and $3$D face reconstruction.}
\caption{Accuracy, RMSE, and parameter values of competing models for all datasets. An asterisk (*) indicates customization methods first introduced in this paper. A dash (-) indicates the model is too big to be trained in an NVIDIA 1080 Ti GPU. \textbf{Bold-face} indicates that the performance of basis-customization is significantly better ($p<0.05$) than that of a simple customization. Values colored \textcolor{red}{red} are performance weaker than that of the BiLSTM model, thus customization hurts the performance in those cases.}
\caption{Comparison of accuracies of using only the categories for classification (MLP+non-text) and the best models of Customized BiLSTM (BiLSTM+cust) and Basis-Customized BiLSTM (BiLSTM+basis-cust). Results colored \textcolor{red}{red} are accuracies worse than the MLP+non-text.}
\caption{Quantitative evaluation of trackers under different tracking challenges using AUC(\%) of success plot on OTB-50. The {\color{red}first}, {\color{green}second} and {\color{blue}third} best results are highlighted. Scenario attributes indicate changes in illumination, scale, in-plane and out-of-plane rotation, deformation, occlusion, out-of-view, clutter, low resolution, fast motion and motion blur.}
\caption{(Color online) The dependence of the drag coefficients $\lambda_{+}$ and $\lambda_{-}$ defined in Eq.~\ref{eqn:lamdaPlusMinus} on $(l/a)$ for a Janus ({\color{blue}---}), no-slip (${\color{red}- \cdot -}$) or free-slip ({\color{green}-- --}) pills. Insets: a) The laboratory $(x, y, z)$ and body $(\mathbf{e}_{\Vert},\mathbf{e}_{\bot})$ coordinate systems. b) A close up of the region around $l/a=0.355$ in which $\lambda_{-}$ changes sign.}
\caption{(Color online) Force components on a Janus pill as a function of the angle of inclination, $\theta$ relative to a uniform flow in the $z$-direction for $l/a=$ 0, 0.355, 1 and 2 ({\color{blue}---}). a) The $x$-component, $F_x$ and also results for $l/a = 0.355$ for the no-slip (${\color{red}- \cdot -}$) and free-slip ({\color{green}- - -}) cases where $F_x$ for the Janus pills vanishes. b) The $z$-component, $F_z$ and with results for $l/a=0$ that correspond to no-slip (${\color{red}- \cdot -}$) and free-slip ({\color{green}-- --}) spheres.}
\caption{(Color online) a) The torque, $T_y$ about the $y$-axis on a Janus pill with the body axis $\mathbf{e}_{\bot}$ oriented anti-parallel to the uniform external flow or the force, $F_{\bot}$ in the $\mathbf{e}_{\bot}$ direction of the body frame, experienced by a rotating Janus pill at angular frequency $\omega$ as a function of the aspect ratio, $l/a$. b) Variations of the torque about the $y$-axis on a Janus pill as a function of the angle of inclination, $\theta$ relative to a uniform flow in the $z$-direction for $l/a=$ 0, 1 and 2 ({\color{blue}---}). Both the uniform no-slip (${\color{red}- \cdot -}$) and free-slip ({\color{green}-- --}) pills experience no torque.}
\caption{ Super-resolution model frameworks based on deep learning. The trapezoids denote the up-or-down sampling operations, depending on their directions. The \textcolor{Gray}{gray} ones denote predefined upsampling operations, while the \textcolor{LimeGreen}{green} ones and \textcolor{Dandelion}{yellow} ones indicate learnable upsampling or downsampling layers, respectively. \textcolor{CornflowerBlue}{Blue} boxes represent convolutional layers, and the blocks enclosed by the dashed box represent some modules that can be stacked in the frameworks. }
\caption{ Interpolation-based upsampling methods. The \textcolor{Gray}{gray} board denotes the coordinates of pixels, and the \textcolor{CornflowerBlue}{blue}, \textcolor{Dandelion}{yellow} and \textcolor{LimeGreen}{green} points represent the initial, intermediate and final pixels, respectively. }
\caption{ Transposed convolution layer. The \textcolor{CornflowerBlue}{blue} boxes denote the input, and the \textcolor{LimeGreen}{green} boxes indicate the kernel and the output of the convolution operation. % The input feature maps (a) are expanded to twice of the original size (b), then a common convolution with stride $1$ is performed (c). }
\caption{ Sub-pixel layer. The \textcolor{CornflowerBlue}{blue} boxes denote the input, and the boxes with other colors indicate different convolution operations and different output feature maps. % The input feature maps (a) are convoluted to $s^2$ times channels (b), then reshaped to the size with twice width and height of the original size (c), where $s$ is the scaling factor. }
\caption{Summary of the average overlap scores of 13 compared methods on 16 sequences. The bold numbers in \textcolor{blue}{blue} indicate the best performance, while the numbers in \textcolor{red}{red} indicate the second best.}
\caption{Another deconing $d_{H_1}\ID$ of the icosidodecahedral arrangement (with respect to a red line {\color{red} $H_1$} at infinity)}
\caption[base flow convergence]{\label{fig:convergence_rate_vs_muratio} Dependence of the maximum error (normalized with the Cahn number) on the viscosity ratio of the laminar CAF. { The data shown as \mysquare{black} and \mycircle{red} are the proportionality constants of the solid lines in figure~\ref{fig:basic_flow_convergence}c--d, respectively, whereas the error bars quantify the difference between the solid lines (fits) and the data points. Note that additional computations were performed for even lower values of the viscosity ratio ($\hat\mu=1/50$, $1/100$, $1/200$ and $1/601$) and are also included here. The data shown as \mydowntriangle{green} are from another case with lower volume ratio, $\hat V=0.96$. The dashed lines show the approximation to the error given by eq~\eqref{eq:err_lam}, which is only valid for small $\hat\mu$.}}
\caption{Average values of the visit counts and first visit time of the trained agent for the \brownianObj{} and \fixedObj{} objects in Experiment 4, with all baselines.}
\caption{{\bf Phase diagram and exceptional points for odd elastic waves.} {\bf a.~}Phase diagram for waves in an overdamped odd elastic solid. Red curves represent the boundary outside of which active waves can be sustained. {\bf b.~}A cut ($\Gamma M K \Gamma$) through the space of wavevectors (first Brillouin zone) of a triangular lattice with generalized Hookean springs. The microscopic activity in the springs is characterized by the ratio $\abs{\frac{k^o}{k}}$ between the odd spring constant $k^o$ and the conservative spring constant $k$. The threshold for active waves varies across the Brillouin zone, with the elastic limit describing the region near $\Gamma$. The middle inset shows the regions of the Brillouin zone (light grey) in which waves propagate (for $\abs{\frac{k^o}{k}}$ corresponding to the horizontal dashed line). {\bf c.~}The eigenmodes for three relative values of the elastic moduli, showing trajectories in shear space ($S_1$ and $S_2$, cf. Fig.~3). At zero activity (\protect\markercirc), the modes correspond to longitudinal and transverse waves, whose eigenvectors are orthogonal in $S_1$-$S_2$ space. At the exceptional point (\protect\markerstar), the eigenmodes become collinear. Above the exceptional point (\protect\markersq), the eigenmodes acquire a circular polarization, performing a spiral through simultaneous rotation and attenuation in phase space. (See Supplementary Movie 3.)}
\caption{The mean cell liveness $< \! a \! >$ of a QGOL system of $100 \times 100$ cells with a starting fraction of live cells $f=0.2$ (\textcolor{red}{\pmb{$\bullet$}}) and $f=0.8$ (\textcolor{RoyalBlue}{\pmb{$\bullet$}}) is presented. The inset shows the Gaussian distribution of $< \! a \! >$. An example cell liveness distribution is presented as a gray-scale (dead=white, live=black).}
\caption{Two typical difficult cases of MM-WHS. The first column is the gold standard, and the other columns are the segmentation results and corresponding Dice scores from the methods which have been tested \zxhcolor{}{on both of the two modalities.}}
\caption[Proposed method unrolled over two time blocks and a maximum of three iterations]{ Proposed method unrolled over two time blocks and a maximum of three iterations. In block 1, src1 \tikz{\node[legend-box,fill=color-src1]{};}, corresponding to background noise, and src2 \tikz{\node[legend-box,fill=color-src2]{};} are separated. Then, in block 2, the NN receives embedding vectors for src1 and src2, extracts src1, estimates an empty mask for silent src2, and extracts the new src3 \tikz{\node[legend-box,fill=color-src3]{};}. % The embedding vectors are visualized using arrows \tikz{\pic{emb-1};}, \tikz{\pic{emb-2};} and \tikz{\pic{emb-3};} for sources 1, 2 and 3. }
\caption{Typical response of the cold RF AM. (a) Polarimeter output. The signal -- downsampled for clarity -- is fitted with an exponentially decaying sine function (in light green, overlapped to data points). Data obtained at $\omega=\SI{34.5}{\kilo\hertz}$ (value from fit: $\omega_{\text{fit}}=\SI{34.49\pm 0.01}{\kilo\hertz}$) with RF pulse of \red{B$_{\text{RF}}=\SI{22}{\nano\tesla}$} (V$_{\text{rms}}=\SI{4.6}{\milli\volt}$). (b) Examples of the polarimeter output's amplitude versus interrogation time. The dashed line is the exponential fit, obtained from the total signal. \label{fig:oscillations}}
\caption{Response of the cold RF AM. (a) Typical RF resonance, measured with the lock-in amplifier. Blue crosses: in-phase component (X). Red squares: quadrature component (Y). \red{The linewidth is \SI{230}{\hertz}}. Data were obtained with: N=\SI{1.1e8}{\text{atoms}}, and \red{B$_{\text{RF}}=\SI{11}{\nano\tesla}$ }(V$_{\text{rms}}=\SI{2.3}{\milli\volt}$). (b) Typical FFT of the cold RF AM output, and technical noise level. Signal-to-noise ratio, SNR=33.}
\caption{Response at different frequencies (in-phase component X, measured via the lock-in amplifier), demonstrating the tunability of the cold RF AM. Dashed lines are the best Lorentzian fits of experimental data. Probed frequencies are $\omega_{\text{RF}}$=\SI{15.9}{}, \SI{33.9}{}, \SI{52.0}{}, \SI{76.1}{}, \SI{99.6}{\kilo\hertz} (\SI{22.7}{}, \SI{48.4}{}, \SI{74.3}{}, \SI{142.3}{\milli\text{G}}, respectively). \red{The width of the resonances -- around \SI{1}{\kilo\hertz} in this case -- does not exhibit relevant changes in the explored magnetic range.}}
\caption{Sensitivity \red{(Eq.~\ref{eqn:sensitivity})} of the cold RF AM versus number of atoms N in the polarization gradient cooling \red{(PGC)} phase. The dashed line is a fit with power N$^{-1.17}$.}
\caption{% (a) The grey closed symbols are the speed of the fastest computer as a function of the year of commission, in floating-point operations (flop) per second. Trend lines are: \solid, flops per minute; \dashed, overnight; \dotted, three months. They are reduced by a factor of four with respect to nominal values to account for practical inefficiencies \cite{gordonbell:17}, and grow by $1000$ every 15 years. The horizontal lines are the number of operations required for simulations of isotropic turbulence, starting at the year of their initial publication: \circle, $Re_\lambda=35$ \cite{orspat72}; \dtrian, $Re_\lambda=150$ \cite{jwsr}; \trian, $Re_\lambda=650$ \cite{kaneda06}. % (b) As in (a), for turbulent channels: \circle, $Re_\tau=180$ \cite{kmm}; \dtrian, $Re_\tau=2000$ \cite{hoyas06}; \trian, $Re_\tau=5200$ \cite{lee:moser:15}. % }
\caption{% (a) Discrimination table for a set of experiments on the flow in figure \ref{fig:tur2d}. The labels in the top row are the variables used for discrimination. Those on the right column are the operations used to modify each cell, where $\bra\ket$ is the average over the cell, and the tilde denotes fluctuations with respect to that average. Values in the table are the discrimination efficiency illustrated in (b,c). % (b) Probability density function of the cell enstrophy for: \solid, most significant cells; \dashed, least significant ones. The vertical line is the optimum discrimination threshold. Conditions are as in figure \ref{fig:tur2d}, for the bottom line of the table in (a). % (c) As in (b), for the cell kinetic energy. }
\caption{A tensor network containing three edges (wires). The $U$-boxes on the left have IDs \textcolor{blue}{1} and \textcolor{blue}{2} (labels in the bottom-right corner of the boxes), while $A \in V^{(\textcolor{red}{1})} \otimes V^{(\textcolor{red}{2})}$ occurs only once (and the ID~\textcolor{blue}{1} is used) and has two legs numbered~\textcolor{red}{1} and~\textcolor{red}{2}, respectively (labels near the decorations of the $A$ box). Output vertices are marked by filled circles, whereas input vertices are marked by empty circles. The edges also carry labels, to make their encoding more clear.}
\caption{Graphical representation of the scalar $f$ from \eqref{eq:def-f}. Two copies of the $U$ and $U^*$ boxes are represented, denoted by \textcolor{blue}{1} and \textcolor{blue}{2} (labels in the bottom-right corner of the boxes). There are two outputs of $U$ (resp.~two inputs of $U^*$), $\mathbb C^n$ (denoted by \textcolor{red}{1}) and $\mathbb C^k$ (denoted by \textcolor{red}{2}), see the labels near the decorations of the first copy of the $U$ box. The six edges are also identified. }
\caption{The channel-level description of the single experiment. Purple boxes point out the initial data format and the intermediate outcomes of the workflow (as successive elaborations of the raw data). The black box encompasses the key steps of the process. The procedure is coded in MATLAB\textsuperscript\textregistered, the sequence of actions is illustrated in Figure \ref{fig:ChannelLevel-SingleExp_Illustration}. The figure illustrates the main loop (as discussed in Section \ref{subsec:channel-level_SingleExp}), at the end of which further checks are carried out on average and median values (full-set level description), in order to identify further anomalies or outliers.}
\caption{\label{fig:ex}Interactive visualization of the system output. Words tagged as {\sc bad} are shown in \textcolor{red}{\it red}, and {\sc bad} gaps are denoted as red underscores (``\textcolor{red}{\_}"). The Jupyter Notebook producing this output is available at \url{https://github.com/Unbabel/OpenKiwi/blob/master/demo/KiwiViz.ipynb}.}
\caption{MVF ({\color{C2}{orange}}) vs Naive ({\color{nice-green}{green}}) $5$\si{\second} predictions with the same initial conditions and inputs as the true trajectory ({\color{C1}{blue}}).}
\caption{Performance results (in $\%$) of our approach (VLAWE) versus several state-of-the-art methods \cite{Ionescu-KES-2017,Cheng-IJCAI-2018,Fu-ESA-2018,Hill-NAACL-2016,Iyyer-ACL-2015,Kim-EMNLP-2014,Kiros-NIPS-2015,Le-ICML-2014,Liu-IJCAI-2017,Shen-ACL-2018,Torki-ACL-2018,Xue-TKDE-2009,Zhao-IJCAI-2015,Zhou-COLING-2016,Zhou-IJCAI-2018} on the Reuters-21578, RT-2k, MR, TREC and Subj data sets. The top three results on each data set are highlighted in \textcolor{dark_red}{red}, \textcolor{dark_green}{green} and \textcolor{dark_blue}{blue}, respectively. Best viewed in color.}
\caption{\label{tab:SSDResults} mAP scores for different models \textit{(row)} at varying fog densities \textit{(columns)} for easy/moderate/hard setting on synthetic KITTI data. Best SSD models are labeled \redbf{magenta} and second best \bluebf{blue}. The image pre-processing comparisons use the Pix2PixHD-CJ (best from Tab.~\ref{tab:GanScores}) for fog removal before object detection. Note that these results do not generalize to real world data, see Tab.~\ref{tab:TestResults}. }
\caption{Quantitative detection mAP on unseen fog data from the Sweden dataset split into different distortion levels and difficulties easy/moderate/hard~\cite{Kitt_dataset}. The proposed model is trained solely on clean data without distortions. The best model is marked \redbf{magenta} and the second-best in \bluebf{blue}. The image pre-processing comparisons use the Pix2PixHD-CJ (best from Tab.~\ref{tab:GanScores}) for fog removal.}
\caption{(a) Time evolution of the order parameter for different values of the coefficient of the reaction time, $\kappa$. The time-averaged values of (b) the order parameter $\left<R\right>$, (c) the group speed $\left<U\right>$, and (d) the group size $\left<G\right>$, as functions of the reaction insensitivity $\kappa$. The parameter setting is the same as in Fig. \ref{fig3} except for the variable $\kappa$. The red dotted lines in (b)-(d) indicate the standard deviation from the time-averaged data.} \label{fig4}
\end{figure}

The behavioral dynamics of a flock also depend on the insensitivity $\kappa$, which may differ from flock to flock. Fig. \ref{fig4}(a) shows the time evolution of the order parameter $R$ for $\kappa=100$ and $\kappa=1000$. The parameters are the same as in Fig. \ref{fig3} except for the variable $\kappa$. When the reaction insensitivity is low, $\kappa=100$, the flock responds more quickly, so the mean order is smaller ($\left<R\right>=0.73$) than in the case $\kappa=1000$ ($\left<R\right>=0.94$). The magnitude of the fluctuations is also clearly larger.

The effects of $\kappa$ on the time-averaged order $\left<R\right>$, the speed $\left<U\right>$, and the group size $\left<G\right>$ are seen in Fig. \ref{fig4}(b)-(d), where the red dotted lines indicate the standard deviation from the time-averaged data. Fig. \ref{fig4}(b) shows that a flock is more ordered when the reaction delay becomes longer, i.e., when the sensitivity decreases. In this case, the coupled acceleration becomes negligible and the collective dynamics is similar to the ``overly-stable'' dynamics of a flock observed in the standard models of Vicsek and Cucker-Smale. On the other hand, when the reaction becomes instantaneous with no delay, the system becomes less ordered and fluctuates more (see the wider deviations in the red dotted lines), as discussed for Fig. \ref{fig4}(a).
A few observations are worth noting: (i) There is a relation between the values of $\left<U\right>$ and $\left<G\right>$; the smaller group travels faster than the larger group on average. (ii) Another relation is seen between $\left<R\right>$ and $\left<U\right>$ when $\kappa$ is large; the smaller (denser) group is better ordered than the larger (more dilute) group. This behavioral feature is commonly observed in natural flocks; a bird flock has primarily two states, a disordered state of low density and a well-aligned state characterized by high density \cite{tren}. Finally, (iii) there is a specific value of $\kappa$ that corresponds to the minimum group speed $\left<U\right>$ and the maximum group size $\left<G\right>$, the point below which the effect of the coupled acceleration term is dominant. All three averages $\left<R\right>$, $\left<U\right>$, and $\left<G\right>$ were obtained by $\left<\boldsymbol{\cdot} \right>=\frac{1}{t_f} \int_0^{t_f} \boldsymbol{\cdot}~dt,$ where the total simulation time is $t_f=10000$.
% Fig. 5
\begin{figure}
\centering
\includegraphics[width=16cm]{Fig5.eps}
\caption{The correlation functions with respect to the inter-individual distance $r$ for velocities (a) and for speeds (b). The correlation length as a function of the group size $G$ for velocities (c) and for speeds (d). The correlation length $\xi$, i.e., the zero point of the correlation function, $C(r=\xi)=0$, is denoted in (a) and (c). The 1000 data points are sampled at equal intervals from one simulation run with $t_f=20000$, and the parameter setting of the simulation is the same as in Fig. \ref{fig3}. For the velocity correlation in (a) and (c), Pearson's correlation test gives $n=100$, $r=0.69$, $p=0$. For the speed correlation in (b) and (d), Pearson's correlation test gives $n=100$, $r=0.72$, $p=0$.
Note that a linear relation is regarded as strong when $r>0.7$.} \label{fig5}
\end{figure}

To show the nature of the fluctuations around the mean velocities, we consider the correlation function of the velocity fluctuations. The correlation function measures how strongly two velocity fluctuations at a distance $r$ are correlated \cite{cavagna2010scale}:
\begin{equation}\label{corr}
C(r)=\frac{1}{c_0}\frac{\sum_{ij}\mathbf{u}_i \cdot \mathbf{u}_j \delta(r-r_{ij})}{\sum_{ij}\delta(r-r_{ij})}
\end{equation}
where $\delta (r-r_{ij})$ is a smoothed Dirac $\delta$-function selecting pairs of birds at mutual distance $r$ and $c_0$ is a normalization factor such that $C(r=0)=1$. The fluctuation around the mean velocity of the flock is
\begin{equation}
\mathbf{u}_i=\mathbf{v}_i-\frac{1}{N}\sum_{k=1}^{N}\mathbf{v}_k
\end{equation}
so that the fluctuations sum to zero, $\sum \mathbf{u}_i=0$, by definition.

Using our model in Eq. (\ref{evolu2}), we compute the correlation function and the correlation length of the velocities. The correlation length $\xi$ is defined as the distance at which the correlation function vanishes, $C(r=\xi)=0$, and it is denoted in Fig. \ref{fig5}(a). The correlation length gives a good estimate of the average size of the correlated domains. Fig. \ref{fig5}(c) presents the relation between the group size $G$ and the correlation length $\xi$ and confirms the linear proportionality reported in the experiments of Cavagna et al. \cite{cavagna2010scale}. This comparison of simulation results with the experimental data for the correlation was possible because our model assumes neither periodic boundary conditions nor unit speeds, as typical models do. We can say that the dynamics of Eq. (\ref{evolu2}) gives the scale-free relation of velocity fluctuations without relying on any stochastic variables. The linear relation between the group size and the correlation length is also obtained for the speed fluctuations in Fig. \ref{fig5}(b) and (d).
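
As an illustration of Eq. (\ref{corr}), the following Python sketch estimates $C(r)$ and the correlation length $\xi$ from a single snapshot of positions and velocities. It is not the code used to produce Fig. \ref{fig5}. The smoothed $\delta$-function is replaced here by plain histogram bins, $c_0$ is taken as the mean of $\mathbf{u}_i\cdot\mathbf{u}_i$ so that the zero-distance value equals one, and the function names, the bin edges, and the linear interpolation of the zero crossing are all assumptions made for this example.

```python
import numpy as np


def velocity_correlation(positions, velocities, bin_edges):
    """Binned estimate of C(r) for one snapshot.

    positions, velocities: arrays of shape (N, d).
    bin_edges: increasing distances; histogram bins stand in for the
    smoothed delta function of the text (an assumption of this sketch).
    """
    # Fluctuations around the mean flock velocity; they sum to zero.
    u = velocities - velocities.mean(axis=0)

    # Pairwise distances r_ij and scalar products u_i . u_j.
    diff = positions[:, None, :] - positions[None, :, :]
    dist = np.linalg.norm(diff, axis=-1)
    dots = u @ u.T

    # Keep each distinct pair (i < j) once.
    iu = np.triu_indices(len(positions), k=1)
    num, _ = np.histogram(dist[iu], bins=bin_edges, weights=dots[iu])
    cnt, _ = np.histogram(dist[iu], bins=bin_edges)

    # Normalisation so that the r -> 0 limit is one (hypothetical choice).
    c0 = np.mean(np.sum(u * u, axis=1))

    with np.errstate(invalid="ignore", divide="ignore"):
        corr = np.where(cnt > 0, num / cnt, np.nan) / c0
    centers = 0.5 * (bin_edges[:-1] + bin_edges[1:])
    return centers, corr


def correlation_length(centers, corr):
    """First zero crossing of C(r), linearly interpolated; None if absent."""
    valid = ~np.isnan(corr)
    r, c = centers[valid], corr[valid]
    crossing = np.where((c[:-1] > 0) & (c[1:] <= 0))[0]
    if crossing.size == 0:
        return None
    k = crossing[0]
    return r[k] - c[k] * (r[k + 1] - r[k]) / (c[k + 1] - c[k])
```

Applying the two functions to many snapshots and plotting the resulting $\xi$ against the group size $G$ is one way to examine the proportionality discussed above. The bin width controls the smoothing, so the estimate of $\xi$ should be checked for stability under a change of bins.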
\subsection{Long-term behaviors of a flock}
We further investigate the long-term evolution of our model \cite{sims,kare1983b,benh,edwa2012}. We focus on the path of the center of mass of a flock when $N=100$. The center of mass of a flock is computed by $X_c=\frac{1}{N}\sum_{i=1}^{N} \mathbf{x}_i$.
%Fig 7
\begin{figure}
\centering
\includegraphics[width=16cm]{Fig6.eps}
\caption{Brownian-like motion of a flock. The graph is the trajectory of their center of mass. Birds initiate their travel at $(0,0)$ and the final flying time is $2\times 10^6$. The inset shows the magnification of the boxed area. The parameters used are $C_1=7.3\times 10^{-6}$, $C_2=3.3\times 10^{-5}$, and $\kappa=8$, and the other parameters are the same as in Fig. \ref{fig3}.} \label{fig6}
\end{figure}

Fig. \ref{fig6} shows the trajectory of the center of mass of a flock in free space. The birds initiate their flight at $(0, 0)$. Their center of mass is traced until the final time $t_f = 2 \times 10^6$. In this long-time simulation, we use a small number of birds, $N=100$. The other parameters are $C_1=7.3\times 10^{-6}$, $C_2=3.3\times 10^{-5}$, and $\kappa=8$, and the remaining values are the same as in Fig. \ref{fig3}. We notice that the number of birds affects the length and time scales of the long-term dynamics. When the number of birds in a flock is decreased, the strength of attraction ($C_1$ and $C_2$) should be increased in order to hold the birds in a bounded area with the same density in open space. Also, since reaction times increase in a smaller flock (the deviation $f_i$ is relatively large), the insensitivity $\kappa$ should be reduced accordingly to obtain time scales similar to those of the larger flock. The results in Fig. \ref{fig6} show a smooth random walk that resembles a Brownian motion at large temporal and spatial scales. The inset magnifies the part in the boxed area.
This suggests that a flock, viewed as a point mass, can travel in a random manner that is generated by the local individual alignment mechanism in our deterministic model.
%Fig 8
\begin{figure}
\centering
\includegraphics[width=16cm]{Fig7.eps}
\caption{Distributions of the spatial increments for (a) $\Delta t=1000$ and (b) $\Delta t=3000$ in the case shown in Fig. \ref{fig6}. They are fit to a normal distribution with $p$-value (a) $p=0.4591$ and (b) $p=0.9277$ in the Kolmogorov-Smirnov (KS) test. (c) An approximate proportional relation is $\langle\Delta x^2 \rangle\sim \Delta t^{2H}$ with $H=0.48<0.5$. This means the random walk is actually close to a fractional Brownian motion which is weakly sub-diffusive.} \label{fig7}
\end{figure}

To test whether the flight path in Fig. \ref{fig6} is a Brownian motion, we investigate its statistical features in Fig. \ref{fig7}. Two exemplary distributions of the spatial increments $\Delta x$ of the center of mass are illustrated in Fig. \ref{fig7}(a) and Fig. \ref{fig7}(b) for $\Delta t = 500$ and $\Delta t = 3000$, respectively. We apply the Kolmogorov-Smirnov test (KS test) to compare the measured distribution of $\Delta x$ with a normal distribution function. The method calculates the maximum distance between the two cumulative distribution curves and estimates a $p$-value. The $p$-values obtained from the data in Fig. \ref{fig7}(a) and Fig. \ref{fig7}(b) are $p=0.4591$ and $p=0.9277$, respectively, which are greater than the conventional significance level 0.1. Hence the hypothesis that the data follow a normal distribution cannot be rejected. In addition to the KS test, we verify whether the proportional relation
\begin{equation}
\left< \Delta x^2 \right>\sim \Delta t ^{2H},\quad 0<H<1
\end{equation}
holds. This proportional relation between temporal and spatial scales is known to characterize a generalized Brownian motion. The fitted log-log graph in Fig.
\ref{fig7}(c) shows that the estimated value of $H$ is $0.425$ ($< 0.5$), which indicates that the path is a weakly sub-diffusive Brownian motion.
%Fig 9
\begin{figure}
\centering
\includegraphics[width=16cm]{Fig8.eps}
\caption{L\'{e}vy-like flight of a flock. The graph is the trajectory of their center of mass. Birds initiate their travel at $(0,0)$ and continue to fly for $2\times 10^6$. The inset shows the magnification of the boxed area. The parameter used is $H_1=0.24$, and the other parameters are the same as in Fig. \ref{fig6}. Readers are encouraged to see Movie 1 in the Supplementary Information, which shows how a flock of birds switches from tumbling to running, and then switches back spontaneously.} \label{fig8}
\end{figure}

We now discuss in Fig. \ref{fig8} the emergence of the L\'{e}vy-like flight of a flock. We use a higher strength for the bird-to-bird alignment dynamics, $H_1=0.24$, to change the behavior from Brownian motion to L\'{e}vy flights. The other parameters are the same as in Fig. \ref{fig6}. The inset shows the magnification of the boxed area. Unlike the Brownian motion, the path of the center of mass in Fig. \ref{fig8} consists of clustered circling movements interspersed with long straight segments. During the flights, birds switch from tumbling to running, and then spontaneously switch back to tumbling (see Supplementary material Movie 4) \cite{visw1999,visw2008,reyn2009,bart}. It is interesting that the long-term flight patterns of natural flocks can also be created by the individual-based model in Eq. (\ref{evolu2}). This strongly indicates that the long-term behaviors of a flock may be the natural result of individual interactions rather than the result of distinct mechanisms that focus on specific behavioral patterns.
\begin{figure}
\centering
\includegraphics[width=10cm]{Fig9.eps}
\caption{Distribution of the flight length. The flight lengths are measured at every $\Delta t=2200$.
Fits to a heavy-tailed distribution of the flight lengths in Fig. \ref{fig8} were made by the maximum likelihood method, and the goodness of fit was tested with the KS test. The distribution is fit with the power-law distribution $p(l)\sim l^{-\mu}$ with $\mu=2.3$.}
\label{fig9}
\end{figure}

To show that the path in Fig. \ref{fig8} follows L\'{e}vy flights, we adopt the method of analyzing power-law distributions proposed in \cite{clauset2009power}. According to this method, the cumulative distribution function of the spatial displacement for a power law $p(x)=c\,x^{-\mu}$ with $\mu>1$ is of the form
\begin{equation}
\text{Pr}(X\geq x)=\frac{c}{\mu-1}x^{1-\mu},\quad x\geq x_{\text{min}}
\end{equation}
where $c$ is a normalization constant. Fig. \ref{fig9} displays the distribution of the displacements $\Delta l$ measured in the flight path in Fig. \ref{fig8} at every $\Delta t = 2200$. We use the method of maximum likelihood and the KS test to estimate the exponent $\mu$ and the lower bound $x_{\text{min}}$, respectively. The fitted slope in Fig. \ref{fig9} indicates that the power-law exponent is $\mu=2.3$. To check the goodness of fit of the power-law distribution, we generate synthetic data sets from a true power-law distribution using the same $\mu$ and $x_{\text{min}}$, and calculate the $p$-value as the fraction of synthetic data sets that pass the KS test. Since $p=0.32$ ($\geq 0.1$) is obtained from this procedure, we conclude that the L\'{e}vy flight is a plausible description of the flight path in Fig. \ref{fig8}.

\section{Conclusion}
We investigate the effects of adaptive reaction delays on the behaviors and the ordering states of a flock, using a generalized Cucker-Smale model. We find that the relation between orientational order and reaction times is the key factor in creating a variety of behavior patterns similar to those found in natural flocks \cite{pick,syme,cava}.
Since the instant reaction of birds with no delay induces instability, the adaptive reaction time prevents the system from converging to a perfectly ordered state and keeps the system in marginalized ordering states. Further, we show that both Brownian motion and L\'{e}vy flights occur naturally in our model and that their emergence can be understood in the context of individual interactions, not in the context of specific goal-seeking behaviors \cite{reyn2016}. These results indicate that our model may be used to explore the long-term behaviors of a flock in terms of local interactions of birds, without relying on nonphysical stochastic effects.

\bigskip
{\bf \large Acknowledgments} \\
This work was supported by the Ministry of Education of the Republic of Korea and the National Research Foundation of Korea (NRF-2017R1D1A1B04032921). The funder had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

\bigskip
\bibliographystyle{naturemag}
%\bibliography{flock_ref_3}
\begin{thebibliography}{10}
\expandafter\ifx\csname url\endcsname\relax \def\url#1{\texttt{#1}}\fi
\expandafter\ifx\csname urlprefix\endcsname\relax\def\urlprefix{URL }\fi
\providecommand{\bibinfo}[2]{#2}
\providecommand{\eprint}[2][]{\url{#2}}

\bibitem{ball} \bibinfo{author}{Ballerini, M.} \emph{et~al.} \newblock \bibinfo{title}{An empirical study of large, naturally occurring starling flocks: a benchmark in collective animal behaviour}. \newblock \emph{\bibinfo{journal}{arXiv preprint arXiv:0802.1667}} (\bibinfo{year}{2008}).

\bibitem{care} \bibinfo{author}{Carere, C.} \emph{et~al.} \newblock \bibinfo{title}{Aerial flocking patterns of wintering starlings, sturnus vulgaris, under different predation risk}. \newblock \emph{\bibinfo{journal}{Animal Behaviour}} \textbf{\bibinfo{volume}{77}}, \bibinfo{pages}{101--107} (\bibinfo{year}{2009}).
\bibitem{atta2015} \bibinfo{author}{Attanasi, A.} \emph{et~al.} \newblock \bibinfo{title}{Emergence of collective changes in travel direction of starling flocks from individual birds' fluctuations}. \newblock \emph{\bibinfo{journal}{Journal of The Royal Society Interface}} \textbf{\bibinfo{volume}{12}}, \bibinfo{pages}{20150319} (\bibinfo{year}{2015}). \bibitem{reyn1987} \bibinfo{author}{Reynolds, C.~W.} \newblock \bibinfo{title}{Flocks, herds and schools: A distributed behavioral model}. \newblock In \emph{\bibinfo{booktitle}{ACM SIGGRAPH computer graphics}}, vol.~\bibinfo{volume}{21}, \bibinfo{pages}{25--34} (\bibinfo{organization}{ACM}, \bibinfo{year}{1987}). \bibitem{cuck2007a} \bibinfo{author}{Cucker, F.} \& \bibinfo{author}{Smale, S.} \newblock \bibinfo{title}{On the mathematics of emergence}. \newblock \emph{\bibinfo{journal}{Japanese Journal of Mathematics}} \textbf{\bibinfo{volume}{2}}, \bibinfo{pages}{197--227} (\bibinfo{year}{2007}). \bibitem{cuck2007b} \bibinfo{author}{Cucker, F.}, \bibinfo{author}{Smale, S.} \emph{et~al.} \newblock \bibinfo{title}{Emergent behavior in flocks}. \newblock \emph{\bibinfo{journal}{IEEE Transactions on automatic control}} \textbf{\bibinfo{volume}{52}}, \bibinfo{pages}{852--862} (\bibinfo{year}{2007}). \bibitem{vics} \bibinfo{author}{Vicsek, T.}, \bibinfo{author}{Czir{\'o}k, A.}, \bibinfo{author}{Ben-Jacob, E.}, \bibinfo{author}{Cohen, I.} \& \bibinfo{author}{Shochet, O.} \newblock \bibinfo{title}{Novel type of phase transition in a system of self-driven particles}. \newblock \emph{\bibinfo{journal}{Physical review letters}} \textbf{\bibinfo{volume}{75}}, \bibinfo{pages}{1226} (\bibinfo{year}{1995}). \bibitem{hart} \bibinfo{author}{Hartman, C.} \& \bibinfo{author}{Benes, B.} \newblock \bibinfo{title}{Autonomous boids}. \newblock \emph{\bibinfo{journal}{Computer Animation and Virtual Worlds}} \textbf{\bibinfo{volume}{17}}, \bibinfo{pages}{199--206} (\bibinfo{year}{2006}). 
\bibitem{chen} \bibinfo{author}{Chen, Y.} \& \bibinfo{author}{Kolokolnikov, T.} \newblock \bibinfo{title}{A minimal model of predator--swarm interactions}. \newblock \emph{\bibinfo{journal}{Journal of The Royal Society Interface}} \textbf{\bibinfo{volume}{11}}, \bibinfo{pages}{20131208} (\bibinfo{year}{2014}). \bibitem{li} \bibinfo{author}{Li, Z.} \& \bibinfo{author}{Jiang, Y.} \newblock \bibinfo{title}{Friction based social force model for social foraging of sheep flock}. \newblock \emph{\bibinfo{journal}{Ecological modelling}} \textbf{\bibinfo{volume}{273}}, \bibinfo{pages}{55--62} (\bibinfo{year}{2014}). \bibitem{wate} \bibinfo{author}{Waters, A.}, \bibinfo{author}{Blanchette, F.} \& \bibinfo{author}{Kim, A.~D.} \newblock \bibinfo{title}{Modeling huddling penguins}. \newblock \emph{\bibinfo{journal}{PLoS One}} \textbf{\bibinfo{volume}{7}}, \bibinfo{pages}{e50277} (\bibinfo{year}{2012}). \bibitem{szab} \bibinfo{author}{Szab{\'o}, P.}, \bibinfo{author}{Nagy, M.} \& \bibinfo{author}{Vicsek, T.} \newblock \bibinfo{title}{Transitions in a self-propelled-particles model with coupling of accelerations}. \newblock \emph{\bibinfo{journal}{Physical Review E}} \textbf{\bibinfo{volume}{79}}, \bibinfo{pages}{021908} (\bibinfo{year}{2009}). \bibitem{atta2014} \bibinfo{author}{Attanasi, A.} \emph{et~al.} \newblock \bibinfo{title}{Information transfer and behavioural inertia in starling flocks}. \newblock \emph{\bibinfo{journal}{Nature physics}} \textbf{\bibinfo{volume}{10}}, \bibinfo{pages}{691} (\bibinfo{year}{2014}). \bibitem{cavagna2010scale} \bibinfo{author}{Cavagna, Andrea} \emph{et~al.} \newblock \bibinfo{title}{Scale-free correlations in starling flocks}. \newblock \emph{\bibinfo{journal}{Proceedings of the National Academy of Sciences}} \textbf{\bibinfo{volume}{107}}, \bibinfo{pages}{11865--11870} (\bibinfo{year}{2010}). 
\bibitem{szabo2009transitions} \bibinfo{author}{Szab{\'o}, P{\'e}ter} \emph{et~al.} \newblock \bibinfo{title}{Transitions in a self-propelled-particles model with coupling of accelerations}. \newblock \emph{\bibinfo{journal}{Physical Review E}} \textbf{\bibinfo{volume}{79}}, \bibinfo{pages}{021908} (\bibinfo{year}{2009}). \bibitem{skel} \bibinfo{author}{Skellam, J.~G.} \newblock \bibinfo{title}{Random dispersal in theoretical populations}. \newblock \emph{\bibinfo{journal}{Biometrika}} \textbf{\bibinfo{volume}{38}}, \bibinfo{pages}{196--218} (\bibinfo{year}{1951}). \bibitem{kare1983a} \bibinfo{author}{Kareiva, P.} \newblock \bibinfo{title}{Local movement in herbivorous insects: applying a passive diffusion model to mark-recapture field experiments}. \newblock \emph{\bibinfo{journal}{Oecologia}} \textbf{\bibinfo{volume}{57}}, \bibinfo{pages}{322--327} (\bibinfo{year}{1983}). \bibitem{edwa2007} \bibinfo{author}{Edwards, A.~M.} \emph{et~al.} \newblock \bibinfo{title}{Revisiting l{\'e}vy flight search patterns of wandering albatrosses, bumblebees and deer}. \newblock \emph{\bibinfo{journal}{Nature}} \textbf{\bibinfo{volume}{449}}, \bibinfo{pages}{1044} (\bibinfo{year}{2007}). \bibitem{sims} \bibinfo{author}{Sims, D.~W.} \emph{et~al.} \newblock \bibinfo{title}{Scaling laws of marine predator search behaviour}. \newblock \emph{\bibinfo{journal}{Nature}} \textbf{\bibinfo{volume}{451}}, \bibinfo{pages}{1098} (\bibinfo{year}{2008}). \bibitem{hump2010} \bibinfo{author}{Humphries, N.~E.} \emph{et~al.} \newblock \bibinfo{title}{Environmental context explains l{\'e}vy and brownian movement patterns of marine predators}. \newblock \emph{\bibinfo{journal}{Nature}} \textbf{\bibinfo{volume}{465}}, \bibinfo{pages}{1066} (\bibinfo{year}{2010}). \bibitem{reyn2016} \bibinfo{author}{Reynolds, A.~M.} \& \bibinfo{author}{Ouellette, N.~T.} \newblock \bibinfo{title}{Swarm dynamics may give rise to l{\'e}vy flights}. 
\newblock \emph{\bibinfo{journal}{Scientific reports}} \textbf{\bibinfo{volume}{6}}, \bibinfo{pages}{30515} (\bibinfo{year}{2016}). \bibitem{fedotov2017emergence} \bibinfo{author}{Fedotov, S.} \& \bibinfo{author}{Korabel, N.} \newblock \bibinfo{title}{Emergence of l{\'e}vy walks in systems of interacting individuals}. \newblock \emph{\bibinfo{journal}{Physical Review E}} \textbf{\bibinfo{volume}{95}}, \bibinfo{pages}{030107} (\bibinfo{year}{2017}). \bibitem{ague} \bibinfo{author}{Agueh, M.}, \bibinfo{author}{Illner, R.} \& \bibinfo{author}{Richardson, A.} \newblock \bibinfo{title}{Analysis and simulations of a refined flocking and swarming model of cucker-smale type}. \newblock \emph{\bibinfo{journal}{Kinetic and Related Models}} \textbf{\bibinfo{volume}{4}}, \bibinfo{pages}{1--16} (\bibinfo{year}{2011}). \bibitem{hask} \bibinfo{author}{Haskovec, J.} \newblock \bibinfo{title}{Flocking dynamics and mean-field limit in the cucker--smale-type model with topological interactions}. \newblock \emph{\bibinfo{journal}{Physica D: Nonlinear Phenomena}} \textbf{\bibinfo{volume}{261}}, \bibinfo{pages}{42--51} (\bibinfo{year}{2013}). \bibitem{ballerini2008interaction} \bibinfo{author}{Ballerini, Michele and Cabibbo, Nicola and Candelier, Raphael and Cavagna, Andrea and Cisbani, Evaristo and Giardina, Irene and Lecomte, Vivien and Orlandi, Alberto and Parisi, Giorgio and Procaccini, Andrea and others} \newblock \bibinfo{title}{Interaction ruling animal collective behavior depends on topological rather than metric distance: Evidence from a field study}. \newblock \emph{\bibinfo{journal}{Proceedings of the national academy of sciences}} \textbf{\bibinfo{volume}{105}}, \bibinfo{pages}{1232--1237} (\bibinfo{year}{2008}). 
\bibitem{dors} \bibinfo{author}{D'Orsogna, M.~R.}, \bibinfo{author}{Chuang, Y.-L.}, \bibinfo{author}{Bertozzi, A.~L.} \& \bibinfo{author}{Chayes, L.~S.} \newblock \bibinfo{title}{Self-propelled particles with soft-core interactions: patterns, stability, and collapse}. \newblock \emph{\bibinfo{journal}{Physical review letters}} \textbf{\bibinfo{volume}{96}}, \bibinfo{pages}{104302} (\bibinfo{year}{2006}). \bibitem{albi} \bibinfo{author}{Albi, G.} \& \bibinfo{author}{Pareschi, L.} \newblock \bibinfo{title}{Binary interaction algorithms for the simulation of flocking and swarming dynamics}. \newblock \emph{\bibinfo{journal}{Multiscale Modeling \& Simulation}} \textbf{\bibinfo{volume}{11}}, \bibinfo{pages}{1--29} (\bibinfo{year}{2013}). \bibitem{armb} \bibinfo{author}{Armbruster, D.}, \bibinfo{author}{Martin, S.} \& \bibinfo{author}{Thatcher, A.} \newblock \bibinfo{title}{Elastic and inelastic collisions of swarms}. \newblock \emph{\bibinfo{journal}{Physica D: Nonlinear Phenomena}} \textbf{\bibinfo{volume}{344}}, \bibinfo{pages}{45--57} (\bibinfo{year}{2017}). \bibitem{aceb} \bibinfo{author}{Acebr{\'o}n, J.~A.}, \bibinfo{author}{Bonilla, L.~L.}, \bibinfo{author}{Vicente, C. J.~P.}, \bibinfo{author}{Ritort, F.} \& \bibinfo{author}{Spigler, R.} \newblock \bibinfo{title}{The kuramoto model: A simple paradigm for synchronization phenomena}. \newblock \emph{\bibinfo{journal}{Reviews of modern physics}} \textbf{\bibinfo{volume}{77}}, \bibinfo{pages}{137} (\bibinfo{year}{2005}). \bibitem{okee} \bibinfo{author}{O’Keeffe, K.~P.}, \bibinfo{author}{Hong, H.} \& \bibinfo{author}{Strogatz, S.~H.} \newblock \bibinfo{title}{Oscillators that sync and swarm}. \newblock \emph{\bibinfo{journal}{Nature Communications}} \textbf{\bibinfo{volume}{8}}, \bibinfo{pages}{1504} (\bibinfo{year}{2017}). \bibitem{levi} \bibinfo{author}{Levis, D.}, \bibinfo{author}{Pagonabarraga, I.} \& \bibinfo{author}{Liebchen, B.} \newblock \bibinfo{title}{Activity induced synchronization}. 
\newblock \emph{\bibinfo{journal}{arXiv preprint arXiv:1802.02371}} (\bibinfo{year}{2018}). \bibitem{tren} \bibinfo{author}{Trenchard, H.} \newblock \bibinfo{title}{American coot collective on-water dynamics}. \newblock \emph{\bibinfo{journal}{arXiv preprint arXiv:1205.5929}} (\bibinfo{year}{2012}). \bibitem{kare1983b} \bibinfo{author}{Kareiva, P.} \& \bibinfo{author}{Shigesada, N.} \newblock \bibinfo{title}{Analyzing insect movement as a correlated random walk}. \newblock \emph{\bibinfo{journal}{Oecologia}} \textbf{\bibinfo{volume}{56}}, \bibinfo{pages}{234--238} (\bibinfo{year}{1983}). \bibitem{benh} \bibinfo{author}{Benhamou, S.} \newblock \bibinfo{title}{How many animals really do the levy walk?} \newblock \emph{\bibinfo{journal}{Ecology}} \textbf{\bibinfo{volume}{88}}, \bibinfo{pages}{1962--1969} (\bibinfo{year}{2007}). \bibitem{edwa2012} \bibinfo{author}{Edwards, A.~M.}, \bibinfo{author}{Freeman, M.~P.}, \bibinfo{author}{Breed, G.~A.} \& \bibinfo{author}{Jonsen, I.~D.} \newblock \bibinfo{title}{Incorrect likelihood methods were used to infer scaling laws of marine predator search behaviour}. \newblock \emph{\bibinfo{journal}{PloS one}} \textbf{\bibinfo{volume}{7}}, \bibinfo{pages}{e45174} (\bibinfo{year}{2012}). \bibitem{visw1999} \bibinfo{author}{Viswanathan, G.~M.} \emph{et~al.} \newblock \bibinfo{title}{Optimizing the success of random searches}. \newblock \emph{\bibinfo{journal}{nature}} \textbf{\bibinfo{volume}{401}}, \bibinfo{pages}{911} (\bibinfo{year}{1999}). \bibitem{visw2008} \bibinfo{author}{Viswanathan, G.}, \bibinfo{author}{Raposo, E.} \& \bibinfo{author}{Da~Luz, M.} \newblock \bibinfo{title}{L{\'e}vy flights and superdiffusion in the context of biological encounters and random searches}. \newblock \emph{\bibinfo{journal}{Physics of Life Reviews}} \textbf{\bibinfo{volume}{5}}, \bibinfo{pages}{133--150} (\bibinfo{year}{2008}). 
\bibitem{reyn2009} \bibinfo{author}{Reynolds, A.~M.} \& \bibinfo{author}{Rhodes, C.~J.} \newblock \bibinfo{title}{The l{\'e}vy flight paradigm: random search patterns and mechanisms}. \newblock \emph{\bibinfo{journal}{Ecology}} \textbf{\bibinfo{volume}{90}}, \bibinfo{pages}{877--887} (\bibinfo{year}{2009}). \bibitem{bart} \bibinfo{author}{Bartumeus, F.}, \bibinfo{author}{Peters, F.}, \bibinfo{author}{Pueyo, S.}, \bibinfo{author}{Marras{\'e}, C.} \& \bibinfo{author}{Catalan, J.} \newblock \bibinfo{title}{Helical l{\'e}vy walks: adjusting searching statistics to resource availability in microzooplankton}. \newblock \emph{\bibinfo{journal}{Proceedings of the National Academy of Sciences}} \textbf{\bibinfo{volume}{100}}, \bibinfo{pages}{12771--12775} (\bibinfo{year}{2003}). \bibitem{clauset2009power} \bibinfo{author}{Clauset, A.}, \bibinfo{author}{Shalizi, C.~R.} \& \bibinfo{author}{Newman, M.~E.} \newblock \bibinfo{title}{Power-law distributions in empirical data}. \newblock \emph{\bibinfo{journal}{SIAM review}} \textbf{\bibinfo{volume}{51}}, \bibinfo{pages}{661--703} (\bibinfo{year}{2009}). \bibitem{pick} \bibinfo{author}{Pickering, S.}, \bibinfo{author}{Creighton, E.} \& \bibinfo{author}{Stevens-Wood, B.} \newblock \bibinfo{title}{Flock size and breeding success in flamingos}. \newblock \emph{\bibinfo{journal}{Zoo Biology}} \textbf{\bibinfo{volume}{11}}, \bibinfo{pages}{229--234} (\bibinfo{year}{1992}). \bibitem{syme} \bibinfo{author}{Symes, C.} \& \bibinfo{author}{Perrin, M.} \newblock \bibinfo{title}{Daily flight activity and flocking behaviour patterns of the greyheaded parrot poicephalus fuscicollis suahelicus reichenow 1898 in northern province, south africa}. \newblock \emph{\bibinfo{journal}{Tropical Zoology}} \textbf{\bibinfo{volume}{16}}, \bibinfo{pages}{47--62} (\bibinfo{year}{2003}). \bibitem{cava} \bibinfo{author}{Cavagna, A.} \emph{et~al.} \newblock \bibinfo{title}{New statistical tools for analyzing the structure of animal groups}. 
\newblock \emph{\bibinfo{journal}{Mathematical biosciences}} \textbf{\bibinfo{volume}{214}}, \bibinfo{pages}{32--37} (\bibinfo{year}{2008}). \bibitem{hale2013} \bibinfo{author}{Hale, J.~K.} \& \bibinfo{author}{Lunel, S. M.~V.} \newblock \emph{\bibinfo{title}{Introduction to functional differential equations}}, vol.~\bibinfo{volume}{99} (\bibinfo{publisher}{Springer Science \& Business Media}, \bibinfo{year}{2013}). \bibitem{kuan} \bibinfo{author}{Kuang, Y.} \newblock \emph{\bibinfo{title}{Delay differential equations: with applications in population dynamics}}, vol. \bibinfo{volume}{191} (\bibinfo{publisher}{Academic Press}, \bibinfo{year}{1993}). \bibitem{bray} \bibinfo{author}{Brayton, R.~K.} \newblock \bibinfo{title}{Bifurcation of periodic solutions in a nonlinear difference-differential equations of neutral type}. \newblock \emph{\bibinfo{journal}{Quarterly of Applied Mathematics}} \textbf{\bibinfo{volume}{24}}, \bibinfo{pages}{215--224} (\bibinfo{year}{1966}). \bibitem{kolm} \bibinfo{author}{Kolmanovskii, V.~B.} \& \bibinfo{author}{Nosov, V.~R.} \newblock \emph{\bibinfo{title}{Stability of functional differential equations}}, vol. \bibinfo{volume}{180} (\bibinfo{publisher}{Elsevier}, \bibinfo{year}{1986}). \end{thebibliography} \section*{Appendix A: Linear stability analysis of two-birds system} Equation (\ref{evolu2}) with the reaction time $\tau_{i}$ in response to the acceleration is one example of neutral delay differential equations, where a delay is considered in the terms with the highest order of derivative, i.e. acceleration in our case \cite{hale2013,kuan,bray}. In neutral delay differential equations, even small delays can have large effects on stability of the systems \cite{kuan,kolm}. Here we briefly present the standard stability analysis for the model in Eq. (\ref{evolu2}) without the third potential term when $N=2$. 
Given two birds, $i=1,2$, we assume that the two birds are flying with the same velocity $\mathbf{v}_1=\mathbf{v}_2=(v_x^*,v_y^*)^T$, and with the same reaction time $\tau_1=\tau_2=\tau$. Let $s$ be the distance between two birds, as $|\mathbf{x}_2-\mathbf{x}_1 |=s$. Note that, since the communication function $g(s)$ monotonically decreases, $g(s)$ grows as two birds are getting closer. Here we treat $g(s)$ as a parameter, assuming that the relative position of two birds $s$ is fixed in the analysis. We can reformulate the velocity part of Eq. (\ref{evolu2}) as \begin{equation}\label{redu} \begin{aligned} \frac{d\mathbf{v}_1}{dt}=&\mathbf{F}(\mathbf{v}_1,\mathbf{v}_2)+\mathbf{G}\left(\frac{d\mathbf{\bar{v}}_1}{dt}, \frac{d\mathbf{\bar{v}}_2}{dt}\right) \\ \frac{d\mathbf{v}_2}{dt}=&\mathbf{F}(\mathbf{v}_2,\mathbf{v}_1)+\mathbf{G}\left(\frac{d\mathbf{\bar{v}}_2}{dt}, \frac{d\mathbf{\bar{v}}_1}{dt}\right) \end{aligned} \end{equation} where $\mathbf{F}(\mathbf{u},\mathbf{v})=H_1g(s)(\mathbf{v}-\mathbf{u})+\alpha(1-|\mathbf{u}|^2)\mathbf{u},$ $\mathbf{G}(\mathbf{u},\mathbf{v})=H_2 \left(g(s) \mathbf{v}\right)$, $\displaystyle{\mathbf{\bar{u}}(t)=\mathbf{u}(t-\tau)}$ and $\displaystyle{\mathbf{\bar{v}}(t)=\mathbf{v}(t-\tau)}$. We set a solution of Eq. (\ref{redu}) around the aligned formation $\mathbf{v}_1=\mathbf{v}_2=(v_x^*,v_y^*)^T$ as \begin{equation} \mathbf{y}(t)=\mathbf{y}^*+\delta\mathbf{y}(t) \end{equation} where $\mathbf{y}=(v_{1x},v_{1y},v_{2x},v_{2y})^T$, $\mathbf{y}^*=(v_{x}^*,v_{y}^*,v_{x}^*,v_{y}^*)^T$ and $\delta\mathbf{y}$ is the infinitesimal displacements from the equilibrium solution. Using the Taylor series expansion, the above Eq. (\ref{redu}) can be linearized about the equilibrium solution as \begin{equation} \frac{d (\delta\mathbf{y}) }{dt}=\mathbf{J}\,\delta\mathbf{y}+\mathbf{J}_{\tau}\,\frac{d (\overline{\delta\mathbf{y}})}{dt}\label{lineqn} \end{equation} where $\overline{\delta\mathbf{y}}(t)=\delta\mathbf{y}(t-\tau)$. 
Here the Jacobian matrices $\mathbf{J}$ and $\mathbf{J}_\tau$ are
\begin{equation*}
\mathbf{J}=
\begin{bmatrix}
-H_1 g(s)-2 \alpha v_{x}^{*2} & -2 \alpha v^*_{x}v^*_{y} & H_1 g(s) & 0 \\
-2\alpha v^*_{x}v^*_{y} & -H_1 g(s)-2 \alpha v_{y}^{*2} & 0 & H_1 g(s) \\
H_1 g(s) & 0 & -H_1 g(s)-2\alpha v_{x}^{*2} & -2\alpha v^*_{x}v^*_{y} \\
0 & H_1 g(s) & -2\alpha v^*_{x}v^*_{y} & -H_1 g(s)-2\alpha v_{y}^{*2}
\end{bmatrix}
\end{equation*}
and
\begin{equation*}
\mathbf{J}_\tau=
\begin{bmatrix}
0 & 0 & H_2 g(s) & 0 \\
0 & 0 & 0 & H_2 g(s) \\
H_2 g(s) & 0 & 0 & 0 \\
0 & H_2 g(s) & 0 & 0
\end{bmatrix}.
\end{equation*}
We seek solutions of (\ref{lineqn}) of the exponential form
\begin{equation}\label{linsol}
\delta\mathbf{y}(t)=e^{\lambda t} \mathbf{w}, \,\mathbf{w}\ne 0
\end{equation}
where $\lambda$ is complex and $\mathbf{w}$ is a vector whose components are complex. Substituting Eq. (\ref{linsol}) into Eq. (\ref{lineqn}), and noting that the delayed derivative contributes a factor $\lambda e^{-\lambda\tau}$, gives a characteristic equation with respect to $\lambda$ as
\begin{equation} \label{chareqn}
\begin{aligned}
0=&\text{det}(\mathbf{J}+\lambda e^{-\lambda \tau} \mathbf{J}_\tau -\lambda \mathbf{I})\\
=&\lambda e^{-4\lambda\tau}(e^{\lambda\tau}-H_2 g(s))(\lambda H_2 g(s)+\lambda e^{\lambda\tau}+2H_1 g(s) e^{\lambda\tau}+2\alpha e^{\lambda\tau})\\
&(-\lambda H_2 g(s)+\lambda e^{\lambda\tau}+2\alpha e^{\lambda\tau})(\lambda H_2 g(s)+\lambda e^{\lambda\tau}+2H_1 g(s) e^{\lambda\tau})
\end{aligned}
\end{equation}
where $\mathbf{I}$ is a $4\times 4$ identity matrix.
The five factored equations for the eigenvalues in (\ref{chareqn}) are
\begin{eqnarray}
0&=&\lambda e^{-4\lambda\tau}\label{e1}\\
0&=&e^{\lambda\tau}-H_2 g(s)\label{e2}\\
0&=&\lambda H_2 g(s)+\lambda e^{\lambda\tau}+2H_1 g(s) e^{\lambda\tau}+2\alpha e^{\lambda\tau}\label{e3}\\
0&=&-\lambda H_2 g(s)+\lambda e^{\lambda\tau}+2\alpha e^{\lambda\tau}\label{e4}\\
0&=&\lambda H_2 g(s)+\lambda e^{\lambda\tau}+2H_1 g(s) e^{\lambda\tau}\label{e5}
\end{eqnarray}
Let $\lambda^{\text{Re}}_{\text{max}}$ denote the largest real part of the eigenvalues of the linearized system. For the system to be stable, $\lambda^{\text{Re}}_{\text{max}}$ should be nonpositive. One can confirm that the last three equations (\ref{e3}), (\ref{e4}) and (\ref{e5}) yield no eigenvalue with a positive real part. From the first two equations (\ref{e1}) and (\ref{e2}), we have
\begin{equation}\label{lambda_comp}
\lambda^{\text{Re}}_{\text{max}}=
\begin{cases}
0, & \text{if }g(s)\leq 1/H_2, \\
\log(H_2 g(s))/\tau, & \text{otherwise}.
\end{cases}
\end{equation}
Fig. \ref{fig10} plots the maximum eigenvalue with respect to the communication rate $g(s)$. The value of $\lambda^{\text{Re}}_{\text{max}}$ bifurcates from a neutral state to an unstable one at the critical value $g(s)=1/H_2$, and the system is unstable when $g(s)>1/H_2$. Since the communication rate $g(s)$ monotonically decreases with $s$, the trajectories of the two birds become unstable when $s<s_{\text{T}}$, where $g(s_{\text{T}})=1/H_2$. The slope at the critical point indicates how readily a perturbation occurs in a flock. From Eq. (\ref{lambda_comp}), the slope on the unstable side is obtained from
\begin{equation}
\left.\frac{\partial \lambda^{\text{Re}}_{\text{max}}}{\partial g(s)}\right\vert_{g(s)=1/H_2}=\frac{H_2}{\tau} \label{suscep}
\end{equation}
Since the slope is inversely proportional to $\tau$, it becomes steeper when the reaction time $\tau$ is reduced, as shown in Fig. \ref{fig10}.
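The closed-form growth rate and its slope can be checked numerically. The following is a minimal sketch, not the authors' code: it assumes a unit-speed aligned state $|\mathbf{v}^*|=1$ (as implied by the entries of $\mathbf{J}$), assumes that the delayed derivative of $e^{\lambda t}\mathbf{w}$ contributes the factor $\lambda e^{-\lambda\tau}$, and uses hypothetical parameter values and function names chosen only for illustration.

```python
import numpy as np

def jacobians(H1, H2, g, alpha, v):
    """Build J and J_tau for two aligned birds; v is assumed to have unit length."""
    v = np.asarray(v, dtype=float)
    I2 = np.eye(2)
    A = -H1 * g * I2 - 2.0 * alpha * np.outer(v, v)  # diagonal 2x2 blocks of J
    B = H1 * g * I2                                   # off-diagonal 2x2 blocks of J
    J = np.block([[A, B], [B, A]])
    Z = np.zeros((2, 2))
    J_tau = np.block([[Z, H2 * g * I2], [H2 * g * I2, Z]])
    return J, J_tau

def char_det(lam, J, J_tau, tau):
    """Characteristic determinant det(J + lam*exp(-lam*tau)*J_tau - lam*I)."""
    return np.linalg.det(J + lam * np.exp(-lam * tau) * J_tau - lam * np.eye(4))

def lambda_max(g, H2, tau):
    """Largest real part of the eigenvalues as a function of the communication rate g."""
    return 0.0 if g <= 1.0 / H2 else np.log(H2 * g) / tau

# Hypothetical parameter values, for illustration only.
H1, H2, alpha, tau = 0.10, 0.12, 1.0, 0.5
g = 10.0                                  # above the critical value 1/H2
theta = 0.3
v = (np.cos(theta), np.sin(theta))        # unit-speed aligned velocity

J, J_tau = jacobians(H1, H2, g, alpha, v)
lam = lambda_max(g, H2, tau)
residual = abs(char_det(lam, J, J_tau, tau))   # should vanish up to round-off

# One-sided finite-difference slope at the critical point g = 1/H2.
g_c = 1.0 / H2
h = 1e-6
slope = (lambda_max(g_c + h, H2, tau) - lambda_max(g_c, H2, tau)) / h

print("residual of the characteristic determinant:", residual)
print("numerical slope:", slope, " H2/tau:", H2 / tau)
```

Under these assumptions the determinant vanishes at $\lambda=\log(H_2 g(s))/\tau$ and at $\lambda=0$, and the finite-difference slope agrees with $H_2/\tau$; halving the hypothetical $\tau$ doubles the slope.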
Because of the instability associated with this steep slope, innate perturbations are easily induced in a flock. This analysis gives insight into how birds in a large flock behave depending on the reaction time. Once the birds, or part of them, drift away from ordered states, a longer delay in feedback is recovered and it stops deterring alignment. This is the main factor that creates the rich dynamics of the model in Eq. (\ref{evolu2}) and the flocking mechanism of a marginalized ordering state.
%%%Fig. 9%%%
\begin{figure}
\centering
\includegraphics[width=10cm]{Fig10.eps}
\caption{Maximal eigenvalue as a function of the communication rate $g(s)$, where $s=|\mathbf{x}_2-\mathbf{x}_1|$ is the distance between the two birds.}
\label{fig10}
\end{figure}
\end{document} }
\caption{(a) A typical visual dialog task. In particular, the initial answer $A^0$ denotes the given image caption. (b) The conventional training process at round $t$: given an image $I$, a history $H^{t}$, and a question $Q^t$, the loss is supervised by the ground-truth answer {\color{green}$A^t_{gt}$}. (c) The proposed History-Advantage Sequence Training (HAST) paradigm: the reward (\emph{i.e.} $-$ loss) is a History-Advantage, which focuses on the impact of a wrong answer {\color{red}$\bar{A}^t$} on the future round $t'$ by comparing the Gold Reward from the gold history $H^{t'}_g$ with the Adverse Critic from the fake history $H^{t'}_a$. }
\caption{Benchmark positions: (Top Left) Trap from Budapest Gambit: 1. d4 \textsymfigsymbol{N}f6 2. c4 e5 3. d5 \textsymfigsymbol{B}c5 4. \textsymfigsymbol{B}g5 \textsymfigsymbol{N}e4, to be followed by 5. \textsymfigsymbol{B}xd8 \textsymfigsymbol{B}xf2\#. (Top Right) Trap after mistake in Caro-Kann Defence, Breyer variation: 1. e4 c6 2. d3 d5 3. \textsymfigsymbol{N}d2 dxe4 4. dxe4 \textsymfigsymbol{N}f6 5. \textsymfigsymbol{N}gf3 \textsymfigsymbol{B}g4 6. e5 \textsymfigsymbol{N}d5 7. h3 \textsymfigsymbol{B}h5 8. c4 \textsymfigsymbol{N}b6 9. e6 \textsymfigsymbol{N}a6 10. \textsymfigsymbol{N}e5, to be followed by ... \textsymfigsymbol{B}xd1 11. exf7\#. (Bottom Left) Kieninger Trap: 1. d4 \textsymfigsymbol{N}f6 2. c4 e5 3. dxe5 \textsymfigsymbol{N}g4 4. \textsymfigsymbol{B}f4 \textsymfigsymbol{N}c6 5. \textsymfigsymbol{N}f3 \textsymfigsymbol{B}b4+ 6. \textsymfigsymbol{N}bd2 \textsymfigsymbol{Q}e7 7. a3 \textsymfigsymbol{N}gxe5, to be followed by 8. axb4 \textsymfigsymbol{N}d3\#. (Bottom Right) L\'egal Trap: 1. e4 e5 2. \textsymfigsymbol{N}f3 \textsymfigsymbol{N}c6 3. \textsymfigsymbol{B}c4 d6 4. \textsymfigsymbol{N}c3 \textsymfigsymbol{B}g4 5. \textsymfigsymbol{N}xe5, to be followed by ... \textsymfigsymbol{B}xd1 6. \textsymfigsymbol{B}xf7+ \textsymfigsymbol{K}e7 7. \textsymfigsymbol{N}d5\#.}
\caption{\label{fig:comp}\textcolor{gray}{(Color online)} Material distribution in the system (a) and scheme of the lowest energy levels (b).}
\caption{\label{fig:g_combo_e}\textcolor{gray}{(Color online)} (a,b) Spin-flip transition rates of electron related to the hyperfine (blue lines) and various channels of spin-orbit (red lines) interaction as a function. (c,d) Ratio of the spin-flip to the spin-conserving phonon-assisted tunneling rate for both transition channels.}
\caption{\label{fig:F_e}\textcolor{gray}{(Color online)} (a,b) Spin-flip transition rates of electron as a function of the magnetic field, for fixed electric field $E=6$~kV/cm.}
\caption{\label{fig:g_combo_h}\textcolor{gray}{(Color online)} (a) Spin-flip transition rates of hole related to the hyperfine (blue lines) and various channels of spin-orbit (red lines) interaction. (b) Ratio of the spin-flip to the spin-conserving phonon-assisted tunneling rate for both transition channels.}
\caption{We plot the projection of the decision boundaries onto a two-dimensional surface formed by interpolating between three images belonging to the same semantic category (vehicles): truck (yellow), ship (green) and automobile (dark green). The \textcolor{cyan}{cyan}/\textcolor{blue}{blue} regions represent \textcolor{cyan}{airplane}/\textcolor{blue}{cat}, respectively.}
\caption{The projection of the decision boundaries onto a two-dimensional surface formed by interpolating between three images belonging to the same semantic category (vehicles): aeroplane (cyan), ship (green) and truck (yellow). The \textcolor{red}{red}/\textcolor{blue}{blue}/\textbf{black} regions represent \textcolor{red}{bird}/\textcolor{blue}{cat}/\textbf{frog}, respectively.}
\caption{A depiction of our \textit{human-annotation interface}, which was used to collect ground truth data of anchored objects. In conjunction with changes in the scene, as illustrated by \ftextnumero~1-3, the human user can \textit{freeze} the execution of the framework and provide feedback about what he/she would consider the appropriate anchoring action for a candidate object. Once the execution is frozen, the human user can select segmented candidate objects, e.g., the moved \scriptpred{apple} as illustrated in \ftextnumero~4, after which the framework responds by displaying an updated representation of a number of already anchored objects, shown in \ftextnumero~5, which best (attribute-wise) correspond to the selected object. The human user can then provide positive feedback about a matching anchored object (by selecting the representation of the matching anchored object), or negative feedback (simply by clicking anywhere else on the screen). Also, to cover the time aspect, and to suggest possible matching anchored objects that have not been perceived recently, we have further added a time slider, illustrated in the top part of \ftextnumero~6. Through this time slider, the user can adjust the time factor $k$ for the purpose of selecting the best matching candidate object that was last observed at a time $t-k$. }
\caption{A depiction of how the suggested system benefits from combining object anchoring and probabilistic object tracking. \textit{Rows in order from the top:} \textit{1\textsuperscript{st})} screen-shots of a scenario where a human hand is occluding an apple while the apple is moved, \textit{2\textsuperscript{nd})} the corresponding resulting anchored objects while \textit{only} using the \textit{anchoring system} (note that the original \scriptpred{apple-1} object is lost while it is occluded and moved by the \scriptpred{skin-1} object, and a new \scriptpred{apple-3} object is, therefore, \textit{acquired} at the end of the scenario), \textit{3\textsuperscript{rd})} plotted particles given by the \textit{inference system} during execution of the suggested integrated approach, and \textit{4\textsuperscript{th})} the corresponding resulting anchored objects of the \textit{anchoring system} supported by the feedback of the \textit{inference system} (note that in this case the position of the \scriptpred{apple-1} object is tracked while it is occluded and moved by the \scriptpred{skin-1} object, and the \scriptpred{apple-1} object is, accordingly, \textit{re-acquired} at the end of the scenario). }
\caption{ Examples of screen-shots captured during the execution of the stated scenarios. Visually perceived anchored objects are symbolized with the unique anchor id (e.g., \scriptpred{ball-2}), while occluded hidden objects are depicted by plotted particles that represent possible positions of the occluded object in the inference system. \textit{Rows in order from the top:} \textit{1\textsuperscript{st})} example of \textit{simple occlusion} where a \scriptpred{ball} is hidden behind a \scriptpred{cup}, \textit{2\textsuperscript{nd})} the \textit{movement} of an \textit{occluded object} where a \scriptpred{glove} (or human hand) is occluding while moving an \scriptpred{apple}, \textit{3\textsuperscript{rd})} similar example of \textit{moving an occluded object} where a \scriptpred{glove} is occluding while moving a \scriptpred{ball} (\scriptpred{ball-1}), but in this case another \scriptpred{ball} object (\scriptpred{ball-2}) is also introduced during the execution of the scenario, \textit{4-6\textsuperscript{th})} a \textit{shell game} scenario where a smaller object (\scriptpred{block-3}) is hidden under one of three identical containers (\scriptpred{block-2}), and where the containers, subsequently, are shuffled around. }
\caption{Bayesian Information Criterion (BIC) for model order selection based on available data excluding Person K (\protect\marksymbol{square}{black}), Person L (\protect\marksymbol{diamond}{black}), Person M (\protect\marksymbol{triangle}{black}), and Person N (\protect\marksymbol{o}{black}), respectively.}
\caption{\footnotesize 2D tSNE~\cite{tSNE_van2014} visualization of word2vec vectors \cite{Mikolov_arXiv_2013}. Red, green and blue texts represent seen {\color{red}ModelNet40 \cite{Article10}}, unseen {\color{green}ModelNet10 \cite{Article10}} and unseen {\color{blue}McGill \cite{Article49}} classes respectively.}
\caption{Examples for three possible reasons for not understanding the text. In each example, \textcolor{blue}{(A)} is the original text and \textcolor{red}{(a)} and \textcolor{red}{(b)} are the two sentences generated by our rules.}
\caption{Examples of grammatical errors introduced by our rules. The \textcolor{red}{red} text was incorrectly inserted and the \textcolor{blue}{blue} text was incorrectly removed.}
\caption{Detailed two-rule execution example. We show in \textcolor{red}{red} parts of the input that are used for detection or modified during execution. The input token list $Z$ is of a single sentence. First, the rule for inner connective is applied, splitting $Z$ into two sentences $A,B$, without the connective \nl{because}. Then, applying the anaphora rule, the pronoun \nl{his} in $B$ is replaced with the entity it refers to in $A$, to obtain a new sentence pair.}
\caption{Example for two independent sentences, and their fusion. The modifications applied are pronominalization (\textcolor{blue}{blue}) and connective insertion (\textcolor{red}{red}).}
\caption{Generated fusion examples for different phenomena. The input text is marked in uppercase \textcolor{blue}{blue}, and the generated sentence pair is marked in lowercase \textcolor{red}{red}. We show in \textbf{boldface} parts that allow us to detect the target phenomenon.}
\caption{Illustrations of different regularization schemes. For the {\color{red}red} voxel of interest, we use voxels in {\color{blue}blue} to denote its receptive field during the cost volume regularization. The runtime memory requirement is also listed on top of the volume, where H, W and D denote the image height, width and depth sample number respectively. The 3D CNNs gather the cost information across the whole space, but require runtime memory that is cubic in the model resolution.}
\caption{The two plans, i.e., the safe plan {\color{ForestGreen!80} $\pi_s$} (left) and the probably-risky plan {\color{BrickRed!80} $\pi_{pr}$} (right) for the robot-delivery scenario.}
\caption{The $wF_\beta$ and $MAE$ of different salient object detection approaches on all test datasets. The best three results are shown in {\color{red}{red}}, {\color{blue}{blue}}, and {\color{green}{green}}.}
\caption{The effectiveness of edge preservation loss. The score of $wF_\beta$ and $MAE$ in our method when $\alpha$ is given different values. The best result is shown in {\color{red}{red}}. The test dataset is DUTS-test. }
\caption{{\color{blue}Examples of divisions that violate the asymmetry. Here the red cross ${\bf x}_a$ represents the target point, while the green cross ${\bf x}_b$ is one of its neighbors. The magenta sphere represents the spherical neighborhood range of ${\bf x}_a$, and the black curves represent divisions along $\theta$/$\phi$ directions. (a) The spherical space is not divided at all, which results in a single weight matrix to be defined in the kernel and applied to any two points ${\bf x}_a$ and ${\bf x}_b$ symmetrically. (b) Let $\Phi=[-\frac{\pi}{2},\frac{\pi}{2}]$. There will be no divisions along the $\phi$ direction, which results in a particular weight matrix to be symmetrically applied to points ${\bf x}_a$ and ${\bf x}_b$ on the $z$-axis or its parallels (the blue arrow line). (c) Let $\Theta=[-\pi,0,\pi]$, that is, $n=2$. The $\theta$ direction will be divided into two bins, which subtly results in points ${\bf x}_a$, ${\bf x}_b$ on the $x$-axis and its parallels sharing weights in a certain bin with $\theta\in[0,\pi]$ symmetrically.}}
\caption{{\bf Release of adult mosquitoes}: evolution of the uninfected (top) and {\em Wolbachia}-infected (bottom) populations, as a function of time. The larvae appear in the left column, the adults in the right one. The components of the state $x$ (resp.\ of the estimate $x_-$) appear in {\color{green}\bf green} (resp.\ in {\color{blue}\bf blue}).}
\caption{{\bf Release of larvae}: evolution of the uninfected (top) and {\em Wolbachia}-infected (bottom) populations, as a function of time. The larvae appear in the left column, the adults in the right one. The components of the state $x$ (resp.\ of the estimates $x_-$, $x^+$) appear in {\color{green}\bf green} (resp.\ in {\color{blue}\bf blue}, in {\color{red}\bf red}).}
\caption{Deblurring result on GoPro image \cite{nah2017deep}. {\color{red}\textbf{Red box:}} zoom-in view of the original local patch. {\color{green}\textbf{Green box:}} zoom-in view of the dark channel of its corresponding local patch. {\color{blue}\textbf{Blue box:}} zoom-in view of the bright channel of its corresponding local patch.}
\caption{Dataset statistics for multiclass datasets. CFMC5 has 550 problems with a balanced class distribution. CFMC10 has 1159 problems and has a class imbalance. CFMC5 is a subset of CFMC10. {\color{red}Red} classes belong to the solution category; {\color{blue}blue} classes belong to the problem category.}
\caption[Gazebo scenario]{Gazebo scenario with robot's start pose marked with \includegraphics[height=6pt]{start.eps} and goal pose with \includegraphics[height=6pt]{goal.eps} in each planning session.}
\caption{Baek \etal's GAN-based network architecture. In the diagram, interactions with the paired set \(P\) and unpaired set \(U\) are represented by \textcolor{red}{red} and \textcolor{green}{green} lines respectively, and the \textcolor{blue}{blue} lines are for interaction with both \(U\) and \(P\). Originally used in~\cite{Baek_2018_CVPR}.}
\caption{\color{blue} Computational costs for loss functions.}
\caption{\color{blue}Key frames histogram.}
\caption{A schematic of the autoencoder structure used in our work. Layers with text in {\color{mred}{red}} have ReLU activation while the last layer with text in {\color{mgreen}{green}} has SoftMax activation (see \eqnref{eq:sigma_defn}).}
\caption{Our results and SMNA re-implementation are shown in gray highlighted rows. \textbf{Bolding} indicates the best value per section and \textcolor{blue}{\textbf{blue}} indicates best values overall. We include both a short and long version of our approach to compare to existing models' greedy and beam search approaches.}
\caption{ Top-down view of the trajectory graphs for beam search and \short{}. \textcolor{Turquoise}{Blue Star} is the start and \textcolor{red}{Red Stop} is the target.}
\caption{{\bf Odor composition decoder:} ({\bf a}) Encoding: Color saturation levels represent concentrations of odorants. Colors in the sensing matrix represent receptor binding affinity to the corresponding odorant. Receptor activity is represented by colors, white = inactive. ({\bf b}) Decoding (Step 1 Elimination): Silent receptors are used to eliminate absent odorants, reducing an initially under-determined problem to a well-defined one. (Step 2 Estimation): Concentration of the remaining odorants can be estimated from responses of the remaining receptors. ({\bf c}) Same as Fig.~\ref{fig:binaryScheme}(c), now for the continuous decoder. Mean and error bar ($\pm1$ standard deviation) were computed over 10 replicate simulations, each with 1000 trials. The free parameter $\gamma$ in the analytical formula (Eq.~\ref{eq:theoreticalLimitMain}) was chosen to minimize mean squared error between the probability obtained numerically and the formula. ({\bf d}) $P(\hat{{\boldsymbol{c}}} = {\boldsymbol{c}})$ plotted as a function of number of odorants in the odor mixture ($K$) and $s*N_{\rm R}$ at fixed $N_{\rm R}$. $P(\hat{{\boldsymbol{c}}} = {\boldsymbol{c}})$ was calculated numerically over 1000 trials, each with a random choice of odor mixture and sensitivity matrix. The smooth curve in white is the approximate boundary of the perfect decoding region, estimated by setting Eq.~\ref{eq:theoreticalLimitMain} to $0.5$ ($\gamma = 3$). ({\bf e}) Error in concentration estimates, defined as Euclidean distance between actual and estimated odor concentration divided by the number of odorants ($\left( ||\hat{\boldsymbol{c}} - {\boldsymbol{c}}||_2 / K \right)$), plotted as a function of number of odorants ($K$) and $s*N_{\rm R}$ at fixed $N_{\rm R}$. The error is small even when recovery is not perfect. Other measures of error lead to similar results (SI Fig.~\ref{fig:errorMeasures}: Other measures of estimation error). 
} \label{fig:continuousScheme} \end{center} \end{figure*} \subsection*{Decoder of odor composition} If noise is low, or integration times are long, the activity of a receptor is more appropriately represented by a numerical continuum, along with the odorant concentrations ($c_i$), receptor sensitivities ($S_{ij}$), and receptor responses ($R_j$). In this case, the decoder can be modified to estimate not just which odorants are present in the mixture, but also their concentrations. While the details of the decoding scheme depend on the encoding mechanism of the receptor, the main principle remains the same. First, inactive (below threshold) receptors are used to eliminate some odorants. Then, the active receptors are used to estimate the remaining concentrations. Receptor responses can be realistically described by a competitive binding (CB) model in which odorant molecules compete to occupy the receptor binding site \cite{singh2019competitive}. The response to a mixture of odorants with concentrations $c_i$ is given by a Hill-type function \cite{singh2019competitive, reddy2018antagonism, rospars2008competitive, cruz2013neural}: \begin{align} R = \frac{\sum_{j=1}^{N_{\rm L}} x_j}{\left(1+d*\sum_{j=1}^{N_{\rm L}} x_j\right)}. \label{eq:compBind} \end{align} Here $x_j=S_{ij}c_j$ and $d$ parameterizes the affinity of odorants for the receptor. The CB model reduces to the binary model when $d\rightarrow\infty$ and to the commonly used linear response model when $d\rightarrow0$ \cite{gupta2015olfactory,martelli2013intensity, ferreira2012revisiting,Tesileanu255547}. As discussed earlier, the binary decoder works because an inactive receptor implies that all odorants that can bind to this receptor are absent. 
The concentrations of such odorants are set to zero, while concentrations of the remaining odorants are set to 1. Thus the success of the binary decoder depends on ensuring that for every odorant that is absent, there is at least one receptor that does not respond. In the continuous case, a weaker condition is sufficient. The starting point is an under-determined identification problem because the number of possible odorants exceeds the number of receptors ($N_L > N_R$). In the first step, we eliminate odorants that bind to receptors with below-threshold responses. This leaves $\tilde{N}_R$ active receptors responding to $\tilde{N}_L$ candidate odorants. If $\tilde{N}_{\rm L} \leq \tilde{N}_{\rm R}$ the problem is now over-determined and can be solved (Fig.~\ref{fig:continuousScheme}), even if all the absent odorants have not been eliminated. Specifically, the odor encoding functions (Eq.~\ref{eq:compBind}) give a set of coupled equations that relate the $\tilde{N}_L$ odorant concentrations to the $\tilde{N}_R$ receptor responses. These equations can be inverted to get the unknown concentrations. Our decoder will not eliminate any of the $K$ odorants that are present in a mixture because all of them will evoke responses. To estimate the number of false positives from the remaining $N_L - K$ odorants, let $s$ be the probability that a given receptor is sensitive to a given odorant ($P(S_{ij}>0)=s$). Then, the number of active receptors will be about $\tilde{N}_R \sim s K N_{\rm R}$ while the number of inactive receptors will be approximately $(1 - sK) N_R$. The first inactive receptor eliminates approximately a fraction $s$ of the remaining $N_L - K$ odorants. The second inactive receptor removes roughly another fraction $s$ of the remaining $(1-s) (N_L - K)$ odorants. Summing over these eliminations for all $(1 - sK) N_R$ inactive receptors leaves a total of $\tilde{N}_L \sim K + (N_L - K)(1-s)^{N_R(1 - s K) - 1}$ odorants under consideration. 
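As a rough numerical illustration of this estimate (a sketch only, assuming the representative values $\{N_{\rm L}, K , N_{\rm R}, s\} = \{10^4, 10, 500, 0.05\}$ and treating the eliminations by different inactive receptors as independent), the number of inactive receptors is $(1-sK)N_{\rm R} = 250$ and \begin{align*} \tilde{N}_{\rm L} &\sim K + (N_{\rm L} - K)(1-s)^{N_{\rm R}(1-sK)-1} \\ &= 10 + 9990 \times 0.95^{249} \approx 10 + 9990 \times 2.8\times 10^{-6} \approx 10.03 . \end{align*} Under these assumptions the expected number of absent odorants that survive the elimination step is about $0.03$, so the candidate set is essentially the set of odorants actually present. 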
Typical parameters $\{N_{\rm L}, K , N_{\rm R}, s\} = \{10^4, 10, 500, 0.05\}$ give $\tilde{N}_L \sim K = 10$ which is less than $\tilde{N}_R \sim s K N_{\rm R} = 250$, showing that in the biologically relevant regime our elimination algorithm leads to an over-determined and hence solvable identification problem. We can derive an approximate analytical expression for the probability of correct estimation (SI: Odor composition decoder). This derivation assumes that the typical number of receptors that respond to a mixture is larger than the average odor complexity $(\tilde{N}_{\rm R} > K)$, while, at the same time, enough receptors are inactive to eliminate absent odorants. This requires $s(N_{\rm R}-\tilde{N}_{\rm R})>\gamma$, where $\gamma >1$ is a parameter that depends on the receptor response model (details in SI: Odor composition decoder). With these assumptions, \begin{align} P(\hat{{\boldsymbol{c}}} & = {\boldsymbol{c}}) \sim P(\tilde{N}_{\rm R} > K)*P(N_{\rm R}-\tilde{N}_{\rm R}> (\gamma/s)) \nonumber \\ & = \left[1 - \Phi \left(\frac{K - \tilde{N}_{\rm R}}{\sqrt{\tilde{N}_{\rm R}}}\right)\right] \Phi \left(\frac{N_{\rm R} - \tilde{N}_{\rm R} - \frac{\gamma}{s}}{\sqrt{\tilde{N}_{\rm R}}}\right)\, .\label{eq:theoreticalLimitMain} \end{align} $\Phi$ is the standard normal cumulative distribution function. To numerically estimate the probability of correct decoding ($P(\hat{{\boldsymbol{c}}}={\boldsymbol{c}})$), we generated sparse odor vectors with $K$ odorants on average. Concentrations were drawn from a uniform distribution on the interval [0, 1). Elements of the sensitivity matrix were chosen to be non-zero with probability $s$, i.e., ($P(S_{ij}>0)=s$). The values of these non-zero elements were chosen from a log uniform distribution (SI: Numerical Simulations; similar results with other distributions in SI Fig.~\ref{fig:comparisonWithOtherDistributions}). The probability of correct decoding is zero when there are very few receptors ($N_{\rm R}$). 
But the probability transitions sharply to finite values at a threshold $N_{\rm R}$ which is much smaller than the number of possible odorants ($N_{\rm L}$) (Fig.~\ref{fig:continuousScheme}c). Odor compositions are recovered perfectly for a wide range of parameters (Fig.~\ref{fig:continuousScheme}d,e), so long as receptors are sufficiently sensitive ($s*N_{\rm R}>6$). Odors with the highest complexity are decoded when $s*N_{\rm R} \sim 10$--$15$. The dependence on the total number of odorants ($N_{\rm L}$) is weak (SI Fig.~\ref{fig:variationWithNL}). We quantified the error in odor estimates and found that even when decoding is not perfect there is a large parameter space where the error is small (Fig.~\ref{fig:continuousScheme}e). Since humans have about 300 receptors, our model predicts that odors with the most components can be decoded with $s\sim 3$--$5\%$ so that $sN_{\rm R} = 10$--$15$. This is consistent with observation -- human receptors have $s \sim 4\%$ \cite{mainland2015human}. For {\it Drosophila}, which has $\sim 50$ receptors, the observed sensitivity of $s \sim 14\%$ \cite{munch2016door} gives $sN_{\rm R} \sim 7$, in the expected range for successful decoding. \begin{figure} \begin{center} \includegraphics[scale=0.25,trim=0cm 0.0cm 0.0cm 0.0cm]{Figure3.pdf} \caption{{\bf Experimental Test: } We predict performance of mice in the olfactory cocktail-party task of detecting missing odorants in $K$-component mixtures \cite{rokni2014olfactory}. (a) Schematic of the two-step noisy decoder. (b) Normally distributed decision variable: mean when target is absent = 0; mean when target is present = $P(\hat{c}=c)$, the probability of correct detection of the binary decoder; standard deviations $\sigma$ in both conditions. The ideal observer detection threshold is indicated. 
Probability of correct response = probability of correct rejection + probability of true positives. (c) 1/d-prime estimated from experiment (see text) follows a linear trend with $K$. (d) Probability of correct detection of presence or absence of an odorant in a $K$-component mixture. Blue markers = fraction of correct responses (true positive + correct rejection) by mice. Red line = theoretical prediction in the absence of noise. Black line = theoretical prediction including noise determined from the data in panel (c). The noisy decision model fits the data well and predicts a striking cliff in performance for odors with $\sim 37$ components. Parameters of the theoretical model: number of odorants $N_{\rm L} = 10^4$, number of receptors $N_{\rm R}=1000$, receptor response sensitivity $s=0.05$ \cite{mainland2015human}. } \label{fig:Test} \end{center} \end{figure} \subsection*{Noisy decision making and comparison with experiment} We can compare the predictions of the odor identity decoder to the performance of mice \cite{rokni2014olfactory} in behavioral studies where the animal is presented with an odor (a mixture of odorants) and is asked to report whether a target odorant is present or not (details in SI: Comparison to experiment). We model the decision making process in this olfactory ``cocktail-party problem'' as consisting of two steps: (1) Internally representing the components of the mixture using the odor identity decoder described above, and (2) Using noisy higher level processes to decide on the presence or absence of the target odorant based on the output of the estimation step (Fig.~\ref{fig:Test}a). Decision noise will degrade performance relative to the ideal decoder. We will model this process in the brain in terms of a noisy decision variable derived from the activity of neurons in the decision circuit \cite{parker1998sense, gold2007neural}. If the target is absent the baseline-subtracted decision variable should take the value $0$. 
If the target is present the variable should take a value that is proportional to the probability $P(\hat{c}=c) = p$ of correct detection. However, in both cases the decision variable is actually distributed around the desired value with a standard deviation determined by the noise. An ideal observer then simply asks whether the target odorant is more likely to be present or absent, given the observed value of the decision variable and its distribution in the two cases (Fig.~\ref{fig:Test}b). Decision noise in this picture can be directly estimated from the data in \cite{rokni2014olfactory} using signal detection theory \cite{green1966signal}. Briefly, assuming for simplicity that the decision variable in Fig.~\ref{fig:Test}b is distributed as a Gaussian, we can estimate the standard deviation from the rate of hits (fraction of instances when the target is correctly reported to be present) and false alarms (fraction of instances where the target is falsely reported to be present). Signal detection theory \cite{green1966signal} relates the signal-to-noise ratio (SNR; also called d-prime) of the go/no-go task to the true/false positive rates as: SNR = d-prime = z(true positives) $-$ z(false positives), where z is the z-score. This analysis gives the SNR (d-prime) for mice as a function of $K$, the number of components in the mixture \cite{rokni2014olfactory} (Fig.~\ref{fig:Test}c). We estimate SNR at other values of $K$ by extrapolating the experimentally observed relationship (red line in Fig.~\ref{fig:Test}c). For a Gaussian decision variable with the same standard deviation $\sigma$ for both conditions, and a difference in means of $\mu$, standard theory \cite{green1966signal} gives ${\rm SNR} = \mu /\sigma$. We took the noise standard deviation in our model to be $\sigma$ (estimated as above from the data for each $K$) times a constant $a$ chosen to minimize the mean squared difference between theory and experiment. 
In the absence of noise ($\sigma = 0$) our binary decoder predicts essentially perfect performance for mice identifying missing odorants in odors with up to $\sim 27$ components, and a sharp fall-off thereafter (red line in Fig.~\ref{fig:Test}d). Passing this through the noisy decision making process in Fig.~\ref{fig:Test}a,b with noise estimated as described above leads to the black line in Fig.~\ref{fig:Test}d. There is an excellent match to the data in \cite{rokni2014olfactory} and a new prediction: the performance of mice in this olfactory cocktail-party problem will continue to decline linearly as the complexity of odors increases, until there are about $27$ component odorants. At that point there will be a sharp fall-off in the probability of correct detection, which will approach chance for odors with about $37$ components. \subsection*{Network implementation} We have demonstrated an efficient algorithm for decoding odor identity from a combinatorial code in which receptors that are below threshold are used to eliminate the vast majority of odorants, converting an underdetermined problem into an overdetermined one. Here, we develop a neural network implementation of the algorithm. To instantiate our algorithm mechanistically it is important to have reliable responses and non-responses in receptors, reflecting the actual concentrations of odorants. However, receptor-odorant binding is inherently stochastic. So the first step is to mitigate sensing noise. The simplest way to achieve this is to have many copies of each receptor and to average their responses. Indeed, in the first stage of the olfactory pathway in mammals, each type of receptor is individually expressed in thousands of Olfactory Sensory Neurons (OSNs) (Olfactory Receptor Neurons in insects) and, subsequently, responses of each type are aggregated in glomeruli of the Olfactory Bulb (Antennal Lobe in insects) (Fig.~\ref{fig:NeuralModel}a). 
Below-threshold responses are especially important for our algorithm. To further ensure their reliability we can arrange for receptor types to compete to suppress each other, thereby muting the weakest responses. In the presence of a firing threshold such a suppression will cause units firing at very low rates to fall silent, as we require. Well-known computational principles show that recurrent inhibitory circuits can achieve this effect. Indeed, in the second stage of olfactory processing, inhibitory interneurons implement a circuit that shuts down the output neurons (mitral cells in mammals; projection neurons in insects) of weakly active glomeruli \cite{olsen2010divisive, roland2016massive}. Next, we need a mechanism to eliminate absent odorants. To achieve this, we organize projections from glomeruli of receptors binding a given odorant to a readout unit whose activity $\hat{c}_j$ represents the odorant concentration (Figure~\ref{fig:NeuralModel}a). We can then implement the elimination step of our decoding algorithm by setting the readout unit threshold so that most of its inputs must be active to trigger a response. If odorant $j$ is not present in the mixture ($c_j = 0$), the probability that a receptor which binds to this odorant is inactive when responding to the mixture is $P(R_i=0|c_j = 0) \sim e^{-sK}$ (SI: Eq.~\ref{eq:probReceptorInactive}). So, of the roughly $sN_{\rm R}$ receptors that bind to this odorant, nearly $sN_{\rm R}e^{-sK}$ will be inactive. Taking typical numbers $\{K, N_R,s\} = \{10, 500, 0.05\}$, about $25$ receptors will respond to a ligand, and about $15$ of these will be silent if the ligand is absent. Hence, the corresponding readout unit will be silent ($\hat{c}_j = 0$). A similar architecture is seen in the feedforward projections from the Olfactory Bulb to the Piriform Cortex in mammals (Antennal Lobe to Mushroom Body in insects). 
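To make this estimate explicit, the relations can be written in standard equal-variance Gaussian notation (a sketch only; the symbols $H$ and $F$ for the hit and false-alarm rates are introduced here for illustration and are not part of the original analysis): \begin{align*} {\rm SNR} = \Phi^{-1}(H) - \Phi^{-1}(F), \qquad \sigma = \frac{\mu}{{\rm SNR}}, \end{align*} where $\Phi^{-1}$ is the inverse of the standard normal cumulative distribution function, so that $z(\cdot) = \Phi^{-1}(\cdot)$. As a purely hypothetical example, $H = 0.84$ and $F = 0.16$ would give ${\rm SNR} \approx 1 - (-1) = 2$, i.e. a noise standard deviation of about half the separation $\mu$ between the two means. 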
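The benefit of averaging can be quantified with a minimal sketch, under the assumption (made here for illustration, not taken from the data) that each of the $n$ sensory neurons expressing a given receptor reports $R + \epsilon_k$ with independent zero-mean noise $\epsilon_k$ of variance $\sigma_r^2$; the symbols $n$, $\epsilon_k$ and $\sigma_r$ are introduced only for this example: \begin{align*} \bar{R} = \frac{1}{n}\sum_{k=1}^{n}\left(R + \epsilon_k\right), \qquad {\rm Var}(\bar{R}) = \frac{\sigma_r^2}{n} . \end{align*} Under this independence assumption, pooling $n \sim 10^3$ neurons would reduce the noise standard deviation by a factor of $\sqrt{n} \approx 30$. 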
Specifically, each neuron in the third stage of olfactory processing receives inputs from many glomeruli in the second stage, but simultaneous activation from most of these is necessary for a response \cite{miyamichi2011cortical, davison2011neural, johnson2000new, franks2011recurrent}. Finally we need a mechanism to set the activity of the readout units that have not been eliminated to represent concentrations of odorants. This can be achieved through a network of recurrent connections between readout units (Figure~\ref{fig:NeuralModel}). To illustrate, suppose that the responses $R_i$ corresponding to the $i$th receptor are conveyed to the $j$th readout unit with a feedforward weight $\hat{S}_{ji}$. Also suppose that the $k$th readout unit provides recurrent input to the $j$th readout unit with a weight $p_{jk}$. A standard linearized neural network with these connections satisfies the equation \begin{align}{d\hat{c}_j \over dt}= -\hat{c}_j + \sum\limits_{i=1}^{N_{\rm R}} \hat{S}_{ji}R_i + \sum\limits_{k=1, k\ne j}^{N_{\rm L}} p_{jk}\hat{c}_k \, ,\label{eq:dynamic} \end{align} where $\hat{c}_j$ is the response of the $j$th unit. The first term on the right side represents decay of activity in the absence of inputs. In this context, we also linearize the responses so that $R_i = \sum_j S_{ij} c_j$, where $S_{ij}$ is a sensitivity matrix and $c_j$ are odorant concentrations. The steady state occurs when \begin{align} 0 = \left(-\hat{c}_j + c_j \sum\limits_{i=1}^{N_{\rm R}} \hat{S}_{ji}S_{ij} \right) + \sum\limits_{k=1, k\ne j}^{N_{\rm L}} \left(p_{jk} \hat{c}_k + c_k\sum\limits_{i=1}^{N_{\rm R}} \hat{S}_{ji} S_{ik} \right). \label{eq:FinalNeuralModel} \end{align} In the steady state $\hat{c}_j = c_j$, i.e. 
the activity of readout unit $j$ equals the concentration of odorant $j$, provided the feedforward and recurrent weights are adjusted to obey \begin{align} \sum\limits_{i=1}^{N_{\rm R}} \hat{S}_{ji}S_{ij} = 1 ~~ \mathrm{and} ~~ p_{jk} + \sum\limits_{i=1}^{N_{\rm R}} \hat{S}_{ji} S_{ik} = 0 \label{eq:weightconstraints} \end{align} for all $j$ and every $k \neq j$. The first criterion relates the feedforward weights $\hat{S}_{ji}$ to the sensing matrix $S_{ij}$ (see \cite{zhang2016robust} for a similar relation in a related context). The second criterion balances the network -- feedforward excitation ($\hat{S}_{ji}$) is compensated by recurrent inhibition ($p_{jk}$). This recurrent balanced inhibition recalls circuits of the Piriform Cortex in mammals where long-range inhibition arises via large-scale distance-independent random projections from pyramidal cells to locally inhibitory interneurons \cite{johnson2000new, franks2011recurrent, bathellier2009properties}. In insects similar recurrent inhibition is provided by a giant interneuron. \begin{figure} \begin{center} \includegraphics[scale=0.17,trim=0cm 0.0cm 0.0cm 0.0cm]{Figure4.pdf} \caption{{\bf Network implementation:} (a) An odorant $c_j$ binds many types of receptors (colors). Responses are reliably estimated by averaging multiple receptors of the same type in a second layer, where axons of each type converge. This average response is relayed to a readout layer. Units in the readout layer also receive recurrent inhibitory inputs from other readout units. Connections for one odorant and final readout unit are shown. (b) Probability of correct decoding ($P(\hat{{\boldsymbol{c}}} = {\boldsymbol{c}})$) as a function of odor complexity $K$ and $s*N_{\rm R}$ for $N_{\rm R}$ receptors, with $s$ = the probability that an odorant binds to a receptor. 
$P(\hat{{\boldsymbol{c}}} = {\boldsymbol{c}})$ was calculated numerically over 100 trials, each with random choice of odor mixture and sensitivity matrix (SI: Numerical simulations). The criterion for correct decoding was that the Euclidean distance between the odor vector $\boldsymbol{c}$ and the decoded vector $\hat{\boldsymbol{c}}$ was less than $0.01$. } \label{fig:NeuralModel} \end{center} \end{figure} The $N_{\rm L}$ constraints from the first criterion in (\ref{eq:weightconstraints}) can be solved along with $N_{\rm L}(N_{\rm L} -1)$ constraints from the second criterion because we have about $s*N_{\rm R}N_{\rm L}$ parameters in the feedforward matrix $\hat{S}$ and $N_{\rm L}(N_{\rm L}-1)$ parameters in the recurrent matrix ($p$). This gives more free parameters than constraints if $s N_{\rm R}$, the typical number of receptors responding to an odorant, is bigger than one. These network parameters can be acquired through local learning rules because the feedforward weights for readout unit $j$ are only related to the sensitivities of receptors to the corresponding odorant $j$, while the excitatory-inhibitory balance is local (unit by unit). If the response $R_i$ is a nonlinear function of its inputs, e.g. Eq.~\ref{eq:compBind}, there will still be enough parameters in a recurrent network to decode odor concentrations. However, units in the network may need to have nonlinear responses, or be organized in a deep network with multiple feedforward layers. To test our network decoder (details in SI: Numerical simulations) we selected odor sensitivity matrices $S_{ij}$ such that each odorant binds randomly to a fraction $s$ of the receptors, and assumed a response function that is linear in the odorant concentrations (Eq.~\ref{eq:compBind} with $d=0$). This linear response represented the statistically stable average over many stochastic receptors. 
We then selected feedforward projection matrices $\hat{S}_{ji}$ to readout neurons with recurrent weight matrices $p_{jk}$ satisfying the constraints in Eq.~\ref{eq:weightconstraints}. Imitating the projections to the olfactory cortex \cite{miyamichi2011cortical, davison2011neural, johnson2000new, franks2011recurrent}, we set a threshold so that the readout units only responded if at least $95\%$ of their feedforward inputs were active. Finally, we allowed the network to decode odor concentrations as the steady state of the network in Eq.~\ref{eq:dynamic}. Figure~\ref{fig:NeuralModel}b shows that the probability of correct decoding by the network is similar to results shown for the abstract decoders discussed in previous sections. \section*{Discussion} Our central idea is that receptors which do not respond to an odor convey far more information than receptors that do. This is because the olfactory code is combinatorial -- each receptor binds to many different odorants and each odorant binds to many receptors. Hence, an inactive receptor indicates that all the odorants that could have bound to it must be absent. Natural odors are mixtures of perhaps $10$--$40$ components drawn from the more than $10^4$ volatile molecules in nature \cite{dunkel2008superscent,touhara2009sensing,yu2015drawing}. We showed that if most of these molecules bind to a fraction of the receptors that is neither too small nor too large, odorants that are absent from a mixture can be eliminated from consideration with nearly perfect accuracy by a system with just a few dozen to a few hundred receptor types. The response of the active receptors can then be used to accurately decode the concentrations of the molecules that are present. Our results show that odors of natural complexity can be faithfully encoded in, and fully decoded from, signals produced by a relatively small number of receptor types that each bind to 5--15\% of odorants. 
Perhaps this explains why all animals express $\sim 300$ receptor types, give or take a small factor, although receptor diversity does increase with body size \cite{Tesileanu255547}. Even at the extremes, the fruitfly and the billion-fold heavier African elephant have $\sim 300/6$ and $\sim 300 \times 6$ receptor types respectively. We proposed a network implementation of our algorithm that resembles the architecture of the early olfactory pathway in the brain. First, with a few dozen to a few hundred receptor types, our algorithm requires each receptor type to bind to $\sim 5-15\%$ of odorants. This requirement, which recalls ideas from ``primacy coding'' \cite{wilson2017primacy,dewan2018single}, is consistent with observations from {\it Drosophila} to human \cite{munch2016door, mainland2015human}. If receptors are noisy, the next step in our decoding network is to pool signals from multiple receptors of the same type into ``glomeruli'', and to then allow lateral inhibition to suppress spurious responses due to noise. This pooling and inhibition motif is realized in the second stage of olfactory processing \cite{olsen2010divisive, roland2016massive}. The third stage of our decoding network has readout units that pool from many glomeruli, most of which must be active to produce a response. In addition, the readout units must have large-scale, recurrent, balanced inhibition. A similar architecture is visible in projections from the second to the third stage of the animal olfactory pathway, and in the recurrent circuits of the third stage \cite{miyamichi2011cortical, davison2011neural, johnson2000new, franks2011recurrent, bathellier2009properties}. Previous work has highlighted that this architecture could enable robust feedforward reconstruction of compressed odor codes \cite{zhang2016robust}, and supports both similarity search \cite{dasgupta2017neural} and novelty detection \cite{dasgupta2018neural}. 
In our network implementation the activation function of each unit was linear in the activities of other units. Real neurons have nonlinear activation functions with a threshold, saturation, and sometimes nonlinear summation of inputs. Our model, which can be regarded as a linearization of such neurons around an operating point, can be generalized to nonlinear units which still have a high threshold for activation to implement feedforward elimination of absent odorants, and recurrent inhibitory balance for concentration decoding. Our network readout units individually represented the presence or absence of odorants. By contrast, in the brain, exposure to an odorant activates a sparse, distributed collection of cortical neurons. A simple extension of our network produces such a representation. Instead of collecting all glomeruli that respond to a given odorant, we can construct readout units that sample groups of these glomeruli. An odorant would then be represented by the collective activation of a set of readout neurons, some of which may also participate in the representation of other odorants, as seen in the brain. We did not pursue this approach because we assumed knowledge of the olfactory environment and the receptor sensitivity matrix. But during development the brain does not know which odorants are present in the world and which receptors they activate. Thus, a good wiring strategy would be to project small groups of glomeruli to target readout neurons. Each such target would be a guess for a subset of receptors that will be co-activated by some odorant. The odorant is then represented by activity in every readout neuron that samples from a proper subset of the activated receptors. Finally, our feedforward weight matrix was related to the odor sensitivity matrix in order to decode the actual concentrations of odorants. As we discussed these weights could be acquired through a local learning rule. 
Our theory could be tested by sampling sensitivities of receptors for a particular odorant \cite{saito2009odor, mainland2015human}, along with feedforward projection strengths from glomeruli to their targets, perhaps measured by optogenetically activating individual glomeruli while imaging the strength of downstream responses. %% Additional text %\begin{figure} %\begin{center} %\includegraphics[scale=0.2,trim=0cm 0.0cm 0.0cm 0.0cm]{Figure4.pdf} % \caption{{\bf Effect of $N_{\rm R}$ on odor estimation:} Probability of correct decoding plotted as a function of average sparsity of odor mixtures ($K$). } %\label{fig:Predictions} % \end{center} % \end{figure} According to our decoding model, performance at estimating and discriminating complex odors should increase with the size of the olfactory receptor repertoire. So, while performance might be similar at low odor complexity, say between humans and dogs or mice, these animals should be better than humans at discriminating more complex odors as they have 2.5 times more olfactory receptor types. Thus, while comparing performance between species, one should account for odor complexity as well as the total number of receptors. This prediction can be tested by studying odor discrimination thresholds as a function of odor complexity for animals with olfactory receptor repertoires of different sizes. %% End Additional Text Our results suggest that the brain may indeed be able to discriminate the detailed composition of odors, contrary to our usual experience of olfaction as a synthetic sense. In fact behavioral experiments do show that it is possible to discriminate complex odors that differ by just a few components \cite{jinks1999limit, bushdid2014humans}. If our decoding algorithm is realized in the brain, all odors that bind to an inactive receptor type should be eliminated. 
A way of testing this prediction would be to block a specific receptor type pharmacologically, or via optogenetic suppression of the associated glomerulus. Our theory predicts that animals will then tend to behave as if ligands that bind to this receptor are absent, even if other receptors do bind them. Finally, in our model odors can be very well decoded (yellow regions in Figs.~\ref{fig:binaryScheme},\ref{fig:continuousScheme}) if they are composed of fewer than $K_{{\rm max}}$ components, where $K_{{\rm max}}$ is determined by the number of receptor types ($N_{\rm R}$) and the fraction of them that bind on average to the typical odorant ($s$). This prediction can be tested by measuring $N_{\rm R}$ and $s$ for different species and then characterizing discrimination performance between odors of complexity bigger and smaller than $K_{{\rm max}}$. We illustrated this for the mouse in Fig.~\ref{fig:Test}. Our algorithm can decode complex natural odors detected by chemosensing devices like electronic noses \cite{johnson2006dna, goldsmith2011biomimetic}. In this engineered setting, the target odorants and response functions are explicitly known so that our method of ``Estimation by Elimination'' can be precisely implemented. \acknow{VS was supported by a University of Pennsylvania Computational Neuroscience Initiative fellowship. VB was supported by Simons Foundation MMLS grant 400425, and NSF grants PHY-160761 and PHY-1734030. VB thanks the Kavli IPMU for hospitality as this work was completed. 
} \showacknow{} % Display the acknowledgments section % Bibliography \bibliography{references} \clearpage \renewcommand{\thefigure}{S\arabic{figure}} \setcounter{figure}{0} \renewcommand{\theequation}{S\arabic{equation}} \setcounter{equation}{0} \section*{Supplementary information} \subsection{Analytic estimate of the probability of correct decoding} \begin{figure}[h] \begin{center} \includegraphics[scale=0.25,trim=0cm 0.0cm 0.0cm 0.0cm]{BinaryHuristicCompNumericsNOTitleLabel.png} \caption{Plot of the difference between $P(\hat{{\boldsymbol{c}}} = {\boldsymbol{c}})$ as given by Eq.~\ref{eq:correctest} and as estimated numerically for the binary decoder. For most parameters the analytical results match the simulations. \label{fig:binaryNumericsVsAnalytics}} \end{center} \end{figure} \subsubsection{Odor identity decoder} We want the probability $P(\hat{\boldsymbol{c}} = \boldsymbol{c})$ that the decoded vector $\hat{\boldsymbol{c}}$ equals the input vector $\boldsymbol{c}$, i.e., the corresponding elements of the vectors $\hat{\boldsymbol{c}}$ and $\boldsymbol{c}$ are equal. Assuming statistical independence of the decoding of each odorant, we can write %Assuming that the probability of the decoded element $\hat{c}_i$ to be equal to ${c}_i$ is independent of the other elements in $\hat{\boldsymbol{c}}$, we can write \begin{align} P(\hat{{\boldsymbol{c}}} = {\boldsymbol{c}}) = \prod_{i=1}^{N_{\rm L}} P(\hat{c}_i=c_i) = \left[P(\hat{c}_i=c_i) \right]^{N_{\rm L}} \, .\label{eq:Probability1} \end{align} The assumption of independence is an approximation that we will validate by comparing with the full numerical results. %Note that due to the highly non-linear encoding (binary OR operation) and decoding (elimination) steps, this is a strong assumption to make. %But without this simplification the problem is analytically intractable. %However, the resulting expression provides excellent predictions to the numerical simulations. 
The decoded concentration $\hat{c}_i$ equals $c_i$ if either both of them equal 1 or both of them equal zero. Thus, the term in the square bracket in Eq.~\ref{eq:Probability1} can be written as: \begin{align} P(\hat{c}_i=c_i) = P(\hat{c}_i=1| c_i =1) P(c_i = 1) \nonumber \\ + P(\hat{c}_i = 0 | c_i = 0) P(c_i = 0), \label{eq:ProbabilityElements} \end{align} where $P(c_i =1) = K/{N_{\rm L}} = \alpha$ is the probability that an odorant is present in the mixture, and $P(c_i = 0) = (1-\alpha)$.
%The conditional probabilities in Eq.~\ref{eq:ProbabilityElements} can be found as follows:
The decoder guarantees that if an odorant $c_i$ is present and there is a receptor $R_j$ that is sensitive to it ($S_{ji}=1$), then the receptor will respond, and the decoded vector will set the corresponding element $\hat{c}_i$ to 1. If no receptor is sensitive to this odorant (i.e., $S_{ji}=0$ for all $j\in[1,N_{\rm R}]$), the decoded element will still be set to 1 by default. So, $P(\hat{c}_i = 1 | c_i = 1)=1$. To calculate $P(\hat{c}_i=0| c_i =0)$, recall that in our decoding scheme, $\hat{c}_i = 0$ if there exists at least one receptor such that $R_j = 0$ for which $S_{ji}=1$. Thus, \begin{align} P(\hat{c}_i=0| c_i =0) = P(\exists \; j: R_j =0\cap S_{ji}=1|c_i = 0), \end{align} where $\cap$ is the binary AND operation. The probability on the right is 1 minus the probability that for all receptors either $R_j=1$ or $R_j=0 \cap S_{ji}=0$. So, \begin{align} & P(\hat{c}_i=0| c_i =0) \nonumber \\ & = 1 - P(\forall \; j: R_j =1\cup (R_j=0 \cap S_{ji}=0)|c_i = 0) \nonumber \\ & = 1 - \left[P(R_j =1 \cup (R_j=0 \cap S_{ji}=0)|c_i = 0)\right]^{N_{\rm R}}, \label{eq:receptorsAreIndependent} \end{align} where $\cup$ is the binary OR operation and in the second step we have again made the assumption that the receptors are independent conditional on the value of $c_i$. 
The quantity in the bracket in Eq.~\ref{eq:receptorsAreIndependent} can be written as: \begin{align} & P(R_j =1 \cup (R_j=0 \cap S_{ji}=0)|c_i = 0) \nonumber \\ & = P((R_j =1 \cup R_j=0) \cap (R_j =1 \cup S_{ji}=0)|c_i = 0) \nonumber \\ & = P(1 \cap (R_j =1 \cup S_{ji}=0)|c_i = 0) \nonumber \\ & = P(R_j =1 \cup S_{ji}=0|c_i = 0) \nonumber \\ & = 1 - P(R_j =0 \cap S_{ji}=1|c_i = 0) \nonumber \\ & = 1 - P(R_j =0|c_i = 0)P(S_{ji}=1|c_i = 0). \end{align} Now, $P(S_{ji}=1|c_i = 0) = P(S_{ji}=1) = s$, since entries of the sensing matrix are chosen to be non-zero independently and with probability $s$. To calculate $P(R_j =0|c_i = 0)$, recall that the receptors are OR gates with inputs $S_{jk}c_{k}$. Thus, for $R_j=0$ all terms $S_{jk}c_k$ should be zero. The probability that any one such term is zero is $(1 - s \alpha)$. Since we already have $c_i = 0$, there are $(N_{\rm L} -1)$ additional terms that need to be zero. Hence, \begin{align} P(R_j =0|c_i = 0) = (1 - s \alpha)^{(N_{\rm L} -1)} \, ,\label{eq:probReceptorInactiveExact} \end{align} and \begin{align} P(\hat{c}_i=0| c_i =0) = \left(1 - \left[1-s(1-s\alpha)^{(N_{\rm L}-1)}\right]^{N_{\rm R}}\right). \label{eq:zeroProbability} \end{align} Putting this all together (using Eq.~\ref{eq:zeroProbability} in Eq.~\ref{eq:ProbabilityElements}, and the result in Eq.~\ref{eq:Probability1}), we get: \begin{align} & P(\hat{{\boldsymbol{c}}} = {\boldsymbol{c}}) = \left[\alpha + (1-\alpha) \left(1 - \left[1-s(1-s\alpha)^{(N_{\rm L}-1)}\right]^{N_{\rm R}}\right)\right]^{N_{\rm L}}. \label{eq:correctest} \end{align} Using Eq.~\ref{eq:zeroProbability}, we can also get the (approximate) probability of a false detection as $P(\hat{c}_i=1| c_i =0) = 1 - P(\hat{c}_i=0| c_i =0)$: \begin{align} P(\hat{c}_i=1| c_i =0) \approx \left[1-s(1-s\alpha)^{(N_{\rm L}-1)}\right]^{N_{\rm R}}. \label{eq:flasePositiveAppend} \end{align} This expression is approximate due to our independence assumptions. 
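As a rough numerical illustration of Eq.~\ref{eq:flasePositiveAppend} and Eq.~\ref{eq:correctest}, consider the parameter values $\{N_{\rm L}, N_{\rm R}, K, s\} = \{10^4, 500, 10, 0.05\}$ that are quoted below for the composition decoder, so that $\alpha = K/N_{\rm L} = 10^{-3}$. The numbers are rounded and are meant only as a sanity check under the independence assumptions made above:
\begin{align*}
(1-s\alpha)^{(N_{\rm L}-1)} &= \left(1 - 5\times 10^{-5}\right)^{9999} \approx 0.607, \\
P(\hat{c}_i=1| c_i =0) &\approx \left[1 - 0.05 \times 0.607\right]^{500} \approx 2 \times 10^{-7}, \\
P(\hat{{\boldsymbol{c}}} = {\boldsymbol{c}}) &\approx 1 - N_{\rm L}(1-\alpha) \times 2 \times 10^{-7} \approx 0.998 .
\end{align*}
In the last line we expanded the $N_{\rm L}$th power to first order, which is justified because the false detection probability per odorant is much smaller than $1/N_{\rm L}$. For these values a single absent odorant is falsely detected with probability of order $10^{-7}$, so even with $10^4$ candidate odorants the full mixture is expected to be decoded correctly in nearly all trials.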
% about $\hat{c}_i$'s being independent and $R_j$'s being independent conditional on the response of $\hat{c}_i$. \begin{figure} \begin{center} \includegraphics[scale=0.18,trim=0cm 0.0cm 0.0cm 0.0cm]{SI_Figure2.pdf} \caption{{\bf $P(\boldsymbol{\hat{c}}=\boldsymbol{c})$ as a function of $N_{\rm R}$ for the continuous decoder and alternative choices of the sensitivity matrix}. Results for binary encoding are the same as in Fig.~\ref{fig:binaryScheme}c and are plotted here for comparison. (a) Uniform distribution: Similar to Fig.~\ref{fig:continuousScheme}c except that the non-zero elements of the sensitivity matrix were chosen uniformly at random between [0,1]. (b) Log-normal distribution: Similar to Fig.~\ref{fig:continuousScheme}c except that the non-zero elements of the sensitivity matrix were chosen at random from a log normal distribution with the corresponding normal distribution having mean zero and standard deviation 1. \label{fig:comparisonWithOtherDistributions}} \end{center} \end{figure} {\it \underline{Approximation:} } Since the average number of odorants present in the mixture ($K = \alpha N_{\rm L}$) is small compared to $N_{\rm L}$ and $N_{\rm L} \gg 1$, we can approximate: \begin{align} (1-s\alpha)^{(N_{\rm L}-1)} = \left(1-\frac{s K}{N_{\rm L}}\right)^{(N_{\rm L}-1)} \approx e^{-s K}. \end{align} Now, since the odor sensitivity ($s$) is small, so that $s e^{-s K}$ is also small, while $N_{\rm R} \gg 1$, we further approximate \begin{align} \left[1 - s e^{-s K} \right]^{N_{\rm R}} = \left[1 - \frac{s N_{\rm R} e^{-s K}}{N_{\rm R}} \right]^{N_{\rm R}} \approx e^{-s N_{\rm R} e^{-s K}}. 
\end{align} This results in: \begin{align} & P(\hat{{\boldsymbol{c}}} = {\boldsymbol{c}}) = \left[\alpha + (1-\alpha) \left(1 - e^{-s N_{\rm R} e^{-s K}}\right)\right]^{N_{\rm L}} \label{eq:BooleanProb2} \end{align} which simplifies to: \begin{align} & P(\hat{{\boldsymbol{c}}} = {\boldsymbol{c}}) = \left[1 - e^{-s N_{\rm R} e^{-s K}} + \alpha e^{-s N_{\rm R} e^{-s K}} \right]^{N_{\rm L}} \end{align} This expression approximates to Eq.~\ref{eq:BooleanProb4} in the main text: \begin{align} & P(\hat{{\boldsymbol{c}}} = {\boldsymbol{c}}) = \left[1 - N_{\rm L} e^{-s N_{\rm R} e^{-s K}} \right]\end{align} Similarly, Eq.~\ref{eq:probReceptorInactiveExact} approximates to \begin{align} P(R_j =0|c_i = 0) = (1 - s \alpha)^{(N_{\rm L} -1)}\approx e^{-s\alpha N_{\rm L}} = e^{-sK}, \label{eq:probReceptorInactive} \end{align} and Eq.~\ref{eq:flasePositiveAppend} approximates to \begin{align} P(\hat{c}_i=1| c_i =0) \approx e^{-s N_{\rm R} e^{-s K}}. \end{align} \begin{figure} \begin{center} \includegraphics[scale=0.35,trim=0cm 0.0cm 0.0cm 0.0cm]{SI_Figure3.png} \caption{{\bf Other measures of estimation error}: (a) Total error: $L_2$ norm (or the root mean square) of the difference between actual and estimated concentrations for the continuous decoder with compressive binding encoding model. (b) $L_2$ norm of the difference between actual and estimated concentrations divided by the $L_2$ norm of the actual concentration. \label{fig:errorMeasures}} \end{center} \end{figure} \subsubsection{Odor composition decoder} For the continuous decoder to give a unique solution, the number of receptors that respond to the mixture should be larger than the number of odorants with non-zero concentrations ($K = \alpha N_{\rm L}$). This ensures that the system of equations is over-determined and can in principle be solved. Additionally, the number of receptors that do not respond should be such that the absent odorants can be set to zero. 
Since every receptor binds to $sN_{\rm L}$ odorants on average, we need at least $1/s$ receptors to cover all the odorants. In general, as the entries of the sensitivity matrix are statistically distributed, the number of receptors that do not respond should be larger than ${\gamma}/{s}$ for correct odor estimation, where $\gamma$ is a small number greater than 1. Putting this all together, if $P(\tilde{N}_{\rm R})$ is the probability distribution of the number of receptors with non-zero response, we are interested in the product $P(\tilde{N}_{\rm R}> K = \alpha N_{\rm L})\,P(N_{\rm R}-\tilde{N}_{\rm R}> \gamma/s)$. The probability that a receptor responds is: \begin{align} P(R>0) = \left(1 - P(R=0) \right) &= \left(1- (1 - s \alpha)^{N_{\rm L}}\right). \end{align} Taking the number of receptors that respond to be a Poisson variable with rate $\left< \tilde{N}_{\rm R}\right> = N_{\rm R}P(R>0)$, we can estimate the typical number of receptors that respond. For biologically appropriate parameters $\{N_{\rm L}, N_{\rm R}, K, s\} \sim \{10^4, 500, 10, 0.05\}$, the mean number of receptors that respond is $\left< \tilde{N}_{\rm R}\right> \sim 200$. The standard deviation is $\sqrt{\left< \tilde{N}_{\rm R}\right>} \sim 14$. For these values of the mean and standard deviation, we can approximate the Poisson distribution with a Gaussian $P(\tilde{N}_{\rm R}) = \mathcal{N}\left(\left< \tilde{N}_{\rm R}\right>, \sqrt{\left< \tilde{N}_{\rm R}\right>} \right)$ with mean $\left< \tilde{N}_{\rm R}\right>$ and standard deviation $\sqrt{\left< \tilde{N}_{\rm R}\right>}$. Thus, \begin{align} P(\hat{{\boldsymbol{c}}} & = {\boldsymbol{c}}) \sim P(\tilde{N}_{\rm R} > \alpha N_{\rm L})\,P(N_{\rm R}-\tilde{N}_{\rm R}> \gamma/s) \nonumber \\ & = \left[ 1 - \Phi \left( \frac{\alpha N_{\rm L} - \left< \tilde{N}_{\rm R}\right>}{\sqrt{\left< \tilde{N}_{\rm R}\right>}}\right)\right] \Phi \left( \frac{N_{\rm R} - \left< \tilde{N}_{\rm R}\right> - \frac{\gamma}{s}}{\sqrt{\left< \tilde{N}_{\rm R}\right>}}\right), \label{eq:theoreticalLimit} \end{align} where $\Phi$ is the cumulative distribution function of the standard normal distribution. 
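To give a feel for Eq.~\ref{eq:theoreticalLimit}, the following sketch evaluates it for the parameter values quoted above, $\{N_{\rm L}, N_{\rm R}, K, s\} = \{10^4, 500, 10, 0.05\}$. The value $\gamma = 2$ is an assumption made here purely for illustration (the argument only requires $\gamma$ to be a small number greater than 1), and all numbers are rounded:
\begin{align*}
P(R>0) &= 1 - \left(1 - 5\times 10^{-5}\right)^{10^4} \approx 1 - e^{-0.5} \approx 0.39, \\
\left< \tilde{N}_{\rm R}\right> &= N_{\rm R} P(R>0) \approx 197, \qquad \sqrt{\left< \tilde{N}_{\rm R}\right>} \approx 14, \\
P(\tilde{N}_{\rm R} > \alpha N_{\rm L}) &\approx 1 - \Phi\left(\frac{10 - 197}{14}\right) \approx 1 - \Phi(-13.4) \approx 1, \\
P(N_{\rm R}-\tilde{N}_{\rm R}> \gamma/s) &\approx \Phi\left(\frac{500 - 197 - 40}{14}\right) \approx \Phi(18.8) \approx 1 .
\end{align*}
Under the Gaussian approximation both factors are indistinguishable from 1, so for these parameters the estimate predicts essentially perfect decoding. The estimate falls appreciably below 1 only when the number of responding receptors becomes comparable to $K$, or when the number of silent receptors becomes comparable to $\gamma/s$.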
\subsection{Numerical Simulations} \subsubsection{Odor identity decoder} For the binary case, the elements of the odor vector were chosen to be non-zero with probability $P(c_i>0) = K/N_{\rm L}$. The entries of the sensitivity matrix $S_{ij}$ were chosen to be non-zero with probability $s$ ($P(S_{ij}>0) = s$). The receptor response was calculated using the binary `OR' function. The decoded concentration $\hat{c}$ was estimated using the two steps described in the main paper. First, the decoded concentration of any odorant to which an inactive receptor is sensitive was set to zero. Second, all remaining concentrations were set to 1. \begin{figure}[h] \begin{center} \includegraphics[scale=0.5,trim=0cm 0.0cm 0.0cm 0.0cm]{VariationWithNL.pdf} \caption{{\bf Dependence of $P(\boldsymbol{\hat{c}} = \boldsymbol{c})$ on $N_{\rm L}$}: $P(\boldsymbol{\hat{c}}=\boldsymbol{c})$ plotted as a function of odor complexity $K$ and $sN_{\rm R}$ at a fixed value of $N_{\rm R} = 500$. Each panel gives $P(\boldsymbol{\hat{c}}=\boldsymbol{c})$ for a different value of the total number of possible odorants $N_{\rm L}$. The minimum value of $s N_{\rm R}$ for successful decoding and the optimal value where the most complex odors can be decoded are both relatively independent of $N_{\rm L}$. \label{fig:variationWithNL}} \end{center} \end{figure} \subsubsection{Odor composition decoder} For the continuous case, the elements of the odor vector were chosen to be non-zero with probability $P(c_i>0) = K/N_{\rm L}$, and the elements of the sensitivity matrix were chosen to be non-zero with probability $s$ ($P(S_{ij}>0) = s$). The values of the non-zero elements in the odor vector were chosen from a uniform distribution on the interval $[0, 1)$, and for the sensitivity matrix from a log-uniform distribution between $10^{-1}$ and $10^1$. The activity of each receptor was determined using Eq.~\ref{eq:compBind} of the main text ($d = 1$). 
The concentration of any odorant to which an inactive receptor is sensitive was set to zero. After this elimination, let $\tilde{\boldsymbol{R}}$ be the vector representing the response of the set of active receptors, $\tilde{\boldsymbol{c}} $ be the vector representing the concentration of the odorants that have not been set to zero, and $\tilde{S} = \{\tilde S_{ij} \}$ be the $\tilde{N}_{\rm R} \times \tilde{N}_{\rm L}$ sensitivity submatrix over active receptors $\tilde{\boldsymbol{R}}$ and the remaining odorants $\tilde{\boldsymbol{c}}$. Then, if $\tilde{N}_{\rm R} < \tilde{N}_{\rm L}$ (non-invertible case), all decoded concentrations were set to zero. Otherwise, the decoded concentrations were given by the vector that minimized the $L_2$ distance $||\tilde{\boldsymbol{R}} - \tilde{S}\cdot\tilde{\boldsymbol{c}}||_2$. The Levenberg-Marquardt solver with geodesic acceleration from the \textit{GNU GSL} library was used to find the minimum. Multiple trials were run for each choice of parameters. At the end of each trial, the $L_2$ norm of the difference between actual and decoded concentration vectors was reported. The trial was considered a success if this norm was less than a threshold of $0.01$. Simulations were performed in C++. The sensitivity matrix $S$ and the odorant concentrations were generated from streams of (pseudo)random numbers drawn by the \textit{Xoroshiro128+} random number generator. Each stream was seeded with a $2^{64}$ forward jump from the seed of the previous stream. The first stream was seeded from the output of the \textit{SplitMix64} generator initialized by the current system time. Random number production as well as vector operation code were optimized using \textit{Intel}'s SIMD instruction set. \subsubsection{Neural network} To simulate the neural network, we generated random sparse odor vectors and sensitivity matrices. 
The elements of the odor vector were chosen to be non-zero with probability $P(c_i>0) = K/N_{\rm L}$, and the elements of the sensitivity matrix were chosen to be non-zero with probability $P(S_{ij}>0) = s$. The values of the non-zero elements were chosen from a uniform distribution on the interval $[0, 1)$. The receptor response was calculated using a linear response model ($d = 0$ in Eq.~\ref{eq:compBind}). To get the feedforward connections $\hat{S}_{ji}$, we first made a matrix $\bar{\mathcal{S}}$ defined as: $\bar{\mathcal{S}}_{ji} = 1/(S_{ij})$ if $S_{ij}$ is non-zero, and $\bar{\mathcal{S}}_{ji} = 0$ otherwise. The matrix $\hat{S}$ was then chosen as: $\hat{S}_{ji} = \bar{\mathcal{S}}_{ji}/ \left(\sum \limits_{i}\bar{\mathcal{S}}_{ji}S_{ij}\right)$, so that the first constraint in Eq.~\ref{eq:weightconstraints} is satisfied. The elements of the recurrent connectivity matrix were obtained as $p_{jk} = - \sum \limits_{i} \hat{S}_{ji} S_{ik}$. If more than 5\% of the receptors connected to readout unit $j$ were inactive, the decoded concentration $\hat{c}_j$ was set to zero. The feedforward input to each of the remaining readout units was calculated as $c^{\rm init}_j = \sum \limits_{i} \hat{S}_{ji}R_i$. The remaining concentrations were computed as $(1-\tilde{p})^{-1} \mathbf{c}^{\rm init}$, where $\mathbf{c}^{\rm init}$ is the vector representing the total feedforward input to neurons that have more than 95\% of their receptors active, and $\tilde{p}$ represents the sub-matrix of connection weights between these neurons. \subsection{Comparison to experiment} In the main text the predictions of the binary decoder are compared to the performance of mice \cite{rokni2014olfactory} in go/no-go experiments where the subject is presented with an odor (a mixture of odorants) and is asked to report whether a target odorant is present in the odor or not. The experiments of \cite{rokni2014olfactory} were conducted as follows. Thirteen mice were trained to report the presence of a target odorant by licking at a water spout, and absence by abstaining from licking. 
Feedback was provided by giving a water drop for a correct lick, and punishing an incorrect lick with a 5-second timeout. Each mouse was trained to identify 2 target odorants from a set of 16. A total of 8 different sets of target odorant pairs were used. The odor mixtures in the experiment contained between 1 and 14 odorants. In each trial, the target odorant was present with probability 0.5, and the two target odorants for a particular mouse were never presented in the same trial. The mice were first trained to identify targets in mixtures with few components, and allowed to reach a performance level of 80\%. Once mice reached this criterion, the complexity of the mixtures (number of component odorants) was gradually increased such that the distribution of mixture complexity presented in any trial gradually became uniform. Mice typically took 1000 trials to reach the 80\% criterion with a uniform distribution of odor complexity. Performance was measured after such learning had occurred. \end{document}}
\caption{(a) An example \ac{rpm}. One is asked to select an image that best completes the problem matrix, following the structural and analogical relations. Each image has an underlying structure. (b) Specifically in this problem, it is an inside-outside {\color{NavyBlue}{\textbf{structure}}} in which the outside {\color{Red}{\textbf{component}}} is a {\color{YellowOrange}{\textbf{layout}}} with a single centered object and the inside {\color{Red}{\textbf{component}}} is a $2 \times 2$ grid {\color{YellowOrange}{\textbf{layout}}}. Details in Figure~\ref{fig:process}. (c) lists the rules for (a). The compositional nature of the rules makes this problem a difficult one. The correct answer is 7.}
\caption{Four training \red{approaches} used in the experiment. \textit{base} trains the entire network from scratch, \textit{fix} trains only the classifier, \textit{init} trains the entire network from initialized weights, and \textit{kd} trains the entire network with an additional loss function.}
\caption{{\bf Feature boxplots between cells and voids.} A: generic solidity; B: inflated surface of object (\micro\squaremetre), C and D: number of deflated and inflated boundaries per object. The box extends from the lower to upper quartile values with the red line being the median. The whiskers extend from the non-outlier minimum to non-outlier maximum, while crosses are outlier values.}
\caption{{\bf Vein data distribution.} A. Mean vessel number count per section, B. Vessel cell area sum per section (\micro\squaremetre), C. Mean vessel equivalent diameter (\micro\meter), D. Fraction of the epithelium per section. The box-and-whisker plot depicts the descriptive statistics evaluated from the corresponding cross-sections. For boxplot interpretation cf. Figure \ref{fig_solidity}. }
\caption{\textbf{Void fraction at the base (B), middle (M) and apex (A) of the leaf blade.} A. Section void fraction, B. Membrane ratio, C. Cell median equivalent diameter (\micro\squaremetre), D. Void median equivalent diameter (\micro\squaremetre). The box-and-whisker plot depicts the descriptive statistics evaluated from the corresponding cross-sections, as in Fig.~\ref{fig_vein}. }
\caption{To compute a data-driven emotion mapping, we collected $23$ videos of pedestrians walking in a corridor for our Mechanical Turk perceptual user study. Users were asked to label the emotion of one pedestrian (marked in \textcolor{blue}{blue}).}
\caption{Our robot navigation algorithm satisfies the proxemic distance constraints, including peripersonal space (\textcolor{green}{green}) and interpersonal space (\textcolor{blue}{blue}). The trajectory computed by our algorithm does not intrude onto these spaces, whereas a robot that fails to consider the reachability distance (\textcolor{violet}{purple} trajectory) may cause discomfort to some pedestrians.}
\caption{Evolution of the top-3 (\protect\tikz \fill[light-red] (0.1,0.0) rectangle (0.4,0.2);), top-2 (\protect\tikz \fill[light-blue] (0.1,0.0) rectangle (0.4,0.2);) and top-1 (\protect\tikz \fill[light-yellow] (0.1,0.0) rectangle (0.4,0.2);) accuracy with the number of asked questions. The top-K accuracy with a decision tree as QA model keeps increasing slowly. The top-K accuracy rate of the RL agent as a QA model is not only higher than the decision tree method but also converges quicker in around 7 questions.}
\caption{Illustration for Alg.~\ref{algo: myalgorithm_iros}. $\mathcal{U}$ is the unknown space (\tikzrectangle[black,fill=MyblueLight]{10pt}), and $\mathcal{O}$ are the known obstacles (\tikzrectangle[black,fill=DarkOrange]{10pt}). One unknown obstacle is shown with a dotted line.}
\caption[Composite images of Experiment~1]{Composite images of Experiment~1. The UAV must fly from start \tikzcircle[black,fill=green]{2pt} to goal \tikzcircle[black,fill=red]{2pt}. Snapshots shown every 670~ms.}
\caption[Composite images of Experiment~2]{ Composite image of Experiment 2. The UAV must fly from start \tikzcircle[black,fill=green]{2pt} to goal \tikzcircle[black,fill=red]{2pt}. Snapshots shown every 330~ms.}
\caption[Composite images of Experiment~3]{ Composite image of Experiment 3. The UAV must fly from start \tikzcircle[black,fill=green]{2pt} to goal \tikzcircle[black,fill=red]{2pt}. Snapshots shown every 670~ms.}
\caption[Composite images of Experiment~4]{ Composite image of Experiment 4. The UAV must fly from start \tikzcircle[black,fill=green]{2pt} to goal \tikzcircle[black,fill=red]{2pt}. Snapshots shown every 670~ms.}
\caption{The first row illustrates that our new model can resume tracking after a failure caused by strong background clutter. The second row illustrates occlusion by a similar object: the proposed confidence-region-based model can track the right person after they reappear. \textbf{In both cases, the original ECO tracker failed.} Note that in our experiments, we use a \textcolor{red}{95\%} confidence region (better viewed in color).}
\caption{(Color online) Variations of the condition number $\kappa$ of $\boldsymbol{V}$ along the array in units of coupling lengths for different geometries. A hexagonal oversized array with an off-center $1 \times 7$ lantern (\protect\blue\space curve) provides the best performance as it has a low condition number that is fairly insensitive to length.}
\caption{(Color online) Histogram of the phase of the field transfer matrix $\psi_\mathrm{n,i}$ for i = 1, 2 and 3 (excitation of the device with the 3 modes supported by the lantern and shown in the insets) in a 9-waveguide square array (\space\protect\greenbox\space fill, $\kappa = 2 \times 10^6$) and a 27-waveguide hexagonal array with an off-center lantern (\space\protect\bluebox\space fill, $\kappa = 4.4$). Greater phase diversity of the field transfer matrix yields $\boldsymbol{V}$ matrices with lower condition numbers.}
\caption[]{Arrival directions of neutrino events from IceCube. Shown are upgoing track events~\cite{Aartsen:2016xlq,Haack:2017dxi} (\textcolor{red}{$\odot$}), the high-energy starting events (HESE) (tracks \textcolor{magenta}{$\otimes$} and cascades \textcolor{magenta}{$\oplus$})~\cite{Aartsen:2014gkd,Kopper:2015vzf,Kopper:2017zzm}, and additional track events published as public alerts (\textcolor{darkgreen}{$\odot$})~\cite{Smith:2012eu,GCN}. The blue-shaded region indicates where the Earth absorption of 100-TeV neutrinos becomes important. The dashed line indicates the equatorial plane. We also indicate the location of the blazar TXS 0506+056 ($\medwhitestar$).}
\caption{\label{fig:simulation_1_intersection}\color{blue}The intersection points on the terrain.}
\caption{\label{tab:comparison}Comparison between binary, one-hot and domain-wall encoding strategies (note that the $\delta'_{i}$ strategy is not shown in the table, but would be the same as the one used here except that it would require fourth-order coupling for a two-variable interaction). Maximum order in this case refers to the maximum number of $Z$ variables which must appear in a single Hamiltonian term for the encoding. \textcolor{red}{Red} colouring is used to indicate a major drawback of a strategy, while \textcolor{blue}{blue} indicates a major advantage conferred by a strategy. The word `complicated' is used to indicate cases where the result is likely to be highly dependent on the details of the problem being encoded. For discussion of the performance metrics, and explanations of the `complicated' cases, see appendix 1.}
\caption{\textcolor{darkgray}{Cartoon demonstrating the power of combining ISM velocity and distance information. {\bf Top:} Gas emission, DIB absorption, and stellar reddening measurements constrain different dimensions of the ISM's spatial--kinematic structure. {\bf Bottom:} This spatial--kinematic structure reveals the cloud's dynamical status.} }
\caption{Examples of \textcolor{querycolor}{queries}, \textcolor{poscolor}{positive}, and \textcolor{negcolor}{negative} samples. The negatives are sorted by difficulty from left to right (hard to easy) based on distances obtained from our re-identification feature vectors. Note that the hardest negative sample usually has only subtle differences (e.g. missing a small spoiler in the first row).}
\caption{\markedred{Confusion matrix of baseline method network}}
\caption{\markedred{Confusion matrix of CNN-LSTM network}}
\caption{Performance of the regular angular adaptivity with P$_n$, in the relative error of the 2-norm of the scalar flux across the domain, for the Brunner problem. The x, \textcolor{matlabblue}{o} and \textcolor{foliagegreen}{\CIRCLE} markers use threshold coefficients 1\xten{-3}, 1\xten{-4} and 1\xten{-5}, respectively, with the \textcolor{fireenginered}{$\triangle$} uniform (unadapted).}
\caption{Performance of the regular angular adaptivity with FP$_n$ with $\Sigma_{\textrm{f}}=1$, in the relative error of the 2-norm of the scalar flux across the domain, for the Brunner problem. The x, \textcolor{matlabblue}{o} and \textcolor{foliagegreen}{\CIRCLE} markers use threshold coefficients 1\xten{-3}, 1\xten{-4} and 1\xten{-5}, respectively, with the \textcolor{gaylordpurple}{$\triangle$} uniform (unadapted).}
\caption{Investigating the impact of a spatially-dependent filter strength with FP$_n$, in the relative error of the 2-norm of the scalar flux across the domain, for the Brunner problem. The solid \textcolor{fireenginered}{$\triangle$} is uniform P$_n$, with the dotted \textcolor{gaylordpurple}{$\triangle$} FP$_n$ with $\Sigma_{\textrm{f}}=100$, dash-dotted FP$_n$ with $\Sigma_{\textrm{f}}=10$ and dashed FP$_n$ with $\Sigma_{\textrm{f}}=1$. The \textcolor{matlabblue}{$\otimes$} are regular adapted FP$_n$ with threshold tolerance 1\xten{-4}, with spatially dependent filter strength and reduced tolerance solves, with the dotted $\Sigma_{\textrm{f}}^{\textrm{1}}=100$, the dash-dotted $\Sigma_{\textrm{f}}^{\textrm{1}}=10$ and dashed $\Sigma_{\textrm{f}}^{\textrm{1}}=1$}
\caption{Comparison of the relative error of the 2-norm of the scalar flux across the domain, for different angular discretisations, for the Brunner problem. The \textcolor{foliagegreen}{$\otimes$} are regular P$_n$ adapts with threshold coefficient 1\xten{-5} and reduced tolerance solves, with the dashed \textcolor{matlabblue}{$\otimes$} regular FP$_n$ adapts with threshold coefficient 1\xten{-4}, spatially dependent $\Sigma_{\textrm{f}}$, with $\Sigma_{\textrm{f}}^{\textrm{1}}=1$ and reduced tolerance solves. The solid \textcolor{fireenginered}{$\bigtriangleup$} is uniform P$_n$, with the dashed \textcolor{gaylordpurple}{$\bigtriangleup$} FP$_n$ with $\Sigma_{\textrm{f}}=1$, the dotted \textcolor{gaylordpurple}{$\bigtriangleup$} FP$_n$ with $\Sigma_{\textrm{f}}=100$ and \textcolor{deludedorange}{$\diamond$} uniform LS P$^0$ FEM.}
\caption{Effect of changing the filter strength of FP$_n$, in the relative error of the detector response, for the 2D dogleg problem. The solid \textcolor{fireenginered}{$\triangle$} is uniform P$_n$ and the \textcolor{gaylordpurple}{$\triangle$} are uniform FP$_n$, with densely dotted $\Sigma_{\textrm{f}}=0.1$ and dash dotted $\Sigma_{\textrm{f}}=10$. The \textcolor{matlabblue}{$\otimes$} are goal-based FP$_n$ adapts, error target 1\xten{-1} and reduced tolerance solves, spatially dependent $\Sigma_{\textrm{f}}$ with densely dotted $\Sigma_{\textrm{f}}^{\textrm{1}}=0.1$ and dash-dotted $\Sigma_{\textrm{f}}^{\textrm{1}}=10$}
\caption{Comparison of the relative error of the 2-norm of the scalar flux across the domain, for different angular discretisations, for the 2D dogleg problem. The \textcolor{foliagegreen}{$\otimes$} are goal-based P$_n$ adapts with error target 1\xten{-1} and reduced tolerance solves, with the dashed \textcolor{matlabblue}{$\otimes$} goal-based FP$_n$ adapts with error target 1\xten{-1}, spatially dependent $\Sigma_{\textrm{f}}$, with $\Sigma_{\textrm{f}}^{\textrm{1}}=10$ and reduced tolerance solves. The solid \textcolor{fireenginered}{$\bigtriangleup$} is uniform P$_n$, the dash-dotted \textcolor{gaylordpurple}{$\triangle$} is uniform FP$_n$ with $\Sigma_{\textrm{f}}=10$ and \textcolor{deludedorange}{$\diamond$} uniform LS P$^0$ FEM. The \textcolor{black}{$\otimes$} are goal-based adapted non-standard Haar wavelets with error target 1\xten{-3} and one extra adapt step (from \cite{Dargaville2019})}
\caption{Sensitivity matrix of ${\lambda _1}$ to perturbations in the interior network ({\color{red}{$\square$}} denotes the existing edges in the network).}
\caption{Illustration of the different steps implemented in the \deepobs package and their outputs. The color of each block highlights the way a user mostly interacts with this part. Blocks in \colordot{TUgold} signify classes, those in \colordot{TUdark} are used via command line scripts. \colordot{ERC_ora} signals data packaged with \deepobs and \colordot{TUred} denotes parts provided through template scripts.}
\caption{Overview of the test problems included in \deepobs with their properties showing if the test problem includes convolutional layers (\textit{Conv}), recurrent neural network cells (\textit{RNN}), dropout layers (\textit{Drop}), batch normalization layers (\textit{BN}) or weight decay (\textit{WD}). The first column highlights the machine learning task that the model performs, \ie image classification~\colordot{TUgold}~, generative model~\colordot{ERC_ora}~, natural language processing~\colordot{TUred}~or problems where the loss function is given explicitly~\colordot{TUgray}. Test problems marked in~\colorsquare{TUgreen!50}~and~\colorsquare{TUblue!50}~are part of the small and large benchmark set, respectively.}
\caption{Relative performance against learning rate for each test problem and optimizer. Top row shows test problems P1 to P4, bottom row the test problems P5 to P8. The optimizers are represented in the same color as in \autoref{fig:benchmark}, where \colordot{plot_blue} represents \sgd, \colordot{plot_ora} represents \momentum, and \colordot{plot_green} the \adam optimizer.}
\caption{\color{Gray} \textbf{Example of a standard floating figure}. \textbf{A-F}, This figure is wrapped into the standard floating environment.}
\caption{(a) Conductance $G_{\text{QD2}}$ of the signal dot (QD2) (upper panel) and conductance $G_{\text{QD1}}$ of the sensing dot (lower panel) as a function of the finger gate voltages $V_{\text{QD2}}$ and $V_{\text{QD1}}$ for a fixed back gate ($V_{\text{BG}}=3\,\text{V}$) and split gate voltage ($V_{\text{SG}}\approx-3.5\,\text{V}$). Upper panel: The lines spaced with a periodicity of 0.09 V in $V_{\text{QD2}}$ are due to Coulomb blockade resonances. The lower panel shows the simultaneously-acquired measurement of the charge detector conductance $G_{\text{QD1}}$. We observe features aligned with the Coulomb resonances in the upper panel (highlighted with vertical dashed gray line) and tilted lines resulting from the cross capacitance between the sensing dot and $V_{\text{QD2}}$ (highlighted with diagonal dashed gray line). The dashed black lines in the upper and lower panel indicate the line cuts in b, respectively. \textcolor{red}{(c) SNR of the charge detection signal for different measurement system bandwidth.}}
\caption{Comparison between the first generation of moment data and of $\chi$EFT predictions. The bold fonts denote moments for which $\chi$EFT was expected to provide robust predictions. ``{\color{blue} \bf{A}}'' means that data and calculations agree up to at least $Q^2=0.1$ GeV$^2$, ``{\color{red} \bf{X}}'' that they disagree and ``-'' that no calculation was available. {\scriptsize \emph{p+n}} indicates either deuteron data without deuteron break-up contribution, or proton+neutron moments added together with neutron information either from D or $^3$He. \label{xpt-comp}}
\caption{Task illustration: generating responses that are consistent with dialogue history in \textcolor{red}{persona}, \textcolor{mygreen}{tone} and \textcolor{orange}{topic} (from our system, 2 context turns).}
\caption{Test set performance of feature extraction (\FE) and fine-tuning (\FT) approaches for ELMo and BERT-base compared to two sentence embedding methods. Settings that are good for \FT~are colored in \textcolor{fe_red}{red} ($\Delta$=\FT-\FE~$>$ 1.0); settings good for \FE~are colored in \textcolor{fe_blue}{blue} ($\Delta$=\FT-\FE~$<$ -1.0). Numbers for baseline methods are from respective papers, except for SST-2, MNLI, and STS-B results, which are from \citet{Wang2018a}. BERT fine-tuning results (except on SICK) are from \citet{Devlin2018}. The metric varies across tasks (higher is always better): accuracy for SST-2, SICK-E, and MRPC; matched accuracy for MultiNLI; Pearson correlation for STS-B and SICK-R; and span F$_1$ for CoNLL 2003. For CoNLL 2003, we report the mean with five seeds; standard deviation is about 0.2\%.}
\caption{\label{mlp_struct} Structure of the spectral feature {\red extraction} network DML~\cite{guo2017spectral}.}
\caption{{\em Swift} BAT light curve of \smc\ during 2012. The moving average of the BAT flux is shown in gold. A super-orbital period of around $60$\,days is clearly visible. The red vertical bars indicate the duration of each \nustar\ observation presented here. The first observation (10002013001) took place near the end of the low state, while the second observation (10002013003) took place as the source was growing fainter shortly after the high state.}
\caption{Results of pulsation searches applied to each epoch. The left column shows the results of a dynamic folding search. Pulsations are not detected during Epoch I but seem to appear and gradually increase in strength as the observation continues during Epoch II. Pulsations are clearly detected in Epoch III and do not appear to vary significantly throughout the epoch. The middle column shows the results of folding searches over both the pulse period and its first derivative. The results of the dynamic searches allowed us to search over a narrower period range. The resulting $Z^2_4$ distribution (b, e, and h) for each epoch is fitted to a 2-d Gaussian distribution. The mean of the fitted Gaussian is indicated by a black cross (\ding{58}) while the white contours represent the 1- and 2-sigma confidence regions. The apparent correlation between $P$ and $\dot{P}$ is an artifact of the search itself and is not intrinsic to the source. The maximum $Z^2_4$ value achieved by each search is indicated by a blue cross (\textcolor{blue}{\ding{54}}) and was used to produce pulse profiles shown in blue in panels c, f, and i. In gold are the 90\% confidence regions determined by the Monte Carlo procedure described in Section \ref{sec:intro}. When applied to Epoch I, the search produces multiple maxima of relatively low detection probability, resulting in a poor fit which cannot constrain the pulse period and first derivative to within the search bounds. We therefore do not show the fitted Gaussian, and we choose to fold the pulse profile using the maximum nearest the values measured for Epochs II and III. The result is a profile with weak pulsations which are not detected when the last 5000\,s of Epoch I are omitted. During Epochs II and III, however, the pulse period is well-constrained, resulting in distinctive pulse profiles, shown in the right column.
Note that the scale of the y-axis in panel (c) is narrower than those of (f) and (i) in order to better illustrate the pulse profile during Epoch I.}
\caption{\label{tb:stoa_Market} Comparison to the state-of-the-art unsupervised results in the Market-1501 dataset. \red{\textbf{Red}} indicates the best and \blue{\textbf{Blue}} the second best. Measured by \%. }
\caption{Visual results of the soft multilabel-guided hard negative mining. Each pair surrounded by the \red{red box} is the similar pair mined by our model with the lowest soft multilabel agreements, and the images on their right are the reference persons corresponding to the first/second maximal soft multilabel entries. The first row is from Market-1501 and the second from DukeMTMC-reID. We highlight the discovered fine-grained discriminative clues in the bottom text for each pair. Please view on screen and zoom in. }
\caption{Progress of SuperSCS (with RB and AA directions) and SCS versus time for a large-scale SDP of the form \eqref{eq:rpca} with \(d=500\) (with \(m=625751\) and \(n=250501\)). [\textcolor{mycolor1}{\bf ---} SCS; \textcolor{mycolor2}{\bf ---} SuperSCS RB with memory 50; \textcolor{mycolor3}{\bf ---} SuperSCS AA with memory 5].}
\caption{Qualitative alignment results on a crop of an image of Bloomington from the Inria dataset. \textcolor{red}{Red: initial OSM annotations}; \textcolor{green}{green: aligned annotations}.}
\caption{Qualitative alignment results on a crop of bloomington22 from the Inria dataset. \textcolor{red}{Red: initial dataset annotations}; \textcolor{blue}{blue: aligned annotations round 1}; \textcolor{green}{green: aligned annotations round 2}.}
\caption{\textbf{Left}: ambiguity of the perfect ground truth annotations. \textbf{Right}: alignment failure case. \textcolor{magenta}{Magenta: manually aligned annotations}; \textcolor{red}{red: original dataset annotations}; \textcolor{green}{green: aligned annotations round 2}.}
\caption{Distribution of \gray spectral indexes for the 45 non-blazar radio loud AGN taken from the {\it Fermi} FL8Y compilation. The red dotted curve is the best fit Gaussian distribution with a mean value of 2.3 and a width of 0.35.}
\caption{Fermi measurements of the unresolved extragalactic \gray background (Ackermann et al. 2015) are shown in blue. Results of our calculation are shown by the red line and pink uncertainty band (see text). We find that $11\pm7$\% of the unresolved EGB at $\sim 1$ GeV can be due to unresolved, core dominant radio galaxies. }
\caption{Inserting a pedestrian video on the DukeMTMC dataset. \textcolor{red}{Click the image to play the video.}}
\caption{Typical load profile of a single-family home in summer (\textcolor{gray}{gray}) and winter (\textcolor{red}{red}).}
\caption{RGB input images (first row) and the corresponding resulting camera poses (second row), visualized in a reconstruction of the given scene (Stairs, Red Kitchen, Office). For each frame, the ground truth (\textcolor{green}{green}), the initially regressed pose (\textcolor{red}{red}) and the optimized pose using the proposed adversarial refinement (\textcolor{blue}{blue}) are displayed. Below each image, the initially regressed (left values) and refined (right values) rotation and translation errors are given.}
\caption{RGB input images (first row) and the corresponding resulting camera poses (second row), visualized in a reconstruction of the given scene (Chess, Fire, Pumpkin). For each frame, the ground truth (\textcolor{green}{green}), the initially regressed pose (\textcolor{red}{red}) and the optimized pose using the proposed adversarial refinement (\textcolor{blue}{blue}) are displayed. Below each image, the initially regressed (left values) and refined (right values) rotation and translation errors are given.}
\caption{{\bf Results without Explicitly Corrupting Training Set.} {\bf (a1)-(a2)}~LR input. {\bf (b1)-(b2)}~HR ground truth. [MS-mSSIM,RRMSE,QILV] for {\bf (c1)-(c2)}~SRGAN: \textcolor{orange}{[0.67,0.004,0.973]}, \textcolor{orange}{[0.87,0.14,0.972]}; {\bf (d1)-(d2)}~SRGAN\_E: \textcolor{blue}{[0.86,0.001,0.975]}, \textcolor{blue}{[0.87,0.011,0.973]}; {\bf (e1)-(e2)}~SRGAN\_QE: \textcolor{green}{[0.89,0.001,0.987]}, \textcolor{green}{[0.89,0.007,0.984]}; {\bf (f1)-(f2)}~{\bf SRGAN\_SQE}: \textcolor{red}{[0.91,0.001,0.996]}, \textcolor{red}{[0.90,0.008,0.993]}.}
\caption{{\bf Results with Varying Levels of Training-set Corruption.} {\bf (a)}~LR input. {\bf (b)}~HR ground truth. Results for {\bf (c1)-(c2)}~{\bf Our SRGAN\_SQE} and {\bf (d1)-(d2)}~SRGAN, with $5\%$ and $30\%$ corrupted examples introduced in the training set. [MS-mSSIM,RRMSE,QILV] for: (c1)-(c2)~{\bf Our SRGAN\_SQE} are \textcolor{red}{[0.91,0.01,0.993]} and \textcolor{red}{[0.91,0.01,0.987]}; (d1)-(d2) SRGAN are \textcolor{orange}{[0.85,0.02,0.976]} and \textcolor{orange}{[0.76,0.15,0.853]}.}
\caption[CaffeNet GRF/M component correlations]{\color{caption_main}\textbf{CaffeNet GRF/M component and mean correlations by movement type, stance limb, and motion capture (acceleration orientation) method.}~{\color{caption_sub}CNN double-cascade single fold, output channels interlaced and PCA-reduced, 100~\%~stance.}}
\caption[ResNet-50 GRF/M component correlations]{\color{caption_main}\textbf{ResNet-50 GRF/M component and mean correlations by movement type, stance limb, and motion capture (acceleration orientation) method.}~{\color{caption_sub}CNN double-cascade single fold, output channels interlaced and PCA-reduced, 100~\%~stance.}}
\caption{Qualitative image-to-GPS results. Columns from left to right are: the query image, the reference panorama image with predicted bounding boxes overlaid (\textcolor{green}{GT}, the proposed \textcolor{red}{QATM}, and the baseline \textcolor{blue}{BUPM}), and the response maps of ground truth mask, QATM-improved, and baseline, respectively.}
\caption{Qualitative template matching performance. Columns from left to right are: the \textcolor{green}{template} frame, the target search frame with predicted bounding boxes overlaid (different colors indicate different methods), and the response maps of \textcolor{red}{QATM}, \textcolor{magenta}{BBS}, \textcolor{cyan}{DDIS}, \textcolor{orange}{CoTM}, respectively. Rows from top to bottom: the top four are positive samples from OTB, while the bottom four are negative samples from MOTB. Best viewed in color and zoom-in mode. }
\caption{\emph{Visualization of testing results on UCF-Crime.} The blue curves are predictions of the action classifier trained under video-level labels, and the orange curves are the results under cleaned supervision. The ``GT'' bars in green are ground truths. {\color{cyan}{Best viewed in Adobe Reader where (d) should play as a video.}}}
\caption{\emph{Partial video of ``05\_0021'' on ShanghaiTech.} \color{cyan}{Best viewed in Adobe Reader where (a)-(c) should play as videos.}}
\caption{Time-domain responses of the power converter with DeePC. From $t=0.2\rm{s}$ to $0.7\rm{s}$, {\scriptsize{$I_d^{ref}$}} and {\scriptsize{$I_q^{ref}$}} are respectively set as $1.0{\rm{p.u.}}+ \tau_1$ and $\tau_2$ so as to get the input/output data with $u$ persistently exciting, where $\tau_1$ and $\tau_2$ are two different white noise signals (noise power: $1.0 \times 10^{-4}$). The DeePC is activated at $t=1.0\rm{s}$. {\color{ORANGE}{\bf{-----}}} without DeePC; {\color{BLUE}{\bf{-----}}} with DeePC.}
\caption{Time-domain responses of the power converter with different algorithms. The DeePC/PEM-MPC is activated at $t=0.2\rm{s}$. The grid-side inductance $L_g$ is changed from $0.34\rm{p.u.}$ to $0.35\rm{p.u.}$ at $t=0.7\rm{s}$, and to $0.5\rm{p.u.}$ at $1.0\rm{s}$. {\color{BLUE}{\bf{-----}}} PEM-MPC; {\color{ORANGE}{\bf{-----}}} DeePC ($T=500$); {\color{black}{\bf{-----}}} DeePC ($T=330$); {\color{GREEN1}{\bf{-----}}} {\scriptsize{$I_d^{ref}=1.0{\rm{p.u.}}, I_q^{ref}=0$}}.}
\caption{Variations of the optimization cost and the time-domain cost with different values of $\lambda_g$. {\color{ORANGE}{\bf{-----}}} DeePC; {\color{BLUE}{\bf{-----}}} PEM-MPC.}
\caption{NWD dependence on temperature for Model A ({\bf (a)}) and Model B ({\bf (b)}) and comparison to NWD for the ordered case of a square lattice. Squares (\textcolor{black}{$\square$}) represent the behavior for the disordered substrate, i.e. VRL, while circles (\textcolor{red}{$\bigcirc$}) represent the ordered square lattice. The computation of NWD was performed after $6\times 10^4$ simulation time steps. Averaging over $100$ realizations was performed; the error bars are smaller than the size of the presented symbols.}
\caption{ NWD dependence on temperature for Model A ({\bf (a)}) and Model B ({\bf (b)}) and comparison to NWD for the ordered case of square lattice. Squares (\textcolor{black}{$\square$}) represent the behavior for disordered substrate, i.e. VRL, while circles (\textcolor{red}{$\bigcirc$}) represent the ordered square lattice. The computation of NWD was performed after $6\times 10^5$ simulation time steps. Averaging over $100$ realizations was performed and error bars are presented.}
\caption{\textcolor{review1}{Differences between quantile regressions based on the NNQF and the k-nearest neighbors quantile regressions}}
\caption{\textcolor{review2}{Percentage of training data vs. time for training/applying the quantile regressions. In the legend, the number of nearest neighbors used by the nearest-neighbor-dependent techniques is shown in parentheses; \textcolor{red}{Red:} with NNQF; \textbf{Black:} k-nearest neighbors quantile regression (kNNQR)}}
\caption{\textcolor{review1}{Amount of training and test data in each relevant task}}
\caption{\textcolor{review2}{Average pinball-loss; the techniques showing only one result are the ones that do not use nearest neighbors}}
\caption{Pinball-loss vs. computational effort for training/applying the quantile regressions. In the legend, the number of nearest neighbors used by the nearest-neighbor-dependent techniques is shown in parentheses; \textcolor{red}{Red:} with NNQF; \textbf{Black:} benchmarks}
\caption{\textcolor{review2}{Pinball-loss improvement relative to the GEFCom14 benchmark. In the legend, the number of nearest neighbors used by the nearest-neighbor-dependent techniques is shown in parentheses; \textcolor{red}{Red:} with NNQF; \textbf{Black:} benchmarks}}
\caption{\textcolor{review2}{Reliability results obtained on the GEFCom14 data. In the legend, the number of nearest neighbors used by the nearest-neighbor-dependent techniques is shown in parentheses; \textcolor{red}{Red:} with NNQF; \textbf{Black:} benchmarks; \textcolor{blue}{Blue:} perfect reliability}}
\caption{\textcolor{review2}{Pinball-loss over all relevant tasks}}
\caption{Visualization of the base translations $\{\mathbf{c}_j\}$ learned by PoseNet~\cite{Kendall2015ICCV,Kendall2017CVPR} and MapNet~\cite{Brahmbhatt2018CVPR}. Each point corresponds to one base translation. The scale of the base translations is in meters. We show the combinations of base translations for some training images for MapNet. The weight a translation received in Eq.~\ref{eq:linear_pose_combination} for a single image, respectively all images (on the right of the figure), is indicated by colors and point sizes, with warm colors and large points for translations with a large coefficient. The training and test trajectory are shown in \textcolor{darkred}{red} and \textcolor{darkgreen}{green}. The test predictions by PoseNet and MapNet and Active Search~\cite{Sattler2017PAMI} are shown in \textcolor{blue}{blue}, \textcolor{deepmagenta}{purple}, and \textcolor{cyan}{cyan}, respectively.}
\caption{Results on the \textbf{Cambridge Landmarks}~\cite{Kendall2015ICCV} and \textbf{7 Scenes}~\cite{Shotton2013CVPR} datasets. We compare absolute (APR) and relative (RPR) pose regression methods, image retrieval (IR) techniques, and structure-based (3D) approaches. We report the median position / orientation error in meters / degree. \emph{DenseVLAD + Inter.} uses the top-20 (Cambridge Landmarks) respectively top-25 (7 Scenes) retrieved images. \textcolor{red}{Red} numbers show when a method fails to outperform the image retrieval (IR) baselines. Results for Cambridge Landmarks for MapNet are obtained running the code of the authors.}
\caption{DQN-decor (\textcolor{orange}{orange}) vs. DQN (\textcolor{blue}{blue}): all learning curves for 49 Atari games. }
\caption{QR-DQN-decor (\textcolor{orange}{orange}) vs. QR-DQN (\textcolor{blue}{blue}): all learning curves for 49 Atari games. }
\caption{MC-NNM estimates of treatment exposure on state government revenue, 1809 to 1982: {\color{Darjeeling15}{\sampleline{}}}, observed treated; {\color{Darjeeling11}{\sampleline{dashed}}}, observed control; {\color{Darjeeling15}{\sampleline{dotted}}}, counterfactual treated; {\color{Darjeeling15}{\sampleline{dash pattern=on .7em off .2em on .05em off .2em}}}, $\hat{\bar{\alpha}}_{t}$.\label{mc-estimates-rev-pc}}
\caption{A t-SNE visualization of primary (in \textcolor{blue}{blue}) and related (in \textcolor{dark-green}{green}) tasks. The distance between two tasks is based on the number of components they share. Two well separable clusters on top correspond to {\em Car Maintenance} and {\em Home Repairs} categories, while most of the tasks belong to the {\em Cooking} category.}
\caption{Example of obtained solution for \textit{Make French Toast} task. Outputs of the classifier are shown in \textcolor{blue}{blue}. Correctly localized steps are shown in \textcolor{green}{green}. False detections are shown in \textcolor{red}{red}. Ground truth intervals for the steps are shown in \textcolor{yellow}{yellow}. Failure cases include false localization due to the ordering constraints (\textit{Pour milk}, \textit{Whisk mixture} and \textit{Dip bread}) and due to a missing step (\textit{Top toast}).}
\caption{Example of obtained solution for \textit{Build Floating Shelves} task. Outputs of the classifier are shown in \textcolor{blue}{blue}. Correctly localized steps are shown in \textcolor{green}{green}. False detections are shown in \textcolor{red}{red}. Ground truth intervals for the steps are shown in \textcolor{yellow}{yellow}.}
\caption{Example of obtained solution for \textit{Make Strawberry Cake} task. Outputs of the classifier are shown in \textcolor{blue}{blue}. Correctly localized steps are shown in \textcolor{green}{green}. False detections are shown in \textcolor{red}{red}. Ground truth intervals for the steps are shown in \textcolor{yellow}{yellow}. }
\caption{Example of obtained solution for \textit{Make Irish Coffee} task. Outputs of the classifier are shown in \textcolor{blue}{blue}. Correctly localized steps are shown in \textcolor{green}{green}. False detections are shown in \textcolor{red}{red}. Ground truth intervals for the steps are shown in \textcolor{yellow}{yellow}.}
\caption{Example of obtained solution for \textit{Change a Tire} task. Outputs of the classifier are shown in \textcolor{blue}{blue}. Correctly localized steps are shown in \textcolor{green}{green}. False detections are shown in \textcolor{red}{red}. Ground truth intervals for the steps are shown in \textcolor{yellow}{yellow}.}
\caption{Example of obtained solution for \textit{Make Fish Curry} task. Outputs of the classifier are shown in \textcolor{blue}{blue}. Correctly localized steps are shown in \textcolor{green}{green}. False detections are shown in \textcolor{red}{red}. Ground truth intervals for the steps are shown in \textcolor{yellow}{yellow}.}
\caption{Erroneous predictions involving wrong objects and actions. The correct object/action is in \textcolor{green}{green}. Our method cannot distinguish particular kinds of objects, especially liquids and powders, due to the nature of the features. Examples of wrong action components show that in many cases the method captures the static context in which an object occurs, rather than the performed action.}
\caption{$\chi_{\rm r}^2$ from the $10^7$ MC simulations. Vertical line shows the domain of the first percentile of simulations in terms of the $\chi_{\rm r}^2$. \label{fig:chi}}{\includegraphics[clip=true,trim=0 0 0 0]{chi.pdf}}
\caption{C/S distribution from the 1st percentile of our MC simulations in terms of $\chi_{\rm r}^2$. Note: the element in the far right is the 66th best fit, therefore not considered an issue but still considered in the estimation of the mean and standard deviation. \label{fig:alfa}}{\includegraphics[clip=true,trim=0 0 0 0]{alfa.pdf}}
\caption{Experimental setup on the real Shadow Dexterous Hand \cite{ShadowHand}. A force/torque sensor is attached to a foam block, to measure the force applied to the block. BioTac\textregistered{} tactile sensors \cite{BioTac} were used to compute the forces exerted by the corresponding fingers.}
\caption{\label{fig:step_af.png} Multi-step Affine registration results over iteration steps. The affine network is trained \mn{using} three steps for longitudinal registration ({\color{red}red}) and five steps for cross-subject registration ({\color{blue}blue}). Performance increases with steps and finally saturates.}
\caption{\textcolor{\revisionColor}{ AUC values with regard to motion in \emph{YouTube-natural} videos}}
\caption{\textbf{Left panel:} Sampling evolution from \diamonds\,\,of the frequency centroid $\nu_0$ (orange dots) as a function of the nested iteration, for the red giant KIC~12008916. Each nested iteration corresponds to one sampling point. The sampling presented is built as follows: 1) \diamonds\,\,performed a total of $N_{\rm nest} = 6000$ iterations (hence 6000 sampling points) using 2000 live points; 2) the run is completed by adding the set of 2000 live points that is left at the end of the nested sampling, thus resulting in a total of 8000 effective nested iterations (or equivalently sampling points); 3) the sampling from the first 1500 nested iterations was removed to improve the detection of the peak structures in the PSD. This results in a final set of 6500 nested iterations, hence of sampling points used for the analysis. \textbf{Right panels:} (top) The PSD of the corresponding chunk analyzed (in gray) with a 10 frequency-bin smoothing overlaid (blue curve) to highlight the oscillation peak structures. (bottom) The counts histogram from the sampling, with local maxima corresponding to the extracted oscillation frequencies indicated by vertical dotted red lines, as identified by a hill climbing algorithm. The different regions of the histogram, denoted with varying colors, represent the ranges of frequency automatically selected to compute the final estimates of the oscillation frequencies and their corresponding uncertainties, according to the method (b) presented in Figure~\ref{fig:2}. The estimated frequencies and corresponding 1-$\sigma$ uncertainties from method (b) are indicated by the red circles and error bars, respectively. The horizontal dashed line represents the threshold level for the selection of the peaks, set to 3\,\% of the maximum counts in the histogram.}
\caption{Comparison between the oscillation frequencies extracted from \diamonds\,\,using the multi-modal approach ($\nu_0$) and those published by C15 ($\nu_\mathrm{Corsaro15}$). The values are reported in percentage of deviation with respect to the literature values. The uncertainty error bars obtained with the new approach are also overlaid for each oscillation frequency, after they were rescaled by $\nu_\mathrm{Corsaro15}$. The horizontal lines at $(\nu_0 - \nu_\mathrm{Corsaro15}) / \nu_\mathrm{Corsaro15} = 0\,\%$ denote a perfect match. The panels from left to right depict the result obtained by considering four different methods to extract the frequencies from the counts histogram: \textbf{(a)} using a symmetric frequency range for each local maximum, with an extent on each side of the maximum that is the minimum found between the two distances going from the maximum to the adjacent midpoints from the next and previous maxima (or the edge of the total frequency range) on the right and left side respectively. The final frequencies are computed as the sampling mean from the sampling falling in the selected range of each local maximum, while the corresponding uncertainty is the standard deviation from the same sampling used to compute the mean; \textbf{(b)} using the same symmetric ranges as (a) but this time performing a weighted mean and weighted standard deviation using the nested iteration of each sampling point as a weight; \textbf{(c)} using an asymmetric frequency range, built with the distances between the maximum and the adjacent midpoints from the next and previous maxima (or edge of the total frequency range) on the right and left side respectively. The mean and standard deviation of the selected sampling of the range are used as in (a) to compute the final frequency and uncertainty estimates from each local maximum; \textbf{(d)} same as the case (c) but this time using a weighted mean and standard deviation as done for the case (b).}
\caption{\red{A poloidal cross-section of the MCNP6 model including the central solenoid (CS), toroidal field (TF) coil, outer vacuum vessel (VV), first wall (FW), and three volumes of the particle source (A, B, and C) as described in Table~\ref{tab:source}. Eight micro-fission chambers (MFCs) are shown as black dots (approximately to scale) located just outside the VV and TF coil at four poloidal locations: $\theta = 0$, 45, and $\pm90$ degrees. Radial distances from the tokamak center are given in cm at the bottom, with material and geometric details described in the text.}}
\caption{\red{A top-down midplane cross-section of the tokamak, tangential MPR spectrometer, and horizontal neutron camera (HC) modeled in MCNP6. Note that the radial MPR is not shown. Also labeled are the central solenoid (CS), outer vacuum vessel (VV), and TF coil. Distances between the apertures and dimensions of diagnostics are indicated, with other material and geometric details provided in the text.}}
\caption{\red{A side view of the tokamak, radial MPR spectrometer, and horizontal and upper neutron cameras (HC/UC) modeled in MCNP6. Also labeled are the outer vacuum vessel (VV) and central solenoid (CS). Note that the tangential MPR and TF coils are not seen in this poloidal cross-section. Distances and dimensions are given, with material and other geometric details provided in the text.}}
\caption{Computational run time and scaling required to calculate $Z^{\mathrm{vib}}_\mathrm{RPI}$ as a function of $N$ for the \ch{H + CH4} reaction. Lines of best fit were modeled using power functions and optimized using non-linear least squares. Points with open circles were omitted from the lines of best fit due to large variability at such a short time scale. \red{The numbers in parentheses represent a standard deviation in the last digit.} Run times were measured on a standard desktop computer with a single processor. }
\caption{\red{Relative error in calculating $Z_{\mathrm{RPI}} / Z_{\mathrm{r}}$ as a function of the number of beads, $N$, at three different temperatures.}
%The relative error is calculated using the RPE method with $N=8192$ as the benchmark for each respective temperature.}
%The M\"uller-Brown reaction under study has a crossover temperature of 2210 K\@.
\red{The relative error was defined using the RPE results with $N=8192$ as the benchmark for each respective temperature as $\left| \frac{X_{\mathrm{method}, N}-X_{\mathrm{RPE}, 8192}}{X_{\mathrm{RPE}, 8192}} \right |$, where $X$ denotes $Z_\mathrm{RPI} / Z_\mathrm{r}$ for a given method and number of beads, $N$. No points are plotted for the benchmark because it has zero relative error by definition.}
\red{A horizontal dashed line indicates a 1\% relative error, below which results are considered converged.} In all cases, DaC8 results are so similar to DaC4 results that they cannot be distinguished on the plot. }
\caption{Quantum tunneling factors, defined by the ratio of instanton to Eyring TST rates, $k_\text{RPI} / k_\text{Eyring}$, for the syn-\ch{CH3CHOO} hydrogen transfer. Instantons were optimized on-the-fly at the B3LYP/cc-pVDZ level with $N=64$. \orange{Numbers in parentheses denote powers of 10.}}
\caption{ Frequency histograms of $(x,y,\phi)$ during different learning iterations. Top row: log probability density vs. $\phi$; bottom row: $(x,y)$ frequency heatmap. Only the TTR-based reward leads to near-complete trajectories in the 2D heat map between iterations $20$ and $40$, when the other rewards still involve much exploration. Also, the shift (\textcolor{red}{red circle}) of log probability density towards the target $\theta$ of $0.75$ rad occurs only when the TTR-based reward is used, which suggests that the TTR function is guiding learning effectively.}
\caption{Average PSNR/SSIM values for scale factors $\times 2$, $\times 3$ and $\times 4$ with \textbf{BI} degradation model. The best performance is shown in {\color{red} red} and the second best performance is shown in {\color{blue} blue}.\label{comp_sot_bi}}
\caption{Average PSNR/SSIM values for scale factor $\times3$ with \textbf{BD} and \textbf{DN} degradation models. The best performance is shown in {\color{red}red} and the second best performance is shown in {\color{blue}blue}.\label{comp_sot_md}}
\caption{(a) Monocular depth estimation and (b) Instance segmentation from a single input image; (c) Extracted point cloud frustums \textcolor{blue}{\textbf{(blue)}} overlaid on the pseudo-LiDAR \textbf{(black)}; 3D bounding box detection (\textcolor{blue}{\textbf{blue}}) results (d) without bounding box consistency (BBC) and (e) with BBC. Ground truth shown in \textcolor{red}{\textbf{red}}.}
\caption{\textbf{Proposed Pipeline.} (a) Lift every pixel of input image to 3D coordinates given estimated depth to generate pseudo-LiDAR; (b) Instance mask proposals detected for extracting point cloud frustum; (c) 3D bounding box estimated \textcolor{blue}{\textbf{(blue)}} for each point cloud frustum made to be consistent with corresponding 2D proposal. Inputs and losses are in \textcolor{red}{\textbf{red}} and \textcolor{orange}{\textbf{orange}}.}
\caption{Comparison between the \textcolor{red}{\textbf{LiDAR}} (top), \textcolor{blue}{\textbf{pseudo-LiDAR}} (middle) and an overlaid version (bottom). Two types of noise discussed in Section \ref{sec:pseudovslidar} are indicated in \textcolor{orange}{\textbf{orange}} (local misalignment) and \textbf{black} (long tail) ellipses.}
\caption{\textbf{Effectiveness of Instance Mask Proposal.} Top left: 2D box proposal. Top right: Instance mask proposal. Bottom left: Point cloud frustum lifted from 2D box proposal with noisy long tail. Bottom right: Point cloud frustum lifted from instance mask proposal with no tail. Ground truth box corresponding to the frustum shown in \textcolor{red}{\textbf{red}}.}
\caption{\textbf{Effect of Bounding Box Consistency (BBC).} Top Row: Minimum bounding rectangle (MBR) of \textcolor{blue}{3D box estimate} (white), \textcolor{red}{instance mask} \textcolor{red}{(\textbf{red})}. Bottom left: Poor 3D box estimate without BBC. Bottom right: Improved 3D box estimate with BBC. Ground truth shown in \textcolor{red}{\textbf{red}}.
%We show that, from left to right, we achieve an increase of the 2D IoU between two MBRs from 0.64 to 0.90 and an increase of the 3D IoU between the 3D bounding box estimate and its ground truth from 0.28 to 0.78, with the use of BBC.
% We show an increase of a 2D IoU from 0.62 to 0.90 and a 3D IoU from 0.28 to 0.78 with BBC.
}
\caption{Quantitative comparison on KITTI \textbf{val} set. We report the average precision (in \%) of car category on bird's eye view and 3D object detection as $\text{AP}_{\text{BEV}}$ and $\text{AP}_{\text{3D}}$. Top three rows are previous state-of-the-art methods and middle three rows colored in \textcolor{Green}{green} are concurrent works developed independently from our work. We outperform all monocular methods.}
\caption{Qualitative results of our proposed method on KITTI \textbf{val} set. We visualize our 3D bounding box estimate (in \textcolor{blue}{blue}) and ground truth (in \textcolor{red}{red}) on the frontal images (1st and 3rd rows) and pseudo-LiDAR point cloud (2nd and 4th rows).}
\caption{Additional visual comparison between the \textcolor{red}{\textbf{LiDAR}} (top), \textcolor{blue}{\textbf{pseudo-LiDAR}} (middle) and an overlaid version (bottom).
% We can see the point cloud frustums in the pseudo-LiDAR mostly have the long tail issue while their LiDAR counterparts do not have the issue. Also, a few examples of the local misalignment between pseudo-LiDAR and LiDAR frustum can be clearly observed in the overlaid version.
% Two types of noise discussed in Section \ref{sec:pseudovslidar} are indicated in \textcolor{orange}{\textbf{orange}} (local misalignment) and \textbf{black} (long tail) eclipses.
}
\caption{Plots illustrating the performance of various models on the test set, as training progresses. \textcolor{our_blue}{\textbf{Blue}} lines represent the baseline methods when no curriculum is used, and \textcolor{our_red}{\textbf{red}} lines represent the same models when different versions of our curriculum learning framework are used to train them. The vertical lines represent the step in which the models attain the BLEU score that the baseline models attain at convergence.}
\caption{ Qualitative results on THUMOS'14 and ActivityNet after action start generation (see Sec.~\ref{sec: fusion}). $\times$ means no starts are detected at those times. Numbers indicate the scores of detected action starts. Results of {\color{blue}{ClsNet}} and {\color{red}{StartNet}} are marked in blue and red, respectively. Yes/No (ground-truth) indicates if an action of the associated class starts at the time. Best viewed in color. }
\caption{Baseline comparison on SPMCS-32. {\color{red}Red} here and in the other tables indicates the best performance (PSNR/SSIM).}
\caption{Quantitative evaluation of state-of-the-art SR algorithms on Vid4 for $4\times$. {\color{red}Red} indicates the best and {\color{blue}blue} indicates the second best performance (PSNR/SSIM). The metrics are computed without cropping any border pixels and with the first and last two frames removed. For $B_{123}+T$ and DRDVSR, we use results provided by the authors on their webpage. For BRCN, VESPCN, and FRVSR, the values are taken from their publications. *The output is cropped by 8 pixels near the image boundary.}
\caption{Quantitative evaluation of state-of-the-art SR algorithms on SPMCS-11 for $4\times$. {\color{red}Red} indicates the best and {\color{blue}blue} indicates the second best performance (PSNR/SSIM).}
\caption{Comparison of Deep VSRs. (a) Input frames are concatenated to preserve temporal information~\cite{kappeler2016video,caballero2017real,jo2018deep,liao2015video}. (b) Temporal aggregation improves (a) to preserve multiple motion regimes~\cite{liu2017robust}. (c) RNNs take a sequence of input frames to produce one SR image at a target frame, $I_{t}$~\cite{huang2015bidirectional, tao2017detail, sajjadi2018frame}. (d) Our recurrent back-projection network accepts $I_{t}$, which is enclosed by a blue dashed line, as well as a set of residual features computed from pairing $I_{t}$ with other frames (i.e., $I_{t-k}$ for $k \in \{ 1,\cdots,n \}$), as enclosed by a red dotted line, while previous approaches using RNNs shown in (c) feed all temporal frames one by one along a single path.
%
% In (d), each pair of $I_{t}$ and $I_{t-k}$ provides residual
% features optimized for representing missing details in $I_{t}$
% for super-resolution.
%
Residual features computed from the pairs of $(I_{t},I_{t-k})$ (MISR path - {\color{red}the vertical red arrows}) are fused with features extracted from variants of $I_{t}$ (SISR path - {\color{blue}the horizontal blue arrows}) through RNN.
%
% Recurrent feedback connections depicted by red arrows
% allow us to optimize the network in accordance with
% temporally-smooth frames, $I_{t-1}, \cdots, I_{t-n}$.
%
}
\caption{Comparison with state-of-the-art methods. The top three results are highlighted in {\color{dred}\bf red}, {\color{dgreen}\bf green}, and {\color{dblue}\bf blue}, respectively.}
\caption{Computation graph for WS in {\color{feedforwarding}feed-forwarding} and {\color{backpropagation}back-propagation}. The numbers are equation numbers.}
\caption{\blue{A depiction of the possible ranges of AI agents and the possible tradeoff/balance between skill and style. In this tradeoff, there is a region that captures human-like skill and style. AI agents may not necessarily land in the human-like region. High-skill AI agents land in the green region, while their style may fall outside the human-like region.} }
\caption{The AI agent training pipeline, \blue{which consists of two main components: the game-play environment and the agent environment. Agents submit actions to the game-play environment and receive the next state.}}
\caption{\blue{This plot belongs to Section~\ref{sec:player-progression}.} Average cumulative reward (return) in training and evaluation for the agents as a function of the number of iterations. Each iteration is worth $\sim$60 minutes of game-play. The trained agents are: (1) a DQN agent with complete state space, (2) a Rainbow agent with complete state space, (3) a DQN agent with augmented observation space, and (4) a Rainbow agent with augmented observation space. Augmented space is the space observable by humans in addition to inferred information, which is much smaller than the complete space. }
\caption{Model performance measures the probability of the event that the Markov agent finds at least one previous action from human-played demonstration episodes in the current game state. The goal of interactive learning is to add support for new game features to the already trained model or improve its performance in underexplored game states. Plotted is the model performance during interactive training from demonstrations in a proprietary open-world game as a function of time measured in \blue{milliseconds (with the total duration around 10 minutes).} }
\caption{Comparison between OpenAI 1v1 Dota 2 Bot \cite{openai-dota2} training metrics and training an agent via bootstrap. \blue{The comparison is not 1-to-1 because the training objectives are very different. However, the environments are similar in complexity. These metrics highlight the practical training of agents during the game development cycle. The point is to illustrate that the training objectives play a critical role.}}
\caption{The proposed framework of domain adaptation for unsupervised (upper) and zero-shot learning (bottom). \newline{\color{blue}Blue} and {\color{red}red} markers represent data from source and target domains, respectively. The black markers represent learned class-level representations. The shapes of ``triangle'', ``diamond'' and ``square'' denote three different classes, whilst the shape of ``circle'' represents unlabelled samples. Filled and hollow markers represent ground truth labelling and predictions, respectively. The main difference between unsupervised and zero-shot learning conditions is the access to target data for training, as denoted by the presence and absence of the hollow markers in the left of the two conditions illustrated above. }
\caption{The top panels show the streamwise velocity autocorrelation in the streamwise (left) and spanwise (right) directions. Solid lines are without rotation ($R_\Omega=0$) while dashed lines are with mild rotation ($R_\Omega=0.32$). Black upper triangle (\protect\triangleup): $L_x/d=\pi$, $L_z/d=\pi/2$; blue lower triangle (\protect\triangledow): $L_x/d=2\pi$, $L_z/d=\pi$; green circle (\protect\circlefilled): $L_x/d=4\pi$, $L_z/d=2\pi$; yellow square (\protect\squarefilled): $L_x/d=8\pi$, $L_z/d=4\pi$. The bottom panels show the magnitude of averaged streamwise velocity for non-rotating (left) and $R_\Omega=0.32$ (right). The three-halves domain (\protect\starfilled): $L_x/d=3\pi$, $L_z/d=\frac{3}{2}\pi$ is also included.}
\caption{\textcolor{\me}{Performance of different feature mappings on the MSCOCO test split. Shared GAN learns a shared feature mapping for three sets of features with CycleGAN. Single GAN concatenates the three kinds of embeddings together and learns a mapping with CycleGAN.}}
\caption{\textcolor{\me}{Examples of unpaired image captioning failure cases. Although the accuracy of the image scene graph highly influences the performance of captioning results, our Graph-Align can still generate relevant image captions.}}
\caption{Our iterative pose optimization in high-dimensional space, schematized here in 2D. We start at an initial pose (\textcolor{red}{$\bm{\times}$}) and want to converge to the ground truth pose (\textcolor{blue}{$\bm{\circ}$}), which maximizes image similarity. Our updater CNN generates updates for each pose (\textcolor{green}{$\bm{+}$}) that bring us closer. The updates are predicted from the synthesized image of the current pose estimate and the observed depth image.}
\caption{Relative frequencies of agreement, disagreement and neutral stance for each knowledge category towards the statements ``For me, in my daily life, it is not important to know about science'' (upper row) and ``Science \& Technology are making our lives healthier, easier and more comfortable'' (lower row), shown here as examples of two distinct behaviours of attitude variables. The upper row shows an example of an asymmetric behaviour of agreement and disagreement, with the distinct ``inverted U'' curve appearing in the negative attitude. The lower row shows an item with a mostly flat disagreement curve and a monotonically increasing agreement curve. Shaded areas highlight the four consecutive knowledge bins with highest agreement in each attitude item.}
\caption{Set of 9 attitude variables in the Eurobarometer dataset. For each statement respondents were asked to state their agreement or disagreement. Starred items (\textasteriskcentered) do not have data for 1989.}
\caption{Set of 13 knowledge variables in the Eurobarometer dataset, with question statement and possible answers. A ``don't know'' option was also available in each question. The correct answer is starred (\textasteriskcentered).}
\caption{\colorfig Illustration of the displacement dependence of the photon statistics observed using PNRDs in the dark port of the two-path interferometer for a squeezing parameter of $r=1$. The displacement of the phase space distribution is shown on the upper left-hand side of the figure. The displacement dependence of the detection probabilities is shown on the lower left. An offset proportional to $(n+1/2)$ is used to distinguish the different photon numbers. The parabola shows the value of $x^{2}$ in units of the offset, indicating that the probability rapidly drops to zero for displacements larger than $\sqrt{n+1/2}$. The zero points of the distributions are marked with $x_{n,k}$ as explained in the text. The right-hand side of the figure shows the photon number distributions for $x=x_{4,1}\approx1.16$ and $x=x_{6,1}\approx1.65$, where the lowest minima of the photon number distribution are found at $n=4$ and $n=6$, respectively. }
\caption{\colorfig Explanation of the origin of multiphoton interference fringes in the photon number distribution $p_{n}(x)$ for a quadrature displacement of $x = x_{4,1} \approx 1.14$ and a squeezing parameter of $r=0.8$. The quantum state $\ket{\sigma(x)}$ is indicated by a straight line parallel to the $y$-axis at the displacement $x$. The actual phase space extent of the Gaussian Wigner function of $\ket{\sigma(x)}$ is illustrated by the blue-shaded region. The dotted arcs indicate single mode phase shifts, with $\Delta\tau(n)$ describing the phase difference between the two phases $\tau_{1}(n)$ and $\tau_{2}(n)$ that intersect the central quadrature value $x$ along a circle of photon number $n$. The solid circles show the positions of minima caused by destructive interferences in the photon number distribution $p_{n}$. Note that these minima are not necessarily at integer photon numbers. As shown in Eq. \eqref{eq::n_sep_n_phase}, the product of the arc $\Delta\tau(n)$ and the photon number difference $\Delta\nu$ between two consecutive circles is approximately equal to $2\pi$. Since $\Delta\tau(n)$ gradually increases from an initial value of about $2\pi/3$ to a final value of $\pi$, the separation $\Delta\nu$ between consecutive minima decreases from about $3$ to an asymptotic limit of $2$ as photon number increases. }
\caption{\colorfig Explanation of quantum interference fringes in photon number distributions of displaced squeezed states with a squeezing parameter of $r=0.8$. In the upper part, the first minimum $n_{\text{min.}}^{(1)}$ (green circle) and the regime $[0,\bar{n}+2\Delta n]$ (red shading inside the red dash-dotted line) for different quadrature displacements are shown in phase space. The dashed line represents the $\zeta x$-quadrature. In the lower part, the actual probability distributions $p_{n}$ of displaced squeezed states obtained from Eq. \eqref{eq::inner_prod_sq_st} are plotted as histograms. The approximate quantum interference fringes given by Eq. \eqref{eq::q_inter_approx} are plotted with orange squares, which are obtained from the orange dashed envelope function $2\rho(n,\zeta x)$ modulated by a squared cosine function with a phase of $(S(n,\zeta x)-\pi/4)$. The approximate Gaussian distribution for $\bar{n}$ and $\Delta n$ given by Eq. \eqref{eq::cl_approx} is plotted using blue circles. Panel (a) shows the photon statistics for a small displacement of $x=x_{4,1}\approx1.14$. The photon number distribution is well approximated by Eq. \eqref{eq::q_inter_approx}, but is quite different from the Gaussian distribution. Panel (b) shows the photon statistics for an intermediate displacement of $x=x_{6,1}\approx1.62$. The probabilities for photon numbers greater than $4$ are well approximated by Eq. \eqref{eq::q_inter_approx}. The probabilities of photon numbers from $0$ to $3$ are roughly given by the Gaussian, with a maximal deviation at $3$ photons. Panel (c) shows the photon statistics for a large displacement of $x=\chi_{c}\approx3.74$. In this limit, the approximation of Eq. \eqref{eq::q_inter_approx} converges on the Gaussian approximation of Eq. \eqref{eq::cl_approx}. Only a slight deviation from the Gaussian exists around $n=20$. }
\caption{\colorfig Transition from quantum interference to Gaussian distribution. The contour plot shows the photon number distribution $p_{n}(x)$ of squeezed states with $r=0.8$, where the photon number dependence is mathematically interpolated between discrete photon numbers to give a more intuitive image of the photon number dependence. The red solid line indicates the average photon number $\bar{n}$. The magnitude of photon number uncertainty $\Delta n$ is illustrated by the dashed and dotted lines, showing $\bar{n}+\Delta n$ and $\bar{n}+2\Delta n$, respectively. The region between $\bar{n}$ and $\bar{n}+2\Delta n$ is highlighted in red. The white circles mark the zero points $x_{n,k}$ of $p_{n}(x)$. The green line marks the interpolation of the lowest photon number zero points associated with $x_{n,1}$, indicating the photon number at which quantum interference starts to occur. As the displacement $x$ increases, the zero points shift out of the regime between $\bar{n}$ and $\bar{n}+2\Delta n$ and the photon statistics approaches a Gaussian as given by Eq. \eqref{eq::cl_approx}. }
\caption{\colorfig Fisher information $I_F$ of the $x$-estimation of the displaced squeezed state $\ket{\sigma(x)}$ in lossy PNRD measurements with photon losses $\epsilon$ and a squeezing parameter of $r=1$. The approximated Fisher information is given in Eq. \eqref{eq::FI_PNR_reduced}. Panel (a) shows the $x$- and $\epsilon$-dependence of the approximated Fisher information $I_F$. Panel (b) shows the $x_{\text{eff.}}$-dependence of the Fisher information $I_F$ for a small photon loss $\epsilon = 0.002$. The green dashed line $(1-\epsilon)\mathcal{H}_{F}$ is the asymptotic limit of the approximated Fisher information $I_{F}$ (the blue solid line) for $x\to \infty$. The approximated Fisher information is very close to the value obtained from the precise photon number distributions (the orange dot-dashed line). The green highlighted area between the asymptotic limit and the approximated Fisher information is the reduction function $\Delta_{Q}$ given in Eq. \eqref{eq::reduction_fct_sq_st}. The white circles are the estimation sensitivity $1/(N\delta^{2} x)$ obtained from a numerical simulation of a lossy PNRD estimation using $N=2000$ samples. Panel (c) shows the $x_{\text{eff.}}$-dependence of the Fisher information for photon losses of $\epsilon = 0.01$. As the photon losses increase, the approximation becomes less accurate, and the structure of the dips is broadened. Panel (d) shows the $x_{\text{eff.}}$-dependence of the Fisher information for photon losses of $\epsilon = 0.05$. The main dips now appear as a small modulation of a nearly homogeneous reduction of Fisher information. }
\caption{ The multi-source acquisition-cube for a few of the possible acquisition functions. \mi, \ivr, and \ip~stand for `mutual information', `integral variance reduction', and `integral precision', respectively.
%
The forward arrows ({\color{mred}\protect\rotatebox[origin=c]{-150}{$\twoheadrightarrow$}}) denote the special case of one source only ($L=1$) as in the case of \vbq. The downward-facing arrows ($\downarrow$) denote the special case where the cost $c$ is not dependent on the locations $\sX_\star$. The double-lines ({\color{forestgreen}$\xlongequal{~~~}$}) between nodes denote that these acquisition functions are equivalent in the sense that they yield the same optimal $\sX_{\star}$. The two grayed-out acquisitions for \ip~highlight that they exhibit non-favorable behavior (cf.~Section~\protect\ref{sec:multi-source-acq}). The bottom front row in the cube denotes the special case of \vbq~($L=1$ and $c(\sX_\star)=\text{const.}$) where all three acquisition policies (\mi, \ivr, \ip) coincide. }
\caption{The proposed semi-supervised framework for landmark localization. The labeled and unlabeled branches are marked with \textcolor{blue}{blue} and \textcolor{red}{red} arrows, respectively. Given an input image, \gls{g} produces $K$ heatmaps, one for each landmark. Labels are used to generate real heatmaps as~$\omega(\lmk^l)$. \gls{g} produces fake samples from the unlabeled data. Source images are concatenated on heatmaps and passed to \gls{d}.}
\caption{Random samples of landmarks predicted using $\lkl$ (white), with the ground truth drawn as line segments (\textcolor{red}{red}). Notice that the predicted points tend to overlap with the ground truth. Best viewed in color. Zoom in for greater detail.}
\caption{ Transport characteristics of 25 nm La:BaSnO$_3$ films. (a) Temperature dependence of the zero-field resistivity of the samples (red) A1, (blue) A2 and (green) A3. (b) Mobile electron carrier concentration vs. temperature and (c) electron mobility vs. temperature of the same La:BaSnO$_3$ films characterized in Figs.~\ref{fig:first}\textcolor{blue}{(c)}-\textcolor{blue}{(d)} and Fig.~\ref{fig:second}. (d) Measured electron mobility as a function of the growth temperature of the SrZrO$_3$ buffer layer.}
\caption{\blue{Main characteristics of the 6 groups of datasets we use for experimentation. The last 5 columns indicate the number of verbatims contained in the dataset, the number of codes the codeframe consists of, the average and median length of the verbatim (i.e., number of non-unique words contained in it), and the average number of codes per verbatim.}}{\scriptsize
\begin{tabular}{|c|c|c|r|r|r|r|r|r|}
\hline
 & \side{\textbf{Group of datasets}} & \side{\textbf{Type}} & \side{\textbf{\# verbatims per dataset}} & \side{\textbf{Tot \# binary codes}} & \side{\textbf{Avg \# words per verbatim}} & \side{\textbf{Median \# words per verbatim}\phantom{x}} & \side{\textbf{Avg \# codes per verbatim}} & \side{\textbf{Avg \# positive verbatims per code}}\\
\hline
1 & LL-ACE & market research & 201 & 75 & 1.80 & 1 & 1.22 & 9.76 \\
2 & LL-BDFGHIL & market research & 501 & 333 & 5.56 & 3 & 1.26 & 13.27 \\
3 & Egg & customer sat & 926 & 74 & 26.97 & 22 & 1.74 & 90.60 \\
4 & ANES-L/D & political survey & 2,665 & 1 & 26.88 & 21 & 0.52 & 1396.00 \\
5 & MDS & product reviews & 2,000 & 4 & 129.74 & 84 & 0.50 & 1000.00 \\
6 & Reuters-21578(10) & newswires & 10,788 & 10 & 127.76 & 84 & 0.93 & 997.90 \\
\hline
\multicolumn{3}{c}{\mbox{}} & \textbf{Tot} $\rightarrow$ & 497 \\
\end{tabular}
}
\caption{Average PSNR and SSIM results of different methods for different degradation settings on the color BSD68 dataset~\cite{MartinFTM01,roth2009fields,zhang2017beyond}. The best two results are highlighted in \textcolor[rgb]{1.00,0.00,0.00}{red} and \textcolor[rgb]{0.00,0.00,1.00}{blue} colors, respectively.}
\caption{The two panels are distribution{\red s} of $\ln B_{01}$ for different $r$ (with ${\cal A} = 0.6$) and ${\cal A}$ (with $r=0.15$) respectively. The other parameters are chosen as $\beta = 0.006$, $t_{echo}=0.0295$, and $\Delta t_{echo} = 0.0295$. The yellow dashed lines in both panels are the thresholds of $r$ and ${\cal A}$ where the Bayes factor can be used to select the UIE model.}
\caption{\label{fig:kappa} The similarity of the transformation from 3D cubic to 2D honeycomb planar geometry structures ($c$-BAs $\to$ $g$-BAs \emph{vs.}\ diamond $\to$ graphene) is in contrast to the opposite $\kappa$ variation. When transforming from 3D into 2D, the $\kappa$ of BAs is found to be anomalously lowered by more than one order of magnitude. % (a) The structure of graphene in 2D is the (111) cross section of the structure of diamond in 3D, which is planar due to the $sp^2$ hybridization of carbon atoms. (b) The $g$-BAs to $c$-BAs is like graphene to diamond. (c) The comparison of $\kappa$ of diamond, graphene, $c$-BAs,\cite{PhysRevLett.111.025901, PhysRevB.2017.96.161201} and $g$-BAs. }
\caption{(Best viewed zoomed in.) We confirm the effects of SIBAN through a visualization of the learned representations $z_S$ \& $z_T$ using t-distributed stochastic neighbor embedding (t-SNE)~\cite{maaten2008tSNE}. Specifically, we show the results of the non-adapted model in (a) \& (d), IBAN in (b) \& (e) and SIBAN in (c) \& (f), respectively. In the first row, we label the t-SNE map by domains, where \textcolor{red}{red} denotes the source domain and \textcolor{blue}{blue} denotes the target domain. In the second row, we label the t-SNE map by different classes. The colors are consistent with the annotation maps.}
\caption{B-rank output from our model contrasted with baselines. Type I errors are in \textcolor{red}{red}, type II errors in \textcolor{orange}{orange}, and correctly tagged untranslated terminology in \textcolor{blue}{blue}.}
\caption{Structure of the gradient transformer module, which has a newly proposed Region Norm (RN) layer, $1 \times 1$ convolutional layer (bias enabled) and identity mapping. We insert four probes (\textcolor{red}{a}, \textcolor{red}{b}, \textcolor{red}{c} and \textcolor{red}{d}) to assist analysis in Section~\ref{sec:universal_analysis} and Section~\ref{sec:universal_exp}.}
\caption{\label{tbl:summarization}CNN/Daily Mail summarization test results. We compare the performance of our models (both with and without the n-gram repetition reranking approach of \red{Chen and Bansal}) to strong abstractive and extractive systems from previous work. }
\caption{Source dataset is in {\color{red} red} and target is in {\color{blue} blue}. Best viewed in color.}
\caption{Dense captioning with different levels of contextual interactions: (i) without any contextual cues (marked by {\color{cyan} blue})~\cite{johnson2016densecap}, (ii) with guidance from the global cue (marked by {\color{red} red})~\cite{yang2017densecap}, and (iii) with mutual interactions from neighboring (marked by \textcolor{orange}{orange}) and global visual information. (Best viewed in color.) }
\caption{The architecture of CAG-Net. The multi-scale features are generated by the proposed Contextual Feature Extractor after region proposals. Then the \textcolor{cyan}{\textit{local}} (in blue) feature of the target region and multi-scale context cues, \ie, \textcolor{red}{\textit{global}} (in red) and \textcolor{orange}{\textit{neighboring}} (in orange), broadcast into the Attribute Grounded Caption Generator for region captioning in parallel. The final descriptions of the target region are generated jointly by the hierarchical structures trained with the auxiliary attribute losses. }
\caption{ % An example of Contextual Feature Extractor for the {target proposal}. (left) The similarity graph between the {\color{cyan}target} (in blue) proposal and the contextual {\color{orange}neighboring} (in orange) proposals is generated considering both spatial configuration and appearance similarity. (right) The \textit{neighboring} features are obtained by fusing the {contextual neighboring proposals} with the similarity graph. Best viewed in color.}
\caption{Comparisons between different network structures. (a) L generates the descriptions separately after region proposals; (b) L + G generates descriptions with not only the {\color{cyan}local} feature but also the {\color{red}global} feature of the image; (c) L + G + N (CCI) integrates {\color{red}global}, {\color{orange}neighboring} and {\color{cyan}local} information for the target to generate descriptions; (d) CAG-Net by multiple LSTM cells is a stacked version of (c) CCI but supervised with hierarchical linguistic attribute losses. }
\caption{ % The unrolled structure of Contextual Cue Integrator (CCI). (a) Unrolled structure integrates the {\color{cyan}local} (in blue) information and multi-scale context cues, \ie, {\color{red}global} (in red) and {\color{orange}neighboring} (in orange). The hollow circle stands for the LSTM cell and the plus sign for feature fusion. (b) The captioning loss consists of a sentence loss and an attribute loss. }
\caption{Performance comparison with the state-of-the-art algorithms on the depth map dataset for different upsampling factors. The table shows the means and (standard deviations) over all images of the MSE (in pixel$^2$), MAE (in pixels), and PBP (in \%). $\dagger$ trained on the high-res ground truth target. \colorbox{Gray}{Best overall}, \textbf{Best without high-res ground truth targets}.}
\caption{Performance comparison with the state-of-the-art algorithms on the vegetation height map dataset for different upsampling factors. The table shows the means and (standard deviations) over all images of the MSE (in m$^2$), MAE (in m), and PBP (in \%). $\dagger$ trained on the high-res ground truth target. \colorbox{Gray}{Best overall}, \textbf{Best without high-res ground truth targets}.}
\caption{Quantitative results for benchmarking the proposed SPANet and the state-of-the-art derainers on the proposed test set. The original codes of all these derainers are used for evaluation. We have also trained CNN-based state-of-the-art methods~\cite{fu:cvpe:2017:ddn,yang:cvpr:2017:j,zhang:cvpr:2018:did,li:eccv:2018:rsecan} on our dataset, and results are marked in \red{red}. The best performance is marked in {\bf bold}. Note that due to the lack of density labels for the rain images in our dataset, we only fine-tune the pre-trained model of DID-MDN~\cite{zhang:cvpr:2018:did} without re-training the label classification network.}
\caption{Visual comparison of the state-of-the-art CNN-based derainers trained on the original/proposed datasets. Methods in \red{red} mean that they are retrained on the proposed dataset. PSNR/SSIM results are included for reference.}
\caption{Performance comparison of Barista (\color{black!20!blue}blue\color{black}) and Prophet (\color{black!50!green}green\color{black}) along with ground truth (first dataset) (\color{black!30!red}red\color{black}).\\}
\caption{Performance comparison of Barista (\color{black!20!blue}blue\color{black}) and Prophet (\color{black!50!green}green\color{black}) along with ground truth (second dataset) (\color{black!30!red}red\color{black}).}
\caption{%Vertical Scaling to allocate the number of CPU cores (red line) while maintaining the SLO bound of 5 seconds. The \color{blue}{blue} \color{black} dotted line shows the workload pattern, and the solid \color{blue!50!black}{navy blue} \color{black} line shows the latency of the prediction services %if run on maximum allocated cores on a VM of 8 cores. The \color{green!60!black}{green} \color{black} line shows the latency if we dynamically (de)-allocate the cores.}
\caption{Top ranked distribution that describes the variation in the sample data. The distribution (\color{blue}blue\color{black}) is plotted on top of the histograms (\color{orange}orange\color{black}) of observations. }
\caption{The minimum WRMSE and WMAE of existing state-of-the-art edge-preserving smoothing methods and deep models. The optimal parameter setting of each algorithm is used across the entire dataset. {\color{red}Red}, {\color{green}Green} and {\color{blue}Blue} indicate the best, second best and third best results, respectively. }
\caption{\small \textbf{Tracking Segmentation} on the DAVIS2017 validation set. %Some baseline numbers reported by \cite{yang2018efficient}. Methods marked with \textcolor{red}{$^{1st}$} additionally use the first frame and its mask (provided) for tracking in the rest of the video. The \textcolor{blue}{number} in bracket is the estimated number of frames used for training the corresponding method. }
\caption{\textbf{Human Pose Tracking} on JHMDB dataset. Methods marked with \textcolor{red}{$^{1st}$} additionally use the first frame with its mask for propagating to the remaining frames. ``mgPFF+ft'' means that we fine-tune the mgPFF model specifically on the videos from this dataset in an unsupervised way (no annotations used). }
\caption{An example of \varmisuse shown in {\color{red}{red}} text. At test time, one prediction task is generated for each of the variable-use locations (Blue boxes).}
\caption{Results on KITTI dataset using the test split suggested in~\cite{eigen2014depth}. For the training data, K represents KITTI dataset, CS is CityScapes dataset~\cite{cordts2016cityscapes}, and S is vKITTI dataset. Methods that apply domain adaptation techniques are marked in {\color{gray} gray}.}
\caption{\small \textbf{DADA architecture (top) and DADA learning scheme (bottom)}. In the top part, the dark-blue stack shows the backbone CNN network; light-blue boxes symbolize the network modules; and green blocks stand for output features. In the lower part, the arrows drawn in \textcolor{scolor}{blue} and \textcolor{tcolor}{red} differentiate network flows of \textcolor{scolor}{source} and \textcolor{tcolor}{target} samples respectively. For convenient reference, over the learning blocks -- illustrated by dashed boxes -- we indicate the corresponding equation numbers.}
\caption{Characterizing thermal lensing of different optical elements. \textbf{a)} Setup for thermal lensing measurements. Along the full path, the high power beam passes through two lenses and one AOM. A $BSF10-C$ coated beam sampler creates a low-power ($P<10\, \mathrm{W}$) copy of the beam, which is focused by a third lens $f_{3}$ and sent to a CCD camera mounted on a translation stage (double arrow). The focus position is measured by recording the peak intensity of the Gaussian spot versus the camera position. \textbf{b)} Thermal shifts $\Delta z_{th}$ as a function of the laser power recorded for different combinations of optical elements. Right axis: $\Delta z_{th}$ due to the $f_1-f_2$ telescope with $f_2=50\, \mathrm{mm}$ in $Suprasil$\textsuperscript\textregistered $3001$ (black triangles) or in UV fused silica (red diamonds). The shift of the $f_1=200\, \mathrm{mm}$ fused silica lens alone (yellow circles) has been tested directly by measuring its focus shift versus the beam power. For each data set, the dashed line is the corresponding shift calculated by Gaussian beam propagation analysis, assuming each element to represent an additional lens with $f_{th}$ given by Eq. \ref{fth_formula} and characterized by the corresponding $m_0$ value listed in Table \ref{table1}. Left axis: Thermal shift of the optical setup with inclusion of the AOM crystal, with (black squares) or without (red circles) quartz window in the beam path. The AOM was placed at $d_{AOM,2}= 3(1) \, \mathrm{cm}$ behind the second lens $f_2$, the last lens $f_3$ at $d_{3,AOM} = 58(2) \, \mathrm{cm}$, whereas the window (if present) was at $d_{win,3}= 12(1)\, \mathrm{cm}$ after $f_3$. Solid lines (same color code) show the focus shift calculated by Gaussian beam propagation analysis, assuming the AOM thermal lens to be described by Eq. \ref{fth_formula} with the $m_0$ value given in Table \ref{table1}.}
\caption{Intra-word CS between Spanish and Wixarika, \textbf{(a)} standard LID for CS, \textbf{(b)} our task. { \color{ForestGreen}PPFV stands for past perfective.}}
\caption{{\color{ForestGreen}Confusion matrices of the two best models on both datasets. The $x$ axis represents tags seen in the gold standard, and the $y$ axis shows the corresponding predicted tags. Values are rounded up, therefore not all columns add up to 1.}}
\caption{Median of yearly heating threshold temperatures with $\left[ q_{25\%},q_{75\%}\right]$ uncertainty range determined by using electricity consumption data with daily (black), weekly (blue) and monthly (red) resolution. Countries in which electricity covers less than 15\% of the final heat demand are shown with faint colors. Results for all countries apart from Denmark, France and UK were obtained by using electricity consumption data provided by ENTSO-E. Results for France, Denmark and UK were obtained by using data from national sources as stated in Section \ref{sec: el_data}. Italy is not shown as heating by electricity is classified as non-existing.}
\label{fig: el_all}
\end{figure*}
%-------------
\begin{table}
\centering
\caption{Heating threshold temperatures for heating by gas and electricity with uncertainty ranges. n.a. denotes that the share of the fuel type is below 15\% and the results are not trusted.}
\footnotesize
\renewcommand{\arraystretch}{1.3}
\begin{tabular}{|ccc|}
\hline
{} & \textbf{Electricity} & \textbf{Gas - Eurostat}\\
Country & T$_{0} \left[q_{25\%},q_{75\%}\right]$ $\degree$C & T$_{0} \left[q_{25\%},q_{75\%}\right]$ $\degree$C\\
\hline
AUT & n.a. & 14.59 $\left[14.08,15.41 \right]$ \\
BEL & n.a. & 15.20 $\left[14.59,16.02 \right]$ \\
BGR & 12.76 $\left[11.53,14.08 \right]$ & 16.02 $\left[15.31,18.06 \right]$ \\
CZE & n.a. & 14.80 $\left[14.80,15.10 \right]$ \\
CHE & 16.84 $\left[15.61,17.65 \right]$ & n.a. \\
DEU & n.a. & 13.98 $\left[13.67,14.80 \right]$ \\
DNK & n.a. & 15.20 $\left[14.69,15.71 \right]$ \\
EST & n.a. & 11.12 $\left[10.71,13.47 \right]$ \\
ESP & 9.69 $\left[5.00,13.27 \right]$ & 18.47 $\left[17.35,21.94 \right]$ \\
FIN & 13.16 $\left[11.53,14.18 \right]$ & n.a. \\
FRA & 13.98 $\left[13.47,14.39 \right]$ & 15.61 $\left[15.20,16.02 \right]$ \\
GBR & n.a. & 14.18 $\left[13.37,15.10 \right]$ \\
GRC & n.a. & 16.84 $\left[13.57,19.59 \right]$ \\
HRV & n.a. & 18.67 $\left[17.76,20.20 \right]$ \\
HUN & n.a. & 16.84 $\left[16.53,17.24 \right]$ \\
IRL & n.a. & 12.76 $\left[10.51,14.18 \right]$ \\
ITA & n.a. & 15.61 $\left[15.20,16.02 \right]$ \\
LTU & n.a. & 15.20 $\left[11.53,17.65 \right]$ \\
LVA & n.a. & 12.96 $\left[12.04,13.98 \right]$ \\
NLD & n.a. & 13.98 $\left[12.55,15.51 \right]$ \\
NOR & 11.53 $\left[10.71,12.45 \right]$ & n.a. \\
POL & n.a. & 15.2 $\left[14.49,16.33 \right]$ \\
PRT & 11.94 $\left[10.20,15.20 \right]$ & n.a. \\
ROU & n.a. & 15.41 $\left[13.78,18.88 \right]$ \\
SWE & 13.16 $\left[12.76,14.08 \right]$ & n.a. \\
SVN & n.a. & 15.41 $\left[14.80,16.02 \right]$ \\
SVK & n.a. & 14.18 $\left[13.06,15.92 \right]$ \\
BIH & 12.76 $\left[10.71,13.67 \right]$ & n.a. \\
SRB & 17.65 $\left[16.84,17.86 \right]$ & n.a. \\
\hline
\end{tabular}
\renewcommand{\arraystretch}{1}
\label{tab: ht_summary}
\end{table}
\noindent Tab. \ref{tab: month_classification} presents a 10-year average (2008-2017) of monthly aggregated heating degree-days for each country. Enveloped months represent the summer season for which space heating is usually not required, since the heat absorbed during daylight hours is enough to keep the buildings warm during colder periods. The binary indicator function, $\Theta _X$, takes a value of zero for the enveloped months and one for the rest. For countries for which threshold temperatures are available for both heating by gas and electricity, the minimum required heating season is shown. It is clear that all countries exhibit a summer period from June to August. Apart from this, the classification shows a spread in the summer months, which mostly depends on the geographical position of the countries. As could be expected, South European countries usually have longer summer periods without heating while the Northern countries tend to have shorter summer periods.
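Written out, the indicator described above can be sketched as follows, where the month index $m$ is notation assumed here for illustration and does not appear in Tab. \ref{tab: month_classification}:
\[
\Theta_X(m) = \left\{ \begin{array}{ll} 0, & \text{if month } m \text{ is enveloped (summer season) for country } X,\\ 1, & \text{otherwise.} \end{array} \right.
\]
Under this reading, weighting the monthly heating degree-days by $\Theta_X(m)$ removes the summer contribution and leaves the remaining months unchanged.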
\\ \noindent Daily and weekly aggregated gas and electricity consumption data belonging to the winter-classified months have been used to recalculate the heating threshold temperatures. The results are shown in Figs. \ref{fig: gas_all} and \ref{fig: el_all}, respectively. For both consumption types, the statistical similarity in threshold temperatures for each individual country provides a robust indication of the adequacy of using less granular data for estimating the threshold temperatures. On the other hand, it is clear that the threshold temperatures increase with increasing data granularity.
\\ \noindent In the following we illustrate the significance of determining country-specific heating threshold temperatures and summer seasons. As a case study, results for Great Britain are used, but an identical analysis can be performed for each individual country by utilizing the heating degree-days in Tab. \ref{tab: month_classification}. For Great Britain, October averages 90 heating degree-days and is classified as a winter month, while May, which also averages 90 heating degree-days, is not. Unlike for May, the winter classification of October is explained by an existing relation between gas consumption and heating degree-days. On the other hand, the AIC evidence ratio for May is below 2 and, thus, more years of consumption data would be needed to fully justify this classification. A similar classification is shown for Hungary for May and September as observed in Fig. \ref{fig: HUN_ex}. Similar cases appear for Denmark, Estonia, Greece, Romania, Switzerland and Bosnia \& Herzegovina as shown by Tab. \ref{tab: month_classification}. These issues arise mostly during autumn and spring, when the monthly temperature differences exhibit large variances over the years.
\\ \noindent Average heating degree-days calculated by using a threshold temperature of 14 $\degree$C, 16 $\degree$C and 18 $\degree$C for Great Britain are shown in Fig. \ref{fig: HDDvsMonths}.
The summer season is shown by a drop in heating degree-days from May to October. It is clear that a 2 $\degree$C increase in the threshold temperature introduces a significant difference in the accumulated heating degree-days over a year. The most striking result which emerges from the classification is the extreme change in the seasonal pattern of the heating degree-days. \\
%-------------
\begin{figure}
\centering
\includegraphics[width=0.48\textwidth]{HDDvsmonths.png}
\caption{10-year average of heating degree-days for Great Britain calculated with heating threshold temperatures, $T_0 = 14\degree$C (yellow), $T_0 = 16\degree$C (red) and $T_0 = 18\degree$C (black). Solid lines illustrate the heating degree-days during winter months as a result of the classification. Dotted lines illustrate summer months for which space heating is not needed and which have to be removed.}
\label{fig: HDDvsMonths}
\end{figure}
%-------------
\noindent Quantitative measures of the heating degree-days are shown in Tab. \ref{tab: HDDvsMonths_summary} for six case studies. In the most extreme scenario, case study c) overestimates the heating degree-days by approximately 93\%, which is almost a doubling in comparison to case study d). For a fixed average space heat demand per capita per heating degree-day, $L^\text{space heat}_{0,\text{GBR}}$, the energy demand for space heating, Eq. \ref{eq: space_heat}, is consequently overestimated by identical shares. These results suggest that the current estimations of the energy demand for space heating in various projects might be considerably over- or underestimated for some countries. This might in turn affect, e.g., the estimation of CO$_2$ emissions, the technology choice for heating or peak demand estimation.
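As a rough check of the comparison between case studies c) and d), using only the rounded entries of Tab. \ref{tab: HDDvsMonths_summary}, the ratio of yearly heating degree-days is
\[
\frac{2896}{1510} \approx 1.92 ,
\]
i.e. an overestimation of roughly 92\% from the rounded values, in line with the near doubling stated above.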
On the other hand, a yearly fixed energy consumption for space heating will be distributed differently according to the seasonal distribution of heating degree-days.\\
\begin{table}
\centering
\caption{Overview of yearly aggregated heating degree-days for six case studies of Great Britain denoted by a)-f).}
\footnotesize
\renewcommand{\arraystretch}{1.3}
\begin{tabular}{|c|cc|}
\hline
{} & \textbf{With summer season} & \textbf{Without summer season}\\
\hline
$\boldsymbol{T_0 = 14\degree}$\textbf{C} & a) 1654 & d) 1510\\
$\boldsymbol{T_0 = 16\degree}$\textbf{C} & b) 2235 & e) 1927\\
$\boldsymbol{T_0 = 18\degree}$\textbf{C} & c) 2896 & f) 2350 \\
\hline
\end{tabular}
\renewcommand{\arraystretch}{1}
\label{tab: HDDvsMonths_summary}
\end{table}
\noindent Fig. \ref{fig: val} illustrates the relationship between the monthly averaged ground temperature measurements (blue curve), the threshold temperature (red dashed line) and the classified summer season (hatched area) for Greece, Italy and Norway. From these figures it is clear that the monthly averaged temperature falls below the heating threshold temperature outside the hatched area, which indicates that space heating is needed.\\
%\noindent Energy demand for space heating will be reduced in every country for a corresponding decrease in the heating degree-days. By using Eq. \ref{eq: space_heat} we find an average of 15\% or 40 TWh/yr reduction in space heat demand for Great Britain from 2008 to 2017 from omitting the heating degree-days that are present during the Summer period. Here we assume a population of $p_{\text{GBR}} = $ 64.4 m and $L^{\text{space heat}}_{0,\text{GBR}}=$ 320kJ/cap/HDD for the residential and commercial sectors. $T_0=$14.03$\degree$C and $\Theta_{\text{GBR}}$ from Tab. \ref{tab: month_classification}. An identical heating threshold temperature as in Eurostat and HRE provides a heat demand of 474TWh/yr which is approximately double our finding. By using 16$\degree$C and 18$\degree$C as in Stratego and Odyssee or EIA we find 353TWh/yr and 458TWh/yr.
\subsection{Country-wise validation}
\noindent \textbf{Denmark} has no heating season defined by law. District heating provides 54\% of the end-use heat demand \citep{nordic} and dominates Danish heat production. During summer time the district heating utilities mainly deliver hot water. A similar summer season is determined in this work with a heating threshold temperature of 15.20 $\left[14.69,15.71 \right]$ $\degree$C. A similar finding was presented by \cite{dahl2017decision}. \\%From Fig. \ref{fig: hvidesande}-\ref{fig: skagen} it is clear that the Danish district heating utilities stop providing heat from June to September. \\
\noindent \textbf{Czech Republic} has a legally defined heating season that lasts from September 1st to May 31st \citep{stratego2}. If the daily average outside temperature is below 13$\degree$C, the district heating utilities start to deliver heat. An identical heating season is determined by this work with a threshold temperature of 14.80 $\left[ 14.80,15.10 \right]$ $\degree$C. A possible explanation for the discrepancy in threshold temperature is that our results cover the complete heat production by electricity and gas while 13$\degree$C only refers to district heating. \\
\noindent \textbf{Great Britain} has no heating season defined by law, but a typical heating season starts on October 1st and ends on April 30th and is further conditioned on the daytime peak temperature being 16 $\degree$C or lower for a few consecutive days \citep{stratego2}. An identical heating season is proposed in this study with a threshold temperature of 14.18 $\left[13.37,15.10 \right]$ $\degree$C, which is consistent with 16$\degree$C daytime and 9$\degree$C night temperatures.
A threshold temperature of $13\degree$C is proposed by the \cite{HSE} based on qualitative surveys.\\
\noindent \textbf{Germany} has no legal heating season, but the German Tenants Association \citep{DMB} states that the heating season typically runs from October 1st to April 30th. This gives a heating season that is two months shorter than found here, which is explained by the threshold temperatures. The Association of German Engineers, VDI 2067, estimates a German heating threshold temperature of 12$\degree$C, whereas a threshold of 13.98 $\left[ 13.67,14.80 \right]$ $\degree$C is presented by this study. \\
\noindent \textbf{Finland} also has no heating season defined by law. According to \cite{jylha2015hourly} an accepted heating threshold temperature is 12$\degree$C from autumn to December, dropping to 10$\degree$C during spring. Here, a value of 13.16 $\left[11.53,14.18 \right]$ $\degree$C is proposed to be used from September through to May. \\
\noindent \textbf{Italy} has several heating seasons defined by law depending on six different climatic zones from the mountainous North with a colder climate to the flat South with a temperate climate \citep{stratego2}. October 15th is the earliest date at which heating is permitted, and the season lasts at most until April 15th. A national-aggregate heating period is found to run from October 1st to March 31st with a threshold of 15.61 $\left[ 15.20,16.02 \right]$ $\degree $C. \\
\noindent \textbf{Croatia} has a typical heating season ranging from September 15th to May 15th. In this study, the heating season is proposed to start on September 1st and last until April 30th with a heating threshold temperature of 18.67 $\left[17.76,20.20 \right]$ $\degree $C. \\
\noindent \textbf{Romania}'s district heating utilities begin to operate by law if the outside average temperature reaches 10$\degree $C or lower for three consecutive days, and no later than November 1st.
Heat delivery stops, by law, if the daily average temperature exceeds 10$\degree $C for three consecutive days and not earlier than April 15th. In this work, the overall heating season is found to start on October 1st and last until March 31st with a heating threshold temperature of 15.41 $\left[ 13.78,18.88 \right]$ $\degree $C. \\
\noindent \textbf{Spain} has heating threshold temperatures between 13 and 14.8 $\degree$C depending on the region \citep{labandeira2012estimation, blazquez2013residential}. In this work, 9.69 $\left[5.00,13.27 \right]$ $\degree$C is found for electricity use and 18.47 $\left[ 17.35,21.94 \right]$ $\degree$C for gas use.
%-------------
\begin{figure}
\centering
\includegraphics[width=0.5\textwidth]{val_fig.png}
\caption{Monthly averaged temperatures from 2008-2017 with one sigma uncertainty range (solid blue curve with shaded region), heating threshold temperature with $\left[q_{25\%},q_{75\%}\right]$ uncertainty range (red dashed line with shaded region) and classified summer season (black hatched area) for Greece (upper figure), Italy (central figure) and Norway (lower figure).}
\caption{Main MetaModel of CloudCAMP framework. The black lines depict containment, the {\color{red}red} lines depict inheritance and {\color{blue} blue} lines depict connection.}
\caption{\color{blue}{Box 1} \color{black}depicts the responsibilities of service deployment team, which is to define the low-level scripts so that existing automation tools can configure the application components and orchestration tools can provision the infrastructure for application components and execute them on heterogeneous cloud environments. \color{red}{Box 2} \color{black} depicts the contributions of this paper which introduces a self-service framework and automates whole infrastructure design solutions for these tools.}
\caption{Comparison of the local Matsubara Green's functions and the total spectral functions of the extended Hubbard model in the Normal, AFM, and CO phases. \textcolor{blue}{What is the message here? Should this be replaced with something involving the self energy? Maybe a different cuts?} }
\caption{Thermal conductivity computed for a 13824-atom supercell of a-Si using the quantum QHGK approach in the quantum regime (Eq. \ref{eq:quantum}), compared with the Allen-Feldman approach ~\cite{Allen1989,Allen1993,Feldman:1993tn} and experimental data (\protect\markerone, \protect\markerthree ~Ref.\cite{Zink:2006wt}), (\protect\markertwo ~Ref.\cite{Cahill1994}). The broadening $\eta$ used in Allen-Feldman calculations is set equal for every normal mode.}
\caption{%\jack{Put the normal distribution as yellow!} Taken from \citep{RBOCPD, GVI}. Comparing likelihood-based robust losses usable within \textcolor{GVIColor1}{\textbf{\GVI}} with the standard negative log likelihood loss used within \textcolor{VIColor}{\textbf{\VI}}. % \textbf{Left}: Transforming the loss provides robustness against model misspecification. Depicted are posterior predictives under $\varepsilon = 5\%$ outlier contamination using \textcolor{VIColor}{\textbf{\VI}} and \textcolor{GVIColor1}{$P(\sum_{i=1}^n\Lb(\*\theta, \*y_i),\KLD, \mathcal{Q})$}, with $\Lb(\*\theta, \*y_i)$ as in eq. \eqref{eq:BD^loss} for $\beta=1.5$. % \textbf{Right:} % Depicted is the influence \citep[see][]{InfFct} of the 100th observation $\*y_{100}$ on exact posteriors for robust and non-robust losses in standard deviations from the posterior mean after 99 observations. % Higher influence is assigned for \textcolor{VIColor}{$\mathbf{-\log(p(\*y_i, \*\theta))}$} the more unlikely $\*y_i$ is under the current model. % In contrast, \textcolor{GVIColor1}{$\mathbf{\Lb(\*\theta, \*y_i)}$} guards against assigning the highest influence to outliers. % $\Lg(\*\theta, \*y_i)$ behaves similarly. % % %\textcolor{GVIColor1}{\textbf{\GVI}}, %\textcolor{FVIColor}{\textbf{\FVI}} and %focus on one model while AR-VI smoothes between the two. This demonstrates that using AR-VI to produce more conservative marginal variances implicitly changes the loss function of the Bayesian problem. }
\caption{ % Taken from \citep{GVI}. % Comparing standard \textcolor{VIColor}{\textbf{\VI}} ($D= \KLD$) against \textcolor{GVIColor1}{\textbf{\GVI}} with $D = \RAD$ using posteriors with Gaussian likelihoods and mean-field Gaussian approximations. % \textbf{Left:} Changing $D$ improves marginal variances. % %zero-avoiding behaviour: Depicted are exact and approximate marginals. The exact posterior is correlated, causing \textcolor{VIColor}{\textbf{\VI}} to over-concentrate. \textcolor{GVIColor1}{\textbf{\GVI}} can avoid this. % \textbf{Right:} Changing $D$ provides prior robustness. % Depicted are approximate marginals for two different priors $\pi \in \{N(-30,2^2), N(-5,2^2)\}$. \textcolor{VIColor}{\textbf{\VI}} is sensitive to the badly specified prior. \textcolor{GVIColor1}{\textbf{\GVI}} can avoid this. }
\caption{ Comparing performance in \DGP{}s with $L$ layers for %\textcolor{FVIColor}{\textbf{\FVI}}, %\textcolor{GVIColor2}{\textbf{\GVI changing $\ell_n$ and $D$}}, \textcolor{GVIColor1}{\textbf{\DGP-\GVI}} with %alternative choices for $\ell_n(\*\theta, \*x) = \sum_{i=1}^n\Lg(\*\theta, x_i)$ and \textcolor{VIColor}{\textbf{\DGP-\VI}}. Benchmark performance is the \DGP with three layers as in \citep{DeepGPsVI}. \textbf{Top rows}: Negative test log likelihoods. \textbf{Bottom rows}: Test \RMSE. The lower the better. }
\caption{ Comparing performance in \DGP{}s with 3 layers for %\textcolor{FVIColor}{\textbf{\FVI}}, %\textcolor{GVIColor2}{\textbf{\GVI changing $\ell_n$ and $D$}}, \textcolor{GVIColor1}{\textbf{\DGP-\GVI}} with %alternative choices for $\ell_n(\*\theta, \*x) = \sum_{i=1}^n\Lg(\*\theta, x_i)$ and alternative uncertainty quantifiers $D$ against \textcolor{VIColor}{\textbf{\DGP-\VI}}. Benchmark performance is the \DGP with three layers as in \citep{DeepGPsVI}. \textbf{Top row}: Negative test log likelihoods. \textbf{Bottom row}: Test \RMSE. The lower the better. }
\caption{Profiles of (a) mean $C(z)$, and (b) r.m.s $\sigma_c(z)$ of concentration in the vertical direction. Open symbols represent the data from the traversing PID in a point source plume released at different source heights; $s_z/\delta$ = 0.004 (\textcolor{red}{$\ocircle$}); 0.044 (\textcolor{orange}{$\lozenge$}); 0.1 (\textcolor{green}{$\lhd$}) ; 0.25 (\textcolor{black}{$\Box$}); 0.33 (\textcolor{blue}{$\rhd$}). Solid symbols represent the measurement from the stationary PID.}
\caption{With \ags, we can compare classes throughout layers of a network. Here we compare two similar classes: \blackBear and \brownBear. From the intersection of their \ags, we see both classes share features related to \bearness, but diverge towards the end of the network using fur color and face color as discriminable features. This feature discrimination aligns with how humans might classify bears. }
\caption{A high-level illustration of how we take thousands of images for a given class, e.g., images from \textbf{\textit{white wolf}} class, compute their top activations and attributions, and combine them to form an \textcolor{main}{\textbf{attribution graph}} that shows how lower-level features (``legs'') contribute to higher-level ones (``white fur''), and ultimately the final outcome.}
\caption{Two examples of \textcolor{red}{drug}-\textcolor{darkgreen}{gene}-\textcolor{blue}{mutation} relations from a biomedical journal paper. The relations are expressed across multiple paragraphs, requiring document-level extraction.}
\caption{Multiscale representation learning for document-level $n$-ary relation extraction, an entity-centric approach that combines mention-level representations learned across text spans and subrelation hierarchy. (1) Entity mentions (e.g., \textcolor{red}{gefitinib, a drug}; \textcolor{darkgreen}{EGFR, a gene}; \textcolor{blue}{T790M, a variant}) are identified from text, and mentions that co-occur within a discourse unit (e.g., paragraph) are isolated. (2) Within each discourse unit, mention-level representations are computed for each tuple of entity mentions. These representations may correspond to the entire $n$-ary relation or subrelations over subsets of entities (\textcolor{darkmagenta}{drug-variant}, \textcolor{darkyellow}{drug-gene}, \textcolor{darkcyan}{gene-variant}). (3) At the document scale, mention-level representations for both the $n$-ary relation and its subrelations are combined into entity-level representations. (4) Entity-level representations are used to predict the relation. }
\caption{The proposed regularizer: the hidden vector in the decoder, $s_j$, transits through two paths: 1) a linear and a softmax layer that output a vector $v_j$ (vocab\_dim), which is used for predicting the target word as usual, and 2) a two-layer network (ReWE) that outputs a vector, $e_j$, of word embedding size (word\_emb\_dim). During training, $e_j$ is used in a regressive loss with the ground-truth embedding.}
\caption{BLEU scores of three models over the en-fr validation set for different $\lambda$ values: baseline (\textcolor{red}{\textbf{red}}), baseline + ReWE (MSE) (\textcolor{green}{\textbf{green}}), baseline + ReWE (CEL) (\textcolor{blue}{\textbf{blue}}). Each point in the graph is an average of 3 independently trained models.}
\caption{Plot of the values of various loss functions during training of our model over the en-fr training set: \textcolor{green}{\textbf{green}}: training loss (NLL + ($\lambda = 20$) ReWE (CEL); Eq.\ref{eq:combined_loss}); \textcolor{red}{\textbf{red}}: NLL loss; \textcolor{blue}{\textbf{blue}}: ReWE (CEL) loss; \textcolor{magenta}{\textbf{magenta}}: ReWE (CEL) loss scaled by $\lambda = 20$. Each point in the graph is an average value of the corresponding loss over 25,000 sentences.}
\caption{Plot of the values of various loss functions during training of our model over the en-fr training set: \textcolor{green}{\textbf{green}}: training loss (NLL + ($\lambda = 20$) ReWE (MSE); Eq.7); \textcolor{red}{\textbf{red}}: NLL loss; \textcolor{blue}{\textbf{blue}}: ReWE (MSE) loss; \textcolor{magenta}{\textbf{magenta}}: ReWE (MSE) loss scaled by $\lambda = 20$. Each point in the graph is an average value of the corresponding loss over 25,000 sentences.}
\caption{Reconstruction error of AE and MemAE on an abnormal frame of UCSD-Ped2. {MemAE can significantly highlight the abnormal parts (in \redtext{red} bounding box) in the scene.}}
\caption{ Synthetic multi-color (top) and bolometric (middle) light curves from our $5.4~M_\odot$ progenitor model with the hydrogen-rich envelope of $0.4~M_\odot$. The explosion energy and $^{56}$Ni mass of the model are $5\times 10^{50}~\mathrm{erg}$ and $0.003~M_\odot$, respectively. The open triangle shows the Gaia upper limit. The bottom panel shows the photospheric velocity evolution of the model, where the photosphere is defined as the radius with the Rosseland-mean optical depth of 2/3. \red{The half of the \Ha velocity (Fig.~\ref{fig:ha}) which approximately traces the photospheric velocity is also shown.} The time in the figure is from the observationally estimated explosion date and the explosion date of our synthetic model is 4~days before the estimated explosion date. }
\caption{An illustration of the standard \term evaluation procedure \citep[e.g.,][]{Jia2017AdversarialEF} and our proposed analysis method. {\color{blue} ``Original''} refers to a standard dataset (e.g., SQuAD) and {\color{red} ``\Term''} refers to the \term dataset (e.g., Adversarial SQuAD). Outcomes are discussed in Section~\ref{sec:method}. \label{fig:method_summary}}
\caption{ An example from the Adversarial SQuAD dataset, with the distractor sentence \textcolor{blue}{in blue}. Figure reproduced from \citet{Jia2017AdversarialEF}. }
\caption{Comparison with the state of the art on CityPersons~\cite{zhang2017citypersons}. Results tested on the original image size (1024x2048 pixels) are reported. {\color{red}{Red}} and {\color{green}{green}} indicate the best and second-best performance.}
\caption{\color{red}Evaluation of different layers in the encoder, which are implemented as multi-head self-attention with the EM routing based information aggregation. ``1'' denotes the bottom layer, and ``6'' the top layer.}
\caption{\label{fig:pipeline} Data collection pipelines. Our outlier detection method (\textcolor{CornflowerBlue}{blue box}) is incorporated into the uniqueness-driven data collection pipeline to guide crowd workers to write more diverse paraphrases. \textcolor{YellowGreen}{Green rounded boxes} are manual processes performed by crowd workers, \textcolor{BurntOrange}{orange boxes} with curved bases are data, and the \textcolor{CornflowerBlue}{blue rectangular box} is our outlier detection method. In (b), $r$ is the number of outliers detected from $n$ samples. }
\caption{\label{fig:svp_slots} Example annotated sentence for the slot-filling task. The slot names are (in order of appearance) \sethlcolor{yellow}\hl{\textit{metric}}, \sethlcolor{orange}\hl{\textit{amount}}, \sethlcolor{green}\hl{\textit{currency}}, and \sethlcolor{pink}\hl{\textit{date}}.}
\caption{\textbf{Median change in output $\boldsymbol{\Delta\hat{y}^{med}}$} (x) densities in relation to the \textbf{max attention ($\boldsymbol{\max{\hat{\alpha}}}$)} (y) obtained by randomly permuting instance attention weights. Colors denote classes: negative (\textcolor[HTML]{7570b3}{$\blacksquare$}) and positive (\textcolor[HTML]{d95f02}{$\blacksquare$}); phenotyping (e) is not binary. {\bf Top row shows results for BiLSTM encoders; middle for CNNs; bottom for Embedding Projection.} }
\caption{ \small We consider a transfer learning scenario in reinforcement learning that considers transfer in both task and environment. Three different settings are presented here (see text for details). The \textcolor{red}{\textbf{red dots}} denote \textsc{seen} combinations, \textcolor{gray}{\textbf{gray dots}} denote \textsc{unseen} combinations, and arrows $\rightarrow$ denote transfer directions.}
\caption{\small Transfer results of settings 2 and 3. AvgSRs are marked in the grid (see Suppl.\ Materials for more visually discernible plots). The tasks and environments in the \textcolor{blue}{\textbf{purple cells}} are from the unseen $Q$ set and the \textcolor{red}{\textbf{red cells}} correspond to the rest. Darker color means better performance. The results show that cross-task transfer is easier than cross-environment transfer.}
\caption{Quantitative comparison of SRCNN-CAB \cite{CAB}, SRMDNF \cite{SRMD} and the proposed SFTMD. The comparison is conducted using three different isotropic Gaussian kernels on Set5, Set14 and BSD100 dataset. The best two results are highlighted in \textcolor[rgb]{1.00,0.00,0.00}{red} and \textcolor[rgb]{0.00,0.00,1.00}{blue} colors.}
\caption{Quantitative comparison of the SOTA SR methods and IKC method. The best two results are highlighted in \textcolor[rgb]{1.00,0.00,0.00}{red} and \textcolor[rgb]{0.00,0.00,1.00}{blue} colors, respectively. Note that the methods marked with ``*'' are not designed for blind SR, so the comparison with these methods is unfair.}
\caption{Overall results in error rate (\%) on CIFAR and SVHN datasets. A suffix + indicates standard data augmentation. We only list results discussed and compared in our Experiments Section~\ref{sec:experiment} for succinctness. The overall best results are {\color{blue} \textbf{blue}}.}
\caption{Top 8 image retrieval examples given a query sketch. All the examples correspond to a zero-shot setting, \emph{i.e.} no examples have been seen in training. The first row compares the CVAE~\cite{yelamarthi2018zero} method against our pipeline. Note that in some retrieval cases, for instance, \texttt{door} is confused with \texttt{window} images, a confusion that can occur even for humans. \textcolor{green}{Green} and \textcolor{red}{red} stand for correct and incorrect retrievals, respectively. (Better viewed in pdf)}
\caption{Contact map construction for the `flashlight' object from ContactDB human demonstration. Points are randomly sampled on the object surface. \textcolor{OliveGreen}{green}: attractive, \textcolor{red}{red}: repulsive.}
\caption{The scaling exponent, $\mu$, of conventional polar code in BEC. The marker {\color{green} $\blacklozenge$} denotes this contribution's newly derived analytical approximation, being exactly $\varphi$ above the optimal value ({\color{blue} $\circ$}). Also marked for comparison, previous heuristic numerical computation ({\color{red}$\square$},~\cite{Hassani2014}) and known bounds ({\color{black}$\vartriangleleft,\blacktriangleleft$},\cite{Hassani2014} and {\color{black}$\vartriangleright$},\cite{Mondelli2016}).}
\caption{ Examples of the selected event proposals ({\color{red}\textbf{red}}) out of the candidates (\textbf{black}) in the proposed event sequence generation network and the ground-truth events ({\color{blue}\textbf{blue}}). }
\caption{An isosurface of difference electron density of graphene is shown by the Cube 3D Viewer. The drop-down menu is \textcolor{red}{i}n the top-right corner.}
\caption{Imaging of transient vibrations in an AFM cantilever excited with a mechanical broadband pulse with 100 ns time resolution. (a) Time trace of the vertical position $h(t)$ extracted at a pixel near the cantilever tip. Zooms show the time trace immediately after mechanical excitation (b), peak oscillation amplitude (c) and after ring down (d). (b-d) Evolution of the surface profile $h(\vect{r},t)$ of the cantilever (see also \textcolor{urlblue}{Visualization 1}).}
\caption{Monitoring vibration excitation and decay with periodic chirped excitation. A chirped excitation waveform $U_\text{ex}$ linearly sweeping from 20 kHz to 1 MHz in 1 ms was applied. (a) Time trace of the vertical position $h(t)$ extracted at a pixel near the cantilever tip. (b) Spatially integrated spectrogram $X_\text{int} (\tau,\omega)$ of vibrations excited in the cantilever as obtained by short-time Fourier transforms. (c) Decay time $\tau$ and quality factor $Q$ of the vibration modes extracted from (b). \textcolor{urlblue}{Visualization 2} shows a movie of the cantilever vibrations.}
\caption{One round of the non-contextual wiring where context $\boldsymbol{\beta}=\left\{\beta_1, \beta_2\right\}$ is chosen for box $\boldsymbol{B}_{\mathrm{PRE}}$, leading to the sequences of buttons and lights $\left(\beta_1,r_1,\gamma_1,s_1,\delta_1,t_1\right)$ and $\left(\beta_2,r_2,\gamma_2,s_2,\delta_2,t_2\right)$. The behavior of box $\boldsymbol{B}_{\mathrm{POST}}$ can depend on box $\boldsymbol{B}_{\mathrm{PRE}}$, but with the restriction that the probability of output $t_1$ for button $\delta_1$ can depend only on $\beta_1$ and $r_1$ and the probability of output $t_2$ for button $\delta_2$ can depend only on $\beta_2$ and $r_2$, and hence $p_{\boldsymbol{\delta}\left(\boldsymbol{s}\right)}\left(\boldsymbol{t}\vert \boldsymbol{\beta},\boldsymbol{r}\right)=p_{\delta_1}\left(t_1 \vert \beta_1, r_1\right)\times p_{\delta_2}\left(t_2 \vert \beta_2, r_2\right)$. If we consider the situation in which button $\beta_2$ is pressed only after outcome $t_1$ is recorded, this restriction implies that the behavior of the post-processing box depends only on the sequence $\left(\beta_2,r_2,\gamma_2,s_2,\delta_2,t_2\right)$ and the wiring has no memory of the previous round $\left(\beta_1,r_1,\gamma_1,s_1,\delta_1,t_1\right)$. We want the set of free operations to be convex and hence we also allow for convex combinations of such instances, given by the sum over the variable $\phi$ in equation \eqref{eq:post_rest}.} \label{fig:dep} \end{minipage} \end{figure} This composition defines a final box $\mathcal{W}_\mathrm{NC}\left(\boldsymbol{B}\right)$ with input buttons $\mathcal{M}_{\mathrm{PRE}}$, compatibility graph $\mathcal{C}_{\mathrm{PRE}}$, and output lights $\mathcal{O}_{\mathrm{POST}}$. 
The behavior of the final box will be given by \be p_{\boldsymbol{\beta}}\left(\boldsymbol{t}\right)= \sum_{\boldsymbol{s},\boldsymbol{r}} p_{\boldsymbol{\delta}(\boldsymbol{s})}(\boldsymbol{t})p_{\boldsymbol{\gamma}(\boldsymbol{r})}(\boldsymbol{s})p_{\boldsymbol{\beta}}(\boldsymbol{r}) \ee where $\boldsymbol{\beta} \in \mathcal{C}_{\mathrm{PRE}}$, $\boldsymbol{r}$ runs over the possible output lights associated with context $\boldsymbol{\beta}$ in box $\boldsymbol{B}_{\mathrm{PRE}}$, $\boldsymbol{\gamma}(\boldsymbol{r}) \in \mathcal{C}$ is the context of box $\boldsymbol{B}$ corresponding to $\boldsymbol{r}$, $\boldsymbol{s}$ runs over the output lights associated with context $\boldsymbol{\gamma}(\boldsymbol{r})$, $\boldsymbol{\delta}(\boldsymbol{s})\in \mathcal{C}_{\mathrm{POST}}$ is the context of box $\boldsymbol{B}_{\mathrm{POST}}$ corresponding to $\boldsymbol{s}$, and $\boldsymbol{t}$ is one of the possible output lights associated with context $\boldsymbol{\delta}(\boldsymbol{s})$. The set of all non-contextual wirings will be denoted by $\mathsf{NCW}$. Self-consistency of the theory requires that non-contextual wirings satisfy the following property, proven here for contexts with two buttons. The general case is completely analogous and a general proof can be found in the supplemental material of ref. \cite{ACTA18}. \begin{lem}[Non-disturbance preservation] \label{Lem:ND} The class of boxes $\mathsf{ND}$ is closed under all wirings in $\mathsf{NCW}$. \end{lem} \begin{proof} Suppose that $\boldsymbol{\beta}=\left\{\beta_1,\beta_2\right\}$. Since when one button is pressed exactly one light turns on, we have $\boldsymbol{r}=\left\{r_1,r_2\right\}$, $\boldsymbol{\gamma}\left(\boldsymbol{r}\right)=\left\{\gamma_1\left(r_1\right),\gamma_2\left(r_2\right)\right\}$, $\boldsymbol{s}=\left\{s_1,s_2\right\}$, $\boldsymbol{\delta}\left(\boldsymbol{s}\right)=\left\{\delta_1\left(s_1\right),\delta_2\left(s_2\right)\right\}$, as in fig. \ref{fig:dep}. 
We have that \begin{eqnarray} \sum_{t_2}p_{\boldsymbol{\beta}}\left(\boldsymbol{t}\right)&=&\sum_{t_2} \sum_{\boldsymbol{r},\boldsymbol{s}} p_{\boldsymbol{\delta}\left(\boldsymbol{s}\right)}\left(\boldsymbol{t}\right)p_{\boldsymbol{\gamma}\left(\boldsymbol{r}\right)}\left(\boldsymbol{s}\right)p_{\boldsymbol{\beta}}\left(\boldsymbol{r}\right) \\ &=& \sum_{t_2} \sum_{\boldsymbol{r},\boldsymbol{s}} \sum_{\phi}p\left(\phi\right)p_{\delta_1\left(s_1\right)}\left(t_1\vert \beta_1, r_1, \phi\right)p_{\delta_2\left(s_2\right)}\left(t_2\vert \beta_2, r_2, \phi\right)p_{\boldsymbol{\gamma}\left(\boldsymbol{r}\right)}\left(\boldsymbol{s}\right)p_{\boldsymbol{\beta}}\left(\boldsymbol{r}\right) \\ &=&\sum_{\boldsymbol{r},\boldsymbol{s}} \sum_{\phi}p\left(\phi\right)p_{\delta_1\left(s_1\right)}\left(t_1\vert \beta_1, r_1, \phi\right)\left[\sum_{t_2} p_{\delta_2\left(s_2\right)}\left(t_2\vert \beta_2, r_2, \phi\right)\right]p_{\boldsymbol{\gamma}\left(\boldsymbol{r}\right)}\left(\boldsymbol{s}\right)p_{\boldsymbol{\beta}}\left(\boldsymbol{r}\right) \\ &=&\sum_{r_1,r_2,s_1,s_2} \sum_{\phi}p\left(\phi\right)p_{\delta_1\left(s_1\right)}\left(t_1\vert \beta_1, r_1, \phi\right)p_{\boldsymbol{\gamma}\left(\boldsymbol{r}\right)}\left(\boldsymbol{s}\right)p_{\boldsymbol{\beta}}\left(\boldsymbol{r}\right) \\ &=&\sum_{r_1,s_1} \sum_{\phi}p\left(\phi\right)p_{\delta_1\left(s_1\right)}\left(t_1\vert \beta_1, r_1, \phi\right)\sum_{r_2,s_2} p_{\boldsymbol{\gamma}\left(\boldsymbol{r}\right)}\left(\boldsymbol{s}\right)p_{\boldsymbol{\beta}}\left(\boldsymbol{r}\right) \\ &=&\sum_{r_1,s_1} \sum_{\phi}p\left(\phi\right)p_{\delta_1\left(s_1\right)}\left(t_1\vert \beta_1, r_1, \phi\right)p_{\gamma_1(r_1)}(s_1)\sum_{r_2}p_{\boldsymbol{\beta}}\left(\boldsymbol{r}\right) \\ &=&\sum_{r_1,s_1} \sum_{\phi}p\left(\phi\right)p_{\delta_1\left(s_1\right)}\left(t_1\vert \beta_1, r_1, \phi\right)p_{\gamma_1(r_1)}(s_1)p_{\beta_1}\left(r_1\right)\\&=&p_{\beta_1}\left(t_1\right). 
\end{eqnarray} \end{proof} In addition, to give valid free operations, $\mathsf{NCW}$ must fulfill the following requirement, proven here for contexts with two buttons \cite{ACTA18}. \begin{thm}[Non-contextuality preservation] \label{teoncpreservation} The class of boxes $\mathsf{NC}$ is closed under all wirings in $\mathsf{NCW}$. \end{thm} \begin{proof} Non-contextuality of $\boldsymbol{B}_{\mathrm{PRE}}$, $\boldsymbol{B}$ and $\boldsymbol{B}_{\mathrm{POST}}$ and condition \eqref{eq:post_rest} imply that \begin{align} p_{\boldsymbol{\beta}}\left(\boldsymbol{r}\right)&=\sum_{\xi}p\left(\xi\right)p_{\beta_1}\left(r_1\vert \xi\right)p_{\beta_2}\left(r_2\vert \xi\right),\\ p_{\boldsymbol{\gamma}\left(\boldsymbol{r}\right)}\left(\boldsymbol{s}\right)&=\sum_{\psi}p\left(\psi\right)p_{\gamma_1\left(r_1\right)}\left(s_1\vert \psi\right)p_{\gamma_2\left(r_2\right)}\left(s_2\vert \psi\right),\\ p_{\boldsymbol{\delta}\left(\boldsymbol{s}\right)}\left(\boldsymbol{t}\right)&=\sum_{\phi}p\left(\phi\right)p_{\delta_1\left(s_1\right)}\left(t_1\vert \beta_1,r_1, \phi\right)p_{\delta_2\left(s_2\right)}\left(t_2\vert \beta_2,r_2,\phi\right). 
\end{align} This in turn implies that \begin{align} p_{\boldsymbol{\beta}}\left(\boldsymbol{t}\right)&= \sum_{\boldsymbol{r},\boldsymbol{s}} p_{\boldsymbol{\delta}\left(\boldsymbol{s}\right)}\left(\boldsymbol{t}\right)p_{\boldsymbol{\gamma}\left(\boldsymbol{r}\right)}\left(\boldsymbol{s}\right)p_{\boldsymbol{\beta}}\left(\boldsymbol{r}\right) \\ &=\sum_{\xi,\psi,\phi}\sum_{r_1,r_2,s_1,s_2}p\left(\xi\right)p\left(\psi\right)p\left(\phi\right)p_{\delta_1\left(s_1\right)}\left(t_1\vert \beta_1,r_1, \phi\right)p_{\delta_2\left(s_2\right)}\left(t_2\vert \beta_2,r_2, \phi\right)\nonumber\\ & \hspace{9em} \times p_{\gamma_1\left(r_1\right)}\left(s_1\vert \psi\right)p_{\gamma_2\left(r_2\right)}\left(s_2\vert \psi\right)p_{\beta_1}\left(r_1\vert \xi\right)p_{\beta_2}\left(r_2\vert \xi\right)\\&= \sum_{\boldsymbol{\psi}} p\left(\boldsymbol{\psi}\right)\left[ \sum_{r_1,s_1}p_{\delta_1\left(s_1\right)}\left(t_1\vert \beta_1,r_1, \phi\right)p_{\gamma_1\left(r_1\right)}\left(s_1\vert \psi\right)p_{\beta_1}\left(r_1\vert \xi\right)\right]\nonumber\\ & \hspace{9em} \times\left[\sum_{r_2,s_2}p_{\delta_2\left(s_2\right)}\left(t_2\vert \beta_2,r_2,\phi\right)p_{\gamma_2\left(r_2\right)}\left(s_2\vert \psi\right)p_{\beta_2}\left(r_2\vert \xi\right)\right]\\&=\sum_{\boldsymbol{\psi}} p\left(\boldsymbol{\psi}\right)p_{\beta_1}\left(t_1\vert\boldsymbol{\psi}\right)p_{\beta_2}\left(t_2\vert\boldsymbol{\psi}\right), \end{align} where $\boldsymbol{\psi}=\left(\xi,\psi,\phi\right)$ and $p\left(\boldsymbol{\psi}\right)=p\left(\xi\right)p\left(\psi\right)p\left(\phi\right)$. \end{proof} This proof is connected to the fact that the composition of any three independent non-contextual boxes yields a final box that is also non-contextual (with three independent non-contextual hidden variables). $\mathsf{NCW}$ is however more powerful than such compositions because the pre- and post-processing boxes here are not independent. Still, the restriction of eq.~\eqref{eq:post_rest} enables non-contextuality preservation (see \cite{ACTA18}). 
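The last step of the proof above identifies each bracketed sum with a single-button response function of the final box. As a reading aid, the identification can be spelled out as follows; the notation is ours, assembled from the factorized expressions in the proof, and is not quoted from ref. \cite{ACTA18}. For $i\in\left\{1,2\right\}$ and $\boldsymbol{\psi}=\left(\xi,\psi,\phi\right)$,
\begin{align*}
p_{\beta_i}\left(t_i\vert\boldsymbol{\psi}\right) &= \sum_{r_i,s_i} p_{\delta_i\left(s_i\right)}\left(t_i\vert \beta_i,r_i,\phi\right)\, p_{\gamma_i\left(r_i\right)}\left(s_i\vert \psi\right)\, p_{\beta_i}\left(r_i\vert \xi\right),\\
\sum_{t_i} p_{\beta_i}\left(t_i\vert\boldsymbol{\psi}\right) &= \sum_{r_i} p_{\beta_i}\left(r_i\vert \xi\right) \sum_{s_i} p_{\gamma_i\left(r_i\right)}\left(s_i\vert \psi\right) \sum_{t_i} p_{\delta_i\left(s_i\right)}\left(t_i\vert \beta_i,r_i,\phi\right) = 1.
\end{align*}
The second line evaluates the sums from the innermost outwards, each of which equals one by normalization. Each $p_{\beta_i}\left(t_i\vert\boldsymbol{\psi}\right)$ is therefore a normalized distribution that depends only on the button $\beta_i$ and on the hidden variable $\boldsymbol{\psi}$, which is the form a non-contextual model requires.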
For space-like separated measurements, $\mathsf{NCW}$ reduces to \emph{local operations assisted by shared randomness}, the canonical free operations of Bell non-locality \cite{GWAN12,Vicente14,GA17}. This also shows that the set of non-contextual wirings is not the largest set of free operations for contextuality, since in the particular case of Bell scenarios it is known that local operations assisted by shared randomness do not form the largest set of free operations of non-locality. However, we still lack an explicit parametrization for a larger set of free operations for contextuality, and we restrict throughout to the class of non-contextual wirings, unless stated otherwise, since this is the class for which we have a convenient parametrization with a clear physical interpretation. \subsubsection*{Product and controlled choice of boxes} We consider now two different ways of combining independent boxes $\boldsymbol{B}_1=\left(\mathcal{M}_1, \mathcal{C}_1, \mathcal{O}_1\right)$ and $\boldsymbol{B}_2=\left(\mathcal{M}_2, \mathcal{C}_2, \mathcal{O}_2\right)$. First we define the box $\boldsymbol{B}_1 \otimes \boldsymbol{B}_2$, called the \emph{product} of $\boldsymbol{B}_1$ and $\boldsymbol{B}_2$, as the box such that each of its contexts is given by $\boldsymbol{\gamma}=\boldsymbol{\gamma}_1 \cup \boldsymbol{\gamma}_2$, with $\boldsymbol{\gamma}_i\in \mathcal{C}_i$, that is, each context in the final box $\boldsymbol{B}_1 \otimes \boldsymbol{B}_2$ consists of a choice of context for box $\boldsymbol{B}_1$ \emph{and} a choice of context for $\boldsymbol{B}_2$. The behavior of this box is \be p_{\boldsymbol{\gamma}_1\cup \boldsymbol{\gamma}_2}\left(\boldsymbol{s}_1,\boldsymbol{s}_2\right)=p_{\boldsymbol{\gamma}_1}\left(\boldsymbol{s}_1\right)p_{\boldsymbol{\gamma}_2}\left(\boldsymbol{s}_2\right).\ee \begin{figure}[h!] 
\centering \includegraphics[scale=0.7]{fig_and.pdf} \caption{The box $\boldsymbol{B}_1 \otimes \boldsymbol{B}_2$ for which each context consists of a choice of context for box $\boldsymbol{B}_1$ \emph{and} a choice of context for $\boldsymbol{B}_2$.} \label{fig:and} \end{figure} We can also define the box $\boldsymbol{B}_1 \& \boldsymbol{B}_2$, called the \emph{controlled choice} of $\boldsymbol{B}_1$ and $\boldsymbol{B}_2$, as the box such that $\mathcal{C}=\mathcal{C}_1 \cup \mathcal{C}_2$, that is, each context in the final box $\boldsymbol{B}_1 \& \boldsymbol{B}_2$ consists of a choice of context for box $\boldsymbol{B}_1$ \emph{or} a choice of context for $\boldsymbol{B}_2$. The behavior of this box is a juxtaposition of a behavior for box $\boldsymbol{B}_1$ and a behavior for $\boldsymbol{B}_2$. \begin{figure}[h!] \centering \begin{subfigure}[b]{0.45\textwidth} \includegraphics[scale=0.7]{fig_or_a.pdf} \caption{} \end{subfigure} ~ \qquad \begin{subfigure}[b]{0.45\textwidth} \includegraphics[scale=0.7]{fig_or_b.pdf} \caption{} \end{subfigure} \caption{The box $\boldsymbol{B}_1 \& \boldsymbol{B}_2$ for which each context consists of a choice of context for box $\boldsymbol{B}_1$ \emph{or} a choice of context for $\boldsymbol{B}_2$.} \label{fig:or} \end{figure} \section{Quantifiers} \label{sec:quant} The essential requirement for a function to be a valid measure of contextuality is that it is monotone (i.e. non-increasing) under the set of non-contextual wirings. \begin{dfn} A function $Q: \mathsf{ND}\left(\Upsilon\right)\rightarrow \mathds{R}$ is a \emph{contextuality monotone} for the resource theory of contextuality defined by non-contextual wirings if \be Q\left[\mathcal{W}\left(\boldsymbol{B}\right)\right] \leq Q\left(\boldsymbol{B}\right)\ee for every $\mathcal{W} \in \mathsf{NCW}$. 
\end{dfn} Besides monotonicity under free operations, other properties of a monotone $Q$ are also desirable \cite{HGJKL15, ABM17}: \begin{enumerate} \item \emph{Faithfulness:} For all $\boldsymbol{B}\in \mathsf{NC}(\Upsilon)$, $Q\left(\boldsymbol{B}\right)=0$. \item \emph{Preservation under reversible operations:} If $\mathcal{T} \in \mathsf{NCW}$ is reversible, then \be Q\left(\mathcal{T}\left(\boldsymbol{B}\right)\right) = Q\left(\boldsymbol{B}\right).\ee \item \emph{Additivity:} Given two independent boxes $\boldsymbol{B}_1$ and $\boldsymbol{B}_2$ we require: \be Q\left(\boldsymbol{B}_1 \otimes \boldsymbol{B}_2\right)\leq Q\left(\boldsymbol{B}_1\right)+Q\left(\boldsymbol{B}_2\right). \ee \be Q\left(\boldsymbol{B}_1 \&\boldsymbol{B}_2\right)\leq Q\left(\boldsymbol{B}_1\right)+Q\left(\boldsymbol{B}_2\right). \ee \item \emph{Convexity:} If a behavior can be written as $\boldsymbol{B}=\sum_i \pi_i \boldsymbol{B}^i$, where $\pi_i \in [0,1]$ and each $\boldsymbol{B}^i$ is a behavior for the same scenario, then \be Q\left(\boldsymbol{B}\right)\leq \sum_i \pi_i Q\left(\boldsymbol{B}^i\right).\ee \item \emph{Continuity:} $Q\left(\boldsymbol{B}\right)$ should be a continuous function of $\boldsymbol{B}$. \end{enumerate} In what follows we exhibit a number of monotones for different resource theories of contextuality and list which of the properties above they satisfy. \subsection{Entropic Contextuality Quantifiers} \subsubsection{Relative entropy of contextuality} In ref. \cite{GHHHHJKW14}, the authors introduce two measures of contextuality based directly on the notion of relative entropy distance, also called the Kullback-Leibler divergence. Given two probability distributions $p$ and $q$ on a sample space $\Omega$, the Kullback-Leibler divergence between $p$ and $q$ \be D_{\mathrm{KL}}(p\|q) = \sum_{i\in \Omega} p(i) \, \log\frac{p(i)}{q(i)} \ee is a measure of the difference between the two probability distributions. 
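As a small numerical illustration of this definition (our own example; logarithms are taken to base 2, which is an assumption since the base is not fixed in the text), take $\Omega=\left\{0,1\right\}$, $p=\left(1/2,1/2\right)$ and $q=\left(3/4,1/4\right)$. Then
\begin{align*}
D_{\mathrm{KL}}(p\|q) &= \tfrac{1}{2}\log\frac{1/2}{3/4}+\tfrac{1}{2}\log\frac{1/2}{1/4} = \tfrac{1}{2}\log\tfrac{4}{3} \approx 0.208,\\
D_{\mathrm{KL}}(q\|p) &= \tfrac{3}{4}\log\frac{3/4}{1/2}+\tfrac{1}{4}\log\frac{1/4}{1/2} = \tfrac{3}{4}\log 3 - \log 2 \approx 0.189.
\end{align*}
The two values differ, so $D_{\mathrm{KL}}$ is not symmetric and is not a distance in the metric sense. It is nevertheless non-negative and vanishes only when $p=q$, which is what makes it usable for quantifying how far a behavior is from the non-contextual set.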
\begin{dfn} The \emph{Relative Entropy of Contextuality} of a behavior $\boldsymbol{B}$ is defined as \begin{equation}\label{HorDist2} E_{max}\left(\boldsymbol{B} \right) = \min _{\boldsymbol{B}^{NC} \in \mathsf{NC} }\ \ \max _{\pi}\ \ \sum _{\boldsymbol{\gamma}\in \mathcal{C}}\ \pi\left(\boldsymbol{\gamma}\right) D_{\mathrm{KL}}\left(p_{\boldsymbol{\gamma}} \middle\| p^{NC}_{\boldsymbol{\gamma}} \right), \end{equation} where the minimum is taken over all non-contextual behaviors $\boldsymbol{B}^{NC}=\left\{p^{NC}_{\boldsymbol{\gamma}} \right\}$ and the maximum is taken over all probability distributions $\pi$ defined on the set of contexts $\mathcal{C}$. The \emph{Uniform Relative Entropy of Contextuality} of $\boldsymbol{B}$ is defined as \begin{equation}\label{HorDist3} E_{u}\left(\boldsymbol{B} \right)= \frac{1}{N} \min _{\boldsymbol{B}^{NC}\in \mathsf{NC} } \ \sum _{\boldsymbol{\gamma}\in \mathcal{C}}\ D_{\mathrm{KL}}\left(p_{\boldsymbol{\gamma}} \middle\| p^{NC}_{\boldsymbol{\gamma}} \right), \end{equation} where $N=\left|\mathcal{C}\right|$ is the number of contexts in $\mathcal{C}$ and, once more, the minimum is taken over all non-contextual behaviors $\boldsymbol{B}^{NC}=\left\{p^{NC}_{\boldsymbol{\gamma}} \right\}$. \end{dfn} In reference \cite{ACTA18} it is shown that $E_{max}$ is a monotone under \emph{non-contextual wirings}. The quantity $E_{u}$, however, is not a monotone under the complete class of non-contextual wirings, as shown in Ref. \cite{GA17} for the special class of Bell scenarios. Nonetheless, it is a monotone under a broad class of such operations. More specifically, it is monotone under post-processing operations and under a subclass of pre-processing operations (see ref. \cite{AT17}). 
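The two quantifiers are related by a simple inequality. The following short check uses only the definitions in eqs. \eqref{HorDist2} and \eqref{HorDist3}; it is our own remark and is not quoted from the cited references. For a fixed non-contextual behavior $\boldsymbol{B}^{NC}$, write $d_{\boldsymbol{\gamma}}=D_{\mathrm{KL}}\left(p_{\boldsymbol{\gamma}} \middle\| p^{NC}_{\boldsymbol{\gamma}} \right)$. The objective in eq. \eqref{HorDist2} is linear in $\pi$, so its maximum is attained at a distribution concentrated on a single context, and a maximum is never smaller than an average:
\begin{align*}
\max_{\pi} \sum_{\boldsymbol{\gamma}\in \mathcal{C}} \pi\left(\boldsymbol{\gamma}\right) d_{\boldsymbol{\gamma}} = \max_{\boldsymbol{\gamma}\in \mathcal{C}} d_{\boldsymbol{\gamma}} \geq \frac{1}{N}\sum_{\boldsymbol{\gamma}\in \mathcal{C}} d_{\boldsymbol{\gamma}}.
\end{align*}
Minimizing both sides over $\boldsymbol{B}^{NC}\in \mathsf{NC}$ preserves the inequality and gives $E_{max}\left(\boldsymbol{B}\right)\geq E_{u}\left(\boldsymbol{B}\right)$. In words, $E_{max}$ is governed by the worst context, while $E_u$ averages over all contexts with equal weight.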
\begin{thm} \label{teo:prop_entropic} The following properties are valid for the contextuality quantifiers based on relative entropy: \begin{enumerate} \item $E_{max}$ is a contextuality monotone for the resource theory of contextuality defined by non-contextual wirings; \item $E_u$ is a contextuality monotone for the resource theory of contextuality defined by post-processing operations and a subclass of pre-processing operations; \item $E_{max}$ and $E_u$ are faithful, additive, convex, continuous, and preserved under relabellings of inputs and outputs. \end{enumerate} \end{thm} The proof of this result can be found in refs. \cite{GHHHHJKW14,HGJKL15,AT17}. \subsection{Geometric Contextuality Quantifiers} \label{sectiongeometric} We now introduce contextuality monotones based on the $\ell_1$ distance, in contrast with the previously defined quantifiers, which are based on entropic distances. \begin{dfn} The $\ell_1$-\emph{max contextuality distance} of a behavior $\boldsymbol{B}$ is defined as \be \mathcal{D}_{max}\left(\boldsymbol{B}\right)= \min _{\boldsymbol{B}^{NC}\in \mathsf{NC} } \max_{\pi} \sum _{\boldsymbol{\gamma}\in \mathcal{C}}\ \pi\left(\boldsymbol{\gamma}\right)\ \sum_{\boldsymbol{s}}\left|p_{\boldsymbol{\gamma}}\left(\boldsymbol{s}\right)-p^{NC}_{\boldsymbol{\gamma}}\left(\boldsymbol{s}\right)\right|, \label{eqdefdist3}\ee where the minimum is taken over all non-contextual behaviors $\boldsymbol{B}^{NC}=\left\{ p^{NC}_{\boldsymbol{\gamma}} \right\}$ and the maximum is taken over all probability distributions $\pi$ defined over the set of contexts $\mathcal{C}$. 
The $\ell_1$-\emph{uniform contextuality distance} of a behavior $\boldsymbol{B}$ is defined as \be \mathcal{D}_u\left(\boldsymbol{B}\right)=\frac{1}{N}\min_{\boldsymbol{B}^{NC}\in \mathsf{NC} } \sum_{\boldsymbol{\gamma}\in \mathcal{C}} \sum_{\boldsymbol{s}}\left|p_{\boldsymbol{\gamma}}\left(\boldsymbol{s}\right)-p^{NC}_{\boldsymbol{\gamma}}\left(\boldsymbol{s}\right) \right|, \label{eqdefdist2}\ee where $N=\left|\mathcal{C}\right|$ is the number of contexts in $\mathcal{C}$. \end{dfn} For a detailed discussion of these contextuality quantifiers see ref. \cite{AT17} and, for the special class of Bell scenarios, ref. \cite{BAC18}. \begin{thm} \label{teo_dist} The following properties are satisfied: \begin{enumerate} \item $\mathcal{D}_{max}$ is a contextuality monotone for the resource theory of contextuality defined by the non-contextual wiring operations; \item $\mathcal{D}_u$ is a contextuality monotone for the resource theory of contextuality defined by post-processing operations and a subclass of pre-processing operations; \item $\mathcal{D}_{u}$ and $\mathcal{D}_{max}$ are faithful, additive, convex, continuous, and preserved under relabellings of inputs and outputs; \item $\mathcal{D}_u$ can be computed using linear programming. \end{enumerate} \end{thm} This result is proven in refs. \cite{AT17,BAC18}. It shows that while $\mathcal{D}_{max}$ is a proper contextuality monotone under the entire class of non-contextual wirings, $\mathcal{D}_u$ is more suitable when the set of allowed free operations preserves the scenario under consideration. Other distances defined in the set $\mathsf{ND}$ can also be used in place of the $\ell_1$ distance. The above results are also valid for any $\ell_p$ distance. \subsection{Contextual Fraction} \label{subsec:cf} A contextuality quantifier based on the intuitive notion of what \emph{fraction} of a given behavior admits a non-contextual description was introduced in refs. \cite{AB11, ADLPBC12}. 
Several properties of this quantifier were further discussed in Ref. \cite{ABM17}. \begin{dfn} The \emph{contextual fraction} of a behavior $\boldsymbol{B}$ is defined as \be \label{eq:cont_frac} \mathcal{CF}\left(\boldsymbol{B}\right)= \min \left\{\lambda \left|\boldsymbol{B}= \lambda \boldsymbol{B}' + \left(1-\lambda\right)\boldsymbol{B}^{NC}\right.\right\}, \ee where the minimum is taken over all decompositions of $\boldsymbol{B}$ as a convex sum of a non-contextual behavior $\boldsymbol{B}^{NC}$ and an arbitrary behavior $\boldsymbol{B}'$. \end{dfn} \begin{thm} The contextual fraction is a monotone under all linear operations that preserve the non-contextual set $\mathsf{NC}$. \end{thm} \begin{proof}Let $\mathcal{T}$ be a linear operation over the set of behaviors such that \be \mathcal{T}\left(\mathsf{NC}\right)\subset \mathsf{NC}. \ee Given a behavior $\boldsymbol{B}$, let $\boldsymbol{B}= \lambda \boldsymbol{B}' + \left(1-\lambda\right)\boldsymbol{B}^{NC}$ be the decomposition of $\boldsymbol{B}$ achieving the minimum in eq. \eqref{eq:cont_frac}, that is, $\mathcal{CF}\left(\boldsymbol{B}\right)=\lambda$. Then \begin{eqnarray} \mathcal{T}\left(\boldsymbol{B}\right)&=&\mathcal{T}\left(\lambda \boldsymbol{B}' + \left(1-\lambda\right)\boldsymbol{B}^{NC}\right)\\&=&\lambda\mathcal{T}\left(\boldsymbol{B}' \right)+ \left(1-\lambda\right)\mathcal{T}\left(\boldsymbol{B}^{NC}\right). \end{eqnarray} Since $\mathcal{T}\left(\boldsymbol{B}^{NC}\right)$ is a non-contextual behavior, we conclude that \be \mathcal{CF}\left(\mathcal{T}\left(\boldsymbol{B}\right)\right) \leq \lambda= \mathcal{CF}\left(\boldsymbol{B}\right). 
\ee \end{proof} \begin{prop} \label{teo:cf} The contextual fraction satisfies the following properties: \begin{enumerate} \item The contextual fraction is faithful, convex and continuous; \item $\mathcal{CF}\left(\boldsymbol{B}_1 \&\boldsymbol{B}_2\right)\leq \max_i \mathcal{CF}\left(\boldsymbol{B}_i\right)$; \item $\mathcal{CF}\left(\boldsymbol{B}_1 \otimes \boldsymbol{B}_2\right) \leq \mathcal{CF}\left(\boldsymbol{B}_1\right) + \mathcal{CF}\left(\boldsymbol{B}_2\right) - \mathcal{CF}\left(\boldsymbol{B}_1\right)\mathcal{CF}\left(\boldsymbol{B}_2\right)$; \item The contextual fraction can be calculated via linear programming. \end{enumerate} \end{prop} The proof of these results can be found in Ref. \cite{ABM17}. \section{Contextuality as a resource} \label{sec:app} In this contribution we have defined the set of free objects using an abstract mathematical characterization of noncontextuality, but a resource theory of contextuality will exhibit its true power when connected to operational applications of this phenomenon. Contextuality has been identified as a possible resource for quantum advantages in different schemes, and in this section we review some of the recent results. \subsubsection*{Contextuality and random number generation} The generation of genuine randomness is still a challenging task, as true random numbers can never be generated with classical systems, for which a deterministic description, in principle, always exists. For quantum systems that exhibit contextuality, a deterministic description that is independent of the choice of measurement settings is impossible, thus opening the door for the generation of genuine random numbers. This was indeed achieved in refs. \cite{PAMGMMOHLMM10,UZZWYDDK13}, where the violations of Bell and non-contextuality inequalities were used directly to compute a lower bound on the min-entropy of the outcomes, thus guaranteeing randomness of the string of output bits. 
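As a brief aside on the quantity being bounded (a standard definition, stated here for convenience and not a result of the cited experiments), the min-entropy of a random variable $X$ with distribution $p$ is \be H_{\infty}\left(X\right)=-\log_2 \max_{x} p\left(x\right). \ee As a purely illustrative number, an output bit whose most likely value occurs with probability $3/4$ has $H_{\infty}=-\log_2\left(3/4\right)\approx 0.415$ bits, and $n$ independent such bits have a total min-entropy of about $0.415\,n$ bits, so a certified lower bound on $H_{\infty}$ quantifies the randomness contained in the output string.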
This string can then be processed using classical algorithms to distill genuine random numbers. It is interesting to note that the quantum system in ref. \cite{UZZWYDDK13} is a qutrit, which shows that randomness can be generated without the need for costly quantum resources such as entanglement. This allows for easier implementation and a significantly higher generation rate of random strings. \subsubsection*{Contextuality and models of quantum computation with state injection} Quantum computation with state injection (QCSI) \cite{BK05} is a scheme composed of a free part consisting of quantum circuits with a restricted set of states, unitaries and measurements (generally those of the stabilizer formalism), in which universal quantum computation is achieved by the injection of special resource states, called \emph{magic states}. These special states are usually distilled from many copies of noisy states through a procedure called \emph{magic state distillation}. To understand the source of quantum advantage in these schemes we need to understand precisely which quantum property allows for magic state distillation. In ref. \cite{VFGE12}, it was shown that, for a special choice of Wigner function representation of qudits with prime $d$, positivity of the Wigner function of a state implies the existence of an efficient classical simulation of that state, which in turn implies that the state is not useful for magic state distillation. This result was later explored in ref. \cite{HWVE14}, which exhibited a contextuality scenario based on the set of restricted measurements for which contextuality is equivalent to negativity of the Wigner function. This shows that contextuality with respect to this scenario is a necessary ingredient for magic state distillation. 
This result does not easily generalize to qubit systems \cite{BDBOR17,LWE18}, where both the definition of the Wigner function and the presence of state-independent contextuality with respect to the restricted measurements pose an obstacle to the recognition of contextuality as a resource. Several attempts to establish contextuality as a resource in the qubit stabilizer sub-theory have been made by further restricting to non-contextual subsets of operations within this sub-theory. In ref. \cite{DGBR15}, the authors restrict to qubits with real density matrices (rebits) and define a Wigner function for $n$ rebits that is consistent with the restricted stabilizer formalism. With this construction they are able to prove that there are rebit QCSI schemes in which universal quantum computation is only possible in the presence of contextuality. In ref. \cite{RBDOB17}, the authors show that if non-negative Wigner functions remain non-negative under free measurements, then contextuality and Wigner function negativity are necessary resources for universal quantum computation in these schemes. The result on contextuality is, however, strictly stronger than the result on Wigner functions since, unlike in the qudit case \cite{DOBBR17}, qubit magic states can have negative Wigner functions but still be non-contextual. These results were later generalized in ref. \cite{BDBOR17}, which shows that if the set of available measurements in the scheme is such that there exists a quantum state that does not exhibit contextuality, then contextuality is a necessary resource for universal quantum computation in these schemes. \subsubsection*{Contextuality and measurement based quantum computation} An $\ell_d$-measurement-based quantum computation ($\ell_d$-MBQC) \cite{RBB03} consists of an $n$-site correlated resource state and a classical control computer with restricted computational power. 
Each site receives a measurement setting to be performed on its system, encoded as an element of $\mathds{Z}_d$, with $d=p^r$ and $p$ prime, and returns the outcome of the measurement, also encoded as an element of $\mathds{Z}_d$. No communication between sites is allowed during the computation. The control computer post-processes the measurement outcomes linearly to produce the output of the computation. For $d=2$ it was shown in ref. \cite{AB09} that nonlinear Boolean functions can be computed with $\ell_2$-MBQC with a resource state constructed from a proof of contextuality based on Mermin's GHZ paradox. It was then shown in ref. \cite{Raussendorf13} that deterministic computation of any nonlinear Boolean function with $\ell_2$-MBQC implies that the contextual fraction of the corresponding behavior is equal to one. For probabilistic computation, $\ell_2$-MBQCs that compute a nonlinear Boolean function with high probability are necessarily contextual. Ref. \cite{OG17} proves that bipartite non-local behaviors in the CHSH scenario and behaviors with arbitrarily small violation of a multi-partite GHZ non-contextuality inequality suffice for reliable classical computation, that is, for the evaluation of any Boolean function with success probability bounded away from $\frac{1}{2}$. These results were later connected with the contextual fraction of the resource state with respect to the available measurements in the computation \cite{ABM17}. Let $f$ be a Boolean function and consider an $\ell_2$-MBQC that uses the behavior $\boldsymbol{B}$ to compute $f$ with average success probability $p_S$ over all possible inputs, and corresponding average failure probability $p_F=1-p_S$. 
Then \be p_F\geq \left(1-\mathcal{CF}\left(\boldsymbol{B}\right)\right)\nu\left(f\right), \ee where $\nu\left(f\right)$ is the average distance of $f$ to the closest $\mathds{Z}_2$-linear function\footnote{The average distance between two Boolean functions $f,g:2^m\rightarrow 2^l$ is given by $d(f,g):=\frac{1}{2^{m}}\left|\left\{i\in 2^m\vert f(i)\neq g(i)\right\}\right|.$}. In the qudit case, however, examples of non-contextual $\ell_d$-MBQCs with local dimension $d\geq 3$ that evaluate nonlinear functions can be found \cite{FRB18}. Nevertheless, it is still possible to connect contextuality with quantum advantages. In this direction, ref. \cite{HWB11} shows that the evaluation of a sufficiently high order polynomial function on a multi-qudit system provides a proof of contextuality. This problem was also investigated in ref. \cite{FRB18}, which, besides reproducing the result of ref. \cite{HWB11}, emphasised the distinctive role of contextuality in individual sites versus strong correlations between the sites. \subsubsection*{Memory cost of simulating contextuality} Contextuality can be simulated by classical models with memory, and the efficiency of such simulations can help understand the difference between quantum and classical systems. In any such simulation, the system changes between different internal states during the measurement sequence. These states, drawn from a classical state space $\Lambda$, can be considered as memory and define the spatial complexity of the simulation. The model in which the cardinality of $\Lambda$ is minimum is memory-optimal and defines the \emph{memory cost} of the simulation. In refs. \cite{KGPLC11,FK17} the authors study the memory cost of simulating quantum contextuality in the Peres-Mermin scenario and show that three internal states are necessary for a perfect simulation of quantum behaviors. 
This shows that reproducing the results of sequential measurements on a two-qubit system requires more memory than the information-carrying capacity of the system, given by the Holevo bound. In ref. \cite{KWB18} it is shown that contextuality in a quantum sub-theory puts a lower bound on the cardinality of the state space used in any classical simulation of this sub-theory. As a consequence of their result, the authors prove that the minimum number of bits necessary to simulate the $n$-qubit stabilizer sub-theory grows quadratically with $n$, in contrast with the qudit case with $d$ an odd prime \cite{DOBBR17}, where an efficient simulation that scales linearly in $n$ can be constructed from a particular choice of Wigner representation, which is always positive for stabilizer states. \section{Conclusion} \label{sec:conclusion} In this contribution, we reviewed some of the recent developments towards a unified resource theory for contextuality. Although these results highlight contextuality as a possible operational resource, the understanding of the connection of these constructions with practical applications is still in its infancy. This is the most important and the most challenging ingredient in a resource theory. Other important questions regarding contextuality as a resource remain open, such as the possibility of contextuality distillation, the role of catalysts, conversion rates and the possibility of finding an explicit parametrization of larger classes of free operations for contextuality. It is also important to investigate the role of other forms of non-classicality in quantum advantage \cite{DA18,MK18} and, although quantifiers can be adapted to the contextuality-by-default framework of ref. \cite{DKC16}, it is still an open problem to find a version of the non-contextual wirings for this extended notion of contextuality. 
We point out that although the main application of a resource theory is to understand the role of a physical property as an operational resource, this construction can be interesting on its own and it can give insight into the physical property under consideration. For example, in ref. \cite{DBAC18} the authors use contextuality quantifiers to explore the geometry of the set of behaviors, finding the approximate relative volume of the non-contextual set in relation to the non-disturbing set. \subsection*{Acknowledgments} The author thanks Ad\'an Cabello, Ehtibar Dzhafarov, Emily Tyhurst, Ernesto Galv\~ao, Jan-\AA{}ke Larsson, Jingfang Zhou, Leandro Aolita, Marcelo Terra Cunha, Pawe\l{} Horodecki, Pawe\l{} Kurzy\'{n}ski, Rui Soares Barbosa, Samson Abramsky, Shane Mansfield and all the participants of the Winer Memorial Lectures at Purdue University for valuable discussions. The author thanks Ehtibar Dzhafarov, Maria Kon, and V\'ictor H. Cervantes for the organization of the event and Purdue University for its support and hospitality. The author acknowledges financial support from the Brazilian ministries and agencies MEC and MCTIC, INCT-IQ, FAPEMIG, and CNPq Universal grant n. 431443/2018-1.
\bibliographystyle{apalike}
\bibliography{biblio}% Produces the bibliography via BibTeX.
\end{document}
\caption{\color{blue}{Representation ability comparison.}}
\caption{Comparison with state-of-the-art real-time trackers on OTB dataset. Trackers are grouped into CF-based methods, SiamFC-based methods and miscellaneous. Numbers in \textcolor{red}{red} and \textcolor{blue}{blue} are the best and the second best results, respectively.}
\caption{Comparison with state-of-the-art trackers on VOT benchmark. Both non-real-time methods (top rows) and real-time methods (bottom rows) are included. ``A'' and ``R'' denote accuracy and robustness. EAO stands for expected average overlap. The numbers in \textcolor{red}{red} and \textcolor{blue}{blue} indicate the best and the second best results.}
\caption{Comparison with state-of-the-art trackers on TrackingNet and LaSOT. The last three trackers are implemented by ourselves. The numbers in \textcolor{red}{red} and \textcolor{blue}{blue} indicate the best and the second best results.}
\caption{Visualization of failure cases. The \textcolor{green}{green} box is ground-truth and the \textcolor{red}{red} box is our tracking result.}
\caption{Comparison between different levels of temporal supervision. \textcolor[rgb]{0.2,0.4,0.8}{APV} indicates the average number of unique actions per training video. TS results refer to the accuracy obtained with the best initialisation (see Figure~\ref{fig:gridSearch}). Timestamp results are reported after update, with $h = 1.00$.}
\caption{ Distribution of weights $\omega_i$ for unlabeled images at epoch 0 (left) and epoch 90 (right) during the training of CIFAR-10 with $500$ labels. Correct pseudo-labels according to ground-truth are shown in {\color{blue} blue} and incorrect in {\color{red} red}. \label{fig:wDistr} }
\caption{\label{fig:WirelessExampleTopology} $G$: $n = 6$ and $N = 16$. Vertices 1-3 represent fixed terrestrial sites, with vertex 3 a coastal communications station; vertex 4 represents a ship (over the horizon from the station), vertex 5 represents a plane (within range of the station and ship), and vertex 6 represents an overhead satellite. Nontrivial P2MP groups are shown in {\color{red}red}, {\color{violet}violet}, {\color{blue}blue}, and {\color{cyan}cyan}.}
\caption{ \label{fig:TacticalExampleWireless}A network with link capacities color-coded as follows: black links have capacity $10^{-4}$ Gbps; {\color{red}red links, $10^{-3}$ Gbps}; {\color{blue}blue links, $10^{-2}$ Gbps}; and {\color{green}green links, $10^{-1}$ Gbps}. P2MP groups are defined by color: i.e., each set of links from a given vertex with a given color defines a P2MP group.}
\caption{ \label{fig:FMM} (Top) A toy scenario in $[-1,1]^2$. Goals are modeled by negative charges and shown in {\color{blue}blue}; obstacles are modeled by positive charges and shown in {\color{red}red}. Opacity indicates the magnitude of charges. $10^3$ robots are modeled by test points (versus, e.g., test charges of small positive sign) and their locations and velocities both indicated by black gradient vectors of the artificial potential. The target locations are distributed as $\frac{4}{5}U(\text{top half}) + \frac{1}{5}U(\text{bottom half})$, where here $U$ indicates a uniform distribution. (Bottom left) The quad-tree associated to the scenario on the right. Varying the desired precision in the FMM has very little effect on this tree, and as a practical matter it can be assumed unique. (Bottom right) The associated spatial discretization, with relative number of test points indicated. Note that regions without test points do not have ``leaf boxes.''}
\caption{ \label{fig:FMNet} The FMN corresponding to the scenario in Figure \ref{fig:FMM}. Nodes are colored by betweenness centrality according to the colorbar on the right. {\color{gray}The spatial decomposition from Figure \ref{fig:FMM} is shown in gray} for reference. {\color{blue}Edges within an FMM box are blue}, while {\color{cyan}edges connecting nearest nodes in adjacent boxes are cyan} and {\color{red}edges connecting otherwise isolated nodes to their nearest neighbors are red}.}
\caption{ \label{fig:Metrics} Network metrics for $RGG(\xi;r)$ (in black), {\color{red}$RD(\xi;r)$ (in red)}, {\color{magenta}$RG(\xi;r)$ (in magenta)}, and {\color{blue}$RFMN(\xi;r)$ (in blue)} for 100 simulations of $N = 10^3$ uniformly distributed test charges in $[-1,1]^2$. Although $RGG(\xi;r)$ is most efficient, this network performance comes at a high cost in edges, and $RFMN(\xi;r)$ performs well (and for hop efficiency per edge, the best) for all measures of efficiency. Note that $RFMN(\xi;r) = FMN(\xi)$ for sufficiently large $r$ within the range shown. We also have that, e.g. $RD(\xi;r') = D(\xi)$, and though the corresponding $r'$ is outside the range shown, the residual effects are minimal.}
\caption{The visual verb sense predictions (``blockieren'', ``b\"{u}rsten'') successfully constrain the decoder to predict the correct sense of the verb (``block'', ``brush'') in the German translation \textbf{(+WSD)}. The incorrect verb in the baseline translation is shown in \textcolor{red}{\textbf{bold red.}}}
\caption{An example of a verb sense translation error (shown in \textcolor{red}{\textbf{bold red}}) by the English-German neural translation system of \citeauthor{sennrichwmt2017} (\citeyear{sennrichwmt2017}).}
\caption{SED of the eastern emission region e1. The red upper limits are derived in this work at the $2\sigma$ confidence level. Other data are taken directly from \citet{zhou18}. Radio to low-energy \gray photons are fitted with synchrotron emission in a magnetic field (dashed line), and the GeV to TeV emission with IC scattering of the CMB (dot-dashed line). The solid line represents the sum of the synchrotron and IC emission. }
\caption{Example showing how the motif percolation creates a chain of adjacent motifs to form a compact structure. For each pattern $M \in \mathcal{M}$ that is found in $G^{\tau'}$, we only consider the motif instances $\{ M \}$ belonging to this family $M$ (here \protect\includegraphics[scale=0.04]{M0.png}). We start with a seed motif instance shown in the initial configuration of $G^{\tau'}$ and then, at each iteration, cover instances which are adjacent to the already covered motifs, subject to constraints. Once there is no adjacent instance that can be included, the process stops and the covered nodes and edges form $G_{*, M}^{\tau'}$ for that cascade.}
\caption{Qualitative comparison of Python graph analysis packages. \graspy is largely complementary to existing graph analysis packages in Python. \graspy does not implement many of the essential algorithms for operating on graphs (rather, it leverages \networkx for these implementations). The focus of \graspy is on statistical modeling of populations of networks, with features such as multiple graph embeddings, model fitting, and hypothesis testing. A \greencheck~is given for packages that implement the respective feature, a \ocheck~for packages that partially implement the respective feature, and a \redx~is given for packages that do not implement the respective feature. Note that while a \greencheck~shows that the feature exists in the corresponding package, it does not imply that the specific algorithms are the same for all packages.}
\caption{\label{fig:vizu} \blue{%Visualization can help finding a track for further network analysis.
Four visualizations of the same network modelling interactions between 64 sociable weavers \cite{ros15,van14}. a) Random layout. b) Fruchterman and Reingold layout. c) Circle layout where nodes' size and position are defined by their degree. The same two nodes are colored in red in panels a-c to show that their distance varies depending on the layout. d) Representation of the adjacency matrix with row/column ordering consistent with the clustering obtained with the Infomap algorithm (see \cite{for16} for details). Graphical representations are performed with the package \texttt{igraph}.} }
\caption{ The \textbf{top} diagram describes the architecture of our method. The network is composed of an encoder branch $q(z|x)$ (at top left), a conditional decoder $p(x|z,s)$, as well as two augmenting losses. The DWI-space reconstruction loss in {\color{blue} blue} is computed using the injected subject-specific projection matrix (from SH to b-vector representation). The patch adversary in {\color{dark_green} green} attempts to predict whether a reconstructed patch is originally from a given site (``remapped'' vs. ``real'' patches). At test time the {\color{red} $s$ site id} is manipulated to map data onto one specific site. The \textbf{bottom} diagram describes the training and testing schema for the proposed method, in the left and right boxes respectively. Site bias is represented by the differing colors. In both configurations, these are mapped to a site-invariant space $z$, the colorless center column of both boxes. The remaining (site-independent) information can then be reconstructed into an image, given a site. In the training configuration data are remapped to their original site, and the loss calculated from Eq. \ref{eq:loss} (secondary loss terms omitted from figure). Weights are then trained using the derivative of the loss (backpropagation \cite{rumelhart1985learning}). In the testing configuration, the data are mapped to the selected site $s'$. The outputs in the right-hand column of the right box have bounded mutual information about their original site, vanishing with respect to the loss function. }
\caption{Whole brain, subset location in {\color{blue} blue}.}
\caption{Network modes for weight mirroring. Both panels show the same two-layer section of a network. In both modes, the three neurons in layer \textit{l} of the forward path (\protect\tikz \fill[forward-color] (0.1,0.0) rectangle (0.4,0.2);) send their output signal $\mathbf{y}_l$ through the weight array $\mathbf{W}_{l+1}$ (and other processing shown in equation~(\ref{eq:forward-path})) to yield the next-layer signal $\mathbf{y}_{l+1}$. And in the feedback path (\protect\tikz \fill[feedback-color] (0.1,0.0) rectangle (0.4,0.2);), the two neurons in layer $l + 1$ send their signal $\boldsymbol{\delta}_{l+1}$ through weight array $\mathbf{B}_{l+1}$ to yield $\boldsymbol{\delta}_l$, as in (\ref{eq:fa}). The figure omits the biases $\mathbf{b}$, nonlinearities $\phi$, and, in the top panel, the projections that convey $\mathbf{y}_l$ to the $\boldsymbol{\delta}_l$ cells, allowing them to compute the factor $\phi'(\mathbf{y}_l)$ in equation (\ref{eq:fa}). \textbf{a)} In \textit{engaged mode}, cross-projections (\protect\tikz \fill[gold] (0.1,0.0) rectangle (0.4,0.05);) convey the feedback signals $\boldsymbol{\delta}$ to the forward-path cells, so they can adjust the forward weights \textcolor{green}{$\mathbf{W}$} using learning rule (\ref{eq:weight-update}). \textbf{b)} In \textit{mirror mode}, one layer of forward cells, say layer $l$, fires noisily. Its signal $\mathbf{y}_l$ still passes through $\mathbf{W}_{l+1}$ to yield $\mathbf{y}_{l+1}$, but now the blue cross-projections (\protect\tikz \fill[forward-color] (0.1,0.0) rectangle (0.4,0.1);) control firing in the feedback path, so $\boldsymbol{\delta}_l = \mathbf{y}_l$ and $\boldsymbol{\delta}_{l+1} = \mathbf{y}_{l+1}$, and the $\boldsymbol{\delta}_l$ neurons adjust the feedback weights \textcolor{green}{$\mathbf{B}_{l+1}$} using learning rule~(\ref{eq:noise}). 
We call the circuit $\mathbf{y}_l$, $\mathbf{y}_{l+1}$, $\boldsymbol{\delta}_{l+1}$, $\boldsymbol{\delta}_l$ a \textit{weight mirror} because it makes the weight array $\mathbf{B}_{l+1}$ resemble $\mathbf{W}_{l+1}^T$.}
\caption{Reciprocal network for Kolen-Pollack learning. There is a single mode of operation. Gold-colored cross-projections (\protect\tikz \fill[gold] (0.1,0.05) rectangle (0.4,0.1);) convey feedback signals $\boldsymbol{\delta}$ to forward-path cells, so they can adjust the forward weights \textcolor{green}{$\mathbf{W}$} using learning rule (\ref{eq:kolen-pollack-forward-update}). Blue cross-projections (\protect\tikz \fill[forward-color] (0.1,0.05) rectangle (0.4,0.1);) convey the signals $\mathbf{y}$ to the feedback cells, so they can adjust the feedback weights \textcolor{green}{$\mathbf{B}$} using (\ref{eq:kolen-pollack-feedback-update}).}
\caption{ImageNet results. \textbf{a)} With ResNet-18 architecture, the weight-mirror network (\textcolor{blue}{\textbf{---} WM}) and Kolen-Pollack (\textcolor{red}{\textbf{---} KP}) outperformed plain feedback alignment (\textcolor{green}{\textbf{---} FA}) and the sign-symmetry algorithm (\textcolor{gold}{\textbf{---} SS}), and nearly matched backprop (\textcolor{black}{\textbf{---} BP}). \textbf{b)} With the larger ResNet-50 architecture, results were similar.}
\caption{Agreement of forward and feedback matrices in the ResNet-50 from Figure~\ref{fig:results-Resnet}b. \textbf{a)} Weight mirrors kept the angles between the matrices $\mathbf{B}_l$ and $\mathbf{W}_l^T$ small in all layers, from the input layer (\textcolor{light-blue}{\textbf{---}}) to the output (\textcolor{cyan}{\textbf{---}}). \textbf{b)} Feedback vectors $\boldsymbol{\delta}_l$ computed by the weight-mirror network were also well aligned with those that would have been computed by backprop. \textbf{c, d)} The Kolen-Pollack network kept the matrix and $\boldsymbol{\delta}$ angles even smaller. \textbf{e, f)} The sign-symmetry method was less accurate.}
\caption{ \system{} integrates multiple coordinated views for discovering intersectional bias. Above, our user investigates the intersectional subgroups of \textit{sex} and \textit{race}. \textbf{A.} The \feature{} allows users to visualize each feature's distribution and generate subgroups. \textbf{B.} The \strip{} lets users select various fairness metrics to see the global average per metric and compare subgroups to one another, e.g., \textcolor{sysred}{pinned \textbf{Caucasian Males}} versus \textcolor{sysblue}{hovered \textbf{African-American Males}}. The plots for \textit{Recall} and \textit{False Positive Rate} show that for African-American Males, the model has relatively high recall but also the highest false positive rate out of all subgroups of sex and race. \textbf{C.} The \detail{} lets users compare the details of two groups and investigate their class balances. Since the difference in False Positive Rates between Caucasian Males and African-American Males is far larger than their difference in base rates, a user suspects this part of the model merits further inquiry. \textbf{D.} The~\suggest{} shows suggested subgroups ranked by the worst performance in a given metric. }
\caption{Primary system results. Results shown in terms of minimum t-DCF and the CM EER [\%]. IDs highlighted in \dnncolor{grey} signify systems that used neural networks in either the front- or back-end. IDs highlighted in \textbf{bold font} signify systems that use an ensemble of classifiers.}
\caption{Distribution of 1.2\,mm dust continuum emission toward MC5-N. (a) Color scale image shows 1.2\,mm dust emission obtained by the IRAM 30\,m telescope. The dotted lines show the field coverage of the 7\,m array observation toward MC5-N, MC6 and MC8. (b) An enlarged view of the orange rectangle in panel (a). The lowest contour and subsequent contour step are \textcolor{black}{15\,mJy\,beam$^{-1}$}. The dotted line shows the field coverage of the 7\,m array observation shown in panels (c,d). The angular resolution is given in the lower right corner, 16$\arcsec$. (c) Same as panel (b) but for the combined (7\,m array + IRAM 30\,m) data shown in color scale. (d) Same as panel (b) but for the 7\,m array data alone. The angular resolution is given in the lower right corners, 7$\farcs$3 $\times$ 5$\farcs$0 in panels (c,d). \textcolor{black}{The dotted circles in panels (c,d) indicate the positions of the possible substructures.}}
\caption{ Mean radial profiles of H$_2$ column density centered at the peak position in MC5-N derived from the 1.2\,mm dust continuum image obtained by the IRAM 30\,m telescope alone and the combined 7\,m array + IRAM data. Left and right panels show the linear--linear plot and the log--log plot of the profiles, respectively. The averaged profiles of the combined data and the IRAM data are shown by black and red solid lines, respectively. Black and red bars show the ($\pm$1$\sigma$) dispersion of the distribution of the radial profiles in each data set. Green/blue lines in the left panel show the half widths at half maximum of the beams, \textcolor{black}{8}$\arcsec$ and 3$\arcsec$. In the right panel, green/blue curves are the same as in the left panel but for Gaussian functions with the same widths.}
\caption{Streamline plots for linear three-player game near stationary point (0,0,0). Trajectories start at the {\color{green}{green}} point and converge to the {\color{red}{red}} point by following the vector field. Panels (a) and (e) show the top view of the 3-D trajectories. When $w_1=0$ the trajectories suggest that both ML-ARL and MaxEnt-ARL converge to the local optimum ($w_1=w_2=w_3=0$). When $w_2=0$, the MaxEnt-ARL trajectories converge to the local optimum. The ML-ARL trajectories converge to the optimum only when they start far away from $0$ along $w_3$. The trajectories starting closer to $w_3=0$, however, do not converge to $w_1=0$. When $w_3=0$, the game reduces to a two-player adversarial game (akin to a GAN~\cite{goodfellow2014generative}), where ML-ARL shows non-convergent cyclic behavior while MaxEnt-ARL converges.\label{fig:trajectory_1d_gaussian}}
\caption{Analysis of EF using DBPN-S on $4\times$ enlargement. {\color{red}Red} indicates the best performance.}
\caption{Analysis of filter size in the back-projection stages on $4\times$ enlargement from D-DBPN. {\color{red}Red} indicates the best performance.}
\caption{Analysis of input/output color channel using DBPN-L. {\color{red}Red} indicates the best performance.}
\caption{Comparison of the DBPN-L and D-DBPN-L on 4$\times$ and 8$\times$ enlargement. {\color{red}Red} indicates the best performance.}
\caption{Quantitative evaluation of DBPN's variants on 4$\times$. {\color{red}Red} indicates the best performance. }
\caption{Quantitative evaluation of state-of-the-art SR algorithms: average PSNR/SSIM for scale factors 2$\times$, 4$\times$, and 8$\times$. {\color{red}Red} indicates the best and {\color{blue}blue} indicates the second best performance. }
\caption{Runtime evaluation with input size 64$\times$64. {\color{red}Red} indicates the best and {\color{blue}blue} indicates the second best performance, * indicates the calculation using function timer in Torch, and N.A. indicates that the algorithm runs out of GPU memory.}
\caption{ \emph{Left}: the base stitch unit in \textbf{black}, its {\color{course}course connections} in {\color{course}purple}, and its {\color{wale}wale ones} in {\color{wale}orange}. One can think of courses as ``rows'' and wales as ``columns''. \emph{Middle}:~width increase by \emph{split stitch}. \emph{Right}:~width decrease by merging neighboring stitches using move transfers. }
\caption{ Sideways view of a compact glove in our system. The underlying skeleton graph is highlighted on top, with \emph{tubular sheet} nodes in \textcolor{sheet}{blue} and \emph{split} nodes in \textcolor{split}{fuchsia}.}
\caption{{ \color{NavyBlue} \bf Phase diagram of dynamical regimes.} Dynamical regimes for the droplet depending on the reduced acceleration $\Gamma$ and oil depth $h$. Measured Faraday and Walking thresholds are indicated by filled circles. The dashed curve corresponds to Me=20. The vertical lines indicate the depths $h_0$, $h_1$ and $h_2$ fixed in our experiments. The empty circles correspond to the experimental conditions where the speeds $v_1$ and $v_2$ were measured.}
\caption{{ \color{NavyBlue} \bf Sketch of the cavities.} (a) Top view of an annular cavity of width $D$ and radius $R$ carved in the bottom of the oil container. (b) Side view. The oil level is adjusted to obtain a depth $h_1$ in the cavity and a thin layer $h_0$ elsewhere. By adjusting these parameters, a walking droplet tends to remain in the cavity and follow a circular motion. Additional cases were also considered. (c) A pattern of period $p=a+b$ is inserted in the annulus such that (d) zones of intermediate depth $h_2<h_1$ and width $a$ alternate with zones of depth $h_1$ and width $b$. (e,f) The case of uniform depth $h_2$ is also considered. }
\caption{{ \color{NavyBlue} \bf Speed of walkers.} Average normalized speed of walkers $\bar v/v_1$ in the annulus as a function of the period $p$ normalized by the Faraday wavelength $\lambda_F$. Error bars are indicated, measured over 20 different observations. The horizontal line corresponds to the expected average speed $\bar v/v_1$ from Eq.(\ref{eq:average}). The red curve is a guide for the eye, emphasizing the drop of average speed near the Bragg condition, i.e. Eq.(\ref{eq:bragg}).}
\caption{{ \color{NavyBlue} \bf Typical trajectories of walkers.} Azimuthal trajectories $\theta(t)$ in the annulus for two different periodic patterns: (a) $N=22$ and (b) $N=28$, i.e. around the Bragg condition $p/\lambda_F=1/2$. In the former case, the droplet follows a circular motion at nearly constant speed, while in the latter case, a random back-and-forth motion takes place. For that condition, the droplet speed is seen to oscillate when periodically crossing barriers, as shown in the enlarged parts of the plots (c) and (d). (e,f) Typical trajectories, corresponding to those conditions, with the speed indicated by the color scale. This illustrates the fact that the speed decreases in between barriers. (f) For $N=28$, the speed is close to zero in between two successive barriers. }
\caption{{ \color{NavyBlue} \bf Numerical simulations of the walker dynamics within a corrugated medium.} (a) The amplitude of each individual wave source is set to $\zeta_0$. (b) The amplitude of each individual wave source is $0.66 \zeta_0$. For each simulation, the amplitude of the wavy external potential is $U_0 = 15$ $\mu$J/kg, which is comparable to the energy along the $y$ direction. The periodicity of the underwater carvings is identical and corresponds to $p=\lambda_F/2$.}
\caption{\label{Figure3} \textbf{(a)} Maximum THz electric field amplitude, normalised to the mean value, as the polarization angle is varied over a $360^{\circ}$ range by changing the relative bias voltages between horizontal and vertical pixels. The inset shows the bias voltage applied for each target polarization angle. \textbf{(b)} Comparison of the experimentally measured orientation angle to the target angle at each step. The dashed line represents an exact match between the two. \textbf{(c)} Polar representation of the orientation angle and amplitude of the THz pulses at each step. \textbf{(d)} Ellipticities of the generated THz radiation at each orientation angle, at 1\,THz (blue) and averaged from 0.3--5.0\,THz (red). The shaded areas represent the variation over the $360^{\circ}$ rotation.}
\caption{The PSNR and SSIM results of different approaches on Set5, Set14, BSDS100 and Urban100 with down-sampling factor $\times$2, $\times$3 and $\times$4. We use {\color{red}{red}} and {\color{blue} blue} to label first and second place, respectively.}
\caption{Average accuracy of UHC methods over different combinations of HC configurations, datasets, and unified classifier models. (\underline{\bf Underline bold}: Best method. {\bf Bold}: Methods which are not statistically significantly different from the best method.)\label{table:results}}{ \definecolor{gray}{rgb}{0.85,0.85,0.85} \newcommand{\hs}{\hspace{4pt}} \newcommand{\uline}{\underline} \footnotesize \renewcommand{\arraystretch}{1} \setlength{\aboverulesep}{1.5pt} \setlength{\belowrulesep}{1.5pt} \begin{tabular}{@{}c@{}c@{}c@{\hs}c@{\hs}c@{\hs}c@{\hs}c@{\hs}c@{\hs}c@{\hs}c@{\hs}c@{\hs}c@{\hs}c@{\hs}c@{\hs}c@{\hs}c@{\hs}c@{\hs}c@{\hs}c@{\hs}} \toprule[1.1px] \multicolumn{2}{c}{} & \multicolumn{8}{c}{Random Classes} & & \multicolumn{8}{c}{Completely Overlapping Classes}\tabularnewline \cmidrule{3-10} \cmidrule{12-19} \multicolumn{2}{c}{Methods} & \multicolumn{2}{c}{ImageNet} & & \multicolumn{2}{c}{LSUN} & & \multicolumn{2}{c}{Places365} & & \multicolumn{2}{c}{ImageNet} & & \multicolumn{2}{c}{LSUN} & & \multicolumn{2}{c}{Places365}\tabularnewline \cmidrule(r){3-4} \cmidrule(r){6-7} \cmidrule(r){9-10} \cmidrule(r){12-13} \cmidrule(r){15-16} \cmidrule{18-19} \multicolumn{2}{c}{} & VGG16 & ResNet34 & & VGG16 & ResNet34 & & VGG16 & ResNet34 & & VGG16 & ResNet34 & & VGG16 & ResNet34 & & VGG16 & ResNet34\tabularnewline \midrule[1.1px] \multicolumn{2}{r}{SPV (Benchmark)} & .7212 & .6953 & & .6664 & .6760 & & .5525 & .5870 & & .7345 & .7490 & & .6769 & .7017 & & .5960 & .6460\tabularnewline \cmidrule{1-19} \multicolumn{2}{r}{SD} & .5543 & .5562 & & .5310 & .5350 & & .4390 & .4564 & & \textbf{.7275} & \textbf{.7292} & & .7004 & \textbf{.7041} & & \textbf{.6163} & \textbf{.6402}\tabularnewline \cmidrule{1-19} \multicolumn{2}{l}{\textbf{(A)} {\it Estimate $q$ methods}} & & & & & & & & & & & & & & & & & \tabularnewline \multicolumn{2}{r}{CE-E} & \textbf{.6911} & \textbf{.6852} & & .6483 & \textbf{.6445} & & \textbf{.5484} & \textbf{.5643} & & \textbf{.7276} & 
\textbf{.7290} & & .7002 & .7036 & & \textbf{.6162} & \underline{\bf .6406}\tabularnewline \multicolumn{2}{r}{MF-P-E} & .6819 & .6747 & & .6443 & .6406 & & .5349 & .5488 & & \textbf{\uline{.7280}} & \textbf{\uline{.7297}} & & \textbf{.7012} & \textbf{.7052} & & \textbf{\uline{.6167}} & \underline{\bf .6406}\tabularnewline \multicolumn{2}{r}{MF-LV-E} & .6660 & .6609 & & .6348 & .6330 & & .5199 & .5414 & & .7231 & .7242 & & \textbf{\uline{.7031}} & \textbf{.7043} & & .6129 & .6374\tabularnewline \multicolumn{2}{r}{MF-LF-E} & .6886 & \textbf{.6833} & & .6490 & \textbf{.6458} & & .5441 & .5609 & & .7265 & \textbf{.7279} & & \textbf{.7015} & \textbf{\uline{.7057}} & & \textbf{.6161} & \textbf{.6397}\tabularnewline \cmidrule{1-19} \multicolumn{2}{l}{\textbf{(B)} {\it Backprop. methods}} & & & & & & & & & & & & & & & & & \tabularnewline \multicolumn{2}{r}{CE-BP} & \textbf{.6902} & \textbf{.6869} & & \textbf{.6520} & .6439 & & .5466 & \textbf{.5669} & & \textbf{.7275} & \textbf{.7288} & & .7003 & \textbf{.7040} & & \textbf{.6161} & \textbf{.6400}\tabularnewline \multicolumn{2}{r}{MF-P-BP} & \textbf{\uline{.6945}} & \textbf{\uline{.6872}} & & .6480 & \textbf{.6417} & & \textbf{.5471} & .5609 & & \textbf{.7277} & \textbf{.7287} & & .6999 & \textbf{.7019} & & .6146 & .6384\tabularnewline \multicolumn{2}{r}{MF-LV-BP} & .6889 & \textbf{.6847} & & \textbf{.6495} & .6389 & & .5467 & \textbf{.5681} & & .7229 & .7225 & & .7001 & \textbf{.7046} & & .6113 & .6369\tabularnewline \multicolumn{2}{r}{MF-LF-BP} & .6842 & \textbf{.6840} & & \textbf{.6523} & \textbf{.6445} & & .5383 & \textbf{.5624} & & .7239 & .7252 & & \textbf{.7020} & .7034 & & .6104 & .6366\tabularnewline \cmidrule{1-19} \multicolumn{2}{l}{\textbf{(C)} {\it Balanced soft labels}} & & & & & & & & & & & & & & & & & \tabularnewline \multicolumn{2}{r}{SD-BS} & .6629 & .6574 & & .6343 & .6345 & & .5283 & .5433 & & .7217 & .7214 & & .6979 & .7017 & & .6094 & .6320\tabularnewline \multicolumn{2}{r}{CE-BS} & \textbf{.6928} & 
\textbf{.6856} & & .6513 & \textbf{.6464} & & \textbf{\uline{.5548}} & \textbf{.5687} & & .7215 & .7213 & & .6979 & .7018 & & .6094 & .6323\tabularnewline \multicolumn{2}{r}{MF-P-BS} & .6851 & .6756 & & .6474 & \textbf{.6450} & & .5455 & .5546 & & .7243 & .7252 & & .6996 & .7041 & & .6124 & .6355\tabularnewline \multicolumn{2}{r}{MF-LV-BS} & .6772 & .6682 & & .6388 & .6357 & & .5346 & .5497 & & .7168 & .7173 & & .7014 & .7028 & & .6063 & .6301\tabularnewline \multicolumn{2}{r}{MF-LF-BS} & \textbf{.6935} & \textbf{.6865} & & \textbf{\uline{.6549}} & \textbf{\uline{.6485}} & & \textbf{.5544} & \textbf{\uline{.5692}} & & .7210 & .7215 & & .6998 & .7035 & & .6101 & .6330\tabularnewline \bottomrule[1.1px] \end{tabular} }
\caption{Trajectory results on seq. \texttt{08} (drive \#28, 2011/09/30)~\cite{geigerVision2013} of the KITTI dataset. The proposed method (\textcolor{green}{green}) accurately follows the benchmark trajectory for the entire sequence (\SI{4.2}{km}, \SI{9}{min}), whereas the pure IMU integration (\textcolor{cyan}{cyan}) quickly diverges. Both methods use only IMU signals and are initialized with the benchmark pose and velocity. During the GPS outage that occurs in this sequence, our solution continues to estimate the trajectory accurately.\label{fig:example}}
\caption{Background maps of the \fermi\ LAT \gray\ sky. \fermi\ data are split into three logarithmically-uniform bins in energy and divided by the mission-averaged exposure map for that energy range. Grayscale intensity encodes the resulting mission-averaged photon flux over each band in units of photons per 200 seconds ${\rm m}^{-2}$ ${\rm deg}^{-2}$.}
\caption{Average \fermi\ \gray\ background rates at the positions of track (upper panel) and cascade (lower panel) neutrinos. In each panel, the histogram shows the distribution obtained from 10,000 Monte-Carlo scrambled datasets, while the red line marks the observed background rate for unscrambled data. Background rates are expressed in units of photons per square meter per square degree per 200\,s. Observed average backgrounds are consistent with background for both datasets.}
\caption{Feature visualization with t-SNE. (a-d) We plot the class prototypes (black circles) and features on the target domain (crosses). The color of a cross represents its class. We observe that our method yields more discriminative features than other methods. (e-h) \textcolor{red}{Red}: Features of the source domain. \textcolor{blue}{Blue}: Features of the target domain. Our method's features are well-aligned between domains compared to other methods. }
\caption{\textit{fc-head} is more suitable for classification than \textit{conv-head}. This figure includes three cases. Each case has two rows. The bottom row shows ground truth, detection results using \textit{conv-head} alone, \textit{fc-head} alone, and our Double-Head-Ext (from left to right). The \textit{conv-head} misses objects in the \textcolor{red}{red} box. In contrast, these missing objects are successfully detected by \textit{fc-head} (shown in the corresponding \textcolor{green}{green} box). The top row zooms in on the red and green boxes, and shows classification scores from the two heads for each object. The objects missed by \textit{conv-head} have small classification scores compared to \textit{fc-head}.}
\caption{Additional \aastex\ symbols}
\caption{Results of coarse-to-fine generation. The best results are marked in \red{\textbf{blue}} color.}
\caption{DiverseNet with a VoxNet predictor. CP: $3^3$ conv with batch norm, ReLU and max pooling, CU: $3^3$ conv with batch norm, ReLU and nearest neighbor upsampling. Black numbers: size of voxel grid, \textcolor{red}{red} numbers: number of channels.}
\caption{Two views of diverse 3D contact map predictions. (a) \textit{Unseen} object classes: mug, pan, and wine glass, (b) \textit{Unseen shape} of training object classes: camera and hammer. Intent: use, Model: VoxNet-DiverseNet, \textcolor{red}{Red}: contact.}
\caption{Geometric error of the texture mapping process. The spot on the front button shown in \textcolor{OliveGreen}{green} was precision-heated with a warm pencil-top eraser.}
\caption{Quantitative evaluations of single image denoising on the synthetic dataset. \#1-4 are the 4 testing subsets. ``Low'' and ``High'' represent different noise levels, which respectively correspond to $\sigma_s=2.5\times10^{-3},\sigma_r=10^{-2}$ and $\sigma_s=6.4\times10^{-3},\sigma_r=2\times10^{-2}$. {\color{red}Red} and {\color{blue}blue} indicate the first and second best performance. \label{table:synthetic_result_image}}
\caption{Quantitative evaluations of video denoising on the synthetic dataset. \#1-4 are the 4 testing subsets. ``Ours-2D'' represents applying our 2D deformable kernels on each input frame separately. ``Low'' and ``High'' denote different noise levels, which respectively correspond to $\sigma_s=2.5\times10^{-3},\sigma_r=10^{-2}$ and $\sigma_s=6.4\times10^{-3},\sigma_r=2\times10^{-2}$. {\color{red}Red} and {\color{blue}blue} text indicates the first and second best performance. \label{table:synthetic_result_video}}
\caption{Characterization of the induced charge current in the magnetic WSM layer: (a) current density $\boldsymbol{J}_{e,\parallel}^W$ as a function of the spatial coordinate $z$, and the spatially integrated current $\boldsymbol{I}_{e,\parallel}^W \left(\equiv \protect\int_{-\infty}^0 \mathit{d}z\boldsymbol{J}_{e,\parallel}^W \right)$ as a function of (b) the Fermi energy $E_F$ and (c) the separation of the two Weyl nodes $k_0$, for an injected spin $\boldsymbol{s}_{inj}$ along the $x$, $y$ and $z$ axes, respectively. The Lifshitz transition energy and the Fermi wavevector are given by $E_L=m_0 k_0^2$ and $k_F=\protect\sqrt{2m_e(E_F+\protect\mu_0)/\hbar^2}$, respectively. Other parameters used in the numerical calculation: $\protect\mu_0=5.0$ eV, $m_e=9 \times 10^{-31}$ kg, $m_0=-m_1=20$ eV$\cdot \mathring{A}^2$ and $v=2$ eV$\cdot \mathring{A}$. } \label{fig2:J_W} \end{figure*}

Thus far, we have solved the problem of a single electron scattering at the interface of the magnetic-WSM/NM bilayer. In order to calculate the current density induced in the magnetic WSM layer due to the spin current injection from the NM layer, we need to find the electron distributions in each layer.
At the NM side of the interface (i.e., $z=0^{+}$), the distribution function can be described by a $2\times 2$ matrix in spin space~\cite{mStiles02PRB_STT,jwZhang04PRL,slzhang15PRB_AMR,slZhang16PRL}, i.e.,
\begin{equation}
\hat{f}_{N}=f_{0,N}\left( \mathbf{k}\right) \hat{I}+\hat{g}_{N}\left( \mathbf{k}\right)\,,
\end{equation}
where $f_{0,N}\left(\mathbf{k}\right)\hat{I}$ is the equilibrium part of the distribution function with $f_{0,N}$ the Fermi-Dirac function and $\hat{I}$ the $2\times 2$ identity matrix, and the nonequilibrium component of the distribution function $\hat{g}_{N}\left(\mathbf{k}\right)$ that gives rise to the spin current can be described by
\begin{equation}
\hat{g}_{N}\left( \mathbf{k}\right) =-e\tau v_{z}{\hat{\mathcal{E}}}_{z}\frac{\partial f_{0,N}}{\partial E_{\mathbf{k}}}\,,
\end{equation}
where ${\hat{\mathcal{E}}}_{z}=E_{z}\boldsymbol{\sigma \cdot s}_{inj}$ (with $\boldsymbol{\sigma }$ denoting the Pauli spin matrices) is a spin-dependent electric field that points in opposite directions for electrons with opposite spin directions and thereby drives a spin current~\footnote{The spin current can be realized experimentally in a few different ways, for example through the SHE~\cite{aHoffman13IEEE_SHE} or through spin pumping~\cite{Tserkovnyak05RMP}; the manner through which the spin current is generated is not important for our purposes so we write the spin current in terms of an effective spin-dependent electric field.}, and $\boldsymbol{s}_{inj}$ is a unit vector denoting the direction of the spin component of the spin current. At temperatures well below the Fermi temperature of the NM, it is a good approximation to assume $\frac{\partial f_{0,N}}{\partial E_{\mathbf{k}}}\simeq -\delta \left(E_{\mathbf{k}}-\mu _{0}-E_{F}\right)$ where $E_{F}$ is the Fermi energy relative to the energy of the two Weyl nodes as shown schematically in Fig.~\ref{fig1:schematics}(b).
Formally, the spin current density is given by $Q_{z}^{\alpha }=\frac{\hbar }{4}Tr_{\sigma }\int \frac{d^{3}\mathbf{k}}{\left(2\pi \right)^{3}}\sigma ^{\alpha }v_{z}\hat{f}_{N}$. Explicitly, $Q_{z}^{\alpha }=J_{s,z}^{N}s_{inj}^{\alpha }$ where the magnitude of the spin current density for a given spin direction can be characterized by $J_{s,z}^{N}\equiv \frac{\hbar }{2e}\sigma _{D}E_{z}$ with $\sigma _{D}=\frac{\tau e^{2}k_{F}^{3}}{3\pi ^{2}m_{e}}$ the Drude conductivity.

The nonequilibrium distribution function for the transmitted electrons in the magnetic WSM is determined by the transmission amplitudes and the nonequilibrium electron distribution $\hat{g}_{N}\left(\mathbf{k}\right)$ at $z=0^{+}$~\cite{camley89PRL,mStiles02PRB_STT} via
\begin{equation}
\hat{g}_{W}^{<}\left( \mathbf{k},z\right) =\hat{T}^{\dag }\left( \mathbf{k},z\right) \hat{g}_{N}^{<}\left( \mathbf{k}\right) \hat{T}\left( \mathbf{k},z\right)\,,
\end{equation}
where $\hat{T}\left(\mathbf{k},z\right)$ is a $2\times 2$ transmission matrix satisfying $\varphi_W(\mathbf{k},z)=\hat{T}(\mathbf{k},z)\varphi_{N,i}(\mathbf{k},0^{+})$ (the formula for $\hat{T}\left(\mathbf{k},z\right)$ is not very informative and thus we will present it together with the derivation of the scattering amplitudes in the Supplemental Materials), and the superscript ``$<$'' denotes electrons moving in the negative $z$-direction (i.e., $v_z<0$). Note that, to the leading order, electrons in the WSM moving towards the interface (i.e., with $v_z>0$, denoted by the superscript ``$>$'') are assumed to come entirely from the equilibrium distribution, i.e., $\hat{f}_{W}^{>}\simeq \hat{f}_{0,W}^{>}$ and $\hat{g}_{W}^{>}\simeq 0$.
Having obtained the nonequilibrium distribution $\hat{g}_{W}\left( \mathbf{k},z\right) $, the in-plane charge current density induced in the magnetic WSM layer can be computed via
\begin{equation}
\boldsymbol{J}_{e,\parallel }^{W}(z)=-\frac{e}{2}\int \frac{d^{3}\mathbf{k}}{\left(2\pi \right)^{3}}Tr_{\sigma }\left(\hat{g}_{W}\boldsymbol{\hat{v}}_{W,\parallel }+\mathit{h.c.}\right)\,,\label{Eq: J_W}
\end{equation}
where $\mathit{h.c.}$ denotes Hermitian conjugate, $\hat{g}_{W}=\hat{g}_{W}^{>}+\hat{g}_{W}^{<}$ and the trace operation is carried out in the spin space. Note that in deriving Eq.~(\ref{Eq: J_W}) we have taken into account the fact that the equilibrium distribution of electrons in the WSM, i.e., $\hat{f}_{0,W}$, does not contribute to the current.

Before seeking the numerical solutions of the induced charge current in the magnetic WSM layer, a remarkable property of the spin-to-charge conversion can be illuminated by a simple symmetry analysis of Eq.~(\ref{Eq: J_W}): regardless of the orientation of the injected spins, no current will be induced in the direction parallel to the line connecting the pair of Weyl nodes in momentum space, i.e., $J_{e,x}^W=0$. This is simply because the $x$-component of the electron velocity operator (i.e., $\hat{v}_{W,x}=\frac{\partial \mathcal{H}_{W}}{\partial k_{x}}$) is an odd function of $k_{x}$ whereas the nonequilibrium distribution function $\hat{g}_{W}$ is an even function of $k_{x}$. Therefore, the corresponding $x$-component of the current density must vanish everywhere in the magnetic WSM layer as it is the integral of the product of these two over $\mathbf{k}$-space. Such an anisotropic spin-to-charge conversion stems from the inherent property of magnetic WSMs -- the anisotropy in the band structure in the first Brillouin zone between the directions perpendicular and parallel to the separation between the two Weyl nodes in $\mathbf{k}$-space.
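
To make the parity argument explicit (a brief sketch; the shorthand $F_x$ is introduced here for illustration only and is not part of the formalism above), write the integrand of Eq.~(\ref{Eq: J_W}) for the $x$-component as $F_x(k_x,k_y,k_z;z)\equiv Tr_{\sigma }\left(\hat{g}_{W}\hat{v}_{W,x}+\mathit{h.c.}\right)$. Since $\hat{g}_{W}$ is even and $\hat{v}_{W,x}$ is odd in $k_x$, $F_x$ is odd in $k_x$ and, assuming the integration domain is symmetric under $k_x\rightarrow -k_x$,
\[
\int_{-\infty }^{\infty }dk_x\,F_x(k_x,k_y,k_z;z)=\int_{0}^{\infty }dk_x\left[F_x(k_x,k_y,k_z;z)+F_x(-k_x,k_y,k_z;z)\right]=0\,,
\]
so that $J_{e,x}^{W}(z)=0$ for every $z$.
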
The numerical solution of $J_{e,x}^{W}$ indeed confirms that it vanishes everywhere in the magnetic WSM layer, regardless of the direction of the injected spin $\boldsymbol{s}_{inj}$, the position of the Fermi level $E_{F}$ as well as the separation between the Weyl nodes [see the inset of Fig.~\ref{fig2:J_W}(a)].

In contrast to the robust suppression of $J_{e,x}^{W}$, the behavior of the current induced in the $y$-direction (perpendicular to the separation between the two Weyl nodes and parallel to the WSM$\mid$NM interface) is much richer. Figure~\ref{fig2:J_W}(a) shows the spatial variation of $J_{e,y}^{W}(z)$ for the injected spin along the $x$, $y$ and $z$ directions, respectively. We find that while the magnitude of $J_{e,y}^{W}$ depends on the orientation of the spin injection, it generally decays rapidly over one Fermi wavelength $\lambda _{F}\left( =\frac{2\pi }{k_{F}}\right) $ away from the WSM$\mid$NM interface, indicating a dominant contribution of the evanescent surface states to the spin-to-charge conversion in the magnetic WSM. In Fig.~\ref{fig2:J_W}(b), we show the total induced current $I_{e,y}^{W}\left( \equiv \int_{-\infty }^{0}\mathit{d}zJ_{e,y}^{W}\right) $ as a function of the Fermi level $E_{F}$. We note the existence of a Lifshitz transition energy level $E_{L}$ at which two separate Fermi surfaces, enclosing the two Weyl nodes, merge into a single Fermi surface [as shown schematically in Fig.~\ref{fig1:schematics}(b)]. We find that the total current $I_{e,y}^{W}$ is insensitive to the variation of the Fermi level as long as $E_{F}$ is below $E_{L}$, and the onset of noticeable changes of $I_{e,y}^{W}$ occurs at $E_{L}$ due to a significant change of the density of states at the Fermi level when it crosses the Lifshitz energy. In Fig.~\ref{fig2:J_W}(c), we show the dependence of the total induced current $I_{e,y}^{W}$ on the separation between the two Weyl nodes $2k_{0}$.
Extraordinary variations of $I_{e,y}^{W}$ take place when $k_{0}$ approaches the Fermi wavevector $k_{F}$, as the projected Fermi contour of the NM in the $x$-$y$ plane switches between one that encloses the two Weyl nodes and one that does not, which drastically alters the scattering phase space.

Lastly, we provide an order-of-magnitude estimation of the effect in EuCd$_2$As$_2$ -- a magnetic Weyl semimetal predicted recently~\cite{2019arXiv_lWang_magnWSM} that contains a single pair of Weyl nodes. By choosing the following parameters for EuCd$_2$As$_2$~\cite{2019arXiv_lWang_magnWSM}: $m_0=1.6$ eV$\cdot \mathring{A}^2$, $m_1=54.5$ eV$\cdot \mathring{A}^2$, $v=2.7$ eV$\cdot \mathring{A}$, $k_0=0.008$ $\mathring{A}^{-1}$, and $E_F=0.01$ eV, we obtain a spin-to-charge conversion efficiency of $\vartheta \simeq 0.2\%$ for spin injected along the $y$-direction where $\vartheta \equiv J_{e,y}^W(0^{-})/(\frac{2e}{\hbar}J_{s,z}^N)$, which is about an order of magnitude smaller than the spin Hall angle in Pt~\cite{aHoffman13IEEE_SHE,Sinova15}.

As a final point, it is interesting to compare the spin-to-charge conversion in magnetic WSMs with that in other systems (such as heavy metals, Rashba 2DEG, topological insulator surfaces, etc.) due to the ISHE or IEE. When a spin current is injected into heavy metals (such as Pt or Ta) from a NM, a charge current will be generated in the direction perpendicular to both the spin direction and the flow direction of the injected spin current due to the ISHE; formally, the process can be described by $J_{e,i}=\epsilon _{ijk}\vartheta _{0}Q_{j}^{k} $ where $\epsilon _{ijk}$ is the antisymmetric Levi-Civita tensor ($i,j,k=x$, $y$, or $z$), $Q_{j}^{k}$ represents the injected spin current flowing along the $j$ direction with spin pointing in the $k$ direction, and $\vartheta _{0}$ is a dimensionless material parameter known as the spin Hall angle which measures the efficiency of the spin-charge conversion.
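
As a concrete illustration of this relation (a short worked example added for clarity, using only the formula above), consider a spin current flowing along $z$, so that only $Q_{z}^{k}$ is nonzero. With $\epsilon _{xzy}=-1$ and $\epsilon _{yzx}=+1$, the components read
\[
J_{e,x}=-\vartheta _{0}Q_{z}^{y}\,,\qquad J_{e,y}=\vartheta _{0}Q_{z}^{x}\,,\qquad J_{e,z}=0\,,
\]
i.e., spins polarized along $y$ drive a charge current along $x$ and spins polarized along $x$ drive one along $y$, with the same coefficient $\vartheta _{0}$ in both cases.
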
A transverse charge current can also be generated, based on the inverse Edelstein effect, by injecting a spin current perpendicularly to the surface of a topological insulator or to an interface with strong Rashba spin-orbit coupling and using the spin-charge locking in these systems that fixes the spins of the carriers perpendicularly to their momenta. Note that the IEE has the same symmetry as the ISHE and hence can be described by the same formula that we used for the ISHE -- the only difference is that it is a conversion of a 3D spin current to a 2D charge current and hence $\vartheta _{0}$ in the linear response relation has the dimension of length. For both IEE and ISHE, a charge current may in principle be induced in any arbitrary direction with properly chosen spin injection direction, i.e., $\vartheta _{0}$ is \textit{isotropic}~\cite{slZhang14EPL}.

The spin-to-charge conversion in magnetic WSMs, however, is rather \textit{anisotropic}, emanating from the anisotropy in their unique band structures -- the appearance of a pair of Weyl nodes in $\mathbf{k}$-space; as we have shown above, no charge current can be induced in the direction along the line connecting the two Weyl nodes (i.e., $\mathbf{\hat{k}}_{0}$), regardless of the orientation of the injected spins. Note that for a magnetic WSM with a single pair of Weyl nodes, the magnetization is in the same direction as $\hat{\mathbf{k}}_0$~\cite{2019arXiv_lWang_magnWSM}. In general, there will be an odd number of pairs in a magnetic Weyl semimetal, in which case the total current density, being the sum of contributions from different pairs (if the pairs of Weyl nodes are well separated in the reciprocal space), vanishes along the magnetization direction, i.e.,
\begin{equation}
\mathbf{m} \cdot \boldsymbol{J}_{e,\parallel }^{W}=0 \,,
\end{equation}
where $\mathbf{m}$ is a unit vector denoting the magnetization direction of the magnetic WSM.
A charge current, however, can be induced in the direction perpendicular to the magnetization direction, and the induced current is rather sensitive to the direction of the injected spin $\mathbf{s}_{inj}$, which is experimentally controllable. In addition, we have shown that the spin-to-charge conversion in magnetic WSMs relies on the separation between two Weyl nodes and the position of the Fermi surface relative to them, which provides additional means to manipulate and control the effect. These remarkable features make the spin-to-charge conversion in magnetic WSMs distinctly different from that previously studied in heterostructures involving heavy metals or topological insulators, and are potentially very useful in spintronic applications.

Work by S. Z., A. B., and O. H. was supported by the Center for Advancement of Topological Semimetals, an Energy Frontier Research Center funded by the U.S. Department of Energy Office of Science, Office of Basic Energy Sciences, through the Ames Laboratory under its Contract No. DE-AC02-07CH11358; work by I. M. was supported by the U.S. DOE, Office of Science, Basic Energy Science Division of Materials Sciences and Engineering.

\appendix \section{Appendix:~Derivation of the transmission matrix in spin space} Let us first consider the scattering problem for free electrons in a bilayer consisting of a normal metal (NM) layer and a magnetic Weyl semimetal (WSM) layer, and with the electrons incident on the interface from the NM. For the Weyl fermions in the magnetic WSM layer, we use the following low-energy effective Hamiltonian [same as Eq. 
(1) in the main text]
\begin{equation}
\mathcal{H}_{W}=\left[m_{1}\left(k_{0}^{2}-k_{x}^{2}\right)+m_{0}\left(k_{y}^{2}+k_{z}^{2}\right)\right]\sigma _{x}+v\left(k_{y}\sigma _{y}+k_{z}\sigma _{z}\right)\,,\tag{S1}
\end{equation}
where $\sigma _{i}$ ($i=x,y,z$) are the Pauli spin matrices, and for the NM layer we adopt the following simple free electron model Hamiltonian
\begin{equation}
\mathcal{H}_{N}=\frac{\hbar ^{2}k^{2}}{2m_{e}}-\mu _{0} \,,\tag{S2}
\end{equation}
where $\mu _{0}$ is a constant shift of the chemical potential. By choosing the $z$-axis as the spin quantization axis, the wave function of an electron in the NM layer incident on the interface and with its spin pointing in an arbitrary direction $\boldsymbol{n}=\left( \sin \theta \cos \phi ,\sin \theta \sin \phi ,\cos \theta \right) $ can be written as
\begin{equation}
\varphi _{N,i}(\mathbf{r})=\left[\cos \frac{\theta }{2}e^{-i\phi /2}\binom{1}{0}+\sin \frac{\theta }{2}e^{i\phi /2}\binom{0}{1}\right]e^{-ik_{z}z}e^{i\boldsymbol{k}_{\parallel }\cdot \mathbf{r}}\,,\tag{S3} \label{Eq: phi_Ni}
\end{equation}
and the corresponding reflected wave can be expressed as
\begin{equation}
\varphi _{N,r}(\mathbf{r})=\left[R_{\uparrow }\binom{1}{0}+R_{\downarrow }\binom{0}{1}\right]e^{ik_{z}z}e^{i\boldsymbol{k}_{\parallel }\cdot \mathbf{r}}\,,\tag{S4}
\end{equation}
where $k_{z}\equiv \sqrt{\frac{2m_{e}E}{\hbar ^{2}}-\boldsymbol{k}_{\parallel }^{2}}$ with $\boldsymbol{k}_{\parallel }\left[ \equiv \left( k_{x},k_{y}\right) \right] $ the in-plane component of the wavevector, $R_{\uparrow(\downarrow) }$ are the reflection amplitudes, and we have assumed translational invariance in the $x$-$y$ plane.
It follows that the full scattering wave function in the NM is a superposition of the incident and the reflected waves, i.e.,
\begin{equation}
\varphi _{N}(\mathbf{r})=\varphi _{N,i}(\mathbf{r})+\varphi _{N,r}(\mathbf{r})\,.\tag{S5}
\end{equation}
The wave function for a transmitted electron in the magnetic WSM can be expressed as
\begin{equation}
\varphi _{W}\left(\mathbf{r}\right)=\left(T_{+}\chi _{+}e^{ik_{z,+}z}+T_{-}\chi _{-}e^{ik_{z,-}z}\right)e^{i\boldsymbol{k}_{\parallel }\cdot \mathbf{r}} \,,\tag{S6} \label{Eq:phi_W}
\end{equation}
where $T_{\pm }$ are the transmission amplitudes, $\chi _{+}$ and $\chi _{-}$ are two spinors given by $\chi _{\pm }=\frac{1}{\sqrt{N_{\pm }}}\binom{a_{\pm }}{b_{\pm }}$ with $a_{\pm }=m_{1}\left( k_{x}^{2}-k_{0}^{2}\right) +m_{0}\left( k_{y}^{2}+k_{z,\pm }^{2}\right) -ivk_{y}$, $b_{\pm }=E-vk_{z,\pm }$ and $N_{\pm }$ the normalization coefficients satisfying $\left\vert a_{\pm }\right\vert ^{2}+\left\vert b_{\pm }\right\vert ^{2}=N_{\pm }$. The $z$-components of the wavevectors are given by
\begin{widetext}
\begin{equation}
k_{z,\pm }^{2}=-\frac{1}{m_{0}^{2}}\left[m_{0}m_{1}\left(k_{0}^{2}-k_{x}^{2}\right)+m_{0}^{2}k_{y}^{2}+\frac{v^{2}}{2}\mp \sqrt{m_{0}m_{1}\left(k_{x}^{2}-k_{0}^{2}\right)v^{2}+m_{0}^{2}E^{2}+\frac{v^{4}}{4}}\right]\,\tag{S7}
\end{equation}
\end{widetext}
with the signs of $k_{z,\pm }$ so selected that the transmitted waves either propagate freely or decay in the WSM ($z<0$). The reflection and transmission amplitudes $R_{\uparrow \left( \downarrow \right) }$ and $T_{\pm }$ can be determined by specifying the boundary conditions.
Here, we assume that both the scattering wave function and the $z$-component of the current density are continuous at the interface $z=0$:
\begin{equation}
\varphi _{N}(0^{+})=\varphi _{W}\left(0^{-}\right)\text{ and }\hat{v}_{N,z}\varphi _{N}(0^{+})=\hat{v}_{W,z}\varphi _{W}(0^{-}) \,,\tag{S8} \label{Eq: bc's}
\end{equation}
where the velocity operators are given by $\boldsymbol{\hat{v}}_{N}=\frac{\partial \mathcal{H}_{N}}{\hbar\partial \boldsymbol{k}}$ and $\boldsymbol{\hat{v}}_{W}=\frac{\partial \mathcal{H}_{W}}{\hbar\partial \boldsymbol{k}}$ for the NM and WSM, respectively, and where we have eliminated the common factor $e^{i\boldsymbol{k}_{\parallel}\cdot \mathbf{r}}$ on both sides of each equation. By placing Eqs.~(\ref{Eq: phi_Ni}) - (\ref{Eq:phi_W}) in Eq.~(\ref{Eq: bc's}), one can derive the following scattering amplitudes
\begin{subequations}
\label{Eq:R_up-dn}
\begin{equation}
R_{\uparrow }=a_{+}T_{+}+a_{-}T_{-}-\cos \frac{\theta }{2}e^{-i\phi /2} \tag{S9a} \label{Eq:R_up}
\end{equation}
\begin{equation}
R_{\downarrow }=b_{+}T_{+}+b_{-}T_{-}-\sin \frac{\theta }{2}e^{i\phi /2}\,, \tag{S9b} \label{Eq:R_dn}
\end{equation}
\end{subequations}
where
\begin{subequations}
\label{T_pm}
\begin{equation}
T_{+}=\frac{2\left(B_{-}\cos \frac{\theta }{2}e^{-i\phi /2}-A_{-}\sin \frac{\theta }{2}e^{i\phi /2}\right)}{A_{+}B_{-}-A_{-}B_{+}} \tag{S10a} \label{Eq:T+}
\end{equation}
\begin{equation}
T_{-}=\frac{2\left(A_{+}\sin \frac{\theta }{2}e^{i\phi /2}-B_{+}\cos \frac{\theta }{2}e^{-i\phi /2}\right)}{A_{+}B_{-}-A_{-}B_{+}} \tag{S10b} \label{Eq:T_}
\end{equation}
\end{subequations}
with
\begin{subequations}
\begin{equation}
A_{s}=\left(1-\frac{m_{e}va}{\hbar ^{2}k_{N,z}}\right)a_{s}-\frac{2m_{0}a^{2}m_{e}k_{z,s}}{\hbar ^{2}k_{N,z}}b_{s} \tag{S11a}
\end{equation}
\begin{equation}
B_{s}=\left(1+\frac{m_{e}va}{\hbar ^{2}k_{N,z}}\right)b_{s}-\frac{2m_{0}a^{2}m_{e}k_{z,s}}{\hbar ^{2}k_{N,z}}a_{s} \tag{S11b}
\end{equation}
\end{subequations}
and $s$ denoting 
$+$ or $-$. We note that $R_{\uparrow \left( \downarrow \right) }$ and $T_{\pm }$ contain $k_{x}$ and $k_{y}$ only through the quadratic terms $k_{x}^{2}$ and $k_{y}^{2}$ -- a property that is useful in determining the presence of the in-plane charge current in a given direction. Next, we determine the transmission matrix by rewriting the transmitted state in the form of $\varphi _{W}\left(z\right) =\hat{T}_{NW}\varphi _{N,i}\left( 0^{+}\right) $. Inserting the expressions of $T_{\pm }$ [Eqs.~(\ref{Eq:T+}) and (\ref{Eq:T_})] in Eq.~(\ref{Eq:phi_W}), one can rewrite $\varphi _{W}\left( z\right) $ as \begin{widetext} \begin{equation} \varphi _{W}\left(z\right)=\frac{2}{A_{+}B_{-}-A_{-}B_{+}}\left(\begin{array}{cc} a_{+}B_{-}e^{ik_{z,+}z}-a_{-}B_{+}e^{ik_{z,-}z} & a_{-}A_{+}e^{ik_{z,-}z}-a_{+}A_{-}e^{ik_{z,+}z} \\ b_{+}B_{-}e^{ik_{z,+}z}-b_{-}B_{+}e^{ik_{z,-}z} & b_{-}A_{+}e^{ik_{z,-}z}-b_{+}A_{-}e^{ik_{z,+}z} \end{array} \right)\binom{\cos \frac{\theta }{2}e^{-i\phi /2}}{\sin \frac{\theta }{2}e^{i\phi /2}} \,.\tag{S12} \end{equation} \end{widetext} It follows that the transmission matrix can be expressed as \begin{equation} \hat{T}_{NW}=\frac{2}{A_{+}B_{-}-A_{-}B_{+}}\sum\limits_{s=\pm }se^{ik_{z,s}z}\binom{a_{s}}{b_{s}}\otimes \left(\begin{array}{cc} B_{-s} & -A_{-s} \end{array} \right)\,,\tag{S13} \end{equation} where $\otimes $ denotes a direct product. \bigskip \bigskip \bigskip \bibliographystyle{apsrev4-1} \bibliography{20190321_IEE-WSM} \appendix %\begin{figure}[h] %\includegraphics[trim={0.8cm 0.5cm 3cm 2.2cm},clip=true, %width=0.9\columnwidth]{20180108_rhoTH-v-lambda_1skyrm-bbl.pdf} \hspace*{\fill% %} %\caption{Topological Hall resistivity $\protect\rho _{TH}$ generated by a %single skyrmion bubble as a function of spin diffusion length $\protect% %\lambda _{sd}$ in a thin film of width $w=6r_{sk}$ for several different $p_{% %\protect\tau }$; the insets show the corresponding spatial profile of the %emergent magnetic field. 
We have defined $\protect\rho _{TH}^{\left( %0\right) }=\left( p_{\protect\tau }+p_{\protect\sigma }\right) R_{H}\protect% %\psi _{0}/S$ and used $a_{DW}=0.2r_{sk}$ and $J_{ex}=0.2\protect\epsilon % %_{F} $.} %\label{Fig:rho-1Skrm-bbl} %\end{figure} \end{document} }
\caption{Self-attention guided synthesis of visible images from polarimetric thermal input. In order to minimize the domain gap between different modalities, the input thermal/visible images are directly mapped into the visible/thermal modality. In order to obtain the image-level style, the pixel GAN loss (\textcolor{blue}{blue}) and cycle consistency loss (\textcolor{green}{green}) are introduced. The feature-level semantic information is captured by the identity and perceptual losses (\textcolor{yellow}{yellow}). A similar architecture can also be used for synthesizing thermal images from visible images.}
\caption{Early conversation trees from \subreddit{AskMen}; nodes are comments and edges indicate reply structure. The original post is the black node, and as node colors lighten from red to yellow, comment timing increases from zero minutes to sixty minutes.}
\caption{\subreddit{AskMen} ($t_s=15$)}
\caption{\subreddit{AskWomen} ($t_s=45$)}
\caption{\subreddit{Fitness} ($t_s=60$)}
\caption{\subreddit{LifeProTips} ($t_s=165$)}
\caption{\subreddit{personalfinance} (N/A)}
\caption{\subreddit{relationships} ($t_s=45$)}
\caption{\textbf{SPRNet:} ({\color{red} 3.1}) Start with GFPN with a ResNet-50 backbone, where \emph{G} denotes the gate mechanism, followed by three parallel branches for predicting class, box, and mask, respectively. ({\color{red} 3.2}) Mask branch with multi-scale fusion, positive pixel sampling, and consecutive deconvolutions for instance mask generation.}
\caption{The predicted intrinsic luminosity (top) and equivalent width (bottom) as a function of metallicity for a range of prominent emission lines in the rest-frame UV and optical. In both cases we assume constant star formation for 10 Myr. The thick grey band denotes the rough range of metallicities predicted by \bluetides\ for galaxies with $M^{*}>10^{8}\,{\rm M_{\odot}}$.}
\caption{The ratio of the modelled line luminosity compared to the true luminosity as a function of the number of particles used to sample the star formation history. The top-axis shows the corresponding stellar mass assuming the mean \bluetides\ stellar particle mass.}
\caption{Predictions for the properties of 6 prominent UV and optical lines in \bluetides. In the top panel we show both the intrinsic and dust-attenuated luminosity functions for each line at $z\in\{8,9,10,11,12,13\}$. In the next two rows we show the median attenuated equivalent width in bins of stellar mass and UV luminosity respectively. In the fourth row we show the median specific line luminosity ($L/M_{\star}$) in stellar mass bins while in the final row we show the median ratio of the line luminosity to the UV luminosity in bins of UV luminosity.}
\caption{The observed \citet{deBarros2019} and predicted distribution of combined H$\beta$ and [O{\sc iii}]$\lambda$4959,5007 equivalent widths and stellar masses at $z\sim 8$. The small red circles show the individual measurements from \citet{deBarros2019} while the large points denote the median value in 0.5 dex wide bins of stellar mass. The small and large error bars denote the error on the median and the 16-84th percentile range respectively. The dark and light solid lines show the intrinsic and attenuated predictions from \bluetides\ respectively. The histograms on the right hand side show the distribution of equivalent widths for galaxies with $M^{\star}>10^{8}\,{\rm M_{\odot}}$.}
\caption{The observed \citet{deBarros2019} and predicted distribution of the ratio of the H$\beta$ and [O{\sc iii}]$\lambda$4959,5007 line luminosities to the far-UV luminosity and far-UV luminosities at $z\sim 8$. The small red circles show the individual measurements from \citet{deBarros2019}. The dark and light solid lines show the intrinsic and attenuated predictions from \bluetides\ respectively. The histograms on the right hand side show the distribution of ratios for galaxies with $M^{\star}>10^{8}\,{\rm M_{\odot}}$.}
\caption{(Left) Streamwise evolution of the wall-shear stress on the suction side of the wing at $Re_{c}=100,000$, where {\color{blue}\solid} corresponds to the tripped case from Table \ref{wing_cases} and {\color{black}\solid} to a case without tripping. The zero-wall-shear-stress level is denoted by {\color{black}\dotdashed}. (Right) Streamwise evolution of the Reynolds number based on displacement thickness on the suction side of the wing, where colors correspond to the cases summarized in Table \ref{wing_cases}. Here {\color{black}\dotdashed} denotes the value $Re_{\delta^{*}}=450$.}
\caption{Diagnostic-plot scaling modified with the shape factor $H$, applied to several profiles over the whole suction side of the wing in the four $Re_{c}$ cases under study. The colors correspond to the cases summarized in Table \ref{wing_cases}, and {\color{black}\dashed} represents equation (\ref{fit_diag}), where $\alpha_{H}$ and $\beta_{H}$ are obtained from the correlations by \cite{diagnostic_ftac}, using the largest $Re_{\theta}$ value in each case.}
\caption{Inner-scaled tangential mean velocity profiles at (top-left) $x_{ss}/c=0.4$ and (top-right) $x_{ss}/c=0.7$, for the four wing cases under study, compared with the DNS results of ZPG TBL by \cite{schlatter_orlu10} at approximately matching $Re_{\tau}$ values. Colors from wing cases as in Table \ref{wing_cases}, and {\color{grey}\solid} denotes ZPG TBL data. \textcolor{black}{ The matched $U^{+}_{t}$ profiles for W10 and ZPG10 are denoted by $\left ( \blacksquare \right )$, for W4 and ZPG4 by $\left ( \bullet \right )$ and for W2 and ZPG2 by $\left ( \blacklozenge \right )$.} Ratio of (bottom-left) $U^{+}_{e}$ and (bottom-right) $H$, between wing and ZPG at approximately matching $Re_{\tau}$. Here (\textcolor{blue}{$\blacksquare$}), (\textcolor{red}{$\blacksquare$}) and (\textcolor{green}{$\blacksquare$}) denote ratios at $x_{ss}/c=0.4$, $0.7$ \textcolor{black}{ and $0.8$}, respectively.}
\caption{Selected inner-scaled components of the Reynolds-stress tensor at (top-left, middle-left) $x_{ss}/c=0.4$ and (top-right, middle-right) $x_{ss}/c=0.7$, for the four wing cases under study, compared with the DNS results of ZPG TBL by \cite{schlatter_orlu10} at approximately matching $Re_{\tau}$ values. Wall-normal profiles of (top panels) tangential velocity fluctuations (solid) and Reynolds-shear stress (dashed), and (middle panels) wall-normal (dashed) and spanwise (solid) velocity fluctuations are shown. Colors from wing cases as in Table \ref{wing_cases}, and {\color{grey}\solid} denotes ZPG TBL data. \textcolor{black}{ The matched $\overline{u^{2}_{t}}^{+}$ and $\overline{u_{t} v_{n}}^{+}$ profiles for W10 and ZPG10 are denoted by $\left ( \blacksquare \right )$, for W4 and ZPG4 by $\left ( \bullet \right )$ and for W2 and ZPG2 by $\left ( \blacklozenge \right )$.} Ratio of (bottom-left) $\overline{u^{2}_{t}}^{+}$ and (bottom-right) $\overline{u_{t}v_{n}}^{+}$ between wing and ZPG at $y_{n}/\delta_{99} \simeq 0.2$. Here (\textcolor{blue}{$\blacksquare$}), (\textcolor{red}{$\blacksquare$}) and (\textcolor{green}{$\blacksquare$}) denote ratios at $x_{ss}/c=0.4$, $0.7$ \textcolor{black}{ and $0.8$}, respectively.}
\caption{\textcolor{black}{ (Left) Outer-scaled tangential velocity fluctuations at $x_{ss}/c=0.7$ for the four wing cases under study, compared with the DNS results of ZPG TBL by \cite{schlatter_orlu10} at approximately matching $Re_{\tau}$ values. Colors from wing cases as in Table~\ref{wing_cases}, and {\color{grey}\solid} denotes ZPG TBL data. The matched $\overline{u^{2}_{t}}/U_{e}^{2}$ profiles for W10 and ZPG10 are denoted by $\left ( \blacksquare \right )$, for W4 and ZPG4 by $\left ( \bullet \right )$ and for W2 and ZPG2 by $\left ( \blacklozenge \right )$. (Right) Ratio of the inner-scaled TKE production between wing and ZPG at $y_{n}/\delta_{99} \simeq 0.2$.} Here (\textcolor{blue}{$\blacksquare$}), (\textcolor{red}{$\blacksquare$}) and (\textcolor{green}{$\blacksquare$}) denote ratios at $x_{ss}/c=0.4$, $0.7$ and $0.8$, respectively.}
\caption{Normal estimation on the ModelNet40 dataset. For clarity, we show only predictions whose angle to the ground-truth normal is less than 30$^{\circ}$ (in \textcolor[rgb]{0.00,0.00,1.00}{blue}) or greater than 90$^{\circ}$ (in \textcolor[rgb]{1.00,0.00,0.00}{red}).}
\caption{\textbf{Ours (\textsc{FULL})} \textcolor{FullColor}{\textbullet} vs \textbf{Supervised} \textcolor{SupervisedColor}{\textbullet}}
\caption{Comparison of the ratio of predicted single Cause/Effect for the three sets of models. \textcolor[rgb]{0.2, 0.8, 0.2}{RS-E w/o MHSA} and \textcolor[rgb]{0.0, 1.0, 0.5}{RS-E w/ MHSA} denote the ratio of predicted single Effect for the models without and with MHSA mechanism, respectively. Similarly, \textcolor[rgb]{0.0, 0.55, 0.55}{RS-C w/o MHSA} and \textcolor[rgb]{0.0, 1.0, 1.0}{RS-C w/ MHSA} denote the ratio of predicted single Cause for the models without and with MHSA mechanism, respectively. \label{fig9}}
\caption{Standard anchor based detection. Anchors count as \textcolor{chameleon3}{positive} with an overlap $IoU>0.7$ to any \textcolor{skyblue3}{object}, \textcolor{scarletred3}{negative} with an overlap $IoU<0.3$, or are \textcolor{aluminium3}{ignored} otherwise.}
\caption{Center point based detection. The \textcolor{chameleon3}{center pixel} is assigned to the \textcolor{skyblue3}{object}. Nearby points have a reduced negative loss. Object size is regressed.}
\caption{Training an $L$-layer CNN with binary weights via matrix or tensor decomposition. The rows colored in {\color{blue} blue} are the changes introduced by our method when compared against the approach proposed in~\cite{rastegari2016xnor}.}
\caption{Comparison of mean velocity profile and turbulent kinetic energy budget. Circles represent the DNS data from \cite{moser99} while the lines represent the values from the LES. All values are normalized with inner units. The individual terms are color coded as: Production (\textcolor{blue}{blue}), dissipation (\textcolor{red}{red}), viscous diffusion (\textcolor{cyan}{cyan}), turbulent diffusion (\textcolor{green}{green}), velocity-pressure correlation (\textcolor{mygray}{gray})}
\caption{Comparison of turbulent kinetic energy budget for a NACA4412 wing section at the suction side location of $x/c=0.7$. The circles represent DNS data from \cite{hosseini16} while the lines are data from the LES. The individual terms are color coded as: Production (\textcolor{blue}{blue}), dissipation (\textcolor{red}{red}), viscous diffusion (\textcolor{cyan}{cyan}), turbulent diffusion (\textcolor{green}{green}), velocity-pressure correlation (\textcolor{mygray}{gray}), convection (\textcolor{yellow}{yellow})}
\caption{ Depiction of stages in common audio feature extraction pipelines and corresponding inversion. The two obstacles to vocoding are (1) estimating linear-frequency magnitude spectra from log-frequency mel spectra (outlined in \textcolor[rgb]{0.29, 0.33, 0.13}{green} dashed line), and (2) estimating phase information from magnitude spectra (outlined in \textcolor{blue}{blue} dotted line). We focus on magnitude estimation in this paper, observing that coupling an ideal solution to this subproblem with a phase estimation heuristic can produce high-quality speech (Table~\ref{tab:gl}).}
\caption[]{ Exemplary failure case for a conventional MRF (\textcolor{orange}{-} in dotted orange) in a seismic horizon tracking problem as compared to an MRF with an additional bottleneck potential (\textcolor{green}{-} in solid green). (\textcolor{red}{+})~indicates the seed. The MRF solution makes one local error with high cost and starts tracking another smoother layer, leading to an overall lower cost solution. A bottleneck term penalizes such high cost errors and results in the correct track.}
\caption{A user click record and the top ranked products provided by each method. Words of similar meaning are in the same color (\colorbox{size}{\makebox(36,4){large size}}, \colorbox{casual}{\makebox(24,4){casual}}, \colorbox{female}{\makebox(32,4){feminine}}, \colorbox{hot}{\makebox(44,4){hot weather}})\label{casestudy}}
\caption{Average shear stresses (a) $<\tau^+_{11}>$ and (b) $<\tau^+_{13}>$, for different $x$ locations, case V97, $\Delta^+=1.0$. \protect\redline $S-SMAG$, \protect\redlinedash $D-SMAG$, \protect\blueline $SIMB$, \protect\purpleline $SIMET$, \protect\greenline $GRAD$, \protect\cyanline $S-CLARK$, \protect\cyanlinedash $D-CLARK$, \protect\greyline $WALE$, \protect\yellowline $ANN$. }
\caption{\label{tab:datasets} Datasets used in our experiments. The number of vertices $n$ and edges $m$ is recorded for each graph. The datasets annotated by \textcolor{green}{$\odot$} have been created by us, and are publicly available. The five-dimensional vector containing the number of edges for each day of Twitter corresponds to {\em follow, retweet, mention, quote, reply} respectively. For details, see Section~\ref{sec:setup}.}
\caption{ Three AMR parses %with errors for: %the sentence \textit{There is no asbestos in our products now}, %as generated %with three different automatic meaning representation parses and example errors: %GPLA %\textbf{GPLA}} %(top, \citet{DBLP:journals/corr/abs-1805-05286}), JAMR (middle, \citet{DBLP:conf/semeval/FlaniganDSC16}), CAMR (bottom, \citet{wang-xue-pradhan:2015:NAACL-HLT,wang-xue-pradhan:2015:ACL-IJCNLP,wang-EtAl:2016:SemEval}). Light errors (\textcolor{orange}{orange}) by GPLA (top), JAMR (bottom), %and CAMR (right). %\citet{wang-xue-pradhan:2015:NAACL-HLT,wang-xue-pradhan:2015:ACL-IJCNLP,wang-EtAl:2016:SemEval}). \textcolor{orange}{Light} %are contained in the first parse, and \textcolor{red}{severe} errors %(\textcolor{orange}{orange}, \textcolor{red}{red}) are %are %contained found in GPLA and JAMR %second parses; %, while %the CAMR %is incomplete: it %and fails to provide \textit{we}, the %entity which is manufacturer of the product. Bottom right: F1 for Smatch and three example subtasks from evaluation against the gold parse (given in Figure \ref{fig:ex1}). }
\caption{ Three AMR parses %with errors for: %the sentence \textit{There is no asbestos in our products now}, %as generated %with three different automatic meaning representation parses and example errors: %GPLA %\textbf{GPLA}} %(top, \citet{DBLP:journals/corr/abs-1805-05286}), JAMR (middle, \citet{DBLP:conf/semeval/FlaniganDSC16}), CAMR (bottom, \citet{wang-xue-pradhan:2015:NAACL-HLT,wang-xue-pradhan:2015:ACL-IJCNLP,wang-EtAl:2016:SemEval}). Light errors (\textcolor{orange}{orange}) by GPLA (top), JAMR (middle), %and CAMR (bottom). %\citet{wang-xue-pradhan:2015:NAACL-HLT,wang-xue-pradhan:2015:ACL-IJCNLP,wang-EtAl:2016:SemEval}). Light %are contained in the first parse, and severe errors %are %contained marked in the 1$^{st}$ %first and 2$^{nd}$ %second parse (\textcolor{orange}{orange}, \textcolor{red}{red}); %, while %the 3$^{rd}$ %third parse %is incomplete: it %and fails to provide \textit{we}, the %entity which is manufacturer of the product. }
\caption{A realization of the Poisson bipolar network for $\lambda = 1/40$, $\alpha = 4$, and $R = 2$. `\textcolor{red}{$\times$}' denotes a transmitter while a circle `\textcolor{blue}{$\circ$}' denotes the associated receiver. In $(\mathrm{a})$, the number next to each link is a value of $P_{\rm s}$ (reliability) for that link for a deterministic SIR threshold of $\theta = 1$, while in $(\mathrm{b})$, the number next to each link is a value of the SIR threshold $T$ for that link such that the reliability is exactly $\nu$.}
\caption{Five K-means clusters of goal summaries. The K-means algorithm is applied to summary representations, which are derived by taking the average of GloVe word vectors~\cite{pennington2014glove} excluding stop words (represented with \textcolor{gray}{light fonts}).\label{tbl:goal_k_means}}
\caption{Affine-linear mapping $F_e$ from the reference triangle $\hat{\Omega} = \left\{\hat{\boldsymbol{a}}_1, \hat{\boldsymbol{a}}_2, \hat{\boldsymbol{a}}_3 \right\} = \left\{[0,0]^T, [1,0]^T, [0,1]^T \right\}$ to the physical triangle $\Omega_e = \left\{\boldsymbol{a}_{e,1}, \,\boldsymbol{a}_{e,2}, \boldsymbol{a}_{e,3} \right\}$.}
\caption{Overview of the landmark-based map with landmarks depicted as red circles (\textcolor{landmark_rgb}{$\bullet$}). The route in Ulm-Lehr (Germany) is about $5\,\si{\kilo\meter}$ long and comprises 3860 map landmarks.}
\caption{The flowchart of the proposed weakly supervised adversarial domain adaptation. On the top, the asymmetric multi-task model is depicted, which consists of a detection model and a segmentation model (DS). During the training stage, a pair of images from two domains is fed to the DS model. The {\color{magenta}{magenta}} and {\color{green}{green}} curved arrows represent the input/output of the source and target domains, respectively. Further, the two-way arrow shows that the data flow is involved in the training process. As the figure shows, source images take part in the object- and pixel-level training, while target images only participate in the object-level training. On the bottom, the two domain classifiers (PDC and ODC) at the object and pixel levels are shown. The feature maps of the two streams in DS are fed to PDC and ODC, respectively. By alternately optimizing DS and the two domain classifiers in an adversarial manner, the final DS is obtained. During the testing phase, the test images are only fed to the segmentation stream in DS to predict the pixel-level score map.}
\caption{Visualization of our deformable kernel. Each row is a test case in the segmentation phase. The images from left to right are offset kernel for background activation point, offset kernel for pancreas activation point and ground truth mask, respectively. {\color{green}Green dots} are activation points, and {\color{red}red dots} represent the receptive field of that activation point.}
\caption{Additional \aastex\ symbols}
\caption{GLUE dev set results. The best result on each task produced by a single model is in \textbf{bold}. % Note that there have been two versions of the QNLI dataset. V1 is expired on January 30, 2019. The current version is v2. MT-DNN uses BERT\textsubscript{LARGE} as its initial shared layers. {\NMNAME} is the MT-DNN trained using the proposed knowledge distillation based MTL. MT-DNN-ensemble denotes the results of the ensemble models described in Section~\ref{subsec:impl}. The ensemble models on MNLI, QQP, RTE and QNLI are used as teachers in the knowledge distillation based MTL, while the other ensemble models, whose results are in {\color{blue}{\textit{blue and italic}}}, are not used as teachers. % The last 4 columns, highlighted and italicized, are the tasks without using teacher models. }
\caption{\emph{Displacement error} (in pixels) with respect to the ground truth (GT) for various values of the total variation penalty, $\lambda_{\text{TV}}$ (\texttt{t}) and the OMT penalty, $\lambda_{\text{OMT}}$ (\texttt{o}). Results for the \textcolor{blue}{inner} and the \textcolor{red}{outer} rings show subpixel registration accuracy for all \emph{local} metric optimization results (\texttt{*\_l}). Overall, local metric optimization substantially improves registrations over the results obtained via initial global multi-Gaussian regularization (\texttt{global}). \label{fig:displacement_errors_within_shape}}
\caption{Mean target overlap ratios on CUMC12 (in 3D) with $\lambda_{\text{TV}}=0.1$ and $\lambda_{\text{OMT}}=50$. Our approach (marked \textcolor{red}{red}) gives the best result overall. Local metric optimization greatly improves results over the initial global multi-Gaussian regularization. Best results are achieved for the model that was trained on this dataset (\texttt{c/c local}), but models trained on MGH10 (\texttt{m/c local}) and on IBSR18 (\texttt{i/c local}) transfer well and show almost the same level of performance. The dashed line is the median mean target overlap ratio (\ie, mean over all labels, median over all registration pairs).}
\caption{\textcolor[rgb]{1,0,0}{(a). A ground outdoor image. (b)-(c). SfM points and surface mesh from a viewpoint similar to that of (a). (d)-(f). Regions corresponding to the rectangles in (c). They are representative regions with relatively (d) rich and (e) low textures but simple structures (flat walls) and regions with relatively (f) complicated structures (bracket sets). The colored triangles in (d)-(f) are the facet examples for $a_{i,m}$ computation.}}
\caption{\textcolor[rgb]{1,0,0}{The influence of the value of $t_c$ on the number of planned laser scans on NCT and FGT.}}
\caption{(a) and (b). SfM and laser points of the region marked by the blue rectangle in the top left corner of Fig. \ref{fg:laser}. (c) and (d). SfM and laser points of the region marked by the blue rectangle in the bottom left corner of Fig. \ref{fg:laser}. \textcolor[rgb]{1,0,0}{(e) and (f). SfM and laser points of the region marked by the magenta rectangle in the top left corner of Fig. \ref{fg:laser}. (g) and (h). SfM and laser points of the region marked by the magenta rectangle in the bottom left corner of Fig. \ref{fg:laser}.}}
\caption{Image and laser scan merging accuracies (\textcolor[rgb]{1,0,0}{root-mean-square} errors) on NCT and FGT with different ratios of initial space error cost to initial reprojection error cost: $r_c=C_S(\omega)/C_R$.}
\caption{Image and laser scan merging accuracies (\textcolor[rgb]{1,0,0}{root-mean-square} errors) on NCT and FGT with different comparative methods.}
\caption{\textcolor[rgb]{1,0,0}{Constructed ground truths for indoor scene of NCT (left) and outdoor scene of GEH (right).}}
\caption{\textcolor[rgb]{1,0,0}{Reconstruction comparison of image based and our methods against laser based method in precision$|$recall$|$F-score.}}
\caption{Allocations of regulators. Plain DACNN-18 was selected as the testing model; the network is cut into 4 sections by 3 pooling layers, and regulators are appended to residual blocks in each section separately. ALL denotes all sections, {\bf bold} denotes the overall best result, and {\color{blue}{\bf blue}} denotes the best result on a single section.}
\caption{Model Efficiency. This is a plot of TOP-1 error rate on CIFAR-100 vs. number of parameters. Architectures with high model efficiency are at the bottom left of the graph. Mixed DACNNs and mixed DACNNs with regulators (circled in \textcolor{red}{red}) give the highest model efficiency among all models in our experiment.}
\caption{Comparison of test accuracy (\%) with different methods on the CIFAR-10 and CIFAR-100. The best results are highlighted in {\color{red}{red}}, and the best records of our models are in \textbf{bold}. Combined with the augmentation method of \textit{mixup}~\cite{zhang2017mixup}, SRP can challenge state-of-the-art results.}
\caption{The curve of region area ratio at different depths of network blocks, where the region is selected by SRP. The {\color{red}red line} is the value of the region area ratio in SS-SRP. The {\color{blue}blue line} is the mean of the region area ratio in MS-SRP, and the value of the area ratio in MS-SRP lies within the {\color{blue}blue shadow} with a probability of 95\%. Best viewed in color.}
\caption{Quantitative comparison of proposed network with several state-of-the-art methods. \red{Red} color indicates the best performance and \blue{blue} color indicates the second best result. }
\caption{\label{fig1}{\bf (a)} The topological Lieb lattice is composed of three sublattices, labelled {\color{red} A} (red), {\color{blue} B} (blue), and {\color{orange} C} (orange). {\bf(b)} The three bands of the tight-binding Hamiltonian are plotted within the first Brillouin zone. {\bf (c)} Phase diagram of soliton stability, as a function of energy and power. {\bf (d-g)} Soliton wavefunction magnitude profiles for points indicated in (c). Note that the magnitude of the A and C sites is the same though they are out of phase by $\pi/2$. {\bf(h)} Wavefunction magnitude profile at the A sites, {\bf(i)} at the B sites; and {\bf(j)} the phase at the B sites (for the soliton in (f)). Soliton parameters for (h-j) are: $s=1.0$, $t_1 = 1.0$, $t_2=0.4$, $P = 3.31$, $E=-2.4044$. }
\caption{\colorbox{color0}{Train}, \colorbox{color1}{Test}, \colorbox{color2}{Receptive field}, \colorbox{color3}{Test pixel in receptive field}, \textbullet: center pixel.\\The 2D receptive field of a CNN can involuntarily include samples from the test set, making the network overfit and biasing the evaluation.}
\caption{{\textcolor{blue}{ Representation of the proposed procedure.}}}
\caption{{\textcolor{blue}{ Representation of the proposed Bayesian approach.}}}
\caption{\label{tab:table4} Comparison of {\color{blue} Buckingham} polarizabilities of the ground state of the \emph{free} ECSC potential, at selected $\delta$, with reference values of \cite{lai13}. PR implies Present Result. See text for details.}
\caption{Number of combinatorial quadrangulated boundaries that can be shelled with up to $H_{\max}$ hexahedra. Timings are given for a single thread on an Intel\textregistered{} Core\texttrademark{} i7-7700HQ CPU.}
\caption{\label{fig:2} Workflow diagram of the numerical models defined in FLUKA and ANSYS\textregistered\ for studying the jaw thermo-mechanical response in case of a proton beam impact. The principal variable inputs are the beam impact parameters (in green). The ANSYS\textregistered\ Transient Structural model is based on two main assumptions (in purple) that must be confirmed by the experimental results (in red).}
\caption{\label{fig:3} Full scale 3D geometry of the jaw defined in the ANSYS\textregistered\ transient structural model. Two frictional contacts model the connection between the jaw guiding plates and the cylindrical shafts with a friction coefficient of \textmu\ = 0.1. Temperature dependent linear elastic material properties are implemented. Finally, the simulation set up can describe three phases with distinct structural modes, delimited by dotted lines in a characteristic peak temperature evolution in the backstiffener after the ``deepest standard LIU proton beam impact''.}
\caption{\label{fig:4} Preliminary 2D-plane-strain model in ANSYS\textregistered\ transient structural of a jaw cross section with the higher backstiffener temperature profile. In case of the deepest standard LIU proton beam impact, the peak temperature in the backstiffener is 108\degree C. The right plot shows how the short duration thermal load creates large compressive-to-tensile x-normal stresses.}
\caption{\label{fig:5} Preliminary ANSYS\textregistered\ modal analysis of the jaw in the thermal state reached at the end of the proton beam pulse. The fundamental has a frequency of 54 Hz and governs the response in the second phase.}
\caption{\label{fig:6} ANSYS\textregistered\ static structural analysis of the jaw response reached at the beginning of the third phase (1 s) after the deepest standard LIU proton beam impact. A deformed geometry results from the thermal strains with the jaw downstream and upstream sides extending away from the beam trajectory.}
\caption{\label{fig:12} The LVDT\_L signal acquired during the high-intensity beam impact, deep (L), is plotted in solid blue, whereas the spread acquisitions during cooling are marked by a blue cross. The y-displacement calculated by the numerical model at the probing spot is plotted in solid red. Some inconsistencies are visible in the first instant of the third response phase.}
\caption{\label{fig:17} Picture of the upstream Novoltex\textregistered\ Sepcarb\textregistered\ 054-62 absorber before (May 2018) and after (January 2019) the HRMT 44 high-intensity proton beam impacts. No defect is visible.}
\caption{Top-#1 performance (geometric mean across our five target projects). The \tikzemph{fill=best}{shaded row and column} represent the best sampler and learner, respectively.\label{Ta:sampler-by-learner-table-#2}}
\caption{Our DADA architecture learns to extract \textit{domain-invariant} features of visual categories. In addition to \textit{domain disentanglement} ({\color{blue}blue lines}), we employ \textit{class disentanglement} ({\color{red}red lines}) to remove \textit{class-irrelevant} features, both trained adversarially. We further apply a mutual information minimizer to strengthen the disentanglement. }
\caption{Left panel: Total intensity map for a noise-free version of our synthetic \hi\data cube, generated by integrating\hi\emission over the full spectral range of the cube. The map spans 1.4~\deg$~\times$~1.4~\deg\and contains\hi\line emission from 52~662 galaxies in the redshift range 0.7~--~0.758. The spatial resolution is 15~arcsec. Right panel: zoom in of the spatial region delimited by the white square in the left panel. Emission from only 30 consecutive channels is shown, corresponding to a velocity range of$\sim~300$~\kms. \hi\flux densities in units of Jy~beam$^{-1}$ to which the colours in each panel correspond are shown in the colour bars.}
\caption{Various spectra associated with each of two different target galaxies. All spectra are extracted using a spatial aperture size of 30~arcsec. The top two panels show the spectra of the first galaxy, assuming redshift offsets of 0 and 150~\kms. The bottom two panels show the spectra of the second galaxy, assuming redshift offsets of 0 and $-150$~\kms. In each panel, the green spectrum represents emission from the target galaxy, whereas the grey spectrum is the combined emission from the target and other neighbour galaxies (noise is excluded from this example). The black-dashed vertical lines delimit a spectral range of $\pm 300$~\kms\ about the centre of the spectrum. The solid and red-dotted vertical lines represent the TF spectral ranges of the galaxy based on 1) an assumed inclination of 90 degrees, and 2) the evaluated (true) inclination, respectively. The actual (evaluated) galaxy inclination is given in the top left of each panel. A combination of a small spectral extraction range and a large redshift offset can yield an extracted spectrum that completely ``misses'' the emission from a target galaxy of interest, thereby yielding a high level of contamination in the extracted spectrum.}
\caption{Collective MPI\_Allreduce}
\caption{\color{Gray} \textbf{The BowTie neural network.} The estimated probability $P(\hat{\textbf{y}}^{(i)}|\textbf{x}^{(i)}, \textbf{w}^{(i)})$ may be fed into a post-processing discriminator component to assign a category (pos/neg) for the input $\textbf{x}^{(i)}$ with respect to a discriminator value $\delta\in[0, 1]$. All experiments presented in this paper use $\delta=0.5$. }
\caption{Transmission function ${\cal T} \left( \varepsilon \right)$ defined in Eq. (\ref{eq:tau}) within a range of lengths $l$ for an island with homogeneous magnetic moment. Energies are expressed in units of $\varepsilon_{\perp}=J m_{\perp}$, lengths are expressed in units of $L_0=\hbar v_F/\varepsilon_{\perp}$. } \label{fig:fig2} \end{figure}

In order to calculate the transmission function we proceed as in Ref. \cite{marun}, starting from the evolution operator in space for the whole scattering region. It reads $\hat{\cal U}(x_N,x_0)=\prod_{j=1}^N \hat{\cal U}(x_j,x_{j-1})$, with
\begin{eqnarray} \label{eq:evol}
\hat{\cal U}(x_j,x_{j-1})&=& \exp \left\lbrace i \frac{ \varepsilon_{j ||} }{\hbar v_F} L_j \right\rbrace \exp\left\lbrace - i \boldsymbol{\lambda}_j \cdot \hat{\boldsymbol{\sigma}} \right\rbrace \\
& = & \exp \left\lbrace i \frac{ \varepsilon_{j ||} }{\hbar v_F} L_j \right\rbrace \left[ \hat{\sigma}_0 \cos{\lambda_j} - i {\bf n}_j \cdot \hat{\boldsymbol{\sigma}}\sin{\lambda_j} \right], \nonumber
\end{eqnarray}
where $\boldsymbol{\lambda}_j= \left(i\;\varepsilon_{j \perp} \sin \phi_j, -i \;\varepsilon_{j \perp} \cos \phi_j, \varepsilon \right)L_j /(\hbar v_F) $, with $\varepsilon_{||,\perp}= J m_{||,\perp}$, and $ {\bf n}_j= \boldsymbol{\lambda}_j/\lambda_j$. The transmission amplitude is the inverse of the $(2,2)$ element of the transfer matrix, which is, in turn, the inverse of the matrix $\hat{\cal U}(L,0)$. Hence, ${\cal T}(\varepsilon)=| \mbox{Det}[\hat{\cal U}(x_N,x_0)]/ \hat{\cal U}(x_N,x_0)_{1,1}|^2$.

{\em Single homogeneous island}. We start by discussing the case of a homogeneous domain of length $L$, described by the previous Hamiltonian with a single piece, $N=1$.
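The intermediate algebra for $N=1$ can be sketched as follows; the shorthand $\theta=\varepsilon_{||}L/(\hbar v_F)$ and the explicit component $n_z$ are introduced here only for illustration. Since only $\hat{\sigma}_z$ has diagonal elements and ${\bf n}\cdot{\bf n}=1$, Eq. (\ref{eq:evol}) gives
\begin{align*}
\hat{\cal U}(L,0)_{1,1} &= e^{i\theta}\left(\cos\lambda - i\, n_z \sin\lambda\right),\\
\mbox{Det}[\hat{\cal U}(L,0)] &= e^{2i\theta}\left(\cos^2\lambda+\sin^2\lambda\right) = e^{2i\theta},
\end{align*}
with $n_z=\varepsilon L/(\hbar v_F \lambda)$ and $\lambda=\sqrt{\varepsilon^2-\varepsilon_{\perp}^2}\,L/(\hbar v_F)$. For $|\varepsilon|>\varepsilon_{\perp}$, where $\lambda$ and $n_z$ are real, this leads to
\begin{align*}
\left[{\cal T}(\varepsilon)\right]^{-1} &= \left|\cos\lambda - i\, n_z \sin\lambda\right|^2 \\
&= \cos^2\lambda + \frac{\varepsilon^2}{\varepsilon^2-\varepsilon_{\perp}^2}\,\sin^2\lambda .
\end{align*}
Inside the gap $\lambda$ becomes imaginary and the trigonometric functions continue to hyperbolic ones.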
The resulting transmission function is
\begin{equation} \label{eq:tau}
{\cal T}(\varepsilon)= \frac{|\varepsilon_{\perp}^2- \varepsilon^2|}{|\varepsilon_{\perp}^2- \varepsilon^2|\cos^2\lambda + \varepsilon^2 \sin^2 \lambda},
\end{equation}
where $\lambda= r l$, with $l = L/ L_0$, $L_0 = \hbar v_F/\varepsilon_{\perp} $ and $r=\sqrt{(\varepsilon/\varepsilon_{\perp})^2-1}$. Notice that the transmission function does not depend on the detailed orientation of the magnetic moment but only on the projection $m_{\perp}$ perpendicular to the direction of the spin-orbit interaction of the material. The latter introduces an effective coupling between the two Kramers partners, which may open a gap in the spectrum of magnitude $\varepsilon_{\perp}$. The transmission function is also symmetric with respect to $\varepsilon = 0$. The behavior of ${\cal T}(\varepsilon)$ is illustrated in Fig.~\ref{fig:fig2}, where we see its dependence on the length of the island. For short islands, there is a sizable tunneling amplitude through the magnetic island, while as the length of the magnet increases, the transmission function tends to a step function close to $\varepsilon \sim \varepsilon_{\perp}$. At the opening of the gap, the transmission function depends on the length as ${\cal T}(\varepsilon_{\perp}) = \left[1+l^2\right]^{-1}$, with $l=L/L_0$, while the slope behaves as $d {\cal T}/d\varepsilon|_{\varepsilon_{\perp}} = 2l^4/\left(3[1+l^2]^2\right)$ in units of $1/\varepsilon_{\perp}$, which saturates at the value $2/3$ for increasing $l$.
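The two gap-edge expressions can be checked by expanding Eq. (\ref{eq:tau}) for small $r$, approaching the edge from $\varepsilon>\varepsilon_{\perp}$. The following is only a sketch, with the shorthand $x=\varepsilon/\varepsilon_{\perp}$ introduced here for convenience, so that $x^2=1+r^2$. Writing $\left[{\cal T}\right]^{-1}=\cos^2(rl)+(1+r^2)\sin^2(rl)/r^2$ and using $\sin^2(rl)\simeq r^2l^2-r^4l^4/3$, one finds
\begin{align*}
\left[{\cal T}\right]^{-1} &\simeq 1+l^2-\frac{l^4}{3}\,r^2 + O(r^4),\\
{\cal T} &\simeq \frac{1}{1+l^2}+\frac{l^4}{3\left(1+l^2\right)^2}\,r^2 .
\end{align*}
Since $r^2=x^2-1$ implies $dr^2/dx=2$ at $x=1$, the slope is $d{\cal T}/dx|_{x=1}=2l^4/\left[3(1+l^2)^2\right]$. For $l=10$ this gives ${\cal T}(\varepsilon_{\perp})=1/101\simeq 0.0099$ and a slope of $20000/30603\simeq 0.65$; for $l=20$, ${\cal T}(\varepsilon_{\perp})=1/401\simeq 0.0025$ and the slope is $\simeq 0.66$.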
For energies $\varepsilon > \varepsilon_{\perp}$, ${\cal T}(\varepsilon)$ exhibits oscillations with maxima ${\cal T}^{\rm max}(\varepsilon_n) =1$ and minima ${\cal T}^{\rm min}(\varepsilon_m) = 1- \left(\varepsilon_{\perp}/\varepsilon_m\right)^2 $ at energies satisfying $\left(\varepsilon_{n (m)}\right)^2\!\!\!=\!\left(\varepsilon_{\perp}\right)^2 + \left(\pi \alpha_{n (m) } \hbar v_F/L\right)^2$, with $\alpha_{n}$ an integer and $\alpha_{m}$ a half-integer number.
\begin{figure}
\includegraphics[width=\columnwidth]{fig3.pdf}
\caption{Maximum power (upper panels) and figure of merit $ZT$ (lower panels), for a single magnetic domain of $l = 10 $ (a)-(b) and $l = 20 $ (c)-(d). The maximum values are $P_{\rm max}( T = 0.3 ) = 0.240 P_0$ (a), $ZT(T=0.08 ) = 60$ (b), $P_{\rm max}(T = 0.45) = 0.244 P_0$ (c), and $ZT(T=0.02) = 274$ (d). The temperatures are expressed in units of $T_0=\varepsilon_{\perp}/k_B$ and the power is expressed in units of $P_0 = (k_B \Delta T)^2 / h$. Other details are the same as in Fig.~\ref{fig:fig2}. } \label{fig:fig3}
\end{figure}
The impact of the transmission function on the thermoelectric performance of the heat engine is illustrated in Fig.
\ref{fig:fig3} for two lengths of the magnetic domain, in a range of chemical potentials close to the edge of the energy gap, within a temperature range scaled by the reference temperature $T_0=\varepsilon_{\perp}/k_B$. For the shortest length, $l=10$, shown in panels (a) and (b), ${\cal T}(\varepsilon_{\perp}) <0.01$ and $d {\cal T}/d\varepsilon|_{\varepsilon_{\perp}} \sim 0.65 $, i.e. close to the maximal slope ($2/3$), implying a pronounced step in the transmission function at the closing of the energy gap. The plots shown in panels (c) and (d) correspond to a longer island of length $l=20$, for which the step is slightly more pronounced. For very low temperatures, with $k_B T$ smaller than the width of the peaks of ${\cal T}(\varepsilon)$, both $P_{\rm max}$ and $ZT$ vanish for $\mu=\varepsilon_n$ (see arrows in panels (a) and (b)). As the temperature increases, the behavior of these quantities is ruled by the effect of several peaks. At sufficiently high temperature, such that several maxima of ${\cal T}(\varepsilon)$ are included in an energy window of width $k_B T$, the behavior is dominated by the average between the envelopes for the minima and the maxima of ${\cal T}(\varepsilon)$. The resulting function is approximately a smoothed step function, independent of the length of the island. For this reason, $P_{\rm max}$ shows a wide maximum centered at $|\varepsilon_{\perp}- \mu| \sim k_B T$ \cite{whitney1,peter}. The maximum is as high as $\sim 0.244 P_0$, i.e. $ \sim 75 \% $ of the bound $0.32 P_0$. ${\cal T}(\varepsilon)$ is symmetric with respect to $\varepsilon=0$ and has a well of unit depth and width $\sim 2 \varepsilon_{\perp}$. This feature dominates the behavior of the power and $ZT$ at high temperatures. These properties depend mildly on the length of the island.
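The extrema and envelopes invoked above can be read off from an equivalent form of Eq. (\ref{eq:tau}). As a sketch, for $\varepsilon>\varepsilon_{\perp}$ one may use $(\varepsilon/\varepsilon_{\perp})^2=1+r^2$ to rewrite
\begin{equation*}
{\cal T}(\varepsilon)=\left[1+\frac{\sin^2\lambda}{r^2}\right]^{-1},
\end{equation*}
so that $\sin\lambda=0$ gives ${\cal T}=1$, while $\sin^2\lambda=1$ gives ${\cal T}=r^2/(1+r^2)=1-\left(\varepsilon_{\perp}/\varepsilon\right)^2$. Assuming that the average of the two envelopes is their arithmetic mean (an assumption made here only for illustration, with the symbol $\overline{\cal T}$ introduced for this purpose),
\begin{equation*}
\overline{\cal T}(\varepsilon)\simeq 1-\frac{\varepsilon_{\perp}^2}{2\varepsilon^2}, \qquad \varepsilon>\varepsilon_{\perp},
\end{equation*}
which rises from $1/2$ at the gap edge towards $1$ at large energy and does not depend on $l$.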
Details of the effect of the different features of ${\cal T}(\varepsilon)$ on the thermoelectric response as a function of $T$ are presented in the supplementary material (SM) \cite{sm}.
\begin{figure}
\includegraphics[width=\columnwidth]{fig4.pdf}
\caption{ Transmission function ${\cal T}(\varepsilon)$ defined in Eq. \ref{eq:tau} for two magnetic domains of equal size ($l = 4, 10$) with the perpendicular component of the magnetic moments oriented with a relative tilt $\phi$.} \label{fig:fig4}
\end{figure}

{\em Two domains}. We now turn to analyze the case where we have two pieces, corresponding to $N=2$ in Eq. (\ref{eq:evol}) with $L_1=L_2=L$, $\phi_1=0$, $\phi_2=\phi$, and $\varepsilon_{\perp,1}=\varepsilon_{\perp,2}=\varepsilon_{\perp}$. The resulting transmission function reads
\begin{align}
{\cal T}(\varepsilon) &=\left \{ \left[\cos^2\lambda+\frac{\sin^2 \lambda}{r^2} \left(\cos\phi- \frac{\varepsilon^2}{\varepsilon_{\perp}^2} \right ) \right ]^2 \right. + \nonumber \\
& \left. \left[- \frac{\varepsilon}{\varepsilon_{\perp}} \frac{\sin 2\lambda}{r}+\sin\phi \frac{\sin^2\lambda}{r^2} \right ]^2 \right \}^{-1}.
\end{align}
The new feature in the present case, in comparison to the case of a single magnetic moment, is the existence of resonances within the gap, $|\varepsilon| < \varepsilon_{\perp}$, for $\phi \neq 0$. The position of the resonant state depends on the phase difference $\phi$. For $\phi = \pi$, Eq. (\ref{ham}) coincides with the model introduced by Jackiw and Rebbi \cite{jare,ssh}, which has a topological zero mode localized at the domain wall. In Ref. \onlinecite{sm} we analyze the impact of the length of the domains on the width of the resonant state. We also show that this feature is robust under weak disorder in the length of the domains and in the orientation of the magnetization along each domain.
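Two limits provide a quick consistency check of the two-domain expression above. The following is only a sketch, using $\varepsilon^2/\varepsilon_{\perp}^2=1+r^2$. For $\phi=0$ one has $\cos\phi-\varepsilon^2/\varepsilon_{\perp}^2=-r^2$, and therefore
\begin{align*}
\left[{\cal T}(\varepsilon)\right]^{-1}_{\phi=0} &= \left(\cos^2\lambda-\sin^2\lambda\right)^2+\frac{\varepsilon^2}{\varepsilon_{\perp}^2}\,\frac{\sin^2 2\lambda}{r^2}\\
&= \cos^2 2\lambda+\frac{\varepsilon^2}{\varepsilon_{\perp}^2 r^2}\,\sin^2 2\lambda ,
\end{align*}
which for $\varepsilon>\varepsilon_{\perp}$ is the inverse of Eq. (\ref{eq:tau}) with $\lambda\rightarrow 2\lambda$, i.e. a single domain of total length $2L$. For $\phi=\pi$ and $\varepsilon=0$ one has $r=i$ and $\lambda=il$, so that $\cos^2\lambda=\cosh^2 l$ and $\sin^2\lambda/r^2=\sinh^2 l$. The first bracket then reduces to $\cosh^2 l-\sinh^2 l=1$ and the second bracket vanishes, giving ${\cal T}(0)=1$ for any $l$.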
The behavior of the transmission function for two domains is illustrated in Fig. \ref{fig:fig4} for a set of orientations. The upper and lower panels show the transmission function for $l=10$ and $l=4$ for each domain, respectively. Note that the width of the resonance decreases for increasing $l$. The corresponding thermoelectric response is shown in Fig. \ref{fig:fig5}. Close to the edge of the gap, the minima of ${\cal T}(\varepsilon)$ for $\phi=\pi$ are deeper than the ones for a single domain (see Eq. 6 of Ref. \onlinecite{sm}). Notice that the latter corresponds to a single domain with total length $2L$. On the other hand, for $\phi=\pi$, the energy difference between peaks is twice the one for $\phi=0$. Hence, for two domains with $\phi=\pi$, the first peak after the closing of the gap is expected to generate a thermoelectric response with a high figure of merit, similar to that of a Lorentzian function, within a range of temperatures larger than in the case of a single ferromagnetic domain. For $\mu \sim k_B T$ the thermoelectric response is dominated by the resonance within the gap. This leads to high values of $ZT$ for $k_B T \lessapprox 10 \gamma$, where $\gamma$ is the width of the resonance, which depends on the domain length. These details are discussed in Ref. \cite{sm}. For higher temperatures, the transport behavior is dominated by the Heaviside-step and well-shaped envelopes of the transmission function, and the thermoelectric response is similar to the one discussed for a single domain.
\begin{figure}
\centering
\includegraphics[width=\columnwidth]{fig5.pdf}
\caption{Figure of merit $ZT$ for two magnetic domains of length $l = 4, 10$, with the perpendicular component of the magnetic moments tilted by $\phi=\pi$. Other details are similar to previous figures.
} \label{fig:fig5} \end{figure}

{\em Conclusions.} We have analyzed the transmission function characterizing the coherent transport of electrons in a structure consisting of a pair of helical edge states of a 2D TI coupled by a magnetic island with a magnetic moment having a component perpendicular to the direction of the spin-orbit interaction of the TI. We have shown that this setup has the necessary conditions to achieve high-performance thermoelectricity. The key is the opening of a gap in the spectrum of the helical edges with a steep increase of the transmission function at the opening of the propagating modes in the spectrum. Depending on the energy range and the configuration of the magnetic domains, the transmission function has features akin to a theta-function, as well as features akin to a delta-function, which are known to be optimal for high-power production and figure of merit, respectively. Due to the resonant states in the gap for two magnetic domains, very large values of the figure of merit, $ZT>100$, are attained for the heat-engine and refrigeration modes. Our calculations focus on a single pair of edge states, but the currents simply scale by a factor of two when the pair at the opposite edge is also considered. The range of operation is set by the magnetic gap $\varepsilon_{\perp}$. For a single domain generating an effective magnetic field of $\sim 1.8 - 4$ T \cite{Scheunert2016}, we estimate $\varepsilon_{\perp} \sim (1 - 2) \times 10^{-4}$ eV, corresponding to reference temperatures $T_0 \sim 1.2 - 2.4$ K. According to our study, such a device with a length of the magnetic island of $ \sim 10 L_0$, where $L_0= \hbar v_F/\varepsilon_{\perp}$, operates as a heat engine at a high performance ($\sim 75\%$ of the optimal bound) regarding power generation, with a figure of merit $ZT \gg 1$ for $T < 0.5 T_0$. Taking estimates for the Fermi velocity of the helical edge states in quantum wells of HgTe from Ref.
\cite{ti4}, we have $\hbar v_F \sim 0.9$ eV\,nm, leading to $L_0 \sim 10- 20\, \mu$m. These parameters are within the state of the art of present experimental realizations.

{\em Acknowledgements}. We thank A. Aligia and P. Roura-Bas for carefully reading our manuscript and for useful comments. We acknowledge support from CONICET, Argentina. We are sponsored by PIP-RD 20141216-4905 of CONICET, PICT-2014-2049 and PICT-2017-2726 from Argentina, as well as the Alexander von Humboldt Foundation, Germany (LA).
\newpage
\begin{appendices}
\textbf{SUPPLEMENTAL MATERIAL: OPTIMAL THERMOELECTRICITY WITH QUANTUM SPIN-HALL EDGE STATES}

In order to gain insight into the features of the transmission function ${\cal T}(\varepsilon)$ ruling the thermoelectric response of a single Kramers pair of helical edge states of the QSH coupled to a nanomagnet, we analyze simpler functions and take them as reference. In particular, we analyze the thermoelectric response of Heaviside-step (\sref{sec:Heaviside}) and well-shaped (\sref{sec:well}) transmission functions, and of a well-shaped transmission function with a resonance in the energy gap (\sref{sec:well-plus-bound}). In Sec. II we show that at different energy scales, ${\cal T}(\varepsilon)$ for the nanomagnet-QSH system contains ingredients of these different functions. Depending on the temperature, a particular one becomes dominant. We show in Sec. III the existence of a resonant state in the case of two magnetic domains with relative orientation $\phi=\pi$. Finally, in \sref{sec:inhom} we analyze the robustness of the main features of the transmission function against disorder in the orientation of the magnetic moment within the domains.
\begin{figure}[h!]
\centering
\includegraphics[width=\columnwidth]{fig1s.pdf}
\caption{Maximum power $P_{\rm max}$ and $ZT$ for the Heaviside step function defined in Eq. (\ref{step}) for several temperatures, expressed in units of $T_0=\varepsilon_0/k_B$. } \label{fig:fig1s}
\end{figure}
\begin{figure}[h!]
\centering
\includegraphics[width=\columnwidth]{fig2s.pdf}
\caption{Maximum power and $ZT$ for a well-function defined in Eq. (\ref{well}) as a function of $\mu$ for several temperatures. The functions are symmetric with respect to $\mu=0$. Other details are the same as in the previous figure.} \label{fig:fig2s}
\end{figure}
\begin{figure}[h!]
\centering
\includegraphics[width=\columnwidth]{fig3s.pdf}
\caption{(a,b) Maximum power and (c,d) $ZT$ for the bound state. Panels (a) and (c) correspond to $\gamma = 0.001 \varepsilon_0$, while (b) and (d) correspond to $\gamma=0.1\varepsilon_0$. Other details are the same as in the previous figures. } \label{fig:fig3s}
\end{figure}
\subsection{Heaviside-function}\label{sec:Heaviside}
We review the behavior of the figure of merit $ZT$ and the maximum power $P_{\rm max}$, defined, respectively, in Eqs. (3) and (4) of the main text, for the case of a transmission function with the form of a Heaviside function,
\begin{equation}\label{step}
{\cal T}(\varepsilon)= \theta(\varepsilon-\varepsilon_0).
\end{equation}
Results are shown in \fref{fig:fig1s}. The upper bound for the maximum achievable power, $0.32P_0$, where $P_0 = (k_B \Delta T)^2/h$, is reached \cite{benenti,whitney1} at $\mu-\varepsilon_0 \sim -k_B T$. We see that $ZT$ achieves high values at low temperatures. However, it is a decreasing function of the temperature, and drops very rapidly to $ZT \sim 0$ in the region $\mu-\varepsilon_0>0$, where ${\cal T} = 1$.
\subsection{Well-shaped function}\label{sec:well}
In this section we consider a well-shaped function of the form
\begin{equation}\label{well}
{\cal T}(\varepsilon) = \theta(\varepsilon-\varepsilon_0) + \theta(-\varepsilon-\varepsilon_0).
\end{equation}
In the range $|\varepsilon| < \varepsilon_0$, it behaves like the transmission function of the nanomagnet coupled to the helical edge states within the energy gap, for sufficiently large magnets such that $L \gg L_0$.
In the range $|\varepsilon| > \varepsilon_0$, it is similar to the envelope for the sequence of maxima of the transmission function of the nanomagnet. Both the maximum power and $ZT$ are even functions of $\mu$. Results are shown in \fref{fig:fig2s} for $\mu>0$. For low temperatures, $T<0.2 T_0$, where $T_0=\varepsilon_0/k_B$, the behavior of both the maximum power and $ZT$ is similar to that of the two Heaviside-type transmission functions. In fact, we can identify in Fig. \ref{fig:fig2s} the same features that we have already analyzed in Fig. \ref{fig:fig1s}. In particular, we see that the maximum power achieves the optimal bound $0.32 P_0$. However, for $T>0.2T_0$, the maximum value becomes a decreasing function of the temperature. The behavior of $ZT$ is similar to that of the step function within the whole range of temperatures.
\subsection{Well-shaped transmission function with a resonant peak}\label{sec:well-plus-bound}
We now consider the thermoelectric response of a transmission function which consists of a well-shaped transmission function with a Lorentzian function of width $\gamma$ in the center of the well,
\begin{equation}
{\cal T}(\varepsilon)= \theta(\varepsilon-\varepsilon_0) + \theta(-\varepsilon-\varepsilon_0) + \frac{ \gamma^2}{(\gamma^2+\varepsilon^2)}.
\end{equation}
We focus on $\gamma \ll \varepsilon_0$, which is relevant for the description of the transmission function of the nanomagnet with two magnetic domains, which hosts a narrow resonance inside the gap. The corresponding thermoelectric response is presented in \fref{fig:fig3s} for two different widths of the resonance, $\gamma = 0.001 \varepsilon_0$ and $\gamma = 0.01\varepsilon_0$. We start by analyzing the behavior of the maximum power, which is shown in the upper panels of the figure. By comparing these plots with those shown in the upper panels of \fref{fig:fig2s} and \fref{fig:fig1s}, we see that the dominant response at all temperatures is due to the well-function feature.
For low temperatures, $T<10 \gamma/k_B$, we can also identify features originating from the resonant peak. Concretely, we can identify maxima of $P_{\rm max}$ at $\mu \sim k_B T$. The maxima are anyway much smaller than the optimal bound ($\sim P_0/3$ for the case of $T=0.01T_0$ and $\gamma=0.01\varepsilon_0$). In the behavior of $ZT$, we can also identify features originating from the resonance and from the step function. For $T\leq 0.05 T_0$, the resonance leads to maxima in $ZT$ at values of the chemical potential satisfying $|\mu| \propto k_B T$, while the step function leads to maxima of $ZT$ at values of the chemical potential satisfying $|\mu-\varepsilon_0| \propto k_B T$. Hence, within this low-temperature range, there are two maxima, which move closer to one another as the temperature increases. For higher temperatures the positions of the two maxima change mildly, but the values of $ZT$ decrease rapidly with the temperature.
\section{Thermoelectric regimes of the nanomagnet coupled to the helical edge states}
We expect different thermoelectric regimes as a function of temperature. The transmission function of the nanomagnet coupled to the edge states presents ingredients of the three functions described in the previous section. Concretely, we expect that at low temperatures, where resonances and peaks are resolved, the thermoelectric response resembles that of the Lorentzian function; at higher temperatures, the behavior of the Heaviside function becomes dominant; and at even higher temperatures, the well shape due to the gap causes the decreasing behavior of both $P_{\rm max}$ and $ZT$ as functions of $T$ for all values of the chemical potential $\mu$. An overview of the different regimes is presented in Fig. \ref{fig:fig4s}.
Here, we show the functions $\overline{ZT}=\mbox{Max}_{\mu} \left[ZT \right]$ and $\overline{P}_{\rm max}=\mbox{Max}_{\mu}\left[P_{\rm max}\right]$, where $\mbox{Max}_{\mu}$ denotes the maximum value of the quantity over the whole range of values of $\mu$. The boundary for the lowest-temperature regime is identified with the range of $T$ below the one corresponding to the minimum of $\mbox{Max}_{\mu}\left[P_{\rm max}\right]$. This regime is akin to the response due to a Lorentzian-type transmission function and is non-universal, since it depends on the details of the resonant peaks of ${\cal T}(\varepsilon)$. The latter are determined by the length of the island and the number of domains. In the case of two domains with $\phi=\pi$, it is possible to distinguish a narrow feature, which is associated with a resonance within the gap, followed by a second feature, associated with the first peak after the closing of the gap. For islands with a single domain, we can distinguish only one feature, which is associated with the first peak after the closing of the gap. For larger temperatures, we enter the regime dominated by the Heaviside-type function corresponding to averaging the envelopes of the minima and maxima of the transmission function for $\varepsilon>\varepsilon_{\perp}$. While the maxima correspond to ${\cal T}=1$, the minima are deeper for two domains with $\phi=\pi$ than for a single domain. Hence, the values of $\mbox{Max}_{\mu} \left[P_{\rm max}\right]$ within this regime are higher for a single domain than for two antiferromagnetic ones. For both orientations, the behavior is universal, namely, it does not depend on the length of the island. Figure \ref{fig:fig4s} shows the onset of the third (high-temperature) regime, where the well shape becomes dominant and $\mbox{Max}_{\mu} \left[P_{\rm max}\right]$ becomes a decreasing function of $T$.
\begin{figure}
\centering
\includegraphics[width=\columnwidth]{fig4s.pdf}
\caption{$\overline{ZT}=\mbox{Max}_{\mu} \left[ZT \right]$ and $\overline{P}_{\rm max}=\mbox{Max}_{\mu}\left[P_{\rm max}\right]$, corresponding to the maximum values of $ZT$ and $P_{\rm max}$ over the whole range of $\mu$, as functions of the temperature $T$. AF denotes islands with two domains antiferromagnetically aligned.} \label{fig:fig4s}
\end{figure}
\section{Resonant state in the $\phi=\pi$ configuration}\label{sec:resonant}
The existence of a resonant state in the gap for this configuration can be understood after noticing that the Hamiltonian of Eq. (5) of the main text reduces to a 1D Dirac Hamiltonian with a mass, determined by the coupling to $m_{\perp}$, which changes sign at the boundary between the two domains. This is precisely the model analyzed by Jackiw and Rebbi \cite{jare}, which is the continuous version of the Su-Schrieffer-Heeger model \cite{ssh} for polyacetylene. These models are known to host topological zero modes localized at the domain wall. We now analyze the degree of localization of the zero mode for the case of interest, where the length of the magnets (equivalent to the massive region of the Dirac Hamiltonians) is finite. The inverse of the transmission function of two magnetic domains with length $L$ and a relative orientation $\phi$ of the component of the magnetization perpendicular to the spin-orbit interaction of the TI takes the following simple form
\noindent
\begin{align}\label{eq:resonant}
&\left[{\cal T} (\varepsilon,l,\phi)\right]^{-1} = \nonumber \\
& \left[ \cos^2\lambda+\frac{\sin^2\lambda}{r^2}(\cos\phi-x^2) \right ]^2 + \left[ -x \frac{\sin 2\lambda}{r}+\sin\phi\frac{\sin^2\lambda}{r^2} \right ]^2,
\end{align}
\noindent
where $l=L/L_0$, $x = \varepsilon/\varepsilon_\perp$, $r = \sqrt{x^2-1}$ and $\lambda = l r$.
Notice that for $\varepsilon =0$, where $r=i$ and $\lambda = il$, the latter function reads
\noindent
\begin{align}\label{eq:resonant-zero}
&\!\left[{\cal T} (0,l,\phi)\right]^{-1}\!=\nonumber\\
&\!\left[\cos^2(il)+\left(\frac{\sin(il)}{i} \right )^2 \cos\phi \right ]^2 \! +\!\left[\sin\phi \left (\frac{\sin(il)}{i} \right )^2 \right ]^2 \nonumber \\
&= \left(\cosh^2l+\sinh^2l \cos\phi \right )^2+\sin^2\phi \sinh^4l \nonumber \\
&= \left(1+\left(1+\cos\phi \right )\sinh^2l \right )^2 + \sin^2\phi \sinh^4l,
\end{align}
\noindent
where we see that ${\cal T}(0,l,\phi) \sim 0$ for $l>1$ except for $\phi = (2n+1) \pi$, with $n$ integer, in which case ${\cal T}(0,l,(2n+1)\pi) = 1$. Therefore, we conclude that for $\phi=(2n+1) \pi$, there is a resonant state in the center of the gap, with energy $\varepsilon=0$. The width of this resonant state scales with the length as $e^{-l}$, i.e. it decreases with increasing $l$. For this configuration there is a simple expression for the minima of the oscillations above the gap. It reads
\begin{equation}
{\cal T}^{\rm min}(\varepsilon_{m})=\left[\left(\varepsilon_m^2-\varepsilon_\perp^2 \right )/\left(\varepsilon_m^2+\varepsilon_\perp^2 \right ) \right ]^2,
\end{equation}
where $\left(\varepsilon_{m}\right)^2\!\!\!=\! \left(\varepsilon_{\perp}\right)^2 + \left(\pi \alpha_{m} \hbar v_F/L\right)^2$, with $\alpha_{m}$ being a half-integer number. In addition, we estimate the critical length of the domains for the resonant peak to develop. We use the following criterion to determine the critical length $l_c$ for which the width of the resonant state is smaller than the energy gap, ${\cal T}(\varepsilon/\varepsilon_\perp=0.5)\leq 0.5$, leading to
\begin{equation}
\!\!\!{\cal T}\left ( \frac{\varepsilon}{\varepsilon_\perp}=0.5,l_c \right )=\frac{9/8}{\frac{15}{8}\!\!-\!\!\cosh\left (\sqrt{3}l_c \right )\!\!+\!\!\frac{1}{4}\cosh\left (2\sqrt{3}l_c\right )} . \label{eq:taul}
\end{equation}
We get $l_c \simeq 0.9 $.
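The value of $l_c$ can be obtained in closed form from Eq. (\ref{eq:taul}). As a sketch, with the shorthand $c=\cosh\left(\sqrt{3}\,l_c\right)$ introduced here for convenience and $\cosh\left(2\sqrt{3}\,l_c\right)=2c^2-1$, the denominator becomes $\left(4c^2-8c+13\right)/8$, so that
\begin{align*}
{\cal T}\left(\frac{\varepsilon}{\varepsilon_\perp}=0.5,l_c\right) &= \frac{9}{4c^2-8c+13}=\frac{1}{2}\\
\Rightarrow\quad 4c^2-8c-5 &= 0, \qquad c=\frac{5}{2},
\end{align*}
where the positive root has been taken because $c\geq 1$. Hence
\begin{equation*}
l_c=\frac{1}{\sqrt{3}}\,{\rm arccosh}\left(\frac{5}{2}\right)=\frac{1}{\sqrt{3}}\ln\left(\frac{5+\sqrt{21}}{2}\right)\simeq\frac{1.567}{1.732}\simeq 0.90 .
\end{equation*}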
Hence, we identify a resonance in the configuration of $\phi=\pi$ for $l>l_c$. We have verified that for angles $\phi\neq \pi$ the criterion does not change significantly and the condition $l > 1$ is enough to clearly resolve a resonant state within the gap.

The previous analysis was based on the assumption that the two domains have the same length. In what follows, we analyze the case of magnetic domains with different lengths, focusing on the situation where one of the domains has $l_1>1$ and the second domain has $l_2 \leq l_1$. The generalization of Eq. (9) of the main text to the present case reads
\begin{multline}
\left[{\cal T}(\lambda_1,\lambda_2,\phi,x) \right ]^{-1} = \\
\left[\cos\lambda_1\cos\lambda_2+\frac{\sin\lambda_1\sin\lambda_2}{r^2}\left(\cos\phi-x^2 \right ) \right ]^2 \\
+ \left[\sin\phi \frac{\sin\lambda_1\sin\lambda_2}{r^2}-\sin(\lambda_1+\lambda_2)\frac{x}{r} \right]^2.
\end{multline}
For $\phi=\pi$, the height of the resonant level is given by
\begin{equation}\label{tau12}
{\cal T}(\lambda_1,\lambda_2,\phi=\pi,x=0) = \frac{1}{\cosh\left(l_1-l_2\right)}.
\end{equation}
We see that the transmission function is ${\cal T}=1$ for $l_1=l_2$, consistent with our previous analysis of two equally sized domains. For $l_1 \neq l_2$, ${\cal T}$ decreases and becomes ${\cal T}\sim 0$ if $l_2\ll l_1$ (recall that $l_1 >1$). From Eq. (\ref{tau12}) we find that for $|l_1-l_2|\leq 0.5$ and $l_1>1$, ${\cal T}$ hosts a clear resonance with ${\cal T}(0)>0.9$.
\begin{figure}
\centering
\includegraphics[width=\columnwidth]{fig5s.pdf}
\caption{Transmission function for a two-domain configuration with $\phi=\pi$. The first island has a fixed length and we consider several lengths of the second island. The inset shows a detail of the bound state at zero energy.} \label{fig:fig5s}
\end{figure}
The behavior of ${\cal T}(\varepsilon)$ for $l_1 \neq l_2$ is illustrated in \fref{fig:fig5s} for $l_1=4$ and $l_2<l_1$.
We see that for $l_1-l_2 > 2$, the resonance is no longer distinguished within the gap. However, for a small difference in the length of the two magnetic domains, not only the resonant peak, but also the behavior of ${\cal T}(\varepsilon)$ above the gap is practically unaffected.
\section{Robustness of the features of the transmission function against inhomogeneities in the orientation of the magnetic moment}\label{sec:inhom}
We now turn to analyze the effect of an inhomogeneity in the orientation of the magnetic moment within each domain. To this end, we divide each magnetic domain into $n$ pieces of the same length, along which the phase gets a random component. Hence, the orientation of the magnetic moment within each of these pieces is $\phi_j = \phi^0+\delta\phi_j, \; j=1, \ldots, n$, where $\delta \phi_j$ is a random component of the phase within the $j$-th subdomain, while $\phi^0 = 0,\pi$ for the first and second domain, respectively. According to the previous analysis, the partitions must satisfy $l/n \ll 1$ in order to be considered as a perturbation over the main magnetic configuration of the domain. In fact, for $l/n \sim 1$, each of these partitions would separately open a gap and would behave as an additional magnetic domain.
\begin{figure}
\centering
\includegraphics[width=\columnwidth]{fig6s.pdf}
\caption{Transmission function for two domains of equal length $l=4 $ with orientation of the magnetic moments $\phi^0=0,\pi$ and piecewise random fluctuations $\delta \phi_j=\pm \pi/9,\;j=1,\ldots,n$, within $n=10$ partitions of equal length (top) and $n=80$ (bottom). Different colors correspond to different realizations of disorder.} \label{fig:fig6s}
\end{figure}
\begin{figure}
\centering
\includegraphics[width=\columnwidth]{fig7s.pdf}
\caption{Same as Fig. \ref{fig:fig6s} with $\delta \phi_j=\pm 4 \pi/9$. } \label{fig:fig7s}
\end{figure}
Examples are shown in Figs.
\ref{fig:fig6s} and \ref{fig:fig7s} for weak and strong amplitude in the random component of the magnetic moment, respectively. In each case, we compare the behavior for different numbers of partitions $n$. In the case of weak disorder shown in Fig. \ref{fig:fig6s}, we see that the behavior of the transmission function above the gap is almost unaffected by the inhomogeneous orientation of the magnetic moment, while the position of the resonant peak is slightly shifted away from $\varepsilon=0$. The width of the latter, however, remains unaffected. The shift becomes smaller as the number of partitions increases, as seen in the comparison between the top and bottom panels. The case of strong fluctuations ($\delta \phi_j =\pm 4\pi/9$) is analyzed in Fig. \ref{fig:fig7s}. The behavior of the resonance is similar to the case of weak disorder. For increasing fluctuations of the magnetic moment, the shift of the resonance away from $\varepsilon=0$ is larger. For a small number of partitions $n$ (see top panel) the pattern of maxima and minima above the gap is also affected. However, the overall structure of ${\cal T}(\varepsilon)$, including the existence of a resonant peak, the clear opening of the gap and a series of peaks with an envelope defined by a Heaviside function, is preserved. For a larger number of partitions (see bottom panel), all the features, including the resonant peak as well as the pattern of maxima and minima above the gap, are only mildly affected. The analysis of this and the previous sections leads us to conclude that the performance of the setup is very robust under weak fluctuations in the orientation of the magnetic moment, as well as under fluctuations in the length of the two domains.
\end{appendices}
\begin{thebibliography}{9}
\bibitem{giazotto} F. Giazotto, T. T. Heikkil\"a, A. Luukanen, A. M. Savin, and J. P. Pekola, Opportunities for mesoscopics in thermometry and refrigeration: Physics and applications, Rev. Mod. Phys. {\bf 78}, 217 (2006).
\bibitem{casati} G. Benenti, G. Casati, K. Saito, R. S. Whitney, Fundamental aspects of steady-state conversion of heat to work at the nanoscale, Phys. Rep. {\bf 694}, 1 (2017). \bibitem{granger} G. Granger, J. P. Eisenstein, and J. L. Reno, Observation of Chiral Heat Transport in the Quantum Hall Regime, Phys. Rev. Lett. {\bf 102}, 086803 (2009). \bibitem{Nam} S. G. Nam, E. H. Hwang and H. J. Lee, Thermoelectric detection of chiral heat transport in graphene in the quantum Hall regime, Phys. Rev. Lett. {\bf 110}, 226801 (2013). \bibitem{us} L. Arrachea and E. Fradkin, Chiral heat transport in driven quantum Hall and quantum spin Hall edge states, Phys. Rev. B {\bf 84}, 235436 (2011). \bibitem{qcond} S. Jezouin, F. D. Parmentier, A. Anthore, U. Gennser, A. J. Cavanna and F. Pierre, Quantum limit of heat flow across a single electronic channel, Science {\bf 342}, 601 (2013). \bibitem{cappelli} A. Cappelli, M. Huerta, and G. Zemba, Thermal Transport in Chiral Conformal Theories and Hierarchical Quantum Hall States, Nucl. Phys. B \textbf{636}, 568 (2002). \bibitem{grosfeld} E. Grosfeld and S. Das, Probing the Neutral Edge Modes in Transport across a Point Contact via Thermal Effects in the Read-Rezayi Non-Abelian Quantum Hall States, Phys. Rev. Lett. \textbf{102}, 106403 (2009). \bibitem{torsten} T. Karzig, G. Refael, L. I. Glazman, F. von Oppen, Energy Partitioning of Tunneling Currents into Luttinger Liquids, Phys. Rev. Lett. {\bf 107}, 176403 (2011). \bibitem{us1} H. Aita, L. Arrachea, C. Na\'on, and E. Fradkin, Heat transport through quantum Hall edge states: Tunneling versus capacitive coupling to reservoirs, Phys. Rev. B {\bf 88}, 085122 (2013). \bibitem{stern} G. Viola, S. Das, E. Grosfeld, and A. Stern, Thermoelectric probe for neutral edge modes in the fractional quantum Hall regime, Phys. Rev. Lett. {\bf 109}, 146801 (2012). \bibitem{heiblum} I. Gurman, R. Sabo, M. Heiblum, V. Umansky, and D.
Mahalu, Extracting net current from an upstream neutral mode in the fractional quantum Hall regime, Nature Comm. {\bf 3}, 1289 (2012). \bibitem{altimiras} C. Altimiras, H. le Sueur, U. Gennser, A. Cavanna, D. Mailly and F. Pierre, Tuning energy relaxation along quantum Hall channels, Phys. Rev. Lett. \textbf{105}, 226804 (2010). \bibitem{yacoby} V. Venkatachalam, S. Hart, L. Pfeiffer, K. West, and A. Yacoby, Local thermometry of neutral modes on the quantum Hall edge, Nature Physics {\bf 8}, 676 (2012). \bibitem{altimiras2} C. Altimiras, H. le Sueur, U. Gennser, A. Anthore, A. Cavanna, D. Mailly, and F. Pierre, Chargeless Heat Transport in the Fractional Quantum Hall Regime, Phys. Rev. Lett. {\bf 109}, 026803 (2012). \bibitem{baner} M. Banerjee, M. Heiblum, A. Rosenblatt, Y. Oreg, D. E. Feldman, A. Stern, and V. Umansky, Observed quantization of anyonic heat flow, Nature {\bf 545}, 75 (2017). \bibitem{half} D. F. Mross, Y. Oreg, A. Stern, G. Margalit, M. Heiblum, Theory of Disorder-Induced Half-Integer Thermal Hall Conductance, Phys. Rev. Lett. {\bf 121}, 026801 (2018). \bibitem{pheno} A. Aharon, Y. Oreg, A. Stern, Phenomenological theory of heat transport in the fractional quantum Hall effect, Phys. Rev. B {\bf 99}, 041302 (2019). \bibitem{rafa} R. S\'anchez, B. Sothmann, A. N. Jordan, Chiral thermoelectrics with quantum Hall edge states, Phys. Rev. Lett. {\bf 114}, 146801 (2015). \bibitem{peter} P. Samuelsson, S. Kheradsoud, B. Sothmann, Optimal quantum interference thermoelectric heat engine with edge states, Phys. Rev. Lett. {\bf 118}, 256801 (2017). \bibitem{janine} S. Kheradsoud, N. Dashti, M. Misiorny, P. P. Potts, J. Splettstoesser, P. Samuelsson, Power, Efficiency and Fluctuations in a Quantum Point Contact as Steady-State Thermoelectric Heat Engine, arXiv:1904.03912. \bibitem{vanuci} L. Vannucci, F. Ronetti, G. Dolcetto, M. Carrega, M. Sassetti, Interference-induced thermoelectric switching and heat rectification in quantum Hall junctions, Phys. Rev.
B {\bf 92}, 075446 (2015). \bibitem{enhan} P. Roura-Bas, L. Arrachea, E. Fradkin, Enhanced thermoelectric response in the fractional quantum Hall effect, Phys. Rev. B {\bf 97}, 081104 (2018). \bibitem{jauho} Xiao-Qin Yu, Zhen-Gang Zhu, Gang Su, A.-P. Jauho, A spincaloritronic battery, Phys. Rev. Applied {\bf 8}, 054038 (2017). \bibitem{rone} Flavio Ronetti, Luca Vannucci, Giacomo Dolcetto, Matteo Carrega, Maura Sassetti, Spin-thermoelectric transport induced by interactions and spin-flip processes in two dimensional topological insulators, Phys. Rev. B {\bf 93}, 165414 (2016). \bibitem{roda} Sun-Yong Hwang, Rosa Lopez, Minchul Lee, David Sanchez, Nonlinear spin-thermoelectric transport in two-dimensional topological insulators, Phys. Rev. B {\bf 90}, 115301 (2014). \bibitem{graph} Po-Hao Chang, Mohammad Saeed Bahramy, Naoto Nagaosa, Branislav K. Nikolic, Giant thermoelectric effect in graphene-based topological insulators with nanopores, Nano Lett. {\bf 14}, 3779 (2014). \bibitem{bjorn} D. G. Rothe, E. M. Hankiewicz, B. Trauzettel, M. Guigou, Spin-dependent thermoelectric transport in HgTe/CdTe quantum wells, Phys. Rev. B {\bf 86}, 165434 (2012). \bibitem{helius} P. Roura-Bas, L. Arrachea, and E. Fradkin, Helical spin thermoelectrics controlled by a side-coupled magnetic quantum dot in the quantum spin Hall state, Phys. Rev. B {\bf 98}, 195429 (2018). \bibitem{soth} D. Sanchez, R. Sanchez, R. Lopez, B. Sothmann, Nonlinear chiral refrigerators, arXiv:1904.04506. \bibitem{benja} Arjun Mani, Colin Benjamin, Helical thermoelectrics and refrigeration, Phys. Rev. E {\bf 97}, 022114 (2018). \bibitem{ti1} C. L. Kane and E. J. Mele, $Z_2$ Topological Order and the Quantum Spin Hall Effect, Phys. Rev. Lett. {\bf 95}, 146802 (2005). \bibitem{ti2} C. L. Kane and E. J. Mele, Quantum Spin Hall Effect in Graphene, Phys. Rev. Lett. {\bf 95}, 226801 (2005). \bibitem{ti3} B. A. Bernevig, T. L. Hughes, and S.-C.
Zhang, Quantum Spin Hall Effect and Topological Phase Transition in HgTe Quantum Wells, Science {\bf 314}, 1757 (2006). \bibitem{ti4} M. K\"onig, S. Wiedmann, C. Br\"une, A. Roth, H. Buhmann, L. W. Molenkamp, X.-L. Qi, and S.-C. Zhang, Quantum Spin Hall Insulator State in HgTe Quantum Wells, Science {\bf 318}, 766 (2007). \bibitem{ti5} A. Roth, C. Br\"une, H. Buhmann, L. W. Molenkamp, J. Maciejko, X.-L. Qi, and S.-C. Zhang, Nonlocal Transport in the Quantum Spin Hall State, Science {\bf 325}, 294 (2009). \bibitem{ti6} C. Br\"une, A. Roth, H. Buhmann, E. M. Hankiewicz, L. W. Molenkamp, J. Maciejko, X.-L. Qi, and S.-C. Zhang, Spin polarization of the quantum spin Hall edge states, Nature Phys. {\bf 8}, 485 (2012). \bibitem{qpc1} B. J. van Wees, H. van Houten, C. W. J. Beenakker, J. G. Williamson, L. P. Kouwenhoven, D. van der Marel, C. T. Foxon, Quantized conductance of point contacts in a two-dimensional electron gas, Phys. Rev. Lett. {\bf 60}, 848 (1988). \bibitem{qpc2} H. van Houten, L. W. Molenkamp, C. W. J. Beenakker, C. T. Foxon, Thermoelectric properties of quantum point contacts, Semicond. Sci. Technol. {\bf 7}, B215 (1992). \bibitem{qpc3} M. B\"uttiker, Quantized transmission of a saddle-point constriction, Phys. Rev. B {\bf 41}, 7906 (1990). \bibitem{qpc4} J. Strunz, J. Wiedenmann, C. Fleckenstein, L. Lunczer, W. Beugeling, V. L. M\"uller, P. Shekhar, N. Traverso Ziani, S. Shamim, J. Kleinlein, H. Buhmann, B. Trauzettel, L. W. Molenkamp, Interacting topological edge channels, arXiv:1905.08175. \bibitem{dolcini} F. Dolcini, Full electrical control of charge and spin conductance through interferometry of edge states in topological insulators, Phys. Rev. B {\bf 83}, (2011). \bibitem{rech1} P. Virtanen, P. Recher, Dephasing of spin and charge interference in helical Luttinger liquids, Phys. Rev. B {\bf 83}, 115332 (2011). \bibitem{citro} F. Romeo, R. Citro, D. Ferraro and M.
Sassetti, Electrical switching and interferometry of massive Dirac particles in topological insulator constrictions, Phys. Rev. B {\bf 86}, 165418 (2012). \bibitem{bruno1} B. Rizzo, L. Arrachea, M. Moskalets, Transport phenomena in helical edge states interferometers. A Green's function approach, Phys. Rev. B {\bf 88}, 155433 (2013). \bibitem{tidot1} F. Cr\'epin, J. C. Budich, F. Dolcini, P. Recher, and B. Trauzettel, Renormalization group approach for the scattering off a single Rashba impurity in a helical liquid, Phys. Rev. B {\bf 86}, 121106(R) (2012). \bibitem{tidot2} G. Dolcetto, F. Cavaliere, D. Ferraro, and M. Sassetti, Generating and controlling spin-polarized currents induced by a quantum spin Hall antidot, Phys. Rev. B {\bf 87}, 085425 (2013). \bibitem{tidot3} K. T. Law, C. Y. Seng, Patrick A. Lee, and T. K. Ng, Quantum dot in a two-dimensional topological insulator: The two-channel Kondo fixed point, Phys. Rev. B {\bf 81}, 041305 (2010). \bibitem{tidot4} T. Posske, C.-X. Liu, J. C. Budich, and B. Trauzettel, Exact Results for the Kondo Screening Cloud of Two Helical Liquids, Phys. Rev. Lett. {\bf 110}, 016602 (2013). \bibitem{tidot5} B. Probst, P. Virtanen, P. Recher, Controlling spin polarization of a quantum dot via a helical edge state, Phys. Rev. B {\bf 92}, 045430 (2015). \bibitem{bruno2} B. Rizzo, A. Camjayi, Liliana Arrachea, Transport in quantum spin Hall edges in contact to a quantum dot, Phys. Rev. B {\bf 94}, 125425 (2016). \bibitem{maso} G. D. Mahan, J. O. Sofo, The best thermoelectric, Proc. Natl. Acad. Sci. U.S.A. {\bf 93}, 7436 (1996). \bibitem{benenti} G. Benenti, K. Saito, and G. Casati, Thermodynamic Bounds on Efficiency for Systems with Broken Time-Reversal Symmetry, Phys. Rev. Lett. {\bf 106}, 230602 (2011). \bibitem{whitney1} R. S. Whitney, Most Efficient Quantum Thermoelectric at Finite Power Output, Phys. Rev. Lett. {\bf 112}, 130601 (2014). \bibitem{whitney2} R. S.
Whitney, Finding the quantum thermoelectric with maximal efficiency and minimal entropy production at given power output, Phys. Rev. B {\bf 91}, 115425 (2015). \bibitem{linke} M. Josefsson, A. Svilans, A. M. Burke, E. A. Hoffmann, S. Fahlvik, C. Thelander, M. Leijnse, H. Linke, A quantum-dot heat engine operating close to the thermodynamic efficiency limits, Nat. Nanotechnol. {\bf 13}, 920 (2018). \bibitem{fabio1} P. A. Erdman, F. Mazza, R. Bosisio, G. Benenti, R. Fazio, Fabio Taddei, Thermoelectric properties of an interacting quantum dot-based heat engine, Phys. Rev. B {\bf 95}, 245432 (2017). \bibitem{fabio2} D. Prete, Paolo A. Erdman, V. Demontis, V. Zannier, D. Ercolani, L. Sorba, F. Beltram, F. Rossella, F. Taddei, S. Roddaro, Thermoelectric conversion at 30K in InAs/InP nanowire quantum dots, arXiv:1903.06935. \bibitem{dutta} B. Dutta, J. T. Peltonen, D. S. Antonenko, M. Meschke, M. A. Skvortsov, B. Kubala, J. K\"onig, C. B. Winkelmann, H. Courtois, and J. P. Pekola, Thermal Conductance of a Single-Electron Transistor, Phys. Rev. Lett. {\bf 119}, 077701 (2017). \bibitem{ora} O. Entin-Wohlman, Y. Imry and A. Aharony, Enhanced performance of joint cooling and energy production, Phys. Rev. B {\bf 91}, 054302 (2015). \bibitem{arman} D. P\'erez Daroca, P. Roura-Bas, A. A. Aligia, Enhancing of nonlinear thermoelectric response of a correlated quantum dot in the Kondo regime by asymmetrically coupling to the leads, Phys. Rev. B {\bf 97}, 165433 (2018). \bibitem{misha1} D. B. Karki, M. N. Kiselev, Exceeding quantum upper bound of power production in nano devices: effects of resonance scattering and strong electron interactions, arXiv:1906.00724. \bibitem{magdot1} Q. Meng, S. Vishveshwara, T. L. Hughes, Spin-transfer torque and electric current in helical edge states in quantum spin Hall devices, Phys. Rev. B {\bf 90}, 205403 (2014). \bibitem{magdot2} L. Arrachea and F.
von Oppen, Nanomagnet coupled to quantum spin Hall edge: An adiabatic quantum motor, Physica E {\bf 74}, 596 (2015). \bibitem{magdot3} P. G. Silvestrov, P. Recher, P. W. Brouwer, Noiseless manipulation of helical edge state transport by a quantum magnet, Phys. Rev. B {\bf 93}, 205130 (2016). \bibitem{fu} L. Fu and C. L. Kane, Josephson current and noise at a superconductor/quantum-spin-Hall-insulator/superconductor junction, Phys. Rev. B {\bf 79}, 161408 (2009). \bibitem{meyer} M. Houzet, J. S. Meyer, D. M. Badiane, and L. I. Glazman, Dynamics of Majorana States in a Topological Josephson Junction, Phys. Rev. Lett. {\bf 111}, 046401 (2013). \bibitem{crepin} Francois Cr\'epin, Bj\"orn Trauzettel, and Fabrizio Dolcini, Signatures of Majorana bound states in transport properties of hybrid structures based on helical liquids, Phys. Rev. B {\bf 89}, 205115 (2014). \bibitem{marun} R. Bustos-Marun, G. Refael, and F. von Oppen, Adiabatic Quantum Motors, Phys. Rev. Lett. {\bf 111}, 060802 (2013). \bibitem{jare} R. Jackiw and C. Rebbi, Solitons with fermion number, Phys. Rev. D {\bf 13}, 3398 (1976). \bibitem{ssh} W. P. Su, J. R. Schrieffer, and A. J. Heeger, Soliton excitations in polyacetylene, Phys. Rev. B {\bf 22}, 2099 (1980). \bibitem{sm} See supplementary material for details on the behavior of the maximum power for the heat engine and the figure of merit of reference transmission functions. \bibitem{Scheunert2016} G. Scheunert, O. Heinonen, R. Hardeman, A. Lapicki, M. Gubbins, and R. M. Bowman, A review of high magnetic moment thin films for microscale and nanotechnology applications, Appl. Phys. Rev. {\bf 3}, 011301 (2016). \end{thebibliography} \end{document} }\end{equation}}
\caption{ \textbf{Results generated using TileGAN.} The results were generated using GANs trained on the Google terrain map (top) and the Satellite imagery (bottom) data sets. For each result, the first column shows samples from the training data, the second column contains sample tiles generated by the trained GAN, the third column features the low-resolution guidance map input to our method and the final large-scale output. In the last column, we show two cropped and zoomed regions scaled by the zoom value given on the bottom right. \imagecredits{top, first column \textcopyright~Google; bottom, first column \textcopyright~ESRI.} }
\caption{ \textbf{Further TileGAN results.} These results were generated by our method using GANs trained on the Oil canvas (top) and the Night sky (bottom) data sets. See Fig.~\protect\ref{fig:results1} for an explanation of the image columns. \imagecredits{top, low-resolution input \textcopyright~Vincent Brady; bottom, first column \textcopyright~ESO and ESA/Hubble.} }
\caption{SPICE model of a standalone electrostatic MEMS actuator. Arrows pointing into (out of) a block refer to inputs (outputs) to (from) the block. $x$ designates position, $v$ designates velocity and $F$ designates forces. Implementation of these blocks using circuit elements follows Ref. \textcolor{blue}{\onlinecite{toshiyoshi2011spice}}. Refer to the \textcolor{blue}{supplementary material} for the detailed schematic implementation.}
\caption{Static pull-in and release characteristics of the hybrid actuator. The simulation results are in agreement with the analytical results given in Ref. \textcolor{blue}{\onlinecite{masuduzzaman2014effective}}, thus validating the hybrid actuator SPICE model.}
\caption{Comparing state-of-the-art methods on \textit{Office}. The $1^\text{st}/2^\text{nd}$ best results are indicated in {\color{red}red}/{\color{blue}blue}. }
\caption{Performance comparison on three handwritten font synthesis tasks with increasing training set size. The comparison between {\color{blue}\textit{Ours 750}} and {\color{blue}\textit{HAN 2550}} demonstrates that our method achieves equal or even better performance with a much smaller training set.}
\caption{We give the robot the instruction: \textit{``pick up the ball in front of the can''}. The robot executes the action and waits for further instructions. We then give the instruction to \textit{``drop it in front of the mug''}. The problem in this step is that there is no object classified as \predicate{mug}, which means that none of the objects has \predicate{mug} as its highest-probability label. We correct for this through probabilistic reasoning over not only the top label for each object but a number of top-ranked labels per object. This allows the anchoring system to correct its classification of an object based on what we as humans think an object is. Given the instruction, the anchoring system re-classifies the black object from \predicate{pot} to \predicate{mug}. The instruction is then successfully carried out. The recorded video can be found here: \url{https://vimeo.com/302072685}.}
\caption{(Left) \textit{AM}: Plots of $f_3$ along the active manifold (\textcolor{PineGreen}{green}) and the piecewise-cubic Hermite spline approximation from the AM algorithm (\textcolor{RedOrange}{orange}). (Right) \textit{AS}: Plots of $f_3$ along the active subspace and along bootstrap replicates (\textcolor{PineGreen}{green}), and the degree-4 polynomial fit obtained with Constantine's own optimization algorithms (\textcolor{RedOrange}{orange}).}
\caption{\textcolor{orange}{Level sets (orange)} of example function $f_3: \mathbb{R}^2 \to \mathbb{R}$, and its \textcolor{blue}{gradient vector field (blue)} tangent to the AMs at each point.}
\caption{From left to right, AM run on test functions $f_1$, $f_2$, $f_3$, respectively. The bold black line is the approximated active manifold, $\gamma$, from a random starting point as used in the tests of Section~\ref{sec:regression-experiment}. Test data are indicated with \textcolor{blue}{blue $\mathbf{\times}$}, with the \textcolor{blue}{path traversed to the active manifold in blue}. Test points with no path are those for which the learned active manifold cannot provide an estimate.}
\caption{ AM-derived function values in \textcolor{RedOrange}{orange} along parameterization of the active manifold and piecewise-cubic Hermite interpolation fit to these points in \textcolor{PineGreen}{green}. Top plots depict $u_{ave}$ (left), $B_{ind}$ (right) corresponding to the Hartmann problem (Sec. \ref{sec:hartmann}). Bottom plots depict $u_{ave}$ (left), $B_{ind}$ (right) corresponding to the idealized MHD generator data (Sec. \ref{sec:ideal-mhd}).}
\caption{\textcolor{blue}{SNe~Ia with different $z_\mathrm{hel}$ in JLA and Pantheon} : While the JLA and Pantheon names of the SNe~Ia differ by survey specific prefixes, the fact that they are the same can be verified by checking their Right Ascension and Declination coordinates. The JLA redshifts are taken from \href{https://github.com/cmbant/CosmoMC/blob/master/data/jla\_lcparams.txt}{https://github.com/cmbant/CosmoMC/blob/master/data/jla\_lcparams.txt} while the Pantheon redshifts are from \href{https://github.com/dscolnic/Pantheon/blob/master/lcparam\_full\_long\_zhel.txt}{https://github.com/dscolnic/Pantheon/blob/master/lcparam\_full\_long\_zhel.txt}. The shifts in the redshift are computed assuming the conservative value of $\sigma_z = 0.0005$ \citep{Kessler:2009ys}.}
\caption{\textcolor{blue}{Table \ref{tab:zhel1} (Continued)}}
\caption{Constraints on fermionic asymmetric DM which forms a DM core and collapses to a mini black hole in a WD. The black hole either ignites a supernova via Hawking emission ({\color{c6} red}) or accretes and eats the star (or possibly ignites a supernova) ({\color{c7} blue}). Also shown ({\color{c5} purple}) are the constraints on DM-nuclei scatters igniting a supernova during core collapse before formation of a black hole. }
\caption{Constraints on bosonic asymmetric DM which forms a DM core and collapses to a mini black hole in a WD. The black hole either ignites a supernova via Hawking emission ({\color{c6} red}) or accretes and eats the star (or possibly ignites a supernova) ({\color{c7} blue}). Also shown ({\color{c5} purple}) are the constraints on DM-nuclei scatters igniting a supernova during core collapse before formation of a black hole. }
\caption{Constraints on fermionic asymmetric DM which forms a DM core and ignites a supernova through annihilations ({\color{c6} red}). For sufficiently small $\sigma_{\chi \chi} v$ the core first collapses to a black hole ({\color{c7} blue}), and is otherwise constrained, see Fig.~\ref{fig:BHfermion}. Also shown ({\color{c5} purple}) are the constraints on DM-nuclei scatters igniting a supernova during core collapse before annihilations could do so. }
\caption{Constraints on bosonic asymmetric DM which forms a DM core and ignites a supernova through annihilations ({\color{c6} red}). For sufficiently small $\sigma_{\chi \chi} v$ the core first collapses to a black hole ({\color{c7} blue}), and is otherwise constrained, see Fig.~\ref{fig:BHboson}. Also shown ({\color{c5} purple}) are the constraints on DM-nuclei scatters igniting a supernova during core collapse before annihilations could do so. }
\caption{PSD heat-maps of the three EEG bands, i.e., theta (\textcolor{red}{red}), alpha (\textcolor{green}{green}), and beta (\textcolor{blue}{blue}), are added according to their respective color-bar ranges to get the combined RGB heat-map image. (Circular outline, nose, ears, and color-bars have been added for visualization only.)}
\caption{Detected face (marked in \textcolor{red}{red}) and face localized points (marked in \textcolor{green}{green}) for two participants (left and center) in the study, and some of the features (marked in \textcolor{yellow}{yellow}) computed using the coordinates of the face localized points.}
\caption{Comparison between different annotation tools.\newline \protect\dotgreenbright ~Integrates feature \protect\dotyellowbright ~Feature only available for 2D images \protect\dotredbright ~Feature not available }
\caption{Evaluation metrics on the \texttt{LISA-T} dataset. \textit{User1} annotated the objects very accurately in the beginning. \newline \protect\tikz \protect\fill[myred] (1ex,1ex) circle (1ex); User1 \qquad \protect\tikz \protect\fill[mygreen] (1ex,1ex) circle (1ex); User2 \qquad \protect\tikz \protect\fill[myblue] (1ex,1ex) circle (1ex); User3}
\caption{Evaluation metrics on the \textit{NuScenes} dataset. \textit{User2} achieved the best accuracy in total. \newline \protect\tikz \protect\fill[myred] (1ex,1ex) circle (1ex); User1 \qquad \protect\tikz \protect\fill[mygreen] (1ex,1ex) circle (1ex); User2 \qquad \protect\tikz \protect\fill[myblue] (1ex,1ex) circle (1ex); User3}
\caption[Circle centre partition]{A packing produced by extended SLP: 128 (3 medium, 17 tiny, and 108 very tiny) circles packed into 15 (1 medium, 4 tiny and 10 very tiny) lanes. A possible input order of the circles is \textcolor{blue}{1 medium circle}, \textcolor{red}{23 very tiny circles} (filling the sparse block $A$), \textcolor{seagreen}{1 medium circle}, \textcolor{orange}{13 tiny circles} (filling the sparse block $B$), \textcolor{violet}{24 very tiny circles} (filling the vertical lane $C$), \textcolor{blue}{1 medium circle}, \textcolor{red}{2 tiny circles} (filling the sparse block $D$), \textcolor{seagreen}{11 very tiny circles} (filling the sparse block $E$), and \textcolor{orange}{2 tiny circles} (filling the vertical lane $F$).}
\caption{Accuracies of reconstructed \predictions compared to returning a default response when \predictions from the \basemodel are unavailable ($\accUnavail$). ``Available'' is the accuracy achieved when \basemodel \predictions are available ($\accAvail$). \tech uses $k=2$ and the generic \encdec. }
\caption{Overall accuracy ($\accOverall$) of \predictions on CIFAR-10 as the fraction of \predictions that are unavailable ($\fracUnavail$) increases. The horizontal orange line shows the accuracy of the ResNet-18 \basemodel ($\accAvail$).}
\caption{Accuracies of \predictions reconstructed by \tech with $k=2,3,4$ and using the generic \encdec compared to returning a default response when \basemodel \predictions are unavailable ($\accUnavail$). }
\caption{Components of a \modelserver and those added by \tech (dotted). Queues indicate components which may group \queries/\predictions (e.g., \codinggroup).}
\caption{Quantitative evaluation results on the test set. The numbers are averaged metrics for the datasets. \textcolor{red}{Red} indicates the best result and \textcolor{blue}{blue} indicates the second best. In the case of HIGRADE, there are two kinds of measures, HIGRADE-1 and HIGRADE-2.}
\caption{ Results (\%) on big re-ID datasets. It is clear that OSNet achieves state-of-the-art performance on all datasets, surpassing most published methods by a clear margin. It is noteworthy that \emph{OSNet has only 2.2 million parameters}, far fewer than the current best-performing ResNet-based methods. -: not available. $\dag$: model trained from scratch. $\ddag$: reproduced by us. (Best and second best results in \red{red} and \blue{blue}, respectively.) }
\caption{Likelihoods on ground-truth attributes predicted by OSNet. Correct/incorrect classifications based on threshold 50\% are shown in \green{green}/\red{red}.}
\caption{Effect of black hole mass. \textcolor{magenta}{Results from a set of CLOUDY simulations performed on a constant density single BLR cloud assuming $\mathrm{M_{\mathrm{BH}}=10^8\;M_{\odot}}$ and $\mathrm{M_{\mathrm{BH}}=10^{10}\;M_{\odot}}$. The plots are shown for the spectral bins A1, B1 and B1$^+$, showing the distribution of changing \feii\ strength with changing BLR sizes computed from the virial relation. Average values of FWHM are used for each spectral bin. The computations are performed for the viewing angle range 0-90$^{\mathrm{o}}$, for a continuum SED from \cite{mf87}, \cite{kor97} and \cite{laor97} for the respective spectral bins. The filled circles in black/red/blue in each bin are the corresponding \feii\ strength at a radius constrained by the \cite{bentz13} $R_\mathrm{BLR}$-$L_{\mathrm{5100}}$ relation.}}
\caption{Core-to-continuum median flux ratio versus significance of the Spearman correlation for all quasar spectra searched in a proximate velocity window (top, {within a velocity window encompassing the pipeline and visual redshift estimates, extended 2000~\kms\ on each side}) and an intervening window {with the exact same width for each spectrum, but shifted by 5\,000~\kms\ bluewards} (bottom). The vertical and horizontal dotted lines show our cuts defining the samples ${\cal S}^P_{c1}$ (top) and ${\cal S}^I_{c1}$ (bottom). Points located outside the solid contour (containing 99.73\% of the points) define, respectively, ${\cal S}^P_{c2}$ (top) and ${\cal S}^I_{c2}$ (bottom). Candidates belonging to either one or both of these selections (black points) were visually checked and coloured green when strong H$_2$ is confirmed (grade A) or yellow when considered tentative only (grade B). {Red and orange points correspond to additional systems described in Sect.~\ref{additional} with, respectively, grade A and B.} \label{fig:parameter}}
\caption{Distribution of relative velocities with respect to the quasar redshift for our sample of strong proximate H$_2$ systems (orange histograms) compared to those found in a region shifted by 5\,000~\kms\ (blue). We here used the ``zbest'' provided by the DR14Q catalogue as the quasar redshift and the $z_{\rm abs}$ measurement directly from our search algorithm. Negative velocities indicate $\zabs>\zem$. Note that the x-axis goes from positive velocities (blueshifted compared to the quasar) on the left to negative velocities (redshifted) on the right. Both distributions are restricted to visually-checked systems (unfilled histograms: grade A or B, filled histograms: grade A only) isolated using the outlier selection (\# 2). The grey regions show the corresponding minimal search windows. Systems falling outside these regions are not considered when comparing incidence rates. The horizontal dashed line shows the mean number of intervening strong H$_2$ systems per velocity bin ($\sim 1$ per 500~\kms\ bin). A significant excess of H$_2$ systems at the quasar redshift is observed and cannot be explained by intervening statistics.}
\caption{Distribution of \HI\ column densities in our statistical samples with visual grade A (proximate: filled, intervening: red).}
\caption{Coverage of the $(\xBj, Q^{2})$ (left) and $(\xBj, -t/Q^{2})$ (right) phase-spaces by the experimental data listed in Table \ref{tab:data:dvcs_data}. The data come from the Hall A (\textcolor{blue}{$\blacktriangledown$}, \textcolor{black}{$\triangledown$}), CLAS (\textcolor{red}{$\blacktriangle$}, \textcolor{black}{$\vartriangle$}), HERMES (\textcolor{green}{$\bullet$}, \textcolor{black}{$\circ$}), COMPASS (\textcolor{magenta}{$\blacksquare$}, \textcolor{black}{$\square$}) and HERA H1 and ZEUS (\textcolor{cyan}{$\blacklozenge$}, \textcolor{black}{$\lozenge$}) experiments. The gray bands (open markers) indicate phase-space areas (experimental points) being excluded from this analysis due to the cuts introduced in Eqs. \eqref{eq:data:cut_1} and \eqref{eq:data:cut_2}.}
\caption{{\footnotesize \textcolor[rgb]{0.00,0.00,0.00}{Variations of PTT values corresponding to different characteristic points on PPG.} }}
\caption{{\footnotesize \textcolor[rgb]{0.00,0.00,0.00}{Block diagram of the sampling hardware device.}}}
\caption{\textcolor[rgb]{0.00,0.00,0.00}{The SBP and DBP Estimation Results}}
\caption{An example of a video-based face recognition problem consisting of three still face gallery subjects and four samples from the videos. {\color[rgb]{1,0.5,0}Orange arrows} show positive connections from body appearance similarity. \textbf{Black arrows} indicate negative connections constructed from co-occurrence information. {\color{blue}Blue arrows} represent the facial similarities to the ground truth galleries. The thicker the arrows, the stronger the connections. The {\color{red}red cross} indicates a misleading connection. A graph with fixed connections may propagate erroneous information through these misleading connections. (The figure is best viewed in color.)}
\caption{Overview of the data flow when employing our proposed 3d-SMRnet (\textcolor{blue}{blue path}). Instead of measuring a HR SM and using it for reconstruction (\textcolor{orange}{orange path}), only a LR SM (lower left) is measured and the HR SM is retrieved by applying our proposed method to each frequency component of the LR SM. The recovered HR SM can be used for reconstruction (upper right). }
\caption{Scenarios and mean results over 30 runs compared to standard QUIC. QUIC flow competes for bandwidth. Each row shows our approach under different bandwidth estimates (as a factor to the true BW). Green depicts an improvement (\textcolor{Green}{\ding{115}}), red a deterioration (\textcolor{BrickRed}{\ding{116}}), orange shows FCT changes within the same RTT (\textcolor{Orange}{\ding{115}}\textcolor{Orange}{\ding{116}}) and gray circles depict that the means show no statistically significant difference in the 95\% confidence intervals (\textcolor{Gray}{\ding{108}}).}
\caption{The overview of \methodname workflow. $\mathcal{D}_{epi}$ is an episode randomly sampled. $\{\mathcal{S}^i\}_{i=1}^{K-1}$ denotes source domains and $\mathcal{T}^s$ denotes the simulated target domain. The two gradient update loops of meta-training process are illustrated. The \textcolor{orange}{yellow} colored blocks and arrows are associated with Learner, while the \textcolor{blue}{blue} ones are associated with MetaLearner. (``Target loss'' is used here instead of ``Simulated Target loss'' for simplicity.)}
\caption{\textcolor{red}{\label{fig:Rotation-about-(-x)}}Rotation about the $-x$ axis. Point $P$ is rotated by an angle of $(90^\circ-\phi)$ about the $-x$ axis, yielding point $P'$.}
\caption{MTMD applied to a two-degree-of-freedom system: uncontrolled structure (\textbf{---}), solution from \cite{Ozguven1986} (\textcolor{blue}{\textbf{---}}) and norm-homotopy solution (\textcolor{red}{\textbf{---}}).}
\caption{Compliance of the damped single-degree-of-freedom system: uncontrolled (\textcolor{gray}{---}) and controlled (\textbf{---}) structure.}
\caption{Compliance of the damped single-degree-of-freedom system: uncontrolled structure (\textcolor{gray}{---}), initial tuning (\textbf{---}) and $p$-norm optimization with $p=1$ (\textcolor{blue}{\textbf{---}}).}
\caption{TMD on a damped single-degree-of-freedom system: uncontrolled structure (\textcolor{gray}{---}), initial tuning (\textbf{---}), $p$-norm optimization with $p=1$ (\textcolor{blue}{\textbf{---}}) and norm-homotopy optimization (\textcolor{red}{\textbf{---}}).}
\caption{MTMD on a two-degree-of-freedom system: initial tuning (\textbf{---}), solution for $k=0$ (\textcolor{blue}{\textbf{---}}), solution for $k=1$ (\textcolor{darkGreen1}{\textbf{-$\cdot$-}}), solution for $k=2$ (\textcolor{violet1}{\textbf{---}}), solution for $k=3$ (\textcolor{orange1}{\textbf{-$\cdot$-}}), solution for $k=4$ (\textcolor{cyan}{\textbf{---}}), and norm-homotopy optimal solution (\textcolor{red}{\textbf{-$\cdot$-}}).}
\caption{Compliance of the plate with three absorbers targeting modes $(1,1)$, $(2,1)$ and $(1,2)$: uncontrolled structure (\textcolor{gray}{---}), initial tuning (\textbf{---}) and optimized tuning (\textcolor{red}{\textbf{---}}).}
\caption{Compliance of the plate with four absorbers targeting modes $(1,1)$, $(2,1)$, $(1,2)$ and $(2,2)$: uncontrolled structure (\textcolor{gray}{---}), initial tuning (\textbf{---}) and optimized tuning (\textcolor{red}{\textbf{---}}).}
\caption{(a) A rectangle of size $L_1\sqrt{3}/2\times L_2$, (b) the cylinder of height $H\!=\!L_1\sqrt{3}/2$ and diameter $D\!=\!L_2/\pi$, (c) a triangulated lattice of $(L_1,L_2)\!=\!(11,29)$, which approximately makes $D/L_1\!=\!1$, (d) a front view of the cylinder with fixed boundary, and (e) the vertices (\textcolor{green}{$\bullet$}) on a boundary are allowed to move along the circumferential direction. The square plates in (d) and (e) are drawn to emphasize the existence of the fixed boundaries. The edge length $a$ of the regular triangle in (c) is assumed as $a\!=\!1$.}
\caption{MNIST benchmarks for biologically plausible models of deep learning compared with models in this paper (\textbf{bold}). SNN: Spiking Neural Network, for other abbreviations see \autoref{sec:results}. Models are ranked by MNIST test accuracy (rightmost column). Parts of this table are taken from \citep{Tavanaei2018a,Kheradpisheh2018,Diehl2015}. Models using convolutional layers (CNN) are marked in \textcolor{orange}{orange}. See \autoref{tab:abbreviations} for abbreviations. %Note that the simple models in the $l$-RP class ($l$-RP, LIF rate \& spiking $l$-RP), marked in \textcolor{red}{red}, perform better than several more elaborate models. For conventional ANN/DNN/CNN MNIST benchmarks see \href{http://yann.lecun.com/exdb/mnist/}{table} in \citet{LeCun}.\label{tab:MNISTbenchmarks}}
\caption{Additional examples from our LCGN model on the validation split of the GQA dataset for VQA. In the middle 4 columns, each red line shows an edge $j \rightarrow i$ along the message passing paths (among the $N$ detected objects) where the connection edge weight $w^{(t)}_{j,i}$ exceeds a threshold. The \textcolor{blue}{blue} star on each line is the sender node $j$. In these examples, the objects of interest receive messages from other objects through those connections with high weights (the red lines). The \textcolor{red}{red} star (along with the box) in the last column shows the object with the highest attention $\beta_i$ in the single-hop VQA classifier in Sec.~3.2 of the main paper. The last two rows show two failure examples on the GQA dataset. Some failure cases are due to ambiguity in the answers in the GQA dataset (\eg ``woman'' vs. ``lady'' in the last example).}
\caption{Additional examples from our LCGN model on the validation split of the CLEVR dataset for VQA. The middle 4 columns show the connection edge weights $w^{(t)}_{j,i}$ similar to Figure~\ref{fig:gqa_vis_supp}, where the \textcolor{blue}{blue} stars are the sender nodes. The last column shows the attention $\beta_i$ in the single-hop VQA classifier in Sec.~3.2 of the main paper over the $N = 14 \times 14$ feature grid. In these examples, the relevant objects in the question usually first propagate messages within the convolutional grids of the same object (possibly to form an object representation from the CNN features), and then the object of interest tends to collect messages from other context objects. The last two rows show two failure examples on the CLEVR dataset.}
\caption{Additional examples from our LCGN model on the validation split of the CLEVR-Ref+ dataset for REF. The middle 4 columns show the connection edge weights $w^{(t)}_{j,i}$ similar to Figure~\ref{fig:gqa_vis_supp}, where the \textcolor{blue}{blue} stars are the sender nodes. The last column shows the selected target grid location $p$ on the $N = 14 \times 14$ spatial grid (the \textcolor{red}{red} star) in the GroundeR model in Sec.~3.2 of the main paper, along with the ground-truth (\textcolor[rgb]{.7,.7,0}{yellow}) box and the predicted box (\textcolor{red}{red} box from bounding box regression $u$ in GroundeR). In these examples, the objects of interest tend to collect messages from other context objects. The last two rows show two failure examples on the CLEVR-Ref+ dataset.}
\caption{Examples from our LCGN model on the validation split of the GQA dataset for VQA. In the middle 4 columns, each red line shows an edge $j \rightarrow i$ along the message passing paths (among the $N$ detected objects) where the connection edge weight $w^{(t)}_{j,i}$ exceeds a threshold. The \textcolor{blue}{blue} star on each line is the sender node $j$, and the line width corresponds to its connection weight. In the upper example, the person, the elephant and the fence propagate messages to each other, and the fence receives messages from the elephant in $t=4$. In the lower example, the frisbee collects messages from the dog as contextual information in multiple rounds, and is picked up by the single-hop classifier. The \textcolor{red}{red} star (along with the box) in the last column shows the object with the highest single-hop attention $\beta_i$ in Eqn.~\ref{eqn:vqa_in}.}
\caption{Examples from our LCGN model on the validation split of the CLEVR dataset for VQA. The middle 4 columns show the connection edge weights $w^{(t)}_{j,i}$ similar to Figure~\ref{fig:gqa_vis}, where the \textcolor{blue}{blue} stars are the sender nodes. The last column shows the single-hop attention $\beta_i$ in Eqn.~\ref{eqn:vqa_in} over the $N = 14 \times 14$ feature grid. In the upper example, in $t=1$ the matte ball (leftmost) collects messages from the gray metal ball (of the same size), and then in $t=3$ messages are propagated within the convolutional grids on the matte ball, possibly to refine the collected context from the gray ball. In the lower example, in $t=1$ all four balls try to propagate messages within the convolutional grids of each ball region, and in $t=2$ the three other balls (of the same size) receive messages from the rubber ball (leftmost) and are picked up by the single-hop classifier.}
\caption{Examples from our LCGN model on the validation split of the CLEVR-Ref+ dataset for REF. The middle 4 columns show the connection edge weights $w^{(t)}_{j,i}$ similar to Figure~\ref{fig:gqa_vis}, where the \textcolor{blue}{blue} stars are the sender nodes. The last column shows the selected target grid location $p$ on the $N = 14 \times 14$ spatial grid (the \textcolor{red}{red} star) in Eqn.~\ref{eqn:target_select}, along with the ground-truth (\textcolor[rgb]{.7,.7,0}{yellow}) box and the predicted box (\textcolor{red}{red} box from bounding box regression $u$ in Eqn.~\ref{eqn:bbox_reg}). In the upper example, the blue cube (the target object) collects messages from the two other objects in $t=2$, and then the blue cube further collects messages from the big matte green cube on the left (which has the same shape) in $t=3$. In the lower example, the green cube checks for other cubes by collecting messages from things on its right in $t=2$.}
\caption{\label{fig:(a)-Geometry-and}(a) The top view of a single-helix unit cell with subwavelength dimensions. (b) Reflection and transmission coefficients of $y$- and $x$-polarized waves for the infinite planar periodic array of unit cells in (a). (c) Geometry and different view angles of the designed particle composed of four interlaced helices as in (a). (d) Reflection and transmission coefficients of the $y$- and $x$-polarized waves for an infinite planar periodic array of unit cells shown in (c).}
\caption{\label{f2} Proton distribution function, $f(p)p^4$, calculated with Equations (\ref{finj})-(\ref{fN}). Panel (a): $f(p)p^4$ in a $M_{\rm s}=3.2$ shock with $Q_{\rm i,0}=$ 3.0 (blue line), 3.3 (black line), and 3.5 (red line), when the maximum momentum is $p_{\rm max} \gg p_{\rm inj}$. The vertical dashed line shows the injection momentum, $p_{\rm inj}$, with $Q_{\rm i,0}=3.3$. Panels (b)-(d): Change of $f(p)p^4$ in $M_{\rm s}=2.5$, 3.2, and 4.0 shocks with $Q_{\rm i,0}=3.5$, as $p_{\rm max}$ increases. Here, $T_1=10^8$ K. {\color{red}The DSA test-particle slope, $q_{\rm tp}$, is given in each panel.} Due to the energy transfer to the CR component, the temperature reduction factor, $R_{\rm T}$, decreases. Hence, while $p_{\rm inj}$ is fixed, the injection parameter, $Q_{\rm i}=Q_{\rm i,0}/\sqrt{R_{\rm T}}$, increases, leading to the reduction of the normalization factor, $f_{\rm N}$. }
\caption{\textbf{Calculated results of the distorted TBG structure with $\mathbf{\theta = 0.48^{\circ}}$.} \textbf{a}, Schematic model of the moir\'{e} pattern. \textbf{b}, Absolute magnitude of different in-plane atomic displacements (top panel) and out-of-plane displacements for the deformed system along the path MNP defined in \textbf{a}. For the in-plane displacements, the parameters ($\Delta D$, $l_D$, $\sigma_D$) are: A (0.04 nm, 5.91 nm, 1.14), B (0.07 nm, 5.91 nm, 1.14), C (0.07 nm, 3.70 nm, 1.14), D (0.07 nm, 3.70 nm, 11.44). For the out-of-plane displacements, the parameters ($\Delta Z$, $l_Z$, $\sigma_Z$) are: E (0.0115 nm, 4.93 nm, 0.7), F (0.023 nm, 4.93 nm, 0.7), G (0.0058 nm, 4.93 nm, 0.7). \textcolor{red}{\textbf{c} and \textbf{d}, Maps of the absolute magnitude of the in-plane atomic displacement $|\Delta d|$ and of the out-of-plane displacement $\Delta z$, respectively, upon deformation of rigid TBG, calculated from Eq.~(\ref{deform}). } \textbf{e} and \textbf{f}, Local density of states in the AA region of deformed systems with different in-plane and out-of-plane displacements, respectively.}
\caption{Average class-wise membership scores for the (a) Stanford Dogs and (b) Caltech 256 datasets using the features of the Alexnet \cite{alexnet} and VGG16 \cite{VGG} networks. The first $60$ and $128$ categories of the datasets in (a) and (b), respectively, form the known class set, while the remaining categories form the novel classes. For both datasets, the class-wise membership scores of the known classes are in general higher than those of the novel classes, which supports the effectiveness of our proposed method.}
\caption{Classification accuracy percentages. The results of other networks are taken from \citealt{Cangea2018TowardsClassifiers} with which we share 10-fold splits for benchmarking our methods. Bold indicates top-performance, \textcolor{table_colour}{blue} indicates weaker performance than the \textsc{mlp}.}
\caption{Performance comparison on the reprocessed Color Checker dataset. The best three results are shown in {\color{red}{\textbf{red}}}, {\color{green}{\textbf{green}}}, and {\color{blue}{\textbf{blue}}}, respectively.}
\caption{Performance comparison on the NUS 8-camera dataset. The best three results are shown in {\color{red}{\textbf{red}}}, {\color{green}{\textbf{green}}}, and {\color{blue}{\textbf{blue}}}, respectively.}
\caption{Four nearest neighbors of representative names in Twitter embedding space, showing how they preserve gender and ethnicity associations. Notes: {\color{e5} Asian} (Chinese, Korean, Japanese, Vietnamese), {\color{e1} British}, {\color{e2}European} (Spanish, Italian), {\color{e3}Middle Eastern} (Arabic, Hebrew), {\color{e4}North American} (African-American, Native American, Contemporary).}
\caption{Follower count distributions of \textit{seed users}' followers and followees. We characterize Twitter users with $r$, the ratio of follower over followee count. \textcolor{green}{Celebrity}: $r>10$. \textcolor{blue}{Ordinary}: $r\leq10$. More celebrities among followees. Homophily between fans and celebrities is not as strong as that between families and friends. So \textit{Followee*} removes names of celebrities to strengthen homophily among followee lists.}
\caption{Chinese-English translation examples of Transformer decoding in the left-to-right (L2R) and right-to-left (R2L) directions, and of our proposed models. L2R performs well in \textcolor{red}{\protect\dashuline{the first half of the sentence}}, whereas R2L translates well in \textcolor{blue}{\uwave{the second half of the sentence}.} }
\caption{Steady state deformation under $50\%$ extension for (a) $lvl_1$, and (b) $lvl_4$ discretizations. Nodes in the bottom surface are fixed and nodes at the top surface undergo a displacement in the $z$-axis, $u_z = -0.05m$ $\big(\bigcirc$ CME, $\blacksquare$ MMLS-SEBCIEM, $\bm{\times}$ FEM$\big)$.}
\label{fig:cube_extension}
\end{figure}

The $\Delta t_{crit}$ for CME is found to be higher than that for MMLS-SEBCIEM for all the parameter sets. The $\Delta t_{crit}$ gain is reduced for increasing $s$ and $a$; with increasing $s$ the gradient becomes steeper, leading to larger eigenvalues in the spatial derivatives matrix ($\leftidx{^t_0}{\bm{B}}_{L0}^t$). The CME performs better in terms of computational efficiency, since it can reach steady state with fewer time steps than the MMLS-SEBCIEM.

\begin{table}[htbp]
\caption{Simulation characteristics (critical explicit integration time step - $\Delta t_{crit}$, number of execution steps - $N_{exe}$, execution time - $t_{exe}$) for cube under $50\%$ uniaxial extension $\&$ compression for $lvl_1$-$lvl_6$ discretization.}
\centering
\begin{threeparttable}
\begin{tabular}{c c c c c c}
\toprule
Approx. & $s$ \tnote{$\dagger$} & $a$ \tnote{$\ddagger$} & $\Delta t_{crit}$ (ms) & $N_{exe}$ & $t_{exe}$ (s) \\
\midrule
\multicolumn{6}{c}{Extension} \\
\midrule
\multirow{6}{*}{CME} & \multirow{3}{*}{2} & 0.2 & 2.173 - 0.260 & 2963 - 10455 & 1.4 - 3173.5 \\
 & & 1.6 & 1.972 - 0.243 & 2814 - 12707 & 1.6 - 3906.9 \\
 & & 2.0 & 1.918 - 0.238 & 3341 - 13122 & 1.4 - 3911.0 \\
 & \multirow{3}{*}{4} & 0.2 & 1.527 - 0.205 & 3396 - 16582 & 1.9 - 4652.4 \\
 & & 1.6 & 1.230 - 0.176 & 3529 - 20900 & 1.8 - 5711.7 \\
 & & 2.0 & 1.216 - 0.174 & 3509 - 21167 & 1.9 - 6070.6 \\
MMLS-SEBCIEM & 1.6 & 0.2 & 0.902 - 0.192 & 3208 - 16827 & 1.4 - 4603.5 \\
\midrule
\multicolumn{6}{c}{Compression} \\
\midrule
\multirow{6}{*}{CME} & \multirow{3}{*}{2} & 0.2 & 2.173 - 0.260 & 3172 - 10442 & 1.7 - 3184.9 \\
 & & 1.6 & 1.972 - 0.243 & 3093 - 17288 & 1.5 - 4630.3 \\
 & & 2.0 & 1.918 - 0.238 & 3117 - 16501 & 1.5 - 4414.7 \\
 & \multirow{3}{*}{4} & 0.2 & 1.527 - 0.205 & 3279 - 31402 & 1.7 - 7231.9 \\
 & & 1.6 & 1.230 - 0.176 & 3664 - 24885 & 2.0 - 6297.4 \\
 & & 2.0 & 1.216 - 0.174 & 3735 - 18563 & 1.7 - 5557.1 \\
MMLS-SEBCIEM & 1.6 & 0.2 & 0.902 - 0.192 & 3975 - 18315 & 1.8 - 5710.6 \\
\bottomrule
\end{tabular}
\begin{tablenotes}
\item [$\dagger$] CME smoothness modulation factor
\item [$\ddagger$] CME R-function zero set shape factor or MMLS dilatation coefficient
\end{tablenotes}
\end{threeparttable}
\label{tab:cube_uniaxial_results}
\end{table}

An analytical solution is not available for the given problem. Therefore, to evaluate the convergence and accuracy of the method we use the Signed Relative Error in strain energy density ($SRE_W$) as a convergence measure, similar to \cite{Arroyo2006}. We compute the $SRE_W$ for the $lvl_1$ to $lvl_5$ discretizations with respect to the $lvl_6$ (finest) discretization (Figure \ref{fig:cube_uniaxial_convergence}).
The $SRE_W$ is obtained by:
\begin{equation}
SRE_W = \frac{\Bar{W}_{lvl_i} - \Bar{W}_{lvl_6}}{\Bar{W}_{lvl_6}}
\end{equation}
where $\Bar{W}_{lvl_i}$ is the mean value of the strain energy density for the $lvl_i$ discretization, $i=1,\ldots,5$. The $SRE_W$ for CME is comparable with the $SRE_W$ for MMLS-SEBCIEM. While the MMLS-SEBCIEM leads to lower $SRE_W$ for higher nodal density ($lvl_3$-$lvl_5$), the CME results in lower $SRE_W$ for lower nodal density ($lvl_1$ and $lvl_2$).
\begin{figure}[htbp]
\centering
\includegraphics[width=\textwidth]{figure6.jpg}
\caption{Signed relative error in strain energy density $SRE_W$ for a cube at: (a) $50\%$ uniaxial extension, (b) $50\%$ uniaxial compression.}
\label{fig:cube_uniaxial_convergence}
\end{figure}
For additional comparison, the $SRE_W$ for the $lvl_1$ to $lvl_5$ discretizations of a Finite Element Method (FEM) simulation is given. FEM simulations are performed with the FEBio v$2.5.2$ software \cite{Maas2012} using isoparametric tetrahedral elements and the Newton-Raphson implicit time integration method. Linear tetrahedral elements are known for being prone to volumetric ``locking''. For this reason, FEBio implements a nodally integrated tetrahedron with enhanced performance for finite deformation and near-incompressibility compared to the standard constant strain tetrahedron \cite{Puso2006}. The $SRE_W$ for FEM simulations is found to be an order of magnitude higher than that for CME in $50\%$ extension. Similar results are obtained for $50\%$ compression. The use of CME in MTLED leads to higher $\Delta t_{crit}$ and lower $t_{exe}$ compared to the MMLS-SEBCIEM (Table \ref{tab:cube_uniaxial_results}). The $SRE_W$ in $50\%$ compression is found to be lower for CME than for MMLS-SEBCIEM and FEM at all discretization levels (Figure \ref{fig:cube_uniaxial_convergence}).
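
As a rough check of the efficiency comparison, the entries of Table \ref{tab:cube_uniaxial_results} can be compared directly. The following is only a sketch under two assumptions that are not stated in the table: that the last value of each reported range belongs to the finest ($lvl_6$) discretization, and that the CME rows with $s=2$, $a=2.0$ are the ones of interest. The relative reduction in $t_{exe}$ of CME with respect to MMLS-SEBCIEM is then
\[
\begin{aligned}
\text{extension:}\quad & \frac{4603.5 - 3911.0}{4603.5} = \frac{692.5}{4603.5} \approx 15.0\%, \\
\text{compression:}\quad & \frac{5710.6 - 4414.7}{5710.6} = \frac{1295.9}{5710.6} \approx 22.7\%,
\end{aligned}
\]
while the ratio of critical time steps, CME over MMLS-SEBCIEM, is
\[
\frac{1.918}{0.902} \approx 2.13 \ \text{(first value of the range)}, \qquad \frac{0.238}{0.192} \approx 1.24 \ \text{(last value of the range)}.
\]
Under these assumptions the time-step advantage of CME narrows as the discretization is refined, while the absolute saving in execution time is largest for the densest discretization.
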
The set-up $N_R = 2$, $s=2$, $m=3$, $a=2.0$, $k=2$ for CME is found to be a good trade-off between accuracy and efficiency in MTLED for both extension and compression (Table \ref{tab:cube_uniaxial_results}). The use of CME instead of the MMLS-SEBCIEM in MTLED eliminates the need for EBC correction and reduces the execution time. The execution time reduction is more evident for dense nodal discretizations (up to $1295.9$ s, or $22.7\%$). Moreover, the evaluation of the $SRE_W$ demonstrates the improved accuracy of the MTLED method over the FEM for large strain problems for both $50\%$ extension and compression conditions. Especially for severe distortion, such as in the $50\%$ compression case, the FEM simulation leads to poor results for coarse nodal discretization compared to MTLED using either CME or MMLS-SEBCIEM approximants (Figure \ref{fig:cube_compression}).
\begin{figure}[htbp]
\centering
\includegraphics[width=\textwidth]{figure7.jpg}
\caption{Steady state deformation under $50\%$ compression for (a) $lvl_1$, and (b) $lvl_4$ discretizations. Nodes in the bottom surface are fixed and nodes at the top surface undergo a displacement in the $z$-axis, $u_z = 0.05m$ $\big(\bigcirc$ CME, $\blacksquare$ MMLS-SEBCIEM, $\bm{\times}$ FEM$\big)$.}
\caption{Phase diagram of $\nu_{\text{PHS}^\dagger}$ [Eq.\ref{nu3}] for the \blue{PH$^\dagger$-symmetric SSH model} with fixed $\gamma_y=0.5$ and $t_2=1$. }
\caption{Spectrum and SR winding behavior of the \blue{PH$^\dagger$-symmetric SSH model}. In (a-d), gray and dark blue curves indicate the PBC and OBC spectra respectively, \CH{with the PBC spectra bearing no direct influence on the presence and position of topological modes.} In (e-h), the red loops are the SR, and the light blue ones are the $\bar{\bm d}$-loops, corresponding to (a-d) respectively. \blue{The blue dots indicate the high-symmetric points, which fall onto the $\bar{d}_y$-axis after normalizing the SR. (e) $\nu_{\text{PHS}^\dagger}=0$ when the two high-symmetric points are inside the SR. (f,h) When the two high-symmetric points are outside the SR, $\nu_{\text{PHS}^\dagger}$ takes $-1$ or $1$, depending on whether the two points are separated by the SR along $\bar{d}_y$-axis or not.} The parameters are \CH{$g_y=0.5$} and $t_2=1$, with (a) $t_1=0.2$, $g_z=1$; (b) $t_1=0.2$, $g_z=0.7$; (c) $t_1=1$, $g_z=0.7$; and (d) $t_1=1.8$, $t_2=0.7$. }
\caption{Spectrum and SR winding visualizations of the \CH{PH-symmetric} non-Hermitian \CH{extended} Kitaev model, \CH{with cases (a-c) being topologically trivial, gapless and nontrivial respectively}. In (a-c), gray and dark blue curves indicate the PBC and OBC spectra respectively. In (d-f), the red loops are the SR, and in light blue are the ${\bm d(k)}$-loops, corresponding to (a-c) respectively. The gray areas in (d-f) are the planes containing the SR. The blue dots correspond to the high-symmetric points, \blue{and the yellow ones indicate where the ${\bm d(k)}$-loops penetrate the SR-plane.} The parameters are $\gamma_z=0.5$, $\phi=\pi/2$, $\mu=1$, $t_1=\Delta_1=0.5$, with (a) $t_2=\Delta_2=0.5$; (b) $t_2=\Delta_2=0.711$; (c) $t_2=\Delta_2=0.9$.}
\caption{A real-world application of \method based on the dataset provided by the State Grid of China. \small In the figure, (a), (b), (c) and (d) present four different types of time series of watt-hour meters' clock error. The upper figure shows the raw curve of the clock error and the marks of the genes, while the lower figure visualizes the behavior sequence $\mathcal{A}$ of genes computed by \methodraw. These figures illustrate four different evolution modes (i.e., monotonous, repaired, fluctuating, placid) of watt-hour meters. (e) shows the average segments generated from four different genes. \normalsize }
\caption{Top related images to the nodes. These nodes are related to: \textcolor{colornode05}{\newmoon} cereal, \textcolor{colornode08}{\newmoon} pan, \textcolor{colornode09}{\newmoon} eggs, \textcolor{colornode01}{\newmoon} sandwich, \textcolor{colornode04}{\newmoon} kettle, and \textcolor{colornode06}{\newmoon} foodbox.}
\caption{Average number of hydrogen bonds per OH group in glycerol as a function of the confinement concentration, $\rho_{\text{conf}}$, in slit nanopores of different confinement length $l_{z}$. The results are normalized by the value in the bulk $\left< n_{\textrm{HB/OH,bulk}} \right>$. Dashed lines are a guide for the eye.} \label{nhb_ratios} \end{center} \end{figure} The trend observed in Fig.\ \ref{nhb_ratios} unveils the remarkable effect of the degree of confinement on the disruption of the hydrogen-bonding network of glycerol. At $\rho_{\text{conf}}=\rho_{\text{bulk}}$ such a disruption is not especially significant, with $n^*_\text{HB/OH} > 0.9$. In this case, the reduction in $n^*_\text{HB/OH}$ is caused exclusively by the presence of the $\gamma$-Al$_2$O$_3$ surfaces. Although this result is consistent with the suggestion by D'Agostino et al. \cite{d2012}, the reduction of glycerol-glycerol hydrogen bonds is partially compensated by the presence of hydrogen bonds between glycerol and the hydroxyl groups on the surface of $\gamma$-Al$_2$O$_3$. In fact, it was observed that all the glycerol molecules lying in the contact layer have their three OH groups pointing to the surface and forming hydrogen bonds with the $\gamma$-Al$_2$O$_3$ hydroxyl groups. Despite the orientational restrictions imposed by the solid surface, it is important to note that the number of glycerol-glycerol hydrogen bonds would be smaller if there were no glycerol-glycerol hydrogen bonds in the contact layer, as this would lead to $n^*_\text{HB/OH} \approx 0.85$, $0.77$, and $0.54$ for the pores of $60$, $40$, and $20$ $\textup{\AA}$, respectively. Therefore, some of the molecules in the contact layer are forming hydrogen bonds with other glycerol molecules. The disruption becomes more relevant at lower confinement concentrations, especially if the confinement length is particularly small ($l_z = 20$ $\textup{\AA}$).
In other words, if the pore is not fully saturated, the average number of hydrogen bonds per OH group in liquid glycerol significantly decreases, due to the formation of a low-density cavity or two separated liquid films that create a vacuum-liquid interface. The minimum in $n^*_\text{HB/OH}$ is found at the lowest glycerol concentration due to the large interfacial area between the liquid films and vacuum. Thus, if there is an effect of the hydrogen-bond network on the dynamics of glycerol, then it must be reflected in the dependence of the glycerol diffusivity on the degree of pore saturation. To test the validity of this hypothesis, we have calculated the reduced self-diffusion coefficient of glycerol, $D^* \equiv D_\text{conf}/D_\text{bulk}$, defined as the ratio between the self-diffusivity in the pore and the self-diffusivity in the unconstrained liquid bulk. However, due to the intrinsic anisotropy and spatial inhomogeneity in the confined fluid, neither the standard three-dimensional Einstein relation nor the Kubo relation over the whole confined volume would be suitable approaches \cite{liu2004}. 
Therefore, $D_\text{conf}$ has been calculated as the density-weighted average of the local parallel diffusion coefficient in finite slabs centered at different $z$ positions in the simulation box:
\begin{equation}
D_\text{conf} \equiv \left< D_{||}(z)\right> = \frac{\int_{l_{z}}{D_{||}(z) \rho (z) dz} }{\int_{l_{z}}{\rho(z) dz}}
\end{equation}
In particular, the two-dimensional parallel component of the diffusion coefficient in a given slab $\delta_z$, defined by the spatial interval ($z$,$z+dz$), reads
\begin{equation}
D_{||}\left( \delta_z\right) = \lim_{\tau \rightarrow \infty}{\frac{\left< |\boldsymbol{r}_{xy}\left(t + \tau \right) - \boldsymbol{r}_{xy} \left( t\right) |^{2} \right>_{\delta_z}}{4\tau P_{\delta_z}\left( \tau \right)}}
\end{equation}
where $\boldsymbol{r}_{xy}(t)$ is the two-dimensional position vector of a glycerol molecule in slab $\delta_z$ at time $t$ and $P_{\delta_z}\left(\tau\right)$ is the survival probability for particles to remain in that slab (see Ref. \cite{liu2004}). The brackets stand for an ensemble average over sample molecules and time origins $t$. Earlier computational studies have demonstrated that the diffusion coefficient calculated by averaging the translational motion along the $z$ direction, $D_\perp$, would provide the same qualitative results \cite{liu2004, mittal2008}. Consequently, even if the global effective three-dimensional diffusion coefficients could be somehow obtained without a decoupling in the translational degrees of freedom, they would (in principle) follow the same trend and be quantitatively similar to those obtained with Eq. 4. The resulting dependence of $D^*$ on the confinement concentration is reported in Fig.\ \ref{dif_ratios} for different confinement lengths.
\begin{figure}[h] \begin{center} \includegraphics[scale=0.33]{dif_alt.pdf} \caption{Average self-diffusion coefficient of glycerol in slit nanopores of size $l_{z}$ at different confinement concentrations $\rho _{\textrm{conf}}$.
The values are normalized with the diffusion coefficient in the isotropic bulk fluid $D_{\textrm{bulk}}$. Dashed lines are added as a guide for the eye.} \label{dif_ratios} \end{center} \end{figure}
It is interesting to note that at $\rho_\text{conf}/\rho_\text{bulk}=1$ (pore completely saturated), $D^*<1$ for any $l_z$, suggesting a slower diffusion in the pores than in the bulk. By contrast, in pores that are partially saturated ($\rho_\text{conf}/\rho_\text{bulk}<1$), the reduced self-diffusion coefficient increases significantly, especially so at $l_z \ge 40$ $\textup{\AA}$. As soon as a fully developed vacuum-liquid interface forms, $D^*$ decreases again, indicating that the disruption of hydrogen bonds in the liquid glycerol cannot be the only factor determining its dynamics. To unveil the missing piece of this puzzle, one should consider the spatial correlations discussed above and reported in Figs.\ \ref{dens_all} and \ref{maps}. These correlations determine the diffusion profile along the $z$ direction that is shown in Fig.\ \ref{dif_latz}, where we report the local translational diffusion coefficients obtained via Eq. 5.
\begin{figure}[h] \begin{center} \includegraphics[scale=0.33]{difz_joined.pdf} \caption{Parallel component of the reduced self-diffusion coefficient as a function of the $z$ position for glycerol in slit $\gamma$-alumina nanopores for different $l_{z}$ and confinement concentrations. The local values are normalized by the diffusion coefficient in the isotropic bulk fluid $D_{\textrm{bulk}}$. Solid lines are added as a guide for the eye.} \label{dif_latz} \end{center} \end{figure}
Each frame refers to a different pore size and, within each frame, we report the dependence of the parallel self-diffusivity on the position for different degrees of pore saturation.
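
The link between these local profiles and the global $D^*$ can be made explicit by writing the density-weighted average that defines $D_\text{conf}$ in discrete form. As a sketch, assuming the pore is divided into $N_s$ slabs of equal width centred at positions $z_k$ (both $N_s$ and $z_k$ are notation introduced here only for illustration), the integrals reduce to sums:
\[
D_\text{conf} \approx \frac{\sum_{k=1}^{N_s} D_{||}(z_k)\,\rho(z_k)}{\sum_{k=1}^{N_s} \rho(z_k)}, \qquad D^* = \frac{D_\text{conf}}{D_\text{bulk}} .
\]
Slabs in which $\rho(z_k)=0$ carry zero weight, so $D^*$ is controlled by the regions where the molecules actually reside, and dense layers weigh more than dilute ones.
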
One can appreciate that the self-diffusion coefficient at the solid-liquid interface is especially low, with $D_{||}(z)/D_\text{bulk} \ll 1$, but it increases significantly with increasing distance from the solid support. Additionally, at relatively low confinement concentrations, when a vacuum-liquid interface forms, no glycerol molecules are found within the vacuum region and therefore the self-diffusion coefficients are practically meaningless there. Pores that are completely filled show a similar behaviour, but the glycerol self-diffusion coefficient there remains lower than that in the bulk across the whole pore volume. The case of very small pores ($l_z=20$ $\textup{\AA}$) is especially interesting. While $D^*$ in Fig.\ \ref{dif_ratios} would indicate a reduced mobility as compared to that in the bulk, a deeper analysis reveals that this strictly depends on how far the glycerol molecules are from the $\gamma$-Al$_2$O$_3$ support (Fig.\ \ref{dif_latz}). The nucleation of the low-density cavity and the subsequent formation of a vacuum-liquid interface lead to the disruption of the hydrogen-bond network and to a significant increase of the glycerol mobility, which we do not observe in fully saturated pores, where a vacuum-liquid interface is not present. Finally, to better understand the effects of the interfaces on the glycerol dynamics, we have assessed the time fluctuations in the hydrogen-bonding network by evaluating the intermittent hydrogen-bond correlation function, $\mathcal{C}_\text{HB}$, first introduced by Rapaport in the 1980s \cite{rapaport1983}:
\begin{equation}
\mathcal{C}_{\text{HB}}\left(\tau\right)= \frac{\left<h_{ij}\left(t + \tau\right)\cdot h_{ij} \left(t\right)\right>}{\left<h_{ij} \right>}
\end{equation}
where $h_{ij}\left(t\right)$ is the binary hydrogen bond population operator, which equals unity if the hydroxyl groups $i$ and $j$ are hydrogen-bonded at time $t$ and zero otherwise.
In particular, $\mathcal{C}_{\textrm{HB}}\left(\tau\right)$ represents the conditional probability that the hydrogen bond between hydroxyls $i$ and $j$, observed at time $t$, still exists at time $t+\tau$, regardless of whether bond-breaking events might have occurred in the meantime. A fast decay of $\mathcal{C}_{\textrm{HB}}$ thus indicates a relatively short average lifetime of a hydrogen bond. In Fig.\ \ref{hb_cor}, we present the decay of the correlation function for different pore sizes at all confinement concentrations. For comparison, we also include the case of the unconstrained bulk liquid. All the curves exhibit a similar time dependence, which can be approximated by an exponential decay. In addition, at $l_{z} \ge 40$ $\textup{\AA}$ and $\rho_{\text{conf}}=1.0$ g cm$^{-3}$, which provide the highest self-diffusivities in Fig.\ \ref{dif_ratios}, $\mathcal{C}_{\text{HB}}\left(\tau\right)$ displays a faster decay than at any other confinement concentration, including the case of the bulk fluid. By contrast, in smaller pores, the correlation decays more slowly than in the bulk at all times, confirming the above observations. We also calculated the correlation function for the glycerol-surface hydrogen bond kinetics and found that the curves do not decay to zero within the whole simulated time. This is consistent with the observed suppression of the diffusion of glycerol adsorbed on the crystal surface (Fig.\ \ref{dif_latz}).
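
Two properties of $\mathcal{C}_\text{HB}$ follow directly from its definition and may help in reading the curves. Because $h_{ij}$ takes only the values $0$ and $1$, $h_{ij}^2 = h_{ij}$, and therefore
\[
\mathcal{C}_\text{HB}(0) = \frac{\left< h_{ij}(t)^2 \right>}{\left< h_{ij} \right>} = 1 .
\]
If, as a simplifying assumption, the decay is taken to be a single exponential with relaxation time $\tau_r$ (a symbol introduced here only for illustration), then
\[
\mathcal{C}_\text{HB}(\tau) \approx e^{-\tau/\tau_r}, \qquad \tau_r = \int_0^\infty \mathcal{C}_\text{HB}(\tau)\, d\tau ,
\]
so that a faster decay corresponds to a smaller $\tau_r$, that is, to a smaller area under the curve.
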
\begin{figure}[h]
\begin{center}
\includegraphics[scale=0.33]{hb_joined.pdf}
\caption{Intermittent hydrogen-bonding correlation function in glycerol in the bulk state and under confinement.}
\label{hb_cor}
\end{center}
\end{figure}

Finally, the average hydrogen bond lifetime, $\tau_{\textrm{HB}}$, was estimated from the reactive flux hydrogen-bond correlation function, $k(\tau)$, which is the negative time derivative of $\mathcal{C}_{\textrm{HB}}\left(\tau\right)$:
\begin{equation}
k(\tau) = - \frac{d\mathcal{C}_{\text{HB}}\left(\tau\right)}{d\tau}
\end{equation}
%
and by treating the hydrogen-bond breaking and re-formation as a reversible chemical reaction with well-defined rate constants \cite{luzar1996nature}. The dependence of $\tau_{\text{HB}}$, normalized by the lifetime in the bulk, on the confinement concentration is reported in Fig.\ \ref{tau_ratios} for different confinement lengths. Overall, the results agree with those of Figs.\ \ref{dif_ratios} and \ref{hb_cor}. In particular, for glycerol confined in pores with $l_{z} \ge 40$ $\textup{\AA}$, the minimum in the ratio $\tau^{*}_{\text{HB}} = \tau_{\text{HB}}/\tau_{\text{HB,bulk}}$ occurs at the $\rho_{\text{conf}}$ where $D^*$ reaches its maximum value.
\begin{figure}[h]
\begin{center}
\includegraphics[scale=0.33]{tau_alt.pdf}
\caption{Lifetime of the hydrogen bonds between pairs of glycerol molecules in slits of size $l_{z}$ at confinement density $\rho_{\textrm{conf}}$. The values are normalized with the lifetime in the isotropic bulk fluid, $\tau_{\textrm{HB,bulk}}$. Dashed lines are added as a guide for the eye.}
\label{tau_ratios}
\end{center}
\end{figure}

\section{Conclusions}
We have investigated the structure and dynamics of glycerol confined in $\gamma$-Al$_{2}$O$_{3}$ slit nanopores using atomistic MD simulations. As observed for simple liquids, the confinement of glycerol within hard geometric boundaries results in a spatially inhomogeneous molecular distribution.
More specifically, in the vicinity of the solid surface, changes from a liquid-like to a crystal-like structure are promoted, due in part to the formation of hydrogen bonds with the hydroxyl groups of the $\gamma$-Al$_{2}$O$_{3}$ surface. The dynamics of glycerol in fully saturated pores is primarily affected by the reduction of mobility at the solid-liquid interface, which is the result of a higher density and structural order in the contact layer than in the bulk liquid and of the presence of hydrogen bonds between molecules in the contact layer and hydroxyl groups at the surface. Pores that are not fully saturated exhibit a cavity, approximately located in their centre, that does not contain glycerol and causes a disruption of the hydrogen-bond network within the liquid, which in turn enhances the glycerol self-diffusion. The competition between the strong structural order at the solid-liquid interface and the disruption of hydrogen bonds far from it leads to an overall self-diffusion that is faster than that in the bulk liquid. In particular, we found a significant enhancement in the self-diffusion of nanoconfined glycerol for pore sizes $l_{z} \geq 40$ $\textup{\AA}$ and confinement concentrations $\rho_{\textrm{conf}} \leq 1.0$ g cm$^{-3}$. The present study provides a fundamental guideline for understanding recent experimental observations on the dynamics of glycerol in confined media \cite{d2012}, which is consistent with the dynamics of glycerol observed in thin films \cite{capponi2010} and highlights the importance of liquid-gas interfaces in the dynamics of confined viscous fluids.

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%% The "Acknowledgement" section can be given in all manuscript
%% classes. This should be given within the "acknowledgement"
%% environment, which will make the correct section or running title.
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\section*{Conflicts of interest}
There are no conflicts to declare.\\
\section*{Acknowledgements}
The project leading to these results has received funding from the European Union's Horizon 2020 research and innovation programme under the Marie Sk\l{}odowska-Curie grant agreement No 676045 (MULTIMAT). The authors acknowledge the assistance given by IT Services and the use of the Computational Shared Facility at the University of Manchester.\\
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%% The appropriate \bibliography command should be placed here.
%% Notice that the class file automatically sets \bibliographystyle
%% and also names the section correctly.
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\bibliography{campos}
\bibliographystyle{rsc.bst}
%%%END OF MAIN TEXT%%%
%The \balance command can be used to balance the columns on the final page if desired. It should be placed anywhere within the first column of the last page.
\balance
%If notes are included in your references you can change the title from 'References' to 'Notes and references' using the following command:
%\renewcommand\refname{Notes and references}
%%%REFERENCES%%%
%\bibliography{rsc} %You need to replace "rsc" on this line with the name of your .bib file
%\bibliographystyle{rsc} %the RSC's .bst file
\end{document}
}\end{equation}}}\end{equation}}
\caption{Case studies used in this paper, sorted by number of variables. Medium-sized problems are highlighted with {\color{blue} blue rows}, while the large ones are in {\color{orange} orange rows}. Three items (marked with *) are not included in some further reports (see text for details).}
\caption{The $p_T$ spectra for pions and protons in $pp$ collisions at $\sqrt{s}=$ 62.4, 200 and 900 GeV at midrapidity. Data are taken from \textcolor{blue}{Ref.~\cite{all-16}}. Each spectrum is fitted with all the six functions of Eqs.(\ref{functions}) in the $p_T$ range as shown in Table \ref{tab2}. Ratios of the net fits to data are also shown in each lower panel.}
\caption{The $p_T$ spectra for pions and protons in $pp$ collisions at $\sqrt{s}=$ 2.76, 5.02 and 7 TeV at midrapidity. Data are taken from Refs.~\cite{all-10-1,all-32}. Each spectrum is fitted with all the six functions of Eqs.(\ref{functions}) in the $p_T$ range as shown in Table \ref{tab2}. Ratios of the net fits to data are also shown in each lower panel. \textcolor{blue}{Note that as mentioned in Ref.~\cite{all-10-1}, the data for $pp$ collisions at 5.02 TeV are generated by interpolating the data measured at 2.76 TeV and 7 TeV.}}
\caption{PDFs of $u_\omega$ (Left), $u_y$ (Mid) and $u_x$ (Right) at the node 325, the red line \,\quad \textcolor{red}{\rlap{\rule[0.5ex]{0.5cm}{1pt}}{$\Delta$}\llap{\rule[0.5ex]{0.5cm}{1pt}}} \,\quad and the blue line \textcolor{blue}{\rule[0.5ex]{0.8cm}{1pt}} denote the computed PDFs obtained by the proposed algorithm and the reference PDFs obtained from $1 \times 10^5$ Monte Carlo simulations, respectively}
\caption{ \sl \small Summary of the LQ models which can accommodate $R_{K^{(\ast)}}$ (first column), $R_{D^{(\ast)}}$ (second column), and both $R_{K^{(\ast)}}$ and $R_{D^{(\ast)}}$ (third column) without inducing other phenomenological problems. The \textcolor{red}{\xmark}$^{\textcolor{blue}{\ast}}$ symbol means that the discrepancy can be alleviated, but not fully accommodated.}
\caption{Lyndon representatives of 5 ordered classes of asymptotically stable stationary $4$-periodic solutions of \eqref{eq:lde}: \protect\includegraphics[width=2mm]{pic_sym_triangle_down.png} - $u_\mathfrak{0}$, \protect\includegraphics[width=2mm]{pic_sym_rectangle.png} - $u_\mathfrak{0001}$, \protect\includegraphics[width=2mm]{pic_sym_triangle_right.png} - $u_\mathfrak{0011}$, \protect\includegraphics[width=2mm]{pic_sym_circle.png} - $u_\mathfrak{0111}$, \protect\includegraphics[width=2mm]{pic_sym_triangle_up.png} - $u_\mathfrak{1}$ (the values are slightly modified for better visualisation). }
\caption[These plots depict the mean upper bounds (over $1,\!000$ trials) for various distributions (the titles on the plots describe the distribution) and using various methods. All figures share the following legend: Rendering Error]{These plots depict the mean upper bounds (over $1,\!000$ trials) for various distributions (the titles on the plots describe the distribution) and using various methods. All figures share the following legend: \includegraphics[width=\textwidth]{figs/Legend.png} }
\caption{Model generated from CART. \texttt{ret}, \texttt{lo} and \texttt{sh-synr} are node names. Branches leading to spikes (with service response time $>$470ms/query) are \textcolor{red!70}{highlighted in red}. }
\caption{Pit viper \cite{web2} ``sees'' a thermal image of its warm-blooded prey (simulated here with Gimp\textregistered{ }tool effects).}
\caption{An example of activated joints of all streams for the baseline model, the RA-GCN with 2 streams and 3 streams. The {\color{red}red} points denote the activated joints, while the {\color{blue}blue} points denote the unactivated joints. Best viewed in color.}
\caption{Example synthesis procedure text from a materials journal article \citep{dong2009beta}. \textcolor{Red}{\textbf{Bold red}} indicates the operations (predicates) involved in the synthesis; \textbf{bold black} indicates arguments; \underline{underlines} demarcate entity boundaries.}
\caption{PSD heat-maps of theta (\textcolor{red}{red}), alpha (\textcolor{green}{green}), and beta (\textcolor{blue}{blue}) EEG bands being added according to respective color-bar range to get combined RGB heat-map (Image border, nose, ears, and color-bars have been added for visualization only.)}
\caption{For a trial from DEAP dataset, PPG signal with peaks (in \textcolor{red}{red}) being detected for the calculation of RRs and HRV (above), and PPG spectrogram (below).}
\caption{Detected face (marked in \textcolor{red}{red}) and face localized points (marked in \textcolor{green}{green}) in DEAP Dataset (left), AMIGOS Dataset (center), and subset of features (marked in \textcolor{yellow}{yellow}) computed using face localized points (right). The features are normalized using height (H) and width (W) of the detected face. These subjects' consent to use their face is marked in respective datasets.}
\caption{COMSOL Multiphysics\textregistered\ simulations of the electric potential produced by a cylindrical electrode on a diamond substrate. The substrate is 10~$\mu$m deep in the $z$ direction and extends infinitely in the $x$ and $y$ directions. Depicted here is a cross-section of the substrate in the $x$-$z$ half-plane ($y=0$) with the electrode centered at the origin (note that the simulation has rotational symmetry). A potential of $+1$ V has been applied to the electrode which has a height and radius of 1 $\mu$m. The electrostatic potential has an approximately wire-like geometry with a longitudinal axis extending into the substrate.}
\caption{ (a) Feature space of 6 sessions projected in 2-D latent space, point anomalies are marked in \textcolor{red}{circles} (b) Known Group membership of features per session \textcolor{red}{Attacker} session, \textcolor{blue}{normal/user} session. (c) Latent distribution $Z$ of 6 sessions, \textcolor{red}{Attacker} sessions and \textcolor{blue}{normal/user} session.}
\caption{How do we recognize the actions performed in the frame shown in (a)? We argue that there are two types of vital cues for video understanding: (1) external commonsense \emph{semantic relationships of labels}, such as linguistic similarity or co-occurrence and (2) \emph{visual spatio-temporal contextual cues}, such as human-object interactions. In this work, we perform representation learning on a hybrid symbolic (b) and visual (c) graph to leverage both types of cues. Visual node types are \emph{actor} (shown in \textcolor{green}{green}) and \emph{object} (shown in \textcolor{blue}{blue}). Edge types include \emph{object-to-actor spatial} (shown in \textcolor{magenta}{magenta}) and \emph{actor-to-actor temporal} (shown in \textcolor{orange}{orange}). }
\caption{Correlations between semantic representation accuracy and the rate or distortion. \textbf{left:} Rate versus accuracy for models marked by decoder class: PixelCNN (\textcolor{blue}{$\bullet$}), CNN (\textcolor[rgb]{1.0, 0.5, 0.0}{$\blacksquare$}), and Dueling Decoder (\textcolor{red}{$\blacktriangle$}). Filled or empty markers denote $\beta=1.0$ or $\beta=0.1$ respectively. The vertical dashed line indicates the 10 nat threshold for low-rate. \textbf{right:} Distortion versus accuracy for the low-rate models. Best viewed in color.}
\caption{The relationship between the semantic reconstruction accuracy (classifier on $X'$) and the semantic representation accuracy (MLP or linear classifier on $Z$). The MLP is filled and the linear classifier is empty. Symbols and colors mark the decoder types: PixelCNN (\textcolor{blue}{$\bullet$}), CNN (\textcolor[rgb]{1.0, 0.5, 0.0}{$\blacksquare$}), and Dueling Decoder (\textcolor{red}{$\blacktriangle$}). Best viewed in color.}
\caption{Simulation results displaying the Haines jump mechanism for $\theta=\ang{30}$ (\protect\redline), $\theta=\ang{90}$ (\protect\greenline), and $\theta=\ang{150}$ (\protect\blueline) for $\Ca=1\times10^{-3}$.}
\caption{Simulation results displaying the Haines jump mechanism for $\theta=\ang{30}$ (\protect\redline), $\theta=\ang{90}$ (\protect\greenline), and $\theta=\ang{150}$ (\protect\blueline) for $\Ca=2.5\times10^{-3}$.}
\caption{Simulation results displaying the Haines jump mechanism for $\theta=\ang{30}$ (\protect\redline), $\theta=\ang{90}$ (\protect\greenline), and $\theta=\ang{150}$ (\protect\blueline) for $\Ca=5\times10^{-3}$.}
\caption{Driving pressure ($\Delta p^*=\frac{\Delta p\overline{D}}{\mu\abs{\vb{U}}}$) across the porous media domain, for (a) $\Ca=10^{-1}$, (b) $\Ca=10^{-2}$ and (c) $\Ca=10^{-3}$, for $\theta=\ang{30}$ (\protect\redline), $\theta=\ang{90}$ (\protect\greenline) and $\theta=\ang{150}$ (\protect\blueline). Similar to \cite{Zacharoudiou2018}, interface mergers/pore-filling events can be identified by rapid changes in the pressure signal.}
\caption{Comparison of front location of the invading fluid in the longitudinal direction, for (a) $\theta=\ang{30}$, (b) $\theta=\ang{90}$ and (c) $\theta=\ang{150}$, with $\Ca=10^{-1}$ (\protect\redline), $\Ca=10^{-2}$ (\protect\greenline) and $\Ca=10^{-3}$ (\protect\blueline). As the capillary number decreases, the movement of the front becomes increasingly dominated by capillary rise between pores and Haines jumps.}
\caption{Displacement fraction $S$ of the invading fluid with respect to the pore volume, for (a) $\Ca=10^{-1}$, (b) $\Ca=10^{-2}$ and (c) $\Ca=10^{-3}$, with $\theta=\ang{30}$ (\protect\redline), $\theta=\ang{90}$ (\protect\greenline) and $\theta=\ang{150}$ (\protect\blueline). In (a), saturation takes longer due to the presence of capillary fingering and the formation of droplets. The kinks in (b) and (c) are due to residual fluid pockets which are not displaced.}
\caption{Visualization of our final models. The starting points of the arrows are located at feature $i_0$. Arrows point to the $k$ proposed correspondences ($k=8$) of feature $i_0$. Proposed correspondences whose indices are in $\mathcal{A}^{i_0}$ (defined in Equation \eqref{eq:activate:neighbor}) are indicated by \textcolor{red}{red arrows}, and the others by \textcolor{blue}{blue arrows}. Feature changes after going through the CP module are shown as heatmaps.}
\caption{\textbf{Framework overview.} Starting from RGB image input, 2D keypoints and segmentation are predicted first. Predicted 2D keypoints are used to crop patches from both RGB and predicted segmentation (color encoded). The RGB patches and segmentation patches are fused together to attain residual limb orientation vector (in 3D), which is transformed to residual 3D pose along the hierarchical human skeleton tree. Residual and initial 3D pose estimate are combined to construct the final 3D estimate. The refined pose (\textcolor{red}{Red}) is overlaid on the initial pose (\textcolor{blue}{Blue}) for better readability. Poses are visualized in a novel 3D viewpoint.}
\caption{\textbf{Motivation: Local patch for fine details amplification.} To determine the orientation of \emph{right\_elbow$\rightarrow$right\_wrist} (\textcolor{green}{Green arrow}), we are more interested in the content in the \textcolor{red}{Red} patch than in other parts, \eg \emph{left\_knee$\rightarrow$ left\_ankle} (\textcolor{yellow}{Orange arrow}). However, the resolution of this ``Patch of Interest'' (\textcolor{red}{Red}) in the original 224$\times$224 input image is only 24$\times$24, which is relatively low for modern network architectures. With an explicit zoom-in operation, the resolution of the local patch is increased for further refinement. \textbf{Left}: Original image input. \textbf{Right}: Recovered high-resolution patch.}
\caption{\textbf{Qualitative results of the patch-based refinement.} \textbf{Left:} Image input. \textbf{Middle:} Cropped segmentation and RGB image patch. \textbf{Right:} The refined result (\textcolor{red}{Red}) on initial estimate (\textcolor{blue}{Blue}). Ground truth is colored in white for reference. \textcolor{blue}{Blue} arrow points to the part and refined joint. Only best 3D local view is visualized.}
\caption{\textbf{Most helpful case 1: rare pose}. \textcolor{red}{Red} indicates a joint is improved with patch-based refinement. \textcolor{blue}{Blue} indicates no improvement. }
\caption{\textbf{Most helpful case 2: occlusion}. \textcolor{red}{Red} indicates a joint is improved with patch-based refinement. \textcolor{blue}{Blue} indicates no improvement. Occluded part is enclosed in rectangle. }
\caption{\textbf{Qualitative examples showing adding cropped segmentation is better than only cropped RGB.} \textbf{Left}: Image input. \textbf{Middle}: Cropped segmentation and RGB image patch. \textbf{Right}: The result with cropped segmentation (\textcolor{red}{Red}) \emph{vs} with only cropped RGB (\textcolor{green}{Green}). White is ground truth. Body part and refined joint are marked with \textcolor{blue}{Blue} arrow. We only show the novel local 3D view for better readability. \textbf{Top}: Note that the lower arm almost blends in with the background, which is eliminated in segmentation. Besides, the dim illumination no longer exists. \textbf{Bottom}: Occlusion case. Left and right ankle are not clearly shown in the RGB patch because of the overlapping shoes. In the segmentation patch, nonetheless, left leg and right leg are distinguishable.}
\caption{An example AMR and the corresponding sentence before and after preprocessing. Senses are removed. The first named entity is replaced by ``\textcolor{blue}{HIGHWAY\_0}''; the second named entity is replaced by ``\textcolor{cayenne}{COUNTRY\_REGION\_0}''; the first date entity is replaced by ``\textcolor{moss}{DATE\_0}''. \label{fig:preproc}}
\caption{Full model prediction vs. no target-side copy prediction. Nodes in blue denote the same concept (i.e., the country ``China''). The full model correctly copies the first node (``\textcolor{blue}{\text{vv7 / country}}'') as \texttt{ARG0} of ``\text{start-01}''. Without target-side copy, the model has to generate a new node with a different index, i.e., ``\textcolor{red}{\text{vv10 / country}}''. \label{fig:no-target-copy-ex}}
\caption{Full model prediction vs. no coverage loss prediction. The full model correctly predicts the second modifier ``\textcolor{blue}{\text{solemn}}''. Without coverage loss, the model generates a repetitive modifier ``\textcolor{blue}{\text{magnificent}}''. \label{fig:no-coverage-ex}}
\caption{Illustrations of our prediction during complicated human-human interactions. (a) A cyclist (\textcolor{lblue}{$\bullet$$\bullet$$\bullet$}) interacts with a person moving slowly (\textcolor{lgreen}{$\bullet$$\bullet$$\bullet$}). (b) A person (\textcolor{lblue}{$\bullet$$\bullet$$\bullet$}) meets a group of people. (c) A cyclist (\textcolor{lblue}{$\bullet$$\bullet$$\bullet$}) first interacts with another cyclist in front (\textcolor{lgreen}{$\bullet$$\bullet$$\bullet$}) and then considers the influence of a person (\textcolor{lred}{$\bullet$$\bullet$$\bullet$}). The proposed approach socially avoids potential collisions. }
\caption{Spatio-temporal features are visually encoded from discretized grid to locally discover (i) human-human (\crule[yyellow]{0.25cm}{0.25cm}: woman$\leftrightarrow$man) and (ii) human-space interactions (\crule[rred]{0.25cm}{0.25cm}: man$\leftrightarrow$ground, \crule[pink]{0.25cm}{0.25cm}: cyclist$\leftrightarrow$cone) over time. Then, their pair-wise relations (\textit{i.e.}, \crule[rred]{0.25cm}{0.25cm}$\leftrightarrow$\crule[yyellow]{0.25cm}{0.25cm}, \crule[yyellow]{0.25cm}{0.25cm}$\leftrightarrow$\crule[ppurple]{0.25cm}{0.25cm}, \crule[ppurple]{0.25cm}{0.25cm}$\leftrightarrow$\crule[bblue]{0.25cm}{0.25cm}, \crule[bblue]{0.25cm}{0.25cm}$\leftrightarrow$\crule[pink]{0.25cm}{0.25cm}, \crule[pink]{0.25cm}{0.25cm}$\leftrightarrow$\crule[rred]{0.25cm}{0.25cm}, ...) with respect to the past motion of the target (\textcolor{yyellow}{$\rightarrow$}) are investigated from a global perspective for trajectory forecast.}
\caption{The average image denoising performance comparison on the PASCAL-VOC 2012 validation set, with $\sigma$ = 15, 25, 35. \textcolor{red}{Red} indicates the best and \textcolor{blue}{blue} the second-best result (the same hereinafter).}
\caption{Benefits of camera views, types and colors on vehicle Re-ID. The {\color{brown}brown dashed} box indicates the five vehicles characterized by distinct camera views, types and colors, their visualized feature representations and 2D feature projections of the images from the corresponding identities, where different colors represent different vehicles. The {\color{blue}blue dashed} box demonstrates several ranking results of conventional vehicle Re-ID based on ResNet-50~\cite{sarfraz2017pose}, where the {\color{red}red} and {\color{green}green} solid boxes of the first 15 ranks indicate the wrong and right matchings, respectively. The results show that extra semantics or attributes play a critical role in handling the challenges of vehicle Re-ID. }
\caption{Overview of our DF-CVTC. The view transform model generates multi-view images based on a view-specified GAN. Both the original ({\color{blue}blue} box) and the generated images ({\color{red}red} boxes) are fed into the vehicle Re-ID model, which consists of one backbone (the first three blocks of ResNet-50), three subnetworks and one embedding network.}
\caption{Examples of ranking results on VeRi-776 dataset. The {\color{green} green} and {\color{red}red} boxes indicate the right matchings and the wrong matchings respectively.}
\caption{Examples of ranking results on VehicleID dataset. The {\color{green} green} boxes indicate the right matchings. Note that there is only one ground truth vehicle image in gallery set in the VehicleID dataset.}
\caption{ Comparisons with state-of-the-art Re-ID methods on VeRi-776 (in \%). The top three results are highlighted in {\color{red} red}, {\color{green} green} and {\color{blue} blue}, respectively.}
\caption{Comparisons with state-of-the-art Re-ID methods on VehicleID (in \%). The top three results are highlighted in {\color{red} red}, {\color{green} green} and {\color{blue} blue}, respectively.}
\caption{CMC curves on VeRi-776 and VehicleID datasets comparing to the state-of-the-art methods where the curves of variants of our methods are plotted in {\color{red}red} color but with different markers.}
\caption{An example of ranking results of DF-CVTC on ResNet-50 backbone by progressively introducing the view, type and color subnetworks on VeRi-776 dataset. The {\color{green} green} and {\color{red}red} boxes indicate the right matchings and the wrong matchings respectively. The histograms denote the probability distributions learnt from the view, type and color subnetworks respectively. }
\caption{Ablation study on different backbones with varying components on VeRi-776 dataset (in \%). The top three results are highlighted in {\color{red} red}, {\color{green} green} and {\color{blue} blue}, respectively.}
\caption{Parameters that were passed to generator for each set. Sets are separated into groups with a horizontal line paying respect to our representation of tests' results on graphs. \colorbox{codecolors_inline}{\lstinline{p_manif}} is set to 0.1 everywhere.}
\caption{Plots (A)-(C) are for sets 1-5 (varying \colorbox{codecolors_inline}{\lstinline{scale}}), (D)-(F) are for sets 6-10 (no cycles, varying \colorbox{codecolors_inline}{\lstinline{n_lat}}), (G)-(I) are for sets 11-15 (one cycle, varying \colorbox{codecolors_inline}{\lstinline{n_lat}}).}
\caption{Effect of noise augmentation on ADS-B and outdoor WiFi fingerprinting. The ADS-B dataset corresponds to the first row of Table \ref{table_adsb_experiments} (with low test SNR, high train SNR). Noise injection improves ADS-B performance from \colorbox{LightGrey}{32.29\%} (which corresponds to train SNR$_{\text{aug}}$ = test SNR$_{\text{aug}}$ = $\infty$, i.e.\ no noise insertion) to \colorbox{LightCyan}{52.12\%} when train SNR$_{\text{aug}}$ = 10 dB and test SNR$_{\text{aug}}$ = 50 dB. Outdoor WiFi accuracy improves from \colorbox{LightGrey}{61.73\%} to \colorbox{LightCyan}{69.37\%}. }
\caption{Comparisons to the state-of-the-art attention modules on the ImageNet validation set. A single 224 $\times$ 224 central crop is adopted for evaluation. All results are reproduced in the PyTorch framework. $^{*}$ denotes the modified versions based on ResNet backbones. The best and the second best records are marked as {\bf bold} and {\color{blue}\bf blue}, respectively.}
\caption{Performance on RetinaNet for objects of three scales. The notations are similar as in Table \ref{tab_compare_detectors}. The best and the second best records are marked as {\bf bold} and {\bf\color{blue}blue}, respectively. Compared to the SE/SK module, the detection of small objects from SGE has been significantly improved.}
\caption{ (a) Global maximum of $R_{\al\bb\ga\de}/R_0$, where $R_0=\lambda \ln 2$ is the global maximum of $R_{\al\bb\ga\de}$ for a sample without a hole. (b) Positions of contacts. Mean position below $3\pi/4$ and above $5\pi/4$ was changed by $\pm \pi/2$ to avoid discontinuity. Large dots are experimental results while small dots show results of the numerical simulations. Inset: {\red shape of the triply connected sample used in the experiment and} positions of four contacts located at $\al$, $\bb$, $\ga$ and $\de$, related to a global maximum of $R_{\al\bb\ga\de}$ (colors of contacts correspond to colors of dots in (b)). }
\caption{Learned 40-dimensional MuRP and MuRE embeddings for WN18RR relation \textit{has\_part}, projected to 2 dimensions. \tikzcircle[black!20!magenta, fill=black!20!magenta]{2pt} indicates the subject entity embedding, \tikzcircle{2pt} indicates true positive object entities predicted by the model, \tikzcircle[blue, fill=blue]{2pt} true negatives, \tikzcircle[orange, fill=orange]{2pt} false positives and \tikzcircle[black!30!green, fill=black!30!green]{2pt} false negatives. Lightly shaded blue and red points indicate object entity embeddings before applying the relation-specific transformation. The line in the left figure indicates the boundary of the Poincar{\'e} disk. The supposed false positives predicted by MuRP are actually true facts missing from the dataset (e.g. \textit{malaysia}).}
\caption{All the models overlaid on the same graph. Looking at the bottom-left graph, the models are: \textcolor{red}{Transformer} (red), \textcolor{cyan}{CNN} with attention (cyan), \textcolor{blue}{GRU} with attention (blue) and \textcolor{green}{LSTM} without attention (green). The BLEU score of CNN approaches that of transformer, but the diminishing training gradient of the CNN caused a loss explosion (middle top graph) and further training failed. Top panel: \emph{epoch\_sec} - training time per epoch, \emph{loss} -- training loss, \emph{ppl} -- training perplexity. Bottom panel: \emph{avg\_bleu} -- average epoch BLEU score, \emph{loss} -- testing loss, \emph{ppl} -- testing perplexity.}
\caption{{\bf Left:} Assessment of sensitivity to the choice of the prior mean ($\mu$) in the $\mathcal{GP}$. Three choices are considered: $\mu=0$, $\mu=\bar{x}$ and $\mu=x_i$. {\bf Right:} Evaluation of $F^{\ast}$ (marked with a star at $h=0$). The {\color{blue} Blue} points denote the evaluation of $I(f,h)$ using the 4 initial step-sizes ${\bf h}_{K} = \{1,1/2,1/3,1/4\}$ and a 95\% uncertainty band. We then apply the $\mathcal{GP}$ algorithm over several iterations (only 2 required) to identify a converged estimate of $F^{\ast}$. Each of the two new step-sizes was chosen based on the step-size $h_{j}$ at which the 95\% uncertainty band is below $\gamma = 0.5$ of the uncertainty $\sigma_{0}$ at $h=0$. The first evaluation, point $h_{5}$, is denoted by the {\color{green} Green} point and the second, $h_{6}$, by the {\color{red} Red} point. }
\caption{ (a) \textbf{Illustration of the multivariate, geo-tagged time series imputation task}: the input data has three dimensions (i.e. time, location, measurement) with some missing values (indicated by the \textcolor{orange}{orange} dot); the output is of same shape as the input while the missing values have been imputed (indicated by the \textcolor{red}{red} dot). (b) \textbf{Self-attention mechanism}: the \textbf{Attention Map} is first computed using every pair of \textbf{Query} vector and \textbf{Key} vector and then guides the updating of \textbf{Value} vectors via weighted sum to take into account contextual information. (c) \textbf{Traditional Self-Attention} mechanism updates Value vector along the temporal dimension only vs. \textbf{Cross-Dimensional Self-Attention} mechanism updates Value vector according to data across all dimensions. }
\caption{Visualization of the cross-dimensional self-attention on \textbf{KDD-2015}. (a) Part of \textit{Time}-\textit{Measurement} attention map. (b) Two time series of PM2.5 and PM10. The value at \textcolor{purple}{purple} dot is missing and our model predicts its value based on other values. The arrow in (b) represents attention whose score is highlighted with bounding box in (a) of the same color.}
\caption{Examples of reconstructed normal images (first 4 columns in each Figure) and outliers (last 4 columns in each Figure) using VAE and RVAE with different $\beta$s on (a) MNIST (normal) + EMNIST (outliers) datasets and (b,c) Images from the class of shoes (normal) and images from the class of other accessories (outliers) in the Fashion MNIST datasets. The optimal value of $\beta$ is highlighted in \textcolor{red}{red}. The RVAE with too small $\beta$ has similar performance to the regular VAE, while RVAE with too large $\beta$ rejects outliers but also rejects some normal samples.}
\caption{Schematic illustration of algorithmic procedures. We consider three states (1, 2 and 3) and we choose the dephasing time for these states as 3 time units. \redc{a) Procedure for extracting instances from short MD trajectories, once they have been projected onto these states.} b) QSD-KMC procedure. }
\caption{\redc{(a) A three-well potential function. (b) Survival probability function and Anderson-Darling dephasing time for state 2. (c) Different densities computed for state 2 : the Boltzmann (equilibrium) density, the QSD density and the density accumulated along the path of a QSD-KMC trajectory.}}
\caption{\redc{State-to-state probability evolutions calculated from QSD-KMC and from the benchmark MD trajectory for the three-state system. The dephasing times of 20 time units for all the states are estimated from the Anderson-Darling procedure described in Section~\ref{choice}. For comparison, we also show the QSD-KMC results using $\tau_{d}$ = 1 and 5 time units. }}
\caption{\redc{QSD-KMC density as a function of the dephasing time (number of time units shown in parenthesis) for the three-state system.}}
\caption{\redc{(a) A sloped sine wave potential function. (b) Survival probability function and Anderson-Darling dephasing time for state 1. (c) Different densities computed for state 1: the Boltzmann (equilibrium) density, the QSD density, the density contribution from the pass-through trajectories, the (non-equilibrium) density obtained from the benchmark MD trajectories and the density accumulated along the path of a QSD-KMC trajectory. (d) QSD-KMC density as a function of the dephasing time (number of time units shown in parentheses) for state 1. (e) Different densities computed for state 2, as in (c) for state 1.}}
\caption{\redc{State-to-state probability evolutions calculated from QSD-KMC and the benchmark MD trajectories for the non-equilibrium one-dimensional system. The dephasing time of 50 time units for all the states is estimated from the Anderson-Darling procedure described in Section~\ref{choice}. For comparison, we also show the QSD-KMC results at $\tau_{d}$ = 1 and 5 time units.} }
\caption{\small Average cumulative reward matrix $R$ and $A$ metric result for each scenario and CRL strategy. Results highlighted in \textbf{black} and {\color{blue} blue} represent the best and the second best performing strategies on each scenario. A gray background is used in the cells involved in the computation of the $A$ metric. Results are computed over 10 runs for each strategy and benchmark, for a total of 160 runs. Standard deviations are reported in Tab.~\ref{tab:std_res}.}
\caption{A first-order decision tree is used to visualize the effects of backscattering. \textcolor{OliveGreen}{Green} boxes denote events with correct emission direction and energy information. \textcolor{gray}{Gray} boxes denote events without any trigger at all, \textcolor{YellowOrange}{yellow} boxes denote events with the right emission direction assignment but detection at lower energy, and \textcolor{red}{red} boxes denote events that miss both the emission direction and the full energy reconstruction.}
\caption{Flow transformation when computing log-likelihoods. \textbf{(a)} Discrete autoregressive flows stack multiple levels of autoregressivity. The receptive field of output unit 2 (\red{red}) includes left and right contexts. \textbf{(b)} Discrete bipartite flows apply a binary mask (\blue{blue} and \green{green}) which determines the subset of variables to transform. With 2 flows, the receptive field of output unit 2 is $\mbx_{1:3}$. }
\caption{Detailed outcome of the robot experiments. \greencheck: the trial was successful, \redcross: the generated grasp was not successful, \magentacross: none of the generated grasps was kinematically feasible.}
\caption{Illustration of common stages of image annotation: annotators first provide object class labels at the image level \cite{deng09cvpr,kuznetsova18arxiv} (\textcolor{red}{red}), sometimes associated with a specific object via a click as in \cite{lin14eccv} and our approach for \taskname{} (\textcolor{green}{green}). Subsequent stages then annotate the spatial extent of all objects of these classes, \eg with bounding boxes or segmentations (\textcolor{blue}{blue}). Speech provides a natural way to combine the two stages to simultaneously annotate class labels and bounding boxes.}
\caption{Example annotations on \ilsvrc{}. For each click we show the three alternatives from the ASR model (\textcolor{orange}{orange}) and the final class label (\textcolor{green}{green}). The first three images show typical annotations produced by our method. The last one shows a failure case: while the correct name is among the alternatives, an incorrect transcription matching a class name ranks higher, hence the final class label is wrong.}
\caption{A comparison of typical mouse paths produced when annotating an image with our interface (\textcolor{darkgreen}{green}) or with \cite{lin14eccv} (\textcolor{red}{red}). Circles indicate clicks. Mouse paths for our interface are extremely short, thanks to its simplicity and naturalness.}
\caption{CAS for different models at 128$\times$128 and 256$\times$256 resolutions. BigGAN-deep samples are taken at the best truncation parameter of 1.5.}
\caption{ Improvement after incorporating CTM into existing methods on \mini~(left) and \tiered~(right). }
\caption{Schematic visualization of the work flow, from an operator's perspective. A DMP was created based on a demonstration. Subsequently, the DMP was executed while evaluated by the operator. If unsuccessful, the operator demonstrated a correction, which yielded a modified DMP to be evaluated. Once successful, further improvement could be done by, \emph{e.g.}, trajectory-based reinforcement learning, though that was outside the scope of this work. Steps that required direct, continuous interaction by the operator are marked with light {\color{red} red} color. Steps that required some attention, such as supervision and initialization, are marked with light {\color{blue} blue}. The operations in the white boxes were done by the software in negligible computation time, and required no human involvement. The work in this paper focused on the steps within the dashed rectangle.}
\caption{Same trajectories as in \cref{fig:adjust_cart}, but zoomed in on the corrective trajectory. Arrows indicate directions. The parts of the trajectories between the separation markers were not retained. The right, {\color{blue} blue}, separation point was determined explicitly by the operator during the corrective demonstration. The left, {\color{green} green}, separation point was determined according to \cref{eq:mindist}. Further, what was left of the deficient trajectory was modified for a smooth transition. However, the part of the corrective trajectory retained was not modified, since it was desired to closely follow this part of the demonstration. Note that the trajectories retained were not intended for direct play-back execution. Instead, they were used to form a modified DMP, which in turn generated a resulting trajectory, as shown in Figs.~\ref{fig:place1}, \ref{fig:place2} and \ref{fig:avoid1}.}
\caption{\textbf{Self-Supervised Object Detection.} We use unlabeled videos as input and train an object detector without any manual annotation. Self-supervision is based on the timed transcript, which is selected automatically using closed captions (or speech-to-text). For a given object name, we extract a single key frame at the instant the object is mentioned. Yet the object is visible in only some of these frames (colored {\color{green}green}). Often the object is mentioned but is out of the frame or missing (colored {\color{red}red}). The object can be detected by searching for common themes relating the selected frames and discriminating them from frames in irrelevant videos. Zoom in for better visibility.}
\caption{(a) A network that performs some \textcolor{nnin}{input}-\textcolor{nnout}{output} mapping task with one intervening \textcolor{nnattin}{hidden} layer. (b) A \textcolor{nnattrnet}{DAE} that produces a \textcolor{nnattout}{reified output}. (c) Integrating the two architectures to perform state reification on the \textcolor{nnattin}{hidden} state. (d) A recurrent sequence processing architecture, unrolled in time horizontally, with an \textcolor{nnattrnet}{attractor net}---unrolled vertically---reifying the \textcolor{nnattin}{hidden} state.}
\caption{Predicted thermodynamic \ce{CH4} volume mixing ratios{\color{red} $^\dagger$} }
\caption{ Target-Guided Open-Domain Conversation. The agent is given a target subject \emph{e-books} which is unknown to the human. The goal is to guide the conversation naturally to the target. Utterance keywords are highlighted in {\color{red} red} (agent) and {\color{blue} blue} (human) and in \emph{italic}.}
\caption{Example conversations between a human (\textbf{H}) and two different agents (\textbf{A}), with the same targets and starting utterances. Keywords selected or predicted by the agents are highlighted in {\color{red} red} and \emph{italic}, and keywords mentioned by the human are highlighted in {\color{blue} blue} and \emph{italic}. As keywords predicted by the Kernel agent do not necessarily occur in the retrieved utterances, we append them to the end of each sentence. Targets achieved at the end of conversations are underlined. We present the examples in case-sensitive format for readability. All tokens are in lowercase in the program.}
\caption{{\bf Left:} Flowchart of the EncryptGAN asymmetric-encryption image steganography algorithm. Boxes in \textcolor{green}{green} denote public information; boxes in \textcolor{red}{red} contain private keys or messages. Two distinct image domains provide two random images: a `cover' image $x_i$ and a `disguise' image $y_i$. Before the communication, a third party transforms them via a learned network $\mathcal{K}$ into unrecognizable images: the sender is given $\mathcal{K}(y_i)$ (public) and the receiver is given $\mathcal{K}(x_i)$ (private). The sender pastes the private message $m$ onto the `cover' image and uses the transformed `disguise' $\mathcal{K}(y_i)$ to create an altered version $y_i(m)$ of the `disguise' image. The receiver decodes the altered `disguise' image with the transformed `cover' image $\mathcal{K}(x_i)$ (private) to recover the original private message. {\bf Right:} An intuitive explanation of our steganographic operation principle as navigation on the manifolds of two image domains. A `cover' image $x_i$, when pasted with different private messages $m_1, m_2$, moves to distinct landmarks on its image manifold $X$. The Cycle-GAN generative networks ensure similar movements on manifold $Y$ relative to the `disguise' image landmark. To decode the original private message, it is crucial to know the original reference landmark of the `cover' image $x_i$, because an alternative `cover' image $x_{o}$ can also reach the same landmark location: $x_{o} + m_{o} = x_{i} + m_{2}$. This multiplicity of solutions provides strong privacy protection. We conjecture that $\mathcal{K}(x_i)$ needs to encode just enough information to infer the landmark location of $x_{i}$ in order to compute $m_{2}$. }
\caption{Resolved shear stress on grain pairs under LCF and dwell fatigue, compared to the critical resolved shear stresses~\cite{perilla1995two,gong2009anisotropy}. The operative slip systems observed by TEM are highlighted in \textcolor{red}{red}.}
\caption{A flowchart showing the Bayesian updating scheme for multi-level uncertainty propagation within the Total Monte Carlo method. $w_k$ is the weight computed for the $k^{\rm th}$ ND file, and $w_{thres}$ is a minimum weight threshold assigned to each random ND file such that any file that does not meet this criterion is discarded. A value of $w_{thres} = 2.06 \times 10^{-9}$, which corresponds to a total critical $\chi^2$ value of 40, was assigned to each ND file. The critical $\chi^2$ value corresponds to the upper-tail critical value of the $\chi^2$ distribution with 28 degrees of freedom (varied parameters) at the 95\% confidence level, assuming that the $\chi^2$ distribution takes the form of a gamma distribution with scale parameter 2 and shape parameter $N/2$, where $N$ is the number of degrees of freedom. \textcolor{red}{It should be noted that this could introduce some bias into the calculation, depending on the choice of the cut-off parameter.}}
\caption{Selected nuclear model parameters of the TALYS code (with their parameter widths) used for parameter variations. The parameter widths are given as a fraction (\%) of their absolute values. The parameter widths of $a$, $g_\pi$ and $g_\nu$ are given in terms of the mass number $A$. $a$ is the real central diffuseness, and $g_\pi$ and $g_\nu$ are the single-particle state densities as used in exciton model analyses~\cite{Koning-2004global}. A complete list of all the model parameters can be found in Ref.~\cite{TALYS-2007}. \textcolor{blue}{All parameters were varied simultaneously within their parameter widths in a TMC fashion.}}
\caption{\textcolor{blue}{Differential experimental data used for the adjustment of $^{208}$Pb showing the number of data points per reaction channel used. Only data points between 5 and 20 MeV were used in the computation of file weights. Also, only the names of the first authors have been presented.}}
\caption{\textcolor{blue}{Convergence in the mean and the $^{208}$Pb ND uncertainty \textcolor{red}{(left)} with the corresponding prior $k_{\rm eff}$ distribution \textcolor{red}{(right). The $k_{\rm eff}$ distribution is the 1st prior distribution obtained by varying $^{208}$Pb nuclear data in the hmf57c1 benchmark (a total of 2700 random ND files were used). By 1st prior distribution, we refer to the distribution of the $k_{\rm eff}$ before the implementation of the threshold weight presented in Fig.~\ref{fig_BayesUpdate}}. The error bars on the ND uncertainty represent the estimated uncertainty on the ND uncertainty obtained for each iteration. More details on how to compute this uncertainty are presented in Refs.~\cite{Alhassan-2014ANE,Helgesson-2013,Sjostrand-2013a}. \textcolor{blue}{The $k_{\rm eff}$ distribution is fitted to a normal distribution only to guide the eye}.}}
\caption{\textcolor{blue}{The 2nd prior and posterior $k_{\rm eff}$ distributions due to the variation of $^{208}$Pb nuclear data for the hmf57c1 benchmark, with the corresponding file weights computed using only selected experiments from EXFOR. Also presented are the convergence plots for the mean $k_{\rm eff}$ and the ND uncertainty of the posterior distribution, as well as a scatter plot of the computed weights. \textcolor{red}{The 2nd prior represents the $k_{\rm eff}$ distribution after the implementation of the threshold weight presented in Fig.~\ref{fig_BayesUpdate} (note: a total of 2046 random ND files were accepted)}. The error bars on the convergence plot of the ND uncertainty represent the estimated uncertainty on the ND uncertainty for each iteration. More details on how to compute this uncertainty are presented in Refs.~\cite{Alhassan-2014ANE,Helgesson-2013,Sjostrand-2013a}. A prior ND uncertainty of 1025 pcm and a posterior ND uncertainty of 1018 pcm were obtained. Based on the computed weights, an effective sample size (ESS) of 245 was obtained (i.e. for the case of weighted channels). The weight distribution is given on a log scale.}}
\caption{\label{compare_weights11} \textcolor{blue}{Comparison of different nuclear data libraries as well as adjustments from this work using the reduced chi squared (see Eq.~\ref{gen_chi2}). Note that only experimental data from the (n,tot), (n,non-el), (n,inl), (n,$\gamma$) and the (n,2n) cross sections of $^{208}$Pb were considered. \textcolor{blue}{No data were available for the (n,non-el) channel in the JEFF-3.3 and JENDL-4.0 libraries; therefore, the (n,non-el) cross section was obtained by subtracting the (n,el) from their total cross sections.} As presented in the table, the 1st update (weighted channels) represents the adjustment with differential data only, where each channel was assigned a weight equal to its average cross section, while the 1st update (unweighted channels) represents the adjustment where all considered channels were assigned equal weights. \textcolor{red}{In the case of the 2nd update, only the hmf57c1 benchmark was used for adjustment, while in the case of `This work (global likelihood)', adjustments were carried out using selected experiments from EXFOR and the hmf57c1 benchmark}. In the case of the global likelihood, both the weighted and unweighted channels gave the same `best' file. \textcolor{blue}{Note: the $\chi^2$ results given for the ENDF/B-VIII.0, JEFF-3.3, JENDL-4.0, TENDL-2017, CENDL-3.1 and the nominal file were obtained using unweighted channels.}}}
\caption{\label{prior_post_NDuncert} \textcolor{blue}{Summary of results for the mean $k_{\rm eff}$ with corresponding nuclear data uncertainty ($\pm$ uncertainty on the estimated ND uncertainty) and the Effective Sampling Size (ESS) for the 1st and 2nd priors, and the posterior distributions of the 1st, 2nd and the combined updates. Weighted channels represent the case where each channel was assigned a weight equal to its average cross section over the considered energy range, while unweighted channels represent the case where all channels were assigned equal weights. The 1st prior here represents the distribution of the $k_{\rm eff}$ without the inclusion of experimental data, while in the case of the 2nd prior, the information from the differential experimental data was used to exclude some random files based on a weight threshold (see Fig.~\ref{fig_BayesUpdate}). }}
\caption{Comparison of file performance between this work and the evaluations from the ENDF/B-VIII.0 and JEFF-3.3 ND libraries and differential experimental data from EXFOR (between 5 and 20 MeV) for $^{208}$Pb (n,2n), (n,el), (n,inl) and (n,2n) cross sections. \textcolor{blue}{The nominal file (prior) is the ND file around which the other random files were generated. The authors of the different experiments from EXFOR have been lumped together and labelled as EXFOR. The weighted and unweighted labels denote the cases where channels were weighted with their average cross sections and where all channels carried equal weights, respectively.}}
\caption{Comparison of file performance between this work and the major nuclear data libraries as well as with differential experimental data from EXFOR (between 5 and 20 MeV) for $^{208}$Pb (n,non) and (n,$\gamma$) cross sections. \textcolor{blue}{The nominal file (prior) is the file around which the other random files were generated. The authors of the different experiments from EXFOR are not presented; instead, they have all been labelled as EXFOR. The weighted and unweighted labels denote the cases where channels were weighted with their average cross sections and where all channels carried equal weights, respectively.}}
\caption{Time series of a) LA oscillations measured at 4.98\,V~vs.~SHE and b) HA oscillations at 6.85\,V~vs.~SHE. Both states were measured at $R_\text{ext}A=6$\,k$\Omega$cm$^2$.}
\caption{Investigated parameter space with the different types of oscillations, marked with: (\textbullet) simple periodic Low Amplitude oscillations (LA), ($\blacksquare$) simple periodic High Amplitude oscillations (HA), ($\bigstar$) multiperiodic and aperiodic oscillations, (\textasteriskcentered) foci and ($+$) nodes. The oscillatory regime of the parameter space is roughly divided into three coloured regions: yellow for LA, green for HA, and yellow-green stripes for multiperiodic and aperiodic oscillations. The dashed orange line marks the Hopf bifurcation from which the LA oscillations arise. The sn1 line represents the border between the sets of parameters for which no initial oxide layer on the electrode is necessary for the system to attain a stable limit cycle (left) and those for which an initial oxide layer is needed (right). The sn2 line marks the transition from the oscillatory regime to the electropolishing branch.}
\caption{a) LA oscillations and stable focus in the ellipsometric intensity $\xi$ vs. current density $j$ phase plane projection at $R_\text{ext}A=6$\,k$\Omega$cm$^2$. Black lines mark attractors; the red line shows the transient after a parameter variation. The large limit cycle was measured at 4.90\,V~vs.~SHE, the small limit cycle at 4.84\,V~vs.~SHE and the stable focus at 4.82\,V~vs.~SHE. b) The squared peak-to-peak amplitude $A^2$ of LA oscillations close to the Hopf bifurcation vs. the applied potential $U_\text{app}$ at $R_\text{ext}A=6$\,k$\Omega$cm$^2$. The line is a linear fit to the first four points.}
\caption{Next-maximum-map of a period-two torus measured at 5.74\,V vs. SHE; even-numbered points appear in the upper left and odd-numbered points in the lower right corner. The arrows indicate the direction in which the points appear.}
\caption{{\bf (A)} The network architecture consists of just one hidden layer, receiving inputs from an input population ${\bf x}$ and projecting to output units ${\bf y}$. {\bf (B)} Basal phase of network activity, during which $\phi = \tanh$ determines the firing rate $r_i$ from the somatic voltage $V^S_i$. The vectors ${\bf r}^S$ and ${\bf r}^A$ are concatenations of ${\bf r}$ with the inputs ${\bf x}$ and labels ${\bf \overline{y}}$, respectively, with a constant $1$ to provide a bias term. {\bf (C)} Distal phase of network activity, during which neural firing $\tilde{r}_i$ is determined by $\phi\left(V_i^A\right)$. {\bf (D)} Temporally filtered losses for each set of synapses. Test accuracy (green) was computed every 500 time steps. Light traces are individual trials, while dark traces are trial-averages.} \end{figure} We chose the (4, 6)-back task to quantify learning \cite{pitis2016recurrent} because it has low-dimensional inputs and outputs, multiple time scales of relevant information, and clear bounds for performance that correspond to learning particular input-output dependencies. In more detail, the network has to map an i.i.d. temporal sequence of Bernoulli inputs with $p_x = 0.5$ to a Bernoulli output, whose probability depends on the inputs with some lag that can be adjusted to tune the task difficulty, here lags of 4 and 6 time steps. In particular, the baseline output probability $p_y = 0.5$ is increased (decreased) by 0.5 (0.25) when the input from 4 (6) time steps back is equal to 1.\\ Our model consists of an input layer, a recurrent network, and an output layer (Fig.~1A). We define plasticity rules with the aim of minimizing a loss function that quantifies task performance as the cross entropy between the network outputs and the target distribution. Plasticity rules can be derived by performing stochastic gradient descent on this loss function, but calculation of the exact gradient requires computations that are nonlocal over space and time. 
Instead, we exploit a novel machine learning technique, known as ``synthetic gradients,'' to approximate the gradient using local computations \cite{jaderberg2017decoupled}. Biologically, this approximation manifests as a network of neurons with multiple compartments that are innervated by functionally distinct sets of synapses ${\bf W}$, ${\bf A}$ and ${\bf J}$, one somatic and two distal. The ${\bf W}$ synapses are used for solving the actual task, processing inputs and running the primary network dynamics, whereas the ${\bf A}$ and ${\bf J}$ synapses are used for learning. All of these synapses are plastic.\\ First, the plasticity rule for a synapse $W_{ij}$ has a very simple form, depending only on presynaptic activity and the somatic and distal postsynaptic voltages $V_i^S$ and $V_i^A$ (Fig.~1B). Biologically, this corresponds to distal modulation of synaptic plasticity \cite{dudman2007role}. Second, the plasticity rule for a synapse $A_{ij}$ depends on presynaptic activity and postsynaptic distal voltage, gated by the inputs from the ${\bf J}$ synapses and a top-down error signal $\boldsymbol{\delta}$ passed through fixed feedback synapses ${\bf W^{FB}}$ (Fig.~1C). Lastly, the plasticity rule for $J_{ij}$ follows perceptron-like learning $\Delta J_{ij} \propto r_i^{(t)} \left(r_j^{(t+1)} - \sum_k J_{jk} r_k^{(t)}\right)$ to implement one-step prediction of recurrent dynamics. ${\bf J}$ is meant to approximate the Jacobian of the recurrent dynamics $\partial {\bf r}^{(t+1)} /\partial {\bf r}^{(t)}$. Since all the required signals cannot be simultaneously represented at the level of voltages, we require two distinct phases for the network dynamics: a ``somatic'' phase for the updates of ${\bf W}$ and ${\bf J}$, and a ``distal'' phase for the updates of ${\bf A}$. Biologically, this could be mediated by targeted inhibition that dynamically gates out unwanted inputs \cite{somogyi2014temporal}.\\ Each plasticity rule minimizes an implicit loss function w.r.t. 
its synaptic sub-population: ${\bf W}$ updates to improve task performance, ${\bf A}$ to improve the approximation of credit assignment, and ${\bf J}$ to approximate the Jacobian. Fig.~1D shows the evolution of the corresponding losses over learning. As expected, the losses decrease and stabilize. Interestingly, the saturation happens fastest for ${\bf J}$, driving learning in ${\bf A}$, which in turn drives learning in ${\bf W}$, predicting possible differences in timescales of plasticity at distal vs. basal synapses. What does the network learn? Performance-wise, the network produces the correct output $75\%$ of the time, which is the theoretical bound given inherent randomness in the task. The 3 blue dashed lines represent cross-entropy bounds for ``internalizing'' the different dependencies between the inputs and outputs. The upper-most dashed line represents learning of the marginal output statistics, i.e. that $62.5\%$ of outputs are 1, while the second and third dashed lines represent learning of 4- and 6-time-step lags, respectively. On average over many random seeds, our model reliably learns the 4-time-step lag and is sufficiently close to the next bound to indicate some knowledge of the 6-back component. Failing to learn the second dependency is not entirely surprising, because optimal performance w.r.t. cross entropy requires a perfect calibration of confidence at each time step. Moreover, vanilla recurrent networks are known to struggle with long-term dependencies even with full BPTT \cite{pascanu2013difficulty}. The fact that our local approximation of a much more complicated algorithm can learn long-term dependencies at all is exciting.\\ In summary, we have designed a network model that can learn long-term dependencies using biologically plausible, local learning rules. 
The required biological features for calculating credit assignment include multi-compartment neurons, distinct phases for circuit dynamics, and spatial clustering of synapses with similar function. While functional roles of different compartments and their distinct plasticity properties have received experimental attention, there is relatively little theoretical work on their computational significance \cite{guerguiev2017towards}. Our work is an important step in this direction. \bibliographystyle{ieeetr}{\footnotesize \bibliography{ref}} \end{document}}
\caption{We measure the (non-)convergence to equilibrium in the separable convex-concave problem ($f(x,y) = \alpha( x^2 - y^2 )$, left three plots) and the concave-convex problem ($f(x,y) = \alpha( -x^2 + y^2 )$, right three plots), for $\alpha \in \{1.0,3.0,6.0\}$. Color coding: \textcolor{Cyan}{GDA, SGA, LCGD, CGD}, \textcolor{RedOrange}{ConOpt}, \textcolor{Green}{OGDA}. The y-axis measures $\log_{10}(\|(x_{k},y_{k})\|)$ and the x-axis the number of iterations $k$. Note that convergence is desired for the first problem, while \emph{divergence} is desired for the second problem.}
\caption{\label{tab:syn1}Quantitative evaluations on the proposed synthetic dataset. ``Blind'' denotes images with variable blur kernels, and ``Non-blind'' denotes a fixed kernel.}
\caption{Comparison of different weakly-supervised semantic segmentation methods on PASCAL VOC 2012 val set. The best two methods are marked with \textcolor{red}{red} and \textcolor{blue}{blue}, respectively.}
\caption{Comparison of different weakly-supervised semantic segmentation methods on PASCAL VOC 2012 test set. The best two methods are marked with \textcolor{red}{red} and \textcolor{blue}{blue}, respectively. The ``-'' denotes unknown values that were not reported by the corresponding paper.}
\caption{Examples of replacing templates. \texttt{Template 1}'s are the initial generated templates, while the remaining ones are produced by the authors. We use \textbf{bold} to denote the heads and use \textcolor{red}{\textit{italic red}} to denote mistaken words.}
\caption{Two iterations of our attack: Transformation \textcolor{sunygreen!50!black}{\ding{182}} changes the control statement \code{for} $\rightarrow$ \code{while} and transformation \textcolor{sunygreen!50!black}{\ding{183}} manipulates the API usage \code{ostream} $\rightarrow$ \code{printf} to imitate the stylistic patterns of author B.}
\caption{Fairness metric (SPD) and accuracy (Acc) averaged over 100 repetitions using CART as the predictive model. We show results for different subsets of data: all rows, only the rows with missing values (\missing), only the rows without missing values, and a sample of the latter of the same size. The symbols \amplifies{} and \reduces{} indicate whether the bias has been amplified or reduced, respectively, in comparison to the corresponding columns in Table \ref{tab:PRE}. Bold figures represent the fairest result (closest to 0). The star symbol denotes statistical significance in a multiple pairwise comparison between the means of the columns \emph{SPD (with \missing)}, \emph{SPD (w/o \missing)} and \emph{SPD (sample w/o \missing)}.}
\caption{The triangle mesh and texture models generated from the point clouds reconstructed using the red channel with IC dye. The texture shown is the inner texture of the stomach. A video version is available at the following link (\textcolor{blue}{http://www.ok.sc.e.titech.ac.jp/res/Stomach3D/}).}
\caption{ Schematic representation of the canonical boundary-value problem. Surface-wave propagation is parallel to the $x$ axis, while the optic axis of material $\calA$ is parallel to $\hat{\#{u}}_{\rm }$ which lies in the $xy$ \red{plane at} an angle $\psi$ relative to the $x$ axis. }
\caption{Time distributions of LLR, VLBI (opa2018a), and optical CPO series. \textcolor{Black}{The time spans of the CPO series are: LLR, 1970.6--2017.7; VLBI (opa2018a), 1979.9--2018.5; and optical (OA00), 1899.7--1991.9.}}
\caption{Weighted fits of models 1 and 2 to optical residuals (1899.7-\textcolor{Black}{1991.9}) corresponding to the IAU\,2006/2000 model.}
\caption{\textcolor{Black}{Details after 1985 of} d$X$ and d$Y$ residuals from LLR observations, using changing windows of 50 counts and within 70 days. \textcolor{Black}{(Although the number of normal points per window is nominally set to 50, it appears that 67 of the 483 windows, i.e., more than 10\%, contain fewer than 50 points; in those cases the window length is about 70 days.)}}
\caption{\textcolor{Black}{Details after 1985 of} d$X$ and d$Y$ residuals from LLR observations, using sliding windows of 70 days.}
\caption{\textcolor{Black}{Time series of observational errors and number of used normal points; the second figure shows details of the first one. Each point \textcolor{Black}{or} bin represents the weighted average of the total within a year.}}
\caption{Correlation coefficients of the LLR fitting results in Sect.~\ref{section_discuss}, of the sliding-window \textcolor{Black}{(the first part)} and changing-window \textcolor{Black}{(the second part)} methods.}
\caption{The rest-frame $1500\,\angstrom$ image of galaxy A2 at $z=2.95$, with 250-\micron\, contours overlaid in red. The long- and short-wavelength emission occupy strikingly different spatial regions. While there is recent star-formation across the extent of the galaxy disk, light at short wavelengths does not escape from regions of high dust density. This leads to a spatial offset between the FUV and dust continuum emission.}
\caption{Spoiler sentence detection results on \dataname{Goodreads} and \dataname{TV Tropes}, where arrows indicate the performance boost ({\color{red}$\uparrow$}) or drop ({\color{blue}$\downarrow$}) compared with the base model in each group. Best results are \textit{\underline{highlighted}}. }
\caption{$~${\small\color{blue!96!black} ${Online\_Abduction}(\mathcal{V}, \Sigma)$} \label{alg:VA} }
\caption{Average performance ranks (lower is better) on OpenML-100 {\em vs} CPU time of the Vanilla versions of \textcolor{blue}{\mosaic\ (bottom)}, \textcolor{red}{\as\ (middle)}, and \textcolor{darkspringgreen}{\TPOT\ (top)}. Better seen in color.}
\caption{Sensitivity study w.r.t. \hyper\ $C_{ucb}$ and $PW$ (progressive widening in expansion phase), for $n_r = 100$: Average rank of \mosaic.Vanilla against \as.Vanilla (the lower, the better). Better seen in color (\mosaic\ in blue and \as\ in red).}
\caption{Sensitivity study w.r.t. \hyper\ $n_s$ for $C_{ucb} = 1.3$ and $PW = 0.6$: Average rank of \mosaic.Vanilla against \as.Vanilla. Better seen in color (\mosaic\ in blue and \as\ in red).}
\caption{Comparative assessment of \textcolor{blue}{\mosaic} and \textcolor{red}{\as}: Average performance rank (the lower the better) on OpenML-100 {\em vs} CPU time of the Vanilla, Ensemble, MetaLearning and Ensemble+MetaLearning variants (left to right). Better seen in color.}
\caption{ \model{} enables users to explore and understand deep neural networks by comparing aggregated activation distributions across layers, classes, and instances. Here, our user \user{} explores how a \textit{one-pixel attack} harms a model's prediction. She selects: \circledtext{1} the network's last two layers (\texttt{dense\_1}, \texttt{dense\_2}) from the \textit{Layers} view; \circledtext{2} the \catcolor{cat} (orange) and \dogcolor{dog} (purple) classes from the \textit{Classes} view; and \circledtext{3} a pair of \attackedcolor{attacked} (red) and \benigncolor{benign} (black) cat images from the \textit{Instances} view; one pixel in the attacked version has been manipulated to fool the network. \circledtext{4} In the \textit{Activation} view, each neuron's activation distribution is displayed as a horizontal density bar graph. Noting the distribution similarities between \catcolor{cat} and \dogcolor{dog}, \bigcircledtext{5a} \user{} wants to discover how activations of the \attackedcolor{manipulated} cat image resemble those of \dogcolor{dogs}, so she sorts the neurons by activations of the attacked input, revealing that the attacked image, though classified as a dog, activates the network quite differently from the majority of dog images, with large ``neural divergence''. }
\caption{Graphical illustration of the pre-training procedure. We first randomly mask the input medical codes using a \textit{[MASK]} symbol. \textcolor{orange}{\textbf{Orange arrow}}: the self-prediction task takes $\bm{v}_m$ or $\bm{v}_d$ as input to restore the original medical codes of the same type. \textcolor{green}{\textbf{Green arrow}}: the dual-prediction task takes one type of visit embedding, such as $\bm{v}_m$ or $\bm{v}_d$, and tries to predict the other type of medical codes.}
\caption{One example path captured by an Apple iPad. (a)~GNSS/platform location positions with uncertainty radius. The samples in \textcolor{mycolor1}{green} were removed for the gap experiment. (b)~The visual-inertial odometry (Apple ARKit) track that was captured for reference/validation. The ARKit fuses information from the IMU and device camera. The path has been manually aligned to the starting point and orientation. (c)~Our iterative solution of the gap experiment, where we fuse the iPad IMU readings with the \textcolor{mycolor0}{blue} GNSS locations in (a). Note that the lines straighten along the roads and the corners are square. (d)~Example frames along the path showing the test environment. Associated camera poses shown in (b). (Best viewed zoomed in.)}
\caption{Progression of various error metrics over the course of global iterated EKF iteration steps. The top subfigure depicts the root mean square error (\textcolor{mycolor0}{RMSE}), the middle subfigure the mean absolute error (\textcolor{mycolor1}{MAE}), and the bottom subfigure the negative log predictive density (\textcolor{mycolor2}{NLPD})---all w.r.t.\ left-out GNSS locations. After 20 iterations, the RMSE and MAE, which capture the mean, appear to have converged, but the NLPD still shows a slope.}
\caption{Illustration of coherence assessment, where H-GRU refers to hierarchical GRU and the symbol \textcolor{blue}{$\oplus$} denotes vector concatenation.}
\caption{Sentiment analysis examples where our HashtagMaster segmentation tool helped. \textcolor{red}{Red} and \textcolor{blue}{blue} words are negative and positive entries in the Twitter sentiment lexicon \cite{tang-EtAl:2014:Coling}, respectively.}
\caption{Overview of our DualDis framework. On the left we illustrate the behavior of our encoder-decoder, learned to explicitly separate complementary representations of identity (top) and attributes (bottom) in dual latent subspaces. In the middle, we illustrate its \textit{disentangling} ability by being able to mix the identity of a first image and the attributes of a second. In the first example, the green man takes the attributes of the yellow image, becoming a smiling woman with brown bangs. As our model also linearizes the factors of variation, one can perform \textit{image editing} (right). For the first example (blue woman), we move the representation \textcolor{bluefig}{\scriptsize \ding{54}} along the directions male (first line) and glasses (second line) to add those attributes.}
\caption{\small \textbf{Qualitative results on Pascal-VOC and Pascal-Context}. (a) Input image, (b) semantic segmentation ground-truth, (c) segmentation without zero-shot learning, (d) results with proposed ZS3Net. Unseen classes: \setlength{\fboxsep}{1pt}\colorbox{col_plane}{\textcolor{white}{plane}} \setlength{\fboxsep}{1pt}\colorbox{col_cat}{\textcolor{white}{cat}} \setlength{\fboxsep}{1pt}\colorbox{col_cow}{\textcolor{white}{cow}} \setlength{\fboxsep}{1pt}\colorbox{col_boat}{\textcolor{white}{boat}}; Some seen classes: \setlength{\fboxsep}{1pt}\colorbox{col_horse}{\textcolor{white}{dog}} \setlength{\fboxsep}{1pt}\colorbox{col_bird}{\textcolor{white}{bird}} \setlength{\fboxsep}{1pt}\colorbox{col_dog}{\textcolor{white}{horse}} \setlength{\fboxsep}{1pt}\colorbox{col_table}{\textcolor{white}{dining-table}}. Best viewed in color. }
\caption{\small \textbf{Zero shot segmentation with self-training}. (a) Input image, (b) semantic segmentation ground-truth, (c) segmentation with ZS3Net, (d) result with additional self-training (ZS5Net). Unseen classes: \setlength{\fboxsep}{1pt}\colorbox{col_bike}{\textcolor{white}{motorbike}} \setlength{\fboxsep}{1pt}\colorbox{col_sofa}{\textcolor{white}{sofa}}; Some seen classes \setlength{\fboxsep}{1pt}\colorbox{col_car}{\textcolor{white}{car}} \setlength{\fboxsep}{1pt}\colorbox{col_chair}{\textcolor{white}{chair}}. Best viewed in color. }
\caption{\small \textbf{Introducing and addressing zero shot semantic segmentation}. In this example, there are no `motorbike' examples in the training set. As a consequence, a supervised model (middle) fails on this object, seeing it as a mix of the seen classes \setlength{\fboxsep}{1pt}\colorbox{col_person}{\textcolor{white}{person}}, \setlength{\fboxsep}{1pt}\colorbox{col_cycle}{\textcolor{white}{bicycle}} and \setlength{\fboxsep}{1pt}\colorbox{black}{\textcolor{white}{background}}. With proposed ZS3Net method (right), pixels of the never-seen \setlength{\fboxsep}{1pt}\colorbox{col_bike}{\textcolor{white}{motorbike}} class are recognized.}
\caption{ \textbf{Additional qualitative results on Pascal-VOC}. From top to bottom, results of the $2$-, $4$-, $6$-, $8$- and $10$-unseen set-ups. (a) Input image, (b) semantic segmentation ground-truth, (c) segmentation without zero-shot learning, (d) results with proposed ZS3Net. Unseen classes: \setlength{\fboxsep}{1pt}\colorbox{col_bike}{\textcolor{white}{motorbike}} \setlength{\fboxsep}{1pt}\colorbox{col_plane}{\textcolor{white}{plane}} \setlength{\fboxsep}{1pt}\colorbox{col_cat}{\textcolor{white}{cat}} \setlength{\fboxsep}{1pt}\colorbox{col_sofa}{\textcolor{white}{sofa}} \setlength{\fboxsep}{1pt}\colorbox{col_train}{\textcolor{white}{train}} \setlength{\fboxsep}{1pt}\colorbox{col_chair}{\textcolor{white}{chair}}; Some seen classes: \setlength{\fboxsep}{1pt}\colorbox{col_cycle}{\textcolor{white}{bicycle}} \setlength{\fboxsep}{1pt}\colorbox{col_boat}{\textcolor{white}{boat}} \setlength{\fboxsep}{1pt}\colorbox{col_horse}{\textcolor{white}{dog}} \setlength{\fboxsep}{1pt}\colorbox{col_person}{\textcolor{white}{person}} \setlength{\fboxsep}{1pt}\colorbox{black}{\textcolor{white}{background}}. Best viewed in color.}
\caption{ \textbf{Additional qualitative results on Pascal-Context}. From top to bottom, results of the $2$-, $4$-, $6$-, $8$- and $10$-unseen set-ups. (a) Input image, (b) semantic segmentation ground-truth, (c) segmentation without zero-shot learning, (d) results with proposed ZS3Net. Unseen classes: \setlength{\fboxsep}{1pt}\colorbox{col_cow}{\textcolor{white}{cow}} \setlength{\fboxsep}{1pt}\colorbox{col_cat}{\textcolor{white}{cat}} \setlength{\fboxsep}{1pt}\colorbox{col_boat}{\textcolor{white}{boat}} \setlength{\fboxsep}{1pt}\colorbox{col_bird}{\textcolor{white}{bird}} \setlength{\fboxsep}{1pt}\colorbox{col_plane}{\textcolor{white}{plane}}; Some seen classes: \setlength{\fboxsep}{1pt}\colorbox{col_dog}{\textcolor{white}{horse}} \setlength{\fboxsep}{1pt}\colorbox{col_horse}{\textcolor{white}{dog}} \setlength{\fboxsep}{1pt}\colorbox{black}{\textcolor{white}{background}} \setlength{\fboxsep}{1pt}\colorbox{col_sky}{\textcolor{white}{sky}}. Best viewed in color.}
\caption{ \textbf{Additional qualitative results with self-training on Pascal-VOC}. From top to bottom, results of the $2$-, $4$-, $6$-, $8$- and $10$-unseen set-ups. (a) Input image, (b) semantic segmentation ground-truth, (c) segmentation with ZS3Net, (d) result with additional self-training (ZS5Net). Unseen classes: \setlength{\fboxsep}{1pt}\colorbox{col_bike}{\textcolor{white}{motorbike}} \setlength{\fboxsep}{1pt}\colorbox{col_plane}{\textcolor{white}{plane}} \setlength{\fboxsep}{1pt}\colorbox{col_cat}{\textcolor{white}{cat}} \setlength{\fboxsep}{1pt}\colorbox{col_train}{\textcolor{white}{train}} \setlength{\fboxsep}{1pt}\colorbox{col_chair}{\textcolor{white}{chair}}; Some seen classes: \setlength{\fboxsep}{1pt}\colorbox{col_cycle}{\textcolor{white}{bicycle}} \setlength{\fboxsep}{1pt}\colorbox{col_boat}{\textcolor{white}{boat}} \setlength{\fboxsep}{1pt}\colorbox{col_person}{\textcolor{white}{person}} \setlength{\fboxsep}{1pt}\colorbox{black}{\textcolor{white}{background}}. Best viewed in color.}
\caption{ \textbf{Additional qualitative results with self-training on Pascal-Context}. From top to bottom, results of the $2$-, $4$-, $6$-, $8$- and $10$-unseen set-ups. (a) Input image, (b) semantic segmentation ground-truth, (c) segmentation with ZS3Net, (d) result with additional self-training (ZS5Net). Unseen classes: \setlength{\fboxsep}{1pt}\colorbox{col_cow}{\textcolor{white}{cow}} \setlength{\fboxsep}{1pt}\colorbox{col_cat}{\textcolor{white}{cat}} \setlength{\fboxsep}{1pt}\colorbox{col_sofa}{\textcolor{white}{sofa}} \setlength{\fboxsep}{1pt}\colorbox{col_boat}{\textcolor{white}{boat}} \setlength{\fboxsep}{1pt}\colorbox{col_bird}{\textcolor{white}{bird}} \setlength{\fboxsep}{1pt}\colorbox{col_plane}{\textcolor{white}{plane}}; Some seen classes: \setlength{\fboxsep}{1pt}\colorbox{col_dog}{\textcolor{white}{horse}} \setlength{\fboxsep}{1pt}\colorbox{col_horse}{\textcolor{white}{dog}} \setlength{\fboxsep}{1pt}\colorbox{black}{\textcolor{white}{background}} \setlength{\fboxsep}{1pt}\colorbox{col_sky}{\textcolor{white}{sky}}. Best viewed in color.}
\caption{The top panel shows the (XRT background-subtracted, 0.5--10 keV) light curve of \mx\,during its main outburst (shown in grey) and the subsequent re-brightenings (each indicated by a different colour). The bottom panel shows the hardness-intensity diagram for the main outburst and the re-brightenings (employing the same colour scheme as adopted for the light curve in the upper panel). The data points marked with a star have quasi-simultaneous radio data, obtained using ATCA.}
\caption{Panel I shows the X-ray light curve evolution of \mx, observed using {\it Swift} (see also Figure \ref{fig_lc_hid}, upper panel). Representative points from the main outburst are shown in grey in Panels I and II of this figure. The vertical dotted grey lines roughly indicate the times of observed X-ray state transitions. Data corresponding to a source intensity $>$3 counts s$^{-1}$ and $\leq$3 counts s$^{-1}$ are shown using blue $\textrm{\ding{117}}$ and green $\bullet$, respectively, in Panel I. Panel II shows the hardness evolution of the source with the blue $\textrm{\ding{117}}$ and green $\bullet$ indicating the soft and hard data (i.e., hardness $>$3 or $<$3), respectively. Panel III shows the radio fluxes in the 5.5, 9, 17, and 19 GHz bands (detections shown using yellow $\bullet$, red $\textrm{\ding{117}}$, green $\textrm{\ding{110}}$, and dark blue $\textrm{\ding{78}}$, respectively; upper limits are always shown using {\color{gray} $\textrm{\ding{116}}$} in the appropriate colour) obtained using ATCA. Panel IV shows the radio spectral index $\alpha$ determined using the ATCA data.}
\caption{A table from the WikiBio dataset (right), its reference description and three hypothetical generated texts with scores assigned to them by automatic evaluation metrics. Text which cannot be inferred from the table is in {\color{burgundy}{\textbf{red}}}, and text which can be inferred but isn't present in the reference is in {\color{ferngreen}{\textbf{green}}}. PARENT is our proposed metric.}
\caption{Partial results on zero-shot and parallel directions on the Europarl dataset under various multilingual training conditions ({\color[HTML]{5799c6}blue}: \texttt{default}, {\color{red}red}: \texttt{large-bs}, {\color{orange}orange}: \texttt{pytorch-init}, {\color[HTML]{2da02b}green}: \texttt{attn-drop}, {\color[HTML]{bda1d6}purple}: \texttt{layerwise-attn}). The dashed lines are the pivot-based or direct translation results from baseline models.}
\caption{Learning curves of the two proposed approaches (LM, BTZS) and the vanilla ZS on Europarl Fr $\!\rightarrow\!$ De with two conditions (\texttt{default}, \texttt{large-bs}). The {\color{red}red} dashed line is the pivot-based baseline.}
\caption{\label{Fig_TNet} The framework architecture of \textcolor[rgb]{1,0,0}{TNet}/\textcolor[rgb]{0,0,1}{TNet-ATT}. Note that TNet-ATT is the variant of TNet replacing CNN with an attention mechanism. }
\caption{An example of the news article comment generation task, which is to generate new comments given the title and content of the news. Because the article is too long, only the first sentence and three fragments with topic words ({\color{blue}blue}) are shown. Note that the title and the first sentence differ markedly from those of traditional news: they do not summarize the content of the article.}
\caption{(\textit{a}) Inner- and (\textit{b}) outer-scaled mean velocity profiles for the different cases (lines -- fluid velocity; symbols (colour-matched) -- particle velocity). The inset in panel (\textit{a}) shows the inner-scaled difference between fluid and particle velocity profiles $\Delta\left<u\right>$.}\label{fig:umean} \end{figure} As for the particles, they tend to flow slower than the fluid in the buffer layer ($10 \lesssim y/\delta_\nu\lesssim 40-50$, with $\delta_\nu = \nu/u_\tau$; see also the figure inset, showing the inner-scaled difference between the profiles of each phase). This velocity difference is more pronounced in case \textit{PP}, where the particle velocity reduction is clear also very close to the wall. This has been observed in previous studies using point-particle DNS~\citep{Sardina-et-al-JFM-2012}, and is attributed to the preferential sampling of the fluid low-speed streaks near the wall. The particle-resolved cases show a much weaker reduction of the mean particle velocity, which suggests that particles reside in the low-speed regions for shorter times before resuspending into the bulk; see also the discussion below about the particle dynamics. In the viscous sublayer, particles show a slightly higher mean velocity in the interface-resolved simulations. This higher slip velocity causes \textit{hot-spots} of higher wall shear stress, which favour an increase in overall drag \citep{Costa-et-al-PRL-2016,Costa-et-al-JFM-2018}. Clearly this effect is significant in case \textit{D}, where the near-wall number density is high enough, but not in case \textit{VD}.\par % Figure~\ref{fig:velstats} shows the second-order statistics of fluid and particle velocity. Focusing first on the fluid phase, we see once more that the data for \textit{VD} tend to those of the single-phase flow, with the minor differences attributed to a better statistical convergence of case \textit{PP}, and possibly a slight two-way-coupling effect. 
Conversely, turbulence modulation is evident for case \textit{D}. Here the Reynolds stresses are higher, consistently with the overall drag increase (see the figure inset). Moreover, small differences are found for all the velocity root mean square (r.m.s.) values of case \textit{D} near the wall, where the velocity fluctuations become less anisotropic, i.e.\ $u_r$ decreases and $v_r$ and $w_r$ increase. This is attributed to the enhanced mixing due to the near-wall particles, whose local mass fraction is high enough for two-way-coupling effects to be significant. \par \begin{figure} \centering \includegraphics[width=0.9\textwidth]{figures/fig5-eps-converted-to.pdf} \put(-180,110){(\textit{d})} \put(-345,110){(\textit{c})} \put(-180,215){(\textit{b})} \put(-345,215){(\textit{a})} \caption{Outer-scaled second-order moments of particle velocity. (\textit{a}) streamwise velocity r.m.s.\; (\textit{b}) wall-normal velocity r.m.s.\; (\textit{c}) ditto for spanwise velocity. (\textit{d}) Reynolds stresses profile with inner-scaled inset. Lines -- fluid; symbols (colour-matched) -- particles.}\label{fig:velstats} \end{figure}\par % Interestingly, the second-order moments of the particle velocity for the fully resolved \emph{one-way}-coupling case, \textit{VD}, strongly differ from those of the point-particle simulations near the wall, while in the bulk the two cases display a similar behaviour. In the bulk, where the local shear is relatively low, the point-particle model succeeds in predicting the particle dynamics. We should note that the same closure for the point-particle dynamics was used in \cite{Mehrabadi-et-al-JFM-2018} for decaying HIT, and the results also compared well to the corresponding interface-resolved case. Closer to the wall ($y/h \lesssim 0.1$), however, the interface-resolved simulations show higher fluctuation levels than the point-particle reference. 
One can observe a clear change in trend in case \textit{VD} for the profiles pertaining to the quantities in the plane at $y/h\approx 0.1$. There is a clear local minimum of $v_r$, and a sudden change in slope for $\left<u'v'\right>$. The exception is the spanwise velocity r.m.s.\ $w_r$, which attains similar values also close to the wall for cases \textit{VD} and \textit{PP}. We should note that similar trends for the streamwise and wall-normal particle velocity r.m.s.\ have been observed in recent experiments of particle-laden turbulent downward flow in a vertical channel; see \cite{Fong-et-al-JFM-2019}. All these observations suggest differences in the single-particle dynamics, in particular in the way particles approach and depart from the wall in the two models. In the one-way-coupled point-particle DNS, the particle dynamics is modelled by a simple drag law without considering shear-induced lift forces. In this case, particles are driven towards the wall with high velocity by turbophoresis. Their inertia prevents resuspension, resulting in long periods of wall accumulation in low-speed regions, while only few of them drift back into the bulk~\citep{Soldati-and-Machioli-IJMF-2009,Sardina-et-al-JFM-2012}. % Since at equilibrium the net wall-normal particle flux is zero, a large number of particles accumulate at the wall. Conversely, when the flow around particles is resolved, these tend to reside for much shorter times near the wall before re-suspending. Hence, point particles tend to skim along the wall in low-speed streaks for long periods~\citep{Soldati-and-Machioli-IJMF-2009,Sardina-et-al-JFM-2012}, whereas resolved particles show shorter residence times at the wall, quickly take off, and are not preferentially localized in low-speed streaks, see figure~\ref{fig:visu_plane}(\textit{b}). 
% This faster cycle explains the larger value of $v_r$ near the wall, and consequently the larger values of $u_r$ and $\left<u'v'\right>$ since the fluctuations are correlated through the mean shear. To better quantify this effect, figure~\ref{fig:res_time} shows the average time that a particle close to the wall (i.e.\ located at $y\approx D/2$) needs to exit the viscous sublayer (i.e.\ to reach a wall-normal position $y>5\delta_\nu$), $\Delta t^{up}$. The figure shows that near-wall particles in the fully resolved cases take about the same time to exit the viscous sublayer, which is about one order of magnitude shorter than that of the point-particle DNS \textit{PP}.\par % The particle dynamics just described suggests that a shear-induced lift force is the key ingredient missing from the point-particle model. Such a force plays a very important role in the particle dynamics near the wall, where the mean shear is high \citep{Wang-et-al-IJMF-1997,Soldati-and-Machioli-IJMF-2009}. It is known that a particle flowing near a wall in a shear flow experiences a strong lift force \citep{Saffman-JFM-1965,Mclaughlin-JFM-1991,Cherukat-and-McLaughlin-JCM-1994,Bagchi-and-Balachandar-PoF-2002,Magnaudet-JFM-2003}. A similar mechanism should work for small particles in the viscous sublayer of a turbulent flow.\par \begin{figure} \centering \includegraphics[width=0.59\textwidth]{figures/fig6-eps-converted-to.pdf} \caption{Inner-scaled average time a wall-skimming particle takes to reach a wall-normal distance $y>5\delta_\nu$ (i.e.\ to exit the viscous sublayer), $\Delta t^{up}$. The dashed yellow line corresponds to the time a wall-skimming particle takes, in a DNS of a model laminar Couette flow at equivalent Reynolds number, to reach the same inner-scaled wall-normal distance.}\label{fig:res_time} \end{figure}%\par % To confirm the nature of the mechanism for particle detachment, we performed an auxiliary DNS of laminar Couette flow at the same particle Reynolds number. 
The computational domain has size $L_x\times L_y\times L_z=20D\times 10D \times 10D$ with a regular grid where $D/\Delta x = 16$. The boundary conditions are the same as for the turbulent channel flow, except that the flow is now driven by a non-zero streamwise velocity $U_w$ at $y=L_y$. The Reynolds number based on the local shear rate $\dot{\gamma} = U_w/L_y$ and particle size is set to match that of the particle in the viscous sublayer, i.e.\ $\dot{\gamma}D^2/\nu = (u_\tau/\delta_\nu) D^2/\nu$. A single particle with the same physical properties is placed at the bottom wall, with initial linear and angular velocity conforming to the local flow velocity and vorticity. The flight time $\Delta t^{up}$ for the particle to detach from the wall and travel $5$ viscous units in $y$ is reported by the dashed line of figure~\ref{fig:res_time}. The measured time is remarkably close to the average value measured in the interface-resolved DNS for the two turbulent cases under consideration. This strongly suggests that the mechanism for particle detachment from the wall is, to first approximation, purely shear-driven. Moreover, the following scaling considerations suggest that shear-induced lift forces can be important close to the wall. Let us neglect short-range particle--wall interactions and consider the particle dynamics in the viscous sublayer to be modelled by an unbounded linear shear flow. 
In the limit of vanishing particle Reynolds number, the ratio of the shear-induced lift force to the drag force scales with $D^+$ (cf.\ eqs.~\eqref{eqn:drag} and~\eqref{eqn:lift} with $J=1$ for Saffman lift, and note that in this case $|\boldsymbol{\omega} \times\mathbf{U}_s|=|\boldsymbol{\omega}||\mathbf{U}_s|=(u_\tau^2/\nu)|\mathbf{U}_s|$): \begin{equation} \left|\frac{\mathbf{F}_l}{\mathbf{F}_d}\right|_{{y^+<5}} = \frac{1.615}{3\pi}\sqrt{\frac{D^2|\boldsymbol{\omega}|}{\nu}} = 0.171 D^+\mathrm{,}% = {O}(1)\mathrm{,} \end{equation} with a proportionality coefficient $O(1)$ in the present set-up. This suggests that shear-induced lift forces cannot be neglected near the wall when $D^+\gtrsim 1$, as these can be as high as the streamwise drag forces. It should be noted, however, that the lift force might remain important for lower values of $D^+$, since particles in the viscous sublayer would tend to flow for a longer time while subjected to this wall-normal force. Away from the wall, instead, the order of magnitude of the lift force should become negligible with respect to that of the viscous drag, just like the other terms in the Maxey-Riley-Gatignol equations that are typically neglected for small inertial particles in wall-bounded turbulent flows \citep[see e.g.][]{Wang-et-al-IJMF-1997}.\par % We have seen that the dynamics of resolved and point particles is similar in the bulk, and thus the drift towards the wall is well described by point-particle methods. When moving close to the wall, the high shear induces strong lift forces that quickly dislodge particles. The lower value of the near-wall volume fraction peak in figure~\ref{fig:ruuz_out_phase}(\textit{b}) for case \textit{D}, and its shorter average take-off time in figure~\ref{fig:res_time} are consistent with this picture, as the mean wall shear is higher in this case. 
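\par As a rough check of the force ratio quoted above, the intermediate step can be written out explicitly. This is only a sketch: it assumes the viscous-sublayer vorticity magnitude $|\boldsymbol{\omega}| = u_\tau^2/\nu$ used in the text, and it assumes the usual inner-scaled definition $D^+ = D u_\tau/\nu = D/\delta_\nu$, which is not restated in this passage. Under these assumptions,
\[
\sqrt{\frac{D^2|\boldsymbol{\omega}|}{\nu}} = \sqrt{\frac{D^2 u_\tau^2}{\nu^2}} = \frac{D u_\tau}{\nu} = D^+ , \qquad \frac{1.615}{3\pi} \approx 0.171 ,
\]
so that the lift is about $17\%$ of the drag at $D^+ = 1$, and the two magnitudes would become equal at $D^+ = 3\pi/1.615 \approx 5.8$. These numbers follow only from the low-Reynolds-number Saffman and Stokes expressions and should be read as order-of-magnitude estimates.\par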
This wall detachment mechanism is absent in models considering only the drag, as in the \textit{PP} case, which instead display slow resuspension and particles spending a long time in the near-wall low-speed streaks. In the next section we assess the validity of simple shear-induced lift models to predict the particle dynamics. % \subsection*{Assessment of lift models for point-particle simulations} The most well-known models proposed to describe a shear-induced lift force in the one-way-coupling point-particle regime introduced at the end of section~\ref{sec:methods} are here compared with fully resolved simulations. In all cases, the standard Eulerian particle statistics for the point-particle models with lift force are compared to those of the model without lift, and the interface-resolved case \textit{VD}.\par % Figure~\ref{fig:phase_umean_parts}(\textit{a}) shows the normalized local volume fraction profiles for the different cases considered. The point-particle models accounting for lift forces predict well the wall concentration peak. The peak location is predicted slightly away from the wall, probably due to a minor stabilizing effect of short-range particle--wall interactions and softer contact, absent in the point-particle model. These secondary mechanisms can possibly be modelled with near-wall closures such as an effective (\emph{wet}) restitution coefficient that is a function of the particle impact Stokes number \citep[see e.g.][]{Joseph-et-al-JFM-2001,Legendre-et-al-CES-2006,Fong-et-al-JFM-2019}, or by more sophisticated near-wall corrections for the particle dynamics \citep{Gondret-et-al-PoF-2002,Lee-and-Balachandar-JFM-2010}. We should note that the former option may be too simple for realistically predicting the particle dynamics, as the drag force already accounts for part of the fluid effects \citep{Gondret-et-al-PoF-2002}. 
Indeed, we tested a point-particle simulation with the same dynamics as \textit{PP-Saffman} but with a lower coefficient of restitution of $0.9$; the results showed a shift towards the wall of the concentration peak that, however, overpredicts the maximum concentration by a factor of $1.3$.\par % Away from the wall, the case using the seminal lift model, \textit{PP-Saffman}, optimally predicts the concentration profile. The mean particle velocity pertaining to the cases where a lift force is accounted for is much closer to the interface-resolved case (see figure~\ref{fig:phase_umean_parts}(\textit{b})). This is particularly evident in the apparent mean particle-to-fluid velocity difference displayed in the figure inset, where the results with lift force show a much better agreement with case \textit{VD}, in particular in the region where the negative slip is highest. Overall, both lift models predict relatively well the first-order statistics pertaining to case \textit{VD} shown in figure~\ref{fig:phase_umean_parts}. This is not the case for the second-order moments of the particle velocity.\par % \begin{figure} \centering \includegraphics[width=0.49\textwidth]{figures/fig7a-eps-converted-to.pdf}\hfill \includegraphics[width=0.49\textwidth]{figures/fig7b-eps-converted-to.pdf} \put(-185,120){(\textit{b})} \put(-378,120){(\textit{a})} \caption{(\textit{a}) Local solid mass fraction as a function of the outer-scaled wall-normal distance for the different point-particle cases. (\textit{b}) Corresponding inner-scaled mean particle velocity profiles. The inset in panel (\textit{b}) shows the inner-scaled difference between fluid and particle velocity profiles $\Delta\left<u\right>$.}
\caption{Illustration of the proposed approach. Taking a 3-layer encoder as an example: (a) vanilla model without sentential context, (b) {\em shallow} sentential context representation (i.e. {\color{blue} blue square}) by exploiting the top encoder layer only; and (c) {\em deep} sentential context representation (i.e. {\color{brown} brown square}) by exploiting all encoder layers. The circles denote hidden states of individual tokens in the input sentence, and the squares denote the sentential context representations. The {\color{red} red} up arrows denote that the representations are fed to the subsequent decoder. This figure is best viewed in color.}
\caption{Quantitative comparison using test data in the LSP dataset and the strict PCK-0.2 metric \cite{DBLP:journals/pami/YangR13}. We used the person-centric annotations given in \cite{DBLP:conf/cvpr/JohnsonE11}. Ours-semi (g, l, q, and w), Ours-weak (h, m, r, and x), and Ours-weakC (i, n, s, and y) correspond to our semi-supervised learning (Section \ref{section:semi}), semi- and weakly-supervised learning (Section \ref{section:weakly}), and semi- and weakly-supervised learning with outlier detection (Section \ref{section:weak_clustering}), respectively. Our methods are implemented on top of two different baselines, Chen \& Yuille \cite{DBLP:conf/nips/ChenY14} (Baseline--1 in the table) and Wei et al.\ \cite{wei2016cpm} (Baseline--2 in the table). When the proposed method is implemented with Baseline--1/2, it is called Ours-semi--1/2, Ours-weak--1/2, and Ours-weakC--1/2. Each result is obtained on a different training dataset specified at the top of each set: {\bf LSP}, {\bf LSP+LSPext}, and {\bf LSP+LSPext+MPII}. For all of our proposed methods, the FS set consisted of only 500 images in the LSP and all remaining images were used as the WS set. For reference, the two baselines are also evaluated with only 500 images in the LSP; see (e) and (j). For a fair comparison in terms of the size of the FS set, (e) and (j) should be compared with our proposed methods. In each training set, the best scores among supervised learning methods and among methods that used only 500 images for the FS set are colored in {\bf \textcolor{red}{red}} and {\bf \textcolor{blue}{blue}}, respectively, in each column.}
\caption{Quantitative comparison using test data in the MPII dataset evaluated by PCKh-0.5 \cite{DBLP:conf/cvpr/AndrilukaPGS14}. Our proposed method (i.e., Ours-weakC--2) used 9040 images in the MPII (i.e., half of all images) for the FS set and other images in the ``LSP+LSPext+MPII'' dataset for the WS set. On the other hand, all images and annotations in MPII and ``LSP+LSPext+MPII'' were used for training in \cite{DBLP:conf/eccv/InsafutdinovPAA16,DBLP:conf/eccv/LifshitzFU16,DBLP:conf/eccv/GkioxariTJ16,bulat2016pose,newell2016eccv} (shown in the upper rows in the table) and \cite{pishchulin16cvpr,wei2016cpm} (shown in the lower rows), respectively. For reference, the results of the baseline \cite{wei2016cpm} that used only half of all images in the MPII (i.e., Baseline--2 (HALF) in the table) are shown. The best scores among supervised learning methods and among methods that used only 9040 images for the FS set are colored in {\bf \textcolor{red}{red}} and {\bf \textcolor{blue}{blue}}, respectively.}
\caption{Automatic speaker verification (ASV) assisted mimicry attack: attacker uses a public-domain ASV system to select target speakers matched with his/her voice from a public celebrity data\textcolor{\revcolor}{base}. The attacker then practices target speaker mimicry, intended to attack another independently developed ASV system.}
\caption{Details of the speaker verification systems used to simulate targeted impersonation attack against automatic speaker verification. The attacker is assumed to not have information about the attacked \textcolor{\revcolor}{system}, and hence the attacker's system differs from the attacked \textcolor{\revcolor}{system}.}
\caption{Comparison of attackers' ASV scores (log likelihood ratios) to the targets' scores for \textcolor{\revcolor}{both of the} ASV systems involved in the study. The scores are averaged over all attackers and all speech segments. The error bars represent 95 \% confidence intervals for the means.}
\caption{\textcolor{\revcolor}{Distributions of target and non-target scores in different domains. \emph{Cross-domain} non-target scores are obtained by scoring speakers from the attacker domain against the speakers from the target (VoxCeleb) domain. The simulated mimicry attacks in this work fall under the category of cross-domain trials. As the cross-domain score distributions overlap almost perfectly with the target-domain non-target distributions, the domain mismatch does not seem to make attacking more difficult, at least when the targets are Finnish.}}
\caption{The scores of the ASV systems for \textcolor{\revcolor}{the} trials used in the listening test. The scores in each score triplet (described in the legend) are from the trials that have the same target speaker enrollment utterance and the speech content is the same in all the three test segments. Scores for male and female attackers are shown in separate groups. The right side of each graph displays the mean values of the score groups together with standard error of the mean multiplied by $1.96$.}
\caption{Differences of attacker's (M1, M2, M3, M4, F1, F2) prosodic \textcolor{\revcolor}{and formant parameters to target's parameters} for all attacker-target combinations. Differences are shown for non-effort speech (\textbf{black arrow}) and for mimicked speech. The effect of mimicry is displayed with a \textbf{\textcolor{ForestGreen}{green arrow}} if it made attacker's and target's \textcolor{\revcolor}{parameters} closer to each other and with a \textbf{\textcolor{red}{red arrow}} otherwise.}
\caption{Component substitution sequence accuracy. Boldfaced accuracies are averages over models 1, 2 and 3 and are shown with their standard deviations. The remaining accuracies are for model 1.}
\caption{ Comparison of the marginal distributions and the co-occurrence matrices of the real and simulated datasets. The orders of variables are the same in Panel (a) and (b). In Panel (c) and (d), the \textcolor{red}{red color} represents pathophysiological diagnosis, the \textcolor{blue}{blue color} represents pattern diagnosis, and the \textcolor{yellow}{yellow color} represents symptom diagnosis. }
\caption{\label{tab:sampled-outputs} Sampled system outputs. The dataset and the original style for each input sentence are parenthesized. We mark improperly generated or preserved words in \blue{blue}, and mark words that show target style and are grammatical in the context in \red{red}. Best viewed in color.}
\caption{a) The proposed network architecture, which takes a point cloud and image information as input. The output is the probability of a point belonging to the target structure. The size of the dots \textcolor{pointcolor}{$\bullet$} illustrates the number of features. b) Illustration of the formation of the image information. We extract features from a volume of interest $I_\textbf{p}$ (\textcolor{imginfocolor}{blue}) from $I_Q$ (\textcolor{probabilitymapcolor}{light gray}, probability values not shown for the sake of clarity) around each point $\textbf{p}$ (\textcolor{pointcolor}{$\bullet$}) in a point cloud, which yields the image information.}
\caption{Additional example in \Ours{} annotated with different thematic relations. Entities (\textcolor{purple}{purple}), properties (\textcolor{magenta}{magenta}), constraints (\textcolor{red}{red}), and answers (\textcolor{orange}{orange}) are colored.}
\caption{Examples of almost saturated \red{packings} of chosen polyhedra of size $10 \times 10 \times 10$. (a) -- tetrahedron, (b) -- snub cube, (c) -- truncated icosidodecahedron, (d) -- truncated tetrahedron.}
\caption{Extrapolated saturated packing fractions together with corresponding $d$ parameters \red{and sphericity $\Psi$}. The values for sphere are taken from \cite{Zhang2013}. The errors of packing fraction shown are errors propagated from errors of fits. The standard deviation of mean packing fraction after $t=10^6$ was one order of magnitude smaller, so it was not taken into account.}
\caption{The dependence of packing fraction $\theta$ on sphericity $\Psi$ of Platonic and Archimedean solids. The left panel shows packing fractions after $t=10^6$, and the right for saturated packings. \red{The dashed line is a linear fit $\theta(t=10^6) = 0.18082 + 0.20263\ \Psi$ to all points excluding truncated tetrahedron.}}
\caption{The dependence of density pair correlation function on the distance between particles. Panel (a) shows real distances in simulations. In panel (b) they are normalized so that the smallest distance possible is equal to 1. \red{The error bars are smaller than the width of the lines.}}
\caption{(a)-(c) The dependence of \blue{polyhedral order $\rho_x$} for chosen Platonic and Archimedean solids on the distance between their centres. The X axis is normalized in such a way that the smallest possible distance is 1. Additionally, panel (d) shows nematic order $\rho_n(r)$.}
\caption{Procedure to decompose comparison questions via heuristics. Two entities for comparison (\hl{yellow span}), the coordination (\hlpink{pink span}), the preconjunct or the predeterminer (\hlorange{orange span}), the quantitative indicator ({\underline{underlined}}) and the head entity (\hlcyan{cyan span}) are specified. }
\caption{An example of a multi-hop question from \hotpot{}. The first cell shows the given question and two of the given paragraphs (the other eight paragraphs are not shown), where the {\protect\color{red}{red text}} is the groundtruth answer. Our system selects a \hl{span} over the question and writes two \queries{} shown in the second cell.}
\caption{The example multi-hop questions from each category of reasoning type on~\hotpot. Q indicates the original, multi-hop question, while Q1, Q2 and Q3 indicate \queries{}. \sys\predicts\hl{span} and {\protect\color{red}{\checkmark}} through \pointer{c}, generates \queries{}, and answers them iteratively through single-hop \RC{} model. }
\caption{Phase diagram of the AO model for a size ratio $q = 0.6$ in the $\eta_P^r$-$\eta_C$ (polymer reservoir-colloid fraction) plane displaying the same curves as in Fig. \ref{Fig:phase_diag_SW} for the SW; the ordinate corresponds to inverse temperature. Note that the Widom line \red{terminates at the FW line -- see text}.}
\caption{Phase diagram of the SHS fluid as obtained from Baxter's analytical solution \cite{Baxter1968} of the PY approximation plotted in the $\tau$-$\eta$ plane. The lines displayed correspond to those in Fig. \ref{Fig:phase_diag_SW} for SW, with the ordinate equivalent to temperature. The critical point (black square) is located at $(\eta_c, \tau_c) = (0.12, 0.09)$. The solid line denotes the region where the solution becomes unphysical. \red{Note that the W line terminates at the FW line as in Fig. \ref{Fig:phase_diag_AO}.}}
\caption{Set accuracies for \blue{$\mathcal{R}$} depending on $\varepsilon$ values}
\caption{Visualization of the 2-dimensional PCA projection of the bilingual word embeddings of the two models. The \emph{\textcolor{blue}{blue}} words represent the Chinese embeddings while the \textcolor{red}{red} words represent the English embeddings. In (a), only the similar monolingual words are clustered together. While in (b) and (c), both the monolingual and bilingual words which have similar meanings are gathered together.}
\caption{\red{The} line $L_C$, its orientation $e_C$, and the projection $p_C$.}
\caption{The \red{second portion of the deformation $\sigma$.}}
\caption{Dependence of segregation velocities $w_{p,i}$ on contact parameters. (a) Light ($\times$) and heavy ($\Circle$) particle segregation velocities are nearly independent of restitution coefficient $e$ at $z/h=0.5$ with $\mu=0$ (\textcolor{black}{black}), $0.1$ (\textcolor{red}{red}), and $0.4$ (\textcolor{blue}{blue}). (b) Particle segregation velocity decreases as $\mu$ increases for $z/h=0.3$ (\textcolor{red}{red}), $0.5$ (\textcolor{blue}{blue}), $0.7$ (\textcolor{black}{black}) with $e=0.9$. $R_\rho=8$, $c_h=c_l=0.5$, and $\dot\gamma=U/h=25~\text{s}^{-1}$.}
\caption{The t-SNE visualization of representations learned by (a) ResNet, (b) DANN, (c) CADA-A, and (d) CADA-P ({\color{blue} blue}: A; {\color{red} red}: W). (e) Proxy $\mathcal{A}$-distance for the A $\rightarrow$ W task for ResNet~\cite{he2016deep}, GRL~\cite{ganin_ICML2015}, and the proposed model CADA-P.}
\caption{ Depiction of a few claims, their \emph{perspectives} and evidences from \datasetname. The \emph{supporting} and \emph{opposing} perspectives are indicated with \colorbox{lightgreen}{green} and \colorbox{lightred}{red} colors, respectively. }
\caption{Inception scores for 3D shape generation. {\color{red}Best} and {\color{blue}Second-best} scores are shown in color. }
\caption{ Comparison on three large datasets. The best testing set performance is reported. The results below the line are from~\citet{liang2018variational}, and VAE$^{\ddag}$ shows the VAE results based on our runs. {\color{blue} Blue} indicates improvement over the VAE baseline, and {\bf bold} indicates overall best. }
\caption{NuSTAR 3--78 keV images for FPMA (left) and FPMB (right) in J2000 coordinates. The green circles denote the source region of SGR\,1900+14. The area surrounded by the green dashed line in the left image is the background region employed in this work. The extremely bright regions in both images are stray light from the nearby object GRS 1915+105.}
\caption{Pulse profiles of MOS1, MOS2, pn, and FPMA folded by the rotation period $5.22669\;{\rm s}$. \red{Background is} subtracted. The energy ranges employed are 1--10 keV for the MOSs and pn and 3--78 keV for FPMA. The horizontal axis represents two cycles of pulsation. Errors indicate $1\sigma$ confidence level.}
\caption{Pulse profiles of FPMA 3--5, 5--10, and 10--20 keV (left). Fourier transforms of each pulse profile (right). \red{Background is} subtracted. Green, black, and red represent 3--5, 5--10, and 10--20 keV, respectively. Errors indicate $1\sigma$ confidence level.}
\caption{Long-term evolution of the rotation period of SGR\,1900+14. Red crosses denote the rotation period of SGR\,1900+14 at each time, among which the right one is obtained from our work. Other data were obtained from previous studies\citep{Mereghetti2006, Marsden1999, Kouveliotou1999, Woods2002, Woods2003}. Errors are much smaller than the \red{crosses}. The red arrow denotes \red{the epoch of the} giant flare (MJD=51052). The blue hatched area denotes the post-outburst phase, which we defined in this work (see text for details), covering the data \red{since} MJD=51660. The black curve denotes the quadratic model fitted to the data in the post-outburst phase.}
\caption{Anecdotal examples of edits in {\it Google Maps} and {\it Internet Privacy} wikipage. Here the general model fails to identify negative examples while retraining the dense layer learns better representations and identifies the negative examples correctly. Page specific tokens are colored in \textcolor{blue}{blue}.}
\caption{Examples of edits in the {\it Facebook} and {\it Google} Wikipedia pages. The \textcolor{blue}{blue} bubbles are the original sentences. The \textcolor{orange}{orange} bubbles indicate damaging edits while the \textcolor{green}{green} bubbles indicate `good faith' edits. Good faith edits are unbiased formal English sentences, while damaging edits often correspond to incoherent use of language, abusive language, imperative mood, {\sandipan opinionated} sentences, etc.}
\caption{Estimated spectra (\textcolor{red}{red}) and the ground truth spectra (\textcolor{blue}{blue}) for three materials with 5 bands in Stage 2. The first row is for the single-band localization method and the second row is for the multispectral one. The columns, from left to right, correspond to M1 to M5.}
\caption{A schematic diagram showing the relationship between \acp{PDE} and \ac{FBSDE}. Terms in \textcolor{orange}{orange} denote drift in FSDE, and terms in \textcolor{green}{green} denote drift in BSDE.}
\caption{Comparison of the performance of our system using three different types of edges. \textcolor{blue}{Blue} denotes best performing frame-to-frame VO, excluding SLAM or keyframe systems. \textbf{Bold} denotes best performing system overall. A dashed line indicates that using keyframes did not improve performance.}
\caption{Solutions for the Hamiltonian $H_{NL}^{(A)}$ after a time $t$ with initial state $|N\rangle_{a}|0\rangle_{a_{2}}$. $P_{N}$ (black solid line) is the probability for all $N$ bosons to be in mode $a$; $P_{0}$ (blue dashed line) is the probability for all $N$ bosons to be in mode $a_{2}$. The parameters identify regimes optimal, or near-optimal, for the nonlinear beam splitter interaction eq. (\ref{eq:nbsa}), where $P_{N}+P_{0}\sim1$ and $P_{N}\sim\cos^{2}\omega_{N}t$. Solutions for $N=1$ (all $\kappa$, $g$); $N=2$, $\kappa=1$, $g=30$; $N=5$, $\kappa=20$, $g=333.333$; and $N=7$, $\kappa=18.23$, $g=47.85$ are similar to the left plot.}
\caption{Top: Probability distribution $P_{N}$ for the joint detection of $N_{A}$ and $N_{B}$ bosons at the sites $A$ and $B$ (refer to Fig. 2). Here $t_{a}$, $t_{b}$, $t_{a}'$, $t_{b}'$ and $\varphi$ are specified in the text, and $N=7$, $\kappa=18.23$, $g=47.85$. The distribution is unchanged for settings $(t'_{a},t_{b})$, $(t'_{a},t'_{b})$ and $(t_{a},t'_{b})$. Lower: The joint probability $P(n,m)$ of detecting $n$ bosons in mode $a$ and $m$ bosons in mode $b$, \emph{and} $N$ bosons in total at site $A$. The figures for settings $(t'_{a},t_{b})$, $(t'_{a},t'_{b})$ are identical to those of $(t_{a},t_{b})$. Similar plots are obtained for all parameters given in Fig. 1.}
\caption{Violation of the CH-Bell inequality (\ref{eq:ch-bell-ineq}) is obtained when $S>1$. The left graph is for $N=10$, $g=49.433$, $\kappa=10$. Plots for $N=1$ (all $\kappa$ and $g$); $N=2$, $\kappa=1$, $g=30$; $N=5$, $\kappa=20$, $g=333.333$; and $N=7$, $\kappa=18.23$, $g=47.85$, which give ideal two-state oscillatory behaviour (Fig. 1), are almost indistinguishable. Where the values are not quite optimal, rapid oscillations of small amplitude appear. This is shown in the right graph for $N=20$, $\kappa=165$ and $g=101$.}
\caption{Left: A contour plot for the joint probability distribution of the quadrature phase amplitudes $X_{A}$ and $X_{B}$ at times $t_{a}=\pi/3\Omega$ and $t_{b}=0$. The four outcomes depicted are macroscopically distinct for $\alpha$ large. Here $\alpha=\beta=5$. The distributions for the remaining 3 pairs of times are similar \cite{supmat}. Right: The corresponding violation $B>2$ of the Bell inequality eq. (\ref{eq:chsh-bell}) for arbitrarily large $\alpha=\beta$.}
\caption{Comparison of the baseline, complete annotation scheme and the proposed ESPA scheme (See I \& II in Fig.~\ref{fig:comparison scheme}) under three structured learning tasks (note the scale difference). Each $F_1$ value is the average of 50 experiments, and each curve is based on corresponding $F_1$ values smoothed by Savitzky-Golay filters. We can see that \textbf{scheme II is consistently better than scheme I}. Per the Wilcoxon rank-sum test, the significance levels at each given budget are shown on the x-axes, where \textcolor{red}{$+$} and \textcolor{red}{$++$} mean $p<5\%$ and $p<1\%$, respectively.}
\caption{Due to the inherent structural constraints of each task, individual instances therein put restrictions on others. (a) The temporal relation between \event{met} and \event{Thursday} has to be \lbl{Before} (``met (1)'') or \lbl{Be\_Included} (``met (2)''). (b) The argument roles of {\em a frog} and {\em to the girl} cannot be \lbl{Arg0} anymore. (c) Given the position of the cat's FOREHEAD and LEFT\_EYE, a rough estimate of its NECK can be the red solid box rather than the blue dashed box.}
\caption{ Routing comparisons on various topologies with 1D in {\color{red} red}, and 3D distances in {\color{blue} blue}. \label{fig:route3D} }
\caption{Comparisons of OCR performance. (a) Input 1 to OCR (real scribbled image), (b) Input 2 to OCR (removal of handwritten pixels from (a)), (c) and (d) OCR results for input 1 and 2 ({\color{blue}blue: correctly recognized characters}, {\color{red} red: missing or incorrect ones}).}
\caption{\textbf{Landmark detection qualitative results:} The landmark detection results on three different test images. The points marked in \textcolor{red}{red} are the ground truth landmarks, and the points in \textcolor{blue}{blue} are the ones predicted by different networks.}
\caption{Example of the discourse tree of a jigsaw puzzle review. \textcolor{blue}{StrSum} induces the latent tree and generates the summary from the children of a root, while \textcolor{red}{DiscourseRank} supports it to focus on the main review point.}
\caption{A phrase-structure tree for a sample synthesis. Dotted-boxes around constituents indicate that they are candidates for replacement on the source side~(\S\ref{sec:source-selection}). EN:~English source sentence, HI:~Hindi target sentence, CS:~code-switched sentence. The \textcolor{blue}{\textit{italicized segment}} is the target segment to replace the \textcolor{green!40!black}{\textit{source segment}} under the non-terminal SBAR.}
\caption{BERT word representations of the union of the set of contextualized word representations of {\it \textcolor[rgb]{.84,.48,.05}{relatives}, \textcolor[rgb]{0,0,1}{executive}, \textcolor[rgb]{0.7,0,.0}{wedding}, \textcolor[rgb]{0,0.4,0}{salary}} projected on to the first two principal components of the WEAT gender first names, which capture the primary component of gender. Note how the debiasing conceptor collapses {\it relatives} and {\it wedding}, and {\it executive} and {\it salary} once the bias is removed. }
\caption{Types of reasoning required for document-level RE on DocRED. The rest $0.3\%$ requires other types of reasoning, such as temporal reasoning. The {\color{blue}\bf \textit{head}}, {\color{blue_h} \bf tail} and {\color{rel} \bf \texttt{relation}} are colored accordingly.}
\caption{\textbf{Optical heating of $\alpha$-Fe$_2$O$_3$ NPs} (a) Schematic of (1) dark-field (DF) elastic scattering spectroscopy and (2) Stokes-Raman spectroscopy experimental setups. (b) Dark-field scattering spectra of a single $\alpha$-Fe$_2$O$_3$ NP (radius $R~=~$170~nm): experimental (solid blue) and numerically calculated (dashed purple). Insets depict the calculated electric field distributions at 632.8~nm and 790~nm, the wavelengths of the laser sources used in the optical heating experiments. (c) Stokes range of Raman scattering from the single $\alpha$-Fe$_2$O$_3$ NP in (b) at two intensities: \blue{0.6}~mW/$\mu$m$^2$ (`cold regime', purple line) and \blue{2.4}~mW/$\mu$m$^2$ (`hot regime', orange line). (d) Experimental (dots) and theoretical (dashed lines) dependencies of optical heating of a single $\alpha$-Fe$_2$O$_3$ NP upon irradiation by a CW laser with intensity \blue{2.4}~mW/$\mu$m$^2$ at wavelength 632.8~nm. Blue dashed and red dashed lines correspond to theoretical calculations with different thermal contact areas between the nanoparticle and the SiO\(_2\) substrate, as shown in the insets, where the nanosphere is dipped by $R/2.5$ and 0~nm into the substrate, respectively.}
\caption{\textbf{3D object detection results on KITTI validation.} We report \APBEV ~/ \AP (in \%) of the \textbf{car} category, corresponding to average precision of the bird's-eye view and 3D object detection. We arrange methods according to the input signals: M for monocular images, S for stereo images, L for 64-beam LiDAR, and L\# for \emph{sparse 4-beam} LiDAR. PL stands for \PL. \emph{Our \PL++ (PL++) results with enhanced depth estimation (\SDN and \GDC) are in {\color{blue} blue}.} Methods with 64-beam LiDAR are in {\color{gray} gray}. Best viewed in color.}
\caption{3D object detection results on the \textbf{car} category on the \emph{test} set. We compare our methods (in {\color{blue}blue}) and 64-beam LiDAR (in {\color{gray} gray}), using \PRCNN as the object detector. We report \APBEV~/ \AP at IoU = 0.7. $\dagger$: Results from the KITTI leaderboard.}
\caption{The shaded areas are ruled out by time-delay measurements of the two systems we consider, \textcolor{red}{RXJ1131-1231} and \textcolor{blue}{B1608+656}. The dashed lines show the constraint that could be achieved without the MSD (i.e., if the errorbars in the time delay were the limiting factor in the analysis). The black dashed line at $\gamma_{\rm PN}=1$ represents the prediction from GR, and the slight asymmetry around this value is due to a non-unity best-fit value in our analysis. }
\caption{Planar (blue) and normal (red) heating rates as a function of ion-surface distance for a fixed secular frequency of $\omega_t = 2\pi \times 1$~MHz. Data are taken with two measurement methods: (\protect\markersquare) sideband method, (\protect\markerdiamondfilled) Rabi method. (\protect\markerdiamondunfilled) show data from the Rabi method scaled to match results from the sideband method (see comparison in Sec. \ref{sec:results}). Power-law fits for both motional modes take into account the data taken with the sideband-asymmetry (\protect\markersquare) method and the scaled Rabi method (\protect\markerdiamondunfilled).}
\caption{{ Maximum absolute asymmetry $\max \left( \left|A_{total}\right| \right)$ out of 6 consecutive wavelength pairs in an unstratified KH instability. The entire initial phase-shift space ($\Phi_2$--$\Phi_3$) is spanned. Results are shown for the following amplitude ratios: (a) Case C1, (b) Case C2, and (c) Case C3. Each sub-figure has its own colorbar. Red, orange and white stars respectively denote cases S1, S2 and S3.}} \label{fig:asym_phase_span} \end{figure} \begin{figure*} \centering \includegraphics[width=1.0\linewidth]{fig4} \caption{(a)-(c): Total asymmetry between consecutive wavelengths in an unstratified KH instability, inferred only from the linearized initial state. (d)-(f) Nonlinear evolution of KH instability using DNS during the first merging event. (g)-(i) Transition to turbulence in merged KH billows before the second merging event. (j)-(l) Turbulent phase of KH billows just before the last merging event. (a,d,g,j) Case S1, (b,e,h,k) Case S2, and (c,f,i,l) Case S3. The time $t$ is non-dimensionalized by $h_0/\Delta U$. } \label{fig:asym_bar} \end{figure*} %Coherent structures appearing in high Reynolds number ($Re \equiv \Delta U h_0/\nu$, where $\nu$ is the kinematic viscosity, $\Delta U$ and $h_0$ are respectively the shear velocity and length scales) flows can be effectively approximated using vortex sheets \citep{krasny1986desingularization}. To this end, we implement the 2D vortex sheet method -- a Lagrangian technique where the (desingularized) Birkhoff-Rott equation is solved by discretizing the vortex sheet as an array of point vortices \citep{krasny1986desingularization}, to simulate an initial perturbation given by Eq.\, (\ref{eq:init_con2}). %The roll-up and merging patterns for the Cases S1--S3 have been respectively shown in figures\, \ref{fig:asym_bar}(d)-\ref{fig:asym_bar}(f). 
\section{Numerical simulations of shear instabilities and turbulence} \label{Sec:3}
Coherent structures appearing in transitional and turbulent shear flows can be accurately simulated using direct numerical simulations (DNS). To this end, we perform DNS at a modest Reynolds number ($Re=2000$, where $Re \equiv \Delta U h_0/\nu$; $\nu$ is the kinematic viscosity, and $\Delta U$ and $h_0$ are respectively the shear velocity and length scales), which is more realizable in laboratory and/or industrial settings. The shear layer is initially assumed to be represented by $\bar{u}=\tanh(z)$; the velocity and the vertical coordinate $z$ are respectively non-dimensionalized by $\Delta U/2$ and $h_0/2$. The governing 3D incompressible, non-dimensional Navier-Stokes equations are
\renewcommand{\theequation}{\arabic{section}.\arabic{equation}a,b}
\begin{equation}
\nabla\cdot\boldsymbol{u} = 0, \quad D\boldsymbol{u}/Dt = -\nabla p + Re^{-1}\nabla^{2}\boldsymbol{u},
\end{equation}
\renewcommand{\theequation}{\arabic{section}.\arabic{equation}}
\noindent where $D/Dt$ denotes the material derivative, $\boldsymbol{u}$ denotes the dimensionless velocity vector and $p$ denotes the dimensionless pressure. These equations are solved using a pseudo-spectral code described in detail by \cite{Winters04} and \cite{Smyth05}.
The domain dimensions in $x$, $y$ and $z$ are $(L_x,L_y,L_z)=(6\lambda_{KH},0.8\lambda_{KH},3\lambda_{KH})$, where $\lambda_{KH}$ denotes the wavelength of the most unstable KH mode. {Each simulation is initially perturbed with the eigenfunctions of the primary KH mode and its first and second subharmonics with the prescribed phase differences. The amplitudes of the eigenfunction perturbations are sufficiently small to ensure an initial linear growth of each mode. The eigenfunction perturbations are overlaid with random perturbations (of equal or smaller order of magnitude) to trigger 3D instabilities.} The boundary conditions are periodic in both the $x$ and $y$ directions and free slip at $z=0$ and $z=L_z$. The number of mesh points used is $(N_x,N_y,N_z)=(1152,160,576)$, which resolves the Kolmogorov length scale. The DNS results for the cases S1--S3 during the first merging event are respectively shown in figures\, \ref{fig:asym_bar}(d)-\ref{fig:asym_bar}(f). We find that the observed merging patterns are exactly as predicted by $\mathcal{A}^{(m,m+1)}_{total}$, shown in figures\, \ref{fig:asym_bar}(a)-\ref{fig:asym_bar}(c), i.e. neighbouring vortices pair in the case S1, triplets form in the case S2 and alternating pairing/no-pairing occurs in the case S3. The phase relations between the primary mode and its subharmonics also have significant implications for the nonlinear evolution of the vortices in later stages and their turbulent breakdown, as shown in figures \ref{fig:asym_bar}(g)-\ref{fig:asym_bar}(i) and \ref{fig:asym_bar}(j)-\ref{fig:asym_bar}(l). In case S1, the merged vortices 1--2 and 5--6 undergo a second merging event, see figure \ref{fig:asym_bar}(g).
Finally, the merged vortices 3--4 coalesce with the merged vortices 1--2--5--6 to form a single vortex; this coalescence is underway in figure \ref{fig:asym_bar}(j). The development of small-scale structures is delayed until the last merging event since most of the energy extracted from the background shear is spent on pairing (e.g. see \cite{rahmani2014effect}). For case S2, during the first merging event every three neighbouring vortices form a triplet. The two merged triplets are fairly symmetric and their centers are far apart, hence they resist pairing for a relatively long time, see figure \ref{fig:asym_bar}(h). Finally, they undergo pairing after turbulent-like structures have grown in their cores, see figure \ref{fig:asym_bar}(k). In case S3, the left-out vortices 3 and 6 eventually merge with vortices 4--5 and 1--2, respectively (figure \ref{fig:asym_bar}(i)). This merging event is a ``shredding interaction'', and leads to the most vigorous disintegration of the core vortex, see figure \ref{fig:asym_bar}(l). Interestingly, $\mathcal{A}^{(m,m+1)}_{total}$ can also provide reasonable predictions of the second merging events. In other words, the large-scale 2D patterns observed in figures \ref{fig:asym_bar}(g)-\ref{fig:asym_bar}(i), which respectively occur at $t=150,\,180$ and $132$, are predictable from the $t=0$ state (recall that $\mathcal{A}^{(m,m+1)}_{total}$ is always obtained from \eqref{eq:asy11}, which is the asymmetry evaluated at $t=0$).
Considering case S1, we already found from figure \ref{fig:asym_bar}(a) that the first merging event leads to three vortex pairs: 1--2, 3--4 and 5--6 (figure \ref{fig:asym_bar}(d)). If each primary KH billow has a circulation $\sim \Gamma$, then after the first merging event, each of the merged vortices 1--2, 3--4 and 5--6 will have a circulation $\sim 2\Gamma$. However, $\mathcal{A}^{(2,3)}_{total}=\mathcal{A}^{(4,5)}_{total}=-5.5$, while $\mathcal{A}^{(6,1)}_{total}=-1$, implying that the merged vortices 5--6 and 1--2 are expected to be far closer to each other than either is to 3--4; hence the second merging would be between 1--2 and 5--6. The propensity towards this merging is observed in figure \ref{fig:asym_bar}(g), and the merging is underway in figure \ref{fig:asym_bar}(j), exactly as predicted. { Similar analyses can be done for the cases S2 and S3. For S2, the first merging event would lead to triplets 1--2--3 and 4--5--6, each with a circulation $\sim 3 \Gamma$. Since $\mathcal{A}^{(6,1)}_{total}=\mathcal{A}^{(3,4)}_{total}=-3$, the two vortex triplets are far removed from each other, and hence resist second merging for a very long time, see figure \ref{fig:asym_bar}(k). The case S3 is more interesting: the first merging produces vortex pairs 1--2 and 4--5, while vortices 3 and 6 are left out (figure \ref{fig:asym_bar}(f)). Since $\mathcal{A}^{(2,3)}_{total}=\mathcal{A}^{(5,6)}_{total}=-2.6$ while $\mathcal{A}^{(3,4)}_{total}=\mathcal{A}^{(6,1)}_{total}=0$, vortex 3 would merge with 4--5, while vortex 6 would merge with 1--2. Again, this prediction is borne out exactly in figures \ref{fig:asym_bar}(i) and \ref{fig:asym_bar}(l).
}
%This is also evident from figure \ref{fig:asym_bar}(d)
\begin{figure}
\centering
\includegraphics[width=1.0\linewidth]{fig5}
\caption{{Time evolution of the dimensionless two- and three-dimensional kinetic energy, $K_{2d}$ and $K_{3d}$, and the rate of viscous dissipation of kinetic energy, $\varepsilon$. For plotting, $\varepsilon$ has been multiplied by a factor of 20. The peaks in $K_{2d}$ corresponding to merging events are marked by stars.}}
\label{fig:ke_eps}
\end{figure}

The growth of turbulent-like coherent structures is a consequence of 3D motions extracting energy from the 2D flow. To quantify the strength of the 2D and 3D motions, and the intensity of turbulence, we use the 2D kinetic energy, $K_{2d}= \langle\boldsymbol{u}_{2d}\cdot{\boldsymbol{u}}_{2d}\rangle_{xz}$, the 3D kinetic energy, $K_{3d}=\langle\boldsymbol{u}_{3d}\cdot{\boldsymbol{u}}_{3d}\rangle_{xyz}$, and the rate of viscous dissipation of the total kinetic energy, $\varepsilon=Re^{-1}\langle(\partial u_i / \partial x_j)^2 \rangle_{xyz}$, with $\langle\,\rangle$ denoting averages in the specified directions \citep{caulfield2000}. In these definitions the velocity field has been partitioned into three parts:
%the background 1D velocity, the 2D velocity obtained by averaging the velocity field in the spanwise direction, and the 3D velocity that is the remaining part:
$\overline{\boldsymbol{u}}(z)={\langle\boldsymbol{u}\rangle}_{xy}$, $\boldsymbol{u}_{2d}(x,z) = {\langle\boldsymbol{u}\rangle}_y - {\langle\boldsymbol{u}\rangle}_{xy}$, and $\boldsymbol{u}_{3d}(x,y,z) = \boldsymbol{u} - \overline{\boldsymbol{u}}- \boldsymbol{u}_{2d}$. The competition between $K_{2d}$ and $K_{3d}$, and the time evolution of $\varepsilon$, are shown in figure \ref{fig:ke_eps} for the cases S1--S3. Note that after the time shown (i.e. $t>300$) all the motions start to decay.
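
A brief consistency check on the three-part velocity partition defined above is sketched here. It is not part of the original analysis, and it assumes that $\langle\,\rangle$ denotes a normalized mean over a rectangular box, so that averages over different directions commute. From the definitions,
\[
{\langle\boldsymbol{u}_{2d}\rangle}_{x} = {\langle\boldsymbol{u}\rangle}_{xy} - {\langle\boldsymbol{u}\rangle}_{xy} = 0,
\qquad
{\langle\boldsymbol{u}_{3d}\rangle}_{y} = {\langle\boldsymbol{u}\rangle}_{y} - \overline{\boldsymbol{u}} - \boldsymbol{u}_{2d} = 0 .
\]
Since $\overline{\boldsymbol{u}}$ depends only on $z$ and $\boldsymbol{u}_{2d}$ only on $(x,z)$, every cross term vanishes on averaging, and under the stated assumption
\[
{\langle\boldsymbol{u}\cdot\boldsymbol{u}\rangle}_{xyz}
= {\langle\overline{\boldsymbol{u}}\cdot\overline{\boldsymbol{u}}\rangle}_{z} + K_{2d} + K_{3d} .
\]
With this normalization the mean, 2D and 3D contributions add up without overlap, so $K_{2d}$ and $K_{3d}$ can be compared directly.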
The intensity of the 2D and 3D motions and the viscous dissipation rate vary significantly between the different cases. The highest values of $K_{2d}$ correspond to the case S2, where triplets are formed and the vertical extent of the merged vortices (i.e. the inertial length scale) is the largest. The major local peaks in $K_{2d}$ mark the merging events. For example, in cases S1 and S3, three merging events occur, while in the case S2, only two merging events occur. The global maxima in $K_{3d}$ and $\varepsilon$ occur close to the last merging event in the cases S1 and S2, and close to the second merging event in the case S3, where the second merging involves the agglomeration of the most asymmetric vortices. The peaks in $K_{3d}$ and $\varepsilon$ are the highest for S1, which has the highest $ \mathrm{max} \left(\left|A_{total}\right|\right)$. The peak values of $K_{3d}$ are almost the same for the cases S2 and S3, which have similar values of $ \mathrm{max} \left( \left|A_{total}\right| \right)$.
%However, the dissipation rate, $\varepsilon$, is less directly correlated to $\max(|A_{total}|)$ in the cases S2 and S3.
We therefore conclude that the initial asymmetry of the primary KH wave with respect to its subharmonic modes, given by the global measure $ \mathrm{max} \left(\left|A_{total}\right|\right)$, has significant implications for the intensity of the ensuing turbulence.
%in the flow that ensues the growth of primary KH instabilities.
For the highest $ \mathrm{max} \left( \left|A_{total}\right| \right)$, vortex merging occurs either earlier (as in S1 compared to S2) or more energetically (as in S1 compared to S3). In the former case, the flow has more time to develop small-scale 3D turbulent motions before the decay starts (e.g. see \cite{rahmani2014effect}), and in the latter case, the 3D motions can extract energy more efficiently from the 2D flow during the active turbulent phase.
In both these situations, the higher $ \mathrm{max} \left(\left|A_{total}\right|\right)$ has led to a higher intensity of turbulence.

\section{Summary}
\label{sec:Summary}
In summary, we have investigated the complex, multiple vortex merging patterns and ensuing turbulence characteristics in a shear layer. { We have considered multiple wavelengths of the primary KH mode, and investigated the effect of higher subharmonics (and not just the first subharmonic) on the primary KH in order to understand the physics of vortex merging. While related previous studies that considered vortex arrays \citep{unal1988vortex,rajagopalan2005flow,baty2006kelvin,shaabani2019vortex} have emphasized the role of subharmonics in vortex merging, a simple physical understanding of the underlying mechanism seems to be missing. Based on the linear theory of KH instability arising in the classic vortex-sheet profile, we have provided a mechanistic understanding of the role of subharmonics in vortex merging.} We have shown that the otherwise symmetric vertical velocity field of two neighboring wavelengths of the primary KH is rendered asymmetric by the presence of a subharmonic mode. This asymmetry, which is fully derived from the linear theory, is found to be the key in deciding the local merging patterns and their strengths. Based on the initial asymmetry of the vertical velocity profile, we have proposed an effective measure of vortex merging. Our analysis reveals that the highly nonlinear, local merging patterns in shear layers are predictable from the linearized initial state. Additionally, we show that subsequent merging patterns can also be predicted from the initial asymmetry. In fact, the highest initial global asymmetry is found to yield the highest turbulence intensity.
{ In summary, the main contribution of this paper is twofold: (i) to provide a simple, linear, mechanistic description of the role of subharmonics in vortex merging, and (ii) to predict nonlinear vortex merging patterns and ensuing turbulence characteristics from the linearized initial state.} { In all our DNS studies reported in \S \ref{Sec:3}, the primary KH and the subharmonics are overlaid with random noise of comparable amplitude, and yet our model can still accurately predict the merging patterns from linear initial conditions.
%\todo{Anirban, should be bring back the following sentence? I think it was good to explain why this happens. Also maybe we don't want to make changes to this paragraph in our next submission?}
%This is due to the fact that the superharmonic noise being stable would decay, and the higher subharmonics, due to their low growth rate would be of little consequence.
We suspect that the predictive capability of the model, which relies on the linearity of the initial state, would break down only if the random noise, which in some realistic situations can originate from other concurrent phenomena, is significantly stronger than the eigenfunction perturbations and is nonlinear in nature.
} Our findings imply that in many realistic situations, the route to turbulence, coherent structures and turbulent characteristics are strongly dependent on the initial disturbances. Therefore, in the future, it may be possible to engineer flows to achieve optimal turbulence and mixing.
% and the ensuing turbulence characteristics
%presence of a subharmonic mode leads to an asymmetry in the vertical velocity field
%From the initial asymmetry in the vertical velocity field, we derived an effective measure which shows that the highly nonlinear, complex merging patterns and the ensuing turbulence characteristics in shear layers are predictable from the linearized initial state.
%The highest global asymmetry of the velocity field of the primary mode induced by the subharmonic modes led to the highest level of turbulence intensity. The local measures of the asymmetry of neighbouring vortices also provide accurate predictions of the patterns of pairing. These findings imply that in realistic situations, the route to turbulence, coherent structures and turbulent characteristics are strongly dependent on the initial disturbances. Therefore in future, it may be possible to engineer flows to achieve optimal turbulence and mixing.
%opening the possibility to engineer flows for achieving optimal turbulence and mixing.
% In summary, we investigated the complex multiple vortex merging patterns and ensuing turbulence characteristics in a shear flow. By quantifying the initial asymmetry of the vertical velocity of the primary mode induced by the subharmonic modes, we proposed a simple measure that accurately predicts the vortex merging patterns and their strengths in the \emph{nonlinear stages} of the flow. This measure, that is solely found from the \emph{initial} phase and amplitude relations between the modes in the \emph{linearized state}, was correlated to the intensity of the 3D motions that provide the route to turbulence. This finding implies that in realistic situations with randomized initial conditions, the effects of the initial conditions is not forgotten even through the turbulent stages of the flow.\\

\section*{Acknowledgement}
A.G. thanks the Alexander von Humboldt Foundation for funding. The computational resources for this work were provided by Compute Canada. { The authors also thank Prof. Eyal Heifetz of Tel Aviv University, Associate Editor Prof. John Dabiri, as well as the two anonymous reviewers for insightful comments and suggestions.
} %The times shown in panels (j)-(l) are during the second merging event, with the first merging event being the merging of the primary vortices and the second merging event being the next agglomeration of vortices. %By the last merging event, all the initial 6 vortices have merged into a single core structure. %While these structures contribute to the disintegration of the core, they are fairly concentrated close to the vortex core centers and cannot prevent further merging from happening in the long term. %\todo{If it is pairing, then higher $\max(|A_{total}|)$ possibly implies both higher energetic merging and earlier merging (comparing S1 and S3, both are pairings). We also see that if it is a triplet, then energy is more. This does not directly correspond to $\max|A_total|$, rather from the interpretation of fig 4(b), which shows that triplet is going to form. As you point out, triplets have highest vertical span. } %\todo{Anirban, yes both your commentsa are right. I am still thinking what the best way is to write these. $\max(|A_{total}|)$ is correlated to $K_{3d}$ but not $K_{2d}$ and $\varepsilon$ is somewhere in between.} %MONA: you there? %\todo{Anirban, the snapshots of the vortices don't have the right aspect ratio.} % The top row corresponds to $\Phi_2=0$ while $\Phi_3$ is varied, while the bottom row shows the same with $\Phi_2=\pi/2$. We observe that the magnitude of $\mathcal{A}^{total}$ is much higher in the top row than the bottom one, which can be justified by the fact that $\Phi_2=0$ and $\Phi_2=\pi/2$ respectively lead to maximum and minimum asymmetry due to the first subharmonic. % Next %This far, we have shown that the proposed measure of asymmetry, Eq.~(\ref{eq:asym_main}), is strongly dependent on the initial amplitude ratios and the initial phase shifts. 
%\renewcommand{\theequation}{\arabic{section}.\arabic{equation}} %Have to add a few lines that phase difference between two different wavenumbers does not make sense (we need to make a justification). % \begin{figure} % \centering % \includegraphics[width=1.0\linewidth]{Asymmetry_plot2.png} % \caption{{ Total asymmetry, $\mathcal{A}_{total}$, due to first and second subharmonics, in six consecutive wavelengths of the primary mode for an unstratified flow. % %predicted for non-dimensional time, $t/(kU)=0.05$. % (a) $\hat{\eta}_{2}(0)/\hat{\eta}_{1}(0)=1$ and $\hat{\eta}_{3}(0)/\hat{\eta}_{1}(0)=1$, (b) $\hat{\eta}_{2}(0)/\hat{\eta}_{1}(0)=1$ and $\hat{\eta}_{3}(0)/\hat{\eta}_{1}(0)=0.2$, (c) $\hat{\eta}_{2}(0)/\hat{\eta}_{1}(0)=0.2$ and $\hat{\eta}_{3}(0)/\hat{\eta}_{1}(0)=1$, (d) $\hat{\eta}_{2}(0)/\hat{\eta}_{1}(0)=0.3$ and $\hat{\eta}_{3}(0)/\hat{\eta}_{1}(0)=0.6$. Red crosses indicate initial conditions used for simulations in figure \ref{fig:pairing_VM}. }} % \label{fig:asym_full} % \end{figure} % \begin{figure} % \centering % \includegraphics[width=1.0\linewidth]{pairing_VM.png} % \caption{{Pairing simulations using vortex method for six wavelengths of the primary mode. All sub-figures are at non-dimensional time $t/(kU)=0.05$. (a) Strong asymmetry, corresponds to the upper red cross in figure \ref{fig:asym_full}(a): $\hat{\eta}_{2}(0)/\hat{\eta}_{1}(0)=1$, $\hat{\eta}_{3}(0)/\hat{\eta}_{1}(0)=1$, $\Phi_2=\pi$, $\Phi_3=\pi$, (b) Moderate asymmetry, corresponds to the lower cross in figure \ref{fig:asym_full}(a): $\hat{\eta}_{2}(0)/\hat{\eta}_{1}(0)=1$, $\hat{\eta}_{3}(0)/\hat{\eta}_{1}(0)=1$, $\Phi_2=\pi/2$, $\Phi_3=\pi/6$, (c) Moderate asymmetry, primarily induced by the first subharmonic, corresponds to the red cross in figure \ref{fig:asym_full}(b): $\hat{\eta}_{2}(0)/\hat{\eta}_{1}(0)=1$, $\hat{\eta}_{3}(0)/\hat{\eta}_{1}(0)=0.2$, $\Phi_2=\pi/2$, $\Phi_3=\pi$/2. 
and (d) Weak asymmetry, primarily induced by the second subharmonic, corresponds to the red cross in figure \ref{fig:asym_full}(c): $\hat{\eta}_{2}(0)/\hat{\eta}_{1}(0)=0.2$, $\hat{\eta}_{3}(0)/\hat{\eta}_{1}(0)=1$, $\Phi_2=\pi$, $\Phi_3=\pi$/6. Red boxes denote propensity towards pairing. }} % \label{fig:pairing_VM} % \end{figure} % { % We define the total asymmetry as follows: % \begin{equation} % \mathcal{A}_{total}=\sum_{m=1}^6 \left|\mathcal{A}^{(2)}_{m,m+1}+\mathcal{A}^{(3)}_{m,m+1}\right|. % \end{equation} % We plot $\mathcal{A}_{total}$ in figure \ref{fig:asym_full} for four different combinations of the initial amplitude ratios $\hat{\eta}_{2}(0)/\hat{\eta}_{1}(0)$ and $\hat{\eta}_{3}(0)/\hat{\eta}_{1}(0)$. Unless otherwise mentioned, we use $U=1$ and $k=2\pi$ in all our simulations. % We find that the total asymmetry is a complex function of the initial amplitude ratios, as well as $\Phi_2$ and $\Phi_3$. } %\section{Numerical simulations} %\subsection{Vortex Method} % \begin{figure*} % \centering % \includegraphics[width=0.7\linewidth]{mult.png} % \caption{{Nonlinear evolution of KH instability using vortex method. Six consecutive wavelengths of the primary wave show that pairing between two neighbouring billows will occur depending on the asymmetry. All sub-figures are plotted at non-dimensional time $t/(kU)^{-1}=0.06$. 
These nonlinear evolution corresponds to the initial conditions of (a) figure \ref{fig:asym_bar}(c), (b) figure \ref{fig:asym_bar}(d), and (c) figure \ref{fig:asym_bar}(e).}} % \label{fig:pairing_VM} % \end{figure*} % { % The interface is described by a parametric curve % x=xs , t=xs , t , ys , t, with the arclength s and time t, % and is evolved by % dx % dt % The nonlinear evolution of a periodic vortex sheet is governed by the Birkhoff-Rott equation: % \begin{equation} % u(s,t)-\ii w(s,t) = \frac{\ii}{2\lambda}P.V.\int_{0}^{\lambda}{\gamma}\cot\bigg[\frac{\pi(\chi - \tilde{\chi})}{\lambda}\bigg] d\tilde{s}, % \end{equation} % where `$P.V.$' indicated Cauchy Principal Value, % $ u $, and $ w $ are respectively the horizontal and vertical components of the interface velocity, $ \chi $ is the complex position of the interface: $ \chi \equiv x(s,t) + i z(s,t) $, $ \lambda $ is the wavelength, and $\gamma (\tilde{s},t)=2U$ is the vortex sheet strength for the KH configuration. \todo{figure5 has been updated...this is the one that DNS needs to validate} % High values of $\mathcal{A}_{total}$ indicate higher propensity towards pairing, e.g.\ we see that three pairs are formed in figure \ref{fig:pairing_VM}(a). Smaller values of $\mathcal{A}_{total}$ indicate that pairing may not occur (see figure \ref{fig:pairing_VM}(d)) or may be severely delayed, and by that time, other phenomena (e.g.\ three-dimensionality) may become important, hence pairing may not be observed at all. We also observe that pairing is governed by the first subharmonic when its amplitude is comparable to that of the primary mode. This is evident by comparing figures \ref{fig:pairing_VM}(b) and \ref{fig:pairing_VM}(c).} % % strongly dependent on $\Phi_2$, and the strength of asymmetry varies sharply with the initial amplitude ratios; both $\hat{\eta}_{2}(0)/\hat{\eta}_{1}(0)$ and $\hat{\eta}_{3}(0)/\hat{\eta}_{1}(0)$ play important roles. 
The effect of $\Phi_3$ becomes significant when $\hat{\eta}_{2}(0)/\hat{\eta}_{1}(0)$ is small, e.g.\ in figure \ref{fig:asym_full}(c). % { % Based on the understanding of the predicted asymmetry from figure \ref{fig:asym_full}, we have run a few simulations using vortex method; see figure \ref{fig:pairing_VM}. We find that $\mathcal{A}_{total}$ provides an accurate prediction of the fully nonlinear late-time pairing dynamics. } % \todo{Anirban, do we really need this blue block? Can we instead reference one of your papers?} % %\section{DNS results} % To verify the predictions of our simple model and also to examine the evolution of the agglomerated billows in the nonlinear phase, we simulate the shear layer in 3D using direct numerical simulations (DNS). Instead of a discontinuous velocity profile, in DNS we consider a horizontal shear flow represented by the background velocity $\overline{U}$ % \refstepcounter{equation} % \begin{equation} % \overline{U} = \frac{\Delta U}{2} \tanh \left(\frac{2z}{h_0} \right), % %\eqno{(\theequation{\mathit{a},\mathit{b}})} % \end{equation} % where $\Delta U$ is the velocity difference in the shear layer, and $h_0$ is the thickness of the velocity interface. For this velocity profile, the Reynolds number, $Re$, can be defined as % \begin{equation} % Re = \frac{\Delta Uh_0}{\nu}, % \end{equation} % where $\nu$ is kinetic viscosity of the fluid. For DNS, we choose $Re=2000$, which is close to the Reynolds number in mixing layers but lower than the usual range of Reynolds numbers in geophysical applications. 
% Assuming that the fluid is incompressible and the Boussinesq approximation holds, the equations of motion for continuity and momentum balance in a Cartesian coordinate system $(x,y,z)$ are stated as % \begin{equation} % \nabla\cdot\boldsymbol{u} = 0, \;\; % \frac{D\boldsymbol{u}}{Dt} = -\frac{1}{\rho}\nabla p - g\hat{\boldsymbol{k}} + \nu\nabla^{2}\boldsymbol{u}, \label{eqn:NS} % \end{equation} % where $\boldsymbol{u}$ is the velocity field, $\rho$ the fluid density, $p$ the pressure field, $g$ the gravitational acceleration, $\hat{\boldsymbol{k}}$ is the unit vertical vector, and $D/Dt$ denotes the material derivative. % Equations (\ref{eqn:NS}) are solved using a DNS pseudo-spectral code described in detail by \citet{Winters04} and \citet{Smyth05}. In all the simulations, the domain length $L_x$ is set to 6 wavelengths of the most unstable mode. The spanwise width of the domain $Ly$ is equal to $0.8$ of one wavelength of the primary KH instability. The domain height $L_z$ is 3 times one wavelength of primary KH mode. The boundary conditions are periodic in $x$ and $y$ directions and free slip with no flux at $z=0$ and $z=L_z$. The number of mesh points used to resolve the domain are: $N_x=1152$, $N_y=160$ and $N_z=576$. This spatial resolution is of the same order of magnitude as the Kolmogorov length scale, $L_k=(\nu^3/\varepsilon)^{1/4}$, with $\varepsilon$ being the rate of dissipation of the turbulent kinetic energy, and therefore satisfies the resolution requirement for DSN \cite[]{Moin98}. %We perform a series of simulations in DNS with different phase angle relations and amplitude ratios of the first and second subharmonic mode, as listed in table \ref{tab:DNS}. 
% \begin{figure} % \centering % \includegraphics[width=0.7\linewidth]{with_noise_1.png} % \caption{ss} % \label{fig:noise} % \end{figure} % \subsection{Unstratified shear layers}\label{sec:unstratified} % \begin{figure} % \centering % \includegraphics[width=1.0\linewidth]{Harmonic_amplitude_v02.png} % \caption{Temporal growth of the amplitude of the primary KH mode (blue) and its first (red) and second (green) subharmonic modes. Simulations (a)-(d) correspond to the cases examined in figures \ref{fig:asym_full} and \ref{fig:pairing_VM}. } % \label{fig:amplitude} % \end{figure} % \begin{figure} % \centering % \includegraphics[width=1.0\linewidth]{snapshots_phi_pi_J_00_v02.png} % \caption{Snapshots of the vorticity field for unstratified cases, $J=0$, at different times for simulations (a) and (d) in figures \ref{fig:asym_full} and \ref{fig:pairing_VM}. } % \label{fig:snap_phi_pi_j_00} % \end{figure} % \begin{figure} % \centering % \includegraphics[width=1.0\linewidth]{snapshots_phi_pi2_J_00_v02.png} % \caption{Snapshots of the vorticity field for unstratified cases, $J=0$, at different times for simulations (b) and (c) in figures \ref{fig:asym_full} and \ref{fig:pairing_VM}. } % \label{fig:snap_phi_pi_j_00} % \end{figure} % \subsection{Stratified shear layers}\label{sec:stratified} % \begin{figure} % \centering % \includegraphics[width=1.0\linewidth]{Harmonic_amplitude_J_10_v02.png} % \caption{Temporal growth of the amplitude of the primary KH mode (blue) and its first (red) and second (green) subharmonic modes. Simulations (a)-(d) correspond to the cases examined in figures \ref{fig:asym_full} and \ref{fig:pairing_VM}. } % \label{fig:amplitude} % \end{figure} % \begin{figure} % \centering % \includegraphics[width=1.0\linewidth]{snapshots_phi_pi_J_10_v02.png} % \caption{Snapshots of the vorticity field for unstratified cases, $J=0.1$, at different times for simulations (a) and (d) in figures \ref{fig:asym_full} and \ref{fig:pairing_VM}. 
} % \label{fig:snap_phi_pi_j_00} % \end{figure} % \begin{figure} % \centering % \includegraphics[width=1.0\linewidth]{snapshots_phi_pi2_J_10_v02.png} % \caption{Snapshots of the vorticity field for unstratified cases, $J=0.1$, at different times for simulations (b) and (c) in figures \ref{fig:asym_full} and \ref{fig:pairing_VM}. } % \label{fig:snap_phi_pi_j_00} % \end{figure} % \section{Diagnostics} % \begin{itemize} % \item rate of dissipation of kinetic energy % \item concentration profiles, mixing % \item energy spectrum % \item dissipation, $\nabla \rho$ % \item shear layer height % \end{itemize} %\section{Conclusions} \bibliographystyle{jfm} \bibliography{jfm-instructions} \end{document} }}
\caption{
%(Color online)
(Left) A phase diagram of Ce \cite{X-ray_diff} (see also the supplement \cite{Supp}).
%In the phase diagram, ${\alpha}$-Ce at ``red star"
%($V$ = 34 \AA$^3$, $T$ = 500 K, $P$ = ambient pressure) and ${\gamma}$-Ce at ``blue star"
%($V$ = 27.76 \AA$^3$, $T$ = 100 K, $P$ = 0.88GPa) are selected for the comparison of electronic structures in Fig.~\ref{basic_BAND}.
The blue-dotted line corresponds to the $P$-$V$ isotherm at 293 K.
{\cred
%along which the electronic structures of ${\gamma}$ and ${\alpha}$
%phases are explored in Fig. \ref{dmft_EF} and Fig. S2.
}
(Right) The bulk BZ of fcc Ce and its (001) and (110) surface BZ.
%One $X$ point is projected onto $\bar{\Gamma}$,
%while two non-equivalent $X$ and $X'$
%are projected onto $\bar{M}$ of the (001) surface BZ,
%and also onto $\bar{X}$ of the (110) surface BZ.
There are two independent mirror planes of $k_{y}=0$ (in blue) and $k_{x}=k_{y}$ (in gray), which, respectively, yield two mirror-symmetry lines along $\bar{M}$-$\bar{\Gamma}$-$\bar{M}$ and $\bar{X}$-$\bar{\Gamma}$-$\bar{X}$ in the (001) surface BZ. Similarly, in the (110) surface BZ, two mirror-symmetry lines are formed along $\bar{Y}$-$\bar{\Gamma}$-$\bar{Y}$ and $\bar{X}$-$\bar{\Gamma}$-$\bar{X}$.
}
\caption{
%(Color online)
The amplified DMFT electronic structures near {\EF}: {\bf (a)} for ${\gamma}$-Ce and {\bf (b)} for ${\alpha}$-Ce.
{\cred
%The irreducible representation of each band is provided.
}
For ${\gamma}$-Ce, $4f$ states are hardly seen, because they are incoherent. For ${\alpha}$-Ce, the coherent $4f$ bands formed around {\EF} produce, via the hybridization with the conduction band, the separated bands with the gap in-between (colored in gray). There exist clear energy gaps at the TRIM points of ${\Gamma}$, $X$ and $L$, and also small energy gaps at $W$, in-between $L$-${\Gamma}$, and at ``A'' along ${\Gamma}$-$K$. The inset shows the gap formation at ``A'', arising from the same ${\Gamma}_5$ symmetry of the crossing bands \cite{Imsigma}. The green-dotted lines overlaid with DMFT bands are the DFT bands rescaled by 1/2. {\bf (c)} Imaginary part of the DMFT hybridization function $\Delta(\omega)$. {\bf (d)} DMFT and DFT FSs for both phases (see also Fig. S1 \cite{Supp}). {\bf (e)} The renormalization factor $Z$ and the energy gaps at ${\Gamma}$ and $L$ ($\Delta_{\Gamma}$ and $\Delta_{L}$) are displayed as a function of pressure (see also Figs. S2-S4 \cite{Supp}). The first-order-type phase transition is manifested across the ${\gamma}$-${\alpha}$ transition. {\bf (f)} The product of the parity eigenvalues of $\alpha$-Ce at 8 TRIM points in the fcc BZ.
{\cred
%of $\alpha$-Ce at 8 TRIM points of ${\Gamma}$, $3X$, $4L$ in the fcc BZ.
%The signs are computed from the parity eigenvalues of
%all the valence bands just below the gap region
%in the DMFT band structure in Fig. \ref{dmft_EF}b.
}
}
\caption{
(Color online) {\bf (a)} The (001) surface electronic structure of ${\alpha}$-Ce, calculated by the tight-binding (TB) model with semi-infinite slabs. The TB Hamiltonian is constructed from the DFT band result (rescaled by 1/2 near {\EF}).
{\cred
% including
%Ce $6s$, $5p$, $5d(t_{2g})$, $4f_{5/2}$, and $4f_{7/2}$ orbitals.
}
{\bf (b),(c)} The helical spin structures of the ``D$_1$'' and ``D$_2$'' Dirac-cone energy surfaces, as indicated by (i) and (ii), respectively. {\bf (d)} The (110) surface electronic structure of ${\alpha}$-Ce. {\bf (e),(f)} Amplified band structures inside the green square and the red square, respectively, in (d). In (e), TSSs of a typical TCI-type nature are revealed with the gapped (red arrow) and protected (black arrow) Dirac points, while, in (f), TSSs are mostly buried under the bulk-projected bands.
}
\caption{Comparison of the network using a plain concatenation block or a WR reconstruction block, including PSNR and SSIM for scale 2$\times$, 4$\times$ and 8$\times$ SR on Set5 and Set14. {\color{red}Red} indicates the best results.}
\caption{Quantitative evaluation of state-of-the-art SR approaches, including PSNR and SSIM for scale 2$\times$, 4$\times$ and 8$\times$. {\color{red}Red} indicates the best and {\color{blue}blue} indicates the second best results.}
\caption{ \textbf{Top}: the overall distribution of usage shifts of the words at the core of the counselor vocabulary (i.e., those used by at least 20\% of counselors). \textbf{Bottom}: usage shift distributions per subset of core words characteristic of each conversational component, along with examples. Words are ordered by, and colored according to, their usage shift, such that \textcolor{blue}{blue} words tend to be used more by counselors in their earlier conversations while \textcolor{red}{red} words tend to be used more by counselors with more experience.
% Dashed lines indicate the respective medians.
}
\caption{Additional \aastex\ symbols}
\caption{Recovered square error moments (circles), \deltasquaremoment, for the true error moments (squares) of 10 synthetic regressors on the pixels of a $1024\times1024$ image. The recovery algorithm does not know which vector components correspond to the strong diagonal signal, the $(i,i)$ error moments.}
\caption{Visualization of the similarities between production rules for expanding a ``$G$'' node. The similarities are implicitly learned by our neural guider. We manually highlight some entries indicating pairs of similar production rules. \textcolor{MyDarkRed}{$\Box$}: Clustering vs. Binary factor, \textcolor{MyDarkBlue}{$\Box$}: Binary factor vs. Markov Chain, and \textcolor{MyDarkGreen}{$\Box$}: Clustering (mixture of Gaussians) vs. Gaussian.}
\caption[ The energy of the uncontrolled flow as a function of wall height]{ The energy of the uncontrolled flow as a function of wall height for $|k_x| \leq 0.5$ and $ |k_y| \leq 6$. Results are provided for the LM and {\color{blueFig6}DNS}. The energy is shown for all directions $\|\hat{\bi u}\|_2^2$(\full), the streamwise direction $\|\hat{ u}\|_2^2$($\broken$), the spanwise direction $\|\hat{v}\|_2^2$($\chain$) and the wall-normal direction $\|\hat{ w}\|_2^2$($\dotted$). }
\caption[The reduction of kinetic energy for one wall height]{The reduction of the kinetic energy relative to the entire flow: $\mathbf{E}_{u,v,w}$ ({\color{excel1}$\fullsquare$}), $\mathbf{E}_u$({\color{excel2}$\fullsquare$}), $\mathbf{E}_v$({\color{excel3}$\fullsquare$}) and $\mathbf{E}_w$({\color{excel4}$\fullsquare$}), where $\mathbf{E}_{u,v,w} = \mathbf{E}_{u} + \mathbf{E}_{v} +\mathbf{E}_{w}$.}
\caption[The reduction of kinetic energy as a function of $z$]{Left axis: The reduction of kinetic energy ($\epsilon_{AE}(\broken)$,$\epsilon_{ME}(\chain)$ and $\epsilon_{IO}$(\full)) as a function of $z$. Right axis: the normalized kinetic energy $\mathbf{E}_z$({\color{blueFig6}\full}) as a function of $z$. Results are shown for (a) $[u,v,w]$, (b) $[u]$, (c) $[v]$ and (d) $[w]$. }
\caption[The distribution of forcing]{The distribution of forcing between $\mathbf{E}_{ f_x}$ ({\color{excel1}$\fullsquare$}), $\mathbf{E}_{ f_y}$({\color{excel2}$\fullsquare$}) and $\mathbf{E}_{ f_z}$({\color{excel3}$\fullsquare$}), where $\mathbf{E}_{ f_x} + \mathbf{E}_{ f_y} + \mathbf{E}_{ f_z} = 1$.}
\caption{(\subref{fig:lateralLineSystem}) The lateral line in juvenile zebrafish, with neuromasts visible as bright dots on the body surface (adapted with permission from \citet{Sapede2002}). We observe a high density of neuromasts in the head and the tail, with sparser distribution along the midsection. (\subref{fig:lateralLineSketch}) A schematic representation of the distribution of mechanoreceptors along the fish body. (\subref{fig:neuromastsSketch}) The neuromasts bend in response to flow, which generates \blue{a} neuronal response \blue{by} sensory cells located at the base (adapted with permission from \citet{Kottapalli2013}).}
\caption{Snapshots of the vorticity field around a static larva profile in the presence of a horizontally oscillating cylinder. The snapshots are taken at regular intervals over a single oscillation period, with positive vorticity shown in red and negative vorticity shown in blue. A corresponding animation is shown in \blue{supplementary} \movieHorizCylStatic{}.}
\caption{Utility plots for a stationary, larva-shaped body with (\subref{fig:utility1dStaticHoriz}) oscillating and (\subref{fig:utility1dStaticRot}) rotating cylinders. The curves indicate the utility for placing the first \blue{shear stress} sensor at a given location $s$. The utility curves were not computed in the region $0.95<s/L\le1$, to avoid potential numerical issues resulting from sharp corners at the tail. (\subref{fig:stdU_staticHoriz},\subref{fig:stdV_staticHoriz}) Standard deviation of horizontal and vertical velocity caused by oscillating cylinders, with larger deviation shown in yellow and lower values shown in black. The standard deviation was computed across 9 distinct simulations (6 time-snapshots recorded in each simulation), with a single oscillating cylinder placed at 9 locations uniformly in the prior-region. (\subref{fig:stdU_staticRot}, \subref{fig:stdV_staticRot}) Standard deviation of velocity components for the rotating cylinders. (\subref{fig:staticSensors}) Optimal sensor distribution determined using sequential placement. Sensors for detecting oscillating cylinders are shown as black squares, whereas those for detecting rotating cylinders are shown as blue circles. The numbering indicates the \blue{sequence determined by the optimal placement algorithm.}}
\caption{(\subref{fig:utilityLarva}) Utility curves for the first \blue{shear stress} sensor, $\hat{U}_1(s)$, on a larva-shaped swimmer (black squares - oscillating cylinders, blue circles - rotating cylinders). (\subref{fig:sensorsLarva}) Sequential placement of 20 sensors along the body, with the order of placement shown. (\subref{fig:utilityAdult}) Utility curves for an adult-shaped swimmer. (\subref{fig:sensorsAdult}) Sensor placement for the adult, with results from horizontal and rotating disturbances shown separately for clarity.}
\caption{Optimal distribution of the first 10 \blue{shear stress} sensors for the self-propelled swimmers. The body has been divided into 3 distinct segments: the head ($0\le s/L <0.2$); the midsection ($0.2\le s/L < 0.6$); and the posterior ($0.6\le s/L \le1$).}
\caption{(\subref{fig:utilityAllSum}) Utility curve for the first \blue{shear stress} sensor, $\hat{U}_1(s)$, on a larva-shaped swimmer, using a combination of all five flow configurations described in the paper. (\subref{fig:sensorsAllSum}) Sequential placement of 20 \blue{shear stress} sensors along the body.}
\caption{\blue{(\subref{fig:utilityAllSum_pressGrad}) Utility curve for the first \blue{pressure gradient} sensor, $\hat{U}_1(s)$, on a larva-shaped swimmer, using a combination of all five flow configurations described in the paper. (\subref{fig:sensorsAllSum_pressGrad}) Sequential placement of 20 \blue{pressure gradient} sensors along the body.}}
\caption{ Evaluation curves (sum of episodic reward vs.\ environment time steps) of {\color{red} (DQN-)HC-Dyna}, {\color{blue} (DQN-)OnPolicy-Dyna}, and {\color{black} DQN} on GridWorld (a-c), MountainCar-v0 (d-f), CartPole-v1 (g-i) and Acrobot-v1 (j-l). The curves plotted with \emph{dotted lines} use online learned models. Results are averaged over $30$ random seeds. }
\caption{ Panels (a) and (b) show the buffer ({\color{red}red $\boldsymbol{\cdot}$}) and queue (black $+$) distributions on GridWorld ($\svec \in [0, 1]^2$), obtained by uniformly sampling $2$k states. (a) shows the ER buffer when running DQN, hence there is no ``$+$'' in it. (b) shows that $0.2\%$ of the ER samples fall in the green shaded area (i.e. the high-value region), while $27.8\%$ of the samples from the SC queue are there. }
\caption{ Mass functions for the studied OC sample. The scaled IMFs of Salpeter\,(1955) and Kroupa\,(2001) are overplotted with black and red lines, respectively.}
\caption{ Panel (a): Galactocentric distance $R_{\textrm{G}}\,$ versus log($t$/yr) for the studied OC sample. Symbols' colours were assigned according to the different $r_{h}/R_J$ ranges described in the text. Red symbols: Collinder\,258 (\textcolor{red}{$\CIRCLE$}), NGC\,6756 (\textcolor{red}{$\blacktriangle$}), Czernik\,37 (\textcolor{red}{$\blacklozenge$}) and NGC\,6249 (\textcolor{red}{$\blacksquare$}); black symbols: Trumpler\,25 ($\CIRCLE$), BH\,150 ($\blacktriangle$), Ruprecht\,111 ($\blacksquare$), Basel\,5 ($\blacklozenge$); blue symbols: NGC\,5381 (\textcolor{blue}{$\CIRCLE$}), Ruprecht\,102 (\textcolor{blue}{$\blacktriangle$}), Ruprecht\,97 (\textcolor{blue}{$\blacklozenge$}) and ESO\,129-SC32 (\textcolor{blue}{$\blacksquare$}). Panel (b): Radial metallicity $[Fe/H]$ distribution. The continuous line is the relationship derived by Netopil et al. (2016). The dashed lines represent its upper and lower limits. Panel (c): Distribution of the OCs projected on to the Galactic plane. The spiral pattern was taken from Vall\'ee\,(2008). Panel (d): Distribution of OCs perpendicular to the Galactic plane (horizontal line). The grey dots represent OCs taken from Kharchenko et al. (2013) and Dias et al. (2002).}
\caption[System Model Geometry]{Example system model and geometry for problem formulation. Nodes with \textit{black} font titles are present in all our simulations. The node with a \textcolor{blue}{\textit{blue}} font title is part of the model studied in Section~\ref{sec:TwoNodeSim}, while the node with a \textcolor{ForestGreen}{\textit{green}} title is part of the model treated in Section~\ref{sec:RelayIoT}. The speeds of aerial nodes along these paths are variable and bounded. Solid lines represent the paths of UAV nodes, and red dashed lines correspond to existing communication links across distances $\chi$. Altitudes $a_{1},a_{2}=1$\,km, and displacement $\delta_{2}=1$\,km. For simplicity of exposition, we denote aerial nodes as $U_{ai}$ and ground nodes as $U_{gi}$, although they are modelled equivalently.}
\caption{Different approaches for enabling inference in GANs. $f$ and $g$ denote the encoder and the decoder/generator, respectively. $c$ is the discriminator for GAN. $\bm z$ obeys a specific probabilistic prior, i.e. $\bm z \sim p(\bm z)$ and \textcolor[rgb]{0,0,1}{$\bm y = \phi^{-1}(\bm z)$}, where $\phi$ is an invertible network. }
\caption{ \color{black} Large electron polaron in LiF. (a) Isosurface of the polaron density $|\psi|^2$ and ball-stick model of LiF, with Li in green and F in silver. (b) Cross-section of the polaron density $|\psi|^2$ along a [010] line cutting through the center. (c) Modulus of the atomic displacements projected along the [010] direction, and taken on a -Li-F- chain of atoms nearest to the polaron center. The horizontal axes in (a), (b), and (c) are aligned. (d), (e) Band structures and phonon dispersions of LiF, respectively. The Fourier amplitudes $A_{n\bk}$ and $B_{\bq\nu}$ are superimposed on the bands, with the radius of each circle proportional to their square modulus. In (d) the zero of the energy is aligned with the valence-band top. (f) Polaron formation energy $\Delta E_f$ and eigenvalue $\varepsilon$ as a function of supercell size. The dashed lines are the Makov-Payne extrapolations. The shading indicates that no localized solution was found, and MIT stands for metal-to-insulator transition. The numbers next to the circles indicate the unit cells in each supercell, e.g. 12 means a 12$\times$12$\times$12 supercell. (g) Polaron energies (triangles) and eigenvalues (circles) obtained with our method using the model Fr\"ohlich electron-phonon coupling, compared to the solution of the Pekar polaron model.}
\caption{ Small electron polaron in Li$_2$O$_2$. (a) Isosurface of the polaron density $|\psi|^2$ and model of Li$_2$O$_2$, with Li and O atoms in green and red, respectively. (b) Cross-section of $|\psi|^2$ along a [100] line cutting through the center. (c) Modulus of the atomic displacements along a [001] line passing through the O atom at the center. (d), (e) Band structures and phonon dispersion relations of Li$_2$O$_2$, respectively. The amplitudes $A_{n\bk}$ and $B_{\bq\nu}$ are superimposed as circles. The energy zero in (d) is aligned with the valence-band top. (f) Polaron formation energy and eigenvalue vs.\ supercell size. The dashed gray lines represent the Makov-Payne extrapolations. The green triangle is the result of an explicit DFT calculation from Ref.~\onlinecite{Sio2018}. (g) Comparison between the polaron wavefunction obtained from our method (left), and an explicit DFT calculation (right, Ref.~\onlinecite{Sio2018}), and the corresponding formation energies. }
\caption{Clean and gray-box adversarial accuracies of different cache models. As in Figure~\ref{cache_type_fig}, only the results for the \texttt{layer4\_bottleneck1\_relu} layer are shown. Colors highlight the {\color{red}retrieval method} (continuous or 50-nn), {\color{blue}cache dimensionality} (full, $4$- or $8$-times reduced), and {\color{magenta}cache size} (full, $4$- or $8$-times reduced).}
\caption{ImageNet-C results. The numbers indicate corruption errors ($CE$) for specific corruption types and the mean $CE$ scores as percentages. More robust models correspond to smaller numbers. For the cache models, we only show the results for the best models (the \texttt{fc} cache model in both cases). Colors represent noise, {\color{red}blur}, {\color{blue}weather} and {\color{magenta}digital} perturbations.}
\caption{\textcolor{red}{ An example of an ED graph. The red circles are ED nodes; a node $j$ encodes a geometric position $\mathbf{g}_j$ and an affine transformation given by $\mathbf{A}_j$ and $\mathbf{t}_j$. The blue triangle is a vertex that can be deformed from $\mathbf{v}_i$ to $\tilde{\mathbf{v}}_i$ under the influence of its neighboring ED nodes.} }
\caption{Left: a \textcolor{red}{Jacobian} matrix; right: the re-ordered \textcolor{red}{Jacobian}. Empty blocks consist of zero elements. }
\caption{Framework overview. Each possible triplet is associated with a binary indicator (circles), indicating whether it is true (\cmark) or not (\xmark). The observed (\textcolor{Mycolor1}{yellow circles}) and hidden (\textcolor{Mycolor2}{grey circles}) indicators are connected by a set of logic rules, with each rule having a weight (\textcolor{Mycolor3}{red number}). For the center triplet, the KGE model predicts its indicator through embeddings, while the logic rules consider the Markov blanket of the triplet (all connected triplets). If any indicator in the Markov blanket is hidden, we simply fill it with the prediction from the KGE model. In the \textbf{E-step}, we use the logic rules to predict the center indicator, and treat it as extra training data for the KGE model. In the \textbf{M-step}, we annotate all hidden indicators with the KGE model, and then update the weights of rules. }
\caption{Hypothesis testing results: \textcolor{kellygreen}{\checkmark} indicates a rejection of the null hypothesis, i.e. there is a significant difference between the same measures for the two methods, and \bm{$\times$} indicates a failure to reject the null hypothesis (no significant difference) at the 1\% significance level.}
\caption{All Pareto fronts for scenario $\scen_1$; dark green filled triangles (\textcolor{newgreen}{$\boldsymbol{\blacktriangle}$}): ideal Pareto front of scenario $\scen_1$; orange unfilled triangles (\textcolor{YellowOrange}{$\footnotesize\boldsymbol{\triangle}$}): robust \NAME Pareto front in scenario $\scen_1$; small light blue triangles (\textcolor{newblue}{$\footnotesize\boldsymbol{\blacktriangle}$}): \NAME Pareto front of scenario $\scen_1$; small light blue unfilled circles and squares (\textcolor{newblue}{$\boldsymbol{\circ}$} and \textcolor{newblue}{$\scriptsize\boldsymbol{\square}$}): \NAME Pareto front in scenario $\scen_1$ based on \NAME design computed for scenario $\scen_2$ and $\scen_3$, respectively; here, Pareto fronts are presented without normalization}
\caption{\label{fig:SEvsN} \textbf{(top)} Maximum spectral efficiency (SE) per mode versus span number $N$. Dashed: GN model. Solid: GDF. Same parameters as in Fig.~\ref{fig:SNR_vs_P}(\textcolor{red}{top}). \textbf{(bottom)} Difference in maximum SE between GN and GDF. Solid: exact; dashed: Eq.~(\ref{eq:Diff_SE}).}
\caption{Semi-supervised training framework for the five-lingual (\protect\tikz[baseline]{\protect\draw[line width=0.3mm,dash dot, cyan, ->] (0,.8ex)--++(1,0)}) and 4$\times$CS (\protect\tikz[baseline]{\protect\draw[line width=0.3mm, plum, -> ] (0,.8ex)--++(1,0)}) transcription systems. (ManT: Manually transcribed data; AutoT: Automatically transcribed data)}
\caption{Results from the RH spectral synthesis. \textit{Upper left}: H$\alpha$ profiles for the different {SRPM} atmospheres. \textit{Upper right}: Number density of hydrogen atoms in the $n=2$ quantum state. The symbols show the height at which $\tau=1$ for the 3~mm radiation (triangles) and the H$\alpha$ line core intensity (circles), in the corresponding atmosphere. {Note that the triangles and circles indicating the $\tau=1$ surfaces coincide for the hotter models.} \textit{Middle left}: intensity contribution function for the H${\alpha}$ line for the {SRPM} D model overlaid with the emergent line profile (\textcolor{violet}{in violet}). Note the different intensity scaling with respect to the upper panel. {The position of the line width measurement described in Section~\ref{sec:IBIS_line_widths} is illustrated with the red line.} \textit{Middle right}: contribution function for the emergent intensity for ALMA Band 3 wavelengths for the {SRPM} D model, overlaid with the emergent intensity profile (\textcolor{violet}{in violet}). \textit{Bottom left, right}: as the middle panels, for the {SRPM} H model. }
\caption{Boxplots of the mean absolute error (MAE) for the Clayton-Frank mixture copula in dimension $p=2, 3, 4$ by the \textcolor{blue}{CFGTN} copula density estimator as indicated by the letter `m' in the x-axis label and \textcolor{red}{kernel} copula density estimator as indicated by the letter `k' for sample sizes $n=500, 1000, 2000$ as indicated by numbers 1, 2, 3 respectively in the x-axis label}
\caption{Boxplots of the mean absolute error (MAE) for the Clayton-$\text{T}_5$ mixture copula in dimension $p=2, 3, 4$ by the \textcolor{blue}{CFGTN} copula density estimator as indicated by the letter `m' in the x-axis label and \textcolor{red}{kernel} copula density estimator as indicated by the letter `k' for sample sizes $n=500, 1000, 2000$ as indicated by numbers 1, 2, 3 respectively in the x-axis label}
\caption{Boxplots of the mean absolute error (MAE) for the Clayton-Normal mixture copula in dimension $p=2, 3, 4$ by the \textcolor{blue}{CFGTN} copula density estimator as indicated by the letter `m' in the x-axis label and \textcolor{red}{kernel} copula density estimator as indicated by the letter `k' for sample sizes $n=500, 1000, 2000$ as indicated by numbers 1, 2, 3 respectively in the x-axis label}
\caption{Boxplots of the mean absolute error (MAE) for the Clayton-Frank-Gumbel-$\text{T}_5$-Normal mixture copula in dimension $p=2, 3, 4$ by the \textcolor{blue}{CFGTN} copula density estimator as indicated by the letter `m' in the x-axis label and \textcolor{red}{kernel} copula density estimator as indicated by the letter `k' for sample sizes $n=500, 1000, 2000$ as indicated by numbers 1, 2, 3 respectively in the x-axis label}
\caption{Boxplots of the mean absolute error (MAE) for the Clayton-$\text{T}_5$-$\text{T}_{15}$ mixture copula in dimension $p=2, 3, 4$ by the \textcolor{blue}{CFGTN} copula density estimator as indicated by the letter `m' in the x-axis label and \textcolor{red}{kernel} copula density estimator as indicated by the letter `k' for sample sizes $n=500, 1000, 2000$ as indicated by numbers 1, 2, 3 respectively in the x-axis label}
\caption{Boxplots of the mean absolute error (MAE) for the $\text{T}_5$-$\text{T}_{15}$ mixture copula in dimension $p=2, 3, 4$ by the \textcolor{blue}{CFGTN} copula density estimator as indicated by the letter `m' in the x-axis label and \textcolor{red}{kernel} copula density estimator as indicated by the letter `k' for sample sizes $n=500, 1000, 2000$ as indicated by numbers 1, 2, 3 respectively in the x-axis label}
\caption{Boxplots of the mean absolute error (MAE) for the $\text{T}_5$-$\text{T}_{15}$-Normal mixture copula in dimension $p=2, 3, 4$ by the \textcolor{blue}{CFGTN} copula density estimator as indicated by the letter `m' in the x-axis label and \textcolor{red}{kernel} copula density estimator as indicated by the letter `k' for sample sizes $n=500, 1000, 2000$ as indicated by numbers 1, 2, 3 respectively in the x-axis label}
\caption{\label{fig:snli_ratios} The top most informative bigrams in the SNLI dataset. {\color{red} Red} represents proportion of contradiction labels, {\color{blue} Blue} for neutral, and {\color{green} Green} for entailment. Numbers on the bars represent the proportion of the bigram in the dataset (A bar labeled with 0.5 means that portion of the bigram constitutes half of that partition of the dataset). }
\caption{Examples of paraphrase pairs in WikiAnswers and Quora datasets. We manually labeled the sentences with the \textcolor{brandeisblue}{\textit{blue italic words}} being sentence-level and the \textcolor{caribbeangreen}{\underline{green underlined words}} being phrase-level.}
\caption{Scheme of the miRNA-decoy circuit controlling skeletal muscle-cell differentiation (see text for details). (A) Circuitry identified in \cite{legnini2014feedforward}, showing molecular species and their interactions. Red (resp. black) dashed lines represent standard (resp. mutually alternative) transcriptional modes, while thick red lines indicate that the involved species compete to bind a miRNA. The thick purple line indicates that pri-miR-133 processing is controlled by HuR. (B) Reduced model considered in this paper, with the corresponding nomenclature used in the mathematical model. \red{Grey arrows indicate effective interactions: the sponge and the controller are mutually reinforcing (an increase in the latter leads to increased synthesis of the former, while increased miRNA sponging de-represses the controller), while both effectively repress the miRNA (increased levels of $h$ reduce the synthesis of $\mu$ while increased levels of $\ell$ enhance miRNA sponging).}}
\caption{Dynamical behaviour of the model, to be compared with the experimental time courses presented in \cite{legnini2014feedforward,cesana2011long} \red{and reported schematically on the right}. The controller ($h$) drives the synthesis of the sponge ($\ell$) at rates high enough to efficiently sequester miRNAs ($\mu$) until the transcription from the alternative locus markedly increases at $t^{\star} \simeq 50$ h. In turn, $\ell$ ensures that the target stays de-repressed long enough to bring the differentiation process to maturation. When the level of $\mu$ becomes sufficiently high, $m$ is rapidly silenced and the next stage of differentiation sets in.}
\caption{ (a-d) Drag measured as a function of rod depth $z_r/D$ at different speeds ($\eta_f = 34$\,mPa\,s). The dashed line is a power law fit to the function $F(z) = F_o z^m$. (e) $\mu_e$ versus $I$ corresponding to the data plotted in (a-d). (f) $\mu_e$ versus $J$ corresponding to the data plotted in (a-d). The curve given by Eq.~\ref{eq:fit} with the same fit shown in Fig.~\ref{fig:nondimensional}(c) is also plotted to guide the eye.}
\caption{(a) The effective viscosity $\eta_e$ versus $J$ for fluids with various $\eta_f$ collapses onto a single curve ($D = 2.6$\,mm). The dashed line corresponds to Eq.~\ref{eq:etamue}. (b) $\eta_e$ versus $J$ for various $D$ is also described by the same curve. (c) $\eta_e/\eta_f$ as a function of $J$ for various $d$ roughly follows the same trend in all cases. However, small but systematic deviations can also be observed at higher $J$. }
\caption[]{(a) Schematics of forces acting on the magnetic vortex core: gyroforce (green arrow), confinement force (yellow), spin transfer (purple), and damping force (blue); (b) Noise schematics of vortex motion: amplitude and phase noise \textcolor{hellblau}{$\delta s$} and \textcolor{orange}{$\delta \phi$} resp.}
\caption{Additional \aastex\ symbols}
\caption{forward rules for task {\color{blue}A} ({\color{red}B}) }
\caption{backward rules for {\color{blue}A} ({\color{red}B})}
\caption{Given a contact region (red in a) \red{on which arbitrary forces can be applied}, our algorithm optimizes the boundary thickness locally (b-c) to find the minimum-weight shell structure (d) that can withstand all possible force configurations. In (b-c), we show the material distribution in two steps of the optimization. Inset figures illustrate the scalar temperature fields on the boundary that we use to drive the shell thickness in (b-c) and the removed material in (d).}
\caption{\blue{Schema of the SPEA2 algorithm.}}
\caption{Example of sampling positions on an area of the map. For the training set: green indicates \textcolor{green}{valid} samples and red \textcolor{red}{invalid} samples. For the test set: blue indicates \textcolor{blue}{valid} samples and orange \textcolor{orange}{invalid} samples. (Better viewed on screen.) }
\caption{\textcolor{blue}{Different defense mechanisms against adversarial examples. Previous studies mainly focus on removing adversarial perturbations by pre-processing or improving the robustness of a model by modifying the training or inference strategy. We propose adversarial {\em logits} correction, a defense from a new perspective that solely relies on the final {\em logits} of models.}}
\caption{\textcolor{blue}{Supporting classes for each adversarial attack on ResNet-50. We list $10$ classes that appear most frequently in the top-10 of $S_k$, with the number of appearances recorded on the vertical axis. For better visualization, we list the name of each class on the horizontal axis, and also attach a representative image above the bar.}}
\caption{Gradual partitioning of instances to particular concepts given a particular set of properties describing each instance. Each concept is illustrated as a pie chart showing the object class label distribution of instances assigned to the respective concept. Sample concepts are annotated (\tikzdrawcircle[blue!80, line width=0.5mm,fill=white, minimum size=10pt]{3pt}) which illustrate object classes featuring similar quality regarding the property, such as \emph{plate}, \emph{bowl}, \emph{cup}, \emph{to\_go\_cup} regarding the \emph{containment} property.}
\caption{Substitution results w.r.t. human expert selection \emph{distribution} and ERSATZ \emph{similarity} responses. Note that gray cells correspond to object categories which are not available in the respective query, and cells marked with \tikzdrawcircle[black, fill=black]{2.5pt} represent substitutes selected by experts and ERSATZ.}
\caption{List of stressors supported by \SYS. ``{\color{FireBrick}\ding{55}}'' indicates a stressor available in \vanilla that is not SGX-compatible. \texttt{f}=float, \texttt{d}=double, \texttt{ld}=longdouble.}
\caption{\label{fig:tct0}(top) $T_c\sqrt{t_{0}}$ vs $a^2/t_0$. Our results (\protect\bluesq) are shown together with the $SU(3)$ results from \cite{Francis:2015lha} using Wilson~(\protect\blacktri) and Wilson-improved~(\protect\rbcir) energies, which are reproduced for comparison. Our extrapolated value (\protect\bluefsq) is compared to the $SU(3)$ results of~\cite{Francis:2015lha} (\protect\redfcir) and~\cite{Kitazawa:2016dsl} (\protect\greenftri). (bottom) $T_c\sqrt{t_{0.2}}$ vs $a^2/t_{0.2}$ and the extrapolation compared with the results of \cite{Asakawa:2015vta}, with the same symbols.}
\caption{Synthetic intensity measurements of each MSE channel modeled in \SOFT, averaged with a uniform distribution over momentum space, as a function of normalized minor radius (i.e. initial RE position). The experimentally-determined normalized tangency radius is indicated by a vertical black line for each channel (labeled at left). Note that channels 1-3 have data reflected over $r/a = 0$ (in grey). The locations of the magnetic axis \red{(with $q_{\rm axis} \approx 0.9$)} and flux surfaces $q$~=~1, 4/3, 3/2, 2, and 3 are shown as solid vertical lines; \redd{shaded regions, extending halfway between adjacent surfaces, are used in step~\ref{step:stitch} of the methodology of section~\ref{sec:comparison}.}}
\caption{\SOFT-predicted polarization (a)~angle $\tpol$ and (b)~fraction $\fpol$ versus normalized tangency radius $\rtan/a$ of the MSE channels (vertical dotted lines) and RE pitch angle $\tp$, for $t$~=~1.04~s and $p/mc$~=~60. The bounded region in (a) corresponds to the region of expected $\tpol \approx 0\deg$ from the heuristic argument presented in section~\ref{sec:heuristic}, i.e. $\tpmin \leq \tp \leq \tpmax$ \red{from \eqref{eq:heuristicCalc}}. See figure~\ref{fig:linePlots} for line plots of $\tpol$ and $\fpol$ for channels~1, 3, 5, 7, and 9. \red{Grey regions indicate practically-undetectable regions of phase space.}}
\caption{(a) A momentum space distribution function $f(p,\tp)$ (log scale) calculated by \CODE for plasma parameters at the magnetic axis. (b) The normalized convolution of $f$, the detector response function $\hat{I}(p,\tp)$ for channel~3, and Jacobian $J = p^2\sin\tp$. The location of peak detected emission is $p/mc \approx 40$ and $\tp \approx 0.16$~rad. \red{Grey regions indicate practically-undetectable regions of momentum space. ($t$~=~1.24~s)}}
\caption{\SOFTCODE-predicted polarization (a) angle $\tpol$ and (b) fraction $\fpol$ for the Alcator C-Mod discharge of interest, versus time and normalized tangency radius of the MSE channels (horizontal dotted lines). Time and spatial resolutions are $\Delta t$~=~100~ms and $\Delta \rtan/a \sim$~0.1-0.2. Compare to figures~\ref{fig:paExp} and \ref{fig:pfExp}. \red{Grey regions indicate undetectable synthetic data ($\rtan/a \gtrsim 0.9$) and unexplored times ($t \lesssim 0.5$~s and $t \gtrsim 1.6$~s).}}
\caption{Statements and factor scores. The highlighted cells in \colorbox{green!25}{green} and \colorbox{red!25}{red} indicate the extreme scores of each factor, whereas an asterisk ``*'' is used to denote the distinguishing statements of each factor. Consensus statements are in bold, while dissensions in italic.}
\caption{\textbf{NV$^\mathbf{\text{-}}$ photoluminescence.} Extracted ({\color{red}{\rule[.6mm]{3mm}{.3mm}}}) and predicted ({\color{gray}{\rule[.6mm]{3mm}{.3mm}}}) NV$^\text{-}$ photoluminescence emission spectrum at room temperature as described in the main text. The extracted and predicted curves are in good agreement. The ZPL feature itself ({\color{black}{\rule[.6mm]{1mm}{.3mm}}}\,{\color{black}{\rule[.5mm]{1mm}{.3mm}}}\,{\color{black}{\rule[.6mm]{1mm}{.3mm}}}) inserted here is not predicted a priori, but is consistent with the range of ZPL parameters found in the literature.}
\caption{\textbf{NV$^\mathbf{0}$ photoluminescence.} Extracted ({\color{dandelion}{\rule[.6mm]{3mm}{.3mm}}}) NV$^0$ photoluminescence emission spectrum at room temperature as described in the main text.}
\caption{\textbf{Example decomposition of diamond photoluminescence spectra.} a) PL spectra ({\color{black}{\rule[.6mm]{3mm}{.3mm}}}) for sample F, decomposed as a linear combination of the NV$^0$ ({\color{dandelion}{\rule[.6mm]{3mm}{.3mm}}}) and NV$^\text{-}$ ({\color{red}{\rule[.6mm]{3mm}{.3mm}}}) PL emission spectra. The fit ({\color{orange}{\rule[.6mm]{3mm}{.3mm}}}) to the total PL is also shown. b) Residuals between the measured and fit data in a). In both subfigures, the dashed black ({\color{black}{\rule[.6mm]{1mm}{.3mm}}} {\color{black}{\rule[.6mm]{1mm}{.3mm}}} {\color{black}{\rule[.6mm]{1mm}{.3mm}}}) vertical lines indicate the position of the NV$^0$ and NV$^\text{-}$ ZPLs at 575 nm and 637 nm respectively.}
\caption{\textbf{Comparison of methods determining the PL composition ratio $\bm{c_{\text{-}}/c_0}$ for sample S.} The PL composition ratio $c_-/c_0$ is determined using both the photoluminescence decomposition analysis (\textcolor{springgreen}{\textbf{$\mathrlap{\times}+$}}) and the Debye-Waller decomposition (\textcolor{skyblue}{$\blacktriangle$}) for various laser intensities. Both methods analyzed the same data. The photoluminescence decomposition analysis produces $c_-/c_0$ ratios that are more consistent across laser intensities and exhibit lower error bars.}
\caption{\textbf{Total NV concentration versus irradiation dose} \bm{$D_e$}\textbf{.} Data are shown for un-irradiated control diamonds ({\tiny$\blacklozenge$}), 850~$^\circ$C annealed diamonds (\textcolor{red}{\textbullet}), and 1250~$^\circ$C annealed diamonds (\textcolor{blue}{\tiny$\blacksquare$}). Error bars denote standard deviation of multiple measurements on the same sample.}
\caption{$\mathbf{T_1}$ \textbf{versus irradiation dose} $\bm{D_e}$\textbf{.} Data are shown for both 850~$^\circ$C annealed diamonds (\textcolor{red}{\textbullet}) and 1250~$^\circ$C annealed diamonds (\textcolor{blue}{\tiny$\blacksquare$}). The measured value of $T_1$ displays little if any dependence on either irradiation dose or anneal temperature for irradiation doses $D_e \! \leq \! 5\times \! 10^{18}$~e$^\text{-}$/cm$^{2}$. Error bars denote standard deviation of multiple measurements on the same sample.}
\caption{$\bm{T_2}$ \textbf{versus irradiation dose} $\bm{D_e}$\textbf{.} Data are shown for un-irradiated control diamonds ({\tiny$\blacklozenge$}), 850~$^\circ$C annealed diamonds (\textcolor{red}{\textbullet}), and 1250~$^\circ$C annealed diamonds (\textcolor{blue}{\tiny$\blacksquare$}). Little if any decrease in $T_2$ is observed for irradiation doses $D_e \lesssim 10^{18}$ e$^\text{-}$/cm$^{2}$. Data are fit to two models, one where $T_2$ decoherence is linearly proportional to dose ({\color{thegreen}{\rule[.6mm]{.65mm}{.5mm}}}\hspace{.35mm}{\color{thegreen}{\rule[.6mm]{.65mm}{.5mm}}}\hspace{.35mm}{\color{thegreen}{\rule[.6mm]{.65mm}{.5mm}}}) and another where $T_2$ decoherence is quadratic with dose ({\color{thegreen}{\rule[.6mm]{3mm}{.3mm}}}) as described in the main text. Error bars denote standard deviation of multiple measurements on the same sample.}
\caption{Reported values of monovacancy concentrations in ppm generated by electron irradiation at 1\,MeV ({\color{red}{\rule[.6mm]{3mm}{.3mm}}}), 1.9-2\,MeV ({\color{igorgreen}{\rule[.6mm]{3mm}{.3mm}}}), 3\,MeV ({\color{igoraquamarine}{\rule[.6mm]{3mm}{.3mm}}}), and 4.5-5\,MeV ({\color{blue}{\rule[.6mm]{3mm}{.3mm}}}) from the diamond literature, measured by 77\,K UV-Vis spectrophotometry using the oscillator strengths $f_\text{GR1}$ and $f_\text{ND1}$ reported in Ref.~\cite{twitchen1999correlation}. Solid lines represent the calculated dependence of vacancy creation on irradiation dose from SRIM calculations~\cite{ziegler2010SRIM} in Ref.~\cite{Campbell2000}, neglecting vacancy-interstitial recombination for electron irradiation at 1\,MeV (\textcolor{red}{$-$}), 2\,MeV (\textcolor{igorgreen}{$-$}), and 5\,MeV (\textcolor{blue}{$-$}). Filled markers denote measurements of the total monovacancy concentration $\text{V}^0 + \text{V}^\text{-}$ while open markers denote measurements of only V$^0$. All diamond samples included here contain nitrogen concentrations $\lesssim 10$\,ppm.
The following markers denote measurements reported in the following references: \textcolor{red}{$\mathlarger{\mathlarger{\Join}}$} = this work at 1\,MeV, \textcolor{igorgreen}{$\mathlarger{\mathlarger{\mathlarger{\mathlarger{\mathlarger{\blacktriangleleft}}}}}$} = Ref.~\cite{lawson1998ontheexistence} at 1.9\,MeV recalculated using $f_\text{GR1}$ and $f_\text{ND1}$ from Ref.~\cite{twitchen1999correlation}, \textcolor{igorgreen}{$\mathlarger{\mathbin{\bigtriangledown}}$} and \textcolor{igorgreen}{$\mathlarger{\mathlarger{\mathlarger{\mathlarger{\mathlarger{\mathbin{\blacktriangledown}}}}}}$} = Ref.~\cite{twitchen1999correlation} at 1.9\,MeV, \textcolor{igorgreen}{$\mathlarger{\mathlarger{\Join}}$} = Ref.~\cite{allers1998annealing} at 2\,MeV recalculated using $f_\text{GR1}$ and $f_\text{ND1}$ from Ref.~\cite{twitchen1999correlation}, \textcolor{igorgreen}{$\mathlarger{\mathlarger{\mathlarger{\mathlarger{\circ}}}}$} = Ref.~\cite{twitchen1999electronparamagneticresonanceandopticalabsorption} at 2\,MeV, \textcolor{igorgreen}{$\mathlarger{\mathlarger{\mathbin{\rhd}}}$} = Ref.~\cite{newton2002recombination} at 2\,MeV, \textcolor{igoraquamarine}{$\mathlarger{\mathlarger{\mathlarger{\mathlarger{\mathlarger{\mathbin{\blacktriangleup}}}}}}$} and \textcolor{igoraquamarine}{$\mathlarger{\mathbin{\bigtriangleup}}$} = Ref.~\cite{collins2003production} at 3\,MeV, \textcolor{igoraquamarine}{$\mathlarger{\mathlarger{\mathlarger{\mathlarger{\mathbin{\blackdiamond}}}}}$} and \textcolor{igoraquamarine}{$\mathlarger{\mathlarger{\mathlarger{\mathlarger{\mathbin{\diamond}}}}}$} = Ref.~\cite{iakoubovskii2005evidence} at 3\,MeV, \textcolor{blue}{$\mathlarger{\mathlarger{\mathlarger{\mathlarger{\mathbin{\blackdiamond}}}}}$} and \textcolor{blue}{$\mathlarger{\mathlarger{\mathlarger{\mathlarger{\diamond}}}}$} = Ref.~\cite{E6patent2010WO149775} at 4.5\,MeV, and \textcolor{blue}{$\mathlarger{\mathlarger{\mathlarger{\mathlarger{\circ}}}}$} = Ref.~\cite{Fraczek2017} at 4.5\,MeV.
}
\caption{\textcolor{red}{Evaluation Results on Friends and EmotionPush datasets.}}
\caption{\textcolor{red}{Both Gold}}
\caption{Example frames with ground truth segmentation and model predictions for Experiment I. (Colormap: \textcolor{pupil}{\rule{0.5cm}{0.25cm}} Pupil, \textcolor{iris}{\rule{0.5cm}{0.25cm}} Iris, $^{\framebox{ \textcolor{cornea}{\rule{0.3cm}{0.001cm}}}}$ Cornea, \textcolor{skin}{\rule{0.5cm}{0.25cm}} Skin, \textcolor{tape}{\rule{0.5cm}{0.25cm}} Surgical tape, \textcolor{retractors}{\rule{0.5cm}{0.25cm}} Eye retractors and \textcolor{hcannula}{\rule{0.5cm}{0.25cm}} Instrument) }
\caption{Example frames with ground truth segmentation and model predictions for Experiment II. (Colormap: \textcolor{pupil}{\rule{0.5cm}{0.25cm}} Pupil, \textcolor{iris}{\rule{0.5cm}{0.25cm}} Iris, $^{\framebox{ \textcolor{cornea}{\rule{0.3cm}{0.001cm}}}}$ Cornea, \textcolor{skin}{\rule{0.5cm}{0.25cm}} Skin, \textcolor{tape}{\rule{0.5cm}{0.25cm}} Surgical tape, \textcolor{retractors}{\rule{0.5cm}{0.25cm}} Eye retractors, \textcolor{hcannula}{\rule{0.5cm}{0.25cm}} Cannula, \textcolor{capcyst}{\rule{0.5cm}{0.25cm}} Capsulorhexis cystotome, \textcolor{bonn}{\rule{0.5cm}{0.25cm}} Tissue forceps, \textcolor{capforceps}{\rule{0.5cm}{0.25cm}} Capsulorhexis forceps, \textcolor{secondknife}{\rule{0.5cm}{0.25cm}} Secondary knife, \textcolor{lensinjector}{\rule{0.5cm}{0.25cm}} Lens injector, \textcolor{micromanipulator}{\rule{0.5cm}{0.25cm}} Micromanipulator and \textcolor{iahandpiece}{\rule{0.5cm}{0.25cm}} I/A handpiece)}
\caption{Example frames with ground truth segmentation and model predictions for Experiment III. (Colormap: \textcolor{pupil}{\rule{0.5cm}{0.25cm}} Pupil, \textcolor{iris}{\rule{0.5cm}{0.25cm}} Iris, $^{\framebox{ \textcolor{cornea}{\rule{0.3cm}{0.001cm}}}}$ Cornea, \textcolor{skin}{\rule{0.5cm}{0.25cm}} Skin, \textcolor{tape}{\rule{0.5cm}{0.25cm}} Surgical tape, \textcolor{retractors}{\rule{0.5cm}{0.25cm}} Eye retractors, \textcolor{vcannula}{\rule{0.5cm}{0.25cm}} Viscoelastic cannula, \textcolor{rcannula}{\rule{0.5cm}{0.25cm}} Rycroft cannula, \textcolor{capcyst}{\rule{0.5cm}{0.25cm}} Capsulorhexis cystotome, \textcolor{capcysthandle}{\rule{0.5cm}{0.25cm}} Capsulorhexis cystotome handle, \textcolor{lensinjector}{\rule{0.5cm}{0.25cm}} Lens injector, \textcolor{primknife}{\rule{0.5cm}{0.25cm}} Primary knife, \textcolor{bonn}{\rule{0.5cm}{0.25cm}} Bonn forceps, \textcolor{capforceps}{\rule{0.5cm}{0.25cm}} Capsulorhexis forceps, \textcolor{micromanipulator}{\rule{0.5cm}{0.25cm}} Micromanipulator, \textcolor{iahandpiece}{\rule{0.5cm}{0.25cm}} I/A handpiece and \textcolor{iahandle}{\rule{0.5cm}{0.25cm}} I/A handpiece handle)}
\caption{\textcolor{black}{\label{susc Ce} Magnetic data obtained on a floating-zone-grown CeAlGe single crystal with a mass of 125.4\,mg.} The magnetic susceptibility was measured in the range of 1.8--400\,K with the field aligned parallel to $a$ (black dots) and parallel to $c$ (red dots). The main figure shows the inverse susceptibility obtained after field cooling at 0.1\,T. Inset: The low temperature range of both zero field cooled and field cooled magnetic susceptibility curves measured at 5\,mT.}
\caption{\textcolor{black}{\label{susc Pr}Bulk magnetic property data obtained on a PrAlGe single crystal. The} susceptibility was recorded in the range of 1.8--400\,K with the field aligned perpendicular (black dots) and parallel (red dots) to $c$. The main panel shows the inverse susceptibility measured after FC at 0.1\,T. The inset depicts the low temperature range of both ZFC and FC curves measured at 5\,mT.}
\caption{Graph depicting the azimuthal angles, $\phi$, subtended by a point with $y_0=0.5$, $z_0=0.5$ travelling at $\beta=0.9$ (black representing the primary aperture and \textcolor{red}{red} the secondary) for $d=0.3$.}
\caption{Graph depicting the difference in azimuthal angles, $\Delta\phi$, subtended by a point with $y_0=0.5$, $z_0=0.5$ for an observer with $d=0.3$ and $\beta=$ 0.9, 0.7 and 0.5 for the black, \textcolor{red}{red} and \textcolor{blue}{blue} curves respectively.}
\caption{Graph depicting the polar angles, $\theta$, subtended by a point with $y_0=0.5$, $z_0=0.5$ travelling at $\beta=0.9$ (black representing the primary aperture and \textcolor{red}{red} the secondary) for $d=0.3$.}
\caption{Graph depicting the difference in polar angles, $\Delta\theta$, subtended by a point with $y_0=0.5$, $z_0=0.5$ for an observer with $d=0.3$ and $\beta=$ 0.9, 0.7 and 0.5 for the black, \textcolor{red}{red} and \textcolor{blue}{blue} curves respectively.}
\caption{Figure illustrating deformation of a 2D bicycle with $\beta=0.9$ for a Class 2 observer with aperture spacing $d=0.3$. We see the features pointed out in the previous section with different parts of the bicycle catching up with itself at different times. The image received at the primary aperture is in black and at the secondary is in \textcolor{red}{red}.}
\caption{Figure illustrating the simulated azimuthal difference, $\Delta\phi$, and the fitted function. The simulation curve is in black and the fitting of Equation (\ref{eq:phidiff}) is in \textcolor{red}{red}.}
\caption{Figure illustrating the simulated polar difference, $\Delta\theta$, and the fitted function. The simulation curve is in black and the fitting of Equation (\ref{eq:thetadiff}) is in \textcolor{red}{red}.}
\caption{\FIGCAPTIONPREFIX EO for maximization of the absolute backscattered intensity \(I_{\text{direct}}\) to the target solid angle centered at a polar angle of (a) \(45^{\circ}\) and (b) \(90^{\circ}\). Target solid angles, backscattering radiation patterns and top views of the optimized geometries are shown respectively from top to bottom. Scale bars are \(200\,\)nm. Normally incident plane wave, linearly \(X\)-polarized with \(\lambda_0 = 800\,\)nm, the structures lie in vacuum on an \(n=1.5\) substrate. A visualization of the optimization convergence for case (a) can be found online ({\color{blue}Visualization~1}). }
\caption{\textbf{\blue{Saliency Maps}}: \blue{We present the saliency maps for one sample per emotion (angry, happy, and sad) as learned by the network for a single walk cycle.} The maps show activations on the joints during the walk cycle. Black represents no activation and red represents high activation. For all the emotion classes, the hand, feet and head joints have high activations, implying that the network deems these joints to be more important for determining the class. Moreover, the activation values on these joints for a high arousal emotion (\textit{e.g.}, angry) are higher than those for a low arousal emotion (\textit{e.g.}, sad), implying the network learns that higher arousal emotions lead to more vigorous joint movements.}
\caption{\textbf{Accuracy}: Our method with combined deep and affective features classified with a Random Forest classifier achieves an accuracy of $80.07\%$. We observe an improvement of $13.85\%$ over state-of-the-art emotion identification methods and an improvement of $24.60\%$ over a baseline LSTM-based classifier. \blue{All methods were compared on $1384$ gaits obtained from the datasets described in Section~\ref{sec:mocapDatasets}.}}
\caption{Subreddits with highest monthly evolution magnitudes in each category and the nature of their evolution. \textbf{AMM} is the average magnitude of evolution per month, \textbf{LM} is the magnitude of evolution over the entire lifetime of the subreddit, T and U denote the magnitudes of topic and active user base evolution, \red{Red} keywords represent topics associated with the start of the subreddit's lifetime, and \blue{Blue} keywords represent topics associated with the end of the subreddit's lifetime which is characterized by when the subreddit was banned or the end of our data collection (10/2018).}
\caption{Dendrogram of a sample set of subreddits after applying hierarchical clustering on topic vectors. Here we see how \subreddit{The\_Donald} evolved from a meme subreddit (labeled 0; nearest to \subreddit{TuberSim}) to a political subreddit (labeled months 1-7; nearest to \subreddit{Republican}, \subreddit{enoughsandersspam}, and \subreddit{altright} before it was classified as hateful) to a hateful subreddit (labeled months 12-40; nearest to \subreddit{altright} when it was banned).}
\caption{\textbf{Tetragonal distortion of ice-VII.} \textbf{a} Comparison of raw XRD images of ice: (left) highly strained and textured diffraction pattern at 23.2\,$\pm{}$\,0.6\,GPa without heat treating and (right) full Debye-Scherrer rings from heat treating an annealed powder of ice at 19.1\,$\pm{}$\,0.4\,GPa. Each red letter D indicates a reflection from the single-crystal diamond anvil. \textbf{b} Rietveld refinement of ice-VII$_{\text{t}}$ $(P4_{2}/nnm)$ at 6.5\,$\pm{}$\,0.5\,GPa ($a$\,=\,3.2279\,$\pm{}$\,0.0002\,\AA{} and $c$\,=\,3.2372\,$\pm{}$\,0.0003\,\AA{}, $V$\,=\,33.719\,$\pm{}$\,0.002\,\AA{}$^{3}$, $wR_{\text{P}}$\,=\,1.81\% and R$_{\text{P}}$\,=\,1.41\%). Inset: (left) Rietveld refinement of the (2\,0\,0) Bragg peak using a cubic cell ($Pn\overline{3}m$) ($a$\,=\,3.2275\,$\pm{}$\,0.0002\,\AA{}, $V$\,=\,33.621\,$\pm{}$\,0.005\,\AA{}$^{3}$, $wR_{\text{P}}$\,=\,2.36\% and R$_{\text{P}}$\,=\,2.43\%) and (right) improved fit using a tetragonal cell $(P4_{2}/nnm)$ with (2\,0\,0) and (0\,0\,2) Bragg peaks. \textbf{c} Log Bayes Factors (BFs) for a tetragonal model \textit{vs} cubic model for our data on a logarithmic scale. Positive values indicate that the data favor the tetragonal model, whereas negative values indicate preference for the cubic model. Red points indicate pressures at which the sample was laser heated. We used a random sampling of points across the pressure range of our experiment to conduct this analysis. }
\caption{ \textbf{Equation of state fitting.} \textbf{a} Pressure-volume plot of our data and Vinet EOS fit from MCMC for the three phases. The calculated uncertainties in the transition pressures are indicated by the blue regions at 5.1\,$\pm{}$\,0.5\,GPa and 30.9\,$\pm{}$\,2.9\,GPa, respectively, and the grey lines are results from previous experiments. Curves are colour coded by phase (blue: cubic ice-VII ($K_{0}$\,=\,18.47\,$\pm{}$\,4.00\,GPa, $K_{0}^{\prime{}}$\,=\,2.51\,$\pm{}$\,1.51, $V_{0}$\,=\,42.50\,$\pm{}$\,0.88\,\AA{}$^{3}$), black: non-cubic ice-VII$_{\text{t}}$ ($K_{0}$\,=\,20.76\,$\pm{}$\,2.46\,GPa, $K_{0}^{\prime{}}$\,=\,4.49\,$\pm{}$\,0.35, $V_{0}$\,=\,41.11\,$\pm{}$\,0.53\,\AA{}$^{3}$), and red: ice-X ($K_{0}$\,=\,50.52\,$\pm{}$\,4.16\,GPa, $K_{0}^{\prime{}}$\,=\,4.50\,$\pm{}$\,0.15, $V_{0}$\,=\,33.82\,$\pm{}$\,0.43\,\AA{}$^{3}$)). In comparison, fitting all of the data to a single phase yields $K_{0}$\,=\,12.57\,$\pm{}$\,0.50\,GPa, $K_{0}^{\prime{}}$\,=\,6.06\,$\pm{}$\,0.07, and $V_{0}$\,=\,43.05\,$\pm{}$\,0.20\,\AA{}$^{3}$. Orange error bars indicate our systematic uncertainty from deviatoric stresses in non-heat-treated data. \textbf{b} Linearized Vinet EOS when applying a single phase. \textbf{c} Three-phase linearized Vinet EOS. }
\caption{\textbf{Raman spectrum of heat treated H$_{2}$O ices under compression.} \textbf{a} Frequency shift of measured Raman modes of H$_{2}$O ice with pressure. Splitting of the dominant lattice mode near 280\,cm$^{-1}$ due to tetragonal distortion above 5\,GPa is highlighted in blue and green. Red diamonds show the emergence of the ice-X $T_{\text{2g}}$ mode above 33\,GPa. Dashed lines represent transition pressures based on analysis of XRD data. \textbf{b} Progression of Raman features on increasing pressure. The dominant mode near 280\,cm$^{-1}$ exhibits asymmetry above 5\,GPa, and tends towards a single mode above 21\,GPa. Red asterisks (\textcolor{red}{*}) denote the emergence of the ice-X $T_{\text{2g}}$ Raman mode. (Inset) Dominant Raman feature in ice-VII at 3.3\,GPa is fit to a single peak, whereas the same feature in ice-VII$_{\text{t}}$ at 11.3\,GPa is a triplet. \textbf{c} (bottom-to-top) Development of the ice-X $T_{\text{2g}}$ mode on compression above 33\,GPa, and its reversible disappearance on decompression. \textbf{d} Frequency shift of the ice-X $T_{\text{2g}}$ Raman mode with pressure, showing correlation with those reported by \citet{goncharov1999raman} at higher pressures. }
\caption{\textcolor{blue}{{\bf Schematic of the physical problem.} {\bf a.} Sketch of the computational domain with characteristic length and dimensions. {\bf b.} The initial conditions for particles and red cells are obtained by choosing, one at a time, each of the five initial particle positions within each of the four red blood cell configurations.}}
\caption{\label{table:stat} Statistics over \testgentemp. The \original dataset corresponds to the one provided by the organisers. The \reduced dataset is the one for which the annotations are preserved by the generation model. \# denotes number.}
\caption{\label{tmp-trans} Temporal relations extraction for \overlap for CNN and Naive Bayes (NB), \testtemp. Only the real/generated text between events serves as input. Best performing models for data augmentation experiments are highlighted in bold. We report results for the models trained using: \real \original training data from the i2b2 task augmented with the generated \gen data, \real \original data only, 2 $\times$ \real \original data upsampled twice, \real \reduced data only, \gen data only. }
\caption{Trajectory plots of the generated pose (i.e. \emph{Root}'s position) viewed from the top. Each box represents a generated trajectory of the model on the vertical axis and sentence on the horizontal axis. The person starts at the green cross (\color{green}x\color{black}) and ends at the red circle (\color{red}$\bullet$\color{black}) with blue dots (\color{blue}$\sbullet$\color{black}) denoting equally placed time-steps. All trajectories in each column have the same scale for fair comparison across models.}
\caption{Renders of generated animations with a diverse set of sentences as input by our proposed model. Our model is able to change speed, direction and actions based on changes in the input sentence. Trajectory of the character is drawn with a blue line which starts at the green cross (\color{green}x\color{black}) and ends at the red circle (\color{red}$\bullet$\color{black}). }
\caption{\textcolor{red}{System Diagram for the AMS.}}
\caption{ \textcolor{red}{Harmonic context representation as a 2d matrix for an example chord progression.}}
\caption{Comparison of evaluation metrics for different detectors. R, P, and H refer to recall, precision, and H-mean. Detectors are sorted from the highest score on DetEval metric. Texts are highlighted in \textcolor{red}{red} and \textcolor{blue}{blue} for rise and fall.}
\caption{(a) Scheme illustrating the structure of the computational box in the SCTSQI model. The mask function [Eq.~(\ref{mask})] is shown by the blue (thick) curve. The vertical lines correspond to the internal boundaries of the mask region. The black (thin) curves show the windows of the Gabor transform centered at the points $x_{0,\pm}^{i}$ [Eq.~(\ref{grids})]. (b) The Husimi quasiprobability distribution $\left|G\left(x,p_x,t_f/2\right)\right|^{2}$ calculated at $t=3t_f/2$ for the laser pulse defined by Eq.~(\ref{vecpot}) with a duration of $n=4$ cycles, intensity of $2.0\cdot10^{14}$ W/cm$^2$, phase $\varphi=0$, and a wavelength of 800 nm. The Husimi distribution is calculated in the domains $D_1$ and $D_2$ of the phase space (see text). A logarithmic color scale is used. $P_1$-$P_3$ represent the three main spots of the Husimi distribution. The maxima of these spots are depicted by a (green) circle, (magenta) square, and (cyan) rectangle, respectively. (c) The final electron momentum $-A_{x}\left(t\right)$ in the potential-free classical model as a function of the time of ionization. The parameters of the laser pulse are the same as in Fig.~1~(b). The vicinities of the time instants $t_1$, $t_2$, and $t_3$ make the main contribution to the spots $P_1$, $P_2$, and $P_3$, respectively [see Fig.~1~(b)].} \label{fig2} \end{figure} The value of the Gabor transform at an arbitrary point that belongs to the domain $D_1$ or $D_2$ can be obtained by a two-dimensional interpolation. At every time $t_0$ we randomly distribute initial positions $x_{0}^{j}$ and momenta $p_{x,0}^{j}$ $\left(j=1,...,n_{p}\right)$ of $n_p$ classical trajectories in the domains $D_1$ and $D_2$. These trajectories are propagated according to Newton's equation of motion (\ref{newton_1d}). 
Every trajectory is assigned the quantum amplitude $G\left(t_0,x_{0}^{j},p_{x,0}^{j}\right)$ and the phase
\begin{equation}
\Phi_{0}\left(t_{0},x_{0}^{j},p_{x,0}^{j}\right)=-\int_{t_0}^{\infty} dt \left\{\frac{v_{x}^{2}\left(t\right)}{2}-\frac{x^2}{\left(x^2+a^2\right)^{3/2}}-\frac{1}{\sqrt{x^2+a^2}}\right\}.
\label{phase_sctsqi}
\end{equation}
We note that the SCTSQI phase (\ref{phase_sctsqi}) corresponds to the phase of the matrix element of the semiclassical propagator that describes a transition from momentum $p_{x,0}^{j}$ at $t=t_0$ to momentum $k_{x}^{j}=k_{x}^{j}\left(x_{0}^{j},p_{x,0}^{j}\right)$ at $t\to\infty$. The ionization probability in the SCTSQI is given by
\begin{equation}
R\left(k_x\right)=\left|\sum_{m=1}^{N_{T}}\sum_{j=1}^{n_{k_x}}G\left(t_{0}^{m},x_{0}^{j},p_{x,0}^{j}\right)\exp\left[i\Phi_{0}\left(t_{0}^{m},x_{0}^{j},p_{x,0}^{j}\right)\right]\right|^{2},
\label{sctsqi}
\end{equation}
where \textcolor{blue}{$N_T$} is the number of time steps used to solve the TDSE, and $n_{k_x}$ is the number of trajectories reaching the same bin centered at $k_x$ [cf. Eq.~(\ref{prob2})]. It should be stressed that the Gabor transform $G\left(t_{0}^{m},x_{0}^{j},p_{x,0}^{j}\right)$ is a complex function with both absolute value and phase. In order to ensure that ionized parts of the wave function reach the absorbing regions, we propagate the TDSE up to some time $t=T$, where $T>t_f$. For this reason, in the SCTSQI we calculate classical trajectories till $t=T$ and replace $t_f$ by $T$ in Eq.~(\ref{post_pulse_final}) for the post-pulse phase. In our simulations we have used $T=4t_f$.

\section{Results and discussion}
For our numerical examples we use the intensity of $2.01\cdot10^{14}$ W/cm$^2$ ($F_0=0.0757$~a.u.) and the wavelength 800 nm ($\omega=0.057$~a.u.). This corresponds to the Keldysh parameter $\gamma=\omega\sqrt{2I_{p}}/F_{0}$ (see Ref.~\cite{Keldysh1964}) equal to 0.87.
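As a consistency check on these values (a sketch using standard unit conversions that are not quoted in the text, namely the atomic unit of intensity, $3.51\times10^{16}$~W/cm$^2$, and the Hartree energy, 27.21~eV), the field amplitude and the frequency follow from
\[
F_0=\sqrt{\frac{2.01\cdot10^{14}}{3.51\cdot10^{16}}}\approx 0.0757~\text{a.u.},\qquad
\omega=\frac{1239.8~\text{eV\,nm}}{800~\text{nm}}\cdot\frac{1}{27.21~\text{eV}}\approx 0.057~\text{a.u.}
\]
Inverting the definition of the Keldysh parameter with these rounded numbers gives $I_p=\left(\gamma F_0/\omega\right)^2/2\approx 0.67$~a.u. This value of $I_p$ is only inferred here from $\gamma$, and it is no more accurate than the rounding of $\gamma$ permits.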
For simplicity, we set the absolute phase of the pulse (\ref{vecpot}) equal to zero: $\varphi=0$. We benchmark our SCTSQI approach against the SCTS model and the exact numerical solution of the TDSE. We implement the SCTS by solving Newton's equation of motion using a fourth-order Runge--Kutta method with adaptive step size \cite{Nuref}. In order to fully resolve the rich interference structure, we need to use a momentum-space bin size of $\Delta k_x=0.0019$~a.u. For this value of $\Delta k_x$ the convergence of the interference oscillations is achieved for an ensemble consisting of $1.2\times10^{7}$ trajectories. First, we consider photoelectron momentum distributions. In Fig.~2~(a) we compare the SCTS model with the solution of the TDSE. The TDSE photoelectron momentum distribution has a rather complicated structure. This is due to the fact that the laser pulse used in the calculations is neither long nor very short. The side maxima at $k_x=-1.35$~a.u. and $k_x=1.33$~a.u. are created due to the interference of contributions from times near the central maximum and minimum of the vector potential, respectively, see Fig.~1~(c). The central minimum of the vector potential is also responsible for the formation of the maximum at $k_x=1.0$~a.u. On the other hand, the ATI peaks in the electron momentum distributions are most pronounced in the range of $k_x$ from $-1.0$~a.u. to $-0.25$~a.u. The SCTS model predicts a caustic of the momentum distribution around $k_x=0.38$~a.u. For this reason, we normalize the distributions of Fig.~2~(a) to the total ionization yield. Fig.~2~(a) shows that there is only qualitative agreement between the SCTS approach and the TDSE result. Indeed, the SCTS model underestimates the width of the momentum distribution.
\begin{figure}[h]
\begin{center}
\includegraphics[width=.6\textwidth]{Fig2.eps}
\end{center}
\caption{Comparison of the semiclassical models with the TDSE. The parameters are the same as in Fig.~1~(b).
(a) The photoelectron momentum distributions for ionization of a one-dimensional model atom obtained from the SCTS model [magenta (thin) curve] and the solution of the TDSE [light blue (thick) curve]. The distributions are normalized to the total ionization yield. (b) The electron momentum distributions calculated using the present SCTSQI model [dark green (dashed) curve] and the TDSE [light blue (thick) curve]. The distributions are normalized to the peak values. (c) Electron energy spectra obtained from the TDSE [light blue (thick) curve], SCTSQI [dark green (dashed) curve], and the SCTS [magenta (thin) curve]. The spectra are normalized to the peak values.}
\label{fig2}
\end{figure}
In Fig.~2~(b) we compare the SCTSQI model with the TDSE. In our SCTSQI simulations we have used $N=50$, $x_{\text{max}}=500$~a.u., and $x_b=70$~a.u. In order to achieve convergence of the momentum distribution, the bin size was chosen to be $1.5\times10^{-4}$~a.u., and $n_{p}=10^{6}$ trajectories were launched at every time step of the TDSE propagation. We note that in the mask method it is difficult to achieve full convergence of the TDSE momentum distribution for small momenta. The distribution in the vicinity of $k_x=0$ is formed by the slow parts of the electron wave packet. A long propagation time is needed in order to let these parts reach the absorbing mask and, therefore, to obtain a converged distribution for small $k_x$. Thus we do not consider the region of small $k_x$ when comparing the SCTSQI with the TDSE. It is clearly seen from Fig.~2~(b) that for $\left|k_x\right|\gtrsim 0.15$~a.u. the SCTSQI model provides \textit{quantitative} agreement with the fully quantum-mechanical result. This applies to both the width of the momentum distribution and the positions of the interference maxima (minima).
The small remaining discrepancy in the heights of some of the interference maxima is caused by the fact that, similar to the SCTS, the SCTSQI model does not account for the preexponential factor of the semiclassical matrix element \cite{Miller1974}. In Fig.~2~(c) we present the photoelectron energy spectra obtained from the SCTS, the solution of the TDSE, and the present SCTSQI model. It is seen that the SCTSQI and the TDSE spectra are almost identical, while the spectrum predicted by the SCTS model falls off too rapidly with the increase of the electron energy. This is a direct consequence of the fact that the SCTS model underestimates the width of the electron momentum distribution, see Fig.~2~(a). In order to further test the SCTSQI model, we calculate the electron momentum distributions for different positions of the mask $x_b$ and fixed $x_{\text{max}}$ of the computational box, see Fig.~3~(a). The distributions corresponding to different values of $x_b$ are in good quantitative agreement with each other. The same is also true for momentum distributions obtained for fixed $x_b$ and different values of $x_{\text{max}}$, see Fig.~3~(b). Here, we have used the two values $x_{\text{max}}=500$~a.u. and $x_{\text{max}}=200$~a.u. It should be stressed that it is impossible to obtain accurate electron momentum distributions for the small value $x_{\text{max}}=200$~a.u. using the mask method. We also note that for the 1D soft-core Coulomb potential used in this work, the smallest allowed $x_b$ should exceed 30--40~a.u., to be outside of the region where the bound-state wave function is localized. Indeed, due to the large number of time steps, even the absorption of a small fraction of the bound-state wave function at each step will result in a severe distortion of the final momentum distribution.
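A rough estimate makes this sensitivity concrete; the numbers below are purely illustrative and are not taken from the simulations. If a fraction $\epsilon$ of the bound-state norm were absorbed at every time step, the norm remaining after $N_T$ steps would be
\[
\left(1-\epsilon\right)^{N_T}\approx e^{-\epsilon N_T},
\]
so that, for example, $\epsilon=10^{-6}$ and $N_T=10^{5}$ would already remove roughly $10\%$ of the bound-state population and convert it into spurious classical trajectories.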
\begin{figure}[h]
\begin{center}
\includegraphics[width=.8\textwidth]{Fig3.eps}
\end{center}
\caption{The outcomes of the SCTSQI model for different internal boundaries of the absorbing mask and lengths of the computational box. The distributions are normalized to the peak values. (a) The one-dimensional momentum distributions calculated within the SCTSQI model for the absorbing mask beginning at $x_b=50$~a.u. [light blue (thick) curve] and $x_b=100$~a.u. [dark green (dashed) curve]. The parameters are the same as in Fig.~1~(b), and the size of the computational box is $x_{\text{max}}=500$~a.u. (b) The one-dimensional momentum distributions obtained from SCTSQI for $x_{\text{max}}=500$~a.u. [light blue (thick) curve] and $x_{\text{max}}=200$~a.u. [dark green (dashed) curve]. The parameters are the same as in Fig.~1~(b). The absorbing mask begins at $x_b=50$~a.u.}
\label{fig2}
\end{figure}
Finally, we check how important the phase of the factor $G\left(x,p_x,t\right)$ in Eq.~(\ref{sctsqi}) is. To this end, in Fig.~4 we compare the SCTSQI result with the photoelectron momentum distribution calculated using the formula
\begin{equation}
R\left(k_x\right)=\left|\sum_{m=1}^{N_{T}}\sum_{j=1}^{n_{k_{x}}}\left|G\left(t_{0}^{m},x_{0}^{j},p_{x,0}^{j}\right)\right|\exp\left[i\Phi_{0}\left(t_{0}^{m},x_{0}^{j},p_{x,0}^{j}\right)\right]\right|^{2},
\label{sctsqi_abs}
\end{equation}
instead of Eq.~(\ref{sctsqi}). We find that the effect of neglecting the phase of the Gabor transform is severe: The SCTSQI distribution cannot even be qualitatively reproduced when using Eq.~(\ref{sctsqi_abs}). This result could be expected. Indeed, the factor $G\left(x,p_x,t\right)$ contains all the information about the quantum dynamics of the absorbed part of the wave packet \textit{prior} to its conversion to the ensemble of classical trajectories. In a sense the $I_pt_0$ term in the SCTS phase [see Eq.~(\ref{Phi_sim_1d})] plays the role of the phase of the Gabor transform $G\left(t,x,p_x\right)$ in Eq.~(\ref{sctsqi}).
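To make the relation between Eqs.~(\ref{sctsqi}) and (\ref{sctsqi_abs}) explicit, one may write the Gabor transform in polar form, $G=\left|G\right|\exp\left(i\arg G\right)$. This is only a restatement of the two formulas, with the arguments $\left(t_{0}^{m},x_{0}^{j},p_{x,0}^{j}\right)$ of $G$ and $\Phi_0$ suppressed for brevity. Equation~(\ref{sctsqi}) then reads
\[
R\left(k_x\right)=\left|\sum_{m=1}^{N_{T}}\sum_{j=1}^{n_{k_{x}}}\left|G\right|\exp\left[i\left(\Phi_{0}+\arg G\right)\right]\right|^{2},
\]
so that Eq.~(\ref{sctsqi_abs}) is recovered by setting $\arg G=0$ for every trajectory. The total phase attached to a trajectory is thus $\Phi_{0}+\arg G$, and dropping $\arg G$ changes the relative phases of the interfering contributions, not merely their weights.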
\begin{figure}[h] \begin{center} \includegraphics[width=.55\textwidth]{Fig4.eps} \end{center} \caption{The photoelectron momentum distributions obtained from the SCTSQI model [light blue (thick) curve] and using Eq.~(\ref{sctsqi_abs}), i.e., neglecting the phase of the Gabor transform [dark green (dashed) curve]. The parameters are the same as in Figs.~1~(b), 2, and 3. The size of the computational box is $x_{\text{max}}=500$~a.u., and the absorbing mask begins at $x_b=50$~a.u. The distributions are normalized to the peak values.} \label{fig2} \end{figure} \section{Conclusions and outlook} In conclusion, we have developed a trajectory-based approach to strong-field ionization: the semiclassical two-step model with quantum input. In the SCTSQI every trajectory is associated with the SCTS phase and, therefore, the SCTSQI model allows us to describe quantum interference and account for the ionic potential beyond the semiclassical perturbation theory. Furthermore, the SCTSQI corrects the inaccuracies of the SCTS model in treating the tunneling step. This has been achieved by the numerical solution of the TDSE with absorbing boundary conditions in a restricted area of space, applying the Gabor transform to the part of the wave function that is absorbed at each time step, and transforming this absorbed part into classical trajectories. The Gabor transform determines quantum amplitudes assigned to trajectories of the ensemble. Therefore, in the SCTSQI model the initial conditions of classical trajectories are governed by the exact quantum dynamics rather than by the quasistatic or SFA-based expressions as in other semiclassical approaches. We have tested our SCTSQI model by comparing its predictions with the numerical solution of the 1D TDSE. We have shown that the SCTSQI model yields quantitative agreement with the fully quantum results. 
This is true not only for the widths of the electron momentum distributions, but also for the positions of the interference maxima and minima. The model can be straightforwardly extended to the three-dimensional case.
% Its numerical implementation can be made very efficient by using the sliding fast Fourier transform.
Most importantly, the SCTSQI circumvents the non-trivial problem of choosing the initial conditions for classical trajectories. This makes the SCTSQI model extremely useful for the study of strong-field ionization of molecules.

\section{Acknowledgment}
We are grateful to Professor Lars Bojer Madsen (Aarhus University), as well as to Nicolas Eicke and Simon Brennecke (Leibniz Universit\"{a}t Hannover) for stimulating discussions. This work was supported by the Deutsche Forschungsgemeinschaft (Grant No.~SH~1145/1-1).

\begin{thebibliography}{99}
% References for introduction
\bibitem{DeloneKrainovBook2000} N.~B.~Delone and V.~P.~Krainov, \textit{Multiphoton Processes in Atoms} (Springer, Berlin, 2000).
\bibitem{BeckerRev2002} W.~Becker, F.~Grasbon, R.~Kopold, D.~B.~Milo\v{s}evi\'c, G.~G.~Paulus, and H.~Walther, Above-threshold ionization: From classical features to quantum effects, Adv. At. Mol. Opt. Phys. \textbf{48}, 35 (2002).
\bibitem{MilosevicRev2003} D.~B.~Milo\v{s}evi\'c and F.~Ehlotzky, Scattering and reaction processes in powerful laser fields, Adv. At. Mol. Opt. Phys. \textbf{49}, 373 (2003).
\bibitem{FaisalRev2005} A.~Becker and F.~H.~M.~Faisal, Intense field many-body S-matrix theory, J. Phys. B: At. Mol. Opt. Phys. \textbf{38}, R1 (2005).
\bibitem{FariaRev2011} C.~Faria and X.~Liu, Electron-electron correlation in strong laser fields, J. Mod. Opt. \textbf{58}, 1076 (2011).
\bibitem{Keldysh1964} L.~V.~Keldysh, Ionization in the field of a strong electromagnetic wave, Zh. Eksp. Teor. Fiz. \textbf{47}, 1945 [Sov. Phys. JETP \textbf{20}, 1307] (1964).
\bibitem{Faisal1973} F.~H.~M.~Faisal, Multiple absorption of laser photons by atoms, J. Phys. B: At.
Mol. Opt. Phys. \textbf{6}, L89 (1973).
\bibitem{Reiss1980} H.~R.~Reiss, Effect of an intense electromagnetic field on a weakly bound system, Phys. Rev. A \textbf{22}, 1786 (1980).
\bibitem{Muller1999} H.~G.~Muller, An efficient propagation scheme for the time-dependent Schr\"{o}dinger equation in the velocity gauge, Laser Phys. \textbf{9}, 138 (1999).
\bibitem{Bauer2006} D.~Bauer and P.~Koval, Qprop: A Schr\"{o}dinger-solver for intense laser--atom interaction, Comput. Phys. Commun. \textbf{174}, 396 (2006).
\bibitem{Madsen2007} L.~B.~Madsen, L.~A.~A.~Nikolopoulos, T.~K.~Kjeldsen, and J.~Fern\'andez, Extracting continuum information from $\Psi(t)$ in time-dependent wave-packet calculations, Phys. Rev. A \textbf{76}, 063407 (2007).
\bibitem{Grum2010} A.~N.~Grum-Grzhimailo, B.~Abeln, K.~Bartschat, D.~Weflen, and T.~Urness, Ionization of atomic hydrogen in strong infrared laser fields, Phys. Rev. A \textbf{81}, 043408 (2010).
\bibitem{Patchkovskii2016} S.~Patchkovskii and H.~G.~Muller, Simple, accurate, and efficient implementation of 1-electron atomic time-dependent Schr\"{o}dinger equation in spherical coordinates, Comput. Phys. Commun. \textbf{199}, 153 (2016).
\bibitem{Tong2017} X.~M.~Tong, A three-dimensional time-dependent Schr\"{o}dinger equation solver: an application to hydrogen atoms in an elliptical laser field, J. Phys. B: At. Mol. Opt. Phys. \textbf{50}, 144004 (2017).
\bibitem{Linden1988} H.~B.~van Linden van~den~Heuvell and H.~G.~Muller, in \textit{Multiphoton Processes}, edited by S.~J.~Smith and P.~L.~Knight (Cambridge University Press, Cambridge, 1988).
\bibitem{Gallagher1988} T.~F.~Gallagher, Above-threshold ionization in low-frequency limit, Phys. Rev. Lett. \textbf{61}, 2304 (1988).
\bibitem{Corkum1989} P.~B.~Corkum, N.~H.~Burnett, and F.~Brunel, Above-threshold ionization in the long-wavelength limit, Phys. Rev. Lett. \textbf{62}, 1259 (1989).
\bibitem{Kulander_Schafer1993} K.~C.~Kulander, K.~J.~Schafer, and J.~L.
Krause, in \textit{Super-Intense Laser-Atom Physics}, edited by B.~Piraux, A.~L'Huillier, and K.~Rzazewski (Plenum, New York, 1993).
\bibitem{Corkum1993} P.~B.~Corkum, Plasma perspective on strong-field multiphoton ionization, Phys. Rev. Lett. \textbf{71}, 1994 (1993).
\bibitem{Dau} L.~D.~Landau and E.~M.~Lifshitz, \textit{Quantum Mechanics: Non-relativistic Theory}, 2nd ed. (Pergamon, Oxford, 1965).
\bibitem{PPT} A.~M.~Perelomov, V.~S.~Popov, and M.~V.~Terent'ev, Ionization of atoms in an alternating electric field, Zh. Eksp. Teor. Fiz. \textbf{50}, 1393 (1966) [Sov. Phys. JETP \textbf{23}, 924 (1966)].
\bibitem{ADK} M.~V.~Ammosov, N.~B.~Delone, and V.~P.~Krainov, Tunnel ionization of complex atoms and of atomic ions in an alternating electromagnetic field, Zh. Eksp. Teor. Fiz. \textbf{91}, 2008 (1986) [Sov. Phys. JETP \textbf{64}, 1191 (1986)].
\bibitem{DeloneKrainov1991} N.~B.~Delone and V.~P.~Krainov, Energy and angular electron spectra for the tunnel ionization of atoms by strong low-frequency radiation, J. Opt. Soc. Am. B \textbf{8}, 1207 (1991).
\bibitem{SandRost2000} G.~van de Sand and J.-M.~Rost, Semiclassical description of multiphoton processes, Phys. Rev. A \textbf{62}, 053403 (2000).
\bibitem{Spanner2003} M.~Spanner, Strong-field tunnel ionization by real-valued classical trajectories, Phys. Rev. Lett. \textbf{90}, 233005 (2003).
\bibitem{Zagoya2012} C.~Zagoya, C.-M.~Goletz, F.~Grossmann, and J.-M.~Rost, Dominant-interaction Hamiltonians for high-order harmonic generation in laser-assisted collisions, Phys. Rev. A \textbf{85}, 041401(R) (2012).
\bibitem{Faria2014} C.~Zagoya, J.~Wu, M.~Ronto, D.~V.~Shalashilin, and C.~Faria, Quantum and semiclassical phase-space dynamics of a wave packet in strong fields using initial-value representations, New J. Phys. \textbf{16}, 103040 (2014).
\bibitem{Li2016} M.~Li, J.-W.~Geng, H.~Liu, Y.~Deng, C.~Wu, L.-Y.~Peng, Q.~Gong, and Y.~Liu, Classical-quantum correspondence for above-threshold ionization, Phys. Rev. Lett.
\textbf{112}, 113002 (2014).
\bibitem{Shvetsov2016} N.~I.~Shvetsov-Shilovski, M.~Lein, L.~B.~Madsen, E.~R\"as\"anen, C.~Lemell, J.~Burgd\"orfer, D.~G.~Arb\'o, and K.~T\H{o}k\'esi, Semiclassical two-step model for strong-field ionization, Phys. Rev. A \textbf{94}, 013415 (2016).
\bibitem{Miller1974} W.~H.~Miller, Classical-limit quantum mechanics and the theory of molecular collisions, Adv. Chem. Phys. \textbf{25}, 69 (1974).
% Using SFA for distribution of the initial conditions:
\bibitem{Yan2010} T.-M.~Yan, S.~V.~Popruzhenko, M.~J.~J.~Vrakking, and D.~Bauer, Low-energy structures in strong-field ionization revealed by quantum orbits, Phys. Rev. Lett. \textbf{105}, 253002 (2010).
\bibitem{Popruzhenko2008} S.~V.~Popruzhenko and D.~Bauer, Strong-field approximation for systems with Coulomb interaction, J. Mod. Opt. \textbf{55}, 2573 (2008).
\bibitem{Boge2013} R.~Boge, C.~Cirelli, A.~S.~Landsman, S.~Heuser, A.~Ludwig, J.~Maurer, M.~Weger, L.~Gallmann, and U.~Keller, Probing nonadiabatic effects in strong-field tunnel ionization, Phys. Rev. Lett. \textbf{111}, 103003 (2013).
\bibitem{Hofmann2014} C.~Hofmann, A.~S.~Landsman, A.~Zielinski, C.~Cirelli, T.~Zimmermann, A.~Scrinzi, and U.~Keller, Interpreting electron-momentum distributions and nonadiabaticity in strong-field ionization, Phys. Rev. A \textbf{90}, 043406 (2014).
\bibitem{Geng2014} J.-W.~Geng, L.~Qin, M.~Li, W.-H.~Xiong, Y.~Liu, Q.~Gong, and L.-Y.~Peng, Nonadiabatic tunneling ionization of atoms in elliptically polarized laser fields, J. Phys. B: At. Mol. Opt. Phys. \textbf{47}, 204027 (2014).
\bibitem{Li2016dop} M.~Li, J.-W.~Geng, M.~Han, M.-M.~Liu, L.-Y.~Peng, Q.~Gong, and Y.~Liu, Subcycle nonadiabatic strong-field tunneling ionization, Phys. Rev. A \textbf{93}, 013402 (2016).
\bibitem{Yudin2001} G.~L.~Yudin and M.~Yu.~Ivanov, Nonadiabatic tunnel ionization: Looking inside a laser cycle, Phys. Rev. A \textbf{64}, 013409 (2001).
\bibitem{Bondar2009} D.~I.~Bondar, Instantaneous multiphoton ionization rate and initial distribution of electron momentum, Phys. Rev. A \textbf{78}, 015405 (2008).
% Classical backpropagation method:
\bibitem{Ni2016} H.~Ni, U.~Saalmann, and J.-M.~Rost, Tunneling ionization time resolved by backpropagation, Phys. Rev. Lett. \textbf{117}, 023002 (2016).
\bibitem{Ni2018} H.~Ni, U.~Saalmann, and J.-M.~Rost, Tunneling exit characteristics from classical backpropagation of an ionized electron wave packet, Phys. Rev. A \textbf{97}, 013426 (2018).
\bibitem{Niwe2018} H.~Ni, N.~Eicke, C.~Ruiz, J.~Cai, F.~Oppermann, N.~I.~Shvetsov-Shilovski, and L.-W.~Pi, Tunneling criteria and a nonadiabatic term for strong-field ionization, Phys. Rev. A \textbf{98}, 013411 (2018).
% Virtual detector theory:
\bibitem{Wang2013} X.~Wang, J.~Tian, and J.~H.~Eberly, Extended virtual detector theory for strong-field atomic ionization, Phys. Rev. Lett. \textbf{110}, 243001 (2013).
\bibitem{Wang2018} X.~Wang, J.~Tian, and J.~H.~Eberly, Virtual detector theory for strong-field atomic ionization, J. Phys. B: At. Mol. Opt. Phys. \textbf{51}, 084002 (2018).
\bibitem{Thumm2003} B.~Feuerstein and U.~Thumm, On the computation of momentum distributions within wavepacket propagation calculations, J. Phys. B: At. Mol. Opt. Phys. \textbf{36}, 707 (2003).
\bibitem{Teeny2016} N.~Teeny, C.~H.~Keitel, and H.~Bauke, Virtual-detector approach to tunnel ionization and tunneling times, Phys. Rev. A \textbf{94}, 022104 (2016).
\bibitem{Tian2017} J.~Tian, X.~Wang, and J.~H.~Eberly, Numerical detector theory for the longitudinal momentum distribution of the electron in strong field ionization, Phys. Rev. Lett. \textbf{118}, 213201 (2017).
% \bibitem{Magrakvelidze2012} M.~Magrakvelidze, C.~M.~Aikens, and U.~Thumm, Dissociation dynamics of diatomic molecules in intense laser fields: A scheme for the selection of relevant adiabatic potential curves, Phys. Rev. A \textbf{86}, 023402 (2012).
% \bibitem{Walser2003} M.~Walser and T.~Brabec, Semiclassical path integral theory of strong-laser-field-physics, J. Phys. B: At. Mol. Opt. Phys. \textbf{36}, 3025 (2003).
% References for Sec II A
\bibitem{Javanainen1988} J.~Javanainen, J.~H.~Eberly, and Q.~Su, Numerical simulations of multiphoton ionization and above-threshold electron spectra, Phys. Rev. A \textbf{38}, 3430 (1988).
\bibitem{Lapack} E.~Anderson, Z.~Bai, C.~Bischof, S.~Blackford, J.~Demmel, J.~Dongarra, J.~Du~Croz, A.~Greenbaum, S.~Hammarling, A.~McKenney, and D.~Sorensen, LAPACK Users' Guide, 3rd ed. (Society for Industrial and Applied Mathematics, Philadelphia, 1999).
\bibitem{Feit1982} M.~D.~Feit, J.~A.~Fleck, and A.~Steiger, Solution of the Schr\"{o}dinger equation by a spectral method, J. Comput. Phys. \textbf{47}, 412 (1982).
\bibitem{Tong2006} X.~M.~Tong, K.~Hino, and N.~Toshima, Phase-dependent atomic ionization in few-cycle intense laser pulse, Phys. Rev. A \textbf{74}, 031405(R) (2006).
% References for Sec II B
% \bibitem{Miller1974} W.~H.~Miller, Classical-limit quantum mechanics and the theory of molecular collisions, Adv. in Chem. Phys. \textbf{25}, 69 (1974).
\bibitem{Walser2003} M.~Walser and T.~Brabec, Semiclassical path integral theory of strong-laser-field physics, J. Phys.~B \textbf{36}, 3025 (2003).
% References for Sec II C
\bibitem{Ballentine} L.~E.~Ballentine, \textit{Quantum Mechanics: A Modern Development} (World Scientific Publishing Co. Pte. Ltd., Singapore, 1998).
\bibitem{Gabor} D.~Gabor, Theory of communication, J. Inst. Electr. Eng. \textbf{93}, 429 (1946).
\bibitem{Chirila2010} C.~C.~Chiril\u{a}, I.~Dreissigacker, E.~V.~van~der~Zwan, and M.~Lein, Emission times in high-order harmonic generation, Phys. Rev. A \textbf{81}, 033412 (2010).
\bibitem{Bandrauk2012} K.-J.~Yuan and A.~D.~Bandrauk, Circularly polarized attosecond pulses from molecular high-order harmonic generation by ultrashort intense bichromatic circularly and linearly polarized laser pulses, J. Phys.
B: At. Mol. Opt. Phys. \textbf{45}, 074001 (2012).
\bibitem{Wu2013} J.~Wu, B.~B.~Augstein, and C.~Faria, Local dynamics in high-order-harmonic generation using Bohmian trajectories, Phys. Rev. A \textbf{88}, 023415 (2013).
\bibitem{Shu2016} X.~F.~Shu, S.~B.~Liu, and H.~Y.~Song, Phase-space analysis for ionization processes in the laser-atom interaction using Gabor transformation, Mod. Phys. Lett. B \textbf{30}, 1650147 (2016).
\bibitem{Husimi} K.~Husimi, Some formal properties of the density matrix, Proc. Phys. Math. Soc. Jpn. \textbf{22}, 264 (1940).
% References for Results and Discussion section
\bibitem{Nuref} W.~H.~Press, S.~A.~Teukolsky, W.~T.~Vetterling, and B.~P.~Flannery, \textit{Numerical Recipes in Fortran 77: The Art of Scientific Computing}, 2nd ed. (Cambridge University Press, Cambridge, U.K., 1992).
% \bibitem{BeckerRev2002} W.~Becker, F.~Grasbon, R.~Kopold, D.~B.~Milo\v{s}evi\'c, G.G.~Paulus, and H.~Walther, Adv. At. Mol. Opt. Phys., Above-threshold ionization: From classical features to quantum effects, \textbf{48}, 35 (2002).
\end{thebibliography}
\end{document}}
\caption{\textbf{Boxplots of the average Dice scores between the results of our previous work \cite{Bai2018} and the results of the proposed method on the three datasets.} For simplicity, we calculate the average Dice score over the three structures (LV, MYO, RV) for each image in the three datasets. The boxplots in orange are the results of the proposed method whereas the boxplots in blue are the results of the previous work. The \textcolor{green}{green} dashed line in each boxplot shows the mean value of the Dice scores for the segmentation results on one dataset.}
\caption{\textbf{Cross-dataset segmentation performance of four different network architectures.} All the networks have been trained using the same UKBB training set with the proposed data normalization and augmentation strategy for \textbf{1,000} epochs. Results listed in the table are the means and standard deviations of the Dice scores evaluated on the three sets. Numbers in \textcolor{red}{red} denote mean Dice scores below 0.70, whereas numbers in the \textbf{bold} font style denote the highest mean Dice scores among the results of the four networks.}
\caption{\textbf{Cross-dataset segmentation performance of U-Nets with different training configurations.} All experiments were performed with the standard U-Net architecture: UNet-64. Each U-Net was trained using the same UKBB training set for \textbf{200} epochs to save computation. Statistics listed in the table are the means and standard deviations of the Dice scores evaluated on the three sets. Numbers in \textcolor{red}{red} denote mean Dice scores below 0.70.}
\caption{\textbf{Segmentation performance of the UKBB model across the five groups of pathological cases and normal cases (NOR)}. This table presents the mean and standard deviation of the Dice score. \textcolor{red}{Red} numbers are those mean Dice scores below 0.80.}
\caption{\textbf{Agreement of clinical measurements from automatic and manual segmentations.} Bland-Altman plots (automatic - manual) are shown for the three sets. In each Bland-Altman plot, the x-axis denotes the average of two measurements whereas the y-axis denotes the difference between them. The solid line in \textcolor{red}{red} denotes the mean difference (bias) and the two dashed lines in \textcolor{green}{green} denote $ \pm 1.96$ standard deviations from the mean. The title of each plot shows the mean difference (MD) and its standard deviation (SD) for each pair of measurements. FCN: the automatic method in our previous work~\cite{Bai2018}, LV/RV: left/right ventricle, EDV/ESV: end-diastolic/systolic volume, LVM: left ventricular mass.}
\caption{{\bf 2D Human landmark detection.} Comparison on the full Human3.6M test set with supervised baselines --- (1) Stacked Hourglass~\cite{newell2016stacked}, and (2) our model trained with supervision. We report the MSE in pixels for each activity. \colorbox{tabgray}{Shaded rows} show error for the original predictions of the model (\emph{no flip} in~\cref{s:x-human}), while unshaded rows represent the minimum of the errors obtained with and without flipping the predictions against the axis of bilateral symmetry (\emph{with flips} in~\cref{s:x-human}). We highlight the minimum error across all models in bold.}
\caption{Bandwidth for a typical (a) write path and (b) miss+install path. TicToc+PDM adds ``Predicted-Dirty'' state, where TOC dirty-bit is installed as dirty but TIC dirty-bit is installed as clean. Installing lines in Pred-Dirty can (a) save TOC dirty-bit update, but (b) increase miss cost. \colorbox{light-gray}{Using Pred-Dirty only for write-likely lines can save bandwidth}
%Should use Pred-Dirty only for write-likely lines.
%Need accurate write classifier.
%Installing lines at Pred-Dirty state can save the later TOC dirty-bit update. However, overclassifying otherwise clean lines as Pred-Dirty leads to increased miss/writeback probe bandwidth to read dirty lines in preparation for eviction. Grey box denotes. Note that this state does not cause extra memory writebacks as TIC dirty-bit holds true dirty state.
}
\caption{Average cumulative entropy of the model's policy distribution over time during simulated interactive learning. Plots are shown for the French-English ({\color{red}fr-en}) and the German-English ({\color{blue}de-en}) task, and for the \textsc{keep+delete} and the \textsc{+substitute} system, respectively. }
\caption{The two figures show the effect of different beam sizes on Character-F score (top) and BLEU score (bottom). We conduct experiments on French-English ({\color{red}fr-en}) and German-English ({\color{blue}de-en}) and both systems (\textsc{keep+delete} and \textsc{+substitute}). All scores are averaged over two runs.}
\caption{Interaction protocol illustrating translation progress of the two learning systems on the German-English task (upper half) and the French-English task (lower half). For each language pair, the first example illustrates interactions with the \textsc{keep+delete} system, while the second example shows interactions with the \textsc{+substitute} system. In each round, the user is asked for feedback on uncertain locations of the current partial translation. Tokens printed in {\color{blue}blue} with their position in subscript indicate uncertain locations. At the end of each round, the system is updated given the user's feedback (\textsc{keep}, \textsc{delete}, \textsc{substitute}). In the next round, it generates a constrained (partial) translation with respect to this feedback. Tokens generated based on feedback rules are printed in \textit{italics}. }
\caption{\small The matrix is the sum of a low-rank matrix and an i.i.d.\ matrix. The MSE of AMP with damping (AMP-Damping)~\cite{Rangan2014ISIT} \red{and Lasso} does not decrease with iterations. In contrast, the MSE of Swept AMP (SwAMP)~\cite{Manoel2015SweptAM} and our approach, AMP with matrix decomposition (AMP-MD), decreases with iterations, and AMP-MD achieves lower MSE than SwAMP. \label{fig:low_rank_matrix}}
\caption{Energy ($E$) and FWHM ($\Gamma$) for the absorbance spectra in Fig. \ref{fig:Au_NL2_SwDiam_Waterfall} as a function of the diameter ($d$). The blue lines show the values of spectral position ($E_j$) while the red lines present the FWHM values ($\Gamma_j$). The continuous lines correspond to the results for the dark mode ($D$) and the dashed lines to the bright mode ($B$). The error bar for each value is also provided \condcolor{red}{and some of them may not be visible as they are smaller than the point size}.}
\caption{Fitted parameters for the absorbance spectra provided in Fig. \ref{fig:Au_NL2_SwAllGap_Waterfall} as a function of the gap size ($g$). The blue lines show the values of spectral position ($E_j$) while the red lines present the FWHM values ($\Gamma_j$). The continuous lines correspond to the results for mode $D$ and the dashed lines to mode $B$. The error bar for each value is also provided \condcolor{red}{and some of them may not be visible as they are smaller than the point size}.}
\caption{EchoTherm\textregistered~is an active thermography system used to heat components and subsequently capture cool-down data, which inform the creation of TSR features used as input for AI-backed condition grading and classification. }
\caption{Cartoon depicting the past- and future-facing perspectives of online learning, for an RNN unrolled over time. Each ${\bf a}$ represents the RNN hidden state value, while $F_{\bf w}$ denotes applications of the recurrent update; the instantaneous losses $L$ implicitly depend on the hidden state through $L^{(t)} = L\left(F^{\text{out}}_{{\bf w}_{\text{o}}}({\bf a}^{(t)}), {\bf y}^{*(t)}\right)$. The blue (yellow) arrows show the paths of influence accounted for by the past-facing (future-facing) gradient described in the corresponding equation.} \label{fig:past_future} \end{figure} \subsection{Past-facing online learning algorithms} Here we derive a fundamental relation leveraged by past-facing (PF) online algorithms. Let $t$ index real time, and define the {\bf influence matrix} ${\bf M}^{(t)} \in \mathbb{R}^{n \times P}$, where $n$ and $P$ are respectively the number of hidden units and the number of parameters defining $F_{{\bf w}}$. ${\bf M}^{(t)}$ tracks the derivatives of the current state ${\bf a}^{(t)}$ with respect to each parameter $w_p$: \begin{equation} M^{(t)}_{kp} = \frac{\partial a^{(t)}_k}{\partial w_{p}}. \label{def_influence} \end{equation} Let's rewrite Eq.~\eqref{def_influence} with matrix notation and unpack it by one time step: \begin{align*} {\bf M}^{(t)} &= \frac{\partial {\bf a}^{(t)}}{\partial {\bf w}} = \sum_{s \leq t} \frac{\partial {\bf a}^{(t)}}{\partial {\bf w}^{(s)}} = \sum_{s \leq t-1} \frac{\partial {\bf a}^{(t)}}{\partial {\bf w}^{(s)}} + \frac{\partial {\bf a}^{(t)}}{\partial {\bf w}^{(t)}}\\ &= \sum_{s \leq t-1} \frac{\partial {\bf a}^{(t)}}{\partial {\bf a}^{(t-1)}}\frac{\partial {\bf a}^{(t-1)}}{\partial {\bf w}^{(s)}} + \frac{\partial {\bf a}^{(t)}}{\partial {\bf w}^{(t)}}\\ &= \frac{\partial {\bf a}^{(t)}}{\partial {\bf a}^{(t-1)}}\frac{\partial {\bf a}^{(t-1)}}{\partial {\bf w}} + \frac{\partial {\bf a}^{(t)}}{\partial {\bf w}^{(t)}}\\ &\equiv {\bf J}^{(t)}{\bf M}^{(t-1)} + {\bf \overline{M}}^{(t)}. 
\numberthis \label{pf_relation} \end{align*} A simple recursive formula emerges, wherein the influence matrix is updated by multiplying its current value by the Jacobian ${\bf J}^{(t)} = \partial{\bf a}^{(t)} / \partial{\bf a}^{(t-1)} \in \mathbb{R}^{n \times n}$ of the network and then adding the {\bf immediate influence} ${\bf \overline{M}}^{(t)} = \partial{\bf a}^{(t)} / \partial{\bf w}^{(t)} \in \mathbb{R}^{n \times P}$. To compute the gradient that ultimately gets passed to the optimizer, we simply use the chain rule over the current hidden state ${\bf a}^{(t)}$: \begin{equation} \frac{\partial L^{(t)}}{\partial {\bf w}} = \frac{\partial L^{(t)}}{\partial {\bf a}^{(t)}}\frac{\partial {\bf a}^{(t)}}{\partial {\bf w}} \equiv {\bf \overline{c}}^{(t)}{\bf M}^{(t)}, \label{pf_grad} \end{equation} where the {\bf immediate credit assignment vector} ${\bf \overline{c}}^{(t)} \in \mathbb{R}^n$ is defined to be $\partial L^{(t)} / \partial{\bf a}^{(t)}$ and is calculated by backpropagating the error $\boldsymbol{\delta}^{(t)}$ through the derivative of the output function $F^{\text{out}}_{{\bf w}_{\text{o}}}$ (or approximated by Feedback Alignment, see \citealp{lillicrap2016random}). In the end, we compute a derivative in Eq.~\eqref{pf_grad} that is implicitly a sum over the many terms of Eq.~\eqref{past_facing_sum}, using formulae that depend explicitly only on times $t$ and $t-1$. For this reason, such a learning algorithm is {\bf online}, and it is {\bf past facing} because the gradient computation is of the form in Eq.~\eqref{past_facing_sum}. \subsection{Future-facing online learning algorithms} Here we show a symmetric relation for future-facing (FF) online algorithms. The {\bf credit assignment vector} ${\bf c}^{(t)} \in \mathbb{R}^n$ is a row vector defined as the gradient of the loss $\mathcal{L}$ with respect to the hidden state ${\bf a}^{(t)}$. 
It plays a role analogous to ${\bf M}^{(t)}$ and has a recursive update similar to Eq.~\eqref{pf_relation}: \begin{align*} {\bf c}^{(t)} &= \frac{\partial \mathcal{L}}{\partial {\bf a}^{(t)}} = \sum_{s \geq t}\frac{\partial L^{(s)}}{\partial {\bf a}^{(t)}} = \frac{\partial L^{(t)}}{\partial {\bf a}^{(t)}} + \sum_{s \geq t+1}\frac{\partial L^{(s)}}{\partial {\bf a}^{(t)}}\\ &= \frac{\partial L^{(t)}}{\partial {\bf a}^{(t)}} + \sum_{s \geq t+1}\frac{\partial L^{(s)}}{\partial {\bf a}^{(t+1)}}\frac{\partial {\bf a}^{(t+1)}}{\partial {\bf a}^{(t)}}\\ &= \frac{\partial L^{(t)}}{\partial {\bf a}^{(t)}} + \frac{\partial \mathcal{L}}{\partial {\bf a}^{(t+1)}}\frac{\partial {\bf a}^{(t+1)}}{\partial {\bf a}^{(t)}}\\ &= {\bf \overline{c}}^{(t)} + {\bf c}^{(t+1)}{\bf J}^{(t+1)}. \numberthis \label{ff_relation} \end{align*} As in the PF case, the gradient is ultimately calculated using the chain rule over ${\bf a}^{(t)}$: \begin{equation} \frac{\partial \mathcal{L}}{\partial {\bf w}^{(t)}} = \frac{\partial \mathcal{L}}{\partial {\bf a}^{(t)}}\frac{\partial {\bf a}^{(t)}}{\partial {\bf w}^{(t)}} \equiv {\bf c}^{(t)}{\bf \overline{M}}^{(t)}. \label{ff_gradient} \end{equation} The recursive relations for PF and FF algorithms are of identical form given the following changes: (1) swap the roles of $\mathcal{L}$ and ${\bf w}$, (2) swap the roles of $t-1$ and $t+1$, and (3) flip the direction of all derivatives. This clarifies the fundamental trade-off between the PF and FF approaches to online learning. On the one hand, memory requirements favor FF because $\mathcal{L}$ is a scalar while ${\bf w}$ is a matrix. On the other, only PF can truly be run online, because the time direction of the update in FF is opposite the forward pass. Thus, efficient PF algorithms must \emph{compress} ${\bf M}^{(t)}$, while efficient FF algorithms must \emph{predict} ${\bf c}^{(t+1)}$. 
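
As a sanity check on the two recursions, consider a minimal sketch under assumptions that are ours rather than part of the derivation above: a scalar linear RNN $a^{(t)} = w a^{(t-1)} + x^{(t)}$ run for two steps from a fixed $a^{(0)}$, with the illustrative loss $L^{(t)} = a^{(t)}$ and boundary conditions $M^{(0)} = 0$ and $c^{(3)} = 0$. Then $J^{(t)} = w$, $\overline{M}^{(t)} = a^{(t-1)}$, and $\overline{c}^{(t)} = 1$, so Eqs.~\eqref{pf_relation} and \eqref{ff_relation} give
\begin{align*}
\text{PF:} \quad & M^{(1)} = a^{(0)}, \quad M^{(2)} = w a^{(0)} + a^{(1)}, \\
& \sum_{t=1}^{2} \overline{c}^{(t)} M^{(t)} = a^{(0)} + w a^{(0)} + a^{(1)}, \\
\text{FF:} \quad & c^{(2)} = 1, \quad c^{(1)} = 1 + w, \\
& \sum_{t=1}^{2} c^{(t)} \overline{M}^{(t)} = (1 + w)\, a^{(0)} + a^{(1)}.
\end{align*}
The two totals agree, and both match the direct derivative of $\mathcal{L} = a^{(1)} + a^{(2)}$ with respect to $w$, namely $a^{(0)} + 2 w a^{(0)} + x^{(1)}$, once $a^{(1)} = w a^{(0)} + x^{(1)}$ is substituted. In this toy case PF already holds $M^{(1)}$ when $L^{(1)}$ arrives, whereas FF cannot finish $c^{(1)}$ until $c^{(2)}$ is known.
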
\section{Past-facing algorithms}
\subsection{Real-Time Recurrent Learning}
The Real-Time Recurrent Learning (RTRL, \citealp{williams1989learning}) algorithm directly applies Eqs.~\eqref{pf_relation} and \eqref{pf_grad} as written. We call the application of Eq.~\eqref{pf_relation} the ``update'' to the learning algorithm, which is {\bf deterministic} and in {\bf closed form}. Implementing Eq.~\eqref{pf_relation} requires storing $nP = \mathcal{O}(n^3)$ floats in ${\bf M}^{(t)}$ and performing $\mathcal{O}(n^4)$ multiplications in ${\bf J}^{(t)}{\bf M}^{(t-1)}$, which is neither especially efficient nor biologically plausible. However, several efficient (and in some cases, biologically plausible) online learning algorithms have recently been developed, including Unbiased Online Recurrent Optimization (UORO; \citealp{tallec2017unbiased}; \S\ref{uoro}), Kronecker-Factored RTRL (KF-RTRL; \citealp{mujika2018approximating}; \S\ref{kf-rtrl}), Kernel RNN Learning (KeRNL; \citealp{roth2018kernel}; \S\ref{kernl}), and Random-Feedback Online Learning (RFLO; \citealp{murray2019local}; \S\ref{rflo}). We claim that these learning algorithms, whether explicitly derived as such or not, are all implicitly approximations to RTRL, each a special case of a general class of techniques for compressing ${\bf M}^{(t)}$. In the following section, we clarify how each of these learning algorithms fits into this broad structure.
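
To make the RTRL costs concrete, here is a back-of-the-envelope count under sizes that are purely illustrative and do not come from any experiment: $n = 128$ hidden units and a weight matrix with $m = 160$ columns (for instance 128 recurrent inputs, 31 external inputs, and one bias), so that $P = nm$. Then
\begin{align*}
P &= 128 \times 160 = 20{,}480, \\
nP &= 128 \times 20{,}480 = 2{,}621{,}440 \quad \text{entries stored in the influence matrix}, \\
n^2 P &= 128 \times 2{,}621{,}440 = 335{,}544{,}320 \quad \text{multiplications per time step}.
\end{align*}
Under these assumed sizes, the influence matrix is 128 times larger than the parameter vector itself, and each update of it costs roughly $3 \times 10^8$ multiplications, which is the overhead that a compressed representation of the influence matrix aims to avoid.
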
\subsubsection{Approximations to RTRL}\label{rtrl_approx}
To concretely illuminate these ideas, we will work with a special case of $F_{{\bf w}}$, a time-continuous vanilla RNN:
\begin{equation}
{\bf a}^{(t)} = F_{{\bf w}}({\bf a}^{(t-1)}, {\bf x}^{(t)}) = (1 - \alpha) {\bf a}^{(t-1)} + \alpha\phi({\bf W}{\bf \hat{a}}^{(t-1)}),
\label{vanilla_rnn}
\end{equation}
where ${\bf \hat{a}}^{(t-1)} = \text{concat}({\bf a}^{(t-1)}, {\bf x}^{(t)}, 1) \in \mathbb{R}^m$, ${\bf W} \in \mathbb{R}^{n \times m}$, $\phi: \mathbb{R}^n \rightarrow \mathbb{R}^n$ is some point-wise nonlinearity (e.g. $\tanh$), and $\alpha \in (0, 1]$ is the network's inverse time constant. The trainable parameters $w_p$ are folded via the indexing $p = i\times m + j$ into the weight matrix $W_{ij}$, whose columns hold the recurrent weights, the input weights, and a bias. By reshaping $w_{p}$ into its natural matrix form $W_{ij}$, we can write the influence matrix as an order-3 {\bf influence tensor}
\begin{equation*}
M^{(t)}_{kij} = \partial a^{(t)}_k / \partial W_{ij}.
\end{equation*}
Thus $M^{(t)}_{kij}$ specifies the effect on the $k$-th unit of perturbing the direct connection from the $j$-th unit to the $i$-th unit. The immediate influence can also be written as a tensor. By differentiating Eq.~\eqref{vanilla_rnn}, we see it takes the sparse form
\begin{equation*}
\overline{M}^{(t)}_{kij} = \partial a^{(t)}_k / \partial W^{(t)}_{ij} = \alpha \delta_{ki}\phi'(h^{(t)}_i) \hat{a}^{(t-1)}_j,
\end{equation*}
where $h^{(t)}_i = \sum_j W_{ij}\hat{a}^{(t-1)}_j$ is the pre-activation of the $i$-th unit; the Kronecker delta appears because $W_{ij}$ can affect the $k$-th unit directly only if $k = i$. Many approximations of RTRL involve a decomposition of $M_{kij}^{(t)}$ into a product of lower-order tensors. For example, UORO represents $M_{kij}^{(t)}$ by an outer product $A_k^{(t)} B_{ij}^{(t)}$, which has a memory requirement of only $\mathcal{O}(n^2)$. Similarly, KF-RTRL uses a Kronecker-product decomposition $A_j^{(t)} B_{ki}^{(t)}$.
We can generalize these cases into a set of six possible decompositions of $M^{(t)}_{kij}$ into products of lower-order tensors $A^{(t)}$ and $B^{(t)}$: \begin{equation*} M_{kij}^{(t)} \approx \begin{cases} A_{k}^{(t)} B_{ij}^{(t)} & \text{UORO}, \S\ref{uoro}\\ A_{j}^{(t)} B_{ki}^{(t)} & \text{KF-RTRL}, \S\ref{kf-rtrl}\\ A_{i}^{(t)} B_{kj}^{(t)} & \text{``Reverse" KF-RTRL}, \S\ref{ikf-rtrl}\\ A_{ki}^{(t)} B_{ij}^{(t)} & \text{KeRNL/RFLO}, \S\ref{kernl}/ \S\ref{rflo}\\ A_{kj}^{(t)} B_{ij}^{(t)} & \text{Unexplored}\\ A_{ki}^{(t)} B_{kj}^{(t)} & \text{Unexplored} \end{cases}. \label{rtrl_decompositions} \end{equation*} Each such decomposition has a memory requirement of $\mathcal{O}(n^2)$. Of course, it is not sufficient to write down an idealized decomposition for a particular time point; there must exist some efficient way to \emph{update} the decomposition as the network runs forwards. We now go through each algorithm and show the mathematical techniques used to derive update equations and categorize them by the criteria outlined in Table~\ref{algtable}. \subsection{Unbiased Online Recurrent Optimization (UORO)}\label{uoro} \cite{tallec2017unbiased} discovered a technique for approximating ${\bf M}^{(t)} \in \mathbb{R}^{n \times P}$ as an outer product ${\bf A}^{(t)}{\bf B}^{(t)}$, where ${\bf A}^{(t)} \in \mathbb{R}^{n \times 1}$ and ${\bf B}^{(t)} \in \mathbb{R}^{1 \times P}$. The authors proved a crucial lemma (see \hyperref[sec.lemma]{Appendix A} or \citealp{tallec2017unbiased}) that gives, in closed form, an unbiased rank-1 estimate of a given matrix over the choice of a random vector $\boldsymbol{\nu} \in \mathbb{R}^{n}$ with $\mathbb{E}[\nu_i \nu_j] \propto \delta_{ij}$ and $\mathbb{E}[\nu_i] = 0$. They leverage this result to derive a closed-form update rule for ${\bf A}^{(t)}$ and ${\bf B}^{(t)}$ at each time step, without ever having to explicitly (and expensively) calculate ${\bf M}^{(t)}$. 
We present an equivalent formulation in terms of tensor components, i.e.,
\begin{equation*}
M^{(t)}_{kij} \approx A^{(t)}_k B^{(t)}_{ij},
\end{equation*}
where $B^{(t)}_{ij}$ represents the ``rolled-up'' components of ${\bf B}^{(t)}$, as in $W_{ij}$ w.r.t. ${\bf w}$. Intuitively, the $kij$-th component of the influence matrix is constrained to be the product of the $k$-th unit's ``sensitivity'' $A^{(t)}_k$ and the $ij$-th parameter's ``efficacy'' $B^{(t)}_{ij}$. Eqs.~\eqref{uoro_update} and \eqref{uoro_unbiased} show the form of the update and why it is unbiased over $\boldsymbol{\nu}$, respectively:
\begin{align*}
A^{(t)}_k B^{(t)}_{ij} &= \left(\rho_0 \sum_{k'} J^{(t)}_{kk'} A^{(t-1)}_{k'} + \rho_1 \nu_k\right)\left(\rho_0^{-1} B^{(t-1)}_{ij} + \rho_1^{-1}\sum_{k'}\nu_{k'} \overline{M}^{(t)}_{k'ij}\right) \\
&= \sum_{k'} J^{(t)}_{kk'} A^{(t-1)}_{k'}B^{(t-1)}_{ij} + \sum_{k'} \nu_k \nu_{k'} \overline{M}^{(t)}_{k'ij}\\
&+ \sum_{k'} \nu_{k'} \left[\rho_1 \rho_0^{-1}\delta_{kk'} B^{(t-1)}_{ij} + \rho_0 \rho_1^{-1}\overline{M}^{(t)}_{k'ij}\sum_{k''}J^{(t)}_{kk''}A^{(t-1)}_{k''}\right]\numberthis \label{uoro_update}\\
\implies \mathbb{E}\left[A^{(t)}_k B^{(t)}_{ij}\right] &= \sum_{k'} J^{(t)}_{kk'} \mathbb{E}\left[A^{(t-1)}_{k'}B^{(t-1)}_{ij}\right] + \sum_{k'}\mathbb{E}[\nu_k \nu_{k'}]\overline{M}^{(t)}_{k'ij}\\
&+ \sum_{k'} \mathbb{E}[\nu_{k'}]\left( \text{cross terms} \right)\\
&= \sum_{k'} J^{(t)}_{kk'} M^{(t-1)}_{k'ij} + \sum_{k'} \delta_{kk'}\overline{M}^{(t)}_{k'ij} + \sum_{k'} 0 \times (\text{cross terms})\\
&= \sum_{k'} J^{(t)}_{kk'} M^{(t-1)}_{k'ij} + \overline{M}^{(t)}_{kij}\\
&= M^{(t)}_{kij}. \numberthis \label{uoro_unbiased}
\end{align*}
The cross terms vanish in expectation because $\mathbb{E}[\nu_k] = 0$. Thus, by induction over $t$, the estimate of $M^{(t)}_{kij}$ remains unbiased at every time step.
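
A minimal worked instance may help here. It assumes $n = 2$ and Rademacher noise, $\nu_k \in \{-1, +1\}$ drawn independently with equal probability, which is one choice (our assumption, not a requirement stated above) that satisfies $\mathbb{E}[\nu_k] = 0$ and $\mathbb{E}[\nu_k \nu_{k'}] = \delta_{kk'}$. Fix $i, j$ and abbreviate $u_k = \overline{M}^{(t)}_{kij}$. The $k = 1$ component of the noisy term $\sum_{k'} \nu_k \nu_{k'} u_{k'}$ then takes four equally likely values:
\begin{equation*}
\nu_1 (\nu_1 u_1 + \nu_2 u_2) =
\begin{cases}
u_1 + u_2 & \boldsymbol{\nu} = (+1, +1)\\
u_1 - u_2 & \boldsymbol{\nu} = (+1, -1)\\
u_1 - u_2 & \boldsymbol{\nu} = (-1, +1)\\
u_1 + u_2 & \boldsymbol{\nu} = (-1, -1)
\end{cases}.
\end{equation*}
Averaging the four cases gives exactly $u_1 = \overline{M}^{(t)}_{1ij}$, so this term is unbiased, while its variance is $u_2^2$: in this toy case, all of the noise in one unit's estimate comes from the other unit's immediate influence.
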
The constants $\rho_0, \rho_1 \in \mathbb{R}_{>0}$ are chosen at each time step to minimize total variance of the estimate by balancing the norms of the cross terms. This algorithm's update is {\bf stochastic} due to its reliance on the random vector $\boldsymbol{\nu}$, but it is in {\bf closed form} because it has an explicit update formula (Eq.~\ref{uoro_update}). Both its memory and computational complexity are $\mathcal{O}(n^2)$.
\subsection{Kronecker-Factored RTRL (KF-RTRL)} \label{kf-rtrl}
\cite{mujika2018approximating} leverage the same lemma as in UORO, but using a decomposition of ${\bf M}^{(t)}$ in terms of a Kronecker product ${\bf A}^{(t)} \otimes{\bf B}^{(t)}$, where now ${\bf A}^{(t)} \in \mathbb{R}^{1 \times m}$ and ${\bf B}^{(t)}~\in~ \mathbb{R}^{n \times n}$. This decomposition is more natural, because the immediate influence ${\bf \overline{M}}^{(t)}$ factors \emph{exactly} as a Kronecker product ${\bf \hat{a}}^{(t-1)} \otimes{\bf D}^{(t)}$ for vanilla RNNs, where $D^{(t)}_{ki} = \alpha \delta_{ki} \phi'(h^{(t)}_i)$. To derive the update rule for UORO, one must first generate a rank-1 estimate of ${\bf \overline{M}}^{(t)}$ as an intermediate step, introducing more variance, but in KF-RTRL, this step is unnecessary. In terms of components, the compression takes the form
\begin{equation*}
M^{(t)}_{kij} \approx A^{(t)}_j B^{(t)}_{ki},
\end{equation*}
which is similar to UORO, modulo a cyclic permutation of the indices. Given a sample $\boldsymbol{\nu} \in \mathbb{R}^2$ of only 2 i.i.d. random variables, again with $\mathbb{E}[\nu_i \nu_j] \propto \delta_{ij}$ and $\mathbb{E}[\nu_i] = 0$, the update takes the form shown in Eqs.
\eqref{kfrtrl_a_update} and \eqref{kfrtrl_b_update}:
\begin{align*}
A^{(t)}_j &= \left(\nu_0 \rho_0 A^{(t-1)}_j + \nu_1 \rho_1 \hat{a}^{(t-1)}_j\right) \numberthis \label{kfrtrl_a_update}\\
B^{(t)}_{ki} &= \left(\nu_0 \rho_0^{-1} \sum_{k'} J^{(t)}_{kk'} B^{(t-1)}_{k'i} + \nu_1 \rho_1^{-1}\alpha\delta_{ki} \phi'(h^{(t)}_i)\right) \numberthis \label{kfrtrl_b_update}\\
\implies A^{(t)}_j B^{(t)}_{ki} &= \nu_0^2 \sum_{k'} J^{(t)}_{kk'} A^{(t-1)}_j B^{(t-1)}_{k'i} + \nu_1^2\alpha\delta_{ki} \phi'(h^{(t)}_i)\hat{a}^{(t-1)}_j + \text{cross-terms} \label{kfrtrl_update}\\
\implies \mathbb{E}\left[A^{(t)}_j B^{(t)}_{ki}\right] &= \sum_{k'} J^{(t)}_{kk'} \mathbb{E}\left[A^{(t-1)}_j B^{(t-1)}_{k'i}\right] + \alpha\delta_{ki} \phi'(h^{(t)}_i)\hat{a}^{(t-1)}_j\\
&= \sum_{k'} J^{(t)}_{kk'} M^{(t-1)}_{k'ij} + \overline{M}^{(t)}_{kij}\\
&= M^{(t)}_{kij}.
\end{align*}
As in UORO, the cross terms vanish in expectation, and the estimate is unbiased by induction over $t$. This algorithm's updates are also {\bf stochastic} and in {\bf closed form}. Its memory complexity is $\mathcal{O}(n^2)$, but its computation time is $\mathcal{O}(n^3)$ because of the matrix-matrix product in Eq.~\eqref{kfrtrl_b_update}.
\subsection{Reverse KF-RTRL (R-KF-RTRL)} \label{ikf-rtrl}
Our exploration of the space of different approximations naturally raises a question: is an approximation of the form
\begin{equation}
M^{(t)}_{kij} \approx A^{(t)}_i B^{(t)}_{kj}
\label{ikfrtrl_approx}
\end{equation}
also possible? We refer to this method as ``Reverse'' KF-RTRL (R-KF-RTRL) because, in matrix notation, this would be formulated as ${\bf M}^{(t)} \approx{\bf B}^{(t)} \otimes{\bf A}^{(t)}$, where ${\bf A}^{(t)} \in \mathbb{R}^{1 \times n}$ and ${\bf B}^{(t)} \in \mathbb{R}^{n \times m}$.
We propose the following update for $A^{(t)}_i$ and $B^{(t)}_{kj}$ in terms of a random vector $\boldsymbol{\nu} \in \mathbb{R}^n$:
\begin{align*}
A^{(t)}_i B^{(t)}_{kj} &= \left(\rho_0 A^{(t-1)}_i + \rho_1\nu_i\right)\left(\rho_0^{-1} \sum_{k'} J^{(t)}_{kk'}B^{(t-1)}_{k'j} + \rho_1^{-1}\sum_{i'} \nu_{i'} \overline{M}^{(t)}_{ki'j}\right) \numberthis \label{ikfrtrl_update}\\
&= \sum_{k'} J^{(t)}_{kk'}A^{(t-1)}_i B^{(t-1)}_{k'j} + \sum_{i'} \nu_i \nu_{i'} \overline{M}^{(t)}_{ki'j} + \text{cross-terms}\\
\implies \mathbb{E}\left[A^{(t)}_i B^{(t)}_{kj}\right] &= \sum_{k'} J^{(t)}_{kk'}\mathbb{E}\left[A^{(t-1)}_i B^{(t-1)}_{k'j}\right] + \overline{M}^{(t)}_{kij}\\
&= \sum_{k'} J^{(t)}_{kk'} M^{(t-1)}_{k'ij} + \overline{M}^{(t)}_{kij}\\
&= M^{(t)}_{kij}. \numberthis \label{ikfrtrl_unbiased}
\end{align*}
Eq.~\eqref{ikfrtrl_unbiased} shows that this estimate is unbiased, using updates that are {\bf stochastic} and in {\bf closed form}, like its sibling algorithms. Its memory and computational complexity are $\mathcal{O}(n^2)$ and $\mathcal{O}(n^3)$, respectively. R-KF-RTRL is actually more similar to UORO than to KF-RTRL, because $\overline{M}^{(t)}_{kij}$ does not naturally factor as in Eq.~\eqref{ikfrtrl_approx}, which introduces more variance. Worse, it has the computational complexity of KF-RTRL due to the matrix-matrix multiplication in Eq.~\eqref{ikfrtrl_update}. KF-RTRL stands out as the most effective of these three algorithms, because it estimates ${\bf M}^{(t)}$ with the lowest variance due to its natural decomposition structure. (See \citealp{mujika2018approximating} for variance calculations.)
\subsubsection{Optimal Kronecker-Sum Approximation (OK)}
We briefly mention an extension of KF-RTRL by \cite{benzing2019optimal}, where the influence matrix is approximated not by one Kronecker product but by a sum of $r$ Kronecker products, or, in components,
\begin{equation*}
M^{(t)}_{kij} \approx \sum_{l=1}^r A^{(t)}_{lj} B^{(t)}_{lki}.
\end{equation*} On the RTRL update, the $k$ index of $B^{(t)}_{lki}$ is propagated forward by the Jacobian, and then the immediate influence, itself a Kronecker product, is added. Now $M^{(t)}_{kij}$ is approximated by $r+1$ Kronecker products \begin{equation*} M^{(t)}_{kij} \approx \sum_{l=1}^r A^{(t-1)}_{lj} \sum_{k'} J^{(t)}_{kk'}B^{(t-1)}_{lk'i} + \alpha \hat{a}^{(t-1)}_j \delta_{ki} \phi'(h^{(t)}_i), \end{equation*} but the authors developed a technique to optimally reduce this sum back to $r$ Kronecker products, keeping the memory complexity $\mathcal{O}(rn^2)$ and computational complexity $\mathcal{O}(rn^3)$ constant. This update is {\bf stochastic} because it requires explicit randomness in the flavor of the above algorithms, and it is {\bf numerical} because there is no closed form solution to the update. We leave the details to the original paper. \subsection{Kernel RNN Learning (KeRNL)} \label{kernl} \cite{roth2018kernel} developed a learning algorithm for RNNs that is essentially a compression of the influence matrix of the form $M^{(t)}_{kij} \approx A_{ki} B^{(t)}_{ij}$. We will show that this algorithm is also an implicit approximation of RTRL, although the update rules are fundamentally different from those for UORO, KF-RTRL and R-KF-RTRL. The {\bf eligibility trace} ${\bf B}^{(t)} \in \mathbb{R}^{n \times m}$ updates by temporally filtering the immediate influences $\alpha \phi'(h^{(t)}_i) \hat{a}^{(t-1)}_j$ with unit-specific, learnable timescales $\alpha_i$: \begin{equation} B^{(t)}_{ij} = (1 - \alpha_i)B^{(t-1)}_{ij} + \alpha \phi'(h^{(t)}_i) \hat{a}^{(t-1)}_j. \label{e_trace} \end{equation} The {\bf sensitivity matrix} ${\bf A} \in \mathbb{R}^{n \times n}$ is chosen to approximate the multi-step Jacobian $\partial a^{(t)}_k / \partial a^{(t')}_i$ with help from the learned timescales: \begin{equation} \frac{\partial a^{(t)}_k}{\partial a^{(t')}_i} \approx A_{ki} (1 - \alpha_i)^{(t - t')}. 
\label{sensitivity_approx} \end{equation} We will describe how ${\bf A}$ is learned later, but for now we assume this approximation holds and use it to show how the KeRNL update is equivalent to that of RTRL. We have dropped the explicit time-dependence from ${\bf A}$, because it updates too slowly for Eq.~\eqref{sensitivity_approx} to be specific to any one time point. If we unpack this approximation by one time step, we uncover the consistency relation \begin{equation} A_{ki}(1 - \alpha_i) \approx \sum_{k'} J^{(t)}_{kk'} A_{k'i}. \label{A_consistency} \end{equation} By taking $t=t'$ in Eq.~\eqref{sensitivity_approx} and rearranging Eq.~\eqref{A_consistency}, we see this approximation implicitly assumes both \begin{equation} A_{ki} \approx \begin{cases} \delta_{ki}\\ (1 - \alpha_i)^{-1}\sum_{k'}J^{(t)}_{kk'}A_{k'i} \end{cases}. \label{A_kernl_approx} \end{equation} Then the eligibility trace update effectively implements the RTRL update, assuming inductively that $M^{(t-1)}_{kij}$ is well approximated by $A_{ki}B^{(t-1)}_{ij}$: \begin{align*} A_{ki}B^{(t)}_{ij} &= A_{ki}\left[(1 - \alpha_i)B^{(t-1)}_{ij} + \alpha \phi'(h^{(t)}_i) \hat{a}^{(t-1)}_j\right]\\ &= A_{ki} (1 - \alpha_i) B^{(t-1)}_{ij} + \alpha A_{ki}\phi'(h^{(t)}_i) \hat{a}^{(t-1)}_j\\ &\approx \sum_{k'} J^{(t)}_{kk'} A_{k'i} B^{(t-1)}_{ij} + \alpha \delta_{ki} \phi'(h^{(t)}_i) \hat{a}^{(t-1)}_j \numberthis \label{use_A_approx}\\ &= \sum_{k'} J^{(t)}_{kk'} M^{(t-1)}_{k'ij} + \overline{M}^{(t)}_{kij}\\ &= M^{(t)}_{kij}. \end{align*} In Eq.~\eqref{use_A_approx}, we use each of the special cases from Eq.~\eqref{A_kernl_approx}. Of course, the $A_{ki}$ and $\alpha_i$ have to be learned, and \cite{roth2018kernel} use gradient descent to do so. 
We leave details to the original paper; briefly, they run in parallel a perturbed forward trajectory to estimate the LHS of Eq.~\eqref{sensitivity_approx} and then perform SGD on the squared difference between the LHS and RHS, giving gradients for $A_{ki}$ and $\alpha_i$. KeRNL uses {\bf deterministic} updates because it does not need explicit random variables. While the $B^{(t)}_{ij}$ update is in closed form via Eq.~\eqref{e_trace}, the updates for $A_{ki}$ and $\alpha_i$ are {\bf numerical} because of the need for SGD to train them to obey Eq.~\eqref{sensitivity_approx}. Both its memory and computational complexities are $\mathcal{O}(n^2)$. \subsection{Random-Feedback Online Learning (RFLO)} \label{rflo} Coming from a computational neuroscience perspective, \cite{murray2019local} developed a beautifully simple and biologically plausible learning rule for RNNs, which he calls Random-Feedback Online Learning (RFLO). He formulates the rule in terms of an eligibility trace $B^{(t)}_{ij}$ that filters the non-zero immediate influence elements $\phi'(h^{(t)}_i) \hat{a}^{(t-1)}_j$ by the network inverse time constant $\alpha$: \begin{equation*} B^{(t)}_{ij} = (1 - \alpha)B^{(t-1)}_{ij} + \alpha \phi'(h^{(t)}_i) \hat{a}^{(t-1)}_j. \label{rflo_e_trace} \end{equation*} Then the approximate gradient is ultimately calculated\footnote{As the ``random feedback'' part of the name suggests, Murray goes a step further in approximating $\overline{c}^{(t)}_k$ by random feedback weights \`a la \citealp{lillicrap2016random}, but we assume exact feedback in this paper for easier comparisons with other algorithms.} as \begin{equation*} \fracpartial{L^{(t)}}{W_{ij}} \approx \overline{c}^{(t)}_i B^{(t)}_{ij}. \end{equation*} By observing that \begin{equation*} \overline{c}^{(t)}_i B^{(t)}_{ij} = \sum_k \overline{c}^{(t)}_k \delta_{ki} B^{(t)}_{ij}, \end{equation*} we see that RFLO is a special case of KeRNL, in which we fix $A_{ki} = \delta_{ki}$, $\alpha_i = \alpha$. 
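As a rough illustration of the RFLO rule just described, the following NumPy sketch filters the immediate influences into an eligibility trace and forms the approximate gradient $\overline{c}^{(t)}_i B^{(t)}_{ij}$. It is a minimal sketch under stated assumptions, not the authors' implementation: the function names \texttt{rflo\_step} and \texttt{rflo\_gradient}, the choice $\phi = \tanh$, and the random stand-in values for $h^{(t)}$, $\hat{a}^{(t-1)}$ and $\overline{c}^{(t)}$ are all hypothetical.

```python
import numpy as np


def rflo_step(B, h, a_hat_prev, alpha):
    """One eligibility-trace update B <- (1 - alpha) B + alpha phi'(h_i) a_hat_j.

    Assumes phi = tanh (an illustrative choice, not fixed by the text).
    B has shape (n, m), h has shape (n,), a_hat_prev has shape (m,).
    """
    phi_prime = 1.0 - np.tanh(h) ** 2             # phi'(h_i) for tanh units
    immediate = np.outer(phi_prime, a_hat_prev)   # phi'(h_i) * a_hat_j
    return (1.0 - alpha) * B + alpha * immediate


def rflo_gradient(c_bar, B):
    """Approximate gradient dL/dW_ij ~ c_bar_i * B_ij (exact feedback assumed)."""
    return c_bar[:, None] * B


# Toy usage with random stand-ins for the network quantities.
rng = np.random.default_rng(0)
n, m = 3, 5
B = np.zeros((n, m))
h = rng.normal(size=n)            # pre-activations h^(t)
a_hat_prev = rng.normal(size=m)   # concatenated input a_hat^(t-1)
c_bar = rng.normal(size=n)        # immediate credit assignment c_bar^(t)

B = rflo_step(B, h, a_hat_prev, alpha=0.2)
grad = rflo_gradient(c_bar, B)
```

The row-wise scaling in \texttt{rflo\_gradient} is the same contraction as $\sum_k \overline{c}^{(t)}_k \delta_{ki} B^{(t)}_{ij}$, which is why the full three-index influence tensor never has to be stored.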
Alternatively, and as hinted in the original paper, we can view RFLO as a special case of RTRL under the approximation $J^{(t)}_{kk'} \approx (1 - \alpha) \delta_{kk'}$, because the RTRL update reduces to RFLO with $M^{(t)}_{kij} = \delta_{ki}B^{(t)}_{ij}$ containing $B^{(t)}_{ij}$ along the diagonals: \begin{align*} M^{(t)}_{kij} &= \sum_{k'}J^{(t)}_{kk'}M^{(t-1)}_{k'ij} + \overline{M}^{(t)}_{kij}\\ &= (1 - \alpha) \sum_{k'}\delta_{kk'}M^{(t-1)}_{k'ij}+ \overline{M}^{(t)}_{kij}\\ &= (1 - \alpha)M^{(t-1)}_{kij} + \alpha \delta_{ki}\phi'(h^{(t)}_i) \hat{a}^{(t-1)}_j. \numberthis \end{align*} Fig.~\ref{fig:rflo_M} illustrates how ${\bf B}^{(t)}$ is contained in the influence matrix ${\bf M}^{(t)}$. This algorithm's update is {\bf deterministic} and in {\bf closed form}, with memory and computational complexity $\mathcal{O}(n^2)$. \begin{figure}[t] \centering \includegraphics[width=\textwidth]{figs/rflo_figs_matrix.pdf} \caption{A visualization of the influence matrix and its 3 indices $k$, $i$, and $j$. In RFLO, the filtered immediate influences, stored in $B^{(t)}_{ij}$, sparsely populate the influence matrix along the diagonals.} \label{fig:rflo_M} \end{figure} \section{Future-facing algorithms} \subsection{Backpropagation Through Time (BPTT)} For many applications, a recurrent network is unrolled only for some finite number of time steps, and backpropagation through time (BPTT) manifests as the computation of the sum of $\partial L^{(t)} / \partial{\bf w}^{(s)}$ over every $s \leq t$ in the graph. This can be efficiently accomplished using \begin{equation} {\bf c}^{(t)} = {\bf \overline{c}}^{(t)} + {\bf c}^{(t+1)} {\bf J}^{(t+1)} \label{ff_again} \end{equation} (see Eq.~\ref{ff_relation}) to propagate credit assignment backwards. However, in our framework, where a network is run on an infinite-time horizon, there are two qualitatively different ways of unrolling the network. We call them ``efficient'' and ``future-facing'' BPTT. 
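The backward recursion used by BPTT can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions rather than the authors' code: credit assignments are treated as row vectors, so the recursion reads ${\bf c}^{(t)} = {\bf \overline{c}}^{(t)} + {\bf c}^{(t+1)}{\bf J}^{(t+1)}$, the unrolled segment is truncated at its last step, and the function name \texttt{backward\_credit} and the random inputs are hypothetical.

```python
import numpy as np


def backward_credit(c_bar, J):
    """Propagate credit assignment backwards over a finite segment.

    c_bar: array of shape (T, n); c_bar[t] is the immediate credit assignment.
    J:     array of shape (T, n, n); J[t] is the Jacobian of a^(t) with
           respect to a^(t-1). J[0] is never used.
    Returns c of shape (T, n) with c[t] = c_bar[t] + c[t+1] @ J[t+1],
    truncated so that c[T-1] = c_bar[T-1].
    """
    T = c_bar.shape[0]
    c = np.zeros_like(c_bar)
    c[-1] = c_bar[-1]                     # no future beyond the segment
    for t in range(T - 2, -1, -1):        # walk backwards in time
        c[t] = c_bar[t] + c[t + 1] @ J[t + 1]
    return c


# Toy usage on a segment of length 4 with random stand-in values.
rng = np.random.default_rng(1)
T, n = 4, 3
c_bar = rng.normal(size=(T, n))
J = rng.normal(size=(T, n, n))
c = backward_credit(c_bar, J)
```

Each step costs one vector-matrix product, so a segment of length $T$ costs $\mathcal{O}(n^2 T)$, in line with the complexity quoted for BPTT in this section.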
\subsubsection{Efficient backpropagation through time (E-BPTT)}\label{e-bptt} For this method, we simply divide the graph into non-overlapping segments of truncation length $T$ and perform BPTT between $t-T$ and $t$ as described above, using Eq.~\eqref{ff_again}. It takes $\mathcal{O}(n^2 T)$ computation time to compute one gradient, but since this computation is only performed once every $T$ time steps, the computation time is effectively $\mathcal{O}(n^2)$, with memory requirement $\mathcal{O}(nT)$. A problem with this approach is that it does not treat all time points the same: an application of ${\bf w}$ occurring near the end of the graph segment has less of its future influence accounted for than applications of ${\bf w}$ occurring before it, as can be visualized in Fig.~\ref{fig:triangles}. And since any one gradient passed to the optimizer is a sum across both $t$ and $s$, it is not an online algorithm by the framework we presented in \S\ref{pf_ff_online_learning}. Therefore, for the purpose of comparing with online algorithms, we also show an alternative version of BPTT that calculates a future-facing gradient (up to truncation) $\partial \mathcal{L} / \partial{\bf w}^{(t)}$ for every $t$. \begin{figure}[ht] \centering \includegraphics{figs/triangles_3.pdf} \caption{A visualization of various exact gradient methods. Each plot contains a lattice of points, representing derivatives $\partial L^{(t)} / \partial{\bf w}^{(s)}$ for $s \leq t$, with gray boxes representing individual gradients passed to the optimizer. {\bf a)} RTRL sums these derivatives into gradients for fixed $t$, using the PF relation (Eq.~\ref{pf_relation}, \S\ref{pf_ff_online_learning}) to efficiently derive successive gradients (blue arrow). {\bf b)} F-BPTT sums these derivatives into gradients for fixed $s$ by backpropagating through time (yellow arrows). 
{\bf c)} E-BPTT creates a triangular gradient for non-overlapping subgraphs, using the FF relation (Eq.~\ref{ff_relation}, \S\ref{pf_ff_online_learning}) for efficient computation (red arrows). Here, the truncation horizon is $T = 4$.} \label{fig:triangles} \end{figure} \subsubsection{Future-facing backpropagation through time (F-BPTT)} In this version of BPTT, we keep a dynamic list of truncated credit assignment estimates ${\bf \hat{c}}^{(s)}$ for times $s = t - T, \cdots, t - 1$: \begin{equation*} \left[{\bf \hat{c}}^{(t-T)}, \cdots, {\bf \hat{c}}^{(t-1)}\right], \end{equation*} where each truncated credit assignment estimate includes the influences of ${\bf a}^{(s)}$ only up to time $t-1$: \begin{equation*} {\bf \hat{c}}^{(s)} = \sum_{t' = s}^{t-1} \fracpartial{L^{(t')}}{{\bf a}^{(s)}}. \end{equation*} At current time $t$, every element ${\bf \hat{c}}^{(s)}$ is extended by adding $\partial L^{(t)} / \partial{\bf a}^{(s)}$, calculated by backpropagating from the current loss $L^{(t)}$, while the explicit credit assignment ${\bf \overline{c}}^{(t)}$ is appended to the front of the list. To compensate, the oldest credit assignment estimate ${\bf \hat{c}}^{(t-T)}$ is removed and combined with the immediate influence to form a (truncated) gradient \begin{equation*} {\bf \hat{c}}^{(t-T)}{\bf \overline{M}}^{(t-T)} = \sum_{t' = t-T}^{t} \fracpartial{L^{(t')}}{{\bf a}^{(t-T)}}\fracpartial{{\bf a}^{(t-T)}}{{\bf w}^{(t-T)}} = \sum_{t' = t-T}^{t} \fracpartial{L^{(t')}}{{\bf w}^{(t-T)}} \approx \fracpartial{\mathcal{L}}{{\bf w}^{(t-T)}}, \end{equation*} which is passed to the optimizer to update the network. This algorithm is ``online" in that it produces strictly future-facing gradients at each time step, albeit delayed by the truncation time $T$ and requiring memory of the network states from $t-T$. 
Each update step requires $\mathcal{O}(n^2 T)$ computation, and since the update is performed at every time step, computation remains a factor of $T$ more expensive than E-BPTT. Memory requirement is still $\mathcal{O}(nT)$. Fig.~\ref{fig:triangles} illustrates the differences among these methods and RTRL, using a triangular lattice as a visualization tool. Each point in the lattice is one derivative $\partial L^{(t)} / \partial{\bf w}^{(s)}$ with $t \geq s$, and the points are grouped together into discrete gradients passed to the optimizer. \subsection{Decoupled Neural Interfaces (DNI)} \label{dni} \cite{jaderberg2017decoupled} developed a framework for online learning by {\bf predicting} credit assignment. Whereas PF algorithms face the problem of a large influence tensor $M^{(t)}_{kij}$ that needs a compressed representation, FF algorithms face the problem of incomplete information: at time $t$, it is impossible to calculate ${\bf c}^{(t)}$ without access to future network variables. The approach of Decoupled Neural Interfaces (DNI) is to simply make a linear prediction of ${\bf c}^{(t)}$ \citep{czarnecki2017understanding} based on the current hidden state ${\bf a}^{(t)}$ and the current labels ${\bf y}^{*(t)}$: \begin{equation*} c^{(t)}_i \approx \sum_l \tilde{a}^{(t)}_l A_{li}, \label{dni_approx} \end{equation*} where ${\bf \tilde{a}}^{(t)} = \text{concat}({\bf a}^{(t)}, {\bf y}^{*(t)}, 1) \in \mathbb{R}^{m'}$, $m' = n + n_{\text{out}} + 1$, and $A_{li}$ are the components of a matrix ${\bf A} \in \mathbb{R}^{m' \times n}$, which parameterizes what the authors call the {\bf synthetic gradient} function. The parameters $A_{li}$ are trained to minimize the loss \begin{equation} L^{(t)}_{\text{SG}} = \frac{1}{2}\norm{\sum_l \tilde{a}^{(t)}_l A_{li} - c^{(t)}_i}^2 \label{sg_loss} \end{equation} via gradient descent, similar to KeRNL's treatment of $A_{ki}$ and $\alpha_i$ (and we drop the time dependence of $A_{li}$ for the same reason). 
Of course, this raises a problem: the whole point is to avoid calculating ${\bf c}^{(t)}$ explicitly, but calculating the error in Eq.~\eqref{sg_loss} requires access to ${\bf c}^{(t)}$. So the authors propose a ``bootstrapping'' technique analogous to the Bellman equation in Reinforcement Learning \citep{sutton2018reinforcement}. If we take the FF relation we derived in Eq.~\eqref{ff_relation} \begin{equation} {\bf c}^{(t)} = {\bf \overline{c}}^{(t)} + {\bf c}^{(t+1)}{\bf J}^{(t+1)} \label{ff_relation_again} \end{equation} and approximate the appearance of ${\bf c}^{(t+1)}$ with the synthetic gradient estimate ${\bf \tilde{a}}^{(t+1)}{\bf A}$, then Eq.~\eqref{ff_relation_again} provides an estimate of $c^{(t)}_i$ to use in Eq.~\eqref{sg_loss}. Then the update for ${\bf A}$ can be written as \begin{equation} \Delta A_{li} \propto - \tilde{a}^{(t)}_l \left[\sum_{l'} \tilde{a}^{(t)}_{l'} A_{l'i} - \left(\overline{c}^{(t)}_i + \sum _{m}\sum_{l'} \tilde{a}^{(t+1)}_{l'} A_{l'm} J^{(t+1)}_{mi}\right)\right] \label{sg_update} \end{equation} with learning rate chosen as a hyperparameter. As in Eq.~\eqref{ff_gradient}, the gradient is calculated by combining the estimated credit assignment for the $i$-th unit with the explicit influence by the $ij$-th parameter: \begin{equation*} \fracpartial{\mathcal{L}}{W^{(t)}_{ij}} = \fracpartial{\mathcal{L}}{a^{(t)}_{i}}\fracpartial{a^{(t)}_i}{W^{(t)}_{ij}} = c^{(t)}_i \phi'(h^{(t)}_i) \hat{a}^{(t-1)}_j \approx \left(\sum_l \tilde{a}^{(t)}_l A_{li}\right)\phi'(h^{(t)}_i) \hat{a}^{(t-1)}_j. \end{equation*} This algorithm is {\bf future facing} because it ultimately estimates the effect of applying ${\bf w}$ at current time $t$ on \emph{future} losses. Its updates are {\bf deterministic} and {\bf numerical}, because no explicit randomness is required, but the minimization problem over $A_{li}$ implied by Eq.~\eqref{sg_loss} is approximated by gradient descent rather than solved in closed form. 
It requires $\mathcal{O}(n^2)$ memory for ${\bf A}$ and $\mathcal{O}(n^2)$ computation for the matrix-vector multiplications in Eq.~\eqref{sg_update}. \subsubsection{Biological approximation to DNI} While many of the algorithms we have presented are biologically plausible in the abstract, i.e. temporally/spatially local and requiring no more than $\mathcal{O}(n^2)$ memory, we have not yet discussed any explicit biological implementations. There are a handful of additional considerations for evaluating an algorithm with respect to biological plausibility: \begin{enumerate}[i] \itemsep0em \item Any equation describing synaptic strength changes (weight updates) must be {\bf local}, i.e. any variables needed to update a synapse connecting the $i$-th and $j$-th units must be physically available to those units. \item Matrix-vector multiplication can be implemented by network-wide neural transmission, but input vectors must represent {\bf firing rates} (e.g. post-activations ${\bf a}$) and not membrane potentials (e.g. pre-activations ${\bf h}$), since inter-neuron communication is mediated by spiking. \item {\bf Feedback weights} used to calculate ${\bf \overline{c}}$ cannot be perfectly symmetric with ${\bf W}^{\text{out}}$, since there is no evidence for biological weight symmetry (see \citealp{lillicrap2016random}). \item Matrices (e.g. ${\bf J}$ or ${\bf A}$) must represent a set of synapses, whose strengths are determined by some local update. \end{enumerate} With a few modifications, many of the presented algorithms can satisfy these requirements. We briefly illustrate one particular case with DNI, as shown in \cite{Marschall19}. To address (i), the result of the synthetic gradient operation $\sum_l \tilde{a}^{(t)}_l A_{li}$ can be stored in an electrically isolated neural compartment, in a manner similar to biological implementations of feedforward backpropagation \citep{guerguiev2017towards, sacramento2018dendritic}, to allow for local updates to $W_{ij}$. 
For (ii), simply pass the bootstrapped estimate of ${\bf c}^{(t+1)}$ from Eq.~\eqref{sg_update} through the activation function $\phi$ so that it represents a neural firing rate. For (iii), one can use fixed, random feedback weights ${\bf W}^{\text{fb}}$ instead of the output weights to calculate ${\bf \overline{c}}^{(t)}$, as in \cite{lillicrap2016random}. And for (iv), one can train a set of weights $\mathcal{J}_{ij}$ online to approximate the Jacobian by performing SGD on $L_J^{(t)} = \norm{a^{(t)}_i - \sum_j \mathcal{J}_{ij}a^{(t-1)}_j}^2$, which encodes the error of the linear prediction of the next network state by $\mathcal{J}_{ij}$. The update rule manifests as \begin{equation*} \Delta \mathcal{J}_{ij} \propto \left(a^{(t)}_i - \sum_{j'} \mathcal{J}_{ij'}a^{(t-1)}_{j'}\right)a^{(t-1)}_j, \end{equation*} essentially a ``perceptron'' learning rule, which is local and biologically realistic. Although this approximation brings no traditional computation speed benefits, it offers a plausible mechanism by which a neural circuit can access its own Jacobian for learning purposes. This technique could be applied to any other algorithm discussed in this paper. We refer to this altered version of DNI as DNI(b) in the experiments section. \section{Experiments} \label{results_section} We run a number of experiments to empirically validate our categorizations and compare performance of the algorithms reviewed here, using two different synthetic tasks. \subsection{Setup} We implemented every algorithm presented here in a custom NumPy-based Python module.\footnote{Link to public code repository to be included upon acceptance.} In every simulation, we use gradient descent with a learning rate of $10^{-4}$, the fastest learning rate for which all algorithms are able to converge to a stable steady-state performance. We restrict ourselves to using a batch size of 1, because, in an online setting, the network must learn as data arrive in real time. 
Most algorithms demand additional configuration decisions and hyperparameter choices: the truncation horizon $T$ (F-BPTT), the initial values of the tensors ${\bf A}$ and ${\bf B}$ (all approximations), the initial values of the learned timescales $\alpha_i$ (KeRNL), the distribution from which $\boldsymbol{\nu}$ is sampled for stochastic updates (UORO, KF-RTRL, R-KF-RTRL), and the learning rate for the numerical updates (KeRNL, DNI). For each algorithm, we independently optimize these choices by hand (see \hyperref[appendix:hyperparameters]{Appendix B} for details). We evaluate each algorithm's ability to learn two different synthetic tasks: an {\bf additive dependencies} task (Add) inspired by \cite{pitis2016recurrent} and a {\bf mimic target RNN} task (Mimic). In both tasks, a stream of i.i.d. Bernoulli inputs ${\bf x}^{(t)} \in \{0, 1\}^{n_{\text{in}}}$ is provided to the RNN. For Add, $n_{\text{in}} = n_{\text{out}} = 1$. The label $y^{*(t)}$ has a baseline value of 0.5 that increases (or decreases) by 0.5 (or 0.25) if $x = 1$ at $t - t_1$ (or $t - t_2$), for specified lags $t_1$ and $t_2$. The longer the lags of the dependencies, the more difficult the task. We choose $t_1 = 6$ and $t_2 = 10$. In the Mimic task, the labels ${\bf y}^{*(t)} \in \mathbb{R}^{n_{\text{out}}}$ are determined by the outputs of a randomly generated, untrained target RNN that is fed the same input stream $\left\{{\bf x}
\caption{Example of confusions caused by generic chemical names. (\includegraphics[width=.8em]{figs/blue-M.pdf}: false negatives, \includegraphics{figs/green-G.pdf}, \includegraphics{figs/green-M.pdf}: false positives)}
\caption{Example of ``chain reaction'' like errors. (\includegraphics[width=.8em]{figs/blue-G.pdf}: false negatives, \includegraphics{figs/green-G.pdf}: false positives, \includegraphics{figs/red-G.pdf}: true positives)}
\caption{Semi-supervised training framework for Somali ASR. \protect\tikz[baseline]{\protect\draw[line width=0.3mm,dash dot, ->] (0,.8ex)--++(1,0)} indicates untranscribed speech being fed to the transcriber.}
\caption{Overall structure of our FC$^2$N model. The {\textcolor[rgb]{1,0,0}{red}} arrows in WGFF and CG denote WCC. Note that there are no residual connections in the entire network.}
\caption{Quantitative comparison w.r.t \textbf{large-scale} FC$^2$N ($n = 16, m = 8$). Best and second best results are marked in {\textcolor[rgb]{1,0,0}{red}} and {\textcolor[rgb]{0,0,1}{blue}} respectively (PSNR (dB)/SSIM).}
\caption{Visual comparison between other SR methods and our large-scale FC$^2$N. The best and second best results are marked in {\textcolor[rgb]{1,0,0}{red}} and {\textcolor[rgb]{0,0,1}{blue}} respectively.}
\caption{Efficiency analysis on parameters and computational overhead. {\textcolor[rgb]{1,0,0}{Red}} represents the large-scale implementation; {\textcolor[rgb]{0,0,1}{blue}} denotes the lightweight version.}
\caption{A high-level view of our approach. The \colorbox{lightgreen}{inputs} to the system are a mention $m$ in a context $s$, and type definitions $T$. The \colorbox{lightblue2}{output} is a set of types $\{t\}$ in the type definition. The figure also highlights the input \colorbox{lightred2}{resources}, as well as \colorbox{lightpurple}{offline} and \colorbox{lightyellow}{online processes}. }
\caption{ \small Evaluation of fine-grained entity-typing: we compare our system with state-of-the-art systems (\S\ref{sec:fine:grained:exp}). For each column, the best zero-shot and overall results are {\bf bold-faced} and \underline{underlined}, respectively. Numbers are $F1$ in percentage. For supervised systems, we report their in-domain performances, since they do not transfer to other datasets with different labels. For \otyper, cells with \colorbox{graaay}{gray} color indicate \emph{in-domain} evaluation, which is the setting in which it has the best performance. Our system outperforms all the other zero-shot baselines, and achieves competitive results compared to the best supervised systems. }
\caption{ \small Evaluation of coarse entity-typing (\S\ref{sec:coarse:type:exp}): we compare two supervised entity-typers with our system. For the supervised systems, cells with \colorbox{graaay}{gray} color indicate \emph{in-domain} evaluation. For each column, the best out-of-domain and overall results are {\bf bold-faced} and \underline{underlined}, respectively. Numbers are $F1$ in percentage. In most of the out-of-domain settings our system outperforms the supervised system. }
\caption{Upper-bound analysis: The y-axis shows the percentage of the instances with their types covered by the retrieved concepts. The x-axis shows the number of concepts retrieved, with a) \colorbox{lightblue}{Step 1} (dashed), b) \colorbox{lightred}{Step 2} (solid). }
\caption{\textcolor{blue}{GO-VOS}}
\caption{\textcolor{blue}{GO-VOS - with $\mathbf{w}$ learned per frame}}
\caption{\textcolor{blue}{GO-VOS}}
\caption{\label{MSR-Action3D-accuracy} Experimental results and comparison with the state-of-the-art approaches on the MSR Action3D dataset \cite{Li2010ActionRB}. The best accuracies are in \textcolor{blue}{\textbf{bold-blue}}. Results that surpass previous works are in \textbf{bold}.}
\caption{\label{NTU-Accuracy} Experimental results and comparison with the state-of-the-art approaches on the NTU RGB+D dataset \cite{Shahroudy2016NTURA}. The best accuracies are in \textcolor{blue}{\textbf{bold-blue}}. Results that surpass previous works are in \textbf{bold}.}
\caption{Test accuracy of the proposed multi-antenna analog DSGD algorithm without CSIT for different numbers of PS antennas $\left( K \in \{ 1,5,2M,2M^2 \} \right)$ and noise variances $\sigma_z^2$.} \label{FigTestAccNoise1050Perfect_2} \end{figure*} According to the above analysis, the PS estimates $\frac{1}{M} \sum\nolimits_{m=1}^{M} g_{m,2(n-1)s+i} \left(\boldsymbol{\theta}_{t} \right)$ and $\frac{1}{M} \sum\nolimits_{m=1}^{M} g_{m,(2n-1)s+i} \left(\boldsymbol{\theta}_{t} \right)$, for $i \in[s]$, $n \in[N]$, through \begin{subequations}\label{PSSigTermRecReImEst} \begin{align}\label{PSSigTermRecReEst} \hat{g}_{2(n-1)s+i} \left(\boldsymbol{\theta}_{t} \right) &= \frac{ {\rm{Re}} \left\{ y_{i}^n (t) \right\} }{\alpha_t M \sigma_h^2},\\ \hat{g}_{(2n-1)s+i} \left(\boldsymbol{\theta}_{t} \right) &= \frac{ {\rm{Im}} \left\{ y_{i}^n (t) \right\} }{\alpha_t M \sigma_h^2},\label{PSSigTermRecImEst} \end{align} \end{subequations} respectively. It then utilizes the estimated vector $\hat{\boldsymbol{g}} (\boldsymbol{\theta}_t) \triangleq \left[\hat{g}_{1} \left(\boldsymbol{\theta}_{t} \right), \cdots, \hat{g}_{d} \left(\boldsymbol{\theta}_{t} \right)\right]^T$, which can provide a good estimate of the actual average of gradients if a sufficiently large number of PS antennas are employed, to update the model parameters. 
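The estimation step in \eqref{PSSigTermRecReImEst} amounts to rescaling the real and imaginary parts of the combined received signal. The NumPy sketch below illustrates that arithmetic only; it is a hedged example, not the paper's implementation. The function name \texttt{ps\_estimate} is hypothetical, and the input \texttt{y} is assumed to already hold the $s$ combined symbols $y_i^n(t)$ for a single $n$, with the antenna combining done elsewhere.

```python
import numpy as np


def ps_estimate(y, alpha_t, M, sigma_h2=1.0):
    """Estimate the averaged gradient entries from the combined signal.

    y:        complex array of length s holding y_i^n(t) for one index n
              (assumed to be the signal after combining over PS antennas).
    alpha_t:  power allocation factor at iteration t.
    M:        number of wireless devices.
    sigma_h2: channel gain variance sigma_h^2.
    Returns (g_re, g_im): Re{y} and Im{y}, each divided by alpha_t * M * sigma_h2.
    """
    y = np.asarray(y, dtype=complex)
    scale = alpha_t * M * sigma_h2
    return y.real / scale, y.imag / scale


# Toy usage with made-up received symbols.
g_re, g_im = ps_estimate(np.array([2 + 4j, -6 + 0j]), alpha_t=1.0, M=2)
```

Stacking \texttt{g\_re} and \texttt{g\_im} over all $n \in [N]$ in the index order of \eqref{PSSigTermRecReImEst} would give the length-$d$ estimate $\hat{\boldsymbol{g}} (\boldsymbol{\theta}_t)$.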
In Algorithm \ref{MUltiAntennaA_DSGD}, for ease of presentation, we represent the estimated signals at the PS in vector form, in which $\hat{\boldsymbol{g}}_{\rm{re}}^n \left(\boldsymbol{\theta}_{t} \right)$ and $\hat{\boldsymbol{g}}_{\rm{im}}^n \left(\boldsymbol{\theta}_{t} \right)$ represent the estimates of $\frac{1}{M} \sum\nolimits_{m=1}^{M} {\boldsymbol{g}}^n_{m, {\rm{re}}} \left(\boldsymbol{\theta}_{t} \right)$ and $\frac{1}{M} \sum\nolimits_{m=1}^{M} {\boldsymbol{g}}^n_{m, {\rm{im}}} \left(\boldsymbol{\theta}_{t} \right)$, respectively, at the PS, $n \in [N]$. \begin{remark}\label{RemPowerDecay} We note that with SGD the empirical variances of the gradient estimates decay over time and approach zero asymptotically \cite{BottouLargeScaleSGD,ScalableDNNStorm,DCLimitedPrecisionGupta,MohammadDenizDSGDCS,UseLocalSGDLin}. Thus, for robust communication of the gradient estimates against noise at each iteration of the DSGD algorithm, it is reasonable to increase the power allocation factor $\alpha_t$ over time. \end{remark} \begin{remark}\label{RemCompression} We remark that the main focus in this paper is to develop techniques to perform a DSGD algorithm at the wireless edge with no CSIT. We propose to employ multiple antennas at the PS, which can help to mitigate the effect of fading, and, in the limit, align the received signals at the PS. We can further employ some of the existing schemes in the literature providing more efficient communication over the limited bandwidth wireless MAC, such as the idea of linear projection proposed in \cite{MohammadDenizDSGDCS}. We leave the analysis of such combined techniques to future work. \end{remark} \section{Numerical Experiments}\label{SecExperiments} Here we evaluate the performance of the proposed analog DSGD algorithm with no CSI available at the wireless devices. We are particularly interested in investigating the impact of the number of PS antennas on the performance of the proposed scheme. 
We run experiments on the MNIST dataset \cite{LeCunMNIST} with $60000$ training and $10000$ test samples, and train a single layer neural network with $d=7850$ parameters utilizing the ADAM optimizer \cite{ADAMDC}. We train the network for $T=800$ iterations. We consider $M=20$ wireless devices in the system. To have a realistic model of data distribution across the devices for the wireless edge learning model, we assume that each device has access to $1000$ training data samples selected at random from the training dataset. Thus, some of the training data samples are not assigned to any device, and the data samples across different devices may not be independent. For simplicity, we assume that the $s$ channel gains associated with each OFDM symbol from each device to each PS antenna are i.i.d., and $\sigma_h^2 = 1$. The performance is measured as the accuracy with respect to the test samples based on the updated model parameters at each DSGD iteration. For numerical comparison, we also consider the benchmark scenario, in which the PS receives the actual average of the gradient estimates $\frac{1}{M} \sum\nolimits_{m=1}^{M} \boldsymbol{g}_m \left(\boldsymbol{\theta}_{t} \right)$, and updates the parameter vector according to this noiseless observation at each DSGD iteration. We refer to this as the error-free shared link scenario, and its accuracy can serve as an upper bound on the performance of the proposed analog DSGD scheme. In Fig. \ref{FigTestAccNoise1050Perfect_2} we illustrate the performance of the proposed analog DSGD scheme with no CSIT for different $K$ values and different noise levels. 
We consider $K \in \{ 1, 5, 2M, 2M^2 \}$, and investigate the performance of the proposed scheme for $\sigma_z^2 = 20$ and $\sigma_z^2 = 100$ in Figures \ref{FigTestAccNoise10Perfect_2} and \ref{FigTestAccNoise50Perfect_2}, respectively. We also include the performance of the error-free shared link scenario. We set the power allocation factor $\alpha_t = 1 + t/1000$, $t \in [T]$, and for simplicity, we assume that $s = d/2$ resulting in $N=1$. We note that, for a fixed power allocation $\alpha_t$, $\forall t$, the value of $s$ does not have any impact on the accuracy of the considered schemes; instead, any change in $s$ scales the average transmit power, whose value is proportional to $N$. As can be seen, employing more antennas at the PS results in a higher accuracy, with a larger improvement when the noise level is higher. This is due to the fact that increasing $K$ mitigates the effects of both the interference and noise terms, as can be inferred from \eqref{ReceivedVectorPSScheCombAntennasReWrith}. Thus, the advantage of having more PS antennas is more pronounced when the channel is noisier. For example, even when $\sigma_z^2 = 100$, the proposed scheme with $K = 2M^2$ PS antennas and average power $\bar{P}=0.21$ achieves only slightly lower accuracy than the error-free shared link scenario; this result indicates the success of the proposed scheme in mitigating the noise term even when the ratio $\bar{P}/{\sigma_{z}^2}$ is relatively small. We further observe that, compared to having a single-antenna PS, the accuracy improves by exploiting even a few antennas at the PS, e.g., $K=5$, where the improvement is much higher when the channel is noisier, i.e., in the $\sigma_z^2 = 100$ case. 
We note that, with all the other parameters fixed, the required average transmit power reduces with $K$, which confirms that a larger $K$ yields faster convergence and hence a faster reduction in the empirical variances of the gradients over time. The same observation is made by reducing $\sigma_z^2$ from $100$ to $20$ while all the other parameters are fixed.

\section{Conclusions}\label{SecConc}

We have studied DSGD at the wireless edge, where wireless devices compute the gradient estimates based on their available limited datasets, and transmit their estimates to the PS over a wireless fading MAC. To make the model more realistic, we have assumed that the devices do not have CSI for the underlying fast fading channel. With the goal of recovering the average gradient estimates at the PS, we have developed an analog DSGD technique, where the effect of fading, which cannot be cancelled at the transmitters due to the lack of CSIT, is alleviated by employing multiple antennas at the PS. Theoretical analysis, corroborated with numerical results, indicates that, with the proposed approach, increasing the number of PS antennas provides a better estimate of the average gradients through a better alignment of the desired signals, as well as elimination of the interference and noise terms. Asymptotically, the proposed DSGD scheme guarantees, despite the lack of CSIT, that the wireless MAC becomes deterministic, and both the fading and noise effects disappear.

\bibliographystyle{IEEEtran}
\bibliography{Report}
\end{document} }
\caption[width=\textwidth]{\textbf{Schematics of model geometries and theoretical predictions.} a. A droplet above a step. b. A droplet attached to a step from below. c. A droplet in a trench (the black arrows indicate the possible motions of the droplets along the topographical features and red arrows indicate the direction in which droplets detach for inclination angles larger than $\beta$). (a--c illustrate the experimental setup for all model geometries; all faces of the non-planar surfaces were rendered as SLIPS; the thin lubricant layer is represented in yellow and water droplets in blue. Here we do not depict the lubricant meniscus, which has been considered elsewhere \citep{orme2019droplet}.) d. The inclination angle $\beta$ at which a droplet of volume $V$ detaches from above a step orientated parallel to the axis of rotation (filled markers) is an order of magnitude larger than the inclination required for a drop to slide along the step (empty triangles) or along a trench (empty squares). The dotted lines represent least-squares fits of the radius-dependent scaling (\ref{eq:above}) with prefactor $A = 0.130\pm0.004$ (detaching) and $A = 0.0131\pm0.002$ (sliding). Inset: The force (weight) required to overcome capillary pinning of the drop is independent of step height. e. The inclination angle $\beta$ at which a droplet of volume $V$ detaches from below a step (circles) or a trench (squares). The solid lines represent the scaling (\ref{eq:below}) for different topography heights. Inset: The force (weight) required to overcome capillary pinning of the drop increases with feature height; the influence of the effective drop radius is secondary (marker size indicates drop volume). In d--e, the heights of the elements are colour coded such that $h = 0.21$\,mm (\textcolor{marker1}{navy}), $0.36$\,mm (\textcolor{marker2}{blue}), $0.55$\,mm (\textcolor{marker3}{green}), $1.17$\,mm (\textcolor{marker4}{yellow}).}
\caption{\textbf{Localization results}. We report localization recalls in percent, for three translation and orientation thresholds (\textit{high}, \textit{medium}, and \textit{coarse}) as in \cite{6DOFBenchmark}. For each threshold, we highlight the \boldred{best} performance in red and the \boldblue{second-best} in blue. Note that NetVLAD, ToDayGAN, and NV+SP all use pre-trained NetVLAD weights from Pittsburgh30k~\cite{NetVLAD}, while we retrained ours on other RobotCar sequences. We also include SMC, which uses additional semantic data and assumptions. For Extended CMU-Seasons, some methods did not provide results for the benchmark.}
\caption{ \protect\tikz \protect\draw (0,0) node[ellipse,minimum height=2mm,minimum width=5mm,draw=orange!100,fill=orange!30,very thick] { };, \protect\tikz \protect\draw (0,0) node[ellipse,minimum height=2mm,minimum width=5mm,draw=teal!100,fill=teal!30,very thick] { };, \protect\tikz \protect\draw (0,0) node[ellipse,minimum height=2mm,minimum width=5mm,draw=violet!100,fill=violet!30,very thick] { }; are nodes associated with leaf values, hidden values and output value, and \protect\tikz \protect\draw[brown,thick,->](0,0) -- (5mm,0); , \protect\tikz \protect\draw[magenta,thick,->](0,0) -- (5mm,0); , \protect\tikz \protect\draw[gray,thick,->](0,0) -- (5mm,0); are edges associated with operations $\tilde{\boldsymbol{f}}_1, \tilde{\boldsymbol{f}}_2, F$. }
\caption{ \protect\tikz \protect\draw[black,thick,->](0,0) -- (5mm,0); are edges associated with operation $+$ and $x$ \protect\tikz \protect\draw[gray,very thick,decorate,decoration={pre length=1mm,post length=1mm, snake, amplitude=.5mm,segment length=1mm},->](0, 0)--(5mm, 0); $y$ denotes $\mathscr{T} x$ \protect\tikz \protect\draw[gray,very thick,->](0,0) -- (5mm,0); $y$. }
\caption{{\color{myc} Phase diagrams for the four protein compositions investigated. The shaded areas show the two-phase regions, and the vertical dashed line shows the sample concentration used to investigate the phase-separation dynamics.}}
\caption{Exemplary pose selection state. \textit{Top:} Index of dispersion. \textit{Left:} Intrinsic calibration position candidates after one (\textcolor{magenta}{magenta}) and two (\textcolor{Dandelion}{yellow}) subdivision steps. \textit{Right:} Distortion map with already visited regions masked out.}
\caption{\redd{A representative timeline showing binary classification of a disruptive discharge with disruption time $\td$. For a class time $\tc$, data before $t < \td - \tc$ are considered ``non-disruptive,'' while data within $\td - \tc \leq t \leq \td$ are considered ``disruptive.'' Oftentimes, there is a minimum required time $\tmin$ to avoid or mitigate a disruption; therefore, an alarm must be triggered before $t < \td - \tmin$.}}
\caption{(a) A toy disruptivity signal for demonstration. Resulting contours of survival probability ($\tc$~=~100~ms, $\ts$~=~1~ms) for cases in which the future disruptivity is (b) known and (c) linearly-extrapolated ($\tfit$~=~100~ms). Median times (solid) and expected \red{future} lifetimes (dotted) overlay \soutt{subplots} (b) and (c). (d) Contours of the absolute difference in $S(t+\Dt|t)$ between \soutt{subplots} (b) and (c). (e) The hazard function computed from data in \soutt{subplot} (c). Grey regions in \soutt{subplots} (b) and (d) indicate unknown data. \redd{Horizontal dashed lines in (a)-(d) indicate $\Dt = \tc$.}}
\caption{(a) The toy disruptivity data from figure~\ref{fig:fakeSmooth}a with added noise of period 50~ms. Resulting contours of survival probability ($\tc$~=~100~ms, $\ts$~=~1~ms) for cases in which the future disruptivity is (b) known and linearly-extrapolated with fitting time windows $\tfit$ of (c) 50~ms, (d) 100~ms, and (e) 200~ms. The median times (solid) and expected \red{future} lifetimes (dotted) are plotted. The color-scale in \soutt{subplot} (b) is the same for \soutt{subplots} (c)-(e). The grey region in \soutt{subplot} (b) indicates unknown data. \redd{Horizontal dashed lines in (b)-(e) indicate $\Dt = \tc$.}}
\caption{\redd{A good prediction: The (a) plasma current in MA and disruptivity and (b) edge safety factor and line-averaged plasma density in $10^{20}$~m$^{-3}$ are plotted for Alcator C-Mod discharge \#1140226013. Contours are shown for the (c) survival and (d) hazard functions, calculated using linear extrapolation as in~\eqref{eq:taylor} ($\tfit$~=~100~ms, $\ts$~=~1~ms). The estimated \red{future} lifetime $\tapx$ (dotted) and median time $\mt$ (solid) are given in (c). The horizontal solid lines in (a) indicate the low and high thresholds of the RF algorithm, as described in the text. The dashed lines in (a)-(d) indicate $\tc$~=~325~ms before the disruption (vertical) and $\Dt = \tc$ (horizontal). Grey regions indicate unknown data.}}
\caption{\redd{A late warning: The (a) plasma current in MA and disruptivity and (b) normalized plasma $\beta$ (\%) and line-averaged plasma density in $10^{20}$~m$^{-3}$ are plotted for Alcator C-Mod discharge \#1150722006. Contours are shown for the (c) survival and (d) hazard functions, calculated using linear extrapolation as in~\eqref{eq:taylor} ($\tfit$~=~100~ms, $\ts$~=~1~ms). The estimated \red{future} lifetime $\tapx$ (dotted) and median time $\mt$ (solid) are given in (c). The horizontal solid lines in (a) indicate the low and high thresholds of the RF algorithm, as described in the text. The dashed lines in (a)-(d) indicate $\tc$~=~325~ms before the disruption (vertical) and $\Dt = \tc$ (horizontal). Grey regions indicate unknown data.}}
\caption{\redd{A false alarm: The (a) plasma current in MA and disruptivity and (b) normalized internal inductance and line-averaged plasma density in $10^{20}$~m$^{-3}$ are plotted for Alcator C-Mod discharge \#1140227018. Contours are shown for the (c) survival and (d) hazard functions, calculated using linear extrapolation as in~\eqref{eq:taylor} ($\tfit$~=~100~ms, $\ts$~=~1~ms). The estimated \red{future} lifetime $\tapx$ (dotted) and median time $\mt$ (solid) are given in (c). The horizontal solid lines in (a) indicate the low and high thresholds of the RF algorithm, as described in the text. The horizontal dashed lines in (c) and (d) indicate $\Dt = \tc$. Grey regions indicate unknown data.}}
\caption{\textbf{(Figure \ref{fig:exp1})} depicts the average fitness (\textbf{thick}) and best fitness (dashed) per generation -- lower $\ell_2$-norms correspond to a higher fitness level. \textbf{(Figure \ref{fig:exp2})} depicts the number of surviving children per generation (generation gap) -- the higher the gap, the faster the evolution. Both graphs depict the original sieve ({\color{blue}blue}), the sieve with global updates ({\color{red}red}), and the sieve with both global selection and occasional mutations ({\color{green!50!black}green}).}
\caption{\textcolor{black}{The parent classes of the banana class and the beagle class. The green leaf nodes are the target classes, and the blue root node is the entity class which is the root of WordNet. The red nodes are the middle classes through which the leaf nodes can reach the root (Best viewed in color).}\textcolor{red}{\label{fig:Beagle-and-Banana}}}
\caption{\textcolor{black}{The tree structure of 1000 ImageNet classes. The green leaf nodes are the classes in the 1000 ImageNet classes. The red nodes are the middle classes through which the leaf nodes can reach the root. The blue root node is the entity class which is the root of WordNet (Best viewed in color).}\textcolor{red}{\label{fig:The-structure-of}}}
\caption{\textcolor{black}{The sensitivity of filters in the conv5 layer.}\label{fig:The-sensitivity-of-conv5-1} The first row is for the class \emph{garbage truck}. The second and third rows are for the classes \emph{forklift} and \emph{pencil box}, respectively. The red dashed lines in the first column show the top-five filters in the model pre-trained on near\_INSTRU (near), and the blue dashed lines are for far\_INSTRU (far). The red solid lines in the second column are the top-five filters after fine-tuning near on INSTRU (FT-near), and the blue solid lines are for FT-far. In the third column, red solid lines are for FT-near and red dashed lines are for near. The fourth column shows the top-five filters in all five models. Specifically, the green solid lines are for the model trained on INSTRU itself (Self). The last column shows the mean AP of the five filters for each model.}
\caption{The visualization of class images for different models.\label{fig:Qualitatively-analysis-of} For each set, the first column shows the same original images, and the second column is the corresponding class image computed from the model fine-tuned on near\_INSTRU (FT-near). The third column is the corresponding image computed from the model fine-tuned on far\_INSTRU (FT-far). The results show that class images computed from FT-near represent the corresponding classes better than the class images computed from FT-far (Best viewed in color).}
\caption{Ca \scriptsize{II}\normalsize\- K, Na I D1 and D2 chromospheric lines at different activity levels: flare (blue) and non-flare spectrum (red).}
\caption{$S_K$-indexes for Ross 128 derived from CASLEO (\textcolor{blue}{$\bullet$}), HARPS (\textcolor{red}{$\blacktriangle$}), FEROS (\textcolor[cmyk]{1,0,1,0.4}{$\blacklozenge$}), UVES (\textcolor{violet}{$\bigstar$}) and XSHOOTER (\textcolor{cyan}{$\times$}) spectra.}
\caption{ Watson-Crick configurations of the A{\textperiodcentered}T (a) and G{\textperiodcentered}C (b) complementary pairs. The numbers indicate the distance (in \AA) between the heavy atoms in the corresponding hydrogen bonds. $R$ denotes the distance between the $C_1'$ atoms. }
\caption{Structural parameters of Watson-Crick A{\textperiodcentered}T and G{\textperiodcentered}C base pairs according to the standard nomenclature (Appendix, Fig. \ref{fig:pathways}). The parameters {\textquoteleft}shear{\textquoteright} and {\textquoteleft}stretch{\textquoteright} are given in \AA; {\textquoteleft}propeller twist{\textquoteright} and {\textquoteleft}opening{\textquoteright} in degrees.}
\caption{Diagram of the interaction energy values for complexes consisting of hydrogen peroxide and water molecules with {\textquoteleft}closed{\textquoteright}, {\textquoteleft}preopened{\textquoteright} and {\textquoteleft}stretched{\textquoteright} configurations of A{\textperiodcentered}T and G{\textperiodcentered}C base pairs.}
\caption{{\textquoteleft}Opened{\textquoteright} configuration of G{\textperiodcentered}C pair with $H_2O_2$ molecule calculated in the present work. The numbers indicate the distance (in \AA) between heavy atoms in the corresponding hydrogen bonds.}
\caption{ Complexes of complementary pairs of A{\textperiodcentered}T and G{\textperiodcentered}C with water and hydrogen peroxide molecules, bound from the major groove ({\textquoteleft}closed{\textquoteright} pair): a) A{\textperiodcentered}T with $H_2O$ molecule; b) A{\textperiodcentered}T with $H_2O_2$; c) G{\textperiodcentered}C with $H_2O$; d) G{\textperiodcentered}C with $H_2O_2$. }
\caption{The stable configurations of the {\textquoteleft}preopened{\textquoteright} base pairs calculated in the present work: A{\textperiodcentered}T with water (a) and hydrogen peroxide (b) molecules; G{\textperiodcentered}C with water (c) and hydrogen peroxide (d) molecules. The numbers indicate the distance (in \AA) between heavy atoms in the corresponding hydrogen bonds.}
\caption{The stable configurations of {\textquoteleft}stretched{\textquoteright} base pairs calculated in the present work: A{\textperiodcentered}T with water (a) and hydrogen peroxide (b) molecules; G{\textperiodcentered}C with water (c) and hydrogen peroxide (d) molecules. The numbers indicate the distance (in \AA) between heavy atoms in the corresponding hydrogen bonds.}
\caption{ M3D-RPN uses a \textit{single} monocular 3D region proposal network with global convolution (\textcolor{darkorange}{orange}) and local depth-aware convolution (\textcolor{darkblue}{blue}) to predict multi-class 3D bounding boxes. }
\caption{ Comparison of Deep3DBox~\cite{mousavian20173d} and Multi-Fusion~\cite{xu2018multi} with M3D-RPN. Notice that prior works are comprised of multiple internal stages (\textcolor{darkorange}{orange}), and external networks (\textcolor{darkblue}{blue}), whereas M3D-RPN is a {\it single-shot} network trained end-to-end. }
\caption{ \textbf{Overview of M3D-RPN}. The proposed method consists of parallel paths for global (\textcolor{darkorange}{orange}) and local (\textcolor{darkblue}{blue}) feature extraction. The global features use regular spatial-invariant convolution, while the local features denote depth-aware convolution, as detailed on the right. The depth-aware convolution uses non-shared kernels in the row-space $k_i$ for $i=1\dots b$, where $b$ denotes the total number of distinct bins. To leverage both variants of features, we form a weighted combination of each output parameter from the parallel paths. }
\caption{Proposed S\&C enhancer. The spatial-wise enhancer is marked with the green box, and the channel-wise enhancer is marked with the red box.}
\caption{Ablation study of our network. The sign \checkmark indicates that the module is added to the framework. The best and the second best records are marked as \textbf{bold} and \color{blue}blue \color{black}.}
\caption{Comparisons to the state-of-the-art approaches on the KITTI on-line benchmark. For all metrics, lower is better. The best and the second best records are marked as \textbf{bold} and \color{blue}blue\color{black}.}
\caption{(a) Critical Josephson current and (b) ground state phase shift $\varphi_0$ as functions of $d_F$ for the AA-type S/AF/S junction with different $m$ in comparison to the critical current for ``m/2+normal+m/2'' and ``m/2+insulator+m/2'' interlayers having the same total magnetic moments. Here $V_R=0.28$. In panel (b) stable (metastable) branches of $\varphi_0$ are plotted by solid (dashed) lines and by big (small) dots. The ground state phase shift $\varphi_0$ for the ``m/2+normal+m/2'' model cannot be distinguished from zero in panel (b); therefore it is not shown by any symbols.}
\caption{$(a)$ Mean-velocity profile and $(b)$ turbulent velocity fluctuations of the reference simulation: \protect \solid, $O950$ without the filtering {(\ref{eq:2.1})}; \protect \dashed, $O950$; \protect \dashdot, DNS at $Re_{\tau}=934$ \cite[]{delAlamo2004}.}
\caption{Normalised second-order statistics of the attached eddies, scaled with the given spanwise length $L_z$: $(a)$ streamwise velocity; $(b)$ wall-normal velocity; $(c)$ spanwise velocity; $(d)$ Reynolds stress. Here, \protect \dashed, from $Re_{\tau} \simeq 950$ ($SL950a$, $SL950b$); \protect \solid, from $Re_{\tau} \simeq 1800$ ($SL1800a$, $SL1800b$, $SL1800c$).}
\caption{Time correlation functions of $(a,c)$ $O950$ and $(b,d)$ $SO950$. In $(a,b)$, \protect \solid, $C_{uu}(\tau)$; \protect \dashed, $C_{vv}(\tau)$; \protect \dashdot, $C_{ww}(\tau)$. In $(c,d)$, \protect \solid, $C_{uv}(\tau)$; \protect \dashed, $C_{uw}(\tau)$; \protect \dashdot, $C_{vw}(\tau)$.}
\caption{Time correlation functions of $(a,c)$ $O950$ and $(b,d)$ $SO950$. In $(a,b)$, \protect \solid, $C_{00}(\tau)$; \protect \dashed, $C_{11}(\tau)$; \protect \dashdot, $C_{01}(\tau)$. In $(c,d)$, \protect \solid, $C_{0u}(\tau)$; \protect \dashed, $C_{0v}(\tau)$; \protect \dashdot, $C_{1v}(\tau)$.}
\caption{Auto-correlation functions: $(a,b)$ $C_{uu}(\tau)$; $(c,d)$ $C_{vv}(\tau)$; $(e,f)$ $C_{11}(\tau)$. In $(a,c,e)$, \protect \solid $L950a$; \protect \dashed $L950b$; \protect \dashdot $L1800a$; \protect \dashdotdot $L1800b$; \protect \dotted $L1800c$, while, in $(b,d,f)$, \protect \solid $SL950a$; \protect \dashed $SL950b$; \protect \dashdot $SL1800a$; \protect \dashdotdot $SL1800b$; \protect \dotted $SL1800c$.}
\caption{Cross-correlation functions: $(a,b)$ $C_{uv}(\tau)$; $(c,d)$ $C_{vw}(\tau)$; $(e,f)$ $C_{1v}(\tau)$. In $(a,c,e)$, \protect \solid $L950a$; \protect \dashed $L950b$; \protect \dashdot $L1800a$; \protect \dashdotdot $L1800b$; \protect \dotted $L1800c$, while, in $(b,d,f)$, \protect \solid $SL950a$; \protect \dashed $SL950b$; \protect \dashdot $SL1800a$; \protect \dashdotdot $SL1800b$; \protect \dotted $SL1800c$.}
\caption{Time correlation functions of $(a)$ $O950$ without (\ref{eq:2.1}) and $(b)$ $O950$. Here, \protect \solid, $C_{uu}(\tau)$; \protect \dashed, $C_{vv}(\tau)$; \protect \dashdot, $C_{uv}(\tau)$; \protect \dashdotdot, $C_{vw}(\tau)$.}
\caption{Comparison of the second-order statistics of the self-sustaining attached eddies in the minimal unit ($SL1800c$ in the present study) with those in the long streamwise domain \cite[$L1800c$ in][]{Hwang2015}: $(a)$ the streamwise, wall-normal and spanwise velocities; $(b)$ Reynolds stress. Here, \protect \solid, $SL1800c$ in the present study; \protect \dashed, $L1800c$ in \cite{Hwang2015}.}
\caption{Average ratings \myblue{($\pm$ confidence intervals), in 1 to 5 stars $\bigstar$, for the groups of Low-QoS and High-QoS videos}}
\caption{Results for \textcolor{blueCT}{Completion Time} (sec) and \textcolor{orangeER}{Error Rate} (in \%) for each task \nc{in \autoref{fig:tasks}}. \nc{In each cell (task), mean values per visualization are shown on the left and means of pairwise differences on the right}. Error bars represent 95\% bootstrap confidence intervals. Gray rectangles indicate the direction of our hypotheses. \nc{Evidence of differences is marked with a \protect\starEvidence\ (the further away from 0 and the tighter the CI, the stronger the evidence).}}
\caption{\label{tab:searchesexp}References to existing searches for two-body resonances, where one decay product is from the first column and one is from the first row. Only the most recent searches are considered. The box $\text{BSM}\rightarrow \text{SM$_1$}\times\text{SM$_2$ }$ represents cases where the primary resonance decays to a BSM particle, which itself decays into two SM particles that are not the same. \colorbox{blue!20}{Colored cells} indicate searches that were covered by $\sqrt{s}=8$ TeV searches reported in Ref.~\cite{Craig:2016rqv}.}
\caption{{\color{blue} (a) Electronic bandstructure of the ion-gated, hydrogen-terminated (111) $1\times1$ diamond surface for $n_{2D}=6.00\cdot10^{14}\,\mathrm{cm^{-2}}$. The inset shows the first Brillouin zone and the Fermi surface. (b) Electron-phonon coupling constants $\lambda$ (filled symbols, left axis) and averaged logarithmic frequencies $\omega_{\rm log}$ (hollow symbols, right axis) as a function of hole doping $n_{2D}$ computed at $\bf{q}=\bf{\Gamma}$ (filled and hollow circles) and through a Wannier interpolation scheme (filled and hollow hexagons). Data in (a,b) are adapted from Ref.\onlinecite{RomaninApSuSc2019}.} }
\caption{Example of the measured traces of stored $^{142}$Pm$^{60+}$ and $^{142}$Nd$^{60+}$ ions in \red{the} ESR. More than six $^{142}$Pm$^{60+}$ ions were stored in this example. A single EC-decay daughter ion, $^{142}$Nd$^{60+}$, is also present from the beginning in this representation of frequency versus time after injection. The vertical scale is zoomed on the first 10 seconds of the measurement to illustrate the duration of stochastic cooling. The injection of ions into \red{the} ESR occurs at 0 seconds. The stochastic cooling is operated from 0 to about 4.5 seconds. The electron cooling is operated all the time at unchanged parameters without interruptions. Cooling of individual ions by the cooler electrons is clearly seen from 4.5 seconds up to about 6 seconds. \label{stochastic}}
\caption{Example of the measured EC decays of stored and cooled $^{142}$Pm$^{60+}$ ions. The vertical axis is zoomed from the measured range of $0$--$70$ seconds. Two EC decays are clearly seen at about 58 and 63 seconds after the injection of the ions into \red{the} ESR. As has been shown in \cite{Kienle2013}, neutrinos are emitted isotropically, indicating that the stored $^{142}$Pm$^{60+}$ ions are unpolarised. The longitudinal component of the recoil (due to the emitted neutrino) of the daughter ion is reflected by the difference between the frequency at which the daughter ion appears after the decay and its frequency when cooled by the electrons. The tail at 58 seconds shows that the recoiling ion was slowed down by the electrons (to smaller revolution frequencies), which means that the neutrino was emitted in the direction opposite to the ion motion. The opposite holds for the tail at 63 seconds. The disappearance of both ion species at 64 seconds is due to the kicker magnet pulse applied for safe and controlled emptying of \red{the} ESR prior to the injection of newly produced $^{142}$Pm$^{60+}$ ions. \label{trace}}
\caption{\label{tabsum} \red{ Summary of key characteristics of the three experiments.}}
\caption{\label{tab:evidence} Probability of the model with oscillation ($P(M_2)$), parameter values corresponding to the maximum of the likelihood functions and their 95\% confidence intervals (CI) from the analysis with the Nested\_fit program. The considered range for $\omega$ is $[0,7]$ rad s$^{-1}$.}
\caption{\label{tab:evidence} Probability of the model with oscillation ($P(M_2)$), parameter values corresponding to the maximum of the likelihood functions and their 95\% confidence intervals (CI) from the analysis of all data sets at the same time with the Nested\_fit program. The considered range for $\omega$ is $[0,7]$ rad s$^{-1}$.}
\caption{Example of the measured traces of stored $^{142}$Pm$^{60+}$ and $^{142}$Nd$^{60+}$ ions in \red{the} ESR. The entire frequency bandwidth measured by the RSA 5103A as well as the entire time range are shown. The traces of $^{142}$Pm$^{60+}$ and $^{142}$Nd$^{60+}$ ions are clearly visible. Two uncooled particles (changing revolution frequency with time) are observed. The origin of these ions is not known. Since these particles are not cooled, their frequency does not correspond to their mass-to-charge ratio and thus their unambiguous identification is not possible. \label{travellers}}
\caption{Impact of user personas. HB median CPMs (USD) across our 16 personas and control persona for the top five bidders (AppNexus, Rubicon, IX, OpenX, and PubMatic) and associated weighted Avg. and Std. among categories and bidders. Bid prices exceeding $\pm\sigma$ among categories are denoted with \boldred{$^\uparrow$} or \boldred{$^\downarrow$}. Bid prices exceeding $\pm\sigma$ among bidders are denoted with $^\uparrow$ or $^\downarrow$. Avg. and Std. among \emph{persona} weighted averages are in \boldred{bold red}. Avg. and Std. among \emph{bidder} weighted averages are in \textbf{bold black}.}
\caption{Impact of showing intent. Cells indicate the ratio of median bid values for personas showing intent vs. personas showing no intent for the top five bidders (AppNexus, Rubicon, IX, OpenX, and PubMatic) and associated weighted Avg. and Std. Ratios exceeding $\pm\sigma$ among categories are denoted with \boldred{$^\uparrow$} or \boldred{$^\downarrow$}. Ratios exceeding $\pm\sigma$ among bidders are denoted with $^\uparrow$ or $^\downarrow$. Avg. and Std. among \emph{persona} weighted averages are in \boldred{bold red}. Avg. and Std. among \emph{bidder} weighted averages are in \textbf{bold black}.}
\caption{Impact of user personas on winning bids. HB median CPMs (USD) across our 16 personas and control persona for the top five bidders (AppNexus, Rubicon, IX, OpenX, and PubMatic) and associated weighted Avg. and Std. among categories and bidders. Bid prices exceeding $\pm\sigma$ among categories are denoted with \boldred{$^\uparrow$} or \boldred{$^\downarrow$}. Bid prices exceeding $\pm\sigma$ among bidders are denoted with $^\uparrow$ or $^\downarrow$. Avg. and Std. among \emph{persona} weighted averages are in \boldred{bold red}. Avg. and Std. among \emph{bidder} weighted averages are in \textbf{bold black}.}
\caption{Average running time on different benchmarks. The best and second-best results are highlighted in \textcolor{red}{red} and \textcolor{blue}{blue}, respectively.}
\caption{\label{fig:236U} Energy levels of $^{236}$U as a function of $J(J+1)$ for all experimentally observed states below 1200~keV. Positive and negative parities are indicated by {\color{blue} \textbf{+}} and {\color{red} \textbf{ --} }, respectively. Straight lines indicate rotational bands with $E=E_0+J(J+1)/2{\Theta}$.\\ }
\caption{\label{fig:237Np} Energy levels of $^{237}$Np as a function of $J(J+1)$ for all experimentally observed states below 800~keV. Positive and negative parities are indicated by {\color{blue} \bf +} and {\color{red} \bf -- }, respectively. Straight lines indicate rotational bands with $E=E_0+J(J+1)/2{\Theta}$. }
\caption{\label{fig:152Sm} Energy levels of $^{152}$Sm as a function of $J(J+1)$ for experimentally observed states below 2000~keV. Positive and negative parities are indicated by {\color{blue} \bf +} and {\color{red} \bf -- }, respectively. Straight lines indicate rotational bands with $E=E_0+J(J+1)/2{\Theta}$. }
\caption{\label{fig:153Eu} Energy levels of $^{153}$Eu as a function of $J(J+1)$ for experimentally observed states below 1600~keV. States with ($E>700$~keV, $J<9/2$) are not included as they do not show rotational behavior. Positive and negative parities are indicated by {\color{blue} \bf +} and {\color{red} \bf -- }, respectively. Straight lines indicate rotational bands with $E=E_0+J(J+1)/2{\Theta}$. }
\caption{\label{fig:234U} Energy levels of $^{234}$U as a function of $J(J+1)$ for experimentally observed states below 1200~keV. Positive and negative parities are indicated by {\color{blue} \bf +} and {\color{red} \bf -- }, respectively. Straight lines indicate rotational bands with $E=E_0+J(J+1)/2{\Theta}$. }
\caption{\label{fig:235U} Energy levels of $^{235}$U as a function of $J(J+1)$ for experimentally observed states below 800~keV. Positive and negative parities are indicated by {\color{blue} \bf +} and {\color{red} \bf -- }, respectively. Straight lines indicate rotational bands with $E=E_0+J(J+1)/2{\Theta}$. }
\caption{\label{fig:226Ra} Energy levels of $^{226}$Ra as a function of $J(J+1)$ for experimentally identified states. Positive and negative parities are indicated by {\color{blue} \bf +} and {\color{red} \bf -- }, respectively. Straight lines indicate rotational bands with $E=E_0+J(J+1)/2{\Theta}$. }
\caption{\label{fig:227Ac} Energy levels of $^{227}$Ac as a function of $J(J+1)$ for experimentally observed states. Positive and negative parities are indicated by {\color{blue} \bf +} and {\color{red} \bf -- }, respectively. Straight lines indicate rotational bands with $E=E_0+J(J+1)/2{\Theta}$. }
\caption{Quantitative comparison of MAE, F-measure and S-measure with 15 methods on 5 datasets. A higher F-measure score, higher S-measure score and lower MAE score represent better performance. The top three results are highlighted in {\color{red}red}, {\color{green}green} and {\color{blue}blue}, respectively.}
\caption{(a) Traffic counts derived from induction loop detectors after transfer to the simplified network. Only 36\% of the links have non-zero values. (b) Bluetooth LOD flows for one specific OD pair (Brisbane CBD (upper \textcolor{blue}{$\blacksquare$}) to Moorooka (lower \textcolor{blue}{$\blacksquare$}). Colour and width of the roads represent traffic volume.}
\caption{(a) Brisbane Traffic Counts on the simplified network for the area of study. (b) Origin-Destination flows of vehicles using the Bradfield Highway Bridge (North to South direction). The bridge is the road highlighted in magenta. Width and colour of the semi-ellipses are alternatively used to indicate OD volumes. Only the twenty largest OD volumes are shown. (c) Road volumes for vehicles trips from CBD (upper \textcolor{blue}{$\blacksquare$}) to Moorooka (lower \textcolor{blue}{$\blacksquare$}) in Brisbane. Some isolated links have non-zero traffic flows: These artefacts stem from the term $f_{TC}$ that favours non-zero flows on links where observed traffic counts are high. A stronger weight on the Kirchoff's law term in the objective function would remove such artefacts at the expense of other objectives.}
\caption{(\textbf{a-b}) Top-1 accuracy under black-box and white-box adversarial attacks, respectively, as a function of the total perturbation size. The attacks were carried out with a 10-step PGD algorithm. The $0$ perturbation size corresponds to the clean images. (\textbf{c}) Top-1 accuracy under white-box adversarial attacks as a function of the number of PGD steps, fixing the total perturbation size to $\epsilon=0.06$. For comparison, the performance of a state-of-the-art, adversarially trained ResNet-152 model \citep{xie2018} is also included in the figure (labeled {\color{green}feature denoising}). Unlike that of the ResNeXt WSL models, the robustness of the adversarially trained model remains much more stable as the number of PGD steps is increased.}
\caption{Shape-texture cue conflict images from \citet{geirhos2019}. The {\color{blue}shape} and {\color{red}texture} categories for each image are indicated above the image. The unrestricted top 5 predictions of the \texttt{resnext101\_32x48d\_wsl} model for each image are shown below the image, with approximate shape matches highlighted in blue and approximate texture matches highlighted in red. The shape and texture matches were evaluated manually by the author for this figure only. As described in detail in the \textit{Methods} section, evaluation of the shape bias scores reported in Table~\ref{shape_bias_table} followed a somewhat different procedure and was done in the same way as in \citet{geirhos2019}.}
\caption{\color{dblue} \small{\em Left:} The survey speed and angular resolution of the DSA-2000 compared with other operational (normal text) or upcoming (bold font) telescopes operating at 1.4\,GHz. {\em Right:} The system-equivalent flux density (SEFD; inversely proportional to sensitivity) of the DSA-2000 and other steerable radio telescopes. Sensitivities are shown in the specific frequency ranges accessed by each telescope. %{\small AO-7: 7-beam Arecibo multibeam receiver. AO-40: 40-beam phased array feed for Arecibo. ASKAP: Australian SKA Pathfinder. DSA-110: 110-dish Deep Synoptic Array. FAST-19: 19-beam FAST multibeam receiver. GBT: Green Bank Telescope. GBT-FLAG: Focal L-band Array for the GBT. JVLA: Jansky Very Large Array. Parkes-MB: 13-beam Parkes multibeam receiver. ngVLA: next generation VLA.} }
\caption{\color{dblue} \small{\em Left:} Detection horizons of the NVSS, VLASS and DSA-2000 surveys to star-forming galaxies and low-luminosity AGN. We consider the typical radio luminosities of 10\,$M_{\odot}$\,yr$^{-1}$ galaxies on the left, and $10^{7}M_{\odot}$ AGN accreting at $0.02L_{\rm Edd}$ on the right. The DSA-2000 Cadenced All-Sky Survey, when complete, will detect these systems to redshifts $z=1.1$, probing a $2500\times$ larger volume than the final VLASS data set. Background image credit: Illustris Simulation. {\em Right:} Predicted redshift distribution of sources ($>3\,\mu$Jy) in the DSA-2000 Cadenced All-Sky Survey stack (Wilman et al. 2008). The billion sources detected by DSA-2000 will be dominated by star-forming galaxies (blue) at all redshifts.}
\caption{\color{dblue} \small{\em Left:} The number of MSPs in $3\pi$ str of the sky above given timing precisions, observed with the DSA-2000 in 3600\,s (credit: the NANOGrav Collaboration). {\em Right:} The detection horizon (left black axis; green shading) of the DSA-2000 to radio emission associated with mildly relativistic ejecta from binary neutron star mergers, for different ISM densities (Nakar \& Piran 2011). No relativistic jet is included; this would significantly increase the horizon. The GW detection horizons of the current aLIGO / Virgo and the expected five-detector configuration in the 2020s are shown as horizontal lines. The inferred ISM density surrounding the binary neutron star merger GW\,170817 is shown as an error range (Mooley et al. 2018), and the blue trace / right blue axis shows the timescale of the radio transient emission.}
\caption{\color{dblue} \small Summary of the scientific potential of large numbers of interferometrically localized FRBs. The DSA-2000 is expected to detect and localize $\sim10^{4}$ FRBs per year of operations. Figure credits: Illustris Simulation.}
\caption{\color{dblue} \small The DSA-2000 will consist of $2000 \times 5$\,m fully automated, steerable dishes distributed across 15\,km in a configuration optimized to minimize sidelobes in 15-min integrations. Each antenna is serviced by a quad-ridge horn feed and an ambient temperature LNA. Low cost RFOF modules and a distributed network of optical fiber will transport DSA-2000 data to the Central Processing Building, housing the cross-correlator, beam-former and imaging backend.}
\caption{\color{dblue} \small \textbf{A:} Four of the antennas of the DSA-10 located at the Owens Valley Radio Observatory (OVRO). \textbf{B:} The custom-designed feed attached to an antenna of the DSA-10. \textbf{C:} The front-end enclosure attached to one of the DSA-10 antennas, which houses the front-end electronics. {\em Right:} Noise temperature as a function of temperature for LNAs tested during the DSA-110 design phase. An LNA with a Diramics transistor has been tested with 6~K noise temperature at room temperature and will serve as the platform for the DSA-2000 wideband LNA design.}
\caption{\color{dblue} \small{\em A:} A preliminary configuration for the DSA-2000 spanning a 15\,km diameter area. The final configuration will be optimized for 15-min tracks. {\em B:} A cut through the point spread function for a 15-min track. {\em C:} A sub-image from the SKA Data Challenge adjusted to the DSA-2000 frequency band. {\em D:} An image created from a simulation of a 10-second snapshot with DSA-2000 using the SKA Data Challenge image as the model sky, to demonstrate the capability of DSA-2000 for deconvolution-free imaging. Gaussian noise of 2\,$\mu$Jy has been added to both the original and convolved image to represent the expected noise in a 15-minute track with DSA-2000. All sources from the original image are detected in the DSA-2000 image. {\em E:} An image produced from a simulation of the same data at the location of the brightest source in the SKA Data Challenge, scaled to DSA frequencies ($\sim 320$\,mJy at 1.35\,GHz) and created without visibility-based deconvolution. {\em F:} The simulated DSA-2000 image of the bright source after 5,000 iterations of image plane deconvolution ($<10$\,s on a single core, 64\,GB of RAM) removes all sidelobes with a resulting final dynamic range of $5 \times 10^5$. Note this is for a 10-second snapshot with DSA-2000. The sidelobes for a 15-minute track will be $10\times$ lower, enabling image-plane deconvolution for sources as bright as 10~Jy.}
\caption{Examples of $o_{local}$ and $h_{local}$. In both cases, the projection of the robot state $x_c$ in the workspace has been marked with an \textcolor{red}{$\times$}. \vspace{-.2in}}
\caption{\label{fig4} (a) Measured total number of force peaks as a function of defect fraction, for a compression from $p_{\textrm{ini}}$ to $p=1$. (\protect\markercirc) $p_{\textrm{ini}}=4$ with $ q_{\textrm{ini}}=5$ - two data sets (different colours); (\protect\markersquare) $p_{\textrm{ini}}=3$ with the three rows initially made of 8 - 7 - 8 droplets respectively - two data sets (different colours). (b) Evolution of the normalized excess number of peaks compared to a crystal, with the black dashed line corresponding to Eq.~\ref{renorm}. (c) Theoretical probability distribution of the dimensionless column height $h$, in an aggregate with $q\rightarrow\infty$ and $p=4$, for four defect fractions $\phi = \{0; 0.1; 0.3; 0.5\}$ (see SI). Gaussian curves (blue solid lines) with the same standard deviation, $\sigma$, and average, $\mu$, as the discrete distribution are overlaid as a guide to the eye. Typical radii of large and small droplets are $\approx 22 \ \mu$m and $\approx 18 \ \mu$m (see SI).}
\caption{Theoretical results {\color{purple}WE ARE MISSING THE MODEL CARTOON, SIMILAR TO EILAT POSTER}}
\caption{No restriction, {\color{purple} COULD BE NICE TO HAVE LOG-LOG PLOT FOR FIRST 25 AND HAVE LINEAR FIT TO IT TO SHOW POWER LAW RETENTION } }
\caption{ Transfer learning experiments with KNN. The best and second best results are highlighted with {\bf bold} and \textcolor{blue}{blue} fonts, respectively. \label{tbl:knn}}
\caption{ Transfer learning experiments with fine tuning. The best and second best results are highlighted with {\bf bold} and \textcolor{blue}{blue} fonts, respectively. \label{tbl:fine-tune}}
\caption{Comparison of different transfer learning strategies on 5-shot 5-class action recognition experiments on EPIC. Our attention-based meta learning approach, A-MAML, significantly outperforms other transfer learning strategies. For \emph{KNN} and \emph{FT}, we follow the standard training procedure described in Sec.~\ref{sec:train_transf}. Methods in [1,3,5,6] columns use identical models. For the case of [2,4] columns, methods use models without attention layer. The best and second best results are highlighted with {\bf bold} and \textcolor{blue}{blue} fonts, respectively. %\secref{sec:tm_training}. \label{tbl:meta_learng}}
\caption{Quantitative comparison on the noisy simulated ToF data. Results are evaluated in MAE. The best results are in \textbf{bold} and in {\color{red}red} color. The second best results are \underline{underlined} and in {\color{cyan}cyan} color.}
\caption{Quantitative comparison on real ToF dataset. The errors are calculated as MAE to the measured ground-truth in \texttt{mm}. The best results are in \textbf{bold} and in {\color{red}red} color. The second best results are \underline{underlined} and in {\color{cyan}cyan} color.}
\caption{Comparison of the different lifelong learning strategies in the DeepGlobe data for the sequence: Landcover $\to$ Roads $\to$ Buildings (for AcLL: 84.375\%, 72\%). The legend for the Landcover task is: {\color{green} \textbf{forest}}, {\color{blue} \textbf{water}}, {\color{cyan} \textbf{urban}}, {\color{magenta} \textbf{rangeland}}, {\color{yellow} \textbf{agriculture}}; white shows barren land and black denotes background/unknown.}
\caption{Comparison of the different lifelong learning strategies in the DeepGlobe data for the sequence: Roads $\to$ Landcover $\to$ Buildings (for AcLL: 86.25\%, 92\%). The legend for the Landcover task is: {\color{green} \textbf{forest}}, {\color{blue} \textbf{water}}, {\color{cyan} \textbf{urban}}, {\color{magenta} \textbf{rangeland}}, {\color{yellow} \textbf{agriculture}}; white shows barren land and black denotes background/unknown.}
\caption{Example of input and output data: task defines box score (\ref{sub:box}) used for input and summary document of game (\ref{sub:summary}) used as output. Extracted entities are shown in \textbf{bold face}. Extracted values are shown in {\color[HTML]{2ca02c} green}. }
\caption{Illustrations of static entity embeddings $\boldsymbol{e}$. Players with colored letters are listed in the ranking top 100 players for the 2016-17 NBA season at \url{https://www.washingtonpost.com/graphics/sports/nba-top-100-players-2016/}. Only {\it LeBron James} is in {\color[HTML]{d62728}red} and the other players in top 100 are in {\color[HTML]{1f77b4}blue}. Top-ranked players have similar representations of $\boldsymbol{e}$.}
\caption{Illustrations of dynamic entity embedding $\bar{\boldsymbol{e}}$. Both left and right figures are for \textit{Cleveland Cavaliers} vs. \textit{Detroit Pistons}, on different dates. LeBron James is in {\color[HTML]{d62728} \textbf{red letters}}. {\color[HTML]{ff7f0e}\textbf{Entities with orange symbols}} appeared only in the reference summary. {\color[HTML]{1f77b4}\textbf{Entities with blue symbols}} appeared only in the generated summary. {\color[HTML]{2ca02c}\textbf{Entities with green symbols}} appeared in both the reference and the generated summary. The others are with {\color[HTML]{d62728}\textbf{red symbols}}. $\Box$ represents a player who scored in double digits, and $\Diamond$ represents a player who recorded a double-double. Players with $\triangle$ did not participate in the game. $\circ$ represents other players. }
\caption{Example summaries generated with \citet{puduppully2019data}'s model (left) and our model (right). Names in \textbf{bold face} are salient entities. {\color[HTML]{1f77b4}\textbf{Blue numbers}} are correct relations derived from input data records but are not observed in reference summary. {\color[HTML]{ff7f0e}\textbf{Orange numbers}} are incorrect relations. {\color[HTML]{2ca02c}\textbf{Green numbers}} are correct relations mentioned in reference summary. }
\caption{Schematic of the experiment. (a) Femtosecond laser pulses are generated from a Ti:Sapphire amplifier and stretched to 1.5~ps by using an acousto-optic programmable dispersive filter (Fastlite Dazzler). Each pulse is split into four pulses with equal energy delivered in two separate beams, containing two pulses in each, by a combination of two beam splitters (BSs) and two delay lines. The time delay $\tau_1$ between pulse1 and pulse3 (between pulse2 and pulse4) is controlled by the 1$^{\rm st}$ delay line, and the delay $\tau_2$ between pulse1 and pulse2 (between pulse3 and pulse4) is controlled by the 2$^{\rm nd}$ delay line. The relative polarization $\theta$ between pulse1 and pulse2 (between pulse3 and pulse4) is controlled by rotating polarizer2 and a half-wave plate (HWP). The pulses are then focused by $f=250$~mm lenses, and delivered in a counter-propagating configuration to the array of single atoms trapped in the vacuum chamber. The holographic single-atom tweezer traps are generated by focusing a laser beam tuned to 820~nm, illuminating a liquid-crystal spatial light modulator, by an NA=0.5 objective lens (Mitutoyo G Plan Apo 50X). (b) (left) The spectrum of the laser pulses, which has a 3.8-THz bandwidth centred at $f_L=377.1$~THz, the frequency of the $^{87}$Rb D$_1$ transition. (right) Relevant energy levels and transition frequencies of $^{87}$Rb. The hyperfine levels (qubit states) in $\ket {5S_{1/2}}$, which are degenerate on the time scale of the holonomic gate operation, are magnified. (c) The four-pulse sequence generated in (a). The first pair of pulse1 and pulse2 performs the $U_{\hat x}$ operation and the second pair of pulse3 and pulse4 performs the $U_{\hat n \left(\tau_1\right)}$ operation, where $\hat n$ is the rotation axis explained in the main text. (d) Fluorescence image of the single atoms trapped at positions $x_i$ ($i=1,2,\cdots,5$). 
The big red circle represents the intensity distribution of the ultrafast pulses. {\color{blue}[I think the caption is too lengthy. This is because I tried to keep the original style. We can move the details to the main text. Shall we? (a) How about, delay line $\rightarrow$ delay stage? What is the red box exactly? shutters do not give any useful information. Shall we delete them? (b) $f_1\rightarrow f_L$. I do not think we need a Gaussian fit for the spectrum. $^{87}$Rb D line $\rightarrow$ lines. We do not need $f_1$ or $f_{\rm{hf}}$ in the energy level diagram but just numbers. Shall we delete them? The texts for the hyperfine levels are far too small. (c) The texts for the gate names are too big. Shall we change $2\theta$ to $\Theta$ or just delete the angles? I want to delete $(\tau_1)$ at least.]}}
\caption{\label{fig:both} Spherically (black) and cylindrically (blue) projected Gaussians compared with pure Gaussian kernels (red crosses) centred at positions uniformly sampling the radial axis. %Each projected Gaussian is a product of a red Gaussian and green scaling function. We can observe that the projection takes care of the boundary condition when $r \rightarrow\ 0$. \blue{Each kernel that would extend into the negative domain ($r<0$) gets \enquote{pushed out} such that it zeros out at $r = 0$.}}
\caption{Example results of the proposed segmentation method (MV U-Net) and the baseline models (all trained with \textbf{100\%} training data) together with the ground truth (GT) on a stack of short-axis slices. Representative improvements for cardiac image segmentation can be observed when using the proposed method. For example, in contrast to the baseline models which produce poor results when there are unexpected artefacts on the image (see the region inside the \textcolor{cyan}{cyan ellipse}), the proposed method is able to properly identify the correct contours.}
\caption{Example sequence with our temporally consistent estimation in \textcolor{nicegreen}{green} (long dashes) and the best single frame algorithm in \textcolor{niceyellow}{yellow} (short dashes). Ground truth in white/black. Top three rows: sample frames with horizon lines from the sequence. Bottom row: Horizon offset trajectory over time, best viewed in colour. The temporally consistent estimation is more accurate on average and contains fewer outliers.}
\caption{%\blue{The word "variations" downplays this idea. Why not: "ConvLSTM cells with residual paths (left: proposed; right: na\"ive)."? Florian, please also ad a sentence to the contributions and the summary of contributions.} ConvLSTM with residual paths as described in Sec.~\ref{sec:residual_convlstm}. The $[\cdot,\cdot]$-operator denotes concatenation along the channel axis. \emph{Left}: Our proposed ConvLSTM with residual paths and dense connections. Changes \wrt~a standard ConvLSTM: residual connection from $\mat{X}_t$ to $\mat{Y}_t$ in \textcolor{nicegreen2}{green}; dense connection from $\mat{X}_t$, $\mat{\hat{H}}_t$ and $\mat{H}_{t-1}$ to $\mat{Y}_t$ in \textcolor{niceorange}{orange}; reversal of operation order in \textcolor{niceblue}{blue}. \emph{Right}: Part of a standard ConvLSTM, with a na\"ive implementation of a residual connection in \textcolor{nicepurple}{purple}.}
\caption{Illustration of the prediction trajectories with various social interactions: (a) Group walking. (b) Crossing with group. (c) Person following. (d) Person crossing. (e) Person merging. (f) Group avoidance. Here {\color{red} \textbf{red line}}: observed history; {\color{green} \textbf{green line}}: future ground-truth; {\color{yellow} \textbf{yellow line}}: our multiple prediction results. (Best viewed in color)}
\caption{\textbf{UGC\,2162 \hi\ kinematics:} \textit{Top:} GMRT velocity field from the \textcolor{black}{medium} resolution cube. The two highest column density contours from the medium resolution \hi\ integrated map are shown in black. The PV diagram (panel below) slice PA = 307\degree\ is shown with a grey line. \textit{Middle:} PV diagram from a PA = 307\degree\ slice. Positive positional offsets are to the NW. The blue contours are from the data and the red are from the \textsc{bbarolo} best fit model. \textit{Bottom:} Rotation curve \textcolor{black}{derived from the \textsc{bbarolo} five ring model fit.} }
\caption{\textbf{UGC\,2162 Tully--Fisher relations:} The position of UGC\,2162 is shown with a red square relative to the Baryonic Tully--Fisher Relation (red solid line). Also shown is UGC\,2162's position (black square) relative to the Stellar Tully--Fisher Relation (black solid line). The dashed lines in each colour indicate the 1$\sigma$ scatter in their respective Tully--Fisher Relation. }
\caption{ \textcolor{black}{Mean M$_{200}$ from the two NFW models from section \ref{dis_dyn} } for UGC\,2162 (black circle) and model fits to LITTLE THINGS dwarf galaxies from Oh et al. (2015) (red crosses) vs. M$_{gas}$. % Error bars for UGC\,2162 show the range \textcolor{red}{ $\times$ 5.0 $\times $10$^{10}$ \msolar\ to 8.8 $\times$} 10$^{10}$ \msolar. }
\caption{\textbf{UGC\,2162 \hi--stellar correlation near optical center:} integrated \hi\ contours (high resolution, red) and star forming regions (SDSS g-band in black contours) plotted on \hi\ velocity dispersion (in blue/cyan) map.}
\caption{Invariant mass distribution for dielectrons, $d\sigma/dM_{e^+e^-}$\textcolor{red}{,} produced in the reaction $\pi^- p\rightarrow e^+e^- n (\gamma)$. The solid curve corresponds to our total calculation including the exponential form factor in the form of Eq.~\ref{def:exponFF} at $R=$ 1.6 (GeV/c)$^{-1}$; the open circles correspond to the same calculation but at $R=$ 0 and the crosses are our total calculation at $R=$ 3 (GeV/c)$^{-1}$. The separate contributions to this spectrum are presented by the blue long dashed line (the channel $\pi^- p\rightarrow e^+e^- n$), the green long dashed-dotted line (the channel $\pi^- p\rightarrow n\pi^0\rightarrow n e^+e^-\gamma$), the red short dashed-dotted curve (the channel $\pi^- p\rightarrow n\eta^0\rightarrow n e^+e^-\gamma$, Fig.~(\ref{Fig.3} bottom)) and the red dotted line (the channel $\pi^- p\rightarrow n\rho^0\rightarrow n e^+e^-$, Fig.~(\ref{Fig.3} top)). }
\caption{Invariant mass distribution for dielectrons, $d\sigma/dM_{e^+e^-}$\textcolor{red}{,} produced in the reaction $\pi^- p\rightarrow e^+e^- n$ calculated by including the form factor $FF$ in the form of Eq.~\ref{def:exponFF} at different values of parameter $R$. }
\caption{\textbf{Visualization of task domains} $\mathcal{S}_{\rm cls}$ and $\mathcal{S}_{\rm loc}$ using \mbox{t-SNE}. Given a single clean image $\x$, each dot in the picture represents one adversarial example generated by solving Eqn.(\ref{eq:domain}) starting from a random point within the $\epsilon$-ball around $\x$. Different colors encode the task losses used for generating adversarial examples (\textcolor{red}{red}: $\lc$, \textcolor{blue}{blue}: $\ll$). Therefore, the samples form empirical images of the corresponding task domains. It is observed that the two task domains have both overlaps and distinctive regions.}
\caption{Example of a document and corresponding action sequence. We mark in \textcolor{red}{\it red} the MT words that have been corrected and in \textcolor{blue}{\bf blue} their replacement. The actions used here were {\tt W} (wait), {\tt JSF} (jump sentence forward), {\tt JF} (jump forward), {\tt D} (delete), {\tt MC} (mouse clicks), {\tt MS} (mouse selections), {\tt JB} (jump back), {\tt R} (replace) and {\tt S} (stop).}
\caption{{\blue The proposed crossmodal Emotion emBedding (EmoBed) framework for monomodal emotion recognition.}}
\caption{\blue Performance comparison in terms of CCC for the {\bf arousal} prediction among the proposed EmoBed systems, related baselines, and other state-of-the-art systems. These results pertain to the experiments conducted on the \textit{dev}elopment and \textit{test} partitions of the RECOLA database. Three feature sets (audio-eGeMAPS, video-appearance, and video-geometric) were employed to evaluate all approaches. Four monomodal scenarios are considered: audio (+video-app.), video-app. (+audio), audio (+video-geo.), video-geo. (+audio), where the modalities in the parentheses are the employed auxiliary modalities. The cases where EmoBed systems achieve a statistically significant performance improvement over the classic monomodal systems are marked by the ``$\star$'' symbol. MTL: multi-task learning; DDAT: dynamic difficulty awareness training; RE: reconstruction error; PU: perception uncertainty.}
\caption{\blue Performance comparison in terms of CCC for the {\bf valence} prediction among the proposed EmoBed systems, related baselines, and other state-of-the-art systems. These results pertain to the experiments conducted on the \textit{dev}elopment and \textit{test} partitions of the RECOLA database. Three feature sets (audio-eGeMAPS, video-appearance, and video-geometric) were employed to evaluate all approaches. Four monomodal scenarios are considered: audio (+video-app.), video-app. (+audio), audio (+video-geo.), video-geo. (+audio), where the modalities in the parentheses are the employed auxiliary modalities. The cases where EmoBed systems achieve a statistically significant performance improvement over the classic monomodal systems are marked by the ``$\star$'' symbol. MTL: multi-task learning; DDAT: dynamic difficulty awareness training; RE: reconstruction error; PU: perception uncertainty.}
\caption{\blue Visualisation of the learnt representations of the development set of the {\bf RECOLA} database when using the proposed EmoBed systems or the classic monomodal systems. Red, green, and yellow markers: representations from audio (eGeMAPS), video (appearance), and video (geometric) modalities; circle and cross markers: high and low arousal/valence.}
\caption{\blue Visualisation of the learnt representations of the development set of the {\bf OMG-Emotion} database when using the proposed EmoBed systems or the classic monomodal systems. Red and green markers: representations from audio and video modalities; circle and cross markers: neutral and sad categories.}
\caption{\blue Impact of the joint auxiliary modality loss on the {\em joint audiovisual training} systems (a), and impact of the crossmodal triplet loss on the {\em crossmodal triplet training} systems (b), with the {\bf RECOLA} database for {\bf arousal} regression. The best-performing $\alpha$ or $\beta$ is indicated in each case.}
\caption{\blue Impact of the joint auxiliary modality loss on the {\em joint audiovisual training} systems (a), and impact of the crossmodal triplet loss on the {\em crossmodal triplet training} systems (b), with the {\bf RECOLA} database for {\bf valence} regression. The best-performing $\alpha$ or $\beta$ is indicated in each case.}
\caption{\blue Impact of the joint auxiliary modality loss on the {\em joint audiovisual training} systems (a), and impact of the crossmodal triplet loss on the {\em crossmodal triplet training} systems (b), with the {\bf OMG-Emotion} database. The best-performing $\alpha$ or $\beta$ is indicated in each case.}
\caption{Attribute-based PR/SR scores (\%) on the RGBT234 dataset against eight RGBT trackers. The best and second-best results are in \textcolor{red}{red} and \textcolor{green}{green}, respectively.}
\caption{Evaluation on DTB~\cite{li2017visual} by Distance Precision (DP), Overlap Precision (OP) and Area-Under-the-Curve (AUC). The \textcolor{red}{first}, \textcolor{blue}{second} and \textcolor{green}{third} best scores are highlighted in color.}
\caption{\label{eq:greenai_equation} The equation of \redai: The cost of an AI ($R$)esult grows linearly with the cost of processing a single ($E$)xample, the size of the training ($D$)ataset and the number of ($H$)yperparameter experiments.}
\caption{Examples that have \textit{dataset-specific} bias detected by \algoname (marked with \xmark). The words that include (dataset-specific) polarity bias (\S\ref{sec:filtering}) are highlighted ({\sethlcolor{teagreen}\hl{positive}} and {\sethlcolor{piggypink}\hl{negative}}). For comparison, we show examples selected from \datasetdeb (marked with \cmark).}
\caption{Monte Carlo representation of the two-dimensional harmonic oscillator energy eigenstate $\Psi_{2,1}$. \BW{Bright (dark)}\color{Red (blue)} dots correspond to positive (negative) values of the wave function. Horizontal and vertical nodal lines of the function can be recognized as areas without any visible dots. As expected, the state $\Psi_{2,1}$ has two nodal lines intersecting with the $q_1$ axis and one intersecting with the $q_2$ axis.}
\caption{Monte Carlo representation of the ground state of a quantum harmonic chain. The wave function has been chosen real-valued and positive, thus all the lines are \BW{brighter than the background}\color{of the same color}. The \BW{brightest} lines \color{in deepest red }correspond to those $\vec q$ where the wave function assumes the highest values. Not surprisingly, in the ground state these are the lines close to $\vec q = 0$. Actually, a straight horizontal line corresponding exactly to $\vec q = 0$ would be drawn in the \BW{brightest}\color{deepest} color since it represents the global maximum of the ground state wave function. But due to the random nature of the Monte Carlo approach this line happens not to be drawn in the chart.}
\caption{Monte Carlo representation of the first excited state of the harmonic chain with wave number zero. In the most prominent configurations of the chain, i.e., those corresponding to \BW{very bright}\color{deeply red} or \BW{very dark}\color{deeply blue} lines, the $q_n$'s tend to be either jointly positive or all negative. Thus, if we could perform a quantum measurement of the exact position of the harmonic chain, we would most likely either find it entirely shifted towards the positive $q$-direction or entirely the other way. In the particle interpretation of a quantum field this state would be called `one particle at rest'.}
\caption{The second excited state of the harmonic chain with wave number zero is similar to the first one shown in Fig. \ref{OnePartRest}. But where the first excited state has only one `nodal line' at $\vec q = 0$ (or, much more precisely, one nodal hypersurface which contains the point $\vec q = 0$), we can now recognize two such `nodal lines' where the graph turns from being mostly \BW{bright}\color{red} to mostly \BW{dark}\color{blue} and back again. In the particle interpretation of a quantum field, this state would be called `two particles at rest'.}
\caption{A state with two particles, one of them localized at $n=3$ and the other at $n=8$. While the peaks in excitation at those two locations are clearly visible, the detailed structure of the plot may seem chaotic at first glance. A closer inspection reveals a pattern in the \BW{white and black}\color{red and blue} lines which becomes much more evident in Fig. \ref{TwoPartLocalized-small-dist} below.}
\caption{ DeepCABAC binarization of neural networks. It encodes each weight element by performing the following steps: 1) A bit named \textit{sigflag} is encoded, which determines whether the weight is a significant element or not (in other words, whether it is 0 or not). 2) If it is not 0, then the sign bit, \textit{signflag}, is encoded. 3) Subsequently, a series of bits are encoded, which indicate if the weight value is greater than or equal to $1, 2, ..., n \in \mathbb{N}$ (the so-called \textit{AbsGr(n)Flag}). 4) Finally, the remainder is encoded. The gray bits (also named regular bins) represent bits that are encoded using an arithmetic coder according to a context model. The other bits, the so-called bypass bins, are encoded in fixed-point form. For instance, in the above diagram $n=1$, and therefore $1 \rightarrow$ \textcolor{gray}{100} , $-4\rightarrow$ \textcolor{gray}{11110}\textcolor{blue}{1} or $7\rightarrow$ \textcolor{gray}{101110}\textcolor{blue}{10} .}
\caption{Examples of transforming WNLI to {\wsc} format. Note that the text highlighted in \textcolor{brown}{brown} is the longest common substring to the left of the pronoun \textit{it}, and the text highlighted in \textcolor{violet}{violet} is the longest common substring to its right.}
\caption{(a) The dependence of the normalized torque in terms of the Nusselt number $Nu_\omega$ on the shear Reynolds number $Re_S$ for pure inner cylinder rotation $\mu=0$ in a range of $4.5\times 10^3 \leq Re_S \leq 1.2\times 10^5$. Different filled blue symbols represent experimental data for different working fluids. Yellow open symbols denote numerical results. (b) Dependence of the Nusselt number compensated by $Re_S^{-0.65}$ as a function of the shear Reynolds number. A change in the dependence appears at approximately $Re_{S,crit,1} \approx 1.3 \times 10^4$, which is marked by a dashed line. (c) Corresponding local scaling exponent $\alpha$ as a function of the shear Reynolds number calculated for a bin size of $\Delta_{10}(Re_S)=0.5$. A transitional behavior is visible in the region $1.3\times 10^4 \leq Re_S \leq 4 \times 10^4$. The dotted line indicates the end of this transition at $Re_{S,crit,2}\approx 4 \times 10^4$, \red{where $\alpha$ starts to monotonically increase.}}
\caption{Airdrop strategies (9) that serve as \textcolor{red!50!black}{upper}- and \textcolor{green!50!black}{lower}-bound in the analysis. BS stands for \emph{Batch Size}. Batch size steps are always \num{100}.}
\caption{Model-generated and reference summaries used for human evaluation. Words in \textcolor{orangered}{orange} correspond to incorrect or repeated information.\label{fig:erroranalysis-xsum}}
\caption{\textbf{Quantitative comparison on feature matching and pose estimation.} (\textbf{\textcolor{red}{Red}} : Best, \textcolor{blue}{Blue} : Runner-up).}
\caption{ The CerebroVis Dashboard with categorical coloring to differentiate arteries. Left: A cerebral artery scan with a stenosis in the MCA~\colorSquare{colorScaleMCAR}. Right: Users can click on an artery mark in CerebroVis and the corresponding mark is highlighted in the 2D projection. This feature allows users to validate the stenosis with the underlying geometry and plan for therapeutic surgery.}
\caption{ Left: Examples of an abnormally narrow artery (``stenosis'') and wide artery (``aneurysm''), with the relevant branches colored red \colorSquare{colorScaleRed}. Right: Example of a blood flow color encoding with a blockage disrupting normal flow of blood with blood flow colored on a scale between \colorSquareBorder{colorScaleMin} and \colorSquare{colorScaleMax}.}
\caption{Recall at top 1, 5, and 10 retrieval. Median rank \textbf{$\widetilde{r}$} on a verse-to-verse retrieval task is also provided. %Original implementation by \cite{Harwath18_interlingua}. Chance recalls are 0.001 (R@1), 0.006 (R@5) and 0.012 (R@10). Chance median $\widetilde{r}$ is 408.5.\textcolor{red}{The displayed scores are validation scores; test scores are still being computed.}}
\caption{Additional \aastex\symbols}
\caption{Electric fields and photonic propagation for an SDR without NPs (a) versus with one gold NP (b). \blue{The inset in (b) shows the propagation mode for the NP.} Dimensions in $\mu$m. \label{fig:field} }
\caption{Transmission plots for different peo-PUF tokens: (a, b) SDR token with radius 6~$\mu$m, (c, d) SDR token with radius 5~$\mu$m. The SDR's height/thickness is 180~nm for both setups. \blue{Both SDR tokens are independently simulated for three variations of one gold NP that differ in NP radius. Without loss of generality, the considered NP radii are 60~nm, 120~nm, and 240~nm.} For fair comparison, the NPs are always placed at the same location on the SDR. \label{fig:transmission} }
\caption{Inter-FHD for five NPs. \blue{The underlying spectra are provided in Fig.~\ref{fig:transmission_5}.} \label{fig:FHD_5_NPs} }
\caption{\blue{Transmission plots for different peo-PUF tokens with five NPs. The red, blue, and green curves correspond to the setups \textit{Si5umAu60nm(5)}, \textit{Si5umTiN60nm(5)}, and \textit{Si5umTiN(5)}, respectively, as described in Table~\ref{tab:configs}. The inset in (b) shows the spatial arrangement of NPs for \textit{Si5umTiN60nm(5)}, with NPs labeled as ``metal disk.''} \label{fig:transmission_5} }
\caption{(Color online) (a) Variation of room-temperature saturation magnetization (M$_S$) of the extracted ferromagnetic part with the corresponding sample hexagonal phase fraction, (b) variation as well as switching of the room-temperature remanent polarization {\color{red}{(dP$_r$)}} as obtained from PUND measurements with the corresponding tetragonal phase fraction (\%), (c) variation of remanent polarization per unit tetragonal phase fraction (\%) with the corresponding sample tetragonality ({\it{c/a}}), and (d) change in remanent polarization of the samples with their corresponding Goldschmidt tolerance factor; the inset shows how the tetragonal phase fraction (\%) varies with the tolerance factor.}
\caption{Illustration of MetricsVis \textcolor{firstround}{system} diagram with three modules: data processing, views, and visual analytical task categories.}
\caption{MetricsVis overview: The \textcolor{firstround}{priority adjustment view} (2) encodes the crowdsourced crime severity ratings from police officers and citizens (perceived importance of factors); the red dots indicate the currently assigned weights used in the evaluation metrics. The projection view (6) shows the dimensionality reduction results. The group performance view (5) contains three visual representations that show an overview of group performance and the contribution of each member. The performance matrix view (3) displays the individual employee performance with employees in columns and job types in rows (here, employees are sorted based on their group first and then their total performance scores). The control panel shows the filters (1) and grouping method (4) applied in use case 1. }
\caption{A sample row in \textcolor{firstround}{priority adjustment view}: designed for law enforcement agencies.}
\caption{Photo-realistic rendering vs. real-world decoration. We encourage readers to guess which column corresponds to real-world decoration. The answer is in the footnote\color{red}$^1$\color{black}.\label{fig:rendering_vs_real}}
\caption{ Example of a miniature effect of focusing optics with a mask of the Chinese/Japanese character ``\includegraphics[width=1.0em]{figs/ue.eps}''. The top panel shows a schematic view, and the others, from the second to the bottom panel, show the photon maps at the mouth of the mirror assembly ($h=+5600$ mm), and at $h=+500$ mm, $+250$ mm, 0 mm and $-500$ mm from the focal length. The 40 mm square is overlaid for reference. }
\caption{In our immersive Space-Time Cube environment, all actions are implemented through intuitive mid-air gestures, such as grabbing (left), stretching (center left), and tapping (center right), or by tangible interaction with controls on the desk's surface (right). \highlight{Since time advances downwards, the trajectories show the movement history up to the point currently crossing the map. Blue hand contours added for clarity.}}
\caption{Different tasks being performed in the Immersive condition with Dense data: comparisons of instant distance (left), stop durations (center left), movement speeds (center right) and event locations (right). \highlight{Blue hand contours added for clarity.}}
\caption{Times were similar in both conditions, with the exception of the simplest task. Success rates were generally similar across scenarios and conditions, except for T5, which became more difficult for Dense data in Desktop, but not in Immersive. \highlight{Error bars indicate standard deviation.}}
\caption{Task workload components of the NASA-TLX questionnaire under Desktop and Immersive. Mental Workload and Frustration were significantly lower in the latter. \highlight{Error bars indicate standard deviation.}}
\caption{Likert-scale agreements to different assertions \highlight{ranging from strongly disagree (dark red) to strongly agree (dark green).} Participants found it significantly easier to find information and interact in Immersive \highlight{(IM) than in Desktop (D),} and also considered it more comfortable.}
\caption{This figure shows our study platform and supported interactions for different visualizations: bar chart (left) and scatterplot (right). Users can \textcolor{resize}{\textbf{(1) resize}}, \textcolor{reposition}{\textbf{(2) reposition}}, and \textcolor{recolor}{\textbf{(3) recolor}} bars and points directly. }
\caption{Two examples demonstrating where models capable of temporal reasoning, TRN and TSM, improve over TSN. The bar charts show the model's scores on the above example with the correct class' score shown in \textcolor{green}{green}.}
\caption[]{Backbone (BB) comparison using 8 segments in both training and testing evaluating top-1/5 accuracy across tasks. S1 denotes the seen test set, and S2 the unseen test set. Cells are coloured on a per column basis: \textcolor{pink}{low} \begin{tikzpicture}%
\pgfplotscolorbardrawstandalone[%
colormap name=PiYG,%
colorbar horizontal,%
colorbar style={%
height=0.18cm,%
width=2cm,%
hide axis,%
}%
]%
\end{tikzpicture} \textcolor{green}{high}. }
\caption{Comparison to state-of-the-art methods on Market-1501. {\color{red}Red} denotes our performance, and {\color{blue}Blue} denotes the best performance reported by existing methods: the same hereinafter.}
\caption{\label{fig:psnr:names} PSNRs of recent state-of-the-art methods for scale factor $\times 4$ on Set5 \cite{Set5:2012} and Urban100 \cite{Urban100:2015}. \textcolor{red}{Red} names represent our proposed models.}
\caption{\label{tab:psnr1}Average PSNR/SSIMs of \textbf{Pre-upsampling} models for scale factor $\times2$, $\times3$ and $\times4$ on datasets Set5, Set14, BSD100 and Urban100. \textcolor{red}{Red} color indicates the best performance and \textcolor{blue}{blue} color indicates the second best performance.}
\caption{\label{tab:psnr2}Average PSNR/SSIMs of \textbf{Post-upsampling} models for scale factor $\times2$, $\times3$ and $\times4$ on datasets Set5, Set14, BSD100, Urban100 and Manga109. \textcolor{red}{Red} color indicates the best performance and \textcolor{blue}{blue} color indicates the second best performance.}
\caption{\label{fig:visual}SR results of ``img016'' and ``img059'' from \textbf{Urban100} with scale factor $\times 4$. \textcolor{red}{Red} indicates the best performance.}
\caption{\label{tab:interaction} \textbf{Future interaction prediction experiment:} Table comparing the performance of \method\ with state-of-the-art algorithms, in terms of mean reciprocal rank (MRR) and recall@10. The {\color{blue!75}best algorithm} in each column is colored {\color{blue!75}blue} and the {\color{blue!20}second best is light blue}. The last two columns show the minimum percentage improvement of \method\ over the method, across all datasets. We see that \method\ outperforms all baselines by at least 20\% in MRR and 14\% in recall@10.}
\caption{\label{tab:churn} \textbf{User state change prediction:} Table comparing the performance in terms of AUC of \method\ with state-of-the-art algorithms. The {\color{blue!75}best algorithm} in each column is colored {\color{blue!75}blue} and the {\color{blue!20}second best is light blue}. \method\ outperforms the baselines by at least 12.63\% on average.}
\caption{\textbf{Robustness of \method:} Figures (a--c) compare the mean reciprocal rank (MRR) of \method\ with baselines on the interaction prediction task, by varying the training data size. Figure (d) shows the AUC of the user state change prediction task by varying the training data size. We see that \method\ consistently has the highest scores.\label{fig:interaction}}
\caption{\textbf{Robustness to dynamic embedding size:} The performance of \method\ is stable with the change in dynamic embedding size, for the task of interaction prediction on the LastFM dataset. Please refer to the legend in Figure 5.\label{fig:embed} }
\caption{\label{fig:edges}\textcolor{gray}{(Color online)} Band edges of InAs as a function of strain. In the case of the linear strain approximation, we take the deformation potentials from Ref.~\cite{vurgaftman01} (see Table~\ref{tab:dps}). }
\caption{\label{fig:qw}\textcolor{gray}{(Color online)} Energy of the electron (a) and the hole (b) ground state, as a function of the QW width. Energy $E = 0$ corresponds to the (unstrained) GaAs valence-band edge. For the linear strain approximation we took the deformation potentials from Ref.~\cite{vurgaftman01} (see Table~\ref{tab:dps}). }
\caption{\label{fig:disk}\textcolor{gray}{(Color online)} Energy of the electron and hole ground states, as a function of the QD height, where the base radius is fixed at $21a$. The width of the wetting layer is $a$. }
\caption{\label{fig:lens}\textcolor{gray}{(Color online)} Energy of the electron and hole ground states, as a function of the QD height, where the base radius is fixed at $20a$. The width of the wetting layer is $a$. }
\caption{\label{fig:nostraindisk}\textcolor{gray}{(Color online)} Energy of the electron and hole ground states, as a function of the QD height, where the base radius is fixed at $21a$. The width of the wetting layer is $a$. }
\caption{\label{fig:nostrainlens}\textcolor{gray}{(Color online)} Energy of the electron and hole ground states, as a function of the QD height, where the base radius is fixed at $20a$. The width of the wetting layer is $a$. }
\caption{Recognition results on full videos of the RWTH-PHOENIX-Weather-2014 dataset. \textcolor{green}{Deletion}, \textcolor{blue}{insertion} and \textcolor{red}{substitution} errors are colored in \textcolor{green}{green}, \textcolor{blue}{blue} and \textcolor{red}{red} respectively. Sentences below each sample are ground truth and then results of SF-Net. Translations to English are single-word based. \textunderscore\textunderscore on\textunderscore\textunderscore\ and \textunderscore\textunderscore off\textunderscore\textunderscore\ are starting and ending flags, while ** represents the absence of glosses. Samples are chosen from the validation and testing sets.}
\caption{Recognition results on full videos of the CSL dataset. \textcolor{green}{Deletion}, \textcolor{blue}{insertion} and \textcolor{red}{substitution} errors are colored in \textcolor{green}{green}, \textcolor{blue}{blue} and \textcolor{red}{red} respectively. Sentences below each sample are ground truth and then results of SF-Net. Translations to English are single-word based. ** in sentences represents the absence of glosses. Samples are chosen from the testing set.}
\caption{Cumulative time to solve problems from 1 to $i$ for domain \red{X}. Tr represents the time spent training ASNet. Each point represents the average over $k$ runs. \red{Extra}: add the $95\%$ confidence interval for each point.}
\caption{Running ECO, SINT, and LT-SINT on eight Long OTB videos, four videos with significant model decay (\textcolor{red}{$\mi$}) and four videos with small model decay (\textcolor{blue}{$\mi$}), for ECO and SINT. The videos are: Girl: SV, OCC, IPR, OPR (\ref{line:ltsint-girl}), Freeman4: SV, OCC, IPR, OPR (\ref{line:ltsint-freeman4}), Lemming: IV, SV, OCC, FM, OPR, OV (\ref{line:ltsint-lemming}), SUV: OCC, IPR, OV (\ref{line:ltsint-suv}), Car4: IV, SV (\ref{line:ltsint-car4}), Couple: SV, DEF, FM, OPR, BC (\ref{line:ltsint-couple}), Skiing: IV, SV, DEF, IPR, OPR (\ref{line:ltsint-skiing}), Freeman1: SV, IPR, OPR (\ref{line:ltsint-freeman1}).}
\caption[]{Predictions from ECO \cite{Danelljan_2016_CVPR} on an artificially extended video created from OTB50 data (\textcolor{red}{red box}: tracker prediction, \textcolor{yellow}{yellow box}: ground truth prediction). \emph{Model decay} is prevalent here although the appearance variation remains intact. Due to the heavy updating involved, model decay is noticeable from the very early stages, even for clearly visible target objects moving slowly.}
\caption{\label{FIGclsPerformance}Parametrisation of the performance of the trained \cls method. \Subref{FIGclsDist}~Distributions of the \cls metric, \predCls, for the signal and background samples, as indicated. \Subref{FIGclsTs}~The parametrised \cls test statistic, \tsCls, as a function of \predCls, before and after the correction for trials. The dashed-dotted horizontal line highlights the value, $\ts = 25$.}
\caption{ An illustration of a \qdag $\text{\normalfont All}(A) \times S(B,C)$, with $S(B, C) = \{(3,4), (6,4), (6,5), (7,4), (7,5)\}$. a) A geometrical representation of $S(B, C)$ (left), and $\text{\normalfont All}(A) \times S(B,C)$ (right). b) A \qtree $Q_S$ for $S(B, C)$ (left), and the directed acyclic graph induced by the \qdag $(Q_S, M=[0,1,2,3,0,1,2,3])$, which represents $\text{\normalfont All}(A) \times S(B,C)$. The red cell in (a) corresponds to the point $p=(4,3,4)$. The leaf representing $p$ in the \qdag can be reached following the path highlighted in (b). Note the relation between the binary representation ({\normalfont \textbf{1}{\color{gray} \textbf{0}}0,\textbf{0}{\color{gray} \textbf{1}}0,\textbf{1}{\color{gray} \textbf{0}}0}) of $p$, and the Morton codes \textbf{101}, {\color{gray} \textbf{010}}, {\normalfont 010} of the nodes in the path from the root to the leaf for $p$. }
\caption{ Quantitative comparisons of different saliency models on six benchmark datasets in terms of maximum $F_{\beta}$-measure and $MAE$, which are marked as $F^{*}_{\beta}$ and $mae$ in this table. \textcolor{red}{Red}, \textcolor{blue}{blue} and \textcolor{green}{green} text indicate the best, second best and third best performance respectively. The computation speeds (fps) are obtained on an NVIDIA TITAN X GPU. }
\caption{Evaluated IoU, IoU Top-1 rate, precision and recall on our validation dataset (black) and the Pratheepan Face dataset ({\color{blue}{blue}}), trained on a balanced dataset ($\# \textit{skin}, \# \textit{body} = 5k$) (top) and an unbalanced dataset ($\# \textit{skin} = 1k, \# \textit{body} = 5k$) (bottom).}
\caption{Our network architecture uses ResNet-50 as backbone. The support image (in {\color{green} green}) and query image (in {\color{blue} blue}) are fed into the weight-shared backbone. The RPN uses the attention feature generated by the depth-wise cross correlation between the compact $1 \times 1 \times C$ support feature and the $H \times W \times C$ query feature. The class scores generated by the patch-relation head (the top head), global-relation head (the middle head) and local-correlation head (the bottom head) are added together as the final matching score, and the bounding box predictions are generated by the patch-relation head.}
\caption{\textbf{Crossing routes.} Map of two main ground migrant exit routes from Venezuela: in (a) the Pan-American road with a portion of Colombia, Ecuador and Peru, and in (b) the area of Roraima, Amazonas and Par\'a states in Brazil. Blue arrows indicate flows toward Venezuela and the red ones away from it. Only cells with more than three TUVs are shown in the maps. The lightness of the color of the arrows is proportional to the net in- and out-flows from light to darker colors. The upscaled net flows crossing the dashed lines are displayed in the right-bottom corner of each plot.}
\caption{Upscaled number of TUVs crossing the lines of Figure \ref{vecto1} away ({\color{red}{$\leftarrow$}}) and toward ({\color{blue}{$\rightarrow$}}) Venezuela in the two routes considered.}
\caption{ (Left) The predicate distributions of frequent objects in VRD-Spatial, VG-Spatial and SpatialSense-Positive. For example, the bottom-left bar shows the frequency distribution of predicates \pred{on}, \pred{under}, \pred{behind}, etc. for the object ``table'' in SpatialSense. SpatialSense contains less language bias than other datasets since the distribution is more balanced. (Right) The predicate distributions of the top-50 objects in the three datasets, further showing the wider distribution in SpatialSense. }
\caption{The 2D locations of subjects relative to objects for the predicates \pred{to the left/right of}, normalized by image size. SpatialSense is less biased in 2D cues since the points are less separable. Each figure contains 400 randomly sampled relations for each predicate.}
\caption{\textbf{Ablation study}. IoU metric (higher is better) computed on the proposed dataset for Lungs (Tab.~\ref{subtab:iou_lungs}) and Heart (Tab.~\ref{subtab:iou_heart}) obtained by different encoder-decoder architectures. For each decoder we indicate the best model as \textbf{bold} and the second best model as \underline{underscore}. The best combination of Encoder+Decoder is highlighted as \colorbox{gray!10}{\textbf{gray}}. We chose ResNet50 model as a backbone network and FPN decoder for further experiments. See Sec.~\ref{ssec:ablation_study} for more details.}
\caption{Examples of positive (\textcolor{green}{green}) and negative (\textcolor{red}{red}) captions.}
\caption[]{Full-bandwidth \red{total power based} beampatterns of the linear signal (dashed-line) and the four different distortion terms stemming from the $3$rd order PA model, with 60 antenna units. The r-axis shows relative powers, with the passband power received by the intended receivers normalized to 0 dB.}
\caption{The \octo\ model of V1309 Scorpii 20 orbits after the simulation begins. V1309 Scorpii is a contact binary that merged into a single star in 2008 in a process known as a luminous red nova. It was the first star to provide conclusive evidence that contact binary systems end their evolution in a stellar merger~\cite{tylenda_2011}, see also Section~\ref{sec:scenario}.}
\caption{Relative speedup with respect to the processed sub-grids on one node for level 14. The \textcolor{cbred}{red lines} show the results using HPX's MPI parcelport and the \textcolor{cbblue}{blue lines} using HPX's libfabric parcelport, respectively. Note that for level 16 and level 17 some data points are missing due to restricted node hours for development projects.}
\caption{Table captions should be placed above the tables.}\label{tab1}
\end{comment}
\end{document}
\caption{\quad (a) Vehicular measurement scenario with three moving cars $A$, $B$ and $C$ located at snap-shot distances of 51.1~m, 66.8~m and 102.1~m, respectively, driving along the road such that the velocity relative to the radar system is ca. 12~m/s, -9~m/s and -9~m/s, respectively. The corresponding measured radar images with (b) RF canceller only and (c) both RF and digital cancellers, using an NR waveform of 20~ms with channel bandwidth of 40~MHz and subcarrier spacing of 30~kHz are shown. Additional multimedia material available at {\color{blue}\url{http://www.tut.fi/full-duplex/radar/NR40cars2.mp4}} }
\caption{Intersection matrix - Guess of rank 1 for each image comparison {\color{red}\textbf{CONCATENATE AND MAKE IT MORE READABLE: FOR NEXT RELEASE}} }
\caption{ Performance comparison of several efficient networks over top-1 classification accuracy on \mbox{ImageNet} validation set with single crop. As is common practice for FLOPs, we count the total number of Multiply-Adds. The top-1 accuracy of our proposed HBONet (1.0) is highlighted in {\color{blue} blue}, surpassing all the other state-of-the-art networks under the complexity level of around 300 MFLOPs. }
\caption{Comparison of quantitative results using maximum F-measure $F_{\beta}^{max}\uparrow$ (larger is better), S-measure $S\uparrow$ (larger is better). The best three results on each dataset are shown in \color[HTML]{FE0000}\textbf{red}, \color[HTML]{3166FF}\textbf{blue}\color{black}, and \color[HTML]{32CB00}\textbf{green}\color{black}, respectively. Symbols of model categories: I+C for image-based classic unsupervised or non-deep learning methods, I+D for image-based deep learning methods, V+U for video-based unsupervised methods, V+D for video-based deep learning methods. Refer to the supplemental document for more detailed results.}
\caption{Generating constraints from synthesized candidates: Our matching algorithm transforms synthesized programs into constraints and then constructs a generalized constraint description in IDL \cite{Ginsbach2018}. This searches user code and generates matches. Our contribution (generating constraints from examples) is highlighted in \textcolor[HTML]{81a663}{green}.}
\caption{Per-instance test set JSD and TVD from base model on negative instances (top) and positive instances (bottom). \textcolor{olive}{$\blacktriangle$}: random seed; \textcolor{cyan}{$\blacksquare$}: uniform weights; dotted line: our adversarial setup; \textcolor{red}{$\boldsymbol{+}$}: adversarial setup from \newcite{jain2019attention}.}
\caption{Averaged per-instance test set JSD and TVD from base model for each model variant. JSD is bounded at $\sim 0.693$. \textcolor{olive}{$\blacktriangle$}: random seed; \textcolor{cyan}{$\blacksquare$}: uniform weights; dotted line: our adversarial setup as $\lambda$ is varied; \textcolor{red}{$\boldsymbol{+}$}: adversarial setup from \newcite{jain2019attention}.}
\caption{Equivalence of two ways of obtaining the interfacial tension $\sigma_{\alpha\gamma}$ of a non-wet interface in a three-phase equilibrium. (a) Three-phase triangle in the plane of two densities $\rho_1$ and $\rho_2$. The bulk phase points are indicated by $\alpha$, $\beta$ and $\gamma$. Near those points contours of constant free-energy density $F(\rho_1,\rho_2)$ are circular \red{as depicted by circles}. The trajectory corresponding to the $\alpha\gamma$-interface structure in the full DFT model is \red{schematically shown by the thick solid curve connecting the $\alpha$ and $\gamma$ points.} The interfacial tension $\sigma_{\alpha\gamma}$ is computed based on this trajectory. (b) A simplified model in which the potential well $V_{\beta}(\rho_1, \rho_2)$ associated with the spectator phase $\beta$ in $F$ has been suppressed (as indicated by dotted circles) and replaced by a constant factor in $F$. Under these circumstances the trajectory corresponding to the $\alpha\gamma$-interface becomes straight (thick solid line). Adjusting the constant factor to the value that $V_{\beta}(\rho_1, \rho_2)$ takes in the midpoint of the trajectory, indicated by the star, reproduces the interfacial tension computed in (a). The calculation in (b) leads to an analytic expression for $\sigma_{\alpha\gamma}$ which is proportional to the third power of the length $p$ of the edge $\alpha\gamma$ in the triangle $\alpha\beta\gamma$ and the length $\ell$ of the median connecting $\beta$ to the principal edge.}
\caption{ \red{Variations of the $\alpha\beta$ and $\beta\gamma$ interfacial tensions at fixed $t$. (a) $\sigma_{\beta\gamma}$ versus $\sigma_{\alpha\beta}$. (b) Log-log plot of $\sigma_{\beta\gamma}$ versus $(\sigma_{\alpha,\beta\gamma}-\sigma_{\alpha\beta})$. The curve in (a) and points in (b) are numerical data obtained from the analogs of \eqref{sigma_ag}. The field variable $t$ is fixed at 0.05 while $s$ varies between $-t^{3/2}$ and $t^{3/2}$. The slope of the line in (b) is 3/2, confirming the 3/2-power tangency at the critical endpoints.} }
\caption{Robustness of $n=100$ qubit bundled cyclic graphs (a, c) and bundled star graphs (b, d) subjected to iid dephasing (a, b) and a small number of erasures (c, d). In every scenario the graph is divided into $k$ bundles of $n/k$ qubits: \tikzcircle{2.2pt} $\rightarrow k=5$, \tikzcircle[Kgreen, fill=Kgreen]{2.2pt} $\rightarrow k=10$ and \tikzcircle[blue,fill=blue]{2.2pt} $\rightarrow k=20$. The classical limit \tikzcircle[orange, fill=orange]{2.2pt} and Heisenberg limit \tikzcircle[HLcolour, fill=HLcolour]{2.2pt} are also displayed for clarity. For small $p$, we observe that $\log_n \mathcal{Q}$ decreases linearly, which is expected from Eq.~(\ref{eqn:bundledephasing}). We also see that bundled cyclic graph states retain a quantum advantage after a small number of erasures; in contrast, the QFI of bundled star graphs falls below the classical limit after a single erasure, which is expected from Eqs.~(\ref{eqn:erasuresStar})~and~(\ref{eqn:erasuresCycle}).}
\caption{Person retrieval samples of the PCB model~\cite{DBLP:conf/eccv/SunZYTW18} on CUHK03 and Market1501. Note that the first image of each row is the query image and the other images are ``Rank-1 to Rank-10'' in the ranking list. The {\color{green}green} ({\color{red}red}) boxes denote images that are {\color{green}positive} ({\color{red}negative}) matches for the query image. As seen in this figure, the negative samples in the top positions have color information similar to that of the query image.}
\caption{Person retrieval samples of greyscale and RGB images in the PCB model~\cite{DBLP:conf/eccv/SunZYTW18} on CUHK03 and Market1501. PCB is trained separately on greyscale and on RGB images. The top (bottom) row of each pair shows retrieval samples for RGB (greyscale) images. Note that the first image of each row is the query image and the other images are ``Rank-1 to Rank-10'' in the ranking list. The {\color{green}green} ({\color{red}red}) boxes denote images that are {\color{green}positive} ({\color{red}negative}) matches for the query image. As seen in this figure, RGB and greyscale images are complementary.}
\caption{The evaluation of the greyscale and RGB models on Market1501, DukeMTMC-reID and CUHK03, respectively. Note that {\color{orange}orange} ({\color{grey}grey}) bars denote using RGB (greyscale) images to train and test ResNet-50. {\color{new_blue}Blue} bars represent using RGB and greyscale images to train and test the ResNet-50 model, respectively.}
\caption{Overview of the whole framework. First, we propose a sample synthesis method to synthesize \textcolor{myorange}{episodic representation} for each class in the support set. Second, the registration module is leveraged to select \textcolor{myblue}{global representation} according to their episodic representation, and the \textcolor{mypurple}{selected global representations} are then used to classify \textcolor{mygreen}{query images}. The classification loss and registration loss are used to jointly optimize the global representations, the registration module, and the feature extractor. (Best viewed in color)}
\caption{Illustration of the proposed SSR model framework, which consists of three components: a) the image captioning model trained on pseudo image-caption pairs; b) the language model to provide self-supervised fluency reward for the captioning model; c) the visual semantic matching model to provide self-supervised multi-level relevancy rewards. We add \textcolor{gray}{the English translation} below the sampled Chinese caption in brackets for better understanding.}
\caption{Distribution of non-compositionality (inverse human compositionality) scores in \dsred{}. We expect to see a right-skewed distribution but ensuring that the dataset is balanced led to a clear bias that can be seen here.}
\caption{Gaussianized distribution of non-compositionality (inverse human compositionality) scores in \dsred{}.}
\caption{Distribution of characteristic and human scores in \dsred{} after being transformed to a normal distribution.}
\caption{Maximum transverse displacement of the trailing edge (left), mean thrust (second from left), mean input power (second from right), and Froude efficiency (right) versus dimensionless frequency, $f$, obtained from the nonlinear simulations for $h_0=0.1$. Each plot contains four separate colors corresponding to four different stiffnesses: $S = 0.02$ (\protect\solidrule{black!60!orange!100!}), $S = 0.2$ (\protect\solidrule{black!30!orange!100!}), $S = 2$ (\protect\solidrule{white!20!orange!100!}), and $S = 20$ (\protect\solidrule{white!50!orange!100!}). The dashed vertical lines correspond to the natural frequency defined using the imaginary part of the global eigenvalues presented in section \ref{sec:GMresults}.}
\caption{F1 score for different dropout values using fastText embeddings (Skip Gram). All other hyper-parameters used for this evaluation on the \oyeshds dataset are presented in table \ref{table:hyperparameter}.}
\caption{CORSIKA simulation flow (\textcolor[rgb]{1.0,.0,.0}{Modification})}
\caption{(Color online) Log-log plots of the compensated spectrum $k^2 E(k)$ versus $k/k_{d}$ for the six different initial conditions IC1-IC6 (see Fig.~(\textcolor{blue}{1}) of the main text). We zoom into the region $\delta k/k_{d}=[0.005, 0.3]$, where the curves appear flat, and show, in the inset, how our data compare with the line $k^2 E(k) = 1$. Here, $k_{d}= \pi \lfloor L/3\rfloor/L$ is the value of the maximum wave number after dealiasing and the system size $L=2^{20}$. }
\caption{The percentage of query images localized within three pose accuracy intervals of our proposed method compared with state-of-the-art localization methods on the RobotCar Seasons and Aachen Day-Night datasets. \textcolor{red}{\textbf{Red}} and \textcolor{blue}{\textbf{blue}} represent the \textcolor{red}{\textbf{best}} and \textcolor{blue}{\textbf{second-best}} methods, and the asterisk symbol indicates methods that use knowledge about the gravity direction.}
\caption{Comparing the density of linear partitions from a test image to random baselines for normal (in black) and DiffAI-protected networks (in {\color{DiffAIGreen}green}). Networks trained with the DiffAI robust-training scheme tend to exhibit significantly fewer non-linearities than their normally-trained counterparts.}
\caption{We illustrate the notion of a sample, {\color{red}(a)} a basic graph pattern and {\color{red}(b)} a complex graph pattern, as a Gremlin traversal over the graph $G$ shown in Figure~\ref{fig:property_graph}.}
\caption{\colorbox{SpringGreen}{B-SVRG} and \colorbox{BurntOrange}{BP-SVRG}}
\caption{\small % The number above the target node indicates how likely the target node is from its true class based on BP algorithm and \textbf{GE-G}, respectively. % Starting from a target node, it shows that the subgraphs generated by \textbf{GE-G} become more faithful as the size of subgraph grows. Larger \textcolor{red}{red} nodes are the nodes explained. Other \textcolor{red}{red} nodes are the explaining nodes. The \textcolor{green}{green} nodes are the newly added ones. }
\caption{ \label{fig:Poincaresect} (a) A \textcolor{blue}{Poincar\'e section $\mathcal{P}$}, defined by a point $\vec{x}'$ and a normal vector $\vec{t}'$, is pierced by a periodic orbit at the periodic point $\vec{x}_p$. (b) The projection of a relative periodic orbit onto a \textcolor{blue}{slice $\hat{\mathcal{M}}$} is a periodic orbit, $\hat{\vec{x}}_t=\hat{\vec{x}}_{t+T}$. The whole orbit is projected onto $\hat{\mathcal{M}}$. Each state $\vec{x}_t$ is projected by applying shifts along the dotted lines onto $\hat{\mathcal{M}}$. }
\caption{ Comparison between our unified framework and each of the suggested self-supervised schemes on five 3D target tasks. The statistical analysis is conducted between the top-2 models in each column {\zzred highlighted in red}. While there is no clear winner, our unified framework is more robust across all target tasks, yielding either the best result or comparable performance to the best model ($p>$ 0.05). }
\caption{Nested NER results (F1) for ACE-2004, ACE-2005, GENIA and CNEC 1.0 (Czech) corpora. {\bfseries Bold} indicates the best result, {\itshape italics} results above SoTA and \colorbox{gray!30}{gray background} indicates the main contribution. * uses a different data split in ACE-2005. ** non-neural model.}
\caption{ Loss functions in Fixed-Point GAN. Terms inherited from StarGAN are in black, while highlighted in {\color{blue} blue} are our modifications to mitigate StarGAN's limitations (\figurename~\ref{fig:celeba5_stargan_vs_ours}). }
\caption{Open Problems: \red{\textbf{?}}=open, \blue{\cmark}=proven true, \xmark=counterexamples. }
\caption{Comparison of AP and mAP in \% of our framework and state-of-the-art methods on the PASCAL VOC 2007 dataset. Upper part presents the results of single model and lower part presents those that aggregate multiple models. ``Ours'' and ``Ours (pre)'' denote our framework without and with pre-training on the COCO dataset. The best and second best results are highlighted in {\color{red}{red}} and {\color{blue}{blue}}, respectively. ``-'' denotes the corresponding result is not provided. Best viewed in color.}
\caption{Comparison of AP and mAP in \% of our model and state-of-the-art methods on the PASCAL VOC 2012 dataset. Upper part presents the results of single model and lower part presents those that aggregate multiple models. ``Ours'' and ``Ours (pre)'' denote our framework without and with pre-training on the COCO dataset. ``Ours (pre \& fusion)'' denotes fusing our two scale results. The best and second best results are highlighted in {\color{red}{red}} and {\color{blue}{blue}}, respectively. Best viewed in color.}
\caption{The final results of the empirical evaluation. Each table cell provides the number of rounds of communication necessary to make the percentage of devices (as specified in the columns) reach a desired target accuracy (either 75\% or 80\% in our case). Runs that did not reach the target accuracy for the specified percentage of devices in the allowed rounds (1,000) are marked with \textit{---}. The best results obtained in study \textbf{MCA} are shown in bold {\color{violet} \textbf{violet}}, while the best results in study Final are shown in bold italic {\color{blue} \textbf{\textit{blue}}}. }
\caption{{\scriptsize \it \label{pwrs} FFT spectrum of \textcolor{blue}{pressure} in the top panel and \textcolor{red}{TDC scaler rate $R_1$} in the bottom panel, for the month of October 2016.} }
\caption{{\scriptsize \it \label{betall} Distribution of the pressure coefficient $\beta_P$ for all PMTs for the months of \textcolor{blue}{September}, \textcolor{green}{October}, and November. The plots on the left are for the exponential method, and those on the right are for the linear approximation method. The mean value of each distribution is given in the figures.}}
\caption{ We show new constraints on iDM based on the data of CHARM ({\purple purple}) and NuCal ({\blue blue}), and the projected sensitivity of the NA62 beam-dump run ({\red red}). The gray shaded regions are previously existing constraints, excluding E137 decay (see Sec. \ref{sec:discussion_iDM}). In (e), we include the potential E137 decay constraint \cite{Bjorken:1988as,Izaguirre:2017bqb,Berlin:2018pwi} along with projections from MiniBooNE \cite{Izaguirre:2017bqb}, BDX \cite{Izaguirre:2017bqb}, LDMX \cite{Berlin:2018bsc,Berlin:2018pwi,Akesson:2018vlm}, $\rm JSNS^2$ \cite{Jordan:2018gcd}, MATHUSLA \cite{Berlin:2018jbm,Chou:2016lxi}, CODEX-b \cite{Berlin:2018jbm,Gligorov:2017nwh}, FASER \cite{Berlin:2018jbm,Feng:2017uoz}, LHCb \cite{Ilten:2016tkc,Aaij:2017rft,Pierce:2017taw}, Belle-II \cite{Kou:2018nap}, SeaQuest \cite{Berlin:2018pwi}. Other probes outside of this parameter space (see, e.g., \cite{Curtin:2014cca,Izaguirre:2015zva,Liu:2018wte}) are not shown. One can see E137, MiniBooNE and BDX projections are already covered by CHARM and NuCal.}
\caption{This plot shows constraints and sensitivity projections for iDM within the muon $g-2$ motivated regime. The muon $g-2$ favored regime is the light-green band while the thick-black curve is again the parameter contour yielding the correct DM relic abundance. We considered the bounds from CHARM ({\purple purple}) and NuCal ({\blue blue}), and projections from NA62 ($10^{18}$ POT: {\red red}, $1.3\times 10^{16}$ POT: {\magenta magenta}), SQ/DQ ($10^{20}$ POT: {\cyan~dashed-cyan}, $1.4\times 10^{18}$ POT: {\pinegreen~dashed-pine green}), and LongQuest-I (3-GeV cut: {\green~darker green}, no-cut: {\bf black}). In (b), we only plot the LongQuest-I no-cut curve because it is essentially identical to the 3-GeV cut result. The gray region shows the previously existing constraints.}
\caption{We show updates on the kinetically-mixed visibly decaying dark photon constraints and projections. In (a), gray contours are the previous bounds set based on analyses of NuCal \cite{Blumlein:2011mv,Blumlein:2013cua} and CHARM \cite{Gninenko:2012eq} experiments. In (b), projections for future experiments are shown in dot-dashed curves with color and also labeled in the plot. In both (a) and (b), our updated bounds on CHARM ({\purple purple}), NuCal ({\blue blue}), and new projections of NA62 ({\red red}) and LongQuest-I ({\bf black}) are shown. Note that we do not find any fundamental disagreement with the pioneering works on dark photon constraints from CHARM and NuCal \cite{Blumlein:2011mv,Gninenko:2012eq,Blumlein:2013cua}. We add relevant production channels and have a conservative assumption of meson production rates. Readers can take our CHARM and NuCal constraints as conservative limits.}
\caption{ The rotation periods of Praesepe members \citep{douglas2016} vs. their \Gaia\ colors (\gcolor), with a broken power law model fit to these data. The dashed line and shaded region show the mean and variance model described in equation \ref{eqn:gyro} at 650 Myrs (lower model) and 4.56 Gyrs (upper model). The solid blue line shows the \citet{angus2015} gyrochronology relation at 650 Myrs (lower model) and 4.56 Gyrs (upper model). Shaded regions show the 1$\sigma$ range of the rotation period model (equation \ref{eqn:gyro}). The rotation periods of stars bluer than around 0.56 dex and redder than around 2.7 dex in \gcolor\ are modeled as a broad log-normal distribution with a standard deviation of 0.5 dex, added to observational uncertainties. This figure was generated in a Jupyter notebook available at \url{https://github.com/RuthAngus/stardate/blob/master/paper/code/Fitting_Praesepe.ipynb} }
\caption{ The additional rotation period scatter, $\sigma_P$, added to the observational period uncertainties in the model (see equation \ref{eqn:gyro}). The standard deviation was increased for early F and hotter stars (\gcolor\ $<$ 0.56), late M dwarfs (\gcolor\ $>$ 2.7) and evolved stars (EEP $\gtrsim$ 420) in order to down-weight the age-information supplied by rotation periods and reproduce observed rotation period distributions. We also increased the variance for stars younger than around 250 Myrs, because the rotation periods of these stars typically have not yet converged onto a tight gyrochronology sequence, and for very high and low metallicity stars (-0.2 $\gtrsim$ [Fe/H] $\gtrsim$ 0.2) because the gyrochronology relations have not been calibrated at these extreme values. Down-weighting the gyrochronal likelihood by the inverse variance ($1/\sigma^2$) allowed the ages of these stars to be mostly inferred via isochrone fitting. Sigmoid functions were used to provide smooth transitions between regions of low and high variance. }
\caption{ Data simulated from the rotation period model. Late F, GK and early M dwarfs (stars with 0.56 $<$ \gcolor\ $<$ 2.7) follow the Praesepe-calibrated gyrochronology relation (dashed gray lines), with the exception of old, slowly rotating stars with large Rossby numbers whose rotation periods are fixed at 2$\times$ their convective overturn time. The rotation periods of early F (\gcolor\ $<$ 0.56), late M dwarfs (\gcolor\ $>$ 2.7) and subgiants (EEP $\gtrsim$ 420) were generated from a log-normal distribution with standard deviation given by equation \ref{eqn:gyro}. The top panel shows the rotation periods vs. \gcolor\ colors of simulated stars, colored by their age and the bottom panel shows the same stars colored by their equivalent evolutionary phase (EEP). The gray lines describe the mean gyrochronology model at ages 1, 3, 5, 7, 9, 11, and 13 (rotation periods rise with age). This figure was generated in a Jupyter Notebook, available at \url{https://github.com/RuthAngus/stardate/blob/master/paper/code/Simulate_data.ipynb} }
\caption{ The true vs. predicted ages of simulated stars. Ages calculated by combining gyrochronology and isochrone fitting with \sd\ are shown in color and ages calculated with isochrone fitting only are shown in gray. The different panels show the results for stars with \gcolor\ $<$ 2.2 (FGK dwarfs) that are still braking magnetically ($Ro < 2$), stars with \gcolor\ $<$ 2.2 that have stopped braking magnetically ($Ro \geq 2$), stars with 2.2 $<$ \gcolor\ (M dwarfs), and evolved stars (EEP $>$ 420). Gyrochronology is highly effective for FGK stars and ages inferred with both gyrochronology and isochrone fitting are more accurate and precise than ages inferred via isochrone fitting only for this group. Neither gyrochronology nor isochrone fitting can provide precise ages for M dwarfs, so the ages of these stars are imprecise regardless of age-dating method. This figure was generated in a Jupyter Notebook available at \url{https://github.com/RuthAngus/stardate/blob/master/paper/code/Results_plots.ipynb}. }
\caption{ The \kepler-based rotation periods of members of the 2.5 Gyr NGC 6819 open cluster. The raw \gcolor\ colors are shown in red and the dust-corrected colors are shown in black. The dashed line shows a gyrochronology model that was fit to the Praesepe cluster and the Sun in this work, interpolated to 2.5 Gyrs. The solid blue line shows a previously calibrated gyrochronology model \citep{angus2015}. This figure was generated in a Jupyter Notebook available at: \url{https://github.com/RuthAngus/stardate/blob/master/paper/code/NGC6819.ipynb} }
\caption{ The inferred ages of members of the NGC 6819 open cluster as a function of their \gcolor\ color. Ages inferred using a combination of isochrone fitting and gyrochronology (Praesepe and Sun calibration) are shown for dereddened \gaia\ $G$, $G_{BP}$, and $G_{RP}$ photometry (black circles) and for uncorrected, raw photometry (red squares). Even though V-band extinction is marginalized over in the inference process, reddening can still bias ages. Blue triangles, pointing up, show ages inferred using isochrone fitting and gyrochronology, with the \citet{angus2015} gyrochronology model. Orange triangles, pointing down, show ages inferred using isochrone fitting only. The ages of F stars (stars bluer than 0.7) were precisely constrained by isochrones and including gyrochronology makes little difference to their inferred ages. The age precision of G and K dwarfs (stars redder than 0.7) was improved by including gyrochronology. The median age of stars inferred using the gyrochronology model calibrated to Praesepe and the Sun (black circles) was 2.65 $\pm$ 0.13 Gyr, which is consistent with the established cluster age (2.5 Gyr). This figure was generated in a Jupyter Notebook available at: \url{https://github.com/RuthAngus/stardate/blob/master/paper/code/NGC6819.ipynb} }
\caption{ A comparison of stellar ages inferred using asteroseismic modeling with ages inferred using a combination of isochrone fitting and gyrochronology. Colored circles show ages inferred using isochrone fitting and gyrochronology combined via the \sd\ software package. Black triangles show the ages of all stars inferred via gyrochronology only and white circles show ages inferred via isochrone-fitting only. }
\caption{Example localization output for Flickr30kEntities (top row) and ReferItGame (bottom row). We compare the effects of adding a \textbf{tfoid} detector (\textcolor{darkred}{red} bounding box) to \textbf{tfcoco} (\textcolor{darkblue}{blue}) (\textbf{w2v-max}, \textbf{union}). The ground truth is indicated in \textcolor{darkgreen}{green}. The first two columns show examples of where adding a \textbf{tfoid} detector improves localization, while the last two columns are examples where it hurts localization.}
\caption{Example localization output for Flickr30kEntities (top row) and ReferItGame (bottom row). We compare the effects of adding a \textbf{colour} detector (\textcolor{darkred}{red} bounding box) to \textbf{tfcoco+tfoid+places365} (\textcolor{darkblue}{blue}) (\textbf{w2v-avg}, \textbf{union}). The ground truth is indicated in \textcolor{darkgreen}{green}. The first two columns show examples of where adding a \textbf{colour} detector improves localization, while the last two columns are examples where it hurts localization.}
\caption{Example localization output for Flickr30kEntities (top row) and ReferItGame (bottom row). We compare the effects of using a \textbf{consensus} (\textcolor{darkred}{red} bounding box) against a \textbf{union} (\textcolor{darkblue}{blue}) localization strategy (with \textbf{tfcoco+tfoid+places365+color}, \textbf{w2v-avg}). The ground truth is indicated in \textcolor{darkgreen}{green}. The first two columns show examples of where using \textbf{consensus} improves localization, while the last two columns are examples where it hurts localization.}
\caption{Example localization output for Flickr30kEntities. We compare the effects of adding a \textbf{tfoid} detector (\textcolor{darkred}{red} bounding box) to \textbf{tfcoco} (\textcolor{darkblue}{blue}) (\textbf{w2v-max}, \textbf{union}). The ground truth is indicated in \textcolor{darkgreen}{green}. The first row shows examples of where adding a \textbf{tfoid} detector improves localization, while the last row provides examples where it hurts localization.}
\caption{Example localization output for ReferItGame. We compare the effects of adding a \textbf{tfoid} detector (\textcolor{darkred}{red} bounding box) to \textbf{tfcoco} (\textcolor{darkblue}{blue}) (\textbf{w2v-max}, \textbf{union}). The ground truth is indicated in \textcolor{darkgreen}{green}. The first row shows examples of where adding a \textbf{tfoid} detector improves localization, while the last row provides examples where it hurts localization.}
\caption{Example localization output for Flickr30kEntities. We compare the effects of adding a \textbf{colour} detector (\textcolor{darkred}{red} bounding box) to \textbf{tfcoco+tfoid+places365} (\textcolor{darkblue}{blue}) (\textbf{w2v-avg}, \textbf{union}). The ground truth is indicated in \textcolor{darkgreen}{green}. The first row shows examples of where adding a \textbf{colour} detector improves localization, while the last row provides examples where it hurts localization.}
\caption{Example localization output for ReferItGame. We compare the effects of adding a \textbf{colour} detector (\textcolor{darkred}{red} bounding box) to \textbf{tfcoco+tfoid+places365} (\textcolor{darkblue}{blue}) (\textbf{w2v-avg}, \textbf{union}). The ground truth is indicated in \textcolor{darkgreen}{green}. The first row shows examples of where adding a \textbf{colour} detector improves localization, while the last row provides examples where it hurts localization.}
\caption{Example localization output for Flickr30kEntities. We compare the effects of using a \textbf{consensus} (\textcolor{darkred}{red} bounding box) against a \textbf{union} (\textcolor{darkblue}{blue}) localization strategy (with \textbf{tfcoco+tfoid+places365+color}, \textbf{w2v-avg}). The ground truth is indicated in \textcolor{darkgreen}{green}. The first row shows examples of where using \textbf{consensus} improves localization, while the last row provides examples where it hurts localization.}
\caption{Example localization output for ReferItGame. We compare the effects of using a \textbf{consensus} (\textcolor{darkred}{red} bounding box) against a \textbf{union} (\textcolor{darkblue}{blue}) localization strategy (with \textbf{tfcoco+tfoid+places365+color}, \textbf{w2v-avg}). The ground truth is indicated in \textcolor{darkgreen}{green}. The first row shows examples of where using \textbf{consensus} improves localization, while the last row provides examples where it hurts localization.}
\caption{Qualitative comparisons with DeepLab-V3 \cite{deeplabv3} and PSPNet \cite{pspnet}. The \textcolor{red}{red} circles mark where our model is particularly superior to other methods.}
\caption{Qualitative comparisons between the Full model and other variants of our model. The \textcolor{red}{red} circles indicate where the Full model is superior to the other variants. \label{fig:self}}
\caption[Interface of KAVAGait]{\textbf{Interface of KAVAGait} -- User interface of the KAVAGait prototype with its three main areas for gait analysis~\cite{wagner_2018_kavagait}. (1) The \highlight{Explicit Knowledge Store} shows (1.a) a table with the patient's match for different gait categories and (1.b) allows filtering of the population of prototypical patients. (2) The patient explorer including the (2.a) \highlight{Person Information}, the (2.b) visualization of the ground reaction force (${F_{v}}$) time series for each foot on a separated scale and the (2.c) visualization of the combined ${F_{v}}$ from both feet. (3) Shows the \highlight{Parameter Explorer} visualizing the 16 calculated \highlight{Spatio-Temporal Parameters} of the loaded patient in relation to the \highlight{Norm Data Category} and a second \highlight{Selected Category}. \newline \textit{Image courtesy of~M.\,Wagner\,\cite{wagner_2017_integrating}.} }
\caption[Instantiation of Knowledge-assisted VA Model for KAVAGait]{\textbf{Instantiation of Knowledge-assisted VA Model for KAVAGait} -- Illustrating the different prototype-specific elements in relation to the related components and processes of the \highlight{Knowledge-assisted VA Model}~\cite{federico-wagner_2017_model,wagner_2017_integrating} (Important abbreviations included in the green bubbles: GRF := ground reaction force, HRS := hatching range-slicer, ITBP := interactive twin-box-plot). \hfill \textit{Image courtesy of~M.\,Wagner\,\cite{wagner_2017_integrating}.} }
\caption{(A) Schematic of the experimental apparatus (not to scale). The flow is confined between two concentric independently rotating cylinders with radii $r_i$ and $r_o$. Only the inner cylinder~(IC) rotates with an angular velocity $\omega_i$. A mirror and a window in the bottom plate provide optical access to the $r$-$\theta$ plane for a high-speed camera. (B) A typical still image with the inner and outer cylinder highlighted in orange. \makered{The fibers (aspect ratio $\Lambda =5.3$) are clearly visible as white rods.} $\text{Re}_i = \num{1.7e5}$ and $\alpha = \SI{0.05}{\percent}$. (C) Schematic of the $r$-$\theta$ plane. The orientation of the particle, $\theta_p$, is zero when it is aligned with the IC. Fibers with their center in the red areas are removed from all statistics. (D) Definition of the orientation, $\theta_p$, and the orientation vector, $p_i$, of a fiber. $\theta_p$ is measured with respect to the azimuthal direction and is defined positive in the counter-clockwise direction. \makered{See Movies S1 and S2 in the supplemental material for typical recordings showing the movement and tracking of the fibers \citep{Video1,Video2}.}}
\caption{PDF of the fiber orientation $\theta_p$ measured at $\tilde{z} = 0.24$. Different $\alpha$ are indicated by different hues and different $\text{Re}_i$ are shown with different shades. A representation of the fiber alignment is shown at the top of the figure. \makered{Independent of $\alpha$ and $\text{Re}_i$ there is a clear preference for an alignment around $-0.38\pi \pm 0.05\pi$ (\SI[separate-uncertainty = true,multi-part-units=single]{-68(9)}{\degree}).} A large \SI{40}{\percent} difference between the most and least probable orientation is observed.}
\caption{(A) PDF of the fiber orientation $\theta_p$ at various radial bins, indicated by different colors. $\alpha$ is fixed to \SI{0.05}{\percent}, $\text{Re}_i = \num{2.5e5}$, and the measurement is performed at $\tilde{z} = 0.24$. (B) Axial dependence of the PDF of $\theta_p$, indicated by different colors. For these measurements $\alpha=\SI{0.05}{\percent}$ and $\text{Re}_i=\num{8.3e4}$. \makered{The diagram on the right indicates the position of the weak vortical structures \cite{Huisman2014,vanderVeen2016}. The distribution is found to be nearly independent of radial and axial positions, and all show similar alignment.}}
\caption{PDF of the rotation rate of the fibers for $\alpha=\SI{0.05}{\percent}$ and $z/L=0.24$. Rotational velocities are normalized using the angular velocity of the IC. The PDF is independent of $\text{Re}_i$ and shows a slight preference for retrograde rotation (blue). Note that the icons hold for CW rotation of the inner cylinder. \makered{The mean rotation is $\langle \dot{\theta}_p/\omega_i \rangle \approx -0.42$ with a standard deviation of $\sigma(\dot{\theta}_p/\omega_i)=3.13$, which reveals that a large number of fibers rotate much faster than the inner cylinder. For comparison, a Gaussian distribution with the same mean and variance is added.} The skewness and kurtosis are found to lie in the range $[-0.14, 0.24]$ and $[34,40]$, respectively. \makered{The inset shows the same data on a linear scale.} }
\caption{FAR and \red{FRR} ($p_1=p_2=\dots=p_N$)}
\caption{FAR and \red{FRR} (RBF kernel, $p_1 \neq p_2 \neq \dots \neq p_N$)}
\caption{Comparison of the proposed solution against top-performing techniques on the Track1 challenge data ($n=3190$ images). In the table, an entry in \textcolor{red}{red} denotes the best performance among the competing algorithms.}
\caption{Comparison of the proposed solution against top-performing techniques on the Track2 challenge data ($n=33$ sequences). In the table, an entry in \textcolor{red}{red} denotes the best performance among the competing algorithms.}
\caption{Quantitative comparison including max F-measure, MAE, and S-measure over six widely used datasets. `-' denotes that corresponding methods are trained on that dataset. $\uparrow \& \downarrow$ denote larger and smaller is better, respectively. $^*$ means methods using pre-processing or post-processing. The best three results are marked in \textcolor{red}{red}, \textcolor{blue}{blue}, and \textcolor{green}{green}, respectively. Our method achieves the state-of-the-art on these six widely used datasets under three evaluation metrics. }
\caption{\label{table:hopping}%
\red{Nearest-neighbor hopping parameters between $M$ atoms obtained from the Wannier function analysis of the GGA electronic structures of {\M}, where $M$ = Ti and Ru. The unit is eV.}}
\caption{\label{fig:Ru}\red{Band structures of {\M} with nonmagnetic HTS for (a) $M=$Ti and (b) $M=$Ru. The weight of the $yz$ orbital characters is shown by the size of the circles.}}
\caption{Performance comparison on UCF\_CC\_50. The top two results are highlighted in {\textcolor{red}{red}} and {\textcolor{blue}{blue}}, respectively. }
\caption{Our results before and after applying the post-processing. (Top) The input and results at the middle frame. (Bottom) Temporal profile of the \textcolor{orange}{orange scan line}. The slices of video frames are stacked along the vertical axis.}
\caption{Resonance eigenfrequencies from the effective tight-binding Hamiltonian for a resonator chain of length $N=50$ displaying defect states induced by the reciprocal AC mechanism [see Fig.~\ref{idea}(c)]. The chain is made out of two parts with internal asymmetric coupling parameters $A/W=2$ and $B/W=-2$ to the left of the interface, and those parameters interchanged to the right of the interface. The frequencies have been shifted by the common offset $\bar\Omega$. The four defect states at $\left|\mathrm{Re}\,\Omega\right|=1$ are characterized by their spectral isolation from the rest of the states and their localization at the interface. In this paper we replicate this effect by means of the simpler lifetime-difference geometry depicted in Fig.~\ref{idea}(d).}
\caption{Illustration of the hard filter, which determines whether the scene data should be applied in predicting the future locations of a pedestrian. (a) The frame image is first divided into $n \times n$ grid cells ($n = 4$ in this example) to capture all human movements in each grid cell; (b) \& (c) only non-linear grid cells are selected for further processing at the subgrid level; the scene data is not applied for pedestrians in a linear grid cell; (d) a non-linear grid cell is further divided into $m \times m$ subgrids ($m = 4$) and each trajectory is parsed into subgrid paths; (e) the common subgrids, occupied by common subgrid paths; (f) at prediction time, the decision whether to use the scene data depends on the current location of each pedestrian. If the pedestrian’s current location is in the common subgrids, the scene data is used (\textcolor{red}{red} pedestrian); otherwise, it is not used (\textcolor{green}{green} pedestrian).}
\caption{Comparison among our action-level fusion model, the majority-voting baseline, and each individual representation's performance. Individual branches are color-coded according to task type, with \textcolor{myg}{\textbf{green}} for 2D tasks, \textcolor{myb}{\textbf{blue}} for 3D tasks, and \textcolor{myr}{\textbf{red}} for semantics.}
\caption{Comparison with state-of-the-art methods on the MSD test set. All methods are trained on the MSD training set. ``w/o C'' denotes results without CRF~\cite{krahenbuhl2011efficient} post-processing. ``Statistics'' refers to thresholding mirror location statistics from our training set as a mirror mask for detection. The best and second best results are marked in {\bf bold} and \textcolor{red}{red}, respectively.}
\caption{ $\sigma\,T_\mathrm{mix}$ for different $D_\mathrm{dp}>0$, $D_s$ and $\sigma$, as a function of the P\'eclet number $Pe$ (left), or of the effective P\'eclet number $Pe_\mathit{eff}$ (right). Black solid line: no salt; \textcolor{red}{--\,$\cdot$\,--}: $D_s=1360\,\mu\mathrm{m^2\,s^{-1}}$, $D_\mathrm{dp}=290\,\mu\mathrm{m^2\,s^{-1}}$; \textcolor{red}{$\ast$}: $D_s=1360\,\mu\mathrm{m^2\,s^{-1}}$, $D_\mathrm{dp}=1000\,\mu\mathrm{m^2\,s^{-1}}$; \textcolor{red}{$\triangle$}: $D_s=1360\,\mu\mathrm{m^2\,s^{-1}}$, $D_\mathrm{dp}=10^4\,\mu\mathrm{m^2\,s^{-1}}$; \textcolor{red}{$\times$}: $D_s=1360\,\mu\mathrm{m^2\,s^{-1}}$, $D_\mathrm{dp}=10^5\,\mu\mathrm{m^2\,s^{-1}}$; \textcolor{red}{$\circ$}: $D_s=10\,\mu\mathrm{m^2\,s^{-1}}$, $D_\mathrm{dp}=10^4\,\mu\mathrm{m^2\,s^{-1}}$; \textcolor{red}{$+$}: $D_s=10\,\mu\mathrm{m^2\,s^{-1}}$, $D_\mathrm{dp}=10^5\,\mu\mathrm{m^2\,s^{-1}}$; \textcolor{magenta}{- - -}: time needed to reach the Batchelor scale; note that the latter follows the same scaling as the mixing time. }
\caption{ {Left: mixing time of the colloids, $T_\mathrm{mix,c}$, as a function of the P\'eclet number for all numerical simulations with $D_\mathrm{dp}^2/(D_c D_s)>1$; the evolution of the flow is kept identical, varying $D_c$, $D_\mathrm{dp}$, and $D_s$. Right: $T_\mathrm{mix,c}$ as a function of the effective P\'eclet number. $+$: mixing of colloids without diffusiophoresis ($D_\mathrm{dp}=0$); $\times$: mixing of salt; \textcolor{red}{$\bullet$}: ``salt-attracting'' case ($D_\mathrm{dp}>0$); \textcolor{blue}{$\blacktriangle$}: ``salt-repelling'' case ($D_\mathrm{dp}<0$). The solid line is the curve $T_\mathrm{mix,c}=3.2 \ln(Pe/120)$.} \label{fig:Tmix_Pe_Pe_eff} }
\caption{ Mixing time $T_\mathrm{mix}$ as a function of the P\'eclet number for different values of $D_\mathrm{dp}$ and $D_s$ in the salt-repelling case ($D_\mathrm{dp}<0$); $\square$: $D_\mathrm{dp}=10^{-3}$; $\circ$: $D_\mathrm{dp}=2\times10^{-3}$; $\triangledown$: $D_\mathrm{dp}=4\times10^{-3}$; \textcolor{red}{---}: $D_\mathrm{dp}/D_s=0.1$; \textcolor{purple}{$\cdot\cdot\cdot\cdot$}: $D_\mathrm{dp}/D_s=0.2$; \textcolor{blue}{$- \cdot -$}: $D_\mathrm{dp}/D_s=0.4$; \textcolor{green}{$- -$}: $D_\mathrm{dp}/D_s=0.8$; \textcolor{orange}{--$\,\cdot\, \cdot\,$--}: $D_\mathrm{dp}/D_s=1.6$. Full symbols are those for which $D_\mathrm{dp}^2/(D_c D_s)\ge10$; open symbols: $D_\mathrm{dp}^2/(D_c D_s)<10$. \label{fig:Tmix_Pe_eff_salt_repelling_plateau} }
\caption{ {\color{red}The ground-state energy $E_{\rm g}$ in (a) and magnetization $\langle \sigma_z\rangle$ in (b) of the Ohmic two-impurity SBM with $s=1$ as a function of the coupling strength $\alpha$ at $\varepsilon=K=0$ and $\Delta=0.025$. The numbers of effective bath modes and coherent-superposition states ($M=500, N=6$) and ($M=30, N=12$) are used for NVM with linear and logarithmic grids ($\Lambda=2$), respectively. The truncated number $N_{\rm tr} = 4$ is set in the exact-diagonalization (ED) procedure. In the inset of (a), the second derivative of $E_{\rm g}$ is shown for ED and NVM, and an exponential fit is presented with the dashed line. In (b), a biased case with $\varepsilon=10^{-4}$ is shown with stars.} }
\caption{{\color{red}(a) The spin coherence $\langle \sigma_x\rangle$ obtained by ED and NVM against the dissipation $\alpha$ at $\varepsilon=K=0,~s=1$, and $\Delta=0.025$ for the linear and logarithmic discretization ($\Lambda=2$). The criterion $\langle \sigma_x \rangle = \Delta/\omega_c$ is plotted with the dash-dotted line, and $\Delta / \langle \sigma_x\rangle $ is shown in the inset. (b) The ground-state energy $E_{\rm g}$ and spin coherence $\langle \sigma_x\rangle$ obtained from different variational works, i.e., $\rm VM1$ in Ref.~\cite{mcc10}, $\rm VM2$ in Ref.~\cite{zhe15}, and NVM in this work, in the biased case of $\varepsilon=10^{-5},~ s=1,~K=0$, and $\Delta=0.1$. A sharp kink in the spin coherence is marked by the arrow.} }
\caption{{\color{red}The correlation between two bases $\Delta \langle \rm{B}_{\uparrow\uparrow}|\rm{B}_{\downarrow\downarrow}\rangle$, the correlation functions $\rm Cor_{X}$ and $\rm -Cor_{P}$, and the departure from the minimum uncertainty, $\Delta X_{\rm b}\Delta P_{\rm b}-1/4$, are shown with respect to the coupling strength $\alpha$. Dashed lines represent exponential fits. In the inset, the renormalized tunneling $\Delta_r/\Delta$ is given. } }
\caption{ (a) Average displacement coefficients $\overline f_k$ (circles) and $\overline p_k$ (triangles) for three coupling strengths $\alpha=0.05,~ 0.316$ and $0.45$, corresponding to the delocalized phase, transition point, and localized phase, respectively. Solid lines represent the sums of $\overline f_k$ and $\overline p_k$, and dashed lines stand for the classical displacements $\lambda_k/\omega_k$. (b) The average displacement function $\overline f_k + \overline p_k$ and average weight function $\overline D^2-\overline A^2$ for the $0$-th bath mode. {\color{red} The inset shows the effective energy scale $\chi$ estimated from Eq.~(\ref{vm_chi}) and the renormalized tunneling $\Delta_r$ in the delocalized phase. The dashed line indicates an exponential fit.} }
\caption{ {\color{red}(a) The correlation between two bases $\langle \rm{B}_{\uparrow\uparrow}|\rm{B}_{\downarrow\downarrow}\rangle$ for different tunneling constants $\Delta=0.01,~ 0.025,~ 0.05,~ 0.1$, and $0.2$ at $\varepsilon=K=0,~ s=1$, and $\omega_c=1$. The dashed line shows an exponential fit. (b) The magnetization $\langle \sigma_z\rangle$ as a function of $\alpha$ for different values of the bias $\varepsilon=10^{-5},~ 10^{-6},~ 10^{-7}$, and $10^{-10}$ at $K=0,~ s=1$, and $\Delta=0.1$. In addition, the case of $\Delta=0.01$ is shown with circles, and exponential-like fits are presented with dashed lines. The scaling behavior of the crossover scale $T^{*}$ is shown in the inset.} }
\caption{ {\color{red}Ground-state properties including $E_{\rm g},~\langle \sigma_z\rangle,~\Delta_{\rm r}/ \Delta$, $\langle \rm{B}_{\uparrow\uparrow}|\rm{B}_{\downarrow\downarrow}\rangle$, and $\rm Cor_{X}$ are presented as a function of $\alpha$ for the antiferromagnetic case with $K=3.0$. The dashed lines in (a) are guides to the eye for linear fits.} }
\caption{A small portion of the resulting knowledge graph, showing the relationships between concepts in SIR modeling. The red vertices are concept nodes, with the big red vertex representing a cluster center for concepts related to ``These Models'', and the blue nodes are source code variable nodes. In the bottom left of the figure, the {\color{blue}\texttt{infected\_individuals}} variable is related to the {\color{red}infection} concept, which is related to the {\color{red}An exposed infectious class} concept. Additionally, the {\color{blue}\texttt{Beta}} variable is related to the {\color{red}Beta rate} concept, which is related to the {\color{red}susceptible} concept. }
\caption{Transferred sentences on Yelp ($1\%$ data), Yahoo and Enron datasets, where \textcolor{red}{red} denotes successful style transfers, \textcolor{blue}{blue} denotes content losses, and \textcolor{orange}{orange} denotes grammar errors. Best viewed in color.}
\caption{ Comparisons with top trackers in VOT2015. {\color{red}Red}, {\color{green}green} and {\color{blue}blue} fonts indicate the \emph{1st, 2nd, and 3rd} best performance, respectively. Best viewed on a color display.}
\caption{ Comparisons with top trackers in VOT2016. {\color{red}Red}, {\color{green}green} and {\color{blue}blue} fonts indicate the \emph{1st, 2nd, and 3rd} best performance, respectively. Best viewed on a color display.}
\caption{Experimental results. \textcolor{red}{First}, \textcolor{green}{second}, \textcolor{blue}{third} best results are highlighted in color, and the results that are better than the baseline (the first line in each block) performance are highlighted in bold. \%Recall represents the overall performance.}
\caption{Example of turning point annotations ({\color{tp1}{TP1}}, {\color{tp2}{TP2}}, {\color{tp3}{TP3}}, {\color{tp4}{TP4}}, {\color{tp5}{TP5}}, respectively) for the synopsis of the movie ``Panic Room''.}
\caption{Action set for the standard Noughts and Crosses (N\&C) game, where square brackets denote robot commands such as gestures. A similar set is used for the ultimate N\&C game but with a larger set of moves.}
\caption{\textcolor{revised}{CACC clustering algorithm}}
\caption{\textcolor{revised}{CACC system fallback loop}}
\caption{Profiles of the vertical velocity at the top boundary (\dashed) and inviscid $C_p$ at the bottom boundary (\solid). Lines with markers, TSB-SB; lines only, TSB-SO. }
\caption{Comparison of (a) mean velocity and (b) Reynolds normal stresses in wall units at $Re_\theta = 670$ for validation. \solid Current results, \dashed \cite{SchlatterO10}, \dashdot \cite{Wuetal17}.}
\caption{Pre-multiplied energy spectrum of the pressure fluctuation at several locations in the separated shear layer. (a) \revb{Contour map of $p^\prime_\text{rms}$}; the cross markers indicate the locations where the spectra are obtained \revb{($x/\theta_0 = 207.3$ for (b), $310.3$ for (c) and $540.0$ for (d))}. The square markers are the locations where the two-point correlation is evaluated. \solid: selected mean streamline passing the high-$p^\prime_\text{rms}$ regions. (b-d) Pressure spectra. The thin solid vertical line marks the high frequency $f_h = 0.0025 \,U_o/\theta_o$, the blue dashed vertical line represents the low frequency $f_l = 0.001 \,U_o/\theta_o$, and the dash-dotted vertical line marks $f_m= 0.002 \,U_o/\theta_o$.}
\caption{(a) Profile of the normalized vorticity thickness. \solid, TSB-SO, \dashdot, $d\delta_\omega/dx = C_{\delta_\omega} (U_\text{max} - U_\text{min})/(U_\text{max} + U_\text{min})$, $C_{\delta_\omega}=0.16$. (b) Dominant frequency of the mixing layer predicted by the canonical mixing-layer relationship. \solid, $St_{\delta_\omega} = 0.25$; gray hatched region, $St_{\delta_\omega} \in [0.2, \,0.3]$. The thin solid horizontal line marks the high frequency $f_h = 0.0025\, U_o/\theta_o$, the dash-dotted horizontal line marks $f_m= 0.002 \,U_o/\theta_o$, and the dashed horizontal line represents the low frequency $f_l = 0.001\,U_o/\theta_o$.}
\caption{Isosurfaces of the real part of the high-frequency DMD mode $f_h\theta_o/U_o = 2.49\times10^{-3}$, \revall{$St_{L_\text{sep}}=1.125$}. (a), $u_\text{dmd}/U_o=\pm0.02$; (b, c), $v_\text{dmd}/U_o=\pm 0.01$ top and side view respectively. \solid $U=0$.}
\caption{Results on \set{ChicagoFSWild/dev}; training on \set{ChicagoFSWild/train}. Ours+X: iterative attention (proposed method) applied to input obtained with X. \textcolor{blue}{+LM}: add language model trained on \set{ChicagoFSWild/train}.}
\caption{Results on \set{ChicagoFSWild/test} and \set{ChicagoFSWild+/test}. Black: %model and LM
trained on \set{ChicagoFSWild/train}; \textcolor{green!50!black}{Green}: trained on \set{ChicagoFSWild/train + ChicagoFSWild+/train}. \vspace{-.5em} }
\caption{Single-view depth estimation results on the test split of the KITTI raw dataset~\cite{Geiger2013IJRR}. Methods trained on the KITTI raw dataset~\cite{Geiger2013IJRR} are denoted by K. Models with pre-training on CityScapes~\cite{Cordts2016Cityscapes} are denoted by CS+K. (D) denotes depth supervision, (\textcolor{red}{B}) denotes binocular/stereo input pairs, (\textcolor{blue}{M}) denotes monocular video clips, and (\textcolor{green}{J}) denotes joint learning of multiple tasks. The best performance in each block is highlighted in bold.}
\caption{Accuracy (IoU $>$ 0.5) on the RefCOCO dataset. \textbf{Bold}: best result. \redfont{Red}: second best result. \bluefont{Blue}: best result of VC. }
\caption{Comparison with \texttt{CosmoTransitions}. For \texttt{CosmoTransitions}, we take \texttt{fRatioConv} as 0.02. The bounce action ${\cal S}_E$ is calculated for $d=3$. The runtimes are measured on a ThinkPad X250 running Ubuntu 16.04, with an Intel\textregistered{} Core\texttrademark{} i7-5600U CPU (2.60 GHz) and GCC version 5.4.0 as the compiler. We take \texttt{-O3} as the optimization option. }
\caption{Examples of VPE, Sluice Ellipsis, and Coreference represented as ``questions'' about their associated contexts. In the ellipsis examples, wh-phrases and auxiliary verbs are marked in \textcolor{red}{red} and elided phrases are marked in \textcolor{blue}{blue}. In coreference examples, the mentions are highlighted in \textcolor{green}{green} and their antecedents in \textcolor{orange}{orange}. Since context sentences can contain multiple mentions, each mention is disambiguated with \texttt{<ref></ref>} tags when input to the model. See Section~\ref{sec:qa-conversion} for more information.}
\caption{Governing values of $\delta_{ij}$ that enter Eqs. {\color{blue}(5,6,7)} in the $ S_{4} \times Z_{n} $ flavor-symmetric SUSY SO(10) theory.}
\caption{The outcomes of the calculations are presented for the CMSSM case. In Figs. {\color{blue}3a, 3b}, the horizontal lines depict the present (MEG 2016) and future MEG constraints on BR($\mu \rightarrow e + \gamma$).}
\caption{ Figs. {\color{blue}(4a-4e)} present the allowed region of the SUSY parameters as constrained by the MEG 2016 bound.}
\caption{The results of the analysis and calculations are presented for the NUHM case. In Figs. {\color{blue}5a,5b}, the horizontal lines illustrate the present (MEG 2016) bound from the MEG Collaboration and the future MEG bound on BR($\mu \rightarrow e + \gamma$). Figs. {\color{blue}5c,5d} portray the allowed SUSY parameter space for different parameters, as constrained by the stringent MEG 2016 bounds.}
\caption{Figs. {\color{blue}6a-6d} show the allowed regions of the soft SUSY parameter space, as constrained by the stringent MEG 2016 bounds.}
\caption{The results of the analysis are presented for the NUSM case. In Figs. {\color{blue}7a, 7b}, the horizontal lines depict the present (MEG 2016) and future MEG bounds on BR($\mu \rightarrow e + \gamma$). Figs. {\color{blue}7c, 7d} show the parameter space allowed by the MEG 2016 bound.}
\caption{The results of the computations are presented for the NUSM case. Figs. {\color{blue}8a, 8b, 8c, 8d, 8e} depict the parameter space allowed by the MEG 2016 bound from the MEG Collaboration.}
\caption{\small Performance curves for a varying number of sounds in the testing mixture on MUSIC, obtained by models trained with mixtures of $2$ sounds (first row) and $3$ sounds (second row), respectively. \textcolor{green}{Green}, \textcolor{red}{red}, and \textcolor{blue}{blue} lines stand for MP-Net, PixelPlayer~\cite{zhao2018sound}, and MIML~\cite{gao2018learning}, respectively. }
\caption{\small Performance curves for a varying number of sounds in the testing mixture on VEGAS, obtained by models trained with mixtures of $2$ sounds (first row) and $3$ sounds (second row), respectively. \textcolor{green}{Green}, \textcolor{red}{red}, and \textcolor{blue}{blue} lines stand for MP-Net, PixelPlayer~\cite{zhao2018sound}, and MIML~\cite{gao2018learning}, respectively. }
\caption{Sample generated recipe. Emphasis on personalization and explicit ingredient mentions via \colorbox{light-gray}{highlights.}}
\caption{ Symbols represent the ratios $\mathcal{P}$ of occupation probabilities (\textcolor{orange}{$\blacksquare$}) and $\mathcal{T}$ of transition rates (\textcolor{blue}{$\bullet$}). Results of simulations are displayed with points while lines show various theoretical scalings discussed in the text: ``full'' (green dot-dashed, see Eq.~(\ref{eq:full})), ``width ratio'' (blue solid, see Eq.~(\ref{eq:ditlevsen})) and ``depth ratio'' (orange dashed, see Eq.~(\ref{eq:bier})). Subsequent panels correspond to various setups: piecewise-linear potential with $\Delta E_1=8.5$, $\Delta E_2=7.5$, $x_1=0.25$ and $x_2=0.75$ (top panel), piecewise-linear potential with $\Delta E_1=85000$, $\Delta E_2=75000$, $x_1=0.7$ and $x_2=0.7$ (middle) and the continuous potential (\ref{eq:ciagly}) with $a=1$ (bottom). The red triangle (\textcolor{red}{$\blacktriangle$}) and the green rhombus (\textcolor{green}{$\blacklozenge$}) in the top panel depict analytical evaluation of $\mathcal{P}$, $\mathcal{T}$, respectively, derived with the stationary $p(x)$ for the Gaussian ($\alpha=2$) noise. }
\caption{ Symbols represent the ratios $\mathcal{P}$ of occupation probabilities (\textcolor{orange}{$\blacksquare$}) and $\mathcal{T}$ of transition rates (\textcolor{blue}{$\bullet$}) for the continuous double-well potential (\ref{eq:ciagly}) with $a=1$. Solid lines show the theoretical ``width ratio'' scaling (blue solid, see Eq.~(\ref{eq:ditlevsen})).
% full (green dot-dashed, see Eq.~(\ref{eq:full})),
% and depth rate (orange dashed, see Eq.~(\ref{eq:bier})).
%
%
Subsequent panels correspond to various values of the $\sigma$ parameter scaling the strength of L\'evy noise: $\sigma=1$ (top panel), $\sigma=0.1$ (middle panel) and $\sigma=10$ (bottom panel). The legend is included in the bottom panel.}
\caption{ Ratio $\mathcal{R}$ of mean first passage times for well-bottom-to-well-bottom $T_{w-w}$ and well-bottom-to-barrier-top $T_{w-b}$ for the L\'evy noise (empty points) and mixture of Gaussian and L\'evy noises (full symbols) for the symmetric double-well potential given by Eq.~(\ref{eq:ciagly}) with $a=0$. For more details see~Ref.~\onlinecite{dybiec2007}. }
\caption{Dependency between entities. The visual relationship between grounding regions for \textcolor{orange}{``cheerleaders''} and \textcolor{blue}{``a girl''} should agree with context ``toss ... high up into the air''. }
\caption{Gold label multiplicity. The \textcolor{green}{green box} is the annotated gold grounding region for the entity phrase \textcolor{green}{``Old man''}, while the \textcolor{orange}{orange dashed boxes} are region proposals with IoU $\ge 0.5$ with the gold region. }
\caption{Our model for phrase grounding as a sequence labeling task. The $K\!\times\!K$ transition score matrix is derived from the features of $K$ region proposals. The $T\!\times\!K$ emission score matrix is derived from a joint representation of phrase-region pairs, which is fused from features of region proposals and $T$ entity phrases. Bounding box regression is applied to the sequence of regions predicted by the CRF. \textcolor{cyan}{Cyan dashed line}: contextualized transition score prediction (Section~\ref{ssec:model}). }
\caption{\label{fig-example} Example translations where pronouns in brackets are dropped in the original inputs (``\textsc{Inp.}'') but labeled by humans according to the references (``\textsc{Ref.}'') and the previous sentence (``\textsc{Pre.}''). We italicize some {\em \color{blue} mistranslated} words and highlight the {\bf \color{red} correct} ones in bold. }
\caption{\redcom{Average harvested power at the ER as a function of $T_b$ (duration of the backscatter phase).}}
\caption{\redcom{Ratio of magnitude of ambient and backscatter signal components at the output of the correlator plotted against $T_b$ (duration of the backscatter phase).}}
\caption{\redcom{Average harvested power at the ER with the proposed sequence plotted against the duration of the backscatter phase, $T_b$.}}
\caption{\redcom{Average harvested power, $\bar{Q}$, \\ plotted against the transmit power of the AS, $P_s$.}}
\caption{\redcom{Average harvested power, $\bar{Q}$, \\ plotted against the number of antennas at the ET, $M$.}}
\caption{\redcom{Average harvested power at the ER plotted against the offset between the incoming and locally generated signals at the correlator.}}
\caption{\redcom{Average harvested power at the ER plotted against the number of ambient symbols during the backscatter phase when the actual ambient symbol duration is different from the designed value.}}
\caption{Examples of stable evolution of the rotating states at $\delta=-0.1$, $m_2=-1$ (a) and $\delta=0$, $m_2=+1$ (c). (b) Instability development at $\delta=+0.1$, $m_2=0$ leading to the formation of a persistent breather with the period $\approx 22.6\pi/\omega$. (d) Instability development leading to switching of the state from the upper to the lower branch at $\delta=+0.29$, $m_2=+2$. In (a), (c), and (d) the pump amplitude is $h_{1,2} =0.10$, while in (b) $h_{1,2} =0.05$. All input states are taken from the upper branch that smoothly continues to negative $\delta$ values. All distributions are shown within the $x,y\in [-5, 5]$ window. See \textcolor{blue}{Visualizations 1,2,3,4}. \label{fig:dyn}}
\caption[Comparison of excess surface mass density profiles in spec-z and photo-z]{Comparison of excess surface mass density profiles inferred via weak lensing by 3D voids found in spec-z (red) and photo-z (black) \redmagic{} mocks in \mice{}. Figure~\ref{fig:elongation} shows that the major axis of 3D voids in our sample tends to be aligned with the LOS, which contributes to the excess lensing signal.}
\caption[Stacked photo-z voids]{Stack of the true positions (spec-z's) of \mice{} \redmagic{} galaxies around the centres of 3D voids that have been identified using photo-z's of the same mock galaxies. The colour coding reflects the excess density of galaxies, $n_{vg}/\ave{n_g}-1$, as a function of the void-centric distances along ($r_\parallel$) and perpendicular ($r_\perp$) to the LOS. As discussed in section~\ref{sec:phzScat}, the stack gives a misleading impression of void elongation due to photo-z scatter. Figure~\ref{fig:elongation} shows that our void sample is not biased with respect to ellipticity, though modest orientation bias is present.}
\caption[Elongation and orientation of voids]{Normalized distributions for the elongation (top, defined as the ratio between the largest and the smallest eigenvalue of the inertia tensor) and the orientation (bottom, defined as the cosine of the angle $\vartheta$ between the LOS and the principal inertia tensor eigenvector) of 3D voids found in spectroscopic (red) and photometric (black) \redmagic{} mocks in \mice{}. Vertical lines indicate the mean of each distribution (solid red for spectroscopic, dashed black for photometric mocks).}
\caption[Clustering and lensing profiles of 3D voids in mocks]{Comparison of $\Delta\Sigma(r_p)$ profiles from weak lensing (black dots with error bars) and projected galaxy-density profiles $\Delta\xi_{vg}^{2D}(r_p)$ (green area) around 3D voids of different size in \mice{} \redmagic{} mocks. $\Delta\Sigma(r_p)$ has been rescaled by an overall amplitude $c_\mathrm{slope}$ to yield a best match with $\Delta\xi_{vg}^{2D}(r_p)$. The first data point of $\Delta\xi_{vg}^{2D}$ has been fixed to a value of zero and is not used in the fit.}
\caption{Hard-Eight task suite. In each task an agent (\textcolor{blue}{$\blacktriangledown$}) must interact with objects in its environment in order to gain access to a large apple (\textcolor{red}{$\blacktriangledown$}) that provides reward. The 3D environment is also procedurally generated so that in every episode the state of the world, including object shapes, colors, and positions, is different. From the point of view of the agent the environment is partially observed. Because it may take hundreds of low-level actions to collect an apple, the reward is sparse, which makes exploration difficult.}
\caption{Reward vs actor steps curves for R2D3 and baselines on the Hard-Eight task suite. The curves are computed as the mean performance for the same agent across 5 different seeds per task. Error regions show the 95\% confidence interval for the mean reward across seeds. Several curves overlap exactly at zero reward for the full range of the plots. R2D3 can perform at human level or better on Baseball, Drawbridge, Navigate Cubes and Wall Sensor. R2D2 could not get any positive rewards on any of the tasks. DQfD and BC agents occasionally see rewards on the Drawbridge and Navigate Cubes tasks, but this happens rarely enough that the effect is not visible in the plots. Indicators (\textcolor{red}{$\blacktriangledown$}) mark analysis points in Section~\ref{guided-exploration}.}
\caption{Our representation, visualized for a simulation without drift in the pose estimate. Left image (ground truth): black line: robot trajectory, {\color{PRGreen} green lines: place recognitions}, {\color{RayOrange} orange lines: depth sensor rays} at current pose; in the background, \colorbox{OccYellow}{yellow squares indicate true occupied space}, \colorbox{FreeGreen}{green squares true known free space} and \colorbox{UnkBlue}{blue squares true unknown free space}. Right image (our representation): black line: pose graph without loop closures, {\color{red}red lines: known obstacles}, {\color{blue} blue lines: frontiers}, {\color{PRGreen}green lines: local volumes in consolidation scope} (see \refsec{sec:consolidation}). Note how frontiers may overlap, as they are not resolved globally but only within the consolidation scope.}
\caption{A local volume obtained from one measurement. The red lines starting at the robot position represent the depth measurement rays. \textcolor{blue}{Blue points represent samples at maximum sensor range} and \textcolor{red}{red points represent samples that hit an obstacle}. \textcolor{blue}{Blue edges are the frontier edges} and \textcolor{red}{red edges are the obstacle edges}.}
\caption{ Frontier consolidation prompted by place recognition. See \reffig{fig:repr} for a detailed legend. (a) Before place recognition, the {\color{PRGreen} consolidation scope} contains the most recent poses. (b) After {\color{red} place recognition (red lines in ground truth)}, {\color{ConsRed} local volumes across the place recognition edges (light red)} are added to the consolidation scope, temporarily transformed into the local frame of the current pose (arrow, light green volumes). This affects {\color{blue} frontiers (blue)}: The frontiers above the {\color{ConsRed}red volumes} and the {\color{blue} \qmarks{range arc}} of the current pose are both resolved.}
\caption{Excerpt of the results on the INRIA Aerial Image Labeling dataset. Correctly classified pixels are in \textcolor{OliveGreen}{green}, false positives are in \textcolor{Lavender}{pink} and false negatives are in \textcolor{RoyalBlue}{blue}. The multi-task framework allows the network to better capture the spatial structure of the buildings.}
\caption{\textcolor{blue}{Exploration of several values for the trade-off factor of the distance transform regularization on the ISPRS Vaihingen dataset and influence on the relative improvement}. Results are obtained using a 3-fold cross-validation.}
\caption{Intensity spectra for \gray{} emission processes within $|b| < 10^\circ$ of the Galactic plane and toward the poles. Top row: quadrant 1 (left) and quadrant 2 (right). Middle row: quadrant 2 (left) and quadrant 3 (right). Bottom row: north polar (left) and south polar (right). Various black line styles show the SS$_{_{\rm SA50}}$ solution for the respective sky regions: long dashed, IC; short dashed, $\pi^0$-decay; dash-dot, bremsstrahlung. Coloured curves show the corresponding TDD intensity spectra over the last 5~Myr of the simulation run: coding is red (earliest) to cyan (latest) with a 10~kyr sampling interval. \label{fig:gammaspec} }
\caption{Time series of \gray{} intensities at 10~GeV (cyan), 100~GeV (red), and 1~TeV (green) for IC and $\pi^0$-decay processes for: (left) $45^\circ < l < 180^\circ$ and (middle) $-180^\circ < l < -45^\circ$ ranges for intermediate latitudes above/below plane, and (right) north/south Galactic polar regions $|b| > 60^\circ$. Time range is for the last 5~Myr of the TDD simulation epoch and has 10~kyr sampling. This range and sampling is the same as shown by the right panels of Fig.~\ref{fig:crtimeseries}. \label{fig:gammaoutofplanetimeseries} }
\caption{High-energy \gray{} emissions and fractional residuals for the TDD solution at selected energies. Top row shows total all-sky intensities for the TDD solution at 599.5~Myr, corresponding to the CR intensity sample shown in Fig.~\ref{fig:edenfracres}. Second row shows fractional residuals at same energies for total \gray{} intensities for the TDD solution using the SS$_{_{\rm SA50}}$ total \gray{} emissions as the baseline. Third and fourth rows show corresponding fractional residuals for total gas-related ($\pi^0$-decay and bremsstrahlung) and IC emissions using the respective SS$_{_{\rm SA50}}$ predictions (gas-related, IC). The longitude meridians and latitude parallels have $45^\circ$ spacing. \label{fig:gammaintensityandfrac} }
\caption{High-energy \gray{} emissions fractional residuals for the TDD solution advanced by 50~kyr from those shown in Fig.~\ref{fig:gammaintensityandfrac} at selected energies. Top row shows fractional residuals for total \gray{} intensities for the TDD solution using the SS$_{_{\rm SA50}}$ total \gray{} emissions as the baseline at time sample 599.55~Myr. Second and third rows show corresponding fractional residuals for total gas-related ($\pi^0$-decay and bremsstrahlung) and IC emissions using the respective SS$_{_{\rm SA50}}$ predictions (gas-related, IC). The longitude meridians and latitude parallels have $45^\circ$ spacing. \label{fig:gammafracadd50kyr} }
\caption{Total \gray{} fractional residuals at selected energies for the TDD solution at time samples 599.5~Myr (top row) and 599.55~Myr (bottom row) using the SS$_{_{\rm SA0}}$ intensities for the baseline/reference prediction. The longitude meridians and latitude parallels have $45^\circ$ spacing.}
\caption{Comparison with the state-of-the-art methods. \textcolor{blue}{\textbf{Blue}} indicates the best value overall.}
\caption{ Top-down view of the trajectory graphs for beam search and \short{}. \textcolor{Turquoise}{Blue Star} is the start and \textcolor{red}{Red Stop} is the target.}
\caption{Comparison between the agent equipped with an LSTM instruction encoder and our \short{} agent on a validation unseen environment (path\_id: 6632), including top-down trajectory view and step-by-step navigation views. We indicate the start (\protect\inlinegraphics{./start-c.png}), target (\protect\inlinegraphics{./stop-yes-c.png}) and failure (\protect\inlinegraphics{./stop-no-c.png}) of agents in an unseen environment.}
\caption{Comparison among the agents trained with teacher-forcing, student-forcing and stochastic sampling strategies on a validation unseen environment (path\_id: 7201), including top-down trajectory view and step-by-step navigation views. We indicate the start (\protect\inlinegraphics{./start-c.png}), target (\protect\inlinegraphics{./stop-yes-c.png}) and failure (\protect\inlinegraphics{./stop-no-c.png}) of agents in an unseen environment.}
\caption{Comparison between the agent equipped with an LSTM instruction encoder and our \short{} agent on a validation unseen environment (path\_id: 6632), including top-down trajectory view and step-by-step navigation views. We indicate the start (\protect\inlinegraphics{imgs/signs/start-c.png}), target (\protect\inlinegraphics{imgs/signs/stop-yes-c.png}) and failure (\protect\inlinegraphics{imgs/signs/stop-no-c.png}) of agents in an unseen environment.}
\caption{Comparison among the agents trained with teacher-forcing, student-forcing and stochastic sampling strategies on a validation unseen environment (path\_id: 7201), including top-down trajectory view and step-by-step navigation views. We indicate the start (\protect\inlinegraphics{imgs/signs/start-c.png}), target (\protect\inlinegraphics{imgs/signs/stop-yes-c.png}) and failure (\protect\inlinegraphics{imgs/signs/stop-no-c.png}) of agents in an unseen environment.}
\caption{Three out of 150 reviews for the movie ``Coach Carter'', and summaries written by the editor, and generated by a model following the \textsc{Extract-Abstract} approach and the proposed \textsc{Condense-Abstract} framework. The latter produces more informative and factual summaries while allowing control over aspects of the generated summary (such as the {\color{green!45!blue}acting} or {\color{red!45!blue}plot} of the movie).}
\caption{Illustration of EA and CA frameworks for opinion summarization. In the CA framework, users can obtain need-specific summaries at test time (e.g.,~give me a summary focusing on \textcolor{red}{acting}).}
\caption{Examples of general-purpose and need-specific summaries generated by four systems. We also show the consensus summary (\textsc{Gold}). \uwave{Underlined} phrases denote factually incorrect information. Words/phrases in color highlight aspects pertaining to {\color{green!45!blue} acting}, {\color{red!45!blue} plot}, {\color{blue!45!blue} positive} and {\color{red!45!red} negative} sentiment. {The examples show that incorporating an extractive module (\textsc{+Salient}) prevents the model from customizing summaries.}}
\caption{Quantitative comparison of scan-level performance. Best results are marked in \textcolor{blue}{blue}. For the 3DSE + \acs{ACE} F1 phase-level scores, we use {\@fnsymbol{1}} and {\@fnsymbol{2}} to indicate if differences were statistically significant ($\alpha <0.05$) compared to the text-mining and 3DSE model, respectively. Significance was calculated using randomized tests~\cite{Yeh_2000} and adjusted using the multiple comparison correction of Holm-Bonferroni~\cite{Holm_1979}.}
\caption{Across-model quantitative evaluation using the F1 score. Best and second-best results are marked in \textcolor{blue}{blue} and \textcolor{red}{red}, respectively.}
\caption{Study-level performance of text mining and 3DSE. Each row groups studies based on the number of dynamic \ac{CT} \acfp{SOI} they possess. Each column counts the number of studies based on how many scans were misclassified, if any. Best results for each \ac{SOI} number are marked in \textcolor{blue}{blue}.}
\caption{\textbf{Qualitative results on CUB-200-2011} comparing our method {\color[RGB]{181,38,25}\textbf{\ablpsi-HRNet}} (red) with {\color[RGB]{137,31,124}\textbf{CMR}} \cite{kanazawa2018learning} (violet). Each column contains the input monocular 2D keypoints (top), lifting of the 2D keypoints into 3D by CMR (middle) and by our method (bottom) from 2 different 3D viewpoints (the same view and a view offset by 90$^{\circ}$ along camera y-axis). \label{f:qual_birds} }
\caption{\textbf{3D poses on Human3.6M} predicted from monocular keypoints. Each column contains the input 2D keypoints (top) and a comparison between {\color[RGB]{66,154,201}\textbf{PoseGAN}} \cite{kudo2018unsupervised} (blue, middle), and our method {\color[RGB]{181,38,25}\textbf{\methodname}} (bottom, red) from two 3D viewpoints. \label{f:qual_h36m}}
\caption{\textbf{Qualitative results on PASCAL3D+} comparing our method {\color[RGB]{181,38,25}\textbf{\ablpsi-HRNet}} (red) with {\color[RGB]{137,31,124}\textbf{CMR}} \cite{kanazawa2018learning} (violet). Each column contains the input monocular 2D keypoints (top) and lifting of the 2D keypoints into 3D by CMR (middle) and by our method (bottom) viewed from 2 different angles. \label{f:qual_p3d}}
\caption{Example of an input tweet alongside extracted questions with possible answers (the correct answers are underlined), shortened versions, and the probability of success of the shortened version (``pr.\ succ.''), \ie, the fraction of workers who voted for the shortened version over the input tweet. \textcolor{teal}{Colored} tweets are versions where the short version was preferred over the original one on average.}
\caption{Visualization of a sample QA pair and the source of individual words in the answer. The \textbf{Answer with source probabilities} section displays a heatmap on answer words selected from the question, passage, vocabulary and knowledge, respectively. A slot with a higher source probability is highlighted in darker cyan. The \textbf{Answer colored by source} section shows the answer in which every word is colored based on the source it was actually selected from. Words in \textcolor{blue}{blue} come from the question, \textcolor{Red}{red} from the passage, \textcolor{ForestGreen}{green} from the vocabulary, and \textcolor{BurntOrange}{orange} from the knowledge. The visualization is best viewed in color.}
\caption{Examples of Referring Expression Grounding (REG). Given a query and an image, REG aims to localize the referential entity. We can observe that, besides visual attributes, relationships with other entities are often used to describe the target entity. The target entity and its corresponding proposal are shown in \redfont{red}. The contextual entity and its corresponding proposal are shown in \bluefont{blue}.}
\caption{Accuracy (IoU $>$ 0.5) on RefCOCO, RefCOCO+ and RefCOCOg. \textbf{Bold}: best result. \bluefont{Blue}: best result of VC. }
\caption{Illustration of the generated attentional maps of the crowd flow in {\color{red}{sequential representation learning}} with ${n}$ set to 4. Every five columns form one group. In each group: i) on the first row, the first four images are the input sequential inflow/outflow maps and the last one is the ground truth inflow/outflow map of next time interval; ii) on the second row, the first four images are the attentional maps generated by our ACFM, while the last one is our predicted inflow/outflow map; iii) on the third row, the first four images are the residual maps between the input flow maps and the ground truth, while the last one is the residual map between our predicted flow map and the ground truth.}
\caption{Illustration of the generated attentional maps of the crowd flow in {\color{red}{periodic representation learning}} with ${m}$ set to 2. Every three columns form one group. In each group: i) on the first row, the first two images are the input periodic inflow/outflow maps and the last one is the ground truth inflow/outflow map of next time interval; ii) on the second row, the first two images are the attentional maps generated by our ACFM, while the last one is our predicted inflow/outflow map; iii) on the third row, the first two images are the residual maps between the input flow maps and the ground truth, while the last one is the residual map between our predicted flow map and the ground truth.}
\caption{The proposed interactive UIs for safe human-robot manufacturing: projector-mirror (top) and HoloLens (bottom). \textcolor{blue}{Video: https://youtu.be/A8VTTcYFoMo}}
\caption{ Slices of number density, color, pressure, and \fcolor\ for n0.02-T3e6, right before ($t=70$ Myr, upper four panels) and after ($t=90$ Myr, lower four panels) the multiphase formation. Cool gas (on which the white circles are centered) forms preferentially in regions with relatively high densities, low temperatures and low \fcolor. }
\caption{\label{tab:docsamples} Two example topics and their distributions of $\mu$, $\sigma$ and $cv$ from the NYT corpus. Two topics are: \textcolor{blue}{Topic1 (in blue)}: \textit{\{financial, banks, bank, money, debt, fund, loans, investors, funds, hedge\}}. \textcolor{orange}{Topic2 (in orange)}: \textit{\{world, one, like, good, even, know, think, get, many, got\}}. Their human rating scores are 3.4 and 1.0 respectively. }
\caption{Retrieval examples on ModelNet40. Top-10 matches are shown for each query, with the 1\textsuperscript{st} line for PointNet~\cite{c1_pointnet} and the 2\textsuperscript{nd} line for our DensePoint. The mistakes are highlighted in \textcolor[rgb]{1.00,0.00,0.00}{red}.}
\caption{(a) Point cloud with different sampling densities. (b) Results of testing with sparser points. (c) Point cloud with some points being replaced with random noise (highlighted in \textcolor[rgb]{1.00,0.00,0.00}{red}). (d) Results of testing with noisy points.}
\caption{The results (\%) of dropout with different ratios applied on ${\bm{\mathrm f}}_{\mathcal{N}(x)}$ in Eq. (\textcolor[rgb]{1.00,0.00,0.00}{4}).}
\caption{\textbf{Motivation}: sufficiently contextual semantic information is essential for a thorough grasp of the elusive shape formed by point cloud. The ``bottle'' is misidentified as the ``vase'' by PointNet~\cite{c1_pointnet}, while with sufficient context aggregated, it can be accurately recognized. Here, we only illustrate the multi-level context around the \textcolor[rgb]{0.00,0.00,1.00}{blue} point for visual clearness.}
\caption{The configuration details of shape part segmentation network. ``long-range'' indicates the long-range connections (see Fig. \textcolor[rgb]{1.00,0.00,0.00}{3}(b) in the main paper). $K$ is the number of classes.}
\caption{Retrieval examples on ModelNet40 dataset. Top-10 matches are shown for each query, with the 1\textsuperscript{st} line for PointNet [\textcolor[rgb]{0.00,1.00,0.00}{31}] and the 2\textsuperscript{nd} line for our DensePoint. The mistakes are highlighted in \textcolor[rgb]{1.00,0.00,0.00}{red}.}
\caption{The configuration details of normal estimation network. ``long-range'' indicates the long-range connections (see Fig. \textcolor[rgb]{1.00,0.00,0.00}{3}(b) in the main paper).}
\caption{An example for JpJI-MDM source model. (a) \textcolor[rgb]{0,0,0}{Subjects with joint, partially-joint and individual sources such that $C_1=2$, $C_2=[3,2,3,3,1,3,2,3,3,2]$, and $C_3=[2,4,2,1,5,1,1,2,3,3]$ (the same color indicates jointness),} (b) Correlation source matrix for the source model shown in (a). \textcolor[rgb]{0,0,0}{Black and white squares mean 1 and 0, respectively, and a blue square indicates that its value is unknown (0 or 1).}}
\caption{(a-d) Performance of the JpJI-ICA algorithm versus different iterations in terms of (a) the convergence rate of jSIR (dB), (b) $Acc(C)$, (c) $Acc(\tilde{\mathcal{K}}_{all})$, (d) JpJI feature for $K=10, C_1=3,C_2^{(k)}=2,C_3^{(k)}=1,k=1,...,K$ versus outer iterations; \textcolor[rgb]{0,0,0}{and (e-i) performance of the JpJI-ICA algorithm versus different $\sigma_0$ ($K=16,C_1=1,C_2^{(k)}=1,C_3^{(k)}=1,k=1,...,K$), in terms of its accuracy (e) to estimate the number of joint sources, (f) to estimate the number of partially-joint sources, (g) to estimate the number of individual sources, (h) Mean jSIR (dB), and (i) JpJI feature for joint, partially-joint, and individual sources.}}
\caption{\textcolor[rgb]{0,0,0}{(a-d) Estimated order for 39 subjects ($C^{(k)},k=1,...,39$) with different stimuli types (Positive music, Negative music, Positive sound, Negative sound) using ER-FM model, (e) mean spatial distribution of the first 4 significant sources ($p < 0.05$ FDR corrected) from control and depressed subjects, (f) JpJI feature for the first 10 extracted sources separately for different stimuli, (g) map of significant voxels in source 1, 2, and 3.}}
\caption{The hip torque generated by \VPa injects and removes energy from the system, whereas for \VPbl the order reverses. Both methods yield a net positive hip work (marked with ({\protect \markerVPaTri},{\protect \markerVPbTri})), which the leg damper has to remove. The \VP radius affects the phase of the energy reversal (marked with ({\protect \markerVPa},{\protect \markerVPb})).}
\caption{Positive, negative, and net work done by the leg and the hip. \VPb requires less leg work and net hip work than \VPa. However, as \VPb is placed further down (\protect \markerVPbB), the absolute positive and negative hip energy requirements increase as well. }
\caption{Sample texts, along with the machine (\textbf{Q$_{gen}$}) and human (\textbf{Q$_{hum}$}) generated questions for the given answer (\textbf{Ans}). Entities in the machine-generated questions that could not be resolved are shown in \textcolor{red}{red}; the corresponding resolved entities in the human-generated questions are in \textcolor{blue}{blue}.}
\caption{An example of a Warehouse Game graph -- all of them are defined on a $4\times 4$ grid. Rectangular vertices denote targets, a triangle vertex is an attacker's starting point and a blue shaded circle vertex is a defender's starting point. Values in vertices denote payoffs for the attacker and the defender, resp.,\ in the case of interception of the attacker in a given vertex. Additional utilities related to the case of a successful attack are assigned to the targets (the second value in the respective pair).}
\caption{Search Game graph. Solid edges denote the only Follower's paths that potentially allow him/her to reach one of the targets within {\color{red}a $4$-round} limit. Inclusion of any of the dotted edges in the Follower's path ends up with either negative or neutral payoff.}
\caption{(a) Liquid temperature $T_{TC}$, (b) pressure $P$, \red{(c)} volume fraction $\alpha$, (d) the compensated Nusselt number $\text{Nu}_\omega\text{Ta}^{-0.4}$, and (e) the drag reduction (DR) as a function of time. The gray shaded area and the blue lines correspond to data in the non-boiling regime, i.e. $t<t_{\text{boil}}$. The boiling point is defined using the intersection $P=P_v$ at a certain time, as shown in (b). The time steps in (d) correspond to the photographs shown in \fref{fig:figure1}.}
\caption{Pearson correlation coefficients between mean popularity measure and percentile, for each user (Coefficients with p-value $<$ 0.01 are shown in \textbf{bold} color). {\color[HTML]{009901}{Green}} values exhibit significant positive correlation, and {\color[HTML]{CB0000}{red}} values significant negative correlation.}
\caption{Ablation studies: classification test accuracy of the discriminator and inter-subject MS-SSIM score. \textcolor{red}{(Upper rows): scores for encoded image pairs $(z_{i},z_{j})$, (Lower rows): scores for non-encoded and encoded image pairs $(x_{i},z_{j})$}}
\caption{Ablation studies: distribution of the MS-SSIM score. (left): absence of term $\lossDis\big(D(E(\img_i), \img_j), s_{ij}\big)$, (right): absence of term $\lossDis\big(D(E(\img_i), E(\img_j)), s_{ij}\big)$. Results obtained with $\lambda=1$. \textcolor{red}{(Top row): MS-SSIM of $(z_{i},z_{j})$, (Bottom row): MS-SSIM of $(x_{i},z_{j})$}}
\caption{Budget of transport equations of the Reynolds stresses (a) $\overline{{u^\prime}^2}$, (b) $\overline{{v^\prime}^2}$, (c) $\overline{{w^\prime}^2}$, and (d) $-\overline{u^\prime v^\prime}$. The values are scaled by $u_\tau^4/\nu$. The colors of the lines in each panel represent (black) the Newtonian case, (blue) $\Wew=1000$ case, and (red) $\Wew=1500$ case, and the symbols represent the different terms in the Reynolds stress transport equation: \opentriangle, production term $P_{ij}$; \opencircle, Coriolis-force term $G_{ij}$; \opensquare, pressure-strain redistribution term $\Pi_{ij}$; \opendiamond, viscoelastic-force term $W_{ij}$. }
\caption{ This table shows cross-dataset generalization measured as area under the curve (AUC) of percentage of correct keypoints following \cite{zb2017hand}. Each row represents the training set used and each column the evaluation set. The last column shows the average rank each training set achieved across the different evaluation sets. The top-three ranking training sets for each evaluation set are marked as follows: \textbf{first}, \textcolor{blue}{second} or \textcolor{cyan}{third}. Note that the evaluation set of \textit{HO-3D} was not available at time of submission, therefore one table entry is missing and the other entries within the respective column report numbers calculated on the training set. }
\caption{ This table shows cross-dataset generalization measured as area under the curve (AUC) of percentage of correct keypoints following \cite{zb2017hand}. In contrast to the table reported in the main paper, an ImageNet-pretrained \textit{ResNet50} network is used for direct regression of normalized 3D pose. Entries are marked if they rank: \textbf{first}, \textcolor{blue}{second} or \textcolor{cyan}{third} for each dataset. }
\caption{Full annotation scheme for system response types after user abuse. {\color{gray}Categories (1a) and (1b) are excluded from this study.}}
\caption{\label{fig:responses_per_prompt_class}Response class percentage per prompt category: \textcolor{datadriven}{(A) Gender and Sexuality}, \textcolor{adult}{(B) Sexualised Comments}, \textcolor{rules}{(C) Sexualised Insults}, \textcolor{commercial}{(D) Sexual Requests and Demands}. Non-grammatical and non-coherent responses have been excluded, as have responses by adult-only bots. }
\caption{Text generation examples from Gigaword. \textcolor{gred}{Highlighted} words are selected. t1-3 are sampled from the decoder based on the selected content. Generations from VRS are more faithful to selected contents.}
\caption{Posterior inference example. \textcolor{red}{Highlighted} words are selected contents according to the posterior distribution $q_{\phi}(\beta|X,Y)$. b1-b5 are decoded by fixing the selected contents.}
\caption{Example visualization according to the filter scores on AffectNet (left) and CelebA (right). Each row shows samples with different scores: (Row 1) Samples with {\color{red}High} \emph{Class Confidence} score, (Row 2) Samples with {\color{blue}Low} \emph{Class Confidence} score, (Row 3) Samples with {\color{red}High} \emph{Real vs. Fake} score, (Row 4) Samples with {\color{blue}Low} \emph{Real vs. Fake} score. Each column represents samples from different categories. (Best viewed in color)}
\caption{Classification of the wake. \protect\markerone: stable flow; \protect\markertwo: steady tip vortex and unsteady shedding vortices; \protect\markerthree: unsteady tip vortex; \protect\markerfour: vortex dislocation.}
\caption{\ourmodel's beam search for the best decoding, one sentence (step) at a time, showing the predicted state changes and the resulting dependency graph after each step. In step 2, \ourmodel~chooses between predicting whether \entity{water} is \statechange{destroyed} (\textcolor{red}{\bf D}, upper figure) or \statechange{moved} (\textcolor{green}{\bf M}, lower). Predicting \statechange{M} results in a more connected dependency graph (as water is mentioned again in step 4), hence the lower choice is likely preferred (Eq.~\eqref{eq:h-graph}) by our \emph{\textbf{dependency graph score}}. This score also assesses how a priori likely it is that $s_4$'s movement of water enables $s_5$, using a background KB (Eq.~\eqref{eq:h_kb}).}
\caption{\textcolor{green}{Top panels: entanglement for $U=0.17$, $l=0$ and initial state $|60,0,0\rangle$. The right panel is a zoom of the left panel. Arrows mark the point of maximal entropy. The bottom panel shows $|c_n|^2$ at approximately this point, using $t=61.74$. The blue points show an approximate curve of the wrap.}}
\caption{ An example from \trivia{} with multiple spans of the answer text (\underline{underlined}). The model trained with self-training technique outputs the correct answer (\tred{red}) and the model trained on MML objective does not (\tblue{blue}). }
\caption{(\textit{a}) Frequency responses from $P(j\omega)$ (\protect\blackline) compared to those from reduced-order models $\widetilde{P}(j\omega)$ (\protect\bluedot) at $\Rey=60,\ 80,\ 100$. (\textit{b}) The corresponding open-loop impulse responses from numerical simulations. The results for $\Rey=60,\ 80$ are multiplied by 20 and 3, respectively, so that the same scale can be used.}
\caption{DNS results of closed-loop systems. (\textit{a}) Time evolution of the transverse velocity at the sensor (\protect\blackdot) and the total perturbation energy $\textit{E}(t)$ in log scale at $\Rey=60$ (\protect\bluelineshort\hspace{0.5mm}\protect\bluesmalldot\hspace{0.5mm}\protect\bluelineshort), $\Rey=80$ (\protect\redlineshort\hspace{1mm}\protect\redlineshort) and $\Rey=100$ (\protect\blackline). (\textit{b}) Vorticity contours (dashed lines for negative and solid lines for positive vorticity) for the perturbation systems at $t=75$ (\protect\mytriangle{black}) at $\Rey=60,\ 80,\ 100$ (from top to bottom). All contour plots share the same color range. (\textit{c}) Table of parameters.}
\caption{(\textit{a}) Optimal sensor locations (the ridge \protect\blacklineshort\hspace{1mm}\protect\blacklineshort) and contour plot of optimal stability margin $\textit{b}_{opt}$ against Reynolds number and sensor location $\textit{d}$. The largest $\textit{b}_{opt}$ (\protect\blackline) that can be achieved at different Reynolds numbers is plotted beneath. (\textit{b}) Loci of unstable poles (\protect\xmark) and critical zeros (\protect\bluedot/\protect\redcir) of transfer functions $\widetilde{P}(s)$ for two cases. Top: different sensor locations at $\Rey=80$. Bottom: different Reynolds numbers with a sensor placed at $d=2.5D$ downstream of the cylinder.}
\caption{(\textit{a}) Frequency responses from $P(j\omega)$ (\protect\blackline) compared to those from reduced-order models $\widetilde{P}(j\omega)$ (\protect\bluedot) at $\Rey=60,\ 80,\ 100$. (\textit{b}) The corresponding open-loop impulse responses from numerical simulations. The results for $\Rey=60,\ 80$ are multiplied by 15 and 3, respectively, so that the same scale can be used.}
\caption{DNS results of closed-loop systems. (\textit{a}) Time evolution of the cylinder lift and the total perturbation energy $\textit{E}(t)$ in log scale at $\Rey=60$ (\protect\bluelineshort\hspace{0.5mm}\protect\bluesmalldot\hspace{0.5mm}\protect\bluelineshort), $\Rey=80$ (\protect\redlineshort\hspace{1mm}\protect\redlineshort) and $\Rey=100$ (\protect\blackline). (\textit{b}) Vorticity contours (dashed lines for negative and solid lines for positive vorticity) for the perturbation systems at $t=75$ (\protect\mytriangle{black}) at $\Rey=60,\ 80,\ 100$ (from top to bottom). All contour plots share the same color range. (\textit{c}) Table of parameters.}
\caption{(\textit{a}) The largest $\textit{b}_{opt}$ (\protect\blackline) that can be achieved at different Reynolds numbers. (\textit{b}) Loci of unstable poles (\protect\xmark) and critical zeros (\protect\bluedot/\protect\reddot) of transfer functions $\widetilde{P}(s)$ at different Reynolds numbers. (\textit{c}) Lift distributions (\protect\blackline\ for real part and \protect\blacklineshort\hspace{1mm}\protect\blacklineshort\ for imaginary part) on the cylinder at RHP zeros $\textrm{I}/\textrm{II}$ and $\Rey=100$.}
\caption{Performance benchmark of 13 state-of-the-art models before being fine-tuned on \textbf{360-SOD}. The best three results are in {\color{red}{\textbf{\underline{red}}}}, {\color{green}{\textbf{\underline{green}}}} and {\color{blue}{\textbf{\underline{blue}}}}. }
\caption{Performance of \textbf{DDS} and the state-of-the-art models after being fine-tuned on \textbf{360-SOD}. Note ``R3Net-w/ocrf'' means R3Net without dense CRF~\cite{krahenbuhl2011efficient} and ``-'' means the training code is not available. The best three results are in {\color{red}{\textbf{\underline{red}}}}, {\color{green}{\textbf{\underline{green}}}} and {\color{blue}{\textbf{\underline{blue}}}}.}
\caption{Performance of \textbf{DDS} and the state-of-the-art models on \textbf{360-SOD-AT}. Note ``R3Net-w/ocrf'' means R3Net without dense CRF~\cite{krahenbuhl2011efficient} and ``-'' means the training code is not available. The best three results are in {\color{red}{\textbf{\underline{red}}}}, {\color{green}{\textbf{\underline{green}}}} and {\color{blue}{\textbf{\underline{blue}}}}.}
\caption{ Rabi oscillations simulated on two generations of D-Wave quantum annealers. (a)--(b) the distribution of energy output by the annealers for different annealing times $\tau$. The two instances were generated from Eq.~(\ref{eq:QUBO2}) where $R=2$ bits of precision was assumed. The total number of variables in the corresponding QUBO was $|V|=168$. (c)--(d) the evolution in time of the spin $z$-component of a two level system~(\ref{eq:Sy}), $\omega=\pi/2$. (\tikzquad\,\,\, -- 20\textmu{}s, \,\tikzcircle\,\,\,-- 200\textmu{}s, \,\tikzdot\,\,-- 2000\textmu{}s) }
\caption{\small The overview of the self attention calculation.}
\caption{Histogram of a gray-scale image of Lena. The symbols {\color{red}\textbf{+}} (coloured in red) and {\color{green}\textbf{o}} (coloured in green) show a \textit{peak} and a \textit{valley} point in a curve, respectively.}
\caption{An example where EAD model was better than RefNet. The ground truth answers are shown in \textcolor{blue}{blue}.}
\caption{An example of question with significant overlap with the passage. The answer is shown in \textcolor{blue}{blue}.}
\caption{Samples of generated questions from Baseline, RefNet and Reward-RefNet in the SQuAD dataset. Answers are shown in \textcolor{blue}{blue}}
\caption{Samples of generated questions from Baseline, RefNet and Reward-RefNet model on the SQuAD dataset. Answers are shown in \textcolor{blue}{blue}}
\caption{Attention plots for a) $\mathbf{A_1}$, b) $\mathbf{A_2}$, c) $\mathbf{A_3}$ respectively \\ \textbf{Initial Generated Question}: ``What is the name of the oncogenic virus?'' \\ \textbf{Refined Generated Question}: ``What is the name of the organism that causes cervical\\ cancer?''\\ \textbf{Passage}: ``The antigens expressed by tumors have several sources ; some are derived from oncogenic viruses like \textcolor{blue}{human papillomavirus} , which causes cervical cancer , while others are the organism’s own proteins that occur at low levels in normal cells but reach high levels in tumor cells.''}
\caption{A dialog example with the ground truth caption: \textbf{bunches of bananas hang on a wall and arranged for sale.} \color{bleudefrance} Blue \color{black} indicates ideal relevant questions and \color{carrotorange} orange \color{black} indicates less relevant questions.}
\caption{A dialog example with the human generated caption: \textbf{there is a plant in a vase and cookies.} \color{bleudefrance} Blue \color{black} highlights diverse questions and \color{carrotorange} orange \color{black} indicates poor diversity.}
\caption{A dialog example with the human generated caption: \textbf{two men in formal wear standing next to a monster truck.} \color{bleudefrance} Blue \color{black} highlights ideal relevant questions and \color{carrotorange} orange \color{black} indicates less relevant questions.}
\caption{A dialog example with the human generated caption: \textbf{an image of running with the bulls outside.} \color{bleudefrance} Blue \color{black} indicates ideal relevant questions and \color{carrotorange} orange \color{black} highlights irrelevant/repeating questions.}
\caption{A dialog example with the human generated caption: \textbf{a man holding a kite while a girl tries to fly it.} \color{bleudefrance} Blue \color{black} indicates ideal relevant questions and \color{carrotorange} orange \color{black} indicates poor relevance.}
\caption{Surgical tool tracking implementation on the da Vinci\textregistered{} Surgical System running at 30~fps in real time. From left to right the figures show: detected markers and edges, re-projected kinematic tool and shaft edges, and the full Augmented Reality rendering of the surgical tool~\cite{render_psm} on top of the raw endoscopic data. These images are best viewed in color.}
\caption{Autonomous tissue manipulation with the proposed SuPer framework implemented on the da Vinci\textregistered{} Surgical System in real time. From left to right the figures show: the real scene, tool tracking from the endoscopic camera, deformable reconstruction, and RViz with point cloud of the environment, robot localization, and the tracked point to grasp. }
\caption{ (a) Streamwise mean velocity profile as a function of the wall-normal distance and (b) streamwise, (c) wall-normal, and (d) spanwise root-mean-squared fluctuating velocities for the regular channel (\dashed) and the channel with suppressed modal instabilities (\textcolor{cyan}{\solid}). The Reynolds number of both simulations is $\mathrm{Re}_\tau = 186$. Angle brackets represent averaging in the homogeneous directions and time. \label{fig:stats}}
\caption{The setting as in corollary~\ref{cor:applicationstoseismicarrays}. Here~$\Gamma$ (\textcolor{red}{thick}) represents the seismic array where one measures the travel times of seismic waves and~$\Omega$ represents the Earth.}
\caption{ Zamfir's triangle of compromises; \\ \textcolor{darkorange}{orange}: Consensus Nodes; \textcolor{blue}{blue}: Execution Nodes. }
\caption{AUC, FN, $d_H\left(\mathcal{H}_{\boldsymbol{X}},\mathcal{H}_{\boldsymbol{X} \cap \mathcal{T}}\right)$ and $d_H\left(\mathcal{H}_{\boldsymbol{Y}},\mathcal{H}_{\boldsymbol{Y} \cap \mathcal{T}}\right)$ for various combinations of $k$ and $M$ on the Texas dataset.}
\caption{AUC, FN, $d_H\left(\mathcal{H}_{\boldsymbol{X}},\mathcal{H}_{\boldsymbol{X} \cap \mathcal{T}}\right)$ and $d_H\left(\mathcal{H}_{\boldsymbol{Y}},\mathcal{H}_{\boldsymbol{Y} \cap \mathcal{T}}\right)$ for various combinations of $k$ and $M$ on the California dataset.}
\caption{Overview of our method \reds. \reds\ first obtains \textit{anchors} of the target entity pair and constructs a 2-hop DS bag. Sentences in the 1-hop and 2-hop DS bag are individually encoded with a PCNN sentence encoder \citep{zeng2015distant}. We then use selective attention and bag aggregation to get the final representation, based on which a classifier predicts scores for each candidate relation.}
\caption{Parameter settings in \reds.}
\caption{Visualization of the retrieval results of the traditional grid attention and our PGAN. We illustrate $4$ vehicle images with their Top-$5$ most similar vehicles in the gallery set. The correctly and falsely matched vehicle images are enclosed in {\color{green!50!black}green} and {\color{red!90!black}red} rectangles respectively. For a query image, we draw: the results of the grid attention in (a) the retrieved vehicle images and (b) the corresponding heatmaps of the part-guided feature $\mathbf{F}_p$ from PAM; and the results of our PGAN in (c) the retrieved vehicle images and (d) the detected candidate part regions and (e) the corresponding heatmaps of $\mathbf{F}_p$. (Best viewed in color) }
\caption{An instance \cite{sakamoto2011preoperative} of the PubMedQA dataset: Question is the original question title; Context includes the structured abstract except its conclusive part, which serves as the Long Answer; Human experts annotated the Answer yes. The supporting fact for the answer is \textcolor{blue}{\textit{\textbf{highlighted}}}.}
\caption{Example outputs of the transformation from non-ironic sentences to ironic sentences and the transformation from ironic sentences to non-ironic sentences. We use \textcolor{red}{red} and \textcolor{blue}{blue} to annotate the clashes in the sentences.}
\caption{Example error outputs of the transformation from non-ironic sentences to ironic sentences. The main errors are \textcolor{red}{colored}.}
\caption{Example outputs of the transformation from ironic sentences to non-ironic sentences. We use \textcolor{red}{red} and \textcolor{blue}{blue} to annotate the clashes in the sentences.}
\caption{Effort and Recall using Homogeneous Poisson Process between different threshold sizes}
\caption{Energy levels $E_a$ ({\color{black}black}) and $E_b$ ({\color{red}red}) of the system, depending on the radius. The mixing angle is chosen as $\sin^22\theta = 10^{-3}$, mass $m_s = 10$~keV, momentum $p = 30$~MeV. The $y$-axis shows $m_{a,b}/{2E}$. The closest distance between energy levels is at the resonance, where the transition between the levels is the most likely. }
\caption{\color{Gray} \textbf{Flow-chart of the general workflow.} In stage one initial metadata from already standardised databases are used to establish an ontological basis, then data tables and geometries that have been downloaded from various data sources are registered in stage two and eventually those files are harmonised and integrated (normalised) into the final database in stage three. The initial input data and data sources shown in this figure are only a subset of sources that can be handled with \texttt{arealDB}.}
\caption{\color{Gray} \textbf{Flow-chart of the project setup.} \textbf{(a)} The function \verb!setPath()! initiates the project by creating a directory structure in which the files are stored and by creating the inventory tables for dataseries, geometries and census tables. \textbf{(b)} The function \verb!setVariables()! creates index and translation tables for all variables that should be handled in this project.}
\caption{\color{Gray} \textbf{Flow-chart of the registration procedure.} \textbf{(a)} The function \verb!regDataseries()! is used to document the various dataseries that are provided by the data source. \textbf{(b)} Then the function \verb!regGeometry()! is used to register all geometry files that have been downloaded. \textbf{(c)} Finally, the function \verb!regTable()! is used to register all census tables that have been downloaded, and to relate the census tables to dataseries and geometries. The registered files are stored in the folders \verb!"/adb_geometries/stage2"! and \verb!"/adb_tables/stage2"!.}
\caption{\color{Gray} \textbf{Flow-chart of the normalisation procedure}. The function \verb!normalise()! detects all relevant files of geometries or data tables in stage two. It then calls the respective functions that carry out the normalisation. The function \verb!normGeometry()! groups geometries per nation and creates the \textit{administrative hierarchy ID (ahID)}. The function \verb!normTable()! reshapes the data tables into tidy format, calls \verb!matchUnits()! to assign ahID to the areal data and groups the tables per nation. The normalised files are stored in the folders \verb!"/adb_geometries/stage3"! and \verb!"/adb_tables/stage3"!.}
\caption{List of parameters in alphabetical order.}{\includegraphics[width = 3.5 in , height=3.8 in]{Symbol_c4}}
\caption{\textbf{Motivating Example:} The ego-vehicle (in \textcolor{Red1}{red}) wants to change lanes in order to make a legal left turn. Road rules restrict the distance available for this lane change. There is also dense traffic on the road. The driver models of \textit{all} other vehicles are unknown to the ego-vehicle (in terms of their cooperativeness at the very least).}
\caption{Randomly generated initial states for the benchmark scenario. The ego-vehicle is in \textcolor{Red1}{red}. Other vehicles are more \textcolor{Green4}{green} if they are more cooperative. We represent the dead end with a black car on the road. (\subref{fig:init1}) Three-lane road example. (\subref{fig:init5}) Two-lane road example.}
\caption{SHER Control Block Diagram.}
\caption{Humans make more reasonable mistakes (the query is underscored): the annotator selected \textcolor{NavyBlue}{sentence (2)} as a part of the target scenario, which, while not part of the gold, does make a coherent scenario. The model, however, chose \textcolor{Orange}{sentence (1)}, which is in direct contradiction with the target scenario.}
\caption{Comparisons with state-of-the-art video salient object detection algorithms. The three best performing algorithms are marked in {\color{red}red}, {\color{green}green}, and {\color{blue}blue} respectively. }
\caption{Comparisons with state-of-the-art unsupervised video segmentation algorithms. The three best performing algorithms are marked with {\color{red}red}, {\color{green}green} and {\color{blue}blue} colors respectively.}
\caption{Plots of the functions $\theta = \theta(\tau)$ and $\psi = \psi(\tau)$ obtained via numerical solution of Eqs.~\textcolor{blue}{(\ref{DiffEqs})}. The parameters and initial conditions are chosen to be $q= 0.04$, $h= 0.5$, $\upsilon = 0.2$, $\rho=+1$, $\theta(0) = 2\, \mathrm{rad}$ and $\psi(0) = 1\, \mathrm{rad}$. In the long-time limit, the functions $\theta(\tau)$ and $\psi(\tau)$ tend to constant values $1.95$ and $1.48\times 10^{-2}\, \mathrm{rad}$, respectively. From \textcolor{blue}{(\ref{Sol_rho_1})} it follows that these limiting values are in complete agreement with the analytical ones $\theta_{+1}^{(1)}$ and $\psi_{+1}^{(1)}$.}
\caption{Generating \texttt{ENTAILMENT} for monotonicity fragments starting from the \textit{premise} (top). Each node in the tree shows an entailment generated by one \blue{substitution}. Substitutions are based on a hand-coded knowledge base with information such as: \textit{all} $\leq$ \textit{some/a}, \textit{poodle} $\leq$ \textit{dog} $\leq$ \textit{mammal}, and \textit{black mammal} $\leq$ \textit{mammal}. \texttt{CONTRADICTION} examples are generated for each inference using simple rules such as ``replace \textit{some/many/every} in subjects by \textit{no}''. \texttt{NEUTRAL} examples are generated in the reverse manner to the entailments. \label{fig:gen:mono} }
\caption{Illustration of a single ofm channel convolution without \pred{} (a) and with \pred{} (b). $X_o[I_s]$ is computed based on a pre-defined pattern. $X_o[I_s]$ is then subjected to \pred{}, which produces a prediction map $M[I_t]$, followed by a thresholding step to form the binary prediction map $M^{\sigma}[I_t]$. $X_o[I_t]$ is created according to $M^{\sigma}[I_t]$ --- a portion of the $I_t$ ofm activations are predicted as zero-valued and skipped, whereas the rest are computed using Equation~(\ref{eq:conv}). }
\caption{Convolution + \pred{}}
\caption{ResNet-18 + ILSVRC-2012 example of the behavior of different layers in terms of error and MAC operations as a function of \pred{} threshold.}
\caption{Illustration of the training of a single \pred{}. The ground truth is the original layer ofm after a binary threshold operation (\textgreater 0). The CNN predictor is used with the ReLU activation function capped at 1.}
\caption{Comparison between DEM simulations (\protect\tikz[baseline]{\protect\draw[line width=0.3mm,densely dashed] (0,.5ex)--++(0.65,0) ;}) and segregation model (\protect\tikz[baseline]{\protect\draw[line width=0.3mm] (0,.5ex)--++(0.65,0) ;}) of the profile of concentration $\phi_s$ at time $t=80870$.}
\caption{(a) Travelling wave solution (\protect\tikz[baseline]{\protect\draw[line width=0.3mm] (0,.5ex)--++(0.65,0) ;}) and concentration profiles in the moving frame from DEM simulations at different times for the case $N_s=1$, $r=1.5$, $Pe = 3.86$: $t=40435$ (\textcolor{C0}{\protect\tikz[baseline]{\protect\draw[line width=0.3mm,densely dashed] (0,.5ex)--++(0.65,0) ;}}), $t=50543$ (\textcolor{C1}{\protect\tikz[baseline]{\protect\draw[line width=0.3mm,long dashdotted] (0,.5ex)--++(0.7,0) ;}}), $t=60652$ (\textcolor{C2}{\protect\tikz[baseline]{\protect\draw[line width=0.3mm,dotted] (0,.5ex)--++(0.7,0) ;}}), $t=70761$ (\textcolor{C3}{\protect\tikz[baseline]{\protect\draw[line width=0.3mm,densely dash dot] (0,.5ex)--++(0.9,0) ;}}), $t=80870$ (\textcolor{C4}{\protect\tikz[baseline]{\protect\draw[line width=0.3mm,dash dot dot] (0,.5ex)--++(0.9,0) ;}}). (b) Value of the Peclet number as a function of the number of layers of small particles.}
\caption{Quantitative comparison of the baseline and our ForeSeE on converted pseudo-LiDAR signals. Signals in \textcolor{blue}{blue} are converted from ground truth depth; baseline pseudo-LiDAR signals are in \textcolor{red}{red}; our ForeSeE pseudo-LiDAR signals are in \textcolor{yellow}{yellow}.}
\caption{Qualitative results of 3D object detection. The ground truth 3D bounding boxes are in \textcolor{red}{red}; the predictions are in \textcolor{green}{green}.}
\caption[]{SpERT relation extraction examples showing that (a) as a span-based approach, our model can deal with overlapping entities, and (b) localized context yields better precision for long sentences compared to using the full sentence as context. (c) showcases various common sources of error. \textcolor{darkgreen}{green [*]} = true positive relation, \textcolor{blue}{blue [*]} = false positive relation, \textcolor{red}{red [*]} = false negative relation.}
\caption{Example scenarios for future object localization. Given the last observation \protect\inlinegraphics{figures/comp_markers/m5.png} at time $t=T_{obs}$ and the future ground-truth \protect\inlinegraphics{figures/comp_markers/m7.png} at time $t=T_{pred}$, Const-Vel \protect\inlinegraphics{figures/comp_markers/m8.png}, RNN-P (ORB) \protect\inlinegraphics{figures/comp_markers/m9.png}, and RNN-P (IMU) \protect\inlinegraphics{figures/comp_markers/m11.png} models are compared to RNN-AE (Ours) \protect\inlinegraphics{figures/comp_markers/m6.png}. Also, the predicted trajectory of RNN-AE (Ours) \protect\inlinegraphics{figures/comp_markers/m2.png} is visualized with the ground-truth \protect\inlinegraphics{figures/comp_markers/m3.png}. }
\caption{Qualitative evaluation of NEMO (RNN-AE) with the uncertainty of future object localization. The predicted centers of the bounding box from $T_{obs}+1$ to $T_{pred}$ are shown as a trajectory \protect\inlinegraphics{figures/markers_ours/m2.png} with ground-truth \protect\inlinegraphics{figures/markers_ours/m1.png}. Also, the bounding box \protect\inlinegraphics{figures/markers_ours/m5.png} at $T_{pred}$ is sampled from the probability distribution (red indicates high probability). }
\caption{Future ego-motion prediction. (a,b,c) velocity and (d,e,f) yaw rate. Given the past observation \protect\inlinegraphics{figures/comp_markers/m10.png} and future ground-truth \protect\inlinegraphics{figures/comp_markers/m3.png}, Const-Vel \protect\inlinegraphics{figures/comp_markers/m1.png} and RNN \protect\inlinegraphics{figures/comp_markers/m4.png} models are compared with RNN-AE (Ours) \protect\inlinegraphics{figures/comp_markers/m2.png}. }
\caption{Future ego-motion prediction using NEMO (RNN-AE) with the uncertainty. (a,b,c) velocity and (d,e,f) yaw rate of the ground-truth \protect\inlinegraphics{figures/markers_ours/m3.png} and RNN-AE (Ours) \protect\inlinegraphics{figures/markers_ours/m2.png} are plotted with the uncertainty \protect\inlinegraphics{figures/markers_ours/m6.png} at each time step. }
\caption{Accuracy of R52 ({\color{blue}left}) and Reuters21578 ({\color{red}right}) with different maximum neighbor numbers.}
\caption{Overview of the presence of tuition fees (\color{red}{left}\color{black}) and the G8-reform (\color{blue}{right}\color{black}) in the 16 German states until 2015. Darker colors represent longer presence of the respective variable.}
\caption{Left: DFFITS for $\theta^{*}=0.9927$ and all controls: influential observations in \color{red}{red}\color{black}. Right: Boxplot of $\xi$ for the 18 covariates and the treatment with threshold: $\pm0.16$. Figures for each covariate available from authors upon request.}
\caption{Short review (a) and long review (b) channels for users to write reviews on the Naver Movie website. Highlighted areas emphasize the difference in \textcolor{red}{number of reviews}, \textcolor{orange}{review text length}, and \textcolor{blue}{sentiment label availability}.}
\caption{Accuracy and RMSE of competing models on polarity (\textsc{P\_Acc}) and fine-grained (\textsc{F\_Acc} and \textsc{F\_RMSE}) datasets. Items in \textcolor{red}{red} are performances worse than the no-transfer CNN baseline. An asterisk (*) indicates that LeTraNets is significantly better than the second best model ($p < 0.05$).}
\caption{\textbf{Wave function confinements.} The three-dimensional visualisations of the wave function charge densities are shown for the lowest electron $\rm |\Psi_e|^2$ (shown as red color distribution) and the highest hole $\rm |\Psi_h|^2$ (shown as cyan color distribution) states. The green cylinders are plotted to indicate the boundaries of the core and shell regions of the nanowires. The nanowires are selected with parameters as follows: $x$=15\%, D$\rm _S$=20 nm, $\rm \rho_L$=4 and \textbf{(a)} $\rm \rho_D$=0.2, \textbf{(b)} $\rm \rho_D$=0.4, \textbf{(c)} $\rm \rho_D$=0.8, \textbf{(d)} $\rm \rho_D$=1.0. The plots for other nanowire dimensions are provided in the \textcolor{blue}{Supplementary Information Figure S4}.}
\caption{\label{table:br}\color{magenta}\bf Branching ratios BR for $N^*\to N\eta$ decays and the photon helicity amplitudes $A_{1/2}$, $A_{3/2}$ of nucleon resonances, calculated at their pole positions. The helicity amplitudes are given in units of GeV$^{-1/2}$. Small numbers below the BRs or below the helicity amplitudes give the RPP~\cite{Patrignani:2016xqp}. If RPP does not give the estimate value for $A_{1/2}$, $A_{3/2}$ we give values from~\cite{Sokhoyan:2015fra}. \vspace{-2mm} }{\footnotesize
\begin{center}
\begin{tabular}{|cc|cc|}
\hline\hline
Res. \, &\hspace{-2mm}BR{\scriptsize($N^*\to N\eta$)}\hspace{-2mm}\, & Res.\, &\hspace{-2mm}BR{\scriptsize($N^*\to N\eta$)}\hspace{-2mm}\,\\[-1ex]
$A_{1/2}$ & $A_{3/2}$ & $A_{1/2}$ & $A_{3/2}$\\ \hline
$N(1535)$ &0.42\er0.04 & $N(2120)$ &$\leq$0.01 \\[-1.5ex]
\tiny$1/2^-$ &\tiny 0.42\er0.10& \tiny$3/2^-$ & - \\[-1ex]
0.093\er0.009 & - & 0.130\er0.050 &0.160\er0.065 \\[-1.5ex]
\tiny 0.115\er0.015 & - & - & - \\\hline
%
$N(1650)$ &0.32\er0.04 & $N(1720)$ &0.03\er0.02 \\[-1.5ex]
\tiny$1/2^-$ &\tiny 0.05 - 0.15 \color{blue} 0.14 - 0.22& \tiny$3/2^+$ &\tiny0.00-0.05\\[-1ex]
0.032\er0.006 & - & 0.115\er0.045 &0.135\er0.040 \\[-1.5ex]
\tiny 0.045\er0.010 &- & \tiny 0.115\er0.045 &\tiny 0.140\er0.040\\\hline
%
$N(1895)$ &0.10\er0.05 & $N(1900)$ &0.03\er 0.01 \\[-1.5ex]
\tiny$1/2^-$ &\tiny 0.21\er0.06 & \tiny$3/2^+$ &\tiny 0.02-0.14 \\[-1ex]
-0.028\er0.010 & - & 0.026\er0.014 &-0.090\er0.025 \\[-1.5ex]
\tiny -0.015\er0.006 & - & \tiny 0.026\er0.014 &\tiny-0.070\er0.030\\\hline
%
$N(1710)$ &0.25\er0.09 & $N(1675)$ &0.005\er0.005 \\[-1.5ex]
\tiny$1/2^+$ &\tiny 0.10 - 0.50 & \tiny$5/2^-$ &\tiny 0\er0.01\\[-1ex]
0.040\er0.020 & - & 0.020\er0.004 & 0.027\er0.006 \\[-1.5ex]
\tiny 0.028$^{+0.009}_{-0.002}$& - & \tiny 0.022\er0.005 \color{blue} 0.019\er0.008 &\tiny0.028\er0.006 \color{blue} 0.02\er0.005 \\\hline
%
$N(1880)$ &0.19\er0.07 & $N(2060)$ &0.06\er0.02 \\[-1.5ex]
\tiny$1/2^+$ &\tiny $0.25^{+0.30}_{-0.20}$& \tiny$5/2^-$ &\tiny 0.04\er 0.02 \\[-1ex]
0.050\er0.020 &- & 0.062\er0.010 &0.070\er0.020 \\[-1.5ex]
\tiny - &- & \tiny0.064\er0.010 & \tiny0.060\er0.020\\\hline
%
$N(2100)$ &0.30\er0.15 & $N(1680)$ &0.002\er0.001 \\[-1.5ex]
\tiny$1/2^+$ &- & \tiny$5/2^+$ & \tiny 0\er 0.01\\[-1ex]
0.010\er0.004 & - & -0.015\er0.002 & 0.136\er 0.005 \\[-1.5ex]
\tiny 0.011\er0.004 & - & \tiny-0.013\er0.003 &\tiny 0.135\er 0.005 \\\hline
%
$N(1520)$ & $<0.001$ & $N(2000)$ &0.01\er0.01 \\[-1.5ex]
\tiny$3/2^-$ &\tiny0.00\er0.01 & \tiny$5/2^+$ &\tiny0.02\er0.02\\[-1ex]
-0.024\er0.004 & 0.130\er 0.006 & 0.015\er0.006 &-0.043\er0.008 \\[-1.5ex]
\tiny-0.020\er0.005 &\tiny0.140\er 0.010& \tiny 0.033\er0.010 & \tiny-0.045\er0.008\\\hline
%
$N(1700)$ &0.01\er 0.01 & $N(2190)$ &0.04\er0.02 \\[-1.5ex]
\tiny$3/2^-$ & - & \tiny$7/2^-$ &- \\[-1ex]
0.042\er0.014 &-0.050\er0.015 & -0.071\er0.010 &0.037\er0.008 \\[-1.5ex]
\tiny 0.047\er0.016 &\tiny-0.041\er0.014& \tiny -0.068\er0.005 &\tiny 0.025\er0.010\\\hline
%
$N(1875)$ &0.12\er0.08 & $N(1990)$ &$\leq$0.01 \\[-1.5ex]
\tiny$3/2^-$ &\tiny 0.00\er0.01 & \tiny$7/2^+$ & - \\[-1ex]
0.010\er0.010 & -0.007\er0.004 & 0.065\er0.025 &0.047\er0.008 \\[-1.5ex]
\tiny 0.017\er0.009 &\tiny-0.008\er0.004& \tiny $0.010^{+0.011}_{-0.006}$ &\tiny$0.053^{+0.023}_{-0.086}$\\\hline\hline
\end{tabular}
\end{center}}
\caption{Statistical evaluation for the retinal thickness analysis. All values displayed are p-values, and significance is defined as $p< 0.05$. \textcolor{TGblue}{$\blacksquare$} indicates pre-test p-values for retinal thickness as a function of age for the transgenic group. \textcolor{WTred}{$\blacksquare$} indicates pre-test p-values for retinal thickness as a function of age for the wildtype group. Pre-test p-values were calculated using linear regression analysis. \textcolor{TGblue}{$\diagdown$} \hspace{-8pt}\textcolor{WTred}{$\diagdown$} indicates p-values from ANCOVA comparing the trends of the transgenic retinal thickness vs. age to the wildtype retinal thickness vs. age. \textcolor{TGblue}{$\diamond$}\textcolor{WTred}{$\diamond$} indicates values where the initial pre-tests failed, and therefore one-way ANOVA was performed to test for significance between the means of the two groups. There were no statistically significant changes in retinal thickness between the transgenic and the wildtype groups.}
\caption{Decrease in retinal thickness in units of \si{\micro\meter} per week. \textcolor{TGblue}{$\blacksquare$} indicates the values for the transgenic mouse and \textcolor{WTred}{$\blacksquare$} indicates the values for the wildtype mouse.}
\caption{Illustration of SINDy to discover the system-model mismatch for the Van der Pol oscillator. The system is simulated to generate the measurement data. Measurement data $x_1$ and $x_2$ are provided to calculate the output estimated by our imperfect model. Sparse regression is used to infer the discrepancy model for the difference between the actual output and estimated output. The discrepancy model is then combined with the imperfect model to provide a better estimation of system dynamics. The model is then cross-validated on a new initial \textcolor{red}{condition } $\boldsymbol{x}_0=(-0.2,-0.3)$. As we can see, the model discovered by SINDy is able to compensate for the discrepancy between the actual system and flawed model.}
\caption{Comparison of results on Ego Gesture dataset to the state of the art. The results reported from Cao \etal \cite{Cao2017} are in \textcolor{bad}{purple}. Our network (results reported in \textcolor{good}{green}) produces considerably better accuracy on just RGB data during inference and can use all the images in a sequence, while the state of the art is limited to 40 frames per sequence and needs both RGB and Depth data for best results.}
\caption{\textbf{Back-step for different convolution types.} (a) \textbf{Residual connections}, kernels and activations of ones (\textcolor{red}{red}) are used to allow the discovery of channel associations with layers that are connected with residual connections. (b) \textbf{Grouped convolutions} in which the kernels are inflated (denoted with \textcolor{gray}{gray}) to hold the same dimensions as the input. (c) \textbf{Convolutions in branches} are performed in parallel with the possibility of uneven depth. In these cases, the branch with the maximum number of convolutions is selected as the base with tensors of ones added to the other branches.}
\caption{Energy (eV) of the ferromagnetically ordered molecule as a function of $M_S$ and the geometry. This molecule shows spin crossover behavior as a function of the geometry, which suggests that the ground magnetic state will change when pressure is applied and that it could be sensitive to the size of the charge-compensating counter anion in the unit cell. }
\caption{ Accuracies on various subject-verb agreement tasks of \citet{Lakretz2019TheModels}. \textsc{full} denotes the full model accuracies. \textsc{in} is the decomposition of the subject, \textsc{intercept}$^*$ only decomposes the gate intercepts of the model. $\neg$\textsc{intercept} takes no interactions with the intercepts into account. Singular conditions are denoted in \textcolor{persiangreen}{green}. $(\cdot)$ denotes accuracies of scores without decoder bias, i.e. $Dh_t$ vs $Dh_t+b_d$. }
\caption{\textbf{Top row:} Image from the training set and corresponding depth maps produced by neural networks trained without and with Depth Hints (Godard~\etal~\cite{monodepth2} (Monodepth2) architecture and loss). \textbf{Middle row, left to right:} Crop of the image centered around a thin structure with the center pixel circled, the scanline in the other image for the circled pixel, LiDAR pointcloud, fused depth map from SGM, crop of the depth map produced by a network trained without Depth Hints, our result, and the color coding illustrating pixel disparities. \textbf{Bottom row:} On the left is the plot of DSSIM+$L_1$ cost of the pixel on the thin structure for every pixel disparity. Plots on the right show the predictions made by the network after $q$ epochs when trained with and without Depth Hints. The network trained {\color{red} without Depth Hints} gets stuck in a local minimum and does not escape even after 20 epochs. On the other hand, the network trained {\color{blue} with Depth Hints} is in the vicinity of the correct solution (disparity of 64.63 according to LiDAR) after the first epoch. We visualize depths as disparities in pixel space for clarity. (Best viewed in color.) }
\caption{\textbf{Depth Hints with Existing Methods.} Comparison of our implementations of existing methods with and without Depth Hints. The data used to train/test is defined in the \textit{Dataset} column, whereby `K' is for KITTI 2015~\cite{Geiger2012CVPR} using the Eigen split, and `SF' is for the FlyingThings3D Sceneflow dataset \cite{MIFDB16}. Highlighted methods are augmented \colorbox{lightblue}{with Depth Hints}, and score better than their regular counterparts almost universally. \cite{kuznietsov2017semi} is an exception, possibly because it already uses LiDAR data. We also show results for \cite{monodepth2} without ImageNet \cite{imagenet} pretraining, denoted as `Monodepth2 no pt'. \textit{Data column} (data source used for training): D refers to methods that use depth supervision at training time, S is for self-supervised training on stereo images, and MS is for models trained with stereo video. \vspace{-6pt} }
\caption{Single-view depth estimation performance. The statistics for the compared methods are excerpted from the corresponding papers, except that the results marked with \textit{`updated'} are taken from the websites. `K' represents the KITTI raw dataset (Eigen split) and CS represents the Cityscapes training dataset. The method~\cite{godard2017unsupervised} marked with $\star$ is trained and tested on larger scale ($256\times512$) images, whereas others use $128\times416$ images. `-' in Cap(m) means no maximum depth filtering is applied. For metrics marked in {\color{red}red}, lower is better; for those marked in {\color{green}green}, higher is better. The best results for each category are \textbf{bolded}.}
\caption{The median absolute position error (mAPE) and running time of different methods for all KITTI odometry sequences with ground-truth (00-10). `F' means fail. The better results are \textbf{bolded}. The columns marked by {\color{red}red} are training sequences, and those marked by {\color{green}green} are originally used for testing.}
\caption{Comparisons of full trajectories for KITTI odometry sequences 00-10. {\color{red}F} means the method failed on this sequence. The dotted line in each sub-figure is the ground-truth trajectory.}
\caption{ {\red Low-energy bands of graphene around the corners of the Brillouin zone (all values in \textmu eV): a) $\Delta_{BR}=0$, b) $\Delta_{BR}=6$, c) $\Delta_{BR}=12$, and d) $\Delta_{BR}=18$ ($\Delta_{KM}=12$ in all cases). Note that the Bychkov-Rashba coupling lifts the spin degeneracy of the bands. Blue and red colors represent opposite helicities (the approximated spin polarization of Bloch electrons lies within the graphene plane along an axis orthogonal to their crystal momenta).}}
\caption{ {\red Low-energy bands of bilayer graphene. a) Parabolic dispersion of the low-energy bands around one of the inequivalent corners of the Brillouin zone. The bands localized at the carbon atoms sitting on top of each other appear at higher energies, of the order of the inter-layer hopping $\gamma_1=0.34$ eV. b) The two parabolas overlap due to the trigonal warping of the bands, controlled by the other two inter-layer hopping parameters in the Slonczewski-Weiss-McClure parametrization, $\gamma_3=0.28$ eV and $\gamma_4=-0.14$ eV. c) The SOC terms $\Delta_{KM}=12$~\textmu eV and $\Delta_{BR}=19$~\textmu eV$\cdot$nm \cite{Konschuh_etal2} lift the band crossings. Note that the bands remain spin degenerate due to inversion symmetry.} }
\caption{ \emph{Electric-Field Dependent Spin Transport in 2D-based Heterostructures.} (a) The amplitude of the detected spin signal as a function of drift current, demonstrating spin drift. Inset shows the measurement geometry. From \citet{ingla-aynes_eighty-eight_2016}. (b) Dependence of nonlocal spin signal in bilayer graphene on the local resistance at the Dirac point. The latter increases with the vertical electric field (from yellow to purple dot). At higher fields, the spin current is switched off by the gapped channel. Top inset shows spin valve measurements at zero (dark yellow line) and high (black line) displacement fields, demonstrating a spin switch effect. From \citet{avsar_electronic_2016}. {\red (c) Oblique Hanle measurements near the charge neutrality point in a bilayer graphene spin valve, for different angles $\beta$ relative to the graphene plane. The magenta curve is a fit to the experimental data (blue points) and reveals an anisotropy of $\sim 12$ in the spin relaxation time. Away from the charge neutrality point, the anisotropy diminishes and the response becomes similar to the cyan dashed line. From \citet{xu_strong_2018}.} }
\caption{{\bf Theoretical relative intensity (dB) of ground to ball sound}, for the scenario in Table~\ref{tab:validationparams}. Ball materials are listed on the left, ground on the top. Positive values indicate the ground was louder than the ball. Impact timescale was kept constant at 1.63E-4~s and Poisson's ratio at $0.25$, as neither significantly affects relative amplitude. Scenarios with louder ground sound ($\ge 0$~dB) are highlighted in \colorbox{teal!25}{teal}, and scenarios where the ground sound can be audible (above the most sensitive \textsc{jnd} level of -13~dB \cite{long201481}) are highlighted in \colorbox{orange!15}{light orange}. Note that our overhead listening point is near the maximum relative loudness for the ball, whereas low listening angles tend to receive more ground sound (Figure~\ref{fig:anglecomparison} expands on this relationship).}
\caption{Translation examples from the baseline Transformer, VNMT, and our model. Disfluent words or absences are marked in \textcolor{red}{red}, and slightly incorrect lexical choice is marked in \textcolor{blue}{blue}. Romanian diacritics have been stripped.}
\caption{(a) Wind power $P$ versus wind speed $W$ at Farm $\#\mathrm{LV}$ in January (\textcolor{blue}{blue} points denote real data, whereas the \textcolor{red}{red} curve represents predictions by decision tree regression). (b) The relative difference between $\E[Q_{\text{GP}}]$ and $\E[Q_{\text{MC}}]$ as a function of $n$. In order to ensure that the MC converges, 8,000 realizations are used for $Q_{\text{MC}}$.}
\caption{Sample from SS-HF (top), SS-TR (bottom) comparing the results using full supervision vs. zero-shot learning. Fully supervised model scores are \textcolor{sns-blue}{blue}, zero-shot model results are \textcolor{sns-purple}{purple} and zero-shot classes are \textbf{bold}.
% See the accompanying text in Sec.~\ref{sec:experiments:zeroshot} for an analysis of the behaviour shown in these results.
[Best viewed on screen]. }
\caption{\textbf{Statistics on the response to strain and electric field.} \textbf{a} Alternating measurements of the mechanical strain (blue margins) and electric field dependence (red margins) of TLS resonance frequencies. Coloured trace highlights follow exemplary fits from which the TLS' deformation potential $\gamma_S$ and field coupling strength $\gamma_E$ are obtained. \textbf{b} Vertical cross-section of the data shown in \textbf{a} (black dashed line), recalculated to the energy relaxation rate $1/T_1$. Lorentzian fits to individual TLS resonances Eq.~(\ref{eq:tlsresonance}) result in qubit-TLS coupling strengths $g$ and TLS decoherence rates. \textbf{c} Distribution of TLS sensitivities to electric field $\gamma_E$. No response ($\gamma_E <$ 1 kHz/V) is observed in 104/260 TLS (40$\%$). \textbf{d} Histograms of TLS-qubit coupling strengths $g$ (left column) and TLS coherence times $(\frac{1}{2} \Gamma_\mathrm{1, TLS} + \Gamma_\mathrm{\Phi, TLS})^{-1}$ (right column), plotted separately for TLS that do not respond to the field (top row) and for field-tunable TLS (bottom row). Similar \rep{coupling strengths up to $g \approx 0.2~ $MHz and typical}{} coherence times of 50 - 100 ns are observed irrespective of the TLS' response to electric field. \red{The relatively small coupling strengths up to $g \approx 0.2~ $MHz indicate that nearly all defects which do not respond to the E-field are residing in the large-area parasitic junctions rather than the small qubit junctions.} }
\caption{\textbf{Sample holder for strain- and E-field tuning of defects.} \textbf{a} Exploded view showing the aluminum sample box (right) and the piezo housing which are screwed to the cryostat's cold finger. \textbf{b} Photograph of the inside of the sample box, showing the printed circuit board (PCB) with coplanar microwave transmission lines and four screws that hold the chip (not installed here) against the pressure generated by the piezo. The red octagon depicts the top DC-electrode that is glued to the lid of the sample box. The blue ring indicates the bottom electrode that is milled from the backside metallization of the PCB. \red{The qubits were located close to the center of the chip, where the electric field generated by the electrodes has its largest homogeneity.} \textbf{c} Cross-section through the sample box. The zircon sphere is used to avoid shear stress on the piezo and to thermally isolate the piezo actuator and the sample chip. An aluminum foil between the sphere and the piezo serves for electromagnetic shielding. }
\caption{(a) The total electric field vectors and contour plots of $|\bs{E}^{\mathrm{tot}}|^2/|E_0|^2$ in the $y=0$ plane are shown for four cases when the center of the core is (1) concentric with the shell, (2) shifted by $(\unit[20]{nm}) \bs{e}_x$, (3) shifted by $(\unit[14.14]{nm})\bs{e}_x + (\unit[14.14]{nm}) \bs{e}_z$, and (4) shifted by $(\unit[20]{nm})\bs{e}_z$. (b) The angular scattering intensity for the above four cases in the $z=0$ plane is shown. For the concentric case, the corresponding Mie series solution is also shown for comparison. (Cases 2 and 3 are shown in \textcolor{urlblue}{Visualization 1}.) \label{Fig:CoreShell}}
\caption{Contour plots of $|\bs{E}^\rmtot|^2/|E_0|^2$ (a) in the $y=0$ plane, (b) in the $z=0$ plane, for a lens-shaped object approaching the geometrical optics regime. The instantaneous electric field vectors are also indicated. See also \textcolor{urlblue}{Visualization 2}.}
\caption{A diagram illustrating a simplicial mixture model with $m=4$ vertices in $n=2$ dimensions supported on two simplices. The parameters are in \textcolor{red}{red}. Top right: The Bayesian network shows the dependency of the random variables and parameters for a general simplicial mixture model. Top left: a table describing $\textcolor{mustard}{C}$, a discrete latent variable supported on the two combinatorial simplices. Bottom left: $\textcolor{blue}{Z}$ is a continuous latent variable in~$\R^m$. Bottom right: the output variable $\textcolor{mygreen}{X}$~in $\R^n$.}
\caption{Comparison of Dice overlaps for all test images and each anatomical label (average of all labels~\textcolor{col_mean}{\rule[-0.3mm]{.3cm}{.3cm}}, upper left lobe (ULL)~\textcolor{col1}{\rule[-0.3mm]{.3cm}{.3cm}}, lower left lobe (LLL)~\textcolor{col2}{\rule[-0.3mm]{.3cm}{.3cm}}, upper right lobe (URL)~\textcolor{col3}{\rule[-0.3mm]{.3cm}{.3cm}}, lower right lobe (LRL)~\textcolor{col4}{\rule[-0.3mm]{.3cm}{.3cm}}, middle right lobe (MRL)~\textcolor{col5}{\rule[-0.3mm]{.3cm}{.3cm}}). For each label, the distributions of Dice coefficients are shown before registration, after single-level DL registration, after multilevel DL registration without pretrained CNNs, and after multilevel registration with pretrained CNNs.}
\caption{\bf \textcolor{SSECOL}{#1}}
\caption{Evolution of optimality score for $H=0$ (\textcolor{red}{red}) and $H=0.4$ (\textcolor{blue}{blue}). Both graphs were produced using the following parameters: $|i|=4$, $n=3$ giving $|\mathcal{C}|=64$, $|\mathcal{A}|=20$, $N=80$, $K=1$, $\nu=0.01$, $\mu=10^{-4}$, and $\Phi=0.99$. $L_s$ and $a_d$ are the same for both graphs.}
\caption{Example predictions based on nearest neighbor sentences. The word in question is marked in boldface, subset with a short description of its WordNet synset (true positives \textcolor{darkgreen}{green}, false positives \textcolor{darkred}{red}). }
\caption{Simple implementation of the distribution, fork, and recollection phases in \texttt{Parmap} (slightly simplified from the \swhref{swh:1:cnt:d5214ff9562a1fe78db51944506ba48c20de3379; origin=https://gitorious.org/parmap/parmap.git; lines=101-143}{actual code}) presented in \swhref{swh:1:rev:0064fbd0ad69de205ea6ec6999f3d3895e9442c2; origin=https://gitorious.org/parmap/parmap.git}{the version of Parmap used in this article} \end{lstlisting} \end{tcolorbox} \caption{Adding clickable hyperlinks to Software Heritage in \LaTeX}\label{fig:swhref} \end{figure} \section{Acknowledgements} \label{sec:orgf6044d3} These guidelines result from extensive discussions that took place over several years. Special thanks to Alain Girault, Morane Gruenpeter, Julia Lawall, Arnaud Legrand and Nicolas Rougier for their precious feedback on earlier versions of this document. \clearpage % ***EXPANDEDBIBFILE: \bibliography{swh,biblio} % ****EXPORTBEGS: \bibliography{swh-archive-reference-howto.bbl} \begin{thebibliography}{10} \bibitem{swhcacm2018} J.-F. Abramatic, R.~Di~Cosmo, and S.~Zacchiroli. \newblock Building the universal archive of source code. \newblock{\em Commun. ACM}, 61(10):29--31, Sept. 2018. \bibitem{gtinria2009} P.~Alliez, R.~Di~Cosmo, B.~Guedj, A.~Girault, M.-S. Hacid, A.~Legrand, and N.~P. Rougier. \newblock{Attributing and Referencing (Research) Software: Best Practices and Outlook from Inria}. \newblock \url{https://hal.archives-ouvertes.fr/hal-02135891}, May 2019. \newblock submitted. \bibitem{AcmBadges}{Association for Computing Machinery}. \newblock Artifact review and badging. \newblock \url{https://www.acm.org/publications/policies/artifact-review-badging}, Apr 2018. \newblock Retrieved April 27th 2019. \bibitem{Borgman2012} C.~L. Borgman, J.~C. Wallis, and M.~S. Mayernik. \newblock Who's got the data? interdependencies in science and technology collaborations. \newblock{\em Computer Supported Cooperative Work}, 21(6):485--523, 2012. 
\bibitem{Dagstuhl-Artefacts-2016} B.~R. Childers, G.~Fursin, S.~Krishnamurthi, and A.~Zeller. \newblock{Artifact Evaluation for Publications (Dagstuhl Perspectives Workshop 15452)}. \newblock{\em Dagstuhl Reports}, 5(11):29--35, 2016. \bibitem{Parmap2012} M.~Danelutto and R.~Di~Cosmo. \newblock A "{M}inimal {D}isruption" skeleton experiment: Seamless map {\&} reduce embedding in {OC}aml. \newblock{\em Procedia CS}, 9:1837--1846, 2012. \bibitem{SHA1} Q.~Dang. \newblock Changes in federal information processing standard (fips) 180-4, secure hash standard. \newblock{\em Cryptologia}, 37(1):69--73, 2013. \bibitem{swhipres2018} R.~Di~Cosmo, M.~Gruenpeter, and S.~Zacchiroli. \newblock Identifiers for digital objects: the case of software source code preservation. \newblock In {\em Proceedings of the 15th International Conference on Digital Preservation, iPRES 2018, Boston, USA}, Sept. 2018. \newblock Available from \url{https://hal.archives-ouvertes.fr/hal-01865790}. \bibitem{swhipres2017} R.~Di~Cosmo and S.~Zacchiroli. \newblock Software heritage: Why and how to preserve software source code. \newblock In {\em Proceedings of the 14th International Conference on Digital Preservation, iPRES 2017}, Sept. 2017. \bibitem{reuse} F.~S.~F. Europe. \newblock Reuse software. \newblock \url{https://reuse.software}, Sept. 2019. \newblock Accessed on 2019-09-24. \bibitem{Hinsen2013} K.~Hinsen. \newblock Software development for reproducible research. \newblock{\em Computing in Science and Engineering}, 15(4):60--63, 2013. \bibitem{SSI2018} M.~Jackson~(ed). \newblock Software deposit: What to deposit (version 1.0). \newblock \url{https://softwaresaved.github.io/software-deposit-guidance/WhatToDeposit.html}, Aug 2018. \newblock doi:10.5281/zenodo.1327325. \bibitem{raymond2013} E.~S. Raymond. \newblock Software release practice howto. \newblock \url{https://www.tldp.org/HOWTO/html_single/Software-Release-Practice-HOWTO/}, Jan 2013. \newblock Accessed on 2019-06-05. 
\bibitem{Stodden-reprod-2012} V.~Stodden, R.~J. LeVeque, and I.~Mitchell. \newblock Reproducible research for scientific computing: Tools and strategies for changing the culture. \newblock{\em Computing in Science and Engineering}, 14(4):13--17, 2012. \end{thebibliography} % ****EXPORTENDS: \bibliography{swh-archive-reference-howto.bbl} % ***EXPANDEDBIBSTYLE: \bibliographystyle{abbrv} \appendix\clearpage \section{Appendix: Reference for SWH-ID identifiers} \label{sec:org36c0b21} \label{sec:identifiers} The SWH-ID identifier schema is \href{https://docs.softwareheritage.org/devel/swh-model/persistent-identifiers.html}{fully documented online} and is discussed in the article \cite{swhipres2018}, but we reproduce here for completeness an excerpt of the documentation. \begin{table*}[t] \caption{EBNF grammar of Software Heritage persistent identifiers} \label{tab:grammar} \begin{alltt} <identifier> ::= "swh" ":" <scheme_version> ":" <obj_type> ":" <obj_id> ; <scheme_version> ::= "1" ; <obj_type> ::= "snp" (* snapshot *) | "rel" (* release *) | "rev" (* revision *) | "dir" (* directory *) | "cnt" (* content *) ; <obj_id> ::= 40 * <hex_digit> ; (* intrinsic object id, as hex-encoded SHA1 *) <hex_digit> ::= "0" | "1" | "2" | "3" | "4" | "5" | "6" | "7" | "8" | "9" | "a" | "b" | "c" | "d" | "e" | "f" ; \end{alltt} \end{table*} \subsection{Syntax} \label{sec:org96608b5} Syntactically, persistent identifiers are generated by the \verb|<identifier>| entry point of the EBNF grammar given in Table~\ref{tab:grammar}. \subsection{Semantics} \label{sec:orgf9e78c6} The \texttt{swh} prefix makes explicit that these identifiers are related to Software Heritage, and the colon (\verb|:|) is used as separator between the logical parts of identifiers. The scheme version (currently \verb|1|) is the current version of this identifier scheme. 
A persistent identifier points to a single object, whose type is explicitly captured by \verb|<obj_type>|: \begin{description} \item[snp] identifiers point to snapshots, \item[rel] to releases, \item[rev] to revisions, \item[dir] to directories, \item[cnt] to contents. \end{description} The actual object pointed to is identified by the intrinsic identifier \verb|<obj_id>|, which is a hex-encoded (using lowercase ASCII characters) SHA1~\cite{SHA1} computed on the content and metadata of the object itself.\footnote{See \url{https://docs.softwareheritage.org/devel/swh-model/persistent-identifiers.html} for more details.} \subsection{Git compatibility} \label{sec:orgde6bb03} Intrinsic object identifiers for contents, directories, revisions, and releases are, at present, compatible with the way Git computes identifiers for its objects. A Software Heritage content identifier will be identical to the Git blob identifier of any file with the same content, a Software Heritage revision identifier will be identical to the corresponding Git commit identifier, etc. This is not the case for snapshot identifiers, as Git does not have a corresponding object type. Git compatibility is incidental and is not guaranteed to be maintained in future versions of this scheme (or of Git), but it is a convenient feature for developers for the time being.
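
As an illustration of this compatibility, the following minimal Python sketch computes a content identifier the way Git hashes a blob, namely the SHA1 of the header \texttt{blob <size>\textbackslash 0} followed by the raw bytes. The function names \texttt{git\_blob\_sha1} and \texttt{swh\_content\_id} are hypothetical and chosen for this example only; the sketch relies on the Git compatibility described above, which holds at present but is not guaranteed for future versions of the scheme.

```python
import hashlib


def git_blob_sha1(data: bytes) -> str:
    """Return the hex SHA1 that Git assigns to a blob with this content."""
    # Git hashes the header "blob <size>\0" followed by the raw content.
    header = b"blob " + str(len(data)).encode("ascii") + b"\0"
    return hashlib.sha1(header + data).hexdigest()


def swh_content_id(data: bytes) -> str:
    """Build a version-1 content identifier (hypothetical helper).

    Assumes the Git-compatible hashing described in the text.
    """
    return "swh:1:cnt:" + git_blob_sha1(data)


if __name__ == "__main__":
    # The empty file has a well-known Git blob identifier.
    print(swh_content_id(b""))
```

For a file on disk, the same value should be obtained by reading the file in binary mode and passing its bytes to \texttt{swh\_content\_id}.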
\subsection{Examples} \label{sec:org571a0e5} The identifiers below are examples of what Software Heritage identifiers look like.\\ They are resolved by the Software Heritage browsing pages available at:\\ \texttt{https://archive.softwareheritage.org/<identifier>} \begin{tcolorbox} \href{https://archive.softwareheritage.org/swh:1:cnt:94a9ed024d3859793618152ea559a168bbcbb5e2}{swh:1:cnt:94a9ed024d3859793618152ea559a168bbcbb5e2} \end{tcolorbox} points to the content of a file containing the full text of the GPL3 license\\ \begin{tcolorbox} \href{https://archive.softwareheritage.org/swh:1:dir:d198bc9d7a6bcf6db04f476d29314f157507d505}{swh:1:dir:d198bc9d7a6bcf6db04f476d29314f157507d505} \end{tcolorbox} points to a directory containing the source code of the Darktable photography application as it was at some point on 4 May 2017\\ \begin{tcolorbox} \href{https://archive.softwareheritage.org/swh:1:rev:309cf2674ee7a0749978cf8265ab91a60aea0f7d}{swh:1:rev:309cf2674ee7a0749978cf8265ab91a60aea0f7d} \end{tcolorbox} points to a commit in the development history of Darktable, dated 16 January 2017, that added undo/redo support for masks\\ \begin{tcolorbox} \href{https://archive.softwareheritage.org/swh:1:rel:22ece559cc7cc2364edc5e5593d63ae8bd229f9f}{swh:1:rel:22ece559cc7cc2364edc5e5593d63ae8bd229f9f} \end{tcolorbox} points to Darktable release 2.3.0, dated 24 December 2016\\ \begin{tcolorbox} \href{https://archive.softwareheritage.org/swh:1:snp:c7c108084bc0bf3d81436bf980b46e98bd338453}{swh:1:snp:c7c108084bc0bf3d81436bf980b46e98bd338453} \end{tcolorbox} points to a snapshot of the entire Darktable Git repository taken on 4 May 2017 from GitHub.
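
The core grammar of Table~\ref{tab:grammar} is regular, so an identifier without contextual information can be checked with a single regular expression. The sketch below is one possible way to do this in Python; the names \texttt{SwhId} and \texttt{parse\_swhid} are hypothetical and not part of any Software Heritage tooling.

```python
import re
from typing import NamedTuple


class SwhId(NamedTuple):
    """Logical parts of a core identifier (hypothetical container)."""
    scheme_version: int
    obj_type: str
    obj_id: str


# "swh" ":" <scheme_version> ":" <obj_type> ":" <obj_id>
# with 40 lowercase hexadecimal digits for the object id.
_SWHID_RE = re.compile(r"swh:(1):(snp|rel|rev|dir|cnt):([0-9a-f]{40})")


def parse_swhid(text: str) -> SwhId:
    """Split a core identifier into its parts, or raise ValueError."""
    match = _SWHID_RE.fullmatch(text)
    if match is None:
        raise ValueError(f"not a valid identifier: {text!r}")
    return SwhId(int(match.group(1)), match.group(2), match.group(3))


if __name__ == "__main__":
    print(parse_swhid("swh:1:cnt:94a9ed024d3859793618152ea559a168bbcbb5e2"))
```

Because \texttt{fullmatch} is used, identifiers carrying trailing contextual information or uppercase hexadecimal digits are rejected by this sketch.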
\begin{table*}[t] \caption{EBNF grammar of complementary contextual information} \label{tab:context_grammar} \begin{alltt} <identifier_with_context> ::= <identifier> [<lines_ctxt>] [<origin_ctxt>] ; <lines_ctxt> ::= ";" "lines" "=" <line_number> ["-" <line_number>] ; <origin_ctxt> ::= ";" "origin" "=" <url> ; <line_number> ::= <dec_digit> + ; <url> ::= (* RFC 3986 compliant URLs *) ; \end{alltt} \end{table*} \subsection{Contextual information} \label{ssec:context} It is often useful to complement persistent identifiers with contextual information about the object's setting. Currently it is possible to extend the identifier with the optional elements below using the dedicated syntax presented in Table~\ref{tab:context_grammar}: \begin{itemize} \item the software origin where an object has been found/observed \item the line number(s) of interest, usually within a content object \end{itemize} The semi-colon (\verb|;|) is used as a separator between the object identifier and other contextual information. Each piece of contextual information is specified as a key/value pair, using the equal sign (\verb|=|) as a separator. The extended contextual elements should be added in the following manner: \begin{description} \item[software origin] a URL where a given object has been found or observed in the wild and used by Software Heritage to ingest the object into the archive. \item[line numbers] a single line number or a line range, two numbers separated with the hyphen (\verb|-|). Note that line numbers are purely indicative and are not meant to be stable, as in some degenerate cases (e.g., text files which mix different types of line terminators) it is impossible to resolve them unambiguously. 
\end{description} \bigskip For example, the following identifier \\ \begin{tcolorbox} \href{https://archive.softwareheritage.org/swh:1:dir:c6f07c2173a458d098de45d4c459a8f1916d900f;origin=https://github.com/id-Software/Quake-III-Arena/}{swh:1:dir:c6f07c2173a458d098de45d4c459a8f1916d900f;\\ origin=https://github.com/id-Software/Quake-III-Arena} \end{tcolorbox} points to the source code root directory of the computer game Quake III Arena\footnote{See \url{https://en.wikipedia.org/wiki/Quake_III_Arena}} with the origin URL where it was found; while \\ \begin{tcolorbox} \href{https://archive.softwareheritage.org/swh:1:cnt:41ddb23118f92d7218099a5e7a990cf58f1d07fa;origin=https://github.com/chrislgarry/Apollo-11;lines=64-72/}{swh:1:cnt:41ddb23118f92d7218099a5e7a990cf58f1d07fa;\\ lines=64-72} \end{tcolorbox} points to a comment segment with the warning ``NOLI SE TANGERE'' in a file in the Apollo-11 source code.\\ \end{document} }
\caption{\textbf{S-NAMO} - Extension of the Wu \& Levihn approach: Social Movability Evaluation in \color{blue}blue\color{black} and Social Placement Choice in \color{red}red\color{black}}
\caption{Overview of the proposed method. We use style transfer explicitly to align data distributions at the dataset level. Starting from a \textcolor{awesome}{labelled synthetic source} domain we select samples from an \textcolor{azure}{unlabelled realistic target} domain and restyle them to create a \textcolor{amethyst}{labelled restyled} domain, more aligned to the target one. This is used to enrich the original labelled dataset. Data selection can be either random or matched via perceptual hashing (Sec.~\ref{sec:approach}). The enriched dataset is used to train a model that learns feature representations more aligned to the realistic target domain used to restyle the original synthetic source data. This improves its performance when applied to data from the unlabelled target domain, compared to training only with the labelled source data. Further, the model shows performance gains even when applied to \textcolor{blue-green}{other realistic domains (unseen)} which are closer to the target domain than the source (as indicated by the domain alignment color scale on the right). Best viewed in color. %We consider four domains, namely {\color{awesome} source}, {\color{amethyst} restyled}, {\color{azure} target} and {\color{blue-green} unseen}, from which we consider that the {\color{awesome} source} domain is composed of synthetic data, and both the {\color{azure} target} and {\color{blue-green} unseen} domains are subsets of real data. We employ style transfer by matching real and synthetic data, either randomly (RS) or by perceptual hashing (PH) (described in Section \ref{sec:approach}) to bridge the gap between the synthetic and real, from which we produce restyled data. First, we train a CNN to learn a vision task with only the synthetic data and evaluate it on the real domain, which produces subpar results. We then use both the synthetic and restyled data for training, which increases the performance of the learned model. 
Finally, we evaluate both models on real but unseen data. Once more, the model trained with synthetic data produces inferior results, while the one trained with both the synthetic and the restyled data increases performance by a limited amount. }
\caption{The percentage of posts over time that have at least one of the hate words. The \textcolor{red}{red line} shows the increasing trend of posting such messages on Gab.}
\caption{Average frames per second (FPS), average precision, and AUC of top real-time trackers on 173 image sequences. \textcolor[rgb]{ 1, 0, 0}{\textbf{Red}}, \textcolor[rgb]{ 0, 1, 0}{\textbf{green}} and \textcolor[rgb]{ 0, 0, 1}{\textbf{blue}} fonts indicate the first, second and third place, respectively. All results are obtained solely on CPU.}
\caption{Estimated difference in SBP (in mmHg) associated with a difference of 10 $\upmu$g/m$^3$ in annual average ambient PM$_{2.5}$, when adjusting for TPRS with varying values of $df$ in the outcome model. The square marker (\mysquare{black}) at $df=0$ is the estimate without spatial adjustment and has error bars indicating a 95\% confidence interval. The thick black curve (\drawline{ultra thick}) represents estimates for different choices of $df$ and the thin black curves (\drawline{}) represent point-wise 95\% confidence intervals.}
\caption{Estimates (\drawline{thick}) of $\beta$ in Simulation 1 when pre-adjusting exposure with different choices of spatial basis (panel columns) and different underlying confounding surfaces (panel rows). The far left column shows the unadjusted estimate. The dashed lines (\drawline{dashed}) and error bars indicate 2x the standard error. The true parameter value $\beta =1$ is plotted as a dotted line (\drawline{thick, blue,densely dotted}).}
\caption{Estimates (\drawline{thick}) of $\beta$ in Simulation 2 when pre-adjusting exposure with different choices of spatial basis (panel columns) and different underlying confounding surfaces (panel rows). The far left column shows the unadjusted estimate. The dashed lines (\drawline{dashed}) and error bars indicate 2x the standard error. The true parameter value $\beta =1$ is plotted as a dotted line (\drawline{thick, blue,densely dotted}).}
\caption{Estimates (\drawline{ultra thick}) and pointwise confidence intervals (\drawline{black}) of the association between SBP and PM$_{2.5}$ for different amounts of confounding adjustment using TPRS. The horizontal line (\drawline{blue, dashed}) is at zero.}
\caption{Estimates (\drawline{ultra thick}) and pointwise confidence intervals (\drawline{black}) of the association between SBP and PM$_{2.5}$ for different amounts of confounding adjustment using Fourier and wavelet filtering. The horizontal line (\drawline{blue, dashed}) is at zero.}
\caption{{\color{red}Click to play the animation}: the dynamics of the propagation process (from the $1^{st}$ to the $n^{th}$ group) on a bottom-to-top DAG on the point cloud for buildings, in the RueMonge~\cite{facade} dataset. The points are ordered into 1136 groups in this case, on a global scale. }
\caption{Examples of 50 trajectories generated from trained models, conditioned on unseen test maps, without discarding any invalid ones. (Top row, in green) {\color{green!60!black}OCTNet}, (Bottom row, in red) {\color{red}GAN} model.}
\caption{Data exchange in the model suite and associated scientific goals. The following steps are performed: \textcolor{red}{(1)} (top left) measured stellar UV fluxes as well as the estimates of the stellar wind velocities ($v_{sw}$), stellar modulation ($\phi$) and PSDs (Christian-Albrechts-Universit\"at, hereafter CAU) are used as input i) for the Berlin Coupled Column Climate Chemistry (1D-TUB) and ExoTIC models as well as ii) to estimate the incoming GCR and SCR fluxes at the close-in exoplanets, respectively (highlighted by the red paths), \textcolor{cyan}{(2)} the transport of CRs through the planetary magnetosphere is studied with PLANETOCOSMICS in order to provide TOA proton fluxes as input to model both the GCR and SCR induced atmospheric ionization as well as the resulting dose rates on the planetary surface (purple arrow), (3) computation of the secondary particle production due to CR interaction through the planetary atmosphere down to the surface, \textcolor{Skobeloff}{(4)} calculation of the surface UV-A, UV-B and UV-C exposure as well as the radiation dose, \textcolor{ForestGreen}{(5)} determination of the impact of changing atmospheric ionization for the different atmospheric compositions as well as parameterization of the neutral atmosphere impact, \textcolor{Mulberry}{(6)} computation of the resulting atmospheric composition and climate, \textcolor{Dandelion}{(7)} performance of a pathway analysis in order to understand the biosignature chemical responses, \textcolor{RoyalBlue}{(8)} utilization of the global atmospheric composition and temperature fields to compute atmospheric transit (primary) and emission (secondary) spectra. The two iterations within the framework are displayed as red circles.}
\caption{Image of the field of \es\ with galaxy redshifts labeled. The {\it HST} ACS$+$F814W image is shown as a heat map, while the outer regions not covered by the ACS are filled in with the MOSAIC $i$-band image shown in grey-scale. The galaxy labels are colored by redshift in black ($z>0.4387$), red ($z=0.4291-0.4387$), orange ($z=0.3506-0.3596$), and blue (all other galaxies). The $z=0.4291-0.4387$ and $z=0.3506-0.3596$ redshift intervals correspond to $\pm 1000$ \kms\ velocity intervals around the candidate O\,VII {\color{black} absorber redshifts}. The blue contours from the FIRST \citep[][]{Becker:1995} survey reveal a radio lobe. Dotted circles with radii of $0.5'$, $1.0'$, $1.5'$, and $2.0'$ are shown for scale (170, 340, 510, 680 pkpc at $z=0.433$).}
\caption{{\it Top}: Redshift histograms of galaxies with $L>0.25 L_*$ at $d<1000$ (black histogram) and $<500$ (red filled histogram) pkpc from the \es\ sightline. The only massive group in the field with multiple galaxies of $L>0.25 L_*$ near the blazar sightline is at $z=0.433$, a strong indication that \es\ is a member of this galaxy group. {\it Bottom left}: Continuum normalized COS red-end spectrum of \es\ with flux in black and error in blue. Intervening H\,I \lya\ absorption systems from the IGM are labeled, {\color{black} and Milky Way features are plotted in grey}. The bottom axis shows the observed-frame wavelength while the top axis shows the corresponding \lya\ redshift. Orange dashed lines mark the redshifts of the candidate O\,VII systems. {\it Bottom right}: Histogram of the redshift difference between QSO systemic redshifts and the highest redshift H\,I \lya\ absorber of $W_{\rm r} > 0.03$ \AA\ cataloged in COS spectra by \cite{Danforth:2016}, $z_{\rm sys} - {\rm max}(z_{\rm Ly\alpha})$. {\color{black} The resulting empirical constraint on the redshift of \es\ is shown in blue in the top panel (right axis).}}
\caption{Metallicity-independent equilibrium ion fraction of O\,VI (blue), O\,VII (red), and O\,VIII (black) as a function of distance from the blazar for gas with a temperature of $T = 10^6$ K and with a density of $n_{\rm H}=10^{-5}$ (top), $10^{-4}$ (middle), and $10^{-3}$ (bottom) $\rm cm^{-3}$.}
\caption{ Left: scatter plots for WE2 (positive correlation between rise rates and amplitudes) for the solar cycles. Right: the same for WE1 (anti-correlation between rise times and amplitudes). Top and bottom panels are obtained from quasi-Planck and skewed-Gaussian fitted data, respectively. \blue{ The solid lines are linear fits with slopes $m = 0.030 \pm 0.023$, $0.032 \pm 0.022$, $-0.033 \pm 0.062$ \& $-0.045 \pm 0.064$ and intercepts $c = 0.970 \pm 0.025$, $0.968 \pm 0.024$, $1.033 \pm 0.063$ \& $1.046 \pm 0.065$, with the rms deviations of the fits being $0.011$, $0.013$, $0.014$ \& $0.018$, respectively, for panels a, b, c and d. } }
\caption{ Combined scatter plot for WE2. Different symbols represent different stars. The linear Spearman correlation and confidence level are printed on each plot. Top and bottom panels are obtained from quasi-Planck and skewed-Gaussian fitted data, respectively. We note that the data of stars HD~16160, HD~81809, HD~155886, and HD~161239 are not included because these stars do not show a positive correlation. \blue{ The solid lines are best linear fits with slopes $m = 0.076 \pm 0.025$ \& $0.061 \pm 0.021$ and intercepts $c = 0.924 \pm 0.028$ \& $0.939 \pm 0.024$, with the rms deviations of the fits being $0.038$ \& $0.039$, respectively, for the top and bottom panels. } }
\caption{ Scatter plot for WE1. The figure format is the same as in \Fig{fig:we2}; however, in this case fewer stars follow the solar-like WE1. \blue{ The solid lines are linear fits with slopes $m = -0.064 \pm 0.055$ \& $-0.029 \pm 0.049$ and intercepts $c = 1.064 \pm 0.058$ \& $1.029 \pm 0.051$, with the rms deviations of the fits being $0.026$ \& $0.036$, respectively, for the top and bottom panels. } }
\caption{function AOLS$\left( m, \varepsilon, V(s,\phi_k)\right)$} \label{alg:DOL} \end{algorithm} \subsection{Value-function and Policy update}~\label{sec:VPU} To obtain the policy network and the value-function approximation networks, we propose to adopt an actor-critic architecture with \textit{one actor network} and \textit{$I$ critic networks}, where the actor network is used to maximize the objective state value and each critic network maps a state-action pair to $\mathcal{Y}_{i}^{\pi}$. Assume that the actor network with weights $\theta_\pi$ generates actions via $a = \pi \left(s; \theta_{\pi}\right).$ The weights $\theta_{\pi}$ can be updated using the policy gradient given by~\cite{vamvoudakis2010online}: \begin{align*} \Delta \theta_{\pi}&\sim\sum_{k}\nabla_{\theta _{\pi}}\log \pi_{\theta}\left ( s_{k},a_{k} \right ) \delta_{k,t}, \end{align*} where $\delta_{k,t}$ is the expected value of the $i$th objective, also known as the temporal difference (TD) residual of $\widehat{V}^\pi_{i}$ with discount $\gamma$~\cite{sutton2018reinforcement}, given by \begin{align}\label{eq:aa} \delta_{k,t} = &r_{i}\left ( s_{k,t},a_{k,t} \right )+\gamma\widehat{V}_{i}^{\pi(\theta_\pi)}\left ( s_{k,t+1};\phi_{V_{i}} \right )\\\notag&-\widehat{V}_{i}^{\pi \left( \theta^{-}_{\pi} \right)}\left ( s_{k,t};\phi^{-}_{V_{i}} \right ) \end{align} where $r_{i}\left ( s_{k,t},a_{k,t} \right )$ is the immediate reward at the $t$th time step of the $k$th experience, $\widehat{V}_{i}^{\pi \left(\theta^{-}_{\pi} \right)}\left ( s_{k,t};\phi^{-}_{V_{i}} \right )$ is the approximation of the value function $V_i$ based on the old weights $\theta^{-}_{\pi}$ for the actor network and the old weights $\phi^{-}_{V_{i}}$ for the $i$th critic network, and $\widehat{V}_{i}^{\pi(\theta_\pi)}\left ( s_{k,t+1};\phi_{V_{i}} \right )$ is the approximation of the value function $V_i$ based on the updated weights $\theta_{\pi}$ for the actor network and the updated weights $\phi_{V_{i}}$ for the $i$th
critic network.
%$V_{i}^{\pi}\left(s\right)$ is one element of the vector value function, $V_{i}\left(s;\phi_{V_{i}}\right)$ is the critic function.
Among the $I$ critic networks, the $i$th network, with weights $\phi_{V_{i}}$, is used to approximate the $i$th element $V_{i}^{\pi}\left(s\right)$ of the vector value function. Assume that the critic function is given by $V_{i}\left(s;\phi_{V_{i}}\right)$ with $\phi_{V_{i}}$ serving as the weights. The weights can be updated via $ \Delta _{k}\phi _{V_{i}} \sim -\nabla_{\phi _{V_{i}}} \sum_{k} \delta^2_{k,t}. $ In the standard TD-residual method, the value of one action evaluated via \eqref{eq:aa} is an incremental form of value iteration. The key drawbacks of the standard TD-residual method are the need for a large number of samples and the large variance of the policy gradient estimate. To address these issues, an existing approach, called the generalized advantage estimator (GAE)~\cite{schulman2015high}, can be used to evaluate the action advantages and perform the policy updates using proximal policy optimization~\cite{schulman2017proximal,schulman2015trust,rockafellar1991scenarios}. The GAE is defined by: \begin{align*} &\widehat{A}_{t}^{GAE\left(\gamma ,\lambda \right)} \\
%=&\lim\limits_{H\to\infty} (1-\lambda)\sum_{j=1}^H \widehat{A}_{t}^{\left(j\right)}\\
=&\lim\limits_{H\to\infty} ( 1-\lambda ) \sum_{j=1}^H \lambda^{j-1} \sum_{l=0}^{j-1} \gamma^{l}\delta_{k,t+l} \\ =&\sum_{l=0}^{\infty}\left(\gamma \lambda \right)^{l}\delta_{k,t+l}, \end{align*} where $\lambda\in \left[0,1\right]$ and $\gamma\in \left[0,1\right]$ adjust the bias-variance tradeoff of GAE. After new weights of the advantage actor-critic network models are obtained, $V^{\pi}$ can be obtained via new samples using the updated policy. Afterwards, the procedure in Subsection~\ref{subsec:wab} can be implemented to obtain the updated $W$.
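
The TD residual of \eqref{eq:aa} and the discounted sum $\sum_{l}\left(\gamma\lambda\right)^{l}\delta_{k,t+l}$ can be sketched for a single objective $i$ and a single finite trajectory as follows. This is a minimal Python illustration under stated assumptions, not the implementation used in our experiments: the function names \texttt{td\_residuals} and \texttt{gae\_advantages} are hypothetical, the sum is truncated at the end of the trajectory, and a single set of value estimates is used for both terms of the residual instead of separate old and updated critic weights.

```python
from typing import List, Sequence


def td_residuals(rewards: Sequence[float], values: Sequence[float],
                 gamma: float) -> List[float]:
    """delta_t = r_t + gamma * V(s_{t+1}) - V(s_t) for one trajectory.

    `values` must hold one more entry than `rewards`: the last entry is
    the value estimate of the state reached after the final step.
    """
    if len(values) != len(rewards) + 1:
        raise ValueError("values must have len(rewards) + 1 entries")
    return [r + gamma * values[t + 1] - values[t]
            for t, r in enumerate(rewards)]


def gae_advantages(rewards: Sequence[float], values: Sequence[float],
                   gamma: float, lam: float) -> List[float]:
    """Truncated GAE: A_t = sum_l (gamma * lam)^l * delta_{t+l}."""
    deltas = td_residuals(rewards, values, gamma)
    advantages = [0.0] * len(deltas)
    running = 0.0
    # Backward recursion: A_t = delta_t + gamma * lam * A_{t+1}.
    for t in reversed(range(len(deltas))):
        running = deltas[t] + gamma * lam * running
        advantages[t] = running
    return advantages


if __name__ == "__main__":
    print(gae_advantages([1.0, 1.0, 1.0], [0.0, 0.0, 0.0, 0.0], 0.5, 1.0))
```

Setting $\lambda = 0$ in this sketch reduces each advantage to the one-step TD residual, while $\lambda = 1$ gives the discounted sum of all remaining residuals.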
The entire process will iterate until $V_{\overline{CCS}}\left ( w \right )- V_{S}^{*}\left ( w \right )<\epsilon$, where $\epsilon$ is a small threshold selected by the user. \subsection{Explainable Planning Representation}~\label{sec:EPR} To address the quantifiable inter-objective relationship in our algorithm, we adopt an explainable planning representation that enables automatic explanation of the planning rationale. \subsubsection{Vocabulary for Quality Attributes (QA)} We map QA analytic models to a domain-specific vocabulary that is used to generate verbal explanations. The vocabulary includes ``QA type'', ``optimization objective'', and ``QA property'' for the description of standard QAs. \subsubsection{QA Language Templates} To generate a verbal explanation of the objectives and the QA properties of a solution policy $\pi^\star$, we use predefined natural-language templates. Table~\ref{table_example} shows an example of a verbal explanation of QA objectives and properties. \begin{table}[htbp] \caption{Verbal explanation of QA objectives and properties} \centering \scalebox{0.78}{ \begin{tabular}{c|c|c} \hline QA Type & Optimization Objective & QA Property\\ \hline Standard measurement & ``maximize the alive bonus'' & ``the expected alive bonus is 150''\\ \hline \end{tabular}} \label{table_example} \end{table} \subsubsection{Obtaining Alternative Policies} Algorithm~\ref{alg:preI} outlines an approach for sampling alternative policies around the current policy. The key idea of the approach is to start with the QA values of the current solution policy $\pi: V_1^\pi(s),\cdots,V_I^\pi(s)$. For each QA $i$, we determine a new value $V_i^{\prime}$ that is preferable to $V_i^\pi(s)$. Then, we construct a new planning problem with $I-1$ optimization objectives (namely, excluding the objective associated with the QA $i$), resulting in a new multi-objective value function subject to the constraint that the QA $i$ must be at least as good as $V_i^{\prime}$.
Next, we select an optimal, constraint-satisfying solution value under $\pi^{\prime}$ for the new planning problem. The new policy (respectively, state value) provides an alternative to the current policy (respectively, the state value associated with the current policy). This procedure is executed iteratively until we obtain at most $M_i$ alternative policies for each $i$. The pseudocode of the algorithm is given below.

\begin{algorithm}[htbp]
\KwData{current policy $\pi$, state $s$, $V_1^\pi(s),\cdots,V_I^\pi(s)$, all $(I-1)$-attribute value functions $V_{\setminus 1}^\pi(s),\cdots,V_{\setminus I}^\pi(s)$, increment sizes of values $\Delta V_1,\cdots, \Delta V_I$, maximum values $M_{V_1},\cdots, M_{V_I}$, maximum number of alternatives $M_1,\cdots, M_I$}
\KwResult{A set of alternatives $\Pi^{\prime}$}
$\Pi^{\prime}\leftarrow \emptyset$\\
$D \leftarrow$ attributes to be explored, e.g., $\left \{1,\cdots,I \right \}$\\
\While{$D \neq \emptyset$}{
  $i \leftarrow$ remove an attribute from $D$\\
  $count_i \leftarrow 0$\\
  $V_i \leftarrow V_i^\pi(s)$\\
  $V_{\setminus i} \leftarrow$ $(I-1)$-attribute value function on all $V_{j}^\pi(s)$, $j \neq i$\\
  \While{$V_i \leq (M_{V_i}-\Delta V_i) \wedge count_i < M_i $}{
    $V_i \leftarrow V_i + \Delta V_i$\\
    $\pi: \overline{V}_i^\pi(s) \leftarrow$ all $\pi: V_j^\pi(s)$, where $j \neq i$\\
    $V^{\prime} \leftarrow \underset{V}{\arg\max}\, \overline{V}_{\setminus i}^\pi(s)$, subject to $V_{i}^\pi(s) \geq V_i$\\
    \If{$V^{\prime}$ exists}{
      $\Pi^{\prime} \leftarrow \Pi^{\prime} \cup \left \{V^{\prime} \right \} $\\
      $count_i \leftarrow count_i +1 $\\
      \For{$j \neq i$}{
        \If{$V_{j}^{\prime \pi}(s) \geq V_{j}^\pi(s)+\Delta V_j$}{$D \leftarrow D \setminus \left \{j \right \}$}}}}
}
\caption{Pseudocode for the calculation of alternative multi-objective values (function AV2f $(\pi, \mathbf{V}^\pi(s), s, \Delta V, M_V, M_i )$)}
\label{alg:preI}
\end{algorithm}

\subsubsection{Semantic Explanation of Value Tradeoffs}
Our value justification indicates the
amount of gain and loss in the QAs if one were to choose each alternative value under the current policy. It then indicates a preference for the current policy by arguing that such a gain is not worth the loss, reflecting the QA utility models underlying the multi-objective value function. We use a predefined natural-language template for generating the verbal justification: ``I could [improve these QAs to these values], by [carrying out this alternative policy] instead. However, this would [worsen these other QAs to these values]. I decided not to do that because [the improvement in these QAs] is not worth [the deterioration in these other QAs]''.

\subsection{Overall Algorithm}
The pseudocode for the proposed V2f-MORL approach described in Subsections~\ref{subsec:wab},~\ref{sec:VPU}, and~\ref{sec:EPR} is given in Algorithm~\ref{alg:seq}.

\begin{algorithm}[htbp]
\KwData{initial policy parameters $\theta_{0}$, initial value function parameters $\phi_{0}^{V_{i}}$, initial vectorized weights based on each objective $w_{ij}$, $CCS$}
\KwResult{PPO\_Model}
\For{$i=1$ to $I$}{
\For{$k=0,1,2,\ldots,K$} {
Collect set of trajectories $D_{k}^{i}=\left \{ \tau _{n}^{i} \right \}$ by running policy $\pi_{k}=\pi\left ( \theta_{k} \right )$ in the environment\\
Compute rewards-to-go $\hat{R_{t}}$\\
Update $V(s,\phi_k)$\\
Compute $w_{i\left[ \cdot \right ] } $ by function AOLS$\left(m, \varepsilon, V(s,\phi_k) \right)$\\
$V_{i}^{\phi_{k}} = w_{i\left [ \cdot \right ] } \times V(s,\phi_k) $\\
Compute advantage estimates $\hat{A_{t}^{i}}$ using the GAE method based on the current value function $V_{i}^{\phi_{k}}$\\
Update the policy by maximizing the PPO-Clip objective:
\begin{align*}
\theta_{k+1} = &\underset{\theta}{\arg\max}\frac{1}{\left |D _{k}^{i} \right | T}\sum_{\tau_{n}^{i}\in D _{k}^{i} } \sum_{t=0}^{T}\min\Big(\frac{\pi_{\theta}\left ( a_{t}|s_{t} \right )}{\pi_{\theta_{k}}\left ( a_{t}|s_{t} \right )} \\
& A^{i,\pi_{\theta_{k}}}\left(s_{t},a_{t} \right ), g\left ( \epsilon,
A^{i,\pi_{\theta_{k}}}\left(s_{t},a_{t} \right ) \right ) \Big)\end{align*} via stochastic gradient ascent (e.g., Adam)\\ Fit value function by regression on mean-squared error: \begin{align*} \phi_{k+1}^{V_{i}}=\underset{\phi_{k}^{V_{i}}}{\arg\min}\frac{1}{\left | D_{k}^{i} \right |T}\sum\limits_{\tau_{n}^{i}\in D _{k}^{i} } \sum\limits_{t=0}^{T}\left ( V_{i}^{\phi_{k}} \left ( s_{t} \right )-\hat{R_{t}^{i}} \right )^{2} \end{align*} via stochastic gradient descent.\\ $\Pi \leftarrow$ function AV2f $(\pi, \mathbf{V}^\pi(s), s, \Delta V, M_V, M_i )$} }
\caption{Best-model calculated (solid red lines) and experimental (dashed green lines) \red{$\cos(2\Psi)$ spectra }obtained at $\Phi_{a}$ = 65$^{\circ}$ and 75$^{\circ}$ for the un-annealed reference sample. The infrared range is dominated by a number of distinct absorption bands while the THz range shows Fabry-P\'erot oscillations as a result of the plane parallel interfaces of the sample.}
\caption{As Fig.~\ref{fig:psi}, but for the best-model calculated (solid red lines) and experimental (dashed green lines) \red{$\sin(2\Psi)\cos(\Delta)$ spectra }obtained at $\Phi_{a}$ = 65$^{\circ}$ and 75$^{\circ}$ for the un-annealed reference sample.}
\caption{\red{Best-model calculated real (dashed green line) and imaginary part (solid red line) of the complex dielectric function $\varepsilon(\omega)$ for the data shown in Figs.~\ref{fig:psi} and \ref{fig:delta} is depicted. The major absorption features occur in the range from 9 to 120~THz. Below 9~THz only a broad and shallow absorption can be observed. The best-model parameters are omitted here and the interested reader is referred to Ref.~\onlinecite{Park2019}.}}
\caption{(a) Experimental (green dotted lines) and best-model calculated (red solid lines) \red{$\cos(2\Psi)$ spectra }of the polymethacrylate sample which was annealed for 2~hours obtained at $\Phi _{a}$ = 70$^{\circ}$ and 75$^{\circ}$. Fig.~\ref{2hr}~(b) Same as (a) but for the \red{$\sin(2\Psi)\cos(\Delta)$ spectra.}}
\caption{(a) Experimental (green dotted lines) and best-model calculated (red solid lines) \red{$\cos(2\Psi)$ spectra }of the polymethacrylate sample which was annealed for 4~hours obtained at $\Phi _{a}$ = 70$^{\circ}$ and 75$^{\circ}$. Fig.~\ref{4hr}~(b) Same as (a) but for the \red{$\sin(2\Psi)\cos(\Delta)$ spectra.}}
\caption{Ground truth action sequence (\textcolor{red}{take\_cup}, \textcolor{blue}{spoon\_powder}, \textcolor{mypink}{pour\_milk}, \textcolor{myyellow}{stir\_milk}) (top) and our CDFL's action segmentation (bottom) on the sample test video $\textit{P03\_stereo01\_P03\_milk}$ from the Breakfast dataset. The background frames are marked in white. \abbrmodel{} may miss the true start and end of some actions, but successfully detects the actions.}
\caption{Top-down, the rows correspond to ground truth sequence of actions (\textcolor{mygold}{pour\_oil}, \textcolor{mymaroon}{crack\_egg}, \textcolor{mygreen}{fry\_egg}, \textcolor{myaquamarine}{put\_egg2plate}) and our action segmentations with neighbor-window size of $20$ on the sample video \textit{P03\_cam01\_P03\_friedegg} from Breakfast dataset using $L_{\text{CDF}}$, $L_{\text{DF}}$ and $L_{\text{F}}$, respectively. The background frames are marked in white. The result for $L_{\text{CDF}}$ is the best.}
\caption{Ground truth action sequence (\textcolor{mygold}{pour\_oil}, \textcolor{mymaroon}{crack\_egg}, \textcolor{mygreen}{fry\_egg}, \textcolor{myviolet}{take\_plate}, \textcolor{myaquamarine}{put\_egg2plate}) (top) and CDFL's action segmentations using different neighbor-window sizes on the sample test video \textit{P04\_webcam02\_P04\_friedegg} from Breakfast. The background frames are marked in white. The window size of $20$ gives the best performance.}
\caption{Ground truth action sequence (\textcolor{mypink}{StandUp}, \textcolor{blue}{SitDown}, \textcolor{red}{DriveCar}, \textcolor{mygreen}{OpenDoor}, \textcolor{mygreen}{OpenDoor}, \textcolor{myyellow}{HugPerson}) (top) and our action alignments (bottom) on the sample video $\textit{0261}$ from Hollywood Extended. The background frames are marked in white. \abbrmodel{} typically achieves a good action alignment.}
\caption{Ground truth action sequence (\textcolor{myturquoise}{OpenDoor}, \textcolor{myturquoise}{OpenDoor}, \textcolor{mymagenta}{OpenCarDoor}) (top) and CDFL's action alignments on the sample test video \textit{0361} from Hollywood Extended, when trained using varying window sizes. The background frames are marked in white. Using \abbrmodel{} and neighbor-window size of $6$ gives the best results. }
\caption{Visualization of the feature space $F$ in terms of person ID (top row) and camera ID (bottom row). Here, we use the samples from the MARS test set. In the initial phase, we observe that samples in the same camera domain are located in similar regions (\textcolor{red}{\textcircled{\raisebox{-0.9pt}{1}} red circles}). With ID-discriminative feature learning (Section \ref{section3.2}), the network can learn ID-discriminative feature representations. However, the misalignment of camera domains is still observed in the feature space (\textcolor{blue}{\textcircled{\raisebox{-0.9pt}{2}} blue circles}). To overcome this problem, we propose MDIFL so that we can align the multiple camera domains. }
\caption{Hyperoptimizing SGD. The symbol Adam(...) refers to Adam with the standard hyperparameters. Each hyperoptimizer experiment is repeated using the {\color{VioletRed}final hyperparameters} learned by the algorithm.}
\caption{Best seen in color. We plot the distribution of features before attack using a clean classifier ({\bf left}) and after attack using a poisoned classifier ({\bf right}). The color coding: {\color{red} Red diamonds}: clean target, {\color{blue} Blue circles}: clean source, {\color{black} Black triangles}: patched source , {\color{mygreen} Green pluses}: poisoned target. For 2D visualization, we choose the x-axis to be along the classifier weight vector {\bf w} (normal to the decision boundary). Let {\bf u} be the vector connecting the centers of the two classes (clean source and clean target). The y-axis is {\bf u} projected to be orthogonal to {\bf w}. Our optimization pushes the poisoned targets to be close to the patched sources in the feature space while they look similar to the clean targets visually. We see that before the attack, most patched source images are correctly placed on the left of the boundary, but after the attack (adding poisoned targets labeled as target to the training data), the classifier has shifted so that some of the patched sources have moved over from the left to the right side.}
\caption{\textcolor{purple}{(a) Performance deterioration of linear filters versus the change in the noise statistics due to the change in environment conditions. Three filters are investigated: a Kalman filter (blue line), and two robust filters (black and red line) designed using the presented framework. The Kalman filter is the most sensitive to the change in the noise statistics. (b) Accuracy-robustness trade-off for the linear estimator in \eqref{eq:lin_filter} and the system described in \eqref{eq: car's dynamics} and \eqref{eq: virtual sensor} with nominal measurement noise. The three filters from (a) are depicted in (b) with blue asterisk, black cross and red star, and their degree of sensitivity matches plot (a).}}
\caption{\textbf{Comparison to SPoCA:} {\color{black} SPoCA classification for September 6, 2017 12:02:00, the same observation used in Figures~\ref{fig:ExpertAgreement}, \ref{fig:comparison_euv}, and \ref{fig:comparison_halpha}, overlaid on a contemporaneous SUVI 195~\AA\ image. SPoCA labels follow the thematic map color scheme: yellow for bright region and green for coronal hole.} }
\caption{\textbf{Mixtures Confidence Map for the mixtures approach:} The ``confidence map'' (right) compared to a similarly timed SUVI 195~\AA\ observation (left). A high value of 1 (white) in the confidence map indicates that the classifier was certain in the labeling, while values around 0.5 (black) indicate that the classifier found two classes nearly equally likely. This is helpful in the cases of filaments and prominences, where the signature compared to quiet Sun is not always clear.}
\caption{Is structure important for complex, multi-hop Question Answering (QA) over unstructured text passages? To answer this question we explore the task of identifying supporting facts \textcolor{green}{(rounded rectangles)} by transforming a corpus of documents (1) into an undirected graph (2) connecting sentence nodes \textcolor{blue}{(rectangles)} and document nodes \textcolor{purple}{(hexagons)}. }
\caption{\emph{No DMAF $\to$ DMAF}: predicted gas density \orange{distribution for an image from the test set}. The left \orange{image is the input without DMAF}, the \orange{central image depicts the prediction by the cGAN}, and the \orange{right image has been generated from the cosmological simulation with DMAF}. Some salient features are highlighted \orange{in black and red/grey (for positive and negative examples, respectively)}: much of the substructure has been washed out and the robust filaments that remain resemble the ground truth (two large \orange{black} loops). Many knots have been smeared (upper small \orange{black} loop) or erased (\orange{black} loop in the lower right corner). The formation of bubbles in the DMAF simulation due to large amounts of energy injected has not been mimicked by the neural network -- bubbles neither emerge from filaments (lower \orange{red/grey} loop) nor from knots (upper \orange{red/grey} loop).}
\caption{\emph{No DMAF $\to$ DMAF}: azimuthally averaged power spectral density (PSD) of the \orange{test images. The solid lines depict the sample mean and the shaded areas show the $2\sigma$-region}. The relative error towards the ground truth images is plotted in the lower \orange{panel}. On large scales (i.e. small $k$), the PSDs of input, ground truth, and predicted images agree well with each other. On smaller scales, the PSD of the input images declines more slowly due to the finer substructure as compared to the output images. On even smaller scales consisting of a few pixels, the PSD of the predicted output images starts to drop with respect to the ground truth output.}
\caption{\emph{DMAF $\to$ no DMAF}: predicted gas density distribution for the \orange{image} in Figure \ref{fig:10MeV}. Now, the direction of inference is reversed. The cGAN has added substructure to the smooth input image and drawn filaments based on faint overdensities in the input images (see the areas marked with \orange{black} loops). Bubbles in the large \orange{red/grey} loop have not been erased by the cGAN and individual peaks arising in the ground truth output in the small \orange{red/grey} loop on the right have not been resolved.}
\caption{\emph{DMAF $\to$ no DMAF}: azimuthally averaged power spectral density (PSD) of \orange{the test images (sample means and $2\sigma$-regions)}. Note that input and ground truth are now flipped. The relative error towards the ground truth images (without DMAF) is plotted in the lower \orange{panels}. Again, the cGAN-generated images lose power on small scales as compared to the ground truth images.}
\caption{PSD heat-maps of the three EEG bands, i.e., theta (\textcolor{red}{red}), alpha (\textcolor{green}{green}), and beta (\textcolor{blue}{blue}), are added according to their respective color-bar ranges to obtain the combined RGB heat-map image. (The circular outline, nose, ears, and color-bars have been added for visualization only. All units are in Watt per Hz.)}
\caption{For a trial, PPG signal with peaks (in \textcolor{red}{red}) being detected for the calculation of RRs and HRV (above), and PPG spectrogram (below).}
\caption{Detected face (marked in \textcolor{red}{red}) and face localized points (marked in \textcolor{green}{green}) for two participants (left and center) in the study, and some of the features (marked in \textcolor{yellow}{yellow}) computed using the coordinates of the face localized points. These features were then normalized using the size of the face in the camera i.e. number of pixels in height (H) and width (W)}
\caption{An example dialog and Pointer-Generator, SPNet and ground truth summaries. We underline semantic slots in the conversation. \textcolor{dred}{Red} denotes incorrect slot values and \textcolor{dgreen}{green} denotes the correct ones. }
\caption{Supplement to the case in Table~\ref{caseTable}. The summary generated by Transformer is shown in supplement summary. \textcolor{dred}{Red} denotes incorrect slot values and \textcolor{dgreen}{green} denotes the correct ones. Human Evaluation part provides the evaluator's choice and feedback in ranking summary pairs. Content in the brackets is not shown to the evaluators. }
\caption{An example dialog and Pointer-Generator, SPNet and ground truth summaries. The dialog spans over three domains: restaurant, hotel and taxi. We underline semantic slots in the conversation. \textcolor{dred}{Red} denotes incorrect slot values and \textcolor{dgreen}{green} denotes the correct ones. }
\caption{An example dialog and Pointer-Generator, SPNet and ground truth summaries. The dialog spans over one domain: restaurant. We underline semantic slots in the conversation. \textcolor{dred}{Red} denotes incorrect slot values and \textcolor{dgreen}{green} denotes the correct ones. \textcolor{blue}{Blue} denotes the content not covered by ground truth in SPNet's summary. }
\caption{Examples of artificial questions generated from the dependency trees of an active voice (top) and a passive voice (bottom) sentence. The correct answer (\textsl{verb's subject}) is marked with \textcolor{mygreen}{\textbf{\underline{blue}}}, whereas the \textcolor{myyellow}{\textbf{yellow}} words are used in the question. The remaining words are discarded by pruning the dependency tree.}
\caption{% The ConfusionFlow matrix~(\protect\annotref*{a}) visualizes confusion of classifiers across training iterations. Performance data for multiple classifiers can be loaded~(\protect\annotref*{e}) and compared with each other. Additionally, class-wise performance measures and class distributions are displayed in a second view~(\protect\annotref*{b}). The timeline~(\protect\annotref*{d}) allows interactive exploration and selection of temporal regions of interest. On demand, plots can be expanded to the detail view~(\protect\annotref*{c}). Here, we compare the performance of a neural network classifying images from the train set (\protect\colorswatch{CFgreen}) and test set (\protect\colorswatch{CForange}) of CIFAR-10~\cite{krizhevsky_learning_2009}, and a recently proposed new test set (\protect\colorswatch{CFblue}) from CIFAR-10.1~\cite{recht_cifar-10_2018}, respectively. The line chart (C) shows that the relative number of misclassified images for the selected classes \emph{auto} and \emph{truck} deviates notably between the original and the new test set. For the remaining classes the classifier performs similarly on the new test set and the original CIFAR-10 test set. }
\caption{% Visual comparison of different neural network pruning strategies. An original network~(\protect\colorswatch{CFgreen}), a pruned network~(\protect\colorswatch{CForange}), and a re-initialized sparse network~(\protect\colorswatch{CFblue}) were trained to classify Fashion-MNIST images. ConfusionFlow reveals how the accuracy drop after 10 to 14 epochs~(\protect\annotref*{a}) relates to confusions for different pairs of classes~(\protect\annotref*{b}--\protect\annotref*{d}). The learning behavior of the re-initialized sparse network is much more stable compared to that of the other two models. }
\caption{% Visual comparison of strategies for effective labeling. In an experiment, our collaborators tested three different labeling strategies: Greedy~(\protect\colorswatch{CFblue}), Smallest Margin~(\protect\colorswatch{CForange}), and Dense Areas First~(\protect\colorswatch{CFgreen}). Using ConfusionFlow, our collaborators made a series of findings regarding the overall performances~(\protect\annotref*{a, d1, d2}) as well as the temporal progression of class confusions~(\protect\annotref*{b, c, d3}) for the different strategies.}
\caption{Radius of N\'{e}el and in-plane skyrmions in the presence (blue and magenta curves) and absence (cyan and red curves) of dipolar interactions (demag). The black solid curve denotes the analytical dependence on the reduced DM parameter $g=\pi D/4\sqrt{AK}$ proposed in Ref.~\onlinecite{Kravchuk2018}. (Inset) Radial magnetization profile $\bm{m}^{z}$ (red) and $\bm{m}^{x}$ (blue) in the presence of stray fields for N\'{e}el and in-plane skyrmions, respectively, where the corresponding radii are marked by \textcolor{red}{$\square$} and \textcolor{cyan}{$\Delta$} in the $R-g$ curves. The dashed black arrow indicates the associated $g$-value.}
\caption{\label{fig:Results} Bit error rate as a function of transmission distance for the 42\,Gb/s SBRNN autoencoder and $M$-PAM \& Rx MLSD systems ($M\in\{2,4\}$). In the case of MLSD, $\eta = \mu\log_2(M)$, where $\mu$ represents the number of pre- and post-cursor PAM symbols defining one of $M^{\mu}$ channel states. In the case of SBRNN, $\eta=W\log_2(M)$ is the number of bits inside the processing window. The solid SBRNN curves with square marks (\protect\showpgfsquare) include the bit-to-symbol mapping optimization described in Section~\ref{Sliding}, while circle marks (\protect\showpgfcircle) give the BER with a randomly chosen bit mapping and cross marks (\protect\showpgfx) show the lower bound on the BER (at most 1 bit error per symbol).}
\caption{Illustration of light distribution at a place with two occluders. (a) The light ray model near occluders. The blue line denotes the camera plane and $x_i$ ($i=0,1,2,3$) is a point in the background, while $s_i$ ($i=0,1,\cdots,8$) stands for the viewpoint. The orange square \orange{$\blacksquare$} denotes the selected viewpoints and pixel in the background when occlusions are contributed by only 1 occluder, while the red square \red{$\blacksquare$} is used for places where the occlusions are contributed by 2 occluders. (b) Illustration of the light ray model in the spatial dimensions. A solid point represents a pixel without occlusion, while a hollow point stands for an occluded pixel.}
\caption{Quantitative evaluation of state-of-the-art LFSR algorithms. We report the average PSNR and SSIM for Spatial $2\times$, $3\times$, $4\times$ and Angular $2\times$, $3\times$. \fst{Red} and \blue{blue} indicate the best and the second best performance, respectively.}
\caption{This table shows the feasibility function of two different jobs $J$ and $K$ defined over the same set of tasks $\{1,2,3,4,5\}$. Column $f^J(\cdot)$ shows the feasibility function of job $J$, while column $f^K(\cdot)$ shows the feasibility function of job $K$. \blue{Job $J$ is not doable, since tasks $2, 3, 4$ and $5$ \cyan{cannot be assigned a layer. On the other hand,} job $K$ is doable. Indeed, the number of each task in job $K$ corresponds to its layer.} }
\caption{ \textcolor{blue}{$m(x)$}, \textcolor{red}{$\widehat{m}(x)$} with $c=3$, $\rho=0.1$ and C.P. $\approx 40\%\;$ for $n=100, 300 \;\text{and}\; 500$ respectively.}
\caption{ \textcolor{blue}{$m(x)$}, \textcolor{red}{$\widehat{m}(x)$} with $c=3$, $\rho=0.1$ and $n=300$ for C.P. $\approx 10, 33\;\text{and}\;72 \%\;$ respectively.}
\caption{\textcolor{blue}{$m(x)$}, \textcolor{red}{$\widehat{m}(x)$} with $c=1$, $\rho=0.7$ and C.P.$\approx 30\%$ for $n=100,300 \; \text{and} \;500$ respectively.}
\caption{\textcolor{blue}{$m(x)$}, \textcolor{red}{$\widehat{m}(x)$} with $c=1$, $\rho=0.7$ and $n=300$ for C.P.$\approx 11, 33\; \text{and} \;65\%$ respectively.}
\caption{\textcolor{blue}{$m(x)$}, \textcolor{red}{$\widehat{m}(x)$} with $c=1$, $\rho=0.7$, C.P.$ \approx 25\%$ and $n=300$}
\caption{\textcolor{blue}{$m(x)$}, \textcolor{red}{$\widehat{m}(x)$}, with $c=1$, $\rho=0.7$, C.P. $\approx 30\%$ $n=300$ and M.F.$=10, 50\; \text{and} \; 100$ respectively.}
\caption{\textcolor{blue}{$m(x)$}, \textcolor{red}{$\widehat{m}(x)$}, $c=1$, $\rho=0.7$, $n=300$, C.P. $50\%$ and $\beta=0.01,0.05 \;\text{and}\; 0.1$ respectively.}
\caption{\textcolor{blue}{$m(x)$}, \textcolor{red}{$\widehat{m}(x)$}, $\mu_n(x)$ with $c=3$, $\rho=0.1$, $n=300$ and C.P. $\approx 10, 50 \;\text{and}\; 80\%$ respectively.}
\caption{\textcolor{blue}{$m(x)$}, \textcolor{red}{$\widehat{m}(x)$}, $\mu_n(x)$ with $c=3$, $\rho=0.1$, $n=300$, C.P. $\approx 35$ and M.F.=$10, 25 \;\text{and}\; 50\%$ respectively.}
\caption{\textcolor{blue}{$m(x)$}, \textcolor{red}{$\widehat{m}(x)$}, $\mu_n(x)$ with $c=1$, $\rho=0.7$, $n=300$ and C.P. $\approx 10, 50 \;\text{and}\; 80\%$ respectively.}
\caption{\textcolor{blue}{$m(x)$}, \textcolor{red}{$\widehat{m}(x)$}, $\mu_n(x)$ with $c=1$, $\rho=0.7$, $n=300$, C.P. $\approx 35$ and M.F.=$10, 25 \;\text{and}\; 50\%$ respectively.}
\caption{\small Illustration of the Neural Turtle Graphics (NTG) model. (a) depicts acyclic {\color{myblue}incoming paths} \{$\mb{s}^{in}$\} of an {\color{myorange} active node} $\mb{v}_i$, each of which is encoded using an RNN encoder. NTG decoder then predicts a set of {\color{mygreen} outgoing nodes} \{$\mb{v}^{out}$\}. (b) shows the NTG's neural network architecture. First, the encoder GRU consumes the motion trajectory $\Delta\mb{x}^{in}$ of each incoming path. We produce an order-invariant representation by summing up the last-state hidden vectors across all paths. Next, the decoder produces ``commands'' to advance the turtle and produces new nodes. An optional attribute vector can be further added to the decoder depending on the task.}
\caption{{\bf Constrained Optimization. Average model predictions for each protected group vs. slack}: {\bf Top}: Demographic Parity. {\bf Bottom}: Equal Opportunity. For each dataset, we train a linear classifier to satisfy fairness constraints over various slacks. We sort the solutions based on the amount of fairness violation on the training set and then plot the average prediction for each protected group within the training set. We plot both the average soft prediction score (solid lines) as well as the average {\it thresholded} hard binary prediction (dotted lines) for each group. We see that in most cases, slack-consistency is violated; i.e., the average prediction scores are not monotonic with slack. Points on the curve that violate slack-consistency are circled in {\color{red} red}. It is also worth noting that with the constrained optimization approach, there is the additional counter-intuitive effect where the average thresholded prediction can increase while the average soft prediction decreases (see gender\_Female predictions for Adult), which is possible depending on the distribution of the features.}
\caption{CIC IDS 2017 Autoencoder Configuration}
\caption{Test accuracy of embeddings composed of \textbf{Top-100 (T)}, \textbf{Middle-100 (M)} and \textbf{Bottom-100 (B)} \textbf{principal components} on sentence classification datasets. The values inside \textbf{[]} on the right side of each embedding type describes the variance explained by the included principal components. The highlighted cells correspond to one of the three cases - \emph{\textbf{M outperforms T}} ( \colorbox{orange!50}{\textbf{orange}}), \emph{\textbf{B outperforms T}} (\colorbox{red!50}{\textbf{red}}) and \emph{\textbf{B outperforms M}} (\colorbox{yellow}{\textbf{yellow}})}
\caption{Performance on sentence classification tasks of various embeddings and their Post Processed (PPA) counterparts. The \colorbox{red!50}{\textbf{red}} colored cells denote the cases where the original embeddings outperformed their Post Processed (PPA) counterparts.}
\caption{\small BLEU scores over three different low-resource language pairs with pretrained embeddings and Top D components removed using PPA. \colorbox{green!30}{\textbf{Green}} cells denote top scores.}
\caption{Safety modes for transformation directives. \textcolor{green!50!black}{Green} marks safe transformations, \textcolor{red}{red} marks transformations that may have changed the code's semantics, and \textcolor{orange}{orange} marks transformations that may have done so only in corner cases.}
\caption{\textbf{3D Scene Graph Attributes and Relationships.} For a detailed description see \textcolor{blue}{\href{http://3dscenegraph.stanford.edu/images/supp_mat.pdf}{supplementary material}}~\cite{suppmat}.}
\caption{\textbf{Detection results on panoramas:} (a) Image, (b) Mask R-CNN~\cite{maskrcnn}, (c) Mask R-CNN w/ Framing, (d) Mask R-CNN w/ Framing and Multi-View Consistency (our final results), (e) Ground Truth (best viewed on screen). For larger and additional visualizations see the \textcolor{blue}{\href{http://3dscenegraph.stanford.edu/images/supp_mat.pdf}{supplementary material}}~\cite{suppmat}.}
\caption{\textbf{3D detection results on mesh:} (a) Mask R-CNN~\cite{maskrcnn} + Pano Projection, (b) Mask R-CNN w/ Framing + Pano Projection, (c) Mask R-CNN w/ Framing and Multi-View Consistency (our final results), (d) Ground Truth (best viewed on screen). For larger and additional visualizations see \textcolor{blue}{\href{http://3dscenegraph.stanford.edu/images/supp_mat.pdf}{supplementary material}}~\cite{suppmat}.}
\caption{\textbf{Results of amodal mask segmentation with \textit{Ours}.} We predict the visible and occluded parts of the object in the center of an image (we illustrate the center with a cross) \textcolor{blue}{blue: visible}, \textcolor{red}{red: occluded}.}
\caption{Quantitative comparison for $4\times$ SR on three datasets: average PSNR/SSIM for scale factor x4. {\color{blue}{\textbf{Blue}}} text indicates the best and {\color{green}{\textbf{green}}} text indicates the second best performance.}
\caption{ The network structure is as follows: \colorbox{blue!40}{convolutional:} a, c, e, g, i, k, s, v, w, \colorbox{orange!40}{max pool:} b, d, f, h, j, l, \colorbox{cyan!40}{convolutional XNOR:} m, n, o, p, \colorbox{pink!40}{yolo:} q, x, \colorbox{yellow!40}{upsample:} t, \colorbox{white}{route:} r, u. }
\caption{ Example object detection results produced by the models. Left side: Tiny-YOLO \cite{RedmondFarhadi2018}, center: xYOLO, right side: Tiny-YOLO-XNOR. \colorbox{magenta!60}{Balls} and \colorbox{green!60}{goals} are labelled when each network identifies an object that reaches the detection threshold. It can be observed that xYOLO has better object detection results than Tiny-YOLO-XNOR \cite{rastegari2016xnor} and comparable results to Tiny-YOLO. }
\caption{\textcolor{red}{\label{fig: Micromag}}\textcolor{black}{Micromagnetic simulations. (a) Simulated magnetic phase diagram of regular Co disks/cylinders obtained by variation of diameter and thickness. Each point (square) corresponds to the resulting magnetic state obtained from one micromagnetic simulation. The region of the investigated nanowire (NW) is highlighted around $60$\,nm diameter and $25$\,nm thickness. (b) Comparison of the $B_{y}$ component between the simulated and experimental Co/Cu NW. In the simulation, the irregular morphology of the Co disks obtained from the mean inner potential tomogram was taken into account. Note the different blue-red color-coding range between simulation and experiment according to the color bars. (c) }Slices through the Co disks 1-8 as indicated in (b) superimposed by an arrow plot indicating the magnetic state. The opposite out-of-plane directions are distinguished by black and green as also shown in the arrow plots in (b).}
\caption{\sf a- The Deck diagram for the low-mass proton dissociation. b- The diagram of triple-Regge form used to evaluate, via {\cred{duality}, } the contribution of {\cred{the}} heavier intermediate states.}
\caption{\sf The phase shift $\delta\phi^C$ of the one-photon-exchange amplitude caused by the second photon exchange with proton excitations in {\cred{the}} intermediate states. The dashed line is calculated using the full photon-proton cross section, $\sigma^{\rm tot}_{\gamma p}(E_\gamma)$ at $E_\gamma<4.2$ GeV, while for the solid curve the Pomeron (constant) ``background'' of 91 $\mu$b was subtracted from $\sigma^{\rm tot}_{\gamma p}$.}
\caption{TorchBeast (\textcolor{blue}{blue}) and TensorFlow IMPALA (\textcolor{red}{red}) runs on several Atari levels (first set).}
\caption{TorchBeast (\textcolor{blue}{blue}) and TensorFlow IMPALA (\textcolor{red}{red}) runs on several Atari levels (second set).}
\caption{\textcolor{red}{Table with stretched cell height and phantom column}}
\caption{\red{Evolution of the Earth-Mars distance and the Earth declination as a function of the mission timing.}}
\caption{Mars rotation model to retrieve, along with the a priori constraints and post-fit uncertainties in the MOP estimates obtained with GINS using an iterative least-squares procedure. The nutation amplitudes are given for the rigid case \citep[see][]{Baland:2019in} and for the fluid core case (rigid/fluid), considering the nominal values for the Core momentum factor (liquid core amplification factor in Section 2 after Eq.~(\ref{eq:TF})) and the FCN period. \red{The LOD amplitudes are taken from \citet{Konopliv:2016aa}.}}
\caption{ (a) Electronic band structure of NdNiO$_2$ (black dotted line) and that of a two-orbital tight-binding model for the interstitial $s$ and Nd $5d_{xy}$ Wannier orbitals (blue solid line). The energy is measured from the Fermi level. (b),(c) Bloch function of the band bottom of the bonding band between the interstitial $s$ orbital and the Nd 5$d_{xy}$ orbital and the band top of the antibonding band at the A point, respectively. The results shown in (b) and (c) correspond to the states indicated by the orange open circles in (a). (d),(e) Wannier function of the interstitial $s$ orbital and \red{the Nd $5d_{xy}$} orbital, respectively. The calculation is performed by using OpenMX~\cite{Ozaki_2003} (see Appendix \ref{sec_methods} for the computational details). }
\caption{ (a) Electronic band structure of LiNd$_2$NiO$_4$. (b) Wannier function of the Ni $d_{x^2-y^2}$ orbital in LiNd$_2$NiO$_4$. (c)--(k) Electronic band structures of KCa$_2$NiO$_3$, KSr$_2$NiO$_3$, RbCa$_2$NiO$_3$, La$_2$NiO$_2B'_2$, Nd$_2$NiO$_2B'_2$, $A'_2$NiO$_2$F$_2$, $A'_2$NiO$_2$Cl$_2$, $A'_2$NiO$_2$Br$_2$, and $A'_2$NiO$_2$I$_2$, respectively. The energy is measured from the Fermi level. \red{$\Sigma_1$ and $\Lambda_1$ in the $I4/mmm$ structures are (0.5,0,0) and (0,0,0.5) for the conventional unit cell shown in Fig.~\ref{crystal_structures}, respectively.} The open circles indicate the band minimum of the bonding band formed by the interstitial $s$ state and the \xy state of the neighboring cation. The green dotted curves are the Wannier-interpolated band of the effective single-band model.}
\caption{ \red{ Energy of the interstitial $s$ orbital and the cation $d_{xy}$ orbital in the Ni-based compounds. $E_{s}$ and $E_{d_{xy}}$ are the onsite potentials of the $s$ and $d_{xy}$ Wannier orbitals, respectively. $E_b^k$ and $E_a^k$ are the energy levels of the bonding and antibonding states between the $s$ and $d_{xy}$ orbitals at the $k$ point, respectively. $\Delta E_{sd}$ is the energy difference between the $s$ and $d_{xy}$ states, $\Delta E_{sd}=E_{s}-E_{d_{xy}}$. $\bar{E}_{sd}$ is the average of $E_{s}$ and $E_{d_{xy}}$. $\Delta E_{ba}^k$ is the energy difference between the bonding and antibonding bands at the $k$ point. The unit for length is \AA. The energy is measured with respect to the Fermi level in units of eV.} }
\caption{ \red{ Energy of the interstitial $s$ orbital and the cation $d_{xy}$ orbital in the Pd-based compounds. $E_{s}$ and $E_{d_{xy}}$ are the onsite potentials of the $s$ and $d_{xy}$ Wannier orbitals, respectively. $E_b^k$ and $E_a^k$ are the energy levels of the bonding and antibonding states between the $s$ and $d_{xy}$ orbitals at the $k$ point, respectively. $\Delta E_{sd}$ is the energy difference between the $s$ and $d_{xy}$ states, $\Delta E_{sd}=E_{s}-E_{d_{xy}}$. $\bar{E}_{sd}$ is the average of $E_{s}$ and $E_{d_{xy}}$. $\Delta E_{ba}^k$ is the energy difference between bonding and antibonding bands at the $k$ point. The unit for length is \AA. The energy is measured with respect to the Fermi level in units of eV.} }
\caption{Example of an attention map for an axial slice containing the center of mass of a nodule (arrow). Colorbar: 0~\protect\includegraphics[width=4em,height=.75em]{figures/spring_colorbar}~1~(normalized search time in the slice).}
\caption{Average reading time for the {\color{light orange}$\blacksquare$}~left and {\color{lavender}$\blacksquare$}~right side of the scan of each reader. Error bars represent the standard deviation. The total reading time of Rad~4 is statistically different from that of Rad~1~and~3 (significance level $p=0.05$).}
\caption{Probability of the radiologists' gaze location (Prob) for the right lung as a function of the normalized scan reading time (Time). Gaze location: {\color{light orange}$\blacksquare$}~left and {\color{lavender}$\blacksquare$}~right sides of the lung. Shaded areas correspond to the confidence interval for $p=0.05$, and darker regions mark periods where the confidence interval is limited to one of the lungs. \label{fig:gaze_strategy}}
\caption{Lung nodule detection sensitivity and the corresponding average number of false-positives per scan for the annotators, automatic system and pair-wise combinations. The number of nodules found by ranges of normalized attention time is also shown. {\color{orange}$\blacksquare$}~Rad~1; {\color{light orange}$\blacksquare$}~Rad~2; {\color{lavender}$\blacksquare$}~Rad~3; {\color{purple}$\blacksquare$}~Rad~4; {\color{black}$\blacksquare$}~automatic system; // combination of the Rad with the automatic system. }
\caption{Backward rules for {\color{blue} naive} and {\color{red} multimodal} adversarial perturbation bias }
\caption{Correct predictions are highlighted in \textcolor{green}{green} and incorrect predictions in \textcolor{red}{red}. Our method assigns a high prediction score to the true outcome and a low score otherwise.}
\caption{Comparison of the network using a plain concatenation block or an attention block, including PSNR and SSIM for scale 2$\times$, 4$\times$ and 8$\times$ SR on Set5 and Set14. {\color{red}Red} indicates the best results.}
\caption{Comparison of the network with or without back projection or RBPB, including PSNR and SSIM for scale 2$\times$, 4$\times$ and 8$\times$ SR on Set5 and Set14. {\color{red}Red} indicates the best results.}
\caption{Quantitative evaluation of state-of-the-art SR approaches, including PSNR and SSIM for scale 4$\times$, 8$\times$ and 16$\times$. {\color{red}Red} indicates the best and {\color{blue}blue} indicates the second best results.}
\caption{Voltage measured at point $\ddot{x}$ for each computer analyzed. Computer 1 ({\color{red}\rule[0.03mm]{3mm}{0.3mm}}), Computer 2 ({\color{blue}\rule[0.03mm]{3mm}{0.3mm}}), Computer 3 ({\color{green}\rule[0.03mm]{3mm}{0.3mm}}), Computer 4 (\rule[0.03mm]{3mm}{0.3mm}). }
\caption{Voltage at $\ddot{x}$ for computer 4, which had the lowest NRMSE index, and the experimentally collected data. Computer 4 ({\color{red}\rule[0.03mm]{3mm}{0.3mm}}), Experimental data (\rule[0.03mm]{3mm}{0.3mm}).}
\caption{Voltage at $\ddot{x}$ for each computer analyzed. Computer 1 ({\color{green}\rule[0.03mm]{3mm}{0.3mm}}), Computer 2 ({\color{red}\rule[0.03mm]{3mm}{0.3mm}}), Computer 3 ({\color{blue}\rule[0.03mm]{3mm}{0.3mm}}), Computer 4 (\rule[0.03mm]{3mm}{0.3mm}).}
\caption{{\color{blue}[\emph{Chase} $\rightarrow$ \emph{HRF}]} Effectiveness of ErrorNet for cross-dataset evaluation, where the training set comes from the \emph{Chase} dataset but the test set comes from the \emph{HRF} dataset. Top: Fundus image. Bottom-Left: Segmentation result for an \emph{HRF} dataset image from the segmentation network (before error correction). Bottom-Right: corresponding segmentation result generated by ErrorNet after error correction. The yellow boxes indicate example regions where the ErrorNet model has connected fragmented vessels or sharpened vessel structures. }
\caption{{\color{blue}[\emph{Chase} $\rightarrow$ \emph{Drive}]} Effectiveness of ErrorNet for cross-dataset evaluation, where the training set comes from the \emph{Chase} dataset but the test set comes from the \emph{Drive} dataset. Top: Fundus image. Bottom-Left: Segmentation result for a \emph{Drive} dataset image from the segmentation network (before error correction). Bottom-Right: corresponding segmentation result generated by ErrorNet after error correction. The yellow boxes indicate example regions where the ErrorNet model has connected fragmented vessels or sharpened vessel structures. }
\caption{{\color{blue}[\emph{Chase} $\rightarrow$ \emph{Chase}]} Effectiveness of ErrorNet for same-dataset evaluation, where the training and test sets both come from the \emph{Chase} dataset. Top: Fundus image. Bottom-Left: Segmentation result for a \emph{Chase} dataset image from the segmentation network (before error correction). Bottom-Right: corresponding segmentation result generated by ErrorNet after error correction. The yellow boxes indicate example regions where the ErrorNet model has connected fragmented vessels or sharpened vessel structures. As expected, improvement is not as drastic as that of cross-dataset evaluation.}
\caption{{\color{blue}[\emph{Aria} $\rightarrow$ \emph{HRF}]} Effectiveness of ErrorNet for cross-dataset evaluation, where the training set comes from the \emph{Aria} dataset but the test set comes from the \emph{HRF} dataset. Top: Fundus image. Bottom-Left: Segmentation result for an \emph{HRF} dataset image from the segmentation network (before error correction). Bottom-Right: corresponding segmentation result generated by ErrorNet after error correction. The yellow boxes indicate example regions where the ErrorNet model has connected fragmented vessels or sharpened vessel structures.}
\caption{{\color{blue}[\emph{Aria} $\rightarrow$ \emph{Stare}]} Effectiveness of ErrorNet for cross-dataset evaluation, where the training set comes from the \emph{Aria} dataset but the test set comes from the \emph{Stare} dataset. Top: Fundus image. Bottom-Left: Segmentation result for a \emph{Stare} dataset image from the segmentation network (before error correction). Bottom-Right: corresponding segmentation result generated by ErrorNet after error correction. The yellow boxes indicate example regions where the ErrorNet model has connected fragmented vessels or sharpened vessel structures.}
\caption{{\color{blue}[\emph{Aria} $\rightarrow$ \emph{Drive}]} Effectiveness of ErrorNet for cross-dataset evaluation, where the training set comes from the \emph{Aria} dataset but the test set comes from the \emph{Drive} dataset. Top: Fundus image. Bottom-Left: Segmentation result for a \emph{Drive} dataset image from the segmentation network (before error correction). Bottom-Right: corresponding segmentation result generated by ErrorNet after error correction. The yellow boxes indicate example regions where the ErrorNet model has connected fragmented vessels or sharpened vessel structures.}
\caption{M-CNN Training Performance on 128 2S Intel\textregistered{} Xeon\textregistered{} Gold processors with Dataset B}{\renewcommand{\arraystretch}{1.5}%
\begin{tabular}{|c|c|c|c|c|}
\hline
\textbf{\# of Nodes} & \textbf{\# of Epochs} & \textbf{Batch Size} & \textbf{TTT (mins)} & \textbf{Images/sec}\\
\hline \hline
1 & \centering6.6 & 128 & 960 & 30\\
2 & \centering8 & 256 & 642 & 72\\
4 & \centering8.7 & 512 & 320 & 141\\
8 & \centering12 & 1024 & 240 & 262\\
16 & \centering15.9 & 2048 & 150 & 553\\
32 & \centering14.9 & 2048 & 85 & 893\\
64 & \centering15 & 2048 & 61 & 1284\\
128 & \centering15.2 & 2048 & 50 & 1587\\
\hline
\end{tabular}}
\caption{Top-1 Accuracy achieved in 20 epochs of M-CNN training and Learning Rate used on 1--64 2S Intel\textregistered{} Xeon\textregistered{} Gold processors. Dataset B is used for these experiments. Global minibatch size is capped at 2K from 16 to 64 nodes. The learning rate as shown in (f) -- (h) is also scaled only to 0.032 to achieve convergence}
\caption{Scalability of M-CNN training performance for 20 epochs on 64 2S Intel\textregistered{} Xeon\textregistered{} Gold 6148 processors. Note that global batch size is capped at 2K from 16 -- 64 nodes. Intel\textregistered{} OP Fabric, TensorFlow-1.9.0+Horovod, OpenMPI v3.0.0, 8 workers/node}
\caption{\color{blue}An example of the genetic operator (SBX~\cite{PM}) based offspring generation in a 2-D decision space, where $\mathbf{p}_1$ and $\mathbf{p}_2$ denote the parent solutions, and $\mathbf{s}_1$ and $\mathbf{s}_2$ denote the offspring solutions.}
\caption{\color{blue}The trajectories of generator and discriminator's training losses of the original GAN (with multivariate Gaussian model disabled) and our modified GAN during the evolution, respectively.}
\caption{\color{blue}The statistics of the runtime results achieved by the original IBEA and GMOEA.}
\caption{LAS scores of our parser in the raw text setup. Languages not in m-BERT's training corpus are marked with *. \textcolor{svo}{SVO} and \textcolor{sov}{SOV} languages are indicated by \textcolor{svo}{purple} and \textcolor{sov}{green} respectively. Stanford and CoNLL18's best submitted systems are provided as representative state-of-the-art supervised systems. \#TrWrds = Total training data made available at CoNLL18. The amount of training used in each experiment is specified in \S\ref{ssec:data}. Training languages for each experiment are highlighted in grey.}
\caption{We illustrate the domain adaptation process with adversarial dropout (AdD). We depict the source and target domains as solid and dashed lines, respectively. {\color{red}Decision boundary} of a model only trained on the source domain easily violates the {\em cluster assumption} in that it passes through target feature-dense regions (a). We can apply AdD on both the feature extractor (c) and classifier (d). When AdD is used on the feature extractor, the decision boundary is pushed away from feature dense regions. On the contrary, AdD on the classifier pushes features away from the decision boundary. Eventually, our domain adapted model draws a {\color{blue}robust decision boundary} that avoids clusters (b).}
\caption{Qualitative evaluation of DARK ({\color{red}red}) {\em vs.} HRNet-W32 ({\color{cyan}cyan}) on COCO. }
\caption{The PK/PD compartmental model of anesthesia (Adapted from \textcolor{blue}{\cite{bamdadian2008controlling}}).}
\caption{Numerical values of the Hill equation's parameters~\textcolor{blue}{\cite{nino2009epsac}}.}
\caption{Classification performance for the equally stratified \textit{action} data split. The \textit{action} part shows the results of training one classifier on the action labels, whereas the \textit{verb+noun} part shows the results of independently training two classifiers over the verb and noun labels. The scores in \textcolor{TableGray}{\textbf{gray}} color were calculated based on the respective \textit{action} or \textit{verb+noun} classifiers.}
\caption{Classification performance for the equally stratified \textit{action} data split. The \textit{action} part shows the results of training one classifier on the action labels, whereas the \textit{verb+noun} part shows the results of independently training two classifiers over the verb and noun labels. The scores in \textcolor{TableGray}{\textbf{gray}} color were calculated based on the respective \textit{action} or \textit{verb+noun} classifiers.}
\caption[]{Pipeline of our proposed approach. A video is divided into $K=3$ time segments shown in \textcolor{PipelineGreen}{\textbf{green}}, \textcolor{PipelineRed}{\textbf{red}}, and \textcolor{PipelineBlue}{\textbf{blue}} colors. Then, RGB and optical flow frames are sparsely sampled from each time segment to be processed in their respective spatial and temporal streams. At the end of each stream, the average consensus of the softmax scores is computed. A spectrogram is calculated from the raw audio signal and processed in its audio stream. The class scores of each stream are joined together using late fusion.}
\caption{Performance comparison with EPIC Kitchens challenge baseline results. The results highlighted in \textcolor{blue}{\textbf{bold blue}} are the best obtained by our method.}
\caption{World location shared between \vhad~and Virtual KITTI \citep{Gaidon2016}, as seen from within the Unity\textregistered\editor.\label{fig:phav_vkitti}}
\caption{An example for a 2-way 1-shot scenario, including both few-shot DA and few-shot NOTA. Different colors indicate different entities, \textcolor{blue}{blue} for head entities, and \textcolor{red}{red} for tail entities. For few-shot DA, instances in the training phase and test phase come from different domains. For few-shot NOTA, it requires models to detect the none-of-the-above (NOTA) relation.}
\caption{A) 2D clip of patient-specific geometry with no discontinuity capturing used; oscillations in the scalar solution can be seen near the wavefront. B) 2D clip of patient-specific geometry using the discontinuity capturing; a smooth solution to the scalar problem can be seen throughout the domain. C) 1D scalar solution plotted along the black line seen in A) and B). The use of the DC operator effectively rids the scalar solution of the overshoot/undershoot phenomena seen in the simulation with no DC (\textcolor{red}{red}).}
\caption{Adding FOP improves the hyperparameter robustness of standard optimizers. (a) The final test accuracy of a 9-layer CNN trained on CIFAR-10 for different settings of SGD with momentum (top) and SGD with momentum and FOP (bottom), averaged over three runs. Settings in which adding FOP improves performance by at least one standard deviation are highlighted in \textcolor{blue}{blue}. FOP appears to be most useful for higher values of the learning rate and momentum parameters. The FOP matrices were trained with Adam with a learning rate of $5\times 10^{-4}$. (b) The performance of Adam for a range of learning rates both with and without FOP, averaged over three runs. While performance is similar, the top performing models are improved by the addition of FOP. The FOP matrices were trained using the same setting as the models in (a).}
\caption{\label{fig:nearfield criterion}Illustration of the nearfield criterion for a third order expansion and $\gamma = 2$. The dashed line indicates the complete nearfield of the box associated with \textcolor{cbblue}{$\vb{r}_0$}---i.e.\ all boxes that have an expansion point within $\gamma \Delta s$ (infinity norm) of the expansion around \textcolor{cbblue}{$\vb{r}_0$}. Consequently, all of the $\vb{s}_\ell(\vb{r})$ within the central dark blue square have a pairwise interaction with the $\vb{s}_{\ell'}(\vb{r})$ inside the dashed box. }
\caption{Example search tree for SICK 340, where $P$ is \textit{A schoolgirl with a black bag is on a crowded train}, with the $H$: \red{\textit{A girl with a black bag is on a crowded train}}. Only one \texttt{replacement} is allowed at each step. Sentences at the nodes are generated entailments. \framebox{Sentences} in rectangles are the generated contradictions. In this case our system will return \texttt{entail}. The search will terminate after reaching the $H$ in this case, but for illustrative purposes, we show entailments of depth up to 3. To exclude the influence of morphology, all sentences are represented at the lemma level in MonaLog, which is not shown here. \label{fig:search:tree} }
\caption{Evolution of $u(k)$ for each algorithm.}{\includegraphics[scale=0.3]{histogram.pdf}\label{fig:boxplot}}
\caption{\small \textbf{Super-resolution:} Average PSNR / SSIM / Run Time (seconds on GPU) for scale factor $\times2$, $\times3$ and $\times4$ on datasets Set5, Set14 and B100. \textcolor{red}{Red color} indicates the best performance and \textcolor{blue}{blue color} indicates the second best performance.}
\caption{ \label{table:3} Average PSNR(dB) results on 24 natural color images of different denoising methods: CBM3D~\protect\cite{dabov2009bm3d}, MLP~\protect\cite{burger2012image}, TNRD~\protect\cite{chen2015learning}, NI~\protect\cite{lebrun2015multiscale}, NC~\protect\cite{lebrun2015noise} and WNNM~\protect\cite{gu2014weighted}. The best two results are highlighted in {\color{red}red} and {\color{blue}blue}.}
\caption{Survey results for understanding user behavior: (a) starting point statistics, (b) end point statistics, and (c) not sharing location information implies privacy. While 90\% of the 60 participants indicated their start of activity is either \textcolor{blue}{home}, \textcolor{orange!80!black}{school}, or \textcolor{gray!30!black}{work}, an overwhelming 98\% of the participants indicated those to be the end point of their activities.}
\caption{ Additional visualization of MeteorNet example results on the KITTI scene flow dataset. Points are colored to indicate which frames they belong to: \textcolor{blue}{frame $t-3$}, \textcolor{viz_green}{frame $t-2$}, \textcolor{viz_yellow}{frame $t-1$}, \textcolor{viz_red}{frame $t$}. \textbf{Translated points} (frame $t-3$ + estimated scene flow) are in black. Green and black shapes are supposed to overlap for perfect estimation. }
\caption{{\bf Visualization of MeteorNet example results on the KITTI scene flow dataset.} Points are colored to indicate which frames they belong to: \textcolor{blue}{frame $t-3$}, \textcolor{viz_green}{frame $t-2$}, \textcolor{viz_yellow}{frame $t-1$}, \textcolor{viz_red}{frame $t$}. \textbf{Translated points} (frame $t-3$ + estimated scene flow) are in black. Green and black shapes are supposed to overlap for perfect estimation. }
\caption{Qualitative comparison of inpainting for square-shaped regions with (\myarrowred) showing the advantages of ipA-MedGAN.}
\caption{Qualitative comparison of inpainting for arbitrary-shaped regions with (\myarrowblue) showing the advantages of ipA-MedGAN.}
\caption{Results of numerical simulation for an oscillatory cylinder: a) velocity contours for three different phase angles; b) comparison of the inline velocity component~$(u)$ profile at position $x_1=-0.6D$ for three different phase angles between numerical results~(overset: \textcolor{red}{- -}, single grid: $-$) and the experimental measurements~(\textbf{o}) of \citet{dutsch1998low}.}
\caption{MSE for varying weights for neighboring distributions: {\textcolor{blue}{Baseline}}, {\textcolor{red}{Smoothing-based}}, {\textcolor{ForestGreen}{Slang-based}}, {\textcolor{Orange}{Smoothing and Slang-based}}, where the multiplier represents the relative weight of neighboring topic distributions.}
\caption{ (a) Cumulative distribution function of the delay between the time a problematic comment is posted and the time it is removed. (b) After making a comment $c$, to what CMV locations, if anywhere, does the author $a$ post afterwards, depending on whether $c$ was removed? \textcolor{red}{Red}: $c$ was $a$'s $1^{\mbox{st}}$ or $2^{\mbox{nd}}$ comment to ever be removed. \textcolor{darkgray}{Gray}/stripes: all other comments. The same user can contribute to both colors via different comments. ``community'': the union of the original and different \posttrees. Error bars represent (tiny) standard errors. }
\caption{Attention over written registry keys for a malware sample that the model classified correctly as malicious. The cells are colored by how much attention each word received, with colors: \textcolor{veryhigh}{very high}, \textcolor{high}{high}, \textcolor{med}{medium}, \textcolor{low}{low}, and \textcolor{verylow}{very low}.}
\caption{t-SNE visualization of file operations. This shows how the files used in file operations were clustered by the trained embedding layer. \textcolor{blue}{Blue} shows files common to benign files, \textcolor{red}{red} shows files common to malicious files, and \textcolor{mediumseagreen}{green} shows files common to both. }
\caption{The process of attention optimization (better viewed in color). The original attention distribution (\textcolor[RGB]{255,70,70}{red bar on the left}) is updated by the refinement gate $r_t$, and attention on some irrelevant parts is lowered. Then the updated attention distribution (\textcolor[RGB]{128,183,238}{blue bar in the middle}) is further supervised by a local variance loss to yield a final distribution (\textcolor[RGB]{74,171,74}{green bar on the right}).}
\caption{With global variance loss, our model (\textcolor[RGB]{65,171,65}{green bar}) can avoid repetitions and achieve a percentage of duplicates comparable to that of the reference summaries.}
\caption{Layer descriptions \textcolor{red}{for Parkinson's disease detection}: the first part of the DNN is composed of 18 1D-Convnets, and the second part is a fully connected network.}
\caption{Training curve \textcolor{red}{for Parkinson's detection}. The accuracy of the training set is represented by the blue line and the accuracy of the validation set is represented by the orange line. The accuracy corresponds to the proportion of segments correctly classified over all the walking segments in the training or the validation set.}
\caption{Cross-validation results \textcolor{red}{for Parkinson's detection}. Best results are in bold and the second best results are in italic. \textcolor{brown}{Sp: Specificity, Se: Sensitivity, Acc: Accuracy, SD: Standard deviation.} Values not available for results reported in \citep{ertuugrul2016detection} are indicated by n/a. }
\caption{Impact of input signals on Parkinson's detection \textcolor{red}{at segment level}. In each row, our DNN was retrained without (w/o) two symmetric inputs Right (R) and Left (L) VGRF signals. Results corresponding to the most significant features are in bold. \textcolor{brown}{Sp: Specificity, Se: Sensitivity, Acc: Accuracy.}}
\caption{Examples of images from Messidor-2 and Kaggle test sets that are predicted by \model{} as R4. The curves are the contours of the attention maps (threshold=0.3). {\color{black}$\blacksquare$}~R1, {\color{orchid}$\blacksquare$}~R2, {\color{cyan}$\blacksquare$}~R3, {\color{springgreen}$\blacksquare$}~R4. \label{fig:r4_laser}}
\caption{Examples of R0-R2 Kaggle images (not used for training) along with the \model{} predictions. The curves are the contours of the attention maps (threshold=0.3). {\color{black}$\blacksquare$}~R1, {\color{orchid}$\blacksquare$}~R2, {\color{cyan}$\blacksquare$}~R3, {\color{springgreen}$\blacksquare$}~R4. \label{fig:r0_r1_r2}}
\caption{Examples of R2 and R3 Kaggle dataset images (not used for training) along with \model{} predictions. The curves are the contours of the attention maps (threshold=0.3). {\color{black}$\blacksquare$}~R1, {\color{orchid}$\blacksquare$}~R2, {\color{cyan}$\blacksquare$}~R3, {\color{springgreen}$\blacksquare$}~R4. \label{fig:r2_r3_assump}}
\caption{Examples of healthy Messidor-2 images mispredicted as R1 because they contain camera artifacts. These images show camera artifacts (present in the same location across several different images) which are detected as R1 typical lesions. The curves are the contours of the attention maps (threshold=0.3). {\color{black}$\blacksquare$}~R1, {\color{orchid}$\blacksquare$}~R2, {\color{cyan}$\blacksquare$}~R3, {\color{springgreen}$\blacksquare$}~R4. \label{fig:messidor_marks}}
\caption{Class distribution of the DR grading datasets used for evaluation. {$\square$}~R0/no DR, {\color{cm1}$\blacksquare$}~R1, {\color{cm2}$\blacksquare$}~R2, {\color{cm3}$\blacksquare$}~R3, {\color{cm4}$\blacksquare$}~R4/PDR, {\color{cm5}$\blacksquare$}~SDR, {\color{cm6}$\blacksquare$}~PPDR. \label{fig:datasets_classes}}
\caption{Examples of ophthalmologist's annotations on the SCREEN-DR dataset. {\color{black}$\blacksquare$}~MA, {\color{fuchsia}$\blacksquare$}~HEM, {\color{gold}$\blacksquare$}~EX, {\color{c1}$\blacksquare$}~CWS, {\color{c2}$\blacksquare$}~IrMA, {\color{c3}$\blacksquare$}~NV, {\color{c4}$\blacksquare$}~PHEM, {\color{c5}$\blacksquare$}~PFIB. \label{fig:annot_screendr} }
\caption{$\mathcal{L}$ loss values (Eq.~\ref{eq:our_loss}, with $\alpha=0.7$) as a function of $\hat{y}_r$ and $\sigma$, for $y\in \mathcal{G}$. Loss colorbar: $0$~\protect\includegraphics[height=.75em,width=5em]{figures/uncertainty_gradient_bar.pdf}~$2$ . \label{fig:unc_loss_exs}}
\caption{Maps predicted by our method along with the ophthalmologist's annotated lesions in the SCREEN-DR dataset. The curves are the contours of the explanation maps (threshold=0.3), and the red square indicates the region of most relevance for diagnosis (corresponding to the maximum in the network's output activation map). Close-ups of relevant regions are shown, with \cmark~indicating that the region is correctly predicted, and \xmark~indicating otherwise. Below each close-up the name of the corresponding ground truth lesion (if existent) is shown. Explanation map: {\color{black}$\blacksquare$}~R1, {\color{orchid}$\blacksquare$}~R2, {\color{cyan}$\blacksquare$}~R3, {\color{springgreen}$\blacksquare$}~R4; ground truth map: {\color{black}$\blacksquare$}~MA, {\color{fuchsia}$\blacksquare$}~HEM, {\color{gold}$\blacksquare$}~EX, {\color{c1}$\blacksquare$}~CWS, {\color{c2}$\blacksquare$}~IrMA, {\color{c3}$\blacksquare$}~NV, {\color{c4}$\blacksquare$}~PHEM, {\color{c5}$\blacksquare$}~PFIB.\label{fig:maps_screendr}}
\caption{Uncertainty-associated predictions for images from the Kaggle test set. The color of the frame represents the uncertainty, $\hat{y}_g$ is the prediction and $y$ the ground truth grade. Images were cropped around the field-of-view and resized to a square image. Uncertainty colorbar: $0.05$~\protect\includegraphics[height=.75em,width=5em]{figures/uncertainty_gradient_bar.pdf}~$1.406$ \label{fig:unc}}
\caption{LE rescaled by the first LE computed for Case 1 (\mythickline{black}), Case 2 (\mythickline{blue}), Case 3 (\mythickline{red}), Case 4 (\mythickline{green}). The shaded areas denote the statistical errors. Inset: Magnified region near the first exponents for Cases 3 and 4.}
\caption{Comparison of CycleGAN domain adaptation on the SITW eval set. \textcolor{red}{Bold numbers indicate better results.}}
\caption{Integrated depth dose distributions of proton pencil beams simulated in \Fred and obtained experimentally during the facility commissioning (left), an example of transversal 2D dose distribution in the isocentre plane obtained from \Fred MC simulations (middle left), measured with MatriXX detector in water (middle right) and a GI map computed from simulation and measurement using GI (3\%/2\,mm) method (right). GI passing rate is $98.64\%$.}
\caption{Misleading rate and transferability of N-FGSM (NF), P-FGSM (PF), DeepFool (DF), SparseFool (SF), CW, SemanticAdv (SA) and EdgeFool (EF) for ResNet-50~({\protect\tikz \protect\draw[color=R50, fill=R50] plot[mark=*, mark size=0.5mm] (0,0);}), ResNet-18~({\protect\tikz \protect\draw[color=R18, fill=R18] plot[mark=*,mark size=0.5mm] (0,0);}) and AlexNet~({\protect\tikz \protect\draw[color=AN, fill=AN] plot[mark=*, mark size=0.5mm] (0,0);}) on Private-Places365 and ImageNet. EdgeFool is more transferable than other adversarial methods, except SemanticAdv, which however severely distorts the colours (see Figure~\ref{fig:advExampleSOA}).}
\caption{\textbf{Compression scheduling.} Example pseudo-code of an application invoking compression-scheduling from its training-loop. Lines in \textcolor{blue}{blue} show commands related to the scheduler, and illustrate the simplicity of adding scheduled-compression to existing PyTorch applications. Call-backs into the scheduler trigger the invocation of back-end compression algorithms according to the programmed schedule.}
\caption{A typical example of a chance-based recovery in the Alladin sequence from the TLP~\cite{moudgil2017long} dataset. \textcolor{YellowGreen}{SiamRPN} (green) is tracking the incorrect object and has zero overlap with the \textcolor{red}{target} (red) at the start. It switches to tracking the target when they pass through each other. We study such chance-based recoveries in the long-term setting both qualitatively and quantitatively. Best viewed in colour.}
\caption{An example from the TLP~\cite{moudgil2017long} Kinball1 sequence where the tracking \textcolor{red}{target} is a black ball. Both \textcolor{YellowGreen}{SiamRPN} and \textcolor{blue}{ATOM} end up tracking objects of a totally different class, i.e., a human, which is also significantly different in appearance from the given target.}
\caption{Qualitative comparison of the TF representations of different speech enhancement techniques. \newline (\myarrowred) and (\myarrowblue) show the advantages of our proposed model. \label{fig:qual} }
\caption{Test cases (2D grid world layouts) generated for Karel. Covered program branches are \colorbox{Dandelion}{marked}. The generated layout on the right by our model \ours{} covers all statements in the program, while the program exits after the first statement using the layout on the left. }
\caption{(a--d)~Experimental (\myline{\color[rgb]{0.85,0.325,0.098}}) and numerical (\myline{\color[rgb]{0,0.447,0.741}}) amplitude transfer functions in spacings~1--4. (e)~Combined experimental transfer functions for spacings~1--6 (\myline{\color[rgb]{0.85,0.325,0.098}}) with envelope (\protect\tikz \protect\fill[color=gray,opacity=0.5] (0,0) rectangle (8pt,6pt);), and amplification boundary (\mydash{\color{black}}).}
\caption{% \textbf{(A)} Normalized probe concentration as a function of pressure gradient $\Delta p / L$, for two nanochannel lengths and for $c_s=0.1\,$mM. ({\color{red}$\blacktriangle$}): $L=25$\,\si{\micro}m probed in its middle $x=0$; ({\color{gray}$\blacklozenge$}): $L=500$\,\si{\micro}m probed at its entrance $x=-L/2$. Inset: same curves as a function of pressure difference $(1/2 - x/L) \Delta p$, see text for the position-dependent scaling factor. The full lines: prediction from Eq. \eqref{eq:c3bis}. \textbf{(B)} Same as (A), for different salt concentrations and thus Debye layer overlap ratio $2 \lambda_D / h$, ($L=25$\,\si{\micro}m probed in their middle $x=0$). ({\color{greenSimon}$\blacksquare$}): $c_s = 1$\,mM salt concentration; ({\color{violet}$\bullet$}): $c_s=10$\,mM salt concentration. Inset: same data in semi-log scale.}
\caption{% Finite element calculations. % \textbf{(A)} z-averaged potential (up) and dye concentration profiles within the channel (top and middle: $2\lambda_D/h=10$; low: $2\lambda_D/h=0.3$). ({\color{gray}---}): equilibrium profiles under no convection; blue symbols: profiles under convection ($\text{Pe}=3$). ({\color{orange}$--$}): 1D theoretical prediction Eq. (\ref{eq:c3bis}); ({\color{orange}---}): 2D theoretical prediction Eq. (\ref{eq:caveraged2}). % \textbf{(B)} z-averaged dye concentration at the middle location ($x=0$) as a function of the Peclet number for various confinements ($2 \lambda_D / h \approx 0.03$, $0.1$, $0.3$, $1$, $3$, $10$ and $30$). ({\color{orange}$--$}): same as in (A). % \textbf{(C)} Two-dimensional correction $\alpha_{2D}$ as a function of the confinement ratio $2 \lambda_D / h$. ({\color{blue}---}): analytical prediction (see text). % }
\caption{Error rate of the centrality estimation methods against the latent factor dimension $k$. For the three plots on the left, we consider three settings of the sparse influence matrix ${\bm B}$; see the main text for details. The right plot depicts a realization of the graph where `{\color{red}red}'/`{\color{blue}blue}'/`black' nodes are external source/normal nodes/central nodes. As $k$ increases, setting {\sf (b)} models external influences that originate from the periphery.}
\caption{% Probability density function of the volume of Vorono\"i cells for sets of points distributed according to an RPP.% The number of points per snapshot is indicated by the colors as follows: {\color{black}\solidthick}, $N_p=10^7$; {\color{blue}\solidthick}, $N_p=10^6$; {\color{red}\solidthick}, $N_p=10^5$; {\color{cyan}\solidthick}, $N_p=10^4$; {\color{magenta}\solidthick}, $N_p=10^3$; % The total number of samples is kept constant ($N_{tot}=N_p\,N_s=5\cdot10^9$) in the entire series by adjusting the number of snapshots $N_s$ accordingly. % Graph $(a)$ shows double logarithmic scaling; $(b)$ is the same data in semi-logarithmic scaling. The inset shows a zoom of the right tail. % }
\caption{% Probability density function of the volume of Vorono\"i cells for sets of spherical, finite-size particles distributed according to an RPP, shown for the data of figure~\ref{fig-finiteSize1-vorvol-stdev-vs-phis-1}. % The line-styles are as follows: {\color{black}\solidthick},~$\Phi_s=10^{-5}$; {\color{blue}\solidthick},~$\Phi_s=10^{-4}$; {\color{red}\solidthick},~$\Phi_s=10^{-3}$; {\color{yellow}\solidthick},~$\Phi_s=5\cdot10^{-3}$; {\color{green}\solidthick},~$\Phi_s=10^{-2}$; {\color{cyan}\solidthick},~$\Phi_s=2\cdot10^{-2}$; {\color{magenta}\solidthick},~$\Phi_s=5\cdot10^{-2}$. % Graph $(a)$ shows double logarithmic scaling; $(b)$ is the same data in semi-logarithmic scaling. % The solid circles indicate the respective fits to a generalized gamma distribution (\ref{equ-three-param-gamma-pdf}) for the most dilute and the densest cases. % }
\caption{% Probability density function of the volume of Vorono\"i cells for sets of finite-size particles with solid volume fraction $\Phi_s=5\cdot10^{-3}$, distributed according to an RPP. % The relative box-size is indicated by the colors as follows: {\color{black}\solidthick}, $L/D=25$; {\color{red}\solidthick}, $L/D=35$; {\color{blue}\solidthick}, $L/D=150$. % Graph $(a)$ shows double logarithmic scaling; $(b)$ is the same data in semi-logarithmic scaling. %The inset shows a zoom of the right tail. % }
\caption{% The same data as in figure~\ref{fig-finite-size-vorvol-stdev-vs-npart-1}$(b)$, but the ordinate is normalized with the asymptotic value of $\sigma(V_{vor}/\langle V\rangle)$ for large $N_p$. % The line-styles are as follows: {\color{red}\solidthick},~$\Phi_s=0$ (points); {\color{black}\solidthick},~$\Phi_s=5\cdot10^{-3}$; {\color{blue}\solidthick},~$\Phi_s=5\cdot10^{-2}$. % % }
\caption{The SDSS Data Release 8 \redmapper\ cluster catalog used in the SDSS cluster cosmology analysis~\citep{Costanzi19_SDSSClusterCosmology}. In red, we show the 30 richest clusters for which we have X-ray gas masses.}
\caption{X-ray gas mass and richness for each of the 30 richest clusters in the \sdss\ \redmapper\ catalog. The gray band is the 68\% confidence region derived from our best fit model. We have classified each cluster by the features of its X-ray image and show each class with a different color and symbol. In Section~\ref{sec:xray-profiles} we describe the different methods that we use to compute the richness and gas mass for each classification.}
\caption{The 68\% and 95\% confidence contours for the well-constrained parameters of \rmrel\ before (blue) and after (red) introducing our X-ray cluster sample to the \sdss\ cluster cosmology results. We also show our constraints from the joint analysis of \sdss\ clusters + BAO + Chandra (green) and \sdss\ clusters + BAO + Planck + Chandra (gold). The remaining two parameters of the richness--mass relation, $\log M_{\rm min}$ and $\sigma_{\lambda,\intr}$, were not well-constrained and have been omitted from this figure for clarity.}
\caption{The impact of our X-ray cluster sample on the 68\% and 95\% confidence contours of $\sigma_8$ and $\omm$. In both panels, we show the most recent \planck\ 2018 TT+lowE results in blue for comparison. We show our constraints when we adopt our conservative priors in the red and magenta contours. \textit{Left}: The \sdss\ clusters constraints from C19 are shown in gray and the tighter measurement after introducing our X-ray clusters is shown in red. We also show the Weighing the Giants X-ray cluster cosmology constraints~\citep{Mantz15_WtG4} as black dashed contours for comparison. \textit{Right}: We show the \sdss\ clusters + BAO constraints from C19 in gray and the tighter measurement after introducing our X-ray clusters in magenta.}
\caption{\textit{Left}: The impact of our X-ray cluster sample on the 68\% and 95\% confidence contours of $\omm$--$S_8$--$h$. We show the \sdss\ cluster cosmology constraints in gray. In red we show our constraints after including our \chandra\ data with the \sdss\ cluster cosmology results. We also show the constraints from our BAO data (green), \planck\ 2015 (blue), and \planck\ 2018 (black, dashed) for comparison. \textit{Right}: Our joint constraints on $\omm$--$S_8$--$h$ after adding our X-ray cluster sample to the \sdss\ clusters + BAO results (magenta). We also show our joint constraints after adding our X-ray cluster sample to the \sdss\ clusters + BAO + \planck\ 2015 results (gold). We show the \planck\ 2015 constraints (blue) for comparison.}
\caption{\textbf{Motivation and improvements}. (a) The VCR object detections, \ie, red boxes and labels in \colorbox{Violet}{\color{white}blue} are shown. We capture visual attributes by replacing the image classification CNN (used in previous models) with an image+attribute classification CNN. The predictions of this CNN are highlighted in \colorbox{Dandelion!70}{\color{white}orange}. (b) Additionally, many nouns referred to in the VCR text aren't \emph{tagged}, \ie grounded to objects in the image. We utilize the same image CNN as (a) to detect objects and ground them to text. The \emph{new tags} we found augment the VCR tags, and are highlighted with yellow bounding boxes and the associated labels in \colorbox{OliveGreen!70}{\color{white}green}. }
\caption{\textbf{\uline{Top left}:} $g{-}i$\ vs.\ $i$\ CMD of all detected sources in our survey footprint. \textbf{\uline{Top right}:} Color--color diagram of all detected sources. The stellar locus is shown as a red curve. Only sources lying on the stellar locus, within their photometric uncertainties, are selected. \textbf{\uline{Bottom left}:} $g{-}i$\ vs.\ $i$\ CMD of all sources thrown out in our selection process. \textbf{\uline{Bottom right}:} $g{-}i$\ vs.\ $i$\ CMD of all morphologically ($<$0\farcs75) and color-selected (${<}\sigma$+0.2\,mag from SL) stars. The locus of unresolved background galaxies (cyan ellipse) is now easily distinguishable from the RGB selection box (orange). Three stellar isochrone models are shown (age = 12\,Gyr), with metallicities of [Fe/H] = $-$2, $-$1.5, and $-$1.}
\caption{\textbf{\uline{Left}:} Grayscale density map of RGB stars in M81's halo. Existing HST fields from the GHOSTS survey \citep[e.g.,][]{radburn-smith2011,monachesi2013} are overlaid (\textcolor{blue}{ACS}---blue/\textcolor{green}{WFC3}---green). The region defined as M81's `minor axis' in this paper is shown in red. We also show a much narrower region in purple, comparable to the width of the GHOSTS coverage of the minor axis, which we use to compare to the GHOSTS average color profile (see \S\,\ref{sec:color-prof}, Figure \ref{fig:minax-color}). \textbf{\uline{Right}:} Plot showing our calibration of HSC RGB counts using the GHOSTS survey. The \textit{x}-axis gives the density of RGB stars within a given GHOSTS field, corresponding to the \cite{harmsen2017} selection box, while the \textit{y}-axis gives the density of RGB-like sources in the same area from HSC, obtained using our selection criteria (see \ref{sec:stargal}). The best-fit power-law is shown (blue), as well as the confidence region containing 68\% of the points ($\sim$1$\sigma$), obtained from 10,000 bootstrap fits (red shaded). Each field is labeled individually. An inset showing the published GHOSTS field layout on an optical image of the M81 Group is included. Also inset is a stacked CMD of the 13 GHOSTS fields used for this analysis \citep[taken from][]{harmsen2017}, presented in the F606W \& F814W filters.}
\caption{\uline{Top left:} Stacked $g{-}i$\ CMD of stars (black points) in the 13 GHOSTS fields used for calibration, converted from F606W$-$F814W using isochrone models. Our Subaru RGB selection box (Table \ref{tab:select}) is overlaid in orange. The solid red line shows the near-straight path of an adopted `fiducial' isochrone ([M/H]\,${\sim}\,{-}$1.2) through the CMD, with a $g{-}i$\ color of 1.62 (i.e., a line of constant $Q_{\rm Col}$\ = 1.62) at a point 0.5\,mag below the TRGB ($i\,{\sim}\,24.8$). Two additional lines of constant $Q_{\rm Col}$\ are shown (red dashed), showing a $\pm$0.5\,mag change in $Q_{\rm Col}$. \uline{Top right:} Same as left, but for candidate stellar sources observed with Subaru in the 13 fields. \uline{Bottom left}: Stacked $Q_{\rm Col}$\ distributions for detected Subaru RGB candidates (blue) and detected GHOSTS RGB stars (orange) in the 13 GHOSTS fields. The median $Q_{\rm Col}$\ for the Subaru sources is 0.2\,mag bluer than the GHOSTS median. When comparing the CMDs obtained from Subaru and GHOSTS (top), it is clear that this offset results from the Subaru $g{-}i$\ completeness curve. We fail to detect a sub-dominant, but substantial, population of red, higher-metallicity stars present in the halo. \uline{Bottom right}: PARSEC isochrone \citep[e.g.,][]{bressan2012} predictions for the F606W$-$F814W vs.\ $g{-}i$\ color--color relationship for RGB stars, as a function of metallicity (colored curves). Overlaid are the median F606W$-$F814W colors in each of the GHOSTS fields from \cite{monachesi2016a} and corresponding median $g{-}i$\ colors, both obtained using the $Q_{\rm Col}$\ rotated-CMD metric. Blue points denote `halo' fields ($>$10\,kpc from M81). Red points denote fields with higher-metallicity populations, which are closer ($<$\,10\,kpc) to M81's disk. Gray points are fields which are sparse, often with only one or two stellar candidate sources in Subaru.
The halo fields lie on a low-metallicity (e.g., [M/H]\,=\,$-$1.2) model curve (blue dashed), offset bluewards by a constant 0.2\,magnitudes in $g{-}i$. Similarly, the two higher-metallicity fields lie on a high-metallicity (e.g., [M/H]\,=\,0) model curve (red dashed), offset 0.2\,magnitudes in $g{-}i$. Though many of the reddest stars are lacking in our Subaru observations, it appears that the stellar halo populations are stable enough to correct for this effect using the GHOSTS data.}
\caption{Average $g{-}i$\ color profile of resolved RGB stars along M81's minor axis, as described in \S\,\ref{sec:color-prof}. Subaru HSC measurements are again shown in blue, while GHOSTS measurements \citep{monachesi2016a} are shown in gray. Metallicity, calculated from equivalent F606W$-$F814W color \citep{streich2014}, is shown along the righthand $y$-axis. Additionally, we show the [M/H]\,=\,$-$1.2 metallicity measurement (dashed line) of M81's halo estimated from deep \textit{HST} data (reaching the Red Clump; \citealt{durrell2010}). We reproduce the flat outer profile ($R\,{\gtrsim}\,25$\,kpc) observed by \cite{monachesi2016a}, extending the profile to 60\,kpc. We also resolve, for the first time, a distinct break in the color profile at $R\,{\lesssim}\,25$\,kpc, inside which the profile rises steeply, by $\sim$0.3\,mag in color and $\sim$0.6\,dex in metallicity from 10--30\,kpc.}
\caption{Stellar mass density map of the M81 Group. The map has been logarithmically scaled, with each decade in mass color-coded according to the bar on the right. Density was calculated for each $\sim$1\,kpc$^{2}$\ pixel, and converted to stellar mass according to \S\,\ref{sec:HST} and \S\,\ref{sec:resolved-mass}. The interior regions of M81, M82, and NGC 3077, where the data were too crowded to detect individual stars with Subaru (see Figure\,\ref{fig:rgb}), were filled in using calibrated $K_{\rm s}$\ images from the 2MASS Large Galaxy Atlas~\citep{jarrett2003}, which were re-binned to $\sim$1\,kpc physical resolution. The final map was lightly smoothed with a 0.5\,kpc Gaussian kernel and spans four orders of magnitude in mass density. White dashed circles show the estimated tidal radii of M82 and NGC 3077. We count all material outside of these circles as unbound.}
\caption{The stellar halo mass--metallicity relation. Total accreted mass ($M_{\star,Acc}$) is plotted against metallicity measured at 30\,kpc ([Fe/H]$_{\rm 30\,kpc}$). The evolution of M81's stellar halo is shown at three points (large stars): (1) its past accretion history (\textcolor{blue!65!}{\bf blue}), measured from the minor axis (see \S\,\ref{sec:sb-prof} \& \ref{sec:color-prof}), (2) its `current' halo (\textcolor{green}{\bf green}), accounting for unbound tidal debris around M82 and NGC 3077 (see \S\,\ref{sec:tidal}), and (3) its estimated properties following the accretion of M82 and NGC 3077 (\textcolor{red}{\bf red}; see \S\,\ref{sec:halo-formation}). For comparison, nearby galaxies (taken from \citealt{bell2017}) are shown in white; the MW and M31 are labeled separately, to highlight their opposite positions on the relation. The MW's stellar halo mass and metallicity are taken from \cite{mackereth2019} and \cite{conroy2019}, respectively. We adopt 50\% larger error bars than initially reported for each, to reflect the substantial spread from other measurements \citep[e.g.,][]{bell2008,deason2019}. Metallicity-coded channel density maps are shown as zoomed insets for both M81 (e.g., see Figure \ref{fig:channel}) and M31 (PAndAS; \citealt{martin2013}) as visual guides of M81's potential halo evolution. For points (1) and (2) we adopt 50\% uncertainties on total accreted mass and 0.2\,dex uncertainties on metallicity, following \cite{harmsen2017}. For (3), the large error in metallicity indicates our uncertainty about the final metallicity gradient of the halo. In this case, the red star assumes the central metallicities for both M82 and NGC 3077 (mass-weighted), while the error bar shows the impact of assuming a steep halo metallicity gradient such as observed in M31 \citep{gilbert2014}. Dominated by the accreted material from M82, M81's halo will be transformed from a low-mass, metal-poor halo into a massive, metal-rich one, rivaling that of M31.}
\caption{The radial minor axis average surface brightness and average $g{-}i$\ color profiles as shown in Figures \ref{fig:minax-sb} \& \ref{fig:minax-color}. See \S\,\ref{sec:sb-prof} and \S\,\ref{sec:color-prof} for discussion of how the measurements and uncertainties are computed.}
\caption{ The ADE and FDE of our prediction results with and without integrating human personality in different scenes. We obtain better results \textcolor{blue}{for ADE} with the personality factors. \textcolor{blue}{The prediction results for FDE with and without integrating human personality are the same. We don't consider human personality for destination prediction, so personality factors have no effect on it.} }
\caption{\label{fig:3}\small Single-photon manipulation in coherent optical fibre networks stabilized by single photons. a) Single-photon interference in a fully-fiberized MZI~(Fig.~\ref{fig:1}\textcolor{blue}{(b)}). Out-of-phase oscillation of $N_\textrm{c}$ and $N_\textrm{d}$ is in good agreement with~Eqs.~(\ref{eq:BS-p},\ref{eq:BS-N}). b) Single-photon absorption control with a CPA~(Fig.~\ref{fig:1}\textcolor{blue}{(c)}). As each point corresponds to a single measurement, the dispersion is defined by the Poisson distribution. }
\caption{\label{fig:4}\small Dissipative single-photon switching. The raw data: a) The modulation signal applied to the fibre stretcher driving the system between coherent absorption (CAR) and transmission regimes (CTR). b) The coincident photon detection signal between SPD-c \& SPD-h and SPD-d \& SPD-h. ((a) and (b) share the x-axis, of which broken parts correspond to phase stabilization periods.) c) The coincidence counts distribution of coherent absorption (red) and transmission (blue) cycles.}
\caption{\red{E}xample of facial images}
\caption{% A hard disk with multiple data formats. Other storage media have the same space, bandwidth, and locality considerations. {\color{red}File-per-image formats} have highly random behavior. {\color{blue}Record formats} encode many records at various data qualities to save bandwidth and have sequential behavior for a given fidelity. {\color{cyan}PCRs} maintain the sequential behavior of record formats at multiple fidelities without space overheads. }
\caption{Illustration of self translation, intra-domain translation and inter-domain translation. For better comprehension and comparison, we follow the representations in \citet{huang2018munit}, i.e. (a) and (b). To avoid confusion, we change their descriptions. Our model consists of two types of auto-encoders (denoted by \textcolor{red}{red} and \textcolor{blue}{blue} arrows respectively), one for each domain. Similar to \citet{huang2018munit,DRIT}, the latent code of each auto-encoder is composed of a content code $c$ and a style code $s$. The model is trained with adversarial objectives (dotted lines) that ensure that the translated images are indistinguishable from real images in the target domain, as well as bidirectional reconstruction objectives (dashed lines) that reconstruct both images and latent codes.}
\caption{Comparing the prior Gaussian distribution $p(z)$, MMD $q_{\phi}(z)$ and KL $q_{\phi}(z)$. The \textcolor{red}{red dots} represent $(0,0)$. With KL, the distribution $q_{\phi}(z)$ matches the prior Gaussian distribution $p(z)$ poorly, while with MMD it matches significantly better.}
\caption{The illustration of our proposed framework and motivation. In the proposing stage (top row), with relative location information, some object-pairs without any spatial connection or with far relative distance, such as the pair of \textcolor[rgb]{0.22,0.70,0.31}{person} (green box)-\textcolor[rgb]{0.4,0.18,0.63}{shirts} (purple box), can be assigned with a low score and then filtered out effectively. In the predicate recognition stage (bottom row), one location-based GGNN is introduced to model relevances among predicates using relative location based similarity measuring. With this GGNN, some ambiguous and non-exclusive predicates, such as stand-on, can be smoothed and assigned proximity scores corresponding to ground-truth label, and thus the accuracy of top $n$ recall (\textcolor[rgb]{1,0,0}{red} dotted box) can be increased.}
\caption{The illustration of the connection operation among different predicates. To visualize clearly, the \textcolor[rgb]{0,0.35,0.79}{blue} (resp. \textcolor[rgb]{0.96,0.46,0}{orange}) mask denotes the area of \textcolor[rgb]{0,0.35,0.79}{subject} (resp. \textcolor[rgb]{0.96,0.46,0}{object}). When the similarities (measured by MSE) of location anchors are below a threshold, nodes are interconnected in our location-based GGNN.}
\caption{Frequency dependence of polarized atmospheric signal at zenith for the CLASS observing site, both for circular polarization ($|V|$, shown in \colorindicator{tab:blue}{blue}) and linear polarization ($\sqrt{Q^2+U^2}$, shown in \colorindicator{tab:orange}{orange}). The \colorindicator{plotlightgray}{light gray} bands indicate CLASS observing frequencies, with the lowest frequency band corresponding to the Q-band telescope.}
\caption{Simulated azimuth and zenith angle dependence of the atmospheric Stokes $V$ signal at the CLASS observing site for the CLASS Q-band telescope. Azimuth is shown for a full \SI{360}{\degree}, and zenith angle is shown for \SIrange{0}{75}{\degree}. The \colorindicator{xkcd:darkpurple}{dark purple} arrow indicates magnetic north.}
\caption{Example binned azimuth profiles are shown along with sinusoidal best fit lines for three combinations of detector pairs and boresight rotation angles. The data are plotted with arbitrary amplitude offsets; error bars are not shown. The lighter data points indicate the region excluded from the sinusoidal fits due to the ground elevation angle cut. The profile in \colorindicator{tab:blue}{blue} is from a zenith angle of \SI{43.9}{\degree} and a boresight rotation angle of \SI{-45}{\degree}, the profile in \colorindicator{tab:orange}{orange} is from a zenith angle of \SI{46.7}{\degree} and a boresight rotation angle of \SI{0}{\degree}, and the profile in \colorindicator{tab:red}{red} is from a zenith angle of \SI{52.8}{\degree} and a boresight rotation angle of \SI{+45}{\degree}.}
\caption{Left: instantaneous values of the observable $Q(t)$ plotted over time (\mythickline{black}). Ensemble average of $Q$ (\mythickline{red}). Middle: time-history of $Q(t)$ obtained with the importance splitting algorithm using a constant weight factor $C=0.0104$. Right: time-history of $Q$ obtained with the importance splitting algorithm using a constant weight factor $C=0.0208$. Plots correspond to the Lorenz 96 case with 32 degrees of freedom.}
\caption{Left: computational gain expressed in terms of Eq.~\ref{eq:compGain} when targeting the level $1356$ with fixed weight (\mythickbarredcircle{red}{white}) and time-dependent weight (\mythickbarredcircle{blue}{white}); when targeting the level $1737$ with fixed weight (\mythickbarredsquare{red}{white}) and time-dependent weight (\mythickbarredsquare{blue}{white}). The vertical line with \mythickbarredcircle{black}{white} denotes the probability corresponding to the level $1356$, and with \mythickbarredsquare{black}{white} denotes the probability corresponding to the level $1737$. Right: pruning ratio when targeting the level $1356$ with fixed weight (\mythickbarredcircle{red}{white}) and time-dependent weight (\mythickbarredcircle{blue}{white}); when targeting the level $1737$ with fixed weight (\mythickbarredsquare{red}{white}) and time-dependent weight (\mythickbarredsquare{blue}{white}). Plots correspond to the Lorenz 96 case with 32 degrees of freedom.}
\caption{Left: rare mean path $R_{1737}$ obtained from brute force computation (\mythickline{black}), from the self-similarity approximation (\mythickdashedline{black}) and ensemble average time history of the observable (\mythickline{red}). Middle: computational gain computed as Eq.~\ref{eq:compGain} when targeting the level $1737$ with the fixed weight (\mythickbarredsquare{red}{white}), the time-dependent weight obtained from the brute force calculation (\mythickbarredsquare{blue}{white}), and the time-dependent weight obtained from the self-similarity approximation (\mythickbarredsquare{blue}{blue}). Right: pruning ratio obtained when targeting the level $1737$ with fixed weight (\mythickbarredsquare{red}{white}) and time-dependent weight obtained from brute force calculation (\mythickbarredsquare{blue}{white}) and the time-dependent weight obtained from the self-similarity approximation (\mythickbarredsquare{blue}{blue}). Plots correspond to the Lorenz 96 case with 32 degrees of freedom.}
\caption{Left: instantaneous values of the observable $Q(t)$ plotted over time (\mythickline{black}). Ensemble average of $Q$ (\mythickline{red}). Right: complementary of the cumulative density function (CDF) of the QoI. Probabilities are obtained with brute force calculation (\mythickline{black}) plotted along with the theoretical Monte-Carlo uncertainty (\mythickdashedline{black}). Plots correspond to the KSE case.}
\caption{Left: instantaneous values of the observable $Q(t)$ plotted over time (\mythickline{black}). Ensemble average of $Q$ (\mythickline{red}). Right: cumulative density function (CDF) of the QoI. Probabilities are obtained with brute force calculations (\mythickline{black}) plotted along with the theoretical Monte-Carlo uncertainty (\mythickdashedline{black}). Plots correspond to the high-altitude relight case.}
\caption{Instantaneous values of the observable $Q(t)$ plotted over time (\mythickline{black}). Ensemble average of $Q$ (\mythickline{red}). Left: Lorenz 96 case with 64 degrees of freedom. Right: Lorenz 96 case with 1024 degrees of freedom.}
\caption{Left: complementary of the cumulative density function (CDF) of the QoI. Probabilities are obtained with brute force calculation (\mythickline{black}), the ISP method with time-dependent weight based on $R_{1244}$ (\mythickcircle{blue}{white}), and $R_{1512}$ (\mythicksquare{blue}{white}) and $M=2500$ particles. Uncertainty of the estimator computed with the ISP algorithm using $R_{1244}$ for the time-dependent weight is shown (\mythickdashedline{blue}) along with the theoretical Monte-Carlo uncertainty that would be obtained with $M=2500$ realizations (\mythickdashedline{black}). Middle: computational gain computed as Eq.~\ref{eq:compGain} when targeting the level $1512$ with the fixed weight (\mythickbarredsquare{red}{white}), the time-dependent weight obtained from the brute force calculation (\mythickbarredsquare{blue}{white}), and the time-dependent weight obtained from the self-similarity approximation (\mythickbarredsquare{blue}{blue}). Right: pruning ratio obtained when targeting the level $1512$ with fixed weight (\mythickbarredsquare{red}{white}) and time-dependent weight obtained from brute force calculation (\mythickbarredsquare{blue}{white}) and the time-dependent weight obtained from the self-similarity approximation (\mythickbarredsquare{blue}{blue}). Plots correspond to the Lorenz 96 problem with 64 degrees of freedom.}
\caption{Left: complementary of the cumulative density function (CDF) of the QoI. Probabilities are obtained with brute force calculation (\mythickline{black}), the ISP method with time-dependent weight based on $R_{1042}$ (\mythickcircle{blue}{white}), and $R_{1109}$ (\mythicksquare{blue}{white}) and $M=2500$ particles. Uncertainty of the estimator computed with the ISP algorithm using $R_{1042}$ for the time-dependent weight is shown (\mythickdashedline{blue}) along with the theoretical Monte-Carlo uncertainty that would be obtained with $M=2500$ realizations (\mythickdashedline{black}). Middle: computational gain computed as Eq.~\ref{eq:compGain} when targeting the level $1109$ with the fixed weight (\mythickbarredsquare{red}{white}), the time-dependent weight obtained from the brute force calculation (\mythickbarredsquare{blue}{white}), and the time-dependent weight obtained from the self-similarity approximation (\mythickbarredsquare{blue}{blue}). Right: pruning ratio obtained when targeting the level $1109$ with fixed weight (\mythickbarredsquare{red}{white}) and time-dependent weight obtained from brute force calculation (\mythickbarredsquare{blue}{white}) and the time-dependent weight obtained from the self-similarity approximation (\mythickbarredsquare{blue}{blue}). Plots correspond to the Lorenz 96 problem with 1024 degrees of freedom.}
\caption{Left: rare mean path $R_{1512}$ obtained from brute force computation (\mythickline{black}), from the self-similarity approximation (\mythickdashedline{black}) and ensemble average time history of the observable (\mythickline{red}). Middle: computational gain computed as Eq.~\ref{eq:compGain} when targeting the level $1512$ with the fixed weight (\mythickbarredsquare{red}{white}), the time-dependent weight obtained from the brute force calculation (\mythickbarredsquare{blue}{white}), and the time-dependent weight obtained from the self-similarity approximation (\mythickbarredsquare{blue}{blue}). Right: pruning ratio obtained when targeting the level $1512$ with fixed weight (\mythickbarredsquare{red}{white}) and time-dependent weight obtained from brute force calculation (\mythickbarredsquare{blue}{white}) and the time-dependent weight obtained from the self-similarity approximation (\mythickbarredsquare{blue}{blue}). Plots correspond to the Lorenz 96 case with 64 degrees of freedom.}
\caption{Left: rare mean path $R_{1109}$ obtained from brute force computation (\mythickline{black}), from the self-similarity approximation (\mythickdashedline{black}) and ensemble average time history of the observable (\mythickline{red}). Middle: computational gain computed as Eq.~\ref{eq:compGain} when targeting the level $1109$ with the fixed weight (\mythickbarredsquare{red}{white}), the time-dependent weight obtained from the brute force calculation (\mythickbarredsquare{blue}{white}), and the time-dependent weight obtained from the self-similarity approximation (\mythickbarredsquare{blue}{blue}). Right: pruning ratio obtained when targeting the level $1109$ with fixed weight (\mythickbarredsquare{red}{white}) and time-dependent weight obtained from brute force calculation (\mythickbarredsquare{blue}{white}) and the time-dependent weight obtained from the self-similarity approximation (\mythickbarredsquare{blue}{blue}). Plots correspond to the Lorenz 96 case with 1024 degrees of freedom.}
\caption{Comparison of the relative error of the detector response, for different angular discretisations for the 2D void problem. The \textcolor{gaylordpurple}{\Square} is non-standard Haar wavelets with fixed angular refinement between $\mu \in [0, 1]$ and $\omega \in [1.47976, 1.661832]$, the dashed \textcolor{matlabblue}{$\triangle$} is uniform FP$_n$ with $\Sigma_{\textrm{f}}=1$, the dotted \textcolor{foliagegreen}{$\triangle$} is uniform FP$_n$ with $\Sigma_{\textrm{f}}=0.1$, \textcolor{deludedorange}{$\diamond$} uniform LS P$^0$ FEM and the \textcolor{black}{$\otimes$} are goal-based adapted non-standard Haar wavelets with robust error target 1\xten{-3} and with reduced tolerance solves.}
\caption{Comparison of the results for different angular discretisations for the 2D void problem of length 100. The \textcolor{gaylordpurple}{\Square} is non-standard Haar wavelets with fixed angular refinement between $\mu \in [0, 1]$ and $\omega \in [1.561, 1.5807]$, the dashed \textcolor{matlabblue}{$\triangle$} is uniform FP$_n$ with $\Sigma_{\textrm{f}}=1$, the dotted \textcolor{foliagegreen}{$\triangle$} is uniform FP$_n$ with $\Sigma_{\textrm{f}}=0.1$, \textcolor{deludedorange}{$\diamond$} uniform LS P$^0$ FEM and the \textcolor{black}{$\otimes$} are goal-based adapted non-standard Haar wavelets with robust error target 1\xten{-6} and with reduced tolerance solves. The dashed \textcolor{black}{$\otimes$} uses FP$_9$ with $\Sigma_\textrm{f}=0.1$ as a surrogate.}
\caption{Number of wavelets across the spatial domain for the 3D void problem, plotted on the CG mesh on the 2nd step of the goal-based (non-robust) angular adaptivity with error target 1\xten{-2} (the \textcolor{darksalmon}{x} line in \fref{fig:2D_void_100_result}). A cut and isosurface have been made in the visualisation, to show the only region where angular adaptivity has triggered refinement, between the source, the scattering region centred at (9.5, 5, 0.5) and the detector. The camera is pointed in the -z direction.}
\caption{Comparison of the relative error of the detector response, for different angular discretisations for the 3D scatter box problem. The dashed \textcolor{matlabblue}{$\triangle$} is uniform FP$_n$ with $\Sigma_{\textrm{f}}=1$, the dotted \textcolor{foliagegreen}{$\triangle$} is uniform FP$_n$ with $\Sigma_{\textrm{f}}=0.1$, \textcolor{deludedorange}{$\diamond$} uniform LS P$^0$ FEM, the \textcolor{gaylordpurple}{\Square} is non-standard Haar wavelets with fixed angular refinement between $\mu \in [-1, 1]$ and $\omega \in [0, 3.15]$, the \textcolor{darksalmon}{x} are goal-based adapted non-standard Haar wavelets with error target 1\xten{-2} and the \textcolor{black}{$\otimes$} are goal-based adapted non-standard Haar wavelets with robust error target 1\xten{-3} and with reduced tolerance solves.}
\caption{dVoxResNet model optimization: model performance depending on the architecture, studied on schizophrenia and bipolar disorder classification \textit{Validated on 3-fold CV, ROC/AUC}. d-convolutions are inserted in blocks according to \textit{idx}: Conv3D blocks [1:6], VoxRes blocks [1:4], where colour denotes \color[HTML]{009901} \textbf{statistically significant performance increase}, \color[HTML]{ee7777} \textbf{statistically significant performance decrease} }
\caption{Example image from Flickr30K Entities annotated with bounding boxes corresponding to entities in the caption ``\textcolor{red}{A man} wearing \textcolor{BurntOrange}{a tan coat} signs \textcolor{GreenYellow}{papers} for \textcolor{LimeGreen}{another man} wearing \textcolor{Cyan}{a blue coat}.''}
\caption{Example reports and system predictions from the Stanford and RIH test splits. Human reference, PG baseline output and \rlrc{} output are shown for each example. Factual accuracy scores ($s$) are also shown for the model outputs. For the Stanford example, \dblue{\ul{clinical observations}} in the summaries are marked for clarity; for RIH, \dorange{\uwave{a wrongly copied observation}} and \dorange{\uwave{its occurrence in the findings}} are marked.}
\caption{An example radiology report and summaries with their ROUGE-L scores. Compared to the human-written summary, Summary A has high textual overlap (i.e., ROUGE-L) but makes \dorange{a factual error}; Summary B has a lower ROUGE-L score but is \dblue{factually correct}.}
\caption{Illustration of our simulation. While the lens (thick grey line) passes the background star\,1 (dashed blue line), the observed position of the background star is slightly shifted due to microlensing (solid blue line). The \Gaia{} measurements are indicated as black boxes, where the precision in along-scan direction is much better than the precision in across-scan direction. The red arrows indicate the along-scan separation including microlensing, and the yellow dashed arrows show the along-scan separation without microlensing. The difference between the two shows the astrometric microlensing signal. Due to the different scanning directions, an observation close to the maximal deflection of the microlensing event does not necessarily have the largest signal. A further background star\,2 (green) can improve the result.}
\caption{ Violin plot\textcolor{blue}{\protect\footnotemark[4]} of the achievable precision for the four different methods for Proxima Centauri (top) and LAWD 37 (bottom). For each method the 16th, 50th and 84th percentiles are shown. The shape shows the distribution of the 100 determined precisions smoothed with a Gaussian kernel. The green ``violins'' use all of the background sources. For the blue ``violins'' only background sources with a 5-parameter solution are used, and for the orange ``violins'' only stars with a precision in along-scan direction better than 0.5 mas and a 5-parameter solution are used. The red ``violins'' show the best results when only one source is used. The dashed line indicates the median of this distribution. For each method the number of used stars is listed below the ``violin''. The green ``violin'' of LAWD 37 is missing because there are no additional background stars with only a 2-parameter solution; it would therefore be identical to the blue one. (For the other events with multiple background stars see Fig.\,\ref{figure:diff_ideas_all} in the appendix.)}
\caption{ Violin plot of the achievable precision for the four different methods for: (a) \Gaia{} DR2: 5312099874809857024, (b)\,Ross~733, (c)\,61~Cyg~B, (d)\,61~Cyg~A, (e)\,L~143-23, (f)\,Innes'~star, (g)\,Stein~2051~B, (h)\,GJ~674 and (i)\,Barnard's~star. For each method the 16th, 50th and 84th percentiles are shown. The shape shows the distribution of the 100 determined precisions smoothed with a Gaussian kernel. In each plot, the green ``violin'' uses all of the background sources. For the blue ``violin'' only background sources with a 5-parameter solution are used, and for the orange ``violin'' only stars with a precision in along-scan direction better than \(0.5\,\mathrm{mas}\) and a 5-parameter solution are used. The red ``violin'' indicates the best results when only one source is used. The dashed line indicates the median of this distribution. For each method the number of used stars is listed below the ``violin''. Missing green ``violins'' (e.g. L~143-23\,(e)) occur when there are no additional background stars with only a 2-parameter solution. Missing blue ``violins'' (e.g. Ross~733\,(b)) are due to the fact that all background sources with a 5-parameter solution have an expected precision in along-scan direction better than \(0.5\,\mathrm{mas}\). For Stein~2051~B\,(g) none of the background stars have an expected precision better than \(\sigma_{AL} = 0.5\,\mathrm{mas}\), hence the orange ``violin'' is missing. Finally, the first analysis of GJ~674\,(h) and Barnard's~star\,(i) using only one background source results in a precision worse than \(100\%\); consequently the red ``violins'' are missing. }
\caption{\small Physics-guided Architecture (PGA) paradigm of neural networks aims to infuse physics in neural network designs through \emph{physics-informed connections} among neurons and through \emph{physical intermediate variables}, shown in {\color{red} red}. Note: Figures in this paper are best viewed in color.}
\caption{Monotonicity-preserving LSTM Architecture. Components in {\color{red} red} represent novel physics-informed innovations in LSTM.}
\caption{Case Study (Best viewed in color). \textbf{Left:} Attention flow in commonsense concept graph, where \textcolor[rgb]{0.5,0.5,0}{zero-hop concepts}, \textcolor[rgb]{0,0,0.7}{one-hop concepts} and \textcolor[rgb]{0.5,0,0.7}{two-hop concepts} are highlighted. \textbf{Right:} Attention scores over all concepts. Darker green indicates higher attention scores. }
\caption{Energies $\varepsilon_n$ of linear modes versus mode index $n$ in \color{red}{$\mathcal{H}_2$ (a), $\mathcal{H}_3$ (b), $\mathcal{H}_5$ (c), and $\mathcal{H}_7$ (d) arrays (shown in the insets). Dashed lines show borders of the topological gap in the bulk array. Red dots indicate edge modes that are excited most efficiently when pump is present. Energies of these modes are indicated too.} Here and in all figures below $\beta=0.3$ and $\Omega=0.5$.}
\caption{Peak amplitude of the dominating $\psi_-$ component versus detuning $\varepsilon$ for $\mathcal{H}_3$ (a) and $\mathcal{H}_7$ (e) arrays. {\color{red}In (a) pump amplitudes $h_{\pm}=0.002$ (bottom curve), $0.004$ (middle curve), and $0.008$ (top curve) are used, in (e) $h_{\pm}=0.016$. $|\psi_-|$ distributions in (b)-(d) and (f)-(h) correspond to red dots. Namely, the modes shown are obtained at $\varepsilon=-3.50$ (point b), $\varepsilon=-3.35$ (points c and d), $\varepsilon=-3.36$ (point f), $\varepsilon=-3.45$ (point g), and $\varepsilon=-3.33$ (point h).} Stable branches are shown black, unstable ones are shown red. Dashed red lines indicate energy of linear modes that are most efficiently excited.}
\caption{Net baryon number density $Y_{\Delta B}$, DM density $Y_{\rm DM}$ and $n_{\rm DM}^{\rm eq} \langle \sigma v \rangle_\delta/H$ as functions of temperature $T$, for the three BPs in Table~\blue{S1}. The solid horizontal black line indicates the observed baryon number density $Y_{\Delta B}^{\rm obs}=(8.718\pm 0.004)\times 10^{-11}$, and the dashed horizontal black line indicates the observed DM relic density $\Omega_{\rm DM}^{\rm obs}h^2=0.120\pm 0.001$~\cite{Aghanim:2018eyx}. The vertical solid line represents the sphaleron freeze-out temperature $T_{\rm sph} = (131.7\pm 2.3)$ GeV.}
\caption{Viable parameter space of $v_\Delta$ and $\mu_{\eta \Delta}$ which can generate the observed baryon asymmetry. All other parameters are set to be the same as in BP1 in Table~\blue{S1}.}
\caption{Block diagram of our environment modeling framework depicting the sensor suite and major software components (images courtesy of Intel\textregistered, Pix4D S.A and Ximea).}
\caption{Top: System diagram of our IPP framework. A field map is built using measurements extracted from a sensor. During a mission, the map state is used to plan informative trajectories for data collection. These are then executed by the UAV, allowing for map updates in a closed-loop manner. Middle: Example comparison of our CMA-ES-based approach to ``lawnmower'' coverage (left and right, respectively) for continuous variable mapping in $200$\,s missions. The colored lines and spheres represent the traveled trajectories and measurement sites. Ground truth maps are rendered. Bottom-left: Comparison of the final map uncertainties (measured by the GP covariance matrix trace) for various path budgets. Ten CMA-ES trials were run for each budget. Bottom-right: Comparison of times taken to achieve the same final map uncertainty, given a fixed CMA-ES budget. By allowing for altitude variations, we trade off between FoV and sensor noise to quickly obtain high-confidence maps with finer end quality. }
\caption{Per-dimension latent traversals for a pair of datapoints indicating dimensions that affect only \textcolor{col3}{SVHN}, only \textcolor{col1}{MNIST}, and both \textcolor{col2}{MNIST \& SVHN}. Note the extent to which particular dimensions affect only a single modality, whereas other dimensions affect both, indicating a degree of latent factorisation. }
\caption{FR-IQA results. The average PSNR/SSIM values on benchmarks. \textcolor{red}{Red} color indicates the best results, and the \textcolor{blue}{blue} indicates the second best.}
\caption{\label{fig:model} Model architecture of the proposed Hierarchical Graph Network. The constructed graph corresponds to the example in Figure~\ref{fig:example_question}. {\color{applegreen}Green}, {\color{blue}blue}, {\color{amber}orange}, and {\color{brown}brown} colors represent paragraph, sentence, entity, and question nodes, respectively. Some entities and hyperlinks are omitted for illustration simplicity.}
\caption{\label{fig:style_corr} Cross-style correlation. The degree of correlation gradually increases from \textcolor{Red}{Red} (i.e., negative), \textcolor{Yellow}{Yellow}, to \textcolor{Blue}{Blue} (i.e., positive), where color intensity is proportional to the correlation coefficients. Only correlations with $p < 0.05$ (confidence interval: 0.95) are considered statistically significant; all others are crossed out. Age ranges start with X in the personal styles. \textbf{\textsc{IMPORTANT}: before you interpret anything from these matrices, please be VERY CAREFUL not to make any unethical or misleading claims based on these simple measures.} Please read the potential weaknesses of our experiment below. Best viewed in color. }
\caption{ Cumulative histograms of known KBOs (gray) and Centaurs/SDOs ({\textcolor{green}{green}}) are plotted as functions of $I_C$ (at pericenter), along with dashed lines indicating the current known numbers with $I_C<22.0$. Our predicted TESS yields for $I_C<22.0$ are plotted as solid horizontal lines, indicating significant numbers of expected discoveries. }
\caption{ Examples from \webq\ (top 2) and \nq\ (bottom 3) where predictions from \parallelqa\ and \greader\ are denoted by \redtext{red} and \bluetext{blue} text, respectively. For both models, \gretriever\ was used. Constructed graphs are reported; triples with \none\ relation are omitted. The \textbf{[Bold]} text of each passage denotes the title of the Wikipedia article from which the passage originates. }
\caption{An example from \nq. A graph of passages is constructed based on Wikipedia and Wikidata, where the edges are either cross-document or inner-document relations. While the model which reads each passage in parallel outputs the wrong answer (\redtext{red}), the model which synthesizes the context over passages predicts the correct answer (\bluetext{blue}). }
\caption{Kalman Filter Prediction for acceleration data in (a)~x direction; (b)~y direction; and (c)~z direction. \redo{The graph compares the acceleration measurements with the Kalman filter predictions.} }
\caption{Kalman Filter Prediction for gyroscope data readings in (a)~x direction; (b)~y direction; and (c)~z direction. \redo{The graph compares the gyroscope readings with the Kalman filter predictions.} }
\caption{\redo{ A square trajectory obtained when the underwater vehicle is operated in autonomous and semi-autonomous modes in the Raritan River, NJ. }}
\caption{Selected images from robustness benchmarks ImageNet-A, C and P. Test images from ImageNet-C underwent artificial transformations (also known as common corruptions) that cannot be found in the ImageNet training set. Test images on ImageNet-P underwent different scales of perturbations. On ImageNet-A, C, EfficientNet with Noisy Student produces correct top-1 predictions (shown in \textbf{bold black} text) and EfficientNet without Noisy Student produces incorrect top-1 predictions (shown in \textcolor{red}{red} text). On ImageNet-P, EfficientNet without Noisy Student flips predictions frequently.}
\caption{\small Average accuracy (in \%; measured over 600/10,000 rounds$^\star$) of one-shot and five-shot classifiers for five-way classification on \emph{mini}ImageNet; higher is better. The best result of each network architecture of each column is in \textbf{bold} font. Results of our approaches are in {\color{blue} blue}. Best viewed in color.}
\caption{\small Average accuracy (in \%; measured over 600/10,000 rounds$^\star$) of one-shot and five-shot classifiers for five-way classification on \emph{tiered}ImageNet; higher is better. The best result of each network architecture of each column is in \textbf{bold} font. Results of our approach are in {\color{blue} blue}. Best viewed in color.}
\caption{\small Average accuracy (in \%; measured over 600/10,000 rounds$^\star$) of one-shot and five-shot classifiers for five-way classification on CIFAR-100; higher is better. The best result is in \textbf{bold} font. Results of our approach are in {\color{blue} blue}. Best viewed in color.}
\caption{An example entry of the slang words in \emph{UrbanDictionary}. The correct spelling variant pair is ({\color[HTML]{009901}\textbf{m8}},{\color[HTML]{329A9D}\textbf{mate}}), where the words in this pair share the same lemma but have two variant spellings.}
\caption{The contribution of selected features in the CRF model, where `dep\_', `pos\_', `tag\_' represent the dependency tag, shallow POS tag and detailed POS tag, respectively. Green and red denote positive and negative contributions, respectively. The color saturation denotes the degree of the corresponding effect.}
\caption{Parameter sharing mechanisms. \includegraphics[scale=0.25]{3353_shared_neuron.pdf} and \textcolor[rgb]{0.68,0.35,0.13}{\textbf{---}} represent shared neurons and weights respectively. \includegraphics[scale=0.25]{3353_taskA_neuron.pdf}, \includegraphics[scale=0.25]{3353_taskB_neuron.pdf}, \includegraphics[scale=0.25]{3353_taskC_neuron.pdf} represent task-specific neurons. \textcolor[rgb]{0.36,0.61,0.84}{\textbf{---}}, \textcolor[rgb]{0.44,0.68,0.28}{\textbf{---}}, \textcolor[rgb]{0.67,0.21,0.86}{\textbf{---}} denote task-specific weights for three different tasks.}
\caption{Time-snapshots of a single inchworm-like period of our fluid-driven soft robot with two phase-shifted periodic inputs ($T$ is period time, \textcolor{cyan}{dashed cyan lines} denote the equivalent three-link model)}
\caption{Simulation solutions for the three-link robot's configuration -- Prescribed joint angles (solid curves) vs. prescribed torques with realistic stiffness (dashed curves) and low stiffness (dotted curves). (\subref{fig:q_qr}) Angles \textcolor{myblue}{$\varphi_1$ (blue)} and \textcolor{mypurple}{$\varphi_2$ (purple)}. (\subref{fig:x}) Position of \textcolor{myblue}{left $x_1$ (blue)} and \textcolor{mypurple}{right $x_2$ (purple)} contacts and snapshots of the robot (\textcolor{gray}{gray})}
\caption{Friction force limits (\textcolor{myblue}{$\pm \mu f_{n,1}$ in dashed blue} and \textcolor{mypurple}{$\pm \mu f_{n,2}$ in dashed purple}) and tangential force $f_t$ (in solid black). (\subref{fig:forces_kin}) Prescribed joint angles simulation. (\subref{fig:forces_hybrid}) Hybrid-quasistatic simulation}
\caption{Distance per step vs. phase difference -- \textcolor{myblue}{$\mu_1=\mu_2$ (solid blue curve)}, \colorbox{myblue!30}{$\mu_1/\mu_2\rightarrow1.1$ (blue area)} and average (dashed black)}
\caption{Specific cost of transport vs. torque amplitude -- \textcolor{myblue}{Large stiffness (solid blue curve)}, realistic stiffness (dashed black curve) and \textcolor{mypurple}{low stiffness (dotted purple curve)}}
\caption{Experiment results (black) with error-bars (\textcolor{red}{red}) and \textcolor{myblue}{simulation results (blue)}. (\subref{fig:exp_p0}) Normalized distance traveled per step vs. the nominal pressure. (\subref{fig:exp_psi}) Normalized distance traveled per step vs. the phase difference between inputs.}
\caption{Stiffness calibration experiment -- Measured data in `x' and fitted linear curve in \textcolor{myblue}{solid blue}}
\caption{Pressure calibration experiment -- \textcolor{myblue}{Left segment in blue} and \textcolor{mypurple}{right segment in purple}. Measured data in `x', fitted linear curve in dotted light curve and quadratic curve in solid curve}
\caption{Distance per step vs. phase difference -- \textcolor{myblue}{Measured friction $\mu_m=0.389$ (solid blue curve)} and multiplications of it}
\caption{\color{CadetBlue4!80!black} \sffamily \fontsize{9}{11}\selectfont #1 }
\caption{\color{darkgray} \sffamily \fontsize{9}{11}\selectfont #1 }
\caption{ The first action moved the cracker box into the storage and the second action moved the bowl onto the stove. However, \pred{in\_storage} is not necessarily intentional. One can also explain that Action 1 is moving the cracker box out of the way for Action 2. This is even more obvious if we compare Action 1 with Action 1$^*$ in the last row. }
\caption{ \small Quantitative comparisons with different methods on 5 datasets with MAE (smaller is better), max/mean F-measure score (larger is better) and S-measure (larger is better). The best three results are shown in {\color{red}red}, {\color{blue}blue} and {\color{green}green}. The results of our method with $T=2$ based on both ResNet101~\cite{he2016deep} and VGG16~\cite{simonyan2014very} are reported.}
\caption{(Colour online) Dependence of the absolute value of matrix elements of $\sigma^z$ on the average energy of the states at fixed $\omega$. The blue points correspond to a running average over a small energy window and the red line is $e^{-S(E)/2}$. The bottom of the figure shows the running average divided by $e^{-S(E)/2}$, which gives the $E$ dependence of $|f(E,\omega)|$. In both cases, the result is consistent with that function not depending on $E$. The error bars correspond to a 95\% confidence interval if the underlying distribution is normal.}
\caption{\color{\commentsColor} Transfer response (\textbf{a}, \textbf{b}) and structure resistances (\textbf{c}, \textbf{d}) as a function of the gate bias. These~results are obtained reducing the length of either the source (\textbf{a}, \textbf{c}, solid lines) or drain access region (\textbf{b}, \textbf{d}, dashed lines) down to 17.5 nm, and increasing the length of either the source (\textbf{a}, \textbf{c}, solid~lines) or the drain access region (\textbf{b}, \textbf{d}, dashed lines) up to 70 nm.}
\caption{Quantitative comparison of different methods on three real-world benchmark datasets. The best results are in \textbf{bold} and \textcolor{orange}{orange} color, and the second best results are \underline{underlined} and in \textcolor{blue}{blue} color. `Average' is obtained by averaging the metric scores of all images from all the above real-world datasets.}
\caption{\label{fig:gating} \textbf{External electric fields change the conductivity by less than 3\%.} a) Resistance per square ($\mathrm{R_{\mdlgwhtsquare}}$) vs.\ temperature ($T$) for the three devices identified by \textcolor{CB1}{$\mdlgblkdiamond$}, \textcolor{CB2}{$\bigstar$}, \textcolor{CB2}{$\mdblksquare$} in \tref{tab2}. The devices show either a full (\textcolor{CB1}{$\mdlgblkdiamond$}) or partial (\textcolor{CB2}{$\bigstar$}, \textcolor{CB2}{$\mdblksquare$}) superconducting transition when cooled below 7~K. b) Resistance per square ($\mathrm{R_{\mdlgwhtsquare}}$) vs.\ external electric field ($E$) of the same devices as in (a), identified by \textcolor{CB1}{$\mdlgblkdiamond$}, \textcolor{CB2}{$\bigstar$}, \textcolor{CB2}{$\mdblksquare$} in \tref{tab2}. These measurements were performed at the superconducting transition: 6.6~K (\textcolor{CB2}{$\mdblksquare$}) and 6.75~K (\textcolor{CB1}{$\mdlgblkdiamond$}, \textcolor{CB2}{$\bigstar$}). Less than a 3\% change is observed in the resistance of the device until after dielectric breakdown, where additional current enters the device channel through the top gate and the model of four-probe resistance measurement used to determine the resistance no longer applies to the system. c) Measured proportional change in device resistance ($\Delta\mathrm{R}/\mathrm{R}$) when an externally applied electric field is changed from 0 to 1~MV/cm vs.\ the normal resistance per square ($\mathrm{R_{N,\mdlgwhtsquare}}$) of the device. A unique symbol represents each device. That symbol, annealing method, oxide thickness, and top gate area are summarized in \tref{tab2}. We use black symbols to identify non-superconducting samples, orange symbols to identify samples with a partial superconducting transition, and blue symbols to identify samples with a complete superconducting transition. 
The precise geometries of the lithographically defined Hall bars are all qualitatively the same as the device described in \fref{fig:methods}(e) and the main text. The value of the voltage at which the external electric field is 1~MV/cm is calculated by multiplying the electric field by the oxide thickness between the current channel and the top gate. For superconducting devices, where we want to decrease the number of holes, a positive voltage is used. For normal devices, where we want to increase the number of holes, a negative voltage is used. The plotted percentage change in resistance is calculated by dividing the difference in resistance at an external field of 1~MV/cm and no external field by the average resistance measured between those two voltage values. The maximum change in resistance observed across the devices measured was less than 3\%. \Tref{tab:all} tabulates the experimental conditions, the maximum applied electric field, resistance data for each sample in this figure and the temperature at which gating electric fields are applied to each sample. }
\caption{\label{fig:variation} Correlation between resistance and critical current for different nominally identical devices, for the devices corresponding to results from Fig. 3 in the main text. The electrical transport properties of nominally identical devices can vary by more than 5 orders of magnitude, while their critical current varies by an order of magnitude. The devices reported in this figure were all annealed at $550^{\circ}\mathrm{C}$ for 30~s. (a)-(c) Voltage vs.\ current measurements of 3 nominally identical devices, measured in liquid helium. (d)-(f) The same data as in (a)-(c) plotted on a log-log scale. For each dataset the critical current, $\mathrm{I_{C}}$, is marked. Values where a negative voltage, below the noise floor, was measured are colored red. The resistance at $T=4.2$~K, indicated by the solid black line, is determined as the coefficient of a linear fit to the data below $\mathrm{I_{C}}/2$. For (f) (the sample with the lowest resistance at $T=4.2$~K) the resistance at $T=4.2$~K is less than $R=V/I (I_{C})=0.35~\mathrm{m\Omega}$. (g) The resistance at $T=4.2$~K vs.\ $\mathrm{I_{C}}$, acquired by repeating the above procedure for 21 nominally identical devices, measured in liquid helium. The results for (d), (e) \& (f) are identified by a red diamond, a green circle and a yellow square, respectively. The resistance at $T=4.2$~K of these nominally identical devices varies by more than 5 orders of magnitude, while the critical current varies by an order of magnitude. This variation is seen even in devices fabricated on the same $5\times5~\mathrm{mm}^2$ die. }
\caption{Constructing \BBWT{} of $T = \texttt{\RunningExample{}}$. The Lyndon factors are colored in {\color{solarizedYellow}dark yellow}. Reading the characters of the penultimate column top-down yields \BBWT{}. The last column shows in its $i$-th row the starting position of the $i$-th smallest conjugate of a Lyndon factor in the text. It is the circular suffix array studied later in \cref{secLinTime}. Note that $\texttt{cbb} \LexOrder \texttt{cbbcada}$, but $\texttt{cbbcada} \OmegaOrder \texttt{cbb}$. }
\caption{(a) Mean streamwise velocity profile of the minimal channel (\solidline) compared to the mean velocity profile of the channel flow for the domain size of $12\pi\delta \times 2\delta\times 4\pi\delta$ ({\color{red}\dashline}) at $\Rey_\tau\approx 186$ from \citet{delAlamo2003}. (b) Spectral energy content, $\hat{E}/u_\tau^2$, at $y^+\approx 15$. \label{fig:Umean}}
\caption{B/W Denoising performance with known noise level: Best PSNR is marked in \textcolor{red}{red}, and best PSNR within the low-weight category is marked in \textcolor{blue}{blue}.}
\caption{Color image denoising performance: Best PSNR is marked in \textcolor{red}{red}, and best PSNR within the low-weight category is marked in \textcolor{blue}{blue}.}
\caption{ An illustration of the (re)construction of the shape and size of a void formed by an assembly of void-surface atoms in two dimensions. The distributions of the atoms before and after annealing are shown in yellow (\textcolor{yellow1}{\large $\bullet$}) and red (\textcolor{red}{\large $\bullet$}) colors, respectively. The convex polygon (red line) approximates the void region in two dimensions after annealing. }
\caption{ The X-ray scattering intensity for {\asi} from experiments and classical molecular-dynamics simulations. The experimental data~\cite{Laaziri1999} correspond to the results from as-implanted (\textcolor{green1}{$\Diamondblack$}) and annealed (\textcolor{red}{\small $\blacksquare$}) samples, while the simulated values (\textcolor{blue}{$\bullet$}) of the intensity correspond to the modified Stillinger-Weber model of {\asi}. }
\caption{ (a) The radius of gyration of the voids from the convex-hull approximation at 300 K. The results for the CMD-R (\textcolor{blue}{$\bullet$}) and AIMD-R (\textcolor{green1}{\small $\blacksquare$}) models are presented here. (b) Estimated void volumes in the convex approximation. (c) Guinier plots for the CMD-R and AIMD-R models at 300 K. The Guinier radii from the plots correspond to a value of 6.38 {\AA} (CMD-R) and 8.07 {\AA} (AIMD-R). }
\caption{ (a) The convex-hull radius for the voids from classical CMD-R (\textcolor{red}{$\bullet$}) and AIMD-R ({\textcolor{maroon}{\small $\blacksquare$}}) simulations at 800 K, followed by geometry relaxation. (b) The corresponding volume of voids from the convex-hull approximation. (c) Guinier plots for the models at 800 K. The Guinier radii from the plots correspond to a value of about 6.57 {\AA} (CMD-R) and 9.48 {\AA} (AIMD-R). }
\caption{ (a) The radius of gyration of hydrogen-rich voids from AIMD simulations at 300 K (\textcolor{green1}{$\bullet$}) and 800 K (\textcolor{red}{\small $\blacksquare$}) in the convex-hull approximation. (b) Estimated convex-hull volumes at 300 K and 800 K. (c) Guinier fits for the SAXS intensities in AIMD-R models of {\asih} at 300 K and 800 K. }
\caption{ (a) The time evolution of the mean-square displacements (MSD) of H inside two voids, V12 (\textcolor{red}{$\circ$}) and V7 (\textcolor{green1}{$\Box$}), at 800 K. The difference between the two sets of MSD values, from 1 ps to 8 ps, can be attributed to the degree of void-surface restructuring, as discussed in sec. IIIC. (b) The compact or smooth structure of the V12 surface, obtained from an isosurface representation of the void, compared to a relatively diffused or scattered distribution of Si atoms in (c) V7 at 800 K. }
\caption{Classification Accuracy Comparison from Source \textbf{ImageNet} (Numbers in \textcolor{red}{RED} refer to the evidence of possible negative transfer)}
\caption{Classification Accuracy Comparison from Source \textbf{Places 365} (Numbers in \textcolor{red}{RED} refer to the evidence of possible negative transfer)}
\caption{Classification Accuracy Comparison from Source \textbf{Stanford Dogs 120} (Numbers in \textcolor{red}{RED} refer to the evidence of possible negative transfer)}
\caption{Flows of Descent Directions on the Empirical Loss. \textcolor{black}{\bf Black Line}: the flow via descent direction of empirical loss (i.e., gradient of empirical loss) from pre-trained weights as the starting point, where the descent direction quickly leads the learning procedure to converge to an over-fitting local minimum; \textcolor{blue}{\bf Blue Line}: the flow via the descent direction linearly combining gradients of empirical loss and $L^2$-SP regularizer~\cite{li2018explicit} from pre-trained weights as the starting point, where the regularization hurts the minimization of empirical loss; \textcolor{red}{\bf Red Line}: the flow via the descent direction balancing the gradients of empirical loss and $L^2$-SP regularizer, where the descent direction leads to a flat area with low empirical loss (i.e., potential for good generalization).}
\caption{\label{fig:decayScheme} Decay scheme of \nuc{Ta}{180}$^{\rm m}$ with data from \cite{NuclData} illustrating the different decay modes: (1a,2a) $\beta^-$ and EC, (2a,2b) $\gamma$-decay and IC, (3a,3b) DM induced decay to $2^+$ and $1^+$ states. The investigated decay modes and signal \gray\ are highlighted in red.}
\caption{\label{tab:datasets}Overview of the datasets used in the analysis. Columns from left to right denote the measurement name, the HPGe setup used, the measurement time, the resolution in FWHM at 103.5~keV, and the detection efficiency for the 103.5~keV \gray.}
\caption{\label{fig:pdf_ROI_Composition_Ta180m_bm} Region of interest in each dataset for the 103.5~keV peak search in the $\beta^-$ channel. The best fit is shown in blue and the best fit with the signal peak set to the 90\% C.I.\ half-life limit is shown in red. The arrows indicate the 93.3~keV peak of the EC channel (not used in fit) as well as the named background \glines. }
\caption[Effects of masking differences on measured luminosities.]{We test the uncertainty on the measured luminosity due to differences in masking at the cluster core, using \sparcsonezero\ as an example. A) Inner 30~kpc of \sparcsonezero\ in \red, unmasked. B) In the more severe masking regime we mask all pixels that are compact and elevated above any diffuse structure. Thus, for \sparcsonezero's `beads on a string' morphology \citep{Webb2015} all bright knots (beads) are masked. C) In the less restrictive masking regime we leave pixels unmasked if they are clearly associated with a distinct tidal feature or are irregular in shape and deeply embedded in the bright BCG envelope. }
\caption[\red\ surface brightness profiles out to 100~kpc for the high redshift clusters.]{\red\ surface brightness profiles out to 100~kpc for the high redshift ($z>1$) clusters. Solid lines are observed profiles and dashed lines represent surface brightness profiles that have e+k and cosmological dimming corrections applied. The e+k corrections are derived from a \citetalias{BC03} SSP model with Chabrier IMF, $z_f=3$, and solar metallicity. Surface brightness profiles are terminated when the uncertainty reaches $>0.3$ \sbu. }
\caption[Stellar mass of the \bcgicl\ within $r<100$~kpc.]{\bcgicl\ stellar mass at $r<100$~kpc as a function of \mfive\ for the complete sample. Black stars are systems from \citetalias{Gonzalez2013a} with $z<0.15$, blue triangles are from \citetalias{DeMaio2018} (intermediate-redshift sample), and red diamonds represent the high-redshift cluster sample. The solid blue line represents the best-fit relation to the \bcgicl\ content within 100~kpc of the intermediate redshift sample with \logMstell$=(0.37\pm0.05)$\logmfivenorm $+ (11.92\pm0.02)$, with the shaded region corresponding to the 1$\sigma$ confidence region for the relation. The low-redshift sample of \citetalias{Gonzalez2013a} falls near this relation while the high-redshift sample on average exhibits lower \bcgicl\ stellar masses at \mfive$>10^{14} M_\odot$. For comparison, we also show the $z=0$ relation derived for the IllustrisTNG simulations by \citet{pillepich2018}, which is slightly steeper than the observed relation but has a similar normalization. All stellar masses are derived at the cluster redshifts assuming passive evolution, with $z_f=3$ and solar metallicity, as described in the text. }
\caption[ Stellar mass of the ICL within $10<r<100$~kpc.]{Stellar mass within $10<r<100$~kpc as a function of \mfive. Markers and colors are as in Figure \ref{fig:sm_m500_100}. The solid blue line is the best fit to the intermediate redshift sample, also as in Figure \ref{fig:sm_m500_100}. Excluding the stellar mass in the central 10~kpc produces a steeper relationship with \mfive, \logmstell$=(0.48\pm0.06)$\logmfive$+(11.79\pm0.02)$, indicating that the central BCG represents a larger fraction of the total \bcgicl\ light inside 100~kpc in groups relative to clusters. The high redshift clusters are consistent with following the same functional relation as at lower redshifts, but shifted to lower \Mstell\ by a factor of $2.08\pm0.21$. The best-fit relation for the high-redshift sample is shown in dashed red, assuming the same slope as for the intermediate redshift sample. }
\caption[Fraction of stellar mass of the \bcgicl\ within 10~kpc relative to the total within 100~kpc.]{The fraction of the stellar mass within the central 10~kpc relative to the total within 100~kpc, plotted as a function of \mfive\ for the complete sample. Systems at all redshifts show a trend of lower central concentration with increasing \mfive, with no significant shift in this relation between samples.}
\caption{Visual examples of our ZSD results. Each triple shows: (from left to right) DELO detection results, vanilla YOLOv2 detection results at the same confidence threshold as DELO, vanilla YOLOv2 detection results at a much smaller confidence threshold. Seen objects, unseen objects, and errors are color-coded in {\color{red} red}, {\color{green} green}, and {\color{blue} blue}, respectively. Notice that compared to DELO, the vanilla YOLOv2 consistently predicts extremely low objectness scores on unseen objects, and suffers from significant detection errors on the unseen objects it should detect. }
\caption{Average of rankings in the overall quality study. The best and the second best results are given in \textcolor[rgb]{1,0,0}{\textbf{red}} and \textcolor[rgb]{0,0,1}{\textbf{blue}}, respectively.}
\caption{Average of rankings in the content preserving study. The best and the second best results are given in \textcolor[rgb]{1,0,0}{\textbf{red}} and \textcolor[rgb]{0,0,1}{\textbf{blue}}, respectively.}
\caption{Average of rankings in the style look-like study. The best and the second best results are given in \textcolor[rgb]{1,0,0}{\textbf{red}} and \textcolor[rgb]{0,0,1}{\textbf{blue}}, respectively.}
\caption{(Left) Input image with pose, (middle) ground truth Gaussian attention mask (A) \nj{in yellow}, and (Right) predicted attention map ($\tilde{A}$). The \textcolor{red}{red box} is the pseudo object box, the \textcolor{blue}{blue box} is the ground truth, and the white box indicates the human in \nj{action}. (V) \nj{the last row} is the result on the V-COCO dataset. Note that a white box is used solely \nj{for visually representing} an acting human in an image and is not used in training on $D_{T}$.}
\caption{The qualitative results on \textit{carry} and \textit{sit on} for three different objects each. (Left) Input image with pose, (middle) ground truth Gaussian attention mask, and (Right) predicted attention map. The \textcolor{red}{red box} is the pseudo object box, the \textcolor{blue}{blue box} is the ground truth, and the white box indicates the acting human. Note that a white box is used solely to represent an acting human in an image and is not used in training on the target class $D_{T}$.}
\caption{\red{A schematic visualization of the proposed network architecture. The numbers on the layers denote the feature dimensions and the convolution kernel size.} }
\caption{\red{A selection of the important properties of the data used for the underlying work.}}
\caption{\red{A schematic overview of the data generation process. Note that ultimately the goal is to acquire simultaneous projection images in hybrid X-ray and MR imaging (cf.~Section~\ref{sec:related_work_mrpi}). \redrev{To generate training data for the underlying work, simulation was necessary.}}}
\caption{Representative examples of the projection-to-projection translation for different projection angles and patients. \red{The top and middle row are projections originating from the first test patient and the bottom row from the second test patient.} }
\caption{An example of missing information in the generated X-ray projection images. Details that are unobservable in the input MRI projections can, naturally, not be translated in the resulting generated projection images. \red{The small rectangle in the label image (c) outlines the entry point of a ventricular shunt while the larger rectangle frames contrasted cerebral arteries.} While the devices were not present at the time of the acquisition of the underlying images, it is likely that none of these details will be captured by the MR imaging protocol.}
\caption{\footnotesize Bounding box regression steps by GIoU loss (first row) and DIoU loss (second row). {\color{green}{Green}} and \textbf{black} denote {\color{green}{target}} box and \textbf{anchor} box, respectively. {\color{blue}{Blue}} and {\color{red}{red}} denote predicted boxes for {\color{blue}{GIoU}} loss and {\color{red}{DIoU}} loss, respectively. GIoU loss generally increases the size of predicted box to overlap with target box, while DIoU loss directly minimizes normalized distance of central points. }
\caption{\footnotesize GIoU loss degrades to IoU loss for these cases, while our DIoU loss is still distinguishable. {\color{green}{Green}} and {\color{red}{red}} denote {\color{green}{target}} box and {\color{red}{predicted}} box respectively.}
\caption{Real (a) and imaginary (b) parts of the lowest lying eigenvalues of the Lindbladian in (\ref{Lindblad}) versus the driving for $\Lambda=0.1688$ and $\kappa=1.5$. Shown are eigenvalues corresponding to the class $\rho^{(3k)}$ (blue), namely, $\lambda_0$ (\textcolor{hblue}{$\diamond$}) and $\lambda_3$ (\textcolor{hblue}{$\bullet$}), and eigenvalues corresponding to the classes $\rho^{(3k+1)}, \rho^{(3k+2)}$ (black), namely, ${\rm Re}\{\lambda_1\}={\rm Re}\{\lambda_2\}$ ($\nabla$). For very weak driving ${\rm Re}\{\lambda_{1, 2}\}$ describe relaxation of a harmonic oscillator near $Q=P=0$, while beyond a threshold $f_c$ (here at the maximum of $\lambda_3$) the eigenvalues become exponentially small (see inset of (a) with a log-scale) and capture interwell transition processes. The imaginary part (b) of $\lambda_{1,2}$ also decreases towards $\lambda_0$.}
\caption{An example of lottery draw time (\drawlen = 3, $p = 1/300$, \tktrate = 1000, and \ddraw = 10). }
\caption{Lottery draw example (\ddraw = 10, and $p = 0.01$). }
\caption{Network structure of the proposed CWAN method. Here, k3n64 indicates a kernel size of $3 \times 3$, and a feature map number of $64$. The stride is always equal to one in all layers. \textcolor{red}{Red}, \textcolor{green}{green} and \textcolor{blue}{blue} arrows denote short, long and global skip connections, representing local short-term memory, long-term memory, and residual learning, respectively. The \textcolor{yellow}{yellow} arrow represents supervision in training CWAN.}
\caption{Quantitative results on three datasets, and qualitative results on HDRDB. \textcolor{red}{Red}/\textcolor{blue}{blue} fonts indicate the best/second best results.}
\caption{Detection results on the ``Cityscapes $\rightarrow$ FoggyCityscapes'' scene. `GT' indicates the groundtruth result. `One Disentangled layer' denotes we only use the second disentangled layer in the model. We can see that our method, i.e., using two disentangled layers, could locate and recognize objects existing in the two foggy images accurately, e.g., the \textcolor[rgb]{0.5,0,0}{truck}, \textcolor[rgb]{0,0,1}{car}, and \textcolor[rgb]{1,0,0}{bicycle}.}
\caption{Detection results on the ``Pascal VOC $\rightarrow$ Watercolor'' scene. We can see that our method, i.e., using two disentangled layers, could locate and recognize objects existing in the two watercolor images accurately, e.g., the \textcolor[rgb]{1.0,0.61,0.61}{person}, \textcolor[rgb]{0,0.8,0}{bird}, and \textcolor[rgb]{0,0.8,0.8}{cat}.}
\caption{\red{Basic} model parameters}
\caption{\red{Advanced model} parameters}
\caption{Regions where the functions $z_k$ of the surrogate model are exactly zero, for \red{the basic model} (left) and \red{the advanced model} (right), for a problem with two variables with lower bounds $(0,0)$ and upper bounds $(5,3)$. The functions $z_k$ have been chosen in such a way that they cross exactly at integer points within the bounded region. This ensures that the model has its minimum in one of these points, making the model more suitable for combinatorial optimization problems.}
\caption{Distance matrix for the simple \red{traveling salesman problem}}
\caption{Model output for the simple \red{traveling salesman problem} for \red{the basic model} (left) and \red{the advanced model} (right). The starting city is city 1, $x_1$ determines which remaining city is visited next, $x_2$ determines which remaining city is visited third, then the only remaining city is visited, and then city 1 is visited again.}
\caption{\red{IDONE-advanced, IDONE-basic}}
\caption{Best found \red{worst-case} total distance \red{(\emph{left}) and corresponding computation time (\emph{right}) }of the \red{noisy} TSP with $17$ cities for IDONE-advanced (IDONEa), IDONE-basic (IDONEb), \red{random search (RS), simulated annealing (SA)}, and \red{HyperOpt (HypOpt)}, averaged over $5$ runs. The shaded area (\emph{left}) visualizes the range across all $5$ runs, as do the boxplots (\emph{right}).}
\caption{ \red{Lowest objective value found at each iteration of the binary convex optimization example with $100$ binary variables, averaged over $100$ runs. The shaded area indicates the standard deviation. For every run, the initial value, matrix $A$, and vector $\x^*$ were chosen randomly. } }
\caption{\red{Objective value (\emph{left}) and computation time (\emph{right}) of the convex binary optimization problem for the different algorithms after $1000$ iterations, averaged over $100$ runs, for problems with different numbers of variables $d$.}}
\caption{Number of matrix operations needed to compute temporal derivatives of the $\liegroup$-valued splines: \emph{m-m/m-v mult.} denote matrix-matrix and matrix-vector multiplications, respectively. \emph{add.} denotes additions of matrices or vectors. Our formulation needs consistently fewer operations than the baseline approach. The \blue{blue numbers} give the number of operations for a cubic spline ($k=4$).}
\caption{Comparison with state-of-the-art methods on MARS in terms of CMC-1(\%) and mAP(\%). Quality-based methods are indicated by the `Quality' column. \textcolor{blue}{Blue} indicates `re-ranking (RR)'. `Custom' means the backbone is customised.}
\caption{Scatter plot of ELMo layer mixing weights from trained Neural models of behaviors of every group. Colored regions denote a particular layer having the largest weight among all 3 layers. For example, the \textcolor{red}{red} region represents the space of all weights where layer 0's weight is the largest, whereas \textcolor{blue}{blue} represents the space where layer 2's weight is the largest.}
\caption{Number of examples per template in the training set of the DAQA dataset sorted by color with respect to reasoning skills. Templates that require temporal reasoning, \eg~contain prepositions: before / after, or ordinals, are {\color{gray}{highlighted}} in {\color{gray}{gray}}.}
\caption{Audio question answering performance per template on the DAQA test set for FiLM (FiLM-512; 12 layers) and MALiMo (6 blocks). Templates {\color{gray}{highlighted}} in {\color{gray}{gray}} require temporal reasoning, \eg~contain prepositions: before / after, or ordinals.}
\caption{Spatial distribution of \blanco\ members from \citet{Babusiaux18}. The orange box indicates the \ngts\ field of view (FoV), filled black circles represent stars with \ngts\ light curves, and open grey circles indicate stars that were either too faint or fell outside the \ngts\ FoV.}
\caption{Absolute \gaia\ G vs.\ \bprp\ colour-magnitude diagram (CMD) for members of \blanco\ from (\citealt{Babusiaux18}; open grey circles) highlighting stars observed by \ngts\ (filled black circles). The \gaia\ photometry and parallaxes reveal a tight cluster sequence with a scattering of likely multiple star systems lying above the single star sequence. For reference, an equal mass binary produces a 0.75 mag excess. The \bprp\ colours have been dereddened assuming E(B-V) = 0.010 for the cluster (\colblue{B18}). \ngts\ observed essentially all cluster members down to a spectral type of $\sim$M3. The stellar masses are MIST model predictions \citep{Dotter16,Choi16} evaluated at the age of \blanco, and the spectral types were estimated using updated information from \citet{Pecaut13} (E. Mamajek online table; see text).}
\caption{NGTS light curves and period predictions using GPs, LS and \gacf, for two stars (object IDs 13071 and 11156, left and right respectively). In each case, the \emph{top three plots} show the relative flux NGTS light curve in units of parts per thousand (ppt; top), the NGTS light curve with the maximum a posteriori (MAP) GP model (middle) and residuals (bottom). The orange line and shaded region show the mean and 1$\sigma$ uncertainty on the MAP GP model. In the residuals plot, blue points indicate outlying data that was masked by the GP during the fit. The \emph{bottom six plots} show the period estimation results (left column) for GPs (top), LS (middle) and \gacf\ (bottom), along with the NGTS data phase-folded on each method's period (right column). \emph{Top left}: 1D GP posterior period distribution (orange) with the median period and 1\,$\sigma$ uncertainties (solid and dashed orange lines) shown and the period printed top right. For comparison, the period predictions of \gacf\ and LS are shown by the vertical blue and green solid lines, respectively. \emph{Middle left}: LS periodogram in green with the identified period highlighted in yellow and printed top right. \emph{Bottom left}: \gacf\ autocorrelation function in blue (positive direction shown only) with the identified period highlighted in yellow and printed top right. \emph{Right column}: NGTS light curve phase-folded on the corresponding period (GP, LS and \gacf, top-to-bottom), with the rainbow colour scheme indicating data from the beginning (indigo) to the end (red) of the observations. The modulation patterns of these two stars are relatively stable during the 200 days of observations and hence the period predictions from GPs, LS and \gacf\ all agree.}
\caption{Same as Figure \ref{fig:P_agree} but for objects 1221 and 14442, two stars whose modulation patterns evolve significantly during the 200 day observations. The GP and \gacf\ predictions agree well for these stars, given that they can account for the evolution in the modulation pattern, but the LS prediction is offset because it finds the period that is best fit by a non-evolving sine wave, which is not appropriate for these stars.}
\caption{Comparison of the periods extracted for \blanco\ stars using GP, \gacf\ and LS methods. We compare the period differences between each method (defined as $[(P_{1}-P_{2})/P_{1}] \times 100 \%$) as a function of dereddened \gmk\ colour. For this comparison, we use \nProtsame\ stars where each method detected the same rotation signal. While the agreement is good for most stars, there are exceptions where different methods disagree by up to $\sim$15\%. The agreement is better for the lower mass stars (\igmk\,$\gtrsim$\,2.5), which typically show stable modulation signals. Solar-type stars (\igmk\,$\lesssim$\,2.5) display greater evolution in their light curves, which causes a larger scatter in the period estimates between the three methods. Overall, the GP and \gacf\ methods (cyan) agree best, most notably for the evolving solar-type members, followed by \gacf\ and LS (magenta) and then GP and LS (black).}
\caption{Colour-magnitude diagrams (CMDs) of \blanco. \emph{Left:} Absolute G vs.\ dereddened \bprp\ CMD of all \blanco\ stars (open grey circles) and those with \ngts\ light curves highlighted (filled black circles). The single star locus has been estimated by iteratively fitting the cluster sequence using a Gaussian process (cyan). Stars that are 3$\sigma$ outliers are circled in magenta and are likely multiple star systems. Below are the residuals of the fit as a function of colour, where `residual' here means the smallest linear distance from the GP model rather than vertical magnitude displacement above the single star sequence. \emph{Right}: same for absolute \ks\ vs.\ dereddened \gmk\ CMD.}
\caption{Rotation period vs.\ dereddened \gmk\ colour for stars in \blanco. \emph{Left}: all stars with detected rotation periods (black points), with our identified multiple stars circled in magenta. Stellar mass (\msun) and spectral type are indicated at the top. \emph{Right}: showing only the apparently single stars to highlight the clear rotation sequence between 1.1$<$\igmk$<$2.3 (1.2$\,\gtrsim M \gtrsim$\,0.75\,\msun). Mass-dependent angular momentum evolution is strongly imprinted in the \blanco\ sample.}
\caption{Rotation period vs.\ dereddened \gmk\ colour for stars in \blanco\ coloured by the level of evolution in their light curve modulation patterns. There is a clear mass dependence to the light curve morphology evolution, with mid-F to mid-K stars displaying predominantly evolving modulation patterns and M stars showing typically stable modulation over the 200 day \ngts\ light curves.}
\caption{Rotation period vs.\ dereddened \gmk\ colour for stars in \blanco\ coloured by the amplitude of their light curve modulation patterns. Most notably for single late-K and early M stars (2.5\,$\lesssim$ \igmk\ $\lesssim$\,3.2), there appears to be a correlation between rotation period and modulation amplitude, with the faster rotators, which sit below the upper cluster envelope, displaying higher modulation amplitudes than their more slowly rotating counterparts. Photometric multiple stars are circled in magenta.}
\caption{Rotation period vs.\ dereddened \gmk\ colour for stars in \blanco\ and the \pleiades. This is the same as Figure \ref{fig:P_col} with the addition of the \pleiades\ data, which has been dereddened assuming E(B-V) = 0.045 for the cluster (\colblue{B18}). \emph{Left}: all stars with detected rotation periods (black points for \blanco\ and cyan stars for the \pleiades), with our identified multiple star systems highlighted (magenta circles for \blanco\ and gold circles for the \pleiades). \emph{Right}: showing only the apparently single stars to highlight the clear rotation sequences between 1.1$<$\igmk$<$2.3 (1.2$\,\gtrsim$\,$M$\,$\gtrsim$\,0.75\,\msun). Mass-dependent angular momentum evolution is strongly imprinted in both clusters.}
\caption{Colour-amplitude relation for \blanco\ and the \pleiades. \blanco\ stars are represented by black points, with multiple stars circled in magenta, and Pleiads are indicated by cyan stars, with multiples circled in gold. There is no clear distinction between the modulation amplitudes of single and multiple stars, except that very large amplitude variables are preferentially found in multiple systems. Stellar masses (\msun) and spectral types are indicated at the top. At a spectral type of $\sim$F5, variability amplitudes start to increase, which we attribute to the emergence of sufficiently deep convective envelopes that can drive and sustain a significant magnetic dynamo, and hence give rise to the starspot distributions whose longitudinal inhomogeneity drives the observed modulation patterns.}
\caption{Results of model with prior indicate it helps in removing spurious triplets. \colorbox{blue}{without prior}\colorbox{green}{with prior}}
\caption{ \textbf{\ipVAE~ fixes sample generation for a regularized VAE.} % % The \textcolor{orange}{orange box} shows the aggregate posterior $q(\z)$ for $\beta$-VAE \cite{higgins2017beta} on 2 moons data and the corresponding training samples. The regularized VAE has \textit{latent pockets} where samples are highly likely to be generated but not supported by $q(\z)$. These correspond to low-posterior samples being generated off the data manifold (\ie poor sample quality). % % % The \textcolor{green}{green box} demonstrates the aggregate posterior of $\beta$-\ipVAE~ that decouples the generation space $\zo$ and representation space $\z$. The generation space is pockets-free and very close to standard normal, resulting in generating samples on the data manifold even for low-posterior ones. Furthermore, the representation learning is well established in the representation space. More discussion on this figure in section \ref{sec:submanifold} The \textcolor{green}{green box} shows $\beta$-VAE \cite{higgins2017beta} (left column) and $\beta$-VAE with the proposed \emph{decoupled prior} (right column), each trained on the two moons dataset. %two different VAEs trained on the 2 moons training data. \emph{(Left Column): } trained using $\beta$-VAE \cite{higgins2017beta} regularization, $\beta$-VAE: Top to bottom shows the generated samples (colors reflect probability of generation), the aggregate posterior $\qphi(\z)$ and the training samples. The low-posterior samples lie in the latent pockets of $\qphi(\z)$ (shown in enlarged section on the left) and correspond to off-manifold samples in the data space, and high-posterior samples correspond to latent leaks. The $\beta$-\ipVAE~ decouples the representation $\z$ and generation $\zo$ spaces. The generation space is pocket-free and very close to standard normal, resulting in generating samples on the data manifold. 
Furthermore, the representation learning is well established in the representation space (see section \ref{sec:submanifold} for more discussion). }
\caption{Mixture of Gaussians predictions. First row: baseline Faster R-CNN. Second row: mixture of four Gaussians. {\color{blue} Blue boxes} are the mixture components. {\color{citecolor} Green boxes} are the final predictions. (a) not occluded (b) left arm is occluded (c) both arms are occluded (d) heavily occluded. }
\caption{{\it What is the most likely action?} Our model takes advantage of the connection between motor attention and visual perception. In addition to the future action label, our model also predicts the interaction hotspots on the last observable frame and the hand trajectory (in the order of {\color{yellow} yellow}, {\color{green} green}, {\color{cyan} cyan}, and {\color{magenta} magenta}) from the last observable time step to the action starting point. Visualizations of hand trajectory are forecasted to the last observable frame. (Best viewed in color.)}
\caption{Visualization of motor attention (left), interaction hotspots (right), and predicted action labels (top) from EGTEA (first row) and EPIC-Kitchens (second row). Both successful cases ({\color{green} green} label) and failure cases ({\color{red} red} label) are presented. Future hand positions are downsampled by a temporal factor of 8, and forecasted to the last observable frame in the order of {\color{yellow} yellow}, {\color{green} green}, {\color{cyan} cyan}, and {\color{magenta} magenta}.}
\caption{Additional visualization of predicted motor attention (left), interaction hotspots (right), and future action labels (top) from the EGTEA dataset (rows 1--4) and the EPIC-Kitchens dataset (rows 5--8). Both successful cases ({\color{green} green} label) and failure cases ({\color{red} red} label) are presented. Future hand positions are downsampled by a temporal factor of 8, and forecasted to the last observable frame in the order of {\color{yellow} yellow}, {\color{green} green}, {\color{cyan} cyan}, and {\color{magenta} magenta}.}
\caption{\textbf{Example Localizations:} We show example predictions for localizing the transition to unintentional action. \textcolor[rgb]{0,.56,.12}{Green} indicates a correct prediction (within 0.25 sec). \textcolor[rgb]{.81,0,0}{Red} indicates an incorrect, yet reasonable, prediction. \textcolor[rgb]{.81,.70,0}{Yellow} indicates a missed detection.\vspace{-1em}}
\caption{\textbf{Example Localizations:} We show example predictions for localizing the transition to unintentional action. \textcolor[rgb]{0,.56,.12}{Green} indicates a correct prediction (within 0.25 sec). \textcolor[rgb]{.81,0,0}{Red} indicates a false alarm. \textcolor[rgb]{.81,.70,0}{Yellow} indicates a missed detection.\vspace{-1em}}
\caption{\textbf{Episodes for Meta-Learning}: We illustrate two examples of training episodes. Each episode consists of several pairs of image and text. During learning, we mask out one or more words, indicated by a dark {\setlength{\fboxsep}{2pt}\colorbox{dark_color}{\textcolor{white}{box}}}, and train the model to reconstruct it by pointing to it among other examples within the episode. By controlling the generalization gaps within an episode, we can explicitly train the model for generalization to new words and new compositions. For example, the left episode requires the model to learn how to acquire a new word (``carrot''), and the right episode requires the model to combine known words to form a novel composition (``stir paneer'').\vspace{-1em}}
\caption{\textbf{Pointing to New Words:} We show examples where the model encounters new words in the episode. The dark {\setlength{\fboxsep}{2pt}\colorbox{dark_color}{\textcolor{white}{box}}} in the target example indicates the ground-truth new word, which is masked out. The model predicts that word by pointing to words in the reference set, and the weight of each pointer is visualized by the {\setlength{\fboxsep}{2pt}\colorbox{light_color}{\textcolor{white}{shade}}} of the box around each word (weight $< 3\%$ is omitted). In the bottom right, we show an error where the model predicts the plate is being placed, not grabbed.\vspace{-1em}}
\caption{\textbf{Predictions Despite Missing Vocabulary:} We show predictions ({\setlength{\fboxsep}{1pt}\colorbox{myblue_light}{\textcolor{myblue_dark}{blue}}}) when the right word is missing from the vocabulary ({\setlength{\fboxsep}{1pt}\colorbox{myred_light}{\textcolor{myred_dark}{red}}}). When the precise word is missing from the vocabulary, our model often makes reasonable predictions.\vspace{-1em}}
\caption{\textbf{Visualizing the Attention:} We probe how the model uses visual information. We remove various objects from the image, and evaluate the model's confidence in predicting \protect\colorbox[rgb]{.7764,.98,0}{\textcolor[rgb]{.11,.11,.11}{masked words}}. Removing image regions with a \protect\cfbox{new_green}{\textbf{\textcolor[rgb]{.7764,.98,0}{green box}}} causes the greatest drop in confidence (other regions are shown in \protect\cfbox{new_pink}{\textbf{\textcolor[rgb]{1,.25,.506}{pink}}}). The most important visual regions for the prediction task contain an instance of the target word. These results suggest that our model learns some spatial localization of words automatically.\vspace{-1em}}
\caption{Street and chip classification ground truth visualized for flawless (\textcolor{green}{\textbullet}), anomaly (\textcolor{yellow}{\textbullet}), and faulty (\textcolor{red}{\textbullet}) streets and chips.}
\caption{{\bf More homography examples.} We show point correspondences on our synthetic dataset (see Section~\ref{sec:homography}), on real image pairs from HPatches (see \supp~\ref{sec:homography-supp}), and a checkerboard image captured by a webcam. SuperGlue consistently estimates more correct matches ({\color{green}green} lines) and fewer mismatches ({\color{red}red} lines), successfully coping with repeated texture, large viewpoint, and illumination changes.}
\caption{{\bf More indoor examples.} We show both {\bf Difficult} and {\bf Very Difficult} ScanNet indoor examples for which SuperGlue works well, and three {\bf Too Difficult} examples where it fails, either due to unlikely motion or lack of repeatable keypoints (last two rows). Correct matches are {\color{green}green} lines and mismatches are {\color{red}red} lines. See details in Section~\ref{sec:indoor}. }
\caption{{\bf More outdoor examples.} We show results on the MegaDepth validation and the PhotoTourism test sets. Correct matches are {\color{green}green} lines and mismatches are {\color{red}red} lines. The last row shows a failure case, where SuperGlue focuses on the incorrect self-similarity. See details in Section~\ref{sec:outdoor}.}
\caption{{\bf Attention patterns across layers.} For this image pair (correctly matched by SuperGlue), we look at three specific keypoints that can be matched with different levels of difficulty: the {\color{green}easy keypoint}, the {\color[rgb]{0,.8,1}medium keypoint}, and the {\color{red}difficult keypoint}. We visualize self- and cross-attention weights (within images $A$ and $B$, and from $A$ to $B$, respectively) of selected layers and heads, varying the edge opacity with $\alpha_{ij}$. % The spans of the self- and cross-attention shrink throughout the layers. The self-attention initially attends all over the image (row 1), and gradually focuses on a small neighborhood around each keypoint (last row). Similarly, some cross-attention heads focus on candidate matches, and successively reduce the set that is inspected. The {\color{green}easy keypoint} is matched as early as layer 9, while more difficult ones are only matched at the last layer. % Layer 11 does not follow the trend shown in Figure~\ref{fig:supp-attention-span} and attends to other locations --~seemingly distinctive ones --~that are further away. We hypothesize that SuperGlue is attempting to disambiguate challenging matches using additional context.\looseness=-1 Similarly as in Figure~\ref{fig:supp-attention-span}, the self- and cross-attention spans generally shrink throughout the layers. They however increase in layer 11, which attends to other locations --~seemingly distinctive ones --~that are further away. We hypothesize that SuperGlue attempts to disambiguate challenging matches using additional context.\looseness=-1 }
\caption{Section of the histogram of the residues of DLPNO-CEPA/1 ({\color{histoblue}$\blacksquare$}), MP2 ({\color{histobrown}$\blacksquare$}) and SS-MP2 ({\color{histoyellow}$\blacksquare$}) for awCV[T,Q]Z. }
\caption{Time-averaged wall shear stress at the impingement wall. The different line styles indicate different Mach numbers: \protect\dashedLine{\linePlotWidth}, $Ma=0.3$; \protect\dashDottedLine{\linePlotWidth}, $Ma=0.5$; \protect\dottedLine{\linePlotWidth}, $Ma=0.7$, where different line colours indicate different $q_w$ values: \protect\dott{c6}, $q_w=0.0$; \protect\dott{c3}, $q_w=0.0125$; \protect\dott{c1}, $q_w=0.025$; \protect\dott{c2}, $q_w=0.05$; \protect\dott{c4}, $q_w=0.1065$; \protect\dott{c5}, $q_w=0.2$.}
\caption{Examples that a bi-directional model (RoBERTa-large) predicts correctly while a uni-directional model (GPT-2-medium) makes an incorrect prediction; the correct answer is \textbf{bolded}; the key tokens are colored in \textcolor{blue}{blue}.}
\caption{{\it Upper panel:} The \gray light curve of \ct with 1-day time bins from 01 January 2016 to 01 April 2018. {\it Lower panels:} The \gray spectra in the energy range from 100 MeV to 300 GeV for the periods which showed significant deviation from the simple power-law model. The power-law with exponential cut-off spectral model (dashed red line), with its fit uncertainties (red solid lines), is shown together with the spectral points and is compared with the other adopted models (broken power-law in blue and log parabola in black). The spectral points are obtained by separately running gtlike for smaller energy intervals.}
\caption{{\it Left panel:} Internal BLR absorption as a function of distance for different \gray energies. The red dot-dashed line shows the $R_{Ly\alpha}$ radius. {\it Right panel:} The reconstructed power-law model compared with the data considering external (EBL) and internal absorptions. The latter is computed assuming the emission region is at $\sim50\:r_{\rm g}$ (dotted blue line) and at $\sim1000\:r_{\rm g}$ (dot-dashed blue line) distances from the central source.}
\caption{The relative least square values $\chi^2_{\alpha } /\chi^2_{{\rm SLO}} $ (upper figure) and the relative values of rms deviation $f_{\alpha } /f_{{\rm SLO}} $ (bottom figure) for even-even isotopes corresponding to calculations within different PSF models. Signs: red empty circles ($\bm{{\color{red}\odot }}$) - SMLO; blue empty triangles ($\bm{{\color{blue}\bigtriangleup }}$) - GLO; green pluses (\textcolor{green}{\textbf{+}}) - TLO.}
\caption{The relative least square values $\chi _{\alpha }^{2} /\chi _{{\rm SLO}}^{2} $ for even-even isotopes corresponding to calculations within different PSF models. Upper figure: red circles ($\bm{{\color{red}\bullet}}$) - SMLO; red empty circles ($\bm{{\color{red}\odot }}$) - SMLOe; blue empty triangles ($\bm{{\color{blue}\bigtriangleup }}$) - GLO. Bottom figure: red circles ($\bm{{\color{red}\bullet}}$) - SMLO; green crosses ($\bm{{\color{green}\times }}$) - TLO(1); green pluses (\textcolor{green}{\textbf{+}}) - TLO(2); black empty squares ($\bm{\boxdot }$) - TLO(3). The presented results correspond to intervals from 5 MeV (or the minimal value of the energy $>$ 5 MeV) to 30 MeV (or the maximal value of the energy $<$ 30 MeV).}
\caption{The relative values of rms deviation $f_{\alpha } /f_{{\rm SLO}} $ for even-even isotopes corresponding to calculations within different PSF models. Upper figure: red circles ($\bm{{\color{red}\bullet}}$) - SMLO; red empty circles ($\bm{{\color{red}\odot }}$) - SMLOe; blue empty triangles ($\bm{{\color{blue}\bigtriangleup }}$) - GLO. Bottom figure: red circles ($\bm{{\color{red}\bullet}}$) - SMLO; green crosses ($\bm{{\color{green}\times }}$) - TLO(1); green pluses (\textcolor{green}{\textbf{+}}) - TLO(2); black empty squares ($\bm{\boxdot }$) - TLO(3). The presented results correspond to intervals from 5 MeV (or the minimal value of the energy $>$ 5 MeV) to 30 MeV (or the maximal value of the energy $<$ 30 MeV).}
\caption{Examples of dialogues (Part 1). REF -- reference summary, L3 -- LONGEST-3 baseline, DS -- DynamicConv + GPT-2 emb. with sep., D -- DynamicConv + GPT-2 emb., F -- Fast Abs RL, FE -- Fast Abs RL Enhanced, T -- Transformer. For L3, three longest utterances are listed. Rounded ROUGE values \textcolor{frenchblue}{[R-1/R-2/R-L]} are given in square brackets.}
\caption{Examples of dialogues (Part 2). REF -- reference summary, L3 -- LONGEST-3 baseline, DS -- DynamicConv + GPT-2 emb. with sep., D -- DynamicConv + GPT-2 emb., F -- Fast Abs RL, FE -- Fast Abs RL Enhanced, T -- Transformer. For L3, three longest utterances are listed. Rounded ROUGE values \textcolor{frenchblue}{[R-1/R-2/R-L]} are given in square brackets.}
\caption{Setup for the characterization measurements. In section~\ref{sec:shielding-factor}, we use three orthogonal coil pairs around a fixed probe in the middle of the shield (\greendiamond) to measure the dynamical shielding factors in all three spatial directions. For the residual field measurements in section~\ref{sec:residual-field}, the probe (\redrectangle) travels along the longitudinal axis ($z$ direction) of the shield and we coarsely null the field at the ends of the shield using two constant current coils while the excitation coils are not used.}
\caption{Comparison of markerless methods for localising in 3D and estimating the dimensions of textureless objects. Note that Saxena \etal~\cite{Saxena2008} localises the grasping points in 3D instead of the object. KEY -- Ref.: reference. N-3D: no 3D object model (\eg~CAD). N-D: no depth. HLC: known high-level category. Loc.: object localisation in 3D. Dim.: object dimensions estimation in 3D. \protect\tikz\protect\draw[lightgray,fill=lightgray] (0,0) rectangle (1.ex,1.ex);~dimensions given by the 3D model.}
\caption{Initialisation, sampled iteration and convergence of the 3D-2D shape fitting of a drinking glass (top: left camera, bottom: right camera). Legend: $i$ iteration number, $r$ radius of the circumference, \protect\tikz \protect\fill[red,fill=red] (1,1) circle (0.5ex);~projected points lying outside the segmentation mask, \protect\tikz \protect\fill[blue,fill=blue] (1,1) circle (0.5ex);~projected points lying inside the segmentation mask and \protect\tikz \protect\fill[green,fill=green] (1,1) circle (0.5ex);~projected points whose circumference fits the shape of the object (inside the segmentation mask of both cameras). }
\caption{Localisation success ratio (LSR) of all methods and errors for each dimension using \acrshort{iode} for each object of the CORSMAL-Container dataset, across all backgrounds and lighting conditions. Note the different scale of the y-axis. Legend: \acrshort{nocs}~\cite{Wang2019CVPR_NOCS}~\protect\tikz \protect\draw[nocs,fill=nocs] (0,0) rectangle (1.ex,1.ex);, SegDD~\protect\tikz \protect\draw[segdd,fill=segdd] (0,0) rectangle (1.ex,1.ex);, \acrshort{lodeir}~\protect\tikz \protect\draw[iodenarrow,fill=iodenarrow] (0,0) rectangle (1.ex,1.ex); and \acrshort{iode}~\protect\tikz \protect\draw[iode,fill=iode] (0,0) rectangle (1.ex,1.ex);. }
\caption{Localisation success ratio (LSR) and dimension estimation error. Legend: \acrshort{nocs}~\cite{Wang2019CVPR_NOCS}~\protect\tikz \protect\draw[nocs,fill=nocs] (0,0) rectangle (1.ex,1.ex);, SegDD~\protect\tikz \protect\draw[segdd,fill=segdd] (0,0) rectangle (1.ex,1.ex);, \acrshort{lodeir}~\protect\tikz \protect\draw[iodenarrow,fill=iodenarrow] (0,0) rectangle (1.ex,1.ex); and \acrshort{iode}~\protect\tikz \protect\draw[iode,fill=iode] (0,0) rectangle (1.ex,1.ex);. }
\caption{Parameter estimates for simulated data. {\color[rgb]{1,0.8,0.8} $\blacksquare$} 90$\%$ HPD intervals, {\color[rgb]{1,0.48,0.48}$\bullet$} posterior median and {\color[rgb]{0.4,0.4,0.4}$\bullet$} true parameters.}
\caption{Largest company in the sample (code 1767). Posterior predictions of incremental paid losses per development year $i$, for all accident years $j$. {\color[rgb]{0.8,0.8,0.8}$\bullet$} Observed claims (upper triangle), {\color[rgb]{0.4,0.4,0.4}$\bullet$} Non-observed claims (lower triangle), {\color[rgb]{1,0.8,0.8} $\blacksquare$} 95\% credibility interval, {\color[rgb]{1,0.48,0.48} $-$} median. }
\caption{Smallest company in the sample (code 692). Posterior predictions of incremental paid losses per development year $i$, for all accident years $j$. {\color[rgb]{0.8,0.8,0.8}$\bullet$} Observed claims (upper triangle), {\color[rgb]{0.4,0.4,0.4}$\bullet$} non-observed claims (lower triangle), {\color[rgb]{1,0.8,0.8} $\blacksquare$} 95\% credibility interval, {\color[rgb]{1,0.48,0.48} $-$} median. }
\caption{Reserves for each accident year. (Left) largest company (code 1767) and (right) smallest company (code 692). {\color[rgb]{1,0.6,0.6}$\bullet$} median and 95\% CI for DGM, {\color[rgb]{0.42,0.42,0.42}$\bullet$} true claims and {\color[rgb]{0.6,0.6,1}$\bullet$} median and 95\% CI for ODP.}
\caption{Predictive distributions of the aggregated reserves for all ten companies. {\color[rgb]{0.42,0.42,0.42}$\textbf{--}$} True claims; {\color[rgb]{1,0.8,0.8} $\blacksquare$} DGM; {\color[rgb]{0.8,0.8,1}$\blacksquare$} ODP (Bootstrap). }
\caption{Small modifications (\textcolor{blue}{update}, \textcolor{green}{insert}, and \textcolor{red}{delete}) in paraphrase lead to large accuracy gain (\%).}
\caption{Impact of missing value imputation on the prediction accuracy for different imputation strategies and interventions. We observe a higher accuracy for incomplete records ({\bf \textcolor{red}{red dots}}), which we attribute to the fact that the data is not missing at random, and that incomplete records contain more easy-to-classify negative examples.}
\caption{Comparison of accuracy and disparate impact on the \adult dataset for complete case analysis (removal of incomplete records, {\bf \textcolor{dgray}{gray dots}}) and inclusion of incomplete records (with imputation of missing values via datawig, {\bf \textcolor{red}{red dots}}). Including imputed records does not significantly affect the disparate impact of the resulting models.}
\caption{Impact of hyperparameter tuning on the accuracy and fairness metrics of logistic regression and decision tree models (in combination with various preprocessing and postprocessing interventions) on the \texttt{germancredit} dataset. Hyperparameter tuning (\textcolor{red}{red dots}) results in higher accuracy and reduced variance of the fairness outcome compared to no tuning (\textcolor{dgray}{gray dots}) in many cases.}
\caption{\textbf{Proposed UDA setup}. xMUDA learns from supervision on the source domain (plain lines) and self-supervision on the target domain (dashed lines), while benefiting from the Cross-Modal predictions of {\color{red}2D}/{\color{cyan}3D} modalities.}
\caption[]{Journal of ESPaDOnS observations of CI~Tau from December 2016 to February 2017. All observations consist of sequences of 4 subexposures, each lasting either 1200~s (first 8 spectra) or 1000~s (last 10 spectra). Columns respectively list, for each observation, the UT date, time, Barycentric Julian Date (BJD), peak signal to noise ratio \sn\ (per 2.6~\kms\ velocity bin), rms noise level in Stokes~$V$ LSD profiles, rotation cycle $c$ computed using ephemeris BJD~(d)~=~$2457736.7+9.0 c$, longitudinal fields \Bl\ measured from both LSD profiles and \hei\ $D_3$ narrow emission component (NC), the RVs and bisector spans (BSs) of LSD profiles, and the RVs of the \hei\ $D_3$ NC. }
\caption{The comparison of natural, medical and brain image datasets with respect to their dataset size (y-axis), the image acquisition time (x-axis), and the size per image (shown by the dot size). \textbf{Brain} imaging datasets (\textcolor{blue}{blue dots in the figure}): ADNI (Alzheimer's Disease Neuroimaging Initiative)~\cite{adni}, ABIDE (Autism Brain Imaging Data Exchange)~\cite{DiMartino2013,DiMartino2017}, BraTS (Brain Tumor Segmentation)~\cite{menze2014multimodal}, and ISLES (Ischemic Stroke Lesion Segmentation)~\cite{Maier2017}. Other \textbf{medical} imaging datasets (\textcolor{red}{red dots}): OCT (retinal optical coherence tomography images)~\cite{kermany2018labeled}, CheXpert (chest radiographs)~\cite{Irvin2019}, ISIC (International Skin Image Collaboration Melanoma Project)~\cite{isic}, and fastMRI (knee MRI)~\cite{DBLP:journals/corr/abs-1811-08839}. \textbf{Natural} image dataset (\textcolor{darkgreen}{green dot}): ImageNet~\cite{deng2009imagenet}.}
\caption{Comparison of the proposed method and 16 other methods on five salient object detection datasets in terms of avgF, wF, and MAE scores. CAGNet with VGG-16, ResNet50, NASNet-Mobile, and NASNet-Large backbones are denoted as CAGNet-V, CAGNet-R, CAGNet-M, and CAGNet-L, respectively. The best score and the second best score under each setting are shown in {\color[HTML]{FE0000}\textbf{red}} and {\color[HTML]{3166FF}\textbf{blue}}, respectively, and the best score under all settings is underlined. The unit of the total number of parameters (denoted as \#Par) is million. Note that the authors of~\cite{zhang2018progressive} did not release the code and provided only the saliency maps, so the total number of parameters cannot be reported for this method.}
\caption{Ablation analysis of our proposed method with different settings. The best results are shown in {\color[HTML]{FE0000} \textbf{red}}.}
\caption{The results of CAGNet-V with different settings for the parameter N. The best results are shown in {\color[HTML]{FE0000} \textbf{red}}. The unit of the total number of parameters (denoted as \#Par) is million.}
\caption{Left column: Growth rates of leading eigenmodes as functions of the wavenumber $k$ for (a) $Re=0$ and $Ra\leq10^4$, (b) $Re=100$ and $Ra\leq 7 \times 10^4$ and (c) $Re=200$ and $Ra \leq 3 \times 10^5$. Solid symbols represent real leading eigenvalues, while hollow symbols represent complex-conjugate pairs of non-real leading eigenvalues. Right column: Eigenvalue spectra for marginally supercritical cases for (d) $Re=0$, $Ra=7 \times 10^3$ and $k=6$, (e) $Re=100$, $Ra=4 \times 10^4$ and $k=4$ and (f) $Re=200$, $Ra=1.1 \times 10^5$ and $k=7$. \textcolor{C0}{$\bullet$} and \textcolor{C3}{$\filledmedsquare$} represent stable eigenvalues and the first unstable eigenvalue, respectively.}
\caption{\textbf{Examples of generating unseen tags.} ``Src'' indicates the input. ``Ref'' is the manually labeled tag. The \textbf{bold phrases} are correctly predicted tags; the {\color{blue}\underline{underlined phrases}} in ``Src'' are potential tag names that never appeared in the training corpus. The above examples show that our sequence-to-sequence model (L2A) is able to produce more accurate tag recommendations than previous text-classification-based models, and handles unseen tags (``\textit{hebei shifandaxue}'') well. See \S~\ref{sec:unseentags} for more details. }
\caption{Probabilistic Embedding (\textcolor{red}{TODO}: replace the point cloud images)}
\caption{We use two self-supervised losses to learn scene flow on large unlabeled datasets. The ``nearest neighbor loss'' penalizes the distance between the predicted point cloud (\textcolor{green}{green}) and each predicted point's nearest neighbor in the second point cloud (\textcolor{red}{red}). To avoid degenerate solutions, we estimate the flow between these predicted points (\textcolor{green}{green}) in the reverse direction back to the original point cloud (\textcolor{blue}{blue}) to form a cycle. The new predicted points from the cycle (\textcolor{magenta}{purple}) should align with the original points (\textcolor{blue}{blue}), and the distance between these two sets of points forms our second self-supervised loss: ``cycle consistency.''}
\caption{Example of our self-supervised errors between point clouds at time $t$, $\mathcal{X}$ (\textcolor{blue}{blue}), and $t+1$, $\mathcal{Y}$ (\textcolor{red}{red}). We consider the point $x$ whose ground truth projected point $x'$ is not known during training time. $(a)$ Nearest Neighbor Loss is computed between the projected point $\hat{x}'$, predicted by the forward flow, and the closest point in $\mathcal{Y}$. $(b)$ The Cycle Consistency Loss tracks this transformed point back into its original frame, as point $\hat{x}''$, using the reverse flow, and computes the distance to the original position ${x}$.}
\caption{Compounding errors cause problems in estimating reverse flow using the transformed point cloud. $(a)$ Large flow prediction errors degrade the structure of the transformed cloud $\hat{\mathcal{X}}'$ (shown in \textcolor{green}{green}). Thus, computing the reverse flow between $\hat{\mathcal{X}}'$ (\textcolor{green}{green}) and $\mathcal{X}$ (\textcolor{blue}{blue}) is an ill-posed task. $(b)$ Using the anchor points of the nearest neighbor (\textcolor{red}{red}), we are able to stabilize the transformed cloud $\bar{\mathcal{X}}'$ (\textcolor{cyan}{cyan}), thus retaining important structural information.}
\caption{Scene flow estimation between point clouds at time $t$ (\textcolor{red}{red}) and $t+1$ (\textcolor{green}{green}) from the KITTI dataset, trained without any labeled lidar data. Our self-supervised method, trained on nuScenes and fine-tuned on KITTI using the self-supervised loss, is shown in \textcolor{blue}{blue}, and the baseline training method, with no fine-tuning, is shown in \textcolor{magenta}{purple}. All models are pretrained on FlyingThings3D using a supervised loss. In the absence of any lidar annotations, our method clearly outperforms the baseline method, which overestimates the flow in many regions. (Best viewed in color)}
\caption{Comparison of our self-supervised method vs.\ the baseline on the unannotated nuScenes dataset. Scene flow is computed between point clouds at time $t$~(\textcolor{red}{red}) and $t+1$~(\textcolor{green}{green}), and the transformed cloud is shown in \textcolor{blue}{blue}. In our method, the predicted point cloud has a much better overlap with the point cloud of the next timestamp than in the baseline. Since the nuScenes dataset does not provide any scene flow annotations, supervised approaches cannot be fine-tuned to this environment.}
\caption{Improved scene flow estimation on annotated lidar data from the KITTI dataset between point clouds at time $t$ (\textcolor{red}{red}) and $t+1$ (\textcolor{green}{green}). Our method, which is fine-tuned on nuScenes using the self-supervised loss and on KITTI using a supervised loss, is shown in \textcolor{blue}{blue}. The baseline method is fine-tuned only on KITTI using a supervised loss and is shown in \textcolor{magenta}{purple}. While in aggregate both methods estimate the scene flow well, the augmented training method (\textcolor{blue}{blue}) is able to more closely match the next-frame point cloud (\textcolor{green}{green}). In several of the cropped scenes, the purely supervised method (\textcolor{magenta}{purple}) underestimates the flow, staying too close to the initial point cloud (\textcolor{red}{red}). (Best viewed in color)}
\caption{Ablation study comparing our self-supervised method with the full self-supervised loss (\textcolor{blue}{blue}, nearest neighbor loss + anchored cycle consistency loss) versus training using only the nearest neighbor loss (\textcolor{magenta}{purple}). Scene flow is computed between point clouds from the KITTI dataset at time $t$ (\textcolor{red}{red}) and $t+1$ (\textcolor{green}{green}).}
\caption{The backbone structure of our model. The first row $(a)$ is a standard feature pyramid network (FPN); the second row $(b)$ is the IPG-Net proposed in this paper, including the IPG sub-network, the backbone network, and the fusing module. The \textcolor{green}{green} bounding box denotes the image pyramid guidance sub-network, which is responsible for guiding the backbone network (within the \textcolor{red}{red} bounding box). The fusing module is within the \textcolor{blue}{blue} bounding box. The \textcolor{blue}{blue} arrow denotes the lateral connection in the FPN and the \textcolor{blue}{blue} features are outputs of the FPN.}
\caption{Techniques and approaches used for uncertainty modeling \textcolor[rgb]{0,0,0}{\cite{soroudi2013decision}}}
\caption{Running power is much less algorithm-dependent on PC-class chips. {\textcolor{xblu}{Blue:}{\it i7-8700 @ 4.6 GHz (Desktop)}} {\textcolor{xgrn}{Green:}{\it i5-8250U @ 3.4 GHz (Laptop)}}}
\caption{\textbf{Pretraining Hellinger epoch change:} Hellinger distance decreases, as expected, when comparing later epochs (48 vs.\ 49, red line) vs.\ earlier epochs (1:48, black curve).}
\caption{\textbf{High vs.\ low Hellinger $H$ neurons:} Neuron token ($_\blacksquare$\textcolor{unsup_wiki_color}{$\blacktriangle$}\textcolor{t}{$\blacktriangle$}) and POS activation probabilities (bars) for epochs 1, 48, 49. Neurons with high $H$ (296) and low $H$ (38) between epochs 1:48.}
\caption{The framework of our Noise2Blur for training a denoising model. (top) First, we use an image filter to create a blurred label for supervision. The result is a noise-free but severely blurred image. (bottom) We add the noise map extracted by the N2B model to a random clean image to generate a new ``noise-to-clean'' objective. The N2B model is then divided into a denoising network (DnNet) and a noise extraction network (NENet). We use the DnNet to learn ``noise-to-clean'', while NENet is supervised by ``noise-to-blur''. These two networks cyclically use each other's knowledge to promote their own learning. The red line ``\textcolor{red}{$\rightarrow$}'' means that signals propagate forward only; the gradient is not backpropagated. }
\caption{Comparison with state-of-the-art methods on the RegDB and SYSU-MM01 datasets. Re-identification rates (\%) at rank $R$ and mAP (\%). $1^{st}$ and $2^{nd}$ best results are indicated by {\color{red}\textbf{red}} and {\color{blue}\textbf{blue}} color, respectively.}
\caption{The distribution of the Euclidean distance between cross-modality (RGB-IR) features. The intra-class and inter-class distances are indicated by \textcolor{myred}{\textbf{red}} and \textcolor{mygreen}{\textbf{green}} color, respectively.}
\caption{{\color{nb}$AD$ and $RSV$ of three candidate flows}}
\caption{{\color{nb2}Spatial and temporal distributions of urban traffic incidents}}
\caption{{\color{nb2}Traffic illustration of SFO}}
\caption{{\color{nb}Evaluation among different methods}}
\caption{{\color{nb}Evaluation among different variants of DIGC-net}}
\caption{{\color{nb}Time-sensitive comparison of SFO}}
\caption{\notablue{{\bf[MAYBE OMIT]}} Radial integral of the atomic transition $n^2D \to (n+1)^2P$ for Rb taken from reference \cite{Sullivan} (Obs) compared with the values computed in this work using $j=3/2$. The parentheses in the second column show the experimental error band.}
\caption{Scaling of the critical temperature $T_c$ (a) and critical density $\rho_c$ (b) with $f_T$. As shown in the legend, red diamonds (\diamond) indicate sequences where both terminal beads are T, blue circles (\circles) mark sequences with two H ends, and orange triangles (\triangle) indicate the sequences with one H and one T end. The dashed lines indicate a quadratic fit to the data in (a) and a linear fit to the data in (b). The inset in (a) shows the deviation of $T_c$ from the quadratic fit.}
\caption{Critical temperature $T_c$ as a function of (a) sequence charge decoration SCD and (b) reweighted $f^*_T$ for all sequences with a critical point. Red diamonds (\diamond) indicate sequences where both terminal beads are T, blue circles (\circles) mark sequences with two H ends, and orange triangles (\triangle) indicate the sequences with one H and one T end. The dashed line in (b) is a quadratic fit. }
\caption{(a) Scaling of the critical temperature $T_c$ with the radius of gyration of a single chain $\langle R_g\rangle$ measured at a fixed temperature of $T=2.0 \epsilon/k_\text{B}$. (b) Same scaling as in (a) for the critical density $\rho_c$. (c) Scaling of $T_c$ with $T_\theta$. (d) Scaling of $\rho_c$ with $T_\theta$. Red diamonds (\diamond) indicate sequences where both terminal beads are T, blue circles (\circles) mark sequences with two H ends, and orange triangles (\triangle) indicate the sequences with mixed terminal beads. The dashed lines are linear fits to the data for sequences with mixed terminal beads.}
\caption{Red diamonds (\diamond) indicate sequences where both terminal beads are T, blue circles (\circles) mark sequences with two H ends, and orange triangles (\triangle) indicate the sequences with mixed terminal beads. The dashed vertical lines separate sequences which form finite aggregates (bottom), then infinite-range aggregates, then sequences that show reentrant phase behavior, and finally sequences that show conventional dense-dilute phase separation.}
\caption{\textbf{\ImNet classification with linear models.} Single-crop top-1 accuracy on the \ImNet validation data as a function of the number of parameters in the model that produces the representation (``A'' represents AlexNet). Pretext-Invariant Representation Learning (\algorithmName) sets a new state-of-the-art in this setting ({\color{Maroon}red marker}) and uses significantly smaller models (\resnetfifty). See Section~\ref{sec:linear_all} for more details. }
\caption{The proposed \emph{\underline{Sim}ulation-assisted scheduling \underline{A}lgorithm \underline{S}election} (\sil{}) approach for the selection of DLS techniques. \sil{} (b) is analogous to a typical control system~(a). The components highlighted in {\color{UnibasMint}mint color} represent the \sil{} additions to a typical loop scheduling system. {The \dlb{}{library}~(cf.~Section~\ref{subsec:dls}) is used for the parallel task scheduling and execution, and \lsim{}~(cf.~Section~\ref{subsec:sim}) is used to predict the application performance with different DLS techniques under perturbations. \sil{}~(cf.~Section~\ref{subsec:sil}) is integrated with \dlb{} to communicate with \lsim{} and to select DLS techniques dynamically during execution.}}
\caption{A {\textcolor{blue}{heavy ball}} of mass $M$ approaches a stationary {\textcolor{red}{light ball}} of mass 1. How many collisions ensue?}
\caption{Additional \aastex\ symbols}
\caption{Analysis of implementation inconsistencies between RetinaNet and FCOS on MS COCO {\tt minival} set. ``\#A=1'' means there is one square anchor box per location.}
\caption{Definition of positives (\colorbox{cyan}{\tiny{1}}) and negatives (\colorbox{lightgray}{\tiny{0}}). The blue box, red box, and red point are the ground-truth box, anchor box, and anchor point, respectively. (a) RetinaNet uses IoU to select positives (\colorbox{cyan}{\tiny{1}}) in the spatial and scale dimensions simultaneously. (b) FCOS first finds candidate positives (\colorbox{cyan}{\tiny{?}}) in the spatial dimension, then selects final positives (\colorbox{cyan}{\tiny{1}}) in the scale dimension.}
\caption{Visualization of ball$_0$, bottle$_0$, bottle$_1$, bottle$_2$, cup$_0$, cup$_1$ from top to bottom. From left to right in each row: object twin with groundtruth keypoint location, scanned CAD model from opaque object, and aligning the marker to the object. We use \textcolor{viz_red}{red} and \textcolor{viz_yellow}{yellow} to mark keypoint \textcolor{viz_red}{1} and \textcolor{viz_yellow}{2}.}
\caption{Visualization of heart$_0$, mug$_0$, mug$_1$, mug$_2$, mug$_3$, mug$_4$ from top to bottom. From left to right in each row: object twin with groundtruth keypoint location, scanned CAD model from opaque object, and aligning the marker to the object. We use \textcolor{viz_red}{red}, \textcolor{viz_yellow}{yellow}, \textcolor{viz_green}{green} and \textcolor{viz_blue}{blue} to mark keypoint \textcolor{viz_red}{1}, \textcolor{viz_yellow}{2}, \textcolor{viz_green}{3} and \textcolor{viz_blue}{4}. For heart$_0$, keypoints \textcolor{viz_green}{3} and \textcolor{viz_blue}{4} are symmetric.}
\caption{Visualization of mug$_5$, mug$_6$, mug$_7$, mug$_8$, sakura$_0$, shovel$_0$ from top to bottom. From left to right in each row: object twin with groundtruth keypoint location, scanned CAD model from opaque object, and aligning the marker to the object. We use \textcolor{viz_red}{red}, \textcolor{viz_yellow}{yellow}, \textcolor{viz_green}{green}, \textcolor{viz_blue}{blue} and \textcolor{viz_purple}{purple} to mark keypoint \textcolor{viz_red}{1}, \textcolor{viz_yellow}{2}, \textcolor{viz_green}{3}, \textcolor{viz_blue}{4} and \textcolor{viz_purple}{5}. For sakura$_0$, all its five keypoints are symmetric.}
\caption{Visualization of star$_0$ and tree$_0$ from top to bottom. From left to right in each row: object twin with groundtruth keypoint location, scanned CAD model from opaque object, and aligning the marker to the object. We use \textcolor{viz_red}{red}, \textcolor{viz_yellow}{yellow}, \textcolor{viz_green}{green}, \textcolor{viz_blue}{blue} and \textcolor{viz_purple}{purple} to mark keypoint \textcolor{viz_red}{1}, \textcolor{viz_yellow}{2}, \textcolor{viz_green}{3}, \textcolor{viz_blue}{4} and \textcolor{viz_purple}{5}. For star$_0$, all its five keypoints are symmetric. For tree$_0$, keypoints \textcolor{viz_green}{3} and \textcolor{viz_blue}{4} are symmetric.}
\caption{Visualization of ablation study: without (left) vs. with (right) permutation loss. We use \textcolor{viz_red}{red}, \textcolor{viz_yellow}{yellow}, \textcolor{viz_green}{green} and \textcolor{viz_blue}{blue} to mark keypoint \textcolor{viz_red}{1}, \textcolor{viz_yellow}{2}, \textcolor{viz_green}{3} and \textcolor{viz_blue}{4}. Instance tree$_0$ has symmetric keypoints \textcolor{viz_green}{3} and \textcolor{viz_blue}{4}.}
\caption{Visualization of prediction results on validation set. From left to right in each row: left stereo image with groundtruth keypoint locations, right stereo image, predicted probability map for the first keypoint, predicted probability map for the second keypoint (none for the ball$_0$) and predicted keypoints. We use \textcolor{viz_red}{red}, \textcolor{viz_yellow}{yellow}, \textcolor{viz_green}{green} and \textcolor{viz_blue}{blue} to mark keypoint \textcolor{viz_red}{1}, \textcolor{viz_yellow}{2}, \textcolor{viz_green}{3} and \textcolor{viz_blue}{4}.}
\caption{Qualitative results of the \textcolor{blue}{baseline} method in {\bf PROX-E}. %in the 4 test scenes of {\bf PROX-E}.
The results before and after the scene geometry-aware fitting are shown.}
\caption{Qualitative results of \textcolor{blue}{S1} in {\bf PROX-E}. The results before and after the scene geometry-aware fitting are shown.}
\caption{Qualitative results of the \textcolor{blue}{baseline} with fitting in {\bf MP3D-R}. We argue that our modifications to \cite{li2019putting} are necessary and favorable for producing high-quality 3D human bodies. For the quantitative comparison, please refer to Tab.~\ref{tab:res_gen_diversity}, Tab.~\ref{tab:res_gen_physics} and Tab.~\ref{tab:user_study}.}
\caption{Qualitative results of \textcolor{blue}{ S1} with fitting in {\bf MP3D-R}.}
\caption{Latency round trip delay for information exchange between the robot and the cloud server. \textcolor{red}{Check this caption. It seems to be edited.}}
\caption{The framework of risk analysis: the circle and cross symbols represent different classes, and the mislabeled pairs are highlighted by {\color[rgb]{0.7529,0,0} red}.}
\caption{Circles: Shear viscosity (in Pa s) as a function of shear rate (1/s) for the Boger (B, $\bullet$) and Newtonian (N, $\circ$) liquids. The dash-dotted line shows a power-law viscosity fit (Eq.~\ref{eqn:PowerLaw}) with $m=0.97$ Pa s$^n$ and $n=0.94$ for $\dot{\gamma}<20$ s$^{-1}$. Squares: First normal stress difference, $N_1$, ({\color{blue} $\blacksquare$}), as a function of shear rate. The dashed blue line shows the fit to an Oldroyd-B model (Eq.~\ref{eqn:OlroydB}). The two vertical lines show the window of shear rates within which the experiments were conducted ($5.5<\omega<31.0$ s$^{-1}$).}
\caption{\textcolor{blue}{Contents information data - need to add lecture}}
\caption{ TOP:~A concept drawing of the \phasei\ ANNIE detector system, showing the positions of the upper left corner of the neutron capture volume (NCV) described in~\cref{sec:exp_design}. BOTTOM:~A concept drawing of the complete \phaseii\ detector. The solid blue line indicates the optically isolated active volume of the detector and the dotted blue line indicates the fiducial volume optimized for the \phaseii\ physics measurement.}
\caption{Successive applications of each of the neutron candidate event criteria for beam data taken at position O (center of the tank). \textbf{SOLID BLACK}: Time distribution of all pulses recorded in DAQ \modeb\ on NCV PMT\#1 at position O. A time of zero corresponds to beam arrival. No analysis cuts have been applied to these data. \textcolor{cyan}{\textbf{SOLID CYAN}}: Time distribution of all NCV coincidences from the same dataset. \textcolor{black}{\textbf{DOTTED BLACK}}: Events from the blue histogram that passed the after-pulsing cut. Note that this cut was not applied to events that occurred in the first \SI{20}{\micro\second} after the start of the beam spill. \textcolor{MyGreen}{\textbf{DASHED GREEN}}: Events from the red histogram that passed the total charge cut. \textcolor{orange}{\textbf{DASH-DOTTED ORANGE}}: Events from the green histogram that passed the water PMT veto cut.}
\caption{Technical summary of the tested laser scanners. Differences between similar models are marked \textcolor{dai_deepred}{red}.}
\caption{Comparing DY-MobileNetV2 with MobileNetV2 per class on the ImageNet validation set \cite{deng2009imagenet}. We calculate the top-1 accuracy per class using both DY-MobileNetV2 and MobileNetV2 with four different width multipliers (0.35, 0.5, 0.75 and 1.0), and plot the results for all 1000 classes. Each dot corresponds to an image class. Darker dots indicate that multiple classes overlap at these positions. DY-MobileNetV2 is more accurate than its static counterpart for the majority of classes (above the diagonal \textcolor{red}{red} line), ranging from easier classes to harder classes. Each dynamic convolution layer in DY-MobileNetV2 has $K=4$ convolution kernels. Best viewed in color.}
\caption{Examples of localizing actions in an untrimmed video. In the two-dimensional temporal map, the \textit{black} vertical and horizontal axes represent the start and end frame indices, while the \textcolor{gray}{\textit{gray}} axes represent the corresponding start and end times in the video. The values in the 2D map, highlighted in red, indicate the overlap scores between the candidate proposals and the target proposal. Here, $\tau$ is a short duration determined by the video length and sampling rate. }
\caption{The framework of our proposed Sparse 2D Temporal Adjacent Network. It consists of a 2D temporal feature map extractor for video representation and a temporal adjacent network for proposal generation. \textcolor{blue}{\textit{Blue}} boxes represent the proposals we select and the \textcolor{gray}{\textit{gray}} boxes represent the proposals that are not selected. All the \textit{transparent} boxes represent invalid proposals. \textcolor{blue}{\textit{Blue}} boxes with \textcolor{violet}{\textit{violet}}, \textcolor{orange}{\textit{orange}} and \textcolor{green}{\textit{green}} margins represent the short, medium and long proposals we selected. }
\caption{Benchmark results of state-of-the-art SISR methods. We calculate the average PSNR (dB)/SSIM values on the Set5, Set14, B100, Urban100 and Manga109 datasets with scale factors $\times 2$, $\times 3$ and $\times 4$. {\color{red}Red} and {\color{blue}blue} indicate the best and the second-best performance, respectively. The metrics are calculated on the Y channel (luminance channel of the YCbCr color space).}
\caption{Average running time for scale factor $\times 4$ at three different resolution settings: 480 $\times$ 360, 640 $\times$ 480 and 1280 $\times$ 720. The testing is conducted on a PC equipped with an NVIDIA Quadro P6000 GPU (24 GB memory). {\color{red}Red} and {\color{blue}blue} indicate the best and the second-best performance, respectively. }
\caption{Benchmark results of SISR models trained on the high-resolution DIV2K dataset. Average PSNR and SSIM values are calculated for scale factors $\times2$, $\times3$ and $\times4$ on the Set5, Set14, B100, Urban100 and Manga109 datasets. {\color{red}Red} and {\color{blue}blue} indicate the best and the second-best performance, respectively.}
\caption{a) Representative example of the MSD (same droplet as in Fig.~\ref{fig:Setup-DropTrajectory}b). Small black dots are experimental data and the red line is the fit using equation (\ref{eq:ec1}). The persistence time is indicated by the arrow. The inset shows the same data on a log-log scale. The two lines are power laws with exponents 2 and 1. b) Measured diffusion coefficient $D$ as a function of $\tau \left\langle \mathbf{V}^{2} \right\rangle$ for all measured drops. The data for different bacterial concentrations collapse onto a straight line with slope $\alpha = 0.57$ (red line). The reference concentration is $n_0=5.14 \times 10^8~\si{bact/\milli\liter}$.}
\label{fig:MSD-D}
\end{figure}

The observations present a wide variability; droplets with the same radius and bacterial concentration can present very different values of $V$, $\tau$, and $D$. Despite this variability of the data, individual values of $D$, $\left\langle\mathbf{V}^{2} \right\rangle$ and $\tau$ for each droplet collapse onto the line $D = \alpha \tau \left\langle \mathbf{V}^{2} \right\rangle$ (Fig.~\ref{fig:MSD-D}b).
%, showing that droplets that move faster and more persistently have a larger diffusion coefficient.
For a persistent random walk with a velocity autocorrelation function $\left\langle\mathbf{V}(t) \cdot \mathbf{V}(0) \right\rangle= \left\langle \mathbf{V}^{2} \right\rangle\exp(-t/\tau)$, one obtains $\alpha=1/2$ (see Ref.~\cite{Martens2012}). The best fit gives $\alpha=0.57\pm 0.01$. The difference with the persistent random walk prediction suggests that the velocity autocorrelation function is not a single exponential, but the experimental precision does not allow us to discriminate among different models.

\subsection{Dependence on bacterial concentration}
The movement of droplets was studied as a function of the bacterial concentration $n$ in the bacterial suspension.
Droplets with radii from \SI{20}{\micro\meter} to \SI{30}{\micro\meter} were selected, obtaining between 100 and 170 drops for each bacterial concentration. The average diffusion coefficient, persistence time and average speed were calculated at each concentration, as shown in Figs.~\ref{fig:Fig3}a-c. Although dispersion remains high, an increasing trend of $D$ with $n$ can be observed. The data were fitted to a linear function $D = D_0+D_1 n$, with $D_0 = (-0.03\pm0.15)~\si{\micro\meter^2/\second}$ and $D_1 = (19 \pm 11)~\si{\micro\meter^5/\second/bact}$. The persistence time and the average speed also increase with the bacterial concentration.
\begin{figure*}[t!]
\includegraphics[width=0.33\linewidth]{Fig3a}\hfill
\includegraphics[width=0.33\linewidth]{Fig3b}\hfill
\includegraphics[width=0.33\linewidth]{Fig3c}\\
\includegraphics[width=0.33\linewidth]{Fig3d}\hfill
\includegraphics[width=0.33\linewidth]{Fig3e}\hfill
\includegraphics[width=0.33\linewidth]{Fig3f}
\caption{ a) Diffusion coefficient, b) persistence time, and c) mean speed as a function of bacterial concentration, averaged for droplets with radii in the range \SI{20}{\micro\meter} to \SI{30}{\micro\meter}. Symbols represent experimental data and the red line in a) corresponds to a linear fit. d) Diffusion coefficient, e) persistence time, and f) mean speed as a function of droplet radius for fixed bacterial concentration $n = (2.25 \pm 0.14) \times 10^{10}$~bact/mL. The drops are grouped within windows of increasing radius in \SI{10}{\micro\meter} increments.
For a-c, the error bars indicate the standard error around the mean value, while in d-e, the horizontal error bars represent the standard deviation of the drop radius and the vertical error bar is the confidence interval at 63$\%$.}
\label{fig:Fig3}
\end{figure*}

\subsection{Dependence on the drop radius}
To analyze the dependence on the droplet radius, we consider the case of maximal bacterial concentration, $n = (2.25 \pm 0.14) \times 10^{10}$~bact/mL, and the drops are grouped within windows of increasing radius in \SI{10}{\micro\meter} increments. The results for $D$, $\tau$, and $V$ as a function of $R$ are presented in Figs.~\ref{fig:Fig3}d-f. For each radius window, a large variability exists for the diffusion coefficient, the persistence time and the average drop speed. The average diffusion coefficient and persistence time remain approximately constant at $D \sim \SI{0.5}{\micro\meter^2/\second}$ and $\tau \sim \SI{0.3}{\second}$. The average speed increases slightly with the drop radius and its average value is $V \sim \SI{1.5}{\micro\meter/\second}$.

\subsection{Internal flows}
Internal flows in the bottom of the drops were recorded in fluorescence near the contact surface with the substrate. Fluctuating vortical flows driven by the bacterial activity are observed (movie S3). Velocity fields $\mathbf{v}(x,y,t)$ are obtained with PIV in a small region of the bottom of the drops, corresponding to the brightest portion of the image. For calculation purposes, we chose an observation region of radius $R_{\rm obs} = \sqrt{7}R/4 $ that corresponds to a height $h_{\rm obs} = R/4$ measured from the bottom (Fig.~\ref{fig:Fig4}a).
\begin{figure*}[t!]
\includegraphics[width=0.32\linewidth]{Fig4a}\hfill
\includegraphics[width=0.32\linewidth]{Fig4b}\hfill
\includegraphics[width=0.32\linewidth]{Fig4c}\hfill
\caption{a) Velocity field in the bottom part of a drop, obtained by PIV (black arrows).
The red arrow is the weighted average internal velocity $\mathbf{v}_{b}$, while the blue arrow is the instantaneous velocity of the droplet $\mathbf{V}$. For clarity, the red and blue arrows are enlarged by a factor 10 with respect to the scale. The inset shows a fluorescence image of the bottom part of a drop. The outer white circle marks the boundary of the droplet, of radius $R= \SI{28}{\micro\meter}$. The PIV analysis was performed within the inner red circle of radius $R_{\rm obs}= \SI{18}{\micro\meter}$. b) Correlation length $l_{\rm corr}$ and c) correlation time $t_{\rm corr}$ for the velocity field inside the droplets as a function of drop radius $R$ and mean speed of the internal flow, $v_{\rm int}$ (color scale). Inset: Correlation time of the internal flow, $t_{\rm corr}$, compared with the persistence time of the drop motion, $\tau$. The red straight line has slope one and is a guide to the eye.}
\label{fig:Fig4}
\end{figure*}

Several drops are measured and, from each velocity field, the mean speed of the internal flow $v_{\rm int}\equiv\langle |\mathbf{v}(x,y,t)|\rangle$ is computed, where the average is over time and space. Also, the average over time of the correlation length $l_{\rm corr}$ for the velocity field is calculated as described in Ref.~\cite{Sokolov2012}. The dependence of $l_{\rm corr}$ on drop radius $R$ and mean internal speed $v_{\rm int}$ is presented in Fig.~\ref{fig:Fig4}b. For drop radii between \SI{20}{\micro\meter} and \SI{50}{\micro\meter}, $l_{\rm corr}$ increases rapidly from $\approx \SI{3}{\micro\meter}$ to $\approx \SI{7}{\micro\meter}$, with relatively little dispersion. For larger drop radii, $l_{\rm corr}$ seems to plateau at a value of $\approx \SI{8}{\micro\meter}$, but with larger dispersion. The dependence of $l_{\rm corr}$ on $v_{\rm int}$ (color scale of Fig.~\ref{fig:Fig4}b) presents a larger dispersion, but in general $l_{\rm corr}$ increases with $v_{\rm int}$.
Finally, the correlation time of the velocity field, $t_{\rm corr}$, was obtained as follows. For each position in the observation region, the temporal self-correlation function of the velocity is obtained. From it, we compute the first moment, and the average of these gives $t_{\rm corr}$. Figure~\ref{fig:Fig4}c shows that the correlation time is almost insensitive to the droplet radius, with a mean value of $t_{\rm corr} = (0.30 \pm 0.08)~\si{\second}$.

%%%%%%%%%%%%%%DISCUSSION%%%%%%%%%%%%%%%%%%
\section{Discussion}
\label{sec:Discussion}

\subsection{Bacterial activity enhances droplet diffusion}
The thermal diffusion coefficient of a sphere of radius $R$ in a medium of viscosity $\eta$ at temperature $T$ is given by $D_{\rm th} =k_{\rm B}T/(6\pi \eta R)$. In the case of a liquid drop of viscosity $\eta'$, a factor $C(\eta'/\eta)$ must be included, accounting for the slip condition at the drop surface, $D_{\rm th} = k_{\rm B}T/[C(\eta'/\eta) \eta R]$ \cite{Bond1927}. Using $\eta' = \SI{1}{\milli\pascal\second}$ for water and $\eta = \SI{3.47}{\milli\pascal\second}$ for hexadecane gives $C=13.9$. With $T = \SI{20}{\celsius}$ and $R = \SI{20}{\micro\meter}$ (the smallest radius considered here), we obtain $D_{\rm th} \approx \SI{0.4e-2}{\micro\meter^2/\second}$. This value is two orders of magnitude smaller than the average diffusion coefficient $D \approx \SI{0.3}{\micro\meter^2/\second}$ that we measure for drops at the higher bacterial concentration used here. For a droplet of mass $M$, its stopping time in hexadecane is $\tau_{\rm stop}=M/[C(\eta'/\eta) \eta R]$. The same parameters used to estimate $D_{\rm th}$ give $\tau_{\rm stop}=\SI{3.3e-5}{\second}$, which is much smaller than the persistence time $\tau$ of the ballistic motion of the drops. Our results show no significant dependence of the diffusion coefficient on the drop radius (Fig.~\ref{fig:Fig3}d), as opposed to the $1/R$ scaling of $D_{\rm th}$.
This is also true for the mean persistence time $\tau$ (Fig.~\ref{fig:Fig3}e). The mean droplet speed $V$ (Fig.~\ref{fig:Fig3}f) exhibits a slight increase with the drop radius that could be due to the loss of bacterial activity, which occurs faster for smaller droplets. Finally, the diffusivity of the droplets increases linearly with the concentration of the bacterial suspension (Fig.~\ref{fig:Fig3}a). The intercept found with the linear fit is consistent with the thermal diffusivity $D_{\rm th}$ expected for drops in the size range studied here. These observations indicate that bacterial activity is the motor of the movement that we observe in the drops. In the following we present a model to explain the drop movement based on the internal flows driven by the bacteria.

\subsection{Driving mechanism}
In dense suspensions, swimming bacteria usually organize in collective motions. Inside the spherical drops that we study, these collective motions translate into vortices that appear, move and disappear continuously, with a characteristic size $l_{\rm corr}$ and lifetime $t_{\rm corr}$. The value of $l_{\rm corr}$ tends to increase with the drop radius, probably because of confinement effects (Fig.~\ref{fig:Fig4}b). Comparison of the characteristic duration of these collective motions with the persistence time of the ballistic motion of the droplet (Fig.~\ref{fig:Fig4}c, inset) reveals that both are restricted to the same range, around \SI{0.3}{\second}.
\begin{figure*}[h]
\raisebox{0.1\height}{\includegraphics[width=0.32\linewidth]{Fig5a}}\hfill
\includegraphics[width=0.32\linewidth]{Fig5b} \hfill
\includegraphics[width=0.34\linewidth]{Fig5c}
\caption{a) Schematic of the rolling with slipping of a drop. $\mathbf v_{\rm b}$ is obtained by averaging the velocity field over the observation region delimited by $R_{\rm obs}$, using the weight function $w$ (eq.~\ref{eq.weigth}), which depends on the height $h$ relative to the bottom of the droplet.
The thickness of the lubrication film $\epsilon$ is shown out of scale for clarity. b) $V/v_{\rm b}$ as a function of drop radius $R$. Symbols represent experimental data, averaged over drops within windows of increasing radius in \SI{10}{\micro\meter} increments. Horizontal error bars correspond to the standard deviation of the drop radius, while vertical error bars represent the confidence interval at 63\%. The red curve corresponds to eq.~(\ref{eq:RollingModel}). c) Probability density function for the angle between $\mathbf{v}_{\rm b}$ and $\mathbf{V}$. The red line is the fit to the von Mises distribution, eq.~(\ref{eq:vonMises}), showing a peak at $\theta = \pi$.}
\label{fig:Fig5}
\end{figure*}

Experimental results indicate that the bacterial motion inside the drops is responsible for the movement of the drops. The bacterial currents are highly fluctuating, with correlation lengths smaller than the droplet radii. Hence, if the droplet were suspended in an infinite fluid medium, the drag forces on different regions of the droplet surface would almost cancel out, resulting in small instantaneous values.
%
We hypothesize that the presence of the bottom glass substrate, on the contrary, enhances the drag effect of these fluctuating currents. In effect, only a small region of the droplet is in contact with the substrate, where it is more probable to observe coherent motion. As the friction close to the solid surface becomes larger, the cancelation of the drag forces described above does not take place.
%
The mechanism that we propose is a ``rolling with slipping''. When the bacterial motion inside the drops produces a patch of directed flow in the lowest part of the drop, the thin lubrication film existing between the glass substrate and the drop is sheared in the direction of the flow. This shear, in turn, creates a net force on the drop that causes its movement in the direction opposite to the inner flow, as schematically shown in Fig.~\ref{fig:Fig5}a.
%
By lubrication theory, we can approximate the motion of the bottom part of the droplet as having an angular velocity $\Omega=v_{\rm b}/R$, where $v_{\rm b}$ is the speed of the inner flow at the bottom of the drop. Note that we are not assuming that the droplet rotates as a whole because only the bottom region of the droplet is relevant for the lubrication theory calculation. Considering that the droplet moves at a velocity $\mathbf V$, the total hydrodynamic force on the droplet is $\mathbf F^\text{hydro} = \mathcal R^{FU} \mathbf V +\mathcal{R}^{F\Omega}\boldsymbol\Omega$, where $\mathcal R^{FU}$ and $\mathcal R^{F\Omega}$ are resistance tensors, which depend on the lubrication layer thickness $\epsilon$ \cite{dunstan2012two}. At vanishing Reynolds number, the total hydrodynamic force must vanish, resulting in
\begin{equation}\label{eq:RollingModel}
\mathbf{V} = - \frac{\mathcal R^{F\Omega}_{/\mkern-5mu/}}{R \mathcal R^{FU}_{xy}} \mathbf v_{\rm b} = -\left(\frac{2 \ln(\epsilon/R)/15+0.2526}{8 \ln(\epsilon/R)/15- 0.9588}\right) \mathbf v_{\rm b},
\end{equation}
where we used the expressions for the resistance tensors in Ref.~\cite{dunstan2012two}. In eq.~(\ref{eq:RollingModel}), $\mathbf v_{\rm b}$ is the relevant velocity at the bottom of the drop, caused by the bacterial motion, and the minus sign indicates that the drop velocity $\mathbf V$ is antiparallel to $\mathbf v_{\rm b}$.

To test this hypothesis, we define a weighted average internal velocity $\mathbf{v}_{\rm b}$, with a weight function $w(x,y)$,
\begin{equation}
w(x,y) = \frac{1}{h(x,y)^{2} + \epsilon^{2}},
\label{eq.weigth}
\end{equation}
where $(x,y)$ represent the horizontal coordinates of a point in the PIV velocity field and $h(x,y)$ is the height from the lowest point of the droplet to the interface at position $(x,y)$ (see Fig.~\ref{fig:Fig5}a). The square in $h$ is used to give more weight to the points near the glass surface and roughly models the decay of the flow produced by force dipoles.
$\epsilon$ is a regularizer to avoid divergences and is estimated to be of the order of the lubrication film thickness. We take $\epsilon = \SI{20}{\nano\meter}$. Then,
\begin{equation}
\mathbf{v}_{\rm b}(t) = \frac{\sum_{x,y} \mathbf{v}(x,y,t) w(x,y)}{\sum_{x,y} w(x,y)}.
\end{equation}
This weighted average velocity is compared with the velocity of the droplet $\mathbf{V}$ obtained by the drop tracking. Figure~\ref{fig:Fig5}b shows our experimental results of $V/v_{\rm b}$ as a function of $R$, together with the slippery rolling model, eq.~(\ref{eq:RollingModel}). Despite the variability in the experimental data, the agreement between the experiment and the model is very good, except for the smallest droplets.

In general, $\mathbf{v}_{\rm b}$ and $\mathbf{V}$ point in opposite directions, as shown by the red and blue arrows in Fig.~\ref{fig:Fig4}a, respectively. To quantify this antiparallel behavior, the angle $\theta$ between $\mathbf{v}_{\rm b}$ and $\mathbf{V}$ is determined in the range $\left[0,2 \pi \right]$ along the whole trajectory for all droplets. The probability density function (PDF) of $\theta$ is presented in Fig.~\ref{fig:Fig5}c. A peak near $\theta = \pi$ is clearly evident. The location of the peak and the width of the distribution are determined by fitting a von Mises distribution
\begin{equation}
\label{eq:vonMises}
P(\theta) = \frac{1}{2 \pi I_{0}(k)}e^{k\cos(\theta-\theta_0)},
\end{equation}
where $I_{0}$ is the modified Bessel function of the first kind and $\theta_0 = \pi +\delta$ is the position of the center of the distribution, with $\delta$ the deviation from the perfect antiparallel alignment between $\mathbf v_{\rm b}$ and $\mathbf V$. Finally, $k$ is the concentration parameter, which sets the width of the distribution. For $k = 0$ the distribution is uniform and for $k > 0$ it becomes increasingly concentrated around $\theta_0$. Fitting of the PDF with the von Mises distribution yields $k = 0.80$ and $\delta=-0.0015$ (Fig.~\ref{fig:Fig5}c, red line).
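As an illustrative check of these fitted values (a back-of-the-envelope evaluation that neglects the small offset $\delta$, i.e. assumes $\theta_0 \simeq \pi$), the ratio between the probability densities at the antiparallel and parallel orientations follows directly from eq.~(\ref{eq:vonMises}):
\[
\frac{P(\pi)}{P(0)} \simeq \frac{e^{k\cos(0)}}{e^{k\cos(-\pi)}} = e^{2k} = e^{1.60} \approx 5.0 .
\]
Under this assumption, antiparallel configurations are roughly five times more probable than parallel ones.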
Since $\delta \ll \sqrt{k}$ and $P(\pi)/P(0)=\exp(2k)\gg1$, the vectors are, on average, effectively antiparallel.

%%%%%%%%%%%%%%CONCLUSIONS%%%%%%%%%%%%%%%%%%
\section{Conclusions}
\label{sec:Conclusions}
In this work we have shown that dense suspensions of motile bacteria encapsulated inside emulsion drops are able to transfer movement to the drops. These bacterially propelled drops perform a persistent Brownian motion, which we have systematically studied as a function of the bacterial concentration and drop radius.
%
The diffusion coefficient, average speed and persistence time of the droplets present a wide variability. A possible origin of this variability is that, although the prepared suspensions have well-controlled bacterial concentrations, the droplet production by agitation could induce concentration inhomogeneities, resulting in droplets of unequal concentration. Further studies are needed to test this hypothesis.
%
We have shown that coordinated bacterial activity and the presence of a substrate are essential for the propulsion of the droplets. We demonstrate that bacteria drive the droplet by comparing the velocity of the center of mass of the drop with a relevant average velocity of the bacterial suspension in the bottom of the drop, and also by comparing the lifetime of the bacterial collective motions with the persistence time of the droplet motion in the ballistic regime. In this way we show that the encapsulation of active particles can create another active particle, a motor made of motors~\cite{Sanchez2012}. The power extracted from this motor is quite low; for a drop of typical radius $R \sim \SI{50}{\micro\meter}$ we estimate it at $P_{\rm drop} = C(\eta'/\eta)\eta R V^2 \sim \SI{0.01}{\femto\watt}$, comparable to previous works~\cite{DiLeonardo2010}, but three orders of magnitude below the maximum power available from the bacterial bath for the corresponding drop volume of \SI{5e-7}{\milli\liter} ($\sim \SI{50}{\femto\watt}$).
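As a rough consistency check of this estimate (an illustrative evaluation that assumes the typical speed $V \sim \SI{1.5}{\micro\meter/\second}$ reported above, together with the same $C = 13.9$ and $\eta = \SI{3.47}{\milli\pascal\second}$ used for $D_{\rm th}$),
\[
P_{\rm drop} \approx 13.9 \times \left(\SI{3.47e-3}{\pascal\second}\right) \times \left(\SI{50e-6}{\meter}\right) \times \left(\SI{1.5e-6}{\meter/\second}\right)^{2} \approx \SI{5e-18}{\watt},
\]
that is, about \SI{5e-3}{\femto\watt}, which agrees with the quoted order of magnitude and remains far below the power available from the bacterial bath.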
This leaves ample room for improvement; however, most of the bacterial power will be dissipated as viscous heating, limiting the efficiency of mesoscopic bacterial motors.

Some limitations remain a challenge due to the biological properties of the living particles, for example, the loss of activity over time. Its inhibition through an appropriate regenerating system for the chemical environment should enhance the droplets' lifetime, as shown in Refs.~\cite{Sanchez2012, Keber2014}. In comparison with these experiments with microtubules and kinesin molecular motors, {\it E. coli} are simple to culture and handle, representing a widespread experimental model for active matter. In this case, the use of mutant strains with higher resistance to oxygen depletion and/or accumulation of detritus~\cite{Keymer2008} should prove beneficial. Moreover, the chemotactic behavior of bacteria represents an interesting possibility to control the drop trajectory by external, manipulable fields. Thus, bacteria could be used to transport components within the droplet, or even to transport themselves and produce, in the right place, other components of interest, like proteins or enzymes to be used in medical treatments or biochemical processes.

%\section*{Conflicts of interest}
%``There are no conflicts to declare''.
\section*{Acknowledgements}
The authors would like to thank E. Clement for valuable discussions and J. Keymer and J. Noorlag for providing us with the {\it E. coli} strain and assistance in the culture protocols. G.R. thanks CONICYT grant Doctorado Nacional 21150648. This work is supported by the Millenium Nucleus Physics of Active Matter of the Millenium Scientific Initiative of the Ministry of Economy, Development and Tourism (Chile), and Fondecyt grants No. 1180791 and 1170411. Observation chambers were fabricated in the Laboratory of Optical Lithography, built thanks to Fondequip grants EQM140055 and EQM180009.
%%%END OF MAIN TEXT%%%
%The \balance command can be used to balance the columns on the final page if desired. It should be placed anywhere within the first column of the last page.
\balance
%If notes are included in your references you can change the title from 'References' to 'Notes and references' using the following command:
%\renewcommand\refname{Notes and references}
%%%REFERENCES%%%
%\bibliography{biblio_V5} %You need to replace "rsc" on this line with the name of your .bib file
%\bibliographystyle{rsc} %the RSC's .bst file
\providecommand*{\mcitethebibliography}{\thebibliography}
\csname @ifundefined\endcsname{endmcitethebibliography}{\let\endmcitethebibliography\endthebibliography}{}
\begin{mcitethebibliography}{26}
\providecommand*{\natexlab}[1]{#1}
\providecommand*{\mciteSetBstSublistMode}[1]{}
\providecommand*{\mciteSetBstMaxWidthForm}[2]{}
\providecommand*{\mciteBstWouldAddEndPuncttrue}{\def\EndOfBibitem{\unskip.}}
\providecommand*{\mciteBstWouldAddEndPunctfalse}{\let\EndOfBibitem\relax}
\providecommand*{\mciteSetBstMidEndSepPunct}[3]{}
\providecommand*{\mciteSetBstSublistLabelBeginEnd}[3]{}
\providecommand*{\EndOfBibitem}{}
\mciteSetBstSublistMode{f}
\mciteSetBstMaxWidthForm{subitem}{(\emph{\alph{mcitesubitemcount}})}
\mciteSetBstSublistLabelBeginEnd{\mcitemaxwidthsubitemform\space}{\relax}{\relax}

\bibitem[Marchetti \emph{et~al.}(2013)Marchetti, Joanny, Ramaswamy, Liverpool, Prost, Rao, and Simha]{Marchetti2013}
M.~C. Marchetti, J.-F. Joanny, S.~Ramaswamy, T.~B. Liverpool, J.~Prost, M.~Rao and R.~A. Simha, \emph{Rev. Mod. Phys.}, 2013, \textbf{85}, 1143--1189\relax
\mciteBstWouldAddEndPuncttrue
\mciteSetBstMidEndSepPunct{\mcitedefaultmidpunct}{\mcitedefaultendpunct}{\mcitedefaultseppunct}\relax
\EndOfBibitem
\bibitem[Bustamante \emph{et~al.}(2001)Bustamante, Keller, and Oster]{Bustamante2001}
C.~Bustamante, D.~Keller and G.~Oster, \emph{Acc. Chem.
Res.}, 2001, \textbf{34}, 412--420\relax \mciteBstWouldAddEndPuncttrue \mciteSetBstMidEndSepPunct{\mcitedefaultmidpunct}{\mcitedefaultendpunct}{\mcitedefaultseppunct}\relax \EndOfBibitem \bibitem[Sanchez \emph{et~al.}(2012)Sanchez, Chen, DeCamp, Heymann, and Dogic]{Sanchez2012} T.~Sanchez, D.~T.~N. Chen, S.~J. DeCamp, M.~Heymann and Z.~Dogic, \emph{Nature}, 2012, \textbf{491}, 431--434\relax \mciteBstWouldAddEndPuncttrue \mciteSetBstMidEndSepPunct{\mcitedefaultmidpunct}{\mcitedefaultendpunct}{\mcitedefaultseppunct}\relax \EndOfBibitem \bibitem[Keber \emph{et~al.}(2014)Keber, Loiseau, Sanchez, DeCamp, Giomi, Bowick, Marchetti, Dogic, and Bausch]{Keber2014} F.~C. Keber, E.~Loiseau, T.~Sanchez, S.~J. DeCamp, L.~Giomi, M.~J. Bowick, M.~C. Marchetti, Z.~Dogic and A.~R. Bausch, \emph{Science}, 2014, \textbf{345}, 1135--1139\relax \mciteBstWouldAddEndPuncttrue \mciteSetBstMidEndSepPunct{\mcitedefaultmidpunct}{\mcitedefaultendpunct}{\mcitedefaultseppunct}\relax \EndOfBibitem \bibitem[Sokolov \emph{et~al.}(2010)Sokolov, Apodaca, Grzybowski, and Aranson]{Sokolov2010} A.~Sokolov, M.~M. Apodaca, B.~A. Grzybowski and I.~S. Aranson, \emph{Proc. Natl. Acad. Sci. USA}, 2010, \textbf{107}, 969--974\relax \mciteBstWouldAddEndPuncttrue \mciteSetBstMidEndSepPunct{\mcitedefaultmidpunct}{\mcitedefaultendpunct}{\mcitedefaultseppunct}\relax \EndOfBibitem \bibitem[Di~Leonardo \emph{et~al.}(2010)Di~Leonardo, Angelani, Dell'Arciprete, Ruocco, Iebba, Schippa, Conte, Mecarini, De~Angelis, and di~Fabrizio]{DiLeonardo2010} R.~Di~Leonardo, L.~Angelani, D.~Dell'Arciprete, G.~Ruocco, V.~Iebba, S.~Schippa, M.~P. Conte, F.~Mecarini, F.~De~Angelis and E.~di~Fabrizio, \emph{Proc. Natl. Acad. Sci. USA}, 2010, \textbf{107}, 9541--9545\relax \mciteBstWouldAddEndPuncttrue \mciteSetBstMidEndSepPunct{\mcitedefaultmidpunct}{\mcitedefaultendpunct}{\mcitedefaultseppunct}\relax \EndOfBibitem \bibitem[L\'opez\emph{et~al.}(2015)L\'opez, Gachelin, Douarche, Auradou, and Cl\'ement]{Lopez2015} H.~M. 
L\'opez, J.~Gachelin, C.~Douarche, H.~Auradou and E.~Cl\'ement,\emph{Phys. Rev. Lett.}, 2015, \textbf{115}, 028301\relax \mciteBstWouldAddEndPuncttrue \mciteSetBstMidEndSepPunct{\mcitedefaultmidpunct}{\mcitedefaultendpunct}{\mcitedefaultseppunct}\relax \EndOfBibitem \bibitem[Drescher \emph{et~al.}(2011)Drescher, Dunkel, Cisneros, Ganguly, and Goldstein]{Drescher2011} K.~Drescher, J.~Dunkel, L.~H. Cisneros, S.~Ganguly and R.~E. Goldstein, \emph{Proc. Natl. Acad. Sci. USA}, 2011, \textbf{108}, 10940--10945\relax \mciteBstWouldAddEndPuncttrue \mciteSetBstMidEndSepPunct{\mcitedefaultmidpunct}{\mcitedefaultendpunct}{\mcitedefaultseppunct}\relax \EndOfBibitem \bibitem[Wioland \emph{et~al.}(2013)Wioland, Woodhouse, Dunkel, Kessler, and Goldstein]{Wioland2013} H.~Wioland, F.~G. Woodhouse, J.~Dunkel, J.~O. Kessler and R.~E. Goldstein, \emph{Phys. Rev. Lett.}, 2013, \textbf{110}, 268102\relax \mciteBstWouldAddEndPuncttrue \mciteSetBstMidEndSepPunct{\mcitedefaultmidpunct}{\mcitedefaultendpunct}{\mcitedefaultseppunct}\relax \EndOfBibitem \bibitem[Vladescu \emph{et~al.}(2014)Vladescu, Marsden, Schwarz-Linek, Martinez, Arlt, Morozov, Marenduzzo, Cates, and Poon]{Vladescu2014} I.~D. Vladescu, E.~J. Marsden, J.~Schwarz-Linek, V.~A. Martinez, J.~Arlt, A.~N. Morozov, D.~Marenduzzo, M.~E. Cates and W.~C.~K. Poon, \emph{Phys. Rev. Lett.}, 2014, \textbf{113}, 268101\relax \mciteBstWouldAddEndPuncttrue \mciteSetBstMidEndSepPunct{\mcitedefaultmidpunct}{\mcitedefaultendpunct}{\mcitedefaultseppunct}\relax \EndOfBibitem \bibitem[Wensink \emph{et~al.}(2012)Wensink, Dunkel, Heidenreich, Drescher, Goldstein, L{\"o}wen, and Yeomans]{wensink2012meso} H.~H. Wensink, J.~Dunkel, S.~Heidenreich, K.~Drescher, R.~E. Goldstein, H.~L{\"o}wen and J.~M. Yeomans, \emph{Proc. Natl. Acad. Sci. 
USA}, 2012, \textbf{109}, 14308--14313\relax \mciteBstWouldAddEndPuncttrue \mciteSetBstMidEndSepPunct{\mcitedefaultmidpunct}{\mcitedefaultendpunct}{\mcitedefaultseppunct}\relax \EndOfBibitem \bibitem[Gachelin \emph{et~al.}(2014)Gachelin, Rousselet, Lindner, and Cl\'ement]{Gachelin2014} J.~Gachelin, A.~Rousselet, A.~Lindner and E.~Cl\'ement,\emph{New J. Phys.}, 2014, \textbf{16}, 025003\relax \mciteBstWouldAddEndPuncttrue \mciteSetBstMidEndSepPunct{\mcitedefaultmidpunct}{\mcitedefaultendpunct}{\mcitedefaultseppunct}\relax \EndOfBibitem \bibitem[Lauga \emph{et~al.}(2006)Lauga, DiLuzio, Whitesides, and Stone]{Lauga2006} E.~Lauga, W.~R. DiLuzio, G.~M. Whitesides and H.~A. Stone, \emph{Biophys. J.}, 2006, \textbf{90}, 400--412\relax \mciteBstWouldAddEndPuncttrue \mciteSetBstMidEndSepPunct{\mcitedefaultmidpunct}{\mcitedefaultendpunct}{\mcitedefaultseppunct}\relax \EndOfBibitem \bibitem[Di~Leonardo \emph{et~al.}(2011)Di~Leonardo, Dell'Arciprete, Angelani, and Iebba]{DiLeonardo2011} R.~Di~Leonardo, D.~Dell'Arciprete, L.~Angelani and V.~Iebba, \emph{Phys. Rev. Lett.}, 2011, \textbf{106}, 038101\relax \mciteBstWouldAddEndPuncttrue \mciteSetBstMidEndSepPunct{\mcitedefaultmidpunct}{\mcitedefaultendpunct}{\mcitedefaultseppunct}\relax \EndOfBibitem \bibitem[Berke \emph{et~al.}(2008)Berke, Turner, Berg, and Lauga]{Berke2008} A.~P. Berke, L.~Turner, H.~C. Berg and E.~Lauga, \emph{Phys. Rev. Lett.}, 2008, \textbf{101}, 038102\relax \mciteBstWouldAddEndPuncttrue \mciteSetBstMidEndSepPunct{\mcitedefaultmidpunct}{\mcitedefaultendpunct}{\mcitedefaultseppunct}\relax \EndOfBibitem \bibitem[Lauga and Powers(2009)]{Lauga2009} E.~Lauga and T.~R. Powers, \emph{Rep. Prog. Phys.}, 2009, \textbf{72}, 096601\relax \mciteBstWouldAddEndPuncttrue \mciteSetBstMidEndSepPunct{\mcitedefaultmidpunct}{\mcitedefaultendpunct}{\mcitedefaultseppunct}\relax \EndOfBibitem \bibitem[Keymer \emph{et~al.}(2008)Keymer, Galajda, Lambert, Liao, and Austin]{Keymer2008} J.~E. 
Keymer, P.~Galajda, G.~Lambert, D.~Liao and R.~H. Austin, \emph{Proc. Natl. Acad. Sci. USA}, 2008, \textbf{105}, 20269--20273\relax \mciteBstWouldAddEndPuncttrue \mciteSetBstMidEndSepPunct{\mcitedefaultmidpunct}{\mcitedefaultendpunct}{\mcitedefaultseppunct}\relax \EndOfBibitem \bibitem[Minamino \emph{et~al.}(2003)Minamino, Imae, Oosawa, Kobayashi, and Oosawa]{Minamino2003} T.~Minamino, Y.~Imae, F.~Oosawa, Y.~Kobayashi and K.~Oosawa, \emph{J. Bacteriol.}, 2003, \textbf{185}, 1190--1194\relax \mciteBstWouldAddEndPuncttrue \mciteSetBstMidEndSepPunct{\mcitedefaultmidpunct}{\mcitedefaultendpunct}{\mcitedefaultseppunct}\relax \EndOfBibitem \bibitem[Altshuler \emph{et~al.}(2013)Altshuler, Mi\~no, P\'erez-Penichet, R\'io, Lindner, Rousselet, and Cl\'ement]{Altshuler2013} E.~Altshuler, G.~Mi\~no, C.~P\'erez-Penichet, L.~d. R\'io, A.~Lindner, A.~Rousselet and E.~Cl\'ement,\emph{Soft Matter}, 2013, \textbf{9}, 1864--1870\relax \mciteBstWouldAddEndPuncttrue \mciteSetBstMidEndSepPunct{\mcitedefaultmidpunct}{\mcitedefaultendpunct}{\mcitedefaultseppunct}\relax \EndOfBibitem \bibitem[Bretherton(1961)]{Bretherton1961} F.~P. Bretherton, \emph{J. Fluid Mech.}, 1961, \textbf{10}, 166--188\relax \mciteBstWouldAddEndPuncttrue \mciteSetBstMidEndSepPunct{\mcitedefaultmidpunct}{\mcitedefaultendpunct}{\mcitedefaultseppunct}\relax \EndOfBibitem \bibitem[Thielicke and Stamhuis(2014)]{Thielicke2014} W.~Thielicke and E.~Stamhuis, \emph{J. Open Res. 
Softw.}, 2014, \textbf{2}, e30\relax
\mciteBstWouldAddEndPuncttrue
\mciteSetBstMidEndSepPunct{\mcitedefaultmidpunct}{\mcitedefaultendpunct}{\mcitedefaultseppunct}\relax
\EndOfBibitem
\bibitem[Vincenti \emph{et~al.}(2019)Vincenti, Ramos, Cordero, Douarche, Soto, and Clement]{Vincenti2019}
B.~Vincenti, G.~Ramos, M.~L. Cordero, C.~Douarche, R.~Soto and E.~Clement, \emph{Nat. Commun.}, 2019, \textbf{10}, 5082\relax
\mciteBstWouldAddEndPuncttrue
\mciteSetBstMidEndSepPunct{\mcitedefaultmidpunct}{\mcitedefaultendpunct}{\mcitedefaultseppunct}\relax
\EndOfBibitem
\bibitem[Beppu \emph{et~al.}(2017)Beppu, Izri, Gohya, Eto, Ichikawa, and Maeda]{Beppu2017}
K.~Beppu, Z.~Izri, J.~Gohya, K.~Eto, M.~Ichikawa and Y.~T. Maeda, \emph{Soft Matter}, 2017, \textbf{13}, 5038--5043\relax
\mciteBstWouldAddEndPuncttrue
\mciteSetBstMidEndSepPunct{\mcitedefaultmidpunct}{\mcitedefaultendpunct}{\mcitedefaultseppunct}\relax
\EndOfBibitem
\bibitem[Martens \emph{et~al.}(2012)Martens, Angelani, Di~Leonardo, and Bocquet]{Martens2012}
K.~Martens, L.~Angelani, R.~Di~Leonardo and L.~Bocquet, \emph{Eur. Phys. J. E}, 2012, \textbf{35}, 84\relax
\mciteBstWouldAddEndPuncttrue
\mciteSetBstMidEndSepPunct{\mcitedefaultmidpunct}{\mcitedefaultendpunct}{\mcitedefaultseppunct}\relax
\EndOfBibitem
\bibitem[Wu and Libchaber(2000)]{Wu2000}
X.-L. Wu and A.~Libchaber, \emph{Phys. Rev. Lett.}, 2000, \textbf{84}, 3017--3020\relax
\mciteBstWouldAddEndPuncttrue
\mciteSetBstMidEndSepPunct{\mcitedefaultmidpunct}{\mcitedefaultendpunct}{\mcitedefaultseppunct}\relax
\EndOfBibitem
\bibitem[Taylor(1922)]{Taylor1922}
G.~I. Taylor, \emph{P. Lond. Math. Soc.}, 1922, \textbf{s2-20}, 196--212\relax
\mciteBstWouldAddEndPuncttrue
\mciteSetBstMidEndSepPunct{\mcitedefaultmidpunct}{\mcitedefaultendpunct}{\mcitedefaultseppunct}\relax
\EndOfBibitem
\bibitem[Sokolov and Aranson(2012)]{Sokolov2012}
A.~Sokolov and I.~S. Aranson, \emph{Phys. Rev.
Lett.}, 2012, \textbf{109}, 248109\relax \mciteBstWouldAddEndPuncttrue \mciteSetBstMidEndSepPunct{\mcitedefaultmidpunct}{\mcitedefaultendpunct}{\mcitedefaultseppunct}\relax \EndOfBibitem \bibitem[Bond(1927)]{Bond1927} W.~N. Bond, \emph{The London, Edinburgh, and Dublin Philosophical Magazine and Journal of Science}, 1927, \textbf{4}, 889--898\relax \mciteBstWouldAddEndPuncttrue \mciteSetBstMidEndSepPunct{\mcitedefaultmidpunct}{\mcitedefaultendpunct}{\mcitedefaultseppunct}\relax \EndOfBibitem \bibitem[Dunstan \emph{et~al.}(2012)Dunstan, Mino, Clement, and Soto]{dunstan2012two} J.~Dunstan, G.~Mino, E.~Clement and R.~Soto, \emph{Phys. Fluids}, 2012, \textbf{24}, 011901\relax \mciteBstWouldAddEndPuncttrue \mciteSetBstMidEndSepPunct{\mcitedefaultmidpunct}{\mcitedefaultendpunct}{\mcitedefaultseppunct}\relax \EndOfBibitem \end{mcitethebibliography} \end{document} }
\caption{\label{fig1}{\bf Characterization of probability distribution of topological defects.} The left column shows the probability distribution of the number of kinks $P(n)$ generated as a function of the quench rate $\Lambda=\log_{10}\left(1/\tau_Q\right)$. The numerical histograms are compared with the normal approximation~\eqref{NormPn} and the dashed vertical line denotes the mean value $\langle n \rangle$. The right panel shows the total distribution of kinks in a box-and-whisker chart, for different rate values $\Lambda$ and a chain of $N=100$ sites, using 5000 trajectories. $\mathscr{C}_{R}$ and $\mathscr{C}_{L}$ denote the cumulative probability above and below the mean. }
\end{figure}

{\it Numerical results.---} For the sake of illustration, we consider the breaking of parity symmetry in a second-order phase transition~\cite{Laguna98}. Specifically, we analyze a one-dimensional chain exhibiting a structural phase transition between a linear and a doubly-degenerate zigzag phase. This scenario is of relevance to trapped ion chains~\cite{Retzker08,delcampo10}, confined colloids and dusty plasmas~\cite{Mansoori14}, to name a few examples. In the course of the phase transition, parity is broken and kinks form at the interface between adjacent domains. To describe the dynamics we consider a lattice model in which each site is endowed with a transverse degree of freedom $\phi_i$ and the total potential reads
\beqa
\label{poteq}
V(\{\phi_i\},t)=\sum_i[\lambda(t)\phi_i^2+\phi_i^4]+c\sum_i\phi_i\phi_{i+1},
\eeqa
where $\{\phi_i\}$ are real continuous variables and $i=1,\dots,N$. As the coefficient $\lambda(t)$ is ramped from a positive initial value to a negative one, the local single-site potential evolves from a single well to a double well. The nearest-neighbor coupling favors ferromagnetic order when $c<0$ and antiferromagnetic order otherwise.
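As a short worked step (a sketch that follows directly from Eq.~\eqref{poteq}; the restriction to a site with two neighbors is an assumption made here, since the boundary conditions are not specified above), the gradient of the potential with respect to $\phi_i$ reads
\[
\partial_{\phi_i}V(\{\phi_i\},t)=2\lambda(t)\phi_i+4\phi_i^3+c\left(\phi_{i-1}+\phi_{i+1}\right).
\]
For $c=0$ the single-site condition $2\lambda\phi_i+4\phi_i^3=0$ has the only real solution $\phi_i=0$ when $\lambda>0$, whereas for $\lambda<0$ it yields the two degenerate minima $\phi_i=\pm\sqrt{-\lambda/2}$, which is the double well referred to above.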
The evolution across the critical point is described by Langevin dynamics
\beqa
\label{laneq}
\ddot{\phi}_i+\eta \dot{\phi}_i+\partial_{\phi_i}V(\{\phi_i\},t)+\zeta=0, \quad i=1,\dots, N,
\eeqa
where $\eta>0$ accounts for friction and $\zeta=\zeta(t)$ is a real Gaussian process with zero mean. Eqs.~\eqref{poteq} and~\eqref{laneq} account for the Langevin dynamics of a $\phi^4$-theory on a lattice. This system is well described by Ginzburg-Landau theory and is characterized by mean-field critical exponents $\nu=1/2$ and $z=2$ in the over-damped regime \cite{Laguna98,delcampo10}. The dynamics is induced by a linear ramp of $\lambda(t)$ from the value $\lambda(0)=\lambda_0$ to $\lambda(\tau_Q)=\lambda_f$ in the quench time $\tau_Q$ according to $\lambda(t)=\lambda_0+(\lambda_f-\lambda_0)t/\tau_Q$, and we consider a symmetric quench around $\lambda_c=0$, i.e., $\lambda_f=-\lambda_0$.
%=============== Comments
\begin{figure}[t!]
\begin{center}
\includegraphics[width=1.0\linewidth]{Figure_2.pdf}
\end{center}
\caption{\label{fig_kappaq} {\bf Universal scaling of the cumulants $\kappa_q$ of the kink number distribution.} From top to bottom, the mean kink density ($q=1$), its variance ($q=2$) and the third central moment ($q=3$) are shown as a function of the inverse quench time $\tau_Q$ for a chain of $N=100$ sites and 5000 trajectories. Symbols represent numerical data while solid lines describe the analytical approximation derived in the scaling limit, with $\beta_{\rm KZM}=\nu/(1+z\nu)$. }
\end{figure}

The full counting statistics of kinks is built by sampling over an ensemble of 5000 trajectories; see Fig.~\ref{fig1} and \cite{SM} for lower sampling. The mean and width of the distribution are reduced for increasing quench times. Histograms for $P(n)$ are well reproduced by the normal approximation~\eqref{NormPn} away from the onset of adiabatic dynamics, at which the value of $P(0)$ becomes significant.
The universal power-law scaling of the cumulants as a function of the quench time is shown in Fig.~\ref{fig_kappaq}. A fit to the mean number of kinks yields $\kappa_1=(30.747\pm0.191)\tau_Q^{-0.258\pm0.001}$, in good agreement with the KZM, which predicts the power-law exponent $\beta_{\rm KZM}=\nu/(1+z\nu)=1/4$ for mean-field values $\nu=1/2$, $z=2$. Signatures of universality beyond KZM are evident from the scaling of higher-order cumulants. Non-normal features of the distribution are signaled by the non-zero value of $\kappa_q$ with $q\geq 3$. The variance scales as $\kappa_2=(17.230\pm0.191)\tau_Q^{-0.255\pm0.002}$, while the third cumulant is fitted to $\kappa_3=(3.735\pm 0.539)\tau_Q^{-0.252\pm 0.020}$. The power-law exponents of the higher cumulants are thus also in excellent agreement with the theoretical prediction in Eq.~\eqref{kappaqpl}.

We note, however, that there is an infinite number of distributions whose cumulants exhibit a universal scaling with the quench rate of the form $\kappa_q=a_q\tau_Q^{-\beta_{\rm KZM}}$. According to our model for the full kink counting statistics, the ratio between any two cumulants is independent of the quench time and fixed by the probability $p$ for kink formation at the merging between adjacent domains. In particular, $\kappa_2/\kappa_1=1-p$ and $\kappa_3/\kappa_1=(1-p)(1-2p)$. Figure~\ref{fig_3} shows the ratios between the first three cumulants as a function of the quench rate. The numerical results are in excellent agreement with the theoretical prediction. In particular, the observed cumulant ratios $\kappa_2/\kappa_1 = 0.578\pm 0.015$, $\kappa_3/\kappa_1 = 0.135\pm0.035$ and $\kappa_3/\kappa_2=0.234\pm0.061$ are consistent with a single well-defined value of the probability for kink formation, $p=0.422\pm 0.015$; see \cite{SM}.
\begin{figure}[t!]
\begin{center}
\includegraphics[width=1.0\linewidth]{Figure_3.pdf}
\end{center}
\caption{\label{fig_3} {\bf Ratio between the first three cumulants as a function of the quench rate.} The numerical results (symbols) for the ratio between the cumulants $\kappa_\alpha$ and $\kappa_\beta$, where $\alpha>\beta$ and $\alpha, \beta\in\pac{1,2,3}$, are depicted as a function of the inverse quench time $\tau_Q$. The solid line corresponds to the average of the ratio $\kappa_{\alpha}/\kappa_{\beta}$ and the shaded region between two dashed lines corresponds to the uncertainty associated with each cumulant ratio. Additionally, we show the numerical values (symbols) and the mean value (solid lines) of $p$, calculated as indicated in the plot legends.}
\end{figure}
\begin{figure}[h!]
\begin{center}
\includegraphics[width=0.9\linewidth]{Figure_4.pdf}
\end{center}
\caption{\label{fig_4} {\bf Probability for no kinks $P(0)$ as a function of quench time}. $P(0)$ decays exponentially with the mean number of kinks, which exhibits a universal power-law dependence on the quench time. Numerical data (squares) are in excellent agreement with the theoretical prediction Eq.~\eqref{pzero} (circles) with the fitted value of $p$ in Fig.~\ref{fig_3}. Dashed lines are exponential fits.}
\end{figure}

As further evidence for our model, we analyze the probability for no kink formation $P(0)$ as a function of the quench time in Figure~\ref{fig_4}. The dependence on the quench time is reproduced by the theoretical prediction Eq.~\eqref{pzero}, up to deviations at large values of $\tau_Q$, which we associate with finite sampling and which are indicated by the shaded region in Fig.~\ref{fig_4}. Other notions of deviations away from the mean are also shown to be constrained by KZM scaling; see \cite{SM}.

{\it Summary.---} When a continuous phase transition is traversed in a finite time scale $\tau_Q$, topological defects form.
The average number scales with the quench time $\tau_Q$ following a universal power-law scaling predicted by the Kibble-Zurek mechanism. The same scaling describes the density of excitations in the quantum domain as well. Given a system whose critical dynamics is described by KZM, we have argued that the full number distribution of topological defects is universal and described by a binomial distribution. This model assumes that in the course of the critical dynamics, the system size is partitioned in domains of length scale given by the KZM correlation length. The event of topological defect formation at the interface between multiple domains is associated with a discrete random variable with a fixed success probability. A testable prediction is that all cumulants of the distribution are proportional to the mean and thus inherit a universal power-law scaling with the quench time, while cumulant ratios are constant and uniquely determined by the probability for kink formation. Other quantities such as the probability for no defects and the deviations away from the mean also exhibit a universal dependence on the quench time. Our findings motivate the quest for universal signatures in the counting statistics of topological defects across the wide variety of experiments used to test KZM dynamics, using e.g., convective fluids \cite{Casado01,Casado06}, colloids \cite{Keim15}, cold atoms \cite{Weiler08,Lamporesi13,Chomaz15,Navon15,Shin19}, and trapped ions \cite{EH13,Ulm13,Pyka13}.

{\it Acknowledgment.-} The authors are indebted to Martin B. Plenio and Alex Retzker for illuminating discussions. It is also a pleasure to acknowledge discussions with Micha\lpb{} Bia\lpb{}o\'nczyk, Uwe R. Fisher, Jee Woo Park and Yong-Il Shin, and to thank the Department of Physics at Seoul National University for hospitality.
\bibliography{fcs_defects_Bib}
\newpage
%==========================================================================================================
%==================================== Supplemental Material ====================================================
%==========================================================================================================
\pagebreak
\clearpage
\widetext
\begin{center}
\textbf{\large ---Supplemental Material---\\ Full Counting Statistics of Topological Defects After Crossing a Phase Transition}\\
\vspace{0.5cm}
Fernando J. G\'omez-Ruiz$^{1,\textcolor{RubineRed}{*}}$, Jack J. Mayo$^{1,2}$ \& Adolfo del Campo$^{1,3,4,\textcolor{RubineRed}{\dagger}}$\\
\vspace{0.2cm}
$^1${\it Donostia International Physics Center, E-20018 San Sebasti\'an, Spain}\\
$^2${\it University of Groningen, 9712 CP Groningen, Netherlands}\\
$^3${\it IKERBASQUE, Basque Foundation for Science, E-48013 Bilbao, Spain}\\
$^4${\it Department of Physics, University of Massachusetts, Boston, MA 02125, USA}
\end{center}
%%%%%%%%%% Merge with supplemental materials %%%%%%%%%%
%%%%%%%%%% Prefix a "S" to all equations, figures, tables and reset the counter %%%%%%%%%%
\setcounter{equation}{0}
\setcounter{figure}{0}
\setcounter{table}{0}
\setcounter{section}{0}
\setcounter{page}{1}
\makeatletter
\renewcommand{\theequation}{S\arabic{equation}}
\renewcommand{\thefigure}{S\arabic{figure}}
\renewcommand{\bibnumfmt}[1]{[S#1]}
\renewcommand{\citenumfont}[1]{S#1}
\newcolumntype{M}[1]{>{\centering\arraybackslash}m{#1}}
\begin{center}
\vspace{1.3cm}
{\bf Contents}
\end{center}
\begin{enumerate}
\itemsep0.5em
\item[\textcolor{RubineRed}{\bf I.}] \textcolor{RubineRed}{\bf Full counting statistics of kinks as a function of sampling}\hfill\textcolor{RubineRed}{1}
\item[\textcolor{RubineRed}{\bf II.}] \textcolor{RubineRed}{\bf The ratios between any two cumulants} \hfill\textcolor{RubineRed}{2}
\begin{enumerate}
\item[\textcolor{RubineRed}{A.}] \textcolor{RubineRed}{Numerical estimation of $p$}\hfill\textcolor{RubineRed}{4}
\end{enumerate}
\item[\textcolor{RubineRed}{\bf III.}] \textcolor{RubineRed}{\bf Onset of adiabaticity}\hfill\textcolor{RubineRed}{5}
\item[\textcolor{RubineRed}{\bf IV.}] \textcolor{RubineRed}{\bf Tails of the number distribution of topological defects}\hfill\textcolor{RubineRed}{5}
%\item[] \textcolor{RubineRed}{{\bf References}}\hfill\textcolor{RubineRed}{6}
\end{enumerate}

\section{I. Full counting statistics of kinks as a function of sampling}
\begin{figure}[h!]
\begin{center}
\includegraphics[width=0.9\linewidth]{SM_Figure_1.pdf}
\end{center}
\caption{\label{SM_fig_1} {\bf Characterization of probability distribution of the number of topological defects.} The upper panel shows the total distribution in a box-and-whisker chart, where $\mathscr{C}_{R}$ and $\mathscr{C}_{L}$ are given by Eq.~\eqref{SM_Col1}, and different quench rate values $\Lambda=\log_{10}\left(1/\tau_Q\right)$ are considered for a chain of $N=100$ sites. The number of sampling trajectories is varied from 1000 (left) to 4000 (right).}
\caption{ (a) The premultiplied Corrsin shear parameter, $S_c$, on a semi-logarithmic scale. The peak position corresponds to the shear thickness, $\delta^\ast$, which is shown by the dash-dotted line for each case. (b) The same, but as a function of $y/\delta^\ast$ on a linear axis. \solidtridown, the logarithmic layer ($y^+ > 100 $ and $y/\delta^\ast<0.3$) for CH42 (for clarity, shown only in (b)). (c) The Corrsin length scale with respect to $\delta^\ast$. (d) $(L_c/\eta) Re_\lambda^{-3/2}$. The horizontal dash-dotted line represents the value $0.033$ of SS-HST~\citep{DongLozanoSekimotoJimenez2017}. Lines and symbols are as in table~\ref{tab:params}, and the vertical solid lines in (b--d) represent $y/\delta^\ast = 1$. }
\caption{Comparisons to the state-of-the-art methods on six public datasets for salient object detection. The top three results are highlighted in \textcolor[rgb]{ 1, 0, 0}{\textbf{red}}, \textcolor[rgb]{ 0, 1, 0}{\textbf{green}}, and \textcolor[rgb]{ 0, 0, 1}{\textbf{blue}}. The same post-processing method (CRF) is used in AFNet, MLMSNet, PAGENet, BASNet, PoolNet, EGNet and SCRN.}
\caption{Normalized expectation values $\left\langle\delta\left(\textbf{r}_1-\lambda \textbf{r}_2\right)\right\rangle$ as functions of the \emph{collinear} parameter $\lambda$ for the two-electron atom/ions considered.}
\includegraphics[width=8.5in]{Fig_1a.pdf}
\label{F1}
\end{figure}
\begin{figure}
%\centering
\caption{Helium atom in the \emph{collinear} $1S$ state. Expectation values of $T=-\Delta/2$ (\red{red circles}), and of $V=r^{-1}\left[-Z-Z/\lambda+1/(1-\lambda)\right]$ (\blue{blue triangles}). $r$ is the distance between the nucleus and one of the electrons, $|\lambda|r$ is the distance between the nucleus and the other electron. The (\textbf{e-n-e}) configuration corresponds to $\lambda < 0$, the (\textbf{n-e-e}) configuration to $\lambda > 0$.}
\caption{\label{fig:solvetime}Mean time to extract the secret vector $\boldsymbol{s}$ from $X$-programs constructed as described in \cite{shepherd_temporally_2009}. Shaded region is the first to third quartile of the distribution of runtimes. We observe that the time is polynomial and fast in practice even up to problem sizes of hundreds of qubits. See Section \ref{subsec:Implementation} for a discussion of the $\mathcal{O}\left(n^{2}\right)$ scaling. The data points were computed by applying the algorithm to 1000 unique $X$-programs at each problem size. The secret vector was successfully extracted for every $X$-program tested. Experiments were completed using one thread on an Intel 8268 \textquotedbl Cascade Lake\textquotedbl{} processor.} \end{figure} In 2009, Shepherd and Bremner introduced an efficiently-verifiable protocol \cite{shepherd_temporally_2009} that places only meager requirements on the quantum device, making it a good candidate for near-term hardware. It requires only sampling from a quantum circuit in which the gates all commute (a class introduced by those authors as $\mathsf{IQP}$). Furthermore, the authors demonstrate in follow-up papers \cite{bremner_average-case_2016,bremner_classical_2011} that classically sampling from the distribution should be hard, suggesting a ``black-box'' approach to cheating classically (by simply simulating the quantum device) is indeed computationally difficult, and only a couple hundred qubits would be required to make a classical solution intractable. Importantly, however, the classical verifier in \cite{shepherd_temporally_2009} doesn't actually check whether the prover's samples come from the correct distribution (in fact, \cite{bremner_classical_2011} suggests doing such a check efficiently is not possible). Instead, the sampling task is designed such that bitstrings from its distribution will be orthogonal to some secret vector $\boldsymbol{s}$ with high probability, and it is this property that is checked. 
A question that has remained open is whether a classical machine could generate samples satisfying the orthogonality check \emph{without actually simulating the distribution}. In this paper I show that the answer is yes. I give an explicit algorithm that can extract the secret vector $\boldsymbol{s}$ underlying an instance of the protocol, thus making it trivial to generate orthogonal samples that pass the verifier's test. The main results of this paper are a statement of the algorithm, a proof that a single iteration of it will extract the secret vector $\boldsymbol{s}$ with probability $\nicefrac{1}{2}$ (Theorem \ref{thm:correctprob})\footnote{This probability can be made arbitrarily close to 1 by repetition.}, and empirical results demonstrating that it is efficient in practice (summarized in Figure \ref{fig:solvetime}).

The following is a summary of the paper's structure. In Section \ref{sec:Background}, I review some points from the original paper \cite{shepherd_temporally_2009} that are especially relevant to the analysis here. In Section \ref{sec:Algorithm} I describe the algorithm to extract the secret key, and therefore break the protocol's security against classical provers. There I also briefly discuss my implementation of the algorithm. In Section \ref{sec:Discussion} I discuss related protocols, and provide the secret key underlying the ``\$25 challenge'' posted to the web by the authors of the original paper.

\section{Background\label{sec:Background}}

\paragraph{Overview of protocol}
At the core of the protocol in \cite{shepherd_temporally_2009} is a sampling problem. The classical verifier generates a Hamiltonian $H_{P}$ consisting of a sum of products of Pauli $X$ operators, and asks the quantum prover to sample the probability distribution arising from the state $e^{iH_{P}\pi/8}\left|0^{\otimes n}\right\rangle$.
The Hamiltonian is not exactly random, but instead is designed such that the samples $\left\{\boldsymbol{x}_{i}\right\}$ are biased such that $\boldsymbol{x}_{i}\cdot\boldsymbol{s}=0$\footnote{This inner product is over $\mathbb{F}_{2}$; all arithmetic in this paper is modulo 2 unless otherwise noted.} with high probability for some secret vector $\boldsymbol{s}$. The classical verifier, with knowledge of $\boldsymbol{s}$, can quickly check that the samples have such a bias. Since $\boldsymbol{s}$ should be only known to the verifier, it is conjectured in \cite{shepherd_temporally_2009} that the only efficient way to generate such samples is by actually doing the evolution. In Section \ref{sec:Algorithm} I show that it is possible to extract $\boldsymbol{s}$ classically from just the description of the Hamiltonian. \paragraph{X-programs} A Hamiltonian of the type used in this protocol can be described by a rectangular matrix of binary numbers, for which each row corresponds to a Hamiltonian term. Given such a matrix $P$ (called an ``$X$-program''), the Hamiltonian is \begin{equation} H_{P}=\sum_{i}\prod_{j}X^{P_{ij}}\label{eq:hamiltonian} \end{equation} In words, a 1 in $P$ at row $i$ and column $j$ corresponds to the inclusion of a Pauli $X$ operator on the $j^{\mathrm{th}}$ site in the $i^{\mathrm{th}}$ term of the Hamiltonian. The $X$-program also has one additional parameter $\theta$, which is the ``action''---the integrated energy over time for which the Hamiltonian will be applied. I note here that the original paper discusses \emph{matroids} rather than matrices. The perspective of this paper is that we have been given an explicit matrix $P$ acting as a canonical representative for the relevant matroid; it will be sufficient here to simply discuss $P$ as a matrix, and I will do so for the rest of the paper. 
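
As a small illustration of Equation \ref{eq:hamiltonian} (the matrix below is a hypothetical example chosen for readability and is not an instance taken from \cite{shepherd_temporally_2009}), consider the $3\times3$ $X$-program
\[
P=\begin{pmatrix}1 & 1 & 0\\ 0 & 1 & 1\\ 1 & 0 & 1\end{pmatrix}.
\]
Reading off one Hamiltonian term per row gives
\[
H_{P}=X_{1}X_{2}+X_{2}X_{3}+X_{1}X_{3},
\]
where $X_{j}$ denotes the Pauli $X$ operator acting on the $j^{\mathrm{th}}$ site; the site subscript is added here only to make the example explicit.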
\paragraph{Embedding a bias and verifying the output} In order to bias the output distribution along $\boldsymbol{s}$, a submatrix with special properties is embedded within the matrix $P$. For a vector $\boldsymbol{x}$ and matrix $P$, we can define the submatrix $P_{\boldsymbol{x}}$ as that which is generated by deleting all rows of $P$ that are orthogonal to $\boldsymbol{x}$. Under this notation, our relevant submatrix is $P_{\boldsymbol{s}}$ where $\boldsymbol{s}$ is the secret vector. For the output distribution to be appropriately biased, \cite{shepherd_temporally_2009} suggests that $P_{\boldsymbol{s}}$ should correspond to the generator matrix of an error-correcting code---in particular, the authors suggest using a quadratic residue code and setting the action $\theta=\nicefrac{\pi}{8}$. As described below, this choice leads to a gap between the quantum and classical probabilities of generating samples orthogonal to $\boldsymbol{s}$ (for the best known classical strategy in \cite{shepherd_temporally_2009}). The verifier's check is simply to request a large number of samples, and then determine if the fraction orthogonal to $\boldsymbol{s}$ is too large to have likely been generated by the classical distribution. In the two Facts below, I recall the probabilities from \cite{shepherd_temporally_2009} corresponding to that paper's quantum and classical strategies. The reasoning behind the classical strategy (Fact \ref{fact:classical-strategy}) is crucial to the rest of this paper; it is worth understanding its proof before moving on to the algorithm in Section \ref{sec:Algorithm}. \begin{fact} \textbf{Quantum strategy} Let $P$ be an $X$-program which has an embedded submatrix $P_{\boldsymbol{s}}$ for some secret vector $\boldsymbol{s}$, such that $P_{\boldsymbol{s}}$ is the generator matrix for a quadratic residue code up to permutation of rows. 
Let $\boldsymbol{X}$ be a random variable representing the distribution of bitstrings from an $n$-qubit quantum state $e^{iH_{P}\pi/8}\left|0^{\otimes n}\right\rangle$ measured in the $Z$ basis, where $H_{P}$ is defined as in Equation \ref{eq:hamiltonian}. Then,
\begin{equation}
\Pr\left[\boldsymbol{X}\cdot\boldsymbol{s}=0\right]=\cos^{2}\left(\frac{\pi}{8}\right)\approx0.854
\end{equation}
\end{fact}
\begin{proof}
The proof is contained in \cite{shepherd_temporally_2009}.
\end{proof}
\begin{fact}
\label{fact:classical-strategy}\textbf{Classical strategy of \cite{shepherd_temporally_2009}} Let $\boldsymbol{d},\boldsymbol{e}$ be two bitstrings of length $n$ (the length of a row of $P$). Define $P_{\boldsymbol{d},\boldsymbol{e}}$ as the matrix generated by deleting the rows of $P$ orthogonal to $\boldsymbol{d}$ or $\boldsymbol{e}$.\footnote{In \cite{shepherd_temporally_2009}, $P_{\boldsymbol{d},\boldsymbol{e}}$ is written as $P_{\boldsymbol{d}}\cap P_{\boldsymbol{e}}$.} Let $\boldsymbol{y}=\sum_{\boldsymbol{p}_{i}\in P_{\boldsymbol{d},\boldsymbol{e}}}\boldsymbol{p}_{i}$ be the vector sum of the rows of $P_{\boldsymbol{d},\boldsymbol{e}}$. Letting $\boldsymbol{Y}$ be the random variable representing the distribution of $\boldsymbol{y}$ when $\boldsymbol{d}$ and $\boldsymbol{e}$ are chosen uniformly at random, then
\begin{equation}
\Pr\left[\boldsymbol{Y}\cdot\boldsymbol{s}=0\right]=\nicefrac{3}{4}
\end{equation}
\end{fact}
\begin{proof}
With $\boldsymbol{y}$ defined as above, we have
\begin{equation}
\boldsymbol{y}\cdot\boldsymbol{s}=\sum_{\boldsymbol{p}_{i}\in P_{\boldsymbol{d},\boldsymbol{e}}}\boldsymbol{p}_{i}\cdot\boldsymbol{s}
\end{equation}
By definition, $\boldsymbol{p}_{i}\cdot\boldsymbol{s}=1$ if $\boldsymbol{p}_{i}\in P_{\boldsymbol{s}}$, and $\boldsymbol{p}_{i}\cdot\boldsymbol{s}=0$ otherwise.
Therefore $\boldsymbol{y}\cdot\boldsymbol{s}$ is equivalent to counting, modulo 2, the number of rows common to $P_{\boldsymbol{s}}$ and $P_{\boldsymbol{d},\boldsymbol{e}}$, or equivalently, counting the rows $\boldsymbol{p}_{i}$ in $P_{\boldsymbol{s}}$ for which $\boldsymbol{p}_{i}\cdot\boldsymbol{d}$ and $\boldsymbol{p}_{i}\cdot\boldsymbol{e}$ are both 1. We can express this using the matrix-vector products of $P_{\boldsymbol{s}}$ with $\boldsymbol{d}$ and $\boldsymbol{e}$:
\begin{align}
\boldsymbol{y}\cdot\boldsymbol{s} & =\sum_{\boldsymbol{p}_{i}\in P_{\boldsymbol{s}}}\left(\boldsymbol{p}_{i}\cdot\boldsymbol{d}\right)\left(\boldsymbol{p}_{i}\cdot\boldsymbol{e}\right)\\
 & =\left(P_{\boldsymbol{s}}\ \boldsymbol{d}\right)\cdot\left(P_{\boldsymbol{s}}\ \boldsymbol{e}\right)
\end{align}
Considering that $P_{\boldsymbol{s}}$ is the generator matrix for an error-correcting code, I denote $\boldsymbol{c}_{\boldsymbol{d}}=P_{\boldsymbol{s}}\ \boldsymbol{d}$ as the encoding of $\boldsymbol{d}$ under $P_{\boldsymbol{s}}$. In this notation, we have
\begin{align}
\boldsymbol{y}\cdot\boldsymbol{s} & =\boldsymbol{c}_{\boldsymbol{d}}\cdot\boldsymbol{c}_{\boldsymbol{e}}
\end{align}
Now, we note that the quadratic residue code (for which $P_{\boldsymbol{s}}$ is a generator matrix) has the property that any two codewords $\boldsymbol{c}_{\boldsymbol{d}}$ and $\boldsymbol{c}_{\boldsymbol{e}}$ have $\boldsymbol{c}_{\boldsymbol{d}}\cdot\boldsymbol{c}_{\boldsymbol{e}}=0$ iff either $\boldsymbol{c}_{\boldsymbol{d}}$ or $\boldsymbol{c}_{\boldsymbol{e}}$ has even parity.\footnote{This can be seen from the fact that the extended quadratic residue code, created by adding a single parity bit, is self-dual (and all extended codewords have even parity) \cite{shepherd_temporally_2009}.} Half of the quadratic residue code's words have even parity and $\boldsymbol{c}_{\boldsymbol{d}}$ and $\boldsymbol{c}_{\boldsymbol{e}}$ are random codewords, so the probability that at least one of them has even parity is $\nicefrac{3}{4}$.
Thus, the probability that $\boldsymbol{y}\cdot\boldsymbol{s}=0$ is $\nicefrac{3}{4}$, proving the fact. \end{proof} In the next section, we show that the classical strategy just described can be improved. \section{Algorithm\label{sec:Algorithm}} The classical strategy described in \cite{shepherd_temporally_2009} and reproduced in Fact \ref{fact:classical-strategy} above generates vectors that are orthogonal to $\boldsymbol{s}$ with probability $\nicefrac{3}{4}$. The key to this paper is that it is possible to \emph{correlate} the vectors generated by that strategy, such that with probability $\nicefrac{1}{2}$ one may generate a large set of vectors that are \emph{all} orthogonal to $\boldsymbol{s}$. When that happens, they form a system of linear equations that can be solved to yield $\boldsymbol{s}$. Finally, with knowledge of $\boldsymbol{s}$ it is trivial to generate samples that pass the verifier's test. To generate such a correlated set, we follow a modified version of the original classical strategy. Instead of choosing random bitstrings for both $\boldsymbol{d}$ and $\boldsymbol{e}$, we hold $\boldsymbol{d}$ constant and choose a new value of $\boldsymbol{e}$ for each vector. Crucially, if the encoding $\boldsymbol{c}_{\boldsymbol{d}}$ of $\boldsymbol{d}$ under $P_{\boldsymbol{s}}$ has even parity, \emph{all} of the generated vectors $\boldsymbol{m}_{i}$ will have $\boldsymbol{m}_{i}\cdot\boldsymbol{s}=0$ (see Theorem \ref{thm:correctprob}). This happens with probability $\nicefrac{1}{2}$ over our choice of $\boldsymbol{d}$ (whenever $\boldsymbol{c}_{\boldsymbol{d}}=P_{\boldsymbol{s}}\ \boldsymbol{d}$ has even parity). In practice, it is more convenient to do the linear solve if all $\boldsymbol{m}_{i}\cdot\boldsymbol{s}=1$ instead of 0. This can be accomplished by adding a vector $\boldsymbol{m}^{*}$ with $\boldsymbol{m}^{*}\cdot\boldsymbol{s}=1$ to each $\boldsymbol{m}_{i}$. 
It turns out that $\boldsymbol{m}^{*}=\sum_{\boldsymbol{p}\in\mathrm{rows}\left(P\right)}\boldsymbol{p}$ has this property; see the proof of Theorem \ref{thm:correctprob}. The explicit algorithm for extracting the vector $\boldsymbol{s}$ is given in Algorithm \ref{alg:extract}. \begin{algorithm} \begin{enumerate} \item Let $\boldsymbol{m}^{*}=\sum_{\substack{\boldsymbol{p}\in\mathrm{rows}\left(P\right)} }\boldsymbol{p}$. \item Pick $\boldsymbol{d}\pick\left\{0,1\right\}^{n}$. \item Generate a large number (say $2n$) of vectors $\boldsymbol{m}_{i}$, forming the rows of a matrix $M$. For each: \begin{enumerate} \item Pick $\boldsymbol{e}\pick\left\{0,1\right\}^{n}$. \item Let $\boldsymbol{m}_{i}=\boldsymbol{m}^{*}+\sum_{\substack{\boldsymbol{p}\in\mathrm{rows}\left(P\right)\\\boldsymbol{p}\cdot\boldsymbol{d}=\boldsymbol{p}\cdot\boldsymbol{e}=1 } }\boldsymbol{p}$. \end{enumerate} \item Via linear solve, find the set of vectors $\left\{\boldsymbol{s}_{i}\right\}$ satisfying $M\boldsymbol{s}_{i}=\boldsymbol{1}$, where $\boldsymbol{1}$ is the vector of all ones. \item For each candidate vector $\boldsymbol{s}_{i}$: \begin{enumerate} \item Extract $P_{\boldsymbol{s}_{i}}$ from $P$ by deleting the rows of $P$ orthogonal to $\boldsymbol{s}_{i}$. \item If $P_{\boldsymbol{s}_{i}}$ has the properties of a quadratic residue code up to row reordering (i.e., codewords have $\mathrm{wt}\left(\boldsymbol{c}\right)\in\left\{-1,0\right\}\left(\bmod4\right)$), return $\boldsymbol{s}_{i}$ and exit. \end{enumerate} \item No candidate vector $\boldsymbol{s}$ was found; return $\bot$. \end{enumerate} \caption{\label{alg:extract}\textsc{ExtractKey}$\left(P\right)$\protect \\ The algorithm to extract the secret vector $\boldsymbol{s}$ from an $X$-program $P$. 
$n$~is the number of columns in the $X$-program, and $\protect\pick$ means ``select uniformly from the set.''} \end{algorithm} \subsection{Analysis\label{subsec:Analysis}} \begin{thm} \label{thm:correctprob}On input an $X$-program $P$ containing a unique embedded submatrix $P_{\boldsymbol{s}}$ that is a generator matrix for the quadratic residue code (up to rearrangement of its rows), Algorithm \ref{alg:extract} will output the corresponding vector $\boldsymbol{s}$ with probability $\nicefrac{1}{2}$. \end{thm} \begin{proof} If $\boldsymbol{s}$ is contained in the set $\left\{\boldsymbol{s}_{i}\right\}$ generated in step 4 of the algorithm, the correct vector $\boldsymbol{s}$ will be output via the check in step 5 because there is a unique submatrix $P_{\boldsymbol{s}}$ corresponding to the quadratic residue code. $\boldsymbol{s}$ will be contained in $\left\{\boldsymbol{s}_{i}\right\}$ as long as $M$ satisfies the equation $M\boldsymbol{s}=\boldsymbol{1}$. Thus we wish to show that $M\boldsymbol{s}=\boldsymbol{1}$ with probability $\nicefrac{1}{2}$. Each row of $M$ is \begin{equation} \boldsymbol{m}_{i}=\boldsymbol{m}^{*}+\bar{\boldsymbol{m}}_{i} \end{equation} for a vector $\bar{\boldsymbol{m}}_{i}$ defined as \begin{equation} \bar{\boldsymbol{m}}_{i}=\sum_{\substack{\boldsymbol{p}\in\mathrm{rows}\left(P\right)\\ \boldsymbol{p}\cdot\boldsymbol{d}=\boldsymbol{p}\cdot\boldsymbol{e}=1 } }\boldsymbol{p} \end{equation} Here I will show that $\boldsymbol{m}^{*}\cdot\boldsymbol{s}=1$ always and $\bar{\boldsymbol{m}}_{i}\cdot\boldsymbol{s}=0$ for all $i$ with probability $\nicefrac{1}{2}$, implying that $M\boldsymbol{s}=\boldsymbol{1}$ with probability $\nicefrac{1}{2}$. First I show that $\boldsymbol{m}^{*}\cdot\boldsymbol{s}=1$. 
$\boldsymbol{m}^{*}$ is the sum of all rows of $P$, so we have \begin{equation} \boldsymbol{m}^{*}\cdot\boldsymbol{s}=\sum_{\substack{\boldsymbol{p}\in\mathrm{rows}\left(P\right)} }\boldsymbol{p}\cdot\boldsymbol{s}=\sum_{\substack{\boldsymbol{p}\in\mathrm{rows}\left(P_{\boldsymbol{s}}\right)} }1 \end{equation} We see that the inner product is equal to the number of rows in the submatrix $P_{\boldsymbol{s}}$, taken modulo 2. This submatrix is a generator matrix for the quadratic residue code, which has a number of rows equal to a prime $q>2$; the number of rows is odd and thus \begin{equation} \boldsymbol{m}^{*}\cdot\boldsymbol{s}=1 \end{equation} Now I turn to showing that $\bar{\boldsymbol{m}}_{i}\cdot\boldsymbol{s}=0$ for all $i$ with probability $\nicefrac{1}{2}$. In the proof of Fact \ref{fact:classical-strategy}, it was shown that for any two vectors $\boldsymbol{d}$ and $\boldsymbol{e}$, a vector $\bar{\boldsymbol{m}}_{i}$ generated by summing the rows $\boldsymbol{p}$ of $P$ for which $\boldsymbol{d}\cdot\boldsymbol{p}=\boldsymbol{e}\cdot\boldsymbol{p}=1$ has \begin{equation} \bar{\boldsymbol{m}}_{i}\cdot\boldsymbol{s}=0\text{ iff }\boldsymbol{c}_{\boldsymbol{d}}\text{ or }\boldsymbol{c}_{\boldsymbol{e}}\text{ has even parity}\label{eq:mbarparity} \end{equation} where $\boldsymbol{c}_{\boldsymbol{d}}$ and $\boldsymbol{c}_{\boldsymbol{e}}$ are the encodings under $P_{\boldsymbol{s}}$ of $\boldsymbol{d}$ and $\boldsymbol{e}$ respectively. If $\boldsymbol{d}$ is held constant for all $i$, and $\boldsymbol{d}$ happened to be chosen such that $\boldsymbol{c}_{\boldsymbol{d}}=P_{\boldsymbol{s}}\ \boldsymbol{d}$ has even parity, then $\bar{\boldsymbol{m}}_{i}\cdot\boldsymbol{s}=0$ for all $i$ by Equation \ref{eq:mbarparity}. Because half of the codewords of the quadratic residue code have even parity, for $\boldsymbol{d}$ selected uniformly at random we have $\bar{\boldsymbol{m}}_{i}\cdot\boldsymbol{s}=0$ for all $i$ with probability $\nicefrac{1}{2}$. 
I have shown that $\boldsymbol{m}^{*}\cdot\boldsymbol{s}=1$ always and $\bar{\boldsymbol{m}}_{i}\cdot\boldsymbol{s}=0$ for all $i$ with probability $\nicefrac{1}{2}$. Therefore we have \[ \Pr_{\boldsymbol{d}}\left[\boldsymbol{m}_{i}\cdot\boldsymbol{s}=1\ \forall\ i\right]=\nicefrac{1}{2} \] Thus $M\boldsymbol{s}=\boldsymbol{1}$ with probability $\nicefrac{1}{2}$. The algorithm will output $\boldsymbol{s}$ whenever $M\boldsymbol{s}=\boldsymbol{1}$, proving the theorem. \end{proof} \begin{center} \rule[0.5ex]{0.6\columnwidth}{1pt} \par\end{center} Having established that the algorithm outputs $\boldsymbol{s}$ with probability $\nicefrac{1}{2}$ (which can be made arbitrarily close to 1 by repetition), we now turn to analyzing its runtime. \begin{claim} \label{claim:runtime}(empirical) Algorithm \ref{alg:extract} halts in $\mathcal{O}\left(n^{3}\right)$ time on average. \end{claim} All steps of the algorithm except for step 5 have $\mathcal{O}\left(n^{3}\right)$ scaling by inspection. The obstacle preventing Claim \ref{claim:runtime} from trivially holding is that it is hard to make a rigorous statement about how large the set of candidate vectors $\left\{\boldsymbol{s}_{i}\right\}$ is. Because $\left|\left\{ \boldsymbol{s}_{i}\right\}\right|=2^{n-\mathrm{rank}\left(M\right)}$, we'd like to show that on average, the rank of $M$ is close to or equal to $n$. It seems reasonable that this would be the case: we are generating the rows of $M$ by summing rows from $P$, and $P$ must have full rank because it contains a rank-$n$ error correcting code. But the rows of $P$ summed into each $\boldsymbol{m}_{i}$ are not selected independently: they are always related via their connection to the vectors $\boldsymbol{d}$ and $\boldsymbol{e}$, and it's not clear how these correlations affect the linear independence of the resulting $\boldsymbol{m}_{i}$. 
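The relation $\left|\left\{ \boldsymbol{s}_{i}\right\}\right|=2^{n-\mathrm{rank}\left(M\right)}$ can be checked numerically on small examples. The following is a minimal Python sketch, not taken from the author's code: it assumes each row of $M$ is packed into a Python integer used as a bitmask, and the function names \texttt{gf2\_rank} and \texttt{num\_candidates} are hypothetical.

```python
def gf2_rank(rows):
    """Rank over GF(2) of a matrix whose rows are ints used as bitmasks."""
    pivots = {}  # leading-bit position -> reduced row with that leading bit
    for row in rows:
        # Reduce the row against the pivots found so far.
        while row:
            lead = row.bit_length() - 1
            if lead not in pivots:
                pivots[lead] = row  # new independent row
                break
            row ^= pivots[lead]  # XOR clears the leading bit
    return len(pivots)


def num_candidates(rows, n):
    """Number of solutions of a *consistent* system M s = 1 over GF(2)."""
    return 2 ** (n - gf2_rank(rows))


# The third row is the sum of the first two, so the rank is 2.
print(gf2_rank([0b110, 0b011, 0b101]))   # 2
print(num_candidates([0b100, 0b010], 3))  # 2
```

The sketch only counts candidates; it assumes the system is consistent, which in the setting above holds whenever $M\boldsymbol{s}=\boldsymbol{1}$.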
\begin{figure} \begin{centering} \includegraphics[width=0.8\columnwidth]{candkeys} \par\end{centering} \caption{\label{fig:rankM}\textbf{(a)} The average number of candidate vectors checked before the secret vector $\boldsymbol{s}$ was found, when the algorithm was applied to 1000 unique $X$-programs at each problem size tested. We observe that the number of vectors to check is constant in $n$. \textbf{(b)} The number of unconstrained degrees of freedom $n-\mathrm{rank}\left(M\right)$ for matrices $M$ generated in step 3 of Algorithm \ref{alg:extract}, for ``good'' choices of $\boldsymbol{d}$ such that $M\boldsymbol{s}=\boldsymbol{1}$. The rapidly decaying tail implies that it is rare for any more than a few degrees of freedom to remain unconstrained. The blue bars represent the distribution over 1000 unique $X$-programs of size $n=245$. The algorithm was then re-run on the $X$-programs that had $n-\mathrm{rank}\left(M\right)>4$ to generate the orange bars.} \end{figure} Despite the lack of a proof, empirical evidence supports Claim \ref{claim:runtime} when the algorithm is applied to $X$-programs generated in the manner described in \cite{shepherd_temporally_2009}. Figure \ref{fig:rankM}(a) shows the average number of candidate keys checked by the algorithm before $\boldsymbol{s}$ is found, as a function of problem size. The value is constant, indicating that the average size of the set $\left\{ \boldsymbol{s}_{i}\right\} $ does not scale with $n$. Furthermore, the value is small, only about 4. This implies that $M$ usually has high rank. In Figure \ref{fig:rankM}(b) I plot explicitly the distribution of the rank deficiency $n-\mathrm{rank}\left(M\right)$ over 1000 runs of the algorithm on unique $X$-programs of size $n=245$. The blue bars (on the left of each pair) show the distribution over all $X$-programs tested, and the sharply decaying tail supports the claim that a low-rank $M$ almost never occurs. 
A natural next question is whether there is some feature of the $X$-programs in that tail that causes $M$ to be low rank. To investigate that question, the algorithm was re-run 100 times on each of the $X$-programs that had $n-\mathrm{rank}\left(M\right)>4$ in the blue distribution. The orange bars of Figure \ref{fig:rankM}(b) (on the right of each pair) plot the distribution of $n-\mathrm{rank}\left(M\right)$ for that second run. The similarity of the blue and orange distributions indicates that the rank of $M$ is not correlated between runs; that is, the low rank of $M$ in the first run was not due to any feature of the input $X$-programs. From a practical perspective, this data suggests that if the rank of $M$ is found to be unacceptably low, the algorithm can simply be re-run with new randomness and the rank of $M$ is likely to be higher the second time. \subsection{Implementation\label{subsec:Implementation}} An implementation of Algorithm \ref{alg:extract} in the programming language Julia is available online at \href{https://github.com/GregDMeyer/IQPwn}{github.com/GregDMeyer/IQPwn}. That repository also contains the code for generating the figures in this paper. Figure \ref{fig:solvetime} shows the runtime of this implementation for various problem sizes. Experiments were completed using one thread on an Intel 8268 ``Cascade Lake'' processor. Note that Figure \ref{fig:solvetime} shows $\mathcal{O}\left(n^{2}\right)$ scaling, rather than the $\mathcal{O}\left(n^{3}\right)$ of Claim \ref{claim:runtime}. This is due to data-level parallelism in the implementation. $\mathbb{Z}_{2}^{n}$ vectors are stored as the bits of 64-bit integers, so operations like vector addition can be performed on 64 elements at once via bitwise operations. Furthermore, with AVX SIMD CPU instructions, those operations can be applied to multiple 64-bit integers in one CPU cycle. 
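The bit-packing idea can be illustrated with a short Python sketch. It is only an illustration under stated assumptions: Python integers are arbitrary-precision, so the code mirrors the XOR and popcount logic but not the fixed 64-bit words or SIMD instructions of the Julia implementation, and the helper names \texttt{pack}, \texttt{add}, and \texttt{dot} are hypothetical.

```python
def pack(bits):
    """Pack a sequence of 0/1 values into one int; bits[0] is the lowest bit."""
    word = 0
    for k, b in enumerate(bits):
        word |= (b & 1) << k
    return word


def add(u, v):
    """Vector addition in Z_2^n: one XOR on the packed words."""
    return u ^ v


def dot(u, v):
    """Inner product in Z_2^n: parity of the popcount of the bitwise AND."""
    return bin(u & v).count("1") & 1


u = pack([1, 0, 1, 1])
v = pack([1, 1, 0, 1])
print(add(u, v) == pack([0, 1, 1, 0]))  # True
print(dot(u, v))                        # 0 (the vectors share two 1s)
```

Each call touches all coordinates of the vectors with one bitwise operation, which is the effect the packed representation is meant to achieve.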
Thus, for $n$ of order 100, the ostensibly $\mathcal{O}\left(n\right)$ vector inner products and vector sums are performed in constant time, removing one factor of $n$ from the runtime. The tests in Figure \ref{fig:solvetime} were performed on a CPU with 512-bit vector units. \section{Discussion\label{sec:Discussion}} \paragraph{Modifications to the protocol} A natural question is whether it is possible to modify the original protocol such that this attack is not successful. Perhaps $P$ can be engineered such that either 1) it is not possible to generate a large number of vectors that all have a known inner product with $\boldsymbol{s}$, or 2) the rank of the matrix $M$ formed by these generated vectors will never be sufficiently high to allow solution of the linear system. For 1), our ability to generate many vectors orthogonal to $\boldsymbol{s}$ relies on the fact that the extended quadratic residue code is even and self-dual. These characteristics imply that if $P_{\boldsymbol{s}}\ \boldsymbol{d}$ has even parity, \emph{all} of our subsequently generated vectors will have $\boldsymbol{m}_{i}\cdot\boldsymbol{s}=0$. Building $P$ via a code without the self-dual property would remove this possibility, though the challenge is to do so while still maintaining the bias in the quantum case, and without opening up a new avenue to discovering $\boldsymbol{s}$. I leave that pursuit open. For 2), the main obstacle is that the matrix $P$ \emph{must} have rank $n$ because embedded in it is a code of rank $n$. The only hope is to somehow engineer the matrix such that linear combinations generated in the specific way described above will not be linearly independent. It's not at all clear how one would do that, and furthermore, adding structure to the previously random extra rows of $P$ runs the risk of providing even more information about the secret vector $\boldsymbol{s}$. 
Perhaps one could \emph{prove} that the rank of $M$ will be large even for worst-case inputs $P$; this would also be an interesting future direction. \paragraph{Protocols with provable hardness} The attack described in this paper reiterates the value of building protocols for which \emph{passing the check itself}, rather than just simulating the quantum device, can be shown to be hard under well-established complexity-theoretic assumptions. For example, the protocol given in \cite{brakerski_cryptographic_2019} is secure under the hardness assumption of Learning With Errors. Unfortunately, such rigorous results come with a downside, which is an increase in the size and complexity of circuits that must be run on the quantum device. Exploring simplified protocols that are provably secure is an interesting area for further research. \paragraph{Complexity-theoretic implications} Conjecture 3.2 of \cite{shepherd_temporally_2009} asks whether the language of matroids with a hidden sub-matroid is $\mathsf{NP}$-complete. In rough terms, it asks if a $\mathsf{BPP}$ machine can efficiently decide whether a given matrix $P$ contains a hidden submatrix $P_{\boldsymbol{s}}$ corresponding to a generator matrix for the quadratic residue code (up to permutations of rows). The results in this paper cannot make a strong statement about this conjecture; I have only empirically established that Algorithm \ref{alg:extract} halts in polynomial time in the average case. It's possible that there exists some class of worst-case instances of $P$ for which the algorithm is not efficient; this would be an interesting area for further research. \paragraph{The \$25 challenge} At \href{http://quantumchallenges.wordpress.com}{quantumchallenges.wordpress.com}, the authors of \cite{shepherd_temporally_2009} posed a challenge. They posted a specific instance of the matrix $P$, and offered \$25 to anyone who could send them samples passing the verifier's check. 
The secret vector $\boldsymbol{s}$ corresponding to their challenge matrix $P$ is (encoded as a base-64 string): \begin{verbatim} BilbHzjYxrOHYH4OlEJFBoXZbps4a54kH8flrRgo/g== \end{verbatim} The code used to extract the secret vector, as well as a set of samples that should pass the check for their challenge matrix, can be found at \href{https://github.com/GregDMeyer/IQPwn}{github.com/GregDMeyer/IQPwn}. If you'd like to convert the above key into a binary string, you can use the \texttt{b64tobin} script in the \texttt{examples/} directory of that repository. \paragraph{Summary and outlook} In this paper, I have described a classical algorithm that passes the interactive quantum test described in \cite{shepherd_temporally_2009}. I have proven that a single iteration of the algorithm will return the underlying secret vector with probability $\nicefrac{1}{2}$, and empirically established that it is efficient. The immediate implication of this result is that the protocol from \cite{shepherd_temporally_2009} in its original form is no longer effective as a test of quantumness. While it may be possible to reengineer that protocol to thwart this attack, this paper reiterates the value of proving the security of the verification step. Protocols with provable security are valuable on their own, but can also be used as building blocks for new, more complex results (see, for example, classically verifiable quantum computation in \cite{mahadev_classical_2018} building off the protocol of \cite{brakerski_cryptographic_2019}). As quantum hardware begins to surpass the abilities of classical machines, quantum cryptographic tools will play an important role in making quantum computation available as a service. Establishing the security of these protocols is an important first step. \paragraph{Acknowledgements} The author is supported by the National Defense Science and Engineering Graduate Fellowship (NDSEG). \bibliographystyle{unsrt} \bibliography{refs} \end{document} }}}}
\caption{Sample results on several sequences of MOT17 data sets using SDP detector; bounding boxes represent the tracking results with their color-coded identities. From left to right: MOT17-01-SDP and MOT17-03-SDP (top row), MOT17-06-SDP and MOT17-07-SDP (the 1st middle row), MOT17-08-SDP and MOT17-12-SDP (the 2nd middle row), and MOT17-14-SDP (bottom row). The videos of tracking results are available on the MOT Challenge website \color{red}{https://motchallenge.net/}.}
\caption{Two modes of information propagation from the \emph{Source} node to the \emph{Sink} node. The \textcolor{orange}{orange arrows} indicate effective computation paths with order, and the other arrows indicate the concrete computation operations for each mode. GCNs (left) recursively request information from node $a$, and finally get the information of the \emph{Source} node at the $3^{rd}$ layer (we omit some arrows for simplicity). The effective computation path is $\{Sink \to a \to c \to Source\}$. Our proposed FlowGN (right) passes information from the \emph{Source} node to the \emph{Sink} node in one layer with three hops (the brown arrows). Its effective computation path is $\{Source \to c \to a \to Sink\}$. }
\caption{(a) and (b) are two examples of information propagation between the \emph{Source} node and the \emph{Sink} node. The \textcolor{orange}{orange} arrows indicate effective computation paths with order, and the other arrows (\textcolor{brown}{brown}, \textcolor{red}{red}, and \textcolor{blue}{blue}) indicate the concrete computation operations; their color indicates the \emph{Source} node whose information is expected to be transmitted in the corresponding operation. GCNs (left) recursively request information from node $a$, and finally get the information of the \emph{Source} node at the $3^{rd}$ layer (we omit some arrows for simplicity). The effective computation path is $\{Sink \to a \to c \to Source\}$. The illustrative case (middle) passes information from the \emph{Source} node to the \emph{Sink} node directly with three hops (the brown arrows). Its effective computation path is $\{Source \to c \to a \to Sink\}$. (c) is an example of FlowGN where the path length is $3$.}
\caption{Given an input image, traditional methods predict a single depth value for each pixel. In this paper, we describe an approach that predicts a per-pixel multi-modal distribution over depth. In the example above, we zoom in on depth predictions along the dashed green line. Inside the input image, we highlight a segment filled with depth discontinuities, marked with a yellow double-headed arrow, where pixels could come from the car in the front, the car behind, or even the building in the back. In the output at the bottom, we mark ground truth depth with \textcolor{Blue}{blue} and depth with higher probabilities with \textcolor{Red}{red}. While traditional methods incorrectly yield the mean of different modes, our approach successfully captures the multi-modal nature. }
\caption{Performance of our model illustrated for three input images and 11 levels of exposure transformation. The left columns show input images with applied exposure transformations and the magnitude of this transformation expressed on a $\log_2$ scale. Middle columns show ground truth from our subjective experiments and rightmost columns show output of our model, where {\color{red}red} and {\color{green}green} regions indicate detected negative and positive suprathreshold exposure transformations, while {\color{blue}blue} regions indicate no suprathreshold transformations.}
\caption{Illustration of how change in $F1$ score between predicted and ground truth (not shown here) masks is used to estimate our model's decision boundary. The top row shows input images, the middle row shows model prediction softmax probabilities with {\color{red} red} for detected negative offsets (class 0), {\color{green} green} for positive offsets (class 1) and {\color{blue} blue} for no offset. The bottom row shows class-wise $F1$ scores for classes 0 and 1. More examples can be found in supplementary materials.}
\caption{Example of \textit{a)} over-exposure resulting from flash or spot lighting in the original image; \textit{b)} both the original over-exposure ({\color{green}green}) and the manually applied under-exposure ({\color{red}red}) are detected by our model; \textit{c)} mask showing the area where the negative exposure shift is manually applied.}
\caption{Macroarchitecture of PreVIousNet\cite{Github_PreVIous} used in the modeling stage of PreVIous. Seven types of layers with different configurations are contemplated by PreVIousNet-01 (a), whereas \textit{FC} and \textit{Softmax} layers are covered by PreVIousNet-02 (b). The network input dimensions at the first level -- \textcolor{fuente_inkscape}{{\boldm $H$}}, \textcolor{fuente_inkscape}{{\boldm $W$}}, and \textcolor{fuente_inkscape}{{\boldm $C$}} -- are adjustable variables of PreVIousNet.}
\caption{Generic object counting with partial supervision. Example results are shown on the Visual Genome (\emph{top}) and the COCO (\emph{bottom}) datasets. Due to the large number of object categories in the datasets (609 for Visual Genome and 80 for COCO), acquiring accurate count annotations for natural scenes is laborious and costly. We propose two settings to address this, where the first (LC) reduces the annotation cost due to multiple instances and the second (RLC) further reduces the annotation cost due to large numbers of object categories. While both frameworks reduce supervision by only requiring image-level lower-count annotations, the LC framework requires these annotations for all categories, whereas RLC only needs them for a subset (\textcolor{blue}{blue} means the corresponding categories are not count-annotated during training of the RLC framework). Our two approaches (LC and RLC) significantly reduce the annotation cost in comparison to the state-of-the-art instance-level (LCFCN \cite{WhereAreBlobsECCV18}) and image-level (Glancing \cite{Chattopadhyay_2017_CVPR}) supervised methods. Here, the example counting results from two datasets show the generalizability of our frameworks both to object counts beyond the lower-count range ($>4$) and to count-unannotated categories. Due to the absence of category-specific counts for some classes, we introduce an additional category-agnostic total count measure in our RLC framework to facilitate generalization across categories and provide accurate category-agnostic total count (TC) predictions. Correct/incorrect/unavailable predictions are marked with \textcolor{green}{\cmark}/ \textcolor{red}{\xmark}/\textcolor{red}{\textbf{?}}, respectively. }
\caption{Object counting examples on the COCO and Visual Genome datasets. The ground truth is shown in \textcolor{green}{green}, while the predictions by Glancing \cite{Chattopadhyay_2017_CVPR}, LCFCN \cite{WhereAreBlobsECCV18}, our LC framework, and our RLC framework are shown sequentially inside the parentheses. The examples show that our LC and RLC frameworks accurately predict counts for diverse categories (animals to food items), and even beyond the lower-count range. Although the count annotations of object categories indicated in \textcolor{blue}{blue} are not used for training the RLC framework, their counts are predicted accurately. Finally, the category-independent total count (TC) predicted by our RLC framework is shown separately, inside parentheses. Best viewed in zoom. }
\caption{ {\blue Image and analysis of a 3D sample of sedimented fluorescent colloids with a mean diameter of $10 \unit{\mu m}$. The colloids are imaged with a 0.4 Hz 3D stack rate and an 800 Hz 2D frame rate. (A) shows an $xz$-cut through the measured 3D volume and (B) shows the intensity values along the blue (lateral or $x$-direction) and red ($z$-direction, vertical to the sample plane) lines. The lateral and vertical size of the marked colloid ($\approx 10 \unit{\mu m}$) is identical in both directions. } %The resolution differs in the spatial directions. %The resolution in lateral direction (blue line) is in the order of $1.25 \unit{\mu m}$ (calculated from $\tanh$-fits to the intensity profile, compare supplementary material). %The vertical resolution is in the order of $4.1 \unit{\mu m}$. }
\caption{ Predicted win probabilities during the first OpenAI Five vs. OG game. The original win probabilities are shown in {\color{red}red}. When replaying the games using models trained from scratch up to version 56,000, we see win probabilities gradually approach those given by the model that was used to play OG. }
\caption{ Colorization and inpainting results with multi-code GAN prior using different composition layers. % AuC (the higher the better) for colorization task are 86.83\%, 87.44\%, 90.02\% with respect to the 2nd, 4th, and 8th layer respectively. % PSNR (the higher the better) for inpainting task are 21.19db, 22.11db, 20.70db with respect to the 2nd, 4th, and 8th layer respectively. % Images in \textcolor{green}{\textbf{green}} boxes indicate the best results. }
\caption{Probabilistic perspective on DSA to combine machine-learning DSA and actual DSA. After computing the risks with machine-learning for all operating points, time-domain simulations are used in the order of descending risk (first high \protect\includegraphics[height=0.8em]{hr.png}, then medium \protect\includegraphics[height=0.8em]{mr.png} and finally low \protect\includegraphics[height=0.8em]{lr.png}).}
\caption{Probability estimation requires calibration: (a) shows the result of a classifier using an uncalibrated score $\bar{s}^{1}$ (\protect\includegraphics[height=0.5em]{s1.png}); (b) shows a Platt-calibrated classifier resulting in acceptable probability estimates $\bar{p}^1$ (\protect\includegraphics[height=0.5em]{p1.png}). Optimal probability estimates would follow the diagonal (\protect\includegraphics[height=0.5em]{diag.png}). \vspace{-2.5em}}
\caption{A decision threshold $z$ reduces the risk ${Z}^*_6$ if used correctly on a calibrated AdaBoost classifier (\protect\includegraphics[height=0.5em]{grl.png}). Different cost ratios $\mathcal{C}_6$ and classifiers were used: DT without (\protect\includegraphics[height=0.5em]{blal.png}) and DT with applying $z$ (\protect\includegraphics[height=0.5em]{grel.png}), uncalibrated AdaBoost without (\protect\includegraphics[height=0.5em]{oral.png}) and with applying $z$ (\protect\includegraphics[height=0.5em]{bll.png}).}
\caption{Missed (\includegraphics[height=0.5em]{ma.png}) and false (\includegraphics[height=0.5em]{fa.png}) alarms.}
\caption{The residual risk (\includegraphics[height=0.5em]{grl.png}).}
\caption{Missed (\includegraphics[height=0.5em]{ma.png}) and false (\includegraphics[height=0.5em]{fa.png}) alarms.}
\caption{The residual risk (\includegraphics[height=0.5em]{grl.png}).}
\caption{Missed (\includegraphics[height=0.5em]{ma.png}) and false (\includegraphics[height=0.5em]{fa.png}) alarms.}
\caption{The residual risk (\includegraphics[height=0.5em]{grl.png}).}
\caption{Missed (\includegraphics[height=0.5em]{ma.png}, \includegraphics[height=0.5em]{ma2.png}) and false alarms (\includegraphics[height=0.5em]{fa.png}, \includegraphics[height=0.5em]{fa2.png}) of contingencies $c=3,5$. \vspace{-0.5em}}
\caption{The overall residual risk (\includegraphics[height=0.5em]{grl.png}).\vspace{0.5em}}
\caption{Residual risks: probabilistic perspective (\includegraphics[height=0.5em]{grl.png}), standard classifier (\includegraphics[height=0.5em]{sole.png}).}
\caption{The risk-sensitivity on inaccurately estimating parameters at (a) a single contingency and at (b) all contingencies. The inaccuracies $\alpha$ and $\frac{1}{\alpha}$ are either in costs (\protect\includegraphics[height=0.5em]{C100.png}, \protect\includegraphics[height=0.5em]{C001.png}) or in probabilities (\protect\includegraphics[height=0.5em]{p100.png}, \protect\includegraphics[height=0.5em]{p001.png}), or superposed (\protect\includegraphics[height=0.5em]{pandc.png}). Accurate estimation in the probabilistic perspective (\protect\includegraphics[height=0.5em]{grl.png}) and standard classifier (\protect\includegraphics[height=0.5em]{sole.png}). \vspace{-2em}}
\caption{ (a) Streamwise mean velocity profile as a function of the wall-normal distance and (b) streamwise, (c) wall-normal, and (d) spanwise root-mean-squared fluctuating velocities for the regular channel (\dashed) and the channel with suppressed exponential instabilities (\textcolor{cyan}{\solid}). The Reynolds number of both simulations is $\mathrm{Re}_\tau = 186$. Angle brackets represent averaging in the homogeneous directions and time. \label{fig:stats}}
\caption{The result of the quantitative comparison on the MIT-Adobe 5K dataset~\cite{bychkovsky2011learning}.}{\includegraphics[width=1\columnwidth]{fig_psnr_ssim.pdf}}
\caption{Results on \malwaredataset. \textit{singleton} represents all samples that were not labelled by \textit{AVCLASS}. \textit{others} groups the families containing only one sample.}
\caption{Quantitative evaluation of competing methods. We report the performance of state-of-the-art algorithms on widely used publicly available datasets, in terms of PSNR (in dB) and SSIM. The best results are highlighted in {\color{red}red}, while {\color{blue}blue} represents the second-best SR.}
\caption{ A high-level description of meta shift. $\mathbf{S}_{k}^{t}$ and $\mathbf{C}_{k}^{t}$ are the support set and the class descriptor of the $k$-th class in task $t$, respectively. During training, the baseline method (\textcolor{magenta}{A}) generates a class descriptor that is task-dependent and only concerns the classification result in the current task. The generated class descriptors are not stable across tasks and suffer from a shifting problem, {\em e.g.,} the first class is biased because $\mathbf{C}_{1}^{1}$ and $\mathbf{C}_{5}^{2}$ are very close. The proposed method (\textcolor{cyan}{B}) utilizes a class domain with memory to regularize the class descriptor construction and avoid descriptor shifting. % \textbf{Curve plots}: The statistics on the distance between class descriptors of class `worm' and their mean in 500 tests for baseline method (\textcolor{magenta}{A}) and the proposed method (\textcolor{cyan}{B}). }
\caption{ Meta shift measurement: The statistics on the distance between class descriptors of class `worm' and their mean in 500 tests for the baseline method (\textcolor{magenta}{A}) and the proposed method (\textcolor{cyan}{B}). Five samples are randomly picked (5-shot task) to construct the class descriptor. This process is repeated for 500 tests to generate 500 class descriptors for each method. The Euclidean distance from the `mean' class descriptor to the class descriptor in each test indicates the variation of the generated class descriptors, which can be used to measure the meta shift issue. }
\caption{ An example of self-attention response from the last self-attention layer. Eight frames are uniformly sampled from an action with the class `put on jacket' and illustrated as frames 0 to 7. Frame 0 has the strongest correlation with the last frame, frame 7, at the fourth head \legendsquare{red}, and attends heavily to itself at the second head \legendsquare{relu}. %One of sampled frames in the sequence (frame 0 to 7) is correlated with other frames in a different way in eight different heads. Frame 0 is strongly attended to frame 6 at the first head \legendsquare{blue} and has the strongest connection with frame 2 at the seventh head \legendsquare{pink}. Darker colors denote higher attention probability or stronger association. Note that with the self-attention network each frame is associated with other frames so that local and global context information can be acquired. %the short or long distance between correlations represents local or global context information, respectively. % are created with any correlation combinations between two positions in the sequence. %The local and global contexts are captured with this self-attention mechanism. %Frame 0 to 7 describe a sequence input for SAN and each position is correlated with different %The model used is trained with a sequence length of 8, 4 self-attention layers, and 8 self-attention multi-heads. Frame 0 on the left-hand side is correlated with different frames for different heads. For example, Frame 0 is strongly correlated with frame 6 at the first head (blue) and has strong connection with frame 0 at the second head (orange). Darker colors denote higher self-attention probability or stronger correlation. %of our network that correlates sequence features. The number describes sampled frames of a video from a wearing jacket class and the first and last sampled frames with overlaid skeletons are shown for illustration purpose. 
Note that there are 8 multi-head attentions, and the first frame is correlated with different frames for each head (color-coded). Darker colors correspond with stronger attentions. %This example shows that the proposed network obtains semantic information from self-attention responses that contains both local and global context information. % which are outputs of short and long range correlations. %can correlate features in long distance to extract semantic information from both short and long range correlations. }
\caption{ An input sequence of skeleton joints over frames, of size ${F \times J' \times C}$, is fed to the convolutional blocks, which generate an output tensor of size ${F \times 8 \times 64}$, denoted by \legendsquare{input}. Each color denotes one of the following layers: \legendsquare{conv1} convolutional layer; \legendsquare{relu} ReLU activation; and \legendsquare{maxpool} max-pooling layer.}
\caption{\label{tablenu} \textcolor{green}{Errors and rates for mixed boundary condition (\ref{19}).}}
\caption{Histograms and mean values for the different domain shifts: histograms represent the distributions of the cross-entropy loss of the classifier on validation data before domain adaptation. The values in red (\textcolor{red}{$\mu_0, \mu_1, \mu_2$}) represent the means of the validation loss; \textcolor{red}{$\mu_0$} is the low mean loss in the intra-session case; \textcolor{red}{$\mu_1$} and \textcolor{red}{$\mu_2$} (along with their histograms) show high inter-session and inter-subject distribution divergences.}
\caption{Histograms and mean values for different domain shifts and adaptation solutions: histograms represent the distributions of the cross-entropy loss of the classifier on validation data after domain adaptation. Intra-session statistics (\textcolor{red}{$\mu_0$}) represent the source distribution (towards which the divergent distributions are to be adapted). The histograms and corresponding mean values in gray (\textcolor{mygray}{$\mu_1^L, \mu_2^L$}) represent the validation loss after linear domain adaptation; the histograms and corresponding mean values in green (\textcolor{green}{$\mu_1^D, \mu_2^D$}) represent the validation loss after deep domain adaptation.}
\caption{Framework Overview. We illustrate step \textcolor{red}{2} in the shaded area where source features are taken as an example. In light of $f_{s/t}$ extracted from the feature extractor $G$, we employ the multi-objective adversarial attack with our proposed I-FGSPM on the classifier $F$ as well as discriminator $D$ and then accumulate the gradient maps. Therefore, we obtain the mutated features $f_{s*/t*}$ after appending the perturbations to the original copies. Furthermore, these perturbed and original features are trained by an adversarial training procedure (i.e., step \textcolor{red}{3}), which is presented in the upper right. We have highlighted the different training objectives for the output maps of their corresponding domains, which are predicted by the classifier $F$ and then followed by the discriminator $D$ to produce domain prediction maps. The \textcolor{green}{green} and \textcolor{OrangeRed}{red} colors stand for the source and target flows respectively.}
\caption{Results of adapting GTA5 to Cityscapes. The tail classes are highlighted in \textcolor{blue}{blue}. The top and bottom parts correspond to the VGG-16 and ResNet-101 based models, respectively. }
\caption{Results of adapting SYNTHIA to Cityscapes. The tail classes are highlighted in \textcolor{blue}{blue}. }
\caption{Performance comparison on the synthetic dataset over 1000 realizations. The metrics are mean and standard errors of $\sqrt{\epsilon_\text{PEHE}}$, $\sqrt{\epsilon_{\text{PEHEnn}}}$ and $\epsilon_\text{ATE}$. The better result with statistical significance by Welch's t-test with $\alpha=0.05$ is highlighted in \textcolor{blue}{blue}.}
\caption{Performance comparison on the IHDP dataset over 100 realizations. The metrics are mean and standard errors of $\sqrt{\epsilon_\text{PEHE}}$ and $\epsilon_\text{ATE}$. The best result with statistical significance by Welch's t-test with $\alpha=0.05$ is highlighted in \textcolor{blue}{blue}. Entry `-': not reported in the paper.}
\caption{Performance comparison on the IHDP dataset over 1000 realizations. The metrics are mean and standard errors of $\sqrt{\epsilon_\text{PEHE}}$ and $\epsilon_\text{ATE}$. The best result with statistical significance by Welch's t-test with $\alpha=0.05$ is highlighted in \textcolor{blue}{blue}.}
\caption{\label{fig:results-mapping-given}Depicted above are the trade-offs between solver run-time and energy used by the solution for the four approaches -- % {\color{red}$\blacksquare$} \lpestim, {\color{clr1}$\blacksquare$} \cvxmapping, {\color{clr2}$\blacksquare$} \spalgo, {\color{clr3}$\blacksquare$} \ilpmap, and {\color{clr4}$\blacksquare$} \roundmap{} -- that assume that mapping and scheduling is already given (only speed assignment). These trade-offs have been determined for five classes of five benchmark graphs each from the E3S benchmark suite and the Pegasus library. The energy displayed is normalized by the minimal energy consumed with continuous speeds.}
\caption{Image and mask examples from MSD tasks (from left to right and top to bottom): brain tumours, lung tumours, hippocampus, hepatic vessel and tumours, pancreas tumours, and liver tumours, respectively. The abnormalities, texture variance, and anisotropic properties make it very challenging to achieve satisfying segmentation performance. \textcolor{red}{Red}, \textcolor{green}{green}, and \textcolor{blue}{blue} correspond to labels 1, 2 and 3, respectively, of each dataset. Not all tasks have 3 labels. (Best viewed in color.) \vspace{-0.4cm} }
\caption{\textbf{Left:} The final architecture of C2FNAS-Panc. \textcolor{red}{Red}, \textcolor{green}{green}, and \textcolor{blue}{blue} denote cells with 2D, 3D, and P3D operations, respectively. \textbf{Right:} The structure of a cell with a single input and with two inputs. (Best viewed in color.)}
\caption{Competition results on the ReCTS dataset. The results are from the competition website {\color{blue} \url{https://tinyurl.com/ReCTS2019}. } For the detection task, the ranking is based on Hmean. For End-to-End detection and recognition task, the ranking is based on 1-NED. NED: normalized edit distance.}
\caption{Continuum and $N_c\to\infty$ Glueball Masses. \textcolor{red}{(Preliminary)}}
\caption{The sketch generator network architecture. The sketch attributes are first augmented by the attribute augmentation module, in which a new latent attribute variable is re-sampled from the estimated latent distribution ($\mu_{\phi(\mathbf{y}_{s})}$ and $\sigma_{\phi(\mathbf{y}_{s})}$) and concatenated with a noise vector. Then, the remaining up-sample modules (\textcolor{orange}{orange}) aim to generate a series of multi-scale sketches with the augmented sketch attributes.}
\caption{The architecture of Face Generator Network. The facial attributes are first embedded by the attribute augmentation module, similar to the one used in stage 1. The synthesized sketch image is also embedded by \textcolor{blue}{a sequence of down-sample convolutional layers}. These two feature maps are then fused by concatenation. Finally, the fused feature maps are used by the \textcolor{orange}{up-sample module} to synthesize multi-scale face images.}
\caption{$NC_{00}(C_5)$: Square nodes indicate a 1 in the associated type, while circular nodes indicate a 0. \emph{Involved} players in each case are shaded in {\bf \color{red} red}.\label{fig1}}
\caption{{\bf Cross-generator generalization results.} We show the average precision (AP) of various classifiers from baseline Zhang et al.~\cite{zhang2019detecting} and ours, tested across 11 generators. Symbols $\checkmark$ and $\dagger$ mean the augmentation is applied with $50\%$ or $10\%$ probability, respectively, at training. Chance is $50\%$ and best possible performance is $100\%$. When test generators are used in training, we show those results in \gray{gray} (as they are not testing generalization). Values in black show cross-generator generalization. Amongst those, the highest value is highlighted in {\bf black}. We show ablations with respect to fewer classes in ProGAN and by removing data augmentation. We report the mean AP by averaging the AP scores over all datasets. Subsets are plotted in Figures~\ref{fig:comp_augs},~\ref{fig:comp_class},~\ref{fig:comp_methods} for comparison. % }
\caption{{\bf Robustness.} We show the effect on AP of two test-time perturbations: \textbf{(left)} Gaussian blurring and \textbf{(right)} JPEG compression. We show classifiers trained on ProGAN, with different augmentations applied during training. Note that for both perturbations, when training without augmentation (\red{red}), performance degrades across all datasets when perturbations are added. In most cases, training with \textit{both} augmentations performs best or near best. Notable exceptions are super-resolution (where no augmentation is best) and DeepFake, where training \emph{only} with the perturbation used during testing, rather than both, performs best.}
\caption{ Test set performance of MGT vs. the state-of-the-art RNN and CNN architectures. The $1^{st}$/$2^{nd}$/$3^{rd}$~best results per column are indicated in \textcolor{red}{\textbf{red}}/\textcolor{blue}{blue}/\textcolor{magenta}{magenta}. }
\caption{Ablation study for multi-graph architecture of MGT. GT denotes single-graph variants of MGT. The $1^{st}$/$2^{nd}$~best results per column are indicated in \textcolor{red}{\textbf{red}}/\textcolor{blue}{blue}. % \red{In the ``Recognition Accuracy'' column, the results in black and blue denotes the original results and the new results obtained by automatic early stop, respectively. The results in magenta are obtained by the new ReLU positions and new A global.} $||$ denotes the logical union operation. }
\caption{Test set performance of Graph Transformer vs. other GNN variants. The $1^{st}$/$2^{nd}$~best results per column are indicated in \textcolor{red}{\textbf{red}}/\textcolor{blue}{blue}. ``fully'' denotes fully-connected. ``Gra. Stru.'' denotes graph structure. }
\caption{Ablation study for multi-modal input for MGT (Large). Notations: ``+'' and ``$\mathcal{C}(\cdots)$'' denote ``sum'' and ``concatenate'', respectively; ``coo.'', ``flag'', and ``pos.'' represent ``coordinate'', ``flag bit'', and ``position encoding'', respectively. The $1^{st}$/$2^{nd}$~best results per column are indicated in \textcolor{red}{\textbf{red}}/\textcolor{blue}{blue}. }
\caption{\label{OTB2013} The tracking results of trackers based on hand-crafted features on OTB2013 benchmark. The top three results are shown in {\color{red}{red}}, {\color{blue}{blue}} and {\color{brown}{brown}}}
\caption{\label{OTB2015} The tracking results of trackers based on hand-crafted features on OTB100 benchmark. The top three results are shown in {\color{red}{red}}, {\color{blue}{blue}} and {\color{brown}{brown}}}
\caption{\label{otbophc} The tracking results of trackers based on hand-crafted features on TC128 benchmark. The top three results are shown in {\color{red}{red}}, {\color{blue}{blue}} and {\color{brown}{brown}}}
\caption{Area weighted average of the absolute radial velocity $\langle |V_r| \rangle_r$ versus time, for several Reynolds numbers: dotted ($\cdot \cdot \cdot$) for $Re=0.1$, dashed (- -) for $Re=1$, and solid lines (\textemdash) for $Re=10$. The colours refer to the viscosity ratio: black for $\chi=10$, \textcolor{red}{red for $\chi=100$} and \textcolor{blue}{blue for $\chi=1000$}.}
\caption{Simulation of EPD samples (with the same Monte Carlo seed on the left, and another seed on the right), with $\alpha=1.5$, and $\tau$ varying from $-20$ to $0$. The red curve (\textcolor{red}{\fulllwd}) is the Hill estimator $\widehat{\alpha}_u$ obtained when $u$ is the $60$\% quantile, while the blue curve (\textcolor{blue}{\fulllwd}) is obtained when $u$ is the $90$\% quantile. The green curve (\textcolor{green}{\fulllwd}) is the profile likelihood estimator of a GPD distribution.}
\caption{Quantile $Q(1-p)$ as a function of the probability $p$ (on a log scale), with a strict Pareto (\full) with tail index $\alpha=1.5$, and a GPD distribution (\textcolor{blue}{\dashed}) on the left, and an EPD distribution (\textcolor{blue}{\dashed}) on the right.}
\caption{Expected Shortfall $\text{ES}(p)$ as a function of the probability (on a log scale), with a strict Pareto (\full) with tail index $\alpha=1.5$, and a GPD distribution (\textcolor{blue}{\dashed}) on the left, and an EPD distribution (\textcolor{blue}{\dashed}) on the right.}
\caption{Expected Shortfall against quantile for different values of the probability (i.e. $(Q(1-p)),\text{ES}(p)$), with a strict Pareto (\full) with tail index $\alpha=1.5$, and a GPD distribution (\textcolor{blue}{\dashed}) on the left, and an EPD distribution (\textcolor{blue}{\dashed}) on the right. The straight line below (\textcolor{red}{\dashed}) is the identity curve, where the Expected Shortfall would be equal to the associated quantile.}
\caption{Large Claim Index $\text{TS}(p)$ as a function of the probability $p$, with a strict Pareto (\full) with tail index $\alpha=1.5$, and a GPD distribution (\textcolor{blue}{\dashed}) on the left, and an EPD distribution (\textcolor{blue}{\dashed}) on the right.}
\caption{Estimation of $\alpha$, with a Pareto model (\fulllwd), a Generalized Pareto model (\full) and an Extended Pareto Model (\textcolor{red}{\dashed}), as a function of $n_u$, the number of tail events considered.}
\caption{Mean excess functions for the Danish Fire data on the left and the SOA Medical claims data on the right, with confidence intervals on the empirical version. The line (\textcolor{red}{\full}) is the GPD fit, for a given threshold (respectively 10 and 2).}
\caption{Estimation of $e(d)$, with $d=20$ for Danish fires and $d=12$ for Medical losses, with the empirical average ({\dashed}), the estimator derived from a Pareto distribution above threshold $u$ (\fulllwd), from a Generalized Pareto model above threshold $u$ (\full) and from an Extended Pareto Model above threshold $u$ (\textcolor{red}{\dashed}), as a function of $u$.}
\caption{Value-at-Risk at level 0.5\% for the daily log-return of oil prices (\full), with a GARCH+GPD model (\textcolor{blue}{\full}) and a sliding window estimator GPD model (\textcolor{red}{\full}).}
\caption{Value-at-Risk at level 0.5\% for the daily log-return of oil prices (\full), with a GARCH+EPD model (\textcolor{blue}{\full}) and a sliding window estimator EPD model (\textcolor{red}{\full}).}
\caption{Difference of the Value-at-Risk at level 0.5\% for the daily log-return of oil prices between GARCH+Pareto models (\textcolor{blue}{\full}) and a sliding window estimator of Pareto models (\textcolor{red}{\full}).}
\caption{\emph{The scattering potential.} $V_{nmkl}$ in Eq.$\left(\text{\ref{4-modes Scattering potential}}\right)$ calculated in the LG basis. Each panel corresponds to a different value of $\Delta$, and panel (a) captures a summary of the diagonal cross section from the top-left to bottom-right of panels b:e. Panels b:e capture the inter-mode coupling {[}relations between the two middle dimensions $\left(V_{mk}=tr_{nl}V_{nmkl}\right)${]}, of the effective SBs scattering potential for the selected values $\nicefrac{\Delta}{\boldsymbol{q}_{max}^{2}}=\left(0.001,0.01,0.02,0.05\right)$. The calculated relations between the first two $\left(V_{nm}=tr_{kl}V_{nmkl}\right)$ and last two dimensions $\left(V_{kl}=tr_{nm}V_{nmkl}\right)$ are indistinguishable from panel (b) for all values of $\Delta$ and resemble a Kronecker delta distribution. \label{LG scattering-potential}} \end{figure} \emph{The scattering potential in the Laguerre-Gauss basis.} The target effective interaction to be simulated is manipulated and designed using three main ingredients: (i) the arrangement of charges, which is purely geometric, (ii) the internal structure of each charge $\left(\Delta_{i}\right)$, and (iii) the choice of basis. While the geometric component appears in the expressions for both the interaction and hopping terms, the charge spectral structure contributes to the scattering potential of Eq.$\left(\text{\ref{4-modes Scattering potential}}\right)$ alone. We now study the effects of $\Delta$ on the interaction, which is basis dependent yet geometry independent. The scattering potential $V_{nklm}$ involves coupling between four modes. Eq.$\left(\text{\ref{4-modes Scattering potential}}\right)$ can be further simplified using the LG basis {[}see Eq.$\left(35B\right)$ of the supplementary material{]}. The structure of the scattering potential as a function of $\Delta$ for a two-level system is depicted in Fig.$\left(\text{\ref{LG scattering-potential}}\right)$.
For all values of $\Delta$, the relation between the first and last two modes {[}$n,k$ and $l,m$ of Eq.$\left(\text{\ref{Effective Hamiltoian}}\right)${]} resembles a Kronecker delta. It is given by the partial traces $V_{nm}=tr_{kl}V_{nmkl}\approx\delta_{nm}$ and $V_{kl}=tr_{nm}V_{nmkl}\approx\delta_{kl}$ and depicted in Fig.$\left(\text{\ref{LG scattering-potential}}a\right)$. This property simplifies the effective Hamiltonian to \begin{equation} H_{eff}^{LG}=\sum_{n,k}\Theta_{nk}a_{n}^{\dagger}a_{k}-\sum_{n,k}\mathcal{U}_{nk}\hat{n}_{n}\hat{n}_{k}, \end{equation} \noindent where $\hat{n}_{k}=a_{k}^{\dagger}a_{k}$ is the number operator, and ${\cal U}_{nk}=\sum_{lm}U_{nlkm}\delta_{nl}\delta_{km}$. The effective potential between the first and last two modes of Eq.$\left(\text{\ref{Effective Hamiltoian}}\right)$ spreads to neighboring modes with increasing values of $\nicefrac{\Delta}{\boldsymbol{q}_{max}^{2}}$, where $\boldsymbol{q}_{max}$ is the cutoff wavevector in the numerical calculation. This behavior is summarized in Fig.$\left(\text{\ref{LG scattering-potential}}a\right)$, and demonstrated separately for the selected values in Fig.$\left(\text{\ref{LG scattering-potential}}b:e\right)$. At large values of $\nicefrac{\Delta}{\boldsymbol{q}_{max}^{2}}$ the scattering occurs between more distant modes, corresponding to energy exchange with the matter. Extension of this result to a system of charges with a more complex internal structure is given by a straightforward summation, resulting in longer-range interaction.\\ \emph{Illustrative example of the effective Hamiltonian.} We derive the effective Hamiltonian for molecules in a cylindrical architecture, in which controlled hopping can confine the dynamics to a restricted subspace. We consider a uniform distribution of molecules filling a hollow cylinder (UC) of inner radius $a$ and outer radius $b$ as shown in Fig.$\left(\text{\ref{fig:Illustrative-example Uniform cylinder sketch}}\right)$.
The boundary radius $c$ is defined such that 95\% of the power of the incident radial mode for which $n=25$ is contained within the calculation range of the numerical simulation (the chosen cutoff mode). \begin{figure}[h] \begin{centering} \includegraphics[bb=0bp 0bp 841bp 576bp,clip,scale=0.2]{Fig3.pdf} \par\end{centering} \caption{\emph{Uniformly distributed hollow-cylinder}. Inner and outer radius $a$ and $b$ respectively, $c$ is chosen such that 95\% of the power of the radial mode for which $n=25$ is contained within. $\Delta z$ is the region in which the interaction and hopping occur, corresponding to the interaction time interval $\tau$. \label{fig:Illustrative-example Uniform cylinder sketch}} \end{figure} \begin{figure}[th] \begin{centering} (a)\includegraphics[bb=150bp 270bp 500bp 540bp,clip,scale=0.35]{Fig4a.pdf}(b)\includegraphics[bb=150bp 270bp 500bp 540bp,clip,scale=0.35]{Fig4b.pdf} \par\end{centering} \begin{centering} \par\end{centering} \begin{centering} (c)\includegraphics[bb=150bp 270bp 500bp 540bp,clip,scale=0.35]{Fig4c.pdf}(d)\includegraphics[bb=150bp 270bp 500bp 540bp,clip,scale=0.35]{Fig4d.pdf} \par\end{centering} \caption{\emph{Geometrically controlled hopping confinement}. (a-d) The geometric hopping factors $\theta_{nk}^{inc}$ presented in Eq.$\left(\text{\ref{eq:Incoherent hopping term}}\right)$ are displayed for the above geometry. Panels a-d computed with the corresponding dimensionless distance from the origin $\left(\nicefrac{a}{c},\nicefrac{b}{c}\right)$: $\left[\text{a}\left(0.1,0.9\right),\text{b}\left(0.2,0.8\right),\text{c}\left(0.4,1\right),\text{d}\left(0.4,0.6\right)\right]$. Positive contributions are energetically favorable. 
\label{fig:Hopping term from Uniform cylinder}} \end{figure} In this case the geometric coefficient of the hopping terms in Eqs.$\left(\text{\ref{eq:coherent hopping term}},\text{\ref{eq:Incoherent hopping term}}\right)$ as well as the interaction can be used to confine the dynamics in a controlled subspace. By varying $a$ and $b$, the hopping range depicted in Figs.$\left(\text{\ref{fig:Hopping term from Uniform cylinder}}\right)$ a:d can be controlled. Fig.$\left(\text{\ref{fig:Hopping term from Uniform cylinder}}\right)$ along with Eqs.$\left(\text{\ref{eq:coherent hopping term}},\text{\ref{eq:Incoherent hopping term}}\right)$ show that hopping within the set of modes corresponding to positive geometric factor, is energetically favorable. These positive contributions are surrounded by negative ones which are costly energetically, resulting in effective confinement. In this setup the single-molecule field scattering presented in Eq.$\left(\text{\ref{eq:Incoherent hopping term}}\right)$ dominates the hopping dynamics and the coherent hopping term of Eq.$\left(\text{\ref{eq:coherent hopping term}}\right)$ is suppressed. This is verified by the vanishing structure factor in a disordered lattice, or equivalently from the closure relations of the LG basis combined with orthogonality {[}Eqs.$\left(\text{\ref{Closure}},\text{\ref{Orthonormality}}\right)${]}. Using Eq.$\left(\text{\ref{Complex exponent span}}\right)$ one can estimate that for $\left|\boldsymbol{q}_{max}\right|=10^{-p}k_{0}$ and $\Delta z=10^{l}\lambda_{0}$ the modal attenuation factor is $\exp\left(-2\pi10^{l-2p}\right)$ which yields $\approx 94\%$ of the incoming photon flux at the output for $ l=p=2 $. 
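As a quick numerical check of this estimate, assuming the attenuation factor quoted above is applied directly to the photon flux, substituting $l=p=2$ gives
\[
\exp\left(-2\pi\,10^{\,l-2p}\right)\Big|_{l=p=2}=\exp\left(-2\pi\times10^{-2}\right)\approx\exp\left(-0.0628\right)\approx0.94,
\]
so that, under this assumption, roughly $6\%$ of the flux is lost over the interaction region $\Delta z$.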
This geometry simulates the dynamics of the Hamiltonian $H_{eff}^{LG}=H_{UC}$, \begin{equation} H_{UC}=\sum_{\left\langle n,k\right\rangle \in{\cal D}_{h}}\Theta_{nk}a_{n}^{\dagger}a_{k}-\sum_{n,k\in{\cal D}_{\Delta}}\mathcal{U}_{nk}\hat{n}_{n}\hat{n}_{k}, \end{equation} where $\left\langle n,k\right\rangle $ stands for nearest neighbors. ${\cal D}_{h}$ is a domain determined by the geometry in which the hopping occurs, as shown in Fig.$\left(\text{\ref{fig:Hopping term from Uniform cylinder}}\right)$. ${\cal D}_{\Delta}$ is the domain set by $\Delta$, for which several illustrations are depicted in Fig.$\left(\text{\ref{LG scattering-potential}}\right)$. \\ \emph{Discussion.} We have developed a geometric SPC, shaping photon-photon interactions via geometric design of the coupling between spatial modes, using the setup depicted in Fig.$\left(\text{\ref{fig:Setup}}\right)$. Quantum dynamics of interacting bosons described by the Hamiltonian of Eq.$\left(\text{\ref{Effective Hamiltoian}}\right)$ can be simulated and directly measured using the above ingredients. The dynamics of the SBs constrained by Eq.$\left(\text{\ref{Effective Hamiltoian}}\right)$ is controlled by three main quantities: the geometric distribution of molecules, their internal structure, and the choice of spatial basis. The dynamics induced by the geometric SPC depicted in Fig.$\left(\text{\ref{fig:Illustrative-example Uniform cylinder sketch}}\right)$ can be restricted to a finite set of modes as demonstrated in Fig.$\left(\text{\ref{fig:Hopping term from Uniform cylinder}}\right)$. This offers a \emph{purpose-computing platform} for a class of problems with exponential complexity. There is a growing interest in purpose machines, built for the solution of a specific task, e.g., coherent Ising machines \citep{Inagaki603,McMahon614}. These structures are designed to efficiently solve Ising models on graphs with programmable connectivity.
Their usefulness stems from the well-known polynomial-time mapping between the Ising model ground state search problem and combinatorial optimization problems \citep{Barahona_1982} (both NP hard). The proposed setup is also applicable as a quantum (light) state-preparation technique, as well as a multi-photon gate in a photonic quantum processor. In molecular systems the number of vibrational modes $N_{vib}$ is determined by the number of atoms $N_{a}$ according to $N_{vib}=3N_{a}-6$ (for nonlinear molecules). The number of electronic states corresponds to the number of electrons $N_{e}$. Good candidates would be systems containing few vibrational modes but a large number of electrons, which potentially provide strong coupling of the vibrational modes with the applied electromagnetic field. Short wavelength tabletop X-ray sources that couple off-resonantly between the vibrational modes provide an intriguing possibility for source realization \cite{Rocca:16}. Longer wavelength sources, for which sophisticated measurement techniques are more mature, may be possible, although the coupling between the modes may take more complicated forms. Due to the structure of the LG modes, for a low number of ordered scatterers, sign-flipping anti-ferromagnetic coupling has been observed that requires further characterization. Finding the molecular distribution and basis emulating the desired dynamics provides a topic for future study, as does coupling to electronic states rather than vibrational ones. For this purpose, another degree of freedom of the geometric properties could be considered: the local charge distribution of each scatterer presented in Eqs.$\left(\text{\ref{eq:exact coh},\ref{eq:Exact inc}}\right)$. \textbf{\emph{Acknowledgments.}} The support of the Chemical Sciences, Geosciences, and Biosciences Division, Office of Basic Energy Sciences, Office of Science, U.S. Department of Energy is gratefully acknowledged. S.M was supported by Award DE-FG02-04ER15571.
S.A fellowship was supported by the National Science Foundation (Grant No. CHE-1663822). \bibliography{Geo_SPC_bib} \end{document} }\end{figure}}
\caption{Excerpt of smart contract \texttt{ExtraBalToken} in \solidity within Eclipse plugin. View of analyze+configure menu for selection of cost model and optimization flag.}
\caption{The SINR coverage probability, $\mathbb{P}_{N_u}\left(\tau\right)$, versus the SINR threshold, $\tau$, for different values of $N_u$.} \label{fig:three_lines} \end{center} \end{figure}
\begin{table}[t]
\centering
\caption{$N_{\min}$ versus $\hat{N}_{\min}$ for Different Values of $\xi$}
\label{table:number_of_antennas_needed}
\begin{tabular}{|l|l|l|l|l|}
\hline
$\xi$ & 60\% & 70\% & 80\% & 90\% \\ \hline
$N_{\min}$ & 4 & 5 & 7 & 12 \\ \hline
$\hat{N}_{\min}$ & 2 & 3 & 4 & 5 \\ \hline
\end{tabular}
\end{table}

First, we examine the impact of spatial correlation and $N_u$ on $\mathbb{P}_{N_u}\left(\tau\right)$ in Fig.~\ref{fig:three_lines}. We observe that our analysis with spatial correlation given by~\eqref{eq:coverage_probability} matches the simulations very well, which confirms that~\eqref{eq:coverage_probability} is an accurate approximation. Moreover, we observe that $\mathbb{P}_{N_u}\left(\tau\right)$ significantly increases when $N_u$ increases. Specifically, when $\tau=10~\textrm{dB}$, $\mathbb{P}_{N_u}\left(\tau\right)$ increases from 0.36 to 0.48 and then to 0.82 when $N_u$ increases from 1 to 2 and then to 8. Furthermore, we observe that ignoring spatial correlation leads to severe overestimation of the SINR coverage probability, which is evident when comparing the analysis ignoring spatial correlation, given by~\eqref{eq:neglecting_spatial_correlation}, with Monte Carlo simulation points. Specifically, when $\tau=14~\textrm{dB}$ and $N_u=8$, $\hat{\mathbb{P}}_{N_u}\left(\tau\right)$ is 0.83 while the simulation shows that the actual SINR coverage probability is 0.51. Indeed, spatial correlation significantly reduces the probability of successful reception at the destination UE, thus cannot be ignored.
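To put the size of this overestimation in perspective, the two values quoted above for $\tau=14~\textrm{dB}$ and $N_u=8$ can be compared directly. Taking the simulated value as the reference (a choice made here only for illustration), the relative error of the analysis that ignores spatial correlation is
\[
\frac{0.83-0.51}{0.51}\approx 0.63,
\]
i.e., the coverage probability is overestimated by roughly $63\%$ at this operating point.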
In order to further examine the analysis error caused by ignoring spatial correlation, we investigate the minimum number of antennas needed at the destination UE to achieve a given SINR coverage probability target, $\xi$, for a given $\tau$. Mathematically, it is expressed as $N_{\min}\triangleq\min\left\{N_u:\mathbb{P}_{N_u}\left(\tau\right)>\xi\right\}$ when spatial correlation is considered. When spatial correlation is ignored, the expression is $\hat{N}_{\min}\triangleq\min\left\{N_u:\hat{\mathbb{P}}_{N_u}\left(\tau\right)>\xi\right\}$. Table~\ref{table:number_of_antennas_needed} shows $N_{\min}$ and $\hat{N}_{\min}$ for different $\xi$ when $\tau=10~\textrm{dB}$. This table confirms the underestimation of the minimum number of antennas needed when spatial correlation is ignored. For example, when $\xi= 90\%$, if spatial correlation is ignored, the minimum number of antennas needed is 5. However, the actual minimum number of antennas needed is 12. Again, this example shows that ignoring spatial correlation is not acceptable when designing relay assisted mmWave cellular networks. We now examine the impact of the transmit power at the BS, $P_b$, on $\mathbb{P}_{N_u}\left(\tau\right)$ and compare the SINR coverage probability without relays with $\mathbb{P}_{N_u}\left(\tau\right)$ in Fig.~\ref{fig:impact_of_trans_power}. We remark that the SINR coverage probability without relays is $\mathbb{P}_{N_u,bd}\left(\tau\right)$, since without relays, the relay mode does not exist. We observe that $\mathbb{P}_{N_u}\left(\tau\right)$ increases as $P_b$ increases. This is due to the fact that both $\gamma_{bd,n}$ and $\gamma_{br}$ increase as $P_b$ increases. Moreover, we observe that $\mathbb{P}_{N_u}\left(\tau\right)$ is significantly higher than $\mathbb{P}_{N_u,bd}\left(\tau\right)$. Specifically, when $\tau=10~\textrm{dB}$, $N_u=8$, and $P_b=35~\textrm{dBm}$, $\mathbb{P}_{N_u}\left(\tau\right)$ is 0.83 while $\mathbb{P}_{N_u,bd}\left(\tau\right)$ is 0.59. 
This confirms that introducing relays into a mmWave cellular network vastly improves the SINR coverage probability. Furthermore, we observe that the gain brought by relays, i.e., the gap between $\mathbb{P}_{N_u}\left(\tau\right)$ and $\mathbb{P}_{N_u,bd}\left(\tau\right)$, becomes larger when $N_u$ increases from 2 to 8, especially when $P_b$ is not high, e.g., $P_b=30-40~\textrm{dBm}$. This is due to the fact that when $N_u$ increases, the directional gain of relay UEs, given by $G_{U}=N_{u}$, increases. When $G_{U}$ increases, $\gamma_{br}$ and $\gamma_{rd,n}$ increase. Thus, $\mathbb{P}_{br}\left(\tau\right)$ and $\mathbb{P}_{N_u,rd}\left(\tau\right)$ increase as $N_u$ increases. Hence, based on \eqref{eq:coverage_probability}, the gain brought by relays increases as $N_u$ increases. \begin{figure}[!t] \begin{center} \includegraphics[height=2.8in,width=0.95\columnwidth]{transPower.eps} \caption{The SINR coverage probability, $\mathbb{P}_{N_u}\left(\tau\right)$, versus the BS transmit power, $P_b$, for different values of $N_u$, with $\tau=$10 dB.}
\caption{Beliefs partition by limiting value of $R_\cent \in \{0, c_{\FA} p_0, c_{\MD}\bar{p}_0\}$. The optimal points for large $N$ suggested by Fig.~\ref{fig:opt_belief_trend}, i.e., $\left(\frac{c_{\MD}}{ c_{\FA}+c_{\MD} }, \ldots, \frac{c_{\MD}}{ c_{\FA}+c_{\MD} }, \frac{c_{\MD}}{ c_{\FA}+c_{\MD} }\right)$, are drawn in dotted line.} \label{fig:risk_partition} \end{figure} Thm.~\ref{thm:polarization} reveals an interesting fact that when $N$ is large, the central agent makes a decision $0$ or $1$ almost surely, in other words, the decision is asymptotically deterministic, as a function of $q_1$ and $q_\cent$ no matter what value the private signal takes. Updating the belief, the central agent could make a correct decision always if $q_\cent' = 1$ when $h=0$ and $q_\cent'=0$ when $h=1$. Tab.~\ref{tab:partition} summarizes, and corresponding regions are depicted in Fig.~\ref{fig:risk_partition} with limiting values of $R_\cent$. The shaded region in Fig.~\ref{fig:risk_partition} achieves $R_\cent=0$ asymptotically for all $p_0$. Clearly the shaded region contains $\frac{c_{\MD}}{ c_{\FA}+c_{\MD} }=q_\cent = q_1=\cdots = q_N$ for any $c_{\FA}, c_{\MD}$, at which $R_\cent$ is asymptotically minimized regardless of $p_0$ as suggested numerically by Fig.~\ref{fig:opt_belief_trend}. It is easy to see from the properties of $Q$ function that a decision made by the central agent is asymptotically correct always at least for one hypothesis, no matter what $(p_0, q_\cent, q_1)$ tuple is used. In other words, \textsc{case} $4$ for which decision is always wrong is impossible. We include details in App.~\ref{app:impossible} for completeness. Finally, we can also derive the speed of risk convergence to its limiting value in Fig.~\ref{fig:risk_partition}. 
To explicitly denote dependency on $N$, let $R_\cent^{(N)}$ be the risk of the central agent with $N$ distributed agents and $R_\cent^{(\infty)} \triangleq \lim_{N \to \infty} R_\cent^{(N)} \in \{0, c_{\FA}p_0, c_{\MD}\bar{p}_0\}$. Then, the next theorem shows that $R_\cent^{(N)} \to R_\cent^{(\infty)}$ exponentially fast in $N$, that is, \begin{align*} \beta \triangleq -\frac{1}{N} \log \left( R_\cent^{(N)} - R_\cent^{(\infty)} \right) \end{align*} is strictly positive. \begin{thm} \label{thm:risk_exponent} Suppose $(q_\cent, q_1)$ satisfies \textsc{case} $1$, $2$, or $3$, that is, $(q_\cent, q_1)$ strictly belongs to one of the regions in Fig.~\ref{fig:risk_partition}. Then, $\beta$ is strictly positive and finite. \end{thm} \begin{IEEEproof} First consider an upper bound on $R_0^{(N)}$. From the condition that $(q_\cent, q_1)$ is not on boundary, we can assume that neither $z_1 z_2^{Q(\lambda(q_1))}$ nor $z_1 z_2^{Q(\lambda(q_1)-1)}$ are equal to $1$. Also Gaussian decision threshold \eqref{eq:decision_threshold} and belief update \eqref{eq:def_of_alpha} imply that the updated decision threshold linearly increases or decreases, that is, \begin{align} \lambda(q_\cent') = \frac{1}{2} + \log \frac{c_{\FA}q_\cent}{c_{\MD}(1-q_\cent)} + N \log \left( z_1 z_2^{r_1} \right) = \lambda(q_\cent) + N \log \left( z_1 z_2^{r_1} \right). \label{eq:linear_increase} \end{align} We will prove the upper bound relying on the concentration of $r_1$ \cite{Durrett2019}. Consider \textsc{case} $1$ and fix an arbitrary $(q_\cent, q_1)$ in the shaded region in Fig.~\ref{fig:risk_partition}. Note that assuming $H = 0$, $\wh{H}_i$ are i.i.d.~random variables according to $\textsf{Bern}(Q(\lambda_1))$, whereas $\wh{H}_i$ are from $\textsf{Bern}(Q(\lambda_1-1))$ when $H=1$. 
Let us take $\delta > 0$ and define two strong typical sets $\mathcal{T}_{\delta}^0$, $\mathcal{T}_{\delta}^1$ as in \cite{Yeung2008}: \begin{align*} \mathcal{T}_{\delta}^0 &= \left\{ \wh{h}^N: \left| \frac{1}{N} \sum_{i=1}^N \wh{h}_i - Q(\lambda_1)\right| < \delta \right\}, \\ \mathcal{T}_{\delta}^1 &= \left\{ \wh{h}^N: \left| \frac{1}{N} \sum_{i=1}^N \wh{h}_i - Q(\lambda_1-1)\right| < \delta \right\}. \end{align*} Then the risk expression \eqref{eq:risk_expression} can be rewritten as \begin{align*} R_\cent^{(N)} &= c_{\FA}p_0 \sum_{\wh{h}^N \in \mathcal{T}_{\delta}^0} \left( \prod_{i=1}^N p_{\wh{H}_i|H}(\wh{h}_i|0) \right) p_{\wh{H}_\cent|H, \wh{H}^N}(1|0, \wh{h}^N) \\ &+ c_{\FA}p_0 \sum_{\wh{h}^N \not\in \mathcal{T}_{\delta}^0} \left( \prod_{i=1}^N p_{\wh{H}_i|H}(\wh{h}_i|0) \right) p_{\wh{H}_\cent|H, \wh{H}^N}(1|0, \wh{h}^N) \\ &+ c_{\MD}\bar{p}_0 \sum_{\wh{h}^N \in \mathcal{T}_{\delta}^1} \left( \prod_{i=1}^N p_{\wh{H}_i|H}(\wh{h}_i|1) \right) p_{\wh{H}_\cent|H, \wh{H}^N}(0|1, \wh{h}^N) \\ &+ c_{\MD}\bar{p}_0 \sum_{\wh{h}^N \not \in \mathcal{T}_{\delta}^1} \left( \prod_{i=1}^N p_{\wh{H}_i|H}(\wh{h}_i|1) \right) p_{\wh{H}_\cent|H, \wh{H}^N}(0|1, \wh{h}^N). \end{align*} Assuming $\delta$ is small enough, $z_1 z_2^{r_1} > 1$ if $\wh{h}^N \in \mathcal{T}_\delta^0$. This implies that the decision threshold of the central agent after observing $\wh{h}^N$ increases linearly in $N$ as \eqref{eq:linear_increase}. 
Using the Chernoff bound of the $Q$-function that $Q(x) \le \exp(-x^2/2), x \ge 0$, the first summation is upper bounded by \begin{align*} & c_{\FA}p_0 \sum_{\wh{h}^N \in \mathcal{T}_{\delta}^0} \left( \prod_{i=1}^N p_{\wh{H}_i|H}(\wh{h}_i|0) \right) p_{\wh{H}_\cent|H, \wh{H}^N}(1|0, \wh{h}^N) \\ &\le c_{\FA}p_0 \sum_{\wh{h}^N \in \mathcal{T}_{\delta}^0} \left( \prod_{i=1}^N p_{\wh{H}_i|H}(\wh{h}_i|0) \right) \exp(-N^2 \Delta_0^2 /2) \\ &\le c_{\FA}p_0 \exp(-N^2 \Delta_0^2 /2) \sum_{\wh{h}^N \in \mathcal{T}_{\delta}^0} \left( \prod_{i=1}^N p_{\wh{H}_i|H}(\wh{h}_i|0) \right) \\ &\le c_{\FA}p_0 \exp(-N^2 \Delta_0^2 /2), \end{align*} where $\Delta_0 \triangleq \log\left(z_1 z_2^{Q(\lambda_1)-\delta} \right)> 0$. To bound the second term, note that the probabilities that prior decisions are not in $\mathcal{T}_\delta^0$, $\mathcal{T}_\delta^1$ are given by \begin{align*} \mathbb{P}[\wh{H}^N \not \in \mathcal{T}_\delta^0 | H=0] &\le 2 \exp(-2N I_0), \\ \mathbb{P}[\wh{H}^N \not \in \mathcal{T}_\delta^1 | H=1] &\le 2 \exp(-2N I_1), \end{align*} where $I_0, I_1$ are the rate functions in the Cram\'{e}r theorem \cite{Durrett2019}: \begin{align*} I_0 &\triangleq \min\{ D(Q(\lambda_1)+\delta||Q(\lambda_1)), D(Q(\lambda_1)-\delta||Q(\lambda_1)) \}, \\ I_1 &\triangleq \min\{ D(Q(\lambda_1-1)+\delta||Q(\lambda_1-1)), D(Q(\lambda_1-1)-\delta||Q(\lambda_1-1)) \}, \end{align*} with $D(x||y)$ denoting the KL divergence, $D(x||y) \triangleq x \log \frac{x}{y} + (1-x)\log \frac{1-x}{1-y}$ for $x, y \in (0,1)$. Hence the second term is bounded by \begin{align*} & c_{\FA}p_0 \sum_{\wh{h}^N \not\in \mathcal{T}_{\delta}^0} \left( \prod_{i=1}^N p_{\wh{H}_i|H}(\wh{h}_i|0) \right) p_{\wh{H}_\cent|H, \wh{H}^N}(1|0, \wh{h}^N) \\ &\le c_{\FA}p_0 \sum_{\wh{h}^N \not\in \mathcal{T}_{\delta}^0} \left( \prod_{i=1}^N p_{\wh{H}_i|H}(\wh{h}_i|0) \right) \\ &\le c_{\FA}p_0 \mathbb{P} \left[\wh{H}^N \not \in \mathcal{T}_\delta^0 | H=0 \right] \le 2 c_{\FA}p_0 \exp(-2N I_0). 
\end{align*} We have similar bounds for the other terms and finally \begin{align*} R_\cent^{(N)} &\le c_{\FA}p_0 \exp(-N^2 \Delta_0^2 /2) + 2 c_{\FA}p_0 \exp(-2NI_0) \\ &\quad + c_{\MD}\bar{p}_0 \exp(-N^2 \Delta_1^2 /2) + 2 c_{\MD}\bar{p}_0 \exp(-2NI_1) \\ &= O(\exp(-N \min\{I_0, I_1\})), \end{align*} where $\Delta_1 \triangleq - \log \left(z_1 z_2^{Q(\lambda_1-1) + \delta}\right)$. Since the risk decays at least exponentially in $N$, this upper bound on the risk yields a strictly positive lower bound on $\beta$. To bound the risk from below, consider an event $\wh{h}^N$ such that the updated decision threshold, $\lambda'$, stays in a finite interval, say $[0,1]$. Such an $\wh{h}^N$ belongs to neither $\mathcal{T}_{\delta}^0$ nor $\mathcal{T}_{\delta}^1$, and is thus non-typical in this sense. Note that for this $\wh{h}^N$, $\lambda' \in [0,1]$, which implies \begin{align*} p_{\wh{H}_\cent|H, \wh{H}^N}(1|0, \wh{h}^N) &= Q(\lambda') \ge Q(1), \\ p_{\wh{H}_\cent|H, \wh{H}^N}(0|1, \wh{h}^N) &= Q(1-\lambda') \ge Q(1). \end{align*} Therefore, we can lower bound $R_\cent^{(N)}$ as follows. \begin{align*} R_\cent^{(N)} &\ge c_{\FA} p_0 \left(\prod_{i=1}^N p_{\wh{H}_i|H}(\wh{h}_i|0) \right)p_{\wh{H}_\cent|H, \wh{H}^N}(1|0, \wh{h}^N) + c_{\MD} \bar{p}_0 \left(\prod_{i=1}^N p_{\wh{H}_i|H}(\wh{h}_i|1) \right)p_{\wh{H}_\cent|H, \wh{H}^N}(0|1, \wh{h}^N) \\ &\ge c_{\FA} p_0 \left(\min \{ p_{\wh{H}_i|H}(0|0), p_{\wh{H}_i|H}(1|0)\}\right)^N Q(1) + c_{\MD} \bar{p}_0 \left(\min \{ p_{\wh{H}_i|H}(0|1), p_{\wh{H}_i|H}(1|1)\}\right)^N Q(1) \\ &= c_{\FA} p_0 \left(\min \{ 1-Q(\lambda_1), Q(\lambda_1) \}\right)^N Q(1) + c_{\MD} \bar{p}_0 \left(\min \{ Q(1-\lambda_1), Q(\lambda_1-1)\}\right)^N Q(1). \end{align*} Noting that $1-Q(\lambda_1) = Q(-\lambda_1) > Q(1-\lambda_1)$ and $Q(\lambda_1) < Q(\lambda_1-1)$, $R_\cent^{(N)}$ is further bounded: \begin{align*} R_\cent^{(N)} &\ge \left(c_{\FA} p_0 + c_{\MD} \bar{p}_0 \right)Q(1) \left(\min\{Q(1-\lambda_1), Q(\lambda_1)\}\right)^N \\ &= \Omega(\left(\min\{Q(1-\lambda_1), Q(\lambda_1)\}\right)^N).
\end{align*} Since the risk decays at most exponentially in $N$, this lower bound on the risk yields a finite upper bound on $\beta$. Other cases can be obtained similarly. \end{IEEEproof} \section{Discussion} \label{sec:discussion} This work investigates a social learning problem in a parallel network. It is first observed that the updated belief is not monotonic in the initial belief of the central agent as shown in Fig.~\ref{fig:belief_update_N=2}. Similarly to a tandem network \cite{SeoRRGV2019}, the optimal belief tuple that minimizes the central agent's Bayes risk is in general different from the true prior tuple. Since the global optimization is intractable, we describe a numerical algorithm that attains the PBPO solution. The setting of many homogeneous distributed agents is also investigated. The numerical result in Fig.~\ref{fig:opt_belief_trend} suggests that the optimal beliefs are asymptotically $\frac{c_{\MD}}{ c_{\FA}+c_{\MD} }$ as $N \to \infty$ no matter what the true prior is. Also, since the central agent's decision polarizes, the belief partition according to the limiting Bayes risk is depicted in Fig.~\ref{fig:risk_partition}. It is also shown that the risk converges to its limiting value exponentially fast. Our setting bears similarities to distributed detection as well as information cascades with unbounded signals. Therefore, revisiting distributed detection results for our setting could be an interesting future direction; for instance, characterizing the (perhaps asymptotically vanishing) loss due to the homogeneous beliefs restriction would be important \cite{Tsitsiklis1988}. Tightening the risk exponent $\beta$ in Thm.~\ref{thm:risk_exponent} is also left for further study.
\appendices \section{Proof of Thm.~\ref{thm:derivative_zero}} \label{app:derivative_zero} From \eqref{eq:q_function} note that for $j \in \{1,\ldots,N\}$, \begin{subequations} \label{eq:error_derivative} \begin{align} \frac{\partial p_{\wh{H}_j|H}(1|0)}{\partial \lambda_j} &= -\phi(\lambda_j;0), \\ \frac{\partial p_{\wh{H}_j|H}(0|1)}{\partial \lambda_j} &= \phi(\lambda_j;1). \end{align} \end{subequations} Consider the derivative of \eqref{eq:R_N} with respect to $\lambda_j, j\in\{1,\ldots,N\}$. By \eqref{eq:error_derivative}, \begin{align*} &\frac{\partial R_\cent}{\partial \lambda_j} \\ &= c_{\FA} p_0 \phi(\lambda_j;0) \Bigg[\underbrace{ \sum_{\wh{h}_{-j}^N} \Bigg(\prod_{i \neq j} p_{\wh{H}_i|H}(\wh{h}_i|0) \Bigg)p_{\wh{H}_\cent|H,\wh{H}^N}(1|0, \wh{h}_{-j}^N, 0) }_{\triangleq A_0^{(j)}} - \underbrace{ \sum_{\wh{h}_{-j}^N} \Bigg(\prod_{i \neq j} p_{\wh{H}_i|H}(\wh{h}_i|0) \Bigg)p_{\wh{H}_\cent|H,\wh{H}^N}(1|0, \wh{h}_{-j}^N, 1)}_{\triangleq A_1^{(j)}} \Bigg]\\ &+ c_{\MD}\bar{p}_0 \phi(\lambda_j;1) \Bigg[\underbrace{ \sum_{\wh{h}_{-j}^N} \Bigg(\prod_{i \neq j} p_{\wh{H}_i|H}(\wh{h}_i|1) \Bigg)p_{\wh{H}_\cent|H,\wh{H}^N}(0|1, \wh{h}_{-j}^N, 0)}_{\triangleq B_0^{(j)}} - \underbrace{ \sum_{\wh{h}_{-j}^N} \Bigg(\prod_{i \neq j} p_{\wh{H}_i|H}(\wh{h}_i|1) \Bigg)p_{\wh{H}_\cent|H,\wh{H}^N}(0|1, \wh{h}_{-j}^N, 1)}_{\triangleq B_1^{(j)}} \Bigg]. \end{align*} Setting the derivative zero and rearranging terms, \begin{align*} \frac{\phi(\lambda_j;1)}{\phi(\lambda_j;0)} = \frac{c_{\FA}p_0}{c_{\MD}\bar{p}_0} \frac{A_1^{(j)} - A_0^{(j)}}{B_0^{(j)} - B_1^{(j)}}. \end{align*} Furthermore, we know from \eqref{eq:likelihood_test_distributed} that the left side is $\frac{c_{\FA}q_j^*}{c_{\MD}(1-q_j^*)}$. Therefore the claim has been proved that \begin{align*} \frac{q_j^*}{1-q_j^*} = \frac{p_0}{1-p_0} \frac{A_1^{(j)} - A_0^{(j)}}{B_0^{(j)} - B_1^{(j)}}. 
\end{align*} \section{\textsc{case} $4$ Is Impossible} \label{app:impossible} \begin{prop} It is impossible for the central agent to make an incorrect decision when $H=0$ and $H=1$. \end{prop} \begin{proof} It is sufficient to show that \textsc{case} $4$ of Tab.~\ref{tab:partition} is impossible. First recall two properties of $Q$ function that (a) $1-Q(x) = Q(-x)$ and (b) $Q(\cdot)$ is monotonic decreasing, so \begin{align*} \frac{1-Q(\lambda)}{Q(1-\lambda)} \stackrel{(a)}{=} \frac{Q(-\lambda)}{Q(1-\lambda)} \stackrel{(b)}{>} 1. \end{align*} Recalling $\lambda_0 \triangleq \lambda(q_0)$, $z_2$ defined in \eqref{eq:def_of_alpha} is \begin{align*} z_2 &= \frac{p_{\wh{H}_i|H}(0|1)_{[\cent]}}{p_{\wh{H}_i|H}(0|0)_{[\cent]} } \cdot \frac{p_{\wh{H}_i|H}(1|0)_{[\cent]}}{p_{\wh{H}_i|H}(1|1)_{[\cent]} } = \frac{Q(1-\lambda_\cent)}{1-Q(\lambda_\cent)} \cdot \frac{Q(\lambda_\cent)}{1-Q(1-\lambda_\cent)} \\ &< 1 \cdot \frac{Q(\lambda_\cent)}{Q(\lambda_\cent-1)} < 1. \end{align*} Also from the fact that $Q(\cdot)$ is decreasing, we know $z_1 z_2^{Q(\lambda_1)} < z_1 z_2^{Q(\lambda_1-1)}$, so \textsc{case} $4$ is a contradiction. \end{proof} \newcommand{\SortNoop}[1]{} \begin{thebibliography}{99} \bibitem{LobelSV2016} I.~Lobel, E.~Sadler, and L.~R. Varshney, ``Customer referral incentives and social media,'' \emph{Manage. Sci.}, vol.~63, no.~10, pp. 3514--3529, Sep. 2016. \bibitem{BikhchandaniHW1998} S.~Bikhchandani, D.~Hirshleifer, and I.~Welch, ``Learning from the behavior of others: Conformity, fads, and informational cascades,'' \emph{J. Econ. Perspect.}, vol.~12, no.~3, pp. 151--170, 1998. \bibitem{Banerjee1992} A.~V. Banerjee, ``A simple model of herd behavior,'' \emph{Quart. J. Econ.}, vol. 107, no.~3, pp. 797--817, Aug. 1992. \bibitem{BalaG2001} V.~Bala and S.~Goyal, ``Conformism and diversity under social learning,'' \emph{Econ. Theor.}, vol.~17, no.~1, pp. 101--120, Jan. 2001. 
\bibitem{SmithS2000} L.~Smith and P.~S{\o}rensen, ``Pathological outcomes of observational learning,'' \emph{Econometrica}, vol.~68, no.~2, pp. 371--398, Mar. 2000. \bibitem{GaleK2003} D.~Gale and S.~Kariv, ``Bayesian learning in social networks,'' \emph{Games Econ. Behav.}, vol.~45, no.~2, pp. 329--346, Nov. 2003. \bibitem{LeSB2017} T.~N. Le, V.~G. Subramanian, and R.~A. Berry, ``Information cascades with noise,'' \emph{{IEEE} Trans. Signal Inf. Process. Netw.}, vol.~3, no.~2, pp. 239--251, Jun. 2017. \bibitem{ViswanathanV1997} R.~Viswanathan and P.~K. Varshney, ``Distributed detection with multiple sensors: Part {I}---fundamentals,'' \emph{Proc. {IEEE}}, vol.~85, no.~1, pp. 54--63, Jan. 1997. \bibitem{Varshney1997} P.~K. Varshney, \emph{Distributed Detection and Data Fusion}.\hskip 1em plus 0.5em minus 0.4em\relax New York: Springer-Verlag, 1997. \bibitem{BergerZV1996} T.~Berger, Z.~Zhang, and H.~Viswanathan, ``The {CEO} problem,'' \emph{{IEEE} Trans. Inf. Theory}, vol.~42, no.~3, pp. 887--902, May 1996. \bibitem{SaligramaAS2006} V.~Saligrama, M.~Alanyali, and O.~Savas, ``Distributed detection in sensor networks with packet losses and finite capacity links,'' \emph{{IEEE} Trans. Signal Process.}, vol.~54, no.~11, pp. 4118--4132, Nov. 2006. \bibitem{ZhangCPM2013} Z.~Zhang, E.~K.~P. Chong, A.~Pezeshki, and W.~Moran, ``Hypothesis testing in feedforward networks with broadcast failures,'' \emph{{IEEE} J. Sel. Topics Signal Process.}, vol.~7, no.~5, pp. 797--810, Oct. 2013. \bibitem{Cover1969} T.~M. Cover, ``Hypothesis testing with finite statistics,'' \emph{Ann. Math. Stat.}, vol.~40, no.~3, pp. 828--835, 1969. \bibitem{HellmanC1970} M.~E. Hellman and T.~M. Cover, ``Learning with finite memory,'' \emph{Ann. Math. Stat.}, vol.~41, no.~3, pp. 765--782, 1970. \bibitem{TangPK1991} Z.-B. Tang, K.~R. Pattipati, and D.~L. Kleinman, ``Optimization of detection networks: {P}art {I}--tandem structures,'' \emph{{IEEE} Trans. Syst., Man, Cybern.}, vol.~21, no.~5, pp. 
1044--1059, Sept.-Oct. 1991. \bibitem{TayTW2008} W.~P. Tay, J.~N. Tsitsiklis, and M.~Z. Win, ``On the subexponential decay of detection error probabilities in long tandems,'' \emph{{IEEE} Trans. Inf. Theory}, vol.~54, no.~10, pp. 4767--4771, Oct. 2008. \bibitem{AlanyaliVSA2004} M.~Alanyali, S.~Venkatesh, O.~Savas, and S.~Aeron, ``Distributed {B}ayesian hypothesis testing in sensor networks,'' in \emph{Proc. Am. Contr. Conf. (ACC 2004)}, vol.~6, June-July 2004, pp. 5369--5374. \bibitem{RadT2010} K.~R. Rad and A.~Tahbaz-Salehi, ``Distributed parameter estimation in networks,'' in \emph{Proc. 49th IEEE Conf. Decision Control}, Dec. 2010, pp. 5050--5055. \bibitem{AcemogluDLO2011} D.~Acemoglu, M.~A. Dahleh, I.~Lobel, and A.~Ozdaglar, ``Bayesian learning in social networks,'' \emph{Rev. Econ. Stud.}, vol.~78, no.~4, pp. 1201--1236, Oct. 2011. \bibitem{SahuK2016} A.~K. Sahu and S.~Kar, ``Distributed sequential detection for {G}aussian shift-in-mean hypothesis testing,'' \emph{{IEEE} Trans. Signal Process.}, vol.~64, no.~1, pp. 89--103, Jan. 2016. \bibitem{LalithaJS2018} A.~Lalitha, T.~Javidi, and A.~D. Sarwate, ``Social learning and distributed hypothesis testing,'' \emph{{IEEE} Trans. Inf. Theory}, vol.~64, no.~9, pp. 6161--6179, Sep. 2018. \bibitem{SeoRRGV2019} D.~Seo, R.~K. Raman, J.~B. Rhim, V.~K. Goyal, and L.~R. Varshney, ``Beliefs in decision-making cascades,'' \emph{{IEEE} Trans. Signal Process.}, vol.~67, no.~19, pp. 5103--5117, Oct. 2019. \bibitem{Radner1962} R.~Radner, ``Team decision problems,'' \emph{Ann. Math. Stat.}, vol.~33, no.~3, pp. 857--881, Sep. 1962. \bibitem{HoballahV1989} I.~Y. Hoballah and P.~K. Varshney, ``Distributed {B}ayesian signal detection,'' \emph{{IEEE} Trans. Inf. Theory}, vol.~35, no.~5, pp. 995--1000, Sep. 1989. \bibitem{TangPK1991b} Z.-B. Tang, K.~R. Pattipati, and D.~L. Kleinman, ``An algorithm for determining the decision thresholds in a distributed detection problem,'' \emph{{IEEE} Trans. Syst., Man, Cybern.}, vol.~21, no.~1, pp. 
231--237, Jan.-Feb. 1991. \bibitem{Poor1988} H.~V. Poor, \emph{An Introduction to Signal Detection and Estimation}.\hskip 1em plus 0.5em minus 0.4em\relax Springer Science \& Business Media, 1988.\bibitem{Bertsekas1982} D.~P. Bertsekas, \emph{Constrained Optimization and {L}agrange Multiplier Methods}.\hskip 1em plus 0.5em minus 0.4em\relax New York, USA: Academic Press, 1982. \bibitem{Durrett2019} R.~Durrett, \emph{Probability: Theory and Examples}, 5th~ed.\hskip 1em plus 0.5em minus 0.4em\relax Cambridge, U.K: Cambridge University Press, 2019. \bibitem{Yeung2008} R.~W. Yeung, \emph{Information Theory and Network Coding}.\hskip 1em plus 0.5em minus 0.4em\relax Springer, 2008. \bibitem{Tsitsiklis1988} J.~N. Tsitsiklis, ``Decentralized detection by a large number of sensors,'' \emph{Math. Control Signals, Syst.}, vol.~1, no.~2, pp. 167--182, Jun. 1988. \end{thebibliography} \end{document} }
\caption{\textbf{(a)} Map $M$ provided to the privileged agent. One channel each for road (\textcolor{aluminium1}{light grey}), lane boundaries (\textcolor{aluminium2}{grey}), vehicles (\textcolor{skyblue2}{blue}), pedestrians (\textcolor{orange1}{orange}), and traffic lights (\textcolor{chameleon2}{green}, \textcolor{butter2}{yellow}, and \textcolor{scarletred1}{red}). The \textcolor{scarletred3}{agent} is centered at the bottom of the map. The agent's vehicle (\textcolor{scarletred3}{dark red}) and predicted waypoints (\textcolor{plum3}{purple}) are shown for visualization only and are not provided to the network. \textbf{(b)} The map representation affords simple and effective data augmentation via rotation and shifting.}
\caption{\textcolor{orange}{\textbf{(a)}}: Schematic depiction of a qubit with a Bloch sphere. % Spin-up or $|1\rangle$ is located on the north pole, and spin down or $|0\rangle$ is located on the south pole. % The state $\frac{|0\rangle + |1\rangle}{\sqrt{2}}$ with equal probability amplitudes to measure $|1\rangle$ and $|0\rangle$ values is geodetically equidistant to both poles. % A point on the surface of the Bloch sphere corresponds to a valid pure state $|\phi\rangle = \alpha |0\rangle + \beta |1\rangle$. % \textcolor{orange}{\textbf{(b)}}: Schematic visualisation of adiabatic quantum annealing (AQA). % At the beginning, all qubits are initialised in the state $|+\rangle$. % After the annealing is finished, the qubit states are measured and returned. % After the measurement, the states of variables are classical. % }
\caption{ % The sequences of energy-decreasing transitions and the corresponding energy values observed in our sampler, for transformation estimation ($K = 1$) and point set alignment with $K = 30$ interactions per $\bfx_n$. % Besides the graphs, we visualise alignment results for selected energy values and the angle of initial misalignment $\theta \in \big\{\frac{\pi}{8}, \frac{\pi}{4}, \frac{\pi}{2}\big\}$. % } \label{fig:energy_decreasing_transitions} \end{figure} \section{Conclusions}\label{sec:conclusion} % In summary, this paper introduces adiabatic quantum computers (AQC) to the computer vision community and shows that fundamental low-level problems can be brought to a representation suitable for solving by AQC. % We provide a detailed and thorough overview of the modern AQC technology and propose a new method for transformation estimation and point set alignment which can be directly mapped and solved on modern adiabatic quantum computers. % In simulations on a classical computer and in a wide range of scenarios, QA is shown to successfully recover 2D transformations which are close approximations of globally optimal transformations. % % With the chosen basis of $20$ elements, % the estimated transformations result in low transformation discrepancy and alignment errors. % Observations on how to avoid singularities and the spectral gap analysis complement the experimental section. % In future work, our technique can be extended to affine transformations. % We hope to see more research on computer vision methods with quantum hardware in the next decades. 
% \appendix \section{Appendix}\label{app:appendix} \renewcommand{\thefigure}{\Roman{figure}} \setcounter{figure}{0} % In this additional section, we provide details on the selection of the annealing rate, analyse the structure of $\bfP$ and formalise the unembedding procedure, \textit{i.e.,} the conversion of the solution to QUBOP \eqref{eq:xtPx} to the solution of the original alignment problem on point sets. % % \noindent\textbf{Annealing Rate.} Suppose $E_n(s)$ is the ground state of the instantaneous Hamiltonian, $E_n(0)$ is the initial state (ground state) of the system and $E_m(s)$ is any other excited state of the instantaneous Hamiltonian. % Let $s = \frac{t}{T} \in[0; 1]$, where $T$ is the overall time of interpolation and $t$ is physical time. Then, according to \cite{Amin2009}, $T$ has to be chosen so that \begin{equation} T \gg \frac{|\langle E_m(s) | d H / d s | E_n(s) \rangle |}{E_{nm}(s)^2 }, \; \forall m \ne n, \end{equation} where $d H / d s$ is the rate of change of the Hamiltonian with respect to $s$ and $E_{nm}$ is the difference in the corresponding instantaneous energies. % \noindent\textbf{Analysis of $\bfP$}. % Fig.~\ref{fig:app1} visualises several exemplary weight matrices $\bfP$ from the experiments with clean and noisy data (see Sec.~\ref{sec:EXPERIMENTS}). % There are several observations. % First, $\bfP = \boldsymbol{\Phi} \boldsymbol{\Phi}^\mathsf{T}$ is symmetric by design. % We also see that the columns of $\boldsymbol{\Phi}$ can be arbitrarily reshuffled as long as the correspondences are preserved\footnote{A reshuffling of rows requires changing the order of elements in $\bfQ$.}. % Second, $\bfP$ contains regularly arranged zero submatrices, due to our choice of the basis. 
% As soon as a row of $\boldsymbol{\Phi}$ induced by $q \bfC_{\bfI}$, where $\bfC_{\bfI} \in \{\bfI, -\bfI\}$, is multiplied by a column of $\boldsymbol{\Phi}^\mathsf{T}$ induced by $q \bfC_{\bfM}$, where $\bfC_{\bfM} \in \{\bfM, -\bfM\}$, and vice versa, we obtain a zero entry in $\bfP$. % The reason is that \begin{equation} \begin{cases} [ \bfI \, \sum_i \bfy_i ]^\mathsf{T} &[\bfM \, \sum_j \bfy_j]\;\;\, = 0\\ [ -\bfI \, \sum_i \bfy_i ]^\mathsf{T} &[\bfM \, \sum_j \bfy_j]\;\;\, = 0\\ [ \bfI \, \sum_i \bfy_i ]^\mathsf{T} &[-\bfM \, \sum_j \bfy_j] = 0\\ [ -\bfI \, \sum_i \bfy_i ]^\mathsf{T} &[-\bfM \, \sum_j \bfy_j] = 0 \end{cases}, \end{equation} if $\sum_i \bfy_i = \sum_j \bfy_j$, which holds in our case since each row of $\boldsymbol{\Phi}$ except the first row includes all points of $\bfY$ multiplied by a single basis element $\bfQ_k$ (see Fig.~\ref{fig:app1}-(top left) for $\bfC$ pairs resulting in zero matrices). % Third, the structure of $\bfP$ reflects that its diagonal entries encode biases, and non-diagonal elements represent couplings between the qubits. % % With the increasing $K$, the span of the absolute energy values increases, due to the higher number of point interactions. % As expected, $\bfP$ depends on data and the angle of initial misalignment between the point sets. % For all possible inputs and initial conditions --- point sets of different cardinalities, $K$ and $\theta$ --- the structure of $\bfP$ is the same for the chosen basis. % From $\bfP$, we also recognise that the considered alignment problem is not purely combinatorial and requires high-precision weights $J_{j,k}$ in \eqref{eq:Ising_Model_Hamiltonian}. % \noindent\textbf{Unembedding.} % Unembedding is the decoding of the solution to QUBOP \eqref{eq:xtPx} to the solution of the original alignment problem. 
% By design, our QA method assembles the entries of the transformation matrix in the additive basis $\bfQ_k$ (see Secs.~\ref{sec:2D_transformation_estimation}--\ref{ssec:particle_dynamics_based_alignment}). % Suppose $\hat{\bfq}$ is the measurement result of $\bfq$, \textit{i.e.,} it is a classical bitstring with $K+1$ elements. % Recall that $\bfq_1$ is reserved for reference points and does not contribute to the assembly of the transformation. % Once $\hat{\bfq}$ is measured and returned, we obtain the corresponding transformation $\bfR$ by summing up $\bfQ_k$ multiplied by $\hat{\bfq}_{k+1}$: % \begin{equation} \bfR = \sum_k \hat{\bfq}_{k+1} \bfQ_k. \end{equation} % The obtained $\bfR$ is an affine transformation. % % If the solution has to represent a valid rotation matrix $\bfR_{\text{r}}$, $\bfR$ can be projected to the rotation group by solving the \textit{closest orthogonal approximation problem with constraints}: % \begin{equation}\label{eq:closest_rotation_matrix} \begin{aligned} & \;\;\;\;\;\;\, \min \norm{\bfR_{\text{r}} - \bfR}_\mathcal{HS}^2, \\ & \text{s.~t.~} \; \bfR_{\text{r}}^{-1} = \bfR_{\text{r}}^\mathsf{T} \;\text{and}\; \operatorname{det}(\bfR_{\text{r}}) = 1. \end{aligned} \end{equation} For a solution to \eqref{eq:closest_rotation_matrix} by singular value decomposition, see \cite{Higham1989}. % % \vspace{40pt} \begin{figure}[t!] \centering \includegraphics[width=1.0\linewidth]{FIGURES/FIGURE_I} % \caption{Exemplary visualisations of the weight matrix $\bfP = \boldsymbol{\Phi} \boldsymbol{\Phi}^\mathsf{T}$ in the experiment with clean (\textbf{\textit{A/}}) and noisy data with $35\%$ of outliers in the template (\textbf{\textit{B/}}), for $K \in \{1, 10, 20, 40\}$ and $\theta \in \big\{\frac{\pi}{4}, \pi\big\}$. % The colour scheme and the range of energy values are given to the right of each $\bfP$. % White colour stands for zero entries. 
% The diagonal values in $\bfP$ represent biases (marked in orange on the top left), and non-zero elements represent couplings between the qubits. % In the visualisation on the top left, we list the pairs of $\bfC \in \{\bfI, \bfM, -\bfI, -\bfM\}$ eventually leading to zero matrices. }
\caption{Average time to add two 50\kilobytes integers vs.\ index of first symbolic byte, for \sysName~($\bullet$), \klee~(\textcolor{gray}{$\blacktriangle$}), and \sTwoE~(\textcolor{lightgray}{$\blacksquare$}). A higher index for the first symbolic byte indicates a higher fraction of concrete operations. Each point is an average over five runs, with relative standard deviation $< 6\%$.}
\caption{{\bf Abstract architecture of a hybrid ffNN.} Each layer $L_j$ contains an arbitrary number of nodes \bblu{$\{n_{kj}\}$}, which can individually be implemented on quantum hardware. Upon measurement, information about the activation state of a layer is passed to the following one ($L_{j+1}$) in the form of classical bits controlling quantum operations. Full connectivity between nodes in successive layers is \red{schematically} shown, although sparser networks are also possible in principle. \bblu{The dashed line represents classical inputs from a generic preceding stage, which can be, e.g., a collection of layers up to $L_{j-1}$ or the original input information.}}
\caption{{\color{red} To be discussed whether they are actually compatible!}}
\caption{Single-layer convolutional feature based tracking performance on the CVPR2013 dataset \cite{wu2013online}. \textcolor[rgb]{0,0.4431,0.7373}{Blue Points}: Tracking AP on the CVPR2013 dataset before feature selection. \textcolor[rgb]{0.8471,0.3216,0.0941}{Red Points}: Tracking AP on the CVPR2013 dataset after feature selection. \textcolor[rgb]{0,0.4431,0.7373}{Blue Dashed Line}: Average AP of the three layers on the dataset before feature selection. \textcolor[rgb]{0.8471,0.3216,0.0941}{Red Dashed Line}: Average AP of the three layers on the dataset after feature selection.}
\caption{Dynamic variable used for coupling: \inred{tumor cell density $b$} \\ Most sensitive model parameters: \\ \inblue{tumor cell proliferation threshold $o^{prol}$ and hypoxia threshold $o^{death}$,} \\ \inblue{tumor cell proliferation time $T^{prol}$ and survival time $T^{death}$}}
\caption{\label{fig:tstep}Training (\dashed) and validation (\full) loss in the FCN prediction at $y^+=15$ (left) and $y^+=50$ (right), using different time steps between samples.}
\caption{Bar chart of \red{the} top 20 Android API classes (with ``\texttt{android.}'' prefix omitted) that incur compatibility inconsistency in our dataset.}
\caption{\blue{A case study of the \DSDK issue with incompatibility effect: Solo VPN.}}
\caption{\red{The top} five library classes that introduce the \texttt{addJavascriptInterface()} API call in vulnerable apps, and the number of apps affected.}
\caption{\blue{A case study of the \DSDK issue with security effect: Exsoul Browser.}}
\caption{CDF plot of the \red{amount of} time \red{required} for our approach to analyze each app.}
\caption{A typical wall static pressure signal (unfiltered \& filtered) observed during the supersonic wind tunnel operation, defining the achievable steady test time of about a few seconds to acquire all the steady/unsteady signals and images. The location of the current pressure measurement is shown in Figure~\ref{figure2} as $P_\infty$. The freestream Mach number ($M_\infty$) is 2.0, and the freestream static pressure ($P_\infty$) is 43908.88 Pa.}
\caption{Typical time-averaged shadowgraph image obtained through the operation of (a) $\left\|\bar{\boldsymbol I}\right\|$ and (b) $\left\|\bar{\boldsymbol I}-\boldsymbol I_{rms}\right\|$ showing the dominant flow features observed around a hemispherical body with a sharp tip spike $\left(l/D=1.5,\;d/D=0.12\right)$ at a freestream supersonic flow Mach number of $M_\infty=2.0$. Flow features: 1. Weak leading edge shock, 2. Separation shock, 3. Separated shear layer, 4. Re-circulation region, 5. Reattachment shock, 6. Expansion fan, 7. Window defects/image artifacts, 8. Reattachment point, 9. Separation point. Flow is from left to right.} \label{figure8} \end{figure} \textbf{Effect of the spike diameter $(d)$: } In order to investigate the effect of the spike diameter ($d$), sharp tip spikes of a constant length ($l/D=1$) and different spike diameters ($d/D = 0.06$, $0.12$, and $0.18$) have been used. The corresponding time-averaged shadowgraph images are shown in Figure~\ref{figure9}a. With the change in $d/D$, no significant change in the flow features is observed. The reattachment point is found to be generated around a fixed location for all $d/D$. However, a small variation in the size of the re-circulation region can be noticed as the separation point rises radially upward with an increase in $d/D$. The locations of the flow reattachment point ($S/D$) have been obtained by performing a suitable calibration on the time-averaged shadowgraph images of all the cases and are plotted in Figure~\ref{figure10}. In Figure~\ref{figure10}a, the variation in the location of the reattachment point with the increase in spike diameter ($d$) is shown. As stated, based on the shadowgraph images, no significant variation in the location of the reattachment point is observed. However, the flow turn angle near the reattachment point varies due to the increment of the cone angle formed by the re-circulation region at the point of separation. 
The separation point moves between 0.25 and 0.5 as $d/D$ increases (Figure~\ref{figure9}). Thus, with an increase in $d/D$, the cone angle increases for a constant reattachment point and hence, the strength of the separation shock and the flow turn angle near the reattachment point increase too. As a consequence, the strength of the reattachment shock will also increase. Hence, variations of similar order are expected in the wall static pressure near the reattachment point as well as in the $c_d$ values (discussed later). \begin{sidewaysfigure} \centering \includegraphics{images/figure9.png}{} \caption{Time-averaged shadowgraph image obtained through the operation of $\left\|\bar{\boldsymbol I}-\boldsymbol I_{rms}\right\|$ showing the (a) effect of spike diameter ($d/D$), and (b) effect of spike length ($l/D$), on the dominant flow features around a hemispherical body at a freestream supersonic flow Mach number of $M_\infty=2.0$. Dominant flow features: 1. Separated shear layer, 2. Weak leading edge shock, 3. Expansion fan, 4. Separation shock, 5. Reattachment shock, 6. Re-circulation region (larger), 7. Re-circulation region (smaller), 8. Reattachment point, 9. Separation point. Flow is from left to right.} \label{figure9} \end{sidewaysfigure} \begin{figure}[htb!] \centering \includegraphics{images/figure10.png}{} \caption{The effect of (a) spike diameter $(d/D)$ and (b) spike length $(l/D)$, on the location of the reattachment point formed over the hemisphere forebody at a freestream supersonic Mach number of $M_\infty=2.0$. The obtained values have an uncertainty of $\pm5\%$ as they are acquired from the time-averaged shadowgraph images. Solid circles represent the measured values. 
Dashed line shows the trend.} \label{figure10} \end{figure} Surface or wall static pressure measurements have been carried out at different locations on the hemispherical forebody with and without spikes of different $d/D$, and the results are shown in Figure~\ref{figure11}a. A significant reduction in the level of static pressure distribution on the hemispherical forebody is observed due to the adoption of a spike and hence a reduction in drag is also expected. However, as deduced from the analysis of the time-averaged shadowgraph images, a slight increase in static pressure distribution on the hemispherical forebody is observed while increasing the spike diameter from $d/D=0.06$ to $d/D=0.18$. The present findings along with the postulates based on the time-averaged shadowgraph images imply a slight increase in drag value with increase in the spike diameter. The peak of each pressure distribution curve obtained over the spiked hemispherical bodies corresponds to the location of the reattachment point on the hemispherical body from where the reattachment shock is found to be generated. On closely studying the pressure plots in Figure~\ref{figure11}a within the given spatial resolution in the sensor placements, it is observed that the positions of the peaks of the pressure distributions are at the same location $\left(S/D\approx0.39\right)$ for all $d/D$ (see the solid trend line in Figure~\ref{figure11}a). This observation is consistent with the findings from the qualitative studies (using time-averaged shadowgraph images) stating that the reattachment point remains almost fixed while increasing $d/D$ (see Figure~\ref{figure10}a). \begin{figure}[htb!] 
\centering \includegraphics[width=\linewidth]{images/figure11.png}{} \caption{The effect of (a) spike diameter $(d/D)$ and (b) spike length $(l/D)$, on the coefficient of measured wall static pressure ($C_p = {2}(P-P_\infty)/{\gamma M^2 P_\infty}$) over the hemispherical forebody at a freestream supersonic flow Mach number of $M_\infty=2.0$. Solid circles represent the measured values. Solid line shows the trend.} \label{figure11} \end{figure} The forebody drag coefficient ($c_d$) for all the cases has also been measured using an in-house built strain gauge balance. The effect of $d/D$ on $c_d$ can be seen in Figure~\ref{figure12}a. As expected from the time-averaged shadowgraph images and static pressure measurements, a slight increase in the values of the drag coefficient with an increase of $d/D$ up to [$d/D$]=0.18 is observed. The measured $c_d$ are also tabulated in Table~\ref{table2}, indicating that within the uncertainty, a mild change in $c_d$ is observed with increase in $ d/D $. \begin{figure}[!htbp] \centering \includegraphics{images/figure12.png}{} \caption{The effect of (a) spike diameter $(d/D)$ and (b) spike length $(l/D)$, on the coefficient of measured forebody drag ($c_d$) at a freestream supersonic flow Mach number of $M_\infty=2.0$. Black solid circles represent the experimental points and dashed lines are the trend lines. Red solid circle represents the measured coefficient of drag $(c_d)$ over a hemisphere without a spike ($l/D=0,\;d/D=0$).} \label{figure12} \end{figure} \textbf{Effect of the spike length $(l)$: } In order to investigate the effect of the spike length ($l/D$) on the time-averaged flow features, a sharp tip spike of constant stem diameter $d=0.12D$ has been used. The different spike lengths investigated in the present study are $l/D$=0.5, 1.0, 1.5 and 2.0. 
From Figure~\ref{figure9}b it can be seen that, with an increase in $l/D$, the flow separation point along the spike stem shifts downstream with a corresponding change in the separation shock strength as well. Such a shift in the separation point has been observed up to a critical spike length of $l/D=1.5$. The re-circulation region between the hemispherical forebody and the spike (bounded by the separated free shear layer) also grows with increasing $l/D$ up to $l/D=1.5$. On a further increase to $l/D>1.5$, no more downstream movement of the separation point has been observed from the hemispherical forebody. The change in the location of the separation point for increasing $l/D$ in turn leads to the shifting of the reattachment point over the shoulder of the hemispherical forebody. Unlike the case of varying $d/D$, the reattachment point moves when $l/D$ is varied, which can be attributed to the upstream flow conditions. As $l/D$ increases, the separation point occurs after the expansion fan near the sharp spike tip, especially for $l/D>1.0$. The separation point thus moves between $0$ and $1.2D$ from the spike tip as $l/D$ increases. Such drastic variations alter the flow turning angle and the strength of the separation and reattachment shocks. The thickening boundary layer on the spike stem as $l/D$ increases adds further complexity and results in the reattachment point moving upstream ($l/D=2.0$). The observations are consistent with the findings in Figure~\ref{figure10}b. Similar results were also reported by Mair~\cite{279025:6283331}. The measured static pressure distribution for the blunt body without and with a spike of different $l/D$ can be seen in Figure~\ref{figure11}b. Similar to Figure~\ref{figure11}a, a significant reduction in the time-averaged static pressure on the hemispherical forebody is observed due to the adoption of a spike. 
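As a rough aid for reading the pressure coefficient plotted in Figure~\ref{figure11}, the normalisation can be evaluated from the freestream conditions quoted for the tunnel ($M_\infty=2.0$, $P_\infty=43908.88$ Pa). The ratio of specific heats $\gamma=1.4$ (air) and the symbol $q_\infty$ are assumptions introduced here for illustration:
\[
q_\infty=\frac{\gamma}{2}\,M_\infty^{2}\,P_\infty = 0.7\times 4\times 43908.88\ \mathrm{Pa}\approx 1.23\times10^{5}\ \mathrm{Pa},
\qquad
C_p=\frac{P-P_\infty}{q_\infty}.
\]
Under these assumptions, a change of $0.1$ in $C_p$ corresponds to a change of about $12.3$ kPa in the wall static pressure.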
The solid trend lines (Figure~\ref{figure11}b) help in identifying the peaks of the pressure distribution curves for a spike mounted hemisphere. Each peak corresponds to the associated reattachment point which moves downstream with an increase in $l/D$. This observation is consistent with those made from the time-averaged shadowgraph images (see Figure~\ref{figure9}b). The downstream movement of the reattachment point over the hemispherical forebody with an increase in $l/D$ takes place up to $l/D=1.5$, as seen in Figure~\ref{figure11}b (see the solid trend lines). A further increase to $l/D>1.5$ causes the reattachment point to move back upstream, resulting in a pressure distribution curve similar to the one corresponding to the case of $l/D=1.0$. Thus, a reduction in the value of $c_d$ with increasing $l/D$ up to $l/D=1.5$ is anticipated. The effect of $l/D$ on $c_d$ obtained from the experiments is shown in Figure~\ref{figure12}b. In accordance with the time-averaged shadowgraph and static pressure measurements, a reduction in the values of $c_d$ is observed with an increase in $l/D$ up to $l/D=1.5$. A further increase in $l/D$ to $l/D=2.0$ results in an increase of $c_d$, as expected from the discussion in the previous paragraph. Hence, it can be concluded that the percentage of $c_d$ reduction increases with $l/D$ up to an optimum spike length of $l/D=1.5$ in the present study. \subsubsection{Investigation of time-resolved flow field} The change in dominant flow features around the hemispherical forebody resulting from the mounting of a sharp tip spike has been observed and explained in the previous section. Considering the surface pressure distribution values and percentage of drag reduction achieved from the parametric time-averaged study, a sharp tip spike with $l/D=1.5$ shows a maximum drag reduction and stands out as the most efficient spike geometry. 
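As a rough worked illustration, the percentage of drag reduction can be evaluated from the measured values in Table~\ref{table2}. The definition below is an assumption, since the text does not state explicitly how the percentage is computed, and $c_{d,0}$ is a symbol introduced here for the drag coefficient of the hemisphere without a spike:
\[
\Delta c_d\,[\%]=\frac{c_{d,0}-c_d}{c_{d,0}}\times100 .
\]
With $c_{d,0}=0.8$, this gives $(0.8-0.43)/0.8\times100\approx46\%$ for $l/D=1.5$, compared with about $16\%$ for $l/D=0.5$ ($c_d=0.67$), $35\%$ for $l/D=1.0$ ($c_d=0.52$), and $39\%$ for $l/D=2.0$ ($c_d=0.49$), all at $d/D=0.12$ and subject to the $\pm5\%$ uncertainty quoted for the measured $c_d$.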
However, mounting of a drag reducing spike also has a severe disadvantage associated with flow unsteadiness, which needs to be taken into consideration while looking for an efficient spike geometry. Therefore, a parametric time-resolved study is also required in order to compare the level of flow unsteadiness associated with the set of spike geometries included in the present study. The existence of low intensity shock related unsteadiness in the case of a forward facing step at a supersonic freestream Mach number has already been reported in \cite{david}. The authors reported that the free shear layer separating the re-circulation region from the external flow may become unstable (with respect to Kelvin-Helmholtz instabilities) and form large-scale structures. These structures interact with the separation shock and convect along the shear layer up to the reattachment point. They also reported that the charging and ejection of fluid mass in the re-circulation region is the driving mechanism for the shock related unsteadiness. Thus, the shock related unsteadiness associated with the hemispherical spiked body is expected to be similar to the flow phenomena seen in shock-wave turbulent boundary layer interaction (SWTBL) problems, as induced by compression ramps, reflected shocks, protrusions, and fins~\cite{279025:6296387}. \begin{figure}[htb!] \centering \includegraphics{images/figure13.png}{} \caption{Instantaneous shadowgraph images obtained at different time intervals showing the difference in dominant flow features observed around a hemispherical forebody when mounted with a typical sharp tip spike ($l/D=1.5,\;d/D=0.12$) at a freestream supersonic flow Mach number of $M_\infty=2.0$. Dominant flow features: 1. Leading edge shock, 2. Shocklet, 3. Large scale structures, 4. Shocklet moving downstream, 5. Reattachment shock, 6. Separation shock, 7. Expansion fan, 8. Re-circulation region. 
Flow is from left to right.} \label{figure13} \end{figure} Figure~\ref{figure13} shows the time-resolved instantaneous shadowgraph images captured at time intervals of $\Delta t=1/f$ (where $f = 43000$ Hz) over the hemispherical forebody mounted with a sharp tip spike of $l/D=1.5$ and $d/D=0.12$. In the supplementary material, time-resolved shadowgraph video files corresponding to each of the cases are given.\footnote{Shadowgraph video filenames for different cases: $d/D=0.06$ in `video1', $d/D=0.12$ in `video2', $d/D=0.18$ in `video3', $l/D=0.5$ in `video4', $l/D=1.5$ in `video5', $l/D=2.0$ in `video6', and `no spike' in `video7'. Captions for each of the video files are given in a separate text file called `VideoCaptions'.} The time-resolved shadowgraph images indicate the generation of shocklets (as shown in Figure~\ref{figure4}) near the separation point and their downstream propagation along the free shear layer. With the passage of time, the generation of large-scale structures due to the existence of shear layer instabilities (Kelvin-Helmholtz type) can be observed convecting along the shear layer. The large-scale structures result in the formation of shocklets which move downstream along the shear layer and interact with the reattachment shock. Reattachment and separation shock foot oscillations associated with the charging and ejection of fluid mass from the re-circulation region are also observed in Figure~\ref{figure13}. For clarity, the reader is referred to the video `video5' in the supplementary material for the sharp tip spike case with $l/D=1.5$ and $d/D=0.12$. To quantify the shock related unsteadiness generated by mounting a sharp tip spike on a hemispherical body, unsteady pressure fluctuations have been measured near the shoulder ($S/D=0.4$) of the spiked forebody configurations as shown in Figure~\ref{figure7}. 
The power spectral densities (psd) of the pressure signals obtained from a hemisphere without (black line) and with a spike of $l/D=1.0$ and $d/D=0.12$ (red line) are compared and presented in Figure~\ref{figure14}. For the case of a hemispherical forebody mounted with a typical sharp tip spike, a broadband spectrum with comparatively higher amplitudes is observed, indicating a significant enhancement of flow unsteadiness. Similar studies have been conducted with variations in the geometrical parameters (length, $l$, and diameter, $d$) of the sharp tip spike. \begin{figure}[htb!] \centering \includegraphics[width=0.5\textwidth]{images/figure14.png}{} \caption{Power spectra of the measured pressure fluctuations near the shoulder $(S/D=0.4)$ of the hemispherical forebody configuration (line: black-clean hemisphere with no spike, red-hemisphere mounted with a sharp spike of $l/D=1.5$ and $d/D=0.12$) at a supersonic freestream Mach number of $M_\infty=2.0$.} \label{figure14} \end{figure} In order to investigate the effect of the spike geometrical parameters on the shock related unsteadiness, sharp tip spikes of different diameters ($d/D=0.06$ to $0.18$, in steps of $0.06$) and different lengths ($l/D=0.5$ to $2.0$, in steps of $0.5$) are considered. The level of pressure loading $\left(\zeta\right)$ and the pressure fluctuation intensity $\left(\kappa\right)$ on the shoulder of the hemisphere have been derived from the unsteady pressure signals using Equations~\eqref{eq-zeta} and \eqref{eq-kappa}, respectively, and they are tabulated in Table~\ref{table2}. \begin{align} \label{eq-zeta} \zeta=\frac{P_{rms}}{P_\infty},\\ \label{eq-kappa} \kappa=\frac{P_{s}}{P_{rms}}, \end{align} where \begin{equation*} P_{rms}=\sqrt{\frac1n\sum_{i=1}^{n}P_i^{2}},\quad P_{s}=\sqrt{\frac1n\sum_{i=1}^{n}({P_i-\overline P})^{2}},\quad P_i=\overline P+P',\quad \overline P=\frac1n\sum_{i=1}^{n}P_i. 
\end{equation*} As seen in Table~\ref{table2}, with increasing spike diameter ($d$), the values of the pressure loading $\left(\zeta\right)$ and the associated $c_d$ increase. The reason for this is probably the increase in the strength of the reattachment shock. These findings are consistent with the postulates made regarding the reattachment shock strengths based on the time-averaged shadowgraph images. A gradual reduction in the values of the pressure loading $\left(\zeta\right)$ with increasing spike length up to $l/D=1.5$ can also be seen in Table~\ref{table2}. Here again the reason is probably the reduction in the strength of the reattachment shock. These findings are also consistent with the $c_d$ measurements and the postulates made from the time-averaged shadowgraph images as $l/D$ increases. Furthermore, the pressure fluctuation intensity $\left(\kappa\right)$ is reduced with an increase in the spike diameter. It can be observed from Table~\ref{table2} that the value of $\kappa$ increases with increasing spike length up to $l/D=1.5$ and remains almost constant as $l/D$ increases further. \begin{table}[hbt!] 
\caption{ Details of the measured drag coefficient ($c_d$) and the calculated pressure loading ($\zeta$) and pressure fluctuation intensity ($\kappa$) in the hemisphere mounted with different spike configurations at a freestream supersonic Mach number of $M_\infty=2.0$.} \label{table2} \centering \begin{tabular}{c|c|ccc|ccc} \hline \multirow{2}{*}{Spike} & \multirow{2}{*}{No} & \multicolumn{3}{c|}{$l/D=1.0$} & \multicolumn{3}{c}{$d/D=0.12$} \\ \cline{3-8} Configuration & Spike & $d/D=0.06$ & $d/D=0.12$ & $d/D=0.18$ & $l/D=0.5$ & $l/D=1.5$ & $l/D=2.0$ \\ \hline \begin{tabular}[c]{@{}c@{}}$c_d(\pm 5\%)$\\ (Drag Coefficient)\end{tabular} & 0.8 & 0.5 & 0.52 & 0.54 & 0.67 & 0.43 & 0.49 \\ \begin{tabular}[c]{@{}c@{}}$\zeta(\pm 5\%)$\\ (Pressure Loading)\end{tabular} & 2.89 & 2.8 & 3.2 & 3.5 & 4.1 & 2.6 & 3 \\ \begin{tabular}[c]{@{}c@{}}$\kappa(\pm 5\%)$ (Pressure\\ Fluctuation Intensity, $\%$)\end{tabular} & 1.6 & 12 & 11 & 8 & 8 & 14 & 13 \\ \hline \end{tabular} \end{table} \textbf{Effect of the spike diameter $(d)$: } The instantaneous frames of time-resolved shadowgraph images captured over a hemisphere mounted with a sharp tip spike of different diameters ($d$) and lengths ($l$) are presented in Figure~\ref{figure15}a and b. From Figure~\ref{figure15}a it can be seen that with the increase in $d/D$, the size of the large-scale structures (see Figure~\ref{figure4}) formed along the free shear layer is reduced. This, in turn reduces the strength of the convecting shocklets (as shown in Figure~\ref{figure4}) formed along the free shear layer resulting in lesser oscillation of the reattachment shock. \begin{sidewaysfigure} \centering \includegraphics{images/figure15.png}{} \caption{{Instantaneous shadowgraph images showing the effect of (a) spike diameter ($d/D$) and (b) spike length ($l/D$), on the dominant flow features around a hemispherical forebody at a freestream supersonic flow Mach number of $M_\infty=2.0$. Dominant flow features: 1. Shocklets (stronger), 2. 
Shocklets (weaker), 3. Large scale structures, 4. Finer scale structures, 5. Weak leading edge shock, 6. Expansion fan, 7. Separation shock, 8. Reattachment shock, 9. Recirculation region (larger), 10. Recirculation region (smaller), 11. Separated shear layer. Flow is from left to right.}} \label{figure15} \end{sidewaysfigure} Spectral analysis has been made using the measured pressure fluctuations at the location near the shoulder of the hemisphere ($ S/D=0.4 $, see Figure~\ref{figure7}) for all cases considered. The effect of change in spike diameter ($d$) on the power spectra is presented in Figure~\ref{figure16}a. Broad band spectra for all the cases are exhibited with higher amplitudes compared to the case with no spike. In particular, the range of frequencies between 2-8 kHz seems to be amplified and is probably associated with the amplification of shear layer instabilities. The difference in the spectra with change in the $d/D$ is noticeable at a range of frequencies around 5 kHz. As also observed from the time-resolved instantaneous shadowgraph frames, the intensity of the shock related unsteadiness is reduced with increase in $d/D$. The highest/lowest intensity of the power spectra is observed for the spike with a minimum/maximum diameter of $d/D=0.06/0.18$, respectively (blue/yellow line of Figure~\ref{figure16}a). \begin{figure}[htb!] \centering \includegraphics{images/figure16.png}{} \caption{Power spectra of the measured pressure fluctuations near the shoulder $(S/D=0.4)$ of the hemispherical forebody configuration showing the effect of (a) spike diameter ($d/D$) and (b) spike length ($l/D$), at a supersonic freestream Mach number of $M_\infty=2.0$.} \label{figure16} \end{figure} \textbf{Effect of the spike length $(l)$: } Similar studies have been conducted in order to investigate the effect of the spike length ($l $) on the time-resolved flow features around the hemispherical spiked body configuration. 
It is observed that the size of the large-scale structures formed along the free shear layer increases slightly with the spike length up to $l/D=1.5$, as shown in Figure~\ref{figure15}b. A further increase in the spike length to $l/D=2.0$ does not affect the size of the large-scale structures. With an increase in $l/D$, the flow separation point on the spike stem moves downstream along the spike, thereby increasing the separation shock angle. As mentioned in the previous section, this leads to a lower reduction in the Mach number behind the separation shock. Consequently, the convective Mach number across the shear layer is increased, as reported by Slessor et al.~\cite{279025:6948594}. This leads to a lower growth rate of the shear layer. A lower growth rate reduces the size of the large-scale structures, which accounts for the observed increase in the pressure fluctuation intensity $(\kappa)$ shown in Table~\ref{table2}. A similar trend, but to a lesser extent, is observed in the power spectra (Figure~\ref{figure16}b) over the hemisphere mounted with sharp tip spikes of different $l/D$. As explained earlier, with an increase in the spike diameter ($d$), the flow separation point moves upstream, reducing the length of the shear layer. This in turn reduces the growth of the large-scale structures formed along the shear layer. In addition, an increase in $d$ further reduces the angle formed by the re-circulation region at the point of separation, thereby reducing the flow turning angle required ahead of the reattachment point. Thus the power spectrum (Figure~\ref{figure16}a) from the unsteady pressure measurements shows a decreasing trend for increasing $d$. 
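For reference, the convective Mach number invoked in the spike-length argument above can be written in its commonly used isentropic form for two streams with equal specific-heat ratio. The symbols $U_1$, $a_1$ (velocity and speed of sound on the high-speed side of the shear layer, behind the separation shock) and $U_2$, $a_2$ (on the low-speed, re-circulation side) are introduced here only for illustration; they are assumptions of this sketch and are not quantities measured in the present experiments. \begin{equation*} M_c=\frac{U_1-U_2}{a_1+a_2}. \end{equation*} Under this definition, a smaller Mach number reduction across the separation shock leaves a higher $U_1$ on the outer side of the shear layer and hence a larger $M_c$, which is the sense in which the reduced shear layer growth rate is argued here. 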
\begin{sidewaysfigure} \centering \includegraphics{images/figure17}{} \caption{Dominant energetic spatial mode $\left[\Phi_1(x,y)\right] $ obtained from the POD analysis of shadowgraph images taken from the hemispherical spiked body configuration mounted with a sharp tip spike of (a) different diameters ($d/D$) for a $l/D=1.0$, and (b) different lengths ($l/D$) for a $d/D=0.12$ at a freestream supersonic Mach number of $M_\infty=2.0$. Flow features: 1. Unsteady separation shock, 2. Separated shear layer, 3. Re-circulation region, 4. Reattachment shock, 5. Strong coupling between separated and reattached shock oscillations, 6. Weak coupling between separated and reattached shock oscillations. Flow is from left to right.} \label{figure17} \end{sidewaysfigure} The effect of the spike diameter ($d$) and spike length $(l)$ on the shock related unsteadiness discussed so far is based upon the results obtained from the time-resolved instantaneous shadowgraph images and the unsteady pressure measurements. Furthermore, an attempt has been made to verify our findings using modal analysis of the time-resolved shadowgraph images. For that purpose, both POD and DMD analysis have been carried out and the dominant energetic spatial mode ($\Phi_1\left(x,y\right)$) and dynamic temporal mode ($\alpha\left[\Theta\left(x,y\right)\right],f$) have been computed for the case of a hemispherical forebody mounted with different spikes. The dominant energetic spatial mode $\left[\Phi_1\left(x,y\right)\right] $ represents the corresponding time-averaged flow field image as shown in Figure~\ref{figure17}. Only time varying dominant flow features are observed in Figure \ref{figure17}. The absence of leading edge shock in the dominant energetic modes verifies the fact that it is steady. The horizontal distance between the extrema of the colour contours near the separation and reattachment shocks in Figure~\ref{figure17} marks the extrema of shock oscillations. 
It is observed that oscillations of the separation and reattachment shocks exist in all the investigated spike-mounted hemispherical forebody configurations. However, the magnitudes of the separation and reattachment shock oscillations (Figure~\ref{figure17}a) remain similar as the spike diameter ($d$) changes, with $\Delta x/D=0.15$ (separation shock) and $\Delta x/D=0.20$ (reattachment shock). In addition, the existence of opposite bands of colour contours (for example, in Figure~\ref{figure17}a-iii the separation shock has a blue-red contour band and the reattachment shock has a red-blue contour band of similar magnitude in $\Delta x/D$) means that the oscillations of the separation and reattachment shocks are strongly coupled and negatively correlated \cite{kutz} (out-of-phase shock motion). Such observations are consistent with the conclusions in \cite{david} regarding the charging and ejection of fluid mass in the recirculation region for a forward-facing step in a supersonic stream, where a distinct out-of-phase movement of the separation and reattachment shocks is seen. In the case of varying spike length $(l)$, as shown in Figure~\ref{figure17}b, the shock oscillation intensity is observed to decrease and the coupling between the separation and reattachment shocks weakens. These observations corroborate the shadowgraph images shown in Figure~\ref{figure9} and Figure~\ref{figure15}. \begin{figure}[htb!] 
\centering \includegraphics{images/figure18.png}{} \caption{The dynamic spectra observed from the DMD ($\Theta(x,y)$) analysis for the hemisphere mounted with a sharp tip spike of (a) different spike diameters ($d/D$) for $l/D=1.0$ and (b) different spike lengths ($l/D$) for $d/D=0.12$ at a freestream Mach number of $M_\infty=2.0$.} \label{figure18} \end{figure} The dynamic spectra have been obtained from DMD for all the considered cases, and they are compared in Figure~\ref{figure18}. The findings from the DMD analysis are consistent with the spectral contents obtained from the pressure measurements (see Figure~\ref{figure16}). From Figure~\ref{figure18}a, it can be seen that the unsteadiness associated with the hemispherical spiked body configuration decreases with increasing $d/D$. The sharp tip spike having the maximum $d/D$ has the least shock related unsteadiness. On the other hand, the intensity of the shock related unsteadiness increases with the spike length up to $l/D=1.5$, as shown in Figure~\ref{figure18}b. A further increase to $l/D=2.0$ does not affect the intensity of the shock related unsteadiness, which remains almost the same as in the case of $l/D=1.5$. Hence, our findings from the modal analysis support the postulates made from the time-resolved shadowgraph images and unsteady pressure measurements. \subsection{Effect of spike tip geometry} In the previous section, we have shown the effect of the diameter ($d$) and the length ($l$) of a sharp tip spike on the time-averaged and time-resolved flow features generated around the hemispherical spiked body. From the parametric studies conducted so far, the sharp tip spike with $l/D=1.5$ and $d/D=0.12$ is found to be the most effective in terms of drag reduction (see Figure~\ref{figure12}b), but it generates a considerable amount of shock related unsteadiness (see Figure~\ref{figure16}b-yellow line). 
Similarly, the sharp tip spike with $l/D=1.0$ and $d/D=0.18$ generates the minimal amount of shock related unsteadiness (see Figure~\ref{figure16}a-yellow line), but the drag reduction in this case is quite low (see Figure~\ref{figure12}a). In order to devise a compromise spike geometry that offers better drag reduction while generating a minimal level of shock related unsteadiness, the tip of the sharp spike has been altered. Based on the investigations of the shock related unsteadiness over the hemispherical spiked body configuration and the parameters influencing the level of unsteadiness, a spike tip of hemispherical shape with different base shapes (vertical, circular, and elliptical) has been adopted. Hence, further experiments have been conducted on a hemispherical spiked body configuration with a spike of hemispherical tip whose radius is equal to the diameter of the spike stem ($d$). The overall length of the spike is kept equal to the forebody diameter ($l/D$=1.0). The effect of the spike length and the spike stem diameter with a hemispherical spike tip has not been investigated in the present study. The geometrical details of the hemispherical spike tip with different base shapes are shown in Figure~\ref{figure19}. \begin{figure}[] \centering \includegraphics[width=0.7\textwidth]{images/figure19.png}{} \caption{Schematic showing the primary geometrical parameters of the hemispherical tip spike on a hemispherical spiked body configuration. Different base shapes of the hemispherical spike tip are shown: (a) vertical base, (b) circular base, and (c) elliptical base. The origin is at the spike tip. Flow is from left to right.} \label{figure19} \end{figure} \begin{table}[hbt!] 
\caption{ Details of the measured drag coefficient ($c_d$) and the calculated pressure loading ($\zeta$) and pressure fluctuation intensity ($\kappa$) in the hemispherical forebody mounted without and with spike tip of different shapes $(l/D=1.0,\;d/D=0.12)$ at a freestream supersonic Mach number of $M_\infty=2.0$.} \label{table3} \centering \begin{tabular}{cccc} \hline Spike Configurations & $c_d (\pm 5\%)$ & $\zeta (\pm 5\%)$ & $\kappa (\pm 5\%)$, $\%$ \\ \hline Without spike (hemispherical forebody) & 0.8 & 2.89 & 1.6 \\ Sharp spike tip ($l/D=1.0,\;d/D=0.12$) & 0.52 & 3.2 & 11 \\ Hemispherical spike tip (vertical base) & 0.36 & 2.04 & 10 \\ Hemispherical spike tip (circular base) & 0.38 & 2.11 & 12 \\ Hemispherical spike tip (elliptical base) & 0.39 & 2.22 & 11\\ \hline \end{tabular} \end{table} \subsubsection{Investigation of time-averaged flow field}The time-averaged images (through the operation of $\left\|\bar{\boldsymbol I}-\boldsymbol I_{rms}\right\|$) obtained from set of 1000 shadowgraph images captured at a high frame rate of 43000 Hz and exposure time of 2 $\mu s$ are shown in Figure~\ref{figure20}. The formation of a detached weak bow shock ahead of the hemispherical spike tip (see Figure~\ref{figure20}b-d) unlike the attached separation shock formed from the spike body (see Figure~\ref{figure20}a) in the case of a sharp tip spike is observed. From the time-averaged shadowgraph images, it is noticed that the intensity of the reattachment shock is reduced by changing the shape of the spike tip from a sharp tip to a hemispherical tip. This would in turn lead to a reduction in the level of the surface pressure distribution over the hemisphere and thereby reducing the associated forebody drag. The surface pressure distribution measured over the hemispherical body mounted with a hemispherical spike tip is shown in Figure~\ref{figure21}. 
A considerable reduction in the surface pressure level is observed for the case of the hemispherical spike tip in comparison to the case of a sharp tip spike. The peak of the pressure plot corresponding to the reattachment point is seen to move downstream (see the solid trend lines in Figure~\ref{figure21}). One of the primary reasons is the increase in the size of the re-circulation region associated with the vertical movement of the separated free shear layer caused by changing the spike tip geometry. The separated free shear layer from the hemispherical spike tip now forms a smaller separation angle with the body axis, so that the flow leaves the forebody tangentially; hence the downstream movement of the reattachment point. A reduction in the flow turning angle near the shoulder of the hemispherical forebody due to the downstream movement of the reattachment point lowers the strength of the reattachment shock and the associated level of the pressure distribution (see Table~\ref{table3}). In addition, the formation of a detached bow shock wave ahead of the hemispherical spike tip reduces the Mach number to a greater extent and again weakens the reattachment shock formed further downstream near the hemispherical forebody. The reduction in the surface pressure distribution and the increase in the size of the re-circulation region should result in a further reduction in $c_d$. As tabulated in Table~\ref{table3}, a reduction in $c_d$ of 55\% and 35\% is indeed observed in comparison with the forebody having no spike and with a sharp tip spike ($l/D=1.0,\;d/D=0.12$), respectively. As the spike base changes from vertical to circular and elliptical, the gross flow features remain the same. However, from the static pressure measurements (Figure~\ref{figure21}), the elliptical base shows a slight increment, suggesting a slightly higher reattachment shock strength. 
The value of $c_d$ is also slightly larger (Table~\ref{table3}), primarily because of the slightly higher $\zeta$ but also because the base surface area is considerably increased. \begin{figure}[htb!] \centering \includegraphics[width=\textwidth]{images/figure20.png}{} \caption{Time-averaged shadowgraph image obtained through the operation of $\left\|\bar{\boldsymbol I}-\boldsymbol I_{rms}\right\|$ for the hemispherical forebody mounted with (a) a sharp tip spike ($l/D=1.0,\;d/D=0.12$), (b) a hemispherical spike tip with a vertical base ($l/D=1.0,\;d/D=0.12$), (c) a hemispherical spike tip with a circular base ($l/D=1.0,\;d/D=0.12$), and (d) a hemispherical spike tip with an elliptical base ($l/D=1.0,\;d/D=0.12$) at a freestream supersonic flow Mach number of $M_\infty=2.0$. Dominant flow features: 1. Attached leading edge shock, 2. Separation shock, 3. Re-circulation region (smaller), 4. Reattachment shock, 5. Detached bow shock, 6. Re-circulation region (larger), 7. Separated free shear layer. Flow is from left to right.} \label{figure20} \end{figure} \begin{figure}[htb!] \centering \includegraphics[width=0.8\linewidth]{images/figure21.png}{} \caption{The effect of spike tip on the coefficient of the measured wall static pressure ($C_p = {2}(P-P_\infty)/{\gamma M^2 P_\infty}$) over the hemispherical forebody without and with spike of different spike-tip shapes ($l/D=1.0,\;d/D=0.12$) at a freestream supersonic flow Mach number of $M_\infty=2.0$. Solid line shows the trend.} \label{figure21} \end{figure} \subsubsection{Investigation of time-resolved studies} In order to investigate the effect of changing the geometry of the spike tip on the shock related unsteadiness, time-resolved studies have been conducted. 
The instantaneous shadowgraph images\footnote{The corresponding time-resolved shadowgraph video files are given in the supplementary material under the names `video8', `video9', and `video10'.} captured at different time intervals over a hemispherical body mounted with a spike having a hemispherical tip of different base shapes are shown in Figure~\ref{figure22}. It can be observed that the generation of a detached shock, together with the location of the flow separation point just behind it on the shoulder of the hemispherical tip, eliminates the separation shock and thereby restricts the shock-wave turbulent boundary layer interactions (SWTBL). Consequently, only weak shocklets are formed, and they are barely visible in Figure~\ref{figure22}a. For different base shapes, the gross flow features remain the same. Furthermore, the large-scale structures formed along the free shear layer are observed to be smaller for the hemispherical spike tip than those associated with the sharp tip spike of the same length. A comparison of instantaneous time-resolved shadowgraph images\footnote{The corresponding time-resolved shadowgraph video files are given in the supplementary material under the names `video2' and `video8'.} captured over a hemispherical body mounted with a sharp tip spike and a hemispherical tip spike is presented in Figure~\ref{figure23}. The presence of weaker shocklets and the reduction in the size of the large-scale structures when the spike tip geometry is changed from a sharp tip to a hemispherical tip suggest that a reduction in the intensity of the shock related unsteadiness is possible. \begin{figure}[htb!] 
\centering \includegraphics{images/figure22.png}{} \caption{Instantaneous shadowgraph images at different time intervals around a hemispherical forebody when mounted with a hemispherical tip spike ($l/D=1.0,\;d/D=0.12$) of different base shapes: (a) vertical, (b) circular, and (c) elliptical at a freestream supersonic flow Mach number of $M_\infty=2.0$. Dominant flow features: 1. Detached bow shock, 2. Re-circulation region, 3. Weak shocklets, 4. Large-scale structures in the separated free shear layer, 5. Reattachment shock. Flow is from left to right.} \label{figure22} \end{figure} \begin{figure}[htb!] \centering \includegraphics[width=0.8\textwidth]{images/figure23.png}{} \caption{Instantaneous shadowgraph images showing the comparison of the dominant flow features observed around a hemispherical body mounted with (a) a sharp tip spike ($l/D=1.0,\;d/D=0.12$) and (b) a hemispherical tip spike ($l/D=1.0,\;d/D=0.12$), at a freestream supersonic flow Mach number of $M_\infty=2.0$. Dominant flow features: 1. Attached leading edge shock, 2. Separation shock, 3. Shocklets (stronger), 4. Recirculation region (smaller), 5. Detached bow shock, 6. Shocklets (weaker), 7. Recirculation region (larger). Flow is from left to right.} \label{figure23} \end{figure} To further support the above-mentioned possibilities, unsteady pressure fluctuations have been measured near the hemisphere shoulder ($S/D=0.4$) for the hemispherical body mounted with the hemispherical spike tip. As tabulated in Table~\ref{table3}, a reduction of 39\% in the value of $\zeta$ is observed. Similarly, the pressure fluctuation intensity $\left(\kappa\right)$ is reduced to 10\% for the hemispherical tip spike, which is slightly lower than for the sharp tip spike of the same $l/D$ and $d/D$. 
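As a quick arithmetic check on the change in pressure loading, the tabulated mean values in Table~\ref{table3} for the sharp tip spike ($\zeta=3.2$) and for the hemispherical tip spike with a vertical base ($\zeta=2.04$) can be compared directly. Which baseline and which base shape underlie the quoted percentage is an assumption of this sketch, the subscripted symbols are introduced only for this illustration, and each tabulated value carries the stated $\pm5\%$ uncertainty. \begin{equation*} \frac{\zeta_{\mathrm{sharp}}-\zeta_{\mathrm{hemi}}}{\zeta_{\mathrm{sharp}}}=\frac{3.2-2.04}{3.2}=\frac{1.16}{3.2}\approx 0.36. \end{equation*} The mean values therefore give a reduction of roughly 36\%, which is compatible with the quoted figure once the $\pm5\%$ uncertainty on each value is taken into account. 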
The pressure spectra obtained from the measured pressure fluctuations are compared in Figure~\ref{figure24} with those obtained for the case of a sharp tip spike of $l/D=1.0$ and $d/D=0.12$. It can be clearly seen that the intensity of the shock related unsteadiness is reduced by changing the spike tip geometry from a sharp tip to a hemispherical tip. When the base shape of the hemispherical spike tip is changed, only small variations are observed. The communication of wave-fronts/disturbances from the forebody to the separated free shear layer is partially blocked in the case of the vertical base, whereas it is not blocked in the case of the circular and elliptical bases. This leads to continuous communication of disturbances, which increases $\kappa$ (Table~\ref{table3}). The free shear layer separates almost at the vertex of the vertical base, whereas it separates slightly downstream for the circular and elliptical bases. Such a movement changes the size of the re-circulation region and the location of the reattachment point. These changes result in a slightly higher $\zeta$ (Table~\ref{table3}) for the circular and elliptical bases. The observations from Figure~\ref{figure24} are also in accordance with the above statements. \begin{figure}[htb!] \centering \includegraphics[width=0.8\textwidth]{images/figure24.png}{} \caption{Power spectral densities measured near the shoulder $(S/D=0.4)$ of the hemispherical forebody configuration without and with spike of different spike-tip shapes ($l/D=1.0,\;d/D=0.12$) at a supersonic freestream Mach number of $M_\infty=2.0$.} \label{figure24} \end{figure} Furthermore, both POD and DMD analyses have been carried out for the case of a hemispherical body mounted with a hemispherical tip spike of different base shapes, and the dominant energetic spatial mode and dynamic temporal mode have been computed. 
The dominant energetic spatial modes obtained for the hemisphere mounted with a typical sharp tip spike and with the hemispherical tip spike are shown in Figure~\ref{figure25}. The elimination of the separation shock and a reduction in the strength of the reattachment shock obtained by mounting a hemispherical tip spike can be clearly seen in the dominant energetic modes shown in Figure~\ref{figure25}b-d when compared with the reattachment shock strength generated in the case of the sharp tip spike (see Figure~\ref{figure25}a). The out-of-phase correlation between the separation shock and reattachment shock oscillations existing for the hemispherical body mounted with a sharp tip spike is absent when the spike tip is changed to a hemispherical tip of different base shapes. Furthermore, it can be observed that the dominant spatial mode is mainly concentrated around the leading-edge shock for the modified spike tip geometry of different base shapes. The findings from the DMD analysis (see Figure~\ref{figure26}) are also consistent with the spectral contents obtained from the pressure measurements (see Figure~\ref{figure24}). The reduction in the intensities of the broad band of spectra in the range of frequencies from 1000 Hz to 5000 Hz obtained by adopting the hemispherical tip spike of different base shapes can be observed in Figure~\ref{figure26}. This reduction in the amplitude of the broad band of frequencies indicates a reduction of the shock related unsteadiness. For the circular and elliptical base shapes, the values are higher for the reasons given in the preceding paragraphs. The results from POD and DMD support the observations from the qualitative and quantitative measurements. \begin{figure}[htb!] 
\centering \includegraphics[width=\textwidth]{images/figure25.png}{} \caption{Dominant energetic spatial mode $\left[\Phi_1\left(x,y\right)\right]$ obtained from the POD analysis of shadowgraph images of the hemispherical body configuration mounted with (a) a typical sharp tip spike (b) a hemispherical spike tip with a vertical base ($l/D=1.0,\;d/D=0.12$), (c) a hemispherical spike tip with a circular base ($l/D=1.0,\;d/D=0.12$), and (d) a hemispherical spike tip with an elliptical base ($l/D=1.0,\;d/D=0.12$), at a freestream supersonic Mach number of $M_\infty=2.0$. Flow features: 1. Steady/weakly moving detached bow shock, 2. Re-circulation region, 3. Separated free shear layer and reattachment shock interaction zone, 4. Absence of separated shock motion and mildly unsteady separated shear layer, 5. Oscillating reattachment shock (weaker), 6. Unsteady separated shear layer, 7. Unsteady reattachment shock, 8. Moving separated shock, 9. Moving reattachment shock (stronger). Flow is from left to right.}