\chapter{Implementation}
\label{chap:implementation}

This chapter contains some of the details of the implementation of preload.
Preload is implemented in C and is less than 3000 lines of code, and becomes
a small 35kb binary when compiled.  However, it uses the GLib
library\footnote{Available from \url{http://www.gtk.org/}} for basic data
structures and the application main loop.  GLib is a convenience library for C
programming that is widely used in the GNOME
project\footnote{\url{http://www.gnome.org/}}.


\section{Dependencies}

Preload uses the \syscall{readahead} system call that is specific to the Linux
kernel and supported by the GNU libc implementation.  Most other Unix kernels
provide system calls with similar functionality, so it is rather
straightforward to port preload to other UNIX systems, like the BSDs or
Solaris.  Depending on the implementation, the \syscall{madvise} system call
may be used as a substitute.  See \autoref{sec:prefetching} for details.

The GLib library that preload depends on is ported to various systems and can
be compiled (and is usually available) on all modern and legacy desktop
systems.


\section{Configuration Parameters}\label{sec:conf-params}

Preload reads configuration parameters from an INI-style text file that is
normally located at \file{\$prefix/etc/preload.conf}.  This can
be changed at compile time, or using the \opt{conffile} command line
argument when invoking preload.
The recognized configuration parameters follow.

\subsection{Model parameters}
\label{subsec:model-params}

The model parameters control various aspects of the model and algorithms as
described in \autoref{sec:model} and \autoref{chap:algorithms}.
The following model parameters are recognized:

\begin{itemize}

\item\confkeydoc{model.cycle}{integer}{seconds}{20}

This is the quantum of time for preload.  Preload performs
data gathering and predictions every cycle.

\emph{Note:} Setting this parameter too low may reduce
system performance and stability.

\item\confkeydoc{model.halflife}{integer}{hours}{168 (one week)}

The time it takes for statistic information to lose importance
to a level half of the current importance.  This parameter
controls how soon preload forgets about the past.  Setting
it to a higher value makes it take longer for preload to
change its mind about its beliefs, while setting it lower
makes preload more sensitive and responsive to the current
happenings.  This is used to compute the decay factor for all
exponentially-fading mean computations.

Setting this parameter too high will essentially
prevent preload from learning new things.

\item\confkeydoc{model.minsize}{integer}{bytes}{2\thinspace000\thinspace000}

This is the minimum sum of the length of maps of the process for
preload to consider tracking an application.

\emph{Note:} Setting this parameter too high will make preload less
effective, while setting it too low will make it consume
quadratically more resources, as it tracks more processes.

\end{itemize}


\subsection{Memory Usage Parameters}

The total memory preload uses for prefetching is computed using
the following formula:
\begin{eqnarray}
\begin{array}{rcl}
	\max &(0, &\text{\confkey{memstat.total}} \times \text{\confkey{model.memtotal}}\\
	&+ &\text{\confkey{memstat.free}}   \times \text{\confkey{model.memfree}})\\
	+ &&\text{\confkey{memstat.cached}} \times \text{\confkey{model.memcached}}
\end{array}
\label{eqn:memory}
\end{eqnarray}
where \confkey{memstat} is the MemStat object filled with memory statistics of
the system at runtime.

The formula above is written such that it can use configuration parameters to
express all (interesting) linear combinations of system memory variables.  The
more memory preload receives for prefetching, the more effective it is, at the
cost of more memory used.

The following parameters control how much memory preload is allowed to use for
prefetching in a cycle.  All values are percentages and are clamped to the
range -100 to 100.

\begin{itemize}

\item\confkeydoc{model.memtotal}{integer}{percentage}{-10}

Multiplier for total memory variable part of the memory usage formula.

\item\confkeydoc{model.memfree}{integer}{percentage}{100}

Multiplier for free memory variable part of the memory usage formula.
Reducing this value makes preload less agressive during the boot time, where
most of the memory is free.

\item\confkeydoc{model.memcached}{integer}{percentage}{30}

Multiplier for cached memory variable part of the memory usage formula.
Increasing this value makes preload more agressive during steady state after
the boot.  That is, for the most of the operating time of the system.  During
this time the kernel typically does not let memory to stay free for long and
uses it for page-cache.  When preload prefetches files, it is essentially
causing other pages to be evicted from the cache.

\end{itemize}

The default values for the memory usage formula result in:
\begin{eqnarray}
\begin{array}{rcl}
	\max &(0, &-10\% \times \text{\confkey{model.memtotal}}\\
	&+ & 100\%   \times \text{\confkey{model.memfree}})\\
	+ && 30\% \times \text{\confkey{model.memcached}}
\end{array}
\end{eqnarray}
which essentially means: use all free memory except for ten percent of
total memory, and thirty percent of memory already used for caches.

When a system is in steady state, there is little free memory available since
the kernel utilizes most of the free memory for caching.  During boot time on
the other hand, there is little cached memory and a lot of free memory.
Given this, the \confkey{model.memfree} and \confkey{model.memcached} controls
enable tuning preload's aggressiveness during boot process and steady state 
fairly separately.


\subsection{System Parameters}
\label{subsec:system-params}

System parameters enable customizing aspects of the preload daemon that are
specific to the implementation and unlike model parameters, do not directly
affect the internal working of the scanning and prefetching subsystems.

\begin{itemize}

\item\confkeydoc{system.doscan}{boolean}{\relax}{true}

Specifies whether preload should monitor running processes and update its
model state.  This is the monitoring subsystem of preload and should
normally run, but you may want to temporarily turn it off for various reasons like
testing predictions.

Note that if scanning is off, predictions are made based on whatever processes
have been running when preload started again and again, and the list of
running processes is not updated at all.

\item\confkeydoc{system.dopredict}{boolean}{\relax}{true}

Specifies whether preload should make prediction and prefetch anything off the
disk.  Similar to \confkey{system.doscan}, you normally want this to be
enabled: this is the other subsystem in preload.  But you may want to
temporarily turn it off, for example to only train the model without
prefetching anything.

These two parameters allow turning scan/predict subsystems on/off on the fly,
by modifying the configuration file and signaling the daemon.

\item\confkeydoc{system.autosave}{integer}{seconds}{3600}

Preload will automatically save its state to disk periodically, and this
parameter determines how often.  This is only relevant if
\confkey{system.doscan} is enabled.

The state is saved when the daemon exits normally (using termination signals).
So auto-saving is not strictly required.

\item\confkeydoc{system.mapprefix}{string}{\relax}{\emph{empty}, accept all}

A list of path prefixes that controls which mapped files should be scanned by
preload.  The list items are separated by semicolons.  Matching stops as soon
as an item matches.  For each item, if it matches the beginning of the full
file name, a match occurs, and the file is accepted.  If on the other hand,
the item starts with an exclamation mark as its first character, the rest of
the item is considered for matching, and if a match happens, the file is
rejected.

As an example a value of \file{!/lib/modules;/} means that every file other
than those in \file{/lib/modules} should be accepted.  In this case, the
trailing item can be removed, since if no match occurs, the file is accepted.
It is advised to make sure that \file{/dev} is rejected, since preload does
not differentiate device files internally.

\item\confkeydoc{system.exeprefix}{string}{\relax}{\emph{empty}, accept all}

A list of path prefixes that controls which binary executables should be
scanned by preload.  The syntax is exactly the same as
\confkey{system.mapprefix}.

\end{itemize}


\section{Persistent State Storage}

The persistent state of the model is stored in a simple text file that is
normally located at \file{\$prefix/var/lib/preload/preload.state}.  This can
be changed at compile time, or using the \opt{statefile} command line
argument when invoking preload.

The file is read and used to populate the model when preload starts, or if it
does not exist, an empty model is constructed.  It will be saved periodically
as set with the \confkey{autosave} configuration parameter, when preload is
shutting down, or upon receiving the \sig{USR2} signal.


\section{Prefetching}
\label{sec:prefetching}

Preload uses the \syscall{readahead} system call that is specific to the Linux
kernel and supported by the GNU libc implementation.  Most other Unix kernels
provide system calls with similar functionality, so it is rather
straightforward to port preload to other UNIX systems, like the BSDs or
Solaris.  There are an entire family of \emph{advise} system calls that can be
used as a substitute, depending on the implementation: \syscall{madvise},
\syscall{fadvise}, \syscall{posix\_madvise}, and \syscall{posix\_fadvise}.  If
all fails, one can even use the \syscall{read} system call to fetch data into
main memory, this is a bit wasteful however, as it makes the kernel copy the
data into user-space unnecessarily.

One way to improve prefetching performance is to make sure all files to be
prefetched are queued instead of sequentially read, such that the kernel has
more opportunities to avoid seeking around the hard disk.  However, the
\syscall{readahead} system call is implemented as a command and will block the
caller until the request is fulfilled.  The \syscall{posix\_madvise}
implementation specification is more promising with regard to this.  However,
the current implementation in Linux is synchronous, like \syscall{readahead}
\cite{kernelnewbies:riel}.

The Fedora readahead package described in \autoref{subsec:fedora-readahead}
uses a filesystem-specific API to determine the place of the first block of each
file on the disk, and sorts files to be prefetched on the position of their
first block and reads them in that order, supposedly reducing disk access
time.

Since preload tries to be fairly conservative and not affect system balance
in a noticeable way, reading files one at a time is probably acceptable.

Note that preload generates prefetching system calls for all the selected maps
even if it prefetched some of them at the previous cycle.  This means, if no
applications are started or shut down, preload generates the exact same system
calls as the previous cycle and if those maps are still in memory, the system
call essentially becomes a no-operation.

We also call \syscall{sched\_yield} after reading every few files to give away
processor volunteerly instead of exhausting our time slice.  This is useful to
keep system responsive as being I/O bound during the prefetching phase,
preload will get elevated priority over other processes.


\section{Resource Consumption}

Preload has a modest memory footprint.  Its main memory consumption is the
model that is kept in memory, and with an uncommonly large setting of 1000 maps and
100 applications, it operates in less than 3MB of memory, the main contender
being the Markov objects that are quadratic in the number of applications, and
take less than 200 bytes.

The process is in the sleep state most of the time, waiting for the next cycle
or blocking on I/O, so the load on the processor is minimal.  At each cycle it
involves scanning \proc\ to gather data and the mathematical computation to
update the model and make predictions.  When the system is in an stable state
(not swapping or under tight memory conditions) and the set
of running applications does not change, preload enters a steady state after
a few cycles and stops making new I/O requests.  This means that it does not
interfere with power-saving activities on laptop systems, like turning the
hard disk drive off.


\section{Running Preload}

When invoked, preload starts running as a background daemon on the system.
It accepts the following command line arguments:
\begin{itemize}

\item\textbf{\opt{help}:}
writes out information about invoking preload and exits.

\item\textbf{\opt{version}:}
writes out the version number of the preload binary and exits.

\item\textbf{\opt{conffile \emph{file}}:}
sets the file containing configuration parameters (defaults to
\file{\$prefix/etc/preload.conf}).  The configuration file is used for
tuning model parameters as well as other behaviors of preload.  Configuration
parameters are enumerated in detail in \autoref{sec:conf-params}.

\item\textbf{\opt{statefile \emph{file}}:}
sets the file  for storing persistent state data (defaults to
\file{\$prefix/var/lib/preload/preload.state}).

\item\textbf{\opt{logfile \emph{file}}:}
sets the file used to write out informative messages to (defaults to
\file{\$prefix/var/log/preload.log}).
An empty \file{\emph{file}} argument redirects messages to the standard
output.

\item\textbf{\opt{foreground}:}
instructs preload to run in the foreground, not as a daemon.

\item\textbf{\opt{nice \emph{adjustment}}:}
adjusts the nice level of the daemon (defaults to +15).

\item\textbf{\opt{verbose \emph{level}}:}
adjusts the verbosity level of the daemon.  Levels 0 to 10 are recognized with
10 being the most verbose (defaults to 4).

\item\textbf{\opt{debug}:}
enable debugging mode.  Equivalent to
\opt{logfile '{}'} \opt{foreground} \opt{verbose 9}.

\end{itemize}

When running, preload responds to a variety of signals:
\begin{itemize}

\item\textbf{\sig{HUP}:}
Reloads configuration file and reopens the log file.  Reopening log file
allows for rotating the log file (backing up current content and starting a
new one) without restarting the daemon.

\item\textbf{\sig{USR1}:}
Dumps messages containing the current state and configuration parameters.
Useful for debugging purposes.

\item\textbf{\sig{USR2}:}
Saves the current state to the state file.

\item\textbf{\sig{INT}, \sig{QUIT}, \sig{TERM}:}
Saves the state and quits.

\end{itemize}


\section{Source Code}

The source code of preload is publicly available at
\url{http://preload.sf.net/}.  It is also distributed for Debian based systems
in the Debian unstable package repository.  

The source code is fairly straightforward and closely maps to the algorithms
presented in \autoref{chap:algorithms}.


\section{Other Issues}

Preload currently does not handle package updates or removals.  However, doing
that is as simple as removing objects that have a probability of less than a
certain threshold, and removed files will be eventually removed from the
model.


\subsection{Floating-Point Precision}

A minor issue when implementing the inference algorithms of
\autoref{subsec:inference} is how to maintain the products in the probability
equations within an acceptable precision.  To solve this, we use
\emph{log-probabilities}, ie.~compute the sum of the natural logarithm of the
elements instead of computing their product, and convert back to a probability
value when necessary.  This drastically reduces the error in the floating
point computations and is common practice in machine learning.


\subsection{Prelink}

\emph{Prelink} is a daemon available on various Linux systems, including
Fedora, that runs on the system periodically (once every 24 hour or less
frequently), analyzing all installed shared library and applications, and
assigning a unique virtual address slot for each shared library, and
\emph{relocating} them to that address.  The idea is that when running
applications, if the shared library can be loaded at its allocated address, no
relocation is necessary anymore.  This improves application startup-time
\cite{jelinek04prelink}.

Prelink's operation conflicts with preload, in that prelink modifies shared
library files, creating a new one and renaming it to the original one.  This
causes running processes to see deleted files as their maps.  However, this is
not a major problem because:
\begin{itemize}

\item When preload prefetches, it will prefetch the current version of the
file, and when a new application starts, it is linked against the current
version.  Except for the rare occasions that the current version between
prefetching and application start-up changes, prefetch always loads the version
that is going to be used by new applications started,

\item While prelink changes shared objects, the only way it does that is by
modifying some relocations structures in the file.  The modifications are
in-place and do not change the offsets of interesting regions of the file that
preload uses in its data structures.  So, prelinking does not nullify
preload's work.

\end{itemize}

Preload handles the conflict by behaving as if no maps of currently running
applications were deleted.