\chapter{Related Work}
\label{chap:related}

Prefetching and caching strategies are either heuristic or hinted.  O'Rourke
\cite{orourke01parallelism} compares issues and results of various file-system
prefetching schemes based on both approaches.  It concludes that while hinted
prefetching can remove almost all cache misses and modifying applications to
generate hint streams is straightforward, it is impractical to rewrite more
than a few applications on a standard UNIX system.

Chang and Gibson \cite{chang99automatic} described an intriguing approach to
modify application binaries to produce prefetching hints for future I/O
requests during I/O stalls.  This works by running a \emph{shadow} copy of the
application in a low-priority thread that does not stall on I/O and proceeds
running, gathering information about future I/O requests.  For most
applications, they report, the performance comes within a few percent of that
achieved by hand-written hints.  However, in some cases automatic prefetching
can perform worse than no prefetching at all, and the prefetching code added
increases binary size by 130 to 600 percent.

In the remainder of this chapter we review previous work on heuristic-based
prefetching.

Papathanasiou and Scott \cite{papathan05aggressive} discuss that with the
drastic growth of processor power and main memory sizes in the past decade,
the time may have come to employ aggressive prefetching to offset the rather
slow improvement rate of external storage.  They discuss why aggressive
prefetching makes sense now, research challenges in it, and finally identify
several traditional prefetching problems that may require improved solutions
as prefetching becomes more aggressive.

Curewitz et al \cite{curewitz93datacompression} analyze the practical aspects
of using data compression techniques for prefetching.  The idea is that
compression algorithms work by predicting future elements in a stream and
assigning shorter codes to them.  A compressor is most efficient if it best
predicts upcoming input.  The algorithm then can be adapted to act as a
predictor for a prefetching system.

\section{Prefetching Integrated with Caching}

Prefetching is an old idea.  There has been a lot of heuristic prefetching
work conducted during the past decade that focus on file-system, page-cache,
and web prefetching.  A common property to many of these works is that they
integrate prefetching with caching.  This is feasible when prefetching is
implemented at the same level as caching, for example, in the kernel or in the
web browser.  This has raised the question of to what extent access history
can already be used to improve the cache evacuation algorithm instead of using
LRU with prefetching.  Vellanki and Chervenak \cite{vellanki99costbenefit}
demonstrate that well over half of all accesses in a file-system are
cacheable based on history, significantly more than LRU and prefetching in
most cases.

Griffioen and Appleton \cite{griffioen94latency} devise a \emph{probability
graph} connecting files in the file system together as nodes.  They do that by
connecting file open requests within a certain \emph{look-ahead period}.  They
achieve up to 280 percent improvement over LRU, or alternatively, reduce cache
size by up to 50 percent.

Amer et al \cite{amer02file} introduce a new file access predictor,
\emph{Recent Popularity}, that works by predicting a successor for each file
access that is the \emph{best $j$ of $k$} successor of the specific file in
tracked history.  Their approach allows for improving accuracy through
reducing offered predictions, by adjusting the parameters $j$ and $k$.  They
report a less than two percent error rate while offering predictions for 60
percent of accesses. 


\section{Markov-based Prefetching}

Bartels et al \cite{bartels99potentials} review potentials and limitations of
fault-based Markov prefetching for virtual memory pages.  They found that
fault-based virtual memory prediction can achieve reasonably high levels of
accuracy for some scientific applications, mainly those accessing large
matrices stored externally.  They also conclude that high precision of
one-fault prefetching hardly results in significant speedup, as applications
that can benefit from such an accurate short-sighted prediction already have a
sufficiently low fault rate and latency that the I/O overhead is not
prohibitive in the first place.  One of their most surprising results is that
the Markov predictor of order 1 often outperformed the Markov predictor of
order 2, because the second-order model was too conservative.

The limitation of Markov-based approaches in that orders higher than one or
two are impractical is elevated by using the Partial Prefix Matching (PPM)
technique.  PPMs work by constructing a tree of match prefixes of all lengths
up to a limit, and so can be thought of as a multi-order Markov scheme.  It
essentially trains Markovs of order 1, 2, 3, up to the limit all at the same
time, and picks the best predictions out of them all.  It is easy to observe
that unlike Markov-based approaches, increasing the order limit in PPMs cannot
result in inferior prediction results in practical situations.

Kroeger and Long \cite{kroeger96predicting} develop a PPM based prefetching
scheme with intriguing performance result: that their four megabyte predictive
cache has a higher cache hit rate than a 90 megabyte LRU cache for their
simulations.  However, it is likely that such a result is limited to the
particular use cases that their data-set represented.

Hidden Markov Model (HMM) is another model based on Markovs that has been used
in prefetching.
Madhyastha and Reed \cite{madhyastha97inputoutput} describe an HMM approach to
block access classification that can then be used to adjust cache replacement
or prefetching.  Their model works on blocks of a file at a time and so is
limited in scope to large files like databases.  Their approach offers
significant performance improvement over similar artificial neural network
based approaches.


\section{Block-based, File-based, and Web Prefetching}

Prefetching in the file system level can either be done at the block level, or
the file level.  Block level prefetchers are much harder to develop as the
hard disk and memory sizes increase, and offer significantly less improvement
as processors grow faster and more blocks should be prefetched to have any
measurable effect.  Bartels et al \cite{bartels99potentials} confirm this.

Choi et al \cite{choi00towards} devise an application/file-level
characterization of block references that can be used to employ different cache
replacement policies per application/file to maximize performance.

File-based approaches are a lot more promising as they can trigger prefetching
of several files by a specific request.  In all previous work that we
reviewed, file-based prefetching has focused on network file-systems (NFS).
Many of them we already mentioned in previous sections.
For comparison, our work goes a level higher and tries to prefetch groups of
files needed by working on the application level.  This is the first work in
this level as far as we know.

Lei and Duchamp \cite{lei97analytical} develop an application/file based
approach that records trees of process creations (forks) and file accesses of
those processes, in a sequential order.  It then uses such trees to make
prediction for the next time when those processes are run.  They report
reducing application latency by up to 40 percent for wireless remote
file-system access.

Similar work has been done in web prefetching, many of them simply adapting
file-system prefetching algorithms to the web, but others exploiting features
unique to the way the web is navigated.  Web prefetching is related to our work in
that, like our work, and unlike block-based and file-based file-system
prefetching, it tries to come up with predictions of what action the computer
user will take next.

Jiang and Kleinrock \cite{jiang98adaptive} develop a prefetching scheme for
network use.  Their scheme has a very simple predictor that, upon seeing a
request, predicts all resources linked from the resource being requested based
on the history and the number of times each of those links were navigated.
However, they back this simple prediction algorithm up by deriving a formula
for the prefetching threshold based on system load, capacity, and costs such
that a lower average cost can always be achieved.  The scheme can be
implemented on the client or the server.

Nanopoulos et al \cite{nanopoulos01datamining} propose a PPM based prefetching
algorithm for web browsing.  They train their PPM using sequences of access in
each user \emph{session}.
Yang and Zhang \cite{yang01prediction} propose a very similar scheme, though
they do not use the term PPM, and they believe that their work is the first to
integrate prefetching and caching in web browsers.  They use a fixed part of
the cache for prefetching and always fill that with no threshold.

Padmanabhan and Mogul \cite{padmanabhan96using} use the prefetching algorithm
of Griffioen and Appleton \cite{griffioen94latency} for web prefetching in a
cooperative mode where the server suggests some pages to prefetch to the client
based on the page requested, and client decides which pages to prefetch based
on the suggestions and other criteria including access history.

Sow et al \cite{sow03usagemining} take a compression-based approach to web
prefetching, coming up with an algorithm based on the original Lempel-Ziv
algorithm.

\section{Summary}

Prefetching file-systems has been widely studied before.  However, all
previous approaches work on a block-based or file-based manner.  In the rest
of this thesis, we design a file-system prefetcher that works on
application-level.  Like many other prefetchers, we use Markov predictors in a
mixture model.