\documentclass{llncs}
%
\usepackage{makeidx} % allows for indexgeneration
\usepackage{latexsym}
%
\newcommand{\func}[1]{{\tt[#1]}}
\newcommand{\SJF}{\textit{SJF}}
\newcommand{\OPT}{\textit{OPT}}
\newcommand{\oone}{\func{min-time-spent}}
\newcommand{\otwo}{\func{min-weighted-sum}}
\newcommand{\othree}{\func{min-makespan}}
\newcommand{\ofour}{\func{min-number-of-jobs}}
\newcommand{\ofall}{\func{*}}
\newcommand{\theprob}{{\normalfont CD-LBSP}}
\renewcommand{\SS}{\textsc{Subset Sum} problem}
\begin{document}
\title{Common-Deadline Lazy Bureaucrat\\ Scheduling Problems}
%
\titlerunning{Common-Deadline Lazy Bureaucrat} % abbreviated title (for running head)
% also used for the TOC unless
% \toctitle is used
%
\author{Behdad Esfahbod \and Mohammad Ghodsi \and Ali Sharifi
}
%
\authorrunning{B. Esfahbod, {\em et al.}} % abbreviated author list (for running head)
%
%%%% modified list of authors for the TOC (add the affiliations)
\tocauthor{Behdad Esfahbod (Sharif University of Technology, Tehran, Iran),
Mohammad Ghodsi (Sharif University of Technology, Tehran, Iran),
Ali Sharifi (Sharif University of Technology, Tehran, Iran)}
%
\institute{Computer Engineering Department\\ Sharif University of Technology, Tehran, Iran,\\
\email{\{behdad,ghodsi\}@sharif.edu, ali@bamdad.org}
}
\maketitle % typeset the title of the contribution
\begin{abstract}
Lazy bureaucrat scheduling is a class of scheduling problems that
was introduced in \cite{LBSP}. In these problems, one or more
employees must perform the assigned jobs. The
objective of an employee is to minimize the amount of work he
performs and to be as inefficient as possible. He is subject to the
constraint, however, that he must be busy whenever there is work to
do.
In this paper,
we focus on the case of this problem where all jobs share a common deadline.
We show that the problem remains NP-hard under this restriction, and
prove some further hardness results. We then present a tight
$2$-approximation algorithm for this problem under one of the
defined objective functions. Moreover, we prove that this problem is weakly NP-hard
under all objective functions, and present a pseudo-polynomial
time algorithm for its general case. \\[1mm]
\noindent {\bf Keywords}:
Scheduling Problems, Approximation Algorithms, Dynamic Programming,
NP-hardness.
\end{abstract}
\section{Introduction}
In most scheduling problems, there are a number of jobs to
be performed by some workers or employees. Studies have looked at these
problems with the objective
of performing the assigned jobs as efficiently as possible and maximizing the number of
completed jobs. This is the employer's point of view. We
can also look at these problems from the employee's point of
view: some employees do not have enough motivation to do their jobs
efficiently, and some may even want to be as inefficient as possible
while performing their duties.
We call such employees {\em lazy}, and such scheduling problems
have been classified as {\sl Lazy Bureaucrat Scheduling Problems} (LBSP).
LBSP was introduced in \cite{LBSP}, along with some initial results,
a summary of which is presented here.
More specifically,
we are given a set $J$ of $n$ jobs $j_1, \ldots, j_n$.
Job $j_i$ has {\em processing time} of $t_i$, {\em arrival time}
of $a_i$, and {\em deadline} of $d_i$ ($t_i\leq d_i-a_i$).
It is assumed that $t_i, a_i,$ and $d_i$ are nonnegative integers.
%Also, we suppose that
%jobs are sorted according to their arrival times (i.e. $a_i \leq a_{i+1}$).
Jobs have hard deadlines, that is, $j_i$ can only be executed during
its allowed interval $w_i=[a_i, d_i]$, called its {\em window}.
It is also assumed that there always exists some job which arrives
at time~0. The maximum deadline time is also denoted by $D$.
We study the non-preemptive case of this problem and
restrict ourselves to off-line scheduling in which all
jobs are known to the scheduler beforehand.
We also assume that there is only one processor (employee) available
to do the jobs.
Some results on preemptive case and also cases with multiple bureaucrats
can be found in \cite{LBSP,hepner-stein}.
\begin{definition}{ \bf (Executable Job) }\em
A job $j_i$ is called {\em executable} at some time $t$, if and only if
it has arrived, its deadline has not yet passed, it has not yet been
processed, and, if it is started now, it will be fully processed before its
deadline (that is, $a_i\leq t\leq d_i-t_i$).
\end{definition}
\begin{definition} {\bf (Greedy Requirement) }\em
At any time, the bureaucrat should work on an executable job, if there is any
such job.
\end{definition}
The bureaucrat is asked to process the jobs within their windows,
satisfying the above greedy requirement. His goal is to be as inefficient
as possible. This is captured by any of the following objective functions,
each of which is to be minimized.
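For concreteness, the window constraint and the greedy requirement can be simulated directly. The following Python sketch is our own illustration (the `Job` fields and the `pick` callback are names we introduce, not notation from the paper): a job is executable at time $t$ exactly when $a_i\leq t\leq d_i-t_i$, and the bureaucrat may idle only while nothing is executable.

```python
from dataclasses import dataclass

@dataclass
class Job:
    a: int  # arrival time a_i
    d: int  # deadline d_i
    t: int  # processing time t_i (assumed t <= d - a)

def executable(job: Job, now: int, done: set, jid: int) -> bool:
    # a_i <= now <= d_i - t_i, and the job has not been processed yet
    return jid not in done and job.a <= now <= job.d - job.t

def run_policy(jobs, pick):
    """Simulate a non-preemptive schedule obeying the greedy requirement:
    whenever some job is executable, one must be started; `pick` chooses
    which executable job to run (the bureaucrat's only freedom)."""
    now, done, order = 0, set(), []
    while True:
        ready = [i for i, j in enumerate(jobs) if executable(j, now, done, i)]
        if ready:
            i = pick(ready, jobs)
            done.add(i); order.append(i)
            now += jobs[i].t
        else:
            future = [j.a for i, j in enumerate(jobs)
                      if i not in done and j.a > now]
            if not future:
                break
            now = min(future)   # idle only while nothing is executable
    return order, now
```

A shortest-job-first bureaucrat, for instance, is obtained by passing `pick = lambda ready, js: min(ready, key=lambda i: js[i].t)`.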
\subsection{Objective Functions}
Four objective functions have been defined for this problem \cite{LBSP}.
\begin{enumerate}
\item \oone: {\em Minimize the total amount of time spent working.}
This objective naturally appeals to a {\em lazy} bureaucrat.
\item \otwo: {\em Minimize the weighted sum of completed jobs.}
This objective appeals to a {\em spiteful} bureaucrat whose goal is to
minimize the fees that the company collects on the basis of his labors,
assuming that the fee is collected only for those tasks that are actually
completed.
\item \othree:
{\em Minimize the makespan, the maximum completion time of the jobs.}
This objective appeals to an {\em impatient} bureaucrat, whose goal is to go
home as early as possible, at the completion of the last job, when he is allowed
to leave. He cares only about the number of hours spent at the office, not
the number of hours spent doing work (productive or otherwise).
\item \ofour: {\em Minimize the total number of completed jobs.}
This can be meaningful when the overhead of performing a job is
too high for the bureaucrat.
\end{enumerate}
Clearly, objective function 1 is a special case of 2, as we can define
the weight of each job to be equal to its length. Objective function
4 is also a special case of 2, obtained when the weights are all equal. Also, if
all jobs have the same arrival time, objective functions 1 and
3 are equivalent.
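These four costs, and the special-case relations just mentioned, can be checked mechanically on a given schedule. The following Python sketch is our own illustration (`schedule` maps completed job indices to start times; the identifiers are ours):

```python
def evaluate(schedule, lengths, weights):
    """Cost of a completed schedule under the four objective functions.
    schedule: dict mapping job index -> start time of that job,
    lengths/weights: per-job processing times and weights."""
    finish = {i: s + lengths[i] for i, s in schedule.items()}
    return {
        "min-time-spent": sum(lengths[i] for i in schedule),   # total work
        "min-weighted-sum": sum(weights[i] for i in schedule), # fees collected
        "min-makespan": max(finish.values(), default=0),       # last completion
        "min-number-of-jobs": len(schedule),                   # jobs completed
    }
```

With `weights` equal to `lengths`, the second objective coincides with the first; with unit weights, it coincides with the fourth.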
\subsection{Previous Related Results}
The main related results in \cite{LBSP} are summarized below:
\begin{itemize}
%\em
\item LBSP is strongly NP-hard \cite{gary-johnson} under all objective functions
and is not approximable to within any fixed factor.
\item LBSP with the same arrival times for the jobs, is weakly NP-hard, and
can be solved by a pseudo-polynomial dynamic programming algorithm.
\item LBSP with all the jobs having unit lengths, can be solved in
polynomial time by the Latest Deadline First (LDF) scheduling policy.
\item Assuming for each job $i$, $d_i-a_i<2t_i$, LBSP can be
solved in $O(nD\max(n,D))$ time.
\item Even with a bound on $\delta$ (the ratio of the longest job to the
shortest job), LBSP is strongly NP-hard.
It cannot be approximated to within a factor of $\delta - \epsilon$,
for any $\epsilon>0$, unless $P=NP$.
\item Given bounds on $R$ (the maximum ratio of window length to job length)
and $\delta$, LBSP can be solved
in $O(Dn^{4R\lg\delta})$.
\end{itemize}
\noindent In \cite{Ghodsi}, these results have been shown:
\begin{itemize}
%\em
\item LBSP with all jobs having unit lengths, can be solved in
polynomial time by the Latest Deadline First (LDF) scheduling policy,
even with more than one bureaucrat (worker, processor).
% or when all jobs arrive at the same time.
\item Assuming $d_i-a_i<2t_i$ for each job $i$,
LBSP can be solved in $O(nD)$ time.
\end{itemize}
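As an illustration of the LDF policy mentioned in both lists, the following Python sketch (our own code, not from the cited works) simulates it for unit-length jobs; it only simulates the policy and does not re-prove the cited optimality result. The lazy intuition is that working on the latest-deadline job first gives early-deadline jobs a chance to expire unprocessed.

```python
def ldf_unit_jobs(jobs):
    """Latest Deadline First for unit-length jobs (t_i = 1): at each step,
    among executable jobs, run the one with the LATEST deadline.
    jobs: list of (arrival, deadline) pairs with deadline >= arrival + 1.
    Returns the total number of time units worked."""
    done, now, worked = set(), 0, 0
    while True:
        ready = [i for i, (a, d) in enumerate(jobs)
                 if i not in done and a <= now <= d - 1]
        if ready:
            i = max(ready, key=lambda i: jobs[i][1])  # latest deadline first
            done.add(i); worked += 1; now += 1
        else:
            future = [a for i, (a, _) in enumerate(jobs)
                      if i not in done and a > now]
            if not future:
                return worked
            now = min(future)
```

On two jobs with deadlines 1 and 3 arriving at time 0, LDF works one unit (the early-deadline job expires), whereas an earliest-deadline-first bureaucrat would be forced to complete both.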
Hepner and Stein have studied the preemptive case \cite{hepner-stein}, where they
propose a pseudo-polynomial time algorithm to minimize the makespan of such schedules.
They have also extended this scheduling problem to the multiple-bureaucrat setting
and provided pseudo-polynomial time algorithms for such problems.
\subsection{Our Results}
We consider a restricted case of LBSP where the deadlines for all
jobs are the same (denoted by $D$). We call these problems
common-deadline LBSP, denoted by \theprob. This problem can
be considered with any of the above objective functions. We
denote such cases by \theprob\func{\emph{objective-function}} and
\theprob\ofall\ is used to denote all these objective functions.
On the hardness of this problem, we first prove that \theprob\ofall\ is still NP-hard
and show that \theprob\ofour\ is not approximable to
within any fixed factor. For \theprob\othree, however, we provide a tight 2-approximation
algorithm. Finally, we prove that \theprob\ofall\ is
weakly NP-hard; we provide
a pseudo-polynomial time dynamic programming algorithm for the general case.
\section{Hardness Results}
In the following theorems, we reduce from the \SS\ to prove that \theprob\ofall\
is NP-hard; the approximability, however, differs under the
different objective functions. We have found reasonable results
for the \otwo, \othree, and \ofour\ objective functions, but
not for \oone.
\begin{theorem}\label{th:commondeadlinenpcomplete}
{\normalfont \theprob}\ofall\ is NP-hard.
\end{theorem}
\begin{proof}
We reduce the \SS\ to this problem. We are given a set
$S=\{x_1,\ldots, x_n\}$ of $n$ positive integers with
$\Sigma_{i=1}^n x_i = s$, and an integer $b$ ($0<b<s$); the question
is whether some subset of $S$ sums exactly to $b$. From such an
instance, one constructs an instance of \theprob\ in which the
bureaucrat can schedule a designated long job if and only if the
answer to the \SS\ is `yes', and in which, under each objective
function, the cost of an optimal schedule distinguishes the two
cases. A construction of this type is given in the proof of
Theorem~\ref{th:commondeadlinenpcompleteobjfunc4} below. \qed
\end{proof}
\begin{theorem}\label{th:commondeadlinenpcompleteobjfunc4}
{\normalfont \theprob}\ofour\ is not approximable to within any fixed
factor $\Delta>1$, unless $P=NP$.
\end{theorem}
\begin{proof}
As in Theorem~\ref{th:commondeadlinenpcomplete}, we reduce
the \SS\ to prove the hardness. Assume, for the sake of contradiction, that there is
an approximation algorithm with fixed factor $\Delta$ for
\theprob\ofour. Using this assumption, we show that we can solve the \SS,
implying $P=NP$.
Let $m=\lceil\Delta\rceil$ and $D=b+m(n+2)s$.
We construct an instance of \theprob\ofour\ containing the following jobs,
all with deadline $D$.
For each $x_i\in S$, we define an {\em element job}
$j_i\:\: (1\le i\le n)$ with $a_i=0$ and $t_i=x_i$. We also define one
{\em long job} $j_{n+1}$ with $a_{n+1}=b$ and $t_{n+1}=D-b$,
as well as $m(n+2)-1$ {\em extra jobs}, all having
arrival time $b$ and processing time $s$.
The bureaucrat wants to do as few jobs as possible. We claim that the
answer to the \SS\ is `yes', if and only if the bureaucrat performs the
long job. To start the long job at time $b$, he must have been working
continuously up to time $b$ and finish an element job exactly then, so
the completed element jobs form a subset of $S$ summing to $b$; this
yields the answer to the \SS. In this case he has processed at most $n+1$ jobs
(at most $n$ element jobs and one long job), so the approximation
algorithm will produce an output with at most $m(n+1)$ jobs.
On the other hand, if he does not process the long job, then he has to
process the $m(n+2)-1$ extra jobs (he has enough time to process all
element and extra jobs), so the approximation algorithm will end
with more than $m(n+1)$ jobs.
Therefore, the answer to the \SS\ is `yes' if and only if there are at most
$m(n+1)$ jobs in the output of the algorithm, and `no' otherwise. \qed
\end{proof}
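The instance built in this proof is easy to generate mechanically. The following Python sketch (the function name is ours) constructs the element, long, and extra jobs from a \SS\ instance:

```python
import math

def reduction_instance(xs, b, delta):
    """Build the CD-LBSP[min-number-of-jobs] instance from the proof:
    element jobs, one long job, and m(n+2)-1 extra jobs, all sharing
    the common deadline D = b + m(n+2)*s, where m = ceil(delta).
    Jobs are (arrival, length) pairs."""
    n, s = len(xs), sum(xs)
    m = math.ceil(delta)
    D = b + m * (n + 2) * s
    jobs = [(0, x) for x in xs]            # element jobs
    jobs.append((b, D - b))                # long job: must start exactly at b
    jobs += [(b, s)] * (m * (n + 2) - 1)   # extra jobs
    return jobs, D
```

The long job has length $D-b$, so it fits in its window only if started at the instant $b$; that is what ties its schedulability to the \SS.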
\begin{corollary}\label{th:commondeadlinenpcompleteobjfunc2}
\theprob\otwo\ is not approximable to within any fixed factor
$\Delta>1$, unless $P=NP$.
\end{corollary}
\begin{proof}
We know that \theprob\ofour\ is a special case of \theprob\otwo.
Theorem~\ref{th:commondeadlinenpcompleteobjfunc4} then completes the proof. \qed
\end{proof}
\section{Approximation Algorithm}
In this section, we show that, under objective function 3 (\othree), the
bureaucrat can reach a nearly optimal schedule with a simple
algorithm.
For a schedule $\sigma$, we say $j_i\in\sigma$ if job $j_i$ is processed in $\sigma$.
Also, we use $s_i(\sigma)$ and $f_i(\sigma)$ to denote the time when the
processing of $j_i$ is started and the time when it is finished
in $\sigma$ respectively ($f_i(\sigma)=s_i(\sigma)+t_i$).
\begin{theorem}\label{th:approximation}
The Shortest Job First (SJF) scheduling policy is a 2-approximation
algorithm for \theprob\othree\ and this bound is tight.
\end{theorem}
\begin{proof}
Let $\sigma_{OPT}$ be an optimal solution and $\sigma$ be
the schedule which the SJF policy
has generated, and \OPT\ and \SJF\ be their makespans respectively.
In the SJF policy, among the executable jobs,
the one with the shortest processing time is picked first.
We will show that $\SJF-\OPT<\OPT$.
Without loss of generality, suppose that jobs $j_1,\ldots, j_k$ (for some $k$) are
processed in $\sigma$, in that order.
Let $j_q\in\sigma$ be the job that is being processed at time \OPT\ in $\sigma$. That is,
$s_q(\sigma)<\OPT\leq f_q(\sigma)$. We know that $a_i<\OPT$ for all jobs $j_i$.
The SJF policy, therefore, forces that $t_{q+1}\leq t_{q+2}\leq\ldots\leq t_k$.
Note that there is no gap between jobs $j_{q},\ldots, j_k$ in $\sigma$.
From the greedy requirement, we can easily conclude that
$j_i\in\sigma_{OPT}$ for all
$q+1\le i\le k$ (otherwise, at least one of them can be processed at time $\OPT$).
Assume that $q<k$; if $q=k$, the argument below applies with
$Q$ empty. {\em Case $t_q\leq t_{q+1}$:} Since $j_k\in\sigma$, we have
$D\geq\SJF\geq\OPT+t_{q+1}\geq\OPT+t_q$, so $j_q$ is executable at time
$\OPT$; by the greedy requirement, $j_q\in\sigma_{OPT}$, and hence
$\SJF-\OPT<\Sigma_{i=q}^{k}t_i\leq\OPT$. {\em Case $t_q>t_{q+1}$:}
If job $j_q\in \sigma_{OPT}$, then we will have
$\SJF-\OPT<\Sigma_{i=q}^{k}t_i\leq \OPT$ and $\SJF<2\OPT$; otherwise, we can
show that there exists a job $j_p$, not shorter than $j_q$,
such that $j_p\in \sigma_{OPT}$ and $j_p\notin\sigma$.
Let $Q=\{j_i\mid q< i\leq k\}$.
All jobs in $Q$ are shorter than $j_q$; otherwise, there would be enough
time to process $j_q$ at time $\OPT$, which would mean that $\sigma_{OPT}$ is
invalid. Also, from the SJF policy, we conclude that
all jobs in $Q$ have arrived after $s_q(\sigma)$.
Let $T$ be the set of jobs $j_i\in\sigma_{OPT}$ such that
$s_i(\sigma_{OPT})\le s_q(\sigma)$. Obviously, $T$ cannot be empty.
Not all jobs in $T$ can be processed in $\sigma$. To show this, consider a subset of $T$
that is processed continuously (without any break) and has the maximum
start time among these subsets. The finish time of this subset is after time
$s_q(\sigma)$. From the greedy requirement, we conclude that this set of jobs cannot be
processed and finished by time $s_q(\sigma)$, and thus the jobs cannot all be processed in
$\sigma$ (because job $j_q$ has already been started at time $s_q(\sigma)$).
Thus, there is some job $j_p\notin\sigma$, with $a_p\le s_q(\sigma)$.
The SJF policy forces that $t_q\leq t_p$, and also $j_p\in\sigma_{OPT}$.
Putting it all together leads to
$$\SJF-\OPT<\Sigma_{i=q}^{k}t_i\leq t_p+\Sigma_{i=q+1}^{k}t_i\leq\OPT,$$
and hence $\SJF<2\OPT$.
To see that the bound is tight, one can construct, for every $\epsilon>0$,
an instance with a parameter $L$ on which
$$\frac{\SJF}{\OPT} \geq 2-\frac{\epsilon}{2(1-\frac{2}{L})} > 2-\epsilon,$$
which completes the proof. \qed
\end{proof}
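On small instances, the theorem can be sanity-checked by brute force. The sketch below is our own code: it simulates a fixed priority order under the greedy requirement and compares SJF with the best order over all permutations, relying on the observation that, in this setting, every valid schedule arises from some fixed priority order.

```python
from itertools import permutations

def run(jobs, D, priority):
    """Simulate the greedy requirement with a fixed priority: among the
    executable jobs, always pick the one minimizing priority(i).
    jobs: (arrival, length) pairs with common deadline D.
    Returns the makespan (last completion time)."""
    now, done, makespan = 0, set(), 0
    while True:
        ready = [i for i, (a, t) in enumerate(jobs)
                 if i not in done and a <= now <= D - t]
        if ready:
            i = min(ready, key=priority)
            done.add(i); now += jobs[i][1]; makespan = now
        else:
            future = [a for i, (a, t) in enumerate(jobs)
                      if i not in done and a > now and a <= D - t]
            if not future:
                return makespan
            now = min(future)

def opt_makespan(jobs, D):
    """Exhaustive search over all priority orders (exponential; for
    sanity checks on tiny instances only)."""
    best = None
    for perm in permutations(range(len(jobs))):
        rank = {i: r for r, i in enumerate(perm)}
        ms = run(jobs, D, lambda i: rank[i])
        best = ms if best is None else min(best, ms)
    return best
```

SJF itself is `run(jobs, D, lambda i: jobs[i][1])`, and on any tiny instance one can confirm that its makespan stays below twice the optimum.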
It seems quite likely that the same algorithm yields the same approximation bound
under \oone, but we do not have a complete proof.
\section{Pseudo-Polynomial Time Algorithms}
We assume that the jobs are numbered in order of their arrival times (that is,
$a_1\le a_2\le \ldots \le a_n$).
Let $T_i$ and $T_{i,k}$ denote the set of jobs $j_i, j_{i+1}, \ldots, j_n$
and $j_i, j_{i+1}, \ldots, j_k$ respectively. We will also use the following definitions:
\begin{definition}\em
The time $\alpha$ is called the {\em first rest time} of a schedule $\sigma$, if the
bureaucrat has paused processing the jobs in $\sigma$ for the
first time at $\alpha$. If there is no pause during $\sigma$, the first
rest time is defined as the time when the schedule is finished.
\end{definition}
\begin{definition}\em
For a time $\alpha$, we define
{\em critical jobs} $H_\alpha$ as the set of jobs $j_i \in J$ which
can be processed in $[\alpha,D]$, i.e.~$\max(a_i,\alpha)+t_i\le D$.
\end{definition}
\begin{definition}\em
For a given $(T,\alpha,U)$ in which $T,U \subset J$ and $\alpha$ is a time point,
sequence $E$ of some jobs in $T$
is said to be a {\em valid sequence} if we can process
these jobs in this order without any gaps in between, starting from the
first arrival time of the jobs in $T$ and finishing at $\alpha$, such that
every job in $T\cap U$ appears in $E$.
A valid sequence $E$ is said to be an
{\em optimal sequence} under some objective function, if its cost is minimum
among all valid sequences of $(T,\alpha,U)$.
\end{definition}
\begin{lemma}
For a given $(T,\alpha,U)$, let $E$ be an optimal sequence and $j_m \in E$
be the job with the latest arrival time. There exists another optimal sequence
$F$ in which $j_m$ is the last processed job.
\label{easylemma}
\end{lemma}
\begin{proof}
This can easily be done by
repeated swapping of $j_m$ with its adjacent jobs. \qed
\end{proof}
\begin{lemma}
There is a pseudo-polynomial time algorithm that finds the optimal
sequence for any given $(T_i,\alpha,U)$
under any of the objective functions, if such sequence exists $(1 \leq i \leq n)$.
\end{lemma}
\begin{proof}
Let $j_f$ be the last job that arrives before $\alpha$ in $T_i$,
and let $C_{x,y}$ $(i \leq x \leq f, a_i \leq y \leq \alpha)$
be the cost of the optimal sequence for $(T_{i,x},y,U)$,
or $\infty$ if no valid sequence exists.
Our goal is to compute $C_{f,\alpha}$.
We show how $C_{x,y}$ can be computed recursively from
the values of $C_{x',y'}$, where $x' < x$ and $y' \leq y$.
If $j_x\in U$, then it is in any valid sequence.
Hence, from lemma~\ref{easylemma}, $j_x$ can be processed last
in $[y - t_x, y]$. Based on the objective function used,
we can easily compute $C_{x,y}$ from $C_{x-1,y-t_x}$.
For example, $C_{x,y} = C_{x-1, y-t_x} + t_x$ under \oone.
On the other hand, if $j_x\notin U$,
there are two options depending on whether or not it is in the optimal sequence.
If $j_x$ is processed in the optimal sequence,
it can be processed last, in which
case, $C_{x,y}$ can be computed from $C_{x-1,y-t_x}$ as before.
Otherwise, $C_{x,y}=C_{x-1,y}$, since we can ignore
$j_x$. The minimum of these two values is taken for $C_{x,y}$.
The running time of this algorithm is $O(nD)$,
as there are at most $nD$ values of $C_{x,y}$ to compute,
each in constant time. \qed
\end{proof}
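The recurrence in this proof, specialized to \oone, can be written out as a short dynamic program. The following Python sketch uses our own index conventions (jobs are 0-based and sorted by arrival; `U` holds the indices that must appear):

```python
import math

def optimal_sequence_cost(jobs, U, alpha):
    """DP from the lemma, specialized to min-time-spent.
    jobs: (arrival, length) pairs sorted by arrival; U: set of indices
    that must appear. A valid sequence fills [start, alpha] with no gap,
    where start is the earliest arrival. C[x][y] = cheapest cost of a
    valid sequence over the first x jobs ending at time y."""
    start = min(a for a, _ in jobs)
    INF = math.inf
    C = [[INF] * (alpha + 1) for _ in range(len(jobs) + 1)]
    C[0][start] = 0  # empty sequence: nothing processed yet
    for x, (a, t) in enumerate(jobs, 1):
        for y in range(start, alpha + 1):
            take = INF
            # j_x processed last, in [y - t, y]; needs a <= y - t
            if y - t >= a and y - t >= start and C[x - 1][y - t] < INF:
                take = C[x - 1][y - t] + t
            # skipping j_x is forbidden when it belongs to U
            skip = INF if (x - 1) in U else C[x - 1][y]
            C[x][y] = min(take, skip)
    return C[len(jobs)][alpha]   # math.inf if no valid sequence exists
```

Under \oone\ every valid sequence has the same cost, $\alpha$ minus the first arrival time; the DP structure is identical for the other objective functions, only the combination step in the `take` branch changes.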
\begin{theorem}\label{th:commondeadlineweaknpcompleteobjfunc1}
\theprob\ofall\ is weakly NP-hard.
\end{theorem}
\begin{proof}
We present a pseudo-polynomial time algorithm which
can be used for any of the objective functions.
Consider $T_i$ for some $1\leq i\leq n$ and temporarily assume that
the jobs in $T_i$ are the only jobs available, and that the
greedy requirement is to be satisfied on only these jobs.
Let $P_i$ be this subproblem and $C_i$ be its optimal value. Clearly,
$C_1$ is the desired value.
Consider an optimal schedule $\sigma$ for $P_i$.
Let $\alpha$ be the first rest time in $\sigma$.
No job in $T_i$ arrives at $\alpha$.
We know that the jobs in $T_i$ appearing in the
set of critical jobs $H_\alpha$ should be processed
before the rest time $\alpha$.
Let $j_k$ be the first job arrived after $\alpha$. Because of the
pause at time $\alpha$, we know that no job having arrival time less
than $\alpha$ can be processed after $\alpha$. So, we can break up the
schedule $\sigma$ into two subschedules:
those processed before $\alpha$ and those processed after.
These subschedules are independent.
We can consider the first subschedule as a valid sequence
for $(T_{i,k-1},\alpha,H_\alpha)$.
From the optimality of $\sigma$, it is clear that this sequence is optimal.
Similarly, the second subschedule is an optimal schedule for $P_k$.
We compute $C_i$ for every $1 \leq i \leq n$ from the values of
$C_j$ ($i**