Friday, July 18, 2014

Algorithm Analysis

In computer science, the analysis of algorithms is the determination of the amount of resources (such as time and storage) necessary to execute them. Most algorithms are designed to work with inputs of arbitrary length. Usually, the efficiency or running time of an algorithm is stated as a function relating the input length to the number of steps (time complexity) or storage locations (space complexity).

In computer science there can be multiple algorithms exist for solving the same problem (for example, sorting problem has many algorithms like insertion sort, selection sort, quick sort and many more). Algorithm analysis helps us determining which of them is efficient in terms of time and space consumed.

Goal of Complexity or Analysis of Algorithms

the goal of complexity or analysis of algorithms is to compare algorithms (or solutions) mainly in terms of running time but also in terms of other factors (e.g., memory, developers effort etc.)

There may be two factor which can effect cost of any algorithm. These are Space and Time. So, generally we combine these two and estimate cost of any algorithm per cycle.

Input size is number of elements in the input and depending on the problem type the input may be of different types. In general, we encounter the following types of inputs:
• Size of an array
• Polynomial degree
• Number of elements in a matrix
• Number of bits in binary representation of the input
• Vertices and edges in a graph
The rate at which the running time increases as a function of input is called rate of growth.

Methodologies for Analyzing Algorithms

We will be primarily concerned with the speed (time complexity) of algorithms.
• Sometimes the space complexity is studied.
• The time depends on the input, most often on the size of the input.
• We can run experiments.
• Must choose sufficiently many, representative inputs.
• Must use identical hardware to compare algorithms.
• Must implement the algorithm.
We will emphasize instead and analytic framework that is independent of input and hardware, and does not require an implementation. The disadvantage is that we can only estimate the time required.
• Often we ignore multiplicative constants and small input values.
• So we consider f(x)=x3-20x2 equivalent to g(x)=10x3+10x2
• Huh??
• Easy to see that for say x > 100, f(x) < 10 g(x) and g(x) < 10 f(x).

Pseudo-Code

Designed for human understanding. Suppress unimportant details and describe some parts in natural language (English in this course).

The Random Access Machine (RAM) Model

The key difference from reality is the assumption of a very simple memory model: Accessing any memory element takes a constant amount of time. This ignores caching and paging for example. (It also assumes the word-size of a computer is large enough to hold any address. This last assumption is generally valid for modern-day computers, but was not always the case.)
The time required is simply a count of the primitive operations executed. Primitive operations include
1. Assign a value to a variable (independent of the size of the value; but the variable must be a scalar).
2. Method invocation, i.e., calling a function or subroutine.
3. Performing a (simple) arithmetic operation (divide is OK, logarithm is not).
4. Indexing into an array (for now just one dimensional; scalar access is free).
5. Following an object reference.
6. Returning from a method.

Reference
https://cs.nyu.edu/courses/fall02/V22.0310-002/class-notes.html