


Basic Concepts


Concepts: Algorithm

  • Definition of an algorithm
    • An algorithm is a step-by-step procedure for solving a problem in a finite amount of time. It consists of:
      • instructions
      • input data
      • output data
  • Representing an algorithm
    • Programming language (C, Java, Python…)
    • Pseudo Code
      • High-level Description
      • Hides algorithm implementations


Example: GCD Algorithm

  • Problem: Compute Greatest Common Divisor
  • Input: integers L and S
  • Output: GCD of (L, S)
  • Method: Euclid's Algorithm

(Figure: pseudocode for Euclid's GCD algorithm)

  • Test: GCD(36, 24)=?, GCD(1989, 1590)=?, GCD(84687624, 38742620)=?
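
A minimal C sketch of Euclid's algorithm (the iterative form and the function name gcd are assumptions, not from the original slides), which can be used to check the test cases above:

// Illustrative sketch of Euclid's algorithm.
// Repeatedly replace (L, S) with (S, L mod S); when the remainder
// reaches 0, the last nonzero value is the GCD.
int gcd(int L, int S)
{
  while (S != 0)
  {
    int r = L % S;
    L = S;
    S = r;
  }
  return L;
}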


Example: Searching Algorithm

  • Problem: Find a value X among randomly stored values.
  • Input: n integers in array A[0..n-1]
  • Output: Value X in A[0..n-1]
  • Method: Linear Search

(Figure: linear search pseudocode)
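
A minimal C sketch of linear search (the function name linear_search and the convention of returning the index of X, or -1 if absent, are assumptions):

// Illustrative sketch of linear search.
// Scan A[0..n-1] from left to right; return the index of the first
// element equal to X, or -1 if X is not present.
int linear_search(int A[], int n, int X)
{
  for (int i = 0; i < n; i++)
    if (A[i] == X)
      return i;
  return -1;
}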


Example: Find Max Algorithm

  • Problem: Find the maximum value
  • Input: n integers in array A[0..n-1]
  • Output: Maximum value of A[0..n-1]
  • Method: Naïve Algorithm

(Figure: naïve find-max pseudocode)
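
A minimal C sketch of the naïve find-max algorithm (the function name find_max is an assumption):

// Illustrative sketch of the naive maximum algorithm.
// Keep the largest value seen so far while scanning A[0..n-1]; assumes n >= 1.
int find_max(int A[], int n)
{
  int max = A[0];
  for (int i = 1; i < n; i++)
    if (A[i] > max)
      max = A[i];
  return max;
}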


Example: Sorting Algorithm

  • Problem: Sorting integers
  • Input: n unsorted integers in array A[0..n-1]
  • Output: n sorted integers in array A[0..n-1]
  • Method: Selection Sort

(Figure: selection sort pseudocode)
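
A minimal C sketch of selection sort (the function name selection_sort is an assumption):

// Illustrative sketch of selection sort.
// In pass i, find the smallest element in A[i..n-1] and swap it into position i.
void selection_sort(int A[], int n)
{
  for (int i = 0; i < n - 1; i++)
  {
    int min = i;
    for (int j = i + 1; j < n; j++)
      if (A[j] < A[min])
        min = j;
    int tmp = A[i];
    A[i] = A[min];
    A[min] = tmp;
  }
}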


Concepts: Data Structures

  • Data Structure
    • How do we store (input/output) data in a computer?
    • We need to organize data in memory efficiently.
      • e.g., how do we store a matrix in memory?
      • e.g., how do we store a road map in memory? (see the sketch after this list)
    • Every algorithm uses some collection of data structures.
  • Two typical types
    • Linear types: stored in linear order
      • e.g., Arrays, Linked Lists, Stacks, Queues
    • Non-linear types: stored in non-linear order
      • e.g., Trees, Graphs

(Figure: examples of linear and non-linear data structures)
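
As a minimal sketch of the two storage questions above (the names and sizes are illustrative assumptions), a matrix maps naturally onto a 2-D array, while a road map is usually stored as a graph, for example as an adjacency matrix:

// Illustrative sketch (names and sizes are assumptions).
#define N_CITIES 5

// A 4 x 4 matrix stored as a (row-major) 2-D array: a linear layout in memory.
int matrix[4][4];

// A road map over N_CITIES cities stored as a graph (non-linear structure):
// road[i][j] = 1 if there is a road between city i and city j, else 0.
int road[N_CITIES][N_CITIES];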


What is Performance Analysis?

  • Performance Analysis
    • Space complexity analysis
      • the amount of memory space used by the algorithm
    • Time complexity analysis
      • the amount of computer time used by the algorithm
  • Typically, the more (less) space, the less (more) time.
    • Thus, sometimes we need to trade off space vs. time.


Space complexity

  • Find the total sum of n numbers; Space = ? (see the sketch after this list)

(Figure: pseudocode for summing n numbers)

  • Addition of two n x n matrices; Space = ?
  • Road map connecting n cities; Space = ?
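
For the first question, a minimal C sketch (assuming the n numbers are already stored in an array A; the function name total_sum is an assumption): the input occupies n words, and the algorithm itself only needs a constant amount of extra space.

// Illustrative sketch: space = n words for the input A[0..n-1] + O(1) extra (sum, i).
int total_sum(int A[], int n)
{
  int sum = 0;
  for (int i = 0; i < n; i++)
    sum = sum + A[i];
  return sum;
}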


Time Complexity

int sum = 0;
for (i = 1; i <= 1000000; i++)
  sum = sum + i;
return sum;
  • What is time complexity?
    • Theoretical Speed: $10^6$ additions
    • Practical Speed: 10 milliseconds
  • Which criterion is more reasonable?
    • “Theoretical” speed gives the better criterion. Why?


  • Time Complexity Criteria
    • Theoretical Speed
      • the number of operations performed by the algorithm
      • Examples of operations: arithmetic (+, -, ...), compare, swap, visit, etc.
    • Practical Speed
      • the execution time of the algorithm when run on an actual computer
  • Why is “theoretical” speed the more reasonable criterion?
    • Independent of the implementation (coding)
    • Independent of H/W (computer performance)
    • Independent of S/W (programming languages, compilers, ...)
    • Gives the same result regardless of the environment
  • The running time of an algorithm typically grows with the input size n.
  • The result of an algorithm's time analysis is expressed as a function of the input size n (e.g., f(n), g(n), etc.).
  • The following are typical loop patterns that arise in time analysis:
    • Linear Loops
    • Logarithmic Loops
    • Square Loops
    • Cubic Loops
    • Logarithmic Linear Loops


Linear Loops

  • Time is proportional to n.

    • f(n) = n
for (i = 1; i <= n; i++)
{
  // application code
}


  • f(n) = n/2
for (i = 1; i <= n; i += 2)
{
// application code
}


Logarithmic Loops

  • Multiply
for (i = 1; i <= n; i *= 2)
{
  // application code
}

The loop executes k times, where $2^{k-1} ≤ n$; so $k = f(n) = log_2n + 1 ≈ log_2n$.

  • Divide
for (i = n; i >= 1; i /= 2)
{
  // application code
}

The loop executes k times, where $n/2^{k-1} ≥ 1$; so $k = f(n) = log_2n + 1 ≈ log_2n$.


Nested Loops

  • Square
for (i = 1; i <= n; i++)
  for (j = 1; j <= n; j++)
  {
    // application code
  }

f(n) = $n^2$

  • Cubic

for (i = 1; i <= n; i++)
  for (j = 1; j <= n; j++)
    for (k = 1; k <= n; k++)
    {
      // application code
    }

f(n) = $n^3$


  • Dependent Square
for (i = 1; i <= n; i++)
  for (j = 1; j <= i; j++)
  {
  // application code
  }

f(n) = 1 + 2 + … + n = n(n+1)/2

  • Logarithmic Linear
for (i = 1; i <= n; i++)
  for (j = 1; j <= n; j *= 2)
  {
    // application code
  }

f(n) = $nlog_2n$


Worst Case/Average Case

  • Average Case
    • Considers all possible forms of the input data
    • Gives the expected (average) running time
    • Usually requires probabilities, so it is more complicated to compute
  • Worst Case
    • Considers the worst possible form of the input data
    • Gives an upper bound (the longest running time)
    • Simpler to compute, since only the worst input needs to be considered
    • Developing algorithms with better worst-case performance is important
      • e.g., air traffic control, weapon systems
    • (Unless stated otherwise) we focus on worst-case analysis


  • Analyze the worst case and the average case of the following problem
    • Find X among n input data items stored in random order
    • Assume the n input data items are stored in an array named list
    • Use the linear search algorithm
    • Count how many comparison operations are performed
  • Average Case: A(n)
    • q : probability that X is in list
    • T(i) : number of comparisons when X is at position i of list
    • Consider both the case where X is in list and the case where it is not
      • A(n) = (q/n) · (1 + 2 + … + n) + (1 − q) · n = (q/2) · (n + 1) + (1 − q) · n
    • e.g., q = 1: ?, q = 1/2: ?, q = 0: ? (worked values below)
  • Worst Case: W(n)
    • X is at the very end of list, or is not in list at all
    • W(n) = n
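
Substituting the three values of q above into A(n) (a straightforward check, not from the original slides):

  • q = 1: A(n) = (n + 1)/2
  • q = 1/2: A(n) = (n + 1)/4 + n/2 = (3n + 1)/4
  • q = 0: A(n) = n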


Comparing Time Complexities

  • For a given problem, there usually exist several algorithms that solve it.
  • It is very important to compare the time performance of these algorithms and identify the faster one.
  • What should the criterion for comparing time performance be?

(Figure: running-time functions of two algorithms for the same problem)

  • Question: Which algorithm is faster?
    • 1) When n is small (say, n < 100): ??
    • 2) When n is large (say, n > 100): ??

Definition of Big-Oh (O)

  • Given functions f(n) and g(n), we say that f(n) = O(g(n))
    if there are positive constants c and $n_0$ such that
    f(n) ≤ c·g(n) for all n ≥ $n_0$.

(Figure: f(n) bounded above by c·g(n) for all n ≥ $n_0$)

  • c absorbs differences in the H/W or S/W environment; changing the value of c does not affect the growth rate.
  • Once n reaches the threshold $n_0$ (i.e., as the input size keeps growing), c·g(n) is always at least f(n).


Exercise: Big-Oh (O) Notation

To show that f(n) is O(g(n)), we only need to find one pair (c, $n_0$).

  • Example 1: 100n = O($n^2$)
    • We need c and $n_0$ such that 100n ≤ c·$n^2$ for n ≥ $n_0$
    • Pick c = 1 and $n_0$ = 100
  • Example 2: 7n – 2 = O(n)
    • We need c and $n_0$ such that 7n – 2 ≤ c·n for n ≥ $n_0$
    • Pick c = 7 and $n_0$ = 1
  • Example 3: $3n^3 + 20n^2 + 5 = O(n^3)$
    • We need c and $n_0$ such that $3n^3 + 20n^2 + 5 ≤ c·n^3$ for n ≥ $n_0$
    • Pick c = 4 and $n_0$ = 21 (for n ≥ 21, $20n^2 + 5 ≤ n^3$)
  • Example 4: $3n^2 = O(100n)$ does not hold.
    • There are no constants c and $n_0$ satisfying $3n^2 ≤ c·100n$ for all n ≥ $n_0$.


Big Oh and Growth Rate

  • “Big Oh” notation gives an upper bound on the growth rate of a function f(n) for large n.
  • “f(n) = O(g(n))” means that the growth rate of f(n) is no more than the growth rate of g(n).

(Figure: growth rates of common functions)


Big Oh Rules

  • If $f(n) = a_kn^k + … + a_1n + a_0$ (k > 0), then $f(n) = O(n^k)$

  • Big Oh Rules
    • (1) Drop lower order terms
    • (2) Drop constant factors
  • If n is sufficiently large, only the order of complexity matters;
    • For example, $2n^3 + 100n^2 + 4n + 5 = O(n^3)$


  • Use the smallest possible class of functions; for example, “2n is O(n)” instead of “2n is O($n^2$)”

  • 1, 3, 100, … ➔ O(1)
  • $0.5n, log_2n + n, 10000n, \dots$ ➔ O(n)
  • $n^2, 0.5n^2, 10n^2 + 5n + 20, 100n^2, \dots$ ➔ O($n^2$)
  • $2log_2n, log_2n + 10, \dots$ ➔ O($log_2n$)
  • $4nlog_2n, nlog_2n + 5log_2n + 10, \dots$ ➔ O($nlog_2n$)
  • $100n^3, 4n^3 + 50n^2 + 7n + 5, \dots$ ➔ O($n^3$)


Time Analysis: Prefix-Average

(Figure: prefix-average Algorithm 1)

  • Time: $n + n + (1 + 2 + \dots + (n-1)) + (1 + 2 + \dots + (n-1)) + n + 1 = n^2 + 2n + 1 = O(n^2)$

(Figure: prefix-average Algorithm 2)

  • Time: $1 + n + n + n + 1 = 3n + 2 = O(n)$


  • Algorithm 1: O($n^2$); (Redundant computations)
  • Algorithm 2: O(n); (Saving intermediate results)
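
A minimal C sketch of the two algorithms summarized above (the array names X and A, the function names, and the use of integer division for the averages are assumptions):

// Illustrative sketch of Algorithm 1: for each i, recompute the sum
// X[0] + ... + X[i] from scratch. The inner loop performs
// 1 + 2 + ... + n additions in total, so the time is O(n^2).
void prefix_average1(int X[], int A[], int n)
{
  for (int i = 0; i < n; i++)
  {
    int sum = 0;
    for (int j = 0; j <= i; j++)
      sum = sum + X[j];
    A[i] = sum / (i + 1);
  }
}

// Illustrative sketch of Algorithm 2: keep the running sum from the
// previous step, so each prefix average costs O(1) and the total is O(n).
void prefix_average2(int X[], int A[], int n)
{
  int sum = 0;
  for (int i = 0; i < n; i++)
  {
    sum = sum + X[i];
    A[i] = sum / (i + 1);
  }
}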


Exercise: Searching

  • Problem: Find an integer X among n integers.

  • Algorithm 1:
    • Data Structure: n unsorted integers in the array list.
    • Method: Linear Search
  • Algorithm 2:
    • Data Structure: n sorted integers in the array list.
    • Method: Binary Search
  • Which algorithm is better?
    • Searching Cost?
    • Insertion Cost?
    • Deletion Cost?


  • left and right are variables that point to the left and right positions in list, respectively


(Figure: binary search pseudocode)
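
A minimal C sketch of binary search using the left and right variables described above (the function name binary_search, the iterative form, and the return convention are assumptions), assuming list[0..n-1] is sorted in increasing order:

// Illustrative sketch of binary search.
// Compare X with the middle element and discard the half that cannot
// contain X. Returns the index of X, or -1 if X is not present.
int binary_search(int list[], int n, int X)
{
  int left = 0, right = n - 1;
  while (left <= right)
  {
    int mid = (left + right) / 2;
    if (list[mid] == X)
      return mid;
    else if (list[mid] < X)
      left = mid + 1;
    else
      right = mid - 1;
  }
  return -1;
}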


  • Analyze the running time of the above algorithm. (Hint: T(1) = 1, T(n) = T(n/2) + 1)


Class of Time Complexities

  • Polynomial Time (Tractable)
    • O(1) : Constant
    • O($log_2n$) : Logarithmic
    • O(n) : Linear
    • O($n·log_2n$): Logarithmic Linear
    • O($n^2$) : Square
    • O($n^3$) : Cubic
    • . . . . . .
    • O($n^k$) : Polynomial (general)
  • Exponential Time (Intractable)
    • O($2^n$)
    • O(n!)
    • O($n^n$)
  • Ordering of complexities
    • $O(1) < O(log_2n) < O(n) < O(nlog_2n) < O(n^2) < O(n^3) < O(2^n) < O(n!)$


  • What do the following comparisons mean? How large is the difference in efficiency? e.g., $n = 10, n = 10^3, n = 10^6, n = 10^9$ (a rough comparison follows this list)
    • O(1) vs. O(n)
    • O(n) vs. O($log_2n$)
    • O($n^2$) vs. O($n·log_2n$)
  • Which of the following is larger? Is such a problem solvable in practice?
    • $O(n^k)$ vs. $O(2^n)$
  • The polynomial-time class consists of tractable problems; many well-known, useful problems can be solved in O($n^3$) time or less.
  • The exponential-time class consists of intractable problems.
  • There are also problems for which no polynomial-time algorithm is known (so far).
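
As a rough back-of-the-envelope comparison (illustrative numbers, not from the original slides), take $n = 10^6$:

  • O($log_2n$) ≈ 20 steps, while O(n) = $10^6$ steps
  • O($n·log_2n$) ≈ 2 × $10^7$ steps, while O($n^2$) = $10^{12}$ steps
  • O($2^n$) exceeds $10^{300000}$ steps, so exponential-time algorithms are hopeless even for modest n, whereas every polynomial $n^k$ is eventually dwarfed by $2^n$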


Orders of Magnitude

(Figure: orders of magnitude of common complexity functions)


Growth of Function Values

(Figure: growth of function values as n increases)


Class of Time Complexities

(Figure: classes of time complexities)


Practical Complexities

  • Linear Search: O(n)
  • Binary Search: O($log_2n$)
  • Multi-way Search: O($log_bn$), b ≫ 2
  • Find Maximum: O(n) or O($log_2n$)
  • Insertion Sort: O($n^2$)
  • Merge Sort: O($n·log_2n$)
  • Shortest Paths: O($n^2$)
  • All Pairs Shortest Paths: O($n^3$)
  • Matrix Addition: O($n^2$)
  • Matrix Multiplication: O($n^3$) or O($n^{2.81}$)
  • Hamiltonian Cycle: O(n!)
  • ……
