
Saturday, 16 November 2024

Strolling through informatics #8 – How long does this algorithm take?

by Enrico Nardelli

(Italian version here)

We saw in the previous post that for some computations, it's impossible to find any algorithm. Additionally, since we are mortal and don't have infinite time available to obtain an answer, we're also interested in knowing how long a certain algorithm might take to complete its computation.

Initially, we're interested in distinguishing problems called tractable, for which, as the problem grows in size, the wait for an answer remains "human", from problems called intractable, for which it doesn't. Instead of talking about time, I'll discuss these aspects in terms of the number of steps the best known algorithm needs to solve the problem. The reason for this choice is that time depends on the intrinsic speed of the specific computer used to execute the (program implementing the) algorithm, while we want a characterization independent of technological developments. In any case, the results in terms of tractability or intractability don't change. Conversely (a topic we won't discuss in these posts; I refer you to the book La rivoluzione informatica for a discussion), quantum computers make tractable some of the problems that are intractable on a classical computer (which is an automaton equivalent to the UTM).

Every algorithm, once expressed as a program, requires time proportional to the number of instructions that must be executed. Given the algorithm, one can therefore write the mathematical function, called the algorithm's complexity function, which expresses this number as a function of the size N of the problem. We should all remember from school the difference between a polynomial function and an exponential function: in the first, the argument N is the base of a power (e.g., N^2 or N^3 or even N^10), while in the second it's the exponent of the power (e.g., 2^N or 3^N or even 10^N). Exponential functions grow much faster than polynomial ones. The following table reports the value of these functions for different values of the argument N. Recall that the symbol "~" means "about" or "approximately", and that the notation 10^10, for example, indicates the digit 1 followed by 10 zeros, that is, the value 10 billion. The table shows that already for N = 60, every exponential function presented has a value greater than every polynomial function.
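The values in such a table are easy to recompute. Here is a minimal Python sketch (the sample sizes chosen below are mine, not necessarily those of the original table):

```python
# Polynomial vs exponential growth for a few sample sizes N.
functions = {
    "N^2":  lambda n: n ** 2,
    "N^3":  lambda n: n ** 3,
    "N^10": lambda n: n ** 10,
    "2^N":  lambda n: 2 ** n,
    "3^N":  lambda n: 3 ** n,
    "10^N": lambda n: 10 ** n,
}

for n in (10, 20, 40, 60):
    values = ", ".join(f"{name}={f(n):.1e}" for name, f in functions.items())
    print(f"N={n}: {values}")

# At N = 60 the smallest exponential, 2^60 ~ 1.2e18, already exceeds
# the largest polynomial shown, 60^10 ~ 6.0e17.
```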

It's important to remember that the complexity function is expressed, for each algorithm, with reference to the so-called "worst case" for the algorithm's execution, that is, the input most unfavorable in terms of the number of instructions to execute (assuming, for simplicity, that all instructions take the same time). Consider the first of the problems presented in the previous post, sorting: we want to know the maximum guaranteed execution time of the algorithm as a function of the quantity N, whatever the N input data may be. It can be proven that for this problem the best known algorithm guarantees a number of steps proportional to N log(N), where "log" is the logarithmic function we should remember from high school, and that this result cannot be improved (for algorithms that sort by comparing elements).
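One can see this bound at work experimentally. The sketch below is a standard merge sort (one of the algorithms achieving the N log(N) bound, though the post doesn't name a specific one) instrumented to count element comparisons, a rough proxy for "steps":

```python
import math

def sorted_with_count(items):
    """Merge sort that also counts element comparisons."""
    comparisons = 0

    def merge_sort(xs):
        nonlocal comparisons
        if len(xs) <= 1:
            return xs
        mid = len(xs) // 2
        left, right = merge_sort(xs[:mid]), merge_sort(xs[mid:])
        merged, i, j = [], 0, 0
        while i < len(left) and j < len(right):
            comparisons += 1           # one comparison per merge step
            if left[i] <= right[j]:
                merged.append(left[i]); i += 1
            else:
                merged.append(right[j]); j += 1
        merged.extend(left[i:])
        merged.extend(right[j:])
        return merged

    return merge_sort(list(items)), comparisons

# Even on reversed input, the count stays below N * log2(N).
N = 1000
result, c = sorted_with_count(range(N, 0, -1))
bound = N * math.log2(N)
print(f"{c} comparisons, bound N*log2(N) = {bound:.0f}")
```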

Let's now assume that executing a single instruction takes a modern computer one nanosecond, that is, one billionth of a second: in formula, 10^-9 (that is, 1/10^9) seconds. Consider then the second problem, shortest paths, for which the best algorithm guarantees in the worst case the execution of at most N^3 instructions for a list of N cities (ignoring marginal improvements obtained in recent years). Then we can solve it for 100 cities in one thousandth of a second (because 100^3 = 1,000,000, that is 10^6, which multiplied by 10^-9 equals 10^-3, that is 1/10^3 = 0.001) and for 1,000 cities in one second (1,000^3 = 10^9, which multiplied by 10^-9 equals exactly 1). If we then needed to solve it for 10,000 cities, 1,000 seconds would be needed, or about 17 minutes. For 100,000 cities it would take a million seconds, equal to about 12 days: a very long time, but all in all acceptable.
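This arithmetic can be replayed directly (the one-nanosecond-per-instruction figure is the text's assumption):

```python
NS_PER_INSTRUCTION = 1e-9  # one nanosecond per instruction, as assumed above

for n_cities in (100, 1_000, 10_000, 100_000):
    seconds = n_cities ** 3 * NS_PER_INSTRUCTION   # worst case: N^3 steps
    days = seconds / 86_400
    print(f"{n_cities:>7} cities -> {seconds:,.3f} s (~{days:.1f} days)")

# 100 -> 0.001 s; 1,000 -> 1 s; 10,000 -> 1,000 s (~17 min);
# 100,000 -> 1,000,000 s (~12 days)
```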

Let's instead see what happens with the third problem, the traveling salesman, for which the best known algorithms require the execution of about 2^N instructions for a list of N cities. In this case we could solve it for 20 cities in just over a thousandth of a second, but already 30 cities would take a second, 40 cities would require 17 minutes, and 50 would be the largest number of cities for which the problem is solvable within a 12-day wait. The solution for 60 cities would arrive in about 32 years, clearly an unacceptable time.
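The same check for the exponential case (note that the post's "about 32 years" uses the rounding 2^60 ≈ 10^18; the unrounded figure is closer to 37 years, which changes nothing about the conclusion):

```python
NS_PER_INSTRUCTION = 1e-9        # seconds per instruction, as assumed above
SECONDS_PER_YEAR = 365 * 24 * 3600

for n_cities in (20, 30, 40, 50, 60):
    seconds = 2 ** n_cities * NS_PER_INSTRUCTION   # worst case: 2^N steps
    years = seconds / SECONDS_PER_YEAR
    print(f"{n_cities} cities -> {seconds:,.3f} s (~{years:.1f} years)")

# 20 -> ~0.001 s; 30 -> ~1 s; 40 -> ~18 min; 50 -> ~13 days; 60 -> decades
```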

Here, then, is how the distinction between "tractable" and "intractable" problems is drawn, based on the type of complexity function of a problem's solution algorithms. In particular, if a solution algorithm is known whose complexity function is polynomial, that is, the problem admits a polynomial algorithm, then the problem is called tractable (it's said to belong to class P). If instead it has been proven that every possible solution algorithm has an exponential complexity function, the problem is intractable (it's said to belong properly to class EXP). While in the first case finding a single polynomial algorithm suffices to call the problem tractable, in the second case the proof must be conducted independently of any specific algorithm, since the set of possible solution algorithms is infinite and the fact that no polynomial-complexity algorithm has been found so far doesn't imply that none exists.

To discuss this situation from another point of view and to better understand that intractability cannot be solved with technological advances, let's consider what would happen if we had a computer a thousand times faster than current ones. Having 12 days available, we would be able to solve the second problem for a million cities instead of 100,000. In the case of the third problem, in the same time of 12 days we would only manage to go from 50 to 60 cities!
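This "speedup doesn't help" argument can also be verified numerically. With a budget of 10^6 seconds (~12 days) at one instruction per nanosecond, a 1000x faster machine multiplies the instruction budget by 1000; the largest solvable N then grows by the cube root of 1000 (= 10x) for an N^3 algorithm, but only by log2(1000) ≈ 10 extra cities for a 2^N algorithm:

```python
import math

BUDGET = 10 ** 15              # instructions in ~12 days: 10^6 s x 10^9 instr/s
FAST_BUDGET = BUDGET * 1_000   # a computer 1000x faster, same 12 days

# N^3 algorithm: the solvable size grows by a factor of 1000^(1/3) = 10 ...
n_cubic      = round(BUDGET ** (1 / 3))        # -> 100,000 cities
n_cubic_fast = round(FAST_BUDGET ** (1 / 3))   # -> 1,000,000 cities

# ... but for 2^N it grows only by log2(1000) ~ 10 additional cities.
n_exp      = round(math.log2(BUDGET))          # -> ~50 cities
n_exp_fast = round(math.log2(FAST_BUDGET))     # -> ~60 cities

print(f"N^3: {n_cubic:,} -> {n_cubic_fast:,} | 2^N: {n_exp} -> {n_exp_fast}")
```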

The next post, the last one dedicated to algorithms, will consider a boundary situation between tractability and intractability that is very relevant both from a cultural and practical point of view.

[[The posts in this series are based on the Author's book (in Italian) La rivoluzione informatica: conoscenza, consapevolezza e potere nella società digitale, (= The Informatics Revolution: Knowledge, Awareness and Power in the Digital Society) to which readers are referred for further reading]].

--
The original version (in Italian) was published by "Osservatorio sullo Stato digitale" (= Observatory on the Digital State) of IRPA - Istituto di Ricerche sulla Pubblica Amministrazione (= Research Institute on Public Administration) on 13 November 2024.
