如何查找包含序列的所有元素的最小长度子序列

rus*_*ell 16 algorithm

给定一个序列,如S = {1,8,2,1,4,1,2,9,1,8,4},我需要找到包含S的所有元素的最小长度子序列(没有重复,顺序无所谓).如何以有效的方式找到这个子序列?

注意: S中有5个不同的元素:{1,2,4,8,9}.最小长度子序列必须包含所有这5个元素.

Gri*_*yan 11

算法:

首先,确定阵列中不同元素的数量 - 这可以在线性时间内轻松完成.让我们有k不同的元素.

分配一个cur大小为10 ^ 5 的数组,每个数组显示当前子序列中每个元素的使用量(见下文).

保持一个cnt变量,显示当前在所考虑的序列中有多少个不同的元素.现在,取两个索引,begin并按end以下方式遍历数组:

  1. 初始化cntbeginas 0,endas -1(0在第一次增量后获得).然后尽可能执行以下操作:
  2. 如果cnt != k:

    2.1.增量end.如果end已经是数组的结尾,那么就打破.如果cur[array[end]]为零,则递增cnt.增量cur[array[end]].

    其他:

    2.2 {

    尝试增加begin迭代器:while cur[array[begin]] > 1,递减它,并递增begin(cur[array[begin]] > 1意味着我们在当前子序列中有另一个这样的元素).毕竟,将[begin, end]间隔与当前答案进行比较,如果更好则存储它.

    }

在进一步的过程变得不可能之后,你得到了答案.复杂性是O(n)- 只需通过数组传递两个交互器.

用C++实现:

    #include <iostream>

using namespace std;

const int MAXSIZE = 10000;

int arr[ MAXSIZE ];
int cur[ MAXSIZE ];

int main ()
{
   int n; // the size of array
   // read n and the array

   cin >> n;
   for( int i = 0; i < n; ++i )
      cin >> arr[ i ];

   int k = 0;
   for( int i = 0; i < n; ++i )
   {
      if( cur[ arr[ i ] ] == 0 )
         ++k;
      ++cur[ arr[ i ] ];
   }

   // now k is the number of distinct elements

   memset( cur, 0, sizeof( cur )); // we need this array anew
   int begin = 0, end = -1; // to make it 0 after first increment
   int best = -1; // best answer currently found
   int ansbegin, ansend; // interval of the best answer currently found
   int cnt = 0; // distinct elements in current subsequence

   while(1)
   {
      if( cnt < k )
      {
         ++end;
         if( end == n )
            break;
         if( cur[ arr[ end ]] == 0 )
            ++cnt; // this elements wasn't present in current subsequence;
         ++cur[ arr[ end ]];
         continue;
      }
      // if we're here it means that [begin, end] interval contains all distinct elements
      // try to shrink it from behind
      while( cur[ arr[ begin ]] > 1 ) // we have another such element later in the subsequence
      {
         --cur[ arr[ begin ]];
         ++begin;
      }
      // now, compare [begin, end] with the best answer found yet
      if( best == -1 || end - begin < best )
      {
         best = end - begin;
         ansbegin = begin;
         ansend = end;
      }
      // now increment the begin iterator to make cur < k and begin increasing the end iterator again
      --cur[ arr[ begin]];
      ++begin;
      --cnt;
   }

   // output the [ansbegin, ansend] interval as it's the answer to the problem

   cout << ansbegin << ' ' << ansend << endl;
   for( int i = ansbegin; i <= ansend; ++i )
      cout << arr[ i ] << ' ';
   cout << endl;

   return 0;
}
Run Code Online (Sandbox Code Playgroud)