How to sort data using a DATA step in SAS

Sar*_*ran 5 sas datastep

I want to sort data inside a SAS DATA step. To be precise: the work of PROC SORT should be done in a DATA step. Is there a way to do this?

Stu*_*ski 6

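If you are looking for a DATA-step-only solution, you can do the work of PROC SORT with a hash table. The caveat is that you need enough memory to hold the whole table.

For a simple sort, load the hash table with the ordered:'yes' option and output it to a new table. By default, ordered:'yes' sorts the data in ascending key order. You can also specify descending.

Simple sort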

data _null_;

    /* Sets up PDV without loading the table */
    if(0) then set sashelp.class;

    /* Load sashelp.class into memory ordered by Height. Do not remove duplicates. */
    dcl hash sortit(dataset:'sashelp.class', ordered:'yes', multidata:'yes');

        sortit.defineKey('Height');     * Order by height;
        sortit.defineData(all:'yes');   * Keep all variables in the output dataset;

    sortit.defineDone();

    /* Output to a dataset called class_sorted */
    sortit.Output(dataset:'class_sorted');
run;
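
The descending option mentioned above can be sketched the same way. This is a minimal variation of the simple sort, assuming the only change needed is the value of the ordered argument tag; the output name class_sorted_desc is made up for illustration.

```sas
data _null_;

    /* Sets up PDV without loading the table */
    if(0) then set sashelp.class;

    /* Same as the simple sort, but keys are kept in descending order */
    dcl hash sortit(dataset:'sashelp.class', ordered:'descending', multidata:'yes');

        sortit.defineKey('Height');     * Order by height, tallest first;
        sortit.defineData(all:'yes');   * Keep all variables in the output dataset;

    sortit.defineDone();

    /* class_sorted_desc is a hypothetical output name */
    sortit.Output(dataset:'class_sorted_desc');
run;
```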

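Removing duplicates

To remove duplicates, do exactly the same thing but leave out the multidata option. Without it, the hash table keeps only the first row loaded for each key value. In the sorted output of the previous example, observations (8, 9) and (15, 16) have the same Height, so they are duplicates of each other on the key. Observations 9 and 16 will be eliminated.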

data _null_;

    /* Sets up PDV without loading the table */
    if(0) then set sashelp.class;

    /* Load sashelp.class into memory ordered by Height. Do not keep duplicates. */
    dcl hash sortit(dataset:'sashelp.class', ordered:'yes');

        sortit.defineKey('Height');     * Order by height;
        sortit.defineData(all:'yes');   * Keep all variables in the output dataset;
    sortit.defineDone();

    /* Output to a dataset called class_sorted */
    sortit.Output(dataset:'class_sorted');
run;


use*_*489 6

Stu beat me to it, but provided that your dataset contains a unique key, and you can fit the whole thing in memory, you can use a hash sort, e.g.:

data _null_;
  if 0 then set sashelp.class;
  declare hash h(dataset:"sashelp.class",ordered:"a");
  rc = h.definekey("age","sex","name");
  rc = h.definedata(ALL:'yes');
  rc = h.definedone();
  rc = h.output(dataset:"class_sorted");
  stop;
run;
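
Since this approach silently drops rows when the key is not unique, it may be worth checking that before writing the output. A rough sketch, assuming the source is an ordinary SAS data set for which nobs= is accurate (it is not reliable for views or some engines); n_source and n_loaded are names made up for this example.

```sas
data _null_;
  /* nobs= is filled in at compile time, so n_source is known
     even though this SET statement never executes */
  if 0 then set sashelp.class nobs=n_source;
  declare hash h(dataset:"sashelp.class",ordered:"a");
  rc = h.definekey("age","sex","name");
  rc = h.definedata(ALL:'yes');
  rc = h.definedone();

  /* num_items is the number of rows the hash object actually kept */
  n_loaded = h.num_items;
  if n_loaded < n_source then
    put "WARNING: key is not unique, rows would be lost: " n_source= n_loaded=;
  else rc = h.output(dataset:"class_sorted");
  stop;
run;
```

If the warning appears, either add more variables to the key or use multidata:'yes' as in the other answer.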

If you are really determined to avoid using any built-in sort methods, a particularly silly approach is to load the whole dataset into a series of temporary arrays, sort the arrays using a hand-coded algorithm, and export again:

https://codereview.stackexchange.com/questions/79952/quicksort-in-sas-for-sorting-datasets
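
For a rough idea of what the array approach looks like, here is a much smaller sketch than the linked quicksort. It is not that code: it uses a plain insertion sort on a single numeric key (Height), sorts row numbers rather than whole rows, and re-reads the rows with point=. The array size of 100 and all names (hgt, idx, p1, p2, cur, class_sorted_manual) are assumptions for illustration.

```sas
data class_sorted_manual;
  /* Set up the PDV in the original variable order without reading rows */
  if 0 then set sashelp.class;

  /* Assumption: at most 100 rows (sashelp.class has 19) */
  array hgt[100] _temporary_;   /* sort key (Height) of each row */
  array idx[100] _temporary_;   /* row numbers, rearranged by the sort */

  /* Pass 1: read every row's key by direct access */
  do i = 1 to n;
    p1 = i;
    set sashelp.class(keep=height) point=p1 nobs=n;
    hgt[i] = height;
    idx[i] = i;
  end;

  /* Hand-coded insertion sort of the row numbers, ascending by key.
     Ties keep their original order. */
  do i = 2 to n;
    cur = idx[i];
    j = i - 1;
    do while (j >= 1);
      if hgt[idx[j]] <= hgt[cur] then leave;
      idx[j + 1] = idx[j];
      j = j - 1;
    end;
    idx[j + 1] = cur;
  end;

  /* Pass 2: re-read the rows in sorted order and write them out */
  do i = 1 to n;
    p2 = idx[i];
    set sashelp.class point=p2;
    output;
  end;

  stop;   /* needed because point= access never hits end of file */
  drop i j cur;
run;
```

Insertion sort is quadratic in the number of rows, so this only makes sense as a demonstration on small tables.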