Build a hash from file input, appending values when the key is not unique

Key*_*yPi 1 regex perl nlp associative-array hashmap

I have this file:

affaire,chose,question
chose,emploi,fonction,service,travail,tâche
cause,chose,matière
chose,point,question,tête
chose,objet,élément
chose,machin,truc

I would like to have an associative array like this:

affaire => chose, question
cause => chose, matière
chose => emploi, fonction, service, travail, tache, point, question, tete, objet élément, machin, truc

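Or even better: every time I find a new word, save the word as the key and its context (the words to its left, to its right, or both) as the values. For example: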

affaire => chose, question
cause => chose, matière
chose => affaire, question, cause, matière, emploi, fonction, service, travail, tache, point, question, tete, objet élément, machin, truc

At the moment I am trying to create the associative array this way:

 $in = "test.txt";
 $out = "res_test.txt";

open(IN, "<", $in); 
open(OUT, ">", $out);

%list = '';
while(defined($l = <IN>)){
    if ($l =~ /((\w+),(.*))/){
        #2,3
        $list{$2} = $3;
    }
}


    while(my($k,$v) = each(%list)){
            print OUT $k." => ".$v."\n";
    }

But the result is:

affaire => chose,question
 => 
chose => machin,truc
cause => chose,matière

Why are the new values not being added? Thanks for your help.

Сух*_*й27 6

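You are overwriting the old hash value each time, when you actually want to append to it, so the solution is to concatenate the strings: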

my %list;
while (my $l = <IN>) {
    if ($l =~ /((\w+),(.*))/) {

      # $list{$2} //= ""; # initialize to empty string
      # # add comma in front depending on $list{$2} content
      # $list{$2} .= length($list{$2}) ? ",$3" : $3;
      if (defined $list{$2}) { $list{$2} .= ",$3" }
      else                   { $list{$2}  = $3 }
    }
}
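For anyone who wants to try the concatenation approach without a file on disk, here is a minimal self-contained sketch. The `@lines` array is only a stand-in for reading from `<IN>`, and the regex is anchored and uses `[^,]+` instead of `\w+`; both are my assumptions, not part of the original code. It also shows where the stray ` => ` line in the question's output comes from: `%list = ''` assigns a one-element list to the hash, which creates an empty-string key with an undefined value, whereas `my %list;` starts out empty.

```perl
use strict;
use warnings;

# Sample lines copied from the question; in real use they would come from <IN>.
my @lines = (
    "affaire,chose,question\n",
    "chose,emploi,fonction,service,travail,tâche\n",
    "cause,chose,matière\n",
    "chose,point,question,tête\n",
    "chose,objet,élément\n",
    "chose,machin,truc\n",
);

my %list;    # declared empty; `%list = ''` would add an '' key with an undef value
for my $l (@lines) {
    chomp $l;
    # Key is everything before the first comma, the rest is the value string.
    my ($k, $rest) = $l =~ /^([^,]+),(.*)$/ or next;
    # Append with a separating comma when the key already exists.
    $list{$k} = defined $list{$k} ? "$list{$k},$rest" : $rest;
}

print "$_ => $list{$_}\n" for sort keys %list;
```

With the sample data this leaves three keys (`affaire`, `cause`, `chose`), and the `chose` entry collects the values from all four lines that start with `chose`.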

Or use the more common hash of arrays to store the values:

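my %list;
while (my $l = <IN>) {
    chomp $l;                        # drop the newline so it does not stick to the last value
    my ($k, @vals) = split /,/, $l;
    next unless defined $k;          # skip empty lines
    push @{ $list{$k} }, @vals;
}
use Data::Dumper; print Dumper \%list;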
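The hash of arrays also makes the "even better" variant from the question fairly easy. Below is a rough sketch under my own reading of that request: every word on a line becomes a key, and its context is every other word that shares a line with it, recorded once in order of first appearance. The names `%context` and `%seen` and the in-memory `@lines` are hypothetical, chosen only for illustration.

```perl
use strict;
use warnings;

# Sample lines copied from the question; in real use they would be read from a file.
my @lines = (
    "affaire,chose,question\n",
    "chose,emploi,fonction,service,travail,tâche\n",
    "cause,chose,matière\n",
    "chose,point,question,tête\n",
    "chose,objet,élément\n",
    "chose,machin,truc\n",
);

my (%context, %seen);
for my $l (@lines) {
    chomp $l;
    my @words = split /,/, $l;
    for my $w (@words) {
        for my $other (@words) {
            next if $other eq $w;           # a word is not its own context
            next if $seen{$w}{$other}++;    # skip neighbours already recorded for $w
            push @{ $context{$w} }, $other;
        }
    }
}

# Print in the "key => a, b, c" layout used in the question.
print "$_ => ", join(", ", @{ $context{$_} }), "\n" for sort keys %context;
```

Note that under this reading words such as `question` become keys too, which the question's sample output does not show; filter the keys when printing if only first-column words are wanted.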