Posted by Set*_*hel

Keras RNN (R) text generation: word-level model

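I have been working through the character-level text generation example: https://keras.rstudio.com/articles/examples/lstm_text_generation.html

I am unable to extend this example to a word-level model. See the reprex below: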

library(keras)
library(readr)
library(stringr)
library(purrr)
library(tokenizers)

# Parameters

maxlen <- 40

# Data Preparation

# Retrieve text
path <- get_file(
  'nietzsche.txt', 
  origin='https://s3.amazonaws.com/text-datasets/nietzsche.txt'
  )

# Load, collapse, and tokenize text
text <- read_lines(path) %>%
  str_to_lower() %>%
  str_c(collapse = "\n") %>%
  tokenize_words(simplify = TRUE)

print(sprintf("corpus length: %d", length(text)))

words <- text %>%
  unique() %>%
  sort()

print(sprintf("total words: %d", length(words)))  

This gives:

[1] "corpus length: 101345"
[1] "total words: 10283"
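Given the counts above, it may be worth estimating how large a dense one-hot input would be before building it. The sketch below assumes the same array layout as the character-level example (sequences x maxlen x vocabulary) and a hypothetical stride of 3 words between sequence starts; neither is confirmed by the post for the word-level case.

```r
# Counts reported in the output above
corpus_length <- 101345
vocab_size    <- 10283
maxlen        <- 40

# Assumption: a new sequence starts every `step` words
step <- 3

# Start positions that leave room for maxlen inputs plus one target word
n_sequences <- length(seq(1, corpus_length - maxlen, by = step))

# Cells in a dense one-hot input array: sequences x maxlen x vocabulary
n_cells <- as.numeric(n_sequences) * maxlen * vocab_size

# Integer and logical arrays in R use 4 bytes per cell;
# a numeric (double) array would need twice as much
size_gb <- n_cells * 4 / 1024^3

print(sprintf("sequences: %d", n_sequences))
print(sprintf("one-hot input size: %.1f GB", size_gb))
```

Under these assumptions the array comes out at tens of gigabytes, which suggests that a one-hot encoding that is workable for a few dozen characters may not carry over directly to a vocabulary of this size.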

When I move on to the next step, I run into problems:

# Cut the text in semi-redundant sequences of maxlen …
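The code for this step is cut off in the post, so the following is only a minimal sketch of one way to cut a tokenized text into overlapping word sequences. It is not the author's code. The toy `text` vector, the shortened `maxlen`, the `step` stride, and the names `word_index`, `starts`, `x`, and `y` are all assumptions made for illustration. It stores integer word indices rather than one-hot vectors.

```r
# Toy stand-in for the tokenized corpus (hypothetical data, not the Nietzsche text)
text <- c("the", "cat", "sat", "on", "the", "mat",
          "and", "the", "dog", "sat", "too")
words <- sort(unique(text))

maxlen <- 4  # shortened from 40 so the toy corpus yields several sequences
step   <- 3  # assumed stride between sequence starts

# Map each word to its integer position in the vocabulary (1-based)
word_index <- match(text, words)

# Start positions that leave room for maxlen inputs plus one target word
starts <- seq(1, length(text) - maxlen, by = step)

# x: one row per sequence, each row holding maxlen word indices
x <- t(sapply(starts, function(i) word_index[i:(i + maxlen - 1)]))

# y: index of the word that follows each sequence
y <- word_index[starts + maxlen]

print(dim(x))
print(words[y])
```

Integer indices like these could be passed to an embedding layer such as `layer_embedding()` instead of being expanded to one-hot arrays. Because the indices here are 1-based, the layer's `input_dim` would need to be at least `length(words) + 1`.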

r lstm keras recurrent-neural-network

5 votes · 1 answer · 868 views
