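I have a data frame, lexicon, containing 650 words, and I want to create a series of random word lists for 5 speakers by randomly sampling words from lexicon. I want to do this across 24 months of data collection, with a vocabulary sample of a different size taken each month. The base data frame that specifies the months and vocabulary sizes is df1: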
df1 <- data.frame(months=rep(1:24, times=5, each=1),
vocab_size=(sample(c(0:25), 120, replace=TRUE)),
Speaker=rep(c("A", "B", "C", "D", "E"), times=1, each=24))
list1 <- split(df1, f=df1$Speaker)
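The sampling described above could be sketched as follows. This is only one possible approach under stated assumptions: the lexicon here is a hypothetical stand-in with a single column named `words` (the real column name is not shown in the question), the list-column name `word_list` is invented for illustration, and words are drawn without replacement within each month.

```r
set.seed(1)  # only so that the sketch is reproducible

# Hypothetical stand-in for the 650-word lexicon; the column name `words` is an assumption
lexicon <- data.frame(words = paste0("word", 1:650), stringsAsFactors = FALSE)

df1 <- data.frame(months = rep(1:24, times = 5),
                  vocab_size = sample(0:25, 120, replace = TRUE),
                  Speaker = rep(c("A", "B", "C", "D", "E"), each = 24))

# One character vector per row of df1, each holding vocab_size randomly drawn words
# (`word_list` is a hypothetical column name)
df1$word_list <- lapply(df1$vocab_size, function(n) sample(lexicon$words, n))

# Split by speaker, as in the original code
list1 <- split(df1, f = df1$Speaker)
```

Each element of `list1` then holds 24 rows for one speaker, and `list1$A$word_list[[3]]` would give speaker A's word list for month 3. A month with `vocab_size` 0 simply gets an empty character vector.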
lexicon looks like this:
lexicon <- data.frame(c("a", "about", "above", "ain't", "all", "am", "an", "and",
"animal", "ankle", "ant" ,"any", "apple","applesauce",
"asleep", "at", "ate", "aunt", "auntie", "aunty's",
"awake", "away", "baa", "baby" , "baby+doll", "bad" ,
"ball", "balloon", "banana", "basket", "bat", "bath",
"bathing", "bathtub", "be", "beach", "bead", "bean",
"because", "bed", "beddy", …Run Code Online (Sandbox Code Playgroud) 我有一个单词列表如下wordlist:
wordlist <- data.frame(words = c("anywhere", "youll", "feel", "comfortable", "please", "dont"))
I have another data frame containing a list of consonants:
consonants <- data.frame(consonants = c("b", "c", "d", "f", "g", "h"))
I want to create a new variable in wordlist called word_structure, in which every consonant is replaced with "C" and every vowel with "V":
wordlist$word_structure <- c("VCCCCVCV", "CVVCC", "CVVC", "CVCCVCCVCCV", "CCVVCV", "CVCC")
I don't know how to combine conditional logic with gsub to get what I need.
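One way this could be done is with two plain gsub calls and no conditional logic at all. The sketch below assumes a full 21-letter consonant list (the question shows only the first six consonants), treats y as a consonant, and assumes the words are lowercase; `consonant_class` and `step1` are names introduced here for illustration.

```r
wordlist <- data.frame(words = c("anywhere", "youll", "feel", "comfortable", "please", "dont"),
                       stringsAsFactors = FALSE)

# Assumed full consonant list; the question's data frame shows only b, c, d, f, g, h
consonants <- data.frame(consonants = c("b", "c", "d", "f", "g", "h", "j", "k", "l", "m", "n",
                                        "p", "q", "r", "s", "t", "v", "w", "x", "y", "z"),
                         stringsAsFactors = FALSE)

# Build a character class such as "[bcdfgh...]" from the consonant column
consonant_class <- paste0("[", paste(consonants$consonants, collapse = ""), "]")

# Replace consonants first, then vowels; the uppercase "C" and "V" that are
# written out cannot be matched again by the lowercase classes
step1 <- gsub(consonant_class, "C", wordlist$words)
wordlist$word_structure <- gsub("[aeiou]", "V", step1)
```

Enumerating the letters explicitly, rather than using a range such as `[a-z]`, avoids locale-dependent range matching in R's default regular expression engine.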
I have a dataframe in R that looks something like this:
library(tibble)
sample <- tribble(~subj, ~session,
"A", 1,
"A", 2,
"A", 3,
"B", 1,
"B", 2,
"C", 1,
"C", 2,
"C", 3,
"C", 4)
As you can see from this example, there are a number of sessions for each subject, but subjects do not all have the same number of sessions. There are 94 rows in my real dataset (5 subjects, between 15 and 20 different sessions each).
I have …
I am aggregating a bunch of CSV files in R, which I have done successfully using the following code (found here):
Tbl <- list.files(path = "./Data/CSVs/",
pattern = "\\.csv$",   # regular expression: file names ending in .csv
full.names = TRUE) %>%
map_df(~read_csv(., col_types = cols(.default = "c")))
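I want to include the .csv file names (ideally without the file extension) as a column in Tbl. I found a solution that uses plyr, but I want to stick with dplyr, because plyr causes my code to break further down.
Is there anything I can add to the code above to tell R to include the file name in Tbl$filename?
Many thanks!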
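A possible approach that avoids plyr is to name the vector of file paths and let the `.id` argument of `map_df` turn those names into a column. The sketch below writes two small demo files into a temporary folder so that it runs on its own; `csv_dir`, `first.csv` and `second.csv` are hypothetical stand-ins for the real "./Data/CSVs/" folder and its contents.

```r
library(readr)
library(purrr)

# Hypothetical demo folder so the sketch is self-contained;
# in the question this would be "./Data/CSVs/"
csv_dir <- file.path(tempdir(), "CSVs")
dir.create(csv_dir, showWarnings = FALSE)
write_csv(data.frame(x = 1:2), file.path(csv_dir, "first.csv"))
write_csv(data.frame(x = 3:4), file.path(csv_dir, "second.csv"))

files <- list.files(path = csv_dir, pattern = "\\.csv$", full.names = TRUE)

# Name each path by its file name without the extension
names(files) <- tools::file_path_sans_ext(basename(files))

# .id stores the names of `files` in a column called filename
Tbl <- map_df(files, ~ read_csv(.x, col_types = cols(.default = "c")), .id = "filename")
```

`map_df` binds the rows with dplyr, so plyr is not involved. Recent versions of readr (2.0 and later) also accept a vector of paths together with an `id` argument in `read_csv`, although that stores the full path rather than the bare file name.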