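In a project I'm working on, I need to create an XML document automatically based on user input. Modifying parts of an existing xml document with user input is no problem for me, but I'm new to creating xml documents from scratch in R.
I'd like to know whether a document like the one below can be generated in R with the XML or xml2 package. So far I have explored the newXMLDoc, xml_new_document and xml_new_root functions, but I'm not familiar with all the syntax needed to create such a file (once finished, it should be saved to a local path).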
<session>
  <modelVersion>1.0.0</modelVersion>
  <products>
    <product>
      <refNo>1</refNo>
      <uri>S1A_IW_GRDH_1SDV_20190818T175529_20190818T175554_028627_033D25_22ED.zip</uri>
      <productReaderPlugin>class org.esa.s1tbx.io.sentinel1.Sentinel1ProductReaderPlugIn</productReaderPlugin>
    </product>
    <product>
      <refNo>2</refNo>
      <uri>S2A_MSIL1C_20190823T061631_N0208_R034_T42TXS_20190823T081730.zip</uri>
      <productReaderPlugin>class org.esa.s2tbx.dataio.s2.ortho.plugins.Sentinel2L1CProduct_Multi_UTM42N_ReaderPlugIn</productReaderPlugin>
    </product>
  </products>
  <views/>
</session>
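Consider building the XML with a DOM approach using one of the libraries you mention, such as XML, which requires no string concatenation or interpolation: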
library(XML)
# DATA
df <- data.frame(refNo = c(1, 2),
                 uri = c('S1A_IW_GRDH_1SDV_20190818T175529_20190818T175554_028627_033D25_22ED.zip',
                         'S2A_MSIL1C_20190823T061631_N0208_R034_T42TXS_20190823T081730.zip'),
                 plugin = c('class org.esa.s1tbx.io.sentinel1.Sentinel1ProductReaderPlugIn',
                            'class org.esa.s2tbx.dataio.s2.ortho.plugins.Sentinel2L1CProduct_Multi_UTM42N_ReaderPlugIn')
)
# CREATE XML FILE
doc = newXMLDoc()
root = newXMLNode("session", doc = doc)
# WRITE XML NODES AND DATA
mvNode = newXMLNode("modelVersion", "1.0.0", parent = root)
for (i in seq_len(nrow(df))) {
  prodNode = newXMLNode("products", parent = root)
  # APPEND TO PRODUCT NODE
  newXMLNode("refNo", df$refNo[i], parent = prodNode)
  newXMLNode("uri", df$uri[i], parent = prodNode)
  newXMLNode("productReaderPlugin", df$plugin[i], parent = prodNode)
}
vwNode = newXMLNode("views", parent = root)
# OUTPUT XML CONTENT TO CONSOLE
print(doc)
# OUTPUT XML CONTENT TO FILE
saveXML(doc, file="Output.xml")
Output
<?xml version="1.0"?>
<session>
  <modelVersion>1.0.0</modelVersion>
  <products>
    <refNo>1</refNo>
    <uri>S1A_IW_GRDH_1SDV_20190818T175529_20190818T175554_028627_033D25_22ED.zip</uri>
    <productReaderPlugin>class org.esa.s1tbx.io.sentinel1.Sentinel1ProductReaderPlugIn</productReaderPlugin>
  </products>
  <products>
    <refNo>2</refNo>
    <uri>S2A_MSIL1C_20190823T061631_N0208_R034_T42TXS_20190823T081730.zip</uri>
    <productReaderPlugin>class org.esa.s2tbx.dataio.s2.ortho.plugins.Sentinel2L1CProduct_Multi_UTM42N_ReaderPlugIn</productReaderPlugin>
  </products>
  <views/>
</session>
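Note that the loop above emits one <products> element per row, whereas the document in the question wraps every <product> in a single <products> element. A minimal sketch of that nesting with the same XML package calls might look like the following. The data frame values are shortened placeholders, and the names productsNode, productNode and outFile are only illustrative:

```r
library(XML)

# Placeholder rows (hypothetical values standing in for the real file names)
df <- data.frame(
  refNo  = c(1, 2),
  uri    = c("first_product.zip", "second_product.zip"),
  plugin = c("class ReaderPluginA", "class ReaderPluginB"),
  stringsAsFactors = FALSE
)

doc  <- newXMLDoc()
root <- newXMLNode("session", doc = doc)
newXMLNode("modelVersion", "1.0.0", parent = root)

# Create the single <products> wrapper once, outside the loop
productsNode <- newXMLNode("products", parent = root)

for (i in seq_len(nrow(df))) {
  # One <product> child per data frame row
  productNode <- newXMLNode("product", parent = productsNode)
  newXMLNode("refNo", df$refNo[i], parent = productNode)
  newXMLNode("uri", df$uri[i], parent = productNode)
  newXMLNode("productReaderPlugin", df$plugin[i], parent = productNode)
}

newXMLNode("views", parent = root)

# Write to a temporary file; replace with your own local path
outFile <- tempfile(fileext = ".xml")
saveXML(doc, file = outFile)
```

Creating the wrapper node before the loop is the only structural change; everything else follows the answer's code.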
The xml2 package (on CRAN) offers an alternative solution from the Hadleyverse.
library(xml2)
library(tidyverse)
df <- data.frame(number = c(1, 2),
                 uri = c('S1A_IW_GRDH_1SDV_20190818T175529_20190818T175554_028627_033D25_22ED.zip',
                         'S2A_MSIL1C_20190823T061631_N0208_R034_T42TXS_20190823T081730.zip'),
                 plugin = c('class org.esa.s1tbx.io.sentinel1.Sentinel1ProductReaderPlugIn',
                            'class org.esa.s2tbx.dataio.s2.ortho.plugins.Sentinel2L1CProduct_Multi_UTM42N_ReaderPlugIn'),
                 stringsAsFactors = FALSE)
We first create the xml document containing the overall xml structure
doc <- xml_new_root("session")
xml_add_child(doc, "modelVersion", "1.0.0")
xml_add_child(doc, "products")
xml_add_child(doc, "products")
xml_add_child(doc, "views")
doc
#> {xml_document}
#> <session>
#> [1] <modelVersion>1.0.0</modelVersion>
#> [2] <products/>
#> [3] <products/>
#> [4] <views/>
We now add the components inside each products node. Because the xml_add_child function is vectorized, no loop is needed.
products_nodes <- xml_find_all(doc, "//products")
xml_add_child(products_nodes, "refNo", df$number)
xml_add_child(products_nodes, "uri", df$uri)
xml_add_child(products_nodes, "productReaderPlugin", df$plugin)
Finally, we save the xml tree to a file and display its contents
write_xml(doc, file = "output.xml", options =c("format", "no_declaration"))
cat(paste0(readLines("output.xml"), collapse = "\n"))
Here is the content of the "output.xml" file:
<session>
  <modelVersion>1.0.0</modelVersion>
  <products>
    <refNo>1</refNo>
    <uri>S1A_IW_GRDH_1SDV_20190818T175529_20190818T175554_028627_033D25_22ED.zip</uri>
    <productReaderPlugin>class org.esa.s1tbx.io.sentinel1.Sentinel1ProductReaderPlugIn</productReaderPlugin>
  </products>
  <products>
    <refNo>2</refNo>
    <uri>S2A_MSIL1C_20190823T061631_N0208_R034_T42TXS_20190823T081730.zip</uri>
    <productReaderPlugin>class org.esa.s2tbx.dataio.s2.ortho.plugins.Sentinel2L1CProduct_Multi_UTM42N_ReaderPlugIn</productReaderPlugin>
  </products>
  <views/>
</session>
Created on 2021-05-06 by the reprex package (v0.3.0)
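The xml2 output above also has two sibling <products> elements rather than the <products>/<product> nesting shown in the question. If that nesting matters, one possible sketch is to keep the node returned by xml_add_child and attach the children to it. The data frame values here are shortened placeholders, and products, product and out_file are illustrative names only:

```r
library(xml2)

# Placeholder rows (hypothetical values standing in for the real file names)
df <- data.frame(
  number = c(1, 2),
  uri    = c("first_product.zip", "second_product.zip"),
  plugin = c("class ReaderPluginA", "class ReaderPluginB"),
  stringsAsFactors = FALSE
)

doc <- xml_new_root("session")
xml_add_child(doc, "modelVersion", "1.0.0")

# xml_add_child() returns the newly added node, so keep a handle on <products>
products <- xml_add_child(doc, "products")

for (i in seq_len(nrow(df))) {
  # One <product> per row, nested inside the single <products> node
  product <- xml_add_child(products, "product")
  xml_add_child(product, "refNo", as.character(df$number[i]))
  xml_add_child(product, "uri", df$uri[i])
  xml_add_child(product, "productReaderPlugin", df$plugin[i])
}

xml_add_child(doc, "views")

# Write to a temporary file; replace with your own local path
out_file <- tempfile(fileext = ".xml")
write_xml(doc, out_file, options = "format")
```

A loop is used here for readability; the vectorized form from the answer should work as well once the <product> nodes exist.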
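This may be easy to solve without any of these packages... If your structure is fairly static, I would use https://github.com/tidyverse/glue and then just cat() the output to a file. Something like this: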
## I guess your data looks like this?
df <- data.frame(number = c(1,2),
                 uri = c("S1A_IW_GRDH_1SDV_20190818T175529_20190818T175554_028627_033D25_22ED.zip",
                         "S2A_MSIL1C_20190823T061631_N0208_R034_T42TXS_20190823T081730.zip"),
                 plugin = c("class org.esa.s1tbx.io.sentinel1.Sentinel1ProductReaderPlugIn",
                            "class org.esa.s2tbx.dataio.s2.ortho.plugins.Sentinel2L1CProduct_Multi_UTM42N_ReaderPlugIn"))
df
## build a function that outputs every block in xml format
thingieBuilder <- function(number, uri, plugin){
glue::glue("<product>
<refNo>{number}</refNo>
<uri>{uri}</uri>
<productReaderPlugin>{plugin}</productReaderPlugin>
</product>")
}
## now run that for each entry in your df and unlist it, and make it a sausage, separated by newlines
## (the %>% pipe comes from magrittr, so load it first)
library(magrittr)
xmlProducts <- df %>% purrr::pmap(thingieBuilder) %>% unlist %>% paste(collapse = "\n")
## Now stick on top and bottom, and cat it to a file!
glue::glue("<session>
<modelVersion>1.0.0</modelVersion>
<products>\n",
xmlProducts,
"\n</products>
<views/>
</session>") %>%
cat(file = "boom.xml")
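One caveat with plain string templating is that, unlike the DOM-based packages, it does not escape characters that are special in XML, so a value containing & or < would produce a malformed file. A minimal sketch of a helper that could be applied to each value before it goes into the template (escape_xml is a hypothetical name, not part of glue):

```r
# Hypothetical helper: escape the characters that are special in XML text.
# "&" must be replaced first, otherwise the entities added later get re-escaped.
escape_xml <- function(x) {
  x <- gsub("&", "&amp;", x, fixed = TRUE)
  x <- gsub("<", "&lt;", x, fixed = TRUE)
  x <- gsub(">", "&gt;", x, fixed = TRUE)
  x
}

escaped <- escape_xml("a<b & c")
print(escaped)
# → "a&lt;b &amp; c"
```

For the file names in this example the helper changes nothing, but it guards against user input that happens to contain markup characters.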