Jak*_*les 8 c# parsing chemistry
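I am trying to parse chemical formulas from strings in C# (for example Al2O3, O3, C, or C11H22O12). It works fine unless there is only a single atom of a particular element (for example, the oxygen atom in H2O). How can I fix this? Also, is there a better way to parse a chemical formula string?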
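ChemicalElement is a class representing a chemical element. It has the properties AtomicNumber (int), Name (string), and Symbol (string). ChemicalFormulaComponent is a class representing a chemical element together with an atom count (that is, one part of a formula). It has the properties Element (ChemicalElement) and AtomCount (int).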
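For anyone who wants to compile the code, the two types could look roughly like the sketch below. Only the property names and the ElementFromSymbol call come from the post; the constructor signatures and the tiny lookup table inside ElementFromSymbol are assumptions made for illustration.

```csharp
using System;

// Sketch only: the real classes are not shown in the post.
public sealed class ChemicalElement
{
    public int AtomicNumber { get; }
    public string Name { get; }
    public string Symbol { get; }

    // Assumed constructor; the post only lists the properties.
    public ChemicalElement(int atomicNumber, string name, string symbol)
    {
        AtomicNumber = atomicNumber;
        Name = name;
        Symbol = symbol;
    }

    // The post calls ElementFromSymbol but does not show it.
    // This stand-in knows only the elements used in the examples.
    public static ChemicalElement ElementFromSymbol(string symbol)
    {
        switch (symbol)
        {
            case "H": return new ChemicalElement(1, "Hydrogen", "H");
            case "C": return new ChemicalElement(6, "Carbon", "C");
            case "O": return new ChemicalElement(8, "Oxygen", "O");
            case "Al": return new ChemicalElement(13, "Aluminium", "Al");
            default: throw new ArgumentException("Unknown element symbol: " + symbol);
        }
    }
}

public sealed class ChemicalFormulaComponent
{
    public ChemicalElement Element { get; }
    public int AtomCount { get; }

    // Assumed constructor, matching how the parsing code creates components.
    public ChemicalFormulaComponent(ChemicalElement element, int atomCount)
    {
        Element = element;
        AtomCount = atomCount;
    }
}
```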
The rest should be clear enough (I hope), but if anything needs clarifying, please let me know before answering.
Here is my current code:
/// <summary>
/// Parses a chemical formula from a string.
/// </summary>
/// <param name="chemicalFormula">The string to parse.</param>
/// <exception cref="FormatException">The chemical formula was in an invalid format.</exception>
public static Collection<ChemicalFormulaComponent> FormulaFromString(string chemicalFormula)
{
    Collection<ChemicalFormulaComponent> formula = new Collection<ChemicalFormulaComponent>();
    string nameBuffer = string.Empty;
    int countBuffer = 0;
    for (int i = 0; i < chemicalFormula.Length; i++)
    {
        char c = chemicalFormula[i];
        if (!char.IsLetterOrDigit(c) || !char.IsUpper(chemicalFormula, 0))
        {
            throw new FormatException("Input string was in an incorrect format.");
        }
        else if (char.IsUpper(c))
        {
            // Add the chemical element and its atom count
            if (countBuffer > 0)
            {
                formula.Add(new ChemicalFormulaComponent(ChemicalElement.ElementFromSymbol(nameBuffer), countBuffer));
                // Reset
                nameBuffer = string.Empty;
                countBuffer = 0;
            }
            nameBuffer += c;
        }
        else if (char.IsLower(c))
        {
            nameBuffer += c;
        }
        else if (char.IsDigit(c))
        {
            if (countBuffer == 0)
            {
                countBuffer = c - '0';
            }
            else
            {
                countBuffer = (countBuffer * 10) + (c - '0');
            }
        }
    }
    return formula;
}
Pie*_*kel 10
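I rewrote your parser using a regular expression. The regex matches exactly the pattern your loop is trying to handle: an uppercase letter, optional lowercase letters, and an optional count. Hope this helps.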
using System;
using System.Collections.Generic;
using System.Collections.ObjectModel;
using System.Text.RegularExpressions;

public static class Program
{
    public static void Main(string[] args)
    {
        var testCases = new List<string>
        {
            "C11H22O12",
            "Al2O3",
            "O3",
            "C",
            "H2O"
        };
        foreach (string testCase in testCases)
        {
            Console.WriteLine("Testing {0}", testCase);
            var formula = FormulaFromString(testCase);
            foreach (var element in formula)
            {
                // Uses the Symbol and AtomCount properties described in the question.
                Console.WriteLine("{0} : {1}", element.Element.Symbol, element.AtomCount);
            }
            Console.WriteLine();
        }
        /* Produced the following output

        Testing C11H22O12
        C : 11
        H : 22
        O : 12

        Testing Al2O3
        Al : 2
        O : 3

        Testing O3
        O : 3

        Testing C
        C : 1

        Testing H2O
        H : 2
        O : 1
        */
    }

    private static Collection<ChemicalFormulaComponent> FormulaFromString(string chemicalFormula)
    {
        Collection<ChemicalFormulaComponent> formula = new Collection<ChemicalFormulaComponent>();
        // One element: a symbol (uppercase letter plus optional lowercase letters)
        // followed by an optional atom count.
        string elementRegex = "([A-Z][a-z]*)([0-9]*)";
        // The whole string must consist of one or more such elements.
        string validateRegex = "^(" + elementRegex + ")+$";
        if (!Regex.IsMatch(chemicalFormula, validateRegex))
            throw new FormatException("Input string was in an incorrect format.");
        foreach (Match match in Regex.Matches(chemicalFormula, elementRegex))
        {
            string name = match.Groups[1].Value;
            // A missing count means a single atom.
            int count =
                match.Groups[2].Value != "" ?
                int.Parse(match.Groups[2].Value) :
                1;
            formula.Add(new ChemicalFormulaComponent(ChemicalElement.ElementFromSymbol(name), count));
        }
        return formula;
    }
}
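If you would rather keep the hand-written loop, the single-atom problem appears to come from two places in the original code: an element is only added when countBuffer > 0, and nothing is added after the loop ends, so the last element is dropped. Below is a minimal sketch of one way to fix both. To stay self-contained it returns plain (symbol, count) tuples instead of ChemicalFormulaComponent objects. The FormulaParser class, the Parse method, and the sawDigit flag are names invented for this sketch, and it assumes C# 7 or later for the tuple syntax.

```csharp
using System;
using System.Collections.Generic;

public static class FormulaParser
{
    // Hypothetical helper: returns (symbol, count) pairs; a missing count means 1.
    public static List<(string Symbol, int Count)> Parse(string chemicalFormula)
    {
        const string error = "Input string was in an incorrect format.";
        // A formula must start with an uppercase letter.
        if (string.IsNullOrEmpty(chemicalFormula)
            || chemicalFormula[0] < 'A' || chemicalFormula[0] > 'Z')
        {
            throw new FormatException(error);
        }

        var formula = new List<(string Symbol, int Count)>();
        string nameBuffer = string.Empty;
        int countBuffer = 0;
        bool sawDigit = false; // true once the current element has an explicit count

        foreach (char c in chemicalFormula)
        {
            if (c >= 'A' && c <= 'Z')
            {
                // A new symbol starts: flush the previous one, even if it had no digits.
                if (nameBuffer.Length > 0)
                {
                    formula.Add((nameBuffer, sawDigit ? countBuffer : 1));
                }
                nameBuffer = c.ToString();
                countBuffer = 0;
                sawDigit = false;
            }
            else if (c >= 'a' && c <= 'z')
            {
                // Lowercase letters extend the symbol, but not after its count has started.
                if (sawDigit) throw new FormatException(error);
                nameBuffer += c;
            }
            else if (c >= '0' && c <= '9')
            {
                // checked: an absurdly long count throws OverflowException instead of wrapping.
                countBuffer = checked(countBuffer * 10 + (c - '0'));
                sawDigit = true;
            }
            else
            {
                throw new FormatException(error);
            }
        }

        // Flush the last element, which the original loop never added.
        formula.Add((nameBuffer, sawDigit ? countBuffer : 1));
        return formula;
    }
}
```

Calling FormulaParser.Parse("H2O") with this sketch yields the pairs (H, 2) and (O, 1). Mapping each symbol to a ChemicalFormulaComponent via ChemicalElement.ElementFromSymbol would work the same way as in the regex version.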