GParse 1.0.0

GParse (Beta)

A library of useful delimited-text parsers with a common interface.

Description

All parsers in this library implement the ITextParser interface with a single method:

public IEnumerable<string> Parse(...);

To support deferred execution or more dynamic input, a text provider interface can be used instead of a string for the input.

public interface ITextInputProvider
{	
	public string GetText();
}

The library contains these parsers:

Class Name            Description
SplitParser           A simple wrapper around the .NET string.Split() method.
DelimitedTextParser   A custom implementation that looks for delimiters of any length. May be modified in the future to accept multiple delimiters.
QuoteAwareParser      A delimited-text parser that ignores the delimiter when it appears within quotes. Useful, for example, for space-delimited files whose fields are human-readable text and may themselves contain spaces.

Usage Instructions

  • Create an instance of your chosen parser.
  • Then call the Parse() method with your delimited input.
  • Execution is deferred. Iterate the collection to retrieve the values.
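
Deferred execution here works as it does in LINQ: nothing is parsed until the result is enumerated. The following self-contained sketch shows that behavior with a C# iterator. SplitLazily is a hypothetical stand-in written for this illustration and is not GParse's implementation.

```csharp
using System;
using System.Collections.Generic;

var produced = new List<string>();

// Calling the iterator only creates the sequence; nothing is parsed yet.
IEnumerable<string> tokens = SplitLazily("1|2|3", '|', produced);
Console.WriteLine(produced.Count);   // prints 0

// Enumerating the sequence is what drives the parsing.
foreach (string token in tokens)
{
	Console.WriteLine(token);        // prints 1, then 2, then 3
}
Console.WriteLine(produced.Count);   // prints 3

// Hypothetical stand-in for a deferred parser (not part of GParse).
// It records each token in "log" at the moment the token is produced.
static IEnumerable<string> SplitLazily(string input, char delimiter, List<string> log)
{
	int start = 0;
	for (int i = 0; i <= input.Length; i++)
	{
		// A token ends at each delimiter and at the end of the input.
		if (i == input.Length || input[i] == delimiter)
		{
			string token = input.Substring(start, i - start);
			log.Add(token);
			yield return token;
			start = i + 1;
		}
	}
}
```

If the values are needed more than once, materializing them with ToList() avoids parsing the input again on every enumeration.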

See the details for each class below for more information and examples of usage.

ITextParser Interface

The ITextParser interface has only one method, Parse(). There are overloads to take two types of input, a string or an ITextInputProvider.

ITextInputProvider Interface

The ITextInputProvider interface acts as a string factory. It can defer collecting the input string until the moment it is needed, or it can wrap a function instead of a literal value so that the input can be parameterized.

public interface ITextInputProvider
{
	public string GetText();
}

AnonDelimitedTextInputProvider Class

For convenience, the AnonDelimitedTextInputProvider class has been included to provide a universal implementation of ITextInputProvider. Its GetText() implementation is provided as a function through its constructor.

Example:

ITextInputProvider provider = new AnonDelimitedTextInputProvider(
	static () => Console.ReadLine());
Console.WriteLine(provider.GetText());

SplitParser

The SplitParser class is a thin wrapper that exposes .NET's string.Split() method through the ITextParser interface.

Example:

ITextParser parser = new SplitParser("|");
IEnumerable<string> oneTwoThree = parser.Parse("1|2|3");

DelimitedTextParser

The DelimitedTextParser class is a custom replacement for .NET's string.Split() method. Because the implementation is not tied to string.Split(), it can gain features in the future, such as those listed in the Roadmap.

Example:

ITextParser parser = new DelimitedTextParser("|");
IEnumerable<string> oneTwoThree = parser.Parse("1|2|3");

QuoteAwareParser

The QuoteAwareParser class ignores delimiters that it finds within quotes. This is useful, for example, for space-delimited input whose tokens are human-language text and likely contain spaces. The constructor accepts openQuote and closeQuote parameters, so the quotes need not be actual quotation characters. They can be any string.

If the parsed text has an open quote without a corresponding closing quote, a ParseException is thrown.

Example:

ITextParser parser = new QuoteAwareParser(" ", "{", "}");
List<string> containsSpacesText = parser
	.Parse("{This contains spaces} {and so does this}")
	.ToList();

Console.WriteLine(containsSpacesText[0]);
Console.WriteLine(containsSpacesText[1]);

// Output is:
//{This contains spaces}
//{and so does this}

Note in the example above that the { and } characters are not removed from the tokens during parsing. The quotation characters are maintained. It is up to the caller to remove them if that is what's desired. To facilitate this, see the string.Unquote() extension method.
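
The behavior described in this section (delimiters inside quotes are ignored, the quote strings stay in the tokens, and an unclosed quote is an error) can be illustrated with a small self-contained sketch. It is not GParse's implementation. SplitOutsideQuotes and StartsAt are hypothetical helpers written for this illustration, FormatException stands in for the library's ParseException, and the sketch assumes non-empty delimiter and quote strings and no nested quotes.

```csharp
using System;
using System.Collections.Generic;

// Prints {a b}, then c, then {d e}
foreach (string token in SplitOutsideQuotes("{a b} c {d e}", " ", "{", "}"))
{
	Console.WriteLine(token);
}

try
{
	// The opening "{" is never closed, so enumeration fails.
	foreach (string token in SplitOutsideQuotes("{a b", " ", "{", "}"))
	{
		Console.WriteLine(token);
	}
}
catch (FormatException ex)
{
	Console.WriteLine(ex.Message);
}

// Hypothetical helper: splits on the delimiter only while outside a quoted span.
static IEnumerable<string> SplitOutsideQuotes(
	string input, string delimiter, string openQuote, string closeQuote)
{
	int tokenStart = 0;
	int i = 0;
	bool inQuote = false;

	while (i < input.Length)
	{
		if (!inQuote && StartsAt(input, i, openQuote))
		{
			inQuote = true;
			i += openQuote.Length;
		}
		else if (inQuote && StartsAt(input, i, closeQuote))
		{
			inQuote = false;
			i += closeQuote.Length;
		}
		else if (!inQuote && StartsAt(input, i, delimiter))
		{
			// A delimiter outside quotes ends the current token.
			yield return input.Substring(tokenStart, i - tokenStart);
			i += delimiter.Length;
			tokenStart = i;
		}
		else
		{
			i++;
		}
	}

	if (inQuote)
	{
		// Stand-in for the library's ParseException.
		throw new FormatException("Open quote without a matching close quote.");
	}

	yield return input.Substring(tokenStart);
}

// True when "value" occurs in "text" starting exactly at "index".
static bool StartsAt(string text, int index, string value) =>
	string.CompareOrdinal(text, index, value, 0, value.Length) == 0;
```

Because the helper is an iterator, the exception surfaces during enumeration rather than at the call, which matches the deferred execution described under Usage Instructions.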

string.Unquote() Extension Method

The string.Unquote() extension method removes the quotes that QuoteAwareParser leaves in place during its Parse() operation. Call it on the token string and pass the opening and closing quotation strings. Here is the QuoteAwareParser example revised to use it after parsing.

ITextParser parser = new QuoteAwareParser(" ", "{", "}");
List<string> containsSpacesText = parser
	.Parse("{This contains spaces} {and so does this}")
	.Select(static s => s.Unquote("{", "}"))
	.ToList();

Console.WriteLine(containsSpacesText[0]);
Console.WriteLine(containsSpacesText[1]);

// Output is:
//This contains spaces
//and so does this
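
For readers curious about what such a helper involves, here is a minimal sketch of quote stripping. StripQuotes is a hypothetical function written for this illustration and is not the library's Unquote(). In particular, leaving a token unchanged when it is not wrapped in both quote strings is an assumption of the sketch, not documented GParse behavior.

```csharp
using System;

Console.WriteLine(StripQuotes("{This contains spaces}", "{", "}"));   // prints This contains spaces
Console.WriteLine(StripQuotes("no quotes here", "{", "}"));           // prints no quotes here

// Hypothetical helper: removes one leading openQuote and one trailing closeQuote,
// but only when the token is wrapped in both.
static string StripQuotes(string token, string openQuote, string closeQuote)
{
	bool isQuoted =
		token.Length >= openQuote.Length + closeQuote.Length &&
		token.StartsWith(openQuote, StringComparison.Ordinal) &&
		token.EndsWith(closeQuote, StringComparison.Ordinal);

	return isQuoted
		? token.Substring(openQuote.Length, token.Length - openQuote.Length - closeQuote.Length)
		: token;
}
```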

Roadmap

AutoParser

If it proves useful and feasible, create a parser factory that takes an input sample and determines which parser to use and with what parameters. For example, if the sample contains spaces and an even number of single or double quotes, it calls for a quote-aware parser, and the delimiter is whatever appears between the quoted sections. For data without quotes, the delimiter can be inferred if the sample contains only one distinct non-alphanumeric, non-whitespace character. In that case, a DelimitedTextParser is the right choice.
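
The unquoted-data heuristic described above could be sketched roughly as follows. GuessDelimiter is a hypothetical name, and because AutoParser does not exist yet, this is only one possible reading of the idea: collect the distinct characters that are neither alphanumeric nor whitespace, and accept the result only when exactly one remains.

```csharp
using System;
using System.Linq;

Console.WriteLine(GuessDelimiter("1|2|3")?.ToString() ?? "(none)");   // prints |
Console.WriteLine(GuessDelimiter("a,b;c")?.ToString() ?? "(none)");   // prints (none)

// Hypothetical helper: returns the delimiter when the sample contains exactly one
// distinct character that is neither a letter, a digit, nor whitespace.
static char? GuessDelimiter(string sample)
{
	char[] candidates = sample
		.Where(static c => !char.IsLetterOrDigit(c) && !char.IsWhiteSpace(c))
		.Distinct()
		.ToArray();

	return candidates.Length == 1 ? (char?)candidates[0] : null;
}
```

A sample with punctuation inside the data, such as decimal points, would defeat this simple check, which is part of why the feature is listed as conditional on feasibility.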

DelimitedTextParser

Future features for the DelimitedTextParser class include:

  • Multiple delimiters
  • Case-insensitive delimiters
  • Convert from SafeSubstring to use Span<char> and read it one character at a time for performance.
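
As a rough illustration of the Span<char> idea, the sketch below scans a ReadOnlySpan<char> one position at a time and allocates a string only for each finished token. SplitSpan is a hypothetical helper, not the planned implementation. It assumes a non-empty delimiter and returns a list for simplicity, so it does not preserve deferred execution.

```csharp
using System;
using System.Collections.Generic;

// Prints a, then b, then c
foreach (string token in SplitSpan("a::b::c", "::"))
{
	Console.WriteLine(token);
}

// Hypothetical helper: walks the input span position by position and compares
// a delimiter-sized slice at each step, without creating intermediate substrings.
static List<string> SplitSpan(ReadOnlySpan<char> input, ReadOnlySpan<char> delimiter)
{
	var tokens = new List<string>();
	int start = 0;
	int i = 0;

	while (i <= input.Length - delimiter.Length)
	{
		if (input.Slice(i, delimiter.Length).SequenceEqual(delimiter))
		{
			// Only completed tokens are materialized as strings.
			tokens.Add(input.Slice(start, i - start).ToString());
			i += delimiter.Length;
			start = i;
		}
		else
		{
			i++;
		}
	}

	// Whatever follows the last delimiter is the final token.
	tokens.Add(input.Slice(start).ToString());
	return tokens;
}
```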

QuoteAwareParser

Future features for the QuoteAwareParser class include:

  • Multiple delimiters
  • Case-insensitive delimiters
  • Convert from SafeSubstring to use Span<char> and read it one character at a time for performance.

Unparsers

Unparsers will reverse the parsing operation, turning an IEnumerable<string> back into a single concatenated string. This may be moot in light of what LINQ can do, but we'll see if it's more readable or more usable.

Planned capabilities for unparsers:

  • Concatenate a list of tokens, separated by a delimiter.
  • Conditional delimiters (omit some delimiters based on predicates)
  • Conditional tokens (omit some tokens based on predicates)
  • Token transforms / projections (surround a token with brackets, etc.)
  • Calculated delimiters (e.g., "1:A", "2:B", "3:C")
  • Overall prefix (e.g. "MyPrefix 1,2,3")
  • Overall suffix (e.g. "1,2,3 MySuffix")
  • Align tokens by space-padding the fields.
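
For comparison, several of these capabilities (conditional tokens, token transforms, a delimiter, and an overall prefix and suffix) can already be approximated with LINQ and string.Join(). The sketch below is not a proposed API. Unparse and its parameter names are hypothetical and exist only to show what an unparser would need to improve on.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

string[] fields = { "1", "", "2", "3" };

string line = Unparse(
	fields,
	delimiter: ",",
	prefix: "MyPrefix ",
	suffix: " MySuffix",
	include: static t => t.Length > 0,        // conditional tokens: skip empty ones
	transform: static t => "[" + t + "]");    // token transform: surround with brackets

Console.WriteLine(line);   // prints MyPrefix [1],[2],[3] MySuffix

// Hypothetical helper built only from LINQ and string.Join().
static string Unparse(
	IEnumerable<string> tokens,
	string delimiter,
	string prefix,
	string suffix,
	Func<string, bool> include,
	Func<string, string> transform)
{
	return prefix + string.Join(delimiter, tokens.Where(include).Select(transform)) + suffix;
}
```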

More Examples

The GParse library is fully unit-tested. You can find examples of use in the unit tests.

Source Code

You can find the source code online on my Git server.

https://git.pillidar.com/PillidarPublic/GParse

Issues

No known issues.

No packages depend on GParse.

.NET 8.0

  • No dependencies.

Version Downloads Last updated
1.0.0 12 03/21/2025