Input refactoring in progress.
ColinH committed Dec 17, 2023
1 parent 63cf877 commit 4c3aaa3
Showing 40 changed files with 1,264 additions and 88 deletions.
1 change: 1 addition & 0 deletions doc/Changelog.md
@@ -17,6 +17,7 @@
* Added new atomic rule [`consume`](Rule-Reference.md#consume-count-).
* Added new atomic rule [`everything`](Rule-Reference.md#everything).
* Added new rule [`invert`](Rule-Reference.md#invert-r-) to convert between `one` and `not_one` etc.
* Added new rule [`is_buffer`](Rule-Reference.md#is_buffer) to allow grammars to change when using a buffer input.
* Added new convenience rule [`partial`](Rule-Reference.md#partial-r-).
* Added new convenience rule [`star_partial`](Rule-Reference.md#star_partial-r-).
* Added new convenience rule [`strict`](Rule-Reference.md#strict-r-).
56 changes: 56 additions & 0 deletions doc/Development.md
@@ -0,0 +1,56 @@
# Development

### C++ Standard

Version 0.x of the PEGTL requires at least C++11.

Version 1.x of the PEGTL requires at least C++11.

Version 2.x of the PEGTL requires at least C++11.

Version 3.x of the PEGTL requires at least C++17.

Version 4.x of the PEGTL requires at least C++17.

Version 5.x of the PEGTL will make the jump to C++20 or even C++23.

* Add support for C++20 `char8_t` where appropriate.
* Use C++20 `std::span` in inputs and everywhere else it makes sense.
* Investigate whether there is anything useful we can do with Ranges in the PEGTL.
* Use C++20 Concepts instead of all the SFINAE and meta-programming where possible.
* Give examples for C++20 "lambdas in unevaluated contexts" in conjunction with `tao::pegtl::function`.
* Keep an eye out for opportunities to use the C++20 spaceship operator. Spaceship!
* Keep an eye out for opportunities to use C++20 defaulted comparison operators.
* Keep an eye out for opportunities to use C++20 `[[likely]]` and `[[unlikely]]`.
* Keep an eye out for opportunities to use C++20 `constinit` and `consteval`.
* Keep an eye out for opportunities to use the extended `constexpr` facilities.
* Keep an eye out for opportunities to use the extended CTAD facilities from C++20.
* Keep an eye out for opportunities to use class types as non-type template parameters.
* Replace the hand-crafted endian facilities with C++20 `std::endian` and C++23 `std::byteswap`.
* Investigate how C++20 and C++23 compile-time facilities can help with compile-time strings.
* Investigate whether we can use C++20 `std::bit_cast` to improve some of the low-level code.
* Use C++23 "deducing this" feature to let base class `make_rewind_guard()` return a rewind guard for a derived class.
* Can we assume the C++17 `charconv` facilities are universally available? Can we do this for 4.x?
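
The `charconv` question in the last item concerns code along the following lines. This is a minimal sketch that assumes only the standard C++17 `std::from_chars` overload for integers; the helper name `parse_unsigned` is hypothetical and not part of the PEGTL.

```cpp
#include <cassert>
#include <charconv>
#include <optional>
#include <string_view>
#include <system_error>

// Hypothetical helper, not part of the PEGTL: converts a sequence of decimal
// digits to an unsigned value with std::from_chars (C++17, <charconv>).
// Returns std::nullopt for empty input, trailing garbage, or overflow.
inline std::optional< unsigned > parse_unsigned( const std::string_view sv ) noexcept
{
   if( sv.empty() ) {
      return std::nullopt;
   }
   unsigned value = 0;
   const char* const end = sv.data() + sv.size();
   const auto [ ptr, ec ] = std::from_chars( sv.data(), end, value );
   if( ( ec != std::errc() ) || ( ptr != end ) ) {
      return std::nullopt;
   }
   return value;
}
```

Unlike `std::strtoul`, this neither depends on the locale nor needs a null-terminated string, which is why it would fit inputs that expose a pointer range.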

### Other Things

* Build a compile-time facility to convert Unicode code points to UTF-8 sequences!
* Investigate whether we are crazy enough to attempt parsing linked lists or trees.
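
The first item could start from something like the sketch below. It is only an illustration in C++17 `constexpr`; the names `utf8_sequence` and `to_utf8` are hypothetical, and a real facility would presumably produce a compile-time string type rather than a small struct.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical result type: up to four UTF-8 code units plus the count used.
struct utf8_sequence
{
   std::array< char, 4 > data{};
   std::size_t size = 0;
};

// Encodes one Unicode code point as UTF-8, usable in constant expressions.
// Surrogates and values above U+10FFFF yield an empty sequence (size == 0).
constexpr utf8_sequence to_utf8( const char32_t cp ) noexcept
{
   utf8_sequence r{};
   if( cp < 0x80 ) {
      r.data[ 0 ] = static_cast< char >( cp );
      r.size = 1;
   }
   else if( cp < 0x800 ) {
      r.data[ 0 ] = static_cast< char >( 0xC0 | ( cp >> 6 ) );
      r.data[ 1 ] = static_cast< char >( 0x80 | ( cp & 0x3F ) );
      r.size = 2;
   }
   else if( cp < 0x10000 ) {
      if( ( cp >= 0xD800 ) && ( cp <= 0xDFFF ) ) {
         return r;  // Surrogates are not valid code points for UTF-8.
      }
      r.data[ 0 ] = static_cast< char >( 0xE0 | ( cp >> 12 ) );
      r.data[ 1 ] = static_cast< char >( 0x80 | ( ( cp >> 6 ) & 0x3F ) );
      r.data[ 2 ] = static_cast< char >( 0x80 | ( cp & 0x3F ) );
      r.size = 3;
   }
   else if( cp <= 0x10FFFF ) {
      r.data[ 0 ] = static_cast< char >( 0xF0 | ( cp >> 18 ) );
      r.data[ 1 ] = static_cast< char >( 0x80 | ( ( cp >> 12 ) & 0x3F ) );
      r.data[ 2 ] = static_cast< char >( 0x80 | ( ( cp >> 6 ) & 0x3F ) );
      r.data[ 3 ] = static_cast< char >( 0x80 | ( cp & 0x3F ) );
      r.size = 4;
   }
   return r;
}

// Evaluated by the compiler: U+20AC (euro sign) needs three code units.
static_assert( to_utf8( 0x20AC ).size == 3 );
```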

### Buffer Inputs

Some things that could be done in the area of buffer inputs:

* Optional automatic discard.
* Use a double-mmap ring buffer so that `discard()` does not have to copy data within the buffer.
* Debug input and related facilities that detect when data in the input buffer is accessed after being discarded and/or moved by another discard.
* Investigate the use of ("stackful") coroutines for parsing from a network socket.
* Investigate whether coroutines can also be used for incremental parsing that keeps everything.
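
To make the copying cost and the stale-data hazard from the list above concrete, here is a minimal sketch of a linear buffer whose `discard()` moves the unconsumed bytes to the front. The type `linear_buffer` is hypothetical and is not how the PEGTL buffers are implemented; it only models the behaviour the items refer to.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical linear buffer, NOT the PEGTL implementation: the unconsumed
// bytes live in [ current, end ) somewhere inside a fixed-size array.
template< std::size_t Size >
struct linear_buffer
{
   std::array< char, Size > storage{};
   std::size_t current = 0;  // Index of the first unconsumed byte.
   std::size_t end = 0;      // Index one past the last buffered byte.

   // Appends up to n bytes and returns how many actually fit.
   std::size_t append( const char* data, const std::size_t n ) noexcept
   {
      const std::size_t room = Size - end;
      const std::size_t count = ( n < room ) ? n : room;
      std::memcpy( storage.data() + end, data, count );
      end += count;
      return count;
   }

   // Marks n bytes as consumed; the caller ensures n <= end - current.
   void consume( const std::size_t n ) noexcept
   {
      current += n;
   }

   // Reclaims the consumed prefix by copying the remaining bytes to the front.
   // This copy is what a double-mapped ring buffer would make unnecessary, and
   // it invalidates every pointer previously obtained into the buffer.
   void discard() noexcept
   {
      const std::size_t remaining = end - current;
      std::memmove( storage.data(), storage.data() + current, remaining );
      current = 0;
      end = remaining;
   }
};
```

A debug input as described above would, in this model, record the old `[ current, end )` range at each `discard()` and flag any later access through a pointer into it.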

---

This document is part of the [PEGTL](https://github.com/taocpp/PEGTL).

Copyright (c) 2023 Dr. Colin Hirsch and Daniel Frey
Distributed under the Boost Software License, Version 1.0<br>
See accompanying file [LICENSE_1_0.txt](../LICENSE_1_0.txt) or copy at https://www.boost.org/LICENSE_1_0.txt
1 change: 1 addition & 0 deletions doc/README.md
@@ -121,6 +121,7 @@
* [Limitations](Grammar-Analysis.md#limitations)
* [Changelog](Changelog.md)
* [Migration Guide](Migration-Guide.md)
* [Development](Development.md)

### Rule Reference Index

106 changes: 106 additions & 0 deletions include/tao/pegtl/buffer.hpp
@@ -0,0 +1,106 @@
// Copyright (c) 2023 Dr. Colin Hirsch and Daniel Frey
// Distributed under the Boost Software License, Version 1.0.
// (See accompanying file LICENSE_1_0.txt or copy at https://www.boost.org/LICENSE_1_0.txt)

#ifndef TAO_PEGTL_BUFFER_HPP
#define TAO_PEGTL_BUFFER_HPP

#include <cstddef>
#include <istream>

#include "config.hpp"

#include "internal/buffer_input.hpp"
#include "internal/dynamic_buffer.hpp"
#include "internal/static_buffer.hpp"

#include "internal/input_with_peeks.hpp"
#include "internal/input_with_source.hpp"

#include "internal/cstream_reader.hpp"
#include "internal/cstring_reader.hpp"
#include "internal/istream_reader.hpp"

#if !defined TAO_PEGTL_DEFAULT_BUFFER_SIZE
#define TAO_PEGTL_DEFAULT_BUFFER_SIZE 4000
#endif

#if !defined TAO_PEGTL_DEFAULT_CHUNK_SIZE
#define TAO_PEGTL_DEFAULT_CHUNK_SIZE 1000
#endif

namespace TAO_PEGTL_NAMESPACE
{
template< typename Reader >
struct dynamic_input
: internal::input_with_peeks< internal::buffer_input< internal::dynamic_buffer< char, Reader > > >
{
using internal::input_with_peeks< internal::buffer_input< internal::dynamic_buffer< char, Reader > > >::input_with_peeks;
};

dynamic_input( const std::size_t, const std::size_t, std::FILE* ) -> dynamic_input< internal::cstream_reader >;
dynamic_input( const std::size_t, const std::size_t, const char* ) -> dynamic_input< internal::cstring_reader >;
dynamic_input( const std::size_t, const std::size_t, std::istream& ) -> dynamic_input< internal::istream_reader >;

using dynamic_cstream_input = dynamic_input< internal::cstream_reader >;
using dynamic_cstring_input = dynamic_input< internal::cstring_reader >;
using dynamic_istream_input = dynamic_input< internal::istream_reader >;

template< typename Reader, std::size_t BufferSize = TAO_PEGTL_DEFAULT_BUFFER_SIZE, std::size_t ChunkSize = TAO_PEGTL_DEFAULT_CHUNK_SIZE >
struct static_input
: internal::input_with_peeks< internal::buffer_input< internal::static_buffer< char, Reader, BufferSize, ChunkSize > > >
{
using internal::input_with_peeks< internal::buffer_input< internal::static_buffer< char, Reader, BufferSize, ChunkSize > > >::input_with_peeks;
};

static_input( std::FILE* ) -> static_input< internal::cstream_reader >;
static_input( const char* ) -> static_input< internal::cstring_reader >;
static_input( std::istream& ) -> static_input< internal::istream_reader >;

template< std::size_t BufferSize = TAO_PEGTL_DEFAULT_BUFFER_SIZE, std::size_t ChunkSize = TAO_PEGTL_DEFAULT_CHUNK_SIZE >
using static_cstream_input = static_input< internal::cstream_reader, BufferSize, ChunkSize >;

template< std::size_t BufferSize = TAO_PEGTL_DEFAULT_BUFFER_SIZE, std::size_t ChunkSize = TAO_PEGTL_DEFAULT_CHUNK_SIZE >
using static_cstring_input = static_input< internal::cstring_reader, BufferSize, ChunkSize >;

template< std::size_t BufferSize = TAO_PEGTL_DEFAULT_BUFFER_SIZE, std::size_t ChunkSize = TAO_PEGTL_DEFAULT_CHUNK_SIZE >
using static_istream_input = static_input< internal::istream_reader, BufferSize, ChunkSize >;

} // namespace TAO_PEGTL_NAMESPACE

#include "analyze_traits.hpp"

#include "internal/discard.hpp"
#include "internal/is_buffer.hpp"
#include "internal/require.hpp"

namespace TAO_PEGTL_NAMESPACE
{
// clang-format off
struct discard : internal::discard {};
struct is_buffer : internal::is_buffer {};
template< unsigned Amount > struct require : internal::require< Amount > {};
// clang-format on

template< typename Name >
struct analyze_traits< Name, internal::discard >
: analyze_opt_traits<>
{};

template< typename Name >
struct analyze_traits< Name, internal::is_buffer >
: analyze_opt_traits<>
{};

template< typename Name, unsigned Amount >
struct analyze_traits< Name, internal::require< Amount > >
: analyze_opt_traits<>
{};

} // namespace TAO_PEGTL_NAMESPACE

#include "discard_input.hpp"
#include "discard_input_on_failure.hpp"
#include "discard_input_on_success.hpp"

#endif
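
The deduction guides in buffer.hpp above pick the reader type from the type of the constructor argument. The self-contained sketch below shows the same C++17 mechanism in isolation. All names in it (`sketch_input` and the two reader structs) are hypothetical stand-ins and not the PEGTL internals, whose constructor signatures are not shown in this commit.

```cpp
#include <cassert>
#include <cstdio>
#include <istream>
#include <sstream>
#include <type_traits>
#include <utility>

// Stand-ins for the reader types; hypothetical, not the PEGTL internals.
struct cstream_reader
{
   std::FILE* file;
};

struct istream_reader
{
   std::istream& stream;
};

// The Reader parameter can not be deduced from the forwarding constructor,
// so class template argument deduction has to rely on the guides below.
template< typename Reader >
struct sketch_input
{
   template< typename... As >
   explicit sketch_input( As&&... as )
      : reader{ std::forward< As >( as )... }
   {}

   Reader reader;
};

// Same shape as the guides in buffer.hpp: argument type -> reader type.
sketch_input( std::FILE* ) -> sketch_input< cstream_reader >;
sketch_input( std::istream& ) -> sketch_input< istream_reader >;
```

With these guides, `sketch_input in( some_istream );` declares a `sketch_input< istream_reader >` without spelling out the reader, which is presumably the convenience the real guides are meant to provide.
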
13 changes: 8 additions & 5 deletions include/tao/pegtl/demangle.hpp
@@ -11,7 +11,8 @@

namespace TAO_PEGTL_NAMESPACE
{
-// ensure a consistent interface
+// Ensure a consistent interface.

template< typename T >
[[nodiscard]] constexpr std::string_view demangle() noexcept;

@@ -35,6 +36,7 @@ template< typename T >
namespace TAO_PEGTL_NAMESPACE::internal
{
// When using libstdc++ with clang, std::string_view::find is not constexpr :(

template< char C >
constexpr const char* string_view_find( const char* p, std::size_t n ) noexcept
{
@@ -73,10 +75,11 @@ template< typename T >

// GCC 9.1 and 9.2 have a bug that leads to truncated __PRETTY_FUNCTION__ names,
// see https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91155

template< typename T >
[[nodiscard]] constexpr std::string_view TAO_PEGTL_NAMESPACE::demangle() noexcept
{
-// fallback: requires RTTI, no demangling
+// Fallback: Requires RTTI, no demangling.
return typeid( T ).name();
}

@@ -113,8 +116,8 @@ template< typename T >
template< typename T >
[[nodiscard]] constexpr std::string_view TAO_PEGTL_NAMESPACE::demangle() noexcept
{
-// we can not add static_assert for additional safety,
-// see issues #296, #301 and #308
+// We can not add static_assert for additional safety,
+// see issues #296, #301 and #308.
constexpr std::string_view sv = __FUNCSIG__;
constexpr auto begin = sv.find( "demangle<" );
constexpr auto tmp = sv.substr( begin + 9 );
@@ -133,7 +136,7 @@ template< typename T >
template< typename T >
[[nodiscard]] constexpr std::string_view TAO_PEGTL_NAMESPACE::demangle() noexcept
{
-// fallback: requires RTTI, no demangling
+// Fallback: Requires RTTI, no demangling.
return typeid( T ).name();
}

38 changes: 38 additions & 0 deletions include/tao/pegtl/discard_input.hpp
@@ -0,0 +1,38 @@
// Copyright (c) 2019-2023 Dr. Colin Hirsch and Daniel Frey
// Distributed under the Boost Software License, Version 1.0.
// (See accompanying file LICENSE_1_0.txt or copy at https://www.boost.org/LICENSE_1_0.txt)

#ifndef TAO_PEGTL_DISCARD_INPUT_HPP
#define TAO_PEGTL_DISCARD_INPUT_HPP

#include "apply_mode.hpp"
#include "config.hpp"
#include "match.hpp"
#include "nothing.hpp"
#include "rewind_mode.hpp"

namespace TAO_PEGTL_NAMESPACE
{
struct discard_input
: maybe_nothing
{
template< typename Rule,
apply_mode A,
rewind_mode M,
template< typename... >
class Action,
template< typename... >
class Control,
typename ParseInput,
typename... States >
[[nodiscard]] static bool match( ParseInput& in, States&&... st )
{
const bool result = TAO_PEGTL_NAMESPACE::match< Rule, A, M, Action, Control >( in, st... );
in.discard();
return result;
}
};

} // namespace TAO_PEGTL_NAMESPACE

#endif
40 changes: 40 additions & 0 deletions include/tao/pegtl/discard_input_on_failure.hpp
@@ -0,0 +1,40 @@
// Copyright (c) 2019-2023 Dr. Colin Hirsch and Daniel Frey
// Distributed under the Boost Software License, Version 1.0.
// (See accompanying file LICENSE_1_0.txt or copy at https://www.boost.org/LICENSE_1_0.txt)

#ifndef TAO_PEGTL_DISCARD_INPUT_ON_FAILURE_HPP
#define TAO_PEGTL_DISCARD_INPUT_ON_FAILURE_HPP

#include "apply_mode.hpp"
#include "config.hpp"
#include "match.hpp"
#include "nothing.hpp"
#include "rewind_mode.hpp"

namespace TAO_PEGTL_NAMESPACE
{
struct discard_input_on_failure
: maybe_nothing
{
template< typename Rule,
apply_mode A,
rewind_mode M,
template< typename... >
class Action,
template< typename... >
class Control,
typename ParseInput,
typename... States >
[[nodiscard]] static bool match( ParseInput& in, States&&... st )
{
const bool result = TAO_PEGTL_NAMESPACE::match< Rule, A, M, Action, Control >( in, st... );
if( !result ) {
in.discard();
}
return result;
}
};

} // namespace TAO_PEGTL_NAMESPACE

#endif
40 changes: 40 additions & 0 deletions include/tao/pegtl/discard_input_on_success.hpp
@@ -0,0 +1,40 @@
// Copyright (c) 2019-2023 Dr. Colin Hirsch and Daniel Frey
// Distributed under the Boost Software License, Version 1.0.
// (See accompanying file LICENSE_1_0.txt or copy at https://www.boost.org/LICENSE_1_0.txt)

#ifndef TAO_PEGTL_DISCARD_INPUT_ON_SUCCESS_HPP
#define TAO_PEGTL_DISCARD_INPUT_ON_SUCCESS_HPP

#include "apply_mode.hpp"
#include "config.hpp"
#include "match.hpp"
#include "nothing.hpp"
#include "rewind_mode.hpp"

namespace TAO_PEGTL_NAMESPACE
{
struct discard_input_on_success
: maybe_nothing
{
template< typename Rule,
apply_mode A,
rewind_mode M,
template< typename... >
class Action,
template< typename... >
class Control,
typename ParseInput,
typename... States >
[[nodiscard]] static bool match( ParseInput& in, States&&... st )
{
const bool result = TAO_PEGTL_NAMESPACE::match< Rule, A, M, Action, Control >( in, st... );
if( result ) {
in.discard();
}
return result;
}
};

} // namespace TAO_PEGTL_NAMESPACE

#endif
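
The three discard_input headers above differ only in when `in.discard()` is called after the nested match. The sketch below isolates that difference. It is a simplified model with hypothetical names (`counting_input` and the `sketch_` functions), where the nested rule match is replaced by a precomputed result; it is not a drop-in replacement for the PEGTL classes.

```cpp
#include <cassert>

// Hypothetical stand-in for a buffer input: it only counts discard() calls.
struct counting_input
{
   int discards = 0;

   void discard() noexcept
   {
      ++discards;
   }
};

// Mirrors discard_input::match(): discard regardless of the result.
inline bool sketch_discard_input( counting_input& in, const bool result )
{
   in.discard();
   return result;
}

// Mirrors discard_input_on_failure::match(): discard only if the match failed.
inline bool sketch_discard_input_on_failure( counting_input& in, const bool result )
{
   if( !result ) {
      in.discard();
   }
   return result;
}

// Mirrors discard_input_on_success::match(): discard only if the match succeeded.
inline bool sketch_discard_input_on_success( counting_input& in, const bool result )
{
   if( result ) {
      in.discard();
   }
   return result;
}
```

In all three cases the match result is passed through unchanged, so the discard is a pure side effect on the input.
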