Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fixed array length #5313

Merged
merged 1 commit into from
Jun 17, 2019
Merged

Fixed array length #5313

merged 1 commit into from
Jun 17, 2019

Conversation

svenk177
Copy link
Contributor

This change ports the pull request #3987 by daksenik and idoroshev to the most recent commit and adds support for generation of JSON schema, Java, C# and Python.

@vglavnyy
Copy link
Contributor

Could you add an example of fbs and json as part of monster_extra?
C++, C#, Python have the support of extra.

  1. Why not use an attribute for array size specification?
    For example:
struct A{
 B:[float] (fixed_length:16);
}
  1. Is Array type really needed?
    The Array looks like Vector, therefore can reuse 90% of Vector code.
    Can we embed "uoffset_t length_" field into the struct to emulate Vector?
    If one needs Array the four bytes isn't big overhead.

@svenk177
Copy link
Contributor Author

svenk177 commented May 2, 2019

Could you add an example of fbs and json as part of monster_extra?
C++, C#, Python have the support of extra.

An example to monster_extra was added. It seems the JSON generator crashes on parsing the floating point extensions, therefore a JSON schema is generated from array_test.fbs for now.

  1. Why not use an attribute for array size specification?
    For example:
struct A{
 B:[float] (fixed_length:16);
}

This syntax is derived from a proposed feature for flatcc.
NOTE: This patch currently only supports primitive types in structs!

  1. Is Array type really needed?
    The Array looks like Vector, therefore can reuse 90% of Vector code.
    Can we embed "uoffset_t length_" field into the struct to emulate Vector?
    If one needs Array the four bytes isn't big overhead.

Again this was derived from flatcc (see above). The expression

array:[int:2];

shall be equivalent to

int1:int;
int2:int;

in its binary representation.

Copy link
Collaborator

@aardappel aardappel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This generally looks great!

Besides the comments below, I am missing documentation, please add to Schema.md at least.

@@ -312,7 +312,9 @@ class GeneralGenerator : public BaseGenerator {
}

std::string GenTypeGet(const Type &type) const {
return IsScalar(type.base_type) ? GenTypeBasic(type) : GenTypePointer(type);
return (IsScalar(type.base_type)) ? GenTypeBasic(type)
: IsArray(type) ? GenTypeBasic(type.VectorType()) + " []"
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you put () around nested ?:, here and below?

@@ -980,6 +993,19 @@ class GeneralGenerator : public BaseGenerator {
code += " : " + default_cast;
code += GenDefaultValue(field);
}
} else if (IsArray(field.value.type)) {
std::string type = GenTypeBasic(field.value.type.VectorType());
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

use auto where possible.


std::string arrayInfo = "";
if (IsArray(property->value.type)) {
arrayInfo = ",\n \"minItems\": " + NumToString<short>(property->value.type.fixed_length) + ",\n" +
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Make sure all lines are <= 80 characters, here and elsewhere.

return Error("nested vector types not supported (wrap in table first).");
return Error("nested vector types not supported (wrap in table first)");
}
if (token_ == ':') {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since this feature is not supported by all languages, we'll need a check similar to what is in SupportsAdvancedUnionFeatures

return Error("structs_ may contain only scalar or struct fields");

if (!struct_def.fixed && IsArray(type))
return Error("tables can't contain fixed-length arrays");
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe "fixed-length array in table must be wrapped in struct" to hint the user how to solve this problem.


public static Offset<MonsterExtra> CreateMonsterExtra(FlatBufferBuilder builder,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this constructor being deleted because we don't generate them for structs that have arrays in them? If so that is fine I guess.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This constructor seems to be deleted as soon as a struct is added to a table. This is not related to the patch set.

o = flatbuffers.number_types.UOffsetTFlags.py_type(self._tab.Offset(16))
if o != 0:
x = o + self._tab.Pos
from .ArrayStruct import ArrayStruct
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why is this import in the middle of the code? is that a normal thing to do in Python? :)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Again this is seems to be generated when a struct is added to a table. It was not introduced with this patch set :)

namespace MyGame.Example;

struct ArrayStruct{
a:float;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

consistently indent 2 spaces.

c_(flatbuffers::EndianScalar(_c)),
padding0__(0),
padding1__(0) {
memcpy(b_, _b, 60);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This will not work on big endian machines, you'll need a loop with EndianScalar I'm afraid. Maybe make it a helper templated function in flatbuffers.h.

void mutate_a(float _a) {
flatbuffers::WriteScalar(&a_, _a);
}
const int32_t *b() const {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is also endian unsafe. Make a class like Vector that does all access.

@svenk177 svenk177 force-pushed the fixed-array-length branch from 2771eb7 to 10c094a Compare May 3, 2019 11:54
@svenk177
Copy link
Contributor Author

svenk177 commented May 3, 2019

Thank you for your kind response. I introduced most of the requested changes and I left some comments on the remaining ones.

Copy link
Collaborator

@aardappel aardappel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A couple more changes..
@vglavnyy @mikkelfj any further comments?

// Get a mutable pointer to tables/strings inside this array.
mutable_return_type GetMutableObject(uoffset_t i) const {
FLATBUFFERS_ASSERT(i < size());
return const_cast<mutable_return_type>(IndirectHelper<T>::Read(Data(), i));
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same here

if (token_ != ',')
{
ECHECK(ParseSingleValue(nullptr, val, false));
stack.push_back(val);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not saying to not use ParseSingleValue, I am saying not to allocate a whole stack of Value. You can instead use a single Value and instead store a buffer of the actual binary values.

public void __init(int _i, ByteBuffer _bb) { bb_pos = _i; bb = _bb; }
public ArrayStruct __assign(int _i, ByteBuffer _bb) { __init(_i, _bb); return this; }

public int [] array() { int [] b = new int[2]; for (int i = 0; i < 2; i++) { b[i] = bb.getInt(bb_pos + 0 + i * 4); } return b; }
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, but those can be reused. Array accesses are potentially performance sensitive, so lets follow the existing vector API design.


return_type Get(uoffset_t i) const {
FLATBUFFERS_ASSERT(i < size());
return IndirectHelper<T>::Read(Data(), i);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Weird, github somehow deleted my comment that "same here" refers to.
In short: lets not use IndirectHelper since unlike Vector we don't have complex types to dereference.

Copy link
Contributor

@mikkelfj mikkelfj left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added a comment on struct size.

I also think we need to discuss char arrays. This is a much requested featured because short strings are used in communication buffers etc. The big question is how to deal with the null terminator.

On the schema array syntax referenced in the flatcc doc - I think it was Wouter who originally proposed that syntax elsewhere.

"length of fixed-length array must be an integer value");
}
int64_t fixed_length = StringToInt(attribute_.c_str());
if (fixed_length < 1 || fixed_length > 0x7fff) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

https://stackoverflow.com/a/7724909 says 2^16-1 is a C limit on object size, so it would be reasonable to limit the size to 16 bits, though I'm not sure about a 2^15-1 length limit. I'd rather place a limit on struct sizes of 2^16-1 and forego array length limits.

@mikkelfj
Copy link
Contributor

mikkelfj commented May 6, 2019

Does the syntax permit hex size?

struct baz {
    foo : [x:0x20];
}

flatcc generally accepts hex values in the schema parser where integers are permitted.

@mikkelfj
Copy link
Contributor

mikkelfj commented May 6, 2019

On alignment: I think alignment should be to the element size of the fixed array not the array size as such. The statement of equivalence to repeated scalar fields implies that.

On lenght limit:

flatcc already imposes the following struct limit by default.

While this limit is derived fro flatbuffers voffset_t size, it incidentally happens to the limit set by the C standard for portable struct sizes.

I don't think the 0x7fff length limit is a good idea. It would prevent 32K lookup tables.

config/config.h

/*
 * There are no hard limits on structs, but
 * they cannot be effectively handled by tables
 * if larger than 64K and 64K-1 allows us to store
 * the size in short where needed. Internally
 * we can handle 64 bits.
 */
#ifndef FLATCC_STRUCT_MAX_SIZE
#define FLATCC_STRUCT_MAX_SIZE ((1 << 16) - 1)
#endif

@mikkelfj
Copy link
Contributor

mikkelfj commented May 7, 2019

We probably also want to permit fixed size arrays of structs as they can be used as typedefs.

This causes a problem with 0-size arrays because structs can be zero-size. Hence I'm not sure if disallowing array length 0 makes sense since the special case will be present in either case. C does not allow empty arrays or structs.

@mikkelfj
Copy link
Contributor

mikkelfj commented May 7, 2019

I'm not sure if flatc has key attributes on struct fields (flatcc does), but they probably should not be allowed on fixed size arrays.

@@ -393,6 +393,84 @@ template<typename T> static inline size_t VectorLength(const Vector<T> *v) {
return v ? v->size() : 0;
}

// This is used as a helper type for accessing arrays.
template<typename T, int length> class Array {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The type of length should be matched with:

  • uoffset_t size() const { return length; }
  • short Type::fixed_length

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changed to uint16_t.

@@ -167,6 +170,7 @@ struct Type {
StructDef *struct_def; // only set if t or element == BASE_TYPE_STRUCT
EnumDef *enum_def; // set if t == BASE_TYPE_UNION / BASE_TYPE_UTYPE,
// or for an integral type derived from an enum.
short fixed_length; // only set if t == BASE_TYPE_ARRAY
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why type of fixed_length is signed short?
Somewhere in the code are negative values used?

return const_cast<mutable_return_type>(IndirectHelper<T>::Read(Data(), i));
}

// The raw data in little endian format. Use with care.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What does little endian mean?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Little endian is the most common storage order of integers with the least significant byte first.
Big endian has the opposite order. There are also other exotic orders found on old machines and in cryptography. The name originates from the childrens book Gulliver's travels where he visits the two neighbouring islands where one population eats their eggs with the big end up, and the other population eats their eggs with the little end up.

Copy link
Contributor

@vglavnyy vglavnyy May 7, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I read Gulliver :)
Let, I have BE machine and declare an array:

int arr[3]={1,2,3};
auto pu8 = reinterpret_cast<const uint8_t*>(&arr[0]);

By nature, the pu8 is neither BE nor LE.
It points to memory with BE integers (not to LE).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was wrong, but have a new question.
Scalar is always represented in little-endian format. Array of scalars also LE.

// The raw data in little endian format. Use with care.
uint8_t *Data() { return reinterpret_cast<uint8_t *>(&length_ + 1); }
const T *data() const { return reinterpret_cast<const T *>(Data()); }

How to use data() on BE machine?

Copy link
Contributor

@mikkelfj mikkelfj May 7, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll hazard a guess. This is a low-level pointer access method that is not endian safe. Hence the "use with care" comment. In flatcc for C, the method is called _get_ptr() where the safe method is _vec_at(field, i) for vectors or _get(field, i) for fixed size arrays (TBD). These return a value converted to the native representation and usually optimizes so it is just as fast as direct array access.

For 8-bit elements it doesn't matter, but uint16_t would be unsafe. I have seen users getting into such problems on big endian hardware.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right you have to use Get(i) and Mutate(i, val).

@@ -679,6 +679,10 @@ class CppGenerator : public BaseGenerator {
bool user_facing_type) {
if (IsScalar(type.base_type)) {
return GenTypeBasic(type, user_facing_type) + afterbasic;
} else if (IsArray(type)) {
return beforeptr
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

code not formatted

@@ -2728,7 +2738,8 @@ class CppGenerator : public BaseGenerator {

// Generate a default constructor.
code_ += " {{STRUCT_NAME}}() {";
code_ += " memset(static_cast<void *>(this), 0, sizeof({{STRUCT_NAME}}));";
code_ +=
" memset(static_cast<void *>(this), 0, sizeof({{STRUCT_NAME}}));";
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

code formatting (indent)

typedef VectorReverseIterator<const_iterator> const_reverse_iterator;

uoffset_t size() const { return length; }

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The uoffset_t size() const { return length; } should be constexpr.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added constexpr.

@mikkelfj
Copy link
Contributor

mikkelfj commented May 8, 2019

Support added in FlatCC

dvidelabs/flatcc@6ef1459

binary format updated, notably removed supported for table fields with fixed length vectors and now call the feature Fixed Size Arrays.

https://github.com/dvidelabs/flatcc/blob/master/doc/binary-format.md

The FlatCC JSON parser optionally skips extra elements and zero pads missing elements, but raises an error by default.

@svenk177
Copy link
Contributor Author

I've made some improvements, especially support for struct arrays was introduced - though the C#, Java and Python ports are not there yet. Also char arrays are not implemented yet (any advice on how to integrate them?).

I kept the IndirectHelper within the Array class as this allows reusing some existing code, e.g. see PrintContainer.

I also kept the stack of Values as I'm a bit wary of alignment issues now that struct arrays have been introduced.

The length check is now capped at 0xFFFF as the size has been changed to an uint16_t.

Hex values are now supported in the schema to match the behaviour in flatcc.

NEXT();
if (token_ != kTokenIntegerConstant) {
return Error(
"length of fixed-length array must be an integer value");
}
int64_t fixed_length = StringToInt(attribute_.c_str());
if (fixed_length < 1 || fixed_length > 0x7fff) {
int64_t fixed_length = 0;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

int64_t fixed_length = 0; - uint16_t ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The return value of StringToNumber is an int64_t. The check prevents an over-/underflow.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

template<typename T> inline bool StringToNumber(const char *s, T *val) {

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

D'oh missed it! Fixed.

typedef typename IndirectHelper<T>::return_type return_type;
typedef typename IndirectHelper<T>::mutable_return_type mutable_return_type;

constexpr uint16_t size() const { return length; }
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FLATBUFFERS_CONSTEXPR

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed.

@mikkelfj
Copy link
Contributor

Sounds good. FlatCC will be adding struct arrays and arrays of enums.

I think we need to discuss char arrays more before implementing them.

I think flatcc has a bug in that the C source cannot compile empty structs that are members of other structs. I'm not sure I'd bother fix it since it is rather messy. C does not support empty structs or other zero size types.

@mikkelfj
Copy link
Contributor

FlatCC has now support for scalars, enums and fixed size arrays of structs, i.e. arrays of all ordinary struct member types, including JSON parsing and printing. Char arrays are not supported.

https://github.com/dvidelabs/flatcc/tree/cf3f22eaf8082bccfd63efa8aeda41460b5be176

@svenk177
Copy link
Contributor Author

I have added support for:

  • enum arrays
  • code generation for Java, C# with samples
  • code generation for Python (not tested yet)

@aardappel
Copy link
Collaborator

Sorry, what is the issue with char arrays?

flatc has an error for "size 0 structs not allowed", I'd suggest flatcc do the same. So empty structs should not be something you have to worry about.

@rw for Python.

@mikkelfj
Copy link
Contributor

Empty structs are very useful for union types. But it would simplify a lot to remove them.

The issue with char arrays is:

  • there is currently no char type
  • if you use x:[string:15] that would imply 15 offsets to string objects.
  • do you automatically insert a trailing zero, so x:[char:15] has size 16?
  • do you have to verify that - currently structs are much simpler to verify.
  • what about utf-8 code points not fitting exactly - same with strings, but more prominent with char arrays where you have a forced limit and truncation is then non-trivial.

Char arrays could also be just chars with no implied zero termination, or it could be optional and part of the specified size.

@aardappel
Copy link
Collaborator

@mikkelfj well, flatc has already removed them :)

Ok, I guess we should be clear what we mean with char then, as to me that's the same a FlatBuffer's byte. FlatBuffers has no concept of char in the unicode sense.

Also not sure what you mean by [string:15], as this PR only applies to scalars / fixed size things. Not understanding the rest of it either. Did I miss a discussion somewhere? :)

@mikkelfj
Copy link
Contributor

mikkelfj commented May 13, 2019

[string:15] has been suggested earlier, by you I think. I suggested char at that time (some chat a while ago AFAIR). My point of raising it here was merely to agree on not using that syntax (string syntax).

I'm not arguing against byte char, we just need to define what it means and how to handle edge cases, and notably 0 termination, if any.

@svenk177
Copy link
Contributor Author

This weak compiler will fail on the std::enable_if or something like this.

You are right. Dropped the patches!

I think fixing the equality operators should be done in a seperate issue. Do you agree?

@vglavnyy
Copy link
Contributor

@svenk177 Yes, compare operator for Table should be fixed in a separate issue.

Recommendations (optional) for this PR:

  1. Replace template<typename T, uint16_t length, typename U = T> to ``template<typename T, uint16_t length>.
    Check or require that parser.opts.scoped enums flag is active to ignore MSVC2010.
    In the tests use PPD to avoid MSVC2010.

  2. Remove/replace ArrayStruct(float _a, const int32_t *_b, int8_t _c, const NestedStruct *_d) by
    ArrayStruct(float _a, int8_t _c).
    This kind of constructor absolutely unusable in production code.
    If one modifies fbs schema with increasing an array length it may cause an out-of-range.
    Move an initialization into test and sample code.

If is ready notify aardappel for final review.

@svenk177 svenk177 force-pushed the fixed-array-length branch from f7e172f to d263454 Compare June 12, 2019 14:05
@svenk177
Copy link
Contributor Author

@vglavnyy Requested changes commited.
@aardappel Review?

Copy link
Contributor

@vglavnyy vglavnyy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good.

ECHECK(ParseSingleValue(nullptr, val, false));
}
stack.push_back(val);
return NoError();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can remove one copy using emplace_back.
First solution:

vector_emplace_back(&stack, Value());
auto& val = stack.back();
val.type = type;
...
// remove copy: stack.push_back(val);

Second:
stack.push_back(val) -> vector_emplace_back(&stack, val);


if (length != count) return Error("Fixed-length array size is incorrect.");

for (auto it = stack.rbegin(); it != stack.rend(); it++) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

++it ?

tests/test.cpp Outdated
nStruct1.mutable_c()->Mutate(1, MyGame::Example::TestEnum::A);

MyGame::Example::ArrayStruct aStruct(2, 12);
for (int i = 0; i < 15; i++) aStruct.mutable_b()->Mutate(i, i + 1);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please use b()->size() instead of 15.
Expected checking is desirable.

aStruct.b() is not null;
aStruct.b()->size() is equal to 15


auto p = MyGame::Example::GetMutableArrayTable(fbb.GetBufferPointer());
auto mArStruct = p->mutable_a();
mArStruct->mutable_b()->Mutate(14, -14);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TEST_ASSERT(mArStruct->mutable_b());

auto mArStruct = p->mutable_a();
mArStruct->mutable_b()->Mutate(14, -14);
TEST_EQ(mArStruct->a(), 2);
TEST_EQ(mArStruct->b()->size(), 15);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TEST_ASSERT(mArStruct->b());

}
}
}

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do you need to traverse parser_.structs_.vec?
Is it possible to use lazy checking? For example after line 2745 when a field of a struct is generated.

: field.value.type;
auto is_series = (IsSeries(field.value.type));
auto underlying_type =
is_series ? field.value.type.VectorType() : field.value.type;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

const auto &underlying_type - ?

@svenk177 svenk177 force-pushed the fixed-array-length branch 2 times, most recently from 8d0d5df to bc394b3 Compare June 14, 2019 06:26
Copy link
Collaborator

@aardappel aardappel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A few minor comments.. but this generally looks to be in a good enough state to be merged I think.
With a new feature this big there are bound to be issues we're missing, but we can address these later.
Thanks @vglavnyy and @mikkelfj for all the code review!

// Check if enum arrays are used without specifying --scoped-enums
if (!parser_.opts.scoped_enums && IsArray(field_type) &&
IsEnum(field_type.VectorType())) {
printf("--scoped-enums must be enabled to use enum arrays\n");
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not happy with this error for multiple reasons. First it would be better if --scoped-enums wasn't required. And if that really isn't possible, I'd prefer to trigger this error in the parser, where are all error handling happens. The generators don't really have a way to do errors, and this print + return false is a bit of a hackey way to do it.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@aardappel
The idea to ignore untyped enums was mine. Initially, Array template had the support of enums without underlying type specification. But this third template parameter complicates the template and complicate compile-time helpers for the Array (compile-time functions like size() or other constexpr features).
There are a few C++ compilers without underlying type specification. I assumed that Array is a new feature needn't support of legacy. One should use a modern compiler to use Array.
Of course, this should be documented.

I can restore removed code if I'm not right.

Generator vs parser:
The flag --scoped-enums needed for C++ code generator only.
Other languages are independent of this flag.
One can use flatc to convert from json to binary and versa.
The --scoped-enums isn't necessary for these conversions.

Another way is to silently add an underlying type to generated enums used with Array.
For example, the underlying type silently added for each int64/uint64 enum due to MSVC restrictions.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought enums were always explicitly typed. But if they are not, they should IMO default to 32 bit like int. I actually find it a bit annoying that I have to specify a type for enums since often it really doesn't matter.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

umm - I'm talking about flatbuffer schema types - is this about C++ enums being typed (I have given up keeping track of that language) ?
In C the enums are represented as typecast integer defines in order to associate a type, .e.g. #define Red INT32_C(1)
Originally the were constants like const int32_t Red = 1; but that did not work in switch statements.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I didn't like the printf in there either, therfore I moved the check for scoped-enums into the parser along with a check if it generates code for C++.

@mikkelfj C++11 introduced the following syntax enum A : uint8_t { }, i.e. sizeof(A) == sizeof(uint8_t), unfortunately old compilers do not support it, hence the requirement for --scoped-enums

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The now removed third parameter in the Array template class was intended for this purpose, but it was indeed a bit messy as @vglavnyy pointed out.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just accidentally found this http://www.open-std.org/jtc1/sc22/wg14/www/docs/n2008.pdf
It appers C might get typed enum in C2x, not that it matters since it would not be portable.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@vglavnyy if it is needed to keep things simple, then so be it. I'd still like the actual error checking to be moved to the parser though, if possible.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@aardappel I already moved it into the parser. I had to squash the history as rebasing was not fun otherwise.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah thanks!

first++;
continue;
}

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This PR introduces a lot of blank lines.. prefer to not have them by default, only if a method is really long. And even then, comments are probably a better separator.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removed most blank lines and added some comments.

@svenk177 svenk177 force-pushed the fixed-array-length branch from 2c80d93 to fd227ae Compare June 17, 2019 06:42
@svenk177 svenk177 force-pushed the fixed-array-length branch from fd227ae to d062689 Compare June 17, 2019 06:48
@aardappel
Copy link
Collaborator

Ok, that means we're once again ready to merge! I see @vglavnyy's last comments have been addressed, so I guess we can go ahead.

Thanks everyone for helping get this feature in!

@aardappel aardappel merged commit e635141 into google:master Jun 17, 2019
@aardappel
Copy link
Collaborator

This PR accidentally broke because of a PR by @vglavnyy I merged just before this one: #5381 (makes Table/Struct internal access).
CI here: https://travis-ci.org/google/flatbuffers/builds/546979717
A bit of a coincidence :)

@vglavnyy
Copy link
Contributor

Run of generate_code.sh with updated flatc should fix it. There are no conflicts in the code.

@svenk177
Copy link
Contributor Author

Thank you all for getting this in :)

aardappel added a commit that referenced this pull request Jun 20, 2019
Change-Id: I249140119e6241beb5aec5670d0e5ccddc8f5251
@aardappel
Copy link
Collaborator

Updated generated code in 123c7a4

@aardappel
Copy link
Collaborator

@svenk177 @vglavnyy

We have CI errors related to this PR: https://ci.appveyor.com/project/gwvo/flatbuffers/builds/27162701/job/3edsmdethte5y7w9

Note sure why that is failing now all of a sudden?

Looking into the errors and the code, the problem appears to be that Array::Mutate is being called with a T which is a struct.. even though Mutate was written for scalars only.

Relevant code:

flatbuffers/tests/test.cpp

Lines 2808 to 2839 in 4b870ac

flatbuffers::FlatBufferBuilder fbb;
MyGame::Example::NestedStruct nStruct0(MyGame::Example::TestEnum::B);
TEST_NOTNULL(nStruct0.mutable_a());
nStruct0.mutable_a()->Mutate(0, 1);
nStruct0.mutable_a()->Mutate(1, 2);
TEST_NOTNULL(nStruct0.mutable_c());
nStruct0.mutable_c()->Mutate(0, MyGame::Example::TestEnum::C);
nStruct0.mutable_c()->Mutate(1, MyGame::Example::TestEnum::A);
TEST_NOTNULL(nStruct0.mutable_d());
nStruct0.mutable_d()->Mutate(0, flatbuffers::numeric_limits<int64_t>::max());
nStruct0.mutable_d()->Mutate(1, flatbuffers::numeric_limits<int64_t>::min());
MyGame::Example::NestedStruct nStruct1(MyGame::Example::TestEnum::C);
TEST_NOTNULL(nStruct1.mutable_a());
nStruct1.mutable_a()->Mutate(0, 3);
nStruct1.mutable_a()->Mutate(1, 4);
TEST_NOTNULL(nStruct1.mutable_c());
nStruct1.mutable_c()->Mutate(0, MyGame::Example::TestEnum::C);
nStruct1.mutable_c()->Mutate(1, MyGame::Example::TestEnum::A);
TEST_NOTNULL(nStruct1.mutable_d());
nStruct1.mutable_d()->Mutate(0, flatbuffers::numeric_limits<int64_t>::min());
nStruct1.mutable_d()->Mutate(1, flatbuffers::numeric_limits<int64_t>::max());
MyGame::Example::ArrayStruct aStruct(2, 12, 1);
TEST_NOTNULL(aStruct.b());
TEST_NOTNULL(aStruct.mutable_b());
TEST_NOTNULL(aStruct.mutable_d());
TEST_NOTNULL(aStruct.mutable_f());
for (int i = 0; i < aStruct.b()->size(); i++)
aStruct.mutable_b()->Mutate(i, i + 1);
aStruct.mutable_d()->Mutate(0, nStruct0);
aStruct.mutable_d()->Mutate(1, nStruct1);
auto aTable = MyGame::Example::CreateArrayTable(fbb, &aStruct);
fbb.Finish(aTable);

Note e.g. aStruct.mutable_d()->Mutate(0, nStruct0); which calls Mutate with a struct, which eventually goes into WriteScalar which should never take structs.

Schema: https://github.com/google/flatbuffers/blob/master/tests/arrays_test.fbs

We should a) move the function AssertScalarT() to top level, and call it from Mutate or maybe even WriteScalar and b) fix these tests. c) we could add a MutateStruct if we want, since a struct is already endian-correct, so can be written to the array without calling WriteScalar.

Possibly remotely related: #5440 which also had some alignment based MSVC errors.

@vglavnyy
Copy link
Contributor

vglavnyy commented Sep 6, 2019

The test should use GetMutablePointer

// Get a mutable pointer to elements inside this array.
// @note This method should be only used to mutate arrays of structs followed
// by a @p Mutate operation. For primitive types use @p Mutate directly.
// @warning Assignments and reads to/from the dereferenced pointer are not
// automatically converted to the correct endianness.
T *GetMutablePointer(uoffset_t i) const {
FLATBUFFERS_ASSERT(i < size());
return const_cast<T *>(&data()[i]);
}

Another way (without suppression of always-const warning):

  // Change elements if you have a non-const pointer to this object.
  void Mutate(uoffset_t i, const T &val) {
    if (flatbuffers::is_scalar<T>::value) {
      FLATBUFFERS_ASSERT(i < size());
      WriteScalar(data() + i, val);
    }
    else {
      *(GetMutablePointer(i)) = val;
    }
  }

Or we can use SFINAE but I prefer if-else (C++17 has constexpr-if).

Second:

Possibly remotely related: #5440 which also had some alignment based MSVC errors.

Yes, we have the alignment problem (was detected by ASAN).
Need to research this.

@svenk177 svenk177 mentioned this pull request Sep 6, 2019
@aardappel
Copy link
Collaborator

@vglavnyy thanks, yes that is also a nice way to do it. Though of course this code is still going to do strange things if T is not a struct either, so maybe nice to address at some point :)

@tsingson
Copy link
Contributor

tsingson commented Apr 5, 2020

this feature is cool, i will port to Go.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants