Flatten and Unflatten Complex JSON #228

Closed
akm151 opened this issue Aug 1, 2022 · 4 comments

Comments

@akm151

akm151 commented Aug 1, 2022

I have the following code:

var jsonData = new JArray();
var reader = new ChoParquetReader(stream)
    .Configure(c => c.ThrowAndStopOnMissingField = false)
    .Configure(c => c.NestedKeySeparator = '/');
foreach (var e in reader)
{
    // Rebuild the nested object from keys such as "batters/category".
    var unflatten = e.ConvertToNestedObject('/');
    // Split each ';'-joined string back into an array (null becomes "").
    unflatten.flavours = (((string)unflatten.flavours) ?? "").Split(';');
    unflatten.batters.category = (((string)unflatten.batters.category) ?? "").Split(';');
    jsonData.Add(JObject.FromObject(unflatten));
}
return JsonConvert.SerializeObject(jsonData);

and my JSON is:

[
    {
        "id": "4b5260d2-e088-4546-a315-b9c4b274406f",
        "type": "donut",
        "name": "cake",
        "flavours": " chocolate ; blueberry ; vanilla",
        "batters/topping": "glazed",
        "batters/category": " eggless;flavoured"
    }
]

My expected output:

[
    {
      "id": "4b5260d2-e088-4546-a315-b9c4b274406f",
      "type": "donut",
      "name":"cake",
      "flavours": ["chocolate","blueberry","vanilla"],
      "batters": {
        "topping":"glazed",
        "category":[ "eggless","flavoured"]
      }     
    }
]

The basic requirement is this: if a property's value is a list or array of a primitive type, flattening should join its elements into a single ';'-separated string, and unflattening should split that string back into a list or array.
Note: the JSON shown here is a sample. The real data is a large JSON array in which each object has lots of properties, so a hand-written LINQ solution is not feasible. I am also looking for a generic solution, because I have multiple entities of different types with similar properties (only the names change).
E.g. cake, chocolate, chips, cookies and so on.
Please comment if further details are required.
Thanks for your interest in the question.

Looking forward to hearing from you @Cinchoo @neuli1980 @dbeattie1971

@Cinchoo
Owner

Cinchoo commented Aug 4, 2022

It seems the data comes from a Parquet file. Can you please share the schema with sample data?

@akm151
Author

akm151 commented Aug 5, 2022

I will explain this.
Schema:

{
    "$schema": "https://json-schema.org/draft/2019-09/schema",
    "$id": "some.schema.json",
    "title": "desserts",
    "type": "object",
    "additionalProperties": false,
    "required": [
        "batters"
    ],
    "properties": {
        "id": {
            "description": "System generated unique identifier for category. (A UUID specified by RFC4122).",
            "type": "string",
            "format": "uuid"
        },
        "type": {
            "type": "string"
        },
        "name": {
            "type": "string"
        },
        "flavours": {
            "type": "array",
            "items": {
                "type": "string"
            }
        },
        "batters": {
            "type": "object",
            "properties": {
                "topping": {
                    "type": "string"
                },
                "category": {
                    "type": "array",
                    "items": {
                        "type": "string"
                    }
                }
            }
        }
    }
}

Sample Data:

{
      "id": "4b5260d2-e088-4546-a315-b9c4b274406f",
      "type": "donut",
      "name":"cake",
      "flavours": ["chocolate","blueberry","vanilla"],
      "batters": {
        "topping":"glazed",
        "category":[ "eggless","flavoured"]
      }     
}

Scenario-1
Export to CSV/Parquet (flattening): the sample data above comes from the data source, and I want to flatten it so that every array-typed property is joined into a single string value. After flattening, my data should look something like this:

{
      "id": "4b5260d2-e088-4546-a315-b9c4b274406f",
      "type": "donut",
      "name":"cake",
      "flavours": " chocolate ; blueberry ; vanilla",
      "batters/topping": "glazed",
      "batters/category": " eggless;flavoured"
}

and I will write this to a CSV or Parquet file.
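
For readers following along, here is a minimal sketch of the flattening rule described in Scenario-1. It is not ChoETL's implementation: `FlattenSketch` and its parameters are hypothetical names, and it assumes the record is already held as nested `IDictionary<string, object>` values with lists of primitives. Nested objects become "parent/child" keys and lists are joined with ';'. Unlike the sample above, it adds no padding around the separator.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;
using System.Globalization;
using System.Linq;

// Hypothetical helper (not part of ChoETL): flattens nested dictionaries into
// "parent/child" keys and joins lists of primitives into one ';'-separated string.
public static class FlattenSketch
{
    public static Dictionary<string, string> Flatten(
        IDictionary<string, object> source, char keySeparator = '/', char listSeparator = ';')
    {
        var result = new Dictionary<string, string>();
        Walk(source, "", keySeparator, listSeparator, result);
        return result;
    }

    private static void Walk(IDictionary<string, object> node, string prefix,
        char keySeparator, char listSeparator, Dictionary<string, string> result)
    {
        foreach (var pair in node)
        {
            string key = prefix.Length == 0 ? pair.Key : prefix + keySeparator + pair.Key;
            switch (pair.Value)
            {
                case IDictionary<string, object> child:
                    // Nested object: recurse, extending the key path.
                    Walk(child, key, keySeparator, listSeparator, result);
                    break;
                case string text:
                    // Must come before IEnumerable, because string is enumerable too.
                    result[key] = text;
                    break;
                case IEnumerable items:
                    // List or array of primitives: join into a single string.
                    result[key] = string.Join(
                        listSeparator.ToString(),
                        items.Cast<object>().Select(
                            x => Convert.ToString(x, CultureInfo.InvariantCulture)));
                    break;
                default:
                    // Other primitives (numbers, booleans) and null (stored as "").
                    result[key] = Convert.ToString(pair.Value, CultureInfo.InvariantCulture) ?? "";
                    break;
            }
        }
    }
}
```

Calling `FlattenSketch.Flatten(record)` on the sample data would give keys such as `flavours` and `batters/category`, ready to be written as one CSV or Parquet row. Lists of objects are not handled; the sketch assumes arrays contain only primitives, as in the schema.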

Scenario-2
Import from CSV/Parquet (unflattening): here my data source is the Parquet/CSV file generated in Scenario-1. After reading the CSV/Parquet file, my data will look like this:

{
      "id": "4b5260d2-e088-4546-a315-b9c4b274406f",
      "type": "donut",
      "name":"cake",
      "flavours": " chocolate ; blueberry ; vanilla",
      "batters/topping": "glazed",
      "batters/category": " eggless;flavoured"
}

So I want to unflatten it in such a way that it matches my schema. After unflattening, the data should look like this:

{
      "id": "4b5260d2-e088-4546-a315-b9c4b274406f",
      "type": "donut",
      "name":"cake",
      "flavours": ["chocolate","blueberry","vanilla"],
      "batters": {
        "topping":"glazed",
        "category":[ "eggless","flavoured"]
      }     
}
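
A minimal sketch of the unflattening step in Scenario-2, again independent of ChoETL. A flat string alone cannot tell a plain string from a one-element array, so this sketch assumes the caller passes the set of flattened keys that the schema declares as arrays. `UnflattenSketch` and `arrayKeys` are hypothetical names introduced for illustration only.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper (not part of ChoETL): rebuilds nested dictionaries from
// "parent/child" keys and splits ';'-separated strings for schema-declared arrays.
public static class UnflattenSketch
{
    public static Dictionary<string, object> Unflatten(
        IDictionary<string, string> flat, ISet<string> arrayKeys,
        char keySeparator = '/', char listSeparator = ';')
    {
        var root = new Dictionary<string, object>();
        foreach (var pair in flat)
        {
            string[] path = pair.Key.Split(keySeparator);
            Dictionary<string, object> node = root;

            // Walk or create the intermediate objects for every segment but the last.
            foreach (string segment in path.Take(path.Length - 1))
            {
                Dictionary<string, object> child;
                if (node.TryGetValue(segment, out object existing)
                    && existing is Dictionary<string, object> found)
                {
                    child = found;
                }
                else
                {
                    child = new Dictionary<string, object>();
                    node[segment] = child;
                }
                node = child;
            }

            string leaf = path[path.Length - 1];
            if (arrayKeys.Contains(pair.Key))
            {
                // Split, trim the padding around each item, and drop empty items,
                // so an empty string yields an empty list.
                node[leaf] = (pair.Value ?? "")
                    .Split(listSeparator)
                    .Select(item => item.Trim())
                    .Where(item => item.Length > 0)
                    .ToList();
            }
            else
            {
                node[leaf] = pair.Value;
            }
        }
        return root;
    }
}
```

For the sample row, the array keys would be `flavours` and `batters/category`. Trimming is a design choice made here so that " chocolate ; blueberry ; vanilla" round-trips to the unpadded values in the expected output; it would be wrong for data where leading or trailing spaces are significant.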

I made a few changes to the ConvertMembersToArrayIfAny() method and the Flatten() extension method on IList, and it worked for me. However, the changes might conflict with an existing feature that I am not aware of, so I did not raise a PR and instead started this discussion to take it forward if you approve.

@Cinchoo
Owner

Cinchoo commented Sep 9, 2022

I have enhanced the library to handle this conversion.

Sample fiddle for your review: https://dotnetfiddle.net/gr4u94

@akm151
Author

akm151 commented Sep 12, 2022

This works. Thank you @Cinchoo for the enhancement.

@akm151 akm151 closed this as completed Sep 12, 2022

2 participants