Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Serialization of interface typed members #513

Open
Pragmateek opened this issue May 30, 2024 · 6 comments
Open

Serialization of interface typed members #513

Pragmateek opened this issue May 30, 2024 · 6 comments

Comments

@Pragmateek
Copy link

Issue description

Hello,

Is there a way to serialize types whose any member is interface typed?
I get a NotImplementedException in MakeField.

I think I understand the logical issue: in a set of objects we could have different implementations hence Parquet columns/schemas.
But in the case there is a single implementation is there a way to force the serialization?

Thanks in advance,

Mickael

using Parquet.Serialization;

//using MemoryStream stream = new();
//await ParquetSerializer.SerializeAsync([new LineOK ()], stream);

using MemoryStream stream = new();
await ParquetSerializer.SerializeAsync([new LineKO()], stream);

interface IPoint
{
    decimal X { get; set; }
    decimal Y { get; set; }
}

class Point : IPoint
{
    public decimal X { get; set; }
    public decimal Y { get; set; }
}

class LineOK
{
    public Point Start { get; } = new();
    public Point End { get; } = new();
}

class LineKO
{
    public IPoint Start { get; } = new Point();
    public IPoint End { get; } = new Point();
}
@aloneguid
Copy link
Owner

Hi. Thanks a lot for the contribution!

You can serialize interfaces, but they should be explicitly set in the IEnumerable parameter. And to deserialize, you need to specify the result class type. Here is an example:

  interface IInterface {
      int Id { get; set; }
  }

  class InterfaceImpl : IInterface {
      public int Id { get; set; }
  }

  [Fact]
  public async Task Interface_Serialize() {

      // explicit IInterface[] type, but elements can be any subtype
      var data = new IInterface[] {
          new InterfaceImpl { Id = 1 },
          new InterfaceImpl { Id = 2 },
      };

      using var ms = new MemoryStream();
      await ParquetSerializer.SerializeAsync(data, ms);

      ms.Position = 0;
      // explicit subtype
      IList<InterfaceImpl> data2 = await ParquetSerializer.DeserializeAllAsync<InterfaceImpl>(ms).ToArrayAsync();

      Assert.Equivalent(data, data2);

  }

@Pragmateek
Copy link
Author

My pleasure.

Thanks for your answer but my issue is not with the type of the root objects but one of their properties.

To follow up with your model if Id is typed as an interface IId:

using Parquet.Serialization;
using System.Collections;

var data = new IInterface[] {
          new InterfaceImpl { Id = new IdImpl{ Id = 1 } },
          new InterfaceImpl { Id = new IdImpl { Id = 2 } },
      };

using var ms = new MemoryStream();
await ParquetSerializer.SerializeAsync(data, ms);

ms.Position = 0;
// explicit subtype
var data2 = await ParquetSerializer.DeserializeAsync<InterfaceImpl>(ms);

interface IId
{
    int Id { get; set; }
}

class IdImpl : IId
{
    public int Id { get; set; }
}

interface IInterface
{
    IId Id { get; set; }
}

class InterfaceImpl : IInterface
{
    public IId Id { get; set; }
}

aloneguid added a commit that referenced this issue May 31, 2024
@aloneguid
Copy link
Owner

I see! Sorry for misunderstanding. I've made a fix for this, which allows to serialize interface properties, so this works:

 interface IInterface {
     int Id { get; set; }
 }

 interface IRootInterface {
     IInterface Child { get; set; }
 }

 class InterfaceImpl : IInterface {
     public int Id { get; set; }
 }

 class RootInterfaceImpl : IRootInterface {
     public IInterface Child { get; set; } = new InterfaceImpl();
 }

var data = new IRootInterface[] {
    new RootInterfaceImpl { Child = new InterfaceImpl { Id = 1 } },
    new RootInterfaceImpl { Child = new InterfaceImpl { Id = 2 } },
};

using var ms = new MemoryStream();
await ParquetSerializer.SerializeAsync(data, ms);

Deserialization won't work though, as we don't know the implementation type of IInterface, but you can use the following workaround:

class ConcreteRootInterfaceImpl {
    public InterfaceImpl Child { get; set; } = new InterfaceImpl();
}

IList<ConcreteRootInterfaceImpl> data2 = await ParquetSerializer.DeserializeAllAsync<ConcreteRootInterfaceImpl>(ms).ToArrayAsync();

Please try -pre.3 version from nuget.

@Pragmateek
Copy link
Author

Pragmateek commented Jun 2, 2024

Thanks for the fix, indeed serialization is now working with -pre.4 version.

It would be great if we could configure or hook into the deserialization process to tell the serializer which implementation to use when we know it.
Because, even if sometimes we only need to serialize, in order to check that the serialization is OK we must do a roundtrip test with deserialization to ensure that the input and output objects are the same.
Maybe something similar to DI container like ParquetSerializer.For<ISomething>().Use<Something>().

@aloneguid
Copy link
Owner

Agreed, I'll sort this out soon.

@Pragmateek
Copy link
Author

Ideally the choice of the implementation to use would be dynamic by calling into a callback that could decide based on the other properties of the Parquet record.
So it means that properties with concrete types should be deserialized first to give the callback the opportunity to leverage them to determine the correct implementation to use.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants