-
Notifications
You must be signed in to change notification settings - Fork 54
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Completely rewrite excel sheet parsing #88
Conversation
If you're looking for a hands-on example of this, I already use this in the wild with Craftimizer via the ExdSheets nuget package |
Use a plain delegate instead and resolve by default
I left most of the 'internal' apis public on purpose so people could use them for more interesting things as they need/want, but obviously that's within reason - if it's purely internal and there's not a case where it would actually work to be public it probably shouldn't be, but the idea generally has been to keep implementation details public and (re)usable so that subsets or otherwise can be used ad-hoc. I get this is a non-answer, but sort of the methodology behind why things are the way they are. RowRef change is fine to me given the change in behaviour. Does Row(Ref)Collection work? I suck at naming things though so ¯_(ツ)_/¯ |
* `IExcelRow.RowId` and `IExcelRow.SubrowId` exist to implement `ICollection<T>.Contains`. * `System.Collections.Generic.EnumerableHelpers.ToArray` and alike in LINQ has an optimization for `ICollection<T>`. As it requires `Contains` to be implemented, exposing `RowId` and `SubrowId` in a generic way will make it possible to implement that in O(1). * `ExcelSheet` constructor now preallocates lookup lookup tables. * `.exh` file comes with information on how many rows are there, so we know the exact number of items that needs to be allocated. * Using an array directly bypassing list wrappers may provide an additional speed boost. * In case `.exh` file contains a wrong information on number of rows, which is an unlikely case, `Array.Resize` is used to reallocate the array. * `ExcelSheet.UnsafeCreateRow/Subrow/At` has been added. * These functions assume that boundary checks are done by callers. * As enumerators always work inside the boundary, especially when the collection is immutable, `IEnumerator{T}.Current` can skip boundary checks. * `DefaultExcelSheet<T>` and `SubrowsExcelSheet<T>` has been added. * As sheets are usually not meaningful without knowing what is in it in the first place, it would be better to specialize for each variants. * This effectively hides subrow operations from sheets of default variants. * This removes `HasSubrows` check from getter functions. * Added `SubrowsExcelSheet.Try/GetRow/OrDefault` variants that returns `SubrowCollection<T>` instead. * This makes it convenient to iterate over subrows under one row ID. * This makes it faster to access multiple subrows under one row ID, as lookup operation is done on obtaining the collection. Once the collection is constructed, accessing subrows is an O(1) operation. * `ExcelModule.GetSheet` uses static lambda in place of `SheetCache.GetOrAdd`. * This will avoid heap allocation if a corresponding sheet is already loaded. * Named value tuples in `ExcelSheet` are replaced with `record struct`. * This reduces the size of each lookup element from 16 bytes to 12 bytes.
Most of sheets do not have large gaps across items. That fact can be used to make a lookup array instead of lookup dictionary, which will even further reduce the time spent translating row ID to row index.
Suggestions on Excel
* `MethodInfo.Invoke` throws `TargetInvocationException` if the method throws an exception; changed to handle that. * Added comments for some functions.
Making `IEnumerator<T>.Current` evaluate on demand can let an invalid value get passed to UnsafeCreate functions. Creating them on `MoveNext` will guarantee that UnsafeCreate functions are called only from the context where the preconditions are met.
Reformat code, documentation/enumerator correctness fixes
Use constructors directly on Subrow/ExcelSheet, specialize exception
thanks - looks good! do you want me to fix the conflicts or would you like to? |
I'd appreciate it if you could, ty :) |
no worries, will sort when home |
c3f3869
to
2ce372a
Compare
Currently a draft; I've added documentation for all the public facing API changes. Lumina.Excel.GeneratedSheets is still TBD on this, but I'm planning on working with @lmcintyre to upstream ExdSheets changes to EXDSchema and GeneratedSheets.
Given the nature of this rewrite, this involved breaking changes and some possible drawbacks:
Right now, I'm unsure of a few things:
ExcelPage
andExcelModule.GetSheetGeneric
.