-
Notifications
You must be signed in to change notification settings - Fork 93
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix row slicing with sort_and_finalize throw on column slicing when t… #1838
base: more-sort-and-finalize-fixes
Are you sure you want to change the base?
Fix row slicing with sort_and_finalize throw on column slicing when t… #1838
Conversation
…he column group is larger than the segment column size
@@ -1456,6 +1456,14 @@ VersionedItem sort_merge_impl( | |||
"Finalizing staged data is not allowed with empty staging area" | |||
); | |||
|
|||
user_input::check<ErrorCode::E_INVALID_USER_ARGUMENT>( | |||
write_options.dynamic_schema || pipeline_context->staged_descriptor_->field_count() < write_options.column_group_size, | |||
"Sorting and finalizing staged data is not implemented in the case when column slicing would appear. The " |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
not yet implemented?
we do have a plan to solve this in future
"input DataFrame has {} fields which is more than the column group size ({}) set in the library options", | ||
pipeline_context->staged_descriptor_->field_count(), | ||
write_options.column_group_size | ||
); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This shouldn't be an error? It should just write segments wider than the lib config setting
class TestSlicing: | ||
def test_long_append_segment(self, lmdb_library): | ||
set_config_int('Merge.SegmentSize', 5) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Use a context manager to ensure it is reset at the end of the test:
ArcticDB/python/arcticdb/util/test.py
Line 122 in 008fa54
def config_context(name, value): |
assert set(lib.list_symbols()) == set([sym, sym_2]) | ||
|
||
class TestSlicing: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we also have a test where the individual staged segments are smaller than the segment row count, but when they are compacted they are larger?
…he column group is larger than the segment column size
Reference Issues/PRs
What does this implement or fix?
Any other comments?
Checklist
Checklist for code changes...